Dirsearch

Dirsearch is a Python-based web path scanner with recursive scanning, flexible wordlist handling, and response filtering. It is used to discover hidden directories and files on web servers.

Installation

Linux/Ubuntu

# Clone repository
git clone https://github.com/maurosoria/dirsearch.git
cd dirsearch

# Install dependencies
pip3 install -r requirements.txt

# Make executable
chmod +x dirsearch.py
sudo ln -s "$(pwd)/dirsearch.py" /usr/local/bin/dirsearch

macOS

# Homebrew
brew install dirsearch

# Or from source
git clone https://github.com/maurosoria/dirsearch.git
cd dirsearch
pip3 install -r requirements.txt

Basic Usage

Option	Description
`-u, --url <URL>`	Target URL
`-w, --wordlists <FILE>`	Wordlist file path
`-e, --extensions <EXT>`	Comma-separated extensions (php,html,js,txt), substituted for `%EXT%` in wordlist entries
`-t, --threads <NUM>`	Thread count (default is set in the config file; 25 in recent releases)
`-r, --recursive`	Recursive search
`-R, --max-recursion-depth <NUM>`	Maximum recursion depth (default: unlimited)
`--format <FORMAT>`	Report format (simple, json, csv); note that `-f` is `--force-extensions`, not a format flag
`-o, --output <FILE>`	Output file
`-p, --proxy <URL>`	HTTP or SOCKS proxy (e.g. http://127.0.0.1:8080)
`-H, --header <HEADER>`	Custom header (repeat the flag for several headers)
`-x, --exclude-status <CODES>`	Exclude status codes
`-i, --include-status <CODES>`	Include specific status codes
`--min-response-size <SIZE>`	Minimum response size
`--max-response-size <SIZE>`	Maximum response size
`--timeout <SEC>`	Connection timeout
`--user-agent <UA>`	Custom User-Agent
`-k, --insecure`	Skip SSL verification (not in dirsearch's documented option list; confirm with `--help` before relying on it)

Essential Commands

Basic Directory Scan

# Simple scan
python3 dirsearch.py -u http://target.com -w /usr/share/wordlists/dirb/common.txt

# With extensions (replaces the %EXT% keyword in wordlist entries; add -f to append them to every entry)
python3 dirsearch.py -u http://target.com -w wordlist.txt -e php,html,js

# Specific thread count
python3 dirsearch.py -u http://target.com -w wordlist.txt -t 100

# Save results
python3 dirsearch.py -u http://target.com -w wordlist.txt -o results.txt
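
A rough sketch of how `-e` interacts with a wordlist, assuming dirsearch's documented default behavior: extensions are substituted only where an entry contains the `%EXT%` keyword, and entries without it are requested unchanged. `expand_ext` is a hypothetical helper for previewing a wordlist locally; it is not part of dirsearch and does not model `-f/--force-extensions`.

```shell
# expand_ext EXTENSIONS < wordlist
# Hypothetical preview helper (not part of dirsearch).
# EXTENSIONS is a comma-separated list such as "php,html".
expand_ext() {
  awk -v exts="$1" '
    BEGIN { n = split(exts, ext, ",") }
    /^$/ { next }                      # ignore blank lines
    index($0, "%EXT%") {               # entry uses the placeholder
      for (i = 1; i <= n; i++) {
        line = $0
        gsub(/%EXT%/, ext[i], line)    # one output line per extension
        print line
      }
      next
    }
    { print }                          # no placeholder: keep entry as-is
  '
}
```

For example, `printf 'index.%%EXT%%\nadmin\n' | expand_ext php,html` prints `index.php`, `index.html`, and `admin`, which shows why a plain wordlist gains nothing from `-e` alone.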

Recursive Scanning

# Enable recursion
python3 dirsearch.py -u http://target.com -w wordlist.txt -r

# With depth limit
python3 dirsearch.py -u http://target.com -w wordlist.txt -r --max-recursion-depth 2

# Deep recursion (-R is the short form of --max-recursion-depth)
python3 dirsearch.py -u http://target.com -w wordlist.txt -r -R 5

Response Filtering

# Exclude specific status codes
python3 dirsearch.py -u http://target.com -w wordlist.txt -x 404,403

# Include only specific codes
python3 dirsearch.py -u http://target.com -w wordlist.txt -i 200,301,302

# Filter by response size
python3 dirsearch.py -u http://target.com -w wordlist.txt --min-response-size 1000 --max-response-size 5000

Custom Headers and Authentication

# Add custom header
python3 dirsearch.py -u http://target.com -w wordlist.txt -H "Authorization: Bearer token123"

# Multiple headers
python3 dirsearch.py -u http://target.com -w wordlist.txt \
  -H "X-Forwarded-For: 127.0.0.1" \
  -H "User-Agent: Mozilla/5.0"

# Custom User-Agent
python3 dirsearch.py -u http://target.com -w wordlist.txt --user-agent "Mozilla/5.0 (Windows NT 10.0)"

Advanced Techniques

Proxy Usage

# HTTP proxy (Burp Suite)
python3 dirsearch.py -u http://target.com -w wordlist.txt --proxy http://127.0.0.1:8080

# SOCKS proxy
python3 dirsearch.py -u http://target.com -w wordlist.txt --proxy socks5://127.0.0.1:9050

SSL/TLS Configuration

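# HTTPS target with a self-signed or invalid certificate
# Note: -k is not among dirsearch's documented options; recent releases appear to skip certificate verification by default, so confirm with --help for your version
python3 dirsearch.py -u https://target.com -w wordlist.txt

# Custom timeout for slow connections
python3 dirsearch.py -u https://target.com -w wordlist.txt --timeout 30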

Output Formats

# Simple text output
python3 dirsearch.py -u http://target.com -w wordlist.txt --format simple -o results.txt

# JSON output
python3 dirsearch.py -u http://target.com -w wordlist.txt --format json -o results.json

# CSV output
python3 dirsearch.py -u http://target.com -w wordlist.txt --format csv -o results.csv

Configuration File

# Create dirsearch config (section and key names follow the config.ini shipped with dirsearch)
cat > .dirsearch.conf << 'EOF'
[general]
threads = 50
recursive = True
max-recursion-depth = 2
exclude-status = 404,403

[dictionary]
default-extensions = php,html,js,txt
wordlists = /usr/share/wordlists/dirb/common.txt

[connection]
timeout = 10
EOF

# Use config
python3 dirsearch.py -u http://target.com --config .dirsearch.conf

Practical Examples

Complete Web App Enumeration

# Comprehensive scan
python3 dirsearch.py \
  -u http://target.com \
  -w /usr/share/wordlists/dirbuster/directory-list-2.3-medium.txt \
  -e php,html,js,json,xml,txt \
  -t 100 \
  -r \
  --max-recursion-depth 3 \
  -x 404,403,500 \
  -o full_scan.json \
  --format json

API Endpoint Discovery

# Find API endpoints
python3 dirsearch.py \
  -u http://target.com/api \
  -w ./api_wordlist.txt \
  -e json,xml \
  -t 150 \
  -r \
  --max-recursion-depth 2 \
  -i 200,201,400,401 \
  -o api_endpoints.json \
  --format json

Admin Panel Search

# Focused admin/management panel search
cat > admin_wordlist.txt << EOF
admin
administrator
panel
management
cpanel
backend
control
console
dashboard
EOF

python3 dirsearch.py \
  -u http://target.com \
  -w admin_wordlist.txt \
  -e php,asp,aspx \
  -t 50

Wordlist Strategies

Using Different Wordlists

# Bundled default wordlist (db/dicc.txt in the dirsearch repository)
python3 dirsearch.py -u http://target.com -w db/dicc.txt

# Big wordlist
python3 dirsearch.py -u http://target.com -w SecLists/Discovery/Web-Content/big.txt -t 200

# SecLists integration
python3 dirsearch.py -u http://target.com -w SecLists/Discovery/Web-Content/common.txt

# PHP-specific (./php_wordlist.txt is a placeholder for your own PHP-focused list)
python3 dirsearch.py -u http://target.com -w ./php_wordlist.txt -e php

Create Custom Wordlists

# Extract words from website
curl -s http://target.com | tr ' ' '\n' | grep -E '^[a-z]+$' | sort -u > site_words.txt

# Technology-specific
cat > django_wordlist.txt << EOF
admin
api
static
media
accounts
profile
settings
EOF

# Combine wordlists
cat db/dicc.txt SecLists/Discovery/Web-Content/API.txt | sort -u > combined.txt
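
The site-word one-liner above keeps only whitespace-separated tokens made entirely of lowercase letters, which discards most of what sits inside markup. A slightly more tolerant sketch follows; `html_words` is a hypothetical helper name, and the minimum length of 3 is an arbitrary default rather than anything dirsearch requires.

```shell
# html_words [MIN_LEN] < page.html
# Hypothetical helper: split input on anything that is not a letter, digit,
# underscore or hyphen, lowercase the tokens, drop short ones, de-duplicate.
html_words() {
  min_len=${1:-3}
  LC_ALL=C tr -cs 'A-Za-z0-9_-' '\n' |
    LC_ALL=C tr 'A-Z' 'a-z' |
    awk -v min="$min_len" 'length($0) >= min && /^[a-z0-9]/' |
    LC_ALL=C sort -u
}
```

Typical use would be `curl -s http://target.com | html_words > site_words.txt`. Tag and attribute names such as `href` end up in the output too, so expect to prune the result by hand.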

Comparison with Alternatives

Tool	Speed	Recursion	Language	Best For
Dirsearch	Fast	Excellent	Python	Flexible, advanced scanning
Feroxbuster	Very Fast	Excellent	Rust	Maximum speed, automation
Gobuster	Fast	None built-in	Go	DNS, vhost, API discovery
FFUF	Very Fast	Optional (`-recursion`)	Go	Custom fuzzing, payloads
DirBuster	Slow	Limited	Java	GUI, manual assessment

Common Scenarios

WordPress Enumeration

# WordPress theme/plugin discovery (./wordpress_wordlist.txt is a placeholder for a WordPress-focused list)
python3 dirsearch.py -u http://wordpress.local -w ./wordpress_wordlist.txt -e php

# Include WordPress-specific extensions
python3 dirsearch.py \
  -u http://wordpress.local \
  -w db/dicc.txt \
  -e php,txt,xml,sql \
  -r --max-recursion-depth 2

Joomla Scanning

# Joomla directory discovery (./joomla_wordlist.txt is a placeholder for a Joomla-focused list)
python3 dirsearch.py -u http://joomla.local -w ./joomla_wordlist.txt -e php

# Component enumeration
python3 dirsearch.py \
  -u http://joomla.local/components \
  -w ./joomla_components.txt \
  -r --max-recursion-depth 2

API Testing

# REST API endpoints
python3 dirsearch.py \
  -u http://api.target.com \
  -w ./api_paths.txt \
  -e json,xml \
  -H "Content-Type: application/json"

# GraphQL endpoint discovery
python3 dirsearch.py -u http://target.com -w db/dicc.txt -e graphql,gql

Performance Tuning

Speed Optimization

# Maximum threads for speed
python3 dirsearch.py -u http://target.com -w wordlist.txt -t 200

# Minimal extensions for speed
python3 dirsearch.py -u http://target.com -w wordlist.txt -e php

# Exclude large responses (false positives)
python3 dirsearch.py -u http://target.com -w wordlist.txt --max-response-size 5000

Stealth Optimization

# Slow scan to reduce noise: few threads plus a delay (in seconds) between requests
python3 dirsearch.py -u http://target.com -w wordlist.txt -t 5 --delay 1 --timeout 20

# Distribute across time
# Run smaller scans with delays
for i in {1..10}; do
  python3 dirsearch.py -u http://target.com -w segment_$i.txt -t 10
  sleep 60
done
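
The loop above expects `segment_1.txt` through `segment_10.txt` to exist already. One way to produce them is sketched below; `split_wordlist` and its arguments are hypothetical names chosen to match the loop, not a dirsearch feature.

```shell
# split_wordlist FILE LINES_PER_SEGMENT [PREFIX]
# Hypothetical helper: writes PREFIX1.txt, PREFIX2.txt, ... with at most
# LINES_PER_SEGMENT non-blank entries each (PREFIX defaults to "segment_").
split_wordlist() {
  src=$1
  per=$2
  prefix=${3:-segment_}
  awk -v per="$per" -v prefix="$prefix" '
    NF == 0 { next }                 # skip blank lines
    (count % per) == 0 {             # time to start a new segment file
      if (out != "") close(out)
      out = prefix (++seg) ".txt"
    }
    { print > out; count++ }
  ' "$src"
}
```

After `split_wordlist wordlist.txt 500`, iterating with `for f in segment_*.txt` avoids hard-coding the number of segments.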

Troubleshooting

High False Positives

# Filter by size to reduce false positives
python3 dirsearch.py -u http://target.com -w wordlist.txt \
  --min-response-size 100 \
  --max-response-size 10000

# Exclude error pages
python3 dirsearch.py -u http://target.com -w wordlist.txt -x 404,403,500,502
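
Choosing size thresholds is easier once you know which response size dominates the results, since a catch-all error page usually shows up as one size repeated many times. The sketch below assumes report lines of the form `STATUS SIZE URL` separated by whitespace; that layout is an assumption about the plain-text report, so adjust the field numbers if your output differs. `size_histogram` is a hypothetical helper name.

```shell
# size_histogram < report.txt
# Hypothetical helper: counts how often each SIZE value (field 2) occurs in
# lines that start with a three-digit status code, most frequent first.
size_histogram() {
  awk '
    NF >= 3 && $1 ~ /^[0-9][0-9][0-9]$/ { n[$2]++ }
    END { for (s in n) print n[s], s }
  ' | LC_ALL=C sort -k1,1nr -k2,2
}
```

A size that accounts for most hits is a candidate for exclusion, for example through dirsearch's `--exclude-sizes` option (check `--help` for the accepted size syntax in your version).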

Slow Scanning

# If the target responds slowly or drops connections, lower the thread count and raise the timeout
python3 dirsearch.py -u http://target.com -w wordlist.txt -t 20 --timeout 30

# Use smaller wordlist
python3 dirsearch.py -u http://target.com -w smallwordlist.txt
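
A smaller wordlist can be cut from a larger one rather than written by hand. The sketch below keeps the first N unique entries in their original order and drops blank lines and lines starting with `#`, which are assumed here to be comments. `trim_wordlist` is a hypothetical helper, and the file names in the usage line are placeholders.

```shell
# trim_wordlist FILE MAX
# Hypothetical helper: print up to MAX unique entries from FILE, preserving
# order and skipping blank lines and "#" comment lines.
trim_wordlist() {
  awk -v max="$2" '
    /^[ \t]*(#|$)/ { next }          # comment or blank line
    seen[$0]++ { next }              # duplicate: keep first occurrence only
    { print; if (++kept >= max) exit }
  ' "$1"
}
```

For instance, `trim_wordlist wordlist.txt 1000 > smallwordlist.txt` produces the file used in the command above.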

Best Practices

Use recursive scanning with reasonable depth (2-3)
Filter status codes to reduce noise
Test with smaller wordlists first
Adjust extensions based on target technology
Use size filtering to eliminate false positives
Combine with other reconnaissance tools
Analyze results for patterns
Document findings with proper context
Respect rate limiting and targets’ policies
Proxy scans through Burp Suite when findings need manual validation

Last updated: 2025-03-30 | Dirsearch GitHub