Dirsearch

Dirsearch is a Python-based web path scanner featuring smart recursion, dictionary flexibility, and advanced filtering. It’s designed for discovering hidden directories and files on web servers.

Installation

Linux/Ubuntu

# Clone repository
git clone https://github.com/maurosoria/dirsearch.git
cd dirsearch

# Install dependencies
pip3 install -r requirements.txt

# Make executable
chmod +x dirsearch.py
sudo ln -s "$(pwd)/dirsearch.py" /usr/local/bin/dirsearch

macOS

# Homebrew
brew install dirsearch

# Or from source
git clone https://github.com/maurosoria/dirsearch.git
cd dirsearch
pip3 install -r requirements.txt

Basic Usage

Option                              Description
-u, --url <URL>                     Target URL
-w, --wordlists <FILE>              Wordlist file path
-e, --extensions <EXT>              Extensions (php,html,js,txt)
-t, --threads <NUM>                 Thread count (default comes from config.ini; 25 in recent releases)
-r, --recursive                     Recursive search
-R, --max-recursion-depth <NUM>     Maximum recursion depth (default: 0, meaning unlimited)
--format <FORMAT>                   Report format (simple, json, csv); -f is taken by --force-extensions
-o, --output <FILE>                 Output file
-p, --proxy <URL>                   HTTP or SOCKS proxy
-H, --header <HEADER>               Custom header (repeat the flag for several)
-x, --exclude-status <CODES>        Exclude status codes
-i, --include-status <CODES>        Include specific status codes
--min-response-size <SIZE>          Minimum response size
--max-response-size <SIZE>          Maximum response size
--timeout <SEC>                     Connection timeout
--user-agent <UA>                   Custom User-Agent

Essential Commands

Basic Directory Scan

# Simple scan
python3 dirsearch.py -u http://target.com -w /usr/share/wordlists/dirb/common.txt

# With extensions
python3 dirsearch.py -u http://target.com -w wordlist.txt -e php,html,js

# Specific thread count
python3 dirsearch.py -u http://target.com -w wordlist.txt -t 100

# Save results
python3 dirsearch.py -u http://target.com -w wordlist.txt -o results.txt
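
According to dirsearch's help text, -e by default only substitutes the %EXT% keyword inside wordlist entries; extensions are appended to every entry only when -f (--force-extensions) is also given. The sketch below imitates that default substitution in plain shell, to show why a generic wordlist without %EXT% placeholders gains nothing from -e alone. The function name expand_ext is hypothetical, and this is an illustration of the documented behaviour, not dirsearch's actual code.

```shell
# Illustration only: mimic the default %EXT% substitution described in dirsearch's help.
# expand_ext is a hypothetical helper, not part of dirsearch.
expand_ext() {
  exts=$1                      # comma-separated list, e.g. "php,html"
  while IFS= read -r word; do
    case $word in
      *%EXT%*)
        # Entry contains the placeholder: emit one line per extension
        for ext in $(printf '%s' "$exts" | tr ',' ' '); do
          printf '%s\n' "$word" | sed "s/%EXT%/$ext/g"
        done
        ;;
      *)
        # No placeholder: the entry is passed through unchanged
        printf '%s\n' "$word"
        ;;
    esac
  done
}
```

For example, `printf 'admin\nindex.%%EXT%%\n' | expand_ext php,html` prints admin, index.php and index.html on separate lines. If your wordlist has no %EXT% entries, adding -f to the dirsearch command is probably what you want.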

Recursive Scanning

# Enable recursion
python3 dirsearch.py -u http://target.com -w wordlist.txt -r

# With depth limit
python3 dirsearch.py -u http://target.com -w wordlist.txt -r --max-recursion-depth 2

# Deep recursion
python3 dirsearch.py -u http://target.com -w wordlist.txt -r --max-recursion-depth 5

Response Filtering

# Exclude specific status codes
python3 dirsearch.py -u http://target.com -w wordlist.txt -x 404,403

# Include only specific codes
python3 dirsearch.py -u http://target.com -w wordlist.txt -i 200,301,302

# Filter by response size
python3 dirsearch.py -u http://target.com -w wordlist.txt --min-response-size 1000 --max-response-size 5000

Custom Headers and Authentication

# Add custom header
python3 dirsearch.py -u http://target.com -w wordlist.txt -H "Authorization: Bearer token123"

# Multiple headers
python3 dirsearch.py -u http://target.com -w wordlist.txt \
  -H "X-Forwarded-For: 127.0.0.1" \
  -H "User-Agent: Mozilla/5.0"

# Custom User-Agent
python3 dirsearch.py -u http://target.com -w wordlist.txt --user-agent "Mozilla/5.0 (Windows NT 10.0)"

Advanced Techniques

Proxy Usage

# HTTP proxy (Burp Suite)
python3 dirsearch.py -u http://target.com -w wordlist.txt --proxy http://127.0.0.1:8080

# SOCKS proxy
python3 dirsearch.py -u http://target.com -w wordlist.txt --proxy socks5://127.0.0.1:9050

SSL/TLS Configuration

# HTTPS target (dirsearch does not verify certificates by default, so no extra flag is needed)
python3 dirsearch.py -u https://target.com -w wordlist.txt

# Custom timeout for slow connections
python3 dirsearch.py -u https://target.com -w wordlist.txt --timeout 30

Output Formats

# Simple text output
python3 dirsearch.py -u http://target.com -w wordlist.txt --format simple

# JSON output
python3 dirsearch.py -u http://target.com -w wordlist.txt --format json -o results.json

# CSV output
python3 dirsearch.py -u http://target.com -w wordlist.txt --format csv -o results.csv

Configuration File

# Create dirsearch config (section and key names follow the config.ini shipped with dirsearch)
cat > .dirsearch.conf << EOF
[general]
threads = 50
recursive = True
max-recursion-depth = 2
exclude-status = 404,403

[dictionary]
default-extensions = php,html,js,txt
wordlists = /usr/share/wordlists/dirb/common.txt

[connection]
timeout = 10
EOF

# Use config
python3 dirsearch.py -u http://target.com --config .dirsearch.conf

Practical Examples

Complete Web App Enumeration

# Comprehensive scan
python3 dirsearch.py \
  -u http://target.com \
  -w /usr/share/wordlists/dirbuster/directory-list-2.3-medium.txt \
  -e php,html,js,json,xml,txt \
  -t 100 \
  -r \
  --max-recursion-depth 3 \
  -x 404,403,500 \
  -o full_scan.json \
  --format json

API Endpoint Discovery

# Find API endpoints
python3 dirsearch.py \
  -u http://target.com/api \
  -w ./api_wordlist.txt \
  -e json,xml \
  -t 150 \
  -r \
  --max-recursion-depth 2 \
  -i 200,201,400,401 \
  -o api_endpoints.json \
  --format json

Admin Panel Discovery

# Focused admin/management panel search
cat > admin_wordlist.txt << EOF
admin
administrator
panel
management
cpanel
backend
control
console
dashboard
EOF

python3 dirsearch.py \
  -u http://target.com \
  -w admin_wordlist.txt \
  -e php,asp,aspx \
  -t 50

Wordlist Strategies

Using Different Wordlists

# Common directories
python3 dirsearch.py -u http://target.com -w db/dicc/common.txt

# Big wordlist
python3 dirsearch.py -u http://target.com -w db/dicc/big.txt -t 200

# SecLists integration
python3 dirsearch.py -u http://target.com -w SecLists/Discovery/Web-Content/common.txt

# PHP-specific
python3 dirsearch.py -u http://target.com -w db/dicc/php.txt -e php

Create Custom Wordlists

# Extract words from website
curl -s http://target.com | tr ' ' '\n' | grep -E '^[a-z]+$' | sort -u > site_words.txt

# Technology-specific
cat > django_wordlist.txt << EOF
admin
api
static
media
accounts
profile
settings
EOF

# Combine wordlists
cat db/dicc/common.txt SecLists/Discovery/Web-Content/API.txt | sort -u > combined.txt
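
Splitting a page on spaces, as above, mostly yields visible text rather than paths. A complementary sketch is to pull path segments out of href and src attributes, since those are names the site really uses. The helper name extract_path_words is hypothetical; it assumes double-quoted attributes and a grep that supports -o (GNU and BSD grep both do).

```shell
# Hypothetical helper: read HTML on stdin, print the unique path segments
# found in double-quoted href/src attributes.
extract_path_words() {
  grep -oE '(href|src)="[^"]*"' |
    sed -E \
      -e 's/^(href|src)="//' \
      -e 's/"$//' \
      -e 's#^([a-zA-Z][a-zA-Z0-9+.-]*:)?//[^/]*##' \
      -e 's/[?#].*$//' |             # drop scheme+host, then query string and fragment
    tr '/' '\n' |                    # one path segment per line
    grep -E '^[A-Za-z0-9_.-]+$' |    # keep plain names, drop empties and odd tokens
    LC_ALL=C sort -u
}
```

A possible use is `curl -s http://target.com | extract_path_words > site_paths.txt`, after which the result can be merged into combined.txt with sort -u like any other wordlist.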

Comparison with Alternatives

Tool         Speed      Recursion  Language  Best For
Dirsearch    Fast       Excellent  Python    Flexible, advanced scanning
Feroxbuster  Very Fast  Excellent  Rust      Maximum speed, automation
Gobuster     Fast       Good       Go        DNS, vhost, API discovery
FFUF         Very Fast  Manual     Go        Custom fuzzing, payloads
DirBuster    Slow       Limited    Java      GUI, manual assessment

Common Scenarios

WordPress Enumeration

# WordPress theme/plugin discovery
python3 dirsearch.py -u http://wordpress.local -w db/dicc/wordpress.txt -e php

# Include WordPress-specific extensions
python3 dirsearch.py \
  -u http://wordpress.local \
  -w db/dicc/common.txt \
  -e php,txt,xml,sql \
  -r --max-recursion-depth 2

Joomla Scanning

# Joomla directory discovery
python3 dirsearch.py -u http://joomla.local -w db/dicc/joomla.txt -e php

# Component enumeration
python3 dirsearch.py \
  -u http://joomla.local/components \
  -w ./joomla_components.txt \
  -r --max-recursion-depth 2

API Testing

# REST API endpoints
python3 dirsearch.py \
  -u http://api.target.com \
  -w ./api_paths.txt \
  -e json,xml \
  -H "Content-Type: application/json"

# GraphQL endpoint discovery
python3 dirsearch.py -u http://target.com -w db/dicc/common.txt -e graphql,gql

Performance Tuning

Speed Optimization

# Maximum threads for speed
python3 dirsearch.py -u http://target.com -w wordlist.txt -t 200

# Minimal extensions for speed
python3 dirsearch.py -u http://target.com -w wordlist.txt -e php

# Exclude large responses (false positives)
python3 dirsearch.py -u http://target.com -w wordlist.txt --max-response-size 5000

Stealth Optimization

# Slow scan to avoid detection (few threads plus a delay between requests)
python3 dirsearch.py -u http://target.com -w wordlist.txt -t 5 --delay 1 --timeout 20

# Distribute across time
# Run smaller scans with delays
for i in {1..10}; do
  python3 dirsearch.py -u http://target.com -w segment_$i.txt -t 10
  sleep 60
done
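
The loop above expects files named segment_1.txt through segment_10.txt to exist already. One way to produce them, sketched here with a hypothetical helper called split_wordlist, is to deal the lines of a wordlist round-robin into N files with awk. Round-robin is a choice made for this sketch: it keeps the segments within one line of each other in size and always creates all N files when the wordlist has at least N lines.

```shell
# Hypothetical helper: deal the lines of a wordlist round-robin into N files
# named segment_1.txt ... segment_N.txt in the current directory.
split_wordlist() {
  wordlist=$1
  count=${2:-10}               # number of segments, 10 if not given
  awk -v n="$count" '{
    out = "segment_" ((NR - 1) % n + 1) ".txt"
    print > out
  }' "$wordlist"
}
```

Running `split_wordlist wordlist.txt 10` before the loop creates the ten segment files; any existing files with those names are overwritten.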

Troubleshooting

High False Positives

# Filter by size to reduce false positives
python3 dirsearch.py -u http://target.com -w wordlist.txt \
  --min-response-size 100 \
  --max-response-size 10000

# Exclude error pages
python3 dirsearch.py -u http://target.com -w wordlist.txt -x 404,403,500,502
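
Size bounds are easier to pick once you know what the target returns for a path that cannot exist. The sketch below requests such a path with curl and prints the status code and body size; a 200 with a stable size suggests a custom "not found" page whose size is worth filtering out. Both function names are hypothetical, and curl is assumed to be installed.

```shell
# Hypothetical helpers for spotting "soft 404" pages (not part of dirsearch).

# Print a path that is very unlikely to exist on the target.
probe_path() {
  printf 'nonexistent-%s-%s\n' "$$" "$(date +%s)"
}

# Request that path and print "<status> <size in bytes>".
# %{http_code} and %{size_download} are curl --write-out variables.
soft404_probe() {
  base_url=${1%/}              # strip one trailing slash, if any
  curl -s -o /dev/null -w '%{http_code} %{size_download}\n' "$base_url/$(probe_path)"
}
```

Call it as `soft404_probe http://target.com`, ideally two or three times, and compare the sizes before choosing values for --min-response-size and --max-response-size.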

Slow Scanning

# Reduce thread count and increase timeout
python3 dirsearch.py -u http://target.com -w wordlist.txt -t 20 --timeout 30

# Use smaller wordlist
python3 dirsearch.py -u http://target.com -w smallwordlist.txt

Best Practices

  • Use recursive scanning with reasonable depth (2-3)
  • Filter status codes to reduce noise
  • Test with smaller wordlists first
  • Adjust extensions based on target technology
  • Use size filtering to eliminate false positives
  • Combine with other reconnaissance tools
  • Analyze results for patterns
  • Document findings with proper context
  • Respect rate limiting and targets’ policies
  • Route traffic through a proxy such as Burp Suite for manual validation
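
As one way to analyze results for patterns, the sketch below counts discovered URLs per top-level directory. It assumes a results file with one URL per line, which is what the simple report format is expected to produce; check your own output before relying on it. The helper name count_by_top_dir is hypothetical.

```shell
# Hypothetical helper: read URLs (one per line) on stdin and count hits per
# first path segment, most frequent first.
count_by_top_dir() {
  sed -E 's#^[a-zA-Z][a-zA-Z0-9+.-]*://[^/]*/?##' |   # drop scheme, host and leading slash
    cut -d/ -f1 |                                     # keep the first path segment
    grep -v '^$' |                                    # ignore the site root
    sort |
    uniq -c |
    sort -rn
}
```

For example, `count_by_top_dir < results.txt` lists each top-level directory with its hit count, which makes clusters such as many hits under one directory easy to spot.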

Last updated: 2025-03-30 | Dirsearch GitHub