Gau

Gau (Get All URLs) is a high-speed tool that fetches all known URLs for a domain from multiple passive sources: AlienVault OTX, the Wayback Machine, Common Crawl, and URLScan. Querying all four concurrently is faster and more comprehensive than checking each source individually.

Installation

From Releases

# Linux
wget https://github.com/lc/gau/releases/latest/download/gau_linux_amd64
mv gau_linux_amd64 gau
chmod +x gau
sudo mv gau /usr/local/bin/

# macOS (curl ships with macOS; use gau_darwin_arm64 on Apple Silicon)
curl -Lo gau https://github.com/lc/gau/releases/latest/download/gau_darwin_amd64
chmod +x gau
sudo mv gau /usr/local/bin/

# Go Install
go install github.com/lc/gau/v2/cmd/gau@latest

Verify Installation

gau -help
echo "example.com" | gau

Basic Usage

Command                           Description
echo "example.com" | gau          Fetch all URLs
gau example.com                   Direct argument
gau -subs example.com             Include subdomains
gau -o results.txt example.com    Save to file
gau --json example.com            Output as JSON
gau example.com | wc -l           Count URLs

URL Enumeration

Basic URL Discovery

# Fetch all URLs for domain
echo "example.com" | gau

# Include subdomains
echo "example.com" | gau -subs

# Root domain only (the default; subdomains are excluded unless -subs is given)
echo "example.com" | gau

# Save to file
gau example.com -o urls.txt

# Output as JSON
gau --json example.com

# Count discovered URLs
gau example.com | wc -l

# Get unique URLs
gau example.com | sort -u

Provider Control

# Use specific provider only
gau --providers otx example.com
gau --providers wayback example.com
gau --providers commoncrawl example.com
gau --providers urlscan example.com

# Use multiple providers
gau --providers otx,wayback example.com

# List available providers
gau -help | grep providers

# Default: all providers (fastest comprehensive results)
gau example.com

Multiple Domains

# Process domain list
cat domains.txt | gau > all_urls.txt

# With subdomains
cat domains.txt | gau -subs > all_urls_with_subs.txt

# Save per domain
while IFS= read -r domain; do
  echo "[*] Processing $domain..."
  gau "$domain" -o "urls_${domain}.txt"
  echo "  Found: $(wc -l < "urls_${domain}.txt") URLs"
done < domains.txt

# Combine and deduplicate
cat urls_*.txt | sort -u > combined_urls.txt

Advanced Techniques

OSINT & Reconnaissance

Comprehensive Endpoint Discovery

#!/bin/bash
# Complete URL enumeration and analysis

TARGET="example.com"
OUTPUT_DIR="urls_recon"
mkdir -p "$OUTPUT_DIR"

echo "[*] Fetching all URLs..."
gau $TARGET | sort -u > "$OUTPUT_DIR/all_urls.txt"

# Separate by extension
echo "[*] Analyzing URLs..."

# API endpoints
grep -iE "(api|rest|graphql)" "$OUTPUT_DIR/all_urls.txt" | sort -u > "$OUTPUT_DIR/api_endpoints.txt"

# Web pages
grep -iE "\.(html|php|asp|jsp)$" "$OUTPUT_DIR/all_urls.txt" | sort -u > "$OUTPUT_DIR/pages.txt"

# Parameters
grep "?" "$OUTPUT_DIR/all_urls.txt" > "$OUTPUT_DIR/urls_with_params.txt"

# Paths without domain
sed "s|.*://[^/]*||" "$OUTPUT_DIR/all_urls.txt" | sort -u > "$OUTPUT_DIR/paths.txt"

# Summary
echo ""
echo "=== Enumeration Summary ==="
echo "Total URLs: $(wc -l < $OUTPUT_DIR/all_urls.txt)"
echo "API endpoints: $(wc -l < $OUTPUT_DIR/api_endpoints.txt)"
echo "Web pages: $(wc -l < $OUTPUT_DIR/pages.txt)"
echo "URLs with params: $(wc -l < $OUTPUT_DIR/urls_with_params.txt)"
echo "Unique paths: $(wc -l < $OUTPUT_DIR/paths.txt)"

Parameter Extraction

#!/bin/bash
# Extract and analyze parameters

TARGET="example.com"

echo "[*] Extracting parameters..."
gau $TARGET | grep "?" > params_urls.txt

# Get all parameter names
echo "=== All Parameters ==="
grep -oP '[?&]\K[^=&]+' params_urls.txt | sort -u

# Most common parameters
echo ""
echo "=== Most Common Parameters ==="
grep -oP '[?&]\K[^=&]+' params_urls.txt | sort | uniq -c | sort -rn | head -20

# name=value pairs with numeric values (IDOR potential)
echo ""
echo "=== Numeric ID Parameters ==="
grep -oP '[?&]\K[^=&]+=[0-9]{1,10}(?=&|$)' params_urls.txt | sort -u

# Complex queries
echo ""
echo "=== Complex Query Strings ==="
grep -E '\?.*&.*&' params_urls.txt | head -20
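
The parameter-name extraction used above can be sanity-checked offline against a few hypothetical URLs (no gau call, no network; sample file path and contents are made up for the demo):

```shell
#!/bin/sh
# Hypothetical sample URLs to exercise the parameter-name pipeline offline
cat > /tmp/gau_param_sample.txt <<'EOF'
https://example.com/search?q=test&page=2
https://example.com/item?id=1337
https://example.com/item?id=42&ref=home
EOF

# Same extraction as above: names after '?' or '&', deduplicated
grep -oP '[?&]\K[^=&]+' /tmp/gau_param_sample.txt | sort -u
# Prints: id, page, q, ref (one per line)
```

Testing the pipeline on known input first makes it easier to trust the output when it runs over thousands of real URLs.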

API Endpoint Discovery

#!/bin/bash
# Find API endpoints and methods

TARGET="example.com"

echo "[*] Discovering API structure..."
gau $TARGET | grep -iE "(api|v1|v2|v3|graphql|rest)" > api_urls.txt

# Extract API paths
echo "=== API Paths ==="
grep -oE "/[a-z0-9/_-]*" api_urls.txt | sort -u | head -30

# Find REST patterns
echo ""
echo "=== REST Endpoints ==="
grep -iE "/(users|posts|items|products|accounts)(/[0-9]+)?" api_urls.txt | sort -u

# GraphQL endpoints
echo ""
echo "=== GraphQL Endpoints ==="
grep -i "graphql" api_urls.txt

# API versioning
echo ""
echo "=== API Versions ==="
grep -oE "/v[0-9]+(\.[0-9]+)?" api_urls.txt | sort -u

# Webhook/callback patterns
echo ""
echo "=== Webhook Patterns ==="
grep -iE "(webhook|callback|hook)" api_urls.txt

Historical/Stale Endpoint Detection

#!/bin/bash
# Find potentially stale endpoints

TARGET="example.com"

gau $TARGET > all_urls.txt

# Old endpoints (various patterns)
echo "=== Potentially Stale Endpoints ==="
grep -iE "(v1|old|legacy|deprecated|archive|beta)" all_urls.txt

# Test endpoints
grep -iE "test" all_urls.txt

# Development paths
grep -iE "(dev|staging|sandbox)" all_urls.txt

# Backup endpoints
grep -iE "(backup|export|download)" all_urls.txt

# Hidden/obscured paths
grep -iE "(hidden|private|secret|internal)" all_urls.txt

Integration with Other Tools

Chain with Grep and Filtering

#!/bin/bash
# Advanced filtering

TARGET="example.com"

# Find URLs with specific extensions
echo "=== PHP Files ==="
gau $TARGET | grep "\.php"

# Find JS files (for source code analysis)
echo ""
echo "=== JavaScript Files ==="
gau $TARGET | grep "\.js$" | head -20

# JSON endpoints
echo ""
echo "=== JSON Endpoints ==="
gau $TARGET | grep "\.json$"

# API with Bearer tokens
echo ""
echo "=== Potential Auth URLs ==="
gau $TARGET | grep -iE "(auth|token|login|oauth)"

Chain with Curl for Status Checking

#!/bin/bash
# Check which endpoints still exist

TARGET="example.com"

echo "Checking endpoint availability..."
# Limit to 50 URLs for speed; pipe directly so URLs are read line by line
# (assigning to a variable and echoing it would collapse the newlines)
gau "$TARGET" | head -50 | while read -r url; do
  status=$(timeout 2 curl -s -o /dev/null -w "%{http_code}" "$url" 2>/dev/null)
  if [ "$status" = "200" ] || [ "$status" = "301" ] || [ "$status" = "302" ]; then
    echo "$url: $status"
  fi
done | tee active_urls.txt

Find Subdomains from URLs

#!/bin/bash
# Extract subdomains from all URLs

TARGET="example.com"

gau $TARGET | \
  grep -oE "https?://[^/]+" | \
  sed "s|https://||; s|http://||" | \
  sort -u > subdomains.txt

echo "Unique subdomains found: $(wc -l < subdomains.txt)"
cat subdomains.txt
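
The host-extraction pipeline can be verified offline on a hypothetical sample before pointing it at live gau output (file path and URLs below are made up for the demo):

```shell
#!/bin/sh
# Hypothetical URLs to exercise the host-extraction pipeline offline
cat > /tmp/gau_host_sample.txt <<'EOF'
https://app.example.com/login
http://example.com/index.html
https://api.example.com/v1/users
https://app.example.com/dashboard
EOF

# Same pipeline as above: pull scheme://host, strip the scheme, dedupe
grep -oE "https?://[^/]+" /tmp/gau_host_sample.txt | \
  sed "s|https://||; s|http://||" | \
  sort -u
# Prints: api.example.com, app.example.com, example.com
```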

Chain with Unfurl for URL Analysis

#!/bin/bash
# Deep URL analysis (requires unfurl: github.com/tomnomnom/unfurl)

TARGET="example.com"

gau $TARGET | unfurl paths | sort -u > unique_paths.txt

# Extract all parameter names
gau $TARGET | unfurl keys | sort -u

# Extract domains from URLs
gau $TARGET | unfurl domains | sort -u

Filtering & Analysis

Extract Specific Data

#!/bin/bash
# Parse URLs for structured insights

URLS="urls.txt"

# Extract domain from URL
echo "=== Subdomains Found ==="
grep -oE "https?://[^/]+" $URLS | \
  sed "s|https://||; s|http://||" | \
  grep "\." | sort -u

# Extract paths only
echo ""
echo "=== All Paths ==="
sed "s|.*://[^/]*||" $URLS | sort -u | head -30

# Extract query parameter names
echo ""
echo "=== All Parameters ==="
grep "?" $URLS | grep -oE "[?&][^=&]+" | tr -d '?&' | sort -u

# Find URLs by path depth
echo ""
echo "=== Single-level Paths ==="
sed "s|.*://[^/]*||" $URLS | grep "^/[^/]*$" | sort -u

echo ""
echo "=== Multi-level Paths ==="
sed "s|.*://[^/]*||" $URLS | grep "/" | grep -v "^/$" | sort -u
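
These path filters can be checked offline on a small hypothetical sample (the multi-level filter here requires at least two slashes in the path; sample data is made up for the demo):

```shell
#!/bin/sh
# Hypothetical URLs to exercise the path-depth filters offline
cat > /tmp/gau_depth_sample.txt <<'EOF'
https://example.com/about
https://example.com/blog/2023/post
https://example.com/api/users
EOF

echo "=== Single-level Paths ==="
sed "s|.*://[^/]*||" /tmp/gau_depth_sample.txt | grep "^/[^/]*$" | sort -u
# Prints: /about

echo "=== Multi-level Paths ==="
sed "s|.*://[^/]*||" /tmp/gau_depth_sample.txt | grep "/.*/" | sort -u
# Prints: /api/users and /blog/2023/post
```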

Deduplication & Cleanup

# Remove duplicates
sort -u urls.txt > unique_urls.txt

# Remove by path (ignore domains)
sed "s|.*://[^/]*||" urls.txt | sort -u > unique_paths.txt

# Keep only URLs matching a literal pattern (.php with a query string)
grep -F ".php?" urls.txt | sort -u > php_urls.txt

# Remove common false positives
grep -v -E "^$|^#" urls.txt | sort -u

Data Export & Reporting

Generate Summary Report

#!/bin/bash
# Comprehensive analysis report

TARGET="example.com"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
REPORT="gau_report_${TARGET}_${TIMESTAMP}.txt"

cat > $REPORT << EOF
=== GAU URL ENUMERATION REPORT ===
Domain: $TARGET
Timestamp: $TIMESTAMP
Tool: gau (github.com/lc/gau)

=== STATISTICS ===
EOF

# Fetch once, then compute all statistics from the cached copy
URLFILE=$(mktemp)
gau $TARGET > "$URLFILE"
TOTAL=$(wc -l < "$URLFILE")
UNIQUE=$(sort -u "$URLFILE" | wc -l)
WITH_PARAMS=$(grep -c "?" "$URLFILE")

cat >> $REPORT << EOF
Total URLs: $TOTAL
Unique URLs: $UNIQUE
URLs with parameters: $WITH_PARAMS

=== URL SAMPLES ===
EOF

echo "" >> $REPORT
echo "Sample URLs:" >> $REPORT
sort -u "$URLFILE" | head -50 >> $REPORT
rm -f "$URLFILE"

echo ""
echo "[+] Report saved to $REPORT"

CSV Export

#!/bin/bash
# Export to CSV format

TARGET="example.com"
OUTPUT="gau_$(date +%Y%m%d).csv"

echo "url,has_params,extension,path_depth" > $OUTPUT

gau $TARGET | sort -u | while read url; do
  # Has parameters?
  params=$(echo "$url" | grep -c "?")

  # Get extension
  ext=$(echo "$url" | grep -oE "\.[a-z0-9]+$" | sed 's/^\.//')
  [ -z "$ext" ] && ext="none"

  # Path depth
  depth=$(echo "$url" | sed "s|.*://[^/]*||" | tr -cd '/' | wc -c)

  echo "$url,$params,$ext,$depth" >> $OUTPUT
done

echo "[+] Exported to $OUTPUT"

Performance Optimization

Speed Tips

# Gau uses multiple sources concurrently for speed
# Already optimized by default

# But you can limit to fastest source
gau --providers wayback example.com  # Fastest usually

# Combine with other tools efficiently
gau example.com | sort -u | while read url; do
  # Process efficiently
  echo "$url"
done

Handle Large Result Sets

# If output is very large, use streams
gau example.com | head -1000 > first_1000.txt

# Process in batches
gau example.com | split -l 1000 - batch_

# Filter early
gau example.com | grep "api" | head -500

Best Practices

  • Gau queries all providers concurrently, so runs are fast even on large domains
  • Use all providers for comprehensive coverage (default)
  • Combine with grep for specific endpoint types
  • Cross-reference with historical data from multiple sources
  • Look for endpoint patterns that reveal architecture
  • Check for deprecated API versions
  • Find admin/debug endpoints from historical records
  • Identify parameter patterns for vulnerability testing
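
One way to act on the last bullet: grep collected URLs for parameter names commonly associated with vulnerability classes. The name list and sample file below are illustrative assumptions, not an exhaustive methodology; in practice, pipe gau output into the same grep.

```shell
#!/bin/sh
# Hypothetical URL sample standing in for real gau output
cat > /tmp/gau_vuln_sample.txt <<'EOF'
https://example.com/go?redirect=https://evil.test
https://example.com/view?file=report.pdf
https://example.com/search?q=hello
EOF

# Names often worth testing: open redirect, LFI/path traversal, SSRF targets
grep -iE "[?&](redirect|url|next|return|dest|file|path|callback)=" /tmp/gau_vuln_sample.txt
# Prints the first two URLs; the plain "q=hello" search URL is filtered out
```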

Common Issues

Empty Results

# Some domains may have limited URL history
gau example.com

# Try with different providers
gau --providers otx example.com
gau --providers commoncrawl example.com

# Check if domain exists
dig example.com

Too Many Results

# Filter to relevant URLs
gau example.com | grep "api" | head -100

# Get only unique paths
gau example.com | sed "s|.*://[^/]*||" | sort -u | head -100

# Sample results
gau example.com | shuf | head -100


Last updated: 2026-03-30