theHarvester - Email and Subdomain Enumeration Tool


Overview

theHarvester is a powerful OSINT (Open Source Intelligence) tool designed to gather email addresses, subdomain names, virtual hosts, open ports, banners, and employee names from different public sources. It is widely used by penetration testers, bug bounty hunters, and security researchers for reconnaissance and information gathering during the initial phases of security assessments.

**Legal Notice**: Only use theHarvester on domains you own or have explicit permission to test. Unauthorized reconnaissance may violate terms of service and local laws.

Installation

Kali Linux Installation

# theHarvester is pre-installed on Kali Linux
theharvester --help

# Update to latest version
sudo apt update
sudo apt install theharvester

# Alternative: Install from GitHub
git clone https://github.com/laramies/theHarvester.git
cd theHarvester
sudo python3 -m pip install -r requirements.txt

Ubuntu/Debian Installation

# Install dependencies
sudo apt update
sudo apt install python3 python3-pip git

# Clone repository
git clone https://github.com/laramies/theHarvester.git
cd theHarvester

# Install Python dependencies
python3 -m pip install -r requirements.txt

# Make executable
chmod +x theHarvester.py

# Create symlink for global access
sudo ln -s $(pwd)/theHarvester.py /usr/local/bin/theharvester

Docker Installation

# Pull official Docker image
docker pull theharvester/theharvester

# Run with Docker
docker run --rm theharvester/theharvester -d google -l 100 -b example.com

# Build from source
git clone https://github.com/laramies/theHarvester.git
cd theHarvester
docker build -t theharvester .

# Run custom build
docker run --rm theharvester -d google -l 100 -b example.com

Python Virtual Environment

# Create virtual environment
python3 -m venv theharvester-env
source theharvester-env/bin/activate

# Clone and install
git clone https://github.com/laramies/theHarvester.git
cd theHarvester
pip install -r requirements.txt

# Run theHarvester
python3 theHarvester.py --help

Basic Usage

Command Structure

# Basic syntax
theharvester -d <domain> -l <limit> -b <source>

# Common usage pattern
theharvester -d example.com -l 500 -b google

# Multiple sources
theharvester -d example.com -l 500 -b google,bing,yahoo

# Save results to file
theharvester -d example.com -l 500 -b google -f results.html

Essential Parameters

# Domain to search
-d, --domain DOMAIN

# Limit number of results
-l, --limit LIMIT

# Data source to use
-b, --source SOURCE

# Output file
-f, --filename FILENAME

# Start result number
-s, --start START

# Enable DNS brute force
-c, --dns-brute

# Enable DNS TLD expansion
-t, --dns-tld

# Enable port scanning
-p, --port-scan

# Take screenshots of resolved domains (newer versions; no short flag)
--screenshot SCREENSHOT
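The flags above can be assembled programmatically before handing them to a subprocess. A minimal sketch that only constructs and prints the argument vector (a dry run, so nothing is executed against a real target; the flag set assumes the version documented above):

```python
def build_harvester_cmd(domain, limit=500, source="google", output=None):
    """Return the argv list for a theHarvester invocation."""
    cmd = ["theharvester", "-d", domain, "-l", str(limit), "-b", source]
    if output:
        cmd += ["-f", output]  # save results to a file
    return cmd

if __name__ == "__main__":
    # Dry run: print the command instead of executing it
    print(" ".join(build_harvester_cmd("example.com", 500, "google,bing")))
```

Passing the returned list to `subprocess.run()` avoids shell-quoting issues when domains or filenames come from user input.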

Data Sources

Search Engines

# Google search
theharvester -d example.com -l 500 -b google

# Bing search
theharvester -d example.com -l 500 -b bing

# Yahoo search
theharvester -d example.com -l 500 -b yahoo

# DuckDuckGo search
theharvester -d example.com -l 500 -b duckduckgo

# Yandex search
theharvester -d example.com -l 500 -b yandex

Social Networks

# LinkedIn search
theharvester -d example.com -l 500 -b linkedin

# Twitter search
theharvester -d example.com -l 500 -b twitter

# Instagram search
theharvester -d example.com -l 500 -b instagram

# Facebook search
theharvester -d example.com -l 500 -b facebook

Professional Databases

# Hunter.io (requires API key)
theharvester -d example.com -l 500 -b hunter

# SecurityTrails (requires API key)
theharvester -d example.com -l 500 -b securitytrails

# Shodan (requires API key)
theharvester -d example.com -l 500 -b shodan

# VirusTotal (requires API key)
theharvester -d example.com -l 500 -b virustotal

Certificate Transparency

# Certificate Transparency logs
theharvester -d example.com -l 500 -b crtsh

# Censys (requires API key)
theharvester -d example.com -l 500 -b censys

# Certificate Spotter
theharvester -d example.com -l 500 -b certspotter

DNS Sources

# DNS dumpster
theharvester -d example.com -l 500 -b dnsdumpster

# Threat Crowd
theharvester -d example.com -l 500 -b threatcrowd

# DNS brute force
theharvester -d example.com -l 500 -b google -c

# TLD expansion
theharvester -d example.com -l 500 -b google -t

Advanced Techniques

Comprehensive Reconnaissance

#!/bin/bash
# comprehensive-recon.sh

DOMAIN="$1"
OUTPUT_DIR="theharvester_results_$(date +%Y%m%d_%H%M%S)"

if [ $# -ne 1 ]; then
    echo "Usage: $0 <domain>"
    exit 1
fi

mkdir -p "$OUTPUT_DIR"

echo "Starting comprehensive reconnaissance for $DOMAIN"

# Search engines
echo "=== Search Engines ==="
theharvester -d "$DOMAIN" -l 500 -b google -f "$OUTPUT_DIR/google.html"
theharvester -d "$DOMAIN" -l 500 -b bing -f "$OUTPUT_DIR/bing.html"
theharvester -d "$DOMAIN" -l 500 -b yahoo -f "$OUTPUT_DIR/yahoo.html"

# Social networks
echo "=== Social Networks ==="
theharvester -d "$DOMAIN" -l 500 -b linkedin -f "$OUTPUT_DIR/linkedin.html"
theharvester -d "$DOMAIN" -l 500 -b twitter -f "$OUTPUT_DIR/twitter.html"

# Certificate transparency
echo "=== Certificate Transparency ==="
theharvester -d "$DOMAIN" -l 500 -b crtsh -f "$OUTPUT_DIR/crtsh.html"

# DNS sources
echo "=== DNS Sources ==="
theharvester -d "$DOMAIN" -l 500 -b dnsdumpster -f "$OUTPUT_DIR/dnsdumpster.html"

# DNS brute force
echo "=== DNS Brute Force ==="
theharvester -d "$DOMAIN" -l 500 -b google -c -f "$OUTPUT_DIR/dns_brute.html"

# All sources combined
echo "=== All Sources Combined ==="
theharvester -d "$DOMAIN" -l 1000 -b all -f "$OUTPUT_DIR/all_sources.html"

echo "Reconnaissance complete. Results saved in $OUTPUT_DIR"

API Key Configuration

# Create API keys configuration file
cat > api-keys.yaml << 'EOF'
apikeys:
  hunter: your_hunter_api_key
  securitytrails: your_securitytrails_api_key
  shodan: your_shodan_api_key
  virustotal: your_virustotal_api_key
  censys:
    id: your_censys_id
    secret: your_censys_secret
  binaryedge: your_binaryedge_api_key
  fullhunt: your_fullhunt_api_key
  github: your_github_token
EOF

# Use configuration file
theharvester -d example.com -l 500 -b hunter --api-keys api-keys.yaml
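Before launching keyed lookups, it can help to sanity-check the keys file for placeholders that were never filled in. A small sketch using simple line parsing rather than a full YAML parser (stdlib only; the `your_` placeholder convention matches the template above):

```python
def find_placeholder_keys(yaml_text):
    """Return names of api-keys.yaml entries still holding 'your_...' placeholders."""
    unfilled = []
    for line in yaml_text.splitlines():
        if ":" not in line or line.strip().startswith("#"):
            continue
        name, _, value = line.partition(":")
        if value.strip().startswith("your_"):
            unfilled.append(name.strip())
    return unfilled
```

Run it against the file before a scan and warn if the returned list is non-empty; sources with placeholder keys will silently return nothing.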

Email Pattern Analysis

#!/usr/bin/env python3
# email-pattern-analyzer.py

import re
import sys
from collections import Counter

def analyze_email_patterns(emails):
    """Analyze email patterns to identify naming conventions"""
    patterns = []
    domains = []

    for email in emails:
        if '@' in email:
            local, domain = email.split('@', 1)
            domains.append(domain.lower())

            # Analyze local part patterns
            if '.' in local:
                if len(local.split('.')) == 2:
                    patterns.append('firstname.lastname')
                else:
                    patterns.append('complex.pattern')
            elif '_' in local:
                patterns.append('firstname_lastname')
            elif any(char.isdigit() for char in local):
                patterns.append('name_with_numbers')
            else:
                patterns.append('single_name')

    return patterns, domains

def extract_names_from_emails(emails):
    """Extract potential names from email addresses"""
    names = []

    for email in emails:
        if '@' in email:
            local = email.split('@')[0]

            # Remove numbers and special characters
            clean_local = re.sub(r'[0-9_.-]', ' ', local)

            # Split into potential name parts
            parts = clean_local.split()
            if len(parts) >= 2:
                names.extend(parts)

    return names

def main():
    if len(sys.argv) != 2:
        print("Usage: python3 email-pattern-analyzer.py <email_list_file>")
        sys.exit(1)

    email_file = sys.argv[1]

    try:
        with open(email_file, 'r') as f:
            content = f.read()

        # Extract emails using regex
        email_pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
        emails = re.findall(email_pattern, content)

        print(f"Found {len(emails)} email addresses")
        print("\n=== Email Addresses ===")
        for email in sorted(set(emails)):
            print(email)

        # Analyze patterns
        patterns, domains = analyze_email_patterns(emails)

        print("\n=== Email Patterns ===")
        pattern_counts = Counter(patterns)
        for pattern, count in pattern_counts.most_common():
            print(f"{pattern}: {count}")

        print("\n=== Domains ===")
        domain_counts = Counter(domains)
        for domain, count in domain_counts.most_common():
            print(f"{domain}: {count}")

        # Extract names
        names = extract_names_from_emails(emails)
        if names:
            print("\n=== Potential Names ===")
            name_counts = Counter(names)
            for name, count in name_counts.most_common(20):
                if len(name) > 2:  # Filter out short strings
                    print(f"{name}: {count}")

    except FileNotFoundError:
        print(f"Error: File {email_file} not found")
    except Exception as e:
        print(f"Error: {e}")

if __name__ == "__main__":
    main()

Subdomain Validation

#!/bin/bash
# subdomain-validator.sh

DOMAIN="$1"
SUBDOMAIN_FILE="$2"

if [ $# -ne 2 ]; then
    echo "Usage: $0 <domain> <subdomain_file>"
    exit 1
fi

echo "Validating subdomains for $DOMAIN"

# Extract subdomains from theHarvester results
grep -oE "[a-zA-Z0-9.-]+\.$DOMAIN" "$SUBDOMAIN_FILE"|sort -u > temp_subdomains.txt

# Validate each subdomain
while read subdomain; do
    if [ -n "$subdomain" ]; then
        echo -n "Checking $subdomain: "

        # DNS resolution check
        if nslookup "$subdomain" >/dev/null 2>&1; then
            echo -n "DNS✓ "

            # HTTP check
            if curl -s --connect-timeout 5 "http://$subdomain" >/dev/null 2>&1; then
                echo "HTTP✓"
            elif curl -s --connect-timeout 5 "https://$subdomain" >/dev/null 2>&1; then
                echo "HTTPS✓"
            else
                echo "No HTTP"
            fi
        else
            echo "DNS✗"
        fi
    fi
done < temp_subdomains.txt

rm temp_subdomains.txt
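The DNS step of the script above can also be parallelized from Python. A sketch using only the standard library (`socket.gethostbyname` for resolution, a thread pool for concurrency):

```python
import socket
from concurrent.futures import ThreadPoolExecutor

def resolve(subdomain):
    """Return (subdomain, ip) if it resolves, (subdomain, None) otherwise."""
    try:
        return subdomain, socket.gethostbyname(subdomain)
    except socket.gaierror:
        return subdomain, None

def resolve_all(subdomains, workers=10):
    """Resolve a list of subdomains concurrently; return {name: ip_or_None}."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(resolve, subdomains))
```

Feed it the deduplicated subdomain list extracted above; names mapping to `None` failed DNS resolution and can be dropped before any HTTP probing.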

Integration with Other Tools

Nmap Integration

#!/bin/bash
# theharvester-nmap-integration.sh

DOMAIN="$1"

if [ $# -ne 1 ]; then
    echo "Usage: $0 <domain>"
    exit 1
fi

# Gather subdomains with theHarvester
echo "Gathering subdomains with theHarvester..."
theharvester -d "$DOMAIN" -l 500 -b all -f harvester_results.html

# Extract IP addresses and subdomains
grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' harvester_results.html|sort -u > ips.txt
grep -oE "[a-zA-Z0-9.-]+\.$DOMAIN" harvester_results.html|sort -u > subdomains.txt

# Scan discovered IPs with Nmap
if [ -s ips.txt ]; then
    echo "Scanning discovered IPs with Nmap..."
    nmap -sS -O -sV -oA nmap_ips -iL ips.txt
fi

# Resolve subdomains and scan
if [ -s subdomains.txt ]; then
    echo "Resolving and scanning subdomains..."
    while read subdomain; do
        ip=$(dig +short "$subdomain"|head -1)
        if [ -n "$ip" ] && [[ "$ip" =~ ^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
            echo "$ip $subdomain" >> resolved_hosts.txt
        fi
    done < subdomains.txt

    if [ -s resolved_hosts.txt ]; then
        nmap -sS -sV -oA nmap_subdomains -iL resolved_hosts.txt
    fi
fi

echo "Integration complete. Check nmap_*.xml files for results."

Metasploit Integration

#!/bin/bash
# theharvester-metasploit-integration.sh

DOMAIN="$1"
WORKSPACE="$2"

if [ $# -ne 2 ]; then
    echo "Usage: $0 <domain> <workspace>"
    exit 1
fi

# Run theHarvester
theharvester -d "$DOMAIN" -l 500 -b all -f harvester_results.html

# Extract emails and hosts
grep -oE '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b' harvester_results.html > emails.txt
grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' harvester_results.html|sort -u > hosts.txt

# Create Metasploit resource script
cat > metasploit_import.rc << EOF
workspace -a $WORKSPACE
workspace $WORKSPACE

# Import hosts
$(while read host; do echo "hosts -a $host"; done < hosts.txt)

# Import emails as notes
$(while read email; do echo "notes -a -t email -d \"$email\" -H $DOMAIN"; done < emails.txt)

# Run auxiliary modules
use auxiliary/gather/dns_enum
set DOMAIN $DOMAIN
run

use auxiliary/scanner/http/http_version
set RHOSTS file:hosts.txt
run

workspace
hosts
notes
EOF

echo "Metasploit resource script created: metasploit_import.rc"
echo "Run with: msfconsole -r metasploit_import.rc"

Recon-ng Integration

#!/usr/bin/env python3
# theharvester-recon-ng-integration.py

import subprocess
import re
import json

class TheHarvesterReconIntegration:
    def __init__(self, domain):
        self.domain = domain
        self.results = {
            'emails': [],
            'subdomains': [],
            'ips': [],
            'social_profiles': []
        }

    def run_theharvester(self):
        """Run theHarvester and parse results"""
        try:
            # Run theHarvester with multiple sources
            cmd = ['theharvester', '-d', self.domain, '-l', '500', '-b', 'all']
            result = subprocess.run(cmd, capture_output=True, text=True)

            if result.returncode == 0:
                self.parse_results(result.stdout)
            else:
                print(f"theHarvester error: {result.stderr}")

        except Exception as e:
            print(f"Error running theHarvester: {e}")

    def parse_results(self, output):
        """Parse theHarvester output"""
        # Extract emails
        email_pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
        self.results['emails'] = list(set(re.findall(email_pattern, output)))

        # Extract IPs (non-capturing group so findall returns whole addresses)
        ip_pattern = r'(?:[0-9]{1,3}\.){3}[0-9]{1,3}'
        self.results['ips'] = list(set(re.findall(ip_pattern, output)))

        # Extract subdomains
        subdomain_pattern = rf'[a-zA-Z0-9.-]+\.{re.escape(self.domain)}'
        self.results['subdomains'] = list(set(re.findall(subdomain_pattern, output)))

    def generate_recon_ng_commands(self):
        """Generate Recon-ng commands"""
        commands = [
            f"workspaces create {self.domain}",
            f"workspaces select {self.domain}",
        ]

        # Add domains
        commands.append(f"db insert domains {self.domain}")
        for subdomain in self.results['subdomains']:
            commands.append(f"db insert domains {subdomain}")

        # Add hosts
        for ip in self.results['ips']:
            commands.append(f"db insert hosts {ip}")

        # Add contacts (emails)
        for email in self.results['emails']:
            local, domain = email.split('@', 1)
            commands.extend([
                f"db insert contacts {local} {local} {email}",
                f"db insert domains {domain}"
            ])

        # Add reconnaissance modules
        commands.extend([
            "modules load recon/domains-hosts/hackertarget",
            "run",
            "modules load recon/domains-hosts/threatcrowd",
            "run",
            "modules load recon/hosts-ports/shodan_hostname",
            "run"
        ])

        return commands

    def save_recon_ng_script(self, filename="recon_ng_commands.txt"):
        """Save Recon-ng commands to file"""
        commands = self.generate_recon_ng_commands()

        with open(filename, 'w') as f:
            for cmd in commands:
                f.write(cmd + '\n')

        print(f"Recon-ng commands saved to {filename}")
        print(f"Run with: recon-ng -r {filename}")

    def export_json(self, filename="theharvester_results.json"):
        """Export results to JSON"""
        with open(filename, 'w') as f:
            json.dump(self.results, f, indent=2)

        print(f"Results exported to {filename}")

def main():
    import sys

    if len(sys.argv) != 2:
        print("Usage: python3 theharvester-recon-ng-integration.py <domain>")
        sys.exit(1)

    domain = sys.argv[1]

    integration = TheHarvesterReconIntegration(domain)
    integration.run_theharvester()
    integration.save_recon_ng_script()
    integration.export_json()

    print("\nResults Summary:")
    print(f"Emails: {len(integration.results['emails'])}")
    print(f"Subdomains: {len(integration.results['subdomains'])}")
    print(f"IPs: {len(integration.results['ips'])}")

if __name__ == "__main__":
    main()

Automation and Scripting

Automated Monitoring

#!/bin/bash
# theharvester-monitor.sh

DOMAIN="$1"
INTERVAL="$2"  # in hours
ALERT_EMAIL="$3"

if [ $# -ne 3 ]; then
    echo "Usage: $0 <domain> <interval_hours> <alert_email>"
    exit 1
fi

BASELINE_FILE="baseline_${DOMAIN}.txt"
CURRENT_FILE="current_${DOMAIN}.txt"

# Create baseline if it doesn't exist
if [ ! -f "$BASELINE_FILE" ]; then
    echo "Creating baseline for $DOMAIN"
    theharvester -d "$DOMAIN" -l 500 -b all > "$BASELINE_FILE"
fi

while true; do
    echo "$(date): Monitoring $DOMAIN"

    # Run current scan
    theharvester -d "$DOMAIN" -l 500 -b all > "$CURRENT_FILE"

    # Compare with baseline
    if ! diff -q "$BASELINE_FILE" "$CURRENT_FILE" >/dev/null; then
        echo "Changes detected for $DOMAIN"

        # Generate diff report
        diff "$BASELINE_FILE" "$CURRENT_FILE" > "changes_${DOMAIN}_$(date +%Y%m%d_%H%M%S).txt"

        # Send alert email
        if command -v mail >/dev/null; then
            echo "New information discovered for $DOMAIN"|mail -s "theHarvester Alert: $DOMAIN" "$ALERT_EMAIL"
        fi

        # Update baseline
        cp "$CURRENT_FILE" "$BASELINE_FILE"
    fi

    # Wait for next interval
    sleep $((INTERVAL * 3600))
done
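The raw `diff` in the monitor above is noisy when result ordering shifts between runs; comparing extracted findings as sets gives a cleaner "what is new" signal. A sketch (file format assumed: one finding per line, as produced by the grep extractions elsewhere in this cheat sheet):

```python
def load_findings(path):
    """Read one finding (email, subdomain, IP) per line; blanks ignored."""
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}

def new_findings(baseline, current):
    """Return items present in the current scan but absent from the baseline."""
    return sorted(current - baseline)
```

Ordering-insensitive comparison means a reshuffled results file no longer triggers a false alert; only genuinely new emails, subdomains, or IPs appear in the output.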

Batch Domain Processing

#!/usr/bin/env python3
# batch-domain-processor.py

import subprocess
import threading
import time
import os
from concurrent.futures import ThreadPoolExecutor, as_completed

class BatchDomainProcessor:
    def __init__(self, max_workers=5):
        self.max_workers = max_workers
        self.results = {}

    def process_domain(self, domain, sources=['google', 'bing', 'crtsh']):
        """Process a single domain"""
        try:
            print(f"Processing {domain}...")

            # Create output directory
            output_dir = f"results_{domain}_{int(time.time())}"
            os.makedirs(output_dir, exist_ok=True)

            results = {}

            for source in sources:
                try:
                    output_file = f"{output_dir}/{source}.html"
                    cmd = [
                        'theharvester',
                        '-d', domain,
                        '-l', '500',
                        '-b', source,
                        '-f', output_file
                    ]

                    result = subprocess.run(
                        cmd,
                        capture_output=True,
                        text=True,
                        timeout=300  # 5 minute timeout
                    )

                    if result.returncode == 0:
                    if result.returncode == 0:
                        results[source] = {
                            'status': 'success',
                            'output_file': output_file
                        }
                    else:
                        results[source] = {
                            'status': 'error',
                            'error': result.stderr
                        }

                except subprocess.TimeoutExpired:
                    results[source] = {
                        'status': 'timeout',
                        'error': 'Command timed out'
                    }
                except Exception as e:
                    results[source] = {
                        'status': 'error',
                        'error': str(e)
                    }

            self.results[domain] = results
            print(f"Completed {domain}")

        except Exception as e:
            print(f"Error processing {domain}: {e}")
            self.results[domain] = {'error': str(e)}

    def process_domains(self, domains, sources=['google', 'bing', 'crtsh']):
        """Process multiple domains concurrently"""
        with ThreadPoolExecutor(max_workers=self.max_workers) as executor:
            futures = {
                executor.submit(self.process_domain, domain, sources): domain
                for domain in domains
            }

            for future in as_completed(futures):
                domain = futures[future]
                try:
                    future.result()
                except Exception as e:
                    print(f"Error processing {domain}: {e}")

    def generate_summary_report(self, output_file="batch_summary.txt"):
        """Generate summary report"""
        with open(output_file, 'w') as f:
            f.write("theHarvester Batch Processing Summary\n")
            f.write("=" * 40 + "\n\n")

            for domain, results in self.results.items():
                f.write(f"Domain: {domain}\n")

                if 'error' in results:
                    f.write(f"  Error: {results['error']}\n")
                else:
                    for source, result in results.items():
                        f.write(f"  {source}: {result['status']}\n")
                        if result['status'] == 'error':
                            f.write(f"    Error: {result['error']}\n")

                f.write("\n")

        print(f"Summary report saved to {output_file}")

def main():
    import sys

    if len(sys.argv) != 2:
        print("Usage: python3 batch-domain-processor.py <domain_list_file>")
        sys.exit(1)

    domain_file = sys.argv[1]

    try:
        with open(domain_file, 'r') as f:
            domains = [line.strip() for line in f if line.strip()]

        processor = BatchDomainProcessor(max_workers=3)

        print(f"Processing {len(domains)} domains...")
        processor.process_domains(domains)
        processor.generate_summary_report()

        print("Batch processing complete!")

    except FileNotFoundError:
        print(f"Error: File {domain_file} not found")
    except Exception as e:
        print(f"Error: {e}")

if __name__ == "__main__":
    main()

Best Practices

Reconnaissance Methodology

1. Passive Information Gathering:
   - Start with search engines (Google, Bing)
   - Use certificate transparency logs
   - Check social media platforms
   - Avoid direct contact with target

2. Source Diversification:
   - Use multiple data sources
   - Cross-reference findings
   - Validate discovered information
   - Document source reliability

3. Rate Limiting:
   - Respect API rate limits
   - Use delays between requests
   - Rotate IP addresses if needed
   - Monitor for blocking

4. Data Validation:
   - Verify email addresses exist
   - Check subdomain resolution
   - Validate IP address ownership
   - Confirm social media profiles
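For the data-validation step above, a quick syntactic pre-filter weeds out obviously malformed harvester output before any active checks are attempted (syntax only; it does not prove a mailbox exists):

```python
import re

# Same pattern family used elsewhere in this cheat sheet
EMAIL_RE = re.compile(r'^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$')

def filter_valid_emails(candidates):
    """Keep only syntactically plausible addresses, deduplicated, order kept."""
    seen, valid = set(), []
    for addr in candidates:
        addr = addr.strip().lower()
        if addr not in seen and EMAIL_RE.match(addr):
            seen.add(addr)
            valid.append(addr)
    return valid
```

Lower-casing before deduplication matters because harvested sources often mix capitalizations of the same address.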

Operational Security

#!/bin/bash
# opsec-checklist.sh

echo "theHarvester OPSEC Checklist"
echo "============================"

echo "1. Network Security:"
echo "   □ Use VPN or proxy"
echo "   □ Rotate IP addresses"
echo "   □ Monitor for rate limiting"
echo "   □ Use different user agents"

echo -e "\n2. Data Handling:"
echo "   □ Encrypt stored results"
echo "   □ Use secure file permissions"
echo "   □ Delete temporary files"
echo "   □ Secure API keys"

echo -e "\n3. Legal Compliance:"
echo "   □ Verify authorization scope"
echo "   □ Respect terms of service"
echo "   □ Document activities"
echo "   □ Follow local laws"

echo -e "\n4. Technical Measures:"
echo "   □ Use isolated environment"
echo "   □ Monitor system logs"
echo "   □ Validate SSL certificates"
echo "   □ Check for detection"

Troubleshooting

Common Issues

# Issue: API rate limiting
# Solution: use API keys and pause between consecutive runs
theharvester -d example.com -l 100 -b google && sleep 2

# Issue: No results from certain sources
# Check if source is available
theharvester -d example.com -l 10 -b google -v

# Issue: SSL certificate errors
# Disable SSL verification (use with caution)
export PYTHONHTTPSVERIFY=0

# Issue: Timeout errors
# Increase timeout values in source code
# Or use smaller result limits
theharvester -d example.com -l 50 -b google
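For the rate-limiting and timeout issues above, a generic retry-with-exponential-backoff wrapper around the subprocess call is one mitigation. A sketch (the delay values are illustrative, not tuned for any particular source):

```python
import time

def run_with_retry(func, attempts=3, base_delay=1.0):
    """Call func(); on failure wait base_delay * 2**n seconds and retry."""
    for n in range(attempts):
        try:
            return func()
        except Exception:
            if n == attempts - 1:
                raise  # out of attempts: propagate the last error
            time.sleep(base_delay * (2 ** n))
```

Wrap the `subprocess.run([...theharvester args...], check=True)` call from the batch-processing script in `func` so transient source failures are retried instead of aborting the whole run.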

Debug Mode

# Enable verbose output
theharvester -d example.com -l 100 -b google -v

# Check available sources
theharvester -h|grep -A 20 "sources:"

# Test specific source
theharvester -d google.com -l 10 -b google

# Check API key configuration
cat ~/.theHarvester/api-keys.yaml

Performance Optimization

# Use specific sources instead of 'all'
theharvester -d example.com -l 500 -b google,bing,crtsh

# Limit results for faster execution
theharvester -d example.com -l 100 -b google

# Use parallel processing for multiple domains
parallel -j 3 theharvester -d {} -l 500 -b google ::: domain1.com domain2.com domain3.com

# Cache DNS results
export PYTHONDONTWRITEBYTECODE=1

Resources

-...

*This cheat sheet provides a comprehensive guide to using theHarvester for OSINT and reconnaissance activities. Always ensure proper authorization and legal compliance before conducting any information-gathering activity.*