DumpsterDiver
DumpsterDiver is a tool designed to search through large volumes of data to identify sensitive information including API keys, passwords, hardcoded credentials, and other secrets. It’s useful for security audits, compliance scanning, and identifying exposed credentials in code repositories and data dumps.
Installation
Install from GitHub
git clone https://github.com/maximumG/DumpsterDiver.git
cd DumpsterDiver
python3 -m pip install -r requirements.txt
Using pip
pip3 install dumpster-diver
Docker Installation
docker build -t dumpsterdiver .
# Mount the directory to scan at /data and pass it with -p, as in the other examples
docker run -v /path/to/scan:/data dumpsterdiver -p /data
System Requirements
# Python 3.6 or higher
python3 --version
# Install dependencies
pip3 install pyyaml requests
Basic Usage
Scan a Directory
python3 DumpsterDiver.py -p /path/to/directory
Scan a Single File
python3 DumpsterDiver.py -p /path/to/file.txt
Scan Git Repository
python3 DumpsterDiver.py -p /path/to/repo -r
Use Custom Rules File
python3 DumpsterDiver.py -p /path/to/scan -c custom_rules.yaml
Command-Line Options
Confirm these options against python3 DumpsterDiver.py -h for the installed version before scanning. Check -r in particular: the project README documents it as --remove, which deletes scanned files that contain no secrets, so try it on a copy of the data first.
| Option | Description |
|---|---|
| -p, --path | Path to the file or directory to scan |
| -r, --recursive | Recursively scan subdirectories |
| -c, --config | Use a custom configuration/rules file |
| -o, --output | Output file for results |
| -j, --json | Output results in JSON format |
| -s, --sensitive | Show sensitive content in results |
| --verbose | Enable verbose output |
| --ignore | Ignore specific patterns |
| -e, --entropy | Calculate entropy for detection |
Practical Examples
Scan Project Directory for Secrets
python3 DumpsterDiver.py -p /home/user/projects -r
Scan and Save Results to File
python3 DumpsterDiver.py -p /var/www/html -o findings.txt
Scan with JSON Output for Processing
python3 DumpsterDiver.py -p /app/source -j -o results.json
Scan Git History for Exposed Secrets
git clone https://github.com/user/repo.git
python3 DumpsterDiver.py -p repo -r --git-history
Verbose Scanning with Details
python3 DumpsterDiver.py -p /code -r --verbose
Scan with Custom Rules
python3 DumpsterDiver.py -p /project -c my_rules.yaml -r
Detection Patterns
DumpsterDiver detects common secret patterns:
| Secret Type | Pattern | Example |
|---|---|---|
| AWS Keys | AKIA[0-9A-Z]{16} | AKIAIOSFODNN7EXAMPLE |
| API Keys | api[_-]?key | api_key=abc123xyz |
| Passwords | password\s*= | password = "secret123" |
| Tokens | token\|auth | auth_token: xyz789 |
| SSH Keys | BEGIN RSA | -----BEGIN RSA PRIVATE KEY----- |
| Slack Tokens | xox[baprs] | xoxb-1234567890-abcdefghij |
| GitHub Tokens | ghp_[A-Za-z0-9_]{36,255} | ghp_0123456789abcdefghijklmnopqrstuvwxyz |
| Database URLs | (mysql\|postgres):\/\/ | mysql://user:pass@host |
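To see how patterns like these behave on real text, the following minimal sketch runs four of the table's regular expressions over a sample string with Python's re module. The PATTERNS dictionary and the find_secrets helper are illustrative names, not part of DumpsterDiver, and the sketch says nothing about how the tool itself applies its rules.

```python
import re

# Patterns copied from the table above; the dictionary layout is illustrative.
PATTERNS = {
    "AWS Keys": r"AKIA[0-9A-Z]{16}",
    "Slack Tokens": r"xox[baprs]",
    "GitHub Tokens": r"ghp_[A-Za-z0-9_]{36,255}",
    "Database URLs": r"(mysql|postgres)://",
}


def find_secrets(text):
    """Return (secret_type, matched_text) pairs for every pattern hit in text."""
    hits = []
    for secret_type, pattern in PATTERNS.items():
        for match in re.finditer(pattern, text):
            hits.append((secret_type, match.group(0)))
    return hits


# Sample input using AWS's documented placeholder access key ID.
sample = "aws_key = AKIAIOSFODNN7EXAMPLE\ndb = mysql://user:pass@host\n"
for secret_type, matched in find_secrets(sample):
    print(f"{secret_type}: {matched}")
```

Note that the AWS pattern requires exactly 16 characters after the AKIA prefix, so shorter look-alike strings are not reported.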
Custom Rules Configuration
Create Custom Rules File
# custom_rules.yaml
rules:
  - name: "Custom API Key Pattern"
    pattern: "custom_api_[a-zA-Z0-9]{32}"
    entropy: 4.0
    type: "credentials"
  - name: "Internal Secret"
    pattern: "INTERNAL_SECRET_[A-Z0-9]{16}"
    entropy: 3.5
    type: "secret"
  - name: "Database Connection"
    pattern: "DB_PASSWORD=.*"
    entropy: 3.0
    type: "database"
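Before handing a rules file to the scanner, it can help to check that each pattern matches what you expect. The sketch below mirrors the three rules above as Python dictionaries (so it needs no YAML parser) and tests them against sample lines. The apply_rules helper is a hypothetical name, and the entropy field is carried along but not evaluated, since this document does not specify how the tool combines it with the pattern.

```python
import re

# Rules mirrored from custom_rules.yaml above, written as Python dicts so the
# sketch has no third-party dependency. The entropy values are kept for
# reference only and are not evaluated here.
RULES = [
    {"name": "Custom API Key Pattern", "pattern": r"custom_api_[a-zA-Z0-9]{32}",
     "entropy": 4.0, "type": "credentials"},
    {"name": "Internal Secret", "pattern": r"INTERNAL_SECRET_[A-Z0-9]{16}",
     "entropy": 3.5, "type": "secret"},
    {"name": "Database Connection", "pattern": r"DB_PASSWORD=.*",
     "entropy": 3.0, "type": "database"},
]


def apply_rules(line, rules=RULES):
    """Return the names of all rules whose pattern occurs in the line."""
    return [rule["name"] for rule in rules if re.search(rule["pattern"], line)]


print(apply_rules("DB_PASSWORD=hunter2"))
print(apply_rules("key = INTERNAL_SECRET_ABCD1234EFGH5678"))
```

A pattern that never matches your own test strings is usually a sign of a wrong character class or length, which is cheaper to catch here than after a long scan.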
Run with Custom Rules
python3 DumpsterDiver.py -p /app -c custom_rules.yaml -r
Advanced Techniques
Entropy-Based Detection
# Detect suspicious strings with high entropy
python3 DumpsterDiver.py -p /code -e --entropy-threshold 4.5
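Entropy-based detection rests on the observation that random keys use many different characters roughly equally often, while ordinary words do not. The sketch below computes Shannon entropy in bits per character and keeps tokens at or above a threshold. The function names, the whitespace tokenization, and the minimum length of 20 are assumptions for illustration; DumpsterDiver's own calculation may differ.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of text, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    # Sum of p * log2(1 / p) over the distinct characters.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def high_entropy_tokens(line, threshold=4.5, min_length=20):
    """Return whitespace-separated tokens that are long and high in entropy."""
    return [
        token
        for token in line.split()
        if len(token) >= min_length and shannon_entropy(token) >= threshold
    ]


print(high_entropy_tokens("id abcdefghijklmnopqrstuvwxyz012345"))
```

The length floor matters because short strings cannot reach a high entropy value, so without it the threshold alone decides everything.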
Scan Multiple Directories
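#!/bin/bash
# Scan several directories, writing one result file per directory
for dir in /app /config /home/user; do
  # Turn /home/user into home_user so the output name contains no slashes
  name=$(printf '%s' "${dir#/}" | tr '/' '_')
  python3 DumpsterDiver.py -p "$dir" -o "result_${name}.txt"
done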
Git Repository Secret Hunting
# Clone and scan entire git history
git clone --mirror https://github.com/user/repo.git
python3 DumpsterDiver.py -p repo.git -r --git-history
Filter Results by Confidence
python3 DumpsterDiver.py -p /source -j | jq '.[] | select(.confidence > 0.8)'
Parallel Scanning
# Use GNU Parallel for faster scanning
parallel python3 DumpsterDiver.py -p {} ::: /path1 /path2 /path3
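Where GNU Parallel is not installed, the same fan-out can be sketched with Python's standard library. The example below starts one scanner process per path from a thread pool and collects the exit codes. The scan_all and build_command names are hypothetical, and the command line simply repeats the invocation used throughout this page; adjust it to wherever DumpsterDiver.py lives.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def build_command(path):
    """Command line for one scan; same invocation as the examples above."""
    return [sys.executable, "DumpsterDiver.py", "-p", path]


def scan_all(paths, build=build_command, workers=3):
    """Run one scanner process per path and return {path: exit_code}."""
    def run(path):
        completed = subprocess.run(build(path), capture_output=True)
        return path, completed.returncode

    # Threads are sufficient here: each one only waits on a child process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(run, paths))


# Example call (not executed here):
# exit_codes = scan_all(["/path1", "/path2", "/path3"])
```

Keeping the command builder as a parameter makes it easy to add options such as an output file per path without touching the pool logic.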
Output Analysis
Parse JSON Results
# Extract only high-confidence findings
python3 DumpsterDiver.py -p /app -j -o findings.json
jq '.[] | select(.confidence >= 0.9)' findings.json
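The same filter can be written in Python when jq is not available or when the findings feed into further processing. This is a minimal sketch that assumes the report is a JSON array of objects with a numeric confidence field, as the jq expression above implies; the sample records and the file and type fields are made up for illustration.

```python
import json


def high_confidence(findings, minimum=0.9):
    """Keep findings whose confidence is at least the given minimum."""
    return [f for f in findings if f.get("confidence", 0) >= minimum]


# Hypothetical report content; the real field names may differ.
report = json.loads(
    '[{"file": "app/config.py", "type": "API Keys", "confidence": 0.95},'
    ' {"file": "README.md", "type": "Tokens", "confidence": 0.4}]'
)

for finding in high_confidence(report):
    print(finding["file"], finding["type"])
```

Findings without a confidence value are treated as 0 and dropped, which errs on the side of a shorter list; pass a lower minimum to review them.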
Generate Report
python3 DumpsterDiver.py -p /app -o results.txt
grep -E "^(File|Match|Pattern)" results.txt > report.txt
Count Findings by Type
python3 DumpsterDiver.py -p /code -j -o findings.json
jq '.[] | .type' findings.json | sort | uniq -c
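For a tally that can be reused in a script, collections.Counter does the work of sort | uniq -c. The sketch assumes each finding is a dictionary with a type field, matching the jq expression above; count_by_type is an illustrative name, and records without a type are grouped under "unknown".

```python
from collections import Counter


def count_by_type(findings):
    """Count findings per secret type; records without a type count as 'unknown'."""
    return Counter(f.get("type", "unknown") for f in findings)


# Hypothetical findings for illustration.
findings = [
    {"type": "API Keys"},
    {"type": "API Keys"},
    {"type": "Passwords"},
    {},
]

for secret_type, total in count_by_type(findings).most_common():
    print(total, secret_type)
```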
Integration with CI/CD
GitHub Actions Integration
name: Secret Detection
on: [push, pull_request]
jobs:
  secrets:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run DumpsterDiver
        run: |
          git clone https://github.com/maximumG/DumpsterDiver.git
          cd DumpsterDiver
          python3 -m pip install -r requirements.txt
          # Write the report to the workspace root so the next step can find it
          python3 DumpsterDiver.py -p .. -j -o ../findings.json
      - name: Check findings
        run: |
          if [ -s findings.json ]; then
            cat findings.json
            exit 1
          fi
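The shell test in the workflow above fails the job whenever the report file is non-empty, which would include a report that holds only an empty JSON array. A stricter gate can parse the report and fail only when it contains findings. The sketch below assumes the report is a JSON array; the script name check_findings.py and the function names are hypothetical.

```python
import json
import sys


def count_findings(report_text):
    """Return the number of findings in a JSON array report; blank text is zero."""
    if not report_text.strip():
        return 0
    return len(json.loads(report_text))


def main(path):
    """Print a summary and return a process exit code for the CI step."""
    with open(path, encoding="utf-8") as handle:
        total = count_findings(handle.read())
    print(f"{total} finding(s) in {path}")
    return 1 if total else 0


# Usage in a CI step (hypothetical file name): python3 check_findings.py findings.json
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv[1]))
```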
GitLab CI Integration
secret_scan:
  image: python:3.9
  script:
    - git clone https://github.com/maximumG/DumpsterDiver.git
    - cd DumpsterDiver
    - pip install -r requirements.txt
    - python3 DumpsterDiver.py -p .. -j -o findings.json
    - "[ ! -s findings.json ] || (cat findings.json && exit 1)"
Troubleshooting
Module Not Found
# Install missing dependencies
pip3 install pyyaml requests regex
# Verify that the dependencies import cleanly (pyyaml is imported as yaml)
python3 -c "import yaml, requests, regex"
Permission Denied on Files
# Run with appropriate permissions
sudo python3 DumpsterDiver.py -p /restricted/path -r
Out of Memory on Large Directories
# Scan specific subdirectories instead
python3 DumpsterDiver.py -p /large/path/subdir1 -r
python3 DumpsterDiver.py -p /large/path/subdir2 -r
No Results Found
# Verify patterns are correct
python3 DumpsterDiver.py -p /path --verbose
# Check if directory contains actual secrets
grep -r "password\|api_key\|token" /path | head
Security Best Practices
Handle Findings Responsibly
# Store results securely
python3 DumpsterDiver.py -p /app -o findings.txt
chmod 600 findings.txt
# Encrypt sensitive report
gpg -c findings.txt
Remediate Exposed Secrets
# After finding exposed credentials:
# 1. Rotate all exposed secrets immediately
# 2. Scan git history for exposure timeline
# 3. Update secrets management practices
# 4. Re-scan to verify remediation
python3 DumpsterDiver.py -p /app -r
Regular Scanning Schedule
# Add to crontab for regular scanning (cron treats a bare % as a newline, so escape it as \%)
0 2 * * * /usr/bin/python3 /opt/DumpsterDiver/DumpsterDiver.py -p /app -r -o /var/log/dumpster_$(date +\%Y\%m\%d).txt
Comparison with Similar Tools
| Tool | Focus | Method |
|---|---|---|
| DumpsterDiver | Large data volumes | Pattern + entropy |
| TruffleHog | Git history | Entropy + regex |
| GitGuardian | Git monitoring | API patterns |
| SAST Tools | Code analysis | Static analysis |
| git-secrets | Git hooks | Pattern matching |
Common Secret Patterns to Monitor
Environment Variables
# Scan for unprotected env vars
python3 DumpsterDiver.py -p /app -c patterns/env_vars.yaml
Configuration Files
# Focus on config file patterns
python3 DumpsterDiver.py -p /etc --include="*.conf" --include="*.yaml"
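If the installed version does not accept include filters, the file selection can be done up front and the resulting paths passed to -p one at a time. The sketch below picks files by glob pattern using only the standard library. The wanted and select_files helpers are hypothetical names, and matching is done on the file name alone, case-sensitively.

```python
import fnmatch
import os


def wanted(path, patterns):
    """True when the file name matches at least one glob pattern."""
    name = os.path.basename(path)
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)


def select_files(root, patterns):
    """Walk root and yield the paths that pass the wanted() filter."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if wanted(path, patterns):
                yield path


# Example call (not executed here):
# for path in select_files("/etc", ["*.conf", "*.yaml"]):
#     print(path)
```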
Backup Files
# Check backup directories
python3 DumpsterDiver.py -p /backups -r
Log Files
# Scan logs for leaked credentials
python3 DumpsterDiver.py -p /var/log -r --include="*.log"
Summary
DumpsterDiver identifies exposed secrets and sensitive data in code repositories, configuration files, and data dumps. Its pattern matching and entropy-based detection help organizations find credentials that were accidentally committed or exposed. Running it regularly, both in security audits and in CI/CD pipelines, helps maintain credential hygiene.