WhatWeb Cheat Sheet
Overview
WhatWeb is a web scanner that identifies technologies used by websites. It recognizes web technologies including content management systems, blogging platforms, statistic/analytics packages, JavaScript libraries, web servers, and embedded devices. WhatWeb has over 1800 plugins, each to recognize something different, and is designed to be fast and thorough.
💡 Key Features: 1800+ plugins, multiple output formats, aggressive scanning modes, custom plugin development, bulk scanning capabilities, and comprehensive technology fingerprinting.
Installation and Setup
Package Manager Installation
# Ubuntu/Debian
sudo apt update
sudo apt install whatweb
# CentOS/RHEL/Fedora
sudo yum install whatweb
# or
sudo dnf install whatweb
# Arch Linux
sudo pacman -S whatweb
# macOS with Homebrew
brew install whatweb
# Verify installation
whatweb --version
whatweb --help
Source Installation
# Clone repository
git clone https://github.com/urbanadventurer/WhatWeb.git
cd WhatWeb
# Install Ruby dependencies
sudo gem install bundler
bundle install
# Make executable
chmod +x whatweb
# Add to PATH
sudo ln -s $(pwd)/whatweb /usr/local/bin/whatweb
# Verify installation
whatweb --version
# Update plugins
git pull origin master
Docker Installation
# Pull official Docker image
docker pull urbanadventurer/whatweb
# Run WhatWeb in Docker
docker run --rm urbanadventurer/whatweb https://example.com
# Run with volume mount for output
docker run --rm -v $(pwd):/output urbanadventurer/whatweb https://example.com --log-brief /output/results.txt
# Create alias for easier usage
echo 'alias whatweb="docker run --rm -v $(pwd):/output urbanadventurer/whatweb"' >> ~/.bashrc
source ~/.bashrc
# Build custom Docker image
cat > Dockerfile << 'EOF'
FROM ruby:2.7-alpine
RUN apk add --no-cache git
RUN git clone https://github.com/urbanadventurer/WhatWeb.git /whatweb
WORKDIR /whatweb
RUN bundle install
ENTRYPOINT ["./whatweb"]
EOF
docker build -t custom-whatweb .
Ruby Gem Installation
# Install as Ruby gem
gem install whatweb
# Install specific version
gem install whatweb -v 0.5.5
# Update to latest version
gem update whatweb
# Uninstall
gem uninstall whatweb
# Install with bundler
echo 'gem "whatweb"' >> Gemfile
bundle install
# Verify installation
whatweb --version
Configuration and Setup
# Create configuration directory
mkdir -p ~/.whatweb
# Create custom configuration file
cat > ~/.whatweb/config.yml << 'EOF'
# WhatWeb Configuration
default_options:
aggression: 1
user_agent: "WhatWeb/0.5.5"
max_threads: 25
open_timeout: 15
read_timeout: 30
redirect_limit: 5
output_formats:
default: brief
verbose: verbose
json: json
xml: xml
proxy_settings:
enabled: false
host: "127.0.0.1"
port: 8080
custom_plugins:
enabled: true
directory: "~/.whatweb/plugins"
EOF
# Create custom plugins directory
mkdir -p ~/.whatweb/plugins
# Set environment variables
export WHATWEB_CONFIG=~/.whatweb/config.yml
export WHATWEB_PLUGINS=~/.whatweb/plugins
Basic Usage and Commands
Simple Scanning
# Basic website scan
whatweb example.com
# Scan with HTTPS
whatweb https://example.com
# Scan multiple URLs
whatweb example.com test.com demo.com
# Scan from file
echo -e "example.com\ntest.com\ndemo.com" > urls.txt
whatweb -i urls.txt
# Scan with custom user agent
whatweb --user-agent "Mozilla/5.0 Custom Agent" example.com
# Scan with custom headers
whatweb --header "Authorization: Bearer token123" example.com
# Scan through proxy
whatweb --proxy 127.0.0.1:8080 example.com
# Scan with authentication
whatweb --cookie "session=abc123; auth=xyz789" example.com
Aggression Levels
# Passive scanning (level 1) - default
whatweb example.com
# Polite scanning (level 2)
whatweb --aggression 2 example.com
# Aggressive scanning (level 3)
whatweb --aggression 3 example.com
# Heavy scanning (level 4) - most thorough
whatweb --aggression 4 example.com
# Compare different aggression levels
for level in 1 2 3 4; do
echo "=== Aggression Level $level ==="
whatweb --aggression $level example.com
echo ""
done
Output Formats and Logging
# Brief output (default)
whatweb example.com
# Verbose output
whatweb --verbose example.com
# JSON output
whatweb --log-json results.json example.com
# XML output
whatweb --log-xml results.xml example.com
# CSV output
whatweb --log-csv results.csv example.com
# Multiple output formats
whatweb --log-brief brief.txt --log-json results.json --log-xml results.xml example.com
# Custom output format
whatweb --log-object results.obj example.com
# Errors only
whatweb --log-errors errors.txt example.com
# Verbose with colors
whatweb --colour=always --verbose example.com
Advanced Scanning Options
# Set custom timeouts
whatweb --open-timeout 30 --read-timeout 60 example.com
# Limit redirects
whatweb --max-redirects 10 example.com
# Custom threading
whatweb --max-threads 50 example.com
# URL range scanning
whatweb --url-prefix "https://example.com/page" --url-suffix ".php" --input-file pages.txt
# Scan with custom plugins only
whatweb --plugins WordPress,Drupal,Joomla example.com
# Exclude specific plugins
whatweb --plugins-exclude "Title,HTTPServer" example.com
# List available plugins
whatweb --list-plugins
# Plugin information
whatweb --info-plugins WordPress
# Scan with debugging
whatweb --debug example.com
Advanced Web Fingerprinting
Custom Plugin Development
# Create custom plugin file: ~/.whatweb/plugins/custom_framework.rb
Plugin.define do
name "Custom Framework"
authors [
"Your Name <your.email@example.com>"
]
version "0.1"
description "Detects Custom Framework installations"
website "https://custom-framework.com"
# Passive checks
passive do
# Check HTTP headers
if @headers['x-powered-by'] =~ /Custom Framework/i
m << { :name => "X-Powered-By Header" }
end
# Check server header
if @headers['server'] =~ /Custom Framework/i
m << { :name => "Server Header" }
end
# Check HTML content
if @body =~ /<meta name="generator" content="Custom Framework ([^"]+)"/i
m << { :version => $1, :name => "Meta Generator" }
end
# Check for specific JavaScript
if @body =~ /customFramework\.js/i
m << { :name => "JavaScript Library" }
end
# Check for CSS files
if @body =~ /custom-framework\.css/i
m << { :name => "CSS Framework" }
end
end
# Aggressive checks
aggressive do
# Check specific paths
target = URI.join(@base_uri.to_s, "/custom-framework/version.txt").to_s
status, url, ip, body, headers = open_target(target)
if status == 200 and body =~ /Custom Framework v([0-9\.]+)/i
m << { :version => $1, :name => "Version File" }
end
# Check admin panel
admin_target = URI.join(@base_uri.to_s, "/custom-admin/").to_s
admin_status, admin_url, admin_ip, admin_body, admin_headers = open_target(admin_target)
if admin_status == 200 and admin_body =~ /Custom Framework Admin/i
m << { :name => "Admin Panel" }
end
end
end
Bulk Scanning Automation
#!/usr/bin/env python3
# Advanced WhatWeb automation and analysis
import subprocess
import json
import threading
import time
import csv
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.parse import urlparse
import re
class WhatWebAnalyzer:
def __init__(self, max_workers=10, aggression=1):
self.max_workers = max_workers
self.aggression = aggression
self.results = []
self.lock = threading.Lock()
def scan_url(self, url, options=None):
"""Scan single URL with WhatWeb"""
if options is None:
options = {}
try:
# Build WhatWeb command
cmd = ['whatweb', '--log-json=-', '--aggression', str(self.aggression)]
# Add options
if options.get('user_agent'):
cmd.extend(['--user-agent', options['user_agent']])
if options.get('proxy'):
cmd.extend(['--proxy', options['proxy']])
if options.get('headers'):
for header in options['headers']:
cmd.extend(['--header', header])
if options.get('cookie'):
cmd.extend(['--cookie', options['cookie']])
if options.get('timeout'):
cmd.extend(['--open-timeout', str(options['timeout'])])
cmd.extend(['--read-timeout', str(options['timeout'])])
if options.get('max_redirects'):
cmd.extend(['--max-redirects', str(options['max_redirects'])])
if options.get('plugins'):
cmd.extend(['--plugins', options['plugins']])
# Add URL
cmd.append(url)
# Run WhatWeb
result = subprocess.run(
cmd,
capture_output=True, text=True,
timeout=options.get('timeout', 60)
)
scan_result = {
'url': url,
'success': result.returncode == 0,
'technologies': [],
'error': None,
'raw_output': result.stdout
}
if result.returncode == 0 and result.stdout:
try:
# Parse JSON output
for line in result.stdout.strip().split('\n'):
if line.strip():
data = json.loads(line)
scan_result['technologies'].append(data)
except json.JSONDecodeError as e:
scan_result['error'] = f"JSON parse error: {e}"
else:
scan_result['error'] = result.stderr
return scan_result
except subprocess.TimeoutExpired:
return {
'url': url,
'success': False,
'technologies': [],
'error': 'WhatWeb scan timeout',
'raw_output': ''
}
except Exception as e:
return {
'url': url,
'success': False,
'technologies': [],
'error': str(e),
'raw_output': ''
}
def analyze_technologies(self, scan_result):
"""Analyze detected technologies for insights"""
analysis = {
'url': scan_result['url'],
'total_technologies': 0,
'categories': {},
'security_technologies': [],
'cms_detected': [],
'web_servers': [],
'programming_languages': [],
'javascript_libraries': [],
'analytics_tools': [],
'security_headers': [],
'potential_vulnerabilities': [],
'technology_versions': {}
}
if not scan_result['success'] or not scan_result['technologies']:
return analysis
# Process each technology detection
for tech_data in scan_result['technologies']:
plugins = tech_data.get('plugins', {})
analysis['total_technologies'] = len(plugins)
for plugin_name, plugin_data in plugins.items():
# Categorize technologies
category = self.categorize_technology(plugin_name)
if category not in analysis['categories']:
analysis['categories'][category] = []
analysis['categories'][category].append(plugin_name)
# Extract version information
if isinstance(plugin_data, dict) and 'version' in plugin_data:
analysis['technology_versions'][plugin_name] = plugin_data['version']
elif isinstance(plugin_data, list):
for item in plugin_data:
if isinstance(item, dict) and 'version' in item:
analysis['technology_versions'][plugin_name] = item['version']
break
# Specific technology analysis
if plugin_name.lower() in ['wordpress', 'drupal', 'joomla', 'magento', 'shopify']:
analysis['cms_detected'].append(plugin_name)
if plugin_name.lower() in ['apache', 'nginx', 'iis', 'lighttpd', 'tomcat']:
analysis['web_servers'].append(plugin_name)
if plugin_name.lower() in ['php', 'python', 'ruby', 'java', 'nodejs', 'asp.net']:
analysis['programming_languages'].append(plugin_name)
if plugin_name.lower() in ['jquery', 'angular', 'react', 'vue', 'bootstrap']:
analysis['javascript_libraries'].append(plugin_name)
if plugin_name.lower() in ['google-analytics', 'google-tag-manager', 'facebook-pixel']:
analysis['analytics_tools'].append(plugin_name)
if plugin_name.lower() in ['strict-transport-security', 'content-security-policy', 'x-frame-options']:
analysis['security_headers'].append(plugin_name)
# Check for potential vulnerabilities
vulnerabilities = self.check_vulnerabilities(plugin_name, analysis['technology_versions'].get(plugin_name))
analysis['potential_vulnerabilities'].extend(vulnerabilities)
return analysis
def categorize_technology(self, plugin_name):
"""Categorize technology based on plugin name"""
plugin_lower = plugin_name.lower()
# CMS and Frameworks
if any(cms in plugin_lower for cms in ['wordpress', 'drupal', 'joomla', 'magento', 'shopify', 'prestashop']):
return 'CMS'
# Web Servers
if any(server in plugin_lower for server in ['apache', 'nginx', 'iis', 'lighttpd', 'tomcat', 'jetty']):
return 'Web Server'
# Programming Languages
if any(lang in plugin_lower for lang in ['php', 'python', 'ruby', 'java', 'nodejs', 'asp.net', 'perl']):
return 'Programming Language'
# JavaScript Libraries
if any(js in plugin_lower for js in ['jquery', 'angular', 'react', 'vue', 'bootstrap', 'ember']):
return 'JavaScript Library'
# Analytics
if any(analytics in plugin_lower for analytics in ['analytics', 'tracking', 'pixel', 'tag-manager']):
return 'Analytics'
# Security
if any(security in plugin_lower for security in ['security', 'ssl', 'tls', 'firewall', 'protection']):
return 'Security'
# CDN
if any(cdn in plugin_lower for cdn in ['cloudflare', 'akamai', 'fastly', 'maxcdn', 'cloudfront']):
return 'CDN'
# Database
if any(db in plugin_lower for db in ['mysql', 'postgresql', 'mongodb', 'redis', 'elasticsearch']):
return 'Database'
return 'Other'
def check_vulnerabilities(self, plugin_name, version):
"""Check for known vulnerabilities (simplified)"""
vulnerabilities = []
plugin_lower = plugin_name.lower()
# WordPress vulnerabilities
if 'wordpress' in plugin_lower and version:
if version.startswith('4.') or version.startswith('3.'):
vulnerabilities.append(f"Outdated WordPress version {version} - multiple known vulnerabilities")
# PHP vulnerabilities
if 'php' in plugin_lower and version:
if version.startswith('5.') or version.startswith('7.0') or version.startswith('7.1'):
vulnerabilities.append(f"Outdated PHP version {version} - security support ended")
# Apache vulnerabilities
if 'apache' in plugin_lower and version:
if version.startswith('2.2') or version.startswith('2.0'):
vulnerabilities.append(f"Outdated Apache version {version} - known vulnerabilities")
# jQuery vulnerabilities
if 'jquery' in plugin_lower and version:
if version.startswith('1.') or version.startswith('2.'):
vulnerabilities.append(f"Outdated jQuery version {version} - XSS vulnerabilities")
return vulnerabilities
def bulk_scan(self, urls, options=None):
"""Perform bulk scanning of multiple URLs"""
print(f"Starting bulk scan of {len(urls)} URLs")
print(f"Max workers: {self.max_workers}")
print(f"Aggression level: {self.aggression}")
with ThreadPoolExecutor(max_workers=self.max_workers) as executor:
# Submit all tasks
future_to_url = {
executor.submit(self.scan_url, url, options): url
for url in urls
}
# Process completed tasks
for future in as_completed(future_to_url):
url = future_to_url[future]
try:
result = future.result()
# Analyze technologies
analysis = self.analyze_technologies(result)
result['analysis'] = analysis
with self.lock:
self.results.append(result)
if result['success']:
tech_count = analysis['total_technologies']
cms_count = len(analysis['cms_detected'])
vuln_count = len(analysis['potential_vulnerabilities'])
print(f"✓ {url}: {tech_count} technologies, {cms_count} CMS, {vuln_count} potential vulnerabilities")
else:
print(f"✗ {url}: {result['error']}")
except Exception as e:
print(f"✗ Error scanning {url}: {e}")
return self.results
def generate_report(self, output_file='whatweb_analysis_report.json'):
"""Generate comprehensive analysis report"""
# Calculate statistics
total_urls = len(self.results)
successful_scans = sum(1 for r in self.results if r['success'])
# Technology statistics
all_technologies = {}
all_categories = {}
all_cms = {}
all_vulnerabilities = []
for result in self.results:
if result['success'] and 'analysis' in result:
analysis = result['analysis']
# Count technologies by category
for category, techs in analysis['categories'].items():
if category not in all_categories:
all_categories[category] = 0
all_categories[category] += len(techs)
for tech in techs:
if tech not in all_technologies:
all_technologies[tech] = 0
all_technologies[tech] += 1
# Count CMS
for cms in analysis['cms_detected']:
if cms not in all_cms:
all_cms[cms] = 0
all_cms[cms] += 1
# Collect vulnerabilities
all_vulnerabilities.extend(analysis['potential_vulnerabilities'])
# Sort by popularity
popular_technologies = sorted(all_technologies.items(), key=lambda x: x[1], reverse=True)[:20]
popular_categories = sorted(all_categories.items(), key=lambda x: x[1], reverse=True)[:10]
popular_cms = sorted(all_cms.items(), key=lambda x: x[1], reverse=True)[:10]
report = {
'scan_summary': {
'total_urls': total_urls,
'successful_scans': successful_scans,
'success_rate': (successful_scans / total_urls * 100) if total_urls > 0 else 0,
'scan_date': time.strftime('%Y-%m-%d %H:%M:%S'),
'aggression_level': self.aggression
},
'technology_statistics': {
'total_unique_technologies': len(all_technologies),
'popular_technologies': popular_technologies,
'popular_categories': popular_categories,
'popular_cms': popular_cms,
'total_vulnerabilities': len(all_vulnerabilities)
},
'vulnerability_summary': {
'unique_vulnerabilities': list(set(all_vulnerabilities)),
'vulnerability_count': len(all_vulnerabilities)
},
'detailed_results': self.results
}
# Save report
with open(output_file, 'w') as f:
json.dump(report, f, indent=2)
print(f"\nWhatWeb Analysis Report:")
print(f"Total URLs scanned: {total_urls}")
print(f"Successful scans: {successful_scans}")
print(f"Success rate: {report['scan_summary']['success_rate']:.1f}%")
print(f"Unique technologies found: {len(all_technologies)}")
print(f"Potential vulnerabilities: {len(all_vulnerabilities)}")
print(f"Report saved to: {output_file}")
return report
# Usage example
if __name__ == "__main__":
# Create analyzer instance
analyzer = WhatWebAnalyzer(max_workers=5, aggression=3)
# URLs to scan
urls = [
'https://example.com',
'https://test.com',
'https://demo.com'
]
# Scan options
scan_options = {
'user_agent': 'Mozilla/5.0 (Custom Scanner)',
'timeout': 30,
'max_redirects': 5
}
# Perform bulk scan
results = analyzer.bulk_scan(urls, scan_options)
# Generate report
report = analyzer.generate_report('comprehensive_whatweb_report.json')
Plugin Management and Development
#!/bin/bash
# WhatWeb plugin management script
WHATWEB_DIR="/usr/share/whatweb"
PLUGINS_DIR="$WHATWEB_DIR/plugins"
CUSTOM_PLUGINS_DIR="$HOME/.whatweb/plugins"
# Function to list all plugins
list_plugins() {
echo "WhatWeb Plugin Management"
echo "========================"
echo "Total plugins: $(whatweb --list-plugins | wc -l)"
echo ""
echo "Plugin categories:"
whatweb --list-plugins | grep -E "^\s*[A-Z]" | cut -d' ' -f1 | sort | uniq -c | sort -nr
echo ""
echo "Recent plugins (last 30 days):"
find "$PLUGINS_DIR" -name "*.rb" -mtime -30 -exec basename {} .rb \; | sort
}
# Function to search plugins
search_plugins() {
local search_term="$1"
if [ -z "$search_term" ]; then
echo "Usage: search_plugins <search_term>"
return 1
fi
echo "Searching for plugins matching: $search_term"
echo "============================================"
whatweb --list-plugins | grep -i "$search_term"
echo ""
echo "Plugin files containing '$search_term':"
grep -l -i "$search_term" "$PLUGINS_DIR"/*.rb 2>/dev/null | while read file; do
echo "- $(basename "$file" .rb)"
done
}
# Function to get plugin information
plugin_info() {
local plugin_name="$1"
if [ -z "$plugin_name" ]; then
echo "Usage: plugin_info <plugin_name>"
return 1
fi
echo "Plugin Information: $plugin_name"
echo "================================"
whatweb --info-plugins "$plugin_name"
echo ""
echo "Plugin file location:"
find "$PLUGINS_DIR" -name "*${plugin_name,,}*" -o -name "*${plugin_name^}*" 2>/dev/null
}
# Function to test plugin
test_plugin() {
local plugin_name="$1"
local test_url="$2"
if [ -z "$plugin_name" ] || [ -z "$test_url" ]; then
echo "Usage: test_plugin <plugin_name> <test_url>"
return 1
fi
echo "Testing plugin '$plugin_name' against '$test_url'"
echo "================================================"
whatweb --plugins "$plugin_name" --verbose "$test_url"
}
# Function to create custom plugin template
create_plugin_template() {
local plugin_name="$1"
if [ -z "$plugin_name" ]; then
echo "Usage: create_plugin_template <plugin_name>"
return 1
fi
mkdir -p "$CUSTOM_PLUGINS_DIR"
local plugin_file="$CUSTOM_PLUGINS_DIR/${plugin_name,,}.rb"
cat > "$plugin_file" << EOF
##
# This file is part of WhatWeb and may be subject to
# redistribution and commercial restrictions. Please see the WhatWeb
# web site for more information on licensing and terms of use.
# https://www.morningstarsecurity.com/research/whatweb
##
Plugin.define do
name "$plugin_name"
authors [
"Your Name <your.email@example.com>"
]
version "0.1"
description "Detects $plugin_name installations"
website "https://example.com"
# Plugin categories
# Choose from: CMS, Blogging, Wiki, Forum, E-commerce, Photo Gallery,
# Analytics, Advertising, Widgets, CDN, Marketing, Social, Video,
# Comment System, Captcha, Font, Map, Mobile Framework, Programming Language,
# Operating System, Search Engine, Web Server, Database, Development,
# Miscellaneous
# Passive checks (HTTP headers and HTML content)
passive do
# Check HTTP headers
if @headers['x-powered-by'] =~ /$plugin_name/i
m << { :name => "X-Powered-By Header" }
end
if @headers['server'] =~ /$plugin_name/i
m << { :name => "Server Header" }
end
# Check HTML content
if @body =~ /<meta name="generator" content="$plugin_name ([^"]+)"/i
m << { :version => \$1, :name => "Meta Generator" }
end
# Check for specific strings in HTML
if @body =~ /$plugin_name/i
m << { :name => "HTML Content" }
end
# Check for JavaScript
if @body =~ /${plugin_name,,}\.js/i
m << { :name => "JavaScript Library" }
end
# Check for CSS
if @body =~ /${plugin_name,,}\.css/i
m << { :name => "CSS Framework" }
end
end
# Aggressive checks (additional HTTP requests)
aggressive do
# Check specific paths
target = URI.join(@base_uri.to_s, "/${plugin_name,,}/version.txt").to_s
status, url, ip, body, headers = open_target(target)
if status == 200 and body =~ /$plugin_name v([0-9\.]+)/i
m << { :version => \$1, :name => "Version File" }
end
# Check admin panel
admin_target = URI.join(@base_uri.to_s, "/${plugin_name,,}-admin/").to_s
admin_status, admin_url, admin_ip, admin_body, admin_headers = open_target(admin_target)
if admin_status == 200 and admin_body =~ /$plugin_name Admin/i
m << { :name => "Admin Panel" }
end
# Check common files
common_files = [
"/${plugin_name,,}/readme.txt",
"/${plugin_name,,}/changelog.txt",
"/${plugin_name,,}/license.txt"
]
common_files.each do |file_path|
file_target = URI.join(@base_uri.to_s, file_path).to_s
file_status, file_url, file_ip, file_body, file_headers = open_target(file_target)
if file_status == 200
m << { :name => "Common File: #{file_path}" }
# Extract version from file content
if file_body =~ /version[:\s]+([0-9\.]+)/i
m << { :version => \$1, :name => "Version from #{file_path}" }
end
end
end
end
end
EOF
echo "Plugin template created: $plugin_file"
echo "Edit the file to customize detection logic"
echo "Test with: whatweb --plugins $(basename "$plugin_file" .rb) <target_url>"
}
# Function to validate plugin syntax
validate_plugin() {
local plugin_file="$1"
if [ -z "$plugin_file" ]; then
echo "Usage: validate_plugin <plugin_file>"
return 1
fi
if [ ! -f "$plugin_file" ]; then
echo "Plugin file not found: $plugin_file"
return 1
fi
echo "Validating plugin: $plugin_file"
echo "==============================="
# Check Ruby syntax
if ruby -c "$plugin_file" > /dev/null 2>&1; then
echo "✅ Ruby syntax: OK"
else
echo "❌ Ruby syntax: ERROR"
ruby -c "$plugin_file"
return 1
fi
# Check plugin structure
if grep -q "Plugin.define do" "$plugin_file"; then
echo "✅ Plugin structure: OK"
else
echo "❌ Plugin structure: Missing Plugin.define block"
return 1
fi
# Check required fields
local required_fields=("name" "authors" "version" "description")
for field in "${required_fields[@]}"; do
if grep -q "$field" "$plugin_file"; then
echo "✅ Required field '$field': OK"
else
echo "❌ Required field '$field': Missing"
fi
done
echo "Plugin validation completed"
}
# Function to backup plugins
backup_plugins() {
local backup_dir="$HOME/.whatweb/backup-$(date +%Y%m%d-%H%M%S)"
echo "Backing up WhatWeb plugins..."
echo "Backup directory: $backup_dir"
mkdir -p "$backup_dir"
# Backup system plugins
cp -r "$PLUGINS_DIR" "$backup_dir/system-plugins"
# Backup custom plugins
if [ -d "$CUSTOM_PLUGINS_DIR" ]; then
cp -r "$CUSTOM_PLUGINS_DIR" "$backup_dir/custom-plugins"
fi
echo "Backup completed: $backup_dir"
}
# Function to update plugins
update_plugins() {
echo "Updating WhatWeb plugins..."
# Backup first
backup_plugins
# Update from git repository
if [ -d "/opt/WhatWeb" ]; then
cd /opt/WhatWeb
git pull origin master
echo "WhatWeb updated from git repository"
else
echo "Git repository not found. Please update WhatWeb manually."
fi
# Update via package manager
if command -v apt &> /dev/null; then
sudo apt update && sudo apt upgrade whatweb
elif command -v yum &> /dev/null; then
sudo yum update whatweb
elif command -v dnf &> /dev/null; then
sudo dnf update whatweb
elif command -v brew &> /dev/null; then
brew upgrade whatweb
fi
echo "Plugin update completed"
}
# Main execution
case "${1:-help}" in
"list")
list_plugins
;;
"search")
search_plugins "$2"
;;
"info")
plugin_info "$2"
;;
"test")
test_plugin "$2" "$3"
;;
"create")
create_plugin_template "$2"
;;
"validate")
validate_plugin "$2"
;;
"backup")
backup_plugins
;;
"update")
update_plugins
;;
"help"|*)
echo "WhatWeb Plugin Management Script"
echo "Usage: $0 {list|search|info|test|create|validate|backup|update|help}"
echo ""
echo "Commands:"
echo " list - List all available plugins"
echo " search <term> - Search for plugins by name"
echo " info <plugin> - Show plugin information"
echo " test <plugin> <url> - Test plugin against URL"
echo " create <name> - Create custom plugin template"
echo " validate <file> - Validate plugin syntax"
echo " backup - Backup all plugins"
echo " update - Update plugins from repository"
echo " help - Show this help message"
;;
esac
Automation and Integration
CI/CD Integration
#!/bin/bash
# CI/CD script for web technology fingerprinting
set -e
TARGET_URL="$1"
OUTPUT_DIR="$2"
AGGRESSION_LEVEL="${3:-2}"
if [ -z "$TARGET_URL" ] || [ -z "$OUTPUT_DIR" ]; then
echo "Usage: $0 <target_url> <output_dir> [aggression_level]"
exit 1
fi
echo "Starting web technology fingerprinting..."
echo "Target: $TARGET_URL"
echo "Output directory: $OUTPUT_DIR"
echo "Aggression level: $AGGRESSION_LEVEL"
mkdir -p "$OUTPUT_DIR"
# Run WhatWeb scan
echo "Running WhatWeb scan..."
whatweb --aggression "$AGGRESSION_LEVEL" \
--log-json "$OUTPUT_DIR/whatweb_results.json" \
--log-brief "$OUTPUT_DIR/whatweb_brief.txt" \
--log-verbose "$OUTPUT_DIR/whatweb_verbose.txt" \
--max-threads 25 \
--open-timeout 30 \
--read-timeout 60 \
"$TARGET_URL"
# Parse results
if [ -f "$OUTPUT_DIR/whatweb_results.json" ]; then
TECH_COUNT=$(jq '[.[] | .plugins | keys] | flatten | length' "$OUTPUT_DIR/whatweb_results.json" 2>/dev/null || echo "0")
echo "Found $TECH_COUNT technologies"
else
TECH_COUNT=0
echo "No results file generated"
fi
# Extract specific technology categories
echo "Analyzing detected technologies..."
# CMS detection
CMS_DETECTED=$(jq -r '.[] | .plugins | keys[] | select(test("WordPress|Drupal|Joomla|Magento|Shopify"; "i"))' "$OUTPUT_DIR/whatweb_results.json" 2>/dev/null | sort | uniq || echo "")
CMS_COUNT=$(echo "$CMS_DETECTED" | grep -v '^$' | wc -l)
# Web server detection
SERVERS_DETECTED=$(jq -r '.[] | .plugins | keys[] | select(test("Apache|Nginx|IIS|LiteSpeed|Tomcat"; "i"))' "$OUTPUT_DIR/whatweb_results.json" 2>/dev/null | sort | uniq || echo "")
SERVER_COUNT=$(echo "$SERVERS_DETECTED" | grep -v '^$' | wc -l)
# Programming language detection
LANGUAGES_DETECTED=$(jq -r '.[] | .plugins | keys[] | select(test("PHP|Python|Ruby|Java|ASP|Node"; "i"))' "$OUTPUT_DIR/whatweb_results.json" 2>/dev/null | sort | uniq || echo "")
LANGUAGE_COUNT=$(echo "$LANGUAGES_DETECTED" | grep -v '^$' | wc -l)
# Security technology detection
SECURITY_DETECTED=$(jq -r '.[] | .plugins | keys[] | select(test("SSL|TLS|Security|Firewall|Protection"; "i"))' "$OUTPUT_DIR/whatweb_results.json" 2>/dev/null | sort | uniq || echo "")
SECURITY_COUNT=$(echo "$SECURITY_DETECTED" | grep -v '^$' | wc -l)
# Development/debug technology detection
DEV_DETECTED=$(jq -r '.[] | .plugins | keys[] | select(test("Debug|Development|Test|Staging"; "i"))' "$OUTPUT_DIR/whatweb_results.json" 2>/dev/null | sort | uniq || echo "")
DEV_COUNT=$(echo "$DEV_DETECTED" | grep -v '^$' | wc -l)
# Generate summary report
cat > "$OUTPUT_DIR/technology-summary.txt" << EOF
Web Technology Fingerprinting Summary
====================================
Date: $(date)
Target: $TARGET_URL
Aggression Level: $AGGRESSION_LEVEL
Total Technologies: $TECH_COUNT
Technology Categories:
- CMS: $CMS_COUNT
- Web Servers: $SERVER_COUNT
- Programming Languages: $LANGUAGE_COUNT
- Security Technologies: $SECURITY_COUNT
- Development Technologies: $DEV_COUNT
Status: $(if [ "$DEV_COUNT" -gt "0" ]; then echo "WARNING - Development tools detected"; else echo "OK"; fi)
EOF
# List detected technologies by category
if [ "$CMS_COUNT" -gt "0" ]; then
echo "" >> "$OUTPUT_DIR/technology-summary.txt"
echo "CMS Detected:" >> "$OUTPUT_DIR/technology-summary.txt"
echo "$CMS_DETECTED" | sed 's/^/- /' >> "$OUTPUT_DIR/technology-summary.txt"
fi
if [ "$SERVER_COUNT" -gt "0" ]; then
echo "" >> "$OUTPUT_DIR/technology-summary.txt"
echo "Web Servers:" >> "$OUTPUT_DIR/technology-summary.txt"
echo "$SERVERS_DETECTED" | sed 's/^/- /' >> "$OUTPUT_DIR/technology-summary.txt"
fi
if [ "$LANGUAGE_COUNT" -gt "0" ]; then
echo "" >> "$OUTPUT_DIR/technology-summary.txt"
echo "Programming Languages:" >> "$OUTPUT_DIR/technology-summary.txt"
echo "$LANGUAGES_DETECTED" | sed 's/^/- /' >> "$OUTPUT_DIR/technology-summary.txt"
fi
if [ "$DEV_COUNT" -gt "0" ]; then
echo "" >> "$OUTPUT_DIR/technology-summary.txt"
echo "Development Technologies (WARNING):" >> "$OUTPUT_DIR/technology-summary.txt"
echo "$DEV_DETECTED" | sed 's/^/- /' >> "$OUTPUT_DIR/technology-summary.txt"
fi
# Generate detailed report
python3 << 'PYTHON_EOF'
import sys
import json
from datetime import datetime
output_dir = sys.argv[1]
target_url = sys.argv[2]
# Read WhatWeb results
try:
with open(f"{output_dir}/whatweb_results.json", 'r') as f:
results = json.load(f)
except (FileNotFoundError, json.JSONDecodeError):
results = []
# Process results
technologies = {}
categories = {}
versions = {}
for result in results:
plugins = result.get('plugins', {})
for plugin_name, plugin_data in plugins.items():
technologies[plugin_name] = plugin_data
# Extract version information
if isinstance(plugin_data, dict):
if 'version' in plugin_data:
versions[plugin_name] = plugin_data['version']
elif isinstance(plugin_data, list):
for item in plugin_data:
if isinstance(item, dict) and 'version' in item:
versions[plugin_name] = item['version']
break
# Categorize technologies
def categorize_technology(plugin_name):
plugin_lower = plugin_name.lower()
if any(cms in plugin_lower for cms in ['wordpress', 'drupal', 'joomla', 'magento', 'shopify']):
return 'CMS'
elif any(server in plugin_lower for server in ['apache', 'nginx', 'iis', 'litespeed', 'tomcat']):
return 'Web Server'
elif any(lang in plugin_lower for lang in ['php', 'python', 'ruby', 'java', 'asp', 'node']):
return 'Programming Language'
elif any(js in plugin_lower for js in ['jquery', 'angular', 'react', 'vue', 'bootstrap']):
return 'JavaScript Library'
elif any(analytics in plugin_lower for analytics in ['analytics', 'tracking', 'pixel']):
return 'Analytics'
elif any(security in plugin_lower for security in ['ssl', 'tls', 'security', 'firewall']):
return 'Security'
elif any(dev in plugin_lower for dev in ['debug', 'development', 'test', 'staging']):
return 'Development'
else:
return 'Other'
for tech in technologies.keys():
category = categorize_technology(tech)
if category not in categories:
categories[category] = []
categories[category].append(tech)
# Create detailed report
report = {
'scan_info': {
'target': target_url,
'scan_date': datetime.now().isoformat(),
'technology_count': len(technologies)
},
'technologies': technologies,
'versions': versions,
'categories': categories,
'raw_results': results
}
# Save detailed report
with open(f"{output_dir}/whatweb-detailed-report.json", 'w') as f:
json.dump(report, f, indent=2)
# Generate HTML report
html_content = f"""
<!DOCTYPE html>
<html>
<head>
<title>Web Technology Fingerprinting Report</title>
<style>
body {{ font-family: Arial, sans-serif; margin: 20px; }}
.header {{ background-color: #f0f0f0; padding: 20px; border-radius: 5px; }}
.category {{ margin: 10px 0; padding: 15px; border-left: 4px solid #007bff; background-color: #f8f9fa; }}
.warning {{ border-left-color: #ffc107; }}
.danger {{ border-left-color: #dc3545; }}
table {{ border-collapse: collapse; width: 100%; margin: 20px 0; }}
th, td {{ border: 1px solid #ddd; padding: 8px; text-align: left; }}
th {{ background-color: #f2f2f2; }}
</style>
</head>
<body>
<div class="header">
<h1>Web Technology Fingerprinting Report</h1>
<p><strong>Target:</strong> {target_url}</p>
<p><strong>Scan Date:</strong> {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}</p>
<p><strong>Technologies Found:</strong> {len(technologies)}</p>
</div>
<h2>Technology Categories</h2>
"""
for category, techs in categories.items():
css_class = "danger" if category == "Development" else "warning" if category in ["Other"] else ""
html_content += f"""
<div class="category {css_class}">
<h3>{category} ({len(techs)})</h3>
<p>{', '.join(techs)}</p>
</div>
"""
html_content += """
<h2>Detailed Technologies</h2>
<table>
<tr><th>Technology</th><th>Version</th><th>Category</th></tr>
"""
for tech in sorted(technologies.keys()):
version = versions.get(tech, 'N/A')
category = categorize_technology(tech)
html_content += f"""
<tr>
<td>{tech}</td>
<td>{version}</td>
<td>{category}</td>
</tr>
"""
html_content += """
</table>
</body>
</html>
"""
with open(f"{output_dir}/whatweb-report.html", 'w') as f:
f.write(html_content)
print(f"Detailed reports generated:")
print(f"- JSON: {output_dir}/whatweb-detailed-report.json")
print(f"- HTML: {output_dir}/whatweb-report.html")
PYTHON_EOF
# Check for development technologies and exit
if [ "$DEV_COUNT" -gt "0" ]; then
echo "WARNING: Development technologies detected"
echo "Development technologies found:"
echo "$DEV_DETECTED"
exit 1
else
echo "SUCCESS: No development technologies detected"
exit 0
fi
GitHub Actions Integration
# .github/workflows/whatweb-fingerprinting.yml
name: Web Technology Fingerprinting
on:
push:
branches: [ main, develop ]
pull_request:
branches: [ main ]
schedule:
- cron: '0 5 * * 2' # Weekly scan on Tuesdays at 5 AM
jobs:
web-fingerprinting:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Install WhatWeb
run: |
sudo apt update
sudo apt install -y whatweb
whatweb --version
- name: Fingerprint production environment
run: |
mkdir -p fingerprint-results
# Scan production with different aggression levels
whatweb --aggression 1 \
--log-json fingerprint-results/prod-passive.json \
--log-brief fingerprint-results/prod-passive.txt \
${{ vars.PRODUCTION_URL }}
whatweb --aggression 3 \
--log-json fingerprint-results/prod-aggressive.json \
--log-brief fingerprint-results/prod-aggressive.txt \
${{ vars.PRODUCTION_URL }}
# Count technologies
PASSIVE_COUNT=$(jq '[.[] | .plugins | keys] | flatten | length' fingerprint-results/prod-passive.json 2>/dev/null || echo "0")
AGGRESSIVE_COUNT=$(jq '[.[] | .plugins | keys] | flatten | length' fingerprint-results/prod-aggressive.json 2>/dev/null || echo "0")
echo "PASSIVE_COUNT=$PASSIVE_COUNT" >> $GITHUB_ENV
echo "AGGRESSIVE_COUNT=$AGGRESSIVE_COUNT" >> $GITHUB_ENV
- name: Analyze technology stack
run: |
# Check for CMS
CMS_DETECTED=$(jq -r '.[] | .plugins | keys[] | select(test("WordPress|Drupal|Joomla|Magento"; "i"))' fingerprint-results/prod-aggressive.json 2>/dev/null || echo "")
# Check for development technologies
DEV_TECHS=$(jq -r '.[] | .plugins | keys[] | select(test("Debug|Development|Test|Staging|PhpMyAdmin"; "i"))' fingerprint-results/prod-aggressive.json 2>/dev/null || echo "")
# Check for security technologies
SECURITY_TECHS=$(jq -r '.[] | .plugins | keys[] | select(test("SSL|TLS|Security|Firewall|HSTS"; "i"))' fingerprint-results/prod-aggressive.json 2>/dev/null || echo "")
# Set environment variables
if [ -n "$CMS_DETECTED" ]; then
echo "CMS_FOUND=true" >> $GITHUB_ENV
echo "$CMS_DETECTED" > fingerprint-results/cms-detected.txt
else
echo "CMS_FOUND=false" >> $GITHUB_ENV
touch fingerprint-results/cms-detected.txt
fi
if [ -n "$DEV_TECHS" ]; then
echo "DEV_TECHS_FOUND=true" >> $GITHUB_ENV
echo "$DEV_TECHS" > fingerprint-results/dev-technologies.txt
else
echo "DEV_TECHS_FOUND=false" >> $GITHUB_ENV
touch fingerprint-results/dev-technologies.txt
fi
SECURITY_COUNT=$(echo "$SECURITY_TECHS" | grep -v '^$' | wc -l)
echo "SECURITY_COUNT=$SECURITY_COUNT" >> $GITHUB_ENV
- name: Generate technology report
run: |
# Create comprehensive summary
cat > fingerprint-results/summary.txt << EOF
Web Technology Fingerprinting Summary
====================================
Production URL: ${{ vars.PRODUCTION_URL }}
Scan Date: $(date)
Technology Counts:
- Passive scan: $PASSIVE_COUNT technologies
- Aggressive scan: $AGGRESSIVE_COUNT technologies
- Security technologies: $SECURITY_COUNT
Findings:
- CMS detected: $(if [ "$CMS_FOUND" = "true" ]; then echo "YES"; else echo "NO"; fi)
- Development tools: $(if [ "$DEV_TECHS_FOUND" = "true" ]; then echo "YES (CRITICAL)"; else echo "NO"; fi)
- Security technologies: $SECURITY_COUNT
EOF
# Add detected technologies to summary
if [ "$CMS_FOUND" = "true" ]; then
echo "" >> fingerprint-results/summary.txt
echo "CMS Detected:" >> fingerprint-results/summary.txt
cat fingerprint-results/cms-detected.txt | sed 's/^/- /' >> fingerprint-results/summary.txt
fi
if [ "$DEV_TECHS_FOUND" = "true" ]; then
echo "" >> fingerprint-results/summary.txt
echo "Development Technologies (CRITICAL):" >> fingerprint-results/summary.txt
cat fingerprint-results/dev-technologies.txt | sed 's/^/- /' >> fingerprint-results/summary.txt
fi
- name: Upload fingerprinting results
uses: actions/upload-artifact@v3
with:
name: web-fingerprinting-results
path: fingerprint-results/
- name: Comment PR with results
if: github.event_name == 'pull_request'
uses: actions/github-script@v6
with:
script: |
const fs = require('fs');
const summary = fs.readFileSync('fingerprint-results/summary.txt', 'utf8');
github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: `## Web Technology Fingerprinting\n\n\`\`\`\n${summary}\n\`\`\``
});
- name: Fail if development technologies found
run: |
if [ "$DEV_TECHS_FOUND" = "true" ]; then
echo "CRITICAL: Development technologies detected in production!"
cat fingerprint-results/dev-technologies.txt
exit 1
fi
if [ "$SECURITY_COUNT" -eq "0" ]; then
echo "WARNING: No security technologies detected"
# Don't fail, just warn
fi
Performance Optimization and Troubleshooting
Performance Tuning
# Optimize WhatWeb for different scenarios
# Fast scanning with minimal aggression
whatweb --aggression 1 --max-threads 50 example.com
# Thorough scanning with high aggression
whatweb --aggression 4 --max-threads 10 --open-timeout 60 example.com
# Memory-efficient scanning for large lists
whatweb --aggression 2 --max-threads 5 -i large_urls.txt
# Network-optimized scanning
whatweb --open-timeout 30 --read-timeout 60 --max-redirects 3 example.com
# Performance monitoring script
#!/bin/bash
monitor_whatweb_performance() {
local target="$1"
local output_file="whatweb-performance-$(date +%s).log"
echo "Monitoring WhatWeb performance for: $target"
# Start monitoring
{
echo "Timestamp,CPU%,Memory(MB),Threads,Status"
while true; do
if pgrep -f "whatweb" > /dev/null; then
local cpu=$(ps -p $(pgrep -f "whatweb") -o %cpu --no-headers)
local mem=$(ps -p $(pgrep -f "whatweb") -o rss --no-headers | awk '{print $1/1024}')
local threads=$(ps -p $(pgrep -f "whatweb") -o nlwp --no-headers)
echo "$(date +%s),$cpu,$mem,$threads,running"
fi
sleep 2
done
} > "$output_file" &
local monitor_pid=$!
# Run WhatWeb
time whatweb --aggression 3 --verbose "$target"
# Stop monitoring
kill $monitor_pid 2>/dev/null
echo "Performance monitoring completed: $output_file"
}
# Usage
monitor_whatweb_performance "https://example.com"
Troubleshooting Common Issues
# Troubleshooting script for WhatWeb
troubleshoot_whatweb() {
echo "WhatWeb Troubleshooting Guide"
echo "============================"
# Check if WhatWeb is installed
if ! command -v whatweb &> /dev/null; then
echo "❌ WhatWeb not found in PATH"
echo "Solution: Install WhatWeb using package manager or from source"
return 1
fi
echo "✅ WhatWeb found: $(which whatweb)"
echo "Version: $(whatweb --version 2>&1)"
# Check Ruby version
if ! command -v ruby &> /dev/null; then
echo "❌ Ruby not found"
echo "Solution: Install Ruby runtime"
return 1
fi
local ruby_version=$(ruby --version)
echo "✅ Ruby version: $ruby_version"
# Check network connectivity
if ! curl -s --connect-timeout 5 https://httpbin.org/get > /dev/null; then
echo "❌ Network connectivity issues"
echo "Solution: Check internet connection and proxy settings"
return 1
fi
echo "✅ Network connectivity OK"
# Test basic functionality
echo "Testing basic WhatWeb functionality..."
if timeout 60 whatweb --aggression 1 https://httpbin.org/get > /dev/null 2>&1; then
echo "✅ Basic functionality test passed"
else
echo "❌ Basic functionality test failed"
echo "Solution: Check WhatWeb installation and network settings"
return 1
fi
# Check plugin directory
local plugin_dir="/usr/share/whatweb/plugins"
if [ -d "$plugin_dir" ]; then
local plugin_count=$(find "$plugin_dir" -name "*.rb" | wc -l)
echo "✅ Plugin directory found: $plugin_dir ($plugin_count plugins)"
else
echo "⚠️ Plugin directory not found: $plugin_dir"
echo "Note: Plugins may be in a different location"
fi
# Check for common configuration issues
echo "Checking for common configuration issues..."
# Check file permissions
if [ ! -r "$plugin_dir" ] 2>/dev/null; then
echo "⚠️ Plugin directory permission issues"
echo "Solution: Check read permissions on plugin directory"
fi
# Check for proxy issues
if [ -n "$HTTP_PROXY" ] || [ -n "$HTTPS_PROXY" ]; then
echo "⚠️ Proxy environment variables detected"
echo "Note: Use --proxy option if needed"
fi
echo "Troubleshooting completed"
}
# Common error solutions
fix_common_whatweb_errors() {
echo "Common WhatWeb Error Solutions"
echo "============================="
echo "1. 'command not found: whatweb'"
echo " Solution: sudo apt install whatweb (Ubuntu/Debian)"
echo " Alternative: Install from source or use package manager"
echo ""
echo "2. 'Ruby runtime error'"
echo " Solution: Install Ruby and required gems"
echo " Example: sudo apt install ruby ruby-dev"
echo ""
echo "3. 'timeout' or connection errors"
echo " Solution: Increase timeout values"
echo " Example: whatweb --open-timeout 60 --read-timeout 120 <url>"
echo ""
echo "4. 'SSL certificate error'"
echo " Solution: Check SSL configuration or use different URL"
echo ""
echo "5. 'Too many redirects'"
echo " Solution: Limit redirects or check URL manually"
echo " Example: whatweb --max-redirects 5 <url>"
echo ""
echo "6. 'No technologies detected' (false negatives)"
echo " Solution: Increase aggression level"
echo " Example: whatweb --aggression 3 <url>"
echo ""
echo "7. 'Plugin errors'"
echo " Solution: Update WhatWeb or check plugin syntax"
echo " Example: ruby -c /path/to/plugin.rb"
echo ""
echo "8. 'Permission denied'"
echo " Solution: Check file permissions or run with appropriate privileges"
}
# Run troubleshooting
troubleshoot_whatweb
fix_common_whatweb_errors
Resources and Documentation
Official Resources
- WhatWeb GitHub Repository - Main repository and documentation
- WhatWeb Wiki - Comprehensive usage guide
- Plugin Development Guide - Creating custom plugins
- Release Notes - Latest updates and changes
Community Resources
- WhatWeb Discussions - Community Q&A
- Bug Reports - Issue tracking and bug reports
- Plugin Repository - Official plugin collection
- Security Research - Research and methodologies
Integration Examples
- Automation Scripts - Example automation
- Penetration Testing Workflows - Security testing integration
- Bug Bounty Methodologies - Reconnaissance workflows
- Custom Plugin Examples - Plugin development