Wappalyzer Cheat Sheet
Overview
Wappalyzer is a technology profiler that identifies the technologies used on websites. It detects content management systems, e-commerce platforms, web frameworks, server software, analytics tools, and many other technologies. Available as a browser extension, a CLI tool, and an API, Wappalyzer is useful for reconnaissance, competitive analysis, and security assessments.
Key features: technology detection, browser extension, CLI tool, API access, bulk analysis, detailed reporting, and integration with security workflows.
Installation and Setup
Browser Extension Installation
# Chrome/Chromium
# Visit: https://chrome.google.com/webstore/detail/wappalyzer/gppongmhjkpfnbhagpmjfkannfbllamg
# Click "Add to Chrome"
# Firefox
# Visit: https://addons.mozilla.org/en-US/firefox/addon/wappalyzer/
# Click "Add to Firefox"
# Edge
# Visit: https://microsoftedge.microsoft.com/addons/detail/wappalyzer/mnbndgmknlpdjdnjfmfcdjoegcckoikn
# Click "Get"
# Safari
# Visit: https://apps.apple.com/app/wappalyzer/id1520333300
# Install from App Store
# Manual installation for development
git clone https://github.com/wappalyzer/wappalyzer.git
cd wappalyzer
npm install
npm run build
# Load unpacked extension from src/drivers/webextension/
CLI Tool Installation
# Install via npm (Node.js required)
npm install -g wappalyzer
# Verify installation
wappalyzer --version
# Install specific version
npm install -g wappalyzer@6.10.66
# Install locally in project
npm install wappalyzer
npx wappalyzer --version
# Update to latest version
npm update -g wappalyzer
# Uninstall
npm uninstall -g wappalyzer
Docker Installation
# Pull official Docker image
docker pull wappalyzer/cli
# Run Wappalyzer in Docker
docker run --rm wappalyzer/cli https://example.com
# Run with volume mount for output
docker run --rm -v $(pwd):/output wappalyzer/cli https://example.com --output /output/results.json
# Create alias for easier usage (single quotes so $(pwd) is evaluated on each use, not when .bashrc loads)
cat >> ~/.bashrc << 'EOF'
alias wappalyzer='docker run --rm -v "$(pwd)":/output wappalyzer/cli'
EOF
source ~/.bashrc
# Build custom Docker image
cat > Dockerfile << 'EOF'
FROM node:16-alpine
RUN npm install -g wappalyzer
WORKDIR /app
ENTRYPOINT ["wappalyzer"]
EOF
docker build -t custom-wappalyzer .
API Configuration
# Sign up for API access at https://www.wappalyzer.com/api/
# Get API key from dashboard
# Set environment variable
export WAPPALYZER_API_KEY="your_api_key_here"
# Test API access
curl -H "x-api-key: $WAPPALYZER_API_KEY" \
"https://api.wappalyzer.com/v2/lookup/?urls=https://example.com"
# Create configuration file
cat > ~/.wappalyzer-config.json << 'EOF'
{
"api_key": "your_api_key_here",
"api_url": "https://api.wappalyzer.com/v2/",
"timeout": 30,
"max_retries": 3,
"rate_limit": 100
}
EOF
# Set configuration path
export WAPPALYZER_CONFIG=~/.wappalyzer-config.json
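The configuration file above can also be read from a script. A minimal Python sketch, assuming the key names from the sample file (`api_key`, `api_url`, `timeout`); `load_config` and `build_lookup_request` are hypothetical helper names, and the code only prepares the request without sending it:

```python
import json
import os
from urllib.parse import urlencode


def load_config(path):
    """Read the JSON config file; an environment variable overrides the stored key."""
    with open(os.path.expanduser(path), "r", encoding="utf-8") as f:
        config = json.load(f)
    # WAPPALYZER_API_KEY is the variable exported earlier in this section.
    env_key = os.environ.get("WAPPALYZER_API_KEY")
    if env_key:
        config["api_key"] = env_key
    return config


def build_lookup_request(config, target_url):
    """Return (url, headers, timeout) for a lookup call, mirroring the curl example."""
    base = config["api_url"].rstrip("/")
    query = urlencode({"urls": target_url})
    headers = {"x-api-key": config["api_key"]}
    return f"{base}/lookup/?{query}", headers, config.get("timeout", 30)
```

Keeping request construction separate from sending makes it easy to inspect what would be sent before spending API credits.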
Development Setup
# Clone repository for development
git clone https://github.com/wappalyzer/wappalyzer.git
cd wappalyzer
# Install dependencies
npm install
# Build the project
npm run build
# Run tests
npm test
# Start development server
npm run dev
# Create custom technology definitions
mkdir -p custom-technologies
cat > custom-technologies/custom.json << 'EOF'
{
"Custom Framework": {
"cats": [18],
"description": "Custom web framework",
"icon": "custom.png",
"website": "https://custom-framework.com",
"headers": {
"X-Powered-By": "Custom Framework"
},
"html": "<meta name=\"generator\" content=\"Custom Framework",
"js": {
"CustomFramework": ""
},
"implies": "PHP"
}
}
EOF
# Validate custom technology definitions
npm run validate -- custom-technologies/custom.json
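A definition file can be sanity-checked with a few lines of Python before handing it to the validator. This is a rough sketch, not the project's own validation: the required keys (`cats`, `website`) and the function name `check_definitions` are assumptions based only on the fields in the sample above.

```python
import json


def check_definitions(raw_json):
    """Return a list of problems found in a technology-definition document."""
    problems = []
    definitions = json.loads(raw_json)
    for name, spec in definitions.items():
        # Assumed minimum: a list of integer category ids and a website URL.
        cats = spec.get("cats")
        if not isinstance(cats, list) or not all(isinstance(c, int) for c in cats):
            problems.append(f"{name}: 'cats' must be a list of integers")
        if not str(spec.get("website", "")).startswith(("http://", "https://")):
            problems.append(f"{name}: 'website' must be an http(s) URL")
    return problems


sample = '{"Custom Framework": {"cats": [18], "website": "https://custom-framework.com"}}'
print(check_definitions(sample))
```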
Basic Usage and Commands
Basic CLI Commands
# Analyze single website
wappalyzer https://example.com
# Analyze with detailed output
wappalyzer https://example.com --pretty
# Save results to file
wappalyzer https://example.com --output results.json
# Analyze multiple URLs
wappalyzer https://example.com https://test.com
# Analyze from file
echo -e "https://example.com\nhttps://test.com" > urls.txt
wappalyzer --urls-file urls.txt
# Set custom user agent
wappalyzer https://example.com --user-agent "Custom Agent 1.0"
# Set timeout
wappalyzer https://example.com --timeout 30000
# Follow redirects
wappalyzer https://example.com --follow-redirect
# Disable SSL verification
wappalyzer https://example.com --no-ssl-verify
Advanced CLI Options
# Analyze with custom headers
wappalyzer https://example.com --header "Authorization: Bearer token123"
# Set maximum pages to analyze
wappalyzer https://example.com --max-pages 10
# Set crawl depth
wappalyzer https://example.com --max-depth 3
# Analyze with proxy
wappalyzer https://example.com --proxy http://127.0.0.1:8080
# Set custom delay between requests
wappalyzer https://example.com --delay 1000
# Analyze with authentication
wappalyzer https://example.com --cookie "session=abc123; auth=xyz789"
# Output in different formats
wappalyzer https://example.com --output results.csv --format csv
wappalyzer https://example.com --output results.xml --format xml
# Verbose output for debugging
wappalyzer https://example.com --verbose
# Analyze specific categories only
wappalyzer https://example.com --categories "CMS,Web frameworks"
Bulk Analysis
# Analyze multiple domains from file
cat > domains.txt << 'EOF'
example.com
test.com
demo.com
sample.com
EOF
# Basic bulk analysis
wappalyzer --urls-file domains.txt --output bulk_results.json
# Bulk analysis with threading
wappalyzer --urls-file domains.txt --concurrent 10 --output threaded_results.json
# Bulk analysis with rate limiting
wappalyzer --urls-file domains.txt --delay 2000 --output rate_limited_results.json
# Analyze subdomains
subfinder -d example.com -silent | head -100 > subdomains.txt
wappalyzer --urls-file subdomains.txt --output subdomain_analysis.json
# Combine with other tools
# httpx already prints full URLs (scheme included), so no prefix is added
echo "example.com" | subfinder -silent | httpx -silent | head -50 > live_urls.txt
wappalyzer --urls-file live_urls.txt --output comprehensive_analysis.json
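The domain lists above contain bare hostnames, while the single-URL examples pass full URLs. A small sketch for preparing such a list, assuming `https://` is the desired default scheme; `normalize_targets` is a hypothetical helper name:

```python
def normalize_targets(lines, default_scheme="https"):
    """Turn raw lines into a de-duplicated list of URLs, preserving order."""
    seen = set()
    urls = []
    for line in lines:
        target = line.strip()
        if not target or target.startswith("#"):
            continue  # skip blanks and comment lines
        if "://" not in target:
            target = f"{default_scheme}://{target}"
        if target not in seen:
            seen.add(target)
            urls.append(target)
    return urls
```

Typical use would be `normalize_targets(open("domains.txt"))`, writing the result back out as the URL file.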
Advanced Technology Detection
Custom Technology Detector
#!/usr/bin/env python3
# Advanced Wappalyzer automation and custom detection
import json
import subprocess
import requests
import threading
import time
import re
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.parse import urlparse, urljoin
import os

class WappalyzerAnalyzer:
    def __init__(self, api_key=None, max_workers=10):
        self.api_key = api_key
        self.max_workers = max_workers
        self.results = []
        self.lock = threading.Lock()
        self.api_url = "https://api.wappalyzer.com/v2/"
    def analyze_url_cli(self, url, options=None):
        """Analyze URL using Wappalyzer CLI"""
        if options is None:
            options = {}
        try:
            # Build command
            cmd = ['wappalyzer', url]
            if options.get('timeout'):
                cmd.extend(['--timeout', str(options['timeout'])])
            if options.get('user_agent'):
                cmd.extend(['--user-agent', options['user_agent']])
            if options.get('headers'):
                for header in options['headers']:
                    cmd.extend(['--header', header])
            if options.get('proxy'):
                cmd.extend(['--proxy', options['proxy']])
            if options.get('delay'):
                cmd.extend(['--delay', str(options['delay'])])
            if options.get('max_pages'):
                cmd.extend(['--max-pages', str(options['max_pages'])])
            if options.get('follow_redirect'):
                cmd.append('--follow-redirect')
            if options.get('no_ssl_verify'):
                cmd.append('--no-ssl-verify')
            # Run Wappalyzer. The CLI timeout is in milliseconds, while
            # subprocess.run expects seconds; add a short grace period.
            result = subprocess.run(
                cmd,
                capture_output=True, text=True,
                timeout=options.get('timeout', 30000) / 1000 + 10
            )
            if result.returncode == 0:
                try:
                    technologies = json.loads(result.stdout)
                    return {
                        'url': url,
                        'success': True,
                        'technologies': technologies,
                        'error': None
                    }
                except json.JSONDecodeError:
                    return {
                        'url': url,
                        'success': False,
                        'technologies': [],
                        'error': 'Invalid JSON response'
                    }
            else:
                return {
                    'url': url,
                    'success': False,
                    'technologies': [],
                    'error': result.stderr
                }
        except subprocess.TimeoutExpired:
            return {
                'url': url,
                'success': False,
                'technologies': [],
                'error': 'CLI timeout'
            }
        except Exception as e:
            return {
                'url': url,
                'success': False,
                'technologies': [],
                'error': str(e)
            }
    def analyze_url_api(self, url):
        """Analyze URL using Wappalyzer API"""
        if not self.api_key:
            return {
                'url': url,
                'success': False,
                'technologies': [],
                'error': 'API key not provided'
            }
        try:
            headers = {
                'x-api-key': self.api_key,
                'Content-Type': 'application/json'
            }
            response = requests.get(
                f"{self.api_url}lookup/",
                params={'urls': url},
                headers=headers,
                timeout=30
            )
            if response.status_code == 200:
                data = response.json()
                return {
                    'url': url,
                    'success': True,
                    'technologies': data.get(url, []),
                    'error': None
                }
            else:
                return {
                    'url': url,
                    'success': False,
                    'technologies': [],
                    'error': f'API error: {response.status_code}'
                }
        except Exception as e:
            return {
                'url': url,
                'success': False,
                'technologies': [],
                'error': str(e)
            }
    def custom_technology_detection(self, url):
        """Perform custom technology detection"""
        custom_detections = []
        try:
            # Fetch page content
            response = requests.get(url, timeout=30, verify=False)
            content = response.text
            headers = response.headers
            # Custom detection rules
            detections = {
                'Custom Framework': {
                    'patterns': [
                        r'<meta name="generator" content="Custom Framework',
                        r'X-Powered-By.*Custom Framework'
                    ],
                    'category': 'Web frameworks'
                },
                'Internal Tool': {
                    'patterns': [
                        r'<!-- Internal Tool v\d+\.\d+ -->',
                        r'internal-tool\.js',
                        r'data-internal-version'
                    ],
                    'category': 'Development tools'
                },
                'Security Headers': {
                    'patterns': [
                        r'Content-Security-Policy',
                        r'X-Frame-Options',
                        r'X-XSS-Protection'
                    ],
                    'category': 'Security'
                },
                'Analytics Platform': {
                    'patterns': [
                        r'analytics\.custom\.com',
                        r'customAnalytics\(',
                        r'data-analytics-id'
                    ],
                    'category': 'Analytics'
                },
                'CDN Detection': {
                    'patterns': [
                        r'cdn\.custom\.com',
                        r'X-Cache.*HIT',
                        r'X-CDN-Provider'
                    ],
                    'category': 'CDN'
                }
            }
            # Check patterns in content and headers
            for tech_name, tech_info in detections.items():
                detected = False
                for pattern in tech_info['patterns']:
                    # Check in content
                    if re.search(pattern, content, re.IGNORECASE):
                        detected = True
                        break
                    # Check in headers
                    for header_name, header_value in headers.items():
                        if re.search(pattern, f"{header_name}: {header_value}", re.IGNORECASE):
                            detected = True
                            break
                if detected:
                    custom_detections.append({
                        'name': tech_name,
                        'category': tech_info['category'],
                        'confidence': 'high',
                        'version': None
                    })
            return {
                'url': url,
                'success': True,
                'custom_technologies': custom_detections,
                'error': None
            }
        except Exception as e:
            return {
                'url': url,
                'success': False,
                'custom_technologies': [],
                'error': str(e)
            }
    def comprehensive_analysis(self, url, use_api=False, custom_detection=True):
        """Perform comprehensive technology analysis"""
        print(f"Analyzing: {url}")
        results = {
            'url': url,
            'timestamp': time.time(),
            'wappalyzer_cli': None,
            'wappalyzer_api': None,
            'custom_detection': None,
            'combined_technologies': [],
            'technology_categories': {},
            'security_technologies': [],
            'risk_assessment': {}
        }
        # CLI analysis
        cli_result = self.analyze_url_cli(url, {
            'timeout': 30000,
            'follow_redirect': True,
            'max_pages': 5
        })
        results['wappalyzer_cli'] = cli_result
        # API analysis (if enabled)
        if use_api and self.api_key:
            api_result = self.analyze_url_api(url)
            results['wappalyzer_api'] = api_result
        # Custom detection (if enabled)
        if custom_detection:
            custom_result = self.custom_technology_detection(url)
            results['custom_detection'] = custom_result
        # Combine and analyze results
        all_technologies = []
        # Add CLI technologies
        if cli_result['success'] and cli_result['technologies']:
            for tech in cli_result['technologies']:
                all_technologies.append({
                    'name': tech.get('name', 'Unknown'),
                    'category': tech.get('categories', []),
                    'version': tech.get('version'),
                    'confidence': tech.get('confidence', 100),
                    'source': 'wappalyzer_cli'
                })
        # Add API technologies
        if use_api and results['wappalyzer_api'] and results['wappalyzer_api']['success']:
            for tech in results['wappalyzer_api']['technologies']:
                all_technologies.append({
                    'name': tech.get('name', 'Unknown'),
                    'category': tech.get('categories', []),
                    'version': tech.get('version'),
                    'confidence': tech.get('confidence', 100),
                    'source': 'wappalyzer_api'
                })
        # Add custom technologies
        if custom_detection and results['custom_detection'] and results['custom_detection']['success']:
            for tech in results['custom_detection']['custom_technologies']:
                all_technologies.append({
                    'name': tech['name'],
                    'category': [tech['category']],
                    'version': tech.get('version'),
                    'confidence': 90,  # High confidence for custom detection
                    'source': 'custom_detection'
                })
        # Remove duplicates and categorize
        unique_technologies = {}
        for tech in all_technologies:
            tech_name = tech['name']
            if tech_name not in unique_technologies:
                unique_technologies[tech_name] = tech
            else:
                # Keep the entry with the higher confidence when sources overlap
                existing = unique_technologies[tech_name]
                if tech['confidence'] > existing['confidence']:
                    unique_technologies[tech_name] = tech
        results['combined_technologies'] = list(unique_technologies.values())
        # Categorize technologies
        categories = {}
        security_techs = []
        for tech in results['combined_technologies']:
            tech_categories = tech.get('category', [])
            if isinstance(tech_categories, str):
                tech_categories = [tech_categories]
            for category in tech_categories:
                if category not in categories:
                    categories[category] = []
                categories[category].append(tech['name'])
                # Identify security-related technologies (each technology listed once)
                if any(sec_keyword in category.lower() for sec_keyword in ['security', 'firewall', 'protection', 'ssl', 'certificate']):
                    if tech not in security_techs:
                        security_techs.append(tech)
        results['technology_categories'] = categories
        results['security_technologies'] = security_techs
        # Risk assessment
        risk_factors = []
        # Check for outdated technologies
        for tech in results['combined_technologies']:
            if tech.get('version'):
                # This would require a database of known vulnerabilities.
                # For now, crudely flag versions whose major number is 1-5.
                version = str(tech['version'])
                if version.startswith(('1.', '2.', '3.', '4.', '5.')):
                    risk_factors.append(f"Potentially outdated {tech['name']} version {version}")
        # Check for missing security headers
        if not security_techs:
            risk_factors.append("No security technologies detected")
        # Check for development/debug technologies in production
        dev_categories = ['Development tools', 'Debugging', 'Testing']
        for category in dev_categories:
            if category in categories:
                risk_factors.append(f"Development tools detected in production: {', '.join(categories[category])}")
        results['risk_assessment'] = {
            'risk_level': 'low' if len(risk_factors) == 0 else 'medium' if len(risk_factors) <= 2 else 'high',
            'risk_factors': risk_factors,
            'recommendations': self.generate_recommendations(results)
        }
        with self.lock:
            self.results.append(results)
        return results
    def generate_recommendations(self, analysis_result):
        """Generate security and optimization recommendations"""
        recommendations = []
        technologies = analysis_result['combined_technologies']
        categories = analysis_result['technology_categories']
        # Security recommendations
        if 'Security' not in categories:
            recommendations.append("Consider implementing security headers (CSP, HSTS, X-Frame-Options)")
        if 'SSL/TLS' not in categories:
            recommendations.append("Ensure HTTPS is properly configured with valid SSL/TLS certificates")
        if 'Web application firewall' not in categories:
            recommendations.append("Consider implementing a Web Application Firewall (WAF)")
        # Performance recommendations
        if 'CDN' not in categories:
            recommendations.append("Consider using a Content Delivery Network (CDN) for better performance")
        if 'Caching' not in categories:
            recommendations.append("Implement caching mechanisms to improve performance")
        # Technology-specific recommendations
        cms_technologies = categories.get('CMS', [])
        if cms_technologies:
            recommendations.append(f"Keep {', '.join(cms_technologies)} updated to the latest version")
        framework_technologies = categories.get('Web frameworks', [])
        if framework_technologies:
            recommendations.append(f"Ensure {', '.join(framework_technologies)} are updated and properly configured")
        # Analytics and privacy
        analytics_technologies = categories.get('Analytics', [])
        if analytics_technologies:
            recommendations.append("Ensure analytics tools comply with privacy regulations (GDPR, CCPA)")
        return recommendations
    def bulk_analysis(self, urls, use_api=False, custom_detection=True):
        """Perform bulk technology analysis"""
        print(f"Starting bulk analysis of {len(urls)} URLs")
        print(f"Max workers: {self.max_workers}")
        with ThreadPoolExecutor(max_workers=self.max_workers) as executor:
            # Submit all tasks
            future_to_url = {
                executor.submit(self.comprehensive_analysis, url, use_api, custom_detection): url
                for url in urls
            }
            # Process completed tasks
            for future in as_completed(future_to_url):
                url = future_to_url[future]
                try:
                    result = future.result()
                    tech_count = len(result['combined_technologies'])
                    risk_level = result['risk_assessment']['risk_level']
                    print(f"✓ {url}: {tech_count} technologies, risk: {risk_level}")
                except Exception as e:
                    print(f"✗ Error analyzing {url}: {e}")
        return self.results
    def generate_report(self, output_file='wappalyzer_analysis_report.json'):
        """Generate comprehensive analysis report"""
        # Calculate statistics
        total_urls = len(self.results)
        successful_analyses = sum(1 for r in self.results if r['wappalyzer_cli']['success'])
        # Technology statistics
        all_technologies = {}
        all_categories = {}
        risk_levels = {'low': 0, 'medium': 0, 'high': 0}
        for result in self.results:
            # Count technologies
            for tech in result['combined_technologies']:
                tech_name = tech['name']
                if tech_name not in all_technologies:
                    all_technologies[tech_name] = 0
                all_technologies[tech_name] += 1
            # Count categories
            for category, techs in result['technology_categories'].items():
                if category not in all_categories:
                    all_categories[category] = 0
                all_categories[category] += len(techs)
            # Count risk levels
            risk_level = result['risk_assessment']['risk_level']
            if risk_level in risk_levels:
                risk_levels[risk_level] += 1
        # Sort by popularity
        popular_technologies = sorted(all_technologies.items(), key=lambda x: x[1], reverse=True)[:20]
        popular_categories = sorted(all_categories.items(), key=lambda x: x[1], reverse=True)[:10]
        report = {
            'scan_summary': {
                'total_urls': total_urls,
                'successful_analyses': successful_analyses,
                'success_rate': (successful_analyses / total_urls * 100) if total_urls > 0 else 0,
                'scan_date': time.strftime('%Y-%m-%d %H:%M:%S')
            },
            'technology_statistics': {
                'total_unique_technologies': len(all_technologies),
                'popular_technologies': popular_technologies,
                'popular_categories': popular_categories
            },
            'risk_assessment': {
                'risk_distribution': risk_levels,
                'high_risk_urls': [
                    r['url'] for r in self.results
                    if r['risk_assessment']['risk_level'] == 'high'
                ]
            },
            'detailed_results': self.results
        }
        # Save report
        with open(output_file, 'w') as f:
            json.dump(report, f, indent=2)
        print(f"\nWappalyzer Analysis Report:")
        print(f"Total URLs analyzed: {total_urls}")
        print(f"Successful analyses: {successful_analyses}")
        print(f"Success rate: {report['scan_summary']['success_rate']:.1f}%")
        print(f"Unique technologies found: {len(all_technologies)}")
        print(f"High risk URLs: {risk_levels['high']}")
        print(f"Report saved to: {output_file}")
        return report
# Usage example
if __name__ == "__main__":
    # Create analyzer instance
    analyzer = WappalyzerAnalyzer(
        api_key=os.getenv('WAPPALYZER_API_KEY'),
        max_workers=5
    )
    # URLs to analyze
    urls = [
        'https://example.com',
        'https://test.com',
        'https://demo.com'
    ]
    # Perform bulk analysis
    results = analyzer.bulk_analysis(
        urls,
        use_api=False,  # Set to True if you have API key
        custom_detection=True
    )
    # Generate report
    report = analyzer.generate_report('comprehensive_wappalyzer_report.json')
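The de-duplication step in `comprehensive_analysis` keeps, for each technology name, the entry with the highest confidence. In isolation the idea looks roughly like this; the function name `merge_by_confidence` is illustrative and not part of the class above:

```python
def merge_by_confidence(technologies):
    """Keep one entry per technology name, preferring the highest confidence."""
    best = {}
    for tech in technologies:
        current = best.get(tech["name"])
        if current is None or tech["confidence"] > current["confidence"]:
            best[tech["name"]] = tech
    return list(best.values())
```

On a tie, the first entry seen wins, so the order in which sources are appended (CLI, then API, then custom rules) decides which record is kept.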
Automation and Integration
CI/CD Integration
#!/bin/bash
# CI/CD script for technology stack analysis
set -e
TARGET_URL="$1"
OUTPUT_DIR="$2"
BASELINE_FILE="$3"
if [ -z "$TARGET_URL" ] || [ -z "$OUTPUT_DIR" ]; then
echo "Usage: $0 <target_url> <output_dir> [baseline_file]"
exit 1
fi
echo "Starting technology stack analysis..."
echo "Target: $TARGET_URL"
echo "Output directory: $OUTPUT_DIR"
mkdir -p "$OUTPUT_DIR"
# Run Wappalyzer analysis
echo "Running Wappalyzer analysis..."
wappalyzer "$TARGET_URL" \
--timeout 30000 \
--follow-redirect \
--max-pages 10 \
--output "$OUTPUT_DIR/wappalyzer_results.json" \
--pretty
# Parse results
TECH_COUNT=$(jq '. | length' "$OUTPUT_DIR/wappalyzer_results.json" 2>/dev/null || echo "0")
echo "Found $TECH_COUNT technologies"
# Extract technology categories
jq -r '.[].categories[]' "$OUTPUT_DIR/wappalyzer_results.json" 2>/dev/null | sort | uniq > "$OUTPUT_DIR/categories.txt" || touch "$OUTPUT_DIR/categories.txt"
CATEGORY_COUNT=$(wc -l < "$OUTPUT_DIR/categories.txt")
# Extract security-related technologies
jq -r '.[] | select(.categories[] | contains("Security") or contains("SSL") or contains("Certificate")) | .name' "$OUTPUT_DIR/wappalyzer_results.json" 2>/dev/null > "$OUTPUT_DIR/security_technologies.txt" || touch "$OUTPUT_DIR/security_technologies.txt"
SECURITY_COUNT=$(wc -l < "$OUTPUT_DIR/security_technologies.txt")
# Check for development/debug technologies
jq -r '.[] | select(.categories[] | contains("Development") or contains("Debug") or contains("Testing")) | .name' "$OUTPUT_DIR/wappalyzer_results.json" 2>/dev/null > "$OUTPUT_DIR/dev_technologies.txt" || touch "$OUTPUT_DIR/dev_technologies.txt"
DEV_COUNT=$(wc -l < "$OUTPUT_DIR/dev_technologies.txt")
# Generate summary report
cat > "$OUTPUT_DIR/technology-summary.txt" << EOF
Technology Stack Analysis Summary
================================
Date: $(date)
Target: $TARGET_URL
Total Technologies: $TECH_COUNT
Categories: $CATEGORY_COUNT
Security Technologies: $SECURITY_COUNT
Development Technologies: $DEV_COUNT
Status: $(if [ "$DEV_COUNT" -gt "0" ]; then echo "WARNING - Development tools detected"; else echo "OK"; fi)
EOF
# Compare with baseline if provided
if [ -n "$BASELINE_FILE" ] && [ -f "$BASELINE_FILE" ]; then
echo "Comparing with baseline..."
# Extract current technology names
jq -r '.[].name' "$OUTPUT_DIR/wappalyzer_results.json" 2>/dev/null | sort > "$OUTPUT_DIR/current_technologies.txt" || touch "$OUTPUT_DIR/current_technologies.txt"
# Extract baseline technology names
jq -r '.[].name' "$BASELINE_FILE" 2>/dev/null | sort > "$OUTPUT_DIR/baseline_technologies.txt" || touch "$OUTPUT_DIR/baseline_technologies.txt"
# Find differences
comm -23 "$OUTPUT_DIR/current_technologies.txt" "$OUTPUT_DIR/baseline_technologies.txt" > "$OUTPUT_DIR/new_technologies.txt"
comm -13 "$OUTPUT_DIR/current_technologies.txt" "$OUTPUT_DIR/baseline_technologies.txt" > "$OUTPUT_DIR/removed_technologies.txt"
NEW_COUNT=$(wc -l < "$OUTPUT_DIR/new_technologies.txt")
REMOVED_COUNT=$(wc -l < "$OUTPUT_DIR/removed_technologies.txt")
echo "New technologies: $NEW_COUNT"
echo "Removed technologies: $REMOVED_COUNT"
# Add to summary
cat >> "$OUTPUT_DIR/technology-summary.txt" << EOF
Baseline Comparison:
New Technologies: $NEW_COUNT
Removed Technologies: $REMOVED_COUNT
EOF
if [ "$NEW_COUNT" -gt "0" ]; then
echo "New technologies detected:"
cat "$OUTPUT_DIR/new_technologies.txt"
fi
fi
# Generate detailed report
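# Pass the directory and URL as arguments; "-" makes python3 read the script from stdin
python3 - "$OUTPUT_DIR" "$TARGET_URL" << 'PYTHON_EOF'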
import sys
import json
from datetime import datetime
output_dir = sys.argv[1]
target_url = sys.argv[2]
# Read Wappalyzer results
try:
    with open(f"{output_dir}/wappalyzer_results.json", 'r') as f:
        technologies = json.load(f)
except (FileNotFoundError, json.JSONDecodeError):
    technologies = []
# Categorize technologies
categories = {}
security_techs = []
dev_techs = []
outdated_techs = []
for tech in technologies:
    tech_name = tech.get('name', 'Unknown')
    tech_categories = tech.get('categories', [])
    tech_version = tech.get('version', '')
    # Categorize
    for category in tech_categories:
        if category not in categories:
            categories[category] = []
        categories[category].append(tech_name)
        # Security technologies
        if any(sec_keyword in category.lower() for sec_keyword in ['security', 'ssl', 'certificate', 'firewall']):
            security_techs.append(tech)
        # Development technologies
        if any(dev_keyword in category.lower() for dev_keyword in ['development', 'debug', 'testing']):
            dev_techs.append(tech)
    # Check for potentially outdated versions (crude heuristic: major version 1-5)
    if tech_version and str(tech_version).startswith(('1.', '2.', '3.', '4.', '5.')):
        outdated_techs.append(tech)
# Risk assessment
risk_factors = []
if len(dev_techs) > 0:
    risk_factors.append(f"Development tools detected: {', '.join([t['name'] for t in dev_techs])}")
if len(security_techs) == 0:
    risk_factors.append("No security technologies detected")
if len(outdated_techs) > 0:
    risk_factors.append(f"Potentially outdated technologies: {', '.join([t['name'] for t in outdated_techs])}")
risk_level = 'low' if len(risk_factors) == 0 else 'medium' if len(risk_factors) <= 2 else 'high'
# Create detailed report
report = {
'scan_info': {
'target': target_url,
'scan_date': datetime.now().isoformat(),
'technology_count': len(technologies),
'category_count': len(categories)
},
'technologies': technologies,
'categories': categories,
'security_assessment': {
'security_technologies': security_techs,
'development_technologies': dev_techs,
'outdated_technologies': outdated_techs,
'risk_level': risk_level,
'risk_factors': risk_factors
}
}
# Save detailed report
with open(f"{output_dir}/wappalyzer-detailed-report.json", 'w') as f:
    json.dump(report, f, indent=2)
# Generate HTML report
html_content = f"""
<!DOCTYPE html>
<html>
<head>
<title>Technology Stack Analysis Report</title>
<style>
body {{ font-family: Arial, sans-serif; margin: 20px; }}
.header {{ background-color: #f0f0f0; padding: 20px; border-radius: 5px; }}
.category {{ margin: 10px 0; padding: 15px; border-left: 4px solid #007bff; background-color: #f8f9fa; }}
.risk-high {{ border-left-color: #dc3545; }}
.risk-medium {{ border-left-color: #ffc107; }}
.risk-low {{ border-left-color: #28a745; }}
table {{ border-collapse: collapse; width: 100%; margin: 20px 0; }}
th, td {{ border: 1px solid #ddd; padding: 8px; text-align: left; }}
th {{ background-color: #f2f2f2; }}
</style>
</head>
<body>
<div class="header">
<h1>Technology Stack Analysis Report</h1>
<p><strong>Target:</strong> {target_url}</p>
<p><strong>Scan Date:</strong> {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}</p>
<p><strong>Technologies Found:</strong> {len(technologies)}</p>
<p><strong>Risk Level:</strong> <span class="risk-{risk_level}">{risk_level.upper()}</span></p>
</div>
<h2>Technology Categories</h2>
"""
for category, techs in categories.items():
    html_content += f"""
<div class="category">
<h3>{category}</h3>
<p>{', '.join(techs)}</p>
</div>
"""
html_content += """
<h2>Detailed Technologies</h2>
<table>
<tr><th>Name</th><th>Version</th><th>Categories</th><th>Confidence</th></tr>
"""
for tech in technologies:
    html_content += f"""
<tr>
<td>{tech.get('name', 'Unknown')}</td>
<td>{tech.get('version', 'N/A')}</td>
<td>{', '.join(tech.get('categories', []))}</td>
<td>{tech.get('confidence', 'N/A')}%</td>
</tr>
"""
html_content += """
</table>
<h2>Risk Assessment</h2>
"""
if risk_factors:
    html_content += "<ul>"
    for factor in risk_factors:
        html_content += f"<li>{factor}</li>"
    html_content += "</ul>"
else:
    html_content += "<p>No significant risk factors identified.</p>"
html_content += """
</body>
</html>
"""
with open(f"{output_dir}/wappalyzer-report.html", 'w') as f:
    f.write(html_content)
print(f"Detailed reports generated:")
print(f"- JSON: {output_dir}/wappalyzer-detailed-report.json")
print(f"- HTML: {output_dir}/wappalyzer-report.html")
PYTHON_EOF
# Check for development technologies and exit
if [ "$DEV_COUNT" -gt "0" ]; then
echo "WARNING: Development technologies detected in production environment"
echo "Development technologies found:"
cat "$OUTPUT_DIR/dev_technologies.txt"
exit 1
else
echo "SUCCESS: No development technologies detected"
exit 0
fi
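The `comm`-based baseline comparison can also be done directly on the two JSON files. A sketch under the same assumption the script makes, namely that each results file is a JSON array of objects with a `name` field; `technology_names` and `diff_technologies` are hypothetical helpers:

```python
import json


def technology_names(path):
    """Load a results file and return the set of technology names in it."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return set()  # treat a missing or unreadable file as empty
    return {item["name"] for item in data if isinstance(item, dict) and "name" in item}


def diff_technologies(current_path, baseline_path):
    """Return (new, removed) technology names as sorted lists."""
    current = technology_names(current_path)
    baseline = technology_names(baseline_path)
    return sorted(current - baseline), sorted(baseline - current)
```

Set difference avoids the requirement that both inputs be sorted identically, which `comm` depends on.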
GitHub Actions Integration
# .github/workflows/wappalyzer-tech-analysis.yml
name: Technology Stack Analysis
on:
push:
branches: [ main, develop ]
pull_request:
branches: [ main ]
schedule:
- cron: '0 6 * * 1' # Weekly scan on Mondays at 6 AM
jobs:
technology-analysis:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Setup Node.js
uses: actions/setup-node@v3
with:
node-version: '16'
- name: Install Wappalyzer
run: |
npm install -g wappalyzer
wappalyzer --version
- name: Analyze production environment
run: |
mkdir -p analysis-results
# Analyze main application
wappalyzer ${{ vars.PRODUCTION_URL }} \
--timeout 30000 \
--follow-redirect \
--max-pages 10 \
--output analysis-results/production-tech.json \
--pretty
# Analyze staging environment
wappalyzer ${{ vars.STAGING_URL }} \
--timeout 30000 \
--follow-redirect \
--max-pages 5 \
--output analysis-results/staging-tech.json \
--pretty
# Count technologies
| PROD_TECH_COUNT=$(jq '. | length' analysis-results/production-tech.json 2>/dev/null | | echo "0") |
| STAGING_TECH_COUNT=$(jq '. | length' analysis-results/staging-tech.json 2>/dev/null | | echo "0") |
echo "PROD_TECH_COUNT=$PROD_TECH_COUNT" >> $GITHUB_ENV
echo "STAGING_TECH_COUNT=$STAGING_TECH_COUNT" >> $GITHUB_ENV
- name: Check for development technologies
run: |
# Check for development/debug technologies in production
| DEV_TECHS=$(jq -r '.[] | select(.categories[] | contains("Development") or contains("Debug") or contains("Testing")) | .name' analysis-results/production-tech.json 2>/dev/null | | echo "") |
if [ -n "$DEV_TECHS" ]; then
echo "DEV_TECHS_FOUND=true" >> $GITHUB_ENV
echo "Development technologies found in production: "
echo "$DEV_TECHS"
echo "$DEV_TECHS" > analysis-results/dev-technologies.txt
else
echo "DEV_TECHS_FOUND=false" >> $GITHUB_ENV
touch analysis-results/dev-technologies.txt
fi
- name: Security technology assessment
run: |
# Check for security technologies
| SECURITY_TECHS=$(jq -r '.[] | select(.categories[] | contains("Security") or contains("SSL") or contains("Certificate")) | .name' analysis-results/production-tech.json 2>/dev/null | | echo "") |
if [ -n "$SECURITY_TECHS" ]; then
echo "Security technologies found: "
echo "$SECURITY_TECHS"
echo "$SECURITY_TECHS" > analysis-results/security-technologies.txt
else
echo "No security technologies detected"
touch analysis-results/security-technologies.txt
fi
SECURITY_COUNT=$(echo "$SECURITY_TECHS" | wc -l)
echo "SECURITY_COUNT=$SECURITY_COUNT" >> $GITHUB_ENV
- name: Generate comparison report
run: |
# Compare production and staging
| jq -r '.[].name' analysis-results/production-tech.json 2>/dev/null | sort > analysis-results/prod-techs.txt | | touch analysis-results/prod-techs.txt |
| jq -r '.[].name' analysis-results/staging-tech.json 2>/dev/null | sort > analysis-results/staging-techs.txt | | touch analysis-results/staging-techs.txt |
# Find differences
comm -23 analysis-results/prod-techs.txt analysis-results/staging-techs.txt > analysis-results/prod-only.txt
comm -13 analysis-results/prod-techs.txt analysis-results/staging-techs.txt > analysis-results/staging-only.txt
# Generate summary
cat > analysis-results/summary.txt << EOF
Technology Stack Analysis Summary
================================
Production Technologies: $PROD_TECH_COUNT
Staging Technologies: $STAGING_TECH_COUNT
Security Technologies: $SECURITY_COUNT
Development Technologies in Production: $(if [ "$DEV_TECHS_FOUND" = "true" ]; then echo "YES (CRITICAL)"; else echo "NO"; fi)
Production-only Technologies: $(wc -l < analysis-results/prod-only.txt)
Staging-only Technologies: $(wc -l < analysis-results/staging-only.txt)
EOF
- name: Upload analysis results
uses: actions/upload-artifact@v3
with:
name: technology-analysis-results
path: analysis-results/
- name: Comment PR with results
if: github.event_name == 'pull_request'
uses: actions/github-script@v6
with:
script: |
const fs = require('fs');
const summary = fs.readFileSync('analysis-results/summary.txt', 'utf8');
await github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: `## Technology Stack Analysis\n\n\`\`\`\n${summary}\n\`\`\``
});
- name: Fail if development technologies found
run: |
if [ "$DEV_TECHS_FOUND" = "true" ]; then
echo "CRITICAL: Development technologies detected in production!"
cat analysis-results/dev-technologies.txt
exit 1
fi
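The comparison step above relies on `comm`, which only gives correct results when both inputs are sorted the same way. A minimal sketch of that logic as a standalone function follows. `compare_lists` is a hypothetical helper (not part of Wappalyzer), and it assumes each input file holds one technology name per line.

```bash
# compare_lists is a hypothetical helper, not part of Wappalyzer.
# Assumes each input file contains one technology name per line.
compare_lists() {
  local prod_file="$1" staging_file="$2"
  local tmp
  tmp=$(mktemp -d) || return 1
  # comm requires both inputs sorted with the same collation;
  # sort -u also drops duplicate names
  LC_ALL=C sort -u "$prod_file" > "$tmp/prod"
  LC_ALL=C sort -u "$staging_file" > "$tmp/staging"
  # -23 keeps lines unique to the first file, -13 lines unique to the second
  LC_ALL=C comm -23 "$tmp/prod" "$tmp/staging" | sed 's/^/prod-only: /'
  LC_ALL=C comm -13 "$tmp/prod" "$tmp/staging" | sed 's/^/staging-only: /'
  rm -rf "$tmp"
}
```

Example call: `compare_lists analysis-results/prod-techs.txt analysis-results/staging-techs.txt`. Pinning `LC_ALL=C` for both `sort` and `comm` avoids locale-dependent ordering mismatches between the two tools.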
Performance Optimization and Troubleshooting
Performance Tuning
# Optimize Wappalyzer for different scenarios
# Fast analysis with minimal pages
wappalyzer https://example.com --max-pages 1 --timeout 10000
# Thorough analysis with deep crawling
wappalyzer https://example.com --max-pages 20 --max-depth 5 --timeout 60000
# Bulk analysis with rate limiting
# Read one URL per line; skip blank lines
while IFS= read -r url; do
[ -n "$url" ] || continue
wappalyzer "$url" --output "results_$(printf '%s' "$url" | sed 's/[^a-zA-Z0-9]/_/g').json" < /dev/null
sleep 2
done < urls.txt
# Memory-efficient analysis for large sites
wappalyzer https://example.com --max-pages 5 --no-scripts --timeout 30000
# Performance monitoring script
#!/bin/bash
monitor_wappalyzer_performance() {
local url="$1"
local output_file="wappalyzer-performance-$(date +%s).log"
echo "Monitoring Wappalyzer performance for: $url"
# Start monitoring
{
echo "Timestamp,CPU%,Memory(MB),Status"
while true; do
# pgrep may match several processes; -o selects the oldest one
local pid
pid=$(pgrep -o -f "wappalyzer")
if [ -n "$pid" ]; then
local cpu mem
cpu=$(ps -p "$pid" -o %cpu= | tr -d ' ')
mem=$(ps -p "$pid" -o rss= | awk '{printf "%.1f", $1/1024}')
echo "$(date +%s),$cpu,$mem,running"
fi
sleep 2
done
} > "$output_file" &
local monitor_pid=$!
# Run Wappalyzer
time wappalyzer "$url" --pretty --output "performance_test_results.json"
# Stop monitoring
kill $monitor_pid 2>/dev/null
echo "Performance monitoring completed: $output_file"
}
# Usage
monitor_wappalyzer_performance "https://example.com"
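The rate-limited bulk loop above can also be packaged as a reusable function. The sketch below is one way to do it under a few assumptions: `url_to_filename` and `run_bulk` are hypothetical helper names, the scanner command is passed in as an argument, and the scanner is assumed to print its results to standard output.

```bash
# url_to_filename and run_bulk are hypothetical helpers, not Wappalyzer features.

# Build an output filename by replacing every character
# outside [A-Za-z0-9] with an underscore.
url_to_filename() {
  printf 'results_%s.json\n' "$(printf '%s' "$1" | sed 's/[^a-zA-Z0-9]/_/g')"
}

# Usage: run_bulk <url-list-file> <delay-seconds> <scanner command...>
# Assumes the scanner writes its results to stdout.
run_bulk() {
  local list="$1" delay="$2"
  shift 2
  local url
  # "|| [ -n "$url" ]" also handles a last line without a trailing newline
  while IFS= read -r url || [ -n "$url" ]; do
    # Skip blank lines and comment lines
    case "$url" in
      ''|'#'*) continue ;;
    esac
    # Redirect stdin so the scanner cannot consume the URL list
    "$@" "$url" > "$(url_to_filename "$url")" < /dev/null
    sleep "$delay"
  done < "$list"
}
```

Example call: `run_bulk urls.txt 2 wappalyzer`. Reading the list with `while read` rather than `for url in $(cat ...)` avoids word splitting and glob expansion on unusual URLs.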
Common Issues
# Troubleshooting script for Wappalyzer
troubleshoot_wappalyzer() {
echo "Wappalyzer Troubleshooting Guide"
echo "==============================="
# Check if Wappalyzer is installed
if ! command -v wappalyzer &> /dev/null; then
echo "❌ Wappalyzer not found in PATH"
echo "Solution: Install Wappalyzer using 'npm install -g wappalyzer'"
return 1
fi
echo "✅ Wappalyzer found: $(which wappalyzer)"
echo "Version: $(wappalyzer --version 2>&1)"
# Check Node.js version
if ! command -v node &> /dev/null; then
echo "❌ Node.js not found"
echo "Solution: Install Node.js from https://nodejs.org/"
return 1
fi
local node_version=$(node --version)
echo "✅ Node.js version: $node_version"
# Check network connectivity
if ! curl -s --connect-timeout 5 https://httpbin.org/get > /dev/null; then
echo "❌ Network connectivity issues"
echo "Solution: Check internet connection and proxy settings"
return 1
fi
echo "✅ Network connectivity OK"
# Test basic functionality
echo "Testing basic Wappalyzer functionality..."
if timeout 60 wappalyzer https://httpbin.org/get --timeout 30000 > /dev/null 2>&1; then
echo "✅ Basic functionality test passed"
else
echo "❌ Basic functionality test failed"
echo "Solution: Check Wappalyzer installation and network settings"
return 1
fi
# Check for common configuration issues
echo "Checking for common configuration issues..."
# Check npm permissions
if [ ! -w "$(npm config get prefix)/lib/node_modules" ] 2>/dev/null; then
echo "⚠️ npm permission issues detected"
echo "Solution: Fix npm permissions or use nvm"
fi
# Check for proxy issues
if [ -n "$HTTP_PROXY" ] || [ -n "$HTTPS_PROXY" ]; then
echo "⚠️ Proxy environment variables detected"
echo "Note: verify that Wappalyzer actually picks up these proxy settings"
fi
echo "Troubleshooting completed"
}
# Common error solutions
fix_common_wappalyzer_errors() {
echo "Common Wappalyzer Error Solutions"
echo "================================"
echo "1. 'command not found: wappalyzer'"
echo " Solution: npm install -g wappalyzer"
echo " Alternative: npx wappalyzer <url>"
echo ""
echo "2. 'EACCES: permission denied'"
echo " Solution: Fix npm permissions or use sudo"
echo " Better: Use nvm to manage Node.js versions"
echo ""
echo "3. 'timeout' or 'ETIMEDOUT'"
echo " Solution: Increase timeout with --timeout option"
echo " Example: wappalyzer <url> --timeout 60000"
echo ""
echo "4. 'SSL certificate error'"
echo " Solution: Use --no-ssl-verify (not recommended for production)"
echo ""
echo "5. 'Too many redirects'"
echo " Solution: Use --follow-redirect or check URL manually"
echo ""
echo "6. 'No technologies detected' (false negatives)"
echo " Solution: Increase --max-pages and --max-depth"
echo " Example: wappalyzer <url> --max-pages 10 --max-depth 3"
echo ""
echo "7. 'Out of memory' for large sites"
echo " Solution: Reduce --max-pages or use --no-scripts"
echo " Example: wappalyzer <url> --max-pages 5 --no-scripts"
}
# Run troubleshooting
troubleshoot_wappalyzer
fix_common_wappalyzer_errors
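For transient 'timeout' or 'ETIMEDOUT' failures, retrying the scan can complement a larger timeout. The following is a minimal sketch of a generic retry wrapper with a doubling delay. `retry` is a hypothetical helper, not a Wappalyzer feature, and it assumes the wrapped command signals failure with a non-zero exit status.

```bash
# retry is a hypothetical helper, not part of Wappalyzer.
# Usage: retry <max-attempts> <initial-delay-seconds> <command...>
# Assumes the command returns a non-zero exit status on failure.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "Giving up after $n attempts: $*" >&2
      return 1
    fi
    echo "Attempt $n failed; retrying in ${delay}s..." >&2
    sleep "$delay"
    n=$((n + 1))
    # Double the wait before the next attempt
    delay=$((delay * 2))
  done
}
```

Example call: `retry 3 5 wappalyzer https://example.com`, which tries up to three times and waits 5 and then 10 seconds between attempts.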
Resources and Documentation
Official Resources
- Wappalyzer Website - Main website and browser extensions
- Wappalyzer CLI GitHub - CLI tool repository
- API Documentation - API reference and pricing
- Technology Database - Technology definitions
Community Resources
- Browser Extension Store - Chrome Web Store
- Firefox Add-ons - Firefox extension
- Technology Discussions - Community
- Bug Reports - Issue tracker
Integration Examples
- Security Automation - Integration guides
- Competitive Analysis - Market research tools
- Technology Trends - Technology statistics
- Custom Integrations - API integration examples