Anubis
Overview
Anubis is an open-source Web AI Firewall and anti-scraping reverse proxy that protects upstream resources from AI crawlers, scraper bots, and other automated traffic. It serves a SHA-256 proof-of-work challenge through JavaScript, so a client has to spend CPU time in a real browser environment before its requests are proxied upstream.
Xe Iaso created Anubis after Amazon crawlers overloaded their Git server and exhausted its resources. Written in Go, it is a lightweight protection layer that sits between user traffic and your application and filters automated access before it reaches the backend.
GitHub: TecharoHQ/anubis
License: MIT
Built With: Go, JavaScript
Installation
Prerequisites
- Go 1.19+ or Docker
- Upstream server to protect
- TLS certificates (for HTTPS protection)
Build from Source
# Clone repository
git clone https://github.com/TecharoHQ/anubis.git
cd anubis
# Build binary
go build -o anubis ./cmd/anubis
# Verify installation
./anubis --version
Docker Installation
# Pull Docker image
docker pull techarohq/anubis:latest
# Run container
docker run -d \
  -p 8080:8080 \
  -e UPSTREAM_URL=http://backend:3000 \
  techarohq/anubis:latest
Docker Compose Setup
version: '3.8'
services:
  anubis:
    image: techarohq/anubis:latest
    ports:
      - "8080:8080"
      - "8443:8443"
    environment:
      UPSTREAM_URL: http://backend:3000
      ENABLE_HTTPS: "true"
      CHALLENGE_DIFFICULTY: "medium"
      LOG_LEVEL: "info"
    volumes:
      - ./certs:/etc/anubis/certs
    restart: unless-stopped
  backend:
    image: myapp:latest
    expose:
      - "3000"
System Service (Linux)
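# Copy binary to system location
sudo cp anubis /usr/local/bin/
# Create the unprivileged account that the unit's User= setting refers to
sudo useradd --system --no-create-home --shell /usr/sbin/nologin anubis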
# Create systemd service
sudo tee /etc/systemd/system/anubis.service > /dev/null << 'EOF'
[Unit]
Description=Anubis Web AI Firewall
After=network.target
[Service]
Type=simple
User=anubis
ExecStart=/usr/local/bin/anubis -config /etc/anubis/config.yaml
Restart=on-failure
RestartSec=10
[Install]
WantedBy=multi-user.target
EOF
# Enable and start service
sudo systemctl daemon-reload
sudo systemctl enable anubis
sudo systemctl start anubis
Configuration
Basic Configuration
# config.yaml
server:
  listen: ":8080"
  read_timeout: 30s
  write_timeout: 30s
  idle_timeout: 60s
upstream:
  url: "http://localhost:3000"
  timeout: 30s
  max_idle_conns: 100
challenge:
  enabled: true
  difficulty: "medium"
  timeout: 300s
  cache_size: 10000
logging:
  level: "info"
  format: "json"
  output: "stdout"
Environment Variables
# Core settings
UPSTREAM_URL=http://localhost:3000
LISTEN_ADDR=:8080
LOG_LEVEL=info
# Challenge settings
CHALLENGE_ENABLED=true
CHALLENGE_DIFFICULTY=medium
CHALLENGE_TIMEOUT=300
# Performance settings
MAX_IDLE_CONNS=100
REQUEST_TIMEOUT=30
CACHE_SIZE=10000
Core Commands
| Command | Purpose | Example |
|---|---|---|
| anubis | Start with default config | anubis |
| anubis -config | Start with custom config | anubis -config /etc/anubis/config.yaml |
| anubis -upstream | Set upstream URL | anubis -upstream http://app:3000 |
| anubis -listen | Set listen address | anubis -listen :8443 |
| anubis -help | Show help | anubis -help |
| anubis -version | Show version | anubis -version |
Proof-of-Work Challenge System
How Challenges Work
Anubis challenges requests with a SHA-256 proof-of-work puzzle:
- The browser receives the challenge as HTML and JavaScript
- Client-side JavaScript computes SHA-256 hashes over the challenge and candidate nonces
- Once a nonce produces a hash that meets the difficulty target, the browser submits it and the request continues
- The server validates the proof-of-work before proxying the request upstream
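The steps above can be sketched in a few lines of Node.js. This is a minimal illustration, not Anubis's actual implementation: the way the challenge string and nonce are combined, the leading-zero target, and the names `powHash`, `solve`, and `verify` are all assumptions made for the example.

```javascript
const { createHash } = require("node:crypto");

// Hex SHA-256 of the challenge string concatenated with a candidate nonce.
// (Assumed encoding; a real deployment defines its own.)
function powHash(challenge, nonce) {
  return createHash("sha256").update(challenge + nonce).digest("hex");
}

// Client side: try nonces 0, 1, 2, ... until the hash starts with
// `zeroNibbles` hex zeros. Each hex zero adds 4 bits of difficulty.
function solve(challenge, zeroNibbles) {
  const prefix = "0".repeat(zeroNibbles);
  for (let nonce = 0; ; nonce++) {
    const hash = powHash(challenge, nonce);
    if (hash.startsWith(prefix)) return { nonce, hash };
  }
}

// Server side: a single hash is enough to check the submitted nonce.
function verify(challenge, nonce, zeroNibbles) {
  return powHash(challenge, nonce).startsWith("0".repeat(zeroNibbles));
}

const solution = solve("example-challenge", 3);
console.log(solution.nonce, verify("example-challenge", solution.nonce, 3));
```

The asymmetry is the point of the scheme: the client needs about 16^n attempts on average for n hex zeros, while the server checks the result with one hash.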
Challenge Difficulty Levels
challenge:
  # Pick one level; the alternatives are commented out
  # difficulty: easy    # ~0.5 seconds CPU (home connections)
  difficulty: medium    # ~2 seconds CPU (default, bots filtered)
  # difficulty: hard    # ~10 seconds CPU (heavy protection)
Challenge Response Example
// Browser receives challenge
{
  "challenge": "find_nonce_for_this_hash",
  "target": "00001234abcd...",
  "difficulty": "medium"
}
// Browser solves and returns
{
  "challenge": "...",
  "nonce": "12345",
  "proof": "valid_sha256_hash"
}
Request Flow
Normal Browser Request
┌─────────────┐
│ Browser │
└──────┬──────┘
│ HTTP GET /page
▼
┌──────────────────┐
│ Anubis Firewall │ ◄─── JavaScript Challenge
├──────────────────┤ (SHA-256 PoW)
│ Challenge System │
│ Cache │
└──────┬───────────┘
│ HTTP Request (with PoW token)
▼
┌──────────────────┐
│ Upstream Server │
└──────────────────┘
Blocked AI Crawler Request
┌─────────────┐
│ AI Crawler │
└──────┬──────┘
│ HTTP GET /page
▼
┌──────────────────┐
│ Anubis Firewall │
├──────────────────┤
│ JavaScript │
│ Not Executed ✗ │
└──────────────────┘
▼
403 Forbidden (PoW Required)
Advanced Configuration
Rate Limiting Integration
rate_limit:
  enabled: true
  requests_per_second: 100
  burst: 10
  per_ip: true
challenge:
  difficulty: medium
  # Higher difficulty for repeated failures
  escalate_on_failure: true
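One common way to implement a per-IP limit with a rate and a burst is a token bucket. The sketch below is an assumption about how such settings could behave, not a description of Anubis internals: it treats `burst` as the bucket capacity and `requests_per_second` as the refill rate, and the names `TokenBucket` and `allowRequest` are hypothetical. Timestamps are passed in explicitly so the behavior is easy to reason about.

```javascript
// Token bucket: holds up to `burst` tokens, refilled at `ratePerSecond`.
class TokenBucket {
  constructor(ratePerSecond, burst) {
    this.rate = ratePerSecond;
    this.capacity = burst;
    this.tokens = burst;
    this.last = 0; // timestamp of the previous call, in milliseconds
  }

  // Spend one token if available; `nowMs` is the current time in ms.
  allow(nowMs) {
    const elapsedSeconds = (nowMs - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.rate);
    this.last = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// One bucket per client IP, created on first use.
const buckets = new Map();
function allowRequest(ip, nowMs, rate = 100, burst = 10) {
  if (!buckets.has(ip)) buckets.set(ip, new TokenBucket(rate, burst));
  return buckets.get(ip).allow(nowMs);
}

// Twelve requests in the same millisecond: only the burst of 10 passes.
const results = Array.from({ length: 12 }, () => allowRequest("198.51.100.7", 1000));
console.log(results.filter(Boolean).length); // 10
```

With these settings a client can send 10 requests back to back, then is held to the refill rate until the bucket recovers.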
Custom Challenge Difficulty
challenge:
  difficulty: "custom"
  custom_difficulty_bits: 18  # Adjust PoW difficulty in bits
  timeout: 600
  # Difficulty scaling based on time of day
  schedules:
    - time: "08:00-18:00"
      difficulty: "easy"
    - time: "18:00-08:00"
      difficulty: "hard"
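Two details of this configuration are easy to get wrong: a window such as "18:00-08:00" wraps past midnight, and a difficulty expressed in bits grows the work exponentially. The following sketch illustrates both. The helper names `toMinutes`, `difficultyAt`, and `expectedHashes` are hypothetical, and treating the end of each window as exclusive is an assumption.

```javascript
// Convert "HH:MM" to minutes since midnight.
function toMinutes(hhmm) {
  const [h, m] = hhmm.split(":").map(Number);
  return h * 60 + m;
}

// Return the difficulty of the first schedule whose window contains `hhmm`.
// A window whose end is earlier than its start wraps past midnight.
function difficultyAt(schedules, hhmm, fallback) {
  const now = toMinutes(hhmm);
  for (const { time, difficulty } of schedules) {
    const [start, end] = time.split("-").map(toMinutes);
    const inWindow = start <= end
      ? now >= start && now < end
      : now >= start || now < end;
    if (inWindow) return difficulty;
  }
  return fallback;
}

// Average number of hashes needed to find `bits` leading zero bits.
const expectedHashes = (bits) => 2 ** bits;

const schedules = [
  { time: "08:00-18:00", difficulty: "easy" },
  { time: "18:00-08:00", difficulty: "hard" },
];
console.log(difficultyAt(schedules, "12:30", "medium")); // easy
console.log(difficultyAt(schedules, "23:15", "medium")); // hard
console.log(expectedHashes(18)); // 262144
```

Each extra bit doubles the expected work, so moving from 18 to 20 bits quadruples the average solve time on the same device.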
Whitelist & Blacklist
acl:
  whitelist:
    - "203.0.113.0/24"        # Trusted networks
    - "user-agent:GoogleBot"  # Legitimate crawlers
  blacklist:
    - "1.2.3.4"               # Known bad IPs
    - "user-agent:BadBot"     # Known malicious bots
  # Whitelist never challenges
  # Blacklist always blocked
  challenges_required_for_others: true
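The three outcomes described by these rules (allow, block, challenge) can be sketched with Node's built-in `net.BlockList` for the address matching. This is an illustration under stated assumptions rather than Anubis's own logic: the blacklist is checked first, user-agent entries match as case-insensitive substrings, only IPv4 is handled, and the names `compile` and `decide` are hypothetical.

```javascript
const net = require("node:net");

// Turn a list of entries into a matcher over (ip, userAgent).
// Entries are IPv4 addresses, CIDR ranges, or "user-agent:<substring>".
function compile(entries) {
  const ips = new net.BlockList();
  const agents = [];
  for (const entry of entries) {
    if (entry.startsWith("user-agent:")) {
      agents.push(entry.slice("user-agent:".length).toLowerCase());
      continue;
    }
    const [addr, prefix] = entry.split("/");
    if (prefix === undefined) ips.addAddress(addr, "ipv4");
    else ips.addSubnet(addr, Number(prefix), "ipv4");
  }
  return (ip, userAgent) =>
    ips.check(ip, "ipv4") ||
    agents.some((a) => userAgent.toLowerCase().includes(a));
}

// Blacklist wins over whitelist; everyone else gets a challenge.
function decide(acl, ip, userAgent) {
  if (compile(acl.blacklist)(ip, userAgent)) return "block";
  if (compile(acl.whitelist)(ip, userAgent)) return "allow";
  return "challenge";
}

const acl = {
  whitelist: ["203.0.113.0/24", "user-agent:GoogleBot"],
  blacklist: ["1.2.3.4", "user-agent:BadBot"],
};
console.log(decide(acl, "203.0.113.7", "Mozilla/5.0"));  // allow
console.log(decide(acl, "1.2.3.4", "Mozilla/5.0"));      // block
console.log(decide(acl, "198.51.100.1", "Mozilla/5.0")); // challenge
```

User-Agent strings are set by the client and are trivial to spoof, so a user-agent whitelist entry is best paired with an address check for the crawler's published ranges. In a long-running service the matchers would be compiled once rather than per request.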
HTTPS/TLS Configuration
server:
  listen: ":8443"
  use_tls: true
  tls:
    cert_file: "/etc/anubis/cert.pem"
    key_file: "/etc/anubis/key.pem"
    # Auto-renew with Let's Encrypt
    auto_renew: true
    acme_email: "admin@example.com"
Deployment Examples
Nginx Reverse Proxy + Anubis
upstream anubis {
    server localhost:8080;
}
server {
    listen 80;
    server_name example.com;
    location / {
        proxy_pass http://anubis;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
Kubernetes Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: anubis
spec:
  replicas: 3
  selector:
    matchLabels:
      app: anubis
  template:
    metadata:
      labels:
        app: anubis
    spec:
      containers:
        - name: anubis
          image: techarohq/anubis:latest
          ports:
            - containerPort: 8080
          env:
            - name: UPSTREAM_URL
              value: "http://backend-service:3000"
            - name: CHALLENGE_DIFFICULTY
              value: "medium"
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
Docker Swarm Deployment
# Create Anubis service
docker service create \
  --name anubis \
  --publish 8080:8080 \
  --replicas 3 \
  --env UPSTREAM_URL=http://backend:3000 \
  --env CHALLENGE_DIFFICULTY=medium \
  techarohq/anubis:latest
# Scale service
docker service scale anubis=5
Monitoring & Observability
Health Check Endpoint
# Check Anubis status
curl http://localhost:8080/health
# Response
{
  "status": "healthy",
  "upstream_healthy": true,
  "challenges_served": 1542,
  "challenges_solved": 1389,
  "uptime_seconds": 86400
}
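A response like the one above can be turned into a single solve-rate figure for dashboards or alerts. The sketch below works on the sample payload only; the field names come from that sample, and the helper `solveRate` is hypothetical.

```javascript
// Fraction of served challenges that were solved, or null if none were served.
function solveRate(health) {
  if (health.challenges_served === 0) return null;
  return health.challenges_solved / health.challenges_served;
}

// Sample payload copied from the health check response shown above.
const health = JSON.parse(`{
  "status": "healthy",
  "upstream_healthy": true,
  "challenges_served": 1542,
  "challenges_solved": 1389,
  "uptime_seconds": 86400
}`);

console.log(`${(solveRate(health) * 100).toFixed(1)}%`); // 90.1%
```

A sudden drop in this ratio suggests more traffic from clients that cannot run the challenge, or a difficulty setting that is too high for real visitors.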
Metrics Endpoint
# Prometheus-compatible metrics
curl http://localhost:8080/metrics
# Output includes:
# anubis_requests_total
# anubis_challenges_issued
# anubis_challenges_solved
# anubis_upstream_latency_ms
# anubis_bot_requests_blocked
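For quick checks without a full Prometheus setup, the text exposition format is simple enough to parse directly. The following sketch sums each metric across its label sets; the sample text and the `method` label are invented for illustration, and `parseMetrics` is a hypothetical helper.

```javascript
// Parse Prometheus text format into a Map of metric name -> summed value.
// Comment lines (# HELP, # TYPE) and blank lines are skipped.
function parseMetrics(text) {
  const totals = new Map();
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) continue;
    // name, optional {labels}, whitespace, value (an optional timestamp is ignored)
    const match = trimmed.match(/^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)/);
    if (!match) continue;
    totals.set(match[1], (totals.get(match[1]) ?? 0) + Number(match[3]));
  }
  return totals;
}

// Invented sample output using metric names from the list above.
const sampleMetrics = `
# TYPE anubis_challenges_issued counter
anubis_challenges_issued 1542
anubis_challenges_solved 1389
anubis_requests_total{method="GET"} 9000
anubis_requests_total{method="POST"} 120
`;

const metrics = parseMetrics(sampleMetrics);
console.log(metrics.get("anubis_requests_total")); // 9120
console.log(metrics.get("anubis_challenges_issued") - metrics.get("anubis_challenges_solved")); // 153
```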
Logging Configuration
logging:
  level: "info"
  format: "json"
  # Log to file
  file:
    enabled: true
    path: "/var/log/anubis/anubis.log"
    max_size_mb: 100
    max_backups: 10
  # Structured logging
  fields:
    request_id: true
    user_agent: true
    remote_ip: true
    response_time: true
    upstream_latency: true
Performance Tuning
Connection Pooling
upstream:
  max_idle_conns: 200  # Increased from default
  max_conns_per_host: 100
  idle_conn_timeout: 90s
Challenge Caching
challenge:
  cache_size: 50000       # More cache for frequent users
  cache_ttl: 3600s
  cache_backend: "redis"  # Optional: use Redis for distributed deployments
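The interaction between a size limit and a TTL can be illustrated with a small in-memory cache. This is a sketch of the general technique (least-recently-used eviction plus expiry), not of Anubis's cache; the class name `ChallengeCache` is hypothetical, and times are passed in as milliseconds to keep the example deterministic.

```javascript
// Bounded cache with per-entry expiry and least-recently-used eviction.
class ChallengeCache {
  constructor(maxEntries, ttlMs) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.entries = new Map(); // a Map iterates in insertion order
  }

  set(key, value, nowMs) {
    this.entries.delete(key); // re-inserting moves the key to the newest slot
    this.entries.set(key, { value, expiresAt: nowMs + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
  }

  get(key, nowMs) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= nowMs) {
      this.entries.delete(key); // expired: drop it
      return undefined;
    }
    // Refresh recency without extending the TTL.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }
}

const cache = new ChallengeCache(2, 1000);
cache.set("client-a", "solved", 0);
cache.set("client-b", "solved", 0);
cache.get("client-a", 10);           // client-a is now the most recent
cache.set("client-c", "solved", 20); // evicts client-b
console.log(cache.get("client-b", 30)); // undefined
console.log(cache.get("client-a", 30)); // solved
```

A larger `cache_size` reduces evictions of active clients, while `cache_ttl` bounds how long a solved challenge stays valid regardless of activity.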
Request Optimization
server:
  read_timeout: 20s
  write_timeout: 20s
  idle_timeout: 30s
  # Gzip compression
  gzip:
    enabled: true
    level: 6
    min_size: 1024
Bot Detection & Blocking
User-Agent Based Rules
bot_detection:
  block_headless_browsers: true
  block_curl_wget: true
  block_agents:
    - "python-requests"
    - "scrapy"
    - "selenium"
    - "beautifulsoup"
    - "mechanize"
  allow_agents:
    - "googlebot"
    - "bingbot"
    - "applebot"
Behavioral Analysis
bot_detection:
  # Detect non-human-like behavior
  require_js_execution: true
  detect_headless: true
  # Patterns that trigger challenges
  patterns:
    rapid_requests: "10/second"
    sequential_urls: true
    missing_referer: true
    suspicious_headers: true
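A threshold such as "10/second" can be checked with a sliding window over recent request timestamps. The sketch below is one possible interpretation, not Anubis's detection logic: it flags a client once it exceeds the limit within the window, and the names `parseRate` and `RapidRequestDetector` are hypothetical.

```javascript
// Parse "<count>/<unit>" into a limit and a window length in milliseconds.
function parseRate(spec) {
  const [count, unit] = spec.split("/");
  const windowMs = { second: 1000, minute: 60000 }[unit];
  const limit = Number(count);
  if (!windowMs || !Number.isInteger(limit) || limit <= 0) {
    throw new Error(`bad rate: ${spec}`);
  }
  return { limit, windowMs };
}

class RapidRequestDetector {
  constructor(spec) {
    const { limit, windowMs } = parseRate(spec);
    this.limit = limit;
    this.windowMs = windowMs;
    this.hits = new Map(); // ip -> timestamps inside the current window
  }

  // Record a request; returns true when the client is over the limit.
  record(ip, nowMs) {
    const recent = (this.hits.get(ip) ?? []).filter(
      (t) => nowMs - t < this.windowMs
    );
    recent.push(nowMs);
    this.hits.set(ip, recent);
    return recent.length > this.limit;
  }
}

const detector = new RapidRequestDetector("10/second");
const flags = [];
for (let t = 0; t <= 10; t++) flags.push(detector.record("198.51.100.7", t));
console.log(flags.indexOf(true)); // 10 (the 11th request is flagged)
```

Unlike a fixed per-second counter, the sliding window cannot be bypassed by splitting a burst across a second boundary, at the cost of storing one timestamp per recent request.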
IP Reputation Integration
threat_intelligence:
  enabled: true
  # External threat feeds
  sources:
    - "abuseipdb"
    - "maxmind"
    - "custom_internal_feed"
  # Action on known bad IPs
  actions:
    reputation_score_above: 50
    action: "block"
Integration with Applications
Passing Challenge Info to Backend
headers_to_upstream:
  x-anubis-challenge-solved: true
  x-anubis-solved-timestamp: "2024-01-15T10:30:00Z"
  x-anubis-client-ip: true
  x-anubis-difficulty-level: true
Custom Headers in JavaScript
// Browser JavaScript can access challenge info
const challengeInfo = {
  solved: true,
  difficulty: "medium",
  duration_ms: 1234
};
// Send with next request
fetch(url, {
  headers: {
    'X-Challenge-Duration': challengeInfo.duration_ms
  }
});
Troubleshooting
High CPU Usage
# Reduce challenge difficulty
challenge:
  difficulty: "easy"
# Or increase cache size
challenge:
  cache_size: 100000
# Check upstream performance
upstream:
  timeout: 60s  # Increase if the backend is slow
Challenge Failures
# Enable debug logging
LOG_LEVEL=debug anubis
# Check JavaScript delivery
curl -v http://localhost:8080/
# Verify challenge endpoint
curl http://localhost:8080/challenge/verify
Backend Timeout Issues
upstream:
  timeout: 60s
  keepalive_timeout: 120s
server:
  read_timeout: 30s
  write_timeout: 30s
Redis Cache Issues (if enabled)
# Check Redis connection
redis-cli PING
# Monitor cache
redis-cli MONITOR
# Clear cache if needed (FLUSHDB deletes every key in the selected database)
redis-cli FLUSHDB
Best Practices
Security
- Always use HTTPS for production deployments
- Whitelist legitimate crawlers if needed (e.g., Google, Bing)
- Monitor challenge metrics for anomalies
- Rotate TLS certificates regularly
- Keep the upstream address private - don’t expose it in logs or error pages
Performance
- Use Redis for distributed deployments
- Implement proper health checks in load balancers
- Cache aggressively - challenges are stateless
- Monitor upstream latency - Anubis is lightweight
- Scale horizontally - stateless design supports it
Operations
# Monitor challenge rate
curl http://localhost:8080/metrics | grep challenges
# Check error rates
curl http://localhost:8080/metrics | grep errors
# Validate config before deployment
anubis -config config.yaml -validate
FAQ
Q: Does Anubis block legitimate users?
A: Not normally. Modern browsers run the JavaScript challenge automatically, so visitors see at most a short delay. Clients that cannot execute JavaScript, such as CLI tools and simple scrapers, cannot pass it.
Q: What about accessibility?
A: Provide a fallback for users who can’t complete challenges, such as an optional contact form.
Q: Can it block specific content?
A: No. Anubis only protects with PoW challenges. Use WAF/firewall rules for content blocking.
Q: Performance impact?
A: Minimal for proxied requests (~1-2ms latency). Challenge computation happens client-side, so the server only pays for verification.
Resources
- GitHub: https://github.com/TecharoHQ/anubis
- Documentation: https://anubis.techaro.lol
- Issue Tracker: https://github.com/TecharoHQ/anubis/issues
- Discussions: https://github.com/TecharoHQ/anubis/discussions
Related Tools
- Cloudflare Challenges (proprietary)
- AWS WAF (managed service)
- Nginx ModSecurity (open-source WAF)
- DataDome (bot management)