Pentest Swarm AI

Pentest Swarm AI is a multi-agent framework written in Go that orchestrates autonomous penetration testing across targets using the Claude API. Each agent specializes in a phase — recon, scanning, exploitation, post-exploitation, or reporting — and the swarm controller coordinates parallel execution to reduce assessment time while maintaining audit trails.

Installation

From Source (Go)

# Requires Go 1.22+
git clone https://github.com/pentest-swarm/swarm-ai
cd swarm-ai
go mod download
go build -o swarm-ai ./cmd/swarm

# Verify build
./swarm-ai --version

Docker

# Pull official image
docker pull pentestswarm/swarm-ai:latest

# Run with mounted config
docker run --rm -it \
  -v $(pwd)/config:/app/config \
  -v $(pwd)/reports:/app/reports \
  -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \
  pentestswarm/swarm-ai:latest

# Build from source in Docker
docker build -t swarm-ai:local .

Docker Compose (full stack with reporting DB)

curl -O https://raw.githubusercontent.com/pentest-swarm/swarm-ai/main/docker-compose.yml
docker compose up -d

Configuration

Environment Variables

export ANTHROPIC_API_KEY="sk-ant-..."        # Required: Claude API key
export SWARM_CONCURRENCY=8                   # Max parallel agents (default: 4)
export SWARM_MODEL="claude-sonnet-4-6"       # Claude model for agent reasoning
export SWARM_REPORT_DIR="./reports"          # Report output directory
export SWARM_LOG_LEVEL="info"                # debug | info | warn | error
export SWARM_RATE_LIMIT=60                   # API calls per minute

Config File (`config/swarm.yaml`)

# Initialize default config
./swarm-ai config init

# Edit config
cat config/swarm.yaml

swarm:
  concurrency: 8
  timeout: 3600           # seconds per assessment
  model: claude-sonnet-4-6
  max_depth: 3            # exploitation chain depth

agents:
  recon:
    enabled: true
    tools: [nmap, amass, subfinder, httpx]
    passive_only: false
  scanner:
    enabled: true
    tools: [nuclei, nikto, sqlmap]
  exploitation:
    enabled: true
    require_approval: true   # human-in-the-loop before exploiting
  reporting:
    format: [html, json, pdf]
    cvss_threshold: 4.0

scope:
  allowed_networks: []     # populate before running
  excluded_hosts: []
  max_targets: 50

Agent Profile Setup

# List available agent profiles
./swarm-ai profiles list

# Create custom profile
./swarm-ai profiles create --name "web-app-focus" \
  --agents "recon,scanner,exploitation" \
  --tools "ffuf,sqlmap,nuclei,burp"

# Set default profile
./swarm-ai profiles set-default web-app-focus

Core Commands

Command	Description
`swarm-ai run --target <host>`	Run full assessment against a single target
`swarm-ai run --targets targets.txt`	Run against list of targets in parallel
`swarm-ai run --profile <name> --target <host>`	Use named agent profile
`swarm-ai agents list`	List all available agent types
`swarm-ai agents status`	Show running agent status
`swarm-ai agents kill <agent-id>`	Terminate a specific agent
`swarm-ai scope add <cidr>`	Add CIDR to in-scope list
`swarm-ai scope verify <host>`	Check if host is in scope
`swarm-ai report generate --session <id>`	Generate report from session
`swarm-ai report list`	List all past assessment sessions
`swarm-ai config validate`	Validate config file syntax
`swarm-ai config show`	Print active configuration
`swarm-ai update`	Update agent tool definitions from registry
`swarm-ai sessions list`	List all past and active sessions
`swarm-ai sessions resume <id>`	Resume an interrupted session

Advanced Usage

Parallel Multi-Target Assessment

# Assess 20 targets in parallel with 8 agents
./swarm-ai run \
  --targets targets.txt \
  --concurrency 8 \
  --profile external-pentest \
  --timeout 7200 \
  --output ./reports/$(date +%Y%m%d)

# Dry-run: show what agents would do without executing
./swarm-ai run --target 10.0.0.0/24 --dry-run

Agent Orchestration Flags

Flag	Description
`--agents recon,scanner`	Run only specified agent types
`--skip exploitation`	Skip exploitation phase
`--require-approval`	Pause before each exploitation attempt
`--passive`	Recon only, no active scanning
`--stealth`	Slow scan timing, evade IDS
`--concurrency <n>`	Override parallel agent count
`--depth <n>`	Max exploitation chain depth
`--timeout <sec>`	Per-target timeout in seconds
`--resume <session-id>`	Resume interrupted session
`--scope-file <path>`	Load scope rules from file

Custom Agent Prompts

# Override recon agent system prompt
./swarm-ai run --target example.com \
  --agent-prompt recon="Focus on subdomain enumeration and cloud asset discovery. Prioritize S3 buckets and exposed APIs."

# Chain custom instructions
./swarm-ai run --target example.com \
  --agent-prompt exploitation="Attempt SQL injection and XSS. Report CVSS >= 7.0 only."

Swarm Controller API

# Start controller with REST API enabled
./swarm-ai controller start --api --port 8080

# Query running assessment via API
curl http://localhost:8080/api/v1/sessions
curl http://localhost:8080/api/v1/agents/status
curl http://localhost:8080/api/v1/findings?session=abc123

# Inject new target mid-session
curl -X POST http://localhost:8080/api/v1/targets \
  -H "Content-Type: application/json" \
  -d '{"host": "192.168.1.50", "session": "abc123"}'

Report Formats

# Generate all report formats
./swarm-ai report generate --session abc123 --format html,json,pdf

# Filter by severity
./swarm-ai report generate --session abc123 --min-cvss 7.0

# Export findings as SARIF (for CI/CD)
./swarm-ai report generate --session abc123 --format sarif

# Merge multiple sessions into one report
./swarm-ai report merge --sessions abc123,def456 --output merged-report

Common Workflows

External Pentest Assessment

# 1. Define scope
./swarm-ai scope add 203.0.113.0/24
./swarm-ai scope add example.com
./swarm-ai scope verify 203.0.113.50   # confirm in scope

# 2. Run passive recon first
./swarm-ai run --target example.com \
  --agents recon \
  --passive \
  --output ./recon-phase

# 3. Review recon findings, then run active scan
./swarm-ai run --target example.com \
  --agents scanner \
  --input ./recon-phase/findings.json \
  --output ./scan-phase

# 4. Run exploitation with human approval
./swarm-ai run --target example.com \
  --agents exploitation \
  --require-approval \
  --input ./scan-phase/findings.json

# 5. Generate final report
./swarm-ai report generate \
  --sessions recon-phase,scan-phase \
  --format html,pdf \
  --output ./final-report

CI/CD Security Gate

# Run lightweight scan in CI pipeline
./swarm-ai run \
  --target $STAGING_URL \
  --profile ci-quick \
  --timeout 600 \
  --min-cvss 7.0 \
  --format sarif \
  --output ./security-results

# Fail pipeline if critical findings exist
./swarm-ai report check --session latest --fail-on critical
echo "Exit code: $?"

Scheduled Red Team Exercise

# Create cron-compatible assessment script
cat > weekly-redteam.sh << 'EOF'
#!/bin/bash
SESSION=$(./swarm-ai run \
  --targets /etc/swarm/targets.txt \
  --profile full-redteam \
  --concurrency 12 \
  --output /var/reports/$(date +%Y%m%d) \
  --json-session)

./swarm-ai report generate \
  --session $SESSION \
  --format html,pdf \
  --email security-team@company.com
EOF
chmod +x weekly-redteam.sh

Tips and Best Practices

Always define scope first using swarm-ai scope add before any assessment — the framework will reject out-of-scope targets
Enable require-approval for exploitation phases in production environments to maintain human-in-the-loop control
Use --passive for initial recon against sensitive targets to avoid triggering IDS/WAF rules before you understand the environment
Monitor API costs with --dry-run before large batch assessments; Claude API calls scale with target count and depth
Set max_depth: 2 in config for broad external assessments; increase to 3 only for targeted internal red team engagements
Use --stealth on IDS-protected networks — this throttles timing but reduces detection probability
Store sessions with ./swarm-ai sessions list and resume interrupted assessments rather than restarting from scratch
Integrate SARIF output into GitHub Advanced Security or SonarQube to track findings as code quality issues
Rotate API keys between engagements and never commit swarm.yaml with embedded credentials — use environment variables
Archive reports immediately after generation; session data is purged after 30 days by default