Pentest Swarm AI

Pentest Swarm AI is a multi-agent framework written in Go that orchestrates autonomous penetration testing across targets using the Claude API. Each agent specializes in a phase — recon, scanning, exploitation, post-exploitation, or reporting — and the swarm controller coordinates parallel execution to reduce assessment time while maintaining audit trails.

Installation

From Source (Go)

# Requires Go 1.22+
git clone https://github.com/pentest-swarm/swarm-ai
cd swarm-ai
go mod download
go build -o swarm-ai ./cmd/swarm

# Verify build
./swarm-ai --version

Docker

# Pull official image
docker pull pentestswarm/swarm-ai:latest

# Run with mounted config
docker run --rm -it \
  -v "$(pwd)/config:/app/config" \
  -v "$(pwd)/reports:/app/reports" \
  -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
  pentestswarm/swarm-ai:latest

# Build from source in Docker
docker build -t swarm-ai:local .

Docker Compose (full stack with reporting DB)

curl -O https://raw.githubusercontent.com/pentest-swarm/swarm-ai/main/docker-compose.yml
docker compose up -d

Configuration

Environment Variables

export ANTHROPIC_API_KEY="sk-ant-..."        # Required: Claude API key
export SWARM_CONCURRENCY=8                   # Max parallel agents (default: 4)
export SWARM_MODEL="claude-sonnet-4-6"       # Claude model for agent reasoning
export SWARM_REPORT_DIR="./reports"          # Report output directory
export SWARM_LOG_LEVEL="info"                # debug | info | warn | error
export SWARM_RATE_LIMIT=60                   # API calls per minute
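
A quick preflight check can catch a missing key or a mistyped value before any agent starts. The sketch below is not part of swarm-ai. The variable names come from the list above, while the validation rules (positive integer concurrency, `info` as the fallback log level) are assumptions made for illustration.

```shell
# Preflight check for the environment variables listed above.
# Hypothetical helper: swarm-ai itself may validate differently.
preflight_env() {
  local ok=0 conc

  # The API key is the only variable documented as required.
  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is not set" >&2
    ok=1
  fi

  # Concurrency falls back to the documented default of 4.
  conc="${SWARM_CONCURRENCY:-4}"
  case "$conc" in
    ''|*[!0-9]*|0*)
      echo "SWARM_CONCURRENCY must be a positive integer, got '$conc'" >&2
      ok=1
      ;;
  esac

  # Log level must be one of the four documented values
  # (treating "info" as the default is an assumption).
  case "${SWARM_LOG_LEVEL:-info}" in
    debug|info|warn|error) ;;
    *)
      echo "SWARM_LOG_LEVEL must be debug, info, warn, or error" >&2
      ok=1
      ;;
  esac

  return "$ok"
}
```

Call `preflight_env || exit 1` at the top of a wrapper script so a bad environment stops the run early.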

Config File (config/swarm.yaml)

# Initialize default config
./swarm-ai config init

# Review the generated config (edit the file to change these values)
cat config/swarm.yaml
swarm:
  concurrency: 8
  timeout: 3600           # seconds per assessment
  model: claude-sonnet-4-6
  max_depth: 3            # exploitation chain depth

agents:
  recon:
    enabled: true
    tools: [nmap, amass, subfinder, httpx]
    passive_only: false
  scanner:
    enabled: true
    tools: [nuclei, nikto, sqlmap]
  exploitation:
    enabled: true
    require_approval: true   # human-in-the-loop before exploiting
  reporting:
    format: [html, json, pdf]
    cvss_threshold: 4.0

scope:
  allowed_networks: []     # populate before running
  excluded_hosts: []
  max_targets: 50
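
The default config ships with an empty `allowed_networks` list, so it is easy to start a run with no scope defined. A minimal sketch of a guard follows. It assumes the inline `[]` form shown above; a block-style YAML list would need a real YAML parser rather than `grep`. The function name is hypothetical.

```shell
# Return 0 when allowed_networks is still the empty inline list.
# Hypothetical helper; only understands the `allowed_networks: []` form.
scope_is_empty() {
  grep -Eq '^[[:space:]]*allowed_networks:[[:space:]]*\[[[:space:]]*\]' "$1"
}

# Example guard (commented out so the block has no side effects):
# if scope_is_empty config/swarm.yaml; then
#   echo "populate scope.allowed_networks before running" >&2
#   exit 1
# fi
```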

Agent Profile Setup

# List available agent profiles
./swarm-ai profiles list

# Create custom profile
./swarm-ai profiles create --name "web-app-focus" \
  --agents "recon,scanner,exploitation" \
  --tools "ffuf,sqlmap,nuclei,burp"

# Set default profile
./swarm-ai profiles set-default web-app-focus

Core Commands

Command                                        Description
swarm-ai run --target <host>                   Run full assessment against a single target
swarm-ai run --targets targets.txt             Run against list of targets in parallel
swarm-ai run --profile <name> --target <host>  Use named agent profile
swarm-ai agents list                           List all available agent types
swarm-ai agents status                         Show running agent status
swarm-ai agents kill <agent-id>                Terminate a specific agent
swarm-ai scope add <cidr>                      Add CIDR to in-scope list
swarm-ai scope verify <host>                   Check if host is in scope
swarm-ai report generate --session <id>        Generate report from session
swarm-ai report list                           List all past assessment sessions
swarm-ai config validate                       Validate config file syntax
swarm-ai config show                           Print active configuration
swarm-ai update                                Update agent tool definitions from registry
swarm-ai sessions list                         List all past and active sessions
swarm-ai sessions resume <id>                  Resume an interrupted session
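
To make the scope commands concrete, here is a rough sketch of the check that `scope verify` conceptually performs for an IPv4 address against one CIDR block. This is an illustration only, not swarm-ai's implementation: it handles IPv4 only, expects well-formed input without leading zeros, and relies on 64-bit shell arithmetic. Both function names are hypothetical.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( ($a << 24) | ($b << 16) | ($c << 8) | $d ))
}

# Return 0 if IPv4 address $1 falls inside CIDR block $2 (e.g. 203.0.113.0/24).
in_cidr() {
  local ip net bits mask
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")   # network part before the slash
  bits=${2#*/}                 # prefix length after the slash
  # Keep the top $bits bits of a 32-bit address.
  mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  [ $(( $ip & $mask )) -eq $(( $net & $mask )) ]
}
```

For example, `in_cidr 203.0.113.50 203.0.113.0/24` succeeds, while `in_cidr 203.0.114.1 203.0.113.0/24` fails.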

Advanced Usage

Parallel Multi-Target Assessment

# Assess 20 targets in parallel with 8 agents
./swarm-ai run \
  --targets targets.txt \
  --concurrency 8 \
  --profile external-pentest \
  --timeout 7200 \
  --output ./reports/$(date +%Y%m%d)

# Dry-run: show what agents would do without executing
./swarm-ai run --target 10.0.0.0/24 --dry-run
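
Before a batch run it can help to normalize the target list. The sketch below strips comments and blank lines, removes duplicates, and refuses lists longer than a limit (50 by default, mirroring `max_targets` in the sample config). The one-target-per-line format with `#` comments is an assumption about `targets.txt`, and `clean_targets` is a hypothetical helper.

```shell
# Print unique, non-comment targets from file $1.
# Fails if more than $2 entries remain (default 50).
clean_targets() {
  local file="$1" max="${2:-50}" list count
  # Drop comments, trim surrounding whitespace, remove blanks, dedupe.
  list=$(sed -e 's/#.*$//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' "$file" \
    | grep -v '^$' | LC_ALL=C sort -u)
  count=$(printf '%s\n' "$list" | grep -c . || true)
  if [ "$count" -gt "$max" ]; then
    echo "too many targets: $count > $max" >&2
    return 1
  fi
  if [ -n "$list" ]; then
    printf '%s\n' "$list"
  fi
}
```

Typical use would be `clean_targets targets.txt > targets.clean.txt`, then passing the cleaned file to `--targets`.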

Agent Orchestration Flags

Flag                    Description
--agents recon,scanner  Run only specified agent types
--skip exploitation     Skip exploitation phase
--require-approval      Pause before each exploitation attempt
--passive               Recon only, no active scanning
--stealth               Slow scan timing, evade IDS
--concurrency <n>       Override parallel agent count
--depth <n>             Max exploitation chain depth
--timeout <sec>         Per-target timeout in seconds
--resume <session-id>   Resume interrupted session
--scope-file <path>     Load scope rules from file

Custom Agent Prompts

# Override recon agent system prompt
./swarm-ai run --target example.com \
  --agent-prompt recon="Focus on subdomain enumeration and cloud asset discovery. Prioritize S3 buckets and exposed APIs."

# Chain custom instructions
./swarm-ai run --target example.com \
  --agent-prompt exploitation="Attempt SQL injection and XSS. Report CVSS >= 7.0 only."

Swarm Controller API

# Start controller with REST API enabled
./swarm-ai controller start --api --port 8080

# Query running assessment via API
curl http://localhost:8080/api/v1/sessions
curl http://localhost:8080/api/v1/agents/status
curl "http://localhost:8080/api/v1/findings?session=abc123"   # quoted so the shell does not glob the "?"

# Inject new target mid-session
curl -X POST http://localhost:8080/api/v1/targets \
  -H "Content-Type: application/json" \
  -d '{"host": "192.168.1.50", "session": "abc123"}'

Report Formats

# Generate all report formats
./swarm-ai report generate --session abc123 --format html,json,pdf

# Filter by severity
./swarm-ai report generate --session abc123 --min-cvss 7.0

# Export findings as SARIF (for CI/CD)
./swarm-ai report generate --session abc123 --format sarif

# Merge multiple sessions into one report
./swarm-ai report merge --sessions abc123,def456 --output merged-report
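
Thresholds such as `--min-cvss 7.0` and `cvss_threshold: 4.0` line up with the qualitative bands of the CVSS v3.x rating scale (low 0.1 to 3.9, medium 4.0 to 6.9, high 7.0 to 8.9, critical 9.0 to 10.0). The helpers below sketch that mapping. Whether swarm-ai uses exactly these bands internally is an assumption, and both function names are hypothetical.

```shell
# Map a CVSS v3.x base score to its qualitative severity rating.
cvss_severity() {
  awk -v s="$1" 'BEGIN {
    if      (s >= 9.0) print "critical"
    else if (s >= 7.0) print "high"
    else if (s >= 4.0) print "medium"
    else if (s >  0.0) print "low"
    else               print "none"
  }'
}

# Return 0 if score $1 is at or above threshold $2.
meets_threshold() {
  awk -v s="$1" -v m="$2" 'BEGIN { exit !(s >= m) }'
}
```

For example, `cvss_severity 7.0` prints `high`, and `meets_threshold 6.9 7.0` fails, so a 6.9 finding would fall below a 7.0 cutoff.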

Common Workflows

External Pentest Assessment

# 1. Define scope
./swarm-ai scope add 203.0.113.0/24
./swarm-ai scope add example.com
./swarm-ai scope verify 203.0.113.50   # confirm in scope

# 2. Run passive recon first
./swarm-ai run --target example.com \
  --agents recon \
  --passive \
  --output ./recon-phase

# 3. Review recon findings, then run active scan
./swarm-ai run --target example.com \
  --agents scanner \
  --input ./recon-phase/findings.json \
  --output ./scan-phase

# 4. Run exploitation with human approval
./swarm-ai run --target example.com \
  --agents exploitation \
  --require-approval \
  --input ./scan-phase/findings.json

# 5. Generate final report
./swarm-ai report generate \
  --sessions recon-phase,scan-phase \
  --format html,pdf \
  --output ./final-report

CI/CD Security Gate

# Run lightweight scan in CI pipeline
./swarm-ai run \
  --target "$STAGING_URL" \
  --profile ci-quick \
  --timeout 600 \
  --min-cvss 7.0 \
  --format sarif \
  --output ./security-results

# Fail pipeline if critical findings exist
./swarm-ai report check --session latest --fail-on critical
echo "Exit code: $?"
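
Many CI runners execute scripts with `set -e`, in which case a failing check aborts the job before a following `echo` can print the exit code. A small wrapper avoids that by running the command as an `if` condition, logging the result, and passing the status through. This sketch assumes the check command exits non-zero when the gate should fail, as the comment above implies. `run_gate` is a hypothetical helper.

```shell
# Run a command, report its exit status, and return that same status.
# Safe under `set -e` because the command runs as an `if` condition.
run_gate() {
  local rc
  if "$@"; then
    echo "security gate passed"
    return 0
  else
    rc=$?
    echo "security gate failed (exit code $rc)" >&2
    return "$rc"
  fi
}
```

In a pipeline this would wrap the check, for example `run_gate ./swarm-ai report check --session latest --fail-on critical`.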

Scheduled Red Team Exercise

# Create cron-compatible assessment script
cat > weekly-redteam.sh << 'EOF'
#!/bin/bash
set -euo pipefail   # stop if the run fails instead of reporting on an empty session

SESSION=$(./swarm-ai run \
  --targets /etc/swarm/targets.txt \
  --profile full-redteam \
  --concurrency 12 \
  --output /var/reports/$(date +%Y%m%d) \
  --json-session)

./swarm-ai report generate \
  --session "$SESSION" \
  --format html,pdf \
  --email security-team@company.com
EOF
chmod +x weekly-redteam.sh

Tips and Best Practices

  • Always define scope first using swarm-ai scope add before any assessment — the framework will reject out-of-scope targets
  • Enable require_approval in the config (or pass --require-approval) for exploitation phases in production environments to maintain human-in-the-loop control
  • Use --passive for initial recon against sensitive targets to avoid triggering IDS/WAF rules before you understand the environment
  • Estimate API usage with --dry-run before large batch assessments; Claude API calls scale with target count and exploitation depth
  • Set max_depth: 2 in config for broad external assessments; increase to 3 only for targeted internal red team engagements
  • Use --stealth on IDS-protected networks — this throttles timing but reduces detection probability
  • Find interrupted sessions with ./swarm-ai sessions list and continue them with ./swarm-ai sessions resume <id> rather than restarting from scratch
  • Integrate SARIF output into GitHub Advanced Security or SonarQube to track findings as code quality issues
  • Rotate API keys between engagements and never commit swarm.yaml with embedded credentials — use environment variables
  • Archive reports immediately after generation; session data is purged after 30 days by default
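
Since session data is purged after 30 days by default, archiving can be scripted. A minimal sketch follows, assuming reports live in a plain directory such as `./reports`. The function name and the archive naming scheme are hypothetical.

```shell
# Pack report directory $1 into a dated tarball under directory $2
# and print the path of the archive that was written.
archive_reports() {
  local src="$1" dest="$2" stamp archive
  stamp=$(date +%Y%m%d)
  archive="$dest/reports-$stamp.tar.gz"
  mkdir -p "$dest"
  # -C makes paths inside the archive relative to the parent of $src.
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
  echo "$archive"
}
```

For example, `archive_reports ./reports /var/backups/swarm` writes `/var/backups/swarm/reports-<date>.tar.gz` and prints that path.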