Garak is an open-source red-teaming framework that probes large language models (LLMs) and other AI systems for security and robustness weaknesses through adversarial testing.
Installation & Setup

| Command | Description |
|---|---|
| `pip install garak` | Install Garak via pip |
| `git clone https://github.com/leondz/garak.git` | Clone from GitHub |
| `cd garak && pip install -e .` | Install in development mode |
| `garak --help` | Display help and available options |
| `garak --list_probes` | List all available probes |
| `garak --list_detectors` | List all available detectors |
| `garak --list_generators` | List all available generators |

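
After installing, it can help to confirm that both the `garak` executable and the Python package are visible from the current environment. The sketch below uses only the standard library; the helper name `garak_available` is hypothetical and not part of Garak.

```python
import importlib.util
import shutil


def garak_available() -> dict:
    """Report whether garak is reachable as a CLI and as a Python module."""
    return {
        # True when a `garak` executable is found on PATH.
        "cli_on_path": shutil.which("garak") is not None,
        # True when `import garak` would succeed in this interpreter.
        "module_importable": importlib.util.find_spec("garak") is not None,
    }


if __name__ == "__main__":
    print(garak_available())
```

If `cli_on_path` is false but `module_importable` is true, running `python -m garak --help` from the same interpreter is usually the quickest workaround.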
Basic Usage

| Command | Description |
|---|---|
| `garak --model_type openai --model_name gpt-3.5-turbo` | Test OpenAI GPT-3.5-turbo |
| `garak --model_type huggingface --model_name microsoft/DialoGPT-medium` | Test HuggingFace model |
| `garak --model_type replicate --model_name replicate/llama-2-70b-chat` | Test Replicate model |
| `garak --probes encoding` | Run encoding vulnerability probes |
| `garak --probes malwaregen` | Run malware generation probes |
| `garak --probes promptinject` | Run prompt injection probes |

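
When runs are launched from scripts, building the argument list in code avoids quoting mistakes. This is a minimal sketch: `build_garak_command` is a hypothetical helper, and the underscored flag spellings (`--model_type`, `--model_name`) are an assumption about the installed Garak version that should be checked against `garak --help`.

```python
import shlex
from typing import Sequence


def build_garak_command(
    model_type: str, model_name: str, probes: Sequence[str] = ()
) -> list[str]:
    """Return an argv list for a basic garak run (flag names are assumptions)."""
    argv = ["garak", "--model_type", model_type, "--model_name", model_name]
    if probes:
        # Multiple probes are passed as one comma-separated value.
        argv += ["--probes", ",".join(probes)]
    return argv


cmd = build_garak_command("openai", "gpt-3.5-turbo", ["encoding", "promptinject"])
# shlex.join renders the list as a copy-pasteable shell command.
print(shlex.join(cmd))
```

The returned list can be handed directly to `subprocess.run(cmd, check=True)` without going through a shell.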
Probe Categories
Security Probes

| Command | Description |
|---|---|
| `garak --probes encoding.InjectBase64` | Test base64 encoding injection |
| `garak --probes encoding.InjectHex` | Test hexadecimal encoding injection |
| `garak --probes encoding.InjectMorse` | Test Morse code encoding injection |
| `garak --probes encoding.InjectROT13` | Test ROT13 encoding injection |
| `garak --probes malwaregen.Evasion` | Test malware generation evasion |
| `garak --probes promptinject.AttackPrompt` | Test prompt injection attacks |

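
The encoding probes rest on a simple idea: an instruction is hidden behind an encoding and the model is asked to decode and follow it. The sketch below only illustrates the transformations involved; the function name and payload are hypothetical, and Garak's own probes use their own payloads and prompt templates.

```python
import base64
import codecs


def encode_payload(payload: str) -> dict[str, str]:
    """Encode one payload the way base64, hex, and ROT13 injections would."""
    raw = payload.encode("utf-8")
    return {
        "base64": base64.b64encode(raw).decode("ascii"),
        "hex": raw.hex(),
        "rot13": codecs.encode(payload, "rot_13"),
    }


for name, encoded in encode_payload("hello").items():
    print(f"{name}: {encoded}")
```

A probe of this kind counts as a hit when the model's reply contains the decoded payload, which shows that the encoding slipped past any input filtering.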
Bias and Toxicity Probes

| Command | Description |
|---|---|
| `garak --probes bias.BiasProbe` | Test for bias in model responses |
| `garak --probes toxicity.ToxicityProbe` | Test for toxic content generation |
| `garak --probes hate.HateSpeechProbe` | Test for hate speech generation |
| `garak --probes discrimination.DiscriminationProbe` | Test for discriminatory content |

Data Leakage Probes

| Command | Description |
|---|---|
| `garak --probes leakage.PIILeakage` | Test for PII data leakage |
| `garak --probes leakage.TrainingDataLeakage` | Test for training data exposure |
| `garak --probes leakage.SystemPromptLeakage` | Test for system prompt exposure |

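
A leakage check ultimately comes down to scanning model output for strings that look like personal data. The following is a rough, standalone sketch of that idea, not Garak's implementation: the regular expressions and the `find_pii` helper are illustrative assumptions and will miss many real-world formats.

```python
import re

# Deliberately simple patterns; production PII detection needs far more care.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_pii(text: str) -> dict[str, list[str]]:
    """Return candidate email addresses and US SSN-shaped strings in text."""
    return {
        "email": EMAIL_RE.findall(text),
        "ssn": US_SSN_RE.findall(text),
    }


print(find_pii("Contact alice@example.com, SSN 123-45-6789"))
```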
Advanced Configuration

| Command | Description |
|---|---|
| `garak --config config.yaml` | Use custom configuration file |
| `garak --output-dir results/` | Specify output directory |
| `garak --report_prefix test_run_` | Set report file prefix |
| `garak --parallel_requests 5` | Set number of parallel requests |
| `garak --temperature 0.7` | Set model temperature |
| `garak --max-tokens 150` | Set maximum tokens per response |

Custom Probes

| Command | Description |
|---|---|
| `garak --probes myprobe.CustomProbe` | Run custom probe |
| `garak --probe_options '{"param": "value"}'` | Pass parameters to probe |
| `garak --probe_tags security,injection` | Filter probes by tags |

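
Probe parameters are passed as a JSON string, which is easy to get wrong when typed by hand inside shell quotes. A small sketch of generating that argument from a Python dict follows; `probe_options_args` is a hypothetical helper, and the `--probe_options` spelling is an assumption to verify against `garak --help`.

```python
import json
import shlex


def probe_options_args(options: dict) -> list[str]:
    """Serialize probe options to the JSON string expected on the command line."""
    # Compact separators keep the argument short; any valid JSON works.
    return ["--probe_options", json.dumps(options, separators=(",", ":"))]


args = probe_options_args({"param": "value"})
# shlex.join adds the shell quoting that the JSON braces and quotes require.
print(shlex.join(["garak", "--probes", "myprobe.CustomProbe", *args]))
```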
Detectors and Evaluation

| Command | Description |
|---|---|
| `garak --detectors always.Pass` | Use always-pass detector |
| `garak --detectors mitigation.MitigationBypass` | Use mitigation bypass detector |
| `garak --detectors specialwords.SlursReclaimedSlurs` | Detect slurs and reclaimed slurs |
| `garak --detectors toxicity.ToxicityClassifier` | Use toxicity classifier |

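
Conceptually, a detector maps each model output to a score, and a mitigation-bypass check flags outputs that contain no refusal language. The sketch below is a standalone illustration of that idea, not Garak's code: the phrase list is hypothetical and far shorter than a real one, and the convention that 1.0 means "hit" is an assumption.

```python
# Hypothetical, abbreviated list of refusal markers (lowercase).
REFUSAL_MARKERS = ("i cannot", "i can't", "i'm sorry", "as an ai")


def mitigation_bypass_score(output: str) -> float:
    """Return 1.0 when no refusal marker is present (possible bypass), else 0.0."""
    lowered = output.lower()
    refused = any(marker in lowered for marker in REFUSAL_MARKERS)
    return 0.0 if refused else 1.0


print(mitigation_bypass_score("I'm sorry, but I can't help with that."))
print(mitigation_bypass_score("Sure, here is the script you asked for."))
```

String matching of this kind is cheap but produces false positives on harmless replies, which is one reason to pair it with a probe-specific detector.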
Output and Reporting

| Command | Description |
|---|---|
| `garak --report-format json` | Generate JSON report |
| `garak --report-format html` | Generate HTML report |
| `garak --report-format csv` | Generate CSV report |
| `garak --verbose` | Enable verbose output |
| `garak --log-level DEBUG` | Set debug logging level |

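
A JSON report is most useful once it is reduced to a pass rate per probe and detector. The sketch below assumes a JSON Lines report in which evaluation records carry `entry_type: "eval"` plus `probe`, `detector`, `passed`, and `total` fields; that layout is an assumption about the report schema and should be checked against a report from the installed version.

```python
import json
from typing import Iterable


def summarize_evals(lines: Iterable[str]) -> dict[tuple[str, str], float]:
    """Map (probe, detector) to pass rate, using only 'eval' records."""
    rates: dict[tuple[str, str], float] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        entry = json.loads(line)
        # Skip non-eval records and records with no attempts.
        if entry.get("entry_type") != "eval" or not entry.get("total"):
            continue
        key = (entry["probe"], entry["detector"])
        rates[key] = entry["passed"] / entry["total"]
    return rates


sample = [
    '{"entry_type": "eval", "probe": "encoding.InjectBase64", '
    '"detector": "encoding.DecodeMatch", "passed": 8, "total": 10}',
    '{"entry_type": "attempt", "prompt": "example"}',
]
print(summarize_evals(sample))
```

Because the function accepts any iterable of lines, an open file handle can be passed directly: `summarize_evals(open(path, encoding="utf-8"))`.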
Model Integration
OpenAI Models

| Command | Description |
|---|---|
| `garak --model_type openai --model_name gpt-4` | Test GPT-4 |
| `garak --model_type openai --model_name gpt-3.5-turbo-16k` | Test GPT-3.5-turbo with 16k context |
| `export OPENAI_API_KEY=your_key` | Set OpenAI API key |

HuggingFace Models

| Command | Description |
|---|---|
| `garak --model_type huggingface --model_name facebook/opt-1.3b` | Test OPT model |
| `garak --model_type huggingface --model_name EleutherAI/gpt-j-6B` | Test GPT-J model |
| `export HF_TOKEN=your_token` | Set HuggingFace token |

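
A missing API key typically surfaces only after a run has started, so a pre-flight check can save time. This sketch checks the two environment variables named above; the helper `missing_env_vars` is hypothetical.

```python
import os
from typing import Mapping, Optional, Sequence


def missing_env_vars(
    names: Sequence[str], environ: Optional[Mapping[str, str]] = None
) -> list[str]:
    """Return the variables from names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars(["OPENAI_API_KEY", "HF_TOKEN"])
    if missing:
        print("Missing credentials:", ", ".join(missing))
```

Only the variable names are reported, never their values, so the output is safe to include in CI logs.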
Local Models

| Command | Description |
|---|---|
| `garak --model_type ggml --model_name path/to/model.bin` | Test GGML model |
| `garak --model_type llamacpp --model_name path/to/model.gguf` | Test llama.cpp model |

Batch Testing

| Command | Description |
|---|---|
| `garak --model-list models.txt` | Test multiple models from file |
| `garak --probe-list probes.txt` | Run multiple probes from file |
| `garak --generations 10` | Set number of generations per probe |
| `garak --seed 42` | Set random seed for reproducibility |

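
Batch testing can also be driven from a script that expands a model list and a probe list into one command per pair. The sketch below is one way to do that under stated assumptions: `batch_commands` is a hypothetical helper, and the `--model_type`/`--model_name` spellings should be confirmed with `garak --help`.

```python
from itertools import product
from typing import Sequence


def batch_commands(
    models: Sequence[tuple[str, str]],
    probes: Sequence[str],
    generations: int = 10,
    seed: int = 42,
) -> list[list[str]]:
    """Build one garak argv per (model, probe) pair with a fixed seed."""
    commands = []
    for (model_type, model_name), probe in product(models, probes):
        commands.append([
            "garak",
            "--model_type", model_type,
            "--model_name", model_name,
            "--probes", probe,
            "--generations", str(generations),
            "--seed", str(seed),
        ])
    return commands


models = [("openai", "gpt-4"), ("huggingface", "facebook/opt-1.3b")]
for argv in batch_commands(models, ["encoding", "malwaregen"]):
    print(" ".join(argv))
```

Keeping the seed and generation count identical across the matrix makes results comparable between models.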
Security Testing Workflows
Comprehensive Security Scan

| Command | Description |
|---|---|
| `garak --model_type openai --model_name gpt-4 --probes encoding,malwaregen,promptinject --generations 20` | Full security probe suite |

Bias and Fairness Testing

| Command | Description |
|---|---|
| `garak --model_type huggingface --model_name microsoft/DialoGPT-medium --probes bias,toxicity,hate --detectors specialwords` | Bias testing suite |

Data Privacy Testing

| Command | Description |
|---|---|
| `garak --model_type openai --model_name gpt-3.5-turbo --probes leakage --detectors pii` | Privacy testing suite |

Configuration Files
Basic Config (config.yaml)

    model:
      type: openai
      name: gpt-3.5-turbo
      temperature: 0.7
      max_tokens: 150
    probes:
      - encoding
      - promptinject
      - malwaregen
    detectors:
      - always.Pass
      - mitigation.MitigationBypass
    output:
      directory: results/
      format: json
      prefix: garak_test_

Advanced Config

    parallel_requests: 5
    generations: 10
    seed: 42
    log_level: INFO
    model:
      type: huggingface
      name: microsoft/DialoGPT-medium
      device: cuda
      batch_size: 4
    probe_options:
      encoding.InjectBase64:
        payload_count: 50
      promptinject.AttackPrompt:
        attack_types: ["jailbreak", "roleplay"]

Troubleshooting

| Command | Description |
|---|---|
| `garak --check-models` | Verify model connectivity |
| `garak --dry-run` | Test configuration without running probes |
| `garak --debug` | Enable debug mode |
| `garak --list-model-types` | Show supported model types |
| `pip install garak[dev]` | Install with development dependencies |

Integration with CI/CD

| Command | Description |
|---|---|
| `garak --model_type openai --model_name gpt-3.5-turbo --probes security --exit-on-fail` | Fail CI on security issues |
| `garak --config ci_config.yaml --report-format json > results.json` | Generate CI-friendly output |

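
A pipeline can also gate on results itself rather than relying on a CLI flag. The sketch below turns a mapping of pass rates into a process exit code; `ci_exit_code` and the 95% default threshold are hypothetical choices, and the pass rates are assumed to have been extracted from a report beforehand.

```python
def ci_exit_code(pass_rates: dict[str, float], min_pass_rate: float = 0.95) -> int:
    """Return 1 if any probe falls below min_pass_rate, else 0."""
    failing = {name: rate for name, rate in pass_rates.items() if rate < min_pass_rate}
    for name, rate in sorted(failing.items()):
        print(f"FAIL {name}: pass rate {rate:.1%} is below {min_pass_rate:.1%}")
    return 1 if failing else 0


if __name__ == "__main__":
    import sys

    # Example values only; real rates would come from a parsed report.
    sys.exit(ci_exit_code({"encoding": 1.0, "promptinject": 0.9}))
```

Keeping the threshold in code makes it reviewable, so loosening the gate requires an explicit change.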
Best Practices
- Always test models before production deployment
- Use multiple probe categories for comprehensive testing
- Set appropriate generation counts for statistical significance
- Configure proper API rate limits to avoid throttling
- Store sensitive API keys as environment variables
- Review and analyze generated reports thoroughly
- Implement continuous testing in development pipelines
- Document and track security testing results over time
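
To make the point about generation counts concrete, a confidence interval shows how much a measured pass rate can be trusted at a given sample size. The sketch below computes a Wilson score interval; using this particular interval is a choice made here for illustration, not something Garak prescribes.

```python
import math


def wilson_interval(passed: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a pass rate (z = 1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = passed / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


for n in (10, 100):
    low, high = wilson_interval(n, n)
    print(f"{n}/{n} passed: true pass rate plausibly between {low:.3f} and {high:.3f}")
```

Even a perfect score over 10 generations leaves a wide interval, which is why higher generation counts are worth the extra API cost for probes that matter.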