Garak

Garak is an open-source vulnerability scanner for large language models (LLMs). It sends adversarial prompts (probes) to a model and scores the responses with detectors, flagging failures such as prompt injection, encoded-payload injection, data leakage, and toxic output.

Installation & Setup

Command	Description
`pip install garak`	Install Garak via pip
`git clone https://github.com/leondz/garak.git`	Clone from GitHub
`cd garak && pip install -e .`	Install in development mode
`garak --help`	Display help and available options
`garak --list_probes`	List all available probes
`garak --list_detectors`	List all available detectors
`garak --list_generators`	List all available generators

Basic Usage

Command	Description
`garak --model_type openai --model_name gpt-3.5-turbo`	Test OpenAI GPT-3.5-turbo
`garak --model_type huggingface --model_name microsoft/DialoGPT-medium`	Test HuggingFace model
`garak --model_type replicate --model_name replicate/llama-2-70b-chat`	Test Replicate model
`garak --probes encoding`	Run encoding vulnerability probes
`garak --probes malwaregen`	Run malware generation probes
`garak --probes promptinject`	Run prompt injection probes
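
When these commands are launched from a script, it can help to assemble the argument list in one place. The following is a minimal sketch, not part of garak: `build_garak_command` is a hypothetical helper, and the flag spellings (`--model_type`, `--model_name`, `--probes`, `--generations`) are assumed to match what `garak --help` prints on your installed version.

```python
import shlex


def build_garak_command(model_type, model_name, probes=None, generations=None):
    """Return an argv list for a garak run (hypothetical helper, not a garak API)."""
    argv = ["garak", "--model_type", model_type, "--model_name", model_name]
    if probes:
        # Several probe modules are passed as one comma-separated value.
        argv += ["--probes", ",".join(probes)]
    if generations is not None:
        argv += ["--generations", str(generations)]
    return argv


cmd = build_garak_command("openai", "gpt-3.5-turbo", probes=["encoding", "promptinject"])
# Print a shell-safe version of the command; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Keeping the command as a list rather than a string avoids shell-quoting mistakes when a model name or option contains spaces.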

Probe Categories

Security Probes

Command	Description
`garak --probes encoding.InjectBase64`	Test base64 encoding injection
`garak --probes encoding.InjectHex`	Test hexadecimal encoding injection
`garak --probes encoding.InjectMorse`	Test Morse code encoding injection
`garak --probes encoding.InjectROT13`	Test ROT13 encoding injection
`garak --probes malwaregen.Evasion`	Test malware generation evasion
`garak --probes promptinject.AttackPrompt`	Test prompt injection attacks
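
To see what an encoding probe is checking for, it helps to look at the same payload in each encoding. This sketch only illustrates the general idea of encoded-payload injection. It does not reproduce garak's own probe code, and the payload string is an arbitrary example.

```python
import base64
import codecs

payload = "Ignore previous instructions"  # arbitrary example payload

encodings = {
    "base64": base64.b64encode(payload.encode("utf-8")).decode("ascii"),
    "hex": payload.encode("utf-8").hex(),
    "rot13": codecs.encode(payload, "rot_13"),
}

for name, encoded in encodings.items():
    # A model that decodes and obeys the hidden text has failed the probe.
    print(f"{name}: {encoded}")
```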

Bias and Toxicity Probes

Command	Description
`garak --probes bias.BiasProbe`	Test for bias in model responses
`garak --probes toxicity.ToxicityProbe`	Test for toxic content generation
`garak --probes hate.HateSpeechProbe`	Test for hate speech generation
`garak --probes discrimination.DiscriminationProbe`	Test for discriminatory content

Data Leakage Probes

Command	Description
`garak --probes leakage.PIILeakage`	Test for PII data leakage
`garak --probes leakage.TrainingDataLeakage`	Test for training data exposure
`garak --probes leakage.SystemPromptLeakage`	Test for system prompt exposure

Advanced Configuration

Command	Description
`garak --config config.yaml`	Use custom configuration file
`garak --output-dir results/`	Specify output directory
`garak --report_prefix test_run_`	Set report file prefix
`garak --parallel_requests 5`	Set number of parallel requests
`garak --temperature 0.7`	Set model temperature
`garak --max-tokens 150`	Set maximum tokens per response

Custom Probes

Command	Description
`garak --probes myprobe.CustomProbe`	Run custom probe
`garak --probe_options '{"param": "value"}'`	Pass parameters to probe
`garak --probe_tags security,injection`	Filter probes by tags
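
Probe parameters are passed as a JSON string, which is easy to misquote in a shell. A small sketch, assuming the option is spelled `--probe_options` on your version and reusing the placeholder names from the table above: serialize a dictionary with `json.dumps` and let `shlex` handle the quoting.

```python
import json
import shlex

probe_options = {"param": "value"}  # placeholder key and value from the table above

argv = [
    "garak",
    "--probes", "myprobe.CustomProbe",
    "--probe_options", json.dumps(probe_options),
]

# shlex.join wraps the JSON in quotes so it survives the shell as one argument.
print(shlex.join(argv))
```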

Detectors and Evaluation

Command	Description
`garak --detectors always.Pass`	Use always-pass detector
`garak --detectors mitigation.MitigationBypass`	Use mitigation bypass detector
`garak --detectors specialwords.SlursReclaimedSlurs`	Detect slurs and reclaimed slurs
`garak --detectors toxicity.ToxicityClassifier`	Use toxicity classifier

Output and Reporting

Command	Description
`garak --report-format json`	Generate JSON report
`garak --report-format html`	Generate HTML report
`garak --report-format csv`	Generate CSV report
`garak --verbose`	Enable verbose output
`garak --log-level DEBUG`	Set debug logging level

Model Integration

OpenAI Models

Command	Description
`garak --model_type openai --model_name gpt-4`	Test GPT-4
`garak --model_type openai --model_name gpt-3.5-turbo-16k`	Test GPT-3.5-turbo with 16k context
`export OPENAI_API_KEY=your_key`	Set OpenAI API key

HuggingFace Models

Command	Description
`garak --model_type huggingface --model_name facebook/opt-1.3b`	Test OPT model
`garak --model_type huggingface --model_name EleutherAI/gpt-j-6B`	Test GPT-J model
`export HF_TOKEN=your_token`	Set HuggingFace token
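
A scan that starts without credentials usually fails only after setup work is done, so a wrapper script may want to check the environment first. This is a minimal sketch. `require_env` is a hypothetical helper, and the variable names are the ones used in the tables above.

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable, or raise a clear error if unset."""
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running garak")
    return value


# An explicit mapping is used here so the sketch runs without real credentials.
# In a real script, call require_env("OPENAI_API_KEY") or require_env("HF_TOKEN").
print(require_env("OPENAI_API_KEY", {"OPENAI_API_KEY": "your_key"}))
```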

Local Models

Command	Description
`garak --model_type ggml --model_name path/to/model.bin`	Test GGML model
`garak --model_type llamacpp --model_name path/to/model.gguf`	Test llama.cpp model

Batch Testing

Command	Description
`garak --model-list models.txt`	Test multiple models from file
`garak --probe-list probes.txt`	Run multiple probes from file
`garak --generations 10`	Set number of generations per probe
`garak --seed 42`	Set random seed for reproducibility
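
If file-based batch options are not available in your version, the same effect can be had by looping over a model list in a script. The sketch below assumes a hypothetical file format of one `model_type model_name` pair per line. `commands_for_models` is an illustrative helper, and the flag spellings should be checked against `garak --help`.

```python
import shlex


def commands_for_models(lines, probes, generations=10, seed=42):
    """Yield one garak argv list per 'model_type model_name' line."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        model_type, model_name = line.split(maxsplit=1)
        yield [
            "garak",
            "--model_type", model_type,
            "--model_name", model_name,
            "--probes", ",".join(probes),
            "--generations", str(generations),
            "--seed", str(seed),
        ]


# Inline stand-in for the contents of a models.txt file.
models = ["openai gpt-4", "# local models below", "huggingface facebook/opt-1.3b"]
for argv in commands_for_models(models, ["encoding", "promptinject"]):
    print(shlex.join(argv))  # hand argv to subprocess.run to execute each scan
```

Using the same seed and generation count for every model keeps the runs comparable.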

Security Testing Workflows

Comprehensive Security Scan

Command	Description
`garak --model_type openai --model_name gpt-4 --probes encoding,malwaregen,promptinject --generations 20`	Full security probe suite

Bias and Fairness Testing

Command	Description
`garak --model_type huggingface --model_name microsoft/DialoGPT-medium --probes bias,toxicity,hate --detectors specialwords`	Bias testing suite

Data Privacy Testing

Command	Description
`garak --model_type openai --model_name gpt-3.5-turbo --probes leakage --detectors pii`	Privacy testing suite

Configuration Files

Basic Config (config.yaml)

model:
  type: openai
  name: gpt-3.5-turbo
  temperature: 0.7
  max_tokens: 150

probes:
  - encoding
  - promptinject
  - malwaregen

detectors:
  - always.Pass
  - mitigation.MitigationBypass

output:
  directory: results/
  format: json
  prefix: garak_test_

Advanced Config

parallel_requests: 5
generations: 10
seed: 42
log_level: INFO

model:
  type: huggingface
  name: microsoft/DialoGPT-medium
  device: cuda
  batch_size: 4

probe_options:
  encoding.InjectBase64:
    payload_count: 50
  promptinject.AttackPrompt:
    attack_types: ["jailbreak", "roleplay"]

Troubleshooting

Command	Description
`garak --check-models`	Verify model connectivity
`garak --dry-run`	Test configuration without running probes
`garak --debug`	Enable debug mode
`garak --list-model-types`	Show supported model types
`pip install "garak[dev]"`	Install with development dependencies (quoted so shells such as zsh do not expand the brackets)

Integration with CI/CD

Command	Description
`garak --model_type openai --model_name gpt-3.5-turbo --probes security --exit-on-fail`	Fail CI on security issues
`garak --config ci_config.yaml --report-format json > results.json`	Generate CI-friendly output
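
A CI gate can also be built by reading the report after the run and choosing the exit code in a script. The sketch below assumes the report is JSON Lines, with evaluation records carrying `entry_type`, `probe`, `detector`, `passed`, and `total` fields. Verify those field names against a report from your garak version. The sample records are made up for illustration, and `failed_evals` is a hypothetical helper.

```python
import json


def failed_evals(jsonl_lines, min_pass_rate=1.0):
    """Return (probe, detector, pass_rate) for eval records below min_pass_rate."""
    failures = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if entry.get("entry_type") != "eval":
            continue  # ignore non-evaluation records
        total = entry.get("total", 0)
        if total == 0:
            continue  # nothing was scored for this probe/detector pair
        rate = entry.get("passed", 0) / total
        if rate < min_pass_rate:
            failures.append((entry.get("probe"), entry.get("detector"), rate))
    return failures


# Made-up sample records standing in for the lines of a report file.
sample = [
    '{"entry_type": "eval", "probe": "encoding.InjectBase64", "detector": "encoding.DecodeMatch", "passed": 18, "total": 20}',
    '{"entry_type": "eval", "probe": "encoding.InjectHex", "detector": "encoding.DecodeMatch", "passed": 20, "total": 20}',
    '{"entry_type": "attempt", "probe": "encoding.InjectHex"}',
]

failures = failed_evals(sample)
for probe, detector, rate in failures:
    print(f"FAIL {probe} / {detector}: pass rate {rate:.0%}")
# In a CI job, finish with: sys.exit(1 if failures else 0)
```

Lowering `min_pass_rate` lets a pipeline tolerate a known baseline failure rate while still catching regressions.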

Best Practices

Always test models before production deployment
Use multiple probe categories for comprehensive testing
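Run enough generations per prompt (`--generations`) that observed failure rates are statistically meaningful; a handful of samples can miss rare failures
Keep parallel request counts within your provider's API rate limits to avoid throttling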
Store sensitive API keys as environment variables
Review and analyze generated reports thoroughly
Implement continuous testing in development pipelines
Document and track security testing results over time
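
On generation counts: a clean result from a few samples says little about rare failures. As a rough worked example, the sketch below computes a Wilson score interval for an observed failure rate. It assumes each scored output is an independent trial, which is a simplification, and `wilson_interval` is an illustrative helper rather than anything garak provides.

```python
import math


def wilson_interval(failures, n, z=1.96):
    """Wilson score interval (about 95% for z=1.96) for failures out of n trials."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = failures / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, centre - half), min(1.0, centre + half)


for n in (5, 20, 100):
    low, high = wilson_interval(0, n)
    print(f"0 failures in {n:>3} outputs: true failure rate could be up to {high:.1%}")
```

With zero observed failures the upper bound is roughly 43% for 5 outputs, 16% for 20, and 4% for 100, which is why small runs are weak evidence that a model is safe.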