Garak
Garak is an open-source AI red-teaming framework designed to test and evaluate the security and robustness of large language models (LLMs) and AI systems through adversarial testing.
Installation and Setup
| Command | Description |
| --- | --- |
| `pip install garak` | Install Garak via pip |
| `git clone https://github.com/leondz/garak.git` | Clone from GitHub |
| `cd garak && pip install -e .` | Install in development mode |
| `garak --help` | Display help and available options |
| `garak --list_probes` | List all available probes |
| `garak --list_detectors` | List all available detectors |
| `garak --list_generators` | List all available generators |
Basic Usage
| Command | Description |
| --- | --- |
| `garak --model_type openai --model_name gpt-3.5-turbo` | Test OpenAI GPT-3.5-turbo |
| `garak --model_type huggingface --model_name microsoft/DialoGPT-medium` | Test a Hugging Face model |
| `garak --model_type replicate --model_name replicate/llama-2-70b-chat` | Test a Replicate model |
| `garak --probes encoding` | Run encoding vulnerability probes |
| `garak --probes malwaregen` | Run malware generation probes |
| `garak --probes promptinject` | Run prompt injection probes |
Probe Categories
Security Probes
| Command | Description |
| --- | --- |
| `garak --probes encoding.InjectBase64` | Test base64 encoding injection |
| `garak --probes encoding.InjectHex` | Test hexadecimal encoding injection |
| `garak --probes encoding.InjectMorse` | Test Morse code encoding injection |
| `garak --probes encoding.InjectROT13` | Test ROT13 encoding injection |
| `garak --probes malwaregen.Evasion` | Test malware generation evasion |
| `garak --probes promptinject.AttackPrompt` | Test prompt injection attacks |
Bias and Toxicity Probes
| Command | Description |
| --- | --- |
| `garak --probes bias.BiasProbe` | Test for bias in model responses |
| `garak --probes toxicity.ToxicityProbe` | Test for toxic content generation |
| `garak --probes hate.HateSpeechProbe` | Test for hate speech generation |
| `garak --probes discrimination.DiscriminationProbe` | Test for discriminatory content |
Data Leakage Probes
| Command | Description |
| --- | --- |
| `garak --probes leakage.PIILeakage` | Test for PII data leakage |
| `garak --probes leakage.TrainingDataLeakage` | Test for training data exposure |
| `garak --probes leakage.SystemPromptLeakage` | Test for system prompt exposure |
Advanced Configuration
| Command | Description |
| --- | --- |
| `garak --config config.yaml` | Use a custom configuration file |
| `garak --output-dir results/` | Specify the output directory |
| `garak --report_prefix test_run_` | Set the report file prefix |
| `garak --parallel_requests 5` | Set the number of parallel requests |
| `garak --temperature 0.7` | Set the model temperature |
| `garak --max-tokens 150` | Set the maximum tokens per response |
Custom Probes
| Command | Description |
| --- | --- |
| `garak --probes myprobe.CustomProbe` | Run a custom probe |
| `garak --probe_options '{"param": "value"}'` | Pass parameters to a probe |
| `garak --probe_tags security,injection` | Filter probes by tags |
Detectors and Evaluation
| Command | Description |
| --- | --- |
| `garak --detectors always.Pass` | Use the always-pass detector |
| `garak --detectors mitigation.MitigationBypass` | Use the mitigation bypass detector |
| `garak --detectors specialwords.SlursReclaimedSlurs` | Detect slurs and reclaimed slurs |
| `garak --detectors toxicity.ToxicityClassifier` | Use the toxicity classifier |
Output and Reporting
| Command | Description |
| --- | --- |
| `garak --report-format json` | Generate a JSON report |
| `garak --report-format html` | Generate an HTML report |
| `garak --report-format csv` | Generate a CSV report |
| `garak --verbose` | Enable verbose output |
| `garak --log-level DEBUG` | Set the debug logging level |
Model Integration
OpenAI Models
| Command | Description |
| --- | --- |
| `garak --model_type openai --model_name gpt-4` | Test GPT-4 |
| `garak --model_type openai --model_name gpt-3.5-turbo-16k` | Test GPT-3.5-turbo with 16k context |
| `export OPENAI_API_KEY=your_key` | Set the OpenAI API key |
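
Because the key is read from the environment, a quick pre-flight check can stop a scan before any request is sent. The following is a minimal sketch; `require_env` is a hypothetical helper for illustration and not part of garak, and the commented usage line uses the underscore flag spelling from garak's README.

```shell
# require_env NAME...: return non-zero if any named variable is unset or empty.
# Hypothetical helper for illustration; not part of garak.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup via eval keeps this POSIX sh compatible.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: only start the scan when the key is present.
# require_env OPENAI_API_KEY && garak --model_type openai --model_name gpt-4
```

The same helper accepts several names at once, for example `require_env OPENAI_API_KEY HF_TOKEN`.
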
Hugging Face Models
| Command | Description |
| --- | --- |
| `garak --model_type huggingface --model_name facebook/opt-1.3b` | Test an OPT model |
| `garak --model_type huggingface --model_name EleutherAI/gpt-j-6B` | Test a GPT-J model |
| `export HF_TOKEN=your_token` | Set the Hugging Face token |
Local Models
| Command | Description |
| --- | --- |
| `garak --model_type ggml --model_name path/to/model.bin` | Test a GGML model |
| `garak --model_type llamacpp --model_name path/to/model.gguf` | Test a llama.cpp model |
Batch Testing
| Command | Description |
| --- | --- |
| `garak --model-list models.txt` | Test multiple models from a file |
| `garak --probe-list probes.txt` | Run multiple probes from a file |
| `garak --generations 10` | Set the number of generations per probe |
| `garak --seed 42` | Set the random seed for reproducibility |
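
A plain shell loop is another way to run the same probes against several models. The sketch below assumes a hypothetical `models.txt` with one `<model type> <model name>` pair per line; `scan_models` and the `GARAK` override variable are illustrative names, not garak features. The flags follow the underscore spelling from garak's README.

```shell
# scan_models FILE PROBES: run garak once per "<type> <name>" line in FILE.
# GARAK can be overridden (e.g. GARAK=echo) to preview the commands.
scan_models() {
  models_file=$1
  probes=$2
  # The "|| [ -n ... ]" part keeps a final line without a trailing newline.
  while read -r model_type model_name || [ -n "$model_type" ]; do
    # Skip blank lines and comment lines.
    case $model_type in
      ''|'#'*) continue ;;
    esac
    # stdin is redirected so the command cannot consume the model list.
    "${GARAK:-garak}" --model_type "$model_type" --model_name "$model_name" \
      --probes "$probes" --generations 10 --seed 42 < /dev/null
  done < "$models_file"
}

# Usage: preview first, then run for real.
# GARAK=echo scan_models models.txt encoding
# scan_models models.txt encoding
```

Fixing `--generations` and `--seed` inside the loop keeps the runs comparable across models.
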
Security Testing Workflows
Comprehensive Security Scan
| Command | Description |
| --- | --- |
| `garak --model_type openai --model_name gpt-4 --probes encoding,malwaregen,promptinject --generations 20` | Full security probe suite |
Bias and Fairness Testing
| Command | Description |
| --- | --- |
| `garak --model_type huggingface --model_name microsoft/DialoGPT-medium --probes bias,toxicity,hate --detectors specialwords` | Bias testing suite |
Privacy Testing
| Command | Description |
| --- | --- |
| `garak --model_type openai --model_name gpt-3.5-turbo --probes leakage --detectors pii` | Privacy testing suite |
Configuration Files
Basic Config (config.yaml)
```yaml
model:
  type: openai
  name: gpt-3.5-turbo
  temperature: 0.7
  max_tokens: 150

probes:
  - encoding
  - promptinject
  - malwaregen

detectors:
  - always.Pass
  - mitigation.MitigationBypass

output:
  directory: results/
  format: json
  prefix: garak_test_
```
Advanced Config
```yaml
parallel_requests: 5
generations: 10
seed: 42
log_level: INFO

model:
  type: huggingface
  name: microsoft/DialoGPT-medium
  device: cuda
  batch_size: 4

probe_options:
  encoding.InjectBase64:
    payload_count: 50
  promptinject.AttackPrompt:
    attack_types: ["jailbreak", "roleplay"]
```
Troubleshooting
| Command | Description |
| --- | --- |
| `garak --check-models` | Verify model connectivity |
| `garak --dry-run` | Test the configuration without running probes |
| `garak --debug` | Enable debug mode |
| `garak --list-model-types` | Show supported model types |
| `pip install 'garak[dev]'` | Install with development dependencies |
CI/CD Integration
| Command | Description |
| --- | --- |
| `garak --model_type openai --model_name gpt-3.5-turbo --probes security --exit-on-fail` | Fail CI on security issues |
| `garak --config ci_config.yaml --report-format json > results.json` | Generate CI-friendly output |
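
A CI job can also gate on the contents of the report instead of relying on the exit status. The sketch below assumes a JSONL report in which evaluation lines carry `"entry_type": "eval"` together with integer `passed` and `total` fields; treat that schema as an assumption and check it against a report produced by your garak version. `count_failed_evals` and the report path in the usage line are hypothetical.

```shell
# count_failed_evals REPORT: print how many eval entries have passed < total.
# Assumes one JSON object per line with "entry_type", "passed", "total" keys.
count_failed_evals() {
  awk '
    /"entry_type": *"eval"/ {
      passed = -1; total = -1
      # Pull the integer that follows each key out of the raw line.
      if (match($0, /"passed": *[0-9]+/)) {
        field = substr($0, RSTART, RLENGTH); sub(/.*: */, "", field); passed = field + 0
      }
      if (match($0, /"total": *[0-9]+/)) {
        field = substr($0, RSTART, RLENGTH); sub(/.*: */, "", field); total = field + 0
      }
      if (passed >= 0 && total >= 0 && passed < total) failed++
    }
    END { print failed + 0 }
  ' "$1"
}

# Usage in a CI step (hypothetical report path): fail when any check had hits.
# [ "$(count_failed_evals results/run.report.jsonl)" -eq 0 ] || exit 1
```

Matching fields with `awk` keeps the step free of extra dependencies; a JSON-aware tool is the more robust choice when one is available on the runner.
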
Best Practices
- Always test models before deploying them to production
- Use multiple probe categories for comprehensive testing
- Set generation counts high enough for statistically meaningful results
- Configure API rate limits properly to avoid throttling
- Store sensitive API keys as environment variables
- Review and analyze generated reports thoroughly
- Run tests continuously in development pipelines
- Repeat security tests over time
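
The advice on rate limits can be backed by a retry wrapper around each scan. The following is a minimal sketch of exponential backoff in plain shell; `retry_with_backoff` and the `BACKOFF_BASE` variable are hypothetical names, and the wrapper retries on any non-zero exit status, not only on throttling errors.

```shell
# retry_with_backoff MAX_ATTEMPTS CMD...: rerun CMD until it succeeds,
# sleeping BACKOFF_BASE, then 2x, 4x, ... seconds between attempts.
retry_with_backoff() {
  max_attempts=$1
  shift
  attempt=1
  delay=${BACKOFF_BASE:-2}
  while :; do
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))
  done
}

# Usage: up to three attempts for a single scan.
# retry_with_backoff 3 garak --model_type openai --model_name gpt-3.5-turbo --probes encoding
```

Doubling the delay gives a throttled API progressively more time to recover while keeping the first retry quick.
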