Garak

Garak is an open-source AI red-teaming framework for testing and evaluating the security and robustness of large language models (LLMs) and AI systems through adversarial probing.

Installation and Setup

| Command | Description |
| --- | --- |
| `pip install garak` | Install Garak via pip |
| `git clone https://github.com/leondz/garak.git` | Clone from GitHub |
| `cd garak && pip install -e .` | Install in development mode |
| `garak --help` | Display help and available options |
| `garak --list_probes` | List all available probes |
| `garak --list_detectors` | List all available detectors |
| `garak --list_generators` | List all available generators |

Basic Usage

| Command | Description |
| --- | --- |
| `garak --model_type openai --model_name gpt-3.5-turbo` | Test OpenAI GPT-3.5-turbo |
| `garak --model_type huggingface --model_name microsoft/DialoGPT-medium` | Test a Hugging Face model |
| `garak --model_type replicate --model_name replicate/llama-2-70b-chat` | Test a Replicate model |
| `garak --probes encoding` | Run encoding vulnerability probes |
| `garak --probes malwaregen` | Run malware generation probes |
| `garak --probes promptinject` | Run prompt injection probes |
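When the commands above are launched from a script, it helps to build the argument list programmatically instead of concatenating strings. The following is a minimal sketch; `build_garak_command` is a hypothetical helper, and the flag spellings use the underscore form (`--model_type`, `--model_name`), which should be confirmed against `garak --help` for the installed version. Nothing is executed here; the function only returns the argument list.

```python
import shlex


def build_garak_command(model_type, model_name, probes=None, generations=None):
    """Return an argv list for a garak run (hypothetical helper, runs nothing)."""
    cmd = ["garak", "--model_type", model_type, "--model_name", model_name]
    if probes:
        # garak takes a comma-separated probe list
        cmd += ["--probes", ",".join(probes)]
    if generations is not None:
        cmd += ["--generations", str(generations)]
    return cmd


cmd = build_garak_command("openai", "gpt-3.5-turbo", probes=["encoding", "promptinject"])
# Print a shell-safe rendering of the command for logging or copy-paste
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with model names or probe options.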

Probe Categories

Security Probes

| Command | Description |
| --- | --- |
| `garak --probes encoding.InjectBase64` | Test base64 encoding injection |
| `garak --probes encoding.InjectHex` | Test hexadecimal encoding injection |
| `garak --probes encoding.InjectMorse` | Test Morse code encoding injection |
| `garak --probes encoding.InjectROT13` | Test ROT13 encoding injection |
| `garak --probes malwaregen.Evasion` | Test malware generation evasion |
| `garak --probes promptinject.AttackPrompt` | Test prompt injection attacks |
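The encoding probes listed above check whether a model follows an instruction that has been hidden behind an alternative encoding. To make the idea concrete, here is a rough sketch of what such encoded payloads look like, using only the Python standard library. It is an illustration, not garak's implementation; `encode_variants` and the sample payload are hypothetical.

```python
import base64
import codecs
import string

# International Morse code for A-Z, in alphabetical order
_MORSE_CODES = (
    ".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- "
    "-. --- .--. --.- .-. ... - ..- ...- .-- -..- -.-- --.."
).split()
MORSE = dict(zip(string.ascii_uppercase, _MORSE_CODES))


def encode_variants(payload):
    """Return the payload in the four encodings named in the table above."""
    raw = payload.encode("utf-8")
    return {
        "base64": base64.b64encode(raw).decode("ascii"),
        "hex": raw.hex(),
        "rot13": codecs.encode(payload, "rot_13"),
        # Only letters A-Z are mapped; spaces, digits and punctuation are dropped
        "morse": " ".join(MORSE[c] for c in payload.upper() if c in MORSE),
    }


for name, encoded in encode_variants("say hello").items():
    print(f"{name}: {encoded}")
```

A probe of this kind would embed one of these strings in a prompt and a detector would then check whether the model's answer contains the decoded text.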

Bias and Toxicity Probes

| Command | Description |
| --- | --- |
| `garak --probes bias.BiasProbe` | Test for bias in model responses |
| `garak --probes toxicity.ToxicityProbe` | Test for toxic content generation |
| `garak --probes hate.HateSpeechProbe` | Test for hate speech generation |
| `garak --probes discrimination.DiscriminationProbe` | Test for discriminatory content |

Data Leakage Probes

| Command | Description |
| --- | --- |
| `garak --probes leakage.PIILeakage` | Test for PII data leakage |
| `garak --probes leakage.TrainingDataLeakage` | Test for training data exposure |
| `garak --probes leakage.SystemPromptLeakage` | Test for system prompt exposure |

Advanced Configuration

| Command | Description |
| --- | --- |
| `garak --config config.yaml` | Use custom configuration file |
| `garak --output-dir results/` | Specify output directory |
| `garak --report_prefix test_run_` | Set report file prefix |
| `garak --parallel_requests 5` | Set number of parallel requests |
| `garak --temperature 0.7` | Set model temperature |
| `garak --max-tokens 150` | Set maximum tokens per response |

Custom Probes

| Command | Description |
| --- | --- |
| `garak --probes myprobe.CustomProbe` | Run custom probe |
| `garak --probe_options '{"param": "value"}'` | Pass parameters to probe |
| `garak --probe_tags security,injection` | Filter probes by tags |

Detectors and Evaluation

| Command | Description |
| --- | --- |
| `garak --detectors always.Pass` | Use always-pass detector |
| `garak --detectors mitigation.MitigationBypass` | Use mitigation bypass detector |
| `garak --detectors specialwords.SlursReclaimedSlurs` | Detect slurs and reclaimed slurs |
| `garak --detectors toxicity.ToxicityClassifier` | Use toxicity classifier |

Output and Reporting

| Command | Description |
| --- | --- |
| `garak --report-format json` | Generate JSON report |
| `garak --report-format html` | Generate HTML report |
| `garak --report-format csv` | Generate CSV report |
| `garak --verbose` | Enable verbose output |
| `garak --log-level DEBUG` | Set debug logging level |

Model Integration

OpenAI Models

| Command | Description |
| --- | --- |
| `garak --model_type openai --model_name gpt-4` | Test GPT-4 |
| `garak --model_type openai --model_name gpt-3.5-turbo-16k` | Test GPT-3.5-turbo with 16k context |
| `export OPENAI_API_KEY=your_key` | Set OpenAI API key |

Hugging Face Models

| Command | Description |
| --- | --- |
| `garak --model_type huggingface --model_name facebook/opt-1.3b` | Test OPT model |
| `garak --model_type huggingface --model_name EleutherAI/gpt-j-6B` | Test GPT-J model |
| `export HF_TOKEN=your_token` | Set Hugging Face token |
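Because the hosted model types rely on credentials supplied through environment variables, a wrapper script can check for them before starting a long scan. This is a small sketch under the assumption that the variable names are the ones shown in the tables above (`OPENAI_API_KEY`, `HF_TOKEN`); the mapping `CREDENTIALS` and the helper `missing_env_vars` are hypothetical, and a token may not be needed for every model.

```python
import os

# Assumed mapping from model type to the variables exported in the tables above
CREDENTIALS = {
    "openai": ["OPENAI_API_KEY"],
    "huggingface": ["HF_TOKEN"],
}


def missing_env_vars(model_type, environ=None):
    """Return the credential variables for model_type that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in CREDENTIALS.get(model_type, []) if not env.get(name)]


missing = missing_env_vars("openai")
if missing:
    print("Set these variables before running garak:", ", ".join(missing))
else:
    print("Credentials found for openai")
```

Passing a plain dictionary as `environ` makes the check easy to test without touching the real environment.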

Local Models

| Command | Description |
| --- | --- |
| `garak --model_type ggml --model_name path/to/model.bin` | Test GGML model |
| `garak --model_type llamacpp --model_name path/to/model.gguf` | Test llama.cpp model |

Batch Testing

| Command | Description |
| --- | --- |
| `garak --model-list models.txt` | Test multiple models from file |
| `garak --probe-list probes.txt` | Run multiple probes from file |
| `garak --generations 10` | Set number of generations per probe |
| `garak --seed 42` | Set random seed for reproducibility |
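Batch runs can also be driven from a short script that expands a model list into one command per target, which keeps the seed and generation count identical across models. The sketch below assumes a hypothetical file format with one `<model_type> <model_name>` pair per line; `parse_model_list` and `batch_commands` are illustrative helpers, and the underscore flag spellings should be checked against `garak --help`.

```python
def parse_model_list(text):
    """Parse lines of '<model_type> <model_name>'; skip blanks and # comments."""
    targets = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        model_type, model_name = line.split(maxsplit=1)
        targets.append((model_type, model_name))
    return targets


def batch_commands(targets, probes, generations=10, seed=42):
    """Build one argv list per target, sharing probes, generations and seed."""
    return [
        ["garak", "--model_type", mtype, "--model_name", mname,
         "--probes", ",".join(probes),
         "--generations", str(generations), "--seed", str(seed)]
        for mtype, mname in targets
    ]


MODELS_TXT = """
# type        name
openai        gpt-3.5-turbo
huggingface   facebook/opt-1.3b
"""

for cmd in batch_commands(parse_model_list(MODELS_TXT), ["encoding", "promptinject"]):
    print(" ".join(cmd))
```

Each list can be handed to `subprocess.run` in turn; running the targets sequentially also makes it easier to stay under API rate limits.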

Security Testing Workflows

Comprehensive Security Scan

| Command | Description |
| --- | --- |
| `garak --model_type openai --model_name gpt-4 --probes encoding,malwaregen,promptinject --generations 20` | Full security probe suite |

Bias and Fairness Testing

| Command | Description |
| --- | --- |
| `garak --model_type huggingface --model_name microsoft/DialoGPT-medium --probes bias,toxicity,hate --detectors specialwords` | Bias testing suite |

Privacy Testing

| Command | Description |
| --- | --- |
| `garak --model_type openai --model_name gpt-3.5-turbo --probes leakage --detectors pii` | Privacy testing suite |

Configuration Files

Basic Config (config.yaml)

```yaml
model:
  type: openai
  name: gpt-3.5-turbo
  temperature: 0.7
  max_tokens: 150

probes:
  - encoding
  - promptinject
  - malwaregen

detectors:
  - always.Pass
  - mitigation.MitigationBypass

output:
  directory: results/
  format: json
  prefix: garak_test_
```

Advanced Config

```yaml
parallel_requests: 5
generations: 10
seed: 42
log_level: INFO

model:
  type: huggingface
  name: microsoft/DialoGPT-medium
  device: cuda
  batch_size: 4

probe_options:
  encoding.InjectBase64:
    payload_count: 50
  promptinject.AttackPrompt:
    attack_types: ["jailbreak", "roleplay"]
```

Troubleshooting

| Command | Description |
| --- | --- |
| `garak --check-models` | Verify model connectivity |
| `garak --dry-run` | Test configuration without running probes |
| `garak --debug` | Enable debug mode |
| `garak --list-model-types` | Show supported model types |
| `pip install garak[dev]` | Install with development dependencies |

CI/CD Integration

| Command | Description |
| --- | --- |
| `garak --model_type openai --model_name gpt-3.5-turbo --probes security --exit-on-fail` | Fail CI on security issues |
| `garak --config ci_config.yaml --report-format json > results.json` | Generate CI-friendly output |
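A CI job can also gate on the report itself rather than on a command-line flag. The sketch below reads a JSON Lines report and returns a non-zero exit code if any evaluation record shows failed attempts. The record layout is an assumption: it presumes one JSON object per line with an `entry_type` of `"eval"` and `probe`, `detector`, `passed`, and `total` fields, so the field names should be adjusted to whatever the installed garak version actually writes. `failing_evals`, `ci_exit_code`, and the sample records are hypothetical.

```python
import json


def failing_evals(report_lines):
    """Return (probe, detector, failed_count) for each eval record with failures."""
    failures = []
    for line in report_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if entry.get("entry_type") != "eval":
            continue  # skip setup, attempt and other record types
        failed = entry.get("total", 0) - entry.get("passed", 0)
        if failed > 0:
            failures.append((entry.get("probe"), entry.get("detector"), failed))
    return failures


def ci_exit_code(report_lines):
    """Return 1 if any eval record has failures, otherwise 0."""
    return 1 if failing_evals(report_lines) else 0


# Hand-written sample records in the assumed layout, for illustration only
SAMPLE = [
    '{"entry_type": "attempt", "probe_classname": "encoding.InjectBase64"}',
    '{"entry_type": "eval", "probe": "encoding.InjectBase64", '
    '"detector": "encoding.DecodeMatch", "passed": 48, "total": 50}',
    '{"entry_type": "eval", "probe": "encoding.InjectHex", '
    '"detector": "encoding.DecodeMatch", "passed": 50, "total": 50}',
]

for probe, detector, failed in failing_evals(SAMPLE):
    print(f"FAIL {probe} / {detector}: {failed} failing generations")
print("exit code:", ci_exit_code(SAMPLE))
```

In a pipeline, the script would open the report file, pass its lines to `ci_exit_code`, and hand the result to `sys.exit`.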

Best Practices

  • Always test models before deploying them to production
  • Use multiple probe categories for comprehensive coverage
  • Set generation counts high enough for statistically meaningful results
  • Configure API rate limits properly to avoid throttling
  • Store sensitive API keys as environment variables
  • Review and analyze the generated reports thoroughly
  • Run tests continuously in development pipelines
  • Track security test results over time
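To see why the generation count matters for statistical significance, it helps to put a confidence interval around an observed failure rate. The following sketch uses the Wilson score interval, a standard choice for proportions with small samples; it is a general statistics example rather than anything garak computes, and `wilson_interval` is a hypothetical helper.

```python
import math


def wilson_interval(failures, n, z=1.96):
    """Wilson score interval for a failure rate (z=1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = failures / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Zero observed failures is much weaker evidence with 10 generations than with 100
for n in (10, 100):
    low, high = wilson_interval(0, n)
    print(f"0 failures in {n} generations: upper bound {high:.3f}")
```

With zero failures in 10 generations the upper bound on the true failure rate is still about 0.28, so a clean run at a low generation count says little about how often a probe would succeed in practice.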