Garak¶


Garak is an open-source AI red-teaming framework for testing and evaluating the security and robustness of large language models (LLMs) and AI systems through adversarial probing.

Installation and Setup¶

Command	Description
`pip install garak`	Install Garak via pip
`git clone https://github.com/leondz/garak.git`	Clone from GitHub
`cd garak && pip install -e .`	Install in development mode
`garak --help`	Display help and available options
`garak --list_probes`	List all available probes
`garak --list_detectors`	List all available detectors
`garak --list_generators`	List all available generators

Basic Usage¶

Command	Description
`garak --model_type openai --model_name gpt-3.5-turbo`	Test OpenAI GPT-3.5-turbo
`garak --model_type huggingface --model_name microsoft/DialoGPT-medium`	Test Hugging Face model
`garak --model_type replicate --model_name replicate/llama-2-70b-chat`	Test Replicate model
`garak --probes encoding`	Run encoding vulnerability probes
`garak --probes malwaregen`	Run malware generation probes
`garak --probes promptinject`	Run prompt injection probes
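The rows above show each flag in isolation, while an actual run usually combines a generator with one or more probes. The following is a minimal Python sketch of assembling such a command. The helper name `build_garak_command` is hypothetical, and the underscore flag spellings (`--model_type`, `--model_name`) are an assumption about garak's CLI that should be confirmed with `garak --help` on the installed version.

```python
import shlex
import sys


def build_garak_command(model_type, model_name, probes=()):
    """Return an argv list for one garak run (hypothetical helper)."""
    argv = [sys.executable, "-m", "garak",
            "--model_type", model_type,
            "--model_name", model_name]
    if probes:
        # Several probes are passed as a single comma-separated value
        argv += ["--probes", ",".join(probes)]
    return argv


cmd = build_garak_command("openai", "gpt-3.5-turbo", ["encoding", "promptinject"])
print(shlex.join(cmd[1:]))  # skip the interpreter path for readability
# To execute the run: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when model names contain slashes or spaces.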

Probe Categories¶

Security Probes¶

Command	Description
`garak --probes encoding.InjectBase64`	Test base64 encoding injection
`garak --probes encoding.InjectHex`	Test hexadecimal encoding injection
`garak --probes encoding.InjectMorse`	Test Morse code encoding injection
`garak --probes encoding.InjectROT13`	Test ROT13 encoding injection
`garak --probes malwaregen.Evasion`	Test malware generation evasion
`garak --probes promptinject.AttackPrompt`	Test prompt injection attacks
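To make the encoding probes in this table more concrete, the sketch below shows what Base64, hexadecimal, and ROT13 encodings of an instruction look like. It only illustrates the general idea of hiding text behind an encoding; the payload string is invented for this example and is not taken from garak's own probe data. Morse code is left out because the Python standard library has no codec for it.

```python
import base64
import codecs

payload = "Ignore previous instructions"  # illustrative text, not a garak payload

encoded = {
    "base64": base64.b64encode(payload.encode("utf-8")).decode("ascii"),
    "hex": payload.encode("utf-8").hex(),
    "rot13": codecs.encode(payload, "rot13"),
}

for name, text in encoded.items():
    # A probe of this kind would embed `text` in a prompt and ask the model to decode it
    print(f"{name}: {text}")
```

A model that decodes and then follows such text has accepted an instruction that a plain-text filter might not have seen.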

Bias and Toxicity Probes¶

Command	Description
`garak --probes bias.BiasProbe`	Test for bias in model responses
`garak --probes toxicity.ToxicityProbe`	Test for toxic content generation
`garak --probes hate.HateSpeechProbe`	Test for hate speech generation
`garak --probes discrimination.DiscriminationProbe`	Test for discriminatory content

Data Leakage Probes¶

Command	Description
`garak --probes leakage.PIILeakage`	Test for PII data leakage
`garak --probes leakage.TrainingDataLeakage`	Test for training data exposure
`garak --probes leakage.SystemPromptLeakage`	Test for system prompt exposure

Advanced Configuration¶

Command	Description
`garak --config config.yaml`	Use custom configuration file
`garak --output-dir results/`	Specify output directory
`garak --report_prefix test_run_`	Set report file prefix
`garak --parallel_requests 5`	Set number of parallel requests
`garak --temperature 0.7`	Set model temperature
`garak --max-tokens 150`	Set maximum tokens per response

Custom Probes¶

Command	Description
`garak --probes myprobe.CustomProbe`	Run custom probe
`garak --probe_options '{"param": "value"}'`	Pass parameters to probe
`garak --probe_tags security,injection`	Filter probes by tags

Detectors and Evaluation¶

Command	Description
`garak --detectors always.Pass`	Use always-pass detector
`garak --detectors mitigation.MitigationBypass`	Use mitigation bypass detector
`garak --detectors specialwords.SlursReclaimedSlurs`	Detect slurs and reclaimed slurs
`garak --detectors toxicity.ToxicityClassifier`	Use toxicity classifier

Output and Reporting¶

Command	Description
`garak --report-format json`	Generate JSON report
`garak --report-format html`	Generate HTML report
`garak --report-format csv`	Generate CSV report
`garak --verbose`	Enable verbose output
`garak --log-level DEBUG`	Set debug logging level

Model Integration¶

OpenAI Models¶

Command	Description
`garak --model_type openai --model_name gpt-4`	Test GPT-4
`garak --model_type openai --model_name gpt-3.5-turbo-16k`	Test GPT-3.5-turbo with 16k context
`export OPENAI_API_KEY=your_key`	Set OpenAI API key

Hugging Face Models¶

Command	Description
`garak --model_type huggingface --model_name facebook/opt-1.3b`	Test OPT model
`garak --model_type huggingface --model_name EleutherAI/gpt-j-6B`	Test GPT-J model
`export HF_TOKEN=your_token`	Set HuggingFace token
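Both this table and the OpenAI one end with an `export` row for a credential. A small pre-flight check can catch a missing variable before a long scan starts. The sketch below assumes the variable names shown in those rows; `missing_credentials` is a hypothetical helper, and which variables are actually needed depends on the generator in use.

```python
import os


def missing_credentials(required, environ=os.environ):
    """Return the names in `required` that are unset or empty."""
    return [name for name in required if not environ.get(name)]


# Variable names taken from the tables above; adjust to the generator you use
needed = ["OPENAI_API_KEY", "HF_TOKEN"]
missing = missing_credentials(needed)
if missing:
    print("Set these before running garak:", ", ".join(missing))
else:
    print("All credentials present")
```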

Local Models¶

Command	Description
`garak --model_type ggml --model_name path/to/model.bin`	Test GGML model
`garak --model_type llamacpp --model_name path/to/model.gguf`	Test llama.cpp model

Batch Testing¶

Command	Description
`garak --model-list models.txt`	Test multiple models from file
`garak --probe-list probes.txt`	Run multiple probes from file
`garak --generations 10`	Set number of generations per probe
`garak --seed 42`	Set random seed for reproducibility
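If the installed garak version does not accept list-file flags, the same batch behaviour can be approximated with a wrapper that issues one run per model and probe. The sketch below is one way to do that under stated assumptions: the one-entry-per-line file format, the `<model_type> <model_name>` layout of each model line, and the helper names are all invented for this example, and the underscore flag spellings should be checked against `garak --help`.

```python
import itertools
import sys


def read_list(text):
    """Parse one entry per line, skipping blanks and '#' comments."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def batch_commands(models, probes, generations=10, seed=42):
    """Yield one argv list per (model, probe) pair.

    Each model entry is '<model_type> <model_name>' (a format assumed here).
    """
    for model, probe in itertools.product(models, probes):
        model_type, model_name = model.split(maxsplit=1)
        yield [sys.executable, "-m", "garak",
               "--model_type", model_type, "--model_name", model_name,
               "--probes", probe,
               "--generations", str(generations), "--seed", str(seed)]


# Inline strings stand in for the contents of models.txt and probes.txt
models = read_list("openai gpt-4\n# comment\nhuggingface facebook/opt-1.3b\n")
probes = read_list("encoding\npromptinject\n")
commands = list(batch_commands(models, probes))
print(len(commands), "runs planned")
# Run them with: for argv in commands: subprocess.run(argv, check=False)
```

Keeping the seed and generation count identical across runs makes the resulting reports easier to compare between models.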

Security Testing Workflows¶

Comprehensive Security Scan¶

Command	Description
`garak --model_type openai --model_name gpt-4 --probes encoding,malwaregen,promptinject --generations 20`	Full security probe suite

Bias and Fairness Testing¶

Command	Description
`garak --model_type huggingface --model_name microsoft/DialoGPT-medium --probes bias,toxicity,hate --detectors specialwords`	Bias testing suite

Privacy Testing¶

Command	Description
`garak --model_type openai --model_name gpt-3.5-turbo --probes leakage --detectors pii`	Privacy testing suite

Configuration Files¶

Basic Config (config.yaml)¶

```yaml
model:
  type: openai
  name: gpt-3.5-turbo
  temperature: 0.7
  max_tokens: 150

probes:
  - encoding
  - promptinject
  - malwaregen

detectors:
  - always.Pass
  - mitigation.MitigationBypass

output:
  directory: results/
  format: json
  prefix: garak_test_
```

Advanced Config¶

```yaml
parallel_requests: 5
generations: 10
seed: 42
log_level: INFO

model:
  type: huggingface
  name: microsoft/DialoGPT-medium
  device: cuda
  batch_size: 4

probe_options:
  encoding.InjectBase64:
    payload_count: 50
  promptinject.AttackPrompt:
    attack_types: ["jailbreak", "roleplay"]
```

Troubleshooting¶

Command	Description
`garak --check-models`	Verify model connectivity
`garak --dry-run`	Test configuration without running probes
`garak --debug`	Enable debug mode
`garak --list-model-types`	Show supported model types
`pip install garak[dev]`	Install with development dependencies

CI/CD Integration¶

Command	Description
`garak --model_type openai --model_name gpt-3.5-turbo --probes security --exit-on-fail`	Fail CI on security issues
`garak --config ci_config.yaml --report-format json > results.json`	Generate CI-friendly output
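Where a built-in fail flag is unavailable, a CI job can gate on the report file itself. The sketch below assumes the report is JSON Lines and that evaluation entries carry `entry_type`, `probe`, `detector`, `passed`, and `total` fields. These field names are an assumption about garak's report layout and should be checked against a real report; the sample lines are made up for illustration.

```python
import json


def failing_evals(report_lines):
    """Return (probe, detector, failed, total) for eval entries with any failure.

    Assumes each eval line has 'entry_type', 'probe', 'detector',
    'passed' and 'total' fields; verify against a real report first.
    """
    failures = []
    for line in report_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if entry.get("entry_type") != "eval":
            continue  # skip attempts, config records, and other entry types
        failed = entry["total"] - entry["passed"]
        if failed > 0:
            failures.append((entry["probe"], entry["detector"], failed, entry["total"]))
    return failures


# Made-up sample lines standing in for a report file
sample = [
    '{"entry_type": "eval", "probe": "encoding.InjectBase64", "detector": "always.Pass", "passed": 10, "total": 10}',
    '{"entry_type": "eval", "probe": "malwaregen.Evasion", "detector": "mitigation.MitigationBypass", "passed": 7, "total": 10}',
    '{"entry_type": "attempt", "probe": "malwaregen.Evasion"}',
]
failures = failing_evals(sample)
for probe, detector, failed, total in failures:
    print(f"{probe} / {detector}: {failed} of {total} generations failed")
exit_code = 1 if failures else 0  # in CI: sys.exit(exit_code)
```

In a pipeline, the sample list would be replaced by the lines of the report file, and a non-zero exit code would mark the job as failed.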

Best Practices¶

Always test models before deploying them to production
Use multiple probe categories for comprehensive coverage
Set generation counts high enough for statistically meaningful results
Configure API rate limits properly to avoid throttling
Store sensitive API keys as environment variables
Review and analyze generated reports thoroughly
Run tests continuously in development pipelines
Track security test results over time
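The advice on generation counts can be made concrete with a confidence interval on the observed failure rate. The sketch below uses the Wilson score interval, a standard choice for small samples. It is a general statistical illustration rather than anything garak computes itself, and `wilson_interval` is a hypothetical helper.

```python
import math


def wilson_interval(failures, generations, z=1.96):
    """Wilson score interval for a failure rate (z=1.96 gives roughly 95%)."""
    if generations <= 0:
        raise ValueError("generations must be positive")
    p = failures / generations
    denom = 1 + z * z / generations
    centre = (p + z * z / (2 * generations)) / denom
    half = z * math.sqrt(p * (1 - p) / generations
                         + z * z / (4 * generations ** 2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot
    return max(0.0, centre - half), min(1.0, centre + half)


# Same observed failure rate (10%) at two different generation counts
for n in (10, 100):
    low, high = wilson_interval(failures=n // 10, generations=n)
    print(f"{n // 10}/{n} failures: true rate plausibly between {low:.3f} and {high:.3f}")
```

With the same observed rate, the interval at 100 generations is markedly narrower than at 10, which is the practical reason to raise `--generations` when a result needs to be trusted.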