# Ghidra Plugins Cheat Sheet
## Overview
Ghidra plugins extend the functionality of the NSA's Ghidra reverse-engineering framework. This guide covers essential plugins such as BinExport, GhidraBridge, and Ghidra2Frida, along with many others that improve collaborative analysis, integration with other tools, and advanced reverse-engineering capabilities.
Key benefits: improved collaboration, tool integration, automated analysis, better workflow efficiency, and functionality that extends beyond Ghidra's built-in features.
## Essential Plugin Categories
### Collaboration and Export Plugins
#### BinExport Plugin
```bash
# Installation
git clone https://github.com/google/binexport.git
cd binexport
mkdir build && cd build
cmake ..
make -j$(nproc)
# Install to Ghidra
cp BinExport.jar $GHIDRA_INSTALL_DIR/Extensions/Ghidra/
# Usage in Ghidra
# File -> Export Program -> BinExport (v2) for BinDiff
# File -> Export Program -> BinExport (v2) for BinNavi
# Command line export
$GHIDRA_INSTALL_DIR/support/analyzeHeadless \
/path/to/project ProjectName \
-import /path/to/binary \
-postScript BinExportScript.java \
-scriptPath /path/to/scripts
# Export formats
# .BinExport - For BinDiff comparison
# .BinExport2 - Enhanced format with more metadata
# SQL export - For BinNavi database import
```
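For regression-style diffing of two builds, the headless export can be scripted. A minimal Python sketch, reusing the post-script name shown above; the Ghidra, project, and script paths are placeholders:

```python
# Sketch: headless BinExport of two binary versions via analyzeHeadless.
# Adjust GHIDRA, PROJECT_DIR, and SCRIPTS to your environment.
import subprocess
from pathlib import Path

GHIDRA = Path("/opt/ghidra")                 # assumed install location
PROJECT_DIR = Path("/tmp/binexport_projects")
SCRIPTS = Path("/path/to/scripts")           # directory containing BinExportScript.java

def export_binexport(binary, project_name):
    """Import a binary headlessly and run the BinExport post-script."""
    PROJECT_DIR.mkdir(parents=True, exist_ok=True)
    cmd = [
        str(GHIDRA / "support" / "analyzeHeadless"),
        str(PROJECT_DIR), project_name,
        "-import", binary,
        "-postScript", "BinExportScript.java",
        "-scriptPath", str(SCRIPTS),
        "-overwrite",
    ]
    subprocess.run(cmd, check=True)

export_binexport("/path/to/app_v1", "AppV1")
export_binexport("/path/to/app_v2", "AppV2")
```

The resulting .BinExport files can then be loaded as the primary and secondary databases in BinDiff.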
#### GhidraBridge
```python
# Installation
pip install ghidra-bridge
# Server setup in Ghidra
# Run GhidraBridge server script in Ghidra Script Manager
# Window -> Script Manager -> GhidraBridge -> ghidra_bridge_server.py
# Python client usage
import ghidra_bridge
# Connect to Ghidra
b = ghidra_bridge.GhidraBridge(namespace=globals())
# Access Ghidra API from Python
current_program = b.getCurrentProgram()
print(f"Program: {current_program.getName()}")
# Get function manager
function_manager = current_program.getFunctionManager()
functions = function_manager.getFunctions(True)
# Iterate through functions
for func in functions:
print(f"Function: {func.getName()} at {func.getEntryPoint()}")
# Get function body
body = func.getBody()
print(f" Size: {body.getNumAddresses()} addresses")
# Get calling functions
callers = func.getCallingFunctions(None)
print(f" Callers: {len(list(callers))}")
# Advanced analysis with external tools
import networkx as nx
def build_call_graph():
"""Build call graph using NetworkX"""
G = nx.DiGraph()
for func in function_manager.getFunctions(True):
func_name = func.getName()
G.add_node(func_name)
# Add edges for function calls
for caller in func.getCallingFunctions(None):
G.add_edge(caller.getName(), func_name)
return G
# Export analysis results
def export_function_info():
"""Export function information to JSON"""
import json
functions_data = []
for func in function_manager.getFunctions(True):
func_data = {
'name': func.getName(),
'address': str(func.getEntryPoint()),
'size': func.getBody().getNumAddresses(),
'signature': func.getSignature().getPrototypeString()
}
functions_data.append(func_data)
with open('ghidra_functions.json', 'w') as f:
json.dump(functions_data, f, indent=2)
return functions_data
# Machine learning integration
def extract_features_for_ml():
"""Extract features for machine learning analysis"""
features = []
for func in function_manager.getFunctions(True):
# Extract various features
feature_vector = {
'name': func.getName(),
'size': func.getBody().getNumAddresses(),
'complexity': len(list(func.getCallingFunctions(None))),
'has_loops': False, # Would need more complex analysis
'instruction_count': 0,
'string_refs': 0,
'api_calls': 0
}
# Analyze function body for more features
instructions = current_program.getListing().getInstructions(func.getBody(), True)
for instruction in instructions:
feature_vector['instruction_count'] += 1
# Check for API calls, string references, etc.
# This would require more detailed analysis
features.append(feature_vector)
return features
```
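One practical note: every attribute access in the loops above is a separate round trip to Ghidra, so iterating thousands of functions through the bridge is slow. ghidra_bridge's `remote_eval` evaluates an expression inside Ghidra's Jython and ships back only the result; a minimal sketch:

```python
# Performance note: remote_eval() runs the expression server-side in Ghidra
# and returns only the collected (name, address) pairs.
import ghidra_bridge

b = ghidra_bridge.GhidraBridge(namespace=globals())

func_summaries = b.remote_eval(
    "[(f.getName(), f.getEntryPoint().toString())"
    " for f in currentProgram.getFunctionManager().getFunctions(True)]"
)
for name, addr in func_summaries:
    print(f"{addr}  {name}")
```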
### Integration and Automation Plugins
#### Ghidra2Frida
```python
# Installation
# Download from: https://github.com/federicodotta/Ghidra2Frida
# Place in Ghidra Extensions directory
# Usage in Ghidra Script Manager
# Generate Frida hooks for functions
# Example generated Frida script
frida_script = """
// Auto-generated Frida script from Ghidra
// Hook function at 0x401000
Interceptor.attach(ptr("0x401000"), {
onEnter: function(args) {
console.log("[+] Entering function_name");
console.log(" arg0: " + args[0]);
console.log(" arg1: " + args[1]);
// Log stack trace
console.log("Stack trace:");
console.log(Thread.backtrace(this.context, Backtracer.ACCURATE)
.map(DebugSymbol.fromAddress).join("\\n"));
},
onLeave: function(retval) {
console.log("[+] Leaving function_name");
console.log(" Return value: " + retval);
}
});
// Hook string functions
var strcpy = Module.findExportByName(null, "strcpy");
if (strcpy) {
Interceptor.attach(strcpy, {
onEnter: function(args) {
console.log("[strcpy] dest: " + args[0] + ", src: " + Memory.readUtf8String(args[1]));
}
});
}
// Memory scanning for patterns
function scanForPattern(pattern) {
var ranges = Process.enumerateRanges('r--');
ranges.forEach(function(range) {
Memory.scan(range.base, range.size, pattern, {
onMatch: function(address, size) {
console.log("[+] Pattern found at: " + address);
},
onComplete: function() {
console.log("[+] Scan complete for range: " + range.base);
}
});
});
}
// Usage
scanForPattern("41 41 41 41"); // Search for AAAA pattern
"""
# Save and use with Frida
with open('ghidra_hooks.js', 'w') as f:
f.write(frida_script)
# Run with Frida
# frida -l ghidra_hooks.js -f target_binary
```
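To drive the generated hooks without the frida CLI, the Frida Python bindings can load the same script. A minimal sketch; the target name and script path are placeholders:

```python
# Sketch: load the generated hooks with Frida's Python bindings instead of the CLI.
import sys
import frida

device = frida.get_local_device()
pid = device.spawn(["./target_binary"])        # spawn suspended
session = device.attach(pid)

with open("ghidra_hooks.js") as f:
    script = session.create_script(f.read())

# Surface send()/error messages from the injected script.
script.on("message", lambda message, data: print(message))
script.load()
device.resume(pid)                             # let the target run with hooks in place

sys.stdin.read()                               # keep the session alive until EOF
```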
#### Ghidra Jupyter Integration
```python
# Installation and setup
pip install jupyter ghidra-bridge matplotlib pandas
# Jupyter notebook cell
import ghidra_bridge
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# Connect to Ghidra
b = ghidra_bridge.GhidraBridge(namespace=globals())
# Function size analysis
def analyze_function_sizes():
"""Analyze and visualize function sizes"""
sizes = []
names = []
function_manager = getCurrentProgram().getFunctionManager()
for func in function_manager.getFunctions(True):
size = func.getBody().getNumAddresses()
sizes.append(size)
names.append(func.getName())
# Create DataFrame
df = pd.DataFrame({
'function': names,
'size': sizes
})
# Statistical analysis
print(f"Total functions: {len(df)}")
print(f"Average size: {df['size'].mean():.2f}")
print(f"Median size: {df['size'].median():.2f}")
print(f"Largest function: {df.loc[df['size'].idxmax(), 'function']} ({df['size'].max()} bytes)")
# Visualization
plt.figure(figsize=(12, 8))
# Histogram
plt.subplot(2, 2, 1)
plt.hist(df['size'], bins=50, alpha=0.7)
plt.xlabel('Function Size (bytes)')
plt.ylabel('Frequency')
plt.title('Function Size Distribution')
# Top 10 largest functions
plt.subplot(2, 2, 2)
top_10 = df.nlargest(10, 'size')
plt.barh(range(len(top_10)), top_10['size'])
plt.yticks(range(len(top_10)), top_10['function'])
plt.xlabel('Size (bytes)')
plt.title('Top 10 Largest Functions')
# Box plot
plt.subplot(2, 2, 3)
plt.boxplot(df['size'])
plt.ylabel('Size (bytes)')
plt.title('Function Size Box Plot')
# Cumulative distribution
plt.subplot(2, 2, 4)
sorted_sizes = np.sort(df['size'])
cumulative = np.arange(1, len(sorted_sizes) + 1) / len(sorted_sizes)
plt.plot(sorted_sizes, cumulative)
plt.xlabel('Function Size (bytes)')
plt.ylabel('Cumulative Probability')
plt.title('Cumulative Distribution')
plt.tight_layout()
plt.show()
return df
# Cross-reference analysis
def analyze_cross_references():
"""Analyze cross-references between functions"""
reference_manager = getCurrentProgram().getReferenceManager()
# Build reference graph
ref_data = []
function_manager = getCurrentProgram().getFunctionManager()
for func in function_manager.getFunctions(True):
func_addr = func.getEntryPoint()
# Get references TO this function
refs_to = reference_manager.getReferencesTo(func_addr)
for ref in refs_to:
from_addr = ref.getFromAddress()
from_func = function_manager.getFunctionContaining(from_addr)
if from_func:
ref_data.append({
'from_function': from_func.getName(),
'to_function': func.getName(),
'reference_type': str(ref.getReferenceType())
})
# Create DataFrame
ref_df = pd.DataFrame(ref_data)
if not ref_df.empty:
# Most referenced functions
most_referenced = ref_df['to_function'].value_counts().head(10)
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
most_referenced.plot(kind='bar')
plt.title('Most Referenced Functions')
plt.xlabel('Function')
plt.ylabel('Reference Count')
plt.xticks(rotation=45)
# Reference type distribution
plt.subplot(1, 2, 2)
ref_df['reference_type'].value_counts().plot(kind='pie', autopct='%1.1f%%')
plt.title('Reference Type Distribution')
plt.tight_layout()
plt.show()
return ref_df
# String analysis
def analyze_strings():
"""Analyze strings in the binary"""
listing = getCurrentProgram().getListing()
memory = getCurrentProgram().getMemory()
strings_data = []
# Get all defined strings
data_iterator = listing.getDefinedData(True)
for data in data_iterator:
if data.hasStringValue():
string_value = data.getValue()
if string_value and len(str(string_value)) > 3:
strings_data.append({
'address': str(data.getAddress()),
'string': str(string_value),
'length': len(str(string_value)),
'type': str(data.getDataType())
})
# Create DataFrame
strings_df = pd.DataFrame(strings_data)
if not strings_df.empty:
# String length analysis
plt.figure(figsize=(12, 8))
plt.subplot(2, 2, 1)
plt.hist(strings_df['length'], bins=30, alpha=0.7)
plt.xlabel('String Length')
plt.ylabel('Frequency')
plt.title('String Length Distribution')
# Longest strings
plt.subplot(2, 2, 2)
longest = strings_df.nlargest(10, 'length')
plt.barh(range(len(longest)), longest['length'])
plt.yticks(range(len(longest)), [s[:30] + '...' if len(s) > 30 else s for s in longest['string']])
plt.xlabel('Length')
plt.title('Longest Strings')
# String type distribution
plt.subplot(2, 2, 3)
strings_df['type'].value_counts().plot(kind='pie', autopct='%1.1f%%')
plt.title('String Type Distribution')
plt.tight_layout()
plt.show()
# Interesting strings (potential passwords, URLs, etc.)
interesting_patterns = [
r'password', r'passwd', r'pwd',
r'http[s]?://', r'ftp://',
r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}', # Email
r'[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}', # IP address
]
import re
interesting_strings = []
for _, row in strings_df.iterrows():
string_val = row['string'].lower()
for pattern in interesting_patterns:
if re.search(pattern, string_val, re.IGNORECASE):
interesting_strings.append(row)
break
if interesting_strings:
print("Interesting strings found:")
for string_info in interesting_strings[:10]:
print(f" {string_info['address']}: {string_info['string'][:50]}")
return strings_df
# Run analyses
function_df = analyze_function_sizes()
ref_df = analyze_cross_references()
strings_df = analyze_strings()
```
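Beyond the plots, it is often useful to persist the notebook results so different samples can be compared later. A small follow-up sketch using the DataFrames built above; the output directory is arbitrary:

```python
# Persist the notebook results for later comparison between samples.
from pathlib import Path

out_dir = Path("analysis_output")
out_dir.mkdir(exist_ok=True)

function_df.to_csv(out_dir / "functions.csv", index=False)
if ref_df is not None and not ref_df.empty:
    ref_df.to_csv(out_dir / "references.csv", index=False)
if strings_df is not None and not strings_df.empty:
    strings_df.to_csv(out_dir / "strings.csv", index=False)

# Quick outlier report: functions more than three standard deviations above the mean size.
threshold = function_df["size"].mean() + 3 * function_df["size"].std()
print(function_df[function_df["size"] > threshold].sort_values("size", ascending=False))
```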
### Advanced Analysis Plugins
#### Ghidra Decompiler Extensions
```java
// Custom decompiler plugin example
// Place in Ghidra/Features/Decompiler/src/main/java/
import ghidra.app.decompiler.*;
import ghidra.program.model.listing.*;
import ghidra.program.model.pcode.*;
import java.util.Iterator;
public class CustomDecompilerAnalysis {
public void analyzeFunction(Function function, DecompInterface decompiler) {
// Get high-level representation
DecompileResults results = decompiler.decompileFunction(function, 30, null);
if (results.decompileCompleted()) {
HighFunction highFunction = results.getHighFunction();
// Analyze control flow
analyzeControlFlow(highFunction);
// Analyze data flow
analyzeDataFlow(highFunction);
// Detect patterns
detectSecurityPatterns(highFunction);
}
}
private void analyzeControlFlow(HighFunction highFunction) {
// Get basic blocks
Iterator<PcodeBlockBasic> blocks = highFunction.getBasicBlocks().iterator();
while (blocks.hasNext()) {
PcodeBlockBasic block = blocks.next();
// Analyze block structure
System.out.println("Block: " + block.getStart() + " to " + block.getStop());
// Get successors and predecessors
for (int i = 0; i < block.getOutSize(); i++) {
PcodeBlock successor = block.getOut(i);
System.out.println(" Successor: " + successor.getStart());
}
}
}
private void analyzeDataFlow(HighFunction highFunction) {
// Get all variables
Iterator<HighSymbol> symbols = highFunction.getLocalSymbolMap().getSymbols();
while (symbols.hasNext()) {
HighSymbol symbol = symbols.next();
// Analyze variable usage
HighVariable variable = symbol.getHighVariable();
if (variable != null) {
System.out.println("Variable: " + symbol.getName());
System.out.println(" Type: " + variable.getDataType());
System.out.println(" Size: " + variable.getSize());
// Get def-use information
Iterator<PcodeOp> defs = variable.getRepresentative().getDescendants();
while (defs.hasNext()) {
PcodeOp def = defs.next();
System.out.println(" Used in: " + def.getOpcode());
}
}
}
}
private void detectSecurityPatterns(HighFunction highFunction) {
// Look for dangerous function calls
String[] dangerousFunctions = {
"strcpy", "strcat", "sprintf", "gets", "scanf"
};
// Analyze P-code operations
Iterator<PcodeOpAST> ops = highFunction.getPcodeOps();
while (ops.hasNext()) {
PcodeOpAST op = ops.next();
if (op.getOpcode() == PcodeOp.CALL) {
// Check if it's a call to dangerous function
Varnode target = op.getInput(0);
if (target.isAddress()) {
// Get function name at target address
// Check against dangerous functions list
System.out.println("Potential security issue: dangerous function call");
}
}
// Look for buffer operations
if (op.getOpcode() == PcodeOp.COPY || op.getOpcode() == PcodeOp.STORE) {
// Analyze for potential buffer overflows
analyzeBufferOperation(op);
}
}
}
private void analyzeBufferOperation(PcodeOpAST op) {
// Simplified buffer overflow detection
Varnode output = op.getOutput();
if (output != null && output.getSize() > 0) {
// Check if operation could exceed buffer bounds
System.out.println("Buffer operation detected at: " + op.getSeqnum().getTarget());
}
}
}
```
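The same decompiler interface is also reachable from a Python script in the Script Manager, which is often faster for prototyping than building a Java extension. A minimal sketch, assuming it runs inside Ghidra where `currentProgram` and `monitor` are predefined:

```python
# Sketch: drive the DecompInterface from a Ghidra Python script.
from ghidra.app.decompiler import DecompInterface
from ghidra.program.model.pcode import PcodeOp

decomp = DecompInterface()
decomp.openProgram(currentProgram)

fm = currentProgram.getFunctionManager()
for func in fm.getFunctions(True):
    results = decomp.decompileFunction(func, 30, monitor)  # 30-second timeout per function
    if not results.decompileCompleted():
        continue
    high_func = results.getHighFunction()
    # Count P-code CALL operations as a rough proxy for outgoing calls.
    call_count = 0
    ops = high_func.getPcodeOps()
    while ops.hasNext():
        if ops.next().getOpcode() == PcodeOp.CALL:
            call_count += 1
    print("%s: %d P-code calls" % (func.getName(), call_count))

decomp.dispose()
```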
#### Ghidra Script Extensions
```python
# Advanced Ghidra scripting examples
import math  # used by calculate_similarity() below
# Crypto detection script
def detect_crypto_constants():
"""Detect cryptographic constants in binary"""
# Common crypto constants
crypto_constants = {
0x67452301: "MD5 initial value A",
0xEFCDAB89: "MD5 initial value B",
0x98BADCFE: "MD5 initial value C",
0x10325476: "MD5 initial value D",
0x6A09E667: "SHA-256 initial value H0",
0xBB67AE85: "SHA-256 initial value H1",
0x3C6EF372: "SHA-256 initial value H2",
0xA54FF53A: "SHA-256 initial value H3",
0x428A2F98: "SHA-256 round constant K0",
0x71374491: "SHA-256 round constant K1",
0x9E3779B9: "TEA delta constant",
0x61C88647: "XTEA delta constant"
}
memory = getCurrentProgram().getMemory()
found_constants = []
# Search for constants in memory
for block in memory.getBlocks():
if block.isInitialized():
block_start = block.getStart()
block_end = block.getEnd()
# Search 4-byte aligned addresses
addr = block_start
while addr.compareTo(block_end) < 0:
try:
# Read 4 bytes as integer
value = memory.getInt(addr)
if value in crypto_constants:
found_constants.append({
'address': addr,
'value': hex(value),
'description': crypto_constants[value]
})
# Create comment
setEOLComment(addr, crypto_constants[value])
addr = addr.add(4)
except:
addr = addr.add(1)
# Print results
print(f"Found {len(found_constants)} crypto constants:")
for const in found_constants:
print(f" {const['address']}: {const['value']} - {const['description']}")
return found_constants
# Function similarity analysis
def analyze_function_similarity():
"""Analyze similarity between functions"""
function_manager = getCurrentProgram().getFunctionManager()
functions = list(function_manager.getFunctions(True))
# Extract features for each function
function_features = {}
for func in functions:
features = extract_function_features(func)
function_features[func.getName()] = features
# Compare functions
similarities = []
for i, func1 in enumerate(functions):
for func2 in functions[i+1:]:
similarity = calculate_similarity(
function_features[func1.getName()],
function_features[func2.getName()]
)
if similarity > 0.8: # High similarity threshold
similarities.append({
'function1': func1.getName(),
'function2': func2.getName(),
'similarity': similarity,
'addr1': func1.getEntryPoint(),
'addr2': func2.getEntryPoint()
})
# Sort by similarity
similarities.sort(key=lambda x: x['similarity'], reverse=True)
print(f"Found {len(similarities)} similar function pairs:")
for sim in similarities[:10]: # Top 10
print(f" {sim['function1']} <-> {sim['function2']}: {sim['similarity']:.3f}")
return similarities
def extract_function_features(function):
"""Extract features from function for similarity analysis"""
features = {
'size': function.getBody().getNumAddresses(),
'block_count': 0,
'call_count': 0,
'instruction_types': {},
'string_refs': 0,
'api_calls': []
}
# Analyze basic blocks
body = function.getBody()
listing = getCurrentProgram().getListing()
# Count instructions and types
instructions = listing.getInstructions(body, True)
for instruction in instructions:
mnemonic = instruction.getMnemonicString()
features['instruction_types'][mnemonic] = features['instruction_types'].get(mnemonic, 0) + 1
# Count calls
if instruction.getFlowType().isCall():
features['call_count'] += 1
# Get call target
refs = instruction.getReferencesFrom()
for ref in refs:
if ref.getReferenceType().isCall():
target_addr = ref.getToAddress()
target_func = getCurrentProgram().getFunctionManager().getFunctionAt(target_addr)
if target_func:
features['api_calls'].append(target_func.getName())
return features
def calculate_similarity(features1, features2):
"""Calculate similarity between two feature sets"""
# Simple similarity based on instruction type distribution
types1 = features1['instruction_types']
types2 = features2['instruction_types']
# Get all instruction types
all_types = set(types1.keys()) | set(types2.keys())
if not all_types:
return 0.0
# Calculate cosine similarity
dot_product = 0
norm1 = 0
norm2 = 0
for inst_type in all_types:
count1 = types1.get(inst_type, 0)
count2 = types2.get(inst_type, 0)
dot_product += count1 * count2
norm1 += count1 * count1
norm2 += count2 * count2
if norm1 == 0 or norm2 == 0:
return 0.0
return dot_product / (math.sqrt(norm1) * math.sqrt(norm2))
# Automated vulnerability detection
def detect_vulnerabilities():
"""Detect potential vulnerabilities in code"""
vulnerabilities = []
# Dangerous function patterns
dangerous_functions = {
'strcpy': 'Buffer overflow risk - no bounds checking',
'strcat': 'Buffer overflow risk - no bounds checking',
'sprintf': 'Buffer overflow risk - no bounds checking',
'gets': 'Buffer overflow risk - reads unlimited input',
'scanf': 'Buffer overflow risk with %s format',
'system': 'Command injection risk',
'exec': 'Command injection risk',
'eval': 'Code injection risk'
}
function_manager = getCurrentProgram().getFunctionManager()
# Check for dangerous function calls
for func in function_manager.getFunctions(True):
body = func.getBody()
listing = getCurrentProgram().getListing()
instructions = listing.getInstructions(body, True)
for instruction in instructions:
if instruction.getFlowType().isCall():
refs = instruction.getReferencesFrom()
for ref in refs:
if ref.getReferenceType().isCall():
target_addr = ref.getToAddress()
target_func = function_manager.getFunctionAt(target_addr)
if target_func:
func_name = target_func.getName()
for dangerous_func, description in dangerous_functions.items():
if dangerous_func in func_name.lower():
vulnerabilities.append({
'type': 'dangerous_function_call',
'function': func.getName(),
'address': instruction.getAddress(),
'dangerous_function': func_name,
'description': description,
'severity': 'high' if dangerous_func in ['gets', 'system'] else 'medium'
})
# Check for format string vulnerabilities
detect_format_string_vulns(vulnerabilities)
# Check for integer overflow patterns
detect_integer_overflow_patterns(vulnerabilities)
# Print results
print(f"Found {len(vulnerabilities)} potential vulnerabilities:")
for vuln in vulnerabilities:
print(f" [{vuln['severity'].upper()}] {vuln['type']} in {vuln['function']}")
print(f" Address: {vuln['address']}")
print(f" Description: {vuln['description']}")
return vulnerabilities
def detect_format_string_vulns(vulnerabilities):
"""Detect format string vulnerabilities"""
# Look for printf-family functions with user-controlled format strings
printf_functions = ['printf', 'fprintf', 'sprintf', 'snprintf', 'vprintf']
function_manager = getCurrentProgram().getFunctionManager()
for func in function_manager.getFunctions(True):
# Analyze function for printf calls
# This is a simplified detection - real analysis would need data flow
pass
def detect_integer_overflow_patterns(vulnerabilities):
"""Detect potential integer overflow patterns"""
# Look for arithmetic operations without bounds checking
# This is a simplified detection
pass
# Run analysis scripts
crypto_constants = detect_crypto_constants()
similar_functions = analyze_function_similarity()
vulnerabilities = detect_vulnerabilities()
```
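To triage the findings outside Ghidra, the three result lists produced above can be dumped to JSON. A small sketch; field names follow the dictionaries built by the scripts above:

```python
# Sketch: export the collected findings for triage outside Ghidra.
# Addresses are stringified because Ghidra Address objects are not JSON-serializable.
import json

report = {
    "crypto_constants": [
        {"address": str(c["address"]), "value": c["value"], "description": c["description"]}
        for c in crypto_constants
    ],
    "similar_functions": [
        {"pair": [s["function1"], s["function2"]], "score": s["similarity"]}
        for s in similar_functions
    ],
    "vulnerabilities": [
        {"type": v["type"], "function": v["function"],
         "address": str(v["address"]), "severity": v["severity"]}
        for v in vulnerabilities
    ],
}

with open("ghidra_findings.json", "w") as f:
    json.dump(report, f, indent=2)

print("Wrote %d potential issues to ghidra_findings.json" % len(report["vulnerabilities"]))
```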
### Utility and Helper Plugins
#### Ghidra Batch Processing
```python
# Batch processing utilities for Ghidra
import os
import json
import subprocess
from pathlib import Path
class GhidraBatchProcessor:
def __init__(self, ghidra_path, project_path):
self.ghidra_path = Path(ghidra_path)
self.project_path = Path(project_path)
self.analyze_headless = self.ghidra_path / "support" / "analyzeHeadless"
def batch_analyze(self, binary_paths, scripts=None, output_dir=None):
"""Batch analyze multiple binaries"""
if output_dir is None:
output_dir = Path("./batch_analysis_results")
output_dir.mkdir(exist_ok=True)
results = []
for binary_path in binary_paths:
binary_path = Path(binary_path)
print(f"Analyzing: {binary_path.name}")
# Create project for this binary
project_name = f"batch_{binary_path.stem}"
# Build command
cmd = [
str(self.analyze_headless),
str(self.project_path),
project_name,
"-import", str(binary_path),
"-overwrite"
]
# Add scripts if specified
if scripts:
for script in scripts:
cmd.extend(["-postScript", script])
# Add output directory
cmd.extend(["-scriptPath", str(output_dir)])
try:
# Run analysis
result = subprocess.run(cmd, capture_output=True, text=True, timeout=300)
analysis_result = {
'binary': str(binary_path),
'project': project_name,
'success': result.returncode == 0,
'stdout': result.stdout,
'stderr': result.stderr
}
results.append(analysis_result)
# Save individual result
result_file = output_dir / f"{binary_path.stem}_result.json"
with open(result_file, 'w') as f:
json.dump(analysis_result, f, indent=2)
except subprocess.TimeoutExpired:
print(f"Timeout analyzing {binary_path.name}")
results.append({
'binary': str(binary_path),
'project': project_name,
'success': False,
'error': 'timeout'
})
# Save batch results
batch_result_file = output_dir / "batch_results.json"
with open(batch_result_file, 'w') as f:
json.dump(results, f, indent=2)
return results
def export_all_functions(self, binary_path, output_format='json'):
"""Export all functions from a binary"""
script_content = f"""
# Export functions script
import json
def export_functions():
program = getCurrentProgram()
function_manager = program.getFunctionManager()
functions_data = []
for func in function_manager.getFunctions(True):
func_data = {{
'name': func.getName(),
'address': str(func.getEntryPoint()),
'size': func.getBody().getNumAddresses(),
'signature': func.getSignature().getPrototypeString() if func.getSignature() else None,
'calling_convention': str(func.getCallingConvention()) if func.getCallingConvention() else None,
'parameter_count': func.getParameterCount(),
'local_variable_count': len(func.getLocalVariables()),
'is_thunk': func.isThunk(),
'is_external': func.isExternal()
}}
# Get function calls
calls = []
body = func.getBody()
listing = program.getListing()
instructions = listing.getInstructions(body, True)
for instruction in instructions:
if instruction.getFlowType().isCall():
refs = instruction.getReferencesFrom()
for ref in refs:
if ref.getReferenceType().isCall():
target_addr = ref.getToAddress()
target_func = function_manager.getFunctionAt(target_addr)
if target_func:
calls.append(target_func.getName())
func_data['calls'] = calls
functions_data.append(func_data)
# Save to file
output_file = "{binary_path.stem}_functions.{output_format}"
with open(output_file, 'w') as f:
json.dump(functions_data, f, indent=2)
print(f"Exported {{len(functions_data)}} functions to {{output_file}}")
export_functions()
"""
# Save script
script_file = Path("export_functions.py")
with open(script_file, 'w') as f:
f.write(script_content)
# Run analysis with script
return self.batch_analyze([binary_path], scripts=[str(script_file)])
# Usage example
def run_batch_analysis():
"""Example of running batch analysis"""
# Setup
ghidra_path = "/opt/ghidra" # Adjust path
project_path = "/tmp/ghidra_projects"
processor = GhidraBatchProcessor(ghidra_path, project_path)
# Find binaries to analyze
binary_paths = [
"/bin/ls",
"/bin/cat",
"/bin/echo"
]
# Custom analysis scripts
analysis_scripts = [
"export_functions.py",
"detect_crypto.py",
"analyze_strings.py"
]
# Run batch analysis
results = processor.batch_analyze(binary_paths, scripts=analysis_scripts)
# Print summary
successful = sum(1 for r in results if r['success'])
print(f"Batch analysis complete: {successful}/{len(results)} successful")
return results
# Ghidra project management utilities
class GhidraProjectManager:
def __init__(self, ghidra_path):
self.ghidra_path = Path(ghidra_path)
def create_project(self, project_path, project_name):
"""Create new Ghidra project"""
cmd = [
str(self.ghidra_path / "support" / "analyzeHeadless"),
str(project_path),
project_name,
"-create"
]
result = subprocess.run(cmd, capture_output=True, text=True)
return result.returncode == 0
def import_binary(self, project_path, project_name, binary_path, analyze=True):
"""Import binary into project"""
cmd = [
str(self.ghidra_path / "support" / "analyzeHeadless"),
str(project_path),
project_name,
"-import", str(binary_path)
]
if not analyze:
cmd.append("-noanalysis")
result = subprocess.run(cmd, capture_output=True, text=True)
return result.returncode == 0
def export_project(self, project_path, project_name, export_path, format_type="xml"):
"""Export project data"""
export_script = f"""
# Export project script
import os
def export_project_data():
program = getCurrentProgram()
# Export program as XML
from ghidra.app.util.exporter import XmlExporter
exporter = XmlExporter()
export_file = "{export_path}"
# Configure export options
options = exporter.getDefaultOptions()
# Perform export
success = exporter.export(export_file, program, None, None)
if success:
print(f"Project exported to {{export_file}}")
else:
print("Export failed")
export_project_data()
"""
# Save and run export script
script_file = Path("export_project.py")
with open(script_file, 'w') as f:
f.write(export_script)
cmd = [
str(self.ghidra_path / "support" / "analyzeHeadless"),
str(project_path),
project_name,
"-postScript", str(script_file)
]
result = subprocess.run(cmd, capture_output=True, text=True)
return result.returncode == 0
# Run examples
if __name__ == "__main__":
# Run batch analysis
batch_results = run_batch_analysis()
# Project management example
ghidra_path = "/opt/ghidra"
manager = GhidraProjectManager(ghidra_path)
# Create project
manager.create_project("/tmp/test_project", "TestProject")
# Import binary
manager.import_binary("/tmp/test_project", "TestProject", "/bin/ls")
# Export project
manager.export_project("/tmp/test_project", "TestProject", "/tmp/exported_project.xml")
```
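Because each analyzeHeadless run is an independent subprocess, several binaries can be processed concurrently. A sketch built on the GhidraBatchProcessor class above; the paths and worker count are assumptions, and headless analysis is memory-hungry, so keep the pool small:

```python
# Sketch: analyze several binaries concurrently with the class defined above.
# Per-binary project names are created by batch_analyze(), avoiding collisions.
from concurrent.futures import ThreadPoolExecutor

def analyze_one(binary_path):
    processor = GhidraBatchProcessor("/opt/ghidra", "/tmp/ghidra_projects")
    return processor.batch_analyze([binary_path], scripts=["export_functions.py"])

binaries = ["/bin/ls", "/bin/cat", "/bin/echo"]
with ThreadPoolExecutor(max_workers=2) as pool:   # small pool: each job spawns a JVM
    all_results = [r for batch in pool.map(analyze_one, binaries) for r in batch]

successful = sum(1 for r in all_results if r.get("success"))
print("%d/%d analyses succeeded" % (successful, len(all_results)))
```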
## Plugin Development
### Creating Custom Plugins
```java
// Custom Ghidra plugin template
// Place in Ghidra/Features/Base/src/main/java/
import ghidra.app.plugin.PluginCategoryNames;
import ghidra.app.plugin.ProgramPlugin;
import ghidra.framework.plugintool.*;
import ghidra.framework.plugintool.util.PluginStatus;
import ghidra.program.model.address.Address;
import ghidra.program.model.listing.*;
import docking.ActionContext;
import docking.action.DockingAction;
import docking.action.MenuData;
import java.util.*;
@PluginInfo(
status = PluginStatus.STABLE,
packageName = "CustomAnalysis",
category = PluginCategoryNames.ANALYSIS,
shortDescription = "Custom analysis plugin",
description = "Performs custom binary analysis tasks"
)
public class CustomAnalysisPlugin extends ProgramPlugin {
public CustomAnalysisPlugin(PluginTool tool) {
super(tool, true, true);
// Initialize plugin
setupActions();
}
private void setupActions() {
// Create menu actions
DockingAction analyzeAction = new DockingAction("Custom Analysis", getName()) {
@Override
public void actionPerformed(ActionContext context) {
performCustomAnalysis();
}
};
analyzeAction.setMenuBarData(new MenuData(
new String[]{"Analysis", "Custom Analysis"},
"CustomAnalysis"
));
analyzeAction.setDescription("Run custom analysis");
analyzeAction.setEnabled(true);
tool.addAction(analyzeAction);
}
private void performCustomAnalysis() {
Program program = getCurrentProgram();
if (program == null) {
return;
}
// Perform analysis
CustomAnalyzer analyzer = new CustomAnalyzer(program);
analyzer.analyze();
// Display results
displayResults(analyzer.getResults());
}
private void displayResults(AnalysisResults results) {
// Create results dialog or panel
CustomResultsDialog dialog = new CustomResultsDialog(results);
tool.showDialog(dialog);
}
@Override
protected void programActivated(Program program) {
// Called when program becomes active
super.programActivated(program);
}
@Override
protected void programDeactivated(Program program) {
// Called when program becomes inactive
super.programDeactivated(program);
}
}
// Custom analyzer class
class CustomAnalyzer {
private Program program;
private AnalysisResults results;
public CustomAnalyzer(Program program) {
this.program = program;
this.results = new AnalysisResults();
}
public void analyze() {
// Perform custom analysis
analyzeFunctions();
analyzeStrings();
analyzeReferences();
}
private void analyzeFunctions() {
FunctionManager functionManager = program.getFunctionManager();
FunctionIterator functions = functionManager.getFunctions(true);
while (functions.hasNext()) {
Function function = functions.next();
// Analyze function
FunctionAnalysis analysis = new FunctionAnalysis();
analysis.setName(function.getName());
analysis.setAddress(function.getEntryPoint());
analysis.setSize(function.getBody().getNumAddresses());
// Add complexity metrics
analysis.setComplexity(calculateComplexity(function));
results.addFunctionAnalysis(analysis);
}
}
private int calculateComplexity(Function function) {
// Simple complexity calculation
return function.getBody().getNumAddresses() / 10;
}
private void analyzeStrings() {
// String analysis implementation
}
private void analyzeReferences() {
// Reference analysis implementation
}
public AnalysisResults getResults() {
return results;
}
}
// Results data structure
class AnalysisResults {
private List<FunctionAnalysis> functionAnalyses;
private List<StringAnalysis> stringAnalyses;
public AnalysisResults() {
this.functionAnalyses = new ArrayList<>();
this.stringAnalyses = new ArrayList<>();
}
public void addFunctionAnalysis(FunctionAnalysis analysis) {
functionAnalyses.add(analysis);
}
public List<FunctionAnalysis> getFunctionAnalyses() {
return functionAnalyses;
}
}
class FunctionAnalysis {
private String name;
private Address address;
private long size;
private int complexity;
// Getters and setters
public void setName(String name) { this.name = name; }
public String getName() { return name; }
public void setAddress(Address address) { this.address = address; }
public Address getAddress() { return address; }
public void setSize(long size) { this.size = size; }
public long getSize() { return size; }
public void setComplexity(int complexity) { this.complexity = complexity; }
public int getComplexity() { return complexity; }
}
```
### Plugin Configuration and Deployment
```bash
# Plugin build and deployment
# 1. Build plugin
cd $GHIDRA_INSTALL_DIR
./gradlew buildExtension -PGHIDRA_INSTALL_DIR=$GHIDRA_INSTALL_DIR
# 2. Install plugin
cp dist/CustomAnalysisPlugin.zip $GHIDRA_INSTALL_DIR/Extensions/Ghidra/
# 3. Enable plugin in Ghidra
# File -> Configure -> Configure Plugins -> Check your plugin
# 4. Plugin directory structure
mkdir -p MyCustomPlugin/src/main/java/mypackage
mkdir -p MyCustomPlugin/src/main/resources
mkdir -p MyCustomPlugin/data
# 5. Create extension.properties
cat > MyCustomPlugin/extension.properties << EOF
name=MyCustomPlugin
description=Custom analysis plugin for Ghidra
author=Your Name
createdOn=2025-01-01
version=1.0
EOF
# 6. Create build.gradle
cat > MyCustomPlugin/build.gradle << EOF
apply from: "\$rootProject.projectDir/gradle/javaProject.gradle"
apply from: "\$rootProject.projectDir/gradle/helpProject.gradle"
apply from: "\$rootProject.projectDir/gradle/distributableGhidraModule.gradle"
dependencies {
api project(':Base')
api project(':Decompiler')
}
EOF
# 7. Build and package
./gradlew :MyCustomPlugin:buildExtension
# 8. Install extension
unzip dist/MyCustomPlugin.zip -d $GHIDRA_INSTALL_DIR/Extensions/Ghidra/
```
## Integration Examples
### CI/CD Integration
```yaml
# GitHub Actions workflow for Ghidra analysis
name: Ghidra Binary Analysis
on:
push:
paths:
- 'binaries/**'
pull_request:
paths:
- 'binaries/**'
jobs:
ghidra-analysis:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v3
- name: Setup Java
uses: actions/setup-java@v3
with:
java-version: '17'
distribution: 'temurin'
- name: Download Ghidra
run: |
wget https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_10.4_build/ghidra_10.4_PUBLIC_20230928.zip
unzip ghidra_10.4_PUBLIC_20230928.zip
export GHIDRA_INSTALL_DIR=$PWD/ghidra_10.4_PUBLIC
- name: Install Ghidra plugins
run: |
# Install BinExport
git clone https://github.com/google/binexport.git
cd binexport
mkdir build && cd build
cmake ..
make -j$(nproc)
cp BinExport.jar $GHIDRA_INSTALL_DIR/Extensions/Ghidra/
- name: Run Ghidra analysis
run: |
# Create analysis script
cat > analyze_binary.py << 'EOF'
import json
import os
def analyze_program():
program = getCurrentProgram()
if not program:
return
results = {
'binary_name': program.getName(),
'architecture': str(program.getLanguage().getProcessor()),
'entry_point': str(program.getImageBase().add(program.getAddressFactory().getDefaultAddressSpace().getMinAddress())),
'functions': [],
'strings': [],
'imports': []
}
# Analyze functions
function_manager = program.getFunctionManager()
for func in function_manager.getFunctions(True):
func_data = {
'name': func.getName(),
'address': str(func.getEntryPoint()),
'size': func.getBody().getNumAddresses()
}
results['functions'].append(func_data)
# Export results
output_file = os.path.join(os.getcwd(), 'analysis_results.json')
with open(output_file, 'w') as f:
json.dump(results, f, indent=2)
print(f"Analysis complete. Results saved to {output_file}")
analyze_program()
EOF
# Run analysis on all binaries
for binary in binaries/*; do
if [ -f "$binary" ]; then
echo "Analyzing $binary"
$GHIDRA_INSTALL_DIR/support/analyzeHeadless \
/tmp/ghidra_projects \
"CI_Analysis_$(basename $binary)" \
-import "$binary" \
-postScript analyze_binary.py \
-overwrite
fi
done
- name: Upload analysis results
uses: actions/upload-artifact@v3
with:
name: ghidra-analysis-results
path: analysis_results.json
- name: Security scan results
run: |
# Parse results for security issues
python3 << 'EOF'
import json
import sys
try:
with open('analysis_results.json', 'r') as f:
results = json.load(f)
# Check for dangerous functions
dangerous_functions = ['strcpy', 'gets', 'sprintf', 'system']
security_issues = []
for func in results.get('functions', []):
func_name = func['name'].lower()
for dangerous in dangerous_functions:
if dangerous in func_name:
security_issues.append({
'type': 'dangerous_function',
'function': func['name'],
'address': func['address'],
'issue': f'Potentially dangerous function: {dangerous}'
})
if security_issues:
print("Security issues found:")
for issue in security_issues:
print(f" - {issue['issue']} in {issue['function']} at {issue['address']}")
sys.exit(1)
else:
print("No security issues detected")
except FileNotFoundError:
print("Analysis results not found")
sys.exit(1)
EOF
```
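A companion helper (not part of the workflow above) can turn analysis_results.json into a Markdown summary suitable for a job summary or PR comment; a minimal sketch:

```python
# Sketch: convert analysis_results.json into a short Markdown report for CI.
import json

with open("analysis_results.json") as f:
    results = json.load(f)

functions = results.get("functions", [])
lines = [
    "## Ghidra analysis: %s" % results.get("binary_name", "unknown"),
    "",
    "* Architecture: %s" % results.get("architecture", "unknown"),
    "* Functions: %d" % len(functions),
    "",
    "| Function | Address | Size |",
    "| --- | --- | --- |",
]
# List the ten largest functions as a quick overview.
for func in sorted(functions, key=lambda x: x["size"], reverse=True)[:10]:
    lines.append("| %s | %s | %d |" % (func["name"], func["address"], func["size"]))

with open("analysis_summary.md", "w") as f:
    f.write("\n".join(lines) + "\n")
```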
### Docker Integration
```dockerfile
# Dockerfile for Ghidra analysis environment
FROM ubuntu:22.04
# Install dependencies
RUN apt-get update && apt-get install -y \
openjdk-17-jdk \
wget \
unzip \
git \
build-essential \
cmake \
python3 \
python3-pip \
&& rm -rf /var/lib/apt/lists/*
# Install Ghidra
WORKDIR /opt
RUN wget https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_10.4_build/ghidra_10.4_PUBLIC_20230928.zip \
&& unzip ghidra_10.4_PUBLIC_20230928.zip \
&& rm ghidra_10.4_PUBLIC_20230928.zip \
&& mv ghidra_10.4_PUBLIC ghidra
ENV GHIDRA_INSTALL_DIR=/opt/ghidra
ENV PATH=$PATH:$GHIDRA_INSTALL_DIR/support
# Install Python dependencies
RUN pip3 install ghidra-bridge requests
# Install Ghidra plugins
WORKDIR /tmp
RUN git clone https://github.com/google/binexport.git \
&& cd binexport \
&& mkdir build && cd build \
&& cmake .. \
&& make -j$(nproc) \
&& cp BinExport.jar $GHIDRA_INSTALL_DIR/Extensions/Ghidra/
# Create analysis scripts directory
RUN mkdir -p /opt/analysis-scripts
# Copy analysis scripts
COPY scripts/ /opt/analysis-scripts/
# Create workspace
RUN mkdir -p /workspace/projects /workspace/binaries /workspace/results
WORKDIR /workspace
# Entry point script
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
```

```bash
#!/bin/bash
# entrypoint.sh
set -e
# Default values
PROJECT_NAME=${PROJECT_NAME:-"analysis_project"}
BINARY_PATH=${BINARY_PATH:-""}
ANALYSIS_SCRIPTS=${ANALYSIS_SCRIPTS:-""}
OUTPUT_DIR=${OUTPUT_DIR:-"/workspace/results"}
# Create output directory
mkdir -p "$OUTPUT_DIR"
if [ -z "$BINARY_PATH" ]; then
echo "Error: BINARY_PATH environment variable must be set"
exit 1
fi
if [ ! -f "$BINARY_PATH" ]; then
echo "Error: Binary file not found: $BINARY_PATH"
exit 1
fi
echo "Starting Ghidra analysis..."
echo "Binary: $BINARY_PATH"
echo "Project: $PROJECT_NAME"
echo "Output: $OUTPUT_DIR"
# Build analysis command
ANALYSIS_CMD="$GHIDRA_INSTALL_DIR/support/analyzeHeadless \
/workspace/projects \
$PROJECT_NAME \
-import $BINARY_PATH \
-overwrite"
# Add analysis scripts if specified
if [ -n "$ANALYSIS_SCRIPTS" ]; then
for script in $ANALYSIS_SCRIPTS; do
if [ -f "/opt/analysis-scripts/$script" ]; then
ANALYSIS_CMD="$ANALYSIS_CMD -postScript /opt/analysis-scripts/$script"
else
echo "Warning: Script not found: $script"
fi
done
fi
# Run analysis
eval $ANALYSIS_CMD
# Copy results
if [ -d "/workspace/projects/$PROJECT_NAME.rep" ]; then
cp -r "/workspace/projects/$PROJECT_NAME.rep" "$OUTPUT_DIR/"
fi
echo "Analysis complete. Results saved to $OUTPUT_DIR"
# Keep container running if requested
if [ "$KEEP_RUNNING" = "true" ]; then
echo "Keeping container running..."
tail -f /dev/null
fi
# Docker usage examples
# Build the image
docker build -t ghidra-analysis .
# Analyze a single binary
docker run --rm \
-v /path/to/binary:/workspace/binaries/target:ro \
-v /path/to/results:/workspace/results \
-e BINARY_PATH=/workspace/binaries/target \
-e PROJECT_NAME=my_analysis \
-e ANALYSIS_SCRIPTS="export_functions.py detect_crypto.py" \
ghidra-analysis
# Interactive analysis
docker run -it \
-v /path/to/binaries:/workspace/binaries:ro \
-v /path/to/results:/workspace/results \
-e KEEP_RUNNING=true \
ghidra-analysis bash
# Batch analysis with docker-compose
cat > docker-compose.yml << EOF
version: '3.8'
services:
ghidra-analysis:
build: .
volumes:
- ./binaries:/workspace/binaries:ro
- ./results:/workspace/results
- ./custom-scripts:/opt/analysis-scripts/custom:ro
environment:
- PROJECT_NAME=batch_analysis
- ANALYSIS_SCRIPTS=export_functions.py detect_crypto.py custom/my_script.py
command: |
bash -c "
for binary in /workspace/binaries/*; do
if [ -f \"\$binary\" ]; then
echo \"Analyzing \$(basename \$binary)\"
BINARY_PATH=\"\$binary\" \
PROJECT_NAME=\"analysis_\$(basename \$binary)\" \
/entrypoint.sh
fi
done
"
EOF
docker-compose up
```
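The same container can also be driven from Python rather than hand-typed docker commands, using the environment variables the entrypoint script expects. A sketch; the image name matches the build above, while the binary and results paths are placeholders:

```python
# Sketch: run the ghidra-analysis image for one binary from Python.
import subprocess
from pathlib import Path

def analyze_in_container(binary, results_dir, scripts="export_functions.py"):
    binary = Path(binary).resolve()
    results_dir = Path(results_dir).resolve()
    results_dir.mkdir(parents=True, exist_ok=True)
    cmd = [
        "docker", "run", "--rm",
        "-v", "%s:/workspace/binaries/target:ro" % binary,
        "-v", "%s:/workspace/results" % results_dir,
        "-e", "BINARY_PATH=/workspace/binaries/target",
        "-e", "PROJECT_NAME=analysis_%s" % binary.name,
        "-e", "ANALYSIS_SCRIPTS=%s" % scripts,
        "ghidra-analysis",
    ]
    return subprocess.run(cmd).returncode == 0

if analyze_in_container("/path/to/binary", "./results"):
    print("Container analysis finished; see ./results")
```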
## Resources and Documentation
### Official Resources
- Ghidra GitHub Repository - Official source code and plugins
- [Ghidra Documentation](LINK_17) - Official documentation and guides
- [Ghidra API Documentation](LINK_17) - Complete API reference
- Ghidra Plugin Development Guide - Official plugin development tutorial
### Community Plugins and Extensions
- Ghidra Plugin Repository - Curated list of plugins
- BinExport - Export to BinDiff and BinNavi
- [GhidraBridge](LINK_17) - Python bridge for Ghidra
- Ghidra2Frida - Generate Frida hooks
- [Ghidra Jupyter](LINK_17) - Jupyter notebook integration
### Learning Resources
- [Ghidra Training Materials](LINK_17) - Official training courses
- [Ghidra Scripting Tutorial](LINK_17) - Scripting guide
- [Reverse Engineering with Ghidra](LINK_17) - Comprehensive book
- Ghidra Blog Posts - Official NSA blog articles
### Development and Contribution
- Ghidra Developer Guide - Development environment setup
- [Contributing to Ghidra](LINK_17) - Contribution guidelines
- Ghidra Issue Tracker - Bug reports and feature requests
- Ghidra Discussions - Community discussions