Ghidra Plugins Cheat Sheet¶

"Clase de la hoja" id="copy-btn" class="copy-btn" onclick="copyAllCommands()" Copiar todos los comandos id="pdf-btn" class="pdf-btn" onclick="generatePDF()" Generar PDF seleccionado/button ■/div titulada

Sinopsis¶

Los plugins Ghidra amplían la funcionalidad del marco de ingeniería inversa Ghidra de la NSA. Esta guía completa cubre plugins esenciales incluyendo BinExport, GhidraBridge, Ghidra2Frida, y muchos otros que mejoran el análisis de colaboración, la integración con otras herramientas, y las capacidades avanzadas de ingeniería inversa.

■ ** Beneficios clave**: Mayor colaboración, integración de herramientas, análisis automatizado, mejora de la eficiencia del flujo de trabajo y mayor funcionalidad más allá de las características principales de Ghidra.

Categorías de complementos esenciales¶

Collaboration and Export Plugins¶

BinExport Plugin¶

# Installation
git clone https://github.com/google/binexport.git
cd binexport
mkdir build && cd build
cmake ..
make -j$(nproc)

# Install to Ghidra
cp BinExport.jar $GHIDRA_INSTALL_DIR/Extensions/Ghidra/

# Usage in Ghidra
# File -> Export Program -> BinExport (v2) for BinDiff
# File -> Export Program -> BinExport (v2) for BinNavi

# Command line export
$GHIDRA_INSTALL_DIR/support/analyzeHeadless \
    /path/to/project ProjectName \
    -import /path/to/binary \
    -postScript BinExportScript.java \
    -scriptPath /path/to/scripts

# Export formats
# .BinExport - For BinDiff comparison
# .BinExport2 - Enhanced format with more metadata
# SQL export - For BinNavi database import

GhidraBridge¶

# Installation
pip install ghidra-bridge

# Server setup in Ghidra
# Run GhidraBridge server script in Ghidra Script Manager
# Window -> Script Manager -> GhidraBridge -> ghidra_bridge_server.py

# Python client usage
import ghidra_bridge

# Connect to Ghidra
b = ghidra_bridge.GhidraBridge(namespace=globals())

# Access Ghidra API from Python
current_program = b.getCurrentProgram()
print(f"Program: {current_program.getName()}")

# Get function manager
function_manager = current_program.getFunctionManager()
functions = function_manager.getFunctions(True)

# Iterate through functions
for func in functions:
    print(f"Function: {func.getName()} at {func.getEntryPoint()}")

    # Get function body
    body = func.getBody()
    print(f"  Size: {body.getNumAddresses()} addresses")

    # Get calling functions
    callers = func.getCallingFunctions(None)
    print(f"  Callers: {len(list(callers))}")

# Advanced analysis with external tools
import networkx as nx

def build_call_graph():
    """Build call graph using NetworkX"""
    G = nx.DiGraph()

    for func in function_manager.getFunctions(True):
        func_name = func.getName()
        G.add_node(func_name)

        # Add edges for function calls
        for caller in func.getCallingFunctions(None):
            G.add_edge(caller.getName(), func_name)

    return G

# Export analysis results
def export_function_info():
    """Export function information to JSON"""
    import json

    functions_data = []
    for func in function_manager.getFunctions(True):
        func_data = {
            'name': func.getName(),
            'address': str(func.getEntryPoint()),
            'size': func.getBody().getNumAddresses(),
            'signature': func.getSignature().getPrototypeString()
        }
        functions_data.append(func_data)

    with open('ghidra_functions.json', 'w') as f:
        json.dump(functions_data, f, indent=2)

    return functions_data

# Machine learning integration
def extract_features_for_ml():
    """Extract features for machine learning analysis"""
    features = []

    for func in function_manager.getFunctions(True):
        # Extract various features
        feature_vector = {
            'name': func.getName(),
            'size': func.getBody().getNumAddresses(),
            'complexity': len(list(func.getCallingFunctions(None))),
            'has_loops': False,  # Would need more complex analysis
            'instruction_count': 0,
            'string_refs': 0,
            'api_calls': 0
        }

        # Analyze function body for more features
        instructions = current_program.getListing().getInstructions(func.getBody(), True)
        for instruction in instructions:
            feature_vector['instruction_count'] += 1

            # Check for API calls, string references, etc.
            # This would require more detailed analysis

        features.append(feature_vector)

    return features

Plugins de integración y automatización¶

Ghidra2Frida¶

# Installation
# Download from: https://github.com/federicodotta/Ghidra2Frida
# Place in Ghidra Extensions directory

# Usage in Ghidra Script Manager
# Generate Frida hooks for functions

# Example generated Frida script
frida_script = """
// Auto-generated Frida script from Ghidra

// Hook function at 0x401000
Interceptor.attach(ptr("0x401000"), {
    onEnter: function(args) {
        console.log("[+] Entering function_name");
        console.log("    arg0: " + args[0]);
        console.log("    arg1: " + args[1]);

        // Log stack trace
        console.log("Stack trace:");
        console.log(Thread.backtrace(this.context, Backtracer.ACCURATE)
            .map(DebugSymbol.fromAddress).join("\\n"));
    },
    onLeave: function(retval) {
        console.log("[+] Leaving function_name");
        console.log("    Return value: " + retval);
    }
});

// Hook string functions
var strcpy = Module.findExportByName(null, "strcpy");
if (strcpy) {
    Interceptor.attach(strcpy, {
        onEnter: function(args) {
            console.log("[strcpy] dest: " + args[0] + ", src: " + Memory.readUtf8String(args[1]));
        }
    });
}

// Memory scanning for patterns
function scanForPattern(pattern) {
    var ranges = Process.enumerateRanges('r--');
    ranges.forEach(function(range) {
        Memory.scan(range.base, range.size, pattern, {
            onMatch: function(address, size) {
                console.log("[+] Pattern found at: " + address);
            },
            onComplete: function() {
                console.log("[+] Scan complete for range: " + range.base);
            }
        });
    });
}

// Usage
scanForPattern("41 41 41 41");  // Search for AAAA pattern
"""

# Save and use with Frida
with open('ghidra_hooks.js', 'w') as f:
    f.write(frida_script)

# Run with Frida
# frida -l ghidra_hooks.js -f target_binary

Ghidra Jupyter Integration¶

# Installation and setup
pip install jupyter ghidra-bridge matplotlib pandas

# Jupyter notebook cell
import ghidra_bridge
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Connect to Ghidra
b = ghidra_bridge.GhidraBridge(namespace=globals())

# Function size analysis
def analyze_function_sizes():
    """Analyze and visualize function sizes"""

    sizes = []
    names = []

    function_manager = getCurrentProgram().getFunctionManager()
    for func in function_manager.getFunctions(True):
        size = func.getBody().getNumAddresses()
        sizes.append(size)
        names.append(func.getName())

    # Create DataFrame
    df = pd.DataFrame({
        'function': names,
        'size': sizes
    })

    # Statistical analysis
    print(f"Total functions: {len(df)}")
    print(f"Average size: {df['size'].mean():.2f}")
    print(f"Median size: {df['size'].median():.2f}")
    print(f"Largest function: {df.loc[df['size'].idxmax(), 'function']} ({df['size'].max()} bytes)")

    # Visualization
    plt.figure(figsize=(12, 8))

    # Histogram
    plt.subplot(2, 2, 1)
    plt.hist(df['size'], bins=50, alpha=0.7)
    plt.xlabel('Function Size (bytes)')
    plt.ylabel('Frequency')
    plt.title('Function Size Distribution')

    # Top 10 largest functions
    plt.subplot(2, 2, 2)
    top_10 = df.nlargest(10, 'size')
    plt.barh(range(len(top_10)), top_10['size'])
    plt.yticks(range(len(top_10)), top_10['function'])
    plt.xlabel('Size (bytes)')
    plt.title('Top 10 Largest Functions')

    # Box plot
    plt.subplot(2, 2, 3)
    plt.boxplot(df['size'])
    plt.ylabel('Size (bytes)')
    plt.title('Function Size Box Plot')

    # Cumulative distribution
    plt.subplot(2, 2, 4)
    sorted_sizes = np.sort(df['size'])
    cumulative = np.arange(1, len(sorted_sizes) + 1) / len(sorted_sizes)
    plt.plot(sorted_sizes, cumulative)
    plt.xlabel('Function Size (bytes)')
    plt.ylabel('Cumulative Probability')
    plt.title('Cumulative Distribution')

    plt.tight_layout()
    plt.show()

    return df

# Cross-reference analysis
def analyze_cross_references():
    """Analyze cross-references between functions"""

    reference_manager = getCurrentProgram().getReferenceManager()

    # Build reference graph
    ref_data = []

    function_manager = getCurrentProgram().getFunctionManager()
    for func in function_manager.getFunctions(True):
        func_addr = func.getEntryPoint()

        # Get references TO this function
        refs_to = reference_manager.getReferencesTo(func_addr)

        for ref in refs_to:
            from_addr = ref.getFromAddress()
            from_func = function_manager.getFunctionContaining(from_addr)

            if from_func:
                ref_data.append({
                    'from_function': from_func.getName(),
                    'to_function': func.getName(),
                    'reference_type': str(ref.getReferenceType())
                })

    # Create DataFrame
    ref_df = pd.DataFrame(ref_data)

    if not ref_df.empty:
        # Most referenced functions
        most_referenced = ref_df['to_function'].value_counts().head(10)

        plt.figure(figsize=(12, 6))

        plt.subplot(1, 2, 1)
        most_referenced.plot(kind='bar')
        plt.title('Most Referenced Functions')
        plt.xlabel('Function')
        plt.ylabel('Reference Count')
        plt.xticks(rotation=45)

        # Reference type distribution
        plt.subplot(1, 2, 2)
        ref_df['reference_type'].value_counts().plot(kind='pie', autopct='%1.1f%%')
        plt.title('Reference Type Distribution')

        plt.tight_layout()
        plt.show()

    return ref_df

# String analysis
def analyze_strings():
    """Analyze strings in the binary"""

    listing = getCurrentProgram().getListing()
    memory = getCurrentProgram().getMemory()

    strings_data = []

    # Get all defined strings
    data_iterator = listing.getDefinedData(True)
    for data in data_iterator:
        if data.hasStringValue():
            string_value = data.getValue()
            if string_value and len(str(string_value)) > 3:
                strings_data.append({
                    'address': str(data.getAddress()),
                    'string': str(string_value),
                    'length': len(str(string_value)),
                    'type': str(data.getDataType())
                })

    # Create DataFrame
    strings_df = pd.DataFrame(strings_data)

    if not strings_df.empty:
        # String length analysis
        plt.figure(figsize=(12, 8))

        plt.subplot(2, 2, 1)
        plt.hist(strings_df['length'], bins=30, alpha=0.7)
        plt.xlabel('String Length')
        plt.ylabel('Frequency')
        plt.title('String Length Distribution')

        # Longest strings
        plt.subplot(2, 2, 2)
        longest = strings_df.nlargest(10, 'length')
        plt.barh(range(len(longest)), longest['length'])
        plt.yticks(range(len(longest)), [s[:30] + '...' if len(s) > 30 else s for s in longest['string']])
        plt.xlabel('Length')
        plt.title('Longest Strings')

        # String type distribution
        plt.subplot(2, 2, 3)
        strings_df['type'].value_counts().plot(kind='pie', autopct='%1.1f%%')
        plt.title('String Type Distribution')

        plt.tight_layout()
        plt.show()

        # Interesting strings (potential passwords, URLs, etc.)
        interesting_patterns = [
            r'password', r'passwd', r'pwd',
            r'http[s]?://', r'ftp://',
            r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}',  # Email
            r'[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}',  # IP address
        ]

        import re
        interesting_strings = []

        for _, row in strings_df.iterrows():
            string_val = row['string'].lower()
            for pattern in interesting_patterns:
                if re.search(pattern, string_val, re.IGNORECASE):
                    interesting_strings.append(row)
                    break

        if interesting_strings:
            print("Interesting strings found:")
            for string_info in interesting_strings[:10]:
                print(f"  {string_info['address']}: {string_info['string'][:50]}")

    return strings_df

# Run analyses
function_df = analyze_function_sizes()
ref_df = analyze_cross_references()
strings_df = analyze_strings()

Plugins de Análisis Avanzado¶

Ghidra Decompiler Extensiones¶

// Custom decompiler plugin example
// Place in Ghidra/Features/Decompiler/src/main/java/

import ghidra.app.decompiler.*;
import ghidra.program.model.listing.*;
import ghidra.program.model.pcode.*;

public class CustomDecompilerAnalysis {

    public void analyzeFunction(Function function, DecompInterface decompiler) {
        // Get high-level representation
        DecompileResults results = decompiler.decompileFunction(function, 30, null);

        if (results.decompileCompleted()) {
            HighFunction highFunction = results.getHighFunction();

            // Analyze control flow
            analyzeControlFlow(highFunction);

            // Analyze data flow
            analyzeDataFlow(highFunction);

            // Detect patterns
            detectSecurityPatterns(highFunction);
        }
    }

    private void analyzeControlFlow(HighFunction highFunction) {
        // Get basic blocks
        Iterator<PcodeBlockBasic> blocks = highFunction.getBasicBlocks();

        while (blocks.hasNext()) {
            PcodeBlockBasic block = blocks.next();

            // Analyze block structure
            System.out.println("Block: " + block.getStart() + " to " + block.getStop());

            // Get successors and predecessors
            for (int i = 0; i < block.getOutSize(); i++) {
                PcodeBlock successor = block.getOut(i);
                System.out.println("  Successor: " + successor.getStart());
            }
        }
    }

    private void analyzeDataFlow(HighFunction highFunction) {
        // Get all variables
        Iterator<HighSymbol> symbols = highFunction.getLocalSymbolMap().getSymbols();

        while (symbols.hasNext()) {
            HighSymbol symbol = symbols.next();

            // Analyze variable usage
            HighVariable variable = symbol.getHighVariable();
            if (variable != null) {
                System.out.println("Variable: " + symbol.getName());
                System.out.println("  Type: " + variable.getDataType());
                System.out.println("  Size: " + variable.getSize());

                // Get def-use information
                Iterator<PcodeOp> defs = variable.getDescendants();
                while (defs.hasNext()) {
                    PcodeOp def = defs.next();
                    System.out.println("  Used in: " + def.getOpcode());
                }
            }
        }
    }

    private void detectSecurityPatterns(HighFunction highFunction) {
        // Look for dangerous function calls
        String[] dangerousFunctions = {
            "strcpy", "strcat", "sprintf", "gets", "scanf"
        };

        // Analyze P-code operations
        Iterator<PcodeOpAST> ops = highFunction.getPcodeOps();
        while (ops.hasNext()) {
            PcodeOpAST op = ops.next();

            if (op.getOpcode() == PcodeOp.CALL) {
                // Check if it's a call to dangerous function
                Varnode target = op.getInput(0);
                if (target.isAddress()) {
                    // Get function name at target address
                    // Check against dangerous functions list
                    System.out.println("Potential security issue: dangerous function call");
                }
            }

            // Look for buffer operations
|  |  |  | if (op.getOpcode() == PcodeOp.COPY |  | op.getOpcode() == PcodeOp.STORE) { |  |  |  |
                // Analyze for potential buffer overflows
                analyzeBufferOperation(op);
            }
        }
    }

    private void analyzeBufferOperation(PcodeOpAST op) {
        // Simplified buffer overflow detection
        Varnode output = op.getOutput();
        if (output != null && output.getSize() > 0) {
            // Check if operation could exceed buffer bounds
            System.out.println("Buffer operation detected at: " + op.getSeqnum().getTarget());
        }
    }
}

Ghidra Scripting Extensiones¶

# Advanced Ghidra scripting examples

# Crypto detection script
def detect_crypto_constants():
    """Detect cryptographic constants in binary"""

    # Common crypto constants
    crypto_constants = {
        0x67452301: "MD5 initial value A",
        0xEFCDAB89: "MD5 initial value B", 
        0x98BADCFE: "MD5 initial value C",
        0x10325476: "MD5 initial value D",
        0x6A09E667: "SHA-256 initial value H0",
        0xBB67AE85: "SHA-256 initial value H1",
        0x3C6EF372: "SHA-256 initial value H2",
        0xA54FF53A: "SHA-256 initial value H3",
        0x428A2F98: "SHA-256 round constant K0",
        0x71374491: "SHA-256 round constant K1",
        0x9E3779B9: "TEA delta constant",
        0x61C88647: "XTEA delta constant"
    }

    memory = getCurrentProgram().getMemory()
    found_constants = []

    # Search for constants in memory
    for block in memory.getBlocks():
        if block.isInitialized():
            block_start = block.getStart()
            block_end = block.getEnd()

            # Search 4-byte aligned addresses
            addr = block_start
            while addr.compareTo(block_end) < 0:
                try:
                    # Read 4 bytes as integer
                    value = memory.getInt(addr)

                    if value in crypto_constants:
                        found_constants.append({
                            'address': addr,
                            'value': hex(value),
                            'description': crypto_constants[value]
                        })

                        # Create comment
                        setEOLComment(addr, crypto_constants[value])

                    addr = addr.add(4)
                except:
                    addr = addr.add(1)

    # Print results
    print(f"Found {len(found_constants)} crypto constants:")
    for const in found_constants:
        print(f"  {const['address']}: {const['value']} - {const['description']}")

    return found_constants

# Function similarity analysis
def analyze_function_similarity():
    """Analyze similarity between functions"""

    function_manager = getCurrentProgram().getFunctionManager()
    functions = list(function_manager.getFunctions(True))

    # Extract features for each function
    function_features = {}

    for func in functions:
        features = extract_function_features(func)
        function_features[func.getName()] = features

    # Compare functions
    similarities = []

    for i, func1 in enumerate(functions):
        for func2 in functions[i+1:]:
            similarity = calculate_similarity(
                function_features[func1.getName()],
                function_features[func2.getName()]
            )

            if similarity > 0.8:  # High similarity threshold
                similarities.append({
                    'function1': func1.getName(),
                    'function2': func2.getName(),
                    'similarity': similarity,
                    'addr1': func1.getEntryPoint(),
                    'addr2': func2.getEntryPoint()
                })

    # Sort by similarity
    similarities.sort(key=lambda x: x['similarity'], reverse=True)

    print(f"Found {len(similarities)} similar function pairs:")
    for sim in similarities[:10]:  # Top 10
        print(f"  {sim['function1']} <-> {sim['function2']}: {sim['similarity']:.3f}")

    return similarities

def extract_function_features(function):
    """Extract features from function for similarity analysis"""

    features = {
        'size': function.getBody().getNumAddresses(),
        'block_count': 0,
        'call_count': 0,
        'instruction_types': {},
        'string_refs': 0,
        'api_calls': []
    }

    # Analyze basic blocks
    body = function.getBody()
    listing = getCurrentProgram().getListing()

    # Count instructions and types
    instructions = listing.getInstructions(body, True)
    for instruction in instructions:
        mnemonic = instruction.getMnemonicString()
        features['instruction_types'][mnemonic] = features['instruction_types'].get(mnemonic, 0) + 1

        # Count calls
        if instruction.getFlowType().isCall():
            features['call_count'] += 1

            # Get call target
            refs = instruction.getReferencesFrom()
            for ref in refs:
                if ref.getReferenceType().isCall():
                    target_addr = ref.getToAddress()
                    target_func = getCurrentProgram().getFunctionManager().getFunctionAt(target_addr)
                    if target_func:
                        features['api_calls'].append(target_func.getName())

    return features

def calculate_similarity(features1, features2):
    """Calculate similarity between two feature sets"""

    # Simple similarity based on instruction type distribution
    types1 = features1['instruction_types']
    types2 = features2['instruction_types']

    # Get all instruction types
    all_types = set(types1.keys()) | set(types2.keys())

    if not all_types:
        return 0.0

    # Calculate cosine similarity
    dot_product = 0
    norm1 = 0
    norm2 = 0

    for inst_type in all_types:
        count1 = types1.get(inst_type, 0)
        count2 = types2.get(inst_type, 0)

        dot_product += count1 * count2
        norm1 += count1 * count1
        norm2 += count2 * count2

    if norm1 == 0 or norm2 == 0:
        return 0.0

    return dot_product / (math.sqrt(norm1) * math.sqrt(norm2))

# Automated vulnerability detection
def detect_vulnerabilities():
    """Detect potential vulnerabilities in code"""

    vulnerabilities = []

    # Dangerous function patterns
    dangerous_functions = {
        'strcpy': 'Buffer overflow risk - no bounds checking',
        'strcat': 'Buffer overflow risk - no bounds checking', 
        'sprintf': 'Buffer overflow risk - no bounds checking',
        'gets': 'Buffer overflow risk - reads unlimited input',
        'scanf': 'Buffer overflow risk with %s format',
        'system': 'Command injection risk',
        'exec': 'Command injection risk',
        'eval': 'Code injection risk'
    }

    function_manager = getCurrentProgram().getFunctionManager()

    # Check for dangerous function calls
    for func in function_manager.getFunctions(True):
        body = func.getBody()
        listing = getCurrentProgram().getListing()

        instructions = listing.getInstructions(body, True)
        for instruction in instructions:
            if instruction.getFlowType().isCall():
                refs = instruction.getReferencesFrom()

                for ref in refs:
                    if ref.getReferenceType().isCall():
                        target_addr = ref.getToAddress()
                        target_func = function_manager.getFunctionAt(target_addr)

                        if target_func:
                            func_name = target_func.getName()

                            for dangerous_func, description in dangerous_functions.items():
                                if dangerous_func in func_name.lower():
                                    vulnerabilities.append({
                                        'type': 'dangerous_function_call',
                                        'function': func.getName(),
                                        'address': instruction.getAddress(),
                                        'dangerous_function': func_name,
                                        'description': description,
                                        'severity': 'high' if dangerous_func in ['gets', 'system'] else 'medium'
                                    })

    # Check for format string vulnerabilities
    detect_format_string_vulns(vulnerabilities)

    # Check for integer overflow patterns
    detect_integer_overflow_patterns(vulnerabilities)

    # Print results
    print(f"Found {len(vulnerabilities)} potential vulnerabilities:")
    for vuln in vulnerabilities:
        print(f"  [{vuln['severity'].upper()}] {vuln['type']} in {vuln['function']}")
        print(f"    Address: {vuln['address']}")
        print(f"    Description: {vuln['description']}")

    return vulnerabilities

def detect_format_string_vulns(vulnerabilities):
    """Detect format string vulnerabilities"""

    # Look for printf-family functions with user-controlled format strings
    printf_functions = ['printf', 'fprintf', 'sprintf', 'snprintf', 'vprintf']

    function_manager = getCurrentProgram().getFunctionManager()

    for func in function_manager.getFunctions(True):
        # Analyze function for printf calls
        # This is a simplified detection - real analysis would need data flow
        pass

def detect_integer_overflow_patterns(vulnerabilities):
    """Detect potential integer overflow patterns"""

    # Look for arithmetic operations without bounds checking
    # This is a simplified detection
    pass

# Run analysis scripts
crypto_constants = detect_crypto_constants()
similar_functions = analyze_function_similarity()
vulnerabilities = detect_vulnerabilities()

Plugins de Utilidad y Ayudante¶

Ghidra Batch Processing¶

# Batch processing utilities for Ghidra

import os
import json
import subprocess
from pathlib import Path

class GhidraBatchProcessor:
    def __init__(self, ghidra_path, project_path):
        self.ghidra_path = Path(ghidra_path)
        self.project_path = Path(project_path)
        self.analyze_headless = self.ghidra_path / "support" / "analyzeHeadless"

    def batch_analyze(self, binary_paths, scripts=None, output_dir=None):
        """Batch analyze multiple binaries"""

        if output_dir is None:
            output_dir = Path("./batch_analysis_results")

        output_dir.mkdir(exist_ok=True)

        results = []

        for binary_path in binary_paths:
            binary_path = Path(binary_path)

            print(f"Analyzing: {binary_path.name}")

            # Create project for this binary
            project_name = f"batch_{binary_path.stem}"

            # Build command
            cmd = [
                str(self.analyze_headless),
                str(self.project_path),
                project_name,
                "-import", str(binary_path),
                "-overwrite"
            ]

            # Add scripts if specified
            if scripts:
                for script in scripts:
                    cmd.extend(["-postScript", script])

            # Add output directory
            cmd.extend(["-scriptPath", str(output_dir)])

            try:
                # Run analysis
                result = subprocess.run(cmd, capture_output=True, text=True, timeout=300)

                analysis_result = {
                    'binary': str(binary_path),
                    'project': project_name,
                    'success': result.returncode == 0,
                    'stdout': result.stdout,
                    'stderr': result.stderr
                }

                results.append(analysis_result)

                # Save individual result
                result_file = output_dir / f"{binary_path.stem}_result.json"
                with open(result_file, 'w') as f:
                    json.dump(analysis_result, f, indent=2)

            except subprocess.TimeoutExpired:
                print(f"Timeout analyzing {binary_path.name}")
                results.append({
                    'binary': str(binary_path),
                    'project': project_name,
                    'success': False,
                    'error': 'timeout'
                })

        # Save batch results
        batch_result_file = output_dir / "batch_results.json"
        with open(batch_result_file, 'w') as f:
            json.dump(results, f, indent=2)

        return results

    def export_all_functions(self, binary_path, output_format='json'):
        """Export all functions from a binary"""

        script_content = f"""
# Export functions script
import json

def export_functions():
    program = getCurrentProgram()
    function_manager = program.getFunctionManager()

    functions_data = []

    for func in function_manager.getFunctions(True):
        func_data = {{
            'name': func.getName(),
            'address': str(func.getEntryPoint()),
            'size': func.getBody().getNumAddresses(),
            'signature': func.getSignature().getPrototypeString() if func.getSignature() else None,
            'calling_convention': str(func.getCallingConvention()) if func.getCallingConvention() else None,
            'parameter_count': func.getParameterCount(),
            'local_variable_count': len(func.getLocalVariables()),
            'is_thunk': func.isThunk(),
            'is_external': func.isExternal()
        }}

        # Get function calls
        calls = []
        body = func.getBody()
        listing = program.getListing()

        instructions = listing.getInstructions(body, True)
        for instruction in instructions:
            if instruction.getFlowType().isCall():
                refs = instruction.getReferencesFrom()
                for ref in refs:
                    if ref.getReferenceType().isCall():
                        target_addr = ref.getToAddress()
                        target_func = function_manager.getFunctionAt(target_addr)
                        if target_func:
                            calls.append(target_func.getName())

        func_data['calls'] = calls
        functions_data.append(func_data)

    # Save to file
    output_file = "{binary_path.stem}_functions.{output_format}"
    with open(output_file, 'w') as f:
        json.dump(functions_data, f, indent=2)

    print(f"Exported {{len(functions_data)}} functions to {{output_file}}")

export_functions()
"""

        # Save script
        script_file = Path("export_functions.py")
        with open(script_file, 'w') as f:
            f.write(script_content)

        # Run analysis with script
        return self.batch_analyze([binary_path], scripts=[str(script_file)])

# Usage example
def run_batch_analysis():
    """Example of running batch analysis"""

    # Setup
    ghidra_path = "/opt/ghidra"  # Adjust path
    project_path = "/tmp/ghidra_projects"

    processor = GhidraBatchProcessor(ghidra_path, project_path)

    # Find binaries to analyze
    binary_paths = [
        "/bin/ls",
        "/bin/cat", 
        "/bin/echo"
    ]

    # Custom analysis scripts
    analysis_scripts = [
        "export_functions.py",
        "detect_crypto.py",
        "analyze_strings.py"
    ]

    # Run batch analysis
    results = processor.batch_analyze(binary_paths, scripts=analysis_scripts)

    # Print summary
    successful = sum(1 for r in results if r['success'])
    print(f"Batch analysis complete: {successful}/{len(results)} successful")

    return results

# Ghidra project management utilities
class GhidraProjectManager:
    def __init__(self, ghidra_path):
        self.ghidra_path = Path(ghidra_path)

    def create_project(self, project_path, project_name):
        """Create new Ghidra project"""

        cmd = [
            str(self.ghidra_path / "support" / "analyzeHeadless"),
            str(project_path),
            project_name,
            "-create"
        ]

        result = subprocess.run(cmd, capture_output=True, text=True)
        return result.returncode == 0

    def import_binary(self, project_path, project_name, binary_path, analyze=True):
        """Import binary into project"""

        cmd = [
            str(self.ghidra_path / "support" / "analyzeHeadless"),
            str(project_path),
            project_name,
            "-import", str(binary_path)
        ]

        if not analyze:
            cmd.append("-noanalysis")

        result = subprocess.run(cmd, capture_output=True, text=True)
        return result.returncode == 0

    def export_project(self, project_path, project_name, export_path, format_type="xml"):
        """Export project data"""

        export_script = f"""
# Export project script
import os

def export_project_data():
    program = getCurrentProgram()

    # Export program as XML
    from ghidra.app.util.exporter import XmlExporter

    exporter = XmlExporter()
    export_file = "{export_path}"

    # Configure export options
    options = exporter.getDefaultOptions()

    # Perform export
    success = exporter.export(export_file, program, None, None)

    if success:
        print(f"Project exported to {{export_file}}")
    else:
        print("Export failed")

export_project_data()
"""

        # Save and run export script
        script_file = Path("export_project.py")
        with open(script_file, 'w') as f:
            f.write(export_script)

        cmd = [
            str(self.ghidra_path / "support" / "analyzeHeadless"),
            str(project_path),
            project_name,
            "-postScript", str(script_file)
        ]

        result = subprocess.run(cmd, capture_output=True, text=True)
        return result.returncode == 0

# Run examples
if __name__ == "__main__":
    # Run batch analysis
    batch_results = run_batch_analysis()

    # Project management example
    ghidra_path = "/opt/ghidra"
    manager = GhidraProjectManager(ghidra_path)

    # Create project
    manager.create_project("/tmp/test_project", "TestProject")

    # Import binary
    manager.import_binary("/tmp/test_project", "TestProject", "/bin/ls")

    # Export project
    manager.export_project("/tmp/test_project", "TestProject", "/tmp/exported_project.xml")

Plugin Development¶

Creación de Plugins personalizados¶

// Custom Ghidra plugin template
// Place in Ghidra/Features/Base/src/main/java/

import ghidra.app.plugin.PluginCategoryNames;
import ghidra.app.plugin.ProgramPlugin;
import ghidra.framework.plugintool.*;
import ghidra.framework.plugintool.util.PluginStatus;
import ghidra.program.model.listing.Program;

@PluginInfo(
    status = PluginStatus.STABLE,
    packageName = "CustomAnalysis",
    category = PluginCategoryNames.ANALYSIS,
    shortDescription = "Custom analysis plugin",
    description = "Performs custom binary analysis tasks"
)
public class CustomAnalysisPlugin extends ProgramPlugin {

    public CustomAnalysisPlugin(PluginTool tool) {
        super(tool, true, true);

        // Initialize plugin
        setupActions();
    }

    private void setupActions() {
        // Create menu actions
        DockingAction analyzeAction = new DockingAction("Custom Analysis", getName()) {
            @Override
            public void actionPerformed(ActionContext context) {
                performCustomAnalysis();
            }
        };

        analyzeAction.setMenuBarData(new MenuData(
            new String[]{"Analysis", "Custom Analysis"}, 
            "CustomAnalysis"
        ));

        analyzeAction.setDescription("Run custom analysis");
        analyzeAction.setEnabled(true);

        tool.addAction(analyzeAction);
    }

    private void performCustomAnalysis() {
        Program program = getCurrentProgram();
        if (program == null) {
            return;
        }

        // Perform analysis
        CustomAnalyzer analyzer = new CustomAnalyzer(program);
        analyzer.analyze();

        // Display results
        displayResults(analyzer.getResults());
    }

    private void displayResults(AnalysisResults results) {
        // Create results dialog or panel
        CustomResultsDialog dialog = new CustomResultsDialog(results);
        tool.showDialog(dialog);
    }

    @Override
    protected void programActivated(Program program) {
        // Called when program becomes active
        super.programActivated(program);
    }

    @Override
    protected void programDeactivated(Program program) {
        // Called when program becomes inactive
        super.programDeactivated(program);
    }
}

// Custom analyzer class
class CustomAnalyzer {
    private Program program;
    private AnalysisResults results;

    public CustomAnalyzer(Program program) {
        this.program = program;
        this.results = new AnalysisResults();
    }

    public void analyze() {
        // Perform custom analysis
        analyzeFunctions();
        analyzeStrings();
        analyzeReferences();
    }

    private void analyzeFunctions() {
        FunctionManager functionManager = program.getFunctionManager();
        FunctionIterator functions = functionManager.getFunctions(true);

        while (functions.hasNext()) {
            Function function = functions.next();

            // Analyze function
            FunctionAnalysis analysis = new FunctionAnalysis();
            analysis.setName(function.getName());
            analysis.setAddress(function.getEntryPoint());
            analysis.setSize(function.getBody().getNumAddresses());

            // Add complexity metrics
            analysis.setComplexity(calculateComplexity(function));

            results.addFunctionAnalysis(analysis);
        }
    }

    private int calculateComplexity(Function function) {
        // Simple complexity calculation
        return function.getBody().getNumAddresses() / 10;
    }

    private void analyzeStrings() {
        // String analysis implementation
    }

    private void analyzeReferences() {
        // Reference analysis implementation
    }

    public AnalysisResults getResults() {
        return results;
    }
}

// Results data structure
class AnalysisResults {
    private List<FunctionAnalysis> functionAnalyses;
    private List<StringAnalysis> stringAnalyses;

    public AnalysisResults() {
        this.functionAnalyses = new ArrayList<>();
        this.stringAnalyses = new ArrayList<>();
    }

    public void addFunctionAnalysis(FunctionAnalysis analysis) {
        functionAnalyses.add(analysis);
    }

    public List<FunctionAnalysis> getFunctionAnalyses() {
        return functionAnalyses;
    }
}

class FunctionAnalysis {
    private String name;
    private Address address;
    private long size;
    private int complexity;

    // Getters and setters
    public void setName(String name) { this.name = name; }
    public String getName() { return name; }

    public void setAddress(Address address) { this.address = address; }
    public Address getAddress() { return address; }

    public void setSize(long size) { this.size = size; }
    public long getSize() { return size; }

    public void setComplexity(int complexity) { this.complexity = complexity; }
    public int getComplexity() { return complexity; }
}

Configuración y Despliegue de Plugin¶

# Plugin build and deployment

# 1. Build plugin
cd $GHIDRA_INSTALL_DIR
./gradlew buildExtension -PGHIDRA_INSTALL_DIR=$GHIDRA_INSTALL_DIR

# 2. Install plugin
cp dist/CustomAnalysisPlugin.zip $GHIDRA_INSTALL_DIR/Extensions/Ghidra/

# 3. Enable plugin in Ghidra
# File -> Configure -> Configure Plugins -> Check your plugin

# 4. Plugin directory structure
mkdir -p MyCustomPlugin/src/main/java/mypackage
mkdir -p MyCustomPlugin/src/main/resources
mkdir -p MyCustomPlugin/data

# 5. Create extension.properties
cat > MyCustomPlugin/extension.properties << EOF
name=MyCustomPlugin
description=Custom analysis plugin for Ghidra
author=Your Name
createdOn=2025-01-01
version=1.0
EOF

# 6. Create build.gradle
cat > MyCustomPlugin/build.gradle << EOF
apply from: "\$rootProject.projectDir/gradle/javaProject.gradle"
apply from: "\$rootProject.projectDir/gradle/helpProject.gradle"
apply from: "\$rootProject.projectDir/gradle/distributableGhidraModule.gradle"

dependencies {
    api project(':Base')
    api project(':Decompiler')
}
EOF

# 7. Build and package
./gradlew :MyCustomPlugin:buildExtension

# 8. Install extension
unzip dist/MyCustomPlugin.zip -d $GHIDRA_INSTALL_DIR/Extensions/Ghidra/

Ejemplos de integración¶

CI/CD Integration¶

# GitHub Actions workflow for Ghidra analysis
name: Ghidra Binary Analysis

on:
  push:
    paths:
      - 'binaries/**'
  pull_request:
    paths:
      - 'binaries/**'

jobs:
  ghidra-analysis:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout code
      uses: actions/checkout@v3

    - name: Setup Java
      uses: actions/setup-java@v3
      with:
        java-version: '17'
        distribution: 'temurin'

    - name: Download Ghidra
      run: |
        wget https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_10.4_build/ghidra_10.4_PUBLIC_20230928.zip
        unzip ghidra_10.4_PUBLIC_20230928.zip
        export GHIDRA_INSTALL_DIR=$PWD/ghidra_10.4_PUBLIC

    - name: Install Ghidra plugins
      run: |
        # Install BinExport
        git clone https://github.com/google/binexport.git
        cd binexport
        mkdir build && cd build
        cmake ..
        make -j$(nproc)
        cp BinExport.jar $GHIDRA_INSTALL_DIR/Extensions/Ghidra/

    - name: Run Ghidra analysis
      run: |
        # Create analysis script
        cat > analyze_binary.py << 'EOF'
        import json
        import os

        def analyze_program():
            program = getCurrentProgram()
            if not program:
                return

            results = {
                'binary_name': program.getName(),
                'architecture': str(program.getLanguage().getProcessor()),
                'entry_point': str(program.getImageBase().add(program.getAddressFactory().getDefaultAddressSpace().getMinAddress())),
                'functions': [],
                'strings': [],
                'imports': []
            }

            # Analyze functions
            function_manager = program.getFunctionManager()
            for func in function_manager.getFunctions(True):
                func_data = {
                    'name': func.getName(),
                    'address': str(func.getEntryPoint()),
                    'size': func.getBody().getNumAddresses()
                }
                results['functions'].append(func_data)

            # Export results
            output_file = os.path.join(os.getcwd(), 'analysis_results.json')
            with open(output_file, 'w') as f:
                json.dump(results, f, indent=2)

            print(f"Analysis complete. Results saved to {output_file}")

        analyze_program()
        EOF

        # Run analysis on all binaries
        for binary in binaries/*; do
          if [ -f "$binary" ]; then
            echo "Analyzing $binary"
            $GHIDRA_INSTALL_DIR/support/analyzeHeadless \
              /tmp/ghidra_projects \
              "CI_Analysis_$(basename $binary)" \
              -import "$binary" \
              -postScript analyze_binary.py \
              -overwrite
          fi
        done

    - name: Upload analysis results
      uses: actions/upload-artifact@v3
      with:
        name: ghidra-analysis-results
        path: analysis_results.json

    - name: Security scan results
      run: |
        # Parse results for security issues
        python3 << 'EOF'
        import json
        import sys

        try:
            with open('analysis_results.json', 'r') as f:
                results = json.load(f)

            # Check for dangerous functions
            dangerous_functions = ['strcpy', 'gets', 'sprintf', 'system']
            security_issues = []

            for func in results.get('functions', []):
                func_name = func['name'].lower()
                for dangerous in dangerous_functions:
                    if dangerous in func_name:
                        security_issues.append({
                            'type': 'dangerous_function',
                            'function': func['name'],
                            'address': func['address'],
                            'issue': f'Potentially dangerous function: {dangerous}'
                        })

            if security_issues:
                print("Security issues found:")
                for issue in security_issues:
                    print(f"  - {issue['issue']} in {issue['function']} at {issue['address']}")
                sys.exit(1)
            else:
                print("No security issues detected")

        except FileNotFoundError:
            print("Analysis results not found")
            sys.exit(1)
        EOF

Docker Integration¶

# Dockerfile for Ghidra analysis environment
FROM ubuntu:22.04

# Install dependencies
RUN apt-get update && apt-get install -y \
    openjdk-17-jdk \
    wget \
    unzip \
    git \
    build-essential \
    cmake \
    python3 \
    python3-pip \
    && rm -rf /var/lib/apt/lists/*

# Install Ghidra
WORKDIR /opt
RUN wget https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_10.4_build/ghidra_10.4_PUBLIC_20230928.zip \
    && unzip ghidra_10.4_PUBLIC_20230928.zip \
    && rm ghidra_10.4_PUBLIC_20230928.zip \
    && mv ghidra_10.4_PUBLIC ghidra

ENV GHIDRA_INSTALL_DIR=/opt/ghidra
ENV PATH=$PATH:$GHIDRA_INSTALL_DIR/support

# Install Python dependencies
RUN pip3 install ghidra-bridge requests

# Install Ghidra plugins
WORKDIR /tmp
RUN git clone https://github.com/google/binexport.git \
    && cd binexport \
    && mkdir build && cd build \
    && cmake .. \
    && make -j$(nproc) \
    && cp BinExport.jar $GHIDRA_INSTALL_DIR/Extensions/Ghidra/

# Create analysis scripts directory
RUN mkdir -p /opt/analysis-scripts

# Copy analysis scripts
COPY scripts/ /opt/analysis-scripts/

# Create workspace
RUN mkdir -p /workspace/projects /workspace/binaries /workspace/results

WORKDIR /workspace

# Entry point script
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh

ENTRYPOINT ["/entrypoint.sh"]

#!/bin/bash
# entrypoint.sh

set -e

# Default values
PROJECT_NAME=${PROJECT_NAME:-"analysis_project"}
BINARY_PATH=${BINARY_PATH:-""}
ANALYSIS_SCRIPTS=${ANALYSIS_SCRIPTS:-""}
OUTPUT_DIR=${OUTPUT_DIR:-"/workspace/results"}

# Create output directory
mkdir -p "$OUTPUT_DIR"

if [ -z "$BINARY_PATH" ]; then
    echo "Error: BINARY_PATH environment variable must be set"
    exit 1
fi

if [ ! -f "$BINARY_PATH" ]; then
    echo "Error: Binary file not found: $BINARY_PATH"
    exit 1
fi

echo "Starting Ghidra analysis..."
echo "Binary: $BINARY_PATH"
echo "Project: $PROJECT_NAME"
echo "Output: $OUTPUT_DIR"

# Build analysis command
ANALYSIS_CMD="$GHIDRA_INSTALL_DIR/support/analyzeHeadless \
    /workspace/projects \
    $PROJECT_NAME \
    -import $BINARY_PATH \
    -overwrite"

# Add analysis scripts if specified
if [ -n "$ANALYSIS_SCRIPTS" ]; then
    for script in $ANALYSIS_SCRIPTS; do
        if [ -f "/opt/analysis-scripts/$script" ]; then
            ANALYSIS_CMD="$ANALYSIS_CMD -postScript /opt/analysis-scripts/$script"
        else
            echo "Warning: Script not found: $script"
        fi
    done
fi

# Run analysis
eval $ANALYSIS_CMD

# Copy results
if [ -d "/workspace/projects/$PROJECT_NAME.rep" ]; then
    cp -r "/workspace/projects/$PROJECT_NAME.rep" "$OUTPUT_DIR/"
fi

echo "Analysis complete. Results saved to $OUTPUT_DIR"

# Keep container running if requested
if [ "$KEEP_RUNNING" = "true" ]; then
    echo "Keeping container running..."
    tail -f /dev/null
fi

# Docker usage examples

# Build the image
docker build -t ghidra-analysis .

# Analyze a single binary
docker run --rm \
    -v /path/to/binary:/workspace/binaries/target:ro \
    -v /path/to/results:/workspace/results \
    -e BINARY_PATH=/workspace/binaries/target \
    -e PROJECT_NAME=my_analysis \
    -e ANALYSIS_SCRIPTS="export_functions.py detect_crypto.py" \
    ghidra-analysis

# Interactive analysis
docker run -it \
    -v /path/to/binaries:/workspace/binaries:ro \
    -v /path/to/results:/workspace/results \
    -e KEEP_RUNNING=true \
    ghidra-analysis bash

# Batch analysis with docker-compose
cat > docker-compose.yml << EOF
version: '3.8'

services:
  ghidra-analysis:
    build: .
    volumes:
      - ./binaries:/workspace/binaries:ro
      - ./results:/workspace/results
      - ./custom-scripts:/opt/analysis-scripts/custom:ro
    environment:
      - PROJECT_NAME=batch_analysis
      - ANALYSIS_SCRIPTS=export_functions.py detect_crypto.py custom/my_script.py
    command: |
      bash -c "
        for binary in /workspace/binaries/*; do
          if [ -f \"\$binary\" ]; then
            echo \"Analyzing \$(basename \$binary)\"
            BINARY_PATH=\"\$binary\" \
            PROJECT_NAME=\"analysis_\$(basename \$binary)\" \
            /entrypoint.sh
          fi
        done
      "
EOF

docker-compose up

Recursos y documentación¶

Recursos oficiales¶

Ghidra GitHub Repository - Código fuente y plugins oficiales
Ghidra Documentation - Documentación y guías oficiales
Ghidra API Documentation - Referencia completa de API
Ghidra Plugin Development Guide - Manual de desarrollo del plugin oficial

Plugins comunitarios y extensiones¶

Ghidra Plugin Repository - Lista curada de plugins
BinExport - Exportar a BinDiff y BinNavi
GhidraBridge - Puente Python para Ghidra
Ghidra2Frida - Generar ganchos Frida
Ghidra Jupyter - Integración de la libreta de Jupyter

Recursos didácticos¶

Ghidra Training Materials - Cursos oficiales de capacitación
Ghidra Scripting Tutorial - Guía de scripting
Ingeniería Reversa con Ghidra - Libro completo
Ghidra Blog Posts - Mensajes oficiales del blog de NSA

Desarrollo y contribución¶

Ghidra Development Guide - Development setup
Contribuir a Ghidra - Directrices de contribución
Ghidra Issue Tracker - Informes de errores y solicitudes de características
Discusiones Ghidra - Debates comunitarios