Lighthouse Plugin Cheat Sheet
Overview
Lighthouse is a code coverage plugin that visualizes coverage data directly inside a disassembler, with official support for IDA Pro and Binary Ninja and community ports for Ghidra. It's particularly useful for correlating fuzzing results with the disassembly, identifying uncovered code paths, and guiding security research efforts.
💡 Key Features: Real-time coverage visualization, multiple coverage format support, differential coverage analysis, fuzzing integration, and interactive coverage exploration within disassemblers.
Installation and Setup
IDA Pro Installation
# Clone Lighthouse repository
git clone https://github.com/gaasedelen/lighthouse.git
cd lighthouse
# Install Python dependencies
pip install -r requirements.txt
# For IDA Pro 7.x and later (plugin files live under plugins/ in the repo)
cp -r plugins/lighthouse "$IDADIR/plugins/"
cp plugins/lighthouse_plugin.py "$IDADIR/plugins/"
# Verify installation
# Start IDA Pro and check Edit -> Plugins -> Lighthouse
# Alternative: Install via pip (if available)
pip install lighthouse-ida
# Manual installation verification
python -c "
import sys
sys.path.append('/path/to/ida/python')
import lighthouse
print('Lighthouse installed successfully')
"
Ghidra Installation
# Clone Lighthouse repository
git clone https://github.com/gaasedelen/lighthouse.git
cd lighthouse
# Build Ghidra extension
cd ghidra_scripts
# Copy scripts to Ghidra script directory
cp *.py "$GHIDRA_INSTALL_DIR/Ghidra/Features/Python/ghidra_scripts/"
# Or use Ghidra's script manager
# 1. Open Ghidra
# 2. Window -> Script Manager
# 3. Script Directories -> Add
# 4. Select lighthouse/ghidra_scripts directory
# Install Python dependencies for Ghidra
pip install coverage
pip install pyqt5 # For GUI components
# Verify installation in Ghidra
# Window -> Script Manager -> Search for "lighthouse"
Coverage Data Sources Setup
# Install coverage tools for different formats
# Intel Pin (instrumentation framework; alternative to DynamoRIO)
wget https://software.intel.com/sites/landingpage/pintool/downloads/pin-3.21-98484-ge7cd811fd-gcc-linux.tar.gz
tar -xzf pin-3.21-98484-ge7cd811fd-gcc-linux.tar.gz
export PIN_ROOT=/path/to/pin
# DynamoRIO
git clone https://github.com/DynamoRIO/dynamorio.git
cd dynamorio
mkdir build && cd build
cmake ..
make -j$(nproc)
# AFL++ for fuzzing coverage
git clone https://github.com/AFLplusplus/AFLplusplus.git
cd AFLplusplus
make
sudo make install
# Frida for dynamic instrumentation
pip install frida-tools
npm install frida
# GCOV for compile-time coverage
# Already available with GCC
gcc --coverage source.c -o binary
# LLVM coverage
clang -fprofile-instr-generate -fcoverage-mapping source.c -o binary
# Verify coverage tools
which afl-fuzz
which frida
which gcov
which llvm-profdata
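GCOV's textual output can be inspected without any plugin: each line of a `.gcov` report is `count:lineno:source`, where `-` marks a non-executable line and `#####` (or `=====` for exceptional paths) marks an executable line that never ran. A minimal sketch of a line-coverage calculator over that format (the function name is my own):

```python
def gcov_line_coverage(gcov_text: str) -> float:
    """Return executed/executable line ratio for one .gcov report."""
    executed = executable = 0
    for line in gcov_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) < 3:
            continue
        count = parts[0].strip()
        if count == "-":  # non-executable: declarations, blank lines, etc.
            continue
        executable += 1
        if count not in ("#####", "====="):  # unexecuted-line markers
            executed += 1
    return executed / executable if executable else 0.0
```

For example, a report with one executed and one unexecuted line yields 0.5.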
Configuration and Setup
# Lighthouse configuration file (~/.lighthouse/config.json)
{
  "coverage_formats": {
    "drcov": {
      "enabled": true,
      "color": "#FF6B6B"
    },
    "intel_pin": {
      "enabled": true,
      "color": "#4ECDC4"
    },
    "frida": {
      "enabled": true,
      "color": "#45B7D1"
    },
    "lighthouse": {
      "enabled": true,
      "color": "#96CEB4"
    }
  },
  "ui_settings": {
    "auto_load": true,
    "show_coverage_percentage": true,
    "highlight_uncovered": true,
    "coverage_threshold": 0.8
  },
  "performance": {
    "max_coverage_files": 100,
    "cache_coverage_data": true,
    "async_loading": true
  }
}
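A small loader for a config file shaped like the one above, falling back to defaults when the file or a section is missing (the path and key names follow the example, not an official Lighthouse schema):

```python
import json
import os

# Defaults mirror the example config sections above
DEFAULTS = {
    "ui_settings": {"auto_load": True, "coverage_threshold": 0.8},
    "performance": {"max_coverage_files": 100, "cache_coverage_data": True},
}

def load_config(path="~/.lighthouse/config.json"):
    """Merge the on-disk config over the defaults, one section at a time."""
    config = {section: dict(values) for section, values in DEFAULTS.items()}
    full_path = os.path.expanduser(path)
    if os.path.exists(full_path):
        with open(full_path) as f:
            user = json.load(f)
        for section, values in user.items():
            config.setdefault(section, {}).update(values)
    return config
```

Missing keys keep their default, so a partial config file stays valid.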
Basic Coverage Analysis
Loading Coverage Data
# IDA Pro - Load coverage data
import lighthouse
# Load single coverage file
lighthouse.load_coverage("coverage.drcov")
# Load multiple coverage files
coverage_files = [
    "run1.drcov",
    "run2.drcov",
    "run3.drcov"
]
for coverage_file in coverage_files:
    lighthouse.load_coverage(coverage_file)
# Load coverage directory
lighthouse.load_coverage_directory("/path/to/coverage/files/")
# Load with specific format
lighthouse.load_coverage("coverage.log", format="intel_pin")
# Load Frida coverage
lighthouse.load_coverage("frida_trace.json", format="frida")
# Load GCOV coverage
lighthouse.load_coverage("coverage.gcov", format="gcov")
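Under the hood, a drcov file is an ASCII header (version line, module table) followed by a packed array of basic-block records: a uint32 module-relative offset, a uint16 block size, and a uint16 module id. A minimal standalone parser for that layout, useful for sanity-checking coverage files before loading them (this assumes the common version-2 binary BB table):

```python
import struct

def parse_drcov(data: bytes):
    """Split the ASCII header from the binary basic-block table and decode it."""
    marker = b"BB Table: "
    idx = data.index(marker)
    header_end = data.index(b"\n", idx) + 1
    header = data[:header_end].decode("utf-8", "replace").splitlines()
    bb_count = int(header[-1].split()[2])  # last header line: "BB Table: <n> bbs"
    blocks = []
    # Each record is 8 bytes: uint32 offset, uint16 size, uint16 module id
    for off in range(header_end, header_end + bb_count * 8, 8):
        start, size, mod_id = struct.unpack_from("<IHH", data, off)
        blocks.append({"module": mod_id, "offset": start, "size": size})
    return blocks
```

Each returned record gives the block's offset relative to its module's load base, which is how Lighthouse rebases coverage onto the disassembly.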
Coverage Visualization
# Enable coverage visualization
lighthouse.show_coverage()
# Hide coverage visualization
lighthouse.hide_coverage()
# Toggle coverage display
lighthouse.toggle_coverage()
# Set coverage colors
lighthouse.set_coverage_color(0xFF6B6B) # Red for covered
lighthouse.set_uncovered_color(0x808080) # Gray for uncovered
# Highlight specific coverage
lighthouse.highlight_coverage("run1.drcov")
# Show coverage statistics
stats = lighthouse.get_coverage_stats()
print(f"Total functions: {stats['total_functions']}")
print(f"Covered functions: {stats['covered_functions']}")
print(f"Coverage percentage: {stats['coverage_percentage']:.2f}%")
# Navigate to uncovered code
lighthouse.goto_next_uncovered()
lighthouse.goto_previous_uncovered()
# Find uncovered functions
uncovered_functions = lighthouse.get_uncovered_functions()
for func in uncovered_functions:
    print(f"Uncovered function: {func['name']} at 0x{func['address']:x}")
Coverage Comparison
# Compare two coverage runs
lighthouse.compare_coverage("baseline.drcov", "new_run.drcov")
# Differential coverage analysis
diff_coverage = lighthouse.differential_coverage([
    "run1.drcov",
    "run2.drcov",
    "run3.drcov"
])
# Show only new coverage
lighthouse.show_new_coverage_only()
# Show coverage intersection
lighthouse.show_coverage_intersection()
# Show coverage union
lighthouse.show_coverage_union()
# Export coverage comparison
lighthouse.export_coverage_diff("coverage_diff.json")
# Generate coverage report
report = lighthouse.generate_coverage_report()
with open("coverage_report.html", "w") as f:
    f.write(report)
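The comparison operations above reduce to set algebra over covered addresses, so they can be sanity-checked outside the disassembler. A sketch with plain Python sets (function names are my own; each run is assumed to be an iterable of covered addresses):

```python
def coverage_diff(baseline, new_run):
    """Addresses hit only by the new run, only by the baseline, and by both."""
    baseline, new_run = set(baseline), set(new_run)
    return {
        "new_only": new_run - baseline,   # differential / "new coverage"
        "lost": baseline - new_run,
        "shared": baseline & new_run,     # intersection
    }

def coverage_union(runs):
    """Addresses hit by any run."""
    out = set()
    for run in runs:
        out |= set(run)
    return out

def coverage_intersection(runs):
    """Addresses hit by every run."""
    runs = [set(run) for run in runs]
    return set.intersection(*runs) if runs else set()
```

"Show new coverage only" is then just the `new_only` set rendered over the disassembly.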
Advanced Coverage Analysis
Fuzzing Integration
# AFL++ integration
import subprocess
import time
import os
class AFLLighthouseIntegration:
    def __init__(self, target_binary, afl_output_dir):
        self.target_binary = target_binary
        self.afl_output_dir = afl_output_dir
        self.coverage_files = []

    def start_afl_fuzzing(self, input_dir, timeout=3600):
        """Start AFL fuzzing with coverage collection"""
        # Prepare AFL command
        afl_cmd = [
            "afl-fuzz",
            "-i", input_dir,
            "-o", self.afl_output_dir,
            "-t", "1000",  # Timeout in ms
            "-m", "none",  # No memory limit
            "--", self.target_binary, "@@"
        ]
        print(f"Starting AFL fuzzing: {' '.join(afl_cmd)}")
        # Start AFL in background
        afl_process = subprocess.Popen(afl_cmd)
        # Monitor for new test cases and collect coverage
        start_time = time.time()
        while time.time() - start_time < timeout:
            self.collect_afl_coverage()
            time.sleep(60)  # Check every minute
        # Stop AFL
        afl_process.terminate()
        return self.coverage_files

    def collect_afl_coverage(self):
        """Collect coverage from AFL test cases"""
        queue_dir = os.path.join(self.afl_output_dir, "default", "queue")
        if not os.path.exists(queue_dir):
            return
        # Get new test cases
        test_cases = [f for f in os.listdir(queue_dir) if f.startswith("id:")]
        for test_case in test_cases:
            test_path = os.path.join(queue_dir, test_case)
            coverage_file = f"coverage_{test_case}.drcov"
            if coverage_file not in self.coverage_files:
                # Run test case with coverage collection
                self.run_with_coverage(test_path, coverage_file)
                self.coverage_files.append(coverage_file)

    def run_with_coverage(self, input_file, coverage_file):
        """Run target with coverage collection using DynamoRIO"""
        drrun_cmd = [
            "drrun",
            "-t", "drcov",
            "-dump_text",
            "-logdir", ".",
            "-logprefix", coverage_file.replace(".drcov", ""),
            "--", self.target_binary, input_file
        ]
        try:
            subprocess.run(drrun_cmd, timeout=10, capture_output=True)
        except subprocess.TimeoutExpired:
            pass  # Timeout is expected for some test cases
# Usage example
afl_integration = AFLLighthouseIntegration("./target_binary", "./afl_output")
coverage_files = afl_integration.start_afl_fuzzing("./input_seeds", timeout=1800)
# Load coverage in Lighthouse
for coverage_file in coverage_files:
    lighthouse.load_coverage(coverage_file)
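AFL++ also writes a plain-text `fuzzer_stats` file (one `key : value` pair per line) into each fuzzer's output directory, which is handy for deciding when to stop a campaign like the one above. A tolerant parser sketch (the exact key names vary between AFL versions, so treat those in the test as examples):

```python
def parse_fuzzer_stats(text: str) -> dict:
    """Parse AFL's 'key : value' stats lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        # Split only on the first colon: values (e.g. command_line) may contain ':'
        key, value = line.split(":", 1)
        stats[key.strip()] = value.strip()
    return stats
```

Polling this file is cheaper than re-running every queue entry under DynamoRIO on each loop iteration.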
Custom Coverage Collection
# Custom coverage collector using Frida
import frida
import json
import time
class FridaCoverageCollector:
    def __init__(self, target_process):
        self.target_process = target_process
        self.session = None
        self.script = None
        self.coverage_data = []

    def attach_to_process(self):
        """Attach Frida to target process"""
        try:
            self.session = frida.attach(self.target_process)
            print(f"Attached to process: {self.target_process}")
        except frida.ProcessNotFoundError:
            print(f"Process {self.target_process} not found")
            return False
        return True

    def start_coverage_collection(self, module_name=None):
        """Start collecting coverage data"""
        # Frida script for coverage collection
        frida_script = """
var coverage_data = [];
var module_base = null;
var module_size = 0;

// Get module information
if ("%s") {
    var module = Process.getModuleByName("%s");
    module_base = module.base;
    module_size = module.size;
    console.log("Monitoring module: " + module.name + " at " + module_base);
} else {
    // Monitor main module
    var modules = Process.enumerateModules();
    if (modules.length > 0) {
        module_base = modules[0].base;
        module_size = modules[0].size;
        console.log("Monitoring main module: " + modules[0].name);
    }
}

// Trace execution with Stalker: record each basic block when it is first
// compiled (a single Interceptor hook on the module base would only fire
// for that one address, not for every executed instruction)
Process.enumerateThreads().forEach(function (thread) {
    Stalker.follow(thread.id, {
        events: { compile: true },
        onReceive: function (events) {
            Stalker.parse(events, { stringify: false, annotate: false })
                .forEach(function (ev) {
                    var start = ev[0];
                    if (module_base !== null &&
                        start.compare(module_base) >= 0 &&
                        start.compare(module_base.add(module_size)) < 0) {
                        coverage_data.push({
                            address: start.toString(),
                            timestamp: Date.now()
                        });
                    }
                });
        }
    });
});

// Export coverage data
rpc.exports.getCoverageData = function() {
    return coverage_data;
};
rpc.exports.clearCoverageData = function() {
    coverage_data = [];
};
""" % (module_name or "", module_name or "")
        self.script = self.session.create_script(frida_script)
        self.script.load()
        print("Coverage collection started")

    def get_coverage_data(self):
        """Get collected coverage data"""
        if self.script:
            return self.script.exports.get_coverage_data()
        return []

    def save_coverage(self, output_file):
        """Save coverage data to file"""
        coverage_data = self.get_coverage_data()
        # Convert to Lighthouse format
        lighthouse_format = {
            "version": "1.0",
            "type": "frida",
            "coverage": []
        }
        for entry in coverage_data:
            lighthouse_format["coverage"].append({
                "address": entry["address"],
                "hit_count": 1,
                "timestamp": entry["timestamp"]
            })
        with open(output_file, "w") as f:
            json.dump(lighthouse_format, f, indent=2)
        print(f"Coverage saved to: {output_file}")

    def detach(self):
        """Detach from process"""
        if self.session:
            self.session.detach()
            print("Detached from process")
# Usage example
collector = FridaCoverageCollector("target_process")
if collector.attach_to_process():
    collector.start_coverage_collection("main_module.exe")
    # Let it run for some time
    time.sleep(30)
    # Save coverage
    collector.save_coverage("frida_coverage.json")
    collector.detach()
# Load in Lighthouse
lighthouse.load_coverage("frida_coverage.json", format="frida")
Coverage-Guided Analysis
# Coverage-guided vulnerability research
import os
import subprocess
import time

class CoverageGuidedAnalysis:
    def __init__(self, binary_path):
        self.binary_path = binary_path
        self.coverage_data = {}
        self.interesting_functions = []
        self.uncovered_paths = []

    def analyze_coverage_gaps(self):
        """Analyze coverage gaps to find interesting code paths"""
        # Get all functions in binary
        all_functions = lighthouse.get_all_functions()
        covered_functions = lighthouse.get_covered_functions()
        # Find uncovered functions
        uncovered_functions = []
        for func in all_functions:
            if func not in covered_functions:
                uncovered_functions.append(func)
        # Analyze uncovered functions for interesting patterns
        for func in uncovered_functions:
            func_analysis = self.analyze_function(func)
            if func_analysis["interesting"]:
                self.interesting_functions.append({
                    "function": func,
                    "reason": func_analysis["reason"],
                    "priority": func_analysis["priority"]
                })
        # Sort by priority
        self.interesting_functions.sort(key=lambda x: x["priority"], reverse=True)
        return self.interesting_functions

    def analyze_function(self, function):
        """Analyze function for interesting characteristics"""
        analysis = {
            "interesting": False,
            "reason": [],
            "priority": 0
        }
        # Get function disassembly
        disasm = lighthouse.get_function_disassembly(function)
        # Check for interesting patterns
        if "strcpy" in disasm or "sprintf" in disasm:
            analysis["interesting"] = True
            analysis["reason"].append("Potentially unsafe string operations")
            analysis["priority"] += 30
        if "malloc" in disasm or "free" in disasm:
            analysis["interesting"] = True
            analysis["reason"].append("Memory management operations")
            analysis["priority"] += 20
        if "system" in disasm or "exec" in disasm:
            analysis["interesting"] = True
            analysis["reason"].append("System command execution")
            analysis["priority"] += 40
        if "crypto" in disasm.lower() or "encrypt" in disasm.lower():
            analysis["interesting"] = True
            analysis["reason"].append("Cryptographic operations")
            analysis["priority"] += 25
        # Check for error handling paths
        if "error" in disasm.lower() or "exception" in disasm.lower():
            analysis["interesting"] = True
            analysis["reason"].append("Error handling code")
            analysis["priority"] += 15
        # Check for network operations
        if "socket" in disasm or "connect" in disasm or "send" in disasm:
            analysis["interesting"] = True
            analysis["reason"].append("Network operations")
            analysis["priority"] += 35
        return analysis

    def generate_test_cases(self, target_function):
        """Generate test cases to reach uncovered function"""
        # Analyze function parameters and calling conventions
        func_info = lighthouse.get_function_info(target_function)
        # Generate inputs based on function signature
        test_cases = []
        if func_info["parameters"]:
            for param_type in func_info["parameters"]:
                if param_type == "string":
                    test_cases.extend([
                        "A" * 10,
                        "A" * 100,
                        "A" * 1000,
                        "../../../etc/passwd",
                        "%s%s%s%s",
                        "\x00\x01\x02\x03"
                    ])
                elif param_type == "integer":
                    test_cases.extend([
                        0, 1, -1, 0x7FFFFFFF, 0x80000000,
                        0xFFFFFFFF, 0x100000000
                    ])
        return test_cases

    def guided_fuzzing_campaign(self):
        """Run coverage-guided fuzzing campaign"""
        # Analyze current coverage gaps
        interesting_functions = self.analyze_coverage_gaps()
        print(f"Found {len(interesting_functions)} interesting uncovered functions")
        for func_info in interesting_functions[:10]:  # Top 10 priority
            func = func_info["function"]
            print(f"Targeting function: {func['name']} (Priority: {func_info['priority']})")
            # Generate test cases for this function
            test_cases = self.generate_test_cases(func)
            # Run fuzzing campaign targeting this function
            self.run_targeted_fuzzing(func, test_cases)
            # Check if we achieved coverage
            if lighthouse.is_function_covered(func):
                print(f"Successfully covered function: {func['name']}")
            else:
                print(f"Failed to cover function: {func['name']}")

    def run_targeted_fuzzing(self, target_function, test_cases):
        """Run fuzzing campaign targeting specific function"""
        # Create input files for test cases
        input_dir = f"inputs_{target_function['name']}"
        os.makedirs(input_dir, exist_ok=True)
        for i, test_case in enumerate(test_cases):
            input_file = os.path.join(input_dir, f"input_{i:04d}")
            with open(input_file, "wb") as f:
                if isinstance(test_case, str):
                    f.write(test_case.encode())
                else:
                    # Integers are serialized as decimal text; bytes([n])
                    # would fail for values outside 0-255
                    f.write(str(test_case).encode())
        # Run AFL with coverage feedback
        afl_cmd = [
            "afl-fuzz",
            "-i", input_dir,
            "-o", f"output_{target_function['name']}",
            "-t", "1000",
            "-m", "none",
            "--", self.binary_path, "@@"
        ]
        # Run for limited time
        process = subprocess.Popen(afl_cmd)
        time.sleep(300)  # 5 minutes
        process.terminate()
        # Collect and load new coverage
        self.collect_campaign_coverage(f"output_{target_function['name']}")

    def collect_campaign_coverage(self, output_dir):
        """Minimal collector: load any drcov logs found in the campaign output"""
        for name in os.listdir(output_dir):
            if name.endswith((".drcov", ".log")):
                lighthouse.load_coverage(os.path.join(output_dir, name))
# Usage example
analysis = CoverageGuidedAnalysis("./target_binary")
analysis.guided_fuzzing_campaign()
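The pattern-based prioritization in `analyze_function()` can be factored into a standalone scorer over disassembly text, which makes the weights easy to tune and unit-test. The keyword table mirrors the heuristics above and is illustrative, not exhaustive:

```python
# Keyword -> (reason, priority weight); mirrors the heuristics above
INTEREST_PATTERNS = {
    "strcpy": ("Potentially unsafe string operations", 30),
    "sprintf": ("Potentially unsafe string operations", 30),
    "malloc": ("Memory management operations", 20),
    "system": ("System command execution", 40),
    "socket": ("Network operations", 35),
}

def score_disassembly(disasm: str):
    """Return (priority, reasons) for one function's disassembly listing."""
    text = disasm.lower()
    priority, reasons = 0, []
    for keyword, (reason, weight) in INTEREST_PATTERNS.items():
        if keyword in text:
            priority += weight
            if reason not in reasons:  # one reason per category
                reasons.append(reason)
    return priority, reasons
```

Sorting candidate functions by the returned priority reproduces the targeting order used by `guided_fuzzing_campaign()`.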
Coverage Reporting and Metrics
Comprehensive Coverage Reports
# Generate detailed coverage reports
import json
import time

class LighthouseCoverageReporter:
    def __init__(self):
        self.coverage_data = {}
        self.metrics = {}
    def generate_comprehensive_report(self, output_file="coverage_report.html"):
        """Generate comprehensive HTML coverage report"""
        # Collect coverage metrics
        self.collect_coverage_metrics()
        # Generate HTML report
        html_content = self.create_html_report()
        with open(output_file, "w") as f:
            f.write(html_content)
        print(f"Coverage report generated: {output_file}")

    def collect_coverage_metrics(self):
        """Collect comprehensive coverage metrics"""
        # Basic coverage statistics
        self.metrics["basic"] = lighthouse.get_coverage_stats()
        # Function-level coverage
        all_functions = lighthouse.get_all_functions()
        covered_functions = lighthouse.get_covered_functions()
        self.metrics["functions"] = {
            "total": len(all_functions),
            "covered": len(covered_functions),
            "uncovered": len(all_functions) - len(covered_functions),
            "coverage_percentage": (len(covered_functions) / len(all_functions)) * 100
        }
        # Basic block coverage
        all_blocks = lighthouse.get_all_basic_blocks()
        covered_blocks = lighthouse.get_covered_basic_blocks()
        self.metrics["basic_blocks"] = {
            "total": len(all_blocks),
            "covered": len(covered_blocks),
            "uncovered": len(all_blocks) - len(covered_blocks),
            "coverage_percentage": (len(covered_blocks) / len(all_blocks)) * 100
        }
        # Instruction coverage
        all_instructions = lighthouse.get_all_instructions()
        covered_instructions = lighthouse.get_covered_instructions()
        self.metrics["instructions"] = {
            "total": len(all_instructions),
            "covered": len(covered_instructions),
            "uncovered": len(all_instructions) - len(covered_instructions),
            "coverage_percentage": (len(covered_instructions) / len(all_instructions)) * 100
        }
        # Coverage by module
        self.metrics["modules"] = {}
        modules = lighthouse.get_modules()
        for module in modules:
            module_coverage = lighthouse.get_module_coverage(module)
            self.metrics["modules"][module["name"]] = module_coverage
        # Coverage hotspots (most frequently hit)
        self.metrics["hotspots"] = lighthouse.get_coverage_hotspots(limit=20)
        # Coverage timeline
        self.metrics["timeline"] = lighthouse.get_coverage_timeline()
    def create_html_report(self):
        """Create HTML coverage report"""
        # Literal braces in the CSS are doubled so that str.format() only
        # substitutes the {placeholder} fields (otherwise it raises KeyError)
        html_template = """
<!DOCTYPE html>
<html>
<head>
<title>Lighthouse Coverage Report</title>
<style>
body {{ font-family: Arial, sans-serif; margin: 20px; }}
.header {{ background-color: #f0f0f0; padding: 20px; border-radius: 5px; }}
.metric-box {{ display: inline-block; margin: 10px; padding: 15px;
              border: 1px solid #ddd; border-radius: 5px; min-width: 150px; }}
.covered {{ background-color: #d4edda; }}
.uncovered {{ background-color: #f8d7da; }}
.partial {{ background-color: #fff3cd; }}
table {{ border-collapse: collapse; width: 100%; margin: 20px 0; }}
th, td {{ border: 1px solid #ddd; padding: 8px; text-align: left; }}
th {{ background-color: #f2f2f2; }}
.progress-bar {{ width: 100%; height: 20px; background-color: #f0f0f0; border-radius: 10px; }}
.progress-fill {{ height: 100%; background-color: #28a745; border-radius: 10px; }}
.chart {{ margin: 20px 0; }}
</style>
<script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
</head>
<body>
<div class="header">
<h1>Lighthouse Coverage Report</h1>
<p>Generated on: {timestamp}</p>
<p>Binary: {binary_name}</p>
</div>
<h2>Coverage Summary</h2>
<div class="metric-box covered">
<h3>Function Coverage</h3>
<div class="progress-bar">
<div class="progress-fill" style="width: {function_coverage}%"></div>
</div>
<p>{covered_functions}/{total_functions} functions ({function_coverage:.1f}%)</p>
</div>
<div class="metric-box covered">
<h3>Basic Block Coverage</h3>
<div class="progress-bar">
<div class="progress-fill" style="width: {block_coverage}%"></div>
</div>
<p>{covered_blocks}/{total_blocks} blocks ({block_coverage:.1f}%)</p>
</div>
<div class="metric-box covered">
<h3>Instruction Coverage</h3>
<div class="progress-bar">
<div class="progress-fill" style="width: {instruction_coverage}%"></div>
</div>
<p>{covered_instructions}/{total_instructions} instructions ({instruction_coverage:.1f}%)</p>
</div>
<h2>Module Coverage</h2>
<table>
<tr>
<th>Module</th>
<th>Functions</th>
<th>Coverage</th>
<th>Progress</th>
</tr>
{module_rows}
</table>
<h2>Coverage Hotspots</h2>
<table>
<tr>
<th>Address</th>
<th>Function</th>
<th>Hit Count</th>
<th>Percentage</th>
</tr>
{hotspot_rows}
</table>
<h2>Uncovered Functions</h2>
<table>
<tr>
<th>Address</th>
<th>Function Name</th>
<th>Size</th>
<th>Complexity</th>
</tr>
{uncovered_rows}
</table>
<h2>Coverage Timeline</h2>
<div class="chart">
<canvas id="timelineChart" width="800" height="400"></canvas>
</div>
<script>
// Coverage timeline chart
var ctx = document.getElementById('timelineChart').getContext('2d');
var chart = new Chart(ctx, {{
    type: 'line',
    data: {{
        labels: {timeline_labels},
        datasets: [{{
            label: 'Coverage Percentage',
            data: {timeline_data},
            borderColor: 'rgb(75, 192, 192)',
            tension: 0.1
        }}]
    }},
    options: {{
        responsive: true,
        scales: {{
            y: {{
                beginAtZero: true,
                max: 100
            }}
        }}
    }}
}});
</script>
</body>
</html>
"""
        # Format template with metrics
        return html_template.format(
            timestamp=time.strftime("%Y-%m-%d %H:%M:%S"),
            binary_name=lighthouse.get_binary_name(),
            function_coverage=self.metrics["functions"]["coverage_percentage"],
            covered_functions=self.metrics["functions"]["covered"],
            total_functions=self.metrics["functions"]["total"],
            block_coverage=self.metrics["basic_blocks"]["coverage_percentage"],
            covered_blocks=self.metrics["basic_blocks"]["covered"],
            total_blocks=self.metrics["basic_blocks"]["total"],
            instruction_coverage=self.metrics["instructions"]["coverage_percentage"],
            covered_instructions=self.metrics["instructions"]["covered"],
            total_instructions=self.metrics["instructions"]["total"],
            module_rows=self.generate_module_rows(),
            hotspot_rows=self.generate_hotspot_rows(),
            uncovered_rows=self.generate_uncovered_rows(),
            timeline_labels=json.dumps([t["timestamp"] for t in self.metrics["timeline"]]),
            timeline_data=json.dumps([t["coverage"] for t in self.metrics["timeline"]])
        )
    def generate_module_rows(self):
        """Generate HTML rows for module coverage table"""
        rows = []
        for module_name, coverage in self.metrics["modules"].items():
            progress_width = coverage["coverage_percentage"]
            row = f"""
            <tr>
                <td>{module_name}</td>
                <td>{coverage['covered']}/{coverage['total']}</td>
                <td>{coverage['coverage_percentage']:.1f}%</td>
                <td>
                    <div class="progress-bar">
                        <div class="progress-fill" style="width: {progress_width}%"></div>
                    </div>
                </td>
            </tr>
            """
            rows.append(row)
        return "".join(rows)

    def generate_hotspot_rows(self):
        """Generate HTML rows for coverage hotspots"""
        rows = []
        for hotspot in self.metrics["hotspots"]:
            row = f"""
            <tr>
                <td>0x{hotspot['address']:x}</td>
                <td>{hotspot['function_name']}</td>
                <td>{hotspot['hit_count']}</td>
                <td>{hotspot['percentage']:.2f}%</td>
            </tr>
            """
            rows.append(row)
        return "".join(rows)

    def generate_uncovered_rows(self):
        """Generate HTML rows for uncovered functions"""
        uncovered_functions = lighthouse.get_uncovered_functions()
        rows = []
        for func in uncovered_functions[:50]:  # Limit to top 50
            row = f"""
            <tr>
                <td>0x{func['address']:x}</td>
                <td>{func['name']}</td>
                <td>{func['size']} bytes</td>
                <td>{func['complexity']}</td>
            </tr>
            """
            rows.append(row)
        return "".join(rows)
    def export_coverage_data(self, format="json"):
        """Export coverage data in various formats"""
        if format == "json":
            output_file = "coverage_data.json"
            with open(output_file, "w") as f:
                json.dump(self.metrics, f, indent=2)
        elif format == "csv":
            import csv
            # Export function coverage
            with open("function_coverage.csv", "w", newline="") as f:
                writer = csv.writer(f)
                writer.writerow(["Address", "Function", "Covered", "Hit Count"])
                for func in lighthouse.get_all_functions():
                    coverage_info = lighthouse.get_function_coverage(func)
                    writer.writerow([
                        f"0x{func['address']:x}",
                        func["name"],
                        coverage_info["covered"],
                        coverage_info["hit_count"]
                    ])
        elif format == "xml":
            import xml.etree.ElementTree as ET
            root = ET.Element("coverage_report")
            # Add summary
            summary = ET.SubElement(root, "summary")
            for key, value in self.metrics["functions"].items():
                elem = ET.SubElement(summary, key)
                elem.text = str(value)
            # Add functions
            functions = ET.SubElement(root, "functions")
            for func in lighthouse.get_all_functions():
                func_elem = ET.SubElement(functions, "function")
                func_elem.set("address", f"0x{func['address']:x}")
                func_elem.set("name", func["name"])
                coverage_info = lighthouse.get_function_coverage(func)
                func_elem.set("covered", str(coverage_info["covered"]))
                func_elem.set("hit_count", str(coverage_info["hit_count"]))
            tree = ET.ElementTree(root)
            tree.write("coverage_data.xml")
        print(f"Coverage data exported to {format} format")
# Usage example
reporter = LighthouseCoverageReporter()
reporter.generate_comprehensive_report("detailed_coverage_report.html")
reporter.export_coverage_data("json")
reporter.export_coverage_data("csv")
Integration with CI/CD
Automated Coverage Analysis
#!/bin/bash
# CI/CD integration script for Lighthouse coverage analysis
set -e
BINARY_PATH="$1"
COVERAGE_DIR="$2"
OUTPUT_DIR="$3"
THRESHOLD="$4"
if [ -z "$BINARY_PATH" ] || [ -z "$COVERAGE_DIR" ] || [ -z "$OUTPUT_DIR" ]; then
echo "Usage: $0 <binary_path> <coverage_dir> <output_dir> [threshold]"
exit 1
fi
THRESHOLD=${THRESHOLD:-80} # Default 80% coverage threshold
echo "Starting automated coverage analysis..."
echo "Binary: $BINARY_PATH"
echo "Coverage directory: $COVERAGE_DIR"
echo "Output directory: $OUTPUT_DIR"
echo "Coverage threshold: $THRESHOLD%"
mkdir -p "$OUTPUT_DIR"
# Generate coverage report (tee the output to a log so the summary below can grep it)
set +e  # capture the Python exit status instead of aborting immediately
python3 << EOF 2>&1 | tee "$OUTPUT_DIR/analysis.log"
import sys
sys.path.append('/path/to/lighthouse')
import lighthouse

# Load binary in headless mode
lighthouse.load_binary("$BINARY_PATH")

# Load all coverage files
import os
coverage_files = []
for root, dirs, files in os.walk("$COVERAGE_DIR"):
    for file in files:
        if file.endswith(('.drcov', '.cov', '.gcov')):
            coverage_files.append(os.path.join(root, file))

print(f"Found {len(coverage_files)} coverage files")
for coverage_file in coverage_files:
    try:
        lighthouse.load_coverage(coverage_file)
        print(f"Loaded: {coverage_file}")
    except Exception as e:
        print(f"Failed to load {coverage_file}: {e}")

# Generate metrics
stats = lighthouse.get_coverage_stats()
function_coverage = stats['function_coverage_percentage']
block_coverage = stats['block_coverage_percentage']
print(f"Function coverage: {function_coverage:.2f}%")
print(f"Block coverage: {block_coverage:.2f}%")

# Generate reports
lighthouse.export_coverage_report("$OUTPUT_DIR/coverage_report.html")
lighthouse.export_coverage_data("$OUTPUT_DIR/coverage_data.json")

# Generate uncovered functions list before the threshold check, so the
# summary step can count them even when the run fails
uncovered = lighthouse.get_uncovered_functions()
with open("$OUTPUT_DIR/uncovered_functions.txt", "w") as f:
    for func in uncovered:
        f.write(f"0x{func['address']:x} {func['name']}\n")
print(f"Found {len(uncovered)} uncovered functions")

# Check threshold
if function_coverage < $THRESHOLD:
    print(f"ERROR: Coverage {function_coverage:.2f}% below threshold $THRESHOLD%")
    sys.exit(1)
print(f"SUCCESS: Coverage {function_coverage:.2f}% meets threshold $THRESHOLD%")
EOF
PY_STATUS=${PIPESTATUS[0]}
set -e

# Generate summary for CI (values are grepped from the tee'd analysis log)
cat > "$OUTPUT_DIR/coverage_summary.txt" << EOF
Coverage Analysis Summary
========================
Date: $(date)
Binary: $BINARY_PATH
Coverage Files: $(find "$COVERAGE_DIR" -name "*.drcov" -o -name "*.cov" -o -name "*.gcov" | wc -l)
Threshold: $THRESHOLD%
Results:
- Function Coverage: $(grep "Function coverage:" "$OUTPUT_DIR/analysis.log" | sed 's/.*: //')
- Block Coverage: $(grep "Block coverage:" "$OUTPUT_DIR/analysis.log" | sed 's/.*: //')
- Uncovered Functions: $(wc -l < "$OUTPUT_DIR/uncovered_functions.txt")
Status: $(if [ "$PY_STATUS" -eq 0 ]; then echo "PASS"; else echo "FAIL"; fi)
EOF

if [ "$PY_STATUS" -ne 0 ]; then
    echo "Coverage threshold not met"
    exit 1
fi
echo "Coverage analysis complete!"
echo "Results saved to: $OUTPUT_DIR"
GitHub Actions Integration
# .github/workflows/coverage-analysis.yml
name: Coverage Analysis with Lighthouse

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]

jobs:
  coverage-analysis:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.9'

      - name: Install dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential cmake
          # Install DynamoRIO for coverage collection
          wget https://github.com/DynamoRIO/dynamorio/releases/download/release_9.0.1/DynamoRIO-Linux-9.0.1.tar.gz
          tar -xzf DynamoRIO-Linux-9.0.1.tar.gz
          # Persist the install path for later steps (a plain export would not survive)
          echo "DYNAMORIO_HOME=$PWD/DynamoRIO-Linux-9.0.1" >> "$GITHUB_ENV"
          # Install Lighthouse
          pip install lighthouse-ida
          # Install AFL++ for fuzzing
          git clone https://github.com/AFLplusplus/AFLplusplus.git
          cd AFLplusplus
          make
          sudo make install
          cd ..

      - name: Build target binary
        run: |
          gcc -g -O0 --coverage src/main.c -o target_binary

      - name: Run fuzzing campaign
        run: |
          mkdir -p input_seeds
          echo "test input" > input_seeds/seed1
          echo "another test" > input_seeds/seed2
          # Run AFL for limited time
          timeout 300 afl-fuzz -i input_seeds -o afl_output -t 1000 -m none -- ./target_binary @@ || true

      - name: Collect coverage data
        run: |
          mkdir -p coverage_data
          # Collect AFL coverage (drcov writes its own log names into -logdir)
          find afl_output -name "id:*" | head -20 | while read testcase; do
            "$DYNAMORIO_HOME/bin64/drrun" -t drcov -dump_text -logdir coverage_data -- ./target_binary "$testcase"
          done
          # Collect GCOV coverage
          gcov src/main.c
          mv *.gcov coverage_data/

      - name: Analyze coverage with Lighthouse
        run: |
          python3 scripts/lighthouse_analysis.py ./target_binary coverage_data coverage_results 75

      - name: Upload coverage results
        uses: actions/upload-artifact@v3
        with:
          name: coverage-results
          path: coverage_results/

      - name: Comment PR with coverage
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v6
        with:
          script: |
            const fs = require('fs');
            const summary = fs.readFileSync('coverage_results/coverage_summary.txt', 'utf8');
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## Coverage Analysis Results\n\n\`\`\`\n${summary}\n\`\`\``
            });

      - name: Fail if coverage below threshold
        run: |
          if grep -q "FAIL" coverage_results/coverage_summary.txt; then
            echo "Coverage analysis failed!"
            exit 1
          fi
Resources and Documentation
Official Resources
- Lighthouse GitHub Repository - Main repository and documentation
- Lighthouse Wiki - Comprehensive usage guide
- IDA Pro Plugin Development - IDA Python API documentation
- Ghidra Scripting - Ghidra scripting guide
Coverage Tools Integration
- DynamoRIO - Dynamic binary instrumentation platform
- Intel Pin - Binary instrumentation framework
- AFL++ - Advanced fuzzing framework
- Frida - Dynamic instrumentation toolkit
Research and Papers
- Code Coverage in Reverse Engineering - Academic research on coverage-guided RE
- Fuzzing with Code Coverage - AFL fuzzing methodology
- Binary Analysis with Coverage - Driller paper on coverage-guided analysis
Community Resources
- Lighthouse Users Group - Community discussions
- Reverse Engineering Stack Exchange - Q&A for RE topics
- r/ReverseEngineering - Reddit community
- Binary Analysis Discord - Real-time community chat