Lighthouse Plugin Cheat Sheet
Overview
Lighthouse is a code coverage plugin that visualizes coverage data directly inside a disassembler, with official support for IDA Pro and Binary Ninja and community ports for Ghidra. It's particularly useful for correlating fuzzing results with the disassembly, identifying uncovered code paths, and guiding security research efforts.
💡 Key Features: Real-time coverage visualization, multiple coverage format support, differential coverage analysis, fuzzing integration, and interactive coverage exploration within disassemblers.
Installation and Setup
IDA Pro Installation
# Clone Lighthouse repository
git clone https://github.com/gaasedelen/lighthouse.git
cd lighthouse
# Install Python dependencies
pip install -r requirements.txt
# For IDA Pro 7.x and later (plugin files live under plugins/ in the repo)
cp -r plugins/lighthouse "$IDADIR/plugins/"
cp plugins/lighthouse_plugin.py "$IDADIR/plugins/"
# Verify installation
# Start IDA Pro and check Edit -> Plugins -> Lighthouse
# Alternative: Install via pip (if available)
pip install lighthouse-ida
# Manual installation verification
python -c "
import sys
sys.path.append('/path/to/ida/python')
import lighthouse
print('Lighthouse installed successfully')
"
Ghidra Installation
# Clone Lighthouse repository
git clone https://github.com/gaasedelen/lighthouse.git
cd lighthouse
# Build Ghidra extension
cd ghidra_scripts
# Copy scripts to Ghidra script directory
cp *.py "$GHIDRA_INSTALL_DIR/Ghidra/Features/Python/ghidra_scripts/"
# Or use Ghidra's script manager
# 1. Open Ghidra
# 2. Window -> Script Manager
# 3. Script Directories -> Add
# 4. Select lighthouse/ghidra_scripts directory
# Install Python dependencies for Ghidra
pip install coverage
pip install pyqt5 # For GUI components
# Verify installation in Ghidra
# Window -> Script Manager -> Search for "lighthouse"
Coverage Data Sources Setup
# Install coverage tools for different formats
# Intel Pin (instrumentation framework; alternative to DynamoRIO)
wget https://software.intel.com/sites/landingpage/pintool/downloads/pin-3.21-98484-ge7cd811fd-gcc-linux.tar.gz
tar -xzf pin-3.21-98484-ge7cd811fd-gcc-linux.tar.gz
export PIN_ROOT=/path/to/pin
# DynamoRIO
git clone https://github.com/DynamoRIO/dynamorio.git
cd dynamorio
mkdir build && cd build
cmake ..
make -j$(nproc)
# AFL++ for fuzzing coverage
git clone https://github.com/AFLplusplus/AFLplusplus.git
cd AFLplusplus
make
sudo make install
# Frida for dynamic instrumentation
pip install frida-tools
npm install frida
# GCOV for compile-time coverage
# Already available with GCC
gcc --coverage source.c -o binary
# LLVM coverage
clang -fprofile-instr-generate -fcoverage-mapping source.c -o binary
# Verify coverage tools
which afl-fuzz
which frida
which gcov
which llvm-profdata
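GCOV's textual output can be inspected without any plugin: each line of a `.gcov` report is `count:lineno:source`, where `-` marks a non-executable line and `#####` (or `=====` for exceptional paths) marks an executable line that never ran. A minimal sketch of a line-coverage calculator over that format (the function name is my own):

```python
def gcov_line_coverage(gcov_text: str) -> float:
    """Return executed/executable line ratio for one .gcov report."""
    executed = executable = 0
    for line in gcov_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) < 3:
            continue
        count = parts[0].strip()
        if count == "-":  # non-executable: declarations, blank lines, etc.
            continue
        executable += 1
        if count not in ("#####", "====="):  # unexecuted-line markers
            executed += 1
    return executed / executable if executable else 0.0
```

For example, a report with one executed and one unexecuted line yields 0.5.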
Configuration and Setup
# Lighthouse configuration file (~/.lighthouse/config.json)
{
  "coverage_formats": {
    "drcov": {
      "enabled": true,
      "color": "#FF6B6B"
    },
    "intel_pin": {
      "enabled": true,
      "color": "#4ECDC4"
    },
    "frida": {
      "enabled": true,
      "color": "#45B7D1"
    },
    "lighthouse": {
      "enabled": true,
      "color": "#96CEB4"
    }
  },
  "ui_settings": {
    "auto_load": true,
    "show_coverage_percentage": true,
    "highlight_uncovered": true,
    "coverage_threshold": 0.8
  },
  "performance": {
    "max_coverage_files": 100,
    "cache_coverage_data": true,
    "async_loading": true
  }
}
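A small loader for a config file shaped like the one above, falling back to defaults when the file or a section is missing (the path and key names follow the example, not an official Lighthouse schema):

```python
import json
import os

# Defaults mirror the example config sections above
DEFAULTS = {
    "ui_settings": {"auto_load": True, "coverage_threshold": 0.8},
    "performance": {"max_coverage_files": 100, "cache_coverage_data": True},
}

def load_config(path="~/.lighthouse/config.json"):
    """Merge the on-disk config over the defaults, one section at a time."""
    config = {section: dict(values) for section, values in DEFAULTS.items()}
    full_path = os.path.expanduser(path)
    if os.path.exists(full_path):
        with open(full_path) as f:
            user = json.load(f)
        for section, values in user.items():
            config.setdefault(section, {}).update(values)
    return config
```

Missing keys keep their default, so a partial config file stays valid.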
Basic Coverage Analysis
Loading Coverage Data
# IDA Pro - Load coverage data
import lighthouse
# Load single coverage file
lighthouse.load_coverage("coverage.drcov")
# Load multiple coverage files
coverage_files = [
    "run1.drcov",
    "run2.drcov",
    "run3.drcov"
]
for coverage_file in coverage_files:
    lighthouse.load_coverage(coverage_file)
# Load coverage directory
lighthouse.load_coverage_directory("/path/to/coverage/files/")
# Load with specific format
lighthouse.load_coverage("coverage.log", format="intel_pin")
# Load Frida coverage
lighthouse.load_coverage("frida_trace.json", format="frida")
# Load GCOV coverage
lighthouse.load_coverage("coverage.gcov", format="gcov")
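Under the hood, a drcov file is an ASCII header (version line, module table) followed by a packed array of basic-block records: a uint32 module-relative offset, a uint16 block size, and a uint16 module id. A minimal standalone parser for that layout, useful for sanity-checking coverage files before loading them (this assumes the common version-2 binary BB table):

```python
import struct

def parse_drcov(data: bytes):
    """Split the ASCII header from the binary basic-block table and decode it."""
    marker = b"BB Table: "
    idx = data.index(marker)
    header_end = data.index(b"\n", idx) + 1
    header = data[:header_end].decode("utf-8", "replace").splitlines()
    bb_count = int(header[-1].split()[2])  # last header line: "BB Table: <n> bbs"
    blocks = []
    # Each record is 8 bytes: uint32 offset, uint16 size, uint16 module id
    for off in range(header_end, header_end + bb_count * 8, 8):
        start, size, mod_id = struct.unpack_from("<IHH", data, off)
        blocks.append({"module": mod_id, "offset": start, "size": size})
    return blocks
```

Each returned record gives the block's offset relative to its module's load base, which is how Lighthouse rebases coverage onto the disassembly.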
Coverage Visualization
# Enable coverage visualization
lighthouse.show_coverage()
# Hide coverage visualization
lighthouse.hide_coverage()
# Toggle coverage display
lighthouse.toggle_coverage()
# Set coverage colors
lighthouse.set_coverage_color(0xFF6B6B) # Red for covered
lighthouse.set_uncovered_color(0x808080) # Gray for uncovered
# Highlight specific coverage
lighthouse.highlight_coverage("run1.drcov")
# Show coverage statistics
stats = lighthouse.get_coverage_stats()
print(f"Total functions: {stats['total_functions']}")
print(f"Covered functions: {stats['covered_functions']}")
print(f"Coverage percentage: {stats['coverage_percentage']:.2f}%")
# Navigate to uncovered code
lighthouse.goto_next_uncovered()
lighthouse.goto_previous_uncovered()
# Find uncovered functions
uncovered_functions = lighthouse.get_uncovered_functions()
for func in uncovered_functions:
    print(f"Uncovered function: {func['name']} at 0x{func['address']:x}")
Coverage Comparison
# Compare two coverage runs
lighthouse.compare_coverage("baseline.drcov", "new_run.drcov")
# Differential coverage analysis
diff_coverage = lighthouse.differential_coverage([
    "run1.drcov",
    "run2.drcov",
    "run3.drcov"
])
# Show only new coverage
lighthouse.show_new_coverage_only()
# Show coverage intersection
lighthouse.show_coverage_intersection()
# Show coverage union
lighthouse.show_coverage_union()
# Export coverage comparison
lighthouse.export_coverage_diff("coverage_diff.json")
# Generate coverage report
report = lighthouse.generate_coverage_report()
with open("coverage_report.html", "w") as f:
    f.write(report)
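The comparison operations above reduce to set algebra over covered addresses, so they can be sanity-checked outside the disassembler. A sketch with plain Python sets (function names are my own; each run is assumed to be an iterable of covered addresses):

```python
def coverage_diff(baseline, new_run):
    """Addresses hit only by the new run, only by the baseline, and by both."""
    baseline, new_run = set(baseline), set(new_run)
    return {
        "new_only": new_run - baseline,   # differential / "new coverage"
        "lost": baseline - new_run,
        "shared": baseline & new_run,     # intersection
    }

def coverage_union(runs):
    """Addresses hit by any run."""
    out = set()
    for run in runs:
        out |= set(run)
    return out

def coverage_intersection(runs):
    """Addresses hit by every run."""
    runs = [set(run) for run in runs]
    return set.intersection(*runs) if runs else set()
```

"Show new coverage only" is then just the `new_only` set rendered over the disassembly.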
Advanced Coverage Analysis
Fuzzing Integration
# AFL++ integration
import subprocess
import time
import os
class AFLLighthouseIntegration:
    def __init__(self, target_binary, afl_output_dir):
        self.target_binary = target_binary
        self.afl_output_dir = afl_output_dir
        self.coverage_files = []

    def start_afl_fuzzing(self, input_dir, timeout=3600):
        """Start AFL fuzzing with coverage collection"""
        # Prepare AFL command
        afl_cmd = [
            "afl-fuzz",
            "-i", input_dir,
            "-o", self.afl_output_dir,
            "-t", "1000",  # Timeout in ms
            "-m", "none",  # No memory limit
            "--", self.target_binary, "@@"
        ]
        print(f"Starting AFL fuzzing: {' '.join(afl_cmd)}")
        # Start AFL in background
        afl_process = subprocess.Popen(afl_cmd)
        # Monitor for new test cases and collect coverage
        start_time = time.time()
        while time.time() - start_time < timeout:
            self.collect_afl_coverage()
            time.sleep(60)  # Check every minute
        # Stop AFL
        afl_process.terminate()
        return self.coverage_files

    def collect_afl_coverage(self):
        """Collect coverage from AFL test cases"""
        queue_dir = os.path.join(self.afl_output_dir, "default", "queue")
        if not os.path.exists(queue_dir):
            return
        # Get new test cases
        test_cases = [f for f in os.listdir(queue_dir) if f.startswith("id:")]
        for test_case in test_cases:
            test_path = os.path.join(queue_dir, test_case)
            coverage_file = f"coverage_{test_case}.drcov"
            if coverage_file not in self.coverage_files:
                # Run test case with coverage collection
                self.run_with_coverage(test_path, coverage_file)
                self.coverage_files.append(coverage_file)

    def run_with_coverage(self, input_file, coverage_file):
        """Run target with coverage collection using DynamoRIO"""
        drrun_cmd = [
            "drrun",
            "-t", "drcov",
            "-dump_text",
            "-logdir", ".",
            "-logprefix", coverage_file.replace(".drcov", ""),
            "--", self.target_binary, input_file
        ]
        try:
            subprocess.run(drrun_cmd, timeout=10, capture_output=True)
        except subprocess.TimeoutExpired:
            pass  # Timeout is expected for some test cases
# Usage example
afl_integration = AFLLighthouseIntegration("./target_binary", "./afl_output")
coverage_files = afl_integration.start_afl_fuzzing("./input_seeds", timeout=1800)
# Load coverage in Lighthouse
for coverage_file in coverage_files:
    lighthouse.load_coverage(coverage_file)
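AFL++ also writes a plain-text `fuzzer_stats` file (one `key : value` pair per line) into each fuzzer's output directory, which is handy for deciding when to stop a campaign like the one above. A tolerant parser sketch (the exact key names vary between AFL versions, so treat those in the test as examples):

```python
def parse_fuzzer_stats(text: str) -> dict:
    """Parse AFL's 'key : value' stats lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        # Split only on the first colon: values (e.g. command_line) may contain ':'
        key, value = line.split(":", 1)
        stats[key.strip()] = value.strip()
    return stats
```

Polling this file is cheaper than re-running every queue entry under DynamoRIO on each loop iteration.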
Custom Coverage Collection
# Custom coverage collector using Frida
import frida
import json
import time
class FridaCoverageCollector:
    def __init__(self, target_process):
        self.target_process = target_process
        self.session = None
        self.script = None
        self.coverage_data = []

    def attach_to_process(self):
        """Attach Frida to target process"""
        try:
            self.session = frida.attach(self.target_process)
            print(f"Attached to process: {self.target_process}")
        except frida.ProcessNotFoundError:
            print(f"Process {self.target_process} not found")
            return False
        return True

    def start_coverage_collection(self, module_name=None):
        """Start collecting coverage data"""
        # Frida script for coverage collection
        frida_script = """
var coverage_data = [];
var module_base = null;
var module_size = 0;

// Get module information
if ("%s") {
    var module = Process.getModuleByName("%s");
    module_base = module.base;
    module_size = module.size;
    console.log("Monitoring module: " + module.name + " at " + module_base);
} else {
    // Monitor main module
    var modules = Process.enumerateModules();
    if (modules.length > 0) {
        module_base = modules[0].base;
        module_size = modules[0].size;
        console.log("Monitoring main module: " + modules[0].name);
    }
}

// Trace execution with Stalker: record each basic block when it is first
// compiled (a single Interceptor hook on the module base would only fire
// for that one address, not for every executed instruction)
Process.enumerateThreads().forEach(function (thread) {
    Stalker.follow(thread.id, {
        events: { compile: true },
        onReceive: function (events) {
            Stalker.parse(events, { stringify: false, annotate: false })
                .forEach(function (ev) {
                    var start = ev[0];
                    if (module_base !== null &&
                        start.compare(module_base) >= 0 &&
                        start.compare(module_base.add(module_size)) < 0) {
                        coverage_data.push({
                            address: start.toString(),
                            timestamp: Date.now()
                        });
                    }
                });
        }
    });
});

// Export coverage data
rpc.exports.getCoverageData = function() {
    return coverage_data;
};
rpc.exports.clearCoverageData = function() {
    coverage_data = [];
};
""" % (module_name or "", module_name or "")
        self.script = self.session.create_script(frida_script)
        self.script.load()
        print("Coverage collection started")

    def get_coverage_data(self):
        """Get collected coverage data"""
        if self.script:
            return self.script.exports.get_coverage_data()
        return []

    def save_coverage(self, output_file):
        """Save coverage data to file"""
        coverage_data = self.get_coverage_data()
        # Convert to Lighthouse format
        lighthouse_format = {
            "version": "1.0",
            "type": "frida",
            "coverage": []
        }
        for entry in coverage_data:
            lighthouse_format["coverage"].append({
                "address": entry["address"],
                "hit_count": 1,
                "timestamp": entry["timestamp"]
            })
        with open(output_file, "w") as f:
            json.dump(lighthouse_format, f, indent=2)
        print(f"Coverage saved to: {output_file}")

    def detach(self):
        """Detach from process"""
        if self.session:
            self.session.detach()
            print("Detached from process")
# Usage example
collector = FridaCoverageCollector("target_process")
if collector.attach_to_process():
    collector.start_coverage_collection("main_module.exe")
    # Let it run for some time
    time.sleep(30)
    # Save coverage
    collector.save_coverage("frida_coverage.json")
    collector.detach()
# Load in Lighthouse
lighthouse.load_coverage("frida_coverage.json", format="frida")
Coverage-Guided Analysis
# Coverage-guided vulnerability research
import os
import subprocess
import time

class CoverageGuidedAnalysis:
    def __init__(self, binary_path):
        self.binary_path = binary_path
        self.coverage_data = {}
        self.interesting_functions = []
        self.uncovered_paths = []

    def analyze_coverage_gaps(self):
        """Analyze coverage gaps to find interesting code paths"""
        # Get all functions in binary
        all_functions = lighthouse.get_all_functions()
        covered_functions = lighthouse.get_covered_functions()
        # Find uncovered functions
        uncovered_functions = []
        for func in all_functions:
            if func not in covered_functions:
                uncovered_functions.append(func)
        # Analyze uncovered functions for interesting patterns
        for func in uncovered_functions:
            func_analysis = self.analyze_function(func)
            if func_analysis["interesting"]:
                self.interesting_functions.append({
                    "function": func,
                    "reason": func_analysis["reason"],
                    "priority": func_analysis["priority"]
                })
        # Sort by priority
        self.interesting_functions.sort(key=lambda x: x["priority"], reverse=True)
        return self.interesting_functions

    def analyze_function(self, function):
        """Analyze function for interesting characteristics"""
        analysis = {
            "interesting": False,
            "reason": [],
            "priority": 0
        }
        # Get function disassembly
        disasm = lighthouse.get_function_disassembly(function)
        # Check for interesting patterns
        if "strcpy" in disasm or "sprintf" in disasm:
            analysis["interesting"] = True
            analysis["reason"].append("Potentially unsafe string operations")
            analysis["priority"] += 30
        if "malloc" in disasm or "free" in disasm:
            analysis["interesting"] = True
            analysis["reason"].append("Memory management operations")
            analysis["priority"] += 20
        if "system" in disasm or "exec" in disasm:
            analysis["interesting"] = True
            analysis["reason"].append("System command execution")
            analysis["priority"] += 40
        if "crypto" in disasm.lower() or "encrypt" in disasm.lower():
            analysis["interesting"] = True
            analysis["reason"].append("Cryptographic operations")
            analysis["priority"] += 25
        # Check for error handling paths
        if "error" in disasm.lower() or "exception" in disasm.lower():
            analysis["interesting"] = True
            analysis["reason"].append("Error handling code")
            analysis["priority"] += 15
        # Check for network operations
        if "socket" in disasm or "connect" in disasm or "send" in disasm:
            analysis["interesting"] = True
            analysis["reason"].append("Network operations")
            analysis["priority"] += 35
        return analysis

    def generate_test_cases(self, target_function):
        """Generate test cases to reach uncovered function"""
        # Analyze function parameters and calling conventions
        func_info = lighthouse.get_function_info(target_function)
        # Generate inputs based on function signature
        test_cases = []
        if func_info["parameters"]:
            for param_type in func_info["parameters"]:
                if param_type == "string":
                    test_cases.extend([
                        "A" * 10,
                        "A" * 100,
                        "A" * 1000,
                        "../../../etc/passwd",
                        "%s%s%s%s",
                        "\x00\x01\x02\x03"
                    ])
                elif param_type == "integer":
                    test_cases.extend([
                        0, 1, -1, 0x7FFFFFFF, 0x80000000,
                        0xFFFFFFFF, 0x100000000
                    ])
        return test_cases

    def guided_fuzzing_campaign(self):
        """Run coverage-guided fuzzing campaign"""
        # Analyze current coverage gaps
        interesting_functions = self.analyze_coverage_gaps()
        print(f"Found {len(interesting_functions)} interesting uncovered functions")
        for func_info in interesting_functions[:10]:  # Top 10 priority
            func = func_info["function"]
            print(f"Targeting function: {func['name']} (Priority: {func_info['priority']})")
            # Generate test cases for this function
            test_cases = self.generate_test_cases(func)
            # Run fuzzing campaign targeting this function
            self.run_targeted_fuzzing(func, test_cases)
            # Check if we achieved coverage
            if lighthouse.is_function_covered(func):
                print(f"Successfully covered function: {func['name']}")
            else:
                print(f"Failed to cover function: {func['name']}")

    def run_targeted_fuzzing(self, target_function, test_cases):
        """Run fuzzing campaign targeting specific function"""
        # Create input files for test cases
        input_dir = f"inputs_{target_function['name']}"
        os.makedirs(input_dir, exist_ok=True)
        for i, test_case in enumerate(test_cases):
            input_file = os.path.join(input_dir, f"input_{i:04d}")
            with open(input_file, "wb") as f:
                if isinstance(test_case, str):
                    f.write(test_case.encode())
                else:
                    # Integers are serialized as decimal text; bytes([n])
                    # would fail for values outside 0-255
                    f.write(str(test_case).encode())
        # Run AFL with coverage feedback
        afl_cmd = [
            "afl-fuzz",
            "-i", input_dir,
            "-o", f"output_{target_function['name']}",
            "-t", "1000",
            "-m", "none",
            "--", self.binary_path, "@@"
        ]
        # Run for limited time
        process = subprocess.Popen(afl_cmd)
        time.sleep(300)  # 5 minutes
        process.terminate()
        # Collect and load new coverage
        self.collect_campaign_coverage(f"output_{target_function['name']}")

    def collect_campaign_coverage(self, output_dir):
        """Minimal collector: load any drcov logs found in the campaign output"""
        for name in os.listdir(output_dir):
            if name.endswith((".drcov", ".log")):
                lighthouse.load_coverage(os.path.join(output_dir, name))
# Usage example
analysis = CoverageGuidedAnalysis("./target_binary")
analysis.guided_fuzzing_campaign()
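The pattern-based prioritization in `analyze_function()` can be factored into a standalone scorer over disassembly text, which makes the weights easy to tune and unit-test. The keyword table mirrors the heuristics above and is illustrative, not exhaustive:

```python
# Keyword -> (reason, priority weight); mirrors the heuristics above
INTEREST_PATTERNS = {
    "strcpy": ("Potentially unsafe string operations", 30),
    "sprintf": ("Potentially unsafe string operations", 30),
    "malloc": ("Memory management operations", 20),
    "system": ("System command execution", 40),
    "socket": ("Network operations", 35),
}

def score_disassembly(disasm: str):
    """Return (priority, reasons) for one function's disassembly listing."""
    text = disasm.lower()
    priority, reasons = 0, []
    for keyword, (reason, weight) in INTEREST_PATTERNS.items():
        if keyword in text:
            priority += weight
            if reason not in reasons:  # one reason per category
                reasons.append(reason)
    return priority, reasons
```

Sorting candidate functions by the returned priority reproduces the targeting order used by `guided_fuzzing_campaign()`.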
Coverage Reporting and Metrics
Comprehensive Coverage Reports
# Generate detailed coverage reports
import json
import time

class LighthouseCoverageReporter:
    def __init__(self):
        self.coverage_data = {}
        self.metrics = {}
    def generate_comprehensive_report(self, output_file="coverage_report.html"):
        """Generate comprehensive HTML coverage report"""
        # Collect coverage metrics
        self.collect_coverage_metrics()
        # Generate HTML report
        html_content = self.create_html_report()
        with open(output_file, "w") as f:
            f.write(html_content)
        print(f"Coverage report generated: {output_file}")

    def collect_coverage_metrics(self):
        """Collect comprehensive coverage metrics"""
        # Basic coverage statistics
        self.metrics["basic"] = lighthouse.get_coverage_stats()
        # Function-level coverage
        all_functions = lighthouse.get_all_functions()
        covered_functions = lighthouse.get_covered_functions()
        self.metrics["functions"] = {
            "total": len(all_functions),
            "covered": len(covered_functions),
            "uncovered": len(all_functions) - len(covered_functions),
            "coverage_percentage": (len(covered_functions) / len(all_functions)) * 100
        }
        # Basic block coverage
        all_blocks = lighthouse.get_all_basic_blocks()
        covered_blocks = lighthouse.get_covered_basic_blocks()
        self.metrics["basic_blocks"] = {
            "total": len(all_blocks),
            "covered": len(covered_blocks),
            "uncovered": len(all_blocks) - len(covered_blocks),
            "coverage_percentage": (len(covered_blocks) / len(all_blocks)) * 100
        }
        # Instruction coverage
        all_instructions = lighthouse.get_all_instructions()
        covered_instructions = lighthouse.get_covered_instructions()
        self.metrics["instructions"] = {
            "total": len(all_instructions),
            "covered": len(covered_instructions),
            "uncovered": len(all_instructions) - len(covered_instructions),
            "coverage_percentage": (len(covered_instructions) / len(all_instructions)) * 100
        }
        # Coverage by module
        self.metrics["modules"] = {}
        modules = lighthouse.get_modules()
        for module in modules:
            module_coverage = lighthouse.get_module_coverage(module)
            self.metrics["modules"][module["name"]] = module_coverage
        # Coverage hotspots (most frequently hit)
        self.metrics["hotspots"] = lighthouse.get_coverage_hotspots(limit=20)
        # Coverage timeline
        self.metrics["timeline"] = lighthouse.get_coverage_timeline()
    def create_html_report(self):
        """Create HTML coverage report"""
        # Literal braces in the CSS are doubled so that str.format() only
        # substitutes the {placeholder} fields (otherwise it raises KeyError)
        html_template = """
<!DOCTYPE html>
<html>
<head>
<title>Lighthouse Coverage Report</title>
<style>
body {{ font-family: Arial, sans-serif; margin: 20px; }}
.header {{ background-color: #f0f0f0; padding: 20px; border-radius: 5px; }}
.metric-box {{ display: inline-block; margin: 10px; padding: 15px;
              border: 1px solid #ddd; border-radius: 5px; min-width: 150px; }}
.covered {{ background-color: #d4edda; }}
.uncovered {{ background-color: #f8d7da; }}
.partial {{ background-color: #fff3cd; }}
table {{ border-collapse: collapse; width: 100%; margin: 20px 0; }}
th, td {{ border: 1px solid #ddd; padding: 8px; text-align: left; }}
th {{ background-color: #f2f2f2; }}
.progress-bar {{ width: 100%; height: 20px; background-color: #f0f0f0; border-radius: 10px; }}
.progress-fill {{ height: 100%; background-color: #28a745; border-radius: 10px; }}
.chart {{ margin: 20px 0; }}
</style>
<script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
</head>
<body>
<div class="header">
<h1>Lighthouse Coverage Report</h1>
<p>Generated on: {timestamp}</p>
<p>Binary: {binary_name}</p>
</div>
<h2>Coverage Summary</h2>
<div class="metric-box covered">
<h3>Function Coverage</h3>
<div class="progress-bar">
<div class="progress-fill" style="width: {function_coverage}%"></div>
</div>
<p>{covered_functions}/{total_functions} functions ({function_coverage:.1f}%)</p>
</div>
<div class="metric-box covered">
<h3>Basic Block Coverage</h3>
<div class="progress-bar">
<div class="progress-fill" style="width: {block_coverage}%"></div>
</div>
<p>{covered_blocks}/{total_blocks} blocks ({block_coverage:.1f}%)</p>
</div>
<div class="metric-box covered">
<h3>Instruction Coverage</h3>
<div class="progress-bar">
<div class="progress-fill" style="width: {instruction_coverage}%"></div>
</div>
<p>{covered_instructions}/{total_instructions} instructions ({instruction_coverage:.1f}%)</p>
</div>
<h2>Module Coverage</h2>
<table>
<tr>
<th>Module</th>
<th>Functions</th>
<th>Coverage</th>
<th>Progress</th>
</tr>
{module_rows}
</table>
<h2>Coverage Hotspots</h2>
<table>
<tr>
<th>Address</th>
<th>Function</th>
<th>Hit Count</th>
<th>Percentage</th>
</tr>
{hotspot_rows}
</table>
<h2>Uncovered Functions</h2>
<table>
<tr>
<th>Address</th>
<th>Function Name</th>
<th>Size</th>
<th>Complexity</th>
</tr>
{uncovered_rows}
</table>
<h2>Coverage Timeline</h2>
<div class="chart">
<canvas id="timelineChart" width="800" height="400"></canvas>
</div>
<script>
// Coverage timeline chart
var ctx = document.getElementById('timelineChart').getContext('2d');
var chart = new Chart(ctx, {{
    type: 'line',
    data: {{
        labels: {timeline_labels},
        datasets: [{{
            label: 'Coverage Percentage',
            data: {timeline_data},
            borderColor: 'rgb(75, 192, 192)',
            tension: 0.1
        }}]
    }},
    options: {{
        responsive: true,
        scales: {{
            y: {{
                beginAtZero: true,
                max: 100
            }}
        }}
    }}
}});
</script>
</body>
</html>
"""
        # Format template with metrics
        return html_template.format(
            timestamp=time.strftime("%Y-%m-%d %H:%M:%S"),
            binary_name=lighthouse.get_binary_name(),
            function_coverage=self.metrics["functions"]["coverage_percentage"],
            covered_functions=self.metrics["functions"]["covered"],
            total_functions=self.metrics["functions"]["total"],
            block_coverage=self.metrics["basic_blocks"]["coverage_percentage"],
            covered_blocks=self.metrics["basic_blocks"]["covered"],
            total_blocks=self.metrics["basic_blocks"]["total"],
            instruction_coverage=self.metrics["instructions"]["coverage_percentage"],
            covered_instructions=self.metrics["instructions"]["covered"],
            total_instructions=self.metrics["instructions"]["total"],
            module_rows=self.generate_module_rows(),
            hotspot_rows=self.generate_hotspot_rows(),
            uncovered_rows=self.generate_uncovered_rows(),
            timeline_labels=json.dumps([t["timestamp"] for t in self.metrics["timeline"]]),
            timeline_data=json.dumps([t["coverage"] for t in self.metrics["timeline"]])
        )
    def generate_module_rows(self):
        """Generate HTML rows for module coverage table"""
        rows = []
        for module_name, coverage in self.metrics["modules"].items():
            progress_width = coverage["coverage_percentage"]
            row = f"""
            <tr>
                <td>{module_name}</td>
                <td>{coverage['covered']}/{coverage['total']}</td>
                <td>{coverage['coverage_percentage']:.1f}%</td>
                <td>
                    <div class="progress-bar">
                        <div class="progress-fill" style="width: {progress_width}%"></div>
                    </div>
                </td>
            </tr>
            """
            rows.append(row)
        return "".join(rows)

    def generate_hotspot_rows(self):
        """Generate HTML rows for coverage hotspots"""
        rows = []
        for hotspot in self.metrics["hotspots"]:
            row = f"""
            <tr>
                <td>0x{hotspot['address']:x}</td>
                <td>{hotspot['function_name']}</td>
                <td>{hotspot['hit_count']}</td>
                <td>{hotspot['percentage']:.2f}%</td>
            </tr>
            """
            rows.append(row)
        return "".join(rows)

    def generate_uncovered_rows(self):
        """Generate HTML rows for uncovered functions"""
        uncovered_functions = lighthouse.get_uncovered_functions()
        rows = []
        for func in uncovered_functions[:50]:  # Limit to top 50
            row = f"""
            <tr>
                <td>0x{func['address']:x}</td>
                <td>{func['name']}</td>
                <td>{func['size']} bytes</td>
                <td>{func['complexity']}</td>
            </tr>
            """
            rows.append(row)
        return "".join(rows)
    def export_coverage_data(self, format="json"):
        """Export coverage data in various formats"""
        if format == "json":
            output_file = "coverage_data.json"
            with open(output_file, "w") as f:
                json.dump(self.metrics, f, indent=2)
        elif format == "csv":
            import csv
            # Export function coverage
            with open("function_coverage.csv", "w", newline="") as f:
                writer = csv.writer(f)
                writer.writerow(["Address", "Function", "Covered", "Hit Count"])
                for func in lighthouse.get_all_functions():
                    coverage_info = lighthouse.get_function_coverage(func)
                    writer.writerow([
                        f"0x{func['address']:x}",
                        func["name"],
                        coverage_info["covered"],
                        coverage_info["hit_count"]
                    ])
        elif format == "xml":
            import xml.etree.ElementTree as ET
            root = ET.Element("coverage_report")
            # Add summary
            summary = ET.SubElement(root, "summary")
            for key, value in self.metrics["functions"].items():
                elem = ET.SubElement(summary, key)
                elem.text = str(value)
            # Add functions
            functions = ET.SubElement(root, "functions")
            for func in lighthouse.get_all_functions():
                func_elem = ET.SubElement(functions, "function")
                func_elem.set("address", f"0x{func['address']:x}")
                func_elem.set("name", func["name"])
                coverage_info = lighthouse.get_function_coverage(func)
                func_elem.set("covered", str(coverage_info["covered"]))
                func_elem.set("hit_count", str(coverage_info["hit_count"]))
            tree = ET.ElementTree(root)
            tree.write("coverage_data.xml")
        print(f"Coverage data exported to {format} format")
# Usage example
reporter = LighthouseCoverageReporter()
reporter.generate_comprehensive_report("detailed_coverage_report.html")
reporter.export_coverage_data("json")
reporter.export_coverage_data("csv")
Integration with CI/CD
Automated Coverage Analysis
#!/bin/bash
# CI/CD integration script for Lighthouse coverage analysis
set -e
BINARY_PATH="$1"
COVERAGE_DIR="$2"
OUTPUT_DIR="$3"
THRESHOLD="$4"
if [ -z "$BINARY_PATH" ] || [ -z "$COVERAGE_DIR" ] || [ -z "$OUTPUT_DIR" ]; then
echo "Usage: $0 <binary_path> <coverage_dir> <output_dir> [threshold]"
exit 1
fi
THRESHOLD=${THRESHOLD:-80} # Default 80% coverage threshold
echo "Starting automated coverage analysis..."
echo "Binary: $BINARY_PATH"
echo "Coverage directory: $COVERAGE_DIR"
echo "Output directory: $OUTPUT_DIR"
echo "Coverage threshold: $THRESHOLD%"
mkdir -p "$OUTPUT_DIR"
# Generate coverage report (tee the output to a log so the summary below can grep it)
set +e  # capture the Python exit status instead of aborting immediately
python3 << EOF 2>&1 | tee "$OUTPUT_DIR/analysis.log"
import sys
sys.path.append('/path/to/lighthouse')
import lighthouse

# Load binary in headless mode
lighthouse.load_binary("$BINARY_PATH")

# Load all coverage files
import os
coverage_files = []
for root, dirs, files in os.walk("$COVERAGE_DIR"):
    for file in files:
        if file.endswith(('.drcov', '.cov', '.gcov')):
            coverage_files.append(os.path.join(root, file))

print(f"Found {len(coverage_files)} coverage files")
for coverage_file in coverage_files:
    try:
        lighthouse.load_coverage(coverage_file)
        print(f"Loaded: {coverage_file}")
    except Exception as e:
        print(f"Failed to load {coverage_file}: {e}")

# Generate metrics
stats = lighthouse.get_coverage_stats()
function_coverage = stats['function_coverage_percentage']
block_coverage = stats['block_coverage_percentage']
print(f"Function coverage: {function_coverage:.2f}%")
print(f"Block coverage: {block_coverage:.2f}%")

# Generate reports
lighthouse.export_coverage_report("$OUTPUT_DIR/coverage_report.html")
lighthouse.export_coverage_data("$OUTPUT_DIR/coverage_data.json")

# Generate uncovered functions list before the threshold check, so the
# summary step can count them even when the run fails
uncovered = lighthouse.get_uncovered_functions()
with open("$OUTPUT_DIR/uncovered_functions.txt", "w") as f:
    for func in uncovered:
        f.write(f"0x{func['address']:x} {func['name']}\n")
print(f"Found {len(uncovered)} uncovered functions")

# Check threshold
if function_coverage < $THRESHOLD:
    print(f"ERROR: Coverage {function_coverage:.2f}% below threshold $THRESHOLD%")
    sys.exit(1)
print(f"SUCCESS: Coverage {function_coverage:.2f}% meets threshold $THRESHOLD%")
EOF
PY_STATUS=${PIPESTATUS[0]}
set -e

# Generate summary for CI (values are grepped from the tee'd analysis log)
cat > "$OUTPUT_DIR/coverage_summary.txt" << EOF
Coverage Analysis Summary
========================
Date: $(date)
Binary: $BINARY_PATH
Coverage Files: $(find "$COVERAGE_DIR" -name "*.drcov" -o -name "*.cov" -o -name "*.gcov" | wc -l)
Threshold: $THRESHOLD%
Results:
- Function Coverage: $(grep "Function coverage:" "$OUTPUT_DIR/analysis.log" | sed 's/.*: //')
- Block Coverage: $(grep "Block coverage:" "$OUTPUT_DIR/analysis.log" | sed 's/.*: //')
- Uncovered Functions: $(wc -l < "$OUTPUT_DIR/uncovered_functions.txt")
Status: $(if [ "$PY_STATUS" -eq 0 ]; then echo "PASS"; else echo "FAIL"; fi)
EOF

if [ "$PY_STATUS" -ne 0 ]; then
    echo "Coverage threshold not met"
    exit 1
fi
echo "Coverage analysis complete!"
echo "Results saved to: $OUTPUT_DIR"
GitHub Actions Integration
# .github/workflows/coverage-analysis.yml
name: Coverage Analysis with Lighthouse

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]

jobs:
  coverage-analysis:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.9'

      - name: Install dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential cmake
          # Install DynamoRIO for coverage collection
          wget https://github.com/DynamoRIO/dynamorio/releases/download/release_9.0.1/DynamoRIO-Linux-9.0.1.tar.gz
          tar -xzf DynamoRIO-Linux-9.0.1.tar.gz
          # Persist the install path for later steps (a plain export would not survive)
          echo "DYNAMORIO_HOME=$PWD/DynamoRIO-Linux-9.0.1" >> "$GITHUB_ENV"
          # Install Lighthouse
          pip install lighthouse-ida
          # Install AFL++ for fuzzing
          git clone https://github.com/AFLplusplus/AFLplusplus.git
          cd AFLplusplus
          make
          sudo make install
          cd ..

      - name: Build target binary
        run: |
          gcc -g -O0 --coverage src/main.c -o target_binary

      - name: Run fuzzing campaign
        run: |
          mkdir -p input_seeds
          echo "test input" > input_seeds/seed1
          echo "another test" > input_seeds/seed2
          # Run AFL for limited time
          timeout 300 afl-fuzz -i input_seeds -o afl_output -t 1000 -m none -- ./target_binary @@ || true

      - name: Collect coverage data
        run: |
          mkdir -p coverage_data
          # Collect AFL coverage (drcov writes its own log names into -logdir)
          find afl_output -name "id:*" | head -20 | while read testcase; do
            "$DYNAMORIO_HOME/bin64/drrun" -t drcov -dump_text -logdir coverage_data -- ./target_binary "$testcase"
          done
          # Collect GCOV coverage
          gcov src/main.c
          mv *.gcov coverage_data/

      - name: Analyze coverage with Lighthouse
        run: |
          python3 scripts/lighthouse_analysis.py ./target_binary coverage_data coverage_results 75

      - name: Upload coverage results
        uses: actions/upload-artifact@v3
        with:
          name: coverage-results
          path: coverage_results/

      - name: Comment PR with coverage
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v6
        with:
          script: |
            const fs = require('fs');
            const summary = fs.readFileSync('coverage_results/coverage_summary.txt', 'utf8');
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## Coverage Analysis Results\n\n\`\`\`\n${summary}\n\`\`\``
            });

      - name: Fail if coverage below threshold
        run: |
          if grep -q "FAIL" coverage_results/coverage_summary.txt; then
            echo "Coverage analysis failed!"
            exit 1
          fi
Resources and Documentation
Official Resources
- Lighthouse GitHub Repository - Main repository and documentation
- Lighthouse Wiki - Comprehensive usage guide
- IDA Pro Plugin Development - IDA Python API documentation
- Ghidra Scripting - Ghidra scripting guide
Coverage Tools Integration
- DynamoRIO - Dynamic binary instrumentation platform
- Intel Pin - Binary instrumentation framework
- AFL++ - Advanced fuzzing framework
- Frida - Dynamic instrumentation toolkit
Research and Papers
- Code Coverage in Reverse Engineering - Academic research on coverage-guided RE
- Fuzzing with Code Coverage - AFL fuzzing methodology
- Binary Analysis with Coverage - Driller paper on coverage-guided analysis
Community Resources
- Lighthouse Users Group - Community discussions
- Reverse Engineering Stack Exchange - Q&A for RE topics
- r/ReverseEngineering - Reddit community
- Binary Analysis Discord - Real-time community chat