missidentify

Overview

missidentify is a specialized tool for identifying Windows PE (Portable Executable) files that are misidentified, corrupted, or lack proper PE headers. It is used in malware analysis, incident response, and vulnerability assessment: it detects anomalies in executable structure, helps identify packed malware, and reveals suspicious binary characteristics that antivirus software might miss.

Installation

From Source

# Clone repository
git clone https://github.com/Yara-Rules/missidentify.git
cd missidentify

# Install dependencies
pip3 install -r requirements.txt

# Make executable
chmod +x missidentify.py

Package Installation

# Using pip
pip3 install missidentify

# Verify installation
missidentify --version

Docker Installation

# Build Docker image
docker build -t missidentify .

# Run in container
docker run -v /samples:/samples missidentify /samples/binary.exe

Basic Usage

Simple Binary Analysis

# Analyze single executable
missidentify /path/to/binary.exe

# Analyze with verbose output
missidentify -v /path/to/binary.exe

# Check multiple files
missidentify *.exe

# Analyze directory recursively
missidentify -r /path/to/samples/

Output Formats

Option           Description
-j, --json       Output in JSON format
-c, --csv        Output in CSV format
-x, --xml        Output in XML format
-t, --text       Plain text output (default)
-q, --quiet      Suppress non-critical output
-v, --verbose    Detailed analysis output

Format Examples

# JSON output for parsing
missidentify --json binary.exe > results.json

# CSV for spreadsheet analysis
missidentify --csv *.exe > analysis.csv

# XML for integration with other tools
missidentify --xml /samples/ > report.xml

# Quiet mode for scripts
missidentify -q binary.exe && echo "Analysis complete"

PE Header Analysis

Understanding PE Structure

DOS Header (MZ signature)
    |
    v
PE Signature (PE\0\0)
    |
    v
COFF File Header
    |
    v
Optional Header
    |
    v
Section Headers
    |
    v
Section Data
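
The first two steps of the layout above can be checked by hand. The following is a minimal Python sketch, not missidentify's own implementation: it relies only on the documented PE layout (the "MZ" bytes at offset 0 and the 32-bit e_lfanew field at offset 0x3C that points to the PE signature). The function name check_pe_signatures and the synthetic sample are illustrative assumptions.

```python
import struct

def check_pe_signatures(data: bytes) -> list[str]:
    """Return a list of signature problems (empty when both signatures look valid)."""
    # The DOS header is 64 bytes long, so anything shorter cannot be a PE file.
    if len(data) < 0x40:
        return ["file too small to hold a DOS header"]
    problems = []
    if data[:2] != b"MZ":
        problems.append("missing MZ signature")
    # e_lfanew: little-endian 32-bit file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if e_lfanew + 4 > len(data):
        problems.append("e_lfanew points past end of file")
    elif data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        problems.append(f"invalid PE signature at offset {e_lfanew:#x}")
    return problems

# Synthetic sample (hypothetical): a bare DOS header followed by a PE signature.
sample = bytearray(0x44)
sample[0:2] = b"MZ"
struct.pack_into("<I", sample, 0x3C, 0x40)
sample[0x40:0x44] = b"PE\0\0"
print(check_pe_signatures(bytes(sample)))  # []
```

For a real file, read it with open(path, "rb").read() and pass the bytes in. The sketch stops at the signatures and does not validate the headers that follow.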

Signature Verification

# Check DOS header (MZ signature)
missidentify --check-dos-header binary.exe

# Verify PE signature
missidentify --check-pe-signature binary.exe

# Full header validation
missidentify --validate-headers binary.exe

# Show all headers
missidentify --dump-headers binary.exe

Common PE Issues Detected

Issue                   Description                                   Risk Level
Missing MZ header       Binary doesn’t start with “MZ”                Critical
Invalid PE signature    PE\0\0 signature corrupted                    Critical
Misaligned sections     Sections don’t align properly                 High
Invalid machine type    CPU architecture mismatch                     High
Corrupted COFF header   File header fields invalid or inconsistent    High
Section overlap         Sections occupy the same memory range         Medium
Padding anomalies       Suspicious null padding                       Medium
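
Several of the issues above come down to reading the 20-byte COFF file header that follows the PE signature. The sketch below decodes that header with Python's struct module so the Machine and section-count fields can be inspected directly. It is an illustration under stated assumptions, not the tool's code: parse_coff_header is a hypothetical helper, and MACHINE_TYPES lists only a few of the values defined in the PE/COFF specification.

```python
import struct

# A few Machine values defined by the PE/COFF specification (not exhaustive).
MACHINE_TYPES = {
    0x014C: "x86",
    0x8664: "x64",
    0x01C0: "ARM",
    0xAA64: "ARM64",
}

def parse_coff_header(data: bytes, pe_offset: int) -> dict:
    """Decode the 20-byte COFF file header that follows the 4-byte PE signature."""
    (machine, section_count, _timestamp, _symtab_ptr, _symbol_count,
     optional_header_size, characteristics) = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4)
    return {
        "machine": MACHINE_TYPES.get(machine, f"unknown ({machine:#06x})"),
        "section_count": section_count,
        "optional_header_size": optional_header_size,
        "is_dll": bool(characteristics & 0x2000),  # IMAGE_FILE_DLL
    }

# Synthetic header (hypothetical): x64, 3 sections, 240-byte optional header.
header = b"PE\0\0" + struct.pack("<HHIIIHH", 0x8664, 3, 0, 0, 0, 240, 0x0022)
print(parse_coff_header(header, 0))
```

An unknown Machine value or a section count that does not match the section table is the kind of inconsistency the table above describes.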

Advanced Analysis

Packing Detection

# Detect packers/protectors
missidentify --detect-packing binary.exe

# Entropy analysis for compression
missidentify --entropy binary.exe

# Identify common packers
missidentify --identify-packer binary.exe

# Show packing signatures
missidentify --packer-signatures binary.exe

Entropy Calculation

# Calculate section entropy
missidentify --entropy --all-sections binary.exe

# High entropy (close to 8 bits per byte) may indicate:
# - Encryption
# - Compression
# - Packed code

# Analyze entropy per section
missidentify --entropy-detailed binary.exe

# Export entropy data
missidentify --entropy --json binary.exe > entropy.json
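
To make the entropy figures concrete, here is a short Python sketch of Shannon entropy over bytes, the measure commonly used for this purpose. Values range from 0 (one repeated byte) to 8 (all byte values equally likely). Whether missidentify computes entropy exactly this way is an assumption; shannon_entropy and entropy_by_section are illustrative names.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum -p * log2(p) over the observed byte values.
    return sum(-(count / total) * math.log2(count / total)
               for count in Counter(data).values())

def entropy_by_section(sections: dict[str, bytes]) -> dict[str, float]:
    """Compute entropy for each named section's raw bytes."""
    return {name: shannon_entropy(raw) for name, raw in sections.items()}

# Synthetic sections (hypothetical): zero padding versus uniformly distributed bytes.
demo = {".pad": b"\x00" * 512, ".packed": bytes(range(256)) * 2}
for name, value in entropy_by_section(demo).items():
    print(f"{name}: {value:.2f}")
```

High values are a hint rather than proof: compressed resources such as images also score close to 8.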

Import Table Analysis

# Analyze imported libraries
missidentify --imports binary.exe

# Check for suspicious imports
missidentify --suspicious-imports binary.exe

# Validate import address table
missidentify --validate-iat binary.exe

# Export imports to file
missidentify --imports --json binary.exe > imports.json
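
Once an import list has been exported, a simple post-processing step can flag API names of interest. The Python sketch below is a hedged illustration: the input shape (a mapping of DLL name to function names) is an assumption, not the tool's documented JSON schema, and INJECTION_APIS holds three Windows APIs often cited in process-injection write-ups, not a complete list.

```python
# Windows API names often associated with process injection (illustrative only).
INJECTION_APIS = {"VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"}

def flag_injection_imports(imports: dict[str, list[str]]) -> list[str]:
    """Return sorted 'dll!function' strings for imports listed in INJECTION_APIS."""
    hits = []
    for dll, functions in imports.items():
        for name in functions:
            if name in INJECTION_APIS:
                hits.append(f"{dll.lower()}!{name}")
    return sorted(hits)

# Hypothetical import table for demonstration.
imports = {
    "KERNEL32.dll": ["CreateFileW", "VirtualAllocEx", "WriteProcessMemory"],
    "USER32.dll": ["MessageBoxW"],
}
print(flag_injection_imports(imports))
```

Legitimate software such as debuggers also imports these functions, so a hit is a prompt for closer review rather than a verdict.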

Export Table Inspection

# List exported functions
missidentify --exports binary.exe

# Check export directory integrity
missidentify --validate-exports binary.exe

# Find suspicious exports
missidentify --suspicious-exports binary.exe

# Detailed export analysis
missidentify --exports --verbose binary.exe

Malware-Specific Detection

Suspicious Characteristics

# Scan for common malware traits
missidentify --malware-scan binary.exe

# Check for hooking patterns
missidentify --detect-hooks binary.exe

# Find anomalous entry points
missidentify --analyze-ep binary.exe

# Look for code injection markers
missidentify --injection-detection binary.exe
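
One common entry-point heuristic is to check which section contains the AddressOfEntryPoint RVA. The Python sketch below applies two such checks, under the assumption that section data has already been parsed. The Section class and analyze_entry_point are hypothetical names, and the heuristics are generic rules of thumb rather than the tool's documented logic.

```python
from dataclasses import dataclass

IMAGE_SCN_MEM_EXECUTE = 0x20000000  # section flag from the PE/COFF specification

@dataclass
class Section:
    name: str
    virtual_address: int
    virtual_size: int
    characteristics: int

def analyze_entry_point(entry_rva: int, sections: list[Section]):
    """Return (section name or None, list of findings) for the entry point RVA."""
    for index, sec in enumerate(sections):
        if sec.virtual_address <= entry_rva < sec.virtual_address + sec.virtual_size:
            findings = []
            if not sec.characteristics & IMAGE_SCN_MEM_EXECUTE:
                findings.append("entry point section is not executable")
            # Packers often append a stub section and start execution there.
            if len(sections) > 1 and index == len(sections) - 1:
                findings.append("entry point is in the last section")
            return sec.name, findings
    return None, ["entry point is outside every section"]

# Hypothetical section table for demonstration.
sections = [
    Section(".text", 0x1000, 0x2000, 0x60000020),
    Section(".data", 0x3000, 0x1000, 0xC0000040),
    Section(".rsrc", 0x4000, 0x1000, 0x40000040),
]
print(analyze_entry_point(0x1500, sections))  # ('.text', [])
```

Both findings also occur in some legitimate binaries, so they are best treated as reasons to look closer.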

Resource Scanning

# Analyze embedded resources
missidentify --resources binary.exe

# Detect hidden resources
missidentify --hidden-resources binary.exe

# Extract resource details
missidentify --resources --verbose binary.exe

# Find suspicious resource types
missidentify --suspicious-resources binary.exe

Anomaly Detection

# Full anomaly report
missidentify --anomaly-report binary.exe

# Check for null section names
missidentify --check-null-sections binary.exe

# Detect unusual characteristics
missidentify --unusual-characteristics binary.exe

# Validate section permissions
missidentify --check-section-perms binary.exe

Batch Analysis

Scanning Multiple Files

# Analyze all executables in directory
missidentify -r /malware/samples/

# Filter by file extension
missidentify *.exe *.dll *.sys

# Recursive with pattern matching
missidentify -r --pattern="*.exe" /samples/

# Parallel processing for speed
missidentify --parallel 4 /samples/

Filtering Results

# Show only suspicious files
missidentify -r /samples/ --suspicious-only

# Filter by severity
missidentify -r /samples/ --min-severity=high

# Exclude clean files
missidentify -r /samples/ --skip-clean

# Show only failures
missidentify -r /samples/ --errors-only

Report Generation

# Generate comprehensive report
missidentify -r /samples/ --report summary.txt

# HTML report with visualizations
missidentify -r /samples/ --html report.html

# JSON report for automation
missidentify -r /samples/ --json > report.json

# Create threat assessment document
missidentify -r /samples/ --threat-level --json > threats.json

Section Analysis

Section Headers

# Display all sections
missidentify --sections binary.exe

# Analyze section permissions
missidentify --section-perms binary.exe

# Detect anomalous sections
missidentify --anomalous-sections binary.exe

# Section entropy analysis
missidentify --section-entropy binary.exe

Section Characteristics

Section   Purpose                Expected Characteristics
.text     Code                   Readable, Executable
.data     Initialized data       Readable, Writable
.rdata    Read-only data         Readable only
.rsrc     Resources              Readable only
.reloc    Relocations            Readable
.tls      Thread-local storage   Readable, Writable
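
The expected characteristics above correspond to three bits in each section header's Characteristics field. As a rough sketch in Python, they can be decoded as follows. The flag values come from the PE/COFF specification; the helper names are illustrative assumptions.

```python
# Memory-permission flags from the PE/COFF section header Characteristics field.
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000

def describe_permissions(characteristics: int) -> str:
    """Render the section permissions as an 'RWX'-style string."""
    flags = (("R", IMAGE_SCN_MEM_READ),
             ("W", IMAGE_SCN_MEM_WRITE),
             ("X", IMAGE_SCN_MEM_EXECUTE))
    return "".join(letter if characteristics & mask else "-" for letter, mask in flags)

def is_writable_and_executable(characteristics: int) -> bool:
    """True when a section is both writable and executable."""
    both = IMAGE_SCN_MEM_WRITE | IMAGE_SCN_MEM_EXECUTE
    return characteristics & both == both

print(describe_permissions(0x60000020))        # R-X  (typical .text)
print(describe_permissions(0xC0000040))        # RW-  (typical .data)
print(is_writable_and_executable(0xE0000020))  # True
```

A section that is both writable and executable deviates from the table and is a common trait of self-modifying or unpacking code.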

Suspicious Section Detection

# Find executable data sections
missidentify --executable-data-sections binary.exe

# Detect writable code sections
missidentify --writable-code binary.exe

# Find code in data sections
missidentify --code-in-data binary.exe

# Analyze section raw sizes
missidentify --section-sizes binary.exe

Forensic Analysis

Memory Layout Inspection

# Show memory layout
missidentify --memory-layout binary.exe

# Check address space
missidentify --address-space binary.exe

# Validate image base
missidentify --image-base binary.exe

# Detect relocation issues
missidentify --relocation-issues binary.exe
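
The layout checks above can be approximated with a few rules from the PE/COFF specification: ImageBase must be a multiple of 64 KiB, FileAlignment should be a power of two between 512 and 64 KiB, and SectionAlignment must be at least FileAlignment. The Python sketch below applies those rules and also looks for overlapping section ranges. Function names and the input shape are assumptions made for illustration.

```python
def check_layout(image_base: int, section_alignment: int, file_alignment: int) -> list[str]:
    """Check optional-header layout fields against PE/COFF specification rules."""
    problems = []
    if image_base % 0x10000:
        problems.append("ImageBase is not a multiple of 64 KiB")
    is_power_of_two = file_alignment > 0 and (file_alignment & (file_alignment - 1)) == 0
    if not (is_power_of_two and 512 <= file_alignment <= 0x10000):
        problems.append("FileAlignment is not a power of two between 512 and 64 KiB")
    if section_alignment < file_alignment:
        problems.append("SectionAlignment is smaller than FileAlignment")
    return problems

def find_overlaps(sections: list[tuple[str, int, int]]) -> list[tuple[str, str]]:
    """Given (name, rva, virtual_size) tuples, return pairs of overlapping sections."""
    overlaps = []
    prev_name, prev_end = None, 0
    for name, start, size in sorted(sections, key=lambda s: s[1]):
        if prev_name is not None and start < prev_end:
            overlaps.append((prev_name, name))
        # Track the section that extends furthest so nested overlaps are caught too.
        if start + size > prev_end:
            prev_name, prev_end = name, start + size
    return overlaps

print(check_layout(0x400000, 0x1000, 0x200))  # []
print(find_overlaps([(".text", 0x1000, 0x2000), (".data", 0x2000, 0x1000)]))
```

The overlap check reports each section against the earlier section that reaches furthest, which is enough to flag a problem but not to list every overlapping pair.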

Timestamp Analysis

# Extract compile time
missidentify --timestamp binary.exe

# Check file timestamps
missidentify --file-timestamps binary.exe

# Analyze build information
missidentify --build-info binary.exe

# Timeline analysis
missidentify --timeline binary.exe > timeline.csv
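
The compile time mentioned above is stored in the COFF header's TimeDateStamp field, a 32-bit count of seconds since the Unix epoch located 8 bytes after the start of the PE signature. A minimal Python sketch for decoding it follows; compile_time is a hypothetical helper and the sample bytes are synthetic.

```python
import struct
from datetime import datetime, timezone

def compile_time(data: bytes, pe_offset: int) -> datetime:
    """Decode the COFF TimeDateStamp field as a timezone-aware UTC datetime."""
    # Layout after the 4-byte signature: Machine (2), NumberOfSections (2), TimeDateStamp (4).
    (raw,) = struct.unpack_from("<I", data, pe_offset + 8)
    return datetime.fromtimestamp(raw, tz=timezone.utc)

# Synthetic header fragment (hypothetical) with TimeDateStamp = 1,000,000,000.
fragment = b"PE\0\0" + struct.pack("<HHI", 0x014C, 1, 1_000_000_000)
print(compile_time(fragment, 0).isoformat())  # 2001-09-09T01:46:40+00:00
```

The field is trivially forged, and reproducible builds may store a hash there instead of a time, so treat the value as one data point among several.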

Hash Calculation

# Calculate file hashes
missidentify --hashes binary.exe

# MD5 hash
missidentify --md5 binary.exe

# SHA256 hash
missidentify --sha256 binary.exe

# Calculate section hashes
missidentify --section-hashes binary.exe
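
For cross-checking the hashes above independently of the tool, Python's hashlib can compute the same digests. The sketch below reads a binary stream in chunks so large samples need not fit in memory; hash_stream is an illustrative name, not part of missidentify.

```python
import hashlib
import io

def hash_stream(stream, chunk_size: int = 65536) -> dict[str, str]:
    """Compute MD5 and SHA-256 of a binary file-like object in a single pass."""
    md5, sha256 = hashlib.md5(), hashlib.sha256()
    # iter() with a sentinel keeps reading until read() returns empty bytes.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        md5.update(chunk)
        sha256.update(chunk)
    return {"md5": md5.hexdigest(), "sha256": sha256.hexdigest()}

# In-memory demonstration; the input bytes are arbitrary.
digests = hash_stream(io.BytesIO(b"abc"))
print(digests["md5"])  # 900150983cd24fb0d6963f7d28e17f72
```

For a file on disk, open it in binary mode and pass the handle: `with open("binary.exe", "rb") as fh: hash_stream(fh)`.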

Comparison and Correlation

Binary Comparison

# Compare two binaries
missidentify --compare binary1.exe binary2.exe

# Find differences in structure
missidentify --diff binary1.exe binary2.exe

# Similarity analysis
missidentify --similarity binary1.exe binary2.exe

# Section comparison
missidentify --compare-sections binary1.exe binary2.exe
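
A section-level comparison can be sketched in Python by hashing each section's raw bytes and diffing the results. This illustrates the general idea only; how missidentify compares binaries internally is not documented here, and the input shape (section name mapped to raw bytes) is an assumption.

```python
import hashlib

def compare_sections(old: dict[str, bytes], new: dict[str, bytes]) -> dict[str, list[str]]:
    """Report sections that were added, removed, or changed between two binaries."""
    def digest(raw: bytes) -> str:
        return hashlib.sha256(raw).hexdigest()

    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(name for name in old.keys() & new.keys()
                          if digest(old[name]) != digest(new[name])),
    }

# Hypothetical section contents for demonstration.
original = {".text": b"\x90" * 16, ".data": b"\x00" * 8}
modified = {".text": b"\xcc" * 16, ".data": b"\x00" * 8, ".injected": b"\x41" * 4}
print(compare_sections(original, modified))
```

Storing digests rather than raw bytes is useful when the comparison baseline is kept in a database instead of as full samples.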

Known Sample Matching

# Compare against known malware samples
missidentify --match-samples binary.exe /samples/

# Find similar binaries in database
missidentify --find-similar binary.exe --db malware.db

# YARA rule matching
missidentify --yara-rules rules.yar binary.exe

# Hash lookup in VirusTotal
missidentify --virustotal-lookup binary.exe

Corruption Recovery

Repairing PE Headers

# Attempt to repair headers
missidentify --repair binary.exe --output fixed.exe

# Validate repair
missidentify --validate-headers fixed.exe

# Backup original before repair
cp binary.exe binary.exe.bak
missidentify --repair binary.exe

# Repair with specific options
missidentify --repair binary.exe --fix-dos --fix-pe --fix-sections

Recovery Options

# Fix DOS header
missidentify --fix-dos-header binary.exe

# Rebuild PE signature
missidentify --rebuild-pe-sig binary.exe

# Repair section headers
missidentify --repair-sections binary.exe

# Reconstruct imports
missidentify --rebuild-iat binary.exe

Integration with Other Tools

YARA Integration

# Use with YARA rules
missidentify --yara rules.yar /samples/

# Create YARA rules from analysis
missidentify --generate-yara binary.exe > rule.yar

# Combine with YARA scanning
yara rules.yar binary.exe && missidentify binary.exe

IDA Integration

# Generate IDA-compatible output
missidentify --ida-export binary.exe > exports.idc

# Export symbol information
missidentify --symbols binary.exe > symbols.idc

# Create IDA database from analysis
missidentify --ida-db binary.exe analysis.i64

Winapiset Integration

# Identify Windows API usage
missidentify --api-analysis binary.exe

# Generate API call graph
missidentify --api-graph binary.exe

# Detect suspicious API chains
missidentify --suspicious-apis binary.exe

Scripting and Automation

Python API

from missidentify import BinaryAnalyzer

# Initialize analyzer
analyzer = BinaryAnalyzer('binary.exe')

# Get basic information
info = analyzer.get_info()
print(f"Binary: {info['name']}")
print(f"Machine: {info['machine']}")
print(f"Sections: {info['section_count']}")

# Check for anomalies
anomalies = analyzer.detect_anomalies()
for anomaly in anomalies:
    print(f"Anomaly: {anomaly['type']} - {anomaly['severity']}")

# Get packing detection
packer = analyzer.detect_packing()
print(f"Packer detected: {packer['name']}")

Bash Scripting

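#!/bin/bash
# Batch malware analysis script

DIR="/samples"
REPORT="analysis_$(date +%Y%m%d).csv"

echo "File,Status,Issues,Packer" > "$REPORT"

for file in "$DIR"/*.exe; do
    # Skip the literal pattern when no .exe files match
    [ -e "$file" ] || continue

    # Exit status 0 is treated as OK; output is discarded so it cannot leak into the CSV
    if missidentify -q "$file" > /dev/null 2>&1; then status="OK"; else status="ERROR"; fi
    issues=$(missidentify --anomaly-report "$file" | wc -l)
    # grep -P (needed for \K) requires GNU grep
    packer=$(missidentify --detect-packing "$file" | grep -oP "Packer: \K.*")

    # Quote the path so file names containing spaces survive basename
    echo "$(basename "$file"),$status,$issues,$packer" >> "$REPORT"
done

echo "Report saved to $REPORT"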

PowerShell Scripting

# Windows PowerShell analysis script

$SamplesDir = "C:\samples"
$Report = "analysis.csv"

Get-ChildItem $SamplesDir -Filter "*.exe" | ForEach-Object {
    $file = $_.FullName
    # Capture output so it stays off the console; only the exit code is used below
    $result = & missidentify $file 2>&1
    
    if ($LASTEXITCODE -eq 0) {
        $status = "Clean"
    } else {
        $status = "Suspicious"
    }
    
    "$($_.Name),$status" | Add-Content $Report
}

Troubleshooting

Common Issues

# File not found
missidentify /nonexistent/file.exe
# Error: File not found

# Permission denied
sudo missidentify /root/protected.exe

# Corrupted file
missidentify --allow-corruption binary.exe

# Insufficient memory for large file
missidentify --memory-optimized large.exe

Debug Mode

# Enable debug output
missidentify --debug binary.exe

# Trace execution
missidentify --trace binary.exe

# Verbose parsing
missidentify -vvv binary.exe

# Log to file
missidentify --log debug.log binary.exe

Performance Optimization

# Fast mode (skip optional checks)
missidentify --fast binary.exe

# Memory-optimized mode
missidentify --memory-optimized /samples/

# Parallel processing
missidentify --threads 8 -r /samples/

# Progress reporting
missidentify --progress -r /samples/

Real-World Examples

Detecting Packed Malware

# Analyze suspicious binary
$ missidentify malware.exe
[!] High entropy in .text section (7.89/8.0)
[!] Suspicious section names detected
[!] Packer: UPX detected
[!] Import table may be obfuscated

Finding Corrupted Samples

# Scan collection for corruption
$ missidentify -r /samples/ --suspicious-only
malware1.exe: Invalid PE signature at offset 0x40
malware2.exe: Section overlap detected
malware3.exe: Corrupted COFF header

Comparative Analysis

# Compare original and modified samples
$ missidentify --compare legitimate.exe modified.exe
Modified sections:
- .text: Different content (hash mismatch)
- .data: Additional 512 bytes
- Added: .injected section

Operational Security

Analysis Best Practices

  • Use isolated analysis system
  • Enable antivirus and EDR protection
  • Set up file system and network isolation
  • Log all analysis activities
  • Use virtual machines for suspicious samples
  • Document all findings
  • Maintain sample integrity with hashing
  • Secure sensitive analysis results

Evidence Preservation

# Calculate hashes before analysis
sha256sum binary.exe > binary.exe.sha256

# Backup original
cp binary.exe binary.exe.original

# Record analysis metadata
missidentify --metadata binary.exe > metadata.json

# Chain of custody documentation
echo "Analyzed: $(date)" >> analysis_log.txt
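
After analysis, the recorded hash can be checked again to confirm the sample was not altered. The Python sketch below parses one line in the format written by sha256sum (hex digest, whitespace, file name) and compares it with freshly read bytes. matches_recorded_hash is a hypothetical helper, and the example record is synthetic.

```python
import hashlib

def matches_recorded_hash(record_line: str, data: bytes) -> bool:
    """Compare data against the digest in a sha256sum-style record line."""
    fields = record_line.split()
    if not fields:
        return False  # an empty record cannot match anything
    expected = fields[0].lower()
    return hashlib.sha256(data).hexdigest() == expected

# Synthetic record for an empty file (the SHA-256 of zero bytes).
record = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855  binary.exe"
print(matches_recorded_hash(record, b""))  # True
```

On the command line, `sha256sum -c binary.exe.sha256` performs the same check against the file recorded above.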

References

missidentify is intended for legitimate security research, malware analysis, and incident response. Analyzing software or systems without authorization may be illegal in your jurisdiction. Obtain proper authorization first, and document all analysis activities for audit purposes.