oletools Cheat Sheet

Overview

oletools is a Python toolkit for analyzing Microsoft OLE2 files (also known as Structured Storage or Compound File Binary Format), which include Word documents (.doc), Excel spreadsheets (.xls), PowerPoint presentations (.ppt), and other Microsoft Office formats. Its tools extract and analyze VBA macros, detect auto-execution triggers, identify obfuscated code patterns, extract embedded objects, and analyze RTF documents for exploits. Malware analysts and SOC teams rely on it to handle the persistent threat of malicious Office documents.

The toolkit includes several key tools: olevba for extracting and analyzing VBA macros with obfuscation detection, mraptor for detecting auto-executing macros that also write files or run code (a strong malware indicator), oleid for quick document classification, oleobj for extracting embedded objects, rtfobj for analyzing RTF documents, and msodde for detecting DDE (Dynamic Data Exchange) attacks. oletools supports both legacy OLE formats (.doc, .xls) and modern OOXML formats (.docx, .xlsx, .docm, .xlsm), and recent releases can also detect OneNote files. It is widely used in automated malware analysis pipelines and email security gateways.

Installation

Via pip

# Install oletools
pip install oletools

# Install with optional dependencies
pip install oletools[full]

# Verify installation (prints the installed version)
pip show oletools
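
The installed release can also be checked from Python, which is handy inside an analysis pipeline. This is a minimal sketch using only the standard library; the helper name `installed_version` is made up for this example.

```python
from importlib import metadata


def installed_version(package="oletools"):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(f"oletools {version}" if version else "oletools is not installed")
```

Returning None instead of raising lets a pipeline decide for itself whether a missing install is fatal.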

System Package

# Ubuntu/Debian
sudo apt install python3-oletools

# Arch Linux
yay -S python-oletools

From Source

git clone https://github.com/decalage2/oletools.git
cd oletools
pip install -e .

Core Tools

olevba — VBA Macro Analyzer

Command                   Description
olevba <file>             Extract and analyze VBA macros
olevba -a <file>          Analysis mode (summary only)
olevba -c <file>          Code mode (show VBA source)
olevba --decode <file>    Decode obfuscated strings
olevba --deobf <file>     Attempt macro deobfuscation
olevba -j <file>          JSON output

# Full VBA analysis
olevba malicious.doc

# Show only analysis summary
olevba -a suspicious.xlsm

# Extract VBA source code
olevba -c document.docm

# Deobfuscate VBA code
olevba --deobf obfuscated.doc

# Decode strings
olevba --decode encoded.doc

# JSON output for automation
olevba -j malicious.doc > analysis.json

# Analyze password-protected file
olevba -p infected malicious.doc

# Analyze documents inside a password-protected zip archive
# (-z sets the zip password; -r only recurses into subdirectories)
olevba -z infected archive.zip

# Show only suspicious indicators
olevba -a malicious.doc | grep -E "AutoExec|Suspicious|IOC|VBA"
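
For automation, the JSON from `olevba -j` is usually easier to consume than grepped text. The sketch below assumes the output is a JSON list whose per-file entries carry `file` and `analysis` keys, with each analysis item holding `type`, `keyword` and `description`. That layout is an assumption to verify against your installed version, and `iter_findings` and `SAMPLE` are hypothetical names.

```python
import json


def iter_findings(json_text):
    """Yield (file, type, keyword, description) from olevba -j output.

    Assumes a JSON list of per-file entries, each with an "analysis" list
    of dicts; entries without one (metadata, clean files) are skipped.
    """
    data = json.loads(json_text)
    entries = data if isinstance(data, list) else [data]
    for entry in entries:
        if not isinstance(entry, dict):
            continue
        for item in entry.get("analysis") or []:
            yield (
                entry.get("file", "?"),
                item.get("type", ""),
                item.get("keyword", ""),
                item.get("description", ""),
            )


# Hand-written sample in the assumed layout, not real olevba output
SAMPLE = json.dumps([
    {"type": "MetaInformation", "script_name": "olevba"},
    {"file": "malicious.doc", "analysis": [
        {"type": "AutoExec", "keyword": "AutoOpen",
         "description": "Runs when the Word document is opened"},
        {"type": "IOC", "keyword": "http://example.test/a.exe",
         "description": "URL"},
    ]},
    {"file": "clean.doc", "analysis": None},
])

for finding in iter_findings(SAMPLE):
    print(finding)
```

In practice the text would come from `subprocess.run(["olevba", "-j", path], capture_output=True, text=True).stdout`.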

mraptor — Macro Raptor

# Quick macro auto-execution detection
mraptor document.doc

# Scan multiple files
mraptor *.doc *.docm *.xlsm

# Show the matched strings behind each flag
mraptor -m document.doc

# Exit code interpretation (the highest code wins when several files are scanned)
# 0  = no macro
# 1  = not an MS Office file
# 2  = macro present but not flagged (Macro OK)
# 10 = error
# 20 = SUSPICIOUS (auto-execution + write/execute)

# Batch scanning with exit codes
for f in /samples/*.doc; do
  mraptor "$f" > /dev/null 2>&1
  if [ $? -eq 20 ]; then
    echo "SUSPICIOUS: $f"
  fi
done
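
When mraptor is driven from a script, a lookup table keeps the exit-code handling in one place. This sketch uses the codes from mraptor's usage text (0, 1, 2, 10, 20); the names `MRAPTOR_EXIT_CODES`, `describe_exit` and `needs_review` are illustrative.

```python
# Exit codes as listed in mraptor's own usage text
MRAPTOR_EXIT_CODES = {
    0: "No Macro",
    1: "Not MS Office",
    2: "Macro OK",
    10: "ERROR",
    20: "SUSPICIOUS",
}


def describe_exit(code):
    """Translate an mraptor exit code into a short label."""
    return MRAPTOR_EXIT_CODES.get(code, f"unknown exit code {code}")


def needs_review(code):
    """Treat suspicious results, errors and unknown codes as worth a look."""
    return code not in (0, 1, 2)


if __name__ == "__main__":
    for code in (0, 2, 10, 20):
        print(code, describe_exit(code), needs_review(code))
```

Treating errors as review-worthy is a deliberate choice here: a file that mraptor cannot parse has not been cleared.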

oleid — Document Identifier

# Quick document classification
oleid document.doc

# Output shows:
# - File format
# - Container format
# - VBA Macros presence
# - XLM Macros presence
# - External relationships
# - ObjectPool
# - Flash objects

oleobj — OLE Object Extractor

# Extract embedded objects
oleobj malicious.doc

# Save extracted objects to directory
oleobj -d /output/objects/ malicious.doc

# -i names an additional input file (same effect as a positional argument)
oleobj -i malicious.doc

# Extract from OOXML files
oleobj malicious.docx

rtfobj — RTF Object Analyzer

# Analyze RTF document
rtfobj malicious.rtf

# Extract all embedded objects into a directory
# (-s selects which objects to save, -d sets the output directory)
rtfobj -s all -d /output/ malicious.rtf

# Save all objects
rtfobj -s all malicious.rtf

# Detect CVE-2017-11882 (Equation Editor exploit)
rtfobj malicious.rtf
# Look for "Equation.3" OLE objects

msodde — DDE Detection

# Detect DDE/DDEAUTO in documents
msodde document.doc
msodde document.docx

# JSON output, including any DDE commands found
msodde -j document.doc

# Scan CSV/other text files for DDE
msodde spreadsheet.csv
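
To illustrate what DDE injection in a CSV file looks like, here is a small standalone heuristic. It is not msodde's detection logic: the regular expression and the names `DDE_CELL` and `find_dde_cells` are assumptions made for this sketch. It only flags cells that start with a formula character and contain an `application|arguments!reference` pattern.

```python
import csv
import io
import re

# Formula prefix, then something shaped like  app|arguments!ref
DDE_CELL = re.compile(r"^\s*[=+\-@].*?\b[A-Za-z][\w.]*\|.*!", re.DOTALL)


def find_dde_cells(csv_text):
    """Return (row, column, value) for cells that look like DDE formulas."""
    hits = []
    for row_no, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        for col_no, cell in enumerate(row, start=1):
            if DDE_CELL.match(cell):
                hits.append((row_no, col_no, cell))
    return hits


# Hand-written sample; the third row carries a classic DDE payload
SAMPLE = 'name,value\nalice,=SUM(1;2)\nbob,"=cmd|\' /C calc\'!A0"\n'

if __name__ == "__main__":
    for hit in find_dde_cells(SAMPLE):
        print(hit)
```

A heuristic like this can pre-filter large CSV batches, but msodde remains the tool to confirm a finding.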

Analysis Workflow

Triage Pipeline

#!/bin/bash
# triage_office.sh - Quick triage of Office documents

FILE=$1
echo "=== Analyzing: $FILE ==="

# Step 1: Identify document type
echo "[*] Document identification:"
oleid "$FILE"

# Step 2: Check for auto-executing macros
echo "[*] Macro raptor check:"
mraptor "$FILE"
MRAPTOR_EXIT=$?

# Step 3: Extract VBA if present
echo "[*] VBA Analysis:"
olevba -a "$FILE"

# Step 4: Check for DDE
echo "[*] DDE Check:"
msodde "$FILE" 2>/dev/null

# Step 5: Extract embedded objects
echo "[*] Embedded Objects:"
oleobj "$FILE" 2>/dev/null

# Verdict (mraptor exits with 20 when it rates the macros SUSPICIOUS)
if [ "$MRAPTOR_EXIT" -eq 20 ]; then
    echo "[!] VERDICT: SUSPICIOUS - auto-executing macro with write/execute behavior"
else
    echo "[+] VERDICT: mraptor did not flag the macros as suspicious"
fi
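
The same triage steps can be sketched in Python when the results need to feed another system. The function names and the injectable `run` parameter are choices made for this example, and the value 20 is mraptor's documented exit code for a SUSPICIOUS result.

```python
import subprocess

MRAPTOR_SUSPICIOUS = 20  # mraptor's documented exit code for "SUSPICIOUS"


def triage_commands(path):
    """Return the command lines of the triage steps, in order."""
    return [
        ["oleid", path],
        ["mraptor", path],
        ["olevba", "-a", path],
        ["msodde", path],
        ["oleobj", path],
    ]


def run_triage(path, run=subprocess.run):
    """Run every step and return {tool: exit_code}.

    `run` is injectable so the function can be exercised without oletools.
    """
    codes = {}
    for argv in triage_commands(path):
        done = run(argv, capture_output=True, text=True)
        codes[argv[0]] = done.returncode
    return codes


def verdict(codes):
    """Derive the verdict from mraptor's exit code only."""
    if codes.get("mraptor") == MRAPTOR_SUSPICIOUS:
        return "SUSPICIOUS"
    return "no suspicious macros flagged"
```

Calling `verdict(run_triage("sample.doc"))` requires the oletools commands on the PATH; passing a stub as `run` makes the logic testable without them.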

Extracting IOCs

# Extract URLs from macros
olevba -c malicious.doc | grep -oP 'https?://[^\s"\\)]+' | sort -u

# Extract IP addresses
olevba -c malicious.doc | grep -oP '\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}' | sort -u

# Extract file paths
olevba -c malicious.doc | grep -oP '[A-Z]:\\[^\s"]+' | sort -u

# Extract base64 encoded content
olevba --decode malicious.doc | grep -oP '[A-Za-z0-9+/]{40,}={0,2}'

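# JSON extraction pipeline (olevba -j prints a JSON list with one entry per file)
olevba -j malicious.doc | python3 -c "
import json, sys
data = json.load(sys.stdin)
entries = data if isinstance(data, list) else [data]
for entry in entries:
    # 'analysis' is missing or null for metadata entries and macro-free files
    for analysis in entry.get('analysis') or []:
        if analysis.get('type') == 'IOC':
            print(f\"{analysis['keyword']}: {analysis.get('description', '')}\")
"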

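The grep patterns above can be mirrored in Python when the extracted source is already in memory. This is a sketch rather than a complete IOC extractor: the regular expressions are adapted from the shell versions, IPv4 candidates are additionally validated with `ipaddress`, and `extract_iocs` and `SAMPLE` are hypothetical names.

```python
import ipaddress
import re

URL_RE = re.compile(r"https?://[^\s\"'\\)]+")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
WIN_PATH_RE = re.compile(r"\b[A-Za-z]:\\[^\s\"]+")


def _valid_ipv4(text):
    """Reject dotted quads such as 999.1.1.1 that the regex lets through."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


def extract_iocs(vba_source):
    """Return sorted, de-duplicated URLs, IPv4 addresses and Windows paths."""
    return {
        "urls": sorted(set(URL_RE.findall(vba_source))),
        "ipv4": sorted({ip for ip in IPV4_RE.findall(vba_source) if _valid_ipv4(ip)}),
        "paths": sorted(set(WIN_PATH_RE.findall(vba_source))),
    }


# Hand-written macro fragment for illustration only
SAMPLE = r'''
Sub AutoOpen()
    u = "http://example.test/payload.exe"
    h = "10.0.0.5"
    bad = "999.1.1.1"
    p = "C:\Users\Public\run.exe"
End Sub
'''

if __name__ == "__main__":
    print(extract_iocs(SAMPLE))
```

Validating the octets removes a common source of noise, since version strings and malformed quads match the plain digit pattern.
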
Configuration

Python API

from oletools.olevba import VBA_Parser
from oletools.mraptor import MacroRaptor

# Analyze VBA macros
vba_parser = VBA_Parser("document.doc")
all_code = []  # source of every module, collected for the mraptor scan

try:
    if vba_parser.detect_vba_macros():
        for (filename, stream_path, vba_filename, vba_code) in vba_parser.extract_macros():
            print(f"Module: {vba_filename}")
            print(vba_code)
            all_code.append(vba_code)

        # Get analysis results
        results = vba_parser.analyze_macros()
        for kw_type, keyword, description in results:
            print(f"{kw_type}: {keyword} - {description}")
finally:
    # Always release the file, even if parsing fails
    vba_parser.close()

# Check for auto-execution across all modules combined
if all_code:
    raptor = MacroRaptor("\n".join(all_code))
    raptor.scan()
    if raptor.suspicious:
        print("Suspicious: auto-exec trigger combined with write/execute keywords")
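
The analysis tuples are easier to report on once they are grouped by type. The helper below is a small sketch that works on any list of `(type, keyword, description)` tuples in the shape iterated above; `group_by_type` and the sample data are made up for illustration and are not real olevba output.

```python
from collections import defaultdict


def group_by_type(results):
    """Group (type, keyword, description) tuples into {type: [keywords]}."""
    grouped = defaultdict(list)
    for kw_type, keyword, _description in results:
        if keyword not in grouped[kw_type]:  # keep first occurrence only
            grouped[kw_type].append(keyword)
    return dict(grouped)


# Hand-written tuples for illustration only
SAMPLE_RESULTS = [
    ("AutoExec", "AutoOpen", "Runs when the Word document is opened"),
    ("Suspicious", "Shell", "May run an executable file or a system command"),
    ("Suspicious", "Shell", "May run an executable file or a system command"),
    ("IOC", "http://example.test/a.exe", "URL"),
]

if __name__ == "__main__":
    for kw_type, keywords in group_by_type(SAMPLE_RESULTS).items():
        print(f"{kw_type}: {', '.join(keywords)}")
```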

Integration with Email Gateway

import email
from oletools.olevba import VBA_Parser
from oletools.mraptor import MacroRaptor

OFFICE_EXTENSIONS = ('.doc', '.docm', '.xls', '.xlsm')

def scan_email_attachment(msg_file):
    """Scan email attachments for malicious macros."""
    # Read the message as bytes; the context manager closes the file
    with open(msg_file, 'rb') as fh:
        msg = email.message_from_binary_file(fh)
    results = []

    for part in msg.walk():
        filename = part.get_filename()
        if not filename or not filename.lower().endswith(OFFICE_EXTENSIONS):
            continue
        content = part.get_payload(decode=True)
        vba = None
        try:
            vba = VBA_Parser(filename, data=content)
            if vba.detect_vba_macros():
                # Scan all modules together: the trigger and the payload
                # may live in different modules
                code = '\n'.join(c for _, _, _, c in vba.extract_macros())
                mr = MacroRaptor(code)
                mr.scan()
                if mr.suspicious:
                    results.append({
                        'filename': filename,
                        'suspicious': True,
                        'flags': mr.get_flags()
                    })
        except Exception as e:
            results.append({'filename': filename, 'error': str(e)})
        finally:
            if vba is not None:
                vba.close()

    return results
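
The attachment-selection step can be tested on its own, without oletools or real mail. This sketch builds a message with the standard library and yields only attachments whose names end in an Office extension; `office_attachments` and `demo_message` are hypothetical helpers, and the attachment bytes are dummy data.

```python
from email.message import EmailMessage

OFFICE_EXTENSIONS = (".doc", ".docm", ".xls", ".xlsm")


def office_attachments(msg):
    """Yield (filename, raw_bytes) for attachments with Office extensions."""
    for part in msg.walk():
        filename = part.get_filename()
        if not filename or not filename.lower().endswith(OFFICE_EXTENSIONS):
            continue
        payload = part.get_payload(decode=True)
        if payload:  # skip empty or undecodable parts
            yield filename, payload


def demo_message():
    """Build a message with one Office-named and one unrelated attachment."""
    msg = EmailMessage()
    msg["Subject"] = "Invoice"
    msg.set_content("See attached.")
    msg.add_attachment(b"\xd0\xcf\x11\xe0fake", maintype="application",
                       subtype="msword", filename="Invoice.DOC")
    msg.add_attachment(b"hello", maintype="application",
                       subtype="octet-stream", filename="notes.txt")
    return msg


if __name__ == "__main__":
    for name, data in office_attachments(demo_message()):
        print(name, len(data), "bytes")
```

Filtering by extension alone is easy to evade with renamed files, so a gateway would typically combine it with a content check such as oleid.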

Advanced Usage

XLM Macro Detection

# Detect Excel 4.0 (XLM) macros
olevba -a spreadsheet.xls
# Look for "XLM macros" in output

# Extract XLM formulas
# XLM macros appear in hidden sheets with formulas like:
# =EXEC("cmd /c ...")
# =ALERT("message")
# =CALL("kernel32","VirtualAlloc"...)
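
Once XLM formulas have been extracted as text, a keyword pass can rank them for review. The function list below is illustrative and not exhaustive, and `risky_functions` is a made-up helper; it is a sketch of a triage heuristic, not how olevba classifies XLM macros.

```python
import re

# Illustrative, non-exhaustive list of Excel 4.0 functions often abused
RISKY_XLM_FUNCTIONS = ("EXEC", "CALL", "REGISTER", "RUN", "FOPEN", "FWRITE")

# Match a listed name followed by "(" unless it is the tail of a longer name
_RISKY_RE = re.compile(
    r"(?<![A-Za-z0-9_.])(%s)\s*\(" % "|".join(RISKY_XLM_FUNCTIONS),
    re.IGNORECASE,
)


def risky_functions(formula):
    """Return risky function names in one formula, upper-cased, first-seen order."""
    found = []
    for name in _RISKY_RE.findall(formula):
        name = name.upper()
        if name not in found:
            found.append(name)
    return found


if __name__ == "__main__":
    for formula in ('=EXEC("cmd /c calc")', '=ALERT("message")',
                    '=CALL("kernel32","VirtualAlloc")'):
        print(formula, risky_functions(formula))
```

The negative lookbehind prevents a name such as RECALL from being reported as CALL.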

Handling Password-Protected Files

# Try common passwords
olevba -p password malicious.doc
olevba -p infected malicious.doc
olevba -p 1234 malicious.doc

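# Brute force with wordlist (relies on olevba exiting non-zero when decryption fails)
# IFS= and -r keep leading spaces and backslashes in passwords intact
while IFS= read -r pwd; do
  olevba -p "$pwd" malicious.doc 2>/dev/null && echo "Password: $pwd" && break
done < wordlist.txt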
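
Because olevba's help text says that `-p` may be repeated, several candidate passwords can be tried in a single run instead of one process per password. The sketch below only builds the command line; `olevba_password_args` is a hypothetical helper name.

```python
def olevba_password_args(path, passwords):
    """Build an olevba command line that tries several passwords in one run.

    Relies on olevba accepting -p more than once, as its help text states.
    Blank entries and trailing newlines (as read from a wordlist) are dropped.
    """
    argv = ["olevba"]
    for pwd in passwords:
        pwd = pwd.rstrip("\r\n")
        if pwd:
            argv += ["-p", pwd]
    argv.append(path)
    return argv


if __name__ == "__main__":
    print(olevba_password_args("malicious.doc", ["password", "infected", "1234"]))
```

The resulting list can be passed straight to `subprocess.run`, which avoids shell quoting problems with unusual passwords.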

Batch Processing

# Scan directory of documents
find /samples -type f \( -name "*.doc" -o -name "*.docm" -o -name "*.xlsm" \) | while IFS= read -r f; do
  RESULT=$(mraptor "$f" 2>&1)
  EXIT=$?
  # 20 is mraptor's exit code for SUSPICIOUS
  if [ "$EXIT" -eq 20 ]; then
    echo "SUSPICIOUS: $f"
    echo "$RESULT"
  fi
done > scan_results.txt
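
A Python batch scanner needs the same file selection as the find command. This sketch walks a directory tree with `pathlib` and matches the same three extensions; unlike `find -name`, it compares suffixes case-insensitively. `iter_office_files` is a name chosen for this example.

```python
from pathlib import Path

# Same extensions as the find command above
MACRO_CAPABLE_SUFFIXES = {".doc", ".docm", ".xlsm"}


def iter_office_files(root, suffixes=MACRO_CAPABLE_SUFFIXES):
    """Yield files under root whose suffix matches, ignoring case."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            yield path


if __name__ == "__main__":
    for path in iter_office_files("."):
        print(path)
```

Each yielded path can then be handed to mraptor or to the Python API shown earlier.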

Troubleshooting

Issue                                   Solution
ImportError: No module named oletools   Install with pip install oletools
Cannot parse OOXML file                 Install the optional dependencies with pip install "oletools[full]"
Password-protected file                 Use the -p flag with the password, or try common passwords
Corrupted OLE file                      Verify with olefile.isOleFile(), or use olevba --relaxed
Large file timeout                      Scan in smaller batches and wrap each run in an external timeout (for example the coreutils timeout command)
Missing XLM detection                   Update to the latest version; XLM extraction arrived in oletools 0.56 and was extended in 0.60
RTF parsing errors                      Use rtfobj for RTF files instead of olevba
False positive macros                   Review the VBA source with olevba -c to verify the auto-exec triggers