FlameGraph Commands
FlameGraph is a collection of scripts by Brendan Gregg that generate interactive SVG flame graphs from stack trace data. Flame graphs visualize profiled software by showing which code paths consume the most resources, making performance bottlenecks immediately visible.
Installation
Linux/Ubuntu
# Clone the repository
git clone https://github.com/brendangregg/FlameGraph.git
cd FlameGraph
# Add to PATH (optional)
export PATH="$PATH:$(pwd)"
# Verify
./flamegraph.pl --help 2>&1 | head -3
# Dependencies — Perl is required (usually pre-installed)
perl --version
Core Workflow
# The standard 3-step process:
# 1. Capture stacks (perf, bpftrace, dtrace, etc.)
# 2. Collapse/fold stacks into single lines
# 3. Generate the SVG flame graph
# Example with perf:
perf record -F 99 -a -g -- sleep 30
perf script > out.perf
./stackcollapse-perf.pl out.perf > out.folded
./flamegraph.pl out.folded > flamegraph.svg
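The three steps fail in confusing ways when a tool is missing, so it can help to check first. A minimal sketch, assuming the FlameGraph scripts are on PATH as set up in the installation step; the helper name require_tools is hypothetical and not part of FlameGraph:

```shell
# require_tools CMD...: print each command that is not on PATH to stderr;
# returns non-zero if any is missing.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Only start a capture when everything the 3-step process needs is present.
if require_tools perf perl stackcollapse-perf.pl flamegraph.pl; then
  perf record -F 99 -a -g -- sleep 30 &&
    perf script | stackcollapse-perf.pl | flamegraph.pl > flamegraph.svg
fi
```

If anything is missing, the capture is skipped and the missing names are listed on stderr.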
Stack Collapsers
# Collapse perf script output
./stackcollapse-perf.pl out.perf > out.folded
# Collapse with PID annotations
./stackcollapse-perf.pl --pid out.perf > out.folded
# Collapse with thread IDs
./stackcollapse-perf.pl --tid out.perf > out.folded
# Collapse DTrace output
./stackcollapse.pl out.dtrace > out.folded
# Collapse bpftrace output
./stackcollapse-bpftrace.pl out.bpftrace > out.folded
# Collapse Java jstack output
./stackcollapse-jstack.pl out.jstack > out.folded
# Collapse Go pprof output (text produced by `go tool pprof -raw`)
./stackcollapse-go.pl out.pprof > out.folded
# Collapse Python faulthandler traceback dumps
./stackcollapse-faulthandler.pl out.faulthandler > out.folded
# Collapse Xcode Instruments output
./stackcollapse-instruments.pl out.instruments > out.folded
# Collapse SystemTap output
./stackcollapse-stap.pl out.stap > out.folded
# Merge directly recursive frames in already-folded stacks
./stackcollapse-recursive.pl out.folded > out.norecursion.folded
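stackcollapse-recursive.pl works on stacks that are already folded, merging frames that directly repeat. A rough awk illustration of that idea on made-up data; the function name collapse_direct_recursion is hypothetical, and this is a sketch of the behavior, not the script's actual code:

```shell
# collapse_direct_recursion: read folded stacks on stdin, drop any frame
# that repeats its immediate parent, and sum counts of stacks that become equal.
collapse_direct_recursion() {
  awk '{
    count = $NF                          # last field is the sample count
    stack = $0
    sub(/ [0-9]+$/, "", stack)           # strip the trailing count
    n = split(stack, frames, ";")
    out = frames[1]
    for (i = 2; i <= n; i++)
      if (frames[i] != frames[i - 1]) out = out ";" frames[i]
    sum[out] += count
  }
  END { for (s in sum) print s, sum[s] }' | LC_ALL=C sort
}

printf '%s\n' 'main;fib;fib;fib;add 3' 'main;fib;fib;add 2' 'main;io 1' |
  collapse_direct_recursion
```

Both fib stacks reduce to main;fib;add, so their counts are added together.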
Generating Flame Graphs
# Basic flame graph
./flamegraph.pl out.folded > flamegraph.svg
# Custom title
./flamegraph.pl --title "My App CPU Profile" out.folded > flamegraph.svg
# Custom subtitle
./flamegraph.pl --subtitle "Production 2026-05-21" out.folded > flamegraph.svg
# Set minimum frame width in pixels (narrower frames are omitted)
./flamegraph.pl --minwidth 0.5 out.folded > flamegraph.svg
# Custom image width and per-frame height (pixels)
./flamegraph.pl --width 1400 --height 24 out.folded > flamegraph.svg
# Icicle graph (root at the top, frames grow downward)
./flamegraph.pl --inverted out.folded > icicle.svg
# Custom color palette (the option is spelled --colors)
./flamegraph.pl --colors hot out.folded > flamegraph.svg
./flamegraph.pl --colors mem out.folded > flamegraph.svg
./flamegraph.pl --colors io out.folded > flamegraph.svg
./flamegraph.pl --colors java out.folded > flamegraph.svg
# Label for the count unit shown in frame details (default: samples)
./flamegraph.pl --countname "microseconds" out.folded > flamegraph.svg
# Label shown before each frame name (default: "Function:")
./flamegraph.pl --nametype "Function:" out.folded > flamegraph.svg
CPU Flame Graphs
# On-CPU flame graph from perf
perf record -F 99 -a -g -- sleep 30
perf script > out.perf
./stackcollapse-perf.pl out.perf > out.folded
./flamegraph.pl --title "CPU Flame Graph" out.folded > cpu.svg
# On-CPU flame graph from bpftrace
sudo bpftrace -e 'profile:hz:99 { @[kstack] = count(); }' > out.bpftrace
./stackcollapse-bpftrace.pl out.bpftrace > out.folded
./flamegraph.pl out.folded > cpu_bpf.svg
# User-space only CPU flame graph (cycles:u samples user mode only)
perf record -F 99 -e cycles:u --call-graph dwarf -p 1234 -- sleep 30
perf script > out.perf
./stackcollapse-perf.pl out.perf > out.folded
./flamegraph.pl --colors java out.folded > user_cpu.svg
Off-CPU Flame Graphs
# Using BCC offcputime
sudo offcputime-bpfcc -f 30 > offcpu.folded
./flamegraph.pl --colors=io --title="Off-CPU Time" --countname=us offcpu.folded > offcpu.svg
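# Using bpftrace for off-CPU analysis (kernel stacks, time in nanoseconds; Ctrl-C to stop)
sudo bpftrace -e '
tracepoint:sched:sched_switch {
  // Outgoing thread: record when and where it left the CPU
  @start[args.prev_pid] = nsecs;
  @stack[args.prev_pid] = kstack;
  // Incoming thread: charge its time off-CPU to the stack it blocked on
  if (@start[args.next_pid]) {
    @[@stack[args.next_pid], args.next_comm] = sum(nsecs - @start[args.next_pid]);
    delete(@start[args.next_pid]);
    delete(@stack[args.next_pid]);
  }
}
END { clear(@start); clear(@stack); }' > offcpu.bt
./stackcollapse-bpftrace.pl offcpu.bt > offcpu.folded
./flamegraph.pl --colors=io --title="Off-CPU Time" --countname=ns offcpu.folded > offcpu.svg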
Memory Flame Graphs
# Kernel memory allocation flame graph from perf (each sample is one kmalloc call, not a byte count)
perf record -e kmem:kmalloc -a -g -- sleep 10
perf script > mem.perf
./stackcollapse-perf.pl mem.perf > mem.folded
./flamegraph.pl --colors=mem --title="Kernel Memory Allocations" --countname="calls" mem.folded > mem.svg
Differential Flame Graphs
# Compare two profiles using difffolded.pl
# 1. Capture baseline profile
perf record -F 99 -a -g -- sleep 30
perf script > baseline.perf
./stackcollapse-perf.pl baseline.perf > baseline.folded
# 2. Capture comparison profile (after changes)
perf record -F 99 -a -g -- sleep 30
perf script > comparison.perf
./stackcollapse-perf.pl comparison.perf > comparison.folded
# 3. Generate differential folded stacks
./difffolded.pl baseline.folded comparison.folded > diff.folded
# 4. Generate differential flame graph
# Red = growth (regression), blue = shrinkage (improvement)
# (--negate swaps these colors; use it only when the two input files are given in reverse order)
./flamegraph.pl diff.folded > diff_flamegraph.svg
# Normalize the baseline counts to the comparison's total sample count
./difffolded.pl -n baseline.folded comparison.folded > diff_normalized.folded
./flamegraph.pl diff_normalized.folded > diff_norm.svg
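difffolded.pl emits each stack followed by two counts (baseline, then comparison), and flamegraph.pl colors frames by the difference between them. The awk function below is only a rough stand-in that shows this three-column shape on made-up data; the name diff_folded_sketch and the file names are hypothetical, and it skips the normalization that -n performs:

```shell
# diff_folded_sketch BEFORE AFTER: print "stack before_count after_count"
# for every stack seen in either file (a stack missing from one file counts as 0).
diff_folded_sketch() {
  awk '{
    count = $NF                          # last field is the sample count
    stack = $0
    sub(/ [0-9]+$/, "", stack)           # strip the trailing count
    if (FILENAME == ARGV[1]) before[stack] += count
    else                     after[stack]  += count
    seen[stack] = 1
  }
  END { for (s in seen) print s, before[s] + 0, after[s] + 0 }' "$1" "$2" |
    LC_ALL=C sort
}

# Made-up profiles: parse grew, send shrank slightly, log is new.
printf '%s\n' 'main;parse 40' 'main;send 60' > demo_before.folded
printf '%s\n' 'main;parse 70' 'main;send 55' 'main;log 5' > demo_after.folded

diff_folded_sketch demo_before.folded demo_after.folded
```

A stack whose second count is larger than its first is one that grew between the two profiles.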
Filtering and Transforming
# Grep for specific functions in folded stacks
grep 'tcp_' out.folded | ./flamegraph.pl > tcp_only.svg
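# Exclude stacks containing kernel frames (--kernel tags kernel frames with _[k])
./stackcollapse-perf.pl --kernel out.perf | grep -v '_\[k\]' | ./flamegraph.pl > user_only.svg
# Filter to a specific process (stackcollapse-perf.pl puts the command name first)
grep '^my_app;' out.folded | ./flamegraph.pl > my_app.svg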
# Combine multiple folded stack files
cat profile1.folded profile2.folded | ./flamegraph.pl > combined.svg
# Sort folded stacks for diffing
sort out.folded > sorted.folded
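Concatenating folded files leaves one line per file for stacks that appear in both. If a compact merged file is wanted (for archiving or diffing, say), identical stacks can be summed first. A minimal sketch on made-up data; merge_folded and the demo file names are hypothetical:

```shell
# merge_folded FILE...: sum the counts of identical stacks across all files
# and print one line per unique stack, sorted by stack.
merge_folded() {
  cat "$@" | awk '{
    count = $NF                          # last field is the sample count
    stack = $0
    sub(/ [0-9]+$/, "", stack)           # strip the trailing count
    sum[stack] += count
  }
  END { for (s in sum) print s, sum[s] }' | LC_ALL=C sort
}

# Two tiny example profiles that share one stack.
printf '%s\n' 'main;a;b 10' 'main;a;c 5' > demo1.folded
printf '%s\n' 'main;a;b 7' 'main;d 3' > demo2.folded

merge_folded demo1.folded demo2.folded > merged.folded
cat merged.folded
```

The shared stack main;a;b ends up on a single line with the combined count of 17.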
Folded Stack Format
# Format: semicolon-separated stack frames followed by a space and count
# Bottom of stack is on the left, top (leaf) on the right
main;read_data;parse_json;validate 42
main;read_data;parse_json;transform 87
main;handle_request;send_response 156
main;handle_request;log_request 23
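Because each line is just a stack and a count, ordinary text tools can summarize a folded file before it is rendered. A minimal sketch, assuming the four sample lines above are saved as sample.folded (a hypothetical file name); the function name leaf_totals is also illustrative and not part of FlameGraph:

```shell
# Write the sample stacks to a scratch file.
cat > sample.folded <<'EOF'
main;read_data;parse_json;validate 42
main;read_data;parse_json;transform 87
main;handle_request;send_response 156
main;handle_request;log_request 23
EOF

# leaf_totals FILE: sum sample counts per leaf frame (the rightmost frame)
# and print "count percent% frame", highest count first.
leaf_totals() {
  awk '{
    count = $NF                          # last field is the sample count
    stack = $0
    sub(/ [0-9]+$/, "", stack)           # strip the trailing count
    n = split(stack, frames, ";")
    leaf[frames[n]] += count
    total += count
  }
  END {
    for (f in leaf) printf "%d %.1f%% %s\n", leaf[f], 100 * leaf[f] / total, f
  }' "$1" | sort -rn
}

leaf_totals sample.folded
```

Here send_response accounts for 156 of the 308 samples, so it is listed first.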
Interactive SVG Features
The generated SVG files include built-in interactivity:
| Feature | Description |
|---|---|
| Hover | Shows function name, sample count, and percentage |
| Click | Zooms into a specific frame and its children |
| Ctrl+F / Search | Highlights matching frames with magenta |
| Reset Zoom | Click “Reset Zoom” to return to the full view |
| Right-click | Opens browser context menu for saving |
Language-Specific Workflows
Java
# Using async-profiler (recommended for Java); 1234 stands for the JVM's PID (list PIDs with jps)
./asprof -d 30 -f out.html 1234
# Or generate folded stacks
./asprof -d 30 -o collapsed -f out.folded 1234
./flamegraph.pl out.folded > java_cpu.svg
Python
# Using py-spy to generate folded stacks
py-spy record -f raw -o out.folded --pid 1234
./flamegraph.pl out.folded > python_cpu.svg
Node.js
# Using 0x (flame graph profiler for Node.js)
npx 0x my_app.js
# Or perf with --perf-basic-prof
node --perf-basic-prof my_app.js &
perf record -F 99 -p $! -g -- sleep 30
perf script > out.perf
./stackcollapse-perf.pl out.perf > out.folded
./flamegraph.pl out.folded > node_cpu.svg
Tips
# Increase sample rate for short workloads
perf record -F 999 -g -- ./short_task
# Use DWARF unwinding for accurate user-space stacks
perf record -g --call-graph dwarf -F 99 -p 1234 -- sleep 30
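# Check for unwind tables (.eh_frame / .debug_frame), which DWARF unwinding relies on
# (this does not show frame pointers; those depend on build flags such as -fno-omit-frame-pointer)
readelf -S /usr/bin/myapp | grep -i frame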
# Pipe everything in one line
perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > flamegraph.svg