Chapter 19: Footprint Analysis Fundamentals
Part VI: Embedded Constraints
"In embedded systems, every byte counts—literally." — Jack Ganssle
The Missing 2 KB
Two weeks before mass production, our project war room reeked of espresso and anxiety.
The development team had just merged the final security patch, but the automated build server flashed an angry red warning:
region 'FLASH' overflowed by 2048 bytes
This was a 128 KB flash system, and our binary had grown to 130 KB. Those extra 2 KB stood like an insurmountable wall between us and product shipment.
Senior engineer Zhang immediately shouted: "Quick! Strip out all the debug strings from printf, and disable those unnecessary assert statements!"
Everyone scrambled through the source code, hunting for strings to delete. An hour later, the second build result arrived: only 400 bytes saved.
The team fell silent. Blind "intuition-based optimization" proved utterly powerless against hard memory constraints.
"We need data, not guesses." Junior performance engineer Ming cut through the chaos.
Instead of rushing to delete code, he calmly ran size and nm --size-sort. In the detailed linker map file, he discovered the real "space killer" wasn't printf—it was a newly introduced third-party sensor driver.
That driver had inadvertently pulled in the floating-point emulation library, all because of a calibration routine that mistakenly used double for fewer than ten lines of data processing.
Guided by systematic analysis tools, the team changed just two lines of code, converting the floating-point math to fixed-point arithmetic. The binary immediately shrank by about 50 KB.
Optimizing footprint isn't a guessing game of "deleting code"—it's a precise science of measurement.
What is Footprint?
In embedded systems, footprint refers to the memory space a program occupies. Unlike desktop systems, embedded memory is a hard constraint—your firmware must fit into fixed-size flash and RAM.
Static vs Dynamic Footprint
Footprint can be categorized into two types:
┌────────────────────────────────────────────────────────────┐
│ Static Footprint (determined at compile time)              │
├─────────┬──────────────────────────────┬───────────────────┤
│ .text   │ Machine code, instructions   │ Stored in Flash   │
│ .rodata │ Constants, string literals   │ Stored in Flash   │
│ .data   │ Initialized globals          │ Flash → RAM       │
│ .bss    │ Uninitialized globals        │ RAM (zeroed)      │
└─────────┴──────────────────────────────┴───────────────────┘
┌────────────────────────────────────────────────────────────┐
│ Dynamic Footprint (changes at runtime)                     │
├─────────┬──────────────────────────────┬───────────────────┤
│ Stack   │ Local variables, call frames │ RAM               │
│ Heap    │ Dynamically allocated memory │ RAM               │
└─────────┴──────────────────────────────┴───────────────────┘
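To make the two tables concrete, here is a minimal C sketch showing where a typical GCC-style embedded toolchain would usually place each kind of object. All names (crc_seed_table, sample_period_ms, rx_buffer, touch_all_regions) are hypothetical, and the exact placement depends on the compiler, optimization level, and linker script.

```c
#include <stdint.h>
#include <stdlib.h>

/* .rodata: constant data, normally kept in flash */
static const uint16_t crc_seed_table[4] = {0x0000, 0x1021, 0x2042, 0x3063};

/* .data: initial value stored in flash, copied to RAM by startup code */
static uint32_t sample_period_ms = 250;

/* .bss: no initial value stored in flash, zeroed in RAM at startup */
static uint8_t rx_buffer[64];

/* .text: the machine code of this function lives in flash */
uint32_t touch_all_regions(void)
{
    uint8_t scratch[8] = {0};        /* stack: exists only during the call */
    uint8_t *dynamic = malloc(16);   /* heap: allocated at runtime */

    uint32_t sum = crc_seed_table[1] + sample_period_ms
                 + rx_buffer[0] + scratch[0];

    if (dynamic != NULL) {
        dynamic[0] = 1;              /* touch the allocation, then release it */
        free(dynamic);
    }
    return sum;
}
```

Running nm on the resulting object file is one way to check the actual placement rather than trusting the comments.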
Flash vs RAM Occupancy
Understanding the mapping between sections and memory is crucial:
Flash usage = .text + .rodata + .data (initial values)
RAM usage = .data + .bss + stack + heap
Note that .data occupies both Flash (storing initial values) and RAM (runtime storage). This detail trips up many engineers.
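The two formulas above can be written as a small C helper. This is only a sketch: the struct section_sizes type and both function names are made up for illustration, and the sizes would come from a tool such as size rather than from the firmware itself.

```c
#include <stdint.h>

/* Hypothetical container for section sizes, in bytes. */
struct section_sizes {
    uint32_t text;
    uint32_t rodata;
    uint32_t data;
    uint32_t bss;
    uint32_t stack;
    uint32_t heap;
};

/* Flash holds code, constants, and the initial values of .data. */
uint32_t flash_usage(const struct section_sizes *s)
{
    return s->text + s->rodata + s->data;
}

/* RAM holds the runtime copy of .data plus .bss, stack, and heap. */
uint32_t ram_usage(const struct section_sizes *s)
{
    return s->data + s->bss + s->stack + s->heap;
}
```

Note that data appears in both functions, which is exactly the double counting described above.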
Why "Won't Fit in Flash" Is So Common
Typical embedded system memory constraints:
Device Type        Flash         RAM
─────────────────────────────────────────
Low-end MCU        32 KB         4 KB
Mid-range MCU      256 KB        64 KB
High-end MCU       1 MB          256 KB
Application CPU    Unlimited*    512 MB+
* Has a file system and virtual memory
As features accumulate, code size easily grows unnoticed. A single "harmless" library reference might bring in tens of KB of hidden dependencies.
The Toolbox
Just as performance analysis needs profilers, footprint analysis requires specialized tools. Here are four essential tools for systems software engineers.
1. The size Command: Quick Overview
size is the most basic tool for quickly grasping a binary's overall structure:
$ riscv64-unknown-elf-size -A -x firmware.elf
section size addr
.text 0x4500 0x80000000
.rodata 0x0800 0x80004500
.data 0x0100 0x80004d00
.bss 0x0200 0x20000000
.stack 0x1000 0x20000200
Total 0x6000
Key metrics:
- Flash usage: .text + .rodata + .data = 0x4E00 = 19,968 bytes
- RAM usage: .data + .bss + .stack = 0x1300 = 4,864 bytes
Pro tip: Use -A (System V format) instead of the default Berkeley format for more detailed section breakdown.
2. The nm Command: Symbol-Level Analysis
When you find a section is too large, dig into symbol level to find the culprit:
$ riscv64-unknown-elf-nm -S --size-sort -r firmware.elf | head -10
80001a20 000005d4 T core_process_loop
20000040 00000400 B network_buffer
800021f4 00000210 t parse_json_string
80002404 000001c8 T uart_send_buffer
...
Output interpretation:
- Column 1: Symbol address
- Column 2: Symbol size (bytes)
- Column 3: Symbol type (T = text, D = data, B = bss; a lowercase letter such as t marks a local, file-scope symbol)
- Column 4: Symbol name
Pro tip: Filter by type, e.g., to list only the large variables in RAM (lowercase b and d are included so that static, file-local variables are not missed):
$ nm -S --size-sort -r firmware.elf | grep -E ' [bBdD] '
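If the nm output needs to be post-processed by a host-side tool, each line can be split into its four columns with sscanf. The following is a minimal sketch, assuming the address, size, type, name layout shown above; struct nm_symbol, parse_nm_line, and is_ram_symbol are hypothetical names, not part of binutils.

```c
#include <stdio.h>

/* One parsed line of `nm -S` output (hypothetical helper type). */
struct nm_symbol {
    unsigned long long addr;
    unsigned long long size;   /* bytes */
    char type;                 /* e.g. 'T', 't', 'B', 'b', 'D', 'd' */
    char name[128];
};

/* Returns 1 if the line had all four columns, 0 otherwise. */
int parse_nm_line(const char *line, struct nm_symbol *out)
{
    return sscanf(line, "%llx %llx %c %127s",
                  &out->addr, &out->size, &out->type, out->name) == 4;
}

/* .data and .bss symbols (global or local) occupy RAM at runtime. */
int is_ram_symbol(const struct nm_symbol *sym)
{
    return sym->type == 'B' || sym->type == 'b' ||
           sym->type == 'D' || sym->type == 'd';
}
```

Lines that lack a size column simply fail to parse and can be skipped by the caller.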
3. bloaty: Modern Footprint Analyzer
Bloaty McBloatface is an advanced footprint analysis tool from Google. It displays space distribution hierarchically and supports diff comparison between versions.
# Analyze by compile unit (source file)
$ bloaty firmware.elf -d compileunits
VM SIZE FILE SIZE
-------------- --------------
62.5% 5.15Ki tasks.c 62.5% 5.15Ki
21.2% 1.75Ki queue.c 21.2% 1.75Ki
8.5% 712B list.c 8.5% 712B
7.8% 650B port.c 7.8% 650B
Version comparison (Diff)—bloaty's most powerful feature:
$ bloaty new_firmware.elf -- old_firmware.elf
VM SIZE FILE SIZE
-------------- --------------
+15.2% +2.1Ki .text +15.2% +2.1Ki
[ = ] 0 .rodata [ = ] 0
+8.3% +128B .bss +8.3% +128B
-------------- --------------
+12.1% +2.2Ki TOTAL +12.1% +2.2Ki
This diff capability is especially useful in CI/CD—automatically compare footprint changes after each commit.
4. Linker Map File: The Ultimate Truth
The linker map file records how the linker combines all object files into the final binary. It's the ultimate weapon for solving "where did the space go?" mysteries.
Generating a map file:
$ riscv64-unknown-elf-gcc main.o lib.o -Wl,-Map=output.map -o firmware.elf
Map file example:
.text.core_init
0x0000000080000100 0x48 main.o
.text.uart_send
0x0000000080000148 0x20 uart.o
*fill* 0x0000000080000168 0x08
.text.process_data
0x0000000080000170 0x120 process.o
Key observations:
- *fill* indicates alignment padding, a hidden source of wasted space
- You can trace each symbol back to its source object file
- You can discover libraries that were accidentally linked in
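The size of a *fill* entry follows directly from alignment arithmetic. In the excerpt above, the 8 bytes of padding are consistent with .text.process_data requiring 16-byte alignment; that alignment is an assumption, since the map excerpt does not state it. A minimal sketch, with align_padding as a hypothetical helper name:

```c
#include <stdint.h>

/* Bytes of padding needed so that `addr` becomes a multiple of `alignment`.
 * `alignment` must be non-zero. */
uint64_t align_padding(uint64_t addr, uint64_t alignment)
{
    return (alignment - (addr % alignment)) % alignment;
}
```

With the addresses from the excerpt, align_padding(0x80000168, 16) evaluates to 8, which moves the next section to 0x80000170.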
Analysis Workflow
Establish a systematic analysis process instead of guessing by intuition:
Step 1: Baseline Measurement
↓
$ size firmware.elf
Record .text, .data, .bss sizes
↓
Step 2: Identify Heavy Hitters
↓
$ nm -S --size-sort -r firmware.elf | head -20
Find symbols consuming the most space
↓
Step 3: Trace Origins
↓
Check linker map file
Confirm which object files these symbols come from
↓
Step 4: Analyze Causes
↓
- Is there an accidentally included library?
- Are there unnecessary features being compiled in?
- Are there oversized static buffers?
↓
Step 5: Verify Changes
↓
$ bloaty new.elf -- old.elf
Confirm changes actually reduced footprint
Common "Space Killers"
1. Floating-Point Library
- Using float/double on MCUs without FPU
- Even a single printf("%f") pulls in the entire float formatting library
2. Standard Library Functions
- printf family: 10-20 KB
- malloc/free: 1-5 KB
- Consider newlib-nano or custom minimal versions
3. Oversized Static Buffers
- char log_buffer[4096]; // Do you really need it this big?
4. Unused Features
- Referencing a library but only using a small part
- Not compiling with -ffunction-sections -fdata-sections and linking with -Wl,--gc-sections to remove dead code
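As an illustration of the first item, a value that is already stored as a scaled integer can be printed without any "%f" conversion. The sketch below assumes the value is kept in hundredths; format_centi is a hypothetical helper. It still relies on snprintf, so whether the saving materializes depends on the C library in use (with newlib-nano, for example, float formatting is an opt-in feature).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Formats a value stored in hundredths, e.g. 1234 -> "12.34",
 * using integer conversions only. Returns what snprintf returns. */
int format_centi(char *buf, size_t len, int32_t centi)
{
    /* Unsigned negation is well defined even for INT32_MIN. */
    uint32_t magnitude = (centi < 0) ? 0u - (uint32_t)centi
                                     : (uint32_t)centi;

    return snprintf(buf, len, "%s%lu.%02lu",
                    (centi < 0) ? "-" : "",
                    (unsigned long)(magnitude / 100u),
                    (unsigned long)(magnitude % 100u));
}
```

A hand-written integer-to-string routine would go one step further and drop the printf family entirely, at the cost of more code to maintain.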
Case Study: Tracing an Accidental Library Reference
Let's return to the opening story and reconstruct Ming's analysis process with tools.
Step 1: Discover the problem
$ size firmware_before.elf
text data bss dec hex filename
133120 256 4096 137472 21900 firmware_before.elf
Flash usage is 133,376 bytes (text + data; the Berkeley format counts .rodata in the text column), exceeding the 128 KB (131,072-byte) limit.
Step 2: Find the heavy hitters
$ nm -S --size-sort -r firmware_before.elf | head -5
80010000 00003a00 T __aeabi_ddiv
8000c600 00002800 T __aeabi_dmul
80009e00 00001c00 T __aeabi_dadd
80008200 00001c00 T __aeabi_dsub
80006600 00001400 T __aeabi_d2iz
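These __aeabi_d* functions are software emulation routines for double-precision floating-point operations! (The names follow the ARM EABI convention; a RISC-V toolchain would show libgcc names such as __divdf3 and __muldf3 instead.) The five largest alone account for about 43.5 KB, and the whole set totals about 50 KB.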
Step 3: Trace the origin
Search for these symbols' source in the linker map file:
$ grep -A1 "__aeabi_ddiv" output.map
__aeabi_ddiv
0x80010000 0x3a00 libgcc.a(dp-bit.o)
It's libgcc's double-precision floating-point emulation.
Step 4: Find the caller
$ grep -rn "double\|float" src/
src/drivers/sensor.c:42: double calibrated = raw_value * 0.0125;
There it is! A simple calibration operation pulled in 50 KB of floating-point library.
Step 5: Fix it
Convert double operations to fixed-point:
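// Before: pulls in the ~50 KB floating-point emulation library
double calibrated = raw_value * 0.0125;
// After: integer-only arithmetic, no floating-point library needed.
// 0.0125 = 125/10000; the division truncates the fractional part,
// and raw_value * 125 must fit in int32_t.
int32_t calibrated = (raw_value * 125) / 10000;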
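One thing to watch with the integer version is that dividing by 10000 discards everything after the decimal point. If the fraction matters, a common fixed-point approach is to keep the result in a smaller unit instead. The sketch below is an illustration under assumptions, not the driver's actual code: the function names are hypothetical, and raw is assumed to stay below 2^24 in magnitude so that raw * 125 fits in 32 bits without needing 64-bit helper routines.

```c
#include <stdint.h>

/* Same arithmetic as the fix above: result in whole units, fraction lost. */
int32_t calibrate_truncating(int32_t raw)
{
    return (raw * 125) / 10000;
}

/* Result in thousandths of a unit: raw * 0.0125 * 1000 = raw * 125 / 10.
 * Assumes |raw| < 2^24 so the multiplication cannot overflow int32_t. */
int32_t calibrate_milli(int32_t raw)
{
    return (raw * 125) / 10;
}
```

For a raw reading of 100 (true value 1.25), the truncating version returns 1 while the scaled version returns 1250, which preserves the fractional part without any floating-point code.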
Step 6: Verify
$ bloaty firmware_after.elf -- firmware_before.elf
VM SIZE FILE SIZE
-------------- --------------
-37.5% -50.0Ki .text -37.5% -50.0Ki
[ = ] 0 .data [ = ] 0
[ = ] 0 .bss [ = ] 0
-------------- --------------
-37.5% -50.0Ki TOTAL -37.5% -50.0Ki
Success—50 KB saved!
Summary
- Footprint = memory space a program occupies, including code size (Flash) and data size (RAM)
- Measurement tools:
  - size: Quick overview of section sizes
  - nm --size-sort: Find the largest symbols
  - bloaty: Hierarchical analysis and version comparison
  - Linker map file: Trace symbol origins
- Analysis workflow: Baseline measurement → Identify heavy hitters → Trace origins → Analyze causes → Verify changes
- Common pitfalls: Floating-point library, standard library functions, oversized static buffers, unremoved dead code
- Core principle: Measure, don't guess