Appendix C: I/O and Storage Performance
"Storage is the new memory." — Jim Gray
Storage Performance Fundamentals
Key Metrics
Storage performance metrics:
1. Bandwidth (Throughput)
- Unit: MB/s, GB/s
- Maximum sequential read/write speed
2. IOPS (I/O Operations Per Second)
- Unit: ops/s
- Random read/write operations
3. Latency
- Unit: μs, ms
- Time for single I/O operation
4. Queue Depth
- Number of I/O requests in flight at the same time
- Affects IOPS and latency
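These four metrics are not independent. Bandwidth is IOPS multiplied by the block size, and by Little's law the sustained IOPS is roughly the queue depth divided by the mean latency. The sketch below works through both relations with awk; the helper names are hypothetical and the input numbers are made up for illustration.

```shell
#!/bin/sh
# Illustrative helpers (names are hypothetical, not part of any tool).

# Bandwidth in MB/s = IOPS * block size in bytes / 10^6
bandwidth_mb_s() {  # usage: bandwidth_mb_s IOPS BLOCK_BYTES
    awk -v iops="$1" -v bs="$2" 'BEGIN { printf "%.1f\n", iops * bs / 1e6 }'
}

# Little's law: IOPS ~= queue depth / mean latency (latency in microseconds)
iops_from_latency() {  # usage: iops_from_latency QUEUE_DEPTH LATENCY_US
    awk -v qd="$1" -v lat="$2" 'BEGIN { printf "%.0f\n", qd * 1e6 / lat }'
}

bandwidth_mb_s 100000 4096   # 100k IOPS at 4 KiB -> prints 409.6
iops_from_latency 32 100     # 32 requests in flight at 100 us -> prints 320000
```

The second relation explains why a device with 100 μs latency can only reach high IOPS when many requests are outstanding at once.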
Storage Hierarchy
Storage hierarchy and typical performance:
Level        Latency       Bandwidth
─────────────────────────────────────────────
CPU Cache    1-10 ns       100+ GB/s
DRAM         50-100 ns     50-100 GB/s
NVMe SSD     10-100 μs     3-7 GB/s
SATA SSD     50-200 μs     500-600 MB/s
HDD          5-10 ms       100-200 MB/s
Network      1-100 ms      1-10 GB/s
fio (Flexible I/O Tester)
fio is a widely used and highly configurable storage benchmark tool.
Installation
# Ubuntu/Debian
sudo apt install fio
# macOS
brew install fio
# From source
git clone https://github.com/axboe/fio.git
cd fio && ./configure && make && sudo make install
Basic Usage
# Sequential write test
fio --name=seq_write \
--ioengine=libaio \
--direct=1 \
--bs=1M \
--size=1G \
--numjobs=1 \
--rw=write \
--filename=/tmp/fio_test
# Random read test
fio --name=rand_read \
--ioengine=libaio \
--direct=1 \
--bs=4K \
--size=1G \
--numjobs=4 \
--iodepth=32 \
--rw=randread \
--filename=/tmp/fio_test
Common Parameters
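fio parameters:
--ioengine     I/O engine (libaio, io_uring, sync)
--direct       Bypass the page cache with O_DIRECT (1=yes)
--bs           Block size (4K, 1M, etc.)
--size         Amount of I/O per job (also the test file size)
--numjobs      Number of parallel clones of the job
--iodepth      Queue depth per job (only effective with asynchronous engines such as libaio or io_uring)
--rw           Read/write mode (read, write, randread, randwrite, randrw)
--runtime      Maximum run time (seconds by default)
--time_based   Run for the full runtime, looping over the file if necessary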
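The parameters above are usually varied one at a time. As a minimal sketch, the loop below sweeps the queue depth for a 4K random read test while holding everything else fixed. DRY_RUN and TEST_FILE are hypothetical variables introduced for this example: with DRY_RUN=1 (the default here) the script only prints the fio commands, so it can be inspected safely before anything is run.

```shell
#!/bin/sh
# Queue-depth sweep sketch. DRY_RUN=1 (the default here) only prints the
# commands; set DRY_RUN=0 to actually run fio. TEST_FILE is a placeholder.
DRY_RUN="${DRY_RUN:-1}"
TEST_FILE="${TEST_FILE:-/tmp/fio_test}"

sweep_iodepth() {
    for depth in 1 4 16 64; do
        # Build the command line in the positional parameters.
        set -- fio --name="qd${depth}" --ioengine=libaio --direct=1 \
            --bs=4K --size=1G --rw=randread --iodepth="$depth" \
            --runtime=30 --time_based --filename="$TEST_FILE"
        if [ "$DRY_RUN" = 1 ]; then
            echo "$*"
        else
            "$@"
        fi
    done
}

sweep_iodepth
```

Plotting IOPS and latency against the queue depth from such a sweep typically shows where the device saturates.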
Job File
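; fio_test.fio - Complete test configuration
; Jobs in one file start concurrently by default. "stonewall" makes a job
; wait for all preceding jobs to finish, so the tests do not interfere.
[global]
ioengine=libaio
direct=1
size=1G
runtime=60
time_based
group_reporting
[seq_read]
rw=read
bs=1M
numjobs=1
[seq_write]
stonewall
rw=write
bs=1M
numjobs=1
[rand_read]
stonewall
rw=randread
bs=4K
numjobs=4
iodepth=32
[rand_write]
stonewall
rw=randwrite
bs=4K
numjobs=4
iodepth=32
[mixed]
stonewall
rw=randrw
rwmixread=70
bs=4K
numjobs=4
iodepth=32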
Running and Output
# Run job file
fio fio_test.fio
# JSON output
fio fio_test.fio --output-format=json --output=results.json
# Output example
seq_read: (g=0): rw=read, bs=(R) 1024KiB-1024KiB
read: IOPS=3245, BW=3245MiB/s (3403MB/s)
slat (usec): min=2, max=45, avg=5.2
clat (usec): min=280, max=1234, avg=302.5
lat (usec): min=285, max=1240, avg=307.7
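Two relations help when reading this output. The total latency (lat) is approximately the submission latency (slat) plus the completion latency (clat), and the bandwidth appears twice: in binary MiB/s and, in parentheses, in decimal MB/s. The sketch below re-derives both from the sample numbers above; the helper names are hypothetical.

```shell
#!/bin/sh
# Cross-check the sample output above (helper names are hypothetical).

# Total latency is submission latency plus completion latency.
total_lat_us() {  # usage: total_lat_us SLAT_US CLAT_US
    awk -v s="$1" -v c="$2" 'BEGIN { printf "%.1f\n", s + c }'
}

# Convert binary MiB/s (2^20 bytes) to decimal MB/s (10^6 bytes).
mib_to_mb() {  # usage: mib_to_mb MIB_PER_S
    awk -v m="$1" 'BEGIN { printf "%.0f\n", m * 1048576 / 1e6 }'
}

total_lat_us 5.2 302.5   # avg slat + avg clat -> prints 307.7
mib_to_mb 3245           # 3245 MiB/s -> prints 3403
```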
ioping
ioping measures I/O latency in the style of ping: it issues requests one at a time and prints the response time of each.
Installation and Usage
# Install
sudo apt install ioping
# Measure latency
ioping -c 10 /tmp
# Output example
4 KiB <<< /tmp (ext4 /dev/sda1): request=1 time=234.5 us
4 KiB <<< /tmp (ext4 /dev/sda1): request=2 time=198.3 us
...
--- /tmp (ext4 /dev/sda1) ioping statistics ---
10 requests completed in 2.15 ms, 40 KiB read, 4.65 k iops, 18.2 MiB/s
min/avg/max/mdev = 156.2 us / 215.0 us / 312.4 us / 45.2 us
Advanced Usage
# Direct I/O (bypass cache); reading a raw device requires root
sudo ioping -D /dev/sda
# Specify request size
ioping -s 1M /tmp
# 100 back-to-back requests (no pause between them)
ioping -c 100 -i 0 /tmp
Network I/O
iperf3
iperf3 is a widely used tool for measuring network bandwidth.
# Install
sudo apt install iperf3
# Server side
iperf3 -s
# Client side
iperf3 -c server_ip
# Output example
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.00 sec 11.0 GBytes 9.42 Gbits/sec
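Note that iperf3 reports the bitrate in bits per second, while storage bandwidth is normally quoted in bytes per second. To compare a network link with a storage device, divide by 8. A minimal sketch, with a hypothetical helper name:

```shell
#!/bin/sh
# Convert a network bitrate in Gbit/s to GB/s (8 bits per byte).
gbit_to_gbyte() {  # usage: gbit_to_gbyte GBIT_PER_S
    awk -v g="$1" 'BEGIN { printf "%.2f\n", g / 8 }'
}

gbit_to_gbyte 9.42   # bitrate from the sample output -> prints 1.18
```

By this measure the 10 Gbit/s link in the sample is slower than a typical NVMe SSD doing sequential reads.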
iperf3 Advanced Usage
# UDP test
iperf3 -c server_ip -u -b 1G
# Multiple connections
iperf3 -c server_ip -P 4
# Bidirectional test
iperf3 -c server_ip --bidir
# JSON output
iperf3 -c server_ip -J > results.json
netperf
netperf measures both throughput and latency; its request/response tests (such as TCP_RR) are commonly used for latency testing.
# Install
sudo apt install netperf
# Server side
netserver
# TCP request/response latency
netperf -H server_ip -t TCP_RR
# Output example
TCP REQUEST/RESPONSE TEST
Local /Remote
Socket Size     Request  Resp.   Elapsed  Trans.
Send   Recv     Size     Size    Time     Rate
bytes  bytes    bytes    bytes   secs.    per sec
16384  131072   1        1       10.00    45678.90
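TCP_RR reports a transaction rate rather than a latency. Assuming the default of one transaction in flight at a time, the mean round-trip time is simply the inverse of that rate. A minimal sketch, with a hypothetical helper name:

```shell
#!/bin/sh
# Mean round-trip time implied by a TCP_RR transaction rate,
# assuming one transaction in flight at a time.
rr_latency_us() {  # usage: rr_latency_us TRANSACTIONS_PER_SEC
    awk -v rate="$1" 'BEGIN { printf "%.1f\n", 1e6 / rate }'
}

rr_latency_us 45678.90   # rate from the sample output -> prints 21.9
```

So the sample run corresponds to a mean round trip of about 22 μs.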
dd Test
dd is the simplest I/O testing method.
# Write test
dd if=/dev/zero of=/tmp/test bs=1M count=1024 conv=fdatasync
# Read test (right after the write, this is likely served from the page cache)
dd if=/tmp/test of=/dev/null bs=1M
# Read after clearing cache
sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
dd if=/tmp/test of=/dev/null bs=1M
Limitations of dd
Problems with dd:
1. Single-threaded
- Cannot test parallel performance
2. No queue depth control
- Cannot test NVMe's true performance
3. No detailed statistics
- Reports only average throughput, no latency distribution
Recommendation:
- Use dd for quick tests
- Use fio for formal testing
File System Performance
Comparing Different File Systems
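# Create test environment (run as root; this DESTROYS all data on /dev/sdb1)
for fs in ext4 xfs btrfs; do
    # Overwrite the previous filesystem without prompting (-F for ext4, -f for xfs/btrfs)
    if [ "$fs" = ext4 ]; then force=-F; else force=-f; fi
    mkfs.$fs $force /dev/sdb1
    mount /dev/sdb1 /mnt/test
    # libaio with direct I/O so that iodepth takes effect and the page cache is bypassed
    fio --name=test --filename=/mnt/test/file \
        --ioengine=libaio --direct=1 \
        --size=10G --bs=4K --rw=randread \
        --iodepth=32 --numjobs=4 \
        --output-format=json > "${fs}_results.json"
    umount /mnt/test
done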
Mount Options Impact
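# Default mount
mount /dev/sdb1 /mnt/test
# Performance-oriented mount: noatime skips access-time updates (and already
# implies nodiratime). Online discard can add latency on some SSDs, so
# periodic fstrim is a common alternative to the discard option.
mount -o noatime /dev/sdb1 /mnt/test
# Rerun the same fio job under each mount and compare the results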
I/O Scheduler
Viewing and Setting
# View current scheduler
cat /sys/block/sda/queue/scheduler
# [mq-deadline] kyber bfq none
# Set scheduler
echo "none" | sudo tee /sys/block/sda/queue/scheduler
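The active scheduler is the entry shown in square brackets. As a small sketch, the script below extracts that entry and reports it for every block device that exposes a scheduler file. The function name active_scheduler is hypothetical; the sysfs path is the one used above.

```shell
#!/bin/sh
# Extract the active scheduler (the bracketed entry) from a scheduler line.
active_scheduler() {  # usage: active_scheduler "[mq-deadline] kyber bfq none"
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Report the active scheduler for every block device that exposes one.
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue          # skip if the glob did not match
    dev="${f#/sys/block/}"
    dev="${dev%%/*}"
    printf '%s: %s\n' "$dev" "$(active_scheduler "$(cat "$f")")"
done
```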
Scheduler Comparison
I/O scheduler characteristics:
Scheduler     Use Case              Features
─────────────────────────────────────────────────────
none          NVMe SSD              Lowest latency
mq-deadline   General               Balance latency and throughput
kyber         Low latency needs     Auto-adjusting
bfq           Desktop/interactive   Fairness priority
Performance Analysis Tools
iostat
# Install
sudo apt install sysstat
# Basic usage
iostat -x 1
# Example output
Device        r/s      w/s     rkB/s    wkB/s  await  %util
sda        125.00    45.00    5000.0   1800.0   0.85   12.5
nvme0n1    3500.0   1200.0   14000.0   4800.0   0.12   45.2
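Dividing throughput by request rate gives the average request size, which hints at the access pattern: small values suggest random I/O, large values suggest sequential I/O. The sketch below applies this to the read columns of the sample output; the helper name is hypothetical. Recent sysstat versions also print this value directly in the rareq-sz and wareq-sz columns.

```shell
#!/bin/sh
# Average request size in kB = throughput (kB/s) / request rate (req/s).
avg_request_kb() {  # usage: avg_request_kb KB_PER_SEC REQUESTS_PER_SEC
    awk -v kb="$1" -v n="$2" 'BEGIN { printf "%.1f\n", kb / n }'
}

avg_request_kb 5000.0 125.00    # sda reads     -> prints 40.0
avg_request_kb 14000.0 3500.0   # nvme0n1 reads -> prints 4.0
```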
blktrace
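# Trace I/O (stop with Ctrl-C; writes one trace.blktrace.<cpu> file per CPU)
sudo blktrace -d /dev/sda -o trace
# Analyze: print events and dump a merged binary stream for btt
blkparse -i trace -d trace.bin
# Summarize per-stage latencies (btt reads the binary dump from blkparse -d)
btt -i trace.bin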
Summary
Key tools for I/O performance testing:
Storage Testing
- fio: Complete storage benchmark
- ioping: I/O latency testing
- dd: Quick simple test
Network Testing
- iperf3: Bandwidth testing
- netperf: Latency testing
Analysis Tools
- iostat: Real-time monitoring
- blktrace: Detailed tracing
Testing Tips
- Use direct I/O to bypass cache
- Test different block sizes
- Test different queue depths
- Multiple runs for statistics
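For the last tip, a single run says little about variance. One simple approach, sketched below, is to collect one result per run and report the mean and sample standard deviation. The function name and the five IOPS values are hypothetical and serve only to show the calculation.

```shell
#!/bin/sh
# Mean and sample standard deviation of one number per line on stdin.
mean_stddev() {
    awk '{ n++; sum += $1; sumsq += $1 * $1 }
         END {
             if (n < 2) { print "need at least 2 samples"; exit 1 }
             mean = sum / n
             var = (sumsq - n * mean * mean) / (n - 1)
             if (var < 0) var = 0   # guard against rounding error
             printf "mean=%.1f stddev=%.1f n=%d\n", mean, sqrt(var), n
         }'
}

# Hypothetical IOPS results from five repeated runs.
printf '%s\n' 98000 101000 99500 100500 101000 | mean_stddev
# prints: mean=100000.0 stddev=1274.8 n=5
```

A standard deviation that is large relative to the mean suggests interference (caching, background I/O, thermal throttling) and that more runs or a longer runtime are needed.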