Chapter 31: Case Study: Database Query Optimization
Part VIII: Case Studies
"The fastest query is the one you don't have to make." — Unknown DBA
The Story of "Cross-Datacenter Query" That Could Only Run 10 Times Per Second
Our service needed to read data from MySQL in another datacenter. Network latency was about 20ms (speed of light limitation).
Simple code:
def get_user_orders(user_id):
user = db.query("SELECT * FROM users WHERE id = ?", user_id)
orders = db.query("SELECT * FROM orders WHERE user_id = ?", user_id)
for order in orders:
items = db.query("SELECT * FROM items WHERE order_id = ?", order.id)
order.items = items
return user, orders
A user has 10 orders, each order has 5 items.
Total: 1 + 1 + 10 = 12 queries.
Each query 20ms RTT → total latency 240ms.
Can only handle 4 requests per second (single thread).
This is the classic N+1 query problem, amplified by network latency.
Network Latency: The Underestimated Killer
Speed of Light Limits
Location Distance Fiber Latency (one-way) RTT
───────────────────────────────────────────────────────────────
Same datacenter < 1 km ~0.005 ms ~0.01 ms
Same city ~50 km ~0.25 ms ~0.5 ms
Cross-city ~350 km ~1.75 ms ~3.5 ms
Cross-country ~2000 km ~10 ms ~20 ms
Cross-continent ~10000 km ~50 ms ~100 ms
This is a physical limit, cannot be optimized. The only solution is to reduce round trip count.
Little's Law Again
Throughput = Concurrency / Latency
If RTT = 20ms, single connection:
Throughput = 1 / 0.02 = 50 queries/sec
To reach 1000 queries/sec:
Concurrency = 1000 × 0.02 = 20 parallel connections
Problem Analysis
Original Code Problems
# Classic N+1 problem
users = db.query("SELECT * FROM users LIMIT 100")
for user in users:
# One extra query per user
orders = db.query("SELECT * FROM orders WHERE user_id = ?", user.id)
user.orders = orders
# Total 101 queries!
Using EXPLAIN to Analyze
EXPLAIN SELECT * FROM orders WHERE user_id = 12345;
+----+-------------+--------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+------+---------------+------+---------+------+---------+-------------+
| 1 | SIMPLE | orders | ALL | NULL | NULL | NULL | NULL | 1000000 | Using where |
+----+-------------+--------+------+---------------+------+---------+------+---------+-------------+
type = ALL → Full table scan! No index used.
Optimization Strategies
Optimization 1: Add Index
Most basic but most effective:
-- Check existing indexes
SHOW INDEX FROM orders;
-- Add missing index
CREATE INDEX idx_orders_user_id ON orders(user_id);
-- Verify
EXPLAIN SELECT * FROM orders WHERE user_id = 12345;
type = ref → Using index, scanning 10 rows (not 1M rows)
Optimization 2: Solve N+1 Problem
Method A: JOIN
-- Original: N+1 queries
SELECT * FROM users WHERE id = ?;
SELECT * FROM orders WHERE user_id = ?;
-- Optimized: 1 JOIN
SELECT u.*, o.*
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.id = ?;
Method B: Batch Query
# Original: N+1
users = db.query("SELECT * FROM users LIMIT 100")
for user in users:
orders = db.query("SELECT * FROM orders WHERE user_id = ?", user.id)
# Optimized: 2 queries
users = db.query("SELECT * FROM users LIMIT 100")
user_ids = [u.id for u in users]
orders = db.query("SELECT * FROM orders WHERE user_id IN (?)", user_ids)
# Combine in application layer
orders_by_user = group_by(orders, 'user_id')
for user in users:
user.orders = orders_by_user.get(user.id, [])
Method C: ORM Eager Loading
Optimization 4: Connection Pool
# Original: create connection for each query
def query(sql):
conn = mysql.connect(host='db.server.com', ...) # TCP + TLS handshake
cursor = conn.cursor()
cursor.execute(sql)
result = cursor.fetchall()
conn.close()
return result
# Optimized: connection pool
from sqlalchemy import create_engine
engine = create_engine(
'mysql://user:pass@db.server.com/mydb',
pool_size=20, # Keep 20 connections
max_overflow=10, # Up to 10 extra
pool_recycle=3600, # Recycle after 1 hour
pool_pre_ping=True # Test connection before use
)
Connection establishment cost:
TCP three-way handshake: 1 RTT (~20ms)
TLS handshake: 2 RTT (~40ms) for TLS 1.2
MySQL authentication: 1 RTT (~20ms)
──────────────────────────────────────────
Total: ~80ms per new connection
With connection pool, this cost is paid only once.
Caching Strategy
Multi-Layer Cache Architecture
┌─────────────┐
│ Application │
│ Cache │ ← L1: In-process cache (fastest, small capacity)
└──────┬──────┘
│
┌──────▼──────┐
│ Redis / │ ← L2: Distributed cache (fast, medium capacity)
│ Memcached │
└──────┬──────┘
│
┌──────▼──────┐
│ Database │ ← L3: Database (slow, large capacity)
│ Buffer Pool │
└─────────────┘
Cache-Aside Pattern
def get_user(user_id):
# 1. Check cache first
cache_key = f"user:{user_id}"
cached = redis.get(cache_key)
if cached:
return deserialize(cached)
# 2. Cache miss, query database
user = db.query("SELECT * FROM users WHERE id = ?", user_id)
# 3. Write to cache
redis.setex(cache_key, 3600, serialize(user)) # 1 hour expiry
return user
Cache Invalidation
# Invalidate on update
def update_user(user_id, data):
db.execute("UPDATE users SET ... WHERE id = ?", user_id)
redis.delete(f"user:{user_id}") # Delete cache
# Or: update cache on update
def update_user_with_cache(user_id, data):
db.execute("UPDATE users SET ... WHERE id = ?", user_id)
user = db.query("SELECT * FROM users WHERE id = ?", user_id)
redis.setex(f"user:{user_id}", 3600, serialize(user))
Practical Example: Optimizing Cross-Datacenter Query
Back to the original problem:
# Original: 12 queries, 240ms
def get_user_orders(user_id):
user = db.query("SELECT * FROM users WHERE id = ?", user_id)
orders = db.query("SELECT * FROM orders WHERE user_id = ?", user_id)
for order in orders:
items = db.query("SELECT * FROM items WHERE order_id = ?", order.id)
order.items = items
return user, orders
Optimized Version
def get_user_orders_optimized(user_id):
# 1. Check cache first
cache_key = f"user_orders:{user_id}"
cached = redis.get(cache_key)
if cached:
return deserialize(cached)
# 2. Single query to get all data
result = db.execute("""
SELECT u.*, o.*, i.*
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
LEFT JOIN items i ON o.id = i.order_id
WHERE u.id = ?
""", user_id)
# 3. Assemble in application layer
user, orders = assemble_result(result)
# 4. Write to cache
redis.setex(cache_key, 1800, serialize((user, orders)))
return user, orders
Results
Original:
- 12 queries × 20ms = 240ms
- Throughput: 4 req/s (single thread)
Optimized:
- Cache hit: < 1ms (Redis in same datacenter)
- Cache miss: 1 query × 20ms = 20ms
- Throughput: 50+ req/s (single thread)
- With 90% cache hit rate: average ~3ms
Summary
Network Latency Is a Hard Limit
20ms RTT = max 50 queries/sec (single connection)
Solution: reduce round trips, increase parallelism
N+1 Problem
Symptom: N+1 queries
Solution: JOIN, batch query, ORM eager loading
Optimization Layers
| Layer | Optimization Method |
|---|---|
| Query | Index, JOIN, batching |
| Connection | Connection pool, multiplexing |
| Protocol | Pipeline, compression |
| Cache | L1/L2 cache, Cache-Aside |
| Storage | Buffer pool, partitioning, SSD tuning |
| Network | TCP tuning, BBR |
Caching Strategies
Cache-Aside: Fill on read
Write-Through: Update on write
Write-Behind: Async write
Watch out for: penetration, breakdown, avalanche
Remember
Query count × RTT = minimum latency
1 good query > 10 simple queries
On high-latency networks, this difference is even more pronounced