Practice · Scaling · Card 3
What's the failure mode of this cache under high traffic?
The code is textbook. It works in development. Under load, it has a specific failure mode you have to name.
The code
A homepage controller serving 5,000 requests per minute.
```ruby
def index
  @posts = Rails.cache.fetch("homepage_feed", expires_in: 1.minute) do
    Post.includes(:user, :tags)
        .order(score: :desc)
        .limit(100)
        .to_a
  end
end
```

The question
Under sustained traffic, this code starts behaving badly every minute. Name the failure mode and how you'd fix it.
Take a moment. Every 60 seconds, the cache entry expires. What does a flood of in-flight requests do at that instant?
The failure mode
Cache stampede (also called "thundering herd"). Every 60 seconds, the entry expires. The next batch of concurrent requests all see "cache miss" before any of them finishes rebuilding. They all run the expensive query against the database at the same time, multiplying the cost by however many concurrent requests landed in that window.
The cache exists to spare the database; the stampede makes the database take the hit anyway, but in an even worse pattern — synchronized concurrent expensive queries.
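The miss window can be modeled in a few lines of plain Ruby (no Rails; the key name and request count are illustrative). Every request that checks the cache before the first rebuild finishes sees a miss, so each one runs the expensive query itself:

```ruby
cache = {}                 # the entry just expired, so the store is empty
key = "homepage_feed"
in_flight = 20             # requests that arrived inside the miss window
query_count = 0

# Phase 1: every in-flight request checks the cache. No rebuild has
# completed yet, so every check is a miss.
misses = Array.new(in_flight) { !cache.key?(key) }

# Phase 2: each request that saw a miss runs the expensive query itself.
misses.each do |missed|
  next unless missed
  query_count += 1         # stand-in for the expensive DB query
  cache[key] = :rebuilt_feed
end

query_count                # => 20: one expensive query per request in the window
```

One query's worth of work becomes twenty, all landing on the database in the same instant.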
The fix
Three patterns, ordered by simplicity:
- race_condition_ttl. `Rails.cache.fetch("homepage_feed", expires_in: 1.minute, race_condition_ttl: 10.seconds)` tells Rails to extend an expiring entry briefly so only one request rebuilds; the others keep serving the slightly-stale value during the gap.
- Jitter the TTL. `expires_in: 1.minute + rand(10).seconds` spreads expirations across requests; no synchronized cliff.
- Warm the cache from a background job. The web request never sees a miss; the job refreshes the cache periodically and the web requests just read.
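To make the first pattern concrete, here is a minimal plain-Ruby sketch of the serve-stale-while-revalidating idea behind race_condition_ttl. The class name and the 10-second default are assumptions for illustration, not Rails internals: an expired entry gets its deadline pushed out briefly, so concurrent callers keep reading the stale value while one caller rebuilds.

```ruby
require "monitor"

# Illustrative sketch only: a tiny in-memory cache where an expired entry
# is briefly extended so concurrent readers serve stale data while the
# first caller past the deadline rebuilds it.
class StaleWhileRevalidateCache
  Entry = Struct.new(:value, :expires_at)

  def initialize
    @store = {}
    @lock = Monitor.new
  end

  def fetch(key, expires_in:, race_condition_ttl: 10)
    @lock.synchronize do
      entry = @store[key]
      # Fresh hit: serve the cached value, no rebuild.
      return entry.value if entry && Time.now < entry.expires_at

      # Expired: push the deadline out so callers arriving during the
      # rebuild see a "fresh" (stale) entry instead of a miss.
      entry.expires_at = Time.now + race_condition_ttl if entry
    end
    # Only the caller that found the entry expired reaches the rebuild.
    value = yield
    @lock.synchronize { @store[key] = Entry.new(value, Time.now + expires_in) }
    value
  end
end

cache = StaleWhileRevalidateCache.new
v1 = cache.fetch("homepage_feed", expires_in: 60) { "fresh feed" }       # miss: block runs
v2 = cache.fetch("homepage_feed", expires_in: 60) { raise "no rebuild" } # hit: cached value
```

The trade-off is honest staleness: for a few seconds after expiry, most requests see data up to race_condition_ttl older than the TTL promises, in exchange for exactly one rebuild instead of a stampede.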
Wrong takes worth noticing
- "Key collision with another part of the app." Possible in theory; would surface as wrong data, not periodic slowdown. A unique key prefix per cache type prevents it.
- "The TTL is too short." TTL choice is a trade-off (freshness vs hit rate), not a failure mode. A 1-minute TTL is reasonable for a live-feeling feed.
- "
Rails.cache.fetchdoesn't support blocks." It does; that's the whole API.
The principle
"Cache miss + concurrent requests + expensive rebuild" is the recipe for stampede. Any one of three things breaks the recipe: keep someone serving the old value while rebuilding, spread the miss across time, or rebuild ahead of the miss. Pick whichever fits your traffic shape.
Theory
Full walkthrough at Scaling · Caching, Properly.