Scaling Series · 5 of 8
Caching, Properly
Where Rails caching comes from, why cache invalidation is the hard part, and how Discourse, Russian-doll fragments, and HTTP ETags work together to keep a fast app fast.
Where this rule comes from
Phil Karlton's famous remark from his time at Netscape: "There are only two hard things in Computer Science: cache invalidation and naming things." The line lands because anyone who has shipped caching code has felt the asymmetry. The cache itself is trivial, a key, a value, a store. The hard part is knowing when the value is no longer correct.
Caching in Rails has three lineages worth knowing about:
- HTTP caching, formalized for HTTP/1.1 in RFC 2068 (1997), lets browsers and CDNs avoid round-trips entirely. ETag and Last-Modified headers carry the version of a response; a conditional request can return 304 Not Modified with no body.
- Fragment caching, in Rails since 2.0 (2007), lets you cache parts of views. The big innovation came in Rails 4 (2013) with Russian-doll caching: nested fragments where the outer cache key automatically includes the inner caches, so invalidating an inner record invalidates the outer view by construction.
- Low-level caching, the generic Rails.cache.fetch(key) { expensive_computation } wrapper. Backed by memory, Redis, memcached, or, in Rails 8, Solid Cache, which stores entries in your relational database.
The senior rule across all three: caching is not a way to speed up slow code; it is a way to avoid running correct code that has not changed. If the underlying code is slow because of an N+1 or a missing index, fix that first. Caching layered on top of fundamentally wrong code is how production incidents are built.
The anti-pattern
Picture a Rails app whose dashboard renders slowly. The team's first instinct is to wrap it in Rails.cache.fetch:
class DashboardController < ApplicationController
def show
@stats = Rails.cache.fetch("dashboard_stats", expires_in: 1.hour) do
{
total_revenue: Order.sum(:total_cents),
active_users: User.where("last_seen_at > ?", 1.day.ago).count,
signups_today: User.where("created_at > ?", Time.current.beginning_of_day).count,
# ... 12 more aggregations
}
end
end
end

This works for ninety minutes, then breaks in three different ways:
- The cache returns stale data. A user signs up, refreshes the dashboard, and does not see themselves in signups_today for an hour. Support tickets follow.
- The cache stampede. When the 1-hour TTL expires, the next ten concurrent requests all see a cache miss, all run the underlying aggregation simultaneously, and all write to the cache at almost the same moment. The database hits a spike of expensive parallel queries every hour at minute :00.
- The cache key is global. Every user sees the same numbers. The day someone wants a per-user version of this dashboard, the cache key needs to change. Every cache-key change is a stampede the moment it ships.
The deeper anti-pattern: caching was reached for as the first move. The dashboard is slow because Order.sum(:total_cents) is doing a full table scan on a 10-million-row orders table without an index, and User.where("last_seen_at > ?", 1.day.ago).count is doing the same on users. Fixing the underlying queries (lesson 2) would make the dashboard render in 80ms instead of 4 seconds, and the cache would not be needed at all.
Rule 1 of caching: fix the underlying code first
Before caching anything, ask: is this slow because the underlying code is slow, or because the underlying code is fundamentally expensive? The two have different fixes:
- Slow because of an N+1 or a missing index, fix that. Caching the wrong query gives you correct caching on incorrect work.
- Slow because it is fundamentally expensive, image processing, large aggregations on tables with billions of rows, paid API calls. Caching makes sense here, because the underlying work is genuinely costly even when done correctly.
The dashboard example above is the first case. It does not need caching; it needs indexes. Most "slow Rails endpoint" cases are this case. Apply caching when you have already made the underlying code as fast as it can be, and it is still expensive enough to justify caching.
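As a concrete sketch of "fix it first": the two range predicates in the dashboard example can be indexed with a plain migration. The class and column names below are assumptions chosen to match the example, and the revenue sum is called out separately because an index alone does not help a full-table aggregate.

```ruby
# Hypothetical migration for the dashboard example above; verify the
# actual query plans with EXPLAIN ANALYZE before and after.
class AddDashboardIndexes < ActiveRecord::Migration[8.0]
  def change
    # Turns the last_seen_at and created_at range scans into index scans.
    add_index :users, :last_seen_at
    add_index :users, :created_at
    # Order.sum(:total_cents) still reads every row even with an index;
    # at 10 million rows, maintain a running total (a counter column or
    # a nightly rollup) instead of caching the scan.
  end
end
```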
Russian-doll fragment caching
For view rendering, the case where caching genuinely pays off in Rails, DHH's Russian-doll pattern from Rails 4 is the canonical shape. The idea: nest fragment caches so that invalidating an inner one cascades correctly to its outer wrappers, by including the inner cache key in the outer cache key.
# views/posts/index.html.erb
<% cache(@posts) do %>
<% @posts.each do |post| %>
<% cache(post) do %>
<article>
<h2><%= post.title %></h2>
<p><%= post.summary %></p>
<% cache([post, "comments"]) do %>
<% post.comments.each do |comment| %>
<% cache(comment) do %>
<div class="comment">
<strong><%= comment.author_name %></strong>
<p><%= comment.body %></p>
</div>
<% end %>
<% end %>
<% end %>
</article>
<% end %>
<% end %>
<% end %>

Three things make this work:
1. The cache key is the record itself. Rails uses @post.cache_key_with_version automatically: "posts/42-20260512123456". When the post is updated, updated_at changes, the cache key changes, the next render misses and rebuilds. Invalidation by key change instead of by explicit deletion.
2. Outer caches include the collection's cache key. The outermost cache(@posts) hashes together every post's updated_at. If any post in the list is updated, the outer fragment is invalidated automatically.
3. touch: true on associations propagates updates upward. class Comment; belongs_to :post, touch: true; end means saving a Comment updates the parent Post's updated_at, which invalidates the post's cache, which invalidates the index's cache. The cascade is automatic; you do not write invalidation code.
The pattern's elegance: cache invalidation becomes a side effect of normal Rails writes, not a separate concern the developer has to remember. This is the substance behind DHH's claim that cache invalidation is tractable in Rails: Rails-shaped caching versions the cache key instead of explicitly deleting entries.
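The key-versioning mechanic fits in a few lines outside Rails. This is a toy stand-in, not Rails' implementation: a struct that formats a cache key from updated_at the way cache_key_with_version does, so a touch produces a new key and the old fragment is simply never read again.

```ruby
# Toy model (not Rails) of cache_key_with_version: the key embeds
# updated_at, so any write yields a fresh key. No delete call needed.
Record = Struct.new(:model_name, :id, :updated_at) do
  def cache_key_with_version
    "#{model_name}/#{id}-#{updated_at.utc.strftime('%Y%m%d%H%M%S')}"
  end

  def touch
    # What belongs_to :post, touch: true triggers on the parent record.
    self.updated_at = Time.now.utc
    self
  end
end

post = Record.new("posts", 42, Time.utc(2026, 5, 12, 12, 34, 56))
post.cache_key_with_version          # => "posts/42-20260512123456"

old_key = post.cache_key_with_version
post.touch
post.cache_key_with_version != old_key  # => true: the cascade is automatic
```

The fragment stored under the old key is left behind for the store's eviction policy to reclaim; nothing ever looks it up again.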
HTTP caching: the cache you forgot you had
The fastest cache is one that returns nothing at all. HTTP conditional GETs let the browser or CDN ask "do you have a newer version of this resource than I do?" and skip the response body entirely if not.
class PostsController < ApplicationController
def show
@post = Post.find(params[:id])
# If the client's ETag matches, return 304 Not Modified
# with no body. Saves rendering, transfer, and parsing.
fresh_when @post
end
end
# fresh_when sends:
# ETag: W/"abc123..." (based on @post.cache_key_with_version)
# Last-Modified: <@post.updated_at>
#
# Next request includes:
# If-None-Match: W/"abc123..."
# If-Modified-Since: <prev value>
#
# If unchanged, Rails returns 304 with no body. Browser uses
# its local copy. The full controller code does not run after
# the conditional check.

This costs nothing and saves bandwidth, server time, and database queries on every cache hit. Most Rails apps could enable fresh_when on their show actions and immediately reduce server load.
For public, non-personalized resources, you can also use stale? with public: true to let CDNs cache the response themselves. The CDN absorbs the traffic; your server never sees the request. For high-traffic public pages, this is the single highest-impact caching move.
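The handshake itself is easy to hold in your head as code. A simplified sketch, not Rails' internals (Rails derives ETags from a digest over more inputs than the cache key): the server recomputes the validator cheaply and only renders when it differs from what the client sent back.

```ruby
require "digest"

# Simplified model of a conditional GET: compare the client's
# If-None-Match header against a weak ETag derived from the record's
# cache key. The digest scheme here is illustrative, not Rails' own.
def etag_for(cache_key)
  %(W/"#{Digest::SHA256.hexdigest(cache_key)[0, 32]}")
end

def conditional_get(current_cache_key, if_none_match)
  etag_for(current_cache_key) == if_none_match ? 304 : 200
end

etag = etag_for("posts/42-20260512123456")
conditional_get("posts/42-20260512123456", etag)  # => 304: skip render, send no body
conditional_get("posts/42-20260512999999", etag)  # => 200: post changed, full response
```

The asymmetry is the point: computing the validator is a cache-key read, while producing the 200 response is the full render.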
Low-level caching with proper keys
For things that are not whole responses or view fragments (expensive computations, external API responses, aggregated data), use Rails.cache.fetch. The trick is the cache key. A bad cache key is the source of every cache-related production bug:
# Bad: global key, no versioning. Stale forever after one write.
Rails.cache.fetch("user_count") { User.count }
# Better: include a version that changes when the underlying
# data changes. If you do not have a natural version, use a
# manual one and bump it when the schema changes.
Rails.cache.fetch(["user_count", "v2"], expires_in: 5.minutes) do
User.count
end
# Best: scope the key to the inputs that affect the result.
Rails.cache.fetch(["recent_orders", user.cache_key_with_version], expires_in: 10.minutes) do
user.orders.where("created_at > ?", 1.day.ago).to_a
end
# When the user is updated (touched), the key changes,
# the cache is naturally invalidated.

The rule: every input that can affect the cached value should be in the key. User ID. Time bucket (if relevant). Version number (if you might change the computation later). When any of those changes, the cache key changes, and stale data cannot leak.
Cache stampedes and how to avoid them
A cache stampede is what happens when many requests notice a cache miss simultaneously and all run the expensive computation. The cache, ironically, is the thing that caused the database spike.
Rails mitigates this in Rails.cache.fetch with the race_condition_ttl option. When the cache expires, the first request to notice starts regenerating the value, while other requests continue to receive the stale value for a short overlap window. By the time the window ends, the new value is in the cache.
Rails.cache.fetch(
["expensive_report", scope],
expires_in: 1.hour,
race_condition_ttl: 30.seconds
) do
ExpensiveReport.compute(scope)
end
# After the 1-hour TTL expires, only one request runs the block.
# Other concurrent requests get the slightly-stale value
# for up to 30 more seconds while the new value is being computed.

Combined with the Russian-doll pattern, where cache keys are versioned by record updated_at, stampedes become rare. The remaining cases are usually pre-computed reports where you control the work, and you can schedule the warming on a job rather than letting requests trigger it.
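When you control the expensive work, the sturdier fix is to warm the cache on a schedule so no request ever pays for the miss. A sketch with hypothetical job and report names, writing the cache directly rather than fetching through it:

```ruby
# Hypothetical warming job: run it on a recurring schedule (cron, or a
# Solid Queue recurring task) so user requests only ever read a warm key.
class WarmExpensiveReportJob < ApplicationJob
  queue_as :low_priority

  def perform(scope)
    Rails.cache.write(
      ["expensive_report", scope],
      ExpensiveReport.compute(scope),
      expires_in: 2.hours  # comfortably longer than the warming interval
    )
  end
end
```

The expires_in here is a safety net, not the refresh mechanism: the scheduler refreshes the value well before the entry can expire.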
Solid Cache vs Redis
The Rails 8 default is Solid Cache, a database-backed cache that stores entries in your application's relational database (Postgres, MySQL, and SQLite all work). It is part of the same "Solid trio" as Solid Queue and Solid Cable, designed to let you run Rails apps without operating Redis.
The tradeoff:
- Solid Cache, one fewer service to operate, transactional consistency with your database writes, slightly higher latency per read/write than Redis. Good for the majority of apps, especially smaller ones where the operational simplicity of "no Redis" is worth the small per-operation overhead.
- Redis, much higher throughput on individual cache ops, mature, well-understood, requires running a Redis service. Good when you have very high cache traffic, or when your team already runs Redis for other reasons (Sidekiq before Solid Queue, ActionCable before Solid Cable, etc.).
The two-schools framing from the SOLID series shows up again here. The 37signals school uses Solid Cache by default (it is in Fizzy, in Hey, in the Rails 8 install template). The Shopify/Sidekiq-heavy school continues to use Redis. Both are senior; the choice depends on what your team already operates and what your traffic profile is.
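Switching between the two is a one-line configuration choice. A sketch of the production config for each; the Redis line assumes the redis gem is installed and a server is reachable at REDIS_URL:

```ruby
# config/environments/production.rb

# Rails 8 default: database-backed Solid Cache.
config.cache_store = :solid_cache_store

# Redis alternative:
# config.cache_store = :redis_cache_store, { url: ENV.fetch("REDIS_URL") }
```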
What real teams have written
Discourse is famously aggressive about caching. Sam Saffron has written extensively about Russian-doll caching, fragment caching, and HTTP caching in Discourse's blog posts. The Discourse codebase is one of the best public references for caching done well in a real Rails app, with the patterns used in production by thousands of forums.
Shopify Engineering has published about their caching tier, IdentityCache, mentioned in the hot-rows lesson, is partly a caching gem. Their post on memcached's role in Shopify's request lifecycle is instructive for very-large-scale Rails caching, though most teams will not approach Shopify's scale.
Basecamp / 37signals originated the Russian-doll pattern. DHH's Russian Doll Caching screencast from 2013 is still online and is the canonical explanation of the technique. The pattern's design, cache invalidation by key change rather than by deletion, is what makes "cache invalidation is tractable" a working claim in Rails, despite Karlton's joke.
GitHub's engineering blog on HTTP caching at the CDN level is worth reading for the public-API perspective. They serve enormous traffic from CDN cache hits, the percentage of requests that never reach a Rails dyno is one of their key scaling levers.
When NOT to cache
- When the underlying code is fast already. Caching a 5ms query saves milliseconds; the operational complexity is not worth it.
- When freshness matters more than speed. Financial data, real-time chat, "is this user logged in right now", the cost of a stale answer is higher than the cost of recomputing.
- When write patterns invalidate the cache more often than it is read. A cache that is invalidated on every write and read only twice between invalidations is a net loss.
- When you cannot reason about the cache key. If you cannot answer "what changes invalidate this cache?" in one sentence, you do not have a cache; you have a future bug.
The senior heuristic: caching is the second-to-last move, not the first. Fix indexes, fix N+1s, move synchronous I/O to background jobs. By the time you actually need caching, you will know exactly what to cache and why, and the cache key will write itself.
The principle at play
Caching is a contract between you and the future: "this value will not change in a way that matters before its cache key changes." The hard part of caching is making that contract correct. Russian-doll caching makes it correct by construction, because the cache key includes everything that affects the cached value, transitively. HTTP caching makes it correct by making the client responsible for noticing changes (via ETag). Low-level caching makes it correct by hand, and gets it wrong frequently.
The deeper move is that "the cache stays correct" should be a property of the cache key, not a property of code you remember to write. When you cache [user.cache_key, "stats"], you have outsourced invalidation to Rails, user gets updated, key changes, cache is functionally invalidated. When you cache "user_#{user.id}_stats", you have to remember to delete that key in every code path that mutates the user, which is brittle.
The pragmatic value: most "the cache is stale" production incidents come from cache keys that did not include the right inputs. Designing the cache key well is the work; the rest is plumbing.
Practice exercise
- Grep your views for <% cache. For each cached fragment, check that the cache key uses record-based versioning (cache(@post), not cache("posts_index")). The first is invalidated automatically; the second is a stale-cache time bomb.
- Look at one of your high-traffic show actions. Add fresh_when @record at the top. Test that subsequent requests from the same browser get 304 responses. This is free performance for clients with caching enabled.
- For models that participate in cached views, check that child models use belongs_to :parent, touch: true. Without touch, Rails caching cannot propagate child changes up to parent fragments.
- Grep your codebase for Rails.cache.fetch. For each call, ask: "if every input to the cached computation changes, will the cache key change?" If the answer is no, the key is wrong.
- Bonus: if your app uses Redis for caching and you do not have a specific reason to (very-high-traffic cache ops, existing Redis infrastructure), try Solid Cache on a staging environment. The operational simplicity of "no Redis service to manage" is real, and the per-op overhead is usually invisible.