Practice · Scaling · Card 5
Why is the counter cache slowing down every comment?
A counter cache. Reasonable schema. Then a celebrity user posts. Suddenly every comment INSERT takes 800ms. Same table. Same indexes. What changed?
The setup
A counter cache on a Twitter-style app. The celebrity user has 100,000 followers and just posted; thousands of comments arrive within seconds.
class Comment < ApplicationRecord
belongs_to :post, counter_cache: true
end
# Every Comment.create! generates:
#
# BEGIN;
# INSERT INTO comments (post_id, body, ...) VALUES (...);
# UPDATE posts SET comments_count = comments_count + 1
# WHERE id = 42;
# COMMIT;
# The post is post #42. Thousands of users comment in the same second.
The question: why does every comment INSERT slow down, given that the comments table itself has plenty of capacity?
Take a moment. Pick the best answer. Wrong picks reveal why they're wrong, which is half the point.
✅ Answer breakdown
✗ A. Postgres doesn't allow multiple writes to the comments table in parallel.
It absolutely does. Postgres handles concurrent inserts into the same table fine. Inserts conflict only when they touch the same row (or via unique-index contention on a specific value).
✓ B. Each comment INSERT also runs an UPDATE on the same post row. Concurrent transactions all take a row-level lock on that one row and have to wait for each other.
The counter cache turns every comment write into two writes: an insert (on the comments table, fine) and an update on a single row of the posts table. Postgres locks that row for the duration of each transaction. Thousands of concurrent commenters serialize on post #42's row.
✗ C. The counter cache is updated through a slow database trigger.
Counter caches are Rails-level (callbacks on the model), not database triggers. The slowness isn't in how the update is fired, it's in the lock contention on the row being updated.
✗ D. Rails uses a single global lock when updating counter cache columns.
There's no such Rails-level lock. The contention is in Postgres, on a specific row, because thousands of transactions want to UPDATE the same row.
💡 The principle
This is hot-row contention. The shape: a single row in your database is being written to by many concurrent transactions, and Postgres serializes them on the row's lock. The whole system feels slow, but it's actually one row creating a queue.
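The queue-on-one-row shape can be sketched in plain Ruby, with threads standing in for transactions and a Mutex standing in for Postgres's row-level lock. This is an illustration, not Rails or Postgres code:

```ruby
# One lock per "row". Every writer targets row 0 (the celebrity's post),
# so all of them queue on that single lock, just as concurrent UPDATEs
# queue on post #42's row lock. Writers to other rows would not block.
locks  = Array.new(2) { Mutex.new }  # per-row locks
counts = Array.new(2, 0)

threads = 100.times.map do
  Thread.new do
    hot = 0                          # every commenter hits the same post row
    locks[hot].synchronize { counts[hot] += 1 }
  end
end
threads.each(&:join)

counts[0]  # => 100: the count is correct, but all 100 writers ran one at a time
```

The result is always right; the cost is that throughput on that row collapses to one writer at a time.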
Three real fixes:
- Move the counter out of the hot path. Update posts.comments_count from a background job that batches writes. Each comment write returns quickly.
- Shard the counter. Store N counter rows per post (e.g., 10), each updated by a fraction of writers. Sum them at read time. This is the standard "counter sharding" pattern from very-high-write systems.
- Skip the cache when not needed. If you only show comments_count on the post page itself, you can compute it on read with a fast COUNT(*) plus an index. The "cache" was the bug.
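Counter sharding in particular can be sketched in plain Ruby (the cell layout and helper names here are illustrative, not a library API):

```ruby
# Counter sharding sketch: N cells per post instead of one hot row.
# Each writer increments one randomly chosen cell, so contention is
# spread across N locks; reads sum the cells (SELECT SUM(count) ...).
SHARDS = 10
shards = Array.new(SHARDS) { { count: 0, lock: Mutex.new } }

def bump(shards)
  cell = shards.sample                    # pick a random cell to increment
  cell[:lock].synchronize { cell[:count] += 1 }
end

def total(shards)
  shards.sum { |cell| cell[:count] }      # read path: sum all cells
end

1_000.times { bump(shards) }
total(shards)  # => 1000
```

The trade: writes stop serializing on one row, but reads now aggregate N rows, and the sum is only as fresh as the moment you read it.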
Find hot rows by watching for high wait times on a small number of rows. pg_stat_activity and the slow query log are your friends.
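As a starting point, a query along these lines (pg_stat_activity's wait_event columns exist in Postgres 9.6+) lists the sessions currently stuck behind a lock and what they were trying to run:

```sql
-- Sessions currently waiting on a lock, and the statement they're blocked in.
SELECT pid, wait_event_type, wait_event, state, query
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';
```

During a hot-row episode you'd expect many rows here, all blocked in the same UPDATE against the same table.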
📚 Theory
For the full walkthrough, read Scaling · Hot Rows & Contention.