Interview Question

"includes vs preload vs eager_load vs joins"

This Active Record interview question tests how well a candidate knows the four loading methods: the SQL each one generates, when each is the right choice, and the implementation detail that signals senior depth.

What the interviewer is actually checking

Every Rails developer has used includes. Fewer have used all four and understand the differences. The question filters for whether the candidate has ever opened the Rails log and read the SQL, ever hit a case where the wrong loading method made a query 100x slower, ever debugged the "expected JOIN, got two queries" surprise.

The strong signal is whether the candidate can describe the SQL each method generates without prompting. A mid-level answer covers includes and stops at "Rails handles it." A senior answer enumerates the four, names the SQL shape, and gives an opinion about which one to default to.

Behind the question is a check on whether the candidate is comfortable reading Active Record's generated SQL, which is required for any non-trivial production Rails work.

The mid-level answer

"includes loads associations to avoid N+1. joins does a SQL JOIN for filtering but does not load the records. preload and eager_load are alternatives to includes but I do not really use them; includes works for most cases."

Partially correct, but skips the details that matter. includes is not a single behavior; it switches between preload and eager_load at runtime. Knowing that switch and when to override it is the senior part of the answer.

The senior answer

Walk the four methods, name the SQL each generates, and give an opinion.

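preload. "One query for the parent collection, then one more per preloaded association, fetched with an IN (...) list built from the parents' keys: WHERE id IN (...) for a belongs_to, WHERE post_id IN (...) for a has_many. The associations are loaded but the queries are independent; the parent SQL is unaffected by the preload."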

Post.preload(:author).limit(50)
# SELECT posts.* FROM posts LIMIT 50
# SELECT users.* FROM users WHERE users.id IN (1, 2, 3, ...)
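
A rough way to picture the two-query shape is the plain-Ruby sketch below. It is a model, not Active Record's implementation: PostRow, UserRow, USERS, fetch_users and preload_authors are made-up names, and the in-memory array stands in for the users table.

```ruby
# Plain-Ruby model of the two-query preload shape. No database involved:
# the arrays stand in for the rows returned by query 1 and query 2.
PostRow = Struct.new(:id, :title, :author_id, :author)
UserRow = Struct.new(:id, :name)

# Hypothetical contents of the users table.
USERS = [UserRow.new(1, "Ann"), UserRow.new(2, "Bo"), UserRow.new(3, "Cy")].freeze

# Stand-in for: SELECT users.* FROM users WHERE users.id IN (...)
def fetch_users(ids)
  USERS.select { |user| ids.include?(user.id) }
end

def preload_authors(posts)
  # Keys come from the rows query 1 already returned.
  ids = posts.map(&:author_id).compact.uniq
  # Query 2 runs once, then results are indexed by primary key.
  users_by_id = fetch_users(ids).to_h { |user| [user.id, user] }
  # Attach in memory; no further queries per post.
  posts.each { |post| post.author = users_by_id[post.author_id] }
end

posts = [PostRow.new(10, "A", 1), PostRow.new(11, "B", 2), PostRow.new(12, "C", 1)]
preload_authors(posts)
puts posts.map { |post| "#{post.title}: #{post.author.name}" }
```

The parent query never mentions users, which is why a preload cannot filter or sort by the association.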

eager_load. "Single query with a LEFT OUTER JOIN, plus column-aliasing so Active Record can hydrate both the parent and child records from one result row."

Post.eager_load(:author).where("users.role = ?", "admin")
# SELECT posts.id AS t0_r0, ... users.name AS t1_r1, ...
# FROM posts LEFT OUTER JOIN users ON users.id = posts.author_id
# WHERE users.role = 'admin'
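
The aliasing can be pictured with a plain-Ruby sketch. This is an illustration under assumptions, not the Rails hydration code: the alias maps, the extract and hydrate helpers, and the sample rows are all hypothetical; only the t0_/t1_ alias style comes from the SQL above.

```ruby
# Plain-Ruby model of splitting aliased JOIN rows back into two records.
POST_ALIASES = { "t0_r0" => :id, "t0_r1" => :title }.freeze
USER_ALIASES = { "t1_r0" => :id, "t1_r1" => :name }.freeze

# Pick one table's columns out of a joined row and restore their names.
def extract(row, aliases)
  aliases.to_h { |aliased, name| [name, row[aliased]] }
end

def hydrate(rows)
  posts = {}
  rows.each do |row|
    post = extract(row, POST_ALIASES)
    user = extract(row, USER_ALIASES)
    # A LEFT OUTER JOIN with no match yields NULL for every t1_ column.
    author = user[:id].nil? ? nil : user
    # Keep the first occurrence of each post id (deduplication).
    posts[post[:id]] ||= post.merge(author: author)
  end
  posts.values
end

rows = [
  { "t0_r0" => 10, "t0_r1" => "A", "t1_r0" => 1,   "t1_r1" => "Ann" },
  { "t0_r0" => 11, "t0_r1" => "B", "t1_r0" => nil, "t1_r1" => nil }
]
hydrate(rows).each { |post| puts post.inspect }
```

Every row carries both tables' columns, which is what makes filtering on users possible and what makes wide joins expensive.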

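includes. "Switches between the two at runtime. Default is preload. If the relation registers a reference to the associated table, through a hash condition such as where(users: { role: 'admin' }) or an explicit references(:users), or if the same association is also passed to joins, it silently upgrades to eager_load. A string passed to where is not parsed for table names, so it does not trigger the upgrade on its own. The runtime decision is the source of the 'why is includes generating one query here and two there?' confusion."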
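
The switch can be summarized as a small decision function. This is a simplified model assuming current Rails behavior, not the framework's source: loading_strategy and its keyword arguments are hypothetical names, and already_joined stands for the tables the base query already contains.

```ruby
# Simplified model of how a relation picks its loading strategy.
# includes/eager_load/joins hold association names; references holds table names.
def loading_strategy(includes: [], eager_load: [], joins: [], references: [], already_joined: ["posts"])
  return :eager_load if eager_load.any?
  return :none if includes.empty?

  also_joined = (includes & joins).any?
  new_table_referenced = (references.map(&:to_s) - already_joined).any?
  (also_joined || new_table_referenced) ? :eager_load : :preload
end

# Post.includes(:author)
p loading_strategy(includes: [:author])                        # => :preload
# Post.includes(:author).where(users: { role: "admin" })
# (the hash condition registers a reference to "users")
p loading_strategy(includes: [:author], references: ["users"]) # => :eager_load
# Post.includes(:author).where("users.role = ?", "admin")
# (a string condition registers nothing)
p loading_strategy(includes: [:author])                        # => :preload
# The same chain plus .references(:users)
p loading_strategy(includes: [:author], references: [:users])  # => :eager_load
```

In the third case the strategy stays on preload, so the posts query would run without users in its FROM clause and the string condition would fail.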

joins. "INNER JOIN, no record loading. The associated table is in the FROM but its columns are not selected. Useful only for filtering or sorting by the association without needing the records afterward. If you call post.author.name on a result, that fires a new query because the author was not loaded."

Post.joins(:author).where(users: { role: "admin" })
# SELECT posts.* FROM posts
# INNER JOIN users ON users.id = posts.author_id
# WHERE users.role = 'admin'

# Now: post.author.name still triggers a separate query.

The opinion. "I default to explicit preload when I do not need to filter by the association, and explicit eager_load when I do. includes is convenient but the runtime switch can surprise you when the same scope is used in two different chains. The pattern I have seen most often biting teams in production is includes that was expected to preload but quietly upgraded to a JOIN with multiplied result rows."
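
The cost of multiplied result rows is easy to estimate. The sketch below is back-of-the-envelope arithmetic, assuming a hypothetical page of 50 posts with 1,000 comments each and made-up column counts (10 per post, 6 per comment); eager_load_cost and preload_cost are illustrative names.

```ruby
# Rough row and value counts for loading posts with their comments.
# Assumes every post has at least one comment.

# One JOIN result: a row per (post, comment) pair, each carrying both tables' columns.
def eager_load_cost(posts:, comments_per_post:, post_cols:, comment_cols:)
  rows = posts * comments_per_post
  { rows: rows, values: rows * (post_cols + comment_cols) }
end

# Two results: the post rows once, then the comment rows on their own.
def preload_cost(posts:, comments_per_post:, post_cols:, comment_cols:)
  comment_rows = posts * comments_per_post
  { rows: posts + comment_rows,
    values: posts * post_cols + comment_rows * comment_cols }
end

scenario = { posts: 50, comments_per_post: 1_000, post_cols: 10, comment_cols: 6 }

eager = eager_load_cost(**scenario)
puts "eager_load: #{eager[:rows]} rows, #{eager[:values]} values"
# eager_load: 50000 rows, 800000 values

pre = preload_cost(**scenario)
puts "preload:    #{pre[:rows]} rows, #{pre[:values]} values"
# preload:    50050 rows, 300500 values
```

The row counts are nearly identical; the difference is that the JOIN repeats every post column on every comment row.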

The follow-ups

"What does references do?" The right answer: it tells includes that a condition refers to the associated table, which Active Record cannot work out from a SQL string. Post.includes(:author).where(users: { role: "admin" }) upgrades to eager_load on its own because the hash condition names the users table. Post.includes(:author).where("users.role = ?", "admin") does not: Active Record does not parse string conditions, so the relation stays on preload and the posts query fails because users is not in its FROM clause. Adding .references(:users) forces the upgrade. Note that references takes the table name (users), not the association name (author).

"When would eager_load be wrong?" The right answer: when the join multiplies result rows in a way that explodes memory. If each Post has 1,000 Comments and you call eager_load(:comments).limit(50), the JOIN returns 50,000 rows, each repeating every column of its Post; Active Record then deduplicates them into 50 Posts and attaches the 1,000 Comments to each. The two-query preload avoids the repetition: one query gets 50 Posts, the second gets up to 50,000 Comments with a single WHERE post_id IN (...), and no Post data is duplicated.

"How would you detect which one Rails is using?" The right answer: read the development log. The SQL is right there. Either one query with a LEFT OUTER JOIN (eager_load) or two queries (preload). Confirm by counting the queries on a specific request.

"What about left_outer_joins?" Bonus topic. left_outer_joins is like joins but with a LEFT OUTER JOIN instead of INNER. Useful when you want to filter "posts with no author" via left_outer_joins(:author).where(users: { id: nil }). Does not load the association either.

What signals what

  • "includes handles it." Junior.
  • "includes is preload or eager_load depending on context." Mid. Aware of the switch but has not internalized when it bites.
  • "I default to explicit preload because of result-set multiplication." Senior. Has hit the failure mode and developed a heuristic.
  • "And in production I use pg_stat_statements to find queries where one slow JOIN matters more than the eager-loading shape." Senior+. Thinks in terms of production diagnosis, not only the API.

The principle at play

Active Record's loading methods are not interchangeable. Each generates different SQL with a different cost and memory profile. The senior reflex is to read the generated SQL rather than call the method and hope. Since Rails 7.1, explain accepts options such as .explain(:analyze) on adapters that support them, which makes this easier than it used to be.
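
One way to make "read the generated SQL" concrete is a helper that classifies the statements logged for a single relation. The sketch below is a heuristic under assumptions, not a Rails API: classify_loading is a hypothetical name, and the statements are passed in as plain strings. In a real app they could be collected from the development log or from an ActiveSupport::Notifications subscriber on sql.active_record.

```ruby
# Heuristic classifier for the SELECT statements one relation produced.
def classify_loading(sql_statements)
  selects = sql_statements.grep(/\A\s*SELECT\b/i)

  # eager_load: a LEFT OUTER JOIN whose columns carry t0_r0-style aliases.
  aliased_join = selects.any? do |sql|
    sql.match?(/LEFT OUTER JOIN/i) && sql.match?(/\bAS t0_r0\b/i)
  end
  return :eager_load if aliased_join

  # preload: the parent query followed by at least one association query.
  return :preload if selects.size >= 2

  # joins / left_outer_joins alone: one query, no association loaded.
  :not_loaded
end

preload_log = [
  'SELECT "posts".* FROM "posts" LIMIT 50',
  'SELECT "users".* FROM "users" WHERE "users"."id" IN (1, 2, 3)'
]
eager_log = [
  'SELECT "posts"."id" AS t0_r0, "users"."id" AS t1_r0 FROM "posts" ' \
  'LEFT OUTER JOIN "users" ON "users"."id" = "posts"."author_id"'
]

p classify_loading(preload_log) # => :preload
p classify_loading(eager_log)   # => :eager_load
```

The alias check matters because left_outer_joins also emits a LEFT OUTER JOIN but selects only the parent's columns.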

The interview reframing: when you get this question, do not stop at enumerating the methods. Name the SQL each produces. The candidates who do best at this question are the ones who clearly have an internal model of what the framework is doing in the database, not only what the Ruby API is.
