Interview Question

"How do you handle long-running tasks?"

The interview question that tests whether the candidate has shipped real production background work. Active Job, the adapter choice, retries and idempotency, transactions, and the gotchas that bite teams once they hit scale.

What the interviewer is actually checking

"Long-running" in a Rails request usually means "this should not happen synchronously." The interview question filters for whether the candidate has thought about why the request cycle is bad for long work, what tool to reach for, how to make jobs reliable, and what to do when a job fails.

A weak answer says "use Sidekiq." A senior answer covers Active Job as the abstraction, the adapter choice for 2026, retry semantics, idempotency, the transactional-enqueue trap, and the operational concerns (job UI, dead-job recovery, runaway queues).

The mid-level answer

"I would put the long-running work in a background job using Sidekiq. The controller calls SomeJob.perform_later and returns immediately; the job runs separately."

Correct as a starting point. Misses Active Job (the abstraction Rails ships), the 2026 adapter choice (Solid Queue is the new default), retries and idempotency, and the transactional gotcha.

The senior answer

Active Job as the layer. "Rails has shipped Active Job since 4.2. Application code calls SomeJob.perform_later(args) regardless of the adapter underneath. The adapter (Sidekiq, Solid Queue, good_job) is a config setting. Writing application code against Active Job means I can swap adapters without touching business logic."

The adapter choice for 2026. "Default to Solid Queue for new Rails 8 apps. Database-backed (PostgreSQL, MySQL, or SQLite), no Redis required, and able to enqueue inside the application's transaction when the queue tables share the application database. Switch to Sidekiq if throughput exceeds ~5,000 jobs/sec or the team has years of Sidekiq ops experience. Detailed in Sidekiq vs Solid Queue."

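The transactional enqueue trap. "Critical detail. If I'm using Sidekiq, SomeJob.perform_later writes to Redis immediately, regardless of whether the surrounding transaction commits. A worker can pick the job up before the commit and find no record, and if the transaction rolls back, the job still runs against data that never existed. The fix is to enqueue only after commit: an after_commit callback, the after_commit_everywhere gem, or, since Rails 7.2, ActiveRecord.after_all_transactions_commit and Active Job's enqueue_after_transaction_commit setting. With Solid Queue, the enqueue joins the same database transaction only when the queue tables live in the application database. The Rails 8 default is a separate queue database, so the same after-commit discipline applies there."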
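The ordering problem can be seen without Rails at all. The following is a minimal plain-Ruby sketch, not Rails or Sidekiq code: ToyTransaction, signup, and the array standing in for Redis are all hypothetical names invented for illustration.

```ruby
# Toy model of a transaction with after-commit hooks (plain Ruby, no Rails).
class ToyTransaction
  class Rollback < StandardError; end

  # Run the block; fire the hooks on success, drop them on Rollback.
  def self.run
    tx = new
    yield tx
    tx.commit!
    true
  rescue Rollback
    false # hooks are never fired, so deferred enqueues vanish
  end

  def initialize
    @after_commit = []
  end

  # Register work that should run only if the transaction commits.
  def after_commit(&block)
    @after_commit << block
  end

  def commit!
    @after_commit.each(&:call)
  end
end

# Hypothetical signup flow; `queue` stands in for Redis or any external store.
def signup(queue, defer:, fail_midway:)
  ToyTransaction.run do |tx|
    enqueue = -> { queue << :welcome_email }
    defer ? tx.after_commit(&enqueue) : enqueue.call
    raise ToyTransaction::Rollback if fail_midway
  end
  queue
end

signup([], defer: false, fail_midway: true)  # job escapes the rollback
signup([], defer: true,  fail_midway: true)  # job is discarded with the rollback
signup([], defer: true,  fail_midway: false) # job is enqueued after commit
```

The first call leaves a job in the queue even though the "transaction" rolled back, which is the trap. The deferred variants enqueue only when the commit actually happens.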

Retries and idempotency. "Jobs can fail and will be retried. The senior reflex: every job should be idempotent. Running the job twice should produce the same result as running it once. For 'send email' jobs, that means tracking whether the email was sent before sending; for 'process payment' jobs, that means an idempotency key on the Stripe call. retry_on in Active Job declares which exceptions warrant retries; discard_on declares which should give up immediately."

class ChargeOrderJob < ApplicationJob
  retry_on Stripe::APIConnectionError, wait: :polynomially_longer, attempts: 5
  discard_on ActiveRecord::RecordNotFound  # order was deleted, do not retry

  def perform(order_id)
    order = Order.find(order_id)
    return if order.charged?  # idempotency check

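    # stripe-ruby takes request options such as idempotency_key in a
    # second hash, separate from the API parameters.
    Stripe::Charge.create(
      {
        amount:   order.total_cents,
        currency: "usd",  # assumed; Stripe requires a currency
        customer: order.user.stripe_customer_id
      },
      { idempotency_key: "order-#{order.id}-charge" }
    )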
    order.update!(charged: true, charged_at: Time.current)
  end
end
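To make the backoff concrete, here is a small plain-Ruby sketch of the delay schedule, assuming the formula given in the Active Job documentation for :polynomially_longer (executions to the fourth power, plus 2 seconds). Real delays also include a configurable random jitter, which is left out here. The helper names polynomial_wait and retry_schedule are hypothetical.

```ruby
# Base delay in seconds before the retry that follows the given execution,
# per the documented :polynomially_longer formula with jitter omitted.
def polynomial_wait(executions)
  (executions**4) + 2
end

# attempts: 5 means five runs in total, so at most four waits between them.
def retry_schedule(attempts)
  (1...attempts).map { |n| polynomial_wait(n) }
end

retry_schedule(5) # => [3, 18, 83, 258]
```

Under these assumptions the whole retry window for ChargeOrderJob is about six minutes, which is worth knowing when deciding whether five attempts will outlast a typical upstream outage.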

The operational concerns

"Beyond the code, three things a senior thinks about for any background-job system."

Observability. "I need to see which jobs are running, which are failing, what the queue depths look like. Sidekiq Web is the gold standard for this. Mission Control Jobs, the dashboard that covers Solid Queue, is improving. For both: have a known URL, know who has access, check it during incidents."

Dead jobs. "When a job exhausts retries, it lands in a dead-jobs queue. Someone needs to look at it. A weekly review of the dead-jobs queue is a useful discipline; otherwise dead jobs become invisible until they cost something."

Queue priorities. "Not all jobs are equal. A welcome email can wait two minutes; a fraud check has to run within seconds. Separate queues with different worker pools (or different concurrency settings) let critical work jump ahead of the bulk."
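One way to picture "critical work jumps ahead" is strict queue ordering, sketched below in plain Ruby. This is a toy model rather than any adapter's actual scheduler, and the queue names and the next_job helper are hypothetical.

```ruby
# Hypothetical queue names, highest priority first.
PRIORITY_ORDER = %i[critical default bulk].freeze

# Return the next job, always draining higher-priority queues first.
def next_job(queues)
  PRIORITY_ORDER.each do |name|
    job = queues.fetch(name, []).shift
    return job if job
  end
  nil
end

queues = { bulk: [:welcome_email], critical: [:fraud_check] }
next_job(queues) # => :fraud_check
next_job(queues) # => :welcome_email
next_job(queues) # => nil
```

Strict ordering can starve the bulk queue whenever critical work keeps arriving, which is one reason to give each queue its own worker pool instead.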

The follow-ups

"When would you NOT use a background job?" When the operation is fast and the user is waiting for the result. Don't background a 50ms database query: the controller can wait, the user wants the result. Background work is for operations that exceed roughly 200ms or that should not block the response (sending email, generating reports, syncing to external systems).

"How would you handle a job that needs to run at a specific time?" SomeJob.set(wait: 1.hour).perform_later or SomeJob.set(wait_until: 5.minutes.from_now).perform_later. For recurring schedules: sidekiq-cron, Solid Queue's recurring tasks, or an external scheduler such as cron or Cronicle that enqueues jobs at the right times.

"How do you backfill a large dataset?" Not in one job. Bulk operations should be batched: in_batches(of: 1000) or find_in_batches(batch_size: 1000) to enqueue one job per batch, or find_each to enqueue one job per record if the per-record work is heavy. The senior trap: a single job that calls update_all on 10M rows runs for hours and cannot be paused.
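The batching idea can be sketched in plain Ruby. In a Rails app, in_batches would do the slicing; here each_slice plays that role, and enqueue_backfill, BackfillBatchJob, and the array standing in for the queue are hypothetical.

```ruby
# Split record ids into fixed-size batches and record one job payload per
# batch. `queue` is a plain array standing in for the job adapter.
def enqueue_backfill(ids, queue, batch_size: 1000)
  ids.each_slice(batch_size) do |batch|
    queue << { job: "BackfillBatchJob", first_id: batch.first, last_id: batch.last }
  end
  queue
end

jobs = enqueue_backfill(1..2500, [])
jobs.size # => 3
```

Passing an id range rather than the records themselves keeps each payload small, and each batch can fail, retry, or be paused independently of the others.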

"What is a 'poison pill' and how do you handle one?" A job that fails consistently and never makes progress (always raises the same exception). Without intervention, retries keep firing until the attempt limit runs out, which can take weeks under Sidekiq's default retry schedule. The fix: either fix the underlying bug (often a stale reference to a deleted record), or move the job to dead-letter so it stops eating queue capacity. discard_on with specific exceptions handles many of these automatically.
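The handling described above (cap the attempts, discard known-hopeless errors, park the rest) can be sketched in plain Ruby. This is an illustrative model, not how any adapter is implemented: run_with_retries and the dead array are hypothetical, and a real system would wait between attempts.

```ruby
# Run a job with a capped number of attempts. Exceptions listed in `discard`
# drop the job immediately; anything else retries, then lands in `dead`.
def run_with_retries(job, dead:, max_attempts: 5, discard: [])
  attempts = 0
  begin
    attempts += 1
    job.call
    :ok
  rescue StandardError => e
    return :discarded if discard.any? { |klass| e.is_a?(klass) }
    retry if attempts < max_attempts
    dead << { error: e.class.name, attempts: attempts }
    :dead
  end
end

dead = []
poison = -> { raise ArgumentError, "bad payload" }
run_with_retries(poison, dead: dead, max_attempts: 3) # => :dead
```

After that call, dead holds one entry recording the exception class and the three attempts, which is the information a weekly dead-jobs review would start from.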

The principle at play

Background jobs are not "send the work somewhere else." They are a distinct surface with their own failure modes (retries, partial completion, queue exhaustion, transactional decoupling) and their own operational surface (UI, dead jobs, queue depth, priority). A senior treats the job system as carefully as the request system.

The shortest version of the senior answer: Active Job + an adapter, jobs are idempotent, enqueue after_commit, retry with backoff, watch the dead queue. Anything more sophisticated (priorities, scheduled runs, batches) is a refinement of that core.
