System Design · 18 min read

Design a Distributed Rate Limiter: Tokens, Leaks, and Shared State

Edge vs service limits, approximate counters, and what breaks under Redis failover.

This long-form guide sits in the Alpha Code library because interview prep should feel structured, not superstitious: we anchor advice to what loops actually measure, how time pressure distorts judgment, and how to rehearse behaviors that stay stable under stress. You will find six concrete chapters below, each with checklists and recovery patterns you can reuse across companies and levels. We wrote it for candidates who already know the basics but want a disciplined narrative, the kind of document you can skim before a phone screen and deep-read before an onsite. Expect explicit tradeoffs, not cheerleading: some strategies cost time, some require partners, and some only make sense at certain seniority bands. If a section does not apply to your target loop, skip it without guilt; the goal is optionality, not completionism. By the end, you should be able to describe your prep plan to a mentor in five minutes and sound like you have a system, not a pile of bookmarks.

Algorithm choices: what interviewers measure in the first five minutes

Candidates preparing for the "Design a Distributed Rate Limiter" prompt often underestimate how much interviewers infer from process: how you decompose the prompt, name tradeoffs, and verify before you optimize. The behaviors that look boring (restating constraints, proposing a baseline, testing a tiny example) are exactly what separates hire from no-hire when two solutions have similar asymptotics. We connect this theme to what hiring committees actually write in feedback forms, not abstract advice. Treat the next paragraphs as a script you can steal: say the quiet parts out loud, label your invariants, and narrate recovery when you misread a constraint. Practice until it feels mechanical, because stress will strip your polish unless the habits are automatic.

Complexity analysis is a communication tool. Big-O is not only for the end of the problem; it is how you justify why you are not exploring an exponential search. State the bottleneck honestly: maybe sorting dominates at O(n log n), maybe a hash map makes each lookup constant-time on average, maybe nested loops are acceptable because the inner bound is tiny. Interviewers reward coherent complexity stories more than memorized proofs.
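As a small illustration of telling a complexity story out loud, the sketch below solves one toy problem two ways and states the cost of each in its docstring. The problem (does any pair sum to a target?) and the function names are assumptions chosen for illustration, not something this guide prescribes.

```python
def has_pair_quadratic(nums, target):
    """Baseline: check every pair. O(n^2) time, O(1) extra space."""
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return True
    return False


def has_pair_hashed(nums, target):
    """One pass with a set. O(n) time on average, O(n) extra space."""
    seen = set()
    for x in nums:
        if target - x in seen:  # average O(1) membership test
            return True
        seen.add(x)
    return False
```

Saying "the nested loops are my baseline, and the set removes the inner loop at the price of O(n) memory" is usually enough; the proof can stay implicit.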

Tradeoff tables beat absolutes. Strong consistency vs availability, SQL vs NoSQL for this workload, sync vs async processing — show the decision criteria, not a slogan. The goal is to demonstrate judgment, not encyclopedic product knowledge.

Offer timelines compress judgment. You will be tired, you will compare yourself to peers, and you will be tempted to cram randomly. A written plan — even a single page — reduces thrash: which skills you are proving this week, which companies get which energy, and what 'good enough' looks like for each stage. Revisit the plan twice a week instead of reinventing it nightly.

"The best onsite performances look boring from the outside: clear steps, explicit assumptions, and a solution that actually finishes."
(Composite feedback from mock interview coaches)

  • Restate the heart of "algorithm choices — what interviewers measure in the first five minutes" and confirm inputs, outputs, and edge cases.
  • Propose a brute-force or baseline you can finish — name its complexity honestly.
  • Walk a hand trace on a small example; only then refactor toward the optimal structure.
  • Reserve the final minutes for tests: null/empty, duplicates, extremes, and off-by-one boundaries.
  • Close with a one-sentence summary of tradeoffs and what you would monitor in production.
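For the rate-limiter prompt this guide is named after, one baseline you can finish quickly is a single-process token bucket. The sketch below is an assumption about what such a baseline might look like, not a design the guide prescribes; `TokenBucket`, `rate_per_sec`, `capacity`, and the injectable `clock` are hypothetical names. Each call is O(1), and it deliberately ignores state shared across hosts, which is the part an interviewer is likely to push on next.

```python
import time


class TokenBucket:
    """Single-process token bucket; not safe across processes or hosts."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec      # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.clock = clock            # injectable so tests control time
        self.tokens = capacity        # start full
        self.last = clock()

    def allow(self, cost=1.0):
        """Refill based on elapsed time, then spend `cost` tokens if possible."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Passing a fake clock makes the hand trace from the checklist easy to turn into a test: drain the bucket, advance time by one second, and confirm exactly one more request is admitted.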

First moves: framing shared state stores before you reach for code

Behavioral answers rot without maintenance. Stories should be refreshed every six to twelve months with new metrics and clearer scope. The STAR format is a scaffold, not a script — senior interviewers want to hear how you prioritized, what you learned, and what you would do differently. Keep a one-page story bank with bullets, not paragraphs, so you can assemble answers live without sounding rehearsed.

Data models should match access patterns. Normalize when invariants matter; denormalize when read latency dominates and you can tolerate eventual consistency. Event sourcing is powerful but operational overhead is real — do not propose it unless the problem benefits from auditability or replay.

Data structures are not Pokemon; you do not collect them for their own sake. You pick the structure that makes the operations your algorithm needs cheap. If you need fast membership and order does not matter, a set or map is the conversation. If you need order statistics, heaps or balanced trees enter. If the problem is about connectivity, graphs are near. Practice explaining that mapping in one sentence before you write code.
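The mapping from operation to structure can be made concrete with three short functions. This is a sketch with made-up function names, assuming small in-memory inputs; each docstring is the one-sentence explanation you would say before writing the code.

```python
import heapq
from collections import defaultdict, deque


def first_repeat(items):
    """Fast membership, order irrelevant: use a set."""
    seen = set()
    for item in items:
        if item in seen:
            return item
        seen.add(item)
    return None


def kth_largest(nums, k):
    """Order statistics: use a heap. Assumes 1 <= k <= len(nums)."""
    return heapq.nlargest(k, nums)[-1]


def connected(edges, start, goal):
    """Connectivity: build adjacency lists and run breadth-first search."""
    graph = defaultdict(list)
    for a, b in edges:
        graph[a].append(b)
        graph[b].append(a)
    queue, visited = deque([start]), {start}
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in graph[node]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    return False
```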

  • Restate the heart of "First moves: framing shared state stores before you reach for code" and confirm inputs, outputs, and edge cases.
  • Propose a brute-force or baseline you can finish — name its complexity honestly.
  • Walk a hand trace on a small example; only then refactor toward the optimal structure.
  • Reserve the final minutes for tests: null/empty, duplicates, extremes, and off-by-one boundaries.
  • Close with a one-sentence summary of tradeoffs and what you would monitor in production.

Moment | What to say
Start | I'll restate the goal, then propose a baseline I can complete in time.
Midpoint | Here's the invariant I'm maintaining; I'll verify it on the example.
Stuck | I'm stuck on X; I'll try a smaller case and see what breaks.
End | I'll run these edge cases, then summarize complexity and tradeoffs.

Tradeoffs, pitfalls, and honest complexity around edge vs regional

Testing your solution should be habitual, not heroic. Walk a small example by hand, then translate that walk into asserts or print debugging if the environment allows. If tests fail, read the failure mode: off-by-one errors cluster at boundaries; infinite loops often mean your termination condition moved; wrong answers without crashes often mean a logic gap in state updates. Label those categories in your post-mortem so you see patterns across problems.
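One way to turn a hand trace into asserts is shown below. The function (a lower-bound binary search) is an assumed example, picked because its bugs tend to be exactly the boundary and termination failures described above.

```python
def lower_bound(sorted_nums, target):
    """Return the first index i with sorted_nums[i] >= target."""
    lo, hi = 0, len(sorted_nums)
    while lo < hi:  # terminates: the interval shrinks every iteration
        mid = (lo + hi) // 2
        if sorted_nums[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo


# Hand trace first: in [1, 3, 3, 7], the first value >= 3 sits at index 1.
assert lower_bound([1, 3, 3, 7], 3) == 1
# Then the boundaries, where off-by-one errors cluster.
assert lower_bound([], 5) == 0
assert lower_bound([1, 3, 3, 7], 0) == 0
assert lower_bound([1, 3, 3, 7], 8) == 4
assert lower_bound([1, 3, 3, 7], 7) == 3
```

If one of these asserts failed, the category of failure (boundary, termination, or state update) is what belongs in the post-mortem notes.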

Caching layers need invalidation stories. TTL-only caches are simpler but stale reads may be unacceptable for certain entities. Write-through vs write-behind trades durability vs latency. Mention cache stampede mitigation if hot keys are plausible — interviews reward that awareness.
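A common stampede mitigation is to let only one caller recompute an expired key while the others wait for its result. The sketch below is a minimal in-process version under stated assumptions: `SingleFlightCache`, `ttl_seconds`, and `loader` are hypothetical names, entries are never evicted, and a distributed cache would need a different locking mechanism.

```python
import threading
import time


class SingleFlightCache:
    """TTL cache where at most one thread reloads a given key at a time."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._values = {}                # key -> (expires_at, value)
        self._locks = {}                 # key -> lock guarding recomputation
        self._guard = threading.Lock()   # protects the _locks dict

    def _fresh(self, key):
        hit = self._values.get(key)
        if hit is not None and hit[0] > self.clock():
            return hit
        return None

    def get(self, key, loader):
        hit = self._fresh(key)
        if hit is not None:
            return hit[1]                # fast path: no locking on a fresh hit
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:
            hit = self._fresh(key)       # another thread may have just loaded it
            if hit is not None:
                return hit[1]
            value = loader(key)
            self._values[key] = (self.clock() + self.ttl, value)
            return value
```

The second freshness check inside the lock is the important line: without it, every waiting thread would call `loader` in turn once the first one finished.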

Mock interviews fail when they are too polite. The point is not confidence; the point is diagnostic signal. You want a partner who will interrupt, ask why you chose a data structure, and force you to state invariants explicitly. Record audio if you can. The gap between what you think you explained and what you actually said is where most surprises live.

  • Restate the heart of "Tradeoffs, pitfalls, and honest complexity around edge vs regional" and confirm inputs, outputs, and edge cases.
  • Propose a brute-force or baseline you can finish — name its complexity honestly.
  • Walk a hand trace on a small example; only then refactor toward the optimal structure.
  • Reserve the final minutes for tests: null/empty, duplicates, extremes, and off-by-one boundaries.
  • Close with a one-sentence summary of tradeoffs and what you would monitor in production.

When burst tolerance goes sideways: recovery scripts that still score

ML and AI interviews increasingly test systems, not just models. Be ready to discuss data pipelines, evaluation beyond accuracy, latency budgets, failure modes, and cost. A model that is correct offline but too slow online is not shippable. Practice sketching a training-serving split, monitoring hooks, and rollback strategy — that is the engineering bar, not the latest paper.

Company-specific prep should stay ethical. You can study public interview guides, pattern frequencies, and how loops are structured. You should not seek live question dumps or share proprietary assessments. The goal is to reduce anxiety and calibrate effort, not to memorize answers you do not understand. Understanding travels; memorization shatters when the interviewer changes a constraint.

  • Restate the heart of "When burst tolerance goes sideways: recovery scripts that still score" and confirm inputs, outputs, and edge cases.
  • Propose a brute-force or baseline you can finish — name its complexity honestly.
  • Walk a hand trace on a small example; only then refactor toward the optimal structure.
  • Reserve the final minutes for tests: null/empty, duplicates, extremes, and off-by-one boundaries.
  • Close with a one-sentence summary of tradeoffs and what you would monitor in production.

A two-week drill plan with milestones tied to failure modes

Negotiation starts before the offer. The credible story is built throughout the process: scope you owned, impact you can quantify, and alternatives you are genuinely considering. If the first time you mention competing opportunities is after the number arrives, it feels tactical rather than factual. That does not mean playing games — it means being transparent about timeline and decision criteria when recruiters ask.

Observability is part of design, not an appendix. Metrics for latency percentiles, error budgets, tracing across services, and structured logs for debugging — pick two to emphasize based on the prompt. Staff interviewers want to know how you would operate what you designed.
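If you choose latency percentiles and error budgets as your two, it helps to be able to state how each is computed. The sketch below assumes the nearest-rank definition of a percentile and a simple request-based SLO; the function names and parameters are illustrative, and production systems usually rely on histograms rather than sorting raw samples.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile for 0 < p <= 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def error_budget_remaining(total, failed, slo):
    """Fraction of the error budget left; negative means overspent.

    slo is a success-ratio target below 1.0, for example 0.999.
    """
    allowed = (1 - slo) * total  # failures the SLO tolerates
    return 1 - failed / allowed
```

With one slow outlier in ten samples, the median barely moves while the high percentiles jump to the outlier, which is the usual argument for alerting on p99 rather than on the mean.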

  • Restate the heart of "A two-week drill plan with milestones tied to failure modes" and confirm inputs, outputs, and edge cases.
  • Propose a brute-force or baseline you can finish — name its complexity honestly.
  • Walk a hand trace on a small example; only then refactor toward the optimal structure.
  • Reserve the final minutes for tests: null/empty, duplicates, extremes, and off-by-one boundaries.
  • Close with a one-sentence summary of tradeoffs and what you would monitor in production.

Day-of checklist: observability and abuse, timeboxing, and how to close strong

SQL interviews reward clarity of thought over clever hacks. Window functions, CTEs, and careful joins solve most analytics questions without subquery soup. If your query is five levels deep, pause and ask whether a window can express the ranking or running metric directly. Explain null handling before your interviewer has to ask — it signals production experience.
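To make the window-function point concrete, here is a small sketch that runs a running total and a ranking in one query, with explicit null handling. The `orders` table and its rows are invented for illustration, and the example assumes Python's bundled SQLite is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3


def running_totals():
    """Per-customer running total and rank, treating NULL amounts as 0."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, day TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            ("ann", "2024-01-01", 50),
            ("ann", "2024-01-02", None),  # missing amount, handled below
            ("ann", "2024-01-03", 20),
            ("bob", "2024-01-01", 70),
        ],
    )
    query = """
        SELECT customer, day, amount,
               SUM(COALESCE(amount, 0)) OVER (
                   PARTITION BY customer ORDER BY day
               ) AS running_total,
               RANK() OVER (
                   PARTITION BY customer ORDER BY COALESCE(amount, 0) DESC
               ) AS amount_rank
        FROM orders
        ORDER BY customer, day
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows
```

Both metrics come from a single pass over the table with no nested subqueries, and the `COALESCE` calls state the null policy before anyone has to ask.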

Start every design with users and workloads. Who reads, who writes, and what latency matters? Without those anchors, caching and sharding discussions float uselessly. A social feed and a payment ledger have different consistency requirements — say that explicitly before drawing boxes.

The best prep materials are the ones you will actually use. A perfect curriculum that you abandon after four days loses to a decent curriculum you finish. Optimize for adherence: shorter sessions you can repeat, frictionless environments, and clear win conditions each session. Track streaks lightly — consistency beats intensity spikes that vanish after finals week.

  • Restate the heart of "Day-of checklist: observability and abuse, timeboxing, and how to close strong" and confirm inputs, outputs, and edge cases.
  • Propose a brute-force or baseline you can finish — name its complexity honestly.
  • Walk a hand trace on a small example; only then refactor toward the optimal structure.
  • Reserve the final minutes for tests: null/empty, duplicates, extremes, and off-by-one boundaries.
  • Close with a one-sentence summary of tradeoffs and what you would monitor in production.
