Most candidates fail system design interviews not because they cannot design systems, but because they spend forty-five minutes on the diagram and zero minutes on the conversation the interviewer is actually scoring. The rubric is not 'is this design correct.' The rubric is a checklist of behaviors — clarification, prioritization, capacity reasoning, tradeoff articulation, depth-on-demand. This guide is the seven-step framework that maps to that rubric, calibrated for L4 through Staff+ loops at FAANG-tier companies. Use it as a discipline, not a script. The interviewer wants to see your judgment under time pressure, not a memorized recital.
What the interviewer is actually scoring
Before the framework, understand the rubric. At every major US tech company the system design rubric covers roughly five areas: requirements gathering, scope and tradeoff judgment, technical depth on at least one component, articulation of the design under load and failure, and senior-coded behaviors like asking before assuming, narrating tradeoffs out loud, and pushing back on constraints. None of those scores are about how pretty the diagram is.
If you remember nothing else from this article: every five minutes, ask yourself which rubric box you have not earned in the last five minutes. That single habit will lift most candidates one half-level on the rubric without changing what they technically know.
Step 1 — Requirements: functional, non-functional, out of scope
Spend the first five to seven minutes on requirements. This is not optional. The interviewer wants three lists: functional requirements (the user-visible behaviors), non-functional requirements (latency targets, availability targets, consistency expectations, scale), and explicit out-of-scope items (auth, billing, abuse, internationalization — whatever you are not going to design).
- Functional: 'Users can post a tweet of up to 280 characters. Tweets are visible to followers within five seconds. Users can follow and unfollow.' Three to five concrete behaviors.
- Non-functional: 'P99 read latency under 200ms. Five-nines availability for reads, four-nines for writes. Eventually consistent fan-out is acceptable.' Numbers, not adjectives.
- Out of scope: 'I will not design auth, monetization, abuse detection, or media transcoding.' Naming exclusions is a senior signal.
End this step by saying out loud: 'Are we aligned on those? Anything you want me to drop or add?' That single sentence converts a monologue into a calibration. The interviewer will often nudge you toward the part of the design they actually care about, and you will save fifteen minutes of designing the wrong thing.
Step 2 — Capacity math: do it on the board
Capacity math is where most candidates either skip the step entirely or do it silently in their head. Both are scoring failures. Write the numbers on the whiteboard. The interviewer cannot give you credit for math they did not see.
| Quantity | Estimate | How |
|---|---|---|
| Daily active users | 300M | given or assumed |
| Tweets per user per day | 0.5 | industry rule of thumb |
| Writes per second | ~1,700 | 300M × 0.5 / 86,400 |
| Read:write ratio | ~100:1 | given |
| Reads per second | ~170,000 | writes × 100 |
| Tweet payload | ~1 KB | 280 chars + metadata |
| Daily storage | ~150 GB/day | writes/sec × payload × 86,400 |
| Five-year storage | ~270 TB | 150 GB × 365 × 5 |
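The arithmetic in the table is worth being able to reproduce instantly. A minimal script that derives every row from the two assumed inputs (300M DAU, 0.5 tweets per user per day):

```python
# Back-of-envelope capacity math for the Twitter-style example above.
# All inputs are the assumed figures from the table, not measured data.
DAU = 300_000_000
TWEETS_PER_USER_PER_DAY = 0.5
READ_WRITE_RATIO = 100
PAYLOAD_BYTES = 1_000          # ~1 KB: 280 chars plus metadata
SECONDS_PER_DAY = 86_400

writes_per_sec = DAU * TWEETS_PER_USER_PER_DAY / SECONDS_PER_DAY
reads_per_sec = writes_per_sec * READ_WRITE_RATIO
daily_storage_gb = writes_per_sec * PAYLOAD_BYTES * SECONDS_PER_DAY / 1e9
five_year_storage_tb = daily_storage_gb * 365 * 5 / 1_000

print(f"{writes_per_sec:,.0f} writes/s, {reads_per_sec:,.0f} reads/s")
print(f"{daily_storage_gb:,.0f} GB/day, {five_year_storage_tb:,.0f} TB over five years")
```

Round aggressively on the board: ~1,700 writes/s, ~170k reads/s, ~150 GB/day, ~270 TB over five years. The interviewer is scoring the method, not the third significant digit.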
Step 3 — API contract and data model
Sketch the public API in three to five endpoints. Decide REST or gRPC, name the verbs, list the request and response shapes. This forces you to commit to a contract before you draw the diagram, which is exactly the order real engineering teams work in. It also exposes design questions early: idempotency, pagination, consistency expectations, error semantics.
```
POST   /v1/tweets { text }            -> { id, created_at }
GET    /v1/tweets/:id                 -> { id, author, text, created_at }
GET    /v1/users/:id/timeline?cursor= -> { tweets[], next_cursor }
POST   /v1/users/:id/follow           -> 204
DELETE /v1/users/:id/follow           -> 204
```

Then sketch the data model. Tables or collections, the primary key, the secondary indexes you will need, and — critically — which queries each index serves. The data model is where consistency, sharding, and read-path decisions get made. Get it right and the rest of the design falls out of it.
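To make the data-model step concrete, here is one possible shape for the two core tables in the example. The field names and key choices are illustrative assumptions, not a prescribed schema; the point is that every key and index is annotated with the query it serves:

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    """Tweet store row, sharded by author."""
    id: str          # primary key; a time-sortable ID keeps timelines cheap to merge
    author_id: str   # partition/shard key; serves "all tweets by this author"
    text: str        # up to 280 characters
    created_at: int  # epoch seconds


@dataclass
class Follow:
    """Follow graph; one row per edge."""
    follower_id: str  # partition key; serves "who does X follow" (timeline build)
    followee_id: str  # secondary index; serves "who follows X" (fan-out)
```

Saying the annotation out loud ("this index exists to serve the fan-out query") is exactly the index-to-query mapping the rubric rewards.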
Step 4 — High-level design: name every box
Now the diagram. Client, edge (CDN, load balancer), API gateway, the application service tier, the data tier, and any async pipeline (queue, stream, worker). Every box gets a name and a one-sentence purpose. Avoid 'cache' as a label — say which cache (Redis cluster, in-memory LRU, CDN edge cache) and which keys it holds.
- Edge: CDN for static assets, load balancer for L7 routing.
- Gateway: auth, rate limiting, request shaping.
- Service tier: stateless app servers behind the gateway.
- Cache: Redis cluster keyed by user_id holding fanned-out timelines.
- Storage: Cassandra (or DynamoDB) for tweets, sharded by author_id.
- Async: Kafka topic 'tweet.posted' driving the fan-out worker.
Walk one user request end to end at this point. 'A user posts a tweet. The request hits the load balancer, terminates TLS, routes to the gateway, the gateway authenticates and forwards to the tweet-write service, which writes to the tweet store and emits to Kafka. The fan-out worker reads from Kafka and writes the tweet ID into each follower's timeline cache.' That sixty seconds of narration earns rubric points across several areas at once.
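The write path narrated above can be sketched in a few lines. This is a toy, assuming stand-in store and queue interfaces; the `put` and `emit` methods are hypothetical, not a real client API:

```python
import json
import time
import uuid


class TweetWriteService:
    """Minimal sketch of the tweet-write service from the walkthrough above."""

    def __init__(self, tweet_store, event_queue):
        self.tweet_store = tweet_store   # stand-in for a Cassandra/DynamoDB client
        self.event_queue = event_queue   # stand-in for a Kafka producer

    def post_tweet(self, user_id: str, text: str) -> dict:
        if len(text) > 280:
            raise ValueError("tweet exceeds 280 characters")
        tweet = {
            "id": str(uuid.uuid4()),
            "author": user_id,
            "text": text,
            "created_at": int(time.time()),
        }
        self.tweet_store.put(tweet)  # durable write first
        # Then the async hand-off: the fan-out worker consumes this topic.
        self.event_queue.emit("tweet.posted", json.dumps(tweet))
        return tweet
```

Note the ordering decision baked in: durable write before the event emit. Naming that ordering, and what happens if the emit fails after the write succeeds, is itself a deep-dive hook.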
Step 5 — Deep dives: pick one or two, go deep
The interviewer will steer you toward one component to drill into. Common deep-dive surfaces: the fan-out strategy (push vs pull vs hybrid), the database choice and sharding key, the cache invalidation model, the rate limiter, the consensus or replication choice. Pick one or two and go four levels deep — enough to discuss specific failure modes, specific tradeoffs, and at least one concrete number you have computed.
Hybrid fan-out is the canonical example: push to most followers' timeline caches at write time, but for celebrity authors with millions of followers, pull at read time and merge. Naming the threshold (say, 100k followers), the cost of each side, and the operational tradeoff (cache memory vs read-path complexity) is the L5+ depth.
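The hybrid decision itself is a few lines of code, and writing it on the board makes the threshold explicit. The 100k cutoff and the cache/feed interfaces here are illustrative assumptions, not a prescribed API:

```python
# Hybrid fan-out: push to follower timeline caches for normal authors,
# record once for celebrities and let readers pull-and-merge.
CELEBRITY_FOLLOWER_THRESHOLD = 100_000  # assumed cutoff; tune per workload


def fan_out(tweet: dict, follower_count: int, follower_ids: list,
            timeline_cache, celebrity_feed) -> None:
    if follower_count >= CELEBRITY_FOLLOWER_THRESHOLD:
        # Pull model: one write; followers merge this feed at read time.
        celebrity_feed.append(tweet["author"], tweet["id"])
    else:
        # Push model: write amplification proportional to follower count,
        # in exchange for a precomputed, low-latency read path.
        for follower_id in follower_ids:
            timeline_cache.prepend(follower_id, tweet["id"])
```

The cost asymmetry is visible in the loop: push pays O(followers) at write time, pull pays a merge on every read. That is the tradeoff sentence the interviewer is listening for.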
Step 6 — Failure modes and load behavior
Now stress the design. What happens when the cache cluster loses a node? When the fan-out queue backs up by ten million messages? When a thundering herd hits a cold cache after deploy? When the primary database fails over? You do not need to handle every case — you need to articulate the behavior and name the tradeoff.
1. Walk one failure scenario per major component.
2. State the user-visible impact (degraded reads? full outage? stale timeline?).
3. State the recovery mechanism (replica promotion, retry with backoff, drain-and-replay).
4. State the operational cost (alerting, runbook, blast radius).
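Of the recovery mechanisms named above, retry with backoff is the one worth being able to sketch on demand, because a naive retry loop is itself a failure mode (it feeds cascading retries). A minimal version with exponential backoff and full jitter; the parameter values are illustrative:

```python
import random
import time


def retry_with_backoff(op, max_attempts: int = 5,
                       base: float = 0.1, cap: float = 5.0):
    """Run op(), retrying transient failures with exponential backoff.

    Full jitter (random sleep up to the exponential cap) spreads retries
    out so a fleet of clients does not hammer a recovering service in
    lockstep -- the thundering-herd case named above.
    """
    for attempt in range(max_attempts):
        try:
            return op()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of budget: surface the error to the caller
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

Mentioning the jitter, and why synchronized retries without it amplify an outage, is the kind of one-sentence depth that separates "I know the word backoff" from "I have operated this."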
Then load: walk a 10x traffic spike. Where does the system bend, where does it break, what is the failure mode (latency degradation, dropped writes, cascading retries), and what is the lever to add capacity (horizontal scale, cache size, shard count)?
Step 7 — Tradeoffs and explicit follow-ups
Close the interview by enumerating tradeoffs out loud. 'I chose push fan-out for the read latency win, accepting higher write amplification and cache memory cost. I chose Cassandra over Postgres for the write throughput and horizontal scale, accepting weaker secondary index support and more application-side aggregation logic.' Each sentence is a rubric point.
Then list two or three things you would do with more time: 'I would design the abuse pipeline, I would think harder about the cold-start case for new users with zero followers, and I would propose a three-month rollout sequence that ships the read path first and the fan-out worker behind a flag.' This signals you are aware of the gap between a whiteboard and a real production system.
> “I have written three hundred system design interview rubrics. The candidates who get the offer are not the ones with the cleanest diagram. They are the ones who, at the forty-minute mark, name three things they consciously decided not to design and explain the reasoning.”
Five anti-patterns that quietly downcode you
Even candidates who follow the framework lose points to a handful of recurring anti-patterns. Most of them are habitual, learned from years of reading whiteboard solutions online. Watch yourself for these — they are correctable in two practice sessions.
1. Drawing the diagram before naming requirements. Looks decisive, reads as 'pattern matched without thinking.'
2. Saying 'add a cache' without naming what is cached, where it lives, what evicts it, and how it is invalidated. A cache without a coherent invalidation story is worse than no cache.
3. Using 'eventual consistency' as a magic word. Always state the consistency window (seconds, minutes), the user-visible behavior during the window, and how a user observes convergence.
4. Picking exotic technology to look senior. Reaching for Cassandra when Postgres would do reads to interviewers as 'cargo-culted modernity,' the opposite of the signal you want.
5. Refusing to commit. 'I'd consider both Postgres and Cassandra' is a non-answer. Pick one, name the tradeoff, move on. The interviewer will challenge if they want.
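Anti-pattern 2 is the easiest to fix, because the full invalidation story fits in a dozen lines. A toy cache that names all four things — what is cached, where it lives, what evicts it (TTL checked on read), and what invalidates it (the write path calling `invalidate`) — as an illustrative sketch, not a production design:

```python
import time


class TimelineCache:
    """Toy in-process cache keyed by user_id, holding timeline entries."""

    def __init__(self, ttl_seconds: float = 30.0):
        self.ttl = ttl_seconds
        self._data = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:  # TTL eviction, checked on read
            del self._data[key]
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, time.monotonic() + self.ttl)

    def invalidate(self, key):
        # Called from the write path: a new tweet invalidates the
        # affected followers' cached timelines.
        self._data.pop(key, None)
```

In the interview you would name Redis rather than an in-process dict, but the sentence structure is the same: key, location, eviction, invalidation.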
Every one of these is a habit you can break. Record yourself running through a design out loud, listen for the pattern, and rerun it cleanly. The next session you do, the habit will be gone — and you will sound a half-level more senior without changing what you actually know.
Stop grinding. Start patterning.
Alpha Code is a patterns-first interview prep platform — coding, system design, behavioral, mocks, and ML/AI engineering all under one $19/mo subscription.