ML system design — the loop the industry under-teaches.

TL;DR

ML system design interviews ask how you would build a recommendation engine, a search ranker, a fraud detector, or an LLM product. Interviewers reward pipeline thinking (feature → train → serve → monitor), rigorous evaluation, and candid discussion of the tradeoffs among latency, cost, and model drift.

The four pillars

The four pillars are the data pipeline (ingest, dedupe, feature store), training (offline eval, retraining cadence), serving (online latency, A/B framework), and monitoring (drift, skew, SLOs). Expect an interview to touch all four; a candidate who skips one looks junior.

Retrieval + ranking

Most modern systems (search, recs, RAG) are a two-stage retrieval + ranking pipeline. Know the tradeoffs: ANN vs BM25 recall, cross-encoder vs bi-encoder latency, cascading rankers, and cold-start strategies.
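
The two-stage shape can be sketched in a few lines. This is a toy illustration under stated assumptions, not a production design: token overlap stands in for BM25 or an ANN lookup, and a Jaccard score stands in for a cross-encoder. All function names (retrieve, rerank, search) are hypothetical.

```python
from typing import Callable


def tokenize(text: str) -> set[str]:
    """Lowercase whitespace tokenizer; a real system would do far more."""
    return set(text.lower().split())


def retrieve(query: str, corpus: dict[str, str], k: int) -> list[str]:
    """Stage 1: cheap, recall-oriented scoring over the whole corpus.

    Token overlap is a stand-in (assumption) for BM25 or an ANN index.
    """
    q = tokenize(query)
    scored = [(len(q & tokenize(doc)), doc_id) for doc_id, doc in corpus.items()]
    scored = [(score, doc_id) for score, doc_id in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best first, ties by id
    return [doc_id for _, doc_id in scored[:k]]


def jaccard_scorer(query: str, doc: str) -> float:
    """Stand-in (assumption) for an expensive model such as a cross-encoder."""
    q, d = tokenize(query), tokenize(doc)
    return len(q & d) / len(q | d) if (q | d) else 0.0


def rerank(
    query: str,
    candidates: list[str],
    corpus: dict[str, str],
    scorer: Callable[[str, str], float],
    n: int,
) -> list[str]:
    """Stage 2: precision-oriented scoring, run only on the k candidates."""
    scored = [(scorer(query, corpus[doc_id]), doc_id) for doc_id in candidates]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:n]]


def search(query: str, corpus: dict[str, str], k: int = 100, n: int = 10) -> list[str]:
    """Retrieve k candidates cheaply, then return the top n after reranking."""
    return rerank(query, retrieve(query, corpus, k), corpus, jaccard_scorer, n)
```

The point to narrate is the cost split: stage 1 touches every document and must be cheap, while stage 2 can afford a slower scorer because it only sees k candidates. Tuning k trades recall against ranking latency.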

LLM-system variants

RAG, agents, tool-use pipelines, eval harnesses. Narrate cost ($ per request), latency budgets, hallucination mitigation, and privacy constraints. LLM-system interviews are the fastest-growing loop in 2026.
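
The cost and latency narration comes down to simple arithmetic, sketched below. Everything here is an assumption for illustration: the LLMCall type, the per-token prices passed to it, and the idea that stages run sequentially so their latencies add up. Substitute your provider's real prices and your measured stage timings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMCall:
    """One model call within a request. Prices are hypothetical inputs."""

    input_tokens: int
    output_tokens: int
    usd_per_1k_input: float
    usd_per_1k_output: float

    def cost_usd(self) -> float:
        # Input and output tokens are usually priced separately.
        return (
            self.input_tokens / 1000 * self.usd_per_1k_input
            + self.output_tokens / 1000 * self.usd_per_1k_output
        )


def request_cost_usd(calls: list[LLMCall]) -> float:
    """Total cost of one user request, which may fan out to several calls."""
    return sum(call.cost_usd() for call in calls)


def monthly_cost_usd(calls: list[LLMCall], requests_per_month: int) -> float:
    """Scale per-request cost to a traffic estimate."""
    return request_cost_usd(calls) * requests_per_month


def within_latency_budget(stage_ms: dict[str, float], budget_ms: float) -> bool:
    """Assumes stages run one after another, so latencies are summed."""
    return sum(stage_ms.values()) <= budget_ms
```

For example, a single call with 2,000 input tokens and 500 output tokens at made-up prices of $0.001 and $0.002 per 1k tokens costs about $0.003 per request, or roughly $3,000 a month at one million requests. Stating the numbers this way makes it easy to discuss levers such as shorter prompts, caching, or a smaller model.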

Monitoring and rollout

Know the rollout and monitoring toolkit: shadow mode, canary releases, A/B tests with a holdout, drift detectors on feature distributions, and checks for offline/online eval skew. A statement like 'I'd monitor X metric against Y threshold and roll back on breach' scores because it commits to a specific number.
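
One way to make 'X metric against Y threshold' concrete is a drift check on a single feature. The sketch below uses the population stability index (PSI) as the metric; that choice, the equal-width binning, the function names, and the 0.2 threshold are all assumptions for illustration, and the threshold in particular should be tuned per feature.

```python
import math
from typing import Sequence


def psi(
    expected: Sequence[float],
    actual: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Population stability index between a baseline and a live sample.

    Bins are equal-width over the baseline's range; live values outside
    that range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant features

    def fractions(values: Sequence[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def should_roll_back(psi_value: float, threshold: float = 0.2) -> bool:
    """Hypothetical rollback rule: breach when PSI exceeds the threshold."""
    return psi_value > threshold
```

In an interview, the equivalent sentence is 'I'd compute PSI on the top features hourly against the training baseline and roll back the canary if any exceeds the agreed threshold.' The same pattern applies to online metrics such as click-through rate or p99 latency.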

Frequently asked questions

What's an ML system design interview?
A variant of system design where the service you're building is ML-powered: a recommender, a ranker, a fraud detector, or an LLM product. It layers data, training, and eval on top of the usual serving architecture.

How is ML system design different from regular system design?
You must explicitly cover the feature pipeline, offline evaluation, rollout strategy, and drift monitoring. Leaving any of these out reads as 'this person has never shipped ML to production'.

Does Alpha Code cover LLM / RAG system design?
Yes. RAG retrieval, agent orchestration, eval harnesses, and cost/latency tradeoff narrative are covered in this track.

What companies ask ML system design?
Meta, Google, Databricks, Stripe, and AI-first scale-ups such as Anthropic, OpenAI, Cohere, and Mistral. ML engineering loops almost always include one ML system design round.

Practice this live.

Book a 45-minute system design mock with a transcript. Included in the $19/mo subscription.