Candidates fail system design interviews in a specific, predictable way: they jump to solutions before understanding the problem. They pick a technology before defining the scale. They talk about databases before clarifying what queries the system needs to handle.
This guide gives you the framework, the 20 questions you're most likely to face, and the specific things FAANG interviewers are grading that most candidates never address.
The framework: how to structure every system design answer
Use this structure regardless of the question. It's not a rigid script — it's a checklist to ensure you cover what interviewers grade.
Step 1: Clarify requirements (5–7 minutes)
Never skip this. The worst system designs start with "OK so I'll build a distributed database."
Functional requirements: What does the system do? What are the core user actions?
- For URL shortener: "Users can create a short URL. Users can click a short URL and be redirected."
- For Twitter: "Users can post tweets. Users can follow other users. Users can see a timeline."
Non-functional requirements: How does it do it?
- Scale: How many users? DAU? Reads per second? Writes per second?
- Latency: p99 latency requirement? Is this a real-time system (ms) or batch (seconds/minutes)?
- Availability: 99.9% (about 8.8 hours of downtime per year)? 99.99% (about 53 minutes)? Multi-region?
- Consistency: Strong consistency required (banking), or eventual is fine (social feed)?
- Durability: Can we lose writes? (almost always: no)
Out of scope: explicitly exclude features you won't design. "I'll focus on read/write of tweets and the home timeline; I'll leave notifications, search, and ads out of scope."
Step 2: Capacity estimation (3–5 minutes)
Back-of-envelope numbers to size your design decisions.
- Writes per second: DAU × actions/day ÷ 86,400
- Reads per second: DAU × read_actions/day ÷ 86,400
- Storage per year: bytes per record × writes/sec × 86,400 × 365
This tells you: Do you need sharding? Do you need a CDN? Do you need read replicas? Is caching a 10× win or a 2× win?
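As a worked example of the formulas above, here is a minimal sketch. The inputs (10 million DAU, 2 writes and 100 reads per user per day, 500 bytes per record) are assumptions chosen for illustration, not numbers from any real system.

```python
SECONDS_PER_DAY = 86_400


def per_second(dau: int, actions_per_user_per_day: float) -> float:
    """Average requests per second implied by daily activity."""
    return dau * actions_per_user_per_day / SECONDS_PER_DAY


def storage_per_year_bytes(bytes_per_record: int, writes_per_sec: float) -> float:
    """Raw storage added per year, before replication or indexes."""
    return bytes_per_record * writes_per_sec * SECONDS_PER_DAY * 365


# Assumed inputs, chosen only for illustration.
DAU = 10_000_000
writes_per_sec = per_second(DAU, 2)    # 2 writes per user per day
reads_per_sec = per_second(DAU, 100)   # 100 reads per user per day
yearly_bytes = storage_per_year_bytes(500, writes_per_sec)

print(f"writes/sec ~ {writes_per_sec:,.0f}")
print(f"reads/sec  ~ {reads_per_sec:,.0f}")
print(f"storage/yr ~ {yearly_bytes / 1e12:.2f} TB")
```

With these assumed inputs the averages come out to roughly 230 writes/sec, 11,600 reads/sec, and 3.65 TB per year. That points toward a read-heavy design where caching and read replicas matter more than write sharding. Peak traffic is usually a multiple of the average, so state the multiplier you assume.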
Step 3: High-level design (10 minutes)
Draw the major components. A typical web system:
- Client → Load balancer → Application servers → Database
- Add Cache (Redis/Memcached) between app servers and database
- Add CDN for static assets and geo-distributed reads
- Add Message queue (Kafka/SQS) for async writes, event streaming
- Add Storage (S3/Blob) for media/large objects
State your assumptions: "I'll assume horizontally scalable stateless app servers behind a load balancer."
Step 4: Deep dive (15 minutes)
Pick 2–3 components and go deep. Interviewers will guide you — follow their lead. Common deep-dives:
- Database schema and indexing strategy
- Caching strategy (what to cache, eviction policy, cache invalidation)
- Sharding strategy (how to partition data, how to handle hot keys)
- Message queue design (topic partitioning, consumer groups, at-least-once vs. exactly-once)
- API design (REST vs. gRPC, pagination, rate limiting)
Step 5: Address bottlenecks and trade-offs (5 minutes)
Every design has trade-offs. State them. Interviewers grade "trade-off awareness" explicitly.
- "SQL gives us strong consistency but horizontal scaling is harder — I'd use it here because our write volume is modest"
- "Many NoSQL stores scale horizontally easily but offer weaker or more limited transactions; that's acceptable here because we're write-heavy and reads can be eventually consistent"
- "A cache improves read latency by 10× but introduces cache invalidation complexity"
The 20 system design questions
Foundational (most common, good starting points)
1. Design a URL shortener (bit.ly) Core challenge: generating short unique codes, high read throughput (redirects >> creates), analytics counting. Key components: hash generation (base62 encoding of auto-increment ID), redirect database (key-value store works well), CDN for global redirect performance, analytics via async event streaming.
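The base62 step mentioned above can be sketched as follows. The alphabet ordering (digits, then lowercase, then uppercase) is an assumption; any fixed ordering of 62 characters works as long as encoder and decoder agree.

```python
import string

# Assumed alphabet order: 0-9, a-z, A-Z (62 characters).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(n: int) -> str:
    """Turn a non-negative auto-increment ID into a short code."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, remainder = divmod(n, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Recover the numeric ID from a short code."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

Seven base62 characters cover about 3.5 trillion IDs (62^7). One trade-off worth naming: sequential IDs produce guessable codes, so some designs shuffle or offset the ID before encoding.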
2. Design a rate limiter Choose algorithm: token bucket (allows bursts), leaky bucket (smooths traffic), sliding window log (accurate, memory-heavy), sliding window counter (good balance). Implementation: Redis with INCR + EXPIRE for a simple fixed-window counter, or a Lua script when the check-and-update must be atomic. Distributed vs. per-node: per-node limiting is simpler, but with 10 nodes behind a load balancer a client can reach up to 10× the intended threshold.
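A minimal single-process sketch of the token bucket algorithm follows. The class name and the injectable `clock` parameter are illustrative choices. A distributed limiter would keep this state in a shared store such as Redis rather than in memory.

```python
import time


class TokenBucket:
    """In-memory token bucket: bursts up to `capacity`, refills steadily."""

    def __init__(self, capacity: float, refill_rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self.clock = clock              # injectable so tests can control time
        self.tokens = capacity          # start full
        self.last_refill = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then spend `cost` tokens if available."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `TokenBucket(capacity=2, refill_rate=1.0)`, a client can send two requests back to back, is rejected on the third, and regains one request per second after that.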
3. Design a key-value store This IS the deep-dive and covers everything: consistent hashing for node assignment, replication factor, quorum reads/writes (N=3, W=2, R=2, so W + R > N and every read quorum overlaps the most recent write quorum), gossip protocol for membership, Merkle trees for anti-entropy. Know Dynamo-style architecture.
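Consistent hashing and the quorum rule can be sketched together. This is a simplified illustration, not Dynamo's actual implementation. The virtual-node count, the SHA-256 hash, and all names are assumptions.

```python
import bisect
import hashlib


def _position(key: str) -> int:
    """Stable position on the ring (SHA-256 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes: int = 100):
        # Each physical node gets `vnodes` points to even out the load.
        self._ring = sorted(
            (_position(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def preference_list(self, key: str, n: int = 3):
        """First n distinct nodes clockwise from the key: its replica set."""
        start = bisect.bisect_right(self._points, _position(key))
        replicas = []
        for i in range(len(self._ring)):
            node = self._ring[(start + i) % len(self._ring)][1]
            if node not in replicas:
                replicas.append(node)
                if len(replicas) == n:
                    break
        return replicas


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """True when any read quorum must share a replica with any write quorum."""
    return w + r > n
```

Adding or removing one node only moves the keys adjacent to that node's points on the ring, which is the property to call out when the interviewer asks about rebalancing.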
4. Design a web crawler Frontier (URL queue), fetcher (politeness rules, robots.txt, rate limiting per domain), parser (URL extraction + content indexing), deduplication (URL normalization + content fingerprinting). Scale challenge: distributed crawl across thousands of machines without crawling the same URL twice.
5. Design a notification system Event producer → message queue → notification dispatcher → delivery channels (push/SMS/email). Key: fan-out to millions of subscribers, per-user delivery preferences, deduplication (don't send same notification twice), retry with exponential backoff, delivery receipts.
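Retry with exponential backoff might look like the sketch below. It uses "full jitter" (a random delay up to the exponential ceiling), which is one common variant. The function names and default values are assumptions.

```python
import random
import time


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  rng=random.random) -> float:
    """Random delay in [0, min(cap, base * 2**attempt)) seconds."""
    return rng() * min(cap, base * 2 ** attempt)


def deliver_with_retry(send, max_attempts: int = 5, sleep=time.sleep) -> bool:
    """Call send() until it succeeds or attempts run out.

    Returns True on success, False if every attempt raised. A real
    dispatcher would retry only on errors known to be transient and
    move exhausted messages to a dead-letter queue.
    """
    for attempt in range(max_attempts):
        try:
            send()
            return True
        except Exception:
            if attempt < max_attempts - 1:
                sleep(backoff_delay(attempt))
    return False
```

The jitter spreads out retries from many clients so that a recovering downstream provider is not hit by synchronized waves of traffic.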
Social and content platforms
6. Design Twitter / X timeline The canonical fan-out problem. Two approaches:
- Fan-out on write (push): on tweet, push to all followers' timeline caches. Fast reads, but celebrities with 100M followers make writes expensive.
- Fan-out on read (pull): on timeline request, merge tweets from all followees. Always fresh, but slow reads for users who follow many accounts.
- Hybrid: fan-out on write for normal users, fan-out on read for celebrities. Twitter has publicly described an approach along these lines.
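The hybrid approach can be illustrated with a small in-memory sketch. It is a toy model under stated assumptions: the follower threshold is hypothetical, and in-memory dicts stand in for what would be caches and databases.

```python
from collections import defaultdict


class Timelines:
    def __init__(self, celebrity_threshold: int = 10_000):
        self.threshold = celebrity_threshold   # hypothetical cutoff
        self.followers = defaultdict(set)      # author -> users following them
        self.following = defaultdict(set)      # user -> authors they follow
        self.inbox = defaultdict(list)         # user -> tweets pushed at write time
        self.outbox = defaultdict(list)        # author -> their own tweets

    def follow(self, user: str, author: str) -> None:
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author: str) -> bool:
        return len(self.followers[author]) >= self.threshold

    def post(self, author: str, ts: int, text: str) -> None:
        tweet = (ts, author, text)
        self.outbox[author].append(tweet)
        if not self.is_celebrity(author):
            # Fan-out on write: copy into each follower's inbox.
            for follower in self.followers[author]:
                self.inbox[follower].append(tweet)

    def home_timeline(self, user: str, limit: int = 20):
        # A set removes duplicates if an author crossed the threshold
        # after some of their tweets had already been pushed.
        tweets = set(self.inbox[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                # Fan-out on read: pull celebrity tweets at request time.
                tweets.update(self.outbox[author])
        return sorted(tweets, reverse=True)[:limit]
```

The read path does extra work only for the handful of celebrity accounts a user follows, while the write path never touches millions of inboxes for a single tweet.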
7. Design Instagram / photo sharing Photo upload → object storage (S3) → async thumbnail generation → CDN distribution. Feed: social graph service → photo metadata service → pre-computed feed for active users. Key: photos are read-heavy (100× more reads than writes), CDN is essential.
8. Design YouTube / video streaming Video upload pipeline: chunked upload → transcoding to multiple bitrates (HLS/DASH) → CDN distribution. Streaming: adaptive bitrate — client switches quality based on bandwidth. Scale: transcoding is compute-intensive, use dedicated worker fleet with job queue.
9. Design a chat application (WhatsApp / Slack) WebSocket connections for real-time bidirectional communication. Message delivery: sender → server → persistent storage → deliver to recipient WebSocket. Offline delivery: push notification + message stored for retrieval on reconnect. Group chat: fan-out per message to all group members; for large groups, use message queue fan-out.
10. Design a social graph (Facebook friends / Twitter follows) Graph database vs. adjacency list in RDBMS vs. sharded graph representation. Follower/following reads must be fast — typically cache in Redis sorted sets. Graph traversal for "people you may know" runs as offline batch job, not real-time.
Infrastructure and platform
11. Design a distributed message queue (Kafka-style) Topic → partitions → replicas. Producer writes to partition leader, replicas follow. Consumer groups read from partitions in parallel. Key guarantee choices: at-least-once (the usual default), at-most-once, exactly-once (in Kafka, idempotent producers plus transactions; more generally, at-least-once delivery combined with idempotent consumers). Log compaction as an alternative to time-based retention: keep only the latest record per key.
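To make partitions, offsets, and at-least-once delivery concrete, here is a toy in-memory model. It is not Kafka's API. The class names and the CRC32 partitioner are assumptions for illustration.

```python
import zlib


class Topic:
    """Append-only log split into partitions; same key -> same partition."""

    def __init__(self, num_partitions: int):
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key: str) -> int:
        return zlib.crc32(key.encode()) % len(self.partitions)

    def produce(self, key: str, value: str):
        """Append a record and return its (partition, offset)."""
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1


class ConsumerGroup:
    """Tracks one committed offset per partition for the whole group."""

    def __init__(self, topic: Topic):
        self.topic = topic
        self.committed = [0] * len(topic.partitions)

    def poll(self, partition: int, max_records: int = 10):
        start = self.committed[partition]
        return self.topic.partitions[partition][start:start + max_records]

    def commit(self, partition: int, count: int) -> None:
        """Advance the offset only after the records were processed."""
        self.committed[partition] += count
```

Because the offset advances only on `commit`, a consumer that crashes after processing but before committing will see the same records again. That redelivery is what "at-least-once" means, and it is why consumers need to be idempotent.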
12. Design a distributed cache (Redis Cluster) Partition keys across nodes: consistent hashing in general, though Redis Cluster itself maps keys to 16,384 fixed hash slots. Replication: primary-replica, where replicas can serve reads and one is promoted on primary failure. Eviction policies: LRU (general), LFU (for skewed access), TTL-based. Cache-aside vs. write-through vs. write-behind: a trade-off between consistency and performance.
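LRU eviction and the cache-aside read path can be sketched on a single node as below. The `load_from_db` callback is a hypothetical stand-in for the backing store. A distributed cache adds partitioning and replication on top of this.

```python
from collections import OrderedDict


class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the oldest entry


def cache_aside_read(cache: LRUCache, key, load_from_db):
    """Check the cache first; on a miss, load from the store and populate."""
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)  # hypothetical database lookup
        cache.put(key, value)
    return value
```

In cache-aside the application owns both steps, so a write to the database must also invalidate or update the cached entry. That is where the consistency trade-off shows up.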
13. Design a search autocomplete system (typeahead) Trie data structure in theory; in practice: prefix matching against pre-computed top-k completions stored in Redis sorted sets (ZRANGEBYLEX). Cold path: count query frequency, build top-k per prefix offline; warm path: serve from Redis. Personalization: blend global popularity with user-specific history.
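The offline "top-k per prefix" build can be sketched as follows, with a plain dict standing in for Redis. The sample queries, counts, and the `max_prefix_len` cap are made up for illustration.

```python
from collections import defaultdict


def build_prefix_index(query_counts: dict, k: int = 3, max_prefix_len: int = 10) -> dict:
    """Map each prefix to its k most frequent completions.

    Ties are broken alphabetically so the output is deterministic.
    """
    by_prefix = defaultdict(list)
    for query, count in query_counts.items():
        for end in range(1, min(len(query), max_prefix_len) + 1):
            by_prefix[query[:end]].append((count, query))
    return {
        prefix: [q for _, q in sorted(items, key=lambda cq: (-cq[0], cq[1]))[:k]]
        for prefix, items in by_prefix.items()
    }


# Hypothetical query log counts.
counts = {
    "system design": 50,
    "system of a down": 30,
    "sysadmin": 20,
    "syntax": 10,
}
index = build_prefix_index(counts, k=2)
print(index["sys"])
```

Serving is then a single key lookup per keystroke. The cost moves to the offline job, which has to rebuild or incrementally update the index as query frequencies change.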
14. Design a proximity service / Yelp Geospatial indexing: Geohash (convert lat/lng to string prefix, nearby = shared prefix), QuadTree, or PostGIS. The challenge: "nearby" searches near grid boundaries miss results — always search adjacent cells too. Rank results by distance + rating + relevance.
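The boundary problem is easiest to see with a plain fixed-size grid, used here as a simplified stand-in for Geohash cells. The cell size and all names are assumptions.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # assumed cell size in degrees (about 1 km of latitude)


def cell_of(lat: float, lng: float):
    """Grid cell containing the point."""
    return (math.floor(lat / CELL_DEG), math.floor(lng / CELL_DEG))


class GridIndex:
    def __init__(self):
        self.cells = defaultdict(list)  # cell -> [(name, lat, lng)]

    def add(self, name: str, lat: float, lng: float) -> None:
        self.cells[cell_of(lat, lng)].append((name, lat, lng))

    def nearby(self, lat: float, lng: float):
        """Candidates from the query's cell plus its 8 neighbours."""
        cx, cy = cell_of(lat, lng)
        results = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                results.extend(self.cells.get((cx + dx, cy + dy), []))
        return results
```

A place at latitude 0.0099 and a user at latitude 0.0101 are about 20 metres apart but fall in different cells, so a lookup restricted to the user's own cell would miss it. The candidates returned by `nearby` would then be filtered by exact distance and ranked.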
15. Design a ride-sharing dispatch system (Uber) Driver location service: drivers push GPS updates every 5 seconds → stored in geospatial index. Matching: when rider requests, find nearby available drivers → score by ETA + rating. Real-time tracking: WebSocket for live driver position update during ride. Challenges: driver state machine (available/matched/on-trip), preventing double-matching.
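One way to prevent double-matching is a compare-and-set transition on the driver's state, sketched here with an in-process lock. The transition table is an assumption. In a real system the same check would be a conditional update in a database or a store such as Redis.

```python
import threading
from enum import Enum


class DriverState(Enum):
    AVAILABLE = "available"
    MATCHED = "matched"
    ON_TRIP = "on_trip"


# Assumed legal transitions for this sketch.
ALLOWED = {
    (DriverState.AVAILABLE, DriverState.MATCHED),
    (DriverState.MATCHED, DriverState.ON_TRIP),
    (DriverState.MATCHED, DriverState.AVAILABLE),   # rider cancelled
    (DriverState.ON_TRIP, DriverState.AVAILABLE),   # trip finished
}


class DriverRegistry:
    def __init__(self):
        self._lock = threading.Lock()
        self._state = {}

    def register(self, driver_id: str) -> None:
        with self._lock:
            self._state[driver_id] = DriverState.AVAILABLE

    def transition(self, driver_id: str, expected: DriverState, new: DriverState) -> bool:
        """Move to `new` only if the driver is still in `expected`."""
        with self._lock:
            if self._state.get(driver_id) is not expected:
                return False
            if (expected, new) not in ALLOWED:
                return False
            self._state[driver_id] = new
            return True
```

If two riders are matched against the same driver concurrently, only the first `transition(..., AVAILABLE, MATCHED)` returns True. The second sees a stale expected state and the matcher moves on to the next candidate.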
Data and analytics
16. Design a metrics and monitoring system (Datadog-style) Time-series data model: (metric name, tags, timestamp, value). Write path: agents → message queue → aggregation layer → time-series DB (InfluxDB, Prometheus). Query path: time-series DB → downsampling → dashboards. Alerting: streaming evaluation of alert rules against incoming metrics.
17. Design a distributed task scheduler (Airflow-style) Task graph (DAG) definition, scheduler polls for tasks ready to run (dependencies met), dispatches to worker pool, workers execute and report status. Handling failures: retry policies per task, dead-letter queue for tasks that exceed retry limit. Distributed coordination: single scheduler is a SPOF — use leader election (ZooKeeper/etcd).
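The "dispatch tasks whose dependencies are met" loop can be sketched with Python's standard `graphlib` module (Python 3.9+). This is a single-process illustration, not how Airflow is implemented. The task names are hypothetical.

```python
from graphlib import TopologicalSorter


def run_dag(dependencies: dict, execute) -> list:
    """Run tasks in dependency order and return the execution order.

    `dependencies` maps each task to the set of tasks it depends on.
    A real scheduler would hand each ready batch to a worker pool and
    call done() as workers report success.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    order = []
    while sorter.is_active():
        # Everything returned by get_ready() could run in parallel.
        for task in sorted(sorter.get_ready()):
            execute(task)
            order.append(task)
            sorter.done(task)
    return order


# Hypothetical ETL pipeline.
pipeline = {
    "load": {"transform"},
    "transform": {"extract_a", "extract_b"},
}
print(run_dag(pipeline, execute=lambda task: None))
```

The two extract tasks become ready together, `transform` only after both finish, and `load` last.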
18. Design a payment processing system Exactly-once semantics required — double charges are catastrophic. Idempotency keys on payment requests. Two-phase commit or saga pattern for distributed transactions across payment processor + balance ledger + notification. Reconciliation job to catch discrepancies. PCI compliance implications on what data you can store.
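Idempotency keys can be sketched as below. The `charge` callback is a hypothetical stand-in for the payment processor call, and the in-memory dict stands in for a durable table with a unique constraint on the key.

```python
class PaymentService:
    def __init__(self, charge):
        self._charge = charge   # hypothetical call to the payment processor
        self._results = {}      # idempotency key -> stored response

    def pay(self, idempotency_key: str, account: str, amount_cents: int):
        """Charge at most once per key; replays return the stored response."""
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        result = self._charge(account, amount_cents)
        self._results[idempotency_key] = result
        return result
```

The client generates the key once per logical payment and reuses it on every retry, so a timeout followed by a retry cannot produce a second charge. This sketch ignores the window between charging and recording the result. A production design has to close that gap, for example by recording the key as "in progress" before calling the processor.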
19. Design a content delivery network (CDN) Points of presence (PoPs) globally. Origin pulls (cache-miss fetches from origin) vs. pre-warm for known viral content. Cache invalidation: TTL-based (simple) vs. event-driven purge (complex, fast). Routing: anycast IP addressing or geo/latency-based DNS to direct users to a nearby PoP.
20. Design a hotel / flight reservation system Inventory management (seats/rooms), holds (reserve inventory briefly while user completes checkout), two-phase commit to prevent overbooking. Read-heavy for search (cache aggressively), write-heavy for booking (ACID transactions required). Handling concurrency: optimistic locking vs. pessimistic locking trade-offs.
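Optimistic locking can be illustrated with a version counter. This in-memory sketch mirrors a conditional update of the form `UPDATE ... WHERE version = ?`. The class and field names are assumptions, and a real database performs the check and the write atomically.

```python
class Inventory:
    """One bookable resource (a room type or a flight) with a version number."""

    def __init__(self, available: int):
        self.available = available
        self.version = 0

    def read(self):
        """Return the current count and the version it was read at."""
        return self.available, self.version

    def reserve(self, quantity: int, expected_version: int) -> bool:
        """Succeed only if nobody changed the row since it was read."""
        if self.version != expected_version:
            return False  # stale read: caller must re-read and retry
        if self.available < quantity:
            return False  # sold out
        self.available -= quantity
        self.version += 1
        return True
```

Two users who both read "1 room left" at version 0 cannot both book it: the first reservation bumps the version, and the second fails and has to re-read. Optimistic locking works well when conflicts are rare. Under heavy contention for the same inventory, pessimistic locks avoid the repeated retries at the cost of blocking.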
What interviewers are actually grading
FAANG system design rubrics differ in wording, but they tend to share the same underlying dimensions:
| Dimension | What they look for |
|---|---|
| Problem scoping | Did you clarify before designing? |
| Scale intuition | Do your capacity estimates actually inform your design? |
| Component choice | Can you justify why you chose SQL vs. NoSQL, Redis vs. Memcached? |
| Trade-off awareness | Do you acknowledge what your choices cost? |
| Depth | Can you go deep on one component when asked? |
| Communication | Is your reasoning clear without being prompted? |
Most candidates hit 4/6. The ones who get offers hit 5–6/6.
FAQ
What's the most common system design interview question? URL shortener and Twitter timeline are the two most frequently reported across FAANG companies. They're popular precisely because they test different skills: URL shortener tests capacity estimation and read optimization; Twitter tests fan-out strategy and CAP theorem trade-offs.
How much math is expected for capacity estimation? Order-of-magnitude accuracy is sufficient. Interviewers know the exact numbers matter less than whether your estimates inform your design. Rounding 86,400 to 100K is fine. Getting the direction right (gigabytes vs. terabytes) is what matters.
Should I ask for a whiteboard or just talk through it? Always draw something, even on paper. Diagrams force clarity and give the interviewer something to react to. If in a virtual interview, use the coding platform's drawing tool or ask what's available.
Is it OK to say "I'd use a managed service like DynamoDB" instead of designing from scratch? Yes — and it's often the right answer. Senior engineers use managed services; they don't reinvent databases in job interviews. Name the service, explain why it fits, and be prepared to go one level deeper on the underlying trade-offs if asked.
How do you balance breadth vs. depth in 45 minutes? Following the framework above, spend roughly 20 minutes on breadth (clarification, estimation, high-level diagram), about 15 to 20 minutes on 2–3 deep dives, and reserve 5 minutes for trade-offs. If the interviewer keeps asking follow-up questions on one component, let them guide you: they're telling you that's what they care about.