System Design Interview Questions (2026): 20 Real Problems + How to Answer

Candidates fail system design interviews in a specific, predictable way: they jump to solutions before understanding the problem. They pick a technology before defining the scale. They talk about databases before clarifying what queries the system needs to handle.

This guide gives you the framework, the 20 questions you're most likely to face, and the specific things FAANG interviewers are grading that most candidates never address.



The framework: how to structure every system design answer

Use this structure regardless of the question. It's not a rigid script — it's a checklist to ensure you cover what interviewers grade.

Step 1: Clarify requirements (5–7 minutes)

Never skip this. The worst system designs start with "OK so I'll build a distributed database."

Functional requirements: What does the system do? What are the core user actions?

Non-functional requirements: How well must it do it? Pin down scale, latency, availability, consistency, and durability targets.

Out of scope: explicitly exclude features you won't design. "I'll focus on read/write of tweets and the home timeline; I'll leave notifications, search, and ads out of scope."

Step 2: Capacity estimation (3–5 minutes)

Back-of-envelope numbers to size your design decisions.

Writes per second: DAU × actions/day ÷ 86,400
Reads per second: DAU × read_actions/day ÷ 86,400
Storage per year: bytes per record × writes/sec × 86,400 × 365
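As a worked example of the three formulas above, here is a small sketch. The inputs (10M DAU, 2 writes and 100 reads per user per day, 1 KB records) are made-up assumptions for illustration, not figures from any real system.

```python
SECONDS_PER_DAY = 86_400


def per_second(dau: int, actions_per_user_per_day: float) -> float:
    """Average requests per second for a given daily activity level."""
    return dau * actions_per_user_per_day / SECONDS_PER_DAY


def storage_per_year_bytes(bytes_per_record: int, writes_per_sec: float) -> float:
    """Raw storage growth per year, before replication or compression."""
    return bytes_per_record * writes_per_sec * SECONDS_PER_DAY * 365


# Hypothetical inputs: 10M DAU, 2 writes and 100 reads per user per day, 1 KB records.
writes = per_second(10_000_000, 2)
reads = per_second(10_000_000, 100)
storage = storage_per_year_bytes(1_000, writes)

print(f"{writes:.0f} writes/s, {reads:.0f} reads/s, {storage / 1e12:.1f} TB/year")
```

With these assumed inputs the read path is about 50× the write path, which is the kind of ratio that justifies read replicas and caching before sharding.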

This tells you: Do you need sharding? Do you need a CDN? Do you need read replicas? Is caching a 10× win or a 2× win?

Step 3: High-level design (10 minutes)

Draw the major components of a typical web system and show how requests flow between them.

State your assumptions: "I'll assume horizontally scalable stateless app servers behind a load balancer."

Step 4: Deep dive (15 minutes)

Pick 2–3 components and go deep. Interviewers will guide you, so follow their lead.

Step 5: Address bottlenecks and trade-offs (5 minutes)

Every design has trade-offs. State them. Interviewers grade "trade-off awareness" explicitly.


The 20 system design questions

Foundational (most common, good starting points)

1. Design a URL shortener (bit.ly)

Core challenge: generating short unique codes, high read throughput (redirects >> creates), analytics counting. Key components: short-code generation (base62 encoding of an auto-increment ID), redirect database (a key-value store works well), CDN for global redirect performance, analytics via async event streaming.
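A minimal sketch of the base62 step, assuming the ID comes from an auto-increment counter. The alphabet order is an arbitrary choice and the function names are hypothetical.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def encode_base62(n: int) -> str:
    """Turn a non-negative integer ID into a short code."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def decode_base62(code: str) -> int:
    """Recover the integer ID from a short code."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Seven characters cover 62^7 IDs, roughly 3.5 trillion. One trade-off worth stating in an interview: codes derived from sequential IDs are guessable, so some designs shuffle or offset the ID first.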

2. Design a rate limiter

Choose an algorithm: token bucket (allows bursts), leaky bucket (smooths traffic), sliding window log (accurate, memory-heavy), sliding window counter (good balance). Implementation: Redis with INCR + EXPIRE, or a distributed counter with Lua scripts for atomicity. Distributed vs. per-node: per-node limiting is simpler, but with 10 nodes a client can reach up to 10× the intended threshold.
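To make the token bucket concrete, here is a single-process sketch. It is the per-node variant and does not coordinate across servers. The class name and the injectable `now` clock are assumptions added so the behavior is easy to test.

```python
import time


class TokenBucket:
    """Per-node token bucket: bursts up to `capacity`, refills at a steady rate."""

    def __init__(self, capacity: float, refill_per_sec: float, now=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._now = now
        self.tokens = capacity
        self.updated = now()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._now()
        elapsed = now - self.updated
        # Refill lazily on each call instead of running a background timer.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A distributed version would keep `tokens` and `updated` in a shared store and perform the refill-and-take step atomically.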

3. Design a key-value store

This question is a deep dive in itself and covers everything: consistent hashing for node assignment, replication factor, quorum reads/writes (N=3, W=2, R=2, so that W + R > N and every read quorum overlaps the latest write quorum), gossip protocol for membership, Merkle trees for anti-entropy. Know the Dynamo-style architecture.
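A rough sketch of consistent hashing with virtual nodes, plus the quorum overlap check. This is an illustrative in-memory model, not Dynamo's actual implementation. MD5 is used only as a convenient stable hash, and `HashRing`, `preference_list`, and `quorums_overlap` are hypothetical names.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Stable 64-bit position on the ring (Python's built-in hash() is salted per process).
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes: int = 100):
        # Each physical node owns many points on the ring to even out load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def preference_list(self, key: str, n: int = 3):
        """First n distinct nodes found walking clockwise from the key's position."""
        start = bisect.bisect(self._points, _hash(key))
        owners = []
        for i in range(len(self._ring)):
            node = self._ring[(start + i) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
                if len(owners) == n:
                    break
        return owners


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """True when any read quorum must intersect any write quorum."""
    return w + r > n
```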

4. Design a web crawler

Frontier (URL queue), fetcher (politeness rules, robots.txt, rate limiting per domain), parser (URL extraction + content indexing), deduplication (URL normalization + content fingerprinting). Scale challenge: distributed crawl across thousands of machines without crawling the same URL twice.
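The URL-normalization half of deduplication can be sketched as below. The normalization rules (lowercase host, drop default ports and fragments) are a small assumed subset. Real crawlers apply many more, and a distributed frontier would replace the in-memory set with a shared store or Bloom filter.

```python
from collections import deque
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Map trivially different spellings of a URL to one canonical form."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    netloc = host if port in (None, DEFAULT_PORTS.get(scheme)) else f"{host}:{port}"
    path = parts.path or "/"
    # The fragment never reaches the server, so it is dropped.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


class Frontier:
    """URL queue that refuses URLs it has already seen."""

    def __init__(self):
        self._seen = set()
        self._queue = deque()

    def add(self, url: str) -> bool:
        key = normalize_url(url)
        if key in self._seen:
            return False
        self._seen.add(key)
        self._queue.append(key)
        return True

    def next(self) -> str:
        return self._queue.popleft()
```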

5. Design a notification system

Event producer → message queue → notification dispatcher → delivery channels (push/SMS/email). Key: fan-out to millions of subscribers, per-user delivery preferences, deduplication (don't send the same notification twice), retry with exponential backoff, delivery receipts.
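A minimal sketch of deduplication plus retry with exponential backoff and jitter. The `send` callable, the `ConnectionError` failure mode, and the in-memory `sent_ids` set are assumptions for illustration. A real dispatcher would persist delivered IDs and retry through the queue rather than sleeping in-process.

```python
import random
import time


def backoff_window(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Upper bound in seconds on the wait after the given zero-based attempt."""
    return min(cap, base * 2 ** attempt)


def deliver(send, notification_id, sent_ids, max_attempts=5, sleep=time.sleep) -> bool:
    """Send once per notification_id, retrying transient failures."""
    if notification_id in sent_ids:
        return True  # duplicate: already delivered, do not send again
    for attempt in range(max_attempts):
        try:
            send()
        except ConnectionError:
            if attempt < max_attempts - 1:
                # Full jitter: a random wait within the exponentially growing window.
                sleep(random.uniform(0, backoff_window(attempt)))
            continue
        sent_ids.add(notification_id)
        return True
    return False
```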

Social and content platforms

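6. Design Twitter / X timeline

The canonical fan-out problem. Two approaches: fan-out on write (push each new tweet into followers' precomputed timelines, which gives fast reads but is expensive for accounts with millions of followers) and fan-out on read (merge the tweets of followed accounts at request time, which gives cheap writes but slower reads).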
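A minimal in-memory sketch contrasting fan-out on write (push) with fan-out on read (pull), the two approaches usually discussed for this problem. The `Timelines` class and its method names are hypothetical, and a sequence counter stands in for timestamps.

```python
import heapq
import itertools
from collections import defaultdict


class Timelines:
    def __init__(self):
        self.followers = defaultdict(set)   # author -> users following them
        self.following = defaultdict(set)   # user -> authors they follow
        self.posts = defaultdict(list)      # author -> posts, oldest first
        self.inbox = defaultdict(list)      # user -> precomputed timeline, oldest first
        self._seq = itertools.count()

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def post(self, author, text, push=True):
        item = (next(self._seq), author, text)
        self.posts[author].append(item)
        if push:
            # Fan-out on write: one inbox append per follower, paid at post time.
            for follower in self.followers[author]:
                self.inbox[follower].append(item)

    def timeline_push(self, user, k=10):
        """Read the precomputed inbox: cheap, newest first."""
        return self.inbox[user][-k:][::-1]

    def timeline_pull(self, user, k=10):
        """Fan-out on read: merge each followed author's posts at request time."""
        newest_first = (reversed(self.posts[a]) for a in self.following[user])
        merged = heapq.merge(*newest_first, reverse=True)
        return list(itertools.islice(merged, k))
```

The cost moves rather than disappears: push pays per follower on every post, pull pays per followed account on every read.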

7. Design Instagram / photo sharing

Photo upload → object storage (S3) → async thumbnail generation → CDN distribution. Feed: social graph service → photo metadata service → pre-computed feed for active users. Key: photos are read-heavy (100× more reads than writes), so a CDN is essential.

8. Design YouTube / video streaming

Video upload pipeline: chunked upload → transcoding to multiple bitrates (HLS/DASH) → CDN distribution. Streaming: adaptive bitrate, where the client switches quality based on bandwidth. Scale: transcoding is compute-intensive, so use a dedicated worker fleet with a job queue.

9. Design a chat application (WhatsApp / Slack)

WebSocket connections for real-time bidirectional communication. Message delivery: sender → server → persistent storage → deliver to recipient WebSocket. Offline delivery: push notification + message stored for retrieval on reconnect. Group chat: fan-out per message to all group members; for large groups, use message queue fan-out.

10. Design a social graph (Facebook friends / Twitter follows)

Graph database vs. adjacency list in RDBMS vs. sharded graph representation. Follower/following reads must be fast, so they are typically cached in Redis sorted sets. Graph traversal for "people you may know" runs as an offline batch job, not in real time.

Infrastructure and platform

11. Design a distributed message queue (Kafka-style)

Topic → partitions → replicas. Producers write to the partition leader; replicas follow. Consumer groups read from partitions in parallel. Key guarantee choices: at-least-once (default), at-most-once, exactly-once (in Kafka this means idempotent producers plus transactions; more generally, at-least-once delivery combined with idempotent consumers). Retention: time- or size-based deletion, or log compaction to keep only the latest record per key.

12. Design a distributed cache (Redis Cluster)

Partition keys across nodes with consistent hashing (Redis Cluster itself uses a related scheme: 16,384 fixed hash slots assigned to nodes). Replication: primary-replica, where replicas can serve reads and are promoted on failover. Eviction policies: LRU (general), LFU (for skewed access), TTL-based. Cache-aside vs. write-through vs. write-behind is a trade-off between consistency and performance.
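For the eviction and cache-aside points, a single-node sketch may help. `LRUCache` and `read_through` are hypothetical names. Real caches such as Redis approximate LRU by sampling rather than tracking exact recency like this.

```python
from collections import OrderedDict


class LRUCache:
    """Exact LRU: the least recently used key is evicted when capacity is exceeded."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the least recently used entry


def read_through(cache: LRUCache, key, load):
    """Cache-aside read: try the cache, fall back to the source, then populate."""
    value = cache.get(key)
    if value is None:
        value = load(key)
        cache.put(key, value)
    return value
```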

13. Design a search autocomplete system (typeahead)

Trie data structure in theory; in practice: prefix matching against pre-computed top-k completions stored in Redis sorted sets (ZRANGEBYLEX). Cold path: count query frequency, build top-k per prefix offline; warm path: serve from Redis. Personalization: blend global popularity with user-specific history.
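The offline "build top-k per prefix" step might look like the sketch below. The query counts are invented, and a plain dict stands in for the serving store.

```python
import heapq
from collections import defaultdict


def build_top_k(query_counts, k=3, max_prefix_len=10):
    """Map every prefix (up to max_prefix_len) to its k most frequent queries."""
    by_prefix = defaultdict(list)
    for query, count in query_counts.items():
        for i in range(1, min(len(query), max_prefix_len) + 1):
            by_prefix[query[:i]].append((count, query))
    return {
        prefix: [query for _, query in heapq.nlargest(k, items)]
        for prefix, items in by_prefix.items()
    }


# Hypothetical query log aggregates.
counts = {"system design": 50, "sql": 40, "system of a down": 30, "sys admin": 10}
top = build_top_k(counts, k=2)
```

Serving is then a single lookup per keystroke, such as `top.get("sys", [])`. Capping the prefix length bounds the size of the precomputed table.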

14. Design a proximity service / Yelp

Geospatial indexing: Geohash (convert lat/lng to a string, where nearby points usually share a prefix), QuadTree, or PostGIS. The challenge: "nearby" searches close to grid boundaries miss results, so always search adjacent cells too. Rank results by distance + rating + relevance.
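To illustrate the boundary problem, this sketch uses a fixed lat/lng grid as a simplified stand-in for geohash cells. It is not a geohash implementation. The cell size and class names are assumptions, and results are ranked by distance only.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # assumed cell size: roughly 1.1 km of latitude


def cell_of(lat: float, lng: float):
    return (math.floor(lat / CELL_DEG), math.floor(lng / CELL_DEG))


def haversine_km(lat1, lng1, lat2, lng2) -> float:
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat, dlng = p2 - p1, math.radians(lng2 - lng1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlng / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


class GridIndex:
    def __init__(self):
        self.cells = defaultdict(list)

    def add(self, name, lat, lng):
        self.cells[cell_of(lat, lng)].append((name, lat, lng))

    def nearby(self, lat, lng):
        """Search the home cell plus its 8 neighbours, nearest first."""
        cx, cy = cell_of(lat, lng)
        found = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                found.extend(self.cells.get((cx + dx, cy + dy), []))
        found.sort(key=lambda p: haversine_km(lat, lng, p[1], p[2]))
        return [name for name, _, _ in found]
```

Searching only the home cell would miss a place a few metres away on the other side of a cell edge, which is why the nine-cell search matters.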

15. Design a ride-sharing dispatch system (Uber)

Driver location service: drivers push GPS updates every 5 seconds → stored in a geospatial index. Matching: when a rider requests, find nearby available drivers → score by ETA + rating. Real-time tracking: WebSocket for live driver position updates during the ride. Challenges: driver state machine (available/matched/on-trip), preventing double-matching.
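One way to prevent double-matching is to make the available → matched transition an atomic check-and-set. The sketch below does that with an in-process lock. In a real system the same check would be a conditional update in the datastore. All names here are hypothetical.

```python
import threading

AVAILABLE, MATCHED, ON_TRIP = "available", "matched", "on_trip"


class DriverRegistry:
    def __init__(self):
        self._state = {}
        self._rider = {}
        self._lock = threading.Lock()

    def set_available(self, driver_id):
        with self._lock:
            self._state[driver_id] = AVAILABLE
            self._rider.pop(driver_id, None)

    def try_match(self, driver_id, rider_id) -> bool:
        """Claim the driver only if still available; exactly one caller can win."""
        with self._lock:
            if self._state.get(driver_id) != AVAILABLE:
                return False
            self._state[driver_id] = MATCHED
            self._rider[driver_id] = rider_id
            return True

    def start_trip(self, driver_id) -> bool:
        """Only a matched driver may move to on-trip."""
        with self._lock:
            if self._state.get(driver_id) != MATCHED:
                return False
            self._state[driver_id] = ON_TRIP
            return True
```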

Data and analytics

16. Design a metrics and monitoring system (Datadog-style)

Time-series data model: (metric name, tags, timestamp, value). Write path: agents → message queue → aggregation layer → time-series DB (InfluxDB, Prometheus). Query path: time-series DB → downsampling → dashboards. Alerting: streaming evaluation of alert rules against incoming metrics.
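The downsampling step on the query path can be sketched as bucketed averaging. This is an illustrative assumption about the aggregation. Real systems typically also keep min, max, and count per bucket.

```python
from collections import defaultdict


def downsample(points, bucket_seconds: int):
    """Average (timestamp, value) points into fixed-width time buckets.

    Returns a list of (bucket_start, mean_value) sorted by time.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_seconds].append(value)
    return [
        (start, sum(values) / len(values))
        for start, values in sorted(buckets.items())
    ]
```

For example, one-second samples rolled up into 60-second buckets cut the stored points per series by 60× for older data.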

17. Design a distributed task scheduler (Airflow-style)

Task graph (DAG) definition, scheduler polls for tasks ready to run (dependencies met), dispatches to worker pool, workers execute and report status. Handling failures: retry policies per task, dead-letter queue for tasks that exceed the retry limit. Distributed coordination: a single scheduler is a SPOF, so use leader election (ZooKeeper/etcd).
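A single-process sketch of the "run tasks whose dependencies are met, retry, then dead-letter" loop, using the standard library's `graphlib` (Python 3.9+). This is not how Airflow is implemented. `run_dag` and its parameters are hypothetical, and tasks run inline instead of on a worker pool.

```python
from graphlib import TopologicalSorter


def run_dag(deps, run, max_retries=2):
    """deps maps each task to the set of tasks it depends on.

    Returns (completed, dead_letter). Tasks downstream of a dead-lettered
    task are never started.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    completed, dead_letter = [], []
    while sorter.is_active():
        ready = sorter.get_ready()
        if not ready:
            break  # everything left is blocked behind a failed task
        for task in ready:
            for _attempt in range(max_retries + 1):
                try:
                    run(task)
                except Exception:
                    continue  # retry until attempts are exhausted
                completed.append(task)
                sorter.done(task)  # unblocks dependants
                break
            else:
                dead_letter.append(task)
    return completed, dead_letter
```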

18. Design a payment processing system

Exactly-once semantics are required, because double charges are catastrophic. Idempotency keys on payment requests. Two-phase commit or saga pattern for distributed transactions across payment processor + balance ledger + notification. Reconciliation job to catch discrepancies. PCI compliance implications on what data you can store.
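A minimal sketch of idempotency keys: the first request with a given key performs the charge, and any retry with the same key gets the stored result back. `PaymentService` and the `charge` callable are hypothetical. A real service would persist keys durably and scope them per client.

```python
import threading


class PaymentService:
    def __init__(self, charge):
        self._charge = charge      # callable(account, amount_cents) -> result
        self._results = {}         # idempotency_key -> stored result
        self._lock = threading.Lock()

    def pay(self, idempotency_key: str, account: str, amount_cents: int):
        # The lock makes "check for key, charge, record result" one atomic step here.
        with self._lock:
            if idempotency_key in self._results:
                return self._results[idempotency_key]
            result = self._charge(account, amount_cents)
            self._results[idempotency_key] = result
            return result
```

A client that times out can safely resend the same request with the same key, and the customer is charged once.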

19. Design a content delivery network (CDN)

Points of presence (PoPs) globally. Origin pulls (cache-miss fetches from origin) vs. pre-warming for known viral content. Cache invalidation: TTL-based (simple) vs. event-driven purge (complex, fast). Routing: anycast DNS to direct users to the nearest PoP.

20. Design a hotel / flight reservation system

Inventory management (seats/rooms), holds (reserve inventory briefly while the user completes checkout), two-phase commit to prevent overbooking. Read-heavy for search (cache aggressively), write-heavy for booking (ACID transactions required). Handling concurrency: optimistic locking vs. pessimistic locking trade-offs.
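Optimistic locking can be sketched as a version check on write. Here an in-process lock stands in for the database's atomic conditional update (the moral equivalent of an `UPDATE ... WHERE version = ?`). The `Inventory` class and `book` helper are hypothetical.

```python
import threading


class Inventory:
    def __init__(self, available: int):
        self._available = available
        self._version = 0
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._available, self._version

    def compare_and_set(self, expected_version: int, new_available: int) -> bool:
        """Write only if nobody else has written since we read."""
        with self._lock:
            if self._version != expected_version:
                return False
            self._available = new_available
            self._version += 1
            return True


def book(inventory: Inventory, max_retries: int = 3) -> bool:
    """Reserve one unit, re-reading and retrying if another writer got in first."""
    for _ in range(max_retries):
        available, version = inventory.read()
        if available == 0:
            return False  # sold out: never decrement below zero
        if inventory.compare_and_set(version, available - 1):
            return True
    return False
```

Optimistic locking suits low-contention inventory. Under heavy contention for the same seat, retries pile up and pessimistic locking or a short hold may behave better.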


What interviewers are actually grading

Every FAANG system design rubric has the same underlying dimensions:

| Dimension | What they look for |
| --- | --- |
| Problem scoping | Did you clarify before designing? |
| Scale intuition | Do your capacity estimates actually inform your design? |
| Component choice | Can you justify why you chose SQL vs. NoSQL, Redis vs. Memcached? |
| Trade-off awareness | Do you acknowledge what your choices cost? |
| Depth | Can you go deep on one component when asked? |
| Communication | Is your reasoning clear without being prompted? |

Most candidates hit 4/6. The ones who get offers hit 5–6/6.


FAQ

What's the most common system design interview question?

URL shortener and Twitter timeline are the two most frequently reported across FAANG companies. They're popular precisely because they test different skills: URL shortener tests capacity estimation and read optimization; Twitter tests fan-out strategy and CAP theorem trade-offs.

How much math is expected for capacity estimation?

Order-of-magnitude accuracy is sufficient. Interviewers know the exact numbers matter less than whether your estimates inform your design. Rounding 86,400 to 100K is fine. Getting the order of magnitude right (gigabytes vs. terabytes) is what matters.

Should I ask for a whiteboard or just talk through it?

Always draw something, even on paper. Diagrams force clarity and give the interviewer something to react to. In a virtual interview, use the coding platform's drawing tool or ask what's available.

Is it OK to say "I'd use a managed service like DynamoDB" instead of designing from scratch?

Yes, and it's often the right answer. Senior engineers use managed services; they don't reinvent databases in job interviews. Name the service, explain why it fits, and be prepared to go one level deeper on the underlying trade-offs if asked.

How do you balance breadth vs. depth in 45 minutes?

Spend 15 minutes on breadth (clarification + estimation + high-level diagram). Spend 25 minutes on 2–3 deep dives. Reserve 5 minutes for trade-offs. If the interviewer keeps asking follow-up questions on one component, let them guide you: they're telling you that's what they care about.

