System Design Interview Framework
Structured Approach to Tackle Any System Design Question
Why System Design Interviews Are Different
Unlike coding interviews with "correct" answers, system design is open-ended. Interviewers evaluate:
- Structured thinking: Can you break down ambiguity?
- Trade-off analysis: Do you understand pros/cons?
- Scalability awareness: Can you design for millions of users?
- Communication: Can you explain complex systems clearly?
There's no single "right" answer. A junior engineer might design Twitter with 1 server; a senior engineer considers 1 billion users, 500M tweets/day, and 5 data centers.
The 6-Step Framework
Step 1: Clarify Requirements (5 minutes)
Don't jump to solutions! Ask questions to scope the problem.
Step 2: Capacity Estimation (5 minutes)
Calculate traffic, storage, bandwidth to guide design decisions.
Step 3: High-Level Design (10 minutes)
Draw boxes and arrows showing main components and data flow.
Step 4: API Design (5 minutes)
Define RESTful or function interfaces for core functionality.
Step 5: Database Design (10 minutes)
Choose SQL vs NoSQL, define schema, plan for scale.
Step 6: Deep Dive (15 minutes)
Address bottlenecks, scaling, caching, monitoring, trade-offs.
Total: 50 minutes. Adjust based on 45/60 minute interview length.
Step 1: Clarify Requirements
Goal: Turn vague question into concrete requirements.
Functional Requirements (What should the system do?)
Example: "Design Twitter"
Ask:
Post tweets? (Yes)
Follow users? (Yes)
Timeline: home feed + user profile? (Yes, both)
Like/retweet? (Nice-to-have, out of scope)
Search tweets? (Out of scope)
Direct messages? (Out of scope)
Trending topics? (Out of scope)
Result: Focus on core features only.
Non-Functional Requirements (How should it perform?)
Ask:
Scale: How many users? (100M daily active users)
Availability: More important than consistency? (Yes, eventual consistency OK)
Latency: How fast? (Timeline loads < 1 second)
Read vs Write: More reads or writes? (10:1 read-heavy)
Why these matter:
- 100M users → Need distributed system
- Availability > Consistency → Use NoSQL, caching
- Read-heavy → Focus on read optimization (caching, CDN)
- < 1s latency → Pre-compute timelines, use CDN
Pro Tip
Write requirements on whiteboard/doc to reference later. Prevents scope creep: "Remember we decided search was out of scope."
Step 2: Capacity Estimation
Goal: Use rough numbers to guide design. Be transparent about assumptions.
Traffic Estimation
Example: Twitter-like System
Given:
- 100M daily active users (DAU)
- Each user views timeline 5 times/day
- Each timeline shows 20 tweets
- Users post 0.5 tweets/day on average
Read (Timeline Views):
- 100M users × 5 views = 500M timeline requests/day
- 500M / 86,400 seconds = ~6,000 requests/second (QPS)
- Peak (3x average) = 18,000 QPS
Write (Posting Tweets):
- 100M users × 0.5 tweets = 50M tweets/day
- 50M / 86,400 = ~600 tweets/second
- Peak = 1,800 tweets/second
Result: Read-heavy (10:1 ratio). Optimize reads with caching!
Storage Estimation
Tweets:
- 50M tweets/day × 280 chars × 2 bytes (Unicode) = ~28 GB/day text
- Plus metadata (user ID, timestamp, etc.) = ~30 GB/day
- 30 GB × 365 days = ~11 TB/year
Media (photos/videos):
- 20% of tweets have media
- 50M × 0.2 = 10M media uploads/day
- Avg 200 KB per image = 10M × 200 KB = 2 TB/day
- 2 TB × 365 = 730 TB/year
Total: ~750 TB/year
Result: Need distributed storage (S3, blob storage). Can't fit on 1 server!
Bandwidth Estimation
Incoming:
- 30 GB text + 2 TB media = ~2 TB/day
- 2 TB / 86,400 seconds = ~24 MB/second
Outgoing (users viewing tweets):
- 500M timeline views × 20 tweets × 300 bytes (avg) = 3 TB/day text
- Plus media views (assume 50% of media): 1 TB/day media
- 4 TB / 86,400 = ~46 MB/second
Result: Outgoing > incoming (read-heavy, confirming the earlier assumption)
Common Mistake
Don't spend 20 minutes on precise calculations. Interviewers want to see you understand scale, not exact math. Say: "Roughly 10,000 QPS" not "9,847.3 QPS".
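These estimates are easy to sanity-check with a short script while practicing; the numbers below mirror this section's assumptions:

```python
# Back-of-envelope capacity estimation for the Twitter-like example.
DAU = 100_000_000          # daily active users
VIEWS_PER_USER = 5         # timeline views per user per day
TWEETS_PER_USER = 0.5      # tweets posted per user per day
SECONDS_PER_DAY = 86_400
PEAK_FACTOR = 3            # assume peak traffic is 3x average

read_qps = DAU * VIEWS_PER_USER / SECONDS_PER_DAY
write_qps = DAU * TWEETS_PER_USER / SECONDS_PER_DAY

tweet_bytes = 280 * 2      # 280 chars, 2 bytes each (Unicode)
text_per_day_gb = DAU * TWEETS_PER_USER * tweet_bytes / 1e9

print(f"read QPS  ~{read_qps:,.0f} (peak ~{read_qps * PEAK_FACTOR:,.0f})")
print(f"write QPS ~{write_qps:,.0f} (peak ~{write_qps * PEAK_FACTOR:,.0f})")
print(f"tweet text ~{text_per_day_gb:,.0f} GB/day")
```

Note the script prints ~5,787 read QPS; rounding it to "~6,000" in the interview is exactly the right level of precision.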
Step 3: High-Level Design
Goal: Draw 5-10 boxes showing architecture. Start simple, add complexity.
Version 1: Naive Single-Server
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Client │───────▶│ Server │───────▶│ Database │
└──────────┘ └──────────┘ └──────────┘
Works for: 100 users
Breaks at: 100M users (single point of failure, can't scale beyond one machine)
Version 2: Add Load Balancer + Multiple Servers
┌──────────┐
│ Load │
┌───────────────│ Balancer │
│ └──────────┘
│ │
│ ┌───────────┼───────────┐
▼ ▼ ▼ ▼
┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
│Server 1│ │Server 2│ │Server 3│ │Server N│
└────────┘ └────────┘ └────────┘ └────────┘
│ │ │ │
└─────────┴───────────┴───────────┘
│
┌────▼─────┐
│ Database │
└──────────┘
Improvements:
Horizontal scaling: add more servers
No single point of failure (if 1 server dies, others handle the traffic)
Remaining problem: the database is still a bottleneck shared by every server
Version 3: Add Caching Layer
┌──────────┐
│ Load │
│ Balancer │
└──────────┘
│
┌──────────────┼──────────────┐
▼ ▼ ▼
┌────────┐ ┌────────┐ ┌────────┐
│Server 1│ │Server 2│ │Server 3│
└────────┘ └────────┘ └────────┘
│ │ │
└──────────────┼──────────────┘
│
┌────▼─────┐
│ Redis │ ◄── Cache hot data
│ Cache │
└──────────┘
│
┌────▼─────┐
│ Database │ ◄── Cold storage
└──────────┘
Benefits:
80% of reads from cache (< 1ms latency)
Database load reduced 5x
Redis: in-memory, very fast
Version 4: Separate Read/Write + CDN
┌────────┐ ┌─────────┐ ┌────────────┐
│ User │─────▶│ CDN │─────▶│ Static │
│(Browser)│ │(Images, │ │ Assets │
└────────┘ │ JS, CSS)│ │ (S3/Blob) │
│ └─────────┘ └────────────┘
│
│ ┌──────────┐
└──────────▶│ Load │
│ Balancer │
└──────────┘
│
┌───────────┼───────────┐
▼ ▼ ▼
┌────────┐ ┌────────┐ ┌────────┐
│ Write │ │ Write │ │ Read │
│Server 1│ │Server 2│ │Servers │
└────────┘ └────────┘ └────────┘
│ │ │
└───────────┼───────────┘
│
┌───────────┴───────────┐
▼ ▼
┌─────────┐ ┌──────────┐
│ Primary │──────────▶│ Replicas │
│Database │ Replicate │ (Readers)│
│(Writer) │ └──────────┘
└─────────┘
Benefits:
CDN: assets served from nearest edge location (20-200ms saved)
Write servers: optimized for inserts (no caching)
Read servers: optimized for queries (heavy caching)
Database replication: reads scale horizontally
Step 4: API Design
Goal: Define clear interfaces. Use RESTful conventions.
// Post a tweet
POST /api/v1/tweets
Request Body:
{
"user_id": "uuid",
"text": "Hello world!",
"media_urls": ["https://cdn.example.com/img1.jpg"]
}
Response:
{
"tweet_id": "uuid",
"created_at": "2025-02-12T10:30:00Z"
}
// Get user timeline (home feed)
GET /api/v1/timeline?user_id={uuid}&cursor={cursor}&limit=20
Response:
{
"tweets": [
{
"tweet_id": "uuid",
"user_id": "uuid",
"username": "alice",
"text": "...",
"created_at": "...",
"media_urls": [...],
"likes_count": 42,
"retweets_count": 10
},
// ... 19 more
],
"next_cursor": "base64_encoded_timestamp"
}
// Follow a user
POST /api/v1/follow
Request Body:
{
"follower_id": "uuid",
"followee_id": "uuid"
}
Response:
{
"success": true
}
// Get user profile
GET /api/v1/users/{user_id}
Response:
{
"user_id": "uuid",
"username": "alice",
"bio": "...",
"followers_count": 1000,
"following_count": 500,
"tweets_count": 2000
}
Key Decisions to Mention:
- Pagination: Cursor-based (better for real-time feeds than offset)
- Rate Limiting: 300 tweets/hour per user, 1000 API calls/15min
- Authentication: JWT tokens in Authorization header
- Versioning: /api/v1/ allows future breaking changes
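Cursor-based pagination is worth being able to sketch on request. A minimal server-side version (the field names and timestamp-only cursor are illustrative; production cursors often also encode a tweet ID to break timestamp ties):

```python
import base64
import json

def paginate(tweets, cursor=None, limit=20):
    """Return one page of tweets plus an opaque cursor for the next page.

    `tweets` is assumed sorted newest-first; each tweet is a dict with a
    numeric `created_at`. The cursor encodes the timestamp of the last
    tweet served, so new tweets arriving at the head don't shift later
    pages (the failure mode of offset-based pagination).
    """
    if cursor is not None:
        last_ts = json.loads(base64.b64decode(cursor))["ts"]
        tweets = [t for t in tweets if t["created_at"] < last_ts]
    page = tweets[:limit]
    next_cursor = None
    if len(tweets) > limit:   # more pages remain
        next_cursor = base64.b64encode(
            json.dumps({"ts": page[-1]["created_at"]}).encode()
        ).decode()
    return page, next_cursor
```

With 50 tweets and limit=20, three calls yield pages of 20, 20, and 10, and the final call returns no cursor.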
Step 5: Database Design
Goal: Choose appropriate database(s) and define schema.
SQL vs NoSQL Decision Matrix
| Factor | SQL (Postgres) | NoSQL (DynamoDB) |
|---|---|---|
| Schema | Fixed, enforced | Flexible |
| Transactions | ACID | Eventual consistency |
| Joins | Powerful | Difficult/expensive |
| Scaling | Vertical + sharding | Horizontal |
| Write Speed | Moderate | Very Fast |
For Twitter: Use Both!
-- PostgreSQL: User data (needs transactions)
CREATE TABLE users (
user_id UUID PRIMARY KEY,
username VARCHAR(50) UNIQUE NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT NOW(),
bio TEXT,
profile_image_url TEXT
);
CREATE TABLE follows (
follower_id UUID REFERENCES users(user_id),
followee_id UUID REFERENCES users(user_id),
created_at TIMESTAMP DEFAULT NOW(),
PRIMARY KEY (follower_id, followee_id)
);
-- Why SQL: Need to enforce unique username, email constraints.
-- Following relationships need joins for "mutual follows" queries.
// DynamoDB (NoSQL): Tweets and Timeline (needs scale + speed)
{
TableName: "Tweets",
PartitionKey: "tweet_id", // UUID
SortKey: "created_at", // Timestamp
Attributes: {
user_id: "UUID",
username: "String", // Denormalized for fast reads!
text: "String",
media_urls: ["String"],
likes_count: "Number",
retweets_count: "Number"
}
// Global Secondary Index: user_id + created_at (for user profile view)
}
{
TableName: "Timeline",
PartitionKey: "user_id", // Owner of timeline
SortKey: "created_at", // Latest first
Attributes: {
tweet_id: "UUID",
// Fan-out on write: when user tweets, add to all followers' timelines
}
// TTL: 7 days (auto-delete old timeline entries)
}
// Why NoSQL:
// - 50M tweets/day needs horizontal scaling
// - Schema may evolve (polls, videos, etc.)
// - Read-heavy: denormalize for speed (store username in tweet)
// - Fan-out on write: pre-compute timelines in Timeline table
Step 6: Deep Dive & Scaling
Goal: Address potential bottlenecks and demonstrate senior thinking.
1. Timeline Generation: Fan-out Strategies
Problem: When user posts tweet, how do 10,000 followers see it?
Option A: Fan-out on Write (Push)
- Store tweet in each follower's timeline immediately
- Read: Fast (just query user's timeline table)
- Write: Slow for celebrities (1M followers = 1M writes)
Option B: Fan-out on Read (Pull)
- Store tweets only in user's own table
- Read: Slow (join tweets from all followed users)
- Write: Fast (1 write only)
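The two write paths can be contrasted in a few lines (a sketch only: in-memory dicts stand in for the Tweets and Timeline tables):

```python
from collections import defaultdict

followers = defaultdict(set)      # user -> set of follower ids
user_tweets = defaultdict(list)   # user -> tweets they authored
timelines = defaultdict(list)     # user -> pre-computed home timeline

def post_fanout_on_write(author, tweet):
    """Push model: one timeline write per follower.
    Reads are a single lookup; writes are expensive for celebrities."""
    user_tweets[author].append(tweet)
    for follower in followers[author]:
        timelines[follower].append(tweet)

def read_fanout_on_read(user, following):
    """Pull model: merge followed users' tweets at read time.
    Writes are a single insert; reads must merge many lists."""
    merged = [t for u in following for t in user_tweets[u]]
    return sorted(merged, key=lambda t: t["created_at"], reverse=True)
```

The hybrid approach below simply picks the function per author based on follower count.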
Hybrid Solution (What Twitter Actually Does):
- Regular users (<10K followers): Fan-out on write
- Celebrities (>10K followers): Fan-out on read
- At read time: merge pre-computed timeline + celebrity tweets
- Best of both worlds!
2. Caching Strategy
What to Cache:
Timeline: Top 100 tweets per user (Redis Sorted Set, TTL 5min)
User profiles: Hot users (celebrities) cached (TTL 1hr)
Tweet metadata: Likes/retweets count (updated async)
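The timeline cache behaves like a Redis sorted set (ZADD to insert, ZREVRANGE to read newest-first). A pure-Python sketch of that top-N policy:

```python
import bisect

class TimelineCache:
    """Top-N timeline cache, mimicking a Redis sorted set.

    Scores are tweet timestamps; only the newest `max_size` entries are
    kept, matching the "top 100 tweets per user" policy above.
    """
    def __init__(self, max_size=100):
        self.max_size = max_size
        self.entries = []  # kept sorted ascending by (timestamp, tweet_id)

    def add(self, tweet_id, timestamp):
        bisect.insort(self.entries, (timestamp, tweet_id))
        if len(self.entries) > self.max_size:
            self.entries.pop(0)   # evict the oldest entry

    def latest(self, n=20):
        """Newest-first, like ZREVRANGE 0 n-1."""
        return [tid for ts, tid in reversed(self.entries[-n:])]
```

In production the TTL and per-user keys would be handled by Redis itself; this only illustrates the data-structure choice.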
Cache Invalidation:
- New tweet: invalidate author's timeline + followers' timelines
- Use pub/sub (Redis) to notify cache servers
- Accept slight delay (eventual consistency)
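Reads then follow the cache-aside pattern; a minimal sketch, with in-memory stand-ins for Redis and the timeline store (all names illustrative):

```python
class InMemoryCache:
    """Stand-in for Redis (TTL accepted but not enforced in this sketch)."""
    def __init__(self):
        self.store = {}
    def get(self, key):
        return self.store.get(key)
    def set(self, key, value, ttl):
        self.store[key] = value

class TimelineDB:
    """Stand-in for the timeline table; counts queries to show the savings."""
    def __init__(self):
        self.queries = 0
    def query_timeline(self, user_id):
        self.queries += 1
        return [f"tweet for {user_id}"]

def get_timeline(user_id, cache, db, ttl_seconds=300):
    """Cache-aside: check cache, on miss query DB, repopulate with TTL."""
    timeline = cache.get(user_id)
    if timeline is None:                       # cache miss
        timeline = db.query_timeline(user_id)  # fall back to the database
        cache.set(user_id, timeline, ttl=ttl_seconds)
    return timeline
```

Two consecutive reads hit the database only once; at an 80% hit rate this is where the "5x load reduction" comes from.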
Cache-Aside Pattern:
1. App checks cache
2. Cache miss: query database
3. Store in cache with TTL
4. Return to user
3. Database Sharding
Problem: with 1 billion users, Postgres can't fit on 1 server.
Shard by user_id:
- hash(user_id) % N → determines which database shard
- With N = 4 shards, each holds roughly 250M of the 1B users (spread evenly, but not in contiguous ID ranges; contiguous ranges would be range-based sharding, a different scheme)
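Shard routing in a sketch (using a stable hash rather than Python's built-in `hash()`, which is salted per process and would route inconsistently across servers):

```python
import hashlib

NUM_SHARDS = 4

def shard_for(user_id: str) -> int:
    """Map a user to a shard deterministically.

    MD5 is fine here: we need a stable, well-distributed hash,
    not a cryptographically secure one.
    """
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS
```

Every server computes the same shard for the same user, with users distributed roughly evenly across the 4 shards.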
Pros:
Even distribution
Each shard handles 250M users
Cons:
Cross-shard queries hard (e.g., "users who follow user A and B")
Rebalancing when adding shards is complex
Mitigation:
- Use consistent hashing to minimize re-sharding
- Denormalize to avoid cross-shard queries
4. Monitoring & Observability
Metrics to Track:
Server health: CPU, memory, disk I/O per instance
API latency: p50, p95, p99 per endpoint
Error rates: 4xx, 5xx by endpoint
Database: Query time, connection pool size, replication lag
Cache: Hit rate (target >80%), eviction rate
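The latency percentiles above are rank statistics, not averages; a nearest-rank sketch shows how they are computed from raw samples:

```python
import math

def percentile(latencies_ms, p):
    """Nearest-rank percentile: the value below which ~p% of samples fall.

    p99 surfaces tail latency that an average would hide: one slow
    request in a hundred still shows up here.
    """
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(p / 100 * len(ordered)))  # 1-based rank
    return ordered[rank - 1]
```

For 100 samples of 1..100 ms, p50 is 50 and p99 is 99, even if the average is dragged around by outliers.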
Alerts:
P99 latency > 2 seconds
Error rate > 1%
Replication lag > 10 seconds
Cache hit rate < 70%
Tools:
- Prometheus + Grafana: Metrics dashboards
- Jaeger: Distributed tracing
- ELK Stack: Centralized logging
5. Security Considerations
- Rate Limiting: Prevent spam/abuse (300 tweets/hr, 1000 API calls/15min)
- Authentication: JWT tokens with 1hr expiry
- Authorization: Check tweet.user_id === auth.user_id before edit/delete
- Input Validation: Sanitize tweet text, validate URLs
- HTTPS: Encrypt all traffic (TLS 1.3)
- DDoS Protection: CloudFlare/AWS Shield at edge
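Rate limiting is a frequent deep-dive follow-up; a token-bucket sketch (the clock is injected so the refill logic is testable, and per-user buckets would live in Redis in production rather than in process memory):

```python
import time

class TokenBucket:
    """Token bucket: `capacity` requests per `window_seconds`, refilled
    continuously, so short bursts are allowed up to the bucket size."""
    def __init__(self, capacity, window_seconds, now=time.monotonic):
        self.capacity = capacity
        self.refill_rate = capacity / window_seconds   # tokens per second
        self.tokens = float(capacity)
        self.now = now
        self.last = now()

    def allow(self):
        """Consume one token if available; False means 429 Too Many Requests."""
        current = self.now()
        self.tokens = min(self.capacity,
                          self.tokens + (current - self.last) * self.refill_rate)
        self.last = current
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A bucket of 300 tokens refilled over an hour enforces the "300 tweets/hour" limit while still permitting short bursts.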
Trade-offs to Discuss
Interviewers LOVE when you mention trade-offs without being asked:
Consistency vs Availability (CAP Theorem)
"I chose eventual consistency for timelines because 1-2 second delay is acceptable for availability. For payment systems, I'd choose strong consistency."
Latency vs Consistency
"Caching reduces latency to 10ms but risks showing stale data for 1 minute. Acceptable for social media, not for stock prices."
Storage Cost vs Query Speed
"Denormalizing (storing username in tweet) costs 50 bytes × 50M tweets = 2.5GB extra, but avoids join, saving 100ms per query. Worth it for read-heavy system."
Complexity vs Performance
"Hybrid fan-out adds complexity (2 code paths) but handles both regular users and celebrities efficiently. Simpler fan-out on read would break for Elon Musk tweets."
Common Mistakes to Avoid
Jumping to implementation too quickly
Ask clarifying questions first! "Should we support video tweets?"
Focusing only on happy path
Discuss: What if server crashes? Database is down? User spams API?
Ignoring scale
"Just use Postgres" works for 1K users, not 100M. Always consider scale from step 2.
Over-engineering early
Start simple (monolith), then add complexity (microservices, Kafka) when explaining scale.
Silent drawing
Narrate while drawing: "I'm adding a cache here to reduce database load..."
Sample Questions to Practice
- Beginner: Design URL Shortener (bit.ly), Design Pastebin, Design Rate Limiter
- Intermediate: Design Instagram, Design YouTube, Design Uber, Design WhatsApp
- Advanced: Design Google Search, Design Amazon, Design Netflix, Design Distributed Cache
Practice these with a friend or record yourself. The goal is to speak confidently and demonstrate structured thinking, not memorize solutions.
Key Takeaways
- Clarify first: Scope the problem before designing
- Estimate capacity: Use numbers to guide decisions
- Start simple: Monolith → Load balancer → Caching → Microservices
- Discuss trade-offs: Every decision has pros/cons
- Think about failure: What breaks at scale? How to recover?
- Communicate clearly: Draw, narrate, check understanding
- No perfect answer: Show thought process, adapt to feedback
Further Resources
- Book: "Designing Data-Intensive Applications" by Martin Kleppmann
- Course: "Grokking the System Design Interview" (educative.io)
- YouTube: System Design Interview channel, Gaurav Sen
- Practice: Use our System Design questions with real-world scenarios
Continue Learning
- Hash Maps: When and Why - Essential for caching layers in system design
- Big O Notation Explained - Understand scalability analysis fundamentals
- STAR Method for Behavioral Interviews - Prepare for the behavioral portion of your interview
- Start Practicing - Apply system design concepts to real scenarios
Article Details
This guide is part of HireReady's interview prep library and is maintained to reflect current hiring practices.