Design Uber: Ride-Sharing System Design Guide
Design real-time ride matching, location tracking, and pricing for millions of concurrent users
Why Design Uber?
Uber is a top-tier system design question that tests your understanding of real-time systems, geospatial data, and marketplace dynamics. Unlike feed-based systems, Uber requires matching within seconds and continuous location updates.
- Real-time matching: Match riders with nearby drivers in seconds
- Geospatial indexing: Efficiently query nearby drivers by location
- Dynamic pricing: Surge pricing based on supply and demand
- ETA calculation: Accurate arrival time estimates using road networks
Step 1: Clarify Requirements (5 minutes)
Functional Requirements
Core features:
- Rider requests a ride (pickup → destination)
- System matches rider with nearest available driver
- Real-time location tracking during ride
- Fare calculation and payment
- Driver/rider ratings
Out of scope (confirm):
- UberEats / delivery
- Scheduled rides
- Carpooling (UberPool)
- Driver onboarding
Non-Functional Requirements
Scale:
- 100M riders, 5M drivers
- 20M rides/day
- 5M concurrent drivers sending location updates
Performance:
- Match rider to driver in < 10 seconds
- Location updates every 3 seconds from active drivers
- ETA accuracy within 2 minutes
- 99.99% availability (rides are safety-critical)
Key insight: This is a real-time, write-heavy, location-intensive system.
Very different from social media feed design.
Step 2: Capacity Estimation
Location Updates:
- 5M active drivers x 1 update/3 sec = 1.67M updates/sec
- Each update: ~100 bytes (driver_id, lat, lng, timestamp, status)
- 1.67M x 100 bytes = 167 MB/sec ingress
Ride Requests:
- 20M rides/day = ~230 rides/sec
- Peak (3x): ~700 rides/sec
Storage:
- Ride records: 20M/day x 1KB = 20 GB/day
- Location history: 1.67M/sec x 100 bytes x 86400 = 14.4 TB/day
- Keep hot data 30 days, archive older data
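The estimates above can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the constants come from the stated requirements, the variable names are illustrative, and the 30-day hot-storage total is derived here rather than stated in the guide.

```python
# Inputs taken from the requirements and estimates above.
ACTIVE_DRIVERS = 5_000_000
UPDATE_INTERVAL_SEC = 3
UPDATE_SIZE_BYTES = 100
RIDES_PER_DAY = 20_000_000
PEAK_FACTOR = 3
RIDE_RECORD_BYTES = 1_000      # ~1KB per ride record
SECONDS_PER_DAY = 86_400
HOT_RETENTION_DAYS = 30

# Location update throughput and bandwidth.
updates_per_sec = ACTIVE_DRIVERS / UPDATE_INTERVAL_SEC
ingress_mb_per_sec = updates_per_sec * UPDATE_SIZE_BYTES / 1e6

# Ride request rate, average and peak.
rides_per_sec = RIDES_PER_DAY / SECONDS_PER_DAY
peak_rides_per_sec = rides_per_sec * PEAK_FACTOR

# Daily storage growth (decimal units: 1 GB = 1e9 bytes, 1 TB = 1e12 bytes).
ride_storage_gb_per_day = RIDES_PER_DAY * RIDE_RECORD_BYTES / 1e9
location_tb_per_day = updates_per_sec * UPDATE_SIZE_BYTES * SECONDS_PER_DAY / 1e12

# Derived: raw location history kept hot for 30 days, before compression.
hot_location_tb = location_tb_per_day * HOT_RETENTION_DAYS

print(f"{updates_per_sec / 1e6:.2f}M updates/sec, {ingress_mb_per_sec:.0f} MB/sec")
print(f"{rides_per_sec:.0f} rides/sec average, {peak_rides_per_sec:.0f} at peak")
print(f"{location_tb_per_day:.1f} TB/day location history, {hot_location_tb:.0f} TB hot")
```

Thirty days of raw location history comes to roughly 432 TB, which is why the archive tier matters.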
Key insight: Location data volume is enormous.
Need specialized geospatial storage, not a general-purpose DB.
Step 3: High-Level Architecture
┌──────────────────────────────────────────────────────────────┐
│                     RIDER & DRIVER APPS                      │
│           (GPS, ride requests, real-time tracking)           │
└───────────────────────┬──────────────────────────────────────┘
                        │
                        ▼
┌──────────────────────────────────────────────────────────────┐
│                 API GATEWAY / LOAD BALANCER                  │
│         WebSocket for real-time, REST for requests           │
└──────┬───────────┬──────────┬───────────┬────────────────────┘
       │           │          │           │
       ▼           ▼          ▼           ▼
  ┌──────────┐ ┌────────┐ ┌────────┐ ┌──────────┐
  │ Location │ │  Ride  │ │ Pricing│ │ Payment  │
  │ Service  │ │ Match  │ │Service │ │ Service  │
  └────┬─────┘ └───┬────┘ └───┬────┘ └────┬─────┘
       │           │          │           │
       ▼           ▼          ▼           ▼
  ┌──────────┐ ┌────────┐ ┌────────┐ ┌──────────┐
  │ Geo Index│ │  Ride  │ │ Supply │ │  Stripe  │
  │ (Redis)  │ │   DB   │ │ Demand │ │  (ext)   │
  └──────────┘ └────────┘ └────────┘ └──────────┘
Key Services
1. Location Service
- Ingests driver location updates (1.67M/sec)
- Maintains geospatial index of active drivers
- Provides "find nearby drivers" API
2. Ride Matching Service
- Receives ride requests from riders
- Queries Location Service for nearby available drivers
- Sends ride offers to drivers
- Handles driver acceptance/rejection
3. Pricing Service
- Calculates base fare from distance and time
- Applies surge multiplier based on supply/demand
- ETA calculation using road network data
4. Payment Service
- Fare calculation at ride end
- Payment processing via Stripe/payment provider
- Driver payouts
Step 4: Deep Dive - Geospatial Indexing
This is the most critical component. You need to efficiently answer: "Which available drivers are within 5 km of this rider?"
Approach: Geohash-Based Index
Geohash divides the world into grid cells of configurable size.
How it works:
1. Convert (lat, lng) → geohash string (e.g., "9q8yyk")
2. Precision 6 = ~1.2km x 0.6km cells
3. Store drivers in Redis sets keyed by geohash
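To make step 1 concrete, here is a minimal sketch of the standard geohash encoding: longitude and latitude ranges are bisected alternately, and every 5 bits become one base-32 character. The function name `geohash_encode` is illustrative; in practice an existing geohash library or Redis's built-in geo commands would normally be used instead.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat: float, lng: float, precision: int = 6) -> str:
    """Encode a coordinate as a geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    bits = []
    use_lng = True  # bits alternate, starting with longitude
    while len(bits) < precision * 5:
        if use_lng:
            mid = (lng_lo + lng_hi) / 2
            if lng >= mid:
                bits.append(1)
                lng_lo = mid
            else:
                bits.append(0)
                lng_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits.append(1)
                lat_lo = mid
            else:
                bits.append(0)
                lat_hi = mid
        use_lng = not use_lng

    # Pack each group of 5 bits into one base-32 character.
    chars = []
    for i in range(0, len(bits), 5):
        index = 0
        for bit in bits[i:i + 5]:
            index = (index << 1) | bit
        chars.append(BASE32[index])
    return "".join(chars)


print(geohash_encode(42.6, -5.6, precision=5))  # ezs42
```

Because nearby points share a prefix, truncating the string gives a coarser cell, which is what makes the precision configurable.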
Driver location update:
1. Calculate new geohash from driver's position
2. If geohash changed:
- SREM old_geohash driver_id
- SADD new_geohash driver_id
3. Update driver's current position in hash
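The update path can be sketched with plain Python containers standing in for the Redis structures. This is an in-memory illustration only: `cells` and `drivers` are hypothetical stand-ins for the `geohash:{hash}` sets and `driver:{id}` hashes, and storing the current geohash on the driver record (so the old cell is known) is an assumption not spelled out above.

```python
from collections import defaultdict

cells = defaultdict(set)  # stands in for: SET geohash:{hash}
drivers = {}              # stands in for: HASH driver:{id}


def update_driver_location(driver_id, geohash, lat, lng, status, now):
    """Apply one location update; touch the cell sets only on a cell change."""
    previous = drivers.get(driver_id)
    old_hash = previous["geohash"] if previous else None

    if old_hash != geohash:
        if old_hash is not None:
            cells[old_hash].discard(driver_id)  # SREM old_geohash driver_id
            if not cells[old_hash]:
                del cells[old_hash]             # Redis drops empty sets too
        cells[geohash].add(driver_id)           # SADD new_geohash driver_id

    # Always refresh the driver's exact position and status.
    drivers[driver_id] = {
        "geohash": geohash,
        "lat": lat,
        "lng": lng,
        "status": status,
        "last_updated": now,
    }


update_driver_location("d1", "9q8yyk", 37.7749, -122.4194, "available", 1000)
update_driver_location("d1", "9q8yym", 37.7790, -122.4194, "available", 1003)
print(dict(cells))  # {'9q8yym': {'d1'}}
```

Most updates leave the driver in the same cell, so the common case is a single hash write. In Redis the remove, add, and hash update would typically be grouped in a pipeline or transaction.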
Finding nearby drivers:
1. Calculate rider's geohash
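2. Get the 8 neighboring geohashes (handles cell boundaries). At precision 6 this 3x3 block spans only ~3.6km x 1.8km, so for a 5 km radius either index at precision 5 (~4.9km x 4.9km cells) or add further rings of neighbors
3. SUNION all candidate geohash sets → candidate drivers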
4. Filter by exact distance (Haversine formula)
5. Filter by availability status
6. Sort by distance, return top N
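Steps 4 to 6 can be sketched as follows, assuming the candidate drivers have already been fetched from the cell sets. The function names, the `"available"` status string, and the candidate record shape are illustrative assumptions; the Haversine formula itself is standard.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lng2 - lng1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_available(rider_lat, rider_lng, candidates, radius_km=5.0, limit=10):
    """Filter candidates by status and exact distance, then return the closest.

    `candidates` maps driver_id -> {"lat", "lng", "status"}.
    Returns a list of (driver_id, distance_km), nearest first.
    """
    hits = []
    for driver_id, driver in candidates.items():
        if driver["status"] != "available":
            continue
        distance = haversine_km(rider_lat, rider_lng, driver["lat"], driver["lng"])
        if distance <= radius_km:
            hits.append((distance, driver_id))
    hits.sort()
    return [(driver_id, distance) for distance, driver_id in hits[:limit]]
```

Checking the status first skips the trigonometry for busy drivers. The cell lookup is only a coarse prefilter; the exact distance check is what enforces the radius.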
Redis data structures:
- SET geohash:{hash} → {driver_id_1, driver_id_2, ...}
- HASH driver:{id} → {lat, lng, status, last_updated}
Why Redis?
- In-memory: sub-millisecond lookups
- SET operations: efficient add/remove/union
- Handles 1.67M writes/sec with sharding
Alternative: QuadTree
QuadTree recursively divides space into 4 quadrants.
Pros vs Geohash:
+ Dynamic resolution (denser areas get more subdivisions)
+ Natural range queries
Cons:
- Harder to distribute across multiple servers
- More complex implementation
- Redis geohash approach is simpler and fast enough
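For comparison, a minimal point quadtree might look like the sketch below. It works on plain x/y coordinates, and the `capacity` and `max_depth` defaults are arbitrary assumptions; it illustrates the "dynamic resolution" property rather than a production index.

```python
class QuadTree:
    """Point quadtree: a node splits into 4 children once it exceeds capacity."""

    def __init__(self, x_min, y_min, x_max, y_max, capacity=4, depth=0, max_depth=16):
        self.bounds = (x_min, y_min, x_max, y_max)
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth  # stops endless splitting on duplicate points
        self.points = []            # (x, y, item) tuples held by a leaf
        self.children = None        # list of 4 QuadTree nodes after a split

    def insert(self, x, y, item):
        x_min, y_min, x_max, y_max = self.bounds
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            return False
        if self.children is None:
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append((x, y, item))
                return True
            self._split()
        # any() short-circuits, so a boundary point lands in exactly one child.
        return any(child.insert(x, y, item) for child in self.children)

    def _split(self):
        x_min, y_min, x_max, y_max = self.bounds
        x_mid, y_mid = (x_min + x_max) / 2, (y_min + y_max) / 2
        args = (self.capacity, self.depth + 1, self.max_depth)
        self.children = [
            QuadTree(x_min, y_min, x_mid, y_mid, *args),
            QuadTree(x_mid, y_min, x_max, y_mid, *args),
            QuadTree(x_min, y_mid, x_mid, y_max, *args),
            QuadTree(x_mid, y_mid, x_max, y_max, *args),
        ]
        old_points, self.points = self.points, []
        for x, y, item in old_points:
            any(child.insert(x, y, item) for child in self.children)

    def query(self, qx_min, qy_min, qx_max, qy_max):
        """Return items inside the query rectangle, pruning disjoint subtrees."""
        x_min, y_min, x_max, y_max = self.bounds
        if qx_max < x_min or qx_min > x_max or qy_max < y_min or qy_min > y_max:
            return []
        found = [item for x, y, item in self.points
                 if qx_min <= x <= qx_max and qy_min <= y <= qy_max]
        if self.children is not None:
            for child in self.children:
                found.extend(child.query(qx_min, qy_min, qx_max, qy_max))
        return found
```

Dense areas end up with deeper subtrees while sparse areas stay as single leaves. The cost is a tree that lives in one process's memory, which is the distribution problem listed in the cons.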
For the interview: mention both, explain why you chose geohash.
Step 5: Deep Dive - Ride Matching
Matching algorithm:
1. Rider requests ride → Ride Matching Service
2. Query Location Service: "nearest 10 available drivers within 5km"
3. Rank drivers by:
- Distance to rider (primary)
- Driver rating
- Time since last ride (fairness)
- Vehicle type match
4. Send ride offer to top driver
5. Driver has 15 seconds to accept
6. If rejected/timeout → offer to next driver
7. If all 10 reject → expand radius to 10km, retry
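The offer loop above can be sketched as below. Everything here is a simplified assumption: `find_candidates` and `send_offer` are hypothetical callables standing in for the Location Service query and the push to the driver app, the sort key is one possible ordering of the ranking criteria, and vehicle type is treated as a hard filter.

```python
def rank_drivers(candidates, vehicle_type):
    """Keep matching vehicles; order by distance, then rating, then idle time."""
    eligible = [d for d in candidates if d["vehicle_type"] == vehicle_type]
    return sorted(
        eligible,
        key=lambda d: (d["distance_km"], -d["rating"], -d["idle_sec"]),
    )


def dispatch(ride, find_candidates, send_offer,
             radii_km=(5, 10), max_offers=10, accept_timeout_sec=15):
    """Offer the ride to one driver at a time; widen the radius if all decline.

    find_candidates(ride, radius_km) -> list of driver dicts (hypothetical)
    send_offer(driver_id, ride, timeout_sec) -> True if accepted in time
    Returns the accepting driver's id, or None if nobody accepted.
    """
    declined = set()
    for radius_km in radii_km:
        candidates = [d for d in find_candidates(ride, radius_km)
                      if d["id"] not in declined]  # don't re-offer on retry
        for driver in rank_drivers(candidates, ride["vehicle_type"])[:max_offers]:
            if send_offer(driver["id"], ride, accept_timeout_sec):
                return driver["id"]
            declined.add(driver["id"])
    return None
```

Offering sequentially keeps the logic simple but costs up to 15 seconds per rejection, so a real dispatcher would likely need to bound total wait time as well.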
Dispatch optimization:
- Don't dispatch to a driver who's heading away from rider
- Consider driver's current heading (angle between driver heading and rider direction)
- Predict driver ETA using road network, not straight-line distance
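The heading check can be illustrated with the standard initial-bearing formula. The 90-degree cutoff and the function names are assumptions chosen for the example, not values from the guide.

```python
import math


def initial_bearing_deg(lat1, lng1, lat2, lng2):
    """Compass bearing (0 = north, 90 = east) from point 1 toward point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_lambda = math.radians(lng2 - lng1)
    y = math.sin(d_lambda) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(d_lambda))
    return math.degrees(math.atan2(y, x)) % 360.0


def heading_offset_deg(heading_deg, bearing_deg):
    """Smallest angle between two compass directions, in the range 0..180."""
    diff = abs(heading_deg - bearing_deg) % 360.0
    return min(diff, 360.0 - diff)


def is_heading_away(driver_lat, driver_lng, driver_heading_deg,
                    rider_lat, rider_lng, max_offset_deg=90.0):
    """True if the driver's travel direction points away from the rider."""
    bearing = initial_bearing_deg(driver_lat, driver_lng, rider_lat, rider_lng)
    return heading_offset_deg(driver_heading_deg, bearing) > max_offset_deg
```

A driver due south of the rider and traveling north has an offset near 0 degrees; the same driver traveling south has an offset near 180 degrees and would be skipped or ranked lower. This is only a cheap prefilter, since a road-network ETA already accounts for U-turns and one-way streets.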
State machine for a ride:
REQUESTED → MATCHING → DRIVER_ACCEPTED → DRIVER_EN_ROUTE
→ DRIVER_ARRIVED → RIDE_IN_PROGRESS → COMPLETED → RATED
Step 6: Surge Pricing
Surge pricing balances supply and demand in real-time.
How it works:
1. Divide city into zones (geohash-based, ~2km cells)
2. For each zone, track:
- Supply: number of available drivers
- Demand: ride requests in last 5 minutes
3. Compute surge multiplier:
multiplier = demand / (supply * target_ratio)
Clamped to the range 1.0x (floor) to 5.0x (ceiling)
Update frequency: Every 2 minutes per zone
Implementation:
- Use time-windowed counters in Redis
- INCR demand:{zone}:{time_bucket} on each request
- COUNT drivers in zone from geospatial index
- Background worker computes multipliers every 2 min
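A minimal in-memory sketch of the counter and multiplier logic follows. The dictionary stands in for the Redis `demand:{zone}:{time_bucket}` counters, the 60-second bucket size is an assumption, and treating "demand with zero supply" as the ceiling is a policy choice made for this example.

```python
from collections import defaultdict

BUCKET_SEC = 60    # assumed bucket width
WINDOW_SEC = 300   # demand window: last 5 minutes

demand_counters = defaultdict(int)  # (zone, bucket) -> request count


def record_request(zone, timestamp):
    """Equivalent of INCR demand:{zone}:{time_bucket}."""
    demand_counters[(zone, int(timestamp) // BUCKET_SEC)] += 1


def demand_in_window(zone, now):
    """Sum the buckets that cover the last WINDOW_SEC seconds."""
    current = int(now) // BUCKET_SEC
    buckets = WINDOW_SEC // BUCKET_SEC
    return sum(demand_counters.get((zone, current - i), 0) for i in range(buckets))


def surge_multiplier(demand, supply, target_ratio=1.0, floor=1.0, ceiling=5.0):
    """multiplier = demand / (supply * target_ratio), clamped to [floor, ceiling]."""
    if supply <= 0:
        # Assumption: no available drivers means maximum surge if anyone is asking.
        return ceiling if demand > 0 else floor
    raw = demand / (supply * target_ratio)
    return max(floor, min(ceiling, raw))


print(surge_multiplier(demand=30, supply=10))   # 3.0
print(surge_multiplier(demand=5, supply=10))    # 1.0 (floor)
print(surge_multiplier(demand=100, supply=10))  # 5.0 (ceiling)
```

In Redis each bucket key would also get a TTL slightly longer than the window so old counters expire on their own.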
Show rider the surge price BEFORE they confirm.
Riders explicitly accept the higher price.
Step 7: ETA Calculation
ETA is NOT straight-line distance / speed.
Real ETA uses:
1. Road network graph (from OpenStreetMap or proprietary data)
2. Current traffic conditions (from driver GPS data)
3. Historical patterns (rush hour, events)
Algorithm:
- Pre-compute road network as weighted graph
- Edge weights = travel time (updated with live traffic)
- Use A* or Dijkstra's for shortest path
- Partition city into regions, pre-compute inter-region travel times
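As an illustration of the graph approach, here is Dijkstra's algorithm over a tiny road graph whose edge weights are travel times in minutes. The graph, node names, and weights are made up for the example; a real routing engine would add the region partitioning and precomputation described above.

```python
import heapq


def shortest_travel_time(graph, source, target):
    """Dijkstra's algorithm: minimum total edge weight from source to target.

    `graph` maps node -> {neighbor: travel_time}. Returns None if unreachable.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        elapsed, node = heapq.heappop(heap)
        if node == target:
            return elapsed
        if elapsed > best.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for neighbor, travel_time in graph.get(node, {}).items():
            candidate = elapsed + travel_time
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return None


# Hypothetical intersections A..D with travel times in minutes.
roads = {
    "A": {"B": 4, "C": 2},
    "B": {"D": 5},
    "C": {"B": 1, "D": 8},
    "D": {},
}
print(shortest_travel_time(roads, "A", "D"))  # 8.0 via A -> C -> B -> D

# Live traffic slows B -> D, so the best route changes.
roads["B"]["D"] = 10
print(shortest_travel_time(roads, "A", "D"))  # 10.0 via A -> C -> D
```

Updating an edge weight is all it takes to reflect live traffic, which is why the weights are travel times rather than distances.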
For the interview:
- Mention the road graph approach
- Discuss live traffic from driver GPS aggregation
- Note that Google Maps API is often used early-stage
- Uber built their own routing engine for cost and accuracy
Step 8: Real-Time Communication
Both rider and driver need real-time updates:
- Driver location during pickup and ride
- Ride status changes
- ETA updates
Protocol: WebSocket connections
- Each active user maintains a persistent WebSocket
- Server pushes location updates every 3 seconds
- Much more efficient than polling
Scale challenge:
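- 5M concurrent WebSocket connections from drivers alone, plus riders with active rides
- Use WebSocket gateway servers (each handles ~100K connections)
- Need ~50 gateway servers for the driver connections (5M / 100K), plus headroom for riders and failover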
- Use pub/sub (Redis or Kafka) to route messages to correct gateway
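The routing idea can be sketched with an in-memory registry that records which gateway holds each user's connection. The class and method names are hypothetical, and the per-gateway `outbox` lists stand in for pub/sub channels; the sizing helper just restates the arithmetic above.

```python
import math
from collections import defaultdict


def gateways_needed(connections, per_gateway=100_000):
    """Number of gateway servers required for a given connection count."""
    return math.ceil(connections / per_gateway)


class GatewayRouter:
    """Tracks user -> gateway and routes messages to the right gateway's channel."""

    def __init__(self):
        self.user_gateway = {}           # user_id -> gateway_id
        self.outbox = defaultdict(list)  # gateway_id -> queued (user_id, message)

    def connect(self, user_id, gateway_id):
        self.user_gateway[user_id] = gateway_id

    def disconnect(self, user_id):
        self.user_gateway.pop(user_id, None)

    def publish(self, user_id, message):
        """Queue a message on the user's gateway; False if the user is offline."""
        gateway_id = self.user_gateway.get(user_id)
        if gateway_id is None:
            return False
        self.outbox[gateway_id].append((user_id, message))
        return True


print(gateways_needed(5_000_000))  # 50
```

Routing through a registry lets the Location Service publish without knowing which server holds the socket. The registry itself must be updated on every connect and disconnect, so in practice it would live in a shared store with entries that expire when a gateway dies.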
Message flow:
Driver app → Location Service → Redis Pub/Sub → Gateway → Rider app
Key Takeaways for the Interview
- Geospatial indexing is the core challenge: Explain geohash vs quadtree trade-offs
- Real-time matters: WebSockets, not polling. Sub-second location propagation
- Matching is more than distance: Heading direction, ETA, fairness, driver rating
- Surge pricing: Zone-based supply/demand balancing, updated every few minutes
- ETA uses road networks: Not straight-line distance. Graph-based routing with live traffic
Practice This on HireReady
Uber-style system design questions appear at Uber, Lyft, DoorDash, and other marketplace companies. Practice articulating your design with our AI voice interviewer.
Article Details
This guide is part of HireReady's interview prep library and is maintained to reflect current hiring practices.