Design Netflix - Video Streaming Platform
Learn how to design a scalable video streaming platform like Netflix, handling millions of concurrent users, content delivery, and personalized recommendations.
Problem Statement
Design a video streaming platform that can:
- Stream videos to millions of concurrent users
- Provide personalized recommendations
- Support multiple devices and resolutions
- Handle content upload and encoding
- Deliver content with low latency globally
Requirements
Functional Requirements
- User authentication and profiles
- Video upload and processing
- Video streaming with adaptive bitrate
- Search and browse content
- Personalized recommendations
- Watch history and resume playback
- Subtitle support
- Download for offline viewing
Non-Functional Requirements
- Scalability: Support 200M+ users, 100M+ concurrent streams
- Availability: 99.99% uptime
- Low Latency: Start streaming within 2 seconds
- Global: Serve users worldwide
- Cost-Effective: Optimize bandwidth and storage costs
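A rough back-of-envelope calculation shows why these targets push the design toward a CDN. The sketch below assumes an average bitrate of 5 Mbps per stream (the 1080p rung of the bitrate ladder used later in this article); the function names are illustrative, not part of any real API.

```python
# Back-of-envelope egress estimate. All inputs are illustrative assumptions.
CONCURRENT_STREAMS = 100_000_000  # from the scalability requirement
AVG_BITRATE_MBPS = 5              # assumed average, roughly the 1080p rung


def peak_egress_tbps(streams: int, avg_mbps: float) -> float:
    """Aggregate egress in terabits per second (1 Tbps = 1,000,000 Mbps)."""
    return streams * avg_mbps / 1_000_000


def gb_per_hour(bitrate_mbps: float) -> float:
    """Data moved by one stream in an hour, in decimal gigabytes."""
    return bitrate_mbps * 3600 / 8 / 1000


print(peak_egress_tbps(CONCURRENT_STREAMS, AVG_BITRATE_MBPS))  # 500.0
print(gb_per_hour(AVG_BITRATE_MBPS))                           # 2.25
```

Under these assumptions the peak is about 500 Tbps, which no single origin can serve; the load has to be spread across edge locations.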
High-Level Architecture
┌─────────────┐
│   Client    │ (Web, Mobile, TV, Gaming Console)
└──────┬──────┘
       │
       ├─────────────────────────────────────────┐
       │                                         │
┌──────▼──────┐                          ┌───────▼────────┐
│     CDN     │                          │  API Gateway   │
│ (CloudFront)│                          │   (Load Bal)   │
└──────┬──────┘                          └───────┬────────┘
       │                                         │
       │                                ┌────────▼────────┐
       │                                │  Microservices  │
       │                                ├─────────────────┤
       │                                │ • Auth Service  │
       │                                │ • User Service  │
       │                                │ • Video Service │
       │                                │ • Search Service│
       │                                │ • Recommend Svc │
       │                                └────────┬────────┘
       │                                         │
┌──────▼──────┐                          ┌───────▼────────┐
│   Storage   │                          │   Databases    │
│    (S3)     │                          ├────────────────┤
├─────────────┤                          │ • PostgreSQL   │
│ • Videos    │                          │ • Cassandra    │
│ • Thumbnails│                          │ • Redis Cache  │
│ • Subtitles │                          │ • Elasticsearch│
└─────────────┘                          └────────────────┘
Core Components
1. Content Delivery Network (CDN)
Purpose: Deliver video content with low latency globally
Implementation:
CDN Strategy:
├── Edge Locations (200+ worldwide)
├── Origin Servers (S3 buckets)
├── Cache Strategy
│ ├── Popular content: Cache at all edges
│ ├── Regional content: Cache in specific regions
│ └── Long-tail content: On-demand caching
└── Adaptive Bitrate Streaming (ABR)
├── 4K: 25 Mbps
├── 1080p: 5 Mbps
├── 720p: 3 Mbps
├── 480p: 1.5 Mbps
└── 360p: 0.7 Mbps
2. Video Processing Pipeline
Workflow:
Upload → Transcoding → Quality Check → Storage → CDN Distribution
1. Upload Service
- Chunked upload for large files
- Resume capability
- Validation (format, size, duration)
2. Transcoding Service
- Multiple resolutions (360p to 4K)
- Multiple containers and packaging (MP4, WebM; HLS playlists and segments)
- Audio tracks (multiple languages)
- Subtitle generation
- Thumbnail extraction
3. Storage
- Original: S3 Glacier (cold storage)
- Transcoded: S3 Standard
- Metadata: Database
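The chunked upload with resume capability from step 1 can be sketched as follows. This is a minimal illustration, not a real upload protocol: the 8 MiB chunk size and the function names `plan_chunks` and `remaining_chunks` are assumptions, and the server is assumed to report which chunk indices it has already stored.

```python
from typing import List, Set, Tuple

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB per chunk (assumed; tune per network)


def plan_chunks(file_size: int, chunk_size: int = CHUNK_SIZE) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) byte ranges that cover the whole file."""
    return [
        (start, min(start + chunk_size, file_size) - 1)
        for start in range(0, file_size, chunk_size)
    ]


def remaining_chunks(file_size: int, uploaded: Set[int],
                     chunk_size: int = CHUNK_SIZE) -> List[int]:
    """Indices of chunks still to send after an interrupted upload."""
    total = len(plan_chunks(file_size, chunk_size))
    return [i for i in range(total) if i not in uploaded]


# A 25-byte file with 10-byte chunks, where chunks 0 and 2 already arrived.
print(plan_chunks(25, 10))               # [(0, 9), (10, 19), (20, 24)]
print(remaining_chunks(25, {0, 2}, 10))  # [1]
```

On resume, the client asks the server for the set of stored chunk indices and sends only the missing ones.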
3. Streaming Protocol
HLS (HTTP Live Streaming):
video.m3u8 (Master Playlist)
├── 4k.m3u8
│ ├── segment_001.ts
│ ├── segment_002.ts
│ └── segment_003.ts
├── 1080p.m3u8
├── 720p.m3u8
└── 480p.m3u8
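A master playlist for the layout above could be generated along these lines. The `#EXTM3U` and `#EXT-X-STREAM-INF` tags with `BANDWIDTH` and `RESOLUTION` attributes come from the HLS specification; the bitrates follow the ABR ladder in the CDN section, while the pixel resolutions and the `master_playlist` helper are assumptions made for this sketch.

```python
from typing import List, Tuple

# (playlist name, peak bits per second, resolution). Resolutions are assumed.
RENDITIONS: List[Tuple[str, int, str]] = [
    ("4k", 25_000_000, "3840x2160"),
    ("1080p", 5_000_000, "1920x1080"),
    ("720p", 3_000_000, "1280x720"),
    ("480p", 1_500_000, "854x480"),
]


def master_playlist(renditions: List[Tuple[str, int, str]]) -> str:
    """Build the text of an HLS master playlist listing each variant stream."""
    lines = ["#EXTM3U"]
    for name, bandwidth, resolution in renditions:
        lines.append(f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={resolution}")
        lines.append(f"{name}.m3u8")  # URI of the variant playlist
    return "\n".join(lines) + "\n"


print(master_playlist(RENDITIONS))
```

Each variant playlist (for example 1080p.m3u8) then lists the media segments for that quality level.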
Client automatically switches quality based on:
- Network bandwidth
- Device capability
- Buffer health
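The three inputs above can be combined into a simple selection rule. This is a hedged sketch of one possible heuristic, not the algorithm any real player uses: the 0.8 safety factor, the 5-second low-buffer threshold, and the `pick_rendition` name are all assumptions.

```python
# Bitrate ladder from the CDN section (Mbps) and the height of each rung.
LADDER_MBPS = {"4K": 25.0, "1080p": 5.0, "720p": 3.0, "480p": 1.5, "360p": 0.7}
HEIGHT = {"4K": 2160, "1080p": 1080, "720p": 720, "480p": 480, "360p": 360}


def pick_rendition(bandwidth_mbps: float, max_height: int, buffer_seconds: float,
                   safety: float = 0.8, low_buffer: float = 5.0) -> str:
    """Pick the highest rung that fits the bandwidth budget and the device."""
    budget = bandwidth_mbps * safety       # leave headroom for throughput jitter
    if buffer_seconds < low_buffer:
        budget *= 0.5                      # nearly empty buffer: be conservative
    candidates = [
        name for name, rate in LADDER_MBPS.items()
        if rate <= budget and HEIGHT[name] <= max_height
    ]
    if not candidates:
        return "360p"                      # lowest rung as a fallback
    return max(candidates, key=LADDER_MBPS.get)


print(pick_rendition(10, 1080, 20))   # 1080p
print(pick_rendition(10, 1080, 2))    # 720p (low buffer halves the budget)
print(pick_rendition(100, 720, 30))   # 720p (device caps the resolution)
```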
4. Recommendation System
Architecture:
Data Collection → Feature Engineering → Model Training → Serving
Data Sources:
├── Watch history
├── Search queries
├── Ratings
├── Time of day
├── Device type
└── Geographic location
Algorithms:
├── Collaborative Filtering
├── Content-Based Filtering
├── Deep Learning (Neural Networks)
└── A/B Testing for optimization
Real-time Serving:
├── Pre-computed recommendations (batch)
├── Real-time personalization
└── Redis cache for fast access
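To make collaborative filtering concrete, here is a toy user-based variant in pure Python. It is a sketch under simplifying assumptions (explicit ratings held in memory, cosine similarity, no normalization); a production system would train on far larger data with dedicated tooling, and the `recommend` function is illustrative only.

```python
from math import sqrt
from typing import Dict, List

Ratings = Dict[str, Dict[str, float]]  # user -> {video_id: rating}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse rating vectors."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(ratings: Ratings, user: str, top_n: int = 3) -> List[str]:
    """Score unseen videos by the similarity-weighted ratings of other users."""
    seen = ratings[user]
    scores: Dict[str, float] = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine(seen, theirs)
        if sim <= 0:
            continue  # users with no overlap contribute nothing
        for video, rating in theirs.items():
            if video not in seen:
                scores[video] = scores.get(video, 0.0) + sim * rating
    # Highest score first; ties broken by video id for determinism.
    return sorted(scores, key=lambda v: (-scores[v], v))[:top_n]


ratings = {
    "alice": {"v1": 5, "v2": 4},
    "bob": {"v1": 5, "v2": 4, "v3": 5},
    "carol": {"v4": 5},
}
print(recommend(ratings, "alice"))  # ['v3']
```

The batch layer would run something like this offline and write the top-N lists to Redis for fast reads.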
Database Design
User Service (PostgreSQL)
users
├── user_id (PK)
├── email
├── password_hash
├── subscription_tier
├── created_at
└── last_login
profiles
├── profile_id (PK)
├── user_id (FK)
├── name
├── avatar
└── preferences
Video Service (Cassandra)
videos (Partition Key: video_id)
├── video_id
├── title
├── description
├── duration
├── release_date
├── genres
├── cast
└── thumbnail_url
video_metadata (Partition Key: video_id, Clustering Key: resolution)
├── video_id
├── resolution
├── bitrate
├── codec
├── file_size
└── cdn_url
Watch History (Cassandra)
watch_history (Partition Key: user_id, Clustering Key: timestamp)
├── user_id
├── video_id
├── timestamp
├── duration_watched
├── total_duration
└── device_type
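The watch_history table supports resume playback and the "continue watching" row. The sketch below models the rows in memory to show the query logic; it assumes duration_watched holds the last playback position in seconds and treats anything past 95% as finished. Both the threshold and the `continue_watching` helper are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class WatchEvent:
    user_id: str
    video_id: str
    timestamp: int         # epoch seconds
    duration_watched: int  # assumed: last playback position, in seconds
    total_duration: int    # video length, in seconds
    device_type: str


def continue_watching(events: Iterable[WatchEvent],
                      finished_ratio: float = 0.95) -> List[WatchEvent]:
    """Latest event per video, unfinished titles only, most recent first."""
    latest: Dict[str, WatchEvent] = {}
    for event in events:
        current = latest.get(event.video_id)
        if current is None or event.timestamp > current.timestamp:
            latest[event.video_id] = event
    unfinished = [
        e for e in latest.values()
        if e.duration_watched < finished_ratio * e.total_duration
    ]
    return sorted(unfinished, key=lambda e: e.timestamp, reverse=True)


events = [
    WatchEvent("u1", "a", 100, 600, 3600, "tv"),
    WatchEvent("u1", "a", 200, 1200, 3600, "tv"),
    WatchEvent("u1", "b", 150, 3590, 3600, "phone"),
]
print([(e.video_id, e.duration_watched) for e in continue_watching(events)])
# [('a', 1200)]
```

Because the table is partitioned by user_id and clustered by timestamp, all of one user's events sit in a single partition in time order, so this scan stays cheap.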
API Design
Streaming API
GET /api/v1/stream/{video_id}
Headers:
Authorization: Bearer {token}
Response:
{
"manifest_url": "https://cdn.netflix.com/video123/master.m3u8",
"drm_license": "...",
"subtitles": [
{"language": "en", "url": "..."},
{"language": "es", "url": "..."}
]
}
Recommendation API
GET /api/v1/recommendations
Headers:
Authorization: Bearer {token}
Response:
{
"personalized": [...],
"trending": [...],
"continue_watching": [...],
"new_releases": [...]
}
Scalability Strategies
1. Horizontal Scaling
- Microservices architecture
- Stateless services
- Load balancing across multiple instances
2. Caching Strategy
Multi-Level Caching:
├── Browser Cache (videos, thumbnails)
├── CDN Cache (edge locations)
├── Application Cache (Redis)
│ ├── User sessions
│ ├── Recommendations
│ └── Popular content metadata
└── Database Cache (query results)
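The application-cache level typically follows the cache-aside pattern: read from the cache, fall back to the database on a miss, then populate the cache. The sketch below uses an in-process dictionary as a stand-in for Redis so that it runs without dependencies; the `TTLCache` class and its parameters are hypothetical.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-process stand-in for Redis, for illustration only."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: Dict[str, Tuple[float, Any]] = {}  # key -> (expiry, value)

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        """Cache-aside read: return a fresh cached value or load and store it."""
        now = self._clock()
        entry = self._data.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                      # cache hit
        value = loader(key)                      # miss: go to the database
        self._data[key] = (now + self._ttl, value)
        return value


def load_metadata(video_id: str) -> dict:
    # Placeholder for a real database query.
    return {"video_id": video_id, "title": f"Title for {video_id}"}


cache = TTLCache(ttl_seconds=300)
print(cache.get_or_load("video123", load_metadata))  # loads from the "database"
print(cache.get_or_load("video123", load_metadata))  # served from the cache
```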
3. Database Sharding
User Data Sharding:
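- Shard by a hash of user_id; plain user_id % num_shards remaps most keys whenever num_shards changes
- Prefer consistent hashing for even distribution and minimal data movement when shards are added or removed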
Video Data Sharding:
- Shard by video_id
- Replicate popular content across shards
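A consistent-hash ring with virtual nodes can be sketched as follows. This is a minimal illustration: the `HashRing` class, the use of MD5 as the hash, and 100 virtual nodes per shard are assumptions chosen for brevity.

```python
import bisect
import hashlib
from typing import List


def _hash(key: str) -> int:
    """Stable 64-bit hash (Python's built-in hash() varies between runs)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, shards: List[str], vnodes: int = 100) -> None:
        # Each shard owns many points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{shard}#{i}"), shard) for shard in shards for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def shard_for(self, key: str) -> str:
        """Walk clockwise from the key's hash to the next shard point."""
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


before = HashRing([f"shard-{i}" for i in range(4)])
after = HashRing([f"shard-{i}" for i in range(5)])
keys = [f"user-{i}" for i in range(10_000)]
moved = sum(before.shard_for(k) != after.shard_for(k) for k in keys)
print(f"{moved / len(keys):.0%} of keys moved after adding one shard")
```

Adding a fifth shard should move roughly one fifth of the keys, all of them onto the new shard, whereas modulo sharding would remap most keys.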
Cost Optimization
1. Storage Tiering
Content Lifecycle:
├── New Release (0-30 days): S3 Standard + All CDN edges
├── Popular (30-180 days): S3 Standard + Regional CDN
├── Catalog (180-365 days): S3 IA + On-demand CDN
└── Archive (365+ days): S3 Glacier + Rare access
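The lifecycle above maps directly to a lookup by content age. The sketch treats each range as half-open (day 30 counts as "Popular"), which is an assumption, since the boundaries in the list overlap; `storage_tier` is a hypothetical helper.

```python
from typing import Tuple


def storage_tier(age_days: int) -> Tuple[str, str]:
    """Return (storage class, CDN placement) for content of a given age."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days < 30:
        return ("S3 Standard", "all CDN edges")   # new release
    if age_days < 180:
        return ("S3 Standard", "regional CDN")    # popular
    if age_days < 365:
        return ("S3 IA", "on-demand CDN")         # catalog
    return ("S3 Glacier", "rare access")          # archive


print(storage_tier(10))   # ('S3 Standard', 'all CDN edges')
print(storage_tier(200))  # ('S3 IA', 'on-demand CDN')
```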
2. Bandwidth Optimization
- Adaptive bitrate streaming
- Compression (H.265/HEVC)
- P2P delivery for popular content
- Off-peak encoding
Security
1. DRM (Digital Rights Management)
- Widevine (Android, Chrome)
- FairPlay (iOS, Safari)
- PlayReady (Windows, Xbox)
2. Content Protection
- Encrypted streaming (HTTPS)
- Token-based authentication
- Geo-blocking
- Watermarking
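Token-based access to manifests and segments is often implemented with short-lived signed URLs. The sketch below shows the idea with an HMAC-SHA256 signature from Python's standard library; the secret, the `expires` and `sig` query parameter names, and both helper functions are hypothetical and do not reflect any specific CDN's signing scheme.

```python
import hashlib
import hmac

SECRET = b"demo-secret"  # hypothetical; load from a secret store in practice


def _signature(path: str, expires_at: int, secret: bytes) -> str:
    message = f"{path}:{expires_at}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(path: str, expires_at: int, secret: bytes = SECRET) -> str:
    """Append an expiry time and a signature that binds path and expiry."""
    return f"{path}?expires={expires_at}&sig={_signature(path, expires_at, secret)}"


def verify(path: str, expires_at: int, sig: str, now: int,
           secret: bytes = SECRET) -> bool:
    """Reject expired or tampered URLs; compare in constant time."""
    if now > expires_at:
        return False
    return hmac.compare_digest(_signature(path, expires_at, secret), sig)


url = sign_url("/video123/master.m3u8", expires_at=1_700_000_000)
sig = url.rsplit("sig=", 1)[1]
print(verify("/video123/master.m3u8", 1_700_000_000, sig, now=1_699_999_000))  # True
print(verify("/video123/master.m3u8", 1_700_000_000, sig, now=1_700_000_001))  # False
```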
Monitoring & Analytics
Key Metrics
Performance:
├── Video start time
├── Buffering ratio
├── Bitrate distribution
└── CDN hit ratio
Business:
├── Concurrent streams
├── Watch time per user
├── Completion rate
└── Churn rate
Infrastructure:
├── Server CPU/Memory
├── Database query time
├── Cache hit rate
└── Error rates
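Two of these metrics reduce to simple ratios. Definitions vary between teams, so the formulas below are one common convention rather than a standard: buffering ratio as stall time over total session time, and hit ratio as hits over all lookups.

```python
def buffering_ratio(stall_seconds: float, play_seconds: float) -> float:
    """Share of a session spent stalled (assumed definition)."""
    total = stall_seconds + play_seconds
    return stall_seconds / total if total else 0.0


def hit_ratio(hits: int, misses: int) -> float:
    """Share of lookups served from cache (CDN or application cache)."""
    total = hits + misses
    return hits / total if total else 0.0


print(buffering_ratio(stall_seconds=3, play_seconds=297))  # 0.01
print(hit_ratio(hits=95, misses=5))                        # 0.95
```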
Interview Questions
Q1: How to handle 100M concurrent streams?
Answer:
- CDN for content delivery (edges serve the video bytes, so most traffic, on the order of 95%, never reaches the origin)
- Microservices for horizontal scaling
- Database sharding for user data
- Redis for session management
- Load balancers with auto-scaling
Q2: How to reduce video start time?
Answer:
- Preload first segment
- Optimize CDN cache hit ratio
- Use HTTP/2 for multiplexing
- Reduce manifest file size
- Predictive prefetching
Q3: How to handle video encoding at scale?
Answer:
- Distributed encoding cluster
- Queue-based processing (SQS)
- Priority queue (new releases first)
- Parallel encoding (multiple resolutions)
- Spot instances for cost savings
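The priority queue with per-resolution fan-out can be sketched with the standard library's heapq. In a real deployment the queue would be a managed service such as SQS, as the answer suggests; the `EncodingQueue` class and the two-level priority scheme here are assumptions for illustration.

```python
import heapq
import itertools
from typing import List, Optional, Tuple

RESOLUTIONS = ["360p", "480p", "720p", "1080p", "4K"]


class EncodingQueue:
    """Min-heap of jobs: lower priority number is served first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str, str]] = []
        self._counter = itertools.count()  # keeps FIFO order within a priority

    def submit(self, video_id: str, resolution: str, priority: int) -> None:
        heapq.heappush(self._heap, (priority, next(self._counter), video_id, resolution))

    def next_job(self) -> Optional[Tuple[str, str]]:
        if not self._heap:
            return None
        _, _, video_id, resolution = heapq.heappop(self._heap)
        return (video_id, resolution)


def enqueue_video(queue: EncodingQueue, video_id: str, is_new_release: bool) -> None:
    """Fan out one job per resolution so workers can encode in parallel."""
    priority = 0 if is_new_release else 1
    for resolution in RESOLUTIONS:
        queue.submit(video_id, resolution, priority)


queue = EncodingQueue()
enqueue_video(queue, "catalog-title", is_new_release=False)
enqueue_video(queue, "new-release", is_new_release=True)
print(queue.next_job())  # ('new-release', '360p')
```

Workers on spot instances would pull jobs with next_job; a job from an interrupted worker is simply resubmitted.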
Conclusion
Netflix-scale system design requires:
- Global CDN for low-latency delivery
- Microservices for scalability
- Adaptive streaming for quality
- ML-based recommendations for engagement
- Cost optimization for profitability
Key architectural decisions:
- CDN-first approach
- Cassandra for time-series data
- Redis for real-time caching
- Kafka for event streaming
- Kubernetes for orchestration