System Design • 2024-01-20

Design Netflix - Video Streaming Platform

Learn how to design a scalable video streaming platform like Netflix, handling millions of concurrent users, content delivery, and personalized recommendations.

Problem Statement

Design a video streaming platform that can:

Stream videos to millions of concurrent users
Provide personalized recommendations
Support multiple devices and resolutions
Handle content upload and encoding
Deliver content with low latency globally

Requirements

Functional Requirements

User authentication and profiles
Video upload and processing
Video streaming with adaptive bitrate
Search and browse content
Personalized recommendations
Watch history and resume playback
Subtitle support
Download for offline viewing

Non-Functional Requirements

Scalability: Support 200M+ users, 100M+ concurrent streams
Availability: 99.99% uptime
Low Latency: Start streaming within 2 seconds
Global: Serve users worldwide
Cost-Effective: Optimize bandwidth and storage costs

High-Level Architecture

┌─────────────┐
│   Client    │ (Web, Mobile, TV, Gaming Console)
└──────┬──────┘
       │
       ├─────────────────────────────────────────┐
       │                                         │
┌──────▼──────┐                         ┌───────▼────────┐
│   CDN       │                         │  API Gateway   │
│ (CloudFront)│                         │   (Load Bal)   │
└──────┬──────┘                         └───────┬────────┘
       │                                         │
       │                                ┌────────▼────────┐
       │                                │  Microservices  │
       │                                ├─────────────────┤
       │                                │ • Auth Service  │
       │                                │ • User Service  │
       │                                │ • Video Service │
       │                                │ • Search Service│
       │                                │ • Recommend Svc │
       │                                └────────┬────────┘
       │                                         │
┌──────▼──────┐                         ┌───────▼────────┐
│   Storage   │                         │   Databases    │
│   (S3)      │                         ├────────────────┤
├─────────────┤                         │ • PostgreSQL   │
│ • Videos    │                         │ • Cassandra    │
│ • Thumbnails│                         │ • Redis Cache  │
│ • Subtitles │                         │ • Elasticsearch│
└─────────────┘                         └────────────────┘

Core Components

1. Content Delivery Network (CDN)

Purpose: Deliver video content with low latency globally

Implementation:

CDN Strategy:
├── Edge Locations (200+ worldwide)
├── Origin Servers (S3 buckets)
├── Cache Strategy
│   ├── Popular content: Cache at all edges
│   ├── Regional content: Cache in specific regions
│   └── Long-tail content: On-demand caching
└── Adaptive Bitrate Streaming (ABR)
    ├── 4K: 25 Mbps
    ├── 1080p: 5 Mbps
    ├── 720p: 3 Mbps
    ├── 480p: 1.5 Mbps
    └── 360p: 0.7 Mbps
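
The bitrate ladder above also allows a rough back-of-envelope estimate of peak egress, which shows why nearly all video traffic has to be served from CDN edges rather than from the origin. The sketch below is illustrative only: the share of streams per rendition (ASSUMED_MIX) and the function name are assumptions, not figures from this article.

```python
# Bitrate ladder from the list above, in Mbps.
LADDER_MBPS = {"4K": 25.0, "1080p": 5.0, "720p": 3.0, "480p": 1.5, "360p": 0.7}

# Hypothetical share of concurrent streams per rendition (must sum to 1.0).
ASSUMED_MIX = {"4K": 0.10, "1080p": 0.40, "720p": 0.30, "480p": 0.15, "360p": 0.05}


def peak_egress_tbps(concurrent_streams: int, mix: dict[str, float]) -> float:
    """Return aggregate egress in Tbps for a stream count and rendition mix."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("rendition mix must sum to 1.0")
    # Weighted average bitrate of one stream, in Mbps.
    avg_mbps = sum(LADDER_MBPS[name] * share for name, share in mix.items())
    return concurrent_streams * avg_mbps / 1_000_000  # Mbps -> Tbps


if __name__ == "__main__":
    print(f"{peak_egress_tbps(100_000_000, ASSUMED_MIX):.0f} Tbps")
```

Under this assumed mix the average stream is about 5.66 Mbps, so 100M concurrent streams come to several hundred Tbps. Changing the mix changes the total, but not the order of magnitude.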

2. Video Processing Pipeline

Workflow:

Upload → Transcoding → Quality Check → Storage → CDN Distribution

1. Upload Service
   - Chunked upload for large files
   - Resume capability
   - Validation (format, size, duration)

2. Transcoding Service
   - Multiple resolutions (360p to 4K)
   - Multiple containers (MP4, WebM), packaged as HLS segments
   - Audio tracks (multiple languages)
   - Subtitle generation
   - Thumbnail extraction

3. Storage
   - Original: S3 Glacier (cold storage)
   - Transcoded: S3 Standard
   - Metadata: Database
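
The chunked upload with resume capability in step 1 can be sketched as a small server-side session object that remembers which chunks have arrived. This is a minimal sketch under stated assumptions: the UploadSession class, its methods, and the 8 MiB chunk size are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass, field

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB per chunk; an assumed value


@dataclass
class UploadSession:
    """Tracks which chunks of one file have arrived (hypothetical helper)."""

    file_size: int
    received: set[int] = field(default_factory=set)

    @property
    def total_chunks(self) -> int:
        return math.ceil(self.file_size / CHUNK_SIZE)

    def accept(self, index: int, data: bytes) -> None:
        """Record a chunk after checking its index and length."""
        if not 0 <= index < self.total_chunks:
            raise ValueError(f"chunk index {index} out of range")
        is_last = index == self.total_chunks - 1
        expected = self.file_size - index * CHUNK_SIZE if is_last else CHUNK_SIZE
        if len(data) != expected:
            raise ValueError(f"chunk {index}: expected {expected} bytes, got {len(data)}")
        self.received.add(index)  # idempotent: re-sending a chunk is harmless

    def missing(self) -> list[int]:
        """Chunks still to be sent; this is what a resume request would return."""
        return [i for i in range(self.total_chunks) if i not in self.received]

    def complete(self) -> bool:
        return not self.missing()
```

After a dropped connection the client asks for missing() and sends only those chunks. Once complete() is true, the file can be handed to the transcoding stage.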

3. Streaming Protocol

HLS (HTTP Live Streaming):

video.m3u8 (Master Playlist)
├── 4k.m3u8
│   ├── segment_001.ts
│   ├── segment_002.ts
│   └── segment_003.ts
├── 1080p.m3u8
├── 720p.m3u8
└── 480p.m3u8
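
As a rough illustration of what the master playlist contains, the sketch below generates one for the four variants in the tree above. The #EXTM3U and #EXT-X-STREAM-INF tags with the BANDWIDTH and RESOLUTION attributes are part of the HLS playlist format. The bandwidth values reuse the ABR ladder from the CDN section, and the 16:9 frame sizes and the function name are assumptions.

```python
# (variant playlist URI, peak bandwidth in bits per second, frame size).
# Bandwidths follow the ABR ladder in the CDN section; frame sizes are assumed.
VARIANTS = [
    ("4k.m3u8", 25_000_000, "3840x2160"),
    ("1080p.m3u8", 5_000_000, "1920x1080"),
    ("720p.m3u8", 3_000_000, "1280x720"),
    ("480p.m3u8", 1_500_000, "854x480"),
]


def build_master_playlist(variants: list[tuple[str, int, str]]) -> str:
    """Return the text of an HLS master playlist listing each variant stream."""
    lines = ["#EXTM3U"]
    for uri, bandwidth, resolution in variants:
        # Each variant is a tag line followed by the URI of its media playlist.
        lines.append(f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={resolution}")
        lines.append(uri)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_master_playlist(VARIANTS), end="")
```

The player downloads this file first, then picks one of the listed variant playlists and starts fetching its segments.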

Client automatically switches quality based on:
- Network bandwidth
- Device capability
- Buffer health
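
The three inputs above can be combined into a simple selection rule. The following is a deliberately simplified sketch, not how any particular player works: the safety factor, the low-buffer threshold, and the function name are assumptions, and production players use more elaborate heuristics.

```python
# Renditions in ascending bitrate order (kbps), from the ABR ladder in this article.
LADDER_KBPS = [("360p", 700), ("480p", 1500), ("720p", 3000), ("1080p", 5000), ("4K", 25000)]


def choose_rendition(
    throughput_kbps: float,
    buffer_seconds: float,
    max_device_rendition: str = "4K",
    safety: float = 0.8,             # use only 80% of measured throughput (assumed)
    low_buffer_seconds: float = 5.0,  # below this, favor avoiding a stall (assumed)
) -> str:
    """Pick the highest rendition that fits the device, network, and buffer."""
    names = [name for name, _ in LADDER_KBPS]
    # Device capability: drop every rendition above what the device can show.
    allowed = LADDER_KBPS[: names.index(max_device_rendition) + 1]
    # Buffer health: when the buffer is nearly empty, fall to the lowest rung.
    if buffer_seconds < low_buffer_seconds:
        return allowed[0][0]
    # Network bandwidth: highest rung whose bitrate fits within the budget.
    budget = throughput_kbps * safety
    best = allowed[0][0]
    for name, kbps in allowed:
        if kbps <= budget:
            best = name
    return best
```

For example, a measured throughput of 8000 kbps with a healthy buffer leaves a budget of 6400 kbps, which selects 1080p. The same throughput with a 2-second buffer drops to 360p.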

4. Recommendation System

Architecture:

Data Collection → Feature Engineering → Model Training → Serving

Data Sources:
├── Watch history
├── Search queries
├── Ratings
├── Time of day
├── Device type
└── Geographic location

Algorithms:
├── Collaborative Filtering
├── Content-Based Filtering
├── Deep Learning (Neural Networks)
└── A/B Testing for optimization

Real-time Serving:
├── Pre-computed recommendations (batch)
├── Real-time personalization
└── Redis cache for fast access
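
To make the collaborative filtering entry concrete, here is a minimal item-based sketch that scores unseen titles by how much their audience overlaps with the titles a user has already watched. It is a toy under stated assumptions: watch history is reduced to "watched or not", the function names are hypothetical, and a real system would train on far richer signals.

```python
import math
from collections import defaultdict


def cosine(a: set[str], b: set[str]) -> float:
    """Cosine similarity between two sets of viewers (binary watch vectors)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def recommend(history: dict[str, set[str]], user: str, k: int = 3) -> list[str]:
    """Return up to k unseen titles, ranked by similarity to the user's history."""
    # Invert user -> titles into title -> viewers.
    viewers: dict[str, set[str]] = defaultdict(set)
    for viewer, titles in history.items():
        for title in titles:
            viewers[title].add(viewer)

    seen = history.get(user, set())
    scores: dict[str, float] = defaultdict(float)
    for candidate in viewers:
        if candidate in seen:
            continue
        # Sum the similarity between the candidate and every watched title.
        for watched in seen:
            scores[candidate] += cosine(viewers[candidate], viewers[watched])

    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, score in ranked[:k] if score > 0]
```

In a batch setting, the output of a function like this would be the pre-computed list that gets written to the cache for fast serving.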

Database Design

User Service (PostgreSQL)

users
├── user_id (PK)
├── email
├── password_hash
├── subscription_tier
├── created_at
└── last_login

profiles
├── profile_id (PK)
├── user_id (FK)
├── name
├── avatar
└── preferences

Video Service (Cassandra)

videos (Partition Key: video_id)
├── video_id
├── title
├── description
├── duration
├── release_date
├── genres
├── cast
└── thumbnail_url

video_metadata (Partition Key: video_id, Clustering Key: resolution)
├── video_id
├── resolution
├── bitrate
├── codec
├── file_size
└── cdn_url

Watch History (Cassandra)

watch_history (Partition Key: user_id, Clustering Key: timestamp)
├── user_id
├── video_id
├── timestamp
├── duration_watched
├── total_duration
└── device_type
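
The watch_history layout supports resume playback: all events for one user live in one partition, ordered by timestamp. The sketch below models that read path in memory rather than against Cassandra. The WatchEvent class, the function name, and the 95% "finished" threshold are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WatchEvent:
    """One row of watch_history (device_type omitted for brevity)."""

    user_id: str
    video_id: str
    timestamp: int         # epoch seconds
    duration_watched: int  # seconds into the video
    total_duration: int    # seconds


def continue_watching(events: list[WatchEvent], finished_ratio: float = 0.95) -> list[WatchEvent]:
    """Latest position per video for one user, newest first, skipping finished titles."""
    latest: dict[str, WatchEvent] = {}
    # Walk events oldest to newest so later events overwrite earlier ones.
    for event in sorted(events, key=lambda e: e.timestamp):
        latest[event.video_id] = event
    unfinished = [
        e for e in latest.values()
        if e.duration_watched < finished_ratio * e.total_duration
    ]
    return sorted(unfinished, key=lambda e: e.timestamp, reverse=True)
```

Each returned event carries the duration_watched offset at which the player should resume.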

API Design

Streaming API

GET /api/v1/stream/{video_id}
Headers:
  Authorization: Bearer {token}

Response:
{
  "manifest_url": "https://cdn.netflix.com/video123/master.m3u8",
  "drm_license": "...",
  "subtitles": [
    {"language": "en", "url": "..."},
    {"language": "es", "url": "..."}
  ]
}

Recommendation API

GET /api/v1/recommendations
Headers:
  Authorization: Bearer {token}

Response:
{
  "personalized": [...],
  "trending": [...],
  "continue_watching": [...],
  "new_releases": [...]
}

Scalability Strategies

1. Horizontal Scaling

Microservices architecture
Stateless services
Load balancing across multiple instances

2. Caching Strategy

Multi-Level Caching:
├── Browser Cache (videos, thumbnails)
├── CDN Cache (edge locations)
├── Application Cache (Redis)
│   ├── User sessions
│   ├── Recommendations
│   └── Popular content metadata
└── Database Cache (query results)
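
The application-cache level typically follows a cache-aside pattern: read from the cache, fall back to the database on a miss, then populate the cache with an expiry. The sketch below uses an in-process dictionary as a stand-in for Redis so that it runs without a server. TTLCache, get_video_metadata, the key format, and the 300-second TTL are all assumptions.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """In-process stand-in for a key-value cache with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._data: dict[str, tuple[float, Any]] = {}
        self._clock = clock

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries on read
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._data[key] = (self._clock() + ttl_seconds, value)


def get_video_metadata(
    video_id: str,
    cache: TTLCache,
    load_from_db: Callable[[str], dict],
    ttl_seconds: float = 300.0,
) -> dict:
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    key = f"video:{video_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(video_id)
    cache.set(key, value, ttl_seconds)
    return value
```

The TTL bounds how stale popular-content metadata can become. Shorter values trade database load for freshness.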

3. Database Sharding

User Data Sharding:
- Shard by a hash of user_id (plain user_id % num_shards remaps most keys whenever num_shards changes)
- Consistent hashing for even distribution and cheap resharding

Video Data Sharding:
- Shard by video_id
- Replicate popular content across shards
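
To show why consistent hashing matters when the shard count changes, here is a minimal hash ring with virtual nodes. It is a sketch, not a production implementation: the HashRing class, the shard names, and the choice of 100 virtual nodes per shard are assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, shards: list[str], vnodes: int = 100) -> None:
        # Each shard is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (self._hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        # First 8 bytes of SHA-256 as an unsigned 64-bit ring position.
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def shard_for(self, key: str) -> str:
        """Return the shard owning the first ring point clockwise from the key."""
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


if __name__ == "__main__":
    keys = [f"user-{i}" for i in range(10_000)]
    before = HashRing([f"shard-{i}" for i in range(4)])
    after = HashRing([f"shard-{i}" for i in range(5)])
    moved = sum(before.shard_for(k) != after.shard_for(k) for k in keys)
    print(f"moved {moved / len(keys):.0%} of keys after adding a fifth shard")
```

Growing from four to five shards moves roughly one fifth of the keys, and every moved key lands on the new shard. With a plain modulo, the same change would remap about 80% of keys.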

Cost Optimization

1. Storage Tiering

Content Lifecycle:
├── New Release (0-30 days): S3 Standard + All CDN edges
├── Popular (30-180 days): S3 Standard + Regional CDN
├── Catalog (180-365 days): S3 IA + On-demand CDN
└── Archive (365+ days): S3 Glacier + Rare access
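
The lifecycle above amounts to a lookup from content age to a storage class and CDN policy. A minimal sketch follows, assuming half-open day ranges (so day 30 counts as "Popular") and a hypothetical storage_tier function. A real policy would likely also consider view counts, not age alone.

```python
def storage_tier(age_days: int) -> tuple[str, str]:
    """Map days since release to (storage class, CDN policy) per the lifecycle above."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days < 30:
        return ("S3 Standard", "all CDN edges")   # New Release
    if age_days < 180:
        return ("S3 Standard", "regional CDN")    # Popular
    if age_days < 365:
        return ("S3 IA", "on-demand CDN")         # Catalog
    return ("S3 Glacier", "rare access")          # Archive
```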

2. Bandwidth Optimization

Adaptive bitrate streaming
Compression (H.265/HEVC)
P2P delivery for popular content
Off-peak encoding

Security

1. DRM (Digital Rights Management)

Widevine (Android, Chrome)
FairPlay (iOS, Safari)
PlayReady (Windows, Xbox)

2. Content Protection

Encrypted streaming (HTTPS)
Token-based authentication
Geo-blocking
Watermarking
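
Token-based authentication for CDN URLs is often done with short-lived signed URLs: the API signs the path and an expiry time, and the edge verifies the signature before serving. The sketch below uses an HMAC from the Python standard library. The query parameter names, the message layout, and the placeholder secret are assumptions, not any CDN's actual scheme.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"replace-with-a-real-secret"  # placeholder; load from a secret store in practice


def _signature(path: str, user_id: str, expires_at: int, secret: bytes) -> str:
    message = f"{path}|{user_id}|{expires_at}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(path: str, user_id: str, expires_at: int, secret: bytes = SECRET) -> str:
    """Return the path with user, expiry, and signature as query parameters."""
    sig = _signature(path, user_id, expires_at, secret)
    return f"{path}?user={user_id}&expires={expires_at}&sig={sig}"


def verify(
    path: str,
    user_id: str,
    expires_at: int,
    sig: str,
    now: Optional[float] = None,
    secret: bytes = SECRET,
) -> bool:
    """Accept only unexpired requests whose signature matches."""
    current = time.time() if now is None else now
    if current >= expires_at:
        return False
    expected = _signature(path, user_id, expires_at, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, sig)
```

Because the path is part of the signed message, a token issued for one video cannot be reused for another.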

Monitoring & Analytics

Key Metrics

Performance:
├── Video start time
├── Buffering ratio
├── Bitrate distribution
└── CDN hit ratio

Business:
├── Concurrent streams
├── Watch time per user
├── Completion rate
└── Churn rate

Infrastructure:
├── Server CPU/Memory
├── Database query time
├── Cache hit rate
└── Error rates

Interview Questions

Q1: How to handle 100M concurrent streams?

Answer:

CDN for content delivery (offloads roughly 95% of traffic from the origin)
Microservices for horizontal scaling
Database sharding for user data
Redis for session management
Load balancers with auto-scaling

Q2: How to reduce video start time?

Answer:

Preload first segment
Optimize CDN cache hit ratio
Use HTTP/2 for multiplexing
Reduce manifest file size
Predictive prefetching

Q3: How to handle video encoding at scale?

Answer:

Distributed encoding cluster
Queue-based processing (SQS)
Priority queue (new releases first)
Parallel encoding (multiple resolutions)
Spot instances for cost savings
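
The priority-queue and parallel-encoding points can be sketched with an in-memory heap standing in for the real queue service. Each submitted video fans out into one task per resolution, and new releases are served before catalog titles. The EncodingQueue class, the two priority levels, and the task shape are assumptions.

```python
import heapq
import itertools
from typing import Optional

RESOLUTIONS = ["360p", "480p", "720p", "1080p", "4K"]
PRIORITY = {"new_release": 0, "catalog": 1}  # lower number is served first


class EncodingQueue:
    """In-memory stand-in for a prioritized encoding work queue."""

    def __init__(self) -> None:
        self._heap: list[tuple[int, int, str, str]] = []
        # Monotonic counter keeps FIFO order among tasks of equal priority.
        self._counter = itertools.count()

    def submit(self, video_id: str, kind: str) -> None:
        """Fan one video out into an independent task per resolution."""
        for resolution in RESOLUTIONS:
            heapq.heappush(
                self._heap,
                (PRIORITY[kind], next(self._counter), video_id, resolution),
            )

    def next_task(self) -> Optional[tuple[str, str]]:
        """Pop the most urgent (video_id, resolution) task, or None if empty."""
        if not self._heap:
            return None
        _, _, video_id, resolution = heapq.heappop(self._heap)
        return video_id, resolution
```

Because each resolution is a separate task, any number of workers can pull from the queue and encode renditions of the same video in parallel.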

Conclusion

Netflix-scale system design requires:

Global CDN for low-latency delivery
Microservices for scalability
Adaptive streaming for quality
ML-based recommendations for engagement
Cost optimization for profitability

Key architectural decisions:

CDN-first approach
Cassandra for time-series data
Redis for real-time caching
Kafka for event streaming
Kubernetes for orchestration