Load Balancer Fundamentals in System Design
Learn Load Balancers from a System Design perspective. This guide explains why Load Balancers are essential, traffic distribution algorithms, health checks, Layer 4 vs Layer 7 Load Balancers, sticky sessions, SSL termination, high availability, and real-world architectures using AWS ALB, NLB, NGINX, and Spring Boot microservices.
Introduction
Imagine Amazon during Black Friday.
Millions of customers are simultaneously:
- Searching products
- Adding items to cart
- Making payments
- Tracking orders
Can a single server handle millions of users?
No.
A single server has limitations:
- CPU
- Memory
- Network bandwidth
- Database connections
- Thread pool
Once any of these resources is exhausted, the server slows down and may eventually crash.
Modern applications solve this problem using Load Balancers.
A Load Balancer distributes incoming traffic across multiple servers so that:
- No single server is overloaded
- Applications remain highly available
- Systems scale horizontally
- Failures are handled automatically
Learning Objectives
After completing this article, you will understand:
- What is a Load Balancer?
- Why Load Balancers Are Needed
- Request Flow
- Traffic Distribution
- Health Checks
- Load Balancing Algorithms
- Layer 4 vs Layer 7
- Sticky Sessions
- SSL Termination
- High Availability
- AWS ALB & NLB
- Real-World Examples
What is a Load Balancer?
A Load Balancer is a component that distributes client requests across multiple backend servers.
Instead of:
Users
↓
One Server
It becomes:
Users
↓
Load Balancer
↓
Multiple Servers
Without Load Balancer
flowchart TD
A[Users]
B[Spring Boot Server]
C[(Database)]
A --> B
B --> C
Problems
- Server overload
- Single Point of Failure
- Poor scalability
- Downtime
With Load Balancer
flowchart TD
A[Users]
B[Load Balancer]
C[Spring Boot Server 1]
D[Spring Boot Server 2]
E[Spring Boot Server 3]
F[(Database)]
A --> B
B --> C
B --> D
B --> E
C --> F
D --> F
E --> F
Request Flow
flowchart LR
A[Client]
B[DNS]
C[Load Balancer]
D[Application]
E[(Database)]
A --> B
B --> C
C --> D
D --> E
The client first resolves the application's domain name through DNS, which returns the Load Balancer's address. The request itself then goes to the Load Balancer, on to an application instance, and finally to the database.
Why Use Load Balancers?
Without a Load Balancer:
- One server receives all traffic.
- CPU reaches 100%.
- Requests become slow.
- Server crashes.
With a Load Balancer:
- Requests are shared.
- CPU usage is balanced.
- Response times improve.
- Applications remain available.
Real-Time Banking Example
flowchart TD
A[Mobile Banking Users]
B[Load Balancer]
C[Payment Service 1]
D[Payment Service 2]
E[Payment Service 3]
F[(Core Banking Database)]
A --> B
B --> C
B --> D
B --> E
C --> F
D --> F
E --> F
If one payment service fails, requests are routed to the remaining healthy services.
Traffic Distribution
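Suppose 300 requests arrive and three servers are available.
With even distribution, each server handles about 100 requests:
Server 1 → 100
Server 2 → 100
Server 3 → 100
instead of a single server handling all 300.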
Health Checks
Load Balancers continuously monitor application health.
graph TD
LoadBalancer["Load Balancer"]
HealthCheck["Health Check"]
App1["Application 1"]
App2["Application 2"]
LoadBalancer --> HealthCheck
HealthCheck --> App1
HealthCheck --> App2
Spring Boot
GET /actuator/health
Response
{
"status":"UP"
}
Instances that fail the health check are automatically taken out of rotation and receive traffic again once they pass.
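The health check logic above can be sketched in plain Java. This is a minimal illustration, not how any specific Load Balancer implements it: the class name `HealthTracker` and both thresholds are assumptions. It marks an instance unhealthy only after several consecutive failed probes, and healthy again only after several consecutive successes, so a single slow response does not remove a server.

```java
// Hypothetical sketch: tracks the health of one backend instance.
class HealthTracker {
    private final int unhealthyThreshold; // consecutive failures before removal (assumed value)
    private final int healthyThreshold;   // consecutive successes before re-adding (assumed value)
    private int consecutiveFailures = 0;
    private int consecutiveSuccesses = 0;
    private boolean healthy = true;

    HealthTracker(int unhealthyThreshold, int healthyThreshold) {
        this.unhealthyThreshold = unhealthyThreshold;
        this.healthyThreshold = healthyThreshold;
    }

    // Records one probe result, e.g. whether GET /actuator/health reported status UP.
    void record(boolean probeSucceeded) {
        if (probeSucceeded) {
            consecutiveSuccesses++;
            consecutiveFailures = 0;
            if (!healthy && consecutiveSuccesses >= healthyThreshold) {
                healthy = true;
            }
        } else {
            consecutiveFailures++;
            consecutiveSuccesses = 0;
            if (healthy && consecutiveFailures >= unhealthyThreshold) {
                healthy = false;
            }
        }
    }

    boolean isHealthy() {
        return healthy;
    }

    public static void main(String[] args) {
        HealthTracker tracker = new HealthTracker(2, 3);
        tracker.record(false);
        tracker.record(false);
        System.out.println("healthy after two failed probes: " + tracker.isHealthy());
    }
}
```

Using separate thresholds for removal and recovery is a common design choice because it avoids flapping when an instance is intermittently slow.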
Server Failure
Normal Operation
flowchart TD
A[Load Balancer]
B[Server 1]
C[Server 2]
D[Server 3]
A --> B
A --> C
A --> D
Server 2 crashes.
flowchart TD
A[Load Balancer]
B[Server 1]
D[Server 3]
A --> B
A --> D
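Once health checks mark Server 2 as unhealthy, new requests go only to Server 1 and Server 3. Requests that were in flight on Server 2 may fail, but most users continue using the application without interruption.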
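The failover behaviour in the two diagrams can be sketched as a router that skips unhealthy servers. This is only an illustration: the class `FailoverRouter` and the server names are hypothetical, and the set of unhealthy servers is assumed to come from health checks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: rotates through servers and skips any that are marked unhealthy.
class FailoverRouter {
    private final List<String> servers;
    private int cursor = 0;

    FailoverRouter(List<String> servers) {
        this.servers = new ArrayList<>(servers);
    }

    // Returns the next healthy server, or throws if every server is unhealthy.
    String route(Set<String> unhealthy) {
        for (int attempts = 0; attempts < servers.size(); attempts++) {
            String candidate = servers.get(cursor);
            cursor = (cursor + 1) % servers.size();
            if (!unhealthy.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no healthy servers available");
    }

    public static void main(String[] args) {
        FailoverRouter router = new FailoverRouter(List.of("server-1", "server-2", "server-3"));
        Set<String> unhealthy = Set.of("server-2"); // Server 2 has crashed
        for (int i = 0; i < 4; i++) {
            System.out.println(router.route(unhealthy));
        }
    }
}
```

With Server 2 marked unhealthy, the sketch alternates between the two remaining servers.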
Load Balancing Algorithms
Popular algorithms include:
| Algorithm | Description |
|---|---|
| Round Robin | Sends each new request to the next server in turn |
| Least Connections | Sends traffic to the server with the fewest active connections |
| Least Response Time | Chooses the fastest responding server |
| Weighted Round Robin | Gives more traffic to higher-capacity servers |
| IP Hash | Hashes the client IP so the same client is routed to the same server |
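IP Hash from the table above can be sketched in a few lines. This is a simplified illustration with a hypothetical class name. It uses Java's `String.hashCode()` for brevity, which a production Load Balancer would not necessarily use.

```java
import java.util.List;

// Hypothetical sketch: maps a client IP to a server by hashing the IP string.
class IpHashSelector {
    static String select(String clientIp, List<String> servers) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        int index = Math.floorMod(clientIp.hashCode(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        List<String> servers = List.of("server-1", "server-2", "server-3");
        // The same client IP always maps to the same server while the list is unchanged.
        System.out.println(select("203.0.113.10", servers));
        System.out.println(select("203.0.113.10", servers));
    }
}
```

One limitation of this simple modulo approach is that adding or removing a server changes the mapping for most clients.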
Round Robin
flowchart LR
A[Request 1]
B[Server 1]
C[Request 2]
D[Server 2]
E[Request 3]
F[Server 3]
A --> B
C --> D
E --> F
Simple and commonly used. It works best when servers have similar capacity and requests have similar cost.
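A minimal Round Robin sketch, assuming a fixed list of servers. The class name and server names are illustrative only.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: hands out servers in a repeating sequence.
class RoundRobinSelector {
    private final List<String> servers;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinSelector(List<String> servers) {
        this.servers = List.copyOf(servers);
    }

    String next() {
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        RoundRobinSelector selector =
                new RoundRobinSelector(List.of("server-1", "server-2", "server-3"));
        for (int request = 1; request <= 4; request++) {
            System.out.println("Request " + request + " -> " + selector.next());
        }
    }
}
```

The fourth request wraps around to the first server again. The atomic counter lets several threads call `next()` safely.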
Least Connections
flowchart TD
A[Load Balancer]
B[Server 1]
C[Server 2]
D[Server 3]
A --> B
A --> C
A --> D
The Load Balancer chooses the server with the fewest active connections.
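Well suited to long-running requests and long-lived connections, where request counts alone do not reflect the actual load on each server.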
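A Least Connections sketch, assuming the Load Balancer keeps a counter of active connections per server. The names `acquire` and `release` are hypothetical, and ties go to the server registered first.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: picks the server with the fewest active connections.
class LeastConnectionsSelector {
    private final Map<String, Integer> active = new LinkedHashMap<>();

    LeastConnectionsSelector(List<String> servers) {
        for (String server : servers) {
            active.put(server, 0);
        }
    }

    // Chooses a server and counts the new connection against it.
    String acquire() {
        String best = null;
        for (Map.Entry<String, Integer> entry : active.entrySet()) {
            if (best == null || entry.getValue() < active.get(best)) {
                best = entry.getKey();
            }
        }
        if (best == null) {
            throw new IllegalStateException("no servers registered");
        }
        active.merge(best, 1, Integer::sum);
        return best;
    }

    // Call when a connection finishes so the server's count drops again.
    void release(String server) {
        active.computeIfPresent(server, (name, count) -> Math.max(0, count - 1));
    }

    public static void main(String[] args) {
        LeastConnectionsSelector selector =
                new LeastConnectionsSelector(List.of("server-1", "server-2", "server-3"));
        System.out.println(selector.acquire());
        System.out.println(selector.acquire());
        selector.release("server-1");
        System.out.println(selector.acquire());
    }
}
```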
Weighted Round Robin
Suppose Server 1 has weight 5 and Server 2 has weight 2.
Out of every 7 requests, Server 1 receives 5 and Server 2 receives 2, because Server 1 has greater capacity.
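The weights 5 and 2 can be tried out with a small sketch. It uses a "smooth" weighted round robin variant, which is one possible approach rather than the method of any particular product. It spreads the heavier server's turns out instead of sending it five requests in a row. The class name is hypothetical.

```java
// Hypothetical sketch: smooth weighted round robin over a fixed set of servers.
class WeightedRoundRobin {
    private final String[] servers;
    private final int[] weights;
    private final int[] current; // running score per server
    private final int totalWeight;

    WeightedRoundRobin(String[] servers, int[] weights) {
        if (servers.length == 0 || servers.length != weights.length) {
            throw new IllegalArgumentException("servers and weights must match and be non-empty");
        }
        this.servers = servers.clone();
        this.weights = weights.clone();
        this.current = new int[weights.length];
        int sum = 0;
        for (int weight : weights) {
            sum += weight;
        }
        this.totalWeight = sum;
    }

    String next() {
        int best = 0;
        for (int i = 0; i < servers.length; i++) {
            current[i] += weights[i];        // every server earns its weight each round
            if (current[i] > current[best]) {
                best = i;
            }
        }
        current[best] -= totalWeight;        // the chosen server pays the total weight
        return servers[best];
    }

    public static void main(String[] args) {
        WeightedRoundRobin selector = new WeightedRoundRobin(
                new String[] {"server-1", "server-2"}, new int[] {5, 2});
        for (int i = 0; i < 7; i++) {
            System.out.println(selector.next());
        }
    }
}
```

Over any 7 consecutive picks, `server-1` is chosen 5 times and `server-2` 2 times.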
Layer 4 Load Balancer
Operates at the Transport Layer.
Routes traffic using:
- IP Address
- TCP or UDP Port
flowchart LR
A[Client]
B[L4 Load Balancer]
C[Servers]
A --> B
B --> C
Fast and lightweight, because it does not inspect the content of the request.
Example
- AWS Network Load Balancer (NLB)
Layer 7 Load Balancer
Operates at the Application Layer.
Routes using:
- URL
- HTTP Headers
- Cookies
- Hostname
flowchart TD
CLIENT["Client"]
ALB["Application Load Balancer"]
CUSTOMER["Customers Service"]
PAYMENT["Payments Service"]
ORDER["Orders Service"]
CLIENT --> ALB
ALB --> CUSTOMER
ALB --> PAYMENT
ALB --> ORDER
Example
- AWS Application Load Balancer (ALB)
Layer 4 vs Layer 7
| Layer 4 | Layer 7 |
|---|---|
| TCP/UDP | HTTP/HTTPS |
| Faster, no request inspection | Content-aware routing decisions |
| No URL Routing | Supports Path Routing |
| Lower Latency | Richer features (routing rules, header and cookie inspection) |
Path-Based Routing
flowchart TD
A["Client"]
B["ALB"]
C["Route: /users"]
D["Route: /orders"]
E["Route: /payments"]
A --> B
B --> C
B --> D
B --> E
Each request is routed to the appropriate microservice.
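Path-based routing can be sketched as a prefix lookup. This is an illustration only. The service names, the `default-service` fallback, and the longest-prefix rule are assumptions. Real Load Balancers often use explicit rule priorities instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: maps URL path prefixes to backend services.
class PathRouter {
    private final Map<String, String> routes = new LinkedHashMap<>();

    PathRouter addRoute(String prefix, String service) {
        routes.put(prefix, service);
        return this;
    }

    // Returns the service for the longest matching prefix, or a default service.
    String route(String path) {
        String bestPrefix = null;
        for (String prefix : routes.keySet()) {
            // Match "/users" and "/users/42", but not "/usersx".
            boolean matches = path.equals(prefix) || path.startsWith(prefix + "/");
            if (matches && (bestPrefix == null || prefix.length() > bestPrefix.length())) {
                bestPrefix = prefix;
            }
        }
        return bestPrefix == null ? "default-service" : routes.get(bestPrefix);
    }

    public static void main(String[] args) {
        PathRouter router = new PathRouter()
                .addRoute("/users", "users-service")
                .addRoute("/orders", "orders-service")
                .addRoute("/payments", "payments-service");
        System.out.println(router.route("/orders/1001"));
    }
}
```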
Host-Based Routing
api.company.com
↓
API Service
admin.company.com
↓
Admin Service
Supported by Layer 7 Load Balancers.
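Host-based routing can be sketched as a lookup on the HTTP `Host` header. The class name, service names, and `default-service` fallback are hypothetical. The port handling is simplified and does not cover IPv6 literals.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: maps hostnames to backend services.
class HostRouter {
    private final Map<String, String> routes = new HashMap<>();

    HostRouter addRoute(String host, String service) {
        routes.put(host.toLowerCase(Locale.ROOT), service);
        return this;
    }

    // hostHeader is the value of the HTTP Host header and may include a port.
    String route(String hostHeader) {
        String host = hostHeader.toLowerCase(Locale.ROOT); // hostnames are case-insensitive
        int colon = host.lastIndexOf(':');
        if (colon >= 0) {
            host = host.substring(0, colon); // drop the port
        }
        return routes.getOrDefault(host, "default-service");
    }

    public static void main(String[] args) {
        HostRouter router = new HostRouter()
                .addRoute("api.company.com", "api-service")
                .addRoute("admin.company.com", "admin-service");
        System.out.println(router.route("admin.company.com:443"));
    }
}
```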
Sticky Sessions
Normally
Request 1
↓
Server 1
Request 2
↓
Server 2
With Sticky Sessions
User A
↓
Always Server 1
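Useful for legacy applications that keep session state in server memory.
Not recommended for stateless microservices, where any instance can serve any request and stickiness only makes the load less even.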
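Sticky Sessions can be sketched as a lookup table from session ID to server. This is a simplified illustration. The class name is hypothetical, and the session ID is assumed to come from a cookie set on the first response.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: pins each session to the server that handled its first request.
class StickyRouter {
    private final List<String> servers;
    private final Map<String, String> affinity = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();

    StickyRouter(List<String> servers) {
        this.servers = List.copyOf(servers);
    }

    // sessionId is assumed to be the value of a session cookie.
    String route(String sessionId) {
        // New sessions are assigned round robin; known sessions reuse their server.
        return affinity.computeIfAbsent(sessionId, id ->
                servers.get(Math.floorMod(counter.getAndIncrement(), servers.size())));
    }

    public static void main(String[] args) {
        StickyRouter router = new StickyRouter(List.of("server-1", "server-2"));
        System.out.println(router.route("user-a"));
        System.out.println(router.route("user-b"));
        System.out.println(router.route("user-a")); // same server as the first call
    }
}
```

The sketch never removes entries, so a real implementation would also need expiry and a fallback when the pinned server becomes unhealthy.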
SSL Termination
Instead of every application instance handling TLS, the Load Balancer decrypts incoming HTTPS traffic and forwards it to the backends, often as plain HTTP over a private network.
flowchart LR
A[Browser]
B[HTTPS]
C[Load Balancer]
D[HTTP]
E[Spring Boot]
A --> B
B --> C
C --> D
D --> E
Benefits
- Simplified certificate management
- Reduced CPU usage on application servers
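After termination, the backend sees plain HTTP, so it cannot tell from the connection whether the client originally used HTTPS. Many Load Balancers can pass this information in the `X-Forwarded-Proto` header. The sketch below assumes that header is present and that header names have been lower-cased. The class name is hypothetical.

```java
import java.util.Map;

// Hypothetical sketch: checks whether the original client request used HTTPS.
class ForwardedProto {
    // headers: request headers with lower-cased names (an assumption of this sketch).
    static boolean wasHttps(Map<String, String> headers) {
        String proto = headers.get("x-forwarded-proto");
        return proto != null && proto.trim().equalsIgnoreCase("https");
    }

    public static void main(String[] args) {
        System.out.println(wasHttps(Map.of("x-forwarded-proto", "https")));
        System.out.println(wasHttps(Map.of("x-forwarded-proto", "http")));
    }
}
```

The header should only be trusted when the backend is reachable exclusively through the Load Balancer, since clients can otherwise set it themselves.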
High Availability
Deploy the Load Balancer and its backend targets across multiple Availability Zones.
flowchart TD
A[Users]
B[ALB]
C[AZ-1]
D[AZ-2]
A --> B
B --> C
B --> D
This keeps the service available even if one Availability Zone fails.
AWS Load Balancers
| Service | Use Case |
|---|---|
| ALB | HTTP/HTTPS Applications |
| NLB | TCP/UDP Traffic |
| GWLB | Network Appliances |
| CLB | Legacy Applications |
Spring Boot Architecture
flowchart TD
A[Users]
B[AWS ALB]
C[Spring Boot 1]
D[Spring Boot 2]
E[Spring Boot 3]
F[(Amazon RDS)]
A --> B
B --> C
B --> D
B --> E
C --> F
D --> F
E --> F
Amazon Example
Amazon distributes requests across thousands of application servers using multiple layers of load balancing.
Benefits
- High Availability
- Fault Tolerance
- Horizontal Scaling
Netflix Example
Netflix combines:
- CDN
- Load Balancers
- Auto Scaling
- Regional deployments
to stream content reliably to millions of users.
Banking Example
Every payment request first reaches a Load Balancer before being routed to a healthy payment service.
This helps prevent any single instance from being overloaded during peak banking hours.
Monitoring
Monitor
- Requests/sec
- Active Connections
- Backend (Target) Response Time
- Healthy Hosts
- Unhealthy Hosts
- HTTP 5xx Errors
- CPU Usage
Tools
- AWS CloudWatch
- Datadog
- Grafana
- Prometheus
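Two of the metrics above, HTTP 5xx errors and healthy hosts, can be combined into a simple alert rule. The sketch below is illustrative only. The class name and thresholds are assumptions, and in practice such rules are usually configured in the monitoring tool rather than written by hand.

```java
// Hypothetical sketch: a basic alert rule over Load Balancer metrics.
class LoadBalancerMetrics {
    // Fraction of requests that ended in an HTTP 5xx response.
    static double errorRate(long totalRequests, long http5xx) {
        return totalRequests == 0 ? 0.0 : (double) http5xx / totalRequests;
    }

    // Alerts when too few hosts are healthy or the 5xx rate exceeds the limit.
    static boolean shouldAlert(long totalRequests, long http5xx,
                               int healthyHosts, int minHealthyHosts, double maxErrorRate) {
        return healthyHosts < minHealthyHosts
                || errorRate(totalRequests, http5xx) > maxErrorRate;
    }

    public static void main(String[] args) {
        // 50 errors in 1000 requests is a 5% error rate, above the assumed 1% limit.
        System.out.println(shouldAlert(1000, 50, 3, 2, 0.01));
    }
}
```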
Common Mistakes
❌ Deploying a single application server
❌ No health checks
❌ Sticky sessions in stateless microservices
❌ No auto scaling
❌ Ignoring backend latency
❌ Single Availability Zone deployment
Best Practices
- Use multiple application instances.
- Enable health checks.
- Prefer stateless services.
- Use Layer 7 Load Balancers for HTTP APIs.
- Use Layer 4 for high-performance TCP/UDP workloads.
- Enable HTTPS with SSL termination.
- Deploy across multiple Availability Zones.
- Combine Load Balancers with Auto Scaling.
- Monitor response time and unhealthy targets.
Common Interview Questions
What is a Load Balancer?
A Load Balancer distributes incoming client requests across multiple backend servers to improve availability, scalability, and fault tolerance.
Why are health checks important?
Health checks allow the Load Balancer to detect failed instances and stop routing traffic to them until they recover.
What is the difference between Layer 4 and Layer 7 Load Balancers?
Layer 4 operates at the Transport Layer using TCP/UDP information, while Layer 7 understands HTTP/HTTPS requests and supports advanced routing based on URLs, headers, and hostnames.
What are Sticky Sessions?
Sticky Sessions ensure that requests from the same client continue to be routed to the same backend server. They are useful for stateful applications but generally avoided in stateless microservices.
Why is SSL termination performed at the Load Balancer?
SSL termination centralizes TLS certificate management, reduces CPU overhead on application servers, and simplifies backend service configuration.
Summary
Load Balancers are a fundamental building block of scalable and highly available distributed systems. They distribute traffic intelligently, monitor application health, and ensure continuous service even when servers fail.
In this article, we covered:
- Load Balancer fundamentals
- Traffic distribution
- Health checks
- Load balancing algorithms
- Layer 4 vs Layer 7
- Sticky Sessions
- SSL termination
- High Availability
- AWS ALB & NLB
- Spring Boot architecture
- Real-world examples
- Best practices
Load Balancers work hand-in-hand with Auto Scaling, API Gateways, CDNs, and Kubernetes to build resilient, cloud-native applications capable of serving millions of users.