Apache Kafka Architecture
Learn Apache Kafka Architecture from the ground up. Understand producers, consumers, brokers, topics, partitions, offsets, consumer groups, replication, leaders, followers, ZooKeeper, KRaft, Spring Boot integration, exactly-once processing, and real-world architectures used by Netflix, Uber, LinkedIn, Banking, and Amazon.
Introduction
Imagine you're building an E-Commerce Platform.
Every second thousands of events occur.
- Customer Registration
- Product Search
- Order Placement
- Payment Completed
- Inventory Updated
- Shipment Created
- Email Sent
Traditional REST communication looks like this.
Order Service
↓
Payment Service
↓
Inventory Service
↓
Notification Service
↓
Analytics Service
Problems
- Tight Coupling
- High Latency
- Service Dependencies
- Difficult Scaling
- Cascading Failures
Modern systems solve this using Apache Kafka.
Instead of calling each other directly, services communicate through events.
Learning Objectives
By the end of this article you'll understand
- What is Kafka?
- Why Kafka?
- Kafka Architecture
- Producer
- Broker
- Topic
- Partition
- Offset
- Consumer
- Consumer Group
- Leader & Follower
- Replication
- ISR
- ZooKeeper
- KRaft
- Spring Boot Integration
- Kafka Delivery Guarantees
- Real-world Examples
What is Apache Kafka?
Apache Kafka is a Distributed Event Streaming Platform used to publish, store, process, and consume millions of events per second.
Kafka acts like a highly scalable event backbone between microservices.
Why Kafka?
Suppose an order is created.
Without Kafka
flowchart LR
ORDER[Order Service]
PAYMENT[Payment]
INVENTORY[Inventory]
EMAIL[Notification]
ANALYTICS[Analytics]
ORDER --> PAYMENT
ORDER --> INVENTORY
ORDER --> EMAIL
ORDER --> ANALYTICS
Order Service depends on every downstream service.
With Kafka
flowchart LR
ORDER[Order Service]
KAFKA[(Kafka)]
PAYMENT[Payment]
INVENTORY[Inventory]
EMAIL[Notification]
ANALYTICS[Analytics]
ORDER --> KAFKA
KAFKA --> PAYMENT
KAFKA --> INVENTORY
KAFKA --> EMAIL
KAFKA --> ANALYTICS
Order Service publishes one event.
Consumers process independently.
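The fan-out above can be sketched in plain Java without a broker. This is only an in-memory illustration, not the Kafka client API: the class `FanOutTopic` and the group names are hypothetical. It shows the one property that matters here: the producer appends a single event, and each consumer group keeps its own read position.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory stand-in for a single topic; NOT the Kafka client API. */
final class FanOutTopic {
    private final List<String> log = new ArrayList<>();             // append-only event log
    private final Map<String, Integer> positions = new HashMap<>(); // next index per consumer group

    /** The producer appends one event and knows nothing about who reads it. */
    void publish(String event) {
        log.add(event);
    }

    /** Returns events this group has not seen yet and advances only that group's position. */
    List<String> poll(String group) {
        int from = positions.getOrDefault(group, 0);
        List<String> batch = new ArrayList<>(log.subList(from, log.size()));
        positions.put(group, log.size());
        return batch;
    }

    public static void main(String[] args) {
        FanOutTopic orders = new FanOutTopic();
        orders.publish("OrderCreated:42");
        // Each downstream service reads the same event on its own schedule.
        System.out.println("payment   -> " + orders.poll("payment"));
        System.out.println("inventory -> " + orders.poll("inventory"));
        System.out.println("payment   -> " + orders.poll("payment")); // nothing new for this group
    }
}
```

Adding a new downstream service in this model means adding a new group name; the publishing side does not change.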
High-Level Kafka Architecture
flowchart LR
P[Producer]
K[(Kafka Cluster)]
C1[Consumer 1]
C2[Consumer 2]
C3[Consumer 3]
P --> K
K --> C1
K --> C2
K --> C3
Kafka Components
Kafka consists of
- Producer
- Broker
- Topic
- Partition
- Consumer
- Consumer Group
- ZooKeeper (Old)
- KRaft (New)
Producer
A Producer is a client application that publishes messages (records) to Kafka topics.
Examples
- Order Service
- Payment Service
- User Service
flowchart LR
APP[Order Service]
TOPIC[(Orders Topic)]
APP --> TOPIC
Producer Example
kafkaTemplate.send(
    "orders",    // topic
    orderId,     // key: decides the partition
    orderEvent   // value: the event payload
);
Broker
A Broker is a Kafka server that stores messages on disk and serves producer and consumer requests.
Kafka Cluster
flowchart LR
B1[(Broker 1)]
B2[(Broker 2)]
B3[(Broker 3)]
Most production systems have multiple brokers.
Topic
A Topic is a logical stream of events.
Examples
orders
payments
customers
inventory
notifications
Topic Architecture
flowchart TD
TOPIC[(Orders Topic)]
P1[Partition 0]
P2[Partition 1]
P3[Partition 2]
TOPIC --> P1
TOPIC --> P2
TOPIC --> P3
Why Partitions?
Suppose 1 million messages arrive.
One partition
↓
One consumer
↓
Slow processing
Multiple partitions
↓
Multiple consumers
↓
Parallel processing
Partition Example
flowchart LR
PRODUCER["Producer"]
TOPIC["Kafka Topic"]
PART0["Partition 0"]
PART1["Partition 1"]
PART2["Partition 2"]
PRODUCER --> TOPIC
TOPIC --> PART0
TOPIC --> PART1
TOPIC --> PART2
In practice, the producer writes each record directly to one partition of the topic:
flowchart LR
P[Producer]
A[Partition 0]
B[Partition 1]
C[Partition 2]
P --> A
P --> B
P --> C
Ordering
Kafka guarantees ordering only within a single partition, not across the partitions of a topic.
Example
Order Created
↓
Payment Completed
↓
Shipment Created
All messages for the same Order ID should go to the same partition.
Message Key
Producer
kafkaTemplate.send(
"orders",
orderId,
event
);
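Kafka's default partitioner hashes the serialized key (here, the Order ID) and takes the result modulo the number of partitions. Records with the same key therefore land in the same partition, as long as the partition count does not change.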
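A minimal sketch of key-based partitioning follows. It is an illustration under a stated assumption: Kafka's default partitioner uses a murmur2 hash, while this example substitutes `Arrays.hashCode` to stay short, so the partition numbers it prints will not match a real cluster. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Simplified key-to-partition mapping; Kafka's real default partitioner uses murmur2, not this hash. */
final class KeyPartitionSketch {
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8); // keys are hashed in serialized form
        int hash = Arrays.hashCode(keyBytes);                   // stand-in hash (assumption)
        return (hash & 0x7fffffff) % numPartitions;             // clear the sign bit, then modulo
    }

    public static void main(String[] args) {
        // The repeated key maps to the same partition both times.
        for (String orderId : new String[] {"order-1", "order-2", "order-1"}) {
            System.out.println(orderId + " -> partition " + partitionFor(orderId, 3));
        }
    }
}
```

Because the result depends on `numPartitions`, adding partitions to an existing topic can move a key to a different partition, which is one reason to choose the partition count carefully up front.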
Offset
Every message in a partition is assigned a sequential Offset, starting at 0.
Offset 0
Offset 1
Offset 2
Offset 3
Offset 4
Offsets uniquely identify messages within a partition.
Offset Diagram
flowchart LR
O0[0]
O1[1]
O2[2]
O3[3]
O4[4]
O0 --> O1 --> O2 --> O3 --> O4
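One way to picture offsets is as positions in an append-only list. The sketch below is a toy model of a single partition, not Kafka's storage format; `PartitionLogSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy append-only log for one partition; a hypothetical class, not part of Kafka. */
final class PartitionLogSketch {
    private final List<String> records = new ArrayList<>();

    /** Appends a record and returns the offset assigned to it. */
    long append(String record) {
        records.add(record);
        return records.size() - 1; // offsets start at 0 and grow by 1
    }

    /** Reads the record stored at the given offset; existing records are never modified. */
    String read(long offset) {
        return records.get((int) offset);
    }

    /** Offset that the next appended record will receive (the log end offset). */
    long endOffset() {
        return records.size();
    }

    public static void main(String[] args) {
        PartitionLogSketch log = new PartitionLogSketch();
        System.out.println("OrderCreated stored at offset " + log.append("OrderCreated"));
        System.out.println("PaymentCompleted stored at offset " + log.append("PaymentCompleted"));
        System.out.println("offset 1 holds " + log.read(1));
    }
}
```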
Consumer
Consumers read events.
Example
flowchart LR
TOPIC[(Orders)]
CONSUMER[Inventory Service]
TOPIC --> CONSUMER
Consumer Group
Multiple consumers share work.
flowchart LR
TOPIC[(Orders)]
C1[Consumer 1]
C2[Consumer 2]
C3[Consumer 3]
TOPIC --> C1
TOPIC --> C2
TOPIC --> C3
Each partition is assigned to only one consumer within the same group.
Consumer Group Architecture
flowchart TD
TOPIC[(Orders)]
P0[Partition 0]
P1[Partition 1]
P2[Partition 2]
C1[Consumer A]
C2[Consumer B]
TOPIC --> P0
TOPIC --> P1
TOPIC --> P2
P0 --> C1
P1 --> C2
P2 --> C1
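The assignment in the diagram can be reproduced with a simple round-robin deal. This is a sketch only: Kafka ships several assignment strategies, and this example does not claim to match any of them exactly. `GroupAssignmentSketch` and `assign` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Round-robin partition assignment; an illustration, not one of Kafka's built-in assignors. */
final class GroupAssignmentSketch {
    /** Deals partitions 0..n-1 to consumers in turn, so each partition has exactly one owner. */
    static Map<String, List<Integer>> assign(int numPartitions, List<String> consumers) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        for (int partition = 0; partition < numPartitions; partition++) {
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Three partitions and two consumers, as in the diagram.
        System.out.println(assign(3, List.of("Consumer A", "Consumer B")));
        // With four consumers and three partitions, one consumer receives nothing.
        System.out.println(assign(3, List.of("A", "B", "C", "D")));
    }
}
```

The second call illustrates a practical limit: consumers beyond the partition count sit idle, so the partition count caps the parallelism of a group.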
Replication
Kafka replicates partitions.
flowchart LR
Leader[(Leader)]
Follower1[(Follower)]
Follower2[(Follower)]
Leader --> Follower1
Leader --> Follower2
Replication improves availability.
Leader and Followers
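Each partition has
- One Leader
- Zero or more Followers (replication factor minus one)
By default, the Leader handles
- Reads
- Writes
Followers replicate data from the Leader. Since Kafka 2.4, consumers can optionally be configured to fetch from the closest follower replica.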
ISR (In-Sync Replicas)
ISR means
The set of replicas (including the Leader) that have caught up with the Leader's log within a configured time limit (replica.lag.time.max.ms).
If the Leader crashes, Kafka elects a new Leader from the ISR.
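The failover rule can be sketched as a small selection function. This is a simplification under stated assumptions: brokers are plain integer IDs, and the candidate is the first assigned replica that is both in the ISR and still alive. `IsrElectionSketch` and `electLeader` are hypothetical names, not Kafka controller code.

```java
import java.util.List;
import java.util.Optional;

/** Simplified leader selection; real election is performed by the Kafka controller. */
final class IsrElectionSketch {
    /**
     * Picks the first replica, in assignment order, that is both in the ISR and alive.
     * Returns empty if no in-sync replica survives, in which case the partition stays offline.
     */
    static Optional<Integer> electLeader(List<Integer> replicas, List<Integer> isr, List<Integer> liveBrokers) {
        return replicas.stream()
                .filter(isr::contains)
                .filter(liveBrokers::contains)
                .findFirst();
    }

    public static void main(String[] args) {
        List<Integer> replicas = List.of(1, 2, 3); // broker 1 was the leader
        List<Integer> isr = List.of(1, 3);         // broker 2 had fallen behind
        List<Integer> live = List.of(2, 3);        // broker 1 has crashed
        System.out.println("new leader: " + electLeader(replicas, isr, live));
    }
}
```

Broker 2 is alive in this run but is skipped because it is not in the ISR; promoting it could lose records that the old leader had already acknowledged.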
Replication Flow
sequenceDiagram
participant Producer
participant Leader
participant Follower
Producer->>Leader: Publish
Leader->>Follower: Replicate
Follower-->>Leader: ACK
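Leader-->>Producer: Success
With acks=all, the Leader confirms the write only after every in-sync replica has the record. In practice, Followers pull data from the Leader with fetch requests; the diagram shows this as a push for simplicity.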
Leader Failure
flowchart TD
Leader[(Leader)]
Follower1[(Follower)]
Follower2[(Follower)]
Leader -. Crash .-> Follower1
Follower1 --> NewLeader[(New Leader)]
Kafka automatically elects a new leader.
ZooKeeper (Legacy)
Older Kafka versions used ZooKeeper for
- Broker Registration
- Leader Election
- Metadata
- Cluster Coordination
KRaft Mode
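Modern Kafka replaces ZooKeeper with KRaft, a built-in Raft-based controller quorum that stores cluster metadata inside Kafka itself. KRaft has been production-ready since Kafka 3.3, and Kafka 4.0 removed ZooKeeper mode entirely.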
flowchart LR
Controller[KRaft Controller]
Broker1
Broker2
Broker3
Controller --> Broker1
Controller --> Broker2
Controller --> Broker3
Benefits
- Simpler deployment
- Better scalability
- Lower operational complexity
Message Flow
sequenceDiagram
participant Producer
participant Kafka
participant Consumer
Producer->>Kafka: Publish Event
Kafka-->>Consumer: Deliver Event
Consumer-->>Kafka: Commit Offset
Offset Commit
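After processing the message at offset 1050, the consumer commits
Offset = 1051
The committed offset is the position of the next message to read, so after a restart processing resumes from
Offset 1051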
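Kafka's convention is that a committed offset names the next record to read, which is the last processed offset plus one. A minimal sketch of that bookkeeping, using a hypothetical `OffsetTracker` class rather than the real consumer API:

```java
/** Tracks the committed position for one partition; hypothetical helper, not the Kafka consumer. */
final class OffsetTracker {
    private long committedOffset = 0; // position of the NEXT record to read

    /** Call once the record at processedOffset has been handled successfully. */
    void commitAfterProcessing(long processedOffset) {
        committedOffset = processedOffset + 1;
    }

    /** Where a restarted consumer would resume reading. */
    long resumePosition() {
        return committedOffset;
    }

    public static void main(String[] args) {
        OffsetTracker tracker = new OffsetTracker();
        tracker.commitAfterProcessing(1050);
        System.out.println("resume from offset " + tracker.resumePosition());
    }
}
```

Committing only after processing is what gives at-least-once behavior: a crash between processing and commit causes the record to be read again rather than skipped.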
Delivery Guarantees
Kafka supports
| Mode | Description |
|---|---|
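| At Most Once | Offsets committed before processing; messages may be lost but are never redelivered |
| At Least Once | Offsets committed after processing; the usual default, duplicates possible |
| Exactly Once | Transactions + Idempotent Producer; applies to Kafka-to-Kafka read-process-write flows |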
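The modes in the table map onto a handful of client settings. The sketch below only builds `java.util.Properties` objects, so it runs without a Kafka dependency. The keys are standard Kafka client configuration names, while the class and method names are hypothetical. It deliberately omits `bootstrap.servers` and the serializers, which a real client also needs.

```java
import java.util.Properties;

/** Builds illustrative client settings; values are a starting point, not tuned recommendations. */
final class DeliveryConfigSketch {
    /** Producer settings commonly used for at-least-once delivery without producer-side duplicates. */
    static Properties reliableProducer() {
        Properties props = new Properties();
        props.setProperty("acks", "all");                // wait for all in-sync replicas
        props.setProperty("enable.idempotence", "true"); // broker discards duplicate retries
        return props;
    }

    /** Adds a transactional id, which a producer needs before it can use transactions. */
    static Properties transactionalProducer(String transactionalId) {
        Properties props = reliableProducer();
        props.setProperty("transactional.id", transactionalId);
        return props;
    }

    /** Consumer settings: commit offsets manually and skip records from aborted transactions. */
    static Properties carefulConsumer(String groupId) {
        Properties props = new Properties();
        props.setProperty("group.id", groupId);
        props.setProperty("enable.auto.commit", "false");
        props.setProperty("isolation.level", "read_committed");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(transactionalProducer("order-service-tx-1"));
        System.out.println(carefulConsumer("inventory-service"));
    }
}
```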
Spring Boot Architecture
flowchart TD
CLIENT[React]
ORDER[Order Service]
KAFKA[(Kafka)]
PAYMENT[Payment Service]
EMAIL[Notification]
ANALYTICS[Analytics]
CLIENT --> ORDER
ORDER --> KAFKA
KAFKA --> PAYMENT
KAFKA --> EMAIL
KAFKA --> ANALYTICS
Spring Boot Producer
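import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class OrderProducer {

    private final KafkaTemplate<String, OrderEvent> kafkaTemplate;

    // Constructor injection; Spring supplies the configured KafkaTemplate.
    public OrderProducer(KafkaTemplate<String, OrderEvent> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    // Keyed by order ID so all events for one order go to the same partition.
    public void publish(OrderEvent event) {
        kafkaTemplate.send("orders", event.getOrderId(), event);
    }
}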
Spring Boot Consumer
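// Requires org.springframework.kafka.annotation.KafkaListener; declare inside a Spring bean.
@KafkaListener(topics = "orders", groupId = "inventory-service") // groupId is an example value
public void consume(OrderEvent event) {
    System.out.println(event);
}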
Banking Example
Money Transfer
↓
Publish
TransferCompleted
Consumers
- Notification
- Fraud Detection
- Analytics
- Audit
Each processes independently.
Netflix Example
Movie Started
↓
Kafka
Consumers
- Recommendation
- Analytics
- Trending
- Billing
Each processes independently.
Millions of events are processed every second.
Uber Example
Ride Completed
↓
Kafka
Consumers
- Payment
- Receipt
- Driver Earnings
- Analytics
Each processes independently.
Amazon Example
Order Created
↓
Kafka
Consumers
- Inventory
- Shipping
- Recommendations
- Notifications
Each processes independently.
Advantages
- High Throughput
- Horizontal Scalability
- Durable Storage
- Event Replay
- Fault Tolerance
- Loose Coupling
- High Availability
Challenges
- Consumer Lag
- Duplicate Messages
- Event Ordering
- Schema Evolution
- Operational Complexity
- Monitoring
Monitoring
Monitor
- Consumer Lag
- Broker Health
- Partition Count
- ISR Count
- Replication Lag
- Producer Latency
- Consumer Throughput
Tools
- Kafka UI
- Prometheus
- Grafana
- Datadog
- Confluent Control Center
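Consumer lag, the first metric in the list, is the gap between the newest offset in a partition and the group's committed offset. A small sketch of that arithmetic, with hypothetical names and hard-coded numbers standing in for values a monitoring tool would fetch from the cluster:

```java
import java.util.Map;

/** Computes consumer lag from offsets; a sketch of the arithmetic, not a monitoring client. */
final class ConsumerLagSketch {
    /** Lag for one partition: records written but not yet committed as processed. */
    static long lag(long logEndOffset, long committedOffset) {
        return Math.max(0, logEndOffset - committedOffset);
    }

    /** Sums per-partition lag; both maps are keyed by partition number. */
    static long totalLag(Map<Integer, Long> logEndOffsets, Map<Integer, Long> committedOffsets) {
        long total = 0;
        for (Map.Entry<Integer, Long> entry : logEndOffsets.entrySet()) {
            // A partition with no commit yet is treated as starting from offset 0 (assumption).
            long committed = committedOffsets.getOrDefault(entry.getKey(), 0L);
            total += lag(entry.getValue(), committed);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Integer, Long> endOffsets = Map.of(0, 1200L, 1, 800L);
        Map<Integer, Long> committed = Map.of(0, 1051L, 1, 800L);
        System.out.println("total lag: " + totalLag(endOffsets, committed));
    }
}
```

Lag that keeps growing over time usually means consumers cannot keep up with producers, which is the signal worth alerting on rather than any single snapshot value.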
Common Mistakes
❌ Too many partitions
❌ Very large messages
❌ No message keys
❌ Ignoring idempotency
❌ Unlimited retention
❌ Not monitoring consumer lag
Best Practices
- Use message keys to preserve ordering.
- Keep events immutable.
- Design consumers to be idempotent.
- Monitor consumer lag and ISR health.
- Use KRaft for new Kafka deployments.
- Use Schema Registry for event versioning.
- Tune partition count based on expected throughput.
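The idempotent-consumer practice can be sketched as a check against already-seen event IDs. This assumes every event carries a unique ID, which the article does not specify. `IdempotentHandlerSketch` is a hypothetical name, and the in-memory set stands in for a durable store that a real service would need.

```java
import java.util.HashSet;
import java.util.Set;

/** De-duplicates by event id so redelivered events have no extra effect. */
final class IdempotentHandlerSketch {
    private final Set<String> processedEventIds = new HashSet<>(); // a real service would persist this
    private int emailsSent = 0;

    /** Applies the side effect only the first time an event id is seen; returns true if applied. */
    boolean handle(String eventId) {
        if (!processedEventIds.add(eventId)) {
            return false; // duplicate delivery: skip the side effect
        }
        emailsSent++;     // stand-in for the real side effect, such as sending an email
        return true;
    }

    int emailsSent() {
        return emailsSent;
    }

    public static void main(String[] args) {
        IdempotentHandlerSketch handler = new IdempotentHandlerSketch();
        handler.handle("evt-1");
        handler.handle("evt-1"); // redelivered after a retry
        System.out.println("emails sent: " + handler.emailsSent());
    }
}
```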
Common Interview Questions
What is Kafka?
Kafka is a distributed event streaming platform for publishing, storing, and consuming high volumes of events with high throughput and fault tolerance.
Why are partitions important?
Partitions enable horizontal scaling and parallel consumption while maintaining ordering within each partition.
What is a Consumer Group?
A Consumer Group is a set of consumers that cooperatively process messages from a topic, with each partition assigned to only one consumer in the group.
What is ISR?
ISR (In-Sync Replicas) is the set of replicas, the leader included, that are caught up with the partition leader. In-sync followers are eligible to become the new leader during failover.
What are Kafka's delivery guarantees?
- At Most Once
- At Least Once
- Exactly Once
Summary
Apache Kafka is the backbone of modern event-driven architectures. It enables producers and consumers to communicate asynchronously while providing durability, scalability, fault tolerance, and high throughput.
In this article, we covered:
- Kafka fundamentals
- Producers
- Brokers
- Topics
- Partitions
- Offsets
- Consumer Groups
- Replication
- Leader and Followers
- ISR
- ZooKeeper vs KRaft
- Spring Boot implementation
- Banking, Amazon, Uber, and Netflix examples
- Monitoring
- Best practices
Kafka is a foundational technology for Event-Driven Architecture, Saga Pattern, CQRS, Event Sourcing, and real-time data streaming. Mastering its architecture is essential for designing resilient, scalable, cloud-native enterprise systems.