Apache Kafka Architecture

Learn Apache Kafka architecture from the ground up: producers, consumers, brokers, topics, partitions, offsets, consumer groups, replication, leaders and followers, ZooKeeper, KRaft, Spring Boot integration, delivery guarantees including exactly-once processing, and example event flows modeled on banking, Netflix, Uber, and Amazon.


Introduction

Imagine you're building an E-Commerce Platform.

Every second, thousands of events occur:

  • Customer Registration
  • Product Search
  • Order Placement
  • Payment Completed
  • Inventory Updated
  • Shipment Created
  • Email Sent

Traditional REST communication looks like this.

Order Service
      ↓
Payment Service
      ↓
Inventory Service
      ↓
Notification Service
      ↓
Analytics Service

Problems

  • Tight Coupling
  • High Latency
  • Service Dependencies
  • Difficult Scaling
  • Cascading Failures

Modern systems solve this using Apache Kafka.

Instead of calling each other directly, services communicate through events.


Learning Objectives

By the end of this article you'll understand

  • What is Kafka?
  • Why Kafka?
  • Kafka Architecture
  • Producer
  • Broker
  • Topic
  • Partition
  • Offset
  • Consumer
  • Consumer Group
  • Leader & Follower
  • Replication
  • ISR
  • ZooKeeper
  • KRaft
  • Spring Boot Integration
  • Kafka Delivery Guarantees
  • Real-world Examples

What is Apache Kafka?

Apache Kafka is a distributed event streaming platform used to publish, store, process, and consume streams of events. A well-sized cluster can handle millions of events per second.

Kafka acts like a highly scalable event backbone between microservices.


Why Kafka?

Suppose an order is created.

Without Kafka

flowchart LR

ORDER[Order Service]

PAYMENT[Payment]

INVENTORY[Inventory]

EMAIL[Notification]

ANALYTICS[Analytics]

ORDER --> PAYMENT
ORDER --> INVENTORY
ORDER --> EMAIL
ORDER --> ANALYTICS

Order Service depends on every downstream service.


With Kafka

flowchart LR

ORDER[Order Service]

KAFKA[(Kafka)]

PAYMENT[Payment]

INVENTORY[Inventory]

EMAIL[Notification]

ANALYTICS[Analytics]

ORDER --> KAFKA

KAFKA --> PAYMENT
KAFKA --> INVENTORY
KAFKA --> EMAIL
KAFKA --> ANALYTICS

Order Service publishes one event.

Consumers process independently.
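The decoupling described above can be illustrated with a small in-memory sketch. This is not Kafka's API: `MiniTopic`, `append`, and `poll` are hypothetical names, and a real broker persists and replicates the log. The sketch only shows that one appended event can be read by several consumer groups, each at its own pace.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a single-partition topic (not the Kafka client API).
class MiniTopic {
    private final List<String> log = new ArrayList<>();             // append-only event log
    private final Map<String, Integer> positions = new HashMap<>(); // next index to read, per group

    // The producer appends once and does not know who will read the event.
    void append(String event) {
        log.add(event);
    }

    // Each group reads from its own position, independently of every other group.
    List<String> poll(String groupId) {
        int from = positions.getOrDefault(groupId, 0);
        List<String> batch = new ArrayList<>(log.subList(from, log.size()));
        positions.put(groupId, log.size());
        return batch;
    }

    public static void main(String[] args) {
        MiniTopic orders = new MiniTopic();
        orders.append("OrderCreated:42");
        System.out.println(orders.poll("payment"));   // payment reads the first event
        orders.append("OrderCreated:43");
        System.out.println(orders.poll("inventory")); // inventory reads both events
        System.out.println(orders.poll("payment"));   // payment reads only the new event
    }
}
```

Adding a new downstream service here means adding a new group name; the producer code does not change.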


High-Level Kafka Architecture

flowchart LR

P[Producer]

K[(Kafka Cluster)]

C1[Consumer 1]

C2[Consumer 2]

C3[Consumer 3]

P --> K

K --> C1
K --> C2
K --> C3

Kafka Components

Kafka consists of

  • Producer
  • Broker
  • Topic
  • Partition
  • Consumer
  • Consumer Group
  • ZooKeeper (Old)
  • KRaft (New)

Producer

A Producer publishes messages (records) to topics.

Examples

  • Order Service
  • Payment Service
  • User Service

flowchart LR

APP[Order Service]

TOPIC[(Orders Topic)]

APP --> TOPIC

Producer Example

kafkaTemplate.send(
    "orders",
    orderId,
    orderEvent
);

Broker

A Broker is a Kafka server that stores partition data and serves producer and consumer requests.

Kafka Cluster

flowchart LR

B1[(Broker 1)]

B2[(Broker 2)]

B3[(Broker 3)]

Most production systems have multiple brokers.


Topic

A Topic is a logical stream of events.

Examples

orders

payments

customers

inventory

notifications

Topic Architecture

flowchart TD

TOPIC[(Orders Topic)]

P1[Partition 0]

P2[Partition 1]

P3[Partition 2]

TOPIC --> P1
TOPIC --> P2
TOPIC --> P3

Why Partitions?

Suppose 1 million messages arrive.

One partition → one consumer → slow processing

Multiple partitions → multiple consumers → parallel processing


Partition Example

flowchart LR
    PRODUCER["Producer"]

    TOPIC["Kafka Topic"]

    PART0["Partition 0"]
    PART1["Partition 1"]
    PART2["Partition 2"]

    PRODUCER --> TOPIC

    TOPIC --> PART0
    TOPIC --> PART1
    TOPIC --> PART2

The same idea from the producer's point of view: each record is written directly to one partition of the topic.

flowchart LR

P[Producer]

A[Partition 0]

B[Partition 1]

C[Partition 2]

P --> A
P --> B
P --> C

Ordering

Kafka guarantees ordering only within a single partition, not across the partitions of a topic.

Example

Order Created

↓

Payment Completed

↓

Shipment Created

All messages for the same Order ID should go to the same partition.


Message Key

Producer

kafkaTemplate.send(
    "orders",
    orderId,
    event
);

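Kafka's default partitioner hashes the message key (here, the Order ID) with murmur2 and takes the result modulo the number of partitions to determine the partition.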
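The key-to-partition mapping can be sketched as "hash the key, then take it modulo the partition count". The sketch below is an assumption-laden illustration: `KeyPartitioning` and `partitionFor` are hypothetical names, and `Arrays.hashCode` is only a stand-in, since Kafka's default partitioner applies murmur2 to the serialized key.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class KeyPartitioning {
    // Stand-in hash: Kafka's default partitioner uses murmur2 over the serialized key,
    // not Arrays.hashCode. Only the "hash, then modulo" shape is the same.
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int positiveHash = Arrays.hashCode(keyBytes) & 0x7fffffff; // clear the sign bit
        return positiveHash % numPartitions;
    }

    public static void main(String[] args) {
        int first = partitionFor("order-1001", 3);
        int second = partitionFor("order-1001", 3);
        System.out.println(first == second); // same key, same partition
    }
}
```

Because the result depends on the partition count, adding partitions to a topic later can move a key to a different partition, so per-key ordering is only preserved while the count stays the same.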


Offset

Every message has an Offset.

Offset 0

Offset 1

Offset 2

Offset 3

Offset 4

Offsets uniquely identify messages within a partition.
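A partition can be pictured as an append-only list whose index is the offset. The following is a minimal sketch under that assumption; `PartitionLog` is a hypothetical class and says nothing about Kafka's on-disk storage format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of one partition as an append-only list (not Kafka's storage format).
class PartitionLog {
    private final List<String> records = new ArrayList<>();

    // Appends a record and returns the offset it was assigned.
    long append(String value) {
        records.add(value);
        return records.size() - 1;
    }

    // Reads the record stored at a given offset.
    String read(long offset) {
        return records.get((int) offset);
    }

    public static void main(String[] args) {
        PartitionLog partition0 = new PartitionLog();
        PartitionLog partition1 = new PartitionLog();
        System.out.println(partition0.append("OrderCreated"));     // 0
        System.out.println(partition0.append("PaymentCompleted")); // 1
        System.out.println(partition1.append("OrderCreated"));     // 0 again: offsets are per partition
    }
}
```

A message is therefore identified by the combination of topic, partition, and offset, not by the offset alone.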


Offset Diagram

flowchart LR

O0[0]

O1[1]

O2[2]

O3[3]

O4[4]

O0 --> O1 --> O2 --> O3 --> O4

Consumer

Consumers subscribe to topics and pull events from the brokers.

Example

flowchart LR

TOPIC[(Orders)]

CONSUMER[Inventory Service]

TOPIC --> CONSUMER

Consumer Group

Multiple consumers share work.

flowchart LR

TOPIC[(Orders)]

C1[Consumer 1]

C2[Consumer 2]

C3[Consumer 3]

TOPIC --> C1
TOPIC --> C2
TOPIC --> C3

Each partition is assigned to only one consumer within the same group.


Consumer Group Architecture

flowchart TD

TOPIC[(Orders)]

P0[Partition 0]

P1[Partition 1]

P2[Partition 2]

C1[Consumer A]

C2[Consumer B]

TOPIC --> P0
TOPIC --> P1
TOPIC --> P2

P0 --> C1
P1 --> C2
P2 --> C1
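The assignment in the diagram above can be reproduced with a simplified round-robin sketch. `GroupAssignment` and `assign` are hypothetical names; Kafka's built-in assignors (range, round-robin, sticky, cooperative sticky) handle multiple topics and rebalancing and are more involved than this.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupAssignment {
    // Simplified round-robin: partition p goes to consumer (p mod consumerCount).
    // Assumes at least one consumer in the group.
    static Map<String, List<Integer>> assign(int partitionCount, List<String> consumers) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        for (int p = 0; p < partitionCount; p++) {
            assignment.get(consumers.get(p % consumers.size())).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Matches the diagram: A gets partitions 0 and 2, B gets partition 1.
        System.out.println(assign(3, List.of("Consumer A", "Consumer B")));
        // {Consumer A=[0, 2], Consumer B=[1]}

        // With more consumers than partitions, the extra consumer stays idle.
        System.out.println(assign(3, List.of("A", "B", "C", "D")));
        // {A=[0], B=[1], C=[2], D=[]}
    }
}
```

The second call shows why the partition count caps the useful parallelism of a consumer group.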

Replication

Kafka replicates partitions.

flowchart LR

Leader[(Leader)]

Follower1[(Follower)]

Follower2[(Follower)]

Leader --> Follower1
Leader --> Follower2

Replication improves availability and durability: if a broker fails, another replica already holds the data.


Leader and Followers

Each partition has

  • One Leader
  • Multiple Followers

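By default, the Leader handles all

  • Reads
  • Writes

Followers replicate data from the Leader. Since Kafka 2.4, consumers can optionally be configured to fetch from a nearby follower instead.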


ISR (In-Sync Replicas)

The ISR is the set of replicas (the Leader included) that have caught up with the Leader's log within the configured lag limit (replica.lag.time.max.ms).

If the Leader crashes, Kafka elects a new Leader from the ISR.


Replication Flow

sequenceDiagram

participant Producer

participant Leader

participant Follower

Producer->>Leader: Publish

Leader->>Follower: Replicate

Follower-->>Leader: ACK

Leader-->>Producer: Success

Leader Failure

flowchart TD

Leader[(Leader)]

Follower1[(Follower)]

Follower2[(Follower)]

Leader -. Crash .-> Follower1

Follower1 --> NewLeader[(New Leader)]

Kafka automatically elects a new leader.
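Failover from the in-sync replica set can be sketched as follows. This is a simplified model, not the controller's actual algorithm: `PartitionReplicas` is a hypothetical class, the broker IDs are illustrative, and the sketch assumes unclean leader election is disabled (the default), so only in-sync replicas are eligible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical model of one partition's replica state (broker IDs are illustrative).
class PartitionReplicas {
    private int leader;
    private final List<Integer> isr; // in-sync replicas, leader included

    PartitionReplicas(int leader, List<Integer> isr) {
        this.leader = leader;
        this.isr = new ArrayList<>(isr);
    }

    // Removes the failed leader from the ISR and promotes the first remaining in-sync replica.
    // Returns empty if no in-sync replica is left, in which case the partition is unavailable.
    Optional<Integer> onLeaderFailure() {
        isr.remove(Integer.valueOf(leader));
        if (isr.isEmpty()) {
            return Optional.empty();
        }
        leader = isr.get(0);
        return Optional.of(leader);
    }

    public static void main(String[] args) {
        PartitionReplicas partition = new PartitionReplicas(1, List.of(1, 2, 3));
        System.out.println(partition.onLeaderFailure()); // Optional[2]
    }
}
```

Restricting the election to in-sync replicas is what keeps acknowledged writes from being lost during failover.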


ZooKeeper (Legacy)

Older Kafka versions used ZooKeeper for

  • Broker Registration
  • Controller Election (the controller then elects partition leaders)
  • Metadata
  • Cluster Coordination

KRaft Mode

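Modern Kafka removes ZooKeeper. In KRaft mode, a quorum of controller nodes stores the cluster metadata using a Raft-based protocol. KRaft has been production-ready since Kafka 3.3, and Kafka 4.0 dropped ZooKeeper mode entirely.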

flowchart LR

Controller[KRaft Controller]

Broker1

Broker2

Broker3

Controller --> Broker1
Controller --> Broker2
Controller --> Broker3

Benefits

  • Simpler deployment
  • Better scalability
  • Lower operational complexity

Message Flow

sequenceDiagram

participant Producer

participant Kafka

participant Consumer

Producer->>Kafka: Publish Event

Kafka-->>Consumer: Deliver Event

Consumer-->>Kafka: Commit Offset

Offset Commit

After processing the message at

Offset 1050

the consumer commits the position of the next message to read

Offset = 1051

If restarted, processing resumes from

Offset 1051
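In the Kafka consumer API, the committed value is the position of the next record to read, that is, the last processed offset plus one. The sketch below models only that rule; `OffsetTracker` and its methods are hypothetical names, not client classes.

```java
// Hypothetical model of a consumer group's committed offset for one partition.
class OffsetTracker {
    private long committed; // offset of the next record this group should read

    OffsetTracker(long startOffset) {
        this.committed = startOffset;
    }

    // Call after the record at 'processedOffset' has been fully handled.
    void commitAfterProcessing(long processedOffset) {
        committed = processedOffset + 1;
    }

    // Where processing resumes after a restart or rebalance.
    long resumeFrom() {
        return committed;
    }

    public static void main(String[] args) {
        OffsetTracker tracker = new OffsetTracker(0);
        tracker.commitAfterProcessing(1050);
        System.out.println(tracker.resumeFrom()); // 1051
    }
}
```

Committing only after processing means a crash between the two steps causes a redelivery rather than a lost message.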

Delivery Guarantees

Kafka supports

Mode            Description
At Most Once    No retries, possible loss
At Least Once   Default, duplicates possible
Exactly Once    Transactions + Idempotent Producer
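The modes above are largely selected through client configuration. The sketch below collects the relevant settings as plain `java.util.Properties`; the keys are standard Kafka client configuration names, while `DeliveryConfigs`, the method names, and the `localhost:9092` address are assumptions for illustration. A real client also needs serializer or deserializer settings, which are omitted here.

```java
import java.util.Properties;

class DeliveryConfigs {
    // Producer settings for retries that do not create duplicates in the log.
    static Properties idempotentProducer() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed address
        props.setProperty("acks", "all");                // wait for all in-sync replicas
        props.setProperty("enable.idempotence", "true"); // broker de-duplicates retried sends
        return props;
    }

    // Adds a transactional.id so writes can be grouped into atomic transactions.
    static Properties transactionalProducer(String transactionalId) {
        Properties props = idempotentProducer();
        props.setProperty("transactional.id", transactionalId);
        return props;
    }

    // Consumer side: read only committed transactional records and commit offsets manually.
    static Properties readCommittedConsumer(String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed address
        props.setProperty("group.id", groupId);
        props.setProperty("isolation.level", "read_committed");
        props.setProperty("enable.auto.commit", "false");
        return props;
    }
}
```

Exactly-once in this sense covers read-process-write flows that stay inside Kafka; side effects in external systems still call for idempotent consumers.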

Spring Boot Architecture

flowchart TD

CLIENT[React]

ORDER[Order Service]

KAFKA[(Kafka)]

PAYMENT[Payment Service]

EMAIL[Notification]

ANALYTICS[Analytics]

CLIENT --> ORDER

ORDER --> KAFKA

KAFKA --> PAYMENT

KAFKA --> EMAIL

KAFKA --> ANALYTICS

Spring Boot Producer

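import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class OrderProducer {

    private final KafkaTemplate<String, OrderEvent> kafkaTemplate;

    // Constructor injection keeps the dependency final and easy to test
    public OrderProducer(KafkaTemplate<String, OrderEvent> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void publish(OrderEvent event) {
        // The order ID is the record key, so all events for one order go to the same partition
        kafkaTemplate.send("orders", event.getOrderId(), event);
    }
}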

Spring Boot Consumer

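import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class OrderConsumer {

    // groupId is an illustrative name; instances sharing it split the topic's partitions
    @KafkaListener(topics = "orders", groupId = "inventory-service")
    public void consume(OrderEvent event) {
        System.out.println(event);
    }
}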

Banking Example

When a money transfer completes, the system publishes a TransferCompleted event.

Consumers

  • Notification
  • Fraud Detection
  • Analytics
  • Audit

Each processes independently.


Netflix Example

Movie Started
      ↓
Kafka
      ↓
  • Recommendation
  • Analytics
  • Trending
  • Billing

Millions of events are processed every second.


Uber Example

Ride Completed
      ↓
Kafka
      ↓
  • Payment
  • Receipt
  • Driver Earnings
  • Analytics


Amazon Example

Order Created
      ↓
Kafka
      ↓
  • Inventory
  • Shipping
  • Recommendations
  • Notifications


Advantages

  • High Throughput
  • Horizontal Scalability
  • Durable Storage
  • Event Replay
  • Fault Tolerance
  • Loose Coupling
  • High Availability

Challenges

  • Consumer Lag
  • Duplicate Messages
  • Event Ordering
  • Schema Evolution
  • Operational Complexity
  • Monitoring

Monitoring

Monitor

  • Consumer Lag
  • Broker Health
  • Partition Count
  • ISR Count
  • Replication Lag
  • Producer Latency
  • Consumer Throughput
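Consumer lag, the first metric in the list above, is the distance between the end of each partition's log and the group's committed offset. The following sketch shows the arithmetic only; `ConsumerLag` and `lagPerPartition` are hypothetical names, and in practice the offsets would come from the brokers or a monitoring tool.

```java
import java.util.HashMap;
import java.util.Map;

class ConsumerLag {
    // Lag per partition = log end offset (next offset to be written) - committed offset.
    static Map<Integer, Long> lagPerPartition(Map<Integer, Long> logEndOffsets,
                                              Map<Integer, Long> committedOffsets) {
        Map<Integer, Long> lag = new HashMap<>();
        for (Map.Entry<Integer, Long> entry : logEndOffsets.entrySet()) {
            // Assumption: a partition with no committed offset is treated as unread from offset 0.
            long committed = committedOffsets.getOrDefault(entry.getKey(), 0L);
            lag.put(entry.getKey(), Math.max(0L, entry.getValue() - committed));
        }
        return lag;
    }

    public static void main(String[] args) {
        Map<Integer, Long> logEnd = Map.of(0, 1200L, 1, 900L);
        Map<Integer, Long> committed = Map.of(0, 1051L, 1, 900L);
        System.out.println(lagPerPartition(logEnd, committed).get(0)); // 149
    }
}
```

A lag that keeps growing over time usually indicates that consumers cannot keep up with the producers.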

Tools

  • Kafka UI
  • Prometheus
  • Grafana
  • Datadog
  • Confluent Control Center

Common Mistakes

❌ Too many partitions

❌ Very large messages

❌ No message keys

❌ Ignoring idempotency

❌ Unlimited retention

❌ Not monitoring consumer lag


Best Practices

  • Use message keys to preserve ordering.
  • Keep events immutable.
  • Design consumers to be idempotent.
  • Monitor consumer lag and ISR health.
  • Use KRaft for new Kafka deployments.
  • Use Schema Registry for event versioning.
  • Tune partition count based on expected throughput.
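The advice to design idempotent consumers can be sketched by remembering which event IDs were already applied. This is an illustrative assumption, not a prescribed design: `IdempotentInventoryHandler` is a hypothetical class, the event ID is assumed to be unique per event, and the in-memory set stands in for a durable store such as a database table.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical handler that applies each event at most once, keyed by a unique event ID.
class IdempotentInventoryHandler {
    private final Set<String> processedEventIds = new HashSet<>(); // in production: a durable store
    private int stock;

    IdempotentInventoryHandler(int initialStock) {
        this.stock = initialStock;
    }

    // Returns true if the event was applied, false if it was a duplicate delivery.
    boolean handleOrderCreated(String eventId, int quantity) {
        if (!processedEventIds.add(eventId)) {
            return false; // already seen: skip the side effect
        }
        stock -= quantity;
        return true;
    }

    int stock() {
        return stock;
    }

    public static void main(String[] args) {
        IdempotentInventoryHandler handler = new IdempotentInventoryHandler(10);
        handler.handleOrderCreated("evt-1", 2);
        handler.handleOrderCreated("evt-1", 2); // redelivery after a retry or rebalance
        System.out.println(handler.stock());    // 8
    }
}
```

With at-least-once delivery, this kind of check is what keeps a redelivered event from being applied twice.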

Common Interview Questions

What is Kafka?

Kafka is a distributed event streaming platform for publishing, storing, and consuming high volumes of events with high throughput and fault tolerance.


Why are partitions important?

Partitions enable horizontal scaling and parallel consumption while maintaining ordering within each partition.


What is a Consumer Group?

A Consumer Group is a set of consumers that cooperatively process messages from a topic, with each partition assigned to only one consumer in the group.


What is ISR?

ISR (In-Sync Replicas) is the set of replicas, including the leader, that have kept up with the partition leader's log. The followers in this set are eligible to become the new leader during failover.


What are Kafka's delivery guarantees?

  • At Most Once
  • At Least Once
  • Exactly Once

Summary

Apache Kafka is a common backbone of modern event-driven architectures. It enables producers and consumers to communicate asynchronously while providing durability, scalability, fault tolerance, and high throughput.

In this article, we covered:

  • Kafka fundamentals
  • Producers
  • Brokers
  • Topics
  • Partitions
  • Offsets
  • Consumer Groups
  • Replication
  • Leader and Followers
  • ISR
  • ZooKeeper vs KRaft
  • Spring Boot implementation
  • Banking, Amazon, Uber, and Netflix examples
  • Monitoring
  • Best practices

Kafka is a foundational technology for Event-Driven Architecture, Saga Pattern, CQRS, Event Sourcing, and real-time data streaming. Mastering its architecture is essential for designing resilient, scalable, cloud-native enterprise systems.

