Saga Pattern in Microservices
Learn the Saga Pattern from the ground up. Understand why distributed transactions fail in microservices, choreography vs orchestration, compensating transactions, Spring Boot implementation with Kafka, event flow, rollback mechanisms, failure handling, and real-world examples from Amazon, Uber, Banking, Netflix, and e-commerce systems.
Introduction
Imagine you're building an E-Commerce Platform.
A customer places an order.
The following services participate:
- Order Service
- Payment Service
- Inventory Service
- Shipping Service
- Notification Service
In a Monolithic application,
everything happens inside a single database transaction.
BEGIN
↓
Create Order
↓
Charge Payment
↓
Reserve Inventory
↓
COMMIT
If anything fails,
the transaction rolls back.
Simple.
Now imagine a Microservices architecture.
Each service owns its own database.
Order DB
Payment DB
Inventory DB
Shipping DB
Question:
How can one transaction span multiple databases?
The answer is:
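In practice, it can't.
Distributed ACID protocols such as two-phase commit do exist, but they hold locks across network calls, depend on a coordinator that can fail, and require every participant to support the same protocol. That doesn't scale well across independent services.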
The solution is the Saga Pattern.
Learning Objectives
After completing this article, you'll understand:
- What the Saga Pattern is
- Why the Saga Pattern is needed
- Distributed Transactions
- Compensating Transactions
- Choreography Saga
- Orchestration Saga
- Kafka Integration
- Event Flow
- Failure Recovery
- Retry Strategy
- Dead Letter Queue
- Spring Boot Implementation
- Best Practices
- Real-world Examples
Why Traditional Transactions Fail
Monolith
flowchart TD
APP[Spring Boot]
DB[(Single Database)]
APP --> DB
One transaction.
One rollback.
Everything is simple.
Microservices
flowchart TD
ORDER[Order Service]
PAYMENT[Payment Service]
INVENTORY[Inventory Service]
SHIPPING[Shipping Service]
ORDERDB[(Order DB)]
PAYDB[(Payment DB)]
INVDB[(Inventory DB)]
SHIPDB[(Shipping DB)]
ORDER --> ORDERDB
PAYMENT --> PAYDB
INVENTORY --> INVDB
SHIPPING --> SHIPDB
Every service owns its own database.
No global transaction exists.
The Problem
Customer places an order.
Order Created
↓
Payment Successful
↓
Inventory Failed
Now what?
Order already exists.
Payment already completed.
Inventory reservation failed.
The system becomes inconsistent.
What is the Saga Pattern?
A Saga is a sequence of local transactions.
Each service completes its own transaction independently.
If one step fails,
previous successful steps are undone using Compensating Transactions.
Saga Workflow
flowchart LR
ORDER[Create Order]
PAYMENT[Charge Payment]
INVENTORY[Reserve Inventory]
SHIPPING[Create Shipment]
ORDER --> PAYMENT
PAYMENT --> INVENTORY
INVENTORY --> SHIPPING
Every service commits independently.
Successful Saga
sequenceDiagram
participant Customer
participant Order
participant Payment
participant Inventory
participant Shipping
Customer->>Order: Create Order
Order->>Payment: Charge Card
Payment->>Inventory: Reserve Stock
Inventory->>Shipping: Create Shipment
Shipping-->>Customer: Order Confirmed
Every service succeeds.
Saga completes.
Failed Saga
Inventory fails.
sequenceDiagram
participant Order
participant Payment
participant Inventory
Order->>Payment: Charge Card
Payment->>Inventory: Reserve Stock
Inventory-->>Payment: Reservation Failed
Payment->>Payment: Refund Payment
Payment-->>Order: Payment Refunded
Order->>Order: Cancel Order
Instead of rollback,
services execute compensating actions.
Compensating Transaction
Instead of
Rollback
Microservices execute
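Release Inventory
↓
Refund Payment
↓
Cancel Order
Compensations run in the reverse order of the original steps, and only for the steps that actually completed.
Business operations are reversed using explicit logic, not by the database.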
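The idea of explicit compensation can be sketched in plain Java. This is a minimal in-memory illustration, not a production implementation: `SagaStep`, `SagaRunner`, and `CompensationDemo` are hypothetical names introduced here, the "services" are just lambdas that write to a log, and the sketch assumes Java 17 or later for `record`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// One saga step: a local transaction plus the action that undoes its business effect.
record SagaStep(String name, Runnable action, Runnable compensation) {}

class SagaRunner {
    // Runs the steps in order. If a step throws, the steps that already
    // completed are compensated in reverse order. Returns true on full success.
    static boolean run(List<SagaStep> steps) {
        Deque<SagaStep> completed = new ArrayDeque<>();
        for (SagaStep step : steps) {
            try {
                step.action().run();
                completed.push(step); // the most recent step ends up on top
            } catch (RuntimeException e) {
                while (!completed.isEmpty()) {
                    completed.pop().compensation().run();
                }
                return false;
            }
        }
        return true;
    }
}

class CompensationDemo {
    // Simulates the failure case above: inventory reservation fails after payment.
    static List<String> runFailingOrder() {
        List<String> log = new ArrayList<>();
        List<SagaStep> steps = List.of(
            new SagaStep("order",
                () -> log.add("Order Created"),
                () -> log.add("Cancel Order")),
            new SagaStep("payment",
                () -> log.add("Payment Charged"),
                () -> log.add("Refund Payment")),
            new SagaStep("inventory",
                () -> { throw new IllegalStateException("out of stock"); },
                () -> log.add("Release Inventory")));
        boolean ok = SagaRunner.run(steps);
        log.add(ok ? "Saga Completed" : "Saga Compensated");
        return log;
    }

    public static void main(String[] args) {
        runFailingOrder().forEach(System.out::println);
    }
}
```

Because the inventory step never completed, its compensation is not invoked; only the payment and the order are undone, in that order. A real saga would also need to persist which steps completed so that compensation can resume after a crash.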
Compensation Flow
flowchart TD
ORDER[Order Created]
PAYMENT[Payment Success]
FAIL[Inventory Failed]
REFUND[Refund Payment]
CANCEL[Cancel Order]
ORDER --> PAYMENT
PAYMENT --> FAIL
FAIL --> REFUND
REFUND --> CANCEL
Two Types of Saga
There are two approaches.
- Choreography
- Orchestration
Choreography Saga
No central coordinator.
Services communicate using events.
flowchart LR
ORDER[Order]
KAFKA[(Kafka)]
PAYMENT[Payment]
INVENTORY[Inventory]
EMAIL[Notification]
ORDER --> KAFKA
KAFKA --> PAYMENT
KAFKA --> INVENTORY
KAFKA --> EMAIL
Every service reacts independently.
Choreography Example
OrderCreated
↓
PaymentCompleted
↓
InventoryReserved
↓
ShipmentCreated
Each event triggers the next step.
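The event chain above can be sketched without a broker. In the following illustration, `EventBus` is a hypothetical in-memory stand-in for Kafka topics and each subscription plays the role of one service; a real system would publish and consume through a broker client instead, and each handler would run its own local transaction before emitting the next event.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for broker topics.
class EventBus {
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();
    final List<String> history = new ArrayList<>(); // every event type published, in order

    void subscribe(String eventType, Consumer<String> handler) {
        subscribers.computeIfAbsent(eventType, k -> new ArrayList<>()).add(handler);
    }

    void publish(String eventType, String orderId) {
        history.add(eventType);
        for (Consumer<String> handler : subscribers.getOrDefault(eventType, List.of())) {
            handler.accept(orderId);
        }
    }
}

class ChoreographyDemo {
    static List<String> placeOrder(String orderId) {
        EventBus bus = new EventBus();
        // Each "service" knows only the event it reacts to and the event it emits.
        bus.subscribe("OrderCreated", id -> bus.publish("PaymentCompleted", id));      // Payment
        bus.subscribe("PaymentCompleted", id -> bus.publish("InventoryReserved", id)); // Inventory
        bus.subscribe("InventoryReserved", id -> bus.publish("ShipmentCreated", id));  // Shipping
        bus.publish("OrderCreated", orderId);                                          // Order
        return bus.history;
    }

    public static void main(String[] args) {
        placeOrder("order-1").forEach(System.out::println);
    }
}
```

No component in this sketch holds the whole workflow; the sequence only emerges from the subscriptions, which is what makes choreography loosely coupled and also harder to trace.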
Advantages
- Loose Coupling
- High Scalability
- No Central Coordinator
- Easy Event Replay
Challenges
- Difficult Debugging
- Circular Event Dependencies
- Complex Failure Tracking
- Hard to visualize the overall flow
Orchestration Saga
Uses a central coordinator.
flowchart TD
ORCHESTRATOR[Saga Orchestrator]
ORDER[Order Service]
PAYMENT[Payment Service]
INVENTORY[Inventory Service]
SHIPPING[Shipping Service]
ORCHESTRATOR --> ORDER
ORCHESTRATOR --> PAYMENT
ORCHESTRATOR --> INVENTORY
ORCHESTRATOR --> SHIPPING
The orchestrator controls every step.
Orchestration Sequence
sequenceDiagram
participant Saga
participant Order
participant Payment
participant Inventory
Saga->>Order: Create Order
Order-->>Saga: Success
Saga->>Payment: Charge Card
Payment-->>Saga: Success
Saga->>Inventory: Reserve Stock
Inventory-->>Saga: Success
Simple to understand.
Failure in Orchestration
sequenceDiagram
participant Saga
participant Payment
participant Inventory
Saga->>Payment: Charge
Payment-->>Saga: Success
Saga->>Inventory: Reserve
Inventory-->>Saga: Failed
Saga->>Payment: Refund
The orchestrator triggers compensation.
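A rough sketch of such an orchestrator is shown below. The `PaymentClient` and `InventoryClient` interfaces, the `SagaState` values, and the class names are assumptions made for illustration; in a real deployment the calls would be remote requests or commands sent through a broker, and the saga state would be persisted rather than kept in memory.

```java
import java.util.ArrayList;
import java.util.List;

enum SagaState { STARTED, PAYMENT_CHARGED, COMPLETED, COMPENSATED }

// Hypothetical service clients used only for this sketch.
interface PaymentClient {
    boolean charge(String orderId);
    void refund(String orderId);
}

interface InventoryClient {
    boolean reserve(String orderId);
}

class OrderSagaOrchestrator {
    private final PaymentClient payment;
    private final InventoryClient inventory;
    final List<SagaState> transitions = new ArrayList<>(); // visible history of the saga

    OrderSagaOrchestrator(PaymentClient payment, InventoryClient inventory) {
        this.payment = payment;
        this.inventory = inventory;
    }

    SagaState execute(String orderId) {
        moveTo(SagaState.STARTED);
        if (!payment.charge(orderId)) {
            return moveTo(SagaState.COMPENSATED); // nothing has completed, nothing to undo
        }
        moveTo(SagaState.PAYMENT_CHARGED);
        if (!inventory.reserve(orderId)) {
            payment.refund(orderId); // compensate the completed payment step
            return moveTo(SagaState.COMPENSATED);
        }
        return moveTo(SagaState.COMPLETED);
    }

    private SagaState moveTo(SagaState next) {
        transitions.add(next);
        return next;
    }
}

class OrchestrationDemo {
    // Payment succeeds, inventory fails, so the orchestrator must refund.
    static List<String> runWithInventoryFailure() {
        List<String> calls = new ArrayList<>();
        PaymentClient payment = new PaymentClient() {
            public boolean charge(String orderId) { calls.add("charge"); return true; }
            public void refund(String orderId) { calls.add("refund"); }
        };
        InventoryClient inventory = orderId -> { calls.add("reserve"); return false; };
        OrderSagaOrchestrator saga = new OrderSagaOrchestrator(payment, inventory);
        calls.add("result=" + saga.execute("order-1"));
        return calls;
    }

    public static void main(String[] args) {
        runWithInventoryFailure().forEach(System.out::println);
    }
}
```

The whole workflow, including the compensation path, lives in one `execute` method, which is the main reason orchestration is easier to read and monitor than choreography.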
Choreography vs Orchestration
| Choreography | Orchestration |
|---|---|
| Event Driven | Central Coordinator |
| Loosely Coupled | Easier Control |
| Harder to Debug | Easier Monitoring |
| Highly Scalable | Simpler Workflow |
Kafka Integration
flowchart TD
ORDER[Order Service]
TOPIC[(Kafka)]
PAYMENT[Payment]
INVENTORY[Inventory]
SHIPPING[Shipping]
ORDER --> TOPIC
TOPIC --> PAYMENT
TOPIC --> INVENTORY
TOPIC --> SHIPPING
Kafka is commonly used for Saga choreography.
Dead Letter Queue
A consumer that fails to process an event shouldn't lose it.
flowchart LR
TOPIC[(Kafka)]
CONSUMER[Consumer]
DLQ[(Dead Letter Queue)]
TOPIC --> CONSUMER
CONSUMER --> DLQ
Events that still fail after retries are parked in the Dead Letter Queue, where they can be inspected and replayed later.
Retry Strategy
flowchart TD
EVENT[Consume Event]
SUCCESS{Success?}
LEFT{Attempts left?}
RETRY[Retry with backoff]
DLQ[Move to DLQ]
EVENT --> SUCCESS
SUCCESS -->|Yes| DONE[Complete]
SUCCESS -->|No| LEFT
LEFT -->|Yes| RETRY
RETRY --> EVENT
LEFT -->|No| DLQ
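A bounded retry loop with a dead-letter fallback might look like the sketch below. `RetryingConsumer` is a hypothetical class written for this article, the `deadLetters` list stands in for a real DLQ topic, and the backoff delay between attempts is deliberately left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class RetryingConsumer {
    private final int maxAttempts;
    private final Consumer<String> handler;
    final List<String> deadLetters = new ArrayList<>(); // stand-in for a DLQ topic

    RetryingConsumer(int maxAttempts, Consumer<String> handler) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        this.maxAttempts = maxAttempts;
        this.handler = handler;
    }

    // Tries the handler up to maxAttempts times and returns the number of attempts made.
    // If every attempt fails, the event is moved to the dead-letter list.
    int consume(String event) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                handler.accept(event);
                return attempt; // processed successfully
            } catch (RuntimeException e) {
                // A real consumer would wait (for example with exponential backoff)
                // before the next attempt.
            }
        }
        deadLetters.add(event);
        return maxAttempts;
    }
}
```

For example, `new RetryingConsumer(3, handler).consume("PaymentCompleted")` calls the handler at most three times; only an event that fails all three attempts ends up in `deadLetters`. Because a retried handler may have partially run before it failed, retries only work safely when the handler is idempotent.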
Idempotency
Events may be delivered more than once.
Consumers should process duplicate events safely.
Example
Instead of
Increase Balance
Use
Process Transaction ID
The consumer records every transaction ID it has applied and ignores any ID it has already seen.
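As a small sketch of this idea, the hypothetical `AccountConsumer` below keeps the set of processed transaction IDs in memory. In a real service the processed IDs would normally be stored in the same database transaction as the balance update, so that the check and the update cannot drift apart.

```java
import java.util.HashSet;
import java.util.Set;

class AccountConsumer {
    private final Set<String> processedTransactionIds = new HashSet<>();
    private long balanceCents;

    // Applies the credit only the first time a transaction ID is seen.
    // Returns true if the credit was applied, false for a duplicate delivery.
    boolean applyCredit(String transactionId, long amountCents) {
        if (!processedTransactionIds.add(transactionId)) {
            return false; // Set.add returns false when the ID was already present
        }
        balanceCents += amountCents;
        return true;
    }

    long balanceCents() {
        return balanceCents;
    }
}
```

Delivering the same event twice, for example `applyCredit("tx-1", 100)` followed by `applyCredit("tx-1", 100)`, leaves the balance at 100 rather than 200.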
Spring Boot Architecture
flowchart TD
CLIENT[React]
ORDER[Order Service]
KAFKA[(Kafka)]
PAYMENT[Payment Service]
INVENTORY[Inventory Service]
SHIPPING[Shipping Service]
CLIENT --> ORDER
ORDER --> KAFKA
KAFKA --> PAYMENT
KAFKA --> INVENTORY
KAFKA --> SHIPPING
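Spring Boot services commonly build this with:
- Spring Kafka
- Spring Cloud Stream
- Spring application events (in-process only, not delivered across services)
- The Outbox Pattern for reliable event publishing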
Outbox Pattern
To avoid losing events,
write the business data and event into the same local database transaction.
flowchart LR
SERVICE[Order Service]
ORDERDB[(Orders)]
OUTBOX[(Outbox Table)]
KAFKA[(Kafka)]
SERVICE --> ORDERDB
SERVICE --> OUTBOX
OUTBOX --> KAFKA
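A background process (a poller or a change-data-capture connector) reads the Outbox table and publishes its events to Kafka, removing or marking each row once it has been published.
Delivery is at-least-once, so consumers must still be idempotent.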
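The mechanics can be sketched with in-memory lists standing in for the two tables. Everything here is illustrative: `OrderStore` and `OutboxRow` are hypothetical names, the `synchronized` methods only imitate what a single local database transaction would guarantee, and the sketch assumes Java 17 or later for `record`.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

record OutboxRow(long id, String eventType, String payload) {}

class OrderStore {
    final List<String> orders = new ArrayList<>();    // stand-in for the orders table
    final List<OutboxRow> outbox = new ArrayList<>(); // stand-in for the outbox table
    private long nextId = 1;

    // Business data and event are written together. With a real database,
    // both inserts would share one local transaction.
    synchronized void createOrder(String orderId) {
        orders.add(orderId);
        outbox.add(new OutboxRow(nextId++, "OrderCreated", orderId));
    }

    // Relay: passes pending rows to the publisher and deletes a row only after
    // the publisher returns normally. If publishing throws, the row stays in
    // the outbox and is picked up by the next run. Returns the number published.
    synchronized int relay(Consumer<OutboxRow> publisher) {
        int published = 0;
        Iterator<OutboxRow> pending = outbox.iterator();
        while (pending.hasNext()) {
            OutboxRow row = pending.next();
            publisher.accept(row);
            pending.remove();
            published++;
        }
        return published;
    }
}
```

Since a row is deleted only after a successful publish, a crash between publishing and deleting causes the event to be sent again on the next run. That is the at-least-once behavior that makes idempotent consumers necessary.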
Banking Example
Money Transfer
Debit Account
↓
Credit Account
↓
Notify Customer
↓
Update Ledger
If the credit fails,
a compensating transaction credits the debited amount back to the source account.
Amazon Example
Order placement triggers
- Payment
- Inventory
- Shipping
Failures result in payment refunds and order cancellation rather than database rollback.
Uber Example
Ride Booking
Authorize Payment
↓
Reserve Driver
↓
Start Trip
If driver assignment fails,
the payment authorization is released.
Netflix Example
Subscription upgrade
Payment
↓
Subscription Update
↓
Email
↓
Analytics
Failures trigger compensation rather than distributed rollbacks.
Advantages
- No Distributed ACID Transactions
- High Scalability
- Independent Services
- Better Fault Isolation
- Supports Event-Driven Systems
- Cloud Native Friendly
Challenges
- Complex Compensation Logic
- Eventual Consistency
- Duplicate Events
- Ordering Issues
- Monitoring
- Debugging
Monitoring
Monitor
- Saga Success Rate
- Compensation Count
- Retry Count
- DLQ Messages
- Kafka Consumer Lag
- Event Processing Time
- Service Latency
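The first two metrics in this list can be derived from two counters. The sketch below uses a hypothetical `SagaMetrics` class with plain atomic counters; in a Spring Boot service the same values would more likely be exposed through a metrics library so that a tool such as Prometheus can scrape them.

```java
import java.util.concurrent.atomic.AtomicLong;

class SagaMetrics {
    private final AtomicLong completed = new AtomicLong();
    private final AtomicLong compensated = new AtomicLong();

    void recordCompleted() {
        completed.incrementAndGet();
    }

    void recordCompensated() {
        compensated.incrementAndGet();
    }

    long compensationCount() {
        return compensated.get();
    }

    // Fraction of finished sagas that completed without compensation.
    // Returns 0.0 when no saga has finished yet.
    double successRate() {
        long done = completed.get();
        long total = done + compensated.get();
        return total == 0 ? 0.0 : (double) done / total;
    }
}
```

An orchestrator would call `recordCompleted()` or `recordCompensated()` when a saga reaches its final state; a falling success rate or a rising compensation count is usually the earliest sign that one participant is failing.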
Tools
- Prometheus
- Grafana
- Datadog
- Jaeger
- Zipkin
- Kafka UI
Common Mistakes
❌ Treating Saga as a distributed ACID transaction
❌ Missing compensating transactions
❌ No idempotency
❌ Ignoring retries
❌ No Dead Letter Queue
❌ Tight coupling between services
Best Practices
- Keep each local transaction small and independent.
- Design compensating transactions before implementation.
- Make all event consumers idempotent.
- Use the Outbox Pattern for reliable event publishing.
- Monitor every saga instance end-to-end.
- Prefer choreography for loosely coupled systems and orchestration for workflows that require centralized visibility.
- Document compensation logic as part of the business process.
Saga vs Two-Phase Commit (2PC)
| Saga Pattern | Two-Phase Commit |
|---|---|
| Local Transactions | Global Transaction |
| Eventual Consistency | Strong Consistency |
| Highly Scalable | Lower Scalability |
| Compensation Based | Rollback Based |
| Cloud Native | Traditional Enterprise Systems |
Common Interview Questions
What is the Saga Pattern?
Saga is a distributed transaction pattern where a business process is broken into multiple local transactions coordinated through events or an orchestrator, with compensating transactions used to recover from failures.
Why can't we use ACID transactions across microservices?
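Each microservice owns its own database, so there is no single transaction to commit or roll back. Protocols such as two-phase commit can span databases, but they hold locks across network calls, depend on a coordinator that can fail, and require every participant to support the protocol, which limits availability and scalability.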
What is a Compensating Transaction?
A compensating transaction reverses the business effects of a previously completed local transaction, such as refunding a payment or releasing reserved inventory.
What is the difference between Choreography and Orchestration?
| Choreography | Orchestration |
|---|---|
| Event-based coordination | Central coordinator |
| Decentralized | Centralized |
| More flexible | Easier to manage |
| Harder to trace | Easier to monitor |
When should the Saga Pattern be used?
Use Saga for long-running business workflows involving multiple microservices, such as:
- Order Processing
- Banking Transfers
- Travel Booking
- Ride Booking
- Insurance Claims
Summary
The Saga Pattern is one of the most important architectural patterns for distributed systems. It replaces traditional distributed ACID transactions with a sequence of local transactions and compensating actions, enabling scalable and resilient microservices.
In this article, we covered:
- Saga fundamentals
- Distributed transaction challenges
- Compensating transactions
- Choreography
- Orchestration
- Kafka integration
- Retry strategies
- Dead Letter Queues
- Outbox Pattern
- Spring Boot implementation
- Banking, Amazon, Uber, and Netflix examples
- Monitoring
- Best practices
The Saga Pattern, together with Event-Driven Architecture, CQRS, Outbox Pattern, and Idempotent Consumers, forms the foundation of modern cloud-native enterprise applications that need to coordinate complex business workflows without sacrificing scalability.