Rate Limiting APIs with Bucket4j and Redis
A step-by-step guide to implementing distributed API rate limiting in Spring Boot using Bucket4j and Redis.
Introduction
Rate limiting protects APIs from excessive traffic, brute-force attacks, accidental retries, and unfair resource usage.
In enterprise applications, rate limiting is commonly used for:
- Login APIs
- OTP APIs
- Payment APIs
- Search APIs
- Public APIs
- Partner APIs
- Internal microservice APIs
In this article, we will implement distributed API rate limiting using:
- Spring Boot
- Bucket4j
- Redis
- Lettuce Redis Client
What Problem Are We Solving?
Assume we have an API:
GET /api/products
Without rate limiting, one user or client can send thousands of requests per minute.
This can cause:
- High CPU usage
- Database overload
- Slow response time
- Service instability
- Security risk
So we need a rule like:
Allow only 10 requests per minute per client.
If the limit is exceeded, return HTTP 429 Too Many Requests.
High-Level Architecture
flowchart LR
C[Client / Browser / Partner API]
LB[Load Balancer]
A1[Spring Boot App Instance 1]
A2[Spring Boot App Instance 2]
R[(Redis)]
API[Protected API]
C --> LB
LB --> A1
LB --> A2
A1 --> R
A2 --> R
A1 --> API
A2 --> API
Redis is used because multiple Spring Boot instances must share the same rate limit state.
Why Not In-Memory Rate Limiting?
In-memory rate limiting works only for a single application instance.
flowchart TD
C[Client]
A1[App Instance 1 - Local Bucket]
A2[App Instance 2 - Local Bucket]
C --> A1
C --> A2
Problem:
Instance 1 allows 10 requests.
Instance 2 also allows 10 requests.
Total = 20 requests.
Expected = 10 requests.
This is not correct in distributed systems.
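The double-counting problem can be reproduced without any framework. The following is a minimal sketch in plain Java; the class `LocalCounterDemo`, the round-robin routing, and the simple counters are illustrative assumptions and not part of this article's project.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: each "instance" keeps its own counter, as an in-memory limiter would.
final class LocalCounterDemo {

    // Counts how many of `requests` are allowed when traffic is spread
    // round-robin over `instances`, each enforcing `limit` locally.
    static int allowedWithLocalCounters(int instances, int limit, int requests) {
        AtomicInteger[] counters = new AtomicInteger[instances];
        for (int i = 0; i < instances; i++) {
            counters[i] = new AtomicInteger();
        }
        int allowed = 0;
        for (int r = 0; r < requests; r++) {
            AtomicInteger local = counters[r % instances]; // load balancer picks an instance
            if (local.incrementAndGet() <= limit) {
                allowed++;
            }
        }
        return allowed;
    }

    // Same traffic, but every instance checks one shared counter.
    static int allowedWithSharedCounter(int limit, int requests) {
        AtomicInteger shared = new AtomicInteger();
        int allowed = 0;
        for (int r = 0; r < requests; r++) {
            if (shared.incrementAndGet() <= limit) {
                allowed++;
            }
        }
        return allowed;
    }
}
```

With 2 instances, a limit of 10, and 30 requests, `allowedWithLocalCounters` returns 20 while `allowedWithSharedCounter` returns 10.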
Why Redis?
Redis acts as a centralized store for rate limit state.
flowchart TD
C[Client]
A1[App Instance 1]
A2[App Instance 2]
R[(Redis Shared Bucket State)]
C --> A1
C --> A2
A1 --> R
A2 --> R
Now the limit is shared across all instances.
Token Bucket Algorithm
Bucket4j uses the token bucket algorithm.
Simple example:
Bucket capacity = 10 tokens
Refill rate = 10 tokens per minute
Each request consumes 1 token
If tokens are available -> allow request
If bucket is empty -> reject request
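To make the rules above concrete, here is a minimal single-process sketch of a token bucket that adds tokens once per elapsed period. It is not Bucket4j's implementation; the class name `SimpleTokenBucket` and the choice to pass the current time as a parameter (for easy testing) are assumptions made for illustration.

```java
// Minimal in-memory token bucket with interval refill (illustrative only).
final class SimpleTokenBucket {
    private final long capacity;
    private final long refillTokens;
    private final long refillPeriodNanos;
    private long tokens;
    private long lastRefillNanos;

    SimpleTokenBucket(long capacity, long refillTokens, long refillPeriodNanos, long nowNanos) {
        this.capacity = capacity;
        this.refillTokens = refillTokens;
        this.refillPeriodNanos = refillPeriodNanos;
        this.tokens = capacity;           // the bucket starts full
        this.lastRefillNanos = nowNanos;
    }

    // Returns true and removes one token if any are available.
    synchronized boolean tryConsume(long nowNanos) {
        refill(nowNanos);
        if (tokens > 0) {
            tokens--;
            return true;
        }
        return false;
    }

    // Adds refillTokens for every full period elapsed, never exceeding capacity.
    private void refill(long nowNanos) {
        long periods = (nowNanos - lastRefillNanos) / refillPeriodNanos;
        if (periods > 0) {
            tokens = Math.min(capacity, tokens + periods * refillTokens);
            lastRefillNanos += periods * refillPeriodNanos;
        }
    }
}
```

With a capacity of 10 and a refill of 10 tokens per minute, the first 10 calls inside the same minute succeed, the 11th is rejected, and a call made one full minute after creation succeeds again.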
Token Bucket Flow
flowchart TD
A[Request Received]
B[Identify Client Key]
C[Get Bucket from Redis]
D{Token Available?}
E[Consume 1 Token]
F[Allow Request]
G[Reject Request]
H[Return HTTP 429]
A --> B
B --> C
C --> D
D -->|Yes| E
E --> F
D -->|No| G
G --> H
Project Structure
rate-limiter-demo
└── src/main/java/com/codewithvenu/ratelimiter
├── RateLimiterApplication.java
├── config
│ ├── RedisConfig.java
│ └── RateLimitConfig.java
├── filter
│ └── RateLimitFilter.java
├── controller
│ └── ProductController.java
└── service
└── RateLimitService.java
Step 1: Create Spring Boot Project
Required dependencies:
- Spring Web
- Spring Data Redis
- Bucket4j
- Lettuce
- Lombok (optional)
Step 2: Maven Dependencies
<dependencies>
<!-- Spring Boot Web -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<!-- Redis Support -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>
<!-- Bucket4j Core -->
<dependency>
<groupId>com.bucket4j</groupId>
<artifactId>bucket4j-core</artifactId>
<version>8.10.1</version>
</dependency>
<!-- Bucket4j Redis Lettuce -->
<dependency>
<groupId>com.bucket4j</groupId>
<artifactId>bucket4j-redis</artifactId>
<version>8.10.1</version>
</dependency>
<!-- Lettuce Redis Client -->
<dependency>
<groupId>io.lettuce</groupId>
<artifactId>lettuce-core</artifactId>
</dependency>
</dependencies>
Note: Always verify the latest compatible Bucket4j version before production use.
Step 3: application.yml
server:
port: 8080
spring:
application:
name: rate-limiter-demo
data:
redis:
host: localhost
port: 6379
rate-limit:
capacity: 10
refill-tokens: 10
refill-duration-minutes: 1
Step 4: Run Redis Locally
Using Docker:
docker run --name redis-rate-limit -p 6379:6379 -d redis:latest
Check Redis:
docker ps
Connect to Redis CLI:
docker exec -it redis-rate-limit redis-cli
Step 5: Redis Configuration
package com.codewithvenu.ratelimiter.config;

import io.lettuce.core.RedisClient;
import io.lettuce.core.RedisURI;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class RedisConfig {

    // RedisClient.create accepts a String or a RedisURI, not a java.net.URI,
    // so the connection target is built with RedisURI.
    @Bean
    public RedisClient redisClient(
            @Value("${spring.data.redis.host}") String host,
            @Value("${spring.data.redis.port}") int port) {
        return RedisClient.create(RedisURI.create(host, port));
    }
}
Step 6: Rate Limit Configuration Properties
package com.codewithvenu.ratelimiter.config;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Configuration;
@Configuration
@ConfigurationProperties(prefix = "rate-limit")
public class RateLimitConfig {
private int capacity;
private int refillTokens;
private int refillDurationMinutes;
public int getCapacity() {
return capacity;
}
public void setCapacity(int capacity) {
this.capacity = capacity;
}
public int getRefillTokens() {
return refillTokens;
}
public void setRefillTokens(int refillTokens) {
this.refillTokens = refillTokens;
}
public int getRefillDurationMinutes() {
return refillDurationMinutes;
}
public void setRefillDurationMinutes(int refillDurationMinutes) {
this.refillDurationMinutes = refillDurationMinutes;
}
}
Step 7: RateLimitService
package com.codewithvenu.ratelimiter.service;

import com.codewithvenu.ratelimiter.config.RateLimitConfig;
import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.BucketConfiguration;
import io.github.bucket4j.ConsumptionProbe;
import io.github.bucket4j.Refill;
import io.github.bucket4j.distributed.ExpirationAfterWriteStrategy;
import io.github.bucket4j.distributed.proxy.ProxyManager;
import io.github.bucket4j.redis.lettuce.cas.LettuceBasedProxyManager;
import io.lettuce.core.RedisClient;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.codec.ByteArrayCodec;
import io.lettuce.core.codec.RedisCodec;
import io.lettuce.core.codec.StringCodec;
import org.springframework.stereotype.Service;

import java.time.Duration;
import java.util.function.Supplier;

@Service
public class RateLimitService {

    // The key type is String because the connection below uses a String key codec.
    private final ProxyManager<String> proxyManager;
    private final RateLimitConfig rateLimitConfig;

    public RateLimitService(RedisClient redisClient, RateLimitConfig rateLimitConfig) {
        // String keys, byte[] values (Bucket4j stores the bucket state as bytes).
        StatefulRedisConnection<String, byte[]> connection =
                redisClient.connect(RedisCodec.of(StringCodec.UTF8, ByteArrayCodec.INSTANCE));
        this.proxyManager = LettuceBasedProxyManager
                .builderFor(connection)
                // Let Redis expire idle buckets instead of keeping their keys forever.
                .withExpirationStrategy(
                        ExpirationAfterWriteStrategy.basedOnTimeForRefillingBucketUpToMax(Duration.ofSeconds(10)))
                .build();
        this.rateLimitConfig = rateLimitConfig;
    }

    // Tries to take one token from the caller's bucket and reports what is left.
    public ConsumptionProbe tryConsume(String key) {
        return proxyManager
                .builder()
                .build(resolveBucketKey(key), getBucketConfiguration())
                .tryConsumeAndReturnRemaining(1);
    }

    private String resolveBucketKey(String key) {
        return "rate-limit:" + key;
    }

    // The configuration is only used when the bucket does not yet exist in Redis.
    private Supplier<BucketConfiguration> getBucketConfiguration() {
        return () -> {
            Bandwidth limit = Bandwidth.classic(
                    rateLimitConfig.getCapacity(),
                    Refill.intervally(
                            rateLimitConfig.getRefillTokens(),
                            Duration.ofMinutes(rateLimitConfig.getRefillDurationMinutes())
                    )
            );
            return BucketConfiguration.builder()
                    .addLimit(limit)
                    .build();
        };
    }
}
Step 8: Create RateLimitFilter
package com.codewithvenu.ratelimiter.filter;
import com.codewithvenu.ratelimiter.service.RateLimitService;
import io.github.bucket4j.ConsumptionProbe;
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.http.HttpStatus;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;
import java.io.IOException;
@Component
public class RateLimitFilter extends OncePerRequestFilter {
private final RateLimitService rateLimitService;
public RateLimitFilter(RateLimitService rateLimitService) {
this.rateLimitService = rateLimitService;
}
@Override
protected void doFilterInternal(
HttpServletRequest request,
HttpServletResponse response,
FilterChain filterChain)
throws ServletException, IOException {
String clientKey = resolveClientKey(request);
ConsumptionProbe probe = rateLimitService.tryConsume(clientKey);
response.addHeader("X-Rate-Limit-Remaining", String.valueOf(probe.getRemainingTokens()));
if (probe.isConsumed()) {
filterChain.doFilter(request, response);
return;
}
// Round up so that Retry-After is never 0 while the request is still blocked.
long waitTimeSeconds = (probe.getNanosToWaitForRefill() + 999_999_999L) / 1_000_000_000L;
response.setStatus(HttpStatus.TOO_MANY_REQUESTS.value());
response.addHeader("Retry-After", String.valueOf(waitTimeSeconds));
response.setContentType("application/json");
response.getWriter().write("""
{
"error": "Too Many Requests",
"message": "Rate limit exceeded. Please try again later."
}
""");
}
private String resolveClientKey(HttpServletRequest request) {
String apiKey = request.getHeader("X-API-Key");
if (apiKey != null && !apiKey.isBlank()) {
return "api-key:" + apiKey;
}
String forwardedFor = request.getHeader("X-Forwarded-For");
if (forwardedFor != null && !forwardedFor.isBlank()) {
return "ip:" + forwardedFor.split(",")[0].trim();
}
return "ip:" + request.getRemoteAddr();
}
}
Step 9: Create Sample Controller
package com.codewithvenu.ratelimiter.controller;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import java.time.Instant;
import java.util.Map;
@RestController
public class ProductController {
@GetMapping("/api/products")
public Map<String, Object> getProducts() {
return Map.of(
"message", "Products fetched successfully",
"timestamp", Instant.now().toString()
);
}
}
Step 10: Main Application Class
package com.codewithvenu.ratelimiter;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.context.properties.ConfigurationPropertiesScan;
@SpringBootApplication
@ConfigurationPropertiesScan
public class RateLimiterApplication {
public static void main(String[] args) {
SpringApplication.run(RateLimiterApplication.class, args);
}
}
Step 11: Test API
Start Redis:
docker start redis-rate-limit
Start Spring Boot:
mvn spring-boot:run
Call API:
curl -i http://localhost:8080/api/products
Call multiple times quickly:
for i in {1..15}
do
curl -i http://localhost:8080/api/products
echo ""
done
Expected result:
First 10 requests -> HTTP 200
Next requests -> HTTP 429 Too Many Requests
Success Response
HTTP/1.1 200 OK
X-Rate-Limit-Remaining: 9
Rate Limit Exceeded Response
HTTP/1.1 429 Too Many Requests
Retry-After: 45
Content-Type: application/json
{
"error": "Too Many Requests",
"message": "Rate limit exceeded. Please try again later."
}
Request Flow Diagram
sequenceDiagram
participant Client
participant Filter as RateLimitFilter
participant Service as RateLimitService
participant Redis
participant API as Product API
Client->>Filter: Request /api/products
Filter->>Filter: Resolve client key
Filter->>Service: tryConsume(clientKey)
Service->>Redis: Check bucket tokens
Redis-->>Service: Token status
Service-->>Filter: ConsumptionProbe
alt Token Available
Filter->>API: Continue request
API-->>Client: 200 OK
else Token Not Available
Filter-->>Client: 429 Too Many Requests
end
Client Key Strategy
Rate limiting depends on how we identify the caller.
Common strategies:
| Strategy | Example | Use Case |
|---|---|---|
| IP address | ip:192.168.1.10 | Public APIs |
| API key | api-key:partner-a | Partner APIs |
| User ID | user:1001 | Authenticated APIs |
| Tenant ID | tenant:abc-corp | SaaS platforms |
| Endpoint + User | user:1001:/api/search | Endpoint-specific limits |
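One way to combine these strategies is to pick the most specific identifier that is available. The sketch below is a plain-Java illustration; the helper name `ClientKeys` and the priority order (user ID, then API key, then IP address) are assumptions, not something the filter in this article prescribes.

```java
// Hypothetical helper: builds a bucket key from whichever identifier is present.
final class ClientKeys {

    // Prefers the authenticated user, then the API key, then the caller's IP address.
    static String resolve(String userId, String apiKey, String ipAddress) {
        if (isPresent(userId)) {
            return "user:" + userId.trim();
        }
        if (isPresent(apiKey)) {
            return "api-key:" + apiKey.trim();
        }
        return "ip:" + ipAddress;
    }

    // Appends the request path so each endpoint gets its own bucket.
    static String withEndpoint(String clientKey, String path) {
        return clientKey + ":" + path;
    }

    private static boolean isPresent(String value) {
        return value != null && !value.isBlank();
    }
}
```

For example, `ClientKeys.withEndpoint(ClientKeys.resolve("1001", null, "192.168.1.10"), "/api/search")` yields `user:1001:/api/search`, matching the last row of the table.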
Endpoint-Specific Rate Limiting
Sometimes each endpoint needs a different limit.
Example:
/api/login -> 5 requests per minute
/api/products -> 100 requests per minute
/api/payment -> 10 requests per minute
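Including the endpoint path in the client key gives each endpoint its own bucket (the limit applied to that bucket is still whatever the bucket configuration specifies):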
private String resolveClientKey(HttpServletRequest request) {
String ip = request.getRemoteAddr();
String path = request.getRequestURI();
return "ip:" + ip + ":path:" + path;
}
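A separate bucket per path is only half of the picture: the limit also has to vary by endpoint. The following sketch shows one possible lookup in plain Java. The class `EndpointLimits`, the prefix-matching rule, and the default of 60 requests per minute are assumptions for illustration; the three configured values are the ones listed above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lookup table from path prefix to requests per minute.
final class EndpointLimits {
    private static final int DEFAULT_PER_MINUTE = 60; // assumed fallback, not from the article

    private static final Map<String, Integer> PER_MINUTE = new LinkedHashMap<>();
    static {
        PER_MINUTE.put("/api/login", 5);
        PER_MINUTE.put("/api/products", 100);
        PER_MINUTE.put("/api/payment", 10);
    }

    // Matches the exact path or any sub-path such as /api/products/42.
    static int perMinuteFor(String path) {
        for (Map.Entry<String, Integer> entry : PER_MINUTE.entrySet()) {
            String prefix = entry.getKey();
            if (path.equals(prefix) || path.startsWith(prefix + "/")) {
                return entry.getValue();
            }
        }
        return DEFAULT_PER_MINUTE;
    }
}
```

The returned value could then feed the capacity and refill amount when the bucket configuration for that key is built.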
Real-World Enterprise Examples
Banking Login API
API: POST /api/login
Limit: 5 attempts per minute per user/IP
Reason: Prevent brute-force login attacks
Payment API
API: POST /api/payments
Limit: 10 requests per minute per customer
Reason: Prevent duplicate payment submission
Search API
API: GET /api/search
Limit: 100 requests per minute per user
Reason: Protect database from heavy queries
Recommended Production Headers
Add these headers to help clients understand limits:
X-Rate-Limit-Limit: 10
X-Rate-Limit-Remaining: 4
Retry-After: 30
Example update in filter:
response.addHeader("X-Rate-Limit-Limit", "10");
response.addHeader("X-Rate-Limit-Remaining", String.valueOf(probe.getRemainingTokens()));
Redis Key Example
Bucket key:
rate-limit:api-key:partner-a
Redis stores distributed bucket state internally.
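You can list the keys on a local instance (KEYS scans the whole keyspace and blocks Redis while it runs, so prefer SCAN in production):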
docker exec -it redis-rate-limit redis-cli
keys *
Important Production Considerations
1. Use API Gateway When Possible
If your organization already uses an API gateway or edge proxy such as Kong, Apigee, NGINX, AWS API Gateway, or Cloudflare, prefer gateway-level rate limiting for public APIs.
Application-level rate limiting is useful when business-specific logic is required.
2. Do Not Trust X-Forwarded-For Blindly
X-Forwarded-For can be spoofed if your app is directly exposed.
Use it only when requests come through a trusted load balancer or gateway.
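A common way to apply this rule is to honor the header only when the direct peer is a known proxy, and then to read the chain from right to left, because the leftmost entries are supplied by the client and can be forged. The sketch below is plain Java under those assumptions; `ForwardedForResolver` and the idea of a configured set of trusted proxy addresses are hypothetical and not part of the filter shown earlier.

```java
import java.util.Set;

// Hypothetical resolver that only trusts X-Forwarded-For from known proxies.
final class ForwardedForResolver {
    private final Set<String> trustedProxies;

    ForwardedForResolver(Set<String> trustedProxies) {
        this.trustedProxies = Set.copyOf(trustedProxies);
    }

    // remoteAddr is the TCP peer; forwardedFor is the raw header value (may be null).
    String clientIp(String remoteAddr, String forwardedFor) {
        if (!trustedProxies.contains(remoteAddr) || forwardedFor == null || forwardedFor.isBlank()) {
            return remoteAddr; // header ignored: the sender is not a known proxy
        }
        // Walk from the right: the first address that is not one of our proxies
        // is the closest hop we did not control, so treat it as the client.
        String[] hops = forwardedFor.split(",");
        for (int i = hops.length - 1; i >= 0; i--) {
            String hop = hops[i].trim();
            if (!hop.isEmpty() && !trustedProxies.contains(hop)) {
                return hop;
            }
        }
        return remoteAddr;
    }
}
```

With `10.0.0.5` configured as the only trusted proxy, a request arriving from `10.0.0.5` with the header `1.2.3.4, 203.0.113.7` resolves to `203.0.113.7`, while the same header sent directly from `203.0.113.7` is ignored.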
3. Use Different Limits for Different Clients
Example:
Free users -> 100 requests/day
Premium users -> 10,000 requests/day
Internal apps -> 100,000 requests/day
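These tiers can be modeled as a small enum so the limit travels with the plan. This is only a sketch: the enum name `ClientTier`, the `fromPlan` lookup, and the decision to treat unknown plans as the free tier are assumptions; the three quotas are the ones from the example above.

```java
import java.util.Locale;

// Hypothetical mapping from a client's plan to its daily request quota.
enum ClientTier {
    FREE(100),
    PREMIUM(10_000),
    INTERNAL(100_000);

    private final long requestsPerDay;

    ClientTier(long requestsPerDay) {
        this.requestsPerDay = requestsPerDay;
    }

    long requestsPerDay() {
        return requestsPerDay;
    }

    // Unknown or missing plan names fall back to the most restrictive tier.
    static ClientTier fromPlan(String plan) {
        if (plan == null) {
            return FREE;
        }
        try {
            return valueOf(plan.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return FREE;
        }
    }
}
```

The quota from `ClientTier.fromPlan(plan).requestsPerDay()` could then be used as the bucket capacity with a one-day refill period.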
4. Fail Open or Fail Closed?
Decide in advance how the rate limiter should behave when Redis is unavailable.
| Strategy | Meaning | Use Case |
|---|---|---|
| Fail open | Allow request | Better availability |
| Fail closed | Reject request | Better security |
For most customer-facing APIs, failing open is commonly preferred, provided that Redis failures are monitored and alerted on.
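The choice can be isolated in one small wrapper around the limiter call. The following plain-Java sketch assumes the Redis-backed check surfaces an outage as a runtime exception; `FailurePolicy`, its `Mode` enum, and the `BooleanSupplier` stand-in for the real check are hypothetical names used only for illustration.

```java
import java.util.function.BooleanSupplier;

// Hypothetical wrapper that decides what happens when the limiter itself fails.
final class FailurePolicy {
    enum Mode { FAIL_OPEN, FAIL_CLOSED }

    // `check` stands in for the Redis-backed call; it may throw if the store is unreachable.
    static boolean isAllowed(BooleanSupplier check, Mode mode) {
        try {
            return check.getAsBoolean();
        } catch (RuntimeException e) {
            // In a real service, log and count this so the outage is visible.
            return mode == Mode.FAIL_OPEN;
        }
    }
}
```

A login endpoint might use `Mode.FAIL_CLOSED`, while a product listing endpoint might use `Mode.FAIL_OPEN`.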
5. Add Monitoring
Track:
- Number of allowed requests
- Number of blocked requests
- Redis latency
- HTTP 429 count
- Top rate-limited clients
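Of the items above, the list of top rate-limited clients is the least obvious to collect. A minimal in-process sketch in plain Java is shown below; `BlockedClientTracker` is a hypothetical class, and its map is unbounded, so a real deployment would need eviction or a metrics backend instead.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.Collectors;

// Hypothetical tracker for the clients that hit the limit most often.
final class BlockedClientTracker {
    private final Map<String, LongAdder> blockedByClient = new ConcurrentHashMap<>();

    // Call this whenever a request is rejected with HTTP 429.
    void recordBlocked(String clientKey) {
        blockedByClient.computeIfAbsent(clientKey, k -> new LongAdder()).increment();
    }

    // Returns up to n client keys, most frequently blocked first.
    List<String> top(int n) {
        return blockedByClient.entrySet().stream()
                .sorted(Comparator.comparingLong(
                        (Map.Entry<String, LongAdder> e) -> e.getValue().sum()).reversed())
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```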
Add Micrometer Metrics
Example:
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.stereotype.Component;
@Component
public class RateLimitMetrics {
private final Counter allowedCounter;
private final Counter blockedCounter;
public RateLimitMetrics(MeterRegistry meterRegistry) {
this.allowedCounter = Counter.builder("api.rate.limit.allowed")
.description("Allowed API requests")
.register(meterRegistry);
this.blockedCounter = Counter.builder("api.rate.limit.blocked")
.description("Blocked API requests due to rate limiting")
.register(meterRegistry);
}
public void incrementAllowed() {
allowedCounter.increment();
}
public void incrementBlocked() {
blockedCounter.increment();
}
}
Common Mistakes
Mistake 1: Using Only Local Memory
This breaks rate limiting when multiple app instances are running.
Mistake 2: Applying Same Limit to All APIs
Login, payment, search, and public APIs should not have the same limit.
Mistake 3: Not Returning Retry-After
Clients need to know when they can retry.
Mistake 4: Not Considering Load Balancer IP
If all traffic comes through a load balancer, request.getRemoteAddr() may return the load balancer's IP instead of the client's IP.
Mistake 5: Blocking Internal Health Checks
Exclude health check endpoints:
@Override
protected boolean shouldNotFilter(HttpServletRequest request) {
String path = request.getRequestURI();
return path.startsWith("/actuator/health");
}
Final Architecture
flowchart TD
Client[Client]
Gateway[API Gateway / Load Balancer]
App[Spring Boot API]
Filter[RateLimitFilter]
Bucket[Bucket4j]
Redis[(Redis Shared Bucket Store)]
Controller[Business Controller]
DB[(Database)]
Client --> Gateway
Gateway --> App
App --> Filter
Filter --> Bucket
Bucket --> Redis
Filter -->|Allowed| Controller
Controller --> DB
Filter -->|Rejected| TooMany[HTTP 429 Too Many Requests]
When to Use Bucket4j with Redis
Use this approach when:
- You have multiple Spring Boot instances
- You need distributed rate limiting
- You need custom business-specific rate limit rules
- You need user, tenant, API key, or endpoint-based limits
- Gateway-level rate limiting is not enough
Summary
Rate limiting is an important API security and reliability pattern.
In this article, we implemented:
- Token bucket algorithm
- Bucket4j integration
- Redis-backed distributed rate limiting
- Spring Boot filter-based protection
- Client key strategy
- HTTP 429 response
- Retry-After header
- Production best practices
Rate limiting should be part of every enterprise API architecture.
Final Thought
In enterprise systems, rate limiting is not only a security feature.
It is also a reliability pattern.
A well-designed rate limiter protects:
- Application servers
- Databases
- External integrations
- Payment systems
- Login systems
- Customer experience
Design your APIs to be fast, fair, and safe.