Rate Limiting APIs with Bucket4j and Redis

A step-by-step guide to implementing distributed API rate limiting in Spring Boot using Bucket4j and Redis.

Introduction

Rate limiting protects APIs from excessive traffic, brute-force attacks, accidental retries, and unfair resource usage.

In enterprise applications, rate limiting is commonly used for:

  • Login APIs
  • OTP APIs
  • Payment APIs
  • Search APIs
  • Public APIs
  • Partner APIs
  • Internal microservice APIs

In this article, we will implement distributed API rate limiting using:

  • Spring Boot
  • Bucket4j
  • Redis
  • Lettuce Redis Client

What Problem Are We Solving?

Assume we have an API:

GET /api/products

Without rate limiting, one user or client can send thousands of requests per minute.

This can cause:

  • High CPU usage
  • Database overload
  • Slow response time
  • Service instability
  • Security risk

So we need a rule like:

Allow only 10 requests per minute per client.
If the limit is exceeded, return HTTP 429 Too Many Requests.

High-Level Architecture

flowchart LR
    C[Client / Browser / Partner API]
    LB[Load Balancer]
    A1[Spring Boot App Instance 1]
    A2[Spring Boot App Instance 2]
    R[(Redis)]
    API[Protected API]

    C --> LB
    LB --> A1
    LB --> A2

    A1 --> R
    A2 --> R

    A1 --> API
    A2 --> API

Redis is used because multiple Spring Boot instances must share the same rate limit state.


Why Not In-Memory Rate Limiting?

In-memory rate limiting works only for a single application instance.

flowchart TD
    C[Client]
    A1[App Instance 1 - Local Bucket]
    A2[App Instance 2 - Local Bucket]

    C --> A1
    C --> A2

Problem:

Instance 1 allows 10 requests.
Instance 2 also allows 10 requests.

Total = 20 requests.
Expected = 10 requests.

The effective limit grows with the number of instances, so the rule is no longer enforced across the cluster.
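The arithmetic above can be reproduced with a small simulation. This is a minimal sketch, not part of the project: `PermitBudget` and `LocalVsSharedDemo` are hypothetical names, and the budget has no refill because only the counting problem is being illustrated.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative stand-in for one bucket: a fixed number of permits, no refill. */
class PermitBudget {
    private final AtomicInteger remaining;

    PermitBudget(int capacity) {
        this.remaining = new AtomicInteger(capacity);
    }

    /** Takes one permit if any is left. */
    boolean tryAcquire() {
        // Decrement only while positive so the counter never goes below zero.
        return remaining.getAndUpdate(v -> v > 0 ? v - 1 : v) > 0;
    }
}

class LocalVsSharedDemo {

    /** Each instance owns its own budget (in-memory rate limiting). */
    static int allowedWithLocalBudgets(int instances, int capacity, int requests) {
        PermitBudget[] budgets = new PermitBudget[instances];
        for (int i = 0; i < instances; i++) {
            budgets[i] = new PermitBudget(capacity);
        }
        return countAllowed(budgets, requests);
    }

    /** All instances point at the same budget (what a shared store provides). */
    static int allowedWithSharedBudget(int instances, int capacity, int requests) {
        PermitBudget shared = new PermitBudget(capacity);
        PermitBudget[] budgets = new PermitBudget[instances];
        for (int i = 0; i < instances; i++) {
            budgets[i] = shared;
        }
        return countAllowed(budgets, requests);
    }

    /** Routes requests round-robin, like a simple load balancer, and counts the allowed ones. */
    private static int countAllowed(PermitBudget[] budgetPerInstance, int requests) {
        int allowed = 0;
        for (int i = 0; i < requests; i++) {
            if (budgetPerInstance[i % budgetPerInstance.length].tryAcquire()) {
                allowed++;
            }
        }
        return allowed;
    }
}
```

With 2 instances, a capacity of 10, and 30 requests, the local variant lets 20 requests through while the shared variant lets 10 through.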


Why Redis?

Redis acts as a centralized store for rate limit state.

flowchart TD
    C[Client]
    A1[App Instance 1]
    A2[App Instance 2]
    R[(Redis Shared Bucket State)]

    C --> A1
    C --> A2
    A1 --> R
    A2 --> R

Now the limit is shared across all instances.


Token Bucket Algorithm

Bucket4j uses the token bucket algorithm.

Simple example:

Bucket capacity = 10 tokens
Refill rate = 10 tokens per minute
Each request consumes 1 token
If tokens are available -> allow request
If bucket is empty -> reject request
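
To make these rules concrete, here is a minimal single-process sketch of a token bucket. It is not how Bucket4j is implemented internally; `SimpleTokenBucket` is a hypothetical class, and the clock is injected as a `LongSupplier` only so the behavior can be checked without waiting a real minute.

```java
import java.util.function.LongSupplier;

/** Illustrative token bucket that adds a fixed batch of tokens once per full period. */
class SimpleTokenBucket {
    private final long capacity;
    private final long refillTokens;
    private final long refillPeriodNanos;
    private final LongSupplier nanoClock;

    private long availableTokens;
    private long lastRefillNanos;

    SimpleTokenBucket(long capacity, long refillTokens, long refillPeriodNanos, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillTokens = refillTokens;
        this.refillPeriodNanos = refillPeriodNanos;
        this.nanoClock = nanoClock;
        this.availableTokens = capacity;              // the bucket starts full
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Consumes one token if available; returns false when the bucket is empty. */
    synchronized boolean tryConsume() {
        refill();
        if (availableTokens > 0) {
            availableTokens--;
            return true;
        }
        return false;
    }

    private void refill() {
        long now = nanoClock.getAsLong();
        long elapsedPeriods = (now - lastRefillNanos) / refillPeriodNanos;
        if (elapsedPeriods > 0) {
            // Never exceed capacity, however long the bucket sat idle.
            availableTokens = Math.min(capacity, availableTokens + elapsedPeriods * refillTokens);
            lastRefillNanos += elapsedPeriods * refillPeriodNanos;
        }
    }
}
```

In production code the clock would typically be `System::nanoTime`. With a capacity of 10 and a refill of 10 tokens per minute, the eleventh call inside the first minute is rejected and calls succeed again once a full minute has elapsed.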

Token Bucket Flow

flowchart TD
    A[Request Received]
    B[Identify Client Key]
    C[Get Bucket from Redis]
    D{Token Available?}
    E[Consume 1 Token]
    F[Allow Request]
    G[Reject Request]
    H[Return HTTP 429]

    A --> B
    B --> C
    C --> D
    D -->|Yes| E
    E --> F
    D -->|No| G
    G --> H

Project Structure

rate-limiter-demo
 └── src/main/java/com/codewithvenu/ratelimiter
     ├── RateLimiterApplication.java
     ├── config
     │   ├── RedisConfig.java
     │   └── RateLimitConfig.java
     ├── filter
     │   └── RateLimitFilter.java
     ├── controller
     │   └── ProductController.java
     └── service
         └── RateLimitService.java

Step 1: Create Spring Boot Project

Required dependencies:

  • Spring Web
  • Spring Data Redis
  • Bucket4j
  • Lettuce
  • Lombok (optional)

Step 2: Maven Dependencies

<dependencies>

    <!-- Spring Boot Web -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <!-- Redis Support -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-redis</artifactId>
    </dependency>

    <!-- Bucket4j Core -->
    <dependency>
        <groupId>com.bucket4j</groupId>
        <artifactId>bucket4j-core</artifactId>
        <version>8.10.1</version>
    </dependency>

    <!-- Bucket4j Redis Lettuce -->
    <dependency>
        <groupId>com.bucket4j</groupId>
        <artifactId>bucket4j-redis</artifactId>
        <version>8.10.1</version>
    </dependency>

    <!-- Lettuce Redis Client -->
    <dependency>
        <groupId>io.lettuce</groupId>
        <artifactId>lettuce-core</artifactId>
    </dependency>

</dependencies>

Note: Always verify the latest compatible Bucket4j version before production use.


Step 3: application.yml

server:
  port: 8080

spring:
  application:
    name: rate-limiter-demo

  data:
    redis:
      host: localhost
      port: 6379

rate-limit:
  capacity: 10
  refill-tokens: 10
  refill-duration-minutes: 1

Step 4: Run Redis Locally

Using Docker:

docker run --name redis-rate-limit -p 6379:6379 -d redis:latest

Check Redis:

docker ps

Connect to Redis CLI:

docker exec -it redis-rate-limit redis-cli

Step 5: Redis Configuration

package com.codewithvenu.ratelimiter.config;

import io.lettuce.core.RedisClient;
import io.lettuce.core.RedisURI;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class RedisConfig {

    @Bean
    public RedisClient redisClient(
            @Value("${spring.data.redis.host}") String host,
            @Value("${spring.data.redis.port}") int port) {

        // RedisClient.create accepts a RedisURI or a URI string, not a java.net.URI
        return RedisClient.create(RedisURI.create(host, port));
    }
}

Step 6: Rate Limit Configuration Properties

package com.codewithvenu.ratelimiter.config;

import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Configuration;

@Configuration
@ConfigurationProperties(prefix = "rate-limit")
public class RateLimitConfig {

    private int capacity;
    private int refillTokens;
    private int refillDurationMinutes;

    public int getCapacity() {
        return capacity;
    }

    public void setCapacity(int capacity) {
        this.capacity = capacity;
    }

    public int getRefillTokens() {
        return refillTokens;
    }

    public void setRefillTokens(int refillTokens) {
        this.refillTokens = refillTokens;
    }

    public int getRefillDurationMinutes() {
        return refillDurationMinutes;
    }

    public void setRefillDurationMinutes(int refillDurationMinutes) {
        this.refillDurationMinutes = refillDurationMinutes;
    }
}

Step 7: RateLimitService

package com.codewithvenu.ratelimiter.service;

import com.codewithvenu.ratelimiter.config.RateLimitConfig;
import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.BucketConfiguration;
import io.github.bucket4j.ConsumptionProbe;
import io.github.bucket4j.Refill;
import io.github.bucket4j.distributed.proxy.ProxyManager;
import io.github.bucket4j.redis.lettuce.cas.LettuceBasedProxyManager;
import io.lettuce.core.RedisClient;
import io.lettuce.core.api.StatefulRedisConnection;
import org.springframework.stereotype.Service;

import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.function.Supplier;

@Service
public class RateLimitService {

    private final ProxyManager<byte[]> proxyManager;
    private final RateLimitConfig rateLimitConfig;

    public RateLimitService(RedisClient redisClient, RateLimitConfig rateLimitConfig) {
        // Keys and values are both byte[] so the connection matches ProxyManager<byte[]>
        StatefulRedisConnection<byte[], byte[]> connection =
                redisClient.connect(io.lettuce.core.codec.ByteArrayCodec.INSTANCE);

        this.proxyManager = LettuceBasedProxyManager
                .builderFor(connection)
                .build();

        this.rateLimitConfig = rateLimitConfig;
    }

    public ConsumptionProbe tryConsume(String key) {
        return proxyManager
                .builder()
                .build(resolveBucketKey(key), getBucketConfiguration())
                .tryConsumeAndReturnRemaining(1);
    }

    private byte[] resolveBucketKey(String key) {
        return ("rate-limit:" + key).getBytes(StandardCharsets.UTF_8);
    }

    private Supplier<BucketConfiguration> getBucketConfiguration() {
        return () -> {
            Bandwidth limit = Bandwidth.classic(
                    rateLimitConfig.getCapacity(),
                    Refill.intervally(
                            rateLimitConfig.getRefillTokens(),
                            Duration.ofMinutes(rateLimitConfig.getRefillDurationMinutes())
                    )
            );

            return BucketConfiguration.builder()
                    .addLimit(limit)
                    .build();
        };
    }
}
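
The service above uses `Refill.intervally`, which adds the whole batch of tokens once per period. Bucket4j also offers `Refill.greedy`, which spreads the same tokens evenly over the period. The following is a simplified arithmetic model of that difference, not Bucket4j's implementation; `RefillModes` is a hypothetical helper and it ignores the capacity cap.

```java
/** Simplified model of how many tokens each refill style has added after some elapsed time. */
class RefillModes {

    /** Interval style: nothing is added until a full period has passed, then the whole batch. */
    static long intervalRefill(long tokensPerPeriod, long periodNanos, long elapsedNanos) {
        return (elapsedNanos / periodNanos) * tokensPerPeriod;
    }

    /** Greedy style: tokens trickle in evenly, one every periodNanos / tokensPerPeriod. */
    static long greedyRefill(long tokensPerPeriod, long periodNanos, long elapsedNanos) {
        // Fine for illustrative values; very large inputs could overflow a long.
        return elapsedNanos * tokensPerPeriod / periodNanos;
    }
}
```

With 10 tokens per minute, 30 seconds after the bucket was emptied the interval style has added 0 tokens and the greedy style has added 5. Interval refill matches the "10 requests per minute" wording most literally, while greedy refill lets blocked clients recover sooner.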

Step 8: Create RateLimitFilter

package com.codewithvenu.ratelimiter.filter;

import com.codewithvenu.ratelimiter.service.RateLimitService;
import io.github.bucket4j.ConsumptionProbe;
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.http.HttpStatus;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

import java.io.IOException;

@Component
public class RateLimitFilter extends OncePerRequestFilter {

    private final RateLimitService rateLimitService;

    public RateLimitFilter(RateLimitService rateLimitService) {
        this.rateLimitService = rateLimitService;
    }

    @Override
    protected void doFilterInternal(
            HttpServletRequest request,
            HttpServletResponse response,
            FilterChain filterChain)
            throws ServletException, IOException {

        String clientKey = resolveClientKey(request);

        ConsumptionProbe probe = rateLimitService.tryConsume(clientKey);

        response.addHeader("X-Rate-Limit-Remaining", String.valueOf(probe.getRemainingTokens()));

        if (probe.isConsumed()) {
            filterChain.doFilter(request, response);
            return;
        }

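        // Round up so clients are never told to retry before a token is actually available
        long waitTimeSeconds = (probe.getNanosToWaitForRefill() + 999_999_999L) / 1_000_000_000L;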

        response.setStatus(HttpStatus.TOO_MANY_REQUESTS.value());
        response.addHeader("Retry-After", String.valueOf(waitTimeSeconds));
        response.setContentType("application/json");

        response.getWriter().write("""
                {
                  "error": "Too Many Requests",
                  "message": "Rate limit exceeded. Please try again later."
                }
                """);
    }

    private String resolveClientKey(HttpServletRequest request) {
        String apiKey = request.getHeader("X-API-Key");

        if (apiKey != null && !apiKey.isBlank()) {
            return "api-key:" + apiKey;
        }

        String forwardedFor = request.getHeader("X-Forwarded-For");

        if (forwardedFor != null && !forwardedFor.isBlank()) {
            return "ip:" + forwardedFor.split(",")[0].trim();
        }

        return "ip:" + request.getRemoteAddr();
    }
}

Step 9: Create Sample Controller

package com.codewithvenu.ratelimiter.controller;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

import java.time.Instant;
import java.util.Map;

@RestController
public class ProductController {

    @GetMapping("/api/products")
    public Map<String, Object> getProducts() {
        return Map.of(
                "message", "Products fetched successfully",
                "timestamp", Instant.now().toString()
        );
    }
}

Step 10: Main Application Class

package com.codewithvenu.ratelimiter;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.context.properties.ConfigurationPropertiesScan;

@SpringBootApplication
@ConfigurationPropertiesScan
public class RateLimiterApplication {

    public static void main(String[] args) {
        SpringApplication.run(RateLimiterApplication.class, args);
    }
}

Step 11: Test API

Start Redis:

docker start redis-rate-limit

Start Spring Boot:

mvn spring-boot:run

Call API:

curl -i http://localhost:8080/api/products

Call multiple times quickly:

for i in {1..15}
do
  curl -i http://localhost:8080/api/products
  echo ""
done

Expected result:

First 10 requests -> HTTP 200
Next requests -> HTTP 429 Too Many Requests

Success Response

HTTP/1.1 200 OK
X-Rate-Limit-Remaining: 9

Rate Limit Exceeded Response

HTTP/1.1 429 Too Many Requests
X-Rate-Limit-Remaining: 0
Retry-After: 45
Content-Type: application/json

{
  "error": "Too Many Requests",
  "message": "Rate limit exceeded. Please try again later."
}

Request Flow Diagram

sequenceDiagram
    participant Client
    participant Filter as RateLimitFilter
    participant Service as RateLimitService
    participant Redis
    participant API as Product API

    Client->>Filter: Request /api/products
    Filter->>Filter: Resolve client key
    Filter->>Service: tryConsume(clientKey)
    Service->>Redis: Check bucket tokens
    Redis-->>Service: Token status
    Service-->>Filter: ConsumptionProbe

    alt Token Available
        Filter->>API: Continue request
        API-->>Client: 200 OK
    else Token Not Available
        Filter-->>Client: 429 Too Many Requests
    end

Client Key Strategy

Rate limiting depends on how we identify the caller.

Common strategies:

Strategy          Example                  Use Case
IP address        ip:192.168.1.10          Public APIs
API key           api-key:partner-a        Partner APIs
User ID           user:1001                Authenticated APIs
Tenant ID         tenant:abc-corp          SaaS platforms
Endpoint + User   user:1001:/api/search    Endpoint-specific limits
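
The key formats above can be produced by a small helper. This is a sketch under assumptions: `ClientKeys` and its method names are hypothetical, and hashing the API key is an optional choice made here so that secret key values do not appear as plain text in Redis key names.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

/** Illustrative builders for the rate limit key formats (requires Java 17+ for HexFormat). */
class ClientKeys {

    static String forIp(String ip) {
        return "ip:" + ip;
    }

    static String forUser(long userId) {
        return "user:" + userId;
    }

    static String forTenant(String tenantId) {
        return "tenant:" + tenantId;
    }

    static String forUserAndEndpoint(long userId, String path) {
        return "user:" + userId + ":" + path;
    }

    /** Uses a truncated SHA-256 digest instead of the raw secret (an assumption, not a requirement). */
    static String forApiKey(String rawApiKey) {
        return "api-key:" + sha256Hex(rawApiKey).substring(0, 16);
    }

    private static String sha256Hex(String value) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            return HexFormat.of().formatHex(digest.digest(value.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is not expected to happen.
            throw new IllegalStateException(e);
        }
    }
}
```

If the API key header carries a public partner identifier rather than a secret, the identifier can be used directly, as in the table.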

Endpoint-Specific Rate Limiting

Sometimes each endpoint needs a different limit.

Example:

/api/login      -> 5 requests per minute
/api/products   -> 100 requests per minute
/api/payment    -> 10 requests per minute

Client key can include endpoint path:

private String resolveClientKey(HttpServletRequest request) {
    String ip = request.getRemoteAddr();
    String path = request.getRequestURI();

    return "ip:" + ip + ":path:" + path;
}
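
Including the path in the key separates the buckets, but each bucket also needs its own limit. One possible way to look that up is a longest-prefix match, sketched below. `EndpointLimits`, the default of 10 requests per minute, and the use of the matched prefix in the key (so `/api/products/1` and `/api/products/2` share one bucket) are assumptions for illustration.

```java
import java.util.Map;

/** Illustrative lookup of a per-minute limit by request path, using the longest matching prefix. */
class EndpointLimits {

    // Assumed fallback for paths without a specific rule.
    static final int DEFAULT_PER_MINUTE = 10;

    private static final Map<String, Integer> PER_MINUTE_BY_PREFIX = Map.of(
            "/api/login", 5,
            "/api/products", 100,
            "/api/payment", 10
    );

    /** Returns the configured prefix that best matches the path, or "*" when none matches. */
    static String matchedPrefix(String path) {
        String best = null;
        for (String prefix : PER_MINUTE_BY_PREFIX.keySet()) {
            if (path.startsWith(prefix) && (best == null || prefix.length() > best.length())) {
                best = prefix;
            }
        }
        return best == null ? "*" : best;
    }

    static int limitFor(String path) {
        return PER_MINUTE_BY_PREFIX.getOrDefault(matchedPrefix(path), DEFAULT_PER_MINUTE);
    }

    /** Builds the bucket key from the client IP and the matched prefix rather than the full URI. */
    static String bucketKey(String ip, String path) {
        return "ip:" + ip + ":path:" + matchedPrefix(path);
    }
}
```

The value returned by `limitFor` would then feed the bucket configuration for that key instead of the single global capacity.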

Real-World Enterprise Examples

Banking Login API

API: POST /api/login
Limit: 5 attempts per minute per user/IP
Reason: Prevent brute-force login attacks

Payment API

API: POST /api/payments
Limit: 10 requests per minute per customer
Reason: Throttle rapid repeated submissions (pair with idempotency keys to prevent actual duplicate payments)

Search API

API: GET /api/search
Limit: 100 requests per minute per user
Reason: Protect database from heavy queries

Rate Limit Response Headers

Add these response headers to help clients understand the limits that apply to them:

X-Rate-Limit-Limit: 10
X-Rate-Limit-Remaining: 4
Retry-After: 30

Example update in filter:

response.addHeader("X-Rate-Limit-Limit", "10");
response.addHeader("X-Rate-Limit-Remaining", String.valueOf(probe.getRemainingTokens()));

Redis Key Example

Bucket key:

rate-limit:api-key:partner-a

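Bucket4j stores each bucket's serialized state as the value of this key.

You can list keys during local development (KEYS scans the whole keyspace and blocks Redis while it runs, so prefer SCAN on production instances):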

docker exec -it redis-rate-limit redis-cli
keys *

Important Production Considerations

1. Use API Gateway When Possible

If your organization already uses API Gateway, Kong, Apigee, NGINX, AWS API Gateway, or Cloudflare, prefer gateway-level rate limiting for public APIs.

Application-level rate limiting is useful when business-specific logic is required.


2. Do Not Trust X-Forwarded-For Blindly

X-Forwarded-For can be spoofed if your app is directly exposed.

Use it only when requests come through a trusted load balancer or gateway.
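
One common way to apply this rule is to honor X-Forwarded-For only when the direct peer is a known proxy, and then take the rightmost address that is not itself a trusted proxy. Note that this differs from the filter shown earlier, which takes the first entry. `ClientIpResolver` and the trusted proxy set are hypothetical; in a real deployment the set would come from configuration.

```java
import java.util.Set;

/** Illustrative client IP resolution that ignores X-Forwarded-For from untrusted peers. */
class ClientIpResolver {

    static String resolve(String remoteAddr, String forwardedFor, Set<String> trustedProxies) {
        // If the direct peer is not a trusted proxy, the header may be spoofed: ignore it.
        if (forwardedFor == null || forwardedFor.isBlank() || !trustedProxies.contains(remoteAddr)) {
            return remoteAddr;
        }

        // Walk from right to left: entries appended by trusted proxies are skipped,
        // and the first untrusted address is treated as the client.
        String[] hops = forwardedFor.split(",");
        for (int i = hops.length - 1; i >= 0; i--) {
            String hop = hops[i].trim();
            if (!hop.isEmpty() && !trustedProxies.contains(hop)) {
                return hop;
            }
        }
        return remoteAddr;
    }
}
```

With this approach a client that sends its own X-Forwarded-For value cannot choose its rate limit key, because anything it writes ends up to the left of the address appended by the trusted proxy.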


3. Use Different Limits for Different Clients

Example:

Free users      -> 100 requests/day
Premium users   -> 10,000 requests/day
Internal apps   -> 100,000 requests/day
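
These tiers could be modeled as an enum that carries the daily quota. This is a sketch: `ClientTier`, `fromPlan`, and the choice to fall back to the free tier for unknown plan names are assumptions, not part of the project above.

```java
import java.util.Locale;

/** Illustrative client tiers with the daily quotas from the example. */
enum ClientTier {
    FREE(100),
    PREMIUM(10_000),
    INTERNAL(100_000);

    private final long requestsPerDay;

    ClientTier(long requestsPerDay) {
        this.requestsPerDay = requestsPerDay;
    }

    long requestsPerDay() {
        return requestsPerDay;
    }

    /** Maps a plan name (for example from a user record) to a tier; unknown plans get the strictest tier. */
    static ClientTier fromPlan(String plan) {
        if (plan == null) {
            return FREE;
        }
        try {
            return valueOf(plan.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return FREE;
        }
    }
}
```

The quota returned by `requestsPerDay()` would be used as the bucket capacity, with a refill period of one day, when building the configuration for that client's key.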

4. Fail Open or Fail Closed?

If Redis is down, decide behavior carefully.

Strategy      Meaning          Use Case
Fail open     Allow request    Better availability
Fail closed   Reject request   Better security

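For most customer-facing APIs, failing open is commonly preferred, combined with monitoring and alerts so a Redis outage does not go unnoticed. For sensitive endpoints such as login or payments, failing closed may be the safer choice.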
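
The decision can be isolated in one small wrapper around the limiter call. The sketch below is an assumption about how this might be structured: `FailurePolicy` and `Mode` are hypothetical names, and the limiter is represented by a `BooleanSupplier` that throws when the backing store is unreachable.

```java
import java.util.function.BooleanSupplier;

/** Illustrative wrapper that decides what happens when the rate limiter itself fails. */
class FailurePolicy {

    enum Mode { FAIL_OPEN, FAIL_CLOSED }

    /**
     * Returns the limiter's own decision when it responds,
     * and the policy's decision when the check throws (for example, Redis is down).
     */
    static boolean isAllowed(BooleanSupplier limiterCheck, Mode mode) {
        try {
            return limiterCheck.getAsBoolean();
        } catch (RuntimeException e) {
            // A real filter would also log the failure and increment an error metric here.
            return mode == Mode.FAIL_OPEN;
        }
    }
}
```

In the filter, the supplier would wrap the call to the rate limit service, and the mode could be chosen per endpoint.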


5. Add Monitoring

Track:

  • Number of allowed requests
  • Number of blocked requests
  • Redis latency
  • HTTP 429 count
  • Top rate-limited clients
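
For the last item, a per-client counter is enough to get started. This is a minimal in-process sketch; `BlockedClientTracker` is a hypothetical class, and because the map grows with the number of distinct clients, a real deployment would likely bound it or reset it periodically.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.Collectors;

/** Illustrative tracker of how often each client key was rejected with HTTP 429. */
class BlockedClientTracker {

    private final Map<String, LongAdder> blockedByClient = new ConcurrentHashMap<>();

    /** Call this from the filter whenever a request is rejected. */
    void recordBlocked(String clientKey) {
        blockedByClient.computeIfAbsent(clientKey, k -> new LongAdder()).increment();
    }

    /** Returns up to n client keys, most frequently blocked first. */
    List<String> topBlocked(int n) {
        return blockedByClient.entrySet().stream()
                .sorted(Comparator.comparingLong(
                        (Map.Entry<String, LongAdder> e) -> e.getValue().sum()).reversed())
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

Each application instance sees only its own rejections, so the per-instance results would need to be aggregated in the monitoring system to get a cluster-wide view.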

Add Micrometer Metrics

Example:

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.stereotype.Component;

@Component
public class RateLimitMetrics {

    private final Counter allowedCounter;
    private final Counter blockedCounter;

    public RateLimitMetrics(MeterRegistry meterRegistry) {
        this.allowedCounter = Counter.builder("api.rate.limit.allowed")
                .description("Allowed API requests")
                .register(meterRegistry);

        this.blockedCounter = Counter.builder("api.rate.limit.blocked")
                .description("Blocked API requests due to rate limiting")
                .register(meterRegistry);
    }

    public void incrementAllowed() {
        allowedCounter.increment();
    }

    public void incrementBlocked() {
        blockedCounter.increment();
    }
}

Common Mistakes

Mistake 1: Using Only Local Memory

This breaks rate limiting when multiple app instances are running.


Mistake 2: Applying Same Limit to All APIs

Login, payment, search, and public APIs should not have the same limit.


Mistake 3: Not Returning Retry-After

Clients need to know when they can retry.


Mistake 4: Not Considering Load Balancer IP

If all traffic comes through a load balancer, request.getRemoteAddr() may return the load balancer's IP instead of the client's IP, so every client ends up sharing one bucket.


Mistake 5: Blocking Internal Health Checks

Exclude health check endpoints:

@Override
protected boolean shouldNotFilter(HttpServletRequest request) {
    String path = request.getRequestURI();
    return path.startsWith("/actuator/health");
}

Final Architecture

flowchart TD
    Client[Client]
    Gateway[API Gateway / Load Balancer]
    App[Spring Boot API]
    Filter[RateLimitFilter]
    Bucket[Bucket4j]
    Redis[(Redis Shared Bucket Store)]
    Controller[Business Controller]
    DB[(Database)]

    Client --> Gateway
    Gateway --> App
    App --> Filter
    Filter --> Bucket
    Bucket --> Redis

    Filter -->|Allowed| Controller
    Controller --> DB

    Filter -->|Rejected| TooMany[HTTP 429 Too Many Requests]

When to Use Bucket4j with Redis

Use this approach when:

  • You have multiple Spring Boot instances
  • You need distributed rate limiting
  • You need custom business-specific rate limit rules
  • You need user, tenant, API key, or endpoint-based limits
  • Gateway-level rate limiting is not enough

Summary

Rate limiting is an important API security and reliability pattern.

In this article, we implemented:

  • Token bucket algorithm
  • Bucket4j integration
  • Redis-backed distributed rate limiting
  • Spring Boot filter-based protection
  • Client key strategy
  • HTTP 429 response
  • Retry-After header
  • Production best practices

Rate limiting should be part of every enterprise API architecture.


Final Thought

In enterprise systems, rate limiting is not only a security feature.

It is also a reliability pattern.

A well-designed rate limiter protects:

  • Application servers
  • Databases
  • External integrations
  • Payment systems
  • Login systems
  • Customer experience

Design your APIs to be fast, fair, and safe.
