
AWS CloudWatch Metrics and Alarms

Learn how to monitor Spring Boot applications using Amazon CloudWatch Metrics, custom metrics, dashboards, and alarms with a complete implementation guide.


Introduction

Monitoring is one of the most important aspects of running applications in production. While logs help diagnose issues after they occur, metrics help detect problems before users notice them.

Amazon CloudWatch Metrics continuously collect operational data from AWS resources and applications, while CloudWatch Alarms notify your team or automatically trigger actions when predefined thresholds are exceeded.

This guide demonstrates how to integrate a Spring Boot application with CloudWatch Metrics and build a production-ready monitoring solution.


What are CloudWatch Metrics?

A metric is a numerical value collected over time.

Examples include:

  • CPU Utilization
  • Memory Usage
  • Disk Space
  • Network Traffic
  • HTTP Requests
  • API Latency
  • Error Count
  • Active Users
  • JVM Heap Usage

CloudWatch stores these values as time-series data.
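To make "time-series data" concrete, here is a minimal plain-Java sketch of datapoints being grouped into fixed periods and summed, which is roughly what happens when you graph a metric with a 1-minute period. `MetricPoint` and `TimeSeriesSketch` are hypothetical names used only for illustration; they are not part of the CloudWatch API.

```java
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical types for illustration only; this is not the CloudWatch API.
record MetricPoint(String name, Instant timestamp, double value) {}

final class TimeSeriesSketch {

    // Groups datapoints into fixed-size periods and sums the values in each period.
    static Map<Instant, Double> sumByPeriod(List<MetricPoint> points, long periodSeconds) {
        Map<Instant, Double> buckets = new TreeMap<>();
        for (MetricPoint p : points) {
            long epoch = p.timestamp().getEpochSecond();
            // Align the timestamp to the start of the period it falls into.
            Instant periodStart = Instant.ofEpochSecond(epoch - Math.floorMod(epoch, periodSeconds));
            buckets.merge(periodStart, p.value(), Double::sum);
        }
        return buckets;
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2024-01-01T10:00:00Z");
        List<MetricPoint> points = List.of(
                new MetricPoint("orders.created", t0.plusSeconds(5), 1),
                new MetricPoint("orders.created", t0.plusSeconds(40), 1),
                new MetricPoint("orders.created", t0.plusSeconds(75), 1));
        sumByPeriod(points, 60).forEach((start, sum) -> System.out.println(start + " -> " + sum));
    }
}
```

The same datapoints could be aggregated with a different statistic (average, maximum) or a different period without changing what was recorded.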


CloudWatch Monitoring Architecture

flowchart LR

A[Users]
B[Spring Boot Application]
C[Micrometer]
D[CloudWatch Metrics]
E[CloudWatch Dashboard]
F[CloudWatch Alarm]
G[SNS Email]
H[Auto Scaling]

A --> B
B --> C
C --> D
D --> E
D --> F
F --> G
F --> H

Types of Metrics

AWS Service Metrics

Automatically collected.

Examples:

  • EC2 CPU
  • RDS Connections
  • Lambda Invocations
  • SQS Queue Length
  • ALB Request Count

No application changes required.


Application Metrics

Generated by your application.

Examples:

  • Orders Processed
  • Login Success
  • Failed Payments
  • API Response Time
  • Business Transactions

Custom Metrics

Developer-defined metrics.

Examples:

  • OrdersCreated
  • PaymentsFailed
  • InventoryUpdated
  • CustomerRegistrations

Real-World Example

Suppose an Order Service processes customer orders.

The service should track:

  • Total Orders
  • Failed Orders
  • Processing Time
  • Average Response Time
  • JVM Memory
  • CPU
  • Active Threads

CloudWatch displays all these metrics in one dashboard.


Step 1: Add Dependencies

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-cloudwatch2</artifactId>
</dependency>

Step 2: Configure Spring Boot

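# Both sections belong under a single management key;
# YAML does not allow the same key twice at one level.
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
  metrics:
    export:
      cloudwatch:
        enabled: true
        namespace: CodeWithVenu/OrderService

Note that Spring Boot itself does not ship auto-configuration for the CloudWatch registry. These properties are read by Spring Cloud AWS, and the prefix depends on its version (management.metrics.export.cloudwatch in 2.x, management.cloudwatch.metrics.export in 3.x), so check the documentation for the version you use.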

Step 3: IAM Permissions

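The application requires permission to publish metrics.

Attach the AWS managed policy CloudWatchAgentServerPolicy to the role the application runs under, or, for least privilege, a custom policy that allows only this action:

cloudwatch:PutMetricData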

Step 4: Create Custom Metrics

Example Service

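import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.stereotype.Service;

@Service
public class OrderService {

    private final Counter ordersCounter;

    // Spring injects the configured MeterRegistry.
    public OrderService(MeterRegistry registry) {
        this.ordersCounter = registry.counter("orders.created");
    }

    public void createOrder() {
        // ... create the order, then count it
        ordersCounter.increment();
    }
}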

Every call to createOrder() increments the orders.created counter.
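Micrometer's CloudWatch registry is step-based: it publishes how many increments happened in each publishing interval rather than a running total. The plain-Java sketch below illustrates that idea only. `StepCounterSketch` and `pollStep` are hypothetical names, and Micrometer's real implementation differs in detail.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch of step-based counting; not Micrometer's actual class.
final class StepCounterSketch {

    private final LongAdder current = new LongAdder();

    // Called by application code, for example once per created order.
    void increment() {
        current.increment();
    }

    // Called once per publishing interval: returns the count for that
    // interval and starts a new interval at zero.
    long pollStep() {
        return current.sumThenReset();
    }

    public static void main(String[] args) {
        StepCounterSketch orders = new StepCounterSketch();
        orders.increment();
        orders.increment();
        System.out.println("first interval: " + orders.pollStep());
        orders.increment();
        System.out.println("second interval: " + orders.pollStep());
    }
}
```

Because each published datapoint is a per-interval count, the Sum statistic is usually the natural choice when graphing or alarming on a counter.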

Recording API Response Time

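import io.micrometer.core.instrument.Timer;

// registry is the injected MeterRegistry
Timer timer = Timer.builder("order.processing.time")
        .description("Time taken to process an order")
        .register(registry);

// Runs processOrder() and records how long it took
timer.record(() -> processOrder());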

CloudWatch records:

  • Average
  • Maximum
  • Count
  • Total Time
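The four values listed above can all be derived from the raw durations. The following plain-Java sketch shows the arithmetic; `TimerStatsSketch` and its methods are hypothetical and only mimic what a timer tracks.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the statistics a timer keeps; not Micrometer's Timer.
final class TimerStatsSketch {

    private final List<Duration> samples = new ArrayList<>();

    // Adds an already measured duration.
    void add(Duration duration) {
        samples.add(duration);
    }

    // Runs the task and records its duration, even if the task throws.
    void time(Runnable task) {
        long start = System.nanoTime();
        try {
            task.run();
        } finally {
            add(Duration.ofNanos(System.nanoTime() - start));
        }
    }

    long count() {
        return samples.size();
    }

    Duration total() {
        return samples.stream().reduce(Duration.ZERO, Duration::plus);
    }

    Duration max() {
        return samples.stream().max(Duration::compareTo).orElse(Duration.ZERO);
    }

    Duration average() {
        return samples.isEmpty() ? Duration.ZERO : total().dividedBy(samples.size());
    }

    public static void main(String[] args) {
        TimerStatsSketch timer = new TimerStatsSketch();
        timer.time(() -> System.out.println("processing order"));
        System.out.println("count=" + timer.count() + " total=" + timer.total());
    }
}
```

Note that an average can hide slow outliers, which is why the maximum is worth graphing alongside it.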

Recording Failures

Counter failedOrders = registry.counter("orders.failed");

try {
    process();
} catch (Exception ex) {
    // Count the failure, then rethrow so the caller still sees the error
    failedOrders.increment();
    throw ex;
}

Monitoring Flow

sequenceDiagram

participant User
participant API
participant Micrometer
participant CloudWatch
participant Alarm
participant Admin

User->>API: POST /orders

API->>Micrometer: Record Metrics

Micrometer->>CloudWatch: Publish Metrics

CloudWatch->>Alarm: Evaluate Threshold

Alarm->>Admin: Send Email

Useful JVM Metrics

With Spring Boot Actuator on the classpath, Micrometer automatically publishes JVM and runtime metrics such as:

  • JVM Heap
  • CPU Usage
  • Thread Count
  • GC Pause
  • Memory Used
  • Memory Max
  • Class Loading
  • System Load
  • Disk Usage
  • HTTP Requests
  • Tomcat Sessions
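Several of these values come from the JVM's standard management beans. As a rough illustration of where the numbers originate, the sketch below reads a few of them directly with java.lang.management. `JvmSnapshotSketch` is a hypothetical helper; the map keys imitate Micrometer's naming style but are otherwise just labels chosen for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that reads a few JVM values from the platform MXBeans.
final class JvmSnapshotSketch {

    static Map<String, Double> snapshot() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        Map<String, Double> values = new LinkedHashMap<>();
        values.put("jvm.memory.used", (double) heap.getUsed());
        // getMax() returns -1 when the maximum heap size is undefined.
        values.put("jvm.memory.max", (double) heap.getMax());
        values.put("jvm.threads.live",
                (double) ManagementFactory.getThreadMXBean().getThreadCount());
        values.put("jvm.classes.loaded",
                (double) ManagementFactory.getClassLoadingMXBean().getLoadedClassCount());
        // The load average is negative on platforms where it is unavailable.
        values.put("system.load.average.1m",
                ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage());
        return values;
    }

    public static void main(String[] args) {
        snapshot().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```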

Viewing Metrics

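In the AWS Console, navigate to:

CloudWatch → Metrics → CodeWithVenu/OrderService

The namespace is a single name and is listed with the other custom namespaces.

Choose the metric, for example orders.created, and view its graph. Micrometer's CloudWatch registry usually appends a suffix to the name, so the counter may appear as orders.created.count.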


Creating CloudWatch Alarm

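Suppose you want to be notified when more than 20 orders fail within 5 minutes:

orders.failed > 20 within 5 minutes

Create an alarm on the orders.failed metric using the Sum statistic, a period of 5 minutes, and a threshold of 20.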

Flow:

flowchart LR

A[Custom Metric] --> B[CloudWatch Alarm] --> C[SNS Topic] --> D[Email] --> E[Support Team]

Alarm States

CloudWatch alarms have three states.

State               Meaning
OK                  Everything healthy
ALARM               Threshold exceeded
INSUFFICIENT_DATA   Not enough data
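The sketch below shows, in plain Java, one simplified way the three states could be derived for a "greater than threshold" rule such as orders.failed > 20. `AlarmState` and `AlarmEvaluatorSketch` are hypothetical names. Real CloudWatch alarms are more flexible: they support configurable handling of missing data and "M out of N" datapoint rules, neither of which is modelled here.

```java
import java.util.List;

// The three alarm states, mirrored as a hypothetical enum.
enum AlarmState { OK, ALARM, INSUFFICIENT_DATA }

final class AlarmEvaluatorSketch {

    // periodValues holds one aggregated value per period, oldest first
    // (for example the 5-minute sum of failed orders).
    // The alarm fires only if the most recent evaluationPeriods values all exceed the threshold.
    static AlarmState evaluate(List<Double> periodValues, double threshold, int evaluationPeriods) {
        if (evaluationPeriods <= 0) {
            throw new IllegalArgumentException("evaluationPeriods must be positive");
        }
        int size = periodValues.size();
        if (size < evaluationPeriods) {
            return AlarmState.INSUFFICIENT_DATA; // not enough datapoints yet
        }
        List<Double> recent = periodValues.subList(size - evaluationPeriods, size);
        boolean allBreaching = recent.stream().allMatch(v -> v > threshold);
        return allBreaching ? AlarmState.ALARM : AlarmState.OK;
    }

    public static void main(String[] args) {
        System.out.println(evaluate(List.of(), 20, 1));
        System.out.println(evaluate(List.of(3.0, 25.0), 20, 1));
        System.out.println(evaluate(List.of(25.0, 3.0), 20, 1));
    }
}
```

Requiring several consecutive breaching periods is a common way to keep a single spike from paging the team.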

Example Alarms

  • High CPU: CPU > 80%
  • High Memory: Heap > 75%
  • Error Rate: Errors > 5%
  • API Latency: Average Response Time > 2 seconds
  • Low Disk Space: Remaining Disk < 10%
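A percentage threshold such as Errors > 5% depends on two values, the error count and the request count, over the same period. In CloudWatch this kind of alarm is typically built with a metric math expression over two metrics. The plain-Java sketch below shows only the arithmetic; `ErrorRateSketch` is a hypothetical name, and treating a period with no requests as a 0% error rate is an assumption made for this example.

```java
// Hypothetical helper showing the arithmetic behind an error-rate alarm.
final class ErrorRateSketch {

    // Returns the error rate as a percentage of all requests in the period.
    static double errorRatePercent(long errors, long requests) {
        if (requests <= 0) {
            return 0.0; // assumption: no traffic is treated as no errors
        }
        return 100.0 * errors / requests;
    }

    // True when the error rate is strictly above the threshold percentage.
    static boolean breaches(long errors, long requests, double thresholdPercent) {
        return errorRatePercent(errors, requests) > thresholdPercent;
    }

    public static void main(String[] args) {
        System.out.println(errorRatePercent(6, 100));
        System.out.println(breaches(6, 100, 5.0));
    }
}
```

Alarming on a rate rather than a raw count keeps the threshold meaningful as traffic grows or shrinks.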

SNS Notifications

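Create an SNS topic:

ProductionAlerts

Subscribe an address to it:

[email protected]

Whenever the alarm triggers, SNS can notify subscribers by email or SMS, or invoke a Lambda function. Tools such as Slack and PagerDuty can be reached through an HTTPS endpoint or a Lambda function subscribed to the topic.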


CloudWatch Dashboard

A production dashboard should include:

  • CPU
  • Memory
  • Heap
  • GC
  • Threads
  • API Requests
  • API Errors
  • Latency
  • Database Connections
  • Disk Usage
  • Network
  • Custom Business Metrics

Dashboard Architecture

flowchart LR

A[CloudWatch Metrics] --> B[Dashboard]

B --> C[Infrastructure]

B --> D[Application]

B --> E[Business Metrics]

Business Metrics Example

Track:

  • Orders
  • Payments
  • Refunds
  • Customers
  • Revenue
  • Inventory
  • Login Success
  • Login Failure

Business teams can also use dashboards.


Best Practices

  • Use a separate namespace for each application.
  • Create dashboards for every environment.
  • Monitor infrastructure and business metrics together.
  • Set meaningful alarm thresholds.
  • Use SNS for instant notifications.
  • Avoid creating unnecessary custom metrics.
  • Review alarm noise periodically.
  • Tag resources consistently.
  • Combine metrics with logs and traces for complete observability.
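One reason to avoid unnecessary custom metrics is cardinality: CloudWatch treats every unique combination of metric name and dimension values as its own metric, so a tag with many distinct values multiplies the number of metrics you publish. The plain-Java sketch below counts those combinations. `MetricCardinalitySketch` is a hypothetical helper, and the tag names are made up for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper that counts how many distinct metrics a set of tags produces.
final class MetricCardinalitySketch {

    // Each unique combination of metric name and tag values counts as one series.
    static int distinctSeries(String metricName, List<Map<String, String>> tagSets) {
        Set<String> series = new HashSet<>();
        for (Map<String, String> tags : tagSets) {
            // Sort the tags so the same combination always yields the same key.
            series.add(metricName + new TreeMap<String, String>(tags));
        }
        return series.size();
    }

    public static void main(String[] args) {
        List<Map<String, String>> byStatus = List.of(
                Map.of("status", "ok"), Map.of("status", "failed"), Map.of("status", "ok"));
        System.out.println(distinctSeries("orders.created", byStatus));
    }
}
```

A bounded tag such as status stays cheap, while an unbounded one such as a user or order ID creates a new metric for every value.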

Common Issues

Problem                  Solution
Metrics not appearing    Verify IAM permissions
Alarm never triggers     Check threshold and evaluation period
Missing custom metrics   Ensure Micrometer is configured
Delayed metrics          Wait 1–2 minutes for publishing
Too many alarms          Consolidate and tune thresholds

Enterprise Monitoring Flow

flowchart TD
    USERS["Users"]
    APP["Spring Boot"]
    MICROMETER["Micrometer"]
    METRICS["CloudWatch Metrics"]
    DASH["Dashboards"]
    ALARM["CloudWatch Alarm"]
    SNS["SNS"]
    DEVOPS["DevOps Team"]
    INVEST["Issue Investigation"]
    LOGS["CloudWatch Logs"]
    FIX["Application Fix"]

    USERS --> APP --> MICROMETER --> METRICS
    METRICS --> DASH
    METRICS --> ALARM --> SNS --> DEVOPS --> INVEST --> LOGS --> FIX

Production Monitoring Checklist

✅ Enable Spring Boot Actuator

✅ Configure Micrometer CloudWatch Registry

✅ Publish JVM metrics

✅ Publish business metrics

✅ Create CloudWatch Dashboard

✅ Configure SNS notifications

✅ Create CPU and Memory alarms

✅ Create Error Rate alarms

✅ Create Latency alarms

✅ Review alarms regularly


Summary

Amazon CloudWatch Metrics and Alarms provide the foundation for proactive monitoring in AWS. By integrating Spring Boot with Micrometer, you can automatically publish JVM and application metrics, visualize them on CloudWatch Dashboards, and configure alarms to detect issues before they impact users.

A complete production monitoring solution should combine:

  • Metrics for system health and trends
  • Logs for troubleshooting failures
  • Alarms for proactive notifications
  • Dashboards for operational visibility
  • Tracing for end-to-end request analysis

Together, these services form a robust observability platform for modern cloud-native Spring Boot applications.


Next Article

37-CloudWatch-Dashboards-SpringBoot.md

Learn how to build interactive CloudWatch Dashboards to visualize application health, infrastructure metrics, business KPIs, and operational trends in a single unified view.

