AWS CloudWatch Metrics and Alarms
Learn how to monitor Spring Boot applications using Amazon CloudWatch Metrics, custom metrics, dashboards, and alarms with a complete implementation guide.
Introduction
Monitoring is one of the most important aspects of running applications in production. While logs help diagnose issues after they occur, metrics help detect problems before users notice them.
Amazon CloudWatch Metrics continuously collect operational data from AWS resources and applications, while CloudWatch Alarms notify your team or automatically trigger actions when predefined thresholds are exceeded.
This guide demonstrates how to integrate a Spring Boot application with CloudWatch Metrics and build a production-ready monitoring solution.
What are CloudWatch Metrics?
A metric is a time-ordered set of numerical data points, such as CPU utilization sampled once per minute.
Examples include:
- CPU Utilization
- Memory Usage
- Disk Space
- Network Traffic
- HTTP Requests
- API Latency
- Error Count
- Active Users
- JVM Heap Usage
CloudWatch stores these values as time-series data.
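To make the idea of time-series data concrete, here is a minimal plain-JDK sketch of how raw data points might be grouped into fixed-length periods with statistics such as sum, average, and maximum. The types Datapoint, PeriodStats, and PeriodAggregator are hypothetical names made up for this illustration; they are not part of the CloudWatch SDK, and CloudWatch's own aggregation may differ in detail.

```java
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical types for illustration only; not CloudWatch SDK classes.
record Datapoint(Instant timestamp, double value) {}

record PeriodStats(long sampleCount, double sum, double min, double max) {
    // A bucket always holds at least one sample, so this never divides by zero.
    double average() { return sum / sampleCount; }

    // Fold one more value into the running statistics.
    PeriodStats plus(double v) {
        return new PeriodStats(sampleCount + 1, sum + v, Math.min(min, v), Math.max(max, v));
    }

    static PeriodStats of(double v) { return new PeriodStats(1, v, v, v); }
}

final class PeriodAggregator {
    // Group data points into fixed-length periods, keyed by the period start time.
    static Map<Instant, PeriodStats> aggregate(List<Datapoint> points, long periodSeconds) {
        Map<Instant, PeriodStats> buckets = new TreeMap<>();
        for (Datapoint p : points) {
            long epoch = p.timestamp().getEpochSecond();
            // Round the timestamp down to the start of its period.
            Instant start = Instant.ofEpochSecond(epoch - Math.floorMod(epoch, periodSeconds));
            buckets.merge(start, PeriodStats.of(p.value()), (old, fresh) -> old.plus(p.value()));
        }
        return buckets;
    }
}
```

With a 60-second period, two values reported at second 0 and second 30 land in the same bucket, while a value reported at second 61 starts a new one.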
CloudWatch Monitoring Architecture
flowchart LR
A[Users]
B[Spring Boot Application]
C[Micrometer]
D[CloudWatch Metrics]
E[CloudWatch Dashboard]
F[CloudWatch Alarm]
G[SNS Email]
H[Auto Scaling]
A --> B
B --> C
C --> D
D --> E
D --> F
F --> G
F --> H
Types of Metrics
AWS Service Metrics
Automatically collected.
Examples:
- EC2 CPU
- RDS Connections
- Lambda Invocations
- SQS Queue Length
- ALB Request Count
No application changes required.
Application Metrics
Generated by your application.
Examples:
- Orders Processed
- Login Success
- Failed Payments
- API Response Time
- Business Transactions
Custom Metrics
Developer-defined metrics.
Examples:
- OrdersCreated
- PaymentsFailed
- InventoryUpdated
- CustomerRegistrations
Real-World Example
Suppose an Order Service processes customer orders.
For this service, you would want to track:
- Total Orders
- Failed Orders
- Processing Time
- Average Response Time
- JVM Memory
- CPU
- Active Threads
CloudWatch displays all these metrics in one dashboard.
Step 1: Add Dependencies
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-cloudwatch2</artifactId>
</dependency>
Note that Spring Boot itself does not auto-configure a CloudWatch registry. The cloudwatch export properties used in Step 2 are read by Spring Cloud AWS, so either add its metrics starter as well or define a CloudWatchMeterRegistry bean yourself.
Step 2: Configure Spring Boot
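# A YAML document may contain only one top-level "management" key,
# so the endpoint and metrics settings are merged under it.
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
  # This prefix is the one used by Spring Cloud AWS 2.x. Newer versions are
  # expected to use management.cloudwatch.metrics.export.* instead, so check
  # the documentation for the version you run.
  metrics:
    export:
      cloudwatch:
        enabled: true
        namespace: CodeWithVenu/OrderService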
Step 3: IAM Permissions
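The IAM role the application runs under (for example, an EC2 instance profile or ECS task role) needs permission to publish metrics.
Attach either the AWS managed policy:
CloudWatchAgentServerPolicy
or, for least privilege, a custom policy that allows only:
cloudwatch:PutMetricData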
Step 4: Create Custom Metrics
Example Service
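import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.stereotype.Service;

@Service
public class OrderService {

    private final Counter ordersCounter;

    // Spring injects the configured MeterRegistry.
    public OrderService(MeterRegistry registry) {
        this.ordersCounter = registry.counter("orders.created");
    }

    public void createOrder() {
        // ... create the order, then count it
        ordersCounter.increment();
    }
}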
Each call to createOrder() increments the orders.created counter by one.
Recording API Response Time
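// Register (or look up) a timer for order processing.
Timer timer = Timer.builder("order.processing.time")
        .register(registry);

// Run the wrapped call and record how long it took.
timer.record(() -> {
    processOrder();
});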
For each timer, the following values are published to CloudWatch:
- Average
- Maximum
- Count
- Total Time
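All four values can be derived from the individual durations a timer records. The sketch below shows that bookkeeping using only the JDK. SimpleTimer is a hypothetical stand-in written for this article, not the Micrometer Timer class, and it keeps a lifetime maximum for brevity, whereas a real registry typically reports values per publishing interval.

```java
import java.time.Duration;

// Hypothetical stand-in for a timer; not the Micrometer Timer class.
final class SimpleTimer {
    private long count;
    private long totalNanos;
    private long maxNanos;

    // Record a duration that was measured elsewhere.
    synchronized void record(Duration d) {
        long nanos = d.toNanos();
        count++;
        totalNanos += nanos;
        maxNanos = Math.max(maxNanos, nanos);
    }

    // Time a block of work and record its duration, even if it throws.
    void record(Runnable work) {
        long start = System.nanoTime();
        try {
            work.run();
        } finally {
            record(Duration.ofNanos(System.nanoTime() - start));
        }
    }

    synchronized long count() { return count; }

    synchronized Duration total() { return Duration.ofNanos(totalNanos); }

    synchronized Duration max() { return Duration.ofNanos(maxNanos); }

    // Average is total time divided by count; zero when nothing was recorded.
    synchronized Duration average() {
        return count == 0 ? Duration.ZERO : Duration.ofNanos(totalNanos / count);
    }
}
```

Recording 100 ms and 300 ms gives a count of 2, a total of 400 ms, a maximum of 300 ms, and an average of 200 ms.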
Recording Failures
Counter failedOrders = registry.counter("orders.failed");

try {
    process();
} catch (Exception ex) {
    // Count the failure, then rethrow so the caller still sees the error.
    failedOrders.increment();
    throw ex;
}
Monitoring Flow
sequenceDiagram
participant User
participant API
participant Micrometer
participant CloudWatch
participant Alarm
participant Admin
User->>API: POST /orders
API->>Micrometer: Record Metrics
Micrometer->>CloudWatch: Publish Metrics
CloudWatch->>Alarm: Evaluate Threshold
Alarm->>Admin: Send Email
Useful JVM Metrics
With Spring Boot Actuator on the classpath, Micrometer automatically registers metrics such as:
- JVM Heap
- CPU Usage
- Thread Count
- GC Pause
- Memory Used
- Memory Max
- Class Loading
- System Load
- Disk Usage
- HTTP Requests
- Tomcat Sessions
Viewing Metrics
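In the AWS Console, navigate to:
CloudWatch → Metrics → All metrics → Custom namespaces → CodeWithVenu/OrderService
The namespace appears as a single entry, exactly as configured in Step 2.
Choose the metric. The registry usually appends a statistic suffix to the name, so the counter from Step 4 typically appears as:
orders.created.count
Select it to view the graph.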
Creating CloudWatch Alarm
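Suppose you want to be alerted when more than 20 orders fail within 5 minutes.
Create an alarm on the orders.failed metric with these settings:
- Statistic: Sum
- Period: 5 minutes (300 seconds)
- Condition: Greater than 20
- Evaluation periods: 1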
Flow:
flowchart LR
A[Custom Metric] --> B[CloudWatch Alarm] --> C[SNS Topic] --> D[Email] --> E[Support Team]
Alarm States
CloudWatch alarms have three states.
| State | Meaning |
|---|---|
| OK | Metric is within the defined threshold |
| ALARM | Metric has breached the threshold |
| INSUFFICIENT_DATA | The alarm has just started, or not enough data is available to decide |
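
The way an alarm moves between these three states can be sketched in a few lines of plain Java. This is a simplified model under stated assumptions: AlarmEvaluator and its parameters are hypothetical names for this article, missing periods are simply ignored, and CloudWatch's real evaluation offers more options (for example, configurable treatment of missing data).

```java
import java.util.List;
import java.util.OptionalDouble;

enum AlarmState { OK, ALARM, INSUFFICIENT_DATA }

// Hypothetical evaluator for illustration; not a CloudWatch SDK class.
final class AlarmEvaluator {
    /**
     * @param periods           one entry per evaluation period, oldest first;
     *                          an empty entry means no data was reported
     * @param threshold         a period breaches when its value is strictly greater
     * @param datapointsToAlarm number of breaching periods needed to enter ALARM
     */
    static AlarmState evaluate(List<OptionalDouble> periods, double threshold, int datapointsToAlarm) {
        long present = periods.stream().filter(OptionalDouble::isPresent).count();
        if (present == 0) {
            // Nothing to judge yet.
            return AlarmState.INSUFFICIENT_DATA;
        }
        long breaching = periods.stream()
                .filter(OptionalDouble::isPresent)
                .filter(p -> p.getAsDouble() > threshold)
                .count();
        return breaching >= datapointsToAlarm ? AlarmState.ALARM : AlarmState.OK;
    }
}
```

For the orders.failed example, a single 5-minute period with 25 failures and a threshold of 20 evaluates to ALARM, exactly 20 failures stays OK, and a period with no data yields INSUFFICIENT_DATA.
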
Example Alarms
| Alarm | Condition |
|---|---|
| High CPU | CPU > 80% |
| High Memory | Heap > 75% |
| Error Rate | Errors > 5% |
| API Latency | Average Response Time > 2 seconds |
| Low Disk Space | Remaining Disk < 10% |
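
The error-rate alarm is the one condition above that depends on two metrics: the error count and the total request count for the same period. In CloudWatch this kind of ratio is usually expressed with metric math. The plain-Java sketch below only shows the arithmetic; ErrorRate is a hypothetical helper, and returning 0 for a period without traffic is an assumption you may want to change.

```java
// Hypothetical helper for illustration; not part of any AWS or Micrometer API.
final class ErrorRate {
    // Percentage of failed requests; 0 when there was no traffic (avoids dividing by zero).
    static double percent(long errors, long requests) {
        if (requests <= 0) {
            return 0.0;
        }
        return 100.0 * errors / requests;
    }

    // True when the error percentage is strictly above the threshold.
    static boolean breaches(long errors, long requests, double thresholdPercent) {
        return percent(errors, requests) > thresholdPercent;
    }
}
```

With a 5% threshold, 6 errors out of 100 requests breaches the alarm condition, while exactly 5 out of 100 does not.
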
SNS Notifications
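Create an SNS topic:
ProductionAlerts
Subscribe your team's email address to the topic, then confirm the subscription from the confirmation email.
When the alarm triggers, SNS can deliver the notification to:
- Email
- SMS
- Lambda
- HTTP/HTTPS endpoints
Tools such as Slack and PagerDuty can be notified through an HTTPS endpoint or a Lambda function subscribed to the topic.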
CloudWatch Dashboard
A production dashboard should include:
- CPU
- Memory
- Heap
- GC
- Threads
- API Requests
- API Errors
- Latency
- Database Connections
- Disk Usage
- Network
- Custom Business Metrics
Dashboard Architecture
flowchart LR
A[CloudWatch Metrics] --> B[Dashboard]
B-->C[Infrastructure]
B-->D[Application]
B-->E[Business Metrics]
Business Metrics Example
Track:
- Orders
- Payments
- Refunds
- Customers
- Revenue
- Inventory
- Login Success
- Login Failure
Business teams can also use dashboards.
Best Practices
- Use namespaces for each application.
- Create dashboards for every environment.
- Monitor infrastructure and business metrics together.
- Set meaningful alarm thresholds.
- Use SNS for instant notifications.
- Avoid creating unnecessary custom metrics.
- Review alarm noise periodically.
- Tag resources consistently.
- Combine metrics with logs and traces for complete observability.
Common Issues
| Problem | Solution |
|---|---|
| Metrics not appearing | Verify IAM permissions |
| Alarm never triggers | Check threshold and evaluation period |
| Missing custom metrics | Ensure Micrometer is configured |
| Delayed metrics | Wait 1–2 minutes for publishing |
| Too many alarms | Consolidate and tune thresholds |
Enterprise Monitoring Flow
flowchart TD
USERS["Users"]
APP["Spring Boot"]
MICROMETER["Micrometer"]
METRICS["CloudWatch Metrics"]
DASH["Dashboards"]
ALARM["CloudWatch Alarm"]
SNS["SNS"]
DEVOPS["DevOps Team"]
INVEST["Issue Investigation"]
LOGS["CloudWatch Logs"]
FIX["Application Fix"]
USERS --> APP --> MICROMETER --> METRICS
METRICS --> DASH
METRICS --> ALARM --> SNS --> DEVOPS --> INVEST --> LOGS --> FIX
Production Monitoring Checklist
✅ Enable Spring Boot Actuator
✅ Configure Micrometer CloudWatch Registry
✅ Publish JVM metrics
✅ Publish business metrics
✅ Create CloudWatch Dashboard
✅ Configure SNS notifications
✅ Create CPU and Memory alarms
✅ Create Error Rate alarms
✅ Create Latency alarms
✅ Review alarms regularly
Summary
Amazon CloudWatch Metrics and Alarms provide the foundation for proactive monitoring in AWS. By integrating Spring Boot with Micrometer, you can automatically publish JVM and application metrics, visualize them on CloudWatch Dashboards, and configure alarms to detect issues before they impact users.
A complete production monitoring solution should combine:
- Metrics for system health and trends
- Logs for troubleshooting failures
- Alarms for proactive notifications
- Dashboards for operational visibility
- Tracing for end-to-end request analysis
Together, these services form a robust observability platform for modern cloud-native Spring Boot applications.
Next Article
37-CloudWatch-Dashboards-SpringBoot.md
Learn how to build interactive CloudWatch Dashboards to visualize application health, infrastructure metrics, business KPIs, and operational trends in a single unified view.