Database Partitioning in System Design
Learn Database Partitioning from a System Design perspective. Understand horizontal and vertical partitioning, range, list, hash, composite partitioning, partition pruning, Spring Boot integration, PostgreSQL table partitioning, Oracle partitioning, and real-world examples from Amazon, Netflix, Uber, and Banking systems.
Introduction
Imagine you have an Orders table with:
- 5 Billion Orders
- 20 TB Database Size
- 100 Million New Orders Every Month
Queries that have to scan the table get slower as it grows.
SELECT *
FROM orders
WHERE order_date >= '2026-01-01';
Without partitioning (or a suitable index on order_date),
the database may have to scan billions of rows to answer this query.
As data grows:
- Queries become slower
- Indexes become larger
- Backups take longer
- Maintenance becomes difficult
Instead of storing everything in one huge table,
modern databases split one table into smaller partitions.
This technique is called Database Partitioning.
Learning Objectives
After completing this article, you'll understand:
- What is Database Partitioning?
- Why Partition Tables?
- Horizontal vs Vertical Partitioning
- Range Partitioning
- List Partitioning
- Hash Partitioning
- Composite Partitioning
- Partition Pruning
- Spring Boot Integration
- PostgreSQL Partitioning
- Real-world Examples
- Best Practices
What is Database Partitioning?
Partitioning divides one large table into multiple smaller partitions.
Each partition behaves like an independent table,
but applications still query it as a single logical table.
Single Large Table
flowchart TD
APP[Spring Boot Application]
TABLE[(Orders Table<br/>5 Billion Rows)]
APP --> TABLE
Problems
- Large indexes
- Slow scans
- Expensive maintenance
- Long backup times
Partitioned Table
flowchart TD
APP[Spring Boot Application]
PARENT[(Orders Table)]
P2024[(Orders_2024)]
P2025[(Orders_2025)]
P2026[(Orders_2026)]
APP --> PARENT
PARENT --> P2024
PARENT --> P2025
PARENT --> P2026
Applications still query Orders.
The database routes requests to the appropriate partition.
Why Partition Tables?
Imagine querying only 2026 orders.
Without partitioning
5 Billion Rows
↓
Full Table Scan
With partitioning
Orders_2026
↓
About 1.2 Billion Rows (12 months × 100 Million orders)
↓
Fast Query
Partitioning vs Sharding
| Partitioning | Sharding |
|---|---|
| Within one database | Across multiple databases |
| Logical table split | Database split |
| Same server (usually) | Multiple servers |
| Easier joins | Cross-shard complexity |
Horizontal Partitioning
Rows are divided into partitions.
Example
Orders
↓
2024 Orders
2025 Orders
2026 Orders
Horizontal Partitioning Diagram
flowchart TD
ORDERS[(Orders)]
P1[(2024)]
P2[(2025)]
P3[(2026)]
ORDERS --> P1
ORDERS --> P2
ORDERS --> P3
Each partition stores different rows.
Vertical Partitioning
Columns are separated.
Example
Customer Table
CustomerId
Name
Phone
Address
Profile Picture
Biography
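Move the large, rarely read columns (Profile Picture and Biography) into a separate Customer Profile table that shares the CustomerId key.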
Vertical Partitioning Diagram
flowchart LR
MAIN[(Customer)]
PROFILE[(Customer Profile)]
MAIN --> PROFILE
Frequently used data remains small.
Large columns are loaded only when required.
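As a rough illustration of the split described above, the following Python sketch models the two tables as dictionaries keyed by customer id. The sample row, the `HOT_COLUMNS`/`COLD_COLUMNS` names, and the `load_customer` helper are assumptions made for this example, not part of any database API.

```python
# Hypothetical customer rows; the columns follow the Customer Table example.
customers = [
    {
        "customer_id": 1,
        "name": "Asha",
        "phone": "555-0100",
        "address": "12 Main St",
        "profile_picture": b"\x89PNG...",        # stand-in for a large binary value
        "biography": "Long free-text biography",
    },
]

# Small, frequently read columns stay in the main table.
HOT_COLUMNS = ("customer_id", "name", "phone", "address")
# Large, rarely read columns move to the profile table (same key).
COLD_COLUMNS = ("customer_id", "profile_picture", "biography")


def split_vertically(rows):
    """Split each row into a main part and a profile part, keyed by customer_id."""
    main = {r["customer_id"]: {c: r[c] for c in HOT_COLUMNS} for r in rows}
    profile = {r["customer_id"]: {c: r[c] for c in COLD_COLUMNS} for r in rows}
    return main, profile


main_table, profile_table = split_vertically(customers)


def load_customer(customer_id, with_profile=False):
    """Read the small main row; touch the profile table only when asked."""
    row = dict(main_table[customer_id])
    if with_profile:
        row.update(profile_table[customer_id])
    return row


print(load_customer(1))                     # no picture or biography loaded
print(sorted(load_customer(1, with_profile=True)))
```

Most reads go through `load_customer(1)` and never touch the large columns; only a profile page would pass `with_profile=True`.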
Range Partitioning
Rows are partitioned based on value ranges.
Example
Order Date
2024
↓
Partition 1
2025
↓
Partition 2
2026
↓
Partition 3
Range Partition Diagram
flowchart TD
ORDERS[(Orders)]
P2024[(2024)]
P2025[(2025)]
P2026[(2026)]
ORDERS --> P2024
ORDERS --> P2025
ORDERS --> P2026
Ideal for
- Financial transactions
- Audit logs
- Historical reports
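The routing rule behind range partitioning can be sketched in a few lines of Python. This is only a model of the idea, assuming yearly partitions with hypothetical names `orders_2024` to `orders_2026`; a real database performs this lookup internally.

```python
from bisect import bisect_right
from datetime import date

# Sorted partition boundaries. Partition i covers [BOUNDS[i], BOUNDS[i + 1]).
BOUNDS = [date(2024, 1, 1), date(2025, 1, 1), date(2026, 1, 1), date(2027, 1, 1)]
NAMES = ["orders_2024", "orders_2025", "orders_2026"]  # hypothetical names


def range_partition_for(order_date):
    """Return the partition whose date range contains order_date."""
    idx = bisect_right(BOUNDS, order_date) - 1
    if idx < 0 or idx >= len(NAMES):
        raise ValueError(f"no partition covers {order_date}")
    return NAMES[idx]


print(range_partition_for(date(2025, 6, 15)))  # orders_2025
print(range_partition_for(date(2026, 1, 1)))   # orders_2026 (lower bound is inclusive)
```

A date outside every range raises an error here, which mirrors why time-based schemes need new partitions created ahead of the data.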
List Partitioning
Rows are grouped by category.
Example
Country
USA
↓
Partition_US
India
↓
Partition_IN
UK
↓
Partition_UK
List Partition Diagram
flowchart TD
CUSTOMER[(Customers)]
US[(USA)]
IN[(India)]
UK[(United Kingdom)]
CUSTOMER --> US
CUSTOMER --> IN
CUSTOMER --> UK
Useful for regional applications.
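List partitioning is essentially a lookup from a discrete value to a partition. The sketch below uses the partition names from the example above; the `default` argument is an assumption of this sketch, standing in for the catch-all partition that some databases offer.

```python
# Explicit value-to-partition mapping, taken from the Country example.
PARTITION_BY_COUNTRY = {
    "USA": "Partition_US",
    "India": "Partition_IN",
    "UK": "Partition_UK",
}


def list_partition_for(country, default=None):
    """Return the partition for a country, or the default if one is given."""
    if country in PARTITION_BY_COUNTRY:
        return PARTITION_BY_COUNTRY[country]
    if default is not None:
        return default
    raise KeyError(f"no partition accepts country {country!r}")


print(list_partition_for("India"))                               # Partition_IN
print(list_partition_for("Brazil", default="Partition_DEFAULT"))  # Partition_DEFAULT
```

Without a default, a value that is not in any list is rejected, so new categories need a partition before rows for them arrive.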
Hash Partitioning
A hash function determines the partition.
Formula
hash(CustomerId) % NumberOfPartitions
In the simplest case the key itself is used: CustomerId % NumberOfPartitions
Hash Partition Diagram
flowchart TD
ID[Customer ID]
HASH[Hash Function]
P1[(Partition 1)]
P2[(Partition 2)]
P3[(Partition 3)]
ID --> HASH
HASH --> P1
HASH --> P2
HASH --> P3
Advantages
- Even distribution
- Balanced storage
- Predictable routing
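To make the routing and the "even distribution" claim concrete, here is a small Python sketch. It hashes the key with SHA-256 before taking the modulo; that choice is an assumption of this example (Python's built-in `hash()` is randomized per process for strings), and real databases use their own internal hash functions.

```python
import hashlib
from collections import Counter


def hash_partition_for(key, num_partitions):
    """Map a key to a partition number in [0, num_partitions) with a stable hash."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# Route 30,000 hypothetical customer ids across 3 partitions and count each one.
counts = Counter(hash_partition_for(cid, 3) for cid in range(1, 30001))
for partition, rows in sorted(counts.items()):
    print(partition, rows)
```

The same key always lands in the same partition, which is what makes routing predictable. Changing the number of partitions changes the result of the modulo for most keys, so resizing a hash-partitioned table generally means moving data.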
Composite Partitioning
Combination of multiple strategies.
Example
Range + Hash
flowchart TD
ORDERS[(Orders)]
YEAR[(2026)]
HASH[Hash]
P1[(Partition A)]
P2[(Partition B)]
ORDERS --> YEAR
YEAR --> HASH
HASH --> P1
HASH --> P2
Common in large systems where a single range partition is still too big, so each range is split again by hash.
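A Range + Hash scheme can be sketched as two routing steps applied in order. The partition naming (`orders_<year>_h<bucket>`) and the use of a plain modulo on an integer customer id are assumptions made for this illustration.

```python
from datetime import date


def composite_partition_for(order_date, customer_id, buckets=2):
    """First pick the yearly range, then a hash bucket inside that range."""
    year = order_date.year           # range step: one group per year
    bucket = customer_id % buckets   # hash step: simple modulo on an integer id
    return f"orders_{year}_h{bucket}"


print(composite_partition_for(date(2026, 3, 14), customer_id=7))  # orders_2026_h1
print(composite_partition_for(date(2026, 3, 14), customer_id=8))  # orders_2026_h0
```

A query that filters on both the date and the customer id can be narrowed to a single sub-partition; a query that filters only on the date still skips every other year.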
Partition Pruning
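Partition pruning is one of the biggest advantages of partitioning: the query planner skips every partition that cannot contain matching rows.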
Query
SELECT *
FROM orders
WHERE order_date >= '2026-01-01'
AND order_date < '2027-01-01';
Database scans only
Orders_2026
instead of all partitions.
Partition Pruning Diagram
flowchart TD
QUERY[SQL Query]
PARENT[(Orders)]
P2024[(2024)]
P2025[(2025)]
P2026[(2026)]
QUERY --> PARENT
PARENT --> P2026
Only the matching partition is accessed.
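The pruning decision amounts to an interval-overlap test between the query's date range and each partition's bounds. The following Python sketch models that test under the assumption of yearly partitions with half-open ranges; the `PARTITIONS` table and `prune` function are illustrative names, not a database API.

```python
from datetime import date

# Each partition covers the half-open range [lower, upper).
PARTITIONS = {
    "orders_2024": (date(2024, 1, 1), date(2025, 1, 1)),
    "orders_2025": (date(2025, 1, 1), date(2026, 1, 1)),
    "orders_2026": (date(2026, 1, 1), date(2027, 1, 1)),
}


def prune(query_start, query_end):
    """Return partitions whose range overlaps [query_start, query_end)."""
    return [
        name
        for name, (lower, upper) in PARTITIONS.items()
        if lower < query_end and query_start < upper
    ]


# Same predicate as the SQL query above: only one partition survives.
print(prune(date(2026, 1, 1), date(2027, 1, 1)))   # ['orders_2026']
# A range that crosses a year boundary touches two partitions.
print(prune(date(2025, 12, 1), date(2026, 2, 1)))  # ['orders_2025', 'orders_2026']
```

Pruning only works when the query filters on the partition key; a query without an `order_date` condition would overlap every partition.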
PostgreSQL Partitioning
PostgreSQL supports
- RANGE
- LIST
- HASH
Example
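-- Parent table: defines the partitioning scheme, rows live in its partitions
CREATE TABLE orders (
    id BIGINT,
    order_date DATE
)
PARTITION BY RANGE (order_date);
-- One child partition per year; the upper bound is exclusive
CREATE TABLE orders_2026 PARTITION OF orders
    FOR VALUES FROM ('2026-01-01') TO ('2027-01-01');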
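Range-partitioned tables need a child partition for each period, and these are often created by a scheduled job. As a hedged sketch, the Python helper below builds the PostgreSQL `CREATE TABLE ... PARTITION OF` statement for one year; the function name and the `<parent>_<year>` naming are assumptions, and the string formatting is only suitable for trusted, hard-coded identifiers.

```python
from datetime import date


def yearly_partition_ddl(parent, year):
    """Build the DDL for one yearly partition of a range-partitioned table."""
    start = date(year, 1, 1)
    end = date(year + 1, 1, 1)  # upper bound is exclusive in PostgreSQL
    return (
        f"CREATE TABLE {parent}_{year} PARTITION OF {parent}\n"
        f"    FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}');"
    )


# Generate statements for the three years used throughout this article.
for y in (2024, 2025, 2026):
    print(yearly_partition_ddl("orders", y))
```

The generated statements would then be run by a migration tool or a database client; this sketch only produces the text.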
Oracle Partitioning
Oracle supports
- Range
- List
- Hash
- Interval
- Composite
Widely used in Banking applications.
Spring Boot Architecture
flowchart TD
CLIENT[React]
API[Spring Boot]
DB[(Partitioned PostgreSQL)]
P1[(2024)]
P2[(2025)]
P3[(2026)]
CLIENT --> API
API --> DB
DB --> P1
DB --> P2
DB --> P3
Spring Boot queries the parent table.
The database automatically routes requests.
Banking Example
Transaction Table
Transactions
↓
2023
↓
2024
↓
2025
↓
2026
Benefits
- Faster monthly statements
- Easier archiving
- Faster compliance reporting
Amazon Example
Order History
Orders
↓
Monthly Partitions
↓
Query Current Month
Old partitions are archived separately.
Netflix Example
Viewing history is partitioned by
- Date
- Region
- Customer
This reduces query cost because each request touches only the relevant partitions, and older partitions can be moved to cheaper storage.
Uber Example
Trips are partitioned by
- City
- Month
Queries for a single city don't scan all global trips.
Partition Maintenance
Old partitions can be archived.
flowchart LR
ACTIVE[(Active Partition)]
ARCHIVE[(Archive)]
DELETE[(Drop Old Partition)]
ACTIVE --> ARCHIVE
ARCHIVE --> DELETE
Detaching or dropping a whole partition is mostly a metadata operation, so it is far cheaper than removing billions of rows with DELETE.
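A retention job for the flow above can be sketched as a function that decides which yearly partitions have aged out and emits the statements to remove them. The function name, the `<parent>_<year>` naming, and the choice of PostgreSQL's `DETACH PARTITION` followed by `DROP TABLE` are assumptions of this example; the archive step (for instance, exporting the detached table) would run between the two statements.

```python
def retention_plan(partition_years, current_year, keep_years, parent="orders"):
    """Return SQL statements that remove partitions older than the retention window."""
    cutoff = current_year - keep_years + 1  # oldest year that is still kept
    statements = []
    for year in sorted(partition_years):
        if year < cutoff:
            child = f"{parent}_{year}"
            # Detach first so the table can be archived without blocking the parent.
            statements.append(f"ALTER TABLE {parent} DETACH PARTITION {child};")
            # After archiving, the detached table is dropped in one operation.
            statements.append(f"DROP TABLE {child};")
    return statements


# Keep the three most recent years (2024 to 2026); 2023 is retired.
for stmt in retention_plan([2023, 2024, 2025, 2026], current_year=2026, keep_years=3):
    print(stmt)
```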
Advantages
- Faster Queries
- Smaller Indexes
- Easier Maintenance
- Faster Backup
- Better Performance
- Easier Archiving
- Reduced Lock Contention
Disadvantages
- More Complex Design
- Poor Partition Key can hurt performance
- Cross-partition queries can be slower
- Additional maintenance and partition management overhead (creating, archiving, and dropping partitions)
Monitoring
Monitor
- Partition Size
- Query Latency
- Partition Pruning
- Index Usage
- Storage Growth
- Vacuum/Analyze Statistics
- Slow Queries
- CPU Usage
Tools
- PostgreSQL EXPLAIN ANALYZE
- pg_stat_statements
- Oracle AWR
- Datadog
- Grafana
- Prometheus
Common Mistakes
❌ Creating too many partitions
❌ Choosing the wrong partition key
❌ Ignoring partition pruning
❌ Querying across all partitions unnecessarily
❌ Not archiving old partitions
❌ Forgetting local/global index maintenance
Best Practices
- Choose a partition key that matches common query patterns.
- Use Range partitioning for time-series data.
- Use Hash partitioning for even distribution.
- Archive old partitions instead of deleting rows.
- Keep partitions balanced in size.
- Monitor partition pruning using execution plans.
- Rebuild indexes when necessary.
- Combine partitioning with indexing for maximum performance.
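One simple way to check whether partitions stay balanced is to compare the largest partition with the average size. The sketch below assumes the row counts have already been collected from the database; the `skew_ratio` name and the sample numbers are hypothetical, and what counts as an acceptable ratio depends on the workload.

```python
def skew_ratio(row_counts):
    """Return largest partition size divided by the mean size (1.0 = balanced)."""
    sizes = list(row_counts.values())
    if not sizes:
        raise ValueError("no partitions given")
    mean = sum(sizes) / len(sizes)
    return max(sizes) / mean


# Hypothetical row counts per partition.
balanced = {"p0": 100, "p1": 100, "p2": 100}
skewed = {"p0": 100, "p1": 100, "p2": 400}

print(skew_ratio(balanced))  # 1.0
print(skew_ratio(skewed))    # 2.0
```

A ratio well above 1.0 suggests that one partition holds a disproportionate share of the rows, which often points to a poorly chosen partition key.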
Common Interview Questions
What is Database Partitioning?
Database Partitioning is the process of dividing one large table into smaller logical partitions to improve performance and simplify maintenance.
What is the difference between Partitioning and Sharding?
Partitioning splits a table within the same database, while Sharding distributes data across multiple independent database servers.
What is Partition Pruning?
Partition Pruning is an optimization where the database scans only the relevant partitions instead of the entire table, significantly reducing query execution time.
Which partitioning strategy is best for time-series data?
Range Partitioning is the preferred strategy for time-based data such as orders, transactions, logs, and audit records.
Can Spring Boot work with partitioned tables?
Yes. Spring Boot and JPA interact with the parent table, while the database transparently routes data to the correct partition.
Summary
Database Partitioning is a powerful optimization technique for managing very large tables. It improves query performance, reduces maintenance effort, and enables efficient handling of billions of rows, usually with little or no change to application code.
In this article, we covered:
- Database Partitioning fundamentals
- Horizontal & Vertical Partitioning
- Range, List, Hash, and Composite Partitioning
- Partition Pruning
- PostgreSQL & Oracle support
- Spring Boot integration
- Banking, Amazon, Netflix, and Uber examples
- Monitoring
- Best practices
Partitioning is typically introduced before database sharding and works exceptionally well for large transactional tables such as orders, payments, logs, and audit records. Combined with proper indexing and replication, it forms a key building block for scalable enterprise database architectures.