Database Partitioning in System Design

Learn Database Partitioning from a System Design perspective. Understand horizontal and vertical partitioning, range, list, hash, composite partitioning, partition pruning, Spring Boot integration, PostgreSQL table partitioning, Oracle partitioning, and real-world examples from Amazon, Netflix, Uber, and Banking systems.

Introduction

Imagine you have an Orders table with:

5 Billion Orders
20 TB Database Size
100 Million New Orders Every Month

Queries that cannot use a selective index get slower as the table grows.

SELECT *
FROM orders
WHERE order_date >= '2026-01-01';

Without partitioning, and without a suitable index on order_date, the database may have to scan billions of rows.

As data grows:

Queries become slower
Indexes become larger
Backups take longer
Maintenance becomes difficult

Instead of storing everything in one huge table, modern databases can split one table into smaller partitions.

This technique is called Database Partitioning.

Learning Objectives

After completing this article, you'll understand:

What is Database Partitioning?
Why Partition Tables?
Horizontal vs Vertical Partitioning
Range Partitioning
List Partitioning
Hash Partitioning
Composite Partitioning
Partition Pruning
Spring Boot Integration
PostgreSQL Partitioning
Real-world Examples
Best Practices

What is Database Partitioning?

Partitioning divides one large table into multiple smaller partitions.

Each partition behaves like an independent table, but applications still query the whole set as a single logical table.

Single Large Table

flowchart TD
    APP[Spring Boot Application]
    TABLE[(Orders Table<br/>5 Billion Rows)]

    APP --> TABLE

Problems

Large indexes
Slow scans
Expensive maintenance
Long backup times

Partitioned Table

flowchart TD
    APP[Spring Boot Application]

    PARENT[(Orders Table)]

    P2024[(Orders_2024)]
    P2025[(Orders_2025)]
    P2026[(Orders_2026)]

    APP --> PARENT

    PARENT --> P2024
    PARENT --> P2025
    PARENT --> P2026

Applications still query Orders.

The database routes requests to the appropriate partition.

Why Partition Tables?

Imagine querying only 2026 orders.

Without partitioning

5 Billion Rows

↓

Full Table Scan

With partitioning

Orders_2026

↓

1.2 Billion Rows (12 months × 100 Million)

↓

Fast Query

Partitioning vs Sharding

Partitioning	Sharding
Within one database	Across multiple databases
Logical table split	Database split
Same server (usually)	Multiple servers
Easier joins	Cross-shard complexity

Horizontal Partitioning

Rows are divided into partitions.

Example

Orders

↓

2024 Orders

2025 Orders

2026 Orders

Horizontal Partitioning Diagram

flowchart TD
    ORDERS[(Orders)]

    P1[(2024)]
    P2[(2025)]
    P3[(2026)]

    ORDERS --> P1
    ORDERS --> P2
    ORDERS --> P3

Each partition stores different rows.

Vertical Partitioning

Columns are separated.

Example

Customer Table

CustomerId

Name

Phone

Address

Profile Picture

Biography

Move large columns into another table.

Vertical Partitioning Diagram

flowchart LR
    MAIN[(Customer)]

    PROFILE[(Customer Profile)]

    MAIN --> PROFILE

Frequently used data remains small.

Large columns are loaded only when required.
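The split described above can be sketched in a few lines of Python. This is only an illustration: the column names follow the Customer example, the sample row is made up, and `split_customer` is a hypothetical helper rather than anything a database provides.

```python
# Hypothetical column split based on the Customer example above.
# Both halves keep customer_id so they can be joined back together.
HOT_COLUMNS = ("customer_id", "name", "phone", "address")
COLD_COLUMNS = ("customer_id", "profile_picture", "biography")


def split_customer(row):
    """Split one wide row into a small 'hot' row and a large 'cold' row."""
    hot = {col: row[col] for col in HOT_COLUMNS}
    cold = {col: row[col] for col in COLD_COLUMNS}
    return hot, cold


# Made-up sample data for illustration only.
row = {
    "customer_id": 42,
    "name": "Asha",
    "phone": "555-0100",
    "address": "1 Main St",
    "profile_picture": b"<large binary image>",
    "biography": "Long free-text biography",
}
customer, customer_profile = split_customer(row)
```

Lookups by name or phone then read only the narrow `customer` rows, and the wide `customer_profile` rows are fetched by `customer_id` when a profile page actually needs them.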

Range Partitioning

Rows are partitioned based on value ranges.

Example

Order Date

2024

↓

Partition 1

2025

↓

Partition 2

2026

↓

Partition 3

Range Partition Diagram

flowchart TD
    ORDERS[(Orders)]

    Y2024[(2024)]

    Y2025[(2025)]

    Y2026[(2026)]

    ORDERS --> Y2024
    ORDERS --> Y2025
    ORDERS --> Y2026

Ideal for

Financial transactions
Audit logs
Historical reports
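To make the routing rule concrete, here is a minimal Python sketch of how a row could be mapped to a yearly range partition. The partition names, the bounds, and the `route_by_range` function are assumptions for illustration; a real database performs this lookup internally.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical layout: inclusive lower bound of each partition, sorted ascending.
LOWER_BOUNDS = [date(2024, 1, 1), date(2025, 1, 1), date(2026, 1, 1)]
PARTITIONS = ["orders_2024", "orders_2025", "orders_2026"]
UPPER_LIMIT = date(2027, 1, 1)  # exclusive upper bound of the last partition


def route_by_range(order_date):
    """Return the partition whose [lower, next lower) range contains order_date."""
    if order_date < LOWER_BOUNDS[0] or order_date >= UPPER_LIMIT:
        raise ValueError(f"no partition for {order_date}")
    # bisect_right counts the lower bounds that are <= order_date.
    index = bisect_right(LOWER_BOUNDS, order_date) - 1
    return PARTITIONS[index]
```

Each range is closed at the bottom and open at the top, so an order dated 2026-01-01 belongs to `orders_2026` and one dated 2025-12-31 belongs to `orders_2025`. A date outside every range is rejected, which is why new partitions need to exist before the data arrives.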

List Partitioning

Rows are grouped by category.

Example

Country

USA

↓

Partition_US

India

↓

Partition_IN

UK

↓

Partition_UK

List Partition Diagram

flowchart TD
    CUSTOMER[(Customers)]

    US[(USA)]

    IN[(India)]

    UK[(United Kingdom)]

    CUSTOMER --> US
    CUSTOMER --> IN
    CUSTOMER --> UK

Useful for regional applications.
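List routing is essentially a lookup table. The Python sketch below assumes the three country partitions from the example plus a catch-all partition for unlisted values; the catch-all name and the `route_by_list` function are hypothetical.

```python
# Hypothetical mapping from partition-key value to partition name.
PARTITION_BY_COUNTRY = {
    "USA": "partition_us",
    "India": "partition_in",
    "UK": "partition_uk",
}
# Assumption: a catch-all partition exists for countries not listed above.
DEFAULT_PARTITION = "partition_default"


def route_by_list(country):
    """Return the partition that stores rows for the given country."""
    return PARTITION_BY_COUNTRY.get(country, DEFAULT_PARTITION)
```

Without a catch-all partition, a row for an unlisted country would have nowhere to go, so the list of values has to be kept in step with the data.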

Hash Partitioning

A hash function determines the partition.

Formula

hash(CustomerId) % NumberOfPartitions

Hash Partition Diagram

flowchart TD
    ID[Customer ID]

    HASH[Hash Function]

    P1[(Partition 1)]

    P2[(Partition 2)]

    P3[(Partition 3)]

    ID --> HASH

    HASH --> P1
    HASH --> P2
    HASH --> P3

Advantages

Even distribution
Balanced storage
Predictable routing
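The following Python sketch illustrates the even distribution claimed above. As a simplifying assumption it uses the integer ID itself as the hash value, whereas real databases apply their own internal hash function. The ID range and the `route_by_hash` function are made up for the example.

```python
from collections import Counter

NUM_PARTITIONS = 3


def route_by_hash(customer_id, num_partitions=NUM_PARTITIONS):
    """Map a customer ID to a partition number in [0, num_partitions)."""
    # Simplification: the integer ID stands in for its hash value.
    return customer_id % num_partitions


# Distribution of 9,000 sequential customer IDs over 3 partitions.
counts = Counter(route_by_hash(cid) for cid in range(1, 9001))

# How many of those IDs land in a different partition if a 4th one is added.
moved = sum(
    1 for cid in range(1, 9001)
    if route_by_hash(cid, 3) != route_by_hash(cid, 4)
)
```

Here `counts` holds 3,000 IDs per partition, while `moved` is 6,750: going from 3 to 4 partitions relocates three quarters of the rows. That is the main design trade-off of simple modulo routing, so the partition count is best chosen with future growth in mind.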

Composite Partitioning

Combination of multiple strategies.

Example

Range + Hash

flowchart TD
    ORDERS[(Orders)]

    YEAR[(2026)]

    HASH[Hash]

    P1[(Partition A)]

    P2[(Partition B)]

    ORDERS --> YEAR
    YEAR --> HASH
    HASH --> P1
    HASH --> P2

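This is common when a single level is not enough, for example when each yearly range is still so large that it is split further by a hash of the customer ID.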
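A Range + Hash scheme can be sketched by applying the two routing rules one after the other. The naming pattern, the bucket count, and the `route_composite` function below are assumptions for illustration, and the customer ID again stands in for its hash value.

```python
from datetime import date

HASH_BUCKETS = 2  # hypothetical number of hash sub-partitions per year


def route_composite(order_date, customer_id):
    """Pick the yearly range first, then a hash bucket inside that year."""
    bucket = customer_id % HASH_BUCKETS  # simplification of a real hash
    return f"orders_{order_date.year}_h{bucket}"


partition = route_composite(date(2026, 3, 5), 101)
```

A query that filters on both the date and the customer can then be narrowed to a single sub-partition, while a query that filters only on the date still skips every other year.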

Partition Pruning

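Partition pruning is one of the biggest advantages of partitioning: the query planner skips every partition that cannot contain matching rows.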

Query

SELECT *
FROM orders
WHERE order_date >= '2026-01-01'
AND order_date < '2027-01-01';

The database scans only Orders_2026 instead of all partitions.

Partition Pruning Diagram

flowchart TD
    QUERY[SQL Query]

    PARENT[(Orders)]

    P2024[(2024)]

    P2025[(2025)]

    P2026[(2026)]

    QUERY --> PARENT
    PARENT --> P2026

Only the matching partition is accessed.
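Conceptually, pruning is an overlap test between the query's date range and each partition's bounds. The Python sketch below models that test. The partition list and the `prune` function are hypothetical, and a real planner works from the table's partition metadata rather than a hard-coded list.

```python
from datetime import date

# Hypothetical metadata: (name, inclusive lower bound, exclusive upper bound).
PARTITIONS = [
    ("orders_2024", date(2024, 1, 1), date(2025, 1, 1)),
    ("orders_2025", date(2025, 1, 1), date(2026, 1, 1)),
    ("orders_2026", date(2026, 1, 1), date(2027, 1, 1)),
]


def prune(query_from, query_to):
    """Return the partitions whose range overlaps [query_from, query_to)."""
    return [
        name
        for name, lower, upper in PARTITIONS
        if lower < query_to and query_from < upper
    ]


# Same bounds as the SQL query above.
scanned = prune(date(2026, 1, 1), date(2027, 1, 1))
```

For the query above, `scanned` contains only `orders_2026`. A filter that does not mention `order_date` gives the planner no bounds to test, so every partition has to be scanned.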

PostgreSQL Partitioning

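PostgreSQL supports declarative partitioning with these strategies:

RANGE (version 10 and later)
LIST (version 10 and later)
HASH (version 11 and later)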

Example

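-- Parent table: defines the partitioning scheme and stores no rows itself.
CREATE TABLE orders (
    id BIGINT,
    order_date DATE
)
PARTITION BY RANGE (order_date);

-- One child partition per year; the upper bound is exclusive.
CREATE TABLE orders_2026 PARTITION OF orders
    FOR VALUES FROM ('2026-01-01') TO ('2027-01-01');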
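Child partitions have to exist before rows for their range arrive, so teams often generate the DDL from a script. Below is a minimal Python sketch that builds PostgreSQL `CREATE TABLE ... PARTITION OF` statements for monthly partitions. The naming pattern and the `monthly_partition_ddl` function are assumptions, and the table name is expected to be a trusted identifier because it is interpolated directly into the SQL text.

```python
from datetime import date


def monthly_partition_ddl(parent, year, month):
    """Build the DDL for one monthly range partition of `parent`."""
    start = date(year, month, 1)
    # The upper bound is exclusive, so it is the first day of the next month.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    name = f"{parent}_{year}_{month:02d}"
    return (
        f"CREATE TABLE {name} PARTITION OF {parent} "
        f"FOR VALUES FROM ('{start}') TO ('{end}');"
    )


ddl = monthly_partition_ddl("orders", 2026, 12)
```

Running such a script on a schedule, a few months ahead of the current date, avoids insert failures caused by a missing partition.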

Oracle Partitioning

Oracle supports

Range
List
Hash
Interval
Composite

Widely used in Banking applications.

Spring Boot Architecture

flowchart TD
    CLIENT[React]

    API[Spring Boot]

    DB[(Partitioned PostgreSQL)]

    P1[(2024)]

    P2[(2025)]

    P3[(2026)]

    CLIENT --> API
    API --> DB

    DB --> P1
    DB --> P2
    DB --> P3

Spring Boot queries the parent table.

The database automatically routes requests.

Banking Example

Transaction Table

Transactions

↓

2023

2024

2025

2026

Benefits

Faster monthly statements
Easier archiving
Faster compliance reporting

Amazon Example

Order History

Orders

↓

Monthly Partitions

↓

Query Current Month

Old partitions are archived separately.

Netflix Example

Viewing history at this scale can be partitioned by

Date
Region
Customer

Queries then touch less data, and older partitions can be moved to cheaper storage.

Uber Example

Trip data can be partitioned by

City
Month

Queries for a single city don't scan all global trips.

Partition Maintenance

Old partitions can be archived.

flowchart LR
    ACTIVE[(Active Partition)]

    ARCHIVE[(Archive)]

    DELETE[(Drop Old Partition)]

    ACTIVE --> ARCHIVE
    ARCHIVE --> DELETE

Dropping a whole partition is a quick operation on the table's metadata and files, which is far cheaper than deleting billions of rows one by one.
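The archive-then-drop flow can be sketched as a small retention script. The Python example below generates PostgreSQL `ALTER TABLE ... DETACH PARTITION` and `DROP TABLE` statements for yearly partitions that fall outside a retention window. The `orders_<year>` naming pattern, the three-year window, and the `retention_statements` function are assumptions for illustration.

```python
def retention_statements(partition_years, current_year, keep_years, parent="orders"):
    """Return SQL that detaches and drops yearly partitions past retention."""
    oldest_kept = current_year - keep_years + 1
    statements = []
    for year in sorted(partition_years):
        if year < oldest_kept:
            name = f"{parent}_{year}"  # hypothetical naming pattern
            # Detaching turns the partition into a standalone table.
            statements.append(f"ALTER TABLE {parent} DETACH PARTITION {name};")
            statements.append(f"DROP TABLE {name};")
    return statements


statements = retention_statements(
    [2023, 2024, 2025, 2026], current_year=2026, keep_years=3
)
```

With these inputs only `orders_2023` is selected. In practice the archive step, such as exporting the detached table, would run between the two statements.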

Advantages

Faster Queries
Smaller Indexes
Easier Maintenance
Faster Backup
Better Performance
Easier Archiving
Reduced Lock Contention

Disadvantages

More complex design
A poor partition key can hurt performance
Cross-partition queries can be slower
Ongoing partition management (creating, archiving, and dropping partitions)

Monitoring

Monitor

Partition Size
Query Latency
Partition Pruning
Index Usage
Storage Growth
Vacuum/Analyze Statistics
Slow Queries
CPU Usage

Tools

PostgreSQL EXPLAIN ANALYZE
pg_stat_statements
Oracle AWR
Datadog
Grafana
Prometheus

Common Mistakes

❌ Creating too many partitions

❌ Choosing the wrong partition key

❌ Ignoring partition pruning

❌ Querying across all partitions unnecessarily

❌ Not archiving old partitions

❌ Forgetting local/global index maintenance

Best Practices

Choose a partition key that matches common query patterns.
Use Range partitioning for time-series data.
Use Hash partitioning for even distribution.
Archive old partitions instead of deleting rows.
Keep partitions balanced in size.
Monitor partition pruning using execution plans.
Rebuild indexes when necessary.
Combine partitioning with indexing for maximum performance.
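The advice to keep partitions balanced can be turned into a simple check. The Python sketch below computes a skew ratio from per-partition row counts. The metric, the sample counts, and the `skew_ratio` function are assumptions chosen for illustration, not a standard database statistic.

```python
def skew_ratio(row_counts):
    """Return the largest partition size divided by the mean partition size.

    A value near 1.0 means the partitions are balanced; larger values
    mean one partition holds a disproportionate share of the rows.
    """
    sizes = list(row_counts.values())
    mean = sum(sizes) / len(sizes)
    return max(sizes) / mean


# Made-up row counts for illustration only.
balanced = skew_ratio({"p0": 100, "p1": 100, "p2": 100})
skewed = skew_ratio({"p0": 300, "p1": 50, "p2": 50})
```

A balanced layout gives a ratio of 1.0, while the skewed sample gives about 2.25, a hint that the partition key or the partition boundaries may need another look.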

Common Interview Questions

What is Database Partitioning?

Database Partitioning is the process of dividing one large table into smaller logical partitions to improve performance and simplify maintenance.

What is the difference between Partitioning and Sharding?

Partitioning splits a table within the same database, while Sharding distributes data across multiple independent database servers.

What is Partition Pruning?

Partition Pruning is an optimization where the database scans only the relevant partitions instead of the entire table, significantly reducing query execution time.

Which partitioning strategy is best for time-series data?

Range Partitioning is the preferred strategy for time-based data such as orders, transactions, logs, and audit records.

Can Spring Boot work with partitioned tables?

Yes. Spring Boot and JPA interact with the parent table, while the database transparently routes data to the correct partition.

Summary

Database Partitioning is a powerful optimization technique for managing very large tables. It improves query performance when queries filter on the partition key, reduces maintenance effort, and enables efficient handling of billions of rows, usually with little or no change to application code.

In this article, we covered:

Database Partitioning fundamentals
Horizontal & Vertical Partitioning
Range, List, Hash, and Composite Partitioning
Partition Pruning
PostgreSQL & Oracle support
Spring Boot integration
Banking, Amazon, Netflix, and Uber examples
Monitoring
Best practices

Partitioning is typically introduced before database sharding and works exceptionally well for large transactional tables such as orders, payments, logs, and audit records. Combined with proper indexing and replication, it forms a key building block for scalable enterprise database architectures.
