Database Sharding and Partitioning Strategies

Q: When should I start sharding?

When you have exhausted vertical scaling and read replicas. Typical signals: single server CPU > 70% sustained, write throughput is the bottleneck (not reads), or you need geographic distribution. Most applications never need sharding.

Introduction

When a single database server cannot handle the load, you have three options: buy a bigger machine (vertical scaling), add read replicas (horizontal read scaling), or split the data across multiple servers (sharding). Sharding is the hardest but the only option for unlimited horizontal scaling. This guide covers strategies, trade-offs, and operational considerations.

Partitioning vs Sharding

Term	Definition	Scope
Partitioning	Splitting a single table into smaller pieces within one database	Single node
Sharding	Distributing partitions across multiple database servers	Multi-node
Horizontal	Splitting rows by partition key	Rows distributed
Vertical	Splitting columns into separate tables	Columns separated

Vertical Partitioning

Split columns of a wide table into separate tables, typically by access pattern.

-- Before: single wide table
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    username VARCHAR(50),
    email VARCHAR(100),
    profile_json JSONB,
    avatar_url VARCHAR(500),
    preferences_json JSONB,
    created_at TIMESTAMP
);

-- After: frequently accessed columns in users, rarely accessed in user_profiles
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    username VARCHAR(50),
    email VARCHAR(100),
    created_at TIMESTAMP
);

CREATE TABLE user_profiles (
    user_id INT PRIMARY KEY REFERENCES users(id),
    profile_json JSONB,
    avatar_url VARCHAR(500),
    preferences_json JSONB
);

When to use: When some columns are accessed 100x more often than others. Reduces I/O for common queries.

Horizontal Partitioning (Table Partitioning)

Split rows of a single table within the same database server.

-- PostgreSQL declarative partitioning by range
CREATE TABLE events (
    id BIGSERIAL,
    event_type VARCHAR(50),
    created_at TIMESTAMP NOT NULL,
    data JSONB
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

CREATE TABLE events_2024_02 PARTITION OF events
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

Benefits: Query pruning (only scans relevant partitions), easier archival (detach old partitions), faster vacuum/analyze.

Sharding Strategies

1. Range Sharding

Split data by contiguous ranges of the shard key.

Shard 1: user_id 1 - 1,000,000
Shard 2: user_id 1,000,001 - 2,000,000
Shard 3: user_id 2,000,001 - 3,000,000

Pros	Cons
Range queries are efficient	Hotspots if data is skewed (e.g., recent data is hotter)
Easy to understand	Rebalancing requires moving large contiguous blocks
Natural for time-series

Best for: Time-series data, date-based partitioning, append-only workloads.

2. Hash Sharding

Apply a hash function to the shard key and map to a shard.

def get_shard(user_id, num_shards=4):
    return hash(user_id) % num_shards

# user_id=123 → hash(123) % 4 → shard 3
# user_id=456 → hash(456) % 4 → shard 0

Pros	Cons
Even distribution (no hotspots from skew)	Range queries require scanning all shards
Adding shards requires rehashing (expensive)	Cross-shard transactions are hard

Best for: Uniformly distributed keys, OLTP workloads with point lookups.

3. Consistent Hashing

A variant of hash sharding that minimizes rebalancing when adding/removing shards.

Key space (0-360):  Node A: 0-120, Node B: 120-240, Node C: 240-360
Add Node D:         Node A: 0-90,  Node B: 90-180, Node C: 180-270, Node D: 270-360

Only 1/4 of keys move when adding a 4th node (vs 1/2 with simple hash).

4. Directory-Based Sharding

Maintain a lookup table (directory) that maps keys to shards. Allows flexible, manual shard assignment.

CREATE TABLE shard_directory (
    tenant_id INT PRIMARY KEY,
    shard_id INT NOT NULL,
    region VARCHAR(20)
);

-- Route EU tenants to EU shards, US tenants to US shards
SELECT shard_id FROM shard_directory WHERE tenant_id = ?;

Best for: Multi-tenant SaaS where tenants need geographic isolation.

Choosing a Shard Key

The shard key determines which rows live together. Choose poorly and you will have hot shards and cross-shard queries.

Good Shard Key	Bad Shard Key
High cardinality (many unique values)	Low cardinality (e.g., country with 5 options)
Accessed together stays together	Frequently joined data lives on different shards
Even distribution	Skewed (e.g., 1% of users generate 50% of events)
Immutable (or rarely changes)	Changes frequently (causes data migration)

Example: For an e-commerce app, user_id is usually a good shard key because orders, profiles, and preferences all relate to a user and are queried together.

The Hotspot Problem

Even with hash sharding, hotspots occur when one key dominates writes.

Shard 1: 45% of writes (celebrity user with 10M followers)
Shard 2: 15% of writes
Shard 3: 20% of writes
Shard 4: 20% of writes

Solutions:

Sub-sharding: Split the hot key further (e.g., by post_id within the celebrity)
Write splitting: Fan out writes for the hot key across multiple queues/shards
Cache layer: Absorb reads for the hot key in Redis

Cross-Shard Queries and Transactions

The Problem

-- If orders and payments are sharded by user_id, this is easy:
SELECT * FROM orders WHERE user_id = 123;

-- But this requires querying all shards:
SELECT * FROM orders WHERE created_at > '2024-01-01';

Solutions

Approach	Trade-off
Scatter-gather	Query all shards, aggregate. Slow and resource-heavy.
Global secondary index	Maintain an index on a non-shard key. Adds write amplification.
CQRS / read model	Replicate data to an analytics store for cross-shard queries.
Avoid cross-shard transactions	Design around them. Use sagas for multi-shard operations.

Rebalancing

When shards become uneven, you must move data.

Strategy	When to Use
Double writes + backfill	Writes go to old and new shards; backfill historical data; then switch reads
Consistent hashing	Minimal data movement when adding nodes
Planned migration window	Accept downtime for simplicity (rarely acceptable in production)

Best Practices

Plan for rebalancing from day one — data gravity is real; moving terabytes is slow
Keep transactions within a single shard — cross-shard transactions are painful
Monitor shard-level metrics — uneven CPU, memory, or disk usage signals a rebalance need
Use application-level routing first — your app knows the shard key; don’t rely solely on database proxies
Test with production-like data volumes — hotspots and skew only appear at scale

Common Mistakes

Choosing a shard key with low cardinality (e.g., country with 5 values)
Assuming hash sharding eliminates all hotspots (celebrity accounts still concentrate load)
Cross-shard JOINs (they don’t exist; you must do it in application code)
Not planning for rebalancing until a shard is 90% full
Sharding too early (< 10M rows or < 1K writes/second)

Frequently Asked Questions

When should I start sharding?

When you have exhausted vertical scaling and read replicas. Typical signals: single server CPU > 70% sustained, write throughput is the bottleneck (not reads), or you need geographic distribution. Most applications never need sharding.

What is the difference between partitioning and sharding?

Partitioning splits a table into smaller pieces on the same server. Sharding distributes those pieces across multiple servers. Partitioning is a database feature; sharding is an architectural decision.

Can I change the shard key later?

Technically yes, practically no. Changing the shard key requires rewriting all data. Design your shard key as if it were immutable. If you must change it, use a double-write and migration strategy over weeks.