Skip to content
SP StackPractices
advanced By StackPractices

Database Schema Evolution

Evolve database schemas safely with backward-compatible changes, versioned migrations, and online DDL operations in production environments.

Topics: databases

Note: This guide follows English-language naming conventions and terminology standards common in international development teams. Examples use English identifiers and comments to maximize compatibility across codebases and tooling.

Overview

Database schemas must evolve as applications grow, but schema changes are a leading cause of production outages. The expand-contract pattern, online DDL, and backward-compatible migrations allow teams to add features without downtime. This resource covers practical techniques for evolving schemas in PostgreSQL, MySQL, and distributed databases while maintaining data integrity and application availability.

When to Use

Use this resource when:

  • Adding columns, indexes, or constraints to tables with millions of rows
  • You need to rename columns or split tables without breaking running applications
  • Running migrations in a CI/CD pipeline that deploys multiple times daily
  • Working with distributed databases where schema changes propagate asynchronously

Solution

Expand-Contract Pattern (PostgreSQL)

-- PHASE 1: EXPAND - Add new column without breaking existing code
ALTER TABLE users ADD COLUMN email_normalized VARCHAR(255);
CREATE INDEX CONCURRENTLY idx_users_email_normalized ON users(email_normalized);

-- Backfill in batches to avoid locking
UPDATE users 
SET email_normalized = LOWER(email)
WHERE id BETWEEN 1 AND 10000;

-- PHASE 2: DUAL WRITE - Application writes to both columns
-- (Deploy code that writes to email and email_normalized)

-- PHASE 3: CONTRACT - Remove old column after verification
ALTER TABLE users DROP COLUMN email;
ALTER TABLE users RENAME COLUMN email_normalized TO email;

Online DDL with pt-online-schema-change (MySQL)

# Add an index without locking the table
pt-online-schema-change \
  --alter "ADD INDEX idx_created_at (created_at)" \
  --execute \
  --max-load Threads_running=25 \
  --critical-load Threads_running=50 \
  D=mydb,t=orders

Flyway Migration (Java/Spring)

// V1.2__Add_user_preferences.sql
CREATE TABLE user_preferences (
    user_id UUID PRIMARY KEY REFERENCES users(id),
    theme VARCHAR(20) DEFAULT 'light',
    notifications_enabled BOOLEAN DEFAULT true,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_user_preferences_theme ON user_preferences(theme);

Explanation

The expand-contract pattern:

  1. Expand: Add new schema elements (columns, tables) without removing old ones
  2. Migrate: Backfill data; run dual-write during transition
  3. Verify: Ensure new and old paths produce identical results
  4. Contract: Remove deprecated elements once all code uses the new schema

Online vs. offline DDL:

DatabaseOnline DDLLock Level
PostgreSQLCREATE INDEX CONCURRENTLYNone
MySQLALGORITHM=INPLACEBrief metadata
MySQL (large tables)pt-online-schema-changeRow-level copy
SQL ServerONLINE=ONSchema stability

Variants

ApproachBest ForTooling
Expand-contractZero-downtime renamesManual + application changes
Online DDLLarge table index changespt-online-schema-change, gh-ost
Blue-green schemaMajor restructuringTwo databases + dual-write
Logical replicationCross-version migrationpglogical, Debezium

Best Practices

  • Never drop before adding: Always add the replacement before removing the original
  • Use IF EXISTS and IF NOT EXISTS: Prevents migration failures on partial runs
  • Batch backfills: Update 1,000-10,000 rows per transaction to avoid long locks
  • Test migrations on production-sized data: pg_dump + restore to staging isn’t enough
  • Version your migrations: Flyway, Liquibase, or Atlas for tracking and rollback

Common Mistakes

  1. Big-bang migrations: Running ALTER TABLE on a 100M-row table without CONCURRENTLY
  2. Not testing rollback: If the deploy fails, can you revert the schema change? Test deployment strategies.
  3. Missing application compatibility: New schema breaks old code during rolling deployments
  4. Ignoring lock timeouts: PostgreSQL statement_timeout aborts long migrations unpredictably. See connection pooling.
  5. No dry runs: Running migrations directly in production without EXPLAIN or staging validation

Frequently Asked Questions

Q: How do I rename a column without downtime? A: Add new column → dual write → migrate data → update readers → drop old column. Never rename in place.

Q: Can I use transactions for schema changes? A: PostgreSQL supports transactional DDL. MySQL commits implicitly after each DDL statement.

Q: How do I handle schema changes in microservices? A: Each service owns its schema. Use schema-per-service. Shared databases create coupling that makes schema changes dangerous.