Skip to content
SP StackPractices
intermediate

SQL Performance Tuning — Indexes, Queries, and Explain Plans

A practical guide to optimizing SQL queries: indexing strategies, query rewriting, EXPLAIN plan analysis, and common anti-patterns to avoid.

SQL Performance Tuning

Introduction

Slow queries are one of the most common causes of application performance problems. This guide covers practical techniques to identify, diagnose, and fix SQL performance issues across PostgreSQL, MySQL, and SQL Server.

Finding Slow Queries

PostgreSQL

-- pg_stat_statements extension
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

-- Active slow queries
SELECT pid, query, now() - query_start AS duration
FROM pg_stat_activity
WHERE state = 'active'
  AND now() - query_start > interval '1 second'
ORDER BY duration DESC;

MySQL

-- Slow query log (enable in my.cnf)
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;

-- Performance Schema
SELECT sql_text, COUNT_STAR, AVG_TIMER_WAIT/1000000000 AS avg_sec
FROM performance_schema.events_statements_summary_by_digest
ORDER BY AVG_TIMER_WAIT DESC
LIMIT 10;

Understanding EXPLAIN Plans

The EXPLAIN plan reveals how the database executes your query.

EXPLAIN (ANALYZE, BUFFERS)
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2024-01-01'
GROUP BY u.id, u.name
ORDER BY order_count DESC
LIMIT 10;

Key Plan Operations

OperationMeaningPerformance Impact
Seq ScanTable scan (reads every row)Slow on large tables; needs index
Index ScanReads index, then fetches matching rowsFast for selective queries
Index Only ScanReads only the index, no table accessFastest; requires covering index
Bitmap Heap ScanBuilds bitmap from index, then fetches rows in batchesGood for moderate selectivity
Nested LoopFor each row in outer table, scan inner tableFine for small outer sets
Hash JoinBuilds hash table of inner table, probes with outerGood for large joins
Merge JoinSorts both inputs, merges themGood for pre-sorted data

Indexing Strategies

B-Tree Indexes (Default)

Best for equality and range queries:

-- Single column index
CREATE INDEX idx_users_email ON users(email);

-- Composite index (column order matters!)
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
-- Good for: WHERE user_id = ? AND status = ?
-- Bad for:  WHERE status = ? (user_id not leading)

Index Column Order

Place columns in this priority:

  1. Equality filters (=)
  2. Range filters (>, <, BETWEEN, LIKE 'prefix%')
  3. Columns used in ORDER BY
  4. Columns used in SELECT (for covering indexes)

Covering Indexes

An index that contains all columns needed for the query, avoiding table lookups:

CREATE INDEX idx_orders_covering
ON orders(user_id, status, created_at, total)
INCLUDE (id);

-- Query can be satisfied entirely from the index
SELECT id, created_at, total
FROM orders
WHERE user_id = 123 AND status = 'shipped';

Partial Indexes

Index only a subset of rows, reducing size and maintenance cost:

-- Only index active users
CREATE INDEX idx_users_active_email
ON users(email)
WHERE is_active = true;

Query Rewriting Techniques

1. Avoid SELECT *

-- Bad
SELECT * FROM orders WHERE user_id = 123;

-- Good: fetch only needed columns
SELECT id, status, total FROM orders WHERE user_id = 123;

2. Use EXISTS Instead of IN for Subqueries

-- Bad: materializes full subquery result
SELECT * FROM customers
WHERE id IN (SELECT customer_id FROM orders WHERE amount > 1000);

-- Good: stops at first match
SELECT * FROM customers c
WHERE EXISTS (
    SELECT 1 FROM orders o
    WHERE o.customer_id = c.id AND o.amount > 1000
);

3. Avoid Functions on Indexed Columns

-- Bad: function prevents index usage
SELECT * FROM orders WHERE DATE(created_at) = '2024-01-15';

-- Good: range query uses index
SELECT * FROM orders
WHERE created_at >= '2024-01-15'
  AND created_at < '2024-01-16';

4. Prefer JOINs Over Correlated Subqueries

-- Bad: correlated subquery runs once per row
SELECT name,
    (SELECT COUNT(*) FROM orders WHERE user_id = users.id) AS order_count
FROM users;

-- Good: JOIN is more efficient
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.name;

5. Batch Updates Instead of One-by-One

-- Bad: N+1 updates
UPDATE orders SET status = 'shipped' WHERE id = 1;
UPDATE orders SET status = 'shipped' WHERE id = 2;
...

-- Good: single UPDATE with WHERE IN or JOIN
UPDATE orders
SET status = 'shipped'
WHERE id IN (1, 2, 3, ...);

Common Anti-Patterns

Anti-PatternProblemFix
N+1 QueriesOne query per rowUse JOIN or WHERE IN
Missing LIMITFetching millions of rowsAdd LIMIT and pagination
Implicit conversionsFunction on column prevents index useCast the constant, not the column
SELECT DISTINCT to fix duplicatesHides a join problemFix the join or schema
Counting all rowsSELECT COUNT(*) on huge tablesUse approximate counts or triggers
No connection poolingConnection overhead dominatesUse pgBouncer, HikariCP, etc.

Best Practices

  • Index foreign keys automatically — joins depend on them
  • Monitor slow query logs weekly and address the top offenders
  • Analyze tables regularlyANALYZE updates statistics for the query planner
  • Avoid over-indexing — each index slows writes and consumes space
  • Use appropriate data typesINTEGER is faster than VARCHAR for IDs
  • Partition large tables by date or range when they exceed 10M rows

Frequently Asked Questions

Q: How many indexes is too many? A: There is no fixed number, but each index slows INSERT/UPDATE/DELETE. Audit indexes quarterly and drop unused ones. PostgreSQL’s pg_stat_user_indexes shows index usage.

Q: Should I index every column used in WHERE? A: No. Composite indexes often serve multiple queries. Also, the query planner may choose a sequential scan if the table is small or the query returns most rows.

Q: Why does my query use a sequential scan when I have an index? A: The planner estimates that reading the whole table is faster than reading the index plus random table lookups. This is often correct for queries returning >5-10% of rows.