Performance Tuning
A systematic approach to finding and fixing slow queries.
SQL performance problems feel mysterious until you look at the query plan. That is where almost everyone goes wrong - they start guessing before they look. Once you see the plan, the problem almost always falls into a small number of categories. The process is diagnostic, not creative: measure, find the bottleneck, apply a targeted fix, verify.
Step one: measure before you guess
This trips people up constantly. The instinct is to start adding indexes or rewriting queries. But if you have not identified which query is actually slow - and confirmed it is slow - you are guessing. A query that runs in 2ms does not need your help. Start by finding the real offender, then figure out why.
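If pg_stat_statements is available, it will rank the offenders for you. A minimal sketch, assuming the extension is enabled and PostgreSQL 13+ column names (older versions call these total_time and mean_time):

-- Top ten queries by cumulative execution time
SELECT calls,
       round(mean_exec_time::numeric, 2) AS mean_ms,
       round(total_exec_time::numeric, 2) AS total_ms,
       query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

Once you have a candidate, get its plan: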
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders WHERE customer_id = 42;

Read the output before touching anything. The plan tells you exactly where time is being spent - and often makes the fix obvious.
Missing or unused indexes
This is the most common cause. Look for Seq Scan on a large table. If the table has millions of rows and the query returns a small fraction, a sequential scan means no index is helping - the database is reading every single row.
-- Slow: scanning the entire orders table to find one customer
Seq Scan on orders (cost=0.00..48500.00 rows=85 ...)
-- Add an index:
CREATE INDEX idx_orders_customer_id ON orders (customer_id);
-- Fast: using the index
Index Scan using idx_orders_customer_id on orders (cost=0.43..120.00 rows=85 ...)

Here is where it gets tricky: sometimes an index exists and is still not used. The most common reason is that the query wraps the column in a function, and the database cannot use an index on a transformed value.
-- Will NOT use an index on created_at (because of DATE_TRUNC)
WHERE DATE_TRUNC('day', created_at) = '2024-01-15'
-- Will use it (direct range comparison on the indexed column)
WHERE created_at >= '2024-01-15' AND created_at < '2024-01-16'

Other patterns that defeat an index: implicit type casting between the filter value and the column type, and LIKE with a leading wildcard (LIKE '%term'). All of these look innocent in isolation, which is why the plan is the only reliable way to catch them.
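For concreteness, here is roughly what those two patterns look like - a sketch, since exact casting behavior varies by database and driver (name is a hypothetical text column):

-- Implicit cast: a numeric literal coerces the integer column,
-- so a plain index on customer_id cannot be used
WHERE customer_id = 42.0   -- effectively customer_id::numeric = 42.0
WHERE customer_id = 42     -- matches the column type, index-friendly

-- Leading wildcard: a B-tree index cannot seek to an unknown prefix
WHERE name LIKE '%term'    -- has to check every row
WHERE name LIKE 'term%'    -- indexable (in PostgreSQL, given a C
                           -- collation or a text_pattern_ops index)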
Stale planner statistics
The query planner does not know what your data looks like - it estimates row counts using statistics collected about your table's distribution. When those statistics are stale, the planner makes bad decisions: choosing a nested loop when a hash join would be faster, or picking an index scan when a seq scan is better.
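You can inspect what the planner currently believes through the pg_stats view - a quick check against the orders table from earlier:

-- The planner's current picture of the column's distribution
SELECT attname, null_frac, n_distinct, most_common_vals
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'customer_id';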
The tell is a big gap between estimated and actual rows in EXPLAIN ANALYZE:
Seq Scan on orders (cost=... rows=100 ...) (actual rows=1500000 ...)

The planner expected 100 rows and got 1.5 million. That discrepancy cascades - bad row estimates lead to bad join strategy choices, bad memory allocation, bad everything. Refreshing the statistics is often a one-liner fix:
ANALYZE orders;

PostgreSQL's autovacuum runs ANALYZE in the background automatically, but on rapidly growing tables it sometimes cannot keep up - worth remembering when a query suddenly gets slow after a big data load.
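To check whether stale statistics are the culprit, PostgreSQL records when each table was last analyzed:

-- Last manual ANALYZE and last autovacuum-triggered one
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'orders';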
The N+1 pattern
N+1 shows up most often in ORMs and application code: fetch a list of 500 orders, then loop through and fetch the customer for each one. That is 501 queries instead of one join. At the SQL level, the same pattern surfaces as a correlated subquery that re-executes for every row.
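In raw SQL terms the pattern looks like this - a sketch against the orders and customers tables used throughout, with $1 standing for the parameter the application binds on each iteration:

-- One query for the list...
SELECT id, customer_id, total FROM orders LIMIT 500;

-- ...then one lookup per order, executed 500 separate times
SELECT name FROM customers WHERE id = $1;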
In the query plan, you will see it as a nested loop with a suspiciously high loop count:
Nested Loop (cost=... actual rows=... loops=50000)
-> Seq Scan on orders

If the inner scan runs 50,000 times, either the outer query returns 50,000 rows or a correlated subquery is executing per row. Either way, the fix is the same: restructure as a join, or use a window function to compute the per-row value in a single pass over the data.
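Both fixes, sketched against the same tables - a LEFT JOIN preserves the subquery's behavior when a customer is missing, and the window function handles the per-row aggregate case:

-- Correlated subquery: the inner SELECT re-executes for every order
SELECT o.id, o.total,
       (SELECT c.name FROM customers c WHERE c.id = o.customer_id) AS customer_name
FROM orders o;

-- Rewritten as a join: one pass over the data
SELECT o.id, o.total, c.name AS customer_name
FROM orders o
LEFT JOIN customers c ON c.id = o.customer_id;

-- Window function: a per-customer aggregate without a per-row subquery
SELECT id, total,
       AVG(total) OVER (PARTITION BY customer_id) AS customer_avg
FROM orders;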
Selecting too many columns or too much data
SELECT * seems harmless until your table has a description column storing several kilobytes of text per row, or a JSONB blob, or a BYTEA attachment. When you only need a name and a date, you are pulling all that extra data across the wire on every query.
Select only what you need:
-- Fetches all columns including large ones
SELECT * FROM products WHERE category = 'electronics';
-- Fetches only what the caller needs
SELECT id, name, price FROM products WHERE category = 'electronics';

Fewer columns mean less I/O and less network transfer, and can let the database use a covering index - one that contains all the requested columns and lets the planner skip the table entirely.
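If that query is hot, it can be a candidate for an index-only scan. A sketch using PostgreSQL 11+ INCLUDE syntax (the index name is made up; index-only scans also depend on the visibility map being reasonably current):

-- category is the search key; the INCLUDE columns ride along so
-- the planner can answer the query from the index alone
CREATE INDEX idx_products_category_inc
    ON products (category) INCLUDE (id, name, price);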
Inefficient joins
The most common join problem is straightforward: the join column is not indexed on one side of the join.
-- If orders.customer_id is not indexed, this is a nested loop with a seq scan per row
SELECT c.name, o.total
FROM customers c
JOIN orders o ON o.customer_id = c.id;

Adding an index on the join column opens up better strategies for the planner. When reading EXPLAIN ANALYZE output, a Hash Join or Merge Join on large tables is generally fine. A Nested Loop with high loop counts and inner sequential scans is the warning sign.
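For contrast, a healthy plan for this query on large tables might look something like the following - an illustrative shape, not real output:

Hash Join (cost=... rows=...)
  Hash Cond: (o.customer_id = c.id)
  -> Seq Scan on orders o
  -> Hash
       -> Seq Scan on customers c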
A repeatable diagnostic process
1. Find the slow query - slow query log, APM tool, or pg_stat_statements
2. Run EXPLAIN (ANALYZE, BUFFERS) and actually read the output
3. Look for: seq scans on large tables, big estimate-to-actual gaps, high loop counts
4. Apply one targeted fix: an index, a filter rewrite, an ANALYZE, a restructured join
5. Rerun EXPLAIN ANALYZE and confirm the plan changed the way you expected
The most important word in step 4 is "one." Change one thing, verify it helped, then decide what is next. Stack multiple changes at once and you will not know what fixed it - or what broke something else.