Reference Guides/Intermediate/CTEs

CTEs

Named temporary result sets that make complex queries readable.

Picture a complex query that filters down to high earners, then figures out which departments they belong to, then joins in headcount data. You could write all that as nested subqueries - three layers deep, inside-out, impossible to follow at a glance. Or you could give each step a name.

That is what a CTE does. A Common Table Expression is a named, temporary result set you define at the top of a query using the WITH keyword. After that, you reference it just like a table. It disappears when the query finishes - no permanent storage, no side effects.

The payoff is that the query reads in the same order you think through the problem. Top to bottom, step by step, each piece named.

The basic form

WITH high_earners AS (
  SELECT id, name, salary, department_id
  FROM employees
  WHERE salary > 100000
)
SELECT d.name AS department, COUNT(*) AS count
FROM high_earners h
JOIN departments d ON h.department_id = d.id
GROUP BY d.name;

The CTE high_earners is defined once, then the main query treats it exactly like a table. No repeated filter logic, no nesting. You write what you mean.

Multiple CTEs

You can chain as many CTEs as you need by separating them with commas. Each one can reference any CTE defined before it - which is where things get expressive:

WITH
dept_stats AS (
  SELECT department_id,
         AVG(salary)  AS avg_salary,
         COUNT(*)     AS headcount
  FROM employees
  GROUP BY department_id
),
large_depts AS (
  SELECT department_id
  FROM dept_stats
  WHERE headcount >= 20
)
SELECT d.name, ds.avg_salary, ds.headcount
FROM departments d
JOIN dept_stats ds   ON ds.department_id = d.id
JOIN large_depts ld  ON ld.department_id = d.id
ORDER BY ds.avg_salary DESC;

large_depts builds directly on dept_stats. Each step narrows or transforms the data, and the final SELECT just assembles what you need. Think of it as a pipeline you can read from top to bottom.

CTEs vs subqueries: when to choose each

CTEs and subqueries can often express the same thing. The difference is primarily about communication - code gets read far more often than it gets written, and that shifts the calculus.

SituationPrefer
One-time, simple filterSubquery in WHERE
Derived table used onceSubquery in FROM
Logic used in more than one placeCTE
Multi-step logic that benefits from namingCTE
Want to test each step in isolationCTE

A useful gut check: if you cannot describe what a subquery does without reading every line of it, give it a name.

Debugging CTEs

This is one of the underrated advantages of CTEs. Want to verify what dept_stats actually returns? Copy its query and run it on its own. With nested subqueries you have to surgically extract layers of code. With CTEs, each step is already isolated.

Performance and materialization

This trips people up sometimes. If you are on an older PostgreSQL (before version 12), each CTE was treated as an optimization barrier - the planner would evaluate it in full before the outer query could filter anything down. Starting in PostgreSQL 12, non-recursive CTEs are inlined by default. The planner sees through them and optimizes freely, the same way it handles subqueries.

If you need to force a CTE to be evaluated exactly once (materialized), PostgreSQL supports an explicit hint:

WITH expensive_calculation AS MATERIALIZED (
  SELECT ...
)

That hint is for specific situations where you know the calculation is expensive and you want to guarantee it runs exactly once. For most queries, write for clarity and trust the planner.

The path to recursive CTEs

The WITH keyword unlocks one more thing: recursive queries. A recursive CTE can call itself, which is how SQL traverses hierarchies and graphs - org charts, folder trees, connected components. That is covered separately in the Advanced section. Get comfortable with regular CTEs first; the recursive form will click much faster once this foundation is solid.