Recursive CTEs
Traversing hierarchies and graphs in pure SQL.
A recursive CTE is a CTE that references itself. It is the SQL mechanism for traversing hierarchical or graph-structured data where the depth is not known in advance: organizational charts, file system trees, bill-of-materials structures, social network graphs, and dependency chains.
Before recursive CTEs, traversing a hierarchy in SQL required either knowing the depth in advance (writing one self-join per level) or pulling the data into application code and walking it there. Recursive CTEs handle arbitrary depth in pure SQL.
The structure
Every recursive CTE has two parts, connected by UNION ALL:
WITH RECURSIVE cte_name AS (
-- Part 1: the anchor (runs once, produces the starting rows)
SELECT ...
UNION ALL
-- Part 2: the recursive part (joins cte_name to itself)
SELECT ... FROM source_table JOIN cte_name ON ...
)
SELECT * FROM cte_name;
The anchor produces the initial set of rows. The recursive part takes those rows, finds more rows related to them, and adds those to the result. The process repeats until the recursive part produces no new rows. The final result is the union of everything produced across all iterations.
Walking an organizational hierarchy
The clearest illustration is an employee table where each row has a manager_id that points to another employee:
WITH RECURSIVE org_tree AS (
-- Anchor: start with the CEO (has no manager)
SELECT id, name, manager_id, 0 AS depth
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive: find the direct reports of everyone found so far
SELECT e.id, e.name, e.manager_id, ot.depth + 1
FROM employees e
JOIN org_tree ot ON e.manager_id = ot.id
)
SELECT id, name, depth
FROM org_tree
ORDER BY depth, name;
Iteration 0: the anchor finds the CEO (depth 0). Iteration 1: the recursive part finds everyone who reports to the CEO (depth 1). Iteration 2: it finds everyone who reports to those people (depth 2). And so on until there are no more direct reports to find.
The depth column is a useful pattern: it tells you how many levels down each node is, which is helpful for indenting a tree display or filtering to a specific level.
Building the path
You can track the full path from root to each node by accumulating it in an array:
WITH RECURSIVE org_tree AS (
SELECT id, name, manager_id, ARRAY[name] AS path
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT e.id, e.name, e.manager_id, ot.path || e.name
FROM employees e
JOIN org_tree ot ON e.manager_id = ot.id
)
SELECT id, name, array_to_string(path, ' > ') AS full_path
FROM org_tree
ORDER BY full_path;
This produces rows like CEO > VP Engineering > Alice, showing each employee's position in the hierarchy.
Handling cycles in graphs
A tree has no cycles. A graph can. If a recursive CTE encounters a cycle, it will loop forever (or until it hits a row limit). The standard defense is to track visited nodes in an array and stop when you see a repeat:
WITH RECURSIVE graph AS (
SELECT id, ARRAY[id] AS visited
FROM nodes
WHERE id = 1 -- starting node
UNION ALL
SELECT n.id, g.visited || n.id
FROM edges e
JOIN nodes n ON n.id = e.target
JOIN graph g ON g.id = e.source
WHERE NOT n.id = ANY(g.visited) -- stop if we have been here before
)
SELECT * FROM graph;
PostgreSQL 14 added a CYCLE clause that handles this automatically without manual array tracking:
WITH RECURSIVE graph AS (
SELECT id FROM nodes WHERE id = 1
UNION ALL
SELECT n.id FROM edges e JOIN nodes n ON n.id = e.target JOIN graph g ON g.id = e.source
) CYCLE id SET is_cycle USING visited_path
SELECT * FROM graph WHERE NOT is_cycle;
When recursive CTEs are the right tool
| Problem | Better approach |
|---|---|
| One level of hierarchy | Self-join |
| Two or three known levels | Multiple self-joins |
| Arbitrary depth, tree structure | Recursive CTE |
| Arbitrary depth, graph with cycles | Recursive CTE with cycle detection |
Recursive CTEs are more complex than regular queries. Reach for them when the hierarchy depth is genuinely unknown, not as a first instinct for any hierarchical problem.