Self Joins
Joining a table to itself to find relationships within the same dataset.
Some relationships do not span tables - they live inside a single one. An employees table where every row has a manager_id pointing at another row in the same table is the classic case. You cannot join it to a separate "managers" table because there is no separate table. The self-join solves this: alias the same table twice, give each alias a different name, and the database treats them as two independent sources.
It feels circular the first time you see it. After a few examples it clicks.
The most common use case: hierarchies
The canonical setup is an employees table where each employee has a manager_id referencing another employee's id. You want each employee's name next to their manager's name:
SELECT
e.name AS employee,
m.name AS manager,
e.title AS employee_title
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;Here e is the employee and m is the manager. Both aliases point to the same table, but the query treats them as two separate sources. The result:
employee | manager | employee_title
----------|----------|----------------
Alice | Carol | Engineer
Bob | Carol | Engineer
Carol | NULL | VP EngineeringNotice the LEFT JOIN. Carol is the top of the org chart - her manager_id is NULL. Switch to an INNER JOIN and she disappears from the result. That trips people up the first time. When the relationship is optional, always use LEFT JOIN.
Comparing rows within the same table
This is where self-joins get fun. Which employees earn more than their direct manager?
SELECT
e.name AS employee,
e.salary AS employee_salary,
m.name AS manager,
m.salary AS manager_salary,
e.salary - m.salary AS difference
FROM employees e
JOIN employees m ON e.manager_id = m.id
WHERE e.salary > m.salary;Without the self-join, you would need a correlated subquery or a round-trip through application code. The self-join keeps it clean.
Generating unique pairs
Say you want all pairs of employees in the same department. Join the table to itself on department and you will get the right rows - but also duplicates and employees paired with themselves. You need one extra condition:
SELECT a.name AS person_a, b.name AS person_b
FROM employees a
JOIN employees b ON a.department = b.department
AND a.id < b.id;The a.id < b.id condition does two things at once. A row's ID can never be less than itself, so self-pairings are gone. And because we enforce a.id < b.id, we only see (Alice, Bob) - never (Bob, Alice). Without it you get every combination twice, plus every employee paired with themselves. One small clause, three problems solved.
Finding rows with no matching pair
You can flip the self-join pattern to find the absence of a relationship. Which managers have no direct reports?
SELECT e.name
FROM employees e
LEFT JOIN employees reports ON reports.manager_id = e.id
WHERE reports.id IS NULL;The left join tries to find at least one employee whose manager_id matches e.id. When none exists, every column from reports is NULL. Filtering on reports.id IS NULL keeps only the rows with no match - the managers with empty queues.
When self-joins are not enough
Self-joins handle one level well: employee to manager, or manager to skip-level. But the moment you need to walk an org chart of unknown depth - from CEO down to individual contributors - you are stuck. You would have to write one join per level, and you cannot know in advance how many levels there are.
That is exactly the problem recursive CTEs exist to solve. Self-joins are a sharp tool for known-depth hierarchies and row comparisons. When the depth is variable, reach for something else.