Set Operations: UNION, INTERSECT, EXCEPT
Combining the results of multiple queries as if they were sets.
Set operations combine the results of two or more SELECT queries by treating each result set as a mathematical set of rows. Rather than joining tables horizontally (adding columns), set operations combine them vertically (adding rows). The queries being combined must return the same number of columns, and corresponding columns must have compatible data types.
There are three set operators: UNION, INTERSECT, and EXCEPT. Each answers a different kind of question.
UNION
UNION combines two result sets and removes duplicate rows from the output:
SELECT name, email FROM customers
UNION
SELECT name, email FROM prospects;
If the same (name, email) pair appears in both tables, it appears only once. The deduplication step requires sorting or hashing the entire result set, which has a cost.
UNION ALL keeps every row, including duplicates, and skips the deduplication:
SELECT product_id, quantity FROM warehouse_a
UNION ALL
SELECT product_id, quantity FROM warehouse_b;
UNION ALL is faster. Use it whenever you know duplicates are not possible, or when you want to count every row (including duplicates). Use UNION only when you specifically need deduplication.
INTERSECT
INTERSECT returns only rows that appear in both result sets:
SELECT customer_id FROM orders_2023
INTERSECT
SELECT customer_id FROM orders_2024;
This returns the IDs of customers who placed at least one order in 2023 and at least one in 2024. Duplicates within each query are removed before the intersection is computed.
EXCEPT
EXCEPT returns rows from the first query that do not appear in the second. It is the set subtraction operator:
SELECT customer_id FROM customers
EXCEPT
SELECT customer_id FROM orders;
This returns customers who have never placed an order. Order matters with EXCEPT: flipping the queries gives you a completely different result. (Oracle calls this operator MINUS.)
Summary of set operators
| Operator | Returns |
|---|---|
UNION | All rows from both queries, duplicates removed |
UNION ALL | All rows from both queries, duplicates kept |
INTERSECT | Rows that appear in both queries |
EXCEPT | Rows in the first query not in the second |
Column names and types
The column names in the combined result come from the first query. The second and subsequent queries' column names are ignored. This matters when you are referencing the output downstream or using the combined result in a CTE.
Types must be compatible, but not necessarily identical. PostgreSQL will attempt to cast compatible types automatically (integer and numeric, for example).
Sorting the combined result
ORDER BY goes at the very end of the entire combined query and applies to the full result:
SELECT name, 'customer' AS source FROM customers
UNION ALL
SELECT name, 'prospect' AS source FROM prospects
ORDER BY name;
You cannot add ORDER BY to an individual query within a set operation. The ordering applies only after the combination is complete.
When to use set operations vs other approaches
Set operations work on rows that have the same shape. JOINs combine rows from tables that are related by a key. For "find rows in A that are not in B," EXCEPT is concise, but NOT EXISTS or LEFT JOIN ... WHERE b.id IS NULL are often more flexible and handle NULL more predictably. For simple cases, EXCEPT is clear and direct.