Distinct
How to eliminate duplicate rows from query results.
SELECT DISTINCT removes duplicate rows from the result set. It is one of the simplest SQL keywords, but its interaction with COUNT, GROUP BY, and multi-column selects trips people up more than you might expect.
Basic DISTINCT
SELECT DISTINCT country FROM customers;Without DISTINCT, this returns one row per customer — many duplicates if customers share countries. With DISTINCT, it returns each country exactly once.
DISTINCT across multiple columns
DISTINCT applies to the entire row, not to a single column. This query returns each unique (country, city) pair:
SELECT DISTINCT country, city FROM customers;You cannot write SELECT DISTINCT country, city and expect only country to be deduplicated. The deduplication always covers every selected column together.
COUNT(DISTINCT column)
To count how many unique values exist in a column, combine COUNT with DISTINCT:
SELECT COUNT(DISTINCT customer_id) AS unique_customers FROM orders;This counts unique customer IDs, not total orders. COUNT(*) would give the total number of rows; COUNT(DISTINCT customer_id) gives the number of distinct customers who placed at least one order.
DISTINCT vs GROUP BY
Both produce unique combinations of the selected columns. The difference is purpose and output:
-- DISTINCT: just deduplicate
SELECT DISTINCT department FROM employees;
-- GROUP BY: deduplicate and optionally aggregate
SELECT department, COUNT(*) AS headcount FROM employees GROUP BY department;When you only need unique values and no aggregation, DISTINCT is cleaner. When you need aggregate functions, you must use GROUP BY. Functionally, SELECT DISTINCT col FROM t and SELECT col FROM t GROUP BY col return identical results — the planner may even produce the same execution plan.
Performance
On large tables, DISTINCT can be expensive because the database must sort or hash all the rows to find duplicates. If you are using DISTINCT to paper over a join that produces unexpected duplicates, it is usually better to fix the join condition instead. Unnecessary DISTINCT is a common sign of an incorrect query hiding behind deduplication.