Filtering with WHERE
How to ask for only the rows you actually want.
A table might have millions of rows. You almost never want all of them. The WHERE clause is how you tell the database which rows you actually care about. It is the most fundamental way to narrow a query, and it appears in nearly every SELECT statement you will ever write.
The database evaluates the condition in your WHERE clause against each row. Rows where the condition is true are included; everything else is discarded before the result is returned to you.
Comparison operators
The six comparison operators cover most situations:
| Operator | Meaning |
|---|---|
= | Equal to |
!= or <> | Not equal to |
< | Less than |
> | Greater than |
<= | Less than or equal to |
>= | Greater than or equal to |
SELECT name, salary
FROM employees
WHERE salary > 80000;
The most common mistake for programmers coming from other languages is using == for equality. SQL uses a single = for both assignment and comparison. Use != or <> for "not equal"; both are valid, though != is more common in PostgreSQL.
Combining conditions with AND and OR
You can chain conditions using AND (both must be true) and OR (at least one must be true):
SELECT name, department, salary
FROM employees
WHERE department = 'Engineering'
AND salary > 90000;
When you mix AND and OR in the same clause, AND binds more tightly than OR. This is the same precedence rule as multiplication vs. addition in arithmetic, but it trips people up in SQL. Use parentheses to make the intent explicit:
-- Without parentheses, this reads as:
-- department = 'Engineering' AND (salary > 90000 OR start_date > '2023-01-01')
WHERE department = 'Engineering'
AND salary > 90000
OR start_date > '2023-01-01';
-- With parentheses, unambiguous:
WHERE (department = 'Engineering' OR department = 'Data')
AND salary > 90000;
When in doubt, add parentheses. They do not hurt performance and they prevent a category of silent, hard-to-diagnose bugs.
IN and NOT IN
When you want to match a column against a list of values, IN is cleaner than chaining multiple OR conditions:
-- Verbose
WHERE department = 'Engineering' OR department = 'Data' OR department = 'Design'
-- Cleaner
WHERE department IN ('Engineering', 'Data', 'Design')
NOT IN excludes the listed values. There is an important caveat: if any value in the NOT IN list is NULL, the entire expression evaluates to NULL and no rows are returned. The NULL guide explains why. For now, just remember that NOT IN and NULL do not mix safely.
LIKE for pattern matching
LIKE matches strings against a pattern using two wildcard characters:
| Wildcard | Matches |
|---|---|
% | Any sequence of zero or more characters |
_ | Exactly one character |
WHERE email LIKE '%@gmail.com' -- ends with @gmail.com
WHERE name LIKE 'A%' -- starts with A
WHERE phone LIKE '555-____' -- 555- followed by exactly 4 characters
LIKE is case-sensitive by default in PostgreSQL. Use ILIKE for a case-insensitive match (PostgreSQL-specific, but very useful):
WHERE name ILIKE 'alice%' -- matches "Alice", "ALICE", "alice"
BETWEEN
BETWEEN tests whether a value falls within a range, inclusive on both ends:
WHERE hire_date BETWEEN '2020-01-01' AND '2022-12-31'
-- equivalent to:
WHERE hire_date >= '2020-01-01' AND hire_date <= '2022-12-31'
The two forms are interchangeable. BETWEEN is slightly more readable for date ranges; explicit comparisons are clearer when the bounds are asymmetric (one inclusive, one exclusive).
NOT
NOT inverts any condition. It works with LIKE, IN, BETWEEN, and EXISTS:
WHERE name NOT LIKE 'A%'
WHERE department NOT IN ('Sales', 'Marketing')
WHERE salary NOT BETWEEN 50000 AND 100000
For simple equality, != is cleaner than NOT ... =. But NOT LIKE, NOT IN, and NOT BETWEEN are idiomatic and widely used.