NULL: The Absent Value
Why NULL is not zero, not empty string, and why it breaks the rules you expect.
NULL is the database's way of representing missing or unknown information. It is not zero. It is not an empty string. It is a special marker that says: we do not know what this value is. That semantic distinction has real consequences for how NULL behaves in comparisons, arithmetic, and aggregation, and understanding those consequences is one of the most important things you can do as a SQL writer.
NULL is not equal to anything, including itself
This is the behavior that surprises everyone at first:
SELECT 1 = NULL; -- returns NULL, not true or false
SELECT NULL = NULL; -- returns NULL, not true
The logic, once you accept the premise, is sound. If a value is unknown, you cannot conclude that it equals anything. You do not know what it is. An unknown value might equal 1, or it might not. The result of comparing it to 1 is therefore also unknown, which is NULL.
The practical consequence is that you cannot filter for NULL using =:
-- Returns zero rows, even if many employees have no manager
SELECT * FROM employees WHERE manager_id = NULL;
-- Correct
SELECT * FROM employees WHERE manager_id IS NULL;
SELECT * FROM employees WHERE manager_id IS NOT NULL;
Always use IS NULL and IS NOT NULL. Using = with NULL is not a syntax error; the query just silently returns nothing, which is worse.
NULL propagates through expressions
NULL is infectious. When you include NULL in almost any expression, the result is NULL:
| Expression | Result |
|---|---|
5 + NULL | NULL |
salary * 1.10 (if salary is NULL) | NULL |
| `'hello' | |
NULL AND TRUE | NULL |
NULL OR TRUE | TRUE (exception) |
NULL OR FALSE | NULL |
The OR TRUE exception exists because if one side is definitely true, the overall result is true regardless of the other operand. But AND and arithmetic follow the general rule: NULL in, NULL out.
This means that if a column can be NULL and you are using it in a calculation, the result for those rows will be NULL unless you handle it.
COALESCE: substituting a default
COALESCE returns the first non-NULL value from its argument list. It is the standard way to provide a default when a column might be NULL:
SELECT
name,
COALESCE(commission, 0) AS commission,
salary + COALESCE(commission, 0) AS total_compensation
FROM employees;
If commission is NULL, you get 0. If it has a value, you get that value. This prevents the NULL from propagating into the total_compensation calculation.
COALESCE accepts any number of arguments and returns the first non-NULL one. You can use it for fallback chains: COALESCE(preferred_name, first_name, 'Unknown').
NULL in aggregate functions
Aggregate functions mostly ignore NULL values, with one important exception:
| Function | Behavior with NULL |
|---|---|
COUNT(*) | Counts all rows, including those with NULLs |
COUNT(column) | Counts only non-NULL values in that column |
SUM(column) | Ignores NULLs, sums the rest |
AVG(column) | Ignores NULLs, averages the rest |
MIN / MAX | Ignores NULLs |
The distinction between COUNT(*) and COUNT(column) matters more than most beginners expect:
SELECT
COUNT(*) AS total_employees, -- everyone
COUNT(bonus) AS employees_with_bonus -- only those where bonus is not NULL
FROM employees;
Similarly, AVG(bonus) divides by the number of non-NULL values, not the total number of employees. If you want to treat NULL bonuses as zero, you need to be explicit: AVG(COALESCE(bonus, 0)).
The NOT IN trap
This is the most dangerous NULL behavior in practice. If any value in a NOT IN list is NULL, the entire expression evaluates to NULL, and zero rows are returned.
-- Finds employees who are not managers -- or tries to
SELECT * FROM employees
WHERE id NOT IN (SELECT manager_id FROM employees);
If even one employee has a NULL manager_id (for example, the CEO who has no manager), the subquery returns a list that contains NULL. SQL cannot confirm that id is "not in" a list that contains an unknown value, so every comparison returns NULL, and every row is excluded.
The fix: filter nulls from the subquery.
WHERE id NOT IN (
SELECT manager_id FROM employees
WHERE manager_id IS NOT NULL
)
Alternatively, use NOT EXISTS, which handles NULL correctly by design and is often clearer:
WHERE NOT EXISTS (
SELECT 1 FROM employees mgr
WHERE mgr.manager_id = employees.id
)
Designing for NULL
NULL is not a bug in SQL; it is a deliberate modeling tool. The discipline is to be intentional about which columns can be NULL and what that means in your domain. "No manager" is meaningfully different from "manager not yet recorded," and your schema design should reflect that. The more carefully you define which columns are nullable and why, the less surprising NULL's behavior will feel in practice.