Guide/Beginner/NULL: The Absent Value

NULL: The Absent Value

Why NULL is not zero, not empty string, and why it breaks the rules you expect.

NULL is the database's way of representing missing or unknown information. It is not zero. It is not an empty string. It is a special marker that says: we do not know what this value is. That semantic distinction has real consequences for how NULL behaves in comparisons, arithmetic, and aggregation, and understanding those consequences is one of the most important things you can do as a SQL writer.

NULL is not equal to anything, including itself

This is the behavior that surprises everyone at first:

SELECT 1 = NULL;    -- returns NULL, not true or false
SELECT NULL = NULL; -- returns NULL, not true

The logic, once you accept the premise, is sound. If a value is unknown, you cannot conclude that it equals anything. You do not know what it is. An unknown value might equal 1, or it might not. The result of comparing it to 1 is therefore also unknown, which is NULL.

The practical consequence is that you cannot filter for NULL using =:

-- Returns zero rows, even if many employees have no manager
SELECT * FROM employees WHERE manager_id = NULL;

-- Correct
SELECT * FROM employees WHERE manager_id IS NULL;
SELECT * FROM employees WHERE manager_id IS NOT NULL;

Always use IS NULL and IS NOT NULL. Using = with NULL is not a syntax error; the query just silently returns nothing, which is worse.

NULL propagates through expressions

NULL is infectious. When you include NULL in almost any expression, the result is NULL:

ExpressionResult
5 + NULLNULL
salary * 1.10 (if salary is NULL)NULL
`'hello'
NULL AND TRUENULL
NULL OR TRUETRUE (exception)
NULL OR FALSENULL

The OR TRUE exception exists because if one side is definitely true, the overall result is true regardless of the other operand. But AND and arithmetic follow the general rule: NULL in, NULL out.

This means that if a column can be NULL and you are using it in a calculation, the result for those rows will be NULL unless you handle it.

COALESCE: substituting a default

COALESCE returns the first non-NULL value from its argument list. It is the standard way to provide a default when a column might be NULL:

SELECT
  name,
  COALESCE(commission, 0) AS commission,
  salary + COALESCE(commission, 0) AS total_compensation
FROM employees;

If commission is NULL, you get 0. If it has a value, you get that value. This prevents the NULL from propagating into the total_compensation calculation.

COALESCE accepts any number of arguments and returns the first non-NULL one. You can use it for fallback chains: COALESCE(preferred_name, first_name, 'Unknown').

NULL in aggregate functions

Aggregate functions mostly ignore NULL values, with one important exception:

FunctionBehavior with NULL
COUNT(*)Counts all rows, including those with NULLs
COUNT(column)Counts only non-NULL values in that column
SUM(column)Ignores NULLs, sums the rest
AVG(column)Ignores NULLs, averages the rest
MIN / MAXIgnores NULLs

The distinction between COUNT(*) and COUNT(column) matters more than most beginners expect:

SELECT
  COUNT(*)      AS total_employees,    -- everyone
  COUNT(bonus)  AS employees_with_bonus -- only those where bonus is not NULL
FROM employees;

Similarly, AVG(bonus) divides by the number of non-NULL values, not the total number of employees. If you want to treat NULL bonuses as zero, you need to be explicit: AVG(COALESCE(bonus, 0)).

The NOT IN trap

This is the most dangerous NULL behavior in practice. If any value in a NOT IN list is NULL, the entire expression evaluates to NULL, and zero rows are returned.

-- Finds employees who are not managers -- or tries to
SELECT * FROM employees
WHERE id NOT IN (SELECT manager_id FROM employees);

If even one employee has a NULL manager_id (for example, the CEO who has no manager), the subquery returns a list that contains NULL. SQL cannot confirm that id is "not in" a list that contains an unknown value, so every comparison returns NULL, and every row is excluded.

The fix: filter nulls from the subquery.

WHERE id NOT IN (
  SELECT manager_id FROM employees
  WHERE manager_id IS NOT NULL
)

Alternatively, use NOT EXISTS, which handles NULL correctly by design and is often clearer:

WHERE NOT EXISTS (
  SELECT 1 FROM employees mgr
  WHERE mgr.manager_id = employees.id
)

Designing for NULL

NULL is not a bug in SQL; it is a deliberate modeling tool. The discipline is to be intentional about which columns can be NULL and what that means in your domain. "No manager" is meaningfully different from "manager not yet recorded," and your schema design should reflect that. The more carefully you define which columns are nullable and why, the less surprising NULL's behavior will feel in practice.