Reference Guides/Beginner/Joins

Joins

How to pull related data from multiple tables into a single result.

Relational databases deliberately spread data across multiple tables. Customers in one, orders in another, products in a third. This design eliminates redundancy and keeps updates clean - but it means almost every real question requires combining tables. JOINs are how you do that.

The core idea: a JOIN matches rows from two tables based on a condition you specify. The type of JOIN determines what happens when a row on one side has no partner on the other.

INNER JOIN

An INNER JOIN returns only rows where the join condition is satisfied in both tables:

SELECT o.id, c.name, o.total
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id;

If an order references a customer_id that does not exist in the customers table, that order is excluded. A customer with no orders is excluded too. Only rows with a match on both sides make it through.

The keyword INNER is optional. A bare JOIN is an inner join, and that is what most people write.

LEFT JOIN

A LEFT JOIN keeps every row from the left table and pulls in matching rows from the right. When no match exists, the right-side columns come back as NULL. The left side is never dropped.

SELECT c.name, COUNT(o.id) AS order_count
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
GROUP BY c.id, c.name;

Every customer shows up, even those with no orders. For those customers, o.id is NULL - and COUNT(o.id) returns 0, because COUNT on a column ignores NULLs.

One of the most common LEFT JOIN patterns is finding rows in A that have no match in B at all:

SELECT c.name
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
WHERE o.id IS NULL;  -- customers with no orders at all

RIGHT JOIN

The mirror image of LEFT JOIN: all rows from the right table, matching rows from the left, NULLs where nothing matches. You can always rewrite a RIGHT JOIN as a LEFT JOIN by swapping the table order. Most people do - RIGHT JOIN shows up rarely in real code.

FULL JOIN

Returns all rows from both tables, with NULLs on whichever side has no match:

SELECT a.id AS a_id, b.id AS b_id
FROM table_a a
FULL JOIN table_b b ON a.id = b.id;

Useful for comparing two datasets and finding what is missing on either side. You will see this occasionally in data reconciliation work.

Comparing join types

Join typeReturns
INNER JOINOnly rows with a match in both tables
LEFT JOINAll rows from the left table; NULLs for unmatched right-side columns
RIGHT JOINAll rows from the right table; NULLs for unmatched left-side columns
FULL JOINAll rows from both tables; NULLs where no match exists

The ON clause vs WHERE for filtering

Here is something that surprises almost everyone the first time they see it. You can filter a join in the ON clause or in WHERE - and for an inner join, the results look identical. For a LEFT JOIN, the placement changes the meaning entirely.

-- Filtering in ON: all customers appear; only shipped orders are shown
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id AND o.status = 'shipped'

-- Filtering in WHERE: customers with no shipped orders are excluded
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
WHERE o.status = 'shipped'

The ON condition is part of the join itself - it controls which rows pair up. The WHERE condition runs after the join is assembled and tosses out rows that do not qualify. When you add WHERE o.status = 'shipped' after a left join, you are effectively turning it back into an inner join, because unmatched rows (where o.status is NULL) get filtered out. This trips people up constantly.

Table aliases

When joining multiple tables, aliases are not optional - they are how you keep the query readable and avoid ambiguity when two tables share a column name like id:

SELECT o.id, c.name, p.title, oi.quantity
FROM orders o
JOIN customers c ON o.customer_id = c.id
JOIN order_items oi ON oi.order_id = o.id
JOIN products p ON p.id = oi.product_id;

Without aliases, every column reference needs the full table name. Four joined tables would make the query almost unreadable. Short, consistent aliases are one of those small habits that make a big difference in how fast other people - including future you - can understand what a query does.