Guide/Intermediate/Data Types and Casting

Data Types and Casting

The types that matter most and how to convert between them.

Every column in a relational database has a declared type. That type is not just a label; it determines what values the column can store, how the database performs comparisons and arithmetic, how values are sorted, and how much space they occupy. Understanding the common types and how casting works prevents a category of subtle bugs that are hard to trace back to their source.

The types you will use most

Integer types store whole numbers with no decimal component:

TypeRangeUse for
SMALLINT-32,768 to 32,767Rarely needed
INTEGER-2.1B to 2.1BCounts, small IDs
BIGINT-9.2Q to 9.2QIDs on large tables, timestamps as integers

Use BIGINT for primary keys on any table expected to grow large. Running out of integer space on a primary key is a painful migration.

Decimal types store numbers with fractional parts:

TypePrecisionUse for
NUMERIC(p, s) / DECIMALExactMoney, anything requiring exact arithmetic
REAL~6 decimal digitsRarely needed
DOUBLE PRECISION~15 decimal digitsScientific values, not money

Never use REAL or DOUBLE PRECISION for currency. Floating-point arithmetic is inherently imprecise: 0.1 + 0.2 in floating-point is 0.30000000000000004, not 0.3. Use NUMERIC with an explicit scale for anything involving money.

price  NUMERIC(10, 2)   -- up to 99,999,999.99 with exact arithmetic

Text types store character strings:

TypeBehavior
TEXTVariable length, no limit (PostgreSQL)
VARCHAR(n)Variable length, up to n characters
CHAR(n)Fixed length, padded with spaces to exactly n

In PostgreSQL, TEXT and VARCHAR perform identically. CHAR(n) is a legacy type that pads values with trailing spaces, which causes surprising comparison behavior. Prefer TEXT or VARCHAR.

Boolean stores TRUE, FALSE, or NULL. It is a proper type in PostgreSQL; some other databases represent booleans as integers.

Date and time types:

TypeWhat it stores
DATECalendar date only
TIMETime of day only
TIMESTAMPDate and time, no timezone
TIMESTAMPTZDate and time, stored as UTC

For any timestamp that represents a real-world event (when an order was placed, when a user logged in), use TIMESTAMPTZ. It stores the moment in UTC and converts to local time on retrieval. Plain TIMESTAMP stores whatever you give it with no timezone context, which causes problems when the application or database server changes timezones.

Casting

Casting converts a value from one type to another. PostgreSQL supports two syntaxes:

SELECT CAST('42' AS INTEGER);   -- standard SQL
SELECT '42'::INTEGER;            -- PostgreSQL shorthand

Both are equivalent. The shorthand form (::) is more common in PostgreSQL code.

Casting fails loudly if the conversion is not possible:

SELECT 'hello'::INTEGER;  -- ERROR: invalid input syntax for type integer

Common useful casts:

SELECT created_at::DATE FROM orders;         -- strip the time component
SELECT '2024-01-15'::DATE;                   -- string literal to date
SELECT price::TEXT FROM products;            -- number to string
SELECT '3.14'::NUMERIC;                      -- string to exact decimal

Integer division

When you divide two integers, PostgreSQL performs integer division and truncates the result:

SELECT 7 / 2;        -- returns 3, not 3.5
SELECT 7 / 2.0;      -- returns 3.5000 (one operand is decimal)
SELECT 7::NUMERIC / 2;  -- returns 3.5

This trips up people who expect 7 / 2 to return 3.5. When you need decimal division and both operands are integers, cast at least one of them.

Implicit casting

The database will sometimes cast automatically when comparing or combining values of compatible types. Comparing an INTEGER column to a string literal often works because PostgreSQL coerces the string. However, implicit casting can prevent index use (the database cannot use an index on customer_id if it is comparing an integer column to a string) and can produce surprising results in edge cases. When the types matter, cast explicitly rather than relying on the database to figure it out.