Reference Guides/Intermediate/Data Types and Casting

Data Types and Casting

The types that matter most and how to convert between them.

Every column in a relational database has a declared type. Most people treat this as a storage hint - pick something reasonable and move on. But the type shapes everything: what comparisons are valid, how arithmetic behaves, how values sort, how much space each row takes. Get it wrong and you spend an hour chasing a query that produces no error - just silently wrong numbers.

The types you will use most

Integer types store whole numbers with no decimal component:

TypeRangeUse for
SMALLINT-32,768 to 32,767Rarely needed
INTEGER-2.1B to 2.1BCounts, small IDs
BIGINT-9.2Q to 9.2QIDs on large tables, timestamps as integers

Use BIGINT for primary keys on any table expected to grow large. Running out of integer space on a primary key is a painful migration.

Decimal types store numbers with fractional parts:

TypePrecisionUse for
NUMERIC(p, s) / DECIMALExactMoney, anything requiring exact arithmetic
REAL~6 decimal digitsRarely needed
DOUBLE PRECISION~15 decimal digitsScientific values, not money

Never use REAL or DOUBLE PRECISION for currency. Floating-point arithmetic is inherently imprecise: 0.1 + 0.2 in floating-point is 0.30000000000000004, not 0.3. Use NUMERIC with an explicit scale for anything involving money.

price  NUMERIC(10, 2)   -- up to 99,999,999.99 with exact arithmetic

Text types store character strings:

TypeBehavior
TEXTVariable length, no limit (PostgreSQL)
VARCHAR(n)Variable length, up to n characters
CHAR(n)Fixed length, padded with spaces to exactly n

In PostgreSQL, TEXT and VARCHAR perform identically. CHAR(n) is a legacy type that pads values with trailing spaces, which causes surprising comparison behavior. Prefer TEXT or VARCHAR.

Boolean stores TRUE, FALSE, or NULL. It is a proper type in PostgreSQL; some other databases represent booleans as integers.

Date and time types:

TypeWhat it stores
DATECalendar date only
TIMETime of day only
TIMESTAMPDate and time, no timezone
TIMESTAMPTZDate and time, stored as UTC

For any timestamp that represents a real-world event (when an order was placed, when a user logged in), use TIMESTAMPTZ. It stores the moment in UTC and converts to local time on retrieval. Plain TIMESTAMP just stores whatever you give it with no timezone context. That is fine until your application server or database host changes timezone, and then your timestamps are quietly wrong.

Casting

Sometimes you need to tell the database to treat a value as a different type. PostgreSQL gives you two ways to do that:

SELECT CAST('42' AS INTEGER);   -- standard SQL
SELECT '42'::INTEGER;            -- PostgreSQL shorthand

Both do the same thing. The :: shorthand is what you will see in most PostgreSQL code - shorter, and it reads naturally once you are used to it.

Casting fails loudly if the conversion is not possible:

SELECT 'hello'::INTEGER;  -- ERROR: invalid input syntax for type integer

A few casts come up constantly in practice:

SELECT created_at::DATE FROM orders;         -- strip the time component
SELECT '2024-01-15'::DATE;                   -- string literal to date
SELECT price::TEXT FROM products;            -- number to string
SELECT '3.14'::NUMERIC;                      -- string to exact decimal

Integer division

When you divide two integers, PostgreSQL performs integer division and truncates the result:

SELECT 7 / 2;        -- returns 3, not 3.5
SELECT 7 / 2.0;      -- returns 3.5000 (one operand is decimal)
SELECT 7::NUMERIC / 2;  -- returns 3.5

This trips up people who expect 7 / 2 to return 3.5. When you need decimal division and both operands are integers, cast at least one of them.

Implicit casting

PostgreSQL will sometimes cast automatically when comparing values of compatible types. Comparing an INTEGER column to a string literal often just works - the engine coerces the string for you. The problem is that implicit casting can silently disable index use. If customer_id is an integer and you compare it to a string, the database cannot use the index on that column. The query runs, returns the right answer, and takes ten times longer than it should. When types matter, cast explicitly.