How a Row is Stored on Disk

Prerequisites(1)

Builds on to(3)

Clustered Index Row-Oriented Storage (OLTP)Version Chain (undo log)

When a database writes a row (tuple) to a page, it must serialize the column values into a compact byte sequence. The row format defines exactly how fields are laid out, how NULLs are represented, and how oversized values are handled.

Row header

Every row begins with a header containing system metadata:

Transaction ID (xmin/xmax) — which transaction created/deleted this row (used by MVCC).
Command ID — ordering within a transaction.
Infomask bits — flags indicating whether the row has NULLs, variable-length fields, or TOAST pointers.
Offset to user data — where the actual column values start.

Fixed-length fields

Columns with fixed sizes — int (4 bytes), bigint (8 bytes), float (4 bytes), char(n) (n bytes) — are stored inline at a known offset. The database can jump directly to byte offset X to read column Y because every preceding column has a predictable size.

Variable-length fields

Columns like varchar, text, or bytea have unpredictable sizes. They are stored using a (offset, length) or a length-prefixed scheme. PostgreSQL uses a 1-byte or 4-byte varlena header: the first byte(s) encode the total length, followed by the data bytes.

NULL bitmap

Rather than storing a sentinel value for NULL, databases use a NULL bitmap: one bit per nullable column. If bit i is 1, column i is NULL and no bytes are stored for it. This is vastly more efficient than storing a placeholder value, especially for wide tables with many nullable columns.

For fixed-length fields, the bitmap lets the engine skip the field entirely. For variable-length fields, a NULL means zero bytes — no length prefix, no data. The bitmap itself costs only ceil(num_columns / 8) bytes per row.

TOAST — The Oversized-Attribute Storage Technique

When a single column value exceeds roughly 2 KB (PostgreSQL's threshold is ~2000 bytes), it cannot fit inline in an 8 KB page alongside other rows. TOAST handles this by:

Compressing the value (using pglz or lz4).
If still too large, storing it out-of-line in a separate TOAST table, replacing the inline value with an 18-byte TOAST pointer (chunk ID + size).
Large values are split into chunks (typically ~2000 bytes each) stored across multiple TOAST table rows.

TOAST is transparent: queries see the full value, but the storage engine fetches it in chunks only when needed.

Alignment and padding

Columns are typically aligned to their natural boundary (4-byte ints at 4-byte offsets, 8-byte bigints at 8-byte offsets). This may introduce padding bytes between columns. Smart column ordering (largest alignment first) can reduce wasted space.

Real-Life: PostgreSQL HeapTupleHeader

Real-World Example

In PostgreSQL, every heap tuple starts with a 23-byte header (HeapTupleHeaderData):

t_xmin (4 bytes) — inserting transaction ID
t_xmax (4 bytes) — deleting transaction ID (0 if not deleted)
t_cid (4 bytes) — command ID within the transaction
t_ctid (6 bytes) — current tuple ID (page, offset)
t_infomask2 (2 bytes) — number of attributes + flags
t_infomask (2 bytes) — NULL bitmap present? Has varlen fields? Has TOAST?
t_hoff (1 byte) — offset to user data

Immediately after the header comes the NULL bitmap (if the infomask says NULLs exist). Then the actual column data begins at byte offset t_hoff.

If email were NULL, the bitmap would be 0b0100, and the email bytes would be completely absent — saving 8 bytes.

Row Format with NULL Bitmap and TOAST

Step 1 of 2

Row Format & NULL bitmap