Back to DAG

Row Format & NULL bitmap

databases

How a Row is Stored on Disk

When a database writes a row (tuple) to a page, it must serialize the column values into a compact byte sequence. The row format defines exactly how fields are laid out, how NULLs are represented, and how oversized values are handled.

Row header

Every row begins with a header containing system metadata:

  • Transaction ID (xmin/xmax) — which transaction created/deleted this row (used by MVCC).
  • Command ID — ordering within a transaction.
  • Infomask bits — flags indicating whether the row has NULLs, variable-length fields, or TOAST pointers.
  • Offset to user data — where the actual column values start.

Fixed-length fields

Columns with fixed sizes — int (4 bytes), bigint (8 bytes), float (4 bytes), char(n) (n bytes) — are stored inline at a known offset. The database can jump directly to byte offset X to read column Y because every preceding column has a predictable size.

Variable-length fields

Columns like varchar, text, or bytea have unpredictable sizes. They are stored using a (offset, length) or a length-prefixed scheme. PostgreSQL uses a 1-byte or 4-byte varlena header: the first byte(s) encode the total length, followed by the data bytes.

NULL bitmap

Rather than storing a sentinel value for NULL, databases use a NULL bitmap: one bit per nullable column. If bit i is 1, column i is NULL and no bytes are stored for it. This is vastly more efficient than storing a placeholder value, especially for wide tables with many nullable columns.

For fixed-length fields, the bitmap lets the engine skip the field entirely. For variable-length fields, a NULL means zero bytes — no length prefix, no data. The bitmap itself costs only ceil(num_columns / 8) bytes per row.

TOAST — The Oversized-Attribute Storage Technique

When a single column value exceeds roughly 2 KB (PostgreSQL's threshold is ~2000 bytes), it cannot fit inline in an 8 KB page alongside other rows. TOAST handles this by:

  1. Compressing the value (using pglz or lz4).
  2. If still too large, storing it out-of-line in a separate TOAST table, replacing the inline value with an 18-byte TOAST pointer (chunk ID + size).
  3. Large values are split into chunks (typically ~2000 bytes each) stored across multiple TOAST table rows.

TOAST is transparent: queries see the full value, but the storage engine fetches it in chunks only when needed.

Alignment and padding

Columns are typically aligned to their natural boundary (4-byte ints at 4-byte offsets, 8-byte bigints at 8-byte offsets). This may introduce padding bytes between columns. Smart column ordering (largest alignment first) can reduce wasted space.

Real-Life: PostgreSQL HeapTupleHeader

Real-World Example

In PostgreSQL, every heap tuple starts with a 23-byte header (HeapTupleHeaderData):

  • t_xmin (4 bytes) — inserting transaction ID
  • t_xmax (4 bytes) — deleting transaction ID (0 if not deleted)
  • t_cid (4 bytes) — command ID within the transaction
  • t_ctid (6 bytes) — current tuple ID (page, offset)
  • t_infomask2 (2 bytes) — number of attributes + flags
  • t_infomask (2 bytes) — NULL bitmap present? Has varlen fields? Has TOAST?
  • t_hoff (1 byte) — offset to user data

Immediately after the header comes the NULL bitmap (if the infomask says NULLs exist). Then the actual column data begins at byte offset t_hoff.

Example row for CREATE TABLE users (id int, name varchar, email varchar, age int): | Header (23B) | NULL bitmap (1B: 0b0000) | id: 42 (4B) | name: "Alice" (6B) | email: "a@b.com" (8B) | age: 25 (4B) |

If email were NULL, the bitmap would be 0b0100, and the email bytes would be completely absent — saving 8 bytes.

Row Format with NULL Bitmap and TOAST

Row Layout: (id INT, name VARCHAR, bio TEXT, age INT) Row 1 — no NULLs: Header 23B NULL: 0000 id: 42 name:Alice bio: "Hello..." age:25 Row 2 — name is NULL (bit 1 = 1): Header 23B NULL: 0010 id: 99 name: skipped (0 bytes) bio: "World..." age:30 Row 3 — bio is TOASTed (too large for inline): Header 23B NULL: 0000 id: 7 name:Bob TOAST ptr 18B age:45 TOAST Table (separate heap file) chunk0: bytes[0..1999] | chunk1: bytes[2000..3500]
Step 1 of 2