The Index Node: File Identity on Disk

An inode (index node) is a data structure on a Unix/Linux filesystem that stores all the metadata about a file -- everything except the filename and the file's actual data contents. Every file and directory on the filesystem has exactly one inode, identified by a unique inode number within that filesystem.

What an inode Contains

Field            Description
File type        Regular file, directory, symlink, block device, etc.
Permissions      Read/write/execute for owner, group, others (the rwx bits)
Owner (UID)      User ID of the file owner
Group (GID)      Group ID
Size             File size in bytes
Timestamps       atime (last access), mtime (last modification), ctime (last inode change)
Link count       Number of hard links pointing to this inode
Block pointers   Pointers to the disk blocks containing the file's data

Crucially, the filename is NOT stored in the inode. Filenames are stored in directory entries (dentries). A directory is simply a special file whose data is a list of (name, inode number) pairs. This separation is what makes hard links possible.
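To make the (name, inode number) pairing concrete, here is a minimal Python sketch that lists a directory the way the filesystem sees it. The helper name list_directory_entries is made up for this illustration; os.scandir() and DirEntry.inode() are standard-library calls.

```python
import os
import tempfile


def list_directory_entries(path):
    """Return sorted (name, inode number) pairs for the entries in a directory."""
    with os.scandir(path) as entries:
        return sorted((entry.name, entry.inode()) for entry in entries)


if __name__ == "__main__":
    # Build a throwaway directory with two files and show its entries.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a.txt", "b.txt"):
            with open(os.path.join(tmp, name), "w") as f:
                f.write("x")
        for name, ino in list_directory_entries(tmp):
            print(f"{name!r} -> inode {ino}")
```

The output contains only names and numbers: everything else about each file has to be fetched from the inode itself.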

Block Pointers and Indirect Blocks

In the classic Unix layout (used by ext2 and ext3, for example), an inode has 12 direct block pointers, each pointing to one disk block (e.g., 4 KB). With 4 KB blocks, the 12 direct pointers cover files up to 48 KB. For larger files, the inode uses indirect pointers:

  • Single indirect pointer: points to a block full of direct pointers. With 4-byte pointers and 4 KB blocks, one block holds 1024 pointers, which covers another 4 MB of data.
  • Double indirect pointer: points to a block of single indirect pointers. This adds 1024 x 1024 data-block pointers, covering another 4 GB.
  • Triple indirect pointer: points to a block of double indirect pointers. This adds 1024^3 data-block pointers, covering another 4 TB.

This hierarchical pointer scheme means small files have fast, direct access (no indirection), while very large files are still addressable through multiple levels of indirection.
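The arithmetic above can be checked with a short Python sketch. It assumes the same figures as the text (4 KB blocks, 4-byte pointers, 12 direct pointers); the function names are invented for this illustration and do not correspond to any kernel API.

```python
BLOCK_SIZE = 4096        # bytes per block (the 4 KB example from the text)
POINTER_SIZE = 4         # bytes per block pointer
DIRECT_POINTERS = 12     # direct pointers stored in the inode
PTRS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE  # 1024 pointers fit in one block


def level_capacities():
    """Number of data blocks reachable through each pointer level."""
    return {
        "direct": DIRECT_POINTERS,
        "single": PTRS_PER_BLOCK,
        "double": PTRS_PER_BLOCK ** 2,
        "triple": PTRS_PER_BLOCK ** 3,
    }


def indirection_level(block_index):
    """Return the pointer level that addresses a 0-based logical block index."""
    remaining = block_index
    for level, capacity in level_capacities().items():
        if remaining < capacity:
            return level
        remaining -= capacity
    raise ValueError("block index beyond the triple-indirect range")


def max_file_size():
    """Largest file size in bytes addressable with all four levels."""
    return sum(level_capacities().values()) * BLOCK_SIZE


if __name__ == "__main__":
    for offset in (0, 48 * 1024, 5 * 1024 * 1024):
        level = indirection_level(offset // BLOCK_SIZE)
        print(f"byte offset {offset}: {level}")
    print("max file size in bytes:", max_file_size())
```

Under these assumptions, byte 0 is reached through a direct pointer, the first byte past 48 KB needs the single indirect block, and an offset of 5 MB already falls in the double indirect range.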

The inode Table

Traditional Unix filesystems such as ext4 keep an inode table -- a region on disk containing all inodes. On these filesystems the total number of inodes is fixed at filesystem creation time (with mkfs). You can therefore run out of inodes even if disk space remains, which typically happens when a filesystem stores millions of tiny files. The command df -i shows inode usage.
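The same counters that df -i prints can be read from a program. The sketch below uses os.statvfs(), whose f_files and f_ffree fields report total and free inodes; the wrapper name inode_usage is hypothetical. Some filesystems report a total of 0 here, so the numbers should be treated as advisory.

```python
import os


def inode_usage(path="/"):
    """Return (total, free, used) inode counts for the filesystem holding path."""
    vfs = os.statvfs(path)
    total = vfs.f_files   # total inodes on the filesystem
    free = vfs.f_ffree    # inodes still unallocated
    return total, free, total - free


if __name__ == "__main__":
    total, free, used = inode_usage("/")
    print(f"inodes: total={total} used={used} free={free}")
```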

inode Number as File Identity

The inode number is the true identity of a file. Two directory entries (hard links) with different names but the same inode number refer to the same file -- same data, same metadata, same block pointers. Deleting a file (via unlink()) decrements the link count in the inode. Only when the link count reaches zero (and no process has the file open) does the OS actually free the inode and its data blocks.
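The "no process has the file open" condition can be observed directly. The following Python sketch (the function name read_after_unlink is made up for this example) removes a file's only name while a descriptor is still open, then reads the data through that descriptor. On typical local Linux filesystems the reported link count after the unlink is 0; network filesystems may behave differently.

```python
import os
import tempfile


def read_after_unlink(payload=b"still here\n"):
    """Unlink a file while it is open, then read it through the descriptor."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "victim.txt")
    with open(path, "wb") as f:
        f.write(payload)

    fd = os.open(path, os.O_RDONLY)
    try:
        os.unlink(path)                       # removes the only directory entry
        links_after = os.fstat(fd).st_nlink   # link count as seen via the open fd
        name_gone = not os.path.exists(path)  # the name can no longer be resolved
        data = os.read(fd, len(payload))      # the inode and blocks are still there
    finally:
        os.close(fd)                          # last reference dropped; space can be freed
        os.rmdir(tmpdir)
    return links_after, name_gone, data


if __name__ == "__main__":
    links, gone, data = read_after_unlink()
    print(f"links={links} name_gone={gone} data={data!r}")
```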

Hard Links vs. Symbolic Links

A hard link is a directory entry that points directly to an inode. Multiple hard links share the same inode -- changes through one link are visible through all others. Hard links cannot cross filesystem boundaries (because inode numbers are only unique within a filesystem) and cannot link to directories (to prevent cycles).

A symbolic link (symlink) is a separate file with its own inode whose data content is the path to the target file. Symlinks can cross filesystems and point to directories, but they break if the target is moved or deleted (dangling symlink).
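The difference between the two link types shows up clearly when the original name is removed. This Python sketch, with a hypothetical helper compare_links and made-up file names, creates one hard link and one symlink to the same file and records what each looks like before and after the target is unlinked.

```python
import os
import tempfile


def compare_links():
    """Create a hard link and a symlink to one file and report how they differ."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target.txt")
        hard = os.path.join(tmp, "hard.txt")
        soft = os.path.join(tmp, "soft.txt")
        with open(target, "w") as f:
            f.write("data\n")
        os.link(target, hard)      # second directory entry for the same inode
        os.symlink(target, soft)   # new inode whose content is the target path

        result = {
            "hard_same_inode": os.stat(hard).st_ino == os.stat(target).st_ino,
            # lstat() inspects the symlink itself instead of following it
            "symlink_own_inode": os.lstat(soft).st_ino != os.stat(target).st_ino,
            "symlink_stores_path": os.readlink(soft) == target,
        }

        os.unlink(target)          # remove the original name
        with open(hard) as f:
            result["hard_still_readable"] = f.read() == "data\n"
        # islink() is true for the link itself; exists() follows it and fails
        result["symlink_dangling"] = os.path.islink(soft) and not os.path.exists(soft)
        return result


if __name__ == "__main__":
    for key, value in compare_links().items():
        print(f"{key}: {value}")
```

Every entry in the returned dictionary should be True: the hard link keeps the data alive, while the symlink is left pointing at a path that no longer exists.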

The stat() System Call

The stat() system call reads the inode of a file and returns its metadata to the calling program. The ls -l command relies on stat() (or a close variant such as lstat(), which does not follow symlinks) internally. You can see the inode number with ls -i or stat filename.
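From a program, the same information is available through a thin wrapper around stat(). The Python sketch below collects the fields from the inode table earlier in this section; describe_inode is a name chosen for this example. Note that stat() does not expose the block pointers themselves, only the metadata around them.

```python
import os
import stat
import tempfile
import time


def describe_inode(path):
    """Collect the inode fields that stat() reports for path."""
    st = os.stat(path)  # follows symlinks, like the stat() system call
    return {
        "inode": st.st_ino,
        "mode": stat.filemode(st.st_mode),  # file type plus rwx bits
        "uid": st.st_uid,
        "gid": st.st_gid,
        "size": st.st_size,
        "links": st.st_nlink,
        "atime": time.ctime(st.st_atime),
        "mtime": time.ctime(st.st_mtime),
        "ctime": time.ctime(st.st_ctime),
    }


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"Hello, world!\n")
        tmp.flush()
        for field, value in describe_inode(tmp.name).items():
            print(f"{field:>6}: {value}")
```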

Hard Links in Action

Real-World Example

Here is a concrete example of how inodes and hard links work:

$ echo "Hello, world!" > /tmp/original.txt
$ ls -i /tmp/original.txt
2359301 /tmp/original.txt          # inode number is 2359301

$ stat /tmp/original.txt
  File: /tmp/original.txt
  Size: 14        Blocks: 8          Inode: 2359301
  Links: 1        UID: 1000          GID: 1000
  Access: 2024-01-15 10:30:00
  Modify: 2024-01-15 10:30:00

$ ln /tmp/original.txt /tmp/hardlink.txt       # create hard link
$ ls -i /tmp/hardlink.txt
2359301 /tmp/hardlink.txt          # SAME inode number!

$ stat /tmp/original.txt | grep Links
  Links: 2                          # link count is now 2

$ rm /tmp/original.txt             # remove one name
$ cat /tmp/hardlink.txt
Hello, world!                      # data still exists -- link count went from 2 to 1

Key insight: Removing original.txt does not delete the data. It only removes one directory entry and decrements the link count. The inode and its data blocks persist as long as at least one hard link (or open file descriptor) references the inode.

Why "everything is a file" works: Devices like /dev/sda also have inodes. Their inode metadata includes the device type (block or character) and major/minor numbers instead of data block pointers. The filesystem treats device files, directories, symlinks, sockets, and regular files uniformly through the inode abstraction.
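A device node can be inspected with the same stat() call as a regular file. The sketch below assumes a Unix-like system where /dev/null exists and is a character device; describe_device is a hypothetical helper, and the major/minor values it prints depend on the operating system.

```python
import os
import stat


def describe_device(path="/dev/null"):
    """Return (kind, major, minor) for a device file, or None for anything else."""
    st = os.stat(path)
    if stat.S_ISCHR(st.st_mode):
        kind = "character"
    elif stat.S_ISBLK(st.st_mode):
        kind = "block"
    else:
        return None  # regular file, directory, socket, ...
    # st_rdev holds the device ID stored in the inode in place of block pointers
    return kind, os.major(st.st_rdev), os.minor(st.st_rdev)


if __name__ == "__main__":
    print("/dev/null:", describe_device("/dev/null"))
    print("/:", describe_device("/"))
```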

inode Structure with Indirect Block Pointers

[Figure: inode 2359301 (type: regular, perms: rw-r--r--, uid: 1000, size: 8.5 MB, links: 2). Direct pointers 0-11 reference data blocks 0 through 11 (48 KB). The single indirect pointer references an indirect block of 1024 pointers to blocks 12, 13, ... (+4 MB). The double indirect pointer references a block of 1024 pointers to indirect blocks, each holding 1024 pointers to data (+4 GB). A triple indirect pointer follows. Directory entries: "original.txt" -> inode 2359301, "hardlink.txt" -> inode 2359301, "readme.md" -> inode 2359400. Names live in the directory entries, NOT in the inode.]