The Index Node: File Identity on Disk
An inode (index node) is a data structure on a Unix/Linux filesystem that stores all the metadata about a file -- everything except the filename and the file's actual data contents. Every file and directory on the filesystem has exactly one inode, identified by a unique inode number within that filesystem.
What an inode Contains
| Field | Description |
|---|---|
| File type | Regular file, directory, symlink, block device, etc. |
| Permissions | Read/write/execute for owner, group, others (the rwx bits) |
| Owner (UID) | User ID of the file owner |
| Group (GID) | Group ID |
| Size | File size in bytes |
| Timestamps | atime (last access), mtime (last modification), ctime (last inode change) |
| Link count | Number of hard links pointing to this inode |
| Block pointers | Pointers to the disk blocks containing the file's data |
Crucially, the filename is NOT stored in the inode. Filenames are stored in directory entries (dentries). A directory is simply a special file whose data is a list of (name, inode number) pairs. This separation is what makes hard links possible.
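To make the (name, inode number) pairing concrete, here is a minimal Python sketch that lists what a directory stores. The helper name list_directory_entries is hypothetical. os.scandir() and DirEntry.inode() are standard-library calls that expose the directory's own records.

```python
import os
import tempfile

def list_directory_entries(path):
    """Return the (name, inode number) pairs recorded in the directory at `path`."""
    with os.scandir(path) as entries:
        return sorted((entry.name, entry.inode()) for entry in entries)

if __name__ == "__main__":
    # Build a throwaway directory with two files and show its entries.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a.txt", "b.txt"):
            with open(os.path.join(tmp, name), "w") as f:
                f.write(name)
        for name, ino in list_directory_entries(tmp):
            print(f"{ino:>12}  {name}")
```

Note that the names come from the directory, not from the files: nothing in either file's inode says what it is called.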
Block Pointers and Indirect Blocks
In the classic Unix layout used by ext2 and ext3, an inode has 12 direct block pointers, each pointing to one disk block. With 4 KB blocks, the 12 direct pointers cover files up to 48 KB. For larger files, the inode uses indirect pointers (the figures below assume 4-byte pointers and 4 KB blocks, so one block holds 1024 pointers):
- Single indirect pointer: points to a block full of direct pointers. This adds 1024 pointers, covering another 4 MB.
- Double indirect pointer: points to a block of single indirect pointers. This adds 1024 x 1024 pointers, covering another 4 GB.
- Triple indirect pointer: points to a block of double indirect pointers. This adds 1024^3 pointers, covering another 4 TB.
This hierarchical pointer scheme means small files have fast, direct access (no indirection), while very large files are still addressable through multiple levels of indirection.
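The arithmetic above can be checked with a short Python sketch. The constants are the values assumed in the text (4 KB blocks, 4-byte pointers, 12 direct pointers), and the function names are hypothetical. It models only the pointer tree, not other limits a real filesystem may impose.

```python
BLOCK_SIZE = 4096       # bytes per block (assumed, as in the text)
POINTER_SIZE = 4        # bytes per block pointer (assumed)
DIRECT_POINTERS = 12    # direct pointers held in the inode

PTRS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE  # 1024 pointers fit in one block

def capacity_by_level():
    """Bytes addressable through each level: direct, single, double, triple."""
    return [
        DIRECT_POINTERS * BLOCK_SIZE,
        PTRS_PER_BLOCK * BLOCK_SIZE,
        PTRS_PER_BLOCK ** 2 * BLOCK_SIZE,
        PTRS_PER_BLOCK ** 3 * BLOCK_SIZE,
    ]

def indirection_level(offset):
    """Return how many indirect blocks (0-3) must be read to reach byte `offset`."""
    remaining = offset
    for level, capacity in enumerate(capacity_by_level()):
        if remaining < capacity:
            return level
        remaining -= capacity
    raise ValueError("offset exceeds the maximum addressable file size")

if __name__ == "__main__":
    for name, size in zip(("direct", "single", "double", "triple"), capacity_by_level()):
        print(f"{name:>6}: {size:,} bytes")
```

A byte at offset 48 KB is the first one that needs the single indirect block, so indirection_level(48 * 1024) returns 1 while indirection_level(48 * 1024 - 1) returns 0.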
The inode Table
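Filesystems in the ext family keep an inode table -- a region on disk (split across block groups) that holds all inodes. There, the total number of inodes is fixed at filesystem creation time (with mkfs); some other filesystems, such as XFS and Btrfs, allocate inodes dynamically instead. Where the count is fixed, you can run out of inodes even if disk space remains, which happens when a filesystem stores millions of tiny files. The command df -i shows inode usage.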
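The same counters that df -i reports are available to programs through statvfs(). The following is a small sketch using Python's os.statvfs(). The helper name inode_usage is hypothetical, and filesystems that allocate inodes dynamically may report a total of 0.

```python
import os

def inode_usage(path="/"):
    """Return (total, used, free) inode counts for the filesystem containing `path`."""
    st = os.statvfs(path)
    total = st.f_files   # total inodes on the filesystem
    free = st.f_ffree    # inodes still unallocated
    return total, total - free, free

if __name__ == "__main__":
    total, used, free = inode_usage("/")
    print(f"inodes: total={total} used={used} free={free}")
```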
inode Number as File Identity
Within a filesystem, the inode number is the true identity of a file (across filesystems, the identity is the pair of device and inode number). Two directory entries (hard links) with different names but the same inode number refer to the same file -- same data, same metadata, same block pointers. Deleting a file (via unlink()) removes one directory entry and decrements the link count in the inode. Only when the link count reaches zero (and no process has the file open) does the OS actually free the inode and its data blocks.
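The "no process has the file open" condition can be observed directly. This sketch unlinks a file while still holding a descriptor to it, then inspects the inode through that descriptor. The function name unlink_while_open is hypothetical, and it assumes a local POSIX filesystem (network filesystems such as NFS may behave differently).

```python
import os
import tempfile

def unlink_while_open():
    """Unlink a file while holding it open; return (link_count, data) seen afterwards."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"still here\n")
        os.unlink(path)                      # remove the only directory entry
        link_count = os.fstat(fd).st_nlink   # the inode is still reachable via fd
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 100)              # the data blocks are still readable
        return link_count, data
    finally:
        os.close(fd)                         # last reference dropped: the OS can now free the inode

if __name__ == "__main__":
    print(unlink_while_open())
```

This is also why deleting a large log file that a running process still holds open does not immediately return the space.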
Hard Links vs. Symbolic Links
A hard link is a directory entry that points directly to an inode. Multiple hard links share the same inode -- changes through one link are visible through all others. Hard links cannot cross filesystem boundaries (because inode numbers are only unique within a filesystem) and cannot link to directories (to prevent cycles).
A symbolic link (symlink) is a separate file with its own inode whose data content is the path to the target file. Symlinks can cross filesystems and point to directories, but they break if the target is moved or deleted (dangling symlink).
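The contrast between the two link types can be sketched in a few lines of Python. The helper compare_links and the file names are hypothetical. os.link(), os.symlink(), os.lstat(), and os.readlink() are the standard-library wrappers for the corresponding system calls.

```python
import os
import tempfile

def compare_links(workdir):
    """Create a file, a hard link, and a symlink in `workdir`; report what each refers to."""
    target = os.path.join(workdir, "target.txt")
    hard = os.path.join(workdir, "hard.txt")
    soft = os.path.join(workdir, "soft.txt")
    with open(target, "w") as f:
        f.write("data\n")
    os.link(target, hard)      # second directory entry for the same inode
    os.symlink(target, soft)   # new inode whose content is the target path

    result = {
        "hard_shares_inode": os.stat(hard).st_ino == os.stat(target).st_ino,
        # lstat() inspects the symlink itself instead of following it.
        "symlink_has_own_inode": os.lstat(soft).st_ino != os.stat(target).st_ino,
        "symlink_content": os.readlink(soft),
    }

    os.unlink(target)          # remove the original name
    result["hard_survives"] = os.path.exists(hard)
    result["symlink_dangling"] = os.path.islink(soft) and not os.path.exists(soft)
    return result

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for key, value in compare_links(tmp).items():
            print(f"{key}: {value}")
```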
The stat() System Call
The stat() system call reads the inode of a file and returns its metadata to the calling program. The ls -l command relies on the stat() family internally (typically lstat(), which reports a symlink itself rather than following it). You can see the inode number with ls -i or stat filename.
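From a program, the same metadata is one call away. Below is a minimal sketch using Python's os.lstat() and the stat module. The helper name describe_inode and the choice of fields are illustrative, not a fixed API.

```python
import os
import stat
import tempfile

def describe_inode(path):
    """Return selected inode fields for `path` without following a final symlink."""
    st = os.lstat(path)
    return {
        "inode": st.st_ino,
        "type_and_permissions": stat.filemode(st.st_mode),  # e.g. the ls -l style string
        "links": st.st_nlink,
        "uid": st.st_uid,
        "gid": st.st_gid,
        "size": st.st_size,
        "mtime": st.st_mtime,
    }

if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"Hello, world!\n")
        f.flush()
        for key, value in describe_inode(f.name).items():
            print(f"{key}: {value}")
```

Every value returned here comes from the inode. The path is only used to find it.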
Hard Links in Action
Here is a concrete example of how inodes and hard links work:
```
$ echo "Hello, world!" > /tmp/original.txt
$ ls -i /tmp/original.txt
2359301 /tmp/original.txt                  # inode number is 2359301
$ stat /tmp/original.txt
  File: /tmp/original.txt
  Size: 14   Blocks: 8   Inode: 2359301
  Links: 1   UID: 1000   GID: 1000
  Access: 2024-01-15 10:30:00
  Modify: 2024-01-15 10:30:00
$ ln /tmp/original.txt /tmp/hardlink.txt   # create hard link
$ ls -i /tmp/hardlink.txt
2359301 /tmp/hardlink.txt                  # SAME inode number!
$ stat /tmp/original.txt | grep Links
  Links: 2                                 # link count is now 2
$ rm /tmp/original.txt                     # remove one name
$ cat /tmp/hardlink.txt
Hello, world!                              # data still exists -- link count went from 2 to 1
```
Key insight: Removing original.txt does not delete the data. It only removes one directory entry and decrements the link count. The inode and its data blocks persist as long as at least one hard link (or open file descriptor) references the inode.
Why "everything is a file" works: Devices like /dev/sda also have inodes. Their inode metadata includes the device type (block or character) and major/minor numbers instead of data block pointers. The filesystem treats device files, directories, symlinks, sockets, and regular files uniformly through the inode abstraction.
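As a rough illustration, the device type and major/minor numbers can be read from a device file's inode with Python's os.stat(), os.major(), and os.minor(). The helper name device_numbers is hypothetical, and the example assumes a Unix-like system where /dev/null exists.

```python
import os
import stat

def device_numbers(path):
    """Return (kind, major, minor) for a device file, or None if `path` is not one."""
    st = os.stat(path)
    if stat.S_ISCHR(st.st_mode):
        kind = "character"
    elif stat.S_ISBLK(st.st_mode):
        kind = "block"
    else:
        return None
    # st_rdev encodes the device the inode stands for, not where the inode lives.
    return kind, os.major(st.st_rdev), os.minor(st.st_rdev)

if __name__ == "__main__":
    print(device_numbers("/dev/null"))
    print(device_numbers("/"))   # a directory, so this prints None
```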