The Ethernet Frame Structure
Ethernet is the dominant link-layer protocol for local area networks (LANs). It defines how devices on the same physical network segment package data into frames for transmission. Understanding the frame structure is essential because every packet that travels across a LAN — whether it carries an HTTP request, a DNS query, or a database transaction — is wrapped inside an Ethernet frame.
Frame Fields (IEEE 802.3)
An Ethernet frame consists of the following fields, in order:
| Field | Size | Purpose |
|---|---|---|
| Preamble | 7 bytes | Alternating 10101010 pattern; lets the receiver's clock lock onto the sender's bit timing |
| SFD (Start Frame Delimiter) | 1 byte | The byte 10101011; signals that the actual frame content begins next |
| Destination MAC | 6 bytes | The hardware address of the intended recipient (or broadcast FF:FF:FF:FF:FF:FF) |
| Source MAC | 6 bytes | The hardware address of the sender |
| EtherType / Length | 2 bytes | Values of 0x0600 (1536) and above identify the upper-layer protocol (0x0800 = IPv4, 0x86DD = IPv6, 0x0806 = ARP); values of 1500 and below give the payload length in bytes |
| Payload | 46 to 1500 bytes | The actual data (e.g., an IP packet); padded to 46 bytes minimum |
| FCS (Frame Check Sequence) | 4 bytes | CRC-32 checksum over dest MAC through payload; detects bit errors |
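The minimum payload of 46 bytes brings the frame (destination MAC through FCS) to 64 bytes, which is long enough for collision detection to work in legacy half-duplex Ethernet. The maximum payload of 1500 bytes is the standard MTU (Maximum Transmission Unit). Jumbo frames, a vendor extension rather than part of IEEE 802.3, raise this limit to roughly 9000 bytes on high-performance networks where every device on the path supports them.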
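The field layout and padding rule can be sketched in a few lines of Python. This is a minimal illustration, not a driver: the function names `build_frame` and `parse_frame` are hypothetical, the preamble and SFD are omitted because the hardware normally adds them, and it assumes the FCS matches `zlib.crc32` over the frame body, appended least-significant byte first.

```python
import struct
import zlib

MIN_PAYLOAD = 46    # bytes; shorter payloads are zero-padded
MAX_PAYLOAD = 1500  # bytes; the standard MTU


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_frame(dst: str, src: str, ethertype: int, payload: bytes) -> bytes:
    """Return the frame from destination MAC through FCS."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds the 1500-byte MTU")
    padded = payload.ljust(MIN_PAYLOAD, b"\x00")
    body = mac_to_bytes(dst) + mac_to_bytes(src) + struct.pack("!H", ethertype) + padded
    # Assumption: FCS is zlib's CRC-32 of the body, least-significant byte first.
    fcs = struct.pack("<I", zlib.crc32(body))
    return body + fcs


def parse_frame(frame: bytes) -> dict:
    """Split a frame into its fields and re-check the FCS."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    body, fcs = frame[:-4], frame[-4:]
    return {
        "dst": dst.hex(":"),
        "src": src.hex(":"),
        "ethertype": ethertype,
        "payload": frame[14:-4],  # still includes any padding
        "fcs_ok": struct.pack("<I", zlib.crc32(body)) == fcs,
    }
```

Building a frame around a 5-byte payload yields 64 bytes, because the payload is padded to 46. The receiver cannot tell padding from data at this layer; the upper-layer protocol's own length field (for example the IP total length) is what trims it.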
MAC Addresses
Every network interface card (NIC) ships with a 48-bit MAC address that is intended to be globally unique, written as six colon-separated hexadecimal octets (e.g., a4:83:e7:2f:01:bc). The first 3 bytes form the OUI (Organizationally Unique Identifier), assigned to the manufacturer by IEEE. The last 3 bytes are assigned by the manufacturer. Special addresses include:
- Broadcast: FF:FF:FF:FF:FF:FF, delivered to all devices on the LAN segment.
- Multicast: the least-significant bit of the first byte is 1, delivered to a group of devices.
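These address rules reduce to a few byte and bit tests. The sketch below is illustrative only; `describe_mac` is a hypothetical helper name, and it accepts only the colon-separated notation used in this section.

```python
def describe_mac(mac: str) -> dict:
    """Classify a colon-separated MAC address string."""
    octets = bytes.fromhex(mac.replace(":", ""))
    if len(octets) != 6:
        raise ValueError("a MAC address has exactly 6 octets")
    return {
        "oui": octets[:3].hex(":"),            # manufacturer prefix
        "nic_specific": octets[3:].hex(":"),   # assigned by the manufacturer
        "is_broadcast": octets == b"\xff" * 6,
        # Group bit: least-significant bit of the first octet.
        "is_multicast": bool(octets[0] & 0x01),
    }
```

Note that the broadcast address also has the group bit set, so it reports as both broadcast and multicast; a4:83:e7:2f:01:bc reports as neither, since 0xa4 has a least-significant bit of 0.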
CSMA/CD (Historical Context)
Early Ethernet used a shared coaxial cable (bus topology) where all devices transmitted on the same wire. CSMA/CD (Carrier Sense Multiple Access with Collision Detection) was the access control protocol: a device listens before transmitting (carrier sense), sends its frame, and simultaneously monitors for signal collisions. If a collision occurs, both senders stop, wait a random backoff time (binary exponential backoff), and retry. Modern switched Ethernet uses full-duplex point-to-point links, eliminating collisions entirely. CSMA/CD is effectively obsolete, but it explains why the minimum frame size exists.
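The backoff rule can be sketched as follows. The constants are the classic 10 Mbit/s parameters (a slot time of 512 bit times, the backoff exponent capped at 10, and the frame abandoned after 16 attempts); the function name `backoff_slots` is a hypothetical choice for illustration.

```python
import random

SLOT_TIME_US = 51.2   # 512 bit times at 10 Mbit/s
BACKOFF_LIMIT = 10    # the exponent stops growing after 10 collisions
MAX_ATTEMPTS = 16     # the frame is dropped after 16 failed attempts


def backoff_slots(collision_count: int, rng=random) -> int:
    """Pick how many slot times to wait after the n-th collision."""
    if collision_count >= MAX_ATTEMPTS:
        raise RuntimeError("too many collisions; frame is discarded")
    k = min(collision_count, BACKOFF_LIMIT)
    return rng.randrange(2 ** k)   # uniform in 0 .. 2**k - 1


def backoff_delay_us(collision_count: int, rng=random) -> float:
    """Backoff delay in microseconds for 10 Mbit/s Ethernet."""
    return backoff_slots(collision_count, rng) * SLOT_TIME_US
```

The same 512-bit slot time accounts for the minimum frame size: 512 bits is 64 bytes, which is 14 bytes of header, 46 bytes of payload, and 4 bytes of FCS.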
Ethernet Switches and MAC Learning
An Ethernet switch connects multiple devices and forwards frames only to the port where the destination MAC resides. It builds a MAC address table (also called a CAM table) by observing the source MAC of incoming frames and recording which port they arrived on. When a frame arrives for an unknown destination, the switch floods it to all ports except the one it arrived on. Once the destination replies, the switch learns its port and can forward future frames directly. This is far more efficient than a hub, which blindly copies every frame to every other port.
VLANs (802.1Q)
A VLAN (Virtual LAN) partitions a single physical switch into multiple logical broadcast domains. The 802.1Q standard inserts a 4-byte tag between the source MAC and EtherType fields: 2 bytes for the Tag Protocol Identifier (0x8100) and 2 bytes of Tag Control Information, made up of a 3-bit priority field, a 1-bit Drop Eligible Indicator, and a 12-bit VLAN ID. Of the 4096 possible IDs, 0 and 4095 are reserved, leaving up to 4094 usable VLANs. Devices on different VLANs cannot communicate directly at Layer 2; traffic between VLANs must pass through a router (inter-VLAN routing).
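The tag layout can be made concrete with a short packing and unpacking sketch. The helper names are hypothetical. The 16 bits after the TPID hold the 3-bit priority, a 1-bit drop eligible indicator (DEI), and the 12-bit VLAN ID, in that order. `tag_frame` assumes it is given a frame without its FCS, since inserting a tag changes the bytes the checksum covers.

```python
import struct

TPID_8021Q = 0x8100


def build_vlan_tag(vlan_id: int, priority: int = 0, dei: int = 0) -> bytes:
    """Pack the 4-byte 802.1Q tag: TPID followed by priority/DEI/VLAN ID."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("usable VLAN IDs are 1 through 4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority is a 3-bit value")
    tci = (priority << 13) | ((dei & 1) << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def parse_vlan_tag(tag: bytes) -> dict:
    """Unpack a 4-byte 802.1Q tag into its fields."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return {"priority": tci >> 13, "dei": (tci >> 12) & 1, "vlan_id": tci & 0x0FFF}


def tag_frame(frame_without_fcs: bytes, vlan_id: int, priority: int = 0) -> bytes:
    """Insert the tag after the 12 bytes of destination and source MAC."""
    tag = build_vlan_tag(vlan_id, priority)
    return frame_without_fcs[:12] + tag + frame_without_fcs[12:]
```

For example, VLAN 10 with priority 5 packs to the bytes 81 00 a0 0a.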
Real-Life: How a Switch Forwards a Frame
Imagine a small office with four computers (A, B, C, D) connected to a switch with ports 1-4:
Step 1: Computer A (MAC aa:aa:aa:00:00:01) on port 1 sends a frame to computer C (MAC cc:cc:cc:00:00:03). The switch receives the frame on port 1, records aa:aa:aa:00:00:01 -> port 1 in its MAC table. It does not yet know where C is, so it floods the frame to ports 2, 3, and 4.
Step 2: Computer C (on port 3) processes the frame and sends a reply back to A. The switch now records cc:cc:cc:00:00:03 -> port 3. Since A is already in the table (port 1), the switch forwards the reply only to port 1 — B and D never see it.
Step 3: Any future A-to-C traffic goes directly port 1 to port 3. No flooding needed.
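The three steps above can be replayed with a small simulation. This is a simplified sketch: the `LearningSwitch` class is hypothetical, and it ignores table aging, VLANs, and multicast handling, all of which real switches implement.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy model of MAC learning and forwarding."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}   # MAC address -> port number

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the source, then return the list of output ports."""
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Unknown or broadcast destination: flood everywhere but the ingress port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []   # destination sits behind the ingress port; nothing to forward
        return [out_port]


switch = LearningSwitch([1, 2, 3, 4])
A, C = "aa:aa:aa:00:00:01", "cc:cc:cc:00:00:03"
step1 = switch.receive(1, A, C)   # C unknown: flooded
step2 = switch.receive(3, C, A)   # A known: forwarded to one port
step3 = switch.receive(1, A, C)   # C now known: forwarded to one port
```

Here `step1` is `[2, 3, 4]`, `step2` is `[1]`, and `step3` is `[3]`, matching the walkthrough.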
MTU in practice: If the network layer hands down a 4000-byte IP packet but the Ethernet MTU is 1500 bytes, the packet must be fragmented into three pieces (at the IP layer) before being placed into three separate Ethernet frames. This is why matching MTU sizes end-to-end matters — mismatches cause fragmentation, which increases overhead and can cause failures if the Don't Fragment (DF) flag is set.
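The count of three fragments follows from simple arithmetic, sketched below. The sketch assumes IPv4 with a 20-byte header and no options; `fragment_sizes` is a hypothetical helper. Each fragment repeats the IP header, and every fragment except the last must carry a multiple of 8 data bytes because the fragment offset field counts in 8-byte units.

```python
def fragment_sizes(packet_len: int, mtu: int = 1500, ip_header: int = 20) -> list:
    """Return the fragments an IPv4 packet of packet_len bytes splits into."""
    payload = packet_len - ip_header
    max_data = (mtu - ip_header) // 8 * 8   # data per fragment, rounded down to 8
    fragments = []
    offset = 0
    while offset < payload:
        data = min(max_data, payload - offset)
        fragments.append({
            "offset_units": offset // 8,          # value of the fragment offset field
            "data_bytes": data,
            "total_bytes": data + ip_header,      # what goes in the Ethernet payload
            "more_fragments": offset + data < payload,
        })
        offset += data
    return fragments
```

For the 4000-byte packet, the 3980 bytes of data split into 1480, 1480, and 1020 bytes, giving IP fragments of 1500, 1500, and 1040 bytes.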
VLANs in practice: In a company with Engineering and Finance departments sharing the same switch, the admin creates VLAN 10 (Engineering) and VLAN 20 (Finance). A broadcast from an Engineering machine only reaches other VLAN 10 ports — Finance machines on VLAN 20 never see it. This improves security and reduces broadcast traffic.
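Broadcast scoping amounts to a lookup on the ingress port's VLAN. In this sketch the port-to-VLAN assignments are made up for illustration, and `broadcast_ports` is a hypothetical helper modelling access ports only (no trunks).

```python
# Hypothetical assignment: ports 1, 2, 5 in Engineering (VLAN 10), ports 3, 4 in Finance (VLAN 20).
PORT_VLAN = {1: 10, 2: 10, 3: 20, 4: 20, 5: 10}


def broadcast_ports(in_port: int, port_vlan=PORT_VLAN) -> list:
    """Ports that receive a broadcast entering on in_port."""
    vlan = port_vlan[in_port]
    return sorted(p for p, v in port_vlan.items() if v == vlan and p != in_port)
```

A broadcast arriving on port 1 reaches only ports 2 and 5; ports 3 and 4 never see it.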