The 6-Transistor SRAM Cell

**Static Random-Access Memory (SRAM)** stores each bit using a circuit that holds its state as long as power is supplied -- no periodic refresh is needed. The standard SRAM cell uses **6 transistors (6T)**, making it fast but area-expensive.

### Cross-Coupled Inverters: The Bistable Core

At the heart of the 6T cell are **two cross-coupled CMOS inverters**. Each inverter's output feeds into the other's input, forming a **bistable latch** with two stable states:

| State | Node Q | Node Q-bar |
|-------|--------|------------|
| Storing **1** | HIGH | LOW |
| Storing **0** | LOW | HIGH |

The cross-coupling creates **positive feedback** -- if Q is high, inverter 2 drives Q-bar low, which in turn makes inverter 1 drive Q high. The cell will hold this state indefinitely (hence "static") as long as power is maintained.
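The feedback loop described above can be illustrated with a small numerical sketch. It is a toy model, not a circuit simulation: the sigmoid transfer curve, the 1.0 V supply, and the `GAIN` value are all assumptions chosen for illustration, and `inverter` and `settle` are hypothetical helper names.

```python
import math

VDD = 1.0    # assumed supply voltage (volts)
GAIN = 10.0  # assumed steepness of the inverter transfer curve (1/V)

def inverter(v_in: float) -> float:
    """Idealized inverter: output falls from VDD toward 0 as input crosses VDD/2."""
    return VDD / (1.0 + math.exp(GAIN * (v_in - VDD / 2)))

def settle(q: float, steps: int = 50) -> tuple[float, float]:
    """Apply the two cross-coupled inverters repeatedly until the nodes settle."""
    q_bar = inverter(q)
    for _ in range(steps):
        q_bar = inverter(q)    # inverter 2: Q drives Q-bar
        q = inverter(q_bar)    # inverter 1: Q-bar drives Q
    return q, q_bar

for start in (0.6, 0.4):
    q, q_bar = settle(start)
    print(f"start Q={start:.1f} V -> Q={q:.3f} V, Q-bar={q_bar:.3f} V")
```

A starting voltage slightly above VDD/2 is pushed toward the "storing 1" state, and one slightly below is pushed toward "storing 0", which is the bistable behavior the table summarizes.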

### Access Transistors and the Word Line

Two additional **NMOS access transistors** connect the storage nodes (Q and Q-bar) to the **bit lines** (BL and BL-bar). These access transistors are gated by the **word line (WL)**:

- **Word line LOW**: Access transistors are OFF. The cell is isolated from the bit lines and holds its stored value undisturbed.
- **Word line HIGH**: Access transistors turn ON, connecting the internal nodes to the bit lines for read or write operations.
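The gating role of the word line can be sketched at the logic level. The `SramCell` class and its `access` method below are hypothetical names introduced for illustration; the model ignores voltages and timing and only captures which operations the word line permits.

```python
class SramCell:
    """Logic-level model of a 6T cell: one stored bit behind two access transistors."""

    def __init__(self, q: bool = False):
        self.q = q  # node Q; node Q-bar is always the complement

    def access(self, wl: bool, bl=None, bl_bar=None):
        """Return the (Q, Q-bar) pair seen on the bit lines, or None if isolated.

        wl: word line level. bl / bl_bar: values forced by a write driver,
        or None when the bit lines are only being sensed.
        """
        if not wl:
            return None  # access transistors OFF: cell isolated, state held
        if bl is not None and bl_bar is not None and bl != bl_bar:
            self.q = bool(bl)  # driven complementary bit lines overwrite the latch
        return self.q, not self.q

cell = SramCell(q=True)
print(cell.access(wl=False, bl=False, bl_bar=True))  # None: write ignored
print(cell.q)                                        # True
print(cell.access(wl=True))                          # (True, False)
print(cell.access(wl=True, bl=False, bl_bar=True))   # (False, True)
```

With the word line LOW, the attempted write has no effect; with it HIGH, the same bit-line values change the stored bit.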

### Read Operation

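Before reading, both bit lines are **precharged to HIGH**. When the word line goes high, the side storing a 0 begins to pull its bit line slightly lower through the access transistor and the cell's pull-down transistor, while the other bit line stays HIGH. A **sense amplifier** detects this small voltage difference between BL and BL-bar and amplifies it to a full logic level. During the read, the access and pull-down transistors form a voltage divider that lifts the 0 node slightly above ground. Sizing the pull-down transistor stronger than the access transistor keeps that bump below the opposite inverter's switching threshold, so the read does not accidentally flip the stored value (**read stability**).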
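As a first-order sketch of the read, the bit-line droop can be estimated as dV = I * t / C. The cell current, bit-line capacitance, and sensing time below are assumed round numbers, not values from any particular process, and `read` is a hypothetical helper.

```python
VDD = 1.0            # assumed supply / precharge voltage (V)
I_CELL = 20e-6       # assumed cell read current (A)
C_BITLINE = 50e-15   # assumed bit-line capacitance (F)

def read(stored_bit: int, t_sense: float = 200e-12):
    """Return (BL, BL-bar, sensed bit) after the word line has been high for t_sense."""
    bl = bl_bar = VDD                                # both bit lines precharged HIGH
    droop = min(VDD, I_CELL * t_sense / C_BITLINE)   # dV = I * t / C
    if stored_bit:
        bl_bar -= droop   # Q-bar holds 0, so BL-bar is pulled lower
    else:
        bl -= droop       # Q holds 0, so BL is pulled lower
    sensed = 1 if bl > bl_bar else 0                 # sense amplifier resolves the sign
    return bl, bl_bar, sensed

bl, bl_bar, bit = read(stored_bit=1)
print(f"BL={bl:.2f} V, BL-bar={bl_bar:.2f} V, sensed={bit}")
```

Under these assumptions the differential is only about 80 mV, which is why a sense amplifier is used rather than waiting for a full-swing discharge.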

### Write Operation

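To write, an external write driver **forces** the bit lines to complementary values (e.g., BL=HIGH, BL-bar=LOW to write a 1). When the word line activates, the bit line driven LOW pulls its storage node down through the access transistor, overpowering the PMOS pull-up that was holding that node HIGH. Once the node falls below the other inverter's switching threshold, positive feedback flips the latch into the new state. For this to work, the write driver and access transistor together must be stronger than the cell's pull-up transistors.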
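The strength contest during a write can be approximated as a resistive divider between the pull-up transistor and the path to the LOW bit line. This is a rough sketch: treating transistors as fixed resistors, the resistance values, and the 0.5 V trip point are all simplifying assumptions, and `write_flips` is a hypothetical helper.

```python
def write_flips(r_pullup: float, r_access: float,
                vdd: float = 1.0, v_trip: float = 0.5) -> bool:
    """Return True if the node being pulled low drops below the inverter trip point.

    r_pullup: on-resistance of the PMOS holding the node HIGH (ohms, assumed).
    r_access: combined resistance of access transistor and write driver (ohms, assumed).
    """
    v_node = vdd * r_access / (r_pullup + r_access)  # divider between VDD and the LOW bit line
    return v_node < v_trip

print(write_flips(r_pullup=30e3, r_access=10e3))  # True: weak pull-up loses, write succeeds
print(write_flips(r_pullup=5e3, r_access=10e3))   # False: strong pull-up wins, write fails
```

The second call shows the failure case the sizing rule is meant to avoid: if the pull-up is too strong relative to the access path, the node never crosses the threshold.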

### Performance and Usage

SRAM is **fast** (sub-nanosecond access for small arrays), but each bit requires **6 transistors**, compared with one transistor and one capacitor in a DRAM cell, making it roughly **6x less dense** than DRAM as a rule of thumb. This tradeoff makes SRAM ideal for:

- **L1/L2/L3 CPU caches** (where speed is critical and capacity is small)
- **Register files** inside the processor
- **TLB (Translation Lookaside Buffer)** entries
- **Embedded SRAM** in FPGAs and ASICs
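To get a feel for the area cost, the number of cell transistors in a 6T array can be computed directly from its capacity. The sketch below counts storage cells only; tag arrays, decoders, and sense amplifiers are ignored, and `cell_transistors` is a hypothetical helper.

```python
def cell_transistors(capacity_bytes: int, transistors_per_bit: int = 6) -> int:
    """Transistors in the storage cells alone (no tags, decoders, or sense amps)."""
    return capacity_bytes * 8 * transistors_per_bit

KIB = 1024
MIB = 1024 * 1024

print(cell_transistors(32 * KIB))  # 1572864 for a 32 KiB array
print(cell_transistors(16 * MIB))  # 805306368 for a 16 MiB array
```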

### Real-Life: A Light Switch with Memory

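Imagine a toggle light switch that stays in whatever position you last flipped it to -- ON or OFF. It does not need anyone to periodically "remind" it of its state (unlike DRAM, which needs periodic refresh). The cross-coupled inverters in an SRAM cell work the same way: once set, they stay put. The analogy breaks down in one respect: a mechanical switch keeps its position without power, whereas an SRAM cell loses its state when power is removed.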

Practical applications and real-world context:

- **CPU L1 cache**: Modern processors typically have 32-64 KB of L1 data cache per core, built entirely from SRAM. Access latency is typically 1-4 clock cycles (~1 ns). This is the fastest cache level the CPU can reach, sitting directly adjacent to the execution units on the die.
- **L3 cache scaling**: A high-end server CPU like AMD EPYC 9004 has up to 384 MB of L3 SRAM cache. This occupies a large share of the die area, illustrating the density cost of 6T cells.
- **Register file**: The 64-bit general-purpose registers (RAX, RBX, etc.) in an x86 core are implemented using multi-ported SRAM-like structures with even faster access (single-cycle).
- **Embedded SRAM in SoCs**: IoT microcontrollers often include 64-512 KB of on-chip SRAM as their only working memory, since it requires no external DRAM controller and provides deterministic access times.
- **SRAM vs DRAM tradeoff**: As an illustrative figure, a 16 MB L3 cache in SRAM might occupy 100 mm^2 of die area; the exact number depends heavily on the process node. The same 16 MB in DRAM would take roughly 16 mm^2 but require refresh circuitry and would be 10-50x slower.
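The latency and area figures quoted above can be checked with simple arithmetic. The 4 GHz clock is an assumption introduced here for illustration, and `cycles_to_ns` and `area_ratio` are hypothetical helpers.

```python
def cycles_to_ns(cycles: float, clock_ghz: float) -> float:
    """Convert a latency in clock cycles to nanoseconds at the given clock rate."""
    return cycles / clock_ghz

def area_ratio(sram_mm2: float, dram_mm2: float) -> float:
    """How many times more die area the SRAM version occupies."""
    return sram_mm2 / dram_mm2

print(cycles_to_ns(4, clock_ghz=4.0))  # 1.0 ns for 4 cycles at an assumed 4 GHz
print(area_ratio(100, 16))             # 6.25, in line with the rough 6x density gap
```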

Figure: 6T SRAM cell schematic.