Mutual Exclusion and Synchronization Primitives

A mutex (mutual exclusion lock) is a synchronization primitive that ensures only one thread at a time can execute a critical section -- a block of code that accesses shared data. Without mutual exclusion, concurrent threads can interleave their reads and writes to shared variables, producing corrupted or inconsistent state.

Lock/Unlock Semantics

A mutex has two operations:

  • lock() (acquire): If the mutex is unlocked, the calling thread locks it and enters the critical section. If the mutex is already locked by another thread, the caller is blocked until the mutex becomes available.
  • unlock() (release): The thread releases the mutex, allowing one waiting thread (if any) to acquire it and proceed.

The invariant is simple: between lock() and unlock(), exactly one thread is inside the critical section.

Spinlock (Busy-Wait)

A spinlock is the simplest mutex implementation. A thread trying to acquire a locked spinlock enters a tight loop (spinning), repeatedly checking if the lock is available. Spinlocks are efficient when the critical section is very short (a few instructions) and threads are running on separate cores, because the cost of spinning is less than the cost of a context switch. However, on a single core, a spinning thread wastes the entire time slice -- no other thread can make progress to release the lock.

Sleeping Mutex (Blocking)

A sleeping mutex (also called a blocking mutex) puts the waiting thread to sleep (moves it to a blocked state in the OS scheduler) instead of spinning. When the lock is released, the OS wakes one blocked thread. This is more efficient when the critical section is long or contention is high, because blocked threads consume no CPU cycles. The tradeoff is higher latency to acquire the lock due to the context switch overhead (the thread must be woken by the scheduler).

Compare-and-Swap (CAS)

Both spinlocks and sleeping mutexes are built on a hardware atomic instruction called Compare-and-Swap (CAS). CAS atomically reads a memory location, compares it to an expected value, and if they match, writes a new value -- all in a single indivisible operation. In pseudocode: CAS(addr, expected, newVal) returns true if the swap succeeded. This is the fundamental building block of all lock-free and lock-based concurrent algorithms.

Priority Inversion

Priority inversion occurs when a high-priority thread is blocked waiting for a lock held by a low-priority thread, while a medium-priority thread preempts the low-priority one. The high-priority thread is effectively running at the lowest priority. The Mars Pathfinder incident in 1997 was caused by this: the high-priority bus-management task was blocked by a low-priority meteorological task holding a shared mutex, while medium-priority tasks preempted the meteorological task. The fix was priority inheritance -- temporarily boosting the lock holder's priority to match the highest-priority waiter.

Semaphore

A semaphore generalizes a mutex to allow up to N threads into the critical section simultaneously. It maintains a counter initialized to N. acquire() decrements the counter (blocking if zero), and release() increments it (waking a blocked thread if any). A mutex is a semaphore with N=1 (a "binary semaphore").

Condition Variables

A condition variable allows threads to wait for a specific condition to become true. wait(condVar, mutex) atomically releases the mutex and suspends the thread. signal(condVar) wakes one waiting thread, which re-acquires the mutex. broadcast(condVar) wakes all waiting threads. Condition variables are always used together with a mutex to avoid race conditions on the condition check itself.

Real-Life: Mars Pathfinder Priority Inversion

Real-World Example

In 1997, NASA's Mars Pathfinder lander began experiencing periodic system resets shortly after landing on Mars. The root cause was priority inversion:

  1. A high-priority bus-management task needed to acquire a shared mutex to access the information bus.
  2. A low-priority meteorological task had already acquired the same mutex and was in its critical section.
  3. A medium-priority communications task became runnable and preempted the low-priority task (since it had higher priority).
  4. The high-priority task was now stuck: it could not run because it was waiting for the mutex, which was held by the low-priority task, which could not run because the medium-priority task was using the CPU.

The system's watchdog timer detected the high-priority task had not run for too long and triggered a reset. This repeated every few hours.

The fix: Engineers uploaded a patch enabling priority inheritance in the mutex implementation. When the high-priority task blocked on the mutex, the OS temporarily raised the lock-holding low-priority task's priority to match, ensuring it would not be preempted by the medium-priority task. The meteorological task could then finish its critical section, release the mutex, and the high-priority task could proceed.

This is why modern real-time operating systems (RTOS) support priority-inheritance mutexes by default, and why understanding synchronization primitives deeply matters for safety-critical systems.

Spinlock vs Sleeping Mutex

Spinlock vs Sleeping Mutex Spinlock (Busy-Wait) Thread A holds lock Thread B CAS loop burns CPU cycles L Good: no context switch overhead Bad: wastes CPU if critical section is long Sleeping Mutex (Block) Thread C holds lock Thread D BLOCKED sleeping in wait queue L Good: no CPU waste, thread yields core Bad: context switch cost to wake Compare-and-Swap (CAS) — Hardware Atomic bool CAS(addr, expected, newVal): atomic { if (*addr == expected) { *addr = newVal; return true; } return false; }

Interactive Mutex Simulation

Loading demo...
Step 1 of 3