Concurrency primitives
Compare-and-swap loops
Compare-and-swap (CAS) loops are the bedrock of lock-free programming. Instead of acquiring a mutex, a thread reads a value, computes the next state, and atomically swaps it in only if nothing else has touched it. When the swap fails, the thread loops and tries again with the freshly observed value. Done right, CAS loops scale across cores; done wrong, they livelock, leak ABA bugs, or burn the CPU.
1. The basic shape
Every CAS loop has the same four moves: load the current value, derive a candidate next value, attempt the atomic exchange, and on failure refresh the expected value and retry. The hardware guarantees that exactly one writer wins per cache-line generation, so progress is global even if any single thread starves.
2. Reference implementation
Below is a saturating counter that increments until it reaches a target. We use compare_exchange_weakbecause spurious failures are cheap on ARM and let the compiler emit a tighter LL/SC sequence. The expected value is passed by reference and updated for us on failure, so we never re-load explicitly inside the hot path.
// Lock-free counter using a CAS loop
#include <atomic>
std::atomic<long> counter{0};
long increment_until(long target) {
long current = counter.load(std::memory_order_relaxed);
for (;;) {
if (current >= target) return current;
long next = current + 1;
// compare_exchange_weak updates 'current' on failure
if (counter.compare_exchange_weak(
current, next,
std::memory_order_acq_rel,
std::memory_order_relaxed)) {
return next;
}
// Spurious failure or contention: retry with refreshed 'current'
}
}3. Pitfalls and mitigations
- ABA: a value can change from A to B and back to A between your load and your CAS. Tag pointers or use hazard pointers when the protected datum is a freelist node.
- Livelock: under heavy contention a CAS loop can spin without making global progress. Mix in an exponential backoff or a
pausehint. - Memory order: prefer
acq_relon success andrelaxedon failure unless you are publishing data through the same cache line.