Write-Ahead Log Primer
A write-ahead log (WAL) is the durability backbone of most serious storage engines. Before a mutation touches the main data structures, it is appended to a sequential log on stable storage. If the process crashes mid-commit, replaying the log restores every change that was made durable. This primer walks through the why, the how, and the gotchas Meridian recipes care about.
01.Why append-only wins
Random writes punish spinning disks and even tax SSDs through write amplification. A WAL turns every mutation into a sequential append, which is among the cheapest write patterns a block device supports. The main table can be written back later in large, disk-friendly batches, because the log has already absorbed the durability cost while in-memory structures stay fast.
The second win is recoverability. After a crash, the expected damage is a truncated or partially written record at the tail of the log, and a per-record checksum makes that damage detectable.
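To make the checksum idea concrete, here is a minimal sketch of a bitwise CRC32C in C, the checksum used by the record layout in this primer. The function names `crc32c` and `payload_intact` are illustrative assumptions, not taken from any particular engine, and production code would normally use a table-driven or hardware-accelerated variant.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
 * Slow but dependency-free; meant as an illustration only. */
uint32_t crc32c(const uint8_t *data, size_t len) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the polynomial when the low bit was set. */
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Returns 1 if the payload still matches the checksum stored with it
 * (hypothetical helper name). */
int payload_intact(const uint8_t *payload, size_t len, uint32_t stored_crc) {
    return crc32c(payload, len) == stored_crc;
}
```

For the nine ASCII bytes `123456789` this routine yields the standard CRC32C check value `0xE3069283`, and changing any single byte makes `payload_intact` return 0.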
02.Record layout
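A minimal record carries a length prefix, a CRC32C of the payload, an LSN, a fragment type, and the mutation bytes. Records are grouped into 32 KiB blocks, and the type field marks whether a record is whole (FULL) or one piece of a record split across blocks (FIRST, MIDDLE, LAST), so a torn-page failure cannot corrupt more than one block at a time.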
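Before the layout itself, a short sketch of how the 32 KiB block grouping could play out for a large record. It assumes that the FULL, FIRST, MIDDLE, and LAST values of the `type` field in `WalRecord` mark fragments that never cross a block boundary, in the style of LevelDB's log format, and that every fragment repeats a packed 17-byte header. Both points are assumptions, and `plan_fragments` is a hypothetical helper name.

```c
#include <stddef.h>

#define BLOCK_SIZE  ((size_t)32 * 1024)
#define HEADER_SIZE ((size_t)17)  /* 4 + 4 + 8 + 1 bytes; assumes a packed header */

typedef enum { FULL, FIRST, MIDDLE, LAST } FragType;

typedef struct {
    FragType type;
    size_t   payload_len;  /* payload bytes carried by this fragment */
} Fragment;

/* Plan how a payload of `len` bytes is split, given that `block_off` bytes
 * of the current block are already used (block_off <= BLOCK_SIZE).
 * Fills at most `max` entries of `out` and returns the number of
 * fragments required. */
size_t plan_fragments(size_t block_off, size_t len, Fragment *out, size_t max) {
    size_t n = 0;
    int first = 1;
    do {
        size_t room = BLOCK_SIZE - block_off;
        if (room <= HEADER_SIZE) {
            /* No space for a header plus payload: pad and start a new block. */
            block_off = 0;
            room = BLOCK_SIZE;
        }
        size_t avail = room - HEADER_SIZE;
        size_t take = len < avail ? len : avail;
        int last = (take == len);
        if (n < max) {
            out[n].type = first ? (last ? FULL : FIRST) : (last ? LAST : MIDDLE);
            out[n].payload_len = take;
        }
        n++;
        len -= take;
        block_off += HEADER_SIZE + take;
        first = 0;
    } while (len > 0);
    return n;
}
```

Under these assumptions a 70,000-byte payload that starts at a block boundary becomes three fragments: FIRST with 32,751 bytes, MIDDLE with 32,751 bytes, and LAST with 4,498 bytes.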
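    #include <stdint.h>

    /* Header fields in on-disk order. sizeof(struct WalRecord) may include
       padding, so serialize field by field rather than copying the struct. */
    struct WalRecord {
        uint32_t length;    // payload bytes
        uint32_t crc32c;    // checksum over payload
        uint64_t lsn;       // monotonic log sequence number
        uint8_t  type;      // FULL | FIRST | MIDDLE | LAST
        uint8_t  payload[]; // serialized mutation
    };
03.Fsync, group commit, and the tail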
Durability requires fsync, and fsync is slow. The standard answer is group commit: batch concurrent writers into a single flush, amortizing the cost of the sync across the cohort. Throughput rises sharply, and under load the typical writer also sees lower latency because it no longer queues behind one fsync per commit.
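The batching idea can be sketched in a few lines of C. This is a single-threaded illustration under stated assumptions: the `GroupCommit` type, the 4 KiB batch capacity, and the `gc_enqueue` and `gc_flush` names are hypothetical, and a real engine would add locking, a wake-up mechanism for waiting writers, and retry handling for interrupted writes. Only `write` and `fsync` are real POSIX calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#define BATCH_CAP 4096  /* illustrative batch capacity in bytes */

typedef struct {
    int           fd;              /* log file descriptor */
    unsigned char buf[BATCH_CAP];  /* records queued since the last flush */
    size_t        used;            /* bytes queued */
    size_t        pending;         /* writers waiting on the next flush */
    unsigned long fsyncs;          /* number of fsync calls issued so far */
} GroupCommit;

/* Queue one writer's record. Returns 0 on success, -1 if the batch is full. */
int gc_enqueue(GroupCommit *gc, const void *rec, size_t len) {
    if (len > BATCH_CAP - gc->used) return -1;
    memcpy(gc->buf + gc->used, rec, len);
    gc->used += len;
    gc->pending++;
    return 0;
}

/* Write the whole batch, then issue a single fsync for every queued writer.
 * Returns the number of writers made durable, or -1 on an I/O error. */
long gc_flush(GroupCommit *gc) {
    size_t off = 0;
    while (off < gc->used) {
        ssize_t n = write(gc->fd, gc->buf + off, gc->used - off);
        if (n < 0) return -1;
        off += (size_t)n;
    }
    if (fsync(gc->fd) != 0) return -1;
    long done = (long)gc->pending;
    gc->used = 0;
    gc->pending = 0;
    gc->fsyncs++;
    return done;
}
```

Three writers that enqueue before one `gc_flush` call share a single fsync, which is the amortization the paragraph above describes. Writers should be acknowledged only after `gc_flush` returns successfully.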
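On recovery, scan forward from the last known-good checkpoint, verify each record's CRC, and stop at the first record that is truncated or fails its checksum. Everything from that point on is discarded. As long as clients are acknowledged only after their record has been fsynced, the WAL never lies about what was promised to them.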
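The recovery scan can be sketched as follows. The sketch assumes a packed, little-endian 17-byte header in the field order of `WalRecord`, unfragmented (FULL) records, and an in-memory copy of the log; `wal_encode` and `wal_recover` are hypothetical names. It also stops when an LSN fails to increase, an extra guard (an assumption, not stated above) against zero-filled space being read as empty records.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WAL_HEADER 17u  /* length(4) + crc32c(4) + lsn(8) + type(1), assumed packed */

static uint32_t crc32c(const uint8_t *p, size_t n) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < n; i++) {
        crc ^= p[i];
        for (int b = 0; b < 8; b++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

static void put_le(uint8_t *dst, uint64_t v, int bytes) {
    for (int i = 0; i < bytes; i++) dst[i] = (uint8_t)(v >> (8 * i));
}

static uint64_t get_le(const uint8_t *src, int bytes) {
    uint64_t v = 0;
    for (int i = 0; i < bytes; i++) v |= (uint64_t)src[i] << (8 * i);
    return v;
}

/* Append one FULL record at `off`; returns the new end offset.
 * The caller guarantees the buffer has room. */
size_t wal_encode(uint8_t *log, size_t off, uint64_t lsn,
                  const void *payload, uint32_t len) {
    put_le(log + off, len, 4);
    put_le(log + off + 4, crc32c(payload, len), 4);
    put_le(log + off + 8, lsn, 8);
    log[off + 16] = 0;  /* type: FULL */
    memcpy(log + off + WAL_HEADER, payload, len);
    return off + WAL_HEADER + len;
}

/* Scan from `start` (start <= size). On entry *last_lsn holds the checkpoint
 * LSN; on return it holds the LSN of the last valid record. Returns the
 * offset just past that record, which is where the log should be truncated. */
size_t wal_recover(const uint8_t *log, size_t size, size_t start,
                   uint64_t *last_lsn, size_t *count) {
    size_t off = start;
    *count = 0;
    while (size - off >= WAL_HEADER) {
        uint32_t len = (uint32_t)get_le(log + off, 4);
        if (len > size - off - WAL_HEADER) break;          /* truncated payload */
        const uint8_t *payload = log + off + WAL_HEADER;
        if (crc32c(payload, len) != (uint32_t)get_le(log + off + 4, 4))
            break;                                         /* checksum mismatch */
        uint64_t lsn = get_le(log + off + 8, 8);
        if (lsn <= *last_lsn) break;                       /* LSN must increase */
        *last_lsn = lsn;
        (*count)++;
        off += WAL_HEADER + len;
    }
    return off;
}
```

A caller would truncate the log file at the returned offset and replay the `count` records that precede it.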