Recipe / Storage Engines
LSM Tree Primer
Log-structured merge trees power most modern write-heavy databases: RocksDB, LevelDB, Cassandra, ScyllaDB, and Meridian's own event log layer. This primer walks through the three moving parts you need to understand before tuning one in production.
1. The MemTable
All writes hit an in-memory sorted structure first — usually a skip list or a red-black tree. Writes are O(log n) and append-only to a write-ahead log for durability. When the MemTable crosses a size threshold, it is frozen and flushed to disk as an immutable sorted run called an SSTable.
2. SSTables and Levels
SSTables are sorted, immutable files on disk. They are organized into levels (L0, L1, L2…) where each level is exponentially larger than the one above. A read may need to consult one file per level, so bloom filters are attached to each SSTable to skip files that cannot contain the requested key.
3. Compaction
Background workers merge overlapping SSTables, dropping tombstones and superseded versions. The two dominant strategies are leveled compaction (low read amp, high write amp) and size-tiered (the inverse). Pick based on whether your workload is read- or write-bound.
Minimal write path
def put(key, value):
wal.append(key, value) # durability
memtable.insert(key, value) # O(log n)
if memtable.size > THRESHOLD:
sstable = memtable.flush_to_disk()
levels[0].append(sstable)
schedule_compaction()