RECIPE

ClickHouse Primer

A practical introduction to ClickHouse for Meridian operators. Learn how columnar storage, MergeTree engines, and materialized views combine to give you sub-second analytics over billions of inference events without the operational pain of traditional OLAP stacks.

1. Why ClickHouse for Meridian

Meridian writes one row per inference call: tokens in, tokens out, latency, model alias, customer id. ClickHouse stores these columns contiguously, so a query that touches three columns reads three columns from disk, not the entire row. The result is order-of-magnitude faster scans than Postgres on the same hardware, and storage that compresses to under 10% of raw JSON.

  • Append-only ingest at 100k rows/sec on a single node
  • Sub-second percentiles over a month of traffic
  • ZSTD compression cuts disk by 12x vs JSON logs

2. The MergeTree table that powers /v1/usage

Every Meridian deployment ships with a base table partitioned by day and ordered by customer + timestamp. This is the schema the public usage endpoint queries directly:

CREATE TABLE meridian.events (
  ts          DateTime64(3, 'UTC'),
  customer_id String,
  model       LowCardinality(String),
  tokens_in   UInt32,
  tokens_out  UInt32,
  latency_ms  UInt32,
  cost_usd    Decimal(10, 6),
  status      LowCardinality(String)
)
ENGINE = MergeTree
PARTITION BY toYYYYMMDD(ts)
ORDER BY (customer_id, ts)
TTL ts + INTERVAL 90 DAY
SETTINGS index_granularity = 8192;

3. Materialized views for the admin console

The admin console never scans the raw events table at read time. A materialized view rolls each event into a per-customer, per-hour bucket on insert, so the cost cockpit query stays under 50 milliseconds even when the base table holds two billion rows. When you add a new metric, create a fresh view rather than altering the existing one.

The pattern: raw MergeTree for forensics, AggregatingMergeTree views for dashboards. Keep the raw retention short, the rollups long.