LLM Audit Log Design
Every production LLM call should leave a forensic trail. This recipe shows how to structure a Meridian-compatible audit log so you can replay any completion, attribute cost per tenant, and prove compliance when a customer asks why the model said what it said. Designed for high-throughput gateways routing 250+ model aliases.
1. Capture the immutable request envelope
Log the request BEFORE you mutate it for the upstream provider. Persist the raw model alias the caller asked for, the resolved deployment, the tenant id from the API key, and a SHA-256 of the prompt body. Hashing avoids retaining PII while keeping the row joinable to a sidecar encrypted blob store if you need replay.
- request_id (ULID) — primary key, sortable by time
- tenant_id, api_key_id — billing + access scope
- model_requested vs model_resolved — routing audit
- prompt_sha256, prompt_token_count — replay handle
2. Stream the response into a single closing row
Do not write one row per SSE chunk. Buffer the deltas in memory, then write a single terminal record on stream close (or on client abort) with completion tokens, finish reason, upstream latency, and gateway overhead. This keeps the table indexable and makes per-second cost queries cheap.
3. Pin the schema and bill from it
Treat the audit row as the source of truth for invoicing. Never compute usage from the upstream provider response alone — that path loses tenant attribution. Pin the schema with a versioned migration and require every gateway worker to emit the same shape.
Reference schema
CREATE TABLE llm_audit ( request_id TEXT PRIMARY KEY, tenant_id TEXT NOT NULL, api_key_id TEXT NOT NULL, model_requested TEXT NOT NULL, model_resolved TEXT NOT NULL, prompt_sha256 TEXT NOT NULL, prompt_tokens INT NOT NULL, completion_tokens INT, finish_reason TEXT, upstream_ms INT, gateway_ms INT, cost_usd_micros BIGINT, created_at TIMESTAMPTZ DEFAULT now() ); CREATE INDEX llm_audit_tenant_time ON llm_audit (tenant_id, created_at DESC);