Custom eval runner
Ship a bespoke evaluation harness that scores model outputs against your own rubrics — no third-party eval framework required.
Overview
Meridian lets you define a custom runner that invokes your model, collects completions, and applies scoring logic you control. The runner is a single TypeScript module that exports a standard interface — drop it into your project and Meridian handles orchestration, caching, and result storage.
Runner contract
export interface EvalRunner {
  id: string;
  run(input: EvalInput): Promise<EvalResult>;
}
type EvalInput = {
  prompt: string;
  model: string;
  params?: Record<string, unknown>;
};
type EvalResult = {
  score: number; // 0.0 – 1.0
  latencyMs: number;
  output: string;
  metadata?: Record<string, unknown>;
};
Wiring it up
Place your runner in evals/runners/ and reference it by id in your eval config. Meridian discovers runners automatically at build time — no manual registration needed.
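As a rough illustration of what a file in evals/runners/ might contain, here is a minimal sketch of a runner that satisfies the contract. The file name (evals/runners/keyword-coverage.ts), the callModel stub, the keywordScore rubric, and the keywords param are all hypothetical and exist only for this example. The types are repeated so the snippet stands alone. This page does not say whether Meridian expects a default or a named export, so the export statement is left out.

```typescript
// Hypothetical file: evals/runners/keyword-coverage.ts
// Types repeated from the runner contract so this sketch is self-contained.
interface EvalRunner {
  id: string;
  run(input: EvalInput): Promise<EvalResult>;
}
type EvalInput = {
  prompt: string;
  model: string;
  params?: Record<string, unknown>;
};
type EvalResult = {
  score: number; // 0.0 to 1.0
  latencyMs: number;
  output: string;
  metadata?: Record<string, unknown>;
};

// Hypothetical stand-in for a real model client; swap in your own call.
async function callModel(model: string, prompt: string): Promise<string> {
  return `[${model}] ${prompt}`;
}

// Hypothetical rubric: fraction of required keywords present in the output.
function keywordScore(output: string, keywords: string[]): number {
  if (keywords.length === 0) return 1;
  const lower = output.toLowerCase();
  const hits = keywords.filter((k) => lower.includes(k.toLowerCase())).length;
  return hits / keywords.length;
}

const keywordRunner: EvalRunner = {
  id: "keyword-coverage", // the id you would reference from the eval config
  async run(input) {
    const started = Date.now();
    const output = await callModel(input.model, input.prompt);
    const latencyMs = Date.now() - started;

    // "keywords" is an assumed param name; params is untyped, so validate it.
    const raw = input.params?.keywords;
    const keywords = Array.isArray(raw)
      ? raw.filter((k): k is string => typeof k === "string")
      : [];

    return {
      score: keywordScore(output, keywords),
      latencyMs,
      output,
      metadata: { keywords },
    };
  },
};
```

Keeping the scoring function separate from run makes the rubric easy to unit test without invoking a model.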
Built-in scoring helpers
exactMatch: string equality
containsAll: substring presence
levenshtein: normalized edit distance
jsonSchema: structural validation
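The helper signatures are not shown on this page, so the following is only a sketch of the scoring logic the first three descriptions suggest, written as local stand-in functions rather than Meridian's actual API. The function names, the argument order, and the normalization used for edit distance (1 minus distance divided by the longer string's length) are assumptions. jsonSchema is omitted because structural validation depends on a schema library.

```typescript
// Local stand-ins, NOT Meridian's helpers: names and signatures are assumed.

// String equality: 1 when the output matches exactly, otherwise 0.
function exactMatchScore(output: string, expected: string): number {
  return output === expected ? 1 : 0;
}

// Substring presence: 1 only when every required substring appears.
function containsAllScore(output: string, required: string[]): number {
  return required.every((s) => output.includes(s)) ? 1 : 0;
}

// Levenshtein distance via dynamic programming with one rolling row.
function editDistance(a: string, b: string): number {
  const row: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // value from the previous row, previous column
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // value from the previous row, same column
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Assumed normalization: 1 for identical strings, 0 for entirely different ones.
function levenshteinScore(output: string, expected: string): number {
  const maxLen = Math.max(output.length, expected.length);
  if (maxLen === 0) return 1; // two empty strings are identical
  return 1 - editDistance(output, expected) / maxLen;
}
```

All three functions return a value in the 0.0 to 1.0 range, so any of them could populate the score field of an EvalResult directly.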
Next step
Read the eval config reference to connect your runner to a test suite.