Meridian × Braintrust

Evals with Braintrust

Wire Braintrust on top of Meridian to compare model performance across scenarios. Run side-by-side evaluations of GPT-4o, Claude Sonnet 4, and Gemini 2.5 Pro on real test inputs, with latency tracking.

Evaluation Scenario

Ready to evaluate

Select a scenario above and click Run Evaluation to compare model outputs, latency, and token usage across all three providers.
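
For readers who want a feel for what a side-by-side run measures, here is a minimal Python sketch of the comparison loop. It is illustrative only and does not use the Braintrust or Meridian APIs: the `make_stub` callers stand in for real provider requests, `run_side_by_side` and `EvalResult` are hypothetical names, and the whitespace-based `count_tokens` is a rough placeholder for the token usage a provider would actually report.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalResult:
    model: str
    output: str
    latency_ms: float
    tokens: int


def count_tokens(text: str) -> int:
    # Assumption: a crude whitespace count, not a real tokenizer.
    return len(text.split())


def make_stub(label: str) -> Callable[[str], str]:
    # Hypothetical stand-in for a real provider call routed through a gateway.
    def call(prompt: str) -> str:
        return f"[{label}] echo: {prompt}"
    return call


# Labels match the three models named on this page; the callers are stubs.
MODELS: Dict[str, Callable[[str], str]] = {
    "GPT-4o": make_stub("GPT-4o"),
    "Claude Sonnet 4": make_stub("Claude Sonnet 4"),
    "Gemini 2.5 Pro": make_stub("Gemini 2.5 Pro"),
}


def run_side_by_side(
    prompt: str, models: Dict[str, Callable[[str], str]]
) -> List[EvalResult]:
    """Send the same prompt to every model and record output, latency, tokens."""
    results: List[EvalResult] = []
    for name, call in models.items():
        start = time.perf_counter()
        output = call(prompt)
        latency_ms = (time.perf_counter() - start) * 1000.0
        tokens = count_tokens(prompt) + count_tokens(output)
        results.append(EvalResult(name, output, latency_ms, tokens))
    return results


if __name__ == "__main__":
    for r in run_side_by_side("Summarize the refund policy.", MODELS):
        print(f"{r.model:<16} {r.latency_ms:8.3f} ms  {r.tokens:4d} tokens  {r.output}")
```

In a real setup, each stub would be replaced by an actual request, and the recorded fields would be logged to the evaluation tool rather than printed.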