Cost-aware routing strategy
Meridian routes every chat completion across 250+ models on a single API. Cost-aware routing picks the cheapest model that meets your latency and quality bar for each request, so you spend a fraction of what a fixed frontier model would cost without sacrificing output quality on the requests that need it.
1. Tag your traffic
Send an x-meridian-tier header on every request, set to bulk, standard, or premium. Bulk traffic routes to Llama-4-class and DeepSeek-class models at roughly 1/40th the cost of a frontier reasoning call.
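One way to tag traffic consistently is to decide the tier in a single place rather than at every call site. The sketch below assumes your application already labels its workloads; the labels (log-triage, legal-review, and so on) and the function name pick_tier are hypothetical, and only the three tier values come from this page.

```shell
#!/bin/sh
# Map a workload label to an x-meridian-tier value.
# The workload labels are hypothetical examples; only the tier names
# bulk, standard, and premium are taken from the documentation.
pick_tier() {
  case "$1" in
    batch-summary|log-triage|classification) echo "bulk" ;;
    legal-review|code-generation)            echo "premium" ;;
    *)                                       echo "standard" ;;  # default for unlabeled traffic
  esac
}
```

Defaulting unknown workloads to standard rather than bulk is a conservative choice: a missing label then costs a little more instead of silently lowering quality.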
2. Use the router alias
Set model: "meridian/router" and let the adaptive router pick. The router watches per-tenant spend, latency budgets, and recent quality scores to land each call on the right tier.
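To keep callers from naming a concrete model by accident, the alias can be fixed inside a small helper that builds the request body. This is a minimal sketch, not part of any Meridian tooling: build_body is a hypothetical name, and the escaping covers only backslashes and double quotes, so prompts containing newlines or control characters would need a real JSON encoder.

```shell
#!/bin/sh
# Build a chat-completions body that always uses the router alias.
# build_body is a hypothetical helper; it takes the user prompt as $1.
build_body() {
  # Escape backslashes first, then double quotes, so the prompt stays a
  # valid JSON string. Newlines and control characters are NOT handled.
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"model":"meridian/router","messages":[{"role":"user","content":"%s"}]}\n' "$escaped"
}
```

The output can be passed to curl with -d "$(build_body 'Summarize this log.')".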
3. Example request
curl https://meridian.getnimbus.net/v1/chat/completions \
-H "Authorization: Bearer $MERIDIAN_KEY" \
-H "x-meridian-tier: bulk" \
-H "Content-Type: application/json" \
-d '{
"model": "meridian/router",
"messages": [
{"role": "user", "content": "Summarize this log."}
]
}'
Typical savings: 60-85% versus single-model frontier baselines.
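As a rough sanity check on how the savings figure relates to the 1/40th bulk price, the sketch below computes blended savings under a deliberately simplified model. It assumes every request costs the same at frontier pricing, that bulk requests cost exactly 1/40 of that, and that all non-bulk traffic pays the full frontier price (the standard tier is ignored). None of these assumptions come from Meridian's pricing, and blended_savings is a hypothetical name.

```shell
#!/bin/sh
# Percent saved versus sending everything to a frontier model, given the
# fraction of requests tagged bulk (0.0 to 1.0).
# Simplified model: cost = (1 - f) * 1 + f * (1/40), so savings = f * (1 - 1/40).
blended_savings() {
  awk -v f="$1" 'BEGIN { printf "%.1f\n", 100 * f * (1 - 1/40) }'
}

blended_savings 0.80   # prints 78.0
```

Under this model, savings of 60-85% correspond to roughly 62% to 87% of requests being routed as bulk.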