Recipe
Parallel tool calls
Cut multi-tool latency from N round-trips to one. When a model can see that three function calls are independent, it emits them in a single response and your runtime fans them out in parallel.
1. Why parallel beats sequential
A typical agent loop calls one tool, waits for the result, then decides what to call next. For independent calls (e.g. weather + stock price + news lookup), this serializes work that has no causal ordering. Meridian-routed models can instead emit multiple tool_use blocks in one assistant turn. Dispatch them with Promise.all and feed the results back in a single user turn.
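To make the saving concrete, here is a small sketch of the arithmetic. The latency figures are hypothetical, chosen only for illustration, and the helper names are not part of any SDK. It counts tool time only and ignores the model round-trips themselves.

```typescript
// Hypothetical per-tool latencies in milliseconds (illustrative numbers only).
const latenciesMs: Record<string, number> = {
  get_weather: 300,
  get_stock_price: 200,
  search_web: 900,
};

// Sequential dispatch waits for each call in turn, so the costs add up.
function sequentialLatencyMs(latencies: number[]): number {
  return latencies.reduce((total, ms) => total + ms, 0);
}

// Parallel dispatch finishes when the slowest call finishes.
function parallelLatencyMs(latencies: number[]): number {
  return latencies.length === 0 ? 0 : Math.max(...latencies);
}

const values = Object.values(latenciesMs);
console.log(sequentialLatencyMs(values)); // 1400
console.log(parallelLatencyMs(values)); // 900
```

With these numbers the parallel path is bounded by the slowest tool (900 ms) rather than the sum of all three (1400 ms). The gap widens as more independent calls are added.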
2. The pattern
Define your tools, send the user prompt, then collect every tool_use block from the response. Run them concurrently and return each result tagged with its original tool_use_id.
// Parallel tool calls with Meridian
import { Meridian } from '@meridian/sdk';

const client = new Meridian({ apiKey: process.env.MERIDIAN_KEY });

const response = await client.messages.create({
  model: 'azure/model-router',
  max_tokens: 4096,
  tools: [
    { name: 'get_weather', input_schema: { /* ... */ } },
    { name: 'get_stock_price', input_schema: { /* ... */ } },
    { name: 'search_web', input_schema: { /* ... */ } },
  ],
  messages: [{
    role: 'user',
    content: 'Compare weather in NYC, AAPL price, and latest AI news.',
  }],
});

// The model can emit all 3 tool_use blocks in ONE response
const toolCalls = response.content.filter(b => b.type === 'tool_use');

// Dispatch every call in parallel rather than one at a time.
// dispatch(name, input) is your own function that routes to the tool implementation.
const results = await Promise.all(
  toolCalls.map(call => dispatch(call.name, call.input))
);
3. Gotchas
- Not every model parallelizes equally. Reasoning-heavy routes may still chain calls; mid-tier general models fan out most aggressively.
- You must return ALL tool results in the next user message, each as a separate content block, before the model can answer.
- Cap concurrency at what the slowest downstream service can handle. A 50-tool fan-out will overwhelm a 10-req/s API regardless of model speed.
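The last two gotchas can be sketched together: a small worker pool that keeps at most `limit` calls in flight and always produces one result block per call, even when a tool throws. This is a sketch under assumptions, not the SDK's own API. The `ToolUseBlock` and `ToolResultBlock` shapes, the `is_error` flag, and the helper names `runToolCalls` and `toUserTurn` are hypothetical, modeled on the tool_use / tool_use_id fields this recipe names; check the Meridian SDK types before relying on them. Note also that a concurrency cap bounds in-flight requests; it does not by itself enforce a per-second rate.

```typescript
// Assumed block shapes (hypothetical; the real SDK types may differ).
interface ToolUseBlock {
  type: 'tool_use';
  id: string;
  name: string;
  input: unknown;
}

interface ToolResultBlock {
  type: 'tool_result';
  tool_use_id: string;
  content: string;
  is_error?: boolean;
}

// Your own router from tool name to implementation.
type Dispatch = (name: string, input: unknown) => Promise<unknown>;

// Run every call with at most `limit` in flight, and always produce one
// result block per call so the follow-up message is complete.
async function runToolCalls(
  calls: ToolUseBlock[],
  dispatch: Dispatch,
  limit: number,
): Promise<ToolResultBlock[]> {
  const results: ToolResultBlock[] = new Array(calls.length);
  let next = 0;

  // Each worker claims the next unclaimed call until none remain.
  async function worker(): Promise<void> {
    while (next < calls.length) {
      const index = next++;
      const call = calls[index];
      try {
        const output = await dispatch(call.name, call.input);
        results[index] = {
          type: 'tool_result',
          tool_use_id: call.id,
          content: JSON.stringify(output ?? null),
        };
      } catch (err) {
        // A failed tool still gets a block, flagged as an error.
        results[index] = {
          type: 'tool_result',
          tool_use_id: call.id,
          content: err instanceof Error ? err.message : String(err),
          is_error: true,
        };
      }
    }
  }

  const workerCount = Math.max(1, Math.min(limit, calls.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Wrap all result blocks in the single user turn the model expects next.
function toUserTurn(results: ToolResultBlock[]) {
  return { role: 'user' as const, content: results };
}
```

In use, something like `runToolCalls(toolCalls, dispatch, 5)` would stand in for the bare Promise.all, and `toUserTurn(results)` would be appended to the conversation after the assistant turn that contained the tool_use blocks. Catching errors per call matters here because a bare Promise.all rejects on the first failure, which would leave the other tool_use_ids without a result.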