Recipe

Parallel tool calls

Cut multi-tool latency from N round-trips to one. When a model can see that three function calls are independent, it emits them in a single response and your runtime fans them out in parallel.

1. Why parallel beats sequential

A typical agent loop calls one tool, waits for the result, then decides what to call next. For independent calls (e.g. weather + stock price + news lookup), this serializes work that has no causal ordering. Meridian-routed models can instead emit multiple tool_use blocks in one assistant turn. Dispatch them concurrently with Promise.all and feed all the results back in a single user turn.
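To make the saving concrete, here is a rough latency model, offered as a sketch rather than a benchmark. The function names, the 1000 ms model turn, and the per-tool latencies are all made-up assumptions. Both figures leave out the final answer turn, which is the same in either case.

```typescript
// Illustrative latency model; every number here is an assumption for the example.
interface ToolCall {
  name: string;
  latencyMs: number;
}

// Sequential loop: one model round-trip per tool call, tools run one at a time.
function sequentialMs(calls: ToolCall[], modelTurnMs: number): number {
  const toolTime = calls.reduce((sum, c) => sum + c.latencyMs, 0);
  return calls.length * modelTurnMs + toolTime;
}

// Parallel fan-out: one model round-trip, tools overlap, so the slowest dominates.
function parallelMs(calls: ToolCall[], modelTurnMs: number): number {
  const slowest = Math.max(0, ...calls.map(c => c.latencyMs));
  return modelTurnMs + slowest;
}

const calls: ToolCall[] = [
  { name: 'get_weather', latencyMs: 300 },
  { name: 'get_stock_price', latencyMs: 200 },
  { name: 'search_web', latencyMs: 800 },
];

console.log(sequentialMs(calls, 1000)); // 4300
console.log(parallelMs(calls, 1000));   // 1800
```

Under these assumptions the model round-trips, not the tools themselves, account for most of the sequential cost, which is why collapsing N turns into one matters.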

2. The pattern

Define your tools, send the user prompt, then collect every tool_use block from the response. Run them concurrently and return each result tagged with its original tool_use_id.

// Parallel tool calls with Meridian
import { Meridian } from '@meridian/sdk';

const client = new Meridian({ apiKey: process.env.MERIDIAN_KEY });

const response = await client.messages.create({
  model: 'azure/model-router',
  max_tokens: 4096,
  tools: [
    { name: 'get_weather', input_schema: { /* ... */ } },
    { name: 'get_stock_price', input_schema: { /* ... */ } },
    { name: 'search_web', input_schema: { /* ... */ } },
  ],
  messages: [{
    role: 'user',
    content: 'Compare weather in NYC, AAPL price, and latest AI news.',
  }],
});

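// The model may emit several tool_use blocks in ONE response
const toolCalls = response.content.filter(b => b.type === 'tool_use');

// Dispatch all calls concurrently, not one after another.
// dispatch(name, input) is your own router to the tool implementations.
// The block shape (call.id, tool_result) is assumed from the tool_use_id naming above.
const results = await Promise.all(
  toolCalls.map(async call => ({
    type: 'tool_result',
    tool_use_id: call.id, // tag each result with its originating call
    content: JSON.stringify(await dispatch(call.name, call.input)),
  }))
);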
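Promise.all rejects as soon as any one tool throws, which would leave the other calls without a result to send back. One possible way to return a result for every call is to settle all promises first (for example with Promise.allSettled) and map each outcome to a result block. The sketch below assumes block shapes inferred from the tool_use and tool_use_id naming in this recipe. The interfaces, the is_error field, and the toToolResults helper are hypothetical and not confirmed SDK types.

```typescript
// Assumed block shapes; not taken from the SDK's published types.
interface ToolUseBlock {
  type: 'tool_use';
  id: string;
  name: string;
  input: unknown;
}
interface ToolResultBlock {
  type: 'tool_result';
  tool_use_id: string;
  content: string;
  is_error?: boolean; // hypothetical flag for failed calls
}
// Same shape as the entries returned by Promise.allSettled.
type Settled =
  | { status: 'fulfilled'; value: unknown }
  | { status: 'rejected'; reason: unknown };

// Pair each call with its outcome by position so no tool_use_id goes unanswered.
function toToolResults(calls: ToolUseBlock[], settled: Settled[]): ToolResultBlock[] {
  return calls.map((call, i): ToolResultBlock => {
    const outcome = settled[i];
    return outcome.status === 'fulfilled'
      ? { type: 'tool_result', tool_use_id: call.id, content: JSON.stringify(outcome.value) }
      : { type: 'tool_result', tool_use_id: call.id, content: String(outcome.reason), is_error: true };
  });
}

// Hand-built outcomes stand in for real tool calls in this demo.
const calls: ToolUseBlock[] = [
  { type: 'tool_use', id: 'call_1', name: 'get_weather', input: { city: 'NYC' } },
  { type: 'tool_use', id: 'call_2', name: 'get_stock_price', input: { ticker: 'AAPL' } },
];
const settled: Settled[] = [
  { status: 'fulfilled', value: { tempF: 71 } },
  { status: 'rejected', reason: new Error('upstream timeout') },
];

// All results travel back together in a single user turn.
const followUp = { role: 'user' as const, content: toToolResults(calls, settled) };
console.log(followUp.content.length); // 2
```

In a real loop, followUp would be appended after the assistant turn that contained the tool_use blocks, and the conversation sent again so the model can answer.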

3. Gotchas

  • Not every model parallelizes equally. Reasoning-heavy routes may still chain calls; mid-tier general models fan out most aggressively.
  • You must return ALL tool results in the next user message, each as a separate content block, before the model can answer.
  • Cap concurrency at what the slowest downstream service can absorb. A 50-tool fan-out will overwhelm a 10-req/s API regardless of model speed.
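One way to apply the concurrency cap from the last bullet is a small worker-pool helper used in place of a bare Promise.all. This is a minimal sketch. mapWithLimit is a hypothetical helper, not part of any SDK, and it bounds the number of calls in flight rather than enforcing a requests-per-second rate.

```typescript
// Hypothetical helper: run fn over items with at most `limit` calls in flight.
// Results keep the input order, so they can still be paired with tool_use ids.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each worker claims the next unprocessed index until none remain.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i], i);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  const workers: Promise<void>[] = [];
  for (let w = 0; w < workerCount; w++) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return results;
}
```

With this in place, a call such as mapWithLimit(toolCalls, 5, call => dispatch(call.name, call.input)) would keep at most five tool calls running at once. The limit of 5 is arbitrary and should be tuned to the downstream service.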