Multiple Completions

Generate n independent completions in a single API call. Compare outputs, pick the best, and only pay for what you use.

n=4completions per request

Set n=4 and receive four independent, non-deterministic completions in one response. Each completion is a full generation — Meridian streams them back as they finish.

How It Works

1

Send One Request

Pass n=4 in your chat completions call. No batching, no loops — one HTTP request, one response array.

2

Independent Generations

Each completion is sampled independently. Different seeds, different paths — you get genuine diversity, not minor re-rankings of the same logits.

3

Pick the Best

Compare all n outputs client-side. Use your own ranking heuristic, user voting, or automated eval to select the winner.

Pricing

Cost scales linearly with n. If one completion produces T output tokens, an n=4 request costs n × T output tokens. Input tokens are charged once — they are shared across all completions.

Example:n=4, 500 output tokens each2,000 output tokens billed

Common Use Cases

  • A/B testing prompts: Generate multiple responses to the same prompt and let users vote on quality.
  • Self-consistency: Sample multiple reasoning chains and take the majority answer for math or logic tasks.
  • Creative exploration: Generate four variations of copy, code, or design ideas and pick the strongest.
  • Automated evals: Run a scoring function across all completions and return only the top-ranked result to the user.

Ready to try it?

Add n: 4 to your next chat completions call and see the difference.

View API Reference →