Recipe

Text-to-speech patterns

Meridian streams audio chunks from any TTS-capable model on the gateway. This recipe shows the three patterns you will reach for ninety percent of the time: one-shot synthesis, streaming playback, and voice cloning with reference audio. Each pattern uses the same model alias: audio/tts-pro.

1. One-shot synthesis

Smallest possible call. Send text, receive an MP3 buffer. Good for short notifications, button-press feedback, or any case where total output is under thirty seconds and you do not need word-level timestamps.
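
A minimal sketch of the one-shot call, reusing the endpoint and field names from the streaming example in this recipe. The helper names buildSpeechRequest and synthesize are hypothetical, and it is an assumption that leaving out the stream field returns the complete MP3 in a single response body.

```javascript
// Hypothetical helper: builds the fetch options for a one-shot request.
// The fields (model, input, voice) mirror the streaming example in this recipe.
// Omitting `stream` to get one complete MP3 buffer is an assumption.
function buildSpeechRequest(key, text, voice = 'nova') {
  return {
    method: 'POST',
    headers: {
      Authorization: 'Bearer ' + key,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ model: 'audio/tts-pro', input: text, voice }),
  };
}

// Sends the request and resolves to the raw audio bytes.
// fetchImpl is injectable so the function can be exercised without a network.
async function synthesize(key, text, fetchImpl = fetch) {
  const res = await fetchImpl(
    'https://llm.getnimbus.net/v1/audio/speech',
    buildSpeechRequest(key, text),
  );
  if (!res.ok) throw new Error('TTS request failed: ' + res.status);
  return new Uint8Array(await res.arrayBuffer());
}
```

The returned bytes can be written to a file or wrapped in a Blob for an audio element, depending on where the code runs.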

2. Streaming playback

For anything over a sentence, switch to stream: true. You get Opus frames as they synthesize, so the user hears the first syllable in under three hundred milliseconds instead of waiting for the whole take to render.
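
One way to consume the streamed response is to read the body incrementally rather than awaiting the whole buffer. The sketch below assumes the response body is a standard ReadableStream, as fetch provides in browsers and Node 18+. The name pumpAudio is hypothetical, and the chunks it delivers are byte chunks that may not line up with Opus frame boundaries, so a real player would still need a decoder that handles partial frames.

```javascript
// Reads a streamed response body chunk by chunk and passes each chunk to onChunk.
// Returns the total number of bytes received once the stream ends.
async function pumpAudio(response, onChunk) {
  const reader = response.body.getReader();
  let total = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    total += value.byteLength;
    onChunk(value); // e.g. feed a decoder or append to a playback buffer
  }
  return total;
}
```

Handing chunks to a callback keeps the reader independent of the playback mechanism, which differs between browser and server environments.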

3. Voice cloning

Pass a six-second reference clip as a base64 data URL in the voice_ref field, and the model matches its timbre, cadence, and accent. Reference clips never leave the inference pod and are wiped at the end of the request.
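
A sketch of how the voice_ref value might be built in Node. The helper names toDataUrl and buildCloneBody are hypothetical, the audio/wav MIME type is an assumption (the recipe does not say which formats are accepted), and leaving out the voice field when voice_ref is present is also an assumption.

```javascript
// Encodes raw clip bytes as a base64 data URL.
// Uses Node's Buffer; the default audio/wav MIME type is an assumption.
function toDataUrl(bytes, mime = 'audio/wav') {
  return 'data:' + mime + ';base64,' + Buffer.from(bytes).toString('base64');
}

// Builds the JSON body for a cloning request.
// voice_ref is the field named in this recipe; omitting `voice` is an assumption.
function buildCloneBody(text, clipBytes) {
  return JSON.stringify({
    model: 'audio/tts-pro',
    input: text,
    voice_ref: toDataUrl(clipBytes),
  });
}
```

The resulting string goes in the body of the same POST request used by the other patterns.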

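// Streaming request (pattern 2). `key` holds your gateway API key.
const r = await fetch('https://llm.getnimbus.net/v1/audio/speech', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer ' + key,
    'Content-Type': 'application/json', // the body is JSON
  },
  body: JSON.stringify({
    model: 'audio/tts-pro',
    input: 'Meridian streams in under 300ms.',
    voice: 'nova',
    stream: true, // receive Opus frames as they are synthesized
  }),
});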