Speech-to-text (Whisper)

Transcribe audio to text with OpenAI's Whisper model. Upload an audio file as multipart/form-data and receive a transcription, optionally with word-level or segment-level timestamps.

POST https://api.getnimbus.net/v1/audio/transcriptions

Authenticate by sending your API key as a bearer token in the Authorization header (Authorization: Bearer <your API key>).

Request body

Send a multipart/form-data payload with the following fields:

Field                    Type      Required  Description
file                     binary    Yes       Audio file (flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm)
model                    string    Yes       Use whisper-1
language                 string    No        ISO-639-1 code (e.g. en) for better accuracy
response_format          string    No        json, text, srt, verbose_json, or vtt (default: json)
timestamp_granularities  string[]  No        word and/or segment (requires verbose_json)
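
The field constraints above can be checked on the client before anything is uploaded. The following is a minimal sketch, not part of the API: build_form_fields is a hypothetical helper name, and the only rules it enforces are the ones listed in the table (allowed response_format values, allowed granularities, and the verbose_json requirement).

```python
ALLOWED_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}
ALLOWED_GRANULARITIES = {"word", "segment"}


def build_form_fields(model="whisper-1", language=None,
                      response_format="json", timestamp_granularities=()):
    """Return (name, value) pairs for the non-file form fields.

    Hypothetical helper: raises ValueError when a documented constraint
    is violated, so the mistake surfaces before the upload starts.
    """
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unknown response_format: {response_format!r}")

    granularities = list(timestamp_granularities)
    unknown = set(granularities) - ALLOWED_GRANULARITIES
    if unknown:
        raise ValueError(f"unknown timestamp granularities: {sorted(unknown)}")
    if granularities and response_format != "verbose_json":
        raise ValueError("timestamp_granularities requires response_format='verbose_json'")

    fields = [("model", model), ("response_format", response_format)]
    if language is not None:
        fields.append(("language", language))
    # Array fields are sent as one repeated "name[]" entry per value.
    fields.extend(("timestamp_granularities[]", g) for g in granularities)
    return fields
```

Calling build_form_fields(timestamp_granularities=["word"]) with the default json format raises ValueError, which mirrors the "requires verbose_json" note in the table.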

Example

curl https://api.getnimbus.net/v1/audio/transcriptions \
  -H "Authorization: Bearer $MERIDIAN_API_KEY" \
  -F file="@audio.mp3" \
  -F model="whisper-1" \
  -F response_format="verbose_json" \
  -F "timestamp_granularities[]=word"
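
For callers who are not using curl, the same request can be sketched in Python with only the standard library. This is an illustrative sketch rather than an official client: encode_multipart, build_request, and transcribe are hypothetical names, the environment variable is taken from the curl example, and the multipart body is assembled by hand on the assumption that the server accepts a standard multipart/form-data encoding.

```python
import os
import urllib.request
import uuid

API_URL = "https://api.getnimbus.net/v1/audio/transcriptions"


def encode_multipart(fields, filename, file_bytes,
                     content_type="application/octet-stream"):
    """Encode text fields plus one file part as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields:
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    # The audio goes in the part named "file", as in the request table.
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n".encode("utf-8")
    )
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(api_key, filename, file_bytes, fields):
    """Build (but do not send) the POST request for a transcription."""
    body, content_type = encode_multipart(fields, filename, file_bytes)
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )


def transcribe(path, fields):
    """Send the file at `path` and return the raw response text.

    Reads the key from MERIDIAN_API_KEY, the variable used in the curl example.
    """
    with open(path, "rb") as audio:
        request = build_request(
            os.environ["MERIDIAN_API_KEY"],
            os.path.basename(path),
            audio.read(),
            fields,
        )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

A call equivalent to the curl command would be transcribe("audio.mp3", [("model", "whisper-1"), ("response_format", "verbose_json"), ("timestamp_granularities[]", "word")]). The function returns text rather than parsed JSON because text, srt, and vtt responses are not JSON.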

Response

{
  "text": "Hello, this is a transcription.",
  "task": "transcribe",
  "language": "english",
  "duration": 2.34,
  "segments": [
    {
      "id": 0,
      "start": 0.0,
      "end": 2.34,
      "text": "Hello, this is a transcription.",
      "words": [
        { "word": "Hello,", "start": 0.0, "end": 0.62 },
        { "word": "this", "start": 0.62, "end": 0.94 },
        { "word": "is", "start": 0.94, "end": 1.12 },
        { "word": "a", "start": 1.12, "end": 1.28 },
        { "word": "transcription.", "start": 1.28, "end": 2.34 }
      ]
    }
  ]
}
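
A response like the one above can be flattened into a list of timed words. The sketch below assumes the shape shown in the sample, with words nested under each segment; iter_words and format_timestamp are hypothetical helper names, and a response that places words elsewhere would need a different traversal.

```python
def iter_words(response):
    """Yield (word, start, end) for every word in a verbose_json response.

    Assumes words are nested under segments, as in the sample response.
    Segments without a "words" key are skipped.
    """
    for segment in response.get("segments", []):
        for word in segment.get("words", []):
            yield word["word"], word["start"], word["end"]


def format_timestamp(seconds):
    """Format a time offset in seconds as MM:SS.mmm."""
    millis = round(seconds * 1000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{minutes:02d}:{secs:02d}.{millis:03d}"


def word_timeline(response):
    """Return one 'start-end word' line per word, in order."""
    return [
        f"{format_timestamp(start)}-{format_timestamp(end)} {word}"
        for word, start, end in iter_words(response)
    ]
```

Applied to the sample response, word_timeline returns five lines, the first being "00:00.000-00:00.620 Hello,".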

Limits

  • Maximum file size: 25 MB
  • Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm
  • Rate limit: 50 requests per minute per API key
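
These limits can be enforced on the client to avoid rejected uploads. The following is a sketch under stated assumptions: check_upload and RateLimiter are hypothetical names, "25 MB" is read conservatively as 25,000,000 bytes, the format is inferred from the file extension only, and the rate limit is modeled as a sliding 60-second window, which may differ from how the server actually counts requests.

```python
import os
import time
from collections import deque

# Assumption: 25 MB is read as 25,000,000 bytes (the stricter interpretation).
MAX_FILE_BYTES = 25_000_000
SUPPORTED_EXTENSIONS = {
    "flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm",
}


def check_upload(filename, size_bytes):
    """Return a list of problems; an empty list means the file passes."""
    problems = []
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file too large: {size_bytes} > {MAX_FILE_BYTES} bytes")
    return problems


class RateLimiter:
    """Sliding-window limiter: at most `limit` calls per `window` seconds."""

    def __init__(self, limit=50, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.sleep = sleep
        self.calls = deque()  # start times of recent calls, oldest first

    def wait(self):
        """Block until another request may be sent, then record it."""
        now = self.clock()
        # Forget calls that have left the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) >= self.limit:
            # Sleep until the oldest recorded call leaves the window.
            self.sleep(self.window - (now - self.calls[0]))
            now = self.clock()
            self.calls.popleft()
        self.calls.append(now)
```

A typical use is to call check_upload(path, os.path.getsize(path)) once per file and limiter.wait() immediately before each request. The clock and sleep parameters exist so the limiter can be exercised in tests without real delays.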