Speech-to-text (Whisper)
Convert audio into text using OpenAI's Whisper model. Upload an audio file as multipart/form-data and receive a transcription, optionally with word- or segment-level timestamps.
POST https://api.getnimbus.net/v1/audio/transcriptions
Authenticate with your API key in the Authorization: Bearer header.
Request body
Send a multipart/form-data payload with the following fields:
| Field | Type | Required | Description |
|---|---|---|---|
| file | binary | Yes | Audio file (flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm) |
| model | string | Yes | Use whisper-1 |
| language | string | No | ISO 639-1 code of the audio language (e.g. en); supplying it can improve accuracy |
| response_format | string | No | json, text, srt, verbose_json, or vtt (default: json) |
| timestamp_granularities | string[] | No | word and/or segment; only valid when response_format is verbose_json |
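As a rough illustration of how the fields above map onto a multipart request, here is a minimal Python sketch using only the standard library. The function name `build_transcription_request` is hypothetical, and the `timestamp_granularities[]` field-name convention is assumed from the curl example; the sketch builds the request without sending it.

```python
import uuid
import urllib.request

API_URL = "https://api.getnimbus.net/v1/audio/transcriptions"


def build_transcription_request(api_key, filename, audio_bytes, model="whisper-1",
                                language=None, response_format="json",
                                timestamp_granularities=()):
    """Build (but do not send) a multipart/form-data POST for the endpoint."""
    # The field table says timestamp granularity requires verbose_json.
    if timestamp_granularities and response_format != "verbose_json":
        raise ValueError("timestamp_granularities requires response_format='verbose_json'")

    # Text fields as (name, value) pairs; array values repeat the name with [].
    fields = [("model", model), ("response_format", response_format)]
    if language:
        fields.append(("language", language))
    fields += [("timestamp_granularities[]", g) for g in timestamp_granularities]

    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields:
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    # The binary file part carries a filename and a generic content type.
    parts.append(
        (f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
         f'filename="{filename}"\r\nContent-Type: application/octet-stream'
         "\r\n\r\n").encode() + audio_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Building the request separately keeps the field validation easy to exercise without network access.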
Example
curl https://api.getnimbus.net/v1/audio/transcriptions \
  -H "Authorization: Bearer $MERIDIAN_API_KEY" \
  -F file="@audio.mp3" \
  -F model="whisper-1" \
  -F response_format="verbose_json" \
  -F "timestamp_granularities[]=word"
Response
{
  "text": "Hello, this is a transcription.",
  "task": "transcribe",
  "language": "english",
  "duration": 2.34,
  "segments": [
    {
      "id": 0,
      "start": 0.0,
      "end": 2.34,
      "text": "Hello, this is a transcription.",
      "words": [
        { "word": "Hello,", "start": 0.0, "end": 0.62 },
        { "word": "this", "start": 0.62, "end": 0.94 },
        { "word": "is", "start": 0.94, "end": 1.12 },
        { "word": "a", "start": 1.12, "end": 1.28 },
        { "word": "transcription.", "start": 1.28, "end": 2.34 }
      ]
    }
  ]
}
Limits
- Maximum file size: 25 MB
- Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm
- Rate limit: 50 requests per minute per API key
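A client may want to check these limits locally before uploading. The sketch below is one possible approach, not part of the API: `check_upload` and `Throttle` are hypothetical helpers, "25 MB" is assumed to mean 25 × 1024 × 1024 bytes, and the throttle simply spaces calls evenly (60 / 50 = 1.2 seconds apart), which is stricter than the stated per-minute limit requires.

```python
import os
import time

MAX_FILE_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" interpreted as 25 MiB
SUPPORTED_FORMATS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
MIN_INTERVAL_S = 60 / 50  # 50 requests per minute -> one request every 1.2 s


def check_upload(filename, size_bytes):
    """Return a list of problems; an empty list means the local checks passed."""
    problems = []
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    if ext not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_FILE_BYTES}")
    return problems


class Throttle:
    """Client-side pacing: keeps consecutive calls at least min_interval apart."""

    def __init__(self, min_interval=MIN_INTERVAL_S, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = float("-inf")  # the first call never waits

    def wait(self):
        """Block until the next request is allowed, then reserve the following slot."""
        delay = self._next_allowed - self._clock()
        if delay > 0:
            self._sleep(delay)
        self._next_allowed = self._clock() + self._min_interval
```

A typical flow would call `check_upload(path, os.path.getsize(path))` first, then `throttle.wait()` immediately before each request. The server remains the authority on both limits, so error responses still need handling.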