Voice command → API action
Capture spoken commands from the browser, transcribe via Whisper, and trigger Meridian API endpoints with the parsed intent.
Step 1 — Capture audio
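Request microphone access with navigator.mediaDevices.getUserMedia and record the stream with the browser MediaRecorder API. Buffer chunks in memory and finalize them as a single audio blob when the user releases the push-to-talk button. MediaRecorder usually emits a compressed format such as WebM/Opus rather than WAV, so a WAV blob requires a separate encoding step.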
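The capture step could be sketched roughly as follows. The function names, the candidate MIME list, and the 250 ms chunk interval are illustrative assumptions, not part of Meridian; only getUserMedia and MediaRecorder are standard browser APIs.

```typescript
// Pick the first recording format the browser supports.
// The candidate list is an assumption; align it with what your backend accepts.
function pickMimeType(
  isSupported: (type: string) => boolean,
  candidates: string[] = ["audio/webm;codecs=opus", "audio/webm", "audio/mp4"],
): string | undefined {
  return candidates.find(isSupported);
}

interface PushToTalk {
  // Stops recording and resolves with the buffered audio as one blob.
  stop(): Promise<Blob>;
}

// Call on button press; call the returned stop() on button release.
async function startPushToTalk(): Promise<PushToTalk> {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const mimeType = pickMimeType((t) => MediaRecorder.isTypeSupported(t));
  const recorder = new MediaRecorder(stream, mimeType ? { mimeType } : undefined);

  const chunks: Blob[] = [];
  recorder.ondataavailable = (e) => {
    if (e.data.size > 0) chunks.push(e.data);
  };
  recorder.start(250); // emit a chunk roughly every 250 ms

  return {
    stop: () =>
      new Promise<Blob>((resolve) => {
        recorder.onstop = () => {
          stream.getTracks().forEach((t) => t.stop()); // release the microphone
          resolve(new Blob(chunks, { type: recorder.mimeType }));
        };
        recorder.stop();
      }),
  };
}
```

A push-to-talk button would call startPushToTalk() on pointerdown and await stop() on pointerup. The blob keeps whatever container the browser chose, so the backend should accept that format or transcode it.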
Step 2 — Transcribe
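POST the audio blob to your backend or directly to the OpenAI Whisper endpoint. Receive a plain-text transcript, plus optional metadata such as the detected language and per-segment log-probabilities (a rough confidence signal) when you request the verbose response format.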
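A minimal sketch of the transcription call, assuming it runs somewhere the API key is safe (typically your backend) and targets OpenAI's /v1/audio/transcriptions endpoint with the whisper-1 model. The function names and the default filename are hypothetical.

```typescript
// Build the multipart body the transcription endpoint expects.
// "command.webm" is an assumed filename; the extension should match the blob's format.
function buildTranscriptionForm(audio: Blob, filename = "command.webm"): FormData {
  const form = new FormData();
  form.append("file", audio, filename);
  form.append("model", "whisper-1");
  form.append("response_format", "verbose_json"); // includes language and segment data
  return form;
}

interface Transcript {
  text: string;
  language?: string;
}

async function transcribe(audio: Blob, apiKey: string): Promise<Transcript> {
  const res = await fetch("https://api.openai.com/v1/audio/transcriptions", {
    method: "POST",
    // Do not set Content-Type manually: fetch adds the multipart boundary itself.
    headers: { Authorization: `Bearer ${apiKey}` },
    body: buildTranscriptionForm(audio),
  });
  if (!res.ok) throw new Error(`Transcription failed: HTTP ${res.status}`);
  const data = await res.json();
  return { text: data.text, language: data.language };
}
```

Calling the endpoint straight from the browser would expose the API key to users, which is why this sketch assumes a server-side caller.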
Step 3 — Parse intent
Send the transcript to a lightweight LLM with a system prompt that maps natural language to Meridian action names and payload shapes. The call returns structured JSON containing the resolved action and its parameters.
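Because the model's output is free text until proven otherwise, it helps to validate it before acting on it. The sketch below shows one way to do that. The action names, the prompt wording, and the {action, params} JSON shape are all assumptions made for illustration; the source does not specify Meridian's real action catalogue.

```typescript
// Hypothetical action names; replace with the real Meridian actions.
const KNOWN_ACTIONS = ["create_task", "send_message"] as const;
type ActionName = (typeof KNOWN_ACTIONS)[number];

interface ParsedIntent {
  action: ActionName;
  params: Record<string, unknown>;
}

// Example system prompt constraining the model to the known actions.
const SYSTEM_PROMPT =
  `Map the user's command to one of: ${KNOWN_ACTIONS.join(", ")}. ` +
  `Reply with JSON only: {"action": "<name>", "params": {...}}.`;

// Parse the model's reply and reject anything outside the expected shape.
function validateIntent(raw: string): ParsedIntent {
  const data: unknown = JSON.parse(raw); // throws on malformed JSON
  if (typeof data !== "object" || data === null) {
    throw new Error("Intent must be a JSON object");
  }
  const { action, params } = data as { action?: unknown; params?: unknown };
  if (typeof action !== "string" || !(KNOWN_ACTIONS as readonly string[]).includes(action)) {
    throw new Error(`Unknown action: ${String(action)}`);
  }
  if (typeof params !== "object" || params === null || Array.isArray(params)) {
    throw new Error("params must be an object");
  }
  return { action: action as ActionName, params: params as Record<string, unknown> };
}
```

Rejecting unknown actions here keeps a misheard or adversarial command from reaching the execution step.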
Step 4 — Execute
Call the Meridian API with the resolved action and parameters. Surface the result in the UI with a confirmation toast or spoken TTS response.
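The execution call might look like the sketch below. Only the path /api/actions/execute comes from this page; the request body shape, the baseUrl parameter, and the function names are assumptions, so check them against the actual Meridian API before relying on them.

```typescript
interface ExecuteRequest {
  action: string;
  params: Record<string, unknown>;
}

// Build the fetch options for the execute call.
// The {action, params} body shape is assumed, not confirmed by the source.
function buildExecuteRequest(intent: ExecuteRequest): RequestInit {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ action: intent.action, params: intent.params }),
  };
}

// fetchImpl is injectable so the call can be stubbed in tests.
async function executeAction(
  intent: ExecuteRequest,
  baseUrl = "",
  fetchImpl: typeof fetch = fetch,
): Promise<unknown> {
  const res = await fetchImpl(`${baseUrl}/api/actions/execute`, buildExecuteRequest(intent));
  if (!res.ok) throw new Error(`Meridian action failed: HTTP ${res.status}`);
  return res.json();
}
```

The resolved value can then drive the confirmation toast, and a rejected promise gives the UI a single place to report failures.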
Endpoints used
POST /api/actions/execute
POST /api/intent/parse