Recipe: Voice form filler

Capture spoken input via Whisper and map it to structured form fields in real time.

Overview

This recipe demonstrates a voice-to-form pipeline. A user speaks into their microphone, the audio is streamed to OpenAI Whisper for transcription, and the resulting text is parsed into discrete form fields: name, email, phone, and freeform notes.

Architecture

1. Browser captures audio via the MediaRecorder API
2. Audio chunks are streamed to your API route
3. The API route forwards the audio to Whisper for transcription
4. Structured extraction maps the transcribed text to form fields
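The forwarding step (audio arriving at your API route and being sent on to Whisper) might look roughly like the sketch below. It assumes a runtime with global `fetch`, `FormData`, and `Blob` (for example Node 18+), and posts to OpenAI's `/v1/audio/transcriptions` endpoint with the `whisper-1` model. The helper names `buildWhisperRequest` and `transcribe` and the `audio.webm` filename are illustrative assumptions, not names taken from the recipe's repository.

```typescript
// Sketch only: forwards one audio blob to Whisper and returns the transcript.
// Helper names and the upload filename are hypothetical.

const WHISPER_URL = "https://api.openai.com/v1/audio/transcriptions";

interface WhisperRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: FormData };
}

// Build the multipart request separately so it can be inspected without a network call.
function buildWhisperRequest(audio: Blob, apiKey: string): WhisperRequest {
  const body = new FormData();
  body.append("file", audio, "audio.webm"); // filename is an assumption
  body.append("model", "whisper-1");
  return {
    url: WHISPER_URL,
    // No Content-Type header: fetch sets the multipart boundary itself.
    init: { method: "POST", headers: { Authorization: `Bearer ${apiKey}` }, body },
  };
}

// Send the request and pull the transcript out of the default JSON response.
async function transcribe(audio: Blob, apiKey: string): Promise<string> {
  const { url, init } = buildWhisperRequest(audio, apiKey);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`Whisper request failed with status ${res.status}`);
  }
  const data = (await res.json()) as { text: string };
  return data.text;
}
```

Keeping the API key inside the route, rather than calling Whisper from the browser, means the key never reaches the client.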

Field mapping

Spoken phrase	Extracted field
"My name is Jane Doe"	`name`
"Email jane@example.com"	`email`
"Phone is 555-0123"	`phone`
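To make the mapping in the table concrete, here is a minimal regex-based sketch. It is a stand-in for illustration: the recipe's own extraction uses a prompt, which is not reproduced here. The `FormFields` type, the `extractFields` name, the patterns, and the rule that unmatched text becomes `notes` are all assumptions.

```typescript
// Sketch only: a regex stand-in for the recipe's prompt-based extraction.
interface FormFields {
  name?: string;
  email?: string;
  phone?: string;
  notes?: string;
}

// One cue phrase per field, mirroring the table above (patterns are assumptions).
const PATTERNS: Array<[keyof FormFields, RegExp]> = [
  // Capitalised words after "my name is"; relies on punctuation or a
  // lower-case word to end the name.
  ["name", /\b[Mm]y name is ([A-Z][A-Za-z'-]*(?: [A-Z][A-Za-z'-]*)*)/],
  ["email", /\bemail(?: is)? ([^\s@]+@[^\s@]+\.[A-Za-z]{2,})/i],
  // Digits with common separators; must start (after an optional +) and end on a digit.
  ["phone", /\bphone(?: number)?(?: is)? (\+?\d[\d\s().-]*\d)/i],
];

function extractFields(transcript: string): FormFields {
  const fields: FormFields = {};
  let rest = transcript;

  for (const [key, pattern] of PATTERNS) {
    const match = pattern.exec(rest);
    if (match) {
      fields[key] = match[1];
      // Remove the matched phrase so it does not end up in the notes.
      rest = rest.replace(match[0], " ");
    }
  }

  // Whatever is left over is treated as freeform notes.
  const notes = rest.replace(/\s+/g, " ").replace(/^[\s.,;]+/, "").trim();
  if (notes.length > 0) {
    fields.notes = notes;
  }
  return fields;
}
```

Fixed patterns like these break down quickly on natural speech (spelled-out addresses, names without capitalisation, fields given in an unexpected form), which is the usual reason to hand this step to a language model instead.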

Requirements

OpenAI API key with Whisper access
Browser supporting MediaRecorder (Chrome 49+, Firefox 25+)
HTTPS origin for getUserMedia
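The browser-side requirements can be checked before recording starts. The sketch below takes the environment as a plain object so it can be exercised outside a browser; the `RecordingEnv` shape and the `missingRequirements` name are assumptions made for illustration. Browsers also treat `localhost` as a secure context, so local development passes the first check without HTTPS.

```typescript
// Sketch only: reports which browser-side requirements are not met.
// The structural type lists just the properties this check reads.
interface RecordingEnv {
  isSecureContext?: boolean;
  MediaRecorder?: unknown;
  navigator?: { mediaDevices?: { getUserMedia?: unknown } };
}

function missingRequirements(env: RecordingEnv): string[] {
  const missing: string[] = [];
  // getUserMedia is only exposed in secure contexts (HTTPS or localhost).
  if (!env.isSecureContext) {
    missing.push("secure (HTTPS) origin");
  }
  if (typeof env.navigator?.mediaDevices?.getUserMedia !== "function") {
    missing.push("getUserMedia");
  }
  // MediaRecorder is a constructor, so typeof reports "function" when present.
  if (typeof env.MediaRecorder !== "function") {
    missing.push("MediaRecorder");
  }
  return missing;
}
```

In a page, passing the global `window` object and showing the returned list to the user gives a clearer failure than letting `getUserMedia` reject later.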

The full implementation, with streaming hooks and the extraction prompt, is available in the recipes repository.