Recipe: YouTube video summary
Extract audio from any YouTube video, transcribe with Whisper, and generate a concise summary — all in one pipeline.
How it works
- yt-dlp downloads the audio track as a WAV file.
- whisper transcribes the audio to text using the base model.
- The transcript is passed to an LLM with a summarization prompt.
- The summary is returned as structured Markdown.
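The last two steps can be sketched as a small helper that builds the chat messages. This is a minimal sketch rather than the recipe's own code: the name build_messages and the SYSTEM_PROMPT wording are assumptions, chosen so the prompt asks for Markdown output explicitly instead of relying on the model's default formatting.

```python
# Hypothetical prompt wording; adjust to the structure you want back.
SYSTEM_PROMPT = (
    "Summarize this transcript concisely. "
    "Return the summary as Markdown with a short heading and bullet points."
)


def build_messages(transcript: str) -> list:
    """Pair the summarization prompt with the transcript text."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": transcript.strip()},
    ]


messages = build_messages("  Hello and welcome to the show.  ")
print(messages[1]["content"])  # Hello and welcome to the show.
```

The returned list has the same shape as the messages argument in the pipeline script, so it could be passed to the chat completion call unchanged.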
Prerequisites
- Python 3.10+
- yt-dlp
- openai-whisper
- OpenAI API key (or local LLM)
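Before running the pipeline, a quick check like the one below can confirm the prerequisites are in place. It is a sketch under a few assumptions: the function name missing_prerequisites is hypothetical, the API key is expected in the OPENAI_API_KEY environment variable, and ffmpeg is checked as well because yt-dlp's audio extraction and Whisper both rely on it even though it is not listed above.

```python
import importlib.util
import os
import shutil
import sys


def missing_prerequisites() -> list:
    """Return a list of prerequisites that appear to be missing."""
    missing = []
    if sys.version_info < (3, 10):
        missing.append("Python 3.10+")
    # Command-line tools are looked up on PATH.
    for tool in ("yt-dlp", "ffmpeg"):
        if shutil.which(tool) is None:
            missing.append(f"{tool} (not found on PATH)")
    # The openai-whisper package installs a module named "whisper".
    if importlib.util.find_spec("whisper") is None:
        missing.append("openai-whisper")
    # Assumption: the hosted OpenAI API is used rather than a local LLM.
    if not os.environ.get("OPENAI_API_KEY"):
        missing.append("OPENAI_API_KEY environment variable")
    return missing


for item in missing_prerequisites():
    print(f"missing: {item}")
```

If the loop prints nothing, every check passed. When using a local LLM instead of the OpenAI API, the last check can be dropped.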
Pipeline script
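#!/usr/bin/env python3
import subprocess

import openai
import whisper

URL = "https://www.youtube.com/watch?v=VIDEO_ID"
AUDIO = "audio.wav"

# Download the audio track and convert it to WAV (requires ffmpeg on PATH).
# The %(ext)s template lets yt-dlp set the extension, so the result is audio.wav.
subprocess.run(["yt-dlp", "-x", "--audio-format", "wav",
                "-o", "audio.%(ext)s", URL], check=True)

# Transcribe the audio with Whisper's base model.
model = whisper.load_model("base")
result = model.transcribe(AUDIO)
transcript = result["text"]

# Summarize the transcript; the client reads the key from OPENAI_API_KEY.
response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "system",
        "content": "Summarize this transcript concisely."
    }, {
        "role": "user",
        "content": transcript
    }]
)
print(response.choices[0].message.content)
Usage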
python summarize.py
Replace VIDEO_ID in the URL constant with the target YouTube video ID before running. The summary prints directly to stdout.
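To avoid editing the script for every video, the target could be taken from the command line instead. The following is a hedged sketch, not part of the recipe: resolve_url and parse_args are hypothetical helpers, and the only assumption about YouTube is the watch URL pattern already used in the script.

```python
import argparse

WATCH_URL = "https://www.youtube.com/watch?v={video_id}"


def resolve_url(target: str) -> str:
    """Accept a full URL or a bare video ID and return a watch URL."""
    if target.startswith(("http://", "https://")):
        return target
    return WATCH_URL.format(video_id=target)


def parse_args(argv=None):
    """Parse the single positional argument (defaults to sys.argv)."""
    parser = argparse.ArgumentParser(description="Summarize a YouTube video.")
    parser.add_argument("target", help="YouTube URL or bare video ID")
    return parser.parse_args(argv)


# Example with an explicit argument list instead of sys.argv.
args = parse_args(["VIDEO_ID"])
print(resolve_url(args.target))  # https://www.youtube.com/watch?v=VIDEO_ID
```

In the pipeline script, URL would then be set from resolve_url(parse_args().target), and the command becomes python summarize.py VIDEO_ID.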
Need more? See the full docs for advanced recipes and API reference.