Recipe: YouTube video summary

Extract the audio from a YouTube video, transcribe it with Whisper, and generate a concise summary with an LLM, all in one pipeline.

How it works

  1. yt-dlp downloads the audio track and converts it to a WAV file.
  2. whisper transcribes the audio to text using the base model.
  3. The transcript is passed to an LLM with a summarization prompt.
  4. The summary is returned as structured Markdown.

Prerequisites

  • Python 3.10+
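  • yt-dlp
  • ffmpeg (used by yt-dlp for audio extraction and by Whisper for decoding)
  • openai-whisper
  • openai Python package, version 1.0 or later
  • OpenAI API key (or local LLM)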
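Before running the pipeline, it can help to confirm that the prerequisites are in place. The following is a minimal sketch, not part of the recipe itself: `missing_prerequisites` is a hypothetical helper name, ffmpeg is checked on the assumption that yt-dlp's audio extraction and Whisper both rely on it, and the `OPENAI_API_KEY` check only applies if you use the OpenAI API rather than a local LLM. It does not check that the Python packages are installed.

```python
import os
import shutil
import sys


def missing_prerequisites(env=None, which=shutil.which, version=None):
    """Return a list of human-readable problems; an empty list means all checks passed.

    Hypothetical helper: env, which, and version are injectable so the
    checks can be exercised without touching the real system.
    """
    env = os.environ if env is None else env
    version = sys.version_info if version is None else version
    problems = []
    # The recipe lists Python 3.10+ as a prerequisite.
    if tuple(version[:2]) < (3, 10):
        problems.append("Python 3.10+ is required")
    # Command-line tools that must be discoverable on PATH.
    for tool in ("yt-dlp", "ffmpeg"):
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    # Only relevant when using the OpenAI API (assumption: no local LLM).
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("missing:", problem)
```

Running the file directly prints one line per missing item and nothing when every check passes.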

Pipeline script

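#!/usr/bin/env python3
"""Download a YouTube video's audio, transcribe it, and summarize the transcript."""
import subprocess

import openai
import whisper

URL = "https://www.youtube.com/watch?v=VIDEO_ID"
AUDIO = "audio.wav"

# Download the audio track and convert it to WAV (yt-dlp uses ffmpeg for this).
# The %(ext)s template lets yt-dlp set the extension, so the final file is audio.wav.
subprocess.run(["yt-dlp", "-x", "--audio-format", "wav",
                "-o", "audio.%(ext)s", URL], check=True)

# Transcribe the audio with Whisper's base model.
model = whisper.load_model("base")
result = model.transcribe(AUDIO)
transcript = result["text"]

# Summarize the transcript; the client reads OPENAI_API_KEY from the environment.
response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "system",
        "content": "Summarize this transcript concisely as structured Markdown."
    }, {
        "role": "user",
        "content": transcript
    }]
)

print(response.choices[0].message.content)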

Usage

python summarize.py

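Save the script as summarize.py and replace VIDEO_ID in the URL constant with the target YouTube video ID. If you use the OpenAI API, set the OPENAI_API_KEY environment variable before running. The summary prints directly to stdout.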
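Editing the script for every video gets tedious, so one option is to take the target from the command line instead. The sketch below is an assumption about how that could look, not part of the recipe: `resolve_url` and `parse_args` are hypothetical helper names, and the script is assumed to accept either a full URL or a bare video ID.

```python
import argparse

# Same watch-URL pattern the recipe's URL constant uses.
WATCH_URL = "https://www.youtube.com/watch?v={video_id}"


def resolve_url(target: str) -> str:
    """Return target unchanged if it is already a URL, else treat it as a video ID."""
    if target.startswith(("http://", "https://")):
        return target
    return WATCH_URL.format(video_id=target)


def parse_args(argv=None):
    """Parse a single positional argument: a YouTube URL or a video ID."""
    parser = argparse.ArgumentParser(description="Summarize a YouTube video.")
    parser.add_argument("target", help="YouTube URL or video ID")
    return parser.parse_args(argv)
```

In the pipeline script, the hard-coded constant could then become `URL = resolve_url(parse_args().target)`, and the script would be invoked as `python summarize.py VIDEO_ID`.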
