Back to Docs
Recipe

Recipe: Bulk image captioner

Pipe a folder of screenshots through a vision model and get back a CSV with filenames and AI-generated captions — no manual labeling.

Ingredients

  • A folder of PNG or JPEG images (screenshots, product photos, diagrams)
  • OpenAI API key with GPT-4V access
  • Python 3.11+ with openai and pillow
  • Meridian CLI installed and authenticated

Steps

  1. Drop images into a single directory. Supported: .png, .jpg, .jpeg.
  2. Set your OpenAI key: export OPENAI_API_KEY=sk-...
  3. Run the captioner: meridian caption --input ./screenshots --output captions.csv
  4. Review the CSV. Columns: filename, caption, confidence.

Flags

FlagDefaultDescription
--modelgpt-4oVision model to use
--max-tokens300Max caption length
--concurrency4Parallel API calls

Pro tip

Pipe the CSV straight into a fine-tuning job: meridian caption ... | meridian finetune --dataset -