Structured extraction recipe

Pull contacts, dates, and prices from unstructured text using json_schema — no prompt engineering required.

Overview

When you have raw text — emails, invoices, chat logs, web scrapes — and you need structured data out of it, Meridian's extraction endpoint with json_schema mode is the fastest path. You define the shape you want, and the model fills it. No parsing, no regex, no brittle prompt templates.

Step 1 — Define your schema

Write a JSON Schema that describes exactly what you want extracted. Every field you declare becomes a slot the model will attempt to populate from the source text.

{
  "type": "object",
  "properties": {
    "contacts": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name":  { "type": "string" },
          "email": { "type": "string" },
          "phone": { "type": ["string", "null"] }
        },
        "required": ["name", "email"]
      }
    },
    "dates": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "label": { "type": "string" },
          "iso_date": { "type": "string" }
        },
        "required": ["label", "iso_date"]
      }
    },
    "prices": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "item":     { "type": "string" },
          "amount":   { "type": "number" },
          "currency": { "type": "string" }
        },
        "required": ["item", "amount", "currency"]
      }
    }
  },
  "required": ["contacts", "dates", "prices"]
}
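If you build the schema in code rather than by hand, the repeated array-of-objects pattern can be factored into a small helper. The following is a minimal Python sketch of a schema with this shape; array_of and EXTRACTION_SCHEMA are illustrative names, not part of Meridian's API, and the nullable phone type is an assumption made so that a null value validates.

```python
import json


def array_of(properties, required):
    """Return a JSON Schema fragment describing an array of objects.

    `properties` maps a field name to its JSON Schema type, which may be
    a single type name or a list of type names.
    """
    return {
        "type": "array",
        "items": {
            "type": "object",
            "properties": {name: {"type": t} for name, t in properties.items()},
            "required": list(required),
        },
    }


EXTRACTION_SCHEMA = {
    "type": "object",
    "properties": {
        "contacts": array_of(
            # Assumption: phone is optional, so it is allowed to be null.
            {"name": "string", "email": "string", "phone": ["string", "null"]},
            required=["name", "email"],
        ),
        "dates": array_of(
            {"label": "string", "iso_date": "string"},
            required=["label", "iso_date"],
        ),
        "prices": array_of(
            {"item": "string", "amount": "number", "currency": "string"},
            required=["item", "amount", "currency"],
        ),
    },
    "required": ["contacts", "dates", "prices"],
}

# Serialize once and reuse the string wherever the schema is sent.
SCHEMA_JSON = json.dumps(EXTRACTION_SCHEMA, indent=2)
```

Keeping the schema as a single Python object means the same definition can be sent with the request and reused to check responses.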

Step 2 — Call the extraction endpoint

POST your source text and schema to /v1/extract. The model reads the text, finds the relevant spans, and returns a JSON object matching your schema exactly.

curl https://api.getnimbus.net/v1/extract \
  -H "Authorization: Bearer $MERIDIAN_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meridian-extract",
    "mode": "json_schema",
    "schema": { ... },
    "input": "Meeting with Alice (alice@corp.com) on 2026-01-15. PO for $4,200 (widgets) due 2026-02-01."
  }'
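The same request can be assembled from Python using only the standard library. This is a sketch, not an official client: build_extract_request is a hypothetical helper, the payload fields mirror the curl example, and the one-line schema passed in the usage lines is only a placeholder for your real schema. Serializing with json.dumps escapes any line breaks in the source text, so the request body stays valid JSON.

```python
import json
import urllib.request

# Endpoint taken from the curl example.
EXTRACT_URL = "https://api.getnimbus.net/v1/extract"


def build_extract_request(text, schema, api_key):
    """Build (but do not send) a POST request for the extraction endpoint."""
    payload = {
        "model": "meridian-extract",
        "mode": "json_schema",
        "schema": schema,
        "input": text,
    }
    return urllib.request.Request(
        EXTRACT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Multi-line source text is fine here: json.dumps escapes the newlines.
SAMPLE_TEXT = (
    "Meeting with Alice (alice@corp.com)\n"
    "on 2026-01-15. PO for $4,200\n"
    "(widgets) due 2026-02-01."
)

# Placeholder schema and key, for illustration only.
request = build_extract_request(SAMPLE_TEXT, {"type": "object"}, "test-key")

# To send it:
# with urllib.request.urlopen(request) as resp:
#     result = json.load(resp)
```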

Step 3 — Read the structured response

The response is a clean JSON object. No markdown wrapping, no code fences — just the data, ready to pipe into your database or workflow.

{
  "contacts": [
    {
      "name": "Alice",
      "email": "alice@corp.com",
      "phone": null
    }
  ],
  "dates": [
    { "label": "meeting", "iso_date": "2026-01-15" },
    { "label": "po_due", "iso_date": "2026-02-01" }
  ],
  "prices": [
    { "item": "widgets", "amount": 4200, "currency": "USD" }
  ]
}
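Before loading a response into a database, it can help to confirm the top-level keys and flatten each array into rows. The sketch below works on the sample response shown above; parse_extraction and to_rows are illustrative names, and the tuple layout is an assumption about what a downstream table might want.

```python
import json
from datetime import date

# Sample body matching the response shown above.
RESPONSE_BODY = """
{
  "contacts": [{"name": "Alice", "email": "alice@corp.com", "phone": null}],
  "dates": [
    {"label": "meeting", "iso_date": "2026-01-15"},
    {"label": "po_due", "iso_date": "2026-02-01"}
  ],
  "prices": [{"item": "widgets", "amount": 4200, "currency": "USD"}]
}
"""


def parse_extraction(body):
    """Decode the response and check that the top-level keys are present."""
    data = json.loads(body)
    missing = [k for k in ("contacts", "dates", "prices") if k not in data]
    if missing:
        raise ValueError(f"response is missing keys: {missing}")
    return data


def to_rows(data):
    """Flatten each array into a list of tuples, one tuple per record."""
    contacts = [(c["name"], c["email"], c.get("phone")) for c in data["contacts"]]
    # date.fromisoformat raises ValueError if a date is not in ISO format.
    dates = [(d["label"], date.fromisoformat(d["iso_date"])) for d in data["dates"]]
    prices = [(p["item"], p["amount"], p["currency"]) for p in data["prices"]]
    return contacts, dates, prices


contacts, dates, prices = to_rows(parse_extraction(RESPONSE_BODY))
```

Parsing iso_date into a date object at this point surfaces malformed dates early, before they reach storage.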

Pro tips

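  • Use "required" to make a field appear in every result. The model returns null when it cannot find a value, so give such fields a nullable type (for example "type": ["string", "null"]) if you validate responses against the schema.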
  • Add "description" fields to your schema properties for ambiguous keys — the model reads them as hints.
  • Batch multiple documents in one request by passing an array to input — you'll get an array of structured objects back.
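The last two tips can be sketched in Python as follows. The helper names are hypothetical, the payload fields follow the curl example, and the sketch assumes the batch response lists results in the same order as the input documents, which the text above does not confirm.

```python
def described(prop_type, description):
    """Return a schema property that carries a "description" hint."""
    return {"type": prop_type, "description": description}


# A date item whose ambiguous keys are explained for the model.
DATE_ITEM_SCHEMA = {
    "type": "object",
    "properties": {
        "label": described("string", "Short name for what the date refers to"),
        "iso_date": described("string", "Calendar date written as YYYY-MM-DD"),
    },
    "required": ["label", "iso_date"],
}


def build_batch_payload(documents, schema):
    """Build a request payload whose input is an array of documents."""
    return {
        "model": "meridian-extract",
        "mode": "json_schema",
        "schema": schema,
        "input": list(documents),
    }


def pair_results(documents, results):
    """Pair each document with its result (assumes matching order)."""
    if len(documents) != len(results):
        raise ValueError("expected one result per input document")
    return list(zip(documents, results))
```

Checking the result count before pairing guards against silently mismatching documents and extractions if a batch comes back incomplete.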