Classification recipe

Get reliable categorical output from an LLM whose API supports structured generation, using response_format=json_schema and temperature=0.

Why this matters

When you need a model to pick exactly one label (spam or not_spam, fraud or legitimate, urgent or routine), raw text generation is brittle. The model might add punctuation, capitalize differently, or wrap the answer in a sentence. Structured output reduces the parsing layer to a single JSON.parse call.

The schema

Define an enum with your allowed labels. With "strict": true, decoding is constrained to the schema, so the label field cannot take a value outside this set. The schema below enforces exactly spam or not_spam.

{
  "name": "classification",
  "strict": true,
  "schema": {
    "type": "object",
    "properties": {
      "label": {
        "type": "string",
        "enum": ["spam", "not_spam"]
      }
    },
    "required": ["label"],
    "additionalProperties": false
  }
}
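If the label set changes often, the schema can be generated rather than hand-written. The following is a minimal sketch; buildClassificationSchema is a hypothetical helper name, not part of any SDK, and it simply reproduces the schema above for an arbitrary list of labels.

```javascript
// Hypothetical helper: builds the json_schema payload for a given label set.
function buildClassificationSchema(labels) {
  if (!Array.isArray(labels) || labels.length === 0) {
    throw new Error("labels must be a non-empty array");
  }
  if (new Set(labels).size !== labels.length) {
    throw new Error("labels must be unique");
  }
  return {
    name: "classification",
    strict: true,
    schema: {
      type: "object",
      properties: {
        // Copy the array so later changes to `labels` do not alter the schema.
        label: { type: "string", enum: [...labels] }
      },
      required: ["label"],
      additionalProperties: false
    }
  };
}

const classificationSchema = buildClassificationSchema(["spam", "not_spam"]);
```

Keeping the labels in one array also gives application code a single source of truth to validate against after parsing.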

The prompt

Keep the system prompt terse. The schema handles output shape; the prompt only needs to define the classification rule.

You are a content classifier.
Analyze the message and classify it as "spam" or "not_spam".

Spam indicators: unsolicited promotion, phishing links,
urgency pressure, suspicious sender patterns.

Message: "CONGRATULATIONS! You've won a FREE iPhone!
Click here to claim: http://bit.ly/3xyz"
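One possible way to turn the prompt above into the two messages the API call expects is to keep the classification rule in the system turn and pass the message to classify as the user turn. This is a sketch under that assumption; buildMessages is a hypothetical helper name.

```javascript
// The fixed classification rule, taken from the prompt above.
const systemPrompt = [
  "You are a content classifier.",
  'Analyze the message and classify it as "spam" or "not_spam".',
  "",
  "Spam indicators: unsolicited promotion, phishing links,",
  "urgency pressure, suspicious sender patterns."
].join("\n");

// Hypothetical helper: pairs the fixed system prompt with one message to classify.
function buildMessages(text) {
  return [
    { role: "system", content: systemPrompt },
    // JSON.stringify quotes and escapes the text, so embedded quotes or
    // newlines cannot break out of the Message field.
    { role: "user", content: `Message: ${JSON.stringify(text)}` }
  ];
}

const messages = buildMessages(
  "CONGRATULATIONS! You've won a FREE iPhone!\nClick here to claim: http://bit.ly/3xyz"
);
```

Separating the two turns keeps the rule identical across calls while only the user content varies.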

The call

Set temperature=0 to minimize sampling variance. The model will then return the most probable label on almost every call, although the API does not guarantee identical outputs across runs.

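import OpenAI from "openai";

// The client reads OPENAI_API_KEY from the environment by default.
const openai = new OpenAI();

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  temperature: 0,
  messages: [
    { role: "system", content: systemPrompt },
    { role: "user", content: userMessage }
  ],
  // classificationSchema is the schema object from "The schema" section.
  response_format: {
    type: "json_schema",
    json_schema: classificationSchema
  }
});

// content is null when the model refuses, so check before parsing.
const message = response.choices[0].message;
if (message.refusal || message.content === null) {
  throw new Error(`No classification returned: ${message.refusal ?? "empty content"}`);
}
const result = JSON.parse(message.content);
console.log(result.label); // "spam"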

Expected output

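When the call completes normally, the response content is valid JSON matching the schema. A single JSON.parse is enough: no trimming, no regex. Refusals and completions cut off by the token limit are the exceptions, so check for them before parsing.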

{ "label": "spam" }
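A defensive reader for this output might look like the sketch below. extractLabel is a hypothetical helper, and fakeResponse is a hand-built stand-in that mimics only the response fields the helper reads (choices, finish_reason, message.content, message.refusal), so the example runs without a network call.

```javascript
// Hypothetical helper, not part of any SDK: pulls a validated label out of
// a chat completion response object.
function extractLabel(response, allowedLabels) {
  const choice = response.choices[0];
  if (choice.message.refusal) {
    throw new Error(`Model refused: ${choice.message.refusal}`);
  }
  if (choice.finish_reason === "length") {
    throw new Error("Completion was truncated; the JSON may be incomplete");
  }
  if (typeof choice.message.content !== "string") {
    throw new Error("Response has no text content");
  }
  const { label } = JSON.parse(choice.message.content);
  // Cheap guard in case the schema and the application's label list drift apart.
  if (!allowedLabels.includes(label)) {
    throw new Error(`Unexpected label: ${label}`);
  }
  return label;
}

// Stand-in for a real API response, shaped like the fields used above.
const fakeResponse = {
  choices: [
    {
      finish_reason: "stop",
      message: { content: '{ "label": "spam" }', refusal: null }
    }
  ]
};

console.log(extractLabel(fakeResponse, ["spam", "not_spam"])); // prints spam
```

The final membership check is redundant when the schema is enforced, but it costs little and catches a mismatch between the deployed schema and the code that consumes it.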

Handling edge cases

Ambiguous content: the model will still pick one label. If you need an “uncertain” fallback, add it to the enum and instruct the model when to use it.
Multi-label classification: use an array property with items.enum instead of a single string enum.
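Confidence scores: add a confidence number field (0–1) to the schema and ask the model to self-assess. With strict schemas on the OpenAI API, every property must be listed in required, so the field cannot be optional. A self-reported score is not a calibrated probability; treat it as a rough signal.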
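The three variants above could be sketched as schema bodies like the following, each meant to sit under the "schema" key of the json_schema payload. The multi-label names (promotion, phishing, urgency) and the isValidConfidence helper are illustrative assumptions. The 0 to 1 range is checked in code here because support for numeric range keywords in strict mode may vary by provider and model.

```javascript
// Shared label set; "uncertain" is the fallback label for ambiguous content.
const labels = ["spam", "not_spam", "uncertain"];

// Single label with an explicit fallback.
const withFallback = {
  type: "object",
  properties: {
    label: { type: "string", enum: labels }
  },
  required: ["label"],
  additionalProperties: false
};

// Multi-label: an array whose items are drawn from an enum.
// These label names are hypothetical examples.
const multiLabel = {
  type: "object",
  properties: {
    labels: {
      type: "array",
      items: { type: "string", enum: ["promotion", "phishing", "urgency"] }
    }
  },
  required: ["labels"],
  additionalProperties: false
};

// Self-assessed confidence, listed in required alongside the label.
const withConfidence = {
  type: "object",
  properties: {
    label: { type: "string", enum: labels },
    confidence: { type: "number" }
  },
  required: ["label", "confidence"],
  additionalProperties: false
};

// Hypothetical helper: range check applied after parsing the response.
function isValidConfidence(value) {
  return typeof value === "number" && value >= 0 && value <= 1;
}
```

For the fallback variant, the system prompt still needs a sentence telling the model when "uncertain" applies; the schema only makes the label available.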

This recipe is part of the Meridian structured generation guide. For batch classification at scale, see the batch processing recipe.