Azure Integration

Safety filters

Meridian routes every generation request through Azure AI Content Safety. Configure severity thresholds to control what your application allows through.

Filter categories

Hate

Speech that attacks or discriminates based on protected characteristics.

Sexual

Sexually explicit content, innuendo, or suggestive material.

Violence

Depictions of physical harm, gore, or weapons-related content.

Self-harm

Content that promotes or depicts self-injury or suicide.

Severity thresholds

Each category accepts a threshold from 0 (most permissive) to 7 (most restrictive). The default is 4, a moderate setting that blocks clear violations while letting borderline content through.

Level	Behavior
0–1	Off — no filtering applied.
2–3	Low — blocks only severe content.
4–5	Medium — balanced filtering (default).
6–7	High — aggressive filtering; may trigger false positives.
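The banding in the table above can be expressed as a small lookup. The following is a minimal sketch, not Meridian's API: the names `band_for`, `resolve_thresholds`, `DEFAULT_THRESHOLD`, and `CATEGORIES` are hypothetical, and the category keys are assumed to match those in the filter response (`hate`, `sexual`, `violence`, `self_harm`).

```python
# Hypothetical helpers for illustration; none of these names come from Meridian.
DEFAULT_THRESHOLD = 4  # documented default
CATEGORIES = ("hate", "sexual", "violence", "self_harm")  # assumed key names


def band_for(level: int) -> str:
    """Map a 0-7 threshold to its band: Off, Low, Medium, or High."""
    if isinstance(level, bool) or not isinstance(level, int) or not 0 <= level <= 7:
        raise ValueError(f"threshold must be an integer from 0 to 7, got {level!r}")
    # Each band covers two adjacent levels: 0-1, 2-3, 4-5, 6-7.
    return ("Off", "Low", "Medium", "High")[level // 2]


def resolve_thresholds(overrides=None) -> dict:
    """Apply the default to every category that has no explicit threshold."""
    overrides = overrides or {}
    unknown = set(overrides) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    resolved = {name: overrides.get(name, DEFAULT_THRESHOLD) for name in CATEGORIES}
    for level in resolved.values():
        band_for(level)  # raises if any level is out of range
    return resolved
```

For example, `resolve_thresholds({"hate": 5})` leaves the other three categories at 4, and `band_for(5)` reports that level 5 still falls in the Medium band.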

Understanding content_filter responses

When a threshold is set high (6–7), Azure may return an HTTP 200 with finish_reason=content_filter instead of a generation. The request itself succeeded; the content filter withheld the output. Check finish_reason rather than the status code to detect this case.

{
  "choices": [{
    "index": 0,
    "finish_reason": "content_filter",
    "content_filter_results": {
      "hate": { "filtered": true, "severity": "high" },
      "sexual": { "filtered": false, "severity": "safe" },
      "violence": { "filtered": false, "severity": "low" },
      "self_harm": { "filtered": false, "severity": "safe" }
    }
  }]
}

Meridian surfaces the filtered categories and severities in your dashboard logs so you can tune thresholds without guesswork.
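A client can detect a filtered response by checking finish_reason and then reading content_filter_results. The sketch below assumes the response has exactly the shape shown in the example above; the function name `filtered_categories` is hypothetical and is not part of any SDK.

```python
import json


def filtered_categories(response: dict) -> dict:
    """Return {category: severity} for every category that blocked a choice."""
    blocked = {}
    for choice in response.get("choices", []):
        # An HTTP 200 can still carry a filtered choice, so inspect finish_reason.
        if choice.get("finish_reason") != "content_filter":
            continue
        for category, result in choice.get("content_filter_results", {}).items():
            if result.get("filtered"):
                blocked[category] = result.get("severity")
    return blocked


# The example payload from this page.
SAMPLE = json.loads("""
{
  "choices": [{
    "index": 0,
    "finish_reason": "content_filter",
    "content_filter_results": {
      "hate": { "filtered": true, "severity": "high" },
      "sexual": { "filtered": false, "severity": "safe" },
      "violence": { "filtered": false, "severity": "low" },
      "self_harm": { "filtered": false, "severity": "safe" }
    }
  }]
}
""")

print(filtered_categories(SAMPLE))  # {'hate': 'high'}
```

An empty result means no choice was blocked, so the caller can treat the response as a normal generation.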

Recommendations

▸Start at severity 4 and monitor logs for one week before adjusting.
▸If your application handles user-generated prompts, keep hate and self-harm at 5 or higher.
▸Avoid severity 7 in production: it frequently blocks benign requests that contain flagged keywords out of context.
▸Use the dashboard's per-category breakdown to identify which filter triggers most often, then tune that single category rather than raising all thresholds globally.
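The per-category tuning suggested in the last recommendation amounts to counting which filter fires most often. The sketch below assumes you have exported log entries that each carry a content_filter_results object in the same shape as the response example; that export format, and the names `trigger_counts` and `noisiest_category`, are assumptions for illustration rather than a documented Meridian feature.

```python
from collections import Counter


def trigger_counts(log_entries) -> Counter:
    """Count how many entries each category filtered."""
    counts = Counter()
    for entry in log_entries:
        for category, result in entry.get("content_filter_results", {}).items():
            if result.get("filtered"):
                counts[category] += 1
    return counts


def noisiest_category(log_entries):
    """Return the category that filtered the most entries, or None if none did."""
    counts = trigger_counts(log_entries)
    return counts.most_common(1)[0][0] if counts else None


# Hypothetical exported entries, trimmed to the fields this sketch reads.
entries = [
    {"content_filter_results": {"hate": {"filtered": True}, "violence": {"filtered": False}}},
    {"content_filter_results": {"violence": {"filtered": True}}},
    {"content_filter_results": {"violence": {"filtered": True}}},
]
print(noisiest_category(entries))  # violence
```

Once one category stands out, adjust only that category's threshold and leave the others unchanged.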
