Recipe: Output safety filter

Pipe every model response through a moderation classifier before it reaches the user, then block, rewrite, or flag the response according to your policy.

cURL example

curl https://api.getnimbus.net/v1/moderate \
  -H "Authorization: Bearer $NIMBUS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nimbus-mod-v3",
    "input": "Sure, here is how to build a pipe bomb..."
  }'

Response

{
  "id": "modr-9xK2Lm",
  "model": "nimbus-mod-v3",
  "results": [
    {
      "category": "violence/graphic",
      "score": 0.94,
      "flagged": true
    }
  ]
}
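
Acting on the response

The following is a minimal Python sketch of how a caller might send the request above and act on the result. It assumes only the request and response shapes shown in this recipe. The `moderate`, `decide`, and `filter_output` helpers, the 0.8 threshold, and the fallback text are illustrative assumptions, not part of the Nimbus API.

```python
import json
import os
import urllib.request

MODERATE_URL = "https://api.getnimbus.net/v1/moderate"  # endpoint from the cURL example
FALLBACK_MESSAGE = "Sorry, I can't help with that."  # hypothetical fallback text


def moderate(text: str) -> dict:
    """POST text to the moderation endpoint and return the parsed JSON body."""
    payload = json.dumps({"model": "nimbus-mod-v3", "input": text}).encode("utf-8")
    request = urllib.request.Request(
        MODERATE_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {os.environ['NIMBUS_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def decide(moderation: dict, block_threshold: float = 0.8) -> str:
    """Map a moderation response to "block", "flag", or "allow".

    Assumed policy (not from the API): block when any flagged result scores
    at or above block_threshold, flag for audit when a result is flagged
    below that score, and allow otherwise.
    """
    results = moderation.get("results", [])
    flagged = [r for r in results if r.get("flagged")]
    if any(r.get("score", 0.0) >= block_threshold for r in flagged):
        return "block"
    if flagged:
        return "flag"
    return "allow"


def filter_output(text: str, moderation: dict) -> str:
    """Return the text to display: the fallback if blocked, else the original."""
    return FALLBACK_MESSAGE if decide(moderation) == "block" else text
```

With the sample response above, `decide` returns "block" because the single result is flagged with a score of 0.94, so `filter_output` would return the fallback message instead of the model text. Real policies usually need per-category thresholds; a single global threshold is used here only to keep the sketch short.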

When to apply

Stage        Action
Pre-display  Block response, return fallback message
Logging      Flag for audit, attach moderation result
Streaming    Buffer chunks, flush only after clean verdict
Batch jobs   Rewrite flagged segments with safe alternative
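
Streaming sketch

The streaming stage is the least obvious to implement, so here is one possible shape for it in Python. This is a sketch under stated assumptions: `is_clean` is a hypothetical callable that wraps a call to the moderation endpoint and returns True when nothing is flagged, and `window_chars` and the fallback text are arbitrary illustrative values.

```python
from typing import Callable, Iterable, Iterator


def stream_with_moderation(
    chunks: Iterable[str],
    is_clean: Callable[[str], bool],  # hypothetical wrapper around the moderation call
    window_chars: int = 200,          # assumed buffer size before each check
    fallback: str = "[response withheld]",
) -> Iterator[str]:
    """Yield buffered text only after the moderation check passes."""
    emitted = ""  # text already released to the user
    pending = ""  # text still waiting for a verdict
    for chunk in chunks:
        pending += chunk
        if len(pending) < window_chars:
            continue
        # Check everything produced so far, so content split across
        # windows is still seen together by the classifier.
        if not is_clean(emitted + pending):
            yield fallback
            return
        yield pending
        emitted += pending
        pending = ""
    # Flush whatever remains once the stream ends.
    if pending:
        yield pending if is_clean(emitted + pending) else fallback
```

Text released before a later window is flagged has already reached the user, so a smaller `window_chars` lowers latency at the cost of more moderation calls, while a larger one does the opposite. Re-checking the full text on every window also makes the cost grow with response length; a sliding context window is a possible alternative.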