Recipes

Recipe: PII redactor v2

Multi-pass redaction pipeline with a verifier stage that catches leaks before they leave your infrastructure.

Overview

v2 adds a second pass that re-scans the output of the primary redactor using a different detection strategy. If the verifier finds residual PII, the chunk is either re-redacted or flagged for manual review. This catches regex blind spots, hyphenated names, and locale-specific formats that single-pass engines miss.

Pipeline

StageEngineAction on leak
Pass 1Regex + PresidioReplace with entity tag
Pass 2Custom NER modelRe-redact or quarantine
VerifierHeuristic scorerFlag if confidence < 0.98

Entity coverage

  • EMAILj***@d*****.com
  • PHONE+1-XXX-XXX-1234
  • SSN***-**-6789
  • CC****-****-****-4242
  • IBANDE** **** **** **** **00
  • PERSON[REDACTED]

Verifier scoring

After both passes complete, the verifier computes a per-chunk confidence score by measuring residual entropy, known-format pattern matches, and NER model disagreement. Chunks below the 0.98 threshold are routed to a manual review queue. The verifier never mutates data — it only scores and flags.

Next stepDeploy recipe or read the API reference.