Recipe: Incident severity definitions
A repeatable framework for classifying every alert so your team triages consistently.
SEV-0 · Critical
- Customer-facing outage affecting all users
- Data loss or corruption confirmed in production
- Security breach with active exploitation
- Response: page on-call immediately, war room within 5 min
SEV-1 · High
- Core feature broken for a large subset of users
- Payment or licensing flow degraded
- Response: on-call acknowledges within 15 min, resolves within 4 h
SEV-2 · Medium
- Non-critical feature partially impaired
- Performance regression that drops a metric below its SLA threshold
- Response: triage during business hours, fix within 24 h
SEV-3 · Low
- Cosmetic bug with no functional impact on users
- Internal tooling annoyance
- Response: backlog, addressed in next sprint
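The response targets above can be kept as data so tooling and people read the same numbers. The following is a minimal sketch, not a definitive implementation: the names `Severity`, `ResponsePolicy`, `POLICIES`, and `fix_deadline` are hypothetical, and mapping "war room within 5 min" and "acknowledges within 15 min" onto a single `first_response_within` field is an assumption. Targets the recipe does not state numerically are left as `None`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import IntEnum
from typing import Optional


class Severity(IntEnum):
    """Lower number means more severe, matching SEV-0 through SEV-3."""
    SEV0 = 0
    SEV1 = 1
    SEV2 = 2
    SEV3 = 3


@dataclass(frozen=True)
class ResponsePolicy:
    # Hypothetical field names; None means the recipe gives no numeric target.
    first_response_within: Optional[timedelta]
    fix_within: Optional[timedelta]
    summary: str


POLICIES = {
    Severity.SEV0: ResponsePolicy(
        timedelta(minutes=5), None,
        "page on-call immediately, war room within 5 min"),
    Severity.SEV1: ResponsePolicy(
        timedelta(minutes=15), timedelta(hours=4),
        "on-call acknowledges within 15 min, resolves within 4 h"),
    Severity.SEV2: ResponsePolicy(
        None, timedelta(hours=24),
        "triage during business hours, fix within 24 h"),
    Severity.SEV3: ResponsePolicy(
        None, None,
        "backlog, addressed in next sprint"),
}


def fix_deadline(severity: Severity, opened_at: datetime) -> Optional[datetime]:
    """Return the latest fix time, or None when no numeric target is defined."""
    window = POLICIES[severity].fix_within
    return None if window is None else opened_at + window


opened = datetime(2024, 1, 1, 12, 0)
print(fix_deadline(Severity.SEV1, opened))  # 2024-01-01 16:00:00
```

Keeping the original wording in `summary` lets a dashboard or chat bot show the human-readable response next to the computed deadline.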
Tip: Pair this recipe with an on-call rotation runbook. Every alert must carry a severity label before it reaches a human.
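One way to enforce the label rule is a small gate in the alert pipeline that rejects anything unlabeled before it is routed to a person. This is a sketch under assumptions: the alert is modeled as a plain dict with a hypothetical `severity` key, and `require_severity` is an illustrative helper, not part of any particular alerting tool.

```python
import re

# Accept only the four labels defined in this recipe: SEV-0 through SEV-3.
SEVERITY_LABEL = re.compile(r"SEV-[0-3]")


def require_severity(alert: dict) -> str:
    """Return the alert's severity label, or raise if it is missing or invalid."""
    label = alert.get("severity")  # hypothetical key name
    if not isinstance(label, str) or not SEVERITY_LABEL.fullmatch(label):
        raise ValueError(f"alert has no valid severity label: {label!r}")
    return label
```

Calling `require_severity({"severity": "SEV-1"})` returns `"SEV-1"`, while an alert with a missing or unknown label raises `ValueError`, so it can be sent back to the alert's owner instead of the on-call engineer.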