Recipe
5-whys facilitation guide
A structured method to trace any problem back to its root cause through five layers of inquiry.
When to use it
- A production incident just closed and you need a blameless postmortem.
- A recurring bug keeps slipping through code review.
- Team velocity dropped and the surface reason doesn't explain it.
- Customer churn spiked in one cohort — you want the systemic driver.
The core rule
Ask “why” five times. Each answer becomes the premise for the next question. Stop when you reach a process gap or a missing safeguard — never stop at human error.
Facilitation script
- Frame the problem. Write a one-sentence statement everyone agrees on. Example: “Checkout latency exceeded 4s for EU users on Tuesday.”
- Why #1. Ask “Why did checkout latency exceed 4s?” and record the answer verbatim.
- Why #2–#4. Chain each answer into the next why. Resist the urge to jump to a solution. Stay in inquiry.
- Why #5. The fifth answer should surface a process gap. If it still points to a person, ask one more why.
- Capture the countermeasure. Write a single action item that prevents the root cause from recurring. Assign an owner and a deadline.
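The script above can be captured as a small record so the chain and its countermeasure are written down in one place. This is a minimal sketch, not part of the playbook: the names FiveWhys, Countermeasure, next_question, record, and is_complete are hypothetical, and judging whether an answer still points to a person is left to the facilitator.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Countermeasure:
    """A single action item with an owner and a deadline (hypothetical shape)."""
    action: str
    owner: str
    deadline: date


@dataclass
class FiveWhys:
    """Holds the agreed problem statement and the chain of answers."""
    problem: str
    answers: List[str] = field(default_factory=list)
    countermeasure: Optional[Countermeasure] = None

    def next_question(self) -> str:
        # The latest answer (or the problem statement, before any answer
        # is recorded) becomes the premise for the next why.
        premise = self.answers[-1] if self.answers else self.problem
        return f"Why did this happen: {premise}"

    def record(self, answer: str) -> None:
        # Store the answer as given, only trimming surrounding whitespace.
        self.answers.append(answer.strip())

    def is_complete(self) -> bool:
        # Assumed completion rule: at least five answers and a written
        # countermeasure. It does not check for a process gap.
        return len(self.answers) >= 5 and self.countermeasure is not None
```

A facilitator could start with FiveWhys("Checkout latency exceeded 4s for EU users on Tuesday."), read next_question() aloud, call record() with each answer, and treat the session as unfinished until is_complete() returns True.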
Real example
Problem: DB connection pool exhausted in prod.
- 1. Why? Connection count hit the 200 limit.
- 2. Why? A background job opened 180 connections in a tight loop.
- 3. Why? The job didn't reuse the pool; it created raw connections.
- 4. Why? The migration to pg-pool was marked complete, but the old code path remained.
- 5. Why? No lint rule or CI check enforced the deprecation.
Countermeasure: Add a CI lint rule blocking raw pg.Client imports.
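One way such a check could look is a small script that CI runs over the source tree. This is a rough sketch under stated assumptions: the regex patterns, the .js/.ts file extensions, and the function names find_violations and main are illustrative, not the team's actual rule. It scans line by line, so an import split across several lines would be missed.

```python
import re
from pathlib import Path
from typing import List

# Assumed patterns for raw pg.Client usage in JS/TS sources.
RAW_CLIENT_PATTERNS = [
    # import { Client } from 'pg'
    re.compile(r"""import\s*\{[^}]*\bClient\b[^}]*\}\s*from\s*['"]pg['"]"""),
    # const { Client } = require('pg')
    re.compile(r"""\{[^}]*\bClient\b[^}]*\}\s*=\s*require\(\s*['"]pg['"]\s*\)"""),
    # new pg.Client(...)
    re.compile(r"""\bnew\s+pg\.Client\b"""),
]


def find_violations(source: str) -> List[int]:
    """Return 1-based line numbers that match any raw-client pattern."""
    return [
        lineno
        for lineno, line in enumerate(source.splitlines(), start=1)
        if any(p.search(line) for p in RAW_CLIENT_PATTERNS)
    ]


def main(root: str) -> int:
    """Scan .js/.ts files under root; return 1 if any violation is found."""
    failed = False
    for path in Path(root).rglob("*"):
        if path.suffix not in {".js", ".ts"} or "node_modules" in path.parts:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno in find_violations(text):
            print(f"{path}:{lineno}: raw pg.Client is deprecated; use the shared pool")
            failed = True
    return 1 if failed else 0
```

A CI step could call main(".") and fail the build on a non-zero return value. In a project that already uses ESLint, the built-in no-restricted-imports rule may be a more natural home for the same restriction.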
Anti-patterns
- Stopping at “the engineer forgot.” That is never the root cause.
- Asking a sixth why after the fifth has already surfaced a process gap. You’ll drift into philosophy.
- Running the exercise solo on a complex incident. Bring at least two perspectives.
- Skipping the written countermeasure. Insight without action is entertainment.
Part of the Meridian engineering playbook. Pair with the incident review template for a complete postmortem workflow.