Chain-of-thought prompting

Chain-of-thought (CoT) is the single highest-leverage prompting technique for non-reasoning models. It instructs the model to break complex tasks into intermediate reasoning steps before producing a final answer.

The magic phrase

Appending Think step by step. to a prompt consistently improves accuracy on arithmetic, logic, and multi-hop reasoning benchmarks by 10–40% across GPT-3.5, Claude 2, and Gemini 1.0.

When to use it

▸Multi-step math, symbolic reasoning, or constraint satisfaction
▸Code generation requiring algorithmic decomposition
▸Planning tasks with sequential dependencies
▸Any prompt where the model tends to jump to conclusions

Reasoning models

Models with native reasoning — gpt-5*, o4-mini, o1, o3 — perform chain-of-thought internally. They do not need explicit CoT instructions. Adding them can actually degrade performance by interfering with the model's optimized internal reasoning budget. For these models, provide the problem clearly and let the architecture handle decomposition.

Example

A bakery sells cookies in boxes of 6 and boxes of 10.
A customer wants exactly 98 cookies.
How many of each box should they buy?
Think step by step.

Without CoT, non-reasoning models often guess. With it, they enumerate multiples, check divisibility, and arrive at the correct solution systematically.

Variants

Zero-shot CoT

Just add "Think step by step" — no examples needed. Works surprisingly well across model families.

Few-shot CoT

Provide 2–3 worked examples showing intermediate reasoning. Higher accuracy ceiling but more prompt tokens.

Self-consistency

Run CoT multiple times with temperature > 0, then majority-vote the answers. Best for high-stakes accuracy.

Tree-of-thought

Branch and evaluate multiple reasoning paths. Powerful but token-intensive; use for hard planning problems.

Pro tip: Combine CoT with structured output formats. Ask the model to wrap its reasoning in <thinking> tags and its final answer in <answer> tags for easy post-processing and parsing.