Chain-of-Thought
Prompting technique that improves LLM reasoning by asking the model to decompose complex problems into explicit intermediate steps before reaching a conclusion.
What it is
Chain-of-Thought (CoT) is a prompt engineering technique that substantially improves LLM reasoning by asking the model to "think step by step." Instead of jumping directly to the answer, the model generates intermediate reasoning steps that guide it toward more accurate conclusions.
Introduced by Wei et al. in 2022, CoT demonstrated that large models can solve math, logic, and common-sense problems on which they had previously failed consistently.
Why it works
LLMs predict tokens sequentially. When they generate intermediate steps:
- Each step provides context for the next
- Errors become visible and correctable
- The model "works through" the problem instead of guessing at the answer
- Latent reasoning capabilities in the model are activated
Variants
Zero-shot CoT
Simply adding "Let's think step by step" to the prompt:
How many apples are left if I have 15 and give away 7?
Let's think step by step.
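A minimal sketch of how such a prompt might be assembled in application code; the function name and trigger phrase here are just an illustration, and in practice the resulting string would be sent to whatever LLM client the application uses:

```python
def zero_shot_cot(question: str) -> str:
    """Append the zero-shot CoT trigger phrase to a question."""
    return f"{question}\nLet's think step by step."

prompt = zero_shot_cot("How many apples are left if I have 15 and give away 7?")
```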
Few-shot CoT
Providing examples with explicit reasoning:
Example: If I have 10 oranges and eat 3, there are 10-3=7 oranges left.
Question: If I have 15 apples and give away 7, how many are left?
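The same idea, sketched as a small prompt builder: worked examples with explicit reasoning go first, then the new question. The function and its formatting are illustrative assumptions, not a fixed API:

```python
def few_shot_cot(examples: list[str], question: str) -> str:
    """Build a few-shot CoT prompt: worked examples first, then the question."""
    parts = [f"Example: {ex}" for ex in examples]
    parts.append(f"Question: {question}")
    return "\n".join(parts)

prompt = few_shot_cot(
    ["If I have 10 oranges and eat 3, there are 10-3=7 oranges left."],
    "If I have 15 apples and give away 7, how many are left?",
)
```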
Self-Consistency
Generate multiple reasoning chains and choose the most frequent answer. Improves accuracy at the cost of more tokens.
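The voting step can be sketched in a few lines. Here a deterministic stub stands in for repeated LLM sampling at temperature > 0; in a real system `sample_chain` would call the model, generate a full reasoning chain, and extract the final answer:

```python
from collections import Counter

def self_consistency(sample_chain, question: str, n: int = 5) -> str:
    """Sample n reasoning chains and return the majority-vote answer."""
    answers = [sample_chain(question) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

# Stub: four chains conclude "8", one wrongly concludes "7".
canned = iter(["8", "7", "8", "8", "8"])
answer = self_consistency(lambda q: next(canned), "15 - 7 = ?", n=5)
# The single wrong chain is outvoted, so answer is "8".
```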
Tree of Thoughts
Explore multiple reasoning branches in parallel, evaluating and pruning less promising paths.
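A toy beam-search sketch of the idea. The `expand` and `score` callables are hypothetical stand-ins: in a real Tree-of-Thoughts system, both proposing continuations and rating them would be LLM calls, not the arithmetic heuristics used here:

```python
def tree_of_thoughts(root, expand, score, beam_width: int = 2, depth: int = 3):
    """Explore reasoning branches, keeping only the beam_width
    highest-scoring partial thoughts at each depth (pruning the rest)."""
    frontier = [root]
    for _ in range(depth):
        candidates = [nxt for state in frontier for nxt in expand(state)]
        if not candidates:
            break
        frontier = sorted(candidates, key=score, reverse=True)[:beam_width]
    return max(frontier, key=score)

# Toy problem: build a sequence of three digits from {1, 2, 3} summing to 7.
best = tree_of_thoughts(
    root=[],
    expand=lambda s: [s + [d] for d in (1, 2, 3)],
    score=lambda s: -abs(7 - sum(s)),  # closer to 7 is better
    beam_width=2,
    depth=3,
)
```

The trade-off versus plain CoT is the same as self-consistency's, amplified: many more model calls in exchange for the ability to back out of dead-end branches.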
Applications
- Mathematics: arithmetic, algebra, word problems
- Logic: puzzles, deductions, argument analysis
- Code: debugging, algorithm design
- Planning: decomposing complex tasks into executable steps
- Agents: AI agents use CoT internally to decide actions
Limitations
- Cost: more tokens = more latency and cost
- Not infallible: reasoning can be plausible but incorrect
- Small models: CoT's benefits emerge mainly at scale — in the original study, gains appeared only in models of roughly 100B parameters or more, and smaller models often produce fluent but incorrect chains
Why it matters
Chain-of-thought is the prompting technique that most consistently improves LLM performance on reasoning tasks. Understanding when and how to apply it — and its limitations — is a fundamental skill for any engineer building applications with language models.
References
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models — Wei et al., 2022.
- Self-Consistency Improves Chain of Thought Reasoning in Language Models — Wang et al., 2022.
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models — Yao et al., 2023.