
Hallucination Mitigation

Techniques for reducing the false but plausible information LLMs generate, from RAG to factual verification and prompt design.

evergreen · #hallucination #factuality #grounding #rag #verification #llm

What it is

Hallucinations are LLM responses that sound correct but contain fabricated information. The model generates plausible text based on statistical patterns, not verified facts. Mitigating hallucinations is critical for applications where accuracy matters.

Types of hallucinations

| Type | Example | Detection | Mitigation |
| --- | --- | --- | --- |
| Factual | Incorrect data presented as facts | Verification against sources | RAG with citations |
| Fabrication | Inventing URLs, papers, citations | Validate that sources exist | Instruct "I don't know" + verification |
| Inconsistency | Contradicting itself within the same response | Compare assertions | Chain-of-thought |
| Extrapolation | Generalizing from limited examples | Evaluate model confidence | Limit prompt scope |
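
For the fabrication row in particular, a cheap first check is whether cited URLs actually resolve. A minimal sketch; the function name and the use of the requests library are illustrative choices, not part of any tool named in this note:

import re
import requests

URL_PATTERN = re.compile(r"https?://\S+")

def find_dead_urls(text: str, timeout: float = 5.0) -> list[str]:
    """Return URLs cited in the text that do not resolve (candidate fabrications)."""
    dead = []
    for raw_url in URL_PATTERN.findall(text):
        url = raw_url.rstrip(".,;:)]'\"")  # strip trailing punctuation from prose
        try:
            resp = requests.head(url, timeout=timeout, allow_redirects=True)
            if resp.status_code >= 400:
                dead.append(url)
        except requests.RequestException:
            dead.append(url)  # timeouts and connection errors count as unverifiable
    return dead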

Mitigation strategies

Grounding with RAG

RAG anchors responses in real documents. The key is instructing the model to cite specific sources and limit itself to the provided context:

GROUNDED_PROMPT = """Answer ONLY with information from the provided documents.
For each factual claim, include the reference in brackets: [Doc N].
If the documents don't contain the information, respond: "I don't have enough information."
 
Documents:
{context}
 
Question: {question}
"""

This pattern reduces fabrication but doesn't eliminate it — the model can misinterpret context or combine fragments incorrectly.
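
A minimal sketch of wiring the template into a call, reusing the GROUNDED_PROMPT defined above and assuming the OpenAI Python client with retrieved chunks already available as strings; the helper name is made up for illustration:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def grounded_answer(question: str, chunks: list[str]) -> str:
    # Number the chunks so the model can cite them as [Doc N]
    context = "\n\n".join(f"[Doc {i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    prompt = GROUNDED_PROMPT.format(context=context, question=question)
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # low temperature discourages creative deviation from the context
    )
    return response.choices[0].message.content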

Chain-of-Verification (CoVe)

A technique from Meta (Dhuliawala et al., 2023) in which the model verifies its own response in four steps:

  1. Draft: the model generates an initial response
  2. Planning: generates verification questions about its own claims
  3. Independent verification: answers each question separately (without seeing the draft, to avoid bias)
  4. Final response: generates a corrected response based on verifications
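The function below sketches this flow with the OpenAI Python client; the model choice and the prompt wording are illustrative, not taken verbatim from the paper.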
def chain_of_verification(client, question: str) -> str:
    # 1. Initial draft
    draft = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": question}],
    ).choices[0].message.content
 
    # 2. Generate verification questions
    verification_qs = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": (
            f"Draft: {draft}\n\n"
            "List the factual claims and generate a verification question for each."
        )}],
    ).choices[0].message.content
 
    # 3. Verify each question independently
    verifications = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": (
            f"Answer each question independently:\n{verification_qs}"
        )}],
    ).choices[0].message.content
 
    # 4. Corrected final response
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": (
            f"Original question: {question}\n"
            f"Draft: {draft}\n"
            f"Verifications: {verifications}\n\n"
            "Generate a final response correcting any detected errors."
        )}],
    ).choices[0].message.content

Prompt design

  • Instruct the model to say "I don't know" when it lacks information
  • Ask it to cite sources for factual claims
  • Use chain-of-thought to make reasoning explicit
  • Clearly separate facts from opinions (a combined prompt sketch follows this list)
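
These guidelines can be folded into a single system prompt; the wording below is illustrative, not a canonical template:

FACTUAL_SYSTEM_PROMPT = """You are a careful assistant.
- If you are not sure of a fact, say "I don't know" instead of guessing.
- Cite a source for every factual claim; if you have none, say so explicitly.
- Reason step by step before giving the final answer.
- Label opinions as opinions, clearly separated from facts."""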

Confidence calibration

  • Generate multiple responses (N=5) and compare consistency; if responses diverge, confidence is low (see the sketch after this list)
  • Detect linguistic patterns associated with hallucinations: excessive hedging, overly specific details without a source
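
A minimal self-consistency sketch along these lines, reusing the OpenAI client from the earlier examples; the exact-match agreement heuristic is deliberately crude:

from collections import Counter

def self_consistency(client, question: str, n: int = 5, temperature: float = 0.8):
    """Sample n answers and return the most common one with an agreement score."""
    answers = []
    for _ in range(n):
        answer = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": question}],
            temperature=temperature,  # sampling diversity is what exposes disagreement
        ).choices[0].message.content.strip()
        answers.append(answer)

    # Crude agreement score: how often the most frequent answer appears.
    # Real pipelines compare answers semantically (embeddings or an LLM judge).
    most_common, count = Counter(answers).most_common(1)[0]
    return most_common, count / n

If the score is low, the response can be flagged for review or the model can be asked to abstain.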

Evaluation metrics

| Metric | What it measures | Tool |
| --- | --- | --- |
| Faithfulness | Is the response faithful to the provided context? | RAGAS, DeepEval |
| FActScore | Atomic-level factual precision (claim by claim) | FActScore |
| Attribution | Are citations real and relevant? | Manual verification + LLM-as-judge |
| Self-consistency | Do multiple generations agree? | Sampling + comparison |
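
As a lightweight complement to frameworks like RAGAS or DeepEval, faithfulness can be approximated with an LLM-as-judge check. A sketch under the same client assumption as above; the prompt and the 0-10 scale are assumptions, not the RAGAS implementation:

def judge_faithfulness(client, context: str, answer: str) -> float:
    """Ask a judge model how well the answer is supported by the context (0.0-1.0)."""
    verdict = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": (
            f"Context:\n{context}\n\n"
            f"Answer:\n{answer}\n\n"
            "Rate from 0 to 10 how well every claim in the answer is supported "
            "by the context. Reply with only the number."
        )}],
        temperature=0,
    ).choices[0].message.content.strip()
    try:
        return float(verdict) / 10.0
    except ValueError:
        return 0.0  # an unparseable verdict is treated as unsupported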

Limitations

No perfect solution exists. Even with RAG, the model can:

  • Misinterpret retrieved context
  • Combine information in incorrect ways
  • Invent details that "complete" the information

Mitigation reduces the frequency of hallucinations; it does not eliminate the problem. In critical applications (medical, legal, financial), human verification remains necessary.

Why it matters

Hallucinations are the most visible risk of AI systems in production. A model that generates false information with confidence can cause real harm — from citing nonexistent case law to fabricating medical data. Mitigation techniques — RAG, grounding, CoVe, verification — are engineering requirements, not optional improvements. The goal is not to eliminate hallucinations (impossible with current LLM architecture) but to reduce their frequency and detect them before they reach the user.

References

  • Chain-of-Verification Reduces Hallucination in Large Language Models — Dhuliawala et al. (Meta), 2023. Four-step self-verification method.
  • FActScore: Fine-grained Atomic Evaluation of Factual Precision — Min et al., 2023. Atomic-level factual precision evaluation.
  • A Survey on Hallucination in LLMs — Huang et al., 2023. Comprehensive survey on hallucinations in LLMs.
  • Survey of Hallucination in Natural Language Generation — Ji et al., 2023. Foundational survey on hallucinations in natural language generation.
  • RAGAS: Automated Evaluation of Retrieval Augmented Generation — Es et al., 2023. Evaluation framework for RAG systems including faithfulness.
  • LaMDA: Towards Safe, Grounded, and High-Quality Dialog Models for Everything — Google Research Blog, 2022. Google's approach to safe and grounded dialog models.
  • Reduce hallucinations — Anthropic, 2024. Practical guide with prompting techniques to reduce hallucinations in production.

Related content

  • Large Language Models

    Massive neural networks based on the Transformer architecture, trained on enormous text corpora to understand and generate natural language with emergent capabilities like reasoning, translation, and code generation.

  • Retrieval-Augmented Generation

    Architectural pattern that combines information retrieval from external sources with LLM text generation, reducing hallucinations and keeping knowledge current without retraining the model.

  • AI Safety

    Field dedicated to ensuring artificial intelligence systems behave safely, aligned with human values, and predictably, minimizing risks of harm.

  • Chain-of-Thought

    Prompting technique that improves LLM reasoning by asking them to decompose complex problems into explicit intermediate steps before reaching a conclusion.

  • Content Agent QA Review: PR #187

    Findings from the manual review of PR #187.
