
Hallucination Mitigation

Techniques for reducing the false but plausible information LLMs generate, from RAG to factual verification and prompt design.

evergreen · #hallucination #factuality #grounding #rag #verification #llm

What it is

Hallucinations are LLM responses that sound correct but contain fabricated information. The model generates plausible text based on statistical patterns, not verified facts. Mitigating hallucinations is critical for applications where accuracy matters.

Types of hallucinations

| Type | Example | Detection | Mitigation |
| --- | --- | --- | --- |
| Factual | Incorrect data presented as facts | Verification against sources | RAG with citations |
| Fabrication | Inventing URLs, papers, citations | Validate that sources exist | Instruct "I don't know" + verification |
| Inconsistency | Contradicting itself within the same response | Compare assertions | Chain-of-thought |
| Extrapolation | Generalizing from limited examples | Evaluate model confidence | Limit prompt scope |
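
For the fabrication row in particular, a cheap first check is whether cited URLs actually resolve. A minimal sketch; the function name and the use of the requests library are illustrative choices, not part of any tool named in this note:

import re
import requests

URL_PATTERN = re.compile(r"https?://\S+")

def find_dead_urls(text: str, timeout: float = 5.0) -> list[str]:
    """Return URLs cited in the text that do not resolve (candidate fabrications)."""
    dead = []
    for raw_url in URL_PATTERN.findall(text):
        url = raw_url.rstrip(".,;:)]'\"")  # strip trailing punctuation from prose
        try:
            resp = requests.head(url, timeout=timeout, allow_redirects=True)
            if resp.status_code >= 400:
                dead.append(url)
        except requests.RequestException:
            dead.append(url)  # timeouts and connection errors count as unverifiable
    return dead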

Mitigation strategies

Grounding with RAG

RAG anchors responses in real documents. The key is instructing the model to cite specific sources and limit itself to the provided context:

GROUNDED_PROMPT = """Answer ONLY with information from the provided documents.
For each factual claim, include the reference in brackets: [Doc N].
If the documents don't contain the information, respond: "I don't have enough information."
 
Documents:
{context}
 
Question: {question}
"""

This pattern reduces fabrication but doesn't eliminate it — the model can misinterpret context or combine fragments incorrectly.
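
A minimal sketch of wiring the template into a call, reusing the GROUNDED_PROMPT defined above and assuming the OpenAI Python client with retrieved chunks already available as strings; the helper name is made up for illustration:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def grounded_answer(question: str, chunks: list[str]) -> str:
    # Number the chunks so the model can cite them as [Doc N]
    context = "\n\n".join(f"[Doc {i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    prompt = GROUNDED_PROMPT.format(context=context, question=question)
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # low temperature discourages creative deviation from the context
    )
    return response.choices[0].message.content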

Chain-of-Verification (CoVe)

A technique from Meta (Dhuliawala et al., 2023) in which the model verifies its own response in four steps:

  1. Draft: the model generates an initial response
  2. Planning: generates verification questions about its own claims
  3. Independent verification: answers each question separately (without seeing the draft, to avoid bias)
  4. Final response: generates a corrected response based on verifications
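The function below sketches this flow with the OpenAI Python client; the model choice and the prompt wording are illustrative, not taken verbatim from the paper.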
def chain_of_verification(client, question: str) -> str:
    # 1. Initial draft
    draft = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": question}],
    ).choices[0].message.content
 
    # 2. Generate verification questions
    verification_qs = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": (
            f"Draft: {draft}\n\n"
            "List the factual claims and generate a verification question for each."
        )}],
    ).choices[0].message.content
 
    # 3. Verify each question independently
    verifications = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": (
            f"Answer each question independently:\n{verification_qs}"
        )}],
    ).choices[0].message.content
 
    # 4. Corrected final response
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": (
            f"Original question: {question}\n"
            f"Draft: {draft}\n"
            f"Verifications: {verifications}\n\n"
            "Generate a final response correcting any detected errors."
        )}],
    ).choices[0].message.content

Prompt design

  • Instruct the model to say "I don't know" when it lacks information
  • Ask it to cite sources for factual claims
  • Use chain-of-thought to make reasoning explicit
  • Clearly separate facts from opinions (a combined prompt sketch follows this list)
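
These guidelines can be folded into a single system prompt; the wording below is illustrative, not a canonical template:

FACTUAL_SYSTEM_PROMPT = """You are a careful assistant.
- If you are not sure of a fact, say "I don't know" instead of guessing.
- Cite a source for every factual claim; if you have none, say so explicitly.
- Reason step by step before giving the final answer.
- Label opinions as opinions, clearly separated from facts."""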

Confidence calibration

  • Generate multiple responses (N=5) and compare consistency; if responses diverge, confidence is low (see the sketch after this list)
  • Detect linguistic patterns associated with hallucinations: excessive hedging, overly specific details without a source
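
A minimal self-consistency sketch along these lines, reusing the OpenAI client from the earlier examples; the exact-match agreement heuristic is deliberately crude:

from collections import Counter

def self_consistency(client, question: str, n: int = 5, temperature: float = 0.8):
    """Sample n answers and return the most common one with an agreement score."""
    answers = []
    for _ in range(n):
        answer = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": question}],
            temperature=temperature,  # sampling diversity is what exposes disagreement
        ).choices[0].message.content.strip()
        answers.append(answer)

    # Crude agreement score: how often the most frequent answer appears.
    # Real pipelines compare answers semantically (embeddings or an LLM judge).
    most_common, count = Counter(answers).most_common(1)[0]
    return most_common, count / n

If the score is low, the response can be flagged for review or the model can be asked to abstain.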

Evaluation metrics

| Metric | What it measures | Tool |
| --- | --- | --- |
| Faithfulness | Is the response faithful to the provided context? | RAGAS, DeepEval |
| FActScore | Atomic-level factual precision (claim by claim) | FActScore |
| Attribution | Are citations real and relevant? | Manual verification + LLM-as-judge |
| Self-consistency | Do multiple generations agree? | Sampling + comparison |
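
As a lightweight complement to frameworks like RAGAS or DeepEval, faithfulness can be approximated with an LLM-as-judge check. A sketch under the same client assumption as above; the prompt and the 0-10 scale are assumptions, not the RAGAS implementation:

def judge_faithfulness(client, context: str, answer: str) -> float:
    """Ask a judge model how well the answer is supported by the context (0.0-1.0)."""
    verdict = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": (
            f"Context:\n{context}\n\n"
            f"Answer:\n{answer}\n\n"
            "Rate from 0 to 10 how well every claim in the answer is supported "
            "by the context. Reply with only the number."
        )}],
        temperature=0,
    ).choices[0].message.content.strip()
    try:
        return float(verdict) / 10.0
    except ValueError:
        return 0.0  # an unparseable verdict is treated as unsupported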

Limitations

No perfect solution exists. Even with RAG, the model can:

  • Misinterpret retrieved context
  • Combine information in incorrect ways
  • Invent details that "complete" the information

Mitigation reduces the frequency of hallucinations; it does not eliminate the problem. In critical applications (medical, legal, financial), human verification remains necessary.

Why it matters

Hallucinations are the most visible risk of AI systems in production. A model that generates false information with confidence can cause real harm — from citing nonexistent case law to fabricating medical data. Mitigation techniques — RAG, grounding, CoVe, verification — are engineering requirements, not optional improvements. The goal is not to eliminate hallucinations (impossible with current LLM architecture) but to reduce their frequency and detect them before they reach the user.

References

  • Chain-of-Verification Reduces Hallucination in Large Language Models — Dhuliawala et al. (Meta), 2023. Four-step self-verification method.
  • FActScore: Fine-grained Atomic Evaluation of Factual Precision — Min et al., 2023. Atomic-level factual precision evaluation.
  • A Survey on Hallucination in LLMs — Huang et al., 2023. Comprehensive survey on hallucinations in LLMs.
  • Survey of Hallucination in Natural Language Generation — Ji et al., 2023. Foundational survey on hallucinations in natural language generation.
  • RAGAS: Automated Evaluation of Retrieval Augmented Generation — Es et al., 2023. Evaluation framework for RAG systems including faithfulness.
  • LaMDA: Towards Safe, Grounded, and High-Quality Dialog Models for Everything — Google Research Blog, 2022. Google's approach to safe and grounded dialog models.
  • Reduce hallucinations — Anthropic, 2024. Practical guide with prompting techniques to reduce hallucinations in production.

Related content

  • Large Language Models

    Massive neural networks based on the Transformer architecture, trained on enormous text corpora to understand and generate natural language with emergent capabilities like reasoning, translation, and code generation.

  • Retrieval-Augmented Generation

    Architectural pattern that combines information retrieval from external sources with LLM text generation, reducing hallucinations and keeping knowledge current without retraining the model.

  • AI Safety

    Field dedicated to ensuring artificial intelligence systems behave safely, aligned with human values, and predictably, minimizing risks of harm.

  • Chain-of-Thought

    Prompting technique that improves LLM reasoning by asking them to decompose complex problems into explicit intermediate steps before reaching a conclusion.

  • Content Agent QA Review: PR #187

    Findings from the manual review of PR #187.
