TroubleshootingIntermediate

How to reduce Claude hallucinations by grounding answers in your documents

Cut fabricated facts by feeding source documents into the prompt and turning on citations so every claim is traceable.

9 minIntermediate

A hallucination is a confident answer that is not supported by anything real. The most reliable fix is grounding: give the model the source material in the request and tell it to answer only from that material. The Anthropic API has a built-in citations feature that ties each sentence of the answer back to a span of your document, so you can verify claims instead of trusting them.

  • The Anthropic SDK installed and an API key set
  • One or more source documents (text or PDF)
  • A question you want answered strictly from those documents

Step 1: Put the source in a document block

Instead of pasting the source as plain text, wrap it in a document content block. This lets the API track character positions for citations later.

grounded.py
from anthropic import Anthropic

client = Anthropic()
source = open("policy.txt").read()

resp = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {"type": "text", "media_type": "text/plain", "data": source},
                "title": "Refund policy",
                "citations": {"enabled": True},
            },
            {"type": "text", "text": "What is the refund window? Answer only from the document."},
        ],
    }],
)
Say what to do when the answer is not there
Add a line like: if the document does not contain the answer, say you do not know. Without this, the model may fall back on general knowledge, which is exactly the hallucination you are trying to prevent.

Step 2: Read the citations off the response

With citations enabled, the answer is split into text blocks. Cited blocks carry a citations array, each entry pointing at the exact span of the source the claim came from. Walk the blocks to show or log where each statement is grounded.

show_citations.py
for block in resp.content:
    if block.type != "text":
        continue
    print(block.text)
    for c in (block.citations or []):
        print("  source:", repr(c.cited_text))
Agent transcript — grounded answer
You
What is the refund window?
Agent
Refunds are available within 30 days of purchase. [source: 'eligible for a full refund within 30 calendar days']
Every claim points back at a span of the policy document.

Step 3: Keep the model honest about gaps

Test with a question the document cannot answer. A well grounded setup should refuse to invent. If it still makes something up, tighten the instruction and confirm the source actually contains the section you expected.

Agent transcript — honest gap
You
Does the policy cover digital gift cards?
Agent
The document does not mention digital gift cards, so I cannot answer that from this source.
Grounding plus a clear instruction produces an I-do-not-know instead of a guess.
Citations and structured outputs do not mix
Enabling citations on a document while also setting output_config.format returns a 400. Pick one: either traceable prose with citations, or a strict JSON shape, not both in the same request.

Result: with the source in a document block and citations on, the refund answer pointed straight at the 30 day clause, and the gift card question returned an honest I-do-not-know instead of a fabricated policy. You can now audit every claim.

Watch related tutorials

Tags
#hallucinations#citations#grounding#claude