Your AI is lying to you. Confidently. Fluently. And most of the time, you have no idea.
AI hallucination — where models generate false information presented as fact — is one of the biggest unsolved problems in production AI. Here's what's actually causing it and how to measure it.
When a model hasn't seen enough data about a topic, it fills in the gaps by generating plausible-sounding content based on patterns. It doesn't know what it doesn't know.
LLMs are trained to produce fluent, confident text. This same property makes them dangerous — they'll state false information with the same confidence as true information.
When the relevant information doesn't fit in the model's context window, the model falls back on what it absorbed during training, which may be wrong, outdated, or simply fabricated.
"The model doesn't know it's wrong. It's not lying — it's pattern-matching toward plausibility." — AI Safety Researcher, 2025
Models that are uncertain often hedge. Count hedging phrases alongside absolute assertions; in this heuristic both raise the risk score:

import re

def hedge_score(text: str) -> float:
    """Heuristic 0-100 risk score from hedging and absolute language."""
    hedges = re.findall(
        r'\b(probably|likely|perhaps|around|approximately|roughly|I think|I believe)\b',
        text, re.IGNORECASE
    )
    assertions = re.findall(
        r'\b(always|never|proven|confirmed|established|guaranteed)\b',
        text, re.IGNORECASE
    )
    # Both signals raise the score: hedges suggest the model is unsure,
    # absolutes suggest overclaiming. The weights (5 and 8) are heuristic.
    return float(min(100, len(hedges) * 5 + len(assertions) * 8))
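Raw counts grow with answer length, so a long answer can score higher than a short one with the same tone. One possible adjustment, offered here as an assumption rather than as part of the scoring above, is to report each signal per 100 words. The name `signal_rates` and the per-100-words scale are illustrative choices.

```python
import re

# Same word lists as the scoring heuristic above.
HEDGES = re.compile(
    r"\b(probably|likely|perhaps|around|approximately|roughly|I think|I believe)\b",
    re.IGNORECASE,
)
ABSOLUTES = re.compile(
    r"\b(always|never|proven|confirmed|established|guaranteed)\b",
    re.IGNORECASE,
)

def signal_rates(text: str) -> dict[str, float]:
    """Return hedges and absolutes per 100 words (hypothetical helper)."""
    words = len(text.split())
    if words == 0:
        return {"hedges_per_100": 0.0, "absolutes_per_100": 0.0}
    scale = 100 / words
    return {
        "hedges_per_100": len(HEDGES.findall(text)) * scale,
        "absolutes_per_100": len(ABSOLUTES.findall(text)) * scale,
    }
```

Rates like these are easier to compare across outputs of different lengths, though the thresholds that count as "risky" would still need tuning on your own data.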
Does the AI cite sources? Do those sources actually exist and support the claim?
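import re
import httpx

async def verify_sources(text: str) -> list[dict]:
    """Check that each cited URL resolves. This does not check that it supports the claim."""
    # Strip trailing punctuation that the regex picks up from surrounding prose.
    urls = [u.rstrip('.,;:!?') for u in re.findall(r'https?://[^\s]+', text)]
    results = []
    async with httpx.AsyncClient(follow_redirects=True) as client:
        for url in urls:
            try:
                r = await client.head(url, timeout=5)
                # Some servers reject HEAD, so a failure is a signal, not proof.
                results.append({"url": url, "exists": r.is_success})
            except (httpx.HTTPError, httpx.InvalidURL):
                results.append({"url": url, "exists": False})
    return results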
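Checking that a URL resolves answers only the first question. For the second, whether the source supports the claim, a very rough first pass is lexical overlap between the claim and the page text. The sketch below assumes you have already fetched the page and extracted its text; `support_score`, `content_words`, and the stopword list are hypothetical names introduced for illustration.

```python
import re

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {
    "the", "a", "an", "of", "and", "or", "to", "in", "is", "are", "was",
    "were", "that", "for", "on", "by", "with", "it", "as", "at",
}

def content_words(text: str) -> set[str]:
    """Lowercase alphanumeric tokens with stopwords removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def support_score(claim: str, page_text: str) -> float:
    """Fraction of the claim's content words that also appear in the page."""
    claim_words = content_words(claim)
    if not claim_words:
        return 0.0
    return len(claim_words & content_words(page_text)) / len(claim_words)
```

Word overlap cannot detect negation or paraphrase, so a high score means "worth a closer look", not "verified". An entailment model or a human review would be the stronger check.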
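Ask the model the same question several times, ideally with slightly different phrasing. If the answers contradict each other, it's likely hallucinating. The sketch below resamples a single prompt, which only reveals disagreement when sampling is non-deterministic (temperature above zero):

def consistency_check(question: str, model, n: int = 3) -> float:
    """Return 1 - mean pairwise similarity; higher means less consistent."""
    if n < 2:
        raise ValueError("n must be at least 2 to compare answers")
    # `model.generate` and `cosine_similarity` are placeholders: the latter is
    # assumed to embed both strings and return a similarity in [0, 1].
    answers = [model.generate(question) for _ in range(n)]
    # Low similarity = high hallucination risk
    similarities = [cosine_similarity(answers[i], answers[j])
                    for i in range(n) for j in range(i + 1, n)]
    return 1 - (sum(similarities) / len(similarities))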
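To make the idea runnable without an embedding model, here is a minimal self-contained variant. It is a sketch under two assumptions: similarity is approximated with a bag-of-words cosine (a stand-in for real semantic embeddings), and the model is any callable that maps a prompt to a string. The names `inconsistency` and `generate` are hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations
from typing import Callable

def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity; a crude proxy for semantic similarity."""
    va = Counter(re.findall(r"[a-z0-9]+", a.lower()))
    vb = Counter(re.findall(r"[a-z0-9]+", b.lower()))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0

def inconsistency(prompts: list[str], generate: Callable[[str], str]) -> float:
    """1 - mean pairwise similarity across answers to paraphrased prompts."""
    answers = [generate(p) for p in prompts]
    pairs = list(combinations(answers, 2))
    if not pairs:
        raise ValueError("need at least two prompts to compare answers")
    return 1 - sum(cosine_similarity(x, y) for x, y in pairs) / len(pairs)

# Usage with a canned stand-in for a model (no real API is called).
canned = {
    "When was the Eiffel Tower completed?": "It was completed in 1889.",
    "What year was the Eiffel Tower finished?": "It was completed in 1889.",
}
score = inconsistency(list(canned), canned.__getitem__)
# Identical answers, so the score is approximately 0.
```

Passing a list of paraphrases, rather than repeating one prompt, matches the "slightly different phrasing" idea more directly. Token overlap will miss answers that agree in meaning but differ in wording, so swapping in an embedding-based similarity is the natural next step.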
Before shipping any AI feature, verify these:
✓ Are all factual claims grounded in retrieved documents (RAG)?
✓ Does your system prompt instruct the model to say "I don't know"?
✓ Are cited URLs verified to exist?
✓ Is hallucination rate measured and monitored in production?
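The last checklist item, measuring hallucination rate in production, could be tracked with something as simple as a rolling window over flagged outputs. This is a sketch, not a monitoring product: the class name `HallucinationRateMonitor`, the window size, and the 5% alert threshold are all assumptions, and how an output gets flagged (any of the checks above, or human review) is left to you.

```python
from collections import deque

class HallucinationRateMonitor:
    """Rolling share of outputs flagged as hallucinated (hypothetical helper)."""

    def __init__(self, window: int = 100, threshold: float = 0.05):
        # deque with maxlen drops the oldest flag once the window is full.
        self.flags = deque(maxlen=window)
        self.threshold = threshold

    def record(self, flagged: bool) -> None:
        """Record whether the latest output was flagged."""
        self.flags.append(flagged)

    @property
    def rate(self) -> float:
        """Fraction of flagged outputs in the current window (0.0 if empty)."""
        return sum(self.flags) / len(self.flags) if self.flags else 0.0

    def should_alert(self) -> bool:
        """True when the rolling rate exceeds the configured threshold."""
        return self.rate > self.threshold
```

Call `record()` once per response and check `should_alert()` on whatever cadence your alerting uses. A short window reacts quickly but is noisy; a long one is steadier but slower to catch a regression.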
SoruvaGuard scores every AI output for hallucination risk, source integrity, and more — in one API call.