OpenAI discovers why AI chatbots hallucinate, suggests fix
LLMs are rewarded for guessing when uncertain

Sep 06, 2025, 03:19 pm

What's the story

OpenAI has discovered the root cause of hallucinations in large language models (LLMs): not forgetfulness or creativity, but a tendency to bluff. A new paper from OpenAI explains that these models are not programmed to lie; rather, they are rewarded for guessing when uncertain, because guessing improves their benchmark scores. This is much like students guessing on exam questions they don't know, since leaving an answer blank earns no credit.

Evaluation bias

AIs are evaluated on how well they guess

OpenAI's researchers have pointed out that the current evaluation process for language models rewards guessing over silence. They explained, "Humans learn the value of expressing uncertainty outside of school, in the school of hard knocks. On the other hand, language models are primarily evaluated using exams that penalize uncertainty." This results in AIs appearing overly confident even when they're wrong.
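
To make that incentive concrete, here is a small Python sketch (purely illustrative; the four-option setup is an assumption, not a figure from OpenAI's paper). Under plain 0/1 accuracy grading, saying "I don't know" earns nothing, while a blind guess has a positive expected score, so a model graded this way is always better off bluffing.

```python
# Illustrative only: expected scores under plain 0/1 accuracy grading
# for a question the model genuinely cannot answer.
k = 4                     # assume a four-choice benchmark question
abstain_score = 0.0       # "I don't know" earns nothing
expected_guess = 1.0 / k  # a blind guess is right 1 time in k

# Accuracy-only grading therefore always favors the bluffer.
print(expected_guess > abstain_score)  # True
```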

Solution proposal

Changing evaluation method instead of rebuilding models

To tackle this issue, OpenAI suggests changing the evaluation method instead of rebuilding the models from scratch. The company argues that "the root problem is the abundance of evaluations that are not aligned," and proposes adjusting primary evaluations so they stop penalizing models that abstain when uncertain. In a blog post accompanying the paper, the researchers wrote, "The widely used, accuracy-based evals need to be updated so that their scoring discourages guessing."
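
As a rough sketch of what such a change could look like (the scoring function and penalty value below are hypothetical, not OpenAI's actual metric), an eval could dock points for wrong answers while scoring abstentions as zero:

```python
def penalized_score(answers, wrong_penalty=0.5):
    """Score a run where each answer is 'right', 'wrong', or 'abstain'.

    Hypothetical scheme: right = +1, wrong = -wrong_penalty, abstain = 0.
    """
    total = 0.0
    for a in answers:
        if a == "right":
            total += 1.0
        elif a == "wrong":
            total -= wrong_penalty  # guessing wrong now costs something
        # 'abstain' contributes 0 instead of counting as a miss
    return total / len(answers)

# Two models that each know 60 of 100 answers: one bluffs on the other
# 40 questions (10 lucky guesses, 30 misses), the other abstains on them.
bluffer = ["right"] * 70 + ["wrong"] * 30
honest = ["right"] * 60 + ["abstain"] * 40

print(penalized_score(bluffer))  # 0.55
print(penalized_score(honest))   # 0.60: the honest model now wins
```

Under plain accuracy the bluffer would score 0.70 to the honest model's 0.60; a sufficiently large penalty for wrong answers flips that ranking, which is exactly the incentive change the researchers are arguing for.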

Balanced response

Shift in focus from improving AI models to evaluating them

OpenAI's focus on how AIs are evaluated marks a major shift from the industry's race to make chatbots faster, smarter, and more articulate. The company hopes that by tweaking the way AIs are scored, it can produce models that give measured, reliable responses instead of taking a "fake it till you make it" approach. That reliability matters most when users turn to AI assistants for medical or financial advice.