OpenAI admits GPT-5 hallucinates: ‘Even advanced AI models can produce confidently wrong answers’ – Here’s why


OpenAI has outlined the persistent issue of “hallucinations” in language models, acknowledging that even its most advanced systems occasionally produce confidently incorrect information. In a blog post published on 5 September, the company defined hallucinations as plausible but false statements generated by AI that can appear even in response to straightforward questions.

Persistent hallucinations in AI

The problem, OpenAI explains, is partly rooted in how models are trained and evaluated. Current benchmarks often reward guessing over acknowledging uncertainty, creating incentives for AI systems to provide an answer rather than admit they do not know. In one example, an earlier model produced three different, incorrect responses when asked for an author’s dissertation title and similarly varied answers when asked for a birth date.

Accuracy vs humility

OpenAI highlights that standard evaluation methods, which focus on accuracy alone, encourage models to guess.

“Think about it like a multiple-choice test. If you do not know the answer but take a wild guess, you might get lucky and be right. Leaving it blank guarantees a zero. In the same way, when models are graded only on accuracy, the percentage of questions they get exactly right, they are encouraged to guess rather than say ‘I don’t know,’” the company notes.
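The incentive is easy to quantify. The sketch below, using a hypothetical four-option question, compares the expected score of a wild guess with that of leaving the answer blank when only accuracy is counted:

```python
# Expected score on one multiple-choice question when only accuracy counts.
# Hypothetical setup: 4 options, 1 point for a correct answer, 0 otherwise.

num_options = 4
p_correct_if_guessing = 1 / num_options

expected_score_guess = p_correct_if_guessing * 1 + (1 - p_correct_if_guessing) * 0
expected_score_blank = 0.0

print(f"Expected score of a wild guess: {expected_score_guess:.2f}")       # 0.25
print(f"Expected score of leaving it blank: {expected_score_blank:.2f}")   # 0.00

# Guessing strictly dominates abstaining, so a benchmark graded on accuracy
# alone rewards a model that always produces an answer, however unsure it is.
```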

Newer models, including GPT‑5, show a reduction in hallucinations compared with earlier versions, particularly in reasoning tasks. OpenAI’s research indicates that abstaining from answering when uncertain, a form of AI humility, can lower error rates, though it may slightly reduce apparent accuracy on conventional benchmarks.

The company argues that addressing hallucinations requires more than developing better models; evaluation frameworks themselves must be revised. Penalising confident errors more heavily than abstentions and giving partial credit for uncertainty could reduce the prevalence of hallucinations. Current accuracy-focused scoreboards, OpenAI warns, continue to incentivise guessing.
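One way to picture such a revision is a scoring rule in which a wrong answer costs points and an explicit ‘I don’t know’ costs nothing. The rule and weights below are illustrative assumptions, not values taken from OpenAI’s paper:

```python
# Illustrative scoring rule: +1 for a correct answer, -1 for a confident error,
# 0 for an abstention. The weights are assumptions chosen for the example.

def score(response: str, correct_answer: str, wrong_penalty: float = 1.0) -> float:
    if response == "I don't know":
        return 0.0                      # abstention: neither rewarded nor punished
    if response == correct_answer:
        return 1.0                      # correct answer earns full credit
    return -wrong_penalty               # confident error is penalised

# Expected value of guessing among four options when the model has no idea:
p_right = 0.25
ev_guess = p_right * 1.0 + (1 - p_right) * (-1.0)    # -0.50
ev_abstain = score("I don't know", "anything")       #  0.00

print(f"Expected value of guessing:   {ev_guess:.2f}")
print(f"Expected value of abstaining: {ev_abstain:.2f}")

# Under this rule, admitting uncertainty beats a wild guess, which is the
# behaviour OpenAI argues evaluations should encourage.
```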

The underlying cause of hallucinations, the research explains, is tied to the nature of pretraining.

Language models learn by predicting the next word in vast amounts of text, without labels indicating whether statements are true or false. While consistent patterns such as spelling and grammar are learned reliably, rare or arbitrary facts, such as an individual’s birthday, cannot be predicted accurately from patterns alone.
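A toy version of that objective makes the distinction concrete: the training signal says only which word actually followed in the text, never whether the completed statement is true. The context, vocabulary and probabilities below are hypothetical:

```python
import math

# Toy next-word prediction step. The model assigns probabilities to candidate
# next words and is penalised by how much probability it failed to put on the
# word that actually appeared in the training text. No truth label is involved.

context = "Her birthday is in"

# Hypothetical model probabilities for the word following `context`.
predicted = {"March": 0.34, "May": 0.33, "October": 0.33}

observed_next_word = "October"   # whatever the training document happened to say

# Standard pretraining loss: negative log-likelihood of the observed next word.
loss = -math.log(predicted[observed_next_word])

print(f"Context: {context!r}")
print(f"Observed next word: {observed_next_word!r}")
print(f"Next-word loss: {loss:.3f}")

# Spelling and grammar recur constantly, so the model's next-word distributions
# become sharp and reliable. An arbitrary fact like one person's birthday barely
# recurs, so the distribution stays flat and any confident completion is
# effectively a guess.
```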

Misconceptions and future efforts

OpenAI’s paper also addresses misconceptions about hallucinations. It notes that while some questions are inherently unanswerable, hallucinations themselves are not inevitable; even small models can avoid them by recognising their own limits. Furthermore, hallucinations are not a mysterious glitch but a predictable consequence of the statistical training process combined with the reward structures used in evaluations.

The company says that lowering hallucination rates is an ongoing effort and that reforming evaluation methods will be crucial in reducing confidently incorrect outputs in future language models.


