The Decoder, September 6, 2025

Why ChatGPT hallucinations may shift from guessing to doubt

OpenAI says language models will always hallucinate because they predict likely words rather than verify truth. The company argues that future systems should be better at recognizing uncertainty, using tools, asking for help, or saying "I don't know" instead of guessing.

OpenAI’s latest position on ChatGPT hallucinations is blunt: the problem will not disappear. Language models can become more useful and more careful, but they will still sometimes produce false or misleading information.

The more realistic goal is not a model that never makes things up. It is a model that better recognizes when it is unsure and changes its behavior before a confident-sounding error reaches the user.

Why hallucinations happen

OpenAI says language models will always hallucinate. In this context, a hallucination is a false or misleading statement. The source also describes the problem with the term "bullshit."

The basic reason is how these systems are trained. They are built to predict the next most likely word, not to determine what is true. That makes them capable of producing fluent answers that look reliable even when the underlying content is wrong.
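To make that distinction concrete, here is a minimal Python sketch of greedy next-word selection. The probability table is invented for illustration and does not come from any real model, and the name `pick_next_word` is hypothetical. The only point is that the selection rule looks at likelihood and never consults a source of truth.

```python
# Toy next-word table. These probabilities are made up for illustration;
# they do not come from any real model.
TOY_NEXT_WORD_PROBS = {
    "The capital of Australia is": {
        "Sydney": 0.55,     # plausible-sounding, but wrong
        "Canberra": 0.40,   # correct, but less likely in this toy table
        "Melbourne": 0.05,
    }
}


def pick_next_word(prompt: str) -> str:
    """Return the most probable continuation; truth is never checked."""
    candidates = TOY_NEXT_WORD_PROBS[prompt]
    return max(candidates, key=candidates.get)


print(pick_next_word("The capital of Australia is"))  # prints Sydney
```

In this toy setup the fluent answer and the correct answer differ, and nothing in the selection step can tell them apart.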

This distinction matters because language quality can hide uncertainty. A model can present an inaccurate answer with the same confidence and polish as a correct one. For creative uses, that may be tolerable. For users seeking reliable information, it becomes a serious weakness.

OpenAI’s framing also makes clear that the issue is not only about one bad response. It is part of the basic design of language models: they generate language from patterns, and those patterns do not give them a human-like grasp of truth and falsehood.

The different ways models get facts wrong

OpenAI separates hallucinations into several categories. That breakdown helps explain why some mistakes are easy to spot, while others can be harder for users to detect.

Intrinsic hallucinations contradict the prompt itself. The source gives the example of a model answering "2" when asked, "How many Ds are in DEEPSEEK?" The word contains only one D.
Extrinsic hallucinations conflict with real-world facts or with the model’s training data. Examples include invented quotes or made-up biographies.
Arbitrary fact hallucinations appear when the model tries to answer questions about details rarely or never represented in its training. The source points to specific birthdays and dissertation titles as examples.
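
The DEEPSEEK example above is the kind of question a deterministic tool settles without guessing. A minimal Python sketch, with `count_letter` as a hypothetical helper name rather than anything OpenAI describes:

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    return word.upper().count(letter.upper())


print(count_letter("DEEPSEEK", "D"))  # prints 1
```

A model that hands this question to such a helper does not need to predict the count from language patterns.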

The last category shows why guessing is such a persistent problem. When a model lacks enough basis for a specific answer, it may still produce one. To the user, that can look like knowledge. Inside the system, it is closer to a guess shaped by language patterns.

How OpenAI tries to reduce the problem

OpenAI says it uses several methods to reduce hallucinations. These include reinforcement learning with human feedback, external tools such as calculators and databases, retrieval-augmented generation, and fact-checking subsystems.

Each approach addresses a different part of the problem. External tools can help when a model needs calculation or stored information. Retrieval can bring in relevant material instead of forcing the model to rely only on what it has learned. Fact-checking subsystems can add another layer of review before an answer is delivered.

OpenAI’s longer-term goal is a modular "system of systems" that can make model behavior more reliable and predictable. That phrase points to a future in which the language model is only one part of a larger process, with other components helping it check, retrieve, calculate, or stop.
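
One way to picture such a modular setup is a draft answer that must pass a separate checker before it reaches the user. The sketch below is an assumption-laden toy in Python: `REFERENCE_FACTS`, `fact_check`, and `answer_with_checks` are hypothetical names, the reference store holds a single illustrative entry, and nothing here reflects OpenAI's actual architecture.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical stand-in for a trusted reference store (illustrative content).
REFERENCE_FACTS = {"letters_d_in_deepseek": "1"}


@dataclass
class Result:
    answer: Optional[str]  # None means the system declined to answer
    note: str


def fact_check(key: str, draft: str) -> bool:
    """Accept a draft only if it matches the reference store."""
    return REFERENCE_FACTS.get(key) == draft


def answer_with_checks(key: str, draft_model: Callable[[str], str]) -> Result:
    """Pass the model's draft through the checker; stop if unverified."""
    draft = draft_model(key)
    if fact_check(key, draft):
        return Result(draft, "verified against reference store")
    return Result(None, "could not verify; declining to answer")


# A stub "model" that always answers "2", echoing the DEEPSEEK example.
result = answer_with_checks("letters_d_in_deepseek", lambda _key: "2")
print(result.note)
```

The language model is only one component here: the checker decides whether its draft is delivered or withheld.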

Still, none of these strategies changes the central claim: hallucinations remain part of language models. The improvement OpenAI is emphasizing is better self-monitoring, not perfect truthfulness.

Why saying "I don't know" matters

OpenAI says future versions should be better at knowing when they are unsure. When that happens, the model should not simply invent an answer. It should use outside tools, ask for help, or stop responding.
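
That behavior can be sketched as a simple routing rule. The Python below is illustrative only: the `Action` names, the `choose_action` function, and the default threshold of 0.8 are assumptions made for this example, not values or interfaces from OpenAI.

```python
from enum import Enum


class Action(Enum):
    ANSWER = "answer"
    USE_TOOL = "use_tool"
    ASK_FOR_HELP = "ask_for_help"
    ABSTAIN = "abstain"


def choose_action(confidence: float, tool_available: bool,
                  can_ask_user: bool, threshold: float = 0.8) -> Action:
    """Pick a behavior from a confidence estimate (threshold is arbitrary)."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= threshold:
        return Action.ANSWER
    if tool_available:
        return Action.USE_TOOL       # e.g. a calculator or database lookup
    if can_ask_user:
        return Action.ASK_FOR_HELP   # request clarification or assistance
    return Action.ABSTAIN            # say "I don't know" instead of guessing


print(choose_action(0.3, tool_available=False, can_ask_user=False))
```

The order of the fallbacks is a design choice in this sketch; a real system could just as well ask the user before reaching for a tool.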

That does not mean the model truly understands truth. The source is careful on this point. The model may be able to signal low confidence without having a real concept of what is true or false.

Even so, that signal could change the user experience. A model that admits uncertainty is less likely to mislead someone with a polished answer that has no solid basis. It also behaves more like a person in one narrow but important way: people do not know everything, and sometimes the responsible answer is to say so.

The challenge is that current evaluation systems often push models in the other direction.

Benchmarks may reward guessing

OpenAI points to a deeper issue in how large language models are tested. Many benchmarks use right-or-wrong scoring and do not reward an answer such as "I don't know." According to OpenAI, that structure encourages guessing.

The result is a bad incentive. A model that always answers may score better than one that honestly admits uncertainty, even when some of those confident answers are made up. OpenAI calls this an "epidemic," because uncertainty is punished while hallucination can be rewarded.

OpenAI suggests changing benchmark design. Instead of rewarding models for always responding, tasks could require an answer only when the model is confident. Wrong answers would be penalized, while "I don't know" would not count against the model.

OpenAI calls these "confidence thresholds." The idea is to measure responsible behavior more directly, rather than simply favoring models that sound certain. In that kind of test, restraint becomes a feature, not a failure.
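
The incentive shift can be shown with a little arithmetic. In the Python sketch below, a correct answer earns 1 point, "I don't know" earns 0, and a wrong answer costs `wrong_penalty` points. The article does not give point values, so these numbers are assumptions; the penalty t / (1 - t) is simply one choice that makes answering worthwhile only when the chance of being right exceeds the threshold t.

```python
ABSTAIN_SCORE = 0.0  # assumed score for "I don't know"


def expected_score(p_correct: float, wrong_penalty: float) -> float:
    """Expected score of answering: +1 if right, -wrong_penalty if wrong."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


def penalty_for_threshold(t: float) -> float:
    """Penalty that puts the break-even point at confidence t (0 <= t < 1)."""
    return t / (1.0 - t)


def should_answer(p_correct: float, wrong_penalty: float) -> bool:
    """Answer only if it beats abstaining in expectation."""
    return expected_score(p_correct, wrong_penalty) > ABSTAIN_SCORE


# Right-or-wrong scoring (no penalty): even a 10% guess beats abstaining.
print(should_answer(0.10, wrong_penalty=0.0))

# Threshold 0.75 gives a penalty of 3 points: the same guess now loses.
print(should_answer(0.10, wrong_penalty=penalty_for_threshold(0.75)))
```

Under right-or-wrong scoring, guessing never does worse than abstaining, which is the incentive problem described above. Once wrong answers carry a cost, a low-confidence guess has negative expected value and abstaining becomes the better strategy.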

The source also notes signs of progress outside OpenAI’s own benchmarks. A Stanford math professor spent a year testing OpenAI’s models on an unsolved problem. Earlier versions gave incorrect answers, while the latest model admitted it could not solve the problem. An OpenAI model also chose not to guess on the toughest question from this year's International Mathematical Olympiad.

OpenAI says these improvements should reach commercial models in the coming months. If that happens, the practical change may be subtle but important: ChatGPT may still hallucinate, but it could become better at recognizing when an answer should not be given at all.