Why Sam Altman says LLM scaling still has room to run

Sam Altman is pushing back against critics who argue that large language models are a dead end. He says the evidence still supports continued LLM scaling, while acknowledging that models remain weaker than people on very long-horizon tasks requiring high judgment.

Sam Altman is not backing away from the scaling argument. The OpenAI CEO continues to say that large language models have more room to improve, and he is challenging critics who believe the approach has reached its limits.

His view is blunt: the field lost time because too many researchers were too sure about what scaling could not achieve. Altman’s position is not that LLMs already solve every problem. It is that the data, as he sees it, still points toward continued gains.

The Core Disagreement Over LLM Scaling

Altman’s argument targets a familiar split in AI: whether larger, more capable language models can keep producing meaningful progress, or whether the approach is fundamentally limited. The source article describes him as pushing back against LLM skeptics while continuing to bet on scaling.

He put the point directly:

Betting against LLMs scaling at this point feels quite misguided to me.

That sentence captures the center of his case. Altman is not merely saying that LLMs are useful today. He is saying that dismissing further progress from scaling is, in his view, the wrong read of the evidence.

He also argues that a whole generation of researchers held the field back because they were too confident about what scaling could not do. The criticism is not just technical. It is about how strongly people can commit to a belief before the evidence is settled.

Why Altman Is Pushing Back On Skeptics

At Stanford, Altman responded to critics including Yann LeCun, who has called LLMs a dead end. Altman said some people tie their identity to a position and cannot let go, even when the data proves them wrong.

That is a sharp way to frame the debate. For Altman, the issue is not simply that some researchers disagree with him. It is that skepticism can become personal, and once it does, people may resist evidence that challenges the position they have taken.

He also said that "Twitter trolls" predicting OpenAI’s failure for years do not bother him. The point, as presented in the source, is that outside criticism has not changed his confidence in the direction of the work.

The source also notes that Anthropic CEO Dario Amodei recently made similar remarks. That matters because Altman’s view is not presented as an isolated defense of one company’s strategy. It reflects a broader argument from leaders who still see scaling as central to progress in AI.

What LLMs Have Already Shown

Altman argued that LLMs have already surpassed human intelligence in some areas. The example given in the source is an OpenAI model that recently disproved a mathematical conjecture that had stumped smart people for a long time.

That example is important because it shifts the discussion from fluent text generation to new knowledge. According to the source, mathematicians are now asking what that achievement means for their field.

Altman summarized the implication this way:

So clearly, LLMs are capable of figuring out new knowledge.

For readers following AI closely, this is the heart of the stakes. If LLMs can contribute to problems where the answer was not already obvious, then the debate over scaling is not just about better chatbots or more polished responses. It becomes a debate about whether these systems can help move fields forward.

Still, the source does not present Altman as claiming that LLMs are superior across the board. His argument is narrower and more careful in one key respect: the models can be very strong in some areas while still falling short in others.

Where Altman Still Sees Limits

Altman acknowledged that LLMs are not equally capable at every kind of work. For very long-horizon tasks requiring high judgment, he said LLMs "seem much worse than people."

That limitation keeps the argument grounded. The claim is not that current models can replace human judgment in every extended, complex situation. The claim is that their weaknesses do not prove scaling is a dead end.

The source also says world models matter for things like robotics. That leaves room for approaches beyond language models, even as Altman argues that the data clearly supports continued scaling.

In practical terms, Altman’s position separates several ideas that are often blurred together:

  • LLMs may remain weak on very long-horizon tasks requiring high judgment.
  • LLMs may still continue improving through scaling.
  • Progress in areas like robotics may require attention to world models.
  • Skepticism about current limits does not necessarily prove the broader approach has failed.

That distinction is why the debate remains live. Critics can point to difficult tasks where LLMs still fall behind people. Altman can point to areas where models have already exceeded expectations and argue that betting against further scaling is the mistake.

The Bigger Question For AI Research

The source article presents Altman’s remarks as a challenge to researchers who underestimated scaling. His message is that confidence about what LLMs could not do was itself a drag on progress.

That does not settle the future of AI. It does clarify the terms of the argument. Altman is betting that the evidence from large language models supports continued investment in scaling, even while he accepts that important limits remain.

The result is a more precise version of the current AI debate. It is not simply optimism against skepticism. It is a fight over which evidence matters most: the failures that show where LLMs still struggle, or the breakthroughs that suggest their ceiling has been repeatedly underestimated.