How GPT-5 compressed a month of math work into an afternoon

OpenAI researcher Sebastien Bubeck says GPT-5 completed a complex mathematical task that would have taken him around a month. The model planned the solution, ran a simulation to check a formula, and wrote a complete proof in just an afternoon.

An OpenAI researcher says GPT-5 produced the "most impressive LLM output" he has seen so far. The claim rests not on a general benchmark or a public demo, but on a specific mathematical task that Sebastien Bubeck says compressed around a month of work into just an afternoon.

What GPT-5 did in the math task

According to the source, Bubeck reported in a post on X that GPT-5 handled a highly complex mathematical task for him. The important part is the sequence of work: the model did not merely produce a final answer. It designed the solution path, ran a simulation to check a formula, and then wrote a complete proof.

That combination matters because high-level math work often requires more than one kind of reasoning step. A useful result has to connect the plan, the calculation, and the proof. In Bubeck’s account, GPT-5 linked those pieces into what the source describes as effectively a seamless calculation.

Bubeck said this process would previously have taken him around a month. GPT-5 completed it in just an afternoon. He called the result the "most impressive LLM output" he has seen to date.

Why the sequence is significant

The source does not present GPT-5 as merely producing a shortcut. It describes a workflow in which the model moved through several stages of mathematical work. First, it found a path. Then it checked part of that path by simulation. Finally, it produced a complete proof.

That is different from a tool that only helps with formatting, search, or routine calculation. The reported task involved planning and verification before a final written result. For researchers, that kind of sequence is where time can disappear: not only in knowing what to prove, but in testing whether the proposed route is likely to hold together.
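To make "checking a formula by simulation" concrete, here is a minimal Python sketch. It is not Bubeck's problem, which the source does not describe. The dice example, the function names, the trial count, and the tolerance are all illustrative assumptions. The sketch compares a candidate closed-form formula with a seeded Monte Carlo estimate before any effort goes into a proof.

```python
import random


def formula_expected_max(n: int) -> float:
    """Candidate closed form for the expected maximum of two fair n-sided dice.

    Illustrative example only; chosen because it is easy to derive by hand.
    """
    return (n + 1) * (4 * n - 1) / (6 * n)


def simulated_expected_max(n: int, trials: int, seed: int = 0) -> float:
    """Estimate the same quantity by rolling two dice `trials` times."""
    rng = random.Random(seed)  # fixed seed so the check is reproducible
    total = 0
    for _ in range(trials):
        total += max(rng.randint(1, n), rng.randint(1, n))
    return total / trials


def formula_matches_simulation(n: int, trials: int = 200_000,
                               tolerance: float = 0.05) -> bool:
    """Return True when the formula and the simulation agree within `tolerance`.

    Agreement is evidence that the formula is worth proving, not a proof.
    """
    estimate = simulated_expected_max(n, trials)
    return abs(formula_expected_max(n) - estimate) < tolerance
```

A call such as formula_matches_simulation(6) returning True only suggests the formula is plausible. The proof, the third step in the reported workflow, is still what establishes it.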

The source also frames the task as highly complex. It does not give the full mathematical problem, so the claim should be read through Bubeck’s description rather than as an independently explained proof. Still, the reported time difference is clear: around a month of work reduced to an afternoon.

AI is becoming more visible in advanced mathematics

The example fits into a broader pattern described in the source: generative AI is becoming increasingly apparent in high-level mathematics. The article notes that this is not only because of gold medals in Math Olympiads. It also points to use by working mathematicians and researchers.

Terence Tao is cited as another example. The source says he recently noted that AI saved him several hours of work. His use case was different from Bubeck’s account: Tao used the tool to verify his theoretical assumptions rather than relying on it autonomously.

That distinction is important. In one case, GPT-5 is described as designing a solution path, running a simulation, and writing a proof. In the other, AI is described as helping verify assumptions. Both examples involve research time saved, but they show different levels of dependence on the system.

What this suggests for research work

The source says an OpenAI report supports the broader point, showing how GPT-5 can save significant research time across various scientific fields. It does not provide details of those fields in the excerpt, so the safest conclusion is general: the reported value is time saved in research workflows.

Those time savings can come from several stages of the process described in the article:

  • finding a possible route through a difficult problem;
  • checking a formula through simulation;
  • turning the result into a complete proof;
  • reducing the time needed for verification or theoretical checking.

None of this means every mathematical output should be accepted without review. The source itself contrasts Bubeck’s account with Tao’s more limited use, where AI helped verify assumptions rather than act autonomously. The emerging picture is not one single mode of use, but a range of research assistance.

For high-level mathematics, that range matters. A model that can save several hours is useful. A model that can compress a task from around a month into an afternoon, if the result holds up, points to a more substantial shift in how researchers may approach difficult technical work.

The practical takeaway

The core claim from the source is narrow but notable: Sebastien Bubeck says GPT-5 completed a complex math workflow in an afternoon that would previously have taken him around a month. The model planned, simulated, and proved, leading him to describe the output as the "most impressive LLM output" he has seen so far.

The broader implication is equally clear from the examples given. GPT-5 and generative AI are being used not only for simple assistance, but also in parts of advanced mathematical and scientific work where verification, proof, and research time are central concerns.