The Decoder, October 2, 2024

Google builds AI reasoning model to challenge OpenAI's o1

Google is working on a new AI model designed to compete with OpenAI's o1 in logical reasoning and complex problem solving. The effort centers on multi-step math and programming tasks, with Google Deepmind research pointing to smarter use of computing power during inference.


Google is pursuing a new AI model focused on reasoning, according to insider reports cited by Bloomberg. The work targets the same kind of capability OpenAI has highlighted with its o1 model: handling problems that require several steps rather than a quick single response.

The reported focus is practical and technical. Google wants stronger performance on complex math and programming tasks, two areas where an AI system has to plan, test possibilities, and choose between competing answers.

Why reasoning is becoming the next AI battleground

Language model progress has often been framed around scale: more data, more training power, and larger systems. The Google work described in the source points to another direction: using more computing power while the model is answering.

Bloomberg reports that several Google teams have recently advanced work on "AI reasoning software". The goal is not only to generate fluent text, but to improve how a system works through a hard prompt.

Like OpenAI's o1, Google's model is described as using a "chain of thought" approach. In plain terms, that means the system can produce multiple possible answers, evaluate them, and then select the strongest result.
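The generate, evaluate, and select loop described above can be illustrated with a toy best-of-N sketch. This is not Google's or OpenAI's implementation, which the source does not describe. The functions `propose` and `verifier` and the square-root task are hypothetical stand-ins for a sampled model answer and a scoring model.

```python
import random
from typing import Callable, List


def best_of_n(generate: Callable[[random.Random], int],
              score: Callable[[int], float],
              n: int,
              seed: int = 0) -> int:
    """Sample n candidate answers and return the one with the highest score."""
    rng = random.Random(seed)
    candidates: List[int] = [generate(rng) for _ in range(n)]
    return max(candidates, key=score)


# Hypothetical toy task: find an integer x with x * x == 144.
def propose(rng: random.Random) -> int:
    # Stand-in for one sampled model answer (assumption, not a real model call).
    return rng.randint(0, 20)


def verifier(x: int) -> float:
    # Stand-in for an evaluator: 0 is a perfect score, lower is worse.
    return -abs(x * x - 144)


single_try = best_of_n(propose, verifier, n=1)
many_tries = best_of_n(propose, verifier, n=64)
```

With the same seed, the 64-sample run includes the single-sample candidate, so its selected answer can never score worse. The extra samples are the additional effort spent on the prompt.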

This matters because difficult tasks are rarely solved by the first attempt. A math proof, a programming problem, or a multi-part logic puzzle may require the model to explore alternatives before settling on an answer. The reported Google approach treats that exploration as part of the process.

Inference compute is the key idea

The source describes an important shift in where extra computing power can be applied. Instead of only spending more compute during training, researchers can also spend more compute during inference, which is the stage when the model responds to a prompt.

That can potentially improve results because the model gets more room to search, compare, and refine. The source frames this as a new avenue for scaling AI models beyond simply increasing training data and training power.

A recently published research paper by Google Deepmind supports that direction. Its authors studied how additional computing power during inference can improve language model performance.

The paper focused on two main approaches:

Searching with verifier reward models.
Adjusting the model's response distribution based on the particular prompt.
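One way to picture the second approach is sequential revision: each new attempt is conditioned on an earlier one instead of being sampled from scratch. The sketch below is a toy illustration under that assumption, not the paper's method. The names `revise_sequentially`, `nudge`, and `closeness` and the number-guessing task are hypothetical.

```python
import random
from typing import Callable, List


def revise_sequentially(first_attempt: int,
                        revise: Callable[[int, random.Random], int],
                        score: Callable[[int], float],
                        steps: int,
                        seed: int = 0) -> List[int]:
    """Build a chain of attempts, each revised from the best attempt so far."""
    rng = random.Random(seed)
    attempts = [first_attempt]
    best = first_attempt
    for _ in range(steps):
        candidate = revise(best, rng)      # new attempt depends on an earlier one
        attempts.append(candidate)
        if score(candidate) > score(best):
            best = candidate
    return attempts


TARGET = 37  # hypothetical "correct answer" for the toy task


def nudge(previous: int, rng: random.Random) -> int:
    # Stand-in for a model revising its previous answer (assumption).
    return previous + rng.choice([-3, -1, 1, 3])


def closeness(x: int) -> float:
    # Stand-in for a verifier score: 0 is perfect, lower is worse.
    return -abs(x - TARGET)


attempts = revise_sequentially(first_attempt=0, revise=nudge,
                               score=closeness, steps=50)
final_answer = max(attempts, key=closeness)
```

Here the same scorer that guides the revisions also picks the final answer, which loosely mirrors how the two listed approaches can be combined.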

The researchers also developed a "computationally optimal" strategy. That strategy adapts the amount of compute to the difficulty of the task, rather than treating every prompt the same way.
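A minimal sketch of the idea of matching compute to difficulty follows. It assumes a difficulty estimate is already available for each prompt and splits a fixed sampling budget proportionally. The proportional rule, the function `allocate_samples`, and the example scores are illustrative assumptions, not the strategy from the paper.

```python
from typing import Dict


def allocate_samples(difficulty: Dict[str, int],
                     total_budget: int,
                     min_samples: int = 1) -> Dict[str, int]:
    """Split a sampling budget across prompts in proportion to difficulty."""
    if total_budget < min_samples * len(difficulty):
        raise ValueError("budget too small to give every prompt the minimum")
    total_difficulty = sum(difficulty.values())
    if total_difficulty <= 0:
        raise ValueError("difficulty scores must sum to a positive number")

    # Every prompt gets the minimum; the spare budget is shared proportionally.
    spare = total_budget - min_samples * len(difficulty)
    exact = {p: spare * d / total_difficulty for p, d in difficulty.items()}
    plan = {p: min_samples + int(exact[p]) for p in difficulty}

    # Hand out samples lost to rounding, largest fractional part first.
    leftover = total_budget - sum(plan.values())
    by_fraction = sorted(exact, key=lambda p: exact[p] - int(exact[p]),
                         reverse=True)
    for prompt in by_fraction[:leftover]:
        plan[prompt] += 1
    return plan


# Hypothetical difficulty scores on an arbitrary 1-10 scale.
plan = allocate_samples({"easy": 1, "medium": 3, "hard": 6}, total_budget=32)
```

The hard prompt receives the largest share of the 32 samples, while the easy prompt still gets at least one, so no prompt is treated exactly like another unless their difficulty matches.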

According to the source, this improved efficiency by more than four times compared to standard methods. That figure is important, but it is not the same as a public head-to-head result against OpenAI's o1.

What can and cannot be compared yet

The source is careful about benchmarking. A direct comparison between Google's model and OpenAI's o1 will only be possible when both companies make their full models available for benchmarking.

That leaves the current picture incomplete. The reported Google project may be aimed at the same kind of reasoning problem, and the Google Deepmind research shows a technical path, but the source does not provide a public benchmark result between the systems.

For readers following AI reasoning software, that distinction matters. A company can make progress internally, and researchers can publish promising methods, while the market still lacks a clear public comparison across full models.

Even so, the direction is clear from the source: Google is investing in methods that make models spend more effort on hard questions. The emphasis is on logical reasoning, complex problem solving, math, and programming rather than general conversational polish alone.

Google Deepmind's earlier math work shows the pattern

Google's interest in reasoning-focused AI is also visible in earlier Google Deepmind projects. In July, the company unveiled AlphaProof and AlphaGeometry 2.

AlphaProof is described as a model that specializes in mathematical reasoning. AlphaGeometry 2 is an updated version of a geometry-focused model.

Together, the two programs solved four of the six problems in the International Mathematical Olympiad, an annual competition for high school students. That result shows why math has become a central test case for reasoning systems: the tasks demand structured problem solving, not just surface-level pattern matching.

For these models, Google Deepmind combined familiar features from generative language models with elements from classic search algorithms. That combination fits the broader theme of the reported new model: language generation alone is not enough for the hardest tasks, so search and evaluation become part of the system.
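To make the pairing of generation and search concrete, here is a toy beam search in which a proposal function suggests next steps and a scoring function decides which partial solutions to keep. The source does not say which search algorithm AlphaProof or AlphaGeometry 2 use, so beam search is only an illustrative choice. The functions `expand` and `score` and the arithmetic puzzle are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Path = Tuple[int, ...]


def beam_search(start: int,
                expand: Callable[[int], List[int]],
                score: Callable[[int], float],
                beam_width: int,
                depth: int) -> Path:
    """Keep the beam_width best partial paths at each step; return the best."""
    beam: List[Path] = [(start,)]
    for _ in range(depth):
        # Deduplicate by the state reached, keeping the first path found.
        reached: Dict[int, Path] = {}
        for path in beam:
            for nxt in expand(path[-1]):
                reached.setdefault(nxt, path + (nxt,))
        ranked = sorted(reached.values(), key=lambda p: score(p[-1]),
                        reverse=True)
        beam = ranked[:beam_width]
    return beam[0]


# Hypothetical puzzle: starting from 1, reach 20 using "+1" and "*2" steps.
def expand(x: int) -> List[int]:
    # Stand-in for a generative model proposing next steps (assumption).
    return [x + 1, x * 2]


def score(x: int) -> float:
    # Stand-in for an evaluator: closer to 20 is better.
    return -abs(x - 20)


greedy_path = beam_search(1, expand, score, beam_width=1, depth=5)
wider_path = beam_search(1, expand, score, beam_width=3, depth=5)
```

In this toy run the greedy search (width 1) ends at 17, while the width-3 beam finds the path 1, 2, 4, 5, 10, 20. Keeping a few lower-ranked alternatives alive is what lets the search recover from a step that only looks best locally.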

The company announced that the next step will be to scale up these systems. In that context, the reported work on a model to rival OpenAI's o1 looks less like a standalone experiment and more like part of a wider Google push toward scalable reasoning.

The bigger implication

The most important takeaway is not simply that Google is building another AI model. It is that the competition is moving toward models that can allocate more effort when a prompt is hard.

If inference compute can be used more efficiently, AI systems may become better at tasks where answer quality depends on search, verification, and step-by-step evaluation. The source points to math and programming as core examples.

For now, the public evidence remains limited to insider reports, the Google Deepmind research paper, and earlier Google Deepmind math systems. The decisive test will come only when full models are available for benchmarking.

Until then, Google's AI reasoning work should be understood as a clear signal of where the field is heading: toward systems that do more than respond quickly, and instead spend compute to reason through complex problems.