TechCrunch AI December 12, 2025

Google deepens Gemini Deep Research as AI agents heat up

Google released a reimagined Gemini Deep Research agent based on Gemini 3 Pro. The update adds developer embedding through the Interactions API and arrives as OpenAI launches GPT 5.2 on the same day.

The story is mostly a product update, but it mildly leans toward more powerful embedded agentic systems handling complex research tasks.

Google is pushing Gemini Deep Research beyond the familiar idea of an AI tool that simply writes a report. The company released a reimagined version of the research agent on Thursday, built on Gemini 3 Pro, and positioned it for a wider agentic AI landscape.

The timing matters. Google’s announcement landed on the same day OpenAI launched GPT 5.2, codenamed Garlic, turning a product update into another direct signal of how fast the AI agent race is moving.

What Changed In Gemini Deep Research

The new Gemini Deep Research is based on Gemini 3 Pro, Google’s state-of-the-art foundation model. Google describes Gemini 3 Pro as its most factual model and says it is trained to minimize hallucinations during complex tasks.

Gemini Deep Research can still produce research reports, but the update expands its role. It is now designed to synthesize large amounts of information and to handle a very large volume of context supplied in the prompt.

That matters because deep research agents are not just answering a single question. They are expected to follow threads, compare information, and assemble a useful result from many pieces of input.

Google says customers use the tool for tasks ranging from due diligence to drug toxicity safety research. Those examples show the kind of work Google wants this agent to handle: information-heavy, multi-step tasks where factual accuracy is central to the result.

Developers Get A New Way To Embed Research Agents

The biggest shift goes beyond the Gemini Deep Research interface itself. Google is also making it possible for developers to embed its state-of-the-art (SOTA) research capabilities into their own apps.

That is enabled by Google’s new Interactions API. The source describes the API as a way to give developers more control as AI systems become more agentic.

For developers, this changes the role of Gemini Deep Research from a standalone destination into a capability that can sit inside other products. Instead of asking users to leave an app and use a separate research tool, developers may be able to bring that research workflow directly into the software people already use.

Google also says it will soon integrate the new deep research agent into several of its own services, including Google Search, Google Finance, its Gemini App, and NotebookLM.

That planned integration points to a broader shift in user behavior. The source frames it as another step toward a world where people do not search manually in the same way, because their AI agents can do more of that work for them.

Why Hallucinations Matter More For Deep Agents

AI hallucinations are a known problem for large language models. In plain terms, a hallucination happens when the model makes something up.

Even in a short answer, a hallucination can be damaging. In a long-running agent task, the risk becomes more serious because the system may make many autonomous choices while working toward an answer.

The source highlights why that is especially important for deep reasoning and agentic tasks that may run over minutes, hours, or longer. If a system makes many decisions, even one fabricated step can weaken or invalidate the final output.
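The arithmetic behind that concern can be sketched in a few lines. This is an illustration only, not anything Google published: it assumes every step has the same small chance of containing a fabrication and that steps fail independently of one another, which real agents may not satisfy. The function name and the 1% error rate are hypothetical values chosen for the example.

```python
def chance_of_any_fabrication(per_step_error_rate: float, steps: int) -> float:
    """Probability that at least one of `steps` independent steps is fabricated.

    Assumes (hypothetically) a constant, independent per-step error rate.
    """
    if not 0.0 <= per_step_error_rate <= 1.0:
        raise ValueError("per_step_error_rate must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    # The chance that every step is clean is (1 - p) ** n,
    # so the chance of at least one bad step is the complement.
    return 1.0 - (1.0 - per_step_error_rate) ** steps


if __name__ == "__main__":
    # Hypothetical 1% per-step error rate, purely for illustration.
    for steps in (1, 10, 50, 200):
        risk = chance_of_any_fabrication(0.01, steps)
        print(f"{steps:>3} steps: {risk:.1%} chance of at least one fabricated step")
```

Under these assumptions, an error rate that looks negligible for a single answer grows quickly with task length: at 1% per step, a 50-step task has roughly a 40% chance of containing at least one fabricated step.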

That is why Google is emphasizing Gemini 3 Pro’s factuality in this release. A research agent is only useful if users can trust that the chain of work behind the answer has not been distorted by invented details.

This is also why benchmarks have become central to the public fight over AI agents. Companies are not only releasing models; they are trying to prove that those models can handle complicated work better than competitors.

The Benchmark Fight Gets Crowded

Google created a benchmark called DeepSearchQA to support its progress claims. The benchmark is intended to test agents on complex, multi-step information-seeking tasks, and Google has open sourced it.

The company also tested Gemini Deep Research on Humanity’s Last Exam, an independent general knowledge benchmark filled with extremely niche tasks, and BrowseComp, a benchmark focused on browser-based agentic tasks.

According to the source, Google’s new agent beat the competition on DeepSearchQA and Humanity’s Last Exam. OpenAI’s ChatGPT 5 Pro was close behind in those comparisons and slightly beat Google on BrowseComp.

Those results were quickly complicated by timing. The same day Google published its benchmark comparisons, OpenAI launched GPT 5.2. OpenAI says its newest model beats rivals, especially Google, across a suite of typical benchmarks, including OpenAI’s own benchmark.

That leaves the public comparison unsettled. Google used its release to show progress in deep research agents, while OpenAI used the same day to argue that GPT 5.2 had moved the competitive line again.

What The Timing Says About The AI Agent Race

The release schedule may be one of the clearest signals in the story. The source notes that the world was awaiting the release of Garlic when Google dropped AI news of its own.

That timing suggests the major AI companies are not only competing on model quality and product features. They are also competing for attention at the exact moments when developers, customers, and the broader market are watching.

For Google, Gemini Deep Research now carries several messages at once. It is a research report tool, a developer capability through the Interactions API, a future layer inside Google Search and other services, and a benchmark-tested agent meant to show progress in complex information work.

For OpenAI, GPT 5.2 immediately challenged the freshness of Google’s comparisons. The result is a fast-moving contest in which benchmark wins can be meaningful and short-lived at the same time.

The practical takeaway is simple: AI research agents are becoming a more important battleground. The companies building them are trying to make them more factual, more useful inside apps, and more capable of handling the kind of multi-step work that once required users to search, read, compare, and synthesize on their own.