TechCrunch AI April 17, 2025 TERMINATOR

Experts Say Gemini 2.5 Pro Report Leaves AI Safety Gaps

Google released a technical report for Gemini 2.5 Pro weeks after making the model public, but experts told TechCrunch the document does not contain enough safety detail. The criticism centers on timing, missing dangerous capability findings, and whether Google is meeting its own public commitments on AI transparency.

WTF Index: TERMINATOR (Terminator 2, Idiocracy 0)

The story centers on insufficient safety transparency for a powerful public AI model, raising concerns about unmanaged risk rather than societal deskilling.

Google’s safety documentation for Gemini 2.5 Pro has drawn criticism from AI policy and governance experts who say the company has not provided enough information for outsiders to understand the model’s risks.

The report arrived on Thursday, weeks after Google launched Gemini 2.5 Pro, which the company described as its most powerful AI model yet. While technical reports are often used to help researchers and safety reviewers assess model behavior, several experts told TechCrunch that this one leaves major questions unanswered.

Why The Gemini 2.5 Pro Report Matters

Technical reports occupy an important place in the AI ecosystem. They can reveal details that companies may not emphasize in product announcements, including safety evaluation results, model limitations, and areas where systems performed poorly.

For researchers, policymakers, and AI safety groups, these documents can serve as a starting point for independent analysis. They are not a substitute for full access to internal testing, but they can show how a company is thinking about risk before or after a model reaches the public.

That is why the Gemini 2.5 Pro report has become a flashpoint. The concern is not simply that Google published a short document. The deeper issue is whether the report contains enough substance to let outsiders judge the safety and security of a model that is already available.

Experts Say The Details Are Too Thin

Peter Wildeford, co-founder of the Institute for AI Policy and Strategy, told TechCrunch that the report does not provide enough information to verify Google’s public commitments.

“This [report] is very sparse, contains minimal information, and came out weeks after the model was already made available to the public,” Wildeford said.

Wildeford added that the lack of detail makes it impossible to assess whether Google is living up to its promises or to evaluate the safety and security of its models.

Another concern is how the report handles Google’s proposed Frontier Safety Framework. Google introduced the FSF last year and described it as an effort to identify future AI capabilities that could cause “severe harm.” Experts told TechCrunch that the Gemini 2.5 Pro report does not discuss that framework in great detail.

For an AI model safety report, that gap matters because the FSF is tied to Google’s stated approach for identifying severe risks. If a public report says little about how that approach applies to a major model, outside observers have less basis for judging whether the framework is shaping real release decisions.

Google’s Reporting Approach Is Different

Google does not publish safety reports in exactly the same way as some of its AI rivals. According to TechCrunch, the company publishes technical reports only after it considers a model to have moved beyond the “experimental” stage.

Google also does not include all of its “dangerous capability” evaluation results in those write-ups. Instead, it reserves those findings for a separate audit.

That distinction is central to the criticism. A technical report may exist, but if key safety findings are not included, readers may still lack the details needed to understand what risks were tested, what was found, and how those findings affected release decisions.

Thomas Woodside, co-founder of the Secure AI Project, said he was glad Google released a report for Gemini 2.5 Pro. But he also said he was not convinced that Google is committed to delivering timely supplemental safety evaluations.

Woodside pointed to the last time Google published dangerous capability test results: June 2024, for a model announced in February of that same year. He also noted that Google had not made available a report for Gemini 2.5 Flash, a smaller and more efficient model announced last week. A Google spokesperson told TechCrunch that a report for Flash is “coming soon.”

A Wider Transparency Problem In AI

The criticism of Google is part of a broader debate about AI transparency. TechCrunch noted that Google is not the only major AI lab facing questions about whether it is giving the public enough safety information.

Meta released what TechCrunch described as a similarly skimpy safety evaluation for its new Llama 4 open models. OpenAI, meanwhile, did not publish any report for its GPT-4.1 series.

Kevin Bankston, a senior adviser on AI governance at the Center for Democracy and Technology, described the pattern as a “race to the bottom” on AI safety.

Bankston connected Google’s sparse documentation with reports that competing labs like OpenAI have reduced safety testing time before release from months to days. In his view, the result is a troubling story about companies rushing models to market while offering less transparency about the testing behind them.

The Commitments Now Under Scrutiny

The pressure on Google is sharpened by its past assurances to regulators. Two years ago, Google told the U.S. government it would publish safety reports for all “significant” public AI models “within scope.” The company later made similar commitments to other countries, pledging to “provide public transparency” around AI products.

Those promises raise the stakes for every major model report. If the public document is too vague, critics argue that it becomes difficult to tell whether the company is meeting the spirit of those commitments.

Google has said that it conducts safety testing and “adversarial red teaming” for models before release, even when those details are not included in its technical reports. That statement addresses whether testing happens, but it does not resolve the transparency question: what should companies disclose, when should they disclose it, and how much detail is enough for meaningful outside scrutiny?

For Gemini 2.5 Pro, the immediate answer from several experts is that Google has not disclosed enough. The larger issue is whether AI labs will treat safety reporting as a core public accountability tool or as a limited post-release formality.