Google Cloud Next brings AI agents, chips and Gemini into focus

Google used its annual Cloud Next conference to introduce AI updates across chips, models, agent collaboration, media tools, Workspace automation, and scientific computing. The announcements show a broad push to make Google Cloud a platform for inference, enterprise workflows, creative generation, and research workloads.

Google’s annual Cloud Next conference delivered a wide set of AI announcements, spanning the hardware that runs models, the software that coordinates AI agents, and the applications that bring automation into everyday work.

The common thread is Google Cloud’s attempt to cover more of the AI stack. The company introduced new infrastructure for inference, a faster Gemini model variant, an open protocol for agent collaboration, updates to media-generation systems, new Workspace automation tools, and cloud services aimed at scientific computing.

Ironwood moves Google’s TPU focus toward inference

The largest infrastructure announcement was Ironwood, Google’s seventh-generation Tensor Processing Unit. Google describes the chip as built specifically for inference, the process of generating outputs from AI models.

According to Google, Ironwood delivers up to 3,600 times the performance of the original TPU. The company also calls it its most powerful and energy-efficient TPU so far.

Ironwood uses liquid-cooled chips and will be available in pod configurations ranging from 256 to 9,216 chips. The largest configuration reaches 42.5 exaflops of computing power, which Google says is 24 times greater than the projected peak of El Capitan, currently the fastest supercomputer.

Each Ironwood chip includes 192 GB of high-bandwidth memory, 7.2 terabytes per second of memory bandwidth, and 1.2 terabits per second of chip-to-chip interconnect bandwidth. Those figures matter because inference at large scale is not only about raw compute. Memory capacity, memory bandwidth, and interconnect speed all affect how efficiently models can serve responses.
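The pod figures above can be broken down with simple arithmetic. The sketch below uses only numbers quoted in this article and assumes that the pod total is a plain sum over identical chips, which Google's headline figures imply but do not spell out. The function and variable names are illustrative.

```python
# Figures quoted in the article; everything else is derived arithmetic.
POD_CHIPS_MIN = 256
POD_CHIPS_MAX = 9_216
POD_PEAK_EXAFLOPS = 42.5   # largest pod, per Google
HBM_PER_CHIP_GB = 192
EL_CAPITAN_RATIO = 24      # Google's "24 times" comparison


def per_chip_petaflops(pod_exaflops: float, chips: int) -> float:
    """Peak compute per chip, assuming the pod total is a sum over chips."""
    return pod_exaflops * 1_000 / chips


def pod_hbm_terabytes(chips: int, gb_per_chip: int) -> float:
    """Total high-bandwidth memory in a pod, in decimal terabytes."""
    return chips * gb_per_chip / 1_000


chip_pf = per_chip_petaflops(POD_PEAK_EXAFLOPS, POD_CHIPS_MAX)
small_pod_exaflops = chip_pf * POD_CHIPS_MIN / 1_000
large_pod_hbm_tb = pod_hbm_terabytes(POD_CHIPS_MAX, HBM_PER_CHIP_GB)
implied_el_capitan_exaflops = POD_PEAK_EXAFLOPS / EL_CAPITAN_RATIO

print(f"per chip:            {chip_pf:.2f} petaflops")
print(f"256-chip pod:        {small_pod_exaflops:.2f} exaflops")
print(f"9,216-chip pod HBM:  {large_pod_hbm_tb:.1f} TB")
print(f"implied El Capitan:  {implied_el_capitan_exaflops:.2f} exaflops")
```

Under that assumption, each chip works out to roughly 4.6 petaflops, the smallest pod to about 1.18 exaflops, and the largest pod to about 1.77 petabytes of high-bandwidth memory.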

Gemini 2.5 Flash targets speed and cost control

Google also introduced Gemini 2.5 Flash, a faster and more cost-efficient version of Gemini 2.5 Pro. The model is designed to let organizations adjust the level of reasoning, creating a way to balance model performance against cost.
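One way to picture that trade-off is as a budget decision: pick the largest amount of reasoning a request can afford. The sketch below is purely illustrative. It does not call any Google API, the prices are made-up placeholders in arbitrary currency units, and it assumes reasoning tokens are billed like output tokens, which the announcement does not confirm.

```python
from typing import Optional, Sequence


def estimate_cost(input_tokens: int, output_tokens: int, reasoning_tokens: int,
                  price_in: float, price_out: float) -> float:
    """Estimated request cost; assumes reasoning tokens are billed as output."""
    return input_tokens * price_in + (output_tokens + reasoning_tokens) * price_out


def largest_affordable_budget(cost_cap: float, input_tokens: int, output_tokens: int,
                              price_in: float, price_out: float,
                              budgets: Sequence[int]) -> Optional[int]:
    """Return the largest reasoning budget whose estimated cost fits the cap."""
    affordable = [
        b for b in budgets
        if estimate_cost(input_tokens, output_tokens, b, price_in, price_out) <= cost_cap
    ]
    return max(affordable) if affordable else None


# Hypothetical per-token prices and budget tiers, not real Gemini pricing.
PRICE_IN = 1e-6
PRICE_OUT = 4e-6
BUDGETS = (0, 1024, 4096, 16384)

chosen = largest_affordable_budget(0.02, 2000, 500, PRICE_IN, PRICE_OUT, BUDGETS)
too_tight = largest_affordable_budget(0.003, 2000, 500, PRICE_IN, PRICE_OUT, BUDGETS)
print(chosen, too_tight)
```

With these placeholder numbers, the looser cap allows a mid-sized reasoning budget, while the tighter cap rules out even a request with no reasoning at all.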

Gemini 2.5 Flash will be available soon through Vertex AI. For businesses already building on Google Cloud, that puts the model on the same platform as Google’s other AI services and infrastructure.

The core Gemini 2.5 model now also powers Google’s Deep Research feature. Google product manager Logan Kilpatrick says early tests show users prefer it 2:1 over "other products."

Agent2Agent aims to connect autonomous AI systems

Another major announcement was Agent2Agent, or A2A, an open protocol intended to let autonomous AI agents work together across platforms. The protocol is built on existing standards including HTTP and JSON-RPC, supports inputs such as text, audio, and video, and is designed to support authentication and authorization schemes on par with those in OpenAPI.

The basic idea is that agents should be able to describe what they can do and coordinate on how work should be handled. In A2A, agents can use "agent cards" to advertise their capabilities. They can also negotiate task formats or user interfaces dynamically.

Google demonstrated the concept through an automated hiring process inside its Agentspace interface. In that example, one agent coordinates with others to identify suitable candidates, schedule interviews, and conduct background checks.
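To make the idea concrete, here is a minimal sketch of how a coordinating agent might look up a peer by its advertised capability and address it with a JSON-RPC 2.0 request. Only the JSON-RPC envelope follows a published standard. The AgentCard fields, the skill names, the example.com URLs, and the "tasks/send" method name are assumptions made for illustration and are not taken from Google's specification.

```python
import json
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional


@dataclass
class AgentCard:
    """Hypothetical, simplified agent card: a name, an endpoint, and skills."""
    name: str
    url: str
    skills: List[str] = field(default_factory=list)


def find_agent(cards: List[AgentCard], skill: str) -> Optional[AgentCard]:
    """Return the first card that advertises the requested skill, if any."""
    return next((card for card in cards if skill in card.skills), None)


_request_ids = count(1)


def build_request(method: str, params: dict) -> str:
    """Serialize a JSON-RPC 2.0 request envelope."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
        "params": params,
    })


# Cards modeled loosely on the hiring demo; all values are placeholders.
cards = [
    AgentCard("sourcing", "https://agents.example.com/sourcing", ["find_candidates"]),
    AgentCard("calendar", "https://agents.example.com/calendar", ["schedule_interview"]),
    AgentCard("screening", "https://agents.example.com/screening", ["background_check"]),
]

target = find_agent(cards, "schedule_interview")
payload = build_request("tasks/send", {"skill": "schedule_interview",
                                       "candidate": "candidate-001"})
print(target.url)
print(payload)
```

In a real deployment the payload would be sent over HTTP to the URL on the selected card, and the cards themselves would be fetched from the agents rather than hard-coded.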

A production-ready version of the Agent2Agent protocol is expected later this year. If it works as described, the protocol would give developers and organizations a shared framework for connecting agents instead of treating each one as an isolated tool.

Media models gain editing, music, image, and voice features

Google updated several generative AI models for media creation. The changes cover video, music, images, and sound generation, with a focus on adding more direct editing and production capabilities.

Veo 2, Google’s video model, now supports editing functions including inpainting, outpainting, camera control, and interpolation for smoother transitions. Inpainting can remove elements from video, while outpainting can expand content beyond its original frame. These features are available in preview for selected testers.

Lyria, Google’s text-to-music model, is now available. It can generate music from short text prompts and adapt to genre, mood, and tempo, with potential uses including podcasts, marketing, and video content.

Imagen 3 received improved inpainting capabilities for removing objects or reconstructing damaged areas. Google says the model now has significantly enhanced transitions and detail reproduction.

Chirp 3, Google’s audio generation model, adds two features: "Instant Custom Voice," which can synthesize a voice from just ten seconds of audio, and speaker segmentation (diarization), which separates individual voices in multi-person recordings.

Google says its generative models include safeguards such as SynthID digital watermarks, content filters, and privacy controls. The company also offers a liability guarantee in case of copyright disputes involving generated content.

Workspace and science tools bring AI into practical workflows

Google is expanding AI inside its enterprise productivity suite with Workspace Flows, an automation platform that works across Docs, Sheets, Meet, and Chat. The system uses custom Gemini-powered chatbots called "Gems" to automate workflows.

Examples include handling customer service requests, reviewing policy documents, and triaging support tickets. Individual Workspace apps are also gaining AI features.

  • Google Docs can generate full audio versions of documents or podcast-style summaries, and its "Help me refine" feature can suggest improvements to clarity, structure, and tone.
  • Google Sheets is adding "Help me analyze," which can evaluate data and suggest visualizations or next steps.
  • Google Vids can create realistic video content using Veo 2 without external editing software.
  • Google Meet can use Gemini to summarize meetings or answer specific questions.
  • Google Chat can summon Gemini with the @ command to extract decisions, open questions, or next steps from conversations.

Google Cloud is also adding infrastructure and applications for scientific computing. New H4D virtual machines are optimized for highly parallel, CPU-based workloads, including molecular dynamics, climate modeling, and materials science.

The H4D VMs use AMD processors and Google’s Titanium network to improve data transmission and support cloud-based supercomputing clusters. Cluster Toolkit and Cluster Director are available to coordinate multi-node deployments.

Google is also rolling out AI-supported scientific applications. AlphaFold 3, developed by Google DeepMind, can predict the structure and interactions of biological molecules, and a high-throughput version on Google Cloud can process large sequence datasets to accelerate disease research. WeatherNext, which builds on work from Google Research, provides customizable, high-resolution weather forecasting through Vertex AI.

Taken together, the announcements show Google placing AI across cloud infrastructure, model access, agent coordination, content generation, office productivity, and scientific workloads. Cloud Next was not a single-product launch. It was a platform-wide statement about where Google wants AI development and deployment to happen.