New Gemini models push robots toward agentic AI

Google DeepMind has introduced Gemini Robotics 1.5 and Gemini Robotics-ER 1.5 for robots that can plan, reason, and act. The two models split high-level decision-making from physical execution, while adding safety checks and support for multiple robot platforms.


Google DeepMind is moving its robotics work closer to agentic AI with two new models designed to help machines understand tasks, make plans, and carry out actions in the physical world.

The models are Gemini Robotics 1.5 and Gemini Robotics-ER 1.5. Together, they combine multimodal perception, language processing, motor control, and internal decision-making for robots that need to handle complex instructions rather than simply follow narrow commands.

Two Models With Different Jobs

The clearest way to understand the announcement is as a division of labor. Gemini Robotics-ER 1.5 acts as the high-level planning system, while Gemini Robotics 1.5 turns those plans into movement.

Gemini Robotics-ER 1.5 is described as a high-level "brain" for robots. It is responsible for task planning, natural language communication, tool use, and tracking whether a task is progressing successfully.

That tool use includes digital tools such as Google Search. In practice, this means the robot’s planning layer is not limited to what it can directly see or already know from its robotics model. It can use outside digital information as part of the task process, while still coordinating with the physical system that performs the action.
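The division of labor described above can be pictured with a small Python sketch. It is an illustration only: the `Planner`, `Executor`, `Step`, and `run` names are hypothetical, the `search` tool is a stand-in function, and nothing here reflects the actual Gemini Robotics interfaces. It assumes the planning layer produces natural-language steps and the action layer reports success or failure for each one.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    instruction: str   # natural-language sub-task handed to the action layer
    done: bool = False


class Planner:
    """Hypothetical stand-in for the high-level planning role."""

    def __init__(self, tools: Dict[str, Callable[[str], str]]):
        self.tools = tools  # assumed digital tools, e.g. a search function

    def plan(self, goal: str) -> List[Step]:
        # A real planner would reason over images and language. This stub
        # consults one tool and emits two fixed steps.
        rule = self.tools["search"](goal)
        return [Step(f"pick up item ({rule})"), Step("place item in bin")]


class Executor:
    """Hypothetical stand-in for the action role."""

    def execute(self, step: Step) -> bool:
        # A real executor would produce motor commands. This stub always succeeds.
        return True


def run(goal: str, planner: Planner, executor: Executor) -> List[Step]:
    steps = planner.plan(goal)
    for step in steps:
        step.done = executor.execute(step)
        if not step.done:
            break  # the planning layer tracks progress and stops on failure
    return steps


steps = run("sort recycling", Planner({"search": lambda q: "local rules"}), Executor())
```

The point of the split is that the planner never touches motors and the executor never decides what the task is. Either side could be swapped out without changing the other.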

According to Google DeepMind, Gemini Robotics-ER 1.5 delivers state-of-the-art results on 15 embodied reasoning benchmarks. The named benchmarks include Point-Bench, ERQA, and MindCube.

Reasoning Before Movement

Gemini Robotics 1.5 handles the action side. Its role is to translate a plan into physical behavior that a robot can execute.

The key change is that the model reasons before acting. Instead of going directly from visual input and language to motion, it builds internal logic chains, plans intermediate steps, breaks down complex tasks, and can explain its decisions.

That matters because many real-world robot tasks are not single-step problems. A robot may need to identify the goal, understand constraints, choose an object, decide how to grip it, and then move it correctly. Google DeepMind's example is laundry sorting: the model identifies a goal such as "light-colored clothes in the white bin," then plans the grip and performs the movement.
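To make the laundry example concrete, here is a minimal Python sketch of "reason first, then act." Everything in it is assumed for illustration: the `reason_then_act` function, the `LIGHT_COLORS` set, the "dark bin" fallback, and the "pinch" grasp are invented names, not details from the announcement. The sketch only shows the shape of the idea, which is that an explainable trace of intermediate steps is built before any action is returned.

```python
from typing import Dict, List, Tuple

# Assumption for illustration: which colors count as "light".
LIGHT_COLORS = {"white", "cream", "beige"}


def reason_then_act(item_color: str) -> Tuple[List[str], Dict[str, str]]:
    """Build an explainable trace of intermediate steps, then return an action."""
    trace: List[str] = []

    # Step 1: identify the goal for this item.
    target = "white bin" if item_color in LIGHT_COLORS else "dark bin"
    trace.append(f"goal: {item_color} item belongs in the {target}")

    # Step 2: plan the grip (a fixed, hypothetical choice here).
    grip = "pinch"
    trace.append(f"grip: use a {grip} grasp on the fabric")

    # Step 3: plan the movement.
    trace.append(f"move: carry item to the {target} and release")

    # Only after the reasoning steps is an action produced.
    return trace, {"grip": grip, "target": target}


trace, action = reason_then_act("white")
```

Because the trace is returned alongside the action, a caller can log or display why a particular bin was chosen, which mirrors the claim that the model can explain its decisions.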

This kind of process brings robotics closer to the agentic AI pattern already developing on computers. A system receives an objective, plans the work, uses tools where needed, monitors progress, and adjusts its actions toward the goal.
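The agentic pattern in the paragraph above (receive an objective, plan, act, monitor, adjust) can be written as a generic loop. The following Python sketch is a simplified illustration with hypothetical names (`agent_loop`, `plan`, `act`, `is_done`). It is not how the Gemini Robotics models are implemented, only one plain way to express the pattern.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, bool]]  # (step, succeeded) pairs


def agent_loop(
    objective: List[str],
    plan: Callable[[List[str], History], List[str]],
    act: Callable[[str], bool],
    is_done: Callable[[History], bool],
    max_iters: int = 10,
) -> History:
    history: History = []
    steps = plan(objective, history)          # initial plan
    for _ in range(max_iters):
        if is_done(history):                  # monitor progress
            return history
        if not steps:
            steps = plan(objective, history)  # nothing queued: plan again
            if not steps:
                break
        step = steps.pop(0)
        ok = act(step)                        # act in the world
        history.append((step, ok))
        if not ok:
            steps = plan(objective, history)  # adjust after a failure
    return history


# Usage with toy callbacks: step "a" fails once, then succeeds on retry.
attempts = {}

def act(step: str) -> bool:
    attempts[step] = attempts.get(step, 0) + 1
    return not (step == "a" and attempts[step] == 1)

def plan(objective: List[str], history: History) -> List[str]:
    finished = {s for s, ok in history if ok}
    return [s for s in objective if s not in finished]

def is_done(history: History) -> bool:
    return {s for s, ok in history if ok} >= {"a", "b"}

history = agent_loop(["a", "b"], plan, act, is_done)
```

In this toy run the first attempt at "a" fails, the loop replans, retries "a", and then completes "b". The `max_iters` cap is a design choice to keep a stuck agent from looping forever.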

Working Across Robot Platforms

Google DeepMind says both models can generalize abilities across different robot types. That is an important practical point because robotics systems vary widely in hardware, movement range, and physical design.

Google DeepMind gives a specific example: movement patterns learned with the ALOHA 2 robot also work on platforms like Apptronik's Apollo or the two-armed Franka robot, with no extra fine-tuning required.

For robotics, that kind of transfer is central to usefulness. If a capability only works on one machine, it remains tightly bound to that platform. If it carries over to other robot bodies, the same model family can become more broadly applicable.

The models are based on the broader Gemini multimodal family but have been adapted specifically for robotics. That adaptation is what connects general perception and language understanding to the motor-control requirements of physical machines.

Safety Checks Before Action

The models also include built-in safety checks. Before an action is executed, Gemini Robotics 1.5 checks whether the move is safe.

If needed, it can trigger features such as collision avoidance. The available material does not describe the full safety system in detail, but the basic structure is clear: safety evaluation happens before the robot carries out the physical move.
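The "check first, then move" structure can be illustrated with a small Python sketch. This is an assumption-heavy toy, not the actual safety system: the `is_safe` and `gated_move` functions, the 2D point obstacles, and the `clearance` threshold are all hypothetical. The check here is a simple geometric one, testing whether a straight-line move passes too close to any obstacle.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float]


def segment_point_distance(a: Point, b: Point, p: Point) -> float:
    """Shortest distance from point p to the segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def is_safe(start: Point, goal: Point, obstacles: List[Point],
            clearance: float = 0.1) -> bool:
    """A move is safe if its path keeps at least `clearance` from every obstacle."""
    return all(segment_point_distance(start, goal, o) >= clearance
               for o in obstacles)


def gated_move(start: Point, goal: Point, obstacles: List[Point],
               execute: Callable[[Point, Point], None]) -> str:
    # The safety evaluation runs before any motion is executed.
    if not is_safe(start, goal, obstacles):
        return "blocked"  # a fuller system might replan around the obstacle
    execute(start, goal)
    return "executed"


moves: List[Point] = []
record = lambda s, g: moves.append(g)
first = gated_move((0.0, 0.0), (1.0, 0.0), [(0.5, 0.0)], record)   # obstacle on path
second = gated_move((0.0, 0.0), (1.0, 0.0), [(0.5, 1.0)], record)  # obstacle far away
```

The design choice worth noting is that `execute` is only reachable through the gate, so an unsafe move never runs in the first place.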

That placement is important. In a physical environment, a flawed instruction is not only a software error. It can become a movement that affects objects, spaces, or people nearby. A robot that plans and acts with more autonomy therefore needs safeguards at the point where reasoning becomes motion.

Availability And The Road So Far

Gemini Robotics-ER 1.5 is now available through the Gemini API in Google AI Studio. Gemini Robotics 1.5 is currently limited to select partners.

The announcement builds on earlier steps in the Gemini Robotics family. Google first introduced the family in March 2025 to give robots multimodal understanding. In June, the company followed with Gemini Robotics On-Device, a local version optimized for fast adaptation and robust dexterity on robotic hardware.

The new models extend that work with stronger planning, improved tool use, and a clearer agentic structure. Rather than treating a robot as a device that only reacts to direct commands, Google DeepMind is positioning these systems as machines that can interpret goals, organize steps, check progress, and act in the world.

The result is not just a new robotics model release. It is a sign of where the field is heading: toward robots that combine perception, language, reasoning, and movement in a single task loop.