Google DeepMind has unveiled Gemini Robotics 1.5 and Gemini Robotics-ER 1.5, the first AI models that enable robots to “think” before taking action, marking what researchers call the dawn of agentic robots. This breakthrough represents a fundamental shift from task-specific robotic programming to general-purpose AI that can adapt to new situations without reprogramming, potentially transforming how robots operate in real-world environments.
How it works: The system uses two complementary AI models that work together to enable more sophisticated robotic behavior.
- Gemini Robotics-ER 1.5 serves as the “thinking” model, processing visual and text input to generate step-by-step natural language instructions for complex tasks.
- Gemini Robotics 1.5 acts as the action model, translating those instructions into actual robot movements while conducting its own reasoning process for each step.
- Both models are built on Google’s Gemini foundation but fine-tuned specifically for physical space operations.
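The division of labor described above can be illustrated with a minimal sketch. It is not DeepMind's implementation and calls no real Gemini API: `toy_planner`, `toy_executor`, `run_task`, and `RunLog` are hypothetical names invented here, and the stubs only stand in for the "thinking" model and the action model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical type aliases; these are illustrative, not real Gemini interfaces.
Planner = Callable[[str, str], List[str]]  # (task, scene description) -> ordered steps
Executor = Callable[[str], bool]           # one natural-language step -> success flag


@dataclass
class RunLog:
    """Records which steps finished and where execution stopped, if it did."""
    completed: List[str] = field(default_factory=list)
    failed_step: Optional[str] = None


def run_task(task: str, scene: str, plan: Planner, act: Executor) -> RunLog:
    """Ask the planner for steps, then hand each step to the executor in order."""
    log = RunLog()
    for step in plan(task, scene):
        if not act(step):
            # Stop at the first step the executor cannot carry out.
            log.failed_step = step
            break
        log.completed.append(step)
    return log


def toy_planner(task: str, scene: str) -> List[str]:
    """Stub for the 'thinking' model: returns a fixed three-step plan."""
    return [f"locate items for: {task}", f"carry out: {task}", "confirm the result"]


def toy_executor(instruction: str) -> bool:
    """Stub for the action model: pretends every instruction succeeds."""
    return True


if __name__ == "__main__":
    result = run_task("sort the laundry by color", "a table with clothes",
                      toy_planner, toy_executor)
    print(result.completed)
```

Keeping the planner and executor behind separate callables mirrors the article's point that the reasoning component and the action component are distinct models, so either stub could be swapped out without touching the loop.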
Why this matters: Current robots require intensive training for specific tasks and struggle with anything outside their programming, so deploying even a single-task installation can take months.
- “Robots today are highly bespoke and difficult to deploy, often taking many months in order to install a single cell that can do a single task,” said Carolina Parada, head of robotics at Google DeepMind.
- The generative AI approach enables robots to handle entirely new situations and workspaces without reprogramming.
Key breakthrough: Gemini Robotics-ER 1.5 introduces simulated reasoning capabilities to robotics AI, similar to modern chatbots’ thinking processes.
- The model can call external tools like Google Search to gather additional information when planning tasks.
- It achieves top performance on both academic and internal benchmarks that measure decisions about interacting with physical space.
- “There are all these kinds of intuitive thoughts that help [a person] guide this task, but robots don’t have this intuition,” explained Kanishka Rao, a researcher at DeepMind.
Cross-platform capabilities: The system can transfer skills across different robot embodiments without specialized tuning.
- DeepMind tests the technology on various machines including the two-armed Aloha 2 and humanoid Apollo robots.
- Skills learned from Aloha 2’s grippers can transfer to Apollo’s more complex hands automatically.
- This eliminates the previous need to create customized models for each robot type.
Current availability: While the system is still in early deployment, developers can begin experimenting with the reasoning component.
- Gemini Robotics 1.5 (the action model) remains limited to trusted testers only.
- Gemini Robotics-ER 1.5 (the thinking model) is now available in Google AI Studio for developer experiments.
- The technology enables more complex multi-stage tasks, bringing agentic capabilities to robotics for the first time.