Now docking at Contemplation Station: DeepMind’s new AI models let robots think before they act

Google DeepMind has unveiled Gemini Robotics 1.5 and Gemini Robotics-ER 1.5, the first AI models that enable robots to “think” before taking action, marking what researchers call the dawn of agentic robots. This breakthrough represents a fundamental shift from task-specific robotic programming to general-purpose AI that can adapt to new situations without reprogramming, potentially transforming how robots operate in real-world environments.

How it works: The system uses two complementary AI models that work together to enable more sophisticated robotic behavior.

  • Gemini Robotics-ER 1.5 serves as the “thinking” model, processing visual and text input to generate step-by-step natural language instructions for complex tasks.
  • Gemini Robotics 1.5 acts as the action model, translating those instructions into actual robot movements while conducting its own reasoning process for each step.
  • Both models are built on Google’s Gemini foundation but fine-tuned specifically for physical space operations.
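The division of labor described above can be sketched as a simple planner/executor loop. This is a minimal illustration only: `PlannerExecutorLoop`, `toy_planner`, and `toy_actor` are hypothetical names invented for this example, and the two callables are stand-ins for the thinking and action models, not DeepMind's actual interfaces.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PlannerExecutorLoop:
    # Hypothetical stand-in for the "thinking" model: task text -> ordered steps.
    plan: Callable[[str], List[str]]
    # Hypothetical stand-in for the action model: one step -> True on success.
    act: Callable[[str], bool]
    log: List[str] = field(default_factory=list)

    def run(self, task: str) -> bool:
        """Ask the planner for steps, then execute them one at a time."""
        for step in self.plan(task):
            ok = self.act(step)
            self.log.append(f"{'done' if ok else 'failed'}: {step}")
            if not ok:
                # Stop at the first failed step instead of continuing blindly.
                return False
        return True


def toy_planner(task: str) -> List[str]:
    # Assumption: a real planner would reason over camera images and text.
    return [
        f"locate items for: {task}",
        f"pick up items for: {task}",
        f"place items for: {task}",
    ]


def toy_actor(step: str) -> bool:
    # Assumption: a real action model would emit motor commands here.
    return True


loop = PlannerExecutorLoop(plan=toy_planner, act=toy_actor)
result = loop.run("sort laundry by color")
```

Keeping the planner and the executor behind separate callables mirrors the article's point that the reasoning model and the action model are distinct components that can be swapped or tested independently.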

Why this matters: Current robots require intensive training for specific tasks and struggle with anything outside their programming, so deploying even a single-task installation can take months.

  • “Robots today are highly bespoke and difficult to deploy, often taking many months in order to install a single cell that can do a single task,” said Carolina Parada, head of robotics at Google DeepMind.
  • The generative AI approach enables robots to handle entirely new situations and workspaces without reprogramming.

Key breakthrough: Gemini Robotics-ER 1.5 introduces simulated reasoning capabilities to robotics AI, similar to modern chatbots’ thinking processes.

  • The model can call external tools like Google Search to gather additional information when planning tasks.
  • It achieves top performance in both academic and internal benchmarks for physical space interaction decisions.
  • “There are all these kinds of intuitive thoughts that help [a person] guide this task, but robots don’t have this intuition,” explained Kanishka Rao, a researcher at DeepMind.

Cross-platform capabilities: The system can transfer skills across different robot embodiments without specialized tuning.

  • DeepMind tests the technology on various machines including the two-armed Aloha 2 and humanoid Apollo robots.
  • Skills learned from Aloha 2’s grippers can transfer to Apollo’s more complex hands automatically.
  • This eliminates the previous need to create customized models for each robot type.

Current availability: While the system is still in early deployment, developers can begin experimenting with the reasoning component.

  • Gemini Robotics 1.5 (the action model) remains limited to trusted testers only.
  • Gemini Robotics-ER 1.5 (the thinking model) is now available in Google AI Studio for developer experiments.
  • The technology enables more complex multi-stage tasks, bringing agentic capabilities to robotics for the first time.
Google DeepMind unveils its first “thinking” robotics AI
