Google DeepMind’s new AI models enable robots to understand, adapt to complex tasks on the fly

Google DeepMind has introduced new AI models designed to change how robots interact with the physical world. The models are a step toward closing the gap between today’s specialized industrial robots and future general-purpose robot assistants that can understand and adapt to complex environments on their own. They target one of the hardest problems in robotics: building AI systems sophisticated enough to control robots safely in novel situations.

The big picture: Google DeepMind has introduced two specialized AI models—Gemini Robotics and Gemini Robotics-ER—built on its Gemini 2.0 foundation to serve as sophisticated “brains” for robots.

  • Gemini Robotics features “vision-language-action” capabilities that enable robots to process visual information, understand natural language commands, and generate appropriate physical movements.
  • Gemini Robotics-ER focuses on “embodied reasoning” with enhanced spatial understanding, allowing roboticists to integrate it with existing robot control systems.

How it works: The models can interpret natural language commands and visual input to perform delicate tasks that previously challenged robotics systems.

  • Users can give conversational instructions like “pick up the banana and put it in the basket,” and the system will recognize objects in its camera view and guide robotic arms to complete the task.
  • More complex commands such as “fold an origami fox” are also possible, with the system applying knowledge of origami techniques to carefully manipulate paper.

Real-world applications: Google is partnering with several robotics companies to implement these models across diverse platforms.

  • The technology is being integrated with Apptronik’s Apollo humanoid robot, potentially advancing the development of general-purpose robot assistants.
  • Figure’s humanoid robot and Sanctuary AI’s Phoenix robot are also using these models, suggesting broad commercial potential.

Why this matters: Creating AI systems capable of safely controlling robots through unfamiliar scenarios has been a persistent challenge in robotics, often referred to as a “holy grail” that could transform robots into versatile physical-world workers.

  • While robot hardware has been steadily improving, the intelligence to pilot these systems autonomously has remained elusive until recent developments.

The competitive landscape: Google’s announcement positions it alongside other major tech companies racing to develop embodied AI for robotics.

  • Nvidia has identified embodied AI as a “moonshot goal,” highlighting the industry-wide recognition of this technology’s transformative potential.
Google’s origami-folding AI brain may power new wave of humanoid robots
