
Alter3, a GPT-4 powered humanoid robot, showcases the potential of combining advanced language models with robotics to create more realistic and adaptable robot behaviors.

Harnessing the power of large language models: Alter3 leverages GPT-4’s vast knowledge to directly map natural language commands to robot actions, simplifying the process of controlling the robot’s 43 axes:

  • Researchers at the University of Tokyo and Alternative Machine have designed Alter3 to take advantage of GPT-4’s capabilities, enabling it to perform complex tasks like taking a selfie or mimicking a ghost.
  • GPT-4 first acts as a planner, determining the steps required to perform the desired action. It then uses its in-context learning ability to generate the commands the robot needs to execute each step.
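
The two-stage flow described above (plan the steps, then turn each step into axis commands) can be sketched roughly as follows. This is a minimal illustration, not Alter3's actual code: the `fake_llm` stub stands in for GPT-4, and the `PLAN:`/`CODE:` prompt prefixes, the `axis,value` reply format, the axis numbers, and the 0-255 value range are all assumptions made for the example. Only the count of 43 axes comes from the article.

```python
from typing import Callable, List, Tuple

NUM_AXES = 43                  # axis count reported in the article
VALUE_MIN, VALUE_MAX = 0, 255  # assumed value range, for illustration only

AxisCommand = Tuple[int, int]  # (axis index, target value)
LLM = Callable[[str], str]     # any function mapping a prompt to a text reply


def plan_steps(instruction: str, llm: LLM) -> List[str]:
    """Stage 1: ask the model for movement steps, one per line."""
    reply = llm(f"PLAN: {instruction}")
    return [line.strip() for line in reply.splitlines() if line.strip()]


def step_to_commands(step: str, llm: LLM) -> List[AxisCommand]:
    """Stage 2: ask the model for 'axis,value' lines and validate them."""
    reply = llm(f"CODE: {step}")
    commands: List[AxisCommand] = []
    for line in reply.splitlines():
        if not line.strip():
            continue
        axis_text, value_text = line.split(",")
        axis, value = int(axis_text), int(value_text)
        if not 1 <= axis <= NUM_AXES:
            raise ValueError(f"axis {axis} is outside 1..{NUM_AXES}")
        # Clamp out-of-range values instead of sending them to the motors.
        commands.append((axis, max(VALUE_MIN, min(VALUE_MAX, value))))
    return commands


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for GPT-4 with canned replies."""
    canned = {
        "PLAN: take a selfie": "raise right arm\ntilt head and smile",
        "CODE: raise right arm": "16,255\n17,300",
        "CODE: tilt head and smile": "3,120",
    }
    return canned[prompt]


def run(instruction: str, llm: LLM) -> List[Tuple[str, List[AxisCommand]]]:
    """Plan the action, then generate validated commands for every step."""
    return [(step, step_to_commands(step, llm)) for step in plan_steps(instruction, llm)]


motion = run("take a selfie", fake_llm)
```

Keeping the model behind a plain callable means the same pipeline can be exercised with canned replies, as here, before a real model is connected. The validation step matters because generated commands are not guaranteed to stay within the hardware's limits.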

Refining actions through human feedback: Since language may not always precisely describe physical poses, Alter3 incorporates a feedback loop that allows humans to provide corrections, further improving the robot’s performance:

  • Users can provide feedback such as “Raise your arm a bit more,” which is sent to another GPT-4 agent that reasons over the code, makes necessary corrections, and returns the updated action sequence to the robot.
  • The refined action recipe and code are stored in a database for future use, enabling Alter3 to learn and adapt its behaviors over time.
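
The feedback loop above can be pictured with a small sketch. It assumes, purely for illustration, that an action is a list of `(axis, value)` pairs and that the "database" is an in-memory dictionary. `fake_corrector` is a hypothetical stand-in for the second GPT-4 agent, and the arm axis number and the step size of 30 are invented for the example.

```python
from typing import Callable, Dict, List, Optional, Tuple

AxisCommand = Tuple[int, int]  # (axis index, target value); assumed representation
Corrector = Callable[[List[AxisCommand], str], List[AxisCommand]]


class ActionMemory:
    """Toy stand-in for the database of refined actions."""

    def __init__(self) -> None:
        self._store: Dict[str, List[AxisCommand]] = {}

    def get(self, instruction: str) -> Optional[List[AxisCommand]]:
        return self._store.get(instruction)

    def save(self, instruction: str, commands: List[AxisCommand]) -> None:
        self._store[instruction] = list(commands)


def fake_corrector(commands: List[AxisCommand], feedback: str) -> List[AxisCommand]:
    """Hypothetical correction agent: nudges a made-up arm axis upward."""
    arm_axis = 16  # invented axis number
    if "raise your arm" in feedback.lower():
        return [(a, min(255, v + 30)) if a == arm_axis else (a, v) for a, v in commands]
    return list(commands)


def refine_and_store(
    instruction: str,
    commands: List[AxisCommand],
    feedback_rounds: List[str],
    corrector: Corrector,
    memory: ActionMemory,
) -> List[AxisCommand]:
    """Apply each piece of human feedback in turn, then save the result."""
    for feedback in feedback_rounds:
        commands = corrector(commands, feedback)
    memory.save(instruction, commands)
    return commands


memory = ActionMemory()
initial = [(16, 180), (3, 120)]
refined = refine_and_store(
    "take a selfie", initial, ["Raise your arm a bit more"], fake_corrector, memory
)
```

A later request for the same instruction could check `memory.get(...)` first and reuse the refined action rather than generating it again, which is the reuse the stored recipes are meant to enable.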

Demonstrating emotional expression and realistic behaviors: GPT-4’s extensive knowledge about human behaviors and actions enables Alter3 to create more realistic behavior plans and even mimic emotions:

  • Experiments show that Alter3 can mimic emotions such as embarrassment and joy, even when emotional expressions are not explicitly stated in the text instructions.
  • GPT-4’s linguistic representations of movements can be accurately mapped onto Alter3’s body, resulting in more natural and human-like behaviors.

The growing trend of foundation models in robotics: Alter3 is part of a growing body of research that combines the power of foundation models with robotics systems:

  • Other projects, such as Figure, RT-2-X, and OpenVLA, also utilize foundation models as reasoning and planning modules in robotics control systems, showcasing the potential of this approach.
  • As multi-modality becomes the norm in foundation models, robotics systems will become better equipped to reason about their environment and choose their actions.

Analyzing deeper: While the integration of advanced language models like GPT-4 with robotics systems is a significant step forward, there are still challenges to be addressed:

  • Projects like Alter3 often overlook the basic challenges of creating robots that can perform primitive tasks such as grasping objects, maintaining balance, and moving around.
  • Foundation models fine-tuned specifically for robotics commands, such as RT-2-X and OpenVLA, may produce more stable results and generalize better across tasks and environments, but they require specialized technical expertise and are more expensive to create.
  • The lack of data for low-level robot tasks remains a significant hurdle in the development of more advanced and adaptable robotics systems.

Alter3 is the latest GPT-4-powered humanoid robot
