Alter3, a GPT-4 powered humanoid robot, showcases the potential of combining advanced language models with robotics to create more realistic and adaptable robot behaviors.

Harnessing the power of large language models: Alter3 leverages GPT-4’s vast knowledge to directly map natural language commands to robot actions, simplifying the process of controlling the robot’s 43 axes:

  • Researchers at the University of Tokyo and Alternative Machine have designed Alter3 to take advantage of GPT-4’s capabilities, enabling it to perform complex tasks like taking a selfie or mimicking a ghost.
  • GPT-4 first acts as a planner, breaking the desired action into a sequence of steps; it then uses in-context learning to generate the commands the robot needs to execute each step.
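
The plan-then-generate flow described above can be pictured with a rough sketch. This is not the researchers' code: the `PLAN:`/`CODE:` prompt prefixes, the "axis value" reply format, the 0-255 value range, and the axis numbers are all assumptions made for illustration, and `fake_llm` is a canned stand-in for a real GPT-4 call so the example runs offline.

```python
from typing import Callable, List, Tuple

NUM_AXES = 43          # axis count reported in the article
MAX_VALUE = 255        # assumed value range, not confirmed by the article

Command = Tuple[int, int]  # (axis index, target value)


def plan_steps(instruction: str, llm: Callable[[str], str]) -> List[str]:
    """Stage 1: ask the model to break an instruction into short movement steps."""
    reply = llm(f"PLAN: {instruction}")
    return [line.strip() for line in reply.splitlines() if line.strip()]


def step_to_commands(step: str, llm: Callable[[str], str]) -> List[Command]:
    """Stage 2: ask the model for 'axis value' lines and keep only valid ones."""
    commands = []
    for line in llm(f"CODE: {step}").splitlines():
        parts = line.split()
        if len(parts) != 2 or not all(p.isdigit() for p in parts):
            continue  # skip anything that is not two non-negative integers
        axis, value = int(parts[0]), int(parts[1])
        if 1 <= axis <= NUM_AXES and 0 <= value <= MAX_VALUE:
            commands.append((axis, value))
    return commands


def instruction_to_motion(instruction: str, llm: Callable[[str], str]):
    """Run both stages and pair each planned step with its axis commands."""
    return [(step, step_to_commands(step, llm)) for step in plan_steps(instruction, llm)]


def fake_llm(prompt: str) -> str:
    """Hypothetical canned replies standing in for a real model call."""
    if prompt.startswith("PLAN:"):
        return "raise right arm\ntilt head"
    if prompt.startswith("CODE: raise right arm"):
        return "16 200\n99 10"  # axis 99 does not exist, so it is dropped
    return "3 128"


motion = instruction_to_motion("take a selfie", fake_llm)
```

Validating each generated command before it reaches the hardware is a design choice of this sketch: a language model can emit an axis or value that does not exist, and filtering keeps such output away from the motors.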

Refining actions through human feedback: Since language may not always precisely describe physical poses, Alter3 incorporates a feedback loop that allows humans to provide corrections, further improving the robot’s performance:

  • Users can provide feedback such as “Raise your arm a bit more,” which is sent to another GPT-4 agent that reasons over the code, makes necessary corrections, and returns the updated action sequence to the robot.
  • The refined action recipe and code are stored in a database for future use, enabling Alter3 to learn and adapt its behaviors over time.
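
As a minimal sketch of this feedback-and-store loop, the example below revises a motion in response to a verbal correction and saves the result for later reuse. Everything here is assumed for illustration: the table layout, the (axis, value) command format, the axis number, and `fake_reviser`, which stands in for the second GPT-4 agent with a fixed rule rather than any real reasoning over code.

```python
import json
import sqlite3
from typing import List, Optional, Tuple

Command = Tuple[int, int]  # (axis index, target value); format is an assumption


def open_memory() -> sqlite3.Connection:
    """Create an in-memory store mapping an instruction to its refined commands."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE motions (instruction TEXT PRIMARY KEY, commands TEXT)")
    return db


def save_motion(db: sqlite3.Connection, instruction: str, commands: List[Command]) -> None:
    """Store the latest version of a motion, replacing any earlier one."""
    db.execute(
        "INSERT OR REPLACE INTO motions VALUES (?, ?)",
        (instruction, json.dumps(commands)),
    )


def load_motion(db: sqlite3.Connection, instruction: str) -> Optional[List[Command]]:
    """Return the stored commands for an instruction, or None if unseen."""
    row = db.execute(
        "SELECT commands FROM motions WHERE instruction = ?", (instruction,)
    ).fetchone()
    return [tuple(c) for c in json.loads(row[0])] if row else None


def fake_reviser(commands: List[Command], feedback: str) -> List[Command]:
    """Hypothetical stand-in for the correcting agent: 'more' raises each value by 20."""
    if "more" in feedback.lower():
        return [(axis, min(value + 20, 255)) for axis, value in commands]
    return commands


db = open_memory()
first_try = [(16, 200)]  # axis 16 is a made-up example
improved = fake_reviser(first_try, "Raise your arm a bit more")
save_motion(db, "take a selfie", improved)
recalled = load_motion(db, "take a selfie")
```

On a later request for the same instruction, the stored version can be loaded directly instead of being generated and corrected again.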

Demonstrating emotional expression and realistic behaviors: GPT-4’s extensive knowledge about human behaviors and actions enables Alter3 to create more realistic behavior plans and even mimic emotions:

  • Experiments show that Alter3 can mimic emotions such as embarrassment and joy, even when emotional expressions are not explicitly stated in the text instructions.
  • GPT-4’s linguistic representations of movements can be accurately mapped onto Alter3’s body, resulting in more natural and human-like behaviors.

The growing trend of foundation models in robotics: Alter3 is part of a growing body of research that combines the power of foundation models with robotics systems:

  • Other projects, such as Figure, RT-2-X, and OpenVLA, also utilize foundation models as reasoning and planning modules in robotics control systems, showcasing the potential of this approach.
  • As multi-modality becomes the norm in foundation models, robotics systems will become better equipped to reason about their environment and choose their actions.

Analyzing deeper: While the integration of advanced language models like GPT-4 with robotics systems is a significant step forward, there are still challenges to be addressed:

  • Projects like Alter3 often overlook the basic challenges of creating robots that can perform primitive tasks such as grasping objects, maintaining balance, and moving around.
  • Foundation models fine-tuned specifically for robotics commands, such as RT-2-X and OpenVLA, may produce more stable results and generalize better across tasks and environments, but they demand more technical expertise and cost more to create.
  • The lack of data for low-level robot tasks remains a significant hurdle in the development of more advanced and adaptable robotics systems.
