At CES 2025, NVIDIA announced Cosmos, a suite of openly licensed world foundation models intended to accelerate the development of physical AI applications in robotics and autonomous vehicles.
Core announcement: NVIDIA’s Cosmos platform introduces world foundation models (WFMs) that can predict and generate physics-aware videos of virtual environments, making advanced AI development more accessible to development teams of all sizes.
- The models are being released under NVIDIA’s permissive open model license, allowing for commercial usage
- These models have been trained on 9,000 trillion tokens from 20 million hours of real-world data
- Early adopters including Uber, Waabi, and Agility Robotics are already evaluating or using Cosmos in their development processes
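To put the training-data figures in perspective, the two numbers above imply a token density per hour of footage. The short Python sketch below only does that arithmetic. The variable names are illustrative, and it assumes the token count and the hour count describe the same corpus, which the announcement implies but does not spell out.

```python
# Back-of-the-envelope arithmetic on the figures quoted above.
# Assumption: the token total and the hour total describe the same corpus.
TOTAL_TOKENS = 9_000 * 10**12   # 9,000 trillion tokens
TOTAL_HOURS = 20 * 10**6        # 20 million hours of real-world data

tokens_per_hour = TOTAL_TOKENS // TOTAL_HOURS
tokens_per_second = tokens_per_hour // 3600

print(f"{tokens_per_hour:,} tokens per hour")      # 450,000,000 tokens per hour
print(f"{tokens_per_second:,} tokens per second")  # 125,000 tokens per second
```

On these assumptions, each second of source data corresponds to about 125,000 tokens.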
Technical capabilities: World foundation models serve as fundamental building blocks for physical AI, similar to large language models but focused on spatial relationships and physical interactions.
- The platform includes both diffusion and autoregressive transformer models for physics-aware video generation
- Models are available in three categories: Nano (for edge deployment), Super (baseline models), and Ultra (maximum quality)
- Developers can use these models for text-to-world and video-to-world generation
- The platform includes specialized tokenizers that offer 8x more compression than current methods
Implementation features: Cosmos provides a comprehensive toolkit for developers to customize and deploy physical AI solutions.
- The platform includes data processing pipelines optimized for NVIDIA GPUs
- Processing 20 million hours of data takes 40 days on NVIDIA Hopper GPUs, compared with about three years on a CPU-only pipeline
- Developers can access models through the NVIDIA API catalog, NGC catalog, and Hugging Face
- The platform integrates with NVIDIA’s NeMo framework for model customization and fine-tuning
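The data-processing comparison above can be turned into a rough speedup and throughput figure. This is a minimal sketch that takes the quoted numbers at face value. It assumes a 365-day year and treats both figures as wall-clock time; the cluster sizes behind them are not stated here.

```python
# Rough speedup implied by the quoted processing times.
# Assumptions: 365-day years, both figures are wall-clock durations.
GPU_DAYS = 40            # NVIDIA Hopper GPUs
CPU_YEARS = 3            # CPU-based processing
DAYS_PER_YEAR = 365
TOTAL_HOURS = 20_000_000 # hours of data processed

cpu_days = CPU_YEARS * DAYS_PER_YEAR
speedup = cpu_days / GPU_DAYS
hours_per_day = TOTAL_HOURS / GPU_DAYS

print(f"CPU pipeline: {cpu_days} days")                               # CPU pipeline: 1095 days
print(f"Speedup: about {speedup:.0f}x")                               # Speedup: about 27x
print(f"GPU throughput: {hours_per_day:,.0f} hours of data per day")  # GPU throughput: 500,000 hours of data per day
```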
Real-world applications: Companies are already leveraging Cosmos for practical applications in autonomous vehicles and robotics.
- Waabi is evaluating Cosmos for AV software development and simulation
- Hillbot is using the platform to generate high-fidelity 3D environments for robotic training
- When combined with NVIDIA Omniverse, the system can simulate multiple future paths for AI decision-making
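The idea of simulating several possible futures before committing to a decision can be illustrated with a toy example. The sketch below is not the Cosmos or Omniverse API. `rollout`, `score`, and `choose_action` are hypothetical names, and a noisy one-dimensional walk stands in for a world model.

```python
import random


def rollout(position, action, steps, rng):
    """Stand-in world model: predict one possible future for an action."""
    for _ in range(steps):
        position += action + rng.gauss(0.0, 0.1)  # chosen action plus noise
    return position


def score(final_position, goal):
    """Higher is better: negative distance from the goal."""
    return -abs(final_position - goal)


def choose_action(position, goal, actions, n_futures=50, steps=5, seed=0):
    """Sample several futures per action and keep the best average score."""
    rng = random.Random(seed)
    best_action, best_value = None, float("-inf")
    for action in actions:
        futures = [rollout(position, action, steps, rng) for _ in range(n_futures)]
        value = sum(score(f, goal) for f in futures) / n_futures
        if value > best_value:
            best_action, best_value = action, value
    return best_action


if __name__ == "__main__":
    # Moving +1.0 per step for five steps lands near the goal at 5.0.
    print(choose_action(position=0.0, goal=5.0, actions=[-1.0, 0.0, 1.0]))  # prints 1.0
```

A real system would replace `rollout` with generated video or simulation and `score` with a task-specific evaluation. The loop structure of sampling, scoring, and selecting is the part this example is meant to show.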
Safety and responsibility: NVIDIA has implemented various safeguards and responsible AI practices within the Cosmos platform.
- The platform includes Cosmos Guardrails to mitigate harmful inputs and screen generated content
- A built-in watermarking system helps identify AI-generated sequences
- The development follows NVIDIA’s trustworthy AI principles covering nondiscrimination, privacy, safety, security, and transparency
Future implications: The release of Cosmos represents a significant step toward democratizing physical AI development, though questions remain about computational requirements and real-world performance at scale.
Source: NVIDIA Makes Cosmos World Foundation Models Openly Available to Physical AI Developer Community