NVIDIA Announces Major Release of Cosmos World Foundation Models and Physical AI Data Tools

  • New Models Enable Prediction, Controllable World Generation and Reasoning for Physical AI

  • Two New Blueprints Deliver Massive Physical AI Synthetic Data Generation for Robot and Autonomous Vehicle Post-Training

  • 1X, Agility Robotics, Figure AI, Skild AI Among Early Adopters

March 18, 2025—GTC—NVIDIA today announced a major release of new NVIDIA Cosmos™ world foundation models (WFMs), introducing an open and fully customizable reasoning model for physical AI development and giving developers unprecedented control over world generation.

NVIDIA is also launching two new blueprints — powered by the NVIDIA Omniverse™ and Cosmos platforms — that provide developers with massive, controllable synthetic data generation engines for post-training robots and autonomous vehicles.

Industry leaders including 1X, Agility Robotics, Figure AI, Foretellix, Skild AI and Uber are among the first to adopt Cosmos to generate richer training data for physical AI faster and at scale.

“Just as large language models revolutionized generative and agentic AI, Cosmos world foundation models are a breakthrough for physical AI,” said Jensen Huang, founder and CEO of NVIDIA. “Cosmos introduces an open and fully customizable reasoning model for physical AI and unlocks opportunities for step-function advances in robotics and the physical industries.”

Cosmos Transfer for Synthetic Data Generation

Cosmos Transfer WFMs ingest structured video inputs such as segmentation maps, depth maps, lidar scans, pose estimation maps and trajectory maps to generate controllable photoreal video outputs.

Cosmos Transfer streamlines perception AI training, transforming 3D simulations or ground truth created in Omniverse into photorealistic videos for large-scale, controllable synthetic data generation.

Agility Robotics will be an early adopter of Cosmos Transfer and Omniverse for large-scale synthetic data generation to train its robot models.

“Cosmos offers us an opportunity to scale our photorealistic training data beyond what we can feasibly collect in the real world,” said Pras Velagapudi, chief technology officer of Agility Robotics. “We’re excited to see what new performance we can unlock with the platform, while making the most use of the physics-based simulation data we already have.”

The NVIDIA Omniverse Blueprint for autonomous vehicle simulation uses Cosmos Transfer to amplify variations of physically based sensor data. With the blueprint, Foretellix can enhance behavioral scenarios by varying conditions like weather and lighting for diverse driving datasets. Parallel Domain is also using the blueprint to apply similar variation to its sensor simulation.

The NVIDIA GR00T Blueprint for synthetic manipulation motion generation combines Omniverse and Cosmos Transfer to generate diverse datasets at scale, benefiting from OpenUSD-powered simulations and reducing data collection and augmentation time from days to hours.

Cosmos Predict for Intelligent World Generation

Announced at the CES trade show in January, Cosmos Predict WFMs generate virtual world states from multimodal inputs like text, images and video. New Cosmos Predict models will enable multi-frame generation, predicting intermediate actions or motion trajectories when given start and end input images. Purpose-built for post-training, these models can be customized using NVIDIA’s openly available physical AI dataset.

With the inference compute power of NVIDIA Grace Blackwell NVL72 systems and their large NVIDIA NVLink™ domain, developers can achieve real-time world generation.

1X is using Cosmos Predict and Cosmos Transfer to train its new humanoid robot NEO Gamma. Robot brain developer Skild AI is tapping into Cosmos Transfer to augment synthetic datasets for its robots. Plus, Nexar and Oxa are using Cosmos Predict to advance their autonomous driving systems.

Multimodal Reasoning for Physical AI

Cosmos Reason is an open, fully customizable WFM with spatiotemporal awareness that uses chain-of-thought reasoning to understand video data and predict the outcomes of interactions — such as a person stepping into a crosswalk or a box falling from a shelf — in natural language.

Developers can use Cosmos Reason to improve physical AI data annotation and curation, enhance existing world foundation models or create new vision language action models. They can also post-train it to build high-level planners that tell a physical AI system what it needs to do to complete a task.

Accelerating Data Curation and Post-Training for Physical AI

Based on their downstream task, developers can post-train Cosmos WFMs using native PyTorch scripts or the NVIDIA NeMo™ framework on NVIDIA DGX™ Cloud.

Cosmos developers can also use NVIDIA NeMo Curator on DGX Cloud for accelerated data processing and curation. Linker Vision and Milestone Systems are using it to curate large amounts of video data for training large vision language models for visual agents built on the NVIDIA AI Blueprint for video search and summarization. Virtual Incision is exploring it for deployment in future surgical robots, while Uber and Waabi are advancing autonomous vehicle development with it.

Driving Responsible AI and Content Transparency

In line with NVIDIA’s trustworthy AI principles, NVIDIA enforces open guardrails across all Cosmos WFMs. In addition, NVIDIA is collaborating with Google DeepMind to integrate SynthID to watermark and help identify AI-generated outputs from the Cosmos WFM NVIDIA NIM™ microservice featured on build.nvidia.com.

Availability

Cosmos WFMs are available for preview in the NVIDIA API catalog and are now listed in the Vertex AI Model Garden on Google Cloud. Cosmos Predict and Cosmos Transfer are openly available on Hugging Face and GitHub. Cosmos Reason is available in early access.

Learn more by watching the NVIDIA GTC keynote and by registering for Cosmos sessions and training from NVIDIA and industry leaders at the show, including “An Introduction to Cosmos World Foundation Models” with Ming-Yu Liu, vice president of generative AI research at NVIDIA.

About NVIDIA

NVIDIA (NASDAQ: NVDA) is the world leader in accelerated computing.
