TechCrunch AI, January 7, 2025

Nvidia Opens Cosmos World Models for Physical AI Development

Nvidia has made Cosmos World Foundation Models available for developers building physical AI systems. The models can generate physics-aware video and synthetic data, but Nvidia is calling them open rather than open source because key training details are not fully disclosed.

Nvidia is moving deeper into world models, a class of AI systems designed to predict and generate representations of how the physical world behaves. At CES 2025 in Las Vegas, the company announced Cosmos World Foundation Models, or Cosmos WFMs, a family of models aimed at robotics, autonomous vehicles, industrial AI, and other systems that need synthetic data grounded in physical behavior.

The announcement matters because many physical AI projects are constrained by data. Robots, driverless car systems, and industrial tools often need large amounts of video, sensor, and motion data before they can be trained and tested at scale. Nvidia is positioning Cosmos WFMs as a way for researchers and developers to generate that kind of training material synthetically.

What Nvidia Is Releasing

Cosmos WFMs are world models that can predict and generate what Nvidia describes as physics-aware videos. The models can be fine-tuned for specific applications and are available through Nvidia's API and NGC catalogs, GitHub, and Hugging Face.

Nvidia says the first wave of Cosmos WFMs can be used for physics-based simulation and synthetic data generation. According to the company, researchers and developers can use the models under Nvidia's permissive open model license, including for commercial use.

The Cosmos WFM family is split into three categories. Nano is built for low-latency, real-time applications. Super serves as a high-performance baseline. Ultra is intended for maximum-quality, high-fidelity output.

The models range from 4 billion to 14 billion parameters. Nano is the smallest category, while Ultra is the largest. A model's parameter count roughly tracks its problem-solving ability, and larger models generally perform better than smaller ones.

How Cosmos Fits Into Physical AI

Nvidia says Cosmos WFMs can take inputs such as text, image, video, robot sensor data, or motion data and produce physics-based videos. The company says this can help developers generate controllable, high-quality synthetic data for training models used in robotics, driverless cars, and related areas.

That stated purpose sets Cosmos apart from a general-purpose video generator. Nvidia is aiming the system at developers who need data that can help another AI model understand motion, environments, and interactions in the physical world.

Nvidia's examples include customizing the models with video recordings of autonomous vehicle trips or of robots moving through a warehouse. In that framing, Cosmos is not just a media tool. It is a development layer for systems that must interpret and operate in real environments.

The release also includes an upsampling model, a video decoder optimized for augmented reality, guardrail models intended to support responsible use, and fine-tuned models for applications such as generating sensor data for autonomous vehicle development.

The Training Data Question

Nvidia said Cosmos WFMs and the related releases were trained on 9,000 trillion tokens drawn from 20 million hours of real-world human interaction, environment, industrial, robotics, and driving data. In this context, tokens are small units of raw data, including video footage.
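
To put those two figures in proportion, the short Python sketch below divides the reported token count by the reported hours. It assumes the tokens are spread evenly across the 20 million hours, which Nvidia has not stated, and the function names are illustrative only. The article does not describe how the data was tokenized, so the result is a rough average and says nothing about any single clip.

```python
# Figures reported by Nvidia, as quoted in the article.
TOTAL_TOKENS = 9_000 * 10**12  # 9,000 trillion tokens
TOTAL_HOURS = 20 * 10**6       # 20 million hours of data


def tokens_per_hour(total_tokens: int, total_hours: int) -> float:
    """Average tokens per hour of source data (assumes an even spread)."""
    return total_tokens / total_hours


def tokens_per_second(total_tokens: int, total_hours: int) -> float:
    """Average tokens per second of source data (3,600 seconds per hour)."""
    return tokens_per_hour(total_tokens, total_hours) / 3600


if __name__ == "__main__":
    per_hour = tokens_per_hour(TOTAL_TOKENS, TOTAL_HOURS)
    per_second = tokens_per_second(TOTAL_TOKENS, TOTAL_HOURS)
    print(f"{per_hour:,.0f} tokens per hour")      # 450,000,000 tokens per hour
    print(f"{per_second:,.0f} tokens per second")  # 125,000 tokens per second
```

On these assumptions, the reported totals work out to roughly 450 million tokens per hour of data, or about 125,000 tokens per second.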

The company did not say where the training data came from. The source article notes that at least one report and lawsuit allege Nvidia trained on copyrighted YouTube videos without permission.

When asked about the issue, an Nvidia spokesperson told TechCrunch that Cosmos "isn't designed to copy or infringe any protected works." The spokesperson also said Nvidia gathered data from public and private sources and believes its use of data is consistent with the law.

Nvidia's argument, as presented in the source article, is that Cosmos learns facts about how the world works and that those facts are not copyrightable. The source article also notes that copyright experts say arguments like this, often connected to fair use, may not survive judicial scrutiny. The outcome depends heavily on how courts apply fair use to AI training.

Who Plans To Try Cosmos

Nvidia said several companies have committed to piloting Cosmos WFMs. The list includes Waabi, Wayve, Foretellix, and Uber. Their planned use cases range from video search and curation to building AI models for self-driving vehicles.

Uber CEO Dara Khosrowshahi said generative AI will be important to the future of mobility, which requires both rich data and powerful compute. He said working with Nvidia could help accelerate the timeline for safe and scalable autonomous driving solutions for the industry.

For Nvidia, that kind of adoption is central to the pitch. Cosmos is not being presented only as a research release. It is being introduced as infrastructure for companies trying to build or improve physical AI systems.

Open, But Not Open Source

Nvidia is making Cosmos WFMs openly available, but the source article draws an important distinction: the models are not open source in the strictest sense.

Under one widely accepted definition of open source AI, a model should provide enough information about its design for someone to substantially recreate it. It should also disclose relevant details about training data, including provenance and how the data can be obtained or licensed.

Nvidia has not published the full training data details for Cosmos WFMs. It also has not released all the tools needed to recreate the models from scratch. That is likely why the company is describing Cosmos as open rather than open source.

Nvidia CEO Jensen Huang framed the release in broader platform terms during a press event, saying the company hopes Cosmos will do for robotics and industrial AI what Llama has done for enterprise AI. The comparison signals Nvidia's ambition: to make Cosmos a widely used foundation for developers building systems that interact with the physical world.