TechCrunch AI January 6, 2025

Google DeepMind moves deeper into world models for AI simulation

Google DeepMind is forming a new team focused on AI models that can simulate the physical world. The effort, led by Tim Brooks, will connect work across Gemini, Veo and Genie while raising familiar questions about creative work and copyright.

Google is putting more weight behind world models, a branch of AI aimed at simulating environments rather than only producing text, images or video. The company is forming a new team inside Google DeepMind to work on models that can represent and generate parts of the physical world.

The team will be led by Tim Brooks, formerly one of the co-leads on OpenAI’s video generator, Sora. Brooks left OpenAI for Google DeepMind, Google’s AI research lab, in October, and announced the new effort in a post on X.

A new team with a world simulation mission

Brooks described the assignment in direct terms. In his post, he wrote:

“DeepMind has ambitious plans to make massive generative models that simulate the world.”

He also said he is hiring for a new team with that mission. According to job listings Brooks linked to, the group will work with and build on projects already underway at Google, including Gemini, Veo and Genie.

That matters because each of those efforts points to a different layer of Google’s AI strategy. Gemini is Google’s flagship series of AI models for tasks such as analyzing images and generating text. Veo is Google’s video generation model. Genie is Google’s take on a world model, designed to simulate games and 3D environments in real time.

The new team is not being framed as an isolated research project. The job listings say it will take on “critical new problems” and scale models “to the highest levels of compute.” That signals a push toward large systems that combine video, multimodal data and interactive generation.

Why world models are drawing attention

A world model differs from a conventional content generator in one important way: instead of producing a single output, it aims to represent an environment that can change, react and be explored. Genie, for example, is built to simulate games and 3D environments in real time.

Google’s latest Genie model, previewed in December, can generate a massive variety of playable 3D worlds. That makes the technology relevant not only to video generation, but also to interactive systems where users or agents can move through generated spaces.

One of the job descriptions connects this work to a larger AI goal:

“We believe scaling [AI training] on video and multimodal data is on the critical path to artificial general intelligence.”

Artificial general intelligence, or AGI, generally refers to AI that can accomplish any task a human can. The same description says world models could support domains such as visual reasoning and simulation, planning for embodied agents, and real-time interactive entertainment.

In practical terms, that means Google is looking beyond static media. The team will study how to develop “real-time interactive generation” tools on top of its models and how to integrate those models with existing multimodal models such as Gemini.

The competitive field is already forming

Google is not the only company pursuing this direction. The source article names several startups and big tech efforts in the world modeling space, including World Labs, Decart and Odyssey. World Labs is connected to influential AI researcher Fei-Fei Li, while Decart is described as an Israeli upstart.

The shared bet is that world models could eventually help build interactive media and realistic simulations. The examples given include video games, movies and training environments for robots.

That range explains why world models are becoming strategically important. If AI can generate environments that respond in real time, the technology could affect entertainment, simulation and robotics-related planning. It also places video and multimodal AI closer to the center of broader AI development.

Logan Kilpatrick also posted about the effort on X, writing:

“Come work with Tim and the Deepmind team on massive world simulation models : )”

In the same post, he described the work as being “On the critical path to AGI.”

Creative work and copyright remain unresolved

The rise of world models also brings pressure points that are already familiar in generative AI. Creatives have mixed feelings about the technology, and the source article highlights concerns from games, animation, film and television.

A recent Wired investigation found that game studios like Activision Blizzard, which has laid off scores of workers, are using AI to cut corners, ramp up productivity and compensate for attrition. A 2024 study commissioned by the Animation Guild, a union representing Hollywood animators and cartoonists, estimated that over 100,000 U.S.-based film, television and animation jobs will be disrupted by AI by 2026.

Those concerns are especially relevant for tools that can generate interactive worlds. If the same systems that support simulation can also produce games, movies or other media, the boundary between creative assistance and labor replacement becomes a central issue.

The source article notes that some startups in the world modeling space, including Odyssey, have pledged to collaborate with creative professionals rather than replace them. It remains unclear whether Google will take the same approach.

Copyright is another open question. Some world models appear to be trained on clips of video game playthroughs, which could expose companies to lawsuits if the videos were unlicensed. Google, which owns YouTube, says it has permission to train its models on YouTube videos under the platform’s terms of service. The company has not said which specific videos it uses for training.

What to watch next

The immediate development is straightforward: Google DeepMind is hiring for a team focused on large generative models that simulate the world. The larger story is how that work connects Gemini, Veo and Genie into a more interactive AI roadmap.

For Google, the opportunity is to build systems that reason visually, simulate environments and generate interactive experiences in real time. For creative industries and rights holders, the same technology raises questions about work, authorship and training data that have not been settled.

The new team gives Google a clearer place in the race to build world models. What remains to be seen is how the company will balance technical ambition with the concerns already surrounding AI-generated media and simulation.