Google's new AI creates interactive video game worlds from text

By Mudit Dube

Aug 06, 2025

10:07 am

What's the story

Google DeepMind has unveiled Genie 3, an artificial intelligence (AI) model that generates interactive 3D environments from text prompts. Users and AI agents can interact with these virtual worlds in real time. This is a major improvement over its predecessor, Genie 2, which allowed only short bursts of interaction.

Advanced features

Improved interaction time and visual memory

Genie 3 sustains interaction for several minutes, compared with the mere seconds offered by Genie 2. It also has a visual memory that retains details for about a minute, which keeps virtual spaces consistent as users move through them. The model generates worlds at 720p resolution and 24 frames per second (fps).

Model capabilities

Promptable world events and current limitations

Genie 3 also introduces "promptable world events," which let users change elements such as the weather or add characters with a simple prompt. However, the model is still a research preview, available only to a select group of academics and creators. It also has limitations: the ways users can interact with a world are restricted, and legible text often appears only when the input world description supplies it.

Model applications

Potential applications and future access plans

Genie 3's ability to create realistic natural scenes, fictional worlds, and historical simulations makes it a valuable tool for researchers training AI agents. DeepMind has already been testing Genie 3 with its SIMA generalist agent in different worlds. Despite the model's current limitations, DeepMind is exploring ways to expand access beyond the research preview while keeping ethical safeguards in place.