Build a marshmallow castle with Google’s new AI World Generator

Google DeepMind is opening up access to Project Genie, an AI tool for creating interactive game worlds from text prompts or images.

Starting Thursday, Google AI Ultra subscribers in the U.S. will be able to play with the experimental research prototype, which is powered by a combination of Google’s latest world model, Genie 3, its image generation model Nano Banana Pro, and Gemini.

The move, which comes five months after Genie 3’s research preview, is part of a broader push by DeepMind to collect user feedback and training data as it races to develop more capable world models.

A world model is an AI system that creates an internal representation of an environment, which it can use to predict future outcomes and plan actions. Many AI leaders, including those at DeepMind, believe that world models are an important step toward achieving artificial general intelligence (AGI). But in the nearer term, labs like DeepMind envision a go-to-market path that starts with video games and other forms of entertainment and extends to training embodied agents (aka robots) in simulation.

DeepMind’s launch of Project Genie comes as the world model race begins to heat up. Fei-Fei Li’s World Labs launched its first commercial product, Marble, late last year. Runway, an AI video generation startup, also recently launched a world model. And AMI Labs, the startup of former Meta chief scientist Yann LeCun, will also focus on developing world models.

“I think it’s exciting to be in a place where more people can access and give feedback,” Shlomi Fruchter, DeepMind’s director of research, told TechCrunch in a video interview, smiling from ear to ear with obvious excitement about Project Genie’s launch.

DeepMind researchers interviewed by TechCrunch were candid about the experimental nature of the tool. It can be inconsistent, sometimes creating impressively playable worlds, and sometimes producing baffling results that miss the mark. Here’s how it works:

A claymation-style castle in the sky made from marshmallows and candies. Image Credits: TechCrunch

You start with a “world sketch” by providing text prompts for the environment and for a protagonist who can later explore the world in first- or third-person view. Nano Banana Pro generates an image based on your prompts, which you can, in theory, modify before Genie uses it as the starting point for its interactive world. The modifications worked for the most part, but the model sometimes stumbled and produced purple hair when asked for green.

You can also use real photos as the basis for the model to build a world on, which, again, produced mixed results. (More on this later.)

Once you’re satisfied with the image, Project Genie takes a few seconds to create an explorable world. You can also remix existing worlds into new interpretations based on their prompts, explore curated worlds in the gallery, or use the randomizer tool for inspiration. You can then download a video of the world you just explored.

DeepMind currently limits world generation and exploration to 60 seconds, partly due to budget and compute constraints. Because Genie 3 is an autoregressive model, it requires a lot of dedicated compute, which limits how much DeepMind can offer to users.

“The reason we limited it to 60 seconds is because we wanted to provide it to a wider audience,” Fruchter said. “Basically, when you use it, you have your own chip somewhere and it’s dedicated to your session.”

He added that extending it beyond 60 seconds would reduce the incremental value of the test.

“The environment is interesting, but at some point it is somewhat limited by the level of interaction and the dynamics of the environment. Still, we see this as a limitation and want to improve it.”

Whimsical works, realism doesn’t

Google received a cease-and-desist from Disney last year, so the model won’t build Disney-related worlds. Image Credits: TechCrunch

When I used the model, the safety guardrails were already up and running. I couldn’t create anything resembling nudity, nor worlds that had even the slightest whiff of Disney or other copyrighted material. (Last December, Disney sent Google a cease-and-desist, accusing the company’s AI models of infringing its copyright by training on Disney’s characters and IP and generating unauthorized content.) I couldn’t even get Genie to create a world of mermaids exploring an underwater fantasy land or an ice queen in a winter castle.

Still, the demo was very impressive. The first world I created was an attempt to fulfill a little childhood fantasy of exploring a castle in the clouds made of marshmallows, with rivers of chocolate sauce and trees made of candy. (Yes, I was a chubby kid.) I asked the model to do it in claymation style, and it delivered a whimsical world that I would have devoured as a child, the castle’s pastel and white spires and turrets looking fluffy and delicious enough to tear off a chunk and dip it into the chocolate moat. (This is the video above.)

A “Game of Thrones”-inspired world that I wasn’t able to create as realistically as I wanted. Image Credits: TechCrunch

That said, Project Genie still has some issues that need to be addressed.

The models excelled at creating worlds based on artistic prompts, whether in watercolor, anime style, or classic comic book aesthetics. But they tended to fall short on realistic or cinematic worlds, which often came out looking like a video game rather than real people in a real environment.

It also didn’t always respond well when given actual photos to work with. I gave it a photo of my office and asked it to create a world based on the photo, and the result was a world with the same furniture as my office (wooden desk, plants, gray sofa, etc.) arranged differently. It also looked lifeless, sterile and digital.

When I gave it a photo of my desk with a plush toy on it, Project Genie animated the toy exploring the space, with other objects sometimes reacting as it moved past them.

This interactivity is something DeepMind is working to improve. There were several instances where my character would go right through walls or other solid objects.

I asked Project Genie to animate a plush toy (Bingo Bronson) navigating my desk. Image Credits: TechCrunch

When DeepMind first released Genie 3, researchers highlighted how the model’s autoregressive architecture meant it could remember what it had generated. I wanted to test this by returning to parts of an environment the model had already created to see whether they stayed the same. In most cases, the model succeeded. In one case, I generated a cat exploring another desk, and only once, when I turned back to the right side of the desk, did the model generate a second mug.

The most frustrating part was the controls: the arrow keys to look around, the space bar to jump or climb, and the WASD keys to move through the space. I’m not a gamer, so this didn’t come naturally to me, but the keys were also often unresponsive or sent me in the wrong direction. Walking from one side of a room to a doorway on the other side often became a confusing zig-zag, like trying to steer a shopping cart with a broken wheel.

Fruchter assured me that his team was aware of these shortcomings, reminding me that Project Genie is an experimental prototype. In the future, the team hopes to enhance realism and improve interactivity, including giving users more control over their actions and environment.

“We don’t think of [Project Genie] as an end-to-end product that people can go back to their normal lives with, but we think it’s already a glimpse into something that’s exciting and unique and can’t be done any other way,” he said.
