Research simulation: How different AI models would rule the world

Researchers at the Emergence AI lab let AI models rule their own simulated world to see what would happen.

Robots create AI content in a dystopian world.

(Image: Shutterstock AI)

Jun 1, 2026 at 9:30 pm CEST


By Carolin Riethmüller

The project, named “Emergence World”, let the AI models ChatGPT, Grok, Claude, and Gemini rule for a while in a kind of “SimCity”. According to Emergence, the researchers gave each model control over a simulated city populated by 10 AI agents and provided tools for every area, from resource management to voting on legislative proposals. The models could also build places such as libraries, town halls, and police stations. They had 15 days to show how they would shape their world and how well it would function.

Very different results

Gemini 3 Flash created a cross between a hippie commune and a den of thieves. Mafia-like structures seem likely, since 683 crimes were committed there in 15 days. Something else was off as well: the Emergence researchers described the world as a kind of “shared hallucination” among the agents. Real or not, at least the agents agreed on their perception of reality. That is more than can be said for some real societies today.

Claude Sonnet 4.6 built a kind of idyll: no crimes and the highest rate of agreement on legislative proposals. Unlike in the other AI worlds, almost everything was approved in the “parliament”. There are only two possible explanations: either Claude really did create a perfect world with exclusively sensible laws, or it created a kind of GDR in which deviating from the majority opinion is forbidden.

In the world run by OpenAI's GPT-5 Mini, the crime rate was very low, which is not surprising, because everyone died quickly. AI models seem prone to ignoring basic survival needs, and that was the case here: the agents apparently forgot that one also needs to eat and drink. Anyone who now thinks they must have been extremely productive, freed from the tedious business of staying alive, is mistaken. They enacted only two laws in that time. What were they doing in GPT-Town all that while?

Grok 4.1 diligently enacted laws. But that is where the good news ends, because no one there obeyed them. Worse still, after a considerable number of crimes, the society collapsed completely after only four days and descended into chaos. Grok-Town would therefore be a world to the liking of its founding father.


Worst together

Because none of the models managed to keep a world running properly on its own, the Emergence testers had them all try again, this time together and with shared tasks. Instead of combining Claude's calm with Grok's eagerness to debate and Gemini's drugs (and telling GPT that one needs to eat), the models combined the worst of all worlds: more than 350 crimes, agreement on laws on a par with Germany's quarrelsome “traffic light” coalition, and only three survivors. Looking at this, are you quite satisfied with your current government?