Google's Gemini no longer wants to play chess

Told in advance that ChatGPT and Copilot have already lost, Google's Gemini refuses to play a game of chess.

Atari 2600 games

(Image: Shutterstock.com/Anderson Reis)


Google's Gemini refused to play chess against an Atari 2600. The AI model had previously been told that other models had already lost against this opponent. The AI model's self-confidence was reportedly high at first but then promptly dropped to zero. A wise decision.

This is not based on a study but on an experiment conducted by a developer, Robert Caruso. He recently had ChatGPT and Copilot compete against the Atari 2600, and the console won. Caruso described the performance of the two chatbots as abysmal. Although both were initially able to reproduce the rules well, they failed in the actual game. Neither ChatGPT nor Copilot could remember moves, let alone keep track of where their pieces were and which moves were allowed. The old console won even at its absolute beginner level.


The game was played on an emulator using “Video Chess,” a game from 1979. According to Caruso, ChatGPT justified some of its mistakes by saying that the pieces were difficult to recognize. In the experiment, ChatGPT was used in its standard version, i.e., as a foundation model specialized in language and not trained for chess. Caruso explained on LinkedIn that the chatbot simply forgot important context. This potentially applies to conversations as well as chess games. However, newer AI models, such as those specialized in reasoning, could perform significantly better because they think further ahead and retain more context.

According to The Register, Caruso received messages from people asking how Google's Gemini would perform. In conversation, that chatbot was also extremely confident at first. But after Caruso told it about the games that ChatGPT and Copilot had already lost, Gemini became far more cautious. In the end, Gemini reportedly wrote: “Canceling the game is probably the most time-saving and sensible decision.”

However, Gemini's resignation should not necessarily be viewed as negative, but rather as wise. Caruso also said, according to The Register: “This reality check isn't just about avoiding amusing chess mistakes. It's about making AI more reliable, trustworthy, and safe, especially in critical areas where mistakes can have real consequences.”

(emw)


This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.