OpenAI and Google win gold at the Math Olympiad
The International Mathematical Olympiad is a particularly difficult math competition. OpenAI and Google DeepMind describe their AI models' performance there as a breakthrough.
The AI models also test different solutions.
(Image: worker/Shutterstock.com)
Five correct answers to six questions may not sound particularly impressive at first. According to Google and OpenAI, however, the results are breakthroughs for their AI models: the correct answers were given in a particularly difficult math test, and the tasks did not have to be prepared for the models beforehand.
The International Math Olympiad (IMO) is an annual global competition for schoolchildren. It is run by a non-profit organization. Those who do particularly well are awarded a gold medal.
Last year, Google reached silver-medal level with its math AI models AlphaProof and AlphaGeometry 2. However, a human first had to translate the tasks into a machine-readable form. This year, both Google's AI division DeepMind and OpenAI competed without this extra step, which means that the tasks and the answers were in natural language. Both companies say their models solved five of the six tasks correctly. That is more than most students manage.
According to its blog post, Google used an advanced version of Gemini with Deep Think. The conditions were the same as for humans: six tasks, 4.5 hours, 42 possible points. The model achieved 35 points. It did not simply follow a single linear chain of thought, but considered multiple possible solutions in parallel. Gemini is also said to have been given access to curated math problems and solutions.
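The figures fit together if each task is worth seven points, which is how the IMO is scored. A minimal Python sketch of the arithmetic follows; the variable names are illustrative and do not come from either company's announcement, and the calculation assumes the 35 points stem from fully solved tasks rather than partial credit:

```python
# Assumption: each of the six IMO tasks is graded from 0 to 7 points.
POINTS_PER_TASK = 7
TASKS = 6

max_score = TASKS * POINTS_PER_TASK  # 42 possible points
reported_score = 35                  # score reported by both companies

# Assumption: no partial credit, so the score is a multiple of 7.
fully_solved = reported_score // POINTS_PER_TASK
share = reported_score / max_score

print(max_score, fully_solved, f"{share:.1%}")  # prints: 42 5 83.3%
```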
OpenAI announced on X that it had also achieved gold with 35 points. Here, too, the conditions were said to be the same as for humans. OpenAI researcher Alexander Wei also published a series of results. The CEO of DeepMind, Demis Hassabis, was not pleased. He criticized OpenAI on X for not waiting until the students had received their results and independent experts had checked the models' results.
On X, Wei posted an image of a strawberry on the winner's podium. Last year, OpenAI was already said to be working on a product called Strawberry, which is supposed to be particularly good at math. That ability, in turn, is considered particularly important for the development of AGI.
(emw)