Teen geniuses outsmart top AI models at International Math Olympiad
26 students outperformed AI systems on the IMO exam

Jul 28, 2025, 11:10 am

What's the story

The International Mathematical Olympiad (IMO), the premier competition for young mathematicians, took a surprising turn this year: elite high school students from around the world were joined by artificial intelligence (AI) models developed by Google DeepMind and OpenAI. The notoriously tough exam spans two days, with three increasingly difficult problems each day, and has become a benchmark for measuring AI progress.

AI achievement

DeepMind takes home gold; OpenAI also claims gold

In a historic feat, an AI model from Google DeepMind earned a gold medal at the IMO by perfectly solving five of the six problems. OpenAI also claimed a gold medal, despite not participating in the official event. Even so, 26 students outperformed these AI systems on the exam, including two-time gold medalist Qiao (Tiger) Zhang and Alexander Wang, who brought home his third consecutive gold.

Limitations

How the DeepMind model fared on day 1

The IMO's problems are deliberately original and unconventional, and AI systems, despite their capabilities, often struggle with such novel challenges. This year, the DeepMind team upgraded its AI system, an advanced reasoning model, for the competition. It solved all three of the first day's problems, stunning mathematicians.

Triumphs

OpenAI's experimental reasoning model

After the IMO's closing ceremony, OpenAI announced that its latest experimental reasoning model had also solved five of the six problems. The company was not part of the official event but had given its model all six problems and enlisted former medalists to grade the proofs. Like DeepMind's system, OpenAI's solved five problems flawlessly; with each problem worth seven points, that comes to 35 of a possible 42, enough to clear the gold-medal threshold.

Progress

Major advances from last year

Google DeepMind's model also operated entirely in "natural language" without any human intervention, a major advance over last year, when the problems had to be translated into a computer programming language before the system could produce proofs. It also completed the exam within the IMO's 4.5-hour time limit, whereas just a year ago it needed several days of computation.