Can AI evolve itself? DeepMind's AlphaEvolve shows it's possible

By Akash Pandey

May 15, 2025

01:06 pm

What's the story

DeepMind, Google's artificial intelligence (AI) research and development lab, has unveiled a new AI system called AlphaEvolve. Built to solve problems with "machine-gradable" solutions, AlphaEvolve has demonstrated the ability to optimize some of the infrastructure Google uses to train its AI models. The company is now working on a UI to interact with this groundbreaking system and plans an early access program for select academics, ahead of a possible broader rollout.

AI hallucination

AlphaEvolve tackles AI hallucination issue

AI models frequently suffer from hallucination because of their probabilistic architectures, causing them to confidently produce incorrect information. The problem has also been seen in newer models like OpenAI's o3, which reportedly hallucinate more than their predecessors. To address this issue, AlphaEvolve uses an automatic evaluation system that generates and critiques a pool of possible answers for a question using models. It then automatically evaluates and scores those answers for accuracy.

System capabilities

Unique features and limitations

AlphaEvolve needs users to prompt the system with a problem, including details like instructions, equations, code snippets, and relevant literature. The user also provides a way to automatically assess the system's answers in the form of a formula. However, it can only solve problems that it can self-evaluate, and is limited to specific fields like computer science and system optimization. Further, AlphaEvolve can only describe solutions as algorithms, making it less suitable for non-numerical problems.

System evaluation

Performance in math and practical problems

DeepMind tested AlphaEvolve on a curated set of ~50 math problems, spanning from geometry to combinatorics. The system "rediscovered" the best-known answers to these problems 75% of the time and discovered improved solutions in 20% of cases. In real-world applications, such as improving Google's data centers' efficiency and speeding up model training runs, AlphaEvolve proposed an algorithm that consistently recovers an average of 0.7% of Google's global compute resources.

Impact

AlphaEvolve's contributions to Google's AI efforts

Additionally, the system proposed an optimization that decreased the overall time taken by Google to train its Gemini models by 1%. Although AlphaEvolve isn't making groundbreaking discoveries, it did discover an improvement for Google's TPU AI accelerator chip design that had already been discovered by other tools. DeepMind claims AlphaEvolve can save time and let experts focus on other, more important works.