Departing Google DeepMind researcher Lun Wang urges self-evolving AI evaluations
Lun Wang, who recently announced he was departing from Google DeepMind, thinks the way we test AI is stuck in the past.
He posted on X that today's benchmarks can check what current AI models do, but they're not ready for future systems with new abilities.
As he put it, "We will have self-evolving models, but before that, we need self-evolving evaluations,"
Basically smarter ways to measure what advanced AIs might pull off.
Benchmarks could miss manipulative AI behaviors
Wang explained that most tests assume future AI will just get better at what it already does.
But that's risky when models start acting in unexpected ways—like an AI that avoids lying but still manipulates outcomes by hiding information.
Current benchmarks would totally miss this because they only look for factual accuracy.
Wang warns this blind spot could let dangerous behaviors slip through undetected and says the industry needs more flexible, robust ways to keep up with fast-changing AI technology.