AI can find bugs in decades-old code, but there's a catch
AI tools like Anthropic's Claude Opus 4.6 are now digging through code written long before many of today's developers were born, including Microsoft Azure CTO Mark Russinovich's 1986 Apple II assembly code, and finding bugs that have been hiding for decades.
It sounds great for making old, unpatchable systems safer, but there's a twist.
AI can find and fix bugs, but it also creates them
Unlike traditional static analyzers, which flag code that matches known bug patterns, LLMs can reason about what the code is meant to do and spot deeper logic problems.
In fact, Claude Opus 4.6 has uncovered previously hidden bugs in legacy code and high-severity issues in real-world projects.
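To see the difference, here's a minimal, hypothetical sketch (not from any of the projects mentioned) of the kind of bug pattern-based tools tend to miss: the code below is syntactically valid and type-correct, so a linter has nothing to flag, but a reviewer reasoning about the stated intent can see it does the opposite of what its docstring promises.

```python
def parse_record_length(header: bytes) -> int:
    """Read a 16-bit little-endian length field from a record header."""
    # Subtle bug: the bytes are combined in big-endian order, so a
    # little-endian length of 0x0102 (stored as b"\x02\x01") is read
    # back as 0x0201. Nothing here violates any static-analysis rule;
    # catching it requires comparing the code against its intent.
    return (header[0] << 8) | header[1]
```

A reasoning model can cross-check the docstring ("little-endian") against the byte order actually used and report the mismatch, which is exactly the class of decades-old logic bug the article describes.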
But studies show that while AI writes code much faster than humans, it also introduces up to 1.7 times more bugs, including serious ones.
The bug balance
As companies rush to adopt AI for coding, the evidence on detection is encouraging: in a 2025 head-to-head study, LLMs like GPT-4.1, Mistral Large, and DeepSeek V3 were as good as industry-standard static analyzers at finding bugs across multiple open-source projects.
So while LLMs are powerful for finding hidden issues, relying on them too much could make software less reliable, especially for older systems that can't easily be fixed.