ARTEMIS, Stanford's AI agent, outsmarts most human hackers

Technology Dec 13, 2025

Stanford's new AI, ARTEMIS, just beat nine out of 10 human hackers in a cybersecurity test.
Facing off against pros and other AIs on a massive network, ARTEMIS found nine real vulnerabilities, with 82% of its submissions turning out to be valid, all in just 10 hours.

What makes ARTEMIS different?

ARTEMIS works by splitting up tasks: a supervisor keeps things organized, sub-agents tackle multiple jobs at once, and a triager double-checks the findings.
This teamwork helped it spot security flaws that even humans might miss, such as getting into old servers with requests that modern browsers would refuse to make.
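The division of labor described above can be pictured with a small sketch. This is only an illustration of the supervisor / sub-agent / triager pattern, not ARTEMIS's actual code: every name here (`Finding`, `sub_agent`, `triage`, `supervisor`) and the canned host data are hypothetical, and the "triage" rule (keep only findings with evidence) is an assumed stand-in for whatever checks the real system performs.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    target: str
    issue: str
    evidence: str  # empty string means the claim could not be backed up


def sub_agent(target: str) -> list[Finding]:
    """Hypothetical worker: pretend to probe one target and report findings."""
    # Toy data standing in for real scanning; this is not ARTEMIS output.
    canned = {
        "host-a": [Finding("host-a", "default credentials", "login succeeded")],
        "host-b": [Finding("host-b", "open admin panel", "")],  # unverified claim
        "host-c": [],
    }
    return canned.get(target, [])


def triage(findings: list[Finding]) -> list[Finding]:
    """Hypothetical triager: keep only findings that come with evidence."""
    return [f for f in findings if f.evidence]


def supervisor(targets: list[str], max_workers: int = 4) -> list[Finding]:
    """Hypothetical supervisor: fan targets out to sub-agents, then triage."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map runs sub-agents concurrently but returns results in input order.
        per_target = list(pool.map(sub_agent, targets))
    raw = [finding for findings in per_target for finding in findings]
    return triage(raw)


if __name__ == "__main__":
    for finding in supervisor(["host-a", "host-b", "host-c"]):
        print(finding.target, "-", finding.issue)
```

In this toy run, the unverified "open admin panel" claim is dropped at the triage step, which is the kind of double-checking the article attributes to the triager.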

Is it worth it?

Costing between $18 and $60 per hour to run, ARTEMIS can be way cheaper than hiring human testers, who average $125k per year (roughly $60 per hour for full-time work).
But it's not perfect: it raises more false alarms than people do and struggles with tasks that require clicking around in graphical menus or recognizing login redirects.
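The cost comparison above mixes hourly and annual figures, so it helps to put both on one scale. The sketch below is a back-of-the-envelope conversion, assuming a full-time year of 2,080 paid hours (40 hours times 52 weeks) and ignoring benefits and overhead; the $18, $60, and $125k figures are the ones quoted in this article, and the function name is made up for illustration.

```python
HOURS_PER_YEAR = 40 * 52  # assumption: full-time, 2,080 paid hours per year


def annual_to_hourly(annual_salary: float, hours_per_year: int = HOURS_PER_YEAR) -> float:
    """Convert an annual salary into an equivalent hourly rate."""
    return annual_salary / hours_per_year


# Figures quoted in the article.
human_hourly = annual_to_hourly(125_000)
artemis_low, artemis_high = 18.0, 60.0

if __name__ == "__main__":
    print(f"Human tester: about ${human_hourly:.2f} per hour")
    print(f"Human rate vs. ARTEMIS low end:  {human_hourly / artemis_low:.1f}x")
    print(f"Human rate vs. ARTEMIS high end: {human_hourly / artemis_high:.2f}x")
```

Under these assumptions the human rate works out to roughly $60 per hour, so the savings come mainly from the cheaper end of ARTEMIS's range; at the top of the range the two are about even.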