Anthropic's Mythos AI can chain bugs into exploits: Cloudflare

By Mudit Dube

May 19, 2026

09:42 am

What's the story

Cloudflare has flagged a major security risk with Anthropic's Mythos AI model. The tech giant found that the advanced system could link low-severity software vulnerabilities into more serious exploits. This was discovered as part of Project Glasswing, where Cloudflare analyzed live code across its runtime, edge data path, protocol stack, control plane and open-source projects.

Exploit creation

Mythos's unique ability to link vulnerabilities

Unlike other large language models, Mythos could do more than just identify isolated bugs. It could also connect them into attack chains and generate proof-of-concept code to demonstrate whether a suspected flaw was exploitable.

This capability makes Mythos stand out in the field of software security, as attackers usually exploit multiple vulnerabilities together for unauthorized access or control.

Model behavior

Inconsistent refusals during legitimate vulnerability research

Mythos could also write code to trigger a suspected bug, compile it in a test environment and run the result. It could even revise its approach if the first attempt failed.

However, Cloudflare's findings raised questions about the consistency of model refusals during legitimate vulnerability research.

Sometimes, Mythos rejected requests to carry out security work but completed similar tasks when context changed, even without any change in code under review.

Output quality

Over-reporting possible flaws by Mythos

Mythos also generated a lot of noise that still required human review, especially in projects written in memory-unsafe languages like C and C++.

The model tended to over-report possible flaws, leaving security teams to differentiate between tentative findings and genuine vulnerabilities.

While Mythos improved output quality compared to earlier tools, it didn't eliminate the cost of triage.

Research strategy

Cloudflare's structured system around Mythos

Instead of a generic coding agent inspecting an entire repository, Cloudflare built a structured system around Mythos.

The approach starts with an investigation that maps a repository, identifies trust boundaries and attack surfaces, and generates tasks for later stages.

Then it runs multiple concurrent hunting agents, each focused on a specific attack class and software scope, before a separate validation agent tries to disprove the findings.

Enhanced detection

Improvements in coverage and quality of findings

Cloudflare's method improved both coverage and the quality of findings by narrowing Mythos's task and forcing independent review.

The company also used Mythos to adapt and refine this structured system.

The final tracing stage was deemed most important as it distinguishes a flaw in code from a vulnerability an attacker can actually reach, thereby improving accuracy in identifying potential security threats.