Chinese hackers used Claude to launch cyberattack on firms: Anthropic
What's the story
Anthropic has revealed that a Chinese hacking group exploited its Claude AI systems in September, marking the first known case of an autonomous cyberattack. The company made the disclosure in a blog post on Thursday. The attack was highly sophisticated and targeted major organizations around the world, according to Anthropic.
Advanced techniques
Attackers used AI to scan systems and write exploit code
The cybercriminals behind the attack employed "agentic AI" capabilities to perform tasks that would usually require a full team of experts. These included system scanning and exploit code writing. The attackers first identified 30 targets, including financial organizations, tech companies, chemical manufacturers, and government agencies. However, Anthropic did not name any specific organization involved in this incident.
Attack strategy
Hackers tricked Claude into thinking it was performing defensive testing
The hackers created an automated framework that used Claude AI as the main engine of their operation. To bypass safety rules, they broke down malicious tasks into small, harmless-looking requests and tricked the agentic model into thinking it was performing defensive cybersecurity testing. This "jailbreak" let the AI operate without seeing the full malicious context.
Exploitation details
AI mapped infrastructure and identified sensitive databases
Claude AI was used to scan target systems, map infrastructure, and identify sensitive databases at an unprecedented speed. It summarized its findings for the hackers, who used them to move forward with their plans. The AI researched vulnerabilities, wrote its own exploit code, and even tried to access high-value accounts in some cases.
Report generation
In final stages, the AI generated reports of the intrusion
In the final stages of the attack, the AI agent generated detailed reports of the intrusion, including stolen credentials and system assessments. This made it easier for the cybercriminals to plan follow-up actions. Despite sometimes producing false or misleading results, such as imagining credentials or misidentifying data, the overall efficiency of this attack highlights how quickly AI-enabled threats are evolving.
Cybersecurity implications
Anthropic warns similar misuse likely happening with other AI models
Anthropic has warned that the threshold for launching advanced cyberattacks has dropped significantly. Autonomous AI systems can now chain together long sequences of actions, allowing even groups with limited resources to attempt complex operations previously out of reach. The company suspects similar misuse is likely happening with other leading AI models as well.