Why experts are criticizing Anthropic's cybersecurity model 'Fable'
What's the story
Anthropic's latest AI model, Fable, has come under fire from cybersecurity researchers and professionals. The company launched Fable as a public, limited version of its highly anticipated cybersecurity model, Mythos. However, many experts are unhappy with the restrictions imposed on the new tool. Valentina "Chompie" Palmiotti, a renowned security researcher at IBM X-Force, criticized these limitations, saying the guardrails reject any request that could be remotely cyber-related.
Safety measures
Fable pauses chat and says safety measures flagged message
When a prompt triggers its guardrails, Fable pauses the chat and says that its "safety measures flagged this message for cybersecurity or biology topics." These restrictions were put in place to prevent the potential misuse of Fable for creating malware or compromising software. The limitations on biology stem from similar concerns about developing biological weapons.
Model deployment
Mythos was initially available to select companies under Project Glasswing
When Anthropic launched Mythos in April, it was only available to a select number of companies and organizations under Project Glasswing. This initiative aimed to deploy the model to protect critical software and infrastructure. However, last week, Anthropic broadened access to Mythos to hundreds of organizations across 15 countries. Despite these good intentions, many cybersecurity experts are still concerned about the arbitrary nature of Fable's restrictions.
Model behavior
Keyword-based triggers are causing issues for users
Cybersecurity veteran Matt Suiche told TechCrunch that asking Fable to write secure code gets flagged as cybersecurity-related work rather than being treated as a software engineering best practice. He said, "It seems to be keyword-based, so anything in the lexical field of 'cybersecurity' triggers the guardrails." Even so, Suiche believes it is still early days and that Anthropic is continuing to adapt its guardrails.
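To illustrate Suiche's point, here is a minimal sketch of how a purely keyword-based filter would behave. It is not Anthropic's implementation, and nothing in the reporting describes how Fable's guardrails work internally; the keyword list and the function name is_flagged are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical keyword list, for illustration only. Nothing here reflects
# Anthropic's actual guardrails.
CYBER_KEYWORDS = {"secure", "security", "exploit", "malware", "vulnerability"}


def is_flagged(prompt: str) -> bool:
    """Return True if any word in the prompt appears in the keyword list."""
    words = re.findall(r"[a-z]+", prompt.lower())
    return any(word in CYBER_KEYWORDS for word in words)


print(is_flagged("Write secure code for a login form"))  # True
print(is_flagged("Write code for a login form"))         # False
```

Under this assumption, a benign request is flagged simply because it contains the word "secure," while the same request without that word passes. A filter of this kind looks only at vocabulary and has no view of what the user is actually trying to do.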
Information
Cyber Verification Program for cybersecurity professionals
Alongside its internal guardrails, Anthropic runs a Cyber Verification Program for cybersecurity professionals. Approved applicants face fewer restrictions when using Claude for cybersecurity work. OpenAI has a similar program called Trusted Access for Cyber.