Claude 4.5 Opus is coming, but first, a jailbreak test
Anthropic, the San Francisco-based artificial intelligence (AI) firm, is about to launch Claude 4.5 Opus, a new version of its chatbot designed to be more resistant to "jailbreaks" (prompts crafted to get an AI to break its own rules).
The company is putting the update through intensive testing before release to confirm that it holds up.
Anthropic's new model will be tested by red-teamers
To put Claude 4.5 Opus through its paces, Anthropic has let red-teamers test its Neptune V6 model and kicked off a 10-day challenge for outside experts: find a way to break the safety system and you could earn a bonus.
Universal jailbreaks are a big deal because a single cleverly worded prompt technique can trick multiple AI models into ignoring their safety rules.
Anthropic's push shows the company is serious about keeping its AI safe from these kinds of exploits.