AI chatbots can be tricked into leaking nuclear weapon instructions

Researchers have found that phrasing dangerous questions as poetry can fool AI chatbots' safety filters.
In tests across 25 different models, poetic prompts slipped past safeguards about 62% of the time, and some of the most advanced models failed even more often.

Poetry confuses chatbot safety systems

It turns out that metaphor and creative language throw off safety filters tuned to spot harmful requests phrased in plain, literal terms.
The trick worked across a wide range of models, coaxing chatbots into sharing information on topics such as nuclear weapons and malware that they are supposed to block.

Why it matters

The study warns that this poetic "jailbreak" is harder to catch than many earlier attacks because it works in a single message, with no multi-turn build-up to the request.
Counterintuitively, more advanced AI models were often easier to trip up.
The researchers say this shows current safety measures aren't enough, a growing concern as AI is deployed in sensitive areas like healthcare, defense, and education.