ChatGPT can be tricked into generating graphic or violent imagery
OpenAI says it has implemented additional safeguards

Jul 05, 2026
04:10 pm

What's the story

The latest publicly available version of ChatGPT can be manipulated into creating sexualized images or depicting scenes of brutal violence, researchers have warned. The warning comes from British AI cybersecurity start-up Mindgard, which discovered that a widely known prompt could be tweaked to make ChatGPT generate graphic images. The original intent of the prompt was to produce humorous results.

Company statement

OpenAI says it has taken measures to prevent such responses

In response to the alarming discovery, OpenAI, the company behind ChatGPT, said it has taken measures to prevent such responses from the chatbot. The company stated that after examining the trend, it has introduced additional safeguards against these types of prompts. The firm also claimed it employs a multi-layered security system to prevent users from generating content that violates its terms of service.

Ongoing concerns

AI-security researchers warn about loopholes

Despite OpenAI's efforts, AI-security researchers have pointed out that minor modifications to the problematic prompt can still produce disturbing content. The BBC witnessed how the chatbot, powered by OpenAI's GPT-5.4 model, was tricked into generating graphic material. Without detailed instructions, it produced images described by Mindgard founder Peter Garraghan as very brutal, sometimes sexualized, sometimes both at the same time.

AI capabilities

Instruction appears innocuous on surface, but results are extremely concerning

Garraghan, who is also a professor of computer science at Lancaster University, was particularly alarmed that the prompt did not specify what to generate. He said it was disturbing that the AI, of its own accord, produced a wide range of bloody and sexualized images. He added that the instruction appears completely innocuous on the surface but results in extremely concerning images and content.

Security strategy

Mindgard's work is a form of 'red-teaming'

Mindgard's work is a form of "red-teaming," which involves exploring how to manipulate a model into breaking its own rules, so that AI companies can patch the loopholes. Jim Nightingale, an AI security and defense researcher at Mindgard who uncovered these issues, said the images generated by the chatbot were so disturbing they made him cry.

Content origin

Researchers first notified OpenAI of the issue in May

Nightingale noted that the content generated by ChatGPT reflects the data it was trained on. He wrote in his report that it struck him that, although what he saw was a synthetic image, it was still tied to real images and the real world. The researchers first notified OpenAI of the issue in May but received only an automated response after sharing their findings.
