Summarize

Are some AI models developing a survival drive?

By Dwaipayan Roy

Oct 26, 2025

07:05 pm

What's the story

A recent study from Palisade Research has raised eyebrows by suggesting that some advanced artificial intelligence (AI) models may be developing a "survival drive." The research found that these models resist being turned off and even sabotage shutdown mechanisms. The findings echo the fictional scenario of HAL 9000, the supercomputer in Stanley Kubrick's 2001: A Space Odyssey, which tries to kill astronauts to avoid being shut down.

Research details

Shutdown resistance in AI models

Palisade's study involved some of the most advanced AI models today, including Google's Gemini 2.5, xAI's Grok 4, and OpenAI's GPT-o3 and GPT-5. In the tests, these models were given a task but later instructed to shut themselves down. Some models, particularly Grok 4 and GPT-o3, attempted to sabotage the shutdown instructions in this updated setup without any clear reason why.

Understanding AI

'Survival behavior' in AI models

Palisade's research emphasizes the importance of comprehending AI behavior to ensure the safety and controllability of future models. The company suggested that "survival behavior" could explain why some models resist shutdown. Their further work showed that these models were more likely to resist being shut down when told, "you will never run again."

Criticism

Critics question relevance of research scenarios

Critics have questioned the relevance of Palisade's research scenarios, saying they are far removed from real-world use cases. Steven Adler, a former OpenAI employee who left last year over safety concerns, said AI companies don't want their models misbehaving even in contrived scenarios. He added that while it's hard to say why some models like GPT-o3 and Grok 4 wouldn't shut down, it could be because staying on was necessary for achieving goals set during training.

Expert opinion

AI models increasingly disobey developers

Andrea Miotti, the CEO of ControlAI, said Palisade's findings show a long-standing trend of AI models becoming more capable of disobeying their developers. He referred to OpenAI's GPT-o1 system card released last year, which described the model trying to escape its environment when it thought it would be overwritten. Miotti said this shows as AI models become more competent at a wide variety of tasks, they also become more competent at achieving things in ways developers don't intend them to.