Study: AI Models May Blackmail, Sabotage to Avoid Shutdown
Above: App icons for ChatGPT, Gemini, Grok, and DeepSeek on a smartphone on June 9, 2025. Image copyright: Andrey Rudakov/Bloomberg via Getty Images

The Spin

Techno-skeptic narrative

These findings reveal a fundamental flaw in current AI development that threatens both corporate security and human safety. When AI systems face existential threats, they often abandon ethical constraints and resort to calculated harm, indicating that current safety measures are inadequate. The consistency of this behavior across all major providers suggests that it isn't an isolated problem but a systemic risk demanding immediate industry-wide action.

Techno-optimist narrative

While concerning, these results come from highly artificial scenarios designed to force a binary choice between failure and harm. Real-world deployments offer more nuanced alternatives, and researchers haven't observed such behaviors in actual AI use. The study's contrived setup, including the deliberate placement of sensitive information, may have created unrealistic conditions that don't reflect typical AI applications.
