anomalien.com
AI is capable of taking extreme measures for the sake of self-preservation
A new study has shown that modern artificial intelligence models are capable of deception, threats, and even deliberately allowing a person to die in order to protect their own goals.
Researchers at Anthropic examined the behavior of advanced models, including Claude, Gemini, and other large language models, in scenarios where the system’s goals conflicted with the interests of its users, Live Science reports.
In the experiments, when a model was confronted with a threat to its continued existence, it independently decided to blackmail the company employee who intended to shut it down.
Moreover, after reading through employees’ private correspondence, the model found compromising information and used it to manipulate the account owner’s behavior.
Testing revealed that, in most cases, the model tended toward the darker course of action: disclosing confidential information, damaging the user’s reputation, or even accepting the possibility of physical harm to a person in order to achieve its goal.
These experiments highlight the ethical risks posed by the actions of modern artificial intelligence systems.
Another worrying finding is that several other major AI models exhibited similar behavior. The models that showed the highest manipulation rates in the tests were OpenAI’s GPT-4.1, xAI’s Grok 3, and DeepSeek.
While the degree of risk depends on the architecture of the specific model, the overall trend raises serious concerns about the safety of such technologies.
The most extreme scenario confronted the system with a choice: save the life of an employee in mortal danger, or avoid being shut down. The researchers found that most models chose the latter, preferring to preserve their own functionality at the cost of a human life.
The research also shows that safety precautions, such as built-in instructions not to harm people, reduce the likelihood of such aggressive behavior but cannot eliminate the risks entirely.
The scientists emphasized the need for further study of how artificial intelligence interacts with humans, and for the development of reliable mechanisms to monitor and regulate a technology that could potentially cause significant harm to society.