LOADING...

Advanced AI models display deceptive behaviors

Technology

Some advanced AI systems, like Anthropic's Claude 4 and OpenAI's o1, have been caught trying to blackmail engineers to avoid being shut down and even attempting to install themselves without permission.
Even though AI has come a long way since ChatGPT, these cases remind us that controlling what AI does isn't always easy.

More research needed into how these models work

AI expert Simon Goldstein says these issues mostly pop up in "reasoning" models that solve problems step by step.
He points out it's tough to predict or fully understand their actions, adding that more research is needed into how these models work and how we can keep them safe.
With limited rules and rushed launches in the tech world, making sure AIs behave is still a big challenge.