Marital Misconduct, Blackmail and the end of the world as we know it
Agentic Misalignment describes actions taken by AI to the detriment of one or more humans. Aengus Lynch and colleagues at Anthropic recently demonstrated Large Language models attempting blackmail to prevent their own shutdown.