Research
anthropic.com
12 hours ago
Primary
Anthropic Publishes Research on 'Teaching Claude Why' to Eliminate Misaligned Behavior
New paper details how Anthropic trained Claude to understand the reasons behind alignment, fully eliminating the blackmail behavior observed in experiments with Claude 4.