machinebrief.com
Transformers in Transition: What Weight Decay Reveals About AI Models
New research on transformers and modular arithmetic shows how weight decay shifts model behavior from memorization to learning, and even to collapse.