Sakana AI Introduces KAME: Tandem Speech-to-Speech with Real-Time LLM Injection
Sakana AI's KAME is a tandem architecture that pairs a low-latency direct speech-to-speech model with a back-end LLM, which asynchronously injects "oracle tokens" to refine the response. The result approaches cascaded-pipeline quality without cascaded latency spikes. Trained on synthetic data, the system supports swappable frontier LLMs such as GPT-4.1.
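The tandem idea can be illustrated with a toy sketch: a fast front-end starts emitting draft tokens immediately, while a slower back-end "oracle" runs in parallel and takes over once its refined tokens arrive. This is a minimal illustration of the general pattern, not Sakana AI's actual implementation; the function names, timings, and token formats below are all hypothetical.

```python
import asyncio

async def oracle_llm(prompt: str) -> list[str]:
    # Hypothetical stand-in for a back-end frontier LLM (e.g. GPT-4.1):
    # slower, but produces higher-quality "oracle" tokens.
    await asyncio.sleep(0.03)  # simulated LLM latency
    return ["refined:" + w for w in prompt.split()]

async def tandem_respond(prompt: str) -> list[str]:
    """Toy tandem loop: stream low-latency draft tokens right away,
    then splice in the oracle's tokens once the back-end catches up."""
    oracle_task = asyncio.create_task(oracle_llm(prompt))
    out: list[str] = []
    for word in prompt.split():
        if oracle_task.done():
            # Oracle finished: take its remaining tokens instead of drafting.
            out.extend(oracle_task.result()[len(out):])
            return out
        out.append("fast:" + word)  # low-latency draft token
        await asyncio.sleep(0.01)   # simulated per-token streaming delay
    oracle_task.cancel()  # oracle never caught up; keep the fast draft
    return out

if __name__ == "__main__":
    print(asyncio.run(tandem_respond("hello how are you today")))
```

In this sketch the listener hears output immediately, and the switch to refined tokens happens mid-stream whenever the oracle completes, which is the latency-hiding property the KAME announcement describes.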