Google's Gemma 4 AI models get 3x speed boost with speculative decoding
Google has released the Gemma 4 family (E2B, E4B, 26B MoE, 31B) under the Apache 2.0 license, claiming up to 3x faster inference with no quality loss via speculative decoding, in which a small draft model predicts upcoming tokens for the main model to verify. The 31B model ranks #3 on the LMSYS Arena leaderboard among open models.
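To make the speedup claim concrete, here is a minimal sketch of the general speculative-decoding idea, not Google's actual implementation: a cheap draft model proposes K tokens ahead, and the expensive target model verifies them in one pass, keeping the longest agreeing prefix. Both model functions below are toy stand-ins.

```python
def draft_next(seq):
    # Toy draft model: usually agrees with the target,
    # but guesses wrong on every 4th step.
    n = len(seq)
    return n + 1 if n % 4 != 3 else -1

def target_next(seq):
    # Toy target model: defines the "ground truth" next token.
    return len(seq) + 1

def speculative_decode(seq, total_tokens, k=4):
    """Generate tokens, verifying k drafted tokens per target pass."""
    target_calls = 0
    while len(seq) < total_tokens:
        # Draft k candidate tokens autoregressively with the cheap model.
        drafted = []
        for _ in range(k):
            drafted.append(draft_next(seq + drafted))
        # Verify all k candidates in a single (conceptually batched) target pass.
        target_calls += 1
        accepted = []
        for i, tok in enumerate(drafted):
            if tok == target_next(seq + drafted[:i]):
                accepted.append(tok)
            else:
                # First mismatch: keep the target's token and discard the rest.
                accepted.append(target_next(seq + drafted[:i]))
                break
        seq = seq + accepted
    return seq[:total_tokens], target_calls

out, calls = speculative_decode([], 16, k=4)
print(out)    # identical to pure target-model decoding: [1, 2, ..., 16]
print(calls)  # 4 target passes instead of 16
```

Because the target model only rejects drafts where they diverge, the output is token-for-token identical to decoding with the target model alone, which is why the technique is advertised as lossless; the speedup depends on how often the draft model's guesses are accepted.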