GPT-5.5 crushes Claude Opus 4.7 in agentic coding benchmarks
OpenAI's GPT-5.5 scores 82.7% on Terminal-Bench, outperforming rivals on real-world agentic tasks and topping some academic evals. Early tests show strengths in coding and cybersecurity simulations.