GPT-5.5 release illustration

GPT-5.5 Is Here: Real Step Forward or Quiet Iteration?

TL;DR GPT-5.5 (“Spud”) is the first fully retrained base model since GPT-4.5, with architecture and pretraining reworked from scratch with agentic objectives in mind It takes the top spot on Terminal-Bench 2.0 (82.7%) and GDPval (84.9%), narrowly beating Anthropic’s Claude Mythos Preview on agentic coding benchmarks A 1M-token context window is new for OpenAI, enabling whole-codebase reasoning and long multi-step agent runs without context collapse Pricing is competitive ($5/$30 per million input/output tokens) but the strategic story is about OpenAI building an integrated super app - chat, code, browser agent - all driven by one model The gains are incremental, not a leap - but the full retraining signals OpenAI is betting the next two years on autonomous agentic work, not chat OpenAI released GPT-5.5 on April 23, 2026, weeks after GPT-5.4 and only months after GPT-5. The cadence is starting to feel relentless. Codenamed “Spud” internally, GPT-5.5 is the first fully retrained base model since GPT-4.5 - architecture, pretraining corpus, and agent-oriented objectives all reworked from scratch. ...

April 24, 2026 · 6 min · James M

Claude Mythos: The AI Benchmark Breaker That Won't Be Released

TL;DR Claude Mythos Preview set new records across coding, mathematics, and reasoning: 93.9% on SWE-bench Verified, 97.6% on USAMO 2026, and leads GPT-5.4 on every shared benchmark The USAMO result - a 55-point jump over Claude Opus 4.6 - suggests genuinely different reasoning capabilities, not just incremental improvement, and Anthropic screened against memorization concerns Despite dominating benchmarks, Mythos is not publicly available because it autonomously discovered thousands of zero-day vulnerabilities across every major OS and browser Access is restricted to 12 major tech and finance companies via Project Glasswing, a defensive cybersecurity research initiative backed by $100M in Anthropic usage credits The wider implication: we have entered an era where “the best model” and “the publicly available model” may be permanently different things, with security becoming a deployment constraint alongside capability Anthropic released Claude Mythos Preview on April 7, 2026 - and immediately announced it won’t be publicly available. ...

April 8, 2026 · 5 min · James M