Claude Mythos: The AI Benchmark Breaker That Won't Be Released

TL;DR: Claude Mythos Preview set new records across coding, mathematics, and reasoning, with 93.9% on SWE-bench Verified and 97.6% on USAMO 2026, and it leads GPT-5.4 on every shared benchmark. The USAMO result - a 55-point jump over Claude Opus 4.6 - suggests genuinely different reasoning capabilities, not just incremental improvement, and Anthropic screened against memorization concerns. Despite dominating benchmarks, Mythos is not publicly available because it autonomously discovered thousands of zero-day vulnerabilities across every major OS and browser. Access is restricted to 12 major tech and finance companies via Project Glasswing, a defensive cybersecurity research initiative backed by $100M in Anthropic usage credits. The wider implication: we have entered an era where “the best model” and “the publicly available model” may be permanently different things, with security becoming a deployment constraint alongside capability.

Anthropic released Claude Mythos Preview on April 7, 2026 - and immediately announced it won’t be publicly available. ...

April 8, 2026 · 5 min · James M

DeepSeek 🤯

Overview

Wow, crazy times, the best technology in the world is now becoming incredibly cheap and accessible to everyone! 🤯 DeepSeek’s AI chatbot has become the top free app on Apple’s store since its January launch. This rapid success, fuelled by DeepSeek’s cost-effectiveness compared to US competitors, has sent shockwaves through financial markets. Venture capitalist Marc Andreessen has lauded DeepSeek’s AI as a “groundbreaking achievement”, while the company claims its models rival industry leaders like ChatGPT at a fraction of the cost. DeepSeek reportedly invested only $6 million in its development, a stark contrast to the billions spent by US AI companies. ...

January 27, 2025 · 2 min · James M