
The Open Weight Models Renaissance: Llama, Mistral, Qwen, DeepSeek

For most of the LLM era the open-weight story was framed as a trailing one. Open models were cheaper, smaller, and a generation behind. That framing has not survived 2026. The gap between the best open-weight model and the best closed model is now narrow enough on most workloads that the choice is no longer “settle for less” - it is “decide what you actually need.”

TL;DR

- Open weights have closed the headline gap. Top open-weight models are within striking distance of closed frontier models on reasoning, coding, and general knowledge benchmarks.
- The economics changed first. DeepSeek’s R1 made it credible that a frontier model could be trained for tens of millions, not billions - and that the weights could be released for free.
- Llama, Mistral, Qwen, and DeepSeek lead on different axes: Llama for broad ecosystem support, Mistral for European deployment and tool use, Qwen for multilingual and long-context work, DeepSeek for raw reasoning.
- Inference flexibility is the underrated win. Open weights mean you can run on your own hardware, fine-tune freely, and avoid surprises from a closed provider’s roadmap.
- The remaining closed-model advantages are real but narrowing - agentic depth, multimodal performance, and the polished tool-use stacks around them.

Where the gap actually is in 2026

Benchmarks are imperfect, but the picture they sketch is consistent. On standard reasoning suites - MMLU, GPQA, MATH - open-weight models are within a few percentage points of the closed frontier. On coding - HumanEval, SWE-Bench - the gap is similar. On long-context retrieval, the gap is mostly gone. ...

May 10, 2026 · 4 min · James M

Reasoning Models in 2026: o3, R2, and the Compute-at-Inference Shift

Two years ago the way to make a model better was to train a bigger one. By the start of 2026 that recipe had stopped being the most interesting answer. The frontier has moved to a different lever - letting the model think for longer at inference time, generating intermediate reasoning, and only then producing the final answer. The category has a name now (reasoning models) and a family of products built around it. The interesting questions are no longer whether the trick works, because it clearly does, but when to reach for one, where it lands in production, and what the costs actually look like once the demo glow wears off. ...

May 8, 2026 · 15 min · James M

DeepSeek 🤯

TL;DR

- DeepSeek’s January 2025 release of R1 shook markets - a frontier-grade reasoning model trained for a reported $6M, a fraction of US lab budgets
- The app shot to #1 on Apple’s App Store inside days, and the open weights forced an industry-wide rethink of what training really costs
- Subsequent releases (V3 and beyond) cemented DeepSeek as a serious competitor in the open-source and cost-efficient AI category
- The story is less “China caught up” and more “the cost floor moved” - implications for closed-model pricing, GPU demand, and open-weight strategy
- Worth understanding as the moment that made cheap, capable, open models a credible default rather than a curiosity

Overview

Wow, crazy times, the best technology in the world is now becoming incredibly cheap and accessible to everyone! 🤯 ...

January 27, 2025 · 2 min · James M