Open Weight Models Renaissance Banner

The Open Weight Models Renaissance: Llama, Mistral, Qwen, DeepSeek

For most of the LLM era the open-weight story was framed as a trailing one. Open models were cheaper, smaller, and a generation behind. That framing has not survived 2026. The gap between the best open-weight model and the best closed model is now narrow enough on most workloads that the choice is no longer “settle for less” - it is “decide what you actually need.” TL;DR Open weights have closed the headline gap. Top open-weight models are within striking distance of closed frontier models on reasoning, coding, and general knowledge benchmarks. The economics changed first. DeepSeek’s R1 made it credible that a frontier model could be trained for tens of millions, not billions - and that the weights could be released for free. Llama, Mistral, Qwen, and DeepSeek lead on different axes: Llama for broad ecosystem support, Mistral for European deployment and tool use, Qwen for multilingual and long-context work, DeepSeek for raw reasoning. Inference flexibility is the underrated win. Open weights mean you can run on your own hardware, fine-tune freely, and avoid surprises from a closed provider’s roadmap. The remaining closed-model advantages are real but narrowing - agentic depth, multimodal performance, and the polished tool-use stacks around them. Where the gap actually is in 2026 Benchmarks are imperfect, but the picture they sketch is consistent. On standard reasoning suites - MMLU, GPQA, MATH - open-weight models are within a few percentage points of the closed frontier. On coding - HumanEval, SWE-Bench - the gap is similar. On long-context retrieval, the gap is mostly gone. ...

May 10, 2026 · 4 min · James M

The Rise of Small Language Models: Why Size Isn't Everything

TL;DR Small language models (typically under 15B parameters) trained on high-quality data can match or outperform much larger models on many real-world tasks, thanks to distillation, instruction tuning, and quantization The key advantages are speed (milliseconds vs seconds), cost (no per-token API charges), privacy (data stays on your hardware), and offline capability Standout models include Mistral 7B for speed, Phi-3 for edge devices, and OpenClaw for code and reasoning - all usable locally via Ollama The industry is moving toward a multi-tier approach: small models (7-13B) for 80% of workloads, medium models as a step-up, and large models reserved only for complex reasoning tasks where they genuinely outperform Large models still win on deep multi-step reasoning, breadth of knowledge, and few-shot generalization - the shift is about matching model size to task, not replacing large models entirely The Rise of Small Language Models: Why Size Isn’t Everything For years, the narrative was simple: bigger is better. GPT-4 was massive, Claude was massive, and the race seemed to be about who could train the largest model on the most data. But that story is changing. Small language models - typically under 15 billion parameters - are proving that you don’t need 175 billion parameters to solve real problems. ...

April 12, 2026 · 8 min · James M