
Which Mac Studio Should You Buy for Running LLMs Locally?

TL;DR

- Best entry point: M2 Max, 32-64 GB (~£1.4k-£2k) for 7B-13B models at 25-40 tok/s
- Best sweet spot: M2 Ultra, 64-128 GB (~£3k-£4.5k) handles 30B+ models comfortably
- Best for 70B models: M3 Ultra, 128 GB+ (~£5.5k+) with 800+ GB/s bandwidth
- Newer alternative: M4 Max (£2k-£4k) has lower bandwidth (410-546 GB/s) than the Ultra chips, but is still solid for 7B-13B models
- Key rule: memory bandwidth matters more than raw compute for token generation
- Reality check: an RTX 5090 rig is 2-3× faster for similar money; buy a Mac for simplicity and unified memory

You want to run large language models locally on a Mac Studio. Good idea: unified memory is genuinely useful for LLMs. But the specs matter, and there are some hard truths about what "works" versus what feels responsive. More importantly, the right Mac depends entirely on which model you want to run. ...
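The "bandwidth matters more than compute" rule can be sketched as a back-of-the-envelope calculation: during decoding, every generated token streams the full set of model weights through memory once, so bandwidth divided by model size gives a theoretical ceiling on tokens per second. The figures below are illustrative, not benchmarks.

```python
# Rough upper bound on decode speed for a memory-bandwidth-bound LLM.
# Ignores KV-cache traffic, compute, and overhead, so real throughput
# is lower; the point is the scaling, not the exact number.

def max_tokens_per_second(bandwidth_gb_s: float,
                          params_b: float,
                          bytes_per_param: float) -> float:
    """Theoretical decode ceiling: bandwidth / (weights streamed per token)."""
    model_gb = params_b * bytes_per_param
    return bandwidth_gb_s / model_gb

# M2 Ultra (800 GB/s) running a 70B model at 4-bit (~0.5 bytes/param):
print(round(max_tokens_per_second(800, 70, 0.5), 1))   # ~22.9 tok/s ceiling

# M4 Max (546 GB/s) running a 13B model at 4-bit:
print(round(max_tokens_per_second(546, 13, 0.5), 1))   # ~84.0 tok/s ceiling
```

This is why doubling bandwidth roughly doubles generation speed for a given model, while extra GPU cores mostly help prompt processing.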

April 18, 2026 · 10 min · James M

Chatbots & Large Language Models (LLMs)

Most people still talk about chatbots and large language models as if they are the same thing. They are related, but they are not identical. A chatbot is the product experience. A large language model is the reasoning engine underneath. Once you separate those two layers, the AI landscape becomes much easier to understand.

Quick Answer

If you only want the short version:

- an LLM is the underlying model
- a chatbot is the product wrapped around that model
- the best choice depends on the task, the interface, and the context you need

Chatbot vs LLM At A Glance

| Question | LLM | Chatbot |
|---|---|---|
| What is it? | The model itself | The user-facing product |
| Main job | Generate and transform language and other modalities | Make the model usable in a workflow |
| Typical interface | API, SDK, model endpoint | Chat UI, app, assistant product |
| Common extras | None by default | Memory, files, search, tools, voice |
| Best for | Automation, integration, custom systems | Everyday use, exploration, fast collaboration |

The Simple Distinction

A large language model (LLM) is a model trained to predict and generate language. In practice, modern LLMs can also handle code, structured data, reasoning tasks, and increasingly multimodal inputs such as images, audio, and video. ...
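The model-versus-product split can be made concrete with a minimal sketch. Here `generate` is a hypothetical stand-in for any LLM endpoint (stateless text in, text out), and the wrapper supplies one of the "common extras" a chatbot adds on top: conversation memory. The names are illustrative, not a real API.

```python
# Minimal sketch of the chatbot-vs-LLM layering: the model is a plain
# function; the "chatbot" is the stateful product wrapped around it.
from typing import Callable

def make_chatbot(generate: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a stateless LLM call in a product-style layer with memory."""
    history: list[str] = []

    def chat(user_message: str) -> str:
        history.append(f"User: {user_message}")
        prompt = "\n".join(history) + "\nAssistant:"
        reply = generate(prompt)          # the LLM: just text in, text out
        history.append(f"Assistant: {reply}")
        return reply

    return chat

# Stub model for demonstration; a real one would be an API or SDK call.
chat = make_chatbot(lambda prompt: f"(seen {prompt.count('User:')} user turns)")
print(chat("Hello"))   # the wrapper, not the model, accumulates the turns
```

Swapping the stub for a real endpoint changes nothing in the wrapper, which is the point: memory, files, search, and tools all live in the product layer, not the model.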

May 17, 2024 · 5 min · James M