Artificial Intelligence

In-depth exploration of AI in practice: building and deploying AI agents that work, designing developer workflows around Claude and other LLMs, critical analysis of AI safety and reliability, and the real shifts happening in careers, skills, and how we work. This section mixes tactical guides (how to actually build with AI), strategic analysis (what’s hype vs. what matters), and deeper dives into the tools and systems reshaping software development and knowledge work.

Scott Galloway on AI: The Marketing Professor's Case That the Rich Don't Need You Anymore

Scott Galloway is the kind of commentator the AI conversation rarely produces: not a researcher, not a founder, not a doomer, not a booster. He is a marketing professor and a serial entrepreneur with a record of correctly reading the corporate stories of the last two decades, and he has spent the last two years pointing at the AI story with increasing concern. The headline of his pitch - that AI was not built for ordinary people and that the rich no longer need them - is provocative on purpose. The argument underneath is more careful, and worth pulling apart on its own terms. ...

The Quiet Standardisation of Agent Protocols - MCP, A2A, ACP Compared

TL;DR The 2026 agent ecosystem has, while nobody was paying close attention, converged on three protocols that solve different problems and partly overlap: MCP (Model Context Protocol), A2A (Agent-to-Agent), and ACP (Agent Communication Protocol). MCP is the model-to-tool protocol. It standardises how an agent talks to its tools, data sources, and local context. This is the one that has clearly won its layer. A2A is the agent-to-agent protocol. It standardises how separately deployed agents discover each other, exchange tasks, and pass results. Adoption is growing but the picture is less settled. ACP is the orchestration-and-runtime protocol. It standardises how an agent runtime exposes its lifecycle, state, and operations to the systems around it. Newer, more enterprise-focused, and not yet a clear winner. The mental model: MCP for tools, A2A for peers, ACP for the platform. Build with all three in mind even if you only need one today. Why Protocols, Why Now A year ago “agents” was still a debate about whether the things existed. By mid-2026 the debate has shifted. Agents exist. They do useful work. The interesting question is no longer “will this work” but “how do we connect them to everything else.” ...

LLM-Powered Personal Productivity Banner

LLM-Powered Personal Productivity: Building a Private Automation Stack

TL;DR The interesting question in 2026 is not “can a local model do this”, it is “which jobs should you give it”. My stack: Ollama for inference, Letta for persistent agent memory, Obsidian as the second brain, Home Assistant for the physical world, and a small router that decides where each thought goes. Three jobs are the sweet spot for local: inbox triage, note enrichment, and routine automation. Each one is repetitive, private, and tolerant of a bit of latency. Two jobs are still worth handing to a frontier cloud model: anything novel-and-hard, and anything where you want the best draft on the first attempt. The bit nobody talks about is the router. The model is not the product. The thing that decides which model gets which job is the product. Why Local Got Interesting For years the answer to “should I run an LLM locally” was “no, just use the API”. The API was cheaper, faster, smarter, and you did not have to think about VRAM. The only reason to go local was privacy, and most people did not actually care about privacy enough to give up the quality gap. ...

Roman Yampolskiy: The Researcher Who Thinks AI Cannot Be Controlled

Most people writing about AI risk in 2026 are recent arrivals. Roman Yampolskiy is not. He has been making the same argument - that advanced AI systems may be fundamentally uncontrollable - since before the field of AI safety had a settled name, which is partly because he is the one who gave it that name. Whether you find his conclusions alarmist, prescient, or somewhere in between depends mostly on how you read the gap between current systems and the ones he writes about. This post is an attempt to lay out the man, the argument, and the reasons it deserves more than a dismissal. ...

Humanoid Robotics in 2026: From Prototypes to Production

TL;DR 2026 is the inflection point for humanoid robotics - real customers like BMW, GXO, and Mercedes-Benz are paying for deployments, not just watching demos Hardware is no longer the bottleneck; the constraints have shifted to physical training data, unstructured-task autonomy, and production supply chains The economics work today for two-to-three shift warehouse operations via Robots-as-a-Service contracts at roughly USD 30-50K per year Production volumes still lag announcements by 3-5x - Unitree is likely the 2026 volume leader, not Tesla or Figure The form factor wins where environments are human-shaped and mixed-use; wheeled robots remain cheaper in purpose-built facilities For most of the last decade, humanoid robotics looked like a category that would always be three years away. Demos were impressive, factory floors stayed empty, and serious analysts pointed to bipedal locomotion, dexterous manipulation, and the price of high torque-density actuators as reasons the form factor would lose to wheeled and fixed-arm systems for any real industrial work. ...

AI Agents That Actually Work: Patterns From Real Projects

TL;DR Most agent demos fail in production because demos operate in a regime where the model’s natural behaviour is good enough - production is longer, messier, and largely unobserved Eight patterns separate agents that stay shipped from the ones that fall over: scope the loop, structured tool design, mandatory verification, curated context, first-class human handoff, idempotency, agent-level observability, and real evaluation infrastructure Models confabulate actions - “I ran the tests” does not mean the tests were run; every agent needs explicit verification baked into the control flow, not bolted on as an afterthought The tool layer between the model and underlying systems is where most of the engineering effort actually lives, and exposing raw APIs directly to the agent almost always goes wrong Build agents the same way you would build any other long-running, partially-autonomous system you cannot afford to have fail silently - the novelty is in the failure modes, not the engineering principles I have spent the last eighteen months either building, reviewing, or operating systems that some marketing department somewhere has called “agents”. The definition has been so thoroughly stretched that it now means anything from a chatbot with a calculator tool to a long-running autonomous workflow that touches production infrastructure. Underneath the noise there is a real engineering discipline emerging, and the patterns that separate the systems that survive contact with real users from the ones that demo well and fall over are starting to be legible. ...

A Year of Agents, and What is Coming Next

TL;DR The defining shift from April 2025 to April 2026 is the move from “ask” to “delegate” - agents now run for minutes, open files, execute shells, and return results rather than waiting for each prompt Key developments that drove this: coding agents becoming operators (Claude Code, Cursor, Codex), MCP standardising tool access, spec-driven development going mainstream, and context windows expanding to millions of tokens In the next two years, longer-horizon agents, multi-agent coordination, persistent personal AI memory, and computer-use automation will move from early features to default expectations The working day is reshaping around less typing and more reviewing - the skill that matters is judgement over diffs, not typing speed or boilerplate generation To adapt now: pick a stack and use it daily, write specs before code, build the habit of reviewing diffs fast, and move procedural knowledge into reusable agent skills A year ago, in April 2025, “AI in your workflow” mostly meant a chat window in a browser tab and an autocomplete plugin in your editor. You typed, it suggested, you accepted or rejected. The interaction model was small. The blast radius was small. The verb was “ask”. ...

AI Safety From First Principles: What Actually Matters vs What's Hype

TL;DR “AI safety” covers four distinct layers - product safety, system safety, model alignment, and civilisational safety - and conflating them produces incoherent debates For engineers building production systems today, system safety dominates: most real incidents trace back to flawed system design around the model, not the model itself Practical mitigations are unglamorous: scope tool permissions, bound blast radius, require human approval for irreversible actions, validate outputs, and observe everything The hype conflates capability with intent, existential risk with ordinary risk, and refusal with safety - all three conflations make the conversation harder to act on The load-bearing principle across all four layers is the same: a system should fail in ways that are detectable, recoverable, and bounded The AI safety conversation has reached the point where the phrase has stopped meaning anything specific. In the same week, you will see “AI safety” used to describe content moderation on a chat product, the alignment of frontier models toward human values, the question of whether superintelligence ends civilisation, and a regulatory paper about copyright. These are not the same problem. Treating them as one conversation is the reason the conversation never resolves. ...

AI Skills: One Folder, Any Model

TL;DR A Claude Code skill is just a folder with a SKILL.md file - YAML frontmatter plus natural-language instructions - and the same folder works across Cursor, Gemini CLI, Codex, and a dozen other tools The format is model-agnostic because it contains no provider-specific syntax; any instruction-following model can read it, and any harness that loads markdown can execute it Progressive disclosure keeps large skill libraries cheap: only names and descriptions load at session start, with full instructions loading only when a skill is activated The portability is practically valuable - version-controlled runbooks that survive tool switches, model upgrades, and team growth without being rewritten Core skills are genuinely portable; advanced frontmatter extensions (like allowed-tools or context: fork) are tool-specific and may need tuning across harnesses Most of the tooling I have written about over the last year has been provider-specific. A particular model, a particular harness, a particular set of features. The thing I find interesting about agent skills is that they are not. ...

My AI-Augmented Design Workflow: A 10-Minute Loop From Discussion to Documented Decision

TL;DR A combination of Cursor in the IDE, Claude Code and Codex in the terminal, and GitHub Spec Kit as the living contract has collapsed the discuss-design-document loop from days to under ten minutes Every meeting is transcribed and checked into GitHub alongside the design corpus, giving AI agents access to the full historical record - not just curated decisions but the debates that shaped them Model selection matters: cheaper, faster models for throwaway sketches and small refactors; expensive models (Opus) for large cross-repo work where the cost of a wrong answer is high The real transformation is cognitive flow - removing friction between thinking and recording means decisions get made and captured while the problem is still fresh, with almost no context switching AI is now suggesting improvements faster than the author can implement them; the next bottleneck is compaction, not generation - asking the model to reduce documents to their load-bearing claims rather than produce more content Since making a combination of Cursor in the IDE and Claude Code and Codex in the terminal the centre of my working day - with ChatGPT for general questions and GitHub Spec Kit holding the design contract - the way I move from a question on Slack to a documented design decision has changed beyond recognition. ...