A year ago, in April 2025, “AI in your workflow” mostly meant a chat window in a browser tab and an autocomplete plugin in your editor. You typed, it suggested, you accepted or rejected. The interaction model was small. The blast radius was small. The verb was “ask”.
In April 2026, the verb is “delegate”.
That is the headline, and it is not subtle once you go looking for it. The tools you use day to day no longer wait for prompts. They run for minutes at a time, open files, edit them, run shells, spin up sub-agents, browse the web, and come back with a result that is either roughly right or visibly wrong. You are no longer in the loop on every keystroke. You are in the loop on the outcome.
This post is a stocktake. What changed in the last twelve months, what is clearly coming in the next twenty-four, and what it does to the shape of a working day - both at work and at home.
What Changed in the Last Year
I want to be honest about how much of this snuck up on people. None of it was a single moment. There was no GPT-3-shaped thunderclap. It was a steady drumbeat of capability and ergonomics improvements, and the cumulative effect is that the tools now do qualitatively different work.
A rough taxonomy of what shifted:
Coding agents went from “assistant” to “operator.” Claude Code, Cursor, OpenAI Codex, and a long tail of open-source harnesses (Cline, OpenHands, Goose, Aider) all converged on the same shape: a long-running loop where the model plans, edits, runs, and reviews on its own, and the human judges the diff. The thing you tab between while you make coffee is not the same thing it was a year ago.
Long-running, multi-step tasks became routine. A task that takes thirty minutes of agent time was exotic in mid-2025. By the end of 2025 it was normal. Now sub-agents fan out, work in parallel, and report back to a coordinator. I have written about this shift in Agent-First Architecture, and the gist is that you spend less time writing code and more time deciding what should run and reviewing what came back.
Standards started showing up. Two are worth singling out. MCP (the Model Context Protocol) has become the default way an agent talks to your tools - your filesystem, your database, your calendar, your issue tracker. And Agent Skills - a folder with a SKILL.md file - has become the default way you give an agent procedural knowledge that works across tools. Both are open. Both are boring on purpose. Both are the right kind of boring. I wrote up the skills story in AI Skills: One Folder, Any Model.
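To make the skills idea concrete: the layout really is just a folder per skill with a SKILL.md inside. Here is a minimal Python sketch that scans such a tree - the folder name and file contents below are invented for illustration, not taken from any particular harness:

```python
from pathlib import Path
import tempfile

def discover_skills(root: Path) -> dict:
    """Map each skill folder's name to the title line of its SKILL.md."""
    skills = {}
    for skill_md in sorted(root.glob("*/SKILL.md")):
        title = skill_md.read_text().splitlines()[0].lstrip("# ").strip()
        skills[skill_md.parent.name] = title
    return skills

# Build a throwaway tree to show the assumed layout:
#   skills/
#     release-notes/
#       SKILL.md
root = Path(tempfile.mkdtemp()) / "skills"
(root / "release-notes").mkdir(parents=True)
(root / "release-notes" / "SKILL.md").write_text(
    "# Release notes\nSummarise merged PRs since the last tag.\n"
)

print(discover_skills(root))  # {'release-notes': 'Release notes'}
```

That is the whole trick: plain files, no SDK, so any model that can read a filesystem can pick the skills up.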
Spec-driven development stopped being a meme. GitHub’s Spec Kit - and the broader pattern of writing a spec first, generating an implementation, and reviewing the diff - is now how most teams I talk to actually run. The interesting move is that the spec is the artefact. The code is the by-product. That is a different mental model from “AI helps me write code.”

Context windows stopped being scarce. A million tokens is normal. Several million is not unusual. Prompt caching means that loading a whole repo into context is cheap on the second turn. The economics of “let the model read everything before it answers” are no longer absurd. I wrote about this trajectory in The LLM Context Window Arms Race.
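To put rough numbers on that claim - the prices below are made-up round figures, purely to show the ratio, not any vendor's actual pricing:

```python
# Illustrative only: token counts and prices are invented round numbers.
# The point is the ratio between a cold read and a cached one.
REPO_TOKENS = 800_000          # a whole mid-sized repo loaded into context
PRICE_PER_MTOK = 1.00          # $ per million input tokens, uncached
CACHED_DISCOUNT = 0.10         # cached reads at 10% of the uncached price

first_turn = REPO_TOKENS / 1_000_000 * PRICE_PER_MTOK
later_turn = first_turn * CACHED_DISCOUNT

print(f"first turn:  ${first_turn:.2f}")   # $0.80
print(f"later turns: ${later_turn:.2f}")   # $0.08
```

Pennies per turn to keep the entire codebase in view is what makes "read everything first" a sane default.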
Local AI stopped being a toy. A Mac Studio with an M4 Ultra runs a 70B-class model fast enough to be useful. Ollama and LM Studio made the install painless. The frontier still lives in the cloud, but a serious chunk of “small task, sensitive data” work now happens on a desk.
Voice and multimodal got real. Dictation that actually keeps up. Models that read a screenshot and act on it. Computer use that drives a browser well enough to book a flight or extract a number from a PDF. A year ago these felt like demos. Now they feel like features.
If you put all of that together, the change of state is the move from “AI helps me do my job” to “AI does parts of my job, and I supervise.”
What Is Definitely Coming in the Next Two Years
I want to be careful here. The track record of AI predictions is unkind to anyone who picks specific dates. So I am only going to list things where the trajectory is so well-established that the question is “when, exactly?” rather than “if at all?”
Longer-horizon agents. Today’s agents reliably do tasks that take minutes, sometimes an hour. The frontier of useful is now several hours of unattended work. By 2027 a job-day of agent work, with reasonable check-ins, will be normal. Whether that means “writes a feature end-to-end” or “investigates a bug across three services and proposes a fix” depends on the day, but the direction is the same.
Multi-agent systems as the default. Single-agent runs are already starting to feel quaint. The shape that wins is a coordinator and a fleet of specialised sub-agents - one for research, one for code, one for review, one for tests - each running in parallel against the same goal. You will not write the agents. You will configure them.
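Stripped of the model calls, the coordinator-and-fleet shape is just parallel fan-out. A toy Python sketch, with plain functions standing in for model-backed sub-agents - the names and roles are illustrative, not a real API:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sub-agents: each would be a model-backed worker in
# practice; here they are plain functions so the shape is visible.
def research(goal): return f"research notes for {goal!r}"
def write_code(goal): return f"draft patch for {goal!r}"
def review(goal): return f"review comments for {goal!r}"

def coordinator(goal, agents):
    """Fan one goal out to specialised sub-agents in parallel,
    then collect every report for a human to judge."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        futures = {name: pool.submit(fn, goal) for name, fn in agents.items()}
        return {name: f.result() for name, f in futures.items()}

reports = coordinator("fix flaky login test", {
    "research": research, "code": write_code, "review": review,
})
for name, report in reports.items():
    print(name, "->", report)
```

The configuration you will actually write is that dict at the bottom: which specialists exist, and what each is for.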
Personal AI with persistent memory. The thing nobody has gotten right yet is an assistant that remembers what you told it last month without you having to remind it. The pieces are now in place: stable embeddings, cheap long context, MCP access to your calendar and email and notes. Within two years this is a product, not a research project. I sketched out where this is going in Home AI Agent Memory That Lasts.
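The retrieval half of that memory story is not exotic. A toy sketch, with a bag-of-words counter standing in where a real embedding model would go - everything here is simplified for illustration:

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for a real embedding model: a bag-of-words vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class Memory:
    """Remember free-text notes; recall the closest one to a query."""
    def __init__(self):
        self.notes = []  # list of (text, vector) pairs

    def remember(self, text):
        self.notes.append((text, embed(text)))

    def recall(self, query):
        q = embed(query)
        return max(self.notes, key=lambda note: cosine(q, note[1]))[0]

m = Memory()
m.remember("dentist appointment moved to Thursday")
m.remember("prefers window seats on long flights")
print(m.recall("when is the dentist?"))  # dentist appointment moved to Thursday
```

Swap the toy `embed` for a real model and the note list for something durable, and you have the skeleton of the product the paragraph describes.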
Agents that drive your computer. “Computer use” - the model controlling a real browser or a real desktop - was clunky in 2025 and credible in 2026. By 2027 the booking, scheduling, form-filling, expense-claim layer of work life is largely something you supervise rather than something you do. Most knowledge workers will spend less time in browsers because their agents will spend more.
Cheaper inference, more capable small models. The cost of a frontier-quality token is falling roughly tenfold per year, and there is no reason to think the curve flattens before 2028. Small models keep eating tasks the big ones used to monopolise. The interesting consequence: cost stops being the constraint on agent design. Latency and quality become the only things that matter.
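The compounding is easy to underestimate. A back-of-envelope projection, taking the tenfold-per-year figure at face value - the $1,000 starting point is an invented example, not a real workload:

```python
# Back-of-envelope: if frontier-quality tokens get ~10x cheaper per year,
# a workload costing $1,000/month today projects like this. The 10x rate
# is the trend described above, not a guarantee.
monthly_cost = 1_000.0
for year in range(2026, 2029):
    print(f"{year}: ${monthly_cost:,.2f}/month")
    monthly_cost /= 10

# 2026: $1,000.00/month
# 2027: $100.00/month
# 2028: $10.00/month
```

Two more turns of that crank and the line item vanishes from the budget entirely, which is exactly why cost stops shaping agent design.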
Robotics moving from demo to deployment. Humanoid platforms - Tesla Optimus, Boston Dynamics Atlas, Unitree G1 - are the visible bit. The less visible bit is that the same model architectures driving software agents are starting to drive physical ones. Two years out, “an agent did it” stops being purely digital.
Regulation catches up, unevenly. The EU AI Act is in force. The US is patchwork. The UK is trying to thread the needle. None of this stops the trajectory, but it changes which products ship in which markets and how fast. Worth watching if your work crosses borders.
There are plenty of things I am genuinely uncertain about - whether we get a true general-purpose agent that holds together across weeks of work, whether AGI-shaped capabilities show up by 2028, whether the labour market reshapes faster than education can - but I would rather list the things I am confident about and stop there.
How This Changes the Working Day
The interesting question is not “what can the tools do?” but “what does the day look like?” Here is what I see, both in my own work as a data engineer and in the teams I talk to.
Less typing, more reviewing. The unit of work shifts from “write a function” to “decide whether this diff is right.” That is a different cognitive load. It rewards taste, judgement, and the ability to read code fast - skills that were always valuable but were never the bottleneck. They are now. I wrote about this in Taste is the New Scarcity.
Specs become the artefact. A clear, testable description of what you want is the new commit. The agent generates the code. You review the diff. The spec is what gets versioned, argued about, and revisited. If you are not already writing things down before you build them, you will be.
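One concrete way to treat the spec as the artefact is to write it as executable checks before any implementation exists. The function and behaviour below are hypothetical, purely to show the shape:

```python
import re

# The spec comes first and is the thing you version and argue about.
SPEC = """
slugify(title) turns a post title into a URL slug:
- lowercase, words joined by hyphens
- punctuation dropped
"""

def check_spec(slugify):
    """Executable form of the spec above."""
    assert slugify("Hello, World") == "hello-world"
    assert slugify("The Verb Is Delegate!") == "the-verb-is-delegate"

# A candidate implementation (agent-generated, human-reviewed):
def slugify(title):
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))

check_spec(slugify)
print("spec satisfied")
```

If the agent's next attempt at `slugify` passes `check_spec`, the diff review starts from "it meets the spec" rather than "does it even work?".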
Parallelism is real for individuals, not just teams. A single engineer running three or four agents in parallel against three or four tasks is no longer notable. It is just Tuesday. The constraint is review bandwidth, not implementation bandwidth. People who can keep four threads of work in their head while reviewing the diffs as they land have a real edge.
The “junior task” middle hollows out. The work that used to be “give it to a junior to learn the codebase” is the work an agent does best. That has uncomfortable consequences for how careers start, which I have chewed on in The Junior Developer Pipeline Problem. It is not solved. It needs to be.
Standups change shape. “What I did yesterday” becomes “what my agents did yesterday and what I approved.” Demos shift from “look at this code” to “look at these specs and the diffs the agents produced.” The metabolism of a team gets faster in a way that is genuinely uncomfortable for the first few months.
How This Changes Life Outside Work
This is the bit that gets less coverage and matters more.
Email, calendar, scheduling. The amount of mental load you carry around for “who do I owe a reply to, when is that thing, did I confirm the booking” is going to shrink. Personal assistants with MCP access to your inbox, calendar, and a scratchpad of your preferences will do this layer. Not perfectly. Well enough that you stop noticing.
Research and decisions. Buying a flat, choosing a school, planning a trip, comparing pension providers. The work is currently slow, lonely, and dependent on whoever has the time to grind through PDFs. A capable assistant collapses the time cost by an order of magnitude. The decision is still yours. The legwork is not.
Health and admin. Reading lab results, summarising a consultant’s letter, working out which insurance form to fill in, chasing the council about the bins. None of this is glamorous. All of this drains hours from a week. Agents will do almost all of it within two years, with you in the loop only when something matters.
Learning. A patient tutor who knows what you already know, has read the textbook, and can run examples on demand is now a real thing. The bottleneck on learning a new field stops being “find a teacher” and becomes “decide what you want to learn.” That is a much better problem to have.
The flip side: attention. The same systems that save you hours can also fill them. Always-on agents, always-on summaries, always-on suggestions. The skill that compounds is knowing when to switch them off. I do not have a clean answer to this. It is something I am paying attention to.
What to Actually Do About It
A short list, because long lists in posts like this rarely survive contact with a Tuesday.
- Pick a stack and use it daily. I have written up mine in What Actually Belongs in My AI Dev Stack in 2026. The specifics matter less than the habit. You learn this by using it, not by reading about it.
- Write specs before code. Even small ones. Even informal ones. The muscle of “describe what you want clearly” is the muscle that pays off everywhere else.
- Get good at reviewing diffs fast. This is the new core skill. It is not glamorous and it is not taught. You build it by doing it.
- Move procedural knowledge into skills. Anything you do twice should be a skill. Anything in your head that a colleague would also benefit from should be in a folder, not in your head.
- Set up one personal agent. Not for work. For your inbox, your calendar, your reading list. The shape of “AI in my life” makes a lot more sense once you have one running.
- Keep a notebook. When something surprises you - good or bad - write it down. The pace of change is faster than memory. The notebook is how you keep your taste calibrated.
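For the personal-agent item in particular, the shape is simpler than it sounds. A toy sketch with stubbed data sources - in a real setup, each list would be an MCP connector to your actual calendar and inbox:

```python
from datetime import date

# Stubbed data sources. In practice these would come from MCP
# connectors to a real calendar and mailbox, not hard-coded lists.
calendar = ["09:30 dentist", "14:00 school call"]
unanswered = ["Reply to Sam about the weekend", "Confirm flat viewing"]

def daily_brief(today):
    """Collapse 'who do I owe, what is today' into one glanceable note."""
    lines = [f"Brief for {today}:"]
    lines += [f"  event: {event}" for event in calendar]
    lines += [f"  owe a reply: {msg}" for msg in unanswered]
    return "\n".join(lines)

print(daily_brief(date(2026, 4, 14)))
```

Even this much, run every morning, offloads the mental load described earlier - the model's job is filling those lists, not inventing the loop.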
A Closing Thought
A year ago I would have called most of this speculative. Now most of it is shipped, in production, and either on my machine or about to be. The thing I keep coming back to is that the change is not really about the models. It is about what happens when capable models are wrapped in good harnesses, given access to the tools you already use, and pointed at problems you care about.
The next two years will be more of the same, faster. The people who do well will be the ones who treat the tools like a craft - learning them carefully, picking the right one for the right job, building habits around review and judgement rather than typing speed.
The verb is “delegate” now. The skill is knowing what to delegate, to whom, and what to do with the result. Everything else follows from that.