
Why the AI Cyber Threat Is Rising

For most of the last few years, the “AI and cybersecurity” conversation has been a vibes argument. One side said the models would soon write novel exploits at scale. The other side said the models were still tripping over basic shell commands and could not be trusted to hack anything more dangerous than a CTF box. The honest answer was that nobody had hard numbers, so the debate stayed stuck on intuition. ...

May 26, 2026 · 6 min · James M

Context Engineering: The Discipline That Replaced Prompt Engineering

TL;DR

- Prompt engineering optimised the wording of a single human-written request. Context engineering optimises the entire set of tokens in the model’s window across a whole run - system prompt, tool definitions, retrieved documents, tool results, conversation history, and memory
- The shift happened because of agents. The window is no longer one prompt you wrote - it is an accumulation that grows on every step, and most of it is produced by the system, not by you
- More context is not better context. Research on “context rot” and the older lost-in-the-middle effect shows that model accuracy degrades as the window fills, even well below the advertised limit
- The four levers are retrieval (what you pull in), memory (what persists across runs), tool results (what tools dump back), and compaction (what you summarise and discard)
- Treat the window as a budget. Measure its token composition, design tools to return terse output, curate rather than accumulate, and keep the static prefix stable so prompt caching still works

For a few years, “prompt engineering” was the named skill of working with language models. It meant finding the wording, the framing, the few-shot examples, and the role instructions that coaxed the best answer out of a single request. It produced a small industry of prompt libraries, prompt marketplaces, and job titles. And in 2026 it is mostly gone, absorbed into something larger and harder. ...
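The "treat the window as a budget" advice in the TL;DR can be sketched as a small measurement step. This is a minimal illustration, not the post's own method: the category names, the `window_composition` and `over_budget` helpers, and the four-characters-per-token estimate are all assumptions made for the example, and a real system would use the model's actual tokenizer.

```python
from collections import Counter

def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer (assumption): ~4 characters per token.
    return max(1, len(text) // 4)

def window_composition(segments):
    """Sum estimated tokens per category.

    segments: list of (category, text) pairs, e.g. ("tool_result", "...").
    """
    totals = Counter()
    for category, text in segments:
        totals[category] += estimate_tokens(text)
    return dict(totals)

def over_budget(totals, budget: int) -> bool:
    # True when the whole window exceeds the token budget we chose.
    return sum(totals.values()) > budget

# Hypothetical window contents, for illustration only.
segments = [
    ("system", "a" * 40),
    ("tool_result", "b" * 400),
    ("tool_result", "c" * 80),
]
totals = window_composition(segments)
```

With per-category totals in hand, it becomes visible which lever (retrieval, memory, tool results, or compaction) is consuming the budget before the window fills.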

May 20, 2026 · 11 min · James M

Composer 2.5: Cursor's In-House Model Grows Up

TL;DR

- Composer 2.5 is Cursor’s most capable in-house coding model yet, built on Moonshot’s open-source Kimi K2.5 checkpoint with about 85% of total training compute spent on Cursor’s own continued pretraining and RL
- The model is purpose-built for the agent loop inside Cursor - long-horizon tasks, hundreds of tool calls, multi-step instructions - rather than as a general-purpose chat model
- Cursor claims parity with Claude Opus 4.7 and GPT-5.5 on its own CursorBench v3.1 (63.2%) and a strong 79.8% on SWE-Bench Multilingual
- Pricing is dramatically lower: $0.50 / $2.50 per million input/output tokens on the default variant, with included usage doubled for the first week
- Together with SpaceXAI, Cursor is now training a much larger successor model from scratch on Colossus 2 with around 10x the compute - so 2.5 is a waypoint, not the endgame

For a while, Cursor was an IDE wrapped around someone else’s models - Claude, GPT, Gemini. That story has shifted. With Composer 2.5, released this week, Cursor has shipped its most capable first-party coding model yet, and it is a serious enough piece of work that it deserves real consideration as a daily driver rather than a budget fallback. ...

May 18, 2026 · 8 min · James M

The Agent Reliability Problem: Debugging Non-Deterministic Systems

The conventional reliability engineering toolkit was built for systems that behaved the same way each time given the same input. AI agents do not behave the same way each time given the same input. The classic tools - unit tests, integration tests, deterministic replay, traditional monitoring - all assume a property that the systems being operated do not have. This mismatch is not a small operational annoyance; it is the central challenge of running AI agents in production, and the patterns for handling it are still being worked out. ...

May 15, 2026 · 7 min · James M

ETL Tools & Data Integration Platforms

What is ETL?

ETL is a foundational data engineering process that powers modern analytics:

- Extract - Retrieve data from various sources (databases, APIs, files, cloud services, streaming platforms)
- Transform - Clean, validate, deduplicate, and reshape data into required data models
- Load - Move processed data into data warehouses, data lakes, or analytical systems

ETL ensures data quality, consistency, and accessibility for analytics and reporting. In 2026 the dominant pattern is ELT (Extract-Load-Transform), which leverages cloud data warehouse compute for transformation, and increasingly EtLT (adding lightweight pre-load transforms for streaming and schema drift). See the Fundamentals of Data Engineering book for a deeper framing. ...
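The three ETL stages described here can be sketched end to end in a few lines. This is a toy illustration under stated assumptions: the source rows, the `users` table, and the validation rule (drop empty emails, keep the first row per id) are hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3

def extract():
    # Hypothetical source rows; stands in for a database, API, or file.
    return [
        {"id": 1, "email": " Ann@Example.com "},
        {"id": 1, "email": "ann@example.com"},  # duplicate id
        {"id": 2, "email": ""},                 # fails validation
        {"id": 3, "email": "bob@example.com"},
    ]

def transform(rows):
    # Clean (trim, lowercase), validate (non-empty email), deduplicate (by id).
    seen, out = set(), []
    for row in rows:
        email = row["email"].strip().lower()
        if not email or row["id"] in seen:
            continue
        seen.add(row["id"])
        out.append((row["id"], email))
    return out

def load(rows, conn):
    # Write the processed rows into the target table.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, email TEXT)"
    )
    conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)
loaded = conn.execute("SELECT id, email FROM users ORDER BY id").fetchall()
```

In the ELT variant, the raw rows would be loaded first and the cleaning step would run as SQL inside the warehouse instead of in `transform`.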

May 4, 2026 · 9 min · James M

Onchain AI Agents - Hype, Reality, and Where the Money Actually Flows

TL;DR

- “Onchain AI agents” became the dominant crypto narrative in 2025 and has cooled meaningfully in 2026 as the picture has gotten clearer.
- The honest taxonomy has three buckets: agents that hold wallets and trade, agents that automate DeFi operations, and agents that exist primarily as tokens with a chatbot attached. Only the first two are doing real work.
- Real revenue is concentrated in agent-driven DeFi automation, MEV strategies executed by agents, and onchain payment rails for AI services. Most of the rest is meme economics dressed in technical clothing.
- The structural question - “do AI agents need crypto rails at all” - has become a genuinely live debate. The answer in 2026 is “yes, but only for a narrow set of jobs, and most of those jobs are not what was being pitched.”
- If you are evaluating an onchain AI agent project, the test is brutally simple: strip away the token and ask whether the agent does something useful. If the answer is no, the project is a token with extra steps.

How We Got Here

The phrase “onchain AI agent” started showing up in crypto Twitter in late 2024 and exploded in early 2025. By the middle of last year there were thousands of agent tokens, dozens of agent platforms, and a handful of agents with billion-dollar implied market caps doing things that would have embarrassed a 2010-era chatbot. ...

May 3, 2026 · 9 min · James M

The Quiet Standardisation of Agent Protocols - MCP, A2A, ACP Compared

TL;DR

- The 2026 agent ecosystem has, while nobody was paying close attention, converged on three protocols that solve different problems and partly overlap: MCP (Model Context Protocol), A2A (Agent-to-Agent), and ACP (Agent Communication Protocol).
- MCP is the model-to-tool protocol. It standardises how an agent talks to its tools, data sources, and local context. This is the one that has clearly won its layer.
- A2A is the agent-to-agent protocol. It standardises how separately deployed agents discover each other, exchange tasks, and pass results. Adoption is growing but the picture is less settled.
- ACP is the orchestration-and-runtime protocol. It standardises how an agent runtime exposes its lifecycle, state, and operations to the systems around it. Newer, more enterprise-focused, and not yet a clear winner.
- The mental model: MCP for tools, A2A for peers, ACP for the platform. Build with all three in mind even if you only need one today.

Why Protocols, Why Now

A year ago “agents” was still a debate about whether the things existed. By mid-2026 the debate has shifted. Agents exist. They do useful work. The interesting question is no longer “will this work” but “how do we connect them to everything else.” ...

May 3, 2026 · 8 min · James M

Five AI Tokens Worth Understanding in 2026 (And One You're Probably Missing)

A technical reader’s guide to where AI and crypto actually meet - without the hype.

TL;DR

- The AI-token sector has stratified. There is a clear top tier of projects with real engineering, real revenue and visible institutional interest, and a long tail of speculation. The total AI-crypto market just crossed $17B and the measurable-infrastructure share is growing faster than the speculative tail.
- The five tokens worth understanding in May 2026 are Bittensor (TAO) as the conviction long, Virtuals Protocol (VIRTUAL) as the speculative growth bet, Render (RENDER) as the infrastructure hold, Artificial Superintelligence Alliance (FET / ASI) as the deep value play, and NEAR Protocol (NEAR) as the AI commerce layer.
- Every name on the list has drawn down 60%+ from its all-time high in the last 18 months. The drawdowns are not theoretical and they will happen again. Position-sizing matters more than picks.
- Worth flagging without putting them in the main basket - Kite (KITE), Internet Computer (ICP) and The Graph (GRT). Worth avoiding - the long tail of “AI memecoin” launches.
- Nothing here is investment advice. Prices are snapshots from publicly available data (CoinGecko, CoinMarketCap) as of 4 May 2026 and will be stale within hours.

Why The Sector Looks Different In 2026

A year ago the AI-token sector was mostly a betting market on which token had “AI” most prominently in its tagline. In May 2026 the picture has changed character. There is a clear top tier of projects with measurable engineering output, real revenue, and visible institutional interest, and a long tail of names whose only product is a narrative. The total AI-crypto market cap just crossed $17B, and the share of that capital flowing into infrastructure with measurable usage has grown faster than the speculative tail. ...

May 3, 2026 · 13 min · James M

AI Agents That Actually Work: Patterns From Real Projects

TL;DR

- Most agent demos fail in production because demos operate in a regime where the model’s natural behaviour is good enough - production is longer, messier, and largely unobserved
- Eight patterns separate agents that stay shipped from the ones that fall over: scope the loop, structured tool design, mandatory verification, curated context, first-class human handoff, idempotency, agent-level observability, and real evaluation infrastructure
- Models confabulate actions - “I ran the tests” does not mean the tests were run; every agent needs explicit verification baked into the control flow, not bolted on as an afterthought
- The tool layer between the model and underlying systems is where most of the engineering effort actually lives, and exposing raw APIs directly to the agent almost always goes wrong
- Build agents the same way you would build any other long-running, partially-autonomous system you cannot afford to have fail silently - the novelty is in the failure modes, not the engineering principles

I have spent the last eighteen months either building, reviewing, or operating systems that some marketing department somewhere has called “agents”. The definition has been so thoroughly stretched that it now means anything from a chatbot with a calculator tool to a long-running autonomous workflow that touches production infrastructure. Underneath the noise there is a real engineering discipline emerging, and the patterns that separate the systems that survive contact with real users from the ones that demo well and fall over are starting to be legible. ...
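The "mandatory verification" point in the TL;DR can be illustrated with a very small control-flow sketch. The names here (`StepResult`, `run_step`, and the `verify` callback) are hypothetical and not taken from the post; the only idea being shown is that the harness records an independent check and never trusts the model's own report.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class StepResult:
    claim: str      # what the model said it did
    verified: bool  # what an independent check actually observed

def run_step(agent_claim: str, verify: Callable[[], bool]) -> StepResult:
    # The claim is kept for logging only; acceptance depends solely on verify().
    return StepResult(claim=agent_claim, verified=bool(verify()))

# Hypothetical checks: in practice verify() would re-run the test suite,
# inspect an exit code, or read back the state the agent says it changed.
confabulated = run_step("I ran the tests", verify=lambda: False)
confirmed = run_step("I ran the tests", verify=lambda: True)
```

The two results carry the same claim but different `verified` values, which is the distinction the control flow should branch on.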

May 1, 2026 · 11 min · James M

AI Skills: One Folder, Any Model

TL;DR

- A Claude Code skill is just a folder with a SKILL.md file - YAML frontmatter plus natural-language instructions - and the same folder works across Cursor, Gemini CLI, Codex, and a dozen other tools
- The format is model-agnostic because it contains no provider-specific syntax; any instruction-following model can read it, and any harness that loads markdown can execute it
- Progressive disclosure keeps large skill libraries cheap: only names and descriptions load at session start, with full instructions loading only when a skill is activated
- The portability is practically valuable - version-controlled runbooks that survive tool switches, model upgrades, and team growth without being rewritten
- Core skills are genuinely portable; advanced frontmatter extensions (like allowed-tools or context: fork) are tool-specific and may need tuning across harnesses

Most of the tooling I have written about over the last year has been provider-specific. A particular model, a particular harness, a particular set of features. The thing I find interesting about agent skills is that they are not. ...
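Progressive disclosure, as described in the TL;DR, amounts to reading only the frontmatter summary up front and deferring the body. The sketch below is an illustration under assumptions: the `read_skill_summary` helper is hypothetical, the frontmatter keys are assumed to be `name` and `description`, the sample skill text is invented, and the parser handles only simple single-line `key: value` pairs where a real harness would use a proper YAML parser.

```python
def read_skill_summary(skill_md: str) -> dict:
    """Return only the name and description from a SKILL.md frontmatter block."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no frontmatter block at the top of the file
    summary = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter; the instruction body is not read here
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("name", "description"):
            summary[key.strip()] = value.strip()
    return summary

# Invented example skill, for illustration only.
example = """---
name: release-notes
description: Draft release notes from merged pull requests
context: fork
---
Full instructions live here and load only when the skill is activated.
"""
summary = read_skill_summary(example)
```

Only the two summary fields are returned, so a library of many skills costs a few lines each at session start; tool-specific keys such as `context` are ignored by this sketch.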

April 30, 2026 · 9 min · James M