
Recursive Self-Improvement: Can AI Bootstrap Its Own Intelligence?

TL;DR
• Recursive self-improvement (RSI) is the idea of an AI that improves its own ability to improve - each round producing a smarter system that does the next round better. It is the engine behind every “intelligence explosion” story since I.J. Good described it in 1965
• The narrow version is already real. Systems like AlphaEvolve and the AI Scientist measurably improve algorithms, code, and even research output - including, in AlphaEvolve’s case, the infrastructure that trains the models themselves
• The leap people fear is different: improving an algorithm is not the same as improving general intelligence. Nothing in 2026 has crossed that line, and the gap is structural, not just a matter of scale
• Four bottlenecks decide whether RSI runs away or fizzles: compute, data, verification, and diminishing returns. Each is a hard physical or informational limit, not a temporary engineering nuisance
• The realistic picture is steady, human-paced acceleration - AI assisting AI research - not an overnight takeoff. METR’s time-horizon data shows fast but smooth exponential progress, which is exactly what a bottlenecked process looks like
• It still deserves serious safety attention, because a slow takeoff is the one we can actually govern

There is a particular shape of argument that has haunted artificial intelligence since before the field had a settled name. It goes like this: build a machine slightly better than humans at designing machines, and it will design a machine better than itself. That machine designs a better one. The loop tightens, each turn faster than the last, and intelligence runs away from us in an afternoon. ...

May 20, 2026 · 12 min · James M

Context Engineering: The Discipline That Replaced Prompt Engineering

TL;DR
• Prompt engineering optimised the wording of a single human-written request. Context engineering optimises the entire set of tokens in the model’s window across a whole run - system prompt, tool definitions, retrieved documents, tool results, conversation history, and memory
• The shift happened because of agents. The window is no longer one prompt you wrote - it is an accumulation that grows on every step, and most of it is produced by the system, not by you
• More context is not better context. Research on “context rot” and the older lost-in-the-middle effect shows that model accuracy degrades as the window fills, even well below the advertised limit
• The four levers are retrieval (what you pull in), memory (what persists across runs), tool results (what tools dump back), and compaction (what you summarise and discard)
• Treat the window as a budget. Measure its token composition, design tools to return terse output, curate rather than accumulate, and keep the static prefix stable so prompt caching still works

For a few years, “prompt engineering” was the named skill of working with language models. It meant finding the wording, the framing, the few-shot examples, and the role instructions that coaxed the best answer out of a single request. It produced a small industry of prompt libraries, prompt marketplaces, and job titles. And in 2026 it is mostly gone, absorbed into something larger and harder. ...

May 20, 2026 · 11 min · James M

Threat Modeling for Engineers: Finding the Flaws Before Attackers Do

TL;DR
• A scanner finds bugs in code that already exists. Threat modeling finds flaws in a design before the code exists - which is the cheapest possible time to find them
• It is a structured conversation built around four questions: what are we building, what can go wrong, what are we going to do about it, and did we do a good job
• STRIDE gives you a vocabulary for “what can go wrong”: Spoofing, Tampering, Repudiation, Information disclosure, Denial of service, and Elevation of privilege
• You do not need a tool or a certificate. You need a diagram, the people who understand the system, and an hour
• The highest-value moment to threat model is when the design is still cheap to change - and the most common mistake is treating it as a one-off audit instead of a habit

Most security work, as people experience it day to day, is reactive. A scanner flags a vulnerable dependency. A penetration test produces a report. An alert fires. Someone patches the thing, closes the ticket, and moves on. This is necessary work, but it has a structural weakness: it can only find problems in systems that already exist. By the time a scanner can see a flaw, you have already built it, shipped it, and possibly run it in production for months. ...

May 20, 2026 · 9 min · James M

Quantum Computing: A Threat to Bitcoin?

TL;DR
• Quantum computers threaten Bitcoin because Shor’s algorithm can derive a private key from an exposed public key, breaking the ECDSA and Schnorr signatures that authorise transactions.
• The threat is real but not imminent. Credible estimates put a cryptographically relevant quantum computer somewhere between 2029 and 2035. Research cited by Google and Bitcoin security analysts suggests a roughly 10% chance of a break by 2032.
• Around 6.9 million BTC - close to a third of all supply - sit in addresses with exposed public keys, including roughly 1 million BTC believed to belong to Satoshi Nakamoto. These are the coins most at risk.
• Mining (SHA-256) is far less exposed. Grover’s algorithm only offers a quadratic speed-up, which higher network difficulty can absorb.
• Bitcoin’s defences are forming: BIP-360 adds a quantum-resistant address type, BIP-361 proposes a controversial migrate-or-freeze deadline, and NIST has finalised post-quantum standards (ML-DSA, SLH-DSA) for future signature schemes to draw on.
• The safest action for an ordinary holder today: use a modern address and never reuse it, so your public key stays hidden behind a hash until you spend.

Overview

Quantum computing is one of the most significant theoretical threats to modern cryptography. For Bitcoin, the core concern is that a sufficiently powerful quantum computer could run Shor’s algorithm to solve the elliptic curve discrete logarithm problem - the hard maths that secures Bitcoin’s public-key cryptography. ...

May 20, 2026 · 9 min · James M

System Design Fundamentals: Making Trade-offs You Won't Regret

TL;DR
• System design has no right answers, only trade-offs chosen deliberately or chosen by accident. The skill is making the choice consciously
• Most decisions move along a few core axes: consistency against availability, latency against throughput, simplicity against flexibility, and build against buy
• A good design states its assumptions - expected load, acceptable latency, failure tolerance - because a design is only “good” relative to assumptions
• The most common self-inflicted wound is designing for scale you do not have. Complexity added for an imagined future is paid for every day until that future arrives, if it ever does
• Write designs down. A short document that names the options, the choice, and the reason is worth more than any diagram

There is a particular kind of interview question, and a particular kind of blog post, that treats system design as a body of correct answers - as if there were a known-good way to “design a URL shortener” or “design a news feed” and the job is to recall it. This framing is actively harmful, because it teaches people that system design is about memorising solutions. ...

May 19, 2026 · 8 min · James M

What I'm Researching in AI Right Now - And Where I'm Going Next

TL;DR
• I treat my own learning like a research agenda - a small set of questions I am actively chasing, not a reading list I feel guilty about
• The work I have been deep in clusters into four areas: agent reliability and non-determinism, context engineering and memory, the economics of intelligence, and the open-weight and small-model frontier
• The areas I have decided to move into next are the ones where I keep hitting questions I cannot answer well: securing agents that hold real tool access, evaluating agents on their trajectory rather than their final answer, world models beyond the language-only era, and the machine-to-machine agent economy
• I treat AGI timelines less as a forecast to win and more as a planning input - what changes for an engineer if capable autonomous systems arrive in three years rather than fifteen
• I am deliberately not chasing every frontier. Quantum machine learning and neuromorphic hardware sit on my watch list, not my work list, and being honest about that line is the whole point

Most people consume AI news. I used to do the same - a feed of model releases, benchmark claims, and launch threads that left me feeling informed and changed nothing about what I could actually build. ...

May 19, 2026 · 12 min · James M

Diagrams as Code: A Practitioner's Guide for Data Engineers

TL;DR
• Hand-drawn diagrams in Lucidchart, Visio, draw.io or Confluence rot because they live outside the codebase, cannot be diffed, and have no compiler to flag when they go stale. Diagrams as code closes all three gaps by treating the text source as truth and the rendered image as a build artefact.
• Pick by the question you are answering, not by taste. Mermaid for embedded docs and anything that has to render in GitHub. D2 for aesthetically polished architecture with real cloud icons. Python diagrams for AWS-heavy decks. PlantUML or Structurizr when you need formal UML or the C4 model.
• The conventions that make trust explicit: co-locate diagrams with the code they describe, add a metadata header with last_verified and next_review_due, encode confidence visually (verified / stale / proposed), pair each non-obvious diagram with an ADR, and render in CI.
• The highest-leverage move is to generate diagrams from the system itself - Terraform state, lineage graphs, dbt manifests, Airflow DAGs. A generated diagram is provably current by construction, which is a much stronger guarantee than “I reviewed it last quarter.”

If you have ever opened a Confluence page from two years ago and wondered whether the architecture it shows is still real, you have already met the problem this post is trying to fix. Hand-drawn diagrams in Lucidchart, Visio, draw.io or PowerPoint share three failure modes that no amount of governance ever quite eliminates. They live somewhere your code does not, so nobody updates them in the same PR that changes the system. They cannot be diffed, reviewed, or merged. And they rot silently, because there is no compiler error for “this picture is now a lie.” ...

May 18, 2026 · 21 min · James M

Composer 2.5: Cursor's In-House Model Grows Up

TL;DR
• Composer 2.5 is Cursor’s most capable in-house coding model yet, built on Moonshot’s open-source Kimi K2.5 checkpoint with about 85% of total training compute spent on Cursor’s own continued pretraining and RL
• The model is purpose-built for the agent loop inside Cursor - long-horizon tasks, hundreds of tool calls, multi-step instructions - rather than as a general-purpose chat model
• Cursor claims parity with Claude Opus 4.7 and GPT-5.5 on its own CursorBench v3.1 (63.2%) and a strong 79.8% on SWE-Bench Multilingual
• Pricing is dramatically lower: $0.50 / $2.50 per million input/output tokens on the default variant, with included usage doubled for the first week
• Together with SpaceXAI, Cursor is now training a much larger successor model from scratch on Colossus 2 with around 10x the compute - so 2.5 is a waypoint, not the endgame

For a while, Cursor was an IDE wrapped around someone else’s models - Claude, GPT, Gemini. That story has shifted. With Composer 2.5, released this week, Cursor has shipped its most capable first-party coding model yet, and it is a serious enough piece of work that it deserves real consideration as a daily driver rather than a budget fallback. ...

May 18, 2026 · 8 min · James M

AI as Analogy Engine: Synthesis, Invention, and the Combinatorial Frontier

A common dismissal of modern AI goes like this: “It is just a fancy autocomplete. It memorises text and stitches it back together. There is no real understanding, only retrieval.” It is a comforting story, and it has the shape of a critique that ought to be true. But spend enough time with frontier systems and a different picture starts to form. The thing that large models actually seem to be good at is not memorisation. It is something stranger and arguably more important: the formation of analogies, the combination of distant concepts, and the generation of conceptual relationships that were not explicitly present in any one place in the training data. ...

May 16, 2026 · 13 min · James M

The Agent Reliability Problem: Debugging Non-Deterministic Systems

The conventional reliability engineering toolkit was built for systems that behave the same way each time they are given the same input. AI agents do not. The classic tools - unit tests, integration tests, deterministic replay, traditional monitoring - all assume a property that the systems being operated do not have. This mismatch is not a small operational annoyance; it is the central challenge of running AI agents in production, and the patterns for handling it are still being worked out. ...

May 15, 2026 · 7 min · James M