AI Hallucinations: Understanding and Mitigating False Outputs

The word “hallucination” is one of the most successful pieces of accidental marketing in our industry. It is a soft, almost endearing way to describe an LLM stating with full confidence that a function exists when it does not, that a court case was decided when it was not, that a paper was written by an author who has never published in that field. It makes the failure sound like a quirk rather than the central reliability problem of the entire technology. ...

April 28, 2026 · 13 min · James M

AI Reliability Is Weird: Why Testing LLMs Breaks Everything You Know

We’ve embraced the future. AI agents like Cline are now the primary “builders” of software, executing complex engineering plans from high-level specifications. As I’ve argued in “The Architect vs The Builder”, the human role is shifting from execution to architectural oversight and defining intent. The patterns that keep agent-built software shipped and working are covered in “AI agents that actually work”, and the wider safety framing sits in “AI safety from first principles”. But this shift introduces a profound, often uncomfortable question: How do we know it actually works? ...

April 9, 2026 · 6 min · James M