TL;DR

  • The Trust series is my answer to one question: what has to be true before you can hand a non-deterministic system a real job and walk away?
  • Read in this order: research map → evals → security → world models → trajectory evaluation
  • Supporting posts cover reliability, context engineering, and safety foundations
  • Full series index: /series/trust/

Start here

  1. What I’m Researching in AI Right Now — the research map and trust through-line
  2. AI Evals Are Broken — why public benchmarks stopped measuring real capability
  3. Securing AI Agents — MCP hardening, the confused deputy problem, and what I run on my home stack
  4. World Models: What Comes After the Language-Only Era — when text-only agents hit their ceiling
  5. Evaluating Agents in Production: Trajectory Metrics — step-level scoring, not just final answers

Supporting reading