TL;DR
- The Trust series is my answer to one question: what has to be true before you can hand a non-deterministic system a real job and walk away?
- Read in this order: research map → evals → security → world models → trajectory evaluation
- Supporting posts cover reliability, context engineering, and safety foundations
- Full series index: /series/trust/
Start here
- What I’m Researching in AI Right Now - the research map and trust through-line
- AI Evals Are Broken - why public benchmarks stopped measuring real capability
- Securing AI Agents - MCP hardening, confused deputy, and what I run on my home stack
- World Models: What Comes After the Language-Only Era - when text-only agents hit their ceiling
- Evaluating Agents in Production: Trajectory Metrics - step-level scoring, not just final answers
Supporting reading
- AI Agents That Actually Work - patterns from real projects
- The Agent Reliability Problem - debugging non-deterministic systems
- Context Engineering - curating the window across a whole agent run
- AI Reliability Is Weird - why testing LLMs breaks familiar QA
- AI Safety From First Principles - engineering safety vs speculative scenarios
Related paths
- Home Agent Stack - build the stack these defenses protect
- AI Dev Tooling - the coding-agent side of the same problem
Related reading
- AI Economics and Hardware: A Reading Path - cost and infrastructure decisions that constrain what you can actually deploy
- Expertise and Work in the Age of AI - how trust and accountability reshape what human expertise is for
- Agent Protocols in 2026: MCP, A2A, and ACP - the protocol layer where many trust boundaries live
- Structured Outputs and Schema Design for LLMs - making agent behaviour predictable enough to evaluate