Documentation

Diagrams as Code: A Practitioner's Guide for Data Engineers

TL;DR
- Hand-drawn diagrams in Lucidchart, Visio, draw.io or Confluence rot because they live outside the codebase, cannot be diffed, and have no compiler to flag when they go stale. Diagrams as code closes all three gaps by treating the text source as truth and the rendered image as a build artefact.
- Pick by the question you are answering, not by taste. Mermaid for embedded docs and anything that has to render in GitHub. D2 for aesthetically polished architecture with real cloud icons. Python diagrams for AWS-heavy decks. PlantUML or Structurizr when you need formal UML or the C4 model.
- The conventions that make trust explicit: co-locate diagrams with the code they describe, add a metadata header with last_verified and next_review_due, encode confidence visually (verified / stale / proposed), pair each non-obvious diagram with an ADR, and render in CI.
- The highest-leverage move is to generate diagrams from the system itself - Terraform state, lineage graphs, dbt manifests, Airflow DAGs. A generated diagram is provably current by construction, which is a much stronger guarantee than “I reviewed it last quarter.”

If you have ever opened a Confluence page from two years ago and wondered whether the architecture it shows is still real, you have already met the problem this post is trying to fix. Hand-drawn diagrams in Lucidchart, Visio, draw.io or PowerPoint share three failure modes that no amount of governance ever quite eliminates. They live somewhere your code does not, so nobody updates them in the same PR that changes the system. They cannot be diffed, reviewed, or merged. And they rot silently, because there is no compiler error for “this picture is now a lie.” ...
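As a rough illustration of generating a diagram from the system itself, the sketch below turns a dependency mapping into Mermaid flowchart text and stamps it with the last_verified and next_review_due fields mentioned in the TL;DR. The input is assumed to be a plain {node: [parents]} dictionary, similar in shape to the parent map in a dbt manifest. The function name, the 90-day review window, the example node names and the date are all hypothetical and chosen only for illustration.

```python
from datetime import date, timedelta


def mermaid_from_parent_map(parent_map, verified_on, review_after_days=90):
    """Return Mermaid flowchart text for a {node: [parent, ...]} mapping."""
    # Collect every node, including parents that never appear as keys.
    names = sorted(set(parent_map) | {p for ps in parent_map.values() for p in ps})
    # Mermaid ids are kept simple (n0, n1, ...); the real name goes in the label.
    ids = {name: f"n{i}" for i, name in enumerate(names)}

    review_due = verified_on + timedelta(days=review_after_days)
    lines = [
        "flowchart LR",
        # Metadata header as Mermaid comments (an assumed convention).
        f"    %% last_verified: {verified_on.isoformat()}",
        f"    %% next_review_due: {review_due.isoformat()}",
    ]
    lines += [f'    {ids[name]}["{name}"]' for name in names]
    for child in sorted(parent_map):
        for parent in sorted(parent_map[child]):
            lines.append(f"    {ids[parent]} --> {ids[child]}")
    return "\n".join(lines)


# Hypothetical lineage, for illustration only.
example = {
    "model.shop.orders": ["source.shop.raw_orders"],
    "model.shop.revenue": ["model.shop.orders", "model.shop.customers"],
    "model.shop.customers": [],
}
diagram = mermaid_from_parent_map(example, verified_on=date(2025, 1, 1))
print(diagram)
```

Running something like this in CI and committing the output next to the code it describes would keep the text source as the single thing reviewers diff. The rendering step itself is left to whichever Mermaid renderer the repository already uses.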

My AI-Augmented Design Workflow: A 10-Minute Loop From Discussion to Documented Decision

TL;DR
- A combination of Cursor in the IDE, Claude Code and Codex in the terminal, and GitHub Spec Kit as the living contract has collapsed the discuss-design-document loop from days to under ten minutes.
- Every meeting is transcribed and checked into GitHub alongside the design corpus, giving AI agents access to the full historical record - not just curated decisions but the debates that shaped them.
- Model selection matters: cheaper, faster models for throwaway sketches and small refactors; expensive models (Opus) for large cross-repo work where the cost of a wrong answer is high.
- The real transformation is cognitive flow - removing friction between thinking and recording means decisions get made and captured while the problem is still fresh, with almost no context switching.
- AI is now suggesting improvements faster than the author can implement them; the next bottleneck is compaction, not generation - asking the model to reduce documents to their load-bearing claims rather than produce more content.

Since making a combination of Cursor in the IDE and Claude Code and Codex in the terminal the centre of my working day - with ChatGPT for general questions and GitHub Spec Kit holding the design contract - the way I move from a question on Slack to a documented design decision has changed beyond recognition. ...

Where Should Documentation Actually Live? Thinking Out Loud in the AI Era

TL;DR
- Documentation sprawl across Confluence, Jira, SharePoint, Google Docs, GitHub, and Miro is not a tool problem - it is a joints problem: the same decision exists in four places, drifting out of sync immediately.
- Three forces constantly pull against each other: source of truth (one canonical home), discoverability (right surface for every audience), and governance (real access control) - optimising for any one breaks the others.
- The proposed shape: docs-as-code for engineering artefacts in Git, collaborative tools for business content, a read-only render layer between them, and an AI-assisted discovery layer across all of it.
- AI tooling weakens the old boundary - a business user can get a summary generated from a markdown master without ever seeing the file, and an engineer can draft an ADR pulling context from Confluence and Jira automatically.
- Several genuine open questions remain unsolved: versioning across boundaries, who owns the render pipeline, and whether Jira tickets as documents should be formalised or fought against.

This post is me thinking out loud. It is not a proposal, not a recommended pattern, and possibly not even a useful framing. I am writing it because I am actively stuck on the question, and writing in public tends to be the fastest way I find out what I have got wrong. Feel free to disagree with any of it. ...