
Lakeflow Declarative Pipelines: From DLT to Production

If you’ve been writing Delta Live Tables (DLT) pipelines, you’ve been building with Lakeflow without knowing the new name. In 2026, the rebranding matters because it signals how Databricks now wants you to think about declarative pipeline design. This isn’t just a rename. The mental model has shifted from “tables and dependencies” to “data flows and transformations.” Let me show you what changed and why it matters. For where Lakeflow fits relative to other orchestration choices and the broader paradigm question, see The modern lakehouse stack and Stream vs batch processing. ...

April 6, 2026 · 9 min · James M

Boosting Productivity: Essential Habits for Personal Growth

Introduction

In today’s fast-paced world, personal development and productivity are more crucial than ever. It’s not just about doing more, but about doing the right things more effectively to lead a fulfilling life. Cultivating essential habits can be the cornerstone of significant personal growth and sustained productivity. This post delves into practical strategies and habits you can adopt to unlock your potential.

Setting Clear Goals

The journey of personal development begins with a clear destination. Without well-defined goals, efforts can be scattered and ineffective. ...

April 6, 2026 · 3 min · James M

Modern Data Engineering on Databricks (2026 Guide)

The 2026 Databricks Baseline

Databricks in 2026 looks much more opinionated than it did just a few years ago. For most new data engineering work, the default stack is now clear:

Unity Catalog for governance
managed tables where possible
serverless compute for notebooks, SQL, pipelines, and jobs
Lakeflow Declarative Pipelines for batch and streaming data products
liquid clustering instead of old-style partition design for many workloads

That shift matters because the platform has moved beyond “bring your own clusters and tune everything manually.” The modern Databricks approach is increasingly declarative, governed, and automated. ...

April 6, 2026 · 7 min · James M

Rebuilding Your Life With Small Systems

There are periods in life when big goals feel completely unrealistic. You might know what you want in theory: more energy, more clarity, better health, stronger finances, a calmer home, a more meaningful life. But when you are tired, emotionally stretched, or rebuilding after a difficult chapter, the idea of “transforming your life” can feel absurdly far away. This is where small systems become powerful. Not dramatic reinvention. Not a perfect morning routine copied from the internet. Not a 90-day personal reset with colour-coded trackers and impossible standards. ...

April 6, 2026 · 6 min · James M

NASA Artemis II

Mission status note: this page includes a time-sensitive status snapshot from April 6, 2026. For live updates, use the official NASA links below and the site tracking page.

In Brief

Artemis II is NASA’s first crewed mission of the Artemis program and the first time astronauts have traveled toward the Moon since Apollo 17 in 1972. The mission uses NASA’s Space Launch System (SLS) rocket and Orion spacecraft to send four astronauts on a roughly 10-day journey around the Moon and back to Earth. ...

April 6, 2026 · 3 min · James M

DevOps in the Age of AI Agents

For years, DevOps has been about breaking down silos and automating the software delivery lifecycle. We moved from manual deployments to Jenkins scripts, then to YAML-defined pipelines, and eventually to Infrastructure as Code (IaC). But in 2026, the bottleneck is no longer the speed of the pipeline - it’s the speed of human decision-making within that pipeline. We are entering the era of Agentic DevOps.

From Automation to Autonomy

Traditional DevOps automation follows a strict “if this, then that” logic. AI-driven DevOps uses reasoning models to handle the “I’m not sure, let me figure it out” scenarios that typically stall a release. ...

April 5, 2026 · 3 min · James M

Data Engineering Blogs

Modern Data Stack & Engineering

Core Blogs & Publications

Start Data Engineering - Practical guides, tutorials, and real-world projects for building scalable data platforms from scratch.
Seattle Data Guy - Balance of business strategy and technical implementation in modern data engineering.
Eclectic Data - Deep technical analysis of data infrastructure, distributed systems, and architectural patterns.
Benn Stancil’s Blog - Strategic insights and industry commentary on analytics, data culture, and organizational challenges.

Platform & Tool Blogs

Airbyte Blog - Data integration, ELT approaches, and best practices for data movement at scale.
Databricks Blog - Comprehensive coverage of Apache Spark, Delta Lake, and Lakehouse architectural patterns.
LakeFS Blog - Data versioning, governance, and data lakes as code principles.
dbt Blog - Analytics engineering workflows, SQL best practices, and modern data transformation.
Apache Airflow Blog - Workflow orchestration patterns, DAG design, and production deployment strategies.
Kafka Blog - Stream processing, real-time data architectures, and event-driven systems.
Redpanda Blog - Kafka ecosystem evolution, streaming data pipelines, and cost optimization.

Podcasts & Multimedia

The Data Engineering Podcast - Interviews and deep dives into data tools, techniques, and industry practitioners.
DataFramed Podcast - Conversations on data careers, best practices, and emerging technologies.

Data Warehousing & Analytics

Snowflake Blog - Cloud data warehouse innovations, performance optimization, and enterprise data strategies.
Google Cloud Data Analytics Blog - BigQuery best practices, modern data stack integration, and Google Cloud data solutions.
Restack Blog - Data infrastructure comparisons, architecture patterns, and cost optimization strategies.

Communities & Learning

Online Communities

DataTalks.Club - Free community-driven courses, job board, and peer-to-peer learning for data professionals.
r/dataengineering - Active community discussions, career advice, and industry insights.
dbt Community - Slack workspace, forums, and networking for analytics engineers and data teams.

Learning Resources

Data Engineering Fundamentals - Comprehensive guide covering data architecture, ETL/ELT, and system design.
Engineer Codehouse - Practical tutorials and guides for modern data stack technologies.

Industry News & Trends

The Data Stack News - Weekly roundup of news, funding announcements, and updates across the data ecosystem.
KDnuggets - News, tutorials, and discussions on data science, machine learning, and data engineering.
Data Engineering Weekly - Curated newsletter featuring tools, articles, and thought leadership in data engineering.
The Pragmatic Engineer - Data - Engineering-led analysis with frequent data platform deep dives.

Open Table Format & Lakehouse

Apache Iceberg Blog - Official updates on the open table format increasingly central to the 2026 lakehouse.
Tabular Blog - Deep technical writing on Iceberg internals and multi-engine lakehouse design.
Dremio Blog - Query engines, Iceberg, and open data architecture.
Onehouse Blog - Hudi and open lakehouse patterns.

Transformation & Analytics Engineering

dbt Developer Blog - Analytics engineering patterns and practical SQL modelling guidance.
Tobiko / SQLMesh Blog - Next-generation transformation framework with virtual environments.
Locally Optimistic - Long-form posts on analytics engineering culture and practice.

April 5, 2026 · 3 min · James M

Databricks vs Snowflake in 2026: An Honest Comparison

The views in this post are my own personal reflections on the data industry, written in my own time. They are not about any specific employer, team, or colleague, past or present, and do not draw on any non-public information. The question “Databricks or Snowflake?” has dominated data engineering conversations for the past five years. In 2026, it’s still the wrong question. But let me answer it anyway, because sometimes you have to pick one. For the wider stack this choice sits inside, see The modern lakehouse stack. ...

April 5, 2026 · 11 min · James M

GitHub Spec Kit in 2026: SDD Goes Mainstream 🚀

TL;DR

GitHub Spec Kit reached v0.5.0 in 2026, evolving from a documentation toolkit into a full extensibility platform for AI-assisted development
Claude Code CLI is now a native skill within Spec Kit, making spec-to-code pipelines seamless and built-in
The ecosystem has exploded with dedicated tools like AWS Kiro and Tessl, while multi-agent support covers Copilot, Cursor, Gemini CLI, and more
Spec-Driven Development prevents architectural drift by making the spec the single source of truth - versioned, reviewable, and respected by AI agents
Getting started is now low-effort: write a spec.md, pick any AI tool, and let the spec drive implementation

Six months ago, we explored how GitHub Spec Kit was beginning to reshape software development. In early 2026, that promise isn’t just materializing - it’s accelerating. The project has hit version 0.5.0, the ecosystem has exploded, and Spec-Driven Development has transitioned from “interesting idea” to actual industry standard. ...

April 4, 2026 · 5 min · James M

Mac Homebrew packages

Homebrew is the package manager that makes a Mac genuinely usable as a development machine. The list below is the working set of packages I install on a new laptop, organised by what they do rather than alphabetically. Most can be installed in one command: brew install <package>. For graphical applications, see the companion Mac Applications and Utilities page.

Essential

bat - Cat alternative with syntax highlighting and Git integration
fzf - Fuzzy finder for CLI (command history, file search, etc.)
glow - Markdown reader in the terminal
htop - Interactive process monitor with colors and mouse support
jq - JSON query and manipulation tool (sed for JSON)
pyenv - Python version manager
python - Python (3.11+)
ripgrep (rg) - Fast, recursive grep alternative
terraform - Infrastructure as code provisioning
tfswitch - Switch Terraform versions easily (warrensbox/tap/tfswitch)
tree - Display directory structure visually
wget - Command-line file downloader
yq - YAML/JSON/XML processor and querying tool

Cloud & Container Tools

awscli - AWS Command Line Interface
docker - Container platform and runtime
gcloud - Google Cloud CLI
helm - Kubernetes package manager
k9s - Interactive Kubernetes resource viewer and manager
kubectl - Kubernetes command-line tool
kubectx - Switch between Kubernetes clusters and namespaces
minikube - Run Kubernetes locally in a VM

Development Languages & Frameworks

django - Python web framework
go - Go programming language
nvm - Node.js version manager
npm - Node Package Manager
pytorch - Machine learning framework for deep learning
rbenv - Ruby version manager
rust - Rust programming language
tensorflow - ML library for machine learning and AI

DevOps & Infrastructure Tools

ansible - Configuration management and automation
consul - Service mesh and service discovery
hashicorp/tap/vault - Secrets management tool
packer - Machine image builder
prometheus - Metrics collection and monitoring

System & Network Tools

bottom - System monitor (process, memory, disk, network)
dust - Disk usage analyzer (better than du)
exa - Modern ls replacement with colors and icons
fd - Fast find alternative
lnav - Log file analyzer and explorer
mtr - Network diagnostic combining ping and traceroute
speedtest-cli - Test internet upload/download speed
tldr - Simplified man pages with practical examples

File & Directory Tools

midnight-commander - Full-screen file manager (mc)
ncdu - Disk space usage analyzer
ranger - Terminal file manager with preview support

Productivity & Utilities

direnv - Load environment variables based on directory
httpie - HTTP CLI client (curl alternative)
jupyter - Interactive notebooks for data science
navi - Interactive cheatsheet and command browser
task - Task management and todo app
tmux - Terminal multiplexer (multiple sessions/panes)

Database & Data Tools

postgresql - PostgreSQL database client
redis-cli - Redis key-value store client
sqlite - Lightweight embedded database

Additional Utilities

neofetch - System information display
snappy - Compression library for fast compression/decompression
youtube-dl - Download videos from YouTube and other sites

Related Pages

Mac Applications and Utilities - graphical applications to pair with this CLI toolkit
DevOps Best Practices

April 4, 2026 · 3 min · James M