“Best practice” is a phrase that should be treated with suspicion. What works for a fintech running 500 engineers rarely works for a five-person startup. The notes below are patterns I have seen hold up across multiple organisations - but always weighed against context, team size, and what the system is actually trying to do.
Microservices
Microservices solve organisational problems as much as technical ones. A service boundary is a team boundary in disguise.
When to reach for microservices
- When your monolith has become a merge-conflict factory and deploys are blocking each other
- When distinct parts of the system genuinely have different scaling, reliability, or data requirements
- When multiple teams need to ship independently without coordinating every release
Patterns worth adopting early
- Well-defined contracts - OpenAPI or gRPC schemas published and versioned, not a Slack message
- Independent deployability - a service that cannot be released without its neighbour is not really a microservice
- Observability by default - distributed tracing from day one, not bolted on during the first outage
- Circuit breakers and bulkheads - assume downstream services will fail and contain the blast radius
- Idempotent APIs - retries become safe, which is the whole point when the network is unreliable
Typical tech stack
Most production microservice stacks converge on a similar shape: Kubernetes or a managed container platform for runtime, Kafka or a managed equivalent for asynchronous messaging, a service mesh such as Istio or Linkerd for mTLS and traffic policy, and OpenTelemetry feeding Prometheus, Grafana, and a tracing backend.
A picture is worth a thousand words: 9 best practices for developing microservices.
— Bytebytego (@bytebytego) October 29, 2023
When we develop microservices, we need to follow the following best practices:
1. Use separate data storage for each microservice
2. Keep code at a similar level of maturity
3. Separate build… pic.twitter.com/Qjdmlf2kdY
What tech stack is commonly used for microservices?
— Alex Xu (@alexxubyte) October 30, 2023
Below you will find a diagram showing the microservice tech stack, both for the development phase and for production.
▶️ 𝐏𝐫𝐞-𝐏𝐫𝐨𝐝𝐮𝐜𝐭𝐢𝐨𝐧
🔹 Define API - This establishes a contract between frontend and backend. We… pic.twitter.com/Gnwhwxlxy7
System Design
The interesting system design decisions happen before you write any code. Once you have committed to a database, a messaging pattern, or an identity model, changing direction is expensive.
Questions worth answering up front
- What are the acceptable latency and availability targets, and who decides?
- Where is the source of truth for each piece of data, and who owns writes?
- What happens when each dependency is slow, down, or lying?
- How will you know the system is behaving correctly in production?
10 things about System design you should learn
— javinpaul (@javinpaul) October 29, 2023
1. Caching
2. Sharding
3. load-balancing
4. replication
5. fault-tolerance
6. high-avaibility
7. Performance
8. scalability
9. Performance
10. Indexing
learn more on Design Guru - https://t.co/VuZLWnCdMw pic.twitter.com/yRdilMCVqq
CI/CD
A healthy pipeline is short, fast, and trusted. A pipeline that takes 40 minutes and fails flakily is worse than no pipeline - people learn to ignore it.
- Keep main always releasable - treat a red main branch as an all-hands incident
- Deploy small changes often - smaller diffs are easier to review and easier to roll back
- Automate the rollback - if rollback is a manual runbook, it will be too slow in the moment
- Parallelise tests aggressively - engineer time is more expensive than CI minutes
Observability
The three pillars - metrics, logs, and traces - are the starting point, not the destination. What actually matters is whether you can answer “is this behaving correctly for real users?” without guessing.
- Define SLOs before you build dashboards
- Alert on symptoms (user-visible impact) not causes (CPU, memory)
- Log with structure, not prose - future-you will need to query it
Reference reading
- The Twelve-Factor App - still the clearest statement of how modern services should be built
- Google SRE Book - the source text for SLOs, error budgets, and toil management
- Microservices.io patterns - Chris Richardson’s pattern catalogue