Architecture notes from production systems
Practical architecture notes from production systems covering modular boundaries, cloud delivery, observability, and technical leadership.
Architecture work is most useful when it stays close to production constraints. The cleanest diagram is not a goal in itself. The goal is a system that a team can build, operate, explain, and change when requirements move.
These notes summarize the decision patterns I keep returning to across backend systems, cloud platforms, data-heavy products, and delivery leadership.
Start with boundaries, not services
The first architectural question is usually not whether the system should be a monolith or microservices. The better first question is where the business boundaries are and which data needs to change together.
A modular monolith can be a strong default when a team needs fast delivery, clear domain ownership, and transactional consistency. It allows the codebase to express module boundaries without forcing every boundary to become a network boundary. That matters when the product is still learning and when domain rules are still changing.
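As a rough illustration of a module boundary that is not a network boundary, the sketch below keeps two modules in one process and lets them talk only through a small contract. The names `BillingApi`, `InProcessBilling`, and `OrderService` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from typing import Protocol


class BillingApi(Protocol):
    """The only surface other modules are allowed to use to reach billing."""

    def charge(self, customer_id: str, amount_cents: int) -> str: ...


@dataclass
class InProcessBilling:
    """Billing module implementation that lives in the same process."""

    _charges: list = field(default_factory=list)

    def charge(self, customer_id: str, amount_cents: int) -> str:
        self._charges.append((customer_id, amount_cents))
        return f"charge-{len(self._charges)}"


@dataclass
class OrderService:
    """Orders module: depends on the BillingApi contract, not on billing internals."""

    billing: BillingApi

    def place_order(self, customer_id: str, amount_cents: int) -> dict:
        charge_id = self.billing.charge(customer_id, amount_cents)
        return {"customer_id": customer_id, "charge_id": charge_id}


orders = OrderService(billing=InProcessBilling())
print(orders.place_order("c-1", 1999))
```

If billing later needs its own lifecycle, an HTTP or messaging client that satisfies the same contract could replace `InProcessBilling` without changing `OrderService`.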
Separate services become more useful when lifecycle, scaling, ownership, security, or integration requirements genuinely differ. Identity is a common example. Authentication and authorization often have a different release cycle, compliance profile, and integration roadmap than the core product workflow. Pulling that part out early can be worth the operational cost.
Use events where they reduce coupling
Event-driven architecture is valuable when it decouples modules that should not block each other. It is risky when it becomes a substitute for understanding transaction boundaries.
Inbox and Outbox patterns are useful because they make the failure model explicit. A system can commit a business change and the intention to publish an event in the same transaction. A dispatcher can publish later. A consumer can record processed message identifiers. That makes retries, duplicate delivery, and replay part of the design instead of a late production surprise.
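A minimal sketch of the Outbox half, assuming a relational store (SQLite here, purely for illustration). The table names, the `OrderPlaced` event, and the `publish` callback are hypothetical. The business row and the event row commit together, and a separate dispatcher marks events as published only after the publish call returns.

```python
import json
import sqlite3
import uuid


def create_schema(conn):
    conn.executescript("""
        CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE outbox (
            id TEXT PRIMARY KEY,
            event_type TEXT NOT NULL,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
    """)


def place_order(conn, order_id):
    # One transaction: the order row and the outbox row commit or roll back together.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, status) VALUES (?, 'placed')", (order_id,)
        )
        conn.execute(
            "INSERT INTO outbox (id, event_type, payload) VALUES (?, ?, ?)",
            (str(uuid.uuid4()), "OrderPlaced", json.dumps({"order_id": order_id})),
        )


def dispatch_pending(conn, publish):
    """Publish unpublished events in insertion order; return how many were sent."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox "
        "WHERE published = 0 ORDER BY rowid"
    ).fetchall()
    sent = 0
    for event_id, event_type, payload in rows:
        # If publish raises, the row stays unpublished and is retried on the next run.
        publish(event_id, event_type, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (event_id,))
        sent += 1
    return sent


conn = sqlite3.connect(":memory:")
create_schema(conn)
place_order(conn, "o-1")
dispatch_pending(conn, lambda event_id, event_type, payload: print(event_type, payload))
```

If the process stops after `publish` succeeds but before the row is marked, the event is sent again on the next run. That is at-least-once delivery, which is why consumers need to tolerate duplicates.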
The same pattern becomes even more important in offline-first systems. When a laboratory, device, or edge installation reconnects after an outage, the architecture must expect delayed and repeated messages. Idempotent handlers and clear ownership of state are not optional details.
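One way to make a handler idempotent is to record each message identifier in the same transaction as the state change, so a repeated delivery becomes a no-op. The sketch below assumes SQLite and a hypothetical `device_totals` table. The function names and the message shape are illustrative only.

```python
import sqlite3


def create_inbox_schema(conn):
    conn.executescript("""
        CREATE TABLE inbox (message_id TEXT PRIMARY KEY);
        CREATE TABLE device_totals (
            device_id TEXT PRIMARY KEY,
            total INTEGER NOT NULL
        );
    """)


def handle_once(conn, message_id, device_id, delta):
    """Apply a message at most once. Return True if applied, False if duplicate."""
    with conn:
        cur = conn.execute(
            "INSERT OR IGNORE INTO inbox (message_id) VALUES (?)", (message_id,)
        )
        if cur.rowcount == 0:
            # The identifier is already recorded, so the effect was applied before.
            return False
        conn.execute(
            "INSERT OR IGNORE INTO device_totals (device_id, total) VALUES (?, 0)",
            (device_id,),
        )
        conn.execute(
            "UPDATE device_totals SET total = total + ? WHERE device_id = ?",
            (delta, device_id),
        )
    return True


conn = sqlite3.connect(":memory:")
create_inbox_schema(conn)
# A device reconnects and replays its backlog, including a repeated message.
for message_id, delta in [("m-1", 5), ("m-1", 5), ("m-2", 3)]:
    handle_once(conn, message_id, "dev-a", delta)
print(conn.execute("SELECT total FROM device_totals").fetchone()[0])
```

Because the identifier and the update share a transaction, a crash between them cannot leave a message marked as processed without its effect, or the reverse.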
Cloud decisions are operational decisions
Choosing among Azure App Service, AKS, Container Apps, and Functions, or AWS ECS, EKS, and Fargate, is not only a hosting decision. It changes deployment complexity, scaling behavior, observability, cost, and the skills required from the team.
For small services with straightforward triggers, serverless functions can reduce operational load. For longer-running services with predictable traffic, container platforms can give better control. For complex multi-service platforms, Kubernetes can be appropriate, but only when the team is ready to own the operational surface area.
The most reliable cloud architecture conversations include cost and on-call reality early. A technically elegant design that nobody can operate safely is not finished architecture.
Make performance measurable
Optimization work needs a concrete target. In queue-based workloads, CPU utilization alone may not be a useful signal. Queue depth, worker throughput, pod startup time, provider quotas, and total completion time often tell the real story.
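To show how those signals combine, here is a back-of-the-envelope model. It assumes steady arrival and processing rates and that all workers become available after one startup delay, which is a simplification. The function and parameter names are hypothetical.

```python
import math


def drain_time_seconds(queue_depth, arrival_rate, workers, per_worker_rate,
                       startup_seconds=0.0):
    """Estimate seconds until the backlog clears, or math.inf if it never does.

    Rates are in messages per second. The backlog keeps growing during startup.
    """
    net_rate = workers * per_worker_rate - arrival_rate
    if net_rate <= 0:
        return math.inf
    backlog_at_start = queue_depth + arrival_rate * startup_seconds
    return startup_seconds + backlog_at_start / net_rate


def workers_needed(queue_depth, arrival_rate, per_worker_rate, target_seconds):
    """Smallest worker count that clears the backlog within target_seconds."""
    required_rate = queue_depth / target_seconds + arrival_rate
    return math.ceil(required_rate / per_worker_rate)


# 1200 queued messages, 10 arriving per second, 4 workers at 5 messages per second.
print(drain_time_seconds(1200, 10, 4, 5))
print(drain_time_seconds(1200, 10, 4, 5, startup_seconds=30))
print(workers_needed(1200, 10, 5, target_seconds=60))
```

In this model CPU never appears. The completion time is driven by backlog, net throughput, and startup delay, which is why those numbers make better scaling targets.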
For web products, Core Web Vitals should be treated as user experience signals, not only SEO metrics. Largest Contentful Paint (LCP), Interaction to Next Paint (INP), and Cumulative Layout Shift (CLS) expose whether the page loads quickly, responds to interaction, and stays visually stable. The best fixes usually come from reserving image dimensions, shipping fewer client-side dependencies, rendering on the server where possible, and avoiding heavy third-party scripts.
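A small sketch of turning field measurements into a rating. The thresholds follow Google's published Core Web Vitals guidance, which assesses each metric at the 75th percentile of page loads. The helper names and the sample data are made up for illustration.

```python
import math

# (good upper bound, needs-improvement upper bound) per metric.
THRESHOLDS = {
    "LCP": (2500, 4000),  # milliseconds
    "INP": (200, 500),    # milliseconds
    "CLS": (0.1, 0.25),   # unitless layout shift score
}


def p75(samples):
    """75th percentile using the nearest-rank method."""
    ordered = sorted(samples)
    rank = math.ceil(0.75 * len(ordered))
    return ordered[rank - 1]


def rate(metric, value):
    """Classify a metric value as good, needs improvement, or poor."""
    good, needs_improvement = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= needs_improvement:
        return "needs improvement"
    return "poor"


# Hypothetical LCP samples in milliseconds collected from real page loads.
lcp_samples = [1000, 2000, 3000, 5000]
print(rate("LCP", p75(lcp_samples)))
```

Rating the 75th percentile rather than the mean keeps a handful of fast loads from hiding a slow experience for a large share of users.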
Documentation should explain decisions
Architecture documentation should preserve why a decision was made, not just what the final shape looks like. A C4 diagram, a short ADR, or a delivery note is useful when it helps a future engineer understand constraints, tradeoffs, and rejected options.
The documentation does not need to be heavy. It needs to be current enough to prevent repeated debates and specific enough to guide the next implementation decision.
Leadership is part of architecture
Architecture is not only written in code and diagrams. It also appears in backlog shape, sprint planning, review habits, mentoring, and discovery workshops. A good architecture that the team cannot execute will still fail.
The practical work is to keep the system understandable, keep the team aligned on boundaries, and make tradeoffs visible before they turn into production incidents.