
Software engineering, design, and psychology

Problems of the “One Database per Microservice” Approach | Microservice Architecture — Ep. 16

As discussed in the previous post, the proper approach to data ownership in microservice architecture is for each service to have its own database hidden behind a network API.

While this approach enables independent service development and deployment, it introduces significant downsides:

  • Accessing foreign data now requires a network call, typically adding 100 – 200 ms of latency.
  • Cross-service joins become inefficient: data must be fetched from multiple services, normalized, and only then combined.
  • Cross-service transactions get much more complicated — some services may use storage technologies that don’t support transactions at all!

How to cope with these problems?

  • Cache foreign data locally to reduce network calls. This improves latency but introduces tough questions: what to cache and when to invalidate.
  • For complex joins, create a dedicated microservice that gathers data on schedule and stores join results in its own database (a benefit here is that it can keep and provide results of previous computations).
  • For cross-service transactions, use sagas. Each service defines forward and compensating actions, coordinated either via orchestration (by a separate coordinator service) or choreography (event-driven coordination configured for each participating microservice); see the sketch below.
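
To make sagas concrete, here is a minimal orchestration sketch in Python. It is an illustration under assumptions, not a production pattern: the step names are hypothetical, and a real coordinator would persist saga state and make idempotent, retried network calls to the participating services.

```python
# A minimal saga orchestration sketch. Each step pairs a forward
# action with a compensating action; if any step fails, completed
# steps are undone in reverse order. Service calls are stubbed out.

class SagaStep:
    def __init__(self, name, action, compensation):
        self.name = name
        self.action = action              # forward call
        self.compensation = compensation  # undo call

def run_saga(steps):
    completed = []
    try:
        for step in steps:
            step.action()
            completed.append(step)
    except Exception:
        for step in reversed(completed):  # compensate in reverse order
            step.compensation()
        raise

# A hypothetical order-placement saga; real services would be called
# over the network, with retries and idempotency keys.
run_saga([
    SagaStep("reserve-stock", lambda: print("stock reserved"),
             lambda: print("stock released")),
    SagaStep("charge-card", lambda: print("card charged"),
             lambda: print("card refunded")),
    SagaStep("create-shipment", lambda: print("shipment created"),
             lambda: print("shipment cancelled")),
])
```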

These solutions are not simple. The “one database per service” principle trades local simplicity for system-level complexity — but it remains the most scalable and sustainable approach for microservice architectures.

One Microservice — One Database | Microservice Architecture — Ep. 15

The core idea of microservice architecture is decoupling: separate teams, codebases, lifecycles. Yet often it is still tempting to share a database between multiple services, especially when the system is relatively small. But it is the wrong decision.

Why shared databases break microservices:

  1. Microservice A cannot write to tables owned by microservice B, as each microservice must own its data undivided.
  2. Microservice A cannot read tables owned by microservice B, as any schema change, table or row permission update, or migration can silently break downstream services.
  3. Microservice A cannot keep its data in a separate table of microservice B’s database. If B changes its DB technology or scaling strategy, A’s tables have to be moved elsewhere, but who owns the migration? The issue gets especially complex once more than two services are involved.
  4. With direct DB connections, the I/O load from a fast-growing service can become intense enough to degrade the operation of unrelated services.

This is why each microservice needs its own database. All access to that data must go through the service’s API or events, and never through direct database connections.

Book review: “Data Pipelines Pocket Reference”, James Densmore

Software engineering is all about manipulating data. A big portion of a software engineer’s attention is drawn to collecting data from users and presenting it back to them in a useful form. However, there is another side of data — the kind that is not produced by software users but only consumed by them. Here, we aim for a single source of truth, data validity and availability, and performant processing (for analysis or presentation).

To get a better grasp of the tooling for working with such data, this week I read Data Pipelines Pocket Reference by James Densmore. The book focuses on the modern ELT (Extract-Load-Transform) approach, as well as EtLT (with ‘t’ for generic non-business-related data transformation).

It turned out to be a very practical pocket guide indeed. Each of the book’s sections on data extraction, loading, and transformation is supplied with clear code snippets in Python. The snippets demonstrate how to connect to essential services such as databases, AWS S3, Amazon Redshift, Snowflake, and Apache Airflow, as well as the basics of data manipulation.
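
To give a flavor (this snippet is mine, not quoted from the book): a minimal extract-and-load step that dumps a table to CSV and pushes the file to S3 with boto3. The connection parameters, table, and bucket are placeholders.

```python
# Extract a table to CSV, then load the file into S3: the E and L of
# an ELT pipeline. Host, credentials, table, and bucket names are
# placeholders.
import csv

import boto3
import psycopg2

conn = psycopg2.connect(host="localhost", dbname="shop",
                        user="etl", password="secret")
with conn.cursor() as cur, open("orders.csv", "w", newline="") as f:
    cur.execute("SELECT order_id, amount, created_at FROM orders;")
    csv.writer(f).writerows(cur)  # the cursor yields rows as tuples

# The warehouse (e.g., Redshift or Snowflake) later ingests the file
# from S3, and transformation happens inside the warehouse.
boto3.client("s3").upload_file("orders.csv", "my-data-lake",
                               "orders/orders.csv")
```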

I liked two things. First, the snippets feel production-ready. Granted, they include no robust error handling, but they are sufficient to start moving data around, running validations, and applying transformations. Second, the author focuses not only on interaction with services but also shares tricks for data processing and validation. In particular, there is a neat data testing framework based on separate Python scripts for each check, which can be integrated into Airflow workflows (sketched below). The approach, while quite lean, requires a certain mindset to arrive at, so this bit of knowledge alone saves time and builds a scalable data processing foundation.
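
As far as I can sketch it without quoting the book, each check lives in a standalone script that signals failure through its exit code, which an Airflow task (e.g., a BashOperator) then surfaces as a failed task. The table, condition, and connection details below are my placeholders.

```python
# validate_orders.py: a single data-quality check. A non-zero exit
# code fails the wrapping Airflow task. Connection details and the
# checked condition are placeholders.
import sys

import psycopg2

conn = psycopg2.connect(host="localhost", dbname="warehouse",
                        user="etl", password="secret")
with conn.cursor() as cur:
    cur.execute("SELECT COUNT(*) FROM orders WHERE amount < 0;")
    bad_rows = cur.fetchone()[0]

if bad_rows:
    print(f"check failed: {bad_rows} orders with negative amount")
    sys.exit(1)  # Airflow marks the task as failed
print("check passed")
```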

That said, I think the book lacks examples closer to real practice. It would benefit from a companion GitHub repository with a substantial dataset to run ELT against, in addition to the primitive data samples from the book, which take no more than 10 rows and 5 columns in a single SQL table. The book also skips any in-depth discussion, making it a pocket reference, indeed.

What it covers:

  • Data roles: data engineering, data analytics, and data science
  • Types of pipelines: ETL vs. ELT vs. EtLT
  • Overview of tools for each ELT step and their orchestration
  • Minimal instructions for setting up data ingestion and transformation
  • Approaches to pipeline orchestration
  • A framework for data validation
  • Building pipelines with monitoring and maintenance in mind

Verdict: 4 / 5 — a good reference to start building simple ELT pipelines in a day, which is likely exactly what a general software engineer would want if data engineering is not their primary area of specialization.

How to Migrate from a Monolith to Microservices (Without Regretting It) | Microservice Architecture — Ep. 14

When an architectural change this big is agreed upon, it is tempting to migrate the whole monolith at once: freeze all current development, develop a plan, split the tasks, and start. But beware!

If the decision to migrate is made, it is likely we already have a large codebase and an excessively large team that has lost coordination and velocity. The business will not accept months without new features, either.

The right way is to go incremental. Create a small team of experienced engineers, isolate a narrow component for migration, and set no hard deadlines. You will uncover unexpected issues early — but with limited scope, the cost of mistakes stays low and progress is still clearly visible.

How to execute the migration?

  1. Define the API of the component to be extracted.
  2. Ensure its thorough automated test coverage with a 100% pass rate.
  3. Isolate the component by removing dependencies on the monolith.
  4. Extract into a microservice and ensure all tests pass.
  5. Put the microservice behind a facade (e.g., an API gateway).
  6. Route part of production traffic to it (see the routing sketch after this list).
  7. Monitor the service under the real load and fix issues.
  8. Remove the extracted component from the monolith.

This approach is commonly known as the Strangler Fig Pattern.
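
For step 6, the facade can shift traffic gradually. Below is a hypothetical sketch of weighted routing in Python; the upstream URLs and the 10% share are made up, and a real API gateway would express this in configuration rather than code.

```python
# Weighted routing in a facade: send a small share of requests to the
# extracted microservice, defaulting to the monolith. URLs and the
# traffic share are hypothetical.
import random

MONOLITH_URL = "http://monolith.internal/billing"
SERVICE_URL = "http://billing.internal"
NEW_SERVICE_SHARE = 0.10  # raise gradually as confidence grows

def pick_upstream() -> str:
    """Choose which upstream should handle the next request."""
    if random.random() < NEW_SERVICE_SHARE:
        return SERVICE_URL
    return MONOLITH_URL
```

Once the new service holds up under full traffic, step 8 removes the old code path from the monolith.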

Microservice migrations most often fail not because of technology, but because the change must be both technical and organizational.

Where to Split? | Microservice Architecture — Ep. 13

A classic question when designing microservices: what defines a good boundary?

One approach is to split by business capabilities. For an online store they can be:

  • Product Catalog
  • Search
  • Reviews
  • Cart
  • Billing
  • Shipping

Capability-based splitting usually leads to very stable services, as business needs evolve slowly and rarely require large-scale changes.

Another approach is to split by technical domains:

  • Core — unique functionality of this particular business (e.g., Product Catalog)
  • Supporting — industry-common helpers to the core (e.g., Cart, Billing, Shipping)
  • Generic — non-unique, non-specific functionality (e.g., Auth, Search, Payment, Image Processing)

This model is more intuitive for engineers. It also makes it easier to decompose large domains into smaller services when scale or ownership demands it.

Both approaches focus on cohesion and loose coupling and achieve quite stable microservice boundaries under changing requirements.

Book review: “Web Scraping with Python”, Ryan Mitchell

Recently I faced a challenge of designing a web crawling and scraping system. To build context, I started my work by reading Web Scraping with Python: Data Extraction for the Modern Web by Ryan Mitchell (3rd edition, revised in 2024).

The book turned out to be a very pleasant read. The author’s approach is well structured, with chapters going from simple practical tasks and a legal overview to advanced considerations such as Natural Language Processing and race conditions in distributed scraping systems. Code snippets are concise and useful — I can imagine them being used in a small-scale production system. In two days, I managed to build a solid understanding of common approaches, architectures, problems, and solutions in this field.

That said, the content is not without its flaws. The section on JavaScript and SSR is so outdated it is almost hilarious. Mentions of Dynamic HTML, jQuery, and AJAX calls are appropriate for a book written around 2010, but not for a revised version from 2024. Nonetheless, even this section is useful at a conceptual level: modern SPAs achieve the same goals as early 2000s web applications that generated dynamic HTML server-side and sent it to browsers.

The ease with which I read this book was strongly influenced by my existing knowledge of the web. Over the years, I have built a solid foundation in HTML and CSS, JavaScript and Python, APIs, application architecture, and networking — all of which helped me clearly see the connections between the author’s ideas. However, the book should still be accessible to any technical reader, thanks to its clear explanations and practical code examples.

What it covers:

  • Principles of web technologies
  • Legal and ethical considerations
  • Common scraping use cases
  • Building web crawlers and scrapers
  • Crawling strategies
  • Transformation and validation of collected data
  • Parsing text and image documents
  • Scraping traps
  • Distributed scraping systems

Verdict: 4.5 / 5 — a go-to practical guide for those planning to build their own scraping system.

Principles Behind Good Microservice Boundaries | Microservice Architecture — Ep. 12

  1. Single Responsibility. Consumers of a service need to clearly understand its purpose.
  2. High Cohesion and Exhaustiveness. All related functionality has to end up in the same service.
  3. Low Coupling. Services should minimize dependencies on others — even if that means duplicating supporting code. Big codebases for microservices are OK as long as the services are cohesive and centered around a single business capability.

Imagine we have a patient data management system for hospital intensive care units (ICUs), built as a monolith. It manages patient information, prescriptions, treatment history, and vital parameters, and provides a dashboard with the current state and therapy.

❌ Wrong decomposition:

  • dashboard
  • care plan
  • monitoring

Why? Suppose we add a new drug to the system. All services must change:

  • care plan — to prescribe it
  • monitoring — to display past treatments with the new drug
  • dashboard — to show current therapy

✅ Better approach:

  • drugs
  • vital parameters
  • timeline (treatment plans, current state, history)

Why? Available drugs and vital parameters change in one place, while evolution of timeline does not affect other services. Each service owns a stable business concept, not a UI page.

Good microservice boundaries reduce change coordination, not just code size.

Benefits and Challenges of Microservice Architecture | Microservice Architecture — Ep. 11

The core of the microservice approach rests on two ideas:

  • narrow business domains
  • small and independent teams (two-pizza team size)

So, splitting monoliths into microservices immediately brings benefits:

  • smaller codebases — easier to comprehend, faster to change
  • higher cohesion — easier to comprehend, more reasonable to scale when needed
  • smaller build sizes — cheaper infrastructure and better horizontal scaling

These benefits are balanced by fundamental trade-offs:

  • the system becomes distributed — network delays and partitions become ordinary and must be designed for
  • the system becomes asynchronous — integration and end-to-end testing become significantly harder
  • events are now processed by service chains — bugs become harder to trace, reproduce, and reason about
  • wrong service boundaries are expensive — errors here lead to numerous inter-service dependencies, spawning a “distributed monolith”, combining cons of both approaches while bringing few benefits

Microservices shift complexity from code size to communication structure, forcing boundaries between business logic and supporting infrastructure.

Problems of Monoliths in Growing Projects | Microservice Architecture — Ep. 10

When a project grows successfully, the amount of code grows with time, and more devs are hired. This often leads to:

  • a bloated codebase — harder for new engineers to understand, slower to change
  • tight coupling between many modules — releases require more coordination and happen less frequently
  • growing coordination overhead — N team members can form up to N(N−1)/2 communication paths, growing quadratically
  • legacy accumulation — dependencies receive updates, but codebase upgrades are postponed due to fear of breaking changes
  • longer build times
  • increasing hardware requirements for all environments

This is a time when teams start thinking about splitting the system into smaller, independently evolving parts.

Would you name the approach? :)

It is Right to Start with a Monolith | Microservice Architecture — Ep. 9

Microservices have many benefits — but they are not a default choice for greenfield projects.

Monoliths have strong advantages that service-oriented approaches cannot offer:

  • the whole app is deployed as a single unit: if it compiles, cross-module integration is likely correct
  • development and refactoring are simpler: the compiler helps to track if changes are complete
  • cross-module calls are synchronous or close to synchronous, taking a few milliseconds at most
  • testing is easier, with clear targets of integration and e2e tests
  • debugging is simpler: spin up the app locally, set breakpoints, attach a profiler — and you have the whole system for inspection

When monoliths are OK:

  • small projects
  • small teams
  • unclear domain
  • unclear scaling requirements

...basically, in most new projects.

Decomposition of ESBs into Cloud Services | Microservice Architecture — Ep. 8

Let’s revisit the responsibilities of Enterprise Service Buses:

  • service discovery and communication
  • request routing
  • protocol mediation
  • authN and authZ
  • rate limiting
  • logging and monitoring
  • workflow orchestration

These capabilities are generic. Any sufficiently complex system needs them — but they don’t need to live inside a single, centralized component.

In cloud-native systems, ESBs were effectively decomposed into specialized services:

  • service discovery -> service mesh tools (AppMesh, Linkerd)
  • service communication -> messaging and event streaming platforms (Kafka, SQS)
  • request routing, rate limiting, auth -> API gateways
  • logging and monitoring -> observability tools (CloudWatch, CloudTrail)
  • orchestration -> workflow engines (Step Functions)

What remained were microservices themselves:

  • independently developed
  • independently deployed
  • independently scaled
    ...units of a system, focused on isolated business capabilities.

Tech Background of the Early 2010s | Microservice Architecture — Ep. 7

The late 2000s to early 2010s marked the emergence of cloud computing. AWS and Google Cloud made on-demand compute available at scale.

The hardware and infrastructure assumptions shifted:

  • ephemeral instances became the norm
  • horizontal scaling became cheaper
  • infrastructure-as-code tools appeared and matured
  • instance and network failures became expected, not exceptional

At the same time, communication protocols and platforms stabilized around clear leaders:

  • HTTP as the universal transport
  • JSON as the universal data language
  • REST / gRPC as dominant API styles
  • Linux as the default server OS

This convergence reduced the need for protocol mediation and heavy integration layers — some of the reasons ESBs were invented. The switch from SOA was not driven only by its limitations, but also by changes in technology.

Drawbacks of Enterprise Service Buses | Microservice Architecture — Ep. 6

ESBs gathered all control over the system in a single place: communication, schemas, orchestration, infrastructure. Over time, this revealed systemic issues:

  • The ESB became a single point of failure: a bug could bring down the entire system.
  • ESB changes were risky and slow: engineers had to deeply understand schemas, adapters, business rules, and their interdependencies.
  • Updates to services became problematic: even a small change in API required coordination with the ESB integration layer and orchestration logic.
  • The ESB team turned into a bottleneck, struggling to catch up with changes in different services.
  • Horizontal scaling was limited: ESBs commonly relied on vertical scaling and expensive hardware.

These issues slowed innovation and adaptability, turning large systems rigid, slow, and outdated. Another kind of architectural approach had to appear — one favoring independent ownership, decentralized control, and horizontal scaling of any service.

Peak of SOA — Enterprise Service Bus | Microservice Architecture — Ep. 5

The core of service-oriented architecture is centralized governance over how diverse services provide their capabilities and communicate.

This idea led to the Enterprise Service Bus (ESB) — a central integration layer connecting all services, handling:

  • protocol conversion
  • message translation into enterprise-wide data models
  • request routing
  • security rules for all services (auth, rate limiting, access control)
  • centralized logging and auditing
  • workflow orchestration, including cross-service transactions and compensations

ESBs made sense in heterogeneous enterprise environments — but they also concentrated complexity and control in one place. Many of today’s architectural advancements are reactions to this tradeoff.

Service-Oriented Architecture (SOA) | Microservice Architecture — Ep. 4

SOA is a predecessor to microservices. It is an architectural style that treats services as independent, heterogeneous providers of business capabilities.

“Heterogeneity” here acknowledges that services may:

  • belong to different vendors
  • run on different platforms
  • be written in different languages
  • communicate over different protocols
  • be developed by different teams

SOA emphasizes:

  • well-defined, explicit service interfaces
  • stability and genericity of contracts to serve multiple consumers over long periods of time
  • service discovery via service registries
  • centralized service administration with approval of contracts, schemas, and compatibility guarantees

The focus on central governance and long-lived, broadly reusable contracts is the key distinction between SOA and modern microservices. It is also SOA’s key limiting factor: services cannot evolve quickly because of dependence on central governance and the strictness of agreed-upon interfaces.

Microservices vs. Traditional Services | Microservice Architecture — Ep. 3

A microservice is not a “small service” — it is a service with stricter constraints.

A microservice:

  • owns a single business capability (“bounded context” in DDD terms)
  • has limited to no dependencies on other microservices
  • is deployed and versioned independently, without coordination with its consumers
  • evolves its API in a backward-compatible way, giving consumers time to upgrade
  • is owned by a team small enough to understand and operate it end-to-end (the “two-pizza team”)

What the ‘micro-’ prefix does not mean:

  • trivial logic
  • a small codebase
  • few endpoints

A microservice can be large and complex, as long as it remains cohesive and operable by a small team. And it can be small and simple too — if that’s what its business capability or scaling needs require.

Services vs. Components | Microservice Architecture — Ep. 2

What is the difference between a service and a component? Both should be cohesive and loosely coupled to the rest of the application, both should solve a single problem, and both have no hard limits on size. The key difference is the boundary.

Compared to a component, a service:

  • does not share memory with other processes
  • is not accessed directly — only via a network
  • fails independently, without crashing the entire application
  • scales independently from the main process or other services

Components are in-process abstractions.
Services are distributed system units.

What is a Service? | Microservice Architecture — Ep. 1

This is the beginning of a series on Microservices & Event-Driven Architecture (MEDA). The series explores the theme from its historical context to practical topics like testing, deployment, and observability.

All research and writing are done by me. The ideas are drawn from respected books and lectures, as well as my own professional experience. No AI is used to generate the content itself; I use ChatGPT only for editing, as English is not my native language, and I believe the texts benefit from AI corrections of my grammar and fluency.

I hope you find this series helpful and interesting. If you notice any errors or have suggestions, feel free to contact me at george@mishurovsky.com or leave a comment — I read them all.

Now, let’s proceed to the topic.

It helps to settle the fundamentals before diving into modern software architecture buzzwords like ‘microservices’ and ‘event-driven systems’.

A service is:

  • a self-contained unit of functionality
  • serving a specific business purpose
  • owning both its logic and data
  • deployed independently
  • providing capabilities through a standardized interface
  • accessed through a network boundary (real or assumed)

Note the emphases on autonomy and boundaries (functional and communicational). Without those, we’re talking about components, not services.

Understanding this distinction makes architectural discussions clearer and prevents “microservices” from becoming just a fancy label for a distributed monolith.

Bookshelf

Below is a list of books I’ve finished — and those I plan to read. I will update it from time to time as I discover new titles. Welcome!

If you are a junior developer, consider starting with the marked titles. As for senior engineers, I hope you’ll find some interesting reads here as well.

You are also welcome in the comments on this post and on the linked book reviews! What do you think? Which other books are worthy of attention?

Notes on marks:
⭐️ — Brilliant, must read
🧱 — Foundational, recommended for beginners in a particular technology or software engineering in general

Finished

General Software Engineering

  • Clean Architecture — R. C. Martin 🧱
  • Clean Code — R. C. Martin
  • Code Complete — S. McConnell 🧱
  • Design Patterns — E. Gamma 🧱
  • Domain Modeling Made Functional — S. Wlaschin
  • Domain-Driven Design — E. Evans ⭐️
  • Patterns of Enterprise Application Architecture — M. Fowler ⭐️
  • Professor Fisby’s Mostly Adequate Guide to Functional Programming — B. Lonsdorf 🧱
  • Refactoring — M. Fowler ⭐️
  • The Object Oriented Way — C. Okhravi 🧱 (Review)

Working with Data

  • Designing Data-Intensive Applications — M. Kleppmann ⭐️
  • Data Pipelines Pocket Reference — J. Densmore (Review)
  • Learning SQL — A. Beaulieu 🧱

DevOps & Cloud Computing

  • AWS Certified Solutions Architect Associate (SAA-C03) Cert Guide — M. Wilkins
  • Continuous Integration — P. M. Duvall 🧱

Design

  • Practical UI — A. Dannaway
  • Refactoring UI — A. Wathan
  • The Elements of Color — J. Itten ⭐️

Management & Leadership

  • Fundamentals of Project Management — J. Heagney
  • Getting Real — D. H. Hansson ⭐️
  • Start with No — J. Camp ⭐️

Particular Technologies

  • AI Engineering — C. Huyen 🧱
  • Effective TypeScript — D. Vanderkam
  • Node.js Design Patterns — M. Casciaro
  • Web Scraping with Python — Ryan Mitchell 🧱 (Review)

In-Progress

  • Continuous Delivery — D. Farley
  • Continuous Deployment — V. Servile
  • Introduction to Algorithms — T. Cormen
  • Purely Functional Data Structures — C. Okasaki
  • Stylish F# — K. Eason
  • Systems Engineering Principles and Practice — A. Kossiakoff
  • The Art of PostgreSQL — D. Fontaine

Waiting on the Shelf

  • Accelerate — N. Forsgren
  • Building Microservices — S. Newman
  • Distributed Services with Go — T. Jeffery
  • Grokking Simplicity — E. Normand
  • Philosophy of Software Design — J. Ousterhout
  • Serverless Development on AWS — S. Brisals
  • Software Architecture — N. Ford
  • Structure and Interpretation of Computer Programs — H. Abelson
  • Team Topologies — M. Skelton
  • The Linux Command Line — W. Shotts

Dijkstra’s Algorithm is Basically a BFS Algorithm

A small note on a commonly mentioned algorithm — trying not to sound too pretentious 😅

If you, like me, get startled every time you see Dijkstra’s algorithm, forgetting how it works exactly — it is essentially a breadth-first search (BFS), but with two twists:

  • The graph is weighted
  • The queue is min-priority, not FIFO

So instead of blindly processing nodes in a queue one by one, we always pick the node with the lowest cumulative distance.

Once you realize this, the algorithm becomes quite simple to implement, and most of the complexity moves into building an efficient min-priority queue based on a Fibonacci or pairing heap.
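
For illustration, here is a compact implementation with Python’s heapq (a binary heap; a Fibonacci or pairing heap improves the asymptotics, not the structure):

```python
# Dijkstra's algorithm as "BFS with a min-priority queue": always pop
# the node with the smallest cumulative distance instead of FIFO order.
import heapq

def dijkstra(graph, start):
    # graph: {node: [(neighbor, weight), ...]}
    dist = {start: 0}
    queue = [(0, start)]                # (distance, node)
    while queue:
        d, node = heapq.heappop(queue)  # closest node first
        if d > dist.get(node, float("inf")):
            continue                    # stale entry, skip it
        for neighbor, weight in graph[node]:
            new_d = d + weight
            if new_d < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_d
                heapq.heappush(queue, (new_d, neighbor))
    return dist

graph = {"a": [("b", 1), ("c", 4)], "b": [("c", 2)], "c": []}
print(dijkstra(graph, "a"))  # {'a': 0, 'b': 1, 'c': 3}
```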

Strictly speaking, it is BFS that is a special case of Dijkstra’s algorithm for unweighted graphs, not the other way around.
