Notes about Monolith to Microservices

 These are my notes for my future self about Sam Newman's book Monolith to Microservices.

Chapter 1 - Just Enough Microservices

What are Microservices?

Microservices are independently deployable boxes modelled around business domains, which communicate with each other over a network. They are a form of Service-Oriented Architecture (SOA) which is opinionated about how service boundaries should be drawn. The services encompass both behavior and data - they own their underlying database and they expose the data via a well-defined API.

Sam Newman stresses that independent deployability is the absolute key. To achieve it, we need loosely coupled services, which means services should have stable contracts between them. That also means microservices should not share their databases directly.

Microservices are modelled around business domains. If done correctly, we should almost never have to change more than one service for any given feature. If we regularly had to, the result would be even harder to work with than the monolith. The ideal decomposition mirrors how the company is decomposed into teams, per the famous Conway's Law. Where the old basic layered architecture copied the traditional technical silos, our aspirations have since changed: teams these days are poly-skilled and concerned with their business domain as a whole. Reducing the number of services shared between teams is key to reducing delivery contention.

According to Newman, a service should contain it all - the database, the backend and the frontend, in one package. Even after reading the whole book, I still don't understand how Sam Newman would package modern frontends (Android, React, ...) with the backend logic, so I consider this idea kind of bullshit. What he could have meant is that with an old-school-ish frontend, e.g. JSPs or Wicket, the backend and the frontend could be deployed as one package. Another possible explanation of this seemingly crazy idea is that we should look at Micro Frontends, i.e. independently deployable frontend parts. They can live in the team's source code repository, which makes some sense.

Advantages of Microservices

Independent deployability opens options to improve scalability and robustness. Microservices also allow you to mix technologies, and they can be simpler to understand given their smaller size. None of these advantages comes for free, so we have to have a clear vision of what exactly we are trying to achieve by refactoring to microservices.

Disadvantages of Microservices

It's the network. Latencies will vary. Networks sometimes fail. We will likely have to ditch database transactions and work out how to maintain data integrity across multiple machines some other way. Microservices honestly seem like a bad idea, except for all the good stuff. As James Lewis put it, "Microservices buy you options."

We should not try to learn all the new technologies just because we want to implement microservices. Microservices have no need for Kubernetes, Docker, containers or the public cloud. If our strong side is PHP, we should use PHP to implement them.

Size

We should not worry about the size of microservices; we should instead ask ourselves how many we can handle. Newman is a strong advocate of implementing microservices incrementally. But as we add more and more microservices, how do we avoid ending up with a horribly entangled mess? The whole idea behind microservices - the reason they were invented - is that some people were optimizing SOA services for replaceability. Each of them should therefore fit into a developer's head.

Monolith

There are three kinds of monolith. A basic single-process monolith has no internal modularity. A modular monolith breaks the single process into a couple of modules, and each module can even use its own database. Shopify is famous for using a modular monolith instead of a microservices architecture and being successful at it; more about their story is in the video http://bit.ly/2oauZ29. The third kind is a distributed monolith: a set of services which, for whatever reason, always have to be deployed together. Sometimes we need to refactor a 3rd-party black box into microservices, which is possible as well; we just cannot use the techniques which require changing the monolith's code (and there are many of those).

Challenges of Monoliths

Monoliths are more vulnerable to implementation and deployment coupling. They force delivery contention on us - multiple teams have to coordinate delivery. They make it confusing who owns which part. Microsoft has studied this: http://bit.ly/2p5RlT1.

The coupling can be temporal (e.g. synchronous calls which have to be made in a specific order). It can be mitigated by asynchronous communication between services, e.g. some kind of message or event queue.
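
For illustration, a minimal sketch of such asynchronous decoupling, assuming Kafka as the broker (the topic name and payload are invented):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    // Hypothetical example: the ordering side emits an event and moves on,
    // instead of synchronously calling every interested service in order.
    class OrderEvents {
        private final KafkaProducer<String, String> producer;

        OrderEvents() {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            producer = new KafkaProducer<>(props);
        }

        void orderPlaced(String orderId) {
            // fire-and-forget: consumers process this whenever they are ready
            producer.send(new ProducerRecord<>("order-placed", orderId, "{...}"));
        }
    }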

Deployment coupling means that one change forces multiple services to be deployed. This should be avoided: being able to deploy just a small subset of services goes to the heart of continuous delivery.

Domain coupling is when services depend on each other too much. This can be mitigated by a message broker or events. Some knowledge of Domain-Driven Design is desirable to understand how microservices should be decoupled. An aggregate is a business object which has a lifecycle. A microservice will typically handle a couple of aggregates. Bounded contexts are boundaries around a group of aggregates. So the microservices could initially be created around bounded contexts, and once we are comfortable with that, we can further decompose them into microservices around aggregates.
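
As a toy illustration with invented names, an aggregate guards its own lifecycle like this:

    // DDD sketch: an Order aggregate with a lifecycle. A microservice built
    // around the "ordering" bounded context would own a handful of aggregates
    // like this one, including their storage.
    class Order {
        enum State { PLACED, PAID, SHIPPED, CANCELLED }

        private final String id;
        private State state = State.PLACED;

        Order(String id) { this.id = id; }

        void markPaid() {
            if (state != State.PLACED) {
                throw new IllegalStateException("only a placed order can be paid");
            }
            state = State.PAID;   // lifecycle transitions are guarded by the aggregate
        }

        String id() { return id; }
        State state() { return state; }
    }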

Advantages of Monoliths

Because there is so little network involved, monoliths have a much simpler deployment topology and therefore avoid the pitfalls of distributed systems. They also have a much simpler developer workflow, monitoring, troubleshooting and end-to-end testing, and they simplify code reuse. Unfortunately people have started to view monoliths as something to be avoided, but they are a valid architecture.

Chapter 2 - Planning a Migration

It is important to know that microservices are not the goal. You don't "win" by implementing them. Without an idea of what we are trying to achieve, we have most probably fallen into a cargo cult mentality. The most common failure is when managers dictate that technical staff implement microservices, most probably because some top-level manager told them to. We should instead answer three key questions:
  1. What are we hoping to achieve?
  2. Have we considered all alternatives?
  3. How will we know if the transition is working?

Why might we choose Microservices?

Every company has a different situation and has to come up with its own motivation.

We could try to improve team autonomy. That said, this has nothing to do with microservices; we could just assign more responsibilities to the teams without them. A modular monolith could be helpful here. We should avoid silos as much as possible.

Or we could try to optimize time to market. But before jumping to microservices for that, we should do some kind of path-to-production modelling. We might be surprised that the problem is not what we think.

Or we could perhaps want to scale according to load to save costs. We could also achieve this by vertical and/or horizontal scaling of a monolith. This doesn't work if the bottleneck is in the database, but otherwise it's a much cheaper solution than a full-blown microservices architecture.

Maybe we want to improve robustness. But microservices don't give us robustness for free. Quite the contrary: they could decrease it, because there are more moving parts to worry about. Moreover, we could achieve this by running multiple instances of our monolith. The real way to improve robustness is to avoid reliance on manual processes.

We might want to scale the number of developers. As we know from The Mythical Man-Month, adding more developers doesn't increase effectiveness linearly. Microservices are smaller parts to focus on, but we could achieve the same with a modular monolith - at the cost of some deployment coordination between teams.

Maybe we want to try different technologies to build the services with. Although possible, most organizations end up limiting technology choices per service for practical and maintenance reasons.

The above were some good examples, and Sam Newman added a bad one: reuse. Reuse, while often stated as a reason to go for microservices, is not a real goal in itself. BTW, reuse might actually get worse.

When might Microservices be a bad idea? 

When we have an unclear domain, e.g. we are a startup - getting service boundaries wrong might be very expensive. Or when we are creating an out-of-the-box application or a library: we cannot expect the client to respect the demanding operational processes our microservices would need. Or, in most cases, when we don't have a good reason. Newman recommends having a clear idea of which reason is the dominant one.

Changing Organizations

What follows is a generic approach to changing organizations from Dr. John Kotter. It has 8 steps:
  1. Establish a sense of urgency. An ideal time to do this is after resolving some crisis which microservices might be a solution to. 
  2. Create a guiding coalition. We don't have to have everyone on board, but a couple of people and ideally someone more senior would do.
  3. Develop a vision and the strategy. From the point of view of this book, the strategy is microservices; the vision is the reason why we are doing it.
  4. Communicate the change vision. We can start small and incrementally refine our message.
  5. Empower employees for broad-based action. This usually means to remove obstacles somehow.
  6. Generate short-term wins. Functionality that is easy to extract from the monolith should be high on our priorities list.
  7. Consolidate gains and produce more change. We shouldn't delay decomposing the database forever.
  8. Anchor new approaches in the culture. For a change to stick, we need to share information in our organization.

Importance of Incremental Migration

We should chip away at the monolith, extracting bits at a time. The stock advice for those starting with microservices is to start small. The vast majority of lessons will be learned in production. The decisions we make should lean toward the reversible end of the spectrum.

Where to start 

We should sketch out our proposed design. Do we see circular dependencies? Are some of the services too chatty, suggesting they should be one thing? Each bounded context represents a potential future service. What we need from DDD is just enough information to make a decision. We could also use a technique called Event Storming, a bottom-up approach where technical and non-technical people brainstorm events, then model aggregates from them and finally bounded contexts. The resulting domain model can be used for prioritization: a bounded context with mostly outbound dependencies should be easier to extract than one with more inbound dependencies. We should prioritize service decomposition along two axes: benefit and ease of decomposition.

Reorganizing Teams

The technical silos should break down; this happens naturally in most modern companies. But there is no one size fits all. What works for Spotify might not work for an investment bank. We should copy the questions, not the answers. As an example, we probably shouldn't force 24/7 support on the development team immediately. Operations staff are used to this kind of work, but others might not be.

We should begin by writing down all the activities of delivery teams and mapping them to our current organizational structure. Then we might think about reassigning some of the responsibilities to merged teams.

Newman is a fan of people self-assessing their technical skills and which new ones they want to improve. This assessment has to be kept private. Changing the skills of current employees is not the only way; we could also hire more experts.

How do we know the Transition is working?

Do we know if it is working? Did we make a mistake? We should try to define some kind of measures to help us answer those questions. Having regular checkpoints might help:
  1. Restate our goal, why we are implementing microservices.
  2. Review any quantitative measures to verify progress. These measures depend on the overall goal we want to achieve.
  3. Ask for qualitative feedback - what do people think about our microservices implementation? Do they still think it is a good idea? Beware of the sunk cost fallacy, where people are so invested in previous approaches that they cannot see the current approach is not working. Small incremental steps help here.
  4. Decide what, if anything, to change in the approach. We should always be open to new approaches.

Chapter 3 - Splitting the Monolith

We want to make our migration incremental and reversible, allowing us to learn from the process and change our minds when needed. Our monolith could be 3rd-party software - a black box we cannot change - or it could be in such bad shape that changing it is too costly. Whatever the state, there are migration patterns for both changeable and non-changeable monoliths. At the migration stage we want to be sure to copy (not cut or reimplement) the existing code from the monolith. Newman's inclination is to always try to salvage the existing codebase first, before resorting to big rewrites. Even though existing codebases are traditionally not organized around business domains, we should also consider making a modular monolith instead; it might be the right fit for our company.

Pattern: Strangler Fig Application

This pattern's name was coined by Martin Fowler: https://bit.ly/2p5xMKo. The new and old system coexist next to each other, giving the new system time to grow and eventually replace the old one.
  1. Identify functionality to move.
  2. Copy functionality to a new microservice.
  3. Redirect calls aimed at the monolith to the new microservice.
The Strangler Fig pattern does not work very well when the functionality to extract is too deep in the monolith. It can be implemented by putting a reverse proxy between clients and our monolith first, then redirecting to the new microservice based on e.g. the URL. The new service can be deployed to production without actually being used at first, and we can use feature flags to switch the redirection as we like. The proxy should ideally be some standard tool like NGINX.
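
A rough, JDK-only sketch of such a redirecting proxy (hostnames, ports and the /payments path are invented; it forwards only GET requests and drops headers to stay short - in practice we would use NGINX as noted above):

    import com.sun.net.httpserver.HttpServer;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    // Strangler-fig proxy sketch: route /payments to the extracted service,
    // everything else to the monolith. Purely illustrative.
    public class StranglerProxy {
        static final HttpClient client = HttpClient.newHttpClient();
        static volatile boolean paymentsExtracted = true;   // the "feature flag"

        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
            server.createContext("/", exchange -> {
                String path = exchange.getRequestURI().getPath();
                String backend = (paymentsExtracted && path.startsWith("/payments"))
                        ? "http://payments-service:9090"   // the new microservice
                        : "http://monolith:9091";          // everything else
                try {
                    HttpResponse<byte[]> resp = client.send(
                            HttpRequest.newBuilder(URI.create(backend + path)).build(),
                            HttpResponse.BodyHandlers.ofByteArray());
                    exchange.sendResponseHeaders(resp.statusCode(), resp.body().length);
                    try (OutputStream out = exchange.getResponseBody()) {
                        out.write(resp.body());
                    }
                } catch (InterruptedException e) {
                    exchange.sendResponseHeaders(502, -1);
                }
            });
            server.start();
        }
    }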

The proxy can also be used to change the protocol on the way, but this breaks a good rule: keep the pipes dumb and the endpoints smart. It is better to implement the new protocol as an endpoint inside the service.

There is a rather advanced pattern of service meshes, which has not settled down yet, so we should be cautious when adopting it. Each service gets a dedicated local proxy which handles communication with other services, and a control plane manages those local proxies. Among the tools, Istio seems to be the clear leader, but the technology is quite new and therefore carries risk.

It is worth noting that the protocol does not have to be HTTP; it could be FTP, a message broker, or anything else. It is also worth noting that the Strangler Fig pattern is generic and not limited to microservices - it is useful whenever you are replacing an old solution with a new one.

Pattern: UI Composition

The UI can be decomposed along pages or widgets: some pages/widgets are served from the old monolith and some from new microservices. A special case are mobile applications, which are monoliths of a kind; we can reduce deployment contention with e.g. embedded web views. A modern way to decompose the frontend is the Micro Frontends movement, which tries to make modern JS frameworks like React work with UI decomposition.

Pattern: Branch by Abstraction

To me this pattern looks a lot like a feature flag for calling a microservice from inside the monolith instead of making normal function calls. We should generally use feature flags instead of long-lived feature branches. The pattern itself (a sketch follows the list):
  1. Create an abstraction of the functionality to be replaced.
  2. Redirect all client code to the new abstraction.
  3. Create a new implementation of the abstraction which calls the new microservice. We should deploy the new service as soon as possible, to catch bugs in the deployment process.
  4. Use a feature flag to switch to the new implementation. This way we can always switch back. We can also implement a Verify switch, which calls the old implementation in case the new one fails.
  5. Clean up and remove the old implementation. Optionally also remove now-redundant abstraction.
This pattern is useful when the Strangler Fig pattern is too cumbersome to use because the code we'd like to extract is buried too deep in the monolith. We also have to be able to change the monolith.
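
A sketch of the steps above, with invented notification names:

    // Branch-by-abstraction sketch with a feature flag. All names invented.
    interface NotificationService {                       // step 1: the abstraction
        void sendReceipt(String userId, String orderId);
    }

    class MonolithNotifications implements NotificationService {   // the old code path
        public void sendReceipt(String userId, String orderId) {
            /* existing in-process implementation lives here */
        }
    }

    class RemoteNotifications implements NotificationService {     // step 3: calls the new microservice
        public void sendReceipt(String userId, String orderId) {
            /* HTTP call to the extracted notification service */
        }
    }

    class NotificationSwitch implements NotificationService {      // step 4: the flag
        private final NotificationService oldImpl = new MonolithNotifications();
        private final NotificationService newImpl = new RemoteNotifications();
        private final boolean useMicroservice;                     // the feature flag

        NotificationSwitch(boolean useMicroservice) { this.useMicroservice = useMicroservice; }

        public void sendReceipt(String userId, String orderId) {
            // we can always flip the flag back to the old implementation
            (useMicroservice ? newImpl : oldImpl).sendReceipt(userId, orderId);
        }
    }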

Pattern: Parallel Run

This weird pattern means that instead of calling either the old or the new implementation, we call both. This technique can be used to verify that both give us the same result; the old implementation is usually the one we trust. In the case of write operations, we need to replace the new implementation's writes with "mocks/spies" from the testing world.
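
A sketch of a parallel run for a read operation, with invented names and placeholder logic:

    import java.util.logging.Logger;

    // Parallel-run sketch: call both implementations, trust the old one,
    // and log any divergence for investigation.
    class PremiumQuotes {
        private static final Logger log = Logger.getLogger("parallel-run");

        static int quotePremium(int ageYears) {
            int oldResult = oldPremium(ageYears);   // the trusted implementation
            int newResult = newPremium(ageYears);   // the candidate microservice call
            if (oldResult != newResult) {
                log.warning("divergence for age " + ageYears
                        + ": old=" + oldResult + " new=" + newResult);
            }
            return oldResult;                       // the old answer still wins
        }

        static int oldPremium(int age) { return age * 10; }  // placeholder for monolith logic
        static int newPremium(int age) { return age * 10; }  // placeholder for the new service
    }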

Another useful technique is Canary Releasing, which is piloting a new feature for just a subset of our customers. Or Dark Launching, which means deploying and testing new services while they are not yet called by users.

Pattern: Decorating Collaborator

Similar to the Decorator pattern, we wrap the call to the monolith: we allow the former calls to execute as usual, but then we act based on the result. An example is a loyalty microservice which computes benefit points for users based on finished orders. We should only use this pattern if the monolith's requests and responses give us all the information we need to perform our business logic.
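
A sketch of the loyalty example, with invented interfaces:

    // Decorating-collaborator sketch: place the order via the monolith as
    // before, then award loyalty points based on the response.
    interface OrderPlacement {
        String placeOrder(String customerId, String basketId); // returns order id
    }

    class LoyaltyDecorator implements OrderPlacement {
        private final OrderPlacement monolith;   // the real call, unchanged
        private final LoyaltyClient loyalty;     // the new microservice

        LoyaltyDecorator(OrderPlacement monolith, LoyaltyClient loyalty) {
            this.monolith = monolith;
            this.loyalty = loyalty;
        }

        public String placeOrder(String customerId, String basketId) {
            String orderId = monolith.placeOrder(customerId, basketId);
            loyalty.awardPointsFor(customerId, orderId);  // act on the result
            return orderId;
        }
    }

    interface LoyaltyClient {
        void awardPointsFor(String customerId, String orderId);
    }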

Pattern: Change Data Capture

We can build a microservice which is coupled to the monolith's database and acts based on the data. We can use triggers or batch jobs to implement this, or we can parse the transaction log which every database has.

Chapter 4 - Decomposing the Database

This is the longest chapter of the book - 80 pages out of 250. Newman calls the stateful database the elephant in the room: how will we decompose it? It is probably just me, but this chapter was really hard to read and seemed chaotic and under-edited. The patterns seemed to be presented in almost random order.

Pattern: Shared Database

This pattern is only acceptable for a codetable schema or the database-as-a-service pattern, where the database can be considered an API. The problem with this pattern is that multiple services access the same data and we don't know which ones, nor who owns the data. If each service has its own database credentials, it becomes much easier to restrict access and detect the users of a schema. Splitting the database is almost always preferred.

Pattern: Database View 

This pattern can be used as a replacement for an interface from object-oriented programming languages. It is sometimes needed when our database is effectively a public contract, i.e. an API, which we have to preserve and maintain. Using views allows us to change the underlying schema. The pattern is only useful when the API is read-only, which limits its usefulness. Newman recommends using it only when we think it is impractical to decompose the current schema. Splitting the database is almost always preferred.

Pattern: Database Wrapping Service

This pattern hides the database behind a service wrapping it, replacing the database API with another protocol. The former clients of the database API have to be redirected to the new service. The idea is to stop different teams from modifying and adding new features to the already-spaghetti code of that part of the database schema. Newman illustrates this pattern with an Entitlements example from a large Australian bank. The pattern works well when the underlying schema is too hard to split apart. Ideally it is a middle step that buys time for further refactoring, now that random teams can no longer change the underlying schema.

Pattern: Database-as-a-Service

This pattern is useful e.g. for reporting clients. We create a schema dedicated to read-only access where we expose parts of the "real" internal database; this data can be understood as e.g. events. Martin Fowler calls this the Reporting Database pattern. We have to create a mapping engine which fills the external database based on the internal data. Newman recommends a dedicated change data capture system, perhaps something like Debezium, instead of batch jobs. This pattern makes sense where views cannot be used, e.g. when the internal database is on a different tech stack. It is read-only as well.

Pattern: Aggregate Exposing Monolith

If we can modify the Monolith, this is the preferred way of exposing its underlying data so that new microservices can access it: we create a new endpoint in the Monolith which exposes the data needed by the new microservice. This is always preferred over views if we can change the Monolith.

Pattern: Change Data Ownership 

This time we are in a situation where the data should be owned by the new microservice. First we point the microservice at the original database. Then we move the data, point the microservice at the new data, and redirect the Monolith to use the new microservice for access to this data. We have to consider the consequences of breaking foreign keys, transactions and more. But overall this is the way to go if we want clean microservices owning their own data. Remember we still want the decision to create the new microservice to be reversible; therefore we have to think about data synchronization.

Pattern: Synchronize Data in Application

This pattern consists of these steps:
  1. Initial import - bulk copy the data from the old database to the new one.
  2. Synchronize on write, read from the old schema. The point is that the application writes data to both databases.
  3. Once we are comfortable with the new schema, we keep synchronizing on write, but we read from the new schema. Once the new system is bedded in, we could stop synchronizing and remove the old schema.
This pattern makes sense if we want to split the schema before splitting the service out of the Monolith. We shouldn't use it if we are releasing the new microservice gradually as part of a canary release, because synchronization from two live applications gets tricky.
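
A sketch of step 2 (synchronize on write), assuming JDBC and invented table names:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    // Synchronize-on-write sketch: the application writes to both schemas
    // but still reads only from the old one.
    class PatientStore {
        private final DataSource oldDb;   // current source of truth
        private final DataSource newDb;   // future source of truth

        PatientStore(DataSource oldDb, DataSource newDb) {
            this.oldDb = oldDb;
            this.newDb = newDb;
        }

        void savePatient(String id, String name) throws SQLException {
            writeTo(oldDb, id, name);   // reads still come from here
            writeTo(newDb, id, name);   // kept in sync for the cut-over
        }

        private void writeTo(DataSource ds, String id, String name) throws SQLException {
            try (Connection c = ds.getConnection();
                 PreparedStatement ps = c.prepareStatement(
                         "INSERT INTO patients (id, name) VALUES (?, ?)")) {
                ps.setString(1, id);
                ps.setString(2, name);
                ps.executeUpdate();
            }
        }
    }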

Pattern: Tracer Write

With Tracer Write, we move the source of truth for the data in an incremental fashion: we don't move whole tables at once, but copy them partially. The biggest issue is, again, synchronization. The application can always write to one of the sources, with the data then synchronized to the other, or the application can write to both sources. We shouldn't allow client code to write directly to either of the sources, because then the synchronization starts to become tricky.

Splitting apart the Database

So far we have only looked at integrating services via databases, which is an anti-pattern; we would like to actually split the database. The question is when. If we split the schema before the application, we can spot issues with performance and transaction integrity earlier. The flip side is that there is not much real benefit in doing so unless we suspect problems in those two areas. BTW, changing databases is difficult for two reasons: lack of good tooling and the statefulness of the data (compared with stateless applications).

We can split our code to have one repository (database gateway) per bounded context, then split the database around the bounded contexts, with the repositories pointing to their respective schemas. This can be useful in a modular Monolith as well.

The most common approach is to split the application first and the database second. The concern with this approach is that some people stop there and never split their database, resulting in a suboptimal architecture. Still, if we are not especially concerned with performance or transactional integrity, this is the approach we should take.

Pattern: Monolith as data access layer

We can expose data in the Monolith for the new service. This pattern should be used if the accessed data should not belong to the new service. Otherwise we can skip it and move the data along.

Pattern: Multischema storage

In this pattern, the new service uses its own schema for new features, but still uses the old schema directly for old features.

Pattern: Split Table

If a table is part of two or more bounded contexts, we have to split it into multiple tables. We will have to decide who owns the data.

Pattern: Move Foreign Keys to Code

When formerly connected tables end up in separate database schemas, we have to make the connections in code, with service calls. We should be careful here not to break an aggregate - if two things want to be together, maybe there is a reason for it.
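
A sketch of what replacing a join might look like, with invented names:

    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;

    // Foreign-key-in-code sketch: instead of a JOIN to the albums table, the
    // order service resolves album names via the catalog service.
    class OrderHistory {
        private final CatalogClient catalog;

        OrderHistory(CatalogClient catalog) { this.catalog = catalog; }

        List<String> describeOrder(List<String> albumIds) {
            // one batched call instead of N single lookups, to limit chattiness
            Map<String, String> names = catalog.namesFor(albumIds);
            return albumIds.stream()
                    .map(id -> names.getOrDefault(id, "unknown album (" + id + ")"))
                    .collect(Collectors.toList());
        }
    }

    interface CatalogClient {
        Map<String, String> namesFor(List<String> albumIds);  // id -> album name
    }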

Pattern: Duplicate static reference data

This is the first pattern for codetable data. If we cannot keep the codetable in code (the data is probably too large), and it is not essential that all services have 100% the same data, we can copy the codetables so that each service has its own copy in its database. Sam Newman does not recommend this pattern.

Pattern: Dedicated reference data schema

If the codetable doesn't fit into code because the data is quite large, this is a valid approach: all services access this shared schema for codetables. Sam Newman does not recommend this pattern either.

Pattern: Static reference data library

We can move small codetables into code, sharing them between services as a library. The problem is updating the codetable with new data: either we have to redeploy all services, which breaks independent deployability, or we have to be OK with services using different versions of the library. Sam Newman prefers this pattern when the codetables do not have to be consistent between services.
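
For illustration, a small invented codetable shipped as code in a shared library:

    // Shared-library sketch: a codetable as an enum. Caveat from the text:
    // services redeploy independently, so two services may run different
    // versions of this enum at the same time.
    enum CountryCode {
        AU("Australia"),
        CZ("Czech Republic"),
        GB("United Kingdom"),
        US("United States");

        private final String displayName;

        CountryCode(String displayName) { this.displayName = displayName; }

        String displayName() { return displayName; }
    }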

Pattern: Static reference data service

We can of course create a dedicated service for serving the codetable(s). In some companies not used to microservices this will look like overkill; for others it is a reasonable approach. Performance issues can be solved with caching. Sam Newman prefers this pattern when codetables have to be consistent between services.

Transactions

Maintaining data integrity across services can be hard. We should avoid distributed transactions, i.e. two-phase commits, because they have to be very short-lived. There is a better way: sagas. Sagas were originally designed to solve the problem of long-lived transactions (LLTs). They do so by splitting the LLT into a number of steps, each of which can be handled by a normal short-lived transaction. This pattern works well for solving the data integrity problems of microservices.

There are two recovery modes. Backward recovery tries to programmatically revert the LLT; e.g. you cannot unsend an email, so you will probably need to send a follow-up email explaining the problem to your users. Forward recovery tries to finish the LLT in case of a non-critical error. It is useful to reorder the steps in the saga to reduce rollbacks.

Sagas can be orchestrated or choreographed. Orchestrated sagas have an orchestrator: one service which knows what to do and orchestrates calls to other services to perform the saga steps. The problem with this approach is that the services other than the orchestrator can become anemic. We should avoid BPM tools at all costs; they were never meant to be used by developers. Choreographed sagas mean that services call each other in the order of the saga. The integration should be as asynchronous as possible, which makes it hard to implement, maintain and understand for developers not used to asynchronous integration. Sam Newman is OK with orchestrated sagas if one team owns the entire saga.
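
A minimal sketch of an orchestrated saga with one compensating action, assuming invented payment and warehouse services:

    // Orchestrated-saga sketch: each step is a short local transaction; if a
    // later step fails, backward recovery compensates the earlier one.
    class OrderSagaOrchestrator {
        private final PaymentGateway payment;
        private final WarehouseClient warehouse;

        OrderSagaOrchestrator(PaymentGateway payment, WarehouseClient warehouse) {
            this.payment = payment;
            this.warehouse = warehouse;
        }

        void process(String orderId) {
            payment.take(orderId);                // step 1: short local transaction
            try {
                warehouse.reserveStock(orderId);  // step 2: another local transaction
            } catch (RuntimeException failed) {
                payment.refund(orderId);          // backward recovery: compensate step 1
                throw failed;
            }
        }
    }

    interface PaymentGateway {
        void take(String orderId);
        void refund(String orderId);
    }

    interface WarehouseClient {
        void reserveStock(String orderId);
    }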

Chapter 5 - Growing Pains

Microservices have a couple of disadvantages. With 2-10 services, problems with Breaking Changes, Reporting, Monitoring and Troubleshooting, and Resiliency and Robustness will probably occur. With 10-50 services, you will probably encounter Ownership at Scale, Developer Experience and Running Too Many Things. With 50+ services, you will have to think about Global vs Local Optimization and Orphaned Services.

Ownership at Scale

If you have multiple services, you might want to reconsider the code ownership model you use. There are 3 possible code ownership models:
  • Strong code ownership - all changes to a service have to go through review or implementation of the owners of the service.
  • Weak code ownership - same as Collective code ownership, but you should ask the owners of the service first.
  • Collective code ownership - anyone could change anything.
For Collective code ownership to work with microservices, there has to be a very clear microservices design and a shared understanding of what a good change looks like. It is much harder to fix bad changes in a distributed monolith than in normal or modular monoliths. For teams experiencing rapid growth, collective ownership will become problematic. In Newman's experience, organizations using microservices almost universally adopt the Strong code ownership model.

Breaking Changes

With services, we should make sure our changes don't break the clients of our service. If they do, we will need to deploy the services together, losing the independent deployability advantage. Having to deploy multiple services together often is a design smell - maybe they should be one service. We should eliminate accidental breaking changes, and if we absolutely have to break our contract, we should give clients some time to adapt. There are two solutions: either we run two versions of our service, or we run one version which supports both contracts; the second approach is preferred. We should catch accidental breaking changes with some kind of automated test, and organizations should solve this quite early in the process.
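
A tiny sketch of the second approach, with invented field names: one deployed version tolerates both the old and the new payload shape.

    import java.util.Map;

    // Contract-tolerance sketch: the v2 payload renames "name" to "fullName";
    // the service accepts both until all clients have migrated.
    class CustomerContract {
        static String extractName(Map<String, String> payload) {
            if (payload.containsKey("fullName")) {   // new v2 contract
                return payload.get("fullName");
            }
            return payload.get("name");              // old v1 contract, still honored
        }
    }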

Reporting

The problem is that reporting tools are SQL-heavy; they used to access the Monolith's database and select from its tables directly. We might have to create a dedicated schema for the reporting tools which looks the same to them as the Monolith's schema did.

Monitoring and Troubleshooting 

With microservices it becomes harder to answer the question "Is everything OK?". The first thing you should do when adopting microservices is to implement a log aggregation system, e.g. based on Elasticsearch. Then use something like a correlation ID to trace a request through every microservice.
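
A minimal correlation-ID sketch, assuming the common X-Correlation-Id header convention (not something the book prescribes):

    import java.net.URI;
    import java.net.http.HttpRequest;
    import java.util.UUID;

    // Reuse the ID if a caller already sent one, otherwise mint it at the
    // edge, and forward it on every downstream call so log lines can be joined.
    class Correlation {
        static final String HEADER = "X-Correlation-Id";

        static String ensure(String incomingOrNull) {
            return incomingOrNull != null ? incomingOrNull : UUID.randomUUID().toString();
        }

        static HttpRequest downstream(String url, String correlationId) {
            return HttpRequest.newBuilder(URI.create(url))
                    .header(HEADER, correlationId)   // propagate to the next service
                    .build();
        }
    }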

Local Developer Experience

Developers will start to require bigger, faster machines to develop locally. You can stub things out locally at first, but eventually you will probably need some form of remote development against realistic remote services, and therefore you might have connectivity issues (developers will need network access to those remote services).

Running Too Many Things

You will start to spend too much time deploying everything and managing deployments. Kubernetes has become the tool of choice for microservices in this space. Newman is a fan of the FaaS part of Serverless, i.e. Lambda or similar services. He recommends teams start with Serverless instead of Kubernetes and only go to Kubernetes when really needed.

End-to-End Testing 

We can cope with harder end-to-end testing by limiting the scope of the tests to the bounded contexts owned by teams. We can try consumer-driven contracts, tested before every deployment. There is also a completely different approach: use progressive delivery to limit the users impacted by the new version of the released services, and test them in production. In any case we should look at our microservices holistically and find ways to test them.

Global versus Local Optimization

This problem manifests as multiple teams solving the same problem in different ways without knowing about each other's solutions. Over time this can become very inefficient. A potential solution is to have one tech leader per team, with all of them meeting from time to time to sync.

Robustness and Resiliency

We should start asking: what should I do when this service fails? Potential solutions include asynchronous calls (e.g. message brokers), sensible timeouts, and circuit breakers. We should keep track of all production problems, find their root causes, and document solutions for fixing them.
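
A minimal circuit-breaker sketch with invented thresholds; a real project would rather reach for an existing library:

    import java.util.function.Supplier;

    // After too many consecutive failures, fail fast for a cool-down period
    // instead of hammering a struggling service.
    class CircuitBreaker {
        private int consecutiveFailures = 0;
        private long openUntilMillis = 0;
        private static final int FAILURE_THRESHOLD = 5;
        private static final long COOL_DOWN_MILLIS = 30_000;

        synchronized <T> T call(Supplier<T> remoteCall) {
            if (System.currentTimeMillis() < openUntilMillis) {
                throw new IllegalStateException("circuit open: failing fast");
            }
            try {
                T result = remoteCall.get();
                consecutiveFailures = 0;          // success closes the circuit
                return result;
            } catch (RuntimeException e) {
                if (++consecutiveFailures >= FAILURE_THRESHOLD) {
                    openUntilMillis = System.currentTimeMillis() + COOL_DOWN_MILLIS;
                }
                throw e;
            }
        }
    }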

Orphaned Services

Orphaned services are services that nobody owns or maintains anymore. A possible solution is an in-house catalogue holding metadata about each service and its dependencies.

Chapter 6 - Closing Words

Sam Newman warns against the cargo cult again and repeats the need to implement microservices incrementally.
