Micro Services Architecture – A Practical Look

Most of us know and are used to WCF and Web API services, and we know how to model applications to best utilize these frameworks. Some have also heard about micro services and perhaps even read an article or two about this architectural approach. I will try to describe my practical view of what they are, when to use them, what problems they can solve and what issues may arise while developing a system based on micro services. So let’s dig in.

What I often hear about this architecture is that it’s just like the regular service approach but… there are more services and they tend to have very few responsibilities. This isn’t exactly true. Actually, it’s just plain wrong. When contemplating architecture patterns you should always ask yourself the following question: what problems did developers have to deal with before this approach was introduced to the world of software?

I’ll describe some core problems in the next paragraph. For now, let’s assume that it doesn’t matter how many things a single service does as long as they are all related to a single business module/feature/capability. One example would be Netflix, where providing a video stream could be a single feature that handles compression, delivery of the stream and selection of the appropriate quality. For an insurance provider, one service could be responsible for managing policy cancellation based on various business rules, but the same service shouldn’t create a new policy since this isn’t its area of concern.

Below are some of the things you get almost out of the box from Micro Services architecture:

Scaling based on service load or business capability requirements

When you build an enterprise system, it is hard to predict how you will grow. Imagine you architect an IT support system. Initially, you assume there is a bunch of people on the phone answering voice over IP (VoIP) calls and remoting in to customers’ machines, trying to help them out with their PC issues. At this point you may have designed a system composed of a few core services:

CallCentreService with some VoIP and routing capabilities.
AccessHubService utilized via a web portal which allows you to register and download remote access software.
ResilienceService which is utilized by call centre employees to back up customer data remotely and restore things on demand.
DisasterRecoveryService which is triggered on demand by customers who installed your software; once a particular pattern or malware is detected we react in a certain way.

Some of these services will need to be set up on machines with 3 times the amount of RAM other services have. Some may need a powerful CPU but little RAM to operate. Perhaps some will require terabytes of fast SSD storage or an array of CUDA-enabled GPUs to do some complex parallel computations. Some can be set up on many cheap servers that may fail a lot as long as 70% of them are always online, but others may require three-nines (99.9%) SLA-grade availability.
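To make the idea of per-service requirements a bit more tangible, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the ServiceProfile type, its field names and the hardware numbers are hypothetical and not taken from any real deployment. Only the 70% figure comes from the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceProfile:
    """Hypothetical description of what one service needs from its hosts."""
    name: str
    ram_gb: int
    cpu_cores: int
    ssd_tb: float = 0.0
    gpus: int = 0
    # Share of instances that must be online for the pool to count as healthy.
    min_online_fraction: float = 1.0


def pool_is_healthy(profile: ServiceProfile, online: int, total: int) -> bool:
    """Return True when enough instances of the service are online."""
    if total <= 0:
        return False
    return online / total >= profile.min_online_fraction


# Illustrative numbers only: a pool of cheap, failure-prone servers next to
# a GPU-heavy service with very different needs.
access_hub = ServiceProfile("AccessHubService", ram_gb=4, cpu_cores=2,
                            min_online_fraction=0.70)
disaster_recovery = ServiceProfile("DisasterRecoveryService", ram_gb=64,
                                   cpu_cores=16, ssd_tb=4.0, gpus=2)
```

The point of keeping such a profile per service is that each one can change on its own, without touching the others.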

Some modules may start small and not need any particular configuration until they evolve/grow in unique features and you learn that they are now utilized much more than other services. With this architecture you don’t need to know how you will grow and evolve in the next 3 years. As long as you separate boundaries and follow some guidelines that shape Micro Service architecture you can expect it will be easy to adapt to any scaling scenario. At some point you may even decide to move some services to a private cloud and again there should be little or no work required to do so.

There are more advantages to using this architecture, but the one mentioned above is unique to Micro Services. Nevertheless, let’s quickly list some other advantages:

Resilience

Since communication between services is often based on events, you can use any kind of ESB approach, RabbitMQ or even MSMQ. All of these give you resilience out of the box. If an endpoint is offline, your message will be replayed later if you so desire (obviously your client application has to handle each comms failure gracefully).
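The replay behaviour can be sketched in a few lines of Python. This is a toy in-memory stand-in and not the API of RabbitMQ, MSMQ or any ESB; the ReplayQueue and EndpointOffline names, the max_attempts limit and the dead-letter list are all assumptions made for the example. A real broker would also persist messages to disk.

```python
from collections import deque
from typing import Callable


class EndpointOffline(Exception):
    """Raised by a handler when the target service cannot be reached."""


class ReplayQueue:
    """Toy stand-in for a message broker (hypothetical, in-memory only)."""

    def __init__(self, max_attempts: int = 3) -> None:
        self.pending = deque()   # holds (message, failed_attempts) pairs
        self.dead_letters = []   # messages that ran out of attempts
        self.max_attempts = max_attempts

    def publish(self, message: dict) -> None:
        self.pending.append((message, 0))

    def deliver_pending(self, handler: Callable[[dict], None]) -> int:
        """Try each pending message once and return how many were delivered."""
        delivered = 0
        for _ in range(len(self.pending)):
            message, attempts = self.pending.popleft()
            try:
                handler(message)
                delivered += 1
            except EndpointOffline:
                attempts += 1
                if attempts >= self.max_attempts:
                    # Give up and park the message for manual inspection.
                    self.dead_letters.append(message)
                else:
                    # Keep the message so a later pass can replay it.
                    self.pending.append((message, attempts))
        return delivered
```

Calling deliver_pending again once the endpoint is back online replays whatever failed earlier, which is the "replayed later" behaviour in its simplest form.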

Ease of testing and support for multiple versions of each service

A minimalistic approach and common sense suggest that you have a single routing service delegating requests to particular services (if you grow big I recommend the load balancer approach, even as a custom-built extension of this initial routing service). While this may be a single point of failure depending on the solution, it also gives you the capability of routing to a particular version of a service based on the client type (the required version could be stored in an HTTP header). You could also test new services by configuring your routing service to direct 1% of the load to the new service, or simply direct any traffic with a particular diagnostic header to your new version. Having many machines hosting the same service allows you to update your system without any downtime: you just use your routing service config to prevent routing to the instance currently being updated.
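Here is a rough Python sketch of the routing decisions described above. It is one possible shape under my own assumptions, not a reference implementation: the Router and Instance types, the X-Service-Version and X-Diagnostic header names and the canary_share setting are all hypothetical.

```python
import random
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Instance:
    """One deployed copy of a service (hypothetical model)."""
    address: str
    version: str
    draining: bool = False  # True while this instance is being updated


@dataclass
class Router:
    instances: list = field(default_factory=list)
    stable_version: str = "1"
    canary_version: Optional[str] = None
    canary_share: float = 0.0  # 0.01 sends roughly 1% of traffic to the canary
    rng: random.Random = field(default_factory=random.Random)

    def pick_version(self, headers: dict) -> str:
        # 1. A client may pin the version it was built against.
        if "X-Service-Version" in headers:
            return headers["X-Service-Version"]
        # 2. Diagnostic traffic always goes to the version under test.
        if self.canary_version and headers.get("X-Diagnostic") == "1":
            return self.canary_version
        # 3. Otherwise a small random share of requests tries the new version.
        if self.canary_version and self.rng.random() < self.canary_share:
            return self.canary_version
        return self.stable_version

    def route(self, headers: dict) -> Instance:
        version = self.pick_version(headers)
        # Instances marked as draining are skipped, which is what allows an
        # update without downtime as long as another instance is available.
        candidates = [i for i in self.instances
                      if i.version == version and not i.draining]
        if not candidates:
            raise LookupError(f"no available instance for version {version}")
        return self.rng.choice(candidates)
```

To update a machine you would set draining on its instance, wait for in-flight requests to finish, deploy, and then clear the flag again.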

Obviously, the two advantages mentioned above are shared by many other architecture patterns. But since one of the suggested approaches combines event-based communication with a routing service, it is pretty easy to code some diagnostics, load monitoring and clever resilience, which is always good, especially in scenarios where you don’t know how your business and services will grow and what future server configuration each may need. With clever diagnostics you may quickly find out that your IT support company’s user base favors your DisasterRecoveryService features. In this feature set lives your neural-network-based “predictive analysis module”, which now analyses much more data than initially anticipated. Perhaps you will decide to split it into 2 services: one with processing capabilities and the other with a lot of fast SSD drives to analyze data in some NoSQL graph database. Who knows what will happen? But you will know that your architecture will adapt to any scaling decision you may need to make, and your diagnostics will show you when such a decision is required.
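As a small illustration of the kind of diagnostics I mean, the sketch below counts requests per service and reports which services take the largest share of the load. The LoadMonitor class, its method names and the 50% threshold are assumptions for the example; in practice the record call would sit inside the routing service or an event subscriber.

```python
from collections import Counter, defaultdict


class LoadMonitor:
    """Hypothetical per-service load tracker fed by the routing service."""

    def __init__(self) -> None:
        self.requests = Counter()
        self.total_seconds = defaultdict(float)

    def record(self, service: str, seconds: float) -> None:
        """Note one handled request and how long it took."""
        self.requests[service] += 1
        self.total_seconds[service] += seconds

    def load_share(self) -> dict:
        """Return each service's fraction of all recorded requests."""
        total = sum(self.requests.values())
        if total == 0:
            return {}
        return {name: count / total for name, count in self.requests.items()}

    def hot_services(self, threshold: float = 0.5) -> list:
        """List services that receive at least `threshold` of the traffic."""
        return sorted(name for name, share in self.load_share().items()
                      if share >= threshold)
```

A service that keeps showing up in hot_services is a candidate for its own server configuration, or for being split in two.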

One thing to notice: there may be a lot of services and a lot of clever routing and versioning, so you really want to involve Continuous Delivery in the process. There are too many points of failure if you do things manually. While CI and CD are sometimes optional to a point in other architectures, with this one they really aren’t. There are simply too many little things for a human to remember and orchestrate (a machine will never get tired and miss a step).

For more of the “raw data” kind of information, you can read how Martin Fowler describes the particulars of this pattern. Bear in mind, though, that your use scenario is the most important thing and, while you may disagree, I think the first advantage/scenario listed above is the only one that should always ring a bell in your head and point you to this pattern. Other advantages can be achieved by many other architectural patterns, and you may not always want to deal with some small disadvantages of this architecture choice. Also, as an architect, don’t be afraid to mix even large architectural patterns. As long as you know what you need to achieve and understand the reasoning behind each pattern’s existence, it is ok to mix and match.

Happy design phase to you all, and don’t hesitate to leave your thoughts and comments below.