Most of us know what deployment of a service or a web application means. In a nutshell, it is the process of delivering software from a developer's system to a system that is accessible to the end users.
In traditional deployments, the application release manager notifies all users of the system about the downtime through various channels. A typical example is the SMS/email notifications banks send when they upgrade their applications.
Here are the points to note about this style of deployment:
- Releases happen after hours or in the early-morning hours.
- The team relies on the “release manager” – one person who actually knows how to get the code deployed.
- Users are often negatively affected by new releases and hence releases are infrequent, occurring only a few times a year.
After all, a “scheduled downtime” message is NOT graceful.
Releasing new features no longer needs to be stressful, thanks to tools like Docker, configuration management tools like Chef and Puppet, and continuous integration tools like Jenkins/Hudson and CircleCI. So, what is this ‘zero downtime deployment’? Is it an alternative to the traditional deployment we have discussed so far? Does it mean we do not require any downtime for a software upgrade?
Yes, of course! In this style of deployment, there is no downtime for the software upgrade activity. The goal of the process is: ‘Deploying an update should neither interrupt the end users nor the business, even if something goes wrong.’ Since typical applications today are HTTP services, this means that no requests are lost at any point in time.
After all, why should the end users care that the application is being deployed or upgraded?
Here are the key advantages of zero downtime deployment:
- More reliable releases.
- More repeatable release process.
- No deployment at odd hours.
- End users are unaware that a software upgrade is happening.
Containerisation in Zero-Downtime Deployment
For high availability, an application runs as multiple instances. Running software as many instances means that we also need to perform the deployment/upgrade on all of those instances.
Then this question pops up: how do we update the application on all the instances with zero downtime?
We know that installing an application manually is an error-prone process, as it involves verifying prerequisites, application configuration sets, data sets and so on. Scripting improves the installation process by ensuring that no steps are missed and by reducing the element of human error. But even a scripted solution is slow compared to shipping a container. This is where container technology like Docker comes to the rescue.
As we may all know, a container packages everything an application needs to run – all of the supporting files, libraries and binaries – into a container image. This image can be moved, shipped and deployed to one or more other computers, where it runs as a container. And Docker has become the de facto standard for containers.
One of the reasons containers make this much easier is that we can replace a container running the software with a new container running a newer version of the software with just a few commands. So zero downtime deployments are far easier to achieve with containers, although they are entirely possible without them too, as many organisations demonstrated when deploying applications in the pre-container era.
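As a concrete illustration, the single-host swap could look roughly like this. This is a sketch, not the project's exact commands; the image name (myorg/myapp), container name (app) and port are hypothetical:

```shell
#!/usr/bin/env sh
# Sketch: replace a running container with one built from a newer image.
swap_container() {
  image="$1"                # e.g. myorg/myapp:2.0 (hypothetical image name)
  docker pull "$image"      # fetch the newer image first, to keep the gap short
  docker stop app || true   # stop the old container if it is running
  docker rm app || true     # remove it so the container name can be reused
  docker run -d --name app -p 8080:8080 "$image"
}
# usage: swap_container myorg/myapp:2.0
```

Pulling the image before stopping the old container keeps the window during which the node is down as small as possible.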
In one of our customer engagements, the customer had been running a monolithic product on the same, unchanged runtime environment with RPM-based deployment for the past five years; no container-based deployment was in use. The customer came to Imaginea mainly for two reasons: to make the software available 24/7 and to make it scalable.
With these goals in mind, we started decomposing the application into manageable microservices. With respect to infrastructure, we decided on a couple of things.
- Run each and every microservice inside its own Docker container.
- Practice Infrastructure as Code (IaC), as the era of manually crafting servers is over. We chose Chef as the configuration management tool. It lets us declare, in code, what our servers should look like, and then automatically applies the changes to them.
To achieve 24/7 availability, we set out to provide zero-downtime deployment with our chosen mix of software ingredients: Jenkins, Docker, Chef and a load balancer.
When there is a code change in GitHub, Jenkins is configured to pull those changes, build a Docker image and push the image to the Docker repository. The Chef clients installed on the EC2 instances watch for Docker image changes. When a new Docker image is pushed, the Chef clients are configured to stop the running containers on the machines, pull the new version of the image and relaunch the containers with the newer image, as detailed in one of my previous posts.
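The CI side of that flow can be sketched as a small build step. The repository name and tagging scheme here are assumptions; a real Jenkins job would wire equivalent commands into its build configuration:

```shell
#!/usr/bin/env sh
# Sketch of the Jenkins build step: build an image and push it to the registry.
build_and_push() {
  repo="myorg/myapp"                    # hypothetical Docker Hub repository
  tag="$(git rev-parse --short HEAD)"   # tag each image with the commit that produced it
  docker build -t "$repo:$tag" -t "$repo:latest" .
  docker push "$repo:$tag"
  docker push "$repo:latest"            # the Chef clients watch this tag for changes
}
```

Tagging with the commit hash keeps every build traceable, while the moving `latest` tag is what the nodes poll for.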
All right, if all the Chef clients watch for changes, won't all the containers be restarted at once? How, then, can we call this zero downtime? Here is the secret.
The Chef clients on the instances do not check for Docker image changes at the same time; if they did, all the instances would go down together, and that would not be a zero-downtime deployment.
Let me detail the steps involved in achieving this. Three instances of the application run in production as Docker containers, as shown in the figure.
The Chef client installed on Instance-1 is configured to check the Docker repository (in this case, Docker Hub) for a newer image every 5 minutes, the Chef client on Instance-2 checks every 8 minutes, and the last one every 11 minutes. Remember, the load balancer fronting these instances plays a crucial role: it identifies the healthy nodes and sends requests only to them.
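With Chef, staggering the checks amounts to running chef-client in daemon mode with a different polling interval on each instance. Something along these lines (intervals in seconds, matching the 5/8/11-minute schedule above):

```shell
# Run chef-client as a daemon with a per-instance polling interval.
# Instance-1: converge every 5 minutes
chef-client --daemonize --interval 300
# Instance-2: converge every 8 minutes
chef-client --daemonize --interval 480
# Instance-3: converge every 11 minutes
chef-client --daemonize --interval 660
```

In practice, each instance runs only its own line; chef-client also offers `--splay` to add random jitter, which serves a similar de-synchronising purpose.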
- The first node (Instance-1) detects the newer version of the Docker image.
- It stops the running containerised application on that node, taking it down while the other two instances continue serving.
- The load balancer detects that Instance-1 is down and routes all requests to the other two instances, leaving out Instance-1.
- Instance-1 pulls the latest Docker image from the repository and launches a new container, restarting the application with the upgrade.
- The load balancer finds Instance-1 healthy again and resumes sending requests to all three instances.
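One detail worth making explicit: the node should rejoin the pool only once the new container actually answers health checks, or the load balancer could briefly route traffic to a backend that is still starting. A small retry helper captures the idea; the `/health` endpoint and port are assumptions, not part of the original setup:

```shell
#!/usr/bin/env sh
# Retry a command until it succeeds or MAX_TRIES attempts are exhausted.
MAX_TRIES=${MAX_TRIES:-30}
DELAY=${DELAY:-2}
wait_for() {
  n=0
  until "$@"; do
    n=$((n + 1))
    [ "$n" -ge "$MAX_TRIES" ] && return 1
    sleep "$DELAY"
  done
}
# usage: block until the freshly launched container answers its (hypothetical) health endpoint
# wait_for curl -fsS http://localhost:8080/health
```

Most load balancers perform exactly this kind of periodic probe themselves; the helper is useful when the deployment script needs to gate the next step on the node being healthy.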
The interim process is depicted in the figure for clarity.
The same process repeats for the other two instances over the next few minutes, and at the end all three instances are running newer Docker images with the latest code. This way of deploying ensures zero downtime for the entire process: mission accomplished.
Zero downtime deployment is a process worth pursuing. It enables faster deployments that support agile development without sacrificing the end-user experience, and container management platforms make it easy to achieve.
Note: I originally published this article on LinkedIn.