In this post, we will start with a quick introduction to Docker and then go through my curated list of guidelines to help you build Docker images in a faster and more structured way.
What is Docker? What is a container? Why Docker? Is Docker the same as a VM?
In terms of efficiency and performance, a new era started when virtualisation came into IT. Many traditionally built server-side systems reserved all of a machine's resources for a job or application that required only 1-2% of the hardware. These were replaced by virtual servers and hypervisors, which doubled, tripled or even quadrupled resource utilisation.
But these improved virtualisation solutions came at a cost. Every virtual machine requires its own operating system (Linux, Windows, ...), so each VM carries its own set of libraries, many of which are common. Typical examples are the libraries for network management, memory management, CPU job management, and user and security management.
It was in this evolutionary process that containers came to light. Containers are like lightweight VMs that share a limited set of resources (kernel, CPU, memory) with the host operating system. Each container acts as a virtual OS for the application deployed in it.
To put it simply, you can think of containerisation as a multi-tenancy implementation at the OS level.
A few benefits of containerisation:
- Improved performance on the same hardware
- Smaller than a VM, with most of a VM's benefits
- Portability of microservice applications
- Increasing availability on cloud services (IaaS, PaaS)
- Simpler and faster development, deployment and rollback
- Easy scaling of containerised microservice applications
A few key terms:
- Containers: A packaged application along with its dependencies.
- Docker: A container platform from Docker Inc., which specialises in container services (open source and paid). The name is often used to refer to containers in general.
- Docker Hub: A public repository for sharing container images.
To learn more about Docker, you can visit the following:
- Quick start with Docker: https://www.guru99.com/docker-tutorial.html
- Deep dive into Docker: https://www.youtube.com/watch?v=lcQfQRDAMpQ
- Documented learning resource: https://www.tutorialspoint.com/docker/index.htm
Docker Development Guidelines
Where to start when building a Docker image? What to look for when using a Docker Hub image? How to write or structure your Dockerfile? How to run a container in the foreground? How to check container logs? How to ignore files when building an image? How to check container statistics?
1. Choose the right base image:
Docker Hub is like your GitHub community. In most development scenarios, you can find a suitable base container image there. When a ready-made solution is available, why build from scratch?
For example, say you are building a custom Python or Scala based application. Instead of building the whole image from scratch, you just choose from the list of well-tested and readily available solutions.
Always search for pre-existing solutions (Docker images) first. If you can't find one, search for an image that you can build on top of, to save time and effort.
A few key things to look for when selecting a Docker image as your base image:
- A good readme [MUST]
- Base operating system details (ubuntu, alpine, centos, ...) [MUST]
- Availability of the Dockerfile source code [MUST]
- Image provider details (prefer official or verified sources)
- Community demand (pull count)
2. Postpone running your docker as a daemon process
The standard practice is to run a container in daemon (detached) mode by passing the “-d” param to the “docker run” command:
docker run -d -p 5000:5000 --name flask-server flask-tutorial
This works well in production and staging, but in development I often see people follow a detached run with these steps:
- Checking logs
docker logs flask-server
- Stopping the container
docker rm -f flask-server
So during development, use “--rm” instead of “-d” to view the logs interactively. When running in the foreground, press “ctrl+c”, or run “docker stop here_your_docker_name” from another terminal, to stop the container.
docker run --rm -p 5000:5000 --name flask-server flask-tutorial
While “-d” runs the container in the background, “--rm” removes the container once it is stopped.
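The choice between the two flags can be wrapped in a small shell helper. This is only a sketch: the function name `run_flags` and the mode names `dev`, `stage` and `prod` are made up for illustration, and the `docker run` line is left commented out so the snippet can be tried on a machine without Docker.

```shell
#!/bin/sh
# Hypothetical helper: choose "docker run" flags for a given mode.
# The mode names (dev, stage, prod) are assumptions, not Docker concepts.
run_flags() {
  case "${1:-dev}" in
    prod|stage) printf '%s\n' "-d" ;;    # run detached, in the background
    *)          printf '%s\n' "--rm" ;;  # stay in the foreground, remove on exit
  esac
}

# Example usage (uncomment where Docker is installed):
# docker run $(run_flags dev) -p 5000:5000 --name flask-server flask-tutorial
```

Calling `run_flags` without an argument falls back to the development behaviour, which matches the advice above to postpone detached mode.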
3. Write a Makefile
In a Unix environment, a developer can hardly find a friendlier tool than the Make utility. Make not only documents all the commands a user can run, it also makes running those jobs simpler and more readable.
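# sample Makefile (target names are assumed from the "make clean rebuild" usage)
.PHONY: build run clean rebuild
build:
	docker build -t flask-tutorial .
run:
	docker run --rm -p 5000:5000 --name flask-server flask-tutorial
clean:
	-docker rm -f flask-server
	-docker rmi flask-tutorial
rebuild: build run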
To run the above 4 commands at once:
bash-$ make clean rebuild
It might look simple here, but while building an image you may have to add 2-3 ports, 1-2 volume mounts, a container name, an image name and so on, which eventually makes the command too big.
Prepare a Makefile as soon as you notice the list of commands for building an image growing.
4. Entrypoint Script
A Dockerfile is supposed to build only the technical stack of the container's operating system, not run the services. For example, an Airflow setup often requires developers to run an SSH service for remote task execution.
A well-written Airflow image can serve as webserver, scheduler, worker and Celery Flower. It can also expose Airflow's debug features or run Airflow in different modes (Celery or non-Celery). All of this functionality can simply be set in the entrypoint.
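As a rough illustration of this idea, an entrypoint script can map a role name to the service it should start. The sketch below is hypothetical: the file name `entrypoint.sh`, the function `command_for_role` and the two role names are assumptions, and `airflow webserver` and `airflow scheduler` are the subcommands of the Airflow 1.x/2.x CLI, so check them against the version you use.

```shell
#!/bin/sh
# entrypoint.sh: hypothetical sketch; the role names are assumptions.

# Map a role name to the command that starts it.
# Returns non-zero for an unknown role.
command_for_role() {
  case "$1" in
    webserver) printf '%s\n' "airflow webserver" ;;
    scheduler) printf '%s\n' "airflow scheduler" ;;
    *)         return 1 ;;
  esac
}

# With arguments: start the matching service, or run the arguments as a
# plain command (for example "bash" while debugging).
# Without arguments: do nothing.
if [ "$#" -gt 0 ]; then
  if cmd=$(command_for_role "$1"); then
    shift
    exec $cmd "$@"   # word splitting of $cmd is intentional
  else
    exec "$@"
  fi
fi
```

In a Dockerfile, such a script would typically be wired up with an ENTRYPOINT pointing at it and a CMD holding the default role, so the same image can be started as a different service just by changing the argument.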
5. Container Root User
Although ready-made solutions exist, a developer often has to install tools in a running container or inspect it to investigate how it works. To log in to a running container as the root user, use the following:
docker exec -it -u root flask-server bash
6. AWS Secret Keys
Passing AWS secret keys, or any other cloud or authentication keys, into a container lets it access AWS services without setting up host instance roles. This speeds up development, yet it is an unsafe practice.
AWS and other cloud credentials are critical information that one must protect at all costs. While cloud technologies make development, deployment and running production jobs easy, a lack of security knowledge can damage a company's reputation and expose it to vulnerabilities.
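If keys have to be used during development, one less risky pattern is to forward them from the host environment at run time instead of writing them into the Dockerfile or the image. The sketch below relies on `docker run -e NAME` (without a value) passing the host's value of `NAME` into the container. The helper name `aws_env_flags` is made up for illustration, and the `docker run` line is commented out so the snippet can be tried without Docker.

```shell
#!/bin/sh
# Hypothetical helper: build "-e NAME" flags so that docker run forwards
# the variables from the host environment. No secret value is written here.
aws_env_flags() {
  flags=""
  for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN; do
    flags="$flags -e $name"
  done
  printf '%s\n' "${flags# }"   # drop the leading space
}

# Example usage (uncomment where Docker is installed):
# docker run --rm $(aws_env_flags) --name flask-server flask-tutorial
```

Keep in mind that environment variables remain visible to anyone who can inspect the running container, so this is a development convenience and not a replacement for instance roles.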
7. Dockerfile Structure
- Standard Config: Include the base image details here, like the following
FROM python:3.6-slim
LABEL maintainer="PRAMATI"
# Never prompts the user for choices on installation/configuration of packages
ENV DEBIAN_FRONTEND noninteractive
ENV TERM linux
# Define en_US.
ENV LANGUAGE en_US.UTF-8
ENV LANG en_US.UTF-8
ENV LC_ALL en_US.UTF-8
ENV LC_CTYPE en_US.UTF-8
ENV LC_MESSAGES en_US.UTF-8
Here you can also define your own ENV variables, such as an application's version or a package name for build tools, and define the ecosystem required around it.
- Required Installations: The next section is for common installations that are required on top of your base image
RUN apt-get update && apt-get install -y openssh-server
- Dev Installations: A temporary section required only in the development stage
# Development tools
RUN apt-get update && apt-get install -y vim \
&& apt-get install -y net-tools \
&& apt-get install -y iputils-ping
Above is a list of installations you may require while developing Docker images.
- Additional Installation: A temporary section for appending installations that turn out to be required during the development stage. Toward the end of development, these installations are expected to be merged into the Required Installations part or removed.
# Additional tools
RUN apt-get update && apt-get install -y logstash
- Copy application files: For static code, you may copy the files directly into the image. Otherwise, when you need to change files or keep configuration files separate from the container, you can use Docker volumes.
COPY . /app
- Sleep 999999: At the start of development, an entrypoint script is usually not ready to run. So my preferred option is to use the sleep command, which keeps the container alive.
CMD ["sleep", "999999"]
- Setup Dockerignore: Just as Git provides a .gitignore file, Docker provides a .dockerignore file. Make sure not to pass .git or any local app cache files into the Docker image, as this increases the image size and these files are often not required in production.
- Stats: Run docker stats to see your containers' CPU, memory and network usage.
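To make the .dockerignore point concrete, here is a minimal sketch that writes a starter file next to the Dockerfile. The entries are assumptions for a Python project (a `.venv` folder and bytecode caches) and should be adjusted to your application.

```shell
#!/bin/sh
# Write a starter .dockerignore next to the Dockerfile.
# The entries are examples for a Python project; adjust them to your app.
cat > .dockerignore <<'EOF'
.git
.venv
**/__pycache__
**/*.pyc
EOF

# Show what the build context will skip.
cat .dockerignore
```

Files matching these patterns are left out of the build context, so a `COPY . /app` step no longer drags the Git history or local caches into the image.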
Docker images are meant to be short, sweet and smooth (small in size, fast, and above all easy to use).
Although containers are wonderful, they are not always the best solution for every use case. In some cases, a VM or a traditionally built system could give better results, for example for a database server, a messaging broker, or a security and health monitoring system.