Docker: A Beginner's End-to-End Understanding

Prem Prakash
17 min read · Apr 28, 2020

And a bit beyond.

Figure 1: Traditional vs Docker Containerization (Source)

Disclaimer: This is written from the perspective of a complete Docker noob; nonetheless, it should give a complete end-to-end beginner understanding, demonstrated with an example. There are many resources on the web from which one can learn Docker, and from which I learned it; this is simply an accumulation of all of those, meant to provide better insight, a good foundation, and depth for the road ahead. Moreover, this write-up is also for me, as a reference I can come back to, since Docker is a really powerful tool that eases the deployment process for web-related applications. Also, as a disclaimer, a Python project, specifically a web application (QASO), is taken as the example to present its working; regardless of the example, this should not limit readers from getting a good understanding of how Docker works.

Docker, true to its literal meaning (a labourer on shipping docks), is a safe place where you can dock your dependencies and ship them anywhere through containerization, without worrying about resolving platform conflicts. The one prerequisite, of course, is that the target platform must have Docker installed to run your program smoothly.

Using containers to ship goods lets a vessel carry different items, regardless of their shape or size, which makes shipping anywhere in the world easier. Similarly, a container in Docker holds all the dependencies of a piece of software (or program) along with its source code, and can run safely without conflict, regardless of the other applications running on the system. Let us not get ahead of ourselves and become overwhelmed; we will come to understand each of these terms and how they work, step by step.

Docker was primarily proposed as OS-level virtualization to deliver software in packages called containers (wiki); beyond that, it has established itself as a widely used, prominent tool for deploying projects to production in an automatic, seamless, and hassle-free way. To be precise, deploying means putting the application on a server (local or remote; for production it is remote), where you need to install all the dependencies of the project for it to function, and these may differ from platform to platform, i.e. Linux, Windows, Mac, etc. Which begs the questions: how does Docker achieve all this? What is so special about it that so many are using it? What is the issue with the other existing approaches? To what extent does Docker solve the problem? We will come to understand all of this through a non-trivial (well, it's pretty trivial) web-application deployment, which should give first-hand experience with everything, i.e. Dockerfile, image, container, docker-compose, Docker Hub, etc.

Before we dive into the world of Docker containerization, it is pertinent to present the typical approach to deploying, locally, a Python project that may have been developed on a different platform/OS. Start by creating an environment with conda create --name <env-name> (replace <env-name> with an environment name, with the angle brackets removed; we will follow the same convention for brackets in command lines and code). Next, activate the new environment with source activate <the-new-env-name>, and then install all the dependencies listed in requirements.txt with pip install -r requirements.txt (install pip if it is not present; the conda and pip package managers are compatible with each other). On a side note, it is easy to create a requirements.txt file for a project's dependency list with pipreqs (install it with pip install pipreqs) using the command pipreqs <path-to-project>; this limits the dependencies to only those actually used in the project. Installing the dependencies and running the program should work most of the time, but sometimes an error, a fault, or some other issue creeps in during the installation process, which as a developer you would rather not be entangled in, preferring to focus on development and performance work. To avoid this there are many alternatives, such as rkt, LXC, LXD, Linux-VServer, OpenVZ, runC, etc., along with Docker of course. They are all used; nonetheless, Docker seems to be winning the race not because it is in vogue but because it does have some advantages (automatic, hassle-free creation of fast, deployable containers) over the others, which is not to say there aren't any shortcomings. A comparison of the different container technologies is out of scope for this write-up (well, actually I do not know enough to comment on it); the focus here is a beginner-level, end-to-end understanding of Docker on which one can build further.
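
For reference, that manual workflow condenses to a few shell commands; this is a sketch, where the environment name qaso-env and the project path are placeholders of my own choosing:

# Create and activate a fresh environment
conda create --name qaso-env
source activate qaso-env

# Generate a requirements.txt limited to the project's own imports
pip install pipreqs
pipreqs /path/to/project

# Install the listed dependencies
pip install -r requirements.txt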

Working with Docker

To start working with Docker, one needs to install it on the host system; download it from here. Linux and Mac are Unix-like operating systems, which makes the installation process pretty straightforward; however, Windows Home edition (not Windows Professional) may need VirtualBox installed before Docker, or maybe not if it already comes with the Docker Toolbox, so please do check.

Docker has two parts: a client and a server (the daemon engine). The Docker client is of no use without a running server. On Mac and Linux one can access the client from the terminal itself, but one needs to start the Docker app to start the server before executing any command; on Windows, starting Docker also opens a Docker terminal, which works as the client for talking to the Docker daemon. You can check the version with the docker version command; this will give you the latest build information about the client and server.

Let's start the fun part. Similar to learning any new programming language, where you typically start by printing hello world, in Docker we can run the hello-world containerized application with docker run hello-world. This asks Docker to run hello-world, which is the name of an image; the run command creates a running instance of it, called a container (more on both later). The output of the command will be the following:

Figure 2: Output of `docker run hello-world`

In the above figure, you can see that Docker first tries to find the image locally on the system; since it does not find it, the server has to pull it from the library (cloud) of images, aptly called Docker Hub, where users can push their images either for public use (free) or just for themselves (paid). All of this will be explained in detail, and more; hang on.

Image, Container, and Docker Hub

Let us declutter the terms we have come across: image, container, and Docker Hub. Additionally, anyone familiar with the git version-control system will find some similarities here in the pulling and pushing of images from Docker Hub (you can think of it as GitHub, but for Docker images); more on this as we proceed.

Image: simply an accumulation of all the dependencies necessary for running a program or project. Just as for a Python project one creates a new Python environment and installs all the dependencies into it, in Docker one builds an image with all the dependencies of a project using a Dockerfile. A Dockerfile is basically a set of sequential instructions that are executed to create an image of the project; we will learn about it in detail in the later sections. Use the command-line interface (CLI) command docker images to list the available images. This will show something like Figure 3. The figure also shows the ubuntu image; please pull it from the hub using docker pull ubuntu, to follow along with containers, which we will learn about next.

Figure 3: What the output of `docker images` should look like.

Container: to put it loosely, a running instance of an image, but the official documentation defines it very well:

A standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another. A Docker container is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries and settings.

Typically, containerization is a process of running applications on an OS with virtualization in a way that gives each application the illusion of getting its own OS and dependencies (Figure 1), independent of the others. What is different about Docker is that it achieves the same task without having to create a separate virtualized OS as well, thus making the process faster. It achieves this virtualization (with, of course, only the host OS kernel) by using the

resource isolation features of the Linux kernel (which is why for some Windows systems you need a Linux virtual box), such as cgroups and kernel namespaces, to allow containers to run within a single Linux instance, avoiding the overhead of starting and maintaining virtual machines, where each virtual machine gets its own guest OS (Figure 1), which can be slow to boot (wiki).

To run an image, for example ubuntu, use docker run ubuntu; this also pulls the image from Docker Hub if it is not already present. To check all the containers, whether running or finished, execute docker ps -a (-a for all). If you want to see only the running containers, execute docker ps. One can also run an image interactively with docker run -it ubuntu and then open a new tab in the terminal to check the running container with docker ps, which shall output something like Figure 4. To exit interactive mode use the exit command; if you want to stop a container use docker stop <container-id>, or to forcefully kill it use docker kill <container-id>.

Figure 4: What the output of `docker ps` should look like.
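
The container commands above, put together as a sketch (the <container-id> comes from the first column of the docker ps output):

# Run ubuntu interactively (-i keeps STDIN open, -t allocates a terminal)
docker run -it ubuntu

# In a second terminal tab: running containers only, then all containers
docker ps
docker ps -a

# Stop a container gracefully, or kill it outright
docker stop <container-id>
docker kill <container-id>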

Docker Hub is a hub where one can find various publicly available images to download. For private images, you would have to loosen your pockets (at least you get one for free). One can consider Docker Hub a bit like GitHub, where you can pull and push images. We will see this in use in a later section when we build our own custom images; it is useful in helping avoid building images on each new system where we want to run the application. There is also a commit command, with which one can commit a container under a tag name; to know more about this, please do visit the official documentation. You do not need to know all of this upfront, though, especially if you are just starting to dive into the realm of Docker containerization.

Dockerfile and docker-compose

In this section, we will learn to create our own custom image for a project (QASO, in our example) with a Dockerfile and/or a docker-compose YAML file.

A Dockerfile is a plain text file with a set of instructions that are executed sequentially to create an image. An example is provided below from the QASO project available on GitHub:
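
A minimal sketch of such a Dockerfile, following the walkthrough in the next paragraphs (the python:3-slim base and the /qaso working directory come from that description; the exact contents of the QASO Dockerfile may differ slightly, and the update and echo lines are assumed forms of the steps described):

# Parent base image: a slim version of Python, for reduced size
FROM python:3-slim

# A bunch of system updates on the base image (assumed form)
RUN apt-get update && apt-get upgrade -y

# Set the working directory (created if it does not already exist)
WORKDIR /qaso

# Copy the dependency list and install all the requisites of the project
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy the application's files and folders into the working directory
COPY . /qaso/

# Echoed only once, when the image is built for the first time
RUN echo "Start the app with docker-compose up and open localhost:8000"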

Let us understand this Dockerfile a bit. To create an image you always have to start from a parent base image; here it is python:3-slim (a slim version of Python, for reduced size). Nevertheless, if you want your Dockerfile to start from nothing, use FROM scratch. What follows next is a bunch of system updates on the base image (recall that Docker images do not have a separate copy of the OS; rather, they exploit the properties of Unix-like OSes, such as cgroups, to provide that virtualization).

Next, we set the working directory (created if it does not already exist) where the application's files and folders will be copied. The following line is an instruction to copy the requirements.txt file from the current directory of your local system (make sure that your current directory is where requirements.txt lives) to the working directory just set. After this, the requirements file is used to install all the requisites of the project, although, rather than using the file, one can also write a set of instructions in the Dockerfile itself to install those dependencies.

Now, the next instruction copies the app's files and folders from the current directory (.) to the working directory (/qaso/ in the Docker sandbox, not locally; think of the second location argument in scp). This makes the source code of the application available when the container starts (the container is a full-fledged, independently running application). However, this copy instruction does not necessarily have to be put in the Dockerfile, because if your application launches with docker-compose (next section) we can instruct it there to supply the source code as well, which is in fact more flexible (we will learn this next). The final instruction is a message echoed in the terminal that says how to access the application locally; this, of course, is not needed, and is displayed only once, when the image is built for the first time.

The use of a Dockerfile is to automate the build process; this is what makes Docker a powerful software-packaging tool. The above Dockerfile by itself wouldn't launch the web application, for which one might need to start multiple services: a web server, a database server, etc. Nevertheless, if an application has only one service and isn't very complex, so that one Dockerfile is enough to do the job, then by all means go ahead.

To build an image with a Dockerfile, execute docker build . -f <path-to-Dockerfile> if your Dockerfile is at a different location or named something other than the default Dockerfile; otherwise, one can typically simply build with docker build . when the command prompt is at the location of the Dockerfile (read more about building images with a Dockerfile here). Once the image is built, run it in a containerized fashion with docker run <image-name> (this creates a writable container layer over the specified image and then starts it, i.e. it is equivalent to creating a container and then starting it). One can find the newly created image with docker images. By default, the name of the image will be none (dangling), which is fine, since it can be changed later using the tag command; you can also name it explicitly while building with docker build . --tag=<username>/<image_tag>:<version> -f <path-to-Dockerfile>.
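
In shell form, a sketch of the build-tag-run cycle (the image name reuses the voyager2020/qaso_web tag that appears later; any name of the form <username>/<image_tag>:<version> works):

# Build from the Dockerfile in the current directory
docker build .

# Or build and name the image in one go
docker build . --tag=voyager2020/qaso_web:latest

# Retag a dangling image later, using its id from docker images
docker tag <image-id> voyager2020/qaso_web:latest

# Run the image as a container
docker run voyager2020/qaso_web:latest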

As mentioned, we will be presenting an example of launching a web application with Docker. To launch a web application (locally or remotely) you need to start a web server and expose a port, which is well within a Dockerfile's capacity; however, most of the time you will need to work with multiple services, in which case it is better to use docker-compose, which also gives you the dynamic update of source code (recall that). We will learn about all of this in the following section.

docker-compose uses a YAML configuration file composed of one or many services needed for the complete functioning of a project, where each service builds an image (if not already present) from a Dockerfile and then starts a container with all the dependency information. For example, a web application typically has at least one web server, and/or other services such as a database server. For the sake of demonstration, we consider a web application with a web server and a database server; the configuration is provided in the example below (the database server is trivial here and is for demonstration purposes only), again taken from the QASO application.
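
A sketch of such a docker-compose.yml, following the walkthrough below (the service names, the voyager2020/qaso_web image name, and port 8000 come from that description; the runserver command is the standard Django invocation, assumed here, and the exact QASO file may differ slightly):

version: '3'

services:
  web:
    # Reuse this image if present; otherwise build it from the Dockerfile
    image: voyager2020/qaso_web
    build: .
    # Mount the source tree into the working directory set in the Dockerfile,
    # so code changes take effect without rebuilding the image
    volumes:
      - .:/qaso
    # Start the Django web server on the exposed port
    command: python manage.py runserver 0.0.0.0:8000
    ports:
      - "8000:8000"

  database:
    # A public image from Docker Hub; trivial, for demonstration only
    image: redis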

In the above YAML file, the first line indicates which version of docker-compose should parse the file. Next come the services that should be launched to start the application; here, we have two, web and database.

The web service builds the image from the instructions provided in the Dockerfile (if the file is named something other than Dockerfile, please use that name), but it first checks for an image named voyager2020/qaso_web, and only if it is not there does it build from the given Dockerfile. The following lines are about volumes. This is important because your server will run and work only if it has all the code, which is why it must be placed in the working directory /qaso set in the Dockerfile. Strictly, it is not needed if you have already instructed the Dockerfile to copy the code, which makes it available; having said that, it is highly recommended to perform this step in the docker-compose file, since any change in the code is then reflected immediately on starting a new container for the application (you need to remove the older container first, otherwise Docker will start the old one), whereas otherwise you have to rebuild the image every time to pick up code changes. Performing this step in both places (Dockerfile and docker-compose) is simply redundant, so use only docker-compose; it is done in both here for demonstration purposes only. Finally, the next couple of lines in the web-service section start the Django web server on exposed port 8000.

The database service is trivial and is only for demonstration purposes; it uses redis, a public image from Docker Hub. Nevertheless, one can also build a DB image with a Dockerfile (with a different file name, of course) that holds all the configuration (username, password, etc.) needed to start the database server.

Trivial as the above docker-compose file is, it demonstrates what docker-compose is capable of and how it can ease the deployment process. It really enhances Docker's ability to run multiple containers simultaneously, which allows your application to be broken down into several microservices, each packaged in a separate container, thus enabling smooth scalability and maintainability, since a new service can be added painlessly. Suppose you want to add a service that performs data analytics for your application without interfering with the other services: docker-compose is the way to go. To read more on docker-compose, please visit the official site here.

See It All Work

To see it all working, please pull the QASO application code from GitHub if you have not until now. After you have downloaded the source code, change your terminal directory to the code directory and run docker-compose up (of course, you need to start Docker first). This will build the image and start two services in containers, for the web and the database. You can see that they are running with docker ps (if the current tab in the terminal is busy, run it in a new tab), which will look something like Figure 5. Also, to check whether the web app is working, please open localhost:8000 in your browser.

Figure 5: The running containers for two micro-services.

Executing docker-compose up can keep the current tab in the terminal busy; to avoid this, execute it in detached mode with docker-compose up -d. To stop the containers when running in detached mode, use docker-compose down; in the normal case, ctrl+c will suffice.
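
That is, as a quick sketch:

# Start the services without tying up the terminal
docker-compose up -d

# Stop and remove the containers started above
docker-compose down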

You can also now investigate which new images have been built (voyager2020/qaso) or pulled from the hub (redis, python) with docker images, which will look something like Figure 6. The next time docker-compose up is executed, it only starts the containers, since the images have already been built, unless you deleted them.

Figure 6: New docker images downloaded (redis, python) or built (voyager2020/qaso).

You can now push the built image (voyager2020/qaso) to Docker Hub. For this, you need to create an account on the hub. However, your pushed images will be public; if you want them to be private (one private repository is free), you will have to upgrade your account. To push an image, follow these steps in succession:

# Tag the new image. By default the version is <latest> for all tagging.
docker tag <image-id> <your-docker-hub-username>/<image_tag>:<version>
# push the tagged image
docker push <your-docker-hub-username>/<image_tag>:<version>

Pushing your image to the hub allows you to pull it on any new system where you want to run the application immediately. First, pull the image(s) by executing docker pull <your-docker-hub-username>/<image_tag>:<version>, and then start the containers with docker-compose up for a composed application; otherwise, use docker run <img-name> if the image has all the dependencies as a complete application. Of course, you also need the codebase on the system if you are mounting the source code through docker-compose (as described earlier).

Change the image name in the docker-compose file from voyager2020/qaso_web to <your-user-name>/<img-name> in case you want to play around with pushing to the hub and learn how it works. Docker has git-like capabilities that aren't limited to merely push or pull; in fact, it has pretty much all that is needed to track a project. Read more about it here.

For the QASO application, I have already pushed the image to the hub. To use the application, all you need is to install Docker, clone the codebase from GitHub, pull the image from the hub with docker pull voyager2020/qaso_web:latest, and then finally launch the app with docker-compose up. This will launch the app; to access it locally on the system, open localhost:8000 in a browser. One thing to keep in mind is that if you do not pull the image and directly execute docker-compose up, it will build the image from scratch and start the services, which also works just fine.
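
On a fresh machine, the whole quick start is a handful of commands (a sketch; the repository URL is a placeholder, so substitute the actual QASO GitHub location):

# Clone the codebase and enter it
git clone https://github.com/<user>/qaso.git
cd qaso

# Optional: pull the prebuilt image; skipping this triggers a local build
docker pull voyager2020/qaso_web:latest

# Launch, then open localhost:8000 in a browser
docker-compose up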

Additional CLI

To delete an image, use docker rmi <image-id>; sometimes you may need to force it with docker rmi <image-id> --force. In a similar manner, to remove a container use docker rm <container-id>, and to forcefully remove it use docker rm <container-id> --force. To clear everything you can use docker system prune, but beware that it comes with a warning (see the help info with docker system prune --help): with docker system prune -a you might accidentally remove not only the stopped containers but also all the images, even the non-dangling ones (untagged images are the dangling ones). Therefore, it is recommended to stay in the practice of using rmi and rm for removing images and containers respectively.
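
The cleanup commands side by side, as a sketch (the ids come from docker images and docker ps -a):

# Remove an image; add --force if a container still references it
docker rmi <image-id>
docker rmi <image-id> --force

# Remove a container, gracefully or forcefully
docker rm <container-id>
docker rm <container-id> --force

# Read the warning before pruning; -a removes all unused images too
docker system prune --help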

A bit beyond

By now the reader should have a fairly clear understanding of what Docker is capable of and how it does what it does. And this is only part of the story of deploying a project, because normally you want your application to be safe and secure, to have very minimal or no downtime, and, with time, to be able to scale in performance. For all this, there is Kubernetes: a standalone application to manage containerized apps, covering scaling, managing, updating, and removing containers. One often hears about Kubernetes in relation to Docker because they have a symbiotic relationship, i.e. Docker is one of the best tools in the market for containerizing an application, and Kubernetes helps in managing containerized applications. Moreover, both are open source, and their widespread adoption by industry has led them to embrace each other, well, not literally, but in terms of supporting each other's use; hence the symbiotic relationship. Having said that, of course, they both can be, and are, used as standalone applications in isolation from each other. I am still learning about Kubernetes' uses and do not have enough understanding to provide any more lucid insight; nonetheless, there is a good view of them at this link, which should further clarify the Docker vs Kubernetes question.

Parting Thoughts

Hopefully, this has given you some understanding of Docker, its workings, and how it relates to Kubernetes. Docker and Kubernetes can provide the speed (both in development and in production) and the scale that may be needed for your project. In this post, we learned the terms and terminology associated with using Docker, namely image, container, Dockerfile, docker-compose, Docker Hub, etc., with the working example of QASO. To sum up, one last thing I would strongly recommend from my limited experience: use docker-compose even when the number of services is only one, because this gives you the flexibility of changing the source code and seeing its effect immediately, without having to build the image all over again. The container has all the dependencies, that is, the libraries, the entire codebase, anything and everything needed to run the application; and since the source code can be provided either through the Dockerfile or through docker-compose, why not use the latter for more flexibility and control? I hope this article has given you a good starting point in the world of containerization, a smooth and less cumbersome way to deploy your projects.


References

[1]. Automation Step by Step — Raghav Pal, “What is DOCKER (step by step) | Docker Introduction | Docker basics”, URL.

[2]. CodeWithHarry, “Docker Tutorial in Hindi”, URL.

[3]. Docker Official, “FAQ”, URL.

[4]. Docker Official, “What is a Container?”, URL.
