Docker is a tool that allows developers, sysadmins, and others to easily deploy their applications in a sandbox (called a container) that runs on the host operating system, typically Linux. The key benefit of Docker is that it lets users package an application with all of its dependencies into a standardized unit of software. Unlike virtual machines, containers carry little overhead and therefore make more efficient use of the underlying system and its resources.
Damn it. I don't understand a thing about Docker. What is it, anyway? Wait, did you just explain what it is?
Docker is a platform. You can create a specific configuration package and run it in your own isolated environment, which we call a container. In a way, a container is really similar to a virtual machine. You can easily create a lightweight container with only the relevant libraries and applications and run your code on top. If you decide to share your code with someone, the only thing you'd need to share is the Docker image. Amazing, innit?
It is! You mentioned a Docker image. What's that?
Yes, that. A Docker image is, roughly, a packaged-up set of instructions for setting up a container, with all the configuration and the relevant files from your working directory baked in. Since Docker is a platform, we have a client-server relationship like so:
- Client - most likely the command-line interface through which you interact with your Docker engine
- Server - the daemon that arranges and maps everything inside your Docker engine
- Registry - where your Docker images are stored and can later be pulled by other users (yup, just like GitHub)
Enough talk about what Docker is; let's get our hands dirty, innit! Let's start by checking which version of Docker we have on our machine. Oh, before that: make sure you have Docker installed. Go through the steps in the how-to article first.
Once you have installed Docker on your machine, type the command "docker --version" in your terminal and you'll get the version of Docker that was installed. If you know at least some terminal basics, you're probably familiar with the "ps" command.
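As a quick sketch, the check looks like this (the version string shown here is just an example; yours will differ):

```shell
$ docker --version
Docker version 20.10.7, build f0df350
```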
This command lists all running processes and their PIDs. Luckily for us, we have a similar thing for docker.
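The Docker equivalent is docker ps, which lists running containers (the column headers below are what the real command prints; on a fresh install the table is empty):

```shell
$ docker ps
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
```

An empty table like this means no containers are currently running.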
We can see that I have 0 docker processes running.
How can we make a docker process run?
Run, you say! That's the answer.
Let me break it down. We're invoking docker, creating (or pulling, if it isn't available locally) a container with the run keyword, and hello-world is the image the container is loaded from.
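Putting it together, the classic first command is (hello-world is Docker's official test image):

```shell
docker run hello-world
```

If the image isn't available locally, Docker pulls it from Docker Hub first, then starts a container that prints a greeting and exits.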
So let’s say we need to run a Postgres DB on our machine to do some POC. To do that, we need to:
- Find the docker image on Docker Hub (or in a private repo, if you have one there)
- Decide on the version we’re going to use
- docker run it
If we google "docker Postgres", the Docker Hub link to the Postgres image will be at the top. There we can see tons of information about the image, its usage, etc.
Ok, so here we see 13.2, 13, latest, 13-alpine. What do they mean?
A number like 13.2 pins a particular Postgres version, while 13 tracks the newest 13.x release. Latest is the new kid on the block: it will always move to the newest version when one comes out. Last but not least, alpine. Alpine is the most bare-bones base an application can run on. The alpine variant is as lightweight as it gets; because it ships only the necessities, it’s small compared to the other versions. The benefit of using alpine: you get the bare minimum and add only the relevant things as you go. Your docker container will be as small as it can be, and easy to move around and deploy.
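To make the tags concrete, here's how pulling each variant looks (tag names taken from the Docker Hub page mentioned above):

```shell
docker pull postgres:13.2       # exact version, reproducible
docker pull postgres:13         # newest 13.x release
docker pull postgres:latest     # whatever is newest overall
docker pull postgres:13-alpine  # 13.x on the minimal Alpine base
```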
Ok, so let’s choose 13.2.
Let’s do docker run postgres:13.2
Ok, something’s not running, but we can see the issue: we need to pass it a password with -e (-e sets environment variables).
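A minimal sketch of the fixed command (the password abc is just a placeholder; pick your own):

```shell
docker run -e POSTGRES_PASSWORD=abc postgres:13.2
```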
We see that the terminal is stuck in the running process, and to do anything else in the same session, we’d need to kill our Postgres process. There is a way to keep the Postgres container running and still use the same terminal session: we can pass -d to run it as a detached process.
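With the detach flag, the command becomes:

```shell
docker run -e POSTGRES_PASSWORD=abc -d postgres:13.2
```

The container keeps running in the background, and the terminal returns immediately, printing only the container ID.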
Now we’ve got a long string back. This string is our docker container ID. If we run docker ps, we’ll see that the container is up and running:
We can see that the container ID from docker ps is not the same as the long one we got from docker run. If we look carefully, though, the docker ps one is a prefix of the longer one. The cool thing about Docker is that it matches on the start of your container ID, so you don’t need to paste the full version of it!
But wait, didn’t I mention that Docker is an isolated environment? Yes. We can’t do anything with the database yet, because we didn’t expose any port to interact with it. To do that, we need to use -p, like this:
docker run -e POSTGRES_PASSWORD=abc -p HOST_PORT:CONTAINER_PORT -d postgres:13.2
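For Postgres, whose default port is 5432, a concrete invocation could look like this (mapping host port 5432 is an assumption; any free port on your machine works):

```shell
docker run -e POSTGRES_PASSWORD=abc -p 5432:5432 -d postgres:13.2
# Then, with psql installed on the host, connect through the mapped port:
psql -h localhost -p 5432 -U postgres
```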
Ok, now you’re probably asking yourself: why is this guy explaining how to type long commands to run something isolated? It’s repetitive and boring. How do I share it? Send the command to the next person? It’s not efficient…
Yes, it makes sense for playing around, but for creating an environment, it doesn’t. That’s why we have the Dockerfile.
A Dockerfile is basically all these instructions mapped out, environment variables included. So for our Postgres, we can create one like this:
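A minimal sketch of such a Dockerfile (the credential values here are placeholders):

```dockerfile
# Start from the official Postgres 13.2 image
FROM postgres:13.2

# Environment variables the Postgres image reads at startup
ENV POSTGRES_USER=user
ENV POSTGRES_PASSWORD=abc
```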
And to build it, run this in the folder containing the Dockerfile:
docker build .
We should see something like this:
bd9416c1457a is the ID of the newly built docker image. We can run our container now like this:
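Using the image ID from the build output, that would be:

```shell
docker run -p 5432:5432 -d bd9416c1457a
```

Tip: passing -t to docker build (e.g., docker build -t my-postgres .) gives the image a human-readable name, so you can run docker run my-postgres instead of pasting the hash.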
One thing to mention: when Docker builds your Dockerfile, each step creates a new intermediate image and passes it down to the next one. All of them sit in the cache and will be re-used wherever possible, e.g., when adding a new environment variable at the end:
The benefit is that we can build images faster by adding changes at the bottom: only the newly added parts need to be built from scratch! Though if we swapped the password and username lines, we’d have to rebuild those layers again:
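To illustrate with a sketch Dockerfile like the one above: appending a line at the bottom reuses every cached layer before it, while reordering earlier lines invalidates the cache from the first changed line onward.

```dockerfile
FROM postgres:13.2
ENV POSTGRES_USER=user
ENV POSTGRES_PASSWORD=abc
# Appended last: the three layers above come straight from the cache
ENV POSTGRES_DB=mydb
```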
Interacting with containers
If you decide to shut down your docker process gracefully, you can use
docker stop CONTAINER_ID
or if you want to kill it
docker kill CONTAINER_ID
Later on, you can check the stopped processes with
docker ps -a
and resume your container with
docker start CONTAINER_ID
Running a couple of docker processes
Ok, so everything is quite easy with one docker process: we create a Dockerfile and run it. If I need an additional one, I can create and run it too. But at some point it will get out of hand pretty fast, because of the need to set ports and other information for each docker run.
There is a solution for it: docker-compose.
It’s a YAML file with more information on what to build and how different Dockerfiles play together.
Let’s cover some things:
- version: the version of the compose file format. Check the Docker documentation to make sure your docker engine is compatible with it
- app: an application/service that uses some pre-defined docker image (i.e., airflow) with port 8080 mapped to port 8080 on our local machine
- db: a database application, which we’re going to containerize using the Dockerfile that sits in the database directory
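A sketch of a docker-compose.yml matching that description (the airflow image name, ports, and directory path are illustrative assumptions):

```yaml
version: "3.8"
services:
  app:
    image: apache/airflow   # pre-defined image pulled from a registry
    ports:
      - "8080:8080"         # host:container
  db:
    build: ./database       # directory containing our Dockerfile
    environment:
      POSTGRES_PASSWORD: abc
    ports:
      - "5432:5432"
```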
Ok, when we have the file ready, how can we build it?
If you’re in the directory that contains the file:
docker-compose up -d --build
Also, you can specify the full path to the file:
docker-compose -f "PATH/YOU/HAVE/PUT/DOCKERCOMPOSE/FILE/docker-compose.yml" up -d --build
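And when you’re done, the counterpart command tears everything down:

```shell
docker-compose down
```

This stops and removes the containers (add -v to also remove the named volumes declared in the file).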
Containerization has recently gotten the attention it deserves, although it has been around for a long time. Some of the top tech companies like Google, Amazon Web Services (AWS), Intel, Tesla, and Juniper Networks have their own custom version of container engines. They heavily rely on them to build, run, manage, and distribute their applications.
Docker is an extremely powerful containerization engine, and it has a lot to offer when it comes to building, running, managing and distributing your applications efficiently.
There is a lot more to learn about Docker, such as:
- Docker commands (More powerful commands)
- Docker Images (Build your own custom images)
- Docker Networking (Setup and configure networking)
- Docker Services (Grouping containers that use the same image)
- Docker Stack (Grouping services required by an application)
- Docker Compose (Tool for managing and running multiple containers)
- Docker Swarm (Grouping and managing one or more machines on which docker is running)
- And much more…
If you appreciate what we do here on DigitalStade, consider sending us an appreciation token.