Published: Mar 20, 2023

Docker Basics

Introduction

Docker is a popular set of tools for containerizing applications. A container includes all binaries and dependencies needed to run an application. Compared to running the application directly on the host system, this provides much better isolation, which improves stability and security, since one service can no longer affect other services or the host system as easily. Another benefit is easier scalability, for example by placing a load balancer in front of multiple containers or by using a full orchestration system such as Kubernetes (K8s), which can launch new containers (pods) on demand.

Unlike a virtual machine, a container does not include its own operating system kernel but shares the host's kernel with other containers. This reduces overhead and improves performance compared to a VM, but it may make containers more susceptible to escape vulnerabilities.

While on the topic of security, it should be mentioned that Docker is often run as root, which is not ideal. A rootless mode exists, but it comes with some limitations.

Images

A Docker image is a read-only, immutable template that defines how a container will be created. Images are built using docker build and the instructions defined in a Dockerfile. Usually an image has a parent image, specified by FROM ... in the Dockerfile; base images use FROM scratch.
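
As a minimal sketch, a Dockerfile deriving from the alpine parent image might look like this (the installed package and the command are placeholders):

# start from the alpine parent image
FROM alpine
# bake a package into the image
RUN apk add --no-cache curl
# default command executed when the container starts
CMD ["echo", "hello from the container"]

Running docker build -t my_image . in the directory containing the Dockerfile builds it into an image tagged "my_image".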

To list currently available images run:

docker image ls

Images are hosted on registry servers and can be pulled from them. By default Docker pulls from Docker Hub, but self-hosting a registry is also possible.

To pull an image, for example “alpine”, use:

docker pull alpine
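
A tag can be appended to pull a specific version; without one the latest tag is implied. A self-hosted registry is addressed by prefixing its host name (registry.example.com is a placeholder):

docker pull alpine:3.17
docker pull registry.example.com/myteam/myimage:1.0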

Containers

An image can be run, which results in a new container being created; a container is thus a runtime instance of an image. The docker run command is likely one of the most important commands for developers and has a lot of options.

A Short Example

Try running the alpine image:

docker run alpine

The command just exits without any output, which makes sense since the “alpine” image contains a tiny Linux distribution (Alpine) that is intended to be used as a base image and doesn’t really do anything on its own.
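
To get some output, a command to execute inside the container can be appended:

docker run alpine echo "hello from alpine"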

List all containers (without -a only running containers are listed):

docker ps -a

The alpine container created earlier should be visible.

A more sensible way to run the alpine image:

docker run -it --name alpine_test alpine /bin/sh
  • -it combines the options “interactive/keep STDIN open” and “allocate pseudo-TTY”
  • --name alpine_test assigns the name “alpine_test” to the container
  • alpine specifies the image and /bin/sh the command to execute

Now the alpine container can be explored using a shell to, for example, install packages using apk. Note that container names cannot be reused, so docker rm alpine_test is needed before recreating the container; otherwise the following error will occur:

The container name "/alpine_test" is already in use by container
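
So to recreate the container, remove the old one first:

docker rm alpine_test
docker run -it --name alpine_test alpine /bin/sh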

Persistence

At first one might think that application data can be stored inside the application container, but this is usually a bad idea.

Consider a database like MongoDB. First of all, this would require running MongoDB inside the same container as the application. This is possible, but it somewhat defeats the point of using Docker to isolate services. Even when using something like SQLite, storing the database file inside the container is a bad idea, since updating a container usually means recreating it, which wipes all data. It is also bad for scalability, since files cannot easily be shared across containers.

Two solutions for persistence are volumes and bind mounts.

Bind Mounts

A bind mount basically shares a folder from the host system with a container. This allows the container to read/write files on the host system and thus persist data.

To create a bind mount in the present working directory run:

docker run -it --mount type=bind,src="$(pwd)/binddir",target=/root/binddir alpine /bin/sh
  • type can be: bind, volume or tmpfs
  • unlike -v, --mount does not create missing host paths, so ./binddir has to exist before the command is run (mkdir binddir)

Now the host folder ./binddir is mounted at /root/binddir in the container and is accessible from both systems. Overall, bind mounts are easy to set up and perform well, but they provide limited functionality compared to volumes.
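
A quick way to verify the mount is to create a file on one side and read it on the other (the file name is arbitrary):

echo "hello from the container" > /root/binddir/hello.txt    # inside the container
cat ./binddir/hello.txt                                      # on the host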

Volumes

Volumes are comparable to bind mounts but are completely managed by docker and come with several advantages over bind mounts:

  • Work on Windows and Linux containers
  • Safer to share across multiple containers
  • Volume drivers allow volumes to be encrypted, stored on remote hosts, or extended with other functionality
  • Much faster than bind mounts if mounting from Windows/Mac hosts

Volumes can exist without a container. To create one run:

docker volume create test_volume

To list volumes use:

docker volume ls

To view properties of the volume inspect it:

docker volume inspect test_volume
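
For a volume on the default local driver the output will look roughly like this (timestamps and paths vary):

[
    {
        "CreatedAt": "2023-03-20T12:00:00Z",
        "Driver": "local",
        "Labels": {},
        "Mountpoint": "/var/lib/docker/volumes/test_volume/_data",
        "Name": "test_volume",
        "Options": {},
        "Scope": "local"
    }
]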

Start a container with “test_volume” mounted at “/root/test”:

docker run -it -v test_volume:/root/test alpine /bin/sh
  • -v is a shorter, less verbose alternative to --mount for mounting volumes

Try to create a file inside “/root/test” and then create a new container with the volume attached. Observe that the file persists on the volume.
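
For example (hello.txt is an arbitrary file name):

docker run --rm -v test_volume:/root/test alpine /bin/sh -c "echo hello > /root/test/hello.txt"
docker run --rm -v test_volume:/root/test alpine cat /root/test/hello.txt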

To delete the volume run:

docker volume rm test_volume

If the error “volume is in use” shows up, list all containers using the volume:

docker ps -a --filter volume=test_volume

Delete those containers, and then the volume can safely be deleted as well.
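
Using command substitution, the lookup and deletion can be combined (add -f to also force-remove running containers):

docker rm $(docker ps -aq --filter volume=test_volume)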

Backup/Restore Volumes

This can be quite tricky depending on the configuration of the volume driver, so the following might not work for more complex use cases.

Assume the volume “test_volume” should be backed up or transferred to another machine. Sometimes volumes are not easy to access directly from the host, so using a container is preferred.

Dump the volume into a tarball using:

docker run --rm -v test_volume:/root/test_volume -v "$(pwd)":/backup alpine tar cvf /backup/backup.tar /root/test_volume
  • --rm automatically removes the container on exit
  • first -v mounts the volume at /root/test_volume
  • second -v mounts the present working directory of the host at /backup
  • finally the tar command dumps all files at /root/test_volume to /backup/backup.tar

backup.tar should now exist on the host file system in the present working directory ready to be compressed or moved to another location.
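
The contents of the archive can be checked without extracting it:

tar tvf backup.tar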

To restore “test_volume” from “backup.tar”:

docker run --rm -v test_volume:/restore -v "$(pwd)":/backup alpine /bin/sh -c "cd /restore && tar xvf /backup/backup.tar --strip-components=2"
  • first -v creates (if necessary) and mounts “test_volume” at “/restore”
  • second -v mounts the directory containing “backup.tar” at “/backup”
  • finally /bin/sh -c ... is used to switch to “/restore” and extract “backup.tar”; --strip-components=2 drops the leading root/test_volume/ prefix from the archived paths so the files land directly in “/restore”

Once the command completes a container with “test_volume” attached can be created and the data should have been successfully restored.
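
For example, listing the restored files:

docker run --rm -v test_volume:/root/test alpine ls -l /root/test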

Networking

Assume a container created by:

docker run --rm -it alpine /bin/sh

By default a Docker container does not expose any ports to the outside world and is not reachable from the host system either. Available networks can be listed with docker network ls. By default the container is connected to the bridge network, which allows it to access the internet via the host system and to interact with all other containers on the same network.
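
On a fresh installation the output will look roughly like this (the IDs are placeholders):

NETWORK ID     NAME      DRIVER    SCOPE
0123456789ab   bridge    bridge    local
123456789abc   host      host      local
23456789abcd   none      null      local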

To create a new network use:

docker network create testnet

By default this creates a network using the “bridge” driver. If a container is now started using:

docker run --rm -it --network testnet alpine /bin/sh

it is no longer possible to reach containers on other networks unless the NET_ADMIN capability is granted. Attempting to configure additional network interfaces or routes from inside the container fails with the following error:

RTNETLINK answers: Operation not permitted
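
A nice property of user-defined networks is built-in DNS resolution between containers by container name, which the default bridge network does not offer. A minimal check (the name “pingtarget” is arbitrary):

docker run --rm -d --name pingtarget --network testnet alpine sleep 300
docker run --rm -it --network testnet alpine ping -c 1 pingtarget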

To expose a port to the outside world or to other networks, use --publish or the short form -p:

docker run --rm -it -p 8081:8080 alpine /bin/sh
  • -p maps the host port 8081 to the container port 8080

One way to test this is to run a simple HTTP server using Python (on Alpine the interpreter is installed as python3):

apk add python3
python3 -m http.server 8080

The server is now reachable from the host at http://localhost:8081.
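
For example, from another terminal on the host:

curl http://localhost:8081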

Other useful commands

To get help on any docker command:

docker help [command]

To check how much space docker is using:

docker system df

To cleanup space:

docker system prune
docker volume prune
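
docker system prune only removes stopped containers, unused networks, dangling images and build cache; to also remove all images not used by a container, add the -a flag:

docker system prune -a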

Conclusion

Now you know what images and containers are, how to run them, how to persist data (including backup and restore), and how to configure some basic networking. Next, I would recommend learning about Dockerfiles before moving on to Docker Compose. After that, you are ready to deploy your application to the cloud or to your own server. If you need scalability, you might want to start looking into Kubernetes and other orchestration tools; this knowledge will also come in handy when using a cloud provider such as AWS, Azure or GCP.