Installation
1
curl -fsSL https://get.docker.com | sh
Virtual Machines vs. Containers: A Brief Overview
Virtual Machines (VMs) and containers are both virtualization technologies used for creating isolated environments for running applications, but they differ significantly in their approach and resource efficiency.
Virtual Machines (VMs): VMs operate by emulating entire hardware systems, allowing you to run multiple operating systems on a single physical machine. Each VM includes a full copy of an operating system, the application, necessary binaries, and libraries - taking up tens of GBs. VMs are managed by a hypervisor and are great for full isolation and running multiple different operating systems on the same hardware.
Containers: Containers, on the other hand, share the same operating system kernel and isolate the application processes from the rest of the system. They’re incredibly lightweight (only megabytes in size), start almost instantly, and use a fraction of the memory compared to booting an entire OS.
Introduction to Dockerfiles
With a clear understanding of how containers differ from traditional VMs, let’s delve into Dockerfiles. A Dockerfile is a text document containing all the commands a user could call on the command line to assemble an image. Using docker build, users can create an automated build that executes several command-line instructions in succession. This guide will explore the basic structure and key elements of a Dockerfile, illustrating how to efficiently containerize your applications.
Basic Structure of a Dockerfile
The Dockerfile is the starting point for creating a Docker image. It’s a simple text file with instructions on how to build your Docker image. Here’s a basic example:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
FROM node:current-alpine3.16
WORKDIR /app
COPY . .
RUN npm install
ENV PORT=3000
EXPOSE 3000
VOLUME /app/public
CMD ["npm", "start"]
FROM
defines the image, your image will be based on. In this case it’s the Alpine version of the node Image, this means it’s Alpine Linux version 3.16 with nodeJS preinstalled.
WORKDIR
is basically the same as the cd command in Linux. All subsequent commands will be run from this directory.
COPY
copies files in your local directory into the container. In this case everything from the local directory gets copied into the working directory /app
. If you want to exclude specific files like the Dockerfile you can add a file .dockerignore
where you list all the files that are in your directory but are to be ignored in the build process.
RUN
can be used to run any bash command needed to setup everything for the application to run inside the container. Usually the commands are chained together with && \
so every command can be written in a separate line. While it is possible to use RUN multiple times keep in mind that every RUN uses a separate shell session meaning things like cd not get saved across sessions so it’s not recommended unless it’s necessary for some reason.
EXPOSE
exposes ports that will need to be accessible from outside the container later. For example in this case the port 3000 inside the container can then be mapped to port 3000 or any other port outside the container.
VOLUME
is used to map a folder from inside the container to a folder on the machine where the container is running. This can be used for files you may need to change later like in this case the files that will be displayed by this nodeJS app, or for files that need to be consistent because every file inside a container gets reset to it’s original state every time the container restarts.
CMD
is the final command that starts the application inside the container without opening a extra shell session.
Environment Variables
Now that we are done with the basics let’s move on to environment variables. First off what exactly are environment variables? Environment variables are key-value pairs that can affect the behaviour of running processes on a computer. They’re particularly useful in development and operations workflows for configuring and parameterising applications.
A common use case for environment variables is parsing on values to an application that you want to be able to change easily or values that you may want to keep secret (e.g. API keys).
You can set a Variable inside the Dockerfile if it’s something you don’t need to change anymore like the port our nodeJS application is supposed to use.
ENV PORT=3000
defines the variable PORT and initialises it with 3000 The application can then access this variable like this
1
const port = process.env.PORT;
Another way to set Environment variables, if you don’t want the values to be permanently written somewhere, is in the docker run command, but more on that later
Choosing a Base Image
- Size Matters: For a lightweight container, minimalistic base images like
alpine
orbusybox
are preferred. They are stripped down to the essentials, providing a smaller attack surface and faster deployment times.
1
FROM alpine:latest
- Compatibility and Libraries: If your application requires specific libraries or compatibility with certain software, consider using a more complete base image such as
ubuntu
,debian
, orfedora
. These distributions offer a wider array of pre-built packages and are generally easier to work with for complex applications.
1
FROM ubuntu:20.04
- Long-Term Support: Choose distributions that offer long-term support (LTS) for production environments. LTS distributions like
ubuntu:20.04 LTS
ordebian:buster-slim
provide ongoing updates and security patches, ensuring stability and reducing maintenance efforts.
1
FROM ubuntu:20.04
- Security: Opt for base images with a strong reputation for security and stability, such as
debian
orcentos
. These distributions are known for regularly scanning their images for vulnerabilities and applying security patches.
1
FROM debian:buster-slim
- Architecture Support: Consider the CPU architecture for which you are building your image. Some base images like
arm32v7/debian
oramd64/alpine
are optimised for specific architectures, ensuring better performance and compatibility on the target hardware and if you want your image to run on multiple architectures the base image needs to be compatible as well.
Building a Container
After creating a application and a Dockerfile simply run docker build -t your-image-name:tag .
. You can see all your local images by running docker images
. (Often a version identifier or latest
is used as tag)
Pushing Builds to a Registry
If you haven’t already, log in to your Docker registry (e.g. Docker Hub) and create a new repository for your image. Before you can push an image, it needs to be tagged with the registry name. Use the docker tag
command like this:
1
docker tag your-image-name:tag registry-username/repository-name:tag
If you’re not logged in to the Docker registry from your command line, you’ll need to do so. Just type docker login
in your terminal (For a private registry use docker login [registry-url]
). Now to push your image by running:
1
docker push registry-username/repository-name:tag
Multi Architecture Builds
First off we need to create a new builder that supports non native architectures.
1
2
docker buildx create --name mybuilder --use
docker run -it --rm --privileged tonistiigi/binfmt --install all
Now we can build and push an image that supports multiple architectures in one go. (If you don’t run this from the directory where your Dockerfile is you need to replace the dot with the correct path). If you want to use different Dockerfiles for different architectures you can specify which file is used for what architecture by adding a extension like .arm64
1
docker buildx build --platform linux/amd64,linux/arm64 -t registry-username/repository-name:tag . --push
Running a Container
If the image has been built successfully the container can be run by using:
1
`docker run -d your-image-name:tag`
If you want to pull a container and run it use:
1
docker run -d registry-username/repository-name:tag
(-d
runs the container in detached mode, essentially running it in the background so you can keep using your terminal session). To see which containers are running, use docker ps
.
You can start and stop containers using docker start name_or_id
or docker stop name_or_id
.
run options
--name=custom_name
Assign a custom name to your container--rm
Automatically remove the container when it exits-p host_port:container_port
Map a port from inside the container to a port on your host machine-v /host/path:/container/path
Mount a volume to save persistent files-e "ENV_VAR_NAME=value"
Setting a environment variable-it
Run container in interactive mode with a TTY (simulated terminal)
You can set different restart policies, too.
--restart=no
(Default): Do not automatically restart the container--restart=always
Always restart the container if it stops--restart=unless-stopped
Restart the container unless it has been stopped manually--restart=on-failure:6
Restart the container if it exits due to an error (non-zero exit status). You can also specify a maximum number of restarts.
Nvidia GPU Passthrough
If you need to use a gpu inside a container e.g. for video transcoding or machine learning then your container needs to have the necessary drivers installed. The easiest way to achieve this is to just use a prebuilt cuda base image.
1
FROM nvidia/cuda:12.1.0-base-ubuntu22.04
If you decide to use a ubuntu base image you may have to use this after the FROM
command:
1
2
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt-get install -y tzdata
Otherwise the container expects user input during the build that you obviously cant provide.
If you want to make your image as small as possible you can use rm -rf /var/lib/apt/lists/*
at the end of your RUN
instruction to remove the no longer needed apt files.
To make GPUs accessible inside the container use --gpu all
in the docker run command. If you want to use one or more specific GPUs add --gpus all -e NVIDIA_VISIBLE_DEVICES=GPU-UUID-HERE
. To list all available GPUs and the UUIDs run nvidia-smi -L
. The nvidia-smi
command can also be used to verify what GPUs are visible inside the container like this:
1
docker exec -it container_name_or_id nvidia-smi
Command Overview
Start a container:
1
docker start helloworld
Stop a container:
1
docker stop helloworld
Show all running containers:
1
docker ps
Show all containers:
1
docker ps -a
Show all images:
1
docker images
Show all volumes:
1
docker volume ls
Show all networks:
1
docker network ls
ssh into a container
1
docker exec -it helloworld /bin/bash #helloworld is the name of the container
or
1
docker exec --tty helloworld /bin/bash #helloworld is the name of the container
send a command to a container
1
docker exec helloworld ls