Docker

Posted Sep 18, 2022 Updated Oct 28, 2024

By Saracen Rhue

9 min read

Docker

Installation

curl -fsSL https://get.docker.com | sh

Virtual Machines vs. Containers: A Brief Overview

Virtual Machines (VMs) and containers are both virtualization technologies used for creating isolated environments for running applications, but they differ significantly in their approach and resource efficiency.

Virtual Machines (VMs): VMs operate by emulating entire hardware systems, allowing you to run multiple operating systems on a single physical machine. Each VM includes a full copy of an operating system, the application, necessary binaries, and libraries - taking up tens of GBs. VMs are managed by a hypervisor and are great for full isolation and running multiple different operating systems on the same hardware.
Containers: Containers, on the other hand, share the same operating system kernel and isolate the application processes from the rest of the system. They’re incredibly lightweight (only megabytes in size), start almost instantly, and use a fraction of the memory compared to booting an entire OS.

Introduction to Dockerfiles

With a clear understanding of how containers differ from traditional VMs, let’s delve into Dockerfiles. A Dockerfile is a text document containing all the commands a user could call on the command line to assemble an image. Using docker build, users can create an automated build that executes several command-line instructions in succession. This guide will explore the basic structure and key elements of a Dockerfile, illustrating how to efficiently containerize your applications.

Basic Structure of a Dockerfile

The Dockerfile is the starting point for creating a Docker image. It’s a simple text file with instructions on how to build your Docker image. Here’s a basic example:

  
FROM node:current-alpine3.16

WORKDIR /app

COPY . .

RUN npm install

ENV PORT=3000
EXPOSE 3000

VOLUME /app/public

CMD ["npm", "start"]

FROM defines the image, your image will be based on. In this case it’s the Alpine version of the node Image, this means it’s Alpine Linux version 3.16 with nodeJS preinstalled.

WORKDIR is basically the same as the cd command in Linux. All subsequent commands will be run from this directory.

COPY copies files in your local directory into the container. In this case everything from the local directory gets copied into the working directory /app. If you want to exclude specific files like the Dockerfile you can add a file .dockerignore where you list all the files that are in your directory but are to be ignored in the build process.

RUN can be used to run any bash command needed to setup everything for the application to run inside the container. Usually the commands are chained together with && \ so every command can be written in a separate line. While it is possible to use RUN multiple times keep in mind that every RUN uses a separate shell session meaning things like cd not get saved across sessions so it’s not recommended unless it’s necessary for some reason.

EXPOSE exposes ports that will need to be accessible from outside the container later. For example in this case the port 3000 inside the container can then be mapped to port 3000 or any other port outside the container.

VOLUME is used to map a folder from inside the container to a folder on the machine where the container is running. This can be used for files you may need to change later like in this case the files that will be displayed by this nodeJS app, or for files that need to be consistent because every file inside a container gets reset to it’s original state every time the container restarts.

CMD is the final command that starts the application inside the container without opening a extra shell session.

Environment Variables

Now that we are done with the basics let’s move on to environment variables. First off what exactly are environment variables? Environment variables are key-value pairs that can affect the behaviour of running processes on a computer. They’re particularly useful in development and operations workflows for configuring and parameterising applications.

A common use case for environment variables is parsing on values to an application that you want to be able to change easily or values that you may want to keep secret (e.g. API keys).

You can set a Variable inside the Dockerfile if it’s something you don’t need to change anymore like the port our nodeJS application is supposed to use.

ENV PORT=3000 defines the variable PORT and initialises it with 3000 The application can then access this variable like this

  
const port = process.env.PORT;

Another way to set Environment variables, if you don’t want the values to be permanently written somewhere, is in the docker run command, but more on that later

Choosing a Base Image

Size Matters: For a lightweight container, minimalistic base images like alpine or busybox are preferred. They are stripped down to the essentials, providing a smaller attack surface and faster deployment times.

FROM alpine:latest

Compatibility and Libraries: If your application requires specific libraries or compatibility with certain software, consider using a more complete base image such as ubuntu, debian, or fedora. These distributions offer a wider array of pre-built packages and are generally easier to work with for complex applications.

FROM ubuntu:20.04

Long-Term Support: Choose distributions that offer long-term support (LTS) for production environments. LTS distributions like ubuntu:20.04 LTS or debian:buster-slim provide ongoing updates and security patches, ensuring stability and reducing maintenance efforts.

FROM ubuntu:20.04

Security: Opt for base images with a strong reputation for security and stability, such as debian or centos. These distributions are known for regularly scanning their images for vulnerabilities and applying security patches.

FROM debian:buster-slim

Architecture Support: Consider the CPU architecture for which you are building your image. Some base images like arm32v7/debian or amd64/alpine are optimised for specific architectures, ensuring better performance and compatibility on the target hardware and if you want your image to run on multiple architectures the base image needs to be compatible as well.

Building a Container

After creating a application and a Dockerfile simply run docker build -t your-image-name:tag .. You can see all your local images by running docker images. (Often a version identifier or latest is used as tag)

Pushing Builds to a Registry

If you haven’t already, log in to your Docker registry (e.g. Docker Hub) and create a new repository for your image. Before you can push an image, it needs to be tagged with the registry name. Use the docker tag command like this:

docker tag your-image-name:tag registry-username/repository-name:tag

If you’re not logged in to the Docker registry from your command line, you’ll need to do so. Just type docker login in your terminal (For a private registry use docker login [registry-url]). Now to push your image by running:

docker push registry-username/repository-name:tag

Multi Architecture Builds

First off we need to create a new builder that supports non native architectures.

  
docker buildx create --name mybuilder --use
docker run -it --rm --privileged tonistiigi/binfmt --install all

Now we can build and push an image that supports multiple architectures in one go. (If you don’t run this from the directory where your Dockerfile is you need to replace the dot with the correct path). If you want to use different Dockerfiles for different architectures you can specify which file is used for what architecture by adding a extension like .arm64

  
docker buildx build --platform linux/amd64,linux/arm64 -t registry-username/repository-name:tag . --push

Running a Container

If the image has been built successfully the container can be run by using:

  
`docker run -d your-image-name:tag`

If you want to pull a container and run it use:

docker run -d registry-username/repository-name:tag

(-d runs the container in detached mode, essentially running it in the background so you can keep using your terminal session). To see which containers are running, use docker ps.

You can start and stop containers using docker start name_or_id or docker stop name_or_id.

run options

--name=custom_name Assign a custom name to your container
--rm Automatically remove the container when it exits
-p host_port:container_port Map a port from inside the container to a port on your host machine
-v /host/path:/container/path Mount a volume to save persistent files
-e "ENV_VAR_NAME=value" Setting a environment variable
-it Run container in interactive mode with a TTY (simulated terminal)

You can set different restart policies, too.

--restart=no (Default): Do not automatically restart the container
--restart=always Always restart the container if it stops
--restart=unless-stopped Restart the container unless it has been stopped manually
--restart=on-failure:6 Restart the container if it exits due to an error (non-zero exit status). You can also specify a maximum number of restarts.

Nvidia GPU Passthrough

If you need to use a gpu inside a container e.g. for video transcoding or machine learning then your container needs to have the necessary drivers installed. The easiest way to achieve this is to just use a prebuilt cuda base image.

FROM nvidia/cuda:12.1.0-base-ubuntu22.04

If you decide to use a ubuntu base image you may have to use this after the FROM command:

  
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt-get install -y tzdata

Otherwise the container expects user input during the build that you obviously cant provide.

If you want to make your image as small as possible you can use rm -rf /var/lib/apt/lists/* at the end of your RUN instruction to remove the no longer needed apt files.

To make GPUs accessible inside the container use --gpu all in the docker run command. If you want to use one or more specific GPUs add --gpus all -e NVIDIA_VISIBLE_DEVICES=GPU-UUID-HERE. To list all available GPUs and the UUIDs run nvidia-smi -L. The nvidia-smi command can also be used to verify what GPUs are visible inside the container like this:

docker exec -it container_name_or_id nvidia-smi

Command Overview

Start a container:

docker start helloworld

Stop a container:

docker stop helloworld

Show all running containers:

docker ps

Show all containers:

docker ps -a

Show all images:

docker images

Show all volumes:

docker volume ls

Show all networks:

docker network ls

ssh into a container

  
docker exec -it helloworld /bin/bash #helloworld is the name of the container

  
  docker exec --tty helloworld /bin/bash #helloworld is the name of the container

send a command to a container

docker exec helloworld ls

linux, docker, containers

This post is licensed under CC BY 4.0 by the author.