Creating containers by hand

Alpine Linux container

At the beginning of 2016, you may have heard that Docker moved from the Ubuntu images to Alpine Linux as the default. This was mostly done due to Alpine’s minimal size: Ubuntu took nearly 200MB at the time, while a minimal Alpine installation is closer to 5MB. Transferring and writing fewer bytes is always a good idea, especially when every container is shipped fully prepared, with all of its required dependencies included.

But what do the containers actually have to include? Let’s look at creating a minimal, functional Alpine container with memcached, without Docker or fancy tools. Then make sure it runs via systemd-nspawn.

What’s in the container?

While you can create a container which includes any software you want, a minimal container which will run only one application doesn’t need much. The required parts are:

The application (memcached in this case)
Any dynamic libraries (libevent and musl)
Any runtime files needed by the application or libraries (none for memcached)
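One way to find out which dynamic libraries an application needs is to ask the dynamic linker. The sketch below assumes ldd is available on the system that holds the binary; the function names resolved_paths and needed_libs are made up for this example and are not part of any tool.

```shell
#!/bin/bash
set -euo pipefail

# Keep only the resolved library paths from ldd-style output on stdin.
# Such lines look like: "libfoo.so.1 => /usr/lib/libfoo.so.1 (0x...)"
resolved_paths() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}

# List the shared libraries a binary is linked against, one path per line.
needed_libs() {
    ldd "$1" | resolved_paths
}
```

Calling needed_libs /usr/bin/memcached inside an Alpine system with memcached installed should print the libraries the daemon links against. Statically linked binaries make ldd fail, so the function is only useful for dynamic executables.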

So what about the system? What actually makes this an “Alpine Linux” anyway? Not much actually. The whole system is really just a collection of libraries, runtime files, and configuration. The only thing that makes it Linux at this point is how it expects to communicate with the kernel (specific syscalls). It does not even include the kernel itself, since we’re only interested in running it in a namespace, not full virtualisation.

If you install the minimal recommended meta package “alpine-base”, here’s the list of packages you’ll get:

musl
busybox
alpine-baselayout
openrc
alpine-conf
zlib
libcrypto1.0
libssl1.0
apk-tools
busybox-suid
busybox-initscripts
scanelf
musl-utils
libc-utils
alpine-keys
alpine-base

Then we can use that basic image to install memcached. The only extra modification needed is the injection of the repository URL into the apk configuration file. To create that minimal image and pack it into a tarball, you can use the following script (as root):

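#!/bin/bash
set -euo pipefail

# Alpine release to install; defaults to the latest stable branch.
VER="${1:-latest-stable}"
# No trailing slash, so the URLs built below don't contain "//".
MIRROR=http://dl-cdn.alpinelinux.org/alpine
TMP_DIR=$(mktemp -d)
TMP_TAR=$(mktemp "alpine-${VER}-XXXXXX.tar.gz")

# Install alpine-base and its dependencies into the temporary root.
apk.static -X "${MIRROR}/${VER}/main" -U --allow-untrusted --root "${TMP_DIR}/" --initdb add alpine-base
# Point apk inside the image at the same repository.
echo "${MIRROR}/${VER}/main" > "${TMP_DIR}/etc/apk/repositories"
# Pack the root; -a picks the compression from the .tar.gz suffix.
tar -pacf "${TMP_TAR}" -C "${TMP_DIR}" .

rm -rf "${TMP_DIR}"

# Print the tarball name for the import step.
echo "${TMP_TAR}"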

What happens here: apk downloads and installs the alpine-base package into a temporary local directory. This pulls in the standard dependencies needed to run a basic userspace command line. The directory is then packed into a standard tarball, with the permission information preserved. apk.static is used to make the process as independent from the host system as possible.
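Before importing the tarball, it can be worth checking that the repository line really ended up inside it. The following is a minimal sketch, not part of the original script: show_repositories is a hypothetical helper, and the tiny directory tree only stands in for a real image so that the example runs without root or apk.

```shell
#!/bin/bash
set -euo pipefail

# Print the apk repository list stored inside a gzipped image tarball.
show_repositories() {
    tar -xzOf "$1" ./etc/apk/repositories
}

# Stand-in for the real image: a tiny tree packed relative to its root,
# the same way the script above packs the temporary directory.
work=$(mktemp -d)
mkdir -p "${work}/rootfs/etc/apk"
echo "http://dl-cdn.alpinelinux.org/alpine/latest-stable/main" > "${work}/rootfs/etc/apk/repositories"
tar -czf "${work}/image.tar.gz" -C "${work}/rootfs" .

repos=$(show_repositories "${work}/image.tar.gz")
echo "${repos}"

rm -rf "${work}"
```

With a real image, you would pass the tarball name printed by the build script to show_repositories instead of the stand-in.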

The resulting tarball can be imported to systemd to use as a named container in the future. This is done with:

machinectl --read-only import-tar alpine-latest-stable-.......tar.gz alpine-base

It takes around 7.4M in total and includes a package index which could be removed. Now you can log in to your fresh Alpine container and play around using:

systemd-nspawn -M alpine-base -x /bin/sh

The image can be copied to a new name and apk can be used from within the container to install more software.
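Copying the image and installing into the copy could look roughly like the sketch below. It assumes machinectl clone and machinectl read-only behave as described in the systemd documentation and that apk lives at /sbin/apk inside the image. The run and derive_image helpers and the alpine-memcached name are made up for this example. By default the script only prints the commands; set DRY_RUN=0 and run it as root to execute them.

```shell
#!/bin/bash
set -euo pipefail

# Print commands by default; set DRY_RUN=0 to really execute them (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Copy an imported image to a new name and install a package into the copy.
derive_image() {
    local base="$1" name="$2" package="$3"
    run machinectl clone "${base}" "${name}"
    # The base was imported read-only, so make sure the copy is writable.
    run machinectl read-only "${name}" false
    # No -x here: an ephemeral snapshot would throw the installation away.
    run systemd-nspawn -M "${name}" /sbin/apk add "${package}"
}

derive_image alpine-base alpine-memcached memcached
```

The dry-run default is a deliberate choice: it lets you review the three commands before anything touches the images on the host.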

Trimming it down

The previous list of packages can still be trimmed down. We don’t need openrc, because the container will start the service directly rather than via init. It doesn’t really need busybox either, because nobody needs to log in to the container interactively. So the total list of packages that can install memcached gets reduced to:

alpine-baselayout
alpine-keys
apk-tools

This new image takes only 4.6M. But actually… we’re already running apk, so it’s not needed in the container. Instead, we can install memcached directly - even instead of alpine-base. It still pulls all the required dependencies itself (musl, busybox, libevent, libressl), but that’s it.

If you replace alpine-base with memcached in the above script and add:

echo "memcached:x:100:101:memcached:/home/memcached:/sbin/nologin" > "${TMP_DIR}/etc/passwd"

the resulting package will contain a fully functional memcached daemon which can be started up using:

systemd-nspawn -M alpine-memcached --private-users=pick -x /usr/bin/memcached -u memcached

The extra passwd line had to be injected because the file doesn’t exist without alpine-baselayout. The private-users option is used so that the running container doesn’t share users with the host, which also means we don’t have to worry about UID collisions.
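The injected passwd entry packs seven colon-separated fields into one line: user name, password placeholder, UID, GID, description, home directory, and login shell. As a small illustration (the variable names are chosen for this example only), the entry can be split apart like this:

```shell
#!/bin/bash
set -euo pipefail

entry="memcached:x:100:101:memcached:/home/memcached:/sbin/nologin"

# Split the entry on ':' into the seven standard passwd fields.
IFS=: read -r name password uid gid gecos home shell <<EOF
${entry}
EOF

echo "user=${name} uid=${uid} gid=${gid} home=${home} shell=${shell}"
```

The name in the first field is what the -u memcached option refers to, and /sbin/nologin as the shell reflects that nobody is expected to log in as this user.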

What else could be trimmed

There are still things that are not really needed. Since the container’s filesystem can be mounted directly (for debugging or updates), busybox is not necessary and could be removed, for example. That would save around 1 MB, which is noticeable for this container. For larger applications, however, it would likely be lost in the noise.
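To decide what is worth trimming, it helps to see which files take the most space in the image. The sketch below assumes the column layout of GNU tar's verbose listing (size in the third column, path in the sixth). largest_entries is a hypothetical helper, and the two generated files only stand in for a real image.

```shell
#!/bin/bash
set -euo pipefail

# Print the N largest regular files in a gzipped tarball as "<bytes> <path>".
largest_entries() {
    local tarball="$1" count="${2:-10}"
    # Regular files start with "-" in the listing. Paths containing spaces
    # are cut short by this simple column-based parsing.
    tar -tzvf "${tarball}" \
        | awk '$1 ~ /^-/ { print $3, $6 }' \
        | sort -rn \
        | awk -v n="${count}" 'NR <= n'
}

# Stand-in image with one large and one small file.
work=$(mktemp -d)
mkdir "${work}/rootfs"
printf '%1000s' '' > "${work}/rootfs/big"
printf '%10s' '' > "${work}/rootfs/small"
tar -czf "${work}/image.tar.gz" -C "${work}/rootfs" .

biggest=$(largest_entries "${work}/image.tar.gz" 1)
echo "${biggest}"

rm -rf "${work}"
```

The final awk filter is used instead of head so that the pipeline does not fail under pipefail when the listing is long.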

Summary

Containers are not magic. They’re just standard files and some way of running them in a separate namespace. If you’re using them to run a single application, then that’s all you really need. While known images provide a good starting point and a working set of dependencies, they’re not always necessary. It’s worth having a look at what your app really needs and maybe installing just that.

Smaller images mean shorter copy/deploy times, less cruft in the CI system, and less need for regular cleanups. Even if it’s not needed every time, it’s worth thinking about how to reduce the image size in larger environments.
