Alpine Linux container
At the beginning of 2016, you may have heard that Docker moved from the Ubuntu
images to Alpine Linux as the default. This was mostly done for the sake of
size: Ubuntu took nearly 200MB at the time, while a minimal Alpine
installation is closer to 5MB. Fewer bytes to transfer and write is always a
good idea, especially when every container is shipped prebuilt, with all its
required dependencies included.
But what do the containers actually have to include? Let’s look at creating a
minimal, functional Alpine container with memcached, without Docker or fancy
tools. Then we’ll make sure it runs via systemd-nspawn.
What’s in the container?
While you can create a container which includes any software you want, a minimal
container which will run only one application doesn’t need much. The required
parts are:
- The application (memcached in this case)
- Any dynamic libraries (libevent and musl)
- Any runtime files needed by the application or libraries (none for memcached)
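If you want to check the dynamic library list for a binary yourself, ldd prints it. As a rough sketch (the function names ldd_paths and needed_libs are made up for this example, and the parsing assumes the usual "name => /path (address)" output format of the glibc and musl ldd), the resolved paths can be extracted like this:

```shell
# Print the resolved paths of the shared libraries found in ldd-style output.
# Reads lines such as "libfoo.so.1 => /usr/lib/libfoo.so.1 (0x...)" on stdin.
ldd_paths() {
    awk '
        # "name => /path (addr)": the path is the third field
        $2 == "=>" && $3 ~ /^\// { print $3; next }
        # "/lib/ld-... (addr)": the loader is listed by its path only
        $1 ~ /^\// { print $1 }
    ' | sort -u
}

# Convenience wrapper: list the libraries a binary on this system links against.
needed_libs() {
    ldd "$1" | ldd_paths
}
```

On a system where both memcached and ldd are installed, running needed_libs /usr/bin/memcached should list the libevent and musl libraries mentioned above. Entries without a resolved path (virtual libraries, or "not found" lines) are skipped.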
So what about the system? What actually makes this “Alpine Linux” anyway? Not
much, actually. The whole system is really just a collection of libraries,
runtime files, and configuration. The only thing that makes it Linux at this
point is how it expects to communicate with the kernel (specific syscalls). It
does not even include the kernel itself, since we’re only interested in running
it in a namespace, not full virtualisation.
If you install the minimal recommended meta package “alpine-base”, here’s the
list of packages you’ll get:
musl
busybox
alpine-baselayout
openrc
alpine-conf
zlib
libcrypto1.0
libssl1.0
apk-tools
busybox-suid
busybox-initscripts
scanelf
musl-utils
libc-utils
alpine-keys
alpine-base
Then we can use that basic image to install memcached. The only extra
modification needed is the injection of the repository url into the apk
configuration file. To create that minimal image and pack it into a tarball, you
can use the following script (as root):
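#!/bin/bash
set -euo pipefail
# Alpine release to install; defaults to the latest stable branch
VER="${1:-latest-stable}"
# No trailing slash here, otherwise the repository URL ends up with "//"
MIRROR=http://dl-cdn.alpinelinux.org/alpine
TMP_DIR=$(mktemp -d)
TMP_TAR=$(mktemp "alpine-${VER}-XXXXXX.tar.gz")
# Install alpine-base and its dependencies into the empty temporary root
apk.static -X "${MIRROR}/${VER}/main" -U --allow-untrusted --root "${TMP_DIR}/" --initdb add alpine-base
# Point apk inside the image at the same repository
echo "${MIRROR}/${VER}/main" > "${TMP_DIR}/etc/apk/repositories"
# Pack the root into a tarball (-a picks the compression from the file suffix)
tar -pacf "${TMP_TAR}" -C "${TMP_DIR}" .
rm -rf "${TMP_DIR}"
# Print the tarball name for the caller
echo "${TMP_TAR}"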
What happens here: apk downloads and installs the alpine-base package into
a temporary local directory. This pulls in the standard dependencies needed to
run a basic command-line userspace. Then the directory is packed into a standard
tarball, with all the permission information preserved. The statically linked
apk.static binary is used to make the process as independent from the host system as possible.
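Before importing the tarball, it can be worth checking that it contains what you expect. The following is only a sketch: tar_has is a hypothetical helper, not part of tar or apk, and it assumes a gzip-compressed tarball whose members are named relative to the root with a leading ./ (which is what tar -C DIR . produces).

```shell
# Return success if the gzip-compressed tarball contains every listed member.
# Member names must match the archive's own naming, e.g. "./etc/apk/repositories".
tar_has() {
    tarball=$1
    shift
    listing=$(tar -tzf "$tarball") || return 1
    for member in "$@"; do
        # -x: whole-line match, -F: treat the name as a fixed string
        printf '%s\n' "$listing" | grep -xF -- "$member" >/dev/null || {
            echo "missing: $member" >&2
            return 1
        }
    done
}

# Example (not run here): check the image produced by the script above
# tar_has "${TMP_TAR}" ./etc/apk/repositories
```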
The resulting tarball can be imported to systemd to use as a named container
in the future. This is done with:
machinectl --read-only import-tar alpine-latest-stable-.......tar.gz alpine-base
It takes around 7.4M in total and includes a package index which could be
removed. Now you can log in to your fresh Alpine container and play around using:
systemd-nspawn -M alpine-base -x /bin/sh
The image can be copied to a new name and apk can be used from within the
container to install more software.
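As for the package index mentioned above: it can be dropped from the temporary root before the tarball is packed. This is a sketch under an assumption, namely that apk keeps the downloaded index as APKINDEX.*.tar.gz files under var/cache/apk inside the root; drop_apk_index is a made-up helper name.

```shell
# Delete cached apk repository indexes from a root filesystem tree.
# Assumption: they live in var/cache/apk/ as APKINDEX.*.tar.gz files.
drop_apk_index() {
    root=$1
    # rm -f stays quiet if the glob matches nothing
    rm -f "$root"/var/cache/apk/APKINDEX.*.tar.gz
}

# Example (not run here): call it just before the tar step of the script
# drop_apk_index "${TMP_DIR}"
```

The trade-off is that apk inside the container would have to fetch the index again before it can install anything.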
Trimming it down
The previous list of packages can still be trimmed down. We don’t need openrc,
because the container will start the service directly rather than via init. It
doesn’t really need busybox either, because nobody needs to log in to the
container interactively. So the list of packages needed to install memcached
is reduced to:
alpine-baselayout
alpine-keys
apk-tools
This new image takes only 4.6M. But actually… we’re already running apk on the
host, so it’s not needed inside the container. Instead, we can install memcached
directly, in place of alpine-base. It still pulls in all the required
dependencies itself (musl, busybox, libevent, libressl), but that’s it.
If you replace alpine-base with memcached in the above script and add:
echo "memcached:x:100:101:memcached:/home/memcached:/sbin/nologin" > "${TMP_DIR}/etc/passwd"
the resulting image (imported under the name alpine-memcached) will contain a
fully functional memcached daemon which can be started up using:
systemd-nspawn -M alpine-memcached --private-users=pick -x /usr/bin/memcached -u memcached
The extra passwd line has to be injected because /etc/passwd doesn’t exist
without alpine-baselayout. The --private-users option is used so that the
running container doesn’t share users with the host (and we don’t have to worry
about UID collisions).
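For reference, a passwd line has seven colon-separated fields: name, password placeholder, UID, GID, comment, home directory and shell. A small helper along these lines could generate the entry instead of hard-coding it. Note that add_service_user is a made-up name, and the /home/NAME home directory and /sbin/nologin shell simply mirror the line used above.

```shell
# Append a passwd entry for a service account to a root filesystem tree.
# Fields: name:password:UID:GID:comment:home:shell
add_service_user() {
    root=$1
    name=$2
    user_id=$3
    group_id=$4
    mkdir -p "$root/etc"
    printf '%s:x:%s:%s:%s:/home/%s:/sbin/nologin\n' \
        "$name" "$user_id" "$group_id" "$name" "$name" >> "$root/etc/passwd"
}

# Example (not run here): same entry as the echo line above
# add_service_user "${TMP_DIR}" memcached 100 101
```

Unlike the echo line, this appends rather than overwrites, which makes no difference when the file doesn’t exist yet.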
What else could be trimmed
There are still things that are not really needed. Since the container’s
filesystem can be mounted directly (for debugging or updates), busybox is not
necessary and could be removed, for example. That would save around 1 MB,
which is noticeable for this container. For larger applications, however, it
would likely be lost in the noise.
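To see where the remaining bytes actually go before deciding what to remove, you can list the biggest files in the unpacked root. This is a sketch only: largest_files is a hypothetical helper, and it reports apparent file sizes in bytes (via wc -c) rather than disk usage.

```shell
# Print the COUNT largest regular files under ROOT as "SIZE_IN_BYTES PATH",
# biggest first. COUNT defaults to 10.
# wc prints a "total" line when given several files; awk drops it.
largest_files() {
    root=$1
    count=${2:-10}
    find "$root" -type f -exec wc -c {} + |
        awk '!(NF == 2 && $2 == "total")' |
        sort -rn |
        head -n "$count"
}

# Example (not run here): top five files in the temporary root
# largest_files "${TMP_DIR}" 5
```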
Summary
Containers are not magic. They’re just standard files and some way of running
them in a separate namespace. If you’re using them to run a single application,
then that’s all you really need. While known images provide a good starting
point and a working set of dependencies, they’re not always necessary. It’s
worth having a look at what your app really needs and maybe installing just that.
Smaller images mean shorter copy/deploy times, less cruft in the CI system, and
less need for regular cleanups. Even if it’s not needed every time, it’s worth
thinking about how to reduce the image size in larger environments.