Docker Introduction

Docker is an industry-standard technology to package software, including its full list of dependencies, such that it can be run on almost any machine. As there is no guarantee that the software environment offered by your institute or LXPLUS will be the same when the time to re-run the analysis comes, it is necessary to completely capture the analysis software.

Because Docker is an industry standard, there is extensive documentation on it around the web, such as the Docker Getting Started Guide.

Images, Containers, Dockerfiles

The core idea of Docker is that it captures software in a Container Image, a snapshot of a filesystem that includes your analysis software and its dependencies. Based on that image, processes — such as your analysis code — can be run in that captured software environment as containerized processes.

The easiest way to build a Docker image is to use the docker build command together with a short text file, the Dockerfile. A Dockerfile starts by declaring a pre-existing "base image" that includes the operating system and other common software. Building on that base, you can then install additional software either by copying data into the image with the COPY instruction or by running commands with a RUN instruction:

FROM <baseimage>
RUN <install step 1>
RUN <install step 2>
COPY --chown=<user> . /path/inside/container
RUN <install step 3>

Each RUN instruction runs in a separate shell, so shell state such as the current directory or variables set in one RUN does not carry over to the next (note: this is different from GitLab's CI system, in which subsequent commands run in the same shell). A RUN instruction executes a given command within the container image and snapshots the resulting filesystem before the next instruction runs.
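The separate-shell behaviour can be illustrated without Docker. The following is only a rough local analogy, assuming a POSIX sh is available: each sh -c call stands in for one RUN instruction, and DEMO_VAR is a made-up variable name used purely for illustration.

```shell
unset DEMO_VAR  # make sure the variable is not inherited from the caller

# Two separate shells, analogous to two RUN instructions:
# the assignment in the first shell is gone by the time the second one starts.
sh -c 'DEMO_VAR=hello'
separate="$(sh -c 'echo "${DEMO_VAR:-unset}"')"

# One shell with && chaining, analogous to a single RUN instruction:
# the assignment is still visible to the later command.
chained="$(sh -c 'DEMO_VAR=hello && echo "${DEMO_VAR:-unset}"')"

echo "separate shells: ${separate}"   # prints "separate shells: unset"
echo "single shell:    ${chained}"    # prints "single shell:    hello"
```

This is why multi-step setups in a Dockerfile are usually chained with && inside a single RUN. Unlike shell state, files written by one RUN do persist into later instructions, because the filesystem is snapshotted after each step.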

A COPY instruction copies a directory or file into the container at a given path. The syntax is COPY --chown=<user> <src> <target>, where <src> is relative to the build context passed to docker build <path to build context>, and <user> is the name of the user that should own the files in the container.
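To make the role of the build context concrete, here is a minimal sketch. The temporary directory, the hello.txt file and the context_example tag are hypothetical names chosen for illustration; the base image is one of the ATLAS images used elsewhere on this page. The build step is skipped when Docker is not installed.

```shell
# Create a throwaway build context (hypothetical layout, for illustration only).
CONTEXT_DIR="$(mktemp -d)"
echo "hello" > "${CONTEXT_DIR}/hello.txt"

# The <src> of COPY (here ".") is resolved relative to the build context,
# not relative to the directory from which docker build is invoked.
cat > "${CONTEXT_DIR}/Dockerfile" <<'EOF'
FROM gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247
COPY --chown=atlas . /analysis/src
EOF

# The last argument to docker build is the path to the build context.
# By default, Docker looks for a file named Dockerfile at the root of that path.
if command -v docker >/dev/null 2>&1; then
    docker build -t context_example "${CONTEXT_DIR}" \
        || echo "docker build failed" >&2
else
    echo "docker not found; skipping build"
fi
```

Everything inside the context directory is sent to the Docker daemon, so keeping the context small (for example by building from the repository root rather than from a home directory) helps build times.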

There are many more instructions you can find in the Documentation, but RUN and COPY will cover most use-cases.

ATLAS Images

ATLAS provides base images that analyses can use. They are published on CERN's GitLab container registry under the atlas/athena prefix. Images for AnalysisBase, AthAnalysis and AnalysisTop are all published:

gitlab-registry.cern.ch/atlas/athena/analysisbase:<release>, e.g. gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247
gitlab-registry.cern.ch/atlas/athena/athanalysis:<release>, e.g. gitlab-registry.cern.ch/atlas/athena/athanalysis:21.2.247
gitlab-registry.cern.ch/atlas/athena/analysistop:<release>, e.g. gitlab-registry.cern.ch/atlas/athena/analysistop:21.2.99

Note

The GitLab container registry web interface is not user friendly and is very slow (compared to Docker Hub). To quickly browse all of the available images from the command line, consider using crane.

Example:

$ crane ls gitlab-registry.cern.ch/atlas/athena/analysisbase | grep 21.2.247
21.2.247-20230316
21.2.247
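Once a tag has been identified, the full image reference is the registry path followed by a colon and the tag. The sketch below assembles such a reference and pulls it. The variable names REGISTRY, PROJECT and RELEASE are illustrative only, and the pull is skipped when Docker is not installed.

```shell
# Assemble an image reference from its parts (variable names are illustrative).
REGISTRY="gitlab-registry.cern.ch/atlas/athena"
PROJECT="analysisbase"
RELEASE="21.2.247"
IMAGE="${REGISTRY}/${PROJECT}:${RELEASE}"

echo "Image reference: ${IMAGE}"

# Download the image so that it can be used locally as a base image.
if command -v docker >/dev/null 2>&1; then
    docker pull "${IMAGE}" || echo "docker pull failed" >&2
else
    echo "docker not found; skipping pull"
fi
```

Pulling explicitly is optional: docker build fetches the base image named in FROM automatically if it is not already available locally.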

Example Dockerfile

An ATLAS Dockerfile might look something like this:

FROM gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247
COPY --chown=atlas . /analysis/src
WORKDIR /analysis/build
RUN . /release_setup.sh && \
    cmake ../src && \
    make -j4
USER atlas

When executing docker build . in the root of a repository, Docker will:

Start from the AnalysisBase 21.2.247 base image
Copy the full source code of the repository into the path /analysis/src within the container image, and change ownership of the files from root to the default user atlas (see discussion of permissions below)
Set the default work directory to the cmake build directory /analysis/build (which will be created as it does not exist yet)
Set up the release and compile the analysis
Set atlas as the user for subsequent instructions and for containers run from the image

Permissions Considerations when using ATLAS Images

The ATLAS images all have a default non-root user atlas. The atlas user is part of the wheel group, which gives it sudo privileges (see here for the relevant bit of the Dockerfile used to create the ATLAS images).

Even though atlas is the default user of the base image (gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247), files and directories created by instructions such as COPY are owned by root by default. For example, if we create an image with the following Dockerfile

# Dockerfile.permissions
FROM gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247
COPY . /analysis/src
WORKDIR /analysis/build
USER atlas

then build the image using the docker build command and run a container from it, you'll see that root owns /analysis and /analysis/src, and the default atlas user doesn't have permission to write to either directory:

# Build a docker image using the Dockerfile named Dockerfile.permissions, and call the new image permissions_example
docker build . -f Dockerfile.permissions -t permissions_example

# Start up a container from the newly created permissions_example image which will be removed after we exit from it (--rm).
# Interact with the container via the terminal (-it)
docker run --rm -it permissions_example

# Check ownership and permissions of /analysis dir
[bash][atlas]:build > ls -lh /analysis

total 12K
drwxr-xr-x 2 atlas atlas 4.0K Mar 27 16:58 build
drwxr-xr-x 2 root  root  4.0K Mar 27 16:58 src

# Try creating a test file in /analysis dir
[bash][atlas]:build > touch /analysis/test.txt

touch: cannot touch `/analysis/test.txt': Permission denied

That's why the Dockerfile needs to explicitly specify, in the COPY step, that /analysis/src should be owned by the atlas user:

FROM gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247
COPY --chown=atlas . /analysis/src
WORKDIR /analysis/build
USER atlas

Note, however, this limitation described in the Docker documentation:

The --chown and --chmod features are only supported on Dockerfiles used to build Linux containers, and will not work on Windows containers. Since user and group ownership concepts do not translate between Linux and Windows, the use of /etc/passwd and /etc/group for translating user and group names to IDs restricts this feature to only be viable for Linux OS-based containers.

Last update: March 27, 2023