Docker Introduction
Docker is an industry-standard technology for packaging software, including its full list of dependencies, so that it can be run on almost any machine. There is no guarantee that the software environment offered by your institute or LXPLUS will still be the same when the time comes to re-run the analysis, so it is necessary to capture the analysis software completely.
Because Docker is an industry standard, extensive documentation is available on the web, such as the Docker Getting Started Guide.
Images, Containers, Dockerfiles
The core idea of Docker is that it captures software in a Container Image, a snapshot of a filesystem that includes your analysis software and its dependencies. Based on that image, processes — such as your analysis code — can be run in that captured software environment as containerized processes.
The easiest way to build a Docker image is to use the docker build command together with a short text file called the Dockerfile. A Dockerfile starts by declaring a pre-existing "base image" that includes the operating system and other common software. Building on that base, you can then install additional software, either by copying data into the image with the COPY instruction or by running commands with the RUN instruction:
FROM <baseimage>
RUN <install step 1>
RUN <install step 2>
COPY --chown=<user> . /path/inside/container
RUN <install step 3>
Each RUN instruction runs in a separate shell, so shell state such as exported environment variables or the current directory does not carry over to the next instruction (note: this differs from GitLab's CI system, in which subsequent commands run in the same shell). A RUN instruction executes the given command within the container image and snapshots the resulting filesystem before the next instruction runs.
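The separate-shell behaviour can be mimicked without Docker. The sketch below is only an analogy: each sh -c call stands in for one RUN instruction, and the variable name BUILD_TYPE is made up for illustration. It suggests why commands that rely on shared shell state are usually chained with && inside a single RUN.

```shell
# Analogy only, no Docker involved: each `sh -c` call plays the role of one
# RUN instruction, i.e. a fresh shell that inherits nothing from the previous one.
# BUILD_TYPE is a hypothetical variable used purely for illustration.
unset BUILD_TYPE

# Like "RUN export BUILD_TYPE=Release" followed by a second "RUN echo ...":
# the variable is gone by the time the second shell starts.
separate=$(sh -c 'export BUILD_TYPE=Release'; sh -c 'echo "BUILD_TYPE=${BUILD_TYPE:-unset}"')

# Like "RUN export BUILD_TYPE=Release && echo ...":
# both commands share one shell, so the variable is still set.
chained=$(sh -c 'export BUILD_TYPE=Release && echo "BUILD_TYPE=${BUILD_TYPE:-unset}"')

echo "separate: $separate"   # prints: separate: BUILD_TYPE=unset
echo "chained:  $chained"    # prints: chained:  BUILD_TYPE=Release
```

Files written to disk by one RUN step do persist into the next one, because the filesystem is snapshotted after each instruction; only the shell state is lost.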
A COPY instruction copies a directory or file into the container at a given path. The syntax is COPY --chown=<user> <src> <target>, where <src> is relative to the build context, which is passed to the build command as docker build <path to build context>, and <user> is the name of the user that should own the files in the container.
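To make the role of the build context concrete, here is a minimal sketch that assembles a throwaway context directory. The directory layout, the file main.cxx and the image tag copy_example are hypothetical names chosen for illustration; the docker build call is left commented out because it needs a running Docker daemon and pulls a large base image.

```shell
# Create a throwaway build context; all names below are hypothetical.
context_dir=$(mktemp -d)
mkdir -p "$context_dir/source"
echo 'int main() { return 0; }' > "$context_dir/source/main.cxx"

# <src> ("source") is resolved relative to the build context, not relative to
# the directory from which the docker command is launched.
cat > "$context_dir/Dockerfile" <<'EOF'
FROM gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247
COPY --chown=atlas source /analysis/src
EOF

# The last argument of docker build is the path to the build context.
# Uncomment to actually build the image:
# docker build -t copy_example "$context_dir"
echo "build context prepared in $context_dir"
```

With this layout, the contents of source/ would end up under /analysis/src in the image, owned by the atlas user.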
Many more instructions are described in the Dockerfile documentation, but RUN and COPY cover most use cases.
ATLAS Images
ATLAS provides base images that analyses can use. They are published on CERN's GitLab container registry under the atlas/athena prefix. AnalysisBase, AthAnalysis and AnalysisTop images are all published:
- gitlab-registry.cern.ch/atlas/athena/analysisbase:<release>, e.g. gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247
- gitlab-registry.cern.ch/atlas/athena/athanalysis:<release>, e.g. gitlab-registry.cern.ch/atlas/athena/athanalysis:21.2.247
- gitlab-registry.cern.ch/atlas/athena/analysistop:<release>, e.g. gitlab-registry.cern.ch/atlas/athena/analysistop:21.2.99
Note
The GitLab container registry web interface is not user friendly and is very slow (compared to Docker Hub).
To browse the available images quickly from the command line, consider using crane.
Example:
$ crane ls gitlab-registry.cern.ch/atlas/athena/analysisbase | grep 21.2.247
21.2.247-20230316
21.2.247
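An image reference such as the ones listed above has the form <repository>:<tag>, where the repository begins with the registry host. As a small sketch (plain POSIX parameter expansion, no Docker or crane required), a full reference can be split into those pieces. The variable names are illustrative only, and the snippet assumes the reference carries a tag and no digest.

```shell
# Split a full image reference into registry, repository and release tag.
# Assumes the reference ends in ":<tag>" (no "@sha256:..." digest).
image="gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247"

repository="${image%:*}"    # everything before the last ':'
release="${image##*:}"      # everything after the last ':'
registry="${image%%/*}"     # host part, up to the first '/'

echo "registry:   $registry"
echo "repository: $repository"
echo "release:    $release"

# The repository (without the tag) is the argument that crane ls expects:
# crane ls "$repository"
```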
Example Dockerfile
An ATLAS Dockerfile might look something like this:
FROM gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247
COPY --chown=atlas . /analysis/src
WORKDIR /analysis/build
RUN . /release_setup.sh && \
    cmake ../src && \
    make -j4
USER atlas
When executing docker build . in the root of a repository, Docker will:
- Start from the AnalysisBase 21.2.247 base image
- Copy the full source code of the repository into the path /analysis/src within the container image, and change ownership of the files from root to the default user atlas (see discussion of permissions below)
- Set the default work directory to the cmake build directory /analysis/build (which will be created as it does not exist yet)
- Set up the release and compile the analysis
Permissions Considerations when using ATLAS Images
The ATLAS images all have a default non-root user, atlas. The atlas user is part of the wheel group, which gives it sudo privileges (see the Dockerfile used to create the ATLAS images for the relevant lines).
Even though atlas is the default user of the base image (gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247), files and directories created by instructions such as COPY are owned by root by default. For example, if we create an image with the following Dockerfile
# Dockerfile.permissions
FROM gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247
COPY . /analysis/src
WORKDIR /analysis/build
USER atlas
and then build the image with docker build and run a container from it, you'll see that root owns /analysis/src and that the default atlas user doesn't have permission to write in the /analysis directory:
# Build a docker image using the Dockerfile named Dockerfile.permissions, and call the new image permissions_example
docker build . -f Dockerfile.permissions -t permissions_example
# Start up a container from the newly created permissions_example image which will be removed after we exit from it (--rm).
# Interact with the container via the terminal (-it)
docker run --rm -it permissions_example
# Check ownership and permissions of /analysis dir
[bash][atlas]:build > ls -lh /analysis
total 12K
drwxr-xr-x 2 atlas atlas 4.0K Mar 27 16:58 build
drwxr-xr-x 2 root root 4.0K Mar 27 16:58 src
# Try creating a test file in /analysis dir
[bash][atlas]:build > touch /analysis/test.txt
touch: cannot touch `/analysis/test.txt': Permission denied
That's why we need to specify explicitly in the Dockerfile that, during the COPY step, /analysis/src should be owned by the atlas user:
FROM gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247
COPY --chown=atlas . /analysis/src
WORKDIR /analysis/build
USER atlas
Though note that:
The --chown and --chmod features are only supported on Dockerfiles used to build Linux containers, and will not work on Windows containers. Since user and group ownership concepts do not translate between Linux and Windows, the use of /etc/passwd and /etc/group for translating user and group names to IDs restricts this feature to only be viable for Linux OS-based containers.