Docker Introduction
Docker is an industry-standard technology to package software, including its full list of dependencies, such that it can be run on almost any machine. As there is no guarantee that the software environment offered by your institute or LXPLUS will be the same when the time to re-run the analysis comes, it is necessary to completely capture the analysis software.
Because Docker is an industry standard, there is extensive documentation on it around the web, such as the Docker Getting Started Guide.
Images, Containers, Dockerfiles
The core idea of Docker is that it captures software in a Container Image, a snapshot of a filesystem that includes your analysis software and its dependencies. Based on that image, processes — such as your analysis code — can be run in that captured software environment as containerized processes.
The easiest way to build a Docker image is to use the docker build command together with a short text file, the Dockerfile. A Dockerfile starts by declaring a pre-existing "base image" that includes the operating system and other common software. Building on that base, you can then install additional software either by copying data into the image via the COPY instruction or by running commands with a RUN instruction:
FROM <baseimage>
RUN <install step 1>
RUN <install step 2>
COPY --chown=<user> . /path/inside/container
RUN <install step 3>
Each RUN instruction runs in a separate shell, so shell state such as the current directory or exported variables does not carry over from one RUN to the next (Note: this is different from GitLab's CI system, in which subsequent commands run in the same shell). A RUN instruction executes a given command within the container image and snapshots the resulting filesystem before the next instruction runs.
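A rough way to see the consequence of separate shells without building an image is to mimic it locally. In the sketch below each sh -c call stands in for one RUN instruction, and DEMO_VAR is a made-up variable name used only for illustration. It imitates the shell behaviour only. It does not reproduce Docker's snapshots, in which files written by one RUN are still visible to the next.

```shell
#!/bin/sh
# Each `sh -c` below plays the role of one RUN instruction: a brand-new shell.
# DEMO_VAR is a hypothetical variable name used only for this illustration.
unset DEMO_VAR

# Like: RUN cd /tmp && export DEMO_VAR=hello
# The directory change and the variable exist only inside this shell.
sh -c 'cd /tmp && DEMO_VAR=hello && export DEMO_VAR'

# Like a following: RUN pwd && echo $DEMO_VAR
# A fresh shell starts over, so neither change is visible here.
separate=$(sh -c 'pwd; echo "DEMO_VAR=${DEMO_VAR:-unset}"')

# Chaining with && keeps all commands in one shell, like a single RUN.
chained=$(sh -c 'cd /tmp && DEMO_VAR=hello && pwd && echo "DEMO_VAR=$DEMO_VAR"')

printf 'separate shells:\n%s\n' "$separate"
printf 'one chained shell:\n%s\n' "$chained"
```

This is why commands that depend on each other's shell state are usually chained with && inside a single RUN instruction.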
A COPY instruction copies a directory or file into the container at a given path. The syntax is COPY --chown=<user> <src> <target>, where <src> is relative to the build context, which is the directory passed to docker build <path to build context>, and <user> is the name of the user that should own the files in the container.
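To make the role of the build context concrete, the sketch below assembles a throwaway context directory with a two-line Dockerfile in it. The directory layout, the hello.sh file, the context_example tag and the RUN_BUILD switch are all made up for illustration. The base image is one of the ATLAS images named elsewhere in this document. Because pulling the base image can take a while, the script only prints the build command unless RUN_BUILD=1 is set.

```shell
#!/bin/sh
set -eu

# Throwaway directory that will serve as the build context (hypothetical layout).
context=$(mktemp -d)
mkdir -p "$context/src"
echo 'echo "hello from the container"' > "$context/src/hello.sh"

# By default docker build reads the file named Dockerfile at the context root.
# The COPY source "src" is resolved relative to the context, not to the
# directory the docker command is launched from.
cat > "$context/Dockerfile" <<'EOF'
FROM gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247
COPY --chown=atlas src /analysis/src
EOF

# RUN_BUILD is a made-up switch for this sketch: set RUN_BUILD=1 to really build.
if [ "${RUN_BUILD:-0}" = "1" ]; then
    docker build -t context_example "$context"
else
    echo "would run: docker build -t context_example $context"
fi
```

Only files inside the context directory can be used as a COPY source, so the path given to docker build determines what the Dockerfile can see.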
There are many more instructions you can find in the Documentation, but RUN and COPY will cover most use-cases.
ATLAS Images
ATLAS provides base images that analyses can use. They are published on CERN's GitLab container registry under the atlas/athena prefix. Images are published for all of AnalysisBase, AthAnalysis and AnalysisTop:
- gitlab-registry.cern.ch/atlas/athena/analysisbase:<release>, e.g. gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247
- gitlab-registry.cern.ch/atlas/athena/athanalysis:<release>, e.g. gitlab-registry.cern.ch/atlas/athena/athanalysis:21.2.247
- gitlab-registry.cern.ch/atlas/athena/analysistop:<release>, e.g. gitlab-registry.cern.ch/atlas/athena/analysistop:21.2.99
Note
The GitLab container registry web interface is not user friendly and is very slow (compared to Docker Hub).
To quickly browse all of the available images from the command line, consider using crane.
Example:
$ crane ls gitlab-registry.cern.ch/atlas/athena/analysisbase | grep 21.2.247
21.2.247-20230316
21.2.247
Example Dockerfile
An ATLAS Dockerfile might look something like this:
FROM gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247
COPY --chown=atlas . /analysis/src
WORKDIR /analysis/build
RUN . /release_setup.sh && \
    cmake ../src && \
    make -j4
USER atlas
When executing docker build . in the root of a repository, Docker will:
- Start from the AnalysisBase 21.2.247 base image
- Copy the full source code of the repository into the path /analysis/src within the container image, and change ownership of the files from root to the default user atlas (see discussion of permissions below)
- Set the default work directory to the cmake build directory /analysis/build (which will be created as it does not exist yet)
- Set up the release and compile the analysis
Permissions Considerations when using ATLAS Images
The ATLAS images all have a default non-root user, atlas. The atlas user is part of the wheel group, which gives it sudo privileges (this is set up in the Dockerfile used to create the ATLAS images).
Even though atlas is the default user of the base image (gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247), files and directories created by instructions such as COPY are owned by root by default. For example, if we create an image with the following Dockerfile
# Dockerfile.permissions
FROM gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247
COPY . /analysis/src
WORKDIR /analysis/build
USER atlas
then build the image using the docker build command and run a container from it, you'll see that root owns the /analysis directory and the copied src directory, and the default atlas user doesn't have permission to write in /analysis:
# Build a docker image using the Dockerfile named Dockerfile.permissions, and call the new image permissions_example
docker build . -f Dockerfile.permissions -t permissions_example
# Start up a container from the newly created permissions_example image which will be removed after we exit from it (--rm).
# Interact with the container via the terminal (-it)
docker run --rm -it permissions_example
# Check ownership and permissions of /analysis dir
[bash][atlas]:build > ls -lh /analysis
total 12K
drwxr-xr-x 2 atlas atlas 4.0K Mar 27 16:58 build
drwxr-xr-x 2 root root 4.0K Mar 27 16:58 src
# Try creating a test file in /analysis dir
[bash][atlas]:build > touch /analysis/test.txt
touch: cannot touch `/analysis/test.txt': Permission denied
That's why we need to explicitly specify in the Dockerfile that, during the COPY step, /analysis/src should be owned by the atlas user:
FROM gitlab-registry.cern.ch/atlas/athena/analysisbase:21.2.247
COPY --chown=atlas . /analysis/src
WORKDIR /analysis/build
USER atlas
Though note that:
The --chown and --chmod features are only supported on Dockerfiles used to build Linux containers, and will not work on Windows containers. Since user and group ownership concepts do not translate between Linux and Windows, the use of /etc/passwd and /etc/group for translating user and group names to IDs restricts this feature to only be viable for Linux OS-based containers.
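The name-to-ID translation mentioned in the quote can be illustrated by doing the same kind of lookup by hand. This is a rough sketch rather than Docker's actual implementation: it reads the /etc/passwd of whatever machine it runs on and looks up root, chosen only because that account exists almost everywhere. Inside an ATLAS image the analogous lookup would be for atlas.

```shell
#!/bin/sh
# Look up the numeric user and group ID recorded for a user name in /etc/passwd.
# Fields are colon-separated: name:password:UID:GID:comment:home:shell
lookup_ids() {
    awk -F: -v user="$1" '$1 == user { print $3 ":" $4; exit }' /etc/passwd
}

# root is used because it exists on practically every Linux system.
ids=$(lookup_ids root)
echo "root resolves to UID:GID $ids"
```

COPY --chown also accepts numeric IDs directly, for example --chown=1000:1000, in which case no lookup in /etc/passwd or /etc/group is needed.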