RECAST Checklist
This checklist is intended to help analyzers quickly check whether their RECAST setup is complete, and to point them to useful resources for completing each sub-task. Current recommendations are denoted with the 👍 symbol.
Is all of my analysis code preserved on GitLab?
- Is my event selection code maintained in one or more GitLab project(s)?
- Is my statistical interpretation code maintained in one or more GitLab project(s)?
Your analysis code (i.e. all code needed to get from DxAODs to the final statistical interpretation and limit-setting) should be preserved in one or more GitLab projects. How you structure your GitLab projects is entirely up to you.
👍 If your code relies on other GitLab/GitHub projects, the recommendation is to include these other projects as submodules in your project(s).
Is/are my analysis environment(s) fully preserved in docker image(s)?
- Do I have a Dockerfile in each of my GitLab projects?
- Do I have a .gitlab-ci.yml file with an image-building stage in each of my GitLab projects?
All aspects of the computing environment(s) needed to run your analysis code should be fully preserved in one or more docker images. This includes pre-compiling any executables used to run your analysis.
ATLAS maintains version-controlled images which encapsulate the AthAnalysis, AnalysisBase and AnalysisTop environments on the public docker image registry DockerHub. These images can be used as a base on top of which you can build the image(s) for your event selection environment. See the
👍 We recommend using the image-building functionality in GitLab CI to automatically re-build your GitLab project's docker image each time new code is pushed to the project. This is achieved by adding a Dockerfile to each separate GitLab project for your analysis and adding a .gitlab-ci.yml file (or modifying an already-existing one) with an image-building stage. See Image Building with CI for more details.
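The two file-level checklist items above can be checked mechanically. The following is a minimal sketch, assuming each GitLab project is cloned into a local directory; the function name `missing_files` is hypothetical and not part of any RECAST tooling. It only checks that the files exist, not that the .gitlab-ci.yml actually contains an image-building stage.

```python
from pathlib import Path

# File names taken from the checklist items above.
REQUIRED_FILES = ("Dockerfile", ".gitlab-ci.yml")


def missing_files(project_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from project_dir.

    Hypothetical helper for illustration; it checks presence only and
    does not inspect the contents of either file.
    """
    root = Path(project_dir)
    return [name for name in required if not (root / name).is_file()]
```

Calling `missing_files("path/to/my-analysis")` on a local clone returns an empty list when both files are present.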
Have I preserved and automated the process of passing a new signal through my analysis chain (a.k.a. my RECAST workflow)?
- Have I asked the analysis preservation contact to make a dedicated project for my analysis in the central recast-atlas group, which will contain my workflow specification files?
To help keep analysis workflows in a centralized location, all analyses are encouraged to maintain the spec files for their workflow (e.g. steps.yml, workflow.yml; see the checklist items below for details) in a dedicated GitLab project for the analysis, which will be located in the recast-atlas group area (https://gitlab.cern.ch/recast-atlas).
To set this up, email the analysis preservation contact (atlas-sw-analysis-preservation-contacts@cern.ch), and let them know that you need a GitLab project in the recast-atlas group for your analysis workflow. Please include your analysis reference code (e.g. ANA-EXOT-2020-04). Also let them know which users/groups should be added as maintainers to the project (this would typically be any users who plan to contribute to developing the RECAST workflow).
The analysis preservation contact will create a new project under the appropriate group name (e.g. recast-atlas/exotics), and the name of the project will be your analysis' reference code. For our example, the new GitLab project to contain the workflow specs will be located at https://gitlab.cern.ch/recast-atlas/exotics/ANA-EXOT-2020-04.
- Does the project contain a steps.yml file and a workflow.yml file to encode the workflow?
The RECAST workflow (i.e. the passage of an arbitrary new signal through the analysis chain to the final statistical interpretation) is encoded with a steps.yml file and a workflow.yml file. The steps.yml encodes how each step in the analysis chain is executed, and in which environment (i.e. docker container) the steps need to be run - see Defining Steps for more details on authoring steps.
The workflow.yml links the steps together by specifying how the output of each step feeds in as the input to subsequent steps - see Defining Workflows for more details on workflow authoring.
- Does the workflow project contain a recast.yml file?
👍 We recommend writing a recast.yml file to encode unit tests and full run configurations for developing, testing and running your workflow. See Testing Steps for more details on writing tests of individual workflow steps, and Running Workflows for more details on writing workflow tests and full run configurations.
- Have I asked the analysis preservation contact to make a dedicated directory to store input files for my analysis' RECAST workflow in the RECAST project area on eos?
There is a project area on eos dedicated to storing any input files, such as signal DxAODs and cross section files, needed for analyses to run their RECAST workflows:
/eos/project/r/recast/atlas
To keep the inputs to RECAST workflows centralized, analyses are encouraged to store any input files needed by their RECAST workflows in a dedicated directory for their analysis, which will be located in the /eos/project/r/recast/atlas project area.
To set this up, email the analysis preservation contact (atlas-sw-analysis-preservation-contacts@cern.ch), and let them know that you need a directory to be created for your analysis in the RECAST project area. Please include your analysis reference code (e.g. ANA-EXOT-2020-04). Also specify which users, groups and/or service accounts should be given read/write access to the directory.
The analysis preservation contact will make a new directory in this area to contain the workflow input files for your analysis and will give the specified users, groups and service accounts read/write access to it. The name of the directory will be your analysis' reference code (so the full path to the directory would be e.g. /eos/project/r/recast/atlas/ANA-EXOT-2020-04).
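The naming conventions described in this checklist (the eos directory here, and the GitLab workflow project under the recast-atlas group) can be captured in a small helper. This is a sketch only: the function names are hypothetical, and the regular expression for the reference code is an assumption inferred from the single example ANA-EXOT-2020-04, not an official specification.

```python
import re

# Both locations are quoted from the checklist text.
GITLAB_GROUP_URL = "https://gitlab.cern.ch/recast-atlas"
EOS_PROJECT_AREA = "/eos/project/r/recast/atlas"

# ASSUMPTION: pattern guessed from the example "ANA-EXOT-2020-04".
REFERENCE_CODE = re.compile(r"ANA-[A-Z0-9]+-\d{4}-\d+")


def _validated(reference_code):
    """Raise ValueError unless the code matches the assumed pattern."""
    if not REFERENCE_CODE.fullmatch(reference_code):
        raise ValueError(f"unexpected reference code: {reference_code!r}")
    return reference_code


def eos_directory(reference_code):
    """Path of the analysis' input directory in the RECAST eos area."""
    return f"{EOS_PROJECT_AREA}/{_validated(reference_code)}"


def workflow_project_url(subgroup, reference_code):
    """URL of the workflow project; subgroup is e.g. 'exotics'."""
    return f"{GITLAB_GROUP_URL}/{subgroup}/{_validated(reference_code)}"


print(eos_directory("ANA-EXOT-2020-04"))
print(workflow_project_url("exotics", "ANA-EXOT-2020-04"))
```

The subgroup is passed in explicitly because the checklist does not state how group names are derived from reference codes.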
- Have I moved my workflow input files to the dedicated directory for the analysis in the central RECAST project area on eos?
Any user with read/write permission to the directory created for the analysis in the central RECAST project area on eos (see above checklist item for details) can copy workflow input files to this directory.
In your workflow, you can then use xrootd to download the input files from this directory on eos into the docker container when they're needed by the workflow:
Link to instructions for setting up automated Kerberos authentication
Link to example of using xrootd to download files from central RECAST project area on eos
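As a rough illustration of the download step, the sketch below builds an `xrdcp` command for a file in the analysis' eos directory. Several things here are assumptions: the function names are hypothetical, the XRootD host is deliberately left as a required argument because this checklist does not name it, and the file name in the usage note is invented. Actually running the command additionally requires the `xrdcp` client and valid Kerberos credentials inside the container.

```python
import subprocess

# Quoted from the checklist text; note the leading slash.
EOS_PROJECT_AREA = "/eos/project/r/recast/atlas"


def xrootd_url(host, reference_code, filename):
    """Build a root:// URL for a file in the analysis' eos directory.

    XRootD URLs take the form root://<host>//<absolute path>; the
    leading slash of EOS_PROJECT_AREA supplies the second slash.
    """
    return f"root://{host}/{EOS_PROJECT_AREA}/{reference_code}/{filename}"


def xrdcp_command(host, reference_code, filename, destination="."):
    """Return the argument list for copying the file with xrdcp."""
    return ["xrdcp", xrootd_url(host, reference_code, filename), destination]


def download(host, reference_code, filename, destination="."):
    """Run xrdcp; raises CalledProcessError if the copy fails."""
    subprocess.run(
        xrdcp_command(host, reference_code, filename, destination),
        check=True,
    )
```

For example, `xrdcp_command("<eos-host>", "ANA-EXOT-2020-04", "signal.root")` yields the command without executing it, which is convenient for checking the URL before embedding it in a workflow step.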
- Does the workflow project contain a .gitlab-ci.yml file with a stage to automatically run the workflow using the recast-atlas service account?
Workflows can be run either locally or using GitLab CI in the repository containing the workflow specification, as documented in Running Workflows.
We suggest developing your workflow locally for efficiency. In addition, GitLab CI should run the workflow automatically whenever updates are pushed to the repository containing the workflow specs. This will:
1. fully preserve the procedure of running the workflow, and
2. automate the testing of any future updates to the workflow.
See Running Workflows Within CI for more details on how to set this up.
GitLab projects located in the recast-atlas group area have access to the global CI variables RECAST_USER, RECAST_PASS and RECAST_TOKEN for the official recast-atlas service account. This service account has access to all files stored in the RECAST project area on eos (/eos/project/r/recast/atlas), and to all image registries for GitLab projects with either public or CERN internal permissions. These global CI variables can be overridden by setting them to different values for the project, but in the end your workflow should be able to run to completion using the recast-atlas service account credentials (i.e. using the global RECAST_USER, RECAST_PASS and RECAST_TOKEN variables). This will ensure that your workflow can be run in the future by anyone using the recast-atlas service account.
- If there are any workflow inputs that the recast-atlas account cannot access, please either move them to the dedicated space for your analysis in the RECAST project area on eos, or get in touch with the analysis preservation contact (atlas-sw-analysis-preservation-contacts@cern.ch) to discuss alternatives.
- If there are any images that the recast-atlas account cannot access, please either update the permissions of the GitLab project used to create the images to Internal, or get in touch with the analysis preservation contact to discuss alternatives.
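A CI job that depends on the service account credentials can fail early with a clear message if any of the three variables is unset. The sketch below is one possible way to do that; the function name is hypothetical, and it only checks that the variables are non-empty. It cannot tell whether they hold the global recast-atlas values or project-level overrides.

```python
import os

# Variable names quoted from the checklist text.
RECAST_VARIABLES = ("RECAST_USER", "RECAST_PASS", "RECAST_TOKEN")


def unset_variables(environ=None, names=RECAST_VARIABLES):
    """Return the names that are missing or empty in the environment.

    environ defaults to os.environ; passing a plain dict makes the
    function easy to exercise outside of CI.
    """
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
print(unset_variables({"RECAST_USER": "someone"}))
```

In a CI script, a non-empty result from `unset_variables()` would be a reason to exit with an error before the workflow tries to authenticate.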
Last update:
March 31, 2022