A few notes on the use of Singularity on the Trantor cluster

Singularity is a container platform designed to run complex applications on HPC clusters in a simple, portable and reproducible way, without privilege escalation. You can build a container with Singularity on your workstation and then run it on the compute nodes of the Trantor cluster.

At the following URL you can find a very good tutorial about Singularity:
https://singularity-tutorial.github.io/

You are also encouraged to read the official user guide:
https://sylabs.io/guides/3.6/user-guide/index.html

Currently, you can run Singularity containers on the following nodes:

  • Daneel
  • Hal
  • Helicon
  • Hypnos
  • Oromasdes03

Please note that, for technical and security reasons, you are not allowed to build containers on the cluster. At this time, the suggested workflow is the following (a sketch of these steps is shown after the list):

  1. Build the container on your workstation as a SIF file.
  2. Copy the SIF file to your home directory on Trantor (e.g. by using scp or sftp).
  3. Run the container on compute nodes by submitting a PBS job.
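
For reference, here is a minimal sketch of the three steps above. The definition file name, image name, login node address and PBS resource request are placeholders and must be adapted to your application and to the actual Trantor configuration.

On your workstation, build the SIF image (building requires root privileges or the --fakeroot option):

sudo singularity build mycontainer.sif mycontainer.def

Copy the image to your home directory on Trantor:

scp mycontainer.sif <your_username>@<trantor_login_node>:~/

On Trantor, write a minimal PBS job script (e.g. run_container.pbs):

#!/bin/bash
#PBS -N singularity_test
#PBS -l select=1:ncpus=4

cd "$PBS_O_WORKDIR"
singularity run ~/mycontainer.sif

and submit it with:

qsub run_container.pbs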

Due to known incompatibilities with NFS, you should not convert SIF images to "sandboxes" on the Trantor cluster (e.g. via the singularity build --sandbox or the singularity run --writable commands). Running SIF images directly from NFS locations (such as your home directory) is the recommended and safest approach.

CUDA-based containers can be run only on the daneel01-03 nodes (the only nodes equipped with GPUs). To this end, you need to pass the "--nv" flag to the singularity shell or singularity run commands. Also, do not forget to reserve one or more GPUs by explicitly requesting "ngpus" resources when submitting the job (for details, see: Submitting, inspecting and cancelling PBS Jobs).
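
As a reference, the following is a minimal sketch of a PBS job script that requests one GPU and runs a CUDA-based container with the "--nv" flag. The image name and the resource selection line are placeholders; please refer to the PBS documentation page mentioned above for the exact syntax in use on Trantor.

#!/bin/bash
#PBS -N cuda_container
#PBS -l select=1:ncpus=2:ngpus=1

cd "$PBS_O_WORKDIR"

# Quick sanity check: the reserved GPU should be visible inside the container.
singularity exec --nv ~/my_cuda_app.sif nvidia-smi

# Run the containerized application.
singularity run --nv ~/my_cuda_app.sif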

To create a container for a CUDA-based application, it is recommended to start from the CUDA Docker images provided by NVIDIA: https://hub.docker.com/r/nvidia/cuda .
Pay close attention to choosing an image whose CUDA version is equal to, or lower than, the one supported by the drivers currently installed on the Daneels. You can find the maximum CUDA version supported by the drivers by running the nvidia-smi command on a Daneel node. Then, to find the tag name associated with the required CUDA image, click on the "Tags" tab at the top of the https://hub.docker.com/r/nvidia/cuda web page and filter the list by entering the desired CUDA version in the search box. You can finally download the image with the following command:

singularity pull docker://nvidia/cuda:tag_name

As an example, to download the image with the "devel" flavour of CUDA 10.2 based on Ubuntu 18.04, execute the following command:

singularity pull docker://nvidia/cuda:10.2-devel-ubuntu18.04
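
Instead of pulling the base image directly, you can also use it as the starting point of your own definition file and build the final image on your workstation. The following is only a rough sketch: the file names, the installed packages and the runscript are hypothetical and must be replaced with those of your actual application.

Example definition file (cuda_app.def):

Bootstrap: docker
From: nvidia/cuda:10.2-devel-ubuntu18.04

%post
    # Install the build tools and dependencies needed by your application (placeholder).
    apt-get update && apt-get install -y --no-install-recommends build-essential
    rm -rf /var/lib/apt/lists/*

%runscript
    # Command executed by "singularity run" (placeholder).
    exec my_cuda_app "$@"

Build command (to be run on your workstation):

sudo singularity build my_cuda_app.sif cuda_app.def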

Please note that support for CUDA-based containers should be considered experimental. If you find any problems, please report them to hpcstaff@sns.it