Manneback

Manneback is a cluster built with hardware acquired progressively through multiple funding sources brought in by CISM users. It is configured, in most respects, just like the CÉCI clusters, so the CÉCI documentation mostly applies to Manneback as well.

Note

The following assumes a basic knowledge of Slurm.

Available hardware

While CÉCI clusters are mostly homogeneous, Manneback is made of several different generations of hardware.

Use the sinfo command to learn about the available hardware on Manneback:

Partitions:
Def* (5days)    keira (5days)   qclong (5days)  pauli (5days)   pelican (5days) cp3 (5days)     gpu (2days)
Nodes:
#Nodes  Partition  CPU                        S:C:T   CPUS  Memory  GPUs
16      cp3        CascadeLake,Xeon,4214      2:24:1  48    187.4G
8       cp3        Genoa,EPYC,9454            1:96:1  96    377.6G
1       cp3        Genoa,EPYC,9454P           1:96:1  96    377.4G
11      cp3        Genoa,EPYC,9454P           1:96:1  96    377.6G
1       cp3        Rome,EPYC,7452             2:32:2  128   503.6G
8       cp3        Rome,EPYC,7452             2:64:1  128   503.5G
8       cp3        Rome,EPYC,7452             2:64:1  128   503.6G
19      cp3        SkyLake,Xeon,4116          2:24:1  48    187.4G
2       cp3        Westmere,Xeon,X5675        2:1:1   2     7.8G
1       Def*       Milan,EPYC,7763            2:64:1  128   1T
1       gpu        CascadeLake,Xeon,5217,Tes  2:8:1   16    377.4G  TeslaV100:2
1       gpu        CascadeLake,Xeon,5217,Tes  2:8:2   32    377.4G  TeslaV100:2
1       gpu        CascadeLake,Xeon,6244,GeF  2:8:2   32    376.4G  GeForceRTX2080Ti:6
1       gpu        Genoa,EPYC,9354P,Tesla,Te  1:32:1  32    377.6G  TeslaL40s:4
1       gpu        Genoa,EPYC,9354P,Tesla,Te  1:32:2  64    377.4G  TeslaL40s:4
1       gpu        IceLake,Xeon,6346,Tesla,T  2:16:2  64    251.3G  TeslaA10:4
1       gpu        Milan,EPYC,7313,Tesla,Tes  2:16:2  64    251.5G  TeslaA100:2
2       gpu        Milan,EPYC,7313,Tesla,Tes  2:16:2  64    251.7G  TeslaA100_80:2
2       gpu        Rome,EPYC,7302,Tesla,Tesl  2:16:2  64    503.6G  TeslaA100:2
1       gpu        Rome,EPYC,7352,GeForce,Ge  2:24:2  96    503.6G  GeForceRTX3090:4
1       gpu        SkyLake,Xeon,6346,TeslaA1  2:16:2  64    251.5G  TeslaA10:4
16      keira      Milan,EPYC,7763            2:64:2  256   1T
2       keira      Rome,EPYC,7742             2:64:2  256   251.5G
4       keira      Rome,EPYC,7742             2:64:2  256   503.6G
1       pauli      Milan,EPYC,7313            2:16:2  64    1T
3       pelican    IceLake,Xeon,6326          2:16:2  64    251.3G
1       qclong     Genoa,EPYC,9354            2:32:1  64    1.5T
4       qclong     Genoa,EPYC,9354            2:32:1  64    377.6G
Filesystems:
Filesystem      quota
$HOME           100G
$CECITRSF       10T
$CECIHOME       100.0G
$GLOBALSCRATCH  unlimited

Multiple CPU and GPU vendors and generations are represented. The CPU column in the above list shows the node features: the CPU code name, the CPU family (Xeon is Intel’s server CPU brand, EPYC is AMD’s) and the CPU reference. For GPU nodes, the column also contains the GPU series and type.

The CPU code name is representative of the generation of the CPU (from older to more recent):

  • Intel: Nehalem > Westmere > SandyBridge > IvyBridge > Haswell > Broadwell > SkyLake > CascadeLake > IceLake
  • AMD: K10 > Zen > Rome > Milan > Genoa

In your submission script, you can select one or more specific features with the --constraint= option, for instance --constraint="Nehalem|Westmere" to choose a compute node with an older CPU. The --constraint option accepts quite complex expressions; see the Slurm documentation for details.
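
For instance, a minimal submission script targeting recent AMD CPUs could look like this (the program name is just a placeholder):

#!/bin/bash
#SBATCH --job-name=cpu_job
#SBATCH --ntasks=1
#SBATCH --time=01:00:00
# Only run on Milan or Genoa nodes (see the hardware list above).
#SBATCH --constraint="Milan|Genoa"

srun ./my_program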

Another distinctive feature of Manneback is that some nodes are equipped with GPUs. They are listed in the last column as a GPU name followed by the number of GPUs in each node.

As for CPUs, multiple GPU generations are available, in chronological order:

  • nVidia: TeslaM10 > TeslaV100 (Volta) > GeForceRTX2080Ti (Turing) > TeslaA100, GeForceRTX3090, TeslaA10 (Ampere) > L40s (AdaLovelace)

The GPUs are considered a “generic resource” in Slurm, meaning that you can reserve GPUs for your job with an option like --gres="gpu:TeslaV100:2" in your submission script; this requests two V100 GPUs. If you do not need a particular type of GPU, you can simply request, for instance, --gres=gpu:1 for a single GPU of any type. The --gres option (documentation) requests a number of GPUs per node. You can also use the --gpus option (documentation) to request a total number of GPUs per job rather than per node.
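
As an illustration, a submission script requesting two V100 GPUs on a single node could contain the following (the program name is a placeholder):

#!/bin/bash
#SBATCH --job-name=gpu_job
#SBATCH --partition=gpu
# Request two TeslaV100 GPUs on the node.
#SBATCH --gres=gpu:TeslaV100:2
#SBATCH --time=1-00:00:00

srun ./my_gpu_program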

The --gres and --gpus options do not allow as much flexibility as --constraint does. Therefore, we also define features related to GPUs, such as Tesla, Tesla100, GeForce, etc. You can thus specify

--gpus=1
--constraint=Tesla

to request one Tesla GPU if you do not care about the difference between a TeslaV100 and a TeslaA100.

Warning

Beware that the demand for GPUs is very high, so jobs that request a GPU but let it sit idle for long periods will be cancelled.

As GPU resources are much scarcer than CPU resources, GPUs impact the fairshare much more than CPUs, depending on the performance of the GPU and the ratio of CPUs to GPUs on the compute nodes.

Partitions

The nodes are organised into partitions, which are logical sets of nodes with either similar hardware or similar policies/configurations. Multiple CPU partitions are available on Manneback:

  • Def, the default partition, open to everyone;
  • cp3, open to everyone, but CP3 users have a higher priority there;
  • keira, open to everyone, but NAPS users have a higher priority there;
  • pelican, also open to everyone, but mainly intended for ELIC users.

The GPU nodes are grouped into the gpu partition. Note that you can specify #SBATCH --partition=Def,cp3 to submit a job that will run on whichever of those partitions can accommodate it first, as in the example below.
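
A minimal sketch (resource values are arbitrary placeholders):

#!/bin/bash
# Let the job start on either the Def or the cp3 partition.
#SBATCH --partition=Def,cp3
#SBATCH --ntasks=4
#SBATCH --mem-per-cpu=2G
#SBATCH --time=12:00:00

srun ./my_program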

Quality of Service

A Quality Of Service (QOS) is a parameter that a job can request to obtain specific privileges, for instance a higher priority or a longer run time. A QOS is typically granted to departments/groups who participate in the funding of the Manneback hardware.

Currently, the following QOS’es are defined and available on the respective partitions for the listed groups:

QOS          Type                                                     Partition  Group
cp3          higher priority / longer jobs                            cp3        CP3
keira        higher priority                                          keira      MODL
interactive  higher priority, but limited number/duration of jobs    gpu        ELEN,MIRO,CP3
preemptible  longer jobs, more jobs, but preemptible (killable) jobs  gpu        Everyone

A QOS is chosen in Slurm with the --qos= option. For instance, to use the interactive QOS, add #SBATCH --qos=interactive to your submission script.

Note

Beware that QOS’es are not automatically granted to users; users must request access to them explicitly by email to the administrators.

The cp3 and keira QOS’es bring a higher priority for jobs on the partitions that their corresponding groups funded.

The configuration of the GPU partition is specific: it was designed to make interactive jobs and production jobs run together as smoothly as possible on that partition.

By default, a job submitted to the GPU partition is subject to the following restrictions: at most two running jobs at any time, each with a maximum duration of 2 days. To run more jobs and/or longer jobs, users must use the preemptible QOS, which allows up to 30 jobs of up to 5 days each. The drawback is that those jobs are preemptible, meaning that they can be stopped by Slurm to let a higher-priority job run immediately. As a consequence,

Warning

jobs submitted with the preemptible QOS must then be checkpoint/restart-enabled.
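
As a sketch, a preemptible job that restarts from its latest checkpoint could be structured as follows. The --requeue option asks Slurm to put the job back in the queue after preemption; the checkpoint logic is a hypothetical placeholder that depends entirely on your application:

#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --qos=preemptible
#SBATCH --gres=gpu:1
#SBATCH --time=5-00:00:00
# Requeue the job after preemption and append to the same output file.
#SBATCH --requeue
#SBATCH --open-mode=append

# Resume from the latest checkpoint if one exists (application-specific).
if [ -f checkpoint.dat ]; then
    srun ./my_program --resume checkpoint.dat
else
    srun ./my_program
fi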

The interactive QOS allows one 9-hour job per user with a very high priority, intended for interactive work during the day.
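
For example, an interactive shell on a GPU node could be requested with a command along these lines (resource values are examples):

srun --partition=gpu --qos=interactive --gres=gpu:1 --time=08:00:00 --pty bash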

Disk space

Every user has access to a home directory with a 100GB quota.

A global scratch space, $GLOBALSCRATCH (/globalscratch), is available, offering 200 TB NFS-mounted on all the compute nodes. There is no quota enforced on that filesystem; it is the responsibility of the users to remove files that are no longer needed by any job. This space is cleaned periodically.

Warning

The scratch space is not suitable for long-term storage of data.
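
As a hint, assuming $GLOBALSCRATCH points to your own directory on that filesystem, you can review, and then remove, files that have not been accessed for, say, 30 days with commands like:

# List candidate files first, then delete them once you are sure.
find $GLOBALSCRATCH -type f -atime +30
find $GLOBALSCRATCH -type f -atime +30 -delete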

Each node furthermore has a local /scratch space. The local scratch is a smaller filesystem directly attached to the compute node. It is created at the beginning of a job and deleted automatically at the end. Here again, no quota is enforced, but the available space is limited by the hardware.
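
A typical pattern is to stage the input data to the local scratch at the start of the job, compute there, and copy the results back before the job ends. A minimal sketch, assuming the per-job directory is exposed as $LOCALSCRATCH (check the actual variable name on the cluster):

#!/bin/bash
#SBATCH --time=12:00:00

# Stage input to the fast node-local scratch ($LOCALSCRATCH is an assumption).
cp input.dat $LOCALSCRATCH/
cd $LOCALSCRATCH

srun $HOME/bin/my_program input.dat

# Copy results back before the local scratch is wiped at the end of the job.
cp results.dat $SLURM_SUBMIT_DIR/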

Connection and file transfer

Clusters are accessed with your CÉCI login and SSH key; refer to the CÉCI documentation for more details. Access to Manneback is achieved by pointing your SSH client to manneback.cism.ucl.ac.be. If you are using the command-line SSH client, do not hesitate to use the SSH configuration wizard: Manneback will be configured automatically for you.
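
If you prefer to set up SSH manually, an entry along these lines in your ~/.ssh/config would work (the identity file path follows the usual CÉCI convention; adapt it to where your key actually lives):

Host manneback
    Hostname manneback.cism.ucl.ac.be
    User your_ceci_login
    IdentityFile ~/.ssh/id_rsa.ceci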

First connection warning

When you connect for the first time, the SSH client will ask you to confirm the identity of the remote host. You can do so by verifying that the fingerprint shown to you is among the ones listed below:

Frontend 1:

  • SHA256:Q3IYMwb5QElBkqmVbJyi8UgFoyKZMZQsWRRU3CEvV8s
  • SHA256:iR1HQsjGvKxo4uwswD/xLepW6DA3e45jUbNEZTntWRc
  • SHA256:i2Hb6HDaeMz6h99/qHu3lIqGUX6Zrx8Yuz0ELTQzsjc

Frontend 2:

  • SHA256:Q3IYMwb5QElBkqmVbJyi8UgFoyKZMZQsWRRU3CEvV8s
  • SHA256:iR1HQsjGvKxo4uwswD/xLepW6DA3e45jUbNEZTntWRc
  • SHA256:i2Hb6HDaeMz6h99/qHu3lIqGUX6Zrx8Yuz0ELTQzsjc
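
If you dismissed the prompt and want to check the fingerprints afterwards, one way to display the fingerprints of the keys offered by the server is (run from a machine and network you trust):

ssh-keyscan manneback.cism.ucl.ac.be 2>/dev/null | ssh-keygen -lf -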

User self-assessment test

Users must pass a simple test to use Manneback. The test is very short (5 questions) and easy; 5 minutes should be enough. It exists only to make sure everyone uses the cluster in the intended way and does not harm other users’ experience on the cluster. New users have a few weeks to pass the test voluntarily before it is enforced upon login. This is recalled at every login in the message of the day.

To start the test, simply run selftest.py. Once you have answered the questions correctly, the message will disappear. As a hint, each question contains a link to the corresponding section of the documentation.