1 of 44

Introduction to HPC: basic level


2 of 44

Outline (1/2)

  • Introduction
    • INCD
    • INCD Computing infrastructure
    • Computing farm structure
    • Authentication mechanism
    • Documentation
  • How to access the infrastructure
    • What is ssh & why use it
    • Generate ssh keys

  • Useful information
    • Linux commands
    • Best practices


3 of 44

Outline (2/2)

  • Load software
    • Using pre-compiled software
  • How to submit a simple job
    • Single and multiple CPU job
    • GPU job
  • How to submit an MPI job
    • Using OpenMPI


4 of 44

Introduction: INCD Computing infrastructure

5 of 44

INCD - Infraestrutura Nacional de Computação Distribuída (National Distributed Computing Infrastructure)

  • INCD is a digital infrastructure:
    • Technical coordination by LIP
    • Goals:
      • Provide computing and data services for the research community.
      • Computing services:
        • Cloud
        • HTC and HPC (farm)


6 of 44

Computing Farm:

Applications run on a compute node:

  • Linux (CentOS 7)
  • You do not have direct access to the compute nodes
  • Access goes through the scheduler (Slurm)
  • Storage is accessible from both the submission node and the compute nodes


[Diagram: jobs are submitted on the submission node (pauli.ncg.ingrid.pt), passed to the Slurm scheduler for scheduling, and executed on the compute nodes; storage is shared by the submission node and the compute nodes.]

7 of 44

Computing Farm Advantages:

  • Applications with limited execution time (minutes to weeks).
  • “Almost” static environment:
    • The operating system, software, compilers and libraries are deployed by the system administrators. Users can ask the sys admins to deploy specific software, libraries or compilers for their needs.
  • Applications are scheduled into a queue for execution.
  • Resources are shared with other users.


8 of 44

How to access the infrastructure

9 of 44

What is ssh & why use it

  • Provides a secure way to access remote resources over an unsecured network;
  • It is a secure alternative to older login protocols such as telnet or ftp;
  • Enables encrypted sessions: only the endpoints are able to decrypt the messages exchanged between them;
  • Based on a pair of asymmetric encryption keys, known as the “public key” and the “private key”, respectively.


10 of 44

Generate ssh keys

This example assumes a Linux machine.

Generate an RSA key pair on your local machine with the following command and answer the prompts:

  • Press Enter on “Enter file in which to save the key”;
  • On “Enter passphrase” type a secure passphrase (NEVER USE AN EMPTY PASSPHRASE!);
  • Re-enter the passphrase on “Enter same passphrase again”.


$ ssh-keygen -t rsa -b 4096 -C "your_email"

Generating public/private rsa key pair.

Enter file in which to save the key (/user/.ssh/id_rsa):

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

11 of 44

Generate ssh keys (cont.)

This will create two files in the directory “~/.ssh”: “id_rsa” and “id_rsa.pub”. The first is your private key and should be kept away from outsiders; the second is the public key, which we will use to grant you access to remote systems.



$ ls -l ~/.ssh

-rw------- 1 user group 3247 Sep 8 11:16 id_rsa

-rw-r--r-- 1 user group 752 Sep 8 11:16 id_rsa.pub
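To grant access, the remote side needs the content of the public key file; you can simply print it and copy the single line it contains (the key shown here is truncated, just to illustrate the format):

$ cat ~/.ssh/id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2E... your_email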

12 of 44

Generate ssh keys (cont.)

You can also use the Ed25519 algorithm instead of RSA; it offers roughly the same security with shorter keys. The command would be:

The private key will be stored in the file “id_ed25519” and the public key in the file “id_ed25519.pub”.


$ ssh-keygen -t ed25519 -C "your_email"

$ ls -l ~/.ssh

-rw------- 1 user group 3247 Sep 8 11:16 id_ed25519

-rw-r--r-- 1 user group 752 Sep 8 11:16 id_ed25519.pub

13 of 44

Access to INCD in Lisbon: Cirrus-A

Access the INCD advanced computing facility at Lisbon with a ssh session:

We use a new domain name, a.incd.pt, but the old domain name, ncg.ingrid.pt, is still valid, so you can also log in with the command:

The cirrus.a.incd.pt user interfaces are CentOS 7.9.2009 servers, like the cluster worker nodes, but note that they have a different architecture and some applications may not behave as expected on the user interface. Two big differences are that the InfiniBand network and the GPUs are not available on the user interfaces. If you need to test an application interactively, start an interactive session, as shown later.


$ ssh -l "username" cirrus.a.incd.pt

$ ssh -l "username" cirrus.ncg.ingrid.pt
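Optionally, you can save typing by adding an entry to “~/.ssh/config” on your local machine; the alias, username and key file below are placeholders to adapt:

$ cat ~/.ssh/config
Host cirrus
    HostName cirrus.a.incd.pt
    User your_username
    IdentityFile ~/.ssh/id_rsa

$ ssh cirrus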

14 of 44

How to use Software @INCD cluster

15 of 44

Available software

INCD provides an extensive list of pre-compiled applications and tools, made available through the Environment Modules tool, which enables easy management of the Unix shell environment.

Check the available list with the following command (the list is extensive):

To make browsing easier, we also provide lists of environments relevant to some of the tutorial modules.


[user@cirrus01 ~]$ module avail

------------------------------------ /cvmfs/sw.el7/modules/hpc ---------------------------

DATK gcc63/ngspice/34 libs/32/jemalloc/5.3.0

FigTree/1.4.4 gcc63/openmpi/1.10.7 libs/blas/3.9.0 ….

--------------------------- /cvmfs/sw.el7/modules/tut/module_3 ----------------------

FigTree/1.4.4 Tracer/1.7.2 gcc83/MrBayes/3.2.7a
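The full list is long, so it helps to filter it by name and to inspect what a given module sets before loading it; for example, with the GROMACS module used later in this tutorial:

[user@cirrus01 ~]$ module avail gromacs
[user@cirrus01 ~]$ module show intel/gromacs/2021.5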

16 of 44

Load software

In this example we’ll load the GROMACS application, version 2021.5:


[user@cirrus01 ~]$ which gmx

/usr/bin/which: no gmx in (/usr/condabin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin)

[user@cirrus01 ~]$ env | grep GMX

[user@cirrus01 ~]$ module load intel/gromacs/2021.5

[user@cirrus01 ~]$ which gmx

/cvmfs/sw.el7/ar/ix_5400/i20/gromacs/2021.5/b01/bin/gmx

[user@cirrus01 ~]$ env | grep GMX

GMXMAN=/cvmfs/sw.el7/ar/ix_5400/i20/gromacs/2021.5/b01/share/man

GMXDATA=/cvmfs/sw.el7/ar/ix_5400/i20/gromacs/2021.5/b01/share/gromacs

GMXBIN=/cvmfs/sw.el7/ar/ix_5400/i20/gromacs/2021.5/b01/bin

GMXLDLIB=/cvmfs/sw.el7/ar/ix_5400/i20/gromacs/2021.5/b01/lib64

17 of 44

List loaded software

We can list the loaded environments with the following command:

Note that the GROMACS Intel compiler dependency was loaded automatically. This may not happen in every case; you should always check that the environment includes all the needed dependencies and, if not, load the missing modules.


[user@cirrus01 ~]$ module list

Currently Loaded Modules:

1) intel/oneapi/2021.3 2) intel/gromacs/2021.5

18 of 44

Unload software

The command “unload” removes the target environment from the shell:

We can use the command “purge” to unload all loaded modules:


[user@cirrus01 ~]$ module list

Currently Loaded Modules:

1) intel/oneapi/2021.3 2) intel/gromacs/2021.5

[user@cirrus01 ~]$ module unload intel/gromacs/2021.5

[user@cirrus01 ~]$ module list

[user@cirrus01 ~]$ module list

Currently Loaded Modules:

1) intel/gromacs/2021.5   2) intel/oneapi/2022.1   3) intel/openmpi/4.0.3   4) intel/hdf5/1.12.0   5) intel/openfoam/2112

[user@cirrus01 ~]$ module purge

[user@cirrus01 ~]$ module list

19 of 44

How to submit a simple CPU job

20 of 44

How to submit a simple CPU job

We will start with a simple “hello” test job using one “core”.

  • Copy the hello directory from the support material directory:

  • Examine the content of the hello.sh script:


[user@cirrus01 ~]$ cp -r /data/tutorial/modulo0/hello .

[user@cirrus01 ~]$ cd hello

[user@cirrus01 hello]$ ls -l

-rw-r-----+ 1 user group 322 Sep 8 19:09 hello.sh

[user@cirrus01 hello]$ cat hello.sh

#!/bin/bash

#SBATCH -p short                 # submit to the "short" partition

#SBATCH --ntasks-per-node=1      # one task per node

#SBATCH --nodes=1                # one node

echo "Hello, ready to business"

21 of 44

How to submit a simple CPU job (cont.)

  • Submit the script with the command “sbatch” and check the job state with “squeue”:

  • The job is pending (ST=PD) while it waits for resources; eventually it starts running and the status changes to running (ST=R). We can also see the name of the node where it runs, hpc060:


[user@cirrus01 hello]$ sbatch hello.sh

Submitted batch job 6956063

[user@cirrus01 hello]$ squeue

JOBID PARTITION NAME USER ST TIME NODES CPUS TRES_PER_NODE NODELIST

6956063 short hello.sh user PD 0:00 1 1 N/A

JOBID PARTITION NAME USER ST TIME NODES CPUS TRES_PER_NODE NODELIST

6956063 short hello.sh user R 0:00 1 1 N/A hpc060
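While the job is queued or running, a few standard Slurm commands help to follow it and, if needed, cancel it (the job id is the one from this example):

[user@cirrus01 hello]$ squeue -u $USER             # show only your own jobs
[user@cirrus01 hello]$ scontrol show job 6956063   # detailed job information
[user@cirrus01 hello]$ scancel 6956063             # cancel the job if necessary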

22 of 44

How to submit a simple CPU job (cont.)

  • Once the job completes it disappears from the “squeue” output and you will find a new file in the current directory, named slurm-“jobid”.out:

  • Examine the output file “slurm-6956063.out”: the header (lines starting with ‘*’) provides a summary of the job execution environment, followed by the job’s standard output:



[user@cirrus01 hello]$ ls -l

-rw-r-----+ 1 user group 322 Sep 8 19:09 hello.sh

-rw-r-----+ 1 user group 922 Sep 8 19:21 slurm-6956063.out

[user@cirrus01 hello]$ cat slurm-6956063.out

* JOB_NAME : hello.sh

* JOB_ID : 6956063 …

Hello, ready to business
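After the job has finished, its record can still be queried from the accounting database with “sacct” (also listed in the useful commands at the end); the format fields below are standard Slurm ones:

[user@cirrus01 hello]$ sacct -j 6956063 --format=JobID,JobName,Partition,State,Elapsed,MaxRSS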

23 of 44

How to submit a GROMACS CPU job

24 of 44

How to submit a GROMACS CPU job

We will run a protein analysis using one MPI instance on one CPU “core”:

  • Copy the grom-cpu-1 directory from the support material directory and examine the submission script:

  • Note the option “-maxh .05” on the “gmx_mpi mdrun” command; we limit the execution time to give all participants a chance to run the tests.


[user@cirrus01 ~]$ cp -r /data/tutorial/modulo0/grom-cpu-1 .

[user@cirrus01 ~]$ cd grom-cpu-1

[user@cirrus01 grom-cpu-1]$ ls -l

-rw-r----- 1 user group 381 Sep 8 19:09 grom-cpu-1.sh

-rw-r----- 1 user group 1133028 Sep 8 19:59 md.tpr

[user@cirrus01 grom-cpu-1]$ cat grom-cpu-1.sh
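The script content is not reproduced on the slide; a minimal sketch of what grom-cpu-1.sh might look like, based on the module and options mentioned on these slides (treat it as an illustration, not the exact course script):

#!/bin/bash
#SBATCH --partition=short
#SBATCH --nodes=1
#SBATCH --ntasks=1

module load intel/gromacs/2021.5

# one MPI rank; -maxh .05 limits the run to about 3 minutes of wall time
srun gmx_mpi mdrun -s md.tpr -maxh .05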

25 of 44

How to submit a GROMACS CPU job (cont.)

  • Submit grom-cpu-1.sh and wait for completion; it will take about 3 minutes:

  • On completion you will find all the GROMACS output files in the current directory:


[user@cirrus01 grom-cpu-1]$ sbatch grom-cpu-1.sh

Submitted batch job 6959174

[user@cirrus01 grom-cpu-1]$ squeue

JOBID PARTITION NAME USER ST TIME NODES CPUS TRES_PER_NODE NODELIST

6959174 short grom-cpu-1 user R 0:05 1 1 N/A hpc060

[user@cirrus01 grom-cpu-1]$ ls -l

-rw-r-----+ 1 user group 2160 Sep 9 09:55 ener.edr

-rw-r-----+ 1 user group 401 Sep 8 14:14 grom-cpu-1.sh

-rw-r-----+ 1 user group 28180 Sep 9 09:55 md.log

-rw-r-----+ 1 user group 1133028 Sep 8 14:13 md.tpr

-rw-r-----+ 1 user group 4941 Sep 9 09:55 slurm-6959174.out

-rw-r-----+ 1 user group 835180 Sep 9 09:55 state.cpt

-rw-r-----+ 1 user group 126728 Sep 9 09:48 traj_comp.xtc

-rw-r-----+ 1 user group 833184 Sep 9 09:48 traj.trr

26 of 44

How to submit a GROMACS CPU job (cont.)

  • The bottom of md.log shows the job performance:

  • We can now run on more than one CPU “core” and check the performance increase.


[user@cirrus01 grom-cpu-1]$ less md.log

Core t (s) Wall t (s) (%)

Time: 180.630 180.631 100.0

(ns/day) (hour/ns)

Performance: 5.789 4.146

Finished mdrun on rank 0 Fri Sep 9 10:36:39 2022

27 of 44

How to submit a GROMACS MPI job (cont.)

  • Copy the grom-cpu-4 directory from the support material directory and submit:

  • When it completes we can check the increase in job performance (about four times):


[user@cirrus01 ~]$ cp -r /data/tutorial/modulo0/grom-cpu-4 .

[user@cirrus01 ~]$ cd grom-cpu-4

[user@cirrus01 grom-cpu-4]$ sbatch grom-cpu-4.sh

[user@cirrus01 grom-cpu-4]$ less md.log

Core t (s) Wall t (s) (%)

Time: 731.051 182.764 400.0

(ns/day) (hour/ns)

Performance: 19.856 1.209

Finished mdrun on rank 0 Fri Sep 9 10:45:53 2022
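The grom-cpu-4.sh script is not shown either; presumably it differs from the single-core version mainly in the number of MPI tasks requested, along these lines (an assumption, not the exact script):

#SBATCH --nodes=1
#SBATCH --ntasks=4                     # four MPI ranks instead of one

srun gmx_mpi mdrun -s md.tpr -maxh .05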

28 of 44

How to submit a simple GPU job

29 of 44

How to submit a GROMACS GPU job

We will run the same protein analysis using one MPI instance on one GPU:

  • Copy the grom-gpu directory from the support material directory and examine the submission script:

  • Note the option “#SBATCH --gres=gpu”: this directive instructs the batch system to provide the job with one GPU.


[user@cirrus01 ~]$ cp -r /data/tutorial/modulo0/grom-gpu .

[user@cirrus01 ~]$ cd grom-gpu

[user@cirrus01 grom-gpu]$ ls -l

-rw-r----- 1 user group 357 Sep 8 19:09 grom-gpu.sh

-rw-r----- 1 user group 1133028 Sep 8 19:59 md.tpr

[user@cirrus01 grom-gpu]$ cat grom-gpu.sh
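The cat output is again not reproduced; a sketch of what grom-gpu.sh might look like, adding the GPU request discussed above to the CPU script (an illustration under the same assumptions as before):

#!/bin/bash
#SBATCH --partition=short
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --gres=gpu                     # request one GPU for the job

module load intel/gromacs/2021.5

srun gmx_mpi mdrun -s md.tpr -maxh .05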

30 of 44

How to submit a GROMACS GPU job (cont.)

  • Submit grom-gpu.sh and wait the same ~3 minutes for completion; note the TRES_PER_NODE column indicating that we are using a GPU:

  • On completion we will find much better performance compared to the CPU runs:


[user@cirrus01 grom-gpu]$ sbatch grom-gpu.sh

Submitted batch job 6959565

[user@cirrus01 grom-gpu]$ squeue

JOBID PARTITION NAME USER ST TIME NODES CPUS TRES_PER_NODE NODELIST

6959565 short grom-gpu.s user R 0:05 1 1 gres:gpu hpc063

[user@cirrus01 grom-gpu]$ less md.log

Core t (s) Wall t (s) (%)

Time: 178.510 178.511 100.0

(ns/day) (hour/ns)

Performance: 120.808 0.199

Finished mdrun on rank 0 Fri Sep 9 10:45:53 2022

31 of 44

Remark on GROMACS and others

Some applications, such as GROMACS, may try to take all available resources when not properly configured.

The batch system will not allow this, but such jobs will suffer from poor performance and could even abort.

Users are responsible for configuring their applications to use only the requested resources; the IT team will help with the batch system parametrization, but application tuning is out of our scope.


32 of 44

How to submit an MPI job

33 of 44

How to submit an MPI job

We will calculate pi with a multi-core MPI job using OpenMPI across four nodes and sixteen MPI instances:

  • Copy the openmpi directory from the support material directory and examine the submission script:

  • Submit and wait for completion:


[user@cirrus01 ~]$ cp -r /data/tutorial/modulo0/openmpi .

[user@cirrus01 ~]$ cd openmpi

[user@cirrus01 openmpi]$ ls -l

-rw-r----- 1 user group 1518 Sep 8 19:09 cpi_mpi.c

-rw-r----- 1 user group 647 Sep 8 19:59 cpi.sh

[user@cirrus01 openmpi]$ sbatch cpi.sh

Submitted batch job 6959924

[user@cirrus01 openmpi]$ squeue

JOBID PARTITION NAME USER ST TIME NODES CPUS TRES_PER_NODE NODELIST

6959924 short cpi.sh user R 0:05 4 16 N/A hpc[060-063]
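The cpi.sh script itself is not reproduced on the slide; a sketch of what it might contain, matching the echoed section markers in the output on the next slide (the OpenMPI module name is taken from the earlier “module avail” listing and is an assumption):

#!/bin/bash
#SBATCH --partition=short
#SBATCH --nodes=4
#SBATCH --ntasks=16

echo "=== Environment ==="
module load gcc63/openmpi/1.10.7
module list

echo "=== Compiling Parallel ==="
mpicc cpi_mpi.c -o cpi_mpi

echo "=== Running Parallel ===="
mpirun ./cpi_mpi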

34 of 44

How to submit an MPI job (cont.)

  • After a few seconds (we decreased the number of steps to shorten the job) we can check the result:


[user@cirrus01 openmpi]$ cat slurm-6959924.out

=== Environment ===

=== Compiling Parallel ===

=== Running Parallel ====

pi=3.1415926536607000, error=0.0000000000709068, ncores 16, wall clock time = 24.771612

35 of 44

How to start an interactive session

36 of 44

How to start an interactive session

It may be convenient (and faster) to test and troubleshoot applications in interactive mode from a shell console. The user interfaces have a different architecture and are not a good choice for such tests; in these cases you should start an interactive session on the worker nodes.

  • Run the following command to request an interactive session with one CPU “core”:

  • You’ll get one CPU “core” and only one; for example, the GPU will be off limits:


[user@cirrus01 ~]$ srun -p short --job-name "my_interactive" --pty bash -i

srun: job 6959936 queued and waiting for resources

srun: job 6959936 has been allocated resources

[user@hpc060 ~]$ _

[user@hpc060 ~]$ nvidia-smi

No devices were found
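If one core is not enough for the test, you can request more resources with the usual srun options; for example (standard Slurm flags, values to be adapted):

[user@cirrus01 ~]$ srun -p short --cpus-per-task=4 --mem=8G --job-name "my_interactive" --pty bash -i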

37 of 44

How to start an interactive session (cont.)

  • If you need a GPU available on the interactive session add the option “--gres=gpu”:


[user@cirrus01 ~]$ srun -p short --gres=gpu --job-name "my_interactive" --pty bash -i

srun: job 6959969 queued and waiting for resources

srun: job 6959969 has been allocated resources

[user@hpc063 ~]$ nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |

38 of 44

Production batch system

39 of 44

Differences with the production system

  • The production batch system also provides a “short” partition for testing, but the GPU is not available there;
  • The default partition on the production system is “hpc” and provides CPUs only;
  • GPUs are provided through the “gpu” partition (see the sketch after this list);
  • FCT grant beneficiaries must add extra options to their submission scripts (contact the IT team if needed); they use the “fct” partition exclusively and can access the GPUs directly from it;
  • There are a few more partitions available, but they do not compete for the HPC resources.
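As an illustration only (the exact directives may depend on your allocation, so check with the IT team), targeting the production partitions from a submission script could look like this:

#SBATCH --partition=hpc      # default, CPU-only partition

# or, for a GPU job:
#SBATCH --partition=gpu
#SBATCH --gres=gpu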


40 of 44

Useful commands

41 of 44

Useful commands

  • sinfo -Nl — list the cluster nodes, one line per node, with state and partition
  • squeue — list queued and running jobs
  • sacct -j "job_id" — accounting information for a given job
  • macct
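Typical invocations (the job id is a placeholder):

[user@cirrus01 ~]$ sinfo -p short -Nl                  # node list restricted to one partition
[user@cirrus01 ~]$ sacct -j "job_id" --format=JobID,JobName,State,Elapsed,MaxRSS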


42 of 44

Documentation & Helpdesk

43 of 44

Documentation

  • LIP Wiki: https://wiki-lip.lip.pt/
    • Configure email
    • Eduroam access
    • LIP internal groups documentation
    • LIP computing FARM access and commands

  • More general FARM documentation: https://wiki.incd.pt/


44 of 44

Q&A