1 of 27

Beginner Slurm

Paul Hall, PhD

Senior Research Software Engineer

HPC Team

Center for Computation and Visualization

2 of 27

Goals

  • Submit jobs to Oscar
  • Check on your jobs
  • Fix common errors
  • Choose optimal resources for your job
  • Understand queue priority
  • Run interactive jobs

3 of 27

Introduction

  • Oscar is Brown University’s supercomputer

  • Supercomputers are essential for running simulations that require more resources than your average machine

Simulation of physical phenomena

Visualization

4 of 27

Introduction

  • Oscar is a shared machine (hundreds of users, tens of thousands of jobs)

  • Oscar is available to everyone in the Brown community: faculty, staff, students, external collaborators, and URI researchers through the EPSCoR project
  • Slurm is the scheduler used to submit jobs and share the machine fairly among all users

5 of 27

Oscar: Under the Hood

Diagram: Oscar architecture. Users connect through gateway nodes (login, desktop/VNC, transfer). These sit in front of GPFS storage (/home, 50 GB; /data, 512+ GB; /scratch, up to 12 TB) and the compute nodes (CPU and GPU). The Slurm scheduler dispatches jobs to the compute nodes.

6 of 27

Submitting jobs

  • Questions Slurm needs answered before it can schedule your job:
    • How many nodes do you need?
    • How many cores do you need?
    • How long will your job run?
    • Where should your output and error logs go?

You specify these settings in a batch script (shown on the next slide).

7 of 27

Anatomy of batch file

#!/bin/bash

# Here is a comment

#SBATCH --time=1:00:00

#SBATCH -N 1

#SBATCH -c 1

#SBATCH -J MyJob

#SBATCH -o MyJob-%j.out

#SBATCH -e MyJob-%j.err

module load workshop

hello

This is a bash script, but with Slurm flags

Use #SBATCH to specify Slurm flags

To disable a flag, just break the pattern, e.g. # SBATCH (Slurm ignores the line, while bash still treats it as a comment)
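For example, a quick illustration of an active flag next to a disabled one (the time value is arbitrary):

#SBATCH --time=1:00:00     # active: Slurm reads this flag
# SBATCH --time=1:00:00    # ignored: the space breaks the #SBATCH pattern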

8 of 27

Anatomy of batch file

#!/bin/bash

# Here is a comment

#SBATCH --time=1:00:00

#SBATCH -N 1

#SBATCH -c 1

#SBATCH -J MyJob

#SBATCH -o MyJob-%j.out

#SBATCH -e MyJob-%j.err

module load workshop

hello

This is a bash script, but with Slurm flags:

--time : how much time do I need?

-N : how many nodes do I need? (use 1 here if your code is not MPI-enabled)

-c : how many cores?

-J : the name of your job

-o / -e : where to put your output and error files; %j expands into the job number, which is unique

The remaining lines are the bash commands that run your job

9 of 27

Example batch scripts

  • A series of example batch scripts covering a range of scenarios can be found on Oscar:
    • ~/batch_scripts

  • For a complete description of SBATCH options, see the online documentation

10 of 27

Submitting batch files

sbatch <file_name>

sbatch <flags> <file_name>

sbatch -N 2 submit.sh (flags passed on the command line override the corresponding #SBATCH flags in your batch script)
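For example, a minimal submission (the script name and job number are illustrative):

sbatch submit.sh
# prints a confirmation such as: Submitted batch job 1234567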

11 of 27

Checking on your jobs

Running/Pending jobs

  • myq - specific to Oscar
  • squeue -u <username> - works on any machine with SLURM scheduler

Completed jobs

  • sacct -u <username> -S YYYY-MM-DD -E YYYY-MM-DD
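For example, a hedged sketch of both checks (the dates and format fields are illustrative):

# running/pending jobs for the current user
squeue -u $USER

# completed jobs in a date range, with a few useful columns
sacct -u $USER -S 2024-03-01 -E 2024-03-31 --format=JobID,JobName,State,Elapsed,MaxRSS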

12 of 27

Examples

13 of 27

What resources should I ask for?

This depends on the code you are running

Nodes/Cores

Q. Is your code parallel? You will need to find out whether your code can

- run on multiple cores

- run across multiple nodes

Look at the documentation for your code: is it threaded, multiprocess, or MPI? (See the sketch below.)

Q. Is your code serial?

- This means it can only make use of one core, so request a single task (-n 1)
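A minimal sketch of how the request differs for threaded versus MPI codes (core and node counts are illustrative, and my_mpi_program is a hypothetical executable):

# threaded / multiprocessing code: one node, several cores
#SBATCH -N 1
#SBATCH -c 8

# MPI code: tasks spread across multiple nodes
#SBATCH -N 2
#SBATCH --ntasks-per-node=16
srun ./my_mpi_program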

14 of 27

What resources should I ask for?

This depends on the code you are running

Time

Make an estimate and add a margin.

e.g. if you think your code will take an hour, give it 3 hours;

if you think your code will take a day, give it 2-3 days.

If you run out of time your job will be killed, so be generous with your estimate.
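For reference, --time accepts HH:MM:SS or D-HH:MM:SS (the values below are illustrative; use one per script):

#SBATCH --time=3:00:00      # 3 hours
#SBATCH --time=2-00:00:00   # 2 days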

15 of 27

What resources should I ask for?

This depends on the code you are running

Memory

Memory requests can take some trial and error. Ask for plenty at first, then measure what you actually used: if you requested 100 GB but only used 1 GB, reduce the request for your next job. To ask for all the memory available on a node, use #SBATCH --mem=0 (see the sketch below).
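A minimal sketch of the two common cases (the 16 GB figure is illustrative):

#SBATCH --mem=16G   # request 16 GB of memory
#SBATCH --mem=0     # request all of the memory on the node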

16 of 27

What resources should I ask for?

This depends on the code you are running

GPUs

If your code is built to use GPUs, you can submit to the gpu partition. To request 1 GPU:

#SBATCH -p gpu --gres=gpu:1
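Putting the pieces together, a minimal sketch of a single-GPU batch script (the module and program names are hypothetical; check module avail on Oscar for real module names):

#!/bin/bash
#SBATCH -p gpu --gres=gpu:1
#SBATCH -N 1
#SBATCH -c 4
#SBATCH --time=1:00:00
#SBATCH -J GpuJob
#SBATCH -o GpuJob-%j.out
#SBATCH -e GpuJob-%j.err

module load cuda     # hypothetical module name
./my_gpu_program     # hypothetical executable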

17 of 27

I have a condo account, how do I submit to it?

#SBATCH --account=<account-name>

If you do not know the name of the condo account, run the condos command.

You need to be explicitly added to the condo. To check your associations:

sacctmgr show assoc where user=$USER

If you haven't been added, please email support@ccv.brown.edu

18 of 27

Finding out optimal resources for your job

It is good practice to occasionally check what resources your job is using.

myjobinfo -j <job-id> (this command is Oscar-specific)

seff <job-id>
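For example (the job id is illustrative); both commands report how much of the requested resources the job actually used:

myjobinfo -j 1234567
seff 1234567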

19 of 27

Why won't my job start?

Reason                  What this means
(Resources)             Waiting for enough resources to become available
(QOSGrpCpuLimit)        All of your condo's cores are in use
(QOSGrpMemLimit)        All of your condo's memory is in use
(JobHeldUser)           You have put a hold on the job
(Priority)              Jobs with higher priority are waiting for compute nodes
(ReqNodeNotAvail)       The nodes you requested are not available
(PartitionNodeLimit)    You have requested more nodes than are in the partition

20 of 27

Understanding queue priority

Diagram: the blue line represents all the cores on Oscar.

21 of 27

Understanding queue priority

Diagram: the blue line represents all the cores on Oscar; the x-axis is time.

22 of 27

Understanding queue priority

Diagram: the x-axis is time; Job1, Job2, and Job3 are scheduled onto the cores.

23 of 27

Understanding queue priority

Diagram: Job4 is added to the schedule alongside Job1, Job2, and Job3.

24 of 27

Understanding queue priority

Diagram: Job5 is added to the schedule alongside Job1 through Job4.

25 of 27

Understanding queue priority

Diagram: a condo job is added to the schedule alongside Job1 through Job5.

26 of 27

Interactive jobs

You can start interactive jobs using the interact command.
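A minimal sketch, assuming interact accepts core, time, and memory options (the flag names and values here are assumptions; check interact -h or the Oscar documentation for the exact options):

# request an interactive session: 4 cores, 1 hour, 8 GB of memory
interact -n 4 -t 1:00:00 -m 8g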

27 of 27

Have Questions?