Advanced Slurm
Paul Hall, PhD
Senior Research Software Engineer
HPC - team
Center for Computation and Visualization
Goals
Oscar: Under the Hood
Gateway nodes
login
desktop
transfer
Storage (GPFS)
/home: 50 GB
/data: 512+ GB
/scratch: up to 12 TB
Compute
Compute nodes: CPU and GPU
VNC
Scheduler
(Slurm)
Submitting jobs
You can specify job options using a batch file
Anatomy of batch file
#!/bin/bash
# Here is a comment
#SBATCH --time=1:00:00
#SBATCH -N 1
#SBATCH -c 1
#SBATCH -J MyJob
#SBATCH -o MyJob-%j.out
#SBATCH -e MyJob-%j.err
module load workshop
hello
A batch file is a bash script with Slurm flags
Use #SBATCH lines to specify Slurm flags
--time: how much time the job needs
-N: how many nodes the job needs; use 1 if your code is not MPI-enabled
-c: how many cores (CPUs per task) the job needs
-J: name of your job
-o, -e: where to put your output and error files
%j expands to the job number, which is unique to each job
The remaining lines are the bash commands that run your job
Submitting batch files
sbatch <file_name>
sbatch <flags> <file_name>
sbatch -N 2 submit.sh - Flags given on the command line override the corresponding #SBATCH flags in your batch script
Checking on your jobs
Submitting array jobs
#SBATCH --array=i-j - run one task for each index in the range i to j
#SBATCH --array=id_1,id_2,id_3,... - run one task for each listed index (no spaces between values)
Slurm sets the environment variable SLURM_ARRAY_TASK_ID in each array task. Use it to distinguish one task from another
Examples
Problem 1: Print the array task ID for your job
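One possible solution, as a minimal sketch. The job name, time limit, and array range are hypothetical values chosen for illustration; %A and %a in the output file name expand to the array's job ID and the task index. The fallback value of 1 is an assumption so the script also runs outside Slurm.

```shell
#!/bin/bash
#SBATCH --time=0:05:00
#SBATCH -N 1
#SBATCH -c 1
#SBATCH -J ArrayDemo
#SBATCH -o ArrayDemo-%A_%a.out
#SBATCH --array=1-4

# Slurm sets SLURM_ARRAY_TASK_ID in every array task.
# Fall back to 1 (assumption) when the script is run outside Slurm.
task_id="${SLURM_ARRAY_TASK_ID:-1}"

message="This is array task $task_id"
echo "$message"
```

Submitted with sbatch, this should produce one output file per task, each containing its own index.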
Examples
Problem 2: Use SLURM_ARRAY_TASK_ID to use a different variable from a list
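A sketch of one way to do this, assuming the list is a space-separated string and the array indices start at 1. The values, job name, and array range are hypothetical; a bash array indexed by the task ID would work equally well.

```shell
#!/bin/bash
#SBATCH --time=0:05:00
#SBATCH -N 1
#SBATCH -c 1
#SBATCH -J ListDemo
#SBATCH -o ListDemo-%A_%a.out
#SBATCH --array=1-4

# Hypothetical list of values, one per array task.
values="0.1 0.5 1.0 2.0"

# Fall back to 1 (assumption) when the script is run outside Slurm.
task_id="${SLURM_ARRAY_TASK_ID:-1}"

# cut -f N picks the N-th space-separated field, so task 1 gets 0.1, task 2 gets 0.5, ...
value=$(echo "$values" | cut -d ' ' -f "$task_id")

echo "Task $task_id uses value $value"
```

Keep the array range in step with the number of values in the list, otherwise some tasks get an empty value.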
Examples
Problem 3: Submit jobs to work with different files
The file names are included in list_of_files
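A sketch of one approach, assuming list_of_files holds one file name per line and the array indices start at 1. The LIST_OF_FILES override, the sample file names, and the demo fallback are hypothetical additions so the script can be tried outside Slurm.

```shell
#!/bin/bash
#SBATCH --time=0:10:00
#SBATCH -N 1
#SBATCH -c 1
#SBATCH -J FileArray
#SBATCH -o FileArray-%A_%a.out
#SBATCH --array=1-3

# Path to the list of file names, one per line.
list="${LIST_OF_FILES:-list_of_files}"

# Demo fallback (assumption): if the list is missing, build a temporary one
# with made-up names so the script still runs.
if [ ! -f "$list" ]; then
    list=$(mktemp)
    printf '%s\n' sample_a.txt sample_b.txt sample_c.txt > "$list"
fi

# Fall back to 1 (assumption) when the script is run outside Slurm.
task_id="${SLURM_ARRAY_TASK_ID:-1}"

# sed -n "Np" prints only line N, so task N picks the N-th file name.
input_file=$(sed -n "${task_id}p" "$list")

echo "Task $task_id would process: $input_file"
```

Replace the echo with the command that processes the file, and set the array range to match the number of lines in the list.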
Dependent Jobs
sbatch --dependency=<dependency type>:<job_id> <batch_script>
Different types of dependencies
Finding out job ID using bash
squeue -u $USER -o "%.8A %.4C %.10m %.20E"
Examples
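A minimal sketch of chaining two jobs. It uses afterok, which starts the second job only after the first finishes successfully; other dependency types Slurm accepts include after, afterany, and afternotok. The sample output line, job ID, and script names are hypothetical. On a cluster, the first line would come from running sbatch itself, or the ID could be captured directly with sbatch --parsable.

```shell
#!/bin/bash

# sbatch prints "Submitted batch job <id>" on success.
# A sample line (assumption) stands in for a real submission here;
# on a cluster use: submit_output=$(sbatch first_step.sh)
submit_output="Submitted batch job 12345"

# Strip everything up to the last space to keep only the job ID.
job_id="${submit_output##* }"

# Build the dependent submission; first_step.sh and second_step.sh are hypothetical names.
dependent_cmd="sbatch --dependency=afterok:${job_id} second_step.sh"

# Print the command rather than running it, so the sketch works without Slurm.
echo "$dependent_cmd"
```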
Start a job at scheduled time
sbatch --begin=<time> <batch-script>
Options for <time>
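A few forms that <time> can take, shown as a sketch that only prints the commands. The script name and the specific times are hypothetical; the forms themselves (a time of day, an offset from now, a keyword such as tomorrow, and a full date and time) follow the sbatch manual.

```shell
#!/bin/bash

batch_script="submit.sh"   # hypothetical script name
count=0

# 16:00               -> the next time the clock reads 16:00
# now+1hour           -> one hour from submission
# tomorrow            -> the next day
# 2030-01-20T12:34:00 -> an explicit date and time (YYYY-MM-DDTHH:MM:SS)
for begin in 16:00 now+1hour tomorrow 2030-01-20T12:34:00; do
    # Print the command instead of running it, so the sketch works without Slurm.
    echo "sbatch --begin=$begin $batch_script"
    count=$((count + 1))
done
```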
Checking up on resource utilization
Have Questions?