
Introduction to Using the Discovery Cluster at NU

This is an informational document organized in collaboration between ITS, NU faculty and staff.

For questions, contact:

researchcomputing@northeastern.edu


Objective

This guide provides basic information on how to connect to, run calculations on, and transfer data to/from the Discovery Cluster (as of September 2018). It also provides some tips on how to ensure you are obtaining good performance and that your calculations are efficient. Focusing on performance and efficiency (i.e. more computing per core-hour used) benefits you, as well as the rest of the Discovery users.

This guide is not intended to provide a comprehensive introduction to cluster computing. However, this document does provide links to entry-level resources and step-by-step tutorials. If you are trying to accomplish something that is not described here, please let RC staff know, so they can help you properly implement your tasks.

Finally, this is a living document. If you see ambiguities/errors, or you have suggestions, please let us know. Your fellow Discovery users will appreciate it!


Why read this guide?

Even if you used previous-generation NU resources, it is still important to familiarize yourself with the policies and configuration of the recently-deployed Discovery Cluster.

Major changes include:

  • Recommended best-practices have been significantly revised (e.g. use of --exclusive)
  • SLURM now assigns a default 1Gb/core for every job
  • Submitted jobs can request additional memory
  • CPU, GPU and memory usage is strictly enforced
  • The process for interactive job submission has changed
  • All partitions have been reconfigured and reorganized
  • Data storage policies have been updated
  • A dedicated transfer node is available
  • A Research Computing-specific issue ticketing system has been deployed
  • Requests to use Discovery for courses should now obtain prior approval
  • A regularly-scheduled maintenance window has been established


Before You Start

Discovery is a Linux-based cluster. If you have never used the command line in Linux, then you will want to get familiar with it before trying to use this resource. Below is information to help you learn the basics of working in a Linux-based environment.

Code Academy intro to the command line

https://www.codecademy.com/learn/learn-the-command-line

In OSX, the command line may be accessed using the “Terminal” application. If you are using a PC, you will likely only use the command line after you connect to the cluster using a Secure Shell. See User-Contributed Tutorials for Windows-specific information.


User-Contributed Tutorials

In addition to this guide, other members of the NU community have kindly volunteered to create additional tutorials. We encourage everyone to consult these other resources, since they complement the current document.

Introduction to Discovery for Windows Users by Cuneyt Eroglu (video)

Step-by-step Examples and Demos for Discovery from Kane Group (wiki)

Discovery Cluster guide and tutorials for Python, Spark, Keras and Pyro provided by the Ioannidis Group (wiki)


Connecting to Discovery

In order to log in to the Discovery cluster, you first need to connect to the front end. This is achieved by establishing a Secure Shell (SSH) connection:

On Linux, or from within the Terminal program when using OSX, you can initiate an SSH connection with the command:

ssh <username>@login.discovery.neu.edu

<username> is your username on discovery.

Windows users: You should use an SSH client, such as PuTTY, to connect. PuTTY may be downloaded at https://www.putty.org. For a video tutorial, see User-Contributed Tutorials.


Transferring Data to/from Discovery

In addition to the front end node, there is also a dedicated data transfer (DT) node for use with sftp, scp or rsync. To ensure stability of the front end, file transfers are only allowed through the data transfer node. Here are some (Linux, or Terminal in OSX) examples for how to transfer data:

scp testfile <username>@xfer.discovery.neu.edu:/scratch/<username>

or

rsync -auv <local directory> <username>@xfer.discovery.neu.edu:<remote directory>
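
To copy results back from Discovery to your local machine, simply reverse the source and destination (the directory names are placeholders):

rsync -auv <username>@xfer.discovery.neu.edu:<remote directory> <local directory>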

Windows users: For a video tutorial, see User-Contributed Tutorials.


Proper Data Storage Practices

/home directory

Each account is associated with a /home directory. The home directories are intended for storing executables and small persistent files (e.g. general config files, source code, etc). Each user is limited to 100Gb in their home directory. NEVER launch a job from your /home directory. In accordance with cluster policies, jobs launched from /home will be terminated without notice.
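
To check how much space your home directory currently uses, you can run the standard Linux command: du -sh $HOME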


Proper Data Storage Practices

Shared /scratch directory

/scratch is a 3 PB parallel file server which all users may utilize. Every user should create a subdirectory on /scratch in order to write output from their calculations.

To create a scratch directory, you may use the command: mkdir /scratch/`whoami`

For example, if your login name is john, then your scratch directory would be named /scratch/john

Always launch submitted jobs from /scratch, and transfer your data from /scratch to another resource (e.g. personal/group file server) as soon as possible.
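
For example, a typical workflow (the directory and script names are illustrative) would be:

cd /scratch/`whoami`

mkdir myproject

cd myproject

sbatch example.script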

/scratch is for high-speed temporary file access.

/scratch is NOT backed up.

While there is no limit on the amount of data you may store temporarily on /scratch, data removal may be required if the system reaches critically-full levels.

RCC has approved the implementation of purge policies for /scratch. Once implemented, old files (>~3 mo) will be removed automatically. Prior notice will be provided.


Proper Data Storage Practices

ITS staff aims to satisfy your research needs. If you find that additional data storage services are required, please contact RC support and/or your college’s RCC representative. See Policy Questions and Suggestions for contact information.


Using Modules to Load Executables

If you would like to use pre-built code, rather than compile your own software, you will need to use the “module” command. The basic idea is that, by loading the appropriate modules, you will be able to call executables from the command line with minimal effort (e.g. the commands will be in your path, and libraries will be loaded).

Here are the most commonly-used module options.

module avail : returns a list of available modules

module whatis <module name> : provides information about a specific module, including additional prerequisites

module load <module name> : load the module, so that you may use the executables

module list : list the modules that you have loaded
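
For example, a typical session might look like the following (the module name is illustrative; run module avail to see what is actually installed on Discovery):

module avail

module whatis python/3.6.6

module load python/3.6.6

module list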


Hardware Descriptions of the Partitions

In the Discovery environment, collections of compute nodes are organized into “partitions”. Always submit jobs to specific partitions using sbatch or srun. Never run calculations on the front end node.

Each partition has distinct hardware configurations (e.g. GPU nodes, phi processors, high memory, etc.). There are six publicly-available partitions open to all NU researchers. When submitting jobs, you should always specify a partition name. The names of the public partitions are:

  • general
  • gpu
  • fullnode
  • multigpu
  • infiniband
  • phi

The hardware details of each partition are described on subsequent slides.


“general” partition

The general partition has a collection of CPU-only nodes. Serial and parallel (multi core and multi-node) jobs are allowed on this partition.

Each node contains two multi-core CPUs.

Current Node Configurations Available

Dual Intel Xeon E5-2650 @ 2.00GHz, 16 total cores, 128GB memory (7 nodes)

Dual Intel Xeon E5-2680 v2 @ 2.80GHz, 20 total cores, 128GB memory (78 nodes)

Dual Intel Xeon E5-2690 v3 @ 2.60GHz, 24 total cores, 128GB memory (184 nodes)


“gpu” partition

Current Node Configurations Available

Dual Intel Xeon E5-2650 @ 2.00GHz, 16 total cores, 128GB memory + one NVIDIA K20 GPGPU (32 nodes)

Dual Intel Xeon E5-2690 v3 @ 2.60GHz, 24 total cores, 128GB memory + one NVIDIA K40m GPGPU (16 nodes)


“fullnode” partition

This partition is ONLY for jobs that can utilize entire nodes, either in terms of memory, or CPUs. If your job requires more than 128 Gb of memory, or you can use all 28 cores with a single submission script, then this partition is appropriate. Each user must apply for access to this partition (See “Applying for Access to Specialized Partitions”).

Node Specifications

Dual Intel Xeon E5-2680 v4 @ 2.40GHz, 28 total cores, 256GB memory (416 nodes)


“multigpu” partition

This partition is for jobs that can utilize multiple GPGPUs. Each user must apply for access to this partition (See “Applying for Access to Specialized Partitions”).

Node Specifications

Dual Intel Xeon E5-2680 v4 @ 2.40GHz, 28 total cores, 512GB memory - each node also has 8 NVIDIA K80 GPGPUs (In total, 8 nodes with 64 GPUs)

Dual Intel Xeon E5-2680 v4 @ 2.40GHz, 28 total cores, 512GB memory - each node also has 4 NVIDIA P100 GPGPUs (in total, 8 nodes with 32 GPUs)


“infiniband” partition

Users must apply for access to this partition (See “Applying for Access to Specialized Partitions”).

Each node contains:

Dual Intel Xeon E5-2650 @ 2.00GHz, 16 total cores, 128GB memory (64 nodes)

FDR Infiniband interconnect


“phi” partition - temporarily unavailable

Users must apply for access to this partition (See “Applying for Access to Specialized Partitions”).

Each node contains:

Dual Intel Xeon E5-2650 @ 2.00GHz, 16 total cores, 512GB memory (8 nodes) + one phi coprocessor

Note: The phi coprocessor is not supported by the current version of CentOS (7.5). At this time, we are waiting for a fix to be released. If you need to use these nodes immediately, please file a ticket with RC so that a temporary workaround solution may be explored.


Submitting and Monitoring Jobs on Discovery

In order to perform a calculation on Discovery, you must use SLURM (Simple Linux Utility for Resource Management). The idea is that you ask SLURM for resources (e.g. what type of node, how many cores, how much memory), and then your calculations will be executed once the resources become available.


Submitting and Monitoring Jobs on Discovery

The most common SLURM commands that you will need are:

sbatch <file name> : this will send the job to the scheduler.

srun : If you would prefer to work interactively on a node, you may launch an interactive session with srun.

squeue : see what jobs are waiting and running currently.

scancel <job id> : remove a running or pending job from the queue

scontrol <flags> : find more information about the machine configuration and job settings

seff <job id> : report the computational efficiency of your calculations

Below we provide examples for how to use these commands appropriately. Even if you have used SLURM in the past, it is useful to review all of the examples below, in order to see how SLURM is configured on Discovery.


sbatch

In order to schedule a job for execution, you need to submit it to SLURM using the command:

sbatch example.script

In this example, the submission script is called example.script. Below, we provide multiple examples of what you could include in this file.


Submit Script Format

The general format for a submit script is to provide a variety of SBATCH flags in the header and then call the executables in the body of the script.

Note: Most sbatch flags may be given in single-letter or whole-word formats. For example, “-N 1” and “--nodes=1” are equivalent. For transparency, we will use the whole-word convention. To see the complete list of available flags, see man sbatch.


Time Limits

The default time limit for all submitted jobs is 24 hours. This is also the maximum allowable wall time. If your job does not complete within the requested time limit, SLURM will automatically terminate the calculation. Alternately, if you request more than 24 hours, your job will not launch.

Tip: One factor that SLURM uses to determine job order is the requested time. If you request less time, SLURM may be able to schedule your calculation sooner. For example, if the highest-priority pending job has requested 10 nodes, and SLURM anticipates that 10 nodes will become available in 6 hours, then jobs that require less than 6 hours could be completed before the 10-node job begins. In this case, SLURM will allow these shorter, lower-priority jobs to run while the larger, higher-priority calculation is waiting for available resources.


Examples for Serial Job Submission


1-core job

If you wanted to run a 1-core job for 4 hours on the general partition, then your submit script would look like the following:

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=1

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --partition=general

<commands to execute.>

_____________________________________________________________


1-core job + additional memory

By default, SLURM will allow you to use 1GB of memory for every core you have allocated. Here is an example of a 1-core job for 4 hours on the general partition that requires 100 GB of memory:

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=1

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --mem=100Gb

#SBATCH --partition=general

<commands to execute.>

_____________________________________________________________

Note: If your calculations try to use more memory than you requested, SLURM will automatically kill the job.


1-core job with exclusive use of a node

If you wanted to run a 1-core job for 4 hours on the general partition, and you need exclusive access to the node (e.g. perhaps you have high I/O requirements), then you may want to lock down the entire node with --exclusive:

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=1

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --exclusive

#SBATCH --partition=general

<commands to execute.>

_____________________________________________________________


Examples for parallel job submission


Submit script contents

#SBATCH --nodes 2

#SBATCH --tasks-per-node=3

#SBATCH --cpus-per-task=4

[Schematic] Schematic example of how to request resources for a program that will employ multi-level MPI-OpenMP parallelization. In this case, 2 nodes are reserved; on each node, 3 tasks are reserved (e.g. for use with 3 MPI ranks per node); and 4 cores are reserved for each task (e.g. for use with 4 OpenMP threads per rank).


  • The schematic above only illustrates how SLURM reserves resources. You must make sure that your code is also configured to utilize the reserved resources.
  • For example, make sure that your code is built to run in parallel if you reserve more than 1 core. A sketch of a hybrid MPI-OpenMP job that uses the reserved resources is shown below.
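
Below is a minimal sketch of a submit script that uses the reservation shown in the schematic, assuming an appropriate MPI module has been loaded; the launcher (mpirun) and the executable name (./my_hybrid_program) are illustrative:

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=2

#SBATCH --tasks-per-node=3

#SBATCH --cpus-per-task=4

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --partition=general

# Use the 4 cores reserved for each task as OpenMP threads
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# Launch 6 MPI ranks in total (2 nodes x 3 tasks per node)
mpirun -np $SLURM_NTASKS ./my_hybrid_program

_____________________________________________________________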


8-task job on a single node + additional memory

By default, SLURM will allow you to use 1GB of memory for every core you have allocated. Here is an example of an 8-task job (e.g. for an 8-rank MPI calculation) that will run on a single node and will require 100 GB of memory:

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=1

#SBATCH --tasks-per-node=8

#SBATCH --cpus-per-task=1

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --mem=100Gb

#SBATCH --partition=general

<commands to execute.>

_____________________________________________________________


8-task job on multiple nodes + additional memory

You may want to distribute your tasks across nodes. Here is an example of an 8-task job, where the tasks will be distributed across 4 nodes, with 100 GB of memory requested per node:

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=4

#SBATCH --tasks-per-node=2

#SBATCH --cpus-per-task=1

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --mem=100Gb

#SBATCH --partition=general

<commands to execute.>

_____________________________________________________________


8-task, 32-core job on multiple nodes + memory

You may want to distribute your ranks across nodes. Here is an example of an 8-task job, with 4 cores reserved per task and the tasks distributed across 4 nodes; 100 GB of memory is requested per node:

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=4

#SBATCH --tasks-per-node=2

#SBATCH --cpus-per-task=4

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --mem=100Gb

#SBATCH --partition=general

<commands to execute.>

_____________________________________________________________


8-task job on multiple nodes + additional memory and exclusive

You may want to distribute your ranks across nodes and have exclusive access to all nodes (e.g. when using multi-level parallelization). Here is an example of an 8-task job, with tasks distributed across 4 nodes, 100 GB of memory requested per node and exclusive use of all nodes.

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=4

#SBATCH --tasks-per-node=2

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --mem=100Gb

#SBATCH --exclusive

#SBATCH --partition=general

<commands to execute.>

_____________________________________________________________


Examples for GPU job submission

Submitting jobs to the gpu and multigpu partitions is very similar to submitting jobs on the other partitions. The main difference is that you need to tell SLURM how many GPUs you would like to reserve per node. If you do not, the GPUs will not be visible to your executables. You can also optionally specify which types of GPUs are required.


1 node, using 1 core and 1 GPU

For this type of job, you would want to request the gpu partition.

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=1

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --partition=gpu

#SBATCH --gres=gpu:1

<commands to execute.>

_____________________________________________________________
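
To confirm that the GPU has been reserved and is visible to your job, you might add nvidia-smi (the standard NVIDIA utility, which we assume is available on the GPU nodes) as the first command in the body of the script; it will list the GPU(s) allocated to your job.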


1 node, all compute cores and 1 GPU

For this type of job, you would want to request the gpu partition.

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=1

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --partition=gpu

#SBATCH --gres=gpu:1

#SBATCH --exclusive

<commands to execute.>

_____________________________________________________________


4 nodes, all compute cores and 1 GPU per node (4 total GPUs)

For this example, you are only requesting one rank per node.

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=4

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --partition=gpu

#SBATCH --gres=gpu:1

#SBATCH --exclusive

<commands to execute.>

_____________________________________________________________


4 nodes, 2 tasks per node, 6 cores per task and 1 GPU per node (4 total GPUs)

Here, request two tasks/ranks per node (8 tasks across 4 nodes) and 6 cores per task.

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=4

#SBATCH --tasks-per-node=2

#SBATCH --cpus-per-task=6

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --partition=gpu

#SBATCH --gres=gpu:1

<commands to execute.>

_____________________________________________________________


1 node, 8 compute cores and 4 GPUs per node

For multi-GPU-per-node calculations, one must use the multigpu partition.

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=1

#SBATCH --tasks-per-node=8

#SBATCH --cpus-per-task=1

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --partition=multigpu

#SBATCH --gres=gpu:4

<commands to execute.>

_____________________________________________________________

Note: Do NOT use “--exclusive” with multigpu, unless you want to reserve all compute cores and all GPUs on a node. If you are not going to use every CPU and GPU on a node, then --exclusive would hold the remaining resources idle, while preventing other jobs from running.


Additional Examples of Submission Options


Redirecting stdout/stderr

By default, SLURM will write all stdout and stderr to a single file called: slurm-<job number>.out, where <job number> is assigned at the time of submission. If you would like to write stderr and stdout to specific files, use the flags below:

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=1

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --partition=general

#SBATCH --output=myjob.%j.out

#SBATCH --error=myjob.%j.err

<commands to execute.>

_____________________________________________________________

In the above example, “%j” tells SLURM to insert the job number in the file name.
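
For example, if SLURM assigned job number 12345 to the job above, stdout would be written to myjob.12345.out and stderr to myjob.12345.err.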


Specifying Required Node Features

Since the general partition contains heterogeneous nodes (different core count and speed), you may want to tell SLURM to only run your job on nodes that have specific features. Here is an example where SLURM is told to only run this job on a node that has an Intel E5-2690v3 chip:

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=1

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --partition=general

#SBATCH --constraint="E5-2690v3@2.60GHz"

<commands to execute.>

_____________________________________________________________


What are the available “Features”?

For a complete list of available features that may be specified when using SLURM on Discovery, you may use the command:

grep Feature /shared/centos7/etc/slurm/nodes.conf

RC staff is currently preparing a wrapper script with which to easily view the current configuration of all nodes. When it is available, a description will be provided here.


Specifying Required GPU Types

If you want to specify that a particular GPU model should be used:

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=1

#SBATCH --tasks-per-node=8

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --partition=gpu

#SBATCH --gres=gpu:k20:1

<commands to execute.>

_____________________________________________________________

The following GPU designations are available:

gpu partition: k20, k40m

multigpu partition: k80 (8 per node), p100 (4 per node)


Excluding Specific Nodes

You may also tell SLURM to exclude specific nodes. This can be useful if you find that specific nodes are problematic. Here is an example where SLURM is told to NOT run this job on node c0100:

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=1

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --partition=general

#SBATCH --exclude=c0100

<commands to execute.>

_____________________________________________________________


Requesting Specific Nodes

You may also tell SLURM to only run jobs on a set of possible nodes. Here is an example where SLURM is told to only consider running on nodes c0100-c0200:

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=1

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --partition=general

#SBATCH --nodelist=c0[100-200]

<commands to execute.>

_____________________________________________________________

Note: if you request more nodes with --nodes than are listed in --nodelist, additional resources will also be assigned to the job.


Using Job Dependencies

Oftentimes, a calculation will require more than the walltime limit. For example, you may have a job that takes 40 hours, but the walltime limit is 24 hours. In that case, you may want to break your calculation into two smaller parts, where the second part only begins after the first part has finished. To include dependencies, you may edit the submit script or define the dependency on the command line (an example of the command-line form is shown below). For these examples, let’s assume you have already submitted the first segment of your calculation with the command:

sbatch first.script

You will see a message, such as:

Submitted batch job 45
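
Alternatively, the dependency may be given on the command line at submission time rather than inside the script. For example (the script name is illustrative):

sbatch --dependency=afterok:45 second.script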


Using Job Dependencies

Now, let’s submit the second job, but include a line in your submit script that will tell SLURM to only start it if the first job finishes without an error.

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=1

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --partition=general

#SBATCH --dependency=afterok:45

<commands to execute.>

_____________________________________________________________


Using Job Dependencies

Alternately, you may only want the job to start if the first job has failed:

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=1

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --partition=general

#SBATCH --dependency=afternotok:45

<commands to execute.>

_____________________________________________________________


Using Job Dependencies

As a third example, you may want the second job to start, regardless of whether the first job finished without an error:

_____________________________________________________________

#!/bin/bash

#SBATCH --nodes=1

#SBATCH --time=4:00:00

#SBATCH --job-name=MyJobName

#SBATCH --partition=general

#SBATCH --dependency=afterany:45

<commands to execute.>

_____________________________________________________________


srun

If you prefer to run a job interactively, you may request a session on a compute node. To do this, please use the command:

srun --pty --export=ALL --tasks-per-node 1 --nodes 1 --mem=10Gb --time=00:30:00 /bin/bash

This example would allocate 1 core on 1 node with 10Gb of memory, and the reservation would be held for 30 minutes. Note: srun automatically logs you on to the compute node; there is no need to additionally ssh to the node. When you are done with your interactive session, you may close the window or type “exit”. If you would like to use X forwarding, add the flag --x11.

Note: Since it is very easy to misuse salloc, it is now recommended that all users launch interactive sessions using srun.
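
As a further illustration, an interactive session that also reserves a GPU can be requested by combining the srun options above with the partition and gres flags described under GPU job submission:

srun --pty --export=ALL --partition=gpu --gres=gpu:1 --tasks-per-node 1 --nodes 1 --mem=10Gb --time=00:30:00 /bin/bash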


squeue

If you want to see all jobs in the queue, including yours, simply use:

squeue

If you only want to see your jobs:

squeue -u `whoami`

If you only want to see your jobs that are currently running:

squeue -u `whoami` -t RUNNING


scontrol

If you want extensive information about a running (or pending) job, you can use the scontrol command.

scontrol show job <job id>

Where <job id> is the number assigned to the job you would like to check on.

You may also see all SLURM configurations with

scontrol show config
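
You may also inspect the configuration of a particular partition (e.g. its time limit and node list) with:

scontrol show partition <partition name>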


seff

seff is a SLURM tool that will calculate the fraction of reserved resources that were used by a completed job. Here is an example where we will check the efficiency of job 49908:

> seff 49908

Job ID: 49908

Cluster: discovery

User/Group: whitford/users

State: COMPLETED (exit code 0)

Nodes: 8

Cores per node: 28

CPU Utilized: 185-15:16:44

CPU Efficiency: 104.24% of 178-01:55:12 core-walltime

Job Wall-clock time: 19:04:48

Memory Utilized: 753.43 MB

Memory Efficiency: 0.34% of 218.75 GB

Note: You should always strive to use 100% of the CPU time that you reserve. If you are requesting more memory than the default 1Gb/core, then Memory Efficiency should also be close to 100%. Users that regularly perform low-efficiency jobs will have reduced access to the resource.


Software-specific tips

While the above description focused on how to reserve resources, the next step is to effectively use your code. In this section, we provide some guidance on how to use specific applications/languages.

Note: If you are an expert with a particular software package, and you would like to provide tips for use on Discovery, please let us know.


Running Jupyter Notebook

You may want to use a GPU on the cluster to speed up your program, and Jupyter Notebook to have better control over your code.

Since there is no web browser on the cluster, the notebook has to be accessed in another way.

First, generate a config file, since we are not going to use the default:

$ jupyter notebook --generate-config


Put the following lines in the config file and save. Use any port number between 1025 and 65535.

c = get_config()

c.NotebookApp.ip = '0.0.0.0'

c.NotebookApp.open_browser = False

c.NotebookApp.port = 8888

Check the IP address of a compute node and run Jupyter Notebook on it.

Open a browser locally and enter the IP address and port number. You should be able to utilize the resources on Discovery and view the results in your local browser.
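
A minimal sketch of the whole workflow is given below, assuming a module that provides Jupyter (the module name is illustrative; check module avail):

# 1. Request an interactive session on a compute node

srun --pty --export=ALL --nodes 1 --tasks-per-node 1 --mem=10Gb --time=02:00:00 /bin/bash

# 2. On the compute node, find its IP address

hostname -i

# 3. Load a module that provides Jupyter and start the notebook

module load python/3.6.6

jupyter notebook

# 4. On your local machine, open a browser and go to http://<compute node IP>:8888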


Adding your own python packages

If you have a deadline and cannot wait for your package request to be fulfilled, you can install the package under your home directory. The only remaining step is to tell the Python interpreter where to find it.

Install the package from source or with other virtual environment tools, such as conda.

In your Python code, put the following lines at the top:

import sys

sys.path.append(PATH_TO_THE_PACKAGE)

Then you should be able to use the Python installation on Discovery along with the packages you installed.
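
For example, if you installed a package into a conda environment under your home directory, the appended path might look like the following (the path is purely illustrative; use the location reported by your own installation):

import sys

sys.path.append('/home/<username>/.conda/envs/myenv/lib/python3.6/site-packages')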


Known Issue

As with any migration process, the transition to the new Discovery cluster will require fine-tuning. Here are known issues that are currently being worked on:

  • Cross compiling: Since the chips on the front end are newer than those on the compute nodes, some packages may auto-configure various optimization flags specifically for usage on the front end. This can lead to “Illegal Instruction” errors when executing a program on the general partition. RC support staff is currently expanding the module environment to address these requirements. If you build your own code, you may need to cross compile it. Alternately, you can build your code on the compute nodes that you wish to run on. This way, configure scripts will likely select appropriate options.
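
For example, to build directly on a general-partition compute node, you could request an interactive session there and then run your usual build commands (the build commands shown are illustrative):

srun --pty --export=ALL --nodes 1 --tasks-per-node 1 --time=01:00:00 --partition=general /bin/bash

./configure

make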


Policy Questions and Suggestions

Usage policies for the Discovery Cluster are established by the NU Research Computing Committee, which is composed of ITS staff and faculty representatives from every college. We aim to provide thoughtful policies that will support research activities of all users. If you have suggestions on how to improve the utility of the Discovery Cluster, always feel free to reach out. More information about NURCC may be found at: https://web.northeastern.edu/rcac/


Applying for Access to Specialized Partitions

While access to the “general” and “gpu” partitions is automatically granted to all Discovery account holders, the “fullnode”, “multigpu” and “infiniband” partitions require an additional application be completed. Since these are more powerful (and expensive) resources, the application is intended to ensure that you know how to appropriately use these nodes.

The new application for fullnode is now available on the RC page. The application for multigpu is currently being revised.


Using Discovery in Courses

In addition to serving the research community, the Discovery cluster is also a valuable educational resource. If you are an instructor, and would like to use the cluster in your class, it is highly recommended that you request support through RCC at least 2 months prior to the beginning of the semester. Ideally, you should coordinate your request with your college’s RCC representative. While approval is not required, there can be no guarantee of ITS support if prior approval is not obtained.

You may find a recently-approved request here. To streamline the review process, please model your request after this example.


Scheduled Downtime

In order to ensure that the Discovery cluster remains stable and is able to support all users, it is essential that the administrators are able to occasionally power down the system. To allow for this, the Research Computing Committee recommended that ITS/RC implement the following regularly-scheduled maintenance window.

Starting January 1st, 2019, RC/ITS will have a standing reservation for routine maintenance on the cluster. The maintenance window will be from 8am-12pm on the first Monday of each month. If RC/ITS determines that they need to utilize this window, users will be notified 1 week prior. During the downtime, no jobs will be able to run. In addition, SLURM will not initiate additional calculations if they are not expected to terminate before the service window.


Getting Help

Self-Serve portal: As of September 2018, there is a Self-Service portal available to file Research-Computing-Specific help requests. Since this will be a trackable system, it is the preferred mechanism for obtaining RC assistance. You may file a ticket at : https://northeastern.service-now.com/research

Man pages: This guide only provides a few examples of each command. For more complete listings of options, check the man pages (e.g. issue “man sbatch” on the command line).

In-person consultation: You can always stop by the RC office in 2 Ell Hall to discuss issues you may be having, or to seek advice.

Training workshops: ITS/RC organizes introductory workshops for new users to Discovery. Check the RC page for upcoming events.

Discuss with other users: There is a Discovery discussion listserv open to all NU members. You can sign up for “discovery-user-forum” at listserv.neu.edu. See details on next slide.

Email: You may also email the RC computing staff directly at: researchcomputing@northeastern.edu


Getting Help: User Forum Listserv

There is a Discovery discussion listserv open to all NU members. You can join the “discovery-user-forum” by doing the following:

From the email account that you want to use, send the following email:

To: listserv@listserv.neu.edu

Subject: <leave blank>

Content: subscribe discovery-user-forum <firstname> <lastname>

You will receive a confirmation email with a link.

Access archived posts by creating an account with your email address at https://listserv.neu.edu

Go to “Subscribers Corner” to locate the list, then click on it to see the archived emails.
