1 of 44

NERSC Overview

Julia Tutorial at SC25

July 22nd 2025

Johannes Blaschke

Data Science Engagement Group

NERSC


2 of 44

NERSC: Mission HPC for DOE Office of Science Research

The DOE Office of Science is the largest funder of physical science research in the U.S. Its program offices include:

  • Computing
  • Basic Energy Sciences
  • Biological and Environmental Research
  • Fusion Energy, Plasma Physics
  • High Energy Physics
  • Nuclear Physics

3 of 44

NERSC Directly Supports Office of Science Priorities

  • The distribution of time to Office of Science Programs is set by DOE
  • Percentages change infrequently
  • Roughly follows program budgets

2023 Allocation Breakdown (hours, millions):

  • Distributed by DOE Office of Science program managers
  • Competitive awards run by the DOE Advanced Scientific Computing Research office
  • Strategic awards from NERSC

4 of 44

NERSC Turns 50 This Year!

CDC 6600 at LLNL, 1974

5 of 44

Success Is Depth and Breadth of Scientific Impact

An Accelerating Universe
Saul Perlmutter, Berkeley Lab

Perlmutter’s Nobel winning team is believed to have been the first to use supercomputers to analyze and validate observational data in cosmology, contributing to the discovery of the accelerating expansion of the universe.

Oscillating Neutrinos
Sudbury Neutrino Observatory (SNO)

Data from SNO was transferred to NERSC and analyzed in what became known as the “West Coast Analysis” leading to the discovery of neutrino oscillations and a Nobel Prize.

New Approach to Water Desalination
Jeff Grossman, MIT

One of Smithsonian Magazine’s Top 5 Most Surprising Milestones of 2012 was the computationally driven discovery of an approach to desalination of water that is more efficient and less expensive than existing systems.

NERSC has been acknowledged in 8,790 refereed scientific publications since 2020, including in high-profile journals:

  • Nature [47]
  • Nature Family of Journals [463]
  • Proc. of the National Academy of Sciences [222]
  • Science [36]
  • Monthly Notices of the Royal Astron. Society [397]
  • Physical Review* [4,170]
  • Astrophysical Journal [792]
  • Physics of Plasmas [685]


6 of 44

We Accelerate Scientific Discovery for Thousands of Office of Science Users with 3 Advanced Capability Thrusts

  • Large-scale applications for simulation, modeling, and data analysis
  • Complex experimental and AI-driven workflows
  • Time-sensitive and interactive computing

The NERSC workload is diverse with growing emphasis on integrated research workflows


7 of 44

Responding to the DOE Mission

  • NERSC-8, Cori (2016): manycore CPU architectures
  • NERSC-9, Perlmutter (2020): CPU and GPU nodes; expanded simulation, learning & data
  • NERSC-10 (2026): accelerating end-to-end workflows
  • NERSC-11 (2030+): beyond Moore

[Figure: the workload evolves from simulation & modeling, through experiment data plus AI training & inference, to workflows running seamlessly in the Integrated Research Infrastructure (IRI), with quantum computing and pervasive AI on the horizon.]

8 of 44

NERSC Systems Ecosystem

[Figure: the NERSC systems ecosystem]

  • Perlmutter: 1,792 GPU-accelerated nodes (4 NVIDIA A100 GPUs + 1 AMD “Milan” CPU; 448 TB CPU + 320 TB GPU memory) and 3,072 CPU-only nodes (2 AMD “Milan” CPUs; 1,536 TB CPU memory); HPE Slingshot 11 interconnect with 4 NICs per GPU node and 1 NIC per CPU node; #25 (#6), 93.8 PF peak
  • Storage: 35 PB all-flash scratch (>5 TB/s), Community File System 130 PB (1.6 TB/s), /home 450 TB, HPSS tape archive ~300 PB, plus off-platform storage via DTNs and gateways
  • External connectivity: 2 x 400 Gb/s and 2 x 100 Gb/s links to experimental facilities, ASCR facilities, home institutions, cloud, and the edge; Ethernet LAN; edge services
  • Science-friendly security, production monitoring, power efficiency
  • The figure also lists link bandwidths of 100 GB/s, 50 GB/s, and 5 GB/s

9 of 44

NERSC Ecosystem in 2027

[Figure: planned NERSC ecosystem in 2027, with NERSC-10 joining Perlmutter]

  • NERSC-10, with a Quality of Service Storage System (QSS), a Platform Storage System (PSS), a workflow environment / management environment, and container services
  • Perlmutter: 1,792 GPU-accelerated nodes (4 NVIDIA A100 GPUs + 1 AMD “Milan” CPU; 448 TB CPU + 320 TB GPU memory) and 3,072 CPU-only nodes (2 AMD “Milan” CPUs; 1,536 TB CPU memory); HPE Slingshot 11 ethernet-compatible interconnect with 4 NICs per GPU node and 1 NIC per CPU node; #25 (#6), 93.8 PF peak
  • Storage: 35 PB all-flash scratch (>5 TB/s), Community File System 240 PB (1.6 TB/s), /home 450 TB, HPSS tape archive >1 EB, plus off-platform storage via DTNs and gateways
  • External connectivity: 2 x 400 Gb/s and 2 x 100 Gb/s links to experimental facilities, ASCR facilities, home institutions, cloud, and the edge; Ethernet LAN
  • Science-friendly security, production monitoring, power efficiency
  • The figure also lists link bandwidths of >800 GB/s, >10 GB/s, 200 GB/s, and 3.25 TB/s (26 Tbps)

10 of 44

Perlmutter system configuration

AMD "Milan" CPU Node

2x CPUs

> 256 GiB DDR4

1x 200G "Slingshot" NIC

NVIDIA "Ampere" GPU Nodes

4x GPU + 1x CPU

40 GiB HBM + 256 GiB DDR

4x 200G "Slingshot" NICs

Compute racks

64 blades

Blades

2x GPU nodes or 4x CPU nodes

Centers of Excellence

Network

Storage

App. Readiness

System SW

Perlmutter system

GPU racks

CPU racks

~6 MW

10

11 of 44

Perlmutter Node Configuration

  • Each Milan CPU has 64 cores at 2.45 GHz, each with two hardware threads
  • Access is managed via Slurm partitions
  • Login nodes are GPU-enabled

Partition      Nodes   CPUs/node        RAM      GPUs/node                NICs/node
GPU            1,536   1x AMD “Milan”   256 GB   4x NVIDIA A100 (40 GB)   4
GPU            256     1x AMD “Milan”   256 GB   4x NVIDIA A100 (80 GB)   4
CPU            3,072   2x AMD “Milan”   512 GB   (none)                   1
Login          40                       512 GB   4x NVIDIA A100 (40 GB)
Large Memory   4                        1 TB     4x NVIDIA A100 (40 GB)

[Figure: GPU node architecture]

12 of 44

Simplified NERSC File Systems

[Figure: storage hierarchy, trading performance against capacity]

  • Memory / Burst Buffer: in-memory file system (/tmp, RamFS), x GB where x = total node RAM; temporary
  • Scratch: 35 PB all-flash Lustre, >5 TB/s; temporary (subject to purge)
  • Global Common: 20 TB SSD, Spectrum Scale; permanent; software installs, faster compiling, source code
  • Global Home: permanent
  • Community: 157 PB HDD, Spectrum Scale (GPFS), 150 GB/s; permanent
  • HPSS: 150 PB tape archive; kept forever

13 of 44

Global File Systems

Global Home

  • Permanent, relatively small storage
  • Mounted on all platforms
  • NOT tuned to perform well for parallel jobs
  • Quota cannot be changed
  • Snapshot backups (7-day history)
  • Perfect for storing data such as source code, shell scripts
  • Addressed using $HOME

Community File System (CFS)

  • Permanent, larger storage
  • Mounted on all platforms
  • Medium performance for parallel jobs
  • Quota can be changed
  • Snapshot backups (7-day history)
  • Perfect for sharing data within research group
  • Addressed using $CFS


14 of 44

Local File Systems

Scratch

  • Large, temporary storage
  • Optimized for read/write operations, NOT storage
  • Not backed up
  • Purge policy (8 weeks)
  • Perfect for staging data and performing computations
  • Addressed using $SCRATCH

Burst Buffer

  • Temporary storage
  • High-performance in-memory file system
  • Perfect for getting good performance in I/O-constrained codes
  • Not shared between nodes
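
As a quick orientation, these file systems are addressed through predefined environment variables; a small sketch (the project directory and file name are hypothetical):

  % echo $HOME $CFS $SCRATCH                 # the variables are set for you on NERSC systems
  % cp $CFS/myproject/input.h5 $SCRATCH/     # hypothetical: stage input data onto scratch before a job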


15 of 44

There are many different ways to access NERSC. To use our resources, you can:

  1. Log into the login nodes and interact with Slurm (covered here)
  2. Use services (e.g. Jupyter) that expose web interfaces (covered later)
  3. Interact via our REST API, called the “Superfacility API”
     (see: https://docs.nersc.gov/services/sfapi/)
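
As a small illustration of option 3, the Superfacility API can be queried over plain HTTPS; the exact endpoint below is an assumption, so check the linked docs:

  % curl https://api.nersc.gov/api/v1.2/status/perlmutter    # assumed endpoint: returns the system's status as JSON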


16 of 44

Connecting with SSH

  • To access Perlmutter, connect with SSH to perlmutter.nersc.gov (or saul.nersc.gov)

  • To be able to open GUI applications on the login nodes, use: ssh -Y
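
For example, from your local machine (username stands in for your NERSC or training account):

  % ssh username@perlmutter.nersc.gov
  % ssh -Y username@perlmutter.nersc.gov    # -Y forwards X11 so GUI applications display locally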


17 of 44

Connecting with SSH

After successfully logging in, you will be greeted by the terms of use and the command-line prompt:

From here you can interact with Perlmutter…


18 of 44

Submitting Jobs

  • Jobs can be submitted to the queueing system through sbatch or salloc:
    • sbatch <my_job_script>
    • salloc <options>

  • Both methods describe the resources a job needs and for how long

  • E.g., a request for one CPU node for 5 minutes:
    • salloc -N 1 -C cpu -q debug -t 5 -A <project>
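
Once the allocation is granted, you land in a shell on a compute node and can launch parallel work with srun; for example (hostname here is just a stand-in for a real application):

  % salloc -N 1 -C cpu -q debug -t 5 -A <project>
  % srun -n 4 hostname    # runs 4 tasks on the allocated node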


19 of 44

Submitting Jobs

[Figure: a login node submits a job via sbatch or salloc; the head compute node then uses srun to launch parallel tasks on all compute nodes allocated to the job]

Login node:

  • Submit batch jobs via sbatch or salloc

Head compute node:

  • Runs the commands in the batch script
  • Issues the job launcher “srun” to start parallel jobs on all compute nodes (including itself)

*figure courtesy Helen (2020 NERSC Training)


20 of 44

My First “Hello World” Job Script

To run via batch queue

% sbatch my_batch_script.sh

To run via interactive batch

% salloc -N 2 -q interactive -C gpu -t 10:00
<wait for the session prompt; you land on a compute node>

% srun -n 64 ./helloWorld

The batch script requests the debug queue, 2 nodes, and a 10 min “walltime”; it runs on the haswell partition, uses the SCRATCH file system, and runs 64 processes in parallel. A minimal sketch of such a script is shown below.
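
A sketch of my_batch_script.sh consistent with those callouts (the haswell constraint comes from the original figure; on Perlmutter you would use -C cpu or -C gpu instead):

  #!/bin/bash
  #SBATCH -q debug          # debug queue
  #SBATCH -N 2              # 2 nodes
  #SBATCH -t 10:00          # 10 min “walltime”
  #SBATCH -C haswell        # run on the haswell partition (on Perlmutter: -C cpu or -C gpu)

  cd $SCRATCH               # use the SCRATCH file system
  srun -n 64 ./helloWorld   # run 64 processes in parallel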

21 of 44

Accessing NERSC

and submitting your first Job

22 of 44

Access to Perlmutter and Use Julia Module

  • NERSC users have been added to the trn013 project
  • Non-users were sent instructions for getting a training account
    • Accounts are valid through July 29
  • Log in to Perlmutter: ssh username@perlmutter.nersc.gov
  • Julia modules:
    • % module load julia
  • Running Jobs examples:
    • https://docs.nersc.gov/jobs/
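
Putting those steps together from your local machine (julia --version simply confirms which Julia the module provides):

  % ssh username@perlmutter.nersc.gov
  % module load julia
  % julia --version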


23 of 44

Compute Node Reservations

  • GPU node reservation:
    • To use 1 GPU only (sample flags for sbatch or salloc):
      • -A trn013 --reservation=juliacon_1 -C gpu -N 1 -c 32 -G 1 -t 30:00 -q shared
    • To use multiple nodes (sample flags for sbatch or salloc):
      • -A trn013 --reservation=juliacon_1 -C gpu -N 2 -t 30:00 -q regular
  • Outside of the reservation, use:
    • To use 1 GPU only (sample flags for sbatch or salloc):
      • -A <project> -C gpu -N 1 -c 32 -G 1 -t 30:00 -q shared
    • To use multiple nodes (sample flags for sbatch or salloc):
      • -A <project> -C gpu -N 2 -t 30:00 -q regular (or -q interactive for salloc)
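
For example, a complete interactive request inside the tutorial reservation, and a batch submission outside of it (my_batch_script.sh is a placeholder):

  % salloc -A trn013 --reservation=juliacon_1 -C gpu -N 1 -c 32 -G 1 -t 30:00 -q shared
  % sbatch -A <project> -C gpu -N 2 -t 30:00 -q regular my_batch_script.sh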


24 of 44

Make Sure to Clone The Tutorial Repo

https://github.com/JuliaParallel/julia-hpc-tutorial-juliacon25

Pro tip: $HOME is not the best file system for running jobs at scale. For large-scale jobs, add

export JULIA_DEPOT_PATH=$SCRATCH/depot

to your scripts for optimal performance (remember to do this in both the configuration script and the job script) … or use a container!
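
For reference, a possible setup from a Perlmutter login shell (cloning into $SCRATCH is just one reasonable choice):

  % export JULIA_DEPOT_PATH=$SCRATCH/depot     # keep the Julia depot off $HOME for large-scale jobs
  % cd $SCRATCH
  % git clone https://github.com/JuliaParallel/julia-hpc-tutorial-juliacon25
  % module load julia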

25 of 44

Logging into Jupyter

  • Go to https://jupyter.nersc.gov
  • Sign in using your training account credentials
  • Select your preferred Jupyter instance:


26 of 44

Logging into Jupyter

  • Go to https://jupyter.nersc.gov
  • Sign in using your training account credentials
  • Select your preferred Jupyter instance:

For now, let’s use the “Shared GPU Node” – or the “Login Node”


27 of 44

Logging into Jupyter

  • Go to https://jupyter.nersc.gov
  • Sign in using your training account credentials
  • Select your preferred Jupyter instance:

Later we’ll be using the “Exclusive GPU Node” or reservations (using the “Configurable Job”)


28 of 44

Getting a Terminal in Jupyter

  • Jupyter should take a minute to start:


29 of 44

Getting a Terminal in Jupyter

  • If you don’t see a terminal, select “+” followed by “Terminal”


30 of 44

Setup

  • Once started, you should see a terminal


31 of 44

Pro Tip: Projects handle dependencies

  • Run: import Pkg; Pkg.activate("path/to/the/repo"); Pkg.instantiate() to ensure all dependencies are installed (see the example below):
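
The same step as a one-liner from a Jupyter terminal or login shell (path/to/the/repo is a placeholder for wherever you cloned the tutorial repository):

  % julia -e 'import Pkg; Pkg.activate("path/to/the/repo"); Pkg.instantiate()'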


32 of 44

Pro Tip: Sanity Check

  • The versioninfo() function can be used to check whether the backends are configured correctly, e.g. for MPI.jl and CUDA.jl (see the sketch below):
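
A minimal sketch of such a check, assuming the tutorial project provides MPI.jl and CUDA.jl (both packages export their own versioninfo):

  % julia --project=path/to/the/repo -e 'using MPI, CUDA; MPI.versioninfo(); CUDA.versioninfo()'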


33 of 44

Using Reservations in Tutorials

  • Go to https://jupyter.nersc.gov and select “Configurable GPU” in the “Perlmutter” row


34 of 44

Jupyter Options:

Leave defaults, except:

  • Account: trn013
  • Reservation: juliacon_1
  • Time: 180

35 of 44

Jupyter on Bridges2


36 of 44


37 of 44


38 of 44


39 of 44

Building HPC Julia Workflows

40 of 44

41 of 44

Building an HPC Workflow in Julia

[Figure: a workflow (WF) node connected over the high-speed network to compute nodes 1-4]

42 of 44

Building an HPC Workflow in Julia

[Figure: the same layout; each compute node now carries user software (User SW), and the WF node carries the user workflow (User WF)]

43 of 44

Building an HPC Workflow in Julia

[Figure: the WF node runs the user workflow interactively in Jupyter / Pluto; Distributed.jl / MPI.jl provide communication over the high-speed network; each compute node stacks user software on CUDA.jl on top of the vendor software; Dagger.jl / ImplicitGlobalGrid.jl / ParallelStencil.jl provide higher-level abstractions across nodes]

44 of 44

Building an HPC Workflow in Julia

[Figure: the same stack as the previous slide]

  • Interactivity: Jupyter / Pluto on the WF node (possibly also a login node, or the head node of the job)
  • High-level abstraction: Dagger.jl / ImplicitGlobalGrid.jl / ParallelStencil.jl
  • Low-level communications: Distributed.jl / MPI.jl over the high-speed network
  • On each compute node: user software on CUDA.jl on top of the vendor software; JACC.jl and KernelAbstractions.jl provide portability
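
To make the low-level communications layer concrete, a minimal sketch of running Julia across an allocation with MPI.jl (run inside an salloc/sbatch allocation; assumes MPI.jl is installed in the active project and configured for the system MPI):

  % srun -n 4 julia --project -e 'using MPI; MPI.Init(); println("rank $(MPI.Comm_rank(MPI.COMM_WORLD)) on $(gethostname())")'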