National Research Platform
- A Status Update -
Frank Würthwein
Director, San Diego Supercomputer Center
April 11th 2023
Long Term Vision
Openness for an Open Society
Open Compute
Open Storage & CDN
Open devices/instruments/IoT, …?
Community vs Funded Projects
Community with
Shared Vision
Lots of funded projects that
contribute to this shared
vision in different ways.
We want you to …
… grow NRP.
… build on NRP.
NRP is “owned” and “built” by the community for the community
A single Kubernetes Cluster Across the World
Rotating Storage
4000 TB
Feb 9, 2023
NRP passed its NSF acceptance review in February 2023
Cyberinfrastructure Stack
HTCondor/OSG
Hardware
IPMI, Firmware, BIOS
Kubernetes
Admiralty
SLURM
NRP operates at all layers of the stack, from IPMI up
The layer you integrate at depends on
Cyberinfrastructure Stack
All of these find it difficult to
justify staff to support all layers
Hardware on NRP is Global
NRP integrates hardware in USA, EU, and Asia
Grafana Graphs Nautilus Namespaces Usage: Calendar 2022 GPUs
900
AI/ML is largest “domain” both in # of namespaces & # of GPU-hours
Usage by K8S Namespace
osg-opportunistic
ucsd-haosulab
osg-icecube
ucsd-ravigroup
cms-ml
braingeneers
Let’s look at some
example science
ML Inference as a Service on NRP
Raghav Kansal (grad student, UCSD) runs ~1,000 CPU jobs calling out to
~10 GPUs on NRP for inference for his ML model in his thesis analysis.
Inference ran on 80M events, sending 1.3 TB of data from CPUs to GPUs in 3 h.
The ML model is too large to fit into the DRAM of the CPUs.
Fastest way to get the job done is “ML Inference as a service” on NRP
~4MB/s output from GPUs
~200MB/s input to GPUs
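A quick back-of-the-envelope check of the numbers on this slide, as a sketch only. It assumes decimal units (1 TB = 10^12 bytes), exactly 10 GPUs where the slide says "~10", and a uniform load over the full 3 hours; the variable names are illustrative.

```python
# Figures quoted on the slide
events = 80_000_000        # events run through inference
bytes_sent = 1.3e12        # 1.3 TB sent from CPUs to GPUs (decimal TB assumed)
duration_s = 3 * 3600      # 3 hours
n_gpus = 10                # slide says "~10 GPUs"; exact count is an assumption

# Derived quantities
avg_input_mb_s = bytes_sent / duration_s / 1e6   # average input rate to the GPUs
kb_per_event = bytes_sent / events / 1e3         # average payload per event
events_per_s = events / duration_s               # aggregate inference rate
events_per_gpu_s = events_per_s / n_gpus         # per-GPU rate under the 10-GPU assumption

print(f"average input rate : {avg_input_mb_s:.0f} MB/s")
print(f"payload per event  : {kb_per_event:.2f} kB")
print(f"aggregate rate     : {events_per_s:.0f} events/s")
print(f"per-GPU rate       : {events_per_gpu_s:.0f} events/s")
```

The 3-hour average comes out near 120 MB/s, below the ~200 MB/s quoted for GPU input. One plausible reading is that the quoted rate was observed while the jobs were running at full scale and the 3 hours include ramp-up and tail, but the slide does not say.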
Raghav & colleagues are
4th largest GPU users in 2022
157,571 GPU-Hours
Peaking at 130 GPUs
Experimental Particle Physics
cms-ml namespace
NRP Bringing Machine Learning to Building Virtual Worlds, Including Robotics and Autonomous Vehicles
(video)
A Major Project in UCSD’s Hao Su Lab is Large-Scale Robot Learning
585,170 GPU-Hours
Peaking at 150 GPUs
2nd largest consumer
of GPU power in 2022
UCSD’s Ravi Group: How to Create Visually Realistic 3D Objects or Dynamic Scenes in VR or the Metaverse
Source: Prof. Ravi Ramamoorthi, UCSD
ML Computing Transforms a Series of 2D Images
Into a 3D View Synthesis
200,000 GPU-Hours
Peaking at 122 GPUs
4th largest GPU
consumer in 2022
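To put the GPU-hour totals in perspective, the sketch below converts them into an average number of GPUs held around the clock in 2022 and compares that with the quoted peaks. The GPU-hour and peak figures are from the slides. Pairing the Hao Su Lab and Ravi Group with the `ucsd-haosulab` and `ucsd-ravigroup` namespaces is an assumption based on the namespace list; `cms-ml` is named on its slide.

```python
HOURS_IN_2022 = 365 * 24  # 8,760 hours; 2022 was not a leap year

# namespace: (GPU-hours in 2022, peak concurrent GPUs), as quoted on the slides
usage = {
    "cms-ml": (157_571, 130),
    "ucsd-haosulab": (585_170, 150),
    "ucsd-ravigroup": (200_000, 122),
}

def average_gpus(gpu_hours, hours=HOURS_IN_2022):
    """Average number of GPUs in use if the GPU-hours were spread evenly over the year."""
    return gpu_hours / hours

for namespace, (gpu_hours, peak) in usage.items():
    avg = average_gpus(gpu_hours)
    print(f"{namespace}: {avg:.1f} GPUs on average, {avg / peak:.0%} of the {peak}-GPU peak")
```

The averages come out at roughly 18, 67, and 23 GPUs. Each group's peak is well above its average, which is the kind of bursty demand a shared pool absorbs better than a dedicated cluster sized for the peak.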
Machine Learning-Based Neural Radiance Fields for View Synthesis (NeRFs) Are Transformational!
BY JARED LINDZON
NOVEMBER 10, 2022
A neural radiance field (NeRF) is a fully-connected neural network that can generate novel views of complex 3D scenes, based on a partial set of 2D images.
https://datagen.tech/guides/synthetic-data/neural-radiance-field-nerf/
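To make "fully-connected network" concrete, here is a minimal, dependency-free sketch of the function a NeRF represents: it maps a 3D position and a viewing direction to a colour and a density. This is not the Ravi Group's code. The layer sizes, the number of encoding frequencies, and all function names are illustrative assumptions, the weights are random rather than trained, and the volume-rendering step that turns these outputs into an image is omitted. It also feeds the view direction in at the input, a simplification of the published architecture.

```python
import math
import random

def positional_encoding(coords, num_freqs=4):
    """Encode each coordinate c as sin(2^k*pi*c), cos(2^k*pi*c) for k = 0..num_freqs-1."""
    encoded = []
    for c in coords:
        for k in range(num_freqs):
            angle = (2.0 ** k) * math.pi * c
            encoded.extend([math.sin(angle), math.cos(angle)])
    return encoded

def init_layers(sizes, seed=0):
    """Random (untrained) weights; a real NeRF learns these from the 2D input images."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = math.sqrt(2.0 / n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers

def dense(layer, inputs):
    """One fully-connected layer: weights @ inputs + biases."""
    weights, biases = layer
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]

def radiance_field(layers, position, view_dir):
    """Map a 3D position and viewing direction to (rgb, density)."""
    h = positional_encoding(position) + positional_encoding(view_dir)
    for layer in layers[:-1]:
        h = [max(v, 0.0) for v in dense(layer, h)]               # ReLU
    r, g, b, raw_density = dense(layers[-1], h)
    rgb = tuple(1.0 / (1.0 + math.exp(-c)) for c in (r, g, b))   # sigmoid keeps colour in [0, 1]
    density = max(raw_density, 0.0)                              # density is non-negative
    return rgb, density

# 3 coords x 4 freqs x 2 (sin, cos) = 24 features each for position and direction
layers = init_layers([48, 64, 64, 4])
rgb, density = radiance_field(layers, (0.1, -0.2, 0.3), (0.0, 0.0, 1.0))
print(rgb, density)
```

Training and rendering evaluate such a network at many sample points along many camera rays, which is why this workload consumes so many GPU-hours.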
Source: Prof. Ravi Ramamoorthi, UCSD
https://youtu.be/hvfV-iGwYX8
Summary & Conclusions
=> more consistent coverage across USA
Acknowledgements