What’s Up in the Cloud?
A Cloud of Resources Overview &
Project Eureka, Adaptable Cyber-Infrastructure at Scale
Connect!
Boyd Wilson
Boyd at omnibond.com
linkedin.com/in/boydwilson
x.com/boydwilson
Instagram.com/boydfryguy
Leadership
Team
Omnibond: a customer-focused software engineering and support company
Software Products
Data -
The Oil of the AI Generation
NSF Global Instrument Data (100s of PB/yr)
Weather Station Data
A Cloud of CI Resources
Network Foundation
Image:Internet2
Campus Locations
Image: Andrew C. Comrie + IPEDS (2020)
Sampling of University CI Capabilities
| | Clemson | Princeton | Oklahoma |
|---|---|---|---|
| Nodes | 1,786 | 1,492 | 919 |
| Cores | 34,916 | 90,000 | 29,428 |
| GPUs | 850 | 423 | 89 |
| Storage (PB) | 2.2 | 18 | 12.8 |
NSF CI Resources
Image:NSF
Sampling of NSF Resource Capabilities
| | Jetstream2 (IU, ASU, Cornell, UH, TACC) | Stampede3 (TACC) | Bridges-2 (PSC) |
|---|---|---|---|
| Nodes | 506 | 1,858 | 576 |
| Cores | 49,152 | 140,000 | 65,344 |
| GPUs | 360 | 40 | 280 |
| Storage (PB) | 14 | 13 | 15 |
Public Cloud Provider Locations
Image: Peter Alguacil
+ Atomia
Example: AWS Scaling
https://aws.amazon.com/blogs/aws/natural-language-processing-at-clemson-university-1-1-million-vcpus-ec2-spot-instances/
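As a rough illustration (not the configuration from the blog post), here is a minimal boto3 sketch of launching EC2 Spot capacity of the kind the 1.1 million vCPU run was built on; the AMI, instance type, counts, and region are placeholders.

```python
# Minimal sketch: launching EC2 Spot instances with boto3.
# The AMI, instance type, counts, and region are placeholders,
# not the actual configuration from the Clemson 1.1M-vCPU run.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI
    InstanceType="c5.24xlarge",        # 96 vCPUs per instance
    MinCount=1,
    MaxCount=100,                      # scale out across many such requests
    InstanceMarketOptions={
        "MarketType": "spot",
        "SpotOptions": {"SpotInstanceType": "one-time"},
    },
)
print(f"Launched {len(response['Instances'])} Spot instances")
```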
Example: Google Cloud Scaling
Google HPC Blog Post: Kevin Kissell, Technical Director, Office of the CTO, Nov 18, 2019
Urgent HPC can Burst Affordably to the Cloud
Processed 2,479,396 hours (~256TB) of video data
Total Cost: $52,598.64 USD
Average cost of $0.008 USD per vCPU hour
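A quick back-of-the-envelope check of those figures (not from the post itself): dividing the total cost by the per-vCPU-hour rate gives the implied compute consumed.

```python
# Back-of-the-envelope check of the published figures.
total_cost = 52_598.64          # USD
cost_per_vcpu_hour = 0.008      # USD per vCPU-hour
video_hours = 2_479_396         # hours of video processed

vcpu_hours = total_cost / cost_per_vcpu_hour
print(f"Implied vCPU-hours: {vcpu_hours:,.0f}")                       # ~6.57 million
print(f"vCPU-hours per video-hour: {vcpu_hours / video_hours:.2f}")   # ~2.65
```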
Accessing Resources
NSF Resource Access
ACCESS (Traditional HPC)
OSPool (High Throughput)
NRP (National Research Platform: K8s hypercluster for containers)
AWS Cloud HPC Resource Options
Amazon FSx for Lustre
Partner/Other HPC Solutions
AWS Supports:
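As one concrete example of the FSx for Lustre option above, here is a minimal boto3 sketch of creating a scratch file system; the subnet, security group, and capacity values are placeholders.

```python
# Minimal sketch: creating an Amazon FSx for Lustre scratch file system with boto3.
# Subnet, security group, and capacity values are placeholders.
import boto3

fsx = boto3.client("fsx", region_name="us-east-1")

fs = fsx.create_file_system(
    FileSystemType="LUSTRE",
    StorageCapacity=1200,                       # GiB; smallest SCRATCH_2 size
    SubnetIds=["subnet-0123456789abcdef0"],     # placeholder subnet
    SecurityGroupIds=["sg-0123456789abcdef0"],  # placeholder security group
    LustreConfiguration={"DeploymentType": "SCRATCH_2"},
)
print(fs["FileSystem"]["FileSystemId"])
```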
Google Cloud HPC Resource Options
Partner HPC Options
High-level steps:
Google Cloud Supports:
Azure Cloud HPC Resource Options
Partner HPC Options
Azure Cloud Supports:
Help from CaRCC (Campus Research Computing Consortium)
carcc.org
Join the People Network
A work-in-progress update
Vision & Use Cases
Project Eureka Vision
Storage
Interactive
Compute
Enabling
Moments
Multi-Cloud/Edge Architecture
[Diagram: multiple K8s-based sites (on-prem, cloud, edge), each providing Job Routing, Elastic Scratch, Data Staging, and Elastic Compute; jobs can also be routed to the OSPool]
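To make the "Job Routing" idea concrete, here is a hypothetical Python sketch of a routing decision that prefers sites with free cores and already-staged data; the Site/Job structures and site names are illustrative only, not Project Eureka's actual logic.

```python
# Hypothetical sketch of a job-routing decision: pick a site with free
# capacity, preferring wherever the input data is already staged.
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    free_cores: int
    staged_datasets: set

@dataclass
class Job:
    cores: int
    dataset: str

def route(job: Job, sites: list[Site]) -> Site:
    candidates = [s for s in sites if s.free_cores >= job.cores]
    if not candidates:
        raise RuntimeError("no site can currently satisfy the job")
    # Prefer a site where the dataset is already staged to avoid a transfer.
    local = [s for s in candidates if job.dataset in s.staged_datasets]
    pool = local or candidates
    return max(pool, key=lambda s: s.free_cores)

sites = [
    Site("on-prem", 512, {"weather-2024"}),
    Site("cloud-a", 4096, set()),
    Site("ospool", 20000, set()),
]
print(route(Job(cores=256, dataset="weather-2024"), sites).name)  # on-prem
```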
Interactive Application Integration Use Case
[Diagram: interactive, HPC, and HTC applications running on-prem, at the edge, or in the cloud; OmniSched handles instance/storage provisioning and job execution on K8s]
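As a rough illustration of job execution on K8s, the sketch below launches a containerized job with the official Kubernetes Python client; this is generic Kubernetes usage, not OmniSched's actual interface, and the image and command are placeholders.

```python
# Minimal sketch: launching a containerized job on K8s with the official
# Python client. Generic Kubernetes usage, not OmniSched's actual API.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod

job = client.V1Job(
    metadata=client.V1ObjectMeta(name="example-interactive-job"),
    spec=client.V1JobSpec(
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[
                    client.V1Container(
                        name="app",
                        image="python:3.12-slim",  # placeholder image
                        command=["python", "-c", "print('hello from k8s')"],
                    )
                ],
            )
        )
    ),
)

client.BatchV1Api().create_namespaced_job(namespace="default", body=job)
```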
Eureka User Experience
Open OnDemand Deployments
User-Level Security Architecture (Based on Open OnDemand)
[Diagram: the Server Frontend (runs as the Apache user) provides user authentication and a reverse proxy over HTTPS/WSS; a per-user NGINX instance (runs as each individual user) serves the Eureka-UI through Passenger over IPC sockets; per-user JWTs are passed to OmniSched, and jobs run as the user]
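A minimal sketch of the per-user JWT step, using PyJWT; the secret handling and claim names here are illustrative, not Eureka's actual implementation.

```python
# Minimal sketch: issuing and verifying a per-user JWT with PyJWT before
# forwarding a request to the per-user backend. Illustrative only.
import jwt  # PyJWT

SECRET = "replace-with-a-per-deployment-secret"

def issue_token(username: str) -> str:
    return jwt.encode({"sub": username}, SECRET, algorithm="HS256")

def verify_token(token: str) -> str:
    """Return the authenticated username, or raise jwt.InvalidTokenError."""
    claims = jwt.decode(token, SECRET, algorithms=["HS256"])
    return claims["sub"]

token = issue_token("boyd")
print(verify_token(token))  # boyd
```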
Identity Architecture (Using AWS as an Example)
[Diagram: federated sign-in via Shibboleth / Microsoft Entra ID into AWS]
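A minimal sketch, assuming SAML federation, of how an assertion from Shibboleth or Entra ID could be exchanged for temporary AWS credentials via STS; the ARNs are placeholders, and obtaining the base64-encoded assertion from the IdP is outside this sketch.

```python
# Minimal sketch: exchanging a SAML assertion (e.g. from Shibboleth or
# Microsoft Entra ID) for temporary AWS credentials with STS.
# Role/provider ARNs are placeholders.
import boto3

sts = boto3.client("sts")

saml_assertion_b64 = "..."  # base64-encoded SAML response from the IdP (placeholder)

resp = sts.assume_role_with_saml(
    RoleArn="arn:aws:iam::123456789012:role/EurekaUser",               # placeholder
    PrincipalArn="arn:aws:iam::123456789012:saml-provider/CampusIdP",  # placeholder
    SAMLAssertion=saml_assertion_b64,
    DurationSeconds=3600,
)
creds = resp["Credentials"]
print(creds["AccessKeyId"], creds["Expiration"])
```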
Interactive Apps Demo
Bursting &
Data Staging
Elastic On-Prem -> Cloud Use Case
[Diagram: jobs are submitted to OmniSched on-prem; standard jobs run on-prem, while burst jobs, driven by job directives, trigger elastic compute and storage in the cloud (elastic scratch, data staging, elastic compute on K8s) via CloudyCluster, Project Eureka, and the Storage Manager]
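A hypothetical sketch of how burst-related job directives might be parsed from a job script; the "#EUREKA" syntax and keys are invented for illustration and are not the actual CloudyCluster / Project Eureka directive format.

```python
# Hypothetical sketch: parsing burst-related directives from a job script.
# The "#EUREKA" directive syntax and keys are illustrative only.
def parse_directives(script_text: str) -> dict:
    directives = {}
    for line in script_text.splitlines():
        if line.startswith("#EUREKA "):
            for pair in line[len("#EUREKA "):].split():
                key, _, value = pair.partition("=")
                directives[key] = value
    return directives

script = """#!/bin/bash
#EUREKA burst=cloud scratch=2TB instance_type=c5.24xlarge
srun ./simulate --input data.nc
"""
print(parse_directives(script))
# {'burst': 'cloud', 'scratch': '2TB', 'instance_type': 'c5.24xlarge'}
```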
Demo
What’s Up in the Cloud?
Thank You!
Questions?
A Cloud of Resources Overview &
Project Eureka, Adaptable Cyber-Infrastructure at Scale
Please Connect!
Boyd Wilson
Boyd at omnibond.com
linkedin.com/in/boydwilson
x.com/boydwilson
Instagram.com/boydfryguy