Exploring Clouds for Acceleration of Science
NSF Award #1904444
Cornell Cloud Forum 2021
Presented by: Ananya Ravipati
Agenda
Leadership
NSF approached Internet2 to lead this cooperative effort
Why E-CAS?
Exploring Clouds for Acceleration of Science. NSF Award #1904444
Scope of the project:
Acceleration of Science:
Innovation:
E-CAS: Exploring Clouds for Acceleration of Science
NSF Award #1904444: $3.2M
Purdue: Building Typology for Urban Climate Modeling
UW-Madison: IceCube Astronomical Neutrino Detector
SDSC: Bursting CIPRES Phylogenetics Science Gateway
MIT: Accelerated Machine Learning
George Washington: BioCompute Objects in Galaxy
SUNY Downstate: Deciphering the Brain’s Neural Code
https://internet2.edu/ecas
Timeline
Dec 2018 – Jan 2019: Call for proposals
Feb 2019: Academic review of proposals
Mar 2019: Phase 1 subaward contracts
Apr 2019 – Apr 2020: 6x Phase 1 projects
Apr 2020: Reports & presentations on Phase 1 projects
May 2020: Academic review of Phase 1 projects
July 2020: Phase 2 subaward contracts
Aug 2020 – Aug 2021: 2x Phase 2 projects
Sep 2021: Final reports, project wrap-up (NCE)
Phase 1 video presentations and reports available at https://internet2.edu/ecas
SDSC: CIPRES Phylogenetic Analysis Gateway
CIPRES Science Gateway
XSEDE – Comet Supercomputer
Cloud Burst to AWS
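Bursting means filling local supercomputer capacity first and overflowing the remaining jobs to the cloud. A minimal sketch of such a decision policy, assuming a simple greedy scheduler; job names, core counts, and thresholds are illustrative, not taken from the E-CAS reports:

```python
# Hypothetical sketch of a cloud-burst policy in the spirit of CIPRES
# overflowing jobs from the Comet supercomputer to AWS.
# All names and numbers below are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Job:
    job_id: str
    cores: int

def partition_jobs(queue, free_local_cores):
    """Keep jobs local while capacity remains; burst the rest to the cloud."""
    local, burst = [], []
    for job in queue:
        if job.cores <= free_local_cores:
            free_local_cores -= job.cores
            local.append(job)
        else:
            burst.append(job)
    return local, burst

queue = [Job("raxml-1", 24), Job("mrbayes-2", 16), Job("beast-3", 48)]
local, burst = partition_jobs(queue, free_local_cores=40)
print([j.job_id for j in local])   # → ['raxml-1', 'mrbayes-2']
print([j.job_id for j in burst])   # → ['beast-3']
```

The real gateway's scheduling logic is more involved (queues, quotas, spot pricing); this only shows the overflow idea.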
UW Madison: Cloud Computing for the IceCube Neutrino Observatory
The IceCube Neutrino observatory
Science from disciplines including astrophysics, particle physics, and the geophysical sciences; the detector operates continuously and is simultaneously sensitive to the whole sky
Open Science Grid
GWU: BioCompute Objects in Galaxy Science Gateway
Galaxy
BCO
https://galaxy.aws.biochemistry.gwu.edu/static/bco_tour.html
Purdue: Building Typology for Urban Climate Modeling
SUNY Downstate: Deciphering the Brain’s Neural Code
Each simulation requires 100 compute cores; on GCP they ran 100,000 cores simultaneously
Algorithm optimization used 1.8 million core-hours over 2 weeks
A single job runs for ~10 days
Slurm-GCP multi-user clusters
Containerization
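The scale figures above can be sanity-checked with simple arithmetic: 1.8 million core-hours over two weeks implies a sustained average of roughly 5,400 cores, against the 100,000-core peak quoted for the simultaneous runs.

```python
# Back-of-envelope check of the scale quoted above:
# 1.8 million core-hours consumed over a 2-week window.
core_hours = 1_800_000
wall_hours = 14 * 24              # two weeks in hours = 336
avg_cores = core_hours / wall_hours
print(f"{avg_cores:.0f} cores sustained on average")  # → 5357
```

The gap between the ~5,400-core average and the 100,000-core peak reflects the bursty usage pattern that makes cloud capacity attractive for this workload.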
MIT: Heterogeneous Computing of LHC Data
Large Hadron Collider data rates of about ~1 petabit/second
Only data from about 1 in 100,000 collisions is analyzed
Used MLaaS (Machine Learning as a Service) to:
Design the CPU/GPU density required for the LHC HLT (High-Level Trigger)
Train graph neural networks
Extended similar approaches to other areas of high-energy physics:
Gravitational wave analysis
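The trigger numbers above imply a dramatic data reduction, which a quick calculation makes concrete (treating the quoted ~1 Pb/s and 1-in-100,000 figures as order-of-magnitude values):

```python
# Back-of-envelope for the LHC trigger reduction quoted above:
# raw data rate ~1 petabit/s; only ~1 in 100,000 collisions kept.
raw_bits_per_s = 1e15             # ~1 Pb/s
keep_fraction = 1 / 100_000
retained_bits_per_s = raw_bits_per_s * keep_fraction
print(f"~{retained_bits_per_s / 1e9:.0f} Gb/s retained")  # → ~10 Gb/s
```

This is why the HLT's CPU/GPU sizing matters: the selection must run fast enough to discard 99.999% of collisions in real time.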
Project Learnings
Technical
Business and related processes
Programmatic
Survey on ‘Research Computing use of Cloud Platforms’
What are the main opportunities/benefits of using cloud for research?
What are the main difficulties when using cloud for research?
Conclusions
Thank you
Contact: aravipati@internet2.edu