
Status Report: Science Gateway and Cloud Computing Activities

September 2024 - March 2025

Sean Arms, Julien Chastang, Nicole Corbin, Ethan Davis, Doug Dirks, Ana Espinoza, Ward Fisher, Thomas Martin, Ryan May, Tiffany Meyer, Jennifer Oxelson Ganter, Mike Schmidt, Tanya Vance

Executive Summary

Questions for Immediate Committee Feedback

NSF Unidata Science Gateway staff maintain a THREDDS Data Server (TDS) on Jetstream2 (https://tds.scigw.unidata.ucar.edu) that largely mirrors the datasets available on https://thredds.ucar.edu, but with a shorter archive. This TDS serves as an experimentation and staging ground for thredds-docker, which sees significant community adoption. However, while the server does receive some use—such as by Daryl Herzmann at Iowa State, who notes when it is unavailable—its overall utilization appears to be limited. Do you have any suggestions for how we might enhance the value and impact of this TDS? Are there particular datasets or services that would make it more useful to the broader atmospheric science community?

Activities Since the Last Status Report

JupyterHub Activities

PyAOS JupyterHub and Magnum Autoscaling on Jetstream2

In collaboration with Andrea Zonca (Jetstream2/SDSC) and Julian Pistorius (Jetstream2), NSF Unidata Science Gateway staff have made significant progress in launching OpenStack Magnum autoscaling JupyterHub clusters. Although still in the experimental and testing phase, this approach is expected to optimize Jetstream2 resource usage by dynamically scaling clusters based on student demand, rather than provisioning for peak usage and leaving resources underutilized. To explore this capability, we have begun testing it on an internal JupyterHub server at https://unistaff.ees220002.projects.jetstream-cloud.org.
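
For illustration, creating such a cluster from the OpenStack command line looks roughly like the sketch below. The template name, node counts, and label values are placeholders rather than our tested recipe; the workflow we actually follow is the one Andrea Zonca publishes on his blog.

```bash
# Hypothetical Magnum cluster creation with autoscaling labels (placeholder values)
openstack coe cluster create jupyterhub-class \
  --cluster-template kubernetes-autoscaling-template \
  --master-count 1 \
  --node-count 1 \
  --labels auto_scaling_enabled=true,min_node_count=1,max_node_count=10
```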

JupyterHub Virtual Desktop Technology for Legacy Applications

NSF Unidata Science Gateway staff are exploring the deployment of JupyterHub virtual desktop technology to enable the integration of legacy UI-based applications within modern JupyterHub environments. While we have experimented with similar approaches over the past decade, recent advancements—particularly in autoscaling—represent a significant leap forward, bringing this technology closer to practical usability.
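
A typical image recipe for this approach, sketched from the jupyter-remote-desktop-proxy documentation rather than our exact deployment, layers a lightweight desktop environment and a VNC server on top of a standard notebook image:

```dockerfile
# Hypothetical desktop-capable single-user image (package choices are assumptions,
# not necessarily the configuration used on the Science Gateway)
FROM quay.io/jupyter/base-notebook:latest

USER root
# Lightweight desktop environment plus a VNC server for the in-browser desktop
RUN apt-get update && \
    apt-get install -y --no-install-recommends dbus-x11 xfce4 xfce4-terminal tigervnc-standalone-server && \
    rm -rf /var/lib/apt/lists/*

USER ${NB_UID}
# Proxies a VNC desktop session through JupyterHub/JupyterLab
RUN pip install --no-cache-dir jupyter-remote-desktop-proxy
```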

This work supports our efforts to deploy Cloud IDV and Cloud CAVE as part of the Science Gateway Reimagined initiative. Given IDV and CAVE’s computational demands, autoscaling is crucial, allowing resources to dynamically scale based on user needs, thereby improving UI responsiveness within science gateway workflows.

Moreover, these advancements foster a more seamless transition between traditional interactive visualization tools and modern notebook-based analysis pipelines. The integration of JupyterHub with Magnum, Cloud IDV, and Cloud CAVE has the potential to create a powerful NSF Unidata-enabled ecosystem that enhances the accessibility and usability of complex scientific workflows. We look forward to making these capabilities available to the community in the near future.

Exploring AI-NWP Models with earth2mip on Jetstream2

In collaboration with Thomas Martin, we experimented with running pre-trained AI-based Numerical Weather Prediction (AI-NWP) models using NVIDIA's earth2mip (https://github.com/NVIDIA/earth2mip) package on Jetstream2. Our goal was to deploy this technology on a GPU-enabled JupyterHub for hands-on exploration and testing. We successfully demonstrated that models such as FourCastNet and Pangu can be executed on standard GPU hardware, highlighting their potential for advancing AI/ML-driven weather modeling research within the NSF Unidata community. A blog post detailing our findings and experiences is currently in preparation.
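
A first sanity check in such an environment is confirming that the notebook kernel can see the GPU; the snippet below shows that step, with the earth2mip model-loading call left commented out as an assumption about that package's interface rather than verified usage.

```python
# Confirm the JupyterHub kernel can see a CUDA device before running AI-NWP models
import torch

assert torch.cuda.is_available(), "No CUDA device visible to this kernel"
print(torch.cuda.get_device_name(0))

# Hypothetical earth2mip usage (interface assumed, not verified here):
# from earth2mip.networks import get_model
# model = get_model("e2mip://fcn", device="cuda:0")
```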

Kubernetes Fluent Bit Logging for PyAOS JupyterHub Clusters

To improve collaboration with NSF Unidata system administrators, we have implemented Fluent Bit for log aggregation on PyAOS JupyterHub clusters running on Kubernetes. This technology ensures that all relevant logs are collected and stored on a dedicated virtual machine within the local network, providing system administrators with the necessary logs for monitoring and troubleshooting.
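
As a sketch of the idea (the host name and port below are placeholders, not our deployment values), a Fluent Bit output stanza that forwards Kubernetes container logs to a remote collector looks like:

```
# Forward Kubernetes container logs to the dedicated log-collection VM
[OUTPUT]
    Name   forward
    Match  kube.*
    Host   logs.collector.example
    Port   24224
```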

Dask Cluster for University of Wisconsin

During the fall semester of 2024, we worked with Professor Hannah Zanowski from the University of Wisconsin-Madison to establish a JupyterHub Kubernetes Dask cluster for her earth system science class of 22 students. Dask, a Python library, facilitates the scaling of Python code for parallel computing on distributed clusters.
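
For illustration, the kind of workflow Dask enables looks like the sketch below; a LocalCluster stands in here for the Kubernetes-backed cluster provisioned for the class.

```python
# Minimal Dask example: distribute an array computation across workers.
import dask.array as da
from dask.distributed import Client, LocalCluster

cluster = LocalCluster(n_workers=2)  # stand-in for the Kubernetes-backed cluster
client = Client(cluster)

x = da.random.random((20_000, 20_000), chunks=(2_000, 2_000))
print(x.mean().compute())  # work is split into chunks and scheduled on the workers
```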

WRF Single Column Model in JupyterHub

During the fall 2024 and spring 2025 semesters, we continued our work with Greg Blumberg from Millersville University and deployed an idealized single-column WRF model in a JupyterHub environment for undergraduate instructional objectives.

New JupyterHub Administrator Documentation

Each semester, we work with professors who serve as JupyterHub administrators, ensuring they can effectively manage their course environments. To streamline the onboarding process, we have developed new administrator documentation, available at:

https://github.com/Unidata/science-gateway/blob/master/user-docs/admin-docs.md.

JupyterHub Servers for Workshops, Fall and Spring Semesters

NSF Unidata continues to employ our Jetstream2 resource allocation for the benefit of students in the atmospheric science community by providing access to customized JupyterHub servers. NSF Unidata tailors these servers to the requirements of the instructors so they can accomplish their Earth Systems Science teaching objectives. During the period covered by this status report, beginning with the fall 2024 semester, 416 students at 17 academic institutions and various workshops have used NSF Unidata JupyterHub servers running on Jetstream2.

Notably, we provided JupyterHub resources to:


Jetstream2 TDS, TDM, and LDM Modernization

We have undertaken a modernization effort for the THREDDS Data Server (TDS), THREDDS Data Manager (TDM), and Local Data Manager (LDM) services running on Jetstream2 (https://tds.scigw.unidata.ucar.edu). These data distribution components had become outdated. In collaboration with Sean Arms, we have worked to update their configurations. For example, we incorporated Canadian Meteorological Centre (CMC) model data, which is now available at https://tds.scigw.unidata.ucar.edu/thredds/catalog/grib/CMC/RDPS/NA_10km/catalog.html.
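
For example, the new CMC catalog can be browsed programmatically with Siphon; a short sketch (dataset listings will vary over time) is:

```python
# Browse the CMC RDPS catalog on the Science Gateway TDS using Siphon
from siphon.catalog import TDSCatalog

cat = TDSCatalog(
    "https://tds.scigw.unidata.ucar.edu/thredds/catalog/grib/CMC/RDPS/NA_10km/catalog.xml"
)
print(list(cat.datasets))      # datasets offered at this catalog level
print(list(cat.catalog_refs))  # nested catalogs, if any
```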

LROSE Collaboration between Colorado State University and NSF NCAR EOL

The NSF Unidata Science Gateway team continues its collaboration with Professor Michael Bell’s group at Colorado State University and NSF NCAR's Earth Observing Laboratory (EOL) to enhance their science gateway for radar meteorology. This gateway features a JupyterHub environment integrated with LROSE (Lidar Radar Open Software Environment) to support advanced data analysis and visualization.

Recent efforts have focused on adapting SAMURAI (Spline Analysis at Mesoscale Utilizing Radar and Aircraft Instrumentation) for use within JupyterHub. Additionally, we are leveraging JupyterHub Virtual Desktop technology—the same approach used for Cloud IDV and Cloud CAVE—to integrate legacy LROSE tools such as HawkEye into this gateway, ensuring continued usability of established analysis workflows.

Our team brings expertise in JupyterHub, OpenStack, and Jetstream2 to support this initiative. As part of this collaboration, we presented the paper "A Lidar and Radar Meteorology Science Gateway for Education and Research on the NSF Jetstream2 Cloud" at the Science Gateways 2024 conference and assisted with an LROSE workshop at the AMS 2025 meeting in New Orleans.

To support this work, NSF Unidata secured $70,000 in funding for contributions from Ana Espinoza and Julien Chastang.

NSF Unidata Science Gateway Re-Imagined

The Science Gateway Re-Imagined (SGRI) team (Nicole Corbin, Ana Espinoza, and Julien Chastang, with managerial support from Ethan Davis and Tanya Vance) convenes regularly to move the project forward. We are synchronizing our efforts with Doug Dirks and the NSF Unidata web group to ensure our initiative moves in harmony with NSF Unidata's overall web strategy.

The SGRI team has completed Phase 1 with the conclusion of the JupyterHub requests and Education Hub beta periods. Feedback from these periods has been incorporated, and the resulting features are ready for deployment with the new NSF Unidata website.

Phase 2 (on-demand notebooks, data integration, and community forums) began in early 2025. The team is working collaboratively with other NSF Unidata staff in the Systems and Community Services group to design the capabilities, scope, and user experience for these deliverables. With on-demand notebooks, visitors to the science gateway will be able to launch notebooks without needing to request dedicated JupyterHub resources. In addition to enabling users to demo existing or new NSF Unidata-authored notebooks, the work done to provide on-demand notebooks can be applied during Phase 3 of the project, Community Contributions. On-demand notebooks will likely be made possible with BinderHub, although this is subject to change pending a more thorough investigation of potential security concerns and exploration of other technologies. We expect the next beta period for these features to begin in early 2026.

Phased Approach to Development Summary

Phase 1 (Jan 2025 Release): Requests and Education – Users can request both compute resources (in the form of JupyterHubs) and educational resources (trainings, modules, etc.) and browse existing educational resources

Phase 2 (Apr 2026 Release): On-Demand Notebooks, Data Integration, and Community Forums – Users can interact with NSF Unidata curated “on-demand” notebooks without the need for a JupyterHub request, access data which is proximate to the computational environment, and share and develop ideas with colleagues in a community forum

Phase 3 (July 2026): Community Contributions – Users can contribute to the content (educational materials, notebooks, workflows, etc.) found on the Science Gateway according to written guidelines for the management and maintenance of this content

Phase 4 (Jul 2027 Release): App Streaming & Fully Re-Imagined Science Gateway – Users can “test-fire” NSF Unidata products such as the IDV or NSF Unidata’s version of AWIPS CAVE in their browser as a substitute for or prior to a local installation

Ongoing Activities

NOAA Big Data Program


University of Oklahoma REU Program Support

NSF Unidata Science Gateway staff collaborate each semester with Ben Schenkel (University of Oklahoma) to provide a JupyterHub environment for NSF Research Experience for Undergraduates (REU) students. When needed, we also host datasets on the Science Gateway RAMADDA server, ensuring seamless access to data from within the students' JupyterHub environment.

Jetstream2 Allocation Management and Collaboration

NSF Unidata staff continue to actively manage our Jetstream2 allocation and collaborate with the Jetstream2 team. We closely monitor our allocation and decommission outdated resources to prevent allocation exhaustion. In preparation for a Jetstream2 planned outage (Jan 6–9), we took steps to mitigate disruptions to NSF Unidata services. Additionally, we occasionally work with Jetstream2 staff to resolve disk attachment issues that impact users. We also set up a VM on Jetstream2 for Stonie Cooper and Sean Arms to support a WMO Information System 2.0 (WIS 2.0) node. WIS2 enables WMO members to efficiently publish, exchange, and access standardized weather data, and will gradually replace the Global Telecommunication System (GTS). The UPC is evaluating WIS2 as a potential data source as well as for possible direct use by the community. Finally, our disk quota increase request from 50 TB to 70 TB was approved to better meet our needs for AWIPS.

Jetstream2 and Science Gateway Security

We continually work with NSF Unidata system administrator staff to ensure that our web-facing technologies and virtual machines on Jetstream2 adhere to the latest security standards. This effort involves tasks such as ensuring HTTPS is employed, keeping cipher lists current, keeping Docker containers up to date, and limiting SSH access to systems. It is a constantly evolving area that must be addressed frequently.

Collaboration with Andrea Zonca on Jetstream2

The ongoing collaboration between NSF Unidata Science Gateway staff and Andrea Zonca continues to expand into new areas. Together, we have made significant progress in porting the OpenStack Magnum Kubernetes-as-a-Service "Zero to JupyterHub" workflow to Jetstream2. Andrea publishes these workflows on his blog, and we actively test them, providing feedback to refine and enhance their functionality.

Docker Containerization of NSF Unidata Technology

We continue to employ Docker container technology to streamline building, deploying, and running NSF Unidata technology offerings in cloud-based environments. Specifically, we are refining and improving Docker images for the LDM, RAMADDA, THREDDS (TDS), and the THREDDS Data Manager (TDM). Most recently, we released thredds-docker 5.6 in conjunction with the 5.6 release of the TDS. In addition, we maintain a security-hardened NSF Unidata Tomcat container inherited by the RAMADDA and THREDDS containers; independently, this Tomcat container has gained use in the geoscience community. To keep our containers up-to-date, especially with respect to security, we programmatically monitor upstream updates and respond by automatically building and deploying the refreshed containers to DockerHub.
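
For reference, running the released image with its out-of-the-box configuration is a short exercise; the sketch below assumes the 5.6 release is tagged as such on DockerHub, and production deployments would additionally mount catalog, configuration, and data directories as described in the thredds-docker README.

```bash
# Pull and run thredds-docker with its default configuration
docker pull unidata/thredds-docker:5.6
docker run -d --name tds -p 8080:8080 unidata/thredds-docker:5.6
# The default TDS catalog should then be reachable at http://localhost:8080/thredds/
```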

AWIPS EDEX in Jetstream2 Cloud

NSF Unidata continues to host our publicly accessible EDEX servers on the Jetstream2 cloud platform, where we serve real-time AWIPS data to CAVE clients and python-awips users. We have seen upwards of 300 clients connecting to EDEX in a single day. The distributed architectural concepts of AWIPS allow us to scale EDEX in the cloud to account for the desired data feed (and size). We continue using Jetstream2 to develop cloud-deployable AWIPS instances as virtual machine images (VMIs) available to users of the OpenStack CLI.

Unfortunately, our systems have had quite a few issues on Jetstream2, and both the AWIPS team and the Science Gateway team have spent significant time troubleshooting and repairing machines to keep our servers operational. In January, Jetstream2 announced a maintenance downtime expected to last multiple days; because we knew ahead of time, we were able to spin up a local set of EDEX machines at NSF Unidata.

Because we need to spin up new machines fairly often, we have simplified and streamlined this process by creating custom Rocky 8 images that can be used for deployment on Jetstream2. We have successfully created and launched a Rocky 8 EDEX system, which the AWIPS team has been using to develop the latest version of AWIPS.

EDEX is designed so that different components can be run across separate virtual machines (VMs) to improve efficiency and reduce latency. Our current design makes use of three VMs: one large instance that processes most of the data and runs all of the EDEX services, including all requests, and two smaller ancillary instances used to ingest and decode radar and satellite data, respectively.

We are currently supporting two sets of servers as described above: one running our v23 production software and another running v23 development software. We may add two more sets (running Rocky 8) for future development and beta builds. Having backup/development servers allows us to patch, maintain, and develop our servers while still having a functional server for our users and minimizing downtime. In January we decommissioned the v18 and v20 systems, since they were running CentOS 7, which reached end of life in June 2024.

Nexrad AWS THREDDS Server on Jetstream2 Cloud

As part of the NOAA Big Data Project, NSF Unidata maintains a THREDDS Data Server on the Jetstream2 cloud serving Nexrad data from Amazon S3. This TDS leverages Internet2 high-bandwidth capability for serving the radar data from the Amazon S3 data holdings. The TDS team and Science Gateway staff collaborate to maintain this server.

NSF Unidata Science Gateway Website and GitHub Repository

Website

The NSF Unidata Science Gateway website is regularly updated to reflect the progress of what is available on the gateway. The news section is refreshed from time to time with announcements concerning the gateway, and the conference section and bibliography are also maintained with new information. We are in the process of redesigning this website; see the “NSF Unidata Science Gateway Re-Imagined” section above.

Repository

All technical information on deploying and running NSF Unidata Science Gateway technologies is documented in the repository README. This document is constantly updated to reflect the current state of the gateway.

Presentations/Publications/Posters

New Activities

Over the next three months, we plan to organize or take part in the following:

Forthcoming conference participation

The LROSE collaboration has submitted abstracts to the following conferences:

Over the next twelve months, we plan to organize or take part in the following:

Relevant Metrics

Fall 2024 / Spring 2025 JupyterHub Servers

Since spring of 2020, NSF Unidata has provided access to JupyterHub scientific computing resources to about 2300 researchers, educators, and students (including a few NSF REU students) at 28 universities, workshops (regional, AMS, online), and the UCAR SOARS program. Below are the latest metrics (institution, number of active users, point of contact) since the last status report.

Fall 2024

Institution | Active Users | Point of Contact
UND Fall Workshop | 34 | David Delene
Florida Institute of Technology | 23 | Steve Lazarus
Florida Institute of Technology B | 13 | Milla Costa
Southern Arkansas University | 36 | Keith Maull
Seoul National University | 24 | Duseong Jo
University of Florida | 7 | Stephen Mullens
University of Wisconsin | 22 | Hannah Zanowski
University of Wisconsin B | 1 | Hannah Zanowski
University of Wisconsin Dask | 22 | Hannah Zanowski
Vermont State University | 8 | Andrew Westgate
SUNY Oswego | 13 | Scott Steiger
Colorado School of Mines | 28 | Thomas Martin (NSF Unidata Staff)
University of Alabama Huntsville | 13 | Sean Freeman
Millersville University | 2 | Greg Blumberg
University of Oklahoma | 3 | Ben Schenkel
University of Oklahoma B | 3 | Ben Schenkel

Spring 2025

Institution | Active Users | Point of Contact
CUNY | 37 | Bill Spencer
Florida Institute of Technology | 10 | Milla Costa
Florida State University | 34 | Christopher Holmes
Millersville University | 33 | Greg Blumberg
Seoul National University | 1 | Duseong Jo
Southern Arkansas University | 27 | Keith Maull
University of Massachusetts Lowell | 1 | Mathew Barlow
University of Louisville | 18 | Jason Naylor
University of Northern Colorado | 0 | Wendilyn Flynn
University of Oklahoma | 3 | Ben Schenkel
SUNY Oswego | 0 | Scott Steiger
Naval Postgraduate School | 0 | Derek Podowitz

Note: Some entries in the tables above indicate zero or one user. These are recently launched hubs whose instructors have not yet had a chance to complete their setup and launch the coursework for students to access.

Jetstream2 Allocation Usage Overview

In addition to service units (SUs) used for running various kinds of virtual machines (“regular” CPU and GPU instances), NSF Unidata was also granted a limited number of compute, storage, and network resources to carry out Jetstream2 operations. These three kinds of resources are ephemeral, being created and destroyed as necessary. Thus, metrics regarding these resources represent short-term utilization, while SU usage is a metric representative of our long-term Jetstream2 utilization.

Following NSF Unidata’s 8M+ SU grant renewal, which went into effect in October 2023, NSF Unidata staff have been proactive in ensuring Jetstream2 resources are used effectively and not wastefully. The Science Gateway team has automated SU usage data collection through interactions with the Jetstream2 API. This data is extrapolated forward in time to predict future SU usage, allowing us to make meaningful decisions about the science gateway’s capabilities. The scripts have been shared with our LROSE collaborators.
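
The extrapolation itself is simple; the sketch below illustrates the idea with placeholder usage history rather than our actual collection scripts (only the final usage value and the allocation match the CPU row in the tables that follow).

```python
# Illustrative SU burn-rate extrapolation (placeholder history, not real records)
from datetime import date

history = [                          # (date, cumulative SUs used); earlier point is invented
    (date(2024, 10, 1), 2_500_000),
    (date(2025, 3, 12), 3_079_650),  # matches the CPU "SUs Used" figure below
]
allocation = 8_191_300               # CPU SUs allocated (see table below)

(t0, u0), (t1, u1) = history[0], history[-1]
rate = (u1 - u0) / (t1 - t0).days    # average SUs consumed per day
days_left = (allocation - u1) / rate
print(f"Burn rate ~{rate:,.0f} SU/day; allocation exhausted in ~{days_left:,.0f} days")
```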

SU usage and resource metrics, current as of March 12, 2025, are presented below.

SU Usage

Type | SUs Used | SUs Allocated | % Usage*
CPU | 3,079,650 | 8,191,300 | 38 %
GPU | 59,194 | 672,768 | 9 %

Resource Metrics

Compute

Type | Used | Total | Percent Usage*
Instances | 103 | 150 | 69 %
vCPUs | 1003 | 4185 | 24 %
RAM | 3.7 TB | 16.4 TB | 23 %

Storage

Type | Used | Total | Percent Usage*
Volumes | 254 | 400 | 64 %
Volume Snapshots | 0 | 50 | 0 %
Volume Storage | 20.1 TB | 58.6 TB | 34 %

Network

Type | Used | Total | Percent Usage*
Floating IPs | 40 | 320 | 13 %
Security Groups | 69 | 100 | 69 %
Security Group Rules | 229 | 300 | 76 %
Networks | 2 | 100 | 2 %
Ports | 139 | 250 | 56 %
Routers | 2 | 15 | 13 %

* Percent Usage is rounded to the nearest whole number

GitHub Statistics*

Repository | Watches | Stars | Forks | Open Issues | Closed Issues | Open PRs | Closed PRs
science-gateway | 6 (-1) | 19 | 13 | 5 | 167 | 19 (+3) | 819 (+45)
tomcat-docker | 10 (-1) | 65 | 70 | 0 | 42 | 0 | 97
thredds-docker | 12 (-3) | 38 (+5) | 29 (+1) | 4 (+1) | 124 (+4) | 0 | 188 (+4)
ramadda-docker | 3 (-1) | 0 | 2 | 1 | 10 | 0 | 38
ldm-docker | 8 (-1) | 12 | 14 | 1 | 40 | 0 | 70
tdm-docker | 4 (-1) | 4 | 7 | 0 | 10 | 0 | 29 (+2)

* Numbers in parentheses denote change since the last status report


Prepared March 2025