
Status Report: Science Gateway and Cloud Computing Activities

November 2023 - April 2024

Shay Carter, Julien Chastang, Nicole Corbin, Ethan Davis, Doug Dirks, Tara Drwenski, Ana Espinoza, Ward Fisher, Thomas Martin, Ryan May, Tiffany Meyer, Jennifer Oxelson Ganter, Mike Schmidt, Tanya Vance, Jeff Weber

Executive Summary

Questions for Immediate Committee Feedback

  1. What new or existing software are you excited to use or teach in your research? Does the installation process pose a barrier to entry, and can the Unidata Science Gateway team help you?
  2. Are you interested in being a beta tester of the Re-Imagined Science Gateway?
  3. Are there any specific case studies or success stories we should highlight?
  4. What feedback do you have on the phased SGRI development approach?

Activities Since the Last Status Report

LROSE Collaboration between Colorado State University and NSF NCAR EOL

Unidata science gateway staff collaborates with Professor Michael Bell’s team at Colorado State University and NCAR EOL to develop their science gateway, which features a JupyterHub equipped with LROSE (Lidar Radar Open Software Environment) radar meteorological software. Currently, we are adapting the SAMURAI (Spline Analysis at Mesoscale Utilizing Radar and Aircraft Instrumentation) software for JupyterHub use. We share our expertise in JupyterHub and related technologies with the team. Along with the LROSE team, we are in the beginning stages of drafting a submission to the ERAD (European conference on RADar in meteorology and hydrology) conference and a potential submission to the Gateways conference. As part of this effort, Unidata has secured $70K to fund Ana and Julien’s contributions to the project.

Weather Research and Forecasting Model on Jetstream2

Summary

We continue to maintain the capability to run containerized versions of the Weather Research and Forecasting (WRF) numerical weather prediction model on Jetstream2. This capability has two major advantages: the model can be easily distributed to any Docker-enabled machine, and its output can be analyzed and visualized in a data-proximate manner, for example in a JupyterLab environment.

WRF Navajo Technical University

Unidata is collaborating with the Southwestern Indian Polytechnic Institute and Navajo Technical University to deploy an operational WRF model over the Navajo Nation. This project aims to provide Tribal Nations and the Tribal Colleges and Universities (TCUs) with the capacity for environmental monitoring in alignment with data sovereignty objectives. Progress in this area is on hold while the TCUs engage in other aspects of their project.

WRF Single Column Model in JupyterHub

In collaboration with Greg Blumberg at Millersville University, Unidata staff have deployed an idealized single-column WRF model in a JupyterHub environment for undergraduate instructional objectives. As a result of this collaboration, Unidata staff presented their procedures and findings at the Science Gateways 2023 Conference, hosted in Pittsburgh, PA on Oct 29 through Nov 1, 2023. Our collaboration with Dr. Blumberg is ongoing; he has requested the WRF SCM-capable JupyterHub to support course development and learning objectives for both the Fall 2023 and Spring 2024 semesters.

Unidata Science Gateway Re-Imagined

The Science Gateway Re-Imagined (SGRI) team, consisting of Julien Chastang, Nicole Corbin, and Ana Espinoza with managerial support from Ethan Davis and Tanya Vance, convenes bi-weekly to move the project forward. Our recent work involves familiarizing ourselves with the Drupal Content Management System, the technology that UCAR plans to use to standardize its web presence. We are coordinating with Doug Dirks and the Unidata web group to ensure our initiative aligns with Unidata's overall web strategy.

Phased Approach to Development

The SGRI team, in collaboration with the Unidata Management Team, has developed a phased release approach to advance the project's development. The phases, summarized below, were chosen based on several factors, including ease of development, available FTEs, and priorities that align with other Unidata goals. Most phases include an initial beta release. We anticipate asking NSF Unidata committees and other community members to provide feedback on the design and usability of these features. Dates in parentheses are the tentative start/release dates for each phase of the project.

Phase 1: Requests and Education – Users can request both compute resources (in the form of JupyterHubs) and educational resources (trainings, modules, etc.) and browse existing educational resources

  1. (Aug 2024) JupyterHub Requests Beta
  2. (Oct 2024) Education Hub/Requests Beta
  3. (Jan 2025) Phase 1 Release

Phase 2: On-Demand Notebooks, Data Integration, and Community Hub – Users can interact with Unidata curated “on-demand” notebooks without the need for a JupyterHub request, access data which is proximate to the computational environment, and share and develop ideas with colleagues in a community forum

  1. (Jan 2026) Phase 2 Beta
  2. (Apr 2026) Phase 2 Release

Phase 3: Community Contributions – Users can contribute to the content (educational materials, notebooks, workflows, etc.) found on the Science Gateway according to written guidelines for the management and maintenance of this content

  1. (Jul 2026) Phase 3 Release

Phase 4: App Streaming & Fully Re-Imagined Science Gateway – Users can “test-fire” Unidata products such as the IDV or Unidata’s version of AWIPS CAVE in their browser as a substitute for or prior to a local installation

  1. (Jan 2027) Phase 4 Beta
  2. (Jul 2027) Phase 4 Release


JupyterHub Activities

Streamlining Launching of JupyterHub Clusters

We have deepened our understanding of Ansible and how it can improve our JupyterHub deployment process, for example by distributing SSH keys and disabling non-essential Ubuntu services such as CUPS for security. We have integrated new Ansible playbooks into our existing deployment process to perform these tasks.

GPU Cluster Developments

We collaborated with Alex Haberlie from Northern Illinois University and his master's student, Jeremy Corner, to set up a GPU-enabled JupyterHub. This platform supports their atmospheric science and machine learning objectives and is accessible at jupyterhub.unidata.ucar.edu. Access is granted upon request for those who wish to experiment with GPU technology.

We also collaborated with Maria Molina from the University of Maryland to launch a GPU-equipped JupyterHub for her graduate-level AI/ML class.

We have made progress in two key areas: optimizing our GPU-specific container images, which simplifies our GPU JupyterHub deployment and reduces disk image size; and developing a workflow that includes both CPU and GPU-enabled instances for improved Service Unit (SU) management. We are currently testing this hybrid CPU/GPU approach with our colleague Sean Freeman at the University of Alabama Huntsville.

Dask Cluster Developments

We worked with Professor Hannah Zanowski from the University of Wisconsin-Madison to establish a JupyterHub Kubernetes Dask cluster for her Earth System Science class of 15 students. Dask, a Python library, facilitates the scaling of Python code for parallel computing on distributed clusters.

JupyterHub Servers for Workshops, Fall and Spring Semesters

Unidata is employing our Jetstream2 resource allocation for the benefit of students in the atmospheric science community by providing access to customized JupyterHub servers at an accelerating pace. Unidata tailors these servers to the requirements of the instructors so they can accomplish their Earth Systems Science teaching objectives. Since the fall semester of 2023 (the period covered by this status report), 444 students at 14 academic institutions and various workshops have used Unidata JupyterHub servers running on Jetstream2.

Notably, we provided JupyterHub resources to:

Ongoing Activities

NOAA Big Data Program

University of Oklahoma REU Students

Unidata continues to collaborate with Ben Schenkel (OU) to provide data sets via the science gateway RAMADDA server. We also deployed a JupyterHub server so that NSF REU students at OU could access those data for their projects.

Andrea Zonca Collaboration

Unidata staff continues to collaborate with Andrea Zonca (SDSC/Jetstream2) employing his port of the "Zero to JupyterHub with Kubernetes" project to OpenStack and Jetstream2. We give Andrea feedback by testing his instructional blog entries and workflows. When we encounter issues, we submit bug reports via GitHub and work together until the problem is resolved. In addition, when we develop techniques or improvements to the workflow, we work with Andrea to ensure that this information is shared for the benefit of the wider Jetstream2 community.

Docker Containerization of Unidata Technology

We continue to employ Docker container technology to streamline building, deploying, and running Unidata technology offerings in cloud-based environments. Specifically, we are refining and improving Docker images for the LDM, RAMADDA, and THREDDS. We also maintain a security-hardened Unidata Tomcat container from which the RAMADDA and THREDDS containers inherit. Independently, this Tomcat container has gained use in the geoscience community. To keep our containers up to date, especially with respect to security, we programmatically monitor upstream updates and respond by automatically building and deploying the refreshed containers to DockerHub.

AWIPS EDEX in Jetstream2 Cloud

Unidata continues to host our publicly accessible EDEX servers on the Jetstream2 cloud platform, where we serve real-time AWIPS data to CAVE clients and python-awips users. The distributed architecture of AWIPS allows us to scale EDEX in the cloud to account for the desired data feed (and size). We continue using Jetstream2 to develop cloud-deployable AWIPS instances as virtual machine images (VMIs) available to users of the OpenStack CLI. Since last summer, all EDEX servers have been running on Jetstream2. Unfortunately, the service has not been entirely seamless, and both the AWIPS team and the Science Gateway team have spent significant time troubleshooting and repairing machines to keep our servers operational. In addition, we have created custom CentOS 7 and Rocky 8 images for deployment on Jetstream2, on which we can provision new EDEX machines before CentOS 7’s end of life on June 30, 2024. We have successfully created and launched a Rocky 8 EDEX system, which the AWIPS team has been using to develop the latest version of AWIPS.

EDEX is designed so different components can be run across separate virtual machines (VMs) to improve efficiency and reduce latency. Our current design makes use of three VMs: one large instance to process most of the data and run all of the EDEX services including all requests, and two other ancillary machines which are smaller instances used to ingest and decode radar and satellite data individually.

We are currently supporting four sets of servers as described above: one set has been running our v18 software (as a backup during the transition from v18 to v20), two sets are running our new production v20, and a final system, using the Rocky 8 OS, has been used in the development of the new v23 release. Live backups allow us to patch, maintain, and develop our servers while still having a fail-safe if something goes wrong with the current production system. We plan to decommission the v18 system in the near future and will likely replace it with a second v23 server to serve as the backup and development server, while the existing v23 server becomes the production server. Some time after that, once we have fully migrated to v23, we will decommission the “old” v20 servers.

Nexrad AWS THREDDS Server on Jetstream2 Cloud

As part of the NOAA Big Data Project, Unidata maintains a THREDDS data server on the Jetstream2 cloud serving Nexrad data from Amazon S3. This TDS server leverages the high-bandwidth capability of Internet2 to serve the radar data from Amazon S3 data holdings. TDS team member Tara Drwenski and science gateway staff collaborate to maintain this server.

The URL for the THREDDS Nexrad radar server will be changed from thredds-aws.unidata.ucar.edu to tds-nexrad.scigw.unidata.ucar.edu to better reflect its purpose. It will also be integrated into the science gateway's data services section.

Jetstream2 and Science Gateway Security

We continually work with Unidata system administrator staff to ensure that our web-facing technologies and virtual machines on Jetstream2 adhere to the latest security standards. This effort involves tasks such as ensuring we are employing HTTPS, keeping cipher lists current, ensuring Docker containers are up to date, and limiting SSH access to systems. It is a constantly evolving area that must be addressed frequently.

Unidata Science Gateway Website and GitHub Repository

Website

The Unidata Science Gateway website is regularly updated to reflect what is currently available on the gateway. The news section is refreshed from time to time with announcements concerning the gateway. The conference section and bibliography are also kept current with new information. We are in the process of redesigning this website. See the “Unidata Science Gateway Re-Imagined” section above.

Repository

All technical information on deploying and running Unidata Science Gateway technologies is documented in the repository README. This document is constantly updated to reflect the current state of the gateway.

Presentations/Publications/Posters

New Activities

Over the next three months, we plan to organize or take part in the following:

Forthcoming conference attendance

None planned at this time.

Jetstream2 Allocation Renewal

The ACCESS program has updated its procedures for requesting new or renewed “Maximize” allocations, the largest type of allocation grant. ACCESS will accept Maximize requests from June 15, 2024 to July 31, 2024. If granted, Unidata’s Jetstream2 allocation will renew on October 1, 2024.

Over the next twelve months, we plan to organize or take part in the following:

Tomcat 8.5 End of Life

Tomcat 8.5 reached end of life on March 31, 2024. This will require staff to transition the Tomcat Docker containers and any dependencies to a newer version of Tomcat.

Lessons Learned from April 14-17 Jetstream2 Outage

We aim to work with the Jetstream2 team to learn lessons from an outage that occurred from April 14-17.

Improved JupyterHub Kubernetes Cluster Stability

We aim to provide an optimal experience for our users, but unfortunately, we've experienced more downtimes than we'd prefer. Specifically, issues with disk attachments have disrupted users' ability to consistently access their Jupyter instances. To proactively address these issues, we plan to use cluster monitoring software like Prometheus and Grafana. This will allow us to identify and resolve problems before they impact the user experience.

Relevant Metrics

Fall 2023 / Spring 2024 JupyterHub Servers

Since spring of 2020, Unidata has provided access to JupyterHub scientific computing resources to about 1830 researchers, educators, and students (including a few NSF REU students) at 24 universities, workshops (regional, AMS, online), and the UCAR SOARS program. Below are the latest metrics since the last status report.

Fall 2023

Institution / Event                        Users  Point of Contact
Boise State                                0      Prof Lejo Flores
Florida Institute of Technology            8      Prof Milla Costa
Metropolitan State University of Denver    20     Erin Rhoades
Millersville University                    3      Prof Greg Blumberg
University of Oklahoma                     2      Ben Schenkel
University of Oklahoma 2                   1      Professor Sakaeda
Southern Arkansas University               33     Keith Maull
University of Louisville                   8      Prof Jason Naylor
University of Wisconsin                    29     Pete Pokrandt, Prof Mayra Oyola
University of Wisconsin 2                  20     Prof Hannah Zanowski
University of Wisconsin Dask               20     Prof Hannah Zanowski
CSU Python Workshop 1                      25     Unidata Staff: Drew, Nicole, Thomas
CSU Python Workshop 2                      14     Unidata Staff: Drew, Nicole, Thomas
AI/ML pyaos.unidata.ucar.edu November      36     Unidata Staff: Thomas
Colorado School of Mines                   27     Prof Zane Jobe

Spring 2024

Institution / Event                        Users  Point of Contact
AMS MetPy short course                     27     Unidata Staff: Drew
AMS Python workshop                        38     Unidata Staff: Drew
Florida Institute of Technology            14     Milla Costa
Florida State University                   14     Christopher Holmes
Millersville University                    34     Greg Blumberg
Seoul National University                  22     Duseong Jo
Southern Arkansas University               12     Keith Maull
University of Alabama Huntsville           12     Sean Freeman
University of Louisville                   5      Jason Naylor
University of Maryland                     17     Maria Molina
University of Northern Colorado            0      Wendilyn Flynn
University of Oklahoma                     3      Ben Schenkel
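
As a quick consistency check, the per-server counts listed above for Fall 2023 and Spring 2024 can be tallied and compared against the 444 students cited earlier in this report. The sketch below is illustrative only: the variable and function names are hypothetical, and the numbers are copied from the listings above.

```python
# User counts per JupyterHub server, copied from the Fall 2023 and
# Spring 2024 listings in this report. Names here are illustrative.
FALL_2023 = [
    ("Boise State", 0),
    ("Florida Institute of Technology", 8),
    ("Metropolitan State University of Denver", 20),
    ("Millersville University", 3),
    ("University of Oklahoma", 2),
    ("University of Oklahoma 2", 1),
    ("Southern Arkansas University", 33),
    ("University of Louisville", 8),
    ("University of Wisconsin", 29),
    ("University of Wisconsin 2", 20),
    ("University of Wisconsin Dask", 20),
    ("CSU Python Workshop 1", 25),
    ("CSU Python Workshop 2", 14),
    ("AI/ML pyaos.unidata.ucar.edu November", 36),
    ("Colorado School of Mines", 27),
]

SPRING_2024 = [
    ("AMS MetPy short course", 27),
    ("AMS Python workshop", 38),
    ("Florida Institute of Technology", 14),
    ("Florida State University", 14),
    ("Millersville University", 34),
    ("Seoul National University", 22),
    ("Southern Arkansas University", 12),
    ("University of Alabama Huntsville", 12),
    ("University of Louisville", 5),
    ("University of Maryland", 17),
    ("University of Northern Colorado", 0),
    ("University of Oklahoma", 3),
]


def total_users(servers):
    """Sum the user counts over a list of (server, users) pairs."""
    return sum(users for _, users in servers)


if __name__ == "__main__":
    fall = total_users(FALL_2023)
    spring = total_users(SPRING_2024)
    print(f"Fall 2023:   {fall}")
    print(f"Spring 2024: {spring}")
    print(f"Total:       {fall + spring}")
```

The two semester subtotals add up to the reporting-period total quoted in the JupyterHub Activities section.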

Jetstream2 Allocation Usage Overview

In addition to service units (SUs) used for running various kinds of virtual machines–“regular” CPU and GPU instances–Unidata was also granted a limited number of compute, storage, and network resources to carry out Jetstream2 operations. These three kinds of resources are ephemeral, being created and destroyed as necessary. Thus, metrics regarding these resources are representative of short-term utilization, while SU usage is a metric that can be representative of our long-term Jetstream2 utilization.

Following Unidata’s 8M+ SU grant renewal, which went into effect in October 2023, Unidata staff have been proactive in ensuring that Jetstream2 resources are used effectively and without waste. To that end, we are developing an SU monitoring script that tracks historical SU usage and predicts future usage.

SU usage and resource metrics, current as of April 17, 2024, are presented below.

SU Usage

Type  SUs Used   SUs Allocated  % Usage*
CPU   4,316,750  8,191,300      53 %
GPU   458,005    672,768        68 %
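
To illustrate the kind of projection an SU monitoring script might make, here is a minimal sketch that extrapolates the SU figures above at a constant burn rate. It is not Unidata's actual script. The function names are hypothetical, and the allocation start date of October 1, 2023 is an assumption (the report states only "October 2023"). The renewal date of October 1, 2024 and the usage figures come from this report.

```python
from datetime import date, timedelta

# Figures from the SU Usage table, current as of April 17, 2024.
AS_OF = date(2024, 4, 17)
# Assumption: the allocation period began on October 1, 2023
# (the report says only that the renewal took effect in October 2023).
START = date(2023, 10, 1)
# Renewal date stated in the "Jetstream2 Allocation Renewal" section.
RENEWAL = date(2024, 10, 1)

# (SUs used, SUs allocated) per instance type.
USAGE = {
    "CPU": (4_316_750, 8_191_300),
    "GPU": (458_005, 672_768),
}


def percent_used(used, allocated):
    """Percent of the allocation consumed, rounded to a whole number."""
    return round(100 * used / allocated)


def project(used, allocated, start=START, as_of=AS_OF, renewal=RENEWAL):
    """Linear extrapolation: assume the average daily burn rate so far continues.

    Returns (SUs per day, projected SUs used at renewal, projected exhaustion date).
    """
    elapsed_days = (as_of - start).days
    rate = used / elapsed_days
    projected_at_renewal = rate * (renewal - start).days
    days_until_exhausted = (allocated - used) / rate
    exhaustion = as_of + timedelta(days=int(days_until_exhausted))
    return rate, projected_at_renewal, exhaustion


if __name__ == "__main__":
    for kind, (used, allocated) in USAGE.items():
        rate, at_renewal, exhaustion = project(used, allocated)
        print(
            f"{kind}: {percent_used(used, allocated)}% used, "
            f"{rate:,.0f} SU/day, "
            f"{at_renewal:,.0f} SUs projected by {RENEWAL}, "
            f"exhaustion around {exhaustion}"
        )
```

A constant burn rate is a crude model: JupyterHub usage follows the academic calendar, so a production script would more plausibly fit recent usage or account for semester schedules.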

Resource Metrics

Compute

Type       Used     Total    Percent Usage*
Instances  105      150      70 %
vCPUs      1178     4035     29 %
RAM        4.4 TB   15.8 TB  28 %

Storage

Type              Used     Total    Percent Usage*
Volumes           222      400      56 %
Volume Snapshots  5        50       10 %
Volume Storage    32.4 TB  39.1 TB  83 %

Network

Type                  Used  Total  Percent Usage*
Floating IPs          52    310    17 %
Security Groups       73    100    73 %
Security Group Rules  229   300    76 %
Networks              3     100    3 %
Ports                 134   250    54 %
Routers               2     15     13 %

* Percent Usage is rounded to the nearest whole number

GitHub Statistics*

Repository       Watches  Stars    Forks    Open Issues  Closed Issues  Open PRs  Closed PRs
science-gateway  6        17       11       5            167            10 (-4)   731 (+49)
tomcat-docker    11       64 (+4)  66 (+2)  0 (-2)       42 (+2)        0         88 (+5)
thredds-docker   15       31 (+4)  27 (+1)  3 (-1)       120 (+3)       0         178 (+2)
ramadda-docker   4        0        2        1            10             0         35 (+1)
ldm-docker       9        12       14 (+1)  1            40             0         70 (+5)
tdm-docker       5        4        7        0            10             0         24 (+1)

* Numbers in parentheses denote change since the last status report

Strategic Focus Areas

We support the following goals described in the Unidata Strategic Plan:

  1. Managing Geoscience Data
    Unidata supplies a good portion of the data available on the IDD network to the Jetstream2 cloud via the LDM and the high-bandwidth Internet2 network. Those data are distributed to the TDS, ADDE, RAMADDA, and AWIPS EDEX installations running on Jetstream2 for the benefit of the Unidata community. Unidata also makes the AWS Nexrad archive data accessible through the TDS Nexrad server running on Jetstream2 at no cost to the community. These data can be accessed in a data-proximate manner with a JupyterHub running on Jetstream2 for analysis and visualization. Containerization technology complements and enhances Unidata data server offerings such as the TDS and ADDE. Unidata experts install, configure, and in some cases security-harden Unidata software in containers defined by Dockerfiles. In turn, these containers can be easily deployed on cloud computing VMs by Unidata staff or by community members who have access to cloud computing resources.
  2. Providing Useful Tools
    Jupyter notebooks excel at interactive, exploratory scientific programming for researchers and their students. With their mixture of prose, equations, diagrams, and interactive code examples, Jupyter notebooks are particularly effective in educational settings and for expository objectives. Their use is prevalent in many scientific disciplines, including atmospheric science. JupyterHub enables specialists to deploy pre-configured Jupyter notebook servers, typically in cloud computing environments. With JupyterHub, users log in to arrive at their own notebook workspace, where they can experiment with and explore preloaded scientific notebooks or create new notebooks. The advantages of deploying a JupyterHub for the Unidata community are numerous. Users can develop and run their analysis and visualization codes proximate to large data holdings that may be difficult and expensive to download. Moreover, JupyterHub spares users from having to download and install complex software environments that can be onerous to configure properly. JupyterHub servers can be pre-populated with notebook projects and the environments required to run them. These notebooks can be used for teaching or as templates for research and experimentation. In addition, a JupyterHub can be provisioned with computational resources not found in a desktop computing setting and can leverage high-speed networks for processing large datasets. JupyterHub servers can be accessed from any web browser-enabled device, such as a laptop or tablet. In sum, they improve "time to science" by removing the complexity and tedium required to access and run a scientific programming environment.
  3. Supporting People
    A Unidata science gateway running in a cloud computing setting aims to help the Unidata community reach its scientific and teaching objectives quickly by supplying users with pre-configured computing environments and helping them avoid the complexities and tedium of managing scientific software. Science gateway offerings such as web-based Jupyter notebooks connected with co-located large data collections are particularly effective in workshop and classroom settings, where students have sophisticated scientific computing environments available for immediate use. In the containerization arena, Unidata staff can quickly deploy Unidata technologies such as the THREDDS data server to support specific research projects for community members.

Prepared April 2024