April 2023 - November 2023
Shay Carter, Julien Chastang, Nicole Corbin, Ethan Davis, Ana Espinoza, Ward Fisher,
Thomas Martin, Ryan May, Tiffany Meyer, Jennifer Oxelson Ganter, Mike Schmidt,
Tanya Vance, Jeff Weber
From April to October 2023, we made significant advancements in our Science Gateway and Cloud Computing offerings and infrastructure. We secured an 8 million SU allocation from the ACCESS program on Jetstream2, marking our largest allocation since 2015. We streamlined the JupyterHub request process, enhanced Docker container projects, and expanded the capabilities of the Unidata demonstration server with GPU support, enabling projects in the artificial intelligence and machine learning arena. We've been actively enhancing our AWIPS EDEX server on Jetstream2 to ensure seamless data delivery and to prepare for upcoming infrastructure changes like CentOS 7's End of Life. Collaborations with institutions such as Colorado State University and the NCAR EOL for the LROSE project have been fruitful, resulting in knowledge sharing and technology development. We successfully deployed the Weather Research and Forecasting (WRF) system on Jetstream2, enabling numerical weather prediction in classroom and research settings. Furthermore, the redesign of the Unidata Science Gateway is underway, with key components including a JupyterHub portal. We also continued to expand on PyAOS JupyterHub offerings and actively participated in several workshops, serving many students. This period saw significant engagement and growth, serving users, expanding technological capabilities, and fostering collaborations.
Unidata has been awarded 8 million SUs from the ACCESS program in another Jetstream2 cycle to maintain continuous access to essential servers like EDEX, JupyterHub, THREDDS, RAMADDA, and LDM/IDD nodes. This grant allows access to a variety of CPU and GPU virtual machines (VMs) with various configurations. This is our largest allocation since 2015, and a significant increase compared to around 5 million SUs received in 2022-23.
Science gateway staff have designed a JupyterHub request form that includes questions on:
This form streamlines the process of requesting JupyterHub servers for semester-long use and workshops. On our end, this form not only helps us keep track of the tasks that need to be completed, but also gives us an automated, centralized location to gather metrics on the requests we receive.
This form is one step in developing our Science Gateway Re-Imagined project, which, among other things, aims to enhance the user experience of using the Unidata Science Gateway and the resources we offer.
Reworked the Unidata tomcat-docker, thredds-docker, ramadda-docker, ldm-docker projects:
Additionally, automation scripts were written to keep these Docker containers consistently updated with the latest versions and security enhancements.
To generate more interest in the previously underutilized demonstration server, it has been upgraded to use GPU machines, enhancing its AI/ML capabilities. Unidata’s Thomas Martin and Jeremy Corner, a master’s student at NIU under Alex Haberlie, are utilizing this improved Hub for their respective AI/ML projects.
Unidata science gateway staff have collaborated with Professor Mike Bell’s team at Colorado State University and NCAR EOL to help build their science gateway, which involves a JupyterHub equipped with LROSE radar meteorological software. We have shared our accumulated expertise in JupyterHub and related technologies with the team.
For the first time in Unidata's presence on Jetstream, we have deployed a containerized version of the Weather Research and Forecasting (WRF) numerical weather prediction system on Jetstream2, providing two different scenarios. This new capability allows for exploration of Numerical Weather Prediction (NWP) models and subsequent analysis and visualization of the output in a data-proximate manner, for example, in a JupyterLab environment.
Unidata is collaborating with the Southwestern Indian Polytechnic Institute and Navajo Technical University to deploy an operational WRF model over the Navajo Nation. This project aims to provide Tribal Nations and Tribal Colleges and Universities (TCUs) with the capacity for environmental monitoring in alignment with data sovereignty objectives.
In collaboration with Greg Blumberg at Millersville University, Unidata staff have deployed a single-column WRF model in a JupyterHub environment for undergraduate instructional objectives. As a result of this collaboration, Unidata staff will be presenting their procedures and findings at the Science Gateways 2023 Conference, hosted in Pittsburgh, PA on Oct 29 through Nov 1, 2023.
We continue to make progress on the design phase of the Unidata Science Gateway Re-Imagined project as time permits. After collaborating with the redesign team and Unidata management, we have settled on plan “2b”, which consists of a redesigned science gateway with the following components:
Meetings have resumed twice monthly to develop an implementation strategy.
Unidata is employing our Jetstream2 resource allocation for the benefit of students in the atmospheric science community by providing access to customized JupyterHub servers at an accelerating pace. Unidata tailors these servers to the requirements of the instructors so they can accomplish their Earth Systems Science teaching objectives. Since spring semester of 2023 (encompassing the length of this status report), 606 students at twelve academic institutions and various workshops have used Unidata JupyterHub servers running on Jetstream2.
Notably, we provided JupyterHub resources to:
Unidata continues to collaborate with Ben Schenkel (OU) to provide data sets via the science gateway RAMADDA server. We also deployed a JupyterHub server so that NSF REU students at OU could access those data for their projects.
Unidata staff continues to collaborate with Andrea Zonca (SDSC/Jetstream2) employing his port of the "Zero to JupyterHub with Kubernetes" project to OpenStack and Jetstream2. We give Andrea feedback by testing his instructional blog entries and workflows. When we encounter issues, we submit bug reports via GitHub and work together until the problem is resolved.
Beyond what we mentioned earlier about improvements in this area, we continue to employ Docker container technology to streamline building, deploying, and running Unidata technology offerings in cloud-based environments. Specifically, we are refining and improving Docker images for the LDM, ADDE, RAMADDA, THREDDS, and AWIPS. In addition, we also maintain a security-hardened Unidata Tomcat container inherited by the RAMADDA and THREDDS containers. Independently, this Tomcat container has gained use in the geoscience community.
Unidata continues to host our publicly accessible EDEX server on the Jetstream2 cloud platform where we serve real-time AWIPS data to CAVE clients and the python-awips data access framework (DAF) API. The distributed architectural concepts of AWIPS allow us to scale EDEX in the cloud to account for the desired data feed (and size). We continue using Jetstream2 to develop cloud-deployable AWIPS instances as virtual machine images (VMIs) available to users of the OpenStack CLI. Since last summer all EDEX servers have been running on Jetstream2. Unfortunately, the service has not been entirely seamless, and both the AWIPS team and the Science Gateways team have spent significant time troubleshooting and repairing machines to keep our servers operational. In addition, we have created a custom CentOS 7 image for deployment on Jetstream2 on which to provision new EDEX machines before CentOS 7’s End of Life on June 30, 2024. Before that time EDEX will be transitioned to be deployable on Rocky or another RHEL derivative.
EDEX is designed so different components can be run across separate virtual machines (VMs) to improve efficiency and reduce latency. Our current design makes use of three VMs: one large instance to process most of the data and run all of the EDEX services including all requests, and two other ancillary machines which are smaller instances used to ingest and decode radar and satellite data individually.
We are currently supporting four sets of servers as described above: two sets are running our v18 software (production version of AWIPS), and two sets are running our new beta v20 software. The live backups allow us to patch, maintain, and develop our servers while still having a fail-safe when something goes wrong with the current production system. Shortly after we release our production version of v20 before the end of the year, we will decommission the two v18 sets and go back to having just two sets of servers in Jetstream.
As part of the NOAA Big Data Project, Unidata maintains a THREDDS data server on the Jetstream2 cloud serving NEXRAD data from Amazon S3. This TDS server leverages Internet2 high bandwidth capability for serving the radar data from Amazon S3 data holdings. TDS team member Tara Drwenski and science gateway staff recently collaborated to upgrade this server.
We continually work with Unidata system administrator staff to ensure that our web-facing technologies and virtual machines on Jetstream2 adhere to the latest security standards. This effort involves such tasks as ensuring we are employing HTTPS, keeping cipher lists current, ensuring Docker containers are up-to-date, limiting ssh access to systems, etc. It is a constantly evolving area that must be addressed frequently.
The Unidata Science Gateway web site is regularly updated to reflect the progress of what is available on the gateway. The news section is refreshed from time to time for announcements concerning the gateway. The conference section and bibliography are also maintained with new information. We are in the process of redesigning this web site. See the “Unidata Science Gateway Re-Imagined” section above.
All technical information on deploying and running Unidata Science Gateway technologies is documented in the repository README. This document is constantly updated to reflect the current state of the gateway.
Tomcat 8.5 will reach end of life on 31 Mar 2024. This will require staff to transition the Tomcat Docker containers and any dependencies to the newer version of Tomcat.
We aim to provide an optimal experience for our users, but unfortunately, we've experienced more downtimes than we'd prefer. Specifically, issues with disk attachments have disrupted users' ability to consistently access their Jupyter instances. To proactively address these issues, we plan to use cluster monitoring software like Prometheus and Grafana. This will allow us to identify and resolve problems before they impact the user experience.
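As one illustration of what such monitoring could look like, the sketch below extracts the names of pods stuck in the `Pending` phase (a common symptom when a volume fails to attach) from a Prometheus instant-query response. This is a hedged example, not our deployed configuration: it assumes kube-state-metrics is installed in the cluster so that the `kube_pod_status_phase` metric is available, and the `pending_pods` helper, the `jhub` namespace, and the sample payload are all hypothetical.

```python
import json

# Assumed PromQL expression; requires kube-state-metrics in the cluster.
PENDING_PODS_QUERY = 'kube_pod_status_phase{phase="Pending"} == 1'


def pending_pods(response):
    """Return sorted (namespace, pod) pairs from an instant-query response.

    `response` is the decoded JSON body returned by Prometheus'
    /api/v1/query endpoint for PENDING_PODS_QUERY.
    """
    if response.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    pods = []
    for sample in response["data"]["result"]:
        labels = sample["metric"]
        pods.append((labels.get("namespace", ""), labels.get("pod", "")))
    return sorted(pods)


# Made-up payload in the shape of an instant-query response (hypothetical names).
SAMPLE = json.loads("""
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {"namespace": "jhub", "pod": "jupyter-student1", "phase": "Pending"},
        "value": [1697457600, "1"]
      }
    ]
  }
}
""")

print(pending_pods(SAMPLE))
```

A check along these lines could feed a Grafana panel or an alert rule, so that a stalled user server is noticed before the user reports it.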
Since spring of 2020, Unidata has provided access to JupyterHub scientific computing resources to about 1500 researchers, educators, and students (including a few NSF REU students) at 18 universities, workshops (regional, AMS, online), and the UCAR SOARS program. Below are the latest metrics since the last status report.
Institution or event | No. of users | POC |
Spring 2023 | ||
AMS 2023 Python Workshop | 87 | Drew, Nicole, Ana, Julien |
AMS 2023 CSU LROSE Workshop | 24 | Jen DeHart, Julien |
AMS 2023 MetPy Short Course | 30 | Drew, Ryan, Kevin, Ana |
LROSE University of Hawaii WS | 15 | Prof Mike Bell (CSU) |
Florida State University | 31 | Prof Chris Holmes |
Florida Institute of Technology | 10 | Prof Steve Lazarus |
University of Oklahoma | 3 | Ben Schenkel |
Millersville University (3 classes!) | 33 | Prof Greg Blumberg |
Penn State University | 16 | Prof Paul Markowski |
Saint Cloud State University | 7 | Prof Matthew Vaughan |
University of Louisville | 11 | Prof Jason Naylor |
University of Wisconsin | 0 | Pete Pokrandt |
Virginia Tech University | 12 | Prof Craig Ramseyer |
Southern Arkansas University | 4 | Keith Maull |
Northern Illinois University (GPU) | 2 | Alex Haberlie |
Summer 2023 | ||
UCAR SOARS Internship | 15 | Keith Maull, UCAR/UCP |
Unidata users workshop | 66 | Unidata Staff |
I-Guide | 16 | Drew, Ryan |
UCAR Professional Development Workshop Series 7 | 30 | Unidata Staff: Drew, Nicole, Thomas |
UND Summer Workshop | 10 | David Delene |
MetPy for Quantitative Analysis of Meteorological Data | 21 | Unidata Staff: Drew, Nicole, Thomas |
Python Readiness Series: Train-the-Trainer | 10 | Unidata Staff: Drew, Nicole, Thomas |
Fall 2023 | ||
Florida Institute of Technology | 9 | Prof Milla Costa |
Metropolitan State University of Denver | 19 | Erin Rhoades |
Millersville University | 2 | Prof Greg Blumberg |
University of Oklahoma | 2 | Ben Schenkel |
University of Oklahoma 2 | 1 | Professor Sakaeda |
Southern Arkansas University | 33 | Keith Maull |
University of Louisville | 7 | Prof Jason Naylor |
University of Wisconsin | 26 | Pete Pokrandt |
University of Wisconsin 2 | 15 | Prof Hannah Zanowski |
CSU Python Workshop 1 | 25 | Unidata Staff: Drew, Nicole, Thomas |
CSU Python Workshop 2 | 14 | Unidata Staff: Drew, Nicole, Thomas |
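As a cross-check, the per-term user counts in the table above can be tallied with a few lines of Python. This is a minimal sketch: the counts are copied from the table in row order, while the names `USERS_BY_TERM` and `total_users` are illustrative and not part of any Unidata tooling.

```python
# User counts copied from the table above, in row order, grouped by term.
USERS_BY_TERM = {
    "Spring 2023": [87, 24, 30, 15, 31, 10, 3, 33, 16, 7, 11, 0, 12, 4, 2],
    "Summer 2023": [15, 66, 16, 30, 10, 21, 10],
    "Fall 2023": [9, 19, 2, 2, 1, 33, 7, 26, 15, 25, 14],
}


def total_users(users_by_term):
    """Return (per-term subtotals, grand total) for the given counts."""
    subtotals = {term: sum(counts) for term, counts in users_by_term.items()}
    return subtotals, sum(subtotals.values())


subtotals, grand_total = total_users(USERS_BY_TERM)
for term, n in subtotals.items():
    print(f"{term}: {n}")
print(f"Total: {grand_total}")
```

The subtotals come to 285, 168, and 153 users for spring, summer, and fall respectively, for a total of 606, which matches the figure quoted for this reporting period.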
In addition to service units (SUs) used for running various kinds of virtual machines, “regular” CPU instances, and GPU instances, Unidata was also granted a limited number of compute, storage, and network resources to carry out Jetstream2 operations. These three kinds of resources are ephemeral, being created and destroyed as necessary. Thus, metrics regarding these resources are representative of short term utilization, while SU usage is a metric that can be representative of our long-term Jetstream2 utilization. As Unidata was only recently granted a new 8M+ SU allocation, starting October 2023, SU usage may not prove a useful metric and has been omitted for this Status Report. Resource metrics current as of October 16, 2023 are presented below.
Compute | |||
Type | Used | Total | Percent Usage* |
Instances | 77 | 150 | 51 % |
vCPUs | 1034 | 4035 | 26 % |
RAM | 3.9 TB | 15.8 TB | 25 % |
Storage | |||
Type | Used | Total | Percent Usage* |
Volumes | 206 | 400 | 52 % |
Volume Snapshots | 0 | 50 | 0 % |
Volume Storage | 31.0 TB | 39.1 TB | 79 % |
Network | |||
Type | Used | Total | Percent Usage* |
Floating IPs | 47 | 310 | 15 % |
Security Group | 61 | 100 | 61 % |
Security Group Rules | 198 | 300 | 66 % |
Networks | 4 | 100 | 4 % |
Ports | 111 | 250 | 44 % |
Routers | 2 | 15 | 13 % |
* Percent Usage is rounded to the nearest whole number
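For readers who want to reproduce the Percent Usage column, the following sketch recomputes it from the Used and Total values for a few representative rows. The `QUOTAS` dictionary and `percent_usage` function are illustrative names only; rounding half up is an assumption about how the figures were rounded, chosen because it reproduces the 52 % reported for Volumes (206 of 400 is exactly 51.5 %).

```python
import math

# (used, total) pairs copied from the tables above.
# RAM and Volume Storage are in TB; the other rows are plain counts.
QUOTAS = {
    "Instances": (77, 150),
    "vCPUs": (1034, 4035),
    "RAM": (3.9, 15.8),
    "Volumes": (206, 400),
    "Volume Storage": (31.0, 39.1),
    "Floating IPs": (47, 310),
    "Security Group Rules": (198, 300),
    "Ports": (111, 250),
    "Routers": (2, 15),
}


def percent_usage(used, total):
    """Percent of quota in use, rounded half up to a whole number."""
    return math.floor(100 * used / total + 0.5)


for name, (used, total) in QUOTAS.items():
    print(f"{name}: {percent_usage(used, total)} %")
```

Volume Storage stands out at 79 % of its quota, the highest utilization among the resources listed.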
Repository | Watches | Stars | Forks | Open Issues | Closed Issues | Open PRs | Closed PRs |
6 (+2) | 17 (+1) | 11 | 5 (+1) | 167 (+1) | 14 (+8) | 682 (+86) | |
11 (+1) | 60 (+1) | 64 (-1) | 2 | 40 | 1 | 83 (+11) | |
15 | 27 (+2) | 26 (+1) | 4 | 117 (+7) | 0 | 176 (+17) | |
4 | 0 | 2 | 1 | 10 | 0 | 34 (+10) | |
9 (+1) | 12 (-3) | 13 | 1 (-4) | 40 (+4) | 0 | 65 (+4) | |
5 (+1) | 4 | 7 | 0 (-1) | 10 (+1) | 0 | 23 (+5) |
We support the following goals described in the Unidata Strategic Plan:
Prepared October 2023