Status Report: THREDDS
October 2021 - June 2022
Hailey Johnson, Tara Drwenski, Jennifer Oxelson, Ryan May, Ethan Davis, Dennis Heimbigner
Areas for Committee Feedback
We are requesting your feedback on the following topics:
- What has been your experience so far with the TDS v5 migration process? Do you have any concerns or suggestions for how to best support our community through the transition?
- Do you have thoughts on the migration of the TDS to microservices? What do you foresee as the greatest challenges and benefits associated with the change?
- How can we help you and your students? We can do much more than Java programming - we love Python too! Our team comes from a variety of academic backgrounds as well, including Meteorology (boundary-, surface-, and canopy-layer, in-situ observations, radar), Computer Science, Oceanography, Chemistry, and Physics!
Activities Since the Last Status Report
Staffing Changes
Tara Drwenski joined the THREDDS team on February 28, 2022. Welcome, Tara! We are also searching for a third team member, who we hope will join us in the next few months.
The THREDDS Project
The THREDDS Project encompasses four projects: **netCDF-Java, the THREDDS Data Server (TDS), Rosetta, and Siphon** (the Unidata Python client to interact with remote data services, such as those provided by the TDS). For specific information on Siphon, please see the Python Status Report. This status report contains updates on cloud compatibility within netCDF-Java and the TDS; for updates on further cloud efforts, including the popular Docker container effort, please see the Cloud Computing Activities Status Report.
Log4Shell and Spring4Shell
- Over the last few months, the Java world has seen an unusual number of critical security exploits, including Log4Shell in December and Spring4Shell in April.
- The THREDDS team has responded with a number of quick releases containing solely upgrades to third-party libraries.
- Throughout this period, we relied more heavily on snapshot releases to respond quickly to security concerns. The current security-compliant releases of netCDF-Java and the TDS are both snapshots. Most users have responded positively to our handling of the situation and have been willing to put snapshots into production.
NetCDF-Java
- NetCDF-Java is now on version 5.5.3-SNAPSHOT.
- Since the last report, netCDF-Java has completed read support for the Zarr data format. This includes a new, extensible module for data filters, allowing users to provide their own compressors to the library.
- Version 5.5.3 (official release) will be released soon, containing a number of bug fixes and enhancements for the TDS, including:
  - HttpServer access for S3 objects
  - Expanded `Coverage` feature to support accessing profile features
TDS version 4.6.x (Maintenance)
- The maintenance release of the TDS is now on version 4.6.21-SNAPSHOT.
- An official release of 4.6.21 will be available soon.
- The end of life for TDS 4.6.x was pushed to August 2022 as we work out issues in TDS 5.x. In the meantime, we will provide security upgrades only.
TDS version 5.x (Current)
- The current release of the TDS is version 5.4-SNAPSHOT.
- Since the release of 5.0.0, a number of bugs have been identified and fixed, including:
  - The NCSS map widget
  - Grid-as-point requests
  - The Godiva3 viewer
  - …and more
- Version 5.4 (official release) will be released soon and will address a number of user concerns and support tickets.
Rosetta
Rosetta remains in a temporary maintenance mode due to limited development resources; no new development is planned for the near future.
Ongoing Activities
We plan to continue the following activities:
Maintenance
- Maintain thredds.ucar.edu and keep up with the addition of new datasets to the IDD.
- Closely monitor the security status of our project dependencies, and provide updated versions of our libraries and server technologies to address vulnerabilities as needed.
- Maintain threddsrc.ucar.edu as a production server running TDS version 5, until we are ready to transition thredds.ucar.edu to the latest version.
- Solicit and respond to user feedback regarding threddsrc.ucar.edu and TDS 5.x.
Development
- Continue work to implement write support for Zarr and NCZarr.
- Continue development of the new filters module and add support for requested common filters, starting with ZStandard.
- Continue work to curate existing documentation and create new documentation.
- gCDM (gRPC for the Common Data Model)
  - gCDM is an ongoing effort that currently exists for netCDF-Java versions 6 and 7 (branches __6.x__ and __develop__) and offers a new way for netCDF-Java to communicate remotely. See the April 2021 THREDDS status report for an in-depth description of the gCDM project.
  - gCDM stands for "gRPC CDM", where gRPC is a recursive acronym that stands for "gRPC Remote Procedure Calls". For more information on gRPC, check out the gRPC FAQ.
  - Work is beginning soon to port gCDM back to netCDF-Java version 5.
  - Porting gCDM to netCDF-Java 5.x will allow the TDS to include a gCDM endpoint.
- Expand S3 support in netCDF-Java and TDS to effectively mirror that of local storage.
- Expand testing for S3 support.
- Improve performance for TDS S3 access, particularly for large aggregations, to prevent potential server timeouts.
The following active proposals directly involve THREDDS work:
- We are in our final four months of the NOAA IOOS award titled "A Unified Framework for IOOS Model Data Access", in which we are partnered with Rich Signell and Axiom Data Science. The goal is to enable support of the UGRID specification within the THREDDS stack, as well as to create a GRID featureType to allow for serving large collections of gridded datasets (including UGRID). This work **strategically aligns with the Unidata 2024 focus area “Managing Geoscience Data, Making Geoscience Data Accessible”** by improving the reliability and scalability of the TDS to handle very large collections of gridded datasets, as well as **“Managing Geoscience Data, Enhancing Community Access to Data”** through the addition of UGRID support (example: MPAS output is on a mesh, a.k.a. “unstructured”, grid).
New Activities
Over the next three months, we plan to organize or take part in the following:
- Release netCDF-Java version 5.5.3 in support of TDS 5.4.
- Migrate the HDF5 filter support to use the same filters module used by Zarr.
- Release TDS version 5.4 as a stable, feature-complete release.
- Migrate the main Unidata THREDDS instance to version 5.4.
- Help the user community upgrade their servers to TDS version 5.4.
- Complete unstructured grid support.
Over the next twelve months, we plan to organize or take part in the following:
- Initial support for any codecs the community deems necessary for reading Zarr and NCZarr.
- Re-evaluate the future of netCDF-Java versions 6-8 and consider forking a new API that more heavily relies on user contributions.
- Add gCDM support to the TDS.
- Re-evaluate the TDS dependency on Java and consider development options to optimize maintainability.
- Begin development of a new product based on microservices.
- Better curate existing documentation into four documentation sets: server administrator (with quick start guide), end user (browser), developer (web access via API), and reference (nitty-gritty details, for those interested in learning more or hacking on the TDS codebase).
Beyond a one-year timeframe, we plan to organize or take part in the following:
- Fully support Java 11 and the Java Platform Module System (end of Java 8 support).
- Continue development of a new product based on microservices.
- Continue to explore object storage as it relates to the TDS.
Relevant Metrics
THREDDS Startup Metrics
| | 2021-10 to 2022-05 | 2014-08 to 2021-04 |
| --- | --- | --- |
| TDS Startup (unique IP address count) | 1,626 | 38,234 |

| | Total Servers | Information page updated |
| --- | --- | --- |
| Publicly Accessible[1] TDS count | 160 | 85 |
From October 2021 through May 2022, **1,626** unique IPs started up the TDS. Since we started tracking these metrics (v4.5.3, August 26th, 2014), we’ve seen the TDS start up from **38,234** unique IP addresses. There are currently **160** publicly accessible TDSs running “in the wild” (31 fewer than in our last report). Of the **160** publicly accessible servers, **85** have updated the name of their server in their server configuration file (taken as a sign that they are maybe, possibly, intended to be used by others).
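The server counts above imply a few derived figures. The snippet below is a minimal sketch of that arithmetic; it uses only the numbers quoted in this report, and the function names are illustrative rather than part of any THREDDS tooling.

```python
def previous_total(current_total, fewer_than_last):
    """Server count implied for the previous report."""
    return current_total + fewer_than_last


def share(part, whole):
    """Fraction of `whole` represented by `part`."""
    return part / whole


# Figures quoted in this report.
PUBLIC_SERVERS = 160      # publicly accessible TDS instances
RENAMED_SERVERS = 85      # servers with an updated server name
FEWER_THAN_LAST = 31      # drop relative to the previous report

last_report_total = previous_total(PUBLIC_SERVERS, FEWER_THAN_LAST)
renamed_share = share(RENAMED_SERVERS, PUBLIC_SERVERS)
relative_drop = share(FEWER_THAN_LAST, last_report_total)

print(last_report_total)          # 191
print(f"{renamed_share:.1%}")     # 53.1%
print(f"{relative_drop:.1%}")     # 16.2%
```

In other words, a little over half of the publicly accessible servers have customized their server name, and the public server count is roughly one sixth lower than the count implied for the previous report.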
The figures below show the distribution of TDS versions (top), and the fractional share of servers running version X or older (bottom). Each labeled version includes betas and snapshots, not just the official release of that version, for presentation simplicity. The majority of the publicly accessible servers are running v4.6.13 or above. TDS v5.0 is the dominant specific version running in the wild.
![](https://lh7-us.googleusercontent.com/docsp/AMfWKRAQrSCLNFVamOaoQ-29PwDVYkbuyfFByWJwpm6LpWLLDnsNe7hLtLjCRROYgE6EI7yydc1782jTHW1HbN8JbvKXyIgouMlCvMZB2Po2oPlXd4GBQhPG3UNADdMRuN2mDxo0q4_N5qTZIFjOCW3WUFrYSlhLwNUSISIBRIOHOUduogJk1N8QAoVH5HcLouG_SCGBtDY)
![](https://lh7-us.googleusercontent.com/docsp/AMfWKRD2ARGHObfPkT49Xcew4TatZKGbAaJiaTg0dkHgwhY_kqNKN9v7M2bl0N3QsSJ7nd0ihY2tEaR7VZph1inOKAiFgQXH3FBMrZgfHdFubFPDWDQ-Eime7TJE9tJ8LqVg-pqGSH9rFSUV4ZWGaRaS3JKV6aeSW-_4qYXHkOYjRndl3tzmZmF525Oz3GWenuwOqJe_1DQ)
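For readers who want to reproduce the "version X or older" view from their own server inventory, the sketch below shows one way to compute a cumulative share per version. The per-version counts are made-up placeholders, not the data behind the figures, and the helper names are hypothetical.

```python
def version_key(version):
    """Turn a dotted version such as '4.6.13' into a sortable tuple."""
    return tuple(int(part) for part in version.split("."))


def cumulative_share(counts):
    """Return (version, share of servers at that version or older), oldest first."""
    total = sum(counts.values())
    running = 0
    shares = []
    for version in sorted(counts, key=version_key):
        running += counts[version]
        shares.append((version, running / total))
    return shares


# Hypothetical counts for illustration only; NOT taken from the report.
example_counts = {"4.6.13": 3, "5.0": 5, "4.3": 2}

for version, fraction in cumulative_share(example_counts):
    print(f"{version}: {fraction:.0%}")
```

Sorting on a numeric tuple rather than on the raw string keeps 4.6.13 after 4.6.2, which a plain string sort would get wrong. Snapshot and beta suffixes would need to be stripped first, consistent with how the figures group them under the labeled version.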
Strategic Focus Areas
The THREDDS projects covered in this report support the following goals described in the Unidata Strategic Plan:
- Managing Geoscience Data
The component software projects of the THREDDS project work to facilitate the management of geoscience data from four points of view: __Making Geoscience Data Accessible, Making Geoscience Data Discoverable, Making Geoscience Data Usable, and Enhancing Community Access to Data__. As a client-side library, **netCDF-Java** enables end users to read a variety of data formats both locally and across numerous remote technologies. Less user-friendly formats, such as GRIB, are augmented with metadata from community-driven metadata standards (e.g. Climate and Forecast metadata standards), and viewed through the more user-friendly Common Data Model (very similar to the netCDF Data Model), providing a single set of Java APIs for interacting with a multitude of formats and standards. The **THREDDS Data Server** exposes the power of the netCDF-Java library outside of the Java ecosystem with the addition of remote data services, such as __OPeNDAP__, __cdmremote__, __OGC WCS__ and __WMS__, __HTTP direct download__, and other remote data access and subsetting protocols. The TDS also exposes metadata in standard ways (e.g. ISO 19115 metadata records, JSON-LD metadata following schema.org), which are used to drive search technologies. **Rosetta** facilitates the process of translating ASCII-based observational data into standards-compliant, archive-ready files. These files are easily read into netCDF-Java and can be served to a broader community using the TDS.
- Providing Useful Tools
Through Rosetta, the THREDDS project seeks to intercede in the in-situ observational data management lifecycle as early as possible. This is done by enabling those who produce the data to create archive-ready datasets as soon as data are collected from a sensor or platform, without the need to write code or intimately understand metadata standards. NetCDF-Java and the TDS continue to support legacy workflows by maintaining support for legacy data formats and decades-old data access services, while promoting 21st century scientific workflows through the creation of new capabilities and modernization of existing services (e.g. immutability, upgraded technical stack, microservice development).
- Supporting People
Outside of writing code, the THREDDS project seeks to support the community by __providing technical support, working to build capacity through Open Source Software development, and by building community cyber-literacy__. The team provides expert assistance on software, data, and technical issues through numerous avenues, including participation in community mailing lists, providing developer guidance on our GitHub repositories, and leading and participating in workshops across the community. The team also actively participates in “upstream” open source projects in an effort to help sustain the efforts on which we rely and build. We have mentored students as part of the Unidata Summer Internship Program, and worked across organizations and disciplines in support of their internship efforts.
Prepared May 2022
[1] “Publicly accessible” means we could find a top-level THREDDS Client Catalog. We checked <server>/thredds/catalog.xml (version 4) and <server>/thredds/catalog/catalog.xml (version 5) on the most common ports: 80, 8080, 443, and 8443.
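The check described in this footnote can be sketched as a list of candidate catalog URLs per host. This is an illustrative reconstruction, not the script used to gather the metrics: the function name is hypothetical, and the pairing of ports with `http` or `https` is an assumption, since the footnote lists only the port numbers. The sketch builds the URLs and leaves the actual network request out.

```python
from itertools import product

# Catalog paths named in the footnote, keyed by TDS major version.
CATALOG_PATHS = {
    4: "/thredds/catalog.xml",
    5: "/thredds/catalog/catalog.xml",
}

# Ports named in the footnote. The scheme chosen for each port is an
# assumption made for this sketch.
PORT_SCHEMES = {80: "http", 8080: "http", 443: "https", 8443: "https"}


def candidate_catalog_urls(host):
    """Return (tds_major_version, url) pairs worth probing for one host."""
    candidates = []
    for (port, scheme), (version, path) in product(
        PORT_SCHEMES.items(), CATALOG_PATHS.items()
    ):
        candidates.append((version, f"{scheme}://{host}:{port}{path}"))
    return candidates


# "example.org" is a placeholder host, not a real TDS.
for version, url in candidate_catalog_urls("example.org"):
    print(version, url)
```

Each host yields eight candidates (four ports times two catalog paths). A server would count as publicly accessible if any one of them returns a catalog document.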