Status Report: THREDDS
November 2022 - March 2023
Hailey Johnson, Tara Drwenski, Megan Lerman, Jennifer Oxelson, Ryan May, Ethan Davis, Dennis Heimbigner
Executive Summary
TDS version 5 is now the only supported version of the TDS, and inflow of bug reports is slowing. We have started plans for netCDF-Java version 6 and begun development on TDS microservices, to include: gCDM, File Service, and Catalog Service.
Questions for Immediate Committee Feedback
Would any of your automated data access workflows be disrupted if the format of the THREDDS catalogs were to change (e.g. from XML to JSON)? (That is, are you explicitly parsing catalogs or relying on a provided API like Siphon?)
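For reference, "explicitly parsing catalogs" means client code that reads the catalog XML directly, roughly as in the minimal Python sketch below. The sample catalog and the `list_dataset_paths` helper are hypothetical and written only for illustration; the namespace is the one used by THREDDS client catalogs (InvCatalog 1.0). Workflows of this kind are the ones most likely to be affected by a format change, whereas workflows that go through a client API such as Siphon would depend on that API being updated.

```python
import xml.etree.ElementTree as ET

# Namespace used by THREDDS client catalogs (InvCatalog 1.0).
THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"

# Hypothetical, hand-written catalog used only for illustration.
SAMPLE_CATALOG = """
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         name="Example catalog">
  <dataset name="Example collection">
    <dataset name="file_a.nc" urlPath="example/file_a.nc"/>
    <dataset name="file_b.nc" urlPath="example/file_b.nc"/>
  </dataset>
</catalog>
""".strip()


def list_dataset_paths(catalog_xml):
    """Return the urlPath of every dataset element that has one."""
    root = ET.fromstring(catalog_xml)
    tag = "{%s}dataset" % THREDDS_NS
    # Container datasets have no urlPath attribute and are skipped.
    return [ds.get("urlPath") for ds in root.iter(tag) if ds.get("urlPath")]


if __name__ == "__main__":
    for path in list_dataset_paths(SAMPLE_CATALOG):
        print(path)
```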
Activities Since the Last Status Report
NetCDF-Java
- NetCDF-Java remains on version 5.5.3 (as of 3/15/23).
- Work has begun to plan for the development of netCDF-Java 6.x API, which will remove the large number of deprecated methods and limit the public-facing API.
TDS
- Version 5.5 will be released soon and will address a number of user concerns and support tickets; we additionally anticipate a performance improvement in the NCSS and WMS services for enhanced datasets.
- The main development focuses for the TDS currently are stability, performance, and cloud support.
- The THREDDS team has begun working on the TDS microservices, with progress being made on gRPC access (netCDF-Java), File Service (Python), and Catalog Service (Java).
- Plans have been made to develop a performance and benchmarking test suite for the TDS services.
Ongoing Activities
Server management
- We have implemented a new continuous deployment process for the Unidata THREDDS servers:
  - thredds.ucar.edu now always runs the latest stable release of the TDS (unless a quick security update is required).
  - thredds-test.ucar.edu automatically deploys a new version of the TDS whenever the netCDF-Java or TDS GitHub repositories are updated; it is therefore always running the latest development version.
  - thredds-dev.ucar.edu is intended to be used by THREDDS developers, rather than THREDDS users; we use this domain to test changes that require access to “real” data.
Maintenance
- Maintain thredds.ucar.edu and keep up with the addition of new datasets to the IDD.
- Closely monitor the security status of our project dependencies, and provide updated versions of our libraries and server technologies to address as needed.
- Continue to respond to user feedback regarding TDS 5.x and transitioning servers to the latest version.
Development
- Continued work to implement write support for Zarr and NCZarr.
- Continue development of the new filters module and add support for requested common filters, starting with ZStandard.
- Add support for Zarr and NCZarr in the TDS.
- Continued work to support the DAP4 protocol in netCDF-Java and the TDS
- Expand S3 support in netCDF-Java and TDS to effectively mirror that of local storage.
- Expand testing for S3 support.
- Performance and benchmarking
  - Create automated benchmarking and regression testing tools for both netCDF-Java and the TDS.
  - Improve performance for TDS S3 access, particularly for large aggregations, to prevent potential server timeouts.
- Continue planning and development for the next iteration of the TDS with a microservice-based architecture.
New Activities
Over the next three months, we plan to organize or take part in the following:
- Improve the performance of scale/offset transformations
- Release version 5.5.3
- Continue to participate in the Zarr development community
- Finish porting the gCDM (gRPC for the Common Data Model) module to netCDF-Java.
- Begin developing other gRPC endpoints for netCDF-Java
- Plan for a netCDF-Java version 6 with a limited public API
- Release version 5.5 of the TDS.
- Continue to help the user community upgrade their servers to TDS version 5.x.
- Work with Thomas Martin on AI/ML pipeline using the THREDDS Notebook Service
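As background for the scale/offset item in the list above: under the CF metadata conventions, packed values are unpacked as `packed * scale_factor + add_offset`. The sketch below only illustrates that arithmetic in plain Python; it is not netCDF-Java's implementation, and the `unpack` helper and its sample values are hypothetical.

```python
def unpack(packed_values, scale_factor=1.0, add_offset=0.0):
    """Apply the CF-style unpacking: unpacked = packed * scale_factor + add_offset."""
    return [value * scale_factor + add_offset for value in packed_values]


if __name__ == "__main__":
    # Hypothetical packed integers and attributes, for illustration only.
    print(unpack([0, 10, 20], scale_factor=0.5, add_offset=100.0))
```

Because this transformation is applied to every value that is read, its cost grows with the size of the request, which is one reason it is a natural target for performance work.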
Over the next twelve months, we plan to organize or take part in the following:
- Continue to develop gCDM services
- Continue work on a version 6 API
- Develop performance and benchmarking tools for the TDS.
- Complete a Python-based File Service
- Complete a Java-based Catalog Service
Beyond a one-year timeframe, we plan to organize or take part in the following:
- Release a version 6 of netCDF-Java that fully supports Java 11 and the Java Platform Module System (end of Java 8 support).
- Fully support the Zarr and NCZarr data models, including new iterations of the specifications.
- Continue development of standalone TDS services
- Continue to explore object storage as it relates to the TDS.
- Continue to improve data access performance, exploring the possibility of asynchronous requests.
The following active proposals directly involve THREDDS work:
- The THREDDS team is not participating in any active proposals at this time.
Relevant Metrics
THREDDS Startup Metrics
| | 2022-11 to 2023-02 | 2014-08 to 2023-02 |
| --- | --- | --- |
| TDS Startup (unique IP address count) | 1759 | 41107 |

| | Total Servers | Information page updated |
| --- | --- | --- |
| Publicly Accessible[1] TDS count | 148 | 76 |
Over the past 4 months (Nov 2022 through Feb 2023), **1,759** unique IPs started up the TDS. Since we started tracking these metrics (v4.5.3, August 26th, 2014), we have seen the TDS start up from **41,107** unique IP addresses. There are currently **148** publicly accessible TDSs running “in the wild”. Of those **148** servers, **76** have updated the name of their server in their server configuration file (taken as a sign that they are maybe, possibly, intended to be used by others).
The figures below show the distribution of TDS versions (top) and the fractional share of servers running version X or older (bottom). For simplicity of presentation, each labeled version includes betas and snapshots, not just the official release of that version. The majority of the publicly accessible servers are running v4.6.13 or above. TDS v5.0 remains the dominant specific version running in the wild, despite several later releases.


Strategic Focus Areas
The THREDDS projects covered in this report support the following goals described in the Unidata Strategic Plan:
- Managing Geoscience Data
The component software projects of the THREDDS project work to facilitate the management of geoscience data from four points of view: __Making Geoscience Data Accessible, Making Geoscience Data Discoverable, Making Geoscience Data Usable, and Enhancing Community Access to Data__. As a client-side library, **netCDF-Java** enables end users to read a variety of data formats both locally and across numerous remote technologies. Less user-friendly formats, such as GRIB, are augmented with metadata from community-driven metadata standards (e.g. Climate and Forecast metadata standards) and viewed through the more user-friendly Common Data Model (very similar to the netCDF Data Model), providing a single set of Java APIs for interacting with a multitude of formats and standards. The **THREDDS Data Server** exposes the power of the netCDF-Java library outside of the Java ecosystem with the addition of remote data services, such as __OPeNDAP__, __cdmremote__, __OGC WCS__ and __WMS__, __HTTP direct download__, and other remote data access and subsetting protocols. The TDS also exposes metadata in standard ways (e.g. ISO 19115 metadata records, JSON-LD metadata following schema.org), which are used to drive search technologies. **Rosetta** facilitates the process of translating ASCII-based observational data into standards-compliant, archive-ready files. These files are easily read into netCDF-Java and can be served to a broader community using the TDS.
- Providing Useful Tools
Through Rosetta, the THREDDS project seeks to intercede in the in-situ observational data management lifecycle as early as possible. This is done by enabling those who produce the data to create archive-ready datasets as soon as data are collected from a sensor or platform, without the need to write code or intimately understand metadata standards. NetCDF-Java and the TDS continue to support legacy workflows by maintaining support for legacy data formats and decades-old data access services, while promoting 21st century scientific workflows through the creation of new capabilities and modernization of existing services (e.g. immutability, upgraded technical stack, microservice development).
- Supporting People
Outside of writing code, the THREDDS project seeks to support the community by __providing technical support, working to build capacity through Open Source Software development, and by building community cyber-literacy__. The team provides expert assistance on software, data, and technical issues through numerous avenues, including participation in community mailing lists, providing developer guidance on our GitHub repositories, and leading and participating in workshops across the community. The team also actively participates in “upstream” open source projects in an effort to help sustain the efforts on which we rely and build. We have mentored students as part of the Unidata Summer Internship Program, and worked across organizations and disciplines in support of their internship efforts.
Prepared October, 2022
[1] “Publicly accessible” means we could find a top-level THREDDS Client Catalog. We checked <server>/thredds/catalog.xml (version 4), <server>/thredds/catalog/catalog.xml (version 5), including the most common ports of 80, 8080, 443, and 8443.
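The check described in footnote [1] amounts to trying a small set of candidate URLs per server. The Python sketch below only enumerates those candidates and makes no network requests; the `candidate_catalog_urls` helper, the `example.org` host, and the mapping of ports 443 and 8443 to HTTPS are assumptions made for illustration, not a description of the actual survey tooling.

```python
# Catalog locations and ports named in footnote [1].
CATALOG_PATHS = [
    "/thredds/catalog.xml",          # version 4 layout
    "/thredds/catalog/catalog.xml",  # version 5 layout
]
PORTS = [80, 8080, 443, 8443]

# Assumption: ports 443 and 8443 are served over HTTPS, the others over HTTP.
HTTPS_PORTS = {443, 8443}


def candidate_catalog_urls(host):
    """List every top-level catalog URL that would be tried for one host."""
    urls = []
    for port in PORTS:
        scheme = "https" if port in HTTPS_PORTS else "http"
        for path in CATALOG_PATHS:
            urls.append(f"{scheme}://{host}:{port}{path}")
    return urls


if __name__ == "__main__":
    # example.org is a placeholder host.
    for url in candidate_catalog_urls("example.org"):
        print(url)
```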