Status Report: THREDDS
April 2017 - October 2017
Sean Arms, Ethan Davis, Dennis Heimbigner, Ryan May, Christian Ward-Garrison
Activities Since the Last Status Report
The THREDDS Project
The THREDDS Project encompasses four projects: netCDF-Java, the THREDDS Data Server (TDS), Rosetta, and Siphon (the Unidata Python client to interact with a TDS). For specific information on Siphon, please see the Python Status Report. An update regarding cloud efforts related to the TDS can be found in the Cloud Computing Activities Status Report.
Released netCDF-Java / TDS version 4.6.10 (Stable)
Progress has been made on the following:
- The 4.6.x line of development is now in maintenance mode so that the team can focus on v5.0. However, “maintenance mode” is taking up quite a bit of resources and progress on v5.0 has been impacted.
Focus netCDF-Java / TDS (Soon-to-be Beta) v5
Our last update indicated that the THREDDS team was preparing to release a beta version of the THREDDS Data Server (version 5.0) at the end of May. Unfortunately, due to external projects coming to a close, we were unable to meet that deadline. We hope to have the beta out before the beginning of 2018...real soon now™
Progress has been made on the following:
- The Nexus Repository Manager at https://artifacts.unidata.ucar.edu has been upgraded from version 2 to version 3 and it will now host all build artifacts. For users, this means:
- The configuration management tool Ansible has shown great promise as a way for users to be able to deploy TDS and other Unidata software in an automated fashion.
- DAP4 in the TDS has been updated to be consistent with the specification and to successfully allow the netCDF-C DAP4 and netCDF-Java libraries to read DAP4 responses from the TDS.
- New Coverage data type allows for subsetting across array boundaries (often called the “seam” problem)
- Uses the new edal-java based ncWMS 2.0 server, as well as the JavaScript client Godiva3
- CatalogScan feature allows for incremental updating of TDS catalogs without the need to restart Tomcat
- Upload/Download support has been added to TDS. This now includes an upload web form accessible as http://.../thredds/upload.
- Unit and Integration tests are passing in 5.0. This is a big step towards releasing a beta.
- ncSOS has been integrated into the TDS distribution (as part of the OIIP project - see the Rosetta section for more details)
- Access to the netCDF-C library via JNI is now thread-safe, so the HDF5 library no longer needs to be built with thread-safe support.
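To illustrate the “seam” problem mentioned in the Coverage item above: on a global grid whose longitude axis runs from 0 to 360 degrees, a request that crosses the prime meridian maps to two disjoint index ranges rather than one. The following is a minimal Python sketch of that idea only. It does not reflect how the TDS Coverage implementation works, and the function name and the grid layout are assumptions made for illustration.

```python
def seam_index_ranges(lons, west, east):
    """Return (start, stop) index ranges covering [west, east] on a 0-360 axis.

    `lons` is an ascending list of grid longitudes in [0, 360).
    `west` and `east` are in degrees east; negative values are wrapped.
    Hypothetical helper for illustration, not part of netCDF-Java or the TDS.
    """
    west %= 360
    east %= 360

    def span(lo, hi):
        # Indices of all grid points with lo <= longitude <= hi.
        idx = [i for i, lon in enumerate(lons) if lo <= lon <= hi]
        return [(idx[0], idx[-1] + 1)] if idx else []

    if west <= east:
        # Request does not cross the seam: one contiguous slice.
        return span(west, east)
    # Request crosses the seam: one slice up to the end of the axis,
    # followed by one slice from the start of the axis.
    return span(west, 360) + span(0, east)


lons = list(range(0, 360, 10))            # 0, 10, ..., 350
ranges = seam_index_ranges(lons, -20, 20)
subset = [lon for start, stop in ranges for lon in lons[start:stop]]
```

A client that only supports a single contiguous index range per axis would have to issue two requests and stitch the results; a subsetting service that works in coordinate space can hide that step. This sketch does not handle a full-globe request (for example, west=0 and east=360).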
Dependencies, challenges, problems, and risks include:
- The longer the 4.6.x line of development is maintained, the longer it will take to move forward with the 5.x line of development
Rosetta
Rosetta is progressing thanks to support from a NASA ACCESS grant (the Oceanographic In-situ data Interoperability Project, or OIIP), in which Unidata is partnering with the PO.DAAC at JPL and UMASS-Boston.
Progress has been made on the following:
- Support for the NCEI NODC netCDF v2.0 templates (metadata standards)
- Extension of the NCEI templates to support metadata critical to the use of electronic tagging datasets
- Support for automated transformation of output from electronic animal tagging datasets in the Electronic Tag Unified File Format (eTUFF) via Rosetta.
- Working to create a unified workflow for the GUI wizard interface that allows for selection of which metadata standards to use when determining recommended/required metadata
- Engaging with the netCDF Linked Data initiative to define best practices for identifying netCDF metadata with a particular metadata standard.
Dependencies, challenges, problems, and risks include:
- Two of the core JavaScript libraries used by Rosetta have been abandoned by their original creators. One has been picked up by the community (SlickGrid), while the other is in limbo (jWizzard). Unidata will likely need to pick up jWizzard and maintain it for use within Rosetta, at least internally. It would be a good community service to open this up to a wider audience, but resources would be required to do so.
Ongoing Activities
We plan to continue the following activities:
- Documentation updates - We are reworking the tutorial material for the TDS v5.0 with the goal of enabling asynchronous training. The material will undergo a major overhaul to include the use of Docker containers, video snippets, and other new forms of training tools.
- Maintain thredds.ucar.edu and keep up with the addition of new datasets to the IDD
- GOES-16 data, with tiles stitched together using Python, available on our test TDS.
- Continue development of the TDS Python client Siphon, as well as extending its functionality to interface with other web services and servers
The following active proposals directly involve THREDDS work:
- The NASA ACCESS award with JPL is entering the second year of the two-year award. The award is titled: "Leveraging available Technologies for Improved interoperability and visualization of Remote Sensing and in-situ Oceanographic Data at the PO.DAAC" and was submitted with JPL/PO.DAAC. [Rosetta]
- EarthCube award: "Advancing netCDF-CF for the Geosciences". This two-year, Unidata-led project will work to extend netCDF-CF conventions in ways that will broaden the range of earth science domains whose data can be represented.
- Finished the second and final year of EarthCube award: "CyberConnector: Bridging the Earth Observations and Earth Science Modeling for Supporting Model Validation, Verification, and Intercomparison" with George Mason University.
New Activities
Over the next three months, we plan to organize or take part in the following:
Over the next twelve months, we plan to organize or take part in the following:
- Finalize the TDS plugin layer.
- Upgrade the ncWMS, ncISO, and other plugin services to use the new TDS 5.x plugin layer
- Incorporate ncSOS into TDS
- Transitioning thredds.ucar.edu to TDS 5.x
- Getting TDS v5.0 to a stable release
- Getting netCDF-Java v5.x to a stable release
Beyond a one-year timeframe, we plan to organize or take part in the following:
- Enable Rosetta to publish to a TDS
Relevant Metrics
9558 unique IPs started up thredds from November 2014 through September 2017, 536 of which are publicly accessible servers. Publicly accessible means that the following URL patterns respond to an HTTP HEAD request with a return status less than 400:
http://<ip address>/thredds/catalog.xml
http://<ip address>:8080/thredds/catalog.xml
This information is only known for servers running v4.5.3 and above. There are many reasons why these numbers are so different. The differences could be due to:
- Reporting TDS running behind a firewall that does not allow incoming traffic on 80 or 8080 (the ports tested)
- A TDS running through a proxy server may not be “seen” in this analysis as publicly reachable at the normal URL pattern (<server>/thredds/catalog.xml)
- A TDS running in the past is no longer running today
- Finally, the most likely reason: people testing the TDS on their local machine, but not actually running a server
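The accessibility test described above (an HTTP HEAD request against the two catalog URL patterns, counted as a success when the return status is less than 400) could be sketched in Python roughly as follows. This is not the script used to produce the numbers in this report. The function names, the timeout value, the injectable `head` parameter, and the choice to count a server as accessible when either URL responds are assumptions made for illustration.

```python
import http.client
import urllib.error
import urllib.request


def catalog_urls(host):
    """Build the two catalog URL patterns tested (ports 80 and 8080)."""
    return [
        f"http://{host}/thredds/catalog.xml",
        f"http://{host}:8080/thredds/catalog.xml",
    ]


def head_status(url, timeout=10):
    """Send an HTTP HEAD request; return the status code, or None if unreachable.

    urlopen follows redirects by default, so the value returned is the
    status of the final response.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code   # the server answered, but with an error status
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return None       # refused, timed out, firewalled, malformed reply


def is_publicly_accessible(host, head=head_status):
    """True if either catalog URL answers a HEAD request with status < 400."""
    for url in catalog_urls(host):
        status = head(url)
        if status is not None and status < 400:
            return True
    return False
```

Passing a different callable as `head` lets the decision logic be exercised without any network access, for example `is_publicly_accessible("10.0.0.1", head=lambda url: 404)`.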
Note that the vast majority of the publicly accessible servers are running v4.6.3 or above. Version 4.6.10, released on 20 April 2017, was the most current release during this period and is the most commonly run version of the 4.6.x line of the TDS. This indicates that users and organizations running the TDS tend to follow along closely with the current releases of the TDS.
As with the last report, the updated analysis also indicates a number of sites are running TDS v5.0, even though it is pre-beta. This underscores the desire for the new features in 5.0, and highlights one reason why we feel the need to focus most of our efforts on, and to move all new development to, the v5 codebase.
Note that there are some odd-looking versions of the TDS being reported in the log files, such as TDS_4.28.x. It is likely these version numbers are actually generated by software that is being built on top of the TDS. Previous versions of the figure above listed each of these odd versions as its own entry; these oddities are now aggregated together and shown as “TDS_Unknown”.
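The aggregation of unexpected version strings described above could be sketched as follows. This is an illustrative Python sketch, not the code behind the figure. The function name, the `KNOWN_LINES` tuple (limited to release lines mentioned in this report), the assumption that reported strings look like `TDS_<major>.<minor>...`, and the sample input list are all hypothetical.

```python
import re
from collections import Counter

# Release lines mentioned in this report; a real analysis would need
# the complete list of official releases (assumption for illustration).
KNOWN_LINES = ("4.5", "4.6", "5.0")


def version_label(reported, known_lines=KNOWN_LINES):
    """Map a reported version string to its release line or "TDS_Unknown"."""
    # Capture "<major>.<minor>" and require that the minor number ends there.
    match = re.match(r"TDS_(\d+\.\d+)(?!\d)", reported)
    if match and match.group(1) in known_lines:
        return f"TDS_{match.group(1)}.x"
    return "TDS_Unknown"


# Made-up sample input, not taken from the actual log files.
reported = ["TDS_4.6.10", "TDS_4.6.3", "TDS_5.0.0", "TDS_4.28.x", "custom-build"]
counts = Counter(version_label(v) for v in reported)
```

Grouping by release line keeps the figure legend short, at the cost of hiding which downstream products report the unrecognized strings.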
Strategic Focus Areas
We support the following goals described in the Unidata Strategic Plan:
- Enable widespread, efficient access to geoscience data
The work of the THREDDS group comprises two main areas: the THREDDS Data Server (TDS) and the Common Data Model (CDM) / netCDF-Java library. The TDS provides catalog and data access services for scientific data using OPeNDAP, OGC WCS and WMS, HTTP, and other remote data access protocols. The CDM provides data access through the netCDF-Java API to a variety of data formats (e.g., netCDF, HDF, GRIB). Layered above the basic data access, the CDM uses the metadata contained in datasets to provide a higher-level interface to geoscience-specific features of datasets, in particular, providing geolocation and data subsetting in coordinate space. The CDM also provides the foundations for all the services made available through the TDS.
The data available from the IDD is a driving force on both the TDS and netCDF-Java development. The ability to read all the IDD data through the netCDF-Java library allows the TDS to serve that data and provide services on/for that data.
- Develop and provide open-source tools for effective use of geoscience data
Unidata's Integrated Data Viewer (IDV) depends on the netCDF-Java library for access to local data, and on the THREDDS Data Server (TDS) for remote access to IDD data. At the same time, the CDM depends on the IDV to validate and test CDM software. Many other tools build on the CDM / netCDF-Java library (e.g., ERDDAP, Panoply, VERDI, etc.) and on the TDS (ESGF, LAS, ncWMS, MyOcean, etc.).
- Provide cyberinfrastructure leadership in data discovery, access, and use
The Common Data Model (CDM) / netCDF-Java library is one of the few general-purpose implementations of the CF (Climate and Forecast) metadata standards. Current active efforts in CF that we are involved with include use of the extended netCDF-4 data model (CF 2.0) and the representation of point data (Discrete Sampling Geometry, CF-DSG).
The TDS has pioneered the integration of Open Geospatial Consortium (OGC) protocols into the earth science communities. Strong international collaborations have resulted in WCS and WMS services as part of the TDS.
The CDM and TDS are widely used implementations of the OPeNDAP DAP2 data access protocol. Unidata has worked with the OPeNDAP group to design, develop, and implement a new version of the DAP specification, DAP4, which is now available in the TDS server and the netCDF-Java client software stack.
- Build, support, and advocate for the diverse geoscience community
The THREDDS project is involved in several international standardization efforts (CF, OGC, etc.) which cross-cut a multitude of disciplines, both inside and outside of the geoscience community. The netCDF-Java client library, as well as the TDS, often serve as incubators for new pushes in these efforts.
Prepared October 2017