
Status Report: THREDDS

April 2019 - October 2019

Sean Arms, Ethan Davis, Dennis Heimbigner, Cece Hedrick, Ryan May, Jennifer Oxelson, Howard Van Dam II 

Areas for Committee Feedback

We are requesting your feedback on the following topics:

  1. Do you know of anyone using netCDF-Java to read Vis5D grid files?
  2. Do you know of anyone who just **loves** BUFR and understands it inside and out? Unidata could use your help (knowledge of Java **not** required)!
  3. With an eye towards guiding our outreach efforts, what THREDDS Data Servers do you use, and are there any that you consider critical to your needs?

Activities Since the Last Status Report

The THREDDS Project

The THREDDS Project encompasses four projects: **netCDF-Java, the THREDDS Data Server (TDS), Rosetta, and Siphon** (the Unidata Python client for interacting with remote data services, such as those provided by the TDS). For specific information on Siphon, please see the Python Status Report. An update regarding cloud efforts related to the TDS, including the popular Docker container effort, can be found in the Cloud Computing Activities Status Report.

The various THREDDS projects were featured in a paper regarding Data Interoperability presented at OceanObs’19, a decadal conference series which seeks to improve response to scientific and societal needs of a fit-for-purpose integrated ocean observing system, for better understanding the environment of the Earth, monitoring climate, and informing adaptation strategies as well as the sustainable use of ocean resources.

Data Interoperability Between Elements of the Global Ocean Observing System. Derrick P Snowden, Vardis Tsontos, Nils Olav Handegard, Marcos Zarate, Kevin M. O'Brien, Kenneth S Casey, Neville Smith, Helge Sagen, Kathleen Bailey, Mirtha Lewis, Sean Arms. Review, Front. Mar. Sci. - Ocean Observation, Submitted on: 15 Nov 2018. DOI: 10.3389/fmars.2019.00442

Released netCDF-Java 5 (Stable)

Released TDS version 4.6.14 (Stable)

Released TDS version 5.0.0-beta7

Documentation for netCDF-Java / TDS (Beta) v5

Rosetta

Rosetta continues to progress following a very successful NASA ACCESS grant (the Oceanographic In-situ data Interoperability Project, or **OIIP**), in which Unidata partnered with the PO.DAAC at JPL and UMASS-Boston. A poster on the advances in Rosetta related to OIIP was presented at the Fall AGU 2018 meeting. We continue to work with JPL as part of their user acceptance process, with the ultimate goal being the operational use of Rosetta at the PO.DAAC. We, along with the PI of the OIIP project, also participated in an ESIP Marine Data Cluster presentation.

Progress has been made on the following:

General dependencies, challenges, problems, and risks include:

Ongoing Activities

We plan to continue the following activities:

The following active proposals directly involve THREDDS work:

New Activities

Over the next three months, we plan to organize or take part in the following:

Over the next twelve months, we plan to organize or take part in the following:

Beyond a one-year timeframe, we plan to organize or take part in the following:

Relevant Metrics

NetCDF-Java

Recently we’ve found that the method by which we count download statistics for the netCDF-Java library has been excluding a significant source of downloads. Traditionally, users have downloaded a single jar (netcdfAll.jar) to use with their packages. However, the way Java developers have been consuming our libraries has changed significantly over the past several years, as many projects now rely on a build system to pull in just the components of the library they need, and we have not taken that into account.

As an example, if we consider the netCDF-Java version 4.6.13 release, our current method for computing library downloads would show 1,717 downloads over the past six months. If we take the previously unaccounted-for downloads into account (for just the core component of the library, cdm.jar, not all components), the number of downloads for netCDF-Java version 4.6.13 over the past six months jumps to 16,049.

To put this new number in perspective, at our last meeting we reported that the total **yearly** downloads of **all versions** of netCDF-Java from 2018-02 through 2019-03 was 8,389, and our new **six month** download figure for a **single version** of netCDF-Java is nearly double that. As such, we will be changing the way the download statistics of netCDF-Java are computed to better reflect the community of developers who rely on the library. These changes will be reflected in the download metrics reported at our next meeting (Spring 2020).
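The comparison above can be checked with a few lines of arithmetic. The sketch below is only illustrative: the three counts are the figures quoted in this report, while the variable names are our own and do not correspond to any Unidata tooling.

```python
# Download counts quoted in this report (variable names are illustrative only).
current_method_4_6_13 = 1_717    # six months, v4.6.13, current counting method
revised_count_4_6_13 = 16_049    # six months, v4.6.13, including cdm.jar pulled by build systems
yearly_all_versions = 8_389      # all versions, previously reported yearly total

# How many times larger the revised six-month figure is than each older figure.
vs_current_method = revised_count_4_6_13 / current_method_4_6_13
vs_yearly_total = revised_count_4_6_13 / yearly_all_versions

print(f"Revised vs. current method (same version, same period): {vs_current_method:.2f}x")
print(f"Revised six-month figure vs. previous yearly total:      {vs_yearly_total:.2f}x")
```

Under these figures, the revised count is a little over nine times the count produced by the current method, and a little under twice the previously reported yearly total.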

THREDDS Data Server

We see that **10,436** unique IPs started up a TDS from March 2019 through August 2019, **106** of which are publicly accessible servers. “Publicly accessible” means we could find them using common URL patterns. For this plot, the version includes betas and snapshots, not just the official release of that version, for presentation simplicity.

This information is only known for servers running v4.5.3 and above. There are many reasons why these numbers are so different. The differences could be due to:

Note 1: the vast majority of the publicly accessible servers are running v4.6.3 or above (v4.6.14 was the most current release during this period, and was released on 26 July 2019).

Note 2: there are some odd-looking versions of the TDS being reported in the log files, such as TDS_4.39.x. It is likely these version numbers are actually generated by software that is being built on top of the TDS or by applications that bundle the TDS as part of a deployment package.

Furthermore, of the **106** publicly accessible servers, **64** have updated the name of their server in their server configuration file (which we take as a tentative sign that they may be intended for use by others).
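For a sense of scale, the server counts above can be turned into percentages. This is a minimal sketch using only the numbers quoted in this section; the variable names are ours and purely illustrative.

```python
# Server counts quoted in this section (variable names are illustrative only).
unique_ips_started = 10_436   # unique IPs that started a TDS, March to August 2019
public_servers = 106          # of those, servers found via common URL patterns
renamed_servers = 64          # public servers with an updated server name

# Share of all observed startups that are publicly accessible.
public_share_pct = round(100 * public_servers / unique_ips_started, 1)

# Share of publicly accessible servers that have set a custom server name.
renamed_share_pct = round(100 * renamed_servers / public_servers, 1)

print(f"Publicly accessible: {public_share_pct}% of unique IPs")
print(f"Renamed:             {renamed_share_pct}% of public servers")
```

In other words, about one in a hundred observed installations is publicly discoverable, and roughly three in five of those have customized their server name.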

In the next six months, we will be working towards enabling TDSs, on an opt-in basis, to officially advertise their availability to the community through a centralized resource.

Strategic Focus Areas

The THREDDS projects covered in this report support the following goals described in the Unidata Strategic Plan:

  1. Managing Geoscience Data
    The component software projects of the THREDDS project work to facilitate the management of geoscience data from four points of view: __Making Geoscience Data Accessible, Making Geoscience Data Discoverable, Making Geoscience Data Usable, and Enhancing Community Access to Data__. As a client-side library, **netCDF-Java** enables end users to read a variety of data formats both locally and across numerous remote technologies. Less user-friendly formats, such as GRIB, are augmented with metadata from community-driven metadata standards (e.g. Climate and Forecast metadata standards), and viewed through the more user-friendly Common Data Model (very similar to the netCDF Data Model), providing a single set of Java APIs for interacting with a multitude of formats and standards. The **THREDDS Data Server** exposes the power of the netCDF-Java library outside of the Java ecosystem with the addition of remote data services, such as __OPeNDAP__, __cdmremote__, __OGC WCS__ and __WMS__, __HTTP direct download__, and other remote data access and subsetting protocols. The TDS also exposes metadata in standard ways (e.g. ISO 19115 metadata records, JSON-LD metadata following schema.org), which are used to drive search technologies. **Rosetta** facilitates the process of translating ASCII-based observational data into standards-compliant, archive-ready files. These files are easily read into netCDF-Java and can be served to a broader community using the TDS.

  2. Providing Useful Tools
    Through Rosetta, the THREDDS project seeks to intercede in the in-situ observational data management lifecycle as early as possible. This is done by enabling those who produce the data to create archive-ready datasets as soon as data are collected from a sensor or platform, without the need to write code or intimately understand metadata standards. NetCDF-Java and the TDS continue to support legacy workflows by maintaining support for legacy data formats and decades-old data access services, while promoting 21st century scientific workflows through the creation of new capabilities (such as adding Zarr support) and services.

  3. Supporting People
    Outside of writing code, the THREDDS project seeks to support the community by __providing technical support, working to build capacity through Open Source Software development, and by building community cyber-literacy__. The team provides expert assistance on software, data, and technical issues through numerous avenues, including participation in community mailing lists, providing developer guidance on our GitHub repositories, and leading and participating in workshops across the community. The team also actively participates in “upstream” open source projects in an effort to help sustain the efforts on which we rely and build. We have mentored students as part of the Unidata Summer Internship Program, and worked across organizations and disciplines in support of their internship efforts.


Prepared September 2019