Status Report: THREDDS
April - September 2015
John Caron, Sean Arms, Ethan Davis, Dennis Heimbigner, Ryan May, Christian Ward-Garrison
Strategic Focus Areas
We support the following goals described in the Unidata Strategic Plan:
- Enable widespread, efficient access to geoscience data
The work of the THREDDS group comprises two main areas: the THREDDS Data Server (TDS) and the Common Data Model (CDM) / netCDF-Java library. The TDS provides catalog and data access services for scientific data using OPeNDAP, OGC WCS and WMS, HTTP, and other remote data access protocols. The CDM provides data access through the netCDF-Java API to a variety of data formats (e.g., netCDF, HDF, GRIB). Layered above the basic data access, the CDM uses the metadata contained in datasets to provide a higher-level interface to geoscience-specific features of datasets, in particular geolocation and data subsetting in coordinate space. The CDM also provides the foundation for all the services made available through the TDS.
The data available from the IDD is a driving force behind both TDS and netCDF-Java development. The ability to read all the IDD data through the netCDF-Java library allows the TDS to serve that data and provide services on it.
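As an illustration of what "subsetting in coordinate space" means, the following minimal Python sketch maps a latitude/longitude bounding box onto index ranges of a small 2-D grid. It is not the CDM or netCDF-Java API; the function names (index_range, subset_by_bbox) and the sample grid are hypothetical, and the sketch assumes 1-D, strictly increasing coordinate axes.

```python
from bisect import bisect_left, bisect_right


def index_range(coords, lo, hi):
    """Return (start, stop) so that coords[start:stop] lies within [lo, hi].

    Assumes coords is a 1-D, strictly increasing coordinate axis.
    """
    start = bisect_left(coords, lo)   # first index with coords[i] >= lo
    stop = bisect_right(coords, hi)   # one past last index with coords[i] <= hi
    return start, stop


def subset_by_bbox(grid, lats, lons, lat_min, lat_max, lon_min, lon_max):
    """Subset a 2-D grid (rows = lat, columns = lon) by a bounding box.

    Hypothetical helper for illustration only: the caller works in
    coordinate values, and the index arithmetic stays hidden.
    """
    r0, r1 = index_range(lats, lat_min, lat_max)
    c0, c1 = index_range(lons, lon_min, lon_max)
    sub = [row[c0:c1] for row in grid[r0:r1]]
    return sub, lats[r0:r1], lons[c0:c1]


# Made-up sample data: a 4 x 5 grid whose value encodes (row, column).
lats = [30.0, 35.0, 40.0, 45.0]
lons = [-110.0, -105.0, -100.0, -95.0, -90.0]
grid = [[i * 10 + j for j in range(len(lons))] for i in range(len(lats))]

sub, sub_lats, sub_lons = subset_by_bbox(grid, lats, lons, 33.0, 41.0, -106.0, -94.0)
print(sub_lats, sub_lons)
print(sub)
```

The request is expressed entirely in coordinate values; the translation to array indices happens inside the helper, which is the kind of convenience the coordinate-aware layer is meant to provide.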
- Develop and provide open-source tools for effective use of geoscience data
Unidata's Integrated Data Viewer (IDV) depends on the netCDF-Java library for access to local data, and on the THREDDS Data Server (TDS) for remote access to IDD data. At the same time, the CDM depends on the IDV to validate and test CDM software. Many other tools build on the CDM / netCDF-Java library (e.g., ERDDAP, Panoply, VERDI) and on the TDS (e.g., ESGF, LAS, ncWMS, MyOcean).
- Provide cyberinfrastructure leadership in data discovery, access, and use
The Common Data Model (CDM) / netCDF-Java library is one of the few general-purpose implementations of the CF (Climate and Forecast) metadata standards. Active CF efforts in which we are involved include use of the extended netCDF-4 data model (CF 2.0) and conventions for point data (Discrete Sampling Geometries, CF-DSG).
The TDS has pioneered the integration of Open Geospatial Consortium (OGC) protocols into the earth science communities. Strong international collaborations have resulted in WCS and WMS services as part of the TDS.
The CDM and TDS are widely used implementations of the OPeNDAP DAP2 data access protocol. Unidata has worked with the OPeNDAP group to design, develop, and implement a new version of the DAP specification, DAP4, which is now available in the TDS and in the netCDF-Java client software stack.
- Build, support, and advocate for the diverse geoscience community
The THREDDS project is involved in several international standardization efforts (CF, OGC, etc.) that cut across a multitude of disciplines, both inside and outside the geoscience community. The netCDF-Java client library and the TDS often serve as incubators for new initiatives in these efforts.
Activities Since the Last Status Report
The THREDDS Project
The THREDDS Project encompasses four projects: netCDF-Java, the THREDDS Data Server (TDS), Rosetta, and siphon (the Unidata Python client for interacting with a TDS).
Released netCDF-Java / TDS version 4.6 (Stable)
The stable release of both netCDF-Java and the THREDDS Data Server is version 4.6.
Progress has been made on the following:
- GRIB Collections now scale to large numbers of files.
- We now use the Gradle build system.
- We used Coverity to find and fix more than 4000 defects. The defect density is now below 1 per 1000 lines of code.
Dependencies, challenges, problems, and risks include:
- Addressing feedback as the community upgrades its TDS installations
2015 TDS Training Workshop
The 2015 TDS Training Workshop used Docker container technology, which freed up time for teaching more TDS-specific material.
Progress has been made on the following:
- Documentation updated to reflect latest changes in version 4.6.
- Used Docker for running the TDS in the workshop
Dependencies, challenges, problems, and risks include:
- Major changes coming in 5.0 will require another very thorough pass over documentation, training materials, etc.
Ongoing Activities
We plan to continue the following activities:
- Update documentation, reworking the tutorial material to use Docker
- Maintain thredds.ucar.edu and keep up with the addition of new datasets to the IDD
- Continue development of siphon, the TDS Python client, and potentially extend its functionality to interface with the AWIPS-II EDEX server
- Continue to implement a Rosetta interface for each discrete sampling geometry (DSG) from the CF-1.6 specification (http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#discrete-sampling-geometries)
The following active proposals directly involve THREDDS work:
- New EarthCube award: "Advancing netCDF-CF for the Geosciences". This two-year, Unidata-led project will work to extend the netCDF-CF conventions in ways that will broaden the range of earth science domains whose data can be represented.
- Beginning the second year of NASA ROSES ACCESS award: "High Performance Multidisciplinary Open Standard Data Services to Serve Terrestrial Environmental Modeling" with USGS CIDA.
- Three EarthCube awards are finishing up on a no-cost extension:
1) EarthCube Building Blocks award: "Integrating Discrete and Continuous Data" with Univ of Texas, Austin and others.
2) EarthCube Building Blocks award: "Specifying and Implementing ODSIP, A Data-Service Invocation Protocol" with OPeNDAP, Inc.
3) EarthCube Building Blocks award: "Deploying Web Services Across Multiple Science Domains" with IRIS, UNAVCO, and others.
Period of performance: Oct 2013 - Sept 2015.
- Two NASA ROSES ACCESS proposals were submitted this year:
1) "Interactive Algorithm Development and Product Validation through Innovative Data Access and Visualization Methods" with UWisc/SSEC.
2) "Leveraging available Technologies for Improved interoperability and visualization of Remote Sensing and in-situ Oceanographic Data at the PO.DAAC" with JPL/PO.DAAC
New Activities
Over the next three months, we plan to organize or take part in the following:
- Get netCDF-Java/TDS v5.0 to a beta release
- Make a TDS 5.0 test server publicly available
- Enable TDS to serve data from Amazon S3 buckets
- Finalize visualization preview of converted data in Rosetta
- Move issue tracking, roadmap planning, etc. from our Jira server to GitHub
- Host non-Maven artifact downloads (i.e., toolsUI.jar, netCDFAll.jar, tdm.jar, ncIDV.jar, and thredds.war) on GitHub
Over the next twelve months, we plan to organize or take part in the following:
- Update the TDS 5.0 ncWMS plug-in to use the new ncWMS 2.0 code from the University of Reading
- Upgrade the ncWMS, ncISO, and other plugin services to use the new TDS 5.0 plugin layer
- Transition thredds.ucar.edu to TDS 5.0
- Get netCDF-Java/TDS v5.0 to a stable release
- Incorporate ncSOS into THREDDS 5.0
- Enable siphon to interface with CDMRFeature objects
Beyond a one-year timeframe, we plan to organize or take part in the following:
- Integrate Rosetta with the TDS to allow conversion and publishing of observational ASCII-based datasets to the TDS
- Move to a fully online training tutorial, reserving annual in-person training for advanced topics
Areas for Committee Feedback
We are requesting your feedback on the following topics:
- Does your department or campus IT utilize Docker technology?
- Have there been discussions regarding moving student computing resources (i.e. computer labs) to the cloud?
- What are the top three analysis and visualization tools used in a) the classroom, b) student research, and c) faculty research?
Relevant Metrics
Although it is still early in the semester, the top client accessing data from the THREDDS Data Server over the past month is now Python. The THREDDS team will keep a close eye on our server usage statistics to see whether this continues to be the case, and will provide a more detailed report for the Spring meeting.
Prepared September 2015