Status Report: Internet Data Distribution (IDD)
November 2023 - April 2024
Mike Zuranski, Stonie Cooper, Mike Schmidt, Jeff Weber
Executive Summary
Unidata continues to support, update, and enhance the data available via the IDD for the benefit of research and education. This work includes, but is not limited to, adding new data formats, bridging the knowledge gap around newly introduced data, and providing statistics on data flow and composition.
Questions for Immediate Committee Feedback
None at this time.
Activities Since the Last Status Report
Internet Data Distribution (IDD)
IDD data volumes continue to increase, especially when new datasets are made available.
The following output is from a Linux-based data server that the UPC operates on behalf of the community, lead.unidata.ucar.edu:
``bqb
20240418
Data Volume Summary for lead.unidata.ucar.edu
Maximum hourly volume 124359.462 M bytes/hour
Average hourly volume 63359.113 M bytes/hour
Average products per hour 392779 prods/hour
Feed           Average                Maximum       Products
               (M byte/hour)          (M byte/hour) number/hour
SATELLITE      15264.235 [ 24.092%]   20758.908       6527.979
NEXRAD2        10499.148 [ 16.571%]   13769.130     109733.667
NIMAGE          8157.437 [ 12.875%]   12197.203       7478.208
NGRID           6992.673 [ 11.037%]   12000.162      66519.479
HDS             4516.974 [  7.129%]    8186.544      30300.083
FNEXRAD         4488.991 [  7.085%]    4962.380       8558.083
NEXRAD3         3814.658 [  6.021%]    5045.800      91566.667
EXP             3055.116 [  4.822%]    4558.471       2541.708
GEM             3047.831 [  4.810%]   11342.866       5175.562
CONDUIT         1968.457 [  3.107%]   64975.640       5847.688
UNIWISC          992.154 [  1.566%]    1174.309        924.667
NOTHER           281.755 [  0.445%]     766.130         59.542
FSL2             188.142 [  0.297%]     682.392       1275.417
IDS|DDPLUS        89.833 [  0.142%]     114.865      55867.104
LIGHTNING          1.708 [  0.003%]       3.947        402.667
``bqe
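The bracketed percentages in the summary appear to be each feed's share of the summed average volume. The following is a minimal Python sketch, not part of any Unidata tooling, that parses the feed rows and recomputes those shares along with each feed's maximum-to-average ratio. The function names (`parse_feeds`, `shares`, `peak_to_average`) and the regular expression are illustrative assumptions; the rows are copied from the listing above.

```python
import re

# Feed rows copied from the lead.unidata.ucar.edu summary (20240418).
SUMMARY = """\
SATELLITE 15264.235 [ 24.092%] 20758.908 6527.979
NEXRAD2 10499.148 [ 16.571%] 13769.130 109733.667
NIMAGE 8157.437 [ 12.875%] 12197.203 7478.208
NGRID 6992.673 [ 11.037%] 12000.162 66519.479
HDS 4516.974 [ 7.129%] 8186.544 30300.083
FNEXRAD 4488.991 [ 7.085%] 4962.380 8558.083
NEXRAD3 3814.658 [ 6.021%] 5045.800 91566.667
EXP 3055.116 [ 4.822%] 4558.471 2541.708
GEM 3047.831 [ 4.810%] 11342.866 5175.562
CONDUIT 1968.457 [ 3.107%] 64975.640 5847.688
UNIWISC 992.154 [ 1.566%] 1174.309 924.667
NOTHER 281.755 [ 0.445%] 766.130 59.542
FSL2 188.142 [ 0.297%] 682.392 1275.417
IDS|DDPLUS 89.833 [ 0.142%] 114.865 55867.104
LIGHTNING 1.708 [ 0.003%] 3.947 402.667
"""

# Assumed row layout: feed, average, [percent], maximum, products per hour.
# The printed percentage is matched but not captured; it is recomputed below.
ROW = re.compile(r"^(\S+)\s+([\d.]+)\s+\[\s*[\d.]+%\]\s+([\d.]+)\s+([\d.]+)$")


def parse_feeds(text):
    """Return {feed: (avg_mb_per_hour, max_mb_per_hour, products_per_hour)}."""
    feeds = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            feeds[match.group(1)] = tuple(float(v) for v in match.groups()[1:])
    return feeds


def shares(feeds):
    """Percentage of the summed average volume contributed by each feed."""
    total = sum(avg for avg, _, _ in feeds.values())
    return {name: 100.0 * avg / total for name, (avg, _, _) in feeds.items()}


def peak_to_average(feeds):
    """Ratio of maximum to average hourly volume for each feed."""
    return {name: peak / avg for name, (avg, peak, _) in feeds.items()}


if __name__ == "__main__":
    feeds = parse_feeds(SUMMARY)
    ratios = peak_to_average(feeds)
    for name, pct in sorted(shares(feeds).items(), key=lambda kv: -kv[1]):
        print(f"{name:<12} {pct:7.3f}%  max/avg = {ratios[name]:6.2f}")
```

Run against these rows, the recomputed shares agree with the printed percentages to within rounding, and CONDUIT stands out with a maximum roughly 33 times its average.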
Data Distribution:
IDD CONDUIT feed:
Beginning the week of 4/15 and lasting through at least 4/19 (the date of this writing), CONDUIT has been down at the source. This is the result of serious instability at the College Park Data Center, and teams there have been working tirelessly to address it. In fact, a Critical Weather Day was issued for this week to ensure NCEP et al. have all the resources they need at their disposal. There is no estimated time of restoration (ETR) at this time.
IDD FNEXRAD, NIMAGE, and UNIWISC feeds:
We continue to create the content for the FNEXRAD (NEXRAD Level III national composites), NIMAGE (GOES-East/West Level 2 images and products: images fully reconstituted from NOAAPort tiles, with broadcast headers and footers stripped off to leave “bare” netCDF4 files), and UNIWISC (select GOES-East/West images converted to McIDAS AREA format for use in legacy systems like GEMPAK) feeds.
Existing Data Distribution:
The data volume seen in the SATELLITE (which was known as DIFAX in LDM distributions prior to v6.13.6) listing above represents all products received in the GOES ReBroadcast (GRB) downlinks that we installed at UCAR (currently GOES-18 at the NCAR Mesa Lab and GOES-16 at UCAR Foothills Lab 2). The data volume seen in the NIMAGE entry represents GOES-East/West ABI Level 2 imagery that has been reconstituted by stitching together tiles distributed in NOAAPort, plus all other Level 2 products. In both cases, binary headers and footers that are added to products before distribution in NOAAPort have been stripped off, leaving “raw” netCDF4 files. The UNIWISC feed represents the volume of 3 select channels (0.64 µm VIS, 6.2 µm WV, and 10.3 µm IR) for all coverages (CONUS, FullDisk, Mesoscale-1, and Mesoscale-2) of GOES-East/West image products in PNG-compressed McIDAS AREA format, which is suitable for use in GEMPAK, the IDV, McIDAS-V, McIDAS-X, and AWIPS.
Challenges, problems, and risks:
- Because of how LDM Feed Types are defined and used, it is often difficult to know where to insert new products. Even though we’ve already done this with NIMAGE, we want to refrain from adding anything to any NOAAPort feeds to avoid confusion and potential issues. We also know that several Feed Types are currently going unused in traffic going out from the IDD, yet we cannot tell how sites use Feed Types if they do not report to RTSTATS. We don’t want to repurpose an existing Feed Type unless we can be sure it is, and will remain, unused for its original purpose (e.g. DIFAX to SATELLITE). We are continuing to evaluate options to repurpose existing Feed Types, and we look forward to new options for Feed Type handling in future versions of the LDM. At this time, however, finding the best place for new products remains tricky.
- Plans are also underway to enhance how we manage our LDM installations and IDD node configurations, which will make them more robust and easier to manage, and will let us investigate and respond to issues faster. These activities have been paused due to higher-priority projects and limited staff time.
- The operational RTSTATS web site will be going away in the near future. The environment currently hosting it will be decommissioned soon, and previous attempts by NSF Unidata staff to port the old software to newer environments have not been successful. Mike Zuranski has continued his work on its replacement, RTSTATS-NG, and we hope to have this version online before the end of May (if it isn’t already by the time you read this). However, due to multiple learning curves with the chosen tech stack (Dash) and other priorities, this has taken longer than planned. It’ll be worth it though, I swear!
- We recently concluded that we need to find better ways to collect metrics for IDD usage, as well as for some of our other services such as ADDE. The NetVizura service we have been using to collect the metrics below lost the first quarter of 2023’s data, and days after we pulled these metrics a bad database migration lost all metrics up to that point. Between that and the need to keep these metrics consistent, we will be working to improve our methods. More details can be found in the Data Services report.
Ongoing Activities
We plan to continue the following activities:
- Fire weather products (HRRR Smoke) that are being made available by NOAA/GSL in an EXP feed were added to the set of HRRR products that are available from hrrr.unidata.ucar.edu. These products along with other model output are available via the TDS and Unidata AWIPS EDEX.
- Other data sets we continue to explore with NOAA/GSD/ESRL are:
NOAAPort Data Ingest
- Ingest of the DVB-S2 NOAAPort Satellite Broadcast Network (SBN) products and their relay to end-users via the IDD has been “operational” at the UPC since August 2014.
Considerable effort has been expended in streamlining our NOAAPort ingest systems and assisting sites (UWisc/SSEC, NOAA/GSL, NOAA/SPC, Fox13 TV) in troubleshooting problems being experienced in their systems.
- The NOAAPort-derived data streams (HDS, IDS|DDPLUS, NGRID, NIMAGE, NEXRAD3 and NOTHER) are redundantly injected into the IDD at four geographically separate locations: UCAR/Unidata, UWisc/SSEC, Allisonhouse.com and Fox13 TV in Tampa, FL.
- LSU/Climate is no longer participating because of satellite dish hardware degradation; after consideration, they determined that a new dish, installation, and other associated costs were not worth the investment.
- Unidata's NOAAPort ingest package is bundled with current versions of the LDM. The current LDM release is v6.15.0.
Relevant Metrics
- Approximately 545 machines at 176 sites are running LDM-6 and reporting real-time statistics to the UPC.
We routinely observe that the number of sites reporting real-time statistics fluctuates. We are not 100% certain why this may be the case, but our best guess is that some sites do not keep their LDMs running all of the time; campus firewall adjustments block the sending of the statistics; and/or sites decide to stop sending statistics. The latter possibility seems to be happening more frequently.
NB: We know that there are a number of sites that are participating in the IDD, but are not reporting real-time statistics back to us. Reporting of real-time statistics is not and never has been mandatory.
Unidata staff routinely assist in the installation and tuning of LDM-6 at user sites as a community service. We have learned about sites not sending real-time statistics during these kinds of support activities, and a number of times the impediment to sending in stats is firewall configurations at the user sites.
- A number of organizations/projects that do not report statistics to Unidata continue to use the LDM to move substantial amounts of data: NOAA, NASA, USGS, USACE, the governments of Spain and South Korea, private companies, etc.
- UCAR IDD top-level relay clusters, idd.unidata.ucar.edu and iddb.unidata.ucar.edu
The IDD relay clusters, described in the June 2005 CommunitE-letter article Unidata's IDD Cluster, routinely relay data to more than 1250 downstream connections. The primary IDD relay cluster, idd.unidata.ucar.edu, was moved to the NCAR-Wyoming Supercomputing Center in Cheyenne, WY in late August 2019.
Over the period from March 23, 2023 through December 31, 2023 (IDD volume snapshots are taken during periods that do not have monitoring dropouts in NetVizura plots), the volume of LDM/IDD data flowing through the Front Range GigaPop averaged around 6.4 Gbps (~69.12 TB/day), and peak rates reached 9.3 Gbps (~100 TB/day if that rate were sustained, which it definitely is not).
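The TB/day figures follow directly from the bit rates. As a small illustrative sketch (the helper name `gbps_to_tb_per_day` is ours, and decimal units are assumed, i.e. 1 Gb = 10^9 bits and 1 TB = 10^12 bytes), the conversion can be written as:

```python
SECONDS_PER_DAY = 86_400


def gbps_to_tb_per_day(gbps):
    """Convert a sustained rate in gigabits/second to terabytes/day.

    Assumes decimal units: 1 Gb = 1e9 bits, 1 TB = 1e12 bytes.
    """
    bytes_per_second = gbps * 1e9 / 8  # 8 bits per byte
    return bytes_per_second * SECONDS_PER_DAY / 1e12


if __name__ == "__main__":
    for rate in (6.4, 9.3):
        print(f"{rate} Gbps is about {gbps_to_tb_per_day(rate):.2f} TB/day")
```

Under those assumptions, 6.4 Gbps works out to 69.12 TB/day and 9.3 Gbps to about 100.4 TB/day, matching the figures quoted above.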
The following table of volume snapshots shows that the volume of data flowing to downstreams out of UCAR has been reasonably consistent:
Date range          | Src Ave   Max | Dst Ave    Max | Total Ave   Max |
20200508 - 20200630 |     5.4   7.5 |    42.1   52.9 |       5.5   7.5 |
20200701 - 20200930 |     5.4   7.9 |    41.9   60.3 |       5.4   7.9 |
20201001 - 20201231 |     5.2   6.9 |    39.9   55.9 |       5.3   7.0 |
20210101 - 20210331 |     5.5   8.0 |    42.3   59.9 |       5.5   8.1 |
20210401 - 20210415 |     6.1  15.5 |    46.4  112.7 |       6.1  15.7 |
20210601 - 20210719 |     6.6   9.2 |    50.5   73.0 |       6.6   9.2 |
20210908 - 20211005 |     7.6  14.9 |    59.3  121.7 |       7.7  15.0 |
20211101 - 20211231 |     6.7   9.1 |    52.4   71.4 |       6.8   9.2 |
20220208 - 20220311 |     6.6  15.2 |    53.5  114.8 |       6.6  15.3 |
20220412 - 20220521 |     7.2  14.5 |    52.6  103.7 |       7.3  14.6 |
20220717 - 20220831 |     7.3  13.3 |    46.3   86.1 |       7.3  13.4 |
20220714 - 20230313 |     7.8  11.7 |    51.1   77.4 |       7.8  11.7 |
20230910 - 20231013 |     6.8  11.7 |    39.4   74.3 |       6.8  11.8 |
20230323 - 20231231 |     6.3   9.2 |    37.0   56.5 |       6.4   9.3 |
NB: The units for Src and Total Ave and Max are Gbps (gigabits per second), and the units for Dst are Mbps (megabits per second).
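One way to put a number on "reasonably consistent" is to look at the spread of the Total averages across snapshots. The sketch below is illustrative only: the function names and the dictionary layout are assumptions, and the rows are copied verbatim from the table above (Src and Total in Gbps, Dst in Mbps, per the note).

```python
# Snapshot rows copied from the table above: date range | Src | Dst | Total,
# each group holding "average maximum".
SNAPSHOTS = """\
20200508 - 20200630 | 5.4 7.5 | 42.1 52.9 | 5.5 7.5 |
20200701 - 20200930 | 5.4 7.9 | 41.9 60.3 | 5.4 7.9 |
20201001 - 20201231 | 5.2 6.9 | 39.9 55.9 | 5.3 7.0 |
20210101 - 20210331 | 5.5 8.0 | 42.3 59.9 | 5.5 8.1 |
20210401 - 20210415 | 6.1 15.5 | 46.4 112.7 | 6.1 15.7 |
20210601 - 20210719 | 6.6 9.2 | 50.5 73.0 | 6.6 9.2 |
20210908 - 20211005 | 7.6 14.9 | 59.3 121.7 | 7.7 15.0 |
20211101 - 20211231 | 6.7 9.1 | 52.4 71.4 | 6.8 9.2 |
20220208 - 20220311 | 6.6 15.2 | 53.5 114.8 | 6.6 15.3 |
20220412 - 20220521 | 7.2 14.5 | 52.6 103.7 | 7.3 14.6 |
20220717 - 20220831 | 7.3 13.3 | 46.3 86.1 | 7.3 13.4 |
20220714 - 20230313 | 7.8 11.7 | 51.1 77.4 | 7.8 11.7 |
20230910 - 20231013 | 6.8 11.7 | 39.4 74.3 | 6.8 11.8 |
20230323 - 20231231 | 6.3 9.2 | 37.0 56.5 | 6.4 9.3 |
"""


def parse_snapshots(text):
    """Parse pipe-delimited rows into dicts of [average, maximum] pairs."""
    rows = []
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Skip the header or any line that does not start with a YYYYMMDD date.
        if len(cells) < 4 or not cells[0][:8].isdigit():
            continue
        start, end = (part.strip() for part in cells[0].split("-"))
        src, dst, total = ([float(v) for v in cell.split()] for cell in cells[1:4])
        rows.append({"start": start, "end": end,
                     "src": src, "dst": dst, "total": total})
    return rows


def total_average_range(rows):
    """Smallest and largest Total average (Gbps) across all snapshots."""
    averages = [row["total"][0] for row in rows]
    return min(averages), max(averages)


if __name__ == "__main__":
    rows = parse_snapshots(SNAPSHOTS)
    low, high = total_average_range(rows)
    busiest = max(rows, key=lambda row: row["total"][1])
    print(f"Total average ranged from {low} to {high} Gbps")
    print(f"Highest Total maximum: {busiest['total'][1]} Gbps "
          f"({busiest['start']} - {busiest['end']})")
```

For these rows the Total average stays between 5.3 and 7.8 Gbps, while the maxima vary far more from one snapshot to the next.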
Strategic Focus Areas
We support the following goals described in the Unidata Strategic Plan:
- Managing Geoscience Data
The IDD project demonstrates how sites can employ the LDM to move and process data in their own environments.
- Providing Useful Tools
The freely available LDM software and the IDD project that is built on top of the LDM have served as a demonstration for distribution of real-time data for a variety of organizations, including the U.S. National Weather Service.
The cluster approach for LDM/IDD data relay that Unidata pioneered has been adopted by several Unidata university sites, and is currently being implemented at U.S. government sites.
Unidata’s NOAAPort ingest package, which is bundled with LDM-6, is being used by a variety of university, U.S. government, and private sector entities.
Both the LDM and NOAAPort ingest packages are bundled with AWIPS.
- Supporting People
The IDD is the primary method that core Unidata sites use to get the meteorological data that they need. Providing access to data in near real-time is a fundamental Unidata activity. The IDD-Brasil, the South American peer of the North American IDD, and IDD-Caribe, the Central American peer of the North American IDD, are helping to extend real-time data delivery throughout the Americas.
Prepared April 2024