
Status Report: Internet Data Distribution

April - September 2015

Mike Schmidt, Jeff Weber, Tom Yoksas

Strategic Focus Areas

We support the following goals described in the Unidata Strategic Plan:

  1. Enable widespread, efficient access to geoscience data
    A project like the IDD demonstrates how sites can employ the LDM to move data in their own environments.
  2. Develop and provide open-source tools for effective use of geoscience data
    The IDD is powered by the Unidata LDM-6, which is made freely available to all. The Unidata NOAAPort ingest package is used by a variety of university and non-university community members. Both the LDM and the NOAAPort ingest package are bundled by Raytheon in AWIPS-II.

  3. Provide cyberinfrastructure leadership in data discovery, access, and use
    The community-driven IDDs provide push data services to an ever-increasing community of global educators and researchers.
  4. Build, support, and advocate for the diverse geoscience community
    Providing access to data in real time is a fundamental Unidata activity.

The IDD-Brasil, the South American peer of the North American IDD operated by the UPC, is helping to extend real-time data delivery outside of the U.S. to countries in South America and Africa. The Universidad de Costa Rica is experimenting with relaying data received in the IDD to Colombia.

Activities Since the Last Status Report

Internet Data Distribution (IDD)

After an extensive evaluation period, 0.25 degree GFS data (which became operational at NCEP on January 14, 2015) was added to the CONDUIT data stream starting with the 12Z run on July 28. Monitoring has shown that peak CONDUIT data volumes increased from about 8 GB/hr to about 21 GB/hr with all forecast hours of the 0.25 degree GFS included.

The increase in aggregate data volume that results from the addition of the 0.25 degree GFS and of HRRR data from NOAA/GSD can be seen by comparing the volume on our IDD test leaf node, lead.unidata.ucar.edu, with that on one of the idd.unidata.ucar.edu real server backends, shown below:

Data Volume Summary for lead.unidata.ucar.edu

Maximum hourly volume  63172.218 M bytes/hour
Average hourly volume  31509.536 M bytes/hour

Average products per hour     359719 prods/hour

Feed                           Average             Maximum     Products
                    (M byte/hour)            (M byte/hour)   number/hour
CONDUIT                7543.544    [ 23.941%]    21413.167    83618.087
FSL2                   7494.892    [ 23.786%]    32739.626    12676.435
NEXRAD2                5674.899    [ 18.010%]     8061.093    63293.500
NGRID                  4687.849    [ 14.878%]     8107.771    33423.978
NOTHER                 2118.776    [  6.724%]     4479.228     6299.522
NEXRAD3                1939.169    [  6.154%]     2602.765    92942.761
FNMOC                  1156.486    [  3.670%]     3860.758     3251.283
HDS                     355.593    [  1.129%]      654.965    19128.804
NIMAGE                  159.260    [  0.505%]      263.313      202.565
FNEXRAD                 128.916    [  0.409%]      169.548      105.239
GEM                      78.549    [  0.249%]      495.761      757.891
UNIWISC                  69.489    [  0.221%]      117.668       43.783
IDS|DDPLUS               62.497    [  0.198%]       75.362    43338.783
EXP                      36.210    [  0.115%]       73.385      285.848
LIGHTNING                 3.296    [  0.010%]        6.446      349.196
GPS                       0.111    [  0.000%]        1.290        1.022

Data Volume Summary for uni16.unidata.ucar.edu

Maximum hourly volume  33304.768 M bytes/hour
Average hourly volume  19089.235 M bytes/hour

Average products per hour     292558 prods/hour

Feed                           Average             Maximum     Products
                    (M byte/hour)            (M byte/hour)   number/hour
NEXRAD2                5676.682    [ 29.738%]     8061.093    63316.913
NGRID                  4691.342    [ 24.576%]     8107.771    33447.370
CONDUIT                2624.859    [ 13.750%]    10488.076    28922.630
NOTHER                 2169.096    [ 11.363%]     4479.228     6431.217
NEXRAD3                1940.247    [ 10.164%]     2602.765    93001.304
FNMOC                  1156.486    [  6.058%]     3860.758     3251.283
HDS                     355.714    [  1.863%]      650.487    19136.674
NIMAGE                  155.868    [  0.817%]      263.313      200.348
FNEXRAD                  94.550    [  0.495%]      155.292       66.957
GEM                      78.549    [  0.411%]      495.761      757.891
IDS|DDPLUS               62.529    [  0.328%]       75.362    43366.174
UNIWISC                  43.712    [  0.229%]      106.520       25.783
EXP                      36.213    [  0.190%]       73.385      285.870
LIGHTNING                 3.275    [  0.017%]        6.446      346.413
GPS                       0.112    [  0.001%]        1.290        1.043
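
The per-feed percentages in these listings are simply each feed's average hourly volume divided by the node's total average volume. As an illustration (not a Unidata tool), the short Python sketch below recomputes those shares from a listing saved in the format shown above; the file name volume_summary.txt is only a placeholder.

#!/usr/bin/env python3
# Recompute per-feed shares of the total average hourly volume from a
# plain-text listing in the "Data Volume Summary" format shown above.
# Illustrative sketch only; 'volume_summary.txt' is a placeholder name.
import re
import sys

# Matches one feed row, e.g.:
# CONDUIT                7543.544    [ 23.941%]    21413.167    83618.087
ROW = re.compile(r"^(?P<feed>[A-Z0-9|]+)\s+(?P<avg>[\d.]+)\s+"
                 r"\[\s*[\d.]+%\]\s+(?P<max>[\d.]+)\s+(?P<prods>[\d.]+)$")

def shares(path):
    rows = []
    with open(path) as f:
        for line in f:
            m = ROW.match(line.strip())
            if m:
                rows.append((m["feed"], float(m["avg"])))
    total = sum(avg for _, avg in rows)
    for feed, avg in sorted(rows, key=lambda r: r[1], reverse=True):
        print(f"{feed:12s} {avg:10.3f} Mbyte/hr  {100.0 * avg / total:6.2f}%")

if __name__ == "__main__":
    shares(sys.argv[1] if len(sys.argv) > 1 else "volume_summary.txt")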

Recently, top-level IDD relays and the sites they feed CONDUIT data to have been experiencing unusually high latencies that coincide with the transmission of the 0.25 degree GFS data. Current testing suggests that a large fraction of these latencies originate at or near NCEP. Investigations are ongoing.
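
For context, the latency of an IDD product is the difference between the time it was created at its source and the time it is received at a downstream node, so NCEP-side delays show up as hourly latency spikes aligned with the 0.25 degree GFS transmission window. The sketch below simply bins such differences by hour; the (created, received) timestamp pairs are hypothetical stand-ins for whatever a site actually collects from its LDM logs or real-time statistics.

# Bin product latencies by hour to see when delays spike.  Minimal sketch,
# not a Unidata tool: the (created, received) datetime pairs are assumed to
# come from whatever source a site uses (LDM logs, rtstats pages, etc.).
from collections import defaultdict
from datetime import datetime

def hourly_latency(pairs):
    """Return {hour: (mean_seconds, max_seconds)} for (created, received) pairs."""
    buckets = defaultdict(list)
    for created, received in pairs:
        hour = received.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append((received - created).total_seconds())
    return {h: (sum(v) / len(v), max(v)) for h, v in buckets.items()}

if __name__ == "__main__":
    # Made-up example timestamps for two products received in the 12Z hour
    pairs = [(datetime(2015, 7, 28, 12, 0, 5), datetime(2015, 7, 28, 12, 1, 20)),
             (datetime(2015, 7, 28, 12, 30, 0), datetime(2015, 7, 28, 12, 42, 0))]
    for hour, (mean_s, max_s) in sorted(hourly_latency(pairs).items()):
        print(f"{hour:%Y-%m-%d %H}Z  mean {mean_s:6.1f} s  max {max_s:6.1f} s")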

Ongoing Activities

We plan to continue the following activities:

Tracking NWS Technical Implementation Notices (.xml for machines), which announce upcoming additions to the data distributed via NOAAPort, for example:

http://www.nws.noaa.gov/os/notification/tin14-28hrrr-cca.htm

http://www.nws.noaa.gov/os/notification/tin13-43estofs_noaaport_aaa.htm

Briefly, these additions comprise High-Resolution Rapid Refresh (HRRR) model output and Extratropical Surge and Tide Operational Forecast System (ESTOFS) output.

NOAAPort Data Ingest

The problem of routinely missing large numbers of NOAAPort frames has largely been mitigated through a combination of hardware upgrades and a Novra firmware upgrade aimed at handling the “small” packets routinely seen in the GOES product channel.

Relevant Metrics

The IDD relay cluster, described in the June 2005 CommunitE-letter article Unidata's IDD Cluster, routinely relays data to more than 1250 downstream connections.

Data input to the cluster nodes averages around 20 GB/hr (~0.5 TB/day); average data output from the entire cluster exceeds 2.9 Gbps (~32 TB/day); and peak rates routinely exceed 6.4 Gbps (which would be ~70 TB/day if the rate were sustained).
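
These daily-volume equivalents are straightforward unit conversions (1 Gbps is 0.125 GB/s, or roughly 10.8 TB/day if sustained); a quick check of the arithmetic:

# Quick check of the rate-to-daily-volume conversions quoted above.
SECONDS_PER_DAY = 86400

def gbps_to_tb_per_day(gbps):
    return gbps * 0.125 * SECONDS_PER_DAY / 1000   # 8 bits/byte, 1000 GB/TB

def gb_per_hour_to_tb_per_day(gb_per_hour):
    return gb_per_hour * 24 / 1000

print(gb_per_hour_to_tb_per_day(20))   # input:        ~0.48 TB/day (~0.5 TB/day)
print(gbps_to_tb_per_day(2.9))         # average out:  ~31.3 TB/day (~32 TB/day)
print(gbps_to_tb_per_day(6.4))         # peak out:     ~69.1 TB/day (~70 TB/day)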

Cluster real server backends and accumulator nodes routinely have instantaneous output volumes that exceed 1 Gbps. Bonding pairs of Ethernet interfaces was needed to support these output rates. The next generation of cluster machines will need 10 Gbps Ethernet capability.

The increase in data volume over the past six months is attributable to the addition of 0.25 degree GFS data to CONDUIT, the overall increase in the volume of data transmitted in NOAAPort (which now routinely exceeds 10 GB/hr), and the increase in dual-polarization NEXRAD data. During the GOES-R test period at the end of August and beginning of September, the NOTHER datastream pushed the total volume of data sent over NOAAPort to peaks in excess of 20 GB/hr.


Prepared September 2015