The SAND Project and a New Research Networking Technical Working Group
Shawn McKee / University of Michigan
on behalf of the SAND Collaboration
At the Internet2 Community Measurement, Metrics, and Telemetry meeting - May 12, 2020
Outline
I2 PWG-CMMT
Internet2 Performance Working Group Community Measurement, Metrics, and Telemetry
OSG/WLCG Networking Activities
perfSONAR deployment
288 Active perfSONAR instances
- 207 production endpoints
- WLCG T1/T2 coverage
- Continuously testing over 5000 links
- Testing coordinated and managed from central place
- Dedicated latency and bandwidth nodes at each site
- Open platform - tests can be scheduled by anyone who participates in our network and runs perfSONAR
Network Measurement Platform Overview
Diagram components: Collector, Store (long-term), Store (short-term), pS Monitoring, pS Configuration, Tape, Experiments, MONIT-GRAFANA, pS Dashboard
Grafana - perfSONAR dashboard
6th SIG-PMV Meeting Dublin
Grafana - IPv6 dashboard
See more Grafana dashboards at http://monit-grafana-open.cern.ch/
Current Platform Use
The NSF SAND Project
SAND: Service Analysis and Network Diagnosis
This is an NSF-funded project (award #1827116) focused on combining, visualizing, and analyzing disparate network monitoring and service logging data. (GOAL: capitalize on our rich network dataset!)
Website: https://sand-ci.org/ (the project started in September 2018 and will last 2 years)
PI: Brian Bockelman, Co-PIs: Shawn McKee, Rob Gardner
Brian Bockelman
Associate Scientist
Morgridge Institute for Research, University of Wisconsin
bbockelman@morgridge.org
Shawn McKee
Research Scientist
University of Michigan Physics
smckee@umich.edu
Rob Gardner
Senior Scientist
University of Chicago Physics
rwg@hep.uchicago.edu
SAND Project Vision
It will extend and augment the OSG networking efforts with a primary goal of extracting useful insights and metrics from the wealth of network data being gathered from perfSONAR, FTS, R&E network flows and related network information from HTCondor and others.
Shown on the top diagram to the right is the logical SAND data flow from source to analytics.
The bottom diagram to the right shows the potential power of the extensive network tomography we gain by continuously measuring thousands of R&E network paths. In this example, 3 host-pairs see differing packet loss on intersecting paths. We can infer a solution!
Diagram labels (packet loss per host pair): A-B 3%, D-C 2%, E-F 1%
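The inference in the tomography example can be sketched in a few lines of Python. This is a minimal illustration, not the SAND implementation: the router names (r1 to r5), the link lists, and the loss-free G-H pair are hypothetical, and only the three loss values come from the diagram. The idea is that a link crossed by every lossy path, and by no loss-free path, is the most likely location of the problem.

```python
# Hypothetical topology: each measured host pair maps to the links its path crosses.
# Router names and link lists are illustrative assumptions, not taken from the slide.
paths = {
    ("A", "B"): ["A-r1", "r1-r2", "r2-r3", "r3-B"],
    ("D", "C"): ["D-r4", "r4-r2", "r2-r3", "r3-C"],
    ("E", "F"): ["E-r2", "r2-r3", "r3-r5", "r5-F"],
    ("G", "H"): ["G-r1", "r1-r4", "r4-H"],  # assumed loss-free control path
}

# Packet loss in percent; the first three values are the ones shown in the diagram.
loss = {("A", "B"): 3.0, ("D", "C"): 2.0, ("E", "F"): 1.0, ("G", "H"): 0.0}


def suspect_links(paths, loss_pct, threshold=0.0):
    """Return links shared by all lossy paths and absent from all clean paths."""
    lossy = [set(links) for pair, links in paths.items() if loss_pct[pair] > threshold]
    clean = [set(links) for pair, links in paths.items() if loss_pct[pair] <= threshold]
    if not lossy:
        return set()
    suspects = set.intersection(*lossy)
    for links in clean:
        suspects -= links
    return suspects


suspects = suspect_links(paths, loss)
print(sorted(suspects))
```

This simple set intersection ignores the loss magnitudes; the differing percentages suggest that a real analysis would also have to account for additional loss on the non-shared segments.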
SAND Planning and Work Areas
SAND Planning and Work Areas (2)
We have more items on our list:
In the interest of time, I will only show a couple things.
Challenge: Network Topology
Whenever we identify a possible network problem, the first question is: what path is being measured?
Having many paths continuously monitored is a very powerful tool for both identifying network issues and localizing them!
Fortunately, we are scheduling regular “traceroute” tests between our perfSONAR measurement end-points.
We have students working on data cleaning and topology extraction.
New path visualisation tool being developed by MEPhI SAND collaborators
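As a rough illustration of what topology extraction from traceroute results can look like, the sketch below turns hop lists into a set of directed links with observation counts. It is not the SAND code: the function name, the input format (a list of hop IP addresses, with None for a hop that did not respond) and the sample addresses are all assumptions made for the example.

```python
from collections import Counter


def extract_links(traceroutes):
    """Count directed hop-to-hop links, skipping unresponsive or repeated hops.

    Each traceroute is a list of hop addresses; None marks a hop that did not
    respond (an assumed encoding for this sketch).
    """
    links = Counter()
    incomplete = 0
    for hops in traceroutes:
        if any(hop is None for hop in hops):
            incomplete += 1
        for a, b in zip(hops, hops[1:]):
            # A link is only trusted when both ends answered and differ.
            if a is None or b is None or a == b:
                continue
            links[(a, b)] += 1
    return links, incomplete


# Hypothetical sample data using private addresses.
traceroutes = [
    ["10.0.0.1", "10.0.1.1", "10.0.2.1", "10.0.9.9"],
    ["10.0.0.1", "10.0.1.1", None, "10.0.9.9"],
    ["10.0.5.5", "10.0.1.1", "10.0.2.1", "10.0.9.9"],
]

links, incomplete = extract_links(traceroutes)
for (src, dst), count in sorted(links.items()):
    print(f"{src} -> {dst}: seen {count} time(s)")
print(f"incomplete traceroutes: {incomplete}")
```

Dropping links next to an unresponsive hop avoids inventing adjacency that was never observed, at the cost of leaving gaps that the cleaning step then has to handle.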
DEMO: Dashboards for Network Metrics
Finding Relevant Information
So far I have shown a few different links. Another area the SAND team would like to improve is making it easier to find all the relevant tools, docs and data.
We have set up a web server at: https://toolkitinfo.opensciencegrid.org/toolkitinfo/
The goal is to continue to maintain and add to the various menus available, so that a broad range of users can easily find and access network data and analytics results.
We will be adding info on any future containerized perfSONAR, new topology capabilities, and links for adding your site data to SAND.
SAND Summary
team@sand-ci.org
Part 2: RNTWG Slides
Acknowledgements
We would like to thank the WLCG, HEPiX, perfSONAR and OSG organizations for their work on the topics presented.
In addition we want to explicitly acknowledge the support of the National Science Foundation which supported this work via:
SAND References
SAND Backup Slides
OSG/WLCG networking projects
There are 4 coupled projects around the core OSG Net Area
Available Data Overview
SAND and OSG/WLCG are gathering a number of potentially very useful metrics
This data is being transferred using message bus technologies (RabbitMQ (OSG) and ActiveMQ (CERN)) and ends up in two different Elasticsearch instances (University of Chicago analytics platform and University of Nebraska)
This data could provide powerful insights into our R&E network infrastructure by using the temporal and spatial information we have available.
Some Context: IRIS-HEP
The Institute for Research and Innovation in Software in High Energy Physics (IRIS-HEP) project has been funded by the National Science Foundation in the US as grant OAC-1836650, starting 1 September 2018.
The institute focuses on preparing for High Luminosity (HL) LHC and is funded at $5M / year for 5 years. There are three primary development areas:
The institute also funds the LHC part of Open Science Grid, including the networking area and created a new integration path (the Scalable Systems Laboratory) to deliver its R&D activities into the distributed and scientific production infrastructures. Website for more info: http://iris-hep.org/
perfSONAR Data Details
We are collecting a number of different types of data from perfSONAR which are sent to different “topics” on the RabbitMQ bus and put into their own index in Elasticsearch:
You can explore the details via Kibana: https://atlas-kibana.mwt2.org/s/networking/app/kibana#/discover?_g=()
SAND Collaboration Meeting Details
Our first in-person collaboration meeting was June 17-18, 2019 at U Chicago
Main topic areas discussed day 1
The second day was a “hackathon” where we worked on items from day 1.
The “Team”
Picture credit: Rob Gardner (that’s why he’s missing)
Visualizing NSF CC* Institutions
The NSF has had a very successful series of Campus Cyberinfrastructure (CC*) solicitations, and all require recipients to deploy perfSONAR
SAND wants to make it easy for these sites to be seen by simply adding a ‘CCSTAR’ community to their perfSONAR toolkits https://display.sand-ci.org/
Of course showing them on the map is just a first step
We then want to provide a very easy way for sites to “opt-in” to SAND so that we can begin to gather their perfSONAR data and provide our analytics, alerting and monitoring for them.
SAND Activities to Date
Initial efforts targeted improving the network data pipeline from OSG
We have also been working with the collected data and have identified challenges that we need to address to make it more useful
Issues with Traceroute and Network Paths
While we regularly try to measure the network paths between our hosts (and by proxy, between our sites), the traceroute tool has some limitations
For all these reasons, we have challenges in trying to use our traceroute results to understand the network topology
The SAND project is planning to work on cleaning things up