Research Networking Technical Working Group
Shawn McKee, Marian Babik
on behalf of the RNTWG
LHCONE/LHCOPN Meeting (virtual)
May 13, 2020
Presentation Overview
At our last meeting in January, the various LHC/HEP experiments described their networking needs, interests and use cases
The experiments reinforced the areas that the HEPiX NFV phase I report suggested were useful to focus effort on: making network use visible, pacing/shaping WAN data flows, and network orchestration
In this presentation I want to cover the recent work to create a working group to push the outlined work forward
WLCG Network Requirements
Research Networking Technical WG
Charter:
https://docs.google.com/document/d/1l4U5dpH556kCnoIHzyRpBl74IPc0gpgAG3VPUp98lo0/edit#
Mailing list:
http://cern.ch/simba3/SelfSubscription.aspx?groupName=net-wg
Members (80 as of today, in no particular order):
Christian Todorov (Internet2) Frank Burstein (BNL) Richard Carlson (DOE) Marcos Schwarz (RNP) Susanne Naegele Jackson (FAU)
Alexander Germain (OHSU) Casey Russell (CANREN) Chris Robb (GlobalNOC/IU) Dale Carder (ESnet) Doug Southworth (IU)
Eli Dart (ESnet) Eric Brown (VT) Evgeniy Kuznetsov (JINR) Ezra Kissel (ESnet) Fatema Bannat Wala (LBL) Joseph Breen (UTAH) James Blessing (Jisc) James Deaton (Great Plains Network) Jason Lomonaco (Internet2) Jerome Bernier (IN2P3) Jerry Sobieski
Ji Li (BNL) Joel Mambretti (Northwestern) Karl Newell (Internet2) Li Wang (IHEP) Mariam Kiran (ESnet) Mark Lukasczyk (BNL)
Matt Zekauskas (Internet2) Michal Hazlinsky (Cesnet) Mingshan Xia (IHEP) Paul Acosta (MIT) Paul Howell (Internet2)
Paul Ruth (RENCI) Pieter de Boer (SURFnet) Roman Lapacz (PSNC) Sri N () Stefano Zani (CNAF) Tamer Nadeem (VCU)
Tim Chown (Jisc) Tom Lehman (ESnet) Vincenzo Capone (GEANT) Wenji Wu (FNAL) Xi Yang (ESnet) Chin Guok (ESnet)
Tony Cass (CERN) Eric Lancon (BNL) James Letts (UCSD) Harvey Newman (Caltech) Duncan Rand (Jisc)
Edoardo Martelli (CERN) Shawn McKee (Univ. of Michigan) Simone Campana (CERN) Andrew Hanushevsky (SLAC)
Marian Babik (CERN) James William Walder () Petr Vokac () Alexandr Zaytsev (BNL) Raul Cardoso Lopes () Mario Lassnig (CERN) Han-Wei Yen () Wei Yang (Stanford) Edward Karavakis (CERN) Tristan Suerink (Nikhef) Garhan Attebury (UNL) Pavlo Svirin ()
Shan Zeng (IHEP) Jin Kim (KISTI) Richard Cziva (ESnet) Phil Demar (FNAL) Justas Balcas (Caltech) Bruno Hoeft (FZK)
Making our network use visible
Understanding HEP traffic flows in detail is critical for understanding how our complex systems are actually using the network. Current monitoring/logging tells us where data flows start and end, but cannot characterize the data in flight. In general, the monitoring we have is experiment-specific and very difficult to correlate with what is happening in the network.
(See next slide example)
Packet Marking Overview (Example)
The proposal is to provide a mechanism to mark our network packets with the experiment/owner and activity
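One way to picture such a mark is as a small integer that combines an experiment identifier with an activity identifier. The sketch below is illustrative only: the field widths, the identifier tables and the function names are assumptions made for this example, not a scheme agreed by the working group.

```python
# Hypothetical identifier tables; real values would come from an agreed registry.
EXPERIMENTS = {"atlas": 1, "cms": 2, "lhcb": 3, "alice": 4}
ACTIVITIES = {"production": 1, "analysis": 2, "data_transfer": 3}

# Reverse lookups used when decoding a mark.
EXPERIMENT_NAMES = {v: k for k, v in EXPERIMENTS.items()}
ACTIVITY_NAMES = {v: k for k, v in ACTIVITIES.items()}

EXPERIMENT_BITS = 9  # assumed width of the experiment/owner field
ACTIVITY_BITS = 6    # assumed width of the activity field


def encode_mark(experiment, activity):
    """Combine experiment and activity identifiers into one integer mark."""
    exp_id = EXPERIMENTS[experiment]
    act_id = ACTIVITIES[activity]
    if exp_id >= (1 << EXPERIMENT_BITS) or act_id >= (1 << ACTIVITY_BITS):
        raise ValueError("identifier does not fit in its assumed field width")
    # Experiment occupies the high bits, activity the low bits.
    return (exp_id << ACTIVITY_BITS) | act_id


def decode_mark(mark):
    """Split an integer mark back into (experiment, activity) names."""
    act_id = mark & ((1 << ACTIVITY_BITS) - 1)
    exp_id = mark >> ACTIVITY_BITS
    return EXPERIMENT_NAMES[exp_id], ACTIVITY_NAMES[act_id]
```

With these assumed tables, a monitoring tool that sees a mark can recover the owner and activity by calling decode_mark on it, without any experiment-specific logs.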
Pacing/Shaping WAN data flows
It remains a challenge for HEP storage endpoints to utilize the network efficiently and fully.
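Pacing means spreading packets evenly in time at a target rate instead of sending them in bursts. The sketch below only illustrates the arithmetic behind that idea; the function names and the 1500-byte default packet size are assumptions for the example, and a real deployment would more likely rely on pacing support in the operating system or storage software.

```python
def pacing_interval(rate_bps, packet_bytes=1500):
    """Seconds between packet departures that hold the average rate at rate_bps."""
    if rate_bps <= 0:
        raise ValueError("rate must be positive")
    return packet_bytes * 8 / rate_bps


def send_schedule(n_packets, rate_bps, packet_bytes=1500, start=0.0):
    """Evenly spaced departure times (in seconds) for n_packets at the target rate."""
    gap = pacing_interval(rate_bps, packet_bytes)
    return [start + i * gap for i in range(n_packets)]
```

For example, at 10 Gbit/s with 1500-byte packets the gap between departures is about 1.2 microseconds.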
Network orchestration
Straw man proposal for work plan
We already identified areas of work, so the proposed work plan would be (per area):
Goal: finish prototype marking stage by EoY (or Q1 2021)*
Packet Marking Sub Group
Since Packet Marking was first on the list, we have a soon-to-be-announced document focused on organizing this work
See draft here
Join the mailing list to participate
My goal is to have some WLCG traffic labeled by the end of this calendar year; we should discuss this.
Packet Marking Challenges
We would like this to be applicable for ALL significant R&E network users/science domains, not just HEP
How best to use the number of bits we can get?
What can we rely on from the Linux network stack and what do we need to provide?
What can the network operators provide for accounting?
Let’s Discuss!
We have identified packet marking as important for WLCG
How do we enable it for all (most) of our data sources?
We really need a broad range of expertise involved: network programming, standardization experience, experiment software expertise, storage software expertise, NRENs, documentation experience, monitoring, accounting, etc.
Questions, Comments, Suggestions?
Acknowledgements
We would like to thank the WLCG, HEPiX, perfSONAR and OSG organizations for their work on the topics presented.
In addition we want to explicitly acknowledge the support of the National Science Foundation which supported this work via:
References
WG Meetings and Notes: https://indico.cern.ch/category/10031/
SDN/NFV Tutorial: https://indico.cern.ch/event/715631/
2018 IEEE/ACM Innovating the Network for Data-Intensive Science (INDIS) – http://conferences.computer.org/scw/2018/#!/toc/3
OVN/OVS overview: https://www.openvswitch.org/
GEANT Automation, Orchestration and Virtualisation (link)
Cloud Native Data Centre Networking (book)
MPLS in the SDN Era (book)
Backup slides
Packet Marking - Jobs
As jobs source data onto the network OR pull data into the job, we should try to ensure the corresponding packets are marked appropriately
16
Packet Marking - Storage Elements
The primary challenge here is in two areas:
17
Some Important Notes
Network monitoring needs to continue and evolve
We have a good collaboration with ESnet, which provides our primary connectivity for WLCG traffic between North America and Europe. We hold monthly meetings to analyze our use and help ESnet plan how best to support our future needs.
The new IRNC testbed option will be important for our prototyping
18
High Level Notes
What is useful? Feasible? Possible?
Marking, shaping and orchestration are steps listed in order of assumed difficulty and time to implement
Marking and shaping/pacing must happen on the source
Orchestration is much more feasible once marking is in place
19
ESnet TransAtlantic Capacity Forecast
What ESnet can afford for $2M/year is the green line.
Capacity evolution for our terrestrial networks looks reasonable in terms of technology up to the HL-LHC
20
Packet Marking - IPv6
IPv6 incorporates a “Flow Label” in the header (20 bits)
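The Flow Label sits in the first 32-bit word of the IPv6 header, after the 4-bit version and the 8-bit traffic class. The sketch below shows how a 20-bit value would be placed in and read back from that word; the function names are hypothetical, and how the 20 bits would be divided between experiment and activity is left open here.

```python
import struct

FLOW_LABEL_MASK = 0xFFFFF  # 20 bits


def pack_first_word(flow_label, traffic_class=0):
    """Build the first 4 bytes of an IPv6 header: version, traffic class, flow label."""
    if not 0 <= flow_label <= FLOW_LABEL_MASK:
        raise ValueError("flow label must fit in 20 bits")
    if not 0 <= traffic_class <= 0xFF:
        raise ValueError("traffic class must fit in 8 bits")
    word = (6 << 28) | (traffic_class << 20) | flow_label
    return struct.pack("!I", word)  # network byte order


def unpack_flow_label(header):
    """Extract the 20-bit flow label from the start of an IPv6 header."""
    (word,) = struct.unpack("!I", header[:4])
    return word & FLOW_LABEL_MASK
```

This only manipulates header bytes in memory; setting the label on live traffic depends on operating system support, which is one of the open questions for the group.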
21
Packet Marking - IPv4
IPv4 includes an “Options” field in the header (allowing additional 32-bit words to be added)
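An IPv4 option is encoded as a type byte, a length byte and data, and the options area must be padded to a multiple of 32 bits, with at most 40 bytes of options in total. The sketch below builds such an option for a marking payload. The option type value and the function names are placeholders chosen for illustration; no option number has been assigned for this purpose.

```python
HYPOTHETICAL_OPTION_TYPE = 0x1E  # placeholder value, not an assigned marking option
MAX_OPTIONS_BYTES = 40           # 60-byte maximum header minus the 20-byte fixed part


def build_option(data, option_type=HYPOTHETICAL_OPTION_TYPE):
    """Encode one option as type, length, data, padded to a 32-bit boundary."""
    length = 2 + len(data)  # the length byte counts the type and length bytes too
    option = bytes([option_type, length]) + data
    padding = (-len(option)) % 4
    option += b"\x00" * padding  # pad with End of Option List bytes
    if len(option) > MAX_OPTIONS_BYTES:
        raise ValueError("options exceed the 40-byte IPv4 limit")
    return option


def header_length_words(options):
    """Value of the IHL field: 5 words of fixed header plus the options."""
    return 5 + len(options) // 4
```

A two-byte marking payload therefore costs exactly one extra 32-bit word in every packet header.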
22
Network Functions Virtualisation WG
Mandate: Identify use cases, survey existing approaches and evaluate whether and how Software Defined Networking (SDN) and Network Functions Virtualisation (NFV) should be deployed in HEP.
Team: 60 members including R&Es (GEANT, ESnet, Internet2, AARNet, Canarie, SURFNet, GARR, JISC, RENATER, NORDUnet) and sites (ASGC, PIC, BNL, CNAF, CERN, KIAE, FIU, AGLT2, Caltech, DESY, IHEP, Nikhef)
Monthly meetings started in Jan 2018 (https://indico.cern.ch/category/10031/)
23
Future Work for Experiments/NRENs
The report proposes areas of future work with the experiments
During the LHCONE/LHCOPN meeting we heard consistent interest in making network use more visible (all VOs), more effective (CMS pacing, others) and orchestrated (managed, controlled). This matches what we identified:
Areas proposed for this WG (pages 53-56):
24
NFV Report Conclusions
The primary challenge we face is ensuring that WLCG and its constituent collaborations will have the networking capabilities required to most effectively exploit LHC data for the lifetime of the LHC. To deliver on this challenge, automation is a must. The dynamism and agility of our evolving applications, tools, middleware and infrastructure require automation of at least part of our networks, which is a significant challenge in itself. While there are many technology choices that need discussion and exploration, the most important thing is ensuring the experiments and sites collaborate with the RENs, network engineers and researchers to develop, prototype and implement a useful, agile network infrastructure that is well integrated with the computing and storage frameworks being evolved by the experiments as well as the technology choices being implemented at the sites and RENs.
25