Response to Questions/Requests for Analysis Systems
Kyle Cranmer (NYU)
Analysis Systems Team
Institutions: NYU, Washington, Princeton, Cincinnati, Illinois
Summarize the set of projects/activities and associated effort for your area
See next two pages
Projects
Awkward, uproot:
func_adl & ServiceX:
pyhf: Matthew Feickert (~80%); see the usage sketch after this list
MadMiner & Exploratory ML: Johann Brehmer (~50%)
Integrating tools & developing declarative specifications for end-users:
Modernizing analysis support (e.g., Docker, analysis preservation, training, conda-forge, etc.)
Cyberinfrastructure: Sinclert Perez (~30% starting 2020, HEPData and RECAST)
Core Scikit-HEP: boost-histogram, hist, vector, …: Henry Schreiner (~65%)
Misc Scikit-HEP: scikit-hep org, particle, DecayLanguage: Henry Schreiner (10%), Daniel Vieira (???; not participating in biweekly meetings)
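For context, a minimal pyhf sketch for a hypothetical two-bin counting experiment (the numbers are illustrative, and the API shown is that of recent pyhf releases):

import pyhf

# Hypothetical two-bin counting experiment: signal plus background with
# uncorrelated background uncertainties (illustrative numbers only).
model = pyhf.simplemodels.uncorrelated_background(
    signal=[12.0, 11.0], bkg=[50.0, 52.0], bkg_uncertainty=[3.0, 7.0]
)
observations = [51.0, 48.0] + model.config.auxdata

# Maximum-likelihood fit and the observed CLs value at the nominal signal strength.
best_fit = pyhf.infer.mle.fit(observations, model)
cls_obs = pyhf.infer.hypotest(1.0, observations, model, test_stat="qtilde")
print(f"CLs(mu=1) = {float(cls_obs):.3f}")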
It should be easier for area leads to see the FTEs.
ServiceX activities overlap with DOMA/SSL; it is not clear how to account for that effort for the purposes of this meeting.
Are there internal or external collaborations associated with each project or activity? For external collaborations, is IRIS-HEP leading, contributing or simply “connecting/liaising”?
Internal:
External:
Which projects/activities/goals are making progress and which are not? (Area lead’s opinion.) For those that are not, what is impeding progress?
Projects making good progress with good adoption:
Some issues / concerns
Basically completed
How are each of these projects/activities connected to, being informed by or planning on delivering (eventually) to the experiments? Are there relevant blueprint meetings or workshops that should happen to make progress?
An analysis facility blueprint is needed; this is a gap in planning.
KyungEon does not seem to formally be part of IRIS-HEP. He should be. Maybe a fellow?
ServiceX is designed to facilitate high-performance array-based analyses. It does this by allowing users to construct sophisticated in-place data queries via an analysis description language, performing on-the-fly data transformations into convenient analysis formats, and connecting the output to future analysis facilities.
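As a concrete illustration, a func_adl-style query sketch (assuming the func_adl_servicex frontend and a deployed ServiceX endpoint; the dataset identifier and jet collection name are hypothetical):

from func_adl_servicex import ServiceXSourceXAOD

# Hypothetical Rucio dataset identifier; assumes a configured ServiceX endpoint.
ds = ServiceXSourceXAOD("mc16_13TeV:some.example.dataset")

# Declarative query: the transformation runs server-side and only the
# selected columns come back, as an awkward array.
jet_pts = (
    ds.SelectMany(lambda e: e.Jets("AntiKt4EMTopoJets"))
    .Where(lambda j: j.pt() / 1000.0 > 30.0)  # keep jets above 30 GeV
    .Select(lambda j: j.pt() / 1000.0)
    .AsAwkwardArray("JetPt")
    .value()
)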
What would be potential Year 3 milestones for each of the projects? (First ideas, to be iterated with PIs and the whole team as this process moves forward.)
What “grand challenges” would be useful to organize involving your area during Year 3 of IRIS-HEP? How would these challenges depend on efforts from other areas of IRIS-HEP, the US LHC Ops programs or the experiments?
(Assuming these are not to be completed in Y3, but organized in Y3.)
Autodiff blueprint (may need to send gradients back to the analysis facility via ServiceX).
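To make the autodiff goal concrete, here is a sketch of differentiating a pyhf likelihood with the JAX backend (illustrative one-bin model; assumes a pyhf version with JAX support):

import jax
import pyhf

pyhf.set_backend("jax")

# Illustrative one-bin model; with the JAX backend the likelihood is differentiable.
model = pyhf.simplemodels.uncorrelated_background(
    signal=[5.0], bkg=[50.0], bkg_uncertainty=[7.0]
)
data = pyhf.tensorlib.astensor([53.0] + model.config.auxdata)

def negative_log_likelihood(pars):
    return -model.logpdf(pars, data)[0]

init = pyhf.tensorlib.astensor(model.config.suggested_init())
gradient = jax.grad(negative_log_likelihood)(init)  # exact gradients, no finite differences

Gradients like these are what might need to flow back to an analysis facility via ServiceX.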
Are there new opportunities where effort from IRIS-HEP can make an impact? Is the alignment of the focus areas in IRIS-HEP appropriate?
How are projects currently managed in your area? What tools are being used? How is progress measured? How are risks recorded, identified and mitigated?
Progress tracked on GitHub: github.com/iris-hep/project-milestones/
Are the metrics being used to measure success clearly defined? How well do metrics in your area measure progress, success or impact? Where can the metrics be improved or refined to better measure progress, success or impact?
Metrics listed on next page for reference.
Metrics
M.2.1: Number of specifications developed
M.2.2: Number of implementations for corresponding specifications
M.2.3: Throughput and latency metrics for analysis systems using SSL testbed
M.2.4: List of experiments using CAP and number of analyses stored in CAP
M.2.5: Number of results / papers making use of CAP/REANA
M.2.6: GitHub stars, forks, watch, contributor statistics
e.g., uproot & awkward
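For M.2.6, these statistics can be pulled from the public GitHub REST API; a minimal sketch (repository names are illustrative, and unauthenticated requests are rate-limited):

import requests

# Stars, forks, and watchers for metric M.2.6 (subscribers_count = watchers).
for repo in ["scikit-hep/uproot", "scikit-hep/awkward-array"]:
    info = requests.get(f"https://api.github.com/repos/{repo}").json()
    print(repo, info["stargazers_count"], info["forks_count"], info["subscribers_count"])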
Backup
Prior to IRIS-HEP
(Diagram: Bulk Data Processing, Reconstruction Algorithms, Analysis Code)
Analysis code in HEP is often more free-form, with less organized development:
IRIS-HEP as an Institute
(Diagram: Analysis Systems; ad hoc analysis code)
Analysis Systems strategies:
IRIS-HEP Analysis Systems
Analysis Systems projects span all stages of end-user analysis.
Scikit-HEP
A broad community project with heavy IRIS-HEP involvement.
A coherent ecosystem
One of our analysis use cases involves a vertical slice from ServiceX to final limits for a real-world ATLAS Higgs analysis. See Alex Held’s poster.
Tools: ServiceX, yadage, func_adl, formulate, coffea.
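A sketch of how these Scikit-HEP pieces compose in practice (the file name and branch names are hypothetical; assumes the uproot 4 / awkward 1 era APIs):

import uproot
import awkward as ak
import boost_histogram as bh

# Read a jagged branch directly from a ROOT file into an awkward array.
events = uproot.open("events.root")["Events"].arrays(["Jet_pt"], library="ak")

# Array-at-a-time selection with awkward, then histogramming with boost-histogram.
leading_pt = ak.firsts(events["Jet_pt"][events["Jet_pt"] > 30.0])
leading_pt = leading_pt[~ak.is_none(leading_pt)]  # drop events with no selected jet
hist = bh.Histogram(bh.axis.Regular(50, 0.0, 500.0))
hist.fill(ak.to_numpy(leading_pt))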
The Future
IRIS-HEP Focus Areas
Slides from Johann Brehmer’s Keynote talk at ACAT on Constraining Effective Field Theories with Machine Learning
Tight integration of
Major Activities
Connections to DOMA & SSL
ServiceX is part of DOMA’s iDDS
(Diagram labels: ServiceX, Data Lake, Cached, Distribution, Analysis Systems)
ServiceX is being prototyped using IRIS-HEP’s Scalable Systems Lab.
B. Galewsky, R. Gardner, L. Gray, M. Neubauer, J. Pivarski, M. Proffitt, I. Vukotic, G. Watts, M. Weinberg
Milestones and Deliverables
Community Building
“I just wanted to express my personal awe to you and your team working so hard on a bunch of wonderful projects. The talks delivered by Johann, Lukas and Gunes were excellent! In my personal opinion it was the best part of ACAT conference.” - Andrey Ustyuzhanin (LHCb & Yandex School of Data Analysis)
Training
Presentations & Publications
108 presentations and 22 publications thus far.
Value of IRIS-HEP as an Institute
IRIS-HEP as a tugboat:
Value of IRIS-HEP as an Institute
IRIS-HEP as a lighthouse:
Highlight
ROOT: 10+ hours
pyhf: < 30 minutes
Highlight
Featured on CERN homepage
Highlight
IRIS-HEP Focus Areas
Finalist for best paper award at SC19 (Supercomputing).
Beyond HEP
Machine learning & statistical techniques originally developed for the LHC are now being used to probe dark matter with strong gravitational lensing.
arXiv:1909.02005, published in The Astrophysical Journal.
Beyond HEP
Collaboration with DeepMind on AI techniques inspired by physics
Relevant for:
Protein figure from Boomsma [https://doi.org/10.1073/pnas.0801715105]
Beyond HEP
See Sebastian Macaluso’s poster highlighting exploratory machine learning projects.
https://arxiv.org/abs/2002.11661
Beyond HEP
Collaboration with DeepMind on AI techniques inspired by physics
Models that incorporate physics generalize to unseen systems (zero-shot learning)
Beyond HEP
Collaborating with CS & astrophysics on computing models and tools to use HTC and HPC together, published as:
E. A. Huerta, R. Haas, S. Jha, M. Neubauer, D. S. Katz, "Supporting High-Performance and High-Throughput Computing for Experimental Science," Computing and Software for Big Science 3:5, 2019. doi:10.1007/s41781-019-0022-7
Components involved in starting a Shifter job on Blue Waters (HPC). Jobs are submitted to the workload manager on Blue Waters’ login nodes, which launches them on compute nodes. When a job requests containers, the workload manager first uses the Shifter runtime environment to pull an up-to-date copy of the container image from Docker Hub. This image is repackaged as a user-defined image, then pre-mounted (prologue) by the jobs on the compute nodes and unloaded post-job (epilogue).
Left: a period during which 35 million ATLAS events were processed using 300 Blue Waters nodes. Utilization during this period averaged 81%, typical for Blue Waters. Right: the backlog of queued jobs for the same period in requested nodes, with colors indicating user accounts. During this period, the queued workload never dropped below 80,000 nodes, i.e., four times the number of nodes in Blue Waters. The red and blue curves below the horizontal axis are nodes available for work scavenging during this period.