1 of 9

Scope & Activities

Teng Jian Khoo (HU-Berlin/Innsbruck - ATLAS)

Paul Laycock (BNL - Belle II, DUNE)

Andrea Rizzi (INFN Pisa - CMS)

2 of 9

Outline

  • Mandate & Goals
  • Recent events (pre-COVID)
  • HL-LHC computing review
    • Near- & medium-term targets
  • Outlook

3 of 9

DAWG Goals

Aims:

  • Reduce monotonous and laborious tasks in physics analysis
  • Optimise human and computing costs of publishing physics results

Priorities:

  • Define problems by identifying the needs of physicists and the requirements of analyses across experiments via direct consultation
  • Find solutions by connecting physics analysis experts and technological innovators within and beyond the HEP community

4 of 9

Highlight event

Pre-CHEP ’19 WLCG/HSF Workshop

Analysis Systems: From Future Facilities to Final Plots

“Brain-writing” exercises addressing:

  • Future analysis models
  • Facility requirements for high-throughput analysis
  • Growing integration of Machine Learning

Continued active engagement with WLCG is critical

  • Following up with DAWG/DOMA meetings on analysis facilities

5 of 9

6 of 9

HL-LHC Computing Review

LHCC commissioned review by HSF: “Common Tools and Community Software”

Analysis highlights:

  • Analysis data formats -- centralised production, disk costs, data access patterns, systematic uncertainties
  • Metadata handling -- bookkeeping analysed data (does processing 100% of data scale to HL-LHC?), validity & retrieval of calibrations, cross-sections, …
  • Quality assurance -- code testing for accuracy & efficiency
  • Analysis interfaces -- declarative configuration, transparency, preservation
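
One way to picture the "validity & retrieval of calibrations" item is an interval-of-validity lookup keyed on run number. The sketch below is only an illustration under stated assumptions: the `Calibration` and `CalibrationStore` names, the run numbers, and the `jes` payload key are all hypothetical and do not correspond to any experiment's conditions database.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Calibration:
    """Hypothetical calibration record, valid from first_run until the next record."""
    first_run: int  # first run for which this payload is valid (inclusive)
    payload: dict   # calibration constants, illustrative only


class CalibrationStore:
    """Hypothetical store that returns the calibration valid for a given run."""

    def __init__(self, calibrations):
        # Sort once so that each lookup is a binary search.
        self._cals = sorted(calibrations, key=lambda c: c.first_run)
        self._starts = [c.first_run for c in self._cals]

    def lookup(self, run):
        # Index of the last record whose first_run is <= run.
        idx = bisect_right(self._starts, run) - 1
        if idx < 0:
            raise KeyError(f"no calibration valid for run {run}")
        return self._cals[idx].payload


# Illustrative usage with made-up run numbers and constants.
store = CalibrationStore([
    Calibration(first_run=200, payload={"jes": 1.01}),
    Calibration(first_run=100, payload={"jes": 1.02}),
])
calib_for_run_150 = store.lookup(150)
```

Raising an error for runs that precede every record, rather than silently falling back to a default, is one possible design choice for making missing calibrations visible early.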

7 of 9

Development targets

[Diagram relating Trends, Topics and Targets; labels recovered from the figure:]

  • Declarative analysis models
  • Analysis code quality
  • Reproducibility & preservation
  • Data formats for analysis
  • Growing use of ML
  • Analysis facility design
  • More data, higher precision
  • Efficient use of resources
  • Fluid research workforce
  • Analysis metadata

8 of 9

Specific questions

Standardised analysis formats à la CMS nano-AOD, ATLAS DAOD_PHYSLITE
  -- Production models? Adaptability cf. the “10% analyses”

Analysis interfaces, description, preservation
  -- Is a Domain-Specific Language a practical solution?
  -- Or declarative layers (high-level workflow, mid-level tasks, low-level cuts)?
  -- How to store/access metadata uniformly and robustly?
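
As a rough illustration of what a declarative "low-level cuts" layer could look like, the sketch below keeps the selection as plain data and applies it with a small generic interpreter. Everything here is an assumption made for illustration: the `SELECTION` layout, the variable names (`n_muons`, `lead_muon_pt`, `met`), the thresholds, and the `passes` and `cutflow` helpers are hypothetical and not taken from any existing framework.

```python
import operator

# Hypothetical declarative selection: data only, no analysis logic.
SELECTION = {
    "name": "dimuon_signal_region",  # illustrative name
    "cuts": [
        {"var": "n_muons", "op": ">=", "value": 2},
        {"var": "lead_muon_pt", "op": ">", "value": 25.0},
        {"var": "met", "op": "<", "value": 50.0},
    ],
}

# Map the comparison strings allowed in the description onto functions.
OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}


def passes(event, selection):
    """Return True if the event satisfies every cut in the selection."""
    return all(
        OPS[cut["op"]](event[cut["var"]], cut["value"])
        for cut in selection["cuts"]
    )


def cutflow(events, selection):
    """Return (label, count) pairs for the events surviving each successive cut."""
    surviving = list(events)
    counts = [("all", len(surviving))]
    for cut in selection["cuts"]:
        surviving = [
            e for e in surviving
            if OPS[cut["op"]](e[cut["var"]], cut["value"])
        ]
        counts.append((f'{cut["var"]} {cut["op"]} {cut["value"]}', len(surviving)))
    return counts


# Made-up toy events for illustration.
events = [
    {"n_muons": 2, "lead_muon_pt": 40.0, "met": 20.0},
    {"n_muons": 1, "lead_muon_pt": 60.0, "met": 10.0},
    {"n_muons": 3, "lead_muon_pt": 20.0, "met": 5.0},
    {"n_muons": 2, "lead_muon_pt": 30.0, "met": 80.0},
]
flow = cutflow(events, SELECTION)
```

Because the selection is data rather than code, the same description could in principle be stored alongside the results for preservation, or handed to a different execution backend; whether that scales to real analyses is exactly the open question raised above.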

Analysis & the grid
  -- What do we need at computing facilities (GPU, fast network vs disk, …)?
  -- Do we need specialised facilities for analysis? How will job distribution work?
  -- How to improve validation & performance monitoring of user code?

9 of 9

Outlook

Analysis software should be an enabler, not an obstacle
  -- Design such that good practices are the default

Build capabilities for growing sophistication without exploding costs
  -- Need effective interfaces to ML, accelerators
  -- Must provide equitable access to infrastructure

Close connections to software training & documentation
  -- “Higher level” languages for analysis operations could help

Quis custodiet analysis metadata?
  -- Do we need an event/body to steer? Key stakeholders?
