Working Together

Anecdote

After a presentation on “Breakthroughs in Detector Technology”, Ian Shipsey (Oxford) was asked about the role of software. “Software is the soul of the detector,” he replied poetically, emphasizing that great science needs great software. He added that we need to work together, on a global scale and with other fields, to achieve this goal.

EIC Software Meeting: Lessons Learned II, March 23, 2022.

Thank you very much for working together with the EICUG SWG!

[Organization chart: a Common Software Effort connects the EIC User Group (Steering Committee, Software Working Group) through liaisons with the proto-collaborations ATHENA, CORE*, and ECCE and their Software & Computing Working Groups.]

* CORE adapts existing software for its needs and has a far smaller software effort than the other proto-collaborations.

EIC Software: Lessons Learned (https://indico.bnl.gov/event/14319/)

Lessons Learned from ATHENA (Sylvester Joosten, Wouter Deconinck)

Lessons Learned from ECCE (David Lawrence, Jin Huang, and Bill Li)

EIC software is at a very early stage of its life cycle.

Work with the EIC community

  • Both ATHENA and ECCE have been very successful in large-scale, detailed full detector simulations:
    • ATHENA have successfully developed a modular software stack based on common NHEP software.
    • ECCE have successfully leveraged familiar software and will reevaluate their software stack and rebuild parts of it.
  • EIC collaborations will determine for themselves what they do for software, but that will likely include common software.
    • This will include the software stacks from ATHENA and ECCE.

Common Software

  • Define requirements for EIC Software and common software projects:
    • Software needs of the EIC addressed in Software EoI.
    • Evolve with the EIC community and the EIC project. Right now, after the review of the detector collaboration proposals and during the formation of the EIC collaboration(s), is the ideal time for doing so.
  • Work together on common software projects based on these requirements.
    • Avoid duplication of the effort, e.g., workflows for distributed computing.
    • Team up on challenges, e.g., running on heterogeneous resources.
  • Continue to build an EIC Software community with close connections and collaborations to the experts in NHEP:
    • ATHENA made a very deliberate choice to avoid “not-invented-here” syndrome and now sits at the table with HEP developers.


Expression of Interest for Software

Common Projects

  • Software Tools for Simulations and Reconstruction
    • Monte Carlo Event Generators
    • Detector Simulations
    • Reconstruction
    • Validation
  • Middleware and Preservation
    • Workflows
    • Data and Analysis Preservation
  • Interaction with the Software Tools
    • Explore User-Centered Design
    • Discoverable Software
    • Data Model

Future Technologies

  • Artificial Intelligence
  • Heterogeneous computing
  • New languages and tools
  • Collaborative software

29 institutions

Monte Carlo Event Generators

We have successfully established a HEP standard, HepMC3, and have started using Rivet for MCEG validation for the EIC.

We understand how to handle accelerator and beam effects in (and after) the event generation.

We have a vibrant community (other generators: DEMPGen, Djangoh, elSpectro, TopHEG) and are part of a community white paper on Event Generators for HEP Experiments, with EIC as part of the cross-cutting aspects.

Detector Simulations

  • Detailed detector simulations based on Geant4:
    • Various versions being used: ATHENA 10.7 (compatible with 11.0) and ECCE 10.6
    • EIC physics list from Project eAST, validation being started.
    • Lack of test beam data for the validation of the detector simulations.
    • Challenge: Reconstruction (!) of Cherenkov detectors (Geant4 describes Cherenkov radiation and optical physics very well).

  • Work focused on successful geometry integration:
    • YR: Full simulation of detector components. → ATHENA and ECCE: Full simulation of detector concepts.
  • Geometry description and exchange:
    • ATHENA: DD4hep. Geometry browser in the cloud (jsROOT).
    • ECCE: Pure G4 geometry for simulation → TGeo for reconstruction.

  • Accelerate simulations:
    • eAST: full and fast simulations in Geant4. Sub-event level parallelism for heterogeneous computing.
    • Open question: What are the most promising applications for AI/ML?

  • ECCE:
    • Detector design optimization using AI/ML.


Reconstruction

  • Enormous progress in reconstruction:
    • Mainly for the central detector, though.
    • A lot of work is still needed towards 4D reconstruction for the central detector and the far-forward detectors, in particular for PID.

  • Plethora of reconstruction algorithms being used:
    • Including some very old ones, e.g., IRT, that were superseded long ago.
    • But also very new ones based on AI/ML.

  • ATHENA and ECCE:
    • A lot of progress with ACTS, including EIC-specific contributions from ATHENA.


Distributed Workflows and Data Management

  • ATHENA successfully deployed automated workflows on eicweb:
    • Workflows based on either Slurm (Compute Canada, JLab) or HTCondor (OSG).

  • ECCE successfully used a git-based production system on batch systems:
    • ECCE notes that they could have used fully developed solutions instead, e.g., PanDA.

  • Both ATHENA and ECCE successfully used distributed computing resources.
  • Both ATHENA and ECCE worked with the host labs on computing resources.

  • Scientific data management has been a time sink during simulation campaigns:
    • Rucio is increasingly widely used for data management in NHEP and is drawing interest (cf. last round table).
    • Rucio has been discussed within EIC community. Requires support from host labs.
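The choice between the two batch backends named above can be illustrated with a minimal sketch that renders a submit file for either one from the same job description. This is not the ATHENA or ECCE production code: the function names, the job name `eic-sim`, and the script `run_simulation.sh` are hypothetical, and only standard Slurm job-array directives and HTCondor submit-description commands are used.

```python
def slurm_script(executable: str, n_jobs: int, job_name: str = "eic-sim") -> str:
    """Render a Slurm batch script that runs n_jobs tasks as one job array."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --array=0-{n_jobs - 1}",       # one array task per job
        f"#SBATCH --output={job_name}_%a.out",   # %a expands to the array index
        f'{executable} "$SLURM_ARRAY_TASK_ID"',  # pass the task index to the job
    ]
    return "\n".join(lines) + "\n"


def htcondor_submit(executable: str, n_jobs: int, job_name: str = "eic-sim") -> str:
    """Render an HTCondor submit description that queues n_jobs processes."""
    lines = [
        f"executable = {executable}",
        "arguments  = $(Process)",               # pass the process index to the job
        f"output     = {job_name}_$(Process).out",
        f"error      = {job_name}_$(Process).err",
        f"log        = {job_name}.log",
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"


def render(backend: str, executable: str, n_jobs: int) -> str:
    """Pick the submit-file format for the given batch backend."""
    backends = {"slurm": slurm_script, "htcondor": htcondor_submit}
    if backend not in backends:
        raise ValueError(f"unknown backend: {backend!r}")
    return backends[backend](executable, n_jobs)


if __name__ == "__main__":
    # Hypothetical job script; the same description is rendered for both backends.
    print(render("slurm", "./run_simulation.sh", 10))
    print(render("htcondor", "./run_simulation.sh", 10))
```

Keeping the job description separate from the backend-specific rendering is one way to avoid duplicating workflow logic across sites.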


Data and Analysis Preservation

  • The ATHENA and ECCE workflows allow results to be reproduced:
    • Here, a key aspect is containerization.

  • It pays to start early:
    • E.g., create websites and documentation repositories for the long haul.
    • There has been major progress on that by ATHENA and ECCE.
    • Static site generation on standard platforms like GitHub (e.g. https://eic.github.io/) is a robust, relatively simple approach that also holds up over time.

  • Beyond that, the EIC community needs to develop a strategy for data and analysis preservation:
    • Data and analysis preservation can only work with the majority of the community on board. We need to work with the user community.

  • CERN, DPHEP, and collaborators have developed a suite of DAP tools that can contribute to a DAP strategy:
    • HEPData, OpenData, Zenodo/InvenioRDM, REANA, InspireHEP, …


User-Centered Design: Listen to Users, and/then Develop Software

  • State of Software Survey: Collected information on software tools and practices during the Yellow Report Initiative.
  • As part of the State of Software Survey, we asked for volunteers for focus-group discussions:
    • Students (2f, 2m), Junior Postdocs (2f, 3m), Senior Postdocs (2f, 3m), Professors (5m), Staff Scientists (2f, 3m), Industry (2f, 2m)
  • Results from the six focus-group discussions:
    • Extremely valuable feedback, documented many suggestions and ideas.
    • Developed user archetypes with the Communication Office at Jefferson Lab and a UX Design Consultant.

  • We have now repeated the Software Survey, after the detector collaboration proposals:
    • A regular software census will be essential to better understand and quantify software usage throughout the EIC community. During the next survey, we will also ask for feedback on the user archetypes.

User Archetypes: input for software developers on which users they are writing software for:

  • Software is not my strong suit.
  • Software as a necessary tool.
  • Software as part of my research.
  • Software is a social activity.
  • Software emperors.

Discoverable Software

  • Both ATHENA and ECCE have set up GitLab and GitHub organizations for their software stacks:
    • (Major?) part of their repositories is available in the GitHub organization for the EIC.

  • ATHENA provided containers on both eicweb and Docker Hub, and Singularity images via CVMFS:
    • A Spack environment handles the software environment.

  • ECCE provided Singularity containers.

  • Both ATHENA and ECCE put an emphasis on the education of their user base:
    • Various tutorials that have been well received.
    • Documentation for users.
    • Documentation for developers, e.g., Doxygen.


Data Model

  • Major progress:
    • Both ATHENA and ECCE developed standardized ROOT files for their simulation campaigns.
    • Working with flat data structures paid off for development of physics analyses and first AI/ML algorithms.
    • ATHENA data model in eicd inspired by key4hep/EDM4hep (a generic event data model that has been developed for future HEP collider experiments):
      • now ATHENA uses key4hep/EDM4hep directly

  • Promising:
    • ATHENA formalized the creation of flat data models using key4hep/EDM4hep.
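To illustrate what a flat data structure means here, the following minimal sketch turns nested per-event records into flat columns with an explicit event index, the layout that is convenient for physics analyses and AI/ML tools. It does not use the eicd or EDM4hep APIs: the function `flatten`, the collection name `particles`, and the fields `pdg` and `energy` are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List

# A nested event: collection name -> list of objects (each a dict of fields).
Event = Dict[str, List[Dict[str, Any]]]


def flatten(events: List[Event], collection: str, fields: List[str]) -> Dict[str, List[Any]]:
    """Flatten one collection of nested events into equal-length columns.

    The extra "event" column records which event each row came from, so the
    per-event grouping can be recovered from the flat table. The field list
    is assumed not to contain a field that is itself called "event".
    """
    columns: Dict[str, List[Any]] = {"event": []}
    for name in fields:
        columns[name] = []
    for index, event in enumerate(events):
        for obj in event.get(collection, []):
            columns["event"].append(index)
            for name in fields:
                columns[name].append(obj[name])
    return columns


# Hypothetical toy input: two events with two and one particles.
events = [
    {"particles": [{"pdg": 11, "energy": 9.8}, {"pdg": 2212, "energy": 41.0}]},
    {"particles": [{"pdg": 211, "energy": 3.2}]},
]

flat = flatten(events, "particles", ["pdg", "energy"])
print(flat)
```

Each column is a plain list of the same length, so the result maps directly onto columnar file formats and array libraries.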


Future Technologies

  • Artificial Intelligence

  • Heterogeneous computing

  • New languages and tools
    • Both ATHENA and ECCE successfully deploy continuous integration.

  • Collaborative software


Building a Software Community

A message from “Future Trends in NP Computing” to bring to the EIC community:

  • People are most important, not the software. Set up an organization that creates the right incentives to develop and maintain the software.
  • A theme repeated throughout the day: career support!
  • Another theme in support of developers and their careers: software citations.
  • Common software projects create a pool of highly valuable, valued developers who can carry expertise on a key tool to other experiments and communities (cf. career path).
  • Management support up the hierarchy is important for a successful open-source project:
    • Acceptance of objectives wider than those of the home experiment
    • Recognition of the value of the wider investment
  • Developers need the time and space to develop something new, not something just a little better