1 of 20

Software for PED studies

(About) Compute Resource Needs

7th FCC Physics Workshop 2024

Annecy, France

January 31, 2024

G Ganis, CERN-EP

2 of 20

Outline

  • Quick recap about the problem
  • How it looks for FCC-ee
  • A few remarks

G Ganis, S&C, Compute Resources Needs, 7th FCC Physics Workshop 2024, 31 January 2024

3 of 20

Computing and HEP

  • The cost of IT-related components in HEP experiments, from software development to storage to processing power, has risen steadily over the last decades
    • Starting in the 1990s, when computing became significant
  • Experiment workflows and their components have become more and more complex in many directions
    • Increasing expected data samples, hence increasing needs
    • Changing resource landscape
      • From single-core to multi-core to heterogeneous processing units
      • Evolution of storage systems from local to distributed and cache hierarchies
    • Continuous evolution/optimisation of software and data structures
  • Estimating/predicting needs for computing resources has become crucial to plan for and secure them
  • All this requires modeling

4 of 20

Modeling the resource needs

  • Simple, in principle
    • Define the needs and the activities to satisfy them
      • E.g. ‘MC samples for year YN’ requires the activities
        • Event generation, simulation, reconstruction
    • To each given activity corresponds a set of workflows, each defining a set of resources to be used, e.g.
        • Event generation on GPU producing output on local NAS
        • Simulation on the Grid producing output on the Grid
        • Reconstruction on HLT producing output on EOS
    • Put everything together to get the compute resource needs
      • Taking into account experiment guidelines and policies
  • All this depends on assumptions which may or may not be well defined
  • For the LHC they are reasonably well defined

Inspired by D Lange et al, CMS Computing Resources Modeling
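
The bookkeeping described on this slide (needs, activities, workflows, resources) can be sketched in a few lines. This is a minimal illustration and not the CMS model: the class and function names, the per-event costs and the sample size are all hypothetical placeholders, chosen only to show how the contributions add up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    """One way of running an activity on a given set of resources."""
    activity: str               # e.g. "generation", "simulation", "reconstruction"
    site: str                   # where it runs and writes (label only)
    hs06_sec_per_event: float   # CPU cost per event (hypothetical value)
    output_kb_per_event: float  # output size per event (hypothetical value)


def total_needs(n_events: float, workflows: list[Workflow]) -> dict[str, float]:
    """Sum CPU and storage over all workflows needed to produce one sample."""
    cpu = sum(w.hs06_sec_per_event for w in workflows) * n_events
    disk_kb = sum(w.output_kb_per_event for w in workflows) * n_events
    return {"cpu_hs06_sec": cpu, "storage_tb": disk_kb / 1e9}  # 1 TB = 1e9 kB


# Hypothetical need: "MC samples for year YN" with 1e6 events.
mc_chain = [
    Workflow("generation", "GPU, output on local NAS", 1.0, 10.0),
    Workflow("simulation", "Grid, output on the Grid", 100.0, 1000.0),
    Workflow("reconstruction", "HLT, output on EOS", 20.0, 10.0),
]
needs = total_needs(1e6, mc_chain)
print(needs)
```

Experiment guidelines and policies (number of replicas, retention of intermediate formats, and so on) would enter as extra multiplicative factors on top of these sums.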

5 of 20

Projections of resource needs of HL-LHC

[Figure: projections of HL-LHC resource needs in two panels, Processing power and Storage. Labels: "Better software", "Pledged +20%", "Pledged +10%"; annotation: "6 EB".]

6 of 20

How does it look for FCC-ee?

7 of 20

Modeling resources for a project “in the making”

  • Same reasoning, less precise assumptions, different purpose
    • Monte Carlo only
    • No data processing, calibration, …
    • Several detector concepts
    • Several digitisation, reconstruction options, …
  • Several purposes: e.g.
    • Project the needs for pledged resources “in production”
    • Estimate the potential of a limited set of resources to set priorities in the short term

8 of 20

Workflows to support for FCC

[Diagram: software components and workflows]

  • MDI codes
  • MC Generators
  • Full / Fast Simulation
  • Digitisation / Reconstruction
  • Parametrized simulation
  • Geometry
  • MDI format readers
  • Pileup / MDI overlay
  • Analysis
  • MC Generators Interface
  • Workload and Data Management
  • Software Infrastructure (Repositories, Build/Test/Deploy)

9 of 20

What we have now

  • CERN
    • EOS volumes
      • 500 TB for central productions (157 TB free, still partly used by some CDR files)
      • 200 TB for analysis, starting to be used
    • CPU: 9000 HS06 on lxbatch
    • Integrated in iLCDirac
  • Other sites integrated
    • BARI
    • CNAF
    • Glasgow (storage only)
  • Some GPU resources
    • CERN, EuroHPC

Not yet limited, in general, but starting to reach the boundaries

10 of 20

A first resource analysis for FCC-ee¹

  • Assumptions
    • Nominal luminosities
      • {90, 12, 5, 0.2, 1.5} ab⁻¹ at √s = {91.2, 160, 240, 350, 365} GeV
    • MC reference sample = data sample
      • i.e. 3×10¹² visible Z decays, 10⁸ WW events, 10⁶ ZH events, 10⁶ tt events
  • Event sizes (see next)
    • RAW: 1 - 2 MB/evt
    • AOD: 5 - 10 kB/evt
  • Processing Power
    • CERN OpenStack core = 10-15 HEPSpec06
    • Processing power currently assigned to FCC = 1 Computing Unit = 9000 HEPSpec06
    • CERN OpenStack node used for tests: 16 cores, 32 GB RAM

1. Based on: G Ganis, C Helsens: EPJ Plus (2022) 137:30
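
The orders of magnitude implied by these assumptions can be checked with a few lines of arithmetic. The sketch below is only a cross-check, not the calculation of the reference paper: it uses the Z-pole sample alone (the other samples are smaller by at least four orders of magnitude), assumes decimal units, and all variable names are illustrative.

```python
# Inputs taken from the assumptions listed on this slide.
N_Z = 3e12                      # visible Z decays (dominant sample)
RAW_MB_PER_EVT = (1.0, 2.0)     # RAW event size range, MB/evt
AOD_KB_PER_EVT = (5.0, 10.0)    # AOD event size range, kB/evt
UNIT_HS06 = 9000.0              # one FCC Computing Unit, HEPSpec06
HS06_PER_CORE = (10.0, 15.0)    # CERN OpenStack core, HEPSpec06

# Assumed decimal units: 1 kB = 1e3 B, 1 MB = 1e6 B, 1 PB = 1e15 B, 1 EB = 1e18 B.
raw_eb = tuple(N_Z * size * 1e6 / 1e18 for size in RAW_MB_PER_EVT)
aod_pb = tuple(N_Z * size * 1e3 / 1e15 for size in AOD_KB_PER_EVT)

# Faster cores mean fewer cores per Computing Unit, hence the reversed range.
cores = tuple(UNIT_HS06 / hs06 for hs06 in reversed(HS06_PER_CORE))

print(f"RAW at the Z pole: {raw_eb[0]:.0f}-{raw_eb[1]:.0f} EB")
print(f"AOD at the Z pole: {aod_pb[0]:.0f}-{aod_pb[1]:.0f} PB")
print(f"One Computing Unit: about {cores[0]:.0f}-{cores[1]:.0f} cores")
```

With these inputs the Z-pole sample alone corresponds to 3-6 EB of RAW and 15-30 PB of AOD, while the current Computing Unit is equivalent to roughly 600-900 cores.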

11 of 20

Event size estimates¹

RAW: 1 - 2 MB / evt

AOD: 5 - 10 kB / evt

1. Based on: F. Grancagnolo: Event Rates at Z-pole, talk presented at 4th FCC PED workshop, Nov 2020

12 of 20

Storage requirements

≈HL-LHC

13 of 20

Computing requirements

2000-3000 years!

14 of 20

Computing estimation remarks

  • MC event generation can be challenging for full-scale production
    • Code optimisations and/or filtering techniques might be required
  • Full simulation times have been cross-checked with ATLAS times for similar multiplicity
    • Recent Geant4 is up to a factor of 2 faster for ATLAS
    • Fast simulation techniques (see A Zaborowska talk) might help
    • Other option: selective simulation, i.e. not simulating what is never touched
  • Reconstruction
    • Between 10% (ALEPH) and 30% (Belle) of the simulation time
    • Could really benefit from using heterogeneous resources

15 of 20

Putting all together …

  • The FCC-ee resource needs are of the same order as those of HL-LHC
    • If HL-LHC solves the problems, FCC-ee gets the solution for free
  • Putting all together

  • The resources currently available are a factor O(1000) short for full simulation for the FSR
    • Might be ok for parametrized simulation
  • Numbers should be multiplied by the number of detector variations and analyses
    • Although some optimisation might be possible
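
A back-of-envelope sketch of how a shortfall factor and the multiplication by variants combine. It assumes that the "2000-3000 years" quoted on the computing requirements slide refers to running on the currently assigned resources; the target duration and the number of detector variants are hypothetical inputs, and the function names are illustrative.

```python
def shortfall_factor(years_on_current: float, target_years: float) -> float:
    """Factor by which resources (or code speed) must grow to finish in target_years."""
    return years_on_current / target_years


def years_with_variants(years_single: float, n_detectors: int, n_analyses: int = 1) -> float:
    """Worst case with no sharing: every detector/analysis variant is processed in full."""
    return years_single * n_detectors * n_analyses


# 2000-3000 years is the figure quoted in this talk; a 2.5-year target and
# 3 detector variants are hypothetical values used only for illustration.
for years in (2000.0, 3000.0):
    single = shortfall_factor(years, target_years=2.5)
    all_variants = shortfall_factor(years_with_variants(years, n_detectors=3), target_years=2.5)
    print(f"{years:.0f} years: x{single:.0f} for one variant, x{all_variants:.0f} for three")
```

Any sharing of generated or simulated samples between variants reduces the second factor, which is where the optimisation mentioned above would enter.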

16 of 20

What next: S&C

  • Investigate the possibility of getting more resources, e.g. spare cycles from WLCG, and be ready to use efficiently whatever becomes available
    • See L. Valentini, iLCDirac
  • Increase quality and efficiency of code
    • Long and expensive process, in general.
    • New, faster simulation techniques are promising (see A. Zaborowska talk)
      • Need to understand how to go beyond the full sim training statistics
  • Investigate the possibility of selective/filtered simulations
    • Filters at generation level
    • Simulate only parts of relevance
  • Facilitate the interplay between targeted full simulation and parametrized simulation
    • E.g. automatic/optimal creation of Delphes configurations

17 of 20

What next: Physics Performance, …

  • Investigate (statistical) techniques to go beyond the rule of thumb “MC sample = expected data sample”, reducing the number of events required
    • Could also be useful later, when data are available
  • LHC is testing at similar statistics
    • Use this to identify processes requiring more attention
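
To make the rule of thumb concrete, here is a minimal sketch that assumes a simple counting measurement with independent Poisson fluctuations in data and MC. This is an assumption made for illustration only; real analyses are more involved. Under it, the relative statistical uncertainties add in quadrature, so the MC sample size directly sets how much the data-only uncertainty is inflated.

```python
import math


def inflation_from_mc(ratio_mc_to_data: float) -> float:
    """Factor by which finite MC statistics inflates the data-only statistical
    uncertainty, for N_MC = ratio * N_data and simple Poisson counting:
    sqrt(1/N_data + 1/N_MC) / sqrt(1/N_data) = sqrt(1 + N_data/N_MC)."""
    return math.sqrt(1.0 + 1.0 / ratio_mc_to_data)


for r in (0.1, 1.0, 10.0):
    print(f"N_MC / N_data = {r:>4}: uncertainty x {inflation_from_mc(r):.2f}")
```

In this simplified picture, "MC sample = data sample" costs a factor √2 in statistical uncertainty, and any technique that reaches the same precision with fewer generated events translates directly into saved CPU.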

18 of 20

Final remarks

  • Available computing resources are limited and do not allow full statistics studies
    • This is rather normal, given the investment they would represent
  • Improvements in the code are always possible, but not enough to change the picture completely
  • MC implements our knowledge: we should know what to expect
  • We should try to use all of this to identify the critical areas on which to focus the available resources

19 of 20

Thanks!

20 of 20

From full sim to parametrized sim to phys perf
