1 of 23

Porting of CMS electron seeding to GPUs (UC2.2.4)

Leonardo Cristella (INFN Bari)

ICSC Spoke 2 meeting 12th Sep 2024

2 of 23

Outline

  1. Parallelization strategy for electron seeding algorithm
    1. Port algorithm to alpaka
      1. Access CMS magnetic field in alpaka
  2. Infrastructure
    • ReCaS-Bari (T2_IT_Bari)
    • CINECA HPC “Leonardo”


3 of 23

Problem statement

  1. e/gamma seeding accounts for ~5-10% of the total CPU time at HLT and still runs on CPU
      • doublet/triplet seed building (~⅔)
      • seed matching (~⅓)
  2. What it does
    1. Creates a collection of doublet and triplet seeds from hits in the pixel tracker
    2. Loops over the ECAL SuperClusters (SCs)
      • For each SC, loops over the entire seed collection
      • Performs a “matching” of the SC with potential seeds that are compatible with a hypothesised electron trajectory
      • The reduced collection of seeds is then passed to GSF tracking, finally yielding the reconstructed electrons (a schematic sketch of this loop structure follows)
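A minimal CPU-side sketch of this loop structure, using hypothetical simplified types and a hypothetical isCompatible() helper rather than the actual CMSSW interfaces:

#include <vector>

// Hypothetical, simplified stand-ins for the CMSSW types.
struct SuperCluster { float energy, eta, phi; };
struct Seed { /* pixel doublet/triplet hits */ };

// Hypothetical compatibility test between an SC and a seed, based on a
// hypothesised electron trajectory from the beamspot through the pixel hits.
bool isCompatible(const SuperCluster& sc, const Seed& seed);

// For every ECAL SuperCluster, scan the whole seed collection and keep only
// the seeds compatible with the hypothesised electron trajectory.
std::vector<Seed> matchSeeds(const std::vector<SuperCluster>& superClusters,
                             const std::vector<Seed>& seeds) {
  std::vector<Seed> matched;
  for (const auto& sc : superClusters) {   // loop over ECAL SCs
    for (const auto& seed : seeds) {       // loop over the entire seed collection
      if (isCompatible(sc, seed))
        matched.push_back(seed);           // reduced collection passed on to GSF tracking
    }
  }
  return matched;
}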


4 of 23

Approach

  1. Goal: speed up the seeding by creating a new GPU version of the algorithm
  2. Procedure
    1. Remove the dedicated seed building and use a modified version of Patatrack pixel tracks where doublet info is available
    2. Change the matching algorithm to be GPU compliant and port it to alpaka
  3. Task splitting
    • Replace the dedicated electron seed building sequence with Patatrack pixel tracks as seeds, recovering doublet info
    • Define a new parallel seeding algorithm, ensuring the electron reconstruction efficiency is consistent with the previous version
    • Port the new parallel algorithm implementation to alpaka
    • Add parallel propagation of electron trajectories towards the barrel, end-caps and transition region
      1. Requires GPU-friendly access to magnetic field information


5 of 23

3.1 - Patatrack pixel tracks as seeds

  1. Replace the dedicated electron seed building sequence with Patatrack pixel tracks as seeds, recovering doublet info
    1. Implement the “storing” of the doublet collection in the pixel track SoA
  2. The first version of the new parallel algorithm had a lower electron reconstruction efficiency w.r.t. the CPU version
    • Tested a more refined implementation of the inside-out matching algorithm
    • Instead of one step from the first seed hit to the ECAL, split the propagation into multiple steps (figure below, sketched in code after this list)
      1. Propagate to each subsequent hit of the seed and then to the ECAL
      2. Apply quality cuts at each step
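A schematic sketch of the multi-step matching under these assumptions; the State/Hit types and the propagateToLayer(), propagateToEcal() and passesQualityCut() helpers are hypothetical placeholders, not the actual producer code:

#include <vector>

// Hypothetical types and helpers, declared only for illustration.
struct State { /* position, momentum, charge hypothesis */ };
struct Hit   { /* global position, layer id */ };

State propagateToLayer(const State& s, const Hit& target);  // hypothetical
State propagateToEcal(const State& s);                      // hypothetical
bool  passesQualityCut(const State& s, const Hit& hit);     // hypothetical

// Instead of a single propagation from the first seed hit to the ECAL,
// propagate hit-by-hit and apply a quality cut after each step.
bool matchSeedMultiStep(State state, const std::vector<Hit>& seedHits) {
  for (const Hit& hit : seedHits) {
    state = propagateToLayer(state, hit);  // first/second/third propagation ...
    if (!passesQualityCut(state, hit))     // ... and check
      return false;                        // reject the seed early
  }
  state = propagateToEcal(state);          // final extrapolation to the ECAL surface
  return true;                             // SC compatibility evaluated downstream
}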


[Figure: multi-step matching — first, second and third propagation, each followed by a check]


6 of 23

3.2 - Ensure the electron reconstruction efficiency

  • The “inside-out” pixel matching algorithm was optimized by switching to a multi-step propagation towards the ECAL
    • Aim: recover electron reconstruction efficiency and improve the overall CPU timing
    • However, it did not show any improvement in electron reconstruction efficiency


7 of 23

3.2 - Change seeding logic

  • Improve the initial direction vector based on the ECAL SC
  • First create a trajectory based on the ECAL SC energy, charge hypothesis and beamspot position
  • Check with loose criteria whether the first hit of the seed is near the intersection of the trajectory with the layer of the hit
  • If matched, correct the “beamspot” by propagating the trajectory and defining a z position
  • Propagate a new track to the second-hit layer, based on the vertex and a free trajectory defined by the ECAL SC energy, the charge hypothesis, and the vertex and first-hit positions
  • Check the intersection of the trajectory with the layer of the second hit (see the sketch after this list)
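A compressed sketch of this logic; every type and helper below (Point3D, makeTrajectory, intersectLayer, looseMatch, correctedZ) is a hypothetical placeholder that only mirrors the steps listed above, not the actual producer code:

// Hypothetical simplified types.
struct Point3D { float x, y, z; };
struct Hit { Point3D pos; int layer; };
struct Trajectory { /* helix from energy, charge hypothesis and origin/direction */ };

// Hypothetical helpers, declared only for illustration.
Trajectory makeTrajectory(float scEnergy, int charge, const Point3D& origin);
Trajectory makeTrajectory(float scEnergy, int charge, const Point3D& vertex, const Point3D& firstHit);
Point3D    intersectLayer(const Trajectory& t, int layer);
bool       looseMatch(const Point3D& predicted, const Point3D& hit);
float      correctedZ(const Trajectory& t, const Hit& firstHit);

// True if the seed is compatible with the SC under a given charge hypothesis.
bool matchSeed(float scEnergy, int charge, const Point3D& beamSpot,
               const Hit& firstHit, const Hit& secondHit) {
  // 1) trajectory from SC energy, charge hypothesis and beamspot position
  Trajectory t0 = makeTrajectory(scEnergy, charge, beamSpot);

  // 2) loose check: is the first hit near the trajectory/layer intersection?
  if (!looseMatch(intersectLayer(t0, firstHit.layer), firstHit.pos))
    return false;

  // 3) correct the "beamspot": propagate the trajectory and define a z position
  Point3D vertex{beamSpot.x, beamSpot.y, correctedZ(t0, firstHit)};

  // 4) free trajectory from SC energy/charge and the vertex and first-hit positions
  Trajectory t1 = makeTrajectory(scEnergy, charge, vertex, firstHit.pos);

  // 5) check the intersection with the second-hit layer
  return looseMatch(intersectLayer(t1, secondHit.layer), secondHit.pos);
}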


8 of 23

  • Similar efficiency to the legacy algorithm
  • ~twice the fake rate
  • Plan to add one more matching step
  • Optimize cuts based on the legacy configuration

  • Future plans
    • Will use this approach
    • Swap this logic into the current Alpaka producer


[Figure: electron reconstruction efficiency (matched with gen / all reconstructed) as a function of GSF electron pT [GeV] and GSF electron η]


9 of 23

3.3 - Porting to alpaka (status)

Porting of the initial parallel algorithm implementation to alpaka (CMSSW branch)

  • Alpaka producer for the new seeding algorithm (skeleton in place)
  • Create & fill the SC SoA and consume it in the Alpaka producer (validated; a minimal SoA sketch follows this list)
  • GPU-compliant parabolic approximation of the magnetic field (completed)
  • SoA for electron seeds (not validated)
  • Alpaka kernels for propagation towards the barrel (validated)
  • Alpaka kernels for propagation towards the endcaps (not validated)
  • Compiles for both CPU and NVIDIA GPU backends
    • First manual validation performed
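A minimal sketch of what such an SoA could look like, assuming the CMSSW SoATemplate macros (GENERATE_SOA_LAYOUT, SOA_COLUMN, SOA_SCALAR); the column names are hypothetical and the actual SC SoA in the branch may differ:

#include "DataFormats/SoATemplate/interface/SoALayout.h"

// Hypothetical SuperCluster SoA layout: one row per SC plus a scalar counter.
GENERATE_SOA_LAYOUT(SuperClusterSoALayout,
                    SOA_COLUMN(float, energy),   // SC energy
                    SOA_COLUMN(float, eta),      // SC pseudorapidity
                    SOA_COLUMN(float, phi),      // SC azimuthal angle
                    SOA_SCALAR(int, nSC))        // number of SuperClusters

using SuperClusterSoA = SuperClusterSoALayout<>;
using SuperClusterSoAView = SuperClusterSoA::View;           // read/write access in kernels
using SuperClusterSoAConstView = SuperClusterSoA::ConstView; // read-only access in kernels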


10 of 23

3.4 - Trajectory propagation on GPU

  • Parallel propagator towards the barrel
    • Created and tested -> porting it to alpaka
  • Parallel propagator towards the end-caps and transition region
    • First version of a GPU-compatible propagator available
      • Implemented the new propagator in a modified electron seed “NHit” producer
      • Currently testing the new propagator by comparing the distributions it produces with those obtained using the old propagator
    • It requires GPU-friendly access to magnetic field information
      • Tested the usage of the parabolic approximation outside the default “validity region”
        • Very similar results for our use case (ECAL boundary)
      • Can proceed using only the parabolic approximation in alpaka


11 of 23

3.4.1 - Porting of magnetic field - initial status

CMSSW provides a parametric approximation of the magnetic field limited to the tracker region:

  • parallel to the z-axis: GlobalVector(0, 0, Bz)
    • with a magnitude evaluated using a parabolic approximation at the desired point Bz(point)
  • The validity region of the approximation is defined as point.perp2() < 13225.f && fabs(point.z()) < 280.f
    • Outside of this region, the approximation simply returns GlobalVector(0, 0, 0), as in the sketch below
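A standalone sketch of this behaviour under simplified, hypothetical types (GlobalPointLite/GlobalVectorLite and the parabolicBz() parametrization are placeholders, not the actual CMSSW classes); only the validity condition is taken from the text above:

#include <cmath>

// Hypothetical lightweight stand-ins for the CMSSW GlobalPoint/GlobalVector.
struct GlobalPointLite {
  float x, y, z;
  float perp2() const { return x * x + y * y; }  // squared transverse radius
};
struct GlobalVectorLite { float x, y, z; };

// Hypothetical parabolic parametrization of Bz; the real coefficients live in CMSSW.
float parabolicBz(const GlobalPointLite& p);

GlobalVectorLite inTesla(const GlobalPointLite& point) {
  // Validity region: r < 115 cm (115^2 = 13225) and |z| < 280 cm, i.e. the tracker volume.
  if (point.perp2() < 13225.f && std::fabs(point.z) < 280.f)
    return {0.f, 0.f, parabolicBz(point)};  // field parallel to the z-axis
  return {0.f, 0.f, 0.f};                   // outside: the approximation returns zero
}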


12 of 23

3.4.1 - Porting of magnetic field - solution

The propagation of object trajectories requires access to the magnetic field value at the ECAL boundary.

  • In alpaka, a parabolic approximation extended to the ECAL boundary was defined and used to propagate trajectories.
  • On CPU, it was verified that the reconstructed GSF electrons (multiplicity and kinematics) do not change whether the default magnetic field or the extended approximation is used.


13 of 23

Pull Request merged: #44360


14 of 23

Next plans

  • Validate all components
  • Open Pull Request(s) to CMSSW
  • Large scale physics validation


15 of 23

Infrastructure-related work

  • ReCaS-Bari (T2_IT_Bari)

  • CINECA HPC “Leonardo”


16 of 23

CMS Grid jobs requiring GPUs

Since early 2023, the access to and use of GPUs by CMS Grid jobs has been controlled by two main attributes in the Condor Job Description Language:

  • RequiresGPU: defines whether or not GPU pilots will be requested
  • RequestGPUs: number of GPU resources to be matched in the pool


17 of 23

Enable T2_IT_Bari to run Grid jobs requiring GPUs

Additions to modules/htcondor/templates/htc9_99-local.erb:


18 of 23

GPU Pool size per resource provider


19 of 23

CINECA HPC “Leonardo”

  • Some Leonardo nodes have been configured with access to cvmfs


20 of 23

Backup


21 of 23


Load the parameters for the parabolic approximation

Compute the approximation at point vec, without the region limitation (sketched below)
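A hedged sketch of what the alpaka device-side evaluation could look like; the parameter struct, the coefficient names and the exact polynomial form are placeholders (only the ALPAKA_FN_ACC macro is actual alpaka), with the real parameters being the ones loaded as described above:

#include <alpaka/alpaka.hpp>

// Hypothetical container for the parameters of the parabolic approximation.
struct ParabolicFieldParams {
  float b0;      // nominal field at the origin (placeholder)
  float b1;      // z^2 coefficient of the parabolic term (placeholder)
  float c1, c2;  // radial correction coefficients (placeholder)
};

// Device-callable evaluation of Bz at (r^2, z), with no validity-region check,
// so it can also be used at the ECAL boundary.
ALPAKA_FN_ACC inline float parabolicBz(const ParabolicFieldParams& par, float perp2, float z) {
  // Parabolic dependence on z with a small radial correction (hypothetical form).
  return par.b0 * (par.c1 + par.c2 * perp2) * (1.f - par.b1 * z * z);
}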


22 of 23

Infrastructure-related work

  • HEP experiments must be prepared to run a substantial part of their processing tasks at HPCs
  • The CMS internal study (ECoM2x, 2020) of the computing model and resource needs, looking at LHC Run 3 and especially at the HL-LHC phase, includes in its recommendations:
    • “CMS should continue to aggressively expand the resources accessible to it for production processing at facilities beyond those dedicated to the LHC”
    • CMS should strive towards using HPC resources effectively
    • Use on-board accelerators as much as possible during reconstruction
    • “Work towards enabling CMS software on CPUs, GPUs, FPGAs and TPUs as primary targets for its software stack”


23 of 23

Parallel activity

  • Investigate feasibility of porting ALICE ITS clustering to GPU (alpaka)
