Porting Code to GPU Platforms - UC2.2.4

Status Report (M9)

Adriano Di Florio

WP2 Bi-weekly Meeting - 05/11/2024

Introduction

MS9 Report

We are well on track. Summary of the activities reported so far:

  • ALICE data decompression for asynchronous reconstruction (see report) -> KPI2.2.4.2;
  • CMS Electron Seeding on GPU (see report) -> KPI2.2.4.1 + KPI2.2.4.2;
  • CMS Primary Vertex Reconstruction on GPU (see report) -> KPI2.2.4.1;
  • CMS Pixel and Strip Particle Reconstruction on GPU (see report) -> KPI2.2.4.1 + KPI2.2.4.2.

Report for MS9 completed. Expected activities:

  • Algorithm porting activities, leveraging the experience acquired and the best practices defined by WP4.
  • Setup of the physics validation infrastructure and definition of the minimal targets, in terms of physics performance and reproducibility, for the validation campaigns.
  • Training events will also be organized during this period. Their focus will partially shift to sharing and transferring the acquired knowledge among the different experiments. Hands-on sessions and hackathon events will be crucial.

A summary follows.

Algorithms’ porting activities - CA Strips - KPI2.2.4.X

CMS Strip Reconstruction in Alpaka:

    • (already reported in MS7 and MS8).

    • Development is still ongoing; the target for integration is the start of the 2025 data taking. The preliminary performance results are promising, but some features of the algorithm still need to be better understood and tuned.

    • Integration in the CMS High Level Trigger chain is also in progress.

    • Contributes to KPI2.2.4.1 and KPI2.2.4.2.

    • See the report from Brunella.

Algorithms’ porting activities - Electron Seeding - KPI2.2.4.X

CMS Electron Seeding on GPU:

  • (also reported in MS8)

  • Latest updates:

      • the GPU implementation of a parametrized magnetic field has been integrated into the CMS software;

      • parallelization is exposed per electron seed: one thread processes one electron seed;

      • approximately the same efficiency, fake rate equal or slightly higher (work in progress);

      • presented at CHEP.
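
The "one thread == one electron seed" mapping can be illustrated with a small CPU-side emulation. This is only a sketch under stated assumptions: the `Seed` fields, the per-seed computation and the names `seed_kernel` and `launch` are hypothetical placeholders, not the CMS implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class Seed:
    # Hypothetical seed content, chosen only for illustration
    px: float
    py: float


def seed_kernel(thread_idx, seeds, out):
    """Work done by one emulated thread: it handles exactly one seed."""
    if thread_idx >= len(seeds):
        return  # guard for threads that fall past the end of the input
    seed = seeds[thread_idx]
    # Placeholder per-seed computation (transverse momentum)
    out[thread_idx] = math.hypot(seed.px, seed.py)


def launch(seeds, threads_per_block=4):
    """Sequentially emulate a 1D grid of blocks that covers all seeds."""
    n_blocks = (len(seeds) + threads_per_block - 1) // threads_per_block
    out = [None] * len(seeds)
    for block in range(n_blocks):
        for thread in range(threads_per_block):
            seed_kernel(block * threads_per_block + thread, seeds, out)
    return out
```

Since no emulated thread reads or writes data belonging to another seed, the seeds can be processed in any order, which is the property a real GPU kernel relies on.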

Algorithms’ porting activities - Optimizer for GPU Algos - KPI2.2.4.X

Multi-Objective Optimization Tool for CMS GPU Algorithms:

  • a new tool based on the Particle Swarm heuristic optimization algorithm (PSO);

  • developed to perform parameter tuning of the GPU pixel track reconstruction;

  • the tuning algorithm is a Multi-Objective Particle Swarm Optimizer. Two competing objectives:
    • maximize efficiency;
    • minimize fake rate;

  • the result is a Pareto front;

  • tested on late-2023 Monte Carlo samples with excellent results;

  • approximately the same efficiency, ~50% fake rate;

  • presented at CHEP;

  • worth mentioning: the main developer (Simone) started working with us at the first training event.
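
What a Pareto front means for the two competing objectives can be shown with a few lines of Python. This is a minimal sketch, not the tool's code: each candidate parameter set is assumed to be summarised by a hypothetical `(efficiency, fake_rate)` pair, and the function names are illustrative.

```python
def dominates(a, b):
    """True if point a is at least as good as b in both objectives
    and strictly better in at least one.

    Points are (efficiency, fake_rate): efficiency is maximised,
    fake rate is minimised.
    """
    eff_a, fake_a = a
    eff_b, fake_b = b
    no_worse = eff_a >= eff_b and fake_a <= fake_b
    strictly_better = eff_a > eff_b or fake_a < fake_b
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, among `(0.90, 0.10)`, `(0.88, 0.12)` and `(0.95, 0.30)`, the second point is dominated by the first (lower efficiency and higher fake rate) and is dropped, while the other two remain as alternative trade-offs.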

Validation infrastructure - KPI2.2.4.3

Validation infrastructure

  • working validation infrastructure, also tested on pledged resources;

  • validates the use of Alpaka by CMS from the 2024 data taking onwards;

  • pixel track performance used as the metric;

  • two types of validations:

    • Validation 1:
      • running on the same RAW events;
      • four setups:

        • pre-Alpaka migration, on CPU and on GPU (native CUDA);
        • Alpaka, on CPU and on GPU;
      • perfect match across the setups.

    • Validation 2:

      • running the Alpaka (and CUDA) GPU and CPU reconstruction in the same job and for each event;
      • compare GPU and CPU quantities event by event;
      • same intrinsic fluctuations in Alpaka and native CUDA.

  • Testing on AMD resources at CNAF for an additional architecture.
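
The event-by-event comparison of Validation 2 can be sketched as follows. This is an illustration under assumptions, not the validation code itself: the data layout (one dictionary of quantities per event), the quantity names and the tolerance are hypothetical.

```python
import math


def compare_events(cpu, gpu, rel_tol=1e-6):
    """Compare CPU and GPU quantities event by event.

    cpu, gpu: dict mapping event id -> dict mapping quantity name -> value.
    Returns a list of (event_id, quantity, cpu_value, gpu_value) for every
    quantity that differs by more than the relative tolerance.
    Only events and quantities present on both sides are compared.
    """
    mismatches = []
    for event_id in sorted(cpu.keys() & gpu.keys()):
        cpu_q, gpu_q = cpu[event_id], gpu[event_id]
        for name in sorted(cpu_q.keys() & gpu_q.keys()):
            if not math.isclose(cpu_q[name], gpu_q[name], rel_tol=rel_tol):
                mismatches.append((event_id, name, cpu_q[name], gpu_q[name]))
    return mismatches
```

Running the same comparison once for the Alpaka pair and once for the native CUDA pair gives two mismatch lists whose sizes can be compared to check that the intrinsic fluctuations are similar.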

Validation infrastructure - KPI2.2.4.3 - II

Comparison of the number of pixel tracks per event.

The x-axis (y-axis) shows the number of pixel tracks obtained when running the reconstruction on CPU (GPU).

Each column is normalized to one.
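
The column normalization used in such a comparison plot can be sketched in plain Python. The input format, a list of hypothetical `(n_cpu, n_gpu)` pairs with one pair per event, and the function name are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def column_normalized(pairs):
    """Build a 2D histogram of (n_cpu, n_gpu) and normalize each column.

    A column collects all events with the same CPU track count (x-axis);
    after normalization the entries of each column sum to one.
    Returns dict n_cpu -> dict n_gpu -> fraction of events.
    """
    counts = defaultdict(Counter)
    for n_cpu, n_gpu in pairs:
        counts[n_cpu][n_gpu] += 1
    normalized = {}
    for n_cpu, column in counts.items():
        total = sum(column.values())
        normalized[n_cpu] = {n_gpu: c / total for n_gpu, c in column.items()}
    return normalized
```

With this normalization, a perfect CPU/GPU agreement shows up as a diagonal whose entries are all equal to one, regardless of how many events populate each column.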
