6th European Advanced Accelerator Concepts workshop (EAAC'23)
WG3: Theory and simulations
Elba, Italy, September 20th, 2023
Exascale and ML Models for Accelerator Simulations
Axel Huebl
Lawrence Berkeley National Laboratory
On behalf of the WarpX, ImpactX & pyAMReX teams
LBNL, LLNL, SLAC, CEA, DESY, TAE, CERN
LDRD
1
Funding Support
WarpX: longitudinal electric field in a laser-plasma accelerator
rendered with Ascent & VTK-m
This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of two U.S. Department of Energy organizations (Office of Science and the National Nuclear Security Administration) responsible for the planning and preparation of a capable exascale ecosystem, including software, applications, hardware, advanced system engineering and early testbed platforms, in support of the nation’s exascale computing imperative. This work was also performed in part by the Laboratory Directed Research and Development Program of Lawrence Berkeley National Laboratory under U.S. Department of Energy Contract No. DE-AC02-05CH11231, Lawrence Livermore National Laboratory under Contract No. DE-AC52-07NA27344 and SLAC National Accelerator Laboratory under Contract No. DE-AC02-76SF00515. Supported by the CAMPA collaboration, a project of the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research and Office of High Energy Physics, Scientific Discovery through Advanced Computing (SciDAC) program. This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725, the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility located at Lawrence Berkeley National Laboratory, operated under Contract No. DE-AC02-05CH11231, and the supercomputer Fugaku provided by RIKEN.
The EAAC23 Workshop was supported by the EU I.FAST project. This project has received funding from the European Union’s Horizon 2020 Research and Innovation programme under Grant Agreement No 101004730.
github.com/ECP-WarpX
github.com/openPMD
github.com/AMReX-Codes
github.com/picmi-standard
2
Outline
LDRD
3
Advanced Accelerator Modeling at Exascale
4
Ultimate goal: a virtual accelerator that gives users on-the-fly tunability of physics & numerics complexity
Incomplete physics
Full physics
1D-1V
3D-3V
Low resolution
High resolution
Reduced models
First principles
Fast
Great for ensemble runs for design studies
Accurate
Great for detailed runs for physics studies
Goal
Start-to-end modeling in an open software ecosystem.
Start-to-End Modeling R&D
5
WarpX is a GPU-Accelerated PIC Code for Exascale
Available Particle-in-Cell Loops
Push particles
Deposit currents
Solve fields
Gather fields
Geometries
Advanced algorithms
boosted frame, spectral solvers, Galilean frame, embedded boundaries + CAD, MR, ...
Multi-Physics Modules
field ionization of atomic levels, Coulomb collisions, QED processes (e.g. pair creation), macroscopic materials
Multi-Node parallelization
On-Node Parallelization
Scalable, Standardized I/O
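For orientation, the sketch below spells out one cycle of the particle-in-cell loop listed above (deposit, solve, gather, push) in a deliberately minimal 1D electrostatic form with plain Python/NumPy. It is an illustrative toy with normalized units and hypothetical parameters, not WarpX's 3D electromagnetic, GPU-accelerated implementation.

# Minimal 1D electrostatic PIC cycle (illustrative toy, normalized units)
import numpy as np

nx, n_part, steps = 64, 10_000, 50
L = 1.0                                    # periodic domain length
dx = L / nx
dt = 0.1 * dx                              # small, conservative time step
qm = -1.0                                  # charge-to-mass ratio of macro-particles

rng = np.random.default_rng(0)
x = rng.uniform(0, L, n_part)              # particle positions
v = 0.01 * rng.standard_normal(n_part)     # particle velocities

k = 2 * np.pi * np.fft.rfftfreq(nx, d=dx)  # wavenumbers for the spectral solve

for _ in range(steps):
    # 1) Deposit charge onto the grid (nearest-grid-point for brevity)
    idx = (x / dx).astype(int) % nx
    rho = np.bincount(idx, minlength=nx) / dx
    rho -= rho.mean()                      # neutralizing ion background

    # 2) Solve fields: Poisson equation in Fourier space, then E = -dphi/dx
    rho_k = np.fft.rfft(rho)
    phi_k = np.zeros_like(rho_k)
    phi_k[1:] = rho_k[1:] / k[1:] ** 2
    E = np.fft.irfft(-1j * k * phi_k, n=nx)

    # 3) Gather the field at the particle positions
    E_p = E[idx]

    # 4) Push particles (leapfrog) with periodic wrap-around
    v += qm * E_p * dt
    x = (x + v * dt) % L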
6
WarpX: conceived & developed by a multidisciplinary, multi-institution team
7
Ryan
Sandberg
Andrew
Myers
Weiqun
Zhang
John
Bell
Jean-Luc Vay (ECP PI)
Rémi
Lehe
Olga Shapoval
Ann Almgren (ECP coPI)
Marc Hogan (ECP coPI)
Lixin Ge
Cho
Ng
David Grote (ECP coPI)
Revathi
Jambunathan
Axel Huebl
Yinjian
Zhao
Kevin
Gott
(NESAP)
Edoardo
Zoni
Hannah Klion
Prabhat Kumar
Junmin
Gu
Marco Garten
AM
Arianna Formenti
(France)
Lorenzo
Giacomel
(Switzerland)
…& private sector
Henri
Vincenti
Luca
Fedeli
Thomas
Clark
Neïl
Zaim
Pierre
Bartoli
(Germany)
Maxence Thévenet
Alexander
Sinn
7
ImpactX: GPU-, AMR- & AI/ML-Accelerated Beam Dynamics
Particle-in-Cell Loop
Fireproof Numerics
based on the IMPACT suite of codes, esp. IMPACT-Z and MaryLie
Triple Acceleration Approach
github.com/ECP-WarpX/impactx
LDRD
User-Friendly
Multi-Node parallelization
On-Node Parallelization
Scalable, Parallel I/O
💡 Same Script
CPU/GPU & MPI
8
ImpactX: Easy to Use & Extend, Tested and Documented
LDRD
github.com/ECP-WarpX/impactx
Example: ImpactX FODO Cell Lattice
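As a rough indication of what the FODO-cell example exercises, the sketch below composes thin-lens transfer matrices of one FODO cell in plain NumPy and extracts the phase advance. It is not the actual ImpactX input script (that uses the impactx Python package and tracks macro-particles on CPU/GPU); the focal length and drift length are hypothetical values.

# Linear optics of a FODO cell from thin-lens transfer matrices (illustrative)
import numpy as np

f = 2.0    # quadrupole focal length in m (hypothetical)
Ld = 1.0   # drift length in m (hypothetical)

def drift(L):
    # 2x2 horizontal transfer matrix of a drift of length L
    return np.array([[1.0, L], [0.0, 1.0]])

def thin_quad(f):
    # 2x2 thin-lens quadrupole; focusing in x for f > 0
    return np.array([[1.0, 0.0], [-1.0 / f, 1.0]])

# Cell order QF, drift, QD, drift; matrices compose right-to-left
M = drift(Ld) @ thin_quad(-f) @ drift(Ld) @ thin_quad(f)

cos_mu = 0.5 * np.trace(M)               # stability requires |cos(mu)| < 1
assert abs(cos_mu) < 1.0, "cell unstable for these f, Ld"
mu = np.degrees(np.arccos(cos_mu))
print(f"phase advance per FODO cell: {mu:.1f} deg")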
9
We Develop Openly with the Community
python3 -m pip install .
brew tap ecp-warpx/warpx
brew install warpx
spack install warpx
spack install py-warpx
conda install
-c conda-forge warpx
module load warpx
module load py-warpx
cmake -S . -B build
cmake --build build --target install
Open-Source Development & Benchmarks: github.com/ECP-WarpX
Online Documentation: warpx|hipace|impactx.readthedocs.io
Rapid and easy installation on any platform:
230 physics benchmarks run on every code change of WarpX
19 physics benchmarks + 106 tests for ImpactX
10
Power-Limits Seed a Cambrian Explosion of Compute Architectures
without tiling
with tiling
Field-Programmable Gate Array (FPGA)
Application-Specific Integrated Circuit (ASIC)
Quantum Circuit (potential future)
distribute one simulation over 10,000s of computers, for millions of cores
J-L Vay, A Huebl et al., PoP 28.2, 023105 (2021); A Myers et al., JParCo 108, 102833 (2021); L Fedeli, A Huebl et al., SC22 (2022)
TOP500.org (June 2023): ranks and details the 500 most powerful non-distributed computer systems
11
Community Approaches to Exascale Programming
Applications
Libraries
PIC Algorithms
Communication
Performance Portability
Programming Models
/ HIP
ARM
Hardware
AMD
B Worpitz, MA (2015); E Zenker, A Huebl et al., IPDPSW (2016); E Zenker, A Huebl et al., IWOPH (2017); A Matthes, A Huebl et al., P3MA (2017); A Myers et al., JPARCO (2021); HC Edwards et al., SciProg (2012); RD Hornung et al., OSTI TR (2014)
Warp
Vendor
⇒
Scripts
WarpX
Vendor
Scripts
HiPACE++
domain science libs
AMReX
Math
IO
ImpactX
Artemis
Performance Portability Layer
Then
Now
ABLASTR
PICSAR-QED
pyAMReX
12
WarpX is now 500x More Performant than its Baseline
April-July 2022: WarpX on world’s largest HPCs
L. Fedeli, A. Huebl et al., Gordon Bell Prize Winner at SC’22, 2022
from a full stage simulation
Figure-of-Merit: weighted updates / sec
110x
500x
Note: Perlmutter & Frontier were pre-acceptance measurements!
68,608 GPUs of First Exascale Machine
7,299,072 CPU Cores
13
2022 ACM Gordon Bell Prize: using the First Exascale Supercomputer
April-July 2022: WarpX on world’s largest HPCs
L. Fedeli, A. Huebl et al., Gordon Bell Prize Winner at SC’22, 2022
A success story of a multidisciplinary, multi-institutional team!
L. Fedeli, A. Huebl et al., IEEE, SC22 (2022)
M. Thévenet et al., Nat. Phys. 12 (2016)
14
2022 ACM Gordon Bell Prize: using the First Exascale Supercomputer
April-July 2022: WarpX on world’s largest HPCs
L. Fedeli, A. Huebl et al., Gordon Bell Prize Winner at SC’22, 2022
A success story of a multidisciplinary, multi-institutional team!
≈ nC
L. Fedeli, A. Huebl et al., IEEE, SC22 (2022)
M. Thévenet et al., Nat. Phys. 12 (2016)
15
2022 ACM Gordon Bell Prize: using the First Exascale Supercomputer
April-July 2022: WarpX on world’s largest HPCs
L. Fedeli, A. Huebl et al., Gordon Bell Prize Winner at SC’22, 2022
A success story of a multidisciplinary, multi-institutional team!
L. Fedeli, A. Huebl et al., IEEE, SC22 (2022)
M. Thévenet et al., Nat. Phys. 12 (2016)
16
If You Want to Go Far, Go Together
Code A
Code B
...
Particle-In-Cell
Modeling Interface
open Particle Mesh Data standard
Standardization…
strong international partnerships
A Huebl et al., DOI:10.5281/zenodo.591699 (2015)
DP Grote et al., Particle-In-Cell Modeling Interface (PICMI) (2021)
LD Amorim et al., GPos (2021); M Thévenet et al., DOI:10.5281/zenodo.8277220 (2023)
A Ferran Pousa et al., DOI:10.5281/zenodo.7989119 (2023)
RT Sandberg et al., IPAC23, DOI:10.18429/JACoW-IPAC-23-WEPA101 (2023)
LDRD
… Accelerates Innovation
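To make the role of PICMI concrete, here is a minimal sketch of a PICMI-style input that WarpX (via pywarpx) can consume. The grid, plasma, and solver settings are placeholder values, and some argument names may differ between PICMI versions, so treat it as an illustration of the standard rather than a reference input.

# Minimal PICMI-style input (illustrative; placeholder parameters)
from pywarpx import picmi

grid = picmi.Cartesian3DGrid(
    number_of_cells=[64, 64, 64],
    lower_bound=[-20e-6, -20e-6, -20e-6],
    upper_bound=[20e-6, 20e-6, 20e-6],
    lower_boundary_conditions=['periodic'] * 3,
    upper_boundary_conditions=['periodic'] * 3,
)
solver = picmi.ElectromagneticSolver(grid=grid, method='Yee', cfl=0.99)

electrons = picmi.Species(
    particle_type='electron',
    name='electrons',
    initial_distribution=picmi.UniformDistribution(density=1e24),
)

sim = picmi.Simulation(solver=solver, max_steps=100, verbose=True)
sim.add_species(
    electrons,
    layout=picmi.PseudoRandomLayout(n_macroparticles_per_cell=2, grid=grid),
)

# The same script can drive any PICMI-compliant code; with pywarpx it runs
# WarpX on CPUs or GPUs, depending on how WarpX was built.
sim.step(100)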
17
Across Scales: Advanced and Conventional Accelerators
18
BLAST is Now An Accelerated, Machine-Learning Boosted Ecosystem
fields & particles
tensors & arrays
LDRD
A Huebl (PI), R Sandberg, R Lehe, CE Mitchell et al.
A Huebl et al., NAPAC22, DOI:10.18429/JACoW-NAPAC2022-TUYE2 (2022)
RT Sandberg et al. and A Huebl, IPAC23, DOI:10.18429/JACoW-IPAC-23-WEPA101 (2023)
A Huebl et al., AAC22, arXiv:2303.12873 (2023); RT Sandberg et al. and A Huebl, in preparation (2023)
A) Training
B) Inference: in situ to codes
GPU Workflows are blazingly fast
Can we augment & accelerate on-GPU�PIC simulations with on-GPU ML models?
Cross-Ecosystem, In Situ Coupling
Consortium for Python Data API Standards data-apis.org
Very easy to:
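The in situ coupling above leans on the array-interchange protocols promoted by the data-API consortium (DLPack, the CUDA array interface). The sketch below shows the idea with CuPy standing in for GPU-resident PIC data and PyTorch as the ML side; it is illustrative, assumes recent CuPy/PyTorch builds with CUDA support, and does not use the pyAMReX bindings themselves, which expose WarpX/ImpactX particle and field data in the same zero-copy way.

# Zero-copy hand-off of GPU data between array and ML frameworks via DLPack
# (illustrative; requires a CUDA-capable GPU with matching CuPy and PyTorch)
import cupy as cp
import torch

# Stand-in for per-particle phase-space data already resident on the GPU
phase_space = cp.random.standard_normal((1_000_000, 6), dtype=cp.float32)

# DLPack exchange: the tensor aliases the same device memory, no copy is made
as_tensor = torch.from_dlpack(phase_space)

# Run (here trivial) inference on the GPU ...
with torch.no_grad():
    result = torch.tanh(as_tensor)

# ... and hand the result back to the array side, again without a copy
back = cp.from_dlpack(result)
print(back.shape, back.dtype)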
19
Modeling Time: ML-Acceleration of Plasma Elements for Beamlines
LPA integration via AI/ML for rapid beamline design & operations.
Fast surrogates: data-driven modeling is a potential middle ground between analytical models and first-principles simulation
LDRD
A Huebl (PI), R Sandberg, R Lehe, CE Mitchell et al.
Beamline elements: Transport, Plasma Stage, Plasma Stage, Plasma Source, Injector
Model Speed (for accelerator elements): WarpX, ImpactX, WarpX, HiPACE++, WarpX-ES
ML boosted (for a specific problem): ML, ImpactX, ML, ML, ML
Example elements: Transport, LWFA Stage, PWFA Stage, LWFA w/ iinj., Kicker Magnet
A Huebl et al., NAPAC22, DOI:10.18429/JACoW-NAPAC2022-TUYE2 (2022)
RT Sandberg et al. and A Huebl, IPAC23, DOI:10.18429/JACoW-IPAC-23-WEPA101 (2023)
A Huebl et al., AAC22, arXiv:2303.12873 (2023); RT Sandberg et al. and A Huebl, in preparation (2023)
Simulation time (full geometry, full physics): hrs, sec, hrs, hrs, min
Model choice for complex, nonlinear, many-body systems: pick two of level of detail, speed, accuracy.
Approaches: simulation, data-driven, analytical.
20
We Trained a Neural Net with WarpX for Staging of Electrons
[Diagram: fast vs. precise trade-off among analytical, simulation, and surrogate models]
[Plots: error of beam moments for stage 1, stage 2, and the combined beamline]
Training data: 50,000 particles / beam
LDRD
A Huebl (PI), R Sandberg, R Lehe, CE Mitchell et al.
A Huebl et al., NAPAC22, DOI:10.18429/JACoW-NAPAC2022-TUYE2 (2022)
RT Sandberg et al. and A Huebl, IPAC23, DOI:10.18429/JACoW-IPAC-23-WEPA101 (2023)
A Huebl et al., AAC22, arXiv:2303.12873 (2023); RT Sandberg et al. and A Huebl, in preparation (2023)
one-time cost: few hr WarpX sim + 10min training
Lens
LWFA Stage 2
Drift …
LWFA Stage 1
Drift
Drift
few pC
e- beam
Hyperparameters
A Neural Net is a non-linear transfer map!
Assumption: pure particle tracking.
A single NN can learn details of multiple stages (e.g., 10, 20, 30 GeV).
Assumption: laser-plasma parameters stay the same.
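Since the slide frames the neural network as a non-linear transfer map, the sketch below shows the basic shape of such a surrogate in PyTorch: a small MLP mapping incoming 6D phase-space coordinates of macro-particles to outgoing ones. Layer sizes, activation, loss, optimizer, and the random stand-in data are illustrative, not the published hyperparameters or the WarpX training set.

# Illustrative stage surrogate: an MLP as a non-linear 6D -> 6D transfer map
import torch
from torch import nn

class StageSurrogate(nn.Module):
    def __init__(self, width: int = 64, depth: int = 3):
        super().__init__()
        layers, dim = [], 6
        for _ in range(depth):
            layers += [nn.Linear(dim, width), nn.Tanh()]
            dim = width
        layers.append(nn.Linear(dim, 6))
        self.net = nn.Sequential(*layers)

    def forward(self, coords_in: torch.Tensor) -> torch.Tensor:
        return self.net(coords_in)

# In the real workflow the training pairs are the phase-space coordinates of
# the same macro-particles before and after a WarpX stage simulation; random
# stand-ins keep this sketch self-contained.
coords_in = torch.randn(50_000, 6)
coords_out = torch.randn(50_000, 6)

model = StageSurrogate()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(100):                 # minutes of training in the real case
    opt.zero_grad()
    loss = loss_fn(model(coords_in), coords_out)
    loss.backward()
    opt.step()

# Inference: the trained map stands in for the plasma stage in a beamline model
with torch.no_grad():
    predicted = model(coords_in[:1000])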
21
We Trained a Neural Net with WarpX for Staging of Electrons
[Diagram: fast vs. precise trade-off among analytical, simulation, and surrogate models]
ImpactX (after 2 surrogates) vs. WarpX (2-stage simulation)
[Plots: error of beam moments for stage 1, stage 2, and the combined beamline]
Training data: 50,000 particles / beam
LDRD
A Huebl (PI), R Sandberg, R Lehe, CE Mitchell et al.
A Huebl et al., NAPAC22, DOI:10.18429/JACoW-NAPAC2022-TUYE2 (2022)
RT Sandberg et al. and A Huebl, IPAC23, DOI:10.18429/JACoW-IPAC-23-WEPA101 (2023)
A Huebl et al., AAC22, arXiv:2303.12873 (2023); RT Sandberg et al. and A Huebl, in preparation (2023)
Open challenges: learning microscopic and collective effects simultaneously.
one-time cost: few hr WarpX sim + 10min training
Lens
LWFA Stage 2
Drift …
LWFA Stage 1
Drift
Drift
ImpactX simulation time: <1 sec
Flexible, Hybrid Beamline Sim
Same super-fast evaluation!
few pC
e- beam
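To indicate how such surrogates slot into a hybrid beamline, here is a schematic, self-contained sketch that alternates a linear drift map with an (untrained) stand-in network for the two LWFA stages; the element sequence and the network are illustrative only and do not reflect the actual ImpactX ML-element interface.

# Schematic hybrid beamline: conventional maps interleaved with an ML surrogate
import torch
from torch import nn

def drift_map(coords: torch.Tensor, L: float) -> torch.Tensor:
    # Linear drift of length L acting on columns (x, x', y, y', t, delta)
    out = coords.clone()
    out[:, 0] += L * coords[:, 1]        # x += L * x'
    out[:, 2] += L * coords[:, 3]        # y += L * y'
    return out

# Stand-in for the trained stage surrogate (in practice loaded via torch.load)
stage_surrogate = nn.Sequential(nn.Linear(6, 64), nn.Tanh(), nn.Linear(64, 6))

beam = torch.randn(10_000, 6) * 1e-3     # toy few-pC electron bunch
with torch.no_grad():
    beam = drift_map(beam, 0.1)          # drift
    beam = stage_surrogate(beam)         # LWFA stage 1 (surrogate)
    beam = drift_map(beam, 0.1)          # drift (lens would be inserted here)
    beam = stage_surrogate(beam)         # LWFA stage 2 (surrogate)
print(beam.mean(dim=0))                  # quick check of the beam centroid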
22
Summary
presented by: Axel Huebl (LBNL)
📧 axelhuebl@lbl.gov
LDRD
github.com/ECP-WarpX
github.com/openPMD
github.com/AMReX-Codes
github.com/picmi-standard
Model choice recap: level of detail, speed, accuracy; simulation, data-driven, analytical
23
Backup Slides
24
Abstract (16'+4')
Computational modeling is essential to the exploration and design of advanced particle accelerators. The modeling of laser-plasma acceleration and interaction can achieve predictive quality for experiments if adequate resolution, full geometry and physical effects are included.
Here, we report on the significant evolution in fully relativistic, full-3D modeling of conventional and advanced accelerators in the WarpX and ImpactX codes with the introduction of Exascale supercomputing and AI/ML models. We will cover the first PIC simulations on an Exascale machine, the need for and evolution of open standards, and, based on our fully open community codes, the connection of time and space scales from plasma to conventional beamlines with data-driven machine-learning models.
25
WarpX in ECP: Staging of Laser-Driven Plasma Acceleration
Goal: deliver & scientifically use the nation’s first exascale systems
first 3D simulation of a chain of plasma accelerator stages for future colliders
Our DOE science case is in HEP; our methods are in ASCR:
26
WarpX in ECP: Staging of Laser-Driven Plasma Acceleration
J.-L. Vay, A. Huebl et al., ISAV’20 Workshop Keynote (2020) and PoP 28.2, 023105 (2021); L. Fedeli, A. Huebl et al., SC22 (2022); J.-L. Vay et al., ECP WarpX MS FY23.1; A. Ferran Pousa et al., IPAC23, DOI:10.18429/JACoW-IPAC-23-TUPA093 (2023)
Ascent VTK-m In Situ Visualization: N. Marsaglia, C. Harrison, A. Huebl
27
BLAST is Now An Accelerated, ML-Modeling Ecosystem
fields & particles
tensors & arrays
LDRD
A Huebl (PI), R Sandberg, R Lehe, CE Mitchell et al.
A Huebl et al., NAPAC22, DOI:10.18429/JACoW-NAPAC2022-TUYE2 (2022)
RT Sandberg et al. and A Huebl, IPAC23, DOI:10.18429/JACoW-IPAC-23-WEPA101 (2023)
A Huebl et al., AAC22, arXiv:2303.12873 (2023); RT Sandberg et al. and A Huebl, in preparation (2023)
A) Training
B) Inference: in situ to codes
Related Works (not or only partly GPU accelerated): C Badiali et al., J. Plasma Phys. 88.6 (2022)
Cross-Ecosystem, In Situ Coupling
Consortium for Python Data API Standards data-apis.org
Very easy to:
All-GPU Workflows are blazingly fast
Can we augment & accelerate on-GPU�PIC simulations with on-GPU ML models?
28
The WarpX Software Stack
WarpX
full PIC, LPA/LPI
AMReX
Containers, Communication, Portability, Utilities
MPI
CUDA, OpenMP, SYCL, HIP
Diagnostics
I/O, code coupling
ADIOS2
HDF5
Lin. Alg.
BLAS++, LAPACK++
Ascent
...
Python: Modules, PICMI interface, Workflows
ZFP
VTK-m
openPMD
PICSAR
QED Modules
FFT
on- or multi-device
ABLASTR library: common PIC physics
ARTEMIS
microelectronics
ImpactX
accelerator lattice design
Desktop
to
HPC
HiPACE++: quasi-static, PWFA
Object-Level Python Bindings
extensible, AI/ML
pyAMReX
29
Power-Limits Seed a Cambrian Explosion of Compute Architectures
AMD
ARM
30
Portable Performance through Exascale Programming Model
A. Myers et al., “Porting WarpX to GPU-accelerated platforms,” Parallel Computing 108, 102833 (2021)
AMReX library
without tiling
with tiling
Data Structures
ParallelFor(/Scan/Reduce)
A100 gives an additional ≲ 2x
31
BLAST Codes: Transition to Exascale
Imagine a future, hybrid particle accelerator, e.g., with conventional and plasma elements.
s-based PIC: uses s instead of t as the independent variable + symplectic maps for accelerator elements
Quasistatic PIC: separates timescales of plasma wake & beam evolution
WarpX
HiPACE++
ImpactX
Booster
Source
Injector
Storage Ring
BeamBeam3D
ES or
Vlasov
FEL
IMPACT-T
Legend
BLAST: Exascale
in BLAST
LW3D
modeling of radiative & space-charge effects
POSINST
buildup of electron clouds, secondary electron yield
other
Injector
Plasma Stage
(S)RF Gun
LPA/LPI
Storage Ring
IP
IP
IMPACT-Z
cooling
Goal
Start-to-end modeling in an open software ecosystem.
Plasma Stage
t-based electrostatic or electromagnetic PIC
Warp
FBPIC
Wake-T
Reduce Dynamics / Geometry