Flash-X Performance Orchestration with the NP ENAF Partnership
Scientific Achievement
We began developing an end-to-end performance portability solution for multiphysics applications under the Exascale Computing Project. The solution is language agnostic, which removes the need to rewrite code in C++, as most other solutions require. It is now deployed in Flash-X for some of the physics.
Significance and Impact
Our approach provides an effective solution that is designed to evolve as hardware and software grow more complex. It can also be applied to many legacy codes.
Some of our tools provide features complementary to existing C++ solutions, and can therefore be used in combination with those solutions.
This figure shows a performance comparison of shock hydrodynamics with adaptive mesh refinement run in three different modes. The top dotted line is the performance using OpenACC offloading within a block, and the bottom green lines show offloading done with our toolchain. The solid blue line is CPU-only performance.
Technical Approach
The solution uses three sets of tools, each addressing one aspect of performance.
Unify expression of arithmetic with different data layouts for different devices: This is done with macros that can have multiple alternative definitions, and a tool that arbitrates among them to select the most suitable one.
Unify algorithmic variants and map computation to devices: Algorithmic variants are expressed in pseudocode-like recipes that specify the order of execution and where each computation executes.
Move data and computation to the targets specified in the recipe: This is done with a domain-specific runtime, Milhoja. The code to interface with Milhoja is generated along with the recipe translation.
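The three tool roles above can be illustrated with a minimal Python sketch. It is a toy model only: the macro syntax (`@NAME@`), the macro and step names, the layouts, and the functions `select_definition`, `expand`, and `run_recipe` are all hypothetical and do not reflect the actual Flash-X macro tools, recipe language, or the Milhoja API.

```python
import re

# Hypothetical macro table: one macro name, several alternative definitions.
# The layouts shown are illustrative assumptions, not Flash-X's real ones.
MACRO_DEFS = {
    "INDEX": {
        "default": "(i, j, k, v)",
        "gpu": "(v, i, j, k)",
    },
}

def select_definition(name, device):
    """Arbitrate: prefer a device-specific definition, else use the default."""
    alternatives = MACRO_DEFS[name]
    return alternatives.get(device, alternatives["default"])

def expand(source, device):
    """Replace each @NAME@ marker with the definition chosen for `device`."""
    return re.sub(r"@(\w+)@",
                  lambda m: select_definition(m.group(1), device),
                  source)

# Hypothetical recipe: ordered steps, each mapped to a target device.
RECIPE = [
    ("compute_fluxes", "gpu"),
    ("update_solution", "gpu"),
    ("apply_eos", "cpu"),
]

# Stand-in kernels operating on a plain list of floats (one "block").
KERNELS = {
    "compute_fluxes": lambda block: [x * 2.0 for x in block],
    "update_solution": lambda block: [x + 1.0 for x in block],
    "apply_eos": lambda block: [max(x, 0.0) for x in block],
}

def run_recipe(recipe, kernels, block):
    """Toy stand-in for a runtime: run steps in recipe order and record
    the device each step was mapped to. No real data movement happens."""
    log = []
    for step, device in recipe:
        block = kernels[step](block)
        log.append((step, device))
    return block, log

if __name__ == "__main__":
    print(expand("u@INDEX@", "gpu"))
    print(expand("u@INDEX@", "cpu"))
    print(run_recipe(RECIPE, KERNELS, [1.0, -2.0]))
```

In this sketch the same source text `u@INDEX@` expands differently per device, and changing where a step runs means editing only the recipe, not the kernels. A real runtime would additionally move data between host and device, which the toy `run_recipe` omits.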
PI(s)/Facility Lead(s): Person Name; Anshu Dubey
Collaborating Institutions: Virginia Tech, RIKEN
ASCR Program: ECP, SciDAC
ASCR PM: Lali Chatterji, Kalyan Perumalla
Publication(s) for this work: Youngjun, et al., in preparation
Dubey et al., https://doi.org/10.1016/j.future.2023.07.014