
Scientific Achievement

Researchers from AMCR replaced the traditional two-sided inter-process MPI communication within SuperLU’s sparse triangular solvers with a low-latency one-sided implementation.

Significance and Impact

SuperLU preconditioners (sparse triangular solvers / SpTS) are essential components in the linear solvers used by the M3D-C1 and NIMROD fusion simulation codes, but their run time is dominated by MPI communication. Our new one-sided CPU implementation of SpTS improves solver performance and scalability, which benefits these fusion simulations.

Research Details

  • Collaboration between CTTS, RAPIDS, and FastMath.
  • Evaluated the potential of three different one-sided MPI implementations against the baseline two-sided implementation on the EPYC CPUs of NERSC’s Perlmutter.
  • Achieved up to 1.5x speedup on 128 processes (one Perlmutter CPU node) relative to the two-sided implementation.
  • Integrated the best one-sided CPU SpTS implementation into SuperLU_DIST.
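
For readers unfamiliar with the kernel, the following is a minimal serial sketch of a sparse lower-triangular solve (forward substitution) on a matrix in CSR form. It is not the SuperLU_DIST implementation; the function name `sparse_lower_solve` and the small 3x3 example are hypothetical and chosen only for illustration. It shows why SpTS is communication-bound when distributed: each unknown `x[i]` depends on earlier unknowns `x[j]`, and in a distributed solve those values may live on other processes, so every such dependency becomes a small message.

```python
def sparse_lower_solve(indptr, indices, data, b):
    """Solve L x = b by forward substitution.

    L is lower triangular in CSR form (indptr, indices, data).
    Illustrative serial sketch only, not SuperLU_DIST code.
    """
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        acc = b[i]
        diag = None
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            if j == i:
                diag = data[k]
            elif j < i:
                # Dependency on an already computed unknown x[j].
                # In a distributed solve, x[j] may be owned by another
                # process and would have to be communicated.
                acc -= data[k] * x[j]
            else:
                raise ValueError("matrix is not lower triangular")
        if diag is None or diag == 0:
            raise ValueError(f"missing or zero diagonal in row {i}")
        x[i] = acc / diag
    return x


# Hypothetical example: L = [[2, 0, 0], [1, 1, 0], [0, 3, 4]]
indptr = [0, 1, 3, 5]
indices = [0, 0, 1, 1, 2]
data = [2.0, 1.0, 1.0, 3.0, 4.0]
x = sparse_lower_solve(indptr, indices, data, [2.0, 3.0, 14.0])
print(x)
```

Because the work per dependency is a single multiply-add, message latency rather than arithmetic tends to set the pace in a distributed solve, which is why a lower-latency communication mechanism can help.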

Leveraging One-Sided Communication for Sparse Triangular Solvers on Traditional CPUs

SciDAC4: CTTS