Streaming Data in HPC Workflows Using ADIOS

Scientific Achievement

We present SST, the Sustainable Staging Transport, an approach that allows direct streaming between traditional file writers and readers. SST is an engine of the ADIOS I/O framework and provides staging that scales to the full Frontier supercomputer.

Significance and Impact

SST is used in several applications:

  • PIConGPU: Feeding simulation data to model training at substantially higher bandwidth than the theoretical limit of Frontier’s file system.
  • WDMApp: Strong coupling of separately developed applications for multi-physics, multi-scale simulation.
  • WRF: In situ analysis and visualization, so that all data processing completes shortly after the simulation finishes.

PIConGPU data staging test on Frontier. Data from all GPUs is sent to a second application that trains a model. The data transfer bandwidth with SST is much higher than with the file system.

Technical Approach

  • SST separates the Control Plane from the Data Plane.
  • The Control Plane is designed for dynamic connections of multiple consumers to a producer, and for scalable distribution of application metadata to all processes. Consumer failures do not affect the producer.
  • The Data Plane has several implementations that use Remote Direct Memory Access (libfabric, ucx), MPI inter-communicators, or reliable transfer protocols (tcp, rudp).
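
The separation described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not SST's implementation or the ADIOS API: the `Producer` class, its `connect`, `disconnect`, and `put_step` methods, and the `run_demo` helper are hypothetical names, and in-process queues stand in for the real network transports. It shows consumers joining at different times and a simulated consumer failure leaving the producer unaffected.

```python
import queue
import threading


class Producer:
    """Toy producer (hypothetical, not SST code).

    Connection bookkeeping plays the role of the control plane;
    pushing step payloads plays the role of the data plane.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._consumers = {}  # consumer name -> delivery queue

    # Control plane: consumers may join or leave at any time.
    def connect(self, name):
        q = queue.Queue()
        with self._lock:
            self._consumers[name] = q
        return q

    def disconnect(self, name):
        # Used here to simulate a consumer failure.
        with self._lock:
            self._consumers.pop(name, None)

    # Data plane: deliver one step to every consumer connected right now.
    def put_step(self, step, payload):
        with self._lock:
            targets = list(self._consumers.values())
        for q in targets:
            q.put((step, payload))
        return len(targets)


def drain(q):
    """Return the step numbers currently waiting in a consumer queue."""
    steps = []
    while not q.empty():
        steps.append(q.get_nowait()[0])
    return steps


def run_demo():
    producer = Producer()
    delivered = []

    a = producer.connect("A")
    delivered.append(producer.put_step(0, b"field data"))

    b = producer.connect("B")  # a second consumer joins mid-run
    delivered.append(producer.put_step(1, b"field data"))

    producer.disconnect("A")  # simulated failure of consumer A
    delivered.append(producer.put_step(2, b"field data"))

    return {"A": drain(a), "B": drain(b), "delivered": delivered}


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the producer keeps writing steps regardless of how many consumers are attached, and only the delivery code in `put_step` would change if the queues were replaced by a different transport.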

PI(s)/Facility Lead(s): Scott Klasky, ORNL

Collaborating Institutions: Oak Ridge National Laboratory

ASCR Program: SciDAC RAPIDS2

ASCR PM: Kalyan Perumalla

Publication(s) for this work: Greg Eisenhauer et al. "Streaming Data in HPC Workflows Using ADIOS", CUG'24, Perth, AU

Credit:

F. Pöschel, K. Steiniger

Center for Advanced Systems Understanding (CASUS), Germany