Portable Staging for GPU-Based In-Situ Workflows

Scientific Achievement

We extend the use of GPU memory for data staging in coupled scientific workflows by implementing a “dual-channel dual staging” (DCDS) mode, in which data moves between GPUs and staging servers over two channels at once: a direct GPU->remote channel and a proxied GPU->CPU->remote channel. We also implement adaptive load balancing across these channels to improve utilization under system variation and different modes of congestion.

Significance and Impact

GPU-based computation has taken a prominent role in scientific computing workflows, and machine learning is expected to accelerate this trend. Efficient use of system resources is critical to the performance of memory-intensive applications, and using all available transfer mechanisms improves throughput. Using DCDS, we observed a 49% and 56% improvement in write and read time, respectively, over GPU-GPU transfer alone in a LULESH/ZFP coupled workflow.

In an evaluation on Perlmutter, splitting data across both GPU and host channels (dual channel) improves read/write time by 5%/6% over GPU channels alone and 17%/36% over host channels alone. Pre-staging data in host memory as well (dual channel dual staging) substantially increases these gains, improving read/write time by 60%/35% over GPU channels alone and 65%/56% over host channels alone. These results show the value that host memory and communication channels can bring to GPU workflows.

Technical Approach

  • Divide each object to be transferred between GPUs and staging servers into a “direct channel” slice and a “CPU-proxied channel” slice, according to a “slicing ratio”.
  • Adapt the slicing ratio across these channels to improve utilization in the face of system variation and different modes of congestion.
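
The two steps above can be illustrated with a minimal sketch. It is not the DataSpaces implementation: the function names, the bandwidth-proportional update rule, the smoothing factor, and the ratio bounds are all assumptions chosen for illustration. The idea is that both channels finish at about the same time when bytes are split in proportion to each channel's observed bandwidth, so the ratio is nudged toward that share after each transfer.

```python
from typing import Tuple

# Hypothetical bounds: keep a small share on each channel so that both
# continue to be measured even when one is much faster.
MIN_RATIO = 0.05
MAX_RATIO = 0.95


def split_by_ratio(nbytes: int, ratio: float) -> Tuple[int, int]:
    """Split an object into (direct_bytes, proxied_bytes).

    `ratio` is the fraction of the object sent over the direct channel.
    """
    if nbytes < 0 or not 0.0 <= ratio <= 1.0:
        raise ValueError("nbytes must be >= 0 and ratio must be in [0, 1]")
    direct = round(nbytes * ratio)
    return direct, nbytes - direct


def update_ratio(
    ratio: float,
    direct_bytes: int,
    direct_secs: float,
    proxied_bytes: int,
    proxied_secs: float,
    smoothing: float = 0.5,
) -> float:
    """Move the slicing ratio toward the direct channel's bandwidth share.

    Assumed rule (not taken from the source): the target ratio is
    bw_direct / (bw_direct + bw_proxied), blended with the current ratio
    by an exponential smoothing factor and clamped to the bounds above.
    """
    # Without a measurement for both channels, keep the current ratio.
    if min(direct_bytes, proxied_bytes) <= 0:
        return ratio
    if min(direct_secs, proxied_secs) <= 0:
        return ratio

    bw_direct = direct_bytes / direct_secs
    bw_proxied = proxied_bytes / proxied_secs
    target = bw_direct / (bw_direct + bw_proxied)

    blended = (1.0 - smoothing) * ratio + smoothing * target
    return min(max(blended, MIN_RATIO), MAX_RATIO)


if __name__ == "__main__":
    ratio = 0.5
    direct, proxied = split_by_ratio(1_000_000, ratio)
    # Pretend the proxied channel was three times slower for this transfer.
    ratio = update_ratio(ratio, direct, 1.0, proxied, 3.0)
    print(direct, proxied, ratio)
```

In this sketch, a congested proxied channel lowers its measured bandwidth, so the next object sends a larger slice over the direct channel, and vice versa. The smoothing factor trades responsiveness against sensitivity to noisy timings.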

PI(s)/Facility Lead(s): Philip Davis, Bo Zhang (University of Utah)

Collaborating Institutions: OLCF, Rutgers University

ASCR Program: SciDAC

ASCR PM: Kalyan Perumalla

Publication for this work: Zhang et al., “Dual Channel Dual Staging: Hierarchical and Portable Staging for GPU-Based In-Situ Workflow” (under submission)

Code Developed: Dual Channel/Dual Staging DataSpaces extensions