
Analysis of GPU Data Access Patterns on Complex Geometries for D3Q19 Lattice Boltzmann Algorithm

Scientific Achievement

Examine memory access schemes for the lattice Boltzmann method (LBM) on GPUs through empirical testing, and identify addressing and memory layout schemes that outperform state-of-the-art practices.

Significance and Impact

Present the first near-optimal strong-scaling results for LBM with arterial geometries run on GPU systems (Titan and Summit), leading to increased computational speed and reduced memory use.

G. Herschlag, S. Lee, J. S. Vetter, and A. Randles. Analysis of GPU Data Access Patterns on Complex Geometries for D3Q19 Lattice Boltzmann Algorithm. IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 32, No. 10, 2021.

Resolution versus performance on NVIDIA GPUs (K40, P100, V100) for the aorta geometry. Results show that semi-direct addressing typically outperforms indirect addressing, and that locally-direct addressing is consistently outperformed by both indirect and semi-direct addressing.
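
The difference between indirect and semi-direct addressing can be illustrated with a small sketch. It rests on a common reading of the two terms, not on the authors' implementation: indirect addressing precomputes a neighbor table with one entry per fluid node and direction, while semi-direct addressing keeps a lookup array over the bounding box and finds neighbors from grid coordinates. The 2D mask, the four directions standing in for the 19 of D3Q19, and all function names are hypothetical. Locally-direct addressing is not shown.

```python
# Toy 2D geometry: 1 = fluid, 0 = solid (hypothetical example data).
MASK = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
NY, NX = len(MASK), len(MASK[0])
# Four in-plane directions only, standing in for the 19 of D3Q19.
DIRS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # (dx, dy)
SOLID = -1  # marker for "no fluid node here"


def build_lookup(mask):
    """Bounding-box array mapping (y, x) to a compact fluid index, or SOLID."""
    lookup, coords = [], []
    for y, row in enumerate(mask):
        out = []
        for x, cell in enumerate(row):
            if cell:
                out.append(len(coords))  # next free compact index
                coords.append((x, y))
            else:
                out.append(SOLID)
        lookup.append(out)
    return lookup, coords


def neighbor_semi_direct(lookup, coords, i, q):
    """Semi-direct (assumed form): compute the neighbor position, then look it up."""
    x, y = coords[i]
    dx, dy = DIRS[q]
    nx, ny = x + dx, y + dy
    if 0 <= nx < NX and 0 <= ny < NY:
        return lookup[ny][nx]
    return SOLID


def build_adjacency(lookup, coords):
    """Indirect (assumed form): precompute adj[i][q] for every fluid node and direction."""
    return [
        [neighbor_semi_direct(lookup, coords, i, q) for q in range(len(DIRS))]
        for i in range(len(coords))
    ]


lookup, coords = build_lookup(MASK)
adjacency = build_adjacency(lookup, coords)

# Index storage per scheme: the table grows with (fluid nodes x directions),
# the lookup array with the bounding-box volume.
indirect_entries = len(coords) * len(DIRS)
semi_direct_entries = NX * NY
```

In this toy case the neighbor table holds 28 entries and the bounding-box lookup holds 12. With 19 directions, the table costs 19 integers per fluid node, which suggests why a lookup array can save memory when the geometry fills a reasonable fraction of its bounding box.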

Technical Approach

    • Examine the computational cost of different data storage strategies for running LBM simulations on complex geometries with GPUs.
    • Find strong evidence that semi-direct addressing is superior to indirect addressing for arterial and porous media geometries, and that the CSoA memory layout consistently provides computational acceleration with minimal coding effort and a negligible memory increase.
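
To make the memory layout comparison concrete, the sketch below gives flat-array index formulas for three layouts of the 19 distribution values per node. It assumes CSoA means a "collected" structure of arrays: nodes are grouped into fixed-size blocks (32 here, matching a warp, which is an assumption), and values inside each block are stored direction by direction. The function names are hypothetical and this is not taken from the authors' code.

```python
Q = 19  # number of discrete velocities in D3Q19


def aos_index(n, q):
    """Array of structures: all Q values of node n are contiguous."""
    return n * Q + q


def soa_index(n, q, n_nodes):
    """Structure of arrays: all nodes for direction q are contiguous."""
    return q * n_nodes + n


def csoa_index(n, q, block=32):
    """Collected SoA (assumed form): SoA ordering inside blocks of `block` nodes."""
    return (n // block) * (Q * block) + q * block + (n % block)


def csoa_size(n_nodes, block=32):
    """Array length once the node count is padded to a whole number of blocks."""
    n_blocks = (n_nodes + block - 1) // block
    return n_blocks * block * Q


# Consecutive nodes (consecutive GPU threads) for a fixed direction:
aos_stride = aos_index(1, 0) - aos_index(0, 0)            # Q apart: uncoalesced
soa_stride = soa_index(1, 0, 64) - soa_index(0, 0, 64)    # adjacent
csoa_stride = csoa_index(1, 0) - csoa_index(0, 0)         # adjacent within a block
```

Under these assumptions, CSoA keeps the unit stride of SoA for threads in the same block while holding all 19 values of a block within one contiguous chunk. The only extra memory is the padding of the last block, at most 31 nodes here, which is consistent with the negligible memory increase noted above.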