The Art of Sparsity: Mastering High-Dimensional Tensor Storage
Scientific Achievement
New findings from storing sparse tensors: (1) linear organization provides the best balance between storage size and access time; (2) sparse tensors can be transformed into 2D tensors for efficient storage with compressed sparse row (CSR)/compressed sparse column (CSC) formats; (3) tree-structured organization offers exceptional performance in storing high-dimensional tensors.
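Finding (2) can be illustrated with a small sketch. This is a hypothetical example (the tensor shape, coordinates, and values are invented for illustration): the trailing modes of a 3D sparse tensor are folded into a single column index so the tensor becomes 2D, which can then be stored in CSR form with SciPy.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical 3D sparse tensor given in COO form: coordinates plus values.
shape = (4, 5, 6)
coords = np.array([[0, 1, 2], [3, 0, 5], [2, 4, 1]])  # (nnz, ndim) indices
vals = np.array([1.0, 2.0, 3.0])

# Fold the trailing modes into columns so the tensor becomes 2D, then
# store it with CSR: rows keep mode 0; columns linearize modes 1 and 2.
rows = coords[:, 0]
cols = coords[:, 1] * shape[2] + coords[:, 2]
mat = csr_matrix((vals, (rows, cols)), shape=(shape[0], shape[1] * shape[2]))

# Recover an original entry: tensor[2, 4, 1] maps to mat[2, 4*6 + 1].
assert mat[2, 4 * 6 + 1] == 3.0
```

The same folding generalizes to any split of the modes into a "row" group and a "column" group; the choice determines the aspect ratio of the resulting 2D matrix and hence the CSR/CSC storage cost.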
Significance and Impact
Sparse tensors find widespread use in applications such as machine learning, partial differential equation (PDE) solvers, and graphs. These new findings contribute to a nuanced understanding of sparse tensor storage formats, guiding informed choices in practical applications.
Technical Approach
We analyzed both time and storage complexity for five sparse tensor organizations: coordinate (COO), linear address (LINEAR), general CSC (GCSC), general CSR (GCSR), and compressed sparse fibers (CSF).
We designed experiments to evaluate these five organizations with three representative patterns: tridiagonal sparse pattern (TSP), general graph sparse pattern (GSP), and mixed sparse pattern (MSP).
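To make the COO-versus-LINEAR comparison concrete, the sketch below (with an invented tensor shape and coordinates) shows the LINEAR organization's core idea: each nonzero is addressed by a single row-major linearized index rather than one coordinate per dimension, so the index storage shrinks from nnz × ndim entries to nnz entries.

```python
import numpy as np

# Hypothetical sparse tensor in COO form: one coordinate tuple per nonzero.
shape = (4, 5, 6)
coords = np.array([[0, 1, 2], [3, 0, 5], [2, 4, 1]])  # (nnz, ndim)
vals = np.array([1.0, 2.0, 3.0])

# LINEAR organization: store one row-major linear address per nonzero
# instead of ndim coordinates (COO stores nnz * ndim index entries).
linear_addr = np.ravel_multi_index(coords.T, shape)

# Round-trip: the original multi-indices are recoverable from the addresses.
back = np.stack(np.unravel_index(linear_addr, shape), axis=1)
assert np.array_equal(back, coords)
```

The round-trip check reflects why LINEAR trades storage for access time: reads must unflatten the address before any per-dimension lookup.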
PI(s)/Facility Lead(s): Bin Dong
Collaborating Institutions: Suren Byna (Ohio State/LBNL), Kesheng Wu (LBNL)
ASCR Program: SciDAC RAPIDS2
ASCR PM: Kalyan Perumalla (SciDAC RAPIDS2), Steve Lee (FASTMath)
Publication(s) for this work: B. Dong, S. Byna, K. Wu, et al., “The Art of Sparsity: Mastering High-Dimensional Tensor Storage (Regular Paper),” ESSA 2024: 5th Workshop on Extreme-Scale Storage and Analysis, in conjunction with IEEE IPDPS 2024, San Francisco.
Figure: writing time, storage size, and reading time for the five organizations (lower is better).