An ORNL research team implemented the first AI-generated, portable Basic Linear Algebra Subprograms (BLAS) math library for level-1 (vector-vector) operations
The AI-generated codes are functional (they compile and run) on different hardware configurations
The performance of the AI-generated codes is competitive with, and in some cases exceeds, that of vendor libraries (Intel MKL, NVIDIA cuBLAS, AMD AOCL, and hipBLAS) on modern CPU/GPU architectures
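To make the target concrete, a level-1 BLAS operation such as AXPY computes y = a*x + y over two vectors. The sketch below is a hand-written, illustrative C kernel of the kind ChatBLAS targets; it is not actual ChatBLAS output, and the OpenMP pragma is only one possible portability mechanism.

```c
#include <stddef.h>

/* Illustrative level-1 (vector-vector) BLAS AXPY kernel: y = a*x + y.
 * A hand-written sketch of the operation class ChatBLAS generates,
 * not code produced by the library itself. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    /* Portable loop-level parallelism; ignored if OpenMP is disabled. */
    #pragma omp parallel for
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Vendor libraries expose the same operation (e.g., cblas_saxpy in MKL, cublasSaxpy in cuBLAS), which is what makes head-to-head performance comparisons straightforward.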
Significance and Impact
Evaluate the capabilities of current large language models (LLMs) to generate portable high-performance computing (HPC) libraries for BLAS operations
Define the fundamental practices and criteria to interact with LLMs for HPC targets
Figure 1: The hipBLAS performance vs. nontrained (grey) and trained (dark-grey) ChatBLAS-HIP performance. The performance reached by the AI-generated codes is competitive with or even higher than vendor libraries on modern CPU/GPU architectures
Technical Approach
Used GPT versions 3.5 and 4.0, with the just-in-time (JIT), LLVM-based Julia programming language to interact with GPT and to test the generated codes
Evaluated the LLM-based techniques of prompt engineering and fine-tuning
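Accepting an LLM-generated kernel requires checking it against a known-good reference. The C sketch below shows one plausible form of such a correctness check, comparing a candidate kernel's result to a naive reference within a floating-point tolerance; the function names are illustrative assumptions, not the actual ChatBLAS test harness.

```c
#include <math.h>
#include <stddef.h>

/* Naive reference for the level-1 BLAS dot product. */
float sdot_reference(size_t n, const float *x, const float *y)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i)
        s += x[i] * y[i];
    return s;
}

/* Stand-in for a candidate LLM-generated kernel under test
 * (here identical to the reference for demonstration). */
float sdot_generated(size_t n, const float *x, const float *y)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i)
        s += x[i] * y[i];
    return s;
}

/* Accept the generated kernel only if its result agrees with the
 * reference within a floating-point tolerance. */
int kernel_passes(size_t n, const float *x, const float *y, float tol)
{
    return fabsf(sdot_generated(n, x, y) - sdot_reference(n, x, y)) <= tol;
}
```

A tolerance-based comparison (rather than exact equality) matters because generated kernels may reorder floating-point reductions, which changes rounding without changing correctness.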
PI(s)/Facility Lead(s): Pedro Valero-Lara, Prasanna Balaprakash, Jeffrey S. Vetter; Scott Klasky
Collaborating Institutions: Oak Ridge National Laboratory
ASCR Program: RAPIDS-2
ASCR PM: Hal Finkel, Kalyan Perumalla
Publication(s) for this work: Pedro Valero-Lara, et al., "ChatBLAS: A Portable AI-Generated BLAS Library." International Conference on Parallel Architectures and Compilation Techniques (PACT), 2024