An ORNL research team implemented the first AI-generated, portable Basic Linear Algebra Subprograms (BLAS) math library for level-1 (vector-vector) operations
The AI-generated codes are functional (they compile and run) on different hardware configurations
The performance of the AI-generated codes is competitive with, and in some cases exceeds, that of vendor libraries (Intel MKL, NVIDIA cuBLAS, AMD AOCL, and hipBLAS) on modern CPU/GPU architectures
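To make the target concrete, a level-1 BLAS operation such as AXPY computes y = a*x + y over two vectors. The sketch below is a hand-written, illustrative C kernel of the kind ChatBLAS targets; it is not actual ChatBLAS output, and the OpenMP pragma is only one possible portability mechanism.

```c
#include <stddef.h>

/* Illustrative level-1 (vector-vector) BLAS AXPY kernel: y = a*x + y.
 * A hand-written sketch of the operation class ChatBLAS generates,
 * not code produced by the library itself. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    /* Portable loop-level parallelism; ignored if OpenMP is disabled. */
    #pragma omp parallel for
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Vendor libraries expose the same operation (e.g., cblas_saxpy in MKL, cublasSaxpy in cuBLAS), which is what makes head-to-head performance comparisons straightforward.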
Significance and Impact
Evaluate the capabilities of current large language models (LLMs) to generate portable high-performance computing (HPC) libraries for BLAS operations
Define the fundamental practices and criteria to interact with LLMs for HPC targets
Figure 1: The hipBLAS performance vs. nontrained (grey) and trained (dark-grey) ChatBLAS-HIP performance. The performance reached by the AI-generated codes is competitive with or even higher than vendor libraries on modern CPU/GPU architectures
Technical Approach
Used GPT versions 3.5 and 4.0, with the just-in-time (JIT), LLVM-based Julia programming language to interact with GPT and to test the generated codes
Evaluated the LLM-based techniques of prompt engineering and fine-tuning
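Accepting an LLM-generated kernel requires checking it against a known-good reference. The C sketch below shows one plausible form of such a correctness check, comparing a candidate kernel's result to a naive reference within a floating-point tolerance; the function names are illustrative assumptions, not the actual ChatBLAS test harness.

```c
#include <math.h>
#include <stddef.h>

/* Naive reference for the level-1 BLAS dot product. */
float sdot_reference(size_t n, const float *x, const float *y)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i)
        s += x[i] * y[i];
    return s;
}

/* Stand-in for a candidate LLM-generated kernel under test
 * (here identical to the reference for demonstration). */
float sdot_generated(size_t n, const float *x, const float *y)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i)
        s += x[i] * y[i];
    return s;
}

/* Accept the generated kernel only if its result agrees with the
 * reference within a floating-point tolerance. */
int kernel_passes(size_t n, const float *x, const float *y, float tol)
{
    return fabsf(sdot_generated(n, x, y) - sdot_reference(n, x, y)) <= tol;
}
```

A tolerance-based comparison (rather than exact equality) matters because generated kernels may reorder floating-point reductions, which changes rounding without changing correctness.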
PI(s)/Facility Lead(s): Pedro Valero-Lara, Prasanna Balaprakash, Jeffrey S. Vetter; Scott Klasky
Collaborating Institutions: Oak Ridge National Laboratory
ASCR Program: RAPIDS-2
ASCR PM: Hal Finkel, Kalyan Perumalla
Publication(s) for this work: Pedro Valero-Lara, et al., "ChatBLAS: A Portable AI-Generated BLAS Library." International Conference on Parallel Architectures and Compilation Techniques (PACT), 2024