
Does In-Context Operator Learning Generalize to Domain-Shifted Settings?


Scientific Achievement

We propose a framework for solving a variety of differential equations using a large pre-trained neural network. Our model generalizes to new and more challenging problem settings, including unseen equation parameters, noisy observations, and even new classes of equations.

Significance and Impact

Our work investigates the broader notion of generalization across many types of equations using a single pre-trained model. Understanding the limitations of this setting is a key step toward developing powerful foundation models for scientific machine learning (SciML), analogous to the foundation models that have shown huge promise in language and vision.

Technical Approach

  • We leverage the Transformer architecture, dominant in NLP/vision settings, for our sequence-to-sequence differential equation model.
  • We demonstrate that our pre-trained model can solve a variety of differential equation problems at inference time via in-context learning (see the sketch after this list).
  • Using approximation theory, we propose a mathematical framework for understanding the limitations of generalization for differential equations.
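
A minimal sketch of this inference-time workflow, under assumed design choices, is given below. The class and function names (InContextOperatorModel, build_prompt), the fixed shared grid, and the token layout are illustrative assumptions rather than the authors' implementation: each in-context example contributes (condition value, solution value) pairs, the query condition is appended with a flag, and a Transformer encoder predicts the query solution without any weight updates.

import torch
import torch.nn as nn

class InContextOperatorModel(nn.Module):
    """Illustrative sequence model; tokens are (condition, solution, query-flag) triples."""
    def __init__(self, d_model: int = 128, n_layers: int = 4):
        super().__init__()
        self.embed = nn.Linear(3, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.readout = nn.Linear(d_model, 1)  # predicted solution value at each token

    def forward(self, tokens):
        # tokens: (batch, sequence, 3) -> predictions: (batch, sequence, 1)
        return self.readout(self.encoder(self.embed(tokens)))

def build_prompt(demo_conditions, demo_solutions, query_condition):
    """Pack k in-context (condition, solution) demos plus one flagged query into a prompt."""
    parts = [torch.stack([c, u, torch.zeros_like(c)], dim=-1)              # demo tokens, flag 0
             for c, u in zip(demo_conditions, demo_solutions)]
    parts.append(torch.stack([query_condition,
                              torch.zeros_like(query_condition),           # unknown solution
                              torch.ones_like(query_condition)], dim=-1))  # query tokens, flag 1
    return torch.cat(parts, dim=0).unsqueeze(0)                            # (1, total_points, 3)

At inference time, a handful of (condition, solution) pairs from the new problem are packed into the prompt and passed through the frozen model; predictions are read off at the query positions with no gradient updates.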

In-context operator learning enables generalization within and even across classes of differential equations. Trained once, the proposed model generalizes to new situations. For example, providing a few in-context examples enables the model to generalize to previously unseen forcing functions.
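
One way to read this claim is as an evaluation protocol: hold out a problem, give the model k in-context examples from it, and track the squared error of the query prediction as k grows (the quantity shown in the accompanying figure). A self-contained sketch, with a hypothetical stand-in predictor in place of the pre-trained Transformer, might look like:

import numpy as np

def squared_error(pred, truth):
    return float(np.mean((pred - truth) ** 2))

def error_vs_num_examples(demos, query_condition, reference_solution, predict, max_k=5):
    """demos: list of (condition, solution) pairs from the held-out problem."""
    return {k: squared_error(predict(demos[:k], query_condition), reference_solution)
            for k in range(1, max_k + 1)}

# Stand-in predictor so the sketch runs; a real evaluation would call the frozen model.
toy_predict = lambda demo_pairs, query: np.mean([u for _, u in demo_pairs], axis=0)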

PI(s)/Facility Lead(s): Lenny Oliker (LBL)

Collaborating Institutions: ICSI, UC Berkeley, Stanford

ASCR Program: SciDAC RAPIDS2

ASCR PM: Kalyan Perumalla (SciDAC RAPIDS2)

Figure: Generalization to out-of-distribution forcing functions. Squared error is plotted against the number of in-context examples; forcing functions are drawn from a Gaussian process with an RBF kernel parameterized by length scale l.
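
One plausible way to generate such out-of-distribution forcing functions, consistent with the caption above, is to sample a zero-mean Gaussian process with an RBF (squared-exponential) kernel whose length scale l controls smoothness; the grid and length-scale values below are assumptions for illustration only.

import numpy as np

def sample_gp_forcing(x, l, n_samples, seed=0):
    """Draw forcing functions on grid x from a zero-mean GP with an RBF kernel of length scale l."""
    rng = np.random.default_rng(seed)
    diff = x[:, None] - x[None, :]
    cov = np.exp(-0.5 * (diff / l) ** 2) + 1e-8 * np.eye(len(x))  # RBF kernel plus jitter
    return rng.multivariate_normal(np.zeros(len(x)), cov, size=n_samples)

x = np.linspace(0.0, 1.0, 64)
in_dist_f = sample_gp_forcing(x, l=0.5, n_samples=5)    # length scale assumed to match training
out_dist_f = sample_gp_forcing(x, l=0.1, n_samples=5)   # rougher, out-of-distribution functions

Shifting l away from the value used during pre-training is one concrete way to produce the domain shift probed in the figure.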