
Does In-Context Operator Learning Generalize to Domain-Shifted Settings?


Scientific Achievement

We propose a framework for solving a variety of differential equations using a large pre-trained neural network. Our model generalizes to new and more challenging problem settings, including unseen equation parameters, noisy observations, and even new classes of equations.

Significance and Impact

Our work investigates the broader notion of generalization across many types of equations using a single pre-trained model. Understanding the limitations of this setting is a key step toward developing powerful foundation models for scientific machine learning (SciML), analogous to the foundation models that have shown huge promise in language and vision.

Technical Approach

  • We leverage the Transformer architecture, dominant in NLP/vision settings, for our sequence-to-sequence differential equation model.
  • We demonstrate that our pre-trained model can solve a variety of differential equation problems at inference time via in-context learning (see the sketch after this list).
  • Using approximation theory, we propose a mathematical framework for understanding the limitations of generalization for differential equations.
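
A minimal sketch of this inference-time workflow, under assumed design choices, is given below. The class and function names (InContextOperatorModel, build_prompt), the fixed shared grid, and the token layout are illustrative assumptions rather than the authors' implementation: each in-context example contributes (condition value, solution value) pairs, the query condition is appended with a flag, and a Transformer encoder predicts the query solution without any weight updates.

import torch
import torch.nn as nn

class InContextOperatorModel(nn.Module):
    """Illustrative sequence model; tokens are (condition, solution, query-flag) triples."""
    def __init__(self, d_model: int = 128, n_layers: int = 4):
        super().__init__()
        self.embed = nn.Linear(3, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.readout = nn.Linear(d_model, 1)  # predicted solution value at each token

    def forward(self, tokens):
        # tokens: (batch, sequence, 3) -> predictions: (batch, sequence, 1)
        return self.readout(self.encoder(self.embed(tokens)))

def build_prompt(demo_conditions, demo_solutions, query_condition):
    """Pack k in-context (condition, solution) demos plus one flagged query into a prompt."""
    parts = [torch.stack([c, u, torch.zeros_like(c)], dim=-1)              # demo tokens, flag 0
             for c, u in zip(demo_conditions, demo_solutions)]
    parts.append(torch.stack([query_condition,
                              torch.zeros_like(query_condition),           # unknown solution
                              torch.ones_like(query_condition)], dim=-1))  # query tokens, flag 1
    return torch.cat(parts, dim=0).unsqueeze(0)                            # (1, total_points, 3)

At inference time, a handful of (condition, solution) pairs from the new problem are packed into the prompt and passed through the frozen model; predictions are read off at the query positions with no gradient updates.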

In-context operator learning enables generalization within and even across classes of differential equations. Trained once, the proposed model generalizes to new situations. For example, providing a few in-context examples enables the model to generalize to previously unseen forcing functions.
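
One way to read this claim is as an evaluation protocol: hold out a problem, give the model k in-context examples from it, and track the squared error of the query prediction as k grows (the quantity shown in the accompanying figure). A self-contained sketch, with a hypothetical stand-in predictor in place of the pre-trained Transformer, might look like:

import numpy as np

def squared_error(pred, truth):
    return float(np.mean((pred - truth) ** 2))

def error_vs_num_examples(demos, query_condition, reference_solution, predict, max_k=5):
    """demos: list of (condition, solution) pairs from the held-out problem."""
    return {k: squared_error(predict(demos[:k], query_condition), reference_solution)
            for k in range(1, max_k + 1)}

# Stand-in predictor so the sketch runs; a real evaluation would call the frozen model.
toy_predict = lambda demo_pairs, query: np.mean([u for _, u in demo_pairs], axis=0)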

PI(s)/Facility Lead(s): Lenny Oliker (LBL)

Collaborating Institutions: ICSI, UC Berkeley, Stanford

ASCR Program: SciDAC RAPIDS2

ASCR PM: Kalyan Perumalla (SciDAC RAPIDS2)

Figure: Generalization to out-of-distribution forcing functions. Squared error is plotted against the number of in-context examples; forcing functions are drawn from a Gaussian process with an RBF kernel parameterized by length scale l.
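
One plausible way to generate such out-of-distribution forcing functions, consistent with the caption above, is to sample a zero-mean Gaussian process with an RBF (squared-exponential) kernel whose length scale l controls smoothness; the grid and length-scale values below are assumptions for illustration only.

import numpy as np

def sample_gp_forcing(x, l, n_samples, seed=0):
    """Draw forcing functions on grid x from a zero-mean GP with an RBF kernel of length scale l."""
    rng = np.random.default_rng(seed)
    diff = x[:, None] - x[None, :]
    cov = np.exp(-0.5 * (diff / l) ** 2) + 1e-8 * np.eye(len(x))  # RBF kernel plus jitter
    return rng.multivariate_normal(np.zeros(len(x)), cov, size=n_samples)

x = np.linspace(0.0, 1.0, 64)
in_dist_f = sample_gp_forcing(x, l=0.5, n_samples=5)    # length scale assumed to match training
out_dist_f = sample_gp_forcing(x, l=0.1, n_samples=5)   # rougher, out-of-distribution functions

Shifting l away from the value used during pre-training is one concrete way to produce the domain shift probed in the figure.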