
Robustifying State-space Models for Long Sequences via Approximate Diagonalization

Scientific Achievement

We improve the prediction accuracy and robustness of state-space models (SSMs) for forecasting very long sequences. The improvements result from a novel "perturb-then-diagonalize" (PTD) methodology, which resolves instabilities in existing initialization schemes for SSMs.

Significance and Impact

Understanding and predicting sequential data is crucial in science and engineering. However, modeling long-term and complex temporal dependencies is challenging. In this work, we improve state-space models (SSMs) so that they can handle very long sequences more effectively.

Technical Approach

We theoretically analyze the transfer functions of SSMs under the HiPPO initialization schemes. Based on this analysis, we introduce our PTD methodology.
PTD applies a small perturbation to the state matrix and then diagonalizes the perturbed matrix with a backward-stable eigensolver.
We demonstrate the advantage of our approach by evaluating SSMs, initialized with our PTD methodology, on challenging forecasting tasks.
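
The perturb-then-diagonalize idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The HiPPO-LegS matrix formula, the Gaussian perturbation, the relative perturbation size `eps`, and the function names `hippo_legs` and `perturb_then_diagonalize` are all assumptions made for illustration, and the perturbation used in the paper may be chosen differently. `numpy.linalg.eig` (LAPACK) stands in for the backward-stable eigensolver.

```python
import numpy as np


def hippo_legs(n):
    """HiPPO-LegS state matrix as commonly stated in the SSM literature
    (assumed form): A[i, k] = -sqrt(2i+1) * sqrt(2k+1) for i > k,
    A[i, i] = -(i + 1), and zero above the diagonal."""
    idx = np.arange(n)
    s = np.sqrt(2 * idx + 1)
    return -np.tril(np.outer(s, s), k=-1) - np.diag(idx + 1.0)


def perturb_then_diagonalize(A, eps=1e-2, seed=0):
    """Add a random perturbation E with ||E||_2 = eps * ||A||_2,
    then diagonalize A + E. Returns (eigenvalues, eigenvectors, E).
    The Gaussian choice of E is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    E = rng.standard_normal(A.shape)
    E *= eps * np.linalg.norm(A, 2) / np.linalg.norm(E, 2)
    eigvals, V = np.linalg.eig(A + E)
    return eigvals, V, E


if __name__ == "__main__":
    A = hippo_legs(16)
    eigvals, V, E = perturb_then_diagonalize(A, eps=1e-2)
    # Eigenvector conditioning of the perturbed matrix, which governs
    # how reliably the diagonal form can be used numerically.
    print("cond(V) after perturbation:", np.linalg.cond(V))
    # Relative error of the reconstructed diagonalization of A + E.
    recon = V @ np.diag(eigvals) @ np.linalg.inv(V)
    print("relative residual:",
          np.linalg.norm(recon - (A + E), 2) / np.linalg.norm(A + E, 2))
```

In this sketch, the diagonal SSM would be initialized from `eigvals` together with the input and output vectors transformed by `V`. The result diagonalizes `A + E` exactly rather than `A`, which is the sense in which the diagonalization is approximate.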

Our proposed PTD methodology improves the robustness and accuracy of SSMs compared with transformer models and state-of-the-art SSMs. For example, the accuracy of the S4D model drops drastically on noisy inputs (a), while the S4-PTD model remains highly resilient to noisy inputs (b), both throughout training and at inference time.

PI(s)/Facility Lead(s): Lenny Oliker (LBL)

Collaborating Institutions: ICSI, UC Berkeley, Cornell University

ASCR Program: SciDAC RAPIDS2

ASCR PM: Kalyan Perumalla (SciDAC RAPIDS2)

[Figure: panel (a) S4D, labeled "brittle"; panel (b) S4-PTD, labeled "robust"]

TALKING POINTS:

Understanding and predicting sequential data is crucial in science and engineering.
State-space models have become widely popular and successful in sequence modeling, and are known for their efficiency and their ability to handle long sequences.
However, state-space models must be initialized carefully to achieve maximum performance on tasks that involve long sequences.
We propose a new initialization scheme that remedies the robustness issues of existing schemes.
We call our methodology perturb-then-diagonalize (PTD).
Specifically, we show that a small random perturbation improves numerical stability, which in turn improves generalization performance and robustness to input perturbations.

METADATA:

Name of the associated awarded project: SciDAC-5 Institutes RAPIDS2

PI name(s): Leonid Oliker

Name of the program manager: Kalyan Perumalla (RAPIDS)

CITATIONS:

Yu, A., Nigmetov, A., Morozov, D., Mahoney, M. W., & Erichson, N. B. Robustifying State-space Models for Long Sequences via Approximate Diagonalization. ICLR 2024.

AWARDS:

Spotlight.