Robustifying State-space Models for Long Sequences via Approximate Diagonalization
Scientific Achievement
We improve the prediction accuracy and robustness of state-space models (SSMs) for forecasting very long sequences. The improvements come from a novel "perturb-then-diagonalize" (PTD) methodology that resolves instabilities in existing SSM initialization schemes.
Significance and Impact
Understanding and predicting sequential data is crucial in science and engineering. However, modeling long-term and complex temporal dependencies is challenging. In this work, we improve state-space models (SSMs) so that they can handle very long sequences more effectively.
Technical Approach
Compared with transformer models and state-of-the-art SSMs, our proposed PTD methodology yields SSMs with improved robustness and accuracy. For example, the accuracy of the S4D model drops drastically on noisy inputs (a), whereas the S4-PTD model remains highly resilient to noisy inputs (b) at every point during training and at inference time.
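The name "perturb-then-diagonalize" can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name perturb_then_diagonalize, the perturbation size eps, the Gaussian perturbation, and the Jordan-block stand-in matrix are all assumptions chosen for illustration. The sketch only shows the literal idea that a matrix which cannot be diagonalized stably is first perturbed slightly and then eigendecomposed.

```python
import numpy as np

def perturb_then_diagonalize(A, eps=1e-3, seed=0):
    """Hypothetical helper: add a small random perturbation to A, then eigendecompose.

    eps is the size of the perturbation relative to the spectral norm of A
    (an assumed parameterization, not taken from the source).
    """
    rng = np.random.default_rng(seed)
    E = rng.standard_normal(A.shape)
    # Rescale E so that ||E||_2 = eps * ||A||_2.
    E *= eps * np.linalg.norm(A, 2) / np.linalg.norm(E, 2)
    A_pert = A + E
    # Eigendecomposition of the perturbed matrix: A_pert @ V = V @ diag(eigvals).
    eigvals, V = np.linalg.eig(A_pert)
    return A_pert, eigvals, V

# Stand-in state matrix (assumption): an 8x8 Jordan block with eigenvalue -1.
# It is defective, so it has no exact diagonalization.
n = 8
A = -np.eye(n) + np.diag(np.ones(n - 1), k=1)

A_pert, eigvals, V = perturb_then_diagonalize(A)

# Relative size of the perturbation (equals eps by construction).
rel_perturbation = np.linalg.norm(A_pert - A, 2) / np.linalg.norm(A, 2)
# Relative residual of the eigendecomposition of the perturbed matrix.
residual = np.linalg.norm(A_pert @ V - V @ np.diag(eigvals), 2) / np.linalg.norm(A_pert, 2)
# Conditioning of the eigenvector matrix, which governs numerical stability.
cond_V = np.linalg.cond(V)

print(f"relative perturbation: {rel_perturbation:.2e}")
print(f"eigen-residual:        {residual:.2e}")
print(f"cond(V):               {cond_V:.2e}")
```

In this sketch, eps trades off closeness to the original matrix against the conditioning of the eigenvector matrix V. How the method described here chooses the perturbation is not stated in this summary.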
PI(s)/Facility Lead(s): Lenny Oliker (LBL)
Collaborating Institutions: ICSI, UC Berkeley, Cornell University
ASCR Program: SciDAC RAPIDS2
ASCR PM: Kalyan Perumalla (SciDAC RAPIDS2)
[Figure: panel (a) S4D, labeled "brittle"; panel (b) S4-PTD, labeled "robust"]