
Time Series Data Augmentation for Deep Learning: A Survey


Outline

  • Background and Overview
  • Time Series Data Augmentation Methods
  • Discussion of Future Opportunities
  • Conclusions


Background

  • Time series
    • Wide applications: anomaly detection, classification, forecasting, etc.
    • Deep learning brings significant improvement in these tasks
  • Data Augmentation in deep learning
    • Enhance the size and quality of the training data
    • Has shown effectiveness in CV and NLP
    • Less attention has been paid to time series
  • Our focus
    • Review existing data augmentation methods for common time series tasks, including forecasting, anomaly detection, and classification
    • Highlight future research directions and opportunities


Taxonomy

  • Taxonomy of time series data augmentation
    • Systematically categorize existing methods into basic and advanced approaches


Time Domain

  • Key Idea:
    • Manipulate the original time series directly (most straightforward data augmentation)
    • Existing augmentation methods in CV can be leveraged for time series

[Um et al., 2017] Terry T Um, Franz M J Pfister, Daniel Pichler, Satoshi Endo, Muriel Lang, Sandra Hirche, Urban Fietzek, and Dana Kulic. Data augmentation of wearable sensor data for Parkinson’s disease monitoring using convolutional neural networks. In ACM ICMI 2017, pages 216–220, 2017.

  • Example[Um et al., 2017]:
    • Jittering: add noise; Cropping: apply window slicing; Scaling: change the magnitude; TimeWarp: perturb the temporal locations; Perm: permute the temporal order of within-window segments
    • Superior performance in time series classification
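As a minimal sketch, these time-domain transforms take only a few lines of numpy (the parameter values below are illustrative assumptions, not the settings of [Um et al., 2017]):

```python
import numpy as np

rng = np.random.default_rng(0)

def jitter(x, sigma=0.03):
    """Jittering: add Gaussian noise to each time step."""
    return x + rng.normal(0.0, sigma, size=x.shape)

def scaling(x, sigma=0.1):
    """Scaling: multiply the whole series by a random magnitude factor."""
    return x * rng.normal(1.0, sigma)

def window_slice(x, ratio=0.9):
    """Cropping: take a random window, then stretch it back to full length."""
    n = len(x)
    win = int(n * ratio)
    start = rng.integers(0, n - win + 1)
    sliced = x[start:start + win]
    return np.interp(np.linspace(0, win - 1, n), np.arange(win), sliced)

def permutation(x, n_segments=4):
    """Perm: split into segments and shuffle their order."""
    segments = np.array_split(x, n_segments)
    order = rng.permutation(n_segments)
    return np.concatenate([segments[i] for i in order])

x = np.sin(np.linspace(0, 4 * np.pi, 128))
augmented = [jitter(x), scaling(x), window_slice(x), permutation(x)]
```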


Frequency Domain

  • Key Idea:
    • Frequency analysis is a particularly useful tool for time series, showing how a signal's energy is distributed across frequency bands
    • Existing augmentation methods in time domain can be leveraged in frequency domain

[Gao et al., 2020] Jingkun Gao, Xiaomin Song, Qingsong Wen, Pichao Wang, Liang Sun, and Huan Xu. RobustTAD: Robust time series anomaly detection via decomposition and convolutional neural networks. In MileTS’20: 6th KDD Workshop on Mining and Learning from Time Series, pages 1–6, 2020.

  • Example: RobustTAD[Gao et al., 2020]
    • Perform a Fourier transform (FT) to obtain the amplitude and phase spectra
    • Apply perturbations in both spectra
    • Transform back to obtain multiple augmented versions
    • Superior performance in time series anomaly detection
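The three steps above can be sketched as follows; the perturbation magnitudes are illustrative assumptions, not RobustTAD's exact scheme:

```python
import numpy as np

rng = np.random.default_rng(0)

def freq_perturb(x, amp_sigma=0.05, phase_sigma=0.1):
    """Perturb the amplitude and phase spectra, then invert the FFT."""
    spec = np.fft.rfft(x)                                       # forward FT
    amp = np.abs(spec) * (1.0 + rng.normal(0.0, amp_sigma, spec.shape))
    phase = np.angle(spec) + rng.normal(0.0, phase_sigma, spec.shape)
    return np.fft.irfft(amp * np.exp(1j * phase), n=len(x))     # transform back

x = np.sin(np.linspace(0, 6 * np.pi, 256))
x_aug = freq_perturb(x)  # one augmented version; call again for more
```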


Time-Frequency Domain

  • Key Idea:
    • Classic frequency analysis cannot handle signals whose statistics vary over time
    • Time series can be understood better by studying time and frequency jointly rather than separately
    • Data augmentation may have more potential in the time-frequency domain

[Park et al., 2019] Daniel S Park, William Chan, Yu Zhang, Chung-Cheng Chiu, Barret Zoph, Ekin D Cubuk, and Quoc V. Le. SpecAugment: A simple data augmentation method for automatic speech recognition. In INTERSPEECH 2019, pages 2613–2617, 2019.

  • Example: SpecAugment[Park et al., 2019]
    • Modify short-time Fourier transform (STFT) based spectrogram by time warping, and masking blocks of consecutive time steps (vertical masks) and frequency channels (horizontal masks)
    • Superior performance in time series classification (speech recognition)
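A minimal sketch of the masking step, applied to a precomputed (frequency × time) spectrogram; the mask sizes are illustrative, and the time-warping step is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

def spec_mask(spec, max_t=10, max_f=8):
    """Zero out one random block of consecutive time steps (vertical mask)
    and one block of frequency channels (horizontal mask)."""
    spec = spec.copy()
    n_f, n_t = spec.shape
    t = rng.integers(1, max_t + 1)        # width of the time mask
    t0 = rng.integers(0, n_t - t + 1)
    f = rng.integers(1, max_f + 1)        # height of the frequency mask
    f0 = rng.integers(0, n_f - f + 1)
    spec[:, t0:t0 + t] = 0.0              # vertical (time) mask
    spec[f0:f0 + f, :] = 0.0              # horizontal (frequency) mask
    return spec

spec = np.abs(rng.normal(size=(32, 64)))  # stand-in for an STFT magnitude
masked = spec_mask(spec)
```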


Decomposition-based Methods

  • Key Idea:
    • Decompose time series into deterministic and stochastic components
    • Add transformation/changes in stochastic part to generate new data

[Bergmeir et al., 2016] Christoph Bergmeir, Rob J. Hyndman, and José M. Benítez. Bagging exponential smoothing methods using STL decomposition and Box–Cox transformation. International Journal of Forecasting, 32(2):303–312, 2016.

[Bandara et al., 2021] Kasun Bandara, Hansika Hewamalage, Yuan-Hao Liu, Yanfei Kang, and Christoph Bergmeir. Improving the accuracy of global forecasting models using time series data augmentation. Pattern Recognition, 108148, 2021.

  • Example[Bergmeir et al., 2016]:
    • Perform STL/trend filtering to extract the residual
    • Apply moving block bootstrapping to the residual
    • Add back to obtain multiple bootstrapped versions
    • Superior performance in time series forecasting[Bandara et al., 2021]
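A hedged sketch of the pipeline, substituting a simple moving-average detrend for the STL/trend filtering used in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def moving_block_bootstrap(residual, block_size=24):
    """Resample overlapping blocks of the residual, with replacement,
    preserving short-range autocorrelation within each block."""
    n = len(residual)
    blocks = []
    while sum(len(b) for b in blocks) < n:
        start = rng.integers(0, n - block_size + 1)
        blocks.append(residual[start:start + block_size])
    return np.concatenate(blocks)[:n]

def bootstrap_series(x, trend_window=25, n_samples=3):
    """Detrend with a moving average (a stand-in for STL), bootstrap the
    residual, and add the trend back to obtain new series."""
    kernel = np.ones(trend_window) / trend_window
    trend = np.convolve(x, kernel, mode="same")
    residual = x - trend
    return [trend + moving_block_bootstrap(residual) for _ in range(n_samples)]

x = np.sin(np.linspace(0, 8 * np.pi, 200)) + np.linspace(0, 1, 200)
samples = bootstrap_series(x)
```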


Statistical Generative Models

  • Key Idea:
    • In statistical view, a time series can be considered a random realization of some underlying data generation process

[Smyl and Kuber, 2016] Slawek Smyl and Karthik Kuber. Data preprocessing and augmentation for multiple short time series forecasting with recurrent neural networks. In 36th International Symposium on Forecasting, June 2016.

  • Example[Smyl and Kuber, 2016]:
    • Sample parameters and forecast paths from the statistical model LGT (Local and Global Trend)
    • Thousands of samples of parameters and forecast paths are saved and subsampled, and the aggregated results are used as neural network features
    • Superior performance in time series forecasting
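The LGT model itself is beyond a short sketch; the following illustrates the same idea with a much simpler AR(1) stand-in: fit a statistical model to the series, then sample fresh forecast paths from it:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_ar1(x):
    """Least-squares fit of x[t] = c + phi * x[t-1] + noise."""
    X = np.column_stack([np.ones(len(x) - 1), x[:-1]])
    (c, phi), *_ = np.linalg.lstsq(X, x[1:], rcond=None)
    sigma = np.std(x[1:] - X @ np.array([c, phi]))
    return c, phi, sigma

def simulate_paths(c, phi, sigma, x0, n_steps, n_paths):
    """Sample new realizations (forecast paths) from the fitted process."""
    paths = np.empty((n_paths, n_steps))
    paths[:, 0] = x0
    for t in range(1, n_steps):
        paths[:, t] = c + phi * paths[:, t - 1] + rng.normal(0, sigma, n_paths)
    return paths

# fit on a synthetic AR(1) series, then sample forecast paths from it
x = np.empty(300)
x[0] = 0.0
for t in range(1, 300):
    x[t] = 0.5 + 0.7 * x[t - 1] + rng.normal(0, 0.1)
c, phi, sigma = fit_ar1(x)
paths = simulate_paths(c, phi, sigma, x[-1], n_steps=50, n_paths=5)
```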


Embedding Space

  • Key Idea:
    • Perform augmentation transformation in learned embedding space instead of input space
    • More effective due to the manifold unfolding in learned embedding/latent space

[DeVries and Taylor, 2017] Terrance DeVries and Graham W. Taylor. Dataset augmentation in feature space. In ICLR 2017, pages 1–12, Toulon, 2017.

  • Example[DeVries and Taylor, 2017]:
    • Use a sequence autoencoder to construct an embedding space for time series data
    • Interpolation and extrapolation are applied in embedding space to generate new samples
    • Superior performance in time series classification
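Once an encoder has produced embedding vectors (the sequence autoencoder itself is omitted here), the interpolation/extrapolation step is simple; new samples are obtained by decoding the resulting vectors:

```python
import numpy as np

def interpolate(e1, e2, lam=0.5):
    """Move from embedding e1 toward e2 (lam in [0, 1] stays in between)."""
    return e1 + lam * (e2 - e1)

def extrapolate(e1, e2, lam=0.5):
    """Push e1 away from e2, beyond the original sample."""
    return e1 + lam * (e1 - e2)

# toy 2-d embeddings of two same-class series
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
mid = interpolate(e1, e2)       # halfway between the two embeddings
out = extrapolate(e1, e2, 0.5)  # beyond e1, away from e2
```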


Deep Generative Models

  • Key Idea:
    • Adopt DGMs to learn the data distribution and generate near-realistic data with variations
    • Generative adversarial networks (GANs) are the most popular

[Yoon et al., 2019] Jinsung Yoon, Daniel Jarrett, and Mihaela van der Schaar. Time-series generative adversarial networks. In NeurIPS 2019, pages 5508–5518, 2019.

  • Example: TimeGAN[Yoon et al., 2019]
    • Combine unsupervised GAN with conditional temporal dynamics afforded by supervised autoregressive models
    • Can generate realistic time-series data based on measures of similarity and predictive ability


Automated Data Augmentation

  • Key Idea:
    • Automatically search for augmentation policies through reinforcement learning, meta learning, or evolutionary search
    • Can yield strong gains over common heuristic data augmentation methods

[Cheung and Yeung, 2021] Tsz-Him Cheung and Dit-Yan Yeung. MODALS: Modality-agnostic automated data augmentation in the latent space. In ICLR 2021.

  • Example: MODALS[Cheung and Yeung, 2021]
    • Adopt an evolutionary search strategy based on population based augmentation (PBA) to find an optimal composition of latent-space transformations for data augmentation
    • Superior performance in time series classification
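A toy illustration of the search idea: evolve a two-parameter augmentation policy against a stand-in score function. MODALS' actual PBA-based search and latent-space transforms are far richer; this only shows the select-mutate loop:

```python
import numpy as np

rng = np.random.default_rng(0)

def evolve_policy(score_fn, pop_size=8, generations=5, mutation=0.02):
    """Evolve a 2-parameter policy (e.g. jitter sigma, scaling sigma):
    keep the top half of the population, mutate it, and repeat."""
    pop = rng.uniform(0.0, 0.2, size=(pop_size, 2))
    for _ in range(generations):
        scores = np.array([score_fn(p) for p in pop])
        elite = pop[np.argsort(scores)[-(pop_size // 2):]]          # select
        children = np.clip(elite + rng.normal(0.0, mutation, elite.shape),
                           0.0, None)                                # mutate
        pop = np.concatenate([elite, children])
    return pop[np.argmax([score_fn(p) for p in pop])]

# hypothetical proxy score preferring moderate jitter and little scaling
# noise; a real system would score each policy by validation accuracy
best = evolve_policy(lambda p: -(p[0] - 0.05) ** 2 - p[1] ** 2)
```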


Future Opportunities (1/2)

  • Augmentation in (Time-)Frequency Domain
    • Existing schemes in (time-)frequency domain (mainly based on FT and STFT) are limited
    • Wavelet transforms (CWT, DWT, MODWT) could be an interesting direction
  • Augmentation with (Deep) Gaussian Processes
    • GP/DGP induces distributions over functions, while time series can be viewed as functions of time
    • The choice of kernel in GP/DGP allows one to place assumptions on time series data, such as smoothness, scale, periodicity, and noise level
    • Few studies have investigated GP/DGP for time series data augmentation, which could have great potential
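For intuition, sampling smooth random functions from a GP prior takes only a few lines; the kernel (here an RBF, an illustrative choice) is exactly where the smoothness/scale assumptions on the series are encoded:

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(t, length_scale=0.2):
    """Squared-exponential kernel: encodes a smoothness assumption."""
    d = t[:, None] - t[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_prior_samples(t, n_samples=3, nugget=1e-6):
    """Draw random functions from a zero-mean GP prior via Cholesky;
    the small nugget keeps the covariance numerically positive definite."""
    K = rbf_kernel(t) + nugget * np.eye(len(t))
    L = np.linalg.cholesky(K)
    return (L @ rng.normal(size=(len(t), n_samples))).T

t = np.linspace(0.0, 1.0, 100)
samples = gp_prior_samples(t)  # three smooth candidate augmentations
```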


Future Opportunities (2/2)

  • Augmentation with Deep Generative Models
    • Existing DGMs for time series data augmentation are mainly GANs
    • Deep autoregressive networks (DARNs) exhibit a natural fit
    • Normalizing flows (NFs) show success in modeling time series stochastic processes
    • How to leverage other DGMs like DARNs, NFs, and VAEs remains an exciting future opportunity
  • Augmentation Selection and Combination
    • Existing schemes in time series are often heuristic
    • Designing effective and efficient selection/combination strategies could be an interesting direction (e.g., customized reinforcement/meta learning)



Conclusions

  • Provide a survey on time series data augmentation in deep learning
    • Comprehensively review data augmentation methods for common time series tasks like forecasting, anomaly detection, and classification
    • Organize existing methods in a taxonomy with basic and advanced approaches
    • Summarize representative methods in each category
    • Highlight future research directions

Thanks!

Q&A