Agenda | Duration (CST)
Introduction + AutoGluon Tabular | 2:00 PM – 2:55 PM
Break | 2:55 PM – 3:05 PM
AutoGluon Multimodal | 3:05 PM – 4:00 PM
Break | 4:00 PM – 4:10 PM
AutoGluon Time Series | 4:10 PM – 4:50 PM
Additional Q&A + Feedback | 4:50 PM – 5:00 PM
Workshop Website
*Note on Hands-on Notebooks
© 2022, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AutoML for Time Series with AutoGluon
AutoGluon: Empowering (Multimodal) AutoML for the Next 10 Million Users at NeurIPS 2022
December 2022 | New Orleans, LA
Caner Türkmen, Sr. Applied Scientist @ Amazon Web Services
Time Series Forecasting: A Short Problem Definition
[Figure: a time series split at the forecast origin into the observed "past" and the unknown "future"]
[Figure: the same setup viewed as supervised learning, with the observed past as "features" and the future to be predicted as "labels"]
Uncertainty quantification: applications such as the newsvendor problem, or value at risk in finance, call for (conditional) quantiles of the predictive distribution, not just point forecasts.
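To make the quantile connection concrete, here is a minimal stdlib-Python sketch (function names are my own) of the pinball loss, whose expected minimizer is the conditional quantile, and of the newsvendor critical ratio:

```python
# Quantile (pinball) loss: the scoring rule minimized in expectation by the
# tau-th conditional quantile of the predictive distribution.
def pinball_loss(y_true, y_pred, tau):
    diff = y_true - y_pred
    return max(tau * diff, (tau - 1) * diff)

# Newsvendor: with underage cost cu (lost profit per unit of unmet demand) and
# overage cost co (cost per leftover unit), the optimal order quantity is the
# demand quantile at the critical ratio tau = cu / (cu + co).
def newsvendor_quantile(cu, co):
    return cu / (cu + co)

# Example: margin of 3 per unit sold, holding cost of 1 per leftover unit.
tau = newsvendor_quantile(cu=3.0, co=1.0)  # 0.75: order the 75th percentile
```

The asymmetry of the loss is the point: under-predicting when tau = 0.9 costs 0.9 per unit of error, while over-predicting costs only 0.1.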
Time Series Forecasting: Other Features
[Figure: a time series with the past/future split and three kinds of additional inputs, numbered 1–3, described below]
1. "Known" time-varying covariates (related time series): covariates whose values, or surrogates thereof, will be known into the future at prediction time. Examples: weather (forecasts), dummy variables for holidays, promotions (plans).
2. Other time-varying covariates / related time series: related time series whose future values will not be known at prediction time. Examples: prices of related assets in finance, demand for related items in demand forecasting.
3. Item metadata / static features: features describing the item that do not vary over time. Examples: the category of an item in the catalogue, the industry of a financial asset.
Global vs. Local
Very often, a data set comprises multiple "items" (individual time series).
In local models, we fit a separate set of parameters (θ1, θ2, …) to the dynamics of each time series (ETS, ARIMA, etc.).
Global models share a single parameter vector θ across different time series and learn common dynamics (e.g., neural networks).
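The local/global distinction can be sketched in a few lines of toy Python. The "model" here is just a constant level fit by the mean, a deliberate simplification (real local models would be ETS or ARIMA, global ones neural networks):

```python
# Toy panel of three items (three separate time series).
panel = {
    "item_1": [10.0, 11.0, 9.0],
    "item_2": [100.0, 98.0, 102.0],
    "item_3": [55.0, 54.0, 56.0],
}

# Local: one parameter theta_i fit independently per series.
local_params = {item: sum(ys) / len(ys) for item, ys in panel.items()}

# Global: a single shared parameter theta fit on all observations pooled.
all_obs = [y for ys in panel.values() for y in ys]
global_param = sum(all_obs) / len(all_obs)
```

Even in this toy form, the trade-off is visible: local parameters track each item's level exactly, while the global parameter only captures what the items have in common; real global models recover per-item behavior by conditioning on item features.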
Machine learning methods are increasingly used in real-world forecasting use cases* (*the majority already power AutoGluon-TimeSeries)
Local: Naive, Naive-1, Seasonal Naive, STL-AR, ETS, ARIMA, Theta, Prophet
Global (deep learning): DeepAR, TFT, MQ-CNN/RNN, Feed-Forward Networks, DeepState, Informer, N-BEATS, TCNs
Global (tree-based): XGBoost, LightGBM, CatBoost, Rotbaum
Machine learning methods are taking over forecasting, but remain the subject of ongoing debate.
We’ve created monsters
M4 winner: “ensemble of specialists” and other levels of ensembling on ES-RNN (Smyl, 2020)
M5 winners: most reported winners used combinations of ML (LightGBM, NN forecasters) (Makridakis et al., 2022)
AutoML: something like 5–10 lines of code in, a highly accurate forecasting model (potentially a monster) out.
AutoML
- Hyperparameter Optimization: finding the best configuration of hyperparameters for a given model, including regularization, training, and model architecture. Together with model selection, this is the CASH problem (Combined Algorithm Selection and Hyperparameter optimization).
- Model Selection: selecting the model likeliest to generalize with high performance to test data.
- Model Ensembles: combining trained models, and training new models, for even higher performance.
- Thoughtful Data Preprocessing: augmenting/transforming data to boost model performance.
- Battle-tested "Presets": collections of default hyperparameters, hyperparameter ranges, and preprocessing steps.
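As a rough illustration of CASH, here is a toy random-search loop in stdlib Python; the search space and scoring function are invented stand-ins, not AutoGluon internals:

```python
import random

# CASH: jointly sample a model family AND its hyperparameters, keep the
# configuration with the best validation score (lower is better here).
search_space = {
    "ets": {"seasonal_periods": [1, 7, 12]},
    "deepar": {"num_layers": [1, 2, 3], "num_cells": [10, 20, 30]},
}

def validation_score(model, config):
    # Stand-in for "train on the past, evaluate on a held-out window".
    return random.random()

def random_search_cash(n_trials, seed=0):
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        model = rng.choice(sorted(search_space))
        config = {name: rng.choice(values)
                  for name, values in search_space[model].items()}
        score = validation_score(model, config)
        if best is None or score < best[0]:
            best = (score, model, config)
    return best

best_score, best_model, best_config = random_search_cash(n_trials=20)
```

Smarter strategies (Bayesian optimization, bandit-based schedulers) replace the uniform sampling, but the joint model-plus-hyperparameters search is the defining feature.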
Introducing AutoGluon-TimeSeries
TimeSeries
- Model zoo including ETS, ARIMA, Prophet, and many GluonTS models
- Hyperparameter optimization with Ray Tune or plain random search (random search is the default backend)
- Built on AutoGluon's familiar API
- Tabular-model backend (XGBoost, CatBoost, LightGBM)
[Slide labels: "Available as of v0.6", "Coming in v0.7"]
[Roadmap: Time Series Problems, Models, Model Selection and HPO, Ensembling; due Q1 2023]
AutoGluon-TimeSeries
- Hyperparameter Optimization: improved HPO with stabilized validation, e.g. multi-window backtesting and prioritizing models with faster inference.
- Model Selection: a model zoo encompassing many common benchmark models in time series analysis, with a mix of local, naive, and global models.
- Model Ensembles: powerful ensembles of time-series models that address unique challenges of temporal data, such as quantile model aggregation. [coming in v0.7] Stack ensembles.
- Thoughtful Data Preprocessing: [coming in v0.7] default data transformations known to significantly increase forecast accuracy.
- Battle-tested "Presets": presets and default hyperparameters for deep-learning-based models, tuned over a wide set of benchmark data sets.
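The multi-window backtesting idea mentioned above can be sketched as follows; this is a simplified stand-in, not AutoGluon's internal splitter:

```python
# Multi-window backtesting: evaluate on several held-out windows at the end
# of the series instead of a single train/validation split, which stabilizes
# validation scores used for model selection and HPO.

def backtest_windows(n_obs, prediction_length, num_windows):
    """Return (train_end, test_start, test_end) index triples; the test
    windows tile the last num_windows * prediction_length observations."""
    splits = []
    for w in range(num_windows, 0, -1):
        test_end = n_obs - (w - 1) * prediction_length
        test_start = test_end - prediction_length
        splits.append((test_start, test_start, test_end))
    return splits

# Example: 100 observations, forecast horizon 10, 3 backtest windows.
splits = backtest_windows(n_obs=100, prediction_length=10, num_windows=3)
# -> [(70, 70, 80), (80, 80, 90), (90, 90, 100)]
```

Averaging a model's score over the three windows is less sensitive to a single lucky or unlucky validation period than one split at index 90.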
Time series forecasting in a few lines of code
Time Series Data Frame
Inside fit()
Fine-grained control of component models
Hyperparameter optimization
pip install "autogluon>=0.6"
AutoGluon Website
NeurIPS workshop Website
Q&A
atturkm@amazon.com
References
Hyperparameter Optimization: Too many DeepARs, too little time 😱
[Table: DeepAR's hyperparameter grid, e.g. num_layers ∈ {1, 2, 3}, num_cells ∈ {10, 20, 30}, cell_type ∈ {GRU, LSTM}, dropoutcell_type ∈ {Zoneout, RNNZoneOut, VariationalDropout, VariationalZoneout}, embedding_dimension ∈ {10, 20, 30}, …]
Hyperparameter Optimization (HPO)
How do we select hyperparameters such that out-of-sample performance is optimized?
Time Series HPO: AutoETS and AutoARIMA
(Hyndman and Athanasopoulos, 2018)
Bagging
[Figure: bagging illustration; image source: Wikipedia]
Multi-Layer Stack Ensembling
Cross-Validation
Train k different copies of the model, each with a different chunk of the data held out.
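The hold-out scheme just described can be sketched as a plain k-fold splitter. Note that for temporal data, forward-chaining splits such as multi-window backtesting are preferred, since plain k-fold lets training data come from the validation window's future:

```python
# k-fold cross-validation: split the data into k contiguous chunks and
# train k copies of the model, each with a different chunk held out.

def kfold_indices(n_obs, k):
    """Return k (train_indices, val_indices) pairs covering range(n_obs)."""
    fold_sizes = [n_obs // k + (1 if i < n_obs % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_obs) if i < start or i >= start + size]
        folds.append((train, val))
        start += size
    return folds

splits = kfold_indices(n_obs=10, k=5)  # 5 folds of 2 validation points each
```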
Ensembles
Ensembles of Forecasts: Unique Challenges
1 (Gneiting et al., 2005) 2 (Kim et al., 2021)
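One concrete challenge is combining quantile forecasts from several models. A simple baseline is quantile averaging ("Vincentization"), sketched below with uniform weights; this is a toy stand-in, not AutoGluon's actual ensembler:

```python
# Combine probabilistic forecasts by averaging the member models'
# quantile predictions level by level (Vincentization). Averaging in
# quantile space keeps the combined forecast sharp, unlike averaging
# the predictive CDFs, which tends to widen intervals.

def vincentize(quantile_forecasts):
    """quantile_forecasts: list of dicts mapping quantile level -> value."""
    levels = quantile_forecasts[0].keys()
    n = len(quantile_forecasts)
    return {q: sum(f[q] for f in quantile_forecasts) / n for q in levels}

# Two models' quantile forecasts for the same future time step.
model_a = {0.1: 8.0, 0.5: 10.0, 0.9: 12.0}
model_b = {0.1: 6.0, 0.5: 11.0, 0.9: 16.0}
combined = vincentize([model_a, model_b])
# -> {0.1: 7.0, 0.5: 10.5, 0.9: 14.0}
```

Learning non-uniform weights, e.g. by minimizing pinball loss on a backtest window, is one step toward the ensembling approaches the citations above discuss.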