1 of 12

Towards Understanding Automated Deep Learning

Prof. Dr. Marius Lindauer

@LindauerMarius

@AutoML_org

These slides are available at www.automl.org/talks --- all references are hyperlinks

AutoDL@UMLOP@PPSN’20

M. Lindauer

Leibniz University Hannover

2 of 12

AutoDL: Automated Deep Learning


Optimizer

Validation performance (e.g., accuracy)

AutoDL Tool

Training Data

Validation Data


3 of 12

Auto-PyTorch [Mendoza et al. 2019, Zimmer et al. 2020]


  • Tabular data and image data
  • Very efficient because of meta-learning and multi-fidelity optimization


4 of 12

Characteristics of the Optimization Problem in AutoML/AutoDL

  1. Complex search space:
    • Integer, float, categorical, conditional structures
  2. Black-box function
    • No analytic form known
    • “Performance” can only be queried
    • However, more information is available than in classical black-box optimization
  3. Only few function evaluations affordable
    • A single function evaluation can take anywhere from minutes to hours (or even longer)
  4. Stochastic returns
    • Training of a DNN is non-deterministic (e.g., SGD)
    • Thus, returned performance can vary
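
The four properties above can be made concrete with a minimal sketch. The hyperparameter names (`num_layers`, `learning_rate`, `optimizer`, `momentum`) and the toy `evaluate` function are hypothetical stand-ins chosen for illustration; they are not the actual Auto-PyTorch search space, and `evaluate` only mimics an expensive, noisy training run.

```python
import math
import random


def sample_configuration(rng):
    """Draw one configuration from a small mixed, conditional search space."""
    config = {
        "num_layers": rng.randint(1, 5),             # integer
        "learning_rate": 10 ** rng.uniform(-4, -1),  # float, sampled on a log scale
        "optimizer": rng.choice(["sgd", "adam"]),    # categorical
    }
    # Conditional hyperparameter: momentum is only active when SGD is chosen.
    if config["optimizer"] == "sgd":
        config["momentum"] = rng.uniform(0.0, 0.99)
    return config


def evaluate(config, rng):
    """Toy black box (assumption): returns a noisy validation error, lower is better.

    A real evaluation would train a network for minutes to hours.
    """
    base = 0.1 + 0.05 * (math.log10(config["learning_rate"]) + 2.5) ** 2
    noise = rng.gauss(0.0, 0.01)  # stochastic return, e.g. due to SGD
    return base + noise


rng = random.Random(0)
config = sample_configuration(rng)
# Repeated evaluations of the same configuration give different scores.
scores = [evaluate(config, rng) for _ in range(3)]
```

The optimizer only ever sees `config` going in and a score coming out, which is what makes the problem black-box; the repeated calls show why a single score is a noisy estimate.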


5 of 12

LCBench: Learning Curve Benchmark [Zimmer et al. 2020]

  • Diverse set of 35 datasets from OpenML
  • 7 hyperparameters
    • 3 integers
    • 4 floats
  • SGD + cosine annealing
  • 2000 configurations
  • 3 repeated runs each
  • ⇒ 3 × 35 × 2000 = 210,000 neural network training runs


  • Other benchmarks: NAS-Bench-101, NAS-Bench-1Shot1, NAS-Bench-201, NAS-Bench-301


6 of 12

Heatmap & Portfolio


  • There is no single configuration that rules them all.
  • Surprisingly, a small portfolio of configurations performs quite well.
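
One way to see why a small portfolio can work is a greedy construction over a matrix of validation errors. This is a sketch under assumptions: the slide does not say how the portfolio was built, the greedy rule is one common choice, and the error values below are made up for illustration.

```python
def portfolio_error(errors, portfolio):
    """Mean over datasets of the lowest error reached by any portfolio member."""
    n_datasets = len(errors[0])
    best_per_dataset = [min(errors[c][d] for c in portfolio) for d in range(n_datasets)]
    return sum(best_per_dataset) / n_datasets


def greedy_portfolio(errors, size):
    """Repeatedly add the configuration that lowers the portfolio error the most."""
    portfolio = []
    candidates = set(range(len(errors)))
    while len(portfolio) < size and candidates:
        best = min(
            sorted(candidates),  # sorted for deterministic tie-breaking
            key=lambda c: portfolio_error(errors, portfolio + [c]),
        )
        portfolio.append(best)
        candidates.remove(best)
    return portfolio


# Hypothetical validation errors: rows = configurations, columns = datasets.
errors = [
    [0.05, 0.40, 0.35],  # config 0: strong on dataset 0 only
    [0.30, 0.12, 0.33],  # config 1: strong on dataset 1 only
    [0.32, 0.38, 0.08],  # config 2: strong on dataset 2 only
    [0.20, 0.22, 0.21],  # config 3: mediocre everywhere, best on average
]
portfolio = greedy_portfolio(errors, size=3)
```

In this toy matrix no row is best on every dataset, and the portfolio error can only stay the same or drop as members are added, since each dataset is scored by its best member.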


7 of 12


Landscape?

  • Pushak & Hoos [2018] showed for algorithm configuration that landscapes are more benign than expected
    • Uni-modal
    • Convex
    • Relatively “smooth”
  • → Does that also apply to AutoDL?
    • Plots from fANOVA (similar to partial dependence plots) show similar characteristics
      • Approximated via Random Forest
    • Example: Learning rate of PPO on cartpole (RL problem) [Lindauer et al. 2019]
    • Low effective dimensionality
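
The idea of inspecting a one-dimensional marginal and checking it for uni-modality can be sketched as follows. This is an illustration under assumptions: fANOVA computes marginals from a random forest fitted to observed data, whereas here a hypothetical closed-form `toy_error` stands in for the fitted model, and the grids are made up.

```python
from itertools import product


def marginal(model, grids, name):
    """For each value of `name`, average the model over all other hyperparameters."""
    others = [k for k in grids if k != name]
    curve = []
    for value in grids[name]:
        total, count = 0.0, 0
        for combo in product(*(grids[k] for k in others)):
            config = dict(zip(others, combo))
            config[name] = value
            total += model(config)
            count += 1
        curve.append(total / count)
    return curve


def is_unimodal(curve):
    """True if the curve weakly decreases to its minimum and weakly increases after."""
    i = curve.index(min(curve))
    left_ok = all(a >= b for a, b in zip(curve[:i], curve[1:i + 1]))
    right_ok = all(a <= b for a, b in zip(curve[i:], curve[i + 1:]))
    return left_ok and right_ok


def toy_error(config):
    # Hypothetical stand-in for a fitted surrogate model.
    return (config["log_lr"] + 3.0) ** 2 + 0.5 * config["dropout"]


grids = {"log_lr": [-5, -4, -3, -2, -1], "dropout": [0.0, 0.25, 0.5]}
lr_curve = marginal(toy_error, grids, "log_lr")
benign = is_unimodal(lr_curve)
```

A curve with several separated dips, such as `[1, 0, 1, 0, 1]`, fails the same check, which is the kind of landscape the slide suggests is less common than expected.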


8 of 12

Multi-Fidelity Optimization


  • Only the most promising configurations are evaluated on the full budget
  • → Makes AutoDL very efficient
    • Competitive with gradient-based NAS
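
A minimal sketch of this idea, assuming successive halving as the multi-fidelity scheme (the slide itself does not name a specific algorithm). The `toy_error` function, the budget values, and the reduction factor `eta` are hypothetical choices for illustration.

```python
def successive_halving(configs, evaluate, min_budget=1, max_budget=27, eta=3):
    """Evaluate everything cheaply, keep the best 1/eta, and repeat with eta times the budget."""
    survivors = list(configs)
    budget = min_budget
    history = []  # (budget, number of configurations evaluated at that budget)
    while True:
        scored = sorted(survivors, key=lambda c: evaluate(c, budget))
        history.append((budget, len(scored)))
        if budget >= max_budget or len(scored) == 1:
            return scored[0], history
        survivors = scored[: max(1, len(scored) // eta)]
        budget = min(budget * eta, max_budget)


def toy_error(config, budget):
    # Hypothetical stand-in for validation error after `budget` epochs (lower is better).
    return abs(config - 0.3) + 1.0 / budget


configs = [i / 26 for i in range(27)]
best, history = successive_halving(configs, toy_error)
# Total budget spent across all rungs.
spent = sum(budget * n for budget, n in history)
```

With 27 configurations and budgets 1, 3, 9, 27, this schedule spends 108 budget units in total, compared with 27 × 27 = 729 for evaluating every configuration on the full budget.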


9 of 12

Correlation between Budgets (e.g., #Epochs)

  • Kendall-tau between configurations on different budgets
  • On some datasets, weak correlation
  • On some datasets, strong correlation
  • → How can we effectively determine this?
  • → Is correlation really what we care about?
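
The rank correlation between two budgets can be sketched in a few lines. This computes the simple tau-a variant without tie correction, and the error values are made up for illustration; the last line shows one possible reading of the second question, namely checking only whether both budgets agree on the best configuration.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs, no tie correction."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical validation errors of five configurations on two budgets.
low_budget = [0.30, 0.25, 0.40, 0.20, 0.35]
high_budget = [0.22, 0.18, 0.15, 0.12, 0.28]

tau = kendall_tau(low_budget, high_budget)
# Alternative question: does the low budget already identify the final winner?
same_winner = low_budget.index(min(low_budget)) == high_budget.index(min(high_budget))
```

On this toy data tau is only 0.4, yet both budgets pick the same best configuration, which illustrates why overall rank correlation may not be the quantity that matters most for the optimizer.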


10 of 12

Hyperparameter Importance across Budgets

  • Which hyperparameters influence performance the most?
  • Surprisingly stable across budgets
  • Different scores from different importance metrics
    • Global importance and local importance partially disagree
  • If the DNN is trained for longer:
    • Additional layers can be exploited better
    • The learning rate is less important (if we use a good learning rate scheduler)
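
A local importance score can be sketched in the spirit of LPI: vary one hyperparameter at a time around a fixed configuration and compare the resulting variances. This is a simplified illustration under assumptions, not the published implementation; in practice the predictions would come from a surrogate model fitted to observed runs, while here the hypothetical `toy_error`, the incumbent, and the grids are made up.

```python
from statistics import pvariance


def local_importance(model, incumbent, grids):
    """Fraction of one-at-a-time variance around the incumbent due to each hyperparameter."""
    variances = {}
    for name, values in grids.items():
        # Change only `name`, keep every other hyperparameter at the incumbent's value.
        predictions = [model({**incumbent, name: value}) for value in values]
        variances[name] = pvariance(predictions)
    total = sum(variances.values())  # assumed non-zero in this sketch
    return {name: var / total for name, var in variances.items()}


def toy_error(config):
    # Hypothetical surrogate: very sensitive to the learning rate, barely to depth.
    return (config["log_lr"] + 3.0) ** 2 + 0.01 * (config["num_layers"] - 3) ** 2


incumbent = {"log_lr": -3.0, "num_layers": 3}
grids = {"log_lr": [-5, -4, -3, -2, -1], "num_layers": [1, 2, 3, 4, 5]}
importance = local_importance(toy_error, incumbent, grids)
```

Because the score depends on where the incumbent sits, it can differ from a global measure such as fANOVA, which averages over the whole search space.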


fANOVA [Hutter et al. 2014]

LPI [Biedenkapp et al. 2018]


11 of 12

Take Away

  • AutoDL is a complex and expensive problem
  • Multi-fidelity optimization is one of the state-of-the-art approaches
    • But it opens up new challenges
  • We have only just started to understand the real AutoDL problem
    • But we are working on it ;-)


Optimizer

Validation performance (e.g., accuracy)

AutoDL Tool


12 of 12

Thank you!


@LindauerMarius

@AutoML_org
