Vanishing Gradient Problem
Training Neural Networks
[Figure: network diagram with outputs y1, y2, …, yM]
Small gradients: learns very slowly
Slide credit: Hung-yi Lee – Deep Learning Tutorial
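A small sketch of why gradients can shrink with depth. It assumes a chain of one-unit sigmoid layers with identical weights, which is a simplification for illustration and not the network drawn on the slide; `backprop_factor`, `weight`, and `z` are hypothetical names.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def backprop_factor(depth, weight=1.0, z=0.0):
    """Gradient scaling through `depth` sigmoid layers of one unit each.

    Each layer multiplies the backpropagated gradient by
    weight * sigmoid'(z), and sigmoid'(z) = s * (1 - s) is at most 0.25.
    """
    s = sigmoid(z)
    return (weight * s * (1.0 - s)) ** depth


# The factor shrinks geometrically as more layers are stacked.
for depth in (1, 5, 10):
    print(depth, backprop_factor(depth))
```

Under these assumptions the factor is 0.25 per layer, so layers far from the output receive much smaller gradients than layers close to it.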
CS 404/504, Fall 2021
Generalization
Overfitting
Generalization
Picture from: http://cs231n.github.io/assets/nn1/layer_sizes.jpeg
Regularization: Weight Decay
Regularization
Data loss
Regularization loss
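A minimal sketch of how the data loss and regularization loss might combine. The slide's own formula did not survive extraction, so the L2 form of the penalty is an assumption, and `l2_penalty`, `total_loss`, `sgd_step`, and `lam` are illustrative names.

```python
def l2_penalty(weights, lam):
    """Regularization loss: lam * sum of squared weights (assumed L2 form)."""
    return lam * sum(w * w for w in weights)


def total_loss(data_loss, weights, lam):
    """Total loss = data loss + regularization loss."""
    return data_loss + l2_penalty(weights, lam)


def sgd_step(weights, data_grads, lr, lam):
    """One gradient step on the total loss.

    The gradient of lam * w**2 is 2 * lam * w, so each weight is shrunk
    toward zero ("decayed") in addition to following the data gradient.
    """
    return [w - lr * (g + 2.0 * lam * w) for w, g in zip(weights, data_grads)]


weights = [1.0, -2.0]
print(total_loss(0.5, weights, lam=0.1))
# With zero data gradient, the step only decays the weights.
print(sgd_step(weights, [0.0, 0.0], lr=0.1, lam=0.1))
```

With a zero data gradient the update only shrinks the weights, which is where the name "weight decay" comes from.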
Regularization: Dropout
Regularization
Slide credit: Hung-yi Lee – Deep Learning Tutorial
minibatch 1, minibatch 2, minibatch 3, …, minibatch n
Slide credit: Hung-yi Lee – Deep Learning Tutorial
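A rough sketch of dropout applied to one layer's activations. It assumes the common "inverted dropout" formulation, which may differ from the variant shown on the slides; `dropout`, `p_drop`, and `rng` are illustrative names. A new random mask is drawn on every call, so each minibatch effectively trains a different thinned network.

```python
import random


def dropout(activations, p_drop, training, rng=random):
    """Inverted dropout on a list of activations.

    During training each unit is zeroed with probability p_drop and the
    survivors are scaled by 1 / (1 - p_drop) so the expected value is
    unchanged. At test time the activations pass through untouched.
    """
    if not training or p_drop == 0.0:
        return list(activations)
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed only to make the demo repeatable
hidden = [1.0, 2.0, 3.0, 4.0]
for minibatch in range(3):
    # Each call samples a fresh mask.
    print(minibatch, dropout(hidden, p_drop=0.5, training=True, rng=rng))
print(dropout(hidden, p_drop=0.5, training=False))  # unchanged at test time
```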
Regularization: Early Stopping
Regularization
Stop training
validation
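One way the stopping rule could be sketched, assuming a patience-based criterion on the validation loss. The slide's exact rule is not recoverable from the text; `early_stopping_epoch`, `patience`, and the loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the index of the epoch whose weights would be kept.

    Training halts once the validation loss has failed to improve for
    `patience` consecutive epochs; the best epoch so far is returned.
    """
    best_epoch, best_loss, waited = 0, float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break  # stop training
    return best_epoch


# Made-up validation losses: improving, then rising as the model overfits.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61]
print(early_stopping_epoch(val_losses, patience=2))
```

In a real training loop the model weights would be checkpointed whenever the validation loss improves, and the checkpoint from the returned epoch would be restored.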
Batch Normalization
Regularization
Hyper-parameter Tuning
k-Fold Cross-Validation
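A minimal sketch of how the k folds might be formed, assuming contiguous, unshuffled folds. In practice the data is usually shuffled first; `k_fold_splits` is a hypothetical helper, not a library function.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, val_indices) for each of k folds.

    Fold sizes differ by at most one when n_samples is not divisible by k.
    """
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


# Every sample is used for validation exactly once across the k folds.
for train, val in k_fold_splits(n_samples=10, k=5):
    print("train:", train, "val:", val)
```

A model would be trained on each `train` split and scored on the matching `val` split, with the k scores averaged to compare hyper-parameter settings.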