MLDS HW1-2
TAs
ntu.mldsta@gmail.com
Outline
Timeline
Three Parts in HW1
Schedule
Task Descriptions
HW1-2: Optimization
Visualize the Optimization Process 1/3
Visualize the Optimization Process 2/3
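[Figure: a model with layers l1, l2, ..., lk, whose weight matrices have shapes m1 x n1, m2 x n2, ..., mk x nk. The weights are recorded for each training event (1st through ith) at epochs 0, 3, 6, ... Each record is flattened into a vector of m1n1 values (first layer only) or m1n1 + m2n2 + ... + mknk values (whole model), and dimension reduction is then applied to the collected vectors.]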
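The procedure in the diagram above (record the weights every few epochs, flatten them into one vector, then reduce the dimension) could be sketched roughly as follows. This is only an illustration under stated assumptions: the layer shapes, the number of training events, and the random "training" update are hypothetical stand-ins, and PCA via SVD is one possible choice of dimension reduction, not necessarily the one the assignment expects.

```python
import numpy as np

def flatten_weights(layers):
    """Concatenate weight matrices into one vector of length m1*n1 + ... + mk*nk."""
    return np.concatenate([np.asarray(w).ravel() for w in layers])

def pca_2d(snapshots):
    """Project row vectors onto their top two principal components (via SVD)."""
    X = np.asarray(snapshots, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ vt[:2].T

rng = np.random.default_rng(0)
layer_shapes = [(4, 3), (3, 2)]  # hypothetical (m1, n1), (m2, n2)

snapshots = []
for event in range(2):  # hypothetical: two independent training events
    layers = [rng.normal(size=shape) for shape in layer_shapes]
    for epoch in range(0, 9, 3):  # record at epochs 0, 3, 6
        snapshots.append(flatten_weights(layers))
        # Stand-in for three epochs of real training: a small random update.
        layers = [w - 0.1 * rng.normal(size=w.shape) for w in layers]

points = pca_2d(snapshots)  # one 2-D point per recorded snapshot
print(points.shape)
```

To visualize only the first layer instead of the whole model, pass `layers[:1]` to `flatten_weights`; each row of `points` can then be plotted and labeled by its event and epoch.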
Visualize the Optimization Process 3/3
[Figure panels: layer 1; whole model]
Observe Gradient Norm During Training 1/2
Observe Gradient Norm During Training 2/2
MNIST
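As a rough illustration of observing the gradient norm during training, the sketch below computes the L2 norm over all parameter gradients at every step. It assumes a toy linear regression trained with plain gradient descent rather than an MNIST network, so the data, learning rate, and the helper name `gradient_norm` are all hypothetical; with a deep learning framework, the same norm would be computed from each parameter's gradient after the backward pass.

```python
import numpy as np

def gradient_norm(grads):
    """L2 norm over all parameter gradients: sqrt of the sum of squared entries."""
    return float(np.sqrt(sum(np.sum(np.square(g)) for g in grads)))

# Toy data for y = x @ w + b (hypothetical stand-in for a real dataset).
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 3))
y = x @ np.array([1.0, -2.0, 0.5]) + 0.3

w = np.zeros(3)
b = 0.0
learning_rate = 0.1
history = []  # gradient norm recorded at every step

for step in range(200):
    err = x @ w + b - y              # residuals of the current model
    grad_w = 2 * x.T @ err / len(x)  # gradient of the MSE loss w.r.t. w
    grad_b = 2 * err.mean()          # gradient of the MSE loss w.r.t. b
    history.append(gradient_norm([grad_w, np.array(grad_b)]))
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(history[0], history[-1])
```

Plotting `history` against the step index, next to the loss curve, shows how the gradient norm evolves as training proceeds.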
What Happens When the Gradient Is Almost Zero 1/3
What Happens When the Gradient Is Almost Zero 2/3
First, train the network with the original loss function.
What Happens When the Gradient Is Almost Zero 3/3
HW1-2 Report Questions (10%)
Example of Bonus 1/3
Example of Bonus 2/3
Example of Bonus 3/3
Allowed Packages
Submission
Q&A
ntu.mldsta@gmail.com