Neural Networks II: Backpropagation
Vatsal Sivaratri
Modified from last year’s presentation
TJ Machine Learning Club
Slide 1
Review
The Perceptron
Weights
Bias
Today’s Goal
Weights
Bias
But first, Calculus! (At least the differential part)
Algebraic Approach:
Calculus Approach: dy/dx
IMPORTANT DISTINCTION!!!
d/dx
This is a command!
= “Take the derivative with respect to x”
dy/dx
This is a value!
= “The derivative of y with respect to x”
Rest of the Basic Calculus Rules
The Most Important - Chain Rule!
dy/dx = dy/du * du/dx
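As a quick sanity check, the chain rule can be compared against a numerical derivative. The functions below (y = u² with u = 3x + 1) are illustrative choices, not taken from the slides; this is only a minimal sketch.

```python
# Check dy/dx = dy/du * du/dx for the illustrative choice
# y = u**2 with u = 3*x + 1 (both functions are assumptions for this demo).

def u(x):
    return 3 * x + 1

def y_of_u(u_val):
    return u_val ** 2

def chain_rule_derivative(x):
    dy_du = 2 * u(x)  # derivative of u**2 with respect to u
    du_dx = 3         # derivative of 3*x + 1 with respect to x
    return dy_du * du_dx

def numeric_derivative(f, x, h=1e-6):
    # Central-difference approximation of df/dx
    return (f(x + h) - f(x - h)) / (2 * h)

x0 = 2.0
analytic = chain_rule_derivative(x0)  # 2 * 7 * 3
numeric = numeric_derivative(lambda x: y_of_u(u(x)), x0)
print(analytic, numeric)  # the two values should agree closely
```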
The Only Multivariable Calculus You’ll Need!
Partial Derivatives:
Move from dy/dx to ∂y/∂x, which measures how y changes with one variable while the others are held constant.
Gradients:
Extend derivatives to functions of several variables: the gradient collects all the partial derivatives into a vector that points in the direction of steepest ascent.
Credit: https://calcworkshop.com
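To make the two ideas concrete, here is a small sketch that estimates each partial derivative by nudging one input while holding the other fixed, then bundles them into a gradient. The function f(x, y) = x² + 3y is an assumed example, not from the slides.

```python
# Illustrative function (an assumption for this demo): f(x, y) = x**2 + 3*y
def f(x, y):
    return x ** 2 + 3 * y

def partial_x(x, y, h=1e-6):
    # Vary x only; y is held constant.
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(x, y, h=1e-6):
    # Vary y only; x is held constant.
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def gradient(x, y):
    # The gradient is the vector of partial derivatives.
    return (partial_x(x, y), partial_y(x, y))

grad = gradient(2.0, 1.0)
print(grad)  # analytically, (2*x, 3) at x = 2, so close to (4, 3)
```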
Minimizing Error: Gradient Descent
The Intuitive Explanation
Direction we push the weight
The Intuitive Explanation
The Gradient
w = w - α * ∂E/∂w
∂E/∂w is the gradient (for a single weight, this is just a derivative)
Subtraction gives us descent
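The idea that subtracting the gradient moves a weight downhill can be sketched on a one-weight toy problem. The error function E(w) = (w - 3)², the starting weight, and the step count are assumptions chosen only for illustration.

```python
# Toy error function (an assumption for this demo): E(w) = (w - 3)**2,
# which is minimized at w = 3.
def dE_dw(w):
    return 2 * (w - 3)

w = 0.0       # arbitrary starting weight
alpha = 0.1   # learning rate

for step in range(50):
    # Subtracting the gradient moves w toward lower error.
    w = w - alpha * dE_dw(w)

print(w)  # ends up very close to the minimum at w = 3
```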
The Gradient as Slope
The Learning Rate
Optimizing the Learning Rate
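One way to see why the learning rate needs tuning is to run the same descent with several values of α. The sketch below assumes a toy error E(w) = (w - 3)² and three hand-picked rates; none of these numbers come from the slides.

```python
# Toy error (an assumption for this demo): E(w) = (w - 3)**2, minimum at w = 3.
def descend(alpha, steps=20, w=0.0):
    for _ in range(steps):
        w = w - alpha * 2 * (w - 3)  # gradient of E is 2 * (w - 3)
    return w

slow = descend(0.01)      # too small: barely moves in 20 steps
good = descend(0.1)       # reasonable: gets close to 3
diverged = descend(1.1)   # too large: overshoots further on every step

for name, value in [("0.01", slow), ("0.1", good), ("1.1", diverged)]:
    print(f"alpha = {name}: w = {value:.3f}, distance from minimum = {abs(value - 3):.3f}")
```

With these assumed settings, the small rate is still far from the minimum, the middle rate is close, and the large rate ends up further away than where it started.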
Calculating One Neural Network Iteration
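Node values (forward pass): n1 = 3, n2 = -2, n3 = -3, n4 = 8, n5 = 13, n6 = 5
Weights into n3: W13 = 1, W23 = 3
Weights into n4: W14 = 4, W24 = 2
Weights into n5: W15 = 3, W25 = -2
Weights into n6: W36 = -1, W46 = -3, W56 = 2
Linear Activation Function (y = x) and no biases, n1 = 3, n2 = -2, y = 9, α = 0.1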
Goal: Update W36
W36 = W36 - α * ∂E/∂W36
Loss: E = ½(n6 - y)² (Doesn’t need to be this way!)
Calculating One Neural Network Iteration
E = ½(n6 - y)²
n6 = W36*n3 + W46*n4 + W56*n5
∂E/∂W36 = ∂E/∂n6 * ∂n6/∂W36
MOST IMPORTANT THING
Calculating One Neural Network Iteration
E = ½(n6 - y)²
n6 = W36*n3 + W46*n4 + W56*n5
∂E/∂W36 = ∂E/∂n6 * ∂n6/∂W36 = (n6 - y) * n3 = -4 * -3 = 12
W36 = W36 - α * ∂E/∂W36
W36 = -1 - 0.1(12) = -1 - 1.2 = -2.2
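The forward pass and the W36 update can be reproduced in a few lines. The dictionary layout and variable names below are illustrative choices; the numbers are the ones given on the slide.

```python
# Weights from the slide, keyed by (from_node, to_node).
W = {(1, 3): 1, (2, 3): 3, (1, 4): 4, (2, 4): 2, (1, 5): 3,
     (2, 5): -2, (3, 6): -1, (4, 6): -3, (5, 6): 2}
n1, n2 = 3, -2   # inputs
y = 9            # target
alpha = 0.1      # learning rate

# Forward pass with linear activations and no biases.
n3 = W[(1, 3)] * n1 + W[(2, 3)] * n2
n4 = W[(1, 4)] * n1 + W[(2, 4)] * n2
n5 = W[(1, 5)] * n1 + W[(2, 5)] * n2
n6 = W[(3, 6)] * n3 + W[(4, 6)] * n4 + W[(5, 6)] * n5

# Chain rule: dE/dW36 = dE/dn6 * dn6/dW36 = (n6 - y) * n3
grad_W36 = (n6 - y) * n3
new_W36 = W[(3, 6)] - alpha * grad_W36

print(n3, n4, n5, n6)      # hidden and output node values
print(grad_W36, new_W36)   # gradient, then updated weight (about -2.2)
```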
Calculating One Neural Network Iteration
Goal: Update W13
W13 = W13 - α * ∂E/∂W13
Loss: E = ½(n6 - y)² (Doesn’t need to be this way!)
Calculating One Neural Network Iteration
Goal: Update W13
W13 = W13 - α * ∂E/∂W13
∂E/∂W13 = ∂E/∂n6 * ∂n6/∂n3 * ∂n3/∂W13
MOST IMPORTANT THING
Calculating One Neural Network Iteration
Goal: Update W13
W13 = W13 - α * ∂E/∂W13
∂E/∂W13 = ∂E/∂n6 * ∂n6/∂n3 * ∂n3/∂W13
E = ½(n6 - y)²
n6 = W36*n3 + W46*n4 + W56*n5
n3 = W13*n1 + W23*n2
Calculating One Neural Network Iteration
W13 = W13 - α * ∂E/∂W13
∂E/∂W13 = ∂E/∂n6 * ∂n6/∂n3 * ∂n3/∂W13 = (n6 - y) * W36 * n1 = (-4) * -1 * 3 = 12
W13 = 1 - 0.1 * 12 = -0.2
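The three-link chain rule for W13 can be checked against a finite-difference estimate of the loss. This is a minimal sketch; the helper name loss and the step size h are assumptions, while the weights and inputs are the slide's values.

```python
n1, n2, y, alpha = 3, -2, 9, 0.1
W13, W23, W14, W24, W15, W25 = 1, 3, 4, 2, 3, -2
W36, W46, W56 = -1, -3, 2

def loss(w13):
    # Forward pass with only W13 varying (hypothetical helper for this check).
    n3 = w13 * n1 + W23 * n2
    n4 = W14 * n1 + W24 * n2
    n5 = W15 * n1 + W25 * n2
    n6 = W36 * n3 + W46 * n4 + W56 * n5
    return 0.5 * (n6 - y) ** 2

# Analytic gradient: dE/dn6 * dn6/dn3 * dn3/dW13 = (n6 - y) * W36 * n1
n3 = W13 * n1 + W23 * n2
n6 = W36 * n3 + W46 * (W14 * n1 + W24 * n2) + W56 * (W15 * n1 + W25 * n2)
grad_W13 = (n6 - y) * W36 * n1

# Numerical gradient by central difference, as an independent check.
h = 1e-6
numeric_grad = (loss(W13 + h) - loss(W13 - h)) / (2 * h)

new_W13 = W13 - alpha * grad_W13
print(grad_W13, numeric_grad, new_W13)  # gradients should agree; new weight is about -0.2
```

Note that the backward pass uses the original W36 = -1, not the updated value, because all gradients are computed from the same forward pass.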
NN After One Iteration
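Node values (from the forward pass above, not yet recomputed with the new weights): n1 = 3, n2 = -2, n3 = -3, n4 = 8, n5 = 13, n6 = 5
Weights into n3: W13 = -0.2, W23 = 3
Weights into n4: W14 = 4, W24 = 2
Weights into n5: W15 = 3, W25 = -2
Weights into n6: W36 = -2.2, W46 = -3, W56 = 2
Linear Activation Function (y = x) and no biases, n1 = 3, n2 = -2, y = 9, α = 0.1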
Code (Please don’t worry about this too much)
Check these videos out when you get a chance!