Building a simple ANN from scratch
MNIST
From scratch?
Starting with TensorFlow or PyTorch is easy, but you can't really tell what is going on inside unless you read the source code or the official documentation. Building a simple network from scratch may help you understand ANNs better.
Recap slides
A perceptron - neuron
Rosenblatt, F. (1958). "The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain". Psychological Review 65 (6): 386–408.
A perceptron – activation function
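A minimal sketch of a single perceptron, assuming a step activation that outputs -1 or 1. The weights, bias, and function names are illustrative choices, not taken from the slides.

```python
import numpy as np

def step(z):
    """Step activation: 1 if the weighted sum is non-negative, else -1."""
    return np.where(z >= 0, 1, -1)

def perceptron(x, w, b):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    return step(np.dot(w, x) + b)

# Hypothetical weights that make the neuron fire only when both inputs are 1.
w = np.array([1.0, 1.0])
b = -1.5
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [int(perceptron(np.array(x, dtype=float), w, b)) for x in inputs]
```

Swapping `step` for a smooth function such as a sigmoid or tanh gives the kind of neuron used in the rest of the deck.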
Single layer perceptron
input
output
A single layer can be seen as matrix multiplication too
https://www.jeremyjordan.me/intro-to-neural-networks/
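To make the matrix view concrete, here is a small sketch that computes one layer twice: neuron by neuron, and as a single matrix-vector product. The layer sizes and random values are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_neurons = 3, 2

W = rng.standard_normal((n_neurons, n_inputs))  # one row of weights per neuron
b = rng.standard_normal(n_neurons)
x = rng.standard_normal(n_inputs)

# Neuron by neuron: each output is a dot product plus a bias.
per_neuron = np.array([np.dot(W[i], x) + b[i] for i in range(n_neurons)])

# Whole layer at once: a single matrix-vector product.
as_matrix = W @ x + b
```

Both results are the same numbers; the matrix form is simply shorter and faster.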
A neural network is a transformation
Linearity
Non-linearity using activation functions
https://junstar92.tistory.com/122
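Why the activation matters can be checked numerically. In this sketch (random matrices and tanh are assumed purely for illustration), two stacked linear layers collapse into one matrix, while putting a non-linearity between them breaks that equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))
W2 = rng.standard_normal((2, 4))
x = rng.standard_normal(3)

# Two linear layers with no activation are equivalent to one linear layer.
two_linear = W2 @ (W1 @ x)
one_linear = (W2 @ W1) @ x

# With tanh between the layers, the result no longer matches the collapsed matrix.
with_tanh = W2 @ np.tanh(W1 @ x)
```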
Multi layer perceptron
https://ml-cheatsheet.readthedocs.io/en/latest/forwardpropagation.html
Forward propagation
-1 or 1?
Orange denotes -1, blue denotes 1.
Layers: input, hidden 1, hidden 2, output
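Forward propagation through an input, two hidden layers, and an output can be sketched as below. The layer widths, the tanh activation, and the zero threshold used to pick -1 or 1 are assumptions, since the slides do not state them.

```python
import numpy as np

def forward(x, layers):
    """Pass x through each (W, b) pair, applying tanh after every layer."""
    activations = [x]
    for W, b in layers:
        x = np.tanh(W @ x + b)
        activations.append(x)
    return activations

rng = np.random.default_rng(0)
sizes = [2, 4, 3, 1]  # input, hidden 1, hidden 2, output (hypothetical widths)
layers = [(rng.standard_normal((n_out, n_in)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

activations = forward(np.array([0.5, -1.0]), layers)
output = activations[-1][0]            # tanh keeps this between -1 and 1
prediction = 1 if output >= 0 else -1  # read the sign as the class
```

Each entry of `activations` is the output of one layer, which is exactly what the backward pass needs later.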
Forward propagation (input – hidden1)
Forward propagation (hidden1 – hidden2)
Forward propagation (hidden2 – output)
loss
Gradient descent
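A minimal sketch of gradient descent on a one-parameter loss. The quadratic loss, starting point, and learning rate are made up for illustration; the idea is only that each step moves the weight against the slope.

```python
def loss(w):
    """Hypothetical loss with its minimum at w = 3."""
    return (w - 3.0) ** 2

def grad(w):
    """Derivative of the loss with respect to w."""
    return 2.0 * (w - 3.0)

w = 0.0
learning_rate = 0.1
history = [loss(w)]
for _ in range(50):
    w -= learning_rate * grad(w)  # step against the gradient
    history.append(loss(w))
```

Plotting `history` would show the loss sliding down toward the minimum.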
Optimizing
loss = 0.25
The output moves closer to 1.
Backward propagation
We use the chain rule!
Backward propagation
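The chain rule can be checked on a single neuron. This sketch assumes a sigmoid activation and a squared-error loss (neither is stated in the slides), computes the gradient as a product of local derivatives, and compares it with a finite-difference estimate.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_fn(w, b, x, y):
    return (sigmoid(np.dot(w, x) + b) - y) ** 2

x = np.array([0.5, -1.0])
y = 1.0
w = np.array([0.1, 0.2])
b = 0.0

# Forward pass, keeping the intermediate values for the backward pass.
z = np.dot(w, x) + b
a = sigmoid(z)

# Chain rule: dL/dw = dL/da * da/dz * dz/dw
dL_da = 2.0 * (a - y)
da_dz = a * (1.0 - a)
dL_dw = dL_da * da_dz * x
dL_db = dL_da * da_dz

# Finite-difference estimate for the first weight.
eps = 1e-6
w_plus = w.copy()
w_plus[0] += eps
w_minus = w.copy()
w_minus[0] -= eps
numeric = (loss_fn(w_plus, b, x, y) - loss_fn(w_minus, b, x, y)) / (2 * eps)
```

If `numeric` and `dL_dw[0]` agree, the hand-derived gradient is very likely correct.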
Parameter update
This can be done for all weights!
Once every weight has been updated, one update is done.
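Repeating the update many times looks like this. The sketch assumes a single sigmoid neuron, squared-error loss, a target of 1, and zero initial weights; under those assumptions the output starts at 0.5 and the loss at 0.25, then the output climbs toward 1.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 2.0])
y = 1.0
w = np.zeros(2)  # with zero weights the output starts at 0.5
b = 0.0
alpha = 0.5      # learning rate (hypothetical value)

outputs = []
for _ in range(20):
    a = sigmoid(np.dot(w, x) + b)
    outputs.append(a)
    delta = 2.0 * (a - y) * a * (1.0 - a)  # dL/dz from the chain rule
    # One update: every weight and the bias move against their gradient.
    w -= alpha * delta * x
    b -= alpha * delta

first_loss = (outputs[0] - y) ** 2
last_loss = (outputs[-1] - y) ** 2
```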
How can we make a deep learning model?
Training / Testing
Training set
Testing set
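One simple way to get a training set and a testing set is to shuffle the samples and hold some out. The helper name `train_test_split`, the 20% test fraction, and the toy data are assumptions for this sketch.

```python
import numpy as np

def train_test_split(X, y, test_fraction=0.2, seed=0):
    """Shuffle the samples once, then hold out a fraction for testing."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(len(X) * test_fraction)
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

# Toy data standing in for real samples: 10 rows, 3 features each.
X = np.arange(30).reshape(10, 3)
y = np.arange(10)
X_train, y_train, X_test, y_test = train_test_split(X, y)
```

The model only ever updates its weights on the training set; the testing set is kept aside to see how well it generalizes.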
About MNIST
Network structure
Layer sizes: 784 (input), 10 (hidden), 10 (output)
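A forward pass through a 784, 10, 10 network could look like the sketch below. ReLU in the hidden layer, softmax at the output, and the one-image-per-column layout are assumptions; the slides only give the sizes. The images here are random numbers, not real MNIST digits.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0)

def softmax(z):
    """Column-wise softmax; subtracting the max keeps exp() from overflowing."""
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def forward(X, W1, b1, W2, b2):
    """X has shape (784, m): one flattened 28x28 image per column."""
    Z1 = W1 @ X + b1
    A1 = relu(Z1)
    Z2 = W2 @ A1 + b2
    A2 = softmax(Z2)
    return Z1, A1, Z2, A2

rng = np.random.default_rng(0)
X = rng.random((784, 5))  # 5 fake images with pixel values in [0, 1)
W1, b1 = rng.random((10, 784)) - 0.5, np.zeros((10, 1))
W2, b2 = rng.random((10, 10)) - 0.5, np.zeros((10, 1))
Z1, A1, Z2, A2 = forward(X, W1, b1, W2, b2)
```

Each column of `A2` holds ten values that sum to 1, one per digit class.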
What do we need?
Read data
Network structure
rand returns values in [0, 1)
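Since rand never returns a negative number, one common choice is to subtract 0.5 so the initial weights are centred on zero. That shift, the function name `init_params`, and the 784, 10, 10 shapes are assumptions in this sketch, not something the slides confirm.

```python
import numpy as np

def init_params(seed=0):
    """Draw values from [0, 1) with rand, then shift them to [-0.5, 0.5)."""
    rng = np.random.RandomState(seed)
    W1 = rng.rand(10, 784) - 0.5
    b1 = rng.rand(10, 1) - 0.5
    W2 = rng.rand(10, 10) - 0.5
    b2 = rng.rand(10, 1) - 0.5
    return W1, b1, W2, b2

W1, b1, W2, b2 = init_params()
```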
Met in Bioinformatics…
Backward
W2 := W2 - 𝛼 dW2
b2 := b2 - 𝛼 db2
W1 := W1 - 𝛼 dW1
b1 := b1 - 𝛼 db1
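The four update rules translate almost line for line into code. The function name `update_params` and the tiny made-up arrays are only there to show the arithmetic.

```python
import numpy as np

def update_params(W1, b1, W2, b2, dW1, db1, dW2, db2, alpha):
    """Apply one gradient descent step to every parameter."""
    W1 = W1 - alpha * dW1
    b1 = b1 - alpha * db1
    W2 = W2 - alpha * dW2
    b2 = b2 - alpha * db2
    return W1, b1, W2, b2

# Tiny made-up values, only to show the arithmetic.
W1, b1 = np.ones((2, 3)), np.zeros((2, 1))
W2, b2 = np.ones((1, 2)), np.zeros((1, 1))
dW1, db1 = np.full((2, 3), 0.5), np.ones((2, 1))
dW2, db2 = np.full((1, 2), 2.0), np.ones((1, 1))
W1, b1, W2, b2 = update_params(W1, b1, W2, b2, dW1, db1, dW2, db2, alpha=0.1)
```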
One-hot encoding
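One-hot encoding turns each digit label into a vector with a single 1 at the position of that digit. A minimal sketch, assuming labels are integers from 0 to 9 and that each column is one sample:

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return a (num_classes, m) matrix with a single 1 per column."""
    encoded = np.zeros((num_classes, labels.size))
    encoded[labels, np.arange(labels.size)] = 1
    return encoded

labels = np.array([3, 0, 9])
Y = one_hot(labels)
```

Taking `np.argmax(Y, axis=0)` recovers the original labels, which is also how a prediction is read off the network output.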
Combined!
To do
Brought to you by..