AI@MIT Workshop Series
Presentation based on Nikhil’s Coursera course “An Introduction to Practical Deep Learning”
Workshop 5:
Recurrent Neural Networks
Types of Networks
MLP (Multilayer Perceptron)
CNN
(Convolutional Neural Networks)
RNN
(Recurrent Neural Networks)
Sources: http://bit.ly/2GHV0uS, http://bit.ly/2G3ynDk, http://bit.ly/2GJG13N
Today’s Agenda
Review
Training Procedure (sketched in code below)
Initialize weights
Fetch a batch of data
Forward-pass
Cost
Backward-pass
Update weights
Sources: “An Introduction to Practical Deep Learning” Coursera Course
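A minimal sketch of these six steps in Python (PyTorch-style; the model, data, and hyperparameters below are illustrative placeholders, not from the course):

import torch

model = torch.nn.Linear(10, 2)                  # toy network; constructing it initializes the weights
loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loader = [(torch.randn(4, 10), torch.randint(0, 2, (4,)))]  # stand-in for a real DataLoader

for inputs, targets in loader:                  # fetch a batch of data
    outputs = model(inputs)                     # forward-pass
    loss = loss_fn(outputs, targets)            # cost
    optimizer.zero_grad()
    loss.backward()                             # backward-pass: compute gradients
    optimizer.step()                            # update weights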
Review
Sources: “An Introduction to Practical Deep Learning” Coursera Course
Inference Procedure (sketched in code below)
Fetch data
Forward-pass
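Continuing the placeholder model from the training sketch above, inference is the same loop minus the gradient steps:

model.eval()                                    # switch off training-only behavior (e.g., dropout)
with torch.no_grad():                           # no backward-pass or weight update at inference
    preds = model(torch.randn(4, 10)).argmax(dim=1)  # fetch data, forward-pass, read off predictions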
The man ate his mushrooms raw.
Recurrent Neural Network: A Better Idea
Designed to capture temporal dependencies
Applies the same cell recursively, so it accepts variable-length inputs
Shares its weights across time steps, so a pattern can be recognized whenever it occurs
Source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
RNN Architectures
Recurrent Neuron
Unrolling
Unrolling Example:
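A minimal NumPy sketch of unrolling (all sizes and names are illustrative): one recurrent layer reuses the same weights at every time step, and unrolling just means replaying that update once per input element.

import numpy as np

x_seq = np.random.randn(5, 8)             # toy sequence: 5 time steps, 8 features each
W_x = 0.1 * np.random.randn(8, 16)        # input-to-hidden weights, shared across time
W_h = 0.1 * np.random.randn(16, 16)       # hidden-to-hidden weights, shared across time
b = np.zeros(16)

h = np.zeros(16)                          # initial hidden state
for x_t in x_seq:                         # the unrolled loop over time
    h = np.tanh(x_t @ W_x + h @ W_h + b)  # h_t = tanh(W_x x_t + W_h h_{t-1} + b)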
Issues with RNNs
Vanishing/exploding gradients
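Why this happens: backpropagation through time multiplies one Jacobian per step, so the gradient flowing from step T back to step k contains the product

\frac{\partial h_T}{\partial h_k} = \prod_{t=k+1}^{T} \frac{\partial h_t}{\partial h_{t-1}}

which tends toward zero (vanishes) or grows without bound (explodes) as T - k grows, depending on the norms of the factors.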
Bidirectional RNNs
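A hedged PyTorch sketch (the framework and sizes are my choices, not the slides'): a bidirectional RNN runs one pass left-to-right and one right-to-left, then concatenates the two hidden states at each time step.

import torch

rnn = torch.nn.RNN(input_size=8, hidden_size=16, bidirectional=True, batch_first=True)
x = torch.randn(1, 5, 8)                  # (batch, time steps, features)
out, h_n = rnn(x)
print(out.shape)                          # torch.Size([1, 5, 32]): 16 forward + 16 backward features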
Deep RNNs
(Diagram: recurrent layers stacked from Layer 1 and Layer 2 up to Layer L.)
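The same PyTorch sketch, stacked: layer 1 reads the input sequence and each higher layer reads the hidden-state sequence of the layer below it, up to layer L (here L = 3, chosen for illustration).

import torch

deep_rnn = torch.nn.RNN(input_size=8, hidden_size=16, num_layers=3, batch_first=True)
out, h_n = deep_rnn(torch.randn(1, 5, 8))  # out: the top layer's states at every step
print(h_n.shape)                           # torch.Size([3, 1, 16]): one final hidden state per layer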
LSTMs
LSTMs (Long Short-Term Memory modules)
Forget Gate
Memory Gates
Output Gate
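In the standard formulation (notation follows the colah post cited above; \sigma is the logistic sigmoid, \odot is elementwise product), the forget gate is f_t, the memory gates are the input gate i_t together with the candidate \tilde{C}_t, and the output gate is o_t:

f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)
i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)
\tilde{C}_t = \tanh(W_C [h_{t-1}, x_t] + b_C)
C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t
o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)
h_t = o_t \odot \tanh(C_t)

The cell state C_t carries long-term memory; the gates decide what to erase, what to write, and what to expose as the output h_t.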
GRUs
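GRUs (Gated Recurrent Units) merge the cell state and hidden state and use two gates, an update gate z_t and a reset gate r_t (same notation as above):

z_t = \sigma(W_z [h_{t-1}, x_t])
r_t = \sigma(W_r [h_{t-1}, x_t])
\tilde{h}_t = \tanh(W [r_t \odot h_{t-1}, x_t])
h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t

Fewer gates means fewer parameters, which is why GRUs are a popular lighter-weight alternative to LSTMs.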
Attention
Attention: the Transformer
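The Transformer's core operation is scaled dot-product attention (Vaswani et al., "Attention Is All You Need", 2017):

\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^\top}{\sqrt{d_k}}\right) V

Each query is compared against every key, the scaled scores are softmax-normalized into weights, and those weights mix the values; d_k is the key dimension.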
Transformers
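As a hedged PyTorch sketch (not from the slides), one Transformer building block is multi-head self-attention, where the same sequence supplies the queries, keys, and values:

import torch

attn = torch.nn.MultiheadAttention(embed_dim=32, num_heads=4, batch_first=True)
x = torch.randn(1, 5, 32)                 # (batch, time steps, embedding size), illustrative
out, weights = attn(x, x, x)              # self-attention: query = key = value = x
print(out.shape, weights.shape)           # torch.Size([1, 5, 32]) torch.Size([1, 5, 5])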
Applications
Task-Dependent
N.B. This is how word vectors are computed!
(Diagram: the RNN encodes the input sentence into a single vector, the encoding, which a classifier then labels Non-Toxic or Toxic.)
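A hedged sketch of the diagram's pipeline (framework, model, and sizes are illustrative, not the course's): the RNN's final hidden state serves as the sentence encoding, and a linear classifier maps it to the two labels.

import torch

encoder = torch.nn.RNN(input_size=8, hidden_size=16, batch_first=True)
classifier = torch.nn.Linear(16, 2)       # two classes: non-toxic vs. toxic
x = torch.randn(1, 5, 8)                  # an embedded 5-word sentence
_, h_n = encoder(x)                       # h_n holds the final hidden state: the encoding
logits = classifier(h_n[-1])              # scores for [non-toxic, toxic]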
Cool Results from Andrej Karpathy
Summary
Thank you!
Log attendance at tinyurl.com/aimws5
and enjoy the lab!