Long Short-Term Memory (LSTM)
RNN
An unrolled recurrent neural network
Long Distance Dependencies
Vanishing/Exploding Gradient Problem
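A minimal numeric sketch of the problem, assuming a scalar linear recurrence h_t = w * h_(t-1) (a deliberate simplification, not a full RNN). Under that assumption the gradient of the last state with respect to the first is w raised to the number of time steps, so it shrinks toward zero when |w| < 1 and blows up when |w| > 1. The function name `gradient_scale` is hypothetical and used only for illustration.

```python
def gradient_scale(recurrent_weight, steps):
    """Magnitude of d h_T / d h_0 for the toy recurrence h_t = w * h_(t-1).

    Assumption: scalar state and no nonlinearity, so the chain rule
    reduces to multiplying by the same weight once per time step.
    """
    return abs(recurrent_weight) ** steps


# Slightly below 1: the gradient decays over 100 steps (vanishing).
vanishing = gradient_scale(0.9, 100)

# Slightly above 1: the gradient grows over 100 steps (exploding).
exploding = gradient_scale(1.1, 100)

print(f"w=0.9 -> {vanishing:.3e}")
print(f"w=1.1 -> {exploding:.3e}")
```

In a real RNN the per-step factor is a Jacobian that also includes the activation derivative, but the same repeated-product effect applies.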
How to Resolve the Vanishing Gradient Problem?
Solving Vanishing Gradient: Activation Functions
Solving Vanishing Gradient: Residual Networks
Solving Vanishing Gradient: LSTM
Long Short-Term Memory (LSTM)
LSTM Network Architecture
LSTM vs RNN
Cell State
Gates
A gate is a sigmoid neural net layer followed by a pointwise multiplication operator
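The gate structure named here can be sketched in a few lines of plain Python. This is an illustrative sketch, not a library API: the helper names `sigmoid` and `gate`, and the toy weights, are assumptions chosen to show that a sigmoid output near 1 lets a value through while an output near 0 blocks it.

```python
import math


def sigmoid(z):
    """Logistic function; squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gate(weights, bias, inputs, signal):
    """Sigmoid layer over `inputs`, then pointwise multiplication with `signal`.

    weights: one row of input weights per gate unit (hypothetical values)
    bias:    one bias per gate unit
    signal:  the vector being gated, same length as `bias`
    """
    activations = [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, bias)
    ]
    return [a * s for a, s in zip(activations, signal)]


# First unit gets a large positive pre-activation (gate nearly open),
# second unit a large negative one (gate nearly closed).
gated = gate(
    weights=[[10.0], [-10.0]],
    bias=[0.0, 0.0],
    inputs=[1.0],
    signal=[5.0, 5.0],
)
print(gated)
```

The first component of `gated` stays close to the original 5.0 and the second is pushed close to 0.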
Forget Gate
Input Gate
Updating the Cell State
Output Gate
Overall LSTM Cell Architecture
Overall LSTM Computation
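The overall computation can be sketched as a single time step in plain Python, following the standard formulation with forget, input, and output gates plus a tanh candidate. The parameter names (`W_f`, `b_f`, and so on), the dictionary layout, and the all-zero toy weights are assumptions made for illustration, not taken from the slides.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Matrix-vector product plus bias, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; returns the new hidden state and cell state."""
    z = h_prev + x_t  # list concatenation: [h_(t-1), x_t]

    f = [sigmoid(a) for a in affine(params["W_f"], z, params["b_f"])]  # forget gate
    i = [sigmoid(a) for a in affine(params["W_i"], z, params["b_i"])]  # input gate
    c_tilde = [math.tanh(a) for a in affine(params["W_c"], z, params["b_c"])]  # candidate
    o = [sigmoid(a) for a in affine(params["W_o"], z, params["b_o"])]  # output gate

    # Cell state update: keep part of the old state, add part of the candidate.
    c_t = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, c_tilde)]
    # Hidden state: output gate applied to the squashed cell state.
    h_t = [ok * math.tanh(ck) for ok, ck in zip(o, c_t)]
    return h_t, c_t


# Toy setup (assumed): hidden size 1, input size 1, all weights and biases zero,
# so every gate evaluates to 0.5 and the candidate to 0.
params = {}
for name in ("f", "i", "c", "o"):
    params["W_" + name] = [[0.0, 0.0]]
    params["b_" + name] = [0.0]

h_t, c_t = lstm_step(x_t=[1.0], h_prev=[0.0], c_prev=[2.0], params=params)
print(h_t, c_t)
```

With the zero weights assumed here, the forget gate halves the previous cell state and the input gate contributes nothing, which makes the additive cell-state update easy to check by hand.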
LSTM Training
General Problems Solved with LSTMs
Sequence to Sequence Transduction (Mapping)
Summary of LSTM Application Architectures
Image Captioning
Video Activity Recognition
Text Classification
Video Captioning
Machine Translation
POS Tagging
Language Modeling
Successful Applications of LSTMs
Deep LSTMs
Bi-directional LSTM (Bi-LSTM)
[Figure: Bi-LSTM unrolled over inputs x_{t-1}, x_t, x_{t+1} with hidden states h_{t-1}, h_t, h_{t+1}]
The output at each time step draws on both past and future elements of the sequence
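A minimal sketch of the bidirectional idea: one recurrent pass runs left to right, a second runs right to left, and the two states are paired at each position. To keep the example checkable by hand, a running sum stands in for the LSTM cell; that stand-in and the names `run_direction` and `bidirectional` are assumptions for illustration only.

```python
def run_direction(step, inputs, h0):
    """Apply a recurrent `step(x, h) -> h` over `inputs`, collecting every state."""
    states, h = [], h0
    for x in inputs:
        h = step(x, h)
        states.append(h)
    return states


def bidirectional(step_fwd, step_bwd, inputs, h0):
    """Pair forward and backward states so each position sees past and future."""
    fwd = run_direction(step_fwd, inputs, h0)
    # Run over the reversed sequence, then flip back to align with positions.
    bwd = run_direction(step_bwd, inputs[::-1], h0)[::-1]
    return list(zip(fwd, bwd))


def running_sum(x, h):
    """Toy recurrent step (assumed stand-in for an LSTM cell)."""
    return h + x


outputs = bidirectional(running_sum, running_sum, [1, 2, 3], 0)
print(outputs)
```

At the middle position the forward state summarizes inputs 1 and 2 while the backward state summarizes inputs 3 and 2, so the paired output reflects context from both sides.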
Gated Recurrent Unit (GRU)
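For comparison with the LSTM, a single GRU time step can be sketched as below. It assumes the common formulation with an update gate z and a reset gate r, combined as h_t = (1 - z) * h_(t-1) + z * h_candidate; some references swap the roles of z and (1 - z). The parameter names and zero-valued toy weights are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Matrix-vector product plus bias, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def gru_step(x_t, h_prev, params):
    """One GRU time step; there is no separate cell state, only h."""
    concat = h_prev + x_t  # [h_(t-1), x_t]

    z = [sigmoid(a) for a in affine(params["W_z"], concat, params["b_z"])]  # update gate
    r = [sigmoid(a) for a in affine(params["W_r"], concat, params["b_r"])]  # reset gate

    # Candidate state is computed from the reset-scaled previous state.
    reset_concat = [rk * hk for rk, hk in zip(r, h_prev)] + x_t
    h_cand = [math.tanh(a) for a in affine(params["W_h"], reset_concat, params["b_h"])]

    # Interpolate between the old state and the candidate.
    return [(1 - zk) * hk + zk * gk for zk, hk, gk in zip(z, h_prev, h_cand)]


# Toy setup (assumed): hidden size 1, input size 1, all weights and biases zero.
gru_params = {}
for name in ("z", "r", "h"):
    gru_params["W_" + name] = [[0.0, 0.0]]
    gru_params["b_" + name] = [0.0]

h_next = gru_step(x_t=[1.0], h_prev=[2.0], params=gru_params)
print(h_next)
```

Compared with the LSTM step, this uses two gates instead of three and merges the cell state into the hidden state.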
GRU vs. LSTM
Conclusions