Long short-term memory
Neural Networks that remember
Victor ADASCALITEI
Neural Networks recap
What we know so far?
Convolutional Neural Networks (CNNs) · Generative Adversarial Networks (GANs)
[Diagram: a CNN classifies an image ("Cat?" → Yes / No); a GAN turns noise into a generated image]
More generally:
Data → Neural Network → Result
or
Neural Network(Data) = Result
f(x) = y
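The slide's view of a trained network as a plain function f(x) = y can be sketched with a single neuron (a minimal sketch; the weights below are arbitrary illustrative values, not trained):

```python
import math

def neural_network(x, w, b):
    # A single sigmoid neuron: the whole network is just a function f(x) = y.
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

# Data in, result out -- Neural Network(Data) = Result.
y = neural_network(2.0, w=1.5, b=-1.0)
```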
Neural Networks recap
They are universal function approximators
(Given enough data, they can approximate any function)
True for images and classifiers,
but what about text?
… and speech?
… and music?
What about time/context-dependent data?
Neural Networks
Input in relation with the output
- Input
- Network
- Output
One to One: Recognition, GANs
One to Many: Image Captioning
Many to One: Sentiment Analysis
Many to Many: Movie Analysis
Many to Many*: Translations
Recurrent Neural Networks
Input in relation with the output
Recurrent Neural Networks handle the sequence patterns above: One to Many, Many to One, Many to Many, and Many to Many*.
Recurrent Neural Networks
Input in relation with the output
[Diagram: a network with a feedback loop, equal (=) to the same network unrolled across time steps]
Recurrent Neural Networks
Explained
h_t = tanh(W_hh · h_{t-1} + W_xh · x_t)
(x_t: input at step t, h_t: hidden state, W_hh: hidden-to-hidden weights, W_xh: input-to-hidden weights)
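The standard vanilla RNN update, h_t = tanh(W_hh · h_{t-1} + W_xh · x_t), can be sketched in a few lines of NumPy (sizes and random weights below are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, inputs = 4, 3
Whh = rng.standard_normal((hidden, hidden)) * 0.1  # hidden-to-hidden weights
Wxh = rng.standard_normal((hidden, inputs)) * 0.1  # input-to-hidden weights

def rnn_step(h_prev, x_t):
    # The same two weight matrices are reused at every time step.
    return np.tanh(Whh @ h_prev + Wxh @ x_t)

h = np.zeros(hidden)
for x_t in rng.standard_normal((5, inputs)):  # a length-5 input sequence
    h = rnn_step(h, x_t)  # h carries context forward across time steps
```

The loop is what makes the network recurrent: the output of one step is fed back in as part of the next step's input.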
Recurrent Neural Networks
The problem
Vanishing gradient: backpropagating through many time steps multiplies the gradient by the same recurrent weights over and over, so it shrinks exponentially and long-range dependencies become very hard to learn.
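The shrinking effect can be seen directly: repeatedly multiplying a gradient by the (transposed) recurrent weight matrix, as backpropagation through time does, drives its norm toward zero when the weights are small (a simplified sketch that omits the tanh-derivative factors, which only shrink the gradient further):

```python
import numpy as np

rng = np.random.default_rng(0)
hidden = 4
Whh = rng.standard_normal((hidden, hidden)) * 0.05  # small recurrent weights

grad = np.ones(hidden)
norms = []
for _ in range(20):
    # Each backward step through time multiplies the gradient by Whh^T.
    grad = Whh.T @ grad
    norms.append(np.linalg.norm(grad))
# norms decays roughly geometrically -- the vanishing gradient.
```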
Long short-term memory cell
[Diagram: the LSTM cell, with gates controlling what is written to, kept in, and read from the cell state [1]]
Code example
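A minimal NumPy sketch of one LSTM forward step, following the standard gated formulation (sizes and random weights are illustrative; biases are omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, inputs = 4, 3

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One weight matrix per gate, each acting on the concatenation [h_{t-1}; x_t].
Wf, Wi, Wo, Wc = (rng.standard_normal((hidden, hidden + inputs)) * 0.1
                  for _ in range(4))

def lstm_step(h_prev, c_prev, x_t):
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(Wf @ z)                    # forget gate: what to erase from the cell state
    i = sigmoid(Wi @ z)                    # input gate: what new information to store
    o = sigmoid(Wo @ z)                    # output gate: what to expose as the hidden state
    c = f * c_prev + i * np.tanh(Wc @ z)   # cell state: the "long-term memory"
    h = o * np.tanh(c)
    return h, c

h, c = np.zeros(hidden), np.zeros(hidden)
for x_t in rng.standard_normal((5, inputs)):  # a length-5 input sequence
    h, c = lstm_step(h, c, x_t)
```

Because the cell state c is updated additively (gated by f and i) rather than squashed through the recurrent weights at every step, gradients can flow across many time steps without vanishing as quickly as in a vanilla RNN.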
Sources
[1] “LSTM: A Search Space Odyssey” -- https://arxiv.org/pdf/1503.04069.pdf
[2] “RNN Escapades” -- London ML meetup 09/2015 Andrej Karpathy https://docs.google.com/presentation/d/1qs2IuSdZvbNfzw217kH5-1Z9DjG0Ng6fJiabaLNQVaY