Recurrent Neural Networks
Aaditya Prakash
Sep 26, 2018
RNN, LSTM, Seq2Seq, NMT and Attention
Art with deep recurrent neural networks
The crow crooked on more beautiful and free,
He journeyed off into the quarter sea.
his radiant ribs girdled empty and very –
least beautiful as dignified to see.
Rudi Levette Berice Lussa Hany Mareanne Chrestina Carissy Marylen Hammine Janye Marlise Jacacrie Hendred Romand Charienna Nenotto Ette Dorane Wallen Marly Darine Salina Elvyn Ersia Maralena Minoria Ellia Charmin Antley Nerille Chelon Walmor Evena Jeryly Stachon Charisa Allisa Anatha Cathanie Geetra Alexie Jerin Cassen Herbett Cossie Velen Daurenge Robester Shermond Terisa Licia Roselen Ferine Jayn Lusine Charyanne Sales Sanny Resa Wallon Martine Merus Jelen Candica Wallin Tel
Poetry
Baby Names
Image Captions
Language Model
Once you are a few words into a sentence, there are far too many possible histories: exponentially many.
Language Model - Fix: Markov Assumption
Problem: a very small window gives bad predictions.
Solution: smoothing, attention (discussed later).
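A minimal sketch of the Markov assumption in practice (my own toy code, not from the slides): a bigram model conditions only on the previous word, and add-one smoothing is one simple fix for unseen word pairs.

```python
# Bigram language model with the Markov assumption:
# P(w_t | w_1..w_{t-1}) is approximated by P(w_t | w_{t-1}).
from collections import Counter, defaultdict

corpus = "the crow crooked on the sea the crow journeyed off".split()

unigrams = Counter(corpus)
bigrams = defaultdict(Counter)
for prev, cur in zip(corpus, corpus[1:]):
    bigrams[prev][cur] += 1

vocab = set(corpus)

def prob(word, prev):
    # P(word | prev) with add-one (Laplace) smoothing over the vocabulary.
    return (bigrams[prev][word] + 1) / (unigrams[prev] + len(vocab))

print(prob("crow", "the"))     # relatively high: "the crow" occurs twice
print(prob("sea", "crooked"))  # unseen bigram, but nonzero thanks to smoothing
```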
Language Model - Recurrent Model
Recurrent Neuron
Recurrent Neuron - Unrolled
An unrolled recurrent neural network.
RNN - Structure
Forward propagation through time
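A minimal sketch of forward propagation through time (my own toy numpy code, not the slides' figure), assuming the standard vanilla-RNN update h_t = tanh(W_xh x_t + W_hh h_{t-1} + b):

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size, T = 4, 8, 5   # toy dimensions and sequence length

W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

xs = rng.normal(size=(T, input_size))  # one toy input sequence
h = np.zeros(hidden_size)              # initial hidden state
hs = []
for t in range(T):
    # The same weights are reused at every time step ("unrolled" RNN).
    h = np.tanh(W_xh @ xs[t] + W_hh @ h + b_h)
    hs.append(h)

print(np.stack(hs).shape)  # (T, hidden_size): one hidden state per step
```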
Backpropagation through time
BPTT
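A hedged sketch of the gradient that BPTT computes for the recurrent weights (standard chain-rule derivation, not copied from the slides). With h_i = tanh(W_xh x_i + W_hh h_{i-1} + b), the loss at step t reaches W_hh through every earlier hidden state:

```latex
\frac{\partial L}{\partial W_{hh}}
  = \sum_{t}\sum_{k=1}^{t}
    \frac{\partial L_t}{\partial h_t}
    \left(\prod_{i=k+1}^{t}\frac{\partial h_i}{\partial h_{i-1}}\right)
    \frac{\partial h_k}{\partial W_{hh}},
\qquad
\frac{\partial h_i}{\partial h_{i-1}}
  = \operatorname{diag}\!\left(1 - h_i^{2}\right) W_{hh}^{\top}.
```

The repeated Jacobian product shrinks or grows roughly like a power of the largest singular value of W_hh, which is exactly the vanishing / exploding gradient problem on the next slide.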
RNN: vanishing & exploding gradient
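A common mitigation for the exploding-gradient side of the problem is gradient norm clipping; a minimal sketch (my own, not from the slides), applied to the gradients produced by BPTT before the parameter update:

```python
import numpy as np

def clip_gradients(grads, max_norm=5.0):
    """Rescale a list of gradient arrays so their global L2 norm is <= max_norm."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total_norm > max_norm:
        scale = max_norm / (total_norm + 1e-6)
        grads = [g * scale for g in grads]
    return grads

# Usage: pretend BPTT produced these overly large gradients.
grads = [np.full((3, 3), 10.0), np.full(3, 10.0)]
print([round(float(np.linalg.norm(g)), 2) for g in clip_gradients(grads)])
```

Clipping does not help with vanishing gradients; that is what the long / short memory cells on the next slides address.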
RNN - Structure
Solution - Long Memory and Short Memory
Cell State
Why a state? E.g., remember the subject's gender so that the proper pronoun can be used later.
Solution - Long Memory and Short Memory
Forget Gate Layer
Why forget? Perhaps a new subject appears, with a different gender.
Solution - Long Memory and Short Memory
Input Gate Layer
Solution - Long Memory and Short Memory
Combine to make current state
Solution - Long Memory and Short Memory
Output Gate Layer
Why include the current input in the state? So that things like the plurality of the subject can be determined.
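A minimal sketch of one LSTM step combining the forget, input and output gates described above (the standard formulation; my own toy numpy code, not the slides'):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step. W has shape (4*H, X+H), b has shape (4*H,)."""
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    f = sigmoid(z[0 * H:1 * H])   # forget gate: what to drop from the cell state
    i = sigmoid(z[1 * H:2 * H])   # input gate: what new information to write
    g = np.tanh(z[2 * H:3 * H])   # candidate values for the cell state
    o = sigmoid(z[3 * H:4 * H])   # output gate: what part of the cell to expose
    c = f * c_prev + i * g        # new cell state (long-term memory)
    h = o * np.tanh(c)            # new hidden state (short-term output)
    return h, c

rng = np.random.default_rng(0)
X, H = 4, 8
W = rng.normal(scale=0.1, size=(4 * H, X + H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
h, c = lstm_step(rng.normal(size=X), h, c, W, b)
print(h.shape, c.shape)
```

Because the cell state is updated additively (f * c_prev + i * g) rather than being squashed through a nonlinearity at every step, gradients along it decay far more slowly than in the vanilla RNN.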
Even better LSTM
LSTM with “peephole connections”
Gate layers look at the cell state. Gers & Schmidhuber (2000)
GRU - Gated Recurrent Unit
Merges the cell state and hidden state into a single state
Combines the forget and input gates into a single update gate. Cho et al. (2014)
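A minimal sketch of one GRU step (my own toy code following the Cho et al., 2014 formulation): the update gate z plays the role of the LSTM's combined forget/input gates, and there is no separate cell state.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gru_step(x, h_prev, Wz, Wr, Wh):
    """One GRU time step; each W maps the concatenated [x, h] to H units."""
    xh = np.concatenate([x, h_prev])
    z = sigmoid(Wz @ xh)                                      # update gate
    r = sigmoid(Wr @ xh)                                      # reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([x, r * h_prev]))   # candidate state
    return (1 - z) * h_prev + z * h_tilde                     # interpolate old and new

rng = np.random.default_rng(0)
X, H = 4, 8
Wz, Wr, Wh = (rng.normal(scale=0.1, size=(H, X + H)) for _ in range(3))
h = gru_step(rng.normal(size=X), np.zeros(H), Wz, Wr, Wh)
print(h.shape)
```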
Learned Representation
2D visualization of 'vectors' learned for sentences: similar sentences are close together in 'vector' space. Sutskever et al., 2014
Word Vectors
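A minimal sketch of what word vectors mean in practice (my own illustration, with random stand-in vectors rather than trained embeddings): each word is a row of an embedding matrix, and related words end up close by cosine similarity.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["king", "queen", "man", "woman", "banana"]
E = rng.normal(size=(len(vocab), 50))   # embedding matrix: one 50-d vector per word

def vec(w):
    return E[vocab.index(w)]

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# With trained embeddings, cosine("king", "queen") would be much higher than
# cosine("king", "banana"); with these random vectors both are near zero.
print(cosine(vec("king"), vec("queen")), cosine(vec("king"), vec("banana")))
```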
Sequence to Sequence
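A minimal sketch of the sequence-to-sequence idea (my own toy numpy code, not the slides'): an encoder RNN compresses the source sequence into a single vector, and a decoder RNN unrolls from that vector to emit the output sequence, here with greedy decoding.

```python
import numpy as np

rng = np.random.default_rng(0)
H, V_in, V_out, T_in, T_out = 16, 10, 12, 5, 4

def rnn_step(x, h, W_xh, W_hh):
    return np.tanh(W_xh @ x + W_hh @ h)

# Encoder: read the whole source sequence, keep only the final hidden state.
W_xh_enc = rng.normal(scale=0.1, size=(H, V_in))
W_hh_enc = rng.normal(scale=0.1, size=(H, H))
src = rng.normal(size=(T_in, V_in))
h = np.zeros(H)
for t in range(T_in):
    h = rnn_step(src[t], h, W_xh_enc, W_hh_enc)
thought_vector = h                      # the learned "sentence vector" from above

# Decoder: start from the thought vector and emit one output token per step.
W_xh_dec = rng.normal(scale=0.1, size=(H, V_out))
W_hh_dec = rng.normal(scale=0.1, size=(H, H))
W_hy = rng.normal(scale=0.1, size=(V_out, H))
y = np.zeros(V_out)                     # stand-in for the <start> token
outputs = []
for t in range(T_out):
    h = rnn_step(y, h, W_xh_dec, W_hh_dec)
    logits = W_hy @ h
    token = int(np.argmax(logits))      # greedy decoding; beam search comes later
    outputs.append(token)
    y = np.eye(V_out)[token]            # feed the prediction back in
print(outputs)
```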
Applications - Plenty
Neural Machine Translation - Alignment
NMT - Attention
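A minimal sketch of attention for NMT (dot-product scoring, one of several scoring functions in the literature; my own toy code): at each decoder step the decoder state is compared with every encoder state, the scores are softmaxed into alignment weights, and the context is their weighted sum.

```python
import numpy as np

rng = np.random.default_rng(0)
T_src, H = 6, 16
encoder_states = rng.normal(size=(T_src, H))   # one vector per source word
decoder_state = rng.normal(size=H)             # current decoder hidden state

scores = encoder_states @ decoder_state        # one score per source position
weights = np.exp(scores - scores.max())
weights /= weights.sum()                       # softmax -> soft alignment
context = weights @ encoder_states             # weighted sum of encoder states

print(weights.round(2))   # which source words the decoder "looks at" this step
print(context.shape)      # (H,): fed into the decoder along with its own state
```

The weights themselves form the soft alignment visualized on the previous slide.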
Single-modal learning vs. Multi-modal learning
(Diagram: separate models for Images and for Text, then a joint model that connects Images and Text)
Image Captioning - Show and Tell
Visual Attention - Show, Attend and Tell paper
Let every step of the RNN pick which part of a larger collection of information to look at.
Attention Model in action
Visual Question Answering
VQA is a new dataset containing open-ended questions about images. These questions require an understanding of vision, language and commonsense knowledge to answer.
Visual Question Answering - Attention
Source: Jiasen
Soft Attention
Hard Attention
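A minimal sketch of the difference (my own toy code, showing only the selection step): soft attention takes a weighted average of all locations and stays differentiable, while hard attention samples a single location and is typically trained with a reinforcement-style estimator such as REINFORCE.

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=(9, 16))    # e.g. a 3x3 grid of image features
scores = rng.normal(size=9)
weights = np.exp(scores - scores.max())
weights /= weights.sum()               # softmax over locations

soft_context = weights @ features      # soft: expected feature vector
hard_index = rng.choice(9, p=weights)  # hard: sample exactly one location
hard_context = features[hard_index]

print(soft_context.shape, int(hard_index))
```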
Hard Attention - Use case
Neural Paraphrase Generation
Source: Yours kindly
Neural Paraphrase Generation
Source: Yours kindly
Beam Search (to encourage diversity)
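A minimal sketch of beam search over a toy next-token distribution (my own code; the scoring function is a hypothetical stand-in for a real decoder): keep the best `beam_width` partial sequences at each step instead of only the single greedy choice, so several diverse hypotheses survive.

```python
import numpy as np

V, T, beam_width = 5, 3, 2

def next_token_logprobs(prefix):
    # Hypothetical stand-in for the decoder: deterministic pseudo-random scores.
    local = np.random.default_rng(hash(tuple(prefix)) % (2 ** 32)).normal(size=V)
    return local - np.log(np.exp(local).sum())   # log-softmax

beams = [([], 0.0)]                              # (token sequence, log-probability)
for _ in range(T):
    candidates = []
    for seq, score in beams:
        logp = next_token_logprobs(seq)
        for tok in range(V):
            candidates.append((seq + [tok], score + logp[tok]))
    # Keep only the highest-scoring hypotheses.
    beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]

for seq, score in beams:
    print(seq, round(score, 2))
```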
Sequence to Sequence - Modality agnostic
Recurrent Neural Networks - A recap
Vanilla Neural Networks
Recurrent Neural Networks - A recap
Image Captioning - image -> sequence of words
Recurrent Neural Networks - A recap
Sentiment Classification - sequence of words -> sentiment
Recurrent Neural Networks - A recap
Machine Translation - seq of words -> seq of words
Recurrent Neural Networks - A recap
Video classification (frame level)
VQA - ??? More on this later
RMVA - Recurrent Models of Visual Attention
- Glimpse sensor: a bandwidth-limited sensor of the input image. For example, if the input image is 28x28 (height x width), the RAM may only be able to sense an 8x8 area at any given time step; these crops are called glimpses.
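A minimal sketch of the glimpse sensor described above (my own toy code; the real RAM model also extracts the patch at multiple resolutions, this shows only a single 8x8 crop):

```python
import numpy as np

def glimpse(image, center_yx, size=8):
    """Extract a size x size patch centred at center_yx, clipped to the image."""
    h, w = image.shape
    y = int(np.clip(center_yx[0] - size // 2, 0, h - size))
    x = int(np.clip(center_yx[1] - size // 2, 0, w - size))
    return image[y:y + size, x:x + size]

image = np.arange(28 * 28, dtype=float).reshape(28, 28)  # stand-in for an MNIST digit
patch = glimpse(image, (14, 14))
print(patch.shape)   # (8, 8): all the RNN "sees" at this time step
```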
Unreasonable effectiveness of RNN/LSTM - Images
Even traditional areas where CNNs have done excellent work are being improved by the use of RNNs.
- Reads the number left to right (in steps)
Work by DeepMind: http://arxiv.org/abs/1412.7755
Unreasonable effectiveness of RNN/LSTM - Literature
Unreasonable effectiveness of RNN/LSTM - Math & LaTeX
Unreasonable effectiveness of RNN/LSTM - Math & LaTeX & Drawing
Unreasonable effectiveness of RNN/LSTM - Linux Source Code
Unreasonable effectiveness of RNN/LSTM - Bible!
References