CS331B: Representation Learning in Computer Vision
Amir R. Zamir
Silvio Savarese
(class logistics)
What we talked about so far...
(diagram: raw input → Representation → Mathematical Model (e.g., classifier) → task output, such as a speech “Transcript”, the label “Cat”, or “Macbeth was guilty.”; the representation is depicted as a feature vector [ 81 20 84 64 58 39 17 54 72 15 ])
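The pipeline on the slide can be sketched in a few lines. This is a toy illustration, not any particular system: `represent` and `classify` are hypothetical stand-ins for a representation function and a mathematical model.

```python
import numpy as np

# Toy sketch of the slide's pipeline:
# raw input -> representation -> mathematical model (e.g., classifier) -> output.
# All names, shapes, and weights here are illustrative.

rng = np.random.default_rng(0)

def represent(image):
    """Toy representation: flatten the input and normalize it to unit length."""
    v = image.astype(float).ravel()
    return v / (np.linalg.norm(v) + 1e-8)

def classify(feature, weights):
    """Toy linear classifier applied on top of the representation."""
    scores = weights @ feature
    return int(np.argmax(scores))

image = rng.integers(0, 256, size=(8, 8))    # stand-in for an input image
weights = rng.normal(size=(2, 64))           # 2 classes, 64-dim features
label = classify(represent(image), weights)  # e.g., 0 = "cat", 1 = "not cat"
```

The point of the course is what goes inside `represent`: handcrafted in the next slides, learned in the ones after.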
Some basic concepts related to representations
Handcrafting Representations
Color Histograms
Deformable Part based Models (DPM)
Histograms of Oriented Gradients
(HOG)
Shape-based Models
Felzenszwalb et al., 2010.
Dalal and Triggs, 2005.
Beis and Lowe, 1997.
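As a concrete example of a handcrafted representation, a color histogram can be computed in a few lines; the bin count and normalization scheme here are illustrative choices.

```python
import numpy as np

# Minimal sketch of a handcrafted color-histogram representation:
# per-channel intensity histograms, concatenated into one feature vector.

def color_histogram(image, bins=8):
    """Concatenate normalized per-channel intensity histograms."""
    feats = []
    for c in range(image.shape[2]):
        h, _ = np.histogram(image[:, :, c], bins=bins, range=(0, 256))
        feats.append(h / h.sum())  # each channel's histogram sums to 1
    return np.concatenate(feats)

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(16, 16, 3))  # stand-in for an RGB image
feat = color_histogram(img)                   # 3 channels x 8 bins = 24 dims
```

Such features are invariant to spatial layout, which is exactly why HOG and part-based models add gradient and geometric structure on top.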
Learning Representations
LeCun et al. 1998.
Hinton et al. 2006.
Unsupervised representation learning
Supervised representation learning
(figures, credit Stanford CS231n: a neuron; a neural net; a convolutional net)
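The basic unit shown on the CS231n slide, a single neuron, computes a weighted sum of its inputs plus a bias, passed through a nonlinearity. A minimal sketch with a sigmoid activation (the specific numbers are illustrative):

```python
import numpy as np

# A single artificial neuron: weighted sum of inputs plus bias,
# passed through a sigmoid nonlinearity.

def neuron(x, w, b):
    z = np.dot(w, x) + b              # weighted sum + bias
    return 1.0 / (1.0 + np.exp(-z))   # sigmoid activation, output in (0, 1)

x = np.array([0.5, -1.0, 2.0])  # inputs
w = np.array([0.1, 0.2, 0.3])   # weights
b = 0.0                         # bias
y = neuron(x, w, b)
```

A neural net stacks layers of such units; a convolutional net shares the same weights across spatial positions.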
Supervised Low-level Matching
Zagoruyko & Komodakis. 2015.
Lectures 4 & 6
Object-based Representations (ImageNet)
Krizhevsky et al. 2012.
Deng et al. 2009.
Zeiler & Fergus. 2014.
Escorcia et al. 2015.
Query | Nearest Neighbors
Scene-based Representations (MIT-Places)
Zhou et al. 2014.
(~static) Video Representations
Simonyan & Zisserman. 2014.
Karpathy et al. 2015.
Recurrent Models & Structured Prediction
Jain et al. 2016.
Today
Methods of Mixing Representations
Mixing Representations
Mixing Representations - How?
Li & Hoiem. 2016.
Mixing Representations - fine-tuning
Mixing Representations - joint (multi-task) training
Mixing Representations - feature extraction
Mixing Representations - LwF (Learning without Forgetting)
ECCV’16
Mixing Representations
Mixing Representations
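Two of the mixing strategies above can be contrasted directly: feature extraction keeps the pretrained layers frozen and trains only a new head, while fine-tuning updates everything. A toy numpy sketch, with all shapes, data, and learning rates illustrative (not the Li & Hoiem setup):

```python
import numpy as np

# Toy two-layer net on a new task. "Feature extraction" freezes the
# pretrained layer W1 and trains only the new head w2; "fine-tuning"
# updates both. Data and hyperparameters are illustrative.

rng = np.random.default_rng(0)
W_pre = rng.normal(size=(16, 32)) / np.sqrt(32)  # "pretrained" feature layer
X = rng.normal(size=(200, 32))                   # new task's inputs
y = rng.normal(size=(200,))                      # new task's targets

def train_on_new_task(X, y, W1, finetune, steps=200, lr=0.1):
    W1 = W1.copy()
    w2 = np.zeros(W1.shape[0])         # freshly initialized head
    for _ in range(steps):
        h = np.maximum(X @ W1.T, 0.0)  # ReLU features
        err = h @ w2 - y
        w2 -= lr * (h.T @ err) / len(X)     # the new head always trains
        if finetune:                        # fine-tuning also adapts W1
            dh = np.outer(err, w2) * (h > 0)
            W1 -= lr * (dh.T @ X) / len(X)
    return W1, w2

W_frozen, _ = train_on_new_task(X, y, W_pre, finetune=False)
W_tuned, _ = train_on_new_task(X, y, W_pre, finetune=True)
```

The trade-off on the slides: fine-tuning fits the new task better but drifts the shared representation (hurting the old task), which is what joint training and LwF try to avoid.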
Curriculum Learning
For faster convergence, better minima (and mixing representations)
Bengio et al. 2009.
Curriculum Learning
Guided learning helps training humans and animals
Shaping
Start from simpler examples / easier tasks (Piaget 1952, Skinner 1958)
Education
Bengio et al. 2009. slides and paper credit.
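The shaping idea can be sketched on a toy problem: score each example's difficulty, start training on the easiest subset, and grow the set in stages. This is an illustrative reduction of the curriculum strategy, not the experimental setup of Bengio et al.; "difficulty" here is simply an assumed-known per-example noise level.

```python
import numpy as np

# Toy curriculum: fit y ~ w * x by gradient descent, presenting
# low-noise (easy) examples first and adding harder ones in stages.

rng = np.random.default_rng(0)
noise_level = rng.uniform(0.0, 1.0, size=100)     # hypothetical difficulty score
x = rng.uniform(-1.0, 1.0, size=100)
y = 2.0 * x + rng.normal(size=100) * noise_level  # true slope is 2

order = np.argsort(noise_level)                   # easiest examples first
w = 0.0                                           # model: y ≈ w * x
for stage in (25, 50, 75, 100):                   # curriculum stages
    xs, ys = x[order[:stage]], y[order[:stage]]
    for _ in range(100):                          # gradient descent steps
        w -= 0.5 * np.mean((w * xs - ys) * xs)
```

The early, clean stages move `w` close to the target quickly; later stages refine it on the full distribution.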
The Dogma in question
It is best to learn from a training set of examples sampled from the same distribution as the test set. Really?
Question
Can machine learning algorithms benefit from a curriculum strategy?
Cognition journal:
(Elman 1993) vs (Rohde & Plaut 1999),
(Krueger & Dayan 2009)
Convex vs Non-Convex Criteria
Deep Architectures
Deep Training Trajectories
Random initialization
Unsupervised guidance
(Erhan et al. AISTATS 09)
Starting from Easy Examples
(diagram: training progresses through increasing levels of abstraction, 1 → 2 → 3)
Continuation Methods
(diagram: begin with a heavily smoothed objective (the surrogate criterion) whose minimum is easy to find, then track local minima while annealing toward the target objective to reach the final solution)
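The continuation idea can be made concrete on a 1-D non-convex objective. This sketch uses an objective and schedule chosen for illustration: Gaussian-smoothing f(x) = x² + sin(6x) with width σ shrinks the oscillation by exp(-18σ²) (and adds a constant), so the smoothed surrogate has a closed form, and we can descend it while annealing σ to zero.

```python
import numpy as np

# Continuation method on f(x) = x**2 + sin(6x): minimize a heavily
# smoothed surrogate first, then track its minimum as the smoothing
# parameter sigma is annealed to zero.

def f(x, sigma=0.0):
    return x**2 + sigma**2 + np.exp(-18.0 * sigma**2) * np.sin(6.0 * x)

def grad(x, sigma):
    return 2.0 * x + np.exp(-18.0 * sigma**2) * 6.0 * np.cos(6.0 * x)

x = 2.0                                   # start far from the global basin
for sigma in (1.0, 0.5, 0.25, 0.1, 0.0):  # annealing schedule
    for _ in range(400):                  # descend the current surrogate
        x -= 0.02 * grad(x, sigma)
```

Started at the same point, plain gradient descent on the unsmoothed objective typically gets stuck in a nearby local minimum; the annealed sequence of surrogates guides the iterate into the global basin near x ≈ -0.25.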
Curriculum Learning as Continuation
(diagram: the curriculum sweeps through levels of abstraction 1 → 2 → 3, analogous to a continuation method)
How to order examples?
Larger Margin First: Faster Convergence
Cleaner First: Faster Convergence
Shape Recognition
First: easier, basic shapes
Second = target: more varied geometric shapes
Shape Recognition Experiment
Shape Recognition Results
Why?
Curriculum = particular continuation method
This lecture
Energy-Based GANs
Zhao et al. 2016.
Generative Adversarial Networks
Goodfellow et al. 2014.
Generative Adversarial Networks
Kevin McGuinness. 2016.
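For reference, the two-player minimax objective from Goodfellow et al. 2014, where the discriminator $D$ is trained to tell real data from samples and the generator $G$ is trained to fool it:

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

At the optimum, $D^*(x) = p_{\mathrm{data}}(x) / (p_{\mathrm{data}}(x) + p_g(x))$, and the generator objective reduces to minimizing the Jensen-Shannon divergence between $p_{\mathrm{data}}$ and $p_g$.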
Generative Adversarial Networks
Generative Adversarial Networks
(samples on MNIST, CIFAR10, CIFAR100, TFD)
Problems with GAN
Energy-Based GANs
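In the energy-based formulation of Zhao et al. 2016, the discriminator is an autoencoder whose reconstruction error acts as an energy: pushed down on real data, pushed up (to a margin $m$) on generated samples:

```latex
\mathcal{L}_D(x, z) = D(x) + \bigl[m - D(G(z))\bigr]^{+}, \qquad
\mathcal{L}_G(z) = D(G(z)), \qquad
D(u) = \lVert \mathrm{Dec}(\mathrm{Enc}(u)) - u \rVert,
```

where $[\cdot]^{+} = \max(0, \cdot)$. The "PT term" on the next slides is the paper's pulling-away regularizer, which decorrelates the encoder representations of generated samples within a minibatch to discourage mode collapse.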
EBGAN vs GAN (on MNIST)
(sample grids: GAN | EBGAN | EBGAN with PT term)
EBGAN vs GAN (on LSUN)
(sample grids: GAN | EBGAN)
EBGAN vs GAN (on CelebA)
(sample grids: GAN | EBGAN)
EBGAN (on ImageNet)
A use case of such distribution: “Generative Visual Manipulation on the Natural Image Manifold”
Zhu et al. 2016.
Understanding and Probing Representations
(very brief executive summary)
Understanding Representations
Nearest neighbors in full-dimensional space
Query | Nearest Neighbors
Krizhevsky et al. 2012.
Deng et al. 2009.
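Probing a representation by nearest neighbors takes only a few lines: embed a query and a gallery with the same feature extractor and rank gallery items by distance in the full feature space. Here the "extractor" is a toy random projection standing in for a trained network.

```python
import numpy as np

# Nearest-neighbor probing of a representation. The random projection W
# is a stand-in for a learned feature extractor; shapes are illustrative.

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 3 * 32 * 32))  # toy feature extractor

def features(images):
    """L2-normalized features for a batch of images."""
    flat = images.reshape(len(images), -1)
    f = flat @ W.T
    return f / np.linalg.norm(f, axis=1, keepdims=True)

gallery = rng.normal(size=(100, 3, 32, 32))
query = gallery[7] + 0.01 * rng.normal(size=(3, 32, 32))  # near-copy of item 7

g = features(gallery)
q = features(query[None])[0]
dists = np.linalg.norm(g - q, axis=1)
neighbors = np.argsort(dists)[:5]  # indices of the 5 nearest gallery items
```

With a trained representation, semantically similar images (not just near-copies) cluster as neighbors, which is what the ImageNet retrieval slides illustrate.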
Nearest neighbors in full-dimensional space
Zamir et al. 2016.
Wang & Gupta. 2015.
Agrawal et al. 2015.
Krizhevsky et al. (ImageNet). 2012.
Low-dimensional embeddings
Van der Maaten & Hinton. 2008.
Inverting the representation
Simonyan et al. 2014.
Class appearance models (ImageNet)
(figure: image → top-1 class saliency map → thresholded saliency map (for segmentation) → foreground segment)
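The saliency construction in the spirit of Simonyan et al. 2014 scores each pixel by the magnitude of the class score's gradient with respect to that pixel. A toy sketch with a linear scorer (so the gradient is available in closed form; in a real network it comes from one backprop pass), with all names and shapes illustrative:

```python
import numpy as np

# Class-saliency sketch: saliency of a pixel = |d(class score)/d(pixel)|.
# W is a toy linear scorer over 8x8 "images"; for a linear model the
# input gradient of class c's score is simply W[c].

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 64))  # 10 classes, 64 input pixels

def class_score(image, c):
    return W[c] @ image.ravel()

def saliency_map(image, c):
    """Per-pixel importance for class c (gradient magnitude)."""
    return np.abs(W[c]).reshape(image.shape)

img = rng.normal(size=(8, 8))
top_class = int(np.argmax(W @ img.ravel()))
sal = saliency_map(img, top_class)   # top-1 class saliency map
mask = sal > np.percentile(sal, 75)  # thresholded map, as on the slide
```

Thresholding the map gives the coarse foreground mask used for the segmentation result on the slide.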
Inverting the representation
Dalal & Triggs. 2005.
Vondrick et al. 2013.
Mahendran & Vedaldi. 2016.
Dosovitskiy & Brox. 2016.
(figure columns: Image | HOG | HOGgles | HOG^-1 (Mahendran & Vedaldi) | Dosovitskiy & Brox)
Inverting the representation
Dosovitskiy & Brox. 2016.
Mahendran & Vedaldi. 2016.
Hinton et al. 2006.
(figure rows: Dosovitskiy & Brox | Mahendran & Vedaldi | autoencoder (AE))
Understanding Representations