DATE | PRESENTER | TOPIC | PAPERS | IMPORTANT DATES
January 18 | Abhinav | Introduction to the class
January 23 | Abhinav | Theories of vision
January 25 | Abhinav | Theories of vision
January 30 | Abhinav | Introduction to Data
• A. Halevy, P. Norvig, and F. Pereira. The Unreasonable Effectiveness of Data. IEEE Intelligent Systems, 24(2):8–12, 2009.
• A. Torralba, and A. Efros. Unbiased Look at Dataset Bias. CVPR 2011
February 1 | Abhinav | Introduction to Deep Learning-1
• A. Krizhevsky, I. Sutskever, and G.E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. NIPS 2012
• K. Simonyan, A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. ICLR 2015
February 6 | Abhinav | Introduction to Deep Learning-2
• A. Mahendran, A. Vedaldi. Understanding Deep Image Representations by Inverting Them. CVPR 2015
• M.D. Zeiler, R. Fergus. Visualizing and Understanding Convolutional Networks. ECCV 2014
February 8 | Abhinav | Introduction to Deep Learning-3
• He, Kaiming, et al. "Deep residual learning for image recognition." arXiv preprint arXiv:1512.03385 (2015).
February 13 | Senthil | Introduction to Caffe
• Jia, Yangqing, et al. "Caffe: Convolutional architecture for fast feature embedding." Proceedings of the 22nd ACM international conference on Multimedia. ACM, 2014.
ASSIGNMENT 1 RELEASED
February 15 | Lerrel | Introduction to Torch
February 20 | Senthil + Lerrel | Assignment 1 Q&A
February 22 | Abhinav | Object Detection
Background:
• R. Girshick, J. Donahue, T. Darrell, J. Malik. Region-based Convolutional Networks for Accurate Object Detection and Semantic Segmentation. TPAMI 2015
• R. Girshick. Fast R-CNN. ICCV 2015
• S. Ren, K. He, R. Girshick, J. Sun. Faster R-CNN. NIPS 2015.


Readings (Presented in next class):
• Shrivastava, Abhinav, Abhinav Gupta, and Ross Girshick. Training region-based object detectors with online hard example mining. CVPR 2016.
• Liu, Wei, et al. SSD: Single Shot MultiBox Detector. arXiv preprint arXiv:1512.02325 (2015).
February 27 | Abhinav | Image Segmentation
Background:
J. Long, E. Shelhamer, T. Darrell. Fully Convolutional Networks for Semantic Segmentation. CVPR 2015
B. Hariharan, P. Arbelaez, R. Girshick, J. Malik. Hypercolumns for Object Segmentation and Fine-grained Localization. CVPR 2015


Reading (Presented in next class):
Li, Ke, Bharath Hariharan, and Jitendra Malik. "Iterative Instance Segmentation." arXiv preprint arXiv:1511.08498 (2015).
Noh, Hyeonwoo, Seunghoon Hong, and Bohyung Han. "Learning deconvolution network for semantic segmentation." ICCV 2015.
March 1 | Senthil | Weakly Supervised Object Detection
Background:
Oquab, Maxime, et al. "Is object localization for free?-weakly-supervised learning with convolutional neural networks." CVPR. 2015.
Cinbis, Ramazan Gokberk, Jakob Verbeek, and Cordelia Schmid. "Multi-fold mil training for weakly supervised object localization." CVPR, 2014.
Bilen, Hakan, and Andrea Vedaldi. "Weakly Supervised Deep Detection Networks." arXiv preprint arXiv:1511.02853 (2015).

Reading (Presented in next class):
Kantorov, Vadim, et al. "ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Localization." ECCV, 2016.
Pathak, Deepak, Philipp Krahenbuhl, and Trevor Darrell. "Constrained convolutional neural networks for weakly supervised segmentation." ICCV 2015.
March 6 | Abhinav | 3D Understanding
Background:
D. Eigen, R. Fergus. Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture. ICCV 2015.
Zoran, D., Isola, P., Krishnan, K., Freeman, W.T. Learning Ordinal Relationships for Mid-Level Vision. ICCV 2015.
Girdhar, Rohit, et al. "Learning a Predictable and Generative Vector Representation for Objects." arXiv preprint arXiv:1603.08637 (2016).

Readings (Presented in next class):
Wu, Jiajun, et al. "Single image 3d interpreter network." ECCV 2016.
Zhou, Tinghui, et al. "View synthesis by appearance flow." ECCV 2016.
ASSIGNMENT 1 DUE, ASSIGNMENT 2 RELEASED
March 8 | ???? | Human Pose Estimation
Background:
Tompson, Jonathan, et al. "Efficient object localization using convolutional networks." CVPR. 2015.
Newell, Alejandro, Kaiyu Yang, and Jia Deng. "Stacked hourglass networks for human pose estimation." arXiv preprint arXiv:1603.06937 (2016).
Wei, Shih-En, et al. "Convolutional Pose Machines." arXiv preprint arXiv:1602.00134 (2016).

Readings (Presented in next class):
Carreira, Joao, et al. "Human pose estimation with iterative error feedback." arXiv preprint arXiv:1507.06550 (2015).
Cao, Zhe, et al. "Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields." arXiv preprint arXiv:1611.08050 (2016).
March 13 | Spring Break
March 15 | Spring Break
March 20 | ??? | Action Recognition
Background:
Simonyan, Karen, and Andrew Zisserman. "Two-stream convolutional networks for action recognition in videos." Advances in Neural Information Processing Systems. 2014.
D. Tran, L. Bourdev, R. Fergus, L. Torresani, M. Paluri. Learning Spatiotemporal Features with 3D Convolutional Networks. ICCV 2015.
X. Wang, A. Farhadi, A. Gupta. Actions~Transformations. Arxiv 2015.

Reading (Presented in next class):
Wang, Limin, et al. "Temporal segment networks: towards good practices for deep action recognition." ECCV, 2016.
Gkioxari, Georgia, Ross Girshick, and Jitendra Malik. "Contextual action recognition with R*CNN." ICCV 2015.
Feichtenhofer, Christoph, Axel Pinz, and Andrew Zisserman. "Convolutional two-stream network fusion for video action recognition." CVPR 2016.
Bilen, Hakan, et al. "Dynamic image networks for action recognition." CVPR 2016.
March 22 | ??? | Training with synthetic models
Background:
Su, Hao, et al. "Render for CNN: Viewpoint estimation in images using CNNs trained with rendered 3D model views." ICCV 2015.

Reading:
A. Bansal, B. Russell, A. Gupta. Marr Revisited: 2D-3D Alignment via Surface Normal Prediction. Arxiv 2016
Varol, Gül, et al. "Learning from Synthetic Humans." arXiv preprint arXiv:1701.01370 (2017).
March 27 | Lerrel?? | Introduction to TensorFlow
Abadi, Martín, et al. "TensorFlow: A system for large-scale machine learning." arXiv preprint arXiv:1605.08695 (2016).
ASSIGNMENT 2 DUE
March 29 | Lerrel/Senthil | Mid-Term Presentations
April 3 | Abhinav Gupta | Self-Supervised Learning
Background:
Doersch, Carl, Abhinav Gupta, and Alexei A. Efros. "Unsupervised visual representation learning by context prediction." Proceedings of the IEEE International Conference on Computer Vision. 2015.
Wang, Xiaolong, and Abhinav Gupta. "Unsupervised learning of visual representations using videos." Proceedings of the IEEE International Conference on Computer Vision. 2015.
Pinto, Lerrel, and Abhinav Gupta. "Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours." Robotics and Automation (ICRA), 2016 IEEE International Conference on. IEEE, 2016.

Reading:
Jayaraman, Dinesh, and Kristen Grauman. "Learning image representations tied to ego-motion." ICCV 2015.
Agrawal, Pulkit, Joao Carreira, and Jitendra Malik. "Learning to see by moving." ICCV 2015.
Misra, Ishan, C. Lawrence Zitnick, and Martial Hebert. "Shuffle and learn: unsupervised learning using temporal order verification." ECCV 2016.
Noroozi, Mehdi, and Paolo Favaro. "Unsupervised learning of visual representations by solving jigsaw puzzles." ECCV 2016.
April 5 | Abhinav Gupta / Xiaolong Wang | Webly Supervised
Background:
X. Chen, A. Shrivastava, A. Gupta. NEIL: Extracting Visual Knowledge from Web Data. ICCV 2013
X. Chen, A. Gupta. Webly Supervised Learning of Convolutional Networks. ICCV 2015.

Reading on April 12:
S. Divvala, A. Farhadi, C. Guestrin. Learning Everything about Anything: Webly-Supervised Visual Concept Learning. CVPR 2014.
A. Joulin, L. van der Maaten, A. Jabri, N. Vasilache. Learning Visual Features from Large Weakly Supervised Data. Arxiv 2015.
ASSIGNMENT 3 RELEASED
April 10 | Gunnar | Deep Sequential Models
April 12 | Abhinav | Generative Models
Background:
D. Kingma, M. Welling. Auto-Encoding Variational Bayes. ICLR 2014.
A. Radford, L. Metz, S. Chintala. Unsupervised Representation Learning With Deep Convolutional Generative Adversarial Networks. Arxiv 2015.

Reading on April 17:
Wang, Xiaolong, and Abhinav Gupta. "Generative image modeling using style and structure adversarial networks." ECCV, 2016.

Reading on April 19:
Isola, Phillip, et al. "Image-to-image translation with conditional adversarial networks." arXiv preprint arXiv:1611.07004 (2016).
Zhang, Han, et al. "StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks." arXiv preprint arXiv:1612.03242 (2016).
April 17 | Lerrel | Introduction to Deep RL
April 19 | ??? | Image Captioning & VQA
Background:
O. Vinyals, A. Toshev, S. Bengio, D. Erhan. Show and Tell: A Neural Image Caption Generator. CVPR 2015.
Antol, Stanislaw, et al. "VQA: Visual question answering." ICCV 2015.
J. Devlin, S. Gupta, R. Girshick, M. Mitchell, C.L. Zitnick. Exploring Nearest Neighbor Approaches for Image Captioning. Arxiv 2015.
B. Zhou, Y. Tian, S. Sukhbaatar, A. Szlam, R. Fergus. Simple Baseline for Visual Question Answering. Arxiv 2015.

Reading:
Xu, Kelvin, et al. "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention." ICML 2015.
Donahue, Jeffrey, et al. "Long-term recurrent convolutional networks for visual recognition and description." CVPR 2015.
ASSIGNMENT 3 DUE (APRIL 21st)
April 24 | ??? | Reasoning: Context
Background:
D. Hoiem, A.A. Efros, M. Hebert. Putting Objects in Perspective. CVPR 2006.
S.K. Divvala, D. Hoiem, J.H. Hays, A.A. Efros, M. Hebert. An Empirical Study of Context in Object Detection. CVPR 2009.

Reading (Presented on 04/26):
Shrivastava, Abhinav, and Abhinav Gupta. "Contextual Priming and Feedback for Faster R-CNN." ECCV, 2016.
Xiaolong Wang, David F. Fouhey, and Abhinav Gupta. Designing Deep Networks for Surface Normal Estimation. Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.

New Topics (Presented on 04/26):
J. Walker, C. Doersch, A. Gupta, and M. Hebert. An Uncertain Future: Forecasting from Static Images Using Variational Autoencoders. ECCV 2016
Lerer, Adam, Sam Gross, and Rob Fergus. "Learning physical intuition of block towers by example." arXiv preprint arXiv:1603.01312 (2016).
April 26 | ??? | Self-supervised Actions
May 1 | Project Presentations
May 3 | Project Presentations