ECCV 2016: paper summaries

Note: This is a Google Doc. If you would like to contribute by adding notes about other papers (e.g. papers that I missed, orals that I didn’t attend, your own paper), or if you want to add notes or make corrections to what is already here, then:

Please request to edit this document or send me an e-mail, and I will happily give you edit permissions.

Notes prepared by: Zoya Bylinskii, ...

Oral papers*:

(* the ones I successfully made it to)

CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples

Filip Radenovic, CMP, CVUT; Giorgos Tolias, CMP, CVUT; Ondra Chum, CMP, CVUT

SSD: Single Shot MultiBox Detector

Wei Liu, UNC Chapel Hill; Dragomir Anguelov, Zoox; Dumitru Erhan, Google; Christian Szegedy, Google; Scott Reed, University of Michigan, Ann Arbor; Cheng-Yang Fu, UNC Chapel Hill; Alex Berg, UNC Chapel Hill

A Recurrent Encoder-Decoder Network for Sequential Face Alignment

Xi Peng, Rutgers University; Rogerio Feris, IBM Research Center, USA; Xiaoyu Wang, Snapchat Research; Dimitris Metaxas, Rutgers University

Robust Facial Landmark Detection via Recurrent Attentive-Refinement Networks

Shengtao Xiao, National University of Singapore; Jiashi Feng, NUS; Junliang Xing, Chinese Academy of Sciences; Hanjiang Lai, Sun Yat-sen University; Shuicheng Yan, National University of Singapore; Ashraf Kassim, National University of Singapore

Ambient sound provides supervision for visual learning

Andrew Owens, MIT; Jiajun Wu, MIT; Josh McDermott, MIT; Antonio Torralba, MIT; William Freeman, MIT

Grounding of Textual Phrases in Images by Reconstruction

Anna Rohrbach; Marcus Rohrbach, UC Berkeley; Ronghang Hu, UC Berkeley; Trevor Darrell, UC Berkeley; Bernt Schiele

Improving Multi-label Learning with Missing Labels by Structured Semantic Correlations

Hao Yang, NTU; Joey Tianyi Zhou, IHPC; Jianfei Cai, NTU

Visual Relationship Detection with Language Priors

Cewu Lu, Stanford University; Ranjay Krishna, Stanford University; Michael Bernstein, Stanford University; Fei-Fei Li, Stanford University

The Fast Bilateral Solver

Jonathan Barron, Google; Ben Poole, Stanford University

[Honorable mention award!]

Phase-based Modification Transfer for Video

Simone Meyer, ETH Zurich; Alexander Sorkine-Hornung, Disney Research Zurich; Markus Gross, ETH Zurich

Colorful Image Colorization

Richard Zhang, UC Berkeley; Phillip Isola, MIT; Alexei Efros

Focal flow: Measuring depth and velocity from defocus and differential motion

Emma Alexander, Harvard University; Qi Guo, Harvard University; Sanjeev Koppal, University of Florida; Steven Gortler; Todd Zickler

[Best student paper award!]

Top-down Neural Attention by Excitation Backprop

Jianming Zhang; Zhe Lin, Adobe Systems, Inc.; Jonathan Brandt; Xiaohui Shen, Adobe; Stan Sclaroff, Boston University

Learning Recursive Filters for Low-Level Vision via a Hybrid Neural Network

Sifei Liu, UC Merced; Jinshan Pan, UC Merced; Ming-Hsuan Yang, UC Merced

Learning Representations for Automatic Colorization

Gustav Larsson, University of Chicago; Michael Maire, Toyota Technological Institute at Chicago; Greg Shakhnarovich, TTI Chicago, USA

Spot On: Action Localization from Pointly-Supervised Proposals

Pascal Mettes, University of Amsterdam; Jan van Gemert, Delft University of Technology; Cees Snoek, University of Amsterdam

Detecting Engagement in Egocentric Video

Yu-Chuan Su, University of Texas at Austin; Kristen Grauman, University of Texas at Austin

Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking

Martin Danelljan, Linköping University; Andreas Robinson, Linköping University; Fahad Khan, Linköping University, Sweden; Michael Felsberg, Linköping University

Look-ahead before you leap: end-to-end active recognition by forecasting the effect of motion

Dinesh Jayaraman, UT Austin; Kristen Grauman, University of Texas at Austin