Large-Scale Object Detection with Limited Supervision: An Empirical Study [Tai-Yu Pan and Cheng Zhang]
The long-tailed phenomenon
[Figure: # occurrences of objects in the LVIS dataset; example categories: surgical gown, biscuit, dog, pizza]
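The long tail can be made concrete by counting how often each category is annotated and ranking the counts. The following is a minimal sketch, assuming a hypothetical flat list of per-instance category labels; the function names and the toy counts are invented for illustration and are not LVIS statistics.

```python
from collections import Counter

def occurrence_ranking(instance_labels):
    """Return (category, count) pairs sorted from most to least frequent."""
    return Counter(instance_labels).most_common()

def tail_share(ranking, max_count):
    """Fraction of categories that have at most `max_count` occurrences."""
    tail = [category for category, count in ranking if count <= max_count]
    return len(tail) / len(ranking)

# Toy labels, invented for illustration only (not real dataset statistics).
labels = ["dog"] * 50 + ["pizza"] * 30 + ["biscuit"] * 3 + ["surgical gown"] * 1
ranking = occurrence_ranking(labels)
print(ranking)
print(tail_share(ranking, max_count=10))
```

Plotting the ranked counts on a log scale is one common way to visualize such a distribution.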
Problem for detection in the tail
How to train a good detector when few labeled examples are available?
Detection (scene-centric, context)
Low-shot classification (object-centric images)
Many examples
Few examples
Plans and algorithms
Pascal VOC (2005-2012): 20 classes, uniform
ImageNet (2009): 21K classes, long-tailed, object-centric
MSCOCO (2014): 80 classes, long-tailed, scene-centric
LVIS (2019): 1.2K classes, long-tailed, scene-centric, fine-grained
Dataset
Method
Setting
+
Train
Deploy
Baselines
Faster R-CNN
YOLO
Advanced approaches
RepMet [CVPR 19]
Meta R-CNN
Meta-transfer [ICCV 19]
Re-weighting [ICCV 19]
……
Experiments and analysis
Evaluation on existing few-shot setting
Analysis and expectations
Method | Setting 1 | …… | Setting N |
Faster R-CNN | | | |
RepMet | | | |
Meta R-CNN | | | |
…… | | | |
Benchmarks: VOC, COCO, ImageNet, etc.
Metric: mean average precision (mAP)
Both quantitative and qualitative study
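For reference, mAP is the mean over classes of the per-class average precision (AP), the area under the precision-recall curve of confidence-ranked detections. The sketch below assumes each detection has already been marked as a true or false positive by IoU matching against ground truth; that matching step, the function names, and the all-point interpolation used here are assumptions, and the exact protocol differs between benchmarks (for example, VOC uses an IoU threshold of 0.5, while COCO averages over several thresholds).

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """AP for one class: area under the precision-recall envelope."""
    if num_ground_truth == 0:
        return 0.0
    # Rank detections by descending confidence.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    recalls, precisions = [], []
    tp = fp = 0
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Make precision non-increasing in recall (the precision envelope).
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum rectangle areas wherever recall increases.
    ap, prev_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap

def mean_average_precision(per_class_ap):
    """Mean of per-class AP values given as a {class_name: ap} dict."""
    return sum(per_class_ap.values()) / len(per_class_ap)

# Toy detections, invented for illustration only.
per_class_ap = {
    "person": average_precision([0.9, 0.8, 0.7], [True, False, True], 2),
    "dog": average_precision([0.6], [True], 1),
}
print(per_class_ap, mean_average_precision(per_class_ap))
```

Reporting AP separately for base and novel classes, in addition to the overall mean, makes it easier to see where a few-shot method gains or loses.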
Few-shot classifier
Detection model
Localization?
Base class: person
Novel class: dog
Novel class (dog) may have already been seen as background!
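The background issue can be illustrated with the label assignment step of a typical two-stage detector. This is a minimal sketch under stated assumptions: the IoU threshold of 0.5, the assignment rule, the function names, and the toy boxes are all illustrative and not taken from any specific method listed above. During base training only the person is annotated, so a proposal that tightly covers the unlabeled dog is assigned to background.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def assign_labels(proposals, annotations, fg_threshold=0.5):
    """Give each proposal the best-overlapping annotated class, else 'background'."""
    labels = []
    for box in proposals:
        best_label, best_iou = "background", 0.0
        for gt_box, gt_label in annotations:
            overlap = iou(box, gt_box)
            if overlap >= fg_threshold and overlap > best_iou:
                best_label, best_iou = gt_label, overlap
        labels.append(best_label)
    return labels

# Base-training image: the person is annotated, the dog is present but unlabeled.
annotations = [((10, 10, 60, 120), "person")]
proposals = [
    (12, 8, 58, 118),    # overlaps the annotated person
    (80, 70, 150, 120),  # covers the unlabeled dog
]
print(assign_labels(proposals, annotations))
```

The second proposal is trained as a negative example, so the detector may learn to suppress the very objects it is later asked to detect as a novel class.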