1 of 3

Large-Scale Object Detection with Limited Supervision: An Empirical Study [Tai-Yu Pan and Cheng Zhang]

The long-tailed phenomenon

[Figure: number of occurrences (y-axis: # Occurrences) per object category (x-axis: Objects in the LVIS dataset); example categories: surgical gown, biscuit, dog, pizza]
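The occurrence curve above can be reproduced from any list of per-object category labels. The following is a minimal sketch, not the authors' code: the function names, the `min_head_count` threshold, and the toy label counts are all assumptions made for illustration and are not taken from LVIS.

```python
from collections import Counter


def occurrences_per_category(labels):
    """Return (category, count) pairs sorted from most to least frequent."""
    return Counter(labels).most_common()


def split_head_tail(counts, min_head_count):
    """Split categories into head (count >= threshold) and tail (count < threshold).

    The threshold is a hypothetical knob chosen for this sketch.
    """
    head = [category for category, n in counts if n >= min_head_count]
    tail = [category for category, n in counts if n < min_head_count]
    return head, tail


# Toy labels (made-up counts, for illustration only).
toy_labels = ["dog"] * 5 + ["pizza"] * 3 + ["biscuit"] * 2 + ["surgical gown"]
toy_counts = occurrences_per_category(toy_labels)
toy_head, toy_tail = split_head_tail(toy_counts, min_head_count=3)
```

Plotting the sorted counts gives the long-tailed curve; the tail list holds the categories for which few labeled examples are available.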

Problem for detection in the tail

How to train a good detector when few labeled examples are available?

Detection (scene-centric, context)

Low-shot classification (object-centric image)

Many examples

Few examples

2 of 3

Plans and algorithms

Pascal VOC (2005-2012): 20 classes, uniform

ImageNet (2009): 21K classes, long-tailed, object-centric

MSCOCO (2014): 80 classes, long-tailed, scene-centric

LVIS (2019): 1.2K classes, long-tailed, scene-centric, fine-grained

[Diagram: Dataset, Method, Setting + Train, Deploy]

Baselines

Faster R-CNN

YOLO

Advanced approaches

RepMet [CVPR 19]

Meta R-CNN

Meta-transfer

[ICCV 19]

Re-weighting [ICCV 19]

……

3 of 3

Experiments and analysis

  • Compare state-of-the-art methods and analyze their limitations

Evaluation on existing few-shot setting

Analysis and expectations

Methods         Setting 1    ……    Setting N
Faster R-CNN
RepMet
Meta R-CNN
……

Benchmarks: VOC, COCO, ImageNet, etc.

Metric: mean average precision (mAP)
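As a reminder of what the metric measures, here is a minimal sketch of average precision for one class and its mean over classes. It assumes each detection has already been matched to ground truth (true positive or false positive) and uses all-point interpolation of the precision-recall curve; benchmark toolkits differ in matching rules and interpolation, so this is an illustration rather than any benchmark's official evaluation code. All function and variable names are hypothetical.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the interpolated precision-recall curve for one class."""
    if num_ground_truth == 0 or not scores:
        return 0.0
    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)
    # Interpolate: precision at each rank becomes the max precision at any later rank.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum precision weighted by each increase in recall.
    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_average_precision(ap_per_class):
    """Average the per-class AP values (dict: class name -> AP)."""
    return sum(ap_per_class.values()) / len(ap_per_class)


# Toy example: three detections of one class, two ground-truth objects.
toy_ap = average_precision([0.9, 0.8, 0.7], [True, False, True], num_ground_truth=2)
toy_map = mean_average_precision({"dog": toy_ap, "person": 1.0})
```

Because mAP averages over classes rather than over instances, rare tail classes weigh as much as frequent ones, which is why tail performance matters for this metric.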

Both quantitative and qualitative studies

  • Rethink more realistic scenarios for object detection in the wild

Few-shot classifier

Detection model

Localization?

Base class: person

Novel class: dog

Novel class (dog) may have already been seen as background!
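The background issue can be illustrated with a generic IoU-based label assignment, as used in many detector training pipelines. This is a sketch under assumptions: the box format `(x1, y1, x2, y2)`, the 0.5 threshold, the function names, and the toy boxes are all hypothetical and do not come from the slides.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_labels(proposals, annotations, iou_threshold=0.5):
    """Give each proposal the class of its best-overlapping annotation, else 'background'."""
    labels = []
    for proposal in proposals:
        best_label, best_iou = "background", 0.0
        for box, class_name in annotations:
            overlap = iou(proposal, box)
            if overlap >= iou_threshold and overlap > best_iou:
                best_label, best_iou = class_name, overlap
        labels.append(best_label)
    return labels


# Toy scene with a person and a dog; only the base class is annotated in base training.
person_box, dog_box = (0, 0, 10, 20), (30, 10, 50, 20)
proposals = [(0, 0, 10, 20), (31, 10, 50, 20)]
base_only_labels = assign_labels(proposals, [(person_box, "person")])
full_labels = assign_labels(proposals, [(person_box, "person"), (dog_box, "dog")])
```

With base-only annotations, the proposal that covers the dog is assigned to background, so the detector is explicitly trained to suppress a class it is later asked to detect.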