1 of 17

The Visipedia Project:

A Retrospective Vision

Serge Belongie

Professor, DIKU

Director, P1

www.aicentre.dk

2 of 17

Visipedia: Pre-History

1995-2007

UC Berkeley

UC San Diego

3 of 17

Grad School Years: 1995-2000

  • UC Berkeley
    • Jitendra Malik’s group
  • Popular datasets
    • COIL-100, ALOI
    • Objects on turntables
  • Popular techniques
    • Expectation-Maximization
    • Kernel Machines
    • Optical Flow / Feature Tracking
    • Bundle Adjustment
    • Graph Cuts
    • Random Fields

4 of 17

Shape Contexts

5 of 17

Remarks: 1995-2000

  • Hammers looking for nails
    • Math/Physics Envy
  • ConvNets not mainstream
  • Could get a PhD with MNIST experiments
    • Outperformed LeNet-4
    • Soon overtaken by SVMs
  • “Big Data Moment” was yet to come

6 of 17

Tenure Track: 2001-2007

  • UC San Diego
  • Popular datasets
    • Berkeley Segmentation Benchmark
    • Caltech 101 & 256
    • TinyImages
    • ImageNet
  • Popular techniques
    • Graphical Models
    • Crowdsourcing
    • Particle Filtering
    • Feature Detectors/Descriptors
    • Boosting, Random Forests
    • Big Image Data

Smart Vivarium

7 of 17

Remarks: 2001-2007

  • “Dataset AI” era begins
    • Big D**k Data
    • Visipedia vision emerges
  • Antedeepluvian Era
    • Before Deep Learning
  • High Error Rates
    • Far from saturation
    • Team morale struggles
  • AI Ethics, Bias, Fairness
    • Not mainstream yet

8 of 17

Visipedia is born

2008-2013

Aim: capture and share visual expertise

9 of 17

Full Professor: 2008-2013

  • Caltech sabbatical
  • Cornell Lab of Ornithology visit
  • Popular datasets
    • PASCAL, MSRC
    • Oxford Flowers, CUB-200
  • Google Research Award
  • Popular techniques
    • Human-in-the-Loop
    • Parts and Attributes
    • Few-Shot Learning
    • Deep Learning

[ECCV 2010, NeurIPS 2010]

10 of 17

Remarks: 2008-2013

  • Community-first
    • “Identify Species” button
    • “Robustness to label error”
    • Fine-Grained, Long-Tailed
    • FGVC workshop begins
  • Industry Research Labs
    • Google, Facebook, Microsoft
    • Huge impact: GPUs & funding
  • Public-Private partnership
    • GBIF + TensorFlow Hub
  • Deep Learning exploded onto the scene
    • AlexNet ILSVRC 2012
    • Razavian et al. “CNN features” preprint 2013

11 of 17

Merlin Bird ID & iNaturalist collaborations

2014-2021

12 of 17

Merlin Bird ID & iNaturalist collaborations: 2014-2021

  • Google engagement
    • Internships, Visiting Faculty
    • Project V
  • Popular datasets
    • COCO
    • iNaturalist
  • Popular techniques
    • Deep Learning
    • LSTMs
    • GANs
    • BERT

13 of 17

Remarks: 2014-2021

  • Deep Learning
    • Totally transformed CV/ML
  • Data Quality
    • Attribution, Provenance
  • Powered-by-Visipedia
    • Put the community in front
  • Natural World emphasis
    • Biodiversity
    • Ecology
    • Conservation
  • AI Ethics
    • Front and center

Google Trends: AI Ethics

14 of 17

2022-present: Multimodal, Generative AI, and Beyond

[Van Horn et al. ECCV 2022]

15 of 17

Conclusion

  • Participatory Design
    • AI supporting people
  • Moonshot Thinking
  • Theory & Practice
  • Big Data & Big Compute
    • Unreasonably effective
  • Partnerships
    • Critical to success
    • Academia, Industry, NGO, Non-Profit
    • Joint ownership of AI ethics/bias challenges

https://visipedia.org
