Demystifying Self-Supervised Learning for Visual Recognition
Sayak Paul (@RisingSayak)
SciPy Japan 2020
$whoami
Agenda
Representation learning
Representation learning ftw!
Training models to learn representations for tasks like image classification, object detection, semantic segmentation, and so on.
Large pool of data
Train a model to learn ...
Extract the learned representations
Use the representations in a downstream task
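The four steps above can be sketched in code. Everything here is a stand-in: `pretrained_encoder` plays the role of a network trained on a large unlabeled pool (here just a fixed random projection with a ReLU), and the "downstream task" is simply reusing its frozen outputs as features.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(784, 128))  # hypothetical learned encoder weights

def pretrained_encoder(images):
    """Map raw inputs to learned representations (encoder stays frozen)."""
    return np.maximum(images @ W, 0.0)  # simple ReLU features

# Downstream: extract representations for a small labeled set and
# hand them to whatever task-specific model comes next.
images = rng.normal(size=(10, 784))    # stand-in for a labeled dataset
features = pretrained_encoder(images)  # step: extract representations
print(features.shape)                  # (10, 128)
```

In practice the encoder is a deep network and the features feed a classifier, detector head, or segmentation head.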
The unreasonable success of supervised representation learning
Source: EfficientNet; Tan et al. (2019)
But ...
Often, this success is constrained by
Source: Big Transfer (BiT); Kolesnikov et al. (2020)
But ...
Gathering large amounts of labeled data
Self-supervised learning - Intro
Pretext task - Examples
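One classic pretext task is rotation prediction (Gidaris et al., 2018): rotate an image by 0/90/180/270 degrees and train a model to predict which rotation was applied, so the labels come for free. A minimal sketch of the label generation (the model itself is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_rotation_example(image):
    """Return (rotated_image, rotation_label): a free supervisory signal."""
    k = int(rng.integers(0, 4))      # 0, 1, 2, or 3 quarter-turns
    return np.rot90(image, k), k

image = rng.normal(size=(8, 8))      # stand-in for a real image
rotated, label = make_rotation_example(image)
print(rotated.shape, label)
```

Other pretext tasks (colorization, jigsaw puzzles, inpainting) follow the same pattern: the supervision is derived from the data itself.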
Self-supervised learning - Adaptation
Self-supervised learning for computer vision
Problem formulation
Instilling a sense of semantic understanding in a model -
Problem formulation
Contrasting between different views of the same image works very well!
Source: MoCo-V2 in PyTorch
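"Views" are stochastic augmentations of one image; two views of the same image form a positive pair. A simplified stand-in for the heavier augmentation stacks used by MoCo-V2 and SimCLR (which add color jitter, grayscale, blur, etc.), using just random crop and flip:

```python
import numpy as np

rng = np.random.default_rng(42)

def random_view(image, crop=24):
    """One stochastic view: random crop plus random horizontal flip."""
    h, w = image.shape
    y = rng.integers(0, h - crop + 1)
    x = rng.integers(0, w - crop + 1)
    view = image[y:y + crop, x:x + crop]
    if rng.random() < 0.5:
        view = view[:, ::-1]  # horizontal flip
    return view

image = rng.normal(size=(32, 32))                # stand-in image
v1, v2 = random_view(image), random_view(image)  # a positive pair
print(v1.shape, v2.shape)                        # (24, 24) (24, 24)
```

The model is then trained to pull the embeddings of `v1` and `v2` together while pushing apart embeddings of views from other images.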
Problem formulation
Why does this formulation matter?
In literature, this formulation is referred to as contrastive learning.
General workflow
Source: GitHub repositories of SwAV and SimCLR
Typical loss functions
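A representative loss is NT-Xent (normalized temperature-scaled cross-entropy), used by SimCLR; the InfoNCE losses in MoCo are close relatives. A NumPy sketch, assuming `z1[i]` and `z2[i]` are embeddings of two views of image `i`:

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent over a batch of positive pairs (z1[i], z2[i])."""
    n = len(z1)
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalize
    sim = z @ z.T / tau                               # scaled cosine sims
    np.fill_diagonal(sim, -np.inf)                    # exclude self-pairs
    # The positive for row i is row i + n (and vice versa).
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    logsumexp = np.log(np.exp(sim).sum(axis=1))       # denominator
    loss = -(sim[np.arange(2 * n), pos] - logsumexp)
    return loss.mean()

rng = np.random.default_rng(0)
z1, z2 = rng.normal(size=(8, 16)), rng.normal(size=(8, 16))
print(nt_xent(z1, z2))
```

The temperature `tau` controls how sharply hard negatives are weighted; SimCLR ablates it carefully.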
Some notable self-supervised frameworks in vision
SimCLR (Chen et al.)
Source: SimCLR; Chen et al. (2020)
MoCo-V2 (Chen et al.)
Source: MoCo-V2; Chen et al. (2020)
SwAV to rule 'em all (Caron et al.)!
Source: SwAV; Caron et al. (2020)
The frameworks are all over the place!
Base image: SwAV; Caron et al. (2020)
Recipes that show promise
Evaluation - Overview
Evaluation - Numbers
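The standard protocol behind these numbers is linear evaluation: freeze the pretrained encoder and fit only a linear classifier on its features, so downstream accuracy measures representation quality. A toy sketch with a least-squares probe on synthetic features (stand-ins for real encoder outputs):

```python
import numpy as np

rng = np.random.default_rng(1)
features = rng.normal(size=(200, 32))      # frozen-encoder outputs
labels = (features[:, 0] > 0).astype(int)  # toy downstream labels

# Fit only the linear layer; the encoder is never updated.
onehot = np.eye(2)[labels]
W, *_ = np.linalg.lstsq(features, onehot, rcond=None)
preds = (features @ W).argmax(axis=1)
accuracy = (preds == labels).mean()
print(accuracy)
```

Papers such as SwAV and SimCLR report top-1 ImageNet accuracy under this protocol, alongside fine-tuning and transfer results.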
Source: SwAV; Caron et al. (2020)
It’s not just image classification
Source: SwAV; Caron et al. (2020)
A recipe to consider in vision these days
Don’t have enough labeled data?
Even if you have enough labeled data
Final thoughts
Challenges
Self-training as another consideration
Source: Noisy student training (an extension of self-training); Xie et al. (2019)
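The self-training loop: a teacher fit on labeled data pseudo-labels an unlabeled pool, then a student is trained on both labeled and pseudo-labeled data. Nearest-centroid models keep this sketch tiny; noisy student training uses large networks and injects noise (dropout, data augmentation, stochastic depth) into the student.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated 2-D clusters: a small labeled set, a big unlabeled pool.
labeled_x = rng.normal(size=(20, 2)) + np.repeat([[0, 0], [4, 4]], 10, axis=0)
labeled_y = np.repeat([0, 1], 10)
unlabeled_x = rng.normal(size=(100, 2)) + rng.integers(0, 2, 100)[:, None] * 4

def fit_centroids(x, y):
    return np.stack([x[y == c].mean(axis=0) for c in (0, 1)])

def predict(centroids, x):
    d = np.linalg.norm(x[:, None] - centroids[None], axis=-1)
    return d.argmin(axis=1)

teacher = fit_centroids(labeled_x, labeled_y)
pseudo_y = predict(teacher, unlabeled_x)  # teacher pseudo-labels the pool
student = fit_centroids(                  # student trains on both sets
    np.concatenate([labeled_x, unlabeled_x]),
    np.concatenate([labeled_y, pseudo_y]),
)
print((predict(student, labeled_x) == labeled_y).mean())
```

Noisy student training iterates this loop, promoting each student to be the next teacher.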
Why this expansion of self-training?
Source: Rethinking Pre-training and Self-training; Zoph et al. (2020)
Why this expansion of self-training?
Source: Rethinking Pre-training and Self-training; Zoph et al. (2020)
Why this expansion of self-training?
Source: Noisy student training; Xie et al. (2019)
Some recommended reading
Minimal implementations
Deck available here: bit.ly/scipy-sp
Different areas for a model to optimize