1 of 23

Equivariant Contrastive Learning

Rumen Dangovski, Li Jing, Charlotte Loh, Seungwook Han, Akash Srivastava, Brian Cheung, Pulkit Agrawal, Marin Soljačić

https://github.com/rdangovs/essl

http://super-ms.mit.edu/essl.html

2 of 23

Outline

Equivariant Contrastive Learning (arXiv:2111.00899, ICLR 2022)
Extension to Sentence Embeddings (arXiv:2204.10298, NAACL-HLT 2022)
Applications to Science (arXiv:2110.08406, Nature Communications 2022)
Ongoing projects

Disclaimer: this work focuses on joint-embedding methods. Only preliminary results on masked autoencoders (MAE).

3 of 23

Exact equivariance matters in science

arXiv:2110.08406

Nat. Comm. 2022

4 of 23

Invariance to Transformations Learns Good Features

Chen & He. CVPR 2021

5 of 23

Generalizing Invariance to Equivariance Brings New SSL

blur, flip, rotation, …

insensitive: invariance; sensitive: equivariance

proof-of-concept CIFAR-10 study motivates complementarity of invariance & equivariance

Invariance is a special case of equivariance. Under equivariance, the action of the transformation group on the input space induces an action of the same group on the representation space. If T'_g is the identity, the encoder f is invariant, that is, insensitive to the transformations. However, we can also make T'_g nontrivial, so that f is sensitive to the transformations.
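For reference, the standard definition behind this note can be written out as follows. The symbols G (the group), T_g (its action on inputs), and x (an input) are notation assumed here; only f and T'_g appear in the note itself.

```latex
% Equivariance: transforming the input transforms the representation accordingly.
f\bigl(T_g(x)\bigr) = T'_g\bigl(f(x)\bigr) \qquad \text{for all } g \in G \text{ and all inputs } x
% Invariance is the special case in which T'_g is the identity:
f\bigl(T_g(x)\bigr) = f(x)
```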

In a preliminary study we took a CIFAR-10 baseline trained with SimCLR using only random resized cropping. If we include the additional standard SimCLR transformations, we obtain improvements over the baseline. However, if we make the encoder sensitive to these transformations, say by predicting the transformation from the representations, performance drops below the baseline.

Interestingly, this trend is reversed for a collection of other meaningful transformations. This shows that there is a subtle complementary nature between invariance and equivariance when we use transformations in SSL.


6 of 23

Concept

7 of 23

Our Simple Equivariant SSL Objective

E-SSL: Equivariant Self-Supervised Learning

We predict g to encourage sensitivity to g

We could use any SSL loss

We control the level of sensitivity

8 of 23

Example pseudocode for sensitive rotation
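The pseudocode for this slide did not survive the export. Below is a minimal NumPy sketch of the idea stated on the previous slide: predict the applied rotation g from the representation, and add that prediction loss to any SSL loss with a weight that controls sensitivity. It is not the authors' code. The names `make_rotation_batch`, `essl_loss`, and `lam`, the random linear maps standing in for the encoder and predictor, and the placeholder SSL loss value are all assumptions made for illustration. Images are assumed square so that rotated views can be stacked.

```python
import numpy as np

def make_rotation_batch(images, rng):
    """Rotate each (H, W, C) image by a random multiple of 90 degrees.

    Returns the rotated images and the rotation index (0..3) used as the label.
    """
    labels = rng.integers(0, 4, size=len(images))
    rotated = np.stack(
        [np.rot90(img, k=int(k), axes=(0, 1)) for img, k in zip(images, labels)]
    )
    return rotated, labels

def cross_entropy(logits, labels):
    """Mean cross-entropy of integer labels under softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def essl_loss(ssl_loss, rot_logits, rot_labels, lam):
    """Any SSL loss plus a lam-weighted rotation-prediction loss (lam is hypothetical)."""
    return ssl_loss + lam * cross_entropy(rot_logits, rot_labels)

# Toy usage: random linear maps stand in for the encoder and the rotation predictor.
rng = np.random.default_rng(0)
images = rng.random((8, 32, 32, 3))
rotated, labels = make_rotation_batch(images, rng)
W_enc = rng.normal(size=(32 * 32 * 3, 16)) * 0.01   # stand-in encoder
W_pred = rng.normal(size=(16, 4))                    # stand-in 4-way predictor
features = rotated.reshape(len(rotated), -1) @ W_enc
logits = features @ W_pred
total = essl_loss(ssl_loss=0.0, rot_logits=logits, rot_labels=labels, lam=0.5)
```

In a real setup the stand-in matrices would be a trained encoder and a small prediction head, and `ssl_loss` would come from the chosen invariance-based method. Setting `lam` to zero recovers the plain SSL objective.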

9 of 23

Even Simple Generalization Improves SOTA Methods

10 of 23

E-SSL’s Bag of Tricks

E-SSL

11 of 23

E-SSL is Robust to Restricted Augmentation and Labels

12 of 23

E-SSL in NLP: Sentence Embeddings SOTA

invariance to Replaced Token Transformation reduces feature quality

https://github.com/voidism/DiffCSE

arXiv:2204.10298 NAACL-HLT 2022

13 of 23

Next Steps of E-SSL for Unbiased Datasets

If the transformations (e.g. rotations) already appear in the dataset, then E-SSL does not work directly

Small E-SSL modification:

predict relative orientation from concatenated features instead
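A rough sketch of this modification, under stated assumptions: two views of the same image are rotated independently, their features are concatenated, and the prediction target is the relative rotation between the views. The target is taken here as the difference of rotation indices modulo 4, which is an assumption rather than a detail given on the slide. The helper names and the random linear stand-in encoder are hypothetical.

```python
import numpy as np

def relative_rotation_pairs(images, rng):
    """Create two independently rotated views per image and their relative rotation."""
    k1 = rng.integers(0, 4, size=len(images))
    k2 = rng.integers(0, 4, size=len(images))
    view1 = np.stack([np.rot90(img, k=int(k)) for img, k in zip(images, k1)])
    view2 = np.stack([np.rot90(img, k=int(k)) for img, k in zip(images, k2)])
    relative = (k2 - k1) % 4  # assumed target: rotation taking view1 to view2
    return view1, view2, relative

def concat_features(f1, f2):
    """Concatenate per-view features along the feature axis."""
    return np.concatenate([f1, f2], axis=1)

# Toy usage with a random linear map standing in for the encoder.
rng = np.random.default_rng(0)
images = rng.random((6, 16, 16, 3))
view1, view2, relative = relative_rotation_pairs(images, rng)
W_enc = rng.normal(size=(16 * 16 * 3, 8))
f1 = view1.reshape(len(view1), -1) @ W_enc
f2 = view2.reshape(len(view2), -1) @ W_enc
pair_features = concat_features(f1, f2)  # input to a 4-way relative-rotation predictor
```

The relative target remains well defined even when the dataset itself contains images in arbitrary orientations, because it depends only on the two applied rotations and not on an assumed canonical orientation.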

Pre-train on unbiased CIFAR-10, downstream rotation prediction: 67.1% -> 71.2%

14 of 23

Importance of E-SSL to Physics, and Physics to E-SSL

https://github.com/rdangovs/essl

Code for new photonics datasets for SSL,

Minimal code for strong CIFAR-10 and ImageNet SSL baselines,

Pretrained E-SSL models on ImageNet

15 of 23

Further steps

16 of 23

E-SSL for Exact Equivariance

On CIFAR-10 not different from original E-SSL

Let’s model it

17 of 23

Exact equivariance matters in science

arXiv:2110.08406

Nat. Comm. 2022

18 of 23

Method in science

19 of 23

Effective method

20 of 23

Effective against equivariant architectures

21 of 23

Outlook

Try it on MAE
Understand the importance of sensitivity: local vs. global
Study unbiased datasets
Alternative to equivariant neural networks
Discover symmetries in the data

22 of 23

Thank you!

My collaborators, the anonymous reviewers, and all of our friends :)
MIT Supercloud, ARO, AIIA, IAIFI
Dedicated to the memory of Boyko Dangovski

https://github.com/rdangovs/essl

http://super-ms.mit.edu/essl.html

23 of 23

Appendix