
RFP: Representational Collapse & Intrinsic Dimension

  • Representational collapse (RC) forces N-dimensional representations to lie on an n-dimensional manifold (n < N). It is usually seen as a failure mode. Is it really?
  • Recent results show that lower-dimensional representations generalize better. How low-dimensional can the manifold of representations be?
  • An ideal lossless encoding (in SSL) would map the data (images) to representations lying on a manifold of at least n dimensions, where n is the number of representation dimensions that did not collapse, which equals the intrinsic dimension (ID) of the image dataset.
  • Even a lossy encoding should still give n as an upper bound on the ID of the image dataset (n < ID implies information loss).
  • Questions:
    • Can we find the ID of a dataset via RC in an SSL setup?
    • How is it related to generalization?
  • Some references:
    • Understanding Dimensional Collapse in Contrastive Self-Supervised Learning
    • Intrinsic Dimension of Data Representations in Deep Neural Networks
    • The Intrinsic Dimension of Images and Its Impact on Learning

Fig 1 (red): log of magnitude vs. rank index of the ordered singular values of the covariance matrix of the embedding matrix of ResNet-18 (trained using SimCLR) on the CIFAR-10 test set. Axes: x = singular value rank index, y = log of singular values. [See the 1st reference]
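The quantity plotted in Fig 1 can be sketched as follows. The embedding matrix `Z` below is a random placeholder; actually reproducing the figure would require ResNet-18/SimCLR embeddings of the CIFAR-10 test set, which are not computed here.

```python
import numpy as np

rng = np.random.default_rng(1)
Z = rng.normal(size=(10000, 128))        # placeholder for SSL embeddings
C = np.cov(Z, rowvar=False)              # 128 x 128 covariance of embeddings
s = np.linalg.svd(C, compute_uv=False)   # singular values, already ordered
log_spectrum = np.log(s)                 # y-axis of Fig 1
rank_index = np.arange(len(s))           # x-axis of Fig 1
```

Dimensional collapse shows up as a sharp drop in `log_spectrum` at some rank index; for the isotropic placeholder data the curve stays nearly flat.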

Requester: Vaisakh M (Email)