1 of 54

A Subtle Introduction to Deep Representation Learning

Applied Machine and Deep Learning

190.015

M.Sc. Fotios (Fotis) Lygerakis

October 2023

Chair of Cyber-Physical-Systems

MONTANUNIVERSITÄT LEOBEN

CYBER-PHYSICAL-SYSTEMS

2 of 54

Outline

  • Recap on Neural Networks
  • Introduction to Representation Learning
  • Core Techniques and Approaches
  • Case Studies
  • Coding Examples
  • Conclusion and Q&A


3 of 54

Recap on Neural Networks


4 of 54

Recap: Neuron & Neural Network


Perceptron

https://www.allaboutcircuits.com/technical-articles/how-to-train-a-basic-perceptron-neural-network/

Deep Neural Network


5 of 54

Recap: BackProp and Gradient Descent


Gradient Descent

https://www.analyticsvidhya.com/blog/2023/01/gradient-descent-vs-backpropagation-whats-the-difference/

Backpropagation

https://www.3blue1brown.com/lessons/backpropagation-calculus


6 of 54

Representation Learning

https://classic.csunplugged.org/activities/image-representation/


7 of 54

Definition

“Representation Learning is a process in machine learning where algorithms extract meaningful patterns from raw data to create representations that are easier to understand and process. These representations can be designed for interpretability, reveal hidden features, or be used for transfer learning.”

Reads

  • Yoshua Bengio, Aaron Courville, and Pascal Vincent. 2013. Representation Learning: A Review and New Perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35, 8 (August 2013), 1798–1828. https://doi.org/10.1109/TPAMI.2013.50


https://paperswithcode.com/task/representation-learning

h = f(x), where x is the original data, f is the learned function (the encoder), and h is the representation.


8 of 54

Feature Engineering vs Representation Learning

GO* Feature Engineering

  • Manually defined features
  • Requires domain expertise
  • Time-consuming and does not scale with more data

Representation Learning

  • Automatically learns the most important features
  • No need for explicit programming
  • More efficient and can handle complex, high-dimensional data


*GO: good old


9 of 54

Why is it important?

  • Highlight Essential Features: focuses the model’s attention on the most important aspects of the data
  • Dimensionality Reduction: speeds up learning and helps remove noise, enhancing performance
  • Computational Efficiency: models train faster
  • Improved Generalization: facilitates better understanding of unseen data, building robust models


10 of 54

Today’s Focus on Representation Learning

  • Fundamentals
  • Core Techniques & Approaches
  • Applications in various domains

Automated feature engineering

Improved Performance

Real-World Applications


WHY Deep?


11 of 54

Representation Learning

What is it? Raw Data → Encoder → Information.


12 of 54

Representation Learning

What is it? Raw Data → Encoder → Information (the figure colors the encoder's activations from low to high).


13 of 54

[Figure: Raw Data is a grid of pixel values (e.g., 255, 000, 128, 148, 056, 002, 174, 154, 078, …); the information is contained in that same grid, and the encoder compresses it into a much shorter Representation Vector (e.g., 0.524, 0.741, 0.001, 0.124, 0.874, 0.451, 0.654, 0.001).]


14 of 54


The encoder is a neural network.


15 of 54

Kinds of Representation Learning

Supervised: a neural net maps the input to labeled class scores (dog: 87.5%, cat: 8.5%, car: 4.0%).

Unsupervised: a neural net must make sense of the input without any labels (yes! / no! / maybe?).


16 of 54

Which flavor to choose?

Supervised: abundance of labeled data; high-precision tasks.

Unsupervised: exploratory analysis; data with hidden structures.

Self-Supervised: a middle ground between the two (introduced next).


17 of 54

Self-Supervised Learning (SSL)

  • Supervision is derived from the input data itself
  • No explicit external labels needed
  • Training proceeds by designing auxiliary (pretext) tasks (reconstruction, regularization, etc.)
  • Given a portion of the data, the model is trained to predict or reconstruct the remaining part.
  • Once the model is trained in this manner, the learned representations can be used for downstream tasks, often with a separate fine-tuning step.
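As a toy illustration of the pretext-task idea (a hypothetical example, not from the slides): hide part of each data vector and fit a predictor for the hidden part from the visible part. The "labels" come out of the data itself.

```python
import numpy as np

# Toy self-supervised pretext task: hide the second half of each
# vector and learn to predict it from the first half.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))
X[:, 4:] = X[:, :4] @ rng.normal(size=(4, 4))  # hidden linear structure to discover

visible, hidden = X[:, :4], X[:, 4:]           # given part vs. part to predict
W, *_ = np.linalg.lstsq(visible, hidden, rcond=None)  # fit a linear predictor

mse = np.mean((visible @ W - hidden) ** 2)
print(f"reconstruction MSE: {mse:.2e}")        # near zero: the structure was learned
```

No external labels were used at any point; the supervision signal was carved out of the input.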


That’s what we are going to discover today!


18 of 54

Which flavor to choose?

Supervised: labeled data.

Unsupervised: unlabeled data.


19 of 54

Representation Learning

How do we get the representations?

Supervised Learning: Raw Data → Encoder → Classifier → class scores (Dog: 85%, Cat: 12%, Car: 7%), trained against the ground-truth labels.


20 of 54

Representation Learning

How do we get the representations?

Self-Supervised Learning (SSL): Raw Data → Encoder, trained on a learning signal inherent in the data itself.


21 of 54

Representation Learning

SSL

Reconstruction (Autoencoding): Raw Data → Encoder → Decoder → Reconstruction; the reconstruction error provides the learning signal.

D. P. Kingma and M. Welling, “Auto-encoding variational bayes,” 2014.


22 of 54

Representation Learning

SSL

Joint Embeddings (Contrastive, Regularization, EBM): Raw Data and a Positive Sample each pass through an encoder, and the resulting embeddings are compared.

A. van den Oord, Y. Li, and O. Vinyals, “Representation learning with contrastive predictive coding,” 2018

K. He, H. Fan, Y. Wu, S. Xie, and R. Girshick, “Momentum contrast for unsupervised visual representation learning,” 2020

T. Chen, S. Kornblith, M. Norouzi, and G. Hinton, “A simple framework for contrastive learning of visual representations,” 2020


23 of 54

Kinds of Representation Learning

In both the supervised and the unsupervised setting, the representations are the intermediate activations of the neural network.


24 of 54

Which flavor to choose?

Supervised

  • Benefits:
    • Learning from labeled data.
    • Often provides better performance with adequate labeled data.
  • Best for:
    • Tasks with a substantial amount of labeled data.
    • Situations where high accuracy is essential and labeled examples are clear.

Unsupervised

  • Benefits:
    • Learning from unlabeled data, which is more abundant.
    • Can uncover latent structures and features not evident from labels.
  • Best for:
    • Tasks with limited or no labeled data.
    • Understanding the underlying structure of the data.


25 of 54

Representation Learning

Why bother with SSL?

  • Utilize abundant, unlabeled data.
  • Representations that are:
    • robust
    • generalizable / task-agnostic
    • efficient


26 of 54

Real World Applications!

  • Image Denoising
  • Anomaly Detection
  • Dimensionality Reduction
  • Data Compression
  • Visual Representation Learning
  • Pre-training for Downstream Tasks
  • Image Retrieval
  • Image Generation
  • Data Augmentation
  • Super-Resolution
  • Style Transfer
  • Question Answering
  • Multimodal Learning
  • Robot Learning




29 of 54

Core Techniques and Approaches


30 of 54

Autoencoders


original → Encoder(img) → z → Decoder(z) → reconstructed

Training

  1. Forward pass (left to right)
  2. Compute the loss: MSE (Mean Squared Error) between the original and the reconstruction
  3. Backpropagate the MSE loss
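The three steps above can be sketched in PyTorch. This is a minimal illustration: the layer sizes are arbitrary choices, and random tensors stand in for real images.

```python
import torch
import torch.nn as nn

# Minimal autoencoder: Encoder(img) -> z -> Decoder(z) -> reconstruction.
encoder = nn.Sequential(nn.Linear(784, 32), nn.ReLU())   # img -> z
decoder = nn.Sequential(nn.Linear(32, 784))              # z -> reconstruction
params = list(encoder.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.rand(64, 784)            # a batch of flattened 28x28 "images"
for step in range(100):
    z = encoder(x)                 # 1. forward pass (left to right)
    x_hat = decoder(z)
    loss = loss_fn(x_hat, x)       # 2. MSE(original, reconstructed)
    optimizer.zero_grad()
    loss.backward()                # 3. backpropagate the MSE loss
    optimizer.step()
```

Note that no labels appear anywhere: the input itself is the training target.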



33 of 54

Contrastive Learning


Positive pairs are made similar; negative pairs are made dissimilar.

Aäron van den Oord, Yazhe Li, & Oriol Vinyals (2018). Representation Learning with Contrastive Predictive Coding. CoRR, abs/1807.03748.


34 of 54

Contrastive Learning


Anchor → Encoder(anc) → z_anc
Positive → Encoder(pos) → z_pos
Negative → Encoder(neg) → z_neg

Pull z_anc and z_pos together; push z_anc and z_neg apart.
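One common way to express "pull together / push apart" is the triplet margin loss. A minimal PyTorch sketch follows; the tensor sizes and the perturbation-based positive are illustrative assumptions, not from the slides.

```python
import torch
import torch.nn as nn

# One shared encoder maps anchor, positive, and negative to embeddings.
encoder = nn.Sequential(nn.Linear(128, 32))
triplet = nn.TripletMarginLoss(margin=1.0)

anchor   = torch.randn(16, 128)
positive = anchor + 0.1 * torch.randn(16, 128)   # a slightly perturbed view
negative = torch.randn(16, 128)                  # unrelated samples

z_anc, z_pos, z_neg = encoder(anchor), encoder(positive), encoder(negative)
# The loss is low when ||z_anc - z_pos|| is small (pulled together)
# and ||z_anc - z_neg|| exceeds it by the margin (pushed apart).
loss = triplet(z_anc, z_pos, z_neg)
loss.backward()   # gradients move the encoder in that direction
```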


35 of 54

Contrastive Learning


Training

  • Data augmentation: create two views of each input
  • Forward pass:
    • Compute z_q (query embedding)
    • Compute z_k (key embedding)
  • Compute the similarity between z_q and z_k
  • Contrastive loss L_NCE
  • Backward pass
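The contrastive loss L_NCE (InfoNCE, as used in the CPC/MoCo/SimCLR papers cited earlier) can be sketched as follows; the batch size, embedding size, and temperature here are illustrative choices.

```python
import torch
import torch.nn.functional as F

def info_nce(z_q, z_k, temperature=0.1):
    """InfoNCE: each query's positive key is the one at the same batch index;
    all other keys in the batch act as negatives."""
    z_q = F.normalize(z_q, dim=1)
    z_k = F.normalize(z_k, dim=1)
    logits = z_q @ z_k.t() / temperature      # pairwise cosine similarities
    labels = torch.arange(z_q.size(0))        # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

z_q = torch.randn(8, 32)                  # embeddings of augmented view 1
z_k = z_q + 0.05 * torch.randn(8, 32)     # embeddings of augmented view 2
loss = info_nce(z_q, z_k)                 # low when matched pairs are similar
```

The loss is a cross-entropy that rewards high similarity with the matched (positive) key and low similarity with all other (negative) keys.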


36 of 54

Transformers



37 of 54

Masked Autoencoders



38 of 54

Basics of Convolution


39 of 54

Convolutions Basics: Kernel(filter)


https://courses.cs.washington.edu/courses/cse446/21au/sections/08/convolutional_networks.html


40 of 54

Convolutions Basics: Stride


https://www.analyticsvidhya.com/blog/2022/03/basics-of-cnn-in-deep-learning/


41 of 54

Convolutions Basics: Padding


https://towardsdatascience.com/types-of-convolutions-in-deep-learning-717013397f4d


42 of 54

Examples

Kernel: 2×2, Stride: 1, Padding: 0


43 of 54

Examples

Kernel: 3×3, Stride: 2, Padding: 0


44 of 54

Examples

Kernel: 3×3, Stride: 1, Padding: 1
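All three example configurations follow the standard output-size formula out = floor((n + 2p − k) / s) + 1. A quick check in Python, assuming a 5×5 input (the input size is an assumption; it is not stated on the slides):

```python
def conv_out(n, k, s, p):
    """Output side length of a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

# The three example configurations applied to a 5x5 input:
print(conv_out(5, k=2, s=1, p=0))   # kernel 2x2, stride 1, padding 0 -> 4
print(conv_out(5, k=3, s=2, p=0))   # kernel 3x3, stride 2, padding 0 -> 2
print(conv_out(5, k=3, s=1, p=1))   # kernel 3x3, stride 1, padding 1 -> 5 (size preserved)
```

The last case shows why "same" padding (p = 1 for a 3×3 kernel at stride 1) keeps the spatial size unchanged.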


45 of 54

Coding Examples


46 of 54

Case Studies


47 of 54

Contrastive Regularized VAE (CR-VAE)


F. Lygerakis, E. Rueckert, CR-VAE: Contrastive Regularization on Variational Autoencoders for Preventing Posterior Collapse, 7th Asian Conference on Artificial Intelligence (ACAIT), 2023


48 of 54

Combining Vision and Touch with SSL: MViTac


V. Dave*, F. Lygerakis*, E. Rueckert (*equal contribution), Multimodal Visual-Tactile Representation Learning through Self-Supervised Contrastive Pre-Training, IEEE International Conference on Robotics and Automation (ICRA), 2024


49 of 54

Let the robots touch what they see: M2CURL


F. Lygerakis, V. Dave, E. Rueckert, M2CURL: Sample-Efficient Multimodal Reinforcement Learning via Self-Supervised Representation Learning for Robotic Manipulation, 21st International Conference on Ubiquitous Robots (UR 2024), 2024

Best Student Paper Award


50 of 54

Wrapping Up


51 of 54

Wrap up

  • Representation Learning is important
    • Highlight Essential Features
    • Dimensionality Reduction
    • Computational Efficiency
    • Improved Generalization
    • Exploits abundant unlabeled data


  • Real-World Applications
  • Core Techniques
    • Autoencoding
    • Contrastive Learning


52 of 54

Resources & Extra Reads

Autoencoders

Contrastive Learning

Masked Autoencoders (Transformers)



53 of 54

Thank you for your attention!

Fotios (Fotis) Lygerakis

Chair of Cyber-Physical-Systems

Montanuniversität Leoben

Franz-Josef-Straße 18,

8700 Leoben, Austria

E-mail: fotios.lygerakis@unileoben.ac.at

Web: https://cps.unileoben.ac.at/fotios-lygerakis-m-sc/


Disclaimer: The lecture notes posted on this website are for personal use only. The material is intended for educational purposes only. Reproduction of the material for any purposes other than what is intended is prohibited. The content is to be used for educational and non-commercial purposes only and is not to be changed, altered, or used for any commercial endeavor without the express written permission of Professor Rueckert.



54 of 54

Thank you for your attention!

Fotios (Fotis) Lygerakis

Chair of Cyber-Physical-Systems, Montanuniversität Leoben

E-mail: fotios.lygerakis@unileoben.ac.at

Website: www.lygerakis.com


