DISENTANGLED REPRESENTATION LEARNING
Gennaro Gala
“In a certain light, all of science is one big unsupervised learning problem in which we search for the most disentangled representation of the world around us”
ABOUT ME
EDUCATION
TU/e, the Netherlands
University of Bari Aldo Moro, Italy
University of Bari Aldo Moro, Italy
INTERESTS
OUTLINE
Disentangled Representation Learning (DRL)
From AE to VAE
Unsupervised DRL methods
Weakly-Supervised DRL
Class-Content-Style Disentanglement
Image-to-Image translation
REPRESENTATION LEARNING
Representation Learning is learning representations of the data that make it easier to extract useful information when building classifiers or other predictors.
(Y. Bengio et al., 2013)
Manifold Assumption: The data lie approximately on a manifold of much lower dimension than the input space
REPRESENTATION LEARNING - ISSUES
[Figure: example classes CAT, DOG, FROG, ZEBRA]
Due to their highly entangled nature, representations learned by neural networks are not ideal: they are difficult to interpret and reuse (without fine-tuning).
What makes a good representation?
DISENTANGLED REPRESENTATION LEARNING (DRL)
[Figure: three examples, each annotated with the factors Color, L-R, Pose]
HOW CAN WE LEARN REPRESENTATIONS IN AN UNSUPERVISED WAY?
AUTOENCODERS (AEs)
[Figure: autoencoder mapping an RGB image through an information bottleneck to a reconstructed RGB image]
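A minimal sketch of the bottleneck idea, assuming a purely linear, untrained encoder and decoder and a flattened toy image (the names `W_enc`, `W_dec`, `encode`, and `decode` are illustrative, not taken from the slides). Real autoencoders use non-linear networks and learn the weights by minimizing the reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "RGB image": 8x8 pixels, 3 channels, flattened to 192 values in [0, 1].
x = rng.random(8 * 8 * 3)

input_dim, latent_dim = x.size, 2  # latent_dim << input_dim is the bottleneck

# Randomly initialised linear maps (untrained, for illustration only).
W_enc = rng.normal(scale=0.1, size=(latent_dim, input_dim))
W_dec = rng.normal(scale=0.1, size=(input_dim, latent_dim))

def encode(x):
    """Compress the input into a low-dimensional code z."""
    return W_enc @ x

def decode(z):
    """Map the code back to input space."""
    return W_dec @ z

z = encode(x)
x_hat = decode(z)

# Training would minimise this mean squared error w.r.t. W_enc and W_dec.
reconstruction_loss = np.mean((x - x_hat) ** 2)
```

With `latent_dim = 2`, each image is summarized by two numbers, which is what makes a 2D latent-space plot possible.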
AEs 2D LATENT SPACE - MNIST
Note x and y axis values!
Disentangled Representations - How to do Interpretable Compression with Neural Models, Yordan Hristov.
VARIATIONAL AUTOENCODERS (VAEs)
[Figure: VAE mapping an RGB image to a reconstructed RGB image, trained with the negative ELBO]
Kingma, D. P. & Welling, M. (2014). Auto-Encoding Variational Bayes
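A hedged sketch of the negative ELBO for a single sample, under assumptions the slide does not state: a unit-variance Gaussian decoder (so the reconstruction term is a squared error up to a constant), a diagonal Gaussian encoder, and a standard normal prior. The function names are illustrative.

```python
import numpy as np

def negative_elbo(x, x_hat, mu, log_var):
    """Negative ELBO = reconstruction error + KL(q(z|x) || p(z)).

    Assumes q(z|x) = N(mu, diag(exp(log_var))) and p(z) = N(0, I).
    """
    reconstruction = 0.5 * np.sum((x - x_hat) ** 2)
    # Closed-form KL between a diagonal Gaussian and N(0, I),
    # summed over latent dimensions.
    kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)
    return reconstruction + kl

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(0)
x = rng.random(4)
mu, log_var = np.zeros(2), np.zeros(2)
z = reparameterize(mu, log_var, rng)

# Perfect reconstruction and a posterior equal to the prior give zero loss.
loss = negative_elbo(x, x, mu, log_var)
```

The KL term is what pulls the encodings toward the prior; a plain autoencoder has no such term, so nothing constrains where its codes land.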
VAEs 2D LATENT SPACE - MNIST
Disentangled Representations - How to do Interpretable Compression with Neural Models, Yordan Hristov.
Note x and y axis values!
VAEs 2D LATENT SPACE - MNIST
robz.github.io/mnist-vae
DSPRITES DATASET
github.com/deepmind/dsprites-dataset
HOW CAN WE CHECK IF A REPRESENTATION IS DISENTANGLED?
1. Grab an image
2. Encode
3. Traverse each dimension independently
4. Decode
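The four steps above can be sketched as follows. `encode` and `decode` stand in for a trained model and are replaced here by random linear maps, so the output is only a shape-level illustration; the traversal range of -2 to 2 is an arbitrary choice.

```python
import numpy as np

def latent_traversal(image, encode, decode, values):
    """Return decoded images with shape (len(values), num_latents, ...).

    Column i varies latent dimension i over `values` while every other
    dimension stays fixed at the encoding of `image`.
    """
    z = np.asarray(encode(image), dtype=float)   # steps 1-2: grab and encode
    columns = []
    for i in range(z.size):                      # step 3: one dimension at a time
        column = []
        for v in values:
            z_mod = z.copy()
            z_mod[i] = v                         # overwrite only dimension i
            column.append(decode(z_mod))         # step 4: decode
        columns.append(np.stack(column))
    return np.stack(columns, axis=1)

# Toy stand-ins for a trained encoder and decoder (illustration only).
rng = np.random.default_rng(0)
W_enc = rng.normal(size=(3, 16))
W_dec = rng.normal(size=(16, 3))
image = rng.random(16)

grid = latent_traversal(
    image,
    encode=lambda x: W_enc @ x,
    decode=lambda z: W_dec @ z,
    values=np.linspace(-2.0, 2.0, 5),
)
```

If a representation is disentangled, each column of the grid should change one recognizable factor of the image and nothing else.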
LATENT TRAVERSAL IN VAEs
Latent traversal: each column corresponds to the traversal of a single latent variable while keeping the others fixed
Latent traversals are meaningless!
You don’t say?
[Figure: RGB image to reconstructed RGB image, trained with the negative ELBO]
Latent traversal: each column corresponds to the traversal of a single latent variable while keeping the others fixed
Why does it (partially) work?
[Figure columns: Y-pos, X-pos, Scale, Rotation, Rotation]
Unsupervised DRL Models
github.com/YannDubs/disentangling-vae
Common theme: Regularize more, regularize better!
FactorVAE trained on CelebA
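The slides do not spell out which regularizer each model uses. As one well-known instance of the "regularize more" theme, the β-VAE objective (Higgins et al., 2017) scales the KL term of the negative ELBO by a factor β > 1. The sketch below assumes a unit-variance Gaussian decoder and a standard normal prior; the function name and the default `beta=4.0` are illustrative choices.

```python
import numpy as np

def regularized_vae_loss(x, x_hat, mu, log_var, beta=4.0):
    """Reconstruction error plus a beta-weighted KL term (beta-VAE style).

    beta = 1 recovers the plain negative ELBO; beta > 1 regularizes more.
    """
    reconstruction = 0.5 * np.sum((x - x_hat) ** 2)
    # KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims.
    kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)
    return reconstruction + beta * kl

x = np.zeros(3)
mu, log_var = np.ones(2), np.zeros(2)

plain = regularized_vae_loss(x, x, mu, log_var, beta=1.0)
stronger = regularized_vae_loss(x, x, mu, log_var, beta=4.0)
```

Other models in this family change how the KL term is decomposed or penalized rather than only how strongly it is weighted.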
Impossibility of unsupervised DRL (without biases)
[Figure: Metric = DCI Disentanglement, Dataset = dSprites; x-axis ticks 1 to 6]
WEAKLY SUPERVISED DRL – From i.i.d. to non-i.i.d.
[Figure: two examples, each annotated with the factors Color, L-R, Pose]
CLASS-CONTENT-STYLE DISENTANGLEMENT
- A Style-Based Generator Architecture for Generative Adversarial Networks, T. Karras et al., 2018.
- thispersondoesnotexist.com
STYLE
CONTENT
Dataset dependent problem!
CYCLE-CONSISTENT VAE
Weak supervision: labels are not used!
CYCLE-CONSISTENT VAE – AVOIDING SHORTCUT PROBLEM
randomly sampled
CYCLE-CONSISTENT VAE - SWAPPING
[Figure labels: class, style; class, content]
Disentangling Factors of Variation with Cycle-Consistent Variational Auto-Encoders
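A toy sketch of the swapping operation, assuming each latent vector is laid out as a class code followed by a style code (this layout and the function name `swap_class_codes` are illustrative assumptions, not the paper's implementation).

```python
import numpy as np

def swap_class_codes(z_a, z_b, class_dims):
    """Exchange the first `class_dims` entries (assumed to be the class code)
    of two latent vectors, leaving the remaining (style) entries in place."""
    z_a = np.asarray(z_a, dtype=float)
    z_b = np.asarray(z_b, dtype=float)
    swapped_a = np.concatenate([z_b[:class_dims], z_a[class_dims:]])
    swapped_b = np.concatenate([z_a[:class_dims], z_b[class_dims:]])
    return swapped_a, swapped_b

z_a = np.array([1.0, 1.0, 0.1, 0.2])  # class code [1, 1], style code [0.1, 0.2]
z_b = np.array([9.0, 9.0, 0.7, 0.8])  # class code [9, 9], style code [0.7, 0.8]

new_a, new_b = swap_class_codes(z_a, z_b, class_dims=2)
# new_a pairs the class code of b with the style code of a.
```

Decoding `new_a` should render the class of the second image in the style of the first, but only if the two codes are actually disentangled.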
IMAGE-TO-IMAGE TRANSLATION
Image translation is the task of mapping images between different domains: Given an input image in a source domain (e.g., dogs), we aim to generate an analogous image in a target domain (e.g., cats).
The basic assumption is that images from different domains share common content but differ in style.
IMAGE-TO-IMAGE TRANSLATION - MUNIT
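[Figure: MUNIT overview for the Dog and Cat domains. Panel 1, within-domain reconstruction: input images are encoded and decoded into reconstructed images. Panel 2, cross-domain translation: input images are encoded and decoded into translated images, using a prior and a GAN loss.]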
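A rough sketch of cross-domain translation under the content/style assumption: keep the content code of the source image, draw a style code for the target domain from a standard normal prior, and decode with the target-domain decoder. The networks are replaced by random linear maps, and concatenating content and style is a simplification chosen for brevity; all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
CONTENT_DIM, STYLE_DIM, IMAGE_DIM = 6, 2, 12

# Hypothetical per-domain networks, replaced here by random linear maps.
content_encoder_dog = rng.normal(size=(CONTENT_DIM, IMAGE_DIM))
decoder_cat = rng.normal(size=(IMAGE_DIM, CONTENT_DIM + STYLE_DIM))

def translate_dog_to_cat(x_dog, rng):
    """Keep the dog image's content code, sample a cat style code from
    the N(0, I) prior, and decode with the cat decoder."""
    content = content_encoder_dog @ x_dog
    style = rng.standard_normal(STYLE_DIM)
    return decoder_cat @ np.concatenate([content, style])

x_dog = rng.random(IMAGE_DIM)
cat_1 = translate_dog_to_cat(x_dog, rng)
cat_2 = translate_dog_to_cat(x_dog, rng)  # same content, new style sample
```

Because the style code is sampled, one input can yield many different translations, which is the "multimodal" part of the method's name. In training, a GAN loss on the translated images pushes them to look like real images of the target domain.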
IMAGE-TO-IMAGE TRANSLATION - APPLICATIONS
Multimodal Unsupervised Image-to-Image Translation, X. Huang et al.
CONCLUSIONS
“In a certain light, all of science is one big unsupervised learning problem in which we search for the most disentangled representation of the world around us”
DISENTANGLED REPRESENTATION LEARNING
THANK YOU FOR LISTENING!
ANY QUESTIONS?