1 of 12

Unsupervised Discovery of Interpretable Directions in the GAN Latent Space

Andrey Voynov, Artem Babenko

2 of 12

Teaser

3 of 12

Related Work

  • InterFaceGAN (Shen et al., 2019)
    • Trains an SVM in the latent space, using attribute labels to separate it
  • GAN Steerability (Jahanian et al., 2019)
    • Applies a known image transformation (e.g. zoom, shift) and learns the latent walk that reproduces it via a reconstruction loss

  • Drawbacks of previous approaches
    • Require human labels or pretrained attribute models, or are limited to self-supervised transformations (e.g. zooming, translation)
    • Discover only the directions researchers already expect to find

4 of 12

Method

  • Pretrained generator G (orange), kept frozen
  • Trainable: a matrix A of latent directions and a reconstructor R
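The training setup above can be sketched as follows. This is a minimal numpy sketch, not the paper's implementation: the latent dimension d, the number of directions K, and the shift range eps_max are assumed placeholder values, and the frozen generator G and reconstructor R are only described in the comments.

```python
import numpy as np

rng = np.random.default_rng(0)

d, K = 128, 64  # latent dimension and number of directions (assumed sizes)

# Columns of A are candidate latent directions, normalized to unit length.
A = rng.normal(size=(d, K))
A /= np.linalg.norm(A, axis=0, keepdims=True)

def sample_shifted_pair(z, A, eps_max=6.0, rng=rng):
    """Pick a random direction index k and shift magnitude eps,
    and return the original and shifted latent codes plus (k, eps)."""
    k = int(rng.integers(A.shape[1]))
    eps = float(rng.uniform(-eps_max, eps_max))
    return z, z + eps * A[:, k], k, eps

z = rng.normal(size=d)
z0, z_shift, k, eps = sample_shifted_pair(z, A)
# The frozen generator G maps both codes to images G(z0), G(z_shift);
# the reconstructor R sees the image pair and must predict (k, eps).
# Only A and R receive gradients during training.
```

The intuition: A's columns only become easy for R to identify if each one produces a distinct, consistent image change, which pushes the directions toward interpretability.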

5 of 12

Method

  • Columns of A are directions, constrained to unit length
  • Alternatively, the columns can be constrained to be orthonormal
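The two parametrizations of A can be sketched as below. This is a hedged sketch: the sizes are placeholders, and QR decomposition is just one standard way to obtain an orthonormal frame; the paper's exact parametrization may differ.

```python
import numpy as np

rng = np.random.default_rng(1)
d, K = 128, 64  # assumed sizes

A = rng.normal(size=(d, K))

# Unit-length parametrization: rescale each column to norm 1.
A_unit = A / np.linalg.norm(A, axis=0, keepdims=True)

# Orthonormal parametrization: one simple option is the QR decomposition,
# whose Q factor has orthonormal columns spanning the same subspace as A.
Q, _ = np.linalg.qr(A)
```

Unit-length columns allow directions to overlap; orthonormal columns force the K directions to be mutually independent in the latent space.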

6 of 12

Method

  • Cross-entropy loss for the direction index k
  • Mean absolute error loss for the shift magnitude ε
  • Weighting term λ = 0.25
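The combined objective L = CE(k, k̂) + λ·|ε − ε̂| can be written out as a small sketch. The function below is an illustrative single-sample version (the names `logits`, `eps_pred` are assumptions, not the paper's code):

```python
import numpy as np

def direction_loss(logits, k_true, eps_pred, eps_true, lam=0.25):
    """Cross-entropy over the predicted direction index plus
    lam times the absolute error of the predicted shift magnitude."""
    logits = logits - logits.max()  # shift for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())  # log-softmax
    ce = -log_probs[k_true]
    mae = abs(eps_pred - eps_true)
    return ce + lam * mae
```

With λ = 0.25, the classification term dominates, so R is trained primarily to tell the directions apart, with the shift regression as a secondary signal.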

7 of 12

Experiments

  • Datasets
    • MNIST + Spectral Norm GAN
    • AnimeFaces + Spectral Norm GAN
    • CelebA-HQ + ProgGAN (pretrained)
    • ILSVRC + BigGAN (pretrained)
  • Evaluation Metrics
    • Reconstructor Classification Accuracy (RCA)
    • Direction Variation Naturalness (DVN, human-evaluated)
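RCA is simply the accuracy of a (separately trained) reconstructor at recovering which direction produced each image pair; a minimal sketch of the metric's definition, assuming the direction predictions are already available:

```python
import numpy as np

def reconstructor_classification_accuracy(k_pred, k_true):
    """Fraction of image pairs whose direction index is recovered correctly.
    High RCA suggests the directions change images in easily
    distinguishable, and hence more likely interpretable, ways."""
    k_pred, k_true = np.asarray(k_pred), np.asarray(k_true)
    return float((k_pred == k_true).mean())
```

DVN, by contrast, relies on human judgments of whether the image variation along a direction looks natural, so it has no comparable one-line formula.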

8 of 12

Experiments

9 of 12

Experiments

10 of 12

Background Removal as Segmentation

11 of 12

Experiments

12 of 12

Limitations

  • Some attributes remain entangled within a single latent direction
  • The number of directions is a hyperparameter (the paper uses 120 to 512)
    • Too many: is it harder to interpret? Will training R always converge?
    • Too few: will interpretable directions be missed?
  • Applications:
    • Can you backprop into the BigGAN latent space to edit real images?