1 of 38

Generative Models

Theory and techniques

2 of 38

Outline

  • Overview of Generative Models
  • Probabilistic Generative Models
  • Deep Generative Models
    • Variational Autoencoder (VAE)
    • Generative Adversarial Networks (GANs)

  • Hands-on Naïve Bayes and GAN examples

4 of 38

Overview of Generative Models [10]

  • Informally:
    • a generative model can create new data instances
  • More formally:
    • given an input variable X and a target variable Y, a probabilistic generative model is a statistical model of the joint probability distribution P(X, Y) on X × Y

The term generative model is also used for models that generate instances in a way that has no clear relationship to a probability distribution over potential samples of the input variables. Generative adversarial networks (GANs) are examples of this class of generative models.
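To make the formal definition concrete, here is a minimal sketch of a joint distribution P(X, Y) stored as a table. The weather and activity values, the probabilities, and the function names are all made up for illustration. The point is that a model of the joint can both answer conditional queries and create new (x, y) instances.

```python
import random

# Hypothetical joint distribution P(X, Y) over weather X and activity Y.
joint = {
    ("sunny", "walk"): 0.40,
    ("sunny", "read"): 0.10,
    ("rainy", "walk"): 0.05,
    ("rainy", "read"): 0.45,
}

def marginal_x(x):
    """P(X = x), obtained by summing the joint over all values of Y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def conditional_y_given_x(y, x):
    """P(Y = y | X = x) = P(x, y) / P(x)."""
    return joint[(x, y)] / marginal_x(x)

def sample_joint(rng):
    """Create one new (x, y) instance by sampling from the joint."""
    pairs = list(joint)
    weights = [joint[pair] for pair in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]

rng = random.Random(0)
new_instances = [sample_joint(rng) for _ in range(5)]
```

A discriminative model would only represent something like `conditional_y_given_x`; it would have no counterpart to `sample_joint`.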

5 of 38

Generative vs. Discriminative Modeling

[Figure: two panels. Discriminative modeling: Input Data → Prediction. Generative modeling: Random Input → Generated Example.]

6 of 38

Popular Probabilistic Generative Models

  • Naïve Bayes
  • Gaussian Mixture Model (GMM)
  • Hidden Markov Model (HMM)
  • Latent Dirichlet Allocation (LDA)
  • Markov Random Fields*
  • Variational Autoencoder**

7 of 38

Popular Probabilistic Generative Models

Naïve Bayes

P(X, Y) = P(Y) * P(X|Y) = P(Y) * P(x1|Y) * P(x2|Y) * … * P(xN|Y)

Classifier
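The factorization above can be sketched in a few lines of Python. This is a toy illustration, not a reference implementation: the class names, the three binary features, and every probability are assumptions chosen for the example. It shows the same parameters being used both to classify and to generate.

```python
import random

# Hypothetical parameters: class prior P(Y) and, for three binary features,
# the per-feature probabilities P(x_i = 1 | Y).
prior = {"spam": 0.4, "ham": 0.6}
likelihood = {
    "spam": [0.8, 0.7, 0.1],
    "ham": [0.1, 0.3, 0.6],
}

def joint_prob(x, y):
    """P(X = x, Y = y) = P(y) * P(x1|y) * ... * P(xN|y)."""
    p = prior[y]
    for xi, theta in zip(x, likelihood[y]):
        p *= theta if xi == 1 else 1.0 - theta
    return p

def classify(x):
    """Return the class maximizing P(x, y), which also maximizes P(y | x)."""
    return max(prior, key=lambda y: joint_prob(x, y))

def generate(rng):
    """Ancestral sampling: draw y from P(Y), then each x_i from P(x_i | y)."""
    y = rng.choices(list(prior), weights=list(prior.values()), k=1)[0]
    x = [1 if rng.random() < theta else 0 for theta in likelihood[y]]
    return x, y
```

For example, `classify([1, 1, 0])` compares `joint_prob` for both classes and picks the larger, while `generate(random.Random(0))` draws a new labeled instance.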

8 of 38

Popular Probabilistic Generative Models

[Figure: graphical model with nodes Z and X]

9 of 38

Popular Probabilistic Generative Models

Hidden Markov Model

P(X, Y) = P(y1) * P(x1|y1) * P(y2|y1) * P(x2|y2) * … * P(yN|yN-1) * P(xN|yN)

Sequence model
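As a rough illustration of the sequence factorization above, the sketch below evaluates the joint probability of a short sequence and samples a new one. The two hidden states, the observations, and all probabilities are hypothetical, as are the function names.

```python
import random

# Hypothetical two-state HMM.
initial = {"hot": 0.6, "cold": 0.4}                    # P(y1)
transition = {"hot": {"hot": 0.7, "cold": 0.3},        # P(y_t | y_{t-1})
              "cold": {"hot": 0.4, "cold": 0.6}}
emission = {"hot": {"ice cream": 0.8, "tea": 0.2},     # P(x_t | y_t)
            "cold": {"ice cream": 0.1, "tea": 0.9}}

def joint_prob(xs, ys):
    """P(X, Y) = P(y1) P(x1|y1) * prod_t P(y_t|y_{t-1}) P(x_t|y_t)."""
    p = initial[ys[0]] * emission[ys[0]][xs[0]]
    for t in range(1, len(xs)):
        p *= transition[ys[t - 1]][ys[t]] * emission[ys[t]][xs[t]]
    return p

def draw(dist, rng):
    """Sample one key of a {value: probability} dictionary."""
    return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]

def generate(length, rng):
    """Sample hidden states and observations in sequence order."""
    ys = [draw(initial, rng)]
    xs = [draw(emission[ys[0]], rng)]
    for _ in range(length - 1):
        ys.append(draw(transition[ys[-1]], rng))
        xs.append(draw(emission[ys[-1]], rng))
    return xs, ys
```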

11 of 38

Popular Probabilistic Generative Models

Latent Dirichlet Allocation [11]

P(W, Z, Θ, Φ | α, β) = P(W|Z, Φ) * P(Φ|β) * P(Z|Θ) * P(Θ|α)

Admixture model
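Read from right to left, the factorization describes a generative process: draw per-document topic proportions Θ, draw a topic Z for each word position, then draw the word W from that topic's word distribution Φ. The sketch below follows that process under simplifying assumptions that are mine, not the slides': a fixed document length, symmetric Dirichlet hyperparameters (scalar `alpha` and `beta`), and hypothetical function names.

```python
import random

def sample_dirichlet(concentration, rng):
    """Sample from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in concentration]
    total = sum(draws)
    return [d / total for d in draws]

def generate_corpus(n_docs, doc_len, vocab, n_topics, alpha, beta, rng):
    # Phi: one word distribution per topic, drawn from Dirichlet(beta).
    phi = [sample_dirichlet([beta] * len(vocab), rng) for _ in range(n_topics)]
    corpus = []
    for _ in range(n_docs):
        # Theta: topic proportions of this document, drawn from Dirichlet(alpha).
        theta = sample_dirichlet([alpha] * n_topics, rng)
        words = []
        for _ in range(doc_len):
            z = rng.choices(range(n_topics), weights=theta, k=1)[0]  # topic Z
            w = rng.choices(vocab, weights=phi[z], k=1)[0]           # word W
            words.append(w)
        corpus.append(words)
    return corpus
```

Because each document mixes several topics through its own Θ, every word can come from a different topic, which is what makes this an admixture rather than a plain mixture.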

12 of 38

Variational Autoencoders (VAEs)

13 of 38

Variational Autoencoder (VAE) [1, 2, 7]

  • unsupervised technique
  • consists of two networks:
    • encoder
    • decoder

  • the (latent) encoded vector
    • has enough information to generate (reconstruct) the input
    • is constrained to come from a multidimensional Gaussian distribution
  • the latent space is carefully regularized, so that random points in that space can generate meaningful data points

14 of 38

Variational Autoencoder (VAE) [2, 7]

VAE training

  1. The input point is encoded as a distribution over a latent space
  2. A point from the latent space is sampled from that distribution
  3. The sampled point is decoded, and the model error is computed
  4. The error is backpropagated through the network
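Steps 1 to 3 can be sketched with a deliberately tiny model: a one-dimensional linear encoder and decoder whose parameters are hypothetical. The sampling step uses the reparameterization z = mu + sigma * eps, a standard way to keep the sample differentiable with respect to the encoder outputs. Step 4 is omitted here; in practice an automatic differentiation framework would handle it.

```python
import math
import random

# Hypothetical parameters of a 1-D linear encoder and decoder.
params = {"w_mu": 0.5, "b_mu": 0.0,
          "w_logvar": 0.1, "b_logvar": -1.0,
          "w_dec": 2.0, "b_dec": 0.0}

def encode(x, p):
    """Step 1: map the input to a Gaussian over the latent space."""
    mu = p["w_mu"] * x + p["b_mu"]
    logvar = p["w_logvar"] * x + p["b_logvar"]
    return mu, logvar

def sample_latent(mu, logvar, rng):
    """Step 2: draw z = mu + sigma * eps with eps ~ N(0, 1)."""
    eps = rng.gauss(0.0, 1.0)
    return mu + math.exp(0.5 * logvar) * eps

def decode(z, p):
    """Step 3a: map the sampled latent point back to data space."""
    return p["w_dec"] * z + p["b_dec"]

def forward(x, p, rng):
    """Steps 1-3: encode, sample, decode, and compute a squared error."""
    mu, logvar = encode(x, p)
    z = sample_latent(mu, logvar, rng)
    x_hat = decode(z, p)
    error = (x - x_hat) ** 2  # Step 3b: reconstruction error (assumed squared error)
    return x_hat, error, mu, logvar
```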

15 of 38

Variational Autoencoder (VAE) [2, 7]

VAE training

The loss function contains:

    • an input reconstruction term
    • a regularization term to ensure the distribution returned by the encoder is close to the standard normal
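A minimal sketch of such a loss, assuming a squared-error reconstruction term and an encoder that outputs a diagonal Gaussian (a mean and a log-variance per latent dimension). The KL divergence to the standard normal then has a closed form. The function names and the `kl_weight` parameter are hypothetical.

```python
import math

def reconstruction_term(x, x_hat):
    """Squared error between the input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

def kl_to_standard_normal(mu, logvar):
    """Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) )."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, logvar))

def vae_loss(x, x_hat, mu, logvar, kl_weight=1.0):
    """Reconstruction term plus weighted regularization term."""
    return (reconstruction_term(x, x_hat)
            + kl_weight * kl_to_standard_normal(mu, logvar))
```

The KL term is zero only when every mean is 0 and every variance is 1; it grows as a mean moves away from zero or as a variance shrinks toward zero.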

16 of 38

Variational Autoencoder (VAE) [2, 6, 7]

The regularization term encourages:

    • covariance matrices to be close to the identity, preventing the encoded distributions from collapsing to single points
    • means to be close to zero, preventing the encoded distributions from lying too far apart from each other

17 of 38

Probabilistic view of VAEs [14]

The VAE can be viewed as a probabilistic model where:

  • the probabilistic decoder computes the distribution of the data given the latent encoded variable i.e., P(X|Z)
  • the latent encoded variable has a prior P(Z) which is a standard normal distribution, i.e., N(0, I)
  • inference, P(Z|X), is performed using (amortized) variational inference
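In this view, training maximizes a lower bound on the log-likelihood of the data, usually called the evidence lower bound (ELBO). The notation is introduced here for illustration: q_φ(z|x) is the encoder's approximation to P(Z|X), p_θ(x|z) is the decoder, and θ and φ are their parameters.

```latex
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
\;-\; \mathrm{KL}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right),
\qquad p(z) = \mathcal{N}(0, I)
```

The first term rewards accurate reconstruction, and the second keeps the encoder's distribution close to the prior.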

19 of 38

Properties of Variational Autoencoders [12]

VAEs work better than other methods at exploring variations of existing data.

20 of 38

Applications of Variational Autoencoders

Powerful generative models [8, 13] for many kinds of complex data, including:

  • image generation [4]
  • music generation
    • interpolating between MIDI samples [13]
  • language modeling [5]
  • neuroimaging/brain mapping [9]

21 of 38

Generative Adversarial Networks (GANs)

22 of 38

A Probabilistic Motivation [18]

  • Complex data are generated by very complex probability distributions over very large spaces
    • e.g., the distribution of black-and-white images of dogs
  • Most of the time, we
    • neither know how to express such distributions explicitly
    • nor have a way to sample from them

  • With GANs, we express such distributions as very complex transformations of normal variables
    • neural networks (NNs) are used to express and learn these transformations
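The idea of obtaining a new distribution by transforming normal variables can be illustrated with a fixed, hand-written transformation. The function below is a hypothetical stand-in; in a GAN, a neural network would learn the transformation instead.

```python
import math
import random
import statistics

def transform(z):
    """Hypothetical fixed transformation of a standard normal variable."""
    return math.exp(0.5 * z + 1.0)

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(10000)]   # z ~ N(0, 1)
samples = [transform(z) for z in noise]               # no longer Gaussian

sample_mean = statistics.mean(samples)
sample_median = statistics.median(samples)
```

The transformed samples are strictly positive and right-skewed (log-normal), even though the only source of randomness is a standard normal variable.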

23 of 38

A Probabilistic Motivation [18]

Adversarial training is used to train the NN and learn the complex transformation function

24 of 38

Generative Adversarial Networks (GANs) [15, 16, 17]

An unsupervised modeling technique that learns the regularities or patterns in the input data so that the model can generate new examples that could plausibly have been drawn from the original dataset

The problem is framed as a supervised learning problem with two sub-models:

  • the generator model generates new examples
  • the discriminator model tries to classify examples as either real or generated
  • the two models are trained together in a zero-sum adversarial game, until the discriminator model is fooled about half the time, meaning the generator model is generating plausible examples

25 of 38

Generative Adversarial Networks (GANs) [15, 16, 17]

Generator model: takes a fixed-length random vector as input and generates a sample in the domain

  • the vector (latent variable) is drawn randomly from a Gaussian distribution
  • it is used to seed the generative process
  • after training, points in this multidimensional vector space correspond to points in the problem domain, forming a compressed representation of the data distribution

26 of 38

Generative Adversarial Networks (GANs) [15, 16, 17]

Discriminator model: takes an example from the domain as input (real or generated) and predicts a binary class label of real or fake (generated)
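The interplay of the two models can be sketched with a toy one-dimensional GAN trained by hand-derived gradients. Everything here is an assumption made for illustration: the "real" data come from N(4, 1), the generator is the linear map G(z) = a*z + b, the discriminator is the logistic model D(x) = sigmoid(w*x + c), and the generator update uses the common non-saturating loss -log D(G(z)). Real GANs use neural networks and a deep learning framework.

```python
import math
import random

def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def train_toy_gan(steps=2000, lr=0.01, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator parameters: G(z) = a*z + b
    w, c = 0.1, 0.0   # discriminator parameters: D(x) = sigmoid(w*x + c)
    for _ in range(steps):
        x_real = rng.gauss(4.0, 1.0)   # hypothetical real data
        z = rng.gauss(0.0, 1.0)        # latent vector (here a scalar)
        x_fake = a * z + b

        # Discriminator step: minimize -[log D(x_real) + log(1 - D(x_fake))].
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        grad_w = -(1.0 - d_real) * x_real + d_fake * x_fake
        grad_c = -(1.0 - d_real) + d_fake
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator step: minimize -log D(G(z)) with the discriminator fixed.
        d_fake = sigmoid(w * x_fake + c)
        grad_out = -(1.0 - d_fake) * w   # d loss / d x_fake
        a -= lr * grad_out * z
        b -= lr * grad_out
    return (a, b), (w, c)
```

The sketch only shows the alternating update pattern; it makes no claim about convergence, which is difficult to guarantee even in this simple setting.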

28 of 38

Generative Adversarial Networks (GANs)

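Original loss function (the generator tries to minimize it, the discriminator tries to maximize it):

Ex[log(D(x))] + Ez[log(1 - D(G(z)))]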

  • D(x) is the discriminator's estimate that a real data instance is real
  • Ex is the expected value over all real data instances
  • G(z) is the generator's output on input z
  • D(G(z)) is the discriminator's estimate that a fake instance is real
  • Ez is the expected value over all random inputs to the generator
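As a small numeric illustration, the objective built from these terms can be estimated by averaging over a batch of discriminator outputs. The function name and the example output values are hypothetical.

```python
import math

def gan_value(d_real_outputs, d_fake_outputs):
    """Batch estimate of Ex[log D(x)] + Ez[log(1 - D(G(z)))].

    d_real_outputs: discriminator outputs D(x) on real instances
    d_fake_outputs: discriminator outputs D(G(z)) on generated instances
    """
    real_term = sum(math.log(d) for d in d_real_outputs) / len(d_real_outputs)
    fake_term = sum(math.log(1.0 - d) for d in d_fake_outputs) / len(d_fake_outputs)
    return real_term + fake_term

# A discriminator that separates real from fake well ...
confident = gan_value([0.9, 0.95], [0.1, 0.05])
# ... versus one that outputs 0.5 everywhere because it cannot tell them apart.
fooled = gan_value([0.5, 0.5], [0.5, 0.5])
```

When the discriminator outputs 0.5 for every input, the value equals 2 * log(0.5), which is lower than the value reached by a confident discriminator.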

29 of 38

Challenges of Real-World GANs [19, 20]

  • Oscillating loss
  • Mode collapse: the generator learns to produce only a small subset of the possible outcomes
  • Uninformative loss: no correlation between loss value and model quality
  • Vanishing gradients [21]: if the discriminator is too good, generator training can fail due to vanishing gradients
  • Large number of hyperparameters
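The vanishing-gradient problem can be seen in a short calculation, assuming a discriminator of the form D = sigmoid(s), where s is its pre-activation (logit) on a generated sample. The sketch compares the gradient of the generator's term log(1 - D(G(z))) with that of the commonly used non-saturating alternative -log D(G(z)), both taken with respect to s. The function names are hypothetical.

```python
import math

def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def saturating_grad(logit):
    """d/ds log(1 - sigmoid(s)) = -sigmoid(s)."""
    return -sigmoid(logit)

def non_saturating_grad(logit):
    """d/ds [-log sigmoid(s)] = -(1 - sigmoid(s))."""
    return -(1.0 - sigmoid(logit))

# The more negative the logit, the more confident the discriminator is
# that the generated sample is fake.
logits = [0.0, -2.0, -5.0, -10.0]
table = [(s, saturating_grad(s), non_saturating_grad(s)) for s in logits]
```

As the discriminator becomes confident that a sample is fake, the first gradient shrinks toward zero while the second approaches -1, which is why the non-saturating form is often preferred in practice.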

30 of 38

GAN Variations

  • Progressive GANs [23]
    • the generator's first layers produce very low-resolution images, and subsequent layers add details
    • train more quickly and produce higher-resolution images

  • Conditional GANs [24]
    • they model the conditional probability P(X|Y)
    • train on a labeled data set and let you specify the label for each generated instance
    • Example
      • an unconditional MNIST GAN would produce random digits
      • a conditional MNIST GAN would let you specify which digit the GAN should generate
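One common way to condition the generator is to append a one-hot encoding of the desired label to the noise vector. The sketch below only builds that input; the noise dimension and the function names are hypothetical, and the generator network itself is omitted.

```python
import random

def one_hot(label, n_classes):
    """Encode an integer label as a one-hot list."""
    return [1.0 if i == label else 0.0 for i in range(n_classes)]

def generator_input(label, n_classes=10, noise_dim=4, rng=random):
    """Concatenate random noise with the one-hot label."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return noise + one_hot(label, n_classes)

rng = random.Random(0)
g_in = generator_input(7, rng=rng)   # request the digit 7
```

An unconditional generator would receive only the noise part, so there would be no way to choose which digit it produces.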

31 of 38

GAN Variations

  • Image Generation

  • Image-to-Image Translation [25]: take an image as input and map it to a generated output image with different properties

32 of 38

GAN Variations

  • CycleGAN [26]: transform images from one set into images that could plausibly belong to another set

  • Super-resolution [27]: increase the resolution of images, adding detail where necessary to fill in blurry areas

33 of 38

GAN Variations

  • Text-to-Image Synthesis [28]

  • Face Inpainting [29]

34 of 38

GAN Variations

  • Text-to-Speech [30]

  • Music generation [31]

https://salu133445.github.io/musegan/audio/best_samples.mp3

  • Text Generation?

35 of 38

GAN Variations

  • Video Prediction [34]

  • 3D Object Generation [35]

36 of 38

Hands-on Naïve Bayes and GAN examples

37 of 38

References

  1. https://towardsdatascience.com/deep-generative-models-25ab2821afd3
  2. https://towardsdatascience.com/understanding-variational-autoencoders-vaes-f70510919f73
  3. https://cedar.buffalo.edu/~srihari/CSE676/21.3-VAE-Apps.pdf
  4. A. Razavi, A. van den Oord and O. Vinyals: Generating Diverse High-Fidelity Images with VQ-VAE-2, NeurIPS 2019
  5. C. Li, X. Gao, Y. Li, B. Peng, X. Li, Y. Zhang and J. Gao: Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space, EMNLP 2020
  6. https://www.jeremyjordan.me/variational-autoencoders/
  7. D. P. Kingma and M. Welling: An Introduction to Variational Autoencoders, 2019
  8. https://awesomeopensource.com/project/matthewvowels1/Awesome-VAEs
  9. Q. Dong, G. Lian, et al: Deep Variational Autoencoder for Mapping Functional Brain Networks, IEEE Transactions on Cognitive and Developmental Systems 2020
  10. https://towardsdatascience.com/generative-vs-2528de43a836
  11. D. M. Blei, A. Y. Ng and M. I. Jordan: Latent Dirichlet Allocation, NIPS 2001
  12. https://towardsdatascience.com/intuitively-understanding-variational-autoencoders-1bfe67eb5daf
  13. A. Roberts, J. Engel and D. Eck: Hierarchical Variational Autoencoders for Music, NIPS 2017
  14. https://www.jeremyjordan.me/variational-autoencoders/
  15. https://machinelearningmastery.com/what-are-generative-adversarial-networks-gans/
  16. I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville and Y. Bengio: Generative Adversarial Networks, NIPS 2014
  17. I. J. Goodfellow: Generative Adversarial Networks, Tutorial NIPS 2016
  18. https://towardsdatascience.com/understanding-generative-adversarial-networks-gans-cd6e4651a29
  19. D. Foster: Generative Deep Learning, 2019
  20. https://machinelearningmastery.com/practical-guide-to-gan-failure-modes/

38 of 38

References

  21. M. Arjovsky and L. Bottou: Towards Principled Methods for Training Generative Adversarial Networks, ICLR 2017
  22. https://developers.google.com/machine-learning/gan/applications
  23. T. Karras, T. Aila, S. Laine and J. Lehtinen: Progressive Growing of GANs for Improved Quality, Stability, and Variation, ICLR 2018
  24. M. Mirza and S. Osindero: Conditional Generative Adversarial Nets, arXiv:1411.1784
  25. P. Isola, J.-Y. Zhu, T. Zhou and A. A. Efros: Image-to-Image Translation with Conditional Adversarial Nets, CVPR 2017
  26. J.-Y. Zhu, T. Park, P. Isola and A. A. Efros: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, ICCV 2017
  27. C. Ledig et al: Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network, CVPR 2017
  28. H. Zhang, T. Xu, H. Li, S. Zhang, X. Wang, X. Huang and D. Metaxas: StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks, ICCV 2017
  29. R. A. Yeh, C. Chen, T. Y. Lim, A. G. Schwing, M. Hasegawa-Johnson and M. N. Do: Semantic Image Inpainting with Deep Generative Models, CVPR 2017
  30. S. Yang, L. Xie, X. Chen, X. Lou, X. Zhu, D. Huang and H. Li: Statistical Parametric Speech Synthesis Using Generative Adversarial Networks Under A Multi-task Learning Framework, ASRU 2017
  31. https://salu133445.github.io/musegan/
  34. C. Vondrick, H. Pirsiavash and A. Torralba: Generating Videos with Scene Dynamics, NIPS 2016
  35. M. Gadelha, S. Maji and R. Wang: 3D Shape Induction from 2D Views of Multiple Objects, 3DV 2017