1 of 38

Generative Models

Theory and techniques

2 of 38

Outline

Overview of Generative Models
Probabilistic Generative Models
Deep Generative Models

Variational Autoencoder (VAE)
Generative Adversarial Networks (GANs)

Hands-on Naïve Bayes and GAN examples

4 of 38

Overview of Generative Models

Informally:

a generative model can create new data instances

More formally:

given an input variable X and a target variable Y, a probabilistic generative model is a statistical model of the joint probability distribution on X × Y, P(X, Y)

The term generative model is also used for models that generate instances of variables in a way that has no clear relationship to probability distributions over potential samples of input variables. Generative adversarial networks (GANs) are examples of this class of generative models.

10

5 of 38

Generative vs. Discriminative Modeling

[Figure: two panels. Discriminative Modeling: Input Data → Prediction. Generative Modeling: Random Input → Generated Example.]

6 of 38

Popular Probabilistic Generative Models

Naïve Bayes
Gaussian Mixture Model (GMM)
Hidden Markov Model (HMM)
Latent Dirichlet Allocation (LDA)
Markov Random Fields*
Variational Autoencoder**

7 of 38

Popular Probabilistic Generative Models

Naïve Bayes

P(X, Y) = P(Y) * P(X | Y) = P(Y) * P(x₁ | Y) * P(x₂ | Y) * … * P(x_N | Y)

Classifier
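
The factorization above can be sketched in a few lines of Python. This is a minimal illustration, not the hands-on tutorial code: the class names, the two binary features, and all probabilities are made-up assumptions. It shows the same joint P(X, Y) used both as a classifier (argmax over Y) and as a generator (sample Y, then each x_i given Y).

```python
import random

# Hypothetical toy parameters (not from the slides): a binary class Y and
# two binary features x1, x2 that are conditionally independent given Y.
P_Y = {"spam": 0.4, "ham": 0.6}
P_X_GIVEN_Y = {
    "spam": [0.8, 0.3],  # P(x_i = 1 | Y = spam) for i = 1, 2
    "ham": [0.1, 0.5],   # P(x_i = 1 | Y = ham)  for i = 1, 2
}

def joint(x, y):
    """P(X = x, Y = y) = P(y) * prod_i P(x_i | y)."""
    p = P_Y[y]
    for xi, p1 in zip(x, P_X_GIVEN_Y[y]):
        p *= p1 if xi == 1 else 1.0 - p1
    return p

def classify(x):
    """Use the joint as a classifier: argmax over y of P(x, y)."""
    return max(P_Y, key=lambda y: joint(x, y))

def sample(rng):
    """Use the joint as a generator: draw y ~ P(Y), then each x_i ~ P(x_i | y)."""
    y = rng.choices(list(P_Y), weights=list(P_Y.values()))[0]
    x = [1 if rng.random() < p1 else 0 for p1 in P_X_GIVEN_Y[y]]
    return x, y
```

For example, `classify([1, 0])` picks "spam" because 0.4 * 0.8 * 0.7 = 0.224 exceeds 0.6 * 0.1 * 0.5 = 0.03.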

8 of 38

Popular Probabilistic Generative Models

Gaussian Mixture Model

[Figure: latent variable Z and observed variable X]

9 of 38

Popular Probabilistic Generative Models

Hidden Markov Model

P(X, Y) = P(y₁) * P(x₁ | y₁) * P(y₂ | y₁) * P(x₂ | y₂) * … * P(y_N | y_{N-1}) * P(x_N | y_N)

Sequence model
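
As a rough sketch of how this joint is evaluated, the function below multiplies the initial, transition, and emission probabilities along one sequence. The two-state weather model and all of its numbers are hypothetical, chosen only to make the example concrete.

```python
# Hypothetical two-state weather HMM (parameters are illustrative only).
INITIAL = {"rainy": 0.6, "sunny": 0.4}            # P(y_1)
TRANSITION = {                                    # P(y_t | y_{t-1})
    "rainy": {"rainy": 0.7, "sunny": 0.3},
    "sunny": {"rainy": 0.4, "sunny": 0.6},
}
EMISSION = {                                      # P(x_t | y_t)
    "rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}

def hmm_joint(observations, states):
    """P(X, Y) for one observation sequence X and one hidden-state sequence Y."""
    if len(observations) != len(states) or not states:
        raise ValueError("sequences must be non-empty and of equal length")
    # First step: P(y_1) * P(x_1 | y_1)
    p = INITIAL[states[0]] * EMISSION[states[0]][observations[0]]
    # Remaining steps: P(y_t | y_{t-1}) * P(x_t | y_t)
    for t in range(1, len(states)):
        p *= TRANSITION[states[t - 1]][states[t]]
        p *= EMISSION[states[t]][observations[t]]
    return p
```

With these numbers, `hmm_joint(["walk", "shop"], ["sunny", "rainy"])` is 0.4 * 0.6 * 0.4 * 0.4 = 0.0384.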

10 of 38

Popular Probabilistic Generative Models

11 of 38

Popular Probabilistic Generative Models

Latent Dirichlet Allocation¹¹

P(W, Z, Θ, Φ | α, β) = P(W | Z, Φ) * P(Φ | β) * P(Z | Θ) * P(Θ | α)

Admixture model
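
Read right to left, the factorization describes a generative process: draw topic-word distributions Φ, draw per-document topic proportions Θ, then draw a topic Z and a word W for every position. The sketch below follows that process using only the Python standard library. The function names, corpus sizes, and default hyperparameter values are assumptions for illustration, and symmetric Dirichlet priors are assumed.

```python
import random

def dirichlet(rng, alpha):
    """Draw from a Dirichlet by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_corpus(rng, n_docs, doc_len, vocab, n_topics, alpha=0.5, beta=0.5):
    # Phi: one word distribution per topic, Phi_k ~ Dirichlet(beta)
    phi = [dirichlet(rng, [beta] * len(vocab)) for _ in range(n_topics)]
    corpus = []
    for _ in range(n_docs):
        # Theta: topic proportions of this document, Theta_d ~ Dirichlet(alpha)
        theta = dirichlet(rng, [alpha] * n_topics)
        doc = []
        for _ in range(doc_len):
            z = rng.choices(range(n_topics), weights=theta)[0]  # Z ~ Categorical(Theta_d)
            w = rng.choices(vocab, weights=phi[z])[0]           # W ~ Categorical(Phi_z)
            doc.append(w)
        corpus.append(doc)
    return corpus
```

Because each document mixes several topics through its own Θ, the model is an admixture rather than a plain mixture.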

12 of 38

Variational Autoencoders (VAEs)

13 of 38

Variational Autoencoder (VAE)^1,2,7

unsupervised technique
consists of two networks:

encoder
decoder

the (latent) encoded vector

has enough information to generate (reconstruct) the input
is constrained to come from a multidimensional Gaussian distribution

the latent space is carefully regularized, so that random points in that space can generate meaningful data points

14 of 38

Variational Autoencoder (VAE)^2,7

VAE training

The input point is encoded as a distribution over a latent space
A point from the latent space is sampled from that distribution
The sampled point is decoded, and the model error is computed
The error is backpropagated through the network
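
The sampling step is usually written with the reparameterization trick from Kingma and Welling, so that the error can be backpropagated through the random draw. The slides do not name this trick, so treat the sketch below as an assumption about how the step is typically implemented. The encoder is assumed to output a mean `mu` and a log-variance `log_var` per latent dimension.

```python
import math
import random

def sample_latent(mu, log_var, rng):
    """Draw z ~ N(mu, diag(exp(log_var))) as z = mu + sigma * eps, eps ~ N(0, 1).

    All randomness lives in eps, so z is a deterministic, differentiable
    function of mu and log_var.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]
```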

15 of 38

Variational Autoencoder (VAE)^2,7

VAE training

The loss function contains:

an input reconstruction term
a regularization term to ensure the distribution returned by the encoder is close to the standard normal
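
A minimal sketch of the two terms, under stated assumptions: the encoder returns a diagonal Gaussian (mean `mu`, log-variance `log_var`), the reconstruction term is a squared error, and the regularization term is the Kullback-Leibler divergence to the standard normal in closed form. The function name and the unweighted sum of the two terms are illustrative choices.

```python
import math

def vae_loss(x, x_reconstructed, mu, log_var):
    """Return (total, reconstruction, kl) for one data point."""
    # Reconstruction term: squared error between input and decoder output
    # (corresponds to a Gaussian decoder with fixed variance).
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_reconstructed))
    # Regularization term: KL( N(mu, diag(exp(log_var))) || N(0, I) ).
    kl = 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))
    return reconstruction + kl, reconstruction, kl
```

The KL term is zero exactly when `mu` is all zeros and `log_var` is all zeros, that is, when the encoder returns the standard normal.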

16 of 38

Variational Autoencoder (VAE)^2,6,7

The regularization term is enforcing:

covariance matrices to be close to the identity, which prevents point-like (near-zero-variance) distributions
means to be close to zero, which prevents the encoded distributions from being too far apart from each other
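
Both effects can be read off the closed form of the divergence, assuming the regularization term is the KL divergence between a Gaussian encoder output N(μ, Σ) in d dimensions and the standard normal:

```latex
D_{\mathrm{KL}}\big(\mathcal{N}(\mu, \Sigma) \,\|\, \mathcal{N}(0, I)\big)
  = \tfrac{1}{2}\Big( \operatorname{tr}(\Sigma) + \mu^{\top}\mu - d - \log\det\Sigma \Big)
```

The term μᵀμ grows as the mean moves away from zero. The term −log det Σ grows without bound as the covariance collapses toward a point, while tr(Σ) penalizes large variances. The expression is zero only at μ = 0, Σ = I.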

17 of 38

Probabilistic view of VAEs¹⁴

The VAE can be viewed as a probabilistic model where:

the probabilistic decoder computes the distribution of the data given the latent encoded variable i.e., P(X|Z)
the latent encoded variable has a prior P(Z) which is a standard normal distribution, i.e., N(0, I)
inference of the posterior P(Z|X) is performed using (amortized) variational inference
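
In this view, training maximizes the evidence lower bound (ELBO). Writing Q(Z|X) for the encoder's approximation to the posterior (Q is notation introduced here, not taken from the slides):

```latex
\log P(X) \;\ge\;
  \mathbb{E}_{Q(Z \mid X)}\big[\log P(X \mid Z)\big]
  \;-\; D_{\mathrm{KL}}\big(Q(Z \mid X) \,\|\, P(Z)\big)
```

The first term is the reconstruction term of the loss, and the second is the regularization toward the prior N(0, I).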

18 of 38

Probabilistic view of VAEs¹⁴

19 of 38

Properties of Variational Autoencoders¹²

VAEs work better than other methods at exploring variations of existing data.

20 of 38

Applications of Variational Autoencoders

Powerful generative models^8,13 for many kinds of complex data, including:

image generation⁴*
music generation

interpolate between MIDI samples ¹³ **

language modeling⁵
neuroimaging/brain mapping⁹

21 of 38

Generative Adversarial Networks (GANs)

22 of 38

A Probabilistic Motivation¹⁸

Complex data are generated by very complex probability distributions over very large spaces

e.g., the distribution of black and white images of dogs

Most of the time, we

neither know how to express such distributions explicitly
nor have a way to sample from them

With GANs we express such distributions as very complex transformations of normal variables

NNs are used to express and learn these transformations
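
A one-dimensional toy version of this idea: push standard normal samples through a function and obtain samples from a different distribution. The transformation below is hand-written and hypothetical. In a GAN it would be a neural network whose parameters are learned.

```python
import math
import random

def transform(z):
    """Hypothetical fixed transformation standing in for a learned network."""
    return math.exp(0.5 * z + 1.0)  # maps N(0, 1) samples to a log-normal

rng = random.Random(0)
samples = [transform(rng.gauss(0.0, 1.0)) for _ in range(100_000)]
empirical_mean = sum(samples) / len(samples)
# Theoretical mean of this log-normal: exp(1.0 + 0.5 * 0.5 ** 2) = exp(1.125)
theoretical_mean = math.exp(1.125)
```

The empirical mean of the transformed samples should land close to the theoretical value, even though the code never writes down the target density.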

23 of 38

A Probabilistic Motivation¹⁸

Adversarial training is used to train the NN and learn the complex transformation function

24 of 38

Generative Adversarial Networks (GANs)^15,16,17

Unsupervised modelling technique that learns the regularities or patterns in input data in such a way that the model can be used to generate or output new examples that could have been drawn from the original dataset

The problem is framed as a supervised learning problem with two sub-models:

the generator model generates new examples
the discriminator model tries to classify examples as either real or generated
the two models are trained together in a zero-sum adversarial game, until the discriminator model is fooled about half the time, meaning the generator model is generating plausible examples

25 of 38

Generative Adversarial Networks (GANs)^15,16,17

Generator Model: takes a fixed-length random vector as input and generates a sample in the domain

the vector (latent variable) is drawn randomly from a Gaussian distribution
it is used to seed the generative process
after training, points in this multidimensional vector space correspond to points in the problem domain, forming a compressed representation of the data distribution

26 of 38

Generative Adversarial Networks (GANs)^15,16,17

Discriminator Model: takes an example from the domain as input (real or generated) and predicts a binary class label of real or fake (generated)

27 of 38

Generative Adversarial Networks (GANs)¹⁸

28 of 38

Generative Adversarial Networks (GANs)

Original loss function:

D(x) is the discriminator's estimate that a real data instance is real
E_x is the expected value over all real data instances
G(z) is the generator's output on input z
D(G(z)) is the discriminator's estimate that a fake instance is real
E_z is the expected value over all random inputs to the generator
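
The terms defined above are those of the minimax objective in the original GAN paper¹⁶: the discriminator maximizes, and the generator minimizes, E_x[log D(x)] + E_z[log(1 − D(G(z)))]. The sketch below estimates this quantity from batches of discriminator outputs. The function name and the batch inputs are illustrative assumptions.

```python
import math

def gan_value(d_real, d_fake):
    """Monte Carlo estimate of E_x[log D(x)] + E_z[log(1 - D(G(z)))].

    d_real: discriminator outputs D(x) on real instances, each in (0, 1)
    d_fake: discriminator outputs D(G(z)) on generated instances, each in (0, 1)
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term
```

When the discriminator is fooled about half the time and outputs 0.5 everywhere, the value is 2 * log(0.5), roughly -1.386. A discriminator that separates real from generated instances well pushes the value toward 0.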

29 of 38

Challenges of Real World GANs^19,20

Oscillating Loss

Mode Collapse: the generator learns to produce only a small subset of the possible outcomes
Uninformative loss: no correlation between loss value and model quality

Vanishing Gradients²¹: if the discriminator is too good, then the generator training can fail due to vanishing gradients
Large number of hyper-parameters
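
One way to see the vanishing-gradient problem, assuming the discriminator ends in a sigmoid: the generator's term log(1 − D(G(z))) has derivative −sigmoid(a) with respect to the discriminator's pre-sigmoid score a. When the discriminator confidently rejects a generated sample, a is very negative and the gradient is almost zero. The function names below are illustrative.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def generator_grad(logit):
    """Derivative of log(1 - sigmoid(logit)) with respect to logit.

    `logit` is the discriminator's pre-sigmoid score for a generated sample;
    the derivative simplifies to -sigmoid(logit).
    """
    return -sigmoid(logit)

# Undecided discriminator (logit 0) versus a very confident one (logit -20).
undecided = generator_grad(0.0)    # -0.5
confident = generator_grad(-20.0)  # magnitude around 2e-9
```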

30 of 38

GAN Variations

Progressive GANs²³

the generator's first layers produce very low-resolution images, and subsequent layers add details
trains more quickly and produces higher-resolution images

Conditional GANs²⁴

they model the conditional probability P(X | Y)
train on a labeled data set and let you specify the label for each generated instance
Example

an unconditional MNIST GAN would produce random digits
a conditional MNIST GAN would let you specify which digit the GAN should generate
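
A common way to condition the generator, in line with Mirza and Osindero's approach of feeding the label as an additional input, is to concatenate the noise vector with an encoding of the label. The sketch below only builds that input vector. The latent size, the one-hot encoding, and the function names are assumptions for illustration.

```python
import random

NUM_CLASSES = 10  # MNIST digits 0-9
LATENT_DIM = 8    # hypothetical latent size, chosen only for illustration

def one_hot(label, num_classes=NUM_CLASSES):
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def conditional_generator_input(label, rng, latent_dim=LATENT_DIM):
    """Concatenate a Gaussian noise vector z with a one-hot encoding of the label y."""
    z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return z + one_hot(label)
```

Calling `conditional_generator_input(7, random.Random(0))` yields an 18-dimensional vector whose last ten entries select the digit 7. An unconditional GAN would pass only the first eight entries.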

31 of 38

GAN Variations

Image Generation

Image-to-Image Translation²⁵: take an image as input and map it to a generated output image with different properties

32 of 38

GAN Variations

CycleGAN²⁶: transform images from one set into images that could plausibly belong to another set

Super-resolution²⁷: increase the resolution of images, adding detail where necessary to fill in blurry areas

33 of 38

GAN Variations

Text-to-image Synthesis²⁸

Face Inpainting²⁹

34 of 38

GAN Variations

Text-to-Speech³⁰

Music generation³¹

https://salu133445.github.io/musegan/audio/best_samples.mp3

Text Generation?

35 of 38

GAN Variations

Video Prediction³⁴

3D Object Generation³⁵

36 of 38

Hands-on Naïve Bayes and GAN examples

https://bitbucket.org/diip20201/tutorials/src/master/generative_models/

37 of 38

References

https://towardsdatascience.com/deep-generative-models-25ab2821afd3
https://towardsdatascience.com/understanding-variational-autoencoders-vaes-f70510919f73
https://cedar.buffalo.edu/~srihari/CSE676/21.3-VAE-Apps.pdf
A. Razavi, A. van den Oord and O. Vinyals: Generating Diverse High-Fidelity Images with VQ-VAE-2, NeurIPS 2019
C. Li, X. Gao, Y. Li, B. Peng, X. Li, Y. Zhang and J. Gao: Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space, EMNLP 2020
https://www.jeremyjordan.me/variational-autoencoders/
D. P. Kingma and M. Welling: An Introduction to Variational Autoencoders, 2019
https://awesomeopensource.com/project/matthewvowels1/Awesome-VAEs
Q. Dong, G. Lian, et al: Deep Variational Autoencoder for Mapping Functional Brain Networks, IEEE Transactions on Cognitive and Developmental Systems 2020
https://towardsdatascience.com/generative-vs-2528de43a836
D. M. Blei, A. Y. Ng and M. I. Jordan: Latent Dirichlet Allocation, NIPS 2001
https://towardsdatascience.com/intuitively-understanding-variational-autoencoders-1bfe67eb5daf
A. Roberts, J. Engel and D. Eck: Hierarchical Variational Autoencoders for Music, NIPS 2017
https://www.jeremyjordan.me/variational-autoencoders/
https://machinelearningmastery.com/what-are-generative-adversarial-networks-gans/
I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville and Y. Bengio: Generative Adversarial Nets, NIPS 2014
I. J. Goodfellow: Generative Adversarial Networks, Tutorial NIPS 2016
https://towardsdatascience.com/understanding-generative-adversarial-networks-gans-cd6e4651a29
D. Foster: Generative Deep Learning, 2019
https://machinelearningmastery.com/practical-guide-to-gan-failure-modes/

38 of 38

References

M. Arjovsky and L. Bottou: Towards Principled Methods for Training Generative Adversarial Networks, ICLR 2017
https://developers.google.com/machine-learning/gan/applications
T. Karras, T. Aila, S. Laine and J. Lehtinen: Progressive Growing of GANs for Improved Quality, Stability, and Variation, ICLR 2018
M. Mirza and S. Osindero: Conditional Generative Adversarial Nets, arXiv:1411.1784
P. Isola, J.-Y. Zhu, T. Zhou and A. A. Efros: Image-to-Image Translation with Conditional Adversarial Networks, CVPR 2017
J.-Y. Zhu, T. Park, P. Isola and A. A. Efros: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, ICCV 2017
C. Ledig et al: Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network, CVPR 2017
H. Zhang, T. Xu, H. Li, S. Zhang, X. Wang, X. Huang and D. Metaxas: StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks, ICCV 2017
R. A. Yeh, C. Chen, T. Y. Lim, A. G. Schwing, M. Hasegawa-Johnson, M. N. Do: Semantic Image Inpainting with Deep Generative Models, CVPR 2017
S. Yang, L. Xie, X. Chen, X. Lou, X. Zhu, D. Huang, H. Li: Statistical Parametric Speech Synthesis Using Generative Adversarial Networks Under A Multi-task Learning Framework, ASRU 2017
https://salu133445.github.io/musegan/
C. Vondrick, H. Pirsiavash and A. Torralba: Generating Videos with Scene Dynamics, NIPS 2016
M. Gadelha, S. Maji and R. Wang: 3D Shape Induction from 2D Views of Multiple Objects, 3DV 2017