1 of 52

Generative Adversarial Networks

2 of 52

Today’s class

  • Unsupervised Learning

  • Generative Models
    • Autoencoders (AE)
    • Generative Adversarial Networks (GAN)
    • GANs: Recent Trends

3 of 52

Supervised vs Unsupervised Learning

Credit: cs231n, Stanford

Supervised Learning

Data: (x, y)

x is data, y is label

Goal: Learn a function to map x -> y

Examples: Classification, regression, object detection, semantic segmentation, image captioning, etc.

4 of 52

Cat

Classification

5 of 52

Object Detection

6 of 52

Semantic Segmentation

7 of 52

Image Captioning

8 of 52

Supervised vs Unsupervised Learning

Credit: cs231n, Stanford

Unsupervised Learning

Data: x

Just data, no labels!

Goal: Learn some underlying hidden structure of the data

Examples: Clustering, dimensionality reduction, feature learning, density estimation, etc.

9 of 52

K-Means Clustering
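
A minimal NumPy sketch of k-means (Lloyd's algorithm) on unlabeled points, to show what "learning hidden structure without labels" can look like. The function name `kmeans`, the synthetic two-blob data, and the choice to initialize centroids from random data points are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def kmeans(x, k, num_iters=20, seed=0):
    """Alternate between assigning points to centroids and updating centroids."""
    rng = np.random.default_rng(seed)
    # Assumed initialization: pick k distinct data points as starting centroids.
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(num_iters):
        # Squared distance from every point to every centroid, shape (n, k).
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids, labels

# Two synthetic, well-separated blobs; no labels are given to the algorithm.
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(50, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(50, 2))
data = np.vstack([blob_a, blob_b])
centroids, labels = kmeans(data, k=2)
```

Only `data` (x) is passed in; the cluster assignments in `labels` are discovered from the data alone.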

10 of 52

Dimensionality Reduction (Principal Component Analysis)
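
A short NumPy sketch of PCA via the singular value decomposition, as one way to do dimensionality reduction. The helper name `pca` and the synthetic data (points lying close to a line in 2-D) are assumptions made for illustration.

```python
import numpy as np

def pca(x, num_components):
    """Project centered data onto its top principal directions."""
    mean = x.mean(axis=0)
    centered = x - mean
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:num_components]
    z = centered @ components.T      # low-dimensional codes, shape (n, num_components)
    x_hat = z @ components + mean    # reconstruction in the original space
    return z, x_hat, components

# Synthetic 2-D data that varies almost entirely along the direction (1, 2).
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
data = np.hstack([t, 2.0 * t]) + 0.01 * rng.normal(size=(200, 2))
z, x_hat, components = pca(data, num_components=1)
```

Here one number per point (`z`) is enough to reconstruct the 2-D data almost exactly, because the data has a single dominant factor of variation.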

11 of 52

Generative Adversarial Networks (Distribution learning)

12 of 52

Autoencoders

Unsupervised approach for learning a lower-dimensional feature representation from unlabeled training data

Credit: cs231n, Stanford

13 of 52

Originally: Linear + nonlinearity (sigmoid)

Later: Deep, fully-connected

Later: ReLU CNN

Credit: cs231n, Stanford

14 of 52

Z usually smaller than X

(Dimensionality Reduction)

Credit: cs231n, Stanford

Q: Why dimensionality reduction?

15 of 52

A: Want features to capture meaningful factors of variation in data

16 of 52

Autoencoders

How to learn this feature representation?

Credit: cs231n, Stanford

17 of 52

Train such that the features can be used to reconstruct the original data (“autoencoding”: encoding the input itself)

Credit: cs231n, Stanford

18 of 52

19 of 52

Decoder: 4-layer upconv

Encoder: 4-layer conv

Input Data

Reconstructed Data

Credit: cs231n, Stanford

20 of 52

L2 Loss Function

21 of 52

Doesn’t use labels!
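
The reconstruction objective can be sketched with a tiny autoencoder trained on the L2 loss, mean ||x - x_hat||^2. This is a simplified stand-in: it assumes a single linear layer for the encoder and for the decoder (not the 4-layer conv / upconv networks on the slide), plain gradient descent, and synthetic data. All names are hypothetical.

```python
import numpy as np

def train_linear_autoencoder(x, code_dim, lr=0.01, num_steps=2000, seed=0):
    """Train x -> z -> x_hat with an L2 reconstruction loss (no labels used)."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w_enc = 0.1 * rng.normal(size=(d, code_dim))   # encoder weights: x -> z
    w_dec = 0.1 * rng.normal(size=(code_dim, d))   # decoder weights: z -> x_hat
    losses = []
    for _ in range(num_steps):
        z = x @ w_enc
        x_hat = z @ w_dec
        diff = x_hat - x
        losses.append((diff ** 2).sum(axis=1).mean())  # mean ||x - x_hat||^2
        # Backpropagate the L2 loss through the decoder and the encoder.
        grad_x_hat = 2.0 * diff / n
        grad_w_dec = z.T @ grad_x_hat
        grad_w_enc = x.T @ (grad_x_hat @ w_dec.T)
        w_enc -= lr * grad_w_enc
        w_dec -= lr * grad_w_dec
    return w_enc, w_dec, losses

# Synthetic 8-D data generated from 2 hidden factors, rescaled to unit spread.
rng = np.random.default_rng(1)
latent = rng.normal(size=(256, 2))
x = latent @ rng.normal(size=(2, 8))
x = x / x.std()
w_enc, w_dec, losses = train_linear_autoencoder(x, code_dim=2)
features = x @ w_enc   # the learned lower-dimensional representation
```

The training loop only ever sees `x`; the target of the loss is the input itself.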

22 of 52

After training, throw away decoder

23 of 52

Autoencoders

Credit: cs231n, Stanford

24 of 52

Autoencoders

Credit: cs231n, Stanford

Loss Function (Softmax, etc.)

Encoder can be used to initialize a supervised model

25 of 52

Fine-tune encoder jointly with classifier

Train for final task (sometimes with small data)

26 of 52

27 of 52

28 of 52

Credit: cs231n, Stanford

Generative Adversarial Networks

Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014

Sample from a simple distribution, e.g. random noise. Learn transformation to training distribution.

29 of 52

Can a neural network be used to represent this complex transformation?

30 of 52

Training GANs: Two-player game

Credit: cs231n, Stanford

Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014

Generator network: try to fool the discriminator by generating real-looking images

Discriminator network: try to distinguish between real and fake images

31 of 52

Fake and real images copyright Emily Denton et al. 2015.

32 of 52

33 of 52

34 of 52

Figure: Ian Goodfellow NIPS Talk

35 of 52

After training, use generator network to generate new images

36 of 52

37 of 52

38 of 52

  • Discriminator (θ_d) wants to maximize the objective so that D(x) is close to 1 (real) and D(G(z)) is close to 0 (fake)
  • Generator (θ_g) wants to minimize the objective so that D(G(z)) is close to 1 (the discriminator is fooled into thinking the generated G(z) is real)
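
The two bullets above can be checked numerically. The sketch below assumes the objective is the minimax value from the cited paper, V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], estimated on a minibatch of discriminator outputs. The function name `gan_value` and the example probabilities are made up for illustration.

```python
import numpy as np

def gan_value(d_real, d_fake, eps=1e-12):
    """Minibatch estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs D(x) on real samples, in (0, 1)
    d_fake: discriminator outputs D(G(z)) on generated samples, in (0, 1)
    """
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    # eps guards against log(0) when an output saturates at 0 or 1.
    return np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))

# Discriminator is right on both real and fake samples: value near its maximum of 0.
confident = gan_value(d_real=[0.99, 0.98], d_fake=[0.01, 0.02])
# Discriminator is fooled (D(G(z)) close to 1): value becomes strongly negative.
fooled = gan_value(d_real=[0.99, 0.98], d_fake=[0.99, 0.98])
# Discriminator cannot tell real from fake: value equals -2 * log(2).
undecided = gan_value(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
```

The ordering `fooled < undecided < confident` mirrors the game: the discriminator pushes the value up, the generator pushes it down.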

Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014

39 of 52

40 of 52

Training GANs: Two-player game

Credit: cs231n, Stanford

Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014
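
A toy sketch of the alternating updates in the two-player game, under strong simplifying assumptions: 1-D data, a generator G(z) = a*z + b, a logistic discriminator D(x) = sigmoid(w*x + c), hand-derived gradients, and one discriminator step per generator step. None of these choices come from the slides, and the names are hypothetical. It is meant to show the structure of the loop, not a practical GAN.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def train_toy_gan(real_mean=3.0, lr=0.05, num_steps=2000, batch=64, seed=0):
    rng = np.random.default_rng(seed)
    w, c = 0.0, 0.0   # discriminator parameters: D(x) = sigmoid(w * x + c)
    a, b = 1.0, 0.0   # generator parameters: G(z) = a * z + b
    for _ in range(num_steps):
        # Discriminator step: gradient ASCENT on
        # E[log D(x)] + E[log(1 - D(G(z)))].
        x = rng.normal(real_mean, 0.5, size=batch)
        z = rng.normal(size=batch)
        g = a * z + b
        d_real = sigmoid(w * x + c)
        d_fake = sigmoid(w * g + c)
        w += lr * (np.mean((1.0 - d_real) * x) - np.mean(d_fake * g))
        c += lr * (np.mean(1.0 - d_real) - np.mean(d_fake))

        # Generator step: gradient DESCENT on E[log(1 - D(G(z)))].
        z = rng.normal(size=batch)
        d_fake = sigmoid(w * (a * z + b) + c)
        a -= lr * np.mean(-d_fake * w * z)
        b -= lr * np.mean(-d_fake * w)
    return (w, c), (a, b)

(w, c), (a, b) = train_toy_gan()
```

In this setup the generator offset `b` is pushed upward whenever `w > 0`, that is, whenever the discriminator scores larger values as more likely to be real.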

41 of 52

Generative Adversarial Nets

Generated Samples [MNIST Database, Toronto Face Database (TFD)]

Nearest neighbor from training set

Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014

42 of 52

Generative Adversarial Nets

Generated Samples [CIFAR-10 Database]

Fully connected model; convolutional discriminator and “deconvolutional” generator

Nearest neighbor from training set

Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014

43 of 52

DCGAN

Radford et al., “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”, ICLR 2016

Deep Convolutional Generative Adversarial Nets

  • Generator is an upsampling network with fractionally-strided convolutions.

Generator

44 of 52

  • Discriminator is a convolutional network.
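
To make "fractionally-strided convolution" concrete, here is a 1-D NumPy sketch of a transposed convolution with stride 2 and no padding. The function name and the toy input and kernel are assumptions for illustration; real DCGAN generators use learned 2-D kernels.

```python
import numpy as np

def conv_transpose_1d(x, kernel, stride=2):
    """Fractionally-strided (transposed) 1-D convolution without padding."""
    x = np.asarray(x, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    out = np.zeros((len(x) - 1) * stride + len(kernel))
    for i, value in enumerate(x):
        # Each input element adds a scaled copy of the kernel to the output,
        # shifted by `stride` positions; overlapping regions are summed.
        out[i * stride : i * stride + len(kernel)] += value * kernel
    return out

upsampled = conv_transpose_1d([1.0, 2.0], kernel=[1.0, 1.0, 1.0], stride=2)
print(upsampled)  # [1. 1. 3. 2. 2.]
```

A length-2 input becomes a length-5 output, which is why stacking such layers lets the generator grow a small code into a full-size image.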

Radford et al., “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”, ICLR 2016

45 of 52

46 of 52

Generated bedrooms after one training pass through the LSUN dataset.

Amazing!

Radford et al., “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”, ICLR 2016

47 of 52

Image-to-Image Translation with Conditional Adversarial Networks

Phillip Isola et al. 2017

48 of 52

Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks

Jun-Yan Zhu et al. 2017

49 of 52

“The GAN Zoo”

https://github.com/hindupuravinash/the-gan-zoo

50 of 52

51 of 52

And many more ...

52 of 52

GANs: Things to Remember

Take game-theoretic approach: learn to generate from training distribution through 2-player game

Pros:

  • Beautiful, state-of-the-art samples!

Cons:

  • Trickier / more unstable to train
  • Can’t solve inference queries such as p(x), p(z|x)

Active areas of research:

  • Better loss functions, more stable training (Wasserstein GAN, LSGAN, many others)
  • Conditional GANs, GANs for all kinds of applications
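
As a rough illustration of the "better loss functions" bullet, the sketch below writes out the LSGAN and Wasserstein GAN losses on a minibatch of discriminator (or critic) outputs. It assumes the common LSGAN labels (real = 1, fake = 0) and leaves out the Lipschitz constraint that a Wasserstein critic needs in practice (for example weight clipping). Function names are hypothetical.

```python
import numpy as np

def lsgan_losses(d_real, d_fake):
    """Least-squares GAN losses with target 1 for real and 0 for fake."""
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    d_loss = 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)
    g_loss = 0.5 * np.mean((d_fake - 1.0) ** 2)  # generator wants D(G(z)) near 1
    return d_loss, g_loss

def wgan_losses(critic_real, critic_fake):
    """Wasserstein GAN losses; critic outputs are unbounded scores, not probabilities."""
    critic_real = np.asarray(critic_real, dtype=float)
    critic_fake = np.asarray(critic_fake, dtype=float)
    critic_loss = np.mean(critic_fake) - np.mean(critic_real)  # critic minimizes this
    gen_loss = -np.mean(critic_fake)                           # generator minimizes this
    return critic_loss, gen_loss

ls_d, ls_g = lsgan_losses(d_real=[1.0, 1.0], d_fake=[0.0, 0.0])
w_c, w_g = wgan_losses(critic_real=[2.0, 2.0], critic_fake=[-1.0, -1.0])
```

Both replace the log terms of the original objective with losses whose gradients do not vanish as quickly when the discriminator is confident, which is one motivation given for their more stable training.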