Lecture 9:
Generative Models II
Sookyung Kim
Spring 2025
Taxonomy of Generative Models

- Generative models
  - Explicit density
    - Tractable density: Fully Visible Belief Nets (PixelRNN/CNN)
    - Approximate density
      - Variational: Variational Autoencoders
      - Stochastic: Boltzmann Machine
  - Implicit density
    - Direct: Generative Adversarial Networks (GAN) ← Today
    - Stochastic: Generative Stochastic Networks (GSN)

Ian Goodfellow, Tutorial on Generative Adversarial Networks https://arxiv.org/abs/1701.00160
Generative Adversarial Networks (GANs)
Generative Adversarial Networks: Implementation
Generative Adversarial Networks: Applications
Robbie Barrat, Obvious (2018)
Generative Adversarial Networks
Recall that our problem is to sample images from a complex, high-dimensional data distribution p_data(x). GANs give up on modeling the density explicitly: instead, sample z from a simple prior (e.g., Gaussian noise) and learn a network G that transforms z into a sample from the data distribution.
[Diagram: latent z → Generator → G(z); real image x or generated G(z) → Discriminator → D(x), a Real/Fake decision.]
Generative Adversarial Networks: Discriminator
[Diagram: z → Generator → G(z); real x or generated G(z) → Discriminator → D(x) = p(real).]
For a real image x, the discriminator wants to output a high p(real).
(Dθ(x) → 1)
For an image generated by G, the discriminator wants to output a low p(real).
(Dθ(Gφ(z)) → 0)
Generative Adversarial Networks: Generator
[Diagram: z → Generator → G(z); G(z) → Discriminator → p(real).]
The generator plays no role when D receives a real example.
When G generates an image, its goal is to raise D's p(real) output on it by producing a realistic image.
(Dθ(Gφ(z)) → 1)
Generative Adversarial Networks: Objective Function
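The objective here is the minimax game from Goodfellow et al. (2014): min_G max_D V(D, G) = E_x[log D(x)] + E_z[log(1 − D(G(z)))]. A minimal numpy sketch (variable names are illustrative) of how each term is estimated from a minibatch:

```python
import numpy as np

# Monte-Carlo estimate of the GAN value function
#   V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))]
# d_real / d_fake hold the discriminator's outputs on a minibatch of
# real images and on a minibatch of generated images G(z), respectively.
def gan_value(d_real, d_fake, eps=1e-8):
    return np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))

# D performs gradient ascent on V; G performs descent
# (only the second term depends on G).
d_real = np.array([0.9, 0.8, 0.95])                       # D confident on real images
v_good = gan_value(d_real, np.array([0.1, 0.2, 0.05]))    # D rejects fakes
v_fooled = gan_value(d_real, np.array([0.9, 0.8, 0.95]))  # G fools D
assert v_good > v_fooled   # a well-trained D achieves a higher value
```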
Generative Adversarial Networks: Overall Algorithm
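The overall algorithm alternates k gradient steps on the discriminator with one step on the generator. A toy, runnable sketch on 1-D data, using finite-difference gradients in place of backprop (parameter shapes, learning rate, and k are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

# Toy 1-D GAN: real data ~ N(3, 1); G(z) = w*z + c; D(x) = sigmoid(a*x + b).
def D(x, th):  return sigmoid(th[0] * x + th[1])
def G(z, ph):  return ph[0] * z + ph[1]

def d_loss(th, ph, x, z):
    # D maximizes log D(x) + log(1 - D(G(z))), i.e. minimizes the negative
    return -np.mean(np.log(D(x, th) + 1e-8) + np.log(1 - D(G(z, ph), th) + 1e-8))

def g_loss(th, ph, z):
    # non-saturating generator loss: minimize -log D(G(z))
    return -np.mean(np.log(D(G(z, ph), th) + 1e-8))

def grad(f, p, eps=1e-5):
    # finite-difference gradient, standing in for backprop
    g = np.zeros_like(p)
    for i in range(p.size):
        e = np.zeros_like(p); e[i] = eps
        g[i] = (f(p + e) - f(p - e)) / (2 * eps)
    return g

th, ph, lr, k = np.array([0.1, 0.0]), np.array([1.0, 0.0]), 0.05, 1
for step in range(500):
    x = rng.normal(3.0, 1.0, 64)   # minibatch of real samples
    z = rng.normal(0.0, 1.0, 64)   # minibatch of noise
    for _ in range(k):             # k discriminator steps ...
        th -= lr * grad(lambda t: d_loss(t, ph, x, z), th)
    ph -= lr * grad(lambda p: g_loss(th, p, z), ph)  # ... then one generator step
```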
Generative Adversarial Networks: Practical Concerns
In practice, minimizing log(1 − D(G(z))) gives the generator almost no gradient early in training, when D confidently rejects samples and D(G(z)) ≈ 0. The standard fix is to instead maximize log D(G(z)), which yields strong gradients exactly in that regime.
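The practical concern illustrated here is the saturating generator loss: when D confidently rejects samples (D(G(z)) ≈ 0), log(1 − D(G(z))) is nearly flat, while the non-saturating alternative −log D(G(z)) still gives a strong gradient. A small numeric check:

```python
import numpy as np

# Early in training D easily rejects generated samples, so s = D(G(z)) ~ 0.
s = 0.01

# Gradient magnitudes of the two generator losses w.r.t. s:
#   saturating:      L = log(1 - s)  ->  |dL/ds| = 1 / (1 - s)
#   non-saturating:  L = -log(s)     ->  |dL/ds| = 1 / s
grad_saturating = 1.0 / (1.0 - s)   # ~1: weak learning signal
grad_nonsat = 1.0 / s               # 100: strong learning signal
assert grad_nonsat > 50 * grad_saturating
```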
Generative Adversarial Networks: Implementation
FC model trained on CIFAR-10
Deconv/conv model trained on CIFAR-10
(The closest training example is shown alongside samples to demonstrate that the model has not simply memorized the training set.)
Deep Convolutional GAN (DCGAN)
Deep Convolutional GAN (DCGAN): Architecture
Generator: z is projected and reshaped to a small feature map, then upsampled by four 5×5 deconv (transposed convolution) layers, each with stride 2.
Discriminator: the mirror image, four 5×5 conv layers, each with stride 2.
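The spatial sizes of these stacks follow from standard convolution arithmetic. A small sketch, assuming PyTorch-style padding 2 and output padding 1 for the 5×5 stride-2 layers, reproducing the paper's 4×4 → 64×64 generator and its mirrored discriminator:

```python
# Output-size bookkeeping for the DCGAN stacks (5x5 kernels, stride 2).
def conv_out(n, k=5, s=2, p=2):
    # standard convolution: floor((n + 2p - k) / s) + 1
    return (n + 2 * p - k) // s + 1

def deconv_out(n, k=5, s=2, p=2, out_p=1):
    # transposed ("deconv") convolution: (n - 1)*s - 2p + k + out_p
    return (n - 1) * s - 2 * p + k + out_p

# Generator: project z to a 4x4 map, then four stride-2 deconvs -> 64x64.
n = 4
for _ in range(4):
    n = deconv_out(n)
assert n == 64

# Discriminator mirrors it: four stride-2 convs take 64x64 back to 4x4.
m = 64
for _ in range(4):
    m = conv_out(m)
assert m == 4
```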
Deep Convolutional GAN (DCGAN): Interpretability
[Figure: averaging latent vectors. Decoding the mean (z1 + z2 + z3)/3 of three latents z1, z2, z3 yields an image y that blends the visual attributes of the three originals.]
Generated images
“Interpolation between a series of 9 random points in Z show that the space learned has smooth transitions, with every image in the space plausibly looking like a bedroom.” -- Fig. 4 in the paper
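Such interpolations are produced by decoding points on a straight line between two latent vectors. A minimal sketch (G itself is omitted; any trained generator would decode each row of `path`):

```python
import numpy as np

rng = np.random.default_rng(0)

def interpolate(z_a, z_b, steps=9):
    # evenly spaced points on the segment from z_a to z_b in latent space
    ts = np.linspace(0.0, 1.0, steps)
    return np.stack([(1.0 - t) * z_a + t * z_b for t in ts])

# two random latent codes; decoding each row with a trained generator G
# would give the smooth image transitions shown in the figure
z_a, z_b = rng.normal(size=100), rng.normal(size=100)
path = interpolate(z_a, z_b)
assert path.shape == (9, 100)
```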
Wasserstein GAN
Limitation of GANs: Unstable Training
Limitation of GANs: Mode Collapse
Mode collapse: the generator maps many different z to only a few modes of the data distribution, so samples look realistic but lack diversity.
Wasserstein GAN
GAN: z → Generator → G(z); real x or G(z) → Discriminator → D(x) = p(real), a probability.
WGAN: z → Generator → G(z); real x or G(z) → Critic → f(x) = score(real), an unbounded score the critic trains to maximize on real samples relative to generated ones.
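The critic maximizes E[f(x)] − E[f(G(z))], an estimate of the Wasserstein-1 distance via the Kantorovich-Rubinstein duality, with the 1-Lipschitz constraint enforced in the original paper by clipping weights to [−c, c] (c = 0.01). A minimal numpy sketch:

```python
import numpy as np

def critic_loss(f_real, f_fake):
    # the critic maximizes E[f(x)] - E[f(G(z))]; we minimize the negative
    return -(np.mean(f_real) - np.mean(f_fake))

def clip_weights(params, c=0.01):
    # original WGAN: crude Lipschitz enforcement by clipping every weight
    return [np.clip(w, -c, c) for w in params]

f_real = np.array([2.0, 1.5, 1.8])    # critic scores on real samples
f_fake = np.array([-1.0, -0.5, 0.2])  # critic scores on generated samples
assert critic_loss(f_real, f_fake) < 0  # well-separated scores: negative loss

(w_clipped,) = clip_weights([np.array([0.5, -0.3, 0.004])])
assert np.abs(w_clipped).max() <= 0.01
```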
Wasserstein GAN: Results
The critic's loss tracks the Wasserstein estimate, which decreases smoothly during training and correlates with sample quality, giving a usable training curve that the original GAN loss lacks.
Wasserstein GAN: Limitations
Weight clipping is a crude way to enforce the Lipschitz constraint: depending on the clipping threshold, critic gradients either explode or vanish as they propagate through the layers.
Wasserstein GAN with Gradient Penalty (WGAN-GP)
[Figure: WGAN with clipping vs. WGAN-GP. Clipping drives critic weights to the clip boundary and distorts the learned value surface; the gradient penalty keeps critic gradient norms near 1 instead.]
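WGAN-GP replaces clipping with a penalty λ·E[(‖∇x̂ f(x̂)‖₂ − 1)²] evaluated at random interpolates x̂ = ε·x + (1 − ε)·x̃, with λ = 10 in the paper. A sketch using a linear critic f(x) = w·x, whose input gradient is simply w (a real critic would obtain it from autograd):

```python
import numpy as np

rng = np.random.default_rng(0)

def gradient_penalty(w, x_real, x_fake, lam=10.0):
    # random interpolates x_hat = eps*x + (1-eps)*x_tilde, eps ~ U[0,1]
    eps = rng.uniform(size=(x_real.shape[0], 1))
    x_hat = eps * x_real + (1.0 - eps) * x_fake
    # for a linear critic f(x) = w.x, the input gradient at every x_hat is w;
    # a real critic would obtain these gradients from autograd
    grads = np.broadcast_to(w, x_hat.shape)
    norms = np.linalg.norm(grads, axis=1)
    return lam * np.mean((norms - 1.0) ** 2)

x_real = rng.normal(3.0, 1.0, size=(8, 2))
x_fake = rng.normal(0.0, 1.0, size=(8, 2))

# a unit-slope critic pays no penalty; steeper critics are penalized
assert np.isclose(gradient_penalty(np.array([1.0, 0.0]), x_real, x_fake), 0.0)
assert gradient_penalty(np.array([3.0, 0.0]), x_real, x_fake) > 0.0
```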
GANs for Image-to-Image Translation
Image Translation
Pix2pix
Pix2pix learns a mapping between paired domains: each training example pairs an input x ∈ X with its corresponding output y ∈ Y (e.g., an edge map with the photo it was drawn from).
Pix2pix: Objective Function
D tries to maximize this term by assigning high scores (≈1) to real images y.
D tries to maximize this term by assigning low scores (≈0) to images generated from x.
G tries to minimize this term by fooling D into assigning high scores (≈1) to images generated from x.
G tries to minimize this term by producing an image as similar as possible to the paired ground truth y.
Adversarial Loss
Reconstruction Loss
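Putting the two terms together, the pix2pix generator minimizes the adversarial loss plus λ times an L1 reconstruction loss (λ = 100 in the paper). A minimal numpy sketch (array shapes are illustrative):

```python
import numpy as np

def pix2pix_g_loss(d_fake, y_hat, y, lam=100.0, eps=1e-8):
    # adversarial term: fool D into assigning high scores to G(x)
    adv = -np.mean(np.log(d_fake + eps))
    # reconstruction term: stay close to the paired ground truth y (L1)
    l1 = np.mean(np.abs(y_hat - y))
    return adv + lam * l1

y = np.ones((4, 4))
perfect = pix2pix_g_loss(np.array([0.99]), y, y)       # reconstruction exact
off = pix2pix_g_loss(np.array([0.99]), y + 0.5, y)     # reconstruction off by 0.5
assert off > perfect   # deviating from the pair is heavily penalized
```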
Pix2pix: Implementation Details
Pix2pix: Examples
Colorization
Apps
Aerial photos to/from Google Map
Pix2pix: Summary
CycleGAN
[Diagram: unpaired translation between Domain X (horse) and Domain Y (zebra).
Forward cycle: x → Gx→y → fake y; Dy scores real vs. fake in Y (adversarial loss); Gy→x maps the fake y back, and the reconstruction is compared with x (cycle-consistency loss).
Backward cycle: y → Gy→x → fake x; Dx scores real vs. fake in X (adversarial loss); Gx→y maps the fake x back, and the reconstruction is compared with y (cycle-consistency loss).
Real images come from the training set; fake images are generated by the model.]
CycleGAN: Objective Function
Dy tries to maximize this by maximizing scores (≈1) for real images y.
Dy tries to maximize this by minimizing scores (≈0) for generated images from x.
Gx→y tries to minimize this by fooling Dy to assign high score (≈1) for generated images from x.
Both generators try to minimize this by reconstructing an image as close as possible to the original image x.
Adversarial Loss (same as original GAN)
Cycle-Consistency Loss
This covers x → y → x. A mirrored version handles y → x → y, and the overall loss is the sum of both.
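The cycle-consistency term, summed over both directions and weighted by λ (= 10 in the paper), can be sketched as follows (the stand-in generators are illustrative):

```python
import numpy as np

def cycle_loss(x, y, G_xy, G_yx, lam=10.0):
    # forward cycle:  x -> G_xy(x) -> G_yx(G_xy(x)) should reconstruct x
    # backward cycle: y -> G_yx(y) -> G_xy(G_yx(y)) should reconstruct y
    fwd = np.mean(np.abs(G_yx(G_xy(x)) - x))
    bwd = np.mean(np.abs(G_xy(G_yx(y)) - y))
    return lam * (fwd + bwd)

# stand-in generators: an exactly invertible pair gives zero cycle loss
G_xy = lambda t: t + 1.0
G_yx = lambda t: t - 1.0
x = np.arange(4.0); y = np.arange(4.0) + 1.0
assert np.isclose(cycle_loss(x, y, G_xy, G_yx), 0.0)
```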
CycleGAN: Implementation Details
CycleGAN: Examples
Smart-phone photos → DSLR photos
Monet → Photos
Photo → Monet, Van Gogh, Cezanne, Ukiyo-e
CycleGAN: Summary
DiscoGAN
DiscoGAN (Kim et al., 2017) is concurrent work with CycleGAN: it learns cross-domain relations from unpaired data using the same idea of reconstruction losses in both directions.
DiscoGAN: Implementation Details
DiscoGAN: Examples
Chair to Car
Car to Face
Handbag to Shoes
Shoes to Handbag