1 of 15

Denoising diffusion probabilistic models

AIWS Yejin Lee


2 of 15

Contents

  • Introduction to diffusion models
  • Concept of denoising diffusion probabilistic models (DDPM)
    • Forward process
    • Reverse process
    • Deriving the training loss
  • Use case


3 of 15

Diffusion models

  • A powerful class of generative models
  • Images generated with GANs vs. diffusion models (Dhariwal & Nichol, 2021)

[Figure: sample images from GANs vs. diffusion models vs. the training set (Dhariwal & Nichol, 2021)]

4 of 15

Diffusion models

[Figure: forward diffusion process (data → noise) and reverse denoising process (noise → data)]

  • Denoising diffusion probabilistic model (DDPM)
  • Forward diffusion process: gradually adds noise to the input
  • Reverse denoising process: learns to generate data by denoising

5 of 15

Forward diffusion process

  • The formal definition of the forward process in T steps (a code sketch follows the figure below):

    q(x_{1:T} \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1}), \qquad q(x_t \mid x_{t-1}) = \mathcal{N}\big(x_t;\ \sqrt{1 - \beta_t}\, x_{t-1},\ \beta_t\,\mathbf{I}\big)

[Figure: Markov chain x_0 → x_1 → x_2 → … → x_T illustrating the forward diffusion process]
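A minimal NumPy sketch of the forward process under these definitions. It assumes a linear beta schedule (Ho et al. use T = 1000 with these endpoints) and a data sample flattened to a vector; helper names such as beta_schedule and forward_step are illustrative, not from the paper.

import numpy as np

def beta_schedule(T, beta_1=1e-4, beta_T=0.02):
    # Linear variance schedule beta_1, ..., beta_T (a common choice in DDPM).
    return np.linspace(beta_1, beta_T, T)

def forward_step(x_prev, beta_t, rng):
    # One step of q(x_t | x_{t-1}) = N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I).
    noise = rng.standard_normal(x_prev.shape)
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * noise

rng = np.random.default_rng(0)
betas = beta_schedule(T=1000)
x = rng.standard_normal(64)      # stand-in for a data sample x_0
for beta_t in betas:             # x_0 -> x_1 -> ... -> x_T
    x = forward_step(x, beta_t, rng)
# After T steps, x is approximately distributed as N(0, I).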

6 of 15

Forward diffusion process

  • Diffusion kernel: x_t can be sampled from x_0 in a single step,

    q(x_t \mid x_0) = \mathcal{N}\big(x_t;\ \sqrt{\bar{\alpha}_t}\, x_0,\ (1 - \bar{\alpha}_t)\,\mathbf{I}\big),

    where \alpha_t = 1 - \beta_t and \bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s (a code sketch follows the figure below).

[Figure: Markov chain x_0 → x_1 → … → x_T; the diffusion kernel jumps directly from x_0 to x_t]
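A short sketch of sampling with the diffusion kernel, reusing the linear schedule from the previous slide; the function name sample_xt_given_x0 is illustrative.

import numpy as np

def sample_xt_given_x0(x0, t, betas, rng):
    # Diffusion kernel: q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I),
    # with alpha_t = 1 - beta_t and alpha_bar_t = prod_{s=1..t} alpha_s.
    alphas = 1.0 - betas
    alpha_bar_t = np.prod(alphas[:t])
    eps = rng.standard_normal(x0.shape)      # eps ~ N(0, I)
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)
x0 = rng.standard_normal(64)
x_500 = sample_xt_given_x0(x0, t=500, betas=betas, rng=rng)   # no 500-step loop needed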

7 of 15

Reverse denoising process

  • With a small enough step size, each forward step is Gaussian
    • The reversal of the forward process is then also Gaussian (Feller, 1949)

  • The true reverse distribution q(x_{t-1} | x_t) is intractable

  → Approximate it with a learned Gaussian p_\theta(x_{t-1} \mid x_t)


8 of 15

Reverse denoising process

  • The formal definition of the reverse process in T steps (a sampling sketch follows the figure below):

    p_\theta(x_{0:T}) = p(x_T) \prod_{t=1}^{T} p_\theta(x_{t-1} \mid x_t), \qquad p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\big(x_{t-1};\ \mu_\theta(x_t, t),\ \sigma_t^2\,\mathbf{I}\big),

    where p(x_T) = \mathcal{N}(x_T;\ \mathbf{0},\ \mathbf{I}).

[Figure: Markov chain x_T → … → x_1 → x_0 illustrating the reverse denoising process]
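A minimal sketch of ancestral sampling from the reverse process. The argument mean_model is a placeholder for the learned mean network \mu_\theta(x_t, t) (how it is parameterized comes later in the deck), and sigma_t^2 = beta_t is one common choice from the paper.

import numpy as np

def reverse_sample(mean_model, betas, shape, rng):
    # Ancestral sampling: start at x_T ~ N(0, I), then repeatedly draw
    # x_{t-1} ~ N(mu_theta(x_t, t), sigma_t^2 * I) down to x_0.
    T = len(betas)
    x = rng.standard_normal(shape)                   # x_T ~ N(0, I)
    for t in range(T, 0, -1):
        mu = mean_model(x, t)                        # learned mean mu_theta(x_t, t)
        sigma = np.sqrt(betas[t - 1])                # one common choice: sigma_t^2 = beta_t
        noise = rng.standard_normal(shape) if t > 1 else 0.0   # no noise on the last step
        x = mu + sigma * noise
    return x                                         # approximate sample from p_theta(x_0)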

9 of 15

Learning denoising model

  • Variational upper bound on the negative log-likelihood:

    \mathbb{E}_q[-\log p_\theta(x_0)] \le \mathbb{E}_q\!\left[-\log \frac{p_\theta(x_{0:T})}{q(x_{1:T} \mid x_0)}\right] =: L

  • Sohl-Dickstein et al. ICML 2015 and Ho et al. NeurIPS 2020 show that the loss decomposes as:

    L = \mathbb{E}_q\!\Big[\underbrace{D_{\mathrm{KL}}\big(q(x_T \mid x_0)\,\|\,p(x_T)\big)}_{L_T} + \sum_{t>1} \underbrace{D_{\mathrm{KL}}\big(q(x_{t-1} \mid x_t, x_0)\,\|\,p_\theta(x_{t-1} \mid x_t)\big)}_{L_{t-1}} \underbrace{-\log p_\theta(x_0 \mid x_1)}_{L_0}\Big]

  • Both q(x_{t-1} \mid x_t, x_0) and p_\theta(x_{t-1} \mid x_t) are Gaussian distributions
    • The KL divergence therefore has a simple closed form:

    L_{t-1} = \mathbb{E}_q\!\left[\frac{1}{2\sigma_t^2}\,\big\|\tilde{\mu}_t(x_t, x_0) - \mu_\theta(x_t, t)\big\|^2\right] + C,

    where \tilde{\mu}_t(x_t, x_0) is the mean of the forward-process posterior q(x_{t-1} \mid x_t, x_0) and C is a constant that does not depend on \theta.
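To make the closed form concrete, here is a small NumPy check (not from the slides) that the general KL between two diagonal Gaussians with a shared variance reduces to the mean term \|\mu_q - \mu_p\|^2 / (2\sigma^2):

import numpy as np

rng = np.random.default_rng(0)
mu_q, mu_p = rng.standard_normal(4), rng.standard_normal(4)
s2 = 0.1 * np.ones(4)    # shared diagonal variance sigma_t^2

# General KL between diagonal Gaussians N(mu_q, diag(s2)) and N(mu_p, diag(s2)):
# the log-det and trace terms cancel when the variances are equal.
kl_general = 0.5 * np.sum(np.log(s2 / s2) + (s2 + (mu_q - mu_p) ** 2) / s2 - 1.0)

# Simple form used for L_{t-1}: ||mu_q - mu_p||^2 / (2 * sigma_t^2)
kl_simple = np.sum((mu_q - mu_p) ** 2) / (2.0 * s2[0])

print(np.isclose(kl_general, kl_simple))   # True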


10 of 15

Parameterizing L_t for the training loss

  • Forward process posteriors, which are tractable when conditioned on x_0:

    q(x_{t-1} \mid x_t, x_0) = \mathcal{N}\big(x_{t-1};\ \tilde{\mu}_t(x_t, x_0),\ \tilde{\beta}_t\,\mathbf{I}\big),
    where \tilde{\mu}_t(x_t, x_0) = \frac{\sqrt{\bar{\alpha}_{t-1}}\,\beta_t}{1 - \bar{\alpha}_t}\, x_0 + \frac{\sqrt{\alpha_t}\,(1 - \bar{\alpha}_{t-1})}{1 - \bar{\alpha}_t}\, x_t and \tilde{\beta}_t = \frac{1 - \bar{\alpha}_{t-1}}{1 - \bar{\alpha}_t}\,\beta_t.

  • Recall that x_t(x_0, \epsilon) = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\,\epsilon with \epsilon \sim \mathcal{N}(\mathbf{0}, \mathbf{I}).

  • Ho et al. (2020) propose to represent the mean of the denoising model using a noise-prediction network \epsilon_\theta(x_t, t):

    \mu_\theta(x_t, t) = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{\beta_t}{\sqrt{1 - \bar{\alpha}_t}}\,\epsilon_\theta(x_t, t)\right)

  • With this parameterization, L_{t-1} becomes a weighted noise-prediction loss (a code sketch follows below):

    L_{t-1} = \mathbb{E}_{x_0, \epsilon}\left[\frac{\beta_t^2}{2\sigma_t^2\,\alpha_t\,(1 - \bar{\alpha}_t)}\,\big\|\epsilon - \epsilon_\theta\big(x_t(x_0, \epsilon),\ t\big)\big\|^2\right]
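A sketch of the noise-prediction parameterization of the mean, assuming eps_model is a trained noise-prediction network (the name is illustrative):

import numpy as np

def mu_theta_from_eps(eps_model, x_t, t, betas):
    # Mean of p_theta(x_{t-1} | x_t) under the noise-prediction parameterization:
    # mu_theta = (x_t - beta_t / sqrt(1 - alpha_bar_t) * eps_theta(x_t, t)) / sqrt(alpha_t)
    alphas = 1.0 - betas
    alpha_t = alphas[t - 1]
    alpha_bar_t = np.prod(alphas[:t])
    eps_pred = eps_model(x_t, t)                 # predicted noise eps_theta(x_t, t)
    return (x_t - betas[t - 1] / np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_t)

Plugging mean_model = lambda x, t: mu_theta_from_eps(eps_model, x, t, betas) into the earlier reverse_sample sketch gives DDPM-style ancestral sampling.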


11 of 15

Training objective weighting

  • DDPM (Ho et al., 2020) simply sets the weighting coefficient \frac{\beta_t^2}{2\sigma_t^2\,\alpha_t\,(1 - \bar{\alpha}_t)} to 1 to improve sample quality.
  • So, they propose to use the final (simplified) loss (a training sketch follows below):

    L_{\mathrm{simple}}(\theta) = \mathbb{E}_{t,\, x_0,\, \epsilon}\left[\big\|\epsilon - \epsilon_\theta\big(\sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\,\epsilon,\ t\big)\big\|^2\right]
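A minimal sketch of estimating L_simple on a batch, again assuming an illustrative eps_model; a real implementation would use a deep network (e.g., a U-Net) and backpropagate through this loss.

import numpy as np

def l_simple(eps_model, x0_batch, betas, rng):
    # Monte Carlo estimate of L_simple: for each example, sample a timestep t and
    # noise eps, form x_t with the diffusion kernel, and score the predicted noise.
    T = len(betas)
    alpha_bars = np.cumprod(1.0 - betas)
    total = 0.0
    for x0 in x0_batch:
        t = rng.integers(1, T + 1)                         # t ~ Uniform{1, ..., T}
        eps = rng.standard_normal(x0.shape)                # eps ~ N(0, I)
        x_t = np.sqrt(alpha_bars[t - 1]) * x0 + np.sqrt(1.0 - alpha_bars[t - 1]) * eps
        total += np.mean((eps - eps_model(x_t, t)) ** 2)   # ||eps - eps_theta(x_t, t)||^2, averaged over dims
    return total / len(x0_batch)
# In practice eps_theta's parameters are updated by minimizing this loss with SGD/Adam.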


12 of 15

Result

  • Better sample quality than before (lower FID)
  • Shows the potential of diffusion models


13 of 15

Impressive diffusion model: DALL·E 2

  • DALL·E 2 (DALL·E, openai.com; Ramesh et al., 2022)
  • Text-to-image generation
  • AI painting
  • CLIP + diffusion models

[Figure: DALL·E 2 images generated from the text prompt "a teddy bear on a skateboard in times square"]

14 of 15

Thank you


15 of 15

Reference

  • Dhariwal, P., & Nichol, A. (2021). Diffusion models beat GANs on image synthesis. Advances in Neural Information Processing Systems, 34, 8780-8794. https://doi.org/10.48550/arXiv.2105.05233
  • Feller, W. (1949). On the theory of stochastic processes, with particular reference to applications. In Proceedings of the [First] Berkeley Symposium on Mathematical Statistics and Probability. The Regents of the University of California.
  • Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33, 6840-6851. https://doi.org/10.48550/arXiv.2006.11239
  • Ramesh, A., Dhariwal, P., Nichol, A., Chu, C., & Chen, M. (2022). Hierarchical text-conditional image generation with CLIP latents. arXiv preprint arXiv:2204.06125. https://doi.org/10.48550/arXiv.2204.06125
  • Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., & Ganguli, S. (2015). Deep unsupervised learning using nonequilibrium thermodynamics. In International Conference on Machine Learning (pp. 2256-2265). PMLR. https://doi.org/10.48550/arXiv.1503.03585
  • Song, Y., Sohl-Dickstein, J., Kingma, D. P., Kumar, A., Ermon, S., & Poole, B. (2020). Score-based generative modeling through stochastic differential equations. arXiv preprint arXiv:2011.13456. https://doi.org/10.48550/arXiv.2011.13456
