1 of 13

Stochastic Conditional Diffusion Models for Robust Semantic Image Synthesis

Juyeon Ko*, Inho Kong*, Dogyun Park, Hyunwoo J. Kim

Department of Computer Science and Engineering, Korea University

Korea University

MLV Lab

ICML 2024

2 of 13

Semantic Image Synthesis (SIS)

Semantic map y

(Label)

Semantic Image Synthesis with Spatially-Adaptive Normalization, Park et al., CVPR 2019 (Oral)

[Figure: an image and its semantic map, shown as a grid of per-pixel class ids (1: sky, 2: tree, 5: grass). Arrow labels: "approximate", "generate".]

Semantic Image Synthesis (SIS): generate an image from the semantic map.

3 of 13

Motivation

Image credit: https://tech.hindustantimes.com/tech/news/this-photoshop-ai-feature-will-change-the-way-you-edit-photos-know-what-is-generative-fill-71686747616850.html

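[Figure: example with regions labeled car, water, grass, tree, sky]

Applications: photo editing, content creation

  • Train: the model sees clean labels from the dataset
  • Inference: the model sees noisy labels from users
  • This leaves a gap between training and inference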

4 of 13

Stochastic Conditional Diffusion Models (SCDM)

[Figure: two conditioning schemes, each taking a (noisy) label as input]

  • SCDM: stochastic conditioning via Label Diffusion
  • Existing Conditional Diffusion Models

5 of 13

Label Diffusion

  • Discrete diffusion for the labels
  • Noise = [MASK] (absorbing state)

[Figure: diffusion trajectories of a clean label and a noisy label. As diffusion proceeds they become similar, and finally identical.]
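
As a rough illustration of the absorbing-state idea on this slide, the sketch below masks label pixels step by step with NumPy. The `MASK` id, the per-step masking probabilities `betas`, and the function names are assumptions made for this example, not the schedule or code used by SCDM.

```python
import numpy as np

MASK = 0  # hypothetical class id reserved for the absorbing [MASK] state


def forward_step(labels, beta, rng):
    """Move each pixel to MASK with probability beta.

    Pixels that are already MASK stay MASK, which is what makes the
    state absorbing.
    """
    absorb = rng.random(labels.shape) < beta
    return np.where(absorb, MASK, labels)


def forward_trajectory(labels, betas, rng):
    """Return the list [y_0, y_1, ..., y_T] of progressively masked labels."""
    trajectory = [labels]
    for beta in betas:
        trajectory.append(forward_step(trajectory[-1], beta, rng))
    return trajectory


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.array([[1, 1, 2], [5, 5, 5]])  # 1: sky, 2: tree, 5: grass
    noisy = np.array([[1, 2, 2], [5, 5, 1]])  # same map with two label errors
    betas = [0.3, 0.5, 1.0]                   # illustrative values only

    clean_traj = forward_trajectory(clean, betas, rng)
    noisy_traj = forward_trajectory(noisy, betas, rng)
    for step, (c, n) in enumerate(zip(clean_traj, noisy_traj)):
        agreement = (c == n).mean()
        print(f"step {step}: fraction of matching pixels = {agreement:.2f}")
```

Because the last masking probability is 1.0, every trajectory ends in the all-[MASK] map, so the clean and the noisy label are identical at the final step regardless of how they differed at the start.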

6 of 13

Label Diffusion

  • The label trajectories provided to the model during the generation process are similar for the clean label and the noisy label.
  • The generated image is close to the clean image.

7 of 13

Forward process and Generation process

  • Forward process: continuous diffusion paired with discrete Label Diffusion
  • Generation process: continuous reverse process paired with discrete forward Label Diffusion

8 of 13

Class-wise Noise Schedule

Slowly diffuse small/rare classes
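
One hypothetical way to realize such a schedule is to let each class keep its label with a probability that decays more slowly when the class covers less area. The exponent rule, the `strength` parameter, and the function names below are assumptions for illustration only; the slides do not specify the actual schedule.

```python
import numpy as np


def class_area_fractions(label_map, num_classes):
    """Fraction of pixels covered by each class id in 0..num_classes-1."""
    counts = np.bincount(label_map.ravel(), minlength=num_classes)
    return counts / counts.sum()


def classwise_keep_prob(t, T, area_frac, strength=4.0):
    """Probability that a pixel of each class is still unmasked at step t.

    Assumed rule: keep = 1 - (t / T) ** exponent, where the exponent grows
    as the class gets rarer. A larger exponent keeps the masking probability
    low for longer, so small/rare classes diffuse slowly. All classes are
    fully kept at t = 0 and fully masked at t = T.
    """
    relative_area = area_frac / area_frac.max()
    exponent = 1.0 + strength * (1.0 - relative_area)
    return 1.0 - (t / T) ** exponent


if __name__ == "__main__":
    label_map = np.array([[1, 1, 1, 2],
                          [5, 5, 5, 5]])  # class 2 is the rare one
    frac = class_area_fractions(label_map, num_classes=6)
    for t in (0, 5, 10):
        keep = classwise_keep_prob(t, T=10, area_frac=frac)
        print(t, {c: round(float(keep[c]), 3) for c in (1, 2, 5)})
```

In this toy map the rare class 2 retains its label longer than classes 1 and 5 at every intermediate step, which matches the intent stated on the slide.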

9 of 13

Noisy SIS Benchmark

  • We introduce a new benchmark to assess generation performance under noisy conditions, simulating human errors that can occur during real-world applications.
  • Three setups:

1. [DS] downsampled semantic maps

2. [Edge] masking the edges of instances

3. [Random] randomly adding the unlabeled class to the semantic maps (10%)
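
The three setups can be approximated on an integer label map as in the sketch below. The `UNLABELED` id, the downsampling factor, the use of class boundaries in place of instance edges, and the reading of 10% as a fraction of pixels are all assumptions; the benchmark's exact protocol is not given on this slide.

```python
import numpy as np

UNLABELED = 0  # hypothetical id for the unlabeled class


def downsample_noise(label_map, factor=2):
    """[DS] Nearest-neighbour downsample, then upsample back to the input size."""
    small = label_map[::factor, ::factor]
    up = np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)
    return up[:label_map.shape[0], :label_map.shape[1]]


def edge_noise(label_map):
    """[Edge] Set pixels that touch a differently labeled neighbour to UNLABELED."""
    edge = np.zeros(label_map.shape, dtype=bool)
    diff_v = label_map[1:, :] != label_map[:-1, :]  # vertical neighbours
    diff_h = label_map[:, 1:] != label_map[:, :-1]  # horizontal neighbours
    edge[1:, :] |= diff_v
    edge[:-1, :] |= diff_v
    edge[:, 1:] |= diff_h
    edge[:, :-1] |= diff_h
    return np.where(edge, UNLABELED, label_map)


def random_noise(label_map, ratio=0.1, rng=None):
    """[Random] Set a random fraction `ratio` of pixels to UNLABELED."""
    rng = np.random.default_rng() if rng is None else rng
    flat = label_map.ravel().copy()
    n = int(round(ratio * flat.size))
    idx = rng.choice(flat.size, size=n, replace=False)
    flat[idx] = UNLABELED
    return flat.reshape(label_map.shape)


if __name__ == "__main__":
    clean = np.full((10, 10), 1)  # top half: class 1
    clean[5:, :] = 5              # bottom half: class 5
    print(downsample_noise(clean, factor=2))
    print(edge_noise(clean))
    print(random_noise(clean, ratio=0.1, rng=np.random.default_rng(0)))
```

Each function returns a new array and leaves the clean map untouched, so the same clean label can be perturbed under all three setups for a side-by-side comparison.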

[Figure: example semantic maps for the [DS], [Edge], and [Random] setups]

10 of 13

Experiments - Noisy SIS

[Figure: qualitative comparison on noisy labels. Panels: Label, OASIS, SAFM, SDM, Ours, LDM]

11 of 13

Analysis – Label Diffusion and Robustness

[Figure: Label, Ours, and Baseline compared under the Clean, DS, Edge, and Random settings]

12 of 13

Conclusion

  • We first address a significant challenge for SIS models in real-world applications: dealing with noisy input from users.

  • We propose a novel conditional diffusion model, SCDM, specifically designed to enhance robustness to noisy labels.

  • We present a new benchmark setting to assess performance under noisy conditions.

  • SCDM improves generation quality through its Label Diffusion and class-wise noise schedule.

13 of 13

Thank You
