1 of 21

Presented by Jingwei Ma

2 of 21

3 of 21

Related #1 - Textual Inversion

4 of 21

Related #2 - Dreambooth

5 of 21

Related #3 - UniTune

6 of 21

Related #3 - UniTune

7 of 21

Comparison

Textual Inversion

Posture/Action

Text control

Unchanged

No control

Imagic, Dreambooth

UniTune

Textual Inversion

Context/Background

Text control

Unchanged

Affected by text

Imagic

Textual Inversion, Dreambooth

Identity

Text control

Unchanged

Same concept

UniTune

Imagic, Dreambooth

UniTune

8 of 21

Comparison

Satisfy the target prompt while preserving as much content from the input image as possible

9 of 21

Comparison

10 of 21

Method

11 of 21

Overview: 3 stages

12 of 21

Stage 1: Text Embedding Optimization

Objective:

Optimize:
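The equations for this slide did not survive extraction. As a hedged reconstruction, assuming the deck follows the standard denoising objective used by Imagic, the objective and the Stage 1 optimization could be written as below. The symbols (x for the input image, e for the text embedding, e_tgt for the target-text embedding, theta for the model weights, alpha_t for the noise schedule) are assumptions and may differ from the notation on the original slide.

```latex
% Denoising objective (assumed form): predict the noise added to the input image x
\mathcal{L}(x, e, \theta)
  = \mathbb{E}_{t,\epsilon}\!\left[\,\bigl\lVert \epsilon - f_\theta(x_t, t, e) \bigr\rVert_2^2\,\right],
\qquad
x_t = \sqrt{\alpha_t}\,x + \sqrt{1-\alpha_t}\,\epsilon

% Stage 1: model weights \theta frozen, embedding initialized at e_{tgt}
e_{opt} = \arg\min_{e}\; \mathcal{L}(x, e, \theta)
```

Under this reading, only the embedding moves in Stage 1, and it is typically run for a small number of steps so that e_opt stays close to e_tgt.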

13 of 21

After Stage 1

Not exactly the same as the original image

Original image

Generated

14 of 21

Stage 2: Model fine-tuning

Input

Before fine-tuning

After fine-tuning
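To make the split between Stage 1 and Stage 2 concrete, here is a minimal toy sketch. It assumes the same denoising loss is used in both stages: Stage 1 updates only the embedding with the model frozen, and Stage 2 updates only the model with the optimized embedding frozen. The linear "denoiser" f = A x_t + B e, the dimensions, and all variable names are hypothetical stand-ins for illustration, not the real diffusion model.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 8, 4, 256  # toy image dim, embedding dim, number of noise samples

x = rng.normal(size=d)             # the single input "image" (toy vector)
A = 0.1 * rng.normal(size=(d, d))  # toy model weights (theta), image part
B = 0.3 * rng.normal(size=(d, k))  # toy model weights (theta), embedding part
e_tgt = rng.normal(size=k)         # stand-in for the target-text embedding

# Fixed batch of noise levels and noise draws, so gradient descent is deterministic.
alpha = rng.uniform(0.05, 0.95, size=(n, 1))
eps = rng.normal(size=(n, d))
x_t = np.sqrt(alpha) * x + np.sqrt(1.0 - alpha) * eps  # noised copies of x

def residual(A, B, e):
    """eps minus the toy noise prediction A x_t + B e, shape (n, d)."""
    return eps - (x_t @ A.T + B @ e)

def loss(A, B, e):
    """Mean squared noise-prediction error over the fixed batch."""
    return float(np.mean(np.sum(residual(A, B, e) ** 2, axis=1)))

# Stage 1: optimize the embedding only, starting from e_tgt (A and B frozen).
e = e_tgt.copy()
loss_init = loss(A, B, e)
for _ in range(200):
    grad_e = -2.0 * B.T @ residual(A, B, e).mean(axis=0)
    e -= 0.05 * grad_e
e_opt = e
loss_stage1 = loss(A, B, e_opt)

# Stage 2: fine-tune the model weights only (e_opt frozen).
for _ in range(200):
    r = residual(A, B, e_opt)
    grad_A = -2.0 * (r.T @ x_t) / n
    grad_B = -2.0 * np.outer(r.mean(axis=0), e_opt)
    A -= 0.01 * grad_A
    B -= 0.01 * grad_B
loss_stage2 = loss(A, B, e_opt)

print("initial loss:      ", loss_init)
print("after stage 1 loss:", loss_stage1)
print("after stage 2 loss:", loss_stage2)
```

In this toy setting the loss drops after each stage, which mirrors the slide's point: the embedding alone does not fully reproduce the input, and fine-tuning the model around that embedding closes more of the gap.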

15 of 21

Stage 3: Generation

Top row = pretrained, bottom row = fine-tuned
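A hedged sketch of the generation step, assuming it follows the Imagic formulation: the target-text embedding and the optimized embedding are linearly interpolated, and the result conditions the fine-tuned model. The function name `interpolate_embeddings` and the parameter name `eta` are hypothetical, and the short vectors stand in for real text embeddings.

```python
import numpy as np

def interpolate_embeddings(e_tgt, e_opt, eta):
    """Linear interpolation: eta = 0 returns e_opt, eta = 1 returns e_tgt."""
    e_tgt = np.asarray(e_tgt, dtype=float)
    e_opt = np.asarray(e_opt, dtype=float)
    if e_tgt.shape != e_opt.shape:
        raise ValueError("embeddings must have the same shape")
    return eta * e_tgt + (1.0 - eta) * e_opt

# Toy embeddings (assumed values, for illustration only).
e_tgt = np.array([1.0, 0.0, 2.0])
e_opt = np.array([0.0, 1.0, 2.0])

# Sweep eta to move from reconstructing the input toward the target text.
for eta in (0.0, 0.5, 1.0):
    print(eta, interpolate_embeddings(e_tgt, e_opt, eta))
```

Small values of eta stay close to the input image, while larger values apply the edit more strongly; in practice the value is chosen per image and prompt.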

16 of 21

More results

17 of 21

Different prompt, same image

18 of 21

19 of 21

Same prompt, different samples

20 of 21

More examples

21 of 21

Stable Diffusion Implementations