Page 1 of 34

Sketch-Guided Text-to-Image Generation

Final Report - Jul 27, by Elliott Wu

Mentor: Hyungjoo Cho

Advisors: Yongyi Lu, Yu-Wing Tai, Chi-Keung Tang

Page 2 of 34

sText2Image

male, long face, smile with mouth closed, double eyelids, five o'clock shadow…

Page 3 of 34

sText2Image

male, long face, smile with mouth closed, double eyelids, five o'clock shadow…

Page 4 of 34

sText2Image

male, long face, smile with mouth closed, double eyelids, five o'clock shadow…

TEXT

SKETCH

IMAGE

Page 5 of 34

Text2Image

Generative Adversarial Text-to-Image Synthesis (Reed et al., ICML 2016)
StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks (Zhang et al., arXiv)
…

* retrieved from StackGAN

Page 6 of 34

Sketch?

* Jun-Yan Zhu et al., Generative Visual Manipulation on the Natural Image Manifold, ECCV 2016

Page 7 of 34

Sketch?

* collected from volunteers

Page 8 of 34

Sketch?

Page 9 of 34

Sketch?

Page 10 of 34

Joint Representation

male, long face, smile with mouth closed, double eyelids, five o'clock shadow…

TEXT

SKETCH

IMAGE

male, long face, smile with mouth closed, double eyelids, five o'clock shadow…

Joint Space

TEXT

SKETCH

IMAGE

Page 11 of 34

Network Architecture - Training

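[Figure: training architecture for the generator G(z, t) and the discriminator]

Generator: feature maps 4x8, 8x16, 16x32, 32x64

Discriminator: feature maps 32x64, 16x32, 8x16, 4x8; outputs real vs. fake/wrong

Other labels in the figure: 100, linear, replicate, 4x8; 128, 256, 512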
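The spatial sizes on this slide (4x8, 8x16, 16x32, 32x64) double at every stage, and the "replicate" label suggests that a vector is tiled over a 4x8 grid. The sketch below only reproduces that bookkeeping in plain Python. It is not the project's network code: the function names are hypothetical, the doubling factor is inferred from the listed sizes, and reading "4x8" as height x width is an assumption.

```python
def upsample_schedule(start=(4, 8), stages=3, factor=2):
    """List the (height, width) sizes reached by scaling `start` by `factor` per stage."""
    sizes = [start]
    for _ in range(stages):
        height, width = sizes[-1]
        sizes.append((height * factor, width * factor))
    return sizes


def replicate_spatially(vector, height, width):
    """Tile a 1-D vector over a height x width grid (nested lists)."""
    return [[list(vector) for _ in range(width)] for _ in range(height)]


# Generator side: sizes grow from 4x8 to 32x64.
generator_sizes = upsample_schedule()
# Discriminator side: the same sizes in reverse order.
discriminator_sizes = list(reversed(generator_sizes))

# A toy 2-element vector replicated over the 4x8 grid.
grid = replicate_spatially([0.1, 0.2], 4, 8)
```

In a real model the replicated grid would typically be concatenated with the feature map along the channel axis; that step is left out because the slide does not state the channel layout.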

Page 12 of 34

Network Architecture - Testing

[Figure: testing pipeline]

Input: text, sketch

Components: Generator G(z, t), Discriminator

Losses: L_contextual, L_perceptual (backprop)

Output: image
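One common reading of a test-time diagram like this is that the trained generator stays fixed and the latent input z is adjusted by backpropagation until the generated result agrees with the known part of the input. That reading is an assumption; the slide only shows the labels. The toy sketch below illustrates the idea with a linear stand-in generator and a masked squared-error contextual loss. The perceptual term is left out, and every name here (`generate`, `search_latent`, the matrix `W`) is hypothetical.

```python
def generate(W, z):
    """Toy linear 'generator': x = W z (stand-in for a trained network)."""
    return [sum(w * zj for w, zj in zip(row, z)) for row in W]


def contextual_loss(x, target, mask):
    """Squared error counted only where mask is 1 (the known part)."""
    return sum(m * (xi - ti) ** 2 for xi, ti, m in zip(x, target, mask))


def contextual_grad(W, z, target, mask):
    """Analytic gradient of the masked squared error with respect to z."""
    x = generate(W, z)
    residual = [2.0 * m * (xi - ti) for xi, ti, m in zip(x, target, mask)]
    return [sum(W[i][j] * residual[i] for i in range(len(W)))
            for j in range(len(z))]


def search_latent(W, target, mask, z0, lr=0.1, steps=200):
    """Plain gradient descent on z; the generator weights W never change."""
    z = list(z0)
    for _ in range(steps):
        grad = contextual_grad(W, z, target, mask)
        z = [zj - lr * gj for zj, gj in zip(z, grad)]
    return z


# Entries 0 and 1 play the role of the known sketch part; entry 2 is unknown.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [0.5, -0.25, 0.0]  # last value is a placeholder and is masked out
mask = [1.0, 1.0, 0.0]

z = search_latent(W, target, mask, z0=[0.0, 0.0])
completed = generate(W, z)  # the masked entry is filled in by the generator
```

Because the mask removes the unknown entry from the loss, the search fits only the known entries and the generator supplies the rest.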

Page 13 of 34

Data Preparation - Image

Face (CelebA): 202k images

Bird (CUB): 11k images

Flower (Oxford)

Page 14 of 34

Data Preparation - Image

40 attributes:

1 : "5_o_Clock_Shadow"

2 : "Big_Lips"

3 : "Big_Nose"

4 : "Chubby"

5 : "Double_Chin"

6 : "Eyeglasses"

7 : "Goatee"

8 : "Heavy_Makeup"

9 : "High_Cheekbones"

10 : "Male"

11 : "Mouth_Slightly_Open"

12 : "Mustache"

...
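The attribute list above can be turned into an attribute vector with a simple multi-hot encoding. The sketch below is a minimal illustration, not the project's code. It covers only the 12 attribute names shown on the slide, since the remaining names are elided in the source. The 0/1 encoding and the helper name `encode_attributes` are assumptions.

```python
# Only the 12 names visible on the slide; the full list has 40 entries.
LISTED_ATTRIBUTES = [
    "5_o_Clock_Shadow", "Big_Lips", "Big_Nose", "Chubby",
    "Double_Chin", "Eyeglasses", "Goatee", "Heavy_Makeup",
    "High_Cheekbones", "Male", "Mouth_Slightly_Open", "Mustache",
]


def encode_attributes(active, vocabulary=LISTED_ATTRIBUTES):
    """Return a 0/1 list with a 1 at the position of every active attribute."""
    active_set = set(active)
    unknown = active_set - set(vocabulary)
    if unknown:
        raise ValueError("unknown attributes: %s" % sorted(unknown))
    return [1 if name in active_set else 0 for name in vocabulary]


# Example: a face described as male with a mustache.
vector = encode_attributes(["Male", "Mustache"])
```

The position of each entry follows the order of the list, so the same vocabulary has to be used for training and testing.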

For both the bird and flower datasets, each image has 10 captions, embedded with char-CNN-RNN (Reed et al., CVPR 2016):

attribute vector OR text embedding

Face (CelebA)

Bird (CUB)

Flower (Oxford)

Page 15 of 34

Data Preparation - Synthesized Sketch

Edge detection:

XDoG (Winnemöller et al., Computers & Graphics 2012)
Photoshop photocopy effect

Simplification (synthesized sketches):

Sketch simplification (Simo-Serra et al., SIGGRAPH 2016)

Image

Edge

Simplified
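To make the edge-detection step more concrete, here is a minimal sketch of the XDoG operator applied to a single row of pixels. It follows the published form of the operator: a sharpened difference of two Gaussian blurs, S = (1 + p) * G_sigma - p * G_(k*sigma), followed by a soft threshold that outputs 1 where S >= epsilon and 1 + tanh(phi * (S - epsilon)) elsewhere. The parameter values, the 1-D simplification, and the border clamping are illustrative assumptions and not the settings used for the sketches on this slide.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel with radius ceil(3 * sigma)."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, sigma):
    """Gaussian blur of a 1-D signal, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


def xdog(signal, sigma=1.0, k=1.6, p=20.0, epsilon=0.5, phi=10.0):
    """Sharpened difference of Gaussians followed by a tanh soft threshold."""
    fine = blur(signal, sigma)
    coarse = blur(signal, k * sigma)
    result = []
    for a, b in zip(fine, coarse):
        sharpened = (1.0 + p) * a - p * b
        if sharpened >= epsilon:
            result.append(1.0)
        else:
            result.append(1.0 + math.tanh(phi * (sharpened - epsilon)))
    return result


# A dark-to-bright step edge: 20 black pixels followed by 20 white pixels.
row = [0.0] * 20 + [1.0] * 20
response = xdog(row)
```

On a real image the same two blurs would be computed in 2-D, and the parameters would be tuned per dataset to get clean line drawings.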

Page 16 of 34

Data Preparation - Freehand Sketch

* collected from volunteers

Page 17 of 34

Experiments - Face

male, long face, smile with mouth closed, double eyelids, five o'clock shadow…

ATTRIBUTES

SKETCH

IMAGE

Page 18 of 34

Experiments - Failures

Page 19 of 34

Experiments - Failures

Page 20 of 34

Page 21 of 34

Experiments - Finally…

Page 22 of 34

Experiments - Face

Attributes Match Sketch

Attributes Mismatch Sketch

Freehand Sketch

Page 23 of 34

Experiments - Match (Mustache)

Page 24 of 34

Experiments - Match (Eyeglasses)

Page 25 of 34

Experiments - Match (Lipstick)

Page 26 of 34

Experiments - Mismatch

Female, Heavy_Makeup, Wearing_Lipstick

Female, Heavy_Makeup, Smiling, Wearing_Lipstick

Page 27 of 34

Experiments - Mismatch

Male, Chubby, Double_Chin, High_Cheekbones, Mouth_Open

Male

Page 28 of 34

Experiments - Mismatch

Female, High_Cheekbones, Smiling, Wearing_Lipstick, No_Eyeglasses

Female, Heavy_Makeup, High_Cheekbones, Pointy_Nose, Smiling, Wearing_Lipstick, No_Eyeglasses

Page 29 of 34

Page 30 of 34

Experiments - Freehand

Page 31 of 34

Experiments - Freehand

Page 32 of 34

Experiments - Failure Cases (Eyeglasses)

Page 33 of 34

Timeline

Before Mar: Ideation

Mar: Submitted to ICCV on Sketch-to-Image

Jul: Extension on Sketch-Guided Text-to-Image

Aug: Run experiments on bird and flower datasets

Sept - Oct: Refine results and paper write-up

Nov: Submit to CVPR
