1 of 12

GANwriting: Content-Conditioned Generation of Styled Handwritten

Word Images

L. Kang, P. Riba, Y. Wang, M. Rusiñol, A. Fornés, M. Villegas

Universitat Autònoma de Barcelona & omni:us, Berlin

arXiv, March 2020

2 of 12

Handwriting Generation

  • generate realistic handwriting images
  • conditioned on text content
  • conditioned on writing style
  • mimic a real writer: render a word with that writer's slant, roundness, stroke width, ...

4 of 12

Handwriting Generation

  • x∈X: image space
  • t ∈ Aˡ: text content space, where A is the alphabet and l is the fixed word length
  • w∈W: writer space, multi-class labels
  • formally:

x’ᵢ ~ H(z ~ p; t, {x₁|wᵢ, x₂|wᵢ, x₃|wᵢ, …, xₖ|wᵢ})
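The conditional sampling above can be sketched as a function signature: the generator H takes prior noise z, a text string t, and a set of K style images from one writer, and returns a word image. This is a minimal stub with hypothetical shapes and a placeholder body (the image size 64×128, the noise dimension, and K are illustrative assumptions, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def H(z, t, style_images):
    """Hypothetical generator stub: maps noise z, text content t, and K
    style images of one writer to a generated word image.  The body is a
    placeholder; the real H is the trained encoder-decoder network."""
    h, w = style_images[0].shape
    return rng.random((h, w))  # stands in for the synthesized image x'_i

K = 15                                              # illustrative number of style samples
style = [rng.random((64, 128)) for _ in range(K)]   # {x_1|w_i, ..., x_K|w_i}
z = rng.standard_normal(128)                        # z ~ p (prior noise)
x_gen = H(z, "hello", style)                        # x'_i ~ H(z ~ p; t, {x_k|w_i})
```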

5 of 12

Architecture

6 of 12

Encoders-Decoder Generator

  • style encoder S
    • stacked image-wise feature maps
    • add random noise
  • content encoders C
    • character embedding
    • stacked character-wise feature vectors
    • word-level feature vector
  • concatenate and pad the two feature maps to matching shapes
  • generator G

7 of 12

Loss Functions

  • discriminative loss: standard GAN loss

  • style loss: standard classification loss

  • text content loss: next slide
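A scalar sketch of how the three objectives combine. The discriminator outputs, writer logits, and the unweighted sum are all illustrative assumptions; the paper may weight or alternate the terms differently:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Standard GAN loss from hypothetical discriminator scores on real/fake images.
d_real, d_fake = 0.9, 0.2
l_disc = -(np.log(d_real) + np.log(1.0 - d_fake))

# Style loss: cross-entropy of a writer classifier (3 writers, true id 0).
writer_logits, writer_id = np.array([2.0, 0.1, -1.0]), 0
l_style = -np.log(softmax(writer_logits)[writer_id])

# Text content loss: placeholder scalar (detailed on the next slide).
l_content = 0.42

loss = l_disc + l_style + l_content   # assumed unweighted sum
```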

8 of 12

Loss Functions: Text Content Loss

  • a CNN-RNN encoder with an attention-based RNN decoder recognizes the text in the generated image
  • character-wise KL-divergence loss against the ground-truth transcription
  • the whole network is trained end-to-end on mini-batches
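A character-wise KL-divergence against a one-hot ground-truth transcription reduces to per-character cross-entropy (the one-hot entropy term is zero). This is a minimal numpy sketch with an assumed alphabet size and index encoding, not the paper's exact recognizer head:

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def char_kl_loss(logits, target_ids, eps=1e-9):
    """KL(onehot || predicted) summed over the decoded characters.
    With one-hot targets this equals the per-character cross-entropy."""
    probs = softmax(logits)                       # (L, |A|) predicted distributions
    onehot = np.eye(probs.shape[1])[target_ids]   # (L, |A|) ground-truth characters
    return -(onehot * np.log(probs + eps)).sum()

rng = np.random.default_rng(2)
logits = rng.standard_normal((5, 27))             # 5 decoded steps, alphabet of 27
loss = char_kl_loss(logits, [7, 4, 11, 11, 14])   # indices for "hello" (a-z + blank)
```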

9 of 12

Experiments

  • training set: IAM corpus, 62,857 handwritten word snippets from 500 writers
  • source text: 22,500 English words from the Brown corpus

10 of 12

Comparison with Image Translation

11 of 12

Latent Space Interpolation

12 of 12

“Turing Test”