SemStyle: Learning to Generate Stylised Image Captions using Unaligned Text
Alexander Mathews *†, Lexing Xie *†, Xuming He ‡
Australian National University *, Data to Decision CRC †, ShanghaiTech University ‡
Goals:
Style: story-like
I stopped short when I saw the train sitting at the station.
Style: descriptive / MSCOCO
A train that stopped at a train station.
Poster: [D3]
a white sheep and birds in a field.
a clock is mounted on the corner of a building.
Images with descriptive captions
She wore a gown the colors of an autumn sunset.
She sat down and picked up her fork.
He cleared his throat and when that got no response, he banged his fist down on the table.
She smiled sheepishly.
……
Text corpus in a distinct style
Evaluated automatically and manually:
A woman walking with an umbrella in the rain.
The woman stepped underneath her umbrella and walked in the rain.
A juicer is poured into a glass of juice.
I'll be in the juicer with a glass of orange juice.
A forest that has a large tree in it.
Forest, tall, and thick trees.
(a)
(b)
(c)
Success cases
Failure case
[Descriptive]
[Story-like]
See Paper #4057 or Poster [D3] for details of:
Code, models and more results:
I had to take a little umbrella to the beach.
Tennis player got balls.