1 of 40

Learning to generate one-sentence biographies from Wikidata

Andrew Chisholm, Will Radford, Ben Hachey

School of Information Technologies

University of Sydney

2 of 40

The Task

Title	Mathias Tuomi
Gender	male
Date of birth	1985-09-03
Occupation	squash player
Citizenship	Finland

Matias Tuomi, (born September 30, 1985 in Espoo) is a professional squash player who represents Finland.

Relations

Text

Learning to generate one-sentence biographies from Wikidata

EACL 2017

Chisholm, Radford, Hachey

3 of 40

The Plan

Motivation
Creating a dataset
Fact-to-text translation models
Evaluating generated summaries

4 of 40

Motivation

5 of 40

Motivation

Why should we care about fact-to-text tasks?

Describing and summarising data
Consistency checking

6 of 40

Motivation » Consistency checking

Barack Hussein Obama II (born August 4, 1961 in Hawaii) is an American politician who served as the 44th President of the United States from 2009 to 2017.

Barack Hussein Obama II (born August 4, 1961 in Kenya) is an American politician who served as the 44th President of the United States from 2009 to 2017.

Relation	Value
Title	Barack Obama
Gender	male
Date of birth	1961-08-04
Place of birth	Hawaii
Occupation	...

7 of 40

Creating a Dataset

8 of 40

Creating a dataset » Sources

Wikipedia

4.5m entity pages
~2b words

Wikidata

20m nodes/entities
>100m edges/relations

9 of 40

Creating a dataset » Constraints

Wikidata

Human entities (1.2m)
Top-15 relations (73% coverage)
At least 5 relations present per entity

Wikipedia

First sentence
10-37 tokens (10th to 90th percentile)

400k train, 50k dev, 50k test
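The constraints above can be sketched as a per-entity filter. This is a minimal illustration, not the authors' pipeline: the function name `keep_example`, the `allowed_relations` argument (standing in for the top-15 relation inventory), and whitespace tokenisation are all assumptions, and the human-entity check is assumed to happen upstream.

```python
from typing import Dict, List

MIN_RELATIONS = 5   # "at least 5 relations present per entity"
MIN_TOKENS = 10     # 10th percentile of first-sentence length
MAX_TOKENS = 37     # 90th percentile of first-sentence length


def keep_example(facts: Dict[str, str], first_sentence: str,
                 allowed_relations: List[str]) -> bool:
    """Return True if an entity passes the (assumed) dataset constraints."""
    # Only relations from the allowed inventory count towards the minimum.
    present = [r for r in facts if r in allowed_relations]
    if len(present) < MIN_RELATIONS:
        return False
    # Whitespace tokenisation is a simplification of real tokenisation.
    n_tokens = len(first_sentence.split())
    return MIN_TOKENS <= n_tokens <= MAX_TOKENS
```

Applied to the Tuomi example from the task slide, the five listed relations and the 16 whitespace tokens of the Wikipedia sentence would pass this filter, provided all five relations are in the allowed inventory.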

10 of 40

Dataset » Relation coverage

12 of 40

Dataset » Task complexity

How complicated are Wikipedia first sentences?

Robert Charles Cortner (April 16, 1927 – May 19, 1959) was an American automobile racing driver from Redlands, California.

Barry MacKay (8 January 1906 – 12 December 1985) was a British actor.

Joseph "Flip" Nuñez (August 27, 1931 – November 3, 1995) was an American jazz pianist, composer, and vocalist of Filipino descent.

13 of 40

Dataset » Language modelling benchmark

Our approach

Train a benchmark language model and compare perplexity
5-gram, Kneser-Ney smoothed (typical perplexity ~50-100)

Results (ppl)

29.8 (Raw Text)

Robert Cortner was an American automobile racing driver...

14.5 (Name Templating)

TITLE was an American automobile racing driver...

10.1 (Full Templating)

TITLE was an CITIZENSHIP automobile OCCUPATION from...
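For reference, the perplexity figures above follow the standard per-token definition for a 5-gram model: lower values mean the text is more predictable. The symbols N (number of tokens) and w_i (the i-th token) are notation introduced here, not taken from the slides.

```latex
\mathrm{PPL}(w_1,\dots,w_N)
  = \exp\left(-\frac{1}{N}\sum_{i=1}^{N}
      \log p\left(w_i \mid w_{i-4},\dots,w_{i-1}\right)\right)
```

Read this way, replacing names and fact values with placeholders lowers perplexity because the hardest-to-predict tokens are collapsed into a small set of symbols.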

14 of 40

Modelling

15 of 40

Modelling » Baseline

Template driven baseline

Replace occurrences of fact values in text with placeholders
Manually inspect the most frequent patterns
Example:

TITLE, known as GIVEN NAME, (born DATE OF BIRTH in PLACE OF BIRTH; died DATE OF DEATH in PLACE OF DEATH) is an POSITION HELD and OCCUPATION from CITIZENSHIP.

48 variations possible (conditioned on which relations and values are present)
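The first step of the baseline, replacing fact values in the text with placeholders, might look like the sketch below. It assumes exact string matching between a fact value and the sentence, which is a simplification: values such as dates usually appear in a different surface form in the text and would need normalising first. The function name `templatize` is hypothetical.

```python
from typing import Dict


def templatize(sentence: str, facts: Dict[str, str]) -> str:
    """Replace each fact value found in the sentence with its relation name."""
    # Replace longer values first so a short value cannot break up a longer one.
    for relation, value in sorted(facts.items(), key=lambda kv: -len(kv[1])):
        if value:
            sentence = sentence.replace(value, relation)
    return sentence


facts = {"TITLE": "Barry MacKay", "OCCUPATION": "actor"}
print(templatize("Barry MacKay was a British actor.", facts))
# prints: TITLE was a British OCCUPATION.
```

Counting the resulting patterns over the training set would then surface the most frequent templates for manual inspection.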

16 of 40

Modelling » Neural model

Basic idea

Treat our fact-to-text task as a simple translation problem
Borrow and improve upon a state-of-the-art translation model

17 of 40

Modelling » Fact linearization

Source Language: Linearized Facts (Vinyals et al., 2015; Gillick et al., 2016; Xiao et al., 2016)

#TITLE matias tuomi #SEX_OR_GENDER male #DATE_OF_BIRTH 1985 09 03 #OCCUPATION squash player #CITIZENSHIP finland

Target Language: English

matias tuomi , ( born september 30 , 1985 in espoo ) is a professional squash player who represents finland .
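A minimal sketch of how the linearized source sequence could be produced from (relation, value) pairs. The function name `linearize` and the normalisation rules (lowercasing, splitting on hyphens and whitespace) are assumptions chosen to reproduce the example above; they are not taken from the released code.

```python
from typing import List, Tuple


def linearize(facts: List[Tuple[str, str]]) -> str:
    """Flatten (relation, value) pairs into one token sequence."""
    tokens: List[str] = []
    for relation, value in facts:
        # Each relation name becomes a single marker token, e.g. #DATE_OF_BIRTH.
        tokens.append("#" + relation.upper().replace(" ", "_"))
        # Values are lowercased; hyphens are split so dates become "1985 09 03".
        tokens.extend(value.lower().replace("-", " ").split())
    return " ".join(tokens)


facts = [
    ("title", "Matias Tuomi"),
    ("sex or gender", "male"),
    ("date of birth", "1985-09-03"),
    ("occupation", "squash player"),
    ("citizenship", "Finland"),
]
print(linearize(facts))
```

Splitting on hyphens also breaks up hyphenated names, so a real pipeline would likely treat dates separately from free-text values.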

18 of 40

Modelling » Base sequence-to-sequence model

S2S Model

Sutskever et al., 2014
3-layer GRU encoder
3-layer GRU decoder
Joint source-target word embedding space
Decoder attention

19 of 40

Modelling » Constraining generation

Problem

Input facts are predictive of the output
Previously generated text is predictive of the output
Facts aren’t enough of a constraint and may even be missing

Idea

Augment decoder loss to penalize inaccurate generation
We need some kind of relation extraction oracle

20 of 40

Modelling » Sequence-to-sequence Auto-encoding

S2S+AE Model

Constrains output to be predictive of the input
Multi-task learning
S2S forward network
S2S backward network
Combined loss
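One way to read the combined loss is as the sum of two sequence losses: the forward network's loss for generating the text from the facts, and the backward network's loss for reconstructing the facts from the text. The sketch below assumes this reading. The per-token probabilities are supplied directly rather than produced by a network, and the `backward_weight` parameter is hypothetical, since the slides do not say how the two terms are weighted.

```python
import math
from typing import Sequence


def sequence_nll(token_probs: Sequence[float]) -> float:
    """Negative log-likelihood of a sequence from per-token probabilities."""
    return -sum(math.log(p) for p in token_probs)


def combined_loss(forward_probs: Sequence[float],
                  backward_probs: Sequence[float],
                  backward_weight: float = 1.0) -> float:
    """Sum the forward (facts -> text) and backward (text -> facts) losses."""
    forward = sequence_nll(forward_probs)
    backward = sequence_nll(backward_probs)
    return forward + backward_weight * backward
```

Under this formulation, text that reads well but drops or alters facts is penalised through the backward term, because the facts become harder to reconstruct from it.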

21 of 40

Examples

22 of 40

Examples

23 of 40

Examples

24 of 40

Evaluation

25 of 40

Evaluation

Metrics - the more, the better

BLEU
Crowd-sourced human preference judgements
Content selection performance by fact annotation

26 of 40

Evaluation » Reference similarity

BLEU

Standard translation metric
Evaluates similarity between generated sequence and reference
Decoding is expensive, so we randomly sample 10k entities from DEV and TEST for evaluation
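To make the metric concrete, here is a simplified sentence-level BLEU with a single reference and no smoothing. It is an illustrative sketch only: BLEU is normally reported at corpus level with aggregated n-gram counts, and the slides do not say which implementation was used. The function names are hypothetical.

```python
import math
from collections import Counter
from typing import List


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate: List[str], reference: List[str], max_n: int = 4) -> float:
    """Unsmoothed single-reference BLEU for one sentence pair."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, one empty precision gives BLEU 0
        log_precisions.append(math.log(overlap / total))
    # The brevity penalty punishes candidates shorter than the reference.
    if len(candidate) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(log_precisions) / max_n)
```

A generated sentence identical to its reference scores 1.0, and one sharing no 4-gram with the reference scores 0.0 under this unsmoothed variant.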

27 of 40

Evaluation » Human preference

Preference evaluation

Crowdflower Task
100 entities
Randomly paired output from Wikipedia and systems
Minimum of 3 judgements per instance
$31 USD

28 of 40

Evaluation » Crowd Task

29 of 40

Evaluation » Human preference

Results

S2S+AE > Baseline 62% of the time
S2S+AE > S2S 77% of the time
Wikipedia > S2S+AE 60% of the time

But it’s not statistically significant!
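The slides do not say which significance test was applied. One common choice for pairwise preference data is an exact two-sided sign test, sketched below as an assumption rather than as the authors' method. It asks how likely a split at least as extreme as the observed one would be if judges had no preference. The function name and any counts passed to it are hypothetical.

```python
from math import comb


def sign_test_p_value(wins: int, trials: int) -> float:
    """Two-sided exact binomial test against a 50/50 preference."""
    # Probability of each possible number of wins under the null hypothesis.
    probs = [comb(trials, k) / 2 ** trials for k in range(trials + 1)]
    observed = probs[wins]
    # Sum the probability of every outcome at least as unlikely as the observed one.
    total = sum(p for p in probs if p <= observed * (1 + 1e-12))
    return min(1.0, total)
```

With few decisive comparisons per system pair, even a sizeable preference rate can yield a large p-value, which is consistent with the caveat above.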

30 of 40

Evaluation » Content Selection

Content selection

How does model-generated text compare to Wikipedia?
How accurate is the generated text?
How much are we hallucinating?
Manually annotate 100 outputs with all facts expressed
Analyze P/R/F of expressed facts
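Given annotated fact sets, precision, recall and F1 can be computed per output as in the sketch below. It assumes facts are compared as exact (relation, value) pairs, whereas real annotation has to decide when two values count as equivalent. The names `Fact` and `precision_recall_f1` are hypothetical.

```python
from typing import Set, Tuple

Fact = Tuple[str, str]  # (relation, value)


def precision_recall_f1(expressed: Set[Fact],
                        reference: Set[Fact]) -> Tuple[float, float, float]:
    """Score facts expressed in a text against a reference fact set."""
    correct = len(expressed & reference)
    # Precision: share of expressed facts that are in the reference.
    precision = correct / len(expressed) if expressed else 0.0
    # Recall: share of reference facts that the text expresses.
    recall = correct / len(reference) if reference else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1
```

Taking Wikidata facts as the reference, low precision indicates hallucinated or unsupported facts, while low recall indicates facts the text left out.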

31 of 40

Evaluation » Comparing to Wikidata

Systems vs Wikidata facts

32 of 40

Evaluation » Comparing to Wikipedia

Systems vs Wikipedia

33 of 40

Summing up

Biography generation as fact-to-text translation
Auto-encoding improves generation
Robust evaluation is hard

34 of 40

Thanks!

Code + Data: github.com/andychisholm/mimo

35 of 40

Examples

36 of 40

That’s it!

Relations

Text

Facts

37 of 40

Knowledge-to-text tasks (Wen et al., 2015; Mei et al., 2015)
Neural Wikipedia biography generation (Lebret et al., 2016)

38 of 40

Annotation challenges

Fact equivalence is hard

“Redlands” != “Redlands, California”
“Film actor” != “actor”

Legend:

Facts in the text, and in the data
Extra facts in the text, not in the data
Facts in the text that are different from the data

Two annotators, all differences discussed

39 of 40

Dataset » Relation Occurrence

40 of 40

Human Preference Evaluation
