1 of 31

Towards Exploiting Background Knowledge for Building Conversation Systems

Nikita Moghe, Siddhartha Arora, Suman Banerjee, Mitesh M. Khapra

Indian Institute of Technology Madras

Robert Bosch Centre for Data Sciences and AI, IITM

2 of 31

Courtesy: Google Images (Cortana, Allo, Echo, Siri, Ask Jenn)

3 of 31

Deep Learning for Conversational AI

[Timeline figure: First Chatbot; Template/Rule based Systems; Rise of data driven systems; End to End Systems (Data Hungry). Pipeline from Utterance to Response: Natural Language Understanding, Dialog State Tracking, Natural Language Generation. Component labels: Rules, Slot-Filling, Data.]

Weizenbaum, 1966

Ritter et al., 2011; Vinyals and Le, 2015; Lowe et al., 2015

Perez-Marin and Pascual-Nieto, 2011; Shawar and Atwell, 2007b; Williams et al., 2013

Aust et al., 1995; McGlashan et al., 1992; Simpson and Fraser, 1993

4 of 31

Data Data Everywhere..

Based on Movie Scripts

Crawled from websites

60+ such datasets for dialog

~1.3M chats

~930K chats

~1.7B comments

A Survey of Available Corpora for Building Data-Driven Dialogue Systems, Serban et al., arxiv 2015

Based on Human-Human Spoken Interaction

Based on Human Machine Interaction

Logos: https://ubuntuforums.org/, https://twitter.com/, https://www.reddit.com/

5 of 31

Dialog as a Seq2Seq Problem!

Human: Please suggest a movie?

Bot: Sure! Check out Titanic

please suggest a movie

sure check out <UNK>

Encoder

Decoder

But...

Humans rely on background knowledge to converse

Vinyals et al., ICML 2015
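The `<UNK>` in the decoded reply above comes from vocabulary truncation: any word outside the model's vocabulary is mapped to one placeholder token. A minimal sketch of that mapping, assuming whitespace tokenization and a frequency cutoff (the corpus, `build_vocab`, and `encode` are hypothetical names for illustration, not the authors' code):

```python
from collections import Counter

UNK = "<UNK>"

def build_vocab(sentences, min_count=2):
    """Keep only tokens seen at least `min_count` times (toy frequency cutoff)."""
    counts = Counter(tok for s in sentences for tok in s.lower().split())
    return {tok for tok, c in counts.items() if c >= min_count}

def encode(sentence, vocab):
    """Replace every out-of-vocabulary token with <UNK>."""
    return [tok if tok in vocab else UNK for tok in sentence.lower().split()]

# Hypothetical toy corpus; "titanic" occurs once, so it falls out of the vocabulary.
corpus = [
    "please suggest a movie",
    "please suggest a movie tonight",
    "sure check out titanic",
    "sure check out the reviews",
]
vocab = build_vocab(corpus)
decoded = encode("sure check out Titanic", vocab)
print(" ".join(decoded))  # sure check out <UNK>
```

The movie title is exactly the kind of rare, knowledge-bearing word that such a cutoff discards, which is the gap background knowledge is meant to fill.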

6 of 31

Has this never been tried before?

Not really...

Linux

Man Pages

Goal Oriented Dialog

Ubuntu Corpus

But the resources and chats are not tightly coupled!

Open Domain Dialog

Alexa Proceedings 2017

Lowe et al., 2015

7 of 31

Key Contribution

Domain-specific conversation systems with alternate responses explicitly obtained from specific background knowledge

8 of 31

Dataset Creation

9 of 31

The 4C’s of Dataset Creation

Crowdsource

Crawl

Check

Curate

9071 chats

~90K utterances

9278 resources

15.29 avg words/turn

153.07 avg words/chat
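These statistics can be cross-checked against each other. A small sketch, assuming both averages are computed over the same set of chats, so that words per chat divided by words per turn gives turns per chat (variable names are illustrative):

```python
# Figures reported on the slide.
chats = 9071
avg_words_per_turn = 15.29
avg_words_per_chat = 153.07

# Assumption: both averages cover the same chats, so their ratio is turns per chat.
turns_per_chat = avg_words_per_chat / avg_words_per_turn
estimated_utterances = chats * turns_per_chat

print(f"turns per chat ~ {turns_per_chat:.2f}")
print(f"estimated utterances ~ {estimated_utterances:,.0f}")
```

The ratio comes out at about 10 turns per chat, which agrees with the reported ~90K utterances over 9071 chats.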

10 of 31

Movie : Spider-Man

Popular and Diverse Movie List

921 Movies

IMDb 250

Top Ten Movies by Genre

1001 Movies you must watch before you die!

Curate

11 of 31

Movie : Spider-Man

Background Knowledge

Plot (Wikipedia for Plots): .. spiders through genetic manipulation. While Peter is taking photographs of Mary Jane for the school newspaper, ...

Review (Collected Reviews using IMDb Most Popular Reviews): I thought the movie very engrossing. Director Sam Raimi kept the action quotient high but also emphasized the human element of the story

Comments (Official Reddit Pages for Comments): Crazy attention to detail. It was too heavily reliant on light-hearted humor.

Facts (Wikipedia Infoboxes for Facts): Box Office: $403,706,375; Similar Movies: Avengers, SpiderMan 2

Crawl

Curate

12 of 31

S1(N): Which is your favourite character in this?

S2(C): My favorite character was played by Tobey Maguire.

Crowdsource

Crawl

Curate

Chat opening statements

Which is your favourite scene in the movie?

Which is your favourite character in this?

What do you think about the movie?

9 per movie

Comments: My favorite character was played by Tobey Maguire.

13 of 31

S1(N): Which is your favourite character in this?

S2(C): My favorite character was played by Tobey Maguire.

Crowdsource

Crawl

Curate

Crowdsourcing platforms are meant for Atomic Tasks!

But dialog requires two people

Same worker plays the role of Speaker 1 and Speaker 2 to complete the chat

Self-Chats

Krause et al. Alexa Proceedings 2017

14 of 31

S1(N): Which is your favourite character in this?

S2(C): My favorite character was played by Tobey Maguire.

S1(N): I thought he did an excellent job as Peter Parker, I didn’t see what it was that turned him into Spider-Man though.

Crowdsource

Crawl

Curate

Self-Chats

Speaker 1 and Speaker 2 are played by the same person.

Speaker 1 is free to talk about anything

15 of 31

S1(N): Which is your favourite character in this?

S2(C): My favorite character was played by Tobey Maguire.

S1(N): I thought he did an excellent job as Peter Parker, I didn’t see what it was that turned him into Spider-Man though.

S2(P): Well this happens while Peter is taking photographs of Mary Jane for the school newspaper, one of these new spiders lands on his hand and bites him.

Crowdsource

Crawl

Curate

Self-Chats

Speaker 1 and Speaker 2 are played by the same person.

Speaker 1 is free to talk about anything

Speaker 2 has to reply using background knowledge

Specifically, Speaker 2 selects a contiguous span of words from the resource and appends suitable words
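Because each Speaker 2 reply is a copied span plus a few added words, the span can be located again by matching the reply against the resource. A minimal sketch using the standard library's `difflib`, assuming lowercase word tokens with punctuation dropped (`tokenize` and `copied_span` are illustrative helpers, not part of the released code):

```python
import re
from difflib import SequenceMatcher

def tokenize(text):
    """Lowercase word tokens; punctuation is dropped (a simplifying assumption)."""
    return re.findall(r"[\w']+", text.lower())

def copied_span(response, resource):
    """Return (longest contiguous span shared with the resource, words the speaker added)."""
    r_tok, d_tok = tokenize(response), tokenize(resource)
    matcher = SequenceMatcher(None, r_tok, d_tok, autojunk=False)
    m = matcher.find_longest_match(0, len(r_tok), 0, len(d_tok))
    span = d_tok[m.b:m.b + m.size]
    added = r_tok[:m.a] + r_tok[m.a + m.size:]
    return span, added

resource = ("While Peter is taking photographs of Mary Jane for the school newspaper, "
            "one of these new spiders lands on his hand and bites him")
response = ("Well this happens while Peter is taking photographs of Mary Jane for the "
            "school newspaper, one of these new spiders lands on his hand and bites him.")

span, added = copied_span(response, resource)
print(len(span), added)  # 24 ['well', 'this', 'happens']
```

Here the whole plot sentence is recovered as the span, and "Well this happens" is the text the speaker added to make the reply conversational.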

16 of 31

S1(N): Which is your favourite character in this?

S2(C): My favorite character was played by Tobey Maguire.

S1(N): I thought he did an excellent job as Peter Parker, I didn’t see what it was that turned him into Spider-Man though.

S2(P): Well this happens while Peter is taking photographs of Mary Jane for the school newspaper, one of these new spiders lands on his hand and bites him.

AMT workers are notorious!

Check for:

  • Incoherence

  • Simulated Patterns

  • Incompleteness

  • Digression

Crowdsource

Crawl

Check

Curate

Amazon Mechanical Turk https://www.mturk.com

17 of 31

S1(N): Which is your favourite character in this?

S2(C): My favorite character was played by Tobey Maguire.

S1(N): I thought he did an excellent job as Peter Parker, I didn’t see what it was that turned him into Spider-Man though.

S2(P): Well this happens while Peter is taking photographs of Mary Jane for the school newspaper, one of these new spiders lands on his hand and bites him.

S1 (N): I see. I was very excited to see this film and it did not disappoint!

S2(R): I agree, I thoroughly enjoyed “Spider-Man”

S1(N): I loved that they stayed pretty true to the comic.

S2(C): Yeah, it was a really great comic book adaptation

S1(N): The movie is a great life lesson on balancing power.

S2(F): That is my most favorite line in the movie, ‘With great power comes great responsibility.’

Movie : Spider-Man

Review

I thoroughly enjoyed “Spider-Man” which I saw in a screening. I thought the movie very engrossing. Director Sam Raimi kept the action quotient high, but also emphasized the human element of the story. The casting was perfect. Tobey Maguire was very believable as the gawky teenager in the early part of the film and then, after his run-in with the radioactive

Plot

Peter’s science class takes a field trip to a genetics laboratory at Columbia University. The lab works on spiders and has even managed to create new species of spiders through genetic manipulation. While Peter is taking photographs of Mary Jane for the school newspaper, one of these new spiders lands on his hand and bites him

Comments

Crazy attention to detail. My favorite character was played by Tobey Maguire. I can’t get over the “I’m gonna kill you dead” line. No spoilers, but it does start to take itself more seriously towards the finale. It was too heavily reliant on constant light-hearted humor. How ever the constant joking around kinda was low. A really great comic book adaptation.

Fact Table

Box Office: $403,706,375

Taglines: With great power comes great responsibility; Get Ready For Ultimate Spin!

18 of 31

Multi-Reference Test Set

S1(N): I thought he did an excellent job as Peter Parker, I didn’t see what it was that turned him into Spider-Man though.

S2(P): Well this happens while Peter is taking photographs of Mary Jane for the school newspaper, one of these new spiders lands on his hand and bites him.

S1 (N): I see. I was very excited to see this film and it did not disappoint!

I thoroughly enjoyed “Spider-Man” which I saw in a screening. I thought the movie very engrossing. Director Sam Raimi kept the action quotient high, but also emphasized the human element of the story. The casting was perfect. Tobey Maguire was very believable as the gawky teenager in the early part of the film and then, after his run-in with the radioactive

Review

S2 (R): I agree. I thoroughly enjoyed Spider-Man

S2 (R): Also, The casting was perfect

S2 (R): I think so too! Director Sam Raimi kept the action quotient high

78.04% of test set

Several Responses can be correct for a given context
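With several valid references, a prediction can be credited for matching any one of them. The slides do not say how scores are aggregated over references; the sketch below assumes taking the maximum of a whitespace-token F1, and `token_f1` and `multi_ref_f1` are illustrative names:

```python
from collections import Counter

def token_f1(prediction, reference):
    """Bag-of-words F1 between two whitespace-tokenized, lowercased strings."""
    p, r = prediction.lower().split(), reference.lower().split()
    overlap = sum((Counter(p) & Counter(r)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(r)
    return 2 * precision * recall / (precision + recall)

def multi_ref_f1(prediction, references):
    """Assumption: score against the best-matching reference."""
    return max(token_f1(prediction, ref) for ref in references)

# The three acceptable S2 replies from the slide.
references = [
    "I agree. I thoroughly enjoyed Spider-Man",
    "Also, The casting was perfect",
    "I think so too! Director Sam Raimi kept the action quotient high",
]
prediction = "the casting was perfect"

single = token_f1(prediction, references[0])
multi = multi_ref_f1(prediction, references)
print(round(single, 3), round(multi, 3))  # 0.0 0.889
```

Scored against the first reference alone, a perfectly reasonable reply gets zero; scored against all three, it gets credit for matching the second.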

19 of 31

But wait… Is this Natural?

500 randomly chosen chats

Evaluated by three in-house annotators per chat

How can self-chat be a dialog?

What if the conversation digressed?

Does this copy-paste make any sense?

Metric            Score on 5
Intelligibility   4.47
Coherence         4.33
Two-person Chat   4.47
On-Topic          4.57
Grammar           4.41

20 of 31

Methods

21 of 31

Generation Based Models: Hierarchical Recurrent Encoder Decoder (HRED) (No Background Knowledge)

Copy-or-Generate Models: Get to the Point (GTTP)
  • Document <-> Resource
  • Summary <-> Response

Span Prediction Models: BiDirectional Attention Flow (BiDAF)
  • Question <-> Context
  • Document <-> Resource

Serban et al., AAAI 2016; See et al., ACL 2017; Seo et al., ICLR 2017

22 of 31

Challenges

S1 (N): I see. I was very excited to see this film and it did not disappoint!

S2(R): I agree, I thoroughly enjoyed “Spider-Man”

S1(N): I loved that they stayed pretty true to the comic

Oracle

Mixed-Long

Mixed-Short

BiDirectional Attention Flow fails beyond 256 resource words!

Average combined resource length ~ 900 words
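One way to fit a ~900-word combined resource into a 256-word budget is to keep a window around the span the response was copied from. The slides do not describe how the shorter resource variants were built, so the following is only a sketch under that assumption, and `window_around_span` is a hypothetical helper:

```python
def window_around_span(words, span_start, span_end, budget=256):
    """Return (window, offset): at most `budget` words that still contain
    words[span_start:span_end], plus the span's start index inside the window."""
    span_len = span_end - span_start
    if span_len > budget:
        raise ValueError("span is longer than the word budget")
    left_context = (budget - span_len) // 2
    start = max(0, span_start - left_context)
    end = min(len(words), start + budget)
    start = max(0, end - budget)  # shift left if the window hit the end of the resource
    return words[start:end], span_start - start

# Hypothetical 900-word resource with the copied span at words 700..719.
words = [f"w{i}" for i in range(900)]
window, offset = window_around_span(words, 700, 720)
print(len(window), offset)  # 256 118
```

This needs the span position, which is known for training data but not at test time.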

23 of 31

Results

Model  Type         F1     BLEU   ROUGE-1  ROUGE-2  ROUGE-L
HRED   -            -      5.23   24.55    7.61     18.87
GTTP   oracle       -      13.92  30.32    17.78    25.67
GTTP   mixed-short  -      11.05  29.66    17.7     25.13
GTTP   mixed-long   -      7.51   23.2     9.91     17.35
BiDAF  oracle       39.69  28.85  39.68    33.72    35.91
BiDAF  mixed-short  45.72  32.95  45.69    40.18    43.8
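For reference, ROUGE-L is based on the longest common subsequence (LCS) between a prediction and a reference. A minimal sketch of the balanced F-measure form, assuming whitespace tokenization; published toolkits differ in tokenization, stemming, and the recall weighting, so this will not reproduce the table's numbers exactly:

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def rouge_l_f1(prediction, reference):
    """Balanced F-measure over LCS precision and recall (simplified ROUGE-L)."""
    p, r = prediction.lower().split(), reference.lower().split()
    lcs = lcs_length(p, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(p), lcs / len(r)
    return 2 * precision * recall / (precision + recall)

score = rouge_l_f1("i agree the casting was perfect", "also the casting was perfect")
print(round(score, 4))  # 0.7273
```

The LCS here is "the casting was perfect" (4 tokens), giving precision 4/6 and recall 4/5.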

24 of 31

Results

Background knowledge helps! But correct background knowledge helps more!

25 of 31

Results

Span prediction models perform better

26 of 31

Results

BiDAF can choose to ignore noise. GTTP has no such distinction

27 of 31

Results

On Multi Reference Test Set (78.04%)

Model  Type         F1     BLEU   ROUGE-1  ROUGE-2  ROUGE-L
HRED   -            -      5.38   25.38    8.35     19.67
GTTP   oracle       -      16.46  32.74    20.20    28.23
GTTP   mixed-short  -      15.68  31.71    19.72    27.35
GTTP   mixed-long   -      8.73   25.51    12.13    19.57
BiDAF  oracle       47.18  34.98  46.49    40.58    42.64
BiDAF  mixed-short  51.35  39.39  50.73    45.01    46.95

28 of 31

What next?

Finding the right resource is important

  • Two stage model : Predict the resource and generate

Cross attention mechanisms are useful

  • Building scalable cross attention mechanisms

Generation models are more human-like

  • Adding the power of cross attention mechanisms to generation models
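The two-stage idea (first predict the resource, then produce the reply from it) can be illustrated with a toy lexical stand-in. This is not a proposed model: both stages below use Jaccard word overlap where the slide envisions learned components, and all names and example texts are hypothetical:

```python
import re

def tokens(text):
    """Set of lowercase word tokens."""
    return set(re.findall(r"\w+", text.lower()))

def jaccard(a, b):
    """Jaccard overlap between the word sets of two strings."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / max(1, len(ta | tb))

def two_stage_reply(context, resources):
    # Stage 1: pick the resource that overlaps most with the context.
    name = max(resources, key=lambda k: jaccard(context, resources[k]))
    # Stage 2: pick the best-matching sentence inside that resource.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", resources[name]) if s.strip()]
    return name, max(sentences, key=lambda s: jaccard(context, s))

resources = {
    "plot": "Peter's science class takes a field trip to a genetics laboratory. "
            "One of these new spiders lands on his hand and bites him.",
    "review": "I thoroughly enjoyed Spider-Man. The casting was perfect.",
}
choice = two_stage_reply("What did you think of the casting?", resources)
print(choice)  # ('review', 'The casting was perfect.')
```

Separating the stages means the second stage only ever sees one short resource, which sidesteps the length limit noted for span prediction.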

Code and data : https://github.com/nikitacs16/Holl-E

29 of 31

Siddhartha Arora

Suman Banerjee

Mitesh M. Khapra

Microsoft Research India (Student Travel Grant)

TextKernel (EMNLP Student Travel Scholarship)

Thank You!

30 of 31

Questions/Suggestions

31 of 31

Human Evaluation of Responses

Model  Type         Human-Like  Appropriate  Fluency  Specificity
HRED   -            2.91        1.97         2.74     2.14
GTTP   oracle       4.1         3.82         4.03     3.33
GTTP   mixed-long   2.93        3.46         3.42     2.6
BiDAF  oracle       3.78        4.17         4.05     3.76
BiDAF  mixed-short  3.41        3.5          3.47     3.3