Youssra Rebboud
From Nodes to Narratives
A Knowledge Graph-based Storytelling Approach
Mike de Kok, Youssra Rebboud, Pasquale Lisena, Raphael Troncy, Ilaria Tiddi
Narrative Graph
What information does the Narrative Graph cover?
The narratives cover information about the 4 Ws:
What information does the Narrative Graph not cover?
Lack of semantically richer event relations
Examples:
Prevention: "The government has implemented a series of laws to prevent the abuse of animals."
Enable: "DD Acquisition said the extension is to allow this process to be completed."
Cause: "The government passed a law to increase access to mental health services and reduce stigma."
Our Proposed Solution
II. Building the Narrative Graph (BNG)
III. Knowledge graph summarization (KGS)
IV. Quantitative Analysis (QNA)
V. Qualitative Analysis (QLA)
II. Building the Narrative Graph (BNG)
II. (BNG)
Build a semantically rich Narrative Graph
Starting Point: ASRAEL KG [1]
[1] Rudnik, C., Ehrhart, T., Ferret, O., Teyssou, D., Troncy, R., Tannier, X.: Searching news articles using an event knowledge graph leveraged by Wikidata. In: Companion Proceedings of The Web Conference, 2019.
Predicate type | Predicate labels | Wikidata properties | SEM properties |
Who | participant, organizer, founded by | P710, P664, P112 | sem:hasActor |
Where | country, location, coordinate location, located in the administrative territorial entity, continent | P17, P276, P625, P131, P30 | sem:hasPlace |
When | point in time, start time, end time, inception, "dissolved, abolished or demolished date", publication date | P585, P580, P582, P571, P576, P577 | sem:hasTime, sem:hasBeginTimeStamp, sem:hasEndTimeStamp, sem:hasTime, sem:hasTime, sem:hasTimeStamp |
Location ✅
Actor ✅
Time ✅
Precise Event Spans ❌
Semantically Precise relations ❌
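As a minimal illustration, the Wikidata-to-SEM mapping in the table above can be written as a plain lookup table; the property IDs and SEM predicates come directly from the table, while the helper function and its use are assumptions, not the authors' code.

```python
# Illustrative only: the Wikidata -> SEM mapping from the table above as a lookup table.
WIKIDATA_TO_SEM = {
    # Who
    "P710": "sem:hasActor",           # participant
    "P664": "sem:hasActor",           # organizer
    "P112": "sem:hasActor",           # founded by
    # Where
    "P17": "sem:hasPlace",            # country
    "P276": "sem:hasPlace",           # location
    "P625": "sem:hasPlace",           # coordinate location
    "P131": "sem:hasPlace",           # located in the administrative territorial entity
    "P30": "sem:hasPlace",            # continent
    # When
    "P585": "sem:hasTime",            # point in time
    "P580": "sem:hasBeginTimeStamp",  # start time
    "P582": "sem:hasEndTimeStamp",    # end time
    "P571": "sem:hasTime",            # inception
    "P576": "sem:hasTime",            # dissolved, abolished or demolished date
    "P577": "sem:hasTimeStamp",       # publication date
}


def to_sem_predicate(wikidata_property):
    """Return the SEM predicate for a Wikidata property, or None if not in the 4W mapping."""
    return WIKIDATA_TO_SEM.get(wikidata_property)


print(to_sem_predicate("P710"))  # -> sem:hasActor
```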
II. (BNG)
Event Relation extraction from ASRAEL news articles
REBEL [2]
Sentence: "A 7.3 magnitude earthquake off Japan's Fukushima injured dozens of people."
Generated: (<triplet> earthquake <subj> injured <obj> causes)
Extracted relation: cause
[2] Huguet Cabot, P.-L., Navigli, R.: REBEL: Relation Extraction By End-to-end Language generation. In: Findings of the Association for Computational Linguistics: EMNLP 2021.
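A minimal sketch (not the authors' pipeline) of running the released REBEL checkpoint with Hugging Face transformers and parsing its linearised output into triples; the generation settings and the parsing helper are illustrative assumptions.

```python
# Run the public REBEL checkpoint and parse its "<triplet> ... <subj> ... <obj> ..." output.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Babelscape/rebel-large")
model = AutoModelForSeq2SeqLM.from_pretrained("Babelscape/rebel-large")

sentence = "A 7.3 magnitude earthquake off Japan's Fukushima injured dozens of people."
inputs = tokenizer(sentence, return_tensors="pt")
output_ids = model.generate(**inputs, max_length=256, num_beams=3)
# Keep the special tokens: they delimit the triplet structure.
decoded = tokenizer.decode(output_ids[0], skip_special_tokens=False)


def parse_rebel(text):
    """Parse '<triplet> subject <subj> object <obj> relation' spans into (s, r, o) triples."""
    triples = []
    cleaned = text.replace("<s>", "").replace("</s>", "").replace("<pad>", "")
    for chunk in cleaned.split("<triplet>"):
        if "<subj>" in chunk and "<obj>" in chunk:
            subj, rest = chunk.split("<subj>", 1)
            obj, rel = rest.split("<obj>", 1)
            triples.append((subj.strip(), rel.strip(), obj.strip()))
    return triples


print(parse_rebel(decoded))  # e.g. [('earthquake', 'causes', 'injured')]
```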
II. (BNG)
Event Coreference resolution for ASRAEL KG
The EECEP model [4]
Example:
“A 7.3-magnitude earthquake off Japan's Fukushima injured dozens of people ….”
"The powerful quake, triggering landslides and collapsing houses, caused mayhem."
"The powerful quake, triggering landslides and collapsing houses, caused mayhem."
Window
Similarity: 0.96
[4] Held, W., Iter, D., & Jurafsky, D. (2021). Focus on what matters: Applying Discourse Coherence Theory to Cross Document Coreference. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (pp. 1406–1417). Association for Computational Linguistics.
II. (BNG)
Event Coreference resolution for ASRAEL KG
Example:
“A 7.3-magnitude earthquake off Japan's Fukushima injured dozens of people ….”
"The powerful quake, triggering landslides and collapsing houses, caused mayhem."
Similarity: 0.96
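The coreference decision itself comes from the model in [4]; purely to illustrate the similarity intuition shown above, the sketch below scores the two mentions with a general-purpose sentence encoder and merges the event nodes above a threshold. The encoder name and the threshold are assumptions.

```python
# Illustrative only: embedding-based similarity between two event mentions.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

mention_a = "A 7.3-magnitude earthquake off Japan's Fukushima injured dozens of people."
mention_b = "The powerful quake, triggering landslides and collapsing houses, caused mayhem."

emb_a, emb_b = encoder.encode([mention_a, mention_b], convert_to_tensor=True)
similarity = util.cos_sim(emb_a, emb_b).item()

COREF_THRESHOLD = 0.9  # illustrative cut-off, not the value used by [4]
if similarity >= COREF_THRESHOLD:
    print(f"Coreferent events (similarity={similarity:.2f}): merge their nodes in the Narrative Graph")
```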
II. (BNG)
Final Narrative Graph
III. Knowledge graph summarization (KGS)
Relevant Information Selection
III. (KGS)
Text Generation from KG
III. (KGS)
[5] P. Ke, H. Ji, Y. Ran, X. Cui, L. Wang, L. Song, X. Zhu, M. Huang, JointGT: Graph-Text Joint Representation Learning for Text Generation from Knowledge Graphs, in: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, ACL, 2021, pp. 2526–2538.
Dataset | Triples | Label |
WebNLG | (3Arena, owner, Live Nation Entertainment), (Dublin, is part of, Republic of Ireland), (3Arena, location, Dublin), (Dublin, is part of, Leinster) | (The owner of 3Arena, Dublin, Leinster, Republic of Ireland is Live Nation Entertainment), (Dublin is part of Leinster and a city in the Republic of Ireland. Dublin is also home to the 3Arena which is currently owned by Live Nation Entertainment) |
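JointGT [5] uses its own graph-aware encoder; purely as a model-agnostic sketch of the KG-to-text setup, the triples in the table above can be linearised into a single sequence for a seq2seq generator. The tag format and the generic t5-base checkpoint are assumptions, and a fine-tuned checkpoint would be needed for sensible output.

```python
# Model-agnostic sketch: linearise WebNLG-style triples and feed them to a seq2seq model.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

triples = [
    ("3Arena", "owner", "Live Nation Entertainment"),
    ("Dublin", "is part of", "Republic of Ireland"),
    ("3Arena", "location", "Dublin"),
    ("Dublin", "is part of", "Leinster"),
]


def linearize(triples):
    """Turn (head, relation, tail) triples into one tagged input string."""
    return " ".join(f"[head] {h} [relation] {r} [tail] {t}" for h, r, t in triples)


tokenizer = AutoTokenizer.from_pretrained("t5-base")        # placeholder checkpoint
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")    # fine-tuning required in practice

inputs = tokenizer(linearize(triples), return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_length=128, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```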
Text Generation from KG
III. (KGS)
Contribution: Enhancement of WebNLG with the FARO Dataset
Dataset | Triples | Label |
WebNLG | (3Arena, owner, Live Nation Entertainment), (Dublin, is part of, Republic of Ireland), (3Arena, location, Dublin), (Dublin, is part of, Leinster) | (The owner of 3Arena, Dublin, Leinster, Republic of Ireland is Live Nation Entertainment), (Dublin is part of Leinster and a city in the Republic of Ireland. Dublin is also home to the 3Arena which is currently owned by Live Nation Entertainment) |
FARO | (Demand, cause, benefited) | The company benefited from continued strong demand and higher selling prices for titanium dioxide, a white pigment used in paints, paper and plastics. |
Dataset | Train | Val | Test |
WebNLG | 12876 | 1619 | 1600 |
FARO | 1800 | 201 | 108 |
Combined | 14676 | 1820 | 1708 |
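A hypothetical sketch of how the Combined splits in the table above could be assembled, assuming both corpora have already been flattened to JSON lines with the same fields; the file names are placeholders, and only the split sizes come from the slide.

```python
# Hypothetical sketch: concatenate WebNLG and FARO into the Combined splits.
from datasets import load_dataset, concatenate_datasets

webnlg = load_dataset("json", data_files={
    "train": "webnlg_train.jsonl", "validation": "webnlg_val.jsonl", "test": "webnlg_test.jsonl",
})
faro = load_dataset("json", data_files={
    "train": "faro_train.jsonl", "validation": "faro_val.jsonl", "test": "faro_test.jsonl",
})

combined = {
    split: concatenate_datasets([webnlg[split], faro[split]])
    for split in ("train", "validation", "test")
}
# Expected sizes from the slide: 14676 / 1820 / 1708
print({split: combined[split].num_rows for split in combined})
```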
IV. Quantitative Analysis (QNA)
Quantitative analysis
IV. (QNA)
Model | Dataset | BLEU | METEOR | ROUGE |
Base (WebNLG) | WebNLG Val | 0.6642 | 0.4727 | 0.7558 |
Base (WebNLG) | WebNLG Test | 0.6529 | 0.4681 | 0.7535 |
Base (WebNLG) | FARO Test | 0.0 | 0.056 | 0.1281 |
Combined | WebNLG Val | 0.6368 | 0.4543 | 0.7468 |
Combined | WebNLG Test | 0.6101 | 0.4409 | 0.7260 |
Combined | FARO Test | 0.0651 | 0.0983 | 0.2304 |
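As an illustration of how scores like those above can be computed over (hypothesis, reference) pairs, the sketch below uses common metric libraries; the library choices and the example pair are assumptions, not necessarily what was used for the table.

```python
# Illustrative evaluation: corpus BLEU, sentence-averaged METEOR and ROUGE-L.
import nltk
import sacrebleu
from nltk.translate.meteor_score import meteor_score
from rouge_score import rouge_scorer

nltk.download("wordnet", quiet=True)   # METEOR needs WordNet
nltk.download("omw-1.4", quiet=True)

hypotheses = ["The company benefited from the strong demand for its products."]
references = ["The company benefited from continued strong demand and higher selling prices."]

bleu = sacrebleu.corpus_bleu(hypotheses, [references]).score / 100.0  # rescale to 0-1
meteor = sum(
    meteor_score([ref.split()], hyp.split()) for hyp, ref in zip(hypotheses, references)
) / len(hypotheses)

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
rouge_l = sum(
    scorer.score(ref, hyp)["rougeL"].fmeasure for hyp, ref in zip(hypotheses, references)
) / len(hypotheses)

print(f"BLEU={bleu:.4f}  METEOR={meteor:.4f}  ROUGE-L={rouge_l:.4f}")
```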
V. Qualitative Analysis (QLA)
V. (QLA)
Qualitative analysis on ASRAEL KG
In text generated from the same triples, the model trained on the combined dataset appears more semantically robust.
Triples | Label | Base | Combined |
(Demand, cause, benefited) | The company benefited from continued strong demand and higher selling prices for titanium dioxide, a white pigment used in paints, paper and plastics. | benefited “ is the cause of the demand | The company said it benefited from the strong demand for its products and services from a growing number of customers. |
V. (QLA)
Qualitative analysis on ASRAEL KG
Does the model trained on a dataset containing semantically precise relations (combined) show improved fluency and adequacy over the model trained without them (base)?
Three annotators performed a manual review using the following criteria (a minimal vote-aggregation sketch follows):
The combined model generates a more adequate sentence than the base model → win (adequacy)
The base model generates a more fluent sentence than the combined model → lose (fluency)
Both models generate equally good (or equally poor) sentences → tie (adequacy)
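A minimal sketch of the majority-vote aggregation over the three annotators; the labels shown are illustrative.

```python
# Aggregate three annotator judgements into a single win/lose/tie label.
from collections import Counter


def majority_vote(labels):
    """Return the most common label; fall back to 'tie' on a three-way split."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count > 1 else "tie"


adequacy_annotations = ["win", "win", "tie"]   # three annotators, one generated pair
print(majority_vote(adequacy_annotations))     # -> "win"
```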
V. (QLA)
Qualitative analysis on ASRAEL KG
Subject | Predicate | Object |
2021 Fukushima earthquake | location | Japan |
2021 Fukushima earthquake | date | 2021-02-13 |
earthquake | cause | collapsing |
earthquake | cause | injured |
broken | cause | damage |
Base:
2021 Fukushima earthquake , which was caused by collapsing , is located in Japan and was broken .
Combined:
The 2021 Fukushima earthquake , which hit Japan on February 13th , 2021 , injured many people and caused extensive damage and collapsing .
V. (QLA)
Qualitative analysis on ASRAEL KG
The combined model is significantly better than the base model at generating sophisticated sentences.
Win/lose/tie was determined by a majority vote over the three annotations.
Task | Fluency Win % | Fluency Lose % | Fluency Tie % | Adequacy Win % | Adequacy Lose % | Adequacy Tie % |
7 selected events | 71.4 | 14.3 | 14.3 | 28.6 | 0.0 | 71.4 |
Qualitative analysis on ASRAEL KG: Manually Annotated Article
V. (QLA)
Task | Fluency Win % | Fluency Lose % | Fluency Tie % | Adequacy Win % | Adequacy Lose % | Adequacy Tie % |
Manually annotated article | 33.3 | 16.7 | 50.0 | 58.3 | 8.3 | 33.3 |
Conclusion
Thank you!
GitHub
This presentation
kFLOW Website
Narrative Graph