Text Summarization
Table of contents
Abstracts of papers
Summarize the Web
Knowledge Graph
Headline news
TV-GUIDES
Graphical maps
Textual Directions
Questions
Categories
What to summarize? Single vs. multiple documents
Single-document Summarization
Multiple-document Summarization
Query-focused Summarization & Generic Summarization
Snippets: query-focused summaries
Summarization for Question Answering: Snippets
Summarization for Question Answering: Multiple documents
Extractive summarization & Abstractive summarization
A Summarization Machine
The Modules of the Summarization Machine
Typical 3 Stages of Summarization
1. Topic Identification: find/extract the most important material
2. Topic Interpretation: compress it
3. Summary Generation: say it in your own words
Overview of Topic Extraction Methods
Topic Interpretation
NL Generation for Summaries
1. Sentence Planner: plan sentence content, sentence length, theme, order of constituents, words chosen... (Hovy and Wanner, 96)
2. Surface Realizer: linearize input grammatically (Elhadad, 92; Knight and Hatzivassiloglou, 95).
Unsupervised content selection
Topic signature-based content selection with queries
(could learn more complex weights)
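A minimal sketch of query-focused content selection with a topic signature. The slide only names the method, so the details here are assumptions: the signature is taken as a precomputed set of salient words (for example, words that pass a log-likelihood ratio test against a background corpus), each word gets a binary weight of 1 if it is in the signature or in the query, and a sentence is scored by the average weight of its words. The function names and the toy data are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase the text and return its word tokens (hyphenated words stay whole)."""
    return re.findall(r"[a-z0-9\-]+", text.lower())

def word_weight(word, signature, query_words):
    """Binary weight (an assumption): 1 for signature or query words, else 0."""
    return 1.0 if word in signature or word in query_words else 0.0

def score_sentence(sentence, signature, query_words):
    """Average word weight over the sentence; empty sentences score 0."""
    tokens = tokenize(sentence)
    if not tokens:
        return 0.0
    return sum(word_weight(t, signature, query_words) for t in tokens) / len(tokens)

def select_sentences(sentences, signature, query, k=1):
    """Return the k highest-scoring sentences for the given query."""
    query_words = set(tokenize(query))
    ranked = sorted(
        sentences,
        key=lambda s: score_sentence(s, signature, query_words),
        reverse=True,
    )
    return ranked[:k]

# Toy data (hypothetical): the signature set would normally be computed from a corpus.
sentences = [
    "Water spinach is a semi-aquatic tropical plant.",
    "It is grown as a vegetable in Asia.",
    "The weather was pleasant yesterday.",
]
signature = {"spinach", "plant", "vegetable", "tropical"}
query = "What is water spinach?"

top = select_sentences(sentences, signature, query, k=1)
print(top[0])
```

In this sketch the query word "is" also counts toward the score; a real system would usually drop stopwords first, and the binary weights could be replaced by learned ones.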
Graph-based Ranking Algorithms
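A rough sketch of one graph-based ranking scheme, in the spirit of TextRank/LexRank. The slide does not specify an algorithm, so everything below is an assumption: sentences are nodes, edge weights are Jaccard word overlap, and scores come from a fixed number of PageRank-style iterations with damping 0.85. The function names and example sentences are hypothetical.

```python
import re

def token_set(text):
    """Lowercased set of word tokens in a sentence."""
    return set(re.findall(r"[a-z0-9\-]+", text.lower()))

def similarity(a, b):
    """Jaccard overlap between two token sets (an illustrative choice)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def rank_sentences(sentences, damping=0.85, iterations=50):
    """Return one centrality score per sentence via weighted PageRank iterations."""
    n = len(sentences)
    sets = [token_set(s) for s in sentences]
    # Symmetric weight matrix with no self-loops.
    weights = [
        [similarity(sets[i], sets[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on its score in proportion to the edge weight.
            incoming = sum(
                scores[j] * weights[j][i] / out_sums[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores

sentences = [
    "Water spinach is a tropical plant.",
    "Water spinach is grown as a vegetable.",
    "The plant is a vegetable eaten in Asia.",
    "Stock markets fell sharply today.",
]
scores = rank_sentences(sentences)
for sentence, score in sorted(zip(sentences, scores), key=lambda p: -p[1]):
    print(f"{score:.3f}  {sentence}")
```

The off-topic sentence shares no words with the others, so it receives only the baseline (1 - damping) / n score and ranks last; an extractive summary would take the top-ranked sentences.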
Supervised content selection
Data and metrics
Q: How to find the list of commonly used datasets?
A: Look at the recent SOTA papers
Q: How to find the SOTA papers?
A:
CNN/DailyMail
https://huggingface.co/datasets/cnn_dailymail
XSum
Let’s follow the trail…
Evaluating Summaries: ROUGE
A ROUGE example: Q: “What is water spinach?”
Human 1: Water spinach is a green leafy vegetable grown in the tropics.
Human 2: Water spinach is a semi-aquatic tropical plant grown as a vegetable.
Human 3: Water spinach is a commonly eaten leaf vegetable of Asia.
ROUGE-2 = (3 + 3 + 6) / (10 + 10 + 9) = 12/29 = .41
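The arithmetic in the ROUGE example can be checked with a short sketch of recall-oriented ROUGE-2: count the bigrams each human reference shares with the system output, sum those counts, and divide by the total number of bigrams in the references. The system answer does not appear in the text above, so the sentence used here is an assumption, chosen because it reproduces the match counts 3, 3 and 6; the tokenization (lowercasing, dropping periods) and function names are also illustrative.

```python
from collections import Counter

def bigrams(text):
    """Lowercase, drop periods, split on whitespace, and count bigrams."""
    tokens = text.lower().replace(".", "").split()
    return Counter(zip(tokens, tokens[1:]))

def rouge_2(system, references):
    """Return (matched, total, recall) for bigram overlap against all references."""
    sys_bigrams = bigrams(system)
    matched = 0
    total = 0
    for ref in references:
        ref_bigrams = bigrams(ref)
        # Clip each match at the number of times the bigram occurs in the system output.
        matched += sum(min(count, sys_bigrams[bg]) for bg, count in ref_bigrams.items())
        total += sum(ref_bigrams.values())
    return matched, total, matched / total

references = [
    "Water spinach is a green leafy vegetable grown in the tropics.",
    "Water spinach is a semi-aquatic tropical plant grown as a vegetable.",
    "Water spinach is a commonly eaten leaf vegetable of Asia.",
]
# Assumed system answer (not shown in the text above).
system_answer = "Water spinach is a leaf vegetable commonly eaten in tropical areas of Asia."

matched, total, score = rouge_2(system_answer, references)
print(matched, total, round(score, 2))  # 12 29 0.41
```

The denominator counts reference bigrams only, which is what makes this a recall measure: a longer system answer is not penalized here unless a precision or F-score variant is used.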
A neural attention model for abstractive sentence summarization
Rush et al., EMNLP 2015
(Bahdanau, 2014)
Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond
Nallapati et al., CoNLL 2016