1 of 44

Text Summarization

2 of 44

Text Summarization

  • Goal: reduce a text with a computer program to create a summary that retains the most important points of the original text
    • Create a concise representation that retains the most relevant information.
    • What counts as relevant depends on the purpose:
  • Time saving
  • Informing
  • Decision making
  • Orientation (maps)
  • Planning (maps)

3 of 44

Table of contents

4 of 44

Abstracts of papers

5 of 44

Summarize the Web

  • Search engines organize information for accessibility and usefulness
  • Match keywords to queries (words)
  • Richer meaning: refer to entities and objects (people, places, things) in the real world

6 of 44

Knowledge Graph

  • When searching for an entity, instantly provide relevant information about it
  • Provide connections: relations between entities
  • An ever-growing database of structured knowledge
    • 500 million entities
    • 3.5 billion defining attributes and connections

7 of 44

Headline news

8 of 44

TV Guides

9 of 44

Graphical maps

10 of 44

Textual Directions

11 of 44

Questions

  • What kinds of summaries do people want?
    • What are summarizing, abstracting, gisting,...?
  • How sophisticated must summarization systems be?
    • Are statistical techniques sufficient?
    • Or do we need symbolic techniques and deep understanding as well?
  • What milestones would mark quantum leaps in summarization theory and practice?
    • How do we measure summarization quality?

12 of 44

Categories

  • Input
    • Single-Document Summarization (SDS)
    • Multiple-Document Summarization (MDS)
  • Output
    • Extractive
    • Abstractive
    • Compressive
  • Focus
    • Generic
    • Query-focused summarization
    • Domain-specific
  • Machine learning methods:
    • Supervised
    • Unsupervised

13 of 44

What to summarize? Single vs. multiple documents

  • Single-document summarization
    • Given a single document, produce:
      • an abstract
      • an outline
      • a headline
  • Multiple-document summarization
    • Given a group of documents, produce a gist of the content:
      • a series of news stories on the same event
      • a set of web pages about some topic or question

14 of 44

Single-document Summarization

15 of 44

Multiple-document Summarization

16 of 44

Query-focused Summarization & Generic Summarization

  • Generic summarization:
    • Summarize the content of a document
  • Query-focused summarization:
    • Summarize a document with respect to an information need expressed in a user query.
    • A kind of complex question answering:
      • Answer a question by summarizing a document that has the information to construct the answer

17 of 44

Snippets: query-focused summaries

18 of 44

Summarization for Question Answering: Snippets

  • Create snippets summarizing a web page for a query

19 of 44

Summarization for Question Answering: Multiple documents

  • Create answers to complex questions by summarizing multiple documents.
  • Instead of giving a separate snippet for each document, create a cohesive answer that combines information from all of them.

20 of 44

Extractive summarization & Abstractive summarization

  • Extractive summarization:
    • create the summary from phrases or sentences in the source document(s)
  • Abstractive summarization:
    • express the ideas in the source documents using (at least in part) different words

21 of 44

A Summarization Machine

22 of 44

The Modules of the Summarization Machine

23 of 44

Typical 3 Stages of Summarization

1. Topic Identification: find/extract the most important material

2. Topic Interpretation: compress it

3. Summary Generation: say it in your own words

24 of 44

Overview of Topic Extraction Methods

  • General method: score each sentence; combine scores; choose best sentence(s)
  • Scoring techniques:
    • Position in the text: lead method; optimal position policy; title/heading method
      • Claim: Important sentences occur at the beginning (and/or end) of texts.
      • Lead method: just take first sentence(s)!
    • Title-Based Method
      • Claim: Words in titles and headings are positively relevant to summarization
    • Cue phrases in sentences
      • Claim: Important sentences contain ‘bonus phrases’, such as significantly, In this paper we show, and In conclusion, while non-important sentences contain ‘stigma phrases’ such as hardly and impossible
      • Method: Add to sentence score if it contains a bonus phrase, penalize if it contains a stigma phrase
    • Word frequencies throughout the text
      • Claim: Important sentences contain words that occur “somewhat” frequently
      • Method: Increase sentence score for each frequent word.
    • Cohesion: links among words; word co-occurrence; coreference; lexical chains
      • Claim: Important sentences/paragraphs are the highest connected entities in more or less elaborate semantic structures
      • Method: determine relatedness score Si for each paragraph, and extract paragraphs with largest Si scores
    • Discourse structure of the text
      • Claim: The multi-sentence coherence structure of a text can be constructed, and the ‘centrality’ of the textual units in this structure reflects their importance
    • Information Extraction: parsing and analysis
      • Idea: content selection using forms (templates)

25 of 44

Topic Interpretation

  • From extract to abstract; topic interpretation or concept fusion
  • Concept generalization
    • Sue ate apples, pears, and bananas ⇒ Sue ate fruit
  • Meronymy replacement
    • Both wheels, the pedals, saddle, chain… ⇒ the bike
  • Script identification (Schank and Abelson, 77)
    • He sat down, read the menu, ordered, ate, paid, and left ⇒ He ate at the restaurant
  • Metonymy
    • A spokesperson for the US Government announced that… ⇒ Washington announced that...

26 of 44

NL Generation for Summaries

  • Level 1: no separate generation
    • Produce extracts, verbatim from input text.
  • Level 2: simple sentences
    • Assemble portions of extracted clauses together.
  • Level 3: full NLG

1. Sentence Planner: plan sentence content, sentence length, theme, order of constituents, words chosen... (Hovy and Wanner, 96)

2. Surface Realizer: linearize input grammatically (Elhadad, 92; Knight and Hatzivassiloglou, 95).

27 of 44

Unsupervised content selection

  • Intuition dating back to Luhn (1958):
    • Choose sentences that have salient or informative words
  • Two approaches to defining salient words
  • tf-idf: weigh each word w_i in document j by its tf-idf value

  • topic signature: choose a smaller set of salient words
    • mutual information
    • log-likelihood ratio (LLR) Dunning (1993), Lin and Hovy (2000)
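The tf-idf approach above can be sketched as follows, assuming the standard weighting tf_ij × log(N / df_i) over a small tokenized corpus (the exact tf-idf variant is an assumption; the slides do not specify one).

```python
import math
from collections import Counter

def tf_idf_weights(docs):
    """docs: list of token lists, one per document.
    Returns one {word: tf-idf weight} dict per document."""
    n = len(docs)
    df = Counter()                      # document frequency of each word
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)               # term frequency within the document
        weights.append({w: tf[w] * math.log(n / df[w]) for w in tf})
    return weights
```

Words that appear in every document get weight 0 (log 1 = 0), so only words that are distinctive for a document count as salient.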

28 of 44

Topic signature-based content selection with queries

  • choose words that are informative either
    • by log-likelihood ratio (LLR)
    • or by appearing in the query
  • Weigh a sentence (or window) by weight of its words:

(could learn more complex weights)
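The weighting above can be sketched as follows: a word contributes weight 1 if it is a topic-signature word (assumed already selected, e.g. by LLR) or appears in the query, and 0 otherwise, and the sentence score is the average over its words. The 0/1 scheme is the simple stand-in the slides allude to before learning more complex weights.

```python
def sentence_weight(sentence_tokens, signature_words, query_words):
    """Average informativeness of a sentence's words:
    1 for topic-signature or query words, 0 otherwise."""
    informative = set(signature_words) | set(query_words)
    if not sentence_tokens:
        return 0.0
    return sum(1.0 for w in sentence_tokens if w in informative) / len(sentence_tokens)
```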

29 of 44

Graph-based Ranking Algorithms

  • unsupervised sentence extraction
  1. Identify text units that best define the task at hand, and add them as vertices in the graph.
  2. Identify relations that connect such text units, and use these relations to draw edges between vertices in the graph. Edges can be directed or undirected, weighted or unweighted.
  3. Iterate the graph-based ranking algorithm until convergence.
  4. Sort vertices based on their final score. Use the values attached to each vertex for ranking/selection decisions
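The four steps above can be sketched as a TextRank-style sentence ranker. Here the assumed choices are: sentences as vertices, an undirected unweighted edge whenever two sentences share a word, a PageRank-style update with the conventional damping factor 0.85, and a fixed iteration count in place of a convergence test.

```python
def textrank(sentences, d=0.85, iters=50):
    toks = [set(s.lower().split()) for s in sentences]
    n = len(sentences)
    # Steps 1-2: vertices are sentences; edge if two sentences share a word
    nbrs = [[j for j in range(n) if j != i and toks[i] & toks[j]]
            for i in range(n)]
    scores = [1.0] * n
    # Step 3: iterate the ranking (fixed iterations stand in for convergence)
    for _ in range(iters):
        scores = [(1 - d) + d * sum(scores[j] / max(len(nbrs[j]), 1)
                                    for j in nbrs[i])
                  for i in range(n)]
    # Step 4: sort vertices by final score for selection
    return sorted(range(n), key=lambda i: -scores[i])
```

A real implementation would weight edges by sentence similarity (e.g. normalized word overlap) rather than treating any shared word as a full edge.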

30 of 44

Supervised content selection

  • Given:
    • a labeled training set of good summaries for each document
  • Align:
    • the sentences in the document with sentences in the summary
  • Extract features
    • position (first sentence?)
    • length of sentence
    • word informativeness, cue phrases
    • cohesion
  • Train
    • a binary classifier (put sentence in summary? yes or no)
  • Problems:
    • hard to get labeled training
    • alignment difficult
    • performance not better than unsupervised algorithms
  • So in practice:
    • Unsupervised content selection is more common
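The feature-extraction step of the supervised pipeline above can be sketched as follows; the feature set mirrors the bullets (position, length, word informativeness), while the classifier itself (a binary yes/no learner such as logistic regression) is left abstract. All feature names are illustrative.

```python
def features(sentence, index, doc_freq):
    """Feature vector for one sentence, to feed a binary
    in-summary/not-in-summary classifier."""
    words = sentence.lower().split()
    return {
        "is_first": 1.0 if index == 0 else 0.0,   # position: first sentence?
        "length": float(len(words)),               # sentence length
        # word informativeness: average document frequency of its words
        "avg_freq": sum(doc_freq.get(w, 0) for w in words) / max(len(words), 1),
    }
```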

31 of 44

Data and metrics

Q: How to find the list of commonly used datasets?

A: Look at the recent SOTA papers

32 of 44

Data and metrics

Q: How to find the list of commonly used datasets?

A: Look at the recent SOTA papers

Q: How to find the SOTA papers?

A:

  • nlpprogress.com
  • paperswithcode.com/sota
  • connectedpapers.com

33 of 44

Data and metrics

34 of 44

Data and metrics

35 of 44

Data and metrics

36 of 44

CNN/DailyMail

37 of 44

CNN/DailyMail

https://huggingface.co/datasets/cnn_dailymail

39 of 44

XSum

Let’s follow the trail…

40 of 44

XSum

41 of 44

Evaluating Summaries: ROUGE

  • ROUGE: Recall-Oriented Understudy for Gisting Evaluation
  • Intrinsic metric for automatically evaluating summaries
    • Based on BLEU (a metric used for machine translation)
    • Not as good as human evaluation (“Did this answer the user’s question?”)
    • But much more convenient
  • Given a document D, and an automatic summary X:
    • Have N humans produce a set of reference summaries of D
    • Run system, giving automatic summary X
    • What percentage of the bigrams from the reference summaries appear in X?
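The recipe above can be sketched as ROUGE-2 recall: the fraction of bigrams from the reference summaries that also appear in the system summary, pooled over all references. This is a minimal version that assumes pre-tokenized input and uses clipped counts (a reference bigram matches at most as often as it occurs in the system output).

```python
from collections import Counter

def bigrams(tokens):
    return Counter(zip(tokens, tokens[1:]))

def rouge2(system_tokens, reference_token_lists):
    """ROUGE-2 recall: matched reference bigrams / total reference bigrams."""
    sys_bg = bigrams(system_tokens)
    matched = total = 0
    for ref in reference_token_lists:
        ref_bg = bigrams(ref)
        total += sum(ref_bg.values())
        # Clipped overlap: each reference bigram counts at most as often
        # as it occurs in the system summary
        matched += sum(min(c, sys_bg[bg]) for bg, c in ref_bg.items())
    return matched / total if total else 0.0
```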

42 of 44

A ROUGE example: Q: “What is water spinach?”

  • System output: Water spinach is a leaf vegetable commonly eaten in tropical areas of Asia.
  • Human Summaries (Gold)

    • Human 1: Water spinach is a green leafy vegetable grown in the tropics.
    • Human 2: Water spinach is a semi-aquatic tropical plant grown as a vegetable.
    • Human 3: Water spinach is a commonly eaten leaf vegetable of Asia.

  • ROUGE-2 = (3 + 3 + 6) / (10 + 10 + 9) = 12/29 = .41

43 of 44

A neural attention model for abstractive sentence summarization

Rush et al., EMNLP 2015

  • Inspired by attention-based seq2seq models (Bahdanau et al., 2014)

44 of 44

Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond

Nallapati et al., CoNLL 2016

  • Implements many techniques (NMT tricks, copy mechanism, coverage, hierarchical attention, external knowledge)