1 of 23

Summarization Using Text Processing and Pre-trained Models

Fatemeh Rahimi

2 of 23

Summarization

  • Going through a lot of documents to find the most important parts
  • Manually:
    • Time consuming
    • Not practical


3 of 23

Why is Summarization Useful?

  • The amount of data available online is ever-growing.
  • Automatic summarization algorithms can be less biased than human summarizers.


4 of 23

Where can we use Summarization?

  • Researchers
    • Reading related work
  • Medical Health Records
    • Emergency rooms
  • Law firms
    • Summaries of previous court cases
  • And so on ...


5 of 23

Summarization

  • Single-Document Summarization
  • Multi-Document Summarization

6 of 23

Types of Summarization

  • Extractive Summarization
  • Abstractive Summarization

7 of 23

Extractive Summarization

  • Supervised
    • Binary classification
  • Unsupervised
    • Ranking Algorithms
    • Graph based approaches
    • Clustering


8 of 23

Extractive Summarization (Supervised)

A dataset with highlighted sentences

  Sentence                Highlighted
  By the Mid 19th…..      1
  Japan changed int ...   0
  Sentence 3              0
  Sentence 4              1
  ...                     ...
  Sentence n              0
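The table above can be read as a plain list of (sentence, label) pairs, which is the input a binary classifier expects. A minimal sketch, with the truncated sentences kept as-is:

```python
# Toy extractive-summarization dataset: each sentence is labeled
# 1 if it was highlighted (belongs in the summary), else 0.
dataset = [
    ("By the Mid 19th…..", 1),
    ("Japan changed int ...", 0),
    ("Sentence 3", 0),
    ("Sentence 4", 1),
]

# The classifier's job is to predict the label column from the sentence text.
sentences = [s for s, _ in dataset]
labels = [y for _, y in dataset]
```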

9 of 23

Extractive Summarization (Supervised)

We need numeric input for classification, but we have sentences and words.

What is the solution?


10 of 23

Pre-trained models

  • A deep neural network
  • A Transfer learning approach
  • A model that has already learned a good representation of the data
  • Use for another task

  • NLP pretrained models:
    • BERT (a breakthrough)
    • RoBERTa
    • T5 (state-of-the-art)


11 of 23

Pre-trained models (Cont.)

  • Train the neural network on Wikipedia
    • So that the model learns good features and the context of the language
  • And use it for
    • Sentiment analysis
    • Question Answering
    • Summarization
    • Semantic Textual Similarity
    • Information Retrieval
    • etc.


12 of 23

Let’s use pre-trained models for Summarization

Supervised pipeline:

  Sentences within the document → Pre-trained model → Embeddings → ML approach → Highlighted sentences

13 of 23

Extracting Embeddings with pre-trained models

What is Word Embedding?

  • A vector of numbers that represent a word
  • Use as input for Machine Learning approaches

Word Embeddings in BERT:
From the base model: [12 × 768] (12 layers, 768-dimensional vectors)


14 of 23

Extractive Summarization (Supervised)

  • Having Embeddings:
    • Vectors that represent Words
  • Binary classification
    • Deep Neural Network
      • Output layer (Logistic activation function)
    • Machine Learning
      • Ridge regression
      • Random forest


15 of 23

Extractive Summarization (Unsupervised)

(Useful when you also have a topic or question to build the summary around.)

Ranking Algorithms:

  • Cosine Similarity (most basic form)
  • BM25 (old, reliable)
  • DSSM (DNN)
  • Conv-KNRM (new, DNN)
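The most basic ranker on the list, cosine similarity, can be sketched directly on embedding vectors. The vectors and sentence names here are illustrative:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rank sentence embeddings by similarity to a query/topic embedding;
# the top-ranked sentences form the query-focused extractive summary.
query = np.array([1.0, 0.0, 1.0])
sentences = {
    "s1": np.array([1.0, 0.1, 0.9]),   # close to the query
    "s2": np.array([0.0, 1.0, 0.0]),   # unrelated to the query
    "s3": np.array([0.5, 0.5, 0.5]),
}
ranked = sorted(sentences,
                key=lambda s: cosine_similarity(query, sentences[s]),
                reverse=True)
```

BM25, DSSM, and Conv-KNRM slot into the same place as better scoring functions over the query–sentence pair.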


16 of 23

Extractive Summarization (Unsupervised)

Clustering

  • K-means
  • DBSCAN
  • ….
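A minimal k-means sketch over toy sentence embeddings, assuming the usual approach of then picking the sentence nearest each centroid as the summary:

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Minimal k-means: returns a cluster label for each row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each embedding to its nearest centroid.
        dists = np.linalg.norm(X[:, None] - centers[None, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned embeddings.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Two obvious groups of toy 2-D sentence embeddings.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 4.9]])
labels = kmeans(X, k=2)
```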


17 of 23

Extractive Summarization (Unsupervised)

Graph-based

  • Nodes: sentences of the documents
  • Edges: similarity between sentences

Extracting summary:

  • Finding Centroids
  • Clustering the graph


18 of 23


19 of 23

Wrap Up

We Learned:

  • What Multi-document Extractive Summarization is.
  • What pre-trained models are.
  • How to use pre-trained models for Extractive Summarization.


20 of 23

Thanks for listening

Any Questions?


21 of 23

BERT


22 of 23

Evaluation for summarization

ROUGE

  • Stands for Recall-Oriented Understudy for Gisting Evaluation

ROUGE-N

  • Measures unigram, bigram, trigram, and higher-order n-gram overlap between the system summary and the reference summary

e.g., ROUGE-1, ROUGE-2

ROUGE-L

  • Measures the longest matching sequence of words using the longest common subsequence (LCS)
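Both metrics can be sketched in a few lines. This is a simplified recall-only version (the official ROUGE also clips counts and reports precision/F1):

```python
def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def rouge_n_recall(system, reference, n):
    """ROUGE-N recall: fraction of reference n-grams found in the system summary."""
    ref = ngrams(reference.split(), n)
    sys_grams = set(ngrams(system.split(), n))
    return sum(g in sys_grams for g in ref) / len(ref)

def lcs_len(a, b):
    """Length of the longest common subsequence, used by ROUGE-L."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            table[i][j] = table[i-1][j-1] + 1 if x == y else max(table[i-1][j], table[i][j-1])
    return table[len(a)][len(b)]

system = "the cat sat on the mat"
reference = "the cat lay on the mat"
r1 = rouge_n_recall(system, reference, 1)                       # ROUGE-1 recall
rl = lcs_len(system.split(), reference.split()) / len(reference.split())  # ROUGE-L recall
```

Here both scores are 5/6: five of the six reference words appear in the system summary, and the LCS ("the cat on the mat") has length 5.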


23 of 23

DBSCAN
