
Automatic Detection of Entity-Manipulated Text Using Factual Knowledge

Ganesh Jawahar, Muhammad Abdul-Mageed, Laks V.S. Lakshmanan

University of British Columbia, Vancouver, Canada

ganeshjwhr@gmail.com, muhammad.mageed@ubc.ca, laks@cs.ubc.ca


What is this paper about?

Distinguishing a human-written news article from a manipulated news article. See orange panel.

Why is this problem important?

  • Detecting entity-manipulated text helps catch one form of misinformation that is easy and cheap to create.
  • It is an understudied subfield of fake news detection.
  • Existing detectors perform poorly because they over-rely on stylometric signals.

What are the key contributions?

  • A detector that exploits factual knowledge to overcome the limitations of relying only on stylometric signals.
  • An approach for generating a challenging dataset of manipulated news articles using GPT-2.
  • A collection of challenging datasets built by varying the strategy used to generate the replacement entity.
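The GPT-2-based generation approach named in the contributions can be illustrated with a minimal sketch. This is not the authors' implementation: the function `manipulate_article`, its parameters, and the `stub_generator` stand-in for a GPT-2 call are all hypothetical, and the assumption that the text preceding an entity serves as the prompt is ours. The article string is a shortened version of the poster's PubNub example.

```python
from typing import Callable, List, Tuple


def manipulate_article(
    text: str,
    entities: List[str],
    generate_entity: Callable[[str], str],
    num_replacements: int = 1,
) -> Tuple[str, List[Tuple[str, str]]]:
    """Replace up to `num_replacements` entity mentions with generated ones."""
    replaced: List[Tuple[str, str]] = []
    for original in entities:
        if len(replaced) >= num_replacements:
            break
        position = text.find(original)
        if position == -1:
            continue  # entity is not mentioned verbatim; skip it
        # Assumption: the text before the entity acts as the generation prompt.
        prompt = text[:position]
        candidate = generate_entity(prompt).strip()
        # Reject empty outputs and entities already present in the article.
        if not candidate or candidate == original or candidate in text:
            continue
        text = text[:position] + candidate + text[position + len(original):]
        replaced.append((original, candidate))
    return text, replaced


def stub_generator(prompt: str) -> str:
    # Hypothetical stand-in for a GPT-2 continuation of `prompt`.
    return "Samsung"


article = (
    "PubNub has raised $23 million in a series D round of funding from "
    "Hewlett Packard Enterprise (HPE), Relay Ventures, Sapphire Ventures, "
    "and Ericsson."
)
manipulated, swaps = manipulate_article(article, ["Relay Ventures"], stub_generator)
print(manipulated)
print(swaps)
```

Passing the generator as a callable keeps the sketch independent of any particular language-model library; a real GPT-2 wrapper could be substituted for `stub_generator`.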

Problem

Distinguish a human-written from a manipulated news article. We focus only on replacing some entities in a human-written news article with manipulated entities.

Human text:

PubNub, a startup that develops the infrastructure to power key features in real-time applications (...) has raised $23 million in a series D round of funding from Hewlett Packard Enterprise (HPE), Relay Ventures, Sapphire Ventures, Scale Venture Partners, Cisco Investments, Bosch, and Ericsson.

Manipulated text:

PubNub, a startup that develops the infrastructure to power key features in real-time applications (...) has raised $23 million in a series D round of funding from Hewlett Packard Enterprise (HPE), Samsung, Sapphire Ventures, Scale Venture Partners, Cisco Investments, Bosch, and Ericsson.

Our Approach

[Figure: Entity-Manipulated Text Creator. Human text -> Prompt -> GPT-2 -> Generated entity (Samsung) -> Entity replacement -> Manipulated text.]

[Figure: Entity-Manipulated Text Detector. The detector takes human or manipulated text and consults a knowledge base to build an entity-relation graph (labels shown: Samsung, type, Organization, Ericsson, memberOf, FIDO Alliance). Components: RoBERTa and a Graph Convolutional Network. Outputs: Manipulated Article Detector (human text vs. manipulated text) and Manipulated Entity Classifier (Samsung).]

Results

Manipulated Article Detection Accuracy

            GPT-2
Detector    1              2             3
RoBERTa     67.09          74.12         78.79
Ours        65.84 (1.9%)   74.8 (0.9%)   79.05 (0.3%)
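The poster does not define the percentages in parentheses next to "Ours". One plausible reading, stated here as an assumption, is that they are relative differences against the RoBERTa accuracy in the same column. The sketch below recomputes them under that assumption; the helper `relative_difference` is our own name.

```python
def relative_difference(ours: float, baseline: float) -> float:
    """Signed difference of `ours` from `baseline`, as a percentage of `baseline`."""
    return (ours - baseline) / baseline * 100.0


# Accuracies as printed on the poster, keyed by column (1, 2, 3).
roberta = {1: 67.09, 2: 74.12, 3: 78.79}
ours = {1: 65.84, 2: 74.8, 3: 79.05}

diffs = {column: relative_difference(ours[column], roberta[column]) for column in roberta}
for column, diff in diffs.items():
    print(f"column {column}: {diff:+.1f}%")
```

Under this reading the magnitudes match the printed 1.9%, 0.9%, and 0.3%. The sign for column 1 comes out negative, because 65.84 as printed is below the RoBERTa value of 67.09.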

Takeaways

  • A fake news propagator can manipulate exactly one entity to make the detection task harder.
  • Manipulation using GPT-2 keeps the detection task hard even with a large number of replacements.
  • Explicit factual knowledge can complement textual knowledge.
  • There is a lot of room for improvement in detection tasks (>20% accuracy) and entity identification tasks (>75% F-score).
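To make the idea of explicit factual knowledge concrete, here is a minimal sketch of one message-passing step over a small entity-relation graph. It is a simplified stand-in for a graph convolution layer, not the poster's model: it uses plain mean aggregation with self-loops, no learned weights, and ignores relation labels. The grouping of the poster's graph labels into the two triples below is an assumption, and `build_adjacency` and `propagate` are hypothetical helper names.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def build_adjacency(triples: List[Triple]) -> Dict[str, Set[str]]:
    """Undirected adjacency sets with self-loops; relation labels are ignored."""
    adjacency: Dict[str, Set[str]] = {}
    for head, _relation, tail in triples:
        adjacency.setdefault(head, {head}).add(tail)
        adjacency.setdefault(tail, {tail}).add(head)
    return adjacency


def propagate(
    features: Dict[str, List[float]], adjacency: Dict[str, Set[str]]
) -> Dict[str, List[float]]:
    """Replace each node's vector by the mean over itself and its neighbours."""
    updated: Dict[str, List[float]] = {}
    for node, neighbours in adjacency.items():
        dim = len(features[node])
        updated[node] = [
            sum(features[n][d] for n in neighbours) / len(neighbours)
            for d in range(dim)
        ]
    return updated


# Assumed triples, read off the labels in the poster's entity-relation graph.
triples = [
    ("Samsung", "type", "Organization"),
    ("Ericsson", "memberOf", "FIDO Alliance"),
]
adjacency = build_adjacency(triples)

# One-hot input features, with nodes in sorted order.
nodes = sorted(adjacency)
features = {
    node: [1.0 if i == j else 0.0 for j in range(len(nodes))]
    for i, node in enumerate(nodes)
}

hidden = propagate(features, adjacency)
print(hidden["Samsung"])
```

After one step, the vector for Samsung mixes its own feature with that of Organization, which is the sense in which graph neighbours contribute knowledge that the article text alone does not carry. A full graph convolutional network would add learned weight matrices, a nonlinearity, and typically a different normalization.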