Automatic Detection of Entity-Manipulated Text Using Factual Knowledge

Ganesh Jawahar, Muhammad Abdul-Mageed, Laks V.S. Lakshmanan

University of British Columbia, Vancouver, Canada

ganeshjwhr@gmail.com, muhammad.mageed@ubc.ca, laks@cs.ubc.ca

What is this paper about?

Distinguishing a human-written news article from an entity-manipulated news article. See the orange panel.

Why is this problem important?

Detecting entity-manipulated text helps catch one form of misinformation that is easy and cheap to create.
It is an understudied subfield of fake news detection.
Existing detectors do not perform well because they over-rely on stylometric signals.

What are the key contributions?

A detector that exploits factual knowledge to overcome the limitations of relying only on stylometric signals.
An approach for generating a challenging dataset of manipulated news articles using GPT-2.
A collection of challenging datasets built with various strategies for generating the replacement entity.
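
The entity-replacement idea behind the dataset contributions can be illustrated with a minimal sketch. Everything named here is hypothetical: `manipulate`, `pool_strategy`, and the `ReplacementStrategy` type are illustrative stand-ins, and the strategy simply picks from a fixed candidate pool rather than calling GPT-2. The example sentence is a shortened version of the PubNub article shown in the Problem panel.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical strategy type: (current article text, entity to replace) -> new entity.
ReplacementStrategy = Callable[[str, str], str]


def pool_strategy(candidates: List[str], seed: int = 0) -> ReplacementStrategy:
    """Stand-in for a generator such as GPT-2: pick from a fixed candidate pool."""
    rng = random.Random(seed)

    def pick(article: str, entity: str) -> str:
        # Avoid the original entity and anything already mentioned in the article.
        pool = [c for c in candidates if c != entity and c not in article]
        if not pool:
            raise ValueError("no unused replacement candidate available")
        return rng.choice(pool)

    return pick


def manipulate(
    article: str,
    entities: List[str],
    strategy: ReplacementStrategy,
    n_replacements: int = 1,
) -> Tuple[str, List[Tuple[str, str]]]:
    """Replace up to n_replacements of the listed entities; return text and the swaps."""
    text = article
    swaps: List[Tuple[str, str]] = []
    for entity in entities:
        if len(swaps) == n_replacements:
            break
        if entity not in text:
            continue
        new_entity = strategy(text, entity)
        text = text.replace(entity, new_entity)
        swaps.append((entity, new_entity))
    return text, swaps


human = (
    "PubNub has raised $23 million in a series D round of funding from "
    "Hewlett Packard Enterprise (HPE), Relay Ventures, Sapphire Ventures, "
    "Scale Venture Partners, Cisco Investments, Bosch, and Ericsson."
)
manipulated, swaps = manipulate(human, ["Relay Ventures"], pool_strategy(["Samsung"]))
print(swaps)
```

Passing the strategy as a function keeps the replacement logic separate from the choice of generator, so different replacement strategies can be swapped in without touching `manipulate`.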

Problem

Distinguish a human-written news article from a manipulated one. We focus only on replacing some entities in a human-written news article with manipulated entities.

Human text: PubNub, a startup that develops the infrastructure to power key features in real-time applications (...) has raised $23 million in a series D round of funding from Hewlett Packard Enterprise (HPE), Relay Ventures, Sapphire Ventures, Scale Venture Partners, Cisco Investments, Bosch, and Ericsson.

Manipulated text: PubNub, a startup that develops the infrastructure to power key features in real-time applications (...) has raised $23 million in a series D round of funding from Hewlett Packard Enterprise (HPE), Samsung, Sapphire Ventures, Scale Venture Partners, Cisco Investments, Bosch, and Ericsson.

Our Approach

[Figure: Entity-Manipulated Text Creator. Human text -> prompt -> GPT-2 -> generated entity (e.g., Samsung) -> entity replacement -> manipulated text.]

[Figure: Entity-Manipulated Text Detector. The detector consults a knowledge base and labels each input as human text or manipulated text. Panel components: entity-relation graph (e.g., Samsung, type, Organization; Ericsson, memberOf, FIDO Alliance), Graph Convolutional Network, RoBERTa, Manipulated Article Detector, Manipulated Entity Classifier. For the manipulated example above, the outputs shown are the label "Manipulated text" and the entity "Samsung".]

Results

Manipulated Article Detection Accuracy

Number of GPT-2 entity replacements	1	2	3
RoBERTa	67.09	74.12	78.79
Ours	65.84 (1.9%)	74.80 (0.9%)	79.05 (0.3%)
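
The parenthesized percentages in the "Ours" row appear to be relative differences from the RoBERTa row. That reading is an assumption, since the poster does not define them. The short check below recomputes the differences from the table values; the function name `relative_change_pct` is illustrative.

```python
# Accuracy values copied from the table above (columns: 1, 2, 3 replacements).
roberta = [67.09, 74.12, 78.79]
ours = [65.84, 74.80, 79.05]


def relative_change_pct(baseline: float, value: float) -> float:
    """Signed relative change of value with respect to baseline, in percent."""
    return (value - baseline) / baseline * 100.0


changes = [round(relative_change_pct(b, o), 1) for b, o in zip(roberta, ours)]
print(changes)  # [-1.9, 0.9, 0.3]
```

The magnitudes match the table. Under this reading, the one-replacement figure is a decrease relative to RoBERTa, while the two- and three-replacement figures are increases.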

Takeaways

A fake news propagator can manipulate exactly one entity to make the detection task harder.
Manipulation using GPT-2 keeps the detection task hard even for a large number of replacements.
Explicit factual knowledge can complement textual knowledge.
There's a lot of room for improvement in detection tasks (>20% accuracy) and entity identification tasks (>75% F-score).
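
As a concrete illustration of how explicit factual knowledge could be encoded, the sketch below builds a tiny entity-relation graph from the two example triples on the poster and applies one graph-convolution step. The poster names a Graph Convolutional Network but gives no details, so the following are assumptions: the common symmetric-normalized propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W), undirected edges with relation labels ignored, one-hot node features, and hand-picked weights. It is not the authors' implementation.

```python
import math
from typing import Dict, List, Tuple

Matrix = List[List[float]]

# Example triples taken from the poster's entity-relation graph.
triples = [
    ("Samsung", "type", "Organization"),
    ("Ericsson", "memberOf", "FIDO Alliance"),
]


def build_adjacency(facts: List[Tuple[str, str, str]]) -> Tuple[Dict[str, int], Matrix]:
    """Undirected adjacency with self-loops; relation labels are ignored (assumption)."""
    nodes = sorted({n for head, _, tail in facts for n in (head, tail)})
    index = {name: i for i, name in enumerate(nodes)}
    adj = [[0.0] * len(nodes) for _ in nodes]
    for head, _, tail in facts:
        i, j = index[head], index[tail]
        adj[i][j] = adj[j][i] = 1.0
    for i in range(len(nodes)):
        adj[i][i] = 1.0  # self-loop so each node keeps its own features
    return index, adj


def normalize(adj: Matrix) -> Matrix:
    """Symmetric normalization: entry (i, j) divided by sqrt(deg_i * deg_j)."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj_norm: Matrix, features: Matrix, weights: Matrix) -> Matrix:
    """One propagation step followed by ReLU."""
    hidden = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]


index, adj = build_adjacency(triples)
n = len(index)
features = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # one-hot
weights = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [0.0, -2.0]]  # illustrative values
embeddings = gcn_layer(normalize(adj), features, weights)
print(embeddings[index["Samsung"]])
```

After one step, each node's representation mixes its own features with those of its graph neighbor, so Samsung and Organization end up with the same vector in this toy setup. A detector could combine such node representations with a text encoder's output, though how the poster's model does so is not specified here.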