B. Tech. Project
TOWARDS TARGET-AWARE TWITTER STANCE DETECTION
17CS10006: Ayush Kaushal
Prof. Niloy Ganguly (Supervisor)
Introduction
NAACL-HLT 2021
Introduction
Spurious Cues
Target-Aware Stance
Conclusion
Part of the work done in this thesis is going to appear in the main track of NAACL-HLT 2021:
tWT–WT: A Dataset to Assert the Role of Target Entities for Detecting Stance of Tweets
- Ayush Kaushal, Avirup Saha and Niloy Ganguly
Preprint
Abstract
Stance Detection
* Text portion of the Tweet example taken from SemEval 2016 task 6 dataset
Applications of Stance Detection
Analysing Debates
Sentiment Analysis
Detecting Fake News
Verifying Rumours
Stance Detection Systems
Spurious Cues in Datasets
Example:
Visual Question Answering[1]
Q. What is the colour of sky?
Ans. Blue
Cue: Generic truth
Q. Does the man have legs in the air?
Ans. Yes
Cue: Nature of questions annotators ask.
[1] Y. Goyal et al. Making the V in VQA matter: Elevating the role of image understanding in visual question answering. CVPR 2017
Role of Targets in Detecting Stance
* The text portion of the annotated example is taken from WT-WT dataset.
Targets as free-form sentences
* Text portions of Tweet and Target are taken from RumourEval 2017 dataset
Variants of Twitter Stance Detection
> Stance Detection
> Multi-target
> Cross-target
> Rumour Stance
Contributions
> Demonstrating spurious cues in Twitter stance detection datasets.
> Investigating the datasets for the spurious cues.
> Creating new dataset benchmarks for target-aware stance detection.
> Re-evaluating models for target-aware stance detection.
Spurious Cues in Twitter Stance Detection Datasets.
Spurious Cues in Datasets
Overview:
Datasets Considered - 3/6
01 Will-They-Won’t-They (WT-WT)
> Cross Target
> Financial Domain (M&A)
> 50k+ Tweet-target pairs
02 SemEval 2016 Task-6
> Vanilla Stance Detection
> Various Domains - politics, movements, policy
> 4.1k Tweet-target pairs
03 M-T Multitarget
> Multi-target Stance
> Political domain
> 4.4k Tweet-target pairs
Datasets Considered - 6/6
04 RumourEval 2017
> Rumour Stance Detection
> Disaster Domain Threads
> 5.5k Tweet-target pairs
05 RumourEval 2019
> Rumour Stance Detection
> Disaster Domain Threads
> Twitter + Reddit
> 8.5k Tweet-target pairs
06 Encryption Debate
> Vanilla Stance Detection
> Encryption Debate
> 3k Tweet-target pairs
Very few tweets are paired with more than one target.
Dataset | % of tweets with different targets |
WT-WT | 2% |
SemEval16 | 0% |
Rumour2017 | 0% |
Rumour2019 | 0% |
Multi-target | 0.9% |
Encryption | 0% |
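A statistic of this kind can be computed with a few lines of Python. The sketch below assumes the percentage is the share of distinct tweets that occur with at least two different targets; the slide does not spell out the exact definition, and the function name and toy data are hypothetical.

```python
from collections import defaultdict


def fraction_multi_target(pairs):
    """Fraction of distinct tweets that are paired with more than one target.

    `pairs` is an iterable of (tweet, target) tuples.
    """
    targets_by_tweet = defaultdict(set)
    for tweet, target in pairs:
        targets_by_tweet[tweet].add(target)
    if not targets_by_tweet:
        return 0.0
    multi = sum(1 for targets in targets_by_tweet.values() if len(targets) > 1)
    return multi / len(targets_by_tweet)


# Toy data (hypothetical): only "tweet a" appears with two different targets.
toy_pairs = [
    ("tweet a", "target_1"),
    ("tweet a", "target_2"),
    ("tweet b", "target_1"),
    ("tweet c", "target_3"),
    ("tweet d", "target_2"),
]
print(f"{100 * fraction_multi_target(toy_pairs):.1f}% of tweets have several targets")
```

When this fraction is close to zero, the tweet alone almost always determines the label, so a model never has to read the target.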
Obtaining the Datasets
Some of the datasets release only the tweet IDs:
> Scraped using the Twitter API* and Tweepy**
Dataset | Tweets scraped |
WT-WT | 45865 / 50210 |
Multi-target | 2688 / 4413 |
Encryption | 1634 / 2522 |
* developer.twitter.com
** tweepy.org
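As a quick check on how much of each dataset could still be retrieved, the counts in the table above can be turned into retrieval rates. This is only arithmetic on the reported numbers; the dictionary and function names are hypothetical.

```python
# (tweets retrieved, tweet IDs released), copied from the table above.
scraped = {
    "WT-WT": (45865, 50210),
    "Multi-target": (2688, 4413),
    "Encryption": (1634, 2522),
}


def retrieval_rate(retrieved, released):
    """Share of released tweet IDs whose text could still be fetched."""
    return retrieved / released


for name, (retrieved, released) in scraped.items():
    print(f"{name}: {100 * retrieval_rate(retrieved, released):.1f}% retrieved")
```

Roughly 91% of WT-WT is recovered, against about 61% of Multi-target and 65% of Encryption, so results on the latter two are computed on noticeably smaller subsets than originally released.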
Preprocessing
* Libraries used - ekphrasis, nltk
Setting up the Experiments
> Each datapoint is a tuple: (Tweet, Target, Stance).
> A target-oblivious model classifies using only the tweet.
> A target-aware model receives both the tweet and the target as input.
> Target-aware models should significantly outperform target-oblivious models.
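The difference between the two settings can be illustrated by how the input string is built. The sketch below uses BERT's `[CLS]` and `[SEP]` special tokens in the usual single-sentence and sentence-pair layouts; placing the tweet before the target is an assumption, as the slides do not state the segment order, and `build_input` is a hypothetical helper.

```python
def build_input(tweet, target=None):
    """Build a BERT-style input string.

    With target=None the model sees only the tweet (target-oblivious view);
    otherwise it sees the tweet and the target as a pair (target-aware view).
    The tweet-first ordering is an assumption made for this sketch.
    """
    if target is None:
        return f"[CLS] {tweet} [SEP]"
    return f"[CLS] {tweet} [SEP] {target} [SEP]"


# Two datapoints sharing one tweet but differing in target (toy examples).
datapoints = [
    ("company A to buy company B", "A acquires B", "support"),
    ("company A to buy company B", "C acquires D", "unrelated"),
]

oblivious = [build_input(tweet) for tweet, _target, _stance in datapoints]
aware = [build_input(tweet, target) for tweet, target, _stance in datapoints]

# The oblivious inputs are identical, so one of the two labels must be missed.
print(oblivious[0] == oblivious[1])
print(aware[0] == aware[1])
```

A target-oblivious model cannot separate the two toy datapoints, which is why it is expected to lose to a target-aware model whenever the same tweet carries different stances towards different targets.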
Images shown in this slide have been taken from Shutterstock.
Target Aware Bert Model
This picture of Bert is taken from Sesame Street show after which Bert has been named.
Target Oblivious Bert Model
Domain-Specificity: Twitter
Evaluation Metrics
01 Accuracy
> Fraction of labels correctly predicted.
02 Weighted Average F1
> Average of the per-class F1 scores, with weights proportional to the number of examples in each class.
03 Macro Averaged F1
> F1 score is the harmonic mean of precision and recall.
> Macro F1 is a simple average of F1 across all the classes.
04 Human upper bound
> Used for comparison purposes only.
> Provided for some datasets.
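The three automatic metrics can be written out directly from their definitions. The following is a minimal pure-Python sketch, not the evaluation code used for the reported results; the function names and toy labels are hypothetical.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of labels correctly predicted."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def per_class_f1(gold, pred):
    """F1 (harmonic mean of precision and recall) for every class."""
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Simple average of F1 across all classes."""
    scores = per_class_f1(gold, pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(gold, pred):
    """Average of F1 weighted by the number of gold examples per class."""
    scores = per_class_f1(gold, pred)
    support = Counter(gold)
    return sum(scores[c] * support[c] for c in scores) / len(gold)


gold = ["support", "support", "refute", "comment"]
pred = ["support", "refute", "refute", "comment"]
print(accuracy(gold, pred), macro_f1(gold, pred), weighted_f1(gold, pred))
```

Macro F1 treats rare and frequent classes alike, so it is the more informative number on datasets with skewed class distributions, whereas accuracy and weighted F1 are dominated by the largest classes.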
Results (part 1): WT-WT Dataset
Observations:
> Target-oblivious Bert performs near or above human bounds.
> Considering targets yields only small performance gains.
Results (part 2): WT-WT Dataset
Similar Observations:
> Target oblivious Bert performs near human bounds.
> Out-of-Domain (OOD): Massive performance drop.
Results (part 3): SE16 and M-T Datasets
> Target-oblivious Bert consistently achieves more than ⅔ accuracy.
> It performs well on all metrics, very close to the target-aware model.
Results (part 4)
Skewed class distributions:
Target Oblivious:
> Above ⅔ accuracy score
> Impressive Macro-F1
> Performs near target aware
Visualizing the Results
This plot was drawn using Matplotlib and Seaborn Libraries.
Dataset Analysis:
The Image shown in this slide is taken from VanillaLaw
Dataset Analysis: Lexical Choice
[1] Gururangan et al. Annotation artifacts in natural language inference data. NAACL 2018
Top 5 words for each stance class according to PMI, along with the percentage of tweets in that stance class that contain the word.
Support | % | Refute | % | Comment | % | Unrelated | % |
approves | 3.3% | urges | 3.0% | ceo | 3.7% | stocks | 3.4% |
approve | 5.1% | blocked | 5.5% | healthcare | 11.8% | size | 2.6% |
billion | 26.2% | sues | 4.3% | mean | 2.3% | merge | 11.3% |
shareholder | 0.7% | blocks | 4.8% | merger | 29.3% | bid | 19.0% |
close | 6.4% | block | 21.8% | trial | 3.4% | agreement | 16.7% |
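A table like the one above can be produced by scoring every (word, stance) pair with pointwise mutual information and reporting how often the word occurs within the class. The sketch below counts each word once per tweet and applies no smoothing, which real data would likely need for rare words; these choices, the function names and the toy tweets are assumptions and not the procedure behind the reported numbers.

```python
import math
from collections import Counter


def pmi_scores(examples):
    """PMI(word, stance) = log( p(word, stance) / (p(word) * p(stance)) ).

    `examples` is a list of (tokens, stance); counts are per tweet.
    """
    n = len(examples)
    class_count, word_count, joint_count = Counter(), Counter(), Counter()
    for tokens, stance in examples:
        class_count[stance] += 1
        for word in set(tokens):
            word_count[word] += 1
            joint_count[(word, stance)] += 1
    return {
        (word, stance): math.log(joint * n / (word_count[word] * class_count[stance]))
        for (word, stance), joint in joint_count.items()
    }


def coverage(examples, word, stance):
    """Fraction of tweets with the given stance that contain the word."""
    in_class = [tokens for tokens, s in examples if s == stance]
    return sum(word in tokens for tokens in in_class) / len(in_class)


def top_words(examples, stance, k=5):
    """The k words with the highest PMI for one stance class."""
    scores = pmi_scores(examples)
    ranked = sorted(
        ((w, v) for (w, s), v in scores.items() if s == stance),
        key=lambda item: (-item[1], item[0]),
    )
    return [w for w, _ in ranked[:k]]


toy = [
    (["board", "approves", "deal"], "support"),
    (["deal", "approves"], "support"),
    (["judge", "blocks", "deal"], "refute"),
    (["deal", "talk"], "comment"),
]
print(top_words(toy, "support", k=2), coverage(toy, "approves", "support"))
```

A word with high PMI and high coverage is a strong spurious cue: it lets a classifier guess the stance without looking at the target at all.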
Dataset Analysis: Sentiment and Stance
Class | Sentiment |
Support | 0.23 |
Refute | 0.64 |
Comment | 0.49 |
Unrelated | 0.48 |
[1] Yang et al. XLNet: Generalized autoregressive pretraining for language understanding. NeurIPS 2019
Dataset Analysis: Length Correlation
Much weaker correlation than reported in previous work [1].
[1] Gururangan et al. Annotation artifacts in natural language inference data. NAACL 2018
Towards Target Aware Twitter Stance Detection
Target Aware Stance Detection
Overview
Dataset Creation Method
Augmenting procedure - part 1
Result: nearly the same sentiment score for each class.
Class | Sentiment |
Support | 0.44 |
Refute | 0.44 |
Comment | 0.49 |
Unrelated | 0.48 |
Augmenting procedure - part 2 and 3
Targeted WT–WT Dataset Statistics
Maximum Accuracy of Target Oblivious Classifiers
Theorem: The maximum possible accuracy of any deterministic target-oblivious stance classifier is:
where:
Dataset | Maximum target-oblivious accuracy |
Targeted WT-WT | 0.722 |
Targeted SE16 | 0.551 |
Targeted M-T | 0.506 |
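The formula of the theorem did not survive on this slide, so the following is only a sketch of one natural way to obtain such a bound, not necessarily the exact statement. It assumes that a deterministic target-oblivious classifier must give the same label to every occurrence of a tweet, so the best it can do is predict, for each distinct tweet, the most frequent gold stance among the pairs that tweet appears in. The function name and toy data are hypothetical.

```python
from collections import Counter, defaultdict


def max_target_oblivious_accuracy(examples):
    """Upper bound on accuracy when the prediction depends on the tweet only.

    `examples` is a list of (tweet, target, stance) tuples. For each distinct
    tweet, at most the count of its majority stance can be predicted correctly.
    """
    stances_by_tweet = defaultdict(Counter)
    for tweet, _target, stance in examples:
        stances_by_tweet[tweet][stance] += 1
    best_correct = sum(max(counts.values()) for counts in stances_by_tweet.values())
    return best_correct / len(examples)


# Toy data: "tweet a" has three targets with two different gold stances.
toy = [
    ("tweet a", "target_1", "support"),
    ("tweet a", "target_2", "unrelated"),
    ("tweet a", "target_3", "unrelated"),
    ("tweet b", "target_1", "refute"),
]
print(max_target_oblivious_accuracy(toy))
```

On a dataset where every tweet occurs with a single target the bound is 1.0, so values well below 1.0, as in the table above, indicate that ignoring the target is guaranteed to cost accuracy.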
Experiments with Targeted datasets
Baselines:
Metrics:
SiamNet + Bert
TAN + Bert
Experiments with Targeted WT–WT (Part-1)
Observations:
Experiments with Targeted WT–WT (Part-2)
Observations:
Experiments (Part-3)
Observations:
> Target Oblivious Bert performs poorly.
> Target Aware Bert performs the best.
> SiamNet comes very close to Target Aware Bert.
> TAN performs very poorly.
Experiments Overview
Conclusions and Future Work
Conclusions & Future Work
Conclusions:
Future Work:
Future Work: Visualization
Target Aware trained on Targeted-WTWT
Future Work: Visualization
Target Aware Bert trained on WTWT
Code and Trained models
The pictures for Octocat, Pytorch logo and Huggingface logo are taken from their respective GitHub organizations.
The leaderboard website is inspired by Squad, HotPotQA and HoVer dataset leaderboards.
Thank you
Slide Template partial credit: SlidesCarnival