papers on hate-speech
Columns: The article; Language; Data; Data's categories; Corpus available?; Method; Results; Year; Hate speech definition
Article: Right-wing German Hate Speech on Twitter: Analysis and Automatic Detection
Language: German
Data: 55k tweets from 100+ right-wing users (user-based: all tweets from each selected user are taken)
Data's categories: Binary (hate speech / not hate speech)
Corpus available?: Yes. Subset of 20 000 (Polly Corpus)
Method: (Word-bias + word co-occurrence) Features: character trigrams + word lexicon + word bigrams. Model: single-layer averaged perceptron
Results: F1-score: 84.09 (typo in the paper), Precision: 84.21, Recall: 83.97
Year: May 2018
Hate speech definition: INCITEMENT (CRIME), INCITEMENT (MASSES), INSULT, SLANDER, SLANDER (politician), FAKE NEWS, INTIMIDATION
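The F1-score in this entry is flagged as a typo in the paper, so a quick consistency check against the listed precision and recall may be useful. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the function name `f1_score` is ours, not from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as listed in the entry above (percent).
reported_precision = 84.21
reported_recall = 83.97

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))  # 84.09
```

Under that assumption, the value 84.09 listed here is consistent with the listed precision and recall.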
Article: Abusive Language Detection in Online User Content
Language: English
Data: Primary data set: Finance comments: 759k (53k abusive); News data: 1390k (228k abusive). Temporal data set: Finance: 448k (15k abusive); News: 726k (70k)
Data's categories: (apparently binary?) Abusive speech is divided into "Hate speech", "Derogatory" and "Profanity", but no information about the accuracy of categorical detection was given
Corpus available?: Yes. A subset is available (hyperlinked in the original sheet)
Method: Features: n-grams (3-5), linguistic features (length, punctuation, URLs, etc.), syntactic features, word embeddings or word2vec. Model: Vowpal Wabbit's regression model
Results: Primary data set, F-score: 0.795 in Finance, 0.817 in News
Year: April 2016
Hate speech definition: "language which attacks or demeans a group based on race, ethnic origin, religion, disability, gender, age, disability, or sexual orientation/gender identity"
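To make the "n-grams (3-5)" feature in this entry concrete, here is a minimal sketch of n-gram counting. The entry does not say whether these are character or token n-grams; the sketch assumes character n-grams of lengths 3 to 5, and the function name `char_ngrams` and the lowercasing and whitespace normalisation are our own illustrative choices, not the paper's pipeline.

```python
from collections import Counter


def char_ngrams(text, n_min=3, n_max=5):
    """Count character n-grams of every length from n_min to n_max."""
    # Assumption: lowercase and collapse runs of whitespace before counting.
    normalized = " ".join(text.lower().split())
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for start in range(len(normalized) - n + 1):
            counts[normalized[start:start + n]] += 1
    return counts


features = char_ngrams("Some user comment")
print(features.most_common(5))
```

Counts like these would typically be turned into a sparse feature vector before being passed to a linear model.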
Article: Predictive Embeddings for Hate Speech Detection on Twitter
Language: English
Data: Sexist/Racist dataset: 15k (3k sexist, 2k racist); Hate dataset: 24k (1k hate speech); Harassment dataset: 20k (5k harassment)
Data's categories: binary
Corpus available?: Three corpora, all available (linked in the comment)
Method: Model: a modified Simple Word Embeddings-based model
Year: September 2018
Article: Automated Hate Speech Detection and the Problem of Offensive Language
Language: English
Data: 25k tweets in three categories: hate speech, offensive, neither
Data's categories: multi-category (hate speech, offensive, neither)
Corpus available?: No
Method: Model: SVM + logistic regression + L2 regularization
Year: March 2017
Article: Reliability of Hate Speech Annotations: The Case of the European Refugee Crisis
Language: German
Data: 10 hashtags were used to collect tweets, and 13 766 tweets were collected
Corpus available?: Yes
Results: "Overall, agreement was very low, ranging from α = .18 to .29. In contrast, for the purpose of content analysis, Krippendorff recommends a minimum of α = .80, or a minimum of .66 for applications where some uncertainty is unproblematic (Krippendorff, 2004). Reliability did not consistently increase when participants were shown a definition."
Year: January 2017
Hate speech definition: "Warner and Hirschberg (2012) define hate speech as “abusive speech targeting specific group characteristics, such as ethnic origin, religion, gender, or sexual orientation”. More recent approaches rely on lists of guidelines such as a tweet being hate speech if it “uses a sexist or racial slur” (Waseem and Hovy, 2016). These approaches are similar in that they leave plenty of room for personal interpretation, since there may be differences in what is considered offensive. For instance, while the utterance “the refugees will live off our money” is clearly generalising and maybe unfair, it is unclear if this is already hate speech."
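The α values quoted in this entry are Krippendorff's alpha. As a rough illustration of how such a figure is obtained, the sketch below computes alpha for nominal labels from a coincidence matrix. It is a minimal sketch, not the paper's procedure: the function name `krippendorff_alpha_nominal`, the input layout (one list of labels per annotated item) and the toy annotations are all hypothetical.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    units: one list of labels per annotated item; items with fewer
    than two labels cannot be paired and are skipped.
    """
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue
        # Each ordered pair of labels within an item adds 1 / (m - 1).
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1) if n > 1 else 0.0
    if expected == 0:
        raise ValueError("alpha is undefined when only one label is used")
    return 1.0 - observed / expected


# Hypothetical toy data: four tweets, two annotators each.
toy = [["hate", "hate"], ["hate", "none"], ["none", "none"], ["none", "none"]]
alpha = krippendorff_alpha_nominal(toy)
print(round(alpha, 3))  # 0.533
print(alpha >= 0.80, alpha >= 0.66)  # False False
```

The last line compares the result with the .80 and .66 thresholds quoted above.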
Article: Automatic detection of verbal aggression for Russian and American imageboards
Language: English, Russian
Data: English: 654,047 4chan.org messages; Russian: 1,148,692 2ch.hk messages
Data's categories: binary
Corpus available?: No
Results: English: 88.40%, Russian: 59.13% (accuracy)
Year: December 2015
Article: Detecting Online Hate Speech Using Context Aware Models
Language: English
Year: 1 May 2018
Article: SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter
Language: Spanish, English
Article: A Survey on Hate Speech Detection using Natural Language Processing
Article: Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter
Article: Data Augmentation and Deep Learning for Hate Speech Detection
Article: STUFIIT at SemEval-2019 Task 5: Multilingual Hate Speech Detection on Twitter with MUSE and ELMo Embeddings
Language: Spanish, English
Hate speech definition: Hate Speech (HS) is commonly defined as any communication that disparages a person or a group on the basis of some characteristic such as race, color, ethnicity, gender, sexual orientation, nationality, religion, or other characteristics
Note: just as in the previous work, agreement on what counts as hate speech or offensive language is very low (among human annotators too).