The Future of AI in Online Safety
Roy Ka-Wei Lee, Assistant Professor
Singapore University of Technology and Design (SUTD)
TikTok
13 Dec 2024
WARNING: The following talk contains acts of violence and discrimination that may be disturbing to some participants. Discretion is advised.
Wide Spectrum of Online Harm
Online Harm
Hate Speech
Deepfakes
Fake News
Cyberbullying
Sexual Harassment
Dangerous viral challenges
False Rumors
Many other harms…
What is Hate Speech?
The United Nations defines hate speech as “any kind of communication in speech, writing or behaviour, that attacks or uses pejorative or discriminatory language with reference to a person or a group on the basis of who they are, in other words, based on their religion, ethnicity, nationality, race, colour, descent, gender or other identity factor.”
Hate Speech Works from Social AI Studio
DeepHate - Deep learning model that uses multi-faceted information (semantics, sentiment, topics) for hate speech detection.
AngryBERT - Multi-task learning framework that enables joint learning of target and emotion for hate speech detection.
HateGAN - Generative adversarial network that generates hateful social media posts for data augmentation.
HEAR - Deep recursive network for early prediction of hate speech propagation.
DisMultiHate - Disentangles target entities in hateful memes to improve classification and explainability.
Explainable Hateful Meme - Performs visual-text slur grounding to understand hateful memes.
SGHateCheck
SGHateCheck: Functional Tests for Detecting Hate Speech in Low-Resource Languages of Singapore [WOAH’24]
Benchmarking Models on SGHateCheck
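To illustrate what benchmarking a model on functional tests can look like, here is a minimal sketch. It is not SGHateCheck's actual code or data: the functionality names, the `<SLUR>` placeholder token, the test sentences, and the toy `keyword_classifier` are all hypothetical, chosen only to show how per-functionality accuracy can expose a weakness that a single overall score would hide.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class FunctionalCase:
    functionality: str  # hypothetical functionality name, not from SGHateCheck
    text: str           # placeholder text; "<SLUR>" stands in for a real term
    expected: str       # gold label: "hateful" or "non-hateful"


def accuracy_by_functionality(
    cases: List[FunctionalCase],
    classify: Callable[[str], str],
) -> Dict[str, float]:
    """Return the fraction of correctly labelled cases for each functionality."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for case in cases:
        total[case.functionality] += 1
        if classify(case.text) == case.expected:
            correct[case.functionality] += 1
    return {name: correct[name] / total[name] for name in total}


def keyword_classifier(text: str) -> str:
    # Toy stand-in for a real model: flags any text containing the placeholder.
    return "hateful" if "<SLUR>" in text else "non-hateful"


# Hypothetical test suite: two cases per functionality.
CASES = [
    FunctionalCase("slur_usage", "You are a <SLUR>.", "hateful"),
    FunctionalCase("slur_usage", "Typical <SLUR> behaviour.", "hateful"),
    FunctionalCase("quoted_in_counter_speech",
                   'Calling someone a "<SLUR>" is never acceptable.',
                   "non-hateful"),
    FunctionalCase("quoted_in_counter_speech",
                   'He wrote "<SLUR>" and I reported him.',
                   "non-hateful"),
]

SCORES = accuracy_by_functionality(CASES, keyword_classifier)
for name, score in SCORES.items():
    print(f"{name}: {score:.2f}")
```

In this toy setup the keyword-based classifier handles direct slur usage but fails every counter-speech case, which is the kind of targeted failure a functional test suite is designed to surface.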
Hateful Meme Detection
Demystifying Hateful Content: Leveraging Large Multimodal Models for Hateful Meme Detection with Explainable Decisions [ICWSM’24]
MultiHateClip
MultiHateClip: A Multilingual Benchmark Dataset for Hateful Video Detection on YouTube and Bilibili [MM’24]
| Platform | Hateful | Offensive | Normal | Total |
| YouTube | 128 | 194 | 678 | 1000 |
| Bilibili | 82 | 256 | 622 | 1000 |
Open Issues and Research Opportunities
Collaborate to do good
Governments
Platforms
NGOs
Academics
Online Safety & Trust
Thank You
Roy Ka-Wei Lee | Assistant Professor
Singapore University of Technology and Design
User Profiling in Multiple Social Media
Online Safety & Cyber Abuse Research
Social Natural Language Generation
Social Recommender System