Auto-translated vs. Post-edited
Capability Assessment
Can the model understand and identify dialects?
Datasets: Belebele, QADI, ADI
Dialect Identification
Cultural Value
Can we rely on translation?
……
World Knowledge
Common Sense Reasoning
Reading Comprehension
Misinformation
Datasets: …
Cognitive Abilities
World Knowledge
Common Sense Reasoning
Reading Comprehension
Misinformation
AraDiCE Datasets
ArabicMMLU
PIQA, OBQA, Winogrande
BoolQ, Belebele
TruthfulQA
Language and Dialects
MSA
LEV
EGY
Regional Arabic culture
Dialect Understanding and Generation
Datasets
Dialect Identification
Generation
QADI, ADI, ADD
MADAR, DA Generation
Cultural Understanding
Capabilities and datasets
GLF
AraDiCE-Culture
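The capability-to-dataset layout above can be written down as a small lookup table. This is only a sketch: the pairing of each capability with its datasets is an assumption read off the parallel ordering of the two lists on the slide, and the names `ARADICE_TASKS` and `capabilities_for` are hypothetical, not part of any released AraDiCE tooling.

```python
# Assumed pairing of capabilities and datasets, inferred from the slide's
# parallel list order; not an official AraDiCE configuration.
ARADICE_TASKS = {
    "World Knowledge": ["ArabicMMLU"],
    "Common Sense Reasoning": ["PIQA", "OBQA", "Winogrande"],
    "Reading Comprehension": ["BoolQ", "Belebele"],
    "Misinformation": ["TruthfulQA"],
    "Dialect Identification": ["QADI", "ADI", "ADD"],
    "Generation": ["MADAR", "DA Generation"],
    "Cultural Understanding": ["AraDiCE-Culture"],
}


def capabilities_for(dataset: str) -> list[str]:
    """Return every capability whose dataset list contains `dataset`."""
    return [cap for cap, names in ARADICE_TASKS.items() if dataset in names]


if __name__ == "__main__":
    for capability, datasets in ARADICE_TASKS.items():
        print(f"{capability}: {', '.join(datasets)}")
```

A reverse lookup such as `capabilities_for("Belebele")` makes it easy to check which capability a given dataset is meant to probe under this assumed grouping.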
Evaluation
Grounded Situations (HellaSwag)
NLU (Standard NLP Tasks)
Capabilities/Tasks/Datasets
World Knowledge
Common Sense Reasoning, Morality
Reading Comprehension
Misinformation, Factuality and Bias
Summarization
MSA, Dialects: Levantine, Gulf, North-African, Egyptian (covers ~18 dialects)
Sarcasm
Offensive
Dialect Identification
Natural Language Inference
Factuality (TruthfulQA, AraFact)
Stereotype
Bias
MSA
Mesopotamian
North Levantine
Najdi
Moroccan
Egyptian
MMLU (57 subcategories/subjects)
ArabicMMLU (40 subcategories)
Belebele dataset
MSA and dialects
Exams
Ethics (Morality): Justice, well-being, …
Information Seeking (SituatedQA)
Physical common sense (PIQA)
Elementary school science facts (OpenBookQA)
Grade-school science questions
Natural science questions
Question Answering