Forum for Information Retrieval and Evaluation:
SARCASM DETECTION IN CODE-MIXED DRAVIDIAN LANGUAGES
Dhanya Krishnan
Krithika Dharanikota
Dr. B. Bharathi
TABLE OF CONTENTS
01
INTRODUCTION
A brief look into sentiment analysis using ML
02
METHODOLOGIES
The models and methods used in this study
03
RESULTS
Results and metrics achieved
04
INFERENCE
Our inferences and scope for future studies
INTRODUCTION
01
SENTIMENT ANALYSIS
CHALLENGES IN IDENTIFYING SARCASM
While sentiment analysis works well with directly expressed emotions, sarcasm detection is not easy:
DATASET STATS
SARCASM IN TAMIL
Tamil has 19866 non-sarcastic comments and 7170 sarcastic comments (26.5% sarcastic)
SARCASM IN MALAYALAM
Malayalam has 9798 non-sarcastic comments and 2259 sarcastic comments (18.7% sarcastic)
The imbalance between sarcastic and non-sarcastic comments is highlighted in the graph.
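The class imbalance can be checked directly from the comment counts quoted above. The following is a minimal sketch; the dictionary layout and the helper names `sarcastic_share` and `imbalance_ratio` are illustrative assumptions, not part of the study's code.

```python
# Comment counts as quoted on the dataset slides.
counts = {
    "Tamil": {"non_sarcastic": 19866, "sarcastic": 7170},
    "Malayalam": {"non_sarcastic": 9798, "sarcastic": 2259},
}


def sarcastic_share(c):
    """Percentage of comments labelled sarcastic."""
    total = c["non_sarcastic"] + c["sarcastic"]
    return 100 * c["sarcastic"] / total


def imbalance_ratio(c):
    """Number of non-sarcastic comments per sarcastic comment."""
    return c["non_sarcastic"] / c["sarcastic"]


for language, c in counts.items():
    print(
        f"{language}: {sarcastic_share(c):.1f}% sarcastic, "
        f"{imbalance_ratio(c):.2f} non-sarcastic comments per sarcastic one"
    )
```

A ratio well above 1 in both languages suggests that plain accuracy may be a misleading metric here, since a model that always predicts "non-sarcastic" would already score highly.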
METHODOLOGIES
02
OUR TOP PERFORMING MODELS
MALAYALAM
CountVectorizer with a Multilayer Perceptron classifier
TAMIL
TF-IDF vectorizer with a Multilayer Perceptron classifier
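The two vectorizer and classifier combinations can be sketched with scikit-learn, assuming that is the library behind the `CountVectorizer` and TF-IDF names on this slide. The pairing (counts for Malayalam, TF-IDF for Tamil) follows the slide layout as read here. The helper name `build_pipeline`, the hyperparameters, and the toy comments are all hypothetical placeholders, not the study's settings or data.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline


def build_pipeline(language):
    """Return a vectorizer + Multilayer Perceptron pipeline for one language."""
    # Assumed pairing: TF-IDF features for Tamil, raw counts for Malayalam.
    if language == "tamil":
        vectorizer = TfidfVectorizer()
    else:
        vectorizer = CountVectorizer()
    # Illustrative hyperparameters only; the study's values are not given.
    mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
    return make_pipeline(vectorizer, mlp)


# Made-up placeholder comments standing in for real code-mixed text.
train_texts = [
    "super movie semma acting",
    "wow what a great flop",
    "nice songs good story",
    "oh sure best movie ever lol",
]
train_labels = ["Non-sarcastic", "Sarcastic", "Non-sarcastic", "Sarcastic"]

model = build_pipeline("tamil")
model.fit(train_texts, train_labels)
predictions = model.predict(["wow super flop", "nice movie"])
print(list(predictions))
```

Wrapping the vectorizer and classifier in one pipeline keeps the vocabulary fitted on training data only, which avoids leaking test-set tokens into the features.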
Why TF-IDF Vectorizer?
Why CountVectorizer?
Why Multilayer Perceptron?
OUR RESULTS
03
Successful models for each language:
TAMIL
MALAYALAM
Metric values obtained:
Our results
INFERENCES
04
Our study contribution
Future Scope
REFERENCES