A Hybrid Computational Framework Using NLP and ML for Emotion Analysis in Sinhala Songs
Malinda Peiris
School of Computing
Informatics Institute of Technology
Deshan Sumanathilaka
School of Mathematics and Computer Science
Swansea University
Presenting Author : Deshan Sumanathilaka
INTRODUCTION
LITERATURE REVIEW
Paper | Findings |
Lakshitha and Jayaratne 2016 | Supervised learning with several classifier algorithms, combined with melody analysis, for Sinhala songs. |
Abeyratne and Jayarathna 2019 | Machine learning techniques to predict emotional labels. |
Jenarthanan, Senarath and Thayasivam 2019 | Created an emotion-annotated corpus from tweets to support emotion analysis in the Tamil and Sinhala NLP communities. |
Yang 2008 | Trained two independent regression models (regressors) to predict valence and arousal. |
Mihalcea and Strapparava 2012 | Used Ekman's six core emotions to classify emotions in 100 popular songs with a bag-of-words method. |
RESEARCH GAP
RESEARCH OBJECTIVES
EMOTIONAL MODEL
Russell’s two-dimensional model (Russell, 1980) is a well-known model that connects cognitive science and psychology. In this model, valence represents how pleasant (positive) or unpleasant (negative) an emotion is, and arousal represents its intensity, or level of excitement.
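The two dimensions can be read as a simple quadrant lookup. The sketch below is illustrative only: it assumes valence and arousal are numeric scores centred on 0, and the function name `quadrant_label` and the two-letter codes (H/L for arousal, P/N for valence) are hypothetical conventions, not the authors' implementation.

```python
def quadrant_label(valence: float, arousal: float) -> str:
    """Map a (valence, arousal) point to a two-letter quadrant code.

    Assumption: both scores are centred on 0, so the sign alone
    decides which half of each axis the point falls in.
    """
    arousal_code = "H" if arousal >= 0 else "L"  # High / Low arousal
    valence_code = "P" if valence >= 0 else "N"  # Positive / Negative valence
    return arousal_code + valence_code


print(quadrant_label(0.7, 0.6))    # high arousal, positive valence
print(quadrant_label(-0.5, -0.4))  # low arousal, negative valence
print(quadrant_label(0.4, -0.6))   # low arousal, positive valence
```

In practice the threshold need not be 0; it depends on how the valence and arousal scores were scaled.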
DATASET
The dataset was taken from the research conducted by Abeyratne and Jayarathna (2019).
The first 90 seconds of each song were extracted from its music file, and the audio was stored in the commonly used WAV format.
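A minimal sketch of the 90-second extraction step, using only Python's standard `wave` module. It assumes the input is already an uncompressed PCM WAV file; the function name `trim_wav` and its parameters are hypothetical, and the tooling actually used in the study is not stated.

```python
import wave


def trim_wav(src_path: str, dst_path: str, seconds: float = 90.0) -> int:
    """Copy the first `seconds` of a PCM WAV file to dst_path.

    Returns the number of audio frames written. Files shorter than
    `seconds` are copied in full.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        # Number of frames in the requested window, capped at the file length.
        n_frames = min(src.getnframes(), int(seconds * src.getframerate()))
        frames = src.readframes(n_frames)

    with wave.open(dst_path, "wb") as dst:
        # Reuse channel count, sample width and sample rate from the source;
        # the frame count in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)

    return n_frames
```

A call such as `trim_wav("song.wav", "song_90s.wav")` (file names are placeholders) would write the clipped copy next to the original.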
Description | Label | Number of Songs |
High Arousal and Positive Valence | HP - Happy | 55 |
Low Arousal and Negative Valence | LN - Sad | 52 |
Low Arousal and Positive Valence | LP - Calm | 55 |
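As a point of reference for reading accuracy figures on this dataset, the class counts in the table give the dataset size and the accuracy of a trivial classifier that always predicts the largest class. The sketch below is plain arithmetic over the table values; the dictionary layout is an illustrative choice.

```python
# Number of songs per class, taken from the dataset table.
counts = {"HP": 55, "LN": 52, "LP": 55}

total = sum(counts.values())
# Accuracy of always predicting the most frequent class.
majority_baseline = max(counts.values()) / total

print(total)                        # 162
print(f"{majority_baseline:.2%}")   # 33.95%
```

The three classes are nearly balanced, so any accuracy well above roughly one third indicates the classifier is learning something beyond the class distribution.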
CLASSIFICATION APPROACH
Audio Feature Extraction
Audio Feature Selection
Audio Train Test Approach
Audio Classification Algorithms
Word Embedding
DATAFLOW
RESULTS : Audio Feature
Algorithm | Feature Selection | Accuracy |
Naive Bayes | ReliefF-Based Attribute Selection | 65.43% |
Random Forest | ReliefF-Based Attribute Selection | 78.39% |
SMO | ReliefF-Based Attribute Selection | 79.01% |
Decision Tree | Correlation-Based Attribute Selection | 72.22% |
Logistic Regression | Correlation-Based Attribute Selection | 75.30% |
RESULTS : Textual Feature
Word Embedding | Algorithm | Pre-trained Model | Accuracy |
TF-IDF | SVM | - | 44.44% |
Word2Vec - fastText | fastText | sinhala-bin-fasttext | 51.51% |
Tokenizer | Transformer | xlm-roberta-base | 61.22% |
LIMITATION
FUTURE WORKS
KEY CONTRIBUTIONS
• Developed an emotional model for classifying Sinhala songs based on their emotions.
• Identified key music information features critical for classifying Sinhala songs.
• Evaluated the effectiveness of machine learning algorithms for lyrical feature analysis.
REFERENCES
THANK YOU!
- Malinda Peiris
- ashan.20210193@iit.ac.lk