1 of 17

A Hybrid Computational Framework Using NLP and ML for Emotion Analysis in Sinhala Songs

Malinda Peiris

School of Computing

Informatics Institute of Technology

Deshan Sumanathilaka

School of Mathematics and Computer Science

Swansea University

Presenting Author: Deshan Sumanathilaka

2 of 17

INTRODUCTION

  • Music emotion recognition remains under-researched in Sri Lanka. Several researchers have classified songs by their emotional state, but those approaches and models target English songs, so they do not transfer directly to Sinhala songs.

  • People tend to listen to music to manage their emotional state, yet in Sri Lanka most listeners are not aware of the emotional state of the songs they are listening to.

  • This research aims to identify natural language processing approaches for classifying Sinhala songs by their emotions, by extracting music information from the songs.

3 of 17

LITERATURE REVIEW

Paper | Findings
Lakshitha and Jayaratne 2016 | Supervised learning approaches with different classifier algorithms and melody analysis for Sinhala songs.
Abeyratne and Jayaratne 2019 | Machine learning techniques to predict emotional labels.
Jenarthanan, Senarath and Thayasivam 2019 | An emotionally annotated corpus of tweets, created to support emotion analysis in the Tamil and Sinhala NLP communities.
Yang 2008 | Trained two independent regression models (regressors) to predict valence and arousal.
Mihalcea and Strapparava 2012 | Used Ekman's six core emotions and a bag-of-words method to classify emotions in 100 popular songs.

4 of 17

RESEARCH GAP

  • According to the literature, no study in the Sri Lankan context has been conducted to emotionally classify Sinhala songs based on lyrics.

  • No study has used a hybrid approach to classify Sinhala songs based on both the audio and the lyrics of a song.

  • Currently there is no system that can classify Sinhala songs based on their emotional representation.

5 of 17

RESEARCH OBJECTIVES

  • A significant emotional model for classifying songs by their emotion.
  • Music information features that are significant in classifying Sinhala songs.
  • An appropriate natural language model for classifying Sinhala songs by their emotions.

  • An appropriate machine learning algorithm for classifying Sinhala songs by their emotions.

6 of 17

EMOTIONAL MODEL

Russell’s two-dimensional model (Russell, 1980) is a well-known model connecting cognitive science and psychology. Valence represents pleasantness (positive vs. negative emotion), and arousal represents intensity (excitement vs. calm).
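Mapping Russell's two dimensions onto the discrete labels used later in this deck (HP, LN, LP) can be sketched as follows; the zero-centred thresholds are an illustrative assumption, not part of the original model:

```python
def quadrant_label(valence: float, arousal: float) -> str:
    """Map a (valence, arousal) point on Russell's circumplex to an
    emotion-quadrant label. Assumes both axes are centred at 0
    (negative = low, positive = high)."""
    if arousal >= 0:
        # high arousal: positive valence is happy, negative is angry/tense
        return "Happy (HP)" if valence >= 0 else "Angry (HN)"
    # low arousal: positive valence is calm, negative is sad
    return "Calm (LP)" if valence >= 0 else "Sad (LN)"
```

The HN quadrant is shown for completeness; the dataset used here covers only the HP, LN, and LP classes.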

7 of 17

DATASET

The dataset was collected from the research conducted by Abeyratne and Jayaratne.

The first 90 seconds of each song were extracted from each music file, and the commonly used WAV format was chosen for the audio.

Description | Label | Number of Songs
High Arousal and Positive Valence | HP - Happy | 55
Low Arousal and Negative Valence | LN - Sad | 52
Low Arousal and Positive Valence | LP - Calm | 55
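The 90-second trimming step can be sketched with Python's standard `wave` module. This is a minimal illustration only (the deck does not name the original preprocessing tooling), assuming the input is already in WAV format:

```python
import wave

def trim_wav(src, dst, seconds=90):
    """Keep only the first `seconds` of a WAV stream, mirroring the
    dataset preparation above. `src` and `dst` may be file paths or
    file-like objects (the names here are illustrative)."""
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        # never read past the end of a clip shorter than `seconds`
        n_frames = min(reader.getnframes(), seconds * reader.getframerate())
        frames = reader.readframes(n_frames)
    with wave.open(dst, "wb") as writer:
        writer.setparams(params)   # nframes is corrected on close
        writer.writeframes(frames)
```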

8 of 17

CLASSIFICATION APPROACH

Audio Feature Extraction

  • jAudio - 72 features (McEnnis et al. 2005)

Audio Feature Selection.

  • ReliefF-based attribute selection - best 34 features.
  • Correlation-based attribute selection - best 30 features.

Audio Train Test Approach

  • K-fold cross-validation.

Audio Classification Algorithms

  • Naive Bayes
  • Random Forest
  • SMO
  • Decision Tree
  • Logistic Regression

Word Embedding

  • TF-IDF
  • Word2Vec
  • Tokenizer - Transformers
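The audio experiment above can be sketched with scikit-learn. This is a minimal illustration, not the original setup: the features are random stand-ins for the 34 ReliefF-selected jAudio features, and WEKA's SMO is approximated by scikit-learn's `SVC` (both train an SVM):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(162, 34))    # 162 songs x 34 selected features (placeholder)
y = rng.integers(0, 3, size=162)  # labels: 0=HP, 1=LN, 2=LP (placeholder)

classifiers = {
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "SMO (SVC analogue)": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# 10-fold cross-validated accuracy per classifier
results = {name: cross_val_score(clf, X, y, cv=10).mean()
           for name, clf in classifiers.items()}
```

With random features the scores hover around chance; on the real jAudio features the deck reports accuracies in the 65-79% range.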

9 of 17

DATAFLOW

[Dataflow diagram]

10 of 17

11 of 17

RESULTS : Audio Feature

Algorithm | Feature Selection | Accuracy
Naive Bayes | ReliefF-based attribute selection | 65.43%
Random Forest | ReliefF-based attribute selection | 78.39%
SMO | ReliefF-based attribute selection | 79.01%
Decision Tree | Correlation-based attribute selection | 72.22%
Logistic Regression | Correlation-based attribute selection | 75.30%

12 of 17

RESULTS : Textual Feature

Word Embedding | Algorithm | Pre-trained Model | Accuracy
TF-IDF | SVM | - | 44.44%
Word2Vec - FastText | FastText | sinhala-bin-fasttext | 51.51%
Tokenizer | Transformer | xlm-roberta-base | 61.22%
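The weakest text baseline in the table, TF-IDF vectors fed to an SVM, can be sketched as a scikit-learn pipeline. The lyric snippets below are placeholders for illustration, not the actual dataset:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy stand-ins for Sinhala lyric excerpts and their quadrant labels
lyrics = ["bright cheerful chorus", "slow mournful verse", "gentle soft refrain",
          "cheerful upbeat chorus", "mournful dark verse", "soft quiet refrain"]
labels = ["HP", "LN", "LP", "HP", "LN", "LP"]

# TF-IDF features piped into a linear SVM
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(lyrics, labels)
```

The transformer-based tokenizer approach (xlm-roberta-base) replaces the TF-IDF step with subword tokenization and learned embeddings, which is what lifts accuracy from 44.44% to 61.22% in the table above.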

13 of 17

LIMITATION

  • This research is based on the Sri Lankan context; the materials and models are scoped to Sri Lankan songs and Sinhala song lyrics.
  • The Abeyratne and Jayaratne dataset was the only dataset available for the Sri Lankan context.
  • The FastText and XLM-RoBERTa word embeddings were built on Wikipedia and Common Crawl data; because these vectors were trained on documents crawled from the web, they do not represent the classical vocabulary used by music composers.

14 of 17

FUTURE WORKS

  • Deep learning methods in place of supervised machine learning algorithms.

  • These classification algorithms can be used to build music recommendation systems and automated playlist generation systems based on the mood of the user.

  • Subjective data collection using feedback from music listeners and music experts, with different emotional models.

  • LLMs and GPT models can be utilized in future research.

15 of 17

KEY CONTRIBUTIONS

• Developed an emotional model for classifying Sinhala songs based on their emotions.

• Identified key music information features critical for classifying Sinhala songs.

• Evaluated the effectiveness of the machine learning algorithm for lyrical feature analysis.

16 of 17

REFERENCES

  • S. Dhanapala and K. Samarasinghe, “Music emotion classification: A literature review,” 2024.
  • K. Abeyratne and K. Jayaratne, “Classification of Sinhala songs based on emotions,” in 2019 19th International Conference on Advances in ICT for Emerging Regions (ICTer), vol. 250, IEEE, 2019, pp. 1–10.
  • M. V. Lakshitha and K. Jayaratne, “Melody analysis for prediction of the emotions conveyed by Sinhala songs,” in 2016 IEEE International Conference on Information and Automation for Sustainability (ICIAfS), IEEE, 2016, pp. 1–6.
  • R. Amarasinghe and L. Jayaratne, “Supervised learning approach for singer identification in Sri Lankan music,” European Journal of Computer Science and Information Technology (EJCSIT) by European Centre for Research Training and Development UK, vol. 4, no. 6, pp. 1–14, 2016.
  • R. Panda, R. Malheiro, and R. P. Paiva, “Audio features for music emotion recognition: A survey,” IEEE Transactions on Affective Computing, vol. 14, no. 1, pp. 68–88, 2020.
  • Y. Song, S. Dixon, and M. T. Pearce, “Evaluation of musical features for emotion classification,” in ISMIR, 2012, pp. 523–528.
  • X. Jia, “Music emotion classification method based on deep learning and improved attention mechanism,” Computational Intelligence and Neuroscience, vol. 2022, no. 1, p. 5181899, 2022.
  • Y.-H. Yang and H. H. Chen, “Machine recognition of music emotion: A review,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 3, no. 3, pp. 1–30, 2012.
  • D. Yang and W.-S. Lee, “Music emotion identification from lyrics,” in 2009 11th IEEE International Symposium on Multimedia, IEEE, 2009, pp. 624–629.
  • R. Mihalcea and C. Strapparava, “Lyrics, music, and emotions,” in Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012, pp. 590–599.
  • A. Jamdar, J. Abraham, K. Khanna, and R. Dubey, “Emotion analysis of songs based on lyrical and audio features,” arXiv preprint arXiv:1506.05012, 2015.
  • M. Soleymani, A. Aljanaki, Y.-H. Yang, M. N. Caro, F. Eyben, K. Markov, B. W. Schuller, R. Veltkamp, F. Weninger, and F. Wiering, “Emotional analysis of music: A comparison of methods,” in Proceedings of the 22nd ACM International Conference on Multimedia, 2014, pp. 1161–1164.
  • A. Aljanaki, Y.-H. Yang, and M. Soleymani, “Developing a benchmark for emotional analysis of music,” PloS One, vol. 12, no. 3, p. e0173392, 2017.
  • J. A. Russell, “A circumplex model of affect,” Journal of Personality and Social Psychology, vol. 39, no. 6, p. 1161, 1980.
  • Y. Senarath, “Language processing tool for Sinhalese,” 2004. [Online]. Available: https://sinling.ysenarath.com/
  • D. Lakmal, S. Ranathunga, S. Peramuna, and I. Herath, “Word embedding evaluation for Sinhala,” in Proceedings of the Twelfth Language Resources and Evaluation Conference, 2020, pp. 1874–1881.
  • D. McEnnis, C. McKay, I. Fujinaga, and P. Depalle, “jAudio: A feature extraction library,” in ISMIR, 2005.
  • C. McKay, R. Fiebrink, D. McEnnis, B. Li, and I. Fujinaga, “ACE: A framework for optimizing music classification,” in ISMIR, 2005, pp. 42–49.
  • C. McKay, D. McEnnis, R. Fiebrink, and I. Fujinaga, “ACE: A general-purpose classification ensemble optimization framework,” in ICMC, 2005.
  • C. McKay, “Jaudio: Towards a standardized extensible audio music feature extraction system,” Course Paper, McGill University, Canada, 2009.

17 of 17

THANK YOU !

  • For any technical enquiries:

- Malinda Peiris

- ashan.20210193@iit.ac.lk