Dataset
Conclusion
Problem Statement
HPE: 2D and 3D Estimation
Multi-view Processing of Audio-Visual Recording
References
[1] S. Nadkarni, S. Roychowdhury, P. Rao, and M. Clayton, “Exploring the correspondence of melodic contour with gesture in raga alap singing,” in Proceedings of the 24th International Society for Music Information Retrieval Conference (ISMIR), Milan, Italy, 2023.
Human Pose Estimation for Expressive Movement Descriptors in Vocal Musical Performances
Sujoy Roychowdhury, Preeti Rao, Sharat Chandran
Indian Institute of Technology Bombay, India
HPE vs. Sensor Based Systems
Results
Correspondence between HPE Models
Confidence Thresholds
Research Questions
F1-scores (%) for stable note detection and Accuracy (%) for gesture-based singer identification
Fusion-based results
Figure: Multi-view fusion of classifiers. Each view (left, front, right) passes through feature extraction (2D/3D kinematic features) and its own classifier; a fusion model combines the predicted probabilities into the classification label.
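The poster does not state how the fusion model combines the per-view predicted probabilities. As a minimal sketch, assuming simple weighted averaging of class probabilities (late fusion), the step could look like the following; the function name `fuse_views`, the view names, and the weights are hypothetical and not taken from the source.

```python
def fuse_views(view_probs, weights=None):
    """Late fusion by weighted averaging (an assumption, not the poster's method).

    view_probs: dict mapping a view name to its list of class probabilities.
    weights:    optional dict mapping a view name to a non-negative weight.
    Returns (fused_probabilities, predicted_label_index).
    """
    names = list(view_probs)
    if not names:
        raise ValueError("at least one view is required")
    if weights is None:
        weights = {name: 1.0 for name in names}  # equal weight per view

    n_classes = len(view_probs[names[0]])
    if any(len(view_probs[name]) != n_classes for name in names):
        raise ValueError("all views must predict the same number of classes")

    total = sum(weights[name] for name in names)
    fused = [
        sum(weights[name] * view_probs[name][c] for name in names) / total
        for c in range(n_classes)
    ]
    # The predicted label is the class with the highest fused probability.
    label = max(range(n_classes), key=fused.__getitem__)
    return fused, label


# Hypothetical per-view outputs for a two-class problem.
probs = {"left": [0.6, 0.4], "front": [0.2, 0.8], "right": [0.4, 0.6]}
fused, label = fuse_views(probs)
```

A learned fusion model (for example, a classifier trained on the concatenated per-view probabilities) would replace the averaging step while keeping the same inputs and output.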
Classification using HPE Models and Multi-view Fusion
Stable Note from Gesture [1]
Singer ID from Gesture
Observations:
Dataset Link
Euclidean distance between keypoints
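For illustration, the per-keypoint Euclidean distance between two pose estimates of the same frame can be computed as below. This is a sketch under stated assumptions: both poses list the same keypoints in the same order and coordinate space, and the helper names `keypoint_distances` and `mean_keypoint_distance` are hypothetical. The poster does not specify normalization or averaging, so the mean is only one possible summary.

```python
import math


def keypoint_distances(pose_a, pose_b):
    """Euclidean distance for each pair of corresponding keypoints.

    Each pose is a sequence of (x, y) or (x, y, z) coordinates; both poses
    are assumed to use the same keypoint order and dimensionality.
    """
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must have the same number of keypoints")
    return [math.dist(p, q) for p, q in zip(pose_a, pose_b)]


def mean_keypoint_distance(pose_a, pose_b):
    """Average of the per-keypoint distances (one possible summary)."""
    distances = keypoint_distances(pose_a, pose_b)
    return sum(distances) / len(distances)


# Hypothetical 2D keypoints from two HPE models for one frame.
model_a = [(0.0, 0.0), (1.0, 1.0)]
model_b = [(3.0, 4.0), (1.0, 1.0)]
per_keypoint = keypoint_distances(model_a, model_b)
average = mean_keypoint_distance(model_a, model_b)
```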