Using Human Activity Recognition in Physical Rehabilitation Exercises in Real-time
Presented By : Moamen Zaher
Assoc. Prof. Ayman Ezzat Atia
Computer Science Department
Faculty of Computers & Artificial Intelligence
Helwan University
Dr. Amr Ghoneim
Computer Science Department
Faculty of Computers & Artificial Intelligence
Helwan University
Dr. Laila M. Abdelhamid
Information System Department
Faculty of Computers & Artificial Intelligence
Helwan University
This Presentation is Presented to the Department of Information Systems to Obtain a Master's Degree in Software Engineering
Agenda
Introduction
Introduction 1/2
Rehabilitation is a long process, so there is a need to shorten hospital stays.
Rehabilitation is a set of interventions designed to optimize functioning and reduce disability in individuals with health conditions in interaction with their environment.
Incorrect execution of the exercises can worsen the injury and increase recovery time.
Introduction 2/2
Skeleton parts, angles, and trajectories of different body joints are required
Data must be acquired using sensors
An AI model must be developed to classify correct & wrong execution of exercises
Motivation
Map of leading health conditions requiring rehabilitation in each country, 2019
Problem Statement
WHO (World Health Organization), Rehabilitation, 10 November 2021
Research Objective
Train more professionals
Increase productivity
❌
✅
Research Outcomes
Allow patients to practice exercises at home without the need to go to physiotherapy clinics.
Real-time feedback to the patient on whether the exercise was performed correctly.
Cut down the cost of rehabilitation.
Allow doctors to monitor patients' progress.
Background and Literature Review
Background
Arshad, M. H., Bilal, M., & Gani, A. (2022). Human activity recognition: Review, taxonomy and open challenges. Sensors, 22(17), 6463. [2]
Related Work
Debnath, B., O’brien, M., Yamaguchi, M., & Behera, A. (2022). A review of computer vision-based approaches for physical rehabilitation and assessment. Multimedia Systems, 28(1), 209-239. [3]
Taxonomy of computer vision-based approaches in physiotherapy (Debnath et al. [3]):
- Rehabilitation: digital, direct, and virtual rehabilitation
- Automated assessment input: skeleton-based vs. non-skeleton-based; pure vision-based vs. multi-modal
- Assessment methods: comparison, kinematics-based modeling, statistical models, stochastic methods
- Categorization: rule-based vs. statistical and stochastic algorithms-based
- Scoring: author-proposed vs. clinical
Data Acquisition
Tasnim, N., & Baek, J. H. (2023). Dynamic edge convolutional neural network for skeleton-based human action recognition. Sensors, 23(2), 778. [4]
Yue, R., Tian, Z., & Du, S. (2022). Action recognition based on RGB and skeleton data sets: A survey. Neurocomputing, 512, 287-306. [5]
Datasets
Dataset | Impairment | Details | Sensor/Data |
SPHERE-Walking2015 | Sit to stand | 109 sequences, 10 individuals, restricted knee, hip, freezing | Kinect / Kinect SDK, OpenNI SDK skeleton |
Parkinson's pose estimation | PD, LID, UPDRS assessment tasks | 526 sequences, PD, LID patients, 4 UPDRS assessment tasks | RGB camera / CPM skeleton |
AHA-3D | Senior lower body fitness | 11 young, 10 elderly subjects, 4 exercises | Kinect / RGB, depth, skeleton |
UI-PRMD | Physical rehabilitation movement | 10 rehabilitation exercises, 10 healthy individuals | Vicon optical tracker and a Kinect camera |
KIMORE | Stroke, PD, back pain exercises | 44 healthy, 34 patient subjects, 5 exercises, 5 repetitions | Kinect / RGB, depth, skeleton |
UTD-MHAD | 27 different actions | 8 subjects (4 females and 4 males); each subject repeated each action 4 times | One Kinect camera and one wearable inertial sensor |
Related Work 1/2 :
Rashid, F. A. N., Suriani, N. S., Mohd, M. N., Tomari, M. R., Zakaria, W. N. W., & Nazari, A. (2020). Deep convolutional network approach in spike train analysis of physiotherapy movements. In Advances in Electronics Engineering: Proceedings of the ICCEE 2019, Kuala Lumpur, Malaysia (pp. 159-170). Springer Singapore. [6]
Tasnim, N., & Baek, J. H. (2023). Dynamic edge convolutional neural network for skeleton-based human action recognition. Sensors, 23(2), 778. [4]
Related Work 2/2 :
Methodology
Overview of the Methodology
UI-PRMD
A. Vakanski, H.-P. Jun, D. Paul, and R. Baker, “A data set of human body movements for physical rehabilitation exercises,” Data (Basel), vol. 3, Mar. 2018. [7]
KIMORE
Capecci, M., Ceravolo, M. G., Ferracuti, F., Iarlori, S., Monteriu, A., Romeo, L., & Verdini, F. (2019). The KIMORE dataset: Kinematic assessment of movement and clinical scores for remote monitoring of physical rehabilitation. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 27(7), 1436-1448. [8]
KIMORE
Collected Datasets
Research Directions
Approach 1: Comparative Study of Machine Learning Algorithms.
Approach 2: Case Study: A framework for assessing physical rehabilitation exercises.
Approach 3: Comparative Study between CNN and RNN algorithms on multiple datasets.
Approach 4: Transfer Learning and Model Fusion.
✅
✅
⏳
⏳
Approaches and Results
Paper 1: Rehabilitation Monitoring and Assessment: A Comparative Analysis of Feature Engineering and Machine Learning Algorithms on the UI-PRMD and KIMORE Benchmark Datasets
Under Review in the Journal of Information and Telecommunication (Q2), Taylor & Francis.
Research Pipeline
UI-PRMD Dataset Representation
Data Processing
Feature Ranking and Selection
Feature Ranking and Selection
Action Classification
Non-Ensembled
Ensembled Models
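A minimal sketch of this pipeline stage, assuming scikit-learn, chi-squared feature ranking, and an Extra Trees classifier; the feature matrix and labels below are illustrative placeholders, not the UI-PRMD preprocessing used in the paper.

```python
# Hypothetical sketch: chi-squared feature ranking followed by an ensembled
# Extra Trees classifier. X / y are placeholder data, not UI-PRMD features.
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# X: per-repetition feature vectors (e.g., joint-angle statistics), y: exercise labels
X = np.abs(np.random.rand(200, 120))   # chi2 requires non-negative features
y = np.random.randint(0, 10, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

selector = SelectKBest(score_func=chi2, k=40)          # rank features, keep the top 40
X_train_sel = selector.fit_transform(X_train, y_train)
X_test_sel = selector.transform(X_test)

clf = ExtraTreesClassifier(n_estimators=200, random_state=42)   # ensembled model
clf.fit(X_train_sel, y_train)
print("Accuracy:", accuracy_score(y_test, clf.predict(X_test_sel)))
```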
Experiments
Results : Ranking Methods
Results : Algorithms
Results
UI-PRMD Results
Classifier | X2 | FCBF | ReliefF | Gini Decrease | Information Gain | Information Gain Ratio | Classifier Average |
Ada Boost Classifier | 94.04% | 93.37% | 86.20% | 93.66% | 93.60% | 91.66% | 92.10% |
Decision Tree | 99.00% | 99.07% | 91.12% | 98.81% | 98.81% | 98.81% | 98.81% |
Extra Tree | 98.37% | 99.64% | 99.94% | 97.81% | 97.56% | 98.25% | 98.60% |
Gradient Boosting | 98.87% | 98.93% | 98.50% | 98.81% | 98.81% | 98.81% | 98.80% |
KNN | 92.44% | 94.01% | 99.88% | 91.63% | 91.63% | 93.07% | 93.80% |
Light Gradient Boosting Machine | 98.94% | 99.36% | 98.75% | 99.06% | 99.06% | 99% | 99% |
Linear Discriminant Analysis | 98.87% | 68.05% | 97% | 97.62% | 97.62% | 97.56% | 92.80% |
Logistic Regression | 90.75% | 94.86% | 79.12% | 87.94% | 87.94% | 88.31% | 88.20% |
Naïve Bayes | 96.94% | 96.85% | 95.25% | 96.38% | 96.38% | 86.75% | 94.80% |
Quadratic Discriminant Analysis | 66.18% | 87.76% | 18.44% | 67.31% | 68.25% | 40.56% | 58.10% |
Random Forest | 98.31% | 99.57% | 99.62% | 82.24% | 98.06% | 98.06% | 96% |
Ridge | 68.26% | 77.71% | 53.69% | 59.81% | 59.81% | 59.94% | 63.20% |
Support Vector Machine | 86.83% | 95.25% | 69.61% | 85.25% | 85.25% | 83.82% | 84.30% |
Average Accuracy of Each Feature Ranking Technique | 91.37% | 92.65% | 83.62% | 88.95% | 90.21% | 87.28% | |
KIMORE Results
Classifier | X2 | FCBF | ReliefF | Gini Decrease | Information Gain | Information Gain Ratio | Mean Classifier Accuracy |
Ada Boost Classifier | 54.95% | 56.67% | 53.38% | 55.28% | 56.11% | 59.86% | 56.04% |
Decision Tree | 67.45% | 74.35% | 68.56% | 66.53% | 72.82% | 73.19% | 70.48% |
Extra Tree | 76.81% | 81.85% | 74.63% | 72.78% | 75.14% | 76.81% | 76.34% |
Gradient Boosting | 74.95% | 74.95% | 68.80% | 69.54% | 74.03% | 73.98% | 72.71% |
KNN | 75.00% | 71.30% | 71.85% | 68.52% | 71.90% | 73.15% | 71.95% |
Light Gradient Boosting Machine | 73.89% | 71.44% | 73.89% | 72.96% | 76.90% | 77.64% | 75.45% |
Linear Discriminant Analysis | 58.89% | 58.89% | 61.94% | 61.53% | 57.78% | 60.83% | 59.98% |
Logistic Regression | 59.12% | 59.12% | 53.80% | 63.19% | 58.52% | 56.48% | 58.37% |
Naïve Bayes | 57.92% | 57.92% | 54.44% | 58.84% | 59.26% | 54.68% | 57.18% |
Quadratic Discriminant Analysis | 18.19% | 18.19% | 32.45% | 74.91% | 75.56% | 18.19% | 39.58% |
Random Forest | 76.53% | 76.53% | 72.22% | 70.46% | 75.65% | 78.43% | 74.97% |
Ridge | 56.94% | 56.94% | 55.42% | 56.30% | 58.75% | 57.92% | 57.05% |
Support Vector Machine | 57.55% | 66.20% | 58.50% | 58.06% | 62.04% | 58.43% | 60.13% |
Discussion
Conclusion
Paper 2: A Framework for Assessing Physical Rehabilitation Exercises
Published at the 2023 Intelligent Methods, Systems, and Applications (IMSA) Conference, IEEE.
Zaher, M., Samir, A., Ghoneim, A., Abdelhamid, L., & Atia, A. (2023, July). A Framework for Assessing Physical Rehabilitation Exercises. In 2023 Intelligent Methods, Systems, and Applications (IMSA) (pp. 526-532). IEEE.
Research Pipeline
Datasets
Preprocessing
Feature Engineering
UI-PRMD Dataset Features
Experiments
Used Algorithms 1 of 2
Used Algorithms 2 of 2
Experiments 1 of 2
Experiments 2 of 2
Results
Number of Templates | Accuracy | Precision | Recall | F1 Score |
One | 72% | 74.5% | 72% | 70.4% |
Two | 88% | 93% | 88% | 87.9% |
Three | 90% | 93.8% | 90% | 90.1% |
Evaluation of the $1 algorithm based on different evaluation metrics for different numbers of templates used
Algorithm | Accuracy | Precision | Recall | F1 Score |
Extra Tree | 99.64% | 99.74% | 99.64% | 99.62% |
One Dollar | 90% | 93.8% | 90% | 90.1% |
Evaluation of different algorithms based on different evaluation metrics
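For illustration, a simplified template-matching sketch in the spirit of the $1 recognizer: trajectories are resampled to a fixed length, normalized, and assigned the label of the nearest stored template. The actual $1 recognizer additionally normalizes rotation and scale, so this is only an approximation, and the data below are placeholders.

```python
# Simplified, hypothetical nearest-template classifier in the spirit of $1.
import numpy as np

def resample(seq, n=64):
    """Linearly resample a 1-D trajectory (e.g., a joint angle over time) to n points."""
    idx = np.linspace(0, len(seq) - 1, n)
    return np.interp(idx, np.arange(len(seq)), seq)

def normalize(seq):
    return (seq - seq.mean()) / (seq.std() + 1e-8)

def classify(query, templates):
    """templates: list of (label, trajectory) pairs; returns the label of the closest template."""
    q = normalize(resample(np.asarray(query)))
    dists = [(np.linalg.norm(q - normalize(resample(np.asarray(t)))), label)
             for label, t in templates]
    return min(dists)[1]

# Illustrative usage with one correct and one incorrect exercise template
templates = [("correct", np.sin(np.linspace(0, np.pi, 80))),
             ("incorrect", 0.4 * np.sin(np.linspace(0, np.pi, 70)))]
print(classify(np.sin(np.linspace(0, np.pi, 95)) + 0.02, templates))
```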
Conclusion
Confusion Matrix for $1 with 3 templates
Confusion Matrix for $1 with 3 templates
Paper 3: Unlocking the Potential of RNN and CNN Models for Accurate Rehabilitation Exercise Classification on Multi-Datasets.
Published in the Multimedia Tools and Applications Journal (Q1), Springer.
Zaher, M., Ghoneim, A. S., Abdelhamid, L., & Atia, A. (2024). Unlocking the potential of RNN and CNN models for accurate rehabilitation exercise classification on multi-datasets. Multimedia Tools and Applications, 1-41.
Research Pipeline
Datasets
UI-PRMD
KIMORE
Preprocessing
For Disease Classification
Hyper-parameter Tuning
Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13(2).
For Disease Classification
Used Algorithms
Random Search
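A minimal sketch of random search in the sense of Bergstra and Bengio: hyper-parameter configurations are sampled from the ranges listed in the tables below, and the best-scoring configuration is kept. `build_and_evaluate` is a hypothetical helper that trains one candidate model and returns its validation accuracy, not code from the thesis.

```python
# Minimal random-search sketch over the LSTM hyper-parameter ranges below.
import random

SEARCH_SPACE = {
    "lstm_layers":   lambda: random.randint(1, 7),
    "lstm_units":    lambda: random.randint(32, 1024),
    "dropout":       lambda: random.uniform(0.2, 0.5),
    "learning_rate": lambda: 10 ** random.uniform(-4, -2),   # 0.0001 .. 0.01, log scale
    "regularizer":   lambda: random.choice(["l1", "l2", None]),
    "dense_layers":  lambda: random.randint(1, 5),
    "dense_units":   lambda: random.randint(64, 1024),
}

def random_search(n_trials, build_and_evaluate):
    best_score, best_cfg = -1.0, None
    for _ in range(n_trials):
        cfg = {name: sample() for name, sample in SEARCH_SPACE.items()}
        score = build_and_evaluate(cfg)          # train + validate one sampled configuration
        if score > best_score:
            best_score, best_cfg = score, cfg
    return best_cfg, best_score
```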
Parameters : LSTM
Parameter | Range of Values | Best Value |
LSTM Layers | From 1 to 7 | 1 |
LSTM Units | From 32 to 1024 | 320 |
Dropout Rate | From 0.2 to 0.5 | 0.26 |
Learning Rate | From 0.0001 to 0.01 | 0.0005 |
LSTM Regularizer | l1, l2, or none | l2 |
Dense Regularizer | l1, l2, or none | None |
Dense Layers | From 1 to 5 | 2 |
Dense Units | From 64 to 1024 | 940 |
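Assuming Keras, a hypothetical reconstruction of the tuned LSTM from the best values above (1 LSTM layer, 320 units, dropout 0.26, L2 on the LSTM, 2 dense layers of 940 units, learning rate 0.0005); the input shape and exact layer ordering are assumptions, not the paper's exact architecture.

```python
# Hypothetical Keras sketch of the tuned LSTM from the table above.
from tensorflow.keras import Sequential, layers, regularizers, optimizers

n_timesteps, n_features, n_classes = 100, 66, 10   # illustrative dimensions

model = Sequential([
    layers.Input(shape=(n_timesteps, n_features)),
    layers.LSTM(320, kernel_regularizer=regularizers.l2()),   # 1 LSTM layer, 320 units, l2
    layers.Dropout(0.26),
    layers.Dense(940, activation="relu"),                     # 2 dense layers, 940 units
    layers.Dense(940, activation="relu"),
    layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer=optimizers.Adam(learning_rate=0.0005),
              loss="categorical_crossentropy", metrics=["accuracy"])
```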
Parameters : Bi-LSTM
Parameter | Range of Values | Best Value |
BiLSTM Layers | From 1 to 5 | 2 |
BiLSTM Units | From 32 to 1024 | 271 |
Dropout Rate | From 0.2 to 0.5 | 0.3 |
Learning Rate | From 0.0001 to 0.01 | 0.001014 |
Regularizer | l1, l2, or none | None |
Dense Regularizer | l1, l2, or none | None |
Dense Layers | From 1 to 5 | 4 |
Dense Units | From 64 to 1024 | 927 |
Parameters : CNN-LSTM
Parameter | Range of Values | Best Value |
Filters | From 32 to 1024 | 128 |
Kernel Size | From 3 to 10 | 8 |
LSTM Units | From 64 to 1024 | 256 |
Dropout Rate | From 0.2 to 0.5 | 0.2 |
Learning Rate | From 0.0001 to 0.01 | 0.00077 |
Conv Regularizer | l1, l2, or none | l2 |
LSTM Regularizer | l1, l2, or none | None |
Dense Regularizer | l1, l2, or none | None |
Dense Layers | From 1 to 5 | 2 |
Dense Units | From 64 to 1024 | 927 |
Parameters : CNN
Parameter | Range of Values | Best Value |
Convolutional Layers | From 1 to 6 | 2 |
Conv Units | From 32 to 512 | 48 |
Dropout Rate | From 0.2 to 0.5 | 0.2 |
Learning Rate | From 0.0001 to 0.01 | 0.0025284 |
Conv Regularizer | l1, l2, or none | None |
Dense Regularizer | l1, l2, or none | None |
Dense Layers | From 1 to 5 | 1 |
Dense Units | From 64 to 1024 | 544 |
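Assuming Keras, a hypothetical reconstruction of the tuned CNN from the table above (2 convolutional layers with 48 units, dropout 0.2, 1 dense layer of 544 units); the kernel size, pooling, and input shape are not listed and are therefore assumptions.

```python
# Hypothetical Keras sketch of the tuned 1D CNN from the table above.
from tensorflow.keras import Sequential, layers, optimizers

n_timesteps, n_features, n_classes = 100, 66, 10   # illustrative dimensions

model = Sequential([
    layers.Input(shape=(n_timesteps, n_features)),
    layers.Conv1D(48, kernel_size=3, activation="relu"),   # kernel size assumed
    layers.Conv1D(48, kernel_size=3, activation="relu"),
    layers.GlobalAveragePooling1D(),
    layers.Dropout(0.2),
    layers.Dense(544, activation="relu"),
    layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer=optimizers.Adam(learning_rate=0.0025284),
              loss="categorical_crossentropy", metrics=["accuracy"])
```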
Evaluation Metrics
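The reported metrics (accuracy, precision, recall, F1 score) can be computed as in the following scikit-learn sketch; the weighted averaging shown here is an assumption, and the label arrays are placeholders.

```python
# Minimal sketch of the evaluation metrics used throughout the result tables.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [0, 1, 2, 2, 1, 0]   # placeholder test labels
y_pred = [0, 1, 2, 1, 1, 0]   # placeholder model predictions

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred, average="weighted"))
print("Recall   :", recall_score(y_true, y_pred, average="weighted"))
print("F1 Score :", f1_score(y_true, y_pred, average="weighted"))
```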
Training
Parameter | Value |
Epochs | 450 |
Train-test split | 80 / 20 |
Train-val split | 80 / 20 |
Folds | 5 |
Batch Size | 32 |
Hidden Layer Activation | ReLU |
Output Layer Activation | Softmax |
Optimizer | Adam |
Loss | Categorical cross-entropy |
Experiments
Exercise Classification
UI-PRMD
KIMORE
State-of-the-art - KIMORE
Method | Accuracy |
Ensemble-based Graph Convolutional Network (EGCN) | 80.10% |
3D Convolution Neural Network (3D-CNN) | 90.57% |
Many-to-Many model with density map output | 92.33% |
Our Tuned-CNN | 93.08% |
State-of-the-art - UI-PRMD
Method | Accuracy |
Ensemble-based Graph Convolutional Network (EGCN) | 86.90% |
Graph Convolutional Siamese Network | 99.20% |
FCBF - Extra Tree | 99.60% |
Our Tuned-CNN | 99.70% |
Disease Classification
Disease Classification Results
Conclusion
Paper 4: Fusing CNN and Attention Mechanisms: Advancements in Real-Time Human Activity Recognition�for Rehabilitation Exercises Classification
Submitted to Computers in Biology and Medicine (Q1), Elsevier
Research Pipeline
Datasets
UI-PRMD
KIMORE
Preprocessing
Scalogram Generation
Mel-frequency cepstral coefficients
Continuous Wavelet Transform
Image Representation: CWT
Image Representation
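A minimal sketch of scalogram generation with PyWavelets, assuming a Morlet wavelet, 64 scales, and a placeholder joint-angle signal; the paper's exact wavelet, scale range, and image size are not reproduced here.

```python
# Hypothetical sketch: one joint trajectory -> CWT scalogram image.
import numpy as np
import pywt
import matplotlib.pyplot as plt

signal = np.sin(np.linspace(0, 8 * np.pi, 256))        # placeholder joint-angle sequence
scales = np.arange(1, 65)

coeffs, freqs = pywt.cwt(signal, scales, "morl")        # continuous wavelet transform

plt.figure(figsize=(2.24, 2.24))                        # ~224x224 px at dpi=100
plt.imshow(np.abs(coeffs), aspect="auto", cmap="jet")   # scalogram = |CWT coefficients|
plt.axis("off")
plt.savefig("scalogram.png", bbox_inches="tight", pad_inches=0, dpi=100)
plt.close()
```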
Limitations of Machine Learning
Limitations of Deep Learning
Transfer Learning
Fusion
Algorithms
CNN
Attention
Fused Algorithms Models
2 Models
3 Models
Model Architectures
Top Layer Architecture
Two Architectures
Parameter | First Architecture | Second Architecture |
Number of Dense Layers | 2 | 2 |
1st Dense Layer’s Units | 512 | 352 |
2nd Dense Layer’s Units | 265 | 176 |
Dropout Rate | 0.2 | 0.2 |
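A hypothetical Keras sketch of fusing two pretrained CNN backbones with the second top-layer architecture from the table (352 and 176 dense units, dropout 0.2); the backbone pair shown is illustrative, dropout placement is assumed, and the ViT branch used in the fused models is omitted here.

```python
# Hypothetical sketch of feature-level fusion of two ImageNet-pretrained backbones.
from tensorflow.keras import Model, layers
from tensorflow.keras.applications import ResNet50, DenseNet201

inp = layers.Input(shape=(224, 224, 3))
res = ResNet50(include_top=False, weights="imagenet", pooling="avg")
dense = DenseNet201(include_top=False, weights="imagenet", pooling="avg")
res.trainable = False          # frozen backbones; fine-tuning would be a separate step
dense.trainable = False

fused = layers.Concatenate()([res(inp), dense(inp)])    # fuse the two feature vectors
x = layers.Dense(352, activation="relu")(fused)         # second top-layer architecture
x = layers.Dropout(0.2)(x)
x = layers.Dense(176, activation="relu")(x)
out = layers.Dense(10, activation="softmax")(x)         # 10 exercise classes (UI-PRMD)

model = Model(inp, out)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```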
Evaluation Metrics
Optimization Metrics
Satisfactory Metrics
Training
Parameter | Value |
Epochs | 150 |
Train-test split | 80 / 20 |
Train-val split | 80 / 20 |
Folds | 5 |
Batch Size | 64 |
Input Shape | 224x224x3 |
Hidden Layer Activation | ReLU |
Output Layer Activation | Softmax |
Optimizer | Adam |
Loss | Categorical cross-entropy |
Experiments
The first experiment aimed to identify the optimal top-layer architecture for the most accurate exercise classification on the UI-PRMD dataset.
The second experiment implemented numerous hybrid CNN-ViT architectures, validated the results through cross-validation, and incorporated an additional dataset.
For Disease Classification
Exp 1 Results (CNN-Based Only)
Comparison of the best two architectures for the Fully Connected Network architecture on the UI-PRMD dataset, focusing on CNN-based models. The 2D bar chart visually represents the accuracy distinctions between the two architectures.
Exp 1 Results (Attention-Based)
Comparison of the best two architectures for the Fully Connected Network architecture on the UI-PRMD dataset, focusing on Attention-based models. The 2D bar chart visually represents the accuracy distinctions between the two architectures.
All Model Results on UI-PRMD
Comparison of results across various algorithms using the second architecture on the UI-PRMD dataset. The image depicts the performance of 20 different CNN-based models alongside 13 attention-based and fused models.
All Model Results on UI-PRMD
Cross Validation on UI-PRMD
Comparison with state-of-the-art on UI-PRMD
Method | Accuracy | F1-Score | Results |
GCN | 92.64% | - | Training |
ST-GCN | 98.90% | - | Training |
2S-AGCN | 99.10% | - | Training |
Graph Convolutional Siamese Network | 99.20% | - | Training |
Spike Train | 77% | - | Testing |
Graph Transformer | - | 85% | Testing |
EGCN | 86.90% | - | Testing |
Res50-MobileV3Small-ViT | 89.30% | 89.07% | Testing |
DenseNet121 | 89.33% | 89.06% | Testing |
Res50-Dense201-ViT | 89.80% | 89.59% | Testing |
DenseNet201-MobileNetV3Small-ViT | 89.80% | 89.64% | Testing |
All Model Results on KIMORE
Comparison of results across various algorithms using the second architecture on the KIMORE dataset. The image depicts the performance of 20 different CNN-Based Models alongside 13 attention-based and fused models.
All Model Results on KIMORE
Cross Validation on KIMORE
Comparison with state-of-the-art on KIMORE
Algorithm | Accuracy |
EGCN | 80.10% |
3D-CNN | 90.57% |
Many-to-Many model with density map output | 92.33% |
Res50-Dense201-ViT | 93.78% |
Res50-MobileV3Small-ViT | 94.04% |
Dense201-MobileV3Small-ViT | 94.30% |
ViT | 95.08% |
MobileNetV3Small-ViT | 95.33% |
Comparison of inference time
Comparison of model size in MB
Discussion
Discussion
Conclusion
Discussion
Discussion
Discussion
SE Perspective
Collaboration With Physical Therapy
September 2022
October 2023
Patient Dashboard
Conclusion
Conclusion
Future Work
UI-PRMD
UI-PRMD
Collected Dataset
Collected Dataset Classes
Exercise | Incorrect Template |
Mini Squat | Uncontrolled Knee Position |
| Excessive Trunk Flexion |
Sit-to-Stand | Uncontrolled Knee Position |
| Excessive Trunk Flexion |
Straight Leg Raising | Knee Flexion |
| Ankle Plantar Flexion + Knee Flexion |
| Ankle Plantar Flexion + Knee Slight Flexion |
Collected Dataset : Extracting Joints
Body-joint extractor
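As an illustration of a body-joint extractor, a sketch using MediaPipe Pose; the library choice, video path, and landmark format are assumptions rather than the exact extraction setup used for the collected dataset.

```python
# Illustrative body-joint extraction from a video with MediaPipe Pose.
import cv2
import mediapipe as mp

pose = mp.solutions.pose.Pose(static_image_mode=False)
cap = cv2.VideoCapture("exercise.mp4")                 # placeholder video path
frames = []

while True:
    ok, frame = cap.read()
    if not ok:
        break
    results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.pose_landmarks:
        # 33 landmarks, each with normalized x, y, z coordinates
        joints = [(lm.x, lm.y, lm.z) for lm in results.pose_landmarks.landmark]
        frames.append(joints)

cap.release()
print("Extracted", len(frames), "skeleton frames")
```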
One Dollar
Extra Tree
Extra Tree
LightGBM
FCBF
Fast Correlation-Based Filter:
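A simplified, hypothetical sketch of the FCBF idea, assuming discretized integer-coded features: rank features by symmetric uncertainty (SU) with the class, then drop a feature if an already selected feature is more correlated with it than it is with the class.

```python
# Simplified FCBF sketch based on symmetric uncertainty; not the exact implementation used here.
import numpy as np
from sklearn.metrics import mutual_info_score
from scipy.stats import entropy

def symmetric_uncertainty(a, b):
    h_a = entropy(np.bincount(a))
    h_b = entropy(np.bincount(b))
    return 2.0 * mutual_info_score(a, b) / (h_a + h_b + 1e-12)

def fcbf(X, y, threshold=0.0):
    """X: (n_samples, n_features) integer-coded features, y: integer labels."""
    su_class = [symmetric_uncertainty(X[:, j], y) for j in range(X.shape[1])]
    order = [j for j in np.argsort(su_class)[::-1] if su_class[j] > threshold]
    selected = []
    for j in order:
        # keep j only if no already-selected feature is more correlated with it than j is with y
        if all(symmetric_uncertainty(X[:, j], X[:, s]) < su_class[j] for s in selected):
            selected.append(j)
    return selected
```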
CNN
1D CNN
Benefits of Transfer Learning
Modifying and Fine-Tuning Pre-trained Models
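A minimal Keras sketch of modifying and fine-tuning a pretrained backbone (MobileNetV3Small as an example, matching the model families named in this work): replace the classification head, train it with the base frozen, then unfreeze the last layers at a lower learning rate. The layer counts, head sizes, and learning rates are assumptions.

```python
# Hypothetical two-stage fine-tuning sketch for a pretrained backbone.
from tensorflow.keras import Model, layers, optimizers
from tensorflow.keras.applications import MobileNetV3Small

base = MobileNetV3Small(include_top=False, weights="imagenet",
                        input_shape=(224, 224, 3), pooling="avg")
base.trainable = False                                  # stage 1: train the new head only

x = layers.Dense(352, activation="relu")(base.output)
x = layers.Dropout(0.2)(x)
out = layers.Dense(10, activation="softmax")(x)
model = Model(base.input, out)
model.compile(optimizer=optimizers.Adam(1e-3),
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=...)

# stage 2: unfreeze the last layers and fine-tune at a lower learning rate
for layer in base.layers[-30:]:
    layer.trainable = True
model.compile(optimizer=optimizers.Adam(1e-5),
              loss="categorical_crossentropy", metrics=["accuracy"])
```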
ViT
Residual Models
Dense Models
MobileNet Models
Acknowledgments
Acknowledgments
I am deeply grateful for the support and guidance provided by my supervisors:
A special thank you to:
References
Proposed Framework
System Overview
Proposed System : Overview
Proposed System : Technical
Proposed System : Track 1 - Classical
Proposed System : Track 2 – Deep Learning
Research plan
Phase 1
Phase 2
Phase 3
Phase 4
Phase 5