K-EmoPhone Dataset
“Emotion & Stress Tracking”
Minhajur Rahman Chowdhury Mahim
2025.06.13
Final Submission
Overview
Feature groups and the number of features in each:
{'feat_time': 16,
 'feat_dsc': 55,
 'feat_current_sensor': 84,
 'feat_current_ESM': 1,
 'feat_ImmediatePast_sensor': 416,
 'feat_ImmediatePast_ESM': 0,
 'feat_today_sensor': 2496,
 'feat_today_ESM': 6,
 'feat_yesterday_sensor': 2496,
 'feat_yesterday_ESM': 6,
 'feat_sleep': 2,
 'feat_pif': 11}
Assignment 1 - Feature Combinations
Feature Set | Mean AUC | Notes |
feat_baseline | 0.5610 | Baseline only |
all_features | 0.5783 | All sensor data |
baseline+today+current_esm+pid+sleep | 0.5817 | Adds PID and sleep |
baseline+today+current_esm+sleep | 0.579 | Adds sleep only |
baseline+today+current_esm+yesterday+immediate_past_esm | 0.5991 | Adds yesterday and recent ESM |
baseline+today+current_esm+yesterday+pid+sleep | 0.5855 | Adds yesterday, PID, and sleep |
baseline+today+current_esm+yesterday-no_immediate_past | 0.5932 | Omits immediate past |
baseline+today+current_esm+yesterday | 0.5991 | Adds yesterday only |
baseline+today+current_esm | 0.5898 | No yesterday data |
current+ImmediatePast | 0.5622 | No baseline, includes past ESM |
current | 0.576 | Current ESM data only |
dsc | 0.5399 | Dynamic social context features |
sensor+time | 0.5383 | Time + sensor features only |
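Each row above is a union of the feature groups listed in the Overview. A minimal sketch of how such a combination might be assembled, assuming a hypothetical `groups` mapping from group name to column names. The column names are placeholders rather than dataset columns, and the deck does not state which groups make up `feat_baseline`:

```python
from typing import Dict, List


def combine_groups(groups: Dict[str, List[str]], names: List[str]) -> List[str]:
    """Return the ordered, de-duplicated union of the columns in the named groups."""
    seen = set()
    columns = []
    for name in names:
        for col in groups[name]:  # an unknown group name raises KeyError on purpose
            if col not in seen:
                seen.add(col)
                columns.append(col)
    return columns


# Hypothetical toy mapping; the real groups hold between 0 and 2496 columns each.
groups = {
    "feat_time": ["time_a", "time_b"],
    "feat_current_ESM": ["esm_a"],
    "feat_sleep": ["sleep_a", "sleep_b"],
}
selected = combine_groups(groups, ["feat_time", "feat_current_ESM", "feat_sleep"])
print(len(selected))
```

Keeping the groups in a mapping makes each experiment a one-line list of group names, which is convenient when many combinations are compared.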
Assignment 2 - Feature Selection
Method | Description |
LASSO (baseline) | SelectFromModel(LogisticRegression(penalty='l1'), threshold=0.005) |
LASSO (adaptive) | Same as above, but with the selection threshold varied (linear model) |
SHAP (global + local) (EvXGBoost) | Model-agnostic |
Random Forest | Non-linear model for selection |
Feature Selection - LASSO (Mean AUC > 0.59)
Model | C | Threshold | Mean AUC |
feat_baseline | 1 | 0.005 | 0.5610 |
baseline+today+current_esm+yesterday+immediate_past_esm | 1 | 0.005 | 0.5991 |
baseline+today+current_esm+yesterday+immediate_past_esm | 10 | 0.005 | 0.5936 |
baseline+today+current_esm+yesterday-no_immediate_past | 1 | 0.001 | 0.6113 |
baseline+today+current_esm+yesterday-no_immediate_past | 1 | 0.005 | 0.5932 |
baseline+today+current_esm+yesterday-no_immediate_past | 1 | mean | 0.5997 |
baseline+today+current_esm+yesterday-no_immediate_past | 10 | mean | 0.5912 |
baseline+today+current_esm+yesterday | 1 | 0.005 | 0.5991 |
baseline+today+current_esm+yesterday | 10 | 0.005 | 0.5936 |
baseline+today+current_esm | 0.1 | 0.001 | 0.596 |
baseline+today+current_esm | 0.1 | 0.005 | 0.5982 |
baseline+today+current_esm | 0.1 | mean | 0.5906 |
baseline+today+current_esm | 10 | 0.001 | 0.5965 |
baseline+today+current_esm | 10 | 0.005 | 0.5975 |
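The LASSO selection varied above (C and threshold) can be sketched with scikit-learn as follows. The synthetic data, the `liblinear` solver, the standardization step, and the helper name `lasso_select` are assumptions made for illustration; the deck only specifies the L1 penalty and the threshold values:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; the real input would be the merged feature table.
X, y = make_classification(n_samples=300, n_features=20, n_informative=5, random_state=0)
X = StandardScaler().fit_transform(X)  # L1 coefficients are sensitive to feature scale


def lasso_select(X, y, C=1.0, threshold=0.005):
    """Return a boolean mask of features whose |coefficient| is >= threshold."""
    estimator = LogisticRegression(penalty="l1", solver="liblinear", C=C, random_state=0)
    selector = SelectFromModel(estimator, threshold=threshold).fit(X, y)
    return selector.get_support()


mask = lasso_select(X, y, C=1.0, threshold=0.005)
X_selected = X[:, mask]
print(X_selected.shape)
```

`SelectFromModel` also accepts the string `"mean"` as a threshold, which corresponds to the `mean` rows in the table.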
Feature Selection - SHAP (Mean AUC > 0.60)
All features are passed to SHAP:
Merge the training data → fit the model → compute SHAP values → keep the top N features.
Feature Set | Top N Features | Mean AUC |
all_features | 10 | 0.6119 |
all_features | 20 | 0.6317 |
all_features | 30 | 0.6411 |
all_features | 40 | 0.6545 |
all_features | 45 | 0.6568 |
all_features | 50 | 0.6533 |
all_features | 60 | 0.6609 |
all_features | 65 | 0.6541 |
all_features | 70 | 0.6555 |
all_features | 80 | 0.6212 |
all_features | 90 | 0.6331 |
all_features | 100 | 0.6495 |
all_features | 200 | 0.658 |
all_features | 300 | 0.6423 |
all_features | 500 | 0.6085 |
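The top-N step can be sketched as below. It assumes features are ranked by mean absolute SHAP value, which the deck does not state explicitly, and it takes the SHAP matrix as given (it could come from, for instance, `shap.TreeExplainer(model).shap_values(X)`). The function name and toy values are illustrative only:

```python
import numpy as np


def top_n_by_shap(shap_values, feature_names, n):
    """Rank features by mean |SHAP| over samples and return the top n names."""
    shap_values = np.asarray(shap_values, dtype=float)
    importance = np.abs(shap_values).mean(axis=0)   # one score per feature
    order = np.argsort(-importance, kind="stable")  # descending; ties keep column order
    return [feature_names[i] for i in order[:n]]


# Toy SHAP matrix (3 samples x 4 features); the values are made up.
toy_shap = [[0.1, -0.5, 0.0, 0.2],
            [-0.1, 0.4, 0.0, 0.3],
            [0.2, -0.6, 0.1, -0.1]]
names = ["f0", "f1", "f2", "f3"]
print(top_n_by_shap(toy_shap, names, 2))  # ['f1', 'f3']
```

Sweeping `n` over the values in the table and re-evaluating the model on each subset would reproduce the shape of the experiment.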
Feature Selection - Random Forest (Mean AUC > 0.60)
Feature Set | Top K Features | Mean AUC |
all_features | 10 | 0.606 |
all_features | 20 | 0.6089 |
all_features | 25 | 0.6125 |
all_features | 30 | 0.6261 |
all_features | 35 | 0.6138 |
all_features | 45 | 0.6104 |
all_features | 50 | 0.6137 |
all_features | 100 | 0.6064 |
all_features | 200 | 0.6061 |
all_features | 300 | 0.6051 |
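A minimal sketch of top-K selection with a random forest, assuming the ranking uses scikit-learn's impurity-based `feature_importances_`. The deck does not specify the importance measure or the forest settings, so the data, the `n_estimators` value, and the helper name are placeholders:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data for illustration.
X, y = make_classification(n_samples=300, n_features=20, n_informative=5, random_state=0)


def top_k_by_forest(X, y, k, random_state=0):
    """Return the column indices of the k largest impurity-based importances."""
    forest = RandomForestClassifier(n_estimators=200, random_state=random_state)
    forest.fit(X, y)
    order = np.argsort(forest.feature_importances_)[::-1]  # descending importance
    return order[:k]


top_idx = top_k_by_forest(X, y, k=10)
X_top = X[:, top_idx]
print(X_top.shape)
```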
Assignment 3 - Hyperparameter Tuning
Assignment 3 - Hyperparameter Tuning: Hyperopt (Same)
import numpy as np
from hyperopt import hp

# Search space for XGBoost. hp.quniform samples floats (e.g. 280.0),
# so n_estimators should be cast with int() before it is passed to the model.
space = {
    'max_depth': hp.choice('max_depth', list(range(3, 10))),
    'min_child_weight': hp.quniform('min_child_weight', 1, 10, 1),
    'subsample': hp.uniform('subsample', 0.6, 1.0),
    'colsample_bytree': hp.uniform('colsample_bytree', 0.6, 1.0),
    'gamma': hp.uniform('gamma', 0, 5),
    'learning_rate': hp.loguniform('learning_rate', np.log(0.01), np.log(0.3)),
    'n_estimators': hp.quniform('n_estimators', 50, 300, 10),
    'reg_alpha': hp.uniform('reg_alpha', 0, 1),
    'reg_lambda': hp.uniform('reg_lambda', 0, 1),
    'random_state': 42
}
Best parameters:
{'colsample_bytree': 0.8463935920953023,
 'gamma': 0.2144097629496495,
 'learning_rate': 0.0736917612203914,
 'max_depth': 6,
 'min_child_weight': 1,
 'n_estimators': 280,
 'reg_alpha': 0.4775914535854724,
 'reg_lambda': 0.6862888539325457,
 'subsample': 0.7290628198195405,
 'random_state': 42}
Mean AUC: 0.566
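One practical detail of this search space: `hp.quniform` returns floats, and `fmin` reports `hp.choice` entries as list indices unless the result is resolved with `hyperopt.space_eval`. A small sketch of the type cleanup, using a hypothetical helper `to_xgb_params` that is not part of the original code:

```python
def to_xgb_params(sampled: dict) -> dict:
    """Return a copy of the sampled parameters with integer-valued entries cast to int."""
    params = dict(sampled)  # copy so the caller's dict is left untouched
    # hp.quniform yields floats such as 280.0; tree counts and depths should be ints.
    for key in ("n_estimators", "max_depth"):
        if key in params:
            params[key] = int(params[key])
    return params


# Illustrative sample in the shape hyperopt would hand to the objective function.
sampled = {"max_depth": 6, "min_child_weight": 1.0,
           "n_estimators": 280.0, "learning_rate": 0.0737}
print(to_xgb_params(sampled))
```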
Assignment 4 - Deep Learning
Comparing with XGBoost:
TabNet Benchmark Results
Setting | Mean AUC |
All Features | 0.5607 |
Baseline - Before Tuning | 0.5703 |
SHAP Top 60 | 0.6591 |
Feature Selections
Hyperparameter Tuning
space = {
    "n_d": hp.choice("n_d", [32, 64, 128]),
    "n_a": hp.choice("n_a", [32, 64, 128]),
    "n_steps": hp.choice("n_steps", [3, 5, 7]),
    "gamma": hp.uniform("gamma", 1.0, 2.0),
    "lambda_sparse": hp.loguniform("lambda_sparse", np.log(1e-6), np.log(1e-2)),
    "lr": hp.loguniform("lr", np.log(1e-3), np.log(5e-2)),
}
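If the TabNet implementation is the `pytorch_tabnet` package (an assumption; the deck does not name the library), the learning rate is passed through `optimizer_params` rather than as a top-level constructor argument, so a sampled point needs a small reshaping step. The helper name below is hypothetical:

```python
def split_tabnet_params(sampled: dict) -> dict:
    """Move 'lr' into optimizer_params; keep the architecture arguments as-is."""
    params = dict(sampled)  # copy so the sampled point is not mutated
    lr = params.pop("lr")
    params["optimizer_params"] = {"lr": lr}
    return params


# Illustrative sampled point from the space above.
sampled = {"n_d": 64, "n_a": 64, "n_steps": 5,
           "gamma": 1.5, "lambda_sparse": 1e-4, "lr": 0.02}
kwargs = split_tabnet_params(sampled)
print(kwargs)
```

Under that assumption, the resulting dictionary could be unpacked into the classifier constructor, for example `TabNetClassifier(**kwargs)`.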
Assignment 5 - Final Model
XGBoost Weight | TabNet Weight | AUC |
0.50 | 0.50 | 0.681 |
0.40 | 0.60 | 0.678 |
0.60 | 0.40 | 0.680 |
0.55 | 0.45 | 0.682 |
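The weights above suggest a weighted average of the two models' predicted probabilities. A minimal sketch under that assumption, with made-up labels and probabilities and a plain pairwise AUC so the example has no dependencies; the function names are illustrative:

```python
def blend(p_xgb, p_tabnet, w_xgb):
    """Weighted average of two probability lists; TabNet gets weight 1 - w_xgb."""
    return [w_xgb * a + (1.0 - w_xgb) * b for a, b in zip(p_xgb, p_tabnet)]


def auc(y_true, scores):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and probabilities, for illustration only.
y_true = [0, 0, 1, 1]
p_xgb = [0.2, 0.6, 0.5, 0.9]
p_tabnet = [0.3, 0.4, 0.7, 0.6]
for w in (0.40, 0.50, 0.55, 0.60):
    print(w, auc(y_true, blend(p_xgb, p_tabnet, w)))
```

In practice the weight would be chosen on validation folds rather than on the data used to report the final AUC.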
Features Chosen by SHAP
ESM#LastLabel | LOC_LABEL#RLV_SUP=work#ImmediatePast_15 | APP_CAT#RLV_SUP=HEALTH#ImmediatePast_15 |
PIF#BFI_NEU | ACC_MAG#MED#ImmediatePast_15 | WIF_MAN#BEP#ImmediatePast_15 |
LOC_DST#SKW#ImmediatePast_15 | AML#SKW#ImmediatePast_15 | APP_CAT#DSC=HEALTH |
APP_CAT#DSC=INFO | WIF_JAC#KUR#ImmediatePast_15 | RNG#DSC=NORMAL |
ACE_VHC#SKW#ImmediatePast_15 | SKT#BEP#ImmediatePast_15 | SCR_DUR#KUR_TodayAfternoon |
APP_DUR_UNKNOWN#MED#ImmediatePast_15 | SKT#SKW#ImmediatePast_15 | BAT_LEV#KUR#ImmediatePast_15 |
STP#SKW_TodayLateAfternoon | HRT#KUR#ImmediatePast_15 | ACC_AXY#BEP_TodayLateAfternoon |
WIF_JAC#BEP#ImmediatePast_15 | ACC_AXZ#VAL | HRV#skw#ImmediatePast_15 |
AML#MED#YesterdayNight | APP_DUR_UNKNOWN#AVG#ImmediatePast_15 | HRT#BEP#ImmediatePast_15 |
CAE_DUR#VAL | WIF_MAN#AVG#ImmediatePast_15 | SKT#KUR_TodayAfternoon |
AML#AVG#ImmediatePast_15 | ACE_RUN#BEP_TodayAfternoon | ACC_AXZ#AVG_TodayAfternoon |
ACC_AXY#SKW#ImmediatePast_15 | WIF_JAC#MED#ImmediatePast_15 | WIF_EUC#STD_TodayMorning |
ACE_TLT#STD#ImmediatePast_15 | SCR_DUR#KUR_TodayEvening | SCR_DUR#VAL |
SCR_EVENT#RLV_SUP=ON_TodayMorning | LOC_LABEL#DSC=eating | ACE_TLT#TSC#ImmediatePast_15 |
LOC_LABEL#ASC##YesterdayMorning | DST_SPD#TSC#ImmediatePast_15 | HRV#slope#ImmediatePast_15 |
EDA#min_phasic_TodayMorning | ACT#RLV_SUP=WALKING_TodayLateAfternoon | LOC_DST#ASC_TodayMorning |
ESM#LIK#YesterdayEvening | APP_DUR_UNKNOWN#KUR#ImmediatePast_15 | |
Discussion
Code Review
Snippets
Retraction From Final Presentation
Resources
Thank you