1 of 22

K-EmoPhone Dataset

“Emotion & Stress Tracking”

Minhajur Rahman Chowdhury Mahim

2025.06.13

Final Submission

2 of 22

Overview

  • Objective: Predict user stress using smartphone and wearable data.
  • Dataset: K-EmoPhone (2,619 samples)
  • Approach: Feature engineering, selection, tuning, and modeling
  • Challenge: High-dimensional tabular data (5,589 features)
  • Validation: 5-fold StratifiedGroupKFold, kept fixed across all experiments for consistency (a sketch follows the feature counts below)

Feature group counts (total = 5,589):

Feature group             | Count
feat_time                 | 16
feat_dsc                  | 55
feat_current_sensor       | 84
feat_current_ESM          | 1
feat_ImmediatePast_sensor | 416
feat_ImmediatePast_ESM    | 0
feat_today_sensor         | 2496
feat_today_ESM            | 6
feat_yesterday_sensor     | 2496
feat_yesterday_ESM        | 6
feat_sleep                | 2
feat_pif                  | 11
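Below is a minimal sketch of that validation protocol, assuming a feature matrix X, a binary stress label y, and a per-participant group array; the variable names and the plain XGBClassifier stand-in are placeholders rather than the actual pipeline.

import numpy as np
from sklearn.model_selection import StratifiedGroupKFold
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

def cross_validate_auc(X, y, groups, seed=42):
    """5-fold StratifiedGroupKFold: folds are stratified by the stress label
    and grouped by participant, so no user appears in both train and test."""
    cv = StratifiedGroupKFold(n_splits=5, shuffle=True, random_state=seed)
    aucs = []
    for train_idx, test_idx in cv.split(X, y, groups):
        model = XGBClassifier(eval_metric="logloss", random_state=seed)
        model.fit(X[train_idx], y[train_idx])
        proba = model.predict_proba(X[test_idx])[:, 1]
        aucs.append(roc_auc_score(y[test_idx], proba))
    return float(np.mean(aucs))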

3 of 22

Assignment 1 - Feature Combinations

  • Current baseline: feat_baseline
    • Time-related features (feat_time), duration since the last change of categorical data (feat_dsc), current sensor features (feat_current_sensor), and immediate-past sensor features (feat_ImmediatePast_sensor)
  • Goal: Test the performance impact of adding or removing feature groups (a selection sketch follows this list)
  • All settings other than the feature combination remain unchanged
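As a rough illustration of how feature-group combinations might be assembled, the sketch below assumes each group can be identified by a column-name prefix and reuses the cross_validate_auc helper from the Overview sketch; the prefixes, the DataFrame, and the helper are hypothetical, not the dataset's actual schema.

import pandas as pd

# Hypothetical mapping from feature-group names to column-name prefixes.
GROUP_PREFIXES = {
    "feat_time": "TIME#",
    "feat_dsc": "DSC#",
    "feat_current_sensor": "CUR#",
    "feat_ImmediatePast_sensor": "IP#",
    "feat_today_sensor": "TODAY#",
}

BASELINE = ["feat_time", "feat_dsc", "feat_current_sensor", "feat_ImmediatePast_sensor"]

def select_groups(df: pd.DataFrame, groups_to_use):
    """Keep only the columns whose name starts with a requested group's prefix."""
    prefixes = tuple(GROUP_PREFIXES[g] for g in groups_to_use)
    return df[[c for c in df.columns if c.startswith(prefixes)]]

# Example (X_df, y, pid are placeholders for the preprocessed data):
# for combo in (BASELINE, BASELINE + ["feat_today_sensor"]):
#     X_sub = select_groups(X_df, combo).to_numpy()
#     print(combo, cross_validate_auc(X_sub, y, pid))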

4 of 22

Feature Combinations

Feature combination                                      | Mean AUC | Notes
feat_baseline                                            | 0.5610   | Baseline only
all_features                                             | 0.5783   | All sensor data
baseline+today+current_esm+pid+sleep                     | 0.5817   | Adds PID and sleep
baseline+today+current_esm+sleep                         | 0.5790   | Adds sleep only
baseline+today+current_esm+yesterday+immediate_past_esm  | 0.5991   | Adds yesterday and recent ESM
baseline+today+current_esm+yesterday+pid+sleep           | 0.5855   | Adds yesterday, PID, and sleep
baseline+today+current_esm+yesterday-no_immediate_past   | 0.5932   | Omits immediate past
baseline+today+current_esm+yesterday                     | 0.5991   | Adds yesterday only
baseline+today+current_esm                               | 0.5898   | No yesterday data
current+ImmediatePast                                    | 0.5622   | No baseline, includes past ESM
current                                                  | 0.5760   | Current ESM data only
dsc                                                      | 0.5399   | Duration-since-change features only
sensor+time                                              | 0.5383   | Time + sensor features only

5 of 22

Assignment 2 - Feature Selection

  • Methods tried in the baseline:
    • LASSO (threshold = 0.005 or mean)
    • XGBoost embedded importances
  • Goal: Retain predictive, non-redundant features
  • Feature sets: the baseline feature groups and the best feature combinations from Assignment 1

6 of 22

Assignment 2 - Feature Selection

Method                           | Description
LASSO (baseline)                 | SelectFromModel(LogisticRegression(penalty='l1'), threshold=0.005)
LASSO (adaptive)                 | Same as above, but varying the threshold (linear model)
SHAP (global + local, EvXGBoost) | Model-agnostic importance scores
Random Forest                    | Non-linear model used for selection
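A minimal sketch of the SelectFromModel-based LASSO selection named above, applied to a fold's training data only; the C and threshold values mirror the next slide's table, while the standardization step and variable names are assumptions.

from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

def lasso_select(X_train, y_train, C=1.0, threshold=0.005):
    """Fit an L1-penalized logistic regression and keep the features whose
    absolute coefficient exceeds the threshold ('mean' is also accepted)."""
    lasso = LogisticRegression(penalty="l1", solver="liblinear", C=C, max_iter=5000)
    selector = SelectFromModel(lasso, threshold=threshold)
    selector.fit(StandardScaler().fit_transform(X_train), y_train)
    return selector.get_support()  # boolean mask over the columns

# mask = lasso_select(X[train_idx], y[train_idx], C=1, threshold=0.005)
# X_selected = X[:, mask]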

7 of 22

Feature Selection - LASSO (Mean AUC > 0.59)

Feature set                                              | C   | Threshold | Mean AUC
feat_baseline                                            | 1   | 0.005     | 0.5610
baseline+today+current_esm+yesterday+immediate_past_esm  | 1   | 0.005     | 0.5991
baseline+today+current_esm+yesterday+immediate_past_esm  | 10  | 0.005     | 0.5936
baseline+today+current_esm+yesterday-no_immediate_past   | 1   | 0.001     | 0.6113
baseline+today+current_esm+yesterday-no_immediate_past   | 1   | 0.005     | 0.5932
baseline+today+current_esm+yesterday-no_immediate_past   | 1   | mean      | 0.5997
baseline+today+current_esm+yesterday-no_immediate_past   | 10  | mean      | 0.5912
baseline+today+current_esm+yesterday                     | 1   | 0.005     | 0.5991
baseline+today+current_esm+yesterday                     | 10  | 0.005     | 0.5936
baseline+today+current_esm                               | 0.1 | 0.001     | 0.5960
baseline+today+current_esm                               | 0.1 | 0.005     | 0.5982
baseline+today+current_esm                               | 0.1 | mean      | 0.5906
baseline+today+current_esm                               | 10  | 0.001     | 0.5965
baseline+today+current_esm                               | 10  | 0.005     | 0.5975

8 of 22

Feature Selection - SHAP (Mean AUC > 0.60)

All features are fed into SHAP:

  • Captures contributions from every available feature.
  • Problem: Running SHAP on the full data before splitting lets test data influence the selection.
  • Solution:
    • Used StratifiedGroupKFold (n=5, same seed) to match the final CV splits.
    • Collected only the training indices across folds.
    • Merged the training data, fit the model, and computed SHAP values.
    • Selected the top-N features from training data only (see the sketch below).
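One way to realize the leak-free procedure above is the per-fold sketch below: SHAP importances are computed on each fold's training portion only, and the top-N features are then scored on that fold's held-out data. The shap.TreeExplainer usage and the variable names are assumptions about the pipeline, not its actual code.

import numpy as np
import shap
from sklearn.model_selection import StratifiedGroupKFold
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

def shap_top_n_cv(X, y, groups, n_top=60, seed=42):
    """Per-fold, leak-free SHAP selection: importances come from the training
    portion only, and the fold's held-out data is used purely for scoring."""
    cv = StratifiedGroupKFold(n_splits=5, shuffle=True, random_state=seed)
    aucs = []
    for tr, te in cv.split(X, y, groups):
        ranker = XGBClassifier(random_state=seed).fit(X[tr], y[tr])
        shap_values = shap.TreeExplainer(ranker).shap_values(X[tr])
        importance = np.abs(shap_values).mean(axis=0)   # mean |SHAP| per feature
        top = np.argsort(importance)[::-1][:n_top]      # indices of the top-N features
        model = XGBClassifier(random_state=seed).fit(X[tr][:, top], y[tr])
        proba = model.predict_proba(X[te][:, top])[:, 1]
        aucs.append(roc_auc_score(y[te], proba))
    return float(np.mean(aucs))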

Feature set  | Top-N features | Mean AUC
all_features | 10             | 0.6119
all_features | 20             | 0.6317
all_features | 30             | 0.6411
all_features | 40             | 0.6545
all_features | 45             | 0.6568
all_features | 50             | 0.6533
all_features | 60             | 0.6609
all_features | 65             | 0.6541
all_features | 70             | 0.6555
all_features | 80             | 0.6212
all_features | 90             | 0.6331
all_features | 100            | 0.6495
all_features | 200            | 0.6580
all_features | 300            | 0.6423
all_features | 500            | 0.6085

9 of 22

Feature Selection - Random Forest (Mean AUC > 0.60)

  • Captures non-linear interactions between features and the target.
  • Ranks features by their contribution to reducing prediction error across many trees (a selection sketch follows the table below).

Feature set  | Top-K features | Mean AUC
all_features | 10             | 0.6060
all_features | 20             | 0.6089
all_features | 25             | 0.6125
all_features | 30             | 0.6261
all_features | 35             | 0.6138
all_features | 45             | 0.6104
all_features | 50             | 0.6137
all_features | 100            | 0.6064
all_features | 200            | 0.6061
all_features | 300            | 0.6051
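A minimal sketch of the importance ranking described above, assuming numpy feature and label arrays; the returned top-K indices would then go through the same cross-validation as the other selection methods.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def rf_top_k(X_train, y_train, k=30, seed=42):
    """Rank features by Gini importance from a Random Forest fit on the
    training data only, and return the indices of the top-k features."""
    forest = RandomForestClassifier(n_estimators=500, n_jobs=-1, random_state=seed)
    forest.fit(X_train, y_train)
    return np.argsort(forest.feature_importances_)[::-1][:k]

# top_idx = rf_top_k(X[train_idx], y[train_idx], k=30)
# X_reduced = X[:, top_idx]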

10 of 22

Assignment 3 - Hyperparameter Tuning

  • Tool: Hyperopt (Bayesian optimization)
    • Explores promising regions of the parameter space instead of an exhaustive grid search
  • Parameters tuned: max_depth, learning_rate, subsample, colsample_bytree, reg_alpha, reg_lambda
  • Goal: Optimize XGBoost for the best performance (keeping the baseline model)
  • Dataset: the top 60 SHAP-selected features (best result so far)

11 of 22

Assignment 3 - Hyperparameter Tuning: Hyperopt (Same)

import numpy as np
from hyperopt import hp

space = {
    'max_depth': hp.choice('max_depth', list(range(3, 10))),
    'min_child_weight': hp.quniform('min_child_weight', 1, 10, 1),
    'subsample': hp.uniform('subsample', 0.6, 1.0),
    'colsample_bytree': hp.uniform('colsample_bytree', 0.6, 1.0),
    'gamma': hp.uniform('gamma', 0, 5),
    'learning_rate': hp.loguniform('learning_rate', np.log(0.01), np.log(0.3)),
    'n_estimators': hp.quniform('n_estimators', 50, 300, 10),
    'reg_alpha': hp.uniform('reg_alpha', 0, 1),
    'reg_lambda': hp.uniform('reg_lambda', 0, 1),
    'random_state': 42,
}
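A minimal sketch of how this space might be handed to Hyperopt's fmin, assuming an objective that returns 1 − mean CV AUC; the cross_validate_auc_model helper, X_top60, y, and pid are placeholders rather than the actual notebook code.

from hyperopt import fmin, tpe, Trials, STATUS_OK
from xgboost import XGBClassifier

def objective(params):
    # hp.quniform yields floats; XGBoost expects integer counts for these two.
    params = dict(params,
                  n_estimators=int(params["n_estimators"]),
                  min_child_weight=int(params["min_child_weight"]))
    model = XGBClassifier(**params, eval_metric="logloss")
    auc = cross_validate_auc_model(model, X_top60, y, pid)  # hypothetical CV helper
    return {"loss": 1.0 - auc, "status": STATUS_OK}

best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=100, trials=Trials(),
            rstate=np.random.default_rng(42))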

Best parameters found:

{'colsample_bytree': 0.8463935920953023,
 'gamma': 0.2144097629496495,
 'learning_rate': 0.0736917612203914,
 'max_depth': 6,
 'min_child_weight': 1,
 'n_estimators': 280,
 'reg_alpha': 0.4775914535854724,
 'reg_lambda': 0.6862888539325457,
 'subsample': 0.7290628198195405,
 'random_state': 42}

Mean AUC: 0.566

12 of 22

Assignment 4 - Deep Learning

  • Model: TabNet (attention-based DL for tabular data)
  • Compared to tuned XGBoost
  • Goal: Use the same feature set as the top-performing XGBoost setup from the previous assignments

Comparing with XGBoost:

  • Macro AUC-ROC
  • Feature combinations

13 of 22

TabNet Benchmark Results

Setting                  | Mean AUC
All Features             | 0.5607
Baseline - Before Tuning | 0.5703
SHAP Top 60              | 0.6591

Feature Selections

  • SHAP Top 60: selected with SHAP (EvXGBoost); the best pre-tuning AUC.
  • All Features: deep learning can handle feature selection implicitly; in TabNet this happens through its sparsity-regularized attention masks.

14 of 22

TabNet Hyperparameter Tuning

import numpy as np
from hyperopt import hp

space = {
    "n_d": hp.choice("n_d", [32, 64, 128]),
    "n_a": hp.choice("n_a", [32, 64, 128]),
    "n_steps": hp.choice("n_steps", [3, 5, 7]),
    "gamma": hp.uniform("gamma", 1.0, 2.0),
    "lambda_sparse": hp.loguniform("lambda_sparse", np.log(1e-6), np.log(1e-2)),
    "lr": hp.loguniform("lr", np.log(1e-3), np.log(5e-2)),
}
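A rough sketch of fitting pytorch-tabnet with one sampled configuration from this space; the constructor arguments follow the TabNetClassifier API, while the training settings (epochs, patience, batch size) and the data splits are placeholders.

import torch
from pytorch_tabnet.tab_model import TabNetClassifier

def train_tabnet(params, X_train, y_train, X_valid, y_valid):
    """Fit TabNet with one sampled hyperparameter configuration and track
    validation AUC through pytorch-tabnet's eval_metric mechanism."""
    clf = TabNetClassifier(
        n_d=params["n_d"], n_a=params["n_a"], n_steps=params["n_steps"],
        gamma=params["gamma"], lambda_sparse=params["lambda_sparse"],
        optimizer_fn=torch.optim.Adam,
        optimizer_params=dict(lr=params["lr"]),
        seed=42,
    )
    clf.fit(X_train, y_train,
            eval_set=[(X_valid, y_valid)],
            eval_metric=["auc"],
            max_epochs=100, patience=20, batch_size=256)
    return clf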

15 of 22

Assignment 5 - Final Model

  • Features: the SHAP top-60 features from Assignment 2
  • Soft voting ensemble: EvXGBoost and TabNet (see the sketch below)
    • Averages (or weights) the predicted probabilities from both models instead of combining their hard class predictions.
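A minimal sketch of the weighted soft vote, assuming both models were already fitted on the SHAP top-60 training features; the weight values mirror the table below.

from sklearn.metrics import roc_auc_score

def soft_vote_auc(xgb_model, tabnet_model, X_test, y_test, w_xgb=0.55, w_tab=0.45):
    """Blend the two models' positive-class probabilities with fixed weights
    instead of voting on their hard class predictions."""
    p_xgb = xgb_model.predict_proba(X_test)[:, 1]
    p_tab = tabnet_model.predict_proba(X_test)[:, 1]
    blended = w_xgb * p_xgb + w_tab * p_tab
    return roc_auc_score(y_test, blended)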

XGBoost weight | TabNet weight | AUC
0.50           | 0.50          | 0.681
0.40           | 0.60          | 0.678
0.60           | 0.40          | 0.680
0.55           | 0.45          | 0.682

16 of 22

Features Chosen by SHAP

ESM#LastLabel

LOC_LABEL#RLV_SUP=work#ImmediatePast_15

APP_CAT#RLV_SUP=HEALTH#ImmediatePast_15

PIF#BFI_NEU

ACC_MAG#MED#ImmediatePast_15

WIF_MAN#BEP#ImmediatePast_15

LOC_DST#SKW#ImmediatePast_15

AML#SKW#ImmediatePast_15

APP_CAT#DSC=HEALTH

APP_CAT#DSC=INFO

WIF_JAC#KUR#ImmediatePast_15

RNG#DSC=NORMAL

ACE_VHC#SKW#ImmediatePast_15

SKT#BEP#ImmediatePast_15

SCR_DUR#KUR_TodayAfternoon

APP_DUR_UNKNOWN#MED#ImmediatePast_15

SKT#SKW#ImmediatePast_15

BAT_LEV#KUR#ImmediatePast_15

STP#SKW_TodayLateAfternoon

HRT#KUR#ImmediatePast_15

ACC_AXY#BEP_TodayLateAfternoon

WIF_JAC#BEP#ImmediatePast_15

ACC_AXZ#VAL

HRV#skw#ImmediatePast_15

AML#MED#YesterdayNight

APP_DUR_UNKNOWN#AVG#ImmediatePast_15

HRT#BEP#ImmediatePast_15

CAE_DUR#VAL

WIF_MAN#AVG#ImmediatePast_15

SKT#KUR_TodayAfternoon

AML#AVG#ImmediatePast_15

ACE_RUN#BEP_TodayAfternoon

ACC_AXZ#AVG_TodayAfternoon

ACC_AXY#SKW#ImmediatePast_15

WIF_JAC#MED#ImmediatePast_15

WIF_EUC#STD_TodayMorning

ACE_TLT#STD#ImmediatePast_15

SCR_DUR#KUR_TodayEvening

SCR_DUR#VAL

SCR_EVENT#RLV_SUP=ON_TodayMorning

LOC_LABEL#DSC=eating

ACE_TLT#TSC#ImmediatePast_15

LOC_LABEL#ASC##YesterdayMorning

DST_SPD#TSC#ImmediatePast_15

HRV#slope#ImmediatePast_15

EDA#min_phasic_TodayMorning

ACT#RLV_SUP=WALKING_TodayLateAfternoon

LOC_DST#ASC_TodayMorning

ESM#LIK#YesterdayEvening

APP_DUR_UNKNOWN#KUR#ImmediatePast_15

17 of 22

Discussion

  • 26 of the 50 selected features are #ImmediatePast features
    • Recent historical context is highly informative

  • Context-rich features such as #ImmediatePast_15, #TodayAfternoon, and #YesterdayNight appear frequently
    • Temporal dynamics matter

  • ESM data acts best as a guiding factor rather than a primary predictor (from Assignment 1)

  • A mix of objective signals (sensors) and contextual data (ESM, location labels) is important

18 of 22

Code Review

  • Deterministic Results: Random seeds fixed, model states saved to ensure reproducibility.
  • Clarity Through Comments: Well-placed inline comments and markdown cells improve readability.
  • Fair Model Comparison: All models and baselines evaluated using identical settings and metrics.
  • Clean Code Style: Clear variable names and consistent formatting enhance maintainability.

19 of 22

Snippets

20 of 22

Retraction From Final Presentation

  • There was an error in the SHAP-based feature selection
    • SHAP values were mistakenly computed on the entire dataset
    • Corrective measures have been taken to prevent data leakage
      • The fix is described on the SHAP slide
  • Some performance metrics have changed as a result

21 of 22

Resources

  1. Kang, S., Choi, W., Park, C.Y., Kim, J.H., Lee, J., & Lee, U. (2023). K-EmoPhone: A Mobile and Wearable Dataset with In-Situ Emotion, Stress, and Attention Labels. Scientific Data, 10, 351.
  2. Tibshirani, R. (1996). Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267–288.
  3. Chen, T., & Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785–794).
  4. Lundberg, S. M., & Lee, S.-I. (2017). A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30 (NeurIPS 2017).
  5. Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5–32.
  6. Bergstra, J., Yamins, D., & Cox, D. D. (2013). Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures. In Proceedings of the 30th International Conference on Machine Learning (ICML-13), 115–123.
  7. Arik, S. Ö., & Pfister, T. (2021). TabNet: Attentive Interpretable Tabular Learning. In Proceedings of the AAAI Conference on Artificial Intelligence, 35(8), 6679–6687.

22 of 22

Thank you