1 of 22

K-EmoPhone Dataset

“Emotion & Stress Tracking”

Minhajur Rahman Chowdhury Mahim

2025.06.13

Final Submission

2 of 22

Overview

  • Objective: Predict user stress using smartphone and wearable data.
  • Dataset: K-EmoPhone (2,619 samples)
  • Approach: Feature engineering, selection, tuning, and modeling
  • Challenge: High-dimensional tabular data (5,589 features)
  • Validation: 5-fold StratifiedGroupKFold, kept fixed across all experiments for consistency (a sketch follows the feature counts below)

Feature group counts (total = 5,589):

Feature group             | Count
feat_time                 | 16
feat_dsc                  | 55
feat_current_sensor       | 84
feat_current_ESM          | 1
feat_ImmediatePast_sensor | 416
feat_ImmediatePast_ESM    | 0
feat_today_sensor         | 2496
feat_today_ESM            | 6
feat_yesterday_sensor     | 2496
feat_yesterday_ESM        | 6
feat_sleep                | 2
feat_pif                  | 11
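Below is a minimal sketch of that validation protocol, assuming a feature matrix X, a binary stress label y, and a per-participant group array; the variable names and the plain XGBClassifier stand-in are placeholders rather than the actual pipeline.

import numpy as np
from sklearn.model_selection import StratifiedGroupKFold
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

def cross_validate_auc(X, y, groups, seed=42):
    """5-fold StratifiedGroupKFold: folds are stratified by the stress label
    and grouped by participant, so no user appears in both train and test."""
    cv = StratifiedGroupKFold(n_splits=5, shuffle=True, random_state=seed)
    aucs = []
    for train_idx, test_idx in cv.split(X, y, groups):
        model = XGBClassifier(eval_metric="logloss", random_state=seed)
        model.fit(X[train_idx], y[train_idx])
        proba = model.predict_proba(X[test_idx])[:, 1]
        aucs.append(roc_auc_score(y[test_idx], proba))
    return float(np.mean(aucs))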

3 of 22

Assignment 1 - Feature Combinations

  • Current baseline: feat_baseline
    • Time-related features (feat_time), duration since the last change of categorical data (feat_dsc), current sensor features (feat_current_sensor), and immediate-past sensor features (feat_ImmediatePast_sensor)
  • Goal: Test the performance impact of adding or removing feature groups (a selection sketch follows this list)
  • All settings other than the feature combination remain unchanged
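As a rough illustration of how feature-group combinations might be assembled, the sketch below assumes each group can be identified by a column-name prefix and reuses the cross_validate_auc helper from the Overview sketch; the prefixes, the DataFrame, and the helper are hypothetical, not the dataset's actual schema.

import pandas as pd

# Hypothetical mapping from feature-group names to column-name prefixes.
GROUP_PREFIXES = {
    "feat_time": "TIME#",
    "feat_dsc": "DSC#",
    "feat_current_sensor": "CUR#",
    "feat_ImmediatePast_sensor": "IP#",
    "feat_today_sensor": "TODAY#",
}

BASELINE = ["feat_time", "feat_dsc", "feat_current_sensor", "feat_ImmediatePast_sensor"]

def select_groups(df: pd.DataFrame, groups_to_use):
    """Keep only the columns whose name starts with a requested group's prefix."""
    prefixes = tuple(GROUP_PREFIXES[g] for g in groups_to_use)
    return df[[c for c in df.columns if c.startswith(prefixes)]]

# Example (X_df, y, pid are placeholders for the preprocessed data):
# for combo in (BASELINE, BASELINE + ["feat_today_sensor"]):
#     X_sub = select_groups(X_df, combo).to_numpy()
#     print(combo, cross_validate_auc(X_sub, y, pid))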

4 of 22

Feature Combinations

Feature combination                                      | Mean AUC | Notes
feat_baseline                                            | 0.5610   | Baseline only
all_features                                             | 0.5783   | All sensor data
baseline+today+current_esm+pid+sleep                     | 0.5817   | Adds PID and sleep
baseline+today+current_esm+sleep                         | 0.5790   | Adds sleep only
baseline+today+current_esm+yesterday+immediate_past_esm  | 0.5991   | Adds yesterday and recent ESM
baseline+today+current_esm+yesterday+pid+sleep           | 0.5855   | Adds yesterday, PID, and sleep
baseline+today+current_esm+yesterday-no_immediate_past   | 0.5932   | Omits immediate past
baseline+today+current_esm+yesterday                     | 0.5991   | Adds yesterday only
baseline+today+current_esm                               | 0.5898   | No yesterday data
current+ImmediatePast                                    | 0.5622   | No baseline, includes past ESM
current                                                  | 0.5760   | Current ESM data only
dsc                                                      | 0.5399   | Duration-since-change features only
sensor+time                                              | 0.5383   | Time + sensor features only

5 of 22

Assignment 2 - Feature Selection

  • Methods tried in the baseline:
    • LASSO (threshold = 0.005 or mean)
    • XGBoost embedded importances
  • Goal: Retain predictive, non-redundant features
  • Feature sets: the baseline feature groups and the best feature combinations from Assignment 1

6 of 22

Assignment 2 - Feature Selection

Method                           | Description
LASSO (baseline)                 | SelectFromModel(LogisticRegression(penalty='l1'), threshold=0.005)
LASSO (adaptive)                 | Same as above, but varying the threshold (linear model)
SHAP (global + local, EvXGBoost) | Model-agnostic importance scores
Random Forest                    | Non-linear model used for selection
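A minimal sketch of the SelectFromModel-based LASSO selection named above, applied to a fold's training data only; the C and threshold values mirror the next slide's table, while the standardization step and variable names are assumptions.

from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

def lasso_select(X_train, y_train, C=1.0, threshold=0.005):
    """Fit an L1-penalized logistic regression and keep the features whose
    absolute coefficient exceeds the threshold ('mean' is also accepted)."""
    lasso = LogisticRegression(penalty="l1", solver="liblinear", C=C, max_iter=5000)
    selector = SelectFromModel(lasso, threshold=threshold)
    selector.fit(StandardScaler().fit_transform(X_train), y_train)
    return selector.get_support()  # boolean mask over the columns

# mask = lasso_select(X[train_idx], y[train_idx], C=1, threshold=0.005)
# X_selected = X[:, mask]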

7 of 22

Feature Selection - LASSO (Mean AUC > 0.59)

Feature set                                              | C   | Threshold | Mean AUC
feat_baseline                                            | 1   | 0.005     | 0.5610
baseline+today+current_esm+yesterday+immediate_past_esm  | 1   | 0.005     | 0.5991
baseline+today+current_esm+yesterday+immediate_past_esm  | 10  | 0.005     | 0.5936
baseline+today+current_esm+yesterday-no_immediate_past   | 1   | 0.001     | 0.6113
baseline+today+current_esm+yesterday-no_immediate_past   | 1   | 0.005     | 0.5932
baseline+today+current_esm+yesterday-no_immediate_past   | 1   | mean      | 0.5997
baseline+today+current_esm+yesterday-no_immediate_past   | 10  | mean      | 0.5912
baseline+today+current_esm+yesterday                     | 1   | 0.005     | 0.5991
baseline+today+current_esm+yesterday                     | 10  | 0.005     | 0.5936
baseline+today+current_esm                               | 0.1 | 0.001     | 0.5960
baseline+today+current_esm                               | 0.1 | 0.005     | 0.5982
baseline+today+current_esm                               | 0.1 | mean      | 0.5906
baseline+today+current_esm                               | 10  | 0.001     | 0.5965
baseline+today+current_esm                               | 10  | 0.005     | 0.5975

8 of 22

Feature Selection - SHAP (Mean AUC > 0.60)

All features are fed into SHAP:

  • Captures contributions from every available feature.
  • Problem: Running SHAP on the full data before splitting lets test data influence the selection.
  • Solution:
    • Used StratifiedGroupKFold (n=5, same seed) to match the final CV splits.
    • Collected only the training indices across folds.
    • Merged the training data, fit the model, and computed SHAP values.
    • Selected the top-N features from training data only (see the sketch below).
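One way to realize the leak-free procedure above is the per-fold sketch below: SHAP importances are computed on each fold's training portion only, and the top-N features are then scored on that fold's held-out data. The shap.TreeExplainer usage and the variable names are assumptions about the pipeline, not its actual code.

import numpy as np
import shap
from sklearn.model_selection import StratifiedGroupKFold
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

def shap_top_n_cv(X, y, groups, n_top=60, seed=42):
    """Per-fold, leak-free SHAP selection: importances come from the training
    portion only, and the fold's held-out data is used purely for scoring."""
    cv = StratifiedGroupKFold(n_splits=5, shuffle=True, random_state=seed)
    aucs = []
    for tr, te in cv.split(X, y, groups):
        ranker = XGBClassifier(random_state=seed).fit(X[tr], y[tr])
        shap_values = shap.TreeExplainer(ranker).shap_values(X[tr])
        importance = np.abs(shap_values).mean(axis=0)   # mean |SHAP| per feature
        top = np.argsort(importance)[::-1][:n_top]      # indices of the top-N features
        model = XGBClassifier(random_state=seed).fit(X[tr][:, top], y[tr])
        proba = model.predict_proba(X[te][:, top])[:, 1]
        aucs.append(roc_auc_score(y[te], proba))
    return float(np.mean(aucs))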

Feature set  | Top-N features | Mean AUC
all_features | 10             | 0.6119
all_features | 20             | 0.6317
all_features | 30             | 0.6411
all_features | 40             | 0.6545
all_features | 45             | 0.6568
all_features | 50             | 0.6533
all_features | 60             | 0.6609
all_features | 65             | 0.6541
all_features | 70             | 0.6555
all_features | 80             | 0.6212
all_features | 90             | 0.6331
all_features | 100            | 0.6495
all_features | 200            | 0.6580
all_features | 300            | 0.6423
all_features | 500            | 0.6085

9 of 22

Feature Selection - Random Forest (Mean AUC > 0.60)

  • Captures non-linear interactions between features and the target.
  • Ranks features by their contribution to reducing prediction error across many trees (a selection sketch follows the table below).

Feature set  | Top-K features | Mean AUC
all_features | 10             | 0.6060
all_features | 20             | 0.6089
all_features | 25             | 0.6125
all_features | 30             | 0.6261
all_features | 35             | 0.6138
all_features | 45             | 0.6104
all_features | 50             | 0.6137
all_features | 100            | 0.6064
all_features | 200            | 0.6061
all_features | 300            | 0.6051
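A minimal sketch of the importance ranking described above, assuming numpy feature and label arrays; the returned top-K indices would then go through the same cross-validation as the other selection methods.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def rf_top_k(X_train, y_train, k=30, seed=42):
    """Rank features by Gini importance from a Random Forest fit on the
    training data only, and return the indices of the top-k features."""
    forest = RandomForestClassifier(n_estimators=500, n_jobs=-1, random_state=seed)
    forest.fit(X_train, y_train)
    return np.argsort(forest.feature_importances_)[::-1][:k]

# top_idx = rf_top_k(X[train_idx], y[train_idx], k=30)
# X_reduced = X[:, top_idx]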

10 of 22

Assignment 3 - Hyperparameter Tuning

  • Tool: Hyperopt (Bayesian optimization)
    • Explores promising regions of the parameter space instead of an exhaustive grid search
  • Parameters tuned: max_depth, learning_rate, subsample, colsample_bytree, reg_alpha, reg_lambda
  • Goal: Optimize XGBoost for the best performance (keeping the baseline model)
  • Dataset: the top 60 SHAP-selected features (best result so far)

11 of 22

Assignment 3 - Hyperparameter Tuning: Hyperopt (Same)

import numpy as np
from hyperopt import hp

space = {
    'max_depth': hp.choice('max_depth', list(range(3, 10))),
    'min_child_weight': hp.quniform('min_child_weight', 1, 10, 1),
    'subsample': hp.uniform('subsample', 0.6, 1.0),
    'colsample_bytree': hp.uniform('colsample_bytree', 0.6, 1.0),
    'gamma': hp.uniform('gamma', 0, 5),
    'learning_rate': hp.loguniform('learning_rate', np.log(0.01), np.log(0.3)),
    'n_estimators': hp.quniform('n_estimators', 50, 300, 10),
    'reg_alpha': hp.uniform('reg_alpha', 0, 1),
    'reg_lambda': hp.uniform('reg_lambda', 0, 1),
    'random_state': 42,
}
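A minimal sketch of how this space might be handed to Hyperopt's fmin, assuming an objective that returns 1 − mean CV AUC; the cross_validate_auc_model helper, X_top60, y, and pid are placeholders rather than the actual notebook code.

from hyperopt import fmin, tpe, Trials, STATUS_OK
from xgboost import XGBClassifier

def objective(params):
    # hp.quniform yields floats; XGBoost expects integer counts for these two.
    params = dict(params,
                  n_estimators=int(params["n_estimators"]),
                  min_child_weight=int(params["min_child_weight"]))
    model = XGBClassifier(**params, eval_metric="logloss")
    auc = cross_validate_auc_model(model, X_top60, y, pid)  # hypothetical CV helper
    return {"loss": 1.0 - auc, "status": STATUS_OK}

best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=100, trials=Trials(),
            rstate=np.random.default_rng(42))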

Best parameters found:

{'colsample_bytree': 0.8463935920953023,
 'gamma': 0.2144097629496495,
 'learning_rate': 0.0736917612203914,
 'max_depth': 6,
 'min_child_weight': 1,
 'n_estimators': 280,
 'reg_alpha': 0.4775914535854724,
 'reg_lambda': 0.6862888539325457,
 'subsample': 0.7290628198195405,
 'random_state': 42}

Mean AUC: 0.566

12 of 22

Assignment 4 - Deep Learning

  • Model: TabNet (attention-based DL for tabular data)
  • Compared to tuned XGBoost
  • Goal: Use the same feature set as the top-performing XGBoost setup from the previous assignments

Comparing with XGBoost:

  • Macro AUC-ROC
  • Feature combinations

13 of 22

TabNet Benchmark Results

Setting                  | Mean AUC
All Features             | 0.5607
Baseline - Before Tuning | 0.5703
SHAP Top 60              | 0.6591

Feature Selections

  • SHAP Top 60: selected with SHAP (EvXGBoost); the best pre-tuning AUC.
  • All Features: deep learning can handle feature selection implicitly; in TabNet this happens through its sparsity-regularized attention masks.

14 of 22

TabNet Hyperparameter Tuning

import numpy as np
from hyperopt import hp

space = {
    "n_d": hp.choice("n_d", [32, 64, 128]),
    "n_a": hp.choice("n_a", [32, 64, 128]),
    "n_steps": hp.choice("n_steps", [3, 5, 7]),
    "gamma": hp.uniform("gamma", 1.0, 2.0),
    "lambda_sparse": hp.loguniform("lambda_sparse", np.log(1e-6), np.log(1e-2)),
    "lr": hp.loguniform("lr", np.log(1e-3), np.log(5e-2)),
}
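A rough sketch of fitting pytorch-tabnet with one sampled configuration from this space; the constructor arguments follow the TabNetClassifier API, while the training settings (epochs, patience, batch size) and the data splits are placeholders.

import torch
from pytorch_tabnet.tab_model import TabNetClassifier

def train_tabnet(params, X_train, y_train, X_valid, y_valid):
    """Fit TabNet with one sampled hyperparameter configuration and track
    validation AUC through pytorch-tabnet's eval_metric mechanism."""
    clf = TabNetClassifier(
        n_d=params["n_d"], n_a=params["n_a"], n_steps=params["n_steps"],
        gamma=params["gamma"], lambda_sparse=params["lambda_sparse"],
        optimizer_fn=torch.optim.Adam,
        optimizer_params=dict(lr=params["lr"]),
        seed=42,
    )
    clf.fit(X_train, y_train,
            eval_set=[(X_valid, y_valid)],
            eval_metric=["auc"],
            max_epochs=100, patience=20, batch_size=256)
    return clf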

15 of 22

Assignment 5 - Final Model

  • Features: the SHAP top-60 features from Assignment 2
  • Soft voting ensemble: EvXGBoost and TabNet (see the sketch below)
    • Averages (or weights) the predicted probabilities from both models instead of combining their hard class predictions.
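A minimal sketch of the weighted soft vote, assuming both models were already fitted on the SHAP top-60 training features; the weight values mirror the table below.

from sklearn.metrics import roc_auc_score

def soft_vote_auc(xgb_model, tabnet_model, X_test, y_test, w_xgb=0.55, w_tab=0.45):
    """Blend the two models' positive-class probabilities with fixed weights
    instead of voting on their hard class predictions."""
    p_xgb = xgb_model.predict_proba(X_test)[:, 1]
    p_tab = tabnet_model.predict_proba(X_test)[:, 1]
    blended = w_xgb * p_xgb + w_tab * p_tab
    return roc_auc_score(y_test, blended)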

XGBoost weight | TabNet weight | AUC
0.50           | 0.50          | 0.681
0.40           | 0.60          | 0.678
0.60           | 0.40          | 0.680
0.55           | 0.45          | 0.682

16 of 22

Features Chosen by SHAP

ESM#LastLabel

LOC_LABEL#RLV_SUP=work#ImmediatePast_15

APP_CAT#RLV_SUP=HEALTH#ImmediatePast_15

PIF#BFI_NEU

ACC_MAG#MED#ImmediatePast_15

WIF_MAN#BEP#ImmediatePast_15

LOC_DST#SKW#ImmediatePast_15

AML#SKW#ImmediatePast_15

APP_CAT#DSC=HEALTH

APP_CAT#DSC=INFO

WIF_JAC#KUR#ImmediatePast_15

RNG#DSC=NORMAL

ACE_VHC#SKW#ImmediatePast_15

SKT#BEP#ImmediatePast_15

SCR_DUR#KUR_TodayAfternoon

APP_DUR_UNKNOWN#MED#ImmediatePast_15

SKT#SKW#ImmediatePast_15

BAT_LEV#KUR#ImmediatePast_15

STP#SKW_TodayLateAfternoon

HRT#KUR#ImmediatePast_15

ACC_AXY#BEP_TodayLateAfternoon

WIF_JAC#BEP#ImmediatePast_15

ACC_AXZ#VAL

HRV#skw#ImmediatePast_15

AML#MED#YesterdayNight

APP_DUR_UNKNOWN#AVG#ImmediatePast_15

HRT#BEP#ImmediatePast_15

CAE_DUR#VAL

WIF_MAN#AVG#ImmediatePast_15

SKT#KUR_TodayAfternoon

AML#AVG#ImmediatePast_15

ACE_RUN#BEP_TodayAfternoon

ACC_AXZ#AVG_TodayAfternoon

ACC_AXY#SKW#ImmediatePast_15

WIF_JAC#MED#ImmediatePast_15

WIF_EUC#STD_TodayMorning

ACE_TLT#STD#ImmediatePast_15

SCR_DUR#KUR_TodayEvening

SCR_DUR#VAL

SCR_EVENT#RLV_SUP=ON_TodayMorning

LOC_LABEL#DSC=eating

ACE_TLT#TSC#ImmediatePast_15

LOC_LABEL#ASC##YesterdayMorning

DST_SPD#TSC#ImmediatePast_15

HRV#slope#ImmediatePast_15

EDA#min_phasic_TodayMorning

ACT#RLV_SUP=WALKING_TodayLateAfternoon

LOC_DST#ASC_TodayMorning

ESM#LIK#YesterdayEvening

APP_DUR_UNKNOWN#KUR#ImmediatePast_15

17 of 22

Discussion

  • 26 of the 50 selected features are #ImmediatePast features
    • Recent historical context is highly informative

  • Context-rich features such as #ImmediatePast_15, #TodayAfternoon, and #YesterdayNight appear frequently
    • Temporal dynamics matter

  • ESM data acts best as a guiding factor rather than a primary predictor (from Assignment 1)

  • A mix of objective signals (sensors) and contextual data (ESM, location labels) is important

18 of 22

Code Review

  • Deterministic Results: Random seeds fixed, model states saved to ensure reproducibility.
  • Clarity Through Comments: Well-placed inline comments and markdown cells improve readability.
  • Fair Model Comparison: All models and baselines evaluated using identical settings and metrics.
  • Clean Code Style: Clear variable names and consistent formatting enhance maintainability.

19 of 22

Snippets

20 of 22

Retraction From Final Presentation

  • There was an error in the SHAP-based feature selection
    • SHAP values were mistakenly computed on the entire dataset
    • Corrective measures have been taken to prevent data leakage
      • The fix is described on the SHAP slide
  • Some performance metrics have changed as a result

21 of 22

Resources

  1. Kang, S., Choi, W., Park, C.Y., Kim, J.H., Lee, J., & Lee, U. (2023). K-EmoPhone: A Mobile and Wearable Dataset with In-Situ Emotion, Stress, and Attention Labels. Scientific Data, 10, 351.
  2. Tibshirani, R. (1996). Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267–288.
  3. Chen, T., & Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785–794).
  4. Lundberg, S. M., & Lee, S.-I. (2017). A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30 (NeurIPS 2017).
  5. Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5–32.
  6. Bergstra, J., Yamins, D., & Cox, D. D. (2013). Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures. In Proceedings of the 30th International Conference on Machine Learning (ICML-13), 115–123.
  7. Arik, S. Ö., & Pfister, T. (2021). TabNet: Attentive Interpretable Tabular Learning. In Proceedings of the AAAI Conference on Artificial Intelligence, 35(8), 6679–6687.

22 of 22

Thank you