1 of 40

Machine Learning

Mentors: Professor Nikola Banovic and Anindya Das Antar

2 of 40

The Challenge

Developing methods for patient-specific predictions of in-hospital mortality

  • Information provided from the first 48 hours of an ICU stay
  • 12,000 patients from a community hospital in Massachusetts
  • 6 general descriptor variables (e.g., age, weight, ICU type)
  • 36 time-series variables, measured at least once (e.g., heart rate, BUN, glucose)

The timely and accurate detection of people at risk can save lives!


3 of 40

Abbreviations

  • HMM - Hidden Markov Model
  • SVM - Support Vector Machine
  • CNN - Convolutional Neural Network
  • LSTM - Long Short-Term Memory
  • LR - Logistic Regression
  • GNB - Gaussian Naive Bayes
  • RF - Random Forest


4 of 40

Metric definitions


                 Predicted Alive    Predicted Dead
Actual Alive     TN                 FP
Actual Dead      FN                 TP

  • Precision: TP/(TP+FP)
    • Fraction of predicted positives that were actually positive
  • Recall: TP/(TP+FN)
    • Fraction of actual positives that were correctly predicted
  • Accuracy: (TP+TN)/(TP+TN+FP+FN)
    • Fraction of predictions that were correctly classified
  • F1 Score: 2·Precision·Recall/(Precision+Recall), the harmonic mean of Precision and Recall
  • AUC-ROC: Area under the ROC curve
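As a quick illustration, all five metrics can be computed with scikit-learn (cited in the references below); the toy arrays, the 0.5 threshold, and the 1 = dead / -1 = alive coding are assumptions for the example, not values from the challenge data.

```python
# Minimal sketch: computing the metrics above with scikit-learn.
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)

y_true = np.array([-1, -1, -1, 1, 1, -1, 1, -1])                # actual outcomes
y_score = np.array([0.1, 0.4, 0.2, 0.8, 0.3, 0.6, 0.9, 0.05])  # model scores
y_pred = np.where(y_score >= 0.5, 1, -1)                        # thresholded predictions

tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[-1, 1]).ravel()
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"Precision: {precision_score(y_true, y_pred):.2f}")  # TP / (TP + FP)
print(f"Recall:    {recall_score(y_true, y_pred):.2f}")     # TP / (TP + FN)
print(f"Accuracy:  {accuracy_score(y_true, y_pred):.2f}")   # (TP + TN) / total
print(f"F1 score:  {f1_score(y_true, y_pred):.2f}")
print(f"AUC-ROC:   {roc_auc_score(y_true, y_score):.2f}")   # uses raw scores
```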

5 of 40

Group 1: MaSH

Markov model, SVM Hybrid

Anvit Garg, Alejandra Solis Sala, Ian Maywar, Rhea Verma

6 of 40

Group 1: Predicting ICU Mortality using HMMs and SVM

Anvit Garg, Alejandra Solis Sala, Ian Maywar, Rhea Verma

7 of 40

Model Illustration

[Figure: model architecture]

8 of 40

Feature selection and preprocessing

  • We chose features based on availability, SAPS-II, and variation between dead and alive patients:
    • Heart rate
    • Non-invasive systolic blood pressure
    • Creatinine
    • Sodium (Na)
    • Glasgow Coma Score
    • Blood urea nitrogen
    • Urine output
    • Bicarbonate (HCO3)
    • Potassium (K)
    • Glucose

  • Preprocessing (a pandas sketch follows below)
    • Forward imputation
    • Regression imputation (ICU type, age)
    • Time window transformation
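A minimal pandas sketch of the forward imputation and windowing steps, under assumed names (a long-format `df` with one row per patient-hour and the short feature column names shown); the regression imputation step is omitted.

```python
import pandas as pd

def preprocess(df, window_hours=8):
    """df: long-format frame with columns patient_id, hour (1..48), features."""
    features = ["HR", "NISysABP", "Creatinine", "Na", "GCS",
                "BUN", "Urine", "HCO3", "K", "Glucose"]
    # Forward imputation: carry each patient's last observed value forward.
    df = df.sort_values(["patient_id", "hour"])
    df[features] = df.groupby("patient_id")[features].ffill()
    # Time window transformation: average each feature over fixed-size windows.
    df["window"] = (df["hour"] - 1) // window_hours
    return df.groupby(["patient_id", "window"])[features].mean()
```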

9 of 40

Models (background)

  • HMM:
    • Assumes the Markov property (the next state depends only on the current state) over unobservable hidden states
    • Allows us to work with time-series data
    • Infers the path through hidden states from observable properties
    • Gaussian emissions are suitable for modeling continuous data

  • SVM:
    • Classifier that separates data points using a hyperplane in an n-dimensional space

[Figure: HMM diagram with hidden states emitting observed values]

10 of 40

Models

  • HMM
    • 2 models
      • One trained on patients who survived, one trained on patients who died
    • A Bayes factor is computed from each model's likelihood for a given patient and used in the SVM
  • SVM
    • Static variables (age, ICU type, weight) are combined with the Bayes factor to classify a patient as dead or alive during the ICU stay (see the sketch below)
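A hedged sketch of this hybrid using hmmlearn's GaussianHMM (hmmlearn's GMMHMM with `n_mix` would match the "number of mixtures" hyperparameter tuned later); the synthetic data, component count, and feature layout are assumptions, not the group's actual code.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_feat = 10                                   # the 10 selected features
X_dead = rng.normal(size=(60, n_feat))        # 10 toy patients x 6 windows
X_alive = rng.normal(size=(120, n_feat))      # 20 toy patients x 6 windows

# One HMM per outcome, trained on concatenated per-patient sequences.
hmm_dead = GaussianHMM(n_components=3).fit(X_dead, lengths=[6] * 10)
hmm_alive = GaussianHMM(n_components=3).fit(X_alive, lengths=[6] * 20)

def bayes_factor(seq):
    # Log-likelihood ratio of the two models for one patient's sequence.
    return hmm_dead.score(seq) - hmm_alive.score(seq)

# Static variables plus the Bayes factor form the SVM input (toy values).
static = rng.normal(size=(30, 3))             # age, ICU type, weight
seqs = np.vstack([X_dead, X_alive]).reshape(30, 6, n_feat)
features = np.hstack([static, [[bayes_factor(s)] for s in seqs]])
labels = np.array([1] * 10 + [0] * 20)        # 1 = died, 0 = survived
svm = SVC(C=1.0).fit(features, labels)
```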


11 of 40

Cross Validation

  • Hyperparameter tuning
    • Time window size, number of mixtures for both HMMs, regularization parameter in SVM


[Figure: pipeline: time window transformation → HMM_alive / HMM_dead → SVM]
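In code, such a search might look like the following sketch; the grid values are invented, and `build_features` is a hypothetical helper standing in for rerunning the window transformation and HMM scoring at each setting.

```python
from itertools import product
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

best_setting, best_f1 = None, -1.0
for window, n_mix, C in product([4, 8, 12], [1, 2, 3], [0.1, 1.0, 10.0]):
    # Rebuild the HMM-derived features for this setting (hypothetical helper).
    X, y = build_features(window_size=window, n_mixtures=n_mix)
    f1 = cross_val_score(SVC(C=C), X, y, cv=5, scoring="f1").mean()
    if f1 > best_f1:
        best_setting, best_f1 = (window, n_mix, C), f1
```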

12 of 40

Full Model Evaluation

  • F1 Score: 0.42
  • Recall: 0.78
  • Accuracy: 0.70
  • Precision: 0.29
  • AUC-ROC: 0.73


                 Predicted Alive    Predicted Dead
Actual Alive     TN = 1187          FP = 535
Actual Dead      FN = 62            TP = 216

13 of 40

Conclusions and Future Work

With more time, the model could be developed further:

  • Feature engineering
  • Testing more feature divisions
  • Wider hyperparameter tuning
  • More error analysis
  • Updating predictions as more information is received

While the dataset's missing values were a limitation, the results may still inform future models that support resource allocation and help confirm clinical decisions.


14 of 40

Thank You

Any questions?

Contact Information:


Anvit Garg

anvit25@gmail.com

Alejandra Solis Sala

alejandra.solis@cimat.mx

Ian Maywar

ijmaywar@gmail.com

Rhea Verma

rhv3@pitt.edu

15 of 40

References

  1. Pedregosa, F., et al. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825-2830.
  2. Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation, 101(23), e215-e220.
  3. https://www.programmersought.com/article/19492506989/
  4. http://gregorygundersen.com/blog/2020/11/28/hmms/


16 of 40

MACHINE LEARNING GROUP 2

Felicia Zhang

Madeline J Peterson

Maya Nitsche Taylor


Mentors: Professor Nikola Banovic and Anindya Das Antar

17 of 40

The Model

[Figure: decision trees combining into a random forest]

Image credits:

Canva. (n.d.). Donut Decision Maker. Design a Superb Decision Tree Online with Canva. https://www.canva.com/graphs/decision-trees/

Yiu, T. (2019, August 14). Understanding random forest. Medium. https://towardsdatascience.com/understanding-random-forest-58381e0602d2

18 of 40

Why Random Forests?

  • Has shown good performance across domains
  • One of the best supervised machine learning algorithms
  • Easy to evaluate feature importance
  • Individual trees can be explored, which aids explainability and interpretability

The goal: provide clinicians and hospitals with a bigger picture of patient survival in the ICU that may help with large-scale planning

The challenge: applying random forests to time-series data, which the algorithm is not designed to handle

19 of 40

Methods

  • Examine and split the data

  • Development
    • Feature selection
    • Window selection

  • Training
    • Hyperparameter selection
    • Comparison to other ML models

  • Testing
    • Performance stats


Data split: 10,000 patients → 2,500 development / 5,000 training / 2,500 testing

Window and feature selection: the 48 hours are split into 8-hour windows, and each window is summarized with:

  • mean
  • median
  • 25th quantile
  • 75th quantile
  • standard deviation
  • min
  • max
  • count

20 of 40

Understanding Time Windows

For a given time-series variable:

  1. Ex: use 6 windows: hours 1-8, 9-16, 17-24, 25-32, 33-40, 41-48
  2. Ex: GCS (Glasgow Coma Scale, a measure of a patient's consciousness)
    • A patient might have 20 GCS measurements over the 48-hour period.
    • To summarize with six windows, we split these 20 measurements into their respective time blocks and summarize over each block with various statistics (e.g., the mean).
    • The data might look like: meanGCS1, meanGCS2, ..., meanGCS6, medianGCS1, ..., stdGCS1, ..., minGCS1, ..., maxGCS1, ..., etc.
    • We also included a general 48-hour measurement for each statistic, e.g., meanGCS (the mean of the patient's GCS score over all 48 hours of the ICU visit). A pandas sketch of this construction follows below.
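One way to build these features, as a hedged pandas sketch; the toy `gcs` DataFrame and its column names are assumptions, and the 25th/75th quantiles would be added with lambda aggregators.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
gcs = pd.DataFrame({                      # toy long-format GCS measurements
    "patient_id": rng.integers(1, 4, 60),
    "hour": rng.integers(1, 49, 60),
    "value": rng.integers(3, 16, 60),
})

stats = ["mean", "median", "std", "min", "max", "count"]
gcs["window"] = (gcs["hour"] - 1) // 8 + 1          # windows 1..6

per_window = (gcs.groupby(["patient_id", "window"])["value"]
                 .agg(stats)
                 .unstack("window"))                # columns: (stat, window)
per_window.columns = [f"{s}GCS{w}" for s, w in per_window.columns]

# Overall 48-hour summaries, e.g. meanGCS.
overall = gcs.groupby("patient_id")["value"].agg(stats).add_suffix("GCS")
features = per_window.join(overall)                 # one row per patient
print(features.filter(like="mean").head())
```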


21 of 40

To use a random forest, the time-series data must be collapsed over time windows to create fixed-length feature vectors:

Development - Choosing Window Size

We ran cross-validation on the development set using various window sizes, optimizing the F1 score produced by the random forest.

Based on this analysis, we decided to move forward with six windows.


22 of 40

In parallel with window selection, we also chose a set of summary statistics for best performance (using the development data set).

Development - Choosing Summary Statistics

Set 1: mean, min, max

Set 2: mean, min, max, median, std deviation

Set 3: mean, min, max, median, std deviation, 25th and 75th quantiles, count


23 of 40

Methods

  • Examine and split the data

  • Development
    • Feature selection
    • Window selection

  • Training
    • Hyperparameter selection
    • Comparison to other ML models

  • Testing
    • Performance stats


Hyperparameter selection (final values):

  • Max features: 100
  • N estimators: 1000
  • Bootstrap: True
  • Class weight: balanced
  • Max depth: 4
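Read as scikit-learn settings, the final configuration might look like the sketch below; `X_train`/`y_train` are toy stand-ins for the windowed summary features and labels, and `random_state` is our addition for reproducibility.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 150))     # stand-in for windowed summaries
y_train = rng.choice([-1, 1], size=200, p=[0.86, 0.14])  # -1 survival, 1 death

rf = RandomForestClassifier(
    n_estimators=1000,
    max_features=100,
    max_depth=4,
    bootstrap=True,
    class_weight="balanced",  # offsets the rarity of deaths
    random_state=0,           # added for reproducibility (not in the slides)
).fit(X_train, y_train)
```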

24 of 40

Methods

  • Examine and split the data

  • Development
    • Feature selection
    • Window selection

  • Training
    • Hyperparameter selection
    • Comparison to other ML models

  • Testing
    • Performance stats


Comparison to other ML Models

[Figure: F1 score comparison across models]

25 of 40

Final Metrics on the Testing Set

Note: -1 = survival, 1 = death

Final values on test data:

  • Precision: 0.377
  • Recall: 0.749
  • F1 Score: 0.501

Important takeaways:

  • 74% of deaths were correctly predicted
  • 77% of survivals were correctly predicted

[Figure: confusion matrix on the test set]

26 of 40

Receiver Operating Characteristic (ROC) Curves

[Figure: ROC curves]

27 of 40


Understanding the Model - Example Decision Tree

[Figure: example decision tree from the forest]

Top 10 important features:

GCSmedian5, GCSquant755, GCSmax5, GCSmean5, GCSquant255, GCSmin5, GCSmedian4, GCSmean4, mean_Urine, quant25_BUN

Note: the number at the end denotes the window; GCS (Glasgow Coma Scale) quantifies degree of consciousness.
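Continuing the training sketch above, such a top-10 list can be read off a fitted forest's `feature_importances_`; the placeholder `feature_names` list is an assumption.

```python
import numpy as np

feature_names = [f"f{i}" for i in range(X_train.shape[1])]  # placeholder names
top10 = np.argsort(rf.feature_importances_)[::-1][:10]      # highest first
for i in top10:
    print(feature_names[i], round(rf.feature_importances_[i], 4))
```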

28 of 40


Takeaways and Future Work

Takeaways

  • Our proposed model outperforms the other models we compared against
  • It is able to incorporate time-series data
  • It shows which features are important

Future Work

  • Adding more features
  • Assessing generalizability
  • Predicting whether someone will return to the ICU

29 of 40


References

Canva. (n.d.). Donut Decision Maker. Design a Superb Decision Tree Online with Canva. https://www.canva.com/graphs/decision-trees/ .

Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. E215–e220.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, É. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825-2830. https://jmlr.csail.mit.edu/papers/v12/pedregosa11a.html

Yiu, T. (2019, August 14). Understanding random forest. Medium. https://towardsdatascience.com/understanding-random-forest-58381e0602d2.

30 of 40

Thank you!

Felicia Zhang, University of Michigan - fyzhang@umich.edu

Madeline Peterson, Albion College - mjp12@albion.edu

Maya Taylor, Brown University - maya_taylor@brown.edu


31 of 40

Long Short-Term Memory (LSTM) neural network to predict ICU mortality

By: Rami Shams, Sabir Meah, Esther Adegoke, Brian Lin

32 of 40

The Promise of LSTMs

  • Neural Networks: deep learning models capable of automatically identifying features
    • Recurrent Neural Networks: neural networks that feed the output of the previous step into the next computation (short-term memory)
      • LSTMs: recurrent neural networks that can also store information in an internal memory cell (long-term memory)

[Figure: neural network, RNN, and LSTM architectures]

33 of 40

Data Preprocessing and Missing Data Approaches

  1. Raw data: 35 time-variant and 8 time-invariant features, mean-aggregated by time into t = 1 hour steps over L = 48 hours.
  2. Imputation per feature (sketched below):
    • ≥ 3 measurements: cubic spline interpolation
    • ≥ 1 measurement: forward and backward filling
    • 0 measurements: mean imputation
  3. Z-score standardization.
  4. Split of 10,000 patients into 8,000 train / 1,000 validation / 1,000 test, stratified by outcome label.
  5. Data loader with batch size = 64.
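A hedged sketch of the per-feature imputation cascade; the helper name, the hourly-NaN layout, and SciPy's CubicSpline (standing in for whatever spline routine was actually used) are assumptions.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import CubicSpline

def impute_feature(values, train_mean):
    """values: length-48 array of hourly means, NaN where unobserved."""
    observed = ~np.isnan(values)
    if observed.sum() >= 3:                   # cubic spline interpolation
        hours = np.arange(len(values))
        spline = CubicSpline(hours[observed], values[observed])
        return np.where(observed, values, spline(hours))
    if observed.sum() >= 1:                   # forward, then backward fill
        return pd.Series(values).ffill().bfill().to_numpy()
    return np.full_like(values, train_mean)   # mean imputation

hr = np.full(48, np.nan)
hr[[2, 17, 40]] = [88.0, 95.0, 79.0]          # three toy heart-rate readings
print(impute_feature(hr, train_mean=86.0)[:6])
```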

34 of 40

LSTM Architecture

  • Time-variant data (35 features) is fed through an LSTM with hidden size 40 over t = 48 time steps, carrying an internal cell state.
  • The final hidden state (∈ R^40) is concatenated with the time-invariant data (8 features), forming a hidden layer ∈ R^48 that passes through a ReLU.
  • A sigmoid output yields the mortality prediction. (PyTorch sketch below.)
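One plausible PyTorch reading of this architecture; the slide leaves the exact wiring ambiguous, so here the 40-dimensional final hidden state is concatenated with the 8 static features to form the 48-unit ReLU layer.

```python
import torch
import torch.nn as nn

class MortalityLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=35, hidden_size=40, batch_first=True)
        self.fc = nn.Linear(40 + 8, 48)        # hidden state + static features
        self.out = nn.Linear(48, 1)

    def forward(self, x_time, x_static):
        # x_time: (batch, 48 hours, 35 features); x_static: (batch, 8)
        _, (h_n, _) = self.lstm(x_time)        # final hidden state per patient
        h = torch.relu(self.fc(torch.cat([h_n[-1], x_static], dim=1)))
        return torch.sigmoid(self.out(h))      # mortality probability

model = MortalityLSTM()
probs = model(torch.randn(64, 48, 35), torch.randn(64, 8))  # toy batch of 64
```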

35 of 40

Methods - Data cleaning

  • Aggregate data over time
    • Time window trade-off:
      • Smaller time windows retain more granular data
      • Bigger time windows decrease the prevalence of missing data
  • Imputation
    • Mean imputation
    • Representing missing data with a sentinel value failed to converge
  • Standardization
    • Z-score
    • MinMax scaling yielded poor results
  • Stratification
    • ICU type
    • Mortality

ICU type                CCU     CSRU    MICU    SICU
Mortality rate          0.13    0.05    0.20    0.15
Patients (N = 10000)    1476    2076    3609    2839

36 of 40

Methods - Hyperparameters

  • Choice of optimizer
    • The Adam optimizer showed strange loss behavior (see figures below)
    • Stochastic gradient descent showed a more stable loss curve
  • Learning rate
    • The initial learning rate (0.001) was too slow with gradient descent; the model never converged
    • A learning rate of 0.01 converged
  • Loss function
    • Used binary cross entropy
    • Other loss functions failed to reach good F1 scores
  • Size of hidden layer
    • Found best results with a size of 40 (training-loop sketch after the figures)

[Figures: loss curves with the Adam optimizer and with the gradient descent optimizer]
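Putting the chosen settings together, a hedged training-loop sketch; the toy loader stands in for the batch-64 loader from the preprocessing slide, `MortalityLSTM` is the architecture sketch above, and the epoch count is an assumption.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-in data: (time-variant, time-invariant, outcome label).
data = TensorDataset(torch.randn(1000, 48, 35), torch.randn(1000, 8),
                     torch.randint(0, 2, (1000,)).float())
train_loader = DataLoader(data, batch_size=64, shuffle=True)

model = MortalityLSTM()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # SGD, lr = 0.01
criterion = torch.nn.BCELoss()                            # binary cross entropy

for epoch in range(50):                       # epoch count is an assumption
    for x_time, x_static, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x_time, x_static).squeeze(1), y)
        loss.backward()
        optimizer.step()
```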

37 of 40

Results - Final Model Performance

  • Validation F1 ≈ 0.5
  • Test statistics:
    • AUROC: 0.769
    • Accuracy: 0.854
    • Precision: 0.488
    • Recall: 0.420
    • F1 score: 0.451

[Figures: loss and F1 score curves for the final model]

38 of 40

Discussion

  • Efficacy
    • Strong improvement over our group's baseline SVM model
    • Efficacy may come at the cost of explainability
    • The model is still far from perfect: it can inform a doctor but not replace one
  • Limitations
    • The hyperparameters we tuned (hidden size, optimizer, learning rate) did not significantly improve the results
    • Lack of time to tune other hyperparameters (batch size, window size, etc.)
  • Future Research
    • Fine-tune hyperparameters
    • Try additional imputation and interpolation strategies
    • Bidirectional LSTMs, which improved performance in prior work on this dataset (Zhu et al., 2018)
    • Explore other machine learning libraries for additional options

39 of 40

References

  • Antar, A. D., & Banovic, N. BDSI Machine Learning GitHub Tutorials.
  • Olah, C. (2015, August 27). Understanding LSTM Networks. Colah's Blog. https://colah.github.io/posts/2015-08-Understanding-LSTMs/
  • Wikipedia. Machine Learning. https://en.wikipedia.org/wiki/Machine_learning
  • Yasrab, R., & Pound, M. P. (2020). PhenomNet: Bridging phenotype-genotype gap: A CNN-LSTM based automatic plant root anatomization system. bioRxiv.
  • Zhu, Y., Fan, X., Wu, J., Liu, X., Shi, J., & Wang, C. (2018, January). Predicting ICU mortality by supervised bidirectional LSTM networks. In AIH@IJCAI.

40 of 40

Contacts

Brian Lin - Carnegie Mellon University

blin2@andrew.cmu.edu

Sabir Meah - University of Michigan

smeah@umich.edu

Rami Shams - University of Michigan

rshams@umich.edu

Esther Adegoke - Tufts University

esther.adegoke@tufts.edu