Machine Learning Model for Predicting Drug Sensitivity in Hepatocellular Carcinoma Cell Lines Using Gene Expression Data
Ethan Cody, Andrew Lukashchuk, Wuyan Li
Early Stage
Late Stage
Huang, Ao, et al. "Targeted therapy for hepatocellular carcinoma." Signal Transduction and Targeted Therapy 5.1 (2020): 146.
Precision Medicine
Without Precision Medicine
With Precision Medicine
Tailored therapy
Response
No-response
Adverse
Same Therapy
DNA
Can we create a prediction model using genetic information to identify the best treatment response in HCC patients?
Key Data Sources
Cancer Cell Line Encyclopedia (CCLE)
Genomics of Drug Sensitivity in Cancer (GDSC)
Simplified Molecular Input Line Entry System (SMILES)
Workflow
Predictive Model
External Dataset
+
Gene expression with drug resistance
Drug molecular information
Train Dataset
Validation
External Testing Set
Qiu, Zhixin, et al. "A pharmacogenomic landscape in human liver cancers." Cancer Cell 36.2 (2019): 179-193.
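The workflow above (gene expression plus drug molecular information, split into training and validation sets) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in this project. The cell line names, expression values, response values, and the helper functions `featurize_smiles`, `build_rows`, and `split_by_cell_line` are all hypothetical. The character-count featurization is a crude stand-in for a proper SMILES encoding from a cheminformatics toolkit, and splitting by cell line is one possible choice that the slides do not specify.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical, deliberately crude SMILES features: counts of a few characters.
# A real pipeline would more likely use fingerprints from a cheminformatics library.
SMILES_CHARS = ["C", "c", "N", "O", "=", "("]

Row = Tuple[str, str, List[float], float]  # (cell line, drug, features, response)


def featurize_smiles(smiles: str) -> List[float]:
    """Count occurrences of each character in SMILES_CHARS."""
    return [float(smiles.count(ch)) for ch in SMILES_CHARS]


def build_rows(
    expression: Dict[str, List[float]],
    smiles_by_drug: Dict[str, str],
    response: Dict[Tuple[str, str], float],
) -> List[Row]:
    """Concatenate expression and drug features for every measured pair."""
    rows: List[Row] = []
    for (cell_line, drug), y in sorted(response.items()):
        # Skip pairs that lack either an expression profile or a SMILES string.
        if cell_line not in expression or drug not in smiles_by_drug:
            continue
        x = expression[cell_line] + featurize_smiles(smiles_by_drug[drug])
        rows.append((cell_line, drug, x, y))
    return rows


def split_by_cell_line(
    rows: List[Row], validation_cell_lines: Set[str]
) -> Tuple[List[Row], List[Row]]:
    """Hold out whole cell lines so validation lines never appear in training."""
    train = [r for r in rows if r[0] not in validation_cell_lines]
    val = [r for r in rows if r[0] in validation_cell_lines]
    return train, val


# Toy inputs (all values invented for illustration only).
expression = {"LINE_A": [1.2, 0.4], "LINE_B": [0.3, 2.1], "LINE_C": [0.9, 0.9]}
smiles_by_drug = {"drug_x": "CCO", "drug_y": "c1ccccc1O"}
response = {
    ("LINE_A", "drug_x"): 0.5,
    ("LINE_A", "drug_y"): 1.5,
    ("LINE_B", "drug_x"): 2.0,
    ("LINE_C", "drug_y"): 0.7,
    ("LINE_D", "drug_x"): 1.0,  # no expression profile, so this pair is skipped
}

rows = build_rows(expression, smiles_by_drug, response)
train, val = split_by_cell_line(rows, {"LINE_B"})
print(len(rows), len(train), len(val))
```

An external testing set would go through the same `build_rows` step with its own expression and response tables, and would never be used for training or model selection.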
Modeling
Linear Regression
XGBoost
Random Forest Classifier
External Dataset
After comparing the models on the validation set, we selected the best-performing ones and applied them to the external test dataset. Performance was lower than on the validation set, but the models remained effective:
XGBoost
Random Forest Classifier
XGBoost
Random Forest
Summary
Data Organization
Modeling
Future Direction