Machine Learning Group
Big Data Summer Institute 2022
Department of Electrical Engineering and Computer Science
Department of Biostatistics
Claire Chu Sara Colando Ricardo Gloria Picazzo Savannah Gonzales Audrey Kim
Amaan Jogia-Sattar Jonathan Lin Dhruba Nandi Rui Nie Xavier Serrano Nguyen Tran-Bach
Presentation Outline

In medicine, diagnoses = important
Radiologists to the rescue!
segmentation →

Radiologist’s dilemma: accuracy vs efficiency
Optimizing radiology with AI
Do we trust AI?

AI Explainability
Precision Health → Clinician-Model Synchronicity
Drawbacks of Existing XAI Methods
XAI methods exist! So what’s the problem?
[Figure: classification (“cow” vs. “no cow”; labels “cow”, “goat”, “bull”) vs. segmentation]
Difficulty: XAI methods are not prepared for segmentation
AI Model: U-Net
Specifically, a U-Net with 7.8 million parameters!
Difficulty: huge model → hard to train (computational costs)
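To make the architecture concrete, here is a minimal, illustrative U-Net-style encoder-decoder in Keras with one skip connection. The layer sizes are assumptions for illustration only, not the 7.8-million-parameter model used in this project.

```python
# Minimal U-Net-style sketch (illustrative sizes, not the exact model used here).
import tensorflow as tf
from tensorflow.keras import layers

def tiny_unet(input_shape=(144, 144, 4), n_classes=1):
    inputs = tf.keras.Input(shape=input_shape)

    # Encoder: convolutions followed by downsampling.
    c1 = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
    p1 = layers.MaxPooling2D(2)(c1)
    c2 = layers.Conv2D(64, 3, padding="same", activation="relu")(p1)

    # Decoder: upsample and concatenate the skip connection from the encoder.
    u1 = layers.UpSampling2D(2)(c2)
    u1 = layers.Concatenate()([u1, c1])
    c3 = layers.Conv2D(32, 3, padding="same", activation="relu")(u1)

    # Per-pixel tumor probability.
    outputs = layers.Conv2D(n_classes, 1, activation="sigmoid")(c3)
    return tf.keras.Model(inputs, outputs)

model = tiny_unet()
model.summary()
```

The real model stacks many more such blocks, but the structure is the same: downsample, upsample, and concatenate skip connections.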
The Dataset
Per patient: 4 MRI sequences, each with 144 slices (images) of 144 × 144 pixels
Per slice: the 4 sequences are fed to the U-Net
Difficulty: using 4 sequences to produce one prediction
GDC Data. National Cancer Institute. https://portal.gdc.cancer.gov/
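A small sketch of how the per-patient volumes could be arranged for a slice-wise model (NumPy; the axis ordering is an assumption for illustration):

```python
import numpy as np

# Assumed layout: one patient = 4 MRI sequences, each 144 slices of 144 x 144 pixels.
patient = np.zeros((4, 144, 144, 144))  # (sequence, slice, height, width)

# For a slice-wise model, stack the 4 sequences as the channels of each slice.
slices = np.transpose(patient, (1, 2, 3, 0))
print(slices.shape)  # (144, 144, 144, 4): 144 slices, each a 144 x 144 x 4 input
```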
Explainability Attempts
Modifying LIME to Explain Tumor Segmentation Predictions
Amaan Jogia-Sattar, Audrey Kim, Rui Nie
GROUP 1
[Diagram: LIME + expertise applied to the U-Net prediction, ideally revealing false positives and false negatives]
Ribeiro MTC. lime. https://github.com/marcotcr/lime
Research trajectory
LIME: What is it?
LIME Modifications

|                    | Traditional LIME                                          | LIME for U-Net                                               |
| Model input        | One grey or RGB image at a time                           | 4 MRI sequences (images) for each scan                       |
| Task               | Classification                                            | Segmentation (simulated with binary classification by pixel) |
| Explanation target | Class labels for each image                               | Tumor/non-tumor label for each pixel                         |
| Explanation format | A mask of 0s and 1s indicating the region of contribution | A mask of 0s and 1s indicating the region of contribution    |
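A minimal sketch of this modification using the lime package: the segmentation model’s output at one chosen pixel is wrapped as a two-class classifier so that LimeImageExplainer can be applied as usual. The stand-in model, slice, and pixel below are assumptions for illustration, not the project’s actual code.

```python
import numpy as np
from lime import lime_image
from skimage.segmentation import quickshift

# Stand-ins for the real pieces (assumptions for illustration): a FLAIR slice
# replicated to 3 channels, and a dummy "model" that returns per-pixel tumor
# probabilities of shape (batch, 144, 144).
flair_slice = np.random.rand(144, 144, 3)

def unet_predict(images):
    images = np.asarray(images)
    return images.mean(axis=-1)  # dummy per-pixel probabilities

row, col = 80, 64  # the single pixel whose prediction we want to explain

def pixel_classifier(images):
    """Wrap the segmentation output at one pixel as a two-class classifier."""
    p_tumor = unet_predict(images)[:, row, col]
    return np.stack([1.0 - p_tumor, p_tumor], axis=1)  # [P(non-tumor), P(tumor)]

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    flair_slice,
    pixel_classifier,
    top_labels=2,
    num_samples=1000,
    segmentation_fn=lambda img: quickshift(img, kernel_size=4, ratio=0.2),
)
# Superpixels that contributed most to the "tumor" prediction at (row, col).
_, mask = explanation.get_image_and_mask(1, positive_only=True, num_features=5)
```

The segmentation_fn controls the superpixels LIME perturbs; the results below compare quick shift and Felzenszwalb.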
LIME for U-Net
Results: LIME explanations for single pixels
Patient case: ‘TCGA-HT-7874’
Brain slice: 75
Segmentation Algorithm: quick shift
Sequences: FLAIR
[Figure panels: Explanation | Prediction by U-Net | Tumor Label]
Results (cont.):
Explanatory Plots
[Figure: original sequences (FLAIR, T1, T1Gd, T2) with heatmap masks and mark-boundary plots of the explanations under quick shift and Felzenszwalb segmentation]
Results (cont.):
Metrics of explanations
| Segmentation | FLAIR | T1    | T1Gd  | T2    |
| Quick shift  | 74.9% | 72.0% | 89.7% | 74.9% |
| Felzenszwalb | 26.1% | 34.1% | 12.8% | 22.2% |
Table: proportion of tumor pixels included in explanations (%)
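The table’s metric can be computed as the fraction of ground-truth tumor pixels that fall inside the LIME explanation mask; a minimal sketch (the array names are assumptions):

```python
import numpy as np

def tumor_coverage(tumor_mask, explanation_mask):
    """Fraction of ground-truth tumor pixels included in the explanation mask.

    Both arguments are assumed to be binary (0/1) arrays of the same shape.
    """
    tumor_mask = np.asarray(tumor_mask, dtype=bool)
    explanation_mask = np.asarray(explanation_mask, dtype=bool)
    return (tumor_mask & explanation_mask).sum() / tumor_mask.sum()

# e.g. tumor_coverage(tumor_label, lime_mask) would give 0.749 for FLAIR + quick shift
```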
Future directions
Thank you!
Quantifying Uncertainty in a Tumor Segmentation Model
Claire Chu, Sara Colando, Dhruba Nandi, Xavier Serrano
GROUP 2
Transparency as Explainability
Transparency exposes a model’s properties to various stakeholders to better understand, improve, and contest model predictions.
Uncertainty quantification in models communicates to stakeholders:
(a) if and when they should trust model predictions
(b) how fair these predictions are, both sample-wide and for specific patients
So, uncertainty is transparency, and uncertainty is explainability.
How Does Uncertainty Enhance Explainability?
Explainable to Clinicians:
Explainable to Patients:
Explainable to Model Designers:
Central Goal:
Quantify model uncertainty by using a partially Bayesian neural network (pBNN) to communicate where the model is uncertain of its prediction.
Research Questions:
Outline of Methods

U-Net Architecture
Selected Layer
(Prabhudesai et al. 2022)
Bayesian Inference
Allows us to update the probability of a hypothesis as more data becomes available!
In a neural net:
Using Bayesian inference, the weights are sampled from a posterior distribution learned during training; pushing these samples through the network yields the push-forward posterior distribution over the outputs.
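In symbols (a standard formulation added here for clarity, not taken from the slides): with weights w, training data D, and a new slice x*,

```latex
p(w \mid \mathcal{D}) = \frac{p(\mathcal{D} \mid w)\, p(w)}{p(\mathcal{D})},
\qquad
p(y^{*} \mid x^{*}, \mathcal{D})
  = \int p(y^{*} \mid x^{*}, w)\, p(w \mid \mathcal{D})\, dw
  \approx \frac{1}{T} \sum_{t=1}^{T} p(y^{*} \mid x^{*}, w_{t}),
\quad w_{t} \sim p(w \mid \mathcal{D})
```

Sampling weights from the posterior and pushing each sample through the network gives the push-forward posterior distribution over the segmentation output.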
Example: Full Bayesian Neural Net
[Diagram: input → hidden layer → output, with the push-forward posterior distribution over the output]
Why Use a Partially Bayesian Neural Net?
Targeted Bayesian inference is applied to a small, strategically chosen single layer of the deep neural network, while the rest of the network is trained with less-expensive deterministic methods.
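A minimal sketch of this idea with TensorFlow Probability: one strategically chosen convolutional layer is replaced by a variational (Flipout) layer while the rest of the network stays deterministic, and repeated stochastic forward passes give per-pixel uncertainty. The layer placement and sizes are assumptions for illustration, not the exact pBNN trained here.

```python
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow.keras import layers

def tiny_pbnn(input_shape=(144, 144, 4)):
    inputs = tf.keras.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)  # deterministic
    x = tfp.layers.Convolution2DFlipout(                                  # the one Bayesian layer
        32, 3, padding="same", activation="relu")(x)
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)                # deterministic
    return tf.keras.Model(inputs, outputs)

model = tiny_pbnn()

# Monte Carlo forward passes: the Flipout layer resamples its weights on every
# call, so the spread of the predictions reflects model uncertainty.
x = np.random.rand(1, 144, 144, 4).astype("float32")
samples = np.stack([model(x, training=True).numpy() for _ in range(20)])
mean_prediction = samples.mean(axis=0)  # per-pixel tumor probability
uncertainty = samples.std(axis=0)       # per-pixel predictive uncertainty
```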
Promises of using a pBNN:
Training Summary:
Epochs = 400
Batch Size/Epoch: 256
Parameters: 7.8 million
Training Time: 11 hours
Tuning the Hyperparameters
Inaccurate Prediction but Not Uncertain?
Clustering of False Positive and False Negative?
[Figure: model inputs and outputs for this patient]
Female, age 41
37.13 month survival time
Tissue Source Site: Case Western - St. Joes
Study: Brain Lower Grade Glioma
Histology: oligodendroglioma (G3)
Female, age 66,
15.97 month survival time
Tissue Source Site: Duke
Study: Glioblastoma multiforme
Histology: glioblastoma (G3)
High Sensitivity
Higher Uncertainty in Predicted Boundary Regions
Comparing Uncertainty Across Truth Prediction Discrepancy Values
More certain for accurate classification.
More certain for false negatives than false positives.
[Figure: normalized uncertainty distributions for false negative, false positive, and accurate pixels]
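One way to reproduce this comparison, sketched below with stand-in arrays (assumptions for illustration): label each pixel as accurate, false positive, or false negative from the prediction and ground truth, then group the normalized per-pixel uncertainty by category.

```python
import numpy as np

# Stand-ins for the real arrays (assumptions): binary prediction and ground-truth
# masks plus a per-pixel uncertainty map (e.g., the std of Monte Carlo samples).
rng = np.random.default_rng(0)
prediction_mask = rng.random((144, 144)) > 0.5
tumor_label = rng.random((144, 144)) > 0.5
uncertainty_map = rng.random((144, 144))

unc = uncertainty_map / uncertainty_map.max()  # normalize to [0, 1]

categories = {
    "accurate": prediction_mask == tumor_label,
    "false positive": prediction_mask & ~tumor_label,
    "false negative": ~prediction_mask & tumor_label,
}
for name, mask in categories.items():
    print(name, float(unc[mask].mean()))  # mean normalized uncertainty per category
```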
Sample-wide Certainty ≠ Individual Level Certainty
[Figure: sample-wide vs. patient-specific normalized uncertainty distributions for false negative, false positive, and accurate pixels]
Male, age 67, 7.69 month survival time
Tissue Source Site: Thomas Jefferson University
Study: Brain Lower Grade Glioma
Histology: Astrocytoma (G3)
Female, age 70, 5.32 month survival time
Tissue Source Site: Case Western - St. Joes
Study: Brain Lower Grade Glioma
Histology: Astrocytoma (G3)
These patients’ clinical information is highly similar…
…But the Normalized Uncertainty Distributions Vary
Especially in False Positive and Accurate Discrepancies
Future Work
Investigating the implications of different kinds of model failure for clinical outcomes, and which kinds of failure clinicians consider most dangerous.
Collaborating with clinicians to better understand why the model fails in specific brain regions, and why false positive and false negative results tend to cluster.
Comparing model performance and uncertainty levels across various subsets (e.g. different tumor histologies, tissue source sites, patient sex, vital status, etc.).
References
Bhatt, Umang, Javier Antorán, Yunfeng Zhang, Q. Vera Liao, Prasanna Sattigeri, Riccardo Fogliato, Gabrielle Melançon, et al. 2021. “Uncertainty as a Form of Transparency: Measuring, Communicating, and Using Uncertainty.” In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, 401–13. AIES ’21. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3461702.3462571.
Prabhudesai, Snehal, Nicholas Wang, Vinayak Ahluwalia, Xun Huan, Jayapalli Bapuraj, Nikola Banovic, and Arvind Rao. 2021. “Stratification by Tumor Grade Groups in a Holistic Evaluation of Machine Learning for Brain Tumor Segmentation.” Frontiers in Neuroscience 15 (October). https://doi.org/10.3389/fnins.2021.740353.
Prabhudesai, Snehal, Dingkun Guo, and Jeremiah Hauth. 2022. “Partially Bayesian Neural Networks: Low-Cost Bayesian Uncertainty Quantification for Deep Learning in Medical Image Segmentation.”
Thank you!
Claire Chu cychu@email.unc.edu
Sara Colando skca2020@mymail.pomona.edu
Dhruba Nandi nandidhruba2019@gmail.com
Xavier Serrano serranox17@gmail.com
Visualization Using Embedding Projector
Jonathan Lin, Nguyen Tran-Bach, Ricardo Gloria-Picazzo, Savannah Gonzales
GROUP 3
Overview
Goal: Use TensorBoard to visualize and explain certain aspects of the machine learning model
After applying TensorBoard, we hope to obtain an intuitive visualization that sheds light on why the machine learning model behaves the way it does.
Embedding Projector
Daniel Smilkov, Nikhil Thorat, Charles Nicholson, Emily Reif, Fernanda B. Viégas, and Martin Wattenberg. Embedding projector: Interactive visualization and interpretation of embeddings, 2016.
GIF taken from Google
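A minimal sketch of logging embeddings so they appear in TensorBoard’s Embedding Projector, following the standard TensorFlow recipe; the embedding array and its shape are assumptions for illustration.

```python
import os
import numpy as np
import tensorflow as tf
from tensorboard.plugins import projector

log_dir = "logs/embeddings"
os.makedirs(log_dir, exist_ok=True)

# Assumed: `features` is an (n_samples, n_features) array, e.g. flattened
# predictions or intermediate-layer outputs for each slice.
features = np.random.rand(282, 512).astype("float32")

# Save the embeddings as a checkpointed variable that the projector can read.
embedding_var = tf.Variable(features, name="slice_embeddings")
checkpoint = tf.train.Checkpoint(embedding=embedding_var)
checkpoint.save(os.path.join(log_dir, "embedding.ckpt"))

config = projector.ProjectorConfig()
emb = config.embeddings.add()
emb.tensor_name = "embedding/.ATTRIBUTES/VARIABLE_VALUE"
projector.visualize_embeddings(log_dir, config)
# Then: tensorboard --logdir logs/embeddings  ->  "Projector" tab
```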
Embeddings shown in TensorBoard
PCA: finds a subspace onto which the projected data points have the highest empirical variance.
t-SNE: builds probability distributions from pairwise similarities in the data and minimizes the KL divergence between the high-dimensional and low-dimensional distributions; works well for clustering.
UMAP: similar to t-SNE, but with additional mathematical assumptions.
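For concreteness, a sketch of computing the same three projections offline with scikit-learn and umap-learn (the input array and its shape are assumptions):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
import umap

# Assumed: each row is one flattened image (e.g., a prediction mask).
X = np.random.rand(282, 144 * 144).astype("float32")

pca_3d = PCA(n_components=3).fit_transform(X)
tsne_3d = TSNE(n_components=3, perplexity=30, init="pca").fit_transform(X)
umap_3d = umap.UMAP(n_components=3).fit_transform(X)

print(pca_3d.shape, tsne_3d.shape, umap_3d.shape)  # (282, 3) each
```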
Ground Truth vs. Prediction
UMAP of ground truth and prediction images
PCA of ground truth vs. prediction:

t-SNE of ground truth vs. prediction:
Output of First Layer
UMAP of the output of the first layer
First layer (Conv2D) and filters

PCA of outputs of first layer:
UMAP of outputs of first layer:
t-SNE of outputs of first layer:
Challenges
Future Developments
MDS (n = 282; d = 11,943,936)
Thank you!
Jonathan Lin jlin900@berkeley.edu
Nguyen Tran-Bach tactb@mit.edu
Ricardo Gloria Picazzo ricardo.gloria@cimat.mx
Savannah Gonzales srgonzal@umich.edu
Takeaways
Three XAI frameworks:
[Diagram: us → AI explainability → broader adoption in healthcare and beyond]
References - Group 3
Acknowledgements
We would like to thank
Neural networks: machine learning models loosely inspired by the way human brains process information
Keiron O’Shea and Ryan Nash. An introduction to convolutional neural networks, 2015.
Convolutional neural networks (CNNs): a type of neural network primarily used for pattern recognition and image classification
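As a tiny illustration (hypothetical layer sizes, not the project’s model), a convolutional layer in Keras that scans an image with learned filters:

```python
import tensorflow as tf
from tensorflow.keras import layers

# A single convolutional layer: 16 learned 3x3 filters slide over the image,
# each producing one feature map that responds to a local pattern.
cnn = tf.keras.Sequential([
    tf.keras.Input(shape=(144, 144, 1)),
    layers.Conv2D(16, kernel_size=3, padding="same", activation="relu"),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(2, activation="softmax"),  # e.g., tumor vs. non-tumor classification
])
cnn.summary()
```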