1 of 84

Machine Learning Group

Big Data Summer Institute 2022

Department of Electrical Engineering and Computer Science

Department of Biostatistics

Claire Chu Sara Colando Ricardo Gloria Picazzo Savannah Gonzales Audrey Kim

Amaan Jogia-Sattar Jonathan Lin Dhruba Nandi Rui Nie Xavier Serrano Nguyen Tran-Bach

1

2 of 84

Presentation Outline

  • Background
  • Different eXplainable AI methods
    • Group 1
    • Group 2
    • Group 3
  • Joint conclusion
  • Questions

2


4 of 84

In medicine, diagnoses = important

  • Proper patient care hinges on specific, accurate, and timely diagnostics

  • e.g. different brain tumor grades require different interventions

4

5 of 84

Radiologists!

Radiologists to the rescue!

segmentation →

5

6 of 84

Radiologist’s dilemma: accuracy vs efficiency

Radiologists to the rescue!

segmentation →

6

  • Large workloads
  • Small time frames
  • High stakes


9 of 84

Optimizing radiology with AI

9

10 of 84

Do we trust AI?

10

11 of 84

AI Explainability

  • Demystifying the “Black-Box Problem”
  • eXplainable AI (XAI)

Precision Health → Clinician-Model Synchronicity

11

12 of 84

Drawbacks of Existing XAI Methods

XAI methods exist! So what’s the problem?

  • Huge computational demands
  • Incompatibility
    • Inputs (data)
    • Task (classification vs. segmentation)

12



16 of 84

Drawbacks of Existing XAI Methods

XAI methods exist! So what’s the problem?

  • Huge computational demands
  • Incompatibility
    • Inputs (data)
    • Task (classification vs. segmentation)

[Figure: classification (cow vs. no cow; class labels such as “cow”, “goat”, “bull”) contrasted with segmentation, which identifies where in the image the cow is]

16

Difficulty: XAI methods not prepared for segmentation


18 of 84

AI Model: U-Net

  • State-of-the-art convolutional neural network used for medical image segmentation
  • Layers exploit spatial correlations in the input data to create weighted feature maps
  • Segmentation: identifying the precise location of the entity of interest

Specifically,

  • Trained for brain tumor segmentation

18

7.8 million parameters!

Difficulty: Huge model → hard to train (computational costs)
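For intuition only, here is a minimal Keras sketch of the U-Net idea (encoder, bottleneck, decoder with skip connections), assuming one 144 × 144 slice with the 4 MRI sequences stacked as channels; the layer sizes are illustrative and this is not the group's 7.8-million-parameter model.

```python
# Minimal U-Net-style encoder-decoder sketch (hypothetical sizes, Keras assumed).
import tensorflow as tf
from tensorflow.keras import layers

def tiny_unet(input_shape=(144, 144, 4)):
    inputs = tf.keras.Input(shape=input_shape)       # 4 MRI sequences as channels (assumption)

    # Encoder: convolutions exploit spatial correlations, pooling downsamples.
    c1 = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
    p1 = layers.MaxPooling2D()(c1)                   # 72 x 72
    c2 = layers.Conv2D(32, 3, padding="same", activation="relu")(p1)
    p2 = layers.MaxPooling2D()(c2)                   # 36 x 36

    # Bottleneck.
    b = layers.Conv2D(64, 3, padding="same", activation="relu")(p2)

    # Decoder: upsample and concatenate encoder features (skip connections).
    u2 = layers.Conv2DTranspose(32, 2, strides=2, padding="same")(b)
    c3 = layers.Conv2D(32, 3, padding="same", activation="relu")(layers.concatenate([u2, c2]))
    u1 = layers.Conv2DTranspose(16, 2, strides=2, padding="same")(c3)
    c4 = layers.Conv2D(16, 3, padding="same", activation="relu")(layers.concatenate([u1, c1]))

    # One tumor probability per pixel (segmentation, not classification).
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(c4)
    return tf.keras.Model(inputs, outputs)

tiny_unet().summary()   # far fewer parameters than 7.8 million; the shape of the network is the point
```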


23 of 84

The Dataset

Per patient,

  • MRI scans:

Per slice,

  • Ground truth: radiologist-determined segmentations
  • Predictions: AI-generated segmentations

23

[Figure: 4 MRI sequences per patient, each a stack of 144 slices (images) of 144 × 144 pixels, which the U-Net maps to a predicted segmentation]

Difficulty: Using 4 sequences to produce one prediction

GDC Data. National Cancer Institute. https://portal.gdc.cancer.gov/
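To make those shapes concrete, here is a hypothetical NumPy sketch of one patient's data; the array names, the (sequence, slice, height, width) ordering, and the random placeholder values are all assumptions.

```python
import numpy as np

# One patient: 4 MRI sequences (FLAIR, T1, T1Gd, T2), each 144 slices of 144 x 144 pixels.
scan = np.random.rand(4, 144, 144, 144).astype("float32")        # (sequence, slice, height, width)
truth = (np.random.rand(144, 144, 144) > 0.95).astype("uint8")   # per-slice radiologist masks

# One model input: the 4 sequences of a single slice stacked as channels,
# i.e. the "4 sequences -> one prediction" difficulty noted above.
z = 75
x = np.stack([scan[s, z] for s in range(4)], axis=-1)            # (144, 144, 4)
y = truth[z]                                                      # (144, 144) ground-truth mask
print(x.shape, y.shape)
```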

24 of 84

Explainability Attempts

  • Group 1: LIME
  • Group 2: Quantifying Uncertainty
  • Group 3: Embeddings Projector

24

25 of 84

Modifying LIME to Explain Tumor Segmentation Predictions

Amaan Jogia-Sattar, Audrey Kim, Rui Nie

25

GROUP 1

26 of 84

26

LIME:

  • Model-agnostic
  • Individual predictions

[Diagram: expert segmentation vs. U-Net prediction; ideally they agree, otherwise false positives or false negatives arise]

27 of 84

27

Research trajectory

  • Modify and implement LIME
    • Accommodate the data inputs and the segmentation task
    • Which parts of the brain scan, in each MRI sequence, contribute positively to the final tumor segmentation prediction?
  • An exploratory study of the different image segmentation algorithms used within LIME

28 of 84

LIME: What is it?

28

  • Local: approximates model behavior in the vicinity of an individual prediction
  • Interpretable: the local surrogate model is sparse and linear
  • Model-agnostic: treats the original model as a black box
  • Explanation: the surrogate model's weights indicate the approximate importance of features
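To make these four properties concrete, here is a toy sketch of the LIME recipe for a single image, using scikit-image superpixels and a weighted Lasso surrogate; `predict_fn` (the black-box score for an image) and all parameter values are hypothetical, and this simplifies the method rather than reproducing the official lime package.

```python
import numpy as np
from skimage.segmentation import felzenszwalb
from sklearn.linear_model import Lasso

def lime_explain(image, predict_fn, n_samples=200, seed=0):
    """Toy LIME: perturb superpixels, query the black box, fit a sparse linear surrogate."""
    rng = np.random.default_rng(seed)
    segments = felzenszwalb(image, scale=100)         # superpixels = interpretable features
    n_segs = segments.max() + 1

    # Local neighborhood: random on/off masks over superpixels.
    masks = rng.integers(0, 2, size=(n_samples, n_segs))
    baseline = image.mean()                           # "off" superpixels are greyed out

    scores = []
    for m in masks:
        perturbed = image.copy()
        perturbed[~m[segments].astype(bool)] = baseline
        scores.append(predict_fn(perturbed))          # black-box prediction for this perturbation
    scores = np.asarray(scores)

    # Samples closer to the original image get more weight (locality).
    weights = np.exp(-((1.0 - masks.mean(axis=1)) ** 2) / 0.25)

    surrogate = Lasso(alpha=0.01)                     # sparse, linear, interpretable
    surrogate.fit(masks, scores, sample_weight=weights)
    return segments, surrogate.coef_                  # per-superpixel importance
```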

29 of 84

LIME Modifications

29

Traditional LIME:
  • Model input: one grayscale or RGB image at a time
  • Task: classification
  • Explanation target: class labels for each image
  • Explanation format: a mask of 0s and 1s indicating regions of contribution

LIME for U-Net:
  • Model input: 4 MRI sequences (images) for each scan
  • Task: segmentation (simulated with binary classification by pixel)
  • Explanation target: tumor/non-tumor label for each pixel
  • Explanation format: a mask of 0s and 1s indicating regions of contribution

30 of 84

LIME for U-Net

30

  • Sequence Extraction + Perturbation
  • “Black box” Prediction
  • Sparse Linear Surrogate Model
  • Superpixels
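A hedged sketch of how those steps might fit together for the U-Net case: perturb superpixels of one MRI sequence, read the black-box tumor probability at a single target pixel, and fit the sparse surrogate to that scalar. `unet_predict` is a hypothetical wrapper around the trained model, and the channel ordering and parameters are assumptions, not the group's exact implementation.

```python
import numpy as np
from skimage.segmentation import felzenszwalb
from sklearn.linear_model import Lasso

def lime_for_unet_pixel(x, unet_predict, pixel, sequence=0, n_samples=200, seed=0):
    """Explain the tumor probability that the U-Net assigns to one pixel.

    x            : (144, 144, 4) slice with the 4 MRI sequences as channels (assumption).
    unet_predict : callable returning a (144, 144) tumor-probability map (hypothetical wrapper).
    pixel        : (row, col) of the pixel whose prediction we explain.
    sequence     : index of the MRI sequence to perturb (assumed ordering).
    """
    rng = np.random.default_rng(seed)
    segments = felzenszwalb(x[..., sequence], scale=100)    # superpixels on one sequence
    n_segs = segments.max() + 1

    masks = rng.integers(0, 2, size=(n_samples, n_segs))
    baseline = x[..., sequence].mean()

    targets = []
    for m in masks:
        xp = x.copy()
        off = ~m[segments].astype(bool)
        xp[..., sequence][off] = baseline                   # perturb only that sequence
        targets.append(unet_predict(xp)[pixel])             # black-box output at the target pixel
    targets = np.asarray(targets)

    weights = np.exp(-((1.0 - masks.mean(axis=1)) ** 2) / 0.25)
    surrogate = Lasso(alpha=0.01)
    surrogate.fit(masks, targets, sample_weight=weights)
    return segments, surrogate.coef_    # which superpixels push this pixel toward "tumor"
```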


34 of 84

Results: LIME explanations for single pixels

34

Patient case: ‘TCGA-HT-7874’

Brain slice: 75

Segmentation Algorithm: quick shift

Sequences: FLAIR

[Panels: Explanation | Prediction by U-Net | Tumor Label]

35 of 84

Results (cont.): Explanatory Plots

35

[Figure grid: rows = sequences (original), quick shift, Felzenszwalb; columns = FLAIR, T1, T1Gd, T2]
Heatmap masks:

  • Idea: weighted mean of two types of masks: pixels predicted as tumor vs. non-tumor

Mask Boundary plots:

  • Idea: delineate using thresholds (e.g. 0.5) on heatmap masks
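A minimal sketch of how these two plot types could be produced, assuming the per-pixel explanation masks for tumor-predicted and non-tumor-predicted pixels are already available; the slides do not specify the weighting, so pixel counts are used as the weights here.

```python
import numpy as np
from skimage.segmentation import mark_boundaries

def explanation_plots(mask_tumor, mask_nontumor, background, n_tumor, n_nontumor, thresh=0.5):
    """Combine two explanation masks into a heatmap mask and a mask-boundary overlay.

    mask_tumor / mask_nontumor : (H, W) explanation masks for pixels the model predicted
                                 as tumor / non-tumor (assumed to be precomputed).
    background                 : (H, W) MRI slice, used only for display.
    n_tumor / n_nontumor       : counts used as the weights of the weighted mean (assumption).
    """
    # Heatmap mask: weighted mean of the two mask types.
    heatmap = (n_tumor * mask_tumor + n_nontumor * mask_nontumor) / (n_tumor + n_nontumor)

    # Mask-boundary plot: threshold the heatmap (e.g. at 0.5) and outline the region.
    region = (heatmap >= thresh).astype(int)
    overlay = mark_boundaries(background, region)
    return heatmap, overlay
```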

36 of 84

Results (cont.): Metrics of explanations

36

Table: proportion of tumor pixels included in explanations (%)

                FLAIR   T1      T1Gd    T2
  Quick shift   74.9    72.0    89.7    74.9
  Felzenszwalb  26.1    34.1    12.8    22.2
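The metric is not defined in code on the slide; a straightforward reading, sketched below, is the percentage of ground-truth tumor pixels that fall inside the explanation's highlighted region (the function name and exact definition are assumptions).

```python
import numpy as np

def tumor_coverage(explanation_mask, tumor_mask):
    """Percentage of ground-truth tumor pixels included in the explanation.

    explanation_mask : (H, W) boolean mask of pixels the LIME explanation highlights.
    tumor_mask       : (H, W) boolean ground-truth tumor mask.
    """
    tumor = tumor_mask.astype(bool)
    if tumor.sum() == 0:
        return np.nan                      # undefined when the slice has no tumor
    return 100.0 * (explanation_mask.astype(bool) & tumor).sum() / tumor.sum()
```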

37 of 84

Future directions

37

  • Reflect contextual information (e.g. clinical observations) in explanations, rather than relying solely on ground-truth segmentations and tumor vs. non-tumor labels.
  • Develop metrics for assessing explanation quality, and determine whether particular MRI sequences lead to better diagnoses.
  • Attempt global explanations built from a set of local instances.

38 of 84

Thank you!

Amaan Jogia-Sattar amaanjs@umich.edu

Audrey Kim audreyki@umich.edu

Rui Nie ruinie@umich.edu

38

39 of 84

Quantifying Uncertainty in a Tumor Segmentation Model

Claire Chu, Sara Colando, Dhruba Nandi, Xavier Serrano

39

GROUP 2

40 of 84

Transparency as Explainability

Transparency exposes a model’s properties to various stakeholders to better understand, improve, and contest model predictions.

40

Uncertainty Quantification in models communicates to stakeholders:

(a) whether and when they should trust model predictions, and

(b) how fair those predictions are, in both sample-wide and patient-specific cases.

So, Uncertainty is Transparency and Uncertainty is Explainability

41 of 84

How Does Uncertainty Enhance Explainability?

41

Explainable to Clinicians:
  • Allowing physicians to more confidently segment tumors
  • Clarity in review processes leading up to implementation of models in a clinical setting

Explainable to Patients:
  • Encourage trust between clinician and patient
  • Help patients understand strengths and limitations of models without an overload of technical information

Explainable to Model Designers:
  • Help model designers understand weaknesses
  • Collaboration with domain experts can clarify various types of errors and their implications

42 of 84

Central Goal:

Quantify model uncertainty by using a partially Bayesian neural network (pBNN) to communicate where the model is uncertain of its prediction.

Research Questions:

  1. Where is this model failing, and how is it failing to properly segment the tumor?
  2. In what cases is the model certain but still makes a mistake in tumor segmentation?

42


44 of 84

Outline of Methods

44

(Snehal Prabhudesai 2022)

45 of 84

U-Net Architecture

45

Selected Layer

(Snehal Prabhudesai 2022)

46 of 84

Outline of Methods

46

(Snehal Prabhudesai 2022)

47 of 84

Bayesian Inference

Allows us to update the probability of a hypothesis as more data becomes available!

In a neural net:

Using Bayesian inference, the weights are sampled from the posterior distribution learned during training; pushing these samples forward through the network yields a push-forward posterior distribution over the outputs.

47

[Diagram: example of a fully Bayesian neural net with input, hidden layer, and output; the sampled weights induce a push-forward posterior distribution over the output]

48 of 84

48

49 of 84

Why Use a Partially Bayesian Neural Net?

49

Targeted Bayesian inference on a small, strategically chosen layer of the deep neural network, with the rest of the network trained using less expensive deterministic methods.

Promises of using a pBNN:

  • Less computationally expensive than a fully Bayesian neural network.
  • Outputs a predicted value between 0 (no tumor) and 1 (tumor) for each pixel, which serves as a probability for pixel classification.
  • The standard deviation of sampled predictions quantifies model uncertainty, which increases explainability (sketched below).
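A minimal sketch of how the last two promises could be realized, assuming the trained pBNN is wrapped in a callable whose Bayesian layer re-samples its weights from the learned posterior on every forward pass; the wrapper name and sample count are hypothetical.

```python
import numpy as np

def predict_with_uncertainty(pbnn_sample_fn, x, n_samples=50):
    """Monte Carlo summary of a pBNN's per-pixel tumor probabilities.

    pbnn_sample_fn : callable performing one stochastic forward pass, i.e. the Bayesian
                     layer's weights are re-sampled on every call (hypothetical wrapper).
    x              : one input slice, e.g. shape (144, 144, 4).
    """
    draws = np.stack([np.asarray(pbnn_sample_fn(x)) for _ in range(n_samples)])  # (S, H, W)
    prob = draws.mean(axis=0)          # per-pixel probability in [0, 1] (0 = no tumor, 1 = tumor)
    uncertainty = draws.std(axis=0)    # per-pixel standard deviation -> uncertainty map
    segmentation = prob >= 0.5         # hard mask, if one is needed
    return prob, uncertainty, segmentation
```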

50 of 84

50

Tuning the Hyperparameters

Training summary:
  • Epochs: 400
  • Batch size/epoch: 256
  • Parameters: 7.8 million
  • Training time: 11 hours

51 of 84

Outline of Methods

51

52 of 84

52

Inaccurate Prediction but Not Uncertain?

Clustering of False Positive and False Negative?

[Figure: model inputs and outputs for this patient]

Female, age 41

37.13 month survival time

Tissue Source Site: Case Western - St. Joes

Study: Brain Lower Grade Glioma

Histology: oligodendroglioma (G3)


54 of 84

54

Female, age 66,

15.97 month survival time

Tissue Source Site: Duke

Study: Glioblastoma multiforme

Histology: glioblastoma (G3)

High Sensitivity

Higher Uncertainty in Predicted Boundary Regions

55 of 84

Comparing Uncertainty Across Truth Prediction Discrepancy Values

55

More certain for accurate classification.

More certain for false negatives than false positives.

  • Less certain when classifying a pixel as “tumor”.
  • More likely to be falsely confident that a pixel is “non-tumor” than “tumor”.

[Plot: normalized uncertainty distributions (0.0 to 1.0) for false negative, false positive, and accurate pixels]
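A sketch of how per-pixel uncertainties might be grouped by truth-prediction discrepancy to produce a comparison like the one above; normalizing by the maximum uncertainty is an assumption, since the slides do not state how the distributions were normalized.

```python
import numpy as np

def uncertainty_by_discrepancy(pred_mask, truth_mask, uncertainty):
    """Group per-pixel uncertainties by truth-prediction discrepancy.

    pred_mask, truth_mask : (H, W) boolean tumor masks (model prediction / radiologist truth).
    uncertainty           : (H, W) per-pixel uncertainty (e.g. Monte Carlo standard deviation).
    """
    pred, truth = pred_mask.astype(bool), truth_mask.astype(bool)
    u = uncertainty / uncertainty.max() if uncertainty.max() > 0 else uncertainty  # normalize (assumption)
    groups = {
        "false_positive": u[pred & ~truth],
        "false_negative": u[~pred & truth],
        "accurate": u[pred == truth],
    }
    means = {k: (v.mean() if v.size else np.nan) for k, v in groups.items()}
    return means, groups
```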

56 of 84

Sample-wide Certainty ≠ Individual Level Certainty

56

[Plots: normalized uncertainty distributions (0.0 to 1.0) for false positive, false negative, and accurate pixels, shown sample-wide and for two individual patients]

Male, age 67, 7.69 month survival time

Tissue Source Site: Thomas Jefferson University

Study: Brain Lower Grade Glioma

Histology: Astrocytoma (G3)

Female, age 70, 5.32 month survival time

Tissue Source Site: Case Western St. Joes

Study: Brain Lower Grade Glioma

Histology: Astrocytoma (G3)

57 of 84

These patients’ clinical information is highly similar

57

…But the Normalized Uncertainty Distributions Vary

Especially in False Positive and Accurate Discrepancies

58 of 84

Future Work

Investigating the implications of the different kinds of model failure for clinical outcomes, and which kinds of failure clinicians consider most dangerous.

58

Collaborating with clinicians to better understand why the model fails in specific brain regions, and why false positive and false negative results tend to cluster.

Comparing model performance and uncertainty levels across various subsets (e.g. different tumor histologies, tissue source sites, patient sex, vital status, etc.).

59 of 84

References

59

Bhatt, Umang, Javier Antorán, Yunfeng Zhang, Q. Vera Liao, Prasanna Sattigeri, Riccardo Fogliato, Gabrielle Melançon, et al. 2021. “Uncertainty as a Form of Transparency: Measuring, Communicating, and Using Uncertainty.” In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, 401–13. AIES ’21. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3461702.3462571.

Prabhudesai, Snehal, Nicholas Wang, Vinayak Ahluwalia, Xun Huan, Jayapalli Bapuraj, Nikola Banovic, and Arvind Rao. 2021. “Stratification by Tumor Grade Groups in a Holistic Evaluation of Machine Learning for Brain Tumor Segmentation.” Frontiers in Neuroscience 15 (October). https://doi.org/10.3389/fnins.2021.740353.

Prabhudesai, Snehal, Dingkun Guo, and Jeremiah Hauth. 2022. “Partially Bayesian Neural Networks: Low-Cost Bayesian Uncertainty Quantification for Deep Learning in Medical Image Segmentation.”

60 of 84

Thank you!

60

61 of 84

Visualization Using Embedding Projector

Jonathan Lin, Nguyen Tran-Bach, Ricardo Gloria-Picazzo, Savannah Gonzales

GROUP 3

62 of 84

Overview

Goal: Use TensorBoard to visualize and explain certain aspects of the machine learning model

  • While Uncertainty Maps and LIME are grounded in explainable AI, TensorBoard remains largely experimental, and has not been used extensively in the field.
  • Very little has been done to apply TensorBoard to explain tumor segmentation models.

After applying TensorBoard, we hope to obtain an intuitive visualization that sheds light on why the machine learning model behaves the way it does.

63 of 84

Embedding Projector

  • TensorFlow is an end-to-end open source platform for machine learning.
  • TensorBoard is a visualization toolkit provided by TensorFlow which includes an embedding projector tool.
  • Embedding: a technique that takes high-dimensional input data and plots them in a 2D or 3D space preserving some geometric structure.
  • Dimensionality reduction is extremely useful to visualize data especially when the dimension of the data is too high.

63


Daniel Smilkov, Nikhil Thorat, Charles Nicholson, Emily Reif, Fernanda B. Viégas, and Martin Wattenberg. Embedding projector: Interactive visualization and interpretation of embeddings, 2016.

64 of 84

GIF taken from Google

65 of 84

Embeddings shown in TensorBoard

  • PCA (Principal Component Analysis)

finds a linear subspace onto which the projected data points retain the highest empirical variance.

  • t-SNE (t-distributed Stochastic Neighbor Embedding)

creates a probability distribution by determining similarities in the data, and tries to minimize the KL divergence between the high-dimensional distribution and the low-dimensional one; works well for clustering.

  • UMAP (Uniform Manifold Approximation and Projection)

similar to t-SNE, but with additional mathematical assumptions.
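For intuition, the same three reductions can be run offline with scikit-learn and the umap-learn package (assumed installed); random vectors stand in for flattened masks or layer outputs, and the parameter choices are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
import umap   # the umap-learn package (assumed installed)

# One row per item to visualize; random placeholders stand in for flattened 144 x 144 images.
X = np.random.rand(117, 144 * 144).astype("float32")

pca_2d = PCA(n_components=2).fit_transform(X)                              # highest-variance projection
tsne_2d = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(X) # neighbor-preserving, good for clusters
umap_2d = umap.UMAP(n_components=2).fit_transform(X)                       # t-SNE-like, with manifold assumptions

print(pca_2d.shape, tsne_2d.shape, umap_2d.shape)                          # (117, 2) each
```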

65


Daniel Smilkov, Nikhil Thorat, Charles Nicholson, Emily Reif, Fernanda B. Viégas, and Martin Wattenberg. Embedding projector: Interactive visualization and interpretation of embeddings, 2016.

66 of 84

Ground Truth vs. Prediction

  • Aim to explain how accurate the model is.

  • 144 points representing the ground truth; 144 points representing the predictions; all of these are for a single patient.

66

UMAP of ground truth and prediction images
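A sketch of how such points might be logged for the embedding projector, following the standard TensorFlow projector recipe; the log directory, labels, and random placeholder vectors are assumptions, and in practice each row of `features` would be one flattened ground-truth or predicted mask.

```python
import os
import numpy as np
import tensorflow as tf
from tensorboard.plugins import projector

log_dir = "logs/embeddings"            # hypothetical location
os.makedirs(log_dir, exist_ok=True)

# 144 ground-truth + 144 predicted masks for one patient, flattened (placeholders here).
features = np.random.rand(288, 144 * 144).astype("float32")
labels = ["truth"] * 144 + ["prediction"] * 144

# Metadata file lets the projector color points by label.
with open(os.path.join(log_dir, "metadata.tsv"), "w") as f:
    f.write("\n".join(labels))

# Save the features as a checkpointed variable and point the projector at it.
weights = tf.Variable(features)
ckpt = tf.train.Checkpoint(embedding=weights)
ckpt.save(os.path.join(log_dir, "embedding.ckpt"))

config = projector.ProjectorConfig()
emb = config.embeddings.add()
emb.tensor_name = "embedding/.ATTRIBUTES/VARIABLE_VALUE"
emb.metadata_path = "metadata.tsv"
projector.visualize_embeddings(log_dir, config)
# Then run:  tensorboard --logdir logs/embeddings  and open the Projector tab,
# where PCA, t-SNE and UMAP views are available interactively.
```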

67 of 84

67

PCA of ground truth vs. prediction:

68 of 84

68

t-SNE of ground truth vs. prediction:

69 of 84

Output of First Layer

  • Aims to explain what the first layer does to the input.

  • 39 patients; from each patient we selected the 3 middle z-slices.

  • Each of the 39 * 3 = 117 points represents the output values of the first layer.

69

UMAP output of first layer
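A hypothetical Keras sketch of how those points could be produced: run inputs through only the first convolutional layer and flatten its output into one row per slice. The stand-in model below is a placeholder for the trained U-Net, so treat this as a pattern rather than the group's actual pipeline.

```python
import numpy as np
import tensorflow as tf

# Stand-in for the trained U-Net; only the first Conv2D layer matters for this sketch.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(144, 144, 4)),
    tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu", name="first_conv"),
    tf.keras.layers.Conv2D(1, 1, activation="sigmoid"),
])

def first_layer_features(model, slices):
    """One flattened row of first-layer activations per input slice."""
    extractor = tf.keras.Model(inputs=model.inputs, outputs=model.get_layer("first_conv").output)
    feats = extractor.predict(np.asarray(slices), verbose=0)     # (n, 144, 144, 16)
    return feats.reshape(len(slices), -1)

# e.g. 39 patients x 3 middle slices = 117 inputs (random placeholders here).
X = first_layer_features(model, np.random.rand(117, 144, 144, 4).astype("float32"))
print(X.shape)   # rows to feed into PCA / t-SNE / UMAP or the embedding projector
```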

70 of 84

First layer (Conv2D) and filters

70

71 of 84

71

PCA of outputs of first layer:

72 of 84

72

UMAP of outputs of first layer:

73 of 84

73

t-SNE of outputs of first layer:

74 of 84

Challenges

  • Computational intensiveness of certain techniques implemented by TensorBoard.
  • The visualization that TensorBoard provides, while capable of clustering data points, does not actually explain why those points are clustered the way they are.
  • Technical difficulties running TensorBoard on Armis2.
  • TensorBoard generally is used for classification models.
  • The layers may be too complex for the visualization to yield good results.

74

75 of 84

Future Developments

  • Apply TensorBoard and other dimensionality reduction techniques to subsequent layers in the model
  • Extend our use of dimensionality reduction for explaining model accuracy to all patients simultaneously, rather than only a single patient at a time.
  • Multidimensional scaling (MDS): maps the points into a lower dimension while trying to minimize a loss over pairwise distances. It is significantly faster than other methods when the number of data points is much smaller than the dimension of the data (see the sketch below).
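A small sketch of that MDS idea with scikit-learn, assuming a precomputed pairwise-distance matrix; the point count mirrors the next slide (n = 282) but the dimension is kept small here so the snippet runs quickly.

```python
import numpy as np
from sklearn.manifold import MDS
from sklearn.metrics import pairwise_distances

# n points, each of dimension d (random placeholders; d kept small here for speed).
X = np.random.rand(282, 2000).astype("float64")
D = pairwise_distances(X)                      # (282, 282): all that MDS actually needs
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
emb = mds.fit_transform(D)                     # (282, 2) layout approximating pairwise distances
print(emb.shape)
```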

75

76 of 84

MDS (n = 282; d = 11,943,936)

76

77 of 84

Thank you!

Jonathan Lin jlin900@berkeley.edu

Nguyen Tran-Bach tactb@mit.edu

Ricardo Gloria Picazzo ricardo.gloria@cimat.mx

Savannah Gonzales srgonzal@umich.edu

77

78 of 84

Takeaways

Three XAI frameworks:

  • LIME aims to explain local (instance-specific) decisions.
  • Uncertainty Quantification aims to pinpoint where and how the model’s predictions fail.
  • TensorBoard aims to help visualize how accurate the model is and what it learns in each layer.

78

[Diagram: us → AI explainability → broader adoption in healthcare and beyond]

79 of 84

References - Group 3

  1. Janet Bastiman. Explainability in AI: why you need it. Napier, 2021.
  2. Ahmed Hosny, Chintan Parmar, John Quackenbush, Lawrence H. Schwartz, and Hugo J. W. L. Aerts. Artificial intelligence in radiology. Nature Reviews Cancer, 18(8):500–510, 2018.
  3. Leland McInnes, John Healy, and James Melville. UMAP: Uniform manifold approximation and projection for dimension reduction, 2018.
  4. Long Nguyen. Multivariate and categorical data analysis (UMich STAT 601), fall 2016.
  5. Keiron O’Shea and Ryan Nash. An introduction to convolutional neural networks, 2015.
  6. Snehal Prabhudesai, Nicholas Chandler Wang, Vinayak Ahluwalia, Xun Huan, Jayapalli Rajiv Bapuraj, Nikola Banovic, and Arvind Rao. Stratification by tumor grade groups in a holistic evaluation of machine learning for brain tumor segmentation. Frontiers in Neuroscience, 15, 2021.
  7. Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional networks for biomedical image segmentation. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, 2015.
  8. Daniel Smilkov, Nikhil Thorat, Charles Nicholson, Emily Reif, Fernanda B. Viégas, and Martin Wattenberg. Embedding projector: Interactive visualization and interpretation of embeddings, 2016.
  9. Jie Tian, Di Dong, Zhenyu Liu, and Jingwei Wei. Chapter 1 - introduction. In Jie Tian, Di Dong, Zhenyu Liu, and Jingwei Wei, editors, Radiomics and Its Clinical Application, The Elsevier and MICCAI Society Book Series, pages 1–18. Academic Press, 2021.

79

80 of 84

Acknowledgements

We would like to thank

  • Dr. Nikola Banovic and Snehal Prabhudesai for proposing this project and fearlessly and patiently mentoring us.
  • Dan Barker for his technical expertise.
  • The BDSI Coordinators – Sabrina Olsson, Hanna Venera, and Jenna Bedrava – for organizing and making our research possible.
  • Dr. Bhramar Mukherjee for everything: this opportunity, bringing us together, and more.

80


82 of 84

Neural networks: machine learning models that mirror the way human brains process information

  • Input layer: receive inputs of high-dimensional data, which are split up and mapped to a hidden layer
  • Hidden layer: determines how each input will improve or worsen the final output, using what it learned from the previous layer
  • Purpose is to learn from inputs in order to optimize outputs

82

Keiron O’Shea and Ryan Nash. An introduction to convolutional neural networks, 2015.

83 of 84

Convolutional neural networks (CNNs): a type of neural network primarily used for pattern recognition and image classification

  • Hidden layers create a vector that tells us what parts of the image have identifiable features
  • Parts of image with easily identifiable features → larger weights; Parts of image with harder to identify features → smaller weights
  • Weights represent the effect that each pixel has on the final output image (see the sketch below)
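A minimal illustration of that idea, assuming TensorFlow/Keras: one convolutional layer turns a slice into a stack of weighted feature maps, and the layer's kernels are the weights being learned (toy sizes, unrelated to the project's U-Net).

```python
import numpy as np
import tensorflow as tf

image = np.random.rand(1, 144, 144, 1).astype("float32")      # one grayscale slice (batch of 1)
conv = tf.keras.layers.Conv2D(filters=8, kernel_size=3, padding="same", activation="relu")

feature_maps = conv(image)                 # (1, 144, 144, 8): one map per learned filter
kernels, biases = conv.get_weights()       # kernels: (3, 3, 1, 8), the layer's learned weights
print(feature_maps.shape, kernels.shape)
```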

83

Keiron O’Shea and Ryan Nash. An introduction to convolutional neural networks, 2015.
