Approaches and Challenges for using Artificial Intelligence in Medical Imaging

The 2nd Swiss Deep Learning Days

Kevin Mader

22nd of September 2017

CT

There’s Waldo!

WHAT DOES STAGING MEAN?

Categorizing the severity of the disease

  • 5 levels for the tumor
  • 4 levels for the lymph nodes
  • 2 levels for metastases
  • 8 different stages for treatment planning
  • 40 different possible categories

American Cancer Society 2016

WHY AUTOMATE STAGING?

  • Mis-staging patients can have serious consequences: both under- and over-treatment can drastically affect patient health
  • Currently, two physicians need over an hour to perform the initial assessment
  • Reports are often incomplete or missing key information

JUST DEEP LEARN!

  • Everybody is doing it!
    • Google → Inception, Show and Tell, Retina Disease
    • Facebook → DeepFace
    • Apple → Deploy DNN models on your iPhone
    • Baidu → Recognize complicated sentences and synthesize realistic speech
    • Watson → Win Jeopardy and read
  • Academia as well
    • Stanford → ImageNet, Visual Genome
    • U Washington → YOLO

WHAT TO LEARN?

Input

700 x 512 x 512 x 2 x 32bit [367MPixels]

Output

TNM Stage [3 values]
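
A quick back-of-the-envelope check of that input size (a sketch; the factor of 2 is assumed to be the two PET/CT channels):

    # slices x height x width x channels (PET + CT assumed)
    voxels = 700 * 512 * 512 * 2
    print(voxels / 1e6)       # ~367 MPixels, matching the slide
    print(voxels * 4 / 1e9)   # ~1.5 GB per scan at 32 bits per voxel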

JUST CLASSIFICATION?

  • Physicians do a lot more than just check a box
  • Detailed Report for each scan (50-1000 words)
    • Covers imaging technique, patient information, questions to be answered, other findings
    • Very lightly structured, very personalized content

JUST CLASSIFICATION?

  • Physicians examine images and look for lesions (detection)
    • Image[float] -> Image[boolean]
  • Each lesion is then classified into a lesion type (lesion classification)
    • Image[boolean] -> (Image[int], List[string])
  • Features are generated for each lesion
    • Image[int] -> List[FeatureVectors]
  • The collection of lesions is then used to classify the patient
    • (List[FeatureVectors], List[string]) -> (T_stage, N_stage, M_stage)
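
The same pipeline written as Python function stubs, to make the data flow explicit (a sketch; the names and array types are illustrative, not our actual code):

    from typing import List, Tuple
    import numpy as np

    def detect_lesions(scan: np.ndarray) -> np.ndarray:
        """Image[float] -> Image[boolean]: flag suspicious voxels."""
        ...

    def classify_lesions(mask: np.ndarray) -> Tuple[np.ndarray, List[str]]:
        """Image[boolean] -> (Image[int], List[string]): label each lesion."""
        ...

    def extract_features(labels: np.ndarray) -> List[np.ndarray]:
        """Image[int] -> List[FeatureVectors]: one vector per lesion."""
        ...

    def stage_patient(features: List[np.ndarray],
                      types: List[str]) -> Tuple[str, str, str]:
        """(List[FeatureVectors], List[string]) -> (T, N, M) stage."""
        ...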

IS DEEP LEARNING OUR SAVING GRACE?

Deep-learning a spiral pattern from noise-free data takes 6 layers, 30+ neurons, and 3,000+ epochs of training.

Programming a custom feature for spirals takes 1 line of code.
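
One plausible version of that single line (an assumption about which feature the slide means): in polar coordinates a spiral turns into alternating bands, so one hand-crafted feature separates the arms:

    import numpy as np

    # sin(theta - r) flips sign between the two arms of a spiral
    spiral_feature = lambda x, y: np.sin(np.arctan2(y, x) - np.sqrt(x**2 + y**2))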

IS DEEP LEARNING OUR SAVING GRACE?

GPU

  • ~16-24 GB of memory
  • Thousands of cores that handle SIMD very well
  • Handles branching very poorly

CPU

  • >1 TB of memory per node
  • Memory mapping
  • A few fast cores that branch well

InceptionV3 for medical volumes would need 338 GB for a batch size of 1

TRAINING DATA?

ImageNet

  • 59K pixels per image
  • 1.2M images
  • 0.04 variables (pixels) per example, i.e. far more examples than input variables
  • 1,200 examples per category

Lung Cancer

  • 157M pixels per patient
  • 2,000 patients
  • 78,000 variables (pixels) per example (3.9M times worse)
  • 50 examples per category (24 times worse)

TRAINING DATA?

We have gotten really far by learning to

  • stir big piles (stochastic gradient descent on deep neural networks)
  • really fast (GPUs)
  • with lots of data to pour in

https://xkcd.com/1838/

STARTING WITH DEEP LEARNING

A few models that work well even for medical images and limited training data

  • U-Net
  • 3D CNN
  • PyraMiD-LSTM

https://www.kaggle.com/kmader/simple-nn-with-keras
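
As a concrete starting point, a minimal 3D CNN in Keras (a sketch in the spirit of the linked notebook, not its exact contents; the patch size and layer widths are assumptions):

    from keras.models import Sequential
    from keras.layers import Conv3D, MaxPooling3D, Flatten, Dense

    model = Sequential([
        # Small 3D patches (32^3 voxels, 1 channel) keep GPU memory modest
        Conv3D(8, (3, 3, 3), activation='relu', input_shape=(32, 32, 32, 1)),
        MaxPooling3D((2, 2, 2)),
        Conv3D(16, (3, 3, 3), activation='relu'),
        MaxPooling3D((2, 2, 2)),
        Flatten(),
        Dense(32, activation='relu'),
        Dense(1, activation='sigmoid'),  # lesion vs. no lesion
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])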

WELL….

Using U-Net to segment bone requires around

  • 7M parameters to be fit
  • > 2,000 manually labeled images
  • Hours of training
  • Hours of execution time per volume on standard CPU systems

WELL….

A well-adapted approach with standard techniques requires around

  • 15 parameters
    • all of which have anatomical / physiological significance
  • 1% of the computational time

and delivers 2% higher accuracy.

LEARNING, FORGETTING, RELEARNING

  • Using deep learning to segment lungs works really well on training data (>99%)
  • But it fails simple sanity checks
    • Rotations by 90°
    • Noisy scanners
    • Images without lungs
  • Retraining causes it to forget how to do its original job

LITTLE DIFFERENCES

  • Every patient is positioned slightly differently in the scanner
  • Every scanner has slightly different artifacts
  • Every patient has at least a few hundred (or thousand) completely benign abnormalities

https://xkcd.com/1831/

JUST CLASSIFICATION?

  • Physicians examine images and look for lesions (detection)
    • Image[float] -> Image[boolean]
  • Each lesion is then classified into a lesion type (lesion classification)
    • Image[boolean] -> (Image[int], List[string])
  • Features are generated for each lesion
    • Image[int] -> List[FeatureVectors]
  • The collection of lesions is then used to classify the patient
    • (List[FeatureVectors], List[string]) -> (T_stage, N_stage, M_stage)

WE NEED TO LEARN SMART!

Incorporate things we already know

  • Lung tumors don’t show up in the brain (context)
  • Lungs are supposed to be full of air (normality)

Train networks more efficiently

  • Standard loss functions do not apply well to many medical problems
    • MSE, MAE (regression)
    • Dice, IoU (segmentation)
    • Accuracy and cross-entropy (classification)
  • One or two pixels of metastasis (<0.001% of the image) can COMPLETELY change treatment planning and outcome likelihoods
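
One workaround, as a hedged sketch: re-weight a Dice-style loss so the rare lesion voxels dominate it (the weight value is a tuning assumption, not a recommendation):

    from keras import backend as K

    def weighted_dice_loss(pos_weight=100.0, smooth=1.0):
        # Up-weight the rare positive voxels so a few pixels of
        # metastasis actually move the loss, unlike plain Dice/IoU
        def loss(y_true, y_pred):
            w = 1.0 + (pos_weight - 1.0) * y_true  # per-voxel weights
            intersection = K.sum(w * y_true * y_pred)
            denom = K.sum(w * (y_true + y_pred))
            return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)
        return loss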

INCORPORATING WHAT WE ALREADY KNOW

  • The body and anatomy are fairly well understood
  • We can use Bayesian models to incorporate all of this prior information
    • Bayesian Networks
      • Selectively incorporate information beyond the patient images themselves
    • Bayesian Conditional Random Fields
      • Begin to incorporate larger ideas about shape and structure

INCORPORATING WHAT WE ALREADY KNOW

What is normal?

  • Have the network imagine how the patient should look
    • Simulating normality (generative models)
  • Find regions that don't match this expectation well
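
A minimal sketch of that idea, assuming a convolutional autoencoder stands in for the generative model: train it on healthy slices only, then flag the regions it reconstructs poorly:

    import numpy as np
    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, UpSampling2D

    # Autoencoder fit on *normal* slices only (sizes are illustrative)
    autoencoder = Sequential([
        Conv2D(16, (3, 3), activation='relu', padding='same',
               input_shape=(128, 128, 1)),
        MaxPooling2D((2, 2)),
        Conv2D(8, (3, 3), activation='relu', padding='same'),
        UpSampling2D((2, 2)),
        Conv2D(1, (3, 3), activation='sigmoid', padding='same'),
    ])
    autoencoder.compile(optimizer='adam', loss='mse')

    # After training, a high per-pixel reconstruction error marks
    # regions that do not look "normal":
    # error_map = np.abs(scan - autoencoder.predict(scan))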

TRAINING MORE EFFICIENTLY

  • The neural networks that won ImageNet learned categories of images
    • A cat in the corner is the same as a cat in the middle
  • What does deep learning give us?
    • Convolutional layers
      • Fully position independent
      • Few parameters (a few thousand per layer)
    • Fully connected layers
      • Fully position dependent (every pixel counts)
      • A 512x512 image has 68 billion free parameters
      • A volume has 1.8e16 parameters (512^6)
  • Medicine is different
    • A bulge in the lung isn't the same as a bulge in your knee
    • A spot on your arm could be a life-threatening metastasis or an FDG tracer injection site
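
Those counts are just the square of the input size; a quick sanity check, assuming a dense layer that maps the input onto itself:

    # Dense layer from an input onto itself: weights = (input size)^2
    slice_pixels = 512 * 512
    print(slice_pixels ** 2)   # ~6.9e10, the "68 billion" free parameters
    volume_voxels = 512 ** 3
    print(volume_voxels ** 2)  # 1.8e16 = 512^6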

TRAINING MORE EFFICIENTLY

  • Decision boundaries are not linear
    • A lot of tumor might not always be a problem
    • A tiny bit of metastasis always is a problem
  • Class distributions are very skewed
    • Even a lot of cancer is still less than 1% of the image
  • Adversarial techniques let us 'learn the loss metric' rather than manually defining and tuning it
    • The learned loss adapts to get better results

HIGH QUALITY, TARGETED ANNOTATIONS

  • Focus the algorithms on the pixels that are important
  • Most of them aren't: < 0.02%

40 different categories of annotation

Incorporating targeted pieces of information

ANNOTATIONS ARE HARD

INCORPORATING OTHER INFORMATION IS DIFFICULT

The golden hammer of back-propagation doesn't apply to many existing algorithms

  • How to incorporate
    • Atlases
    • Known segmentations
  • Other image data
    • Registration
    • Different views

Using approaches like synthetic gradients allows many of these pieces to be decoupled

DeepMind: Synthetic Gradients

SMART LEARNING

ImageNet

  • 59K pixels per image
  • 1.2M images
  • 0.04 variables (pixels) per example
  • 1,200 examples per category

LungStage (+Annotations)

  • ~4,000 pixels per patient
  • 2,000 patients
  • 2 variables (pixels) per example (50 times worse)
  • 50 examples per category (24 times worse)

Other Challenges

CURRENT STATUS

~1,000 patients

>3,000 lesions

EVALUATING RESULTS

  • Nobody really knows what works best, so we have to try them all
  • A good cross-validation method is essential for figuring out which models learn well
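
A sketch of what 'good' means here, assuming patient-level leakage is the main trap: split by patient, never by slice, so nobody contributes to both training and validation (toy data for illustration):

    import numpy as np
    from sklearn.model_selection import GroupKFold

    # Toy stand-ins: 100 slices drawn from 20 patients
    X = np.random.rand(100, 64)
    y = np.random.randint(0, 2, 100)
    patient_ids = np.repeat(np.arange(20), 5)

    # GroupKFold keeps all slices of a patient in the same fold,
    # preventing the model from "recognizing" patients it trained on
    cv = GroupKFold(n_splits=5)
    for train_idx, valid_idx in cv.split(X, y, groups=patient_ids):
        pass  # fit on X[train_idx], evaluate on X[valid_idx]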

TRANSFER LEARNING

  • Medical images are grayscale
  • Pretrained models expect color (RGB) input
  • We can learn a pseudo-color mapping by prepending layers to the model
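
A minimal sketch of that prepending trick, assuming Keras and a stock ImageNet backbone (the layer name and input size are illustrative):

    from keras.applications import VGG16
    from keras.layers import Input, Conv2D
    from keras.models import Model

    gray_in = Input(shape=(224, 224, 1))
    # Learnable 1x1 convolution: grayscale -> 3 pseudo-color channels
    pseudo_rgb = Conv2D(3, (1, 1), padding='same', name='gray2rgb')(gray_in)
    backbone = VGG16(weights='imagenet', include_top=False)
    model = Model(inputs=gray_in, outputs=backbone(pseudo_rgb))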

EVALUATING RESULTS

  • Each network can take days to train on distributed HPC systems
  • > 10,000 different network structures
  • Finding the best will still take some time, but in the meantime, good enough will work

> 4 billion calculations per patient

VISUALIZING RESULTS

  • t-SNE can be a useful visualization
  • We can focus on the more difficult lesions
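
For example, a quick t-SNE embedding of per-lesion feature vectors with scikit-learn (the feature array is a stand-in):

    import numpy as np
    from sklearn.manifold import TSNE

    # Stand-in for per-lesion feature vectors (e.g. CNN activations)
    lesion_features = np.random.rand(300, 128)

    # Nearby points are lesions the model treats as similar, so lesions
    # that land in the wrong cluster are the difficult cases to focus on
    embedding = TSNE(n_components=2).fit_transform(lesion_features)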

VISUALIZING RESULTS

  • Looking at mistakes
  • Resample classes that perform poorly to improve them

COMMUNICATING RESULTS

As important as making a classification is delivering the confidence in that classification. Neural networks are naturally very bad at this, but Bayesian networks and simpler models can help.

CONSTRUCTING SIMPLE RULES

T1

Tumor ≤3 cm across its greatest dimension, surrounded by lung or visceral pleura, without bronchoscopic evidence of invasion more proximal than the lobar bronchus

T1a

Tumor ≤2 cm across its greatest dimension

T1b

Tumor >2 cm and ≤3 cm across its greatest dimension

T2

Tumor >3 cm and ≤7 cm or with any of the following features: involves main bronchus and is more than 2 cm distal to the carina; invades visceral pleura; associated with atelectasis or obstructive pneumonitis that extends to the hilar region without involvement of the entire lung

T2a

Tumor >3 cm and ≤5 cm across its greatest dimension

T2b

Tumor >5 cm and ≤7 cm across its greatest dimension

T3

Tumor >7 cm or any of the following features: direct invasion of the chest wall (including the superior sulcus), diaphragm, phrenic nerve, mediastinal pleura, or parietal pericardium; involvement of the main bronchus <2 cm distal to the carina (without involvement of the carina); associated atelectasis or obstructive pneumonitis of the entire lung; or a tumor nodule within the same lobe as that of the primary tumor

T4

Tumor of any size with invasion of the mediastinum, heart, great vessels, trachea, recurrent laryngeal nerve, esophagus, vertebral body, or carina or a separate tumor nodule within an ipsilateral lobe

Our Results
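
As a toy illustration of turning such definitions into code, a size-only sketch (real staging must also test invasion, location, and satellite nodules, as the rules above show):

    def t_stage_from_size(size_cm):
        # Size thresholds only; T2/T3/T4 also depend on invasion,
        # bronchus involvement, atelectasis, and separate nodules
        if size_cm <= 2.0:
            return 'T1a'
        if size_cm <= 3.0:
            return 'T1b'
        if size_cm <= 5.0:
            return 'T2a'
        if size_cm <= 7.0:
            return 'T2b'
        return 'T3'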

CONSTRUCTING SIMPLE RULES

Our Results

N0

No regional lymph node metastasis

N1

Metastasis in ipsilateral peribronchial or ipsilateral hilar and intrapulmonary lymph nodes, including direct extension

N2

Metastasis in ipsilateral mediastinal or subcarinal lymph nodes

N3

Metastasis in contralateral mediastinal, contralateral hilar, ipsilateral or contralateral scalene, or supraclavicular lymph nodes

INTERACTIVE TOOLS

UNROLLING RESULTS TO TEXT

LungStage found 20 suspicious tumor regions and decided on T4 because the shown lesions invade the mediastinum
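
Behind such a sentence can sit something as simple as a template; a hypothetical sketch (the function and its fields are made up for illustration):

    def unroll_to_text(n_regions, stage, reason):
        # Hypothetical template; the real system's fields may differ
        return ("LungStage found {} suspicious tumor regions and decided "
                "on {} because of {}.".format(n_regions, stage, reason))

    print(unroll_to_text(20, 'T4', 'the shown lesions, which invade the mediastinum'))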

NEXT STEPS

  • We are looking to involve other hospitals and health systems
  • We want to further improve the diagnosis of lung cancer
  • We want to let radiologists make our models better, by giving feedback and using the tools
  • Unsupervised approaches to find lesions better (RSNA 2017)

[Diagram: PACS, RIS, Search Engine, Curation / Annotation, Research, Machine Learning, Decision Support]

TEAM

  • CEO & Co-Founder: Joachim Hagger, Master of Science in Physics
  • CTO & Co-Founder: Dr. Kevin Mader, Doctor of Sciences ETHZ
  • CFO & Co-Founder: Flavio Trolese, Dipl. Ing. FH in Computer Science
  • Scientific Advisor & Co-Founder: Prof. Dr. Marco Stampanoni, Institute for Biomedical Engineering
  • CMO: Bram Stieltjes, MD, PhD, Research Group Leader
  • Dr. Thomas J Re, MD, MSEE: Research / Radiology Insights
  • Dr. med. Dipl. Phys. Gregor Sommer: Deputy Senior Physician, Cardiac and Thoracic Diagnostics
  • PD Dr. Tobias Heye: Senior Physician and Deputy Head of Cardiac and Thoracic Diagnostics
  • PD Dr. Alexander Sauter: Specialist in Radiology, Resident in Nuclear Medicine
  • Joshy Cyriac: Machine Learning / Software Engineering
  • Prof. Dr. Elmar M. Merkle: Chief Physician and Head of the Clinic for Radiology and Nuclear Medicine

Partner

Interdisciplinary and experienced team

4Quant: Big Image Analytics

Thank you for your attention!

Yes, we’re hiring!

Explore our data

Detect lung nodules in CT images

https://www.kaggle.com/kmader/lungnodemalignancy

Look for cancer in PET/CT images

https://www.kaggle.com/4quant/soft-tissue-sarcoma