1 of 47

AI for Product Managers

Build a Model with Google AutoML

Lucas Oliveira

Mentor / Project Reviewer

June 2019 - Present

2 of 47

Agenda

  • Project Summary
  • Project Walkthrough
  • Present some pain points
  • Q&A

3 of 47

PROJECT SUMMARY

4 of 47

Project Summary

  • Use AutoML to train a machine learning model that detects pneumonia in X-ray images

  • Fill in a Modelling Report with your findings and conclusions

5 of 47

Project Summary

Skills that will be learned

  • Upload a dataset and train a model with Google AutoML

  • Interpret the graphs and metrics shown in the results dashboard

  • Understand how model performance changes across different training configurations

6 of 47

Project Summary

Tips for starting the project

  • Upload and training can take a while, so before you start, check that your dataset is set up correctly: the right amount of data and the right folder structure

  • Read the documentation for a fuller view of the platform and how different companies use it: https://cloud.google.com/vision
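
One way to check the amount of data per class before uploading is to count the image files in each class folder. This is a minimal sketch, not part of the project materials: the function name `count_images_per_class` and the list of image extensions are assumptions, and it expects one sub-folder per class.

```python
from pathlib import Path

# Assumed image extensions; adjust to match the files in your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images_per_class(dataset_dir):
    """Return {class_folder_name: number_of_image_files} for one dataset folder."""
    counts = {}
    for class_dir in sorted(Path(dataset_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts


# Example usage (hypothetical path):
# print(count_images_per_class("dataset"))
```

Comparing the counts against the numbers required for each configuration catches mistakes before a long upload or training run.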

7 of 47

PROJECT WALKTHROUGH

8 of 47

Project Walkthrough

  • Downloading the dataset
  • Creating dataset file
  • Uploading the dataset
  • Training the model
  • Checking the results
  • Filling the model report

9 of 47

Downloading the dataset

10 of 47

Downloading the dataset

11 of 47

Creating dataset file

12 of 47

Creating dataset file

We will create 4 different datasets:

  • Clean/Balanced Dataset
  • Clean/Unbalanced Dataset
  • Dirty Dataset
  • 3-Class Dataset

(Refer to project lesson for more details)

13 of 47

Creating dataset file

Example for configuration 1:

dataset.zip
  dataset/
    normal/
    pneumonia/
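
The archive for this configuration could be built with a few lines of Python. This is a sketch under assumptions: the helper name `zip_dataset` is hypothetical, and it assumes a local `dataset` folder that already contains the `normal` and `pneumonia` sub-folders.

```python
import shutil
from pathlib import Path


def zip_dataset(dataset_dir, archive_stem="dataset"):
    """Zip dataset_dir so that the archive contains the folder itself
    (dataset/normal/..., dataset/pneumonia/...), not just its contents."""
    dataset_dir = Path(dataset_dir).resolve()
    return shutil.make_archive(
        str(archive_stem),            # archive path without the ".zip" suffix
        "zip",
        root_dir=dataset_dir.parent,  # directory the archive paths are relative to
        base_dir=dataset_dir.name,    # top-level folder stored inside the archive
    )


# Example usage (hypothetical path): writes dataset.zip in the current directory.
# zip_dataset("dataset")
```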

14 of 47

Uploading the dataset

15 of 47

Uploading the dataset

16 of 47

Uploading the dataset

17 of 47

Uploading the dataset

18 of 47

Uploading the dataset

19 of 47

Uploading the dataset

20 of 47

Uploading the dataset

21 of 47

Uploading the dataset

22 of 47

Uploading the dataset

23 of 47

Uploading the dataset

24 of 47

Training the model

25 of 47

Training the model

26 of 47

Training the model

27 of 47

Training the model

28 of 47

Training the model

29 of 47

Training the model

30 of 47

Training the model

31 of 47

Checking the results

32 of 47

Checking the results

33 of 47

Checking the results

34 of 47

Filling the model report

35 of 47

Some pain points

36 of 47

Describing Confusion Matrix

37 of 47

Describing Confusion Matrix

Here you will need to explicitly report two values:

  • True Positive Rate for Pneumonia Class

  • False Positive Rate for Normal Class

38 of 47

Describing Confusion Matrix

Students often get the first one right but miss the second:

  • When we say False Positive Rate for Normal Class, we are treating the Normal Class as the positive class

  • So a false positive here is an image that was classified as Normal (positive) when the correct classification was Pneumonia (negative)

39 of 47

Describing Confusion Matrix

Note that we are talking about rates, so you need to report each value as a percentage.
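
As an illustration, both rates can be computed from the four cells of a binary confusion matrix. The counts below are made up for the example and are not results from the project; the function names are hypothetical as well.

```python
# Hypothetical counts, keyed by (true label, predicted label).
confusion = {
    ("pneumonia", "pneumonia"): 90,
    ("pneumonia", "normal"): 10,
    ("normal", "pneumonia"): 5,
    ("normal", "normal"): 95,
}


def true_positive_rate(confusion, positive):
    """TP / (TP + FN), as a percentage, treating `positive` as the positive class."""
    tp = confusion[(positive, positive)]
    fn = sum(n for (true, pred), n in confusion.items()
             if true == positive and pred != positive)
    return 100.0 * tp / (tp + fn)


def false_positive_rate(confusion, positive):
    """FP / (FP + TN), as a percentage, treating `positive` as the positive class."""
    fp = sum(n for (true, pred), n in confusion.items()
             if true != positive and pred == positive)
    tn = sum(n for (true, pred), n in confusion.items()
             if true != positive and pred != positive)
    return 100.0 * fp / (fp + tn)


tpr_pneumonia = true_positive_rate(confusion, "pneumonia")  # 90 / (90 + 10)
fpr_normal = false_positive_rate(confusion, "normal")       # 10 / (10 + 90)
print(f"TPR (Pneumonia): {tpr_pneumonia:.1f}%")
print(f"FPR (Normal):    {fpr_normal:.1f}%")
```

In a two-class problem both rates share the same denominator: the number of images whose true label is Pneumonia.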

40 of 47

Explaining Threshold Variation

41 of 47

Explaining Threshold Variation

Here you will need to explain two things:

  • What happens to precision and recall when we increase the threshold value

  • Why these effects happen

42 of 47

Explaining Threshold Variation

43 of 47

Explaining Threshold Variation

Students often explain what happens but forget to explain why it happens:

  • Try to build a causal link using the quantities that precision, recall, and the threshold value have in common

  • Tip: Explore the formulas for precision and recall
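
A small experiment can help you explore the formulas. The sketch below uses made-up scores and labels (not project data) and a hypothetical helper, `precision_recall`, to count true positives, false positives, and false negatives at two thresholds.

```python
def precision_recall(scores, labels, threshold):
    """Compute (precision, recall) when every score >= threshold is predicted positive.

    Returns None for a metric whose denominator is zero.
    """
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 1:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall


# Hypothetical model scores and true labels (1 = positive class).
scores = [0.95, 0.85, 0.70, 0.60, 0.40, 0.20]
labels = [1, 1, 0, 1, 0, 0]

for threshold in (0.5, 0.8):
    precision, recall = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```

Raising the threshold shrinks the set of images predicted as positive. Recall cannot go up, because TP + FN (the number of truly positive images) is fixed and TP can only fall. Precision usually rises because false positives tend to drop out first, although this is not guaranteed for every dataset.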

44 of 47

Calculating Multiclass Metrics

45 of 47

Calculating Multiclass Metrics

Here you will need to describe two things:

  • The precision and recall observed in the 3-class experiment

  • How precision and recall are calculated in a 3-class scenario

46 of 47

Calculating Multiclass Metrics

Again, students often get the first one right but miss the second:

  • The way we calculate precision and recall for a multiclass classifier is different from a binary classifier

Formulas for binary classifier:

Formulas for multiclass classifier:
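
For a binary classifier, precision = TP / (TP + FP) and recall = TP / (TP + FN). With three classes, each class is treated as the positive class in turn and the per-class counts are then combined. The sketch below shows two common ways of combining them, micro-averaging (pool the counts) and macro-averaging (average the per-class metrics). It is an illustration only: the confusion matrix is made up, the function names are hypothetical, and which convention the results dashboard uses is not stated here.

```python
# Hypothetical 3-class confusion matrix: rows = true class, columns = predicted class.
matrix = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 2, 8],
]


def per_class_counts(matrix, k):
    """Return (TP, FP, FN) when class k is treated as the positive class."""
    n = len(matrix)
    tp = matrix[k][k]
    fp = sum(matrix[i][k] for i in range(n) if i != k)  # predicted k, truly another class
    fn = sum(matrix[k][j] for j in range(n) if j != k)  # truly k, predicted another class
    return tp, fp, fn


def micro_precision_recall(matrix):
    """Pool TP, FP, FN over all classes, then apply the binary formulas once."""
    counts = [per_class_counts(matrix, k) for k in range(len(matrix))]
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    return tp / (tp + fp), tp / (tp + fn)


def macro_precision_recall(matrix):
    """Apply the binary formulas per class, then take the unweighted mean."""
    counts = [per_class_counts(matrix, k) for k in range(len(matrix))]
    precisions = [tp / (tp + fp) for tp, fp, fn in counts]
    recalls = [tp / (tp + fn) for tp, fp, fn in counts]
    return sum(precisions) / len(precisions), sum(recalls) / len(recalls)


print("micro:", micro_precision_recall(matrix))
print("macro:", macro_precision_recall(matrix))
```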

47 of 47

Q&A