1 of 17

CSE 163

ML and Images

Suh Young Choi

🎶 Listening to: Doctor Strange soundtrack

💬 Before Class: What are you thinking about for your final project?

2 of 17

Announcements

No class next Monday and Wednesday (8/7 and 8/9)

  • Video lectures will be posted to Canvas
  • Lessons still released on Ed

Project Proposals due Monday 8/7

  • Turned in via Gradescope
  • No late work accepted, no resubmissions

Two more resubmission periods remaining, unless…

  • If 60% of the class fills out the final course eval (releasing 8/9), then you get a third!
  • You may submit any one THA and any one LR during this “bonus” period (8/16 – 8/18)


3 of 17

This Time

  • Machine learning with images

Last Time

  • Convolutions
  • Kernel operations


4 of 17

Machine Learning, revisited

Terms from machine learning

  • Features / labels
  • Learning algorithm
  • Model
  • Model class
  • Training set / test set
  • Parameters / Hyperparameters
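The terms above can be mapped onto a few lines of scikit-learn. This is only a sketch: the tiny dataset (hours studied, hours slept, pass/fail) is made up for illustration, and the choice of a decision tree with `max_depth=2` is an assumption, not something prescribed by the slides.

```python
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Made-up data for illustration only.
# Features: [hours_studied, hours_slept]. Label: 1 = pass, 0 = fail.
features = [[1, 4], [2, 5], [3, 6], [4, 6], [6, 7], [7, 8], [8, 7], [9, 8]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Training set / test set: hold out 25% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, random_state=0
)

# Model class: decision trees.
# Hyperparameter: max_depth, which we choose before training.
learner = DecisionTreeClassifier(max_depth=2, random_state=0)

# Learning algorithm: fit() searches for good splits on the training set.
# Model: the fitted tree. Its parameters are the splits it learned.
model = learner.fit(X_train, y_train)

predictions = model.predict(X_test)
print(predictions)
```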


5 of 17

ML + Images

How do we do machine learning on images?

  • Simplest: Unroll the image into a vector
  • Complex: Use other tools to extract features from the images

[Figure: "Raw Image", a grid of pixel values 10 through 90, next to "Unrolled Image", the same values 10, 20, ..., 90 laid out as a single vector]
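A minimal sketch of unrolling with NumPy, assuming the raw image is a 3x3 grayscale grid holding the values 10 through 90 (the grid shape and the row-by-row order are assumptions made for illustration):

```python
import numpy as np

# Hypothetical 3x3 grayscale image (shape assumed for illustration).
raw_image = np.array([
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
])

# Unroll the 2D grid row by row into one vector of 9 features.
unrolled = raw_image.reshape(-1)

print(raw_image.shape)  # (3, 3)
print(unrolled)         # [10 20 30 40 50 60 70 80 90]
```

Note that 10 and 40 sit directly above and below each other in the grid, but end up three positions apart in the vector.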

6 of 17

ML + Images

Pros: Simple transformation (just a call to reshape!)

Cons: Unrolling loses the idea of “neighboring” pixels (pixels directly above or below each other end up far apart in the vector)

  • Most machine learning models don’t take position of the features into account
  • This is where more complex models like convolutional neural networks come in to encode that local information as features


Despite these drawbacks, it can work in practice on some problems!
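As one illustration of unrolling working in practice, the sketch below trains a decision tree on scikit-learn's bundled 8x8 handwritten digit images. The dataset, the model class, and the 80/20 split are choices made for this example and are not taken from the slides.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

digits = load_digits()
images = digits.images   # shape (n_samples, 8, 8)
labels = digits.target   # the digit (0-9) shown in each image

# Unroll every 8x8 image into a vector of 64 pixel features.
features = images.reshape(len(images), -1)

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=0
)

model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# Accuracy on images the model has not seen during training.
test_accuracy = model.score(X_test, y_test)
print(features.shape, test_accuracy)
```

The model never sees which pixels were adjacent in the original image, yet it typically does far better than random guessing on this dataset.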

7 of 17

Neural Network

Based on how our brains work


8 of 17

Example

What is the output of this neuron if the first input is 0 and the second input is 1? The activation function is the step function (0 if negative, 1 otherwise). The bias is subtracted from the weighted sum before the activation function is applied.

[Figure: a neuron with weight 3 on the first input, weight -2 on the second input, and bias 4]

11 of 17

Example

Inputs: 0 (first), 1 (second). Weights: 3 (first), -2 (second). Bias: 4.

Weighted sum minus bias: 0 * 3 + 1 * (-2) - 4 = -6

squish(-6) = 0, because the step function returns 0 for negative values. The neuron outputs 0.
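The same calculation can be sketched in plain Python. The function names `step` and `neuron_output` are made up for this example; the weights, bias, and inputs are the ones from the slide.

```python
def step(value):
    """Step activation: 0 if the value is negative, 1 otherwise."""
    return 0 if value < 0 else 1


def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs, minus the bias, passed through step."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return step(weighted_sum - bias)


# Values from the example: inputs (0, 1), weights (3, -2), bias 4.
pre_activation = 0 * 3 + 1 * (-2) - 4
output = neuron_output([0, 1], [3, -2], 4)

print(pre_activation)  # -6
print(output)          # 0
```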

12 of 17

Image Classification


13 of 17

Unsupervised Learning

So far, we have seen supervised machine learning, where we explicitly show the algorithm the labels

Unsupervised machine learning lets the algorithm try to learn trends on its own without providing explicit labels

Examples

  • Clustering
  • Outlier detection
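A small sketch of clustering, using scikit-learn's `KMeans` as one possible algorithm (the slides do not name a specific one). The six 2D points are made up so that two groups are obvious. No labels are passed to the model.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up points: three near (1, 1) and three near (8, 8).
points = np.array([
    [1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
    [8.0, 8.2], [8.1, 7.9], [7.8, 8.0],
])

# Ask for two clusters. Only the points are given, never any labels.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(points)

# The cluster numbers themselves are arbitrary (0 vs. 1), but points
# in the same group should share the same number.
print(cluster_ids)
```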


14 of 17

Project Proposal Details

Multiple Datasets

  • Must have at least 3 datasets that are used together in some way
  • At least two of the research questions should involve at least two datasets
  • Must have at least one join/merge operation between the datasets.
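For reference, a join/merge in pandas might look like the sketch below. Both tables, their column names, and their values are invented for illustration; a real project would load its own datasets.

```python
import pandas as pd

# Two made-up datasets that share a "state" column.
population = pd.DataFrame({
    "state": ["WA", "OR", "CA"],
    "population_millions": [7.8, 4.2, 39.0],
})
sites = pd.DataFrame({
    "state": ["WA", "OR", "ID"],
    "num_sites": [3, 1, 2],
})

# Inner join: keep only the states that appear in both datasets.
merged = population.merge(sites, on="state", how="inner")
print(merged)
```

Here only WA and OR survive the merge, since CA and ID each appear in just one of the tables.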

Messy Data

  • Data cannot be in a file format that we have worked with in class (.csv, .shp, .json)
  • Data obtained from web scraping or an API, or data that requires lots of preprocessing, all count as messy data
    • Example: using imputation for large amounts of missing data


15 of 17

Project Proposal Details

Result Validity

  • Verify the validity of your results using statistical testing or some other known testing method from domain expertise
  • Any test you use must be justified and explained in the context of your data and research questions
  • Clearly interpret the results of the validity tests alongside preliminary results from your analysis

Machine Learning

  • Cannot use only the DecisionTreeRegressor or DecisionTreeClassifier pipeline from class
  • Must use a new model class (scikit-learn has many!) OR
  • Train several Decision Trees with differing hyperparameters
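One way the last option could look is sketched below: several decision trees that differ only in `max_depth`, compared on a held-out split. The iris dataset, the depth values, and the 70/30 split are assumptions for this example; a project would use its own data and may prefer a separate validation set for choosing hyperparameters.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Train one tree per hyperparameter setting and record its accuracy.
scores = {}
for max_depth in [1, 3, 5]:
    tree = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
    tree.fit(X_train, y_train)
    scores[max_depth] = tree.score(X_test, y_test)

for max_depth, accuracy in scores.items():
    print(f"max_depth={max_depth}: accuracy={accuracy:.3f}")
```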


16 of 17

Project Proposal Details

New Library

  • Must use at least one library that was not introduced in class to answer at least two research questions
  • The new library may be used in tandem with one of the other challenge goals
    • E.g., using PyTorch to create advanced machine learning models
  • Multiple new libraries do not count as separate challenge goals
    • E.g., using SciPy and Plotly together counts as only one challenge goal

More details and examples can be found on the course website.


17 of 17

Before Next Time

  • Complete Lesson 21
    • Remember: lessons are not for points, but they do count toward Checkpoint Tokens
  • Checkpoint 6 releasing after class today
  • Project Proposals due on Monday!

Next Time

  • Ethics and Data Science
