1 of 21

How can a computer learn to classify objects from examples?

Unit 3, Module 3.2

2 of 21

Pastas (cut out)

Corkscrew

Rigatoni

Ravioli

Cascatelli

Tortellini

Penne

Macaroni

Fusilli

Farfalle

Ziti

Training Set

3 of 21

Try It #1: Pasta Land - Creating a Pasta Classifier

Setup: (a) Get a big sheet of poster paper for drawing your tree; (b) cut out pictures of pastas (or use real pastas provided by your teacher) that you can move around the nodes of your tree.

Step 1: Propose a bunch of features to distinguish between the pasta types

Step 2: Pick a feature that will split the pastas into two roughly equal groups

  • Doing this at every node keeps the number of questions needed to reach any pasta close to the minimum
  • This is similar to how real machine learning algorithms decide which feature to use for the next node

Step 3: Use the features to build the decision tree.

  • Make sure every one of your pastas is a leaf
  • If you end up with two types of pasta at the same leaf, you may not have enough features

Step 4: Test this out and update the features and the tree

Step 5: Convert the tree into a table.

  • The columns are the features (each one replaces a question from the tree)
  • The rows are the leaves
  • If the feature doesn’t apply, put N/A
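Step 5 can also be sketched in code. The tree below is a made-up example, not the one your class will build: the questions (`is_tube`, `has_ridges`, `is_filled`) and the placement of each pasta are assumptions chosen only to show how each leaf becomes a row and how unasked questions become N/A.

```python
# A hand-built tree written as (question, yes_branch, no_branch).
# A plain string is a leaf. Questions and placements are hypothetical.
TREE = ("is_tube",
        ("has_ridges", "Rigatoni", "Macaroni"),
        ("is_filled", "Ravioli", "Farfalle"))


def questions(tree):
    """Collect every question in the tree, in the order first seen."""
    if isinstance(tree, str):
        return []
    question, yes_branch, no_branch = tree
    found = [question]
    for q in questions(yes_branch) + questions(no_branch):
        if q not in found:
            found.append(q)
    return found


def tree_to_table(tree, columns, path=None):
    """Return {pasta: row}; a question never asked on the way is 'N/A'."""
    path = path or {}
    if isinstance(tree, str):
        return {tree: [path.get(c, "N/A") for c in columns]}
    question, yes_branch, no_branch = tree
    rows = tree_to_table(yes_branch, columns, {**path, question: "yes"})
    rows.update(tree_to_table(no_branch, columns, {**path, question: "no"}))
    return rows


columns = questions(TREE)
table = tree_to_table(TREE, columns)

print("Pasta", *columns)
for pasta, row in table.items():
    print(pasta, *row)
```

Each row records only the answers on the path from the root to that leaf, which is why some cells are N/A.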

4 of 21

Step 1: List all the features you think may be useful to distinguish the pastas.

Step 2: Create a root node and place all the pastas there.

A feature is a property of a thing we are trying to describe.

Step 3: Pick a feature from the list that splits the pasta examples in the current node roughly in half.

5 of 21

Yes

No

Step 4: Create two children of the current node and split its pastas between them based on the feature.

6 of 21

Is it long?

Yes

No

Yes

No

Step 4: Create two children of the current node and split its pastas between them based on the feature.
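Steps 3 and 4 can be sketched in a few lines of code. The yes/no answers below are assumptions made up for illustration, and the helper names (`yes_count`, `most_even_feature`, `split`) are hypothetical. The sketch picks the feature whose yes and no groups are closest in size, then creates the two children.

```python
# Hypothetical yes/no answers for six pastas from the training set.
PASTAS = {
    "Rigatoni":   {"is_tube": True,  "is_filled": False, "is_twisted": False},
    "Penne":      {"is_tube": True,  "is_filled": False, "is_twisted": False},
    "Macaroni":   {"is_tube": True,  "is_filled": False, "is_twisted": False},
    "Ravioli":    {"is_tube": False, "is_filled": True,  "is_twisted": False},
    "Tortellini": {"is_tube": False, "is_filled": True,  "is_twisted": False},
    "Fusilli":    {"is_tube": False, "is_filled": False, "is_twisted": True},
}


def yes_count(names, feature):
    """How many of the pastas in this node answer 'yes' to the feature."""
    return sum(1 for name in names if PASTAS[name][feature])


def most_even_feature(names, features):
    """Step 3: the feature whose yes/no groups are closest to equal size."""
    return min(features,
               key=lambda f: abs(2 * yes_count(names, f) - len(names)))


def split(names, feature):
    """Step 4: divide the node's pastas between a yes child and a no child."""
    yes = [n for n in names if PASTAS[n][feature]]
    no = [n for n in names if not PASTAS[n][feature]]
    return yes, no


root = list(PASTAS)
best = most_even_feature(root, ["is_filled", "is_twisted", "is_tube"])
yes_child, no_child = split(root, best)
print(best, yes_child, no_child)
```

With these made-up answers, `is_tube` splits the six pastas three and three, so it is chosen over `is_filled` (two and four) and `is_twisted` (one and five).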

7 of 21

Feature Vector Description of Pasta

Add more columns to fit all your features & more rows to fit all your pastas

Pasta

Spaghetti

Lasagne

Rigatoni

Ravioli
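As a rough illustration of what a filled-in table looks like to a computer, the sketch below stores one row per pasta and turns each row into a feature vector of 1s and 0s. The column names other than "is it long" and all of the answers are assumptions; your own table will have different features.

```python
# Hypothetical columns; "is_long" mirrors the "Is it long?" question.
FEATURES = ["is_long", "is_flat", "is_tube", "is_filled"]

# One row per pasta. The answers are made up for illustration.
TABLE = {
    "Spaghetti": {"is_long": True,  "is_flat": False, "is_tube": False, "is_filled": False},
    "Lasagne":   {"is_long": True,  "is_flat": True,  "is_tube": False, "is_filled": False},
    "Rigatoni":  {"is_long": False, "is_flat": False, "is_tube": True,  "is_filled": False},
    "Ravioli":   {"is_long": False, "is_flat": True,  "is_tube": False, "is_filled": True},
}


def feature_vector(pasta):
    """The pasta's row as a list of 1s and 0s, in column order."""
    return [int(TABLE[pasta][f]) for f in FEATURES]


vectors = {pasta: feature_vector(pasta) for pasta in TABLE}

# If two pastas share the same vector, the features cannot tell them apart.
all_distinct = len({tuple(v) for v in vectors.values()}) == len(vectors)

for pasta, vector in vectors.items():
    print(pasta, vector)
print("Every pasta has its own vector:", all_distinct)
```

The final check mirrors the rule from the tree-building step: two pastas with identical rows would land on the same leaf, which signals that another feature is needed.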

8 of 21

Use this test set to see how good your features are at classifying a new set of pastas

Mezzaluna

Mafaldine

Calamarata

Penne

Ravioli

Shell

9 of 21

Or Pick Your Own Set of Pastas

10 of 21

Feature Vector Description of Pasta

Pasta

Spaghetti

Lasagne

Rigatoni

Ravioli

11 of 21

Test Set - Data Collection Sheet

Pasta

Correctly classified

Incorrectly classified

Spaghetti

Lasagne

Rigatoni

Ravioli
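The data collection sheet can be filled in by code as well. This is a minimal sketch under stated assumptions: the tree, the test pastas chosen, and their yes/no answers are all hypothetical. It walks each test pasta down the tree and records whether the leaf it reaches matches its real name.

```python
# Hypothetical tree built from a training set: (question, yes, no).
TREE = ("is_tube",
        ("has_ridges", "Rigatoni", "Macaroni"),
        ("is_filled", "Ravioli", "Farfalle"))

# Hypothetical answers for three pastas from the test set.
TEST_SET = {
    "Penne":     {"is_tube": True,  "has_ridges": True,  "is_filled": False},
    "Ravioli":   {"is_tube": False, "has_ridges": False, "is_filled": True},
    "Mezzaluna": {"is_tube": False, "has_ridges": False, "is_filled": True},
}


def classify(tree, answers):
    """Follow yes/no answers from the root until a leaf is reached."""
    while not isinstance(tree, str):
        question, yes_branch, no_branch = tree
        tree = yes_branch if answers[question] else no_branch
    return tree


correct, incorrect = [], []
for pasta, answers in TEST_SET.items():
    guess = classify(TREE, answers)
    (correct if guess == pasta else incorrect).append(pasta)

accuracy = len(correct) / len(TEST_SET)
print("Correctly classified:", correct)
print("Incorrectly classified:", incorrect)
print("Accuracy:", round(accuracy, 2))
```

In this made-up run, Penne and Mezzaluna are misclassified because the tree has no leaf for them: a tree can only answer with a pasta it saw in training.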

12 of 21

Reflection

How well did your pasta classifier do on this new test set? What pastas did it classify incorrectly?

Were your features flexible enough to allow you to classify new pasta? Why or why not?

How would you change your features to allow them to better classify a wider range of pastas?

13 of 21

Takeaways

14 of 21

If your data is in a feature vector table like this then you don’t have to build the tree yourself

People don’t usually build decision trees by hand because doing so is hard for realistic tasks.

For practical problems, people build a table that may have thousands of rows (examples) and dozens of columns (features). Each example is labeled with the correct decision (i.e., the class it belongs to).

Then they use a machine learning algorithm to create the decision tree.

Machine Learning is a process for constructing a reasoner (such as a decision tree classifier) using training data and a learning algorithm.
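To make the idea concrete, here is a minimal sketch of a learning algorithm that builds a decision tree from a labeled table. It reuses the Pasta Land rule of splitting roughly in half. Real decision tree learners usually score splits with measures such as information gain or Gini impurity instead. The function name `learn_tree`, the feature names, and the training rows are all assumptions for illustration.

```python
# Training data: (feature answers, label). Values are hypothetical.
TRAINING = [
    ({"is_long": True,  "is_flat": False, "is_filled": False}, "Spaghetti"),
    ({"is_long": True,  "is_flat": True,  "is_filled": False}, "Lasagne"),
    ({"is_long": False, "is_flat": False, "is_filled": False}, "Rigatoni"),
    ({"is_long": False, "is_flat": True,  "is_filled": True},  "Ravioli"),
]
FEATURES = ["is_long", "is_flat", "is_filled"]


def learn_tree(rows, features):
    """Return a label (leaf) or a (feature, yes_subtree, no_subtree) node."""
    labels = {label for _, label in rows}
    if len(labels) == 1:
        return labels.pop()                  # only one class left: a leaf

    def yes_count(feature):
        return sum(1 for answers, _ in rows if answers[feature])

    # Only features that actually divide the rows are worth asking about.
    usable = [f for f in features if 0 < yes_count(f) < len(rows)]
    if not usable:
        raise ValueError("not enough features to separate: "
                         + ", ".join(sorted(labels)))

    # Pasta Land rule: prefer the split closest to half and half.
    best = min(usable, key=lambda f: abs(2 * yes_count(f) - len(rows)))
    yes_rows = [row for row in rows if row[0][best]]
    no_rows = [row for row in rows if not row[0][best]]
    return (best, learn_tree(yes_rows, features), learn_tree(no_rows, features))


tree = learn_tree(TRAINING, FEATURES)
print(tree)
```

The `ValueError` corresponds to the hand-built case where two pastas end up at the same leaf: the algorithm cannot separate them without another feature.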

15 of 21

Pasta Land vs Mini-Akinator

Pasta Land is a simulation of a machine learning algorithm

  • We knew all the examples/classes in advance
  • We knew all the features in advance
  • When we constructed the tree
    • We used a node-splitting process
    • We were selective about the features we chose in our tree
  • We used an algorithm to build the tree

Takeaway - The new way we built the tree in Pasta Land is what a machine learning algorithm will do for you.

Mini-Akinator is a simulation of creating and growing a decision tree

  • We didn’t know the examples in advance
  • We didn’t know all the features in advance
  • Instead we added the examples one at a time and discovered the features as we went along
  • When we constructed the tree
    • We grew it step by step, each time we added a new example
  • We couldn’t predict what to ask until we had another example
  • We might have to ask a lot more questions using this process.

Takeaway - We learned how to create and grow a decision tree and got some insight into how Akinator works.
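The Mini-Akinator style of growing a tree one example at a time can be sketched as follows. This is an illustration under assumptions, not how Akinator itself is implemented: the function `add_example`, the questions, and the order in which pastas arrive are all hypothetical.

```python
def add_example(tree, name, answers, new_question):
    """Add one example to the tree and return the updated tree.

    tree:         None (empty), a leaf name, or (question, yes, no)
    answers:      the new example's yes/no answers to existing questions
    new_question: a question answered 'yes' by the new example and 'no'
                  by whatever leaf it collides with
    """
    if tree is None:
        return name                          # first example becomes the root
    if isinstance(tree, str):
        if tree == name:
            return tree                      # already in the tree
        return (new_question, name, tree)    # grow: leaf becomes a question
    question, yes_branch, no_branch = tree
    if answers[question]:
        return (question,
                add_example(yes_branch, name, answers, new_question),
                no_branch)
    return (question,
            yes_branch,
            add_example(no_branch, name, answers, new_question))


tree = None
tree = add_example(tree, "Spaghetti", {}, None)
tree = add_example(tree, "Ravioli", {}, "is_filled")
tree = add_example(tree, "Penne", {"is_filled": False}, "is_tube")
print(tree)
```

Because each question is invented only when two pastas collide, the tree can grow lopsided: here Spaghetti needs two questions while Ravioli needs one, which is why this process may ask more questions than the Pasta Land approach.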

16 of 21

Let’s look at some examples from the real world

17 of 21

Try It #2: Think of a real-world example

Step 1: Pick a situation in the real world where computers might make decisions

Step 2: Think about the attributes that would contribute to making good decisions

Step 3: Reflection: Think about how your life is being affected by classifier systems (automated decision making).

Examples

  • Classifying toys based on age. What kinds of features make a toy suitable?
  • Classifying games, movies, and music
  • Reviewing loan applications. What kinds of things help us determine the strength of a loan application?
  • College applications. What kinds of things make for a strong college application?

18 of 21

Computers don’t learn the way people do.

19 of 21

How do people learn?

Discussion

20 of 21

Ways people learn

  • Being taught by someone (e.g., parent, teacher, sibling, friend)
  • Asking questions
  • Observing others
  • Reading
  • Trial and error
  • Systematic experimentation
  • Everyday life experiences

Note: None of these are things that computers currently do well.

21 of 21

Ways Computers Learn

  • Learning from labeled examples (like Pasta Land)
    • This is called supervised learning
    • We will only explore supervised learning in this unit
  • Finding clusters in data
    • This is called unsupervised learning because the data does not come pre-labeled
    • Example: examining data on plants collected from a forest and deciding how many different species of plants there are in the forest
  • Learning from experience via trial and error
    • This is called reinforcement learning
    • This is how computers learn good strategies for playing games
