1 of 21

How can a computer learn to classify objects from examples?

Unit 3, Module 3.2


How do we describe objects to a decision making system?

Feature representation
These features come from the questions

Need to fill in the gap from Module 3.2 and 3.4

Pasta Land is a simulation of a machine learning algorithm

We know all the classes and features in advance when constructing the tree

Difference between Pasta Land and Akinator

Akinator - expand tree incrementally - grow step by step
We couldn’t predict what to ask until we had another example

Your data is a feature vector - it isn’t a picture of pasta.

Lives are affected b

In module 3.4 we will create classifiers and predictors with AI lab.

Machine Learning for Kids does use decision trees

What do AI machines need? They need labeled examples.

Feature vectors are labeled examples.

We can’t dump the rigatoni into an AI machine.

Images have pixels

Other objects need feature vectors to describe them.

Computers only know the feature values. They don’t know the questions.

The learning algorithms just look at the feature vectors.

Real world connection

ID3, CART,
They calculate statistics that will give you a good split

2 of 21

Pastas (cut out)


Corkscrew

Rigatoni

Ravioli

Cascatelli

Tortellini

Penne

Macaroni

Fusilli

Farfalle

Ziti

Training Set

3 of 21

Try It #1: Pasta Land - Creating a Pasta Classifier

Setup: (a) Get a big sheet of poster paper for drawing your tree; (b) cut out pictures of pastas (or use real pastas provided by your teacher) that you can move around the nodes of your tree.

Step 1: Propose a bunch of features to distinguish between the pasta types

Step 2: Pick a feature that will split the pastas into two roughly equal groups

Doing this every time will give you the smallest number of questions
This is similar to how real machine learning algorithms decide which feature to use for the next node

Step 3: Take the features, build the decision tree.

Make sure every one of your pastas is a leaf
If you end up with two types of pasta at the same leaf, you may not have enough features

Step 4: Test this out and update the features and the tree

Step 5: Convert the tree into a table.

The columns are the features (red - replace question)
The rows are the leaves
If the feature doesn’t apply, put N/A
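The tree-to-table conversion in Step 5 can be sketched in Python. This is only an illustration: the feature names (`is_long`, `is_hollow`, `is_stuffed`), the feature values, and the helper functions are assumptions made up for this sketch, not an answer key for the activity.

```python
# Hypothetical feature names: each one replaces a question in the tree.
FEATURES = ["is_long", "is_hollow", "is_stuffed"]

# Each row is one leaf of the tree: a pasta and its feature vector.
# None stands for "N/A" (the question is never asked on the path to that leaf).
# The values below are illustrative assumptions.
pasta_table = {
    "Spaghetti": {"is_long": True, "is_hollow": None, "is_stuffed": None},
    "Rigatoni": {"is_long": False, "is_hollow": True, "is_stuffed": None},
    "Ravioli": {"is_long": False, "is_hollow": False, "is_stuffed": True},
    "Farfalle": {"is_long": False, "is_hollow": False, "is_stuffed": False},
}


def feature_vector(pasta):
    """Return one pasta's row as a list, in FEATURES column order."""
    row = pasta_table[pasta]
    return [row[name] for name in FEATURES]


def format_cell(value):
    """Turn a stored value into the text written in the paper table."""
    if value is None:
        return "N/A"
    return "yes" if value else "no"


# Print the table one row (leaf) at a time.
for pasta in pasta_table:
    cells = [format_cell(value) for value in feature_vector(pasta)]
    print(pasta, cells)
```

The computer only ever sees the rows of values, not the wording of the questions they came from.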


4 of 21


Step 1: List all the features you think may be useful to distinguish the pastas.

Step 2: Create a root node and place all the pastas there.

A feature is a property of a thing we are trying to describe.

Step 3: Pick a feature from the list that splits the pasta examples in the current node roughly in half.
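One way to picture "splits roughly in half" is to count the yes and no answers for each feature and keep the feature whose counts are closest. The sketch below assumes hypothetical feature names and values, and the helper names `balance_score` and `pick_feature` are invented for illustration.

```python
def balance_score(examples, feature):
    """Smaller is better: 0 means the feature splits the examples exactly in half."""
    yes_count = sum(1 for vector in examples.values() if vector[feature])
    no_count = len(examples) - yes_count
    return abs(yes_count - no_count)


def pick_feature(examples, candidate_features):
    """Return the candidate feature with the most even yes/no split."""
    return min(candidate_features, key=lambda f: balance_score(examples, f))


# Illustrative feature values (assumptions made for this sketch).
pastas = {
    "Spaghetti": {"is_long": True, "is_hollow": False, "is_stuffed": False},
    "Rigatoni": {"is_long": False, "is_hollow": True, "is_stuffed": False},
    "Penne": {"is_long": False, "is_hollow": True, "is_stuffed": False},
    "Ravioli": {"is_long": False, "is_hollow": False, "is_stuffed": True},
}

best = pick_feature(pastas, ["is_long", "is_hollow", "is_stuffed"])
print(best)  # is_hollow: two pastas answer yes and two answer no
```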

5 of 21


Yes

No

Step 4: Create two children of the current node and split its pastas between them based on the feature.

6 of 21


Is it long?

Yes

No

Yes

No

Step 4: Create two children of the current node and split its pastas between them based on the feature.
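Repeating the pick-and-split steps at every node builds the whole tree. The following is a minimal sketch under stated assumptions: the feature names and values are hypothetical, and `build_tree` is an illustrative helper, not the algorithm used by any particular tool.

```python
def build_tree(examples, features):
    """Recursively split a node's pastas into yes/no children.

    examples maps a pasta name to {feature name: True/False}.
    Returns a list of pasta names for a leaf, or a dict for a question node.
    """
    def imbalance(feature):
        yes_count = sum(1 for vector in examples.values() if vector[feature])
        return abs(2 * yes_count - len(examples))

    # Stop when the node holds a single pasta or no features are left.
    if len(examples) <= 1 or not features:
        return sorted(examples)

    # Choose the feature that splits this node most evenly.
    feature = min(features, key=imbalance)
    if imbalance(feature) == len(examples):
        # No remaining feature separates these pastas: more features are needed.
        return sorted(examples)

    # Create the two children and divide the pastas between them.
    yes_group = {name: vec for name, vec in examples.items() if vec[feature]}
    no_group = {name: vec for name, vec in examples.items() if not vec[feature]}
    remaining = [f for f in features if f != feature]
    return {
        "question": feature,
        "yes": build_tree(yes_group, remaining),
        "no": build_tree(no_group, remaining),
    }


# Illustrative feature values (assumptions made for this sketch).
pastas = {
    "Spaghetti": {"is_long": True, "is_hollow": False, "is_stuffed": False},
    "Rigatoni": {"is_long": False, "is_hollow": True, "is_stuffed": False},
    "Ravioli": {"is_long": False, "is_hollow": False, "is_stuffed": True},
    "Farfalle": {"is_long": False, "is_hollow": False, "is_stuffed": False},
}

tree = build_tree(pastas, ["is_long", "is_hollow", "is_stuffed"])
print(tree)
```

A leaf that ends up holding more than one pasta is the signal from Try It #1 that the feature list is too short.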

7 of 21

Feature Vector Description of Pasta

Add more columns to fit all your features & more rows to fit all your pastas


Pasta
Spaghetti
Lasagne
Rigatoni
Ravioli

8 of 21

Use this test set to see how good your features are at classifying a new set of pastas


Mezzaluna

Mafaldine

Calamarata

Penne

Ravioli

Shell

9 of 21

Or Pick Your Own Set of Pastas

https://en.wikipedia.org/wiki/List_of_pasta

https://www.delish.com/uk/food-news/g31017969/pasta-shapes/


10 of 21

Feature Vector Description of Pasta


Pasta
Spaghetti
Lasagne
Rigatoni
Ravioli

11 of 21

Test Set - Data Collection Sheet


Pasta	Correctly classified	Incorrectly classified
Spaghetti
Lasagne
Rigatoni
Ravioli
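Filling in the data collection sheet amounts to running each test pasta through the tree and comparing the leaf it reaches with its true name. The sketch below assumes a small hand-written tree and made-up feature values (including those for Mezzaluna); none of it comes from the activity materials.

```python
# A hypothetical tree: question nodes are dicts, leaves are pasta names.
tree = {
    "question": "is_long",
    "yes": "Spaghetti",
    "no": {
        "question": "is_stuffed",
        "yes": "Ravioli",
        "no": "Rigatoni",
    },
}


def classify(node, features):
    """Follow yes/no answers from the root until a leaf (a pasta name) is reached."""
    while isinstance(node, dict):
        node = node["yes"] if features[node["question"]] else node["no"]
    return node


# Test set: (true name, feature vector). Values are illustrative assumptions.
test_set = [
    ("Spaghetti", {"is_long": True, "is_stuffed": False}),
    ("Ravioli", {"is_long": False, "is_stuffed": True}),
    ("Mezzaluna", {"is_long": False, "is_stuffed": True}),
]

correct, incorrect = [], []
for true_name, features in test_set:
    predicted = classify(tree, features)
    if predicted == true_name:
        correct.append(true_name)
    else:
        incorrect.append(true_name)

print("Correctly classified:", correct)
print("Incorrectly classified:", incorrect)
```

In this sketch Mezzaluna lands on the Ravioli leaf because the two share the same feature values, which is the kind of miss the Reflection questions ask about.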

12 of 21

Reflection

How well did your pasta classifier do on this new test set? What pastas did it classify incorrectly?

Were your features flexible enough to allow you to classify new pasta? Why or why not?

How would you change your features to allow them to better classify a wider range of pastas?


13 of 21

Takeaways


14 of 21

If your data is in a feature vector table like this, then you don’t have to build the tree yourself.

People don’t usually build decision trees by hand because doing so is hard for realistic tasks.

For practical problems, people build a table that may have thousands of rows (examples) and dozens of columns (features). Each example is labeled with the correct decision (i.e., the class it belongs to).

Then they use a machine learning algorithm to create the decision tree.


Machine Learning is a process for constructing a reasoner (such as a decision tree classifier) using training data and a learning algorithm.
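Learning algorithms choose each split by computing a statistic over the labeled table. ID3, for example, uses information gain. The sketch below is a simplified illustration of that statistic, not a full learner; the feature names, feature values, and function names are assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Bits of uncertainty about the class of an example drawn from labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, labels, feature):
    """How much asking about this feature reduces uncertainty about the class."""
    remaining = 0.0
    for value in (True, False):
        subset = [label for row, label in zip(rows, labels) if row[feature] == value]
        if subset:
            remaining += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remaining


# A tiny labeled table: one feature vector per row, one class label per row.
# The values are illustrative assumptions.
rows = [
    {"is_long": True, "is_hollow": False, "is_stuffed": False},
    {"is_long": False, "is_hollow": True, "is_stuffed": False},
    {"is_long": False, "is_hollow": True, "is_stuffed": False},
    {"is_long": False, "is_hollow": False, "is_stuffed": True},
]
labels = ["Spaghetti", "Rigatoni", "Penne", "Ravioli"]

gains = {f: information_gain(rows, labels, f) for f in rows[0]}
best = max(gains, key=gains.get)
print(best, gains)
```

When every pasta is its own class, the feature that splits the examples in half has the highest gain, which matches the rule of thumb used when building the tree by hand.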

15 of 21

Pasta Land vs Mini-Akinator

Pasta Land is a simulation of a machine learning algorithm

We knew all the examples/classes in advance
We knew all the features in advance
When we constructed the tree:

We used a node-splitting process
We were selective about the features we chose for our tree

We used an algorithm to build the tree

Takeaway: The new way we built the tree in Pasta Land is what a machine learning algorithm will do for you.

Mini-Akinator - simulation of creating and growing a decision tree

We didn’t know the examples in advance
We didn’t know all the features in advance
Instead we added the examples one at a time and discovered the features as we went along
When we constructed the tree

We grew it step by step, each time we added a new example

We couldn’t predict what to ask until we had another example
We might have to ask a lot more questions using this process.

Takeaway: We learned how to create and grow a decision tree and gained some insight into how Akinator works.


16 of 21

Let’s look at some examples from the real world


17 of 21

Try It #2: Think of a real-world example

Step 1: Pick a situation in the real world where computers might make decisions

Step 2: Think about the attributes that would contribute to making good decisions

Step 3: Reflection: Think about how your life is being affected by classifier systems (automated decision making).

Examples

Classifying toys based on age. What kinds of features make a toy suitable?
Classifying games, movies, and music
Reviewing loan applications. What kinds of things help us determine the strength of a loan application?
College applications. What kinds of things make for a strong college application?


18 of 21

Computers don’t learn the way people do.


19 of 21

How do people learn?


Discussion

20 of 21

Ways people learn

Being taught by someone (e.g., parent, teacher, sibling, friend)
Asking questions
Observing others
Reading
Trial and error
Systematic experimentation
Everyday life experiences


Note: None of these are things that computers currently do well.

21 of 21

Ways Computers Learn

Learning from labeled examples (like Pasta Land)

This is called supervised learning
We will only explore supervised learning in this unit

Finding clusters in data

This is called unsupervised learning because the data does not come pre-labeled
Example: examining data on plants collected from a forest and deciding how many different species of plants there are in the forest

Learning from experience via trial and error

This is called reinforcement learning
This is how computers learn good strategies for playing games
