1 of 29

Lecture 7��Training versus Testing

1

Instructor: Ercan Atam

Institute for Data Science & Artificial Intelligence

Course: DSAI 512-Machine Learning

2 of 29

2

List of contents for this lecture

 

3 of 29

3

Relevant readings for this lecture

  • Chapter 2 of Yaser S. Abu-Mostafa, Malik Magdon-Ismail, Hsuan-Tien Lin, “Learning from Data”, AMLBook, 2012.

  • Chapter 5 of Hui Jiang , "Machine Learning Fundamentals: a Concise Introduction",

‎Cambridge University Pres, 2022.

4 of 29

4

The two questions of learning

 

 

5 of 29

5

Problems with the current generalization bound

 

 

6 of 29

6

Problems with the current generalization bound

 

 

 

 

7 of 29

7

What will the “Theory of Generalization” achieve?

 

 

Error

out-of-sample error

model complexity

in-sample error

model complexity

out-of-sample error

in-sample error

Error

 

8 of 29

8

 

 

 

9 of 29

9

 

 

A hypothesis set without considering data:

10 of 29

10

 

 

 

11 of 29

11

Some definitions (1)

Definition: dichotomy

 

12 of 29

12

Some definitions (2)

Definition: dichotomy set

 

13 of 29

13

Some definitions (3)

 

Definition: growth function

14 of 29

14

Some definitions (4)

Definition: shattering

 

15 of 29

15

Back to the growth function

 

 

16 of 29

16

Growth function examples (1)

2-D perceptron model:

Cannot implement this dichotomy

Consider the following three points (N=3):

17 of 29

17

Growth function examples (2)

2-D perceptron model (continue):

Consider another three points (N=3):

Can implement all 8 dichotomies where

any point here can be assigned +1 or -1.

18 of 29

18

Growth function examples (3)

2-D perceptron model (continue):

In case of 4 points, it can implement

at most 14 dichotomies. The remaining

two missing dichotomises are displayed

here with blue and red corresponding to

-1, +1 or to +1, -1.

 

What about N=4?

19 of 29

19

Growth function examples (4)

1-D positive ray model:

 

20 of 29

20

Growth function examples (5)

Positive intervals:

 

21 of 29

21

Growth function examples (6)

Convex sets:

 

22 of 29

22

The 3 growth functions

 

23 of 29

23

 

 

24 of 29

24

Break Point

Definition: break point

25 of 29

25

Example: break point for 2D perceptrons (1)

Claim: for 2D perceptrons, k=4 is a break point (which may not be so obvious!)

 

Observation: It is easier to implement dichotomies on k=4 points by 2D perceptrons when the configuration (arrangement) of points does not include three or four colinear points.

This means that we must focus on 4 points with a configuration given below:

26 of 29

26

Example: break point for 2D perceptrons (2)

k=4

Impossible for the 2D perceptrons

to implement.

27 of 29

27

Break points for other examples

28 of 29

28

Main result

29 of 29

References �(utilized for preparation of lecture notes or Matlab code)

29