Swayam Prabha

Course Title

Multivariate Data Mining- Methods and Applications

Lecture 34

Training and Pruning Decision Trees

By

Anoop Chaturvedi

Department of Statistics, University of Allahabad

Prayagraj (India)

Slides can be downloaded from https://sites.google.com/view/anoopchaturvedi/swayam-prabha

Data Mining_Anoop Chaturvedi

Pruning the Tree ⇒ Control overfitting

Remove the leaves below a node and assign that node's majority label to all items reaching it.

Pre-pruning ⇒ Stop growing the tree at the point where there is insufficient data to make reliable decisions.

Post-pruning ⇒ Grow the full decision tree and remove nodes for which we have insufficient evidence.
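
The two strategies can be illustrated with a minimal sketch. It rests on assumptions the slides do not state: numeric features, Gini impurity as the split criterion, a minimum node size as the pre-pruning rule, and reduced-error pruning on a held-out validation set as the post-pruning rule. The names grow, prune, count_leaves and min_samples are hypothetical, and the data are made up for illustration.

```python
from collections import Counter

def majority(labels):
    """Most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def grow(rows, labels, min_samples=1):
    """Grow a binary tree. Pre-pruning (assumed rule): a node with
    fewer than min_samples items is not split any further."""
    node = {"label": majority(labels)}
    if len(set(labels)) == 1 or len(labels) < min_samples:
        return node
    best = None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows})[:-1]:
            left = [i for i, r in enumerate(rows) if r[j] <= t]
            right = [i for i, r in enumerate(rows) if r[j] > t]
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, j, t, left, right)
    if best is None:  # all rows identical, nothing to split on
        return node
    _, j, t, left, right = best
    node["feature"], node["threshold"] = j, t
    node["left"] = grow([rows[i] for i in left],
                        [labels[i] for i in left], min_samples)
    node["right"] = grow([rows[i] for i in right],
                         [labels[i] for i in right], min_samples)
    return node

def predict(node, row):
    while "feature" in node:
        go_left = row[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]

def count_leaves(node):
    if "feature" not in node:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])

def prune(node, rows, labels):
    """Post-pruning (assumed rule: reduced-error pruning). Bottom-up,
    replace a subtree by a leaf carrying the node's majority label
    whenever that does not increase the error on the validation items
    reaching the node. Modifies the tree in place."""
    if "feature" not in node:
        return node
    j, t = node["feature"], node["threshold"]
    left = [i for i, r in enumerate(rows) if r[j] <= t]
    right = [i for i, r in enumerate(rows) if r[j] > t]
    node["left"] = prune(node["left"], [rows[i] for i in left],
                         [labels[i] for i in left])
    node["right"] = prune(node["right"], [rows[i] for i in right],
                          [labels[i] for i in right])
    subtree_err = sum(predict(node, r) != y for r, y in zip(rows, labels))
    leaf_err = sum(node["label"] != y for y in labels)
    return {"label": node["label"]} if leaf_err <= subtree_err else node

# Made-up data: one feature, the label at x=3 is noise.
train_x = [(1,), (2,), (3,), (4,), (5,), (6,), (7,), (8,)]
train_y = ["a", "a", "b", "a", "b", "b", "b", "b"]
val_x = [(1,), (3,), (4,), (6,), (7,)]
val_y = ["a", "a", "a", "b", "b"]

full = grow(train_x, train_y)                # fits the noisy point
pre = grow(train_x, train_y, min_samples=5)  # pre-pruned tree
full_leaves = count_leaves(full)             # counted before pruning
post = prune(full, val_x, val_y)             # post-pruned tree
print(full_leaves, count_leaves(pre), count_leaves(post))  # 4 2 2
```

On this toy data the unpruned tree needs 4 leaves to fit the noisy point, while both pruned trees keep only the split at x ≤ 4 and classify x = 3 by the majority label of its region.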
