
Data Mining_Anoop Chaturvedi


Swayam Prabha

Course Title

Multivariate Data Mining: Methods and Applications

Lecture 09

Multicollinearity and Variable Selection

By

Anoop Chaturvedi

Department of Statistics, University of Allahabad

Prayagraj (India)

Slides can be downloaded from https://sites.google.com/view/anoopchaturvedi/swayam-prabha


Variable Selection in a Regression Equation

  • Overfitting means keeping too many parameters, which inflates the variance of the fitted model.
  • Underfitting reduces variance but increases the bias of the estimated regression function, leading to a poor explanation of the data.
  • A variable is important if dropping it seriously degrades prediction accuracy.
  • The driving force is Occam's razor (a simpler, more easily interpretable model is better), combined with greater accuracy in prediction.
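The trade-off in these bullets can be made concrete with a small stdlib-Python sketch. Everything below is an illustrative assumption, not from the lecture: an underfit training-mean predictor, an overfit one-nearest-neighbour memorizer, and a correctly specified straight-line fit are compared on held-out points.

```python
# Illustrative sketch (assumed synthetic data): comparing an underfit,
# an overfit, and a well-specified model on held-out points.

def ols_line(xs, ys):
    """Least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def mse(preds, ys):
    """Mean squared error of predictions against true values."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

# True relation y = 2x plus small deterministic "noise" (alternating +1/-1).
train_x = list(range(0, 20, 2))
train_y = [2 * x + (1 if i % 2 == 0 else -1) for i, x in enumerate(train_x)]
test_x = list(range(1, 20, 2))      # unseen points between the training x's
test_y = [2 * x + (1 if i % 2 == 0 else -1) for i, x in enumerate(test_x)]

# Underfit: ignore x and predict the training mean everywhere (high bias).
mean_pred = [sum(train_y) / len(train_y)] * len(test_x)
# Overfit: memorize the training set (1-nearest neighbour; zero training error).
nn_pred = [train_y[min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))]
           for x in test_x]
# Well-specified: a straight-line fit.
slope, intercept = ols_line(train_x, train_y)
line_pred = [intercept + slope * x for x in test_x]

print(mse(mean_pred, test_y), mse(nn_pred, test_y), mse(line_pred, test_y))
```

On this toy data the straight line has the lowest test error, the memorizer has zero training error but a clearly larger test error, and the mean predictor is worst — the bias/variance pattern the bullets describe.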


Example: Stepwise Regression ⇒ mtcars dataset

  1. Forward Stepwise Selection
  2. Backward Stepwise Selection
  3. Both-Direction Stepwise Selection

mpg (miles per gallon) ⇒ Response variable

The other 10 variables are potential predictor variables.

We use R's step() function to perform stepwise selection.
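step() chooses among candidate models by AIC. For a least-squares fit, the quantity R's extractAIC() reports is, up to an additive constant, n·log(RSS/n) + 2k, where k counts the estimated coefficients. A minimal sketch of the criterion, with made-up RSS values (assumptions for illustration, not mtcars output):

```python
import math

def aic(rss: float, n: int, k: int) -> float:
    """AIC for a Gaussian linear model, up to an additive constant:
    n*log(RSS/n) + 2k, with k the number of estimated coefficients."""
    return n * math.log(rss / n) + 2 * k

n = 32  # mtcars has 32 rows
# Hypothetical RSS values for illustration (not computed from mtcars):
aic_small = aic(rss=300.0, n=n, k=2)   # intercept + one predictor
aic_large = aic(rss=280.0, n=n, k=6)   # intercept + five predictors
# Here the richer model's lower RSS does not offset its 2-per-parameter penalty.
print(aic_small, aic_large)
```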


(i) Forward Stepwise Selection


  • Fit the intercept-only model ⇒ AIC = 115.94345.
  • Fit every possible one-predictor model. Select the model with the lowest AIC, provided it reduces AIC relative to the intercept-only model. This adds the predictor wt. AIC = 73.21736.
  • Fit every possible two-predictor model and proceed as before. This adds the predictor cyl. AIC = 63.19800.
  • Fit every possible three-predictor model and proceed as before. This adds the predictor hp. AIC = 62.66456.


  • Fit every possible four-predictor model. None of these models reduces the AIC further, so the procedure stops.
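The greedy procedure in the steps above can be sketched end-to-end in plain Python. Everything below is an assumed illustration: synthetic columns x1-x3 stand in for the mtcars predictors, and a small normal-equations solver replaces R's lm()/step(). The loop fits each candidate model, computes AIC, and adds the best predictor until AIC stops improving.

```python
# Hedged sketch of forward stepwise selection by AIC on synthetic data.
import math

def ols_rss(x_cols, y):
    """RSS of a least-squares fit of y on an intercept plus the given columns."""
    n = len(y)
    cols = [[1.0] * n] + [list(c) for c in x_cols]
    k = len(cols)
    # Normal equations A b = c, solved by Gaussian elimination with pivoting.
    A = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)]
         for i in range(k)]
    c = [sum(cols[i][t] * y[t] for t in range(n)) for i in range(k)]
    for p in range(k):
        piv = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[piv] = A[piv], A[p]
        c[p], c[piv] = c[piv], c[p]
        for r in range(p + 1, k):
            f = A[r][p] / A[p][p]
            A[r] = [A[r][j] - f * A[p][j] for j in range(k)]
            c[r] -= f * c[p]
    b = [0.0] * k
    for p in range(k - 1, -1, -1):
        b[p] = (c[p] - sum(A[p][j] * b[j] for j in range(p + 1, k))) / A[p][p]
    fitted = [sum(b[i] * cols[i][t] for i in range(k)) for t in range(n)]
    return sum((f - v) ** 2 for f, v in zip(fitted, y))

def aic(rss, n, k):
    # the criterion R's extractAIC() uses, up to an additive constant
    return n * math.log(rss / n) + 2 * k

def forward_stepwise(predictors, y):
    """Greedily add the predictor that most lowers AIC; stop when none helps."""
    n, chosen = len(y), []
    best = aic(ols_rss([], y), n, 1)          # intercept-only model
    while True:
        moves = [(aic(ols_rss([predictors[m] for m in chosen + [name]], y),
                      n, len(chosen) + 2), name)
                 for name in predictors if name not in chosen]
        if not moves or min(moves)[0] >= best:
            return chosen
        best, add = min(moves)
        chosen.append(add)

# Synthetic stand-in for mtcars: y depends on x1 and x2; x3 is irrelevant.
xs = {"x1": [float(i) for i in range(12)],
      "x2": [float((7 * i) % 12) for i in range(12)],
      "x3": [float((5 * i) % 11) for i in range(12)]}
y = [3 * xs["x1"][i] - 2 * xs["x2"][i] + (0.5 if i % 2 else -0.5)
     for i in range(12)]
print(forward_stepwise(xs, y))
```

On this data the loop first adds x1, then x2; whether the noise column x3 enters as well depends on how it happens to correlate with the leftover residual.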

Final model:


(ii) Backward Stepwise Selection


  • Fit the model using all predictors.
  • Fit all models that drop exactly one of the predictors, and select the best among them by AIC.
  • Continue dropping one predictor at a time; finally, select the single best model from among all fitted models using AIC.
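A hedged sketch of the backward procedure above. To keep it short, the synthetic predictor columns are chosen mutually orthogonal with mean zero (a deliberate assumption of this sketch): dropping such a column raises the RSS by exactly (x·y)²/(x·x), so no refitting is needed. On real data like mtcars, each candidate drop requires refitting the regression.

```python
# Hedged sketch: backward elimination by AIC on an orthogonal synthetic design.
import math

def aic(rss, n, k):
    # the criterion R's extractAIC() uses, up to an additive constant
    return n * math.log(rss / n) + 2 * k

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def backward_stepwise(predictors, y):
    n = len(y)
    kept = dict(predictors)
    # RSS contribution of each (orthogonal, zero-mean) column: (x.y)^2 / (x.x)
    gain = {m: dot(c, y) ** 2 / dot(c, c) for m, c in kept.items()}
    rss = dot(y, y) - sum(gain.values())    # y has mean zero here, so no
    best = aic(rss, n, len(kept) + 1)       # intercept term is needed
    while kept:
        # AIC of every model that drops exactly one remaining predictor
        moves = [(aic(rss + gain[m], n, len(kept)), m) for m in kept]
        if min(moves)[0] >= best:
            break                           # no drop improves AIC: stop
        best, drop = min(moves)
        rss += gain[drop]
        del kept[drop]
    return sorted(kept)

# Orthogonal (Hadamard-style) design, n = 8; c3's true coefficient is tiny.
c1 = [1.0, -1.0] * 4
c2 = [1.0, 1.0, -1.0, -1.0] * 2
c3 = [1.0, -1.0, -1.0, 1.0] * 2
noise = [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0]  # orthogonal to c1..c3
y = [5 * c1[i] + 4 * c2[i] + 0.3 * c3[i] + noise[i] for i in range(8)]
print(backward_stepwise({"c1": c1, "c2": c2, "c3": c3}, y))
```

Here the procedure drops the near-irrelevant column c3 (its small RSS contribution does not justify the AIC penalty) and then stops, keeping c1 and c2.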

Final Model:


(iii) Both-Direction Stepwise Selection

Final model:
