Learning with Regression and Trees
Dr. S. M. Patil, Computer Engineering Department
11-06-2024
Learning with Regression
Linear Regression in Machine Learning
y = a0 + a1x + ε

Here:
y = dependent variable (target variable)
x = independent variable (predictor variable)
a0 = intercept of the line (gives an additional degree of freedom)
a1 = linear regression coefficient (the scale factor applied to each input value)
ε = random error
The values of the x and y variables form the training dataset on which the Linear Regression model is built.
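As a quick illustration of how a0 and a1 are estimated from training data, here is a minimal least-squares sketch (the x and y values below are made up):

```python
import numpy as np

# Hypothetical training data: x (predictor) and y (target).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 9.0, 10.8])

# Closed-form least-squares estimates of a1 (slope) and a0 (intercept).
a1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a0 = y.mean() - a1 * x.mean()

print(a0, a1)  # a0 = 1.15, a1 = 1.95 for this data
```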
Simple Linear Regression:
If a single independent variable is used to predict the value of a numerical dependent variable, the algorithm is called Simple Linear Regression.

Multiple Linear Regression:
If more than one independent variable is used to predict the value of a numerical dependent variable, the algorithm is called Multiple Linear Regression.

A straight line showing the relationship between the dependent and independent variables is called a regression line. A regression line can show two types of relationship: a positive linear relationship (the line slopes upward) or a negative linear relationship (the line slopes downward).
Finding the best fit line:
Residuals: The distance between the actual and predicted values is called the residual. If the observed points are far from the regression line, the residuals are large and so the cost function is high. If the points lie close to the regression line, the residuals are small and the cost function is low.
Gradient Descent:
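Gradient descent adjusts a0 and a1 step by step to minimize the cost (here, mean squared error). A minimal sketch, with an illustrative dataset and learning rate:

```python
import numpy as np

# Gradient descent on the MSE cost for y = a0 + a1*x (illustrative sketch).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])   # exactly y = 1 + 2x

a0, a1 = 0.0, 0.0          # initial parameters
lr = 0.02                  # learning rate
for _ in range(5000):
    y_pred = a0 + a1 * x
    error = y_pred - y                             # residuals
    a0 -= lr * (2 / len(x)) * error.sum()          # d(MSE)/d(a0)
    a1 -= lr * (2 / len(x)) * (error * x).sum()    # d(MSE)/d(a1)

print(round(a0, 3), round(a1, 3))  # converges toward a0 = 1, a1 = 2
```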
Simple Linear Regression in Machine Learning
- Such as the relationship between income and expenditure, or between experience and salary.
- Such as weather forecasting according to temperature, or a company's revenue according to its investments in a year.
Simple Linear Regression Model:
y = a0 + a1x + ε
a0 = the intercept of the regression line (obtained by putting x = 0)
a1 = the slope of the regression line, which tells whether the line is increasing or decreasing
ε = the error term (negligible for a good model)
The example dataset has two variables:
- salary (dependent variable) and
- experience (independent variable).
The goal of this problem is to model salary as a function of experience.
Step-1: Data Pre-processing
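The pre-processing step loads the dataset and splits it into training and test sets. A sketch using in-memory stand-in data (a real run would read the salary/experience CSV file instead; all values below are illustrative):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Illustrative stand-in for a salary/experience CSV file.
data = pd.DataFrame({
    'YearsExperience': [1.1, 2.0, 3.2, 4.5, 5.1, 6.8, 7.9, 9.0],
    'Salary': [39343, 43525, 54445, 61111, 66029, 91738, 101302, 105582],
})

X = data[['YearsExperience']].values   # independent variable
y = data['Salary'].values              # dependent variable

# Hold out part of the data to evaluate the model later.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
print(X_train.shape, X_test.shape)
```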
Step-3: Prediction of test set results:
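Fitting the regressor on the training set and then predicting the held-out test set might look like the following sketch (the tiny dataset is illustrative):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Minimal sketch: fit on training data, then predict the test set.
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
y_train = np.array([3.0, 5.0, 7.0, 9.0])      # follows y = 1 + 2x exactly
X_test = np.array([[5.0], [6.0]])

regressor = LinearRegression()
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)
print(y_pred)  # [11. 13.]
```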
Step-4: Visualizing the training set results:
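A sketch of the training-set visualization: scatter the observations, then draw the fitted regression line over them (the data values are illustrative):

```python
import matplotlib
matplotlib.use('Agg')          # non-interactive backend for scripted runs
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data standing in for the experience/salary set.
X_train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y_train = np.array([40000, 50000, 61000, 69000, 80000])

regressor = LinearRegression().fit(X_train, y_train)

# Observations in blue, the fitted regression line in red.
plt.scatter(X_train, y_train, color='blue')
plt.plot(X_train, regressor.predict(X_train), color='red')
plt.title('Salary vs Experience (Training set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.savefig('training_set.png')
```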
Step-5: Visualizing the test set results:
In the plot, the observations are shown in blue and the prediction is the red regression line. Most of the observations lie close to the regression line, so our Simple Linear Regression model fits well and is able to make good predictions.
Multiple Linear Regression
Some key points about MLR:
Step-1: Data Pre-processing:
Importing libraries: First, we import the libraries that will help in building the model.
Importing the dataset: Next, we import the dataset (50_CompList), which contains all the variables.
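A sketch of the import step, with a small in-memory stand-in for the 50_CompList data (the column names follow the usual version of this example; the values are made up):

```python
import pandas as pd

# In-memory stand-in for the dataset the slides load with
# pd.read_csv('50_CompList.csv'); values are illustrative.
data = pd.DataFrame({
    'R&D Spend':       [165349.2, 162597.7, 153441.5, 144372.4],
    'Administration':  [136897.8, 151377.6, 101145.6, 118671.9],
    'Marketing Spend': [471784.1, 443898.5, 407934.5, 383199.6],
    'State':           ['New York', 'California', 'Florida', 'New York'],
    'Profit':          [192261.8, 191792.1, 191050.4, 182902.0],
})

X = data.drop(columns='Profit')   # independent variables
y = data['Profit'].values         # dependent variable
print(X.shape, y.shape)
```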
Encoding Dummy Variables:
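The State column is categorical, so it is converted into dummy variables; one dummy column is dropped to avoid the dummy variable trap (perfect multicollinearity among the dummies). A sketch with illustrative values:

```python
import pandas as pd

# Sketch of dummy-variable encoding for a categorical 'State' column
# (names and numbers are illustrative).
df = pd.DataFrame({'State': ['New York', 'California', 'Florida', 'New York'],
                   'Profit': [192261, 191792, 191050, 182902]})

# drop_first=True removes one dummy column to avoid the dummy variable trap.
encoded = pd.get_dummies(df, columns=['State'], drop_first=True)
print(encoded.columns.tolist())
```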
Step-2: Fitting the MLR model to the training set:
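A sketch of the fitting step: the model learns one coefficient per independent variable (all numbers below are made up):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Multiple predictors per row (e.g. R&D spend, administration, marketing);
# values are illustrative.
X_train = np.array([[165349.0, 136897.0, 471784.0],
                    [162597.0, 151377.0, 443898.0],
                    [153441.0, 101145.0, 407934.0],
                    [144372.0, 118671.0, 383199.0]])
y_train = np.array([192261.0, 191792.0, 191050.0, 182902.0])

regressor = LinearRegression()
regressor.fit(X_train, y_train)     # one coefficient per predictor
print(regressor.coef_.shape)        # (3,)
```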
Step-3: Prediction of test set results:
Applications of Multiple Linear Regression:
Logistic Regression
Assumptions for Logistic Regression:
Logistic Regression Equation:
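The logistic equation passes the linear combination a0 + a1x through the sigmoid function, which is equivalent to log(y / (1 − y)) = a0 + a1x. A small numeric sketch (the coefficients are illustrative):

```python
import numpy as np

def sigmoid(z):
    """Map the linear combination a0 + a1*x into the (0, 1) range."""
    return 1.0 / (1.0 + np.exp(-z))

# Linear part (illustrative coefficients), then the logistic transform:
# log(y / (1 - y)) = a0 + a1*x   <=>   y = sigmoid(a0 + a1*x)
a0, a1 = -4.0, 1.0
x = np.array([0.0, 4.0, 8.0])
p = sigmoid(a0 + a1 * x)
print(np.round(p, 3))  # probabilities strictly between 0 and 1
```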
The logistic model starts from the linear equation y = a0 + a1x + ε and transforms its output so that it lies between 0 and 1.
Types of Logistic Regression:
Example: A dataset contains information about various users, obtained from a social networking site. A car manufacturer has recently launched a new SUV, and the company wants to check how many users from the dataset are likely to purchase the car.
We will predict the Purchased variable (dependent variable) using Age and Salary (independent variables).
1. Data pre-processing step:
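Pre-processing for this example typically includes feature scaling, since Age and EstimatedSalary are on very different scales. A sketch (the values are illustrative):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Age and EstimatedSalary have very different ranges, so scale them to
# zero mean and unit variance before fitting; values are illustrative.
X_train = np.array([[19, 19000],
                    [35, 20000],
                    [26, 43000],
                    [27, 57000]], dtype=float)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)   # per-column standardization
print(np.round(X_train_scaled.mean(axis=0), 6))  # [0. 0.]
```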
2. Fitting Logistic Regression to the Training set:
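A sketch of the fitting step with scikit-learn's LogisticRegression (the tiny scaled dataset below is made up):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Tiny illustrative stand-in for the scaled (age, salary) training data.
X_train = np.array([[-1.0, -1.2], [-0.5, -0.8], [0.6, 0.9], [1.2, 1.1]])
y_train = np.array([0, 0, 1, 1])   # purchased (1) or not (0)

classifier = LogisticRegression(random_state=0)
classifier.fit(X_train, y_train)
print(classifier.predict([[1.0, 1.0]]))  # a point near the positive cluster
```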
3. Predicting the test set result:
The output lists, for each user in the test set, whether the model predicts that they will purchase the car or not.
4. Test accuracy of the result:
We can find the accuracy of the predictions by interpreting the confusion matrix: from the output, 65 + 24 = 89 outputs are correct and 8 + 3 = 11 are incorrect.
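The correct and incorrect counts are the diagonal and off-diagonal sums of the confusion matrix. A sketch with made-up labels:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Sketch of reading accuracy off the confusion matrix; labels are made up.
y_test = np.array([0, 0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 1, 1, 0, 0])

cm = confusion_matrix(y_test, y_pred)
correct = cm.trace()               # diagonal: true negatives + true positives
incorrect = cm.sum() - correct     # off-diagonal entries
print(cm, correct, incorrect)
```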
5. Visualizing the training set result:
The purple points are observations for which the Purchased variable (dependent variable) is 0, i.e., users who did not purchase the SUV car.
Visualizing the test set result:
The graph above shows the test set result. The plane is divided into two regions (purple and green): green observations fall in the green region and purple observations in the purple region, so the prediction and the model are good. A few green and purple data points lie in the wrong region; these are the errors already counted via the confusion matrix (11 incorrect outputs).
Hence our model is quite good and ready to make new predictions for this classification problem.
Classification and Regression Trees (CART)
Step 2: Select an attribute on the basis of a splitting criterion (Gain Ratio or another impurity metric).
Step 3: Partition the instances according to the selected attribute, recursively.
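The steps above can be sketched with scikit-learn's CART-style DecisionTreeClassifier, which repeatedly selects the attribute/threshold that best reduces impurity (Gini here) and partitions the instances recursively (the toy data is illustrative):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy data: two well-separated groups along a single attribute.
X = np.array([[2.0], [3.0], [10.0], [11.0]])
y = np.array([0, 0, 1, 1])

# CART with Gini impurity as the splitting criterion.
tree = DecisionTreeClassifier(criterion='gini', random_state=0)
tree.fit(X, y)
print(tree.predict([[2.5], [10.5]]))  # [0 1]
```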
If there is ambiguity (a tie between candidate splits), we select using the individual error.