Linear Regression
How does a computer draw a line that best fits the data?
By Eric Honer
Recap of Machine Learning
Types of Learning
Classification vs Regression
Classification predicts a distinct class, while regression predicts a value
What is Linear Regression?
Linear Regression is a form of supervised learning used for regression tasks
What is Linear Regression?
We want to find a line that fits the data
(remember line of best fit?)
Linear meaning:
Which of these is a Line of Best Fit?
Line 1
Line 2
Line 3
How Good is our Line?
A residual is the difference between an observed value and the value the line predicts (observed − predicted)
Find the Residual
Answer: 0
Find the Residual
Answer: 10
Find the Residual
Answer: -7
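The residual exercises above can be sketched in a few lines of Python. The line (y = 2x + 1) and the data points here are made up for illustration, chosen so the residuals come out to 0, 10, and -7 as in the answers above.

```python
# Residual = observed y minus the y our line predicts.
# The line y = 2x + 1 and the points are made up for illustration.
def predict(x, m=2, b=1):
    return m * x + b

points = [(1, 3), (2, 15), (4, 2)]  # (x, observed y)
residuals = [y - predict(x) for x, y in points]
print(residuals)  # -> [0, 10, -7]
```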
What to do with Residuals
Residuals: [0, -5, 10, -3, 1, 7, -2]
SSR = 0² + (-5)² + 10² + (-3)² + 1² + 7² + (-2)² = 188
Mean Squared Error (MSE)
MSE = 188/7 ≈ 26.86
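The SSR and MSE above can be checked with a few lines of Python, using the residual list from the slide:

```python
residuals = [0, -5, 10, -3, 1, 7, -2]  # the residuals from the slide
ssr = sum(r ** 2 for r in residuals)   # sum of squared residuals
mse = ssr / len(residuals)             # mean squared error
print(ssr, round(mse, 2))  # -> 188 26.86
```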
How do we get here?
Line 1
Line 2
Line 3
Equation of a Line
y = mx + b
y: output value
m: slope
x: input value
b: y-intercept
Our goal: Find the best values for m and b that minimize the loss for all our (x, y) datapoints
Gradient Descent
Let’s pick a random y-intercept
[Plot: loss vs. y-intercept; each step moves the y-intercept toward the minimum of the loss curve]
Gradient Descent
Let’s pick a random slope
[Plot: loss vs. slope; each step moves the slope toward the minimum of the loss curve]
What does the SSR look like when we adjust one parameter?
The better our line of best fit, the lower the loss
Gradient Descent
When the gradient of the loss function is as close to zero as we can get, it means we have found the optimal parameters
How Much do we Adjust?
adjustment = learning rate × gradient of the loss at the original parameter value
Learning Rate
Updating a parameter
Remember: new Weight = Weight - LR * gradient of loss with respect to Weight
New slope: 2
New y-intercept: 30
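One update step with the rule above, as a sketch. The starting values and gradients here are made up, chosen so the result matches the new slope of 2 and new y-intercept of 30:

```python
lr = 0.01                                # learning rate (assumed)
slope, intercept = 2.5, 32.0             # made-up current parameter values
grad_slope, grad_intercept = 50.0, 200.0 # made-up gradients of the loss

# new Weight = Weight - LR * gradient of loss with respect to Weight
slope -= lr * grad_slope
intercept -= lr * grad_intercept
print(slope, intercept)  # -> 2.0 30.0
```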
It’s Just a Ball Rolling Down a Hill
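Putting the pieces together for one parameter: a minimal sketch of gradient descent on the y-intercept alone, with the slope held fixed and made-up points that lie on y = 2x + 1.

```python
# Gradient descent on the y-intercept b, with slope m held fixed.
# Data are made up: three points on the line y = 2x + 1.
data = [(1, 3), (2, 5), (3, 7)]
m, b = 2.0, 0.0  # start with a random-ish y-intercept
lr = 0.1         # learning rate

for _ in range(1000):
    # gradient of the SSR with respect to b: sum of -2 * residual
    grad_b = sum(-2 * (y - (m * x + b)) for x, y in data)
    if abs(grad_b) < 1e-9:  # gradient near zero: we're at the bottom
        break
    b -= lr * grad_b        # roll a little further down the hill

print(round(b, 3))  # -> 1.0
```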
Higher dimensions?!?!
What if we want to predict a student’s grade based on the # of hours they studied AND the # of hours of sleep they got? Is that possible?
YES!
Now, our equation is y = m₁x₁ + m₂x₂ + b
We use the same gradient descent approach, just with one extra parameter
Let’s put all of that knowledge together with Python
Go to the code!
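The class code itself isn't reproduced here. As a stand-in, this sketch fits a line with two inputs (hours studied, hours slept) by gradient descent; the data are made up, generated from grade = 5·studied + 2·slept + 40 so we know what answer to expect.

```python
# Two-input linear regression, y = m1*x1 + m2*x2 + b, fit by gradient descent.
# Made-up data: (hours studied, hours slept, grade) with grade = 5*studied + 2*slept + 40.
data = [(1, 8, 61), (2, 6, 62), (3, 7, 69), (4, 5, 70), (5, 8, 81)]
m1 = m2 = b = 0.0
lr = 0.01
n = len(data)

for _ in range(50000):
    # gradients of the MSE with respect to each parameter
    g1 = sum(-2 * x1 * (y - (m1 * x1 + m2 * x2 + b)) for x1, x2, y in data) / n
    g2 = sum(-2 * x2 * (y - (m1 * x1 + m2 * x2 + b)) for x1, x2, y in data) / n
    gb = sum(-2 * (y - (m1 * x1 + m2 * x2 + b)) for x1, x2, y in data) / n
    m1 -= lr * g1
    m2 -= lr * g2
    b -= lr * gb

print(round(m1, 2), round(m2, 2), round(b, 2))  # -> 5.0 2.0 40.0
```

The same update rule from before is just applied to three parameters instead of two.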
Congrats! You’ve Learned Linear Regression!
You’re already ahead of the masses! This is the first step towards learning about more advanced cutting-edge models that shape the world today.
Thank you!
Keep in Touch:
discord.gg/santacruzai
linktr.ee/santacruzai