Lecture 9: Uncertainty in regression and multiple regression
Josh Grossman
Stanford University
Simple linear regression
The “best fitting” line through the points
Heights of fathers and sons
Karl Pearson, circa 1900
Simple linear regression
Minimizes the sum of squared residuals
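As a minimal sketch of what "minimizes the sum of squared residuals" means in practice, the snippet below computes the intercept and slope from the usual closed-form expressions and checks that nudging either one makes the fit worse. The function names and the height values are made up for illustration; they are not Pearson's data.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sum_squared_residuals(x, y, intercept, slope):
    """Sum of squared vertical distances from the points to the line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical father/son heights in inches (illustrative values only).
fathers = [65.0, 67.0, 68.0, 70.0, 72.0]
sons = [66.5, 67.5, 69.0, 69.5, 71.5]

b0, b1 = fit_simple_ols(fathers, sons)
best = sum_squared_residuals(fathers, sons, b0, b1)

# Any other line has a sum of squared residuals at least as large.
worse_intercept = sum_squared_residuals(fathers, sons, b0 + 0.1, b1)
worse_slope = sum_squared_residuals(fathers, sons, b0, b1 + 0.01)
print(best <= worse_intercept, best <= worse_slope)
```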
A probabilistic interpretation
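A common way to write the probabilistic interpretation, assuming independent normal errors with constant variance (the symbols below are the standard textbook ones and are an assumption here):

```latex
Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i,
\qquad \varepsilon_i \overset{\text{iid}}{\sim} \mathcal{N}(0, \sigma^2),
\qquad i = 1, \dots, n.
```

Under this model, maximizing the likelihood over $(\beta_0, \beta_1)$ is equivalent to minimizing $\sum_i (y_i - \beta_0 - \beta_1 x_i)^2$, so the least-squares line is also the maximum-likelihood line.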
Properties of the estimator
The estimator β̂ = (β̂₀, β̂₁) is a random vector.
The estimator is unbiased: E[β̂] = β.
The variance
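For reference, the standard expressions for simple linear regression, assuming the x values are fixed and the errors are independent with mean zero and variance $\sigma^2$ (the notation $S_{xx}$ is introduced here for convenience):

```latex
S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2,
\qquad
\mathbb{E}[\hat\beta_0] = \beta_0,
\qquad
\mathbb{E}[\hat\beta_1] = \beta_1,

\operatorname{Var}(\hat\beta_1) = \frac{\sigma^2}{S_{xx}},
\qquad
\operatorname{Var}(\hat\beta_0) = \sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right).
```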
Standard errors
For the parameters
The standard errors are small when:
the noise level σ is small,
the sample size n is large,
the x values are spread out.
Confidence intervals
For the parameters
[ The parameter estimates are approximately normally distributed. ]
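The following sketch turns the normal approximation into numbers: it estimates the noise variance from the residuals, computes the textbook standard errors for intercept and slope, and forms approximate 95% intervals as estimate ± 1.96 standard errors. The function names and the tiny dataset are hypothetical, and for small samples a t quantile would usually replace 1.96.

```python
import math


def fit_with_standard_errors(x, y):
    """Fit y = b0 + b1 * x by least squares and return estimates with SEs."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)  # two parameters were estimated
    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": math.sqrt(sigma2_hat * (1.0 / n + x_bar ** 2 / s_xx)),
        "se_slope": math.sqrt(sigma2_hat / s_xx),
    }


def normal_interval(estimate, se, z=1.96):
    """Approximate 95% confidence interval under the normal approximation."""
    return estimate - z * se, estimate + z * se


# Hypothetical data, for illustration only.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0, 4.0]

fit = fit_with_standard_errors(x, y)
slope_ci = normal_interval(fit["slope"], fit["se_slope"])
intercept_ci = normal_interval(fit["intercept"], fit["se_intercept"])
print(slope_ci, intercept_ci)
```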
Confidence intervals
For the mean response
For each x* value, how accurate is our estimate of the expected Y value?
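The interval for the mean response is narrow when:
the noise level σ is small,
the sample size n is large,
the x values are spread out,
x* is close to the mean of the x values.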
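These conditions can be read off the standard formula for the uncertainty in the estimated mean response at $x^*$. The version below assumes the simple linear model with constant error variance and uses the normal approximation ($z \approx 1.96$ for 95%); $\hat\sigma$ denotes the residual standard deviation.

```latex
\hat{y}(x^*) = \hat\beta_0 + \hat\beta_1 x^*,
\qquad
\operatorname{SE}\big(\hat{y}(x^*)\big)
  = \hat\sigma \sqrt{ \frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} },

\text{interval:} \quad \hat{y}(x^*) \pm z \cdot \operatorname{SE}\big(\hat{y}(x^*)\big).
```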
Confidence intervals
For a specific response [ prediction intervals ]
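In R's predict(), interval="confidence" gives the interval for the mean response; interval="prediction" gives the interval for a specific response.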
The prediction interval is narrow when:
the noise level σ is small,
the sample size n is large,
the x values are spread out,
x* is close to the mean of the x values.
The only change compared to the uncertainty in the mean response is an added σ² term in the variance.
So the prediction variance is never smaller than σ².
This is the "irreducible error": we can't perfectly predict random error!
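To make the comparison concrete, here is a small sketch that computes both kinds of interval at the same x* from one fit. The function `interval_at` and its `kind` argument are hypothetical names chosen for this example, the data are made up, and the intervals use the normal approximation rather than a t quantile.

```python
import math


def interval_at(x, y, x_star, kind="confidence", z=1.96):
    """Approximate interval at x_star for the mean or for a new response."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)

    # Uncertainty in the fitted line itself at x_star.
    h = 1.0 / n + (x_star - x_bar) ** 2 / s_xx
    if kind == "confidence":
        variance = sigma2_hat * h
    elif kind == "prediction":
        # Extra sigma^2 term: the noise in the new observation.
        variance = sigma2_hat * (1.0 + h)
    else:
        raise ValueError("kind must be 'confidence' or 'prediction'")

    center = intercept + slope * x_star
    half_width = z * math.sqrt(variance)
    return center - half_width, center + half_width


# Hypothetical data, for illustration only.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0, 4.0]

ci = interval_at(x, y, 1.5, kind="confidence")
pi = interval_at(x, y, 1.5, kind="prediction")
print(ci, pi)
```

However much data is collected, the prediction half-width stays above z times the estimated σ, while the confidence half-width can shrink toward zero.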
“Essentially, all models are wrong, but some are useful.”
George Box
Multiple regression
Outcomes modeled as a linear combination of multiple covariates.
Multiple regression
Matrix notation
Least-squares estimate
The least-squares estimate minimizes the residual sum of squares (y − Xβ)ᵀ(y − Xβ): a 1 × n row vector times an n × 1 column vector.
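A sketch of the standard derivation in matrix notation, assuming $X$ is the $n \times p$ design matrix (with a column of ones for the intercept) and $X^\top X$ is invertible:

```latex
y = X\beta + \varepsilon,
\qquad
\operatorname{RSS}(\beta) = (y - X\beta)^\top (y - X\beta),

\nabla_\beta \operatorname{RSS}(\beta) = -2 X^\top (y - X\beta) = 0
\;\Longrightarrow\;
X^\top X \hat\beta = X^\top y
\;\Longrightarrow\;
\hat\beta = (X^\top X)^{-1} X^\top y.
```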
Regression hyperplane
From Hastie, Tibshirani & Friedman, “Elements of Statistical Learning”
Computation
Linear regression
Closed-form solution for linear regression!
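As an illustration of the closed form, the sketch below solves the normal equations XᵀXβ = Xᵀy using only the Python standard library. All helper names are hypothetical, and the data are generated exactly from a known linear rule so the recovered coefficients can be checked. In practice a numerical library routine such as `numpy.linalg.lstsq` would normally be used instead of hand-written elimination.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def ols(X, y):
    """Least-squares coefficients from the normal equations."""
    Xt = transpose(X)
    XtX = [[sum(p * q for p, q in zip(r1, r2)) for r2 in Xt] for r1 in Xt]
    Xty = [sum(p * q for p, q in zip(row, y)) for row in Xt]
    return solve(XtX, Xty)


# Hypothetical design matrix: a column of ones plus two covariates.
X = [
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
    [1.0, 2.0, 1.0],
]
# Noise-free outcomes from y = 1 + 2 * x1 - 3 * x2.
y = [1.0 + 2.0 * row[1] - 3.0 * row[2] for row in X]

beta_hat = ols(X, y)
print(beta_hat)
```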
Multiple regression
Variance and confidence intervals
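The standard matrix-form results, assuming a fixed $n \times p$ design matrix $X$ with invertible $X^\top X$ and independent errors with constant variance $\sigma^2$ (the normal approximation with quantile $z$ is used for the intervals):

```latex
\operatorname{Var}(\hat\beta) = \sigma^2 (X^\top X)^{-1},
\qquad
\hat\sigma^2 = \frac{(y - X\hat\beta)^\top (y - X\hat\beta)}{n - p},

\operatorname{SE}(\hat\beta_j) = \hat\sigma \sqrt{ \big[ (X^\top X)^{-1} \big]_{jj} },
\qquad
\hat\beta_j \pm z \cdot \operatorname{SE}(\hat\beta_j),

\text{mean response at } x^*: \quad
x^{*\top} \hat\beta \pm z \, \hat\sigma \sqrt{ x^{*\top} (X^\top X)^{-1} x^* }.
```

With $p = 2$ and $X$ consisting of a column of ones and a single covariate, these reduce to the simple linear regression formulas.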