Regression (Cont.) and Bias-Variance Trade-off
Lecture 9
More on the probabilistic view of regression and the bias-variance trade-off
EECS 189/289, Fall 2025 @ UC Berkeley
Joseph E. Gonzalez and Narges Norouzi
Roadmap
MLE Recap: Least Squares as Maximum Likelihood
Least Squares ≘ Maximum Likelihood
Writing the equation for a Normal distribution
Simplify and separate two terms
The least-squares solution is the MLE under Gaussian noise.
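One way to see this is to write out the Normal density and then separate the log-likelihood into two terms. The notation below is an assumption, since the slide equations did not survive extraction: data $(x_i, y_i)$ for $i = 1, \dots, n$, a model $f_w$ with weights $w$, and noise variance $\sigma^2$.

```latex
\begin{align*}
y_i &= f_w(x_i) + \epsilon_i, \qquad \epsilon_i \sim \mathcal{N}(0, \sigma^2) \\
\log p(y \mid X, w)
  &= \sum_{i=1}^{n} \log \left[ \frac{1}{\sqrt{2\pi\sigma^2}}
     \exp\left( -\frac{(y_i - f_w(x_i))^2}{2\sigma^2} \right) \right] \\
  &= -\frac{n}{2} \log(2\pi\sigma^2)
     - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - f_w(x_i))^2 \\
\hat{w}_{\text{MLE}}
  &= \arg\max_w \log p(y \mid X, w)
   = \arg\min_w \sum_{i=1}^{n} (y_i - f_w(x_i))^2
\end{align*}
```

The first term does not depend on $w$, so maximizing the likelihood is the same as minimizing the sum of squared errors.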
Choosing Different Noise Models
Noise Model ⟺ Error Function
Zero-mean Gaussian Noise
Zero-mean Laplacian Noise
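Zero-mean Gaussian noise leads to a squared-error loss, while zero-mean Laplacian noise leads to an absolute-error loss. A minimal sketch for the simplest possible model, a single constant prediction c, is below. The data values and the grid search are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical observations with one outlier (illustrative values only).
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Candidate constant predictions c on a fine grid (step 0.01).
grid = np.linspace(0.0, 100.0, 10001)

# Gaussian noise: the negative log-likelihood is proportional to squared error.
squared_error = ((y[None, :] - grid[:, None]) ** 2).sum(axis=1)
# Laplacian noise: the negative log-likelihood is proportional to absolute error.
absolute_error = np.abs(y[None, :] - grid[:, None]).sum(axis=1)

c_gaussian = grid[np.argmin(squared_error)]
c_laplacian = grid[np.argmin(absolute_error)]

print("squared error minimizer:", c_gaussian, "mean:", np.mean(y))
print("absolute error minimizer:", c_laplacian, "median:", np.median(y))
```

The squared-error minimizer lands on the mean and is pulled toward the outlier. The absolute-error minimizer lands on the median and is not.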
Prior Beliefs
Recall: Beliefs and Priors
Strong prior ensures alignment with beliefs
Number line from 0 to 1: 0.5 (prior), 0.58 (posterior mean), 1.0 (MLE)
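The slide does not state which prior or data produced these three numbers. One setup that is consistent with them, assumed here purely for illustration, is a Beta(5, 5) prior on a heads probability followed by 2 heads in 2 flips.

```python
# Hypothetical setup: Beta(a, b) prior on the heads probability, then observe flips.
a, b = 5.0, 5.0          # assumed prior pseudo-counts (prior mean 0.5)
heads, tails = 2, 0      # assumed data: every observed flip is heads

prior_mean = a / (a + b)
mle = heads / (heads + tails)
# A Beta prior with a Bernoulli likelihood gives a Beta(a + heads, b + tails) posterior.
posterior_mean = (a + heads) / (a + b + heads + tails)

print(prior_mean, round(posterior_mean, 2), mle)  # prints 0.5 0.58 1.0
```

With so little data, the strong prior keeps the posterior mean close to 0.5 even though the MLE jumps to 1.0.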
What About Regression? MLE for Weights: No Prior
Belief about Parameters
Likelihood function
Prior (small, centered)
Posterior (from Bayes rule) up to a constant:
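The three pieces above can be sketched as follows. The notation is an assumption, since the slide equations are missing: a linear model with weights $w$, noise variance $\sigma^2$, and a zero-centered Gaussian prior with variance $\tau^2$.

```latex
\begin{align*}
\text{Likelihood:} \quad
  & p(y \mid X, w) = \prod_{i=1}^{n} \mathcal{N}\left(y_i \mid w^\top x_i,\ \sigma^2\right) \\
\text{Prior:} \quad
  & p(w) = \mathcal{N}\left(w \mid 0,\ \tau^2 I\right) \\
\text{Posterior:} \quad
  & p(w \mid X, y) \propto p(y \mid X, w)\, p(w)
\end{align*}
```

A small $\tau^2$ expresses the belief that the weights are small and centered at zero.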
What Is MAP?
Bayes Rule
Prior
Likelihood
Maximum A Posteriori (MAP):
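In symbols, with $\mathcal{D}$ denoting the observed data and $w$ the parameters (assumed notation), Bayes rule and the MAP estimate can be written as:

```latex
\begin{align*}
p(w \mid \mathcal{D}) &= \frac{p(\mathcal{D} \mid w)\, p(w)}{p(\mathcal{D})} \\
\hat{w}_{\text{MAP}}
  &= \arg\max_w\ p(w \mid \mathcal{D})
   = \arg\max_w \left[ \log p(\mathcal{D} \mid w) + \log p(w) \right]
\end{align*}
```

The evidence $p(\mathcal{D})$ does not depend on $w$, so it drops out of the maximization. With a flat prior the second term is constant and MAP reduces to the MLE.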
Plugging in Distributions
Does This Look Like Ridge Regression?
Least-Squares + Gaussian Prior ⇒ Ridge Regression
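A sketch of why this holds, assuming a linear model $w^\top x$, Gaussian noise with variance $\sigma^2$, and a prior $w \sim \mathcal{N}(0, \tau^2 I)$ (these symbols are assumptions, not taken from the slide):

```latex
\begin{align*}
\hat{w}_{\text{MAP}}
  &= \arg\min_w \left[ -\log p(y \mid X, w) - \log p(w) \right] \\
  &= \arg\min_w \left[ \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - w^\top x_i)^2
       + \frac{1}{2\tau^2} \|w\|_2^2 \right] \\
  &= \arg\min_w \left[ \sum_{i=1}^{n} (y_i - w^\top x_i)^2 + \lambda \|w\|_2^2 \right],
  \qquad \lambda = \frac{\sigma^2}{\tau^2}
\end{align*}
```

Terms that do not depend on $w$ are dropped in the second line, and the third line multiplies through by $2\sigma^2$. A tighter prior (smaller $\tau^2$) corresponds to a larger ridge penalty $\lambda$.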
What is the relationship between MAP and ridge regression?
Least-Squares + Gaussian Prior ⇒ Ridge Regression
Posterior (Bayes Rule):
Plugging in Gaussian:
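The equivalence can be checked numerically. The sketch below uses synthetic data and assumed values for the noise and prior standard deviations (`sigma` and `tau` are hypothetical names). It computes the ridge closed form with lambda = sigma^2 / tau^2 and confirms that the gradient of the negative log posterior vanishes there.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: n points, d features, Gaussian noise.
n, d = 50, 3
sigma, tau = 0.5, 2.0                 # assumed noise std and prior std
w_true = rng.normal(0.0, tau, size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + rng.normal(0.0, sigma, size=n)

# Ridge regression closed form with lambda = sigma^2 / tau^2.
lam = sigma**2 / tau**2
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def neg_log_posterior(w):
    """Negative log posterior, up to an additive constant."""
    return np.sum((y - X @ w) ** 2) / (2 * sigma**2) + np.sum(w**2) / (2 * tau**2)

def grad_neg_log_posterior(w):
    """Gradient of the negative log posterior with respect to w."""
    return -X.T @ (y - X @ w) / sigma**2 + w / tau**2

# At the MAP estimate the gradient is zero; the ridge solution satisfies this.
grad_norm = np.linalg.norm(grad_neg_log_posterior(w_ridge))
print("gradient norm at ridge solution:", grad_norm)
```

Because the negative log posterior is strictly convex in w, a zero gradient identifies its unique minimizer, so the ridge solution is the MAP estimate under these assumptions.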
Bias-Variance Trade-off
Fundamental Challenges in Learning?
Is this cat grumpy or are we overfitting to human faces?
Bias
The expected deviation between the predicted value and the true value
[Figure labels: All possible functions, True Function, Bias]
Noise
The variability of the random noise in the process we are trying to model.
Beyond our control (usually)
Model Variance
Variability in the predicted value across different training datasets.
Which of the following models would have high bias?
Which of the following models would have high variance?
Analysis of Squared Error
Noise term:
True Function
Can be any parametric function
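The setup for the analysis can be sketched as follows. The symbols are assumptions, since the slide equations were lost: $h$ is the true function, $\epsilon$ is the noise term, and $\hat{f}_{\mathcal{D}}$ is the model fit to a randomly drawn training dataset $\mathcal{D}$.

```latex
\begin{align*}
y &= h(x) + \epsilon, \qquad \mathbb{E}[\epsilon] = 0, \quad \mathrm{Var}(\epsilon) = \sigma^2 \\
\hat{f}_{\mathcal{D}}(x) &: \text{prediction at } x \text{ of the model trained on } \mathcal{D}
\end{align*}
```

The noise on a new observation is assumed to be independent of the training dataset, and therefore of $\hat{f}_{\mathcal{D}}$.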
Goal: expected squared error = “Noise” + (Bias)² + Model Variance
Splitting the expected squared error into two terms:
“Noise” Term: expected squared difference between Obs. Value and True Value
+ Model Estimation Error: expected squared difference between True Value and Pred. Value
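The split into a noise term and a model estimation error can be sketched by adding and subtracting the true value. The notation is assumed: $y = h(x) + \epsilon$ is the observed value, $h(x)$ the true value, and $\hat{f}_{\mathcal{D}}(x)$ the predicted value from a model trained on a random dataset $\mathcal{D}$.

```latex
\begin{align*}
\mathbb{E}\left[(y - \hat{f}_{\mathcal{D}}(x))^2\right]
  &= \mathbb{E}\left[\big((y - h(x)) + (h(x) - \hat{f}_{\mathcal{D}}(x))\big)^2\right] \\
  &= \underbrace{\mathbb{E}\left[(y - h(x))^2\right]}_{\text{noise term}}
   + \underbrace{\mathbb{E}\left[(h(x) - \hat{f}_{\mathcal{D}}(x))^2\right]}_{\text{model estimation error}}
   + 2\,\underbrace{\mathbb{E}[\epsilon]}_{=\,0}\,
       \mathbb{E}\left[h(x) - \hat{f}_{\mathcal{D}}(x)\right]
\end{align*}
```

The cross term factors because the new noise $\epsilon = y - h(x)$ is independent of the training dataset, and it vanishes because the noise has zero mean.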
Expected squared error = “Noise” Term + Model Estimation Error
We still need to calculate the Model Estimation Error term
Next we will show that Model Estimation Error = (Bias)² + Model Variance
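A sketch of this step adds and subtracts the average prediction $\bar{f}(x)$. The notation is assumed: $h(x)$ is the true value and $\hat{f}_{\mathcal{D}}(x)$ is the prediction of the model trained on a random dataset $\mathcal{D}$.

```latex
\begin{align*}
\bar{f}(x) &= \mathbb{E}_{\mathcal{D}}\left[\hat{f}_{\mathcal{D}}(x)\right] \\
\mathbb{E}\left[(h(x) - \hat{f}_{\mathcal{D}}(x))^2\right]
  &= \mathbb{E}\left[\big((h(x) - \bar{f}(x)) + (\bar{f}(x) - \hat{f}_{\mathcal{D}}(x))\big)^2\right] \\
  &= \underbrace{(h(x) - \bar{f}(x))^2}_{(\text{Bias})^2}
   + \underbrace{\mathbb{E}\left[(\hat{f}_{\mathcal{D}}(x) - \bar{f}(x))^2\right]}_{\text{model variance}}
   + 2\,(h(x) - \bar{f}(x))\,
       \underbrace{\mathbb{E}\left[\bar{f}(x) - \hat{f}_{\mathcal{D}}(x)\right]}_{=\,0}
\end{align*}
```

Both $h(x)$ and $\bar{f}(x)$ are constants with respect to the expectation over datasets, so $(h(x) - \bar{f}(x))$ can be pulled out of the cross term, which then vanishes by the definition of $\bar{f}(x)$.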
We have now calculated the Model Estimation Error term:
Expected squared error = “Noise” Term + (Bias)² + Model Variance
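The decomposition can be checked by simulation. The sketch below is one possible setup, not the lecture's own demo. It assumes a true function sin(pi x), Gaussian noise, fixed training inputs with freshly drawn noise for each dataset, and polynomial models of degree 1 and 7.

```python
import numpy as np

rng = np.random.default_rng(0)

def h(x):
    """Assumed true function (a hypothetical choice for this demo)."""
    return np.sin(np.pi * x)

sigma = 0.3                             # assumed noise standard deviation
x_grid = np.linspace(-1.0, 1.0, 30)     # fixed inputs, used for training and testing
n_datasets = 500

def decompose(degree):
    """Estimate (bias^2, variance, expected squared error) for a polynomial fit."""
    preds = np.empty((n_datasets, x_grid.size))
    sq_err = np.empty((n_datasets, x_grid.size))
    for k in range(n_datasets):
        # Each training dataset has the same inputs but freshly drawn noise.
        y_train = h(x_grid) + rng.normal(0.0, sigma, x_grid.size)
        coeffs = np.polyfit(x_grid, y_train, degree)
        preds[k] = np.polyval(coeffs, x_grid)
        # Fresh noisy observations at the same inputs act as test data.
        y_test = h(x_grid) + rng.normal(0.0, sigma, x_grid.size)
        sq_err[k] = (y_test - preds[k]) ** 2
    mean_pred = preds.mean(axis=0)
    bias_sq = np.mean((h(x_grid) - mean_pred) ** 2)
    variance = np.mean(preds.var(axis=0))
    return bias_sq, variance, sq_err.mean()

results = {deg: decompose(deg) for deg in (1, 7)}
for deg, (bias_sq, variance, mse) in results.items():
    total = sigma**2 + bias_sq + variance
    print(f"degree {deg}: bias^2={bias_sq:.4f} variance={variance:.4f} "
          f"noise+bias^2+variance={total:.4f} measured error={mse:.4f}")
```

In this setup the degree 1 model shows the larger bias and the degree 7 model the larger variance, and in both cases the three terms add up to roughly the measured squared error.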
Bias Variance Plot
[Plot: Test Error, (Bias)², and Variance curves against model complexity (axis labeled “Decreasing Model Complexity”), with the Optimal Value marked]
More Data Supports More Complexity
[Plot: Test Error, (Bias)², and Variance curves against model complexity (axis labeled “Decreasing Model Complexity”), with the Optimal Value marked]
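One way to see why more data supports more complexity is that the variance of a fixed, fairly complex model shrinks as the training set grows. The sketch below assumes a degree 7 polynomial, a true function sin(pi x), and Gaussian noise. All of these are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, degree, n_datasets = 0.3, 7, 300   # assumed noise level and model complexity

def average_variance(n_train):
    """Average prediction variance of the polynomial fit across many datasets."""
    x_test = np.linspace(-1.0, 1.0, 50)
    x_train = np.linspace(-1.0, 1.0, n_train)   # fixed training inputs
    preds = np.empty((n_datasets, x_test.size))
    for k in range(n_datasets):
        # Each dataset has freshly drawn noise around the assumed true function.
        y_train = np.sin(np.pi * x_train) + rng.normal(0.0, sigma, n_train)
        preds[k] = np.polyval(np.polyfit(x_train, y_train, degree), x_test)
    return np.mean(preds.var(axis=0))

var_small = average_variance(30)
var_large = average_variance(300)
print("variance with 30 points:", var_small)
print("variance with 300 points:", var_large)
```

With ten times more data the same model has much lower variance, so the point where variance starts to dominate the test error moves toward more complex models.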
How Do We Control Model Complexity?
[Plot: Test Error, (Bias)², and Variance curves against model complexity (axis labeled “Decreasing Model Complexity”), with the Optimal Value marked]
Determining the Optimal 𝜆
[Plot: Error against increasing 𝜆, showing Test Error, (Bias)², and Variance curves with the Optimal Value marked]
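In practice the test error curve is not observed directly, so a common approach is to sweep 𝜆 and pick the value with the lowest error on held-out data. The sketch below assumes ridge regression on polynomial features with a synthetic dataset. The feature degree, noise level, and 𝜆 grid are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def features(x, degree=9):
    """Polynomial features [1, x, ..., x^degree] (a hypothetical model choice)."""
    return np.vander(x, degree + 1, increasing=True)

def make_data(n):
    """Assumed data-generating process: sin(pi x) plus Gaussian noise."""
    x = rng.uniform(-1.0, 1.0, n)
    return x, np.sin(np.pi * x) + rng.normal(0.0, 0.3, n)

x_train, y_train = make_data(30)
x_val, y_val = make_data(200)
Phi_train, Phi_val = features(x_train), features(x_val)

def ridge_fit(Phi, y, lam):
    """Closed-form ridge solution (for simplicity the intercept is penalized too)."""
    d = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(d), Phi.T @ y)

lambdas = np.logspace(-6, 2, 30)
train_errors, val_errors = [], []
for lam in lambdas:
    w = ridge_fit(Phi_train, y_train, lam)
    train_errors.append(np.mean((y_train - Phi_train @ w) ** 2))
    val_errors.append(np.mean((y_val - Phi_val @ w) ** 2))

# The chosen lambda is the one with the lowest held-out error.
best = int(np.argmin(val_errors))
print("best lambda:", lambdas[best], "validation error:", val_errors[best])
```

Training error can only go up as 𝜆 increases, which is why held-out data is needed to locate the optimum.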
Dataset Example
Bias Variance Derivation Quiz
Credit: Joseph E. Gonzalez and Narges Norouzi
Reference Book Chapters: Chapter 4.2 and 4.3
Homework!
HW 1 Updates
HW2
Part 1 (Due Oct 3rd)
Part 2 (Due Oct 17th)
Uses concepts from next week's logistic regression lectures
Data Visualizers for HW2
Gradio: a Python library for building UIs, maintained by Hugging Face
In the HW, we encourage you to play around with the data you visualize and the styles of the UI
YOU SHOULD VIBE CODE THESE SINCE IT IS EASY TO VERIFY
Evaluating LMMs in the Wild with LMArena
An open platform for human preference evals
LMArena
A platform for holistic LMM evaluation, where real user conversations and pairwise votes are crowdsourced to build a live human preference leaderboard.
Members (pre-company launch): Wei-Lin Chiang, Anastasios Angelopoulos, Lianmin Zheng, Ying Sheng, Lisa Dunlap, Chris Chou, Tianle Li, Evan Frick, Aryan Vichare, Naman Jain, Manish Shetty, Yifan Song, Kelly Tang, Sophie Xie, Connor Chen, Joseph Tennyson, Dacheng Li, Siyuan Zhuang, Valerie Chen, Wayne Chi
Advisors (pre-company launch): Ion Stoica, Joseph Gonzalez, Hao Zhang, Trevor Darrell
(Formerly Chatbot Arena)
Crowdsourcing user interactions with LMMs
Direct Chat
Battle: the pairwise setting allows for more fine-grained comparison
Using Battles to Generate a Preference Leaderboard
If you think logistic regression is not valuable
Structure of a paper
What to answer in every paper
1-4 can usually be found in the intro
Where to look in the intro: 1 in the 1st paragraph, 2 in the 1st-2nd, 3 and 4 in the 2nd-4th