1 of 12

Introduction to Bayesian Regression

AI4Fusion Summer School - W&M, 2024

Cristiano Fanelli

06/12/2024

  • Linear/Polynomial
  • Logistic

2 of 12

Sample/Tune


One of the most immediate improvements you can make to Hamiltonian Monte Carlo (HMC) is to implement step size adaptation, which gives you fewer parameters to tune, and adds in the concept of “warmup” or “tuning” for your sampler.

https://colcarroll.github.io/hmc_tuning_talk/
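As an illustration, in PyMC the length of this warmup phase is controlled by the tune argument of pm.sample. A minimal sketch (the toy model and all numbers are made up):

    import numpy as np
    import pymc as pm

    # Hypothetical data: 100 draws from a normal distribution
    data = np.random.default_rng(0).normal(loc=1.0, scale=2.0, size=100)

    with pm.Model():
        mu = pm.Normal("mu", mu=0.0, sigma=10.0)
        pm.Normal("obs", mu=mu, sigma=2.0, observed=data)
        # During the `tune` iterations NUTS/HMC adapts its step size;
        # those draws are discarded, and only `draws` samples are kept.
        idata = pm.sample(draws=2000, tune=1000)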

3 of 12

Bayesian Linear Regression


  • Highlight the machine learning connection — ML is an umbrella term for a collection of methods that automatically learn patterns in data and then use what is learned to predict future data or make decisions under uncertainty
  • Regression problems as an example of supervised learning
  • In this class we will compare the ordinary least squares (OLS) fitting procedure for linear regression with Bayesian linear regression
    • Optimization problem (the one you are familiar with) vs. probabilistic problem
  • We assume you are familiar with OLS as well as uncertainty propagation from previous courses. Nonetheless, we will recall some concepts in class.
  • The probabilistic approach to the linear regression problem can be summarized as:
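A standard way to write this formulation (the specific priors shown here are illustrative assumptions, anticipating the Half-Cauchy choice on the next slide):

    y \sim \mathcal{N}(\mu, \sigma), \qquad \mu = \alpha + \beta x
    \alpha \sim \mathcal{N}(0, \sigma_\alpha), \quad \beta \sim \mathcal{N}(0, \sigma_\beta), \quad \sigma \sim \mathrm{HalfCauchy}(\gamma)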

4 of 12

Bayesian Linear Regression


  • Half-Cauchy?
  • Why?
    • It generally works well as a regularizing prior that avoids overfitting
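In PyMC such a prior is a one-liner; a minimal sketch (the scale beta=5 is an assumed, commonly used value):

    import pymc as pm

    with pm.Model():
        # Heavy-tailed, positive-only prior with most mass near zero:
        # it shrinks the noise scale unless the data say otherwise.
        sigma = pm.HalfCauchy("sigma", beta=5)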


5 of 12

Bayesian Linear Regression


[Figure: the model's noise term, labeled "disturbance"]

μ expressed as "deterministic" — see code
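A minimal PyMC sketch of the full model, assuming a simple y = α + βx setup (data, variable names, and prior scales are all illustrative):

    import numpy as np
    import pymc as pm

    # Hypothetical synthetic data
    rng = np.random.default_rng(42)
    x = rng.normal(size=100)
    y = 2.5 + 0.9 * x + rng.normal(scale=0.5, size=100)

    with pm.Model():
        alpha = pm.Normal("alpha", mu=0, sigma=10)
        beta = pm.Normal("beta", mu=0, sigma=1)
        sigma = pm.HalfCauchy("sigma", beta=5)
        # Wrapping mu in pm.Deterministic stores it in the trace even
        # though it is a fixed function of alpha and beta, not a new RV.
        mu = pm.Deterministic("mu", alpha + beta * x)
        pm.Normal("y_obs", mu=mu, sigma=sigma, observed=y)
        idata = pm.sample()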

6 of 12

Preprocessing


  • The correlation between slope and intercept is a direct consequence of our assumption/model
    • An increase in the slope implies a decrease in the intercept, and vice versa
  • The line is constrained to pass through the mean of the data (this is actually "relaxed" in the Bayesian model, but still partly true)
  • One good idea in some cases is to transform your data, e.g., by centering or standardizing it (see the sketch below):
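A minimal sketch, assuming the transform alluded to is centering/standardizing the predictor (numbers are made up):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(loc=10.0, scale=3.0, size=100)  # hypothetical predictor

    x_centered = x - x.mean()                  # center: slope and intercept decorrelate
    x_standardized = (x - x.mean()) / x.std()  # also rescale to unit variance

After centering, the intercept is the predicted value at the mean of x, which is why the slope-intercept correlation largely disappears.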

Class: what happens in this case?

7 of 12

Bayesian Linear Regression: FAQ


I am familiar with linear regression models already, and I know methods for fitting them, e.g., least squares. Why should I use Bayesian linear regression?

  • Bayesian linear regression provides a useful mechanism for dealing with insufficient or poorly distributed data. It allows you to put a prior on the coefficients and on the noise, so that in the absence of data the priors can take over. [ref]
  • The aim of Bayesian linear regression is not to find the single "best" value of the model parameters, but rather to determine their posterior distribution.

In problems where we have limited data, or some prior knowledge that we want to use in our model, the Bayesian linear regression approach can both incorporate prior information and express our uncertainty. Bayesian linear regression reflects the Bayesian framework: we form an initial estimate and improve it as we gather more data.

8 of 12

Linear vs Logistic Regression


Credits: references [1], [2], [3]

9 of 12

Logistic (Sigmoid) Function


Credits: references [1], [2], [3]

Hypothesis representation: $h_\theta(x) = \sigma(\theta^T x)$, where $x$ is the vector of features

Sigmoid: $\sigma(z) = \dfrac{1}{1 + e^{-z}}$

You can choose a threshold to make a decision (e.g., everything above 0.5 is "dog", anything below is "cat")
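A minimal NumPy sketch of the sigmoid and the 0.5 threshold (the scores are made-up stand-ins for θᵀx):

    import numpy as np

    def sigmoid(z):
        # Logistic function: maps any real number into (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

    scores = np.array([-2.0, 0.3, 1.5])           # hypothetical values of theta^T x
    probs = sigmoid(scores)                       # ~ [0.12, 0.57, 0.82]
    labels = np.where(probs > 0.5, "dog", "cat")  # ['cat', 'dog', 'dog']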

10 of 12

Logistic vs Bayesian Logistic


Traditional logistic regression typically involves minimizing a cost function using gradient descent, whereas Bayesian logistic regression uses MCMC or similar methods to sample from the posterior distribution of the parameters, thus providing a probabilistic understanding of the model parameters.

Logistic Regression:

    • The objective is to find the best parameters (weights) for the model that predict the probability that a certain class or event occurs. This is usually done by minimizing a cost function, most commonly the cross-entropy loss.
    • Gradient descent is a popular optimization algorithm. It iteratively adjusts the parameters to minimize the cost function: it calculates the gradient of the cost with respect to the parameters, then updates the parameters in the direction that reduces the cost (see the sketch below).
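A minimal gradient-descent sketch for logistic regression with the cross-entropy loss (data, learning rate, and iteration count are all made up):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Hypothetical data: 200 samples, 2 features, binary labels
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    y = (X @ np.array([1.5, -2.0]) > 0).astype(float)

    theta = np.zeros(2)  # point estimate of the weights
    lr = 0.1             # learning rate
    for _ in range(1000):
        p = sigmoid(X @ theta)
        grad = X.T @ (p - y) / len(y)  # gradient of the mean cross-entropy loss
        theta -= lr * grad             # step in the direction that reduces the cost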

Bayesian Logistic Regression:

    • Bayesian logistic regression incorporates Bayesian methods into logistic regression. This approach treats the parameters of the logistic regression model as random variables and places prior distributions on them.
    • To find the posterior distribution of these parameters (which reflects the updated beliefs about the parameters after observing the data), Bayesian logistic regression often uses MCMC methods, which sample from a probability distribution by constructing a Markov chain that has the desired distribution as its equilibrium distribution.
    • Unlike traditional logistic regression, which provides point estimates for the parameters, Bayesian logistic regression gives a distribution, offering a measure of uncertainty around the parameter estimates.
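A minimal PyMC sketch of Bayesian logistic regression (data and prior scales are illustrative; pm.sample runs NUTS, an MCMC method):

    import numpy as np
    import pymc as pm

    # Hypothetical binary-classification data
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 2))
    y = (X @ np.array([1.5, -2.0]) > 0).astype(int)

    with pm.Model():
        # Parameters are random variables with priors, not point values
        intercept = pm.Normal("intercept", mu=0, sigma=10)
        weights = pm.Normal("weights", mu=0, sigma=10, shape=2)
        # Bernoulli likelihood with a logit link
        pm.Bernoulli("obs", logit_p=intercept + pm.math.dot(X, weights), observed=y)
        idata = pm.sample()

The resulting trace contains full posterior samples for intercept and weights, which is exactly the distribution-instead-of-point-estimate contrast in the last bullet above.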

11 of 12

Types of Logistic Regression


  • Binary Logistic Regression
    • the response/dependent variable is binary in nature
    • this is the most common type of logistic regression
    • example: is a tumor benign or malignant (0 or 1), based on one or more predictors
  • Ordinal Logistic Regression
    • the response variable has 3 or more possible outcomes with a specified order
    • example: which grade is a student likely to receive on a scale of A through F, based on one or more predictors
  • Multinomial Logistic Regression
    • the response variable has 3 or more possible outcomes with no specified order
    • example: which candy are people likely to prefer out of chocolate, hard candy, sour gummies, and sweet gummies, based on one or more predictors

12 of 12


Spares