1 of 12

Understanding Linear Regression

  • A Fundamental Concept in Data Science and Machine Learning

  • Presented by: Muhammad Shafiq
  • Institution: University of the Punjab, Lahore

2 of 12

What is Linear Regression?

  • Linear regression is a statistical method used to model the relationship between a dependent variable (Y) and one or more independent variables (X).

  • Equation: Y = a + bX + ε
  • Y = dependent variable
  • X = independent variable
  • a = intercept
  • b = slope
  • ε = error term
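The coefficients a and b in the equation above are typically estimated by ordinary least squares. A minimal sketch in plain Python, using made-up sample data (the values of `xs` and `ys` are illustrative, not from the slides):

```python
# Ordinary least-squares fit of Y = a + bX on a small illustrative sample.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Slope: b = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
# Intercept: a = y_bar - b * x_bar
a = mean_y - b * mean_x

print(a, b)
```

The same fit is available as `sklearn.linear_model.LinearRegression` in the scikit-learn library listed in the references; the closed-form version above just makes the arithmetic visible.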

3 of 12

4 of 12

Types of Linear Regression

  • 1. Simple Linear Regression: one independent variable
    Example: predicting height based on age.
  • 2. Multiple Linear Regression: two or more independent variables
    Example: predicting crop yield based on rainfall, fertilizer use, and temperature.
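The multiple-regression case can be sketched by solving the least-squares problem for several predictors at once. The numbers below are hypothetical, not data from the slides:

```python
import numpy as np

# Hypothetical rows: (rainfall in mm, fertilizer in kg) -> crop yield.
X = np.array([[100.0, 20.0],
              [120.0, 25.0],
              [ 90.0, 15.0],
              [110.0, 30.0]])
y = np.array([40.0, 48.0, 35.0, 47.0])

# Prepend a column of ones so the first coefficient is the intercept a.
A = np.column_stack([np.ones(len(X)), X])

# Solve min ||A @ coef - y||^2; coef = [a, b_rainfall, b_fertilizer].
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
mse = float(np.mean((A @ coef - y) ** 2))
```

With more predictors the idea is unchanged: one slope per independent variable, all estimated jointly.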

5 of 12

Equation and Graph

  • Equation: Y = a + bX
  • Slope (b): shows how much Y changes for each unit change in X.
  • Intercept (a): the value of Y when X = 0.
  • Graph: the straight line that best fits the data points.

6 of 12

Assumptions of Linear Regression

  • 1. Linear relationship between X and Y
  • 2. Independence of observations
  • 3. Homoscedasticity (equal variance of errors)
  • 4. Normal distribution of errors
  • 5. No multicollinearity (for multiple regression)
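Several of these assumptions are usually checked through the residuals of a fitted model. A minimal sketch, assuming hypothetical fitted coefficients `a` and `b` and made-up data:

```python
# Residual check for a fitted line y_hat = a + b*x (coefficients assumed).
a, b = 0.5, 2.0                    # hypothetical fitted intercept and slope
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.4, 4.6, 6.4, 8.6]

residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# For a least-squares fit with an intercept, residuals average to ~0;
# a trend or fanning pattern in them would suggest non-linearity or
# heteroscedasticity.
mean_resid = sum(residuals) / len(residuals)
```

In practice these checks are done visually (residual-vs-fitted plots) or with formal tests; the snippet only shows where the residuals come from.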

7 of 12

Applications of Linear Regression

  • Economics: predicting GDP growth or inflation
  • Environmental Science: modeling pollution levels
  • Agriculture: forecasting crop yields
  • Health: estimating disease risk from lifestyle factors

8 of 12

Example

  • Dataset: Rainfall (X) vs Crop Yield (Y)
  • Regression Equation: Y = 2.3 + 0.8X
  • Interpretation: For every 1 mm increase in rainfall, predicted yield increases by 0.8 units; the intercept 2.3 is the predicted yield at zero rainfall.
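The fitted equation from this slide can be used directly for prediction (`predict_yield` is an illustrative helper name, not from the slides):

```python
# The slide's fitted model: yield = 2.3 + 0.8 * rainfall.
def predict_yield(rainfall_mm: float) -> float:
    return 2.3 + 0.8 * rainfall_mm

# Each extra 1 mm of rainfall adds 0.8 units of predicted yield:
delta = predict_yield(51.0) - predict_yield(50.0)
```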

9 of 12

Evaluation Metrics

  • R² (Coefficient of Determination): measures goodness of fit, i.e. the proportion of variance in Y explained by the model
  • Mean Squared Error (MSE): average squared difference between observed and predicted values
  • Root Mean Squared Error (RMSE): square root of MSE, expressed in the same units as Y
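All three metrics follow directly from their definitions. A minimal sketch with made-up observed and predicted values:

```python
import math

# Illustrative observed values and hypothetical model predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.8, 5.1, 7.2, 8.9]

n = len(y_true)

# MSE: average squared difference between observed and predicted values.
mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
# RMSE: square root of MSE, in the same units as Y.
rmse = math.sqrt(mse)

# R^2 = 1 - SS_res / SS_tot (residual vs. total sum of squares).
mean_y = sum(y_true) / n
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
ss_tot = sum((t - mean_y) ** 2 for t in y_true)
r2 = 1 - ss_res / ss_tot
```

An R² near 1 with a small RMSE indicates the line explains most of the variation in Y.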

10 of 12

Advantages and Limitations

  • Advantages:
    • Easy to implement and interpret
    • Works well for linear relationships

  • Limitations:
    • Not suitable for non-linear data
    • Sensitive to outliers
    • Assumes constant error variance and normality

11 of 12

Summary

  • Linear regression is a foundation of predictive modeling
  • It identifies relationships and trends in data
  • It is a good starting point for learning more advanced ML models

12 of 12

References

  • Montgomery, D.C., & Peck, E.A. (1992). Introduction to Linear Regression Analysis.
  • James, G. et al. (2013). An Introduction to Statistical Learning.
  • Online tutorials: Coursera, Kaggle, and scikit-learn documentation.