Machine Learning

Course Code – B17CS4101

Thursday, December 16, 2021

Department of Computer Science and Engineering

Sagi Ramakrishnam Raju Engineering College

Bhimavaram-534204

Machine Learning

Unit-II

XGBoost

Presented By:

Dr. G.N.V.G. Sirisha, Asst. Prof.

Objectives

  • To learn:
    • What XGBoost is
    • How it differs from Gradient Boosting
    • How an XGBoost tree is constructed
    • How XGBoost is used for regression

Courtesy: A Gentle Introduction to Gradient Boosting, Cheng Li, College of Computer and Information Science, Northeastern University

XGBoost

  • XGBoost is a popular and efficient open-source implementation of the gradient boosted trees algorithm.
  • Gradient boosting is a supervised learning algorithm.
  • It attempts to predict a target variable accurately by combining the estimates of a set of simpler, weaker models.
  • When gradient boosting is used for regression, the weak learners are regression trees.
  • Each regression tree maps an input data point to one of its leaves, and each leaf holds a continuous score.
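
The idea of combining weak tree models can be sketched as follows. This is a minimal illustration, not XGBoost's implementation: the `Stump` class, its thresholds, and its leaf scores are hypothetical values chosen only to show how each tree maps an input to a leaf score and how the scores are summed.

```python
from dataclasses import dataclass


@dataclass
class Stump:
    """A one-split regression tree (hypothetical helper for illustration)."""
    threshold: float
    left_score: float   # continuous score of the left leaf
    right_score: float  # continuous score of the right leaf

    def predict(self, x: float) -> float:
        # Route the input to exactly one leaf and return that leaf's score.
        return self.left_score if x < self.threshold else self.right_score


def ensemble_predict(x: float, stumps: list, base: float = 0.0) -> float:
    # The ensemble prediction is the base value plus the sum of all tree scores.
    return base + sum(s.predict(x) for s in stumps)


# Two weak learners with made-up thresholds and scores.
stumps = [Stump(5.0, -1.0, 2.0), Stump(8.0, 0.5, -0.5)]

low = ensemble_predict(3.0, stumps)   # -1.0 + 0.5
high = ensemble_predict(9.0, stumps)  # 2.0 + (-0.5)
print(low, high)
```

Each stump alone is a crude model; the sum of their leaf scores is the combined estimate.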

XGBoost contd.

  • XGBoost minimizes a regularized (L1 and L2) objective function that combines a convex loss function (based on the difference between the predicted and target outputs) and a penalty term for model complexity (in other words, the regression tree functions).
  • Training proceeds iteratively: each new tree is fit to the residuals (errors) of the trees built so far, and its output is combined with theirs to make the final prediction.
  • It's called gradient boosting because it uses a gradient descent algorithm to minimize the loss when adding new models.
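
The regularized objective described above can be written out as a sketch. The notation follows the usual XGBoost formulation rather than these slides, so the symbols are assumptions introduced here: $l$ is the convex loss, $f_k$ is the $k$-th tree, $T$ is the number of leaves in a tree, $w_j$ is the score of leaf $j$, $\gamma$ penalizes the number of leaves, $\lambda$ is the L2 penalty, and $\alpha$ is the L1 penalty.

```latex
\mathrm{Obj} = \sum_{i=1}^{n} l\bigl(y_i, \hat{y}_i\bigr) + \sum_{k=1}^{K} \Omega(f_k),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\lambda \sum_{j=1}^{T} w_j^{2} + \alpha \sum_{j=1}^{T} \lvert w_j \rvert
```

The first sum measures how far the predictions are from the targets; the second sum is the penalty for model complexity.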

XGBoost – eXtreme Gradient Boosting

  • XGBoost is generally applied to complex datasets containing many features.
  • Here, however, we apply XGBoost to a simple dataset with one independent variable (drug dosage) and one dependent variable (drug effectiveness).

XGBoost Example

  • XGBoost is generally applied to complex datasets.
  • To keep the example easy to follow, we use a simple dataset.

Machine Learning: Making Sense of Data

The initial prediction can be any value; by default it is taken as 0.5.
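
Starting from this initial prediction, the first set of residuals follows directly. The sketch below assumes the four observed values listed in the table of these slides (instances I1 to I4); the variable names are illustrative.

```python
# Observed drug effectiveness for I1..I4, as listed in the slides' table.
observed = [-10.0, 7.0, 8.0, -7.5]

# Default initial prediction.
initial_prediction = 0.5

# Residual = observed value - current prediction, for every instance.
residuals = [y - initial_prediction for y in observed]

print(residuals)
```

These residuals are what the first tree is trained to predict.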

Building XGBoost Tree
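
A minimal sketch of how a candidate split can be scored while building a tree, assuming squared-error loss, where the similarity score of a leaf is (sum of residuals)² / (number of residuals + λ) and the gain of a split is the children's similarity minus the parent's. The function names are illustrative, λ = 0 is assumed, and the instances are assumed to be ordered by increasing dosage.

```python
def similarity(residuals, lam=0.0):
    """Similarity score of a leaf: (sum of residuals)^2 / (count + lambda)."""
    return sum(residuals) ** 2 / (len(residuals) + lam)


def gain(left, right, lam=0.0):
    """Gain of a split: similarity of both children minus that of the parent."""
    parent = left + right
    return similarity(left, lam) + similarity(right, lam) - similarity(parent, lam)


# Residuals with respect to the initial prediction of 0.5,
# using the observed values from the slides' table.
observed = [-10.0, 7.0, 8.0, -7.5]
residuals = [y - 0.5 for y in observed]

# Try every split position between adjacent instances and keep the best one.
gains = {i: gain(residuals[:i], residuals[i:]) for i in range(1, len(residuals))}
best_split = max(gains, key=gains.get)

print(gains)
print("best split is after instance", best_split)
```

The split with the largest gain is chosen for the node, and the same procedure is repeated on each child.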

Instance    Observed Value    Predicted Value    Residuals
I1          -10               -2.65              -7.35
I2          7                 2.6                4.4
I3          8                 2.6                5.4
I4          -7.5              -1.75              -5.75

Predicted value for I1: 0.5 + 0.3 × (-10.5) = -2.65
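
The predicted values and residuals above can be reproduced with a short sketch. The learning rate of 0.3 is read off the expression for I1; the leaf outputs for I2, I3, and I4 are not stated in the slides and are back-derived here from the Predicted Value column, so they should be treated as assumptions.

```python
initial_prediction = 0.5
learning_rate = 0.3  # taken from the expression 0.5 + 0.3 * -10.5

observed = {"I1": -10.0, "I2": 7.0, "I3": 8.0, "I4": -7.5}

# Output of the leaf each instance falls into.
# Assumption: only -10.5 is given; the others are inferred from the table.
leaf_output = {"I1": -10.5, "I2": 7.0, "I3": 7.0, "I4": -7.5}

# New prediction = initial prediction + learning rate * leaf output.
predicted = {
    k: round(initial_prediction + learning_rate * leaf_output[k], 2)
    for k in observed
}

# New residual = observed value - new prediction.
residuals = {k: round(observed[k] - predicted[k], 2) for k in observed}

for k in observed:
    print(k, observed[k], predicted[k], residuals[k])
```

Every new residual is smaller in magnitude than the corresponding residual against the initial prediction of 0.5, which is the small step towards the observed values that each tree contributes.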

Predicting the output values

Predicting the drug effectiveness for the first data point in the training dataset

Role of λ in preventing overfitting
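
A minimal sketch of the effect of λ, assuming squared-error loss, where the output value of a leaf is (sum of residuals) / (number of residuals + λ). The leaf holding the single residual -10.5 is the one used in the prediction for the first data point; λ = 1 is a hypothetical value chosen for comparison.

```python
def leaf_output(residuals, lam=0.0):
    """Leaf output for squared-error loss: sum of residuals / (count + lambda)."""
    return sum(residuals) / (len(residuals) + lam)


def predict(initial, learning_rate, output):
    """Prediction after one tree: initial prediction + learning rate * leaf output."""
    return initial + learning_rate * output


leaf = [-10.5]  # leaf containing only the first data point's residual

for lam in (0.0, 1.0):
    out = leaf_output(leaf, lam)
    print("lambda =", lam, "leaf output =", out,
          "prediction =", round(predict(0.5, 0.3, out), 3))
```

With λ = 1 the output of this single-observation leaf is halved, so the prediction moves less far from the initial value. Leaves with few observations are shrunk the most, which makes the model less sensitive to individual data points.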

Conclusion

References

  1. https://statquest.org/gradient-boost-part-1-regression-main-ideas/
  2. https://statquest.org/gradient-boost-part-2-regression-details/
  3. https://statquest.org/gradient-boost-part-3-classification-main-ideas/
  4. Gradient Boosting Decision Tree Algorithm, Explained, https://towardsdatascience.com/machine-learning-part-18-boosting-algorithms-gradient-boosting-in-python-ef5ae6965be4

Thank You