CS60050: Machine Learning
Sourangshu Bhattacharya
BOOSTING
Boosting
Boosting Intuition
Boosting in a Picture
[Figure: five training cases tracked across boosting rounds. A correctly classified case keeps a small weight; a misclassified example has a large weight in the next round; a DT with low weighted error gets a strong vote in the final combination.]
Boosting: AdaBoost
AdaBoost
AdaBoost (contd.)
And in animation:
Original training set: equal weights to all training samples.
Taken from "A Tutorial on Boosting" by Yoav Freund and Rob Schapire.
AdaBoost example
ROUND 1
ε = error rate of the classifier; α = weight of the classifier
ROUND 2
ROUND 3
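As a concrete numerical sketch of one round (my own illustration, not the tutorial's figures; the function name and inputs are assumptions): ten equally weighted samples, three of which the weak classifier gets wrong, giving ε = 0.3 and α = ½ ln((1−ε)/ε).

```python
import math

def adaboost_round(weights, correct):
    """One AdaBoost round: compute the classifier's weighted error rate
    (epsilon), its vote weight (alpha), and the re-normalized sample
    weights. `correct[i]` is True when sample i is classified correctly."""
    eps = sum(w for w, c in zip(weights, correct) if not c)
    alpha = 0.5 * math.log((1 - eps) / eps)
    # Misclassified samples are up-weighted, correct ones down-weighted.
    new_w = [w * math.exp(-alpha if c else alpha)
             for w, c in zip(weights, correct)]
    z = sum(new_w)  # normalization constant
    return eps, alpha, [w / z for w in new_w]

# Ten samples with equal initial weight; the stump misclassifies three.
weights = [0.1] * 10
correct = [True] * 7 + [False] * 3
eps, alpha, weights = adaboost_round(weights, correct)
```

After re-normalization the misclassified and correctly classified groups each carry total weight ½, which is why the next weak learner is forced to focus on the mistakes.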
AdaBoost illustration
AdaBoost - Observations
AdaBoost - derivation
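To tie the derivation together, here is a minimal self-contained sketch of AdaBoost with 1-D threshold stumps as the weak learner (labels in {-1, +1}); `best_stump`, `adaboost`, and `predict` are illustrative names, not the lecture's code, and the perfect-stump handling is an assumption for the separable case.

```python
import math

def best_stump(xs, ys, w):
    """Weighted-error-minimizing 1-D threshold stump.
    Predicts `pol` when x < thr and `-pol` otherwise."""
    best = None
    for thr in sorted(set(xs)):
        for pol in (1, -1):
            err = sum(wi for xi, yi, wi in zip(xs, ys, w)
                      if (pol if xi < thr else -pol) != yi)
            if best is None or err < best[2]:
                best = (thr, pol, err)
    return best

def adaboost(xs, ys, rounds=5):
    """Returns a list of (alpha, thr, pol) weighted stumps."""
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        thr, pol, eps = best_stump(xs, ys, w)
        if eps == 0:                      # perfect stump: large finite vote
            ensemble.append((10.0, thr, pol))
            break
        alpha = 0.5 * math.log((1 - eps) / eps)
        ensemble.append((alpha, thr, pol))
        # Re-weight: up-weight mistakes, down-weight correct samples.
        w = [wi * math.exp(-alpha * yi * (pol if xi < thr else -pol))
             for xi, yi, wi in zip(xs, ys, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    """Final classifier: sign of the alpha-weighted stump votes."""
    s = sum(a * (p if x < t else -p) for a, t, p in ensemble)
    return 1 if s >= 0 else -1
```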
XGBOOST
Regression Tree Ensemble
[Figure: two regression trees. tree1 splits on "age < 15" (Y/N) and then "is male?" (Y/N), with leaf scores +2, +0.1, and -1. tree2 splits on "Use Computer Daily" (Y/N), with leaf scores +0.9 and -0.9.]
Prediction for the boy: f = 2 + 0.9 = 2.9. Prediction for the grandpa: f = -1 - 0.9 = -1.9.
The prediction for an instance is the sum of the scores predicted by each of the trees.
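The two-tree example can be written directly as code; the dictionary keys (`age`, `is_male`, `uses_computer_daily`) are assumed feature names for illustration.

```python
def tree1(person):
    # tree1 from the slide: split on age, then on gender.
    if person["age"] < 15:
        return 2.0 if person["is_male"] else 0.1
    return -1.0

def tree2(person):
    # tree2: split on daily computer use.
    return 0.9 if person["uses_computer_daily"] else -0.9

def ensemble_predict(person):
    # Ensemble prediction is the SUM of the individual tree scores.
    return tree1(person) + tree2(person)

boy = {"age": 10, "is_male": True, "uses_computer_daily": True}
grandpa = {"age": 70, "is_male": True, "uses_computer_daily": False}
```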
Tree Ensemble methods
Put into context: Model and Parameters
Model: y_hat_i = sum_{k=1..K} f_k(x_i), with each f_k in F, the space of functions containing all regression trees.
Think: a regression tree is a function that maps the attributes to a score.
Instead of learning weights, we are learning functions (trees).
Learning a tree on a single variable
[Figure: a regression tree that splits on time. "t < 2011/03/01?" (Y/N); on one branch, "t < 2010/03/20?" (Y/N); leaf values 1.0, 1.2, and 0.2.]
Equivalently: the model is a regression tree that splits on time, i.e., a piecewise step function over time.
Learning a step function amounts to choosing:
- the splitting positions
- the height in each segment
Learning a step function (visually)
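Since a step function is fully determined by its splitting positions and segment heights, the simplest case can be sketched directly: the code below (an illustration, not the lecture's code) fits a one-split step function by enumerating candidate positions, with each segment's height set to the mean of its targets, the squared-error minimizer.

```python
def fit_one_split(ts, ys):
    """Fit a one-split step function to (t, y) pairs by enumerating
    candidate split positions; each segment's height is the mean of
    its targets (the squared-error minimizer)."""
    order = sorted(range(len(ts)), key=lambda i: ts[i])
    ts = [ts[i] for i in order]
    ys = [ys[i] for i in order]
    best = None
    for k in range(1, len(ts)):
        left, right = ys[:k], ys[k:]
        hl = sum(left) / len(left)
        hr = sum(right) / len(right)
        sse = (sum((y - hl) ** 2 for y in left)
               + sum((y - hr) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, ts[k], hl, hr)
    return best  # (sse, split_position, left_height, right_height)
```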
Coming back: Objective for Tree Ensemble
Obj = sum_i l(y_i, y_hat_i) + sum_k Omega(f_k)
Training loss: how well the model fits the training data.
Complexity of the trees: the regularization term Omega.
Objective vs Heuristic
Regression Tree is not just for regression!
Outline
Take Home Message for this section
So How do we Learn?
Additive Training
Model at training round t: y_hat_i^(t) = y_hat_i^(t-1) + f_t(x_i)
Keep the functions added in previous rounds; the new function f_t is what we need to decide in round t.
Goal: find f_t to minimize sum_i l(y_i, y_hat_i^(t-1) + f_t(x_i)) + Omega(f_t)
With square loss this becomes sum_i [ 2 (y_hat_i^(t-1) - y_i) f_t(x_i) + f_t(x_i)^2 ] + Omega(f_t) + const, where (y_i - y_hat_i^(t-1)) is usually called the residual from the previous round.
Taylor Expansion Approximation of Loss
Second-order Taylor expansion of the loss around the previous prediction:
Obj^(t) ≈ sum_i [ l(y_i, y_hat_i^(t-1)) + g_i f_t(x_i) + (1/2) h_i f_t(x_i)^2 ] + Omega(f_t)
with g_i = ∂ l(y_i, y_hat_i^(t-1)) / ∂ y_hat_i^(t-1) and h_i the corresponding second derivative.
Our New Goal: with constants dropped, minimize sum_i [ g_i f_t(x_i) + (1/2) h_i f_t(x_i)^2 ] + Omega(f_t)
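The gradient statistics are cheap to compute for common losses; a sketch (function names are my own) under the usual conventions, where g_i and h_i are the first and second derivatives of the loss with respect to the current prediction:

```python
import math

def grad_hess_square(y, y_pred):
    """Squared error l = (y - y_pred)^2:
    g = 2*(y_pred - y), h = 2."""
    return 2.0 * (y_pred - y), 2.0

def grad_hess_logistic(y, y_pred):
    """Logistic loss for labels y in {0, 1}, where y_pred is the raw
    (pre-sigmoid) score: g = sigmoid(y_pred) - y, h = p*(1 - p)."""
    p = 1.0 / (1.0 + math.exp(-y_pred))
    return p - y, p * (1.0 - p)
```

Note that for square loss g_i is exactly twice the (negated) residual from the previous round, which recovers the residual-fitting view.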
Refine the definition of tree
Define a tree by a vector of leaf scores w and a leaf index function q that maps an instance to a leaf: f_t(x) = w_q(x).
[Figure: the "age < 15" / "is male?" tree with leaves 1, 2, 3; q(boy) = 1, q(grandpa) = 3; leaf scores w1 = +2, w2 = 0.1, w3 = -1.]
q encodes the structure of the tree; w holds the leaf weights of the tree.
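The (q, w) view of the slide's tree, transcribed as code (feature names are assumed for illustration):

```python
# q maps an instance to a leaf index, w holds the leaf scores,
# and the tree's output is f(x) = w[q(x)].
w = {1: 2.0, 2: 0.1, 3: -1.0}

def q(person):
    if person["age"] < 15:
        return 1 if person["is_male"] else 2
    return 3

def f(person):
    return w[q(person)]

boy = {"age": 10, "is_male": True}
grandpa = {"age": 70, "is_male": True}
```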
Define Complexity of a Tree
Omega(f_t) = gamma T + (1/2) lambda sum_{j=1..T} w_j^2
i.e., gamma times the number of leaves plus lambda times half the squared L2 norm of the leaf scores.
Example (leaves w1 = +2, w2 = 0.1, w3 = -1): Omega = 3 gamma + (1/2) lambda (4 + 0.01 + 1)
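The complexity definition in code, applied to the three-leaf example above (the function name is my own):

```python
def tree_complexity(leaf_weights, gamma, lam):
    """Omega(f) = gamma * (#leaves) + 0.5 * lambda * sum_j w_j^2."""
    return (gamma * len(leaf_weights)
            + 0.5 * lam * sum(w * w for w in leaf_weights))
```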
Revisit the Objectives
The Structure Score
Let I_j be the set of instances in leaf j, with G_j = sum_{i in I_j} g_i and H_j = sum_{i in I_j} h_i. The optimal leaf weight is w_j* = -G_j / (H_j + lambda), giving
Obj* = -(1/2) sum_{j=1..T} G_j^2 / (H_j + lambda) + gamma T
This measures how good a tree structure is!
The Structure Score Calculation
[Figure: the "age < 15" / "is male?" tree; instances 1..5 with gradient statistics (g1, h1), ..., (g5, h5) are routed down the tree, and each leaf sums the g and h of the instances it receives.]
The smaller the score is, the better the structure is.
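Putting this together: given the gradient statistics grouped per leaf, compute G_j and H_j, the optimal leaf weights, and the structure score. A sketch with illustrative names:

```python
def structure_score(leaves, lam, gamma):
    """leaves: one list of (g_i, h_i) pairs per leaf.
    Returns (objective, optimal leaf weights), where
    objective = -1/2 * sum_j G_j^2/(H_j + lam) + gamma * T
    (smaller is better) and w_j* = -G_j/(H_j + lam)."""
    obj, weights = gamma * len(leaves), []
    for leaf in leaves:
        G = sum(g for g, _ in leaf)
        H = sum(h for _, h in leaf)
        obj -= 0.5 * G * G / (H + lam)
        weights.append(-G / (H + lam))
    return obj, weights
```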
Searching Algorithm for Single Tree
Greedy Learning of the Tree
The change in objective after adding a split is
Gain = (1/2) [ G_L^2/(H_L + lambda) + G_R^2/(H_R + lambda) - (G_L + G_R)^2/(H_L + H_R + lambda) ] - gamma
i.e., the score of the left child, plus the score of the right child, minus the score if we do not split; gamma is the complexity cost of introducing an additional leaf.
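The gain formula, transcribed directly (the function name is my own):

```python
def split_gain(GL, HL, GR, HR, lam, gamma):
    """Gain = 1/2 * [ GL^2/(HL+lam) + GR^2/(HR+lam)
                      - (GL+GR)^2/(HL+HR+lam) ] - gamma."""
    def score(G, H):
        return G * G / (H + lam)
    return 0.5 * (score(GL, HL) + score(GR, HR)
                  - score(GL + GR, HL + HR)) - gamma
```

A split is worth taking only when this gain is positive; otherwise the gamma penalty for the extra leaf outweighs the improvement in fit.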
Efficient Finding of the Best Split
[Figure: instances sorted by feature value a, with gradient statistics (g1, h1), (g4, h4), (g2, h2), (g5, h5), (g3, h3); a left-to-right scan accumulates G_L and H_L, so every candidate split on the feature can be scored in a single pass.]
An Algorithm for Split Finding
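A sketch of the one-pass scan for a single feature (sort by value, accumulate G_L and H_L left to right); this is an illustration, not the lecture's pseudocode:

```python
def best_split(xs, gs, hs, lam, gamma):
    """Scan instances sorted by feature value, scoring every boundary
    between distinct values with the gain formula.
    Returns (gain, threshold); threshold is None if no split helps."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    G, H = sum(gs), sum(hs)
    GL = HL = 0.0
    best = (0.0, None)  # keep only splits with positive gain
    for j, i in enumerate(order[:-1]):
        GL += gs[i]
        HL += hs[i]
        nxt = order[j + 1]
        if xs[i] == xs[nxt]:
            continue  # cannot split between equal feature values
        GR, HR = G - GL, H - HL
        gain = 0.5 * (GL * GL / (HL + lam) + GR * GR / (HR + lam)
                      - G * G / (H + lam)) - gamma
        if gain > best[0]:
            best = (gain, 0.5 * (xs[i] + xs[nxt]))
    return best
```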
What about Categorical Variables?
Pruning and Regularization
Recap: Boosted Tree Algorithm
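As a recap, a tiny end-to-end sketch of the boosted tree algorithm: XGBoost-style additive training with depth-1 trees and squared error on a 1-D feature. All names and defaults (`lam`, `eta`, `rounds`) are assumptions for illustration, not the lecture's settings.

```python
def boost(xs, ys, rounds=20, lam=1.0, eta=0.3):
    """Each round: compute g_i = 2*(pred_i - y_i), h_i = 2, pick the
    split maximizing the structure-score gain, set leaf weights
    w = -G/(H + lam), and add the new tree with shrinkage eta."""
    pred = [0.0] * len(xs)
    for _ in range(rounds):
        gs = [2.0 * (p - y) for p, y in zip(pred, ys)]
        hs = [2.0] * len(xs)
        G, H = sum(gs), sum(hs)
        best = None
        for thr in sorted(set(xs))[1:]:       # candidate thresholds
            GL = sum(g for x, g in zip(xs, gs) if x < thr)
            HL = sum(h for x, h in zip(xs, hs) if x < thr)
            gain = 0.5 * (GL * GL / (HL + lam)
                          + (G - GL) ** 2 / (H - HL + lam)
                          - G * G / (H + lam))
            if best is None or gain > best[0]:
                best = (gain, thr,
                        -GL / (HL + lam), -(G - GL) / (H - HL + lam))
        _, thr, wl, wr = best
        pred = [p + eta * (wl if x < thr else wr)
                for p, x in zip(pred, xs)]
    return pred
```

Because of the shrinkage eta, each round only moves the predictions part of the way toward the current residuals, so the fit improves geometrically over rounds.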