Data Mining_Anoop Chaturvedi
1
Swayam Prabha
Course Title
Multivariate Data Mining- Methods and Applications
Lecture 36
Committee Machine and Random Forests
Anoop Chaturvedi
Department of Statistics, University of Allahabad
Prayagraj (India)
Slides can be downloaded from https://sites.google.com/view/anoopchaturvedi/swayam-prabha
Committee Machine or Ensemble Learning
Objective: To lower the generalization error of a learning algorithm, i.e., to improve its accuracy.
Multiple models are combined so that the predictive performance improves over that of any individual model.
Generalization error ⇒ How well a machine learning model performs on unseen data.
Approach ⇒ Bagging and Boosting
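To illustrate why combining models can lower the error, the following minimal sketch computes the error of a majority vote over several classifiers. It rests on a strong assumption that is not part of the lecture: the base classifiers err independently and all have the same error rate. The function name `majority_vote_error` and the numbers used are hypothetical, chosen only for illustration.

```python
from math import comb

def majority_vote_error(n_models, base_error):
    """Probability that a majority of n_models (odd) classifiers are wrong.

    Assumption (for illustration only): the classifiers make errors
    independently, each with probability base_error.
    """
    if n_models % 2 == 0:
        raise ValueError("use an odd number of models to avoid ties")
    k_min = n_models // 2 + 1  # smallest number of wrong votes that forms a majority
    return sum(
        comb(n_models, k) * base_error**k * (1.0 - base_error) ** (n_models - k)
        for k in range(k_min, n_models + 1)
    )

single = majority_vote_error(1, 0.3)      # one classifier: error equals base_error
committee = majority_vote_error(11, 0.3)  # committee of 11 independent classifiers
print(single, committee)
```

In practice the base models are correlated, so the gain is smaller than this idealized calculation suggests.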
Generate an ensemble of base predictors/classifiers by perturbing the learning set.
Combine them into a single predictor/classifier.
Bagging ⇒ Generates perturbations by random and independent drawings from the learning set. Focus is on variance reduction.
Boosting ⇒ The perturbation process is deterministic. Generates perturbations by successive re-weightings of the learning set. Current weights depend upon the misclassification history of the process. Focus is on bias reduction.
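The two perturbation schemes can be contrasted with a small sketch. The bagging step draws indices at random with replacement. The boosting step uses an AdaBoost-style weight update, which is one common choice and an assumption here, since the slide text does not name a specific boosting algorithm. The function names and the toy misclassification pattern are hypothetical.

```python
import math
import random

def bootstrap_sample(n, rng):
    """Bagging: draw n indices at random, independently, with replacement."""
    return [rng.randrange(n) for _ in range(n)]

def adaboost_reweight(weights, misclassified):
    """Boosting (AdaBoost-style, assumed): deterministic re-weighting.

    Misclassified observations gain weight, correctly classified ones
    lose weight, and the weights are renormalized to sum to one.
    """
    err = sum(w for w, m in zip(weights, misclassified) if m)
    if not 0.0 < err < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    alpha = 0.5 * math.log((1.0 - err) / err)
    new = [w * math.exp(alpha if m else -alpha)
           for w, m in zip(weights, misclassified)]
    total = sum(new)
    return [w / total for w in new], alpha

n = 10
rng = random.Random(0)
idx = bootstrap_sample(n, rng)  # random perturbation of the learning set

weights = [1.0 / n] * n
misclassified = [i in (2, 7) for i in range(n)]  # toy misclassification history
new_weights, alpha = adaboost_reweight(weights, misclassified)
print(idx)
print(new_weights)
```

With this update the misclassified observations carry half of the total weight in the next round, so the next base classifier is pushed toward the cases the current one got wrong.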
Random Forests
Bagging (bootstrap aggregation) ⇒ A technique for reducing the variance of an estimated prediction function.
Random forests ⇒ A modification of bagging that builds a large collection of de-correlated trees and then averages them.
On many problems the performance of random forests is very similar to that of boosting, and they are simpler to train and tune.
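The idea of averaging de-correlated trees can be sketched in a few lines. This is a simplified illustration, not a full random forest: each "tree" is a one-split regression stump, fitted on a bootstrap sample and restricted to a random subset of `mtry` features. The synthetic data, the function names, and the parameter values are all assumptions made for the example.

```python
import random

def fit_stump(X, y, features):
    """Best least-squares single split, searched only over the given features."""
    mean_y = sum(y) / len(y)
    best = (float("inf"), None, None, mean_y, mean_y)  # sse, feature, threshold, left, right
    for j in features:
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if sse < best[0]:
                best = (sse, j, t, ml, mr)
    return best[1:]

def predict_stump(stump, row):
    j, t, left, right = stump
    if j is None:  # no valid split was found: constant prediction
        return left
    return left if row[j] <= t else right

def fit_forest(X, y, n_trees, mtry, rng):
    n, p = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample (bagging)
        features = rng.sample(range(p), mtry)       # random feature subset (de-correlation)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], features))
    return forest

def predict_forest(forest, row):
    """Average the predictions of all stumps."""
    return sum(predict_stump(s, row) for s in forest) / len(forest)

rng = random.Random(0)
X = [[rng.random() for _ in range(3)] for _ in range(60)]  # 3 features, only the first matters
y = [1.0 if row[0] > 0.5 else 0.0 for row in X]
forest = fit_forest(X, y, n_trees=50, mtry=2, rng=rng)
low = predict_forest(forest, [0.1, 0.5, 0.5])
high = predict_forest(forest, [0.9, 0.5, 0.5])
print(low, high)
```

In a full random forest the trees are grown deep and a fresh feature subset is drawn at every split; the stump version keeps only the two ingredients named above, bootstrap sampling and random feature selection, followed by averaging.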