Tree Methods
Let’s learn something!
Python and Spark
Reading Assignment
Chapter 8 of Introduction to Statistical Learning, by Gareth James, et al.
[Diagram labels: Math & Statistics, Domain Knowledge, Machine Learning, Software, Research, DS]
Tree Methods
I want to use this data to predict whether or not he will show up to play.
An intuitive way to do this is with a decision tree.
Tree Methods
In this tree we have:
Intuition Behind Splits
Imaginary data with three features (X, Y, and Z) and two possible classes.
Intuition Behind Splits
Splitting on Y gives us a clear separation between classes
Intuition Behind Splits
We could have also tried splitting on other features first:
Intuition Behind Splits
Entropy and information gain are the mathematical methods for choosing the best split. Refer to the reading assignment.
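To make the idea concrete, here is a minimal plain-Python sketch of entropy and information gain. The toy rows and the helper names (entropy, information_gain) are hypothetical and chosen only for illustration; they mirror the imaginary X, Y, Z example, where only Y separates the two classes.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(-(c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(labels, groups):
    """Parent entropy minus the size-weighted entropy of the child groups."""
    total = len(labels)
    weighted = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - weighted


# Hypothetical toy rows: (X, Y, Z, class). Only Y separates the classes cleanly.
rows = [
    (1, 0, 1, "A"),
    (0, 0, 0, "A"),
    (1, 1, 0, "B"),
    (0, 1, 1, "B"),
]
labels = [r[3] for r in rows]

gains = {}
for i, name in enumerate("XYZ"):
    # Group the class labels by the value this feature takes.
    groups = {}
    for r in rows:
        groups.setdefault(r[i], []).append(r[3])
    gains[name] = information_gain(labels, list(groups.values()))

best_feature = max(gains, key=gains.get)
```

Under these assumptions, the feature with the largest gain is the one we would split on first.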
Random Forests
To improve performance, we can use many trees, with a random sample of features chosen as the split candidates.
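A rough plain-Python sketch of this idea follows. It is not Spark's implementation: each "tree" is simplified to a single split (a stump), each stump is trained on a bootstrap sample of the rows (an assumption borrowed from standard random forests), and all names (train_stump, train_forest, predict_forest) are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(rows, labels, feature_ids):
    """Fit a one-split tree using only the given candidate features.

    Returns (feature, threshold, left_label, right_label) for the split
    with the fewest training errors.
    """
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            left_label = majority(left)  # left always holds at least one row
            right_label = majority(right) if right else left_label
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    return best[1:]


def predict_stump(stump, row):
    f, threshold, left_label, right_label = stump
    return left_label if row[f] <= threshold else right_label


def train_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    """Train many stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n = len(rows)
    all_features = list(range(len(rows[0])))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]          # rows drawn with replacement
        feature_ids = rng.sample(all_features, n_features)  # random feature subset
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx],
                                  feature_ids))
    return forest


def predict_forest(forest, row):
    """Majority vote over all trees."""
    return majority([predict_stump(s, row) for s in forest])
```

Because each tree sees only some of the features, no single strong feature dominates every tree, and the vote averages over trees that make different mistakes.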
Random Forests
What's the point?
2. Increase the weight of samples that model m misclassifies, and decrease the weight of samples that it classifies correctly.
3. Train the next weak model using samples drawn according to the updated weight distribution.
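The reweighting and resampling steps above can be sketched in plain Python. The slides do not say which update rule is intended, so this sketch borrows an AdaBoost-style factor as one concrete choice and assumes the weak model is better than chance; the function names (update_weights, draw_training_set) are hypothetical.

```python
import math
import random


def update_weights(weights, correct):
    """Raise the weight of misclassified samples and lower the rest.

    Assumption: an AdaBoost-style multiplicative factor derived from the
    weighted error rate; the slides do not specify the exact rule.
    """
    error = sum(w for w, ok in zip(weights, correct) if not ok) / sum(weights)
    if not 0 < error < 0.5:
        raise ValueError("weighted error must be strictly between 0 and 0.5")
    alpha = 0.5 * math.log((1 - error) / error)
    raw = [w * math.exp(-alpha if ok else alpha)
           for w, ok in zip(weights, correct)]
    total = sum(raw)
    return [w / total for w in raw]  # renormalize so the weights sum to 1


def draw_training_set(samples, weights, rng):
    """Draw a same-sized training set according to the weights."""
    return rng.choices(samples, weights=weights, k=len(samples))


weights = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, True, False]  # model m got the last sample wrong
new_weights = update_weights(weights, correct)
next_batch = draw_training_set(["s0", "s1", "s2", "s3"], new_weights,
                               random.Random(0))
```

With this rule, the misclassified sample ends up carrying half of the total weight, so the next weak model is much more likely to see it.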
Tree Methods
Documentation Example
Tree Methods
Code Along
Tree Methods
Consulting Project
Tree Methods
Consulting Project
Solutions