DECISION TREE
RAJKUMAR D
ASST PROG(SL.G)
DEPARTMENT OF COMPUTER SCIENCE & APPLICATIONS
SRMIST, RAMAPURAM
Decision Tree - Introduction
Decision Tree - Example
Decision Tree - Terminologies
How does the Decision Tree algorithm Work?
The complete process can be better understood using the below algorithm:
Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
Step-2: Find the best attribute in the dataset using Attribute Selection Measure (ASM).
Step-3: Divide S into subsets, one for each possible value of the best attribute.
Step-4: Generate the decision tree node, which contains the best attribute.
Step-5: Recursively make new decision trees using the subsets of the dataset created in Step-3. Continue this process until a stage is reached where the nodes cannot be classified further; these final nodes are called leaf nodes.
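The five steps above can be sketched as a short recursive function. This is a minimal illustration, not a full implementation: the dataset is assumed to be a list of dicts with a target key, and `choose_best_attribute` is a crude stand-in for a real Attribute Selection Measure such as information gain or the Gini index.

```python
from collections import Counter

def choose_best_attribute(rows, attributes, target):
    # Placeholder ASM (Step-2): pick the attribute whose split leaves the
    # fewest misclassified rows under a majority vote in each subset.
    def errors(attr):
        total = 0
        for value in {r[attr] for r in rows}:
            subset = [r[target] for r in rows if r[attr] == value]
            total += len(subset) - Counter(subset).most_common(1)[0][1]
        return total
    return min(attributes, key=errors)

def build_tree(rows, attributes, target):
    labels = [r[target] for r in rows]
    # Step-5 stopping case: the node is pure (or no attributes remain),
    # so return a leaf node holding the majority class.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    best = choose_best_attribute(rows, attributes, target)  # Step-2
    node = {best: {}}                                       # Step-4
    for value in {r[best] for r in rows}:                   # Step-3: split S
        subset = [r for r in rows if r[best] == value]
        remaining = [a for a in attributes if a != best]
        node[best][value] = build_tree(subset, remaining, target)  # Step-5
    return node
```

For example, on a tiny weather dataset, `build_tree(rows, ["outlook", "windy"], "play")` returns a nested dict whose inner keys are attribute values and whose leaves are class labels.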
Example Problem
Suppose there is a candidate who has a job offer and wants to decide whether he should accept the offer or not.
Attribute Selection Measures
1. Information Gain:
Information Gain = Entropy(S) − [(Weighted Avg) × Entropy(each feature)]
Entropy: Entropy is a metric that measures the impurity in a given attribute; it specifies the randomness in the data. Entropy can be calculated as:
Entropy(S) = −P(yes) log2 P(yes) − P(no) log2 P(no)
Where,
S = the set of samples
P(yes)= probability of yes
P(no)= probability of no
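The two formulas above can be sketched directly in code: binary entropy over a node's yes/no labels, and information gain as Entropy(S) minus the weighted average entropy of each subset produced by splitting on an attribute. The dict-based row format is an assumption for illustration.

```python
import math

def entropy(labels):
    # Entropy(S) = -P(yes) log2 P(yes) - P(no) log2 P(no)
    # (written generically over whatever class labels appear)
    total = len(labels)
    result = 0.0
    for cls in set(labels):
        p = labels.count(cls) / total
        result -= p * math.log2(p)
    return result

def information_gain(rows, attr, target):
    # Entropy(S) minus the weighted average entropy of each subset
    labels = [r[target] for r in rows]
    gain = entropy(labels)
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        gain -= len(subset) / len(rows) * entropy(subset)
    return gain
```

A 50/50 yes/no node has entropy 1.0 (maximum impurity for two classes), and an attribute that splits the data into pure subsets achieves the full gain of 1.0.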
2. Gini Index:
The Gini index can be calculated using the formula below:
Gini Index = 1 − Σj (Pj)²
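The Gini index formula is a one-liner in code: one minus the sum of squared class probabilities Pj at a node. A minimal sketch:

```python
def gini_index(labels):
    # Gini Index = 1 - sum_j (P_j)^2, where P_j is the fraction of
    # samples at the node belonging to class j
    total = len(labels)
    return 1.0 - sum((labels.count(c) / total) ** 2 for c in set(labels))
```

A pure node has a Gini index of 0, while an evenly mixed two-class node reaches the maximum of 0.5.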
Pruning: Getting an Optimal Decision Tree
There are mainly two types of tree pruning techniques used: Cost Complexity Pruning and Reduced Error Pruning.
Advantages of the Decision Tree
Disadvantages of the Decision Tree