JavaScript isn't enabled in your browser, so this file can't be opened. Enable and reload.

1 of 16

Classification Part II

Decision Trees with Information Gain
Course: Databases and Data Mining
Instructor: Jamolbek Mattiev

2 of 16

Learning Objectives

• Understand decision tree structure
• Learn entropy concept
• Understand information gain
• Apply splitting criteria mathematically

3 of 16

What is a Decision Tree?

• Supervised classification method
• Tree-like structure
• Internal nodes = feature tests
• Leaves = class labels

4 of 16

Entropy Concept

Entropy measures impurity
H(S) = − Σ pi log2(pi)
Range: 0 (pure) to 1 (max impurity)

5 of 16

Interpretation of Entropy

• H = 0 → Pure node
• H = 1 → Maximum uncertainty (binary case)
• Higher entropy → more disorder

6 of 16

Information Gain

IG(S, A) = H(S) − Σ (|Sv|/|S|) H(Sv)
Measures reduction in entropy
Select attribute with highest IG

7 of 16

Step-by-Step Algorithm

1. Compute entropy of dataset
2. For each attribute compute IG
3. Choose attribute with max IG
4. Split dataset
5. Repeat recursively

8 of 16

Example Dataset (Binary Class)

Total samples: 14
9 Positive, 5 Negative
Compute H(S)
Test split on attribute Outlook

9 of 16

Tree Construction Process

• Recursive partitioning
• Stop when entropy = 0
• Or no features left
• Or stopping criteria reached

10 of 16

Advantages

• Easy to interpret
• Handles categorical data
• No normalization required
• Fast training

11 of 16

Limitations

• Can overfit
• Sensitive to noise
• Bias toward multi-valued attributes

12 of 16

Overfitting Control

• Pre-pruning
• Post-pruning
• Limit maximum depth
• Minimum samples per leaf

13 of 16

Entropy vs Gini

Entropy → used in ID3/C4.5
Gini → used in CART
Both measure impurity
Different mathematical properties

14 of 16

Mathematical Insight

Information gain = Mutual Information
Based on Information Theory
Maximizes reduction in uncertainty

15 of 16

Summary

• Entropy measures impurity
• Information gain selects best split
• Tree built recursively
• Pruning improves generalization

16 of 16

Entropy Curve Visualization