UNIT-IV
Association Analysis:
Basic Concepts and Algorithms
Association Rule Mining
Monday, May 08, 2023
Market-Basket transactions
Example of Association Rules
{Diaper} → {Beer}
{Milk, Bread} → {Eggs, Coke}
{Beer, Bread} → {Milk}
Implication means co-occurrence, not causality!
Definition: Frequent Itemset
Itemset: a collection of one or more items, e.g., {Milk, Bread, Diaper}; a k-itemset is an itemset that contains k items.
Support count (σ): the number of transactions that contain a given itemset.
Support and confidence: the support s of a rule X → Y is the fraction of transactions that contain X ∪ Y; the confidence c is the fraction of transactions containing X that also contain Y, i.e., c = σ(X ∪ Y) / σ(X).
A frequent itemset is one whose support is greater than or equal to a minsup threshold.
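A minimal sketch of these measures, assuming the usual five-transaction market-basket table that accompanies this example:

```python
# Support and confidence for a rule X -> Y, computed on the
# five-transaction market-basket table (an assumption from the example).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support_count(itemset):
    """sigma(X): number of transactions containing every item of X."""
    return sum(1 for t in transactions if itemset <= t)

X, Y = {"Milk", "Diaper"}, {"Beer"}
s = support_count(X | Y) / len(transactions)   # support of the rule
c = support_count(X | Y) / support_count(X)    # confidence of the rule
print(f"s = {s:.1f}, c = {c:.2f}")             # s = 0.4, c = 0.67
```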
Definition: Association Rule
An association rule is an implication expression of the form X → Y, where X and Y are disjoint itemsets.
Example:
Association Rule Mining Task
Given a set of transactions T, find all rules having support ≥ minsup and confidence ≥ minconf.
Computational Complexity
Given d items, the total number of possible association rules is R = 3^d − 2^(d+1) + 1. If d = 6, R = 602 rules.
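The rule count follows because each of the d items can go to the antecedent, the consequent, or neither, minus the cases where either side is empty; a quick check:

```python
# Total number of association rules extractable from d items:
# R = 3^d - 2^(d+1) + 1.
def total_rules(d):
    return 3**d - 2**(d + 1) + 1

print(total_rules(6))  # 602
```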
Mining Association Rules
Example of Rules:
{Milk, Diaper} → {Beer} (s=0.4, c=0.67)
{Milk, Beer} → {Diaper} (s=0.4, c=1.0)
{Diaper, Beer} → {Milk} (s=0.4, c=0.67)
{Beer} → {Milk, Diaper} (s=0.4, c=0.67)
{Diaper} → {Milk, Beer} (s=0.4, c=0.5)
{Milk} → {Diaper, Beer} (s=0.4, c=0.5)
Observations:
All the above rules are binary partitions of the same itemset {Milk, Diaper, Beer}.
Rules originating from the same itemset have identical support but can have different confidence.
Thus, we may decouple the support and confidence requirements.
Mining Association Rules
Frequent Itemset Generation
Given d items, there are 2^d possible candidate itemsets.
Frequent Itemset Generation Strategies
Reducing Number of Candidates
Apriori principle: if an itemset is frequent, then all of its subsets must also be frequent. Equivalently, the support measure is anti-monotone: for any itemsets X ⊆ Y, s(X) ≥ s(Y).
Illustrating Apriori Principle
(Lattice figure: once an itemset is found to be infrequent, all of its supersets are pruned from the search space.)
Any measure that possesses an anti-monotone property can be incorporated
directly into the mining algorithm to effectively prune the exponential search space of candidate itemsets
Frequent Itemset Generation in the Apriori Algorithm
Illustrating Apriori Principle
Minimum Support = 3

Items (1-itemsets): all 6 items are counted; only the 4 frequent ones survive.
Pairs (2-itemsets): 6 candidates are generated from the frequent items alone (no need to generate candidates involving Coke or Eggs).
Triplets (3-itemsets): only 1 candidate remains after pruning.

If every subset is considered: 6C1 + 6C2 + 6C3 = 6 + 15 + 20 = 41 candidates.
With support-based pruning: 6 + 6 + 1 = 13 candidates.
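The 6 + 6 + 1 = 13 count can be reproduced level by level, assuming the usual five-transaction basket table and minimum support 3:

```python
from itertools import combinations

# Level-wise candidate counting on the five-transaction basket data
# (an assumption from the example), reproducing 6 + 6 + 1 = 13.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]
minsup = 3

def sigma(X):
    # support count: transactions containing every item of X
    return sum(1 for t in transactions if X <= t)

# 1-itemsets: every item is a candidate.
items = sorted(set().union(*transactions))
C1 = [frozenset([i]) for i in items]
F1 = [c for c in C1 if sigma(c) >= minsup]

# 2-itemsets: candidates come from pairs of frequent items only.
C2 = [a | b for a, b in combinations(F1, 2)]
F2 = [c for c in C2 if sigma(c) >= minsup]

# 3-itemsets: merge frequent pairs that share one item, then prune any
# candidate that has an infrequent 2-subset.
C3 = {a | b for a, b in combinations(F2, 2)
      if len(a | b) == 3
      and all(frozenset(s) in F2 for s in combinations(a | b, 2))}

print(len(C1), len(C2), len(C3))  # 6 6 1
```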
Apriori Algorithm
Candidate Generation and Pruning
1. Candidate Generation. This operation generates new candidate k-itemsets based on the frequent (k − 1)-itemsets found in the previous iteration.
2. Candidate Pruning. This operation eliminates some of the candidate k-itemsets using the support-based pruning strategy: any candidate with an infrequent (k − 1)-subset cannot itself be frequent.
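The two operations can be sketched as follows; `candidate_gen` is a hypothetical helper (not from the slides) implementing the F(k−1) × F(k−1) merging strategy over sorted tuples:

```python
from itertools import combinations

# Candidate generation + pruning: two frequent (k-1)-itemsets are merged
# only if their first k-2 items agree; a candidate is kept only if every
# (k-1)-subset of it is frequent.
def candidate_gen(frequent_prev, k):
    """Generate candidate k-itemsets from frequent (k-1)-itemsets,
    each given as a sorted tuple of items."""
    prev = set(frequent_prev)
    candidates = []
    for a, b in combinations(sorted(frequent_prev), 2):
        if a[:k - 2] == b[:k - 2]:           # identical (k-2)-prefix
            cand = tuple(sorted(set(a) | set(b)))
            # candidate pruning: every (k-1)-subset must be frequent
            if all(sub in prev for sub in combinations(cand, k - 1)):
                candidates.append(cand)
    return candidates

# Frequent 2-itemsets from the running example:
F2 = [("Beer", "Diaper"), ("Bread", "Diaper"),
      ("Bread", "Milk"), ("Diaper", "Milk")]
print(candidate_gen(F2, 3))  # [('Bread', 'Diaper', 'Milk')]
```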
There are several ways to generate candidate itemsets. The following is a list of requirements for an effective candidate generation procedure: it should avoid generating too many unnecessary candidates, it must be complete (no frequent itemset is left out), and it should not generate the same candidate itemset more than once.
Generating Association Rules from Frequent Itemsets
Association rules can be generated as follows: for each frequent itemset Y, consider every nonempty proper subset X of Y and output the rule X → Y − X if its confidence, σ(Y) / σ(X), meets the minimum confidence threshold.
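A sketch of this procedure; `sigma` is assumed to be a dictionary of support counts collected during frequent-itemset generation (the values below come from the five-transaction basket example):

```python
from itertools import chain, combinations

# For a frequent itemset Y, emit every rule X -> Y - X over the nonempty
# proper subsets X of Y whose confidence sigma(Y)/sigma(X) meets minconf.
def generate_rules(Y, sigma, minconf):
    rules = []
    subsets = chain.from_iterable(combinations(sorted(Y), r)
                                  for r in range(1, len(Y)))
    for X in map(frozenset, subsets):
        conf = sigma[frozenset(Y)] / sigma[X]
        if conf >= minconf:
            rules.append((set(X), set(Y) - set(X), round(conf, 2)))
    return rules

# Support counts from the running example (sigma of each subset needed).
sigma = {
    frozenset({"Milk"}): 4, frozenset({"Diaper"}): 4, frozenset({"Beer"}): 3,
    frozenset({"Milk", "Diaper"}): 3, frozenset({"Milk", "Beer"}): 2,
    frozenset({"Diaper", "Beer"}): 3, frozenset({"Milk", "Diaper", "Beer"}): 2,
}
rules = generate_rules({"Milk", "Diaper", "Beer"}, sigma, minconf=0.65)
# 4 rules survive minconf = 0.65, matching the c = 0.67 and c = 1.0
# rules listed earlier.
```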
Example
Support counting
Example
Support Counting Using a Hash Tree
Support Counting of Candidate Itemsets
Support Counting Using a Hash Tree
Hash function: h(p) = p mod 3, which sends items 1, 4, 7 to one branch, items 2, 5, 8 to a second, and items 3, 6, 9 to the third.
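A minimal sketch, assuming the hash function h(p) = p mod 3, which reproduces the grouping in the figure:

```python
# Route item p to a hash-tree branch with h(p) = p mod 3:
# items 1, 4, 7 share one branch, 2, 5, 8 another, and 3, 6, 9 the third.
def h(p):
    return p % 3

branches = {}
for p in range(1, 10):
    branches.setdefault(h(p), []).append(p)
print(branches)  # {1: [1, 4, 7], 2: [2, 5, 8], 0: [3, 6, 9]}
```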
Apriori Algorithm Disadvantages
Rule Generation
Given a frequent itemset Y, find all nonempty proper subsets X of Y such that the rule X → Y − X satisfies the minimum confidence threshold.
Confidence-Based Pruning
To compare rules generated from the same frequent itemset Y, the following theorem holds for the confidence measure: if a rule X → Y − X does not satisfy the confidence threshold, then no rule X′ → Y − X′, where X′ is a subset of X, can satisfy it either.
Rule Generation in Apriori Algorithm
Limitations of Apriori Algorithm
Compact Representation of Frequent Itemsets
In practice, the number of frequent itemsets produced from a transaction data set can be very large, so it is useful to identify a small representative set of itemsets from which all other frequent itemsets can be derived. Two such representations are closed and maximal frequent itemsets.
Closed Frequent Itemsets
An itemset X is closed if none of its immediate supersets has exactly the same support count as X; it is closed frequent if it is both closed and frequent.
Maximal Frequent Itemsets
An itemset is maximal frequent if it is frequent and none of its immediate supersets is frequent.
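The two definitions can be checked mechanically; a sketch on the usual five-transaction basket data (an assumption) with minimum support 3:

```python
from itertools import combinations

# Closed vs. maximal frequent itemsets: X is closed if no immediate
# superset has the same support count, and maximal frequent if no
# immediate superset is frequent at all.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]
minsup = 3
items = sorted(set().union(*transactions))

def sigma(X):
    return sum(1 for t in transactions if X <= t)

# Brute-force enumeration of all frequent itemsets (fine at this scale).
frequent = [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)
            if sigma(frozenset(c)) >= minsup]

def supersets(X):
    # immediate supersets: X plus exactly one extra item
    return [X | {i} for i in items if i not in X]

closed = [X for X in frequent
          if all(sigma(S) < sigma(X) for S in supersets(X))]
maximal = [X for X in frequent
           if all(sigma(S) < minsup for S in supersets(X))]
print(len(frequent), len(closed), len(maximal))  # 8 7 4
```

Note that every maximal frequent itemset is also closed, so maximal itemsets give the more compact (but lossier) representation.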
FP-Growth Algorithm
Frequent Pattern Growth Algorithm
FP Tree
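A minimal FP-tree construction sketch on the usual five-transaction example (the header-table links that FP-Growth uses for mining are omitted): each transaction keeps only its frequent items, sorts them by descending support, and is inserted along a shared-prefix path.

```python
from collections import Counter

# Each tree node stores an item, a count, and links to its children;
# shared prefixes of transactions share a single path.
class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count, self.children = 0, {}

def build_fp_tree(transactions, minsup):
    counts = Counter(i for t in transactions for i in t)
    root = Node(None, None)
    for t in transactions:
        # keep frequent items, most frequent first (ties broken by name)
        path = sorted((i for i in t if counts[i] >= minsup),
                      key=lambda i: (-counts[i], i))
        node = root
        for item in path:
            node = node.children.setdefault(item, Node(item, node))
            node.count += 1
    return root

tree = build_fp_tree(
    [{"Bread", "Milk"},
     {"Bread", "Diaper", "Beer", "Eggs"},
     {"Milk", "Diaper", "Beer", "Coke"},
     {"Bread", "Milk", "Diaper", "Beer"},
     {"Bread", "Milk", "Diaper", "Coke"}],
    minsup=3)
```

Here Coke and Eggs are dropped as infrequent, and the four transactions beginning with Bread share one path from the root.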
Example of FP-Growth Algorithm
Advantages of the FP-Growth Algorithm