1 of 22

Finding Frequent Patterns and Association Rules

Using WEKA – FP-Growth Algorithm
Course: Databases and Data Mining
Instructor: Jamolbek Mattiev

2 of 22

Learning Objectives

• Understand FP-Growth algorithm
• Compare FP-Growth with Apriori
• Generate association rules in WEKA
• Interpret support and confidence

3 of 22

Frequent Pattern Mining

• Discover frequent itemsets
• Analyze transactional data
• Support-based pattern discovery

4 of 22

Limitations of Apriori

• Multiple database scans
• Large candidate generation
• High computational cost
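
To make these limitations concrete, here is a minimal sketch of level-wise Apriori in Python. It is an illustration under stated assumptions, not WEKA's implementation: the five-basket dataset, the function name `apriori`, and the `min_count` parameter are all hypothetical. The function reports how many passes over the data it made and how many candidate itemsets it had to count.

```python
from itertools import combinations

# Hypothetical toy dataset: each transaction is a set of item names.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def apriori(transactions, min_count):
    """Return (frequent itemsets with counts, data passes, candidates counted)."""
    data = [frozenset(t) for t in transactions]
    # Pass 1: count every single item.
    counts = {}
    for t in data:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(frequent)
    scans, n_candidates, k = 1, len(counts), 2
    while frequent:
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a, b in combinations(list(frequent), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k - 1))}
        if not candidates:
            break
        # Each level costs one more full pass over the data.
        counts = {c: sum(1 for t in data if c <= t) for c in candidates}
        scans += 1
        n_candidates += len(candidates)
        frequent = {s: c for s, c in counts.items() if c >= min_count}
        result.update(frequent)
        k += 1
    return result, scans, n_candidates

itemsets, scans, n_candidates = apriori(transactions, min_count=3)
print(len(itemsets), "frequent itemsets,", scans, "passes,",
      n_candidates, "candidates counted")
```

Even on five baskets the sketch needs one pass per itemset size, and the candidate set grows with the number of frequent items, which is the cost the bullets above describe.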

5 of 22

FP-Growth Overview

• FP = Frequent Pattern
• No candidate generation
• Uses FP-Tree structure
• Typically more efficient than Apriori, especially at low support thresholds

6 of 22

FP-Tree Structure

• Compact prefix-tree
• Stores frequency information
• Built with two database scans
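
The two-scan construction can be sketched as follows. This is a simplified illustration, assuming a hypothetical five-basket dataset and hypothetical names (`FPNode`, `build_fp_tree`, `count_nodes`); it omits the header table and node links that a full implementation would keep.

```python
from collections import Counter

# Hypothetical toy dataset: each transaction is a set of item names.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

class FPNode:
    """One node of the prefix tree: an item plus a frequency counter."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(transactions, min_count):
    # Scan 1: count how many transactions contain each item.
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in item_counts.items() if c >= min_count}
    # Fix one global order: descending count, ties broken by item name.
    ordered = sorted(frequent, key=lambda i: (-frequent[i], i))
    rank = {item: r for r, item in enumerate(ordered)}
    root = FPNode(None)
    # Scan 2: insert the frequent items of each transaction in that order,
    # so transactions sharing a prefix share the same branch.
    for t in transactions:
        node = root
        for item in sorted((i for i in set(t) if i in rank), key=rank.get):
            child = node.children.get(item)
            if child is None:
                child = node.children[item] = FPNode(item, node)
            child.count += 1
            node = child
    return root, frequent

def count_nodes(node):
    """Number of item nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(c) for c in node.children.values())

root, frequent = build_fp_tree(transactions, min_count=3)
print("frequent items:", frequent)
print("tree nodes:", count_nodes(root))
```

On this toy data the 15 frequent-item occurrences in the baskets are stored in 9 tree nodes, because baskets that start with the same items share a branch.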

7 of 22

FP-Growth Steps

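1. Scan the dataset and count the support of each item
2. Scan again and build the FP-tree from the frequent items, sorted by descending support
3. Generate the conditional pattern base (and conditional FP-tree) for each frequent item
4. Extract frequent patterns recursively from the conditional trees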
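
The four steps above can be sketched as one short recursive function. This is a minimal teaching sketch, not WEKA's FPGrowth class: the names `Node` and `fp_growth`, the `(items, count)` database format, and the toy dataset are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy dataset: each transaction is a set of item names.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count, self.children = 0, {}

def fp_growth(database, min_count, suffix=frozenset()):
    """Yield (itemset, count) pairs; `database` is a list of (items, count)."""
    # Step 1: scan the (conditional) database and compute support counts.
    support = defaultdict(int)
    for items, n in database:
        for item in set(items):
            support[item] += n
    ordered = sorted((i for i in support if support[i] >= min_count),
                     key=lambda i: (-support[i], i))
    rank = {item: r for r, item in enumerate(ordered)}
    # Step 2: build the FP-tree; the header table lists the nodes per item.
    root, header = Node(None, None), defaultdict(list)
    for items, n in database:
        node = root
        for item in sorted((i for i in set(items) if i in rank), key=rank.get):
            if item not in node.children:
                node.children[item] = Node(item, node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += n
    # Steps 3 and 4: process items from least to most frequent.
    for item in sorted(header, key=rank.get, reverse=True):
        pattern = suffix | {item}
        yield pattern, support[item]
        # Conditional pattern base: the prefix path above each node of `item`,
        # weighted by that node's count.
        base = []
        for node in header[item]:
            path, parent = [], node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        # Recurse on the conditional database to grow longer patterns.
        yield from fp_growth(base, min_count, pattern)

patterns = dict(fp_growth([(t, 1) for t in transactions], min_count=3))
for itemset, count in sorted(patterns.items(), key=lambda p: (-p[1], sorted(p[0]))):
    print(sorted(itemset), count)
```

No candidate itemsets are generated and tested here: every pattern that is yielded is already known to be frequent, because its count comes directly from the conditional database.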

8 of 22

Support and Confidence

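Support(A) = (number of transactions containing A) / (total transactions)
Confidence(A→B) = Support(A∪B) / Support(A)
Support(A∪B) counts the transactions that contain every item of both A and B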
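
A small worked example may help here. The sketch below applies the two formulas directly; the five-basket dataset and the function names `support` and `confidence` are hypothetical, chosen only for illustration.

```python
# Hypothetical toy dataset: each transaction is a set of item names.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support(A ∪ B) / Support(A) for the rule A → B."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

print("Support({diapers})          =", support({"diapers"}, transactions))
print("Support({diapers, beer})    =", support({"diapers", "beer"}, transactions))
print("Confidence(diapers -> beer) =", confidence({"diapers"}, {"beer"}, transactions))
print("Confidence(beer -> diapers) =", confidence({"beer"}, {"diapers"}, transactions))
```

Diapers appear in 4 of 5 baskets and diapers with beer in 3 of 5, so the rule diapers → beer has confidence 3/4, while beer → diapers has confidence 3/3. Confidence is not symmetric.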

9 of 22

Advantages of FP-Growth

• Usually faster than Apriori, since it needs only two database scans
• No candidate generation
• Efficient for large datasets

10 of 22

WEKA: Step 1

• Open WEKA Explorer
• Load transactional dataset (.arff)
• Go to Associate tab
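
If the transactions are not yet in ARFF form, a file can be produced with a few lines of Python. This is a sketch under assumptions: the dataset, the function name `to_arff`, and the relation name `baskets` are hypothetical, item names are assumed to contain no spaces or special characters (otherwise they would need quoting), and each item is encoded as a binary attribute `{f, t}` with `t` meaning "present". Check the encoding against the WEKA version in use before relying on it.

```python
# Hypothetical toy dataset: each transaction is a set of item names.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def to_arff(transactions, relation="baskets"):
    """Render transactions as ARFF text with one binary attribute per item."""
    items = sorted({item for t in transactions for item in t})
    lines = [f"@relation {relation}", ""]
    # One attribute per item; the second value (t) marks the item as present.
    lines += [f"@attribute {item} {{f, t}}" for item in items]
    lines += ["", "@data"]
    # One comma-separated row per transaction, in attribute order.
    for t in transactions:
        lines.append(",".join("t" if item in t else "f" for item in items))
    return "\n".join(lines) + "\n"

arff_text = to_arff(transactions)
print(arff_text)
# To save it for loading in the Explorer (file name is an example):
# with open("baskets.arff", "w") as fh:
#     fh.write(arff_text)
```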

11 of 22

WEKA: Step 2

• Click Choose
• Select FPGrowth
• Set minimum support & confidence

12 of 22

WEKA: Step 3

• Configure number of rules
• Adjust thresholds
• Click Start

13 of 22

Interpreting WEKA Output

• Frequent itemsets
• Association rules
• Support and confidence values

14 of 22

Effect of Support Threshold

• High support → fewer patterns
• Low support → many patterns
• Balance rule quality against rule quantity
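
The effect can be checked on a small example by counting frequent itemsets at several thresholds. The sketch below uses brute-force enumeration, which is only workable for a handful of items; the dataset and the function name `count_frequent_itemsets` are hypothetical.

```python
from itertools import combinations

# Hypothetical toy dataset: each transaction is a set of item names.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def count_frequent_itemsets(transactions, min_support):
    """Count itemsets whose support is at least `min_support` (brute force)."""
    items = sorted({i for t in transactions for i in t})
    n = len(transactions)
    total = 0
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            hits = sum(1 for t in transactions if set(candidate) <= t)
            if hits / n >= min_support:
                total += 1
    return total

for threshold in (0.8, 0.6, 0.4):
    print(threshold, count_frequent_itemsets(transactions, threshold))
```

On this toy data the number of frequent itemsets rises from 3 at a threshold of 0.8 to 8 at 0.6 and 17 at 0.4, so halving the threshold multiplies the output several times over.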

15 of 22

Performance Comparison

• FP-Growth is typically faster than Apriori
• Scales better as the dataset grows
• Suitable for large transactional data

16 of 22

Applications

• Market basket analysis
• E-commerce recommendation
• Web clickstream mining
• Healthcare pattern discovery

17 of 22

Challenges

• Memory usage for large trees
• Parameter tuning
• Interpreting many rules

18 of 22

Experimental Design in WEKA

• Compare Apriori and FP-Growth
• Use same support/confidence
• Measure runtime
• Compare rule quality

19 of 22

Discussion Questions

• Why is FP-Growth faster?
• When should Apriori be used instead?
• How should the thresholds be chosen?

20 of 22

Summary

• FP-Growth avoids candidate generation
• Uses FP-tree for compression
• Efficient frequent pattern mining
• WEKA supports easy experimentation

21 of 22

Support Threshold Impact (Example)

22 of 22

Apriori vs FP-Growth Runtime (Example)