Practical Lesson: Generating Association Rules in WEKA

  • Course: Databases and Data Mining
  • Practical lesson
  • Instructor: Jamolbek Mattiev

Lab Objectives

  • Generate frequent itemsets
  • Create association rules using Apriori
  • Analyze support and confidence
  • Compare rule sets under different thresholds

Dataset Preparation

  • Use a sample transactional dataset
  • Format: ARFF file
  • Binary attributes (Yes/No), one per item
  • Each row = one transaction
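
As an illustration of the format described above, the following sketch builds ARFF text from a small basket dataset. The transactions, item names, and relation name are hypothetical and are not the lab dataset; the output can be saved under any name ending in .arff and opened in WEKA.

```python
# Hypothetical toy data: each set is one transaction (one ARFF data row).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "bread"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]
items = sorted(set().union(*transactions))


def to_arff(transactions, items, relation="basket"):
    """Build ARFF text with one nominal {Yes,No} attribute per item."""
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {item} {{Yes,No}}" for item in items]
    lines += ["", "@data"]
    for t in transactions:
        # One comma-separated row per transaction, in attribute order.
        lines.append(",".join("Yes" if item in t else "No" for item in items))
    return "\n".join(lines) + "\n"


arff_text = to_arff(transactions, items)
print(arff_text)
```

With a Yes/No encoding, "No" is an ordinary attribute value, so rules about absent items can also appear in the output.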

Step 1: Open WEKA

  • Launch WEKA
  • Select Explorer in the WEKA GUI Chooser
  • Load the sample dataset (Preprocess tab, Open file...)

Step 2: Go to Associate Tab

  • Click the 'Associate' tab
  • Choose the Apriori algorithm

Step 3: Configure Parameters

  • Set minimum support (lowerBoundMinSupport, e.g., 0.4)
  • Set minimum confidence (minMetric with metricType = Confidence, e.g., 0.8)
  • Set number of rules (numRules, e.g., 10)

Step 4: Run Apriori

  • Click Start
  • Observe generated frequent itemsets
  • Review association rules
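
To make clear what happens behind the Start button, here is a minimal sketch of the level-wise Apriori idea in plain Python. It is not WEKA's implementation, and the toy transactions and function names are hypothetical. It uses the example thresholds from Step 3 (support 0.4, confidence 0.8).

```python
from itertools import combinations

# Hypothetical toy transactions (not the lab dataset).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "bread"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions)


def frequent_itemsets(transactions, min_support):
    """Level-wise search; returns {frozenset: absolute count}."""
    n = len(transactions)
    level = [frozenset([i]) for i in sorted(set().union(*transactions))]
    frequent = {}
    while level:
        kept = {c: count(c, transactions) for c in level}
        kept = {c: k for c, k in kept.items() if k / n >= min_support}
        frequent.update(kept)
        # Join: merge frequent k-itemsets into (k+1)-item candidates.
        joined = {a | b for a in kept for b in kept if len(a | b) == len(a) + 1}
        # Prune: keep a candidate only if all of its k-subsets are frequent.
        level = [c for c in joined
                 if all(frozenset(s) in kept for s in combinations(c, len(c) - 1))]
    return frequent


def association_rules(frequent, n, min_confidence):
    """Split each frequent itemset into antecedent -> consequent rules."""
    rules = []
    for itemset, k in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(sorted(itemset), r)):
                confidence = k / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, k / n, confidence))
    return rules


frequent = frequent_itemsets(transactions, min_support=0.4)
rules = association_rules(frequent, len(transactions), min_confidence=0.8)
for a, b, sup, conf in rules:
    print(sorted(a), "->", sorted(b), f"support={sup:.2f} confidence={conf:.2f}")
```

On this toy data, seven itemsets are frequent and two rules pass the confidence threshold: butter → milk and {bread, butter} → milk.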

Understanding Output

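  • Rule format: A → B (antecedent → consequent)
  • Support: fraction of transactions containing both A and B
  • Confidence: fraction of transactions containing A that also contain B
  • Lift: confidence divided by the support of B (lift > 1 suggests a positive association)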
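
The three values can be checked by hand for any single rule. The sketch below computes them from raw transactions using the standard definitions; the toy data and the function name rule_metrics are hypothetical, and the function assumes the antecedent and the consequent each occur at least once.

```python
# Hypothetical toy transactions (not the lab dataset).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "bread"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]


def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    count_a = sum(antecedent <= t for t in transactions)
    count_b = sum(consequent <= t for t in transactions)
    count_ab = sum((antecedent | consequent) <= t for t in transactions)
    support = count_ab / n             # share of rows containing A and B
    confidence = count_ab / count_a    # share of A-rows that also contain B
    lift = confidence / (count_b / n)  # confidence relative to B's base rate
    return support, confidence, lift


support, confidence, lift = rule_metrics({"butter"}, {"milk"}, transactions)
print(f"butter -> milk: support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

Here butter appears in 3 of 5 transactions, always together with milk, and milk appears in 4 of 5, so support is 0.60, confidence is 1.00, and lift is 1.25.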

Recording Results Table

  Experiment | MinSup | MinConf | #Rules
  Exp1       | 0.4    | 0.8     |
  Exp2       | 0.3    | 0.8     |
  Exp3       | 0.3    | 0.7     |

Experiment 1: Vary Support

  • Keep confidence constant
  • Decrease support gradually
  • Observe rule growth

Experiment 2: Vary Confidence

  • Keep support constant
  • Decrease confidence gradually
  • Observe rule changes

Rule Quality Evaluation

  • High confidence ≠ high usefulness
  • Check lift value
  • Remove redundant rules
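
A short worked example shows why confidence alone can mislead. The counts below are invented for illustration: item B is in almost every transaction, so a rule A → B reaches high confidence even though knowing A slightly lowers the chance of seeing B.

```python
# Hypothetical counts, chosen only to illustrate the point.
n_total = 1000  # all transactions
n_a = 200       # transactions containing A
n_b = 950       # transactions containing B
n_ab = 180      # transactions containing both A and B

confidence = n_ab / n_a          # share of A-transactions that contain B
baseline = n_b / n_total         # share of all transactions that contain B
lift = confidence / baseline     # below 1 means A makes B less likely

print(f"confidence={confidence:.3f} baseline={baseline:.3f} lift={lift:.3f}")
```

The rule passes a 0.8 confidence threshold (0.900), yet its lift is below 1 (about 0.947), so it says less than simply predicting B for everyone.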

Common Observations

  • Lower thresholds → more rules
  • Risk of weak or noisy rules
  • Balance between quality and quantity

Practical Questions

  • What happens when support is too low?
  • How many rules are useful?
  • Which rules have the highest lift?

Comparison with FP-Growth

  • FP-Growth is usually faster than Apriori on large datasets
  • No candidate generation (itemsets are mined from an FP-tree)
  • Similar output structure (frequent itemsets and rules)
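
The FP-tree behind these points can be sketched briefly. The code below covers only the first phase of FP-Growth, building the tree from hypothetical toy transactions; the mining phase (conditional trees) is omitted, and the class and function names are illustrative.

```python
from collections import Counter

# Hypothetical toy transactions (not the lab dataset).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "bread"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]


class Node:
    """One FP-tree node: an item, a count, and child nodes keyed by item."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_count):
    """Insert each transaction as a path of its frequent items."""
    counts = Counter(item for t in transactions for item in t)
    frequent = {i for i, c in counts.items() if c >= min_count}
    root = Node(None)
    for t in transactions:
        # Most frequent items first, so common prefixes share nodes.
        path = sorted((i for i in t if i in frequent),
                      key=lambda i: (-counts[i], i))
        node = root
        for item in path:
            if item not in node.children:
                node.children[item] = Node(item, node)
            node = node.children[item]
            node.count += 1
    return root


def show(node, depth=0):
    """Print the tree with indentation that reflects depth."""
    for child in node.children.values():
        print("  " * depth + f"{child.item}:{child.count}")
        show(child, depth + 1)


root = build_fp_tree(transactions, min_count=2)
show(root)
```

The five transactions collapse into five tree nodes because shared prefixes such as bread, milk are stored once with a count, which is where the savings over repeated candidate counting come from.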

Applications Discussion

  • Market basket analysis
  • Recommendation systems
  • Healthcare pattern analysis

Common Mistakes in Lab

  • Ignoring data preprocessing
  • Using extremely low thresholds
  • Misinterpreting lift

Mini Project Assignment

  • Apply Apriori to a new dataset
  • Test three different threshold settings
  • Compare the resulting rule sets
  • Submit a short report

Summary of Lab

  • WEKA simplifies rule generation
  • Threshold tuning is critical
  • Interpret rules carefully
  • Validate meaningful patterns

Effect of Support Threshold (Lab Example)

Effect of Confidence Threshold (Lab Example)