Lazy Learner & Rule Based Classification

Unit-3

Introduction

  • In the lazy learner approach, the learner waits until the last minute before constructing any model, doing so only when it must classify a given test tuple.
  • That is, when given a training tuple, a lazy learner simply stores it (or does only minor processing) and waits until it is given a test tuple.

Example: KNN Classification Algorithm
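
To illustrate the lazy behaviour described in the introduction, here is a minimal sketch of k-nearest-neighbour classification. It is an illustration only: the function name knn_classify, the toy training tuples, the Euclidean distance, and k = 3 are all assumptions made for this example. Note that "training" is just storing the tuples; all the work happens when a test tuple arrives.

```python
import math
from collections import Counter

def knn_classify(training, query, k=3):
    """Classify `query` by majority vote among its k nearest stored tuples."""
    # No model was built in advance: distances are computed only now,
    # at classification time (this is what makes the learner "lazy").
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical stored training tuples: (feature vector, class label).
training = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((2.0, 1.5), "A"),
    ((6.0, 6.0), "B"),
    ((6.5, 7.0), "B"),
    ((7.0, 6.5), "B"),
]

print(knn_classify(training, (2.0, 2.0)))  # prints "A"
print(knn_classify(training, (6.0, 7.0)))  # prints "B"
```

Choosing an odd k for a two-class problem, as here, avoids tied votes.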

Rule Based Classification

  • The rule-based classifier is a well-known data mining technique.
  • Rules are a good way of representing information and can easily be read and understood.
  • The efficiency of a rule-based classifier depends on factors such as the quality of the rules, rule ordering, and properties of the rule set.
  • The idea behind rule-based classifiers is to find regularities and different scenarios in the data and express them as IF-THEN rules.
  • A collection of IF-THEN rules is used for classification and for predicting the outcome.

An IF-THEN rule has the form:

IF condition THEN conclusion

Let us consider a rule R1:

R1: IF age = youth THEN buy_computer = yes

Points to remember:

  • The IF part of the rule is called the rule antecedent or precondition.
  • The THEN part of the rule is called the rule consequent.
  • The antecedent (the condition) consists of one or more attribute tests, which are logically ANDed.
  • The consequent consists of a class prediction.

IF condition1 AND condition2 THEN conclusion
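
The IF-THEN form above can be sketched in code as follows. This is a minimal illustration, not a standard API: the names make_rule, rule_fires and classify, the dictionary representation, the first-match rule ordering, and rule R2 are assumptions made for the example. Only R1 comes from the text.

```python
def make_rule(conditions, conclusion):
    """conditions: {attribute: required value}, all ANDed together.
    conclusion: (class attribute, predicted value)."""
    return {"conditions": conditions, "conclusion": conclusion}

def rule_fires(rule, record):
    # The antecedent holds only if every attribute test is satisfied (logical AND).
    return all(record.get(attr) == value
               for attr, value in rule["conditions"].items())

def classify(rules, record, default=None):
    # Assumption: rules are tried in order and the first one that fires decides.
    for rule in rules:
        if rule_fires(rule, record):
            return rule["conclusion"][1]
    return default  # no rule covers this record

# R1 is taken from the text; R2 is a hypothetical two-condition rule.
r1 = make_rule({"age": "youth"}, ("buy_computer", "yes"))
r2 = make_rule({"age": "senior", "credit_rating": "fair"}, ("buy_computer", "no"))
rules = [r1, r2]

print(classify(rules, {"age": "youth", "credit_rating": "fair"}))       # prints "yes"
print(classify(rules, {"age": "senior", "credit_rating": "fair"}))      # prints "no"
print(classify(rules, {"age": "senior", "credit_rating": "excellent"})) # prints "None"
```

The last record matches only one of R2's two tests, so R2 does not fire and the default is returned.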

Rule Extraction

Building Classification Rules:

  • Sequential covering algorithms are the most widely used rule-based data mining algorithms.
  • In this kind of algorithm, rules are learned sequentially, one at a time. Ideally, each rule covers as many records of one class as possible and none of the records of the other classes.
  • Once a rule is learned, the records it covers are removed, and the process repeats on the remaining data.
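
The covering loop described above can be sketched as follows. This is a simplified illustration under stated assumptions: learn_one_rule is a hypothetical stand-in that picks the single attribute test with the highest accuracy on the remaining records (ties broken by the number of correctly covered records), and the toy records are invented for the example.

```python
def learn_one_rule(records, target_class, attributes):
    """Return the (attribute, value) test most accurate for target_class, or None."""
    best = None  # (score, attribute, value)
    for attr in attributes:
        # sorted() keeps the search order, and so the result, deterministic.
        for value in sorted({r[attr] for r in records}):
            covered = [r for r in records if r[attr] == value]
            correct = sum(1 for r in covered if r["class"] == target_class)
            score = (correct / len(covered), correct)
            if correct and (best is None or score > best[0]):
                best = (score, attr, value)
    return None if best is None else (best[1], best[2])

def sequential_covering(records, target_class, attributes):
    rules, remaining = [], list(records)
    # Learn one rule at a time while records of the target class remain.
    while any(r["class"] == target_class for r in remaining):
        rule = learn_one_rule(remaining, target_class, attributes)
        if rule is None:
            break
        rules.append(rule)
        attr, value = rule
        # Remove the records covered by the rule just learned.
        remaining = [r for r in remaining if r[attr] != value]
    return rules

# Hypothetical training records.
records = [
    {"age": "youth",  "student": "yes", "class": "yes"},
    {"age": "youth",  "student": "no",  "class": "yes"},
    {"age": "senior", "student": "yes", "class": "yes"},
    {"age": "senior", "student": "no",  "class": "no"},
    {"age": "middle", "student": "no",  "class": "no"},
]

learned = sequential_covering(records, "yes", ["age", "student"])
print(learned)  # prints [('age', 'youth'), ('student', 'yes')]
```

The second rule is found only because the records covered by the first rule were removed before the search was repeated.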

Step 1: Rule Growing

Start from an empty rule. Grow the rule using the 1R algorithm so that it covers the majority of the records of the class.
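
As a reminder of how 1R works, here is a minimal sketch: for each attribute, every value is mapped to its majority class, and the attribute whose rule set makes the fewest errors is kept. The function name one_r, the "class" key, and the toy weather-style records are assumptions made for this example.

```python
from collections import Counter, defaultdict

def one_r(records, attributes, class_key="class"):
    """Return (best attribute, {value: predicted class}, error rate)."""
    best = None  # (attribute, rules, error count)
    for attr in attributes:
        # Count class frequencies for each value of this attribute.
        counts = defaultdict(Counter)
        for r in records:
            counts[r[attr]][r[class_key]] += 1
        # One rule per value: predict the majority class for that value.
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Errors are the records that do not belong to the majority class.
        errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    attr, rules, errors = best
    return attr, rules, errors / len(records)

# Hypothetical training records.
weather = [
    {"outlook": "sunny",    "windy": "no",  "class": "yes"},
    {"outlook": "sunny",    "windy": "yes", "class": "yes"},
    {"outlook": "sunny",    "windy": "no",  "class": "no"},
    {"outlook": "rainy",    "windy": "no",  "class": "no"},
    {"outlook": "rainy",    "windy": "yes", "class": "no"},
    {"outlook": "overcast", "windy": "no",  "class": "yes"},
    {"outlook": "overcast", "windy": "yes", "class": "yes"},
]

attr, value_rules, error_rate = one_r(weather, ["outlook", "windy"])
print(attr)         # prints "outlook"
print(value_rules)
print(error_rate)
```

Here "outlook" misclassifies one record out of seven, while "windy" misclassifies three, so "outlook" is selected.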

Step 2: Instance Elimination

Remove the records covered by the previous rule. This step ensures that the following rule will differ from the previous one. It improves the accuracy of the rule as well.

Step 3: Rule Evaluation

Evaluate each rule’s accuracy. Repeat the above two steps until a stopping criterion is met.
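
A rule's accuracy can be computed as the fraction of the records it covers that it classifies correctly. The sketch below also reports coverage (the fraction of all records the rule covers), a commonly used companion measure that the text does not mention. The function name evaluate_rule and the toy records are assumptions made for this example.

```python
def evaluate_rule(conditions, predicted_class, records, class_key="class"):
    """Return (coverage, accuracy) of the rule IF conditions THEN predicted_class."""
    # A record is covered when it satisfies every attribute test of the rule.
    covered = [r for r in records
               if all(r.get(a) == v for a, v in conditions.items())]
    correct = sum(1 for r in covered if r[class_key] == predicted_class)
    coverage = len(covered) / len(records)
    # A rule that covers nothing is given accuracy 0.0 by convention here.
    accuracy = correct / len(covered) if covered else 0.0
    return coverage, accuracy

# Hypothetical records for the rule IF age = youth THEN class = yes.
sample = [
    {"age": "youth",  "class": "yes"},
    {"age": "youth",  "class": "no"},
    {"age": "senior", "class": "yes"},
    {"age": "middle", "class": "no"},
]

print(evaluate_rule({"age": "youth"}, "yes", sample))  # prints (0.5, 0.5)
```

The rule covers two of the four records and is right about one of those two, so both coverage and accuracy are 0.5.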

Step 4: Stopping Criteria

If the accuracy of the rule is not up to the mark, discard that rule.

Step 5: Rule Pruning

Calculate the error rate at every step, as in the 1R algorithm. If the error rate increases, prune the rule, then compare the error rate before and after pruning and keep the better version. If pruning is unnecessary, add the rule to the existing rule set as it is.
