1 of 42

SRI SIDDHARTHA INSTITUTE OF TECHNOLOGY - TUMAKURU (A constituent College of Sri Siddhartha Academy of Higher Education, Tumakuru)

Pattern Recognition

By: Savitha C

Assistant Professor

Dept. of ECE


2 of 42

Pattern Recognition/Classification

  • Assign an object or an event (pattern) to one of several known categories (or classes).

Courtesy: https://www.section.io/engineering-education/understanding-pattern-recognition-in-machine-learning/


3 of 42

  • Pattern recognition is the scientific discipline whose goal is the classification of objects into a number of categories or classes.
  • Depending on the application, these objects can be images or signal waveforms or any type of measurements that need to be classified.
  • These objects are referred to using the generic term patterns.


4 of 42

Introduction to Pattern Recognition

    • Recognize a face.
    • Understand spoken words.
    • Read handwritten characters.
    • Identify car keys in our pocket by feel.
    • Decide whether a fruit is ripe by its smell.

Pattern recognition is the act of taking in raw data and taking an action based on the category of the pattern.


5 of 42

1.1 Machine Perception

1.2 An Example

1.3 The Design Cycle

1.4 Pattern Recognition Systems

1.5 Learning and Adaptation

1.6 Conclusion


6 of 42

Machine Perception

  • Build a machine that can recognize patterns:

    • Speech recognition
    • Fingerprint identification
    • OCR (Optical Character Recognition)
    • DNA sequence identification

    • The design can be guided by knowledge of how these problems are solved in nature, i.e., by human beings.


7 of 42

An Example

  • An imaginary, somewhat fanciful example illustrates the complexity of the problem.
  • Automate the process of sorting incoming fish on a conveyor belt according to species.
  • The sorting uses optical sensing.

Species: sea bass or salmon

Courtesy Pattern Recognition and Image Analysis by Earl Gose


8 of 42

  • FEATURES for problem analysis
    • Explore features to differentiate between the two types of fish.
    • Set up a camera and take some sample images to extract features:
      • Length
      • Lightness
      • Width
      • Number and shape of fins
      • Position of the mouth, etc.
  • Noise or variations in the images come from:
    • Variations in lighting and in the position of the fish on the conveyor.
    • The electronics of the camera.


9 of 42

  • MODEL
    • There are true differences between the populations of sea bass and salmon.
    • Different models give different descriptions, typically mathematical in form.
    • Process the sensed data to eliminate noise.

  • PREPROCESSING
    • Use a segmentation operation to isolate the fish from one another and from the background.
    • Information from a single fish is sent to a feature extractor, whose purpose is to reduce the data by measuring certain features.
    • The features are passed to a classifier.


10 of 42


Courtesy: Pattern Recognition and Image Analysis by Earl Gose

11 of 42

  • Preprocessing: simplify subsequent operations without losing information.
        • Adjust for the average light level.
        • Remove the background of the conveyor belt.
  • Segmentation: images of different fish are isolated from one another and from the background.
  • Feature extraction: reduce the data by measuring certain features/properties.
  • Classifier: evaluates the evidence and makes a final decision on the species.
  • A sea bass is generally longer than a salmon.
  • Length is therefore a candidate feature.
  • Classify by checking whether the length l of a fish exceeds a critical value l*.
  • To choose l*, obtain design/training samples of sea bass and salmon.
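
The length rule can be sketched in a few lines of Python. This is only an illustration: the sample lengths are made up, and trying the midpoints between sorted samples is one simple way to pick l*, not necessarily the procedure used in the referenced textbook.

```python
# Hypothetical training lengths in cm; the values are made up for illustration.
salmon_lengths = [20.0, 22.5, 24.0, 25.5, 27.0, 30.0]
sea_bass_lengths = [24.5, 28.0, 29.5, 31.0, 33.0, 35.5]


def training_errors(threshold, salmon, sea_bass):
    """Count mistakes for the rule: length > threshold means sea bass."""
    salmon_called_bass = sum(1 for l in salmon if l > threshold)
    bass_called_salmon = sum(1 for l in sea_bass if l <= threshold)
    return salmon_called_bass + bass_called_salmon


def choose_threshold(salmon, sea_bass):
    """Try the midpoints between sorted sample lengths; keep the best one."""
    values = sorted(set(salmon + sea_bass))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return min(candidates, key=lambda t: training_errors(t, salmon, sea_bass))


l_star = choose_threshold(salmon_lengths, sea_bass_lengths)
errors = training_errors(l_star, salmon_lengths, sea_bass_lengths)
print(f"l* = {l_star:.2f} cm, training errors = {errors} of 12")
```

With these made-up samples the two classes overlap, so even the best threshold leaves some training errors.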


12 of 42

  • A histogram is a graphical display of data using bars of different heights.
  • It is the most commonly used graph for showing frequency distributions.
  • Taller bars show that more data falls in that range.
  • A histogram displays the shape and spread of continuous sample data.
  • A frequency distribution shows how often each different value in a set of data occurs.
  • When to use a histogram:
  • To see the shape of the data’s distribution.
  • To see whether a process change has occurred from one time period to another.
  • To determine whether the outputs of two or more processes are different.
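
A histogram is just a count of samples per bin. The sketch below uses made-up fish lengths and an assumed bin width of 5 cm, and prints a small text histogram using only the standard library.

```python
from collections import Counter

# Hypothetical fish lengths in cm, made up for illustration.
lengths = [20.0, 22.5, 24.0, 24.5, 25.5, 27.0, 28.0, 29.5, 30.0, 31.0, 33.0, 35.5]


def histogram(values, bin_start, bin_width):
    """Return {bin_left_edge: count} for equal-width bins."""
    counts = Counter()
    for v in values:
        index = int((v - bin_start) // bin_width)
        counts[bin_start + index * bin_width] += 1
    return dict(sorted(counts.items()))


hist = histogram(lengths, bin_start=20.0, bin_width=5.0)
for left_edge, count in hist.items():
    # One '#' per sample, so taller (longer) bars mean more data in that range.
    print(f"{left_edge:4.0f}-{left_edge + 5.0:<4.0f} {'#' * count}")
```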


13 of 42

  • The histograms are disappointing: sea bass are only somewhat longer than salmon.
  • A single criterion is poor: no matter how we choose l*, we cannot reliably separate sea bass from salmon.
  • Length alone is a poor feature!
  • Select lightness as another possible feature, with threshold x*.
  • So far we have assumed that deciding the fish was a sea bass when in fact it was a salmon is just as undesirable as the converse.


14 of 42

Cost of misclassifications

  • Consider the fish classification example.
  • There are two possible classification errors:

(1) Deciding the fish was a sea bass when it was a salmon.

(2) Deciding the fish was a salmon when it was a sea bass.

  • Are both errors equally important?


15 of 42

Cost of misclassifications (cont’d)

  • Suppose that:
    • Customers who buy salmon will object vigorously if they see sea bass in their cans.
    • Customers who buy sea bass will not be unhappy if they occasionally see some expensive salmon in their cans.

  • How does this knowledge affect our decision?
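
One way to see the effect is to attach a cost to each kind of error and pick the threshold with the lowest total cost on the training samples. The sketch below is an illustration under stated assumptions: the lengths are made up, and the cost ratio of 5 to 1 is an arbitrary choice standing in for "salmon customers object vigorously".

```python
# Hypothetical training lengths in cm (made up for illustration).
salmon = [20.0, 22.5, 24.0, 25.5, 27.0, 30.0]
sea_bass = [24.5, 28.0, 29.5, 31.0, 33.0, 35.5]


def total_cost(threshold, cost_salmon_as_bass, cost_bass_as_salmon):
    """Cost of the rule 'length > threshold means sea bass' on the samples."""
    salmon_as_bass = sum(1 for x in salmon if x > threshold)
    bass_as_salmon = sum(1 for x in sea_bass if x <= threshold)
    return (cost_salmon_as_bass * salmon_as_bass
            + cost_bass_as_salmon * bass_as_salmon)


def best_threshold(cost_salmon_as_bass, cost_bass_as_salmon):
    """Pick the midpoint threshold with the lowest total cost."""
    values = sorted(salmon + sea_bass)
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return min(
        candidates,
        key=lambda t: total_cost(t, cost_salmon_as_bass, cost_bass_as_salmon),
    )


equal = best_threshold(1, 1)    # both errors equally costly
unequal = best_threshold(1, 5)  # sea bass in a salmon can is 5x as costly (assumed)
print(f"equal costs: {equal} cm, unequal costs: {unequal} cm")
```

When putting sea bass into salmon cans is the costlier error, the threshold moves toward smaller lengths, so fewer sea bass are labeled salmon.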


16 of 42

  • We might add other features that are not correlated with the ones we already have.
  • Take care not to reduce performance by adding “noisy features”.
  • Some features might be redundant (e.g., eye color).
  • Obtaining more features adds difficulty and computational cost (high dimensionality).
  • Ideally, the best decision boundary is the one that provides optimal performance, such as in the following figure:


17 of 42

Issue of generalization!

  • However, our satisfaction is premature, because the central aim of designing a classifier is to correctly classify novel input, i.e., fish not yet seen.
  • It is unlikely that such a complex decision boundary would generalize well; it seems to be tuned to the particular training samples.
  • One approach is to get more training samples to obtain a better estimate of the true underlying characteristics (the probability distributions).
  • Alternatively, we seek to simplify the recognizer, in the belief that the underlying models will not require a decision boundary that is very complex.
  • Indeed, we may be satisfied with slightly poorer performance on the training samples if the classifier performs better on novel patterns.


18 of 42

Would it be possible to build a “general purpose” PR system?

  • It would be very difficult to design a system that is capable of performing a variety of classification tasks.

    • Different problems require different features.
    • Different features might yield different solutions.
    • Different tradeoffs exist for different problems.


19 of 42

  • Aim for a good representation.
  • Structural relationships should be simply and naturally revealed.
  • Patterns are often represented as vectors of real-valued numbers.
  • Patterns leading to the same action are close to one another and far from those that demand a different action.
  • Favor features which lead to
      • Simpler decision regions
      • A classifier that is easy to train
      • Robustness (insensitivity to noise/errors)
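
To make "close to one another" concrete, the sketch below represents each fish as a (lightness, width) vector and assigns a new fish to the class whose mean vector is nearest. The feature values are made up, and the nearest-mean rule is just one simple classifier chosen for illustration.

```python
import math

# Hypothetical (lightness, width) feature vectors; values are made up.
training = {
    "salmon": [(2.0, 10.0), (3.0, 11.0), (2.5, 9.5)],
    "sea bass": [(6.0, 14.0), (7.0, 15.0), (6.5, 13.0)],
}


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(vectors)
    return tuple(sum(component) / n for component in zip(*vectors))


def classify(x, class_means):
    """Return the label whose class mean is closest in Euclidean distance."""
    return min(class_means, key=lambda label: math.dist(x, class_means[label]))


class_means = {label: mean_vector(vs) for label, vs in training.items()}
print(classify((3.0, 10.5), class_means))
print(classify((6.0, 13.5), class_means))
```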


20 of 42

Analysis by synthesis

  • When we have insufficient training data, incorporate knowledge of the problem domain.
  • The less training data there is, the more important this knowledge becomes.
  • Use knowledge of how the patterns are produced.
  • E.g., speech recognition: “dee” is uttered differently by different people.
  • It is produced by lowering the jaw slightly, opening the mouth and placing the tongue tip against the roof of the mouth.
  • Male, female, old and young speakers have different pitch.
  • E.g., recognizing all types of chair.
  • Standard office chair, living room chair, bean bag chair, etc.
  • Variety in the number of legs, material, shape and so on.
  • The unifying aspect is functional (a stable artifact that supports a human sitter).
  • OCR: handwritten characters as a sequence of strokes.


21 of 42

Related Fields

  • Image processing: preserves the original information
  • Pattern classification: extracts features and loses information
  • Associative memory: reduces information

Closely interrelated fields

  • Regression: seeks some functional description of the data, with the goal of predicting values for new input
  • E.g., linear regression (length varies linearly with age)
  • Interpolation: the function is known or easily deduced for certain ranges of input, and is inferred for the ranges in between
  • Density estimation: estimate the probability that a member of a certain category will be found to have particular features.
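
The linear regression example (length varying linearly with age) can be sketched with an ordinary least-squares fit. The (age, length) pairs below are made up for illustration and chosen to lie exactly on a line.

```python
# Hypothetical (age in years, length in cm) pairs; made up for illustration.
ages = [1.0, 2.0, 3.0, 4.0, 5.0]
lengths = [12.0, 17.0, 22.0, 27.0, 32.0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(ages, lengths)
predicted = slope * 6.0 + intercept  # predict the length for a new input (age 6)
print(f"length = {slope:.1f} * age + {intercept:.1f}; at age 6: {predicted:.1f} cm")
```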


22 of 42

Pattern Recognition Systems

  • Sensing
    • Uses a transducer (camera or microphone).
    • The PR system depends on the bandwidth, resolution, sensitivity, signal-to-noise ratio and distortion of the transducer.
    • The sensor converts images or sounds into signal data.

  • Segmentation and grouping
    • Patterns should be well separated and should not overlap.


23 of 42


Courtesy Pattern Recognition and Image Analysis by Earl Gose

24 of 42

  • Post Processing
    • Context: exploit input-dependent information other than from the target pattern itself to improve system performance.
    • The simplest measure of classifier performance is the error rate: the percentage of new patterns that are assigned to the wrong category. The aim is to minimize it.
    • Risk: more generally, minimize the total expected cost.
    • Multiple classifiers: each classifier operates on different aspects of the input, and combining them can improve recognition.
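
A minimal sketch of combining multiple classifiers by majority vote follows. Each classifier looks at a different aspect of the input; the three rules, their thresholds and the sample fish are all hypothetical, and majority voting is only one of several possible combination schemes.

```python
from collections import Counter


# Three hypothetical single-feature classifiers; thresholds are made up.
def by_length(fish):
    return "sea bass" if fish["length"] > 27.5 else "salmon"


def by_lightness(fish):
    return "sea bass" if fish["lightness"] > 5.0 else "salmon"


def by_width(fish):
    return "sea bass" if fish["width"] > 12.0 else "salmon"


def majority_vote(fish, classifiers):
    """Return the label chosen by most classifiers (odd count avoids ties)."""
    votes = Counter(classifier(fish) for classifier in classifiers)
    return votes.most_common(1)[0][0]


classifiers = [by_length, by_lightness, by_width]
fish = {"length": 26.0, "lightness": 6.2, "width": 13.5}
print(majority_vote(fish, classifiers))
```

Here the length rule alone would say salmon, but the other two rules outvote it.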


25 of 42

The Design Cycle

  • Data collection
  • Feature Choice
  • Model Choice
  • Training
  • Evaluation
  • Computational Complexity


26 of 42

Pattern Recognition


Courtesy: https://laptrinhx.com/pattern-recognition-basics-5439501/

27 of 42

  • Data Collection

  • Data collection is a large part of the cost of developing PR systems.
  • More data is needed for good performance.
  • How do we know when we have collected an adequately large and representative set of examples for training and testing the system?


28 of 42

  • Feature Choice

  • Choosing distinguishing features is a critical design step
  • Use prior knowledge (analysis by synthesis).

    • The choice depends on the characteristics of the problem domain.
    • Features should be simple to extract.
    • Features should be invariant to irrelevant transformations and insensitive to noise.


29 of 42

  • Model Choice

    • We may be unsatisfied with the performance of our fish classifier and want to jump to another class of model.
    • For instance, one based on some function of the number and position of fins, the color of the eyes, the weight, the shape of the mouth and so on.
    • Reject one class of models and try another.


30 of 42

  • Training

    • Use data to determine the classifier.
    • Many different procedures exist for training classifiers and choosing models.
    • There is no universal method that solves all these problems.
    • An effective method is learning from example patterns.


31 of 42

  • Evaluation

    • Measure the error rate.
    • Important to measure performance – to identify the need for improvements.
    • Switch from one set of features to another one.

  • Overfitting
    • Overly complex system – perfect classification of samples.
    • Unlikely to perform well on new patterns.
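
The gap between training and test error can be illustrated with a toy comparison. In the sketch below, a 1-nearest-neighbor "memorizer" stands in for an overly complex system and a single threshold stands in for a simple one. All lightness values are made up, and the data set was constructed by hand to show the effect, so it is not evidence about real fish.

```python
# Hypothetical (lightness, label) samples; values are made up, and the
# salmon at 6.0 plays the role of a noisy training sample.
train = [(1.0, "salmon"), (2.0, "salmon"), (3.0, "salmon"), (6.0, "salmon"),
         (5.0, "sea bass"), (7.0, "sea bass"), (8.0, "sea bass"), (9.0, "sea bass")]
test = [(1.5, "salmon"), (2.5, "salmon"), (3.5, "salmon"),
        (5.8, "sea bass"), (6.4, "sea bass"), (7.5, "sea bass")]


def simple_rule(x):
    """Single threshold: a deliberately simple decision boundary."""
    return "sea bass" if x > 4.0 else "salmon"


def memorizer(x):
    """1-nearest-neighbor: copies the label of the closest training sample."""
    return min(train, key=lambda sample: abs(sample[0] - x))[1]


def error_rate(classifier, samples):
    """Fraction of samples assigned to the wrong category."""
    wrong = sum(1 for x, label in samples if classifier(x) != label)
    return wrong / len(samples)


for name, clf in [("simple rule", simple_rule), ("memorizer", memorizer)]:
    print(f"{name}: train error {error_rate(clf, train):.3f}, "
          f"test error {error_rate(clf, test):.3f}")
```

On this constructed data the memorizer classifies every training sample correctly but makes errors on the test set, while the simple rule misses one training sample and none of the test samples.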


32 of 42

  • Computational Complexity

    • What is the trade-off between computational ease and performance?
    • How does an algorithm scale as a function of the number of features, patterns or categories?

    • Some PR problems can be solved in principle by algorithms that are highly impractical.

    • E.g., label all possible 20 × 20 binary pixel images for OCR.
    • Use a lookup table to classify incoming patterns.
    • In theory this gives error-free recognition, but the labeling time and storage requirements are prohibitive.
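
A quick calculation shows why the lookup table is impractical: a 20 × 20 binary image has 400 pixels, so there are 2^400 distinct images. The one-byte-per-entry storage figure below is an assumption made only to give a sense of scale.

```python
pixels = 20 * 20
num_images = 2 ** pixels  # each pixel is either 0 or 1

# Assumption for scale: one byte of storage per table entry (the class label).
table_bytes = num_images

# Python integers are arbitrary precision, so the digit count is exact.
exponent = len(str(num_images)) - 1
print(f"{pixels} pixels -> about 10^{exponent} distinct images to label and store")
```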


33 of 42

Learning and Adaptation

  • Supervised learning
    • A teacher provides a category label or cost for each pattern in the training set.
    • Open question: is the learning algorithm powerful enough to learn the solution to a given problem?
  • Unsupervised learning
    • The system forms clusters or “natural groupings” of the input patterns.
    • There is no explicit teacher.

  • Reinforcement learning
    • Present an input and compute a tentative category label; no target category label is given.
    • The only feedback is binary: the tentative label is correct or not.
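
As an illustration of unsupervised learning, the sketch below groups unlabeled one-dimensional measurements into two clusters with a basic k-means loop. The lightness values are made up, k-means is just one possible clustering method, and the code assumes the data contains at least two distinct values.

```python
def two_means(values, iterations=10):
    """Split 1-D values into two clusters with a basic k-means loop.

    Assumes at least two distinct values, so neither cluster ends up empty.
    """
    centers = [min(values), max(values)]  # simple deterministic start
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            # Assign each value to the nearer of the two current centers.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            clusters[nearest].append(v)
        # Move each center to the mean of the values assigned to it.
        centers = [sum(c) / len(c) for c in clusters]
    return centers, clusters


# Hypothetical unlabeled lightness measurements (no teacher, no labels).
lightness = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
centers, clusters = two_means(lightness)
print(centers, clusters)
```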


34 of 42

Conclusion

  • The reader may feel overwhelmed by the number, complexity and magnitude of the sub-problems of Pattern Recognition

  • Many of these sub-problems can indeed be solved

  • Many fascinating unsolved problems still remain


35 of 42

Applications of Pattern Recognition

  • PR is the automatic detection/classification of objects or events.
  • Automated analysis of medical images obtained from microscopes and CAT scanners, magnetic resonance images, X-rays and photographs.
  • Automatic inspection of parts on an assembly line.
  • Human speech recognition by computers.
  • Automatic grading of plywood, steel and other sheet material.
  • Classification of seismic signals for oil and mineral exploration and earthquake prediction.
  • Identification of people from fingerprints, hand shape and size, retinal scans, voice characteristics, typing patterns and handwriting.
  • Automatic inspection of printed circuits and printed character and handwriting recognition.
  • Automatic analysis of satellite pictures to determine the type and condition of agricultural crops, weather conditions, snow and water reserves, and mineral prospects.
  • Classification of electrocardiograms into diagnostic categories of heart disease.
  • Detection of spikes in electroencephalograms and other medical waveform analysis


36 of 42

  • Measurements used to classify the objects are called features.
  • The categories into which they are classified are called classes.
  • Automatic classification of objects using the same features that people use can be a difficult task.
  • Sometimes features that would be impossible or difficult for humans to estimate are useful in automated systems.
  • E.g., systems that analyze satellite images can use the wavelength of light.
  • Clustering – unsupervised learning – naturally occurring groups.
  • The individual objects or situations to be classified are called samples.
  • Set of samples used in system design – training set.
  • Data set for testing the system – test set.
  • Sometimes the class of an object cannot be determined by any absolute criterion but depends on opinion of experts.
  • Data should be classified by several experts independently and their results pooled.


37 of 42

  • E.g., four hematologists classified 1041 white blood cells into 8 categories.
  • On average, each of them disagreed with the majority opinion in 7.97% of cases.
  • Suppose we build a machine for this task.
  • We may not be able to design a system that performs much better than the best expert, who disagreed in 5.41% of cases.
  • Possibly the images do not contain sufficient information for classification, or the class definitions are vague.
  • With 3 classes (small, medium and large lymphocytes) merged, for a total of 5 classes, the disagreement with the majority opinion fell to 0.63%.
  • E.g., detection of spikes in EEG.
  • 5 experts marked spikes in 1-minute, 8-channel recordings from 30 patients.
  • 942 events were marked as spikes by 1 or more experts.
  • 104 events were marked as spikes by all 5 experts (a measure of human reliability).
  • Agreement depends on the severity/obviousness of the spike.
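
The "disagreement with the majority opinion" figures quoted for the blood cell study can be computed as sketched below. The labels here are made up (four experts, six cells, three categories) and only illustrate the calculation; they do not reproduce the study's data.

```python
from collections import Counter

# Hypothetical labels from four experts for six cells (made up; the study
# on the slide used 1041 cells and 8 categories).
expert_labels = [
    ["A", "A", "B", "C", "B", "A"],
    ["A", "A", "B", "C", "C", "A"],
    ["A", "B", "B", "C", "B", "A"],
    ["A", "A", "B", "B", "B", "A"],
]


def majority_opinion(labels_per_expert):
    """Most common label per sample; a tie goes to the label seen first."""
    per_sample = zip(*labels_per_expert)
    return [Counter(votes).most_common(1)[0][0] for votes in per_sample]


def disagreement_rates(labels_per_expert):
    """Fraction of samples on which each expert differs from the majority."""
    majority = majority_opinion(labels_per_expert)
    return [
        sum(1 for own, maj in zip(labels, majority) if own != maj) / len(majority)
        for labels in labels_per_expert
    ]


rates = disagreement_rates(expert_labels)
print([f"{r:.1%}" for r in rates], f"best expert: {min(rates):.1%}")
```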


38 of 42

Statistical Decision Theory

  • Predict the winner of a game in the Hypothetical Basketball Association (HBA).
  • The prediction is based on the difference between the home team’s average number of points per game (apg) and the visiting team’s apg over previous games.
  • The training set consists of the scores of previously played games.
  • dapg = home team apg – visiting team apg.
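
The dapg feature and a one-feature decision rule can be sketched as follows. The game records are made up and are not the data set shown in the slides, and the decision threshold of 0 is an assumption chosen for simplicity.

```python
# Hypothetical past games: (home team apg, visiting team apg, home team won).
# The numbers are made up; they are not the data set shown in the slides.
games = [
    (98.0, 90.5, True),
    (87.0, 95.0, False),
    (101.5, 99.0, True),
    (92.0, 96.5, False),
    (95.0, 94.0, False),
]


def dapg(home_apg, visitor_apg):
    """Difference between the home and visiting teams' average points per game."""
    return home_apg - visitor_apg


def predict_home_win(home_apg, visitor_apg, threshold=0.0):
    """Predict a home win when dapg exceeds the threshold (0.0 is assumed)."""
    return dapg(home_apg, visitor_apg) > threshold


correct = sum(1 for h, v, won in games if predict_home_win(h, v) == won)
print(f"{correct} of {len(games)} training games predicted correctly")
```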


39 of 42

Data set of games with outcomes and differences


Courtesy Pattern Recognition and Image Analysis by Earl Gose

40 of 42

Histogram of dapg


Courtesy Pattern Recognition and Image Analysis by Earl Gose

41 of 42


Courtesy Pattern Recognition and Image Analysis by Earl Gose

42 of 42

Scatter plot


Courtesy: Pattern Recognition and Image Analysis by Earl Gose