Chapter 1: Introduction
Chapter 2: Bayesian Decision Theory
Lab.
Research Activity
Divide Students into groups.
Each group has a specific job.
Programming Tools
Human Perception
Humans have developed highly sophisticated skills for sensing their
environment and acting on what they observe, for example:
Recognizing a face,
Understanding spoken words,
Reading handwriting,
Distinguishing fresh food from its smell.
Machine Perception
Build a machine that can recognize patterns:
Speech recognition
Fingerprint identification
OCR (Optical Character Recognition)
DNA sequence identification
Pattern Samples
Examples of applications
Handwritten: sorting letters by postal code,
input device for PDAs.
Optical Character Recognition (OCR)
Biometrics
Diagnostic systems
Military applications
Features
A feature is any distinctive aspect, quality, or characteristic.
Features may be symbolic (e.g., color) or numeric (e.g., height).
Definitions
The combination of d features is represented as a d-dimensional column vector called a feature vector.
The d-dimensional space defined by the feature vector is
called the feature space.
Objects are represented as points in feature space. This
representation is called a scatter plot.
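The definitions above can be sketched in a few lines of Python. This is only an illustration: the feature names and values are made up, and NumPy is an assumed tool, not one named in these slides.

```python
import numpy as np

# Hypothetical measurements: each row is one object, each column one feature
# (here d = 2: height in cm, weight in kg). The values are illustrative only.
X = np.array([
    [170.0, 65.0],
    [182.0, 80.0],
    [158.0, 52.0],
])

n_objects, d = X.shape          # 3 objects in a 2-dimensional feature space

# The feature vector of the first object, written as a d x 1 column vector.
x = X[0].reshape(d, 1)

print(n_objects, d)             # 3 2
print(x.shape)                  # (2, 1)
```

Plotting the first column of X against the second would give the scatter plot of these three objects in feature space.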
Features (cont.)
Basic Structure
Two basic factors: Feature & Classifier
Feature:
Car
Boundary
Classifier: Mechanisms and methods to define what
the pattern is.
System structure
The feature should be well chosen to describe the pattern.
Knowledge: experience, analysis, trial and error
The classifier should contain the knowledge of each
pattern category, as well as the criterion or metric used to
discriminate among pattern classes.
Knowledge: directly defined or acquired through training
Pattern
A pattern is a composite of features characteristic of
an individual.
In classification tasks, a pattern is a pair of
variables {x, ω}, where x is the feature vector and ω is the class label.
Feature extraction
Task: to extract features which are good for classification.
Good features: Objects from the same class have similar feature values.
Objects from different classes have different values.
[Figure: examples of good features and bad features]
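One way to put a number on "good" versus "bad" features is a class-separation score. The sketch below uses the Fisher criterion for a single feature and two classes; this particular score is an assumption chosen for illustration (the slides do not name one), and the data values are made up.

```python
import numpy as np

def fisher_score(a, b):
    """Separation of two classes along one feature:
    (difference of class means)^2 / (sum of class variances)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return (a.mean() - b.mean()) ** 2 / (a.var() + b.var())

# Made-up feature values for two classes.
good_a, good_b = [1.0, 1.2, 0.9, 1.1], [3.0, 3.1, 2.8, 3.2]   # well separated
bad_a,  bad_b  = [1.0, 3.0, 2.0, 4.0], [1.5, 3.5, 2.5, 3.0]   # heavily overlapping

print(fisher_score(good_a, good_b) > fisher_score(bad_a, bad_b))   # True
```

A high score means values are similar within a class and different between classes, which matches the description of a good feature given above.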
Classifiers
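The task of a classifier is to partition feature space into decision regions, one per class.
A classifier can be described by a set of discriminant functions $g_i(x)$, one per class:
if $g_i(x) > g_j(x)$ for all $j \neq i$, then x is assigned to class i.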
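As a minimal sketch of classification by discriminant functions, the code below assigns x to the class whose discriminant is largest. The specific choice of g_i (negative squared distance to a class mean) and the class means are hypothetical, picked only to make the example concrete.

```python
import numpy as np

def classify(x, discriminants):
    """Return the index i whose discriminant g_i(x) is largest."""
    scores = [g(x) for g in discriminants]
    return int(np.argmax(scores))

# Hypothetical class prototypes (means); g_i(x) = -||x - mean_i||^2, so the
# largest discriminant belongs to the nearest mean.
means = [np.array([0.0, 0.0]), np.array([4.0, 4.0])]
discriminants = [lambda x, m=m: -np.sum((x - m) ** 2) for m in means]

print(classify(np.array([0.5, 1.0]), discriminants))   # 0
print(classify(np.array([3.0, 5.0]), discriminants))   # 1
```

The set of points where two discriminants are equal forms the decision boundary between the corresponding regions.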
... as a set of templates.
Structural (or Syntactic) PR: pattern classes
A comparison
An Example
Sorting incoming fish on a conveyor according to species (salmon or sea bass).
Problem Analysis
Set up a camera and take some sample images to extract
features
Length
Lightness
Width
Number and shape of fins
Position of the mouth, etc
This is the set of all suggested features to explore for use in our
classifier!
Pattern Classification, Chapter 1
Preprocessing
Use a segmentation operation to isolate fishes from one another and from the background.
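A very simple form of segmentation is intensity thresholding. The sketch below separates one bright object from a dark background in a synthetic image; the image, the threshold value, and the use of NumPy are all assumptions for illustration, and separating touching fish from one another would need more than this.

```python
import numpy as np

# Synthetic 6x8 grayscale "image": dark background (0.1) with one bright object (0.9).
image = np.full((6, 8), 0.1)
image[2:5, 3:7] = 0.9

threshold = 0.5                      # assumed; in practice chosen from the data
mask = image > threshold             # True where the pixel belongs to the object

# Bounding box of the segmented object: (first row, last row, first col, last col).
rows, cols = np.nonzero(mask)
bbox = (int(rows.min()), int(rows.max()), int(cols.min()), int(cols.max()))

print(bbox)                          # (2, 4, 3, 6)
print(int(mask.sum()))               # 12
```

Features such as length and width could then be measured on the isolated region rather than on the whole image.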
Classification
Select the length of the fish as a possible feature for
discrimination
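With a single feature such as length, the simplest classifier is a threshold. The sketch below picks the threshold that makes the fewest mistakes on a training set. The lengths are made up, and the convention that the longer fish are sea bass is an assumption of this example.

```python
import numpy as np

def best_threshold(lengths, labels):
    """Pick the length threshold with the fewest training errors.
    Rule assumed here: predict class 1 if length > threshold, else class 0."""
    lengths = np.asarray(lengths, dtype=float)
    labels = np.asarray(labels)
    sorted_l = np.sort(lengths)
    # Candidate thresholds: midpoints between consecutive sorted lengths.
    candidates = (sorted_l[:-1] + sorted_l[1:]) / 2.0
    errors = [np.sum((lengths > t).astype(int) != labels) for t in candidates]
    best = int(np.argmin(errors))
    return float(candidates[best]), int(errors[best])

# Made-up lengths (cm); label 0 = salmon, 1 = sea bass.
lengths = [30, 34, 36, 41, 38, 45, 48, 52]
labels  = [0,  0,  0,  0,  1,  1,  1,  1]

threshold, n_errors = best_threshold(lengths, labels)
print(threshold, n_errors)           # 37.0 1
```

Because the two classes overlap in length in this toy data, no threshold reaches zero errors, which is the usual motivation for trying other features.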
Fish: xT = [x1, x2], where x1 = lightness and x2 = width
We might add other features that are not correlated with the ones we already have.
[Figure panels: underfitting, good fit, overfitting]
Figure: Overly complex models for the fish will lead to decision boundaries that are
complicated. While such a decision may lead to perfect classification of our training
samples, it would lead to poor performance on future patterns. The novel test point
marked ? is evidently most likely a salmon, whereas the complex decision boundary
shown leads it to be misclassified as a sea bass.
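The effect described in this caption can be explored with a small experiment. The sketch below compares a very flexible rule (1-nearest-neighbor, which memorizes the training set) with a simple rule (nearest class mean, a straight-line boundary). Both classifiers and the synthetic Gaussian data are stand-ins chosen for illustration; they are not the models used in the figure.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n):
    """Two overlapping 2-D Gaussian classes (synthetic stand-ins for the fish data)."""
    x0 = rng.normal(loc=[0.0, 0.0], scale=1.5, size=(n, 2))
    x1 = rng.normal(loc=[2.0, 2.0], scale=1.5, size=(n, 2))
    return np.vstack([x0, x1]), np.array([0] * n + [1] * n)

def predict_1nn(X_train, y_train, X):
    """Complex boundary: copy the label of the single closest training point."""
    d = np.linalg.norm(X[:, None, :] - X_train[None, :, :], axis=2)
    return y_train[np.argmin(d, axis=1)]

def predict_nearest_mean(X_train, y_train, X):
    """Simple boundary: a straight line halfway between the two class means."""
    means = np.stack([X_train[y_train == c].mean(axis=0) for c in (0, 1)])
    d = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
    return np.argmin(d, axis=1)

X_train, y_train = sample(50)
X_test, y_test = sample(500)

for name, predict in [("1-NN", predict_1nn), ("nearest mean", predict_nearest_mean)]:
    train_err = np.mean(predict(X_train, y_train, X_train) != y_train)
    test_err = np.mean(predict(X_train, y_train, X_test) != y_test)
    print(f"{name}: training error {train_err:.2f}, test error {test_err:.2f}")
```

The 1-NN rule reproduces its own training labels, so its training error is zero by construction. How its test error compares with the simpler rule depends on the data, and that comparison is exactly the generalization question.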
Issue of generalization!
Figure 1.6: The decision boundary shown might represent the optimal tradeoff
between performance on the training set and simplicity of classifier.
Feature extraction
Discriminative features
Invariant features with respect to translation, rotation and
scale.
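As an illustration of invariance, the sketch below builds a simple shape descriptor and checks that it does not change when the shape is rotated, scaled, and translated. The descriptor (normalized, sorted centroid distances) and the outline points are hypothetical examples, not features proposed in these slides.

```python
import numpy as np

def shape_descriptor(points):
    """Sorted distances from the centroid, divided by their mean.
    Subtracting the centroid removes translation, distances ignore rotation,
    and dividing by the mean distance removes scale."""
    points = np.asarray(points, dtype=float)
    r = np.linalg.norm(points - points.mean(axis=0), axis=1)
    return np.sort(r / r.mean())

# A hypothetical outline given as 2-D points.
outline = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 1.0], [0.0, 2.0]])

# Apply a rotation by 30 degrees, a scaling by 2.5, and a translation.
theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
transformed = 2.5 * outline @ R.T + np.array([10.0, -3.0])

print(np.allclose(shape_descriptor(outline), shape_descriptor(transformed)))  # True
```

This descriptor still depends on which points are sampled from the outline, so it is a sketch of the idea rather than a robust feature.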
Classification
Use a feature vector provided by a feature extractor to assign
the object to a category
Post Processing
Exploit context (input-dependent information other than
from the target pattern itself) to improve performance
Feature Choice
Model Choice
Training
Evaluation
Computational Complexity
Design Cycle
Data Collection
How do we know when we have collected an adequately
large and representative set of examples for training and
testing the system?
Feature Choice
Depends on the characteristics of the problem domain.
Simple to extract, invariant to irrelevant transformations,
insensitive to noise.
Model Choice
We may be unsatisfied with the performance of our fish classifier
and want to jump to another class of model.
Evaluation
Measure the error rate (or performance) and switch from
one set of features to another one.
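A minimal sketch of this evaluation step: hold out part of the data, measure the test error rate for several feature sets, and compare. The nearest-class-mean classifier, the synthetic data, and the half-and-half split are all assumptions made for the example.

```python
import numpy as np

def nearest_mean_error(X_train, y_train, X_test, y_test):
    """Test error rate of a nearest-class-mean classifier."""
    classes = np.unique(y_train)
    means = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    d = np.linalg.norm(X_test[:, None, :] - means[None, :, :], axis=2)
    return float(np.mean(classes[np.argmin(d, axis=1)] != y_test))

rng = np.random.default_rng(1)
n = 200
y = np.repeat([0, 1], n)
# Synthetic data: feature 0 is pure noise, feature 1 differs between classes.
X = np.column_stack([rng.normal(0.0, 1.0, 2 * n),
                     rng.normal(0.0, 1.0, 2 * n) + 3.0 * y])

# Hold out a random half of the samples for testing.
idx = rng.permutation(2 * n)
train, test = idx[:n], idx[n:]

for cols in ([0], [1], [0, 1]):
    err = nearest_mean_error(X[train][:, cols], y[train], X[test][:, cols], y[test])
    print(f"features {cols}: test error rate {err:.3f}")
```

Measuring the error on held-out samples rather than on the training samples is what makes the comparison between feature sets meaningful.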
Computational Complexity
What is the trade-off between computational ease and
performance?
(How does an algorithm scale as a function of the number of
features, patterns, or categories?)
Training
Use data to determine the classifier. Many different
procedures exist for training classifiers and choosing models.
Unsupervised learning
The system forms clusters or natural groupings of the input patterns.
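One common way to form such clusters is k-means. The sketch below is a bare-bones version under stated assumptions: the algorithm choice, the number of clusters k, the iteration count, and the toy data are all picked for illustration and are not prescribed by these slides.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Basic k-means: alternate between assigning points to the nearest
    center and moving each center to the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(d, axis=1)
        for j in range(k):
            if np.any(labels == j):          # leave empty clusters where they are
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Unlabeled toy data: two compact groups far apart.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [11.0, 11.0]])

labels, centers = kmeans(X, k=2)
print(labels)
print(centers)
```

No class labels are used anywhere; the grouping comes only from distances between the patterns. Which group receives index 0 or 1 depends on the random initialization.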