
FACE RECOGNITION USING HIDDEN MARKOV MODEL

I. Introduction
Hidden Markov Models (HMMs) are a set of statistical models used to characterize the statistical properties
of a signal. An HMM is a doubly stochastic process with an underlying stochastic process that is not
observable, but can be observed through another set of stochastic processes that produce a sequence of
observed symbols. An HMM has a finite set of states, each of which is associated with a multidimensional
probability distribution; transitions between these states are governed by a set of probabilities. Hidden Markov
Models are especially known for their application in 1D pattern recognition such as speech recognition,
musical score analysis, and sequencing problems in bioinformatics. More recently they have been applied to
more complex 2D problems and this review focuses on their use in the field of automatic face recognition,
tracking the evolution of the use of HMMs from the early 1990s to the present day.

Our goal is to apply the HMM method to face recognition on a well-known face database, so that readers can
adopt and apply these techniques in their own work.

This report deals with face recognition, so you will need some face images. You can either create your own
database or start with one of the available databases; face-rec.org/databases gives an up-to-date
overview. One such database, used in this report, is:

AT&T Facedatabase: The AT&T Facedatabase, sometimes also known as the ORL Database of Faces,
contains ten different images of each of 40 distinct subjects. For some subjects, the images were taken at
different times, varying the lighting, facial expressions (open/closed eyes, smiling/not smiling) and facial
details (glasses/no glasses). All the images were taken against a dark homogeneous background with the
subjects in an upright, frontal position (with tolerance for some side movement).

Figure: Example of pictures from ORL Database

HIDDEN MARKOV MODELS (HMMS)


The mathematical theory of Hidden Markov Models (HMMs) was originally described during the 1960s and early
1970s [14]. HMMs are a technique applied in practical pattern recognition applications, most notably in
speech recognition [15]. More recently they have been used in vision: texture segmentation [16], face finding
[17], object recognition [18] and face recognition [19]. A face image is represented as a sequence of states
produced when the face is scanned from top to bottom. An HMM is made of states, where the probability of
moving from one state to another depends only on those two states and not on any further history [20,21].
An HMM can be represented as a triplet
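$$\lambda = \{A, B, \pi\} \tag{1}$$

• The number of states N. The state at time t is given by

$$q_t \in S, \quad 1 \le t \le T \tag{2}$$

where T is the length of the observation sequence.

• The initial state distribution:

$$\pi = \{\pi_i\}, \quad \pi_i = P\{q_1 = s_i\}, \quad 1 \le i \le N \tag{3}$$

• The state transition probability matrix:

$$A = \{a_{ij}\}, \quad a_{ij} = P\{q_{t+1} = s_j \mid q_t = s_i\}, \quad 1 \le i, j \le N, \quad 0 \le a_{ij} \le 1, \quad \sum_{j=1}^{N} a_{ij} = 1 \tag{4}$$

• A probability distribution for each of the states:

$$B = \{b_j(o_t)\} \tag{5}$$

Usually the probability density function is approximated by a weighted sum of M component densities. For
discrete observation symbols, it is given by

$$b_j(o_t) = P\{o_t = v_k \mid q_t = s_j\}, \quad 1 \le j \le N, \quad 1 \le k \le M \tag{6}$$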

Here M = |V| is the number of distinct observation symbols, where V = {v1, v2, …, vM} is the set of all possible
observation symbols. The observation symbol at time t is given by o_t ∈ V. S = {s1, s2, …, sN} is the set of all
possible states, and the state of the model at time t is given by q_t ∈ S. HMMs generally work on sequences of
symbols called observation vectors, which is why in this report the face image is divided into seven regions,
each of which is assigned to a state in a left-to-right one-dimensional HMM. Figure 1 shows these seven face
regions.

Fig. 1: Seven regions of the face, from top to bottom in natural order.

A simple structure with a small number of parameters is used to build the model, as shown in Figure 2.
Fig. 2: A one-dimensional HMM with seven states for a face image with seven regions.
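
As a minimal sketch of the triplet λ = {A, B, π} for a seven-state left-to-right model, the Python code below builds π and A and evaluates P(O | λ) with the forward algorithm. The self-transition probability `stay_prob`, the toy emission matrix `B` and the two observation sequences are assumptions made purely for illustration; in a real system these parameters are estimated from training images.

```python
def left_to_right_hmm(n_states=7, stay_prob=0.5):
    """Return (pi, A) for a left-to-right HMM.

    stay_prob is an assumed placeholder; in practice A is learned in training.
    """
    pi = [1.0] + [0.0] * (n_states - 1)      # always start in the top region
    A = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        A[i][i] = stay_prob                   # stay in the same region
        A[i][i + 1] = 1.0 - stay_prob         # or move down to the next one
    A[-1][-1] = 1.0                           # the last state can only repeat
    return pi, A


def forward_likelihood(pi, A, B, observations):
    """Forward algorithm: P(O | lambda) for a discrete HMM.

    B[j][k] = P(o_t = v_k | q_t = s_j); observations holds symbol indices k.
    """
    n = len(pi)
    alpha = [pi[j] * B[j][observations[0]] for j in range(n)]
    for o in observations[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)


pi, A = left_to_right_hmm()
# Toy emission matrix (assumption): state j emits symbol j with probability 0.7
# and each of the six other symbols with probability 0.05.
B = [[0.7 if k == j else 0.05 for k in range(7)] for j in range(7)]

top_to_bottom = [0, 1, 2, 3, 4, 5, 6]        # symbols in natural face order
bottom_to_top = top_to_bottom[::-1]          # the same symbols, reversed
p_ordered = forward_likelihood(pi, A, B, top_to_bottom)
p_reversed = forward_likelihood(pi, A, B, bottom_to_top)
print(p_ordered > p_reversed)
```

Because transitions only go from a state to itself or to the next state, a sequence that follows the top-to-bottom order scores a much higher likelihood than the reversed one.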

PRINCIPAL COMPONENT ANALYSIS (PCA)


PCA aims to capture the directions of greatest variance in the data [22]. It reduces the dimensionality of the
description by projecting the points onto the principal axes, an orthonormal set of axes pointing in the
directions of maximum variance of the data. PCA is an optimal compression scheme in that it minimizes the mean
squared error between the original images and their reconstructions for any given level of compression [23,24].
It works by finding a new coordinate system for a set of data, where the axes (or principal components) are
ordered by the variance they capture in the training data [25]. The approach to face recognition is to
decompose face images into a small set of characteristic feature images called eigenfaces, which are used to
represent both existing and new faces. The training database consists of M images of the same size. The images
are normalized by converting each image matrix to an equivalent image vector I_i.

The training set is the set of image vectors:

$$I = \{I_1, I_2, I_3, \ldots, I_M\} \tag{7}$$

The mean face (ψ) is the arithmetic average vector, given by:

$$\psi = \frac{1}{M} \sum_{i=1}^{M} I_i \tag{8}$$

The deviation vector Φ_i for each image is given by:

$$\Phi_i = I_i - \psi, \quad i = 1, 2, \ldots, M \tag{9}$$
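
The mean face and deviation vectors of equations (8) and (9) can be sketched in a few lines of Python. The three four-pixel "images" below are made-up toy values, not data from the ORL database, and the function names are hypothetical.

```python
def mean_face(images):
    """Equation (8): pixel-wise arithmetic mean of the image vectors."""
    m = len(images)
    return [sum(pixels) / m for pixels in zip(*images)]


def deviations(images, psi):
    """Equation (9): subtract the mean face from every image vector."""
    return [[p - q for p, q in zip(image, psi)] for image in images]


# Toy 2x2 images flattened into vectors of length 4 (assumed values).
images = [
    [10.0, 20.0, 30.0, 40.0],
    [20.0, 20.0, 10.0, 40.0],
    [30.0, 20.0, 20.0, 10.0],
]
psi = mean_face(images)
phi = deviations(images, psi)
print(psi)
print(phi)
```

By construction the deviation vectors sum to zero at every pixel, which is a quick sanity check when implementing this step.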

Consider the difference matrix A = [Φ1, Φ2, …, ΦM], which keeps only the distinguishing features of the face
images and removes the common features. The eigenfaces are then calculated by finding the covariance matrix C
of the training image vectors:

$$C = A A^T \tag{10}$$

Because the matrix C is very large, consider instead the matrix L = A^T A of size (Mt × Mt), where Mt = M is
the number of training images; it has the same nonzero eigenvalues as C at a much smaller dimension. The
eigenvectors of C (matrix U) can be obtained from the eigenvectors of L (matrix V) by:

$$U_i = A V_i \tag{11}$$
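
The following Python sketch checks the relation in equation (11) numerically on a toy example. It assumes that L is the small matrix A^T A (the usual choice for this trick) and uses plain power iteration to find only the leading eigenvector; the helper names and the toy matrix are illustrative, and a real implementation would call a library eigen-solver instead.

```python
def transpose(X):
    return [list(col) for col in zip(*X)]


def matmul(X, Y):
    Yt = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]


def matvec(X, v):
    return [sum(a * b for a, b in zip(row, v)) for row in X]


def top_eigenpair(S, iterations=200):
    """Power iteration: largest eigenvalue and unit eigenvector of symmetric S."""
    v = [1.0 + 0.1 * i for i in range(len(S))]   # arbitrary start vector
    for _ in range(iterations):
        w = matvec(S, v)
        length = sum(x * x for x in w) ** 0.5
        v = [x / length for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(S, v)))
    return eigenvalue, v


# Toy difference matrix A: 4 pixels (rows) x 3 images (columns), assumed values.
A = [[-10.0, 0.0, 10.0],
     [0.0, 0.0, 0.0],
     [10.0, -10.0, 0.0],
     [10.0, 10.0, -20.0]]

L = matmul(transpose(A), A)      # 3 x 3: one row and column per image
C = matmul(A, transpose(A))      # 4 x 4: built here only to check the result

eigenvalue, V1 = top_eigenpair(L)
U1 = matvec(A, V1)               # equation (11): U_i = A V_i

# If the trick holds, C U1 equals eigenvalue * U1 up to rounding error.
residual = max(abs(c - eigenvalue * u) for c, u in zip(matvec(C, U1), U1))
print(residual < 1e-6)
```

For real face images C has one row and column per pixel, so working with the small image-by-image matrix is what makes the decomposition practical.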

The eigenfaces are

$$\text{Eigenfaces} = \{U_1, U_2, U_3, \ldots, U_M\} \tag{12}$$

Instead of using all M eigenfaces, the m' ≤ M eigenfaces with the highest eigenvalues are chosen to span the
eigenspace. The weight ω_i of each eigenvector, used to represent an image I in the eigenface space, is given by:

$$\omega_i = U_i^T (I - \psi), \quad i = 1, 2, \ldots, m' \tag{13}$$

Weight matrix

$$\Omega = [\omega_1, \omega_2, \ldots, \omega_{m'}]^T \tag{14}$$

Average class projection

$$\Omega_\psi = \frac{1}{x_i} \sum_{i=1}^{x_i} \Omega_i \tag{15}$$
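
A short Python sketch of the projection step follows. It computes the weights of equation (13) for one image and the average class projection of equation (15) for one person. The mean face, the two orthonormal "eigenfaces" and the two images are toy values chosen for easy arithmetic, and the function names are hypothetical.

```python
def project(image, psi, eigenfaces):
    """Equations (13)-(14): weight vector of one image in the eigenface space."""
    deviation = [p - m for p, m in zip(image, psi)]
    return [sum(u * d for u, d in zip(U, deviation)) for U in eigenfaces]


def average_class_projection(class_images, psi, eigenfaces):
    """Equation (15): mean of the weight vectors of one person's images."""
    weights = [project(image, psi, eigenfaces) for image in class_images]
    return [sum(w) / len(weights) for w in zip(*weights)]


# Toy values (assumptions): a 4-pixel mean face and two orthonormal eigenfaces.
psi = [20.0, 20.0, 20.0, 30.0]
eigenfaces = [[0.5, 0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5, -0.5]]
person = [[10.0, 20.0, 30.0, 40.0],
          [20.0, 20.0, 10.0, 40.0]]

omega_first = project(person[0], psi, eigenfaces)
omega_avg = average_class_projection(person, psi, eigenfaces)
print(omega_first, omega_avg)
```

Each image is reduced to as many numbers as there are retained eigenfaces, which is the compact representation the rest of the pipeline works with.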

IV. PROJECT PROCEDURE DESCRIPTION

These experiments were performed using the Olivetti Research Lab (ORL) database, containing
frontal facial images with limited side movements and head tilt. The database was comprised of 40
subjects with 10 pictures per subject.

The experiments used 5 images per person for training and the remaining 5 images for testing. The
results were reported as error rates, calculated as the proportion of incorrectly classified images. Three
sets of tests were done, each varying one of the three parameters (number of states N, window height L
and overlap M) over the ranges 2 ≤ N ≤ 10, 1 ≤ L ≤ 10 and 0 ≤ M ≤ L−1. When M was varied, the
number of states was fixed at N = 5 and the window height L was varied between 2 and 10. According
to the tests, the error rate drops as the overlap increases, from approximately 28% to 15%.
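
To make the error-rate definition concrete, the Python sketch below counts misclassified test images. With 40 subjects and 5 test images each there are 200 test images, so an error rate of 15% corresponds to 30 wrong answers. The predictions here are fabricated placeholders used only to exercise the formula; they are not results from the experiments.

```python
def error_rate(predicted, actual):
    """Proportion of test images whose predicted subject is wrong."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return wrong / len(actual)


# 40 subjects x 5 test images each = 200 ground-truth labels.
actual = [subject for subject in range(1, 41) for _ in range(5)]

# Hypothetical predictions: start from the truth, then corrupt the first 30.
predicted = list(actual)
for idx in range(30):
    predicted[idx] = predicted[idx] % 40 + 1   # any label other than the true one

rate = error_rate(predicted, actual)
print(rate)
```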

However, a greater overlap implies a greater computational effort. When L was varied, N was fixed at
5 and the overlaps considered were 0, 1 and L−1. In this case, if there is little or no overlap, the smaller
the strip height, the lower the error rate, with values ranging from 13% for L = 1 up to 28% for L = 10.
However, for sufficiently large overlap the strip height has only a marginal effect on recognition
performance, the error rate remaining almost constant at around 14%. In the third set of tests N was
varied, with L = 1 and 0 overlap, and with L = 8 and maximum overlap (M = L−1). The performance is
fairly uniform for values of N between 4 and 10, with an increase in error for values smaller than three.
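
The link between overlap and computational effort can be illustrated with a small Python sketch of the top-to-bottom sampling window. It assumes that consecutive strips of height L share M rows (so the window moves down L − M rows each time) and that images are 112 rows by 92 columns, the usual ORL size; neither detail is stated above, and `extract_strips` is a hypothetical helper.

```python
def extract_strips(image, L, M):
    """Cut an image (a list of pixel rows) into strips of height L.

    Consecutive strips overlap by M rows, so the window moves L - M rows
    at a time; partial strips at the bottom edge are dropped.
    """
    if not 0 <= M <= L - 1:
        raise ValueError("overlap M must satisfy 0 <= M <= L - 1")
    step = L - M
    last_top = len(image) - L
    return [image[top:top + L] for top in range(0, last_top + 1, step)]


HEIGHT, WIDTH = 112, 92                      # assumed ORL image size
image = [[0] * WIDTH for _ in range(HEIGHT)]

# Number of observation vectors produced for a few (L, M) settings.
counts = {(L, M): len(extract_strips(image, L, M))
          for L, M in [(10, 0), (10, 9), (1, 0)]}
print(counts)
```

Under these assumptions, moving from no overlap to maximum overlap at L = 10 raises the number of strips from 11 to 103, so the HMM has to process a much longer observation sequence per image.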

PROGRAM FUNCTIONS:

• Mainmenu.m: the main program to execute; it opens the main menu, which contains four
buttons corresponding to four different functions;
• gendata.m: the function that generates the database from the data folder;
• facer.m: the function that opens the second menu, used to run the recognition phase;
• facerec.m: the main function, which runs the HMM algorithm on the selected picture;
• savepicture.m: the function used to save the selected picture;
• DATABASE.mat: the database created.
Training

The dataset used in this program consists of 40 persons with 10 images of each person stored in a
folder named ‘sn’ (n=1 to 40).

Eg. Person s1 consists of:

When the ‘Generate Database’ option is selected and the path to the dataset is given, the program
creates a file ‘training.mat’ and stores the corresponding vectors of all the faces in that file.
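
As a rough Python sketch of how the ‘sn’ folder layout maps onto the 5/5 training and test split, the code below lists the image paths per subject. The file names (1.pgm to 10.pgm), the root folder name `data` and the choice of the first five images for training are assumptions; the actual gendata.m may organise this differently.

```python
def split_dataset(root, n_subjects=40, n_images=10, n_train=5, extension=".pgm"):
    """Return (train, test) lists of (subject number, image path) pairs.

    Assumes folders s1..s40 under root, each holding images named 1..10.
    """
    train, test = [], []
    for n in range(1, n_subjects + 1):
        for k in range(1, n_images + 1):
            item = (n, f"{root}/s{n}/{k}{extension}")
            (train if k <= n_train else test).append(item)
    return train, test


train, test = split_dataset("data")   # "data" is a placeholder folder name
print(len(train), len(test))
print(train[0], test[-1])
```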
Face Recognition
When ‘Recognize from Image’ is selected, the program opens a new window, which is used to
choose a picture to be recognized.

First, choose a picture by selecting ‘Input image’; this button opens a file browser window in the
test folder.

After the file is chosen, the picture opens with the menu window beside it, which lets you
recognize the person from the database.
When you select ‘Recognize’, the program runs the HMM algorithm to recognize the picture
against the constructed database matrix. Another figure then shows the selected picture
along with the first picture from the recognized person's folder.
Additional Information

Hardware Specification

Model : HP Pavilion dm4 Notebook PC


Processor : Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
Memory : 6.00 GB
System Type : 64-bit Operating System
Operating system : Windows 7 Home Premium

Software Specification
Software Used : MATLAB
Type : 64-bit (win64)
Build : R2015a (8.5.0.197613)

References

1. http://www.face-rec.org/general-info/
2. http://vision.ucsd.edu/~iskwak/ExtYaleDatabase/ExtYaleB.html
3. http://www.mathworks.com/products/matlab/
