(1)
where $\Omega$ is the coefficient vector, $\omega_i$ is the corresponding
coefficient of eigenface $x_i$, and $i = 1, 2, \ldots, N$.
PCA has been demonstrated to be an efficient and effective
approach for face recognition. However, it is sensitive to
changes in appearance, pose and lighting conditions, which is
caused by the inherently deficient representation of facial images.
Traditional PCA has two phases: a training phase and a test
phase. In the training phase, one frontal facial image of every
class is selected and the eigenfaces are calculated. In the test
phase, the remaining frontal images in the facial image database
are assigned to classes by minimizing the Euclidean distance
between a facial image $I$ and the image $I_k$ in class $k$, given by
(2),
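$\varepsilon_k = \sqrt{\sum_{i=1}^{N} (\omega_i - \omega_{ki})^2}$, (2)
where $\varepsilon_k$ is the Euclidean distance between $I$ and $I_k$;
$k \in C$, and $C$ is the set of suffixes of classes; and $\omega_{ki}$ is
the coefficient of the i-th eigenface of $I_k$ as defined by (1).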
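The minimum-distance classification described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the three-coefficient vectors are hypothetical, with values borrowed from the first coefficients of the example tables purely for illustration.

```python
import math

def euclidean_distance(omega, omega_k):
    """Euclidean distance between two eigenface coefficient vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(omega, omega_k)))

def classify_nearest(omega, class_vectors):
    """Return the class label whose stored coefficient vector is closest to omega."""
    return min(class_vectors,
               key=lambda k: euclidean_distance(omega, class_vectors[k]))

# Hypothetical training set: one coefficient vector per class.
training = {
    1: [-3.8, 2.6, 0.3],
    2: [6.8, 0.9, 0.2],
}
probe = [-8.7, -3.8, 1.1]  # hypothetical test image in coefficient space

print(classify_nearest(probe, training))
```

With a single training image per class, the decision depends entirely on one stored vector per class, which is the limitation discussed in the text.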
Perlibakas [3] tested 14 distance measures and concluded that
the Mahalanobis distance (M-distance) can improve the correct
recognition rate of PCA-based methods. Thus, the proposed
method is compared only with the PCA-based method using
the M-distance, which is defined by (3),
(3)
where $MD(\Omega_1, \Omega_2)$ is the M-distance between eigenvectors
$\Omega_1$ and $\Omega_2$, $M$ is the number of eigenvalues in the training set and
$\lambda_i$ is the i-th eigenvalue in descending order.
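To illustrate how an eigenvalue-weighted distance differs from the plain Euclidean distance, the sketch below assumes one common Mahalanobis-style form in PCA space, in which each squared coefficient difference is divided by the corresponding eigenvalue. This form is an assumption chosen for illustration and may differ from the exact expression in (3); the function name and the numbers are hypothetical.

```python
import math

def m_distance(omega1, omega2, eigenvalues):
    """Eigenvalue-weighted distance: sqrt(sum_i (a_i - b_i)^2 / lambda_i).

    ASSUMPTION: one common Mahalanobis-style form in PCA space; it is not
    claimed to be the exact expression of equation (3).
    """
    return math.sqrt(sum((a - b) ** 2 / lam
                         for a, b, lam in zip(omega1, omega2, eigenvalues)))

# Hypothetical coefficient vectors and eigenvalues (in descending order).
omega_a = [2.0, 0.0]
omega_b = [0.0, 1.0]
eigenvalues = [4.0, 1.0]

print(m_distance(omega_a, omega_b, eigenvalues))  # weighted distance
print(math.dist(omega_a, omega_b))                # plain Euclidean distance
```

Under this assumed form, differences along directions with large eigenvalues (high variance in the training set) count for less than the same differences along low-variance directions.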
So far, two main problems of PCA have been described.
One is that using only one image for training is not
enough to reflect changes in appearance, pose and lighting
conditions. The other is that a rigid distance is too simple to
measure the similarity between a facial image and the image of a
certain class.
B. GNP-fuzzy Data Mining
GNP is a novel evolutionary algorithm proposed in [8].
Different from GA and Genetic Programming (GP), solutions
are represented by directed graphs rather than strings or tree
structures. In this way, more complex and flexible
representations of practical problems are obtained. In the
past seven years, GNP has been successfully applied to stock
trading [9], elevator dispatch [10], automatic program
generation [11] and data mining [12]. We concentrate on
fuzzy association rule mining using GNP.
Table I shows an example of a real-valued database, where
$A_i$ ($i = 1, 2, 3, 4, 5$) is an attribute item, C is the class
identification, and TID is the database identification number. In
GNP-fuzzy data mining, the values of all attributes are
fuzzified by membership functions, and the parameters of the
membership functions are calculated from the attribute
statistics in the first generation. Then, a binary-valued database
can be obtained as in Table II. As shown in Table II, there are at
least two rules each for $C = 1$ and $C = 2$, i.e.,
$A_1 = 1 \Rightarrow C = 1$, $A_1 = 1 \wedge A_5 = 1 \Rightarrow C = 1$,
$A_3 = 1 \Rightarrow C = 2$ and $A_3 = 1 \wedge A_4 = 1 \Rightarrow C = 2$.
If each attribute item is modeled as a gene, then it is
obvious that the inherent structure of the chromosome in the class
association rule mining problem is a graph structure, which is
the great advantage of GNP compared with GA and GP.
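The four example rules can be checked directly against the binary-valued database of Table II. The sketch below computes support and confidence using their usual definitions (fraction of all tuples matching antecedent and class, and fraction of antecedent-matching tuples that have the class); the function name and data layout are illustrative choices.

```python
# Table II: binary-valued database, one dict per TID.
TABLE_II = [
    {"A1": 1, "A2": 1, "A3": 1, "A4": 0, "A5": 1, "C": 1},
    {"A1": 1, "A2": 0, "A3": 0, "A4": 0, "A5": 1, "C": 1},
    {"A1": 0, "A2": 0, "A3": 1, "A4": 1, "A5": 0, "C": 2},
    {"A1": 0, "A2": 0, "A3": 1, "A4": 1, "A5": 1, "C": 2},
]

def support_confidence(rows, antecedent, label):
    """Support and confidence of the rule (all antecedent items = 1) => C = label."""
    matches = [r for r in rows if all(r[a] == 1 for a in antecedent)]
    hits = [r for r in matches if r["C"] == label]
    support = len(hits) / len(rows)
    confidence = len(hits) / len(matches) if matches else 0.0
    return support, confidence

RULES = [(["A1"], 1), (["A1", "A5"], 1), (["A3"], 2), (["A3", "A4"], 2)]
for antecedent, label in RULES:
    print(antecedent, "=> C =", label, support_confidence(TABLE_II, antecedent, label))
```

On this small table, every rule has support 0.5. The rule with antecedent A3 alone has confidence 2/3 rather than 1, because TID 1 has A3 = 1 but belongs to class 1; adding A4 to the antecedent removes that exception.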
Taking Table II as an example, association rules can be
easily represented by a GNP individual whose basic structure is
shown in Fig. 1. A GNP individual, or the chromosome of a
GNP individual, consists of three types of nodes, i.e., start node
(S), judgment node ($J_i$) and processing node ($P_j$), where
$i \in [1, \text{number of Js}]$ and $j \in [1, \text{number of Ps}]$.
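As a rough picture of the three node types, a GNP individual can be held as a list of nodes with directed connections. The sketch below is a simplified, hypothetical data structure: the class and function names are invented, each node keeps a single successor, and the connection scheme (S to the first P, each P to a random J, each J to another random J) is an assumption made for illustration rather than the structure used in [8] or [12].

```python
import random
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str            # "S" (start), "J" (judgment) or "P" (processing)
    function: int = 0    # attribute index judged by a J, or function index of a P
    next_nodes: list = field(default_factory=list)  # indices of successor nodes

def random_individual(num_judgment, num_processing, num_attributes, rng):
    """Build a randomly connected individual: node 0 is S, then the Ps, then the Js."""
    nodes = [Node("S")]
    nodes += [Node("P", function=j) for j in range(num_processing)]
    nodes += [Node("J", function=rng.randrange(num_attributes))
              for _ in range(num_judgment)]
    first_j = 1 + num_processing
    j_ids = list(range(first_j, len(nodes)))
    nodes[0].next_nodes = [1]                      # S points to the first P
    for p in range(1, first_j):
        nodes[p].next_nodes = [rng.choice(j_ids)]  # each P starts a route at some J
    for j in j_ids:
        others = [k for k in j_ids if k != j]      # a J never points to itself
        nodes[j].next_nodes = [rng.choice(others)] if others else []
    return nodes

individual = random_individual(num_judgment=3, num_processing=2,
                               num_attributes=5, rng=random.Random(0))
for index, node in enumerate(individual):
    print(index, node.kind, node.function, node.next_nodes)
```

Because the connections form a directed graph, the same judgment node can be reached from several routes, which is what lets one individual encode several candidate rules.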
TABLE I. Example of a real-valued database
TID    A1     A2     A3     A4     A5    C
1     -3.8    2.6    0.3    0.2    1.1   1
2     -8.7   -3.8    1.1   -1.9   -1.4   1
3      6.8    0.9    0.2   -0.2    0.7   2
4      3.2    0.8    0.2   -1.5   -0.6   2

TABLE II. Example of a binary-valued database
TID    A1   A2   A3   A4   A5   C
1      1    1    1    0    1    1
2      1    0    0    0    1    1
3      0    0    1    1    0    2
4      0    0    1    1    1    2

Figure 1. Association rule mining represented by a GNP individual

In this section, the encoding, decoding, genetic operators
and fitness function of GNP-fuzzy data mining are explained
by this example.
• Encoding and decoding: In GNP-fuzzy data mining,
each attribute item is represented by one or several
Js. Then, the connections between the Ps and Js are
randomly generated in the first generation. In this way,
several potential association rules are encoded as a
GNP individual. As shown in Fig. 1, each transition
route in GNP is decoded as several rules, i.e.,
$A_1 = 1 \Rightarrow C = 1$, $A_1 = 1 \wedge A_2 = 1 \Rightarrow C = 1$ and
$A_1 = 1 \wedge A_2 = 1 \wedge A_3 = 1 \Rightarrow C = 1$. Then, rules are
evaluated by measures such as support, confidence and
chi-square. If a rule satisfies the minimum values of all
three measures, it is added to the rule pool.
• Genetic operators: Like other evolutionary methods,
GNP has three kinds of genetic operators, i.e.,
selection, crossover and mutation. Usually, elite
selection is used in GNP to carry the best individual
over to the next generation. Tournament selection is also
used to select individuals with higher fitness
values with a higher probability. In crossover, two
candidates are selected and some nodes and all their
connections are exchanged for the next generation.
There are two types of mutation in GNP. One
changes the function of a node. The other changes
the connection of a node.
• Fitness function: To mine as many association rules
as possible, the fitness function of a GNP individual is
defined by (4),
(4)
where $N_r$ is the total number of rules extracted by a
GNP individual; $\chi^2(r)$ is the chi-square value of rule $r$;
and $N_{ante}(r)$ is the number of attributes in the antecedent
of rule $r$.
The flowchart is then shown in Fig. 2.
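The decoding of a transition route into rules, and the chi-square evaluation of each rule, can be sketched as below. The route is assumed to be an ordered list of attribute indices, decoded into one rule per prefix as in the example above. The chi-square value uses the standard 2x2 contingency form $N (z - xy)^2 / (x y (1 - x)(1 - y))$, where $x$, $y$ and $z$ are the supports of the antecedent, the class and both together; whether the authors use exactly this form is an assumption. Function names are hypothetical.

```python
# Table II rows as (A1, A2, A3, A4, A5, C).
ROWS = [
    (1, 1, 1, 0, 1, 1),
    (1, 0, 0, 0, 1, 1),
    (0, 0, 1, 1, 0, 2),
    (0, 0, 1, 1, 1, 2),
]

def decode_route(attribute_indices, label):
    """Decode a route of judgment nodes into one rule per prefix of the route."""
    return [(attribute_indices[:n], label)
            for n in range(1, len(attribute_indices) + 1)]

def chi_square(rows, antecedent, label):
    """Chi-square statistic of the 2x2 table (antecedent holds?) x (class == label?)."""
    n = len(rows)
    covers = [all(row[a] == 1 for a in antecedent) for row in rows]
    in_class = [row[-1] == label for row in rows]
    x = sum(covers) / n                                     # support of antecedent
    y = sum(in_class) / n                                   # support of class
    z = sum(c and k for c, k in zip(covers, in_class)) / n  # joint support
    denom = x * y * (1 - x) * (1 - y)
    return n * (z - x * y) ** 2 / denom if denom > 0 else 0.0

# Route A1 -> A2 -> A3 (0-based indices) for class 1.
for antecedent, label in decode_route([0, 1, 2], 1):
    print(antecedent, "=> C =", label, round(chi_square(ROWS, antecedent, label), 3))
```

A rule would then be added to the rule pool only if its support, confidence and chi-square value all reach their minimum thresholds, which are not specified in this section.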
III. PROPOSED METHOD
The proposed method mainly consists of three phases, i.e., a
database conversion phase, a training phase and a testing phase. In
the database conversion phase, eigenfaces are treated as
attributes and their coefficients are considered as attribute
values. Thus, the high-dimensional facial image database is
converted into a low-dimensional real-valued database. In the
training phase, we use the GNP-fuzzy data mining method to
extract the class association rules from the real-valued database.
Finally, in the testing phase, the extracted rules are applied to
the classifier and the classes of the facial images are obtained.
A. Database Conversion Phase
Suppose we have the facial database shown in Fig. 3. Then,
according to the idea of PCA, we can obtain the six eigenfaces of
the facial database shown in Fig. 4. Finally, each facial image
in the database can be represented by a tuple in the real-valued
database given by Table III.
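The conversion of one image into a tuple of attribute values can be sketched as a projection onto the eigenfaces. The code assumes the usual eigenface projection, in which each coefficient is the inner product of an orthonormal eigenface with the mean-subtracted image; the tiny 4-pixel "images" and the function name are hypothetical and serve only to make the arithmetic visible.

```python
def project(image, mean_face, eigenfaces):
    """Coefficients of a flattened image in the eigenface basis.

    ASSUMPTION: the usual eigenface projection w_i = x_i . (image - mean_face),
    with orthonormal eigenfaces x_i.
    """
    centered = [p - m for p, m in zip(image, mean_face)]
    return [sum(e * c for e, c in zip(face, centered)) for face in eigenfaces]

# Hypothetical 4-pixel example with two orthonormal eigenfaces.
mean_face = [1.0, 1.0, 1.0, 1.0]
eigenfaces = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
]
image = [3.0, 1.0, 3.0, 1.0]

print(project(image, mean_face, eigenfaces))
```

Each row of the real-valued database is then the coefficient list of one facial image, with one column per eigenface.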
Then, according to the statistics of each attribute, the
real-valued database is fuzzified, and the membership function is
given by Fig. 5, where $\mu$ and $\sigma$ are the mean value and standard
deviation of attribute $A_i$; the scaling parameter is set to 0.25
in this paper.
Figure 2. Flowchart of GNP evolution

(a) (b) (c) (d) (e) (f) (g) (h)
Figure 3. Sub-database from the Yale A facial image database

(a) (b) (c) (d) (e) (f)
Figure 4. Eigenfaces of the sub-database in Fig. 3

TABLE III. Real-valued database of the facial image database in Fig. 3
Facial image      1      2      3      4      5      6
a              -3.8    2.6    0.3    0.2    1.1    1.9
b              -8.7   -3.8    1.1   -1.9   -1.4    1.3
c               6.8    0.9    0.2   -0.2    0.7   -1.2
d               3.2    0.8    0.2   -1.5   -0.6    1.0
e              -2.4   -0.7    1.8    3.6   -0.2    0.8
f             -12      1.3   -0.9   -1.4   -0.3   -0.1
g              10.5   -1.0    3.9   -0.3    0.9   -0.1
h              -0.4    3.1   -1.7   -1.8   -1.0   -0.5

Figure 5. Fuzzy membership function

B. Training Phase
Although the effectiveness of GNP-fuzzy data mining for
class association rule extraction has been demonstrated,
face recognition is quite different from common classification
problems, because the number of classes in face recognition is
always very large, from hundreds to thousands. Therefore, the
proposed method includes two steps.
Suppose a facial database has 40 persons […]
the proposed method uses a three-objective […]
clustering method to cluster the 40 persons […]
each cluster, there are about 10 classes, i.e. […]
fitness functions are given by (5)-(7),
(5)
(6)
(7)
(8)
(9)
where $D_1$ is the average distance between each object data and
its cluster center, $m$ is the number of clusters, $C_i$ is the set of
suffixes of object data in cluster $i$, $\Omega_{C_i}$ is the cluster center of
cluster $i$ and $d_u(\Omega_j, \Omega_{C_i})$ is the Euclidean distance between
$\Omega_j$ and $\Omega_{C_i}$; $D_2$ is the average distance between cluster
centers; and $E$ is the clustering error rate, $H$ is the hyperplane
between $C_i$ and $C_j$ which has the normal vector and crosses the
midpoint of $C_i$ and $C_j$, and $d(\Omega, H)$ is the distance between a
coefficient vector $\Omega$ and $H$.
The objective function of the GA-based clustering method
is given by (10),
(10)
where $D_1^*$, $D_2^*$ and $E^*$ […] values of $D_1$, $D_2$, and $E$ […]
[…] in the fuzzified […] individuals and class […] in the rule pool.
[…] facial images for one […] method including […] conditions.
In this way, […]ication with high […] association rule mining is
[…]stly assigned to a […] between the cluster […]s are then used to
classify the test data to a certain class which has the highest
evaluation value defined by (11),
(11)
(12)
(13)
(14)
where $\mu_{Match_k}(a)$ and $\sigma_{Match_k}(a)$ are the average and standard
deviation of $Match_k(a, r)$, respectively; $Match_k(a, r)$ is the
matching degree between data $a$ and rule $r$ in class $k$; $N_k(a, r)$ is
the sum of the fuzzy membership values of the fuzzy attributes
in the antecedent part of fuzzy rule $r$ in class $k$, which is
evaluated by data $a$; $N_k(r)$ is the number of the fuzzy attributes
in the antecedent part of rule $r$ in class $k$. Different from the
conventional GNP-fuzzy data mining method, the criterion used
for classification considers not only the average value but also
the standard deviation of the matching degree.
The flowchart of the proposed classifier in the test phase is
shown in Fig. 7. By measuring the average matching
degree of class association rules, the eigenvectors are
evaluated from different viewpoints, and the correct recognition
rate is expected to be improved.

Figure 6. […]ning in the training […]

Figure 7. Flowchart of the proposed classifier in the test phase

TABLE IV. Configuration of the GNP
Parameter                     Value
Population size               200
Maximal generation            100
Elite selection size          1
Crossover size                100
Crossover rate                0.65
Mutation size                 99
Mutation rate                 0.3
Selection criteria            Tournament selection
Number of judgment nodes      3
Number of processing nodes    21

IV. EXPERIMENTAL RESULTS AND DISCUSSION
To demonstrate the performance of the proposed face
recognition scheme using PCA with GNP-fuzzy data mining,
both the proposed method and the conventional PCA-based
method using the M-distance have been tested on a sub-database
of the Yale facial database B. The experiments in this paper use
2432 images of 38 persons under different poses and illumination
conditions. The configuration of GNP-fuzzy data mining is
given in Table IV. The average fitness
value in the first 100 generations is shown in Fig. 8.
In the GA-based clustering method, we set the number of
clusters to 4. The clustering results are {1, 5, 13, 15, 17,
19, 21, 23, 29, 37}, {2, 6, 12, 14, 16, 18, 20, 30, 36, 38},
{3, 7, 9, 11, 27, 31, 33, 35} and {4, 8, 10, 22, 24, 25, 26, 28,
32, 34}, where the corresponding cluster centers are 15, 6, 33
and 4, respectively.
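The reported clustering can be checked for consistency, and its use as a first classification step can be sketched as follows. The partition check uses only the numbers given above. The second function assumes that a test vector is first assigned to the cluster with the nearest center, so that only the classes of that cluster need to be considered afterwards; that reading, the function names and the two-dimensional center vectors are assumptions for illustration.

```python
import math

# Clustering result reported above: person IDs per cluster and the cluster centers.
CLUSTERS = [
    [1, 5, 13, 15, 17, 19, 21, 23, 29, 37],
    [2, 6, 12, 14, 16, 18, 20, 30, 36, 38],
    [3, 7, 9, 11, 27, 31, 33, 35],
    [4, 8, 10, 22, 24, 25, 26, 28, 32, 34],
]
CENTERS = [15, 6, 33, 4]

def is_partition(clusters, num_persons):
    """True if every person ID 1..num_persons appears in exactly one cluster."""
    members = sorted(p for cluster in clusters for p in cluster)
    return members == list(range(1, num_persons + 1))

def candidate_classes(omega, center_vectors):
    """Return the cluster whose center vector is nearest to omega.

    center_vectors maps a center's person ID to its coefficient vector.
    """
    nearest = min(center_vectors, key=lambda c: math.dist(omega, center_vectors[c]))
    return CLUSTERS[CENTERS.index(nearest)]

print(is_partition(CLUSTERS, 38))

# Hypothetical 2-D coefficient vectors for the four cluster centers.
center_vectors = {15: [0.0, 0.0], 6: [10.0, 0.0], 33: [0.0, 10.0], 4: [10.0, 10.0]}
print(candidate_classes([9.0, 1.0], center_vectors))
```

The four clusters cover all 38 persons exactly once, with 10, 10, 8 and 10 members, so a test image is compared against at most 10 classes instead of 38 once its cluster is known.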
Finally, the correct recognition rate (CRR) is used to
evaluate the accuracy and performance of the recognition
methods. The results are given in Table V.
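For reference, a correct recognition rate can be computed as the percentage of test images assigned to their true class. The sketch below assumes that definition, which this section does not spell out; the function name and the sample labels are hypothetical.

```python
def correct_recognition_rate(predicted, actual):
    """Percentage of test images whose predicted class equals the true class."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

# Hypothetical labels for four test images.
print(correct_recognition_rate([1, 2, 2, 3], [1, 2, 3, 3]))
```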
According to the experimental results, the proposed face
recognition scheme using GNP-fuzzy data mining has
higher accuracy than the conventional PCA-based face recognition
method with M-distance. There are mainly two reasons.
1) PCA is the optimal representation of facial images
in the minimum mean square error sense, but it is not the
optimal representation for classification. In contrast, the
proposed GNP-fuzzy data mining method extracts
class association rules to construct the classifier,
which largely shortens the intra-class distance and
enlarges the inter-class distance;
2) Although the M-distance uses the eigenvalues to improve
on the performance of the Euclidean distance and cosine
distance, it is still insufficient to measure the
similarity between two eigenvectors, especially when
the eigenvectors describe the same subject under different
viewing conditions. In contrast, the proposed method uses
the average matching degree of different class
association rules, which can to some extent reflect the
eigenvectors under different viewing conditions.
In addition, the computational load is controlled through
the GA-based clustering method. Also, because the proposed
method has learning ability, it is easier to apply it to different
databases.
V. CONCLUSIONS
The proposed face recognition scheme using PCA with
GNP-fuzzy data mining addresses three
demerits of the conventional PCA-based face recognition
method, i.e., efficiency, robustness and generalization ability
with respect to pose, appearance and lighting conditions. The CRR
is calculated to compare the performance and accuracy of the
proposed method and the conventional PCA-based method.
Experimental results indicate that the proposed method has
higher accuracy than the conventional PCA-based method
(a CRR of 23.66 versus 19.74 in Table V).
Furthermore, owing to its learning ability, the proposed method
could be easily applied to other recognition problems, such as
fingerprint recognition. In future work, the
proposed method will be evaluated on more databases, and
better feature representation approaches will be used in place of
PCA to test its effectiveness and generalization
ability.
REFERENCES
[1] W. Zhao, R. Chellappa, P. J. Phillips and A. Rosenfeld, "Face
recognition: a literature survey," ACM Computing Surveys, Vol. 35, No.
4, pp. 399-458, December 2003.
[2] M. Turk and A. Pentland, "Eigenfaces for Recognition," Journal of
Cognitive Neuroscience, Vol. 3, No. 1, pp. 71-86, 1991.
[3] V. Perlibakas, "Distance measures for PCA-based face recognition,"
Pattern Recognition Letters, Vol. 25, pp. 711-724, 2004.
[4] K. Kim, "Face recognition using principal component analysis," BTech
thesis, 2008.
[5] J. Yang, D. Zhang, A. F. Frangi and J. Yang, "Two-dimensional PCA: a
new approach to appearance-based representation and recognition,"
IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 26, No.
1, January 2004.
[6] K. I. Kim, K. Jung and H. J. Kim, "Face recognition using kernel
principal component analysis," IEEE Signal Processing Letters, Vol. 9,
No. 1, February 2002.
[7] K. Taboada, S. Mabu, E. Gonzales, K. Shimada and K. Hirasawa,
"Mining fuzzy association rules: a general model based on genetic
network programming and its applications," IEEJ Trans. on Electrical
and Electronic Engineering, Vol. 5, No. 1, pp. 102-111, 2006.
[8] H. Katagiri, K. Hirasawa and J. Hu, "Genetic Network Programming -
Application to Intelligent Agents," Proceedings of the IEEE International
Conference on Systems, Man and Cybernetics, Vol. 5, pp. 3829-3834,
2000.
[9] Y. Chen, S. Mabu, K. Shimada and K. Hirasawa, "Real Time Updating
Genetic Network Programming for Adapting to the Change of Stock
Prices," IEEJ Trans. EIS, Vol. 129, No. 2, pp. 344-354, 2009.
[10] L. Yu, J. Zhou, S. Mabu, K. Hirasawa, J. Hu and S. Markon, "Double-
Deck Elevator Group Supervisory Control System Using Genetic
Network Programming with Ant Colony Optimization with
Evaporation," J. of Advanced Computational Intelligence and Intelligent
Informatics, Vol. 11, No. 9, pp. 1149-1158, 2007.
[11] S. Mabu, K. Hirasawa, Y. Matsuya and J. Hu, "Genetic Network
Programming for Automatic Program Generation," J. of Advanced
Computational Intelligence and Intelligent Informatics, Vol. 9, No. 4, pp.
430-435, 2005.
[12] K. Shimada, K. Hirasawa and J. Hu, "Class Association Rule Mining
with Chi-Squared Test Using Genetic Network Programming," in Proc.
of the IEEE SMC, pp. 5338-5344, 2006.
Figure 8. Training result of the GNP in class association rule mining

TABLE V. CRR (%) value of the proposed method and the conventional
PCA-based method with M-distance
Method                 CRR (%)
Proposed method        23.66
PCA with M-distance    19.74