GA-based optimal selection of PZMI features for face recognition


Applied Mathematics and Computation 205 (2008) 706–715


Hamidreza Rashidy Kanan, Karim Faez *

Machine Vision Lab, Electrical Engineering Department, Amirkabir University of Technology (Tehran Polytechnic), Hafez Avenue, Tehran 15914, Iran

* Corresponding author. E-mail address: [email protected] (K. Faez).

Keywords: Face recognition; Feature selection; Pseudo Zernike moment invariant (PZMI); Genetic algorithm (GA)

Abstract

One of the key problems in automated face recognition is handling the variation of face images in scale, in-plane rotation and translation. One approach fixes these problems in the recognition process by extracting a feature that is invariant under linear transformations. This paper presents a novel method for face recognition. The pseudo Zernike moment invariant (PZMI), which is invariant under linear transformations and robust in the presence of noise, is used to produce the feature vectors. To decrease the computational complexity of the feature extraction step, we use a genetic algorithm (GA) to select the optimal feature set, which contains the optimal PZMI orders and the corresponding repetitions. In addition, we investigate the effect of PZMI orders on the recognition rate for noisy images. The proposed scheme has been tested on the FERET database. Experimental results show the advantages of the proposed method when compared with other PZMI-based face recognition systems.

© 2008 Elsevier Inc. All rights reserved.

1. Introduction

Automatic face recognition has been a hot topic in computer vision and pattern recognition for decades. Existing face recognition approaches can be classified into two main categories, analytic and holistic [1,2]. The analytic or feature-based methods extract a set of geometric face features such as the eyes, nose and mouth. The locations of significant feature points, chosen for their value in face representation and their reliability for automatic extraction, are used to compute geometrical relationships, including the areas, distances and angles among the "fiducial" points. The extracted geometrical features are then used to search for the best candidate in a face database. Holistic or appearance-based methods, by contrast, use the global characteristics of the face patterns. Appearance-based algorithms generally use the pixel intensity values of faces without detecting facial features. Since detection of geometric facial features is not required, this class of approaches is usually more practical and easier to implement than geometric feature-based approaches [2]. Many holistic methods have been proposed in the literature, such as Eigenfaces [3], linear (Fisher) discriminant analysis (LDA)-based methods [4], neural network-based methods [5-7] and moment-based methods [8-10].

Among these approaches, LDA is one of the most promising. However, LDA has two major drawbacks. First, LDA is a linear method and cannot solve nonlinear problems; second, it suffers from the small sample size (S3) problem [11]. To overcome the first drawback, kernel-based methods such as the kernel Fisher discriminant (KFD) [12,13] are employed. For the second problem, algorithms such as PCA + LDA [4], direct LDA [14,15] and regularized discriminant analysis (RDA) [16,17] have been developed. However, PCA + LDA and direct LDA operate in a sub-feature space instead of the full feature space, and may therefore lose some useful discriminant information. Dai et al. [16] proposed a three-parameter RDA method to solve the S3 problem. Although this algorithm operates in the full feature space, it requires a high computational cost. To overcome the complexity problem, Chen et al. [17] further proposed a single-parameter RDA algorithm.


Fig. 1. Some problems in face detection algorithms: (a) reference face image; (b) scaled face image; (c) scaled and translated face image; (d) rotated (in plane) face image.

Also, Chen et al. [11] proposed a kernel-based RDA algorithm, namely the kernel-based one-parameter regularized Fisher discriminant (K1PRFD) method. The K1PRFD algorithm has two parameters, the regularization parameter t and the kernel parameter h; the authors propose to determine the optimal kernel parameter h of the RBF kernel and the regularization parameter t of the within-class scatter matrix simultaneously, based on the conjugate gradient method (CGM).

In an automated human face recognition system, an automatic face detection algorithm always precedes the face recognition process. Existing face recognition algorithms assume that the faces to be recognized are accurately localized, but this is not always true. It is inevitable that faces detected by a face detection algorithm vary in scale, rotation (in plane) and translation. Examples of these variations are illustrated in Fig. 1. This calls for a face recognition algorithm that is invariant to scale, translation and rotation.

PZMI is an orthogonal moment that has been used for a number of image processing tasks [10,18-20]. Its invariance under linear transformations makes the PZMI descriptors very valuable. However, PZMI is computationally infeasible for some real-world applications. Any PZMI is determined by two parameters, the order n and the corresponding repetition m. If we know all moments PZMI_{n,m} of an image up to a given order n_max, we can obtain a reconstructed version whose moments exactly match those of the original image up to that order. To represent a face image, we can extract a finite number of PZMI with different orders and their corresponding repetitions and concatenate them into one feature vector. But some orders and repetitions have more image representation ability than others, so to decrease the computational complexity we should carefully select the appropriate orders and repetitions. In earlier works, the best values of these parameters were determined via simulations [10,20]. In [21], we optimized only the order of the PZMI; in this paper we select not only the best orders but also the best corresponding repetitions.

Genetic algorithms (GAs) are optimization techniques based on the mechanism of natural selection [22]. Because of their adaptive search advantages, GAs have been widely used in pattern recognition applications such as optimization and optimal feature selection.

In this paper, we propose a feature selection-based approach using PZMI for face recognition. The proposed method operates in the full feature space instead of a sub-feature space and does not have the inherent problems of LDA-based methods. This paper addresses the major computational problem of PZMI: the proposed face recognition system uses GA-selected PZMI features in its feature extraction step. We use the difference between the original image and its reconstructed version, in terms of the mean square error (MSE), as a criterion for determining the upper bound on the order (n*) needed for face image representation, thereby limiting the GA search space. We first extract the PZMI of orders 1 to n* = 20 and their corresponding repetitions for the face images. Then, in the feature selection part, the extracted features are fed to a GA that selects the optimal feature subset for recognition, i.e., the optimal PZMI. An SVM, which has good generalization ability, is employed in the classifier stage of this system.
The organization of this paper is as follows: Section 2 presents feature extraction and selection. Classification is described in Section 3, and Sections 4 and 5 present the experimental results and the conclusion.

2. PZMI feature extraction and image reconstruction from PZMI

The kernel of the PZMI is a set of orthogonal pseudo Zernike polynomials defined inside a unit circle. The two-dimensional complex PZMI of order n with repetition m of a continuous image intensity function f(x, y) is defined as [23]:

\mathrm{PZMI}_{n,m} = \frac{n+1}{\pi} \iint_{x^2+y^2 \le 1} V^{*}_{n,m}(x, y)\, f(x, y)\, dx\, dy, \qquad (1)

where n = 0, 1, 2, ..., ∞ and m takes on positive and negative integer values subject to the constraint |m| ≤ n. The symbol * denotes the complex conjugate. The pseudo Zernike polynomials V_{n,m}(x, y) are defined as

V_{n,m}(x, y) = R_{n,m}(r)\, e^{jm\theta}, \qquad (2)

where j = √(−1), r = √(x² + y²) is the length of the vector from the origin to the pixel (x, y), and θ = tan⁻¹(y/x) is the angle between the vector r and the principal x-axis in the counterclockwise direction. The real-valued radial polynomials R_{n,m}(r) are defined as


R_{n,m}(r) = \sum_{s=0}^{n-|m|} (-1)^{s}\, \frac{(2n+1-s)!}{s!\,(n-|m|-s)!\,(n+|m|+1-s)!}\; r^{\,n-s}. \qquad (3)
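As a concrete illustration, the following Python sketch evaluates Eq. (3) directly from the factorial form; the function name and the use of NumPy are our own choices, not part of the paper, and no safeguards against factorial overflow at very high orders are included.

```python
# A minimal sketch of the radial polynomial R_{n,m}(r) of Eq. (3).
import numpy as np
from math import factorial

def pseudo_zernike_radial(n, m, r):
    """Real-valued radial polynomial R_{n,m}(r) of Eq. (3)."""
    m = abs(m)                       # R_{n,-m}(r) = R_{n,m}(r)
    r = np.asarray(r, dtype=float)
    R = np.zeros_like(r)
    for s in range(n - m + 1):
        c = ((-1) ** s * factorial(2 * n + 1 - s)
             / (factorial(s) * factorial(n - m - s) * factorial(n + m + 1 - s)))
        R += c * r ** (n - s)
    return R
```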

Note that R_{n,−m}(r) = R_{n,m}(r). Fig. 2 shows the graph of the radial polynomials R_{n,m}(r) for orders n = 0 up to 10. The pseudo Zernike polynomials V_{n,m}(x, y) are orthogonal and satisfy

\iint_{x^2+y^2 \le 1} V_{n,m}(x, y)\,[V_{p,q}(x, y)]^{*}\, dx\, dy = \frac{\pi}{n+1}\, \delta_{n,p}\, \delta_{m,q}. \qquad (4)

The real-valued radial polynomials R_{n,m}(r) also satisfy the orthogonality relation

\int_{0}^{1} R_{n,m}(r)\, R_{p,m}(r)\, r\, dr = \frac{1}{2(n+1)}\, \delta_{n,p}, \qquad (5)

where δ_{i,j} is the Kronecker delta, defined as

\delta_{i,j} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise}. \end{cases} \qquad (6)

Since the PZMI are defined inside the unit circle, to compute them for a digital image the pixel coordinates must be normalized by a linear mapping transform. In other words, the center of the image is taken as the origin and the pixel coordinates are mapped into the unit circle, i.e., x² + y² ≤ 1. This linear transformation is illustrated in Fig. 3 and the corresponding equations are as follows:

Fig. 2. Graph of radial polynomials Rn,m(r) with orders n = 0 up to 10.

Fig. 3. Pixel coordinates normalization into the unit circle.

x_i = -\frac{\sqrt{2}}{2} + \frac{\sqrt{2}}{N-1}\, i, \qquad i = 0, 1, \ldots, N-1,
\qquad
y_j = \frac{\sqrt{2}}{2} - \frac{\sqrt{2}}{N-1}\, j, \qquad j = 0, 1, \ldots, N-1. \qquad (7)

Based on this mapping, the discrete approximation of the PZMI of order n with repetition m for the mapped digital image intensity f(x_i, y_j) is given by

\mathrm{PZMI}_{n,m} = \frac{n+1}{\lambda(N)} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} V^{*}_{n,m}(x_i, y_j)\, f(x_i, y_j), \qquad (8)

where the normalization factor λ(N), which depends on the mapping method, is the ratio of the number of pixels mapped into the unit circle to the area of the normalized square image, multiplied by the area π of the unit circle in the continuous domain. It is given by

\lambda(N) = \frac{\pi N^{2}}{2}. \qquad (9)
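For illustration, a minimal Python sketch of the discrete computation in Eqs. (7)-(9) follows; it reuses the pseudo_zernike_radial function from the previous sketch, and skipping pixels mapped outside the unit circle is our choice rather than something stated in the paper.

```python
# Sketch of Eqs. (7)-(9): map an N x N image into the unit circle and
# compute the discrete moment PZMI_{n,m} of Eq. (8).
import numpy as np

def pzmi(image, n, m):
    """Discrete pseudo Zernike moment PZMI_{n,m} of an N x N image."""
    N = image.shape[0]
    i, j = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    x = -np.sqrt(2) / 2 + np.sqrt(2) / (N - 1) * i      # Eq. (7)
    y = np.sqrt(2) / 2 - np.sqrt(2) / (N - 1) * j
    r = np.sqrt(x ** 2 + y ** 2)
    theta = np.arctan2(y, x)
    inside = r <= 1.0                                   # V_{n,m} lives in the circle
    V_conj = pseudo_zernike_radial(n, m, r) * np.exp(-1j * m * theta)  # V*_{n,m}
    lam = np.pi * N ** 2 / 2                            # lambda(N), Eq. (9)
    return (n + 1) / lam * np.sum(V_conj[inside] * image[inside])
```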

Note that PZMI_{n,−m} = PZMI*_{n,m}. Rotation-invariant features are obtained by taking the magnitudes of the PZMI, since these values are identical for the image function before and after rotation. The PZMI of an image rotated by an angle φ are given by

\mathrm{PZMI}^{r}_{n,m} = \mathrm{PZMI}_{n,m}\, \exp(-jm\phi), \qquad (10)

where PZMI^r_{n,m} is the PZMI of the rotated image and PZMI_{n,m} is the PZMI of the original image. The rotation-invariant PZMI are extracted by considering only their magnitudes:

|\mathrm{PZMI}^{r}_{n,m}| = |\mathrm{PZMI}_{n,m}\, \exp(-jm\phi)| = |\mathrm{PZMI}_{n,m}|. \qquad (11)

Since PZMI_{n,−m} = PZMI*_{n,m} and hence |PZMI_{n,−m}| = |PZMI*_{n,m}| = |PZMI_{n,m}|, only the magnitudes of the PZMI with m ≥ 0 are considered for feature extraction. The PZMI of order n and repetition m can also be computed from the scale-invariant central moments CM_{p,q} and the radial geometric moments RM_{p,q} as follows:

\mathrm{PZMI}_{n,m} = \frac{n+1}{\lambda(N)} \sum_{\substack{s=0\\ (n-m-s)\ \text{even}}}^{n-|m|} (-1)^{s}\, \frac{(2n+1-s)!}{s!\,(n-|m|-s)!\,(n-|m|-s+1)!} \sum_{a=0}^{k}\sum_{b=0}^{m} \binom{k}{a}\binom{m}{b} (-j)^{b}\, \mathrm{CM}_{2k+m-2a-b,\,2a+b}
\quad + \frac{n+1}{\lambda(N)} \sum_{\substack{s=0\\ (n-m-s)\ \text{odd}}}^{n-|m|} D_{n,|m|,s} \sum_{a=0}^{d}\sum_{b=0}^{m} \binom{d}{a}\binom{m}{b} (-j)^{b}\, \mathrm{RM}_{2d+m-2a-b,\,2a+b}, \qquad (12)

where k = (n − s − m)/2, d = (n − s − m − 1)/2, and CM_{p,q} and RM_{p,q} are given by

\mathrm{CM}_{p,q} = \frac{\mu_{p,q}}{M_{00}^{(p+q+2)/2}}, \qquad (13)

\mathrm{RM}_{p,q} = \frac{\sum_{x} \sum_{y} \hat{x}^{p}\, \hat{y}^{q}\, f(x, y)\, (\hat{x}^{2} + \hat{y}^{2})^{1/2}}{M_{00}^{(p+q+2)/2}}, \qquad (14)

where x̂ = x − x₀, ŷ = y − y₀, and x₀, y₀, M_{p,q} and μ_{p,q} are the center-of-gravity coordinates, the geometric moments and the central geometric moments, respectively. The image within the unit circle may be reconstructed to arbitrary precision by

f(x, y) = \lim_{K \to \infty} \sum_{n=0}^{K} \sum_{m} \mathrm{PZMI}_{n,m}\, V_{n,m}(x, y), \qquad (15)

where the second sum is taken over all |m| ≤ n and the PZMI {PZMI_{n,m}} are computed over the unit circle. Fig. 4 shows this reconstruction process for one face image obtained in the experiments.

Fig. 4. Original image and its reconstructed images: (a) original image; (b) reconstructed image using order n = 0 up to n = 5; (c) reconstructed image using order n = 0 up to n = 10; (d) reconstructed image using order n = 0 up to n = 15; (e) reconstructed image using order n = 0 up to n = 20.
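The reconstruction of Eq. (15) can be sketched in the same style. The truncation order K and the reliance on the pzmi and pseudo_zernike_radial sketches above are our choices; computing the m < 0 moments directly (instead of exploiting the conjugate symmetry) keeps the sketch short at the cost of redundant work.

```python
# Sketch of Eq. (15): approximate reconstruction from PZMI of orders 0..K.
import numpy as np

def reconstruct(image, K):
    """Rebuild an N x N image from its pseudo Zernike moments up to order K."""
    N = image.shape[0]
    i, j = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    x = -np.sqrt(2) / 2 + np.sqrt(2) / (N - 1) * i
    y = np.sqrt(2) / 2 - np.sqrt(2) / (N - 1) * j
    r, theta = np.sqrt(x ** 2 + y ** 2), np.arctan2(y, x)
    f_hat = np.zeros((N, N), dtype=complex)
    for n in range(K + 1):
        for m in range(-n, n + 1):                      # all |m| <= n
            V = pseudo_zernike_radial(n, m, r) * np.exp(1j * m * theta)
            f_hat += pzmi(image, n, m) * V
    return f_hat.real   # the imaginary parts cancel for a real-valued image
```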


2.1. GA-based feature selection

Several approaches exist for GA-based feature subset selection; two main methods have been widely used in the past. The first, due to Siedlecki and Sklansky [24], finds an optimal binary vector in which each bit corresponds to a feature (the binary vector optimization, BVO, method). A '1' or '0' indicates that the feature is selected or dropped, respectively. The aim is to find the binary vector with the smallest number of 1's such that the classifier performance is maximized. This criterion is often modified to simultaneously reduce the dimensionality of the feature vector [25]. The second, more refined technique [26] uses an m-ary vector to assign weights to features instead of abruptly dropping or including them as in the binary case, which gives a better search resolution in the multidimensional space [27]. In this paper, we use the BVO method.

First, for proper face image representation and to limit the GA search space, we use the difference between the original image and its reconstructed version as a criterion for determining the maximum needed order (n*). Assume f(x, y) is an original image and f̂_i(x, y) is a version of it reconstructed using moments of order 0 through i extracted from the original image. Using the mean square error measure, enough information has been extracted and no additional orders of moments need to be computed, i.e., n* = i, if

\mathrm{MSE}(\hat{f}_i(x, y), f(x, y)) \le \varepsilon, \qquad (16)

where ε is a preselected threshold. We set ε = 2000 because this is around 25% of the MSE of the image reconstructed using only the zero-order PZMI, which is a good degree of closeness. Based on Eq. (16), n* = 20 is selected (a sketch of this order-determination loop is given after the list below).

After the PZMI features have been extracted over a range of orders (n = 1 to n* = 20), a GA is used to select the features. A block diagram of the proposed feature extraction and selection scheme is shown in Fig. 5. In the feature selection stage, we set the length of the chromosomes to L = 230. Each gene g_i (i = 1, 2, ..., L) corresponds to a specific order and repetition of the PZMI, as shown in Fig. 6. If g_i = 1, the corresponding order and repetition is selected as one of the optimal feature elements; g_i = 0 means it is discarded. The goal of feature subset selection is to use fewer features to achieve the same or better performance, so the fitness function should contain two terms:

- classifier performance (accuracy);
- the number of selected features.
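The order-determination criterion of Eq. (16) can be sketched as a simple loop; the cap on the search and the assumption (ours) that pixel intensities are on the 0-255 scale that makes ε = 2000 meaningful are not from the paper, and reconstruct is the sketch given earlier.

```python
# Sketch of Eq. (16): raise the maximum order until the reconstruction MSE
# falls below the preselected threshold epsilon (2000 in the paper).
import numpy as np

def max_needed_order(image, eps=2000.0, n_cap=30):
    """Smallest i with MSE(f_hat_i, f) <= eps, i.e. the n* of the paper."""
    for i in range(n_cap + 1):
        mse = np.mean((reconstruct(image, i) - image) ** 2)
        if mse <= eps:
            return i
    return n_cap        # fall back if the threshold is never reached
```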

Fig. 5. Block diagram of the proposed feature extraction and selection scheme: input face image → PZMI feature extractor → augmented feature vector → genetic algorithm → nearest neighbor classifier → final decision (best n and m), with feedback from the classifier to the GA.

Fig. 6. Chromosome representation: a binary string of length L = 230 whose bits encode, in order, PZMI_{1,0}, PZMI_{1,1}, PZMI_{2,0}, PZMI_{2,1}, ..., PZMI_{20,18}, PZMI_{20,19}, PZMI_{20,20}.
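A small sketch of the encoding of Fig. 6: the 230 genes enumerate the (order, repetition) pairs from (1, 0) to (20, 20), and a chromosome is decoded into an augmented feature vector by keeping the pairs whose gene is 1. The dictionary-based interface is our own simplification.

```python
# Sketch of the BVO chromosome of Fig. 6 and its decoding into a feature vector.
import numpy as np

PAIRS = [(n, m) for n in range(1, 21) for m in range(n + 1)]
assert len(PAIRS) == 230    # 2 + 3 + ... + 21 genes

def select_features(chromosome, pzmi_magnitudes):
    """Concatenate |PZMI_{n,m}| for every gene set to 1.

    chromosome      : 0/1 sequence of length 230
    pzmi_magnitudes : dict mapping (n, m) -> |PZMI_{n,m}| of one face image
    """
    return np.array([pzmi_magnitudes[p]
                     for p, g in zip(PAIRS, chromosome) if g == 1])
```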


Given a chromosome q, the fitness function F(q) is defined as described in [28]:

F(q) = 10{,}000 \times \frac{N_C}{N_T} + 0.4 \times \mathrm{zeros}, \qquad (17)

where N_C is the number of face images correctly classified and N_T is the total number of face images tested. The zeros term is the number of features not selected, i.e., the number of zeros in the chromosome. In other words, the higher the accuracy, the higher the fitness; and the fewer the features, the higher the fitness. For each face image, the selected feature vector is obtained as follows: the non-zero bits in the chromosome are identified and their associated PZMI features are concatenated, in order, into an augmented feature vector. For simplicity, we use the nearest neighbor classifier with the Euclidean distance during selection, and the aim is to find a binary vector with the smallest number of 1's such that the classifier performance is maximized. To select the individuals for the next generation, the GA's roulette wheel selection method is used.

3. Support vector machine as a classifier

The support vector machine (SVM) was first developed by Vapnik and his team at AT&T Bell Labs for pattern recognition and function regression [29]. Given a set of points belonging to two classes, an SVM finds the hyper-plane that separates the largest possible fraction of points of the same class on the same side, while maximizing the distance from either class to the hyper-plane. For a two-class classification problem, the goal is to separate the two classes by a function induced from the available samples. Consider the samples in Fig. 7a: many linear classifiers can separate the data, but only one (shown in Fig. 7b) maximizes the margin (the distance between the hyper-plane and the nearest data point of each class). The decision function derived by the SVM classifier for a two-class problem can be formulated, using a kernel function K(x, x_i) of a new sample x and a training sample x_i, as

f(x) = \sum_{i \in SV} a_i\, y_i\, K(x_i, x) + a_0, \qquad (18)

where SV is the support vector set (a subset of the training set) and y_i = ±1 is the class label of sample x_i. The parameters a_i ≥ 0 are optimized during the training process. Table 1 lists some kernels used in SVMs.

Fig. 7. Classification between two classes using hyper-planes: (a) candidate separating hyper-planes; (b) the maximum-margin hyper-plane, with the margin and support vectors indicated.

Table 1. Kernels used in SVMs

Kernel name                    Kernel function
Linear                         K(X_i, X_j) = X_i^T X_j
Polynomial                     K(X_i, X_j) = (X_i^T X_j + 1)^d
Radial basis function (RBF)    K(X_i, X_j) = exp(−‖X_i − X_j‖² / 2σ²)
Exponential RBF                K(X_i, X_j) = exp(−‖X_i − X_j‖ / 2σ²)
Perceptron                     K(X_i, X_j) = tanh(k X_i^T X_j + Θ)
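As an illustration of the classification stage, the sketch below uses scikit-learn's SVC; this library choice is ours, not the paper's. Its built-in 'poly' and 'rbf' kernels match the polynomial (d = 2) and Gaussian RBF rows of Table 1, while the exponential RBF kernel used in the experiments is not built in and would have to be supplied as a custom kernel.

```python
# Hedged sketch: training an SVM on selected PZMI feature vectors with
# scikit-learn. gamma = 1 / (2 * sigma^2) maps the Table 1 RBF kernel
# exp(-||Xi - Xj||^2 / 2 sigma^2) onto SVC's exp(-gamma ||Xi - Xj||^2).
from sklearn.svm import SVC

def train_svm(train_vectors, train_labels, kernel="rbf", sigma=3.0, d=2):
    if kernel == "poly":
        clf = SVC(kernel="poly", degree=d)          # polynomial kernel
    else:
        clf = SVC(kernel="rbf", gamma=1.0 / (2.0 * sigma ** 2))
    clf.fit(train_vectors, train_labels)
    return clf
```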


4. Experimental results

The standard test bed, the FERET database [30], was used to test the proposed algorithm. Six hundred frontal face images from 200 subjects were selected; all subjects are in an upright, frontal position, with tolerance for some tilting and rotation of up to 10°. The 600 images were acquired under varying illumination conditions and facial expressions. Each subject has three images of size 256 × 384 with 256 gray levels. Fig. 8 shows some sample images of this database.

Fig. 8. Some samples of the FERET database [30].

In the feature extraction step, we extract the PZMI of orders 1-20 and all of their corresponding repetitions for the face images. That is, for each n we extract one vector containing the PZMI of order n and all corresponding repetitions m (m ≤ n). For example, for n = n₀ we obtain a vector with (n₀ + 1) elements, so for orders 1-20 and all their repetitions we obtain one feature vector of 2 + 3 + ... + 21 = 230 elements. Then, in the selection stage, for each chromosome we create one feature vector that includes the extracted PZMI of the GA-selected orders and corresponding repetitions. Finally, the features created from the selected orders and repetitions are classified using the nearest neighbor classifier, and the obtained mean square error (MSE) and the length of the selected feature vector are used for performance evaluation. Further GA parameters are summarized in Table 2.

Simulation results show that the high-order PZMI have more image representation ability, and that among the selected orders the number of selected repetitions increases with the order index. The final optimized feature vector has 40 elements. The optimal selected orders and corresponding selected repetitions are summarized in Table 3.

In the classification step, we use an SVM with exponential RBF (ERBF) and polynomial (P) kernel functions. Various parameter settings were tested for better convergence, and the best parameters obtained by simulation were used. We use two randomly chosen images in each class for training and the remaining image for testing. The classifier error rate is computed as the number of misclassifications in the test phase over the total number of test images.

For comparison, we first considered the PZMI used in [10] and the PZMI obtained by the order-only selection method of [21] as feature vectors, and compared their recognition rates with those of the optimal selected orders and repetitions obtained in this paper. Simulation results on the FERET database show that the proposed face recognition system, which uses the selected orders and repetitions, provides a better recognition rate than a similar system without any feature selection step [10] and the same system with only an order selection step [21], even with a lower number of feature elements.

Table 2. GA parameters

Population size            100
Chromosome length          230
Probability of crossover   0.7
Probability of mutation    0.003
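To make the search concrete, here is a sketch of one GA generation with the Table 2 parameters, the roulette wheel selection mentioned in Section 2.1 and the fitness of Eq. (17); the evaluate callback (returning the counts N_C and N_T from the nearest neighbor classifier) and the one-point crossover are our assumptions.

```python
# Sketch of one GA generation: roulette wheel selection, one-point crossover
# with probability 0.7 and bit-flip mutation with probability 0.003 (Table 2).
import numpy as np

rng = np.random.default_rng(0)
POP, L, P_CROSS, P_MUT = 100, 230, 0.7, 0.003

def fitness(chrom, evaluate):
    nc, nt = evaluate(chrom)            # correctly classified / total tested
    return 10_000 * nc / nt + 0.4 * np.sum(chrom == 0)    # Eq. (17)

def next_generation(pop, fits):
    """pop: (POP, L) array of 0/1 ints; fits: positive fitness values."""
    idx = rng.choice(POP, size=POP, p=fits / fits.sum())  # roulette wheel
    new = pop[idx].copy()
    for k in range(0, POP - 1, 2):                        # pairwise crossover
        if rng.random() < P_CROSS:
            cut = rng.integers(1, L)
            tail_k = new[k, cut:].copy()
            new[k, cut:] = new[k + 1, cut:]
            new[k + 1, cut:] = tail_k
    new[rng.random((POP, L)) < P_MUT] ^= 1                # bit-flip mutation
    return new
```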

Table 3. Optimal selected orders and the corresponding selected repetitions

Selected orders (n)   Corresponding selected repetitions (m)
7                     2, 3, 4, 6
11                    1, 3, 6, 8, 9, 11
12                    2, 4, 5, 6, 8, 9, 10, 12
14                    0, 1, 3, 4, 6, 7, 8, 10, 13
15                    1, 2, 3, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15


However, the execution time of the feature selection step in the system with order-only selection is lower than that of the proposed system. Since feature selection is typically done off-line, the execution time of a specific algorithm matters much less than its ultimate classification performance, so we can say that the proposed face recognition system outperforms the other two systems. The SVM classifier with the ERBF kernel function also gives better test-phase results than the polynomial kernel function. These results are summarized in Tables 4 and 5.

Table 4. Recognition rate for the three different systems

                                 SVM (ERBF, σ = 3)             SVM (P, d = 2)
Face recognition system          Train set (%)  Test set (%)   Train set (%)  Test set (%)
Without feature selection [10]   100            86.4           100            83.6
With feature selection [21]      100            90.5           100            88.2
Proposed method                  100            93.7           100            91.7

Table 5. Experimental results

Face recognition system          PZMI selected orders       Number of feature   Classification          Execution time of
                                                            vector elements     recognition rate (%)    feature selection step (s)
Without feature selection [10]   3, 4, 5, 6, 7, 8, 9, 10    60                  86.4                    (a)
With feature selection [21]      5, 8, 11, 13, 15           57                  90.5                    1320
Proposed method                  7, 11, 12, 14, 15          40                  93.7                    1560

(a) Not defined.

4.1. Performance evaluation in noisy images

In this section, the sensitivity of the PZMI features to noise is investigated. For the sensitivity evaluation, white Gaussian noise with zero mean and four different variances was added to the original images, as shown in Fig. 9. Using the noise-free images as training samples and the noisy images as test samples, the performance of features with different maximum orders was tested. The results are tabulated in Table 6 and plotted in Fig. 10. Since including the high-order PZMI in many cases does not improve, or even degrades, the recognition rate, we can say that the high-order PZMI are more sensitive to noise. According to these results, the PZMI perform well in the presence of a moderate level of noise.

Fig. 9. Some samples of noisy images with different noise values.


Table 6. Sensitivity of different PZMI orders to noise

                              Recognition rate
Maximum order                 σ = 0 (%)   σ = 0.01 (%)   σ = 0.02 (%)   σ = 0.03 (%)   σ = 0.04 (%)
15 (135 feature elements)     94          86             82             81             79
13 (104 feature elements)     92          84             81             79             77
11 (77 feature elements)      89          81             78             78             76
9 (54 feature elements)       88          81             79             79             77
7 (35 feature elements)       86          82             81             81             80

Fig. 10. Recognition rate for noisy images with different noise levels (recognition rate (%) versus noise value σ, one curve per maximum order: 15, 13, 11, 9, 7).
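Finally, a sketch of the noise-sensitivity protocol of Section 4.1: train on clean images, test on copies corrupted with zero-mean white Gaussian noise of a given variance. The [0, 1] intensity range, the clipping and the classify callback are our assumptions.

```python
# Sketch of the Section 4.1 protocol: accuracy on Gaussian-noise-corrupted
# test images for one noise variance.
import numpy as np

rng = np.random.default_rng(0)

def noisy_accuracy(train_imgs, train_labels, test_imgs, test_labels,
                   variance, classify):
    noisy = [np.clip(im + rng.normal(0.0, np.sqrt(variance), im.shape), 0, 1)
             for im in test_imgs]
    pred = classify(train_imgs, train_labels, noisy)   # assumed callback
    return float(np.mean(np.asarray(pred) == np.asarray(test_labels)))

# e.g.: for var in (0.01, 0.02, 0.03, 0.04): print(var, noisy_accuracy(...))
```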

5. Conclusion

In this paper, we have presented a linear transformation invariant face recognition method based on optimal PZMI orders and repetitions. We first extracted PZMI features of orders 1 through n* = 20 and all corresponding repetitions. In the feature selection step, we used a GA to select the optimal PZMI orders and repetitions. Finally, an SVM, which has good generalization ability, was employed as the classifier. The noise sensitivity of the PZMI was investigated, and we conclude that the PZMI perform well in the presence of a moderate level of noise, while the high-order PZMI are more sensitive to noise. Through analysis of the distribution of the selected orders, we also infer that the high-order PZMI have more image representation ability and more discrimination capability for the classification stage, and that the number of selected repetitions corresponding to the selected orders increases with the order index.

Acknowledgement

This research is partially supported by the Iran Telecommunication Research Center (ITRC).

References

[1] R. Brunelli, T. Poggio, Face recognition: features versus templates, IEEE Trans. Pattern Anal. Mach. Intell. 15 (10) (1993) 1042-1052.
[2] S.G. Kong, J. Heo, B.R. Abidi, J.K. Paik, M.A. Abidi, Recent advances in visual and infrared face recognition - a review, Comput. Vis. Image Understand. 97 (1) (2005) 103-135.
[3] M. Turk, A. Pentland, Eigenfaces for recognition, J. Cognit. Neurosci. 3 (1991) 71-86.
[4] P.N. Belhumeur, J.P. Hespanha, D.J. Kriegman, Eigenfaces vs. Fisherfaces: recognition using class specific linear projection, IEEE Trans. Pattern Anal. Mach. Intell. 19 (1997) 711-720.
[5] L. Pessoa, A.P. Leitao, Complex cell prototype representation for face recognition, IEEE Trans. Neural Networks 10 (6) (1999) 1528-1531.
[6] Z.Q. Zhao, D.S. Huang, B.Y. Sun, Human face recognition based on multi-features using neural networks committee, Pattern Recogn. Lett. 25 (12) (2004) 1351-1358.
[7] L. Guo, D.S. Huang, Human face recognition based on radial basis probabilistic neural network, in: Proceedings of the International Joint Conference on Neural Networks (IJCNN 2003), 2003, pp. 2208-2211.
[8] X. Jia, M.S. Nixon, Extending the feature vector for automatic face recognition, IEEE Trans. Pattern Anal. Mach. Intell. 17 (12) (1995) 1167-1176.
[9] R. Mariani, Local invariants and local constraints for face recognition, ICPR 2 (2000) 949-952.
[10] J. Haddadnia, K. Faez, M. Ahmadi, An efficient human face recognition system using pseudo Zernike moment invariant and radial basis function neural network, IJPRAI 17 (1) (2003) 41-62.
[11] W.S. Chen, P.C. Yuen, J. Huang, D.Q. Dai, Kernel machine-based one-parameter regularized Fisher discriminant method for face recognition, IEEE Trans. Syst. Man Cybern. B 35 (4) (2005) 659-669.
[12] S. Mika, G. Rätsch, J. Weston, B. Schölkopf, K.R. Müller, Fisher discriminant analysis with kernels, in: Proceedings of the IEEE Workshop on Neural Networks for Signal Processing IX, 1999, pp. 41-48.


[13] S. Mika, G. Rätsch, J. Weston, B. Schölkopf, A.J. Smola, K.R. Müller, Invariant feature extraction and classification in feature spaces, in: S.A. Solla, T.K. Leen, K.R. Müller (Eds.), Advances in Neural Information Processing Systems, vol. 12, MIT Press, Cambridge, MA, 2000, pp. 526-532.
[14] L.F. Chen, H.Y. Liao, M.T. Ko, J.C. Lin, G.J. Yu, A new LDA-based face recognition system which can solve the small sample size problem, Pattern Recogn. 33 (10) (2000) 1713-1726.
[15] H. Yu, J. Yang, A direct LDA algorithm for high-dimensional data - with application to face recognition, Pattern Recogn. 34 (10) (2001) 2067-2070.
[16] D.Q. Dai, P.C. Yuen, Regularized discriminant analysis and its applications to face recognition, Pattern Recogn. 36 (3) (2003) 845-847.
[17] W.S. Chen, P.C. Yuen, J. Huang, A new regularized linear discriminant analysis method to solve small sample size problems, Int. J. Pattern Recogn. Artif. Intell. 19 (7) (2005) 917-935.
[18] A. Khotanzad, Y.H. Hong, Invariant image recognition by Zernike moments, IEEE Trans. Pattern Anal. Mach. Intell. 12 (5) (1990) 489-497.
[19] C.H. Teh, R.T. Chin, On image analysis by the methods of moments, IEEE Trans. Pattern Anal. Mach. Intell. 10 (4) (1988) 496-513.
[20] H.R. Kanan, K. Faez, PZMI and wavelet transform features in face recognition system using a new localization method, Proc. IECON (2005) 2690-2694.
[21] H.R. Kanan, K. Faez, M. Ezoji, Face recognition: an optimized localization approach and selected PZMI feature vector using SVM classifier, in: Proceedings of the International Conference on Intelligent Computing (ICIC 2006), LNCS 4113, 2006, pp. 690-696.
[22] M. Srinivas, L.M. Patnaik, Genetic algorithms: a survey, IEEE Computer, 1994.
[23] A.B. Bhatia, E. Wolf, Proc. Camb. Philos. Soc. 50 (1954) 40-48.
[24] W. Siedlecki, J. Sklansky, A note on genetic algorithms for large-scale feature selection, Pattern Recogn. Lett. 10 (5) (1989) 335-347.
[25] J. Yang, V. Honavar, Feature subset selection using a genetic algorithm, IEEE Intell. Syst. 13 (1998) 44-49.
[26] W.F. Punch, E.D. Goodman, L. Chia-Shun, M. Pei, P. Hovland, R. Enbody, Further research on feature selection and classification using genetic algorithms, in: Proceedings of the International Conference on Genetic Algorithms, 1993, pp. 557-564.
[27] M. Raymer, W. Punch, E. Goodman, L. Kuhn, A.K. Jain, Dimensionality reduction using genetic algorithms, IEEE Trans. Evol. Comput. 4 (2) (2000) 164-171.
[28] Z. Sun, G. Bebis, R. Miller, Object detection using feature subset selection, Pattern Recogn. 37 (11) (2004) 2165-2176.
[29] E. Osuna, R. Freund, F. Girosi, Support vector machines: training and applications, Technical Report, Massachusetts Institute of Technology, 1997.
[30] D.M. Blackburn, M. Bone, P.J. Phillips, Facial recognition vendor test 2000, 2001.