Neurocomputing 74 (2011) 3760–3767
Improved discriminant locality preserving projections for face and palmprint recognition

Jiwen Lu, Yap-Peng Tan

School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
Article history: Received 13 January 2011; received in revised form 16 April 2011; accepted 24 June 2011; available online 12 August 2011. Communicated by T. Heskes.

Abstract
We propose in this paper two improved manifold learning methods called diagonal discriminant locality preserving projections (Dia-DLPP) and weighted two-dimensional discriminant locality preserving projections (W2D-DLPP) for face and palmprint recognition. Motivated by the fact that diagonal images outperform the original images for conventional two-dimensional (2D) subspace learning methods such as 2D principal component analysis (2DPCA) and 2D linear discriminant analysis (2DLDA), we first propose applying diagonal images to the recently proposed 2D discriminant locality preserving projections (2D-DLPP) algorithm, and formulate the Dia-DLPP method for feature extraction of face and palmprint images. Moreover, we show that transforming an image to a diagonal image is equivalent to assigning an appropriate weight to each pixel of the original image to emphasize its different importance for recognition, which provides the rationale for the superiority of using diagonal images for 2D subspace learning. Inspired by this finding, we further propose a new discriminant weighting method to explicitly calculate the discriminative score of each pixel within a face or palmprint sample to duly emphasize its different importance, and incorporate it into 2D-DLPP to formulate the W2D-DLPP method, which improves the recognition performance of 2D-DLPP and Dia-DLPP. Experimental results on the widely used FERET face and PolyU palmprint databases demonstrate the efficacy of the proposed methods.
Keywords: Discriminant locality preserving projections (DLPP); Two-dimensional DLPP; Manifold learning; Face recognition; Palmprint recognition; Weighted
1. Introduction

Locality preserving projections (LPP) [1] is a recently proposed manifold learning method for pattern recognition and dimensionality reduction. It considers data samples residing on or near a nonlinear manifold and projects them into a low-dimensional subspace so that samples in a neighborhood are projected as close to one another as possible. As the local data structure is explicitly considered, LPP has demonstrated more discriminant power than several conventional subspace learning methods, such as principal component analysis (PCA) [2] and linear discriminant analysis (LDA) [3], for image recognition. Since LPP is originally unsupervised, Yu et al. [4] further exploited supervised information and derived a discriminant locality preserving projections (DLPP) algorithm to enhance the recognition performance.
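For reference, the locality preserving criterion that LPP optimizes can be stated compactly; the following is the standard formulation from [1], where the $x_i$ are the (vectorized) training samples, $S_{ij}$ is a neighborhood affinity, and $v$ is the sought projection direction:

$$\min_{v}\ \sum_{i,j}\bigl(v^{T}x_{i}-v^{T}x_{j}\bigr)^{2}S_{ij} \;=\; \min_{v}\ 2\,v^{T}XLX^{T}v, \qquad \text{s.t. } v^{T}XDX^{T}v = 1,$$

where $X = [x_1, \ldots, x_N]$, $L = D - S$ and $D_{ii} = \sum_j S_{ij}$; this reduces to the generalized eigenvalue problem $XLX^{T}v = \lambda XDX^{T}v$. The 2D variants discussed below replace the vectorized samples with image matrices.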
However, existing LPP and DLPP methods need to convert two-dimensional images lexicographically into one-dimensional vectors, an operation that inevitably compromises the spatial structure information of the original image features. The adverse impact of such compromise becomes more apparent in undersampled applications such as face and palmprint recognition. More recently, a new manifold learning algorithm called two-dimensional discriminant locality preserving projections (2D-DLPP) [5] has been proposed to address this problem. The main idea of 2D-DLPP is to seek the optimal projections directly from image matrices rather than vectors, using a DLPP-like optimization criterion. As 2D-DLPP does not need to perform image-to-vector conversion, it can better preserve the spatial structure information of the features and achieve higher recognition accuracy than DLPP [4] as well as several existing two-dimensional (2D) subspace learning methods, such as 2D principal component analysis (2DPCA) [6], 2D linear discriminant analysis (2DLDA) [7] and 2D locality preserving projections (2DLPP) [8,9]. However, the projections of 2D-DLPP mainly exploit the row direction of the image features and ignore the column direction. Exploiting discriminant information from both the row and column directions is desirable, as doing so has demonstrated better recognition performance in conventional subspace learning methods [10-15].

Motivated by the aforementioned observation and inspired by the work on such subspace learning methods as PCA, LDA and LPP [10,11,21], we first propose in this paper a new diagonal discriminant locality preserving projections (Dia-DLPP) method to extract discriminant information from both the row and column
directions for face and palmprint recognition. Different from the methods proposed in [10,11], which only used diagonal images in the training phase, we apply diagonal images in both the training and testing phases, and our empirical results show that the recognition performance can be further improved. Moreover, we also show that transforming an original image into a diagonal image is equivalent to assigning a different weight to each pixel of the original image to emphasize its different importance for recognition, which further explains the rationale and superiority of using diagonal images in existing 2D subspace learning methods [10,11]. Motivated by this finding, we further propose a new discriminant weighting method to explicitly calculate the discriminative score of each pixel within a face or palmprint sample to duly emphasize its different importance, and incorporate it into 2D-DLPP to formulate a weighted 2D-DLPP (W2D-DLPP) method to further improve the recognition performance of 2D-DLPP and Dia-DLPP. Experimental results on the widely used FERET face and PolyU palmprint databases are presented to demonstrate the efficacy of the proposed methods.

The remainder of this paper is organized as follows. Section 2 briefly reviews the diagonal image representation technique and presents the rationale for transforming an original image into a diagonal image. Section 3 presents the proposed Dia-DLPP method, and Section 4 proposes the W2D-DLPP method. Section 5 presents the experimental results, and Section 6 concludes the paper.
2. Diagonal image representation

Given an $m \times n$ image $X_i$, we can transform it into a diagonal image $X'_i$ as follows [10]:

(1) If the height $m$ is equal to or larger than the width $n$, offset the image columns progressively as illustrated in Fig. 1(a).
(2) If the height $m$ is smaller than the width $n$, offset the image rows as illustrated in Fig. 1(b).

Now, we show that transforming an original $m \times n$ image into the corresponding diagonal image is equivalent to assigning an appropriate weight to each pixel of the original image to highlight and emphasize its different importance.

Comparing the original image $X_i$ with its corresponding diagonal image $X'_i$, we can consider $X'_i$ as the original image preprocessed by an operator. To define this operator, we consider each pixel in $X'_i$ to be obtained by multiplying the corresponding pixel in the original image by a positive weight. That is,

$$X'_i = X_i \odot A \tag{1}$$

where $A$ is the weighting matrix and $\odot$ denotes element-wise multiplication. Furthermore,

(1) If $m \ge n$,

$$A = \begin{bmatrix} 1 & \dfrac{x_{22}}{x_{12}} & \cdots & \dfrac{x_{nn}}{x_{1n}} \\ 1 & \dfrac{x_{32}}{x_{22}} & \cdots & \dfrac{x_{(n+1)n}}{x_{2n}} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & \dfrac{x_{12}}{x_{m2}} & \cdots & \dfrac{x_{(n-1)n}}{x_{mn}} \end{bmatrix}$$

(2) If $m < n$,

$$A = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ \dfrac{x_{22}}{x_{21}} & \dfrac{x_{23}}{x_{22}} & \cdots & \dfrac{x_{21}}{x_{2n}} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{x_{mm}}{x_{m1}} & \dfrac{x_{m(m+1)}}{x_{m2}} & \cdots & \dfrac{x_{m(m-1)}}{x_{mn}} \end{bmatrix}$$

where $x_{ij}$ denotes the $(i,j)$ pixel of $X_i$ and the row (respectively, column) indices wrap around cyclically.
We can see that $A$ applies different weights to pixels around texture or edge regions. Fig. 2 shows five face images and the weighting matrices used to obtain the corresponding diagonal images. The figure shows that the weighting matrices emphasize the pixels of the face images differently. Specifically, important face parts such as the eyes and eyebrows are assigned larger weights, while other regions such as the cheek and forehead are assigned smaller weights. In other words, DiaPCA and DiaLDA can be regarded as weighted 2DPCA and 2DLDA approaches, respectively, in which the weighting matrix emphasizes the importance of different face parts for recognition. This provides a good rationale for why DiaPCA and DiaLDA outperform 2DPCA and 2DLDA, respectively, in addition to the empirical results provided in [10,11].
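To make the construction concrete, the following minimal NumPy sketch implements the offsetting of Fig. 1 with cyclic shifts and recovers the equivalent weighting matrix of Eq. (1); the function names and the small eps guard against zero-valued pixels are our own illustrative choices, not part of [10,11].

```python
import numpy as np

def diagonal_image(X):
    """Form the diagonal image by progressively offsetting columns
    (if m >= n) or rows (if m < n), as illustrated in Fig. 1."""
    m, n = X.shape
    Xd = np.empty_like(X)
    if m >= n:
        for j in range(n):                 # shift column j up by j pixels, cyclically
            Xd[:, j] = np.roll(X[:, j], -j)
    else:
        for i in range(m):                 # shift row i left by i pixels, cyclically
            Xd[i, :] = np.roll(X[i, :], -i)
    return Xd

def equivalent_weights(X, eps=1e-8):
    """Weighting matrix A such that X' = X (element-wise *) A, cf. Eq. (1)."""
    return diagonal_image(X) / (X + eps)   # eps guards against zero pixels
```

Applying equivalent_weights to a face image reproduces weighting patterns like those visualized in Fig. 2(b).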
Fig. 1. Deriving a diagonal image. (a) m ≥ n. (b) m < n.
Fig. 2. (a) Original face images. (b) Weighting matrices used to obtain the diagonal images.
3. Dia-DLPP
Table 1
Top recognition accuracy and corresponding reduced dimension obtained by performing training using the diagonal images, but performing recognition using the original images and the diagonal images, respectively, on the FERET database.

Method            Recognition using original images    Recognition using diagonal images
                  Acc. (%)        Dim.                 Acc. (%)        Dim.
DiaPCA            90.5            60 × 5               94.0            60 × 5
DiaPCA + 2DPCA    91.5            16 × 14              95.0            16 × 14

3.1. 2D-DLPP

Consider a training set consisting of N images $X_i$ of $m \times n$ pixels, where $i = 1, 2, \ldots, N$. 2D-DLPP minimizes an objective function defined as [5]

$$J = \frac{\sum_{s=1}^{C}\sum_{i,j=1}^{N_s} (Y_i^s - Y_j^s)^T (Y_i^s - Y_j^s)\, S_{ij}^s}{\sum_{i,j=1}^{C} (M_i - M_j)^T (M_i - M_j)\, P_{ij}} \tag{2}$$

where $Y_i^s$ and $Y_j^s$ denote the low-dimensional representations of $X_i$ and $X_j$ in the sth class, $M_i$ and $M_j$ are the mean samples of $Y$ in the ith and jth classes, respectively, $C$ is the number of classes, $N_s$ denotes the number of training samples in the sth class, and $S_{ij}^s$ and $P_{ij}$ are two affinity matrices, defined as

$$S_{ij}^s = \begin{cases} \exp\left(-\dfrac{\|X_i^s - X_j^s\|^2}{t_1}\right) & \text{if } L_{X_i} = L_{X_j} = s \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

and

$$P_{ij} = \exp\left(-\dfrac{\|F_i - F_j\|^2}{t_2}\right) \tag{4}$$

where $F_i = (1/N_i)\sum_{k=1}^{N_i} X_k^i$ and $F_j = (1/N_j)\sum_{k=1}^{N_j} X_k^j$ are the mean samples of $X_k$ in the ith and jth classes, $L_{X_i}$ and $L_{X_j}$ are the class labels of $X_i$ and $X_j$, and $t_1$ and $t_2$ are two empirically pre-defined parameters.

Let $V$ be the transformation matrix and $Y_i = X_i V$, $i = 1, 2, \ldots, N$. By simple algebraic manipulations as shown in [5], one can reduce the numerator and denominator of (2) to $\tfrac{1}{2} V^T X^T L X V$ and $\tfrac{1}{2} V^T F^T H F V$, respectively, where $X^T = [X_1^T, X_2^T, \ldots, X_N^T]$ is an $mN \times n$ matrix obtained by arranging all the training images in column form, $L = D - S$ is the Laplacian matrix with $D$ a diagonal matrix whose elements are $D_{ii} = \sum_j S_{ij}$, $F^T = [F_1^T, F_2^T, \ldots, F_C^T]$, $H = E - P$, and $E_{ii} = \sum_j P_{ji}$. The projections of 2D-DLPP can then be obtained by solving the following generalized eigenvalue problem:

$$X^T L X v = \lambda F^T H F v \tag{5}$$

As the matrices $X^T L X$ and $F^T H F$ are both symmetric and positive semidefinite, the eigenvalues obtained from (5) are no smaller than zero. Let $v_1, v_2, \ldots, v_d$ be the eigenvectors of (5) corresponding to the $d$ smallest eigenvalues, ordered so that $0 \le \lambda_1 \le \lambda_2 \le \cdots \le \lambda_d$. An $n \times d$ transformation matrix $V = [v_1, v_2, \ldots, v_d]$ can then be obtained to project each $m \times n$ image $X_i$ into an $m \times d$ feature matrix $Y_i$, as follows:

$$Y_i = X_i V, \quad i = 1, 2, \ldots, N \tag{6}$$
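As a concrete illustration of how (2)-(6) can be turned into a computation, the following NumPy/SciPy sketch assembles and solves the generalized eigenproblem (5). The Kronecker-product formulation of the block sums and the small ridge term reg (added to keep $F^T H F$ invertible) are our implementation choices, not part of [5]; the name dlpp_2d is hypothetical.

```python
import numpy as np
from scipy.linalg import eigh

def dlpp_2d(X, labels, t1=1.0, t2=1.0, d=5, reg=1e-6):
    """2D-DLPP sketch. X: (N, m, n) stack of images; returns the n x d
    projection matrix V of Eq. (6) by solving Eq. (5)."""
    N, m, n = X.shape
    classes = np.unique(labels)
    flat = X.reshape(N, -1)

    # Within-class affinities S (Eq. 3): heat kernel over same-class pairs
    S = np.zeros((N, N))
    for s in classes:
        idx = np.where(labels == s)[0]
        for i in idx:
            for j in idx:
                S[i, j] = np.exp(-np.sum((flat[i] - flat[j]) ** 2) / t1)

    # Class-mean affinities P (Eq. 4)
    F = np.stack([X[labels == s].mean(axis=0) for s in classes])  # (C, m, n)
    Fm = F.reshape(len(classes), -1)
    P = np.exp(-((Fm[:, None] - Fm[None]) ** 2).sum(-1) / t2)

    L = np.diag(S.sum(axis=1)) - S        # Laplacian L = D - S
    H = np.diag(P.sum(axis=0)) - P        # H = E - P with E_ii = sum_j P_ji

    Xs = X.reshape(N * m, n)              # images stacked into an mN x n matrix
    Fs = F.reshape(-1, n)                 # class means stacked likewise

    # X^T L X and F^T H F, with L and H acting on the m-row blocks
    A = Xs.T @ np.kron(L, np.eye(m)) @ Xs
    B = Fs.T @ np.kron(H, np.eye(m)) @ Fs

    # eigenvectors of the d smallest generalized eigenvalues (Eq. 5)
    _, V = eigh(A, B + reg * np.eye(n))
    return V[:, :d]
```

The double loop and the Kronecker products are adequate for the small training sets used in the experiments below; a large-scale implementation would compute the block sums directly.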
3.2. Dia-DLPP
As 2D-DLPP mainly extracts discriminant information from the row direction, some structure information, such as the nose and eyes in a face as well as the principal lines in a palmprint, may not be fully exploited. To address this shortcoming, we transform each image into a diagonal image by offsetting the columns or rows of the image progressively, as suggested for PCA and LDA in [10,11] and illustrated in Fig. 1. In this way, the rows or columns of the diagonal image contain both the row and column information of the original image, and the spatial structures can be duly emphasized through the differences across columns or rows after offsetting.

It should be noted that the methods proposed in [10,11] only apply diagonal images in the learning (training) phase but not in the recognition (testing) phase. As such, the discriminant information available in both the row and column directions of the diagonal images may not be fully exploited, and the performance could be compromised during recognition. To illustrate this point, we performed DiaPCA on a subset of the FERET face database containing 400 gray-level frontal images (60 × 60 pixels each) of 200 subjects (i.e., two images per subject). We selected one image of each subject for training and the other image for testing. Table 1 lists the recognition accuracies obtained by the DiaPCA and
DiaPCA + 2DPCA methods as proposed in [10]. We can see that the accuracies obtained by performing training on the diagonal images but recognition on the original images are the same as those reported in [10]. However, performing both training and recognition using the diagonal images, as we propose, can further improve the accuracies by 3-4%.

Based on the above reasons, our proposed Dia-DLPP algorithm works as follows:

Step 1: Transform each $m \times n$ image $X_i$ in the training set into a diagonal image $X'_i$. If $m \ge n$, offset the image columns progressively as illustrated in Fig. 1(a); otherwise, offset the image rows as illustrated in Fig. 1(b).

Step 2: Perform 2D-DLPP on the diagonal training images $X'_i$, $i = 1, 2, \ldots, N$, to obtain the optimal projection matrix $V$ by solving (5). Then use (6) to project each training image $X'_i$ onto $V$ to obtain its $m \times d$ feature matrix $Y'_i$.

Step 3: For each testing image $X_j$ (or $X'_j$), use (6) to project it onto $V$ to obtain its $m \times d$ feature matrix $W$, where $W = W_j = X_j V$ when the original image is used and $W = W'_j = X'_j V$ when the diagonal image is used for recognition. (If the original image is used for recognition, the testing image is denoted as $X_j$; otherwise, it is denoted as $X'_j$.)

Step 4: Perform recognition using a simple nearest neighbor classifier

$$c = \arg\min_i d(W, Y'_i) \tag{7}$$

based on the Euclidean distance between $W$ and $Y'_i$, defined as

$$d(W, Y'_i) = \|W - Y'_i\|_2 = \left(\sum_{x=1}^{m}\sum_{y=1}^{d} \bigl(W(x,y) - Y'_i(x,y)\bigr)^2\right)^{1/2} \tag{8}$$
where $W(x,y)$ and $Y'_i(x,y)$ denote the $(x,y)$ elements of the matrices $W$ and $Y'_i$, respectively.

3.3. Dia-DLPP + 2D-DLPP

Although Dia-DLPP can obtain better recognition accuracy than 2D-DLPP, the two methods share a common shortcoming: both require more coefficients for efficient feature representation and recognition than DLPP [4]. To circumvent this shortcoming, we can combine the proposed Dia-DLPP method with the 2D-DLPP method to obtain a better recognition performance, as follows (see the sketch after these steps):

Step 1: Perform Dia-DLPP in the row direction of the training image set $X'_i$ to obtain an $n \times d$ transformation matrix $V$, and transform each image $X'_i$ to $Y'_i$ using $Y'_i = X'_i V$.

Step 2: Perform 2D-DLPP in the column direction of the transformed image set $Y'_i$ to obtain a $q \times m$ transformation matrix $U$, and transform each image $Y'_i$ to $Y''_i$ using $Y''_i = U Y'_i$.

Step 3: For each testing image $X_j$ (or $X'_j$), project it onto $V$ first to obtain $Y_j = X_j V$ (or $Y'_j = X'_j V$), and then project $Y_j$ (or $Y'_j$) onto $U$ to obtain a $q \times d$ feature matrix $Z_j = U Y_j$ (or $Z'_j = U Y'_j$).

Step 4: Perform recognition using the simple nearest neighbor classifier based on the Euclidean distance in the projected feature space, similar to (7) and (8).
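The projection and matching stages of both algorithms reduce to a few matrix products and a Frobenius-distance nearest neighbor search, as in the following sketch. The helper names are ours, and $V$ and $U$ are assumed to have been obtained by solving (5) (e.g., with the dlpp_2d sketch above, applied to transposed images for the column direction).

```python
import numpy as np

def project(X, V, U=None):
    """Dia-DLPP / Dia-DLPP + 2D-DLPP projection: Y = X V, and
    optionally Z = U Y when a column-direction transform U is used."""
    Y = X @ V
    return U @ Y if U is not None else Y

def classify(W, gallery, labels):
    """Eqs. (7)-(8): nearest neighbor under the Euclidean (Frobenius)
    distance between feature matrices."""
    dists = [np.linalg.norm(W - Yi) for Yi in gallery]
    return labels[int(np.argmin(dists))]
```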
4. W2D-DLPP

It is likely that pixels at different spatial positions of a face sample have different discriminative power for face recognition. Hence, pixels with different values and spatial locations could be assigned different weights for discrimination: pixels at important facial features such as the eyes, mouth,
nose and chin could be emphasized, and others, such as the cheek and forehead, could be deemphasized. While each pixel of the original images is emphasized (or deemphasized) with a different weight in the diagonal images, as shown in Fig. 2(b), the weighting matrices used to obtain the diagonal images cannot explicitly depict the different discriminant importance of each pixel in the original images. We can see from this figure that some useful information, such as the nose region of each face image, is deemphasized in these weighting matrices. This is because the weighting matrices used in the diagonal images are rather simple and coarse, and some useful information has been compromised in them. Moreover, the discriminant information of each pixel in the original images may not be fully exploited, as these weighting matrices are derived in an unsupervised manner and the discriminant power of each pixel is not explicitly exploited.

Based on the above observations, we propose the following discriminant weighting algorithm to exploit the discriminant information of different face and palmprint parts for recognition (a code sketch of this procedure is given at the end of this section):

(1) Apply a feature selection scheme based on a graph embedding (GE) criterion [16], described below, to calculate the discriminant capability of each pixel in the face and palmprint samples. Let $f_{ri}$ denote the rth pixel of the ith sample $x_i$, where $i = 1, 2, \ldots, N$ and $r = 1, 2, \ldots, d$, and let $L_r$ denote the GE score of the rth feature. We first construct two graphs, $G^I$ and $G^P$, to describe the locality structure of the data set. For each sample $x_i$, we find its $k_1$ nearest neighbors of the same class and its $k_2$ nearest neighbors of different classes, and put an edge between $x_i$ and its neighbors, as follows:

$$G^I_{ij} = \begin{cases} 1 & \text{if } x_i \in N^+_{k_1}(x_j) \text{ or } x_j \in N^+_{k_1}(x_i) \\ 0 & \text{otherwise} \end{cases} \tag{9}$$

$$G^P_{ij} = \begin{cases} 1 & \text{if } x_i \in N^-_{k_2}(x_j) \text{ or } x_j \in N^-_{k_2}(x_i) \\ 0 & \text{otherwise} \end{cases} \tag{10}$$

where $N^+_{k_1}(x_j)$ and $N^+_{k_1}(x_i)$ denote the $k_1$ nearest neighbors of the samples $x_j$ and $x_i$ in the same class, $N^-_{k_2}(x_j)$ and $N^-_{k_2}(x_i)$ denote the $k_2$ nearest neighbors of the samples $x_j$ and $x_i$ in different classes, and $k_1$ and $k_2$ are two empirically pre-specified parameters.

(2) Define two diagonal matrices $D^I$ and $D^P$, where $D^I_{ii} = \sum_j G^I_{ij}$ and $D^P_{ii} = \sum_j G^P_{ij}$, and compute the GE matrices $H^I = D^I - G^I$ and $H^P = D^P - G^P$. Then, calculate the GE score for the rth pixel:

$$L_r = \frac{f_r^T H^I f_r}{f_r^T H^P f_r} \tag{11}$$

where $f_r = [f_{r1}, f_{r2}, \ldots, f_{rN}]$ and $r = 1, 2, \ldots, d$.

(3) For each position $(i, j)$ in an image sample, calculate its GE score $A_{ij} = L(r)$, where $r = (i-1)\,n + j$ and the operator $[c]$ retains the integer part of $c$.

(4) Normalize each element of $A(i,j)$ to the range [0, 1] to obtain the weighting matrix $A$ as follows:

$$A(i,j) = \frac{A(i,j)}{\max(A) - \min(A)} \tag{12}$$
where $\max(A)$ and $\min(A)$ denote the maximum and minimum weights of $A$, respectively.

We applied the proposed weighting method to the FERET face and PolyU palmprint databases to derive the corresponding weighting matrices. We selected four images each of 200 subjects from the FERET database and three images each of 100
subjects from the PolyU database to learn the weighting matrices. Fig. 3 shows the weighting matrices of the FERET face and PolyU palmprint databases, respectively. Comparing Fig. 3 with Fig. 2(b), we make the following two observations:

(1) Our proposed weighting algorithm depicts the semantic features better than the diagonal image representation. For example, face parts such as the eyes, mouth, nose and chin have higher discriminative scores than other parts. Similarly, for the palmprint images, the principal lines have higher discriminative scores than other regions.

(2) Our proposed weighting method calculates the weighting matrices from all the training samples, whereas the weighting matrix of a diagonal image is calculated from only one image. Hence, more discriminant information is exploited by our proposed weighting method.

Now, we present the proposed W2D-DLPP algorithm, which integrates the discriminant weighting method into 2D-DLPP, as follows:

Step 1: Perform element-wise multiplication on each $m \times n$ training image $X_i$ to obtain a weighted image $Y_i$:
$$Y_i = X_i \odot A \tag{13}$$
where $A$ is the weighting matrix derived by the above discriminant weighting method.

Step 2: Perform 2D-DLPP in the column direction of the image set $Y_i$ to obtain a $q \times m$ transformation matrix $U$, and transform each image $Y_i$ to $Y'_i$ using $Y'_i = U Y_i$.

Step 3: Perform 2D-DLPP in the row direction of the image set $Y'_i$ to obtain an $n \times p$ transformation matrix $V$, and transform each image $Y'_i$ to $Y''_i$ using $Y''_i = Y'_i V$.

Step 4: For each testing image $X_j$, use (13) to obtain a weighted image $Y_j$.

Step 5: Project $Y_j$ onto $U$ to obtain its $q \times n$ feature matrix $Y'_j$.

Step 6: Project $Y'_j$ onto $V$ to obtain its $q \times p$ feature matrix $Y''_j$.

Step 7: Perform recognition using the simple nearest neighbor classifier based on the Euclidean distance in the projected feature space, similar to (7) and (8).
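A compact NumPy sketch of the weighting computation in steps (1)-(4) follows; the brute-force neighbor search and the eps guards are our own choices, and ge_weighting_matrix is a hypothetical helper name. The returned $A$ is then applied through Eq. (13) before the two 2D-DLPP passes.

```python
import numpy as np

def ge_weighting_matrix(X, labels, k1=3, k2=3, eps=1e-8):
    """Discriminant weighting of Eqs. (9)-(12): per-pixel GE scores from
    within-class (G_I) and between-class (G_P) neighborhood graphs."""
    N, m, n = X.shape
    flat = X.reshape(N, -1)
    d2 = ((flat[:, None] - flat[None]) ** 2).sum(-1)   # pairwise squared distances

    GI, GP = np.zeros((N, N)), np.zeros((N, N))
    for i in range(N):
        same = np.where(labels == labels[i])[0]
        same = same[same != i]
        diff = np.where(labels != labels[i])[0]
        for j in same[np.argsort(d2[i, same])][:k1]:   # k1 same-class neighbors, Eq. (9)
            GI[i, j] = GI[j, i] = 1
        for j in diff[np.argsort(d2[i, diff])][:k2]:   # k2 different-class neighbors, Eq. (10)
            GP[i, j] = GP[j, i] = 1

    HI = np.diag(GI.sum(1)) - GI                       # H_I = D_I - G_I
    HP = np.diag(GP.sum(1)) - GP                       # H_P = D_P - G_P

    F = flat.T                                         # row r: pixel r across all samples
    num = np.einsum('ri,ij,rj->r', F, HI, F)           # f_r^T H_I f_r
    den = np.einsum('ri,ij,rj->r', F, HP, F)           # f_r^T H_P f_r
    L = num / (den + eps)                              # GE scores, Eq. (11)

    A = L.reshape(m, n)                                # A_ij = L((i-1) n + j)
    return A / (A.max() - A.min() + eps)               # normalization, Eq. (12)
```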
Fig. 3. Visualization of the discriminative weighting matrices of (a) the FERET face database and (b) the PolyU palmprint database.
5. Experimental results

We have tested the proposed Dia-DLPP, Dia-DLPP + 2D-DLPP and W2D-DLPP methods on two widely used benchmark biometric databases, namely the FERET face database [17,18] and the PolyU palmprint database [19,20]. We have also compared the proposed methods with existing vector-based feature extraction algorithms, including PCA [2], LDA [3], LPP [1] and DLPP [4], and several recently proposed matrix-based feature extraction methods, such as 2DPCA [6], 2DLDA [7], 2DLPP [8], 2D-DLPP [5], DiaPCA [10], DiaLDA [11], DiaLPP [21], SDiaLPP [21], (2D)²PCA [12], and (2D)²LDA [13]. Moreover, we have also applied our proposed weighting scheme to 2DPCA and 2DLDA to formulate the corresponding weighted 2DPCA (W-2DPCA) and weighted 2DLDA (W-2DLDA), and evaluated their performance in our face and palmprint experiments. All the results reported here were obtained by using the best-tuned parameters of the respective methods for a fair comparison. Assuming there are k images per subject in each training set, we randomly selected k−1 samples to learn the subspace and used the remaining sample per subject as a validation set to tune the optimal parameters of each subspace method. Then, we used these parameters for recognition.

5.1. Results on the FERET database

The FERET database comprises 13,539 face images acquired from 1565 subjects [17,18]. In our experiments, we selected a subset of 1400 face images of 200 subjects from the FERET face database, i.e., seven face images per subject, which are marked with the two-character strings 'ba', 'bj', 'bk', 'be', 'bf', 'bd' and 'bg' to denote different face orientations. The facial part of each image was manually cropped and aligned to 64 × 64 pixels according to the eye positions. Fig. 4 shows the seven images of one subject.

For each subject, k (k = 3, 4) images were randomly selected for training and the rest were used for testing. For each k, we calculated the average recognition accuracy and standard deviation over 20 random splits of training and testing sets. The top recognition accuracy achieved by each method and the corresponding dimension of the reduced space are listed in Table 2. Note that the accuracies denoted as Dia-DLPP (org) and Dia-DLPP + 2D-DLPP (org) in Table 2 were obtained by using the proposed methods to perform recognition on the original images instead of the diagonal images. We can easily see from this table that the proposed Dia-DLPP, Dia-DLPP + 2D-DLPP and W2D-DLPP methods perform better than the other methods under comparison.

Then, we compared the recognition performance of five DLPP-based feature extraction methods (DLPP, 2D-DLPP, Dia-DLPP, Dia-DLPP + 2D-DLPP, and W2D-DLPP) under different feature dimensions. For a fair comparison, we applied PCA to the 2D-DLPP and Dia-DLPP methods and performed recognition under the same dimension as DLPP, Dia-DLPP + 2D-DLPP and W2D-DLPP. For the Dia-DLPP and Dia-DLPP + 2D-DLPP methods, we applied the diagonal images in both the training and testing phases. Fig. 5 shows their performance when three samples of each subject were randomly selected as the training set and the rest as the testing set. We can easily see from Table 2 and Fig. 5 that the proposed Dia-DLPP, Dia-DLPP + 2D-DLPP and W2D-DLPP
Fig. 4. Seven images of one subject from the FERET database.
methods perform better than the other methods under comparison. Moreover, W2D-DLPP always obtains the best recognition performance in terms of recognition accuracy.

Table 2
Recognition performance comparison on the FERET database (mean ± std).

Method                        k = 3                      k = 4
                              Acc. (%)      Dim.         Acc. (%)      Dim.
PCA [2]                       44.4 ± 3.3    300          49.7 ± 3.2    500
LDA [3]                       74.5 ± 3.6    190          82.9 ± 3.1    199
LPP [1]                       76.4 ± 3.1    199          86.8 ± 2.9    199
DLPP [4]                      79.7 ± 2.8    199          89.9 ± 2.8    199
2DPCA [6]                     48.3 ± 3.6    64 × 6       50.8 ± 3.7    64 × 6
2DLDA [7]                     76.8 ± 3.3    64 × 5       84.5 ± 3.4    64 × 5
2DLPP [8]                     78.4 ± 2.8    64 × 5       87.9 ± 2.3    64 × 5
2D-DLPP [5]                   81.6 ± 2.2    64 × 4       91.1 ± 2.2    64 × 5
DiaPCA [10]                   53.5 ± 2.3    64 × 5       56.4 ± 2.1    64 × 6
DiaLDA [11]                   78.7 ± 3.3    64 × 5       87.3 ± 3.4    64 × 4
DiaLPP [21]                   82.2 ± 2.6    64 × 5       89.6 ± 2.8    64 × 5
SDiaLPP [21]                  81.9 ± 2.8    64 × 5       89.3 ± 2.5    64 × 5
W-2DPCA                       58.6 ± 3.3    64 × 6       68.8 ± 3.5    64 × 6
W-2DLDA                       80.8 ± 3.1    64 × 5       87.5 ± 3.2    64 × 5
(2D)² PCA                     60.2 ± 2.1    8 × 7        69.3 ± 2.3    8 × 7
(2D)² LDA                     80.3 ± 2.0    7 × 7        89.2 ± 1.7    8 × 7
Dia-DLPP (org)                83.8 ± 2.1    64 × 4       92.1 ± 1.7    64 × 5
Dia-DLPP + 2D-DLPP (org)      84.3 ± 2.1    7 × 7        92.2 ± 1.4    8 × 7
Dia-DLPP                      85.1 ± 1.9    64 × 4       92.4 ± 1.3    64 × 5
Dia-DLPP + 2D-DLPP            86.4 ± 1.9    7 × 6        93.5 ± 1.3    7 × 6
W2D-DLPP                      87.8 ± 1.8    7 × 7        94.7 ± 1.2    7 × 7

Fig. 5. Recognition accuracies on the FERET database under different feature dimensions.

5.2. Results on the PolyU database

The PolyU palmprint database contains 600 gray images of 100 different palms, with six samples from each palm [19,20]. They were collected in two sessions separated by two months: three
samples in the first session and another three in the second session. In our experiments, the central part of each palmprint image was cropped and aligned using an algorithm similar to that of [20]. Each aligned image was then resized to 64 × 64 pixels and processed by histogram equalization. Fig. 6 shows the six cropped and aligned palmprint images of one palm.

Similar to the previous experiments, we randomly selected k (k = 2, 3, 4) images of each palm to form the training set and used the rest to form the testing set. For each k, we calculated the recognition results over 20 random splits of training and testing sets. Table 3 lists the top recognition performance of all the evaluated methods, including the accuracies and standard deviations obtained by applying the proposed methods to the original images instead of the diagonal images. The results show that the proposed Dia-DLPP, Dia-DLPP + 2D-DLPP and W2D-DLPP methods consistently outperform the other methods under comparison.

Similar to the previous experiments, we also compared the recognition performance of the DLPP, 2D-DLPP, Dia-DLPP, Dia-DLPP + 2D-DLPP, and W2D-DLPP methods under different feature dimensions. Fig. 7 shows their performance when four samples of each subject were randomly selected as the training set and the rest as the testing set. The results show that the proposed Dia-DLPP, Dia-DLPP + 2D-DLPP and W2D-DLPP methods consistently outperform the other methods under comparison, and W2D-DLPP also always obtains the best recognition performance in terms of recognition accuracy.
Fig. 6. Six images of one palm from the PolyU database.

Table 3
Recognition performance comparison on the PolyU database (mean ± std).

Method                        k = 2                     k = 3                     k = 4
                              Acc. (%)     Dim.         Acc. (%)     Dim.         Acc. (%)     Dim.
PCA [2]                       70.2 ± 1.0   199          78.4 ± 1.0   280          90.6 ± 1.0   350
LDA [3]                       72.1 ± 1.1   91           84.4 ± 1.1   99           90.9 ± 0.9   99
LPP [1]                       72.5 ± 0.9   91           84.8 ± 1.1   99           91.4 ± 0.9   99
DLPP [4]                      73.6 ± 0.8   91           85.3 ± 1.0   99           91.7 ± 0.9   99
2DPCA [6]                     76.7 ± 0.8   64 × 6       84.5 ± 0.9   64 × 7       91.2 ± 0.7   64 × 7
2DLDA [7]                     77.6 ± 0.7   64 × 6       85.6 ± 0.9   64 × 6       91.5 ± 0.7   64 × 6
2DLPP [8]                     77.8 ± 0.7   64 × 4       86.2 ± 0.8   64 × 5       91.7 ± 0.7   64 × 5
2D-DLPP [5]                   78.2 ± 0.6   64 × 5       86.6 ± 0.7   64 × 5       92.2 ± 0.5   64 × 5
DiaPCA [10]                   77.5 ± 0.7   64 × 6       86.2 ± 0.6   64 × 6       91.4 ± 0.5   64 × 5
DiaLDA [11]                   77.9 ± 0.7   64 × 6       86.5 ± 0.6   64 × 4       91.8 ± 0.4   64 × 5
DiaLPP [21]                   78.6 ± 0.7   64 × 6       86.3 ± 0.7   64 × 4       91.4 ± 0.5   64 × 5
SDiaLPP [21]                  78.2 ± 0.7   64 × 6       86.2 ± 0.7   64 × 4       91.2 ± 0.4   64 × 5
W-2DPCA                       78.2 ± 0.7   64 × 6       86.4 ± 0.8   64 × 7       91.6 ± 0.7   64 × 7
W-2DLDA                       78.4 ± 0.7   64 × 6       86.4 ± 0.9   64 × 6       92.3 ± 0.7   64 × 6
(2D)² PCA                     77.4 ± 0.7   6 × 6        84.9 ± 0.6   6 × 4        91.4 ± 0.4   6 × 5
(2D)² LDA                     77.9 ± 0.7   6 × 6        85.9 ± 0.6   6 × 5        91.7 ± 0.4   6 × 5
Dia-DLPP (org)                78.4 ± 0.6   64 × 5       86.6 ± 0.6   64 × 5       92.8 ± 0.4   64 × 6
Dia-DLPP + 2D-DLPP (org)      78.6 ± 0.5   6 × 6        87.4 ± 0.5   6 × 6        92.9 ± 0.4   7 × 7
Dia-DLPP                      79.3 ± 0.5   64 × 5       87.9 ± 0.5   64 × 5       93.1 ± 0.3   64 × 6
Dia-DLPP + 2D-DLPP            80.2 ± 0.4   6 × 6        88.5 ± 0.4   6 × 6        93.6 ± 0.3   7 × 7
W2D-DLPP                      81.6 ± 0.3   6 × 6        89.8 ± 0.3   6 × 6        94.9 ± 0.3   7 × 7

Fig. 7. Recognition accuracies on the PolyU database under different feature dimensions.

5.3. Discussion

We draw the following five key observations from the experimental results:

1. W2D-DLPP, Dia-DLPP + 2D-DLPP, Dia-DLPP and 2D-DLPP consistently outperform DLPP, which implies that the 2D-based approaches are more effective than the 1D-based approach in preserving the spatial structure information for recognition.

2. W2D-DLPP, Dia-DLPP + 2D-DLPP and Dia-DLPP consistently outperform 2D-DLPP, especially when the number of training samples is small. The reason is that when the training samples are few, the locality information of each class is weak, and the structure information preserved by the Dia-DLPP and Dia-DLPP + 2D-DLPP methods has more influence on feature extraction and recognition. When the number of training samples increases, both the locality information and the structure information become important.

3. W2D-DLPP consistently outperforms Dia-DLPP + 2D-DLPP and Dia-DLPP. This is because the proposed discriminant weighting method utilizes the different discriminative information of each pixel at different positions of the images, so more discriminant information can be exploited than with the diagonal image representation, and higher recognition accuracy can be attained.

4. Dia-DLPP outperforms DiaPCA, DiaLDA, DiaLPP and SDiaLPP, which implies that preserving both the locality information and the discriminative information of the samples is more useful in improving the recognition performance.

5. Dia-DLPP and Dia-DLPP + 2D-DLPP attain larger gains when performing recognition using the diagonal images rather than the original images as suggested in [10,11]. This finding shows that diagonal images not only enhance the feature extraction performance, they also improve the accuracy when applied for recognition.

6. Conclusion

We have proposed in this paper two improved manifold learning methods called diagonal discriminant locality preserving projections (Dia-DLPP) and weighted two-dimensional discriminant locality preserving projections (W2D-DLPP) for face and palmprint recognition. The proposed Dia-DLPP method first generates diagonal images and then performs 2D-DLPP on these diagonal images to find the optimal projections for both feature extraction and recognition. We have also shown that transforming an image into a diagonal image is equivalent to assigning a different weight to each pixel of the original image to emphasize its different importance for recognition, which explains the rationale and superiority of using diagonal images in 2D subspace learning methods. Moreover, we have proposed a new discriminant weighting method to explicitly calculate the discriminative score of each pixel within a face or palmprint sample to duly emphasize its different importance, and incorporated it into 2D-DLPP to formulate the W2D-DLPP method to further improve the face and palmprint recognition performance. Experimental results on the widely used FERET face and PolyU palmprint databases have demonstrated the effectiveness of the proposed methods.
Acknowledgment

The authors would like to express their thanks to the associate editor and anonymous reviewers for their supportive comments and useful suggestions that have helped us to improve the quality of the paper. They would also like to thank The Hong Kong Polytechnic University for providing the PolyU Palmprint Database. Portions of the research in this paper use the FERET database of facial images collected under the FERET program.
References

[1] X. He, S. Yan, Y. Hu, P. Niyogi, H.-J. Zhang, Face recognition using Laplacianfaces, IEEE Transactions on Pattern Analysis and Machine Intelligence 27 (3) (2005) 328-340.
[2] M. Turk, A. Pentland, Eigenfaces for recognition, Journal of Cognitive Neuroscience 3 (1) (1991) 71-86.
[3] P.N. Belhumeur, J.P. Hespanha, D.J. Kriegman, Eigenfaces vs. Fisherfaces: recognition using class specific linear projection, IEEE Transactions on Pattern Analysis and Machine Intelligence 19 (7) (1997) 711-720.
[4] W. Yu, X. Teng, C. Liu, Face recognition using discriminant locality preserving projections, Image and Vision Computing 24 (2006) 239-248.
[5] R. Zhi, Q. Ruan, Facial expression recognition based on two-dimensional discriminant locality preserving projections, Neurocomputing 71 (2008) 1730-1734.
[6] J. Yang, D. Zhang, A.F. Frangi, J.-Y. Yang, Two-dimensional PCA: a new approach to appearance-based face representation and recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 26 (1) (2004) 131-137.
[7] H. Xiong, M.N.S. Swamy, M.O. Ahmad, Two-dimensional FLD for face recognition, Pattern Recognition 38 (7) (2005) 1121-1124.
[8] D. Hu, G. Feng, Z. Zhou, Two-dimensional locality preserving projections (2DLPP) with its application to palmprint recognition, Pattern Recognition 40 (1) (2007) 339-342.
[9] B. Niu, Q. Yang, C.K. Shiu, S.K. Pal, Two dimensional Laplacianfaces method for face recognition, Pattern Recognition 41 (2008) 3237-3243.
[10] D. Zhang, Z.-H. Zhou, S. Chen, Diagonal principal component analysis for face recognition, Pattern Recognition 39 (2006) 140-142.
[11] S. Noushatha, G.H. Kumara, P. Shivakumara, Diagonal Fisher linear discriminant analysis for efficient face recognition, Neurocomputing 69 (2006) 1711-1716.
[12] D. Zhang, Z.H. Zhou, (2D)² PCA: two-directional two-dimensional PCA for efficient face representation and recognition, Neurocomputing 69 (2005) 224-231.
[13] P. Nagabhushan, D.S. Guru, B.H. Shekar, (2D)² FLD: an efficient approach for appearance based object recognition, Neurocomputing 69 (2006) 934-940.
[14] S.-B. Chen, B. Luo, G.-P. Hu, R.-H. Wang, Bilateral two-dimensional locality preserving projections, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 2007, pp. 601-604.
[15] J. Lu, Y.-P. Tan, Two-directional two-dimensional discriminant locality preserving projections for image recognition, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, 2009, pp. 1753-1756.
[16] S. Yan, D. Xu, B. Zhang, H.J. Zhang, Q. Yang, S. Lin, Graph embedding and extensions: a general framework for dimensionality reduction, IEEE Transactions on Pattern Analysis and Machine Intelligence 29 (1) (2007) 40-51.
[17] P.J. Phillips, H. Wechsler, J. Huang, P. Rauss, The FERET database and evaluation procedure for face-recognition algorithms, Image and Vision Computing 16 (1998) 295-306.
[18] P.J. Phillips, H. Moon, P.J. Rauss, S. Rizvi, The FERET evaluation methodology for face recognition algorithms, IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (10) (2000).
[19] The PolyU palmprint database. Available at <http://www.comp.polyu.edu.hk/biometrics>.
[20] D. Zhang, W.K. Kong, J. You, M. Wong, Online palmprint identification, IEEE Transactions on Pattern Analysis and Machine Intelligence 25 (9) (2003) 1041-1050.
[21] Veerabhadrappa, L. Rangarajan, Diagonal and secondary diagonal locality preserving projection for object recognition, Neurocomputing 73 (2010) 3328-3333.
Jiwen Lu received the B.Eng. degree in mechanical engineering, and the M.Eng. degree in signal and information processing from the Xi’an University of Technology, Xi’an, China, in 2003 and 2006, respectively. He completed the Ph.D. degree in Electrical and Electronic Engineering from the Nanyang Technological University, Singapore, in 2011. He is currently a postdoctoral research fellow in the Advanced Digital Sciences Center, Singapore. His research interests include pattern recognition, computer vision, machine learning, human computer interaction, multimedia computing and biometrics. Mr. Lu received 1st-Prize of the National Scholarship and the National Outstanding Student awarded by the Ministry of Education of China, in 2002 and 2003, respectively.
Yap-Peng Tan received the B.S. degree from National Taiwan University, Taipei, Taiwan, in 1993, and the M.A. and Ph.D. degrees from Princeton University, Princeton, NJ, in 1995 and 1997, respectively, all in electrical engineering. He was the recipient of an IBM Graduate Fellowship from IBM T. J. Watson Research Center, Yorktown Heights, NY, from 1995 to 1997, and was with Intel and Sharp Labs of America from 1997 to 1999. In November 1999, he joined the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, where he is presently an Associate Professor and Head of the Division of Information Engineering. His current research interests include image and video processing, content-based multimedia analysis, computer vision, and pattern recognition. He is the principal inventor/co-inventor of 15 U.S. patents in the areas of image and video processing. Dr. Tan is the secretary of the IEEE Circuits and Systems Society’s Technical Committee on Visual Signal Processing and Communications, a member of the IEEE Signal Processing Society’s Technical Committee on Multimedia Signal Processing, an editorial board member of the EURASIP Journal on Advances in Signal Processing and EURASIP Journal on Image and Video Processing, and an associate editor of the Journal of Signal Processing Systems. He was the General Co-chair of the 2010 IEEE International Conference on Multimedia and Expo (ICME 2010).