Neurocomputing 72 (2008) 352–358 www.elsevier.com/locate/neucom
Two-directional maximum scatter difference discriminant analysis for face recognition

Jianguo Wang a,b, Wankou Yang a, Yusheng Lin a, Jingyu Yang a

a School of Computer Science & Technology, Nanjing University of Science & Technology, Nanjing, PR China
b Network & Education Center, Tangshan College, Tangshan, PR China

Received 21 June 2007; received in revised form 27 December 2007; accepted 1 January 2008
Communicated by D. Tao
Available online 6 February 2008
Abstract

In this paper, we propose a novel method for image feature extraction. The method combines the ideas of two-dimensional principal component analysis (2DPCA) and two-dimensional maximum scatter difference (2DMSD), and directly extracts the optimal projection vectors from 2D image matrices rather than from image vectors, based on the scatter difference criterion. The proposed method not only avoids the singularity problem that frequently occurs in classical Fisher discriminant analysis due to the small sample size, but also saves much computational time. In addition, it can simultaneously make use of the discriminant information and the descriptive information of the image. Experiments conducted on the FERET and ORL face databases demonstrate the effectiveness of the proposed method.
© 2008 Elsevier B.V. All rights reserved.

Keywords: Feature extraction; Face recognition; Two-dimensional principal component analysis (2DPCA); 2D maximum scatter difference (2DMSD); Image matrix
1. Introduction

It is well known that Fisher linear discriminant analysis (FLDA) is a popular and effective method for feature extraction. The basic idea of FLDA is to find optimal projection vectors by maximizing the between-class scatter matrix ($S_b$) while minimizing the within-class scatter matrix ($S_w$), which can be obtained by maximizing the Fisher discriminant criterion defined as follows:

$$ J_f(W) = \arg\max_W \frac{W^T S_b W}{W^T S_w W}. $$
However, because face recognition frequently involves high dimensionality and small sample sizes [2,15], classical FLDA cannot be used directly, since the within-class scatter matrix is always singular. To overcome this problem, several techniques have been proposed
[2,13,17,18]. The most popular method, called Fisherface, was built by Swets and Weng [9] and Belhumeur et al. [1]. In their methods, PCA is first used to reduce the dimension of the original feature space to N − c, and classical FLDA is then applied to reduce the dimension further to d (d ≤ c). Obviously, in the PCA transform step, the small projection components are thrown away, so some effective discriminatory information may be lost; moreover, the PCA step cannot guarantee that the transformed within-class scatter matrix is nonsingular. To avoid the singularity problem, Song et al. [8] proposed a method that adopts the difference between the between-class scatter and the within-class scatter as the discriminant criterion; because no inverse matrix needs to be constructed, the small sample size problem of traditional Fisher discriminant analysis is avoided by nature. In addition, Yu and Tian [14] proposed a parametric hybrid discriminant analysis (HDA) method, which can extract both discriminant and descriptive features for classification. This alternative to LDA and PCA has good performance.
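For reference, the two-stage Fisherface pipeline described above can be sketched in a few lines. This is a minimal illustration using scikit-learn; the choice of library and the function name fisherface_fit are ours, not the authors'.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def fisherface_fit(X, y, n_classes):
    """Fisherface: PCA to N - c dimensions (making S_w nonsingular), then FLDA.

    X : (N, D) matrix of vectorized face images, y : (N,) class labels.
    """
    pca = PCA(n_components=X.shape[0] - n_classes)   # keep N - c components
    X_pca = pca.fit_transform(X)
    lda = LinearDiscriminantAnalysis()               # at most c - 1 discriminant axes
    lda.fit(X_pca, y)
    return pca, lda
```

As the text notes, discarding the trailing PCA components in the first stage may remove discriminatory information even though it makes the second stage numerically feasible.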
However, the above-mentioned methods are vector based, so the 2D images must first be transformed into 1D vectors, which has some disadvantages:

(1) The original image matrices must be transformed into 1D image vectors beforehand. This usually leads to the so-called "curse of dimensionality" problem, which is always encountered in small-sample-size cases such as face recognition.
(2) The matrix-to-vector transformation may cause the loss of some useful structural information embedded in the original images.

In contrast with the above-mentioned methods, a method based on the local geometrical structure, called tensor subspace analysis (TSA) [11], has been proposed, which captures an optimal linear approximation to the face manifold in the sense of local isometry. However, the computational convergence of its iterative algorithm is not guaranteed. To address this problem, Tao et al. proposed a tensor discriminant analysis method for feature extraction. They provide a convergent solution for discriminative tensor subspace selection and a mathematical proof of the convergence of the training stage [4], as well as a general theoretical overview of supervised learning in terms of both feature selection and training-stage convergence [3]. However, the method still has to perform iterative computation to select the optimal projection axes.

Inspired by the successful application of 2DPCA to face recognition [12], we propose a novel method that handles the above problems by directly projecting the 2D face image matrices rather than using transformed image vectors. The proposed method works in a straightforward manner based on the scatter difference criterion and the projection of 2D face image matrices. To preserve the correlations between variations of the rows and of the columns of images, the 2DMSD method, which works in the row direction of images, and the 2DPCA method, which works in the column direction of images, are combined. This not only improves the face recognition rate but also reduces the number of dimensions required to represent an image. The most closely related method is that of Wang et al. [10], but Wang's method amounts to applying the trace difference twice, in two directions, and finds the optimal transformations by means of an iterative algorithm, whereas our method combines two feature extraction methods and selects the optimal projection axes by eigenvalue decomposition without iterative computation. The proposed method has the following properties:

(1) The scatter-difference-based discriminant criterion is consistent with the principle of the scatter-ratio-based discriminant criterion.
(2) Computational efficiency. The method based on the maximum scatter difference discriminant criterion avoids computing an inverse matrix, which saves much computational time in feature extraction.
(3) The scatter difference criterion avoids the small sample size problem that occurs in traditional Fisher discriminant analysis.
(4) Simplicity. The method directly extracts the optimal projection vectors from the 2D face image matrices rather than from vectors, preserving useful structural information embedded in the original images.
(5) Preservation of useful information. The proposed method seeks the optimal projection vectors from the rows and columns of face images, which preserves the correlations between variations of rows and those of columns, and obtains discriminant information and descriptive information simultaneously.

The rest of the paper is organized as follows. In Section 2, we briefly review the maximum scatter difference (MSD) criterion. In Section 3, we present the idea and describe the proposed method in detail. In Section 4, experiments on the FERET and ORL face databases are presented to demonstrate the effectiveness of the proposed method. Conclusions are summarized in Section 5.

2. Outline of maximum scatter difference

Suppose there are C known pattern classes, $\omega_1, \omega_2, \ldots, \omega_C$. The between-class scatter matrix and the within-class scatter matrix can be written as:

$$ S_b = \frac{1}{M}\sum_{i=1}^{C} M_i (m_i - m_0)(m_i - m_0)^T, \qquad (1) $$

$$ S_w = \frac{1}{M}\sum_{i=1}^{C}\sum_{j=1}^{M_i} (x_i^j - m_i)(x_i^j - m_i)^T, \qquad (2) $$
where M is the total number of training samples and $M_i$ is the number of training samples in class i. The jth training sample in class i is denoted by $x_i^j$, the mean vector of the training samples in class i is denoted by $m_i$, and the mean vector of all training samples is $m_0$. From the classical Fisher criterion function, we know that when the ratio of the between-class scatter to the within-class scatter is maximized, the samples can be separated easily. In this paper, a scatter-difference-based discriminant criterion, which rests on the difference between the between-class scatter matrix and the within-class scatter matrix, is adopted as defined in [8]:

$$ J_s(w) = w^T (S_b - S_w) w. \qquad (3) $$
By the extreme-value property of the generalized Rayleigh quotient [16], the optimal projection axis is the eigenvector corresponding to the maximal eigenvalue of the matrix in Eq. (3). Generally, in many cases (e.g., when the number of classes is larger than two) one optimal projection axis is not enough; we usually select a set of projection axes subject to orthonormality constraints that maximize the criterion in
Eq. (3). In fact, the optimal projection axes $w_1, w_2, \ldots, w_k$ can be selected as the orthonormal eigenvectors corresponding to the first k largest eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_k$, i.e., $(S_b - S_w) w_j = \lambda_j w_j$, where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k$. Comparing the MSD criterion with the classical Fisher discriminant criterion, we find that the former avoids calculating the inverse of the within-class scatter matrix, i.e., $S_w^{-1} S_b$ is replaced by $S_b - S_w$; this not only makes the computation more efficient but also avoids the singularity problem of the within-class scatter matrix.
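Concretely, this procedure amounts to forming the two scatter matrices and taking the leading eigenvectors of their difference. The following NumPy sketch of Eqs. (1)–(3) is our own illustration; the function name msd_axes and the data layout are assumptions, not part of the paper.

```python
import numpy as np

def msd_axes(X, y, k):
    """Return the first k MSD projection axes (eigenvectors of S_b - S_w).

    X : (M, n) array, one vectorized training sample per row.
    y : (M,) array of class labels.
    """
    M, n = X.shape
    m0 = X.mean(axis=0)                         # global mean m_0
    Sb = np.zeros((n, n))
    Sw = np.zeros((n, n))
    for c in np.unique(y):
        Xc = X[y == c]
        mi = Xc.mean(axis=0)                    # class mean m_i
        d = (mi - m0)[:, None]
        Sb += len(Xc) * (d @ d.T)               # Eq. (1), before the 1/M factor
        Sw += (Xc - mi).T @ (Xc - mi)           # Eq. (2), before the 1/M factor
    Sb /= M
    Sw /= M
    # Eq. (3): no inverse of S_w is required, unlike the Fisher criterion.
    evals, evecs = np.linalg.eigh(Sb - Sw)      # symmetric eigendecomposition
    order = np.argsort(evals)[::-1]             # largest eigenvalues first
    return evecs[:, order[:k]]                  # orthonormal axes w_1, ..., w_k
```

Because $S_b - S_w$ is symmetric, the returned axes are automatically orthonormal, which matches the constraint stated above.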
3. The proposed method

Two-dimensional discriminant analysis is an interesting technique in pattern recognition, since it can extract more discriminative features faster than one-dimensional discriminant analysis. In this paper, a novel feature extraction method is proposed, which combines two-dimensional (2D) principal component analysis and the 2D maximum scatter difference criterion for face recognition.

3.1. Idea and theoretical analysis

Consider a set of M sample images $A_1, A_2, \ldots, A_M$, each an $m \times n$ matrix, and let $\varphi$ be an n-dimensional unitary column vector. Suppose there are C known pattern classes, $M_i$ is the number of training samples of class i, and the jth sample image in class i is denoted by $A_i^j$. The 2DMSD method projects each image $(A_i^j)_{m \times n}$ onto $\varphi$ by the following transformation:

$$ Y_i^j = A_i^j \varphi, \quad i = 1, 2, \ldots, C, \quad j = 1, 2, \ldots, M_i. \qquad (4) $$

To obtain the highest recognition rate, it is important to select the optimal projection vector $\varphi$. A criterion is defined as follows:

$$ J_{ds}(\varphi) = \mathrm{tr}(S_B) - \mathrm{tr}(S_W), \qquad (5) $$

where tr denotes the trace of a matrix, and $S_B$ and $S_W$ denote the between-class and within-class scatter matrices of the projected feature vectors of the 2D face images, respectively, defined as follows:

$$ S_B = \frac{1}{M}\sum_{i=1}^{C} M_i (\bar{Y}_i - \bar{Y})(\bar{Y}_i - \bar{Y})^T = \frac{1}{M}\sum_{i=1}^{C} M_i [(\bar{A}_i - \bar{A})\varphi][(\bar{A}_i - \bar{A})\varphi]^T, \qquad (6) $$

$$ S_W = \frac{1}{M}\sum_{i=1}^{C}\sum_{j=1}^{M_i} (Y_i^j - \bar{Y}_i)(Y_i^j - \bar{Y}_i)^T = \frac{1}{M}\sum_{i=1}^{C}\sum_{j=1}^{M_i} [(A_i^j - \bar{A}_i)\varphi][(A_i^j - \bar{A}_i)\varphi]^T. \qquad (7) $$

Thus,

$$ \mathrm{tr}(S_B) = \varphi^T \left( \frac{1}{M}\sum_{i=1}^{C} M_i (\bar{A}_i - \bar{A})^T (\bar{A}_i - \bar{A}) \right) \varphi = \varphi^T S_{BI} \varphi, \qquad (8) $$

$$ \mathrm{tr}(S_W) = \varphi^T \left( \frac{1}{M}\sum_{i=1}^{C}\sum_{j=1}^{M_i} (A_i^j - \bar{A}_i)^T (A_i^j - \bar{A}_i) \right) \varphi = \varphi^T S_{WI} \varphi. \qquad (9) $$

Let

$$ S_{BI} = \frac{1}{M}\sum_{i=1}^{C} M_i (\bar{A}_i - \bar{A})^T (\bar{A}_i - \bar{A}), \qquad (10) $$

$$ S_{WI} = \frac{1}{M}\sum_{i=1}^{C}\sum_{j=1}^{M_i} (A_i^j - \bar{A}_i)^T (A_i^j - \bar{A}_i), \qquad (11) $$

where $\bar{A}_i$ is the mean image matrix of the ith class of face images, $\bar{A}$ is the total mean image matrix of the training set, $\bar{Y}_i$ denotes the mean feature matrix of the ith class, and $\bar{Y}$ denotes the total mean of the feature matrices Y. Then we have

$$ \mathrm{tr}(S_B) = \varphi^T S_{BI} \varphi, \qquad (12) $$

$$ \mathrm{tr}(S_W) = \varphi^T S_{WI} \varphi. \qquad (13) $$

Thus, the criterion (5) can be expressed as

$$ J_{ds}(\varphi) = \varphi^T S_{BI} \varphi - \varphi^T S_{WI} \varphi = \varphi^T (S_{BI} - S_{WI}) \varphi. \qquad (14) $$

It is easy to find the optimal projection axis $\varphi_{\mathrm{opt}}$ by maximizing the function $J_{ds}(\varphi)$:

$$ \varphi_{\mathrm{opt}} = \arg\max_{\varphi} J_{ds}(\varphi). \qquad (15) $$

The optimal projection axis $\varphi_{\mathrm{opt}}$ is given by the eigenvector corresponding to the maximal eigenvalue of the eigenvalue problem

$$ (S_{BI} - S_{WI})\varphi = \lambda \varphi. \qquad (16) $$
After the samples are projected onto $\varphi_{\mathrm{opt}}$, the difference between the between-class scatter and the within-class scatter is maximized. From the above description, we can see that 2DLDA [5] must compute an inverse matrix, whereas 2DMSD successfully avoids this computation by means of the trace difference. However, one disadvantage of 2D matrix-based methods (compared with 1D vector-based ones) is that more coefficients are needed to represent an image. From Tables 1 and 2, it is evident that the feature dimensions of the 2D matrix-based methods are always much higher than those of the 1D vector-based methods at the top recognition accuracy. Here, to reduce the dimension of the 2D matrix-based method, a simple strategy is to use 2DPCA for further dimensionality reduction after 2DMSD; this is a way to use 2DPCA and 2DMSD simultaneously.
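Computationally, the 2DMSD step by itself (Eqs. (10), (11) and (16)) reduces to an ordinary symmetric eigenvalue problem on the n × n matrix $S_{BI} - S_{WI}$; the combination with 2DPCA is described in the remainder of this section. The following NumPy sketch of the 2DMSD step is our own illustration; the function name two_dmsd_axes and the data layout are assumptions, not from the paper.

```python
import numpy as np

def two_dmsd_axes(images, labels, d):
    """Return the first d 2DMSD projection axes phi (an n x d matrix).

    images : (M, m, n) array of training image matrices A_i^j.
    labels : (M,) array of class labels.
    """
    images = np.asarray(images, dtype=float)
    labels = np.asarray(labels)
    M, m, n = images.shape
    A_bar = images.mean(axis=0)                      # total mean image
    S_BI = np.zeros((n, n))
    S_WI = np.zeros((n, n))
    for c in np.unique(labels):
        Ac = images[labels == c]
        Ai_bar = Ac.mean(axis=0)                     # class mean image
        D = Ai_bar - A_bar
        S_BI += len(Ac) * (D.T @ D)                  # Eq. (10), before the 1/M factor
        for A in Ac:
            E = A - Ai_bar
            S_WI += E.T @ E                          # Eq. (11), before the 1/M factor
    S_BI /= M
    S_WI /= M
    evals, evecs = np.linalg.eigh(S_BI - S_WI)       # Eq. (16)
    order = np.argsort(evals)[::-1]
    return evecs[:, order[:d]]                       # phi_1, ..., phi_d
```

A call such as two_dmsd_axes(train_images, train_labels, d) would return the d row-direction projection axes $\varphi$ used in the feature extraction step below.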
Table 1
The maximal average recognition rates (ARR, %) of eight feature extraction methods on a subset of the FERET database, the corresponding dimensions, and the running time of each phase (CPU: P4 2.4 GHz, RAM: 512 MB, Matlab 7.01)

| Method          | Recognition accuracy (%) | Dimension | Extraction time (s) | Classification time (s) | Total time (s) |
|-----------------|--------------------------|-----------|---------------------|-------------------------|----------------|
| PCA+LDA         | 35.33                    | 119       | 55.015              | 4.781                   | 59.796         |
| MSD             | 46.83                    | 119       | 53.406              | 4.543                   | 57.949         |
| TSA             | 46.5                     | 12×12     | 9.782               | 4.75                    | 14.156         |
| 2DPCA           | 53.83                    | 40×4      | 0.141               | 4.453                   | 4.594          |
| 2DLDA           | 50.5                     | 40×4      | 0.138               | 4.493                   | 4.631          |
| 2DMSD           | 53.33                    | 40×4      | 0.109               | 4.453                   | 4.562          |
| 2DPCA+2DLDA     | 51.67                    | 12×4      | 0.078               | 3.609                   | 3.687          |
| Proposed method | 54.17                    | 11×4      | 0.063               | 3.578                   | 3.641          |
Table 2
The maximal average recognition rates (ARR, %) of eight feature extraction methods with different training sample sizes across 10 runs on the ORL database, with the corresponding standard deviations (S.D.) and dimensions (D)

| Method          | 4 train samples: ARR (%) | D     | 5 train samples: ARR (%) | D     | 6 train samples: ARR (%) | D     |
|-----------------|--------------------------|-------|--------------------------|-------|--------------------------|-------|
| PCA+LDA         | 90.63±2.03               | 39    | 92.95±2.23               | 39    | 94.19±1.84               | 39    |
| MSD             | 94.38±2.11               | 39    | 95.5±1.99                | 39    | 97.25±1.32               | 39    |
| TSA             | 92.58±1.87               | 10×10 | 94.55±1.64               | 12×12 | 97.13±1.03               | 10×10 |
| 2DPCA           | 94.54±1.83               | 56×4  | 95.7±1.7                 | 56×4  | 97.25±0.53               | 56×5  |
| 2DLDA           | 94.58±1.77               | 56×3  | 95.5±1.99                | 56×3  | 96.44±1.82               | 56×3  |
| 2DMSD           | 95.13±1.66               | 56×3  | 96.4±1.73                | 56×3  | 97.44±1.19               | 56×3  |
| 2DPCA+2DLDA     | 94.88±1.81               | 12×3  | 95.85±1.36               | 8×3   | 96.62±1.56               | 10×3  |
| Proposed method | 95.42±1.69               | 16×3  | 96.45±1.8                | 14×3  | 97.56±1.3                | 16×3  |
From the literature [12], we know that 2DPCA works in the row direction of images. Because we want to use the projection matrices of 2DPCA and 2DMSD simultaneously, we redefine the scatter matrices of 2DPCA as follows:

$$ S_{BD} = \frac{1}{M}\sum_{i=1}^{C} M_i (\bar{A}_i - \bar{A})(\bar{A}_i - \bar{A})^T, \qquad (17) $$

$$ S_{WD} = \frac{1}{M}\sum_{i=1}^{C}\sum_{j=1}^{M_i} (A_i^j - \bar{A}_i)(A_i^j - \bar{A}_i)^T, \qquad (18) $$

$$ S_{TD} = S_{BD} + S_{WD}. \qquad (19) $$

Thus, the optimal projection matrix Z can be obtained by calculating the eigenvectors of Eq. (19) corresponding to the q largest eigenvalues. Since the eigenvectors of Eq. (19) now reflect the information between the columns of the images, we say that this new 2DPCA works in the column direction of images.

3.2. Feature extraction and classification

Now, project the $m \times n$ image A onto Z and $\varphi$ simultaneously; a $q \times d$ matrix F is then obtained:

$$ F = Z^T A \varphi. \qquad (20) $$

As can be seen from Eq. (20), the feature matrix $F = (f_{ij})_{q \times d}$ is composed of the most discriminative features for face recognition. Based on the above descriptions, the fusion of 2D principal component analysis and the 2D maximum scatter difference algorithm can be described as follows:

Step 1. Extract image features using Eq. (20), obtaining a feature matrix $F = (f_{ij})_{q \times d}$.

Step 2. Classify with the nearest neighbor classifier. For any two feature matrices $F_i = (F_i^1, F_i^2, \ldots, F_i^d)$ and $F_j = (F_j^1, F_j^2, \ldots, F_j^d)$, the distance between them is defined as

$$ d(F_i, F_j) = \sum_{l=1}^{d} \| F_i^l - F_j^l \|_2. \qquad (21) $$

Let $F_1, F_2, \ldots, F_M$ be the projected feature matrices of all the training samples. For any test sample t, if $d(t, F_l) = \min_i d(t, F_i)$ and $F_l \in \omega_l$, then t is classified to class l.
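To make the two-directional procedure concrete, the following NumPy sketch strings together the column-direction 2DPCA of Eqs. (17)–(19), the projection of Eq. (20), and the classifier of Eq. (21). It is an illustrative outline under our own naming, not the authors' code; two_dmsd_axes refers to the 2DMSD sketch given after Eq. (16).

```python
import numpy as np

def column_2dpca_axes(images, labels, q):
    """Column-direction 2DPCA: top-q eigenvectors Z of S_TD = S_BD + S_WD (Eqs. (17)-(19))."""
    images = np.asarray(images, dtype=float)
    labels = np.asarray(labels)
    M, m, n = images.shape
    A_bar = images.mean(axis=0)
    S_TD = np.zeros((m, m))
    for c in np.unique(labels):
        Ac = images[labels == c]
        Ai_bar = Ac.mean(axis=0)
        D = Ai_bar - A_bar
        S_TD += len(Ac) * (D @ D.T)              # S_BD contribution, Eq. (17)
        for A in Ac:
            E = A - Ai_bar
            S_TD += E @ E.T                      # S_WD contribution, Eq. (18)
    S_TD /= M
    evals, evecs = np.linalg.eigh(S_TD)
    order = np.argsort(evals)[::-1]
    return evecs[:, order[:q]]                   # Z is m x q

def extract_features(A, Z, phi):
    """Two-directional feature matrix F = Z^T A phi (Eq. (20)); F is q x d."""
    return Z.T @ A @ phi

def feature_distance(Fi, Fj):
    """Eq. (21): sum of the Euclidean norms of the column differences."""
    return np.linalg.norm(Fi - Fj, axis=0).sum()

def classify(test_image, train_images, train_labels, Z, phi):
    """Nearest-neighbour rule in the projected feature space (Steps 1 and 2)."""
    train_feats = [extract_features(A, Z, phi) for A in train_images]
    Ft = extract_features(test_image, Z, phi)
    dists = [feature_distance(Ft, F) for F in train_feats]
    return train_labels[int(np.argmin(dists))]
```

In this sketch, phi comes from the row-direction 2DMSD step and Z from the column-direction 2DPCA step, so each q × d feature matrix carries both discriminant and descriptive information, as claimed in the introduction.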
4. Experiments and analysis

In this section, experiments conducted on a subset of the FERET database and on the ORL face image database are designed to evaluate the effectiveness of the proposed method.
4.1. Experiments on a subset of the FERET database

The proposed method is evaluated on a subset of the FERET database [6,7], which includes 1000 images of 200 distinct subjects; each subject has five images, corresponding to the images whose names contain the strings "ba", "bj", "be", "bf" and "bk". The subset involves variations in facial expression, illumination and pose. In our experiments, the facial portion of each original image is cropped automatically based on the location of the eyes and resized to 40×40 pixels. Some facial portion images of one person are shown in Fig. 1. In the experiment, the 400 images whose names contain the strings "ba" and "bj" (i.e., the first two images per person) are used to form the training set, and the remaining 600 images (the last three images per person) are used for testing.

4.1.1. Selection of the projection axes

First, the 2DPCA, 2DLDA and 2DMSD methods are used for feature extraction, with the number of selected eigenvectors (projection vectors) varying from 1 to 10. Let k denote the number of projection vectors; then the dimension of the corresponding projected feature vector is 40×k. Finally, a nearest neighbor classifier with Euclidean distance is employed to classify in the projected feature space. The recognition rates versus k are shown in Fig. 2. From Fig. 2, we can see that 2DPCA, 2DLDA and 2DMSD all achieve their top recognition rates when k equals 4. Next, 2DPCA+2DLDA and the proposed method are used for feature extraction. Let k = 4, and in the second feature extraction step (in the column direction) let m denote the number of projection vectors, which varies from 3 to 40 (the image height). A nearest neighbor classifier with Euclidean distance is again employed for classification. The recognition rates versus m are shown in Fig. 3. From Fig. 3, we can see that when m equals 11, i.e., the feature matrix is 11×4, the proposed method achieves its top recognition rate, and when m equals 12, 2DPCA+2DLDA achieves its best recognition rate. We also find that when m > 11, the recognition rate of the proposed method is not less than that of the 2DMSD method, which indicates that the proposed method attains the recognition rates of 2DMSD with fewer feature coefficients.
Fig. 1. Some sample images for one person in a subset of the FERET database.
Fig. 2. The recognition rates of 2DPCA, 2DLDA, 2DMSD versus the dimensions.
Fig. 3. The recognition rates of 2DPCA+2DLDA and the proposed method versus the dimensions.
4.1.2. Comparison of performance

In this test, PCA+LDA, MSD, TSA, 2DMSD, 2DPCA, 2DLDA, 2DPCA+2DLDA and the proposed method are used for feature extraction. Note that PCA+LDA involves a PCA phase; in this phase, we keep nearly 98% of the image energy. In the TSA algorithm, the number of iterations is set to 10. After feature extraction, a nearest neighbor classifier with Euclidean distance is employed for classification. The maximal recognition rate of each method, the corresponding dimension and the running time of each phase are listed in Table 1. From Table 1, we can see three main points. First, the proposed method is computationally more efficient than the other methods (whether vector based or image based) for face feature extraction. Second, the proposed method needs a much smaller coefficient set for image representation and recognition than the other methods. Third, because the proposed method works simultaneously in the row and column directions of face images, it outperforms the other methods with fewer coefficients. Why is the feature extraction time of MSD, 2DMSD and the proposed method lower than that of PCA+LDA, 2DLDA and 2DPCA+2DLDA, respectively? This is mainly because the former methods do not need to calculate the inverse of a matrix.
4.2. Experiments on the ORL database

The ORL database is composed of 40 individuals, each providing 10 different images under different expressions and different views, with a tolerance for tilting and rotation of up to about 20°. Moreover, there is also some variation in scale of up to about 10%. All images are grayscale and normalized to 92×112 pixels. Some sample images from the ORL database are shown in Fig. 4. To evaluate the recognition performance of the proposed method, PCA+LDA, MSD, TSA, 2DMSD, 2DPCA, 2DLDA, 2DPCA+2DLDA and the proposed method are used for feature extraction. In the PCA phase of PCA+LDA, we keep nearly 98% of the image energy, and in the TSA algorithm the number of iterations is set to 10. The training and testing sets are selected randomly for each individual. We repeated the recognition procedure 10 times, randomly choosing different training and testing sets each time. In each run, s face images of each person are randomly selected as training samples, and the rest are used for testing. Finally, a nearest neighbor classifier is employed for classification. The maximal average recognition rate, the standard deviation (S.D.) across the 10 runs and the corresponding dimension of each method are listed in Table 2. The recognition rate curves for the 10 different training sets are shown in Figs. 5 and 6. From Table 2, we can see that the proposed method outperforms the other methods. Figs. 5 and 6 depict the comparative analysis of the eight face recognition methods on the ORL face database; here the training set consists of five randomly selected samples per person. Because plotting the recognition rates of all eight methods in one figure would be too cluttered, for the sake of clarity we separate the comparisons into two parts, as illustrated in Figs. 5 and 6. Fig. 5 gives the comparisons between the proposed method and the LDA-based methods, and Fig. 6 gives the comparisons between the proposed method and the MSD-based methods. The figures also demonstrate that the proposed method outperforms the other methods under the same conditions, which further shows that it can extract more discriminative features than the other methods. From Table 2 and the figures, we can see that the proposed method uses fewer feature coefficients and attains higher recognition rates, and Fig. 6 shows that the proposed method attains the recognition rates of 2DMSD with fewer feature coefficients.
Fig. 5. The recognition rates of the LDA-based methods and the proposed method versus different training sets on the ORL database, five randomly selected samples per class.
Fig. 6. The recognition rates of MSD, 2DMSD, 2DPCA and the proposed method versus different training sets on the ORL database, five randomly selected samples per class.
Fig. 4. Some sample images of one person in the ORL database.

5. Conclusions

In this paper, we propose a novel algorithm for image feature extraction, which combines discriminant information and descriptive information by means of the 2D maximum scatter difference algorithm and 2D principal component analysis. The proposed method not only preserves the correlations between variations of the rows and those of the columns of face images, but also avoids the singularity problem of the within-class scatter matrix that occurs in classical Fisher discriminant analysis, owing to the definition of the scatter difference discriminant criterion. The experiments conducted on the FERET and ORL databases indicate the effectiveness of the proposed method.
Acknowledgments

This work was supported by the National Natural Science Foundation of China under grant nos. 60632050, 60503026 and 60472060. The authors would like to thank the anonymous reviewers for their constructive advice.
References

[1] V. Belhumeur, J. Hespanha, D. Kriegman, Eigenfaces vs. Fisherfaces: recognition using class specific linear projection, IEEE Trans. Pattern Anal. Machine Intell. 19 (7) (1997) 711–720.
[2] L.F. Chen, H.Y.M. Liao, M.T. Ko, et al., A new LDA-based face recognition system which can solve the small sample size problem, Pattern Recog. 33 (10) (2000) 1713–1726.
[3] D. Tao, X. Li, W. Hu, et al., Supervised tensor learning, Knowledge and Information Systems, vol. 2, Springer, 2007, pp. 1670–1677.
[4] D. Tao, X. Li, X. Wu, et al., General tensor discriminant analysis and Gabor features for gait recognition, IEEE Trans. Pattern Anal. Machine Intell. 29 (10) (2007).
[5] M. Li, B. Yuan, 2D-LDA: a novel statistical linear discriminant analysis for image matrix, Pattern Recog. Lett. 26 (5) (2005) 527–532.
[6] P.J. Phillips, The Facial Recognition Technology (FERET) Database, http://www.itl.nist.gov/iad/humanid/feret/feret_master.html, 2004.
[7] P.J. Phillips, H. Moon, S.A. Rizvi, P.J. Rauss, The FERET evaluation methodology for face-recognition algorithms, IEEE Trans. Pattern Anal. Machine Intell. 22 (10) (2000) 1090–1104.
[8] F.X. Song, K. Cheng, J.Y. Yang, et al., Maximum scatter difference, large margin linear projection and support vector machines, Acta Automat. Sinica 30 (6) (2004) 890–896 (in Chinese).
[9] D.L. Swets, J. Weng, Using discriminant eigenfeatures for image retrieval, IEEE Trans. Pattern Anal. Machine Intell. 18 (8) (1996) 831–836.
[10] H. Wang, S. Yan, T. Huang, X. Tang, A convergent solution to tensor subspace learning, in: Proc. IJCAI, 2007, pp. 629–634.
[11] X. He, D. Cai, P. Niyogi, Tensor subspace analysis, in: Advances in Neural Information Processing Systems 18, Vancouver, Canada, December 2005.
[12] J. Yang, D. Zhang, A.F. Frangi, J.Y. Yang, Two-dimensional PCA: a new approach to appearance-based face representation and recognition, IEEE Trans. Pattern Anal. Machine Intell. 26 (1) (2004) 131–137.
[13] J. Yang, J.Y. Yang, Why can LDA be performed in PCA transformed space?, Pattern Recog. 36 (2) (2003) 563–566.
[14] J. Yu, Q. Tian, Constructing descriptive and discriminant features for face classification, in: Proc. IEEE ICASSP, 2006, pp. 121–124.
[15] H. Yu, J. Yang, A direct LDA algorithm for high-dimensional data with application to face recognition, Pattern Recog. 34 (10) (2001) 2067–2070.
[16] X. Zhang, Matrix Analysis and Application, Tsinghua University Press, Beijing, 2004 (in Chinese).
[17] W. Zhao, R. Chellappa, J. Phillips, Subspace Linear Discriminant Analysis for Face Recognition, Technical Report CS-TR-4009, University of Maryland, 1999.
[18] X.S. Zhuang, D.Q. Dai, Improved discriminant analysis for high-dimensional data and its application to face recognition, Pattern Recog. 40 (5) (2007) 1570–1578.
Jianguo Wang was born in Hebei, China, in June 1972. He received the B.S. degree in printing technology from Xi'an University of Technology in 1994 and the M.S. degree in signal and information processing from Xi'an University of Technology in 1997. He is an assistant professor in the Network & Education Center, Tangshan College. He is now working towards his doctoral degree in computer vision and pattern recognition at NUST. His current interests are in the areas of pattern recognition, image processing, and face detection and recognition.

Wankou Yang was born in 1979 in PR China. He is a Ph.D. candidate in the Computer Science Department, Nanjing University of Science and Technology, PR China. He received his M.S. degree in pattern recognition and computer image processing from Nanjing University of Science and Technology, PR China, in 2004. His research interests include computer vision, digital image processing, and pattern recognition.

Yusheng Lin was born in Liaoning, China, in June 1978. He obtained his bachelor's degree in computer science at Northeastern University in 2001. He then completed a master's degree in computer science at the Shenyang University of Science and Technology in 2005. He has been a Ph.D. candidate in the Department of Computer Science at Nanjing University of Science and Technology, working on pattern recognition and intelligent systems, since 2005. His current research interests include pattern recognition and machine learning.

Jingyu Yang received the B.S. degree in computer science from Nanjing University of Science and Technology (NUST), Nanjing, China. From 1982 to 1984, he was a visiting scientist at the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign. From 1993 to 1994, he was a visiting professor in the Department of Computer Science, Missouri University. In 1998, he was a visiting professor at Concordia University in Canada. He is currently a professor and chairman of the Department of Computer Science at NUST. He is the author of more than 300 scientific papers in computer vision, pattern recognition, and artificial intelligence. He has won more than 20 provincial and national awards. His current research interests are in the areas of pattern recognition, robot vision, image processing, data fusion, and artificial intelligence.