Pattern Recognition Letters 31 (2010) 1002–1009
Rotation, illumination invariant polynomial kernel Fisher discriminant analysis using Radon and discrete cosine transforms based features for face recognition

Dattatray V. Jadhav a,*, Raghunath S. Holambe b

a Department of Electronics, Vishwakarma Institute of Technology, Pune (MS), India
b Department of Instrumentation, SGGSIE&T, Vishnupuri, Nanded (MS), India
Article info

Article history: Received 2 September 2008; received in revised form 5 October 2009; available online 4 January 2010. Communicated by T. Tan.

Keywords: Face recognition; Feature extraction; Radon transform; Discrete cosine transform (DCT); Kernel Fisher discriminant (KFD)
Abstract

This paper presents an in-plane rotation (tilt) and illumination invariant pattern recognition framework for face recognition, based on the combination of features extracted using the Radon and discrete cosine transforms with kernel based learning. The Radon transform enhances the low frequency components, which are useful for face recognition, while the DCT yields a low dimensional feature vector. The proposed technique computes Radon projections in different orientations and captures the directional features of the face images; the DCT applied to the Radon projections provides frequency features. Polynomial kernel Fisher discriminant analysis implemented on these features further enhances their discrimination capability. The technique is also robust to zero mean white noise. The feasibility of the proposed technique has been evaluated using the FERET, ORL, and Yale databases. © 2009 Elsevier B.V. All rights reserved.
1. Introduction

Face recognition has received much attention due to its potential applications as well as its challenges. A successful face recognition system must be robust to the many variations of face images, such as viewpoint, in-plane rotation, illumination, and expression (Jain et al., 2004). It should attain high recognition accuracy with a minimum number of features and should also be robust to noise. Popular pattern recognition techniques based on data reduction, such as redundancy and dimensionality reduction, have difficulty solving complex pattern recognition problems such as human face recognition. Low dimensional pattern recognition methods based on principal component analysis (PCA) (Turk and Pentland, 1991), linear discriminant analysis (LDA) (Belhumeur et al., 1997), and other subspace-based methods (Ekenel and Sankur, 2004) cannot achieve satisfactory performance (Phillips et al., 2005). LDA seeks the optimal projection directions that maximize the ratio of between-class scatter to within-class scatter. Gabor wavelets with five scales and eight orientations have been used to derive facial features characterized by spatial frequency, spatial locality, and orientation selectivity, to cope with variations due to illumination and facial expression.

* Corresponding author. Tel.: +91 9422797509; fax: +91 20 24280926. E-mail addresses: [email protected] (D.V. Jadhav), [email protected] (R.S. Holambe).
doi:10.1016/j.patrec.2009.12.026

Fisher linear discriminant
(FLD) analysis is performed on these features in the Gabor Fisher Classifier (GFC) (Liu and Wechsler, 2002). The discrete cosine transform (DCT) is effective in data compaction and has been used to derive facial features; PCA and LDA have been implemented on such features (Chen et al., 2005). A local appearance based face recognition scheme is proposed in (Ekenel and Stiefelhagen, 2006): DCT coefficients of non-overlapping blocks of an image are computed, ordered by zigzag scanning, and concatenated across blocks to obtain the feature vector. The pose mismatch problem has been addressed using the DCTmod2 feature extraction technique, in which DCT coefficients obtained from overlapping blocks are utilized (Sanderson et al., 2006). Global and local facial features derived using the DCT have been combined, with LDA performed on these features to enhance discrimination capability (Zhou et al., 2006). It has been demonstrated that the low frequency components alone are sufficient for face recognition (Lai et al., 2001). The Radon transform enhances the low frequency components and can derive a large number of features (Magli et al., 1999). Since the face image space is of high dimension and the number of face images per subject available for training is usually small, a major computational problem with the LDA algorithm is that the within-class scatter matrix is singular, and hence LDA cannot be applied directly (Phillips, 1998). One solution to this problem is kernel Fisher discriminant analysis (KFD). This paper presents a face recognition scheme that extends our previous work (Jadhav and Holambe, 2008). It is computationally efficient, invariant to in-plane rotation and
illumination. It is also robust to zero mean white noise. The proposed scheme exploits the capability of the Radon transform to enhance low frequency components and that of the DCT to reduce redundancy in the data, so as to derive effective and efficient face features. A fractional power polynomial kernel is used to map these features nonlinearly, which enhances their discrimination capability, and Fisher discriminant analysis is then performed on the mapped features. The effectiveness of the proposed approach is demonstrated in terms of both absolute and comparative performance. The paper is organized as follows. Section 2 briefly describes the Radon transform and the DCT. Section 3 presents the proposed technique. Databases and experimental results are summarized in Section 4, followed by conclusions in Section 5.
2. Radon and discrete cosine transforms

2.1. Radon transform

The Radon transform of a two dimensional function f(x, y) in the (r, θ) plane is defined as

R(r, θ)[f(x, y)] = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} f(x, y) δ(r − x cos θ − y sin θ) dx dy,   (1)

where δ(·) is the Dirac delta function, r ∈ (−∞, ∞) is the perpendicular distance of a line from the origin, and θ ∈ [0, π] is the angle formed by the distance vector and the x-axis, as shown in Fig. 1 (Khouzani and Soltanian, 2005; Wang et al., 2007).

Fig. 1. Radon transform of an image.

A rotation of f(x, y) by an angle φ leads to a translation of R(r, θ) in the variable θ by φ, i.e.

R(r, θ){f(x cos φ + y sin φ, −x sin φ + y cos φ)} = R(r, θ + φ).   (2)

As the Radon transform converts rotation into translation and the DCT is invariant to translation, features extracted using their combination are invariant to in-plane rotation. In addition, due to its excellent energy compaction property, the DCT also helps reduce the feature vector dimension. A further advantage of the Radon transform is its robustness to white noise. The signal-to-noise ratio of the Radon space is related to that of the image as

SNR_proj = SNR_image + 1.7√N · SNR_image,   (3)

SNR_proj ≈ 1.7√N · SNR_image.   (4)

This shows that the SNR of a Radon projection is increased by a factor of about 1.7√N, where N is the image radius in pixels, which in practice is a large quantity (Khouzani and Soltanian, 2005; Wang et al., 2007). Hence the approach is robust to zero mean white noise. The minimum and maximum numbers of Radon projections required for image reconstruction are

N_s,min = πN  and  N_s,max = 2πN.   (5)

2.2. Discrete cosine transform

The DCT is a well-known signal analysis tool used in compression due to its compact representation capability. It has an excellent energy compaction property for highly correlated data, which helps in reducing the feature dimension. Let the image f(x, y) be represented as f(m, n) of size M × N. The 2-D DCT of f(m, n) is given as

B_pq = α_p α_q Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} f(m, n) cos[π(2m + 1)p / (2M)] cos[π(2n + 1)q / (2N)],
0 ≤ p ≤ M − 1, 0 ≤ q ≤ N − 1,   (6)

where p and q denote the frequencies,

α_p = 1/√M for p = 0,  α_p = √(2/M) for 1 ≤ p ≤ M − 1,
α_q = 1/√N for q = 0,  α_q = √(2/N) for 1 ≤ q ≤ N − 1,

and M and N denote the row and column sizes of f(m, n), respectively (Chen et al., 2005; Ekenel and Stiefelhagen, 2006; Sanderson et al., 2006).

3. The Radon transform and DCT based kernel Fisher discriminant (RDCTKFD)

3.1. Feature extraction

Facial features in this approach are the frequency components in different directions, derived using the combination of the Radon transform and the DCT. The Radon transform is a line integral of the image, which enhances the low frequency components useful in face recognition. The Radon spaces of two face images, computed using projections for 0°–179° orientations, are shown in Fig. 2; significant discriminative information is available in the Radon space. The DCT is used to derive frequency features from the Radon space.

Fig. 2. Reference images and the respective Radon space for angles 0°–179°.

In the proposed approach, the Radon space of a face image is computed using the optimum number of Radon projections, which is one third of N_s,min (derived experimentally). The DCT of the Radon space is
computed to derive frequency features. The significant coefficients (25%) of the DCT, excluding the DC component, are concatenated to form the facial feature vector, which reduces the dimension of the feature vector significantly. Fractional power polynomial kernel Fisher discriminant analysis is then performed on the Radon transform and DCT based features to further enhance their discrimination capability (Jadhav and Holambe, 2008).

3.2. Fractional power polynomial kernel Fisher discriminant analysis

The kernel Fisher discriminant (KFD) method combines the kernel trick with FLD. The KFD method uses a nonlinear mapping between the input space and the feature space. According to Cover's theorem on the separability of patterns, the advantage of applying the nonlinear mapping is that it increases the discrimination ability of a pattern classifier: nonlinearly separable patterns in an input space are linearly separable with high probability if the input space is transformed nonlinearly to a high dimensional feature space (Cover, 1965; Haykin, 1999). In the proposed algorithm, a polynomial kernel with fractional power, which provides the nonlinear mapping, has been used.

Let w1, w2, . . . , wL denote the L classes and N1, N2, . . . , NL the number of samples in each class, respectively. Let X = [x1, x2, . . . , xM] be the features of the training samples, derived as in Section 3.1, in the input space, and let φ be a nonlinear mapping between the input space and the feature space. Let D represent the data matrix in the feature space:
D = [φ(x1) φ(x2) . . . φ(xM)],   (7)

where M is the total number of training samples (N1 + N2 + · · · + NL). A kernel matrix K, defined by means of the dot product in the feature space, is

K = DᵀD,   (8)
where K_ij = φ(x_i) · φ(x_j), i, j = 1, 2, . . . , M. The within class scatter matrix in the feature space is given as

S_w = (1/M) DDᵀ.   (9)
The between class scatter matrix is given as

S_B = Σ_{i=1}^{L} N_i (m_i − m)(m_i − m)ᵀ = (1/M) DWDᵀ,   (10)
where m_i is the mean of the ith class and m is the overall mean. W ∈ R^{M×M} is a block diagonal matrix, W = diag[W1, W2, . . . , WL], where W_j ∈ R^{N_j×N_j} is an N_j × N_j matrix with all elements equal to 1/N_j, j = 1, 2, . . . , L. A good criterion for class separability should convert these scatter matrices to a number that becomes large when the between class scatter is large or the within class scatter is small. The KFD projection matrix consists of the eigenvectors V corresponding to the largest eigenvalues λ of
S_B V = λ S_w V,   (11)

V = Σ_{i=1}^{M} C_i φ(x_i) = Dα,   (12)

where α = [C1, C2, . . . , CM]ᵀ. However, the scatter matrices reside in the high dimensional feature space and are difficult to evaluate directly. In practice, the kernel matrix K is computed by means of a kernel function rather than by explicitly implementing the nonlinear mapping φ. This takes advantage of the Mercer equivalence condition and is feasible because the dot products in the high dimensional feature space are replaced by a kernel function in the input space, while the computational complexity is related to the number of training examples rather than the dimension of the feature space (Liu, 2006). The kernel matrix K is computed from the Radon transform and DCT based features using the kernel function

K(x, y) = φ(x) · φ(y).   (13)

In the proposed approach, the polynomial kernel with fractional power is used. It is given as

K(x, y) = (x · y)^d,   (14)

where d is the degree of the polynomial. The algorithm has been tested for both integer and fractional values of d. To circumvent the difficulty of evaluating the scatter matrices in the feature space, S_B and S_w in Eq. (11) are replaced by expressions in the kernel matrix, giving

KWKα = λKKα.   (15)

The generalized eigenvalue problem of Eq. (15) can be converted into an ordinary eigenvalue problem,

(KK + εI)^{−1}(KWK)α = λα,   (16)

where ε is a small positive regularization number and I is an M × M identity matrix. The vector α is normalized so that the KFD basis vector V has unit norm:

‖V‖² = VᵀV = αᵀKα = 1.   (17)

Let V1, V2, . . . , Vn (n ≤ L − 1) be the KFD basis vectors associated with the largest eigenvalues of Eq. (16). The Euclidean distance between the KFD features of a test image and those of each training image is computed, and the test image is classified using the nearest neighbor classifier.

4. Experimental results and performance analysis

This section evaluates the performance of the proposed approach using three databases: (1) the Face Recognition Technology (FERET) database, (2) the Olivetti Research Laboratory (ORL) database, and (3) the Yale database. The effectiveness of the method is shown in terms of an absolute performance index and comparative performance against popular existing face recognition schemes such as PCA, KPCA, LDA, and the kernel Fisher discriminant (KFD). The first part of this section describes the databases, the characteristics of the images, and the normalization procedure. This is followed by the experiments carried out to evaluate performance:

1. Testing the performance of the algorithms for normal faces.
2. Testing the performance of the algorithms under illumination variations.
3. Testing the robustness of the algorithms against zero mean white noise.
4. Testing the rotation invariance of the algorithm.
5. Testing the performance of the algorithms under facial expression variations.
6. Measurement of computation time.
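Before detailing the experiments, the KFD computation of Eqs. (14)–(17) can be sketched in code. This is a minimal numpy illustration, not the authors' implementation: the signed magnitude power in the kernel (to keep fractional powers real for negative dot products), the explicit kernel centering (which realizes the subtraction of the overall mean m appearing in Eq. (10)), and the absolute value guard in the Eq. (17) normalization are our assumptions.

```python
import numpy as np

def poly_kernel(A, B, d=0.7):
    """Fractional power polynomial kernel of Eq. (14): K(x, y) = (x . y)^d.
    A signed magnitude power keeps the value real when a dot product is
    negative (an assumption; the paper does not discuss this case)."""
    G = A @ B.T
    return np.sign(G) * np.abs(G) ** d

def kfd_train(X, labels, d=0.7, eps=1e-4):
    """Kernel Fisher discriminant training via Eqs. (15)-(17).
    X: (M, p) training features; labels: (M,) class ids.
    Returns the expansion coefficients alpha, one column per basis vector."""
    M = len(labels)
    classes = np.unique(labels)
    K = poly_kernel(X, X, d)
    # Centre the kernel, i.e. subtract the overall mean in feature space.
    E = np.full((M, M), 1.0 / M)
    K = K - E @ K - K @ E + E @ K @ E
    # Block diagonal W of Eq. (10): entries 1/Nj inside each class block.
    W = np.zeros((M, M))
    for c in classes:
        idx = np.flatnonzero(labels == c)
        W[np.ix_(idx, idx)] = 1.0 / len(idx)
    # Eq. (16): ordinary eigenproblem (KK + eps I)^(-1) (KWK) a = lambda a.
    A = np.linalg.solve(K @ K + eps * np.eye(M), K @ W @ K)
    vals, vecs = np.linalg.eig(A)
    alpha = vecs[:, np.argsort(-vals.real)[: len(classes) - 1]].real
    # Eq. (17): scale each vector so that alpha^T K alpha = 1.
    for j in range(alpha.shape[1]):
        alpha[:, j] /= np.sqrt(abs(alpha[:, j] @ K @ alpha[:, j]))
    return alpha

def kfd_project(X_train, alpha, X_new, d=0.7):
    """Project new samples onto the KFD basis via kernel evaluations,
    applying the same centering as in training."""
    K = poly_kernel(X_train, X_train, d)
    k = poly_kernel(X_new, X_train, d)
    kc = k - k.mean(axis=1, keepdims=True) - K.mean(axis=0) + K.mean()
    return kc @ alpha
```

Classification then proceeds, as in the paper, by nearest neighbor matching of the projected test features against the projected training features.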
4.1. Databases and normalization procedure

The FERET database, which has become the de facto standard for evaluating face recognition technologies, consists of more than 13,000 facial images of more than 1500 subjects. The diversity of the FERET database spans gender, ethnicity, and age. Since the images were acquired during different photo sessions, the illumination conditions, facial expressions, and the size of the face vary. The data set used in
Fig. 3. (a) Original and normalized images from the FERET database. (b) Sample images from the ORL database. (c) Sample images from the Yale database.
our experiments consists of 3000 FERET face images corresponding to 500 subjects. The images are of size 256 × 384 with 8-bit gray level resolution (Phillips et al., 2000). The ORL face database is composed of 400 images, ten different images for each of 40 distinct subjects. The images vary in pose, size, time, and facial expression. All the images were taken against a dark homogeneous background with the subjects in an upright, frontal position, with tolerance for some side movement. The spatial resolution is 92 × 112 with 8-bit gray levels (ORL face database). The Yale database contains 165 images of 15 subjects, with 11 images per subject. The images vary in facial expression and lighting conditions. The spatial resolution is 640 × 480 (Yale face database). The images from the FERET and Yale databases are normalized to extract the facial regions, so that the performance of the algorithms is not affected by factors unrelated to the face. In normalization, the centers of the eyes are detected and the images are manually cropped to 128 × 128 pixels to extract the facial region. Fig. 3(a), (b) and (c) shows the cropped images from the FERET, ORL and Yale databases, respectively.

4.2. Experimental results

4.2.1. Testing the performance of the algorithms for normal faces

The first set of experiments assesses the performance of the proposed approach for normal faces, i.e. frontal faces with average illumination and neutral facial expression. Facial features in the proposed approach are the low frequency components; if these components are enhanced and the discriminative features in the frequency domain are extracted, the recognition rate can be significantly improved with reduced dimensionality. The closest match (minimum distance) to any of the correct training images is considered the correct match for a test image. The recognition rate is defined as
Recognition rate = (Number of correct matches / Total number of test images) × 100%.
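The matching rule and the recognition rate defined above can be sketched as follows; this is a minimal illustration with hypothetical feature arrays, but the classifier itself is exactly the nearest neighbor rule on Euclidean distances used throughout the paper.

```python
import numpy as np

def nearest_neighbor_labels(train_feats, train_labels, test_feats):
    """Label each test sample with the label of its Euclidean-closest
    training sample (the classifier used in all experiments)."""
    dists = np.linalg.norm(test_feats[:, None, :] - train_feats[None, :, :], axis=2)
    return np.asarray(train_labels)[np.argmin(dists, axis=1)]

def recognition_rate(predicted, actual):
    """Recognition rate = correct matches / total test images * 100%."""
    return 100.0 * np.mean(np.asarray(predicted) == np.asarray(actual))
```

For example, a probe feature vector closer to a training vector labelled 0 than to one labelled 1 receives label 0, and the recognition rate is the percentage of probes labelled correctly.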
The proposed algorithm computes the optimum number of Radon projections of an image. Significant coefficients of the DCT of the Radon space, after removal of the DC component, are concatenated to derive the facial features, identified as RDCT features. The polynomial kernel Fisher discriminant algorithm implemented on RDCT features is identified as RDCTKFD.

In the first part of the experiments, the optimum number of Radon projections was derived experimentally. The recognition rates of the RDCT and RDCTKFD algorithms were computed for a range of projection counts between 2 and 180, for angles between 0° and 179°. Table 1 shows the recognition rate for different numbers of Radon projections. It was observed that the recognition rate remains almost unaffected by any increase in the number of projections above one third of N_s,min (60). Hence only 60 projections (the optimum number) were used for testing the performance of the proposed algorithm.

Table 1
Effect of the number of Radon projections on face recognition performance (Set 1, FERET database). Entries are recognition rates (%).

Number of projections   RDCT   RDCTKFD
2                       72.3   79.8
5                       78.1   85.7
10                      80.2   88.2
20                      84.6   90.4
40                      90.1   97.2
50                      93.2   98.1
60                      94.3   98.9
90                      94.3   98.9
125                     94.3   98.9
150                     94.3   98.9
180                     94.1   98.9

In the next part, the performance of the RDCT and RDCTKFD approaches was investigated for variation in the number of DCT coefficients selected for the feature vector; the results are presented in Table 2. There is a significant improvement in the recognition rate as the share of DCT coefficients used increases up to about 25%; any further increase does not improve performance significantly. Therefore 25% of the DCT coefficients were selected as significant coefficients in all experiments.

Table 2
Percentage of DCT coefficients used versus recognition rate (%) (Set 1, FERET database).

% of DCT coefficients used   DCT only   RDCT   RDCTKFD
5                            73.6       86.4   89.4
10                           80.5       89     95.2
20                           87.9       94.3   97.2
25                           91.1       94.3   98.9
30                           91.1       94.3   98.9
40                           91.1       94.3   98.9
50                           91.1       94.3   98.9
75                           91.1       94.6   98.9
100                          91.1       94.3   98.9

In the subsequent experiment, the performance of the RDCTKFD algorithm was tested for integer and fractional degrees of the polynomial; the results are presented in Table 3. The highest recognition rate was obtained for polynomial degree d = 0.7, due to the better linear separation of the features after the nonlinear mapping. The results presented in the paper use the optimum number of Radon projections (60), 25% of the DCT coefficients, and polynomial degree d = 0.7.

Table 3
Effect of the degree of the polynomial on the recognition rate of the RDCTKFD algorithm (Set 1, FERET database).

Degree of polynomial   Recognition rate (%)
3                      88.6
2                      92.3
1                      94
0.9                    94.1
0.8                    97.2
0.7                    98.9
0.6                    98.1

For comparison, the PCA (Turk and Pentland, 1991), KPCA (Liu, 2004), LDA (Belhumeur et al., 1997) and KFD (Lu et al., 2003) algorithms were implemented and tested, all using the nearest neighbor classifier. All of the above algorithms were tested using ten different combinations of two and three images per subject from the FERET database for training as well as testing. The average recognition rates, standard deviations, and 95% confidence intervals over these combinations are presented in Table 4. The standard deviation is a useful criterion for evaluating whether different selections of training images affect recognition performance: the lower the standard deviation, the smaller the effect of choosing different training image sets. Tables 5 and 6 show the performance for the ORL and Yale databases (for ten combinations of three images per subject in training and testing). The results show that the proposed algorithm yields the maximum recognition rate with the minimum standard deviation; thus the proposed approach provides effective and efficient features for face recognition. The narrow confidence interval of the proposed algorithm implies that its performance is insensitive to the choice of training image sets.

Table 4
Face recognition performance for two and three images per subject in training on the FERET database (10 combinations).

Algorithm        Average recognition rate   Standard deviation   Confidence interval
Performance for two images in training set
PCA              88.2                       1.063                [87.398, 89.000]
KPCA (d = 0.7)   90.09                      1.039                [89.306, 90.873]
LDA              89.67                      0.989                [88.925, 90.415]
KFD              91.75                      1.400                [90.707, 92.793]
Radon            87.2                       0.752                [86.633, 87.767]
DCT              91.08                      0.397                [90.780, 91.379]
RDCT             94.38                      0.282                [94.168, 94.592]
RDCTKFD          98.21                      0.196                [98.062, 98.357]
Performance for three images in training set
PCA              89.86                      0.959                [89.136, 90.583]
KPCA (d = 0.7)   91.09                      0.890                [90.418, 91.761]
LDA              91.82                      0.760                [91.247, 92.393]
KFD              92.01                      0.882                [91.345, 92.675]
Radon            87.71                      0.482                [87.346, 88.073]
DCT              92.33                      0.401                [92.028, 92.632]
RDCT             95.47                      0.298                [95.245, 95.694]
RDCTKFD          99.05                      0.085                [98.986, 99.114]

Table 5
Face recognition performance for three images per subject in training and testing on the ORL database (10 combinations).

Algorithm        Average recognition rate   Standard deviation   Confidence interval
PCA              89.03                      1.282                [88.033, 89.996]
KPCA (d = 0.7)   90.65                      1.005                [89.892, 91.407]
LDA              90.38                      0.995                [89.08, 91.407]
KFD              91.54                      0.851                [90.898, 92.181]
Radon            89                         0.392                [88.704, 89.295]
DCT              91.78                      0.400                [91.478, 92.081]
RDCT             97                         0.305                [96.774, 97.226]
RDCTKFD          99.31                      0.182                [99.172, 99.447]

Table 6
Face recognition performance for three images per subject in training and testing on the Yale database (10 combinations).

Algorithm        Average recognition rate   Standard deviation   Confidence interval
PCA              88.49                      0.851                [87.847, 89.132]
KPCA (d = 0.7)   90.22                      0.755                [89.659, 90.789]
LDA              92.08                      0.590                [91.653, 92.525]
KFD              94.76                      0.727                [94.218, 95.315]
Radon            92.83                      0.482                [92.466, 93.193]
DCT              91.5                       0.338                [91.208, 91.792]
RDCT             96.5                       0.298                [96.275, 96.724]
RDCTKFD          99.6                       0.109                [99.518, 99.682]
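The RDCT feature construction explored in Tables 1 and 2 can be sketched as follows. This is a minimal numpy-only illustration, not the authors' implementation: the nearest-bin accumulation in the Radon step, the matrix form of the Eq. (6) DCT, and the choice of the largest-magnitude 25% of coefficients as the "significant" ones are our assumptions (the paper does not specify how the significant coefficients are ranked).

```python
import numpy as np

def radon_space(img, n_angles=60):
    """Naive Radon transform: for each angle, pixel values are summed
    into bins indexed by the distance r = x*cos(t) + y*sin(t),
    measured from the image centre."""
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    xc, yc = xx - (w - 1) / 2.0, yy - (h - 1) / 2.0
    rmax = int(np.ceil(np.hypot(h, w) / 2))
    sino = np.zeros((2 * rmax + 1, n_angles))
    for k, ang in enumerate(np.linspace(0.0, 180.0, n_angles, endpoint=False)):
        t = np.deg2rad(ang)
        r = np.round(xc * np.cos(t) + yc * np.sin(t)).astype(int) + rmax
        np.add.at(sino[:, k], r.ravel(), img.ravel())
    return sino

def dct2(f):
    """Orthonormal 2-D DCT-II, the matrix form of Eq. (6): B = C_M f C_N^T."""
    def basis(M):
        p = np.arange(M)[:, None]
        m = np.arange(M)[None, :]
        C = np.sqrt(2.0 / M) * np.cos(np.pi * (2 * m + 1) * p / (2 * M))
        C[0, :] = 1.0 / np.sqrt(M)
        return C
    Cm, Cn = basis(f.shape[0]), basis(f.shape[1])
    return Cm @ f @ Cn.T

def rdct_features(img, n_angles=60, keep=0.25):
    """Radon space -> 2-D DCT -> drop DC -> keep the largest 25% of
    coefficients (by magnitude) as the feature vector."""
    B = dct2(radon_space(img, n_angles))
    B[0, 0] = 0.0              # remove the DC (average illumination) term
    flat = B.ravel()
    idx = np.argsort(-np.abs(flat))[: int(keep * flat.size)]
    return flat[idx]
```

Note that a uniform offset added to the Radon space moves only the (0, 0) DCT coefficient, which is why discarding the DC term suppresses average-illumination differences (see Section 4.2.2).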
Under normal conditions, most of the algorithms achieve a high recognition rate. KPCA performs better than PCA because the features are extracted after a nonlinear mapping, and KFD performs better than LDA because the nonlinear mapping in KFD extracts more detailed information from the face images. The performance of the proposed approach is better than that of the other approaches. This is due to the boosting of the low frequency components, which contribute significantly to the recognition process, during the computation of the Radon projections, and to the enhancement of the discrimination capability of these features by the kernelized approach.

4.2.2. Testing the performance under illumination variations

The effect of illumination variation on face recognition was investigated using the FERET and Yale databases. Images with variations in illumination were selected for testing (two images per subject for training and five images per subject for testing). Reference images used for training and test images with different illumination are shown in Fig. 4. To remove the effect of illumination variations, the DC component of the DCT of the Radon space is removed before the rows are concatenated to form the feature vector. The results of these experiments are given in Table 7. The performance of the RDCT and RDCTKFD algorithms is not affected by illumination variations because the DC value, corresponding to the average illumination, has been removed from the feature vectors. The performance of the other approaches, such as PCA, is significantly affected: PCA represents faces by their principal components, and the variations between images of the same face due to illumination are larger than the image variations due to a change in face identity.

4.2.3. Testing the robustness against additive noise

In this set of experiments, the robustness of the proposed approach to zero mean white noise was tested using the ORL database.
Five images per subject were randomly selected for training, and the remaining five images per subject were used for testing; for 40 subjects this gives 200 training images and 200 test images. Images without noise were used for training. Zero mean white noise, with variance set by the required signal-to-noise ratio (SNR of 15, 10 and 7 dB), was added to the test images, and the performance of the algorithms was evaluated on this noisy probe set. Fig. 6(a) shows an original image, while (b), (c) and (d) show images with signal-to-noise levels of 15 dB, 10 dB and 7 dB, respectively. Table 8 shows the effect of noise of different levels on the performance of the algorithms. These results show that the proposed method has significantly higher robustness to zero mean white noise: since the SNR improves significantly in the Radon space, the algorithm is robust to zero mean additive noise.

Fig. 4. (a) Reference image and (b) test image with different illumination.

Table 7
Face recognition results (%) under different illumination conditions on different databases.

Algorithm        FERET   Yale
PCA              58.5    62.2
KPCA (d = 0.7)   60.2    64
LDA              78.4    79.8
KFD              79.4    78.9
Radon            64      72.6
DCT              90.4    90.4
RDCT             94.6    95.6
RDCTKFD          98.2    98.8

Table 8
Effect of zero mean white noise on the performance of face recognition algorithms (ORL database). Entries are recognition rates (%) for the given signal-to-noise level in the test images.

Algorithm          No noise   15 dB   10 dB   7 dB
PCA                84         74.5    62.0    54.5
KPCA               88.5       81.5    75.0    66.5
LDA                87         75.5    65.5    58
KFD                89         84      78      71.5
Radon transform    87.5       87.5    87      86
DCT                91         87      84.5    75
RDCT               97.4       97.1    96.2    94.5
RDCTKFD            99.3       98.6    97.3    96.2

Table 9
Face recognition results (%) for different rotations using the ORL and FERET databases.

Rotation (degrees)   ORL    FERET
0                    94.5   94.8
+5                   94.5   94.8
−5                   94.5   94.8
+10                  94.5   94.8
−10                  94.5   94.8
+15                  94.5   94.8
−15                  94.5   94.8
+20                  94.0   94.6
−20                  94.0   94.6
+25                  94.0   94.4
−25                  94.0   94.4
+30                  94.5   94.6
−30                  94.5   94.6
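The SNR gain of Eq. (4) arises because a projection sums on the order of N pixels per ray, so the coherent image content grows faster than the zero mean noise. A minimal numerical check of this effect (our illustration; for simplicity only the 0° projection, i.e. column sums, is used, and SNR is taken as a variance ratio):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
yy, xx = np.mgrid[0:n, 0:n]
# smooth synthetic image: a bright centred blob standing in for a face
img = np.exp(-((xx - n / 2) ** 2 + (yy - n / 2) ** 2) / (2 * 12.0 ** 2))
noise = rng.normal(0.0, 0.1, img.shape)        # zero mean white noise

snr_image = img.var() / noise.var()
# 0-degree Radon projection: one bin per column, summed over the rows
snr_proj = img.sum(axis=0).var() / noise.sum(axis=0).var()
print(snr_proj > snr_image)                    # the projection SNR is far higher
```

The signal in each bin accumulates coherently across the ray while the noise accumulates incoherently, which is the mechanism behind the robustness reported in Table 8.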
Fig. 5. In-plane rotated images from ORL database.
Fig. 6. (a) Typical original image from the ORL database and (b)–(d) are test images corrupted with zero mean white noise with signal to noise ratio of 15 dB, 10 dB and 7 dB, respectively.
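The rotation experiments below rest on the property of Eq. (2): an in-plane rotation of the image becomes a translation of the Radon space along the angle axis. A minimal numerical check (our illustration; the rotation is taken as exactly 90° so that the projections reduce to row and column sums and no interpolation error enters):

```python
import numpy as np

rng = np.random.default_rng(2)
img = rng.random((33, 33))            # square test image

def proj_0(a):
    """0-degree projection: sum over rows (one bin per column)."""
    return a.sum(axis=0)

def proj_90(a):
    """90-degree projection: sum over columns (one bin per row)."""
    return a.sum(axis=1)

rotated = np.rot90(img)               # exact in-plane rotation by 90 degrees
# Eq. (2): the 0-degree projection of the rotated image coincides with
# the 90-degree projection of the original image.
print(np.allclose(proj_0(rotated), proj_90(img)))   # True
```

For intermediate angles the same shift holds up to interpolation error, and since the DCT magnitude features are insensitive to this translation, the combined features are rotation invariant.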
4.2.4. Testing the in-plane rotation invariance of the algorithm

The rotation (in-plane, or tilt) invariance of the proposed approach was tested using the ORL and FERET databases. In this experiment, 200 images of 40 persons (five images per subject) from the ORL database were used for training. The remaining 200 ORL images were rotated (tilted) up to 35° on both sides in increments of 5° to form the probe set of 2800 (14 × 200) in-plane rotated images. Fig. 5 shows sample rotated images. For the FERET database, 500 images of 250 persons (two images per subject) were used for training and 7000 (500 × 14) rotated images were used for testing. Table 9 shows the recognition rates for different angles of rotation. As the Radon transform converts rotation into translation and the DCT is invariant to translation, features extracted using their combination are invariant to rotation.

4.2.5. Testing the performance under facial expression variations

Experiments based on face images with different facial expressions were conducted to assess the robustness of the approach to expression variations. In this experiment, two frontal normal images per subject from each database were used for training, and three images per subject with facial expression variations from the FERET, ORL, and Yale databases were used for testing. Reference images from the Yale database used for training, and test images with different facial expressions, are shown in Fig. 7. Facial expressions cause local distortions in the images, which contribute to the high frequency components; the Radon projections (integration) minimize the effect of these variations. The recognition rates for the different databases are given in Table 10. As PCA maintains the global structure of the input and discards the detailed local information, it is not sensitive to facial expressions. The nonlinear mapping in KFD considers the higher order statistical properties of the input features, so its performance is also not severely affected by expression variations.

Fig. 7. (a) Reference image and (b) test image with different facial expressions.

Table 10
Face recognition results (%) under different facial expressions on different databases.

Algorithm        FERET   ORL    Yale
PCA              85      87.4   88.3
KPCA (d = 0.7)   86.2    82     87
LDA              78      75.3   65
KFD              90      89     94.7
Radon            84      85.6   92
DCT              80.2    82.4   81.2
RDCT             89.3    88.5   90.1
RDCTKFD          91.4    90.8   90.9

4.2.6. Measurement of computation time

All the experiments were performed on a Pentium 4 PC with a CPU speed of 2.6 GHz and 2 GB RAM, using MATLAB. The time taken by each algorithm for training and for testing the 200 images is given in Table 11. The proposed method takes somewhat more time for training because of the combination of the Radon transform and the DCT and the nonlinear mapping performed on these features. Once the features are extracted, the time required for identity recognition is small, making the proposed approach suitable for real time face recognition.

Table 11
Comparison of running times for different algorithms (ORL database, five images per person in the gallery and probe; 200 images).

Algorithm   Training time (s)   Testing time (s)   Recognition rate (%)
PCA         16.297              2.305              84
KPCA        17.515              4.782              88.5
LDA         19.672              4.898              87
KFD         28.625              8.278              89
Radon       12.032              12.172             87.5
RDCT        22.844              16.235             92.5
RDCTKFD     45.609              18.689             98.5

5. Conclusions

In this paper, we have proposed a technique for face recognition that uses the Radon transform and the DCT to derive directional frequency features. The Radon transform first derives desirable directional face features; being a line integral, it enhances the low frequency components, which are significant in the identification process. The DCT then derives compact frequency features from the Radon space. Fractional power polynomial KFD analysis performed on these features enhances their discrimination capability. Experiments on noisy and rotated images demonstrate the noise immunity and rotation invariance of the proposed approach, which is effective in both accuracy and dimensionality reduction. The feasibility of the algorithm has been successfully tested using the FERET, ORL, and Yale databases. Experimental results show that the proposed
approach performs better than the other algorithms, such as PCA, KPCA, and KFD.

Acknowledgements

The authors thank the associate editor and the anonymous reviewers for their critical and constructive comments and suggestions, which helped to improve the quality of this paper.

References

Belhumeur, P.N., Hespanha, J.P., Kriegman, D.J., 1997. Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Machine Intell. 19 (7), 711–720.
Chen, W., Er, M.J., Wu, S., 2005. PCA and LDA in DCT domain. Pattern Recognition Lett. 26, 2474–2482.
Cover, T.M., 1965. Geometrical and statistical properties of systems of linear inequalities with application in pattern recognition. IEEE Trans. Electr. Comput. 14 (3), 326–334.
Ekenel, H.K., Sankur, B., 2004. Feature selection in the independent component subspace for face recognition. Pattern Recognition Lett. 25 (12), 1377–1388.
Ekenel, H.K., Stiefelhagen, R., 2006. Analysis of local appearance-based face recognition: effects of feature selection and feature normalization. CVPR Workshop, New York, USA.
Haykin, S., 1999. Neural Networks – A Comprehensive Foundation, second ed. Prentice Hall.
Jadhav, D.V., Holambe, R.S., 2008. Radon and discrete cosine transforms based feature extraction and dimensionality reduction approach for face recognition. Signal Process. 88 (10), 2604–2609.
Jain, A.K., Ross, A., Prabhakar, S., 2004. An introduction to biometric recognition. IEEE Trans. Circuits Systems Video Technol. 14 (1), 4–20.
Khouzani, K.J., Soltanian, H.Z., 2005. Rotation invariant multiresolution texture analysis using Radon and wavelet transforms. IEEE Trans. Image Process. 14 (6), 783–794.
Lai, J.H., Yuen, P.C., Feng, G.C., 2001. Face recognition using holistic Fourier invariant features. Pattern Recognition 34, 95–109.
Liu, C., 2004. Gabor-based kernel PCA with fractional power polynomial models for face recognition. IEEE Trans. Pattern Anal. Machine Intell. 26 (5), 572–581.
Liu, C., 2006. Capitalize on dimensionality increasing techniques for improving face recognition grand challenge performance. IEEE Trans. Pattern Anal. Machine Intell. 28 (5), 725–737.
Liu, C., Wechsler, H., 2002. Gabor feature based classification using the enhanced Fisher linear discriminant model for face recognition. IEEE Trans. Image Process. 11 (4), 467–476.
Lu, J., Plataniotis, K.N., Venetsanopoulos, A.N., 2003. Face recognition using kernel direct discriminant analysis algorithms. IEEE Trans. Neural Networks 14 (1), 117–126.
Magli, E., Olmo, G., Presti, L., 1999. Pattern recognition by means of the Radon transform and the continuous wavelet transform. Signal Process. 73, 277–289.
ORL face database.
Phillips, P.J., 1998. Matching pursuit filters applied to face identification. IEEE Trans. Image Process. 7 (8), 1150–1164.
Phillips, P.J., Moon, H., Rizvi, S.A., Rauss, P.J., 2000. The FERET evaluation methodology for face recognition algorithms. IEEE Trans. Pattern Anal. Machine Intell. 22 (10), 1090–1104.
Phillips, P.J., Flynn, P.J., Scruggs, T., Bowyer, K.W., Chang, J., Hoffman, K., Marques, J., Min, J., Worek, W., 2005. Overview of the face recognition grand challenge. IEEE Conf. on Computer Vision and Pattern Recognition 1, 947–954.
Sanderson, C., Bengio, S., Gao, Y., 2006. On transforming statistical models for non-frontal face verification. Pattern Recognition 39, 288–302.
Turk, M., Pentland, A., 1991. Eigenfaces for recognition. J. Cognitive Neurosci. 3 (1), 71–86.
Wang, X., Xiao, B., Ma, J.F., Bi, X.L., 2007. Scaling and rotation invariant analysis approach to object recognition based on Radon and Fourier–Mellin transforms. Pattern Recognition 40, 3503–3508.
Yale face database.
Zhou, D., Yang, X., Peng, N., Wang, Y., 2006. Improved LDA based face recognition using both global and local information. Pattern Recognition Lett. 27, 536–543.