Pattern Recognition Letters 31 (2010) 1002–1009
Rotation, illumination invariant polynomial kernel Fisher discriminant analysis using Radon and discrete cosine transforms based features for face recognition

Dattatray V. Jadhav a,*, Raghunath S. Holambe b

a Department of Electronics, Vishwakarma Institute of Technology, Pune (MS), India
b Department of Instrumentation, SGGSIE&T, Vishnupuri, Nanded (MS), India
Article info

Article history: Received 2 September 2008; received in revised form 5 October 2009; available online 4 January 2010. Communicated by T. Tan.

Keywords: Face recognition; Feature extraction; Radon transform; Discrete cosine transform (DCT); Kernel Fisher discriminant (KFD)
Abstract

This paper presents an in-plane rotation (tilt) and illumination invariant pattern recognition framework for face recognition, based on the combination of features extracted using the Radon and discrete cosine transforms with kernel based learning. The Radon transform enhances the low frequency components, which are useful for face recognition, while the DCT yields a low dimensional feature vector. The proposed technique computes Radon projections in different orientations and captures the directional features of the face images; the DCT applied to the Radon projections provides frequency features. Polynomial kernel Fisher discriminant analysis implemented on these features further enhances their discrimination capability. The technique is also robust to zero mean white noise. The feasibility of the proposed technique has been evaluated using the FERET, ORL, and Yale databases. © 2009 Elsevier B.V. All rights reserved.
1. Introduction

Face recognition has received much attention due to its potential applications as well as its challenges. A successful face recognition system must be robust to the many variations of face images, such as viewpoint, in-plane rotation, illumination, and expression (Jain et al., 2004). It should attain high recognition accuracy with a minimum number of features and should also be robust to noise. Popular pattern recognition techniques based on data reduction, such as redundancy and dimensionality reduction, have difficulty solving complex pattern recognition problems such as human face recognition. Low dimensional pattern recognition methods based on principal component analysis (PCA) (Turk and Pentland, 1991), linear discriminant analysis (LDA) (Belhumeur et al., 1997), and other subspace-based methods (Ekenel and Sankur, 2004) cannot achieve satisfactory performance (Phillips et al., 2005). LDA seeks the optimal projection directions that maximize the ratio of between-class scatter to within-class scatter. Gabor wavelets with five scales and eight orientations have been used to derive facial features characterized by spatial frequency, spatial locality, and orientation selectivity, to cope with variations due to illumination and facial expression.

* Corresponding author. Tel.: +91 9422797509; fax: +91 20 24280926. E-mail addresses: [email protected] (D.V. Jadhav), [email protected] (R.S. Holambe).
doi:10.1016/j.patrec.2009.12.026

Fisher linear discriminant
(FLD) analysis is performed on these features in the Gabor Fisher Classifier (GFC) (Liu and Wechsler, 2002). The discrete cosine transform (DCT) is effective in data compaction and has been used to derive facial features; PCA and LDA have been implemented on such features (Chen et al., 2005). A local appearance based face recognition scheme is proposed in (Ekenel and Stiefelhagen, 2006): DCT coefficients of non-overlapping blocks of an image are computed, ordered by zigzag scanning, and concatenated across blocks to obtain the feature vector. The pose mismatch problem has been addressed using the DCTmod2 feature extraction technique, in which DCT coefficients obtained from overlapping blocks are utilized (Sanderson et al., 2006). Global and local facial features derived using the DCT have been combined, with LDA performed on these features to enhance discrimination capability (Zhou et al., 2006). It has been demonstrated that the low frequency components alone are sufficient for face recognition (Lai et al., 2001). The Radon transform enhances the low frequency components and can derive a large number of features (Magli et al., 1999). Since the face image space is of high dimension and the number of face images per subject available for training is usually small, a major computational problem with the LDA algorithm is that the within-class scatter matrix is singular, and hence LDA cannot be applied directly (Phillips, 1998). One solution to this problem is kernel Fisher discriminant analysis (KFD). This paper presents a face recognition scheme that extends our previous work (Jadhav and Holambe, 2008). It is computationally efficient, invariant to in-plane rotation and
illumination. It is also robust to zero mean white noise. The proposed scheme exploits the capability of the Radon transform to enhance low frequency components and that of the DCT to reduce redundancy in the data, so as to derive effective and efficient face features. A fractional power polynomial kernel is used to map these features nonlinearly, which enhances their discrimination capability, and Fisher discriminant analysis is then performed on the mapped features. The effectiveness of the proposed approach is demonstrated in terms of both absolute and comparative performance. The paper is organized as follows. Section 2 briefly describes the Radon transform and the DCT. Section 3 presents the proposed technique. Databases and experimental results are summarized in Section 4, followed by conclusions in Section 5.
2. Radon and discrete cosine transforms

2.1. Radon transform

The Radon transform of a two dimensional function f(x, y) in the (r, θ) plane is defined as

R(r, θ)[f(x, y)] = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} f(x, y) δ(r − x cos θ − y sin θ) dx dy,   (1)

where δ(·) is the Dirac delta function, r ∈ (−∞, ∞) is the perpendicular distance of a line from the origin, and θ ∈ [0, π] is the angle formed by the distance vector and the x-axis, as shown in Fig. 1 (Khouzani and Soltanian, 2005; Wang et al., 2007).

Fig. 1. Radon transform of an image.

A rotation of f(x, y) by an angle φ leads to a translation of R(r, θ) in the variable θ by φ, i.e.

R(r, θ){f(x cos φ + y sin φ, −x sin φ + y cos φ)} = R(r, θ + φ).   (2)

As the Radon transform converts rotation into translation and the DCT is invariant to translation, features extracted using their combination are invariant to in-plane rotation. In addition, due to its excellent energy compaction property, the DCT also helps reduce the feature vector dimension. A further advantage of the Radon transform is its robustness to white noise. The signal-to-noise ratio of the Radon space is related to that of the image as

SNR_proj = SNR_image + 1.7√N · SNR_image,   (3)

SNR_proj ≈ 1.7√N · SNR_image.   (4)

This shows that the SNR of a Radon projection is increased by a factor of about 1.7√N, where N is the image radius in pixels, which in practice is a large quantity (Khouzani and Soltanian, 2005; Wang et al., 2007). Hence the approach is robust to zero mean white noise. The minimum and maximum numbers of Radon projections required for image reconstruction are

N_s,min = πN  and  N_s,max = 2πN.   (5)

2.2. Discrete cosine transform

The DCT is a well-known signal analysis tool used in compression due to its compact representation capability. It has an excellent energy compaction property for highly correlated data, which helps in reducing the feature dimension. Let the image f(x, y) be represented as f(m, n) of size M × N. The 2-D DCT of f(m, n) is given as

B_pq = α_p α_q Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} f(m, n) cos[π(2m + 1)p / (2M)] cos[π(2n + 1)q / (2N)],
0 ≤ p ≤ M − 1, 0 ≤ q ≤ N − 1,   (6)

where p and q denote the frequencies,

α_p = 1/√M for p = 0,  α_p = √(2/M) for 1 ≤ p ≤ M − 1,
α_q = 1/√N for q = 0,  α_q = √(2/N) for 1 ≤ q ≤ N − 1,

and M and N denote the row and column sizes of f(m, n), respectively (Chen et al., 2005; Ekenel and Stiefelhagen, 2006; Sanderson et al., 2006).

3. The Radon transform and DCT based kernel Fisher discriminant (RDCTKFD)

3.1. Feature extraction

Facial features in this approach are the frequency components in different directions, derived using the combination of the Radon transform and the DCT. The Radon transform is a line integral of the image, which enhances the low frequency components useful in face recognition. The Radon spaces of two face images, computed using projections for 0°–179° orientations, are shown in Fig. 2; significant discriminative information is available in the Radon space. The DCT is used to derive frequency features from the Radon space.

Fig. 2. Reference images and the respective Radon space for angles 0°–179°.

In the proposed approach, the Radon space of a face image is computed using the optimum number of Radon projections, which is one third of N_s,min (derived experimentally). The DCT of the Radon space is
computed to derive frequency features. The significant coefficients (25%) of the DCT, excluding the DC component, are concatenated to form the facial feature vector, which reduces the dimension of the feature vector significantly. Fractional power polynomial kernel Fisher discriminant analysis is then performed on the Radon transform and DCT based features to further enhance their discrimination capability (Jadhav and Holambe, 2008).

3.2. Fractional power polynomial kernel Fisher discriminant analysis

The kernel Fisher discriminant (KFD) method combines the kernel trick with FLD. The KFD method uses a nonlinear mapping between the input space and the feature space. According to Cover's theorem on the separability of patterns, the advantage of applying the nonlinear mapping is that it increases the discrimination ability of a pattern classifier: nonlinearly separable patterns in an input space are linearly separable with high probability if the input space is transformed nonlinearly to a high dimensional feature space (Cover, 1965; Haykin, 1999). In the proposed algorithm, a polynomial kernel with fractional power, which provides the nonlinear mapping, has been used.

Let w1, w2, . . . , wL denote the L classes and N1, N2, . . . , NL the number of samples in each class, respectively. Let X = [x1, x2, . . . , xM] be the features of the training samples, derived as in Section 3.1, in the input space, and let φ be a nonlinear mapping between the input space and the feature space. Let D represent the data matrix in the feature space:
D = [φ(x1) φ(x2) . . . φ(xM)],   (7)

where M is the total number of training samples (N1 + N2 + · · · + NL). A kernel matrix K, defined by means of the dot product in the feature space, is

K = DᵀD,   (8)
where K_ij = φ(x_i) · φ(x_j), i, j = 1, 2, . . . , M. The within class scatter matrix in the feature space is given as

S_w = (1/M) DDᵀ.   (9)
The between class scatter matrix is given as

S_B = Σ_{i=1}^{L} N_i (m_i − m)(m_i − m)ᵀ = (1/M) DWDᵀ,   (10)
where m_i is the mean of the ith class and m is the overall mean. W ∈ R^{M×M} is a block diagonal matrix, W = diag[W1, W2, . . . , WL], where W_j ∈ R^{N_j×N_j} is an N_j × N_j matrix with all elements equal to 1/N_j, j = 1, 2, . . . , L. A good criterion for class separability should convert these scatter matrices to a number that becomes large when the between class scatter is large or the within class scatter is small. The KFD projection matrix consists of the eigenvectors V corresponding to the largest eigenvalues λ of
S_B V = λ S_w V,   (11)

V = Σ_{i=1}^{M} C_i φ(x_i) = Dα,   (12)

where α = [C1, C2, . . . , CM]ᵀ. However, the scatter matrices reside in the high dimensional feature space and are difficult to evaluate directly. In practice, the kernel matrix K is computed by means of a kernel function rather than by explicitly implementing the nonlinear mapping φ. This takes advantage of the Mercer equivalence condition and is feasible because the dot products in the high dimensional feature space are replaced by a kernel function in the input space, while the computational complexity is related to the number of training examples rather than the dimension of the feature space (Liu, 2006). The kernel matrix K is computed from the Radon transform and DCT based features using the kernel function

K(x, y) = φ(x) · φ(y).   (13)

In the proposed approach, the polynomial kernel with fractional power is used. It is given as

K(x, y) = (x · y)^d,   (14)

where d is the degree of the polynomial. The algorithm has been tested for both integer and fractional values of d. To circumvent the difficulty of evaluating the scatter matrices in the feature space, S_B and S_w in Eq. (11) are replaced by expressions in the kernel matrix, giving

KWKα = λKKα.   (15)

The generalized eigenvalue problem of Eq. (15) can be converted into an ordinary eigenvalue problem,

(KK + εI)^{−1}(KWK)α = λα,   (16)

where ε is a small positive regularization number and I is an M × M identity matrix. The vector α is normalized so that the KFD basis vector V has unit norm:

‖V‖² = VᵀV = αᵀKα = 1.   (17)

Let V1, V2, . . . , Vn (n ≤ L − 1) be the KFD basis vectors associated with the largest eigenvalues of Eq. (16). The Euclidean distance between the KFD features of a test image and those of each training image is computed, and the test image is classified using the nearest neighbor classifier.

4. Experimental results and performance analysis

This section evaluates the performance of the proposed approach using three databases: (1) the Face Recognition Technology (FERET) database, (2) the Olivetti Research Laboratory (ORL) database, and (3) the Yale database. The effectiveness of the method is shown in terms of an absolute performance index and comparative performance against popular existing face recognition schemes such as PCA, KPCA, LDA, and the kernel Fisher discriminant (KFD). The first part of this section describes the databases, the characteristics of the images, and the normalization procedure. This is followed by the experiments carried out to evaluate performance:

1. Testing the performance of the algorithms for normal faces.
2. Testing the performance of the algorithms under illumination variations.
3. Testing the robustness of the algorithms against zero mean white noise.
4. Testing the rotation invariance of the algorithm.
5. Testing the performance of the algorithms under facial expression variations.
6. Measurement of computation time.
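Before detailing the experiments, the KFD computation of Eqs. (14)–(17) can be sketched in code. This is a minimal numpy illustration, not the authors' implementation: the signed magnitude power in the kernel (to keep fractional powers real for negative dot products), the explicit kernel centering (which realizes the subtraction of the overall mean m appearing in Eq. (10)), and the absolute value guard in the Eq. (17) normalization are our assumptions.

```python
import numpy as np

def poly_kernel(A, B, d=0.7):
    """Fractional power polynomial kernel of Eq. (14): K(x, y) = (x . y)^d.
    A signed magnitude power keeps the value real when a dot product is
    negative (an assumption; the paper does not discuss this case)."""
    G = A @ B.T
    return np.sign(G) * np.abs(G) ** d

def kfd_train(X, labels, d=0.7, eps=1e-4):
    """Kernel Fisher discriminant training via Eqs. (15)-(17).
    X: (M, p) training features; labels: (M,) class ids.
    Returns the expansion coefficients alpha, one column per basis vector."""
    M = len(labels)
    classes = np.unique(labels)
    K = poly_kernel(X, X, d)
    # Centre the kernel, i.e. subtract the overall mean in feature space.
    E = np.full((M, M), 1.0 / M)
    K = K - E @ K - K @ E + E @ K @ E
    # Block diagonal W of Eq. (10): entries 1/Nj inside each class block.
    W = np.zeros((M, M))
    for c in classes:
        idx = np.flatnonzero(labels == c)
        W[np.ix_(idx, idx)] = 1.0 / len(idx)
    # Eq. (16): ordinary eigenproblem (KK + eps I)^(-1) (KWK) a = lambda a.
    A = np.linalg.solve(K @ K + eps * np.eye(M), K @ W @ K)
    vals, vecs = np.linalg.eig(A)
    alpha = vecs[:, np.argsort(-vals.real)[: len(classes) - 1]].real
    # Eq. (17): scale each vector so that alpha^T K alpha = 1.
    for j in range(alpha.shape[1]):
        alpha[:, j] /= np.sqrt(abs(alpha[:, j] @ K @ alpha[:, j]))
    return alpha

def kfd_project(X_train, alpha, X_new, d=0.7):
    """Project new samples onto the KFD basis via kernel evaluations,
    applying the same centering as in training."""
    K = poly_kernel(X_train, X_train, d)
    k = poly_kernel(X_new, X_train, d)
    kc = k - k.mean(axis=1, keepdims=True) - K.mean(axis=0) + K.mean()
    return kc @ alpha
```

Classification then proceeds, as in the paper, by nearest neighbor matching of the projected test features against the projected training features.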
4.1. Databases and normalization procedure

The FERET database, which has become the de facto standard for evaluating face recognition technologies, consists of more than 13,000 facial images of more than 1500 subjects. The diversity of the FERET database spans gender, ethnicity, and age. Since the images were acquired during different photo sessions, the illumination conditions, facial expressions, and the size of the face vary. The data set used in
Fig. 3. (a) Original and normalized images from the FERET database. (b) Sample images from the ORL database. (c) Sample images from the Yale database.
our experiments consists of 3000 FERET face images corresponding to 500 subjects. The images are of size 256 × 384 with 8-bit gray level resolution (Phillips et al., 2000). The ORL face database is composed of 400 images, ten different images for each of 40 distinct subjects. The images vary in pose, size, time, and facial expression. All the images were taken against a dark homogeneous background with the subjects in an upright, frontal position, with tolerance for some side movement. The spatial resolution is 92 × 112 with 8-bit gray levels (ORL face database). The Yale database contains 165 images of 15 subjects, with 11 images per subject. The images vary in facial expression and lighting conditions. The spatial resolution is 640 × 480 (Yale face database). The images from the FERET and Yale databases are normalized to extract the facial regions, so that the performance of the algorithms is not affected by factors unrelated to the face. In normalization, the centers of the eyes are detected and the images are manually cropped to 128 × 128 pixels to extract the facial region. Fig. 3(a), (b) and (c) shows the cropped images from the FERET, ORL and Yale databases, respectively.

4.2. Experimental results

4.2.1. Testing the performance of the algorithms for normal faces

The first set of experiments assesses the performance of the proposed approach for normal faces, i.e. frontal faces with average illumination and neutral facial expression. Facial features in the proposed approach are the low frequency components; if these components are enhanced and the discriminative features in the frequency domain are extracted, the recognition rate can be significantly improved with reduced dimensionality. The closest match (minimum distance) to any of the correct training images is considered the correct match for a test image. The recognition rate is defined as
Recognition rate = (Number of correct matches / Total number of test images) × 100%.
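The matching rule and the recognition rate defined above can be sketched as follows; this is a minimal illustration with hypothetical feature arrays, but the classifier itself is exactly the nearest neighbor rule on Euclidean distances used throughout the paper.

```python
import numpy as np

def nearest_neighbor_labels(train_feats, train_labels, test_feats):
    """Label each test sample with the label of its Euclidean-closest
    training sample (the classifier used in all experiments)."""
    dists = np.linalg.norm(test_feats[:, None, :] - train_feats[None, :, :], axis=2)
    return np.asarray(train_labels)[np.argmin(dists, axis=1)]

def recognition_rate(predicted, actual):
    """Recognition rate = correct matches / total test images * 100%."""
    return 100.0 * np.mean(np.asarray(predicted) == np.asarray(actual))
```

For example, a probe feature vector closer to a training vector labelled 0 than to one labelled 1 receives label 0, and the recognition rate is the percentage of probes labelled correctly.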
The proposed algorithm computes the optimum number of Radon projections of an image. Significant coefficients of the DCT of the Radon space, after removal of the DC component, are concatenated to derive the facial features, identified as RDCT features. The polynomial kernel Fisher discriminant algorithm implemented on RDCT features is identified as RDCTKFD.

In the first part of the experiments, the optimum number of Radon projections was derived experimentally. The recognition rates of the RDCT and RDCTKFD algorithms were computed for a range of projection counts between 2 and 180, for angles between 0° and 179°. Table 1 shows the recognition rate for different numbers of Radon projections. It was observed that the recognition rate remains almost unaffected by any increase in the number of projections above one third of N_s,min (60). Hence only 60 projections (the optimum number) were used for testing the performance of the proposed algorithm.

Table 1
Effect of the number of Radon projections on face recognition performance (Set 1, FERET database). Entries are recognition rates (%).

Number of projections   RDCT   RDCTKFD
2                       72.3   79.8
5                       78.1   85.7
10                      80.2   88.2
20                      84.6   90.4
40                      90.1   97.2
50                      93.2   98.1
60                      94.3   98.9
90                      94.3   98.9
125                     94.3   98.9
150                     94.3   98.9
180                     94.1   98.9

In the next part, the performance of the RDCT and RDCTKFD approaches was investigated for variation in the number of DCT coefficients selected for the feature vector; the results are presented in Table 2. There is a significant improvement in the recognition rate as the share of DCT coefficients used increases up to about 25%; any further increase does not improve performance significantly. Therefore 25% of the DCT coefficients were selected as significant coefficients in all experiments.

Table 2
Percentage of DCT coefficients used versus recognition rate (%) (Set 1, FERET database).

% of DCT coefficients used   DCT only   RDCT   RDCTKFD
5                            73.6       86.4   89.4
10                           80.5       89     95.2
20                           87.9       94.3   97.2
25                           91.1       94.3   98.9
30                           91.1       94.3   98.9
40                           91.1       94.3   98.9
50                           91.1       94.3   98.9
75                           91.1       94.6   98.9
100                          91.1       94.3   98.9

In the subsequent experiment, the performance of the RDCTKFD algorithm was tested for integer and fractional degrees of the polynomial; the results are presented in Table 3. The highest recognition rate was obtained for polynomial degree d = 0.7, due to the better linear separation of the features after the nonlinear mapping. The results presented in the paper use the optimum number of Radon projections (60), 25% of the DCT coefficients, and polynomial degree d = 0.7.

Table 3
Effect of the degree of the polynomial on the recognition rate of the RDCTKFD algorithm (Set 1, FERET database).

Degree of polynomial   Recognition rate (%)
3                      88.6
2                      92.3
1                      94
0.9                    94.1
0.8                    97.2
0.7                    98.9
0.6                    98.1

For comparison, the PCA (Turk and Pentland, 1991), KPCA (Liu, 2004), LDA (Belhumeur et al., 1997) and KFD (Lu et al., 2003) algorithms were implemented and tested, all using the nearest neighbor classifier. All of the above algorithms were tested using ten different combinations of two and three images per subject from the FERET database for training as well as testing. The average recognition rates, standard deviations, and 95% confidence intervals over these combinations are presented in Table 4. The standard deviation is a useful criterion for evaluating whether different selections of training images affect recognition performance: the lower the standard deviation, the smaller the effect of choosing different training image sets. Tables 5 and 6 show the performance for the ORL and Yale databases (for ten combinations of three images per subject in training and testing). The results show that the proposed algorithm yields the maximum recognition rate with the minimum standard deviation; thus the proposed approach provides effective and efficient features for face recognition. The narrow confidence interval of the proposed algorithm implies that its performance is insensitive to the choice of training image sets.

Table 4
Face recognition performance for two and three images per subject in training on the FERET database (10 combinations).

Algorithm        Average recognition rate   Standard deviation   Confidence interval
Performance for two images in training set
PCA              88.2                       1.063                [87.398, 89.000]
KPCA (d = 0.7)   90.09                      1.039                [89.306, 90.873]
LDA              89.67                      0.989                [88.925, 90.415]
KFD              91.75                      1.400                [90.707, 92.793]
Radon            87.2                       0.752                [86.633, 87.767]
DCT              91.08                      0.397                [90.780, 91.379]
RDCT             94.38                      0.282                [94.168, 94.592]
RDCTKFD          98.21                      0.196                [98.062, 98.357]
Performance for three images in training set
PCA              89.86                      0.959                [89.136, 90.583]
KPCA (d = 0.7)   91.09                      0.890                [90.418, 91.761]
LDA              91.82                      0.760                [91.247, 92.393]
KFD              92.01                      0.882                [91.345, 92.675]
Radon            87.71                      0.482                [87.346, 88.073]
DCT              92.33                      0.401                [92.028, 92.632]
RDCT             95.47                      0.298                [95.245, 95.694]
RDCTKFD          99.05                      0.085                [98.986, 99.114]

Table 5
Face recognition performance for three images per subject in training and testing on the ORL database (10 combinations).

Algorithm        Average recognition rate   Standard deviation   Confidence interval
PCA              89.03                      1.282                [88.033, 89.996]
KPCA (d = 0.7)   90.65                      1.005                [89.892, 91.407]
LDA              90.38                      0.995                [89.08, 91.407]
KFD              91.54                      0.851                [90.898, 92.181]
Radon            89                         0.392                [88.704, 89.295]
DCT              91.78                      0.400                [91.478, 92.081]
RDCT             97                         0.305                [96.774, 97.226]
RDCTKFD          99.31                      0.182                [99.172, 99.447]

Table 6
Face recognition performance for three images per subject in training and testing on the Yale database (10 combinations).

Algorithm        Average recognition rate   Standard deviation   Confidence interval
PCA              88.49                      0.851                [87.847, 89.132]
KPCA (d = 0.7)   90.22                      0.755                [89.659, 90.789]
LDA              92.08                      0.590                [91.653, 92.525]
KFD              94.76                      0.727                [94.218, 95.315]
Radon            92.83                      0.482                [92.466, 93.193]
DCT              91.5                       0.338                [91.208, 91.792]
RDCT             96.5                       0.298                [96.275, 96.724]
RDCTKFD          99.6                       0.109                [99.518, 99.682]
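The RDCT feature construction explored in Tables 1 and 2 can be sketched as follows. This is a minimal numpy-only illustration, not the authors' implementation: the nearest-bin accumulation in the Radon step, the matrix form of the Eq. (6) DCT, and the choice of the largest-magnitude 25% of coefficients as the "significant" ones are our assumptions (the paper does not specify how the significant coefficients are ranked).

```python
import numpy as np

def radon_space(img, n_angles=60):
    """Naive Radon transform: for each angle, pixel values are summed
    into bins indexed by the distance r = x*cos(t) + y*sin(t),
    measured from the image centre."""
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    xc, yc = xx - (w - 1) / 2.0, yy - (h - 1) / 2.0
    rmax = int(np.ceil(np.hypot(h, w) / 2))
    sino = np.zeros((2 * rmax + 1, n_angles))
    for k, ang in enumerate(np.linspace(0.0, 180.0, n_angles, endpoint=False)):
        t = np.deg2rad(ang)
        r = np.round(xc * np.cos(t) + yc * np.sin(t)).astype(int) + rmax
        np.add.at(sino[:, k], r.ravel(), img.ravel())
    return sino

def dct2(f):
    """Orthonormal 2-D DCT-II, the matrix form of Eq. (6): B = C_M f C_N^T."""
    def basis(M):
        p = np.arange(M)[:, None]
        m = np.arange(M)[None, :]
        C = np.sqrt(2.0 / M) * np.cos(np.pi * (2 * m + 1) * p / (2 * M))
        C[0, :] = 1.0 / np.sqrt(M)
        return C
    Cm, Cn = basis(f.shape[0]), basis(f.shape[1])
    return Cm @ f @ Cn.T

def rdct_features(img, n_angles=60, keep=0.25):
    """Radon space -> 2-D DCT -> drop DC -> keep the largest 25% of
    coefficients (by magnitude) as the feature vector."""
    B = dct2(radon_space(img, n_angles))
    B[0, 0] = 0.0              # remove the DC (average illumination) term
    flat = B.ravel()
    idx = np.argsort(-np.abs(flat))[: int(keep * flat.size)]
    return flat[idx]
```

Note that a uniform offset added to the Radon space moves only the (0, 0) DCT coefficient, which is why discarding the DC term suppresses average-illumination differences (see Section 4.2.2).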
Under normal conditions, most of the algorithms achieve a high recognition rate. KPCA performs better than PCA because the features are extracted after a nonlinear mapping, and KFD performs better than LDA because the nonlinear mapping in KFD extracts more detailed information from the face images. The performance of the proposed approach is better than that of the other approaches. This is due to the boosting of the low frequency components, which contribute significantly to the recognition process, during the computation of the Radon projections, and to the enhancement of the discrimination capability of these features by the kernelized approach.

4.2.2. Testing the performance under illumination variations

The effect of illumination variation on face recognition was investigated using the FERET and Yale databases. Images with variations in illumination were selected for testing (two images per subject for training and five images per subject for testing). Reference images used for training and test images with different illumination are shown in Fig. 4. To remove the effect of illumination variations, the DC component of the DCT of the Radon space is removed before the rows are concatenated to form the feature vector. The results of these experiments are given in Table 7. The performance of the RDCT and RDCTKFD algorithms is not affected by illumination variations because the DC value, corresponding to the average illumination, has been removed from the feature vectors. The performance of the other approaches, such as PCA, is significantly affected: PCA represents faces by their principal components, and the variations between images of the same face due to illumination are larger than the image variations due to a change in face identity.

4.2.3. Testing the robustness against additive noise

In this set of experiments, the robustness of the proposed approach to zero mean white noise was tested using the ORL database.
Five images per subject were randomly selected for training, and the remaining five images per subject were used for testing; for 40 subjects this gives 200 training images and 200 test images. Images without noise were used for training. Zero mean white noise, with variance set by the required signal-to-noise ratio (SNR of 15, 10 and 7 dB), was added to the test images, and the performance of the algorithms was evaluated on this noisy probe set. Fig. 6(a) shows an original image, while (b), (c) and (d) show images with signal-to-noise levels of 15 dB, 10 dB and 7 dB, respectively. Table 8 shows the effect of noise of different levels on the performance of the algorithms. These results show that the proposed method has significantly higher robustness to zero mean white noise: since the SNR improves significantly in the Radon space, the algorithm is robust to zero mean additive noise.

Fig. 4. (a) Reference image and (b) test image with different illumination.

Table 7
Face recognition results (%) under different illumination conditions on different databases.

Algorithm        FERET   Yale
PCA              58.5    62.2
KPCA (d = 0.7)   60.2    64
LDA              78.4    79.8
KFD              79.4    78.9
Radon            64      72.6
DCT              90.4    90.4
RDCT             94.6    95.6
RDCTKFD          98.2    98.8

Table 8
Effect of zero mean white noise on the performance of face recognition algorithms (ORL database). Entries are recognition rates (%) for the given signal-to-noise level in the test images.

Algorithm          No noise   15 dB   10 dB   7 dB
PCA                84         74.5    62.0    54.5
KPCA               88.5       81.5    75.0    66.5
LDA                87         75.5    65.5    58
KFD                89         84      78      71.5
Radon transform    87.5       87.5    87      86
DCT                91         87      84.5    75
RDCT               97.4       97.1    96.2    94.5
RDCTKFD            99.3       98.6    97.3    96.2

Table 9
Face recognition results (%) for different rotations using the ORL and FERET databases.

Rotation (degrees)   ORL    FERET
0                    94.5   94.8
+5                   94.5   94.8
−5                   94.5   94.8
+10                  94.5   94.8
−10                  94.5   94.8
+15                  94.5   94.8
−15                  94.5   94.8
+20                  94.0   94.6
−20                  94.0   94.6
+25                  94.0   94.4
−25                  94.0   94.4
+30                  94.5   94.6
−30                  94.5   94.6
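The SNR gain of Eq. (4) arises because a projection sums on the order of N pixels per ray, so the coherent image content grows faster than the zero mean noise. A minimal numerical check of this effect (our illustration; for simplicity only the 0° projection, i.e. column sums, is used, and SNR is taken as a variance ratio):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
yy, xx = np.mgrid[0:n, 0:n]
# smooth synthetic image: a bright centred blob standing in for a face
img = np.exp(-((xx - n / 2) ** 2 + (yy - n / 2) ** 2) / (2 * 12.0 ** 2))
noise = rng.normal(0.0, 0.1, img.shape)        # zero mean white noise

snr_image = img.var() / noise.var()
# 0-degree Radon projection: one bin per column, summed over the rows
snr_proj = img.sum(axis=0).var() / noise.sum(axis=0).var()
print(snr_proj > snr_image)                    # the projection SNR is far higher
```

The signal in each bin accumulates coherently across the ray while the noise accumulates incoherently, which is the mechanism behind the robustness reported in Table 8.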
Fig. 5. In-plane rotated images from ORL database.
Fig. 6. (a) Typical original image from the ORL database and (b)–(d) are test images corrupted with zero mean white noise with signal to noise ratio of 15 dB, 10 dB and 7 dB, respectively.
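The rotation experiments below rest on the property of Eq. (2): an in-plane rotation of the image becomes a translation of the Radon space along the angle axis. A minimal numerical check (our illustration; the rotation is taken as exactly 90° so that the projections reduce to row and column sums and no interpolation error enters):

```python
import numpy as np

rng = np.random.default_rng(2)
img = rng.random((33, 33))            # square test image

def proj_0(a):
    """0-degree projection: sum over rows (one bin per column)."""
    return a.sum(axis=0)

def proj_90(a):
    """90-degree projection: sum over columns (one bin per row)."""
    return a.sum(axis=1)

rotated = np.rot90(img)               # exact in-plane rotation by 90 degrees
# Eq. (2): the 0-degree projection of the rotated image coincides with
# the 90-degree projection of the original image.
print(np.allclose(proj_0(rotated), proj_90(img)))   # True
```

For intermediate angles the same shift holds up to interpolation error, and since the DCT magnitude features are insensitive to this translation, the combined features are rotation invariant.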
4.2.4. Testing the in-plane rotation invariance of the algorithm

The rotation (in-plane, or tilt) invariance of the proposed approach was tested using the ORL and FERET databases. In this experiment, 200 images of 40 persons (five images per subject) from the ORL database were used for training. The remaining 200 ORL images were rotated (tilted) up to 35° on both sides in increments of 5° to form the probe set of 2800 (14 × 200) in-plane rotated images. Fig. 5 shows sample rotated images. For the FERET database, 500 images of 250 persons (two images per subject) were used for training and 7000 (500 × 14) rotated images were used for testing. Table 9 shows the recognition rates for different angles of rotation. As the Radon transform converts rotation into translation and the DCT is invariant to translation, features extracted using their combination are invariant to rotation.

4.2.5. Testing the performance under facial expression variations

Experiments based on face images with different facial expressions were conducted to assess the robustness of the approach to expression variations. In this experiment, two frontal normal images per subject from each database were used for training, and three images per subject with facial expression variations from the FERET, ORL, and Yale databases were used for testing. Reference images from the Yale database used for training, and test images with different facial expressions, are shown in Fig. 7. Facial expressions cause local distortions in the images, which contribute to the high frequency components; the Radon projections (integration) minimize the effect of these variations. The recognition rates for the different databases are given in Table 10. As PCA maintains the global structure of the input and discards the detailed local information, it is not sensitive to facial expressions. The nonlinear mapping in KFD considers the higher order statistical properties of the input features, so its performance is also not severely affected by expression variations.

Fig. 7. (a) Reference image and (b) test image with different facial expressions.

Table 10
Face recognition results (%) under different facial expressions on different databases.

Algorithm        FERET   ORL    Yale
PCA              85      87.4   88.3
KPCA (d = 0.7)   86.2    82     87
LDA              78      75.3   65
KFD              90      89     94.7
Radon            84      85.6   92
DCT              80.2    82.4   81.2
RDCT             89.3    88.5   90.1
RDCTKFD          91.4    90.8   90.9

4.2.6. Measurement of computation time

All the experiments were performed on a Pentium 4 PC with a CPU speed of 2.6 GHz and 2 GB RAM, using MATLAB. The time taken by each algorithm for training and for testing the 200 images is given in Table 11. The proposed method takes somewhat more time for training because of the combination of the Radon transform and the DCT and the nonlinear mapping performed on these features. Once the features are extracted, the time required for identity recognition is small, making the proposed approach suitable for real time face recognition.

Table 11
Comparison of running times for different algorithms (ORL database, five images per person in the gallery and probe; 200 images).

Algorithm   Training time (s)   Testing time (s)   Recognition rate (%)
PCA         16.297              2.305              84
KPCA        17.515              4.782              88.5
LDA         19.672              4.898              87
KFD         28.625              8.278              89
Radon       12.032              12.172             87.5
RDCT        22.844              16.235             92.5
RDCTKFD     45.609              18.689             98.5

5. Conclusions

In this paper, we have proposed a technique for face recognition that uses the Radon transform and the DCT to derive directional frequency features. The Radon transform first derives desirable directional face features; being a line integral, it enhances the low frequency components, which are significant in the identification process. The DCT then derives compact frequency features from the Radon space. Fractional power polynomial KFD analysis performed on these features enhances their discrimination capability. Experiments on noisy and rotated images demonstrate the noise immunity and rotation invariance of the proposed approach, which is effective in both accuracy and dimensionality reduction. The feasibility of the algorithm has been successfully tested using the FERET, ORL, and Yale databases. Experimental results show that the proposed
approach performs better than the other algorithms, such as PCA, KPCA, and KFD.

Acknowledgements

The authors thank the associate editor and the anonymous reviewers for their critical and constructive comments and suggestions, which helped to improve the quality of this paper.

References

Belhumeur, P.N., Hespanha, J.P., Kriegman, D.J., 1997. Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Machine Intell. 19 (7), 711–720.
Chen, W., Er, M.J., Wu, S., 2005. PCA and LDA in DCT domain. Pattern Recognition Lett. 26, 2474–2482.
Cover, T.M., 1965. Geometrical and statistical properties of systems of linear inequalities with application in pattern recognition. IEEE Trans. Electr. Comput. 14 (3), 326–334.
Ekenel, H.K., Sankur, B., 2004. Feature selection in the independent component subspace for face recognition. Pattern Recognition Lett. 25 (12), 1377–1388.
Ekenel, H.K., Stiefelhagen, R., 2006. Analysis of local appearance-based face recognition: effects of feature selection and feature normalization. CVPR Workshop, New York, USA.
Haykin, S., 1999. Neural Networks – A Comprehensive Foundation, second ed. Prentice Hall.
Jadhav, D.V., Holambe, R.S., 2008. Radon and discrete cosine transforms based feature extraction and dimensionality reduction approach for face recognition. Signal Process. 88 (10), 2604–2609.
Jain, A.K., Ross, A., Prabhakar, S., 2004. An introduction to biometric recognition. IEEE Trans. Circuits Systems Video Technol. 14 (1), 4–20.
Khouzani, K.J., Soltanian, H.Z., 2005. Rotation invariant multiresolution texture analysis using Radon and wavelet transforms. IEEE Trans. Image Process. 14 (6), 783–794.
Lai, J.H., Yuen, P.C., Feng, G.C., 2001. Face recognition using holistic Fourier invariant features. Pattern Recognition 34, 95–109.
Liu, C., 2004. Gabor-based kernel PCA with fractional power polynomial models for face recognition. IEEE Trans. Pattern Anal. Machine Intell. 26 (5), 572–581.
Liu, C., 2006. Capitalize on dimensionality increasing techniques for improving face recognition grand challenge performance. IEEE Trans. Pattern Anal. Machine Intell. 28 (5), 725–737.
Liu, C., Wechsler, H., 2002. Gabor feature based classification using the enhanced Fisher linear discriminant model for face recognition. IEEE Trans. Image Process. 11 (4), 467–476.
Lu, J., Plataniotis, K.N., Venetsanopoulos, A.N., 2003. Face recognition using kernel direct discriminant analysis algorithms. IEEE Trans. Neural Networks 14 (1), 117–126.
Magli, E., Olmo, G., Presti, L., 1999. Pattern recognition by means of the Radon transform and the continuous wavelet transform. Signal Process. 73, 277–289.
ORL face database.
Phillips, P.J., 1998. Matching pursuit filters applied to face identification. IEEE Trans. Image Process. 7 (8), 1150–1164.
Phillips, P.J., Moon, H., Rizvi, S.A., Rauss, P.J., 2000. The FERET evaluation methodology for face recognition algorithms. IEEE Trans. Pattern Anal. Machine Intell. 22 (10), 1090–1104.
Phillips, P.J., Flynn, P.J., Scruggs, T., Bowyer, K.W., Chang, J., Hoffman, K., Marques, J., Min, J., Worek, W., 2005. Overview of the face recognition grand challenge. IEEE Conf. on Computer Vision and Pattern Recognition 1, 947–954.
Sanderson, C., Bengio, S., Gao, Y., 2006. On transforming statistical models for non-frontal face verification. Pattern Recognition 39, 288–302.
Turk, M., Pentland, A., 1991. Eigenfaces for recognition. J. Cognitive Neurosci. 3 (1), 71–86.
Wang, X., Xiao, B., Ma, J.F., Bi, X.L., 2007. Scaling and rotation invariant analysis approach to object recognition based on Radon and Fourier–Mellin transforms. Pattern Recognition 40, 3503–3508.
Yale face database.
Zhou, D., Yang, X., Peng, N., Wang, Y., 2006. Improved LDA based face recognition using both global and local information. Pattern Recognition Lett. 27, 536–543.