Recent advances in face biometrics with Gabor wavelets: A review

Pattern Recognition Letters 31 (2010) 372–381 Contents lists available at ScienceDirect Pattern Recognition Letters journal homepage: www.elsevier.c...



Pattern Recognition Letters journal homepage: www.elsevier.com/locate/patrec

Ángel Serrano *, Isaac Martín de Diego, Cristina Conde, Enrique Cabello
Face Recognition and Artificial Vision Group, Universidad Rey Juan Carlos, C/ Tulipán s/n, Móstoles, Madrid E-28933, Spain

Article info

Article history: Received 26 November 2008; received in revised form 5 October 2009; available online 10 November 2009. Communicated by G. Borgefors.

Keywords: Gabor wavelet; Face recognition; Face verification; Face database

Abstract

In this paper we focus on the great outburst of Gabor-based methods for face biometrics that has occurred in the last few years. Analytical approaches rely on the representation of a face with the Gabor responses computed on specific landmarks, while holistic methods take into account the face as a whole. We explore the role played by Gabor wavelets in international competitions, such as FERET or BANCA, where Gabor algorithms ranked first above other methods. By analysing five quantifiable factors, we present a ranking of methods as a function of their goodness. An enhanced version of AdaBoost, a complex-valued Gabor representation and a Gabor adaptive downsampling method are the three algorithms that lead the ranking. We also show that there is a global trend toward face recognition methods, as well as toward Gabor holistic algorithms, due to their higher success rates.

© 2009 Elsevier B.V. All rights reserved.

1. Introduction

The face is the most significant part of the human body for our daily mutual interaction (Zhao et al., 2003). With the recent advent of powerful, fast and affordable computers, many automatic face recognition and verification techniques have been developed and have found application in areas such as law enforcement and crime prevention. Recognition (or identification) consists in matching biometric data of an unknown subject with an identity (1:N problem), while verification (or authentication) checks whether certain biometric data correspond to a given identity (1:1 problem).

It has been well known since the 1980s that the simple cells in the visual cortex of our brain show a special sensitivity to orientation and spatial frequency in a visual scene. Orientation refers to the angle formed by an object and a reference axis, whereas spatial frequency accounts for the repetition of a pattern along space. Daugman (1980) realized that Gabor wavelets (Gabor, 1946) could reproduce such behaviour. Defined as a plane wave enveloped by a Gaussian bell, they decompose an image into a set of channels, each one tuned to a different orientation and spatial frequency. It is usual to take into account a filter bank with five frequencies and eight orientations (Lades et al., 1993):

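$$\psi(\vec{r}) = \frac{\|\vec{k}\|^2}{\sigma^2} \exp\left(-\frac{\|\vec{k}\|^2 \|\vec{r}\|^2}{2\sigma^2}\right) \left[\exp\left(i\,\vec{k}\cdot\vec{r}\right) - \exp\left(-\frac{\sigma^2}{2}\right)\right] \qquad (1)$$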
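As an illustration, Eq. (1) can be sampled on a pixel grid to build the usual bank of five frequencies and eight orientations. The following is a minimal NumPy sketch, not the implementation of any of the reviewed papers: the function names, the 33 × 33 kernel size and the envelope width sigma = 2π are assumptions made here for illustration, while k_max = π/2 and a frequency ratio of √2 follow the parametrization that accompanies Eq. (1).

```python
import numpy as np

def gabor_kernel(nu, mu, n_orients=8, size=33, sigma=2 * np.pi,
                 k_max=np.pi / 2, f=np.sqrt(2)):
    """Sample the wavelet of Eq. (1) on a size x size grid (size must be odd)."""
    k = k_max / f ** nu                    # magnitude of the wave vector
    theta = mu * np.pi / n_orients         # orientation in radians
    kx, ky = k * np.cos(theta), k * np.sin(theta)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x ** 2 + y ** 2
    # Gaussian envelope with the |k|^2 / sigma^2 prefactor of Eq. (1).
    envelope = (k ** 2 / sigma ** 2) * np.exp(-k ** 2 * r2 / (2 * sigma ** 2))
    # Plane wave minus the constant term exp(-sigma^2 / 2).
    carrier = np.exp(1j * (kx * x + ky * y)) - np.exp(-sigma ** 2 / 2)
    return envelope * carrier

def gabor_bank(n_freqs=5, n_orients=8, **kwargs):
    """All kernels, ordered by frequency index and then by orientation index."""
    return [gabor_kernel(nu, mu, n_orients=n_orients, **kwargs)
            for nu in range(n_freqs) for mu in range(n_orients)]

bank = gabor_bank()
print(len(bank), bank[0].shape)   # 40 (33, 33)
```

The kernel size is a practical trade-off that the equation itself does not fix: the lowest frequencies have the widest envelopes, so a larger window may be needed if they are not to be truncated.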

The orientation is parametrized as $\theta_\mu = \mu\pi/8$ radians, $0 \le \mu \le 7$. The wave vector $\vec{k} = k_\nu(\cos\theta_\mu\,\vec{u}_x + \sin\theta_\mu\,\vec{u}_y)$ has an angle-independent magnitude $k_\nu = k_{\max}/f^\nu = 2^{-(\nu+2)/2}\pi$, with $k_{\max} = \pi/2$, a frequency ratio $f = \sqrt{2}$ and $0 \le \nu \le 4$. The response of an image $I$ to a wavelet $\psi$ is calculated as the convolution $G = I * \psi$.

Their excellent performance in international competitions such as FERET (Phillips et al., 2000) or BANCA (Messer et al., 2004) has elicited a special interest in Gabor wavelets, with two possible strategies. Either Gabor features are extracted only at certain locations of the image to form local ‘jets’ (Lades et al., 1993; Wiskott et al., 1997), or they are computed globally (Liu and Wechsler, 2002). In either case, a great outburst of Gabor-based methods has recently shaken the biometric community, including AdaBoost-driven Gabor wavelets (Yang et al., 2004; Gao et al., 2006) and complex phase-based Gabor features (Qing et al., 2006; Zhang et al., 2007).

This paper reviews the most recent Gabor-based face recognition and verification techniques and is organized as follows. Section 2 provides an updated survey of Gabor methods. Section 3 looks into the results obtained by Gabor methods in international competitions and evaluation protocols. A complete discussion of the Gabor methods studied, including a critical analysis of their pros and cons, is presented in Section 4, together with the evaluation and ranking of these works as a function of five factors for the goodness of an algorithm. Finally, the conclusions are summarized in Section 5.

2. Gabor-based face recognition methods

* Corresponding author. Tel.: +34 914888122; fax: +34 914888530. E-mail address: [email protected] (Á. Serrano). URL: http://www.frav.es (Á. Serrano). 0167-8655/$ - see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.patrec.2009.11.002

Following the same division proposed by Shen and Bai (2006a), we can consider analytical methods, which compute the response of an image to a Gabor wavelet at a set of discrete locations, and holistic methods, which make use of a global response that is subsequently processed with a dimension reduction algorithm such as Principal Component Analysis (PCA, Turk and Pentland, 1991) or Linear Discriminant Analysis (LDA, Belhumeur et al., 1997), among others. Table 1 provides a schematic classification of the methods described in this paper.
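The holistic route can be made concrete with a short sketch: convolve the image with every wavelet, keep the magnitude, downsample each response map, concatenate the maps into one vector and project that vector onto a few principal components. The code below is an illustrative NumPy outline under assumptions made here (FFT-based circular convolution, plain subsampling with a stride of 8 in each direction, PCA computed through an SVD, hypothetical function names, random arrays standing in for real kernels and face images); it is not the method of any specific paper.

```python
import numpy as np

def gabor_feature_vector(image, kernels, stride=8):
    """Concatenate downsampled Gabor magnitude maps into a single vector."""
    h, w = image.shape
    image_f = np.fft.fft2(image)
    maps = []
    for kernel in kernels:
        # Circular convolution through the FFT (kernel zero-padded to h x w).
        response = np.fft.ifft2(image_f * np.fft.fft2(kernel, s=(h, w)))
        # Keep the magnitude and subsample by `stride` along both axes.
        maps.append(np.abs(response)[::stride, ::stride].ravel())
    return np.concatenate(maps)

def pca_fit(features, n_components):
    """Return the mean and the leading principal axes of row-wise samples."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_project(features, mean, axes):
    """Coordinates of the samples in the reduced PCA space."""
    return (features - mean) @ axes.T

rng = np.random.default_rng(0)
# Random complex arrays standing in for a bank of 40 Gabor kernels.
kernels = [rng.standard_normal((9, 9)) + 1j * rng.standard_normal((9, 9))
           for _ in range(40)]
faces = rng.random((6, 128, 128))          # six synthetic 128 x 128 "images"
X = np.stack([gabor_feature_vector(face, kernels) for face in faces])
mean, axes = pca_fit(X, n_components=3)
Z = pca_project(X, mean, axes)
print(X.shape, Z.shape)                    # (6, 10240) (6, 3)
```

With 128 × 128 inputs and 40 kernels, the undownsampled vector would hold 128 × 128 × 40 = 655,360 values; a stride of 8 along each axis keeps one value in 64, which gives the 10,240 columns printed above.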

2.1. Analytical methods

Analytical methods (Fig. 1) can be divided into three types, depending on how they select the nodes: graph-matching based, manual detection (or other non-graph algorithms) and enhanced methods.

2.1.1. Graph-based analytical methods

In Dynamic Link Architecture (DLA), Lades et al. (1993) proposed placing a rectangular graph over an image, with the Gabor responses (Eq. (1)) computed only at the nodes. Using their own face dataset (Bochum), which contained 128 × 128 images, Lades et al. surveyed the wavelet parameter settings that maximized performance. In order to cover the frequency space with the smallest number of wavelets, they chose a family of 40 Gabor filters parametrized with five frequencies and eight orientations. The 40 coefficients computed for a specific node form a ‘Gabor jet’. Except for very recent works (Qing et al., 2006; Zhang et al., 2007), it is usual to discard the complex phase information and consider only the magnitude.

Table 1. Summary list of the Gabor methods described in this paper. See text for full references.
Type | Subtype (Section) | Examples
Analytical | Graph-based (Section 2.1.1) | DLA; EBGM
Analytical | Non-graph-based (Section 2.1.2) | Manual detection; Ridge/valley detection; Non-uniform sampling; Gaussian mixture models
Analytical | Enhanced (Section 2.1.3) | Optimal Gabor parameters; Gabor + AdaBoost
Holistic | Downsampled Gabor PCA/LDA (Section 2.2.1) | GFC; HEGFC
Holistic | Downsampled Gabor kernel PCA/LDA (Section 2.2.2) | Gabor Kernel PCA; Gabor + KDDA
Holistic | Gabor 2D methods (Section 2.2.3) | Gabor + 2DPCA; Gabor + B2DPCA; Gabor + (2D)²PCA
Holistic | Local Binary Patterns (Section 2.2.4) | LGBPHS; GBC; HGPP
Holistic | No downsampling (Section 2.2.5) | Multichannel Gabor + PCA

Elastic Bunch Graph Matching (EBGM, Wiskott et al., 1997) used a set of shape-driven graphs whose nodes were located over specific landmarks of the face, such as the eyes, the nostrils, the mouth and the face contour. The authors generated a stack-like ‘face bunch graph’, where ‘bunch’ meant a set of jets corresponding to the same node. The maximization of a similarity function between graphs, based on the direct comparison of the complex coefficients of the jets as well as the distance between the nodes, allowed the shape-driven adaptation of a graph to a new face image. This process had several steps, each one refining the previous one, including translation, change of scale, aspect ratio and, finally, local distortions of each node. The experiments with the FERET database yielded a 98% accuracy for frontal recognition and an 84% rate for profile views. Due to the great variability in size and pose, using half-profile images reduced the accuracy to 57%. EBGM introduced a significant speed-up over DLA and showed its robustness to in-depth rotation, although the size of the database and the design of the experiments have a great influence on the results and must be taken into account when comparing different authors.

2.1.2. Non-graph-based analytical methods

These methods select the nodes manually and then compute the Gabor responses at these locations. Other authors take advantage of the Gabor jets themselves to detect these landmarks.

Qin and He (2005) detected a set of 87 facial features manually in the FERET and ORL (AT&T, Samaria and Harter, 1994) databases. A circular 8-pixel neighbourhood was extracted around each node with the usual 5 × 8 filter bank. The extracted features were concatenated to form an augmented vector, which was then fed into a linear Support Vector Machine classifier (SVM, Vapnik, 1995) after a downsampling process.
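To make the notion of a Gabor jet concrete, the sketch below evaluates a bank of wavelets at a single node and compares two jets through the normalized dot product of their magnitudes, which is one common magnitude-only similarity. It is a minimal illustration under assumptions made here: the helper names are hypothetical, the kernels are square with odd size, random arrays stand in for a real filter bank, and the node lies far enough from the image border for the patch to fit.

```python
import numpy as np

def jet_at(image, kernels, row, col):
    """Complex responses of every kernel at a single node (row, col)."""
    coeffs = []
    for kernel in kernels:
        half = kernel.shape[0] // 2
        patch = image[row - half:row + half + 1, col - half:col + half + 1]
        # Convolution evaluated at one point: flip the kernel, multiply, sum.
        coeffs.append(np.sum(patch * kernel[::-1, ::-1]))
    return np.array(coeffs)

def magnitude_similarity(jet_a, jet_b):
    """Normalized dot product of the jet magnitudes (phase is discarded)."""
    a, b = np.abs(jet_a), np.abs(jet_b)
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

rng = np.random.default_rng(1)
# Random complex arrays standing in for a 5 x 8 Gabor filter bank.
kernels = [rng.standard_normal((9, 9)) + 1j * rng.standard_normal((9, 9))
           for _ in range(40)]
image = rng.random((64, 64))
jet_1 = jet_at(image, kernels, 20, 20)
jet_2 = jet_at(image, kernels, 40, 30)
print(jet_1.shape, magnitude_similarity(jet_1, jet_2))
```

Because the magnitudes are non-negative, this similarity lies between 0 and 1 and equals 1 when a jet is compared with itself. Measures that also use the phase, as in the complex-coefficient comparison of EBGM, would need an extra term and are not covered by this sketch.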
González-Jiménez and Alba-Castro (2007) extracted shape information from a face with a ridge and valley detector. Using a dense rectangular grid, each node was allowed to move to the nearest ridge or valley, where Gabor jets were computed. Face comparison was performed with a pointwise correspondence between both sketches, including information about distances and angles between nodes and the similarity of jets. Experiments performed with the AR database (Martinez and Benavente, 1998) showed good robustness to changes in face expression and illumination, while on the XM2VTS (Messer et al., 1999) and BANCA (Bailly-Bailliére et al., 2003) databases the method obtained a lower total error rate (TER) than EBGM.

Fig. 1. Outline of analytical methods.

Zou et al. (2007) compared PCA, Gabor wavelets and Local Binary Patterns (LBP) on the FERET and AR databases. Using a 37 × 37 window centred at each facial feature, the Gabor responses were sampled uniformly for a given spatial frequency, but the sampling rate was higher for high-frequency wavelets. Jets were compared with dot product and Borda count. Their results showed that Gabor wavelets provided very good results, just slightly worse than LBP, but with more robustness against illumination changes. They also concluded that the nose is the most discriminant feature, as it cannot be deformed as other features can. Unlike other authors, they also suggested keeping external regions, such as the hair, the cheeks or the face contour, and claimed that ‘local regions’ (grid-like subdivisions of the original image) carry more information than ‘local components’ (windows centred at facial features).

Jahanbin et al. (2008) combined Gabor jets from aligned 2D + 2.5D images from the ADIR database. After the manual selection of the 11 most prominent features in a 2D image, Gabor jets were computed both in the 2D and 2.5D images to create a model for each facial feature. Given a new image, each landmark was searched for around its average location. 2.5D range images achieved a 91% detection rate for an error distance of 6% relative to the interocular distance, while 2D images obtained 96%. Moreover, the fusion of 2D and 2.5D data improved the performance up to 98%.

Ilonen et al. (2008) made use of complex-valued Gaussian mixture models for facial feature detection. These models were computed on manually detected landmarks by means of an expectation maximization algorithm and then tested on a new image. The selected features were ordered according to their confidence levels. Experiments with the XM2VTS and BANCA databases yielded better results than LBP and steerable filters. Complex-valued Gaussian models also showed an important improvement in the detection rate compared to magnitude-based features.

2.1.3. Enhanced analytical methods

These methods discard the wavelets and/or nodes with the lowest discriminant power. Moreno et al. (2005) investigated the adequate Gabor parameters (orientation $\mu$, frequency $\nu$ and Gaussian envelope width $\sigma$) for feature detection on the AR database.
They used a 3D extended information diagram of the Gabor responses as a function of $\mu$, $\nu$ and $\sigma$, where each slice corresponded to the case where one of the three parameters was constant. For a certain facial feature, the authors sampled the orientation $\mu$ uniformly and computed the values of $\nu$ and $\sigma$ in every $\mu$-slice. This process was repeated for the other slices. The best recognition rates were obtained for the $\mu$-slice, using the Mahalanobis distance and a combination of the real and the imaginary parts of the Gabor convolution, instead of the usual Euclidean distance and magnitude.

Bianconi and Fernández (2007) surveyed the influence of Gabor parameters on texture classification, considering the maximum frequency, the number of frequencies, the frequency ratio, the number of orientations and the Gaussian envelope widths. Although this analysis was texture-oriented, it reached some interesting conclusions. For example, the most significant Gabor parameters were the frequency ratio (half-octaves better than octaves) and the Gaussian envelope widths, ruling out a potential influence of the number of frequencies or orientations on the recognition rate.

Yang et al. (2004) suggested using AdaBoost to select the optimal wavelets. Boosting looks for a set of weak classifiers which, in combination, behave as a strong classifier, where ‘weak’ and ‘strong’ refer to slightly-better-than-random-guess and good performance, respectively. The weight of each weak classifier (Gabor response) was updated so that the total recognition rate was maximized and the weights of the worst classifiers were increased. With this technique only a few hundred Gabor features were selected, instead of the tens of thousands typical of other methods. Similarly, Shan et al. (2005) and Gao et al. (2006) combined AdaBoost-extracted information with LDA coefficients. Zhou and Wei (2006) selected the 20 most discriminant Gabor features with AdaBoost for a verification problem with the XM2VTS database, obtaining low error rates, according to the authors. Shen and Bai (2006b) combined a mutual-information-based variant of AdaBoost with Generalized Discriminant Analysis. Most selected features were located around the eyes, the eyebrows, the nose and the chin, with preference for low frequencies and orientations parallel to facial features.

2.2. Holistic methods

These methods consider Gabor convolutions as a whole and therefore usually rely on adequate preprocessing, such as face alignment, size normalization and tilt correction. However, they suffer from a dimensionality problem. For instance, for an image of size 128 × 128 and a bank of 40 wavelets, the total number of features comes to 128 × 128 × 40 = 655,360. This augmented feature vector is usually downsampled and then processed directly with a dimensionality reduction algorithm, such as PCA or LDA (Fig. 2). Besides these ‘downsampled methods’, there are other strategies that perform no downsampling.

2.2.1. Downsampled Gabor PCA/LDA methods

Liu and Wechsler (2002) worked with 128 × 128 FERET images, whose Gabor features were downsampled by a factor of 64. The resulting 10,240 features were then reduced again with an enhanced variant of LDA. This ‘Gabor Fisher Classifier’ (GFC) clearly outperformed methods that combined downsampled Gabor features with eigenfaces (PCA) or fisherfaces (LDA). The highest recognition rates were achieved with the L1 distance, with an important deterioration for a downsampling factor of 256, so 64 was finally chosen.

Su et al. (2006) presented a hierarchical ensemble of Gabor Fisher Classifiers (HEGFC). Each image was broken down by means of a cascade representation, from a severely downsampled version of the image to its original size. Except for the top-level representation, LDA was applied in sub-images or modules after Gabor convolutions.
For the top level, which was very fast, the images in the database with the highest similarity were selected for the immediately lower level, and so on. With this coarse-to-fine approach, the recognition process was sped up. This method obtained a better performance, with a slight penalty in computational cost, than GFC, AdaBoosted GFC (Shan et al., 2005) and Local Gabor Binary Pattern Histogram Sequence (LGBPHS, Zhang et al., 2005, see below).

Cheung et al. (2004) proposed a low-dimensional representation of Gabor features with a three-frequency and six-orientation filter bank, keeping both the real and the imaginary parts. Their ‘aggregated 2D features’ were computed as the corresponding averages and standard deviations in each convolution. On the other hand, in contrast with previous works, Hen et al. (2007) claimed that their Downsampled Gabor + PCA + SVM method outperformed GFC on the BANCA dataset, although it ranked second on ORL.

2.2.2. Downsampled Gabor kernel PCA/LDA methods

Holistic kernel methods have also been applied in combination with Gabor wavelets. For example, Liu (2004) proposed fractional power polynomial kernels $k(x, y) = (x \cdot y)^d$ for his version of Gabor KPCA. As usual, Gabor features were concatenated to form an augmented vector and downsampled to a manageable size. KPCA was then applied with different values of the exponent $d$. The Mahalanobis distance with $d = 0.6$ provided the best results, with a 99.5% recognition rate using 246 features on the FERET database. Xie and Lam (2006) used a doubly nonlinear mapping in combination with Gabor features, including an additional nonlinear transformation to emphasize the Gabor features with the highest statistical importance. Experiments carried out with different databases showed an increase in recognition rate compared to Gabor KPCA (Liu, 2004) under varied conditions (illumination, facial expression, pose). Among LDA-based kernel methods we can cite Gabor Kernel Direct Discriminant Analysis (Shen and Bai, 2004), whose authors claimed to outperform Gabor KPCA and Gabor GDA using a Gaussian kernel on the FERET database.

Fig. 2. Outline of holistic methods.

2.2.3. Gabor 2D methods

Two-dimensional versions of PCA have the advantage of keeping structural information. Serrano et al. (2007a) combined B2DPCA with Gabor features and SVM in a multichannel non-downsampling-based approach (see below), providing very low equal error rates and a great dimension reduction. Jin and Ruan (2007) applied (2D)²PCA to Gabor responses to extract the most discriminant features. On the other hand, Wang et al. (2007) used 2DPCA and (2D)²PCA of the Gabor extracted features, with very good results in a face recognition problem.

2.2.4. Local Binary Patterns methods

Zhang et al. (2005) made use of Local Binary Patterns (LBP) of the Gabor responses to represent faces. After the usual convolutions with a Gabor filter bank, each pixel was compared with its eight neighbours, generating an 8-bit sequence depending on whether this central pixel was lower (‘1’) or higher (‘0’) than the surroundings, interpreted as a 0–255 grey level. After this transformation, each image was divided into a set of non-overlapping regions. The face representation was finally generated as the concatenation of the local histograms of all the regions for every Gabor channel (Local Gabor Binary Pattern Histogram Sequence, LGBPHS). In combination with LDA, the experiments with the FERET and AR databases showed that LGBPHS achieved better results than LDA, non-Gabor LBP or other algorithms. The most discriminant regions were found around the eyes, with preference for low frequencies.

He et al. (2006) proposed an LGBPHS-inspired method with a 1-frequency and 8-orientation filter bank.
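The LBP coding and the histogram sequence used by LGBPHS-style representations can be sketched for a single response map as follows. This is a minimal NumPy illustration under assumptions made here: the neighbour ordering, the convention that a bit is set when the neighbour is greater than or equal to the centre, the dropping of border pixels and the 2 × 2 grid of regions are choices for this example, and the function names are hypothetical. In an LGBPHS-like pipeline the input would be one Gabor magnitude map per channel, and the histograms of all channels would be concatenated.

```python
import numpy as np

# Eight neighbours, clockwise from the top-left corner: (row, column) offsets.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_codes(img):
    """8-bit LBP code of every interior pixel (border pixels are dropped)."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for bit, (dr, dc) in enumerate(OFFSETS):
        neighbour = img[1 + dr:h - 1 + dr, 1 + dc:w - 1 + dc]
        # Set this bit wherever the neighbour is not smaller than the centre.
        codes |= (neighbour >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes

def histogram_sequence(codes, grid=(2, 2), bins=256):
    """Concatenate the code histograms of non-overlapping regions."""
    hists = []
    for band in np.array_split(codes, grid[0], axis=0):
        for cell in np.array_split(band, grid[1], axis=1):
            hists.append(np.bincount(cell.ravel(), minlength=bins))
    return np.concatenate(hists)

tiny = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])
print(lbp_codes(tiny))                     # [[120]]

rng = np.random.default_rng(2)
response = rng.random((12, 12))            # stand-in for one Gabor magnitude map
features = histogram_sequence(lbp_codes(response))
print(features.shape)                      # (1024,)
```

In the small example the neighbours 6, 9, 8 and 7 are not smaller than the centre value 5, so bits 3 to 6 are set and the code is 8 + 16 + 32 + 64 = 120.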
For each pixel, a total of eight Gabor coefficients were computed, each of which was compared in a circularly-linked-list fashion, producing an 8-bit sequence. This ‘Gabor Binary Codes’ (GBC) representation consisted of an image of the same size as the original data, although a downsampling process was applied to reduce the dimensionality. With respect to other methods, GBC improved the recognition rate in several FERET experiments.

Zhang et al. (2007) also proposed the LBP-based Histogram of Gabor Phase Patterns (HGPP). They took into account the complex phase of the Gabor convolutions, a novel strategy at the time. The phase angle was transformed using a 2-bit quadrant coding for a 5 × 8 filter bank. The first of these bits was called ‘real’ and the other one ‘imaginary’. For a given wavelet frequency, the eight coefficients for the same location were considered as one byte and therefore interpreted as a 0–255 grey level. This global representation multiplied the dimensionality by ten (five frequencies × two bits). On the other hand, a fingerprint-like local texture representation was obtained using an XOR-operator-based LBP technique over the Gabor responses, which provided up to 80 new images (five frequencies × eight orientations × two bits). In this 90-image representation, local histograms were computed over regions of each image (similarly to LGBPHS). Weights could be assigned to each combination of local histogram, orientation/frequency, global/local and ‘real’/‘imaginary’ bits with an LDA-based method. For the FERET database the authors claimed to outperform previous methods, including GFC, EBGM, LDA and LGBPHS, their own previous work. Unlike LGBPHS, the weights were found to be more important for the eyes/nose/mouth region and for high-frequency wavelets.

2.2.5. Non-downsampling-based methods

Serrano et al. (2007a,b) presented a multichannel parallel approach for a face verification problem, where each Gabor convolution was processed independently with PCA/B2DPCA and SVM with no downsampling. The final decision was taken by fusing 40 SVM scores, one for each channel. The experiments with the XM2VTS and FRAV2D databases showed a severe reduction of the equal error rate and the dimensionality with respect to other ‘downsampled’ methods.

3. Competitions and evaluation protocols

Due to the inherent difficulties in comparing results obtained with different databases, several international competitions and protocols have been proposed to evaluate face recognition systems, such as FERET, FRVT, FRGC, XM2VTS and BANCA (Tables 2 and 3). We will summarize the role played by Gabor methods with respect to other algorithms.

Table 2. Summary of results of face recognition competitions. See text for full references.
Competition | Experiment | Algorithm | Recognition rate (%)
FERET | Frontal fa set/frontal fb set | PCA + LDA | 96
FERET | Frontal fa set/frontal fb set | EBGM | 95
FERET | Frontal fa set/frontal fc set | EBGM | 82
FERET | Frontal fa set/frontal fc set | PCA + LDA | 59
FRVT | Details on algorithms not public | |
FRGC | Details on algorithms not public | |
FRGC | Controlled/uncontrolled | Gabor + KFA | 78

Table 3. Summary of results of face verification competitions. See text for full references. (EER = equal error rate, WER = weighted error rate, TER = total error rate, HTER = half total error rate, FVR = face verification rate).
Competition | Experiment | Algorithm | Error rate (%)
FERET | Frontal fa set/frontal fb set | PCA + LDA | EER = 1
FERET | Frontal fa set/frontal fb set | EBGM | EER = 2
FRGC | Controlled/controlled | Gabor + KFA | FVR/FAR = 92/0.1
FRGC | Controlled/uncontrolled | Gabor + KFA | FVR/FAR = 76/0.1
XM2VTS (2000) | Configuration I | LDA-based | TER = 4.8
XM2VTS (2000) | Configuration I | EBGM-based | TER = 16.6
XM2VTS (2000) | Configuration II | LDA-based | TER = 2.2
XM2VTS (2000) | Configuration II | EBGM-based | TER = 15.0
XM2VTS (2003) | Configuration I | Shape trace | TER = 1.47
XM2VTS (2003) | Configuration I | Gabor local maxima | TER = 11.36
XM2VTS (2003) | Configuration II | LDA-based | TER = 0.75
XM2VTS (2003) | Configuration II | Gabor local maxima | TER = 7.72
BANCA | Part I | Gabor + LDA | WER = 1.39
BANCA | Part I | Gabor + SVM | WER = 3.33
BANCA | Part I | EBGM-based | WER = 7.77
BANCA | Part II | Gabor + LDA | WER = 2.21
BANCA | Part II | Gabor + SVM | WER = 4.58
BANCA | Part II | LDA + SVM | WER = 9.66
BANCA | Part III | Gabor + LDA | HTER = 13.47
BANCA | Part III | EBGM-based | HTER = 25.19
BANCA | Part III | LDA + Colour | HTER = 25.87

3.1. FERET

FERET was a project of the United States Department of Defense, through the Defense Advanced Research Projects Agency (DARPA, Phillips et al., 2000), to evaluate and foster what were at that time the state-of-the-art face recognition algorithms. The database contained 14,126 images of 1199 subjects, including frontal pictures, ‘duplicate’ (or alternative) frontal images, complete profiles, half-profiles and quarter-profiles. The project had two parts: one for face recognition and the other for face verification.

The University of Southern California (USC) entered the face recognition competition with EBGM (Wiskott et al., 1997) and ranked first above the other methods in all the experiments except for one, where this algorithm ranked second after a PCA + LDA-based method from the University of Maryland (UMD). In the face verification counterpart (Rizvi et al., 1998), EBGM also obtained the best EER (16%) in an experiment with frontal images for training and duplicate images for tests, in a tie with PCA + LDA. In a different experiment, with frontal images for training and a disjoint set of frontal images for tests, EBGM obtained the second best EER (2%), again after PCA + LDA.

3.2. FRVT

The Face Recognition Vendor Test (FRVT) was sponsored by DARPA and other US Government agencies as a way to apply the lessons learned in the FERET program to commercial face recognition systems. Three editions have been held so far: 2000 (Blackburn et al., 2001), 2002 (Phillips et al., 2003) and 2006 (Phillips et al., 2007). Unfortunately, not many details on the algorithms were made public, as some of the participants were private firms.

3.3. FRGC

The Face Recognition Grand Challenge (FRGC) (Phillips et al., 2005) was a preparatory competition organized by the US National Institute of Standards and Technology (NIST) in an attempt to improve by an order of magnitude on the best results of FRVT 2002. The database contained 50,000 images of 625 subjects, including high-resolution 2D images with controlled or uncontrolled lighting and 3D meshes with shape and texture information. Nineteen research groups (competing with a total of 63 algorithms) participated in the competition, although, for the sake of neutrality, no information on the groups or the algorithms used was reported.

Liu (2006) published the results of his Gabor + KFA algorithm independently. He trained his algorithm with images taken under controlled illumination and tested the influence of illumination in his experiments. The usual 5 × 8 Gabor filter bank was applied. This method obtained a 78% rank-one recognition rate, well above a baseline PCA (37%) and a PCA + LDA (48%), for a subset of the FRGC database with 366, 152 and 608 images for training, validation and test. Other experiments used only 6388 images for training, although the database protocol recommends 12,776, owing to computational memory difficulties. For the controlled/uncontrolled illumination experiment, the best ROC curve achieved a 76% verification rate for FAR = 0.1%, while the PCA baseline method could only obtain 12%. In contrast, the experiment with controlled illumination images for training and a disjoint set of controlled illumination images for tests obtained a 92% rate versus 66%. Despite these promising results, no comparison with his other method (GFC) was reported.

3.4. XM2VTS

XM2VTS (Messer et al., 1999) was acquired as a multimodal database for face verification with 295 subjects. It included ‘speaking shots’ and ‘head rotation shots’, from which eight frontal shots were extracted in four sessions. The ‘Lausanne protocol’ divided the database into three disjoint sets for training, validation and tests, and defined two configurations for image selection. The first edition of the XM2VTS competition was held during the International Conference on Pattern Recognition (ICPR) 2000 (Matas et al., 2000) with only four participants.
The IDIAP Institute from Switzerland competed with a modified version of EBGM, with three Gabor frequencies and six orientations. However, the best results were obtained with an LDA-based method from the University of Surrey (United Kingdom), with a TER of 4.8% (configuration I) and 2.2% (configuration II). IDIAP obtained only mediocre results, 16.6% and 15.0% for each configuration. The second edition was held during the Audio- and Video-Based Biometric Person Authentication (AVBPA) conference 2003 (Messer et al., 2003) with seven institutions, of which only Tübitak Bilten from Turkey used a Gabor-based method (one that selected the local maxima of Gabor convolutions). Unfortunately, this algorithm ranked in the last position, with a TER of 11.36% and 7.72% for each configuration. The best algorithm was again proposed by the University of Surrey and was based on a shape trace transform (TER = 1.47%).

3.5. BANCA

The BANCA project was aimed at developing realistic multimodal verification systems (Bailly-Bailliére et al., 2003). Using both low- and high-resolution equipment, video and multi-language speech data were acquired for a total of 208 subjects. The ‘BANCA Protocol’ established four sessions with controlled conditions, four with degraded quality (low-resolution web-cam) and four with adverse conditions (cluttered background, subject looking downwards). A special edition of the International Conference on Pattern Recognition tested 13 algorithms in a face verification competition with this database (Messer et al., 2004). The original dataset was extended with ‘sequestered data’, i.e., new images of subjects already present and images of newcomers. The competition was divided into three parts: a first one with pre-registered normalized images, a second one where automatic face localization and normalization was mandatory at least in the test phase, and a third one, which used the sequestered data for testing.

It is of great relevance that the best performance was obtained by three Gabor-based methods. In the first part of the competition, Tsinghua University ranked first with a Gabor + LDA mixture of holistic and component-based scores (averaged weighted error rate WER = 1.39%). It was followed by the University of Nottingham with a Gabor + SVM approach (3.33%) and the Institut für Neuroinformatik with an enhanced version of EBGM (7.77%). In the second part, Tsinghua University (2.21%) and the University of Nottingham (4.58%) again obtained the best results, followed by an LDA + SVM fusion approach by the Université Catholique de Louvain (9.66%). Finally, for the sequestered data part, Tsinghua University outperformed the other candidates with the lowest HTER (13.47%), with the Institut für Neuroinformatik in second place (25.19%), followed by the University of Surrey with LDA applied in three different colour spaces (25.87%). Messer et al. explained this worsening of performance for sequestered data as the joint effect of ageing, overtuning and the use of incorrect thresholds.
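The verification measures quoted in this section can be related to one another with a small sketch. It assumes the usual definitions, in which a claim is accepted when its score reaches a threshold, FAR is the fraction of impostor claims accepted, FRR is the fraction of genuine claims rejected, TER = FAR + FRR and HTER = TER/2; the EER is approximated here by scanning the observed scores for the threshold where FAR and FRR are closest. The function names and the toy scores are hypothetical and are not taken from any of the competitions.

```python
import numpy as np

def far_frr(genuine, impostor, threshold):
    """Error rates when a claim is accepted if its score >= threshold."""
    far = float(np.mean(np.asarray(impostor) >= threshold))  # false acceptances
    frr = float(np.mean(np.asarray(genuine) < threshold))    # false rejections
    return far, frr

def total_error_rate(far, frr):
    return far + frr

def half_total_error_rate(far, frr):
    return (far + frr) / 2

def equal_error_rate(genuine, impostor):
    """Approximate EER: pick the observed score with the smallest |FAR - FRR|."""
    thresholds = np.unique(np.concatenate([genuine, impostor]))
    rates = [far_frr(genuine, impostor, t) for t in thresholds]
    far, frr = min(rates, key=lambda r: abs(r[0] - r[1]))
    return (far + frr) / 2

genuine = np.array([0.9, 0.8, 0.7, 0.4])    # toy scores of true claims
impostor = np.array([0.6, 0.3, 0.2, 0.1])   # toy scores of false claims
far, frr = far_frr(genuine, impostor, threshold=0.5)
print(far, frr)                               # 0.25 0.25
print(total_error_rate(far, frr))             # 0.5
print(half_total_error_rate(far, frr))        # 0.25
print(equal_error_rate(genuine, impostor))    # 0.25
```

In an evaluation protocol with separate validation and test sets, the threshold would normally be fixed on the validation data and the error rates reported on the test data; the sketch uses a single set only to keep the example short.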

Table 4. Selection of Gabor filter banks found in the literature.
Frequencies × orientations | Maximum frequency | Frequency ratio | Authors
5 × 8 | 0.25 | √2 | Lades et al. (1993)
3 × 8 | 0.25 | √2 | Hen et al. (2007)
8 × 8 | 0.25 | √2 | Wang et al. (2007)
6 × 4 | 1/40 | √3 | Ilonen et al. (2008)
5 × 8 | 1/4π | 2 | Jahanbin et al. (2008)

4. Discussion We have shown in the previous section that there has been a really outburst of Gabor methods in the last few years. This had led to a great variety of algorithms and strategies, including as well very different Gabor ﬁlter banks (Table 4). This makes their comparison really difﬁcult and it is not always easy to elucidate which methods do provide smart solutions and reliable results for the face recognition problem. Inspired by this complex situation, we have performed an exhaustive analysis of the methods mentioned in this paper. Each method has been evaluated thoroughly by means of a series of quantiﬁable parameters (explained in Section 4.1). This has allowed us to order the algorithms in a ranking as a function of their overall quality (Section 4.2). Our discussion will be completed with several comments about the complexity and computational load of these Gabor methods (Section 4.3). 4.1. Evaluation of the methods Conscious of the complicated task to compare such varied methods, the papers mentioned in this review have been ordered in a ranking (Table 5) as a function of ﬁve easily quantiﬁable factors, which provide information about the goodness and quality of the algorithm proposed: the number of face databases used, the amount of subjects selected for training, the comparison of the results with other works, the detailed efforts made to reduce the complexity of the algorithm and the success rate. Each aspect has been evaluated with 1 (worst) to 5 (best) stars, as speciﬁed in the following section. The last column in Table 5 shows the sum of all the stars obtained in each aspect. The methods with the same number of stars are presented in alphabetical order. 4.1.1. Number of face databases used With respect to the number of face databases used to test an algorithm, those papers that have used four or more have been

Table 5
Ranking of the methods discussed in this paper. See text for an explanation of stars. R = recognition, V = verification, FD = feature detection, A = analytical, H = holistic.

                                                Database(s)  Subjects      Comparisons  Complexity  Success  Total
Authors                                  Type   used         for training  with others  details     rate     stars
Shen and Bai (2006b)                     R/A    ★★           ★★★           ★★★★         ★★★★        ★★★★★    18
Zhang et al. (2007)                      R/H    ★★★          ★★★★★         ★★           ★★★         ★★★★★    18
Zou et al. (2007)                        R/A    ★★★          ★★★★★         ★★★★         ★           ★★★★★    18
Ilonen et al. (2008)                     FD/A   ★★★          ★★★★★         ★★★          ★★★★★       ★        17
Liu (2004)                               R/H    ★★★          ★★★           ★★★★★        ★           ★★★★★    17
Xie and Lam (2006)                       R/H    ★★★★★        ★★★           ★★★          ★           ★★★★★    17
Liu and Wechsler (2002)                  R/H    ★★           ★★★           ★★★★         ★★          ★★★★★    16
Shan et al. (2005)                       R/A    ★★★          ★★★           ★★★          ★★          ★★★★★    16
Su et al. (2006)                         R/H    ★★           ★             ★★★★         ★★★★        ★★★★★    16
Liu (2006)                               R/V/H  ★★★          ★★★           ★★★★★        ★★★         ★        15
Serrano et al. (2007b)                   V/H    ★★★          ★★★           ★★★          ★           ★★★★★    15
Wang et al. (2007)                       R/H    ★★★          ★             ★★★★         ★★          ★★★★★    15
Cheung et al. (2004)                     R/H    ★★           ★★            ★★★          ★★          ★★★★★    14
Gao et al. (2006)                        R/A    ★★           ★★★           ★★★          ★★★         ★★★      14
González-Jiménez and Alba-Castro (2007)  R/V/A  ★★★★         ★★★           ★★★          ★★          ★        13
Hen et al. (2007)                        V/H    ★★★★         ★★            ★★★          ★★          ★★       13
Serrano et al. (2007a)                   V/H    ★            ★★★           ★★★          ★           ★★★★★    13
Zhang et al. (2005)                      R/H    ★★★          ★★★           ★★           ★           ★★★★     13
Jin and Ruan (2007)                      R/H    ★★           ★             ★★★          ★           ★★★★★    12
Qin and He (2005)                        R/A    ★★           ★             ★★★          ★           ★★★★★    12
Zhou and Wei (2006)                      V/A    ★★           ★★★           ★            ★★★★★       ★        12
Bianconi and Fernández (2007)            R/A    ★            ★★★           ★            ★★★         ★★★      11
Shen and Bai (2004)                      R/H    ★★           ★★★           ★★★★         ★           ★        11
Yang et al. (2004)                       R/A    ★★           ★★★           ★★           ★★          ★        10
Jahanbin et al. (2008)                   FD/A   ★★           ★             ★            ★           ★★★★     9
He et al. (2006)                         R/H    ★★★          ★             ★★           ★           ★        8
Moreno et al. (2005)                     FD/A   ★★           ★★            ★            ★           ★        7


awarded five stars (Xie and Lam, 2006). Those with three databases have received four stars (González-Jiménez and Alba-Castro, 2007; Hen et al., 2007). However, it is more usual in the literature to consider only two databases, mainly FERET (Phillips et al., 2000), ORL (Samaria and Harter, 1994) and/or AR (Martinez and Benavente, 1998), which is worth three stars. Methods tested with only one well-known face database, or with a face database that is not yet well known, have received two stars or one star, respectively. The more databases are considered, the better the generalization the algorithm achieves in the training process.

4.1.2. Amount of subjects selected for training

According to the number of different subjects selected for the experiments, papers considering at most 50 people have received one star, and those using up to 100 subjects have obtained two stars. Algorithms tested with up to 500 or 1000 subjects have been given three or four stars, respectively. Finally, papers with more than 1000 subjects in their experiments have received five stars. Considering more images of different people adds difficulty to an experiment, as the computational load usually rises quickly and the algorithm may become saturated and overfitted, producing declining results. This factor may seem correlated with the previous one (databases used). However, authors using the same database do not always use the same number of subjects for training their algorithm, especially if the database contains several partitions or sets with different kinds of images. For example, regarding the FERET database, Zhang et al. (2007), Zou et al. (2007) and Ilonen et al. (2008) considered images from 1010 subjects (the so-called fa set), while Liu (2004) and Shen and Bai (2006b) selected only 200 subjects (corresponding to a different set, the so-called ba).

4.1.3. Comparisons with other works

As a part of the scientific method, it is always healthy to compare one's results with other works. Liu's papers (Liu, 2004; Liu, 2006) stand out from the rest, as they present detailed results obtained by implementing the methods of many other authors (five stars). Algorithms that have been compared with at least three different methods have been awarded four stars. Only three stars have been given if the authors implemented up to two algorithms by other authors. If the authors only cited the results obtained by others without reproducing them, they have been awarded two stars. If no comparison was reported, the paper has been given one star.

4.1.4. Detailed efforts made to reduce the complexity

We have not classified the papers by the complexity of the algorithm itself, as many authors avoid giving details about this (see Section 4.3). Five stars have been granted to those papers that provide extensive details about the complexity of their algorithm and the efforts made to reduce its computational load. At the opposite end, only one star has been given if there are no details about complexity at all. For example, Ilonen et al. (2008) took advantage of the redundancy in Gabor coefficients for the convolution computations. Zhou and Wei (2006), in turn, selected the 20 most significant Gabor features and the classifiers with a lower error rate with an AdaBoost-based technique. In both cases, the execution time was greatly reduced.

4.1.5. Success rate

Another aspect that has been evaluated is the recognition rate reported (for recognition-based algorithms) or the error rate (for verification-based methods). When authors have performed several experiments with different configurations (changing the type of images for training or testing the classifiers, or using different databases), only the best result has been considered in Table 5. For a recognition rate lower than 96% or an error rate greater than 4%, the papers have received only one star. One star has been added for every percentage point of improvement, so that the recognition algorithms that achieved a rate greater than or equal to 99% have received five stars. The same number of stars has been given to verification methods with an error rate less than or equal to 1%. As can be seen, more than half of the papers (14) claimed to obtain five-star results in at least one experiment. Tables 6–8 summarize the results of some representative Gabor methods. A direct comparison should be made with caution, as the papers use different databases and a great variety of experimental designs.

4.2. Critical analysis

A total of 29 Gabor methods from the recent literature have been analysed, of which 16 are holistic and 13 are analytical. The great bulk of them corresponded to face recognition (20),

Table 6
Some representative analytical Gabor face recognition methods.

Authors                 Database   Recognition rate (%)
Lades et al. (1993)     Bochum     88
Wiskott et al. (1997)   FERET      98
                        Bochum     94
Yang et al. (2004)      FERET      95.2
Qin and He (2005)       ORL        100
Gao et al. (2006)       FERET      97.9

Table 7
Some representative holistic Gabor face recognition methods.

Authors                   Database   Recognition rate (%)
Liu and Wechsler (2002)   FERET      100
Cheung et al. (2004)      AR         88.7
Shen and Bai (2004)       FERET      94
Zhang et al. (2005)       AR         98
                          FERET      98
He et al. (2006)          FERET      95
                          CAS-PEAL   95
Qing et al. (2006)        CMU PIE    100
                          YaleB      100
Liu (2006)                FRGC       78
Su et al. (2006)          FERET      99
Xie and Lam (2006)        Yale       100
                          AR         98.9
                          ORL        89.4
                          YaleB      100
Jin and Ruan (2007)       ORL        99.4
Wang et al. (2007)        ORL        100
                          Yale       98.9
Zhang et al. (2007)       FERET      99.5
                          CAS-PEAL   99.8

Table 8
Some representative analytical/holistic Gabor face verification methods (EER = equal error rate, FAR = false acceptance rate, FRR = false rejection rate, WER = weighted error rate, TER = total error rate, HTER = half total error rate, FVR = face verification rate).

Authors                                   Database   Error rate (%)
Messer et al. (2004)                      BANCA      WER = 1.4
Liu (2006)                                FRGC       FVR/FAR = 92/0.1
Zhou and Wei (2006)                       XM2VTS     FAR/FRR = 0/22.9
González-Jiménez and Alba-Castro (2007)   XM2VTS     TER = 4.3
                                          BANCA      WER = 4.9
Hen et al. (2007)                         ORL        EER = 4.0
                                          BANCA      HTER = 5.6
Serrano et al. (2007b)                    FRAV2D     EER = 0.0
                                          XM2VTS     EER = 1.2


while six were devoted to face verification and three were applied to facial feature detection. Two papers (González-Jiménez and Alba-Castro, 2007; Liu, 2006) were counted twice, as they were tested for both recognition and verification tasks. There is thus a global trend toward recognition rather than verification methods.

Fourteen methods were worth five stars according to their reported success rate (12 are recognition-based and two verification-based). It is interesting to point out that 10 of them used a holistic approach, while the remaining four are analytical. Of the 12 recognition methods, eight were holistic. This suggests that holistic face recognition algorithms are preferred by authors in the literature because they tend to obtain better success rates than the other methods. This contrasts clearly with previous Gabor works, which were strongly influenced by the analytical approach of Lades et al. (1993) and Wiskott et al. (1997). Holistic algorithms rank higher than their analytical counterparts because they do not rely only on the Gabor coefficients computed on a limited number of facial landmarks (as analytical methods do), but also extract relevant information from the global distribution of the facial structure.

Several methods were reported to achieve a 100% success rate (Liu and Wechsler, 2002; Qin and He, 2005; Qing et al., 2006; Xie and Lam, 2006; Serrano et al., 2007b; Wang et al., 2007). However, Hen et al. (2007) complained that GFC (Liu and Wechsler, 2002) was not as good as expected (sometimes even worse than PCA), probably because it was 'highly sensitive to the underlying data quality'. It should be noted that Liu and Wechsler used FERET, while Hen et al. experimented with the degraded and adverse sets of BANCA. Even if both groups of authors had used the same database, performance could vary greatly due to a different number of images used for training the classifier. Usually, the more images used for training, the better the performance (Jin and Ruan, 2007).

With a potential maximum of 25 stars, three methods obtained the top position with 18 stars. First, Shen and Bai (2006b) stand out due to their combination of an AdaBoost-based feature selection method (MutualBoost) with a kernel modification of LDA (Generalized Discriminant Analysis or GDA). Their algorithm achieved a great dimensionality reduction and allowed the correct recognition of 200 images from the FERET database in only 4 s. Comparisons with other methods such as PCA, Bayesian PCA and AdaBoost combined with Gabor GDA, among others, confirmed the goodness of their approach. Second, the HGPP method by Zhang et al. (2007) constitutes a very fresh strategy for the so far little explored facet of complex phase Gabor representations. HGPP was tested with the whole FERET database (1196 subjects) and outperformed other Gabor methods such as LGBPHS, GFC, LDA or EBGM. However, despite the good performance obtained, the Gabor representation looked quite counter-intuitive. Third, Zou et al. (2007) applied a variable downsampling for the extraction of Gabor coefficients around facial features and compared this strategy with a window-based extraction. Their method was tested with the FERET (1196 subjects) and AR (135) databases and was reported to obtain a 99.5% recognition rate, outperforming PCA, LBP, EBGM or LGBPHS. Its main drawback was that facial features were still detected manually.

Three methods ranked in second place with 17 stars. The first one was presented by Ilonen et al. (2008) and corresponds to a facial feature detection method based on complex-valued Gaussian mixture models. This method, which was also applied successfully to the detection of non-facial features, is remarkable due to its great novelty and promising results. As an important drawback, Ilonen et al. performed a manual detection of facial features. The other two algorithms are kernel-based. On the one hand, Liu (2004) used fractional power polynomial kernels jointly with a downsampled Gabor PCA. His method showed great robustness after a meticulous comparison with other algorithms, such as PCA, Kernel PCA, fractional power polynomial Kernel PCA and Gabor PCA. On the other hand, Xie and Lam (2006) applied a doubly nonlinear mapping to their Gabor kernel method. This method gains credibility from having been tested with four face databases (Yale, AR, ORL and YaleB), with recognition rates around 99% or higher (except for ORL). In general, kernel methods used in combination with Gabor features for face recognition are gaining relevance in the literature due to their great robustness and good performance.

Finally, in the third position of the ranking with 16 stars, there are also three methods. GFC by Liu and Wechsler (2002) is now a paradigm of holistic algorithms, as its fusion method (concatenating the Gabor responses in a mosaic style and then downsampling in both directions) set a precedent and is now used by many authors in the literature. Shan et al. (2005) combined a GFC-based method with an AdaBoost approach for the FERET and CAS-PEAL databases. Similarly to Shen and Bai (2006b), these authors computed the most significant frequencies and orientations of Gabor wavelets for face recognition. Su et al. (2006) proposed a cascade approach with their hierarchical ensemble of GFC (HEGFC), which obtained better results than the standard GFC, AdaBoost-based GFC, LBP and LGBPHS methods.
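The quantifiable part of the scoring scheme of Section 4.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the intermediate thresholds for the success rate are assumed to fall on whole percentage points (the text fixes only the one-star and five-star ends and says that one star is added per point), and the two qualitative factors (comparisons and complexity details) are passed in as already assigned stars.

```python
def database_stars(n_databases: int, well_known: bool = True) -> int:
    """Stars for the number of face databases used (Section 4.1.1)."""
    if n_databases >= 4:
        return 5
    if n_databases == 3:
        return 4
    if n_databases == 2:
        return 3
    # A single database: two stars if it is well known, one star otherwise.
    return 2 if well_known else 1


def subject_stars(n_subjects: int) -> int:
    """Stars for the number of subjects in the experiments (Section 4.1.2)."""
    for stars, upper_limit in enumerate((50, 100, 500, 1000), start=1):
        if n_subjects <= upper_limit:
            return stars
    return 5  # more than 1000 subjects


def success_stars(value_percent: float, verification: bool = False) -> int:
    """Stars for the success rate (Section 4.1.5).

    Recognition: a rate below 96% gives one star, 99% or more gives five.
    Verification: an error above 4% gives one star, 1% or less gives five.
    The steps in between are an assumption (one star per percentage point).
    """
    if verification:
        passed = sum(value_percent <= t for t in (4.0, 3.0, 2.0, 1.0))
    else:
        passed = sum(value_percent >= t for t in (96.0, 97.0, 98.0, 99.0))
    return 1 + passed


def total_stars(databases: int, subjects: int, comparisons: int,
                complexity: int, success: int) -> int:
    """Sum of the five per-factor scores (maximum 25)."""
    return databases + subjects + comparisons + complexity + success


# Subject counts quoted in Section 4.1.2 for two FERET partitions.
print(subject_stars(200), subject_stars(1010))
```

With these rules, the 200-subject and 1010-subject FERET partitions mentioned in Section 4.1.2 map to three and five stars, respectively.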

4.3. Performance and complexity

An open question regarding Gabor methods is their computational load. Few papers in the literature report the complexity of their algorithms or their execution times. In Section 4.1.4 we already considered the efforts to reduce this problem as a way to evaluate the goodness of an algorithm. Here we examine the matter in more depth.

Despite the great increase in recognition rate, Gabor holistic methods suffer from a huge computational load, as the dimensionality is multiplied by the number of wavelets in the filter bank. Moreover, the convolution with a Gabor filter is itself quite slow, so it is usual to resort to the Fast Fourier Transform (FFT), which requires around $O(n^2 \log n)$ operations for an $n \times n$ image, via the convolution theorem $f * g = \mathcal{F}^{-1}[\mathcal{F}(f)\,\mathcal{F}(g)]$. Bearing in mind that a convolution requires two FFTs and an inverse FFT, and that filter banks usually define 40 convolutions, it is easy to understand why small images (128 × 128) are preferred to big ones and why analytical methods have attracted major interest since the very first works by Lades et al. (1993).

Apart from the computation of the Gabor convolutions, there are other time-consuming operations that need to be performed as well. Some of them correspond to the need to preprocess the images (face/feature detection, face size normalization, eye distance normalization, histogram equalization, etc.). Others depend on the chosen algorithm itself. For example, graph-based methods usually carry out an automatic matching between graphs, which adds a significant overhead. However, as computers are getting faster and faster, holistic methods have taken the baton, although they still need to perform a downsampling process to reduce the storage load and speed up the recognition process.
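The convolution theorem invoked in this section can be illustrated with a minimal sketch. It is written in one dimension for brevity and uses a textbook recursive radix-2 FFT built only on the Python standard library; the function names are illustrative, the signal and kernel are toy values rather than a face image and a Gabor wavelet, and none of this code comes from the reviewed papers. Note that multiplying the spectra yields a circular convolution, so a linear convolution requires zero-padding, and a two-dimensional image would be transformed along its rows and then along its columns.

```python
import cmath


def fft(values, inverse=False):
    """Unnormalised recursive radix-2 FFT; the length must be a power of two."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def circular_convolve_fft(f, g):
    """Circular convolution through the convolution theorem."""
    if len(f) != len(g):
        raise ValueError("both inputs must have the same length")
    n = len(f)
    # Two forward transforms, one pointwise product, one inverse transform.
    spectrum = [a * b for a, b in zip(fft(f), fft(g))]
    return [value / n for value in fft(spectrum, inverse=True)]


def circular_convolve_direct(f, g):
    """Reference implementation with a quadratic number of operations."""
    n = len(f)
    return [sum(f[m] * g[(k - m) % n] for m in range(n)) for k in range(n)]


signal = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
kernel = [1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # toy difference filter
fast = circular_convolve_fft(signal, kernel)
slow = circular_convolve_direct(signal, kernel)
max_error = max(abs(a - b) for a, b in zip(fast, slow))
print(max_error < 1e-9)
```

With the figures quoted in the text, a 128 × 128 image filtered with 40 wavelets yields 128 × 128 × 40 = 655,360 complex coefficients before any downsampling. Since the spectrum of the image is the same for every filter and the filter spectra can be computed in advance, an implementation may need fewer than three transforms per filter at run time; whether the reviewed systems exploit this is not stated.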
With the introduction of boosting techniques, the selection of the optimal Gabor frequencies and orientations, as well as of the most discriminant facial features, has led to a further increase in performance.

Although Lades et al. (1993) used a parallel machine with 23 transputers with a MIMD architecture, they needed up to 7 s to compute the convolution of a 128 × 128 image with a family of 40 wavelets and about 25 s to compare a graph with the whole database of 87 subjects. Wiskott et al. (1997) reported that they could compute a face graph in 30 s (this was done once) and then compare 300 graphs per second.


Liu and Wechsler (2002) did not report time performance for their GFC method, although Su et al. (2006) succeeded in improving the recognition rate of GFC with a slight penalty in time for their coarse-to-fine cascade HEGFC algorithm. In contrast, when Liu (2006) presented his Gabor-based Kernel Fisher Analysis (KFA) method, he reported needing 5 h 20 min of CPU time on a 3.2-GHz single processor to complete an experiment, including face cropping of 36,818 images, training with 6388 images, feature extraction and similarity computation for the FRGC database (Phillips et al., 2005). Other authors have acknowledged that they were not concerned about speed, but intend to tackle this problem in the future with a dimension reduction technique (Zhang et al., 2007).

AdaBoost is known for its long training time. For instance, Gao et al. (2006) reported that their 3.2-GHz Pentium 4 with 1.2 GB of RAM needed two full days of computation time to train one of the six subspaces of their AdaBoost random subspace algorithm. Zhou and Wei (2006) managed to reduce the training time by up to 54% by ruling out those weak classifiers with a performance worse than random guessing.

In general, it does not matter much if the training stage of a biometric system takes a long time, as it is usually performed offline. However, a real-time implementation of a face recognition system should perform the Gabor convolutions for a new image and the comparison with the face models stored in the database as fast as possible. A verification system has the advantage that the user claims an identity, so only one comparison has to be carried out.
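The difference in matching cost between recognition (1:N) and verification (1:1) can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the short vectors stand in for Gabor feature vectors, cosine similarity and the 0.9 threshold are arbitrary choices, and the function and identity names are hypothetical rather than taken from any reviewed system.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(probe, gallery):
    """Recognition (1:N): the probe is compared with every enrolled template."""
    scores = {name: cosine_similarity(probe, template)
              for name, template in gallery.items()}
    best = max(scores, key=scores.get)
    return best, len(scores)  # best match and number of comparisons


def verify(probe, gallery, claimed_identity, threshold=0.9):
    """Verification (1:1): only the claimed identity's template is compared."""
    score = cosine_similarity(probe, gallery[claimed_identity])
    return score >= threshold, 1  # decision and number of comparisons


# Toy gallery with three enrolled subjects (hypothetical data).
gallery = {
    "alice": [1.0, 0.0, 0.0],
    "bob": [0.0, 1.0, 0.0],
    "carol": [0.0, 0.0, 1.0],
}
probe = [0.9, 0.1, 0.0]

print(identify(probe, gallery))
print(verify(probe, gallery, "alice"))
print(verify(probe, gallery, "bob"))
```

The number of comparisons grows with the size of the gallery for identification, whereas it stays at one for verification regardless of how many subjects are enrolled.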

5. Conclusions

In this paper we have made a thorough review of the most relevant Gabor algorithms of the last few years. In order to tackle the difficult task of comparing such a diverse set of methods, five quantifiable factors have been considered, namely: the number of face databases used, the number of different subjects selected for the experiments, the number of comparisons with other methods, the efforts made to reduce the computational load and, finally, the recognition or verification rate. With these factors, a ranking of Gabor methods has been presented as a function of their goodness and quality.

A total of 29 Gabor methods from the recent literature have been considered. A great majority (20) corresponded to face recognition, six were devoted to face verification and the other three to feature detection. This tendency toward face recognition to the detriment of face verification occurs not only for Gabor methods, but also for face biometrics in general.

In our ranking of Gabor methods, three algorithms share the top position: an AdaBoost-based kernel version of LDA (Shen and Bai, 2006b), a Histogram of Gabor Phase Patterns (HGPP) method (Zhang et al., 2007) and an adaptive Gabor downsampling around face features (Zou et al., 2007). All of them correspond to face recognition. Two of them are analytical and the other one is holistic.

Of the 29 methods analysed, more than half (16) used a holistic rather than an analytical (13) approach. This trend toward holistic methods can be explained by their ability to achieve higher recognition rates than their analytical counterparts, which is due to the fact that they extract facial information globally, instead of relying on a specific set of landmarks or face features. In particular, 10 holistic methods were reported to achieve a recognition rate of 99% or higher or a verification error rate of 1% or lower, with only four analytical methods in the same situation.

One of the disadvantages of holistic methods is their heavy computational load, due to the complete convolution of a face image with a Gabor filter bank and to their longer extracted feature vectors. This explains why analytical approaches received much attention after the pioneering works of Lades et al. (1993) and Wiskott et al. (1997) in the 1990s. Several authors have made some efforts to lessen the complexity of their algorithms, although most of them do not give details about this topic. The computational load remains an open question.

Recent innovations that will most probably have an important impact on future works include: the consideration of the complex phase component of Gabor convolutions (Zhang et al., 2007; Ilonen et al., 2008); the combination of holistic Gabor algorithms, kernel methods and dimensionality reduction techniques (Liu and Wechsler, 2002; Liu, 2004; Su et al., 2006; Xie and Lam, 2006); and the optimal selection of Gabor parameters via boosting techniques for analytical Gabor methods (Shan et al., 2005; Shen and Bai, 2006b).

Gabor methods outperformed non-Gabor algorithms in at least two international competitions, FERET (Phillips et al., 2000) and BANCA (Bailly-Bailliére et al., 2003). An analytical algorithm (EBGM) obtained the best results (jointly with a non-Gabor method) in both the FERET recognition competition (Phillips et al., 2000) and the corresponding verification counterpart (Rizvi et al., 1998). Regarding the BANCA verification competition, the top three methods were also Gabor-based (Messer et al., 2004): first, a combination of holistic Gabor coefficients, a dimensionality reduction process (LDA) and a nearest neighbour classifier; second, a combination of Gabor wavelets, a dimensionality reduction process (not specified) and an SVM classifier; finally, an enhanced version of EBGM (analytical method).

Looking back, Gabor-based face recognition has come a long way, as this paper shows.
However, there are still many challenges to solve (for example, tackling non-controlled environments, using multiracial face databases and neutralizing ageing effects on system performance, to cite only a few). We are confident that Gabor methods will play an important role in the attainment of the optimal solution to these topics and will pave the way to new advances in face biometrics in the near future.

Acknowledgments

This research has been carried out with the financial support of the Universidad Rey Juan Carlos. The authors would also like to thank Dr. Bai Li, Dr. Linlin Shen and Dr. Ian Dryden, from the School of Computer Science and IT of the University of Nottingham (United Kingdom), for their interesting discussions on Gabor topics.

References

Bailly-Bailliére, E., Bengio, S., Bimbot, F., Hamouz, M., Kittler, J., Mariéthoz, J., Matas, J., Messer, K., Popovici, V., Porée, F., Ruiz, B., Thiran, J.-P., 2003. The BANCA database and evaluation protocol. In: Kittler, J., Nixon, M.S. (Eds.), Proc. of the Int. Conf. Audio- and Video-Based Biometric Person Authentication, LNCS, vol. 2688. Springer-Verlag, pp. 625–638.
Belhumeur, P.N., Hespanha, J.P., Kriegman, D.J., 1997. Eigenfaces vs. fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Machine Intell. 19 (7), 711–720.
Bianconi, F., Fernández, A., 2007. Evaluation of the effects of Gabor filter parameters on texture classification. Pattern Recognition 40 (12), 3325–3335.
Blackburn, D.M., Bone, M., Phillips, P.J., 2001. Facial Recognition Vendor Test 2000 Evaluation Report, Technical Report.
Cheung, K.-H., You, J., Kong, W.-K., Zhang, D., 2004. A study of aggregated 2D Gabor features on appearance-based face recognition. In: Proc. ICIG, pp. 310–313.
Daugman, J.G., 1980. Two-dimensional spectral analysis of cortical receptive field profiles. Vision Research 20 (10), 847–856.
Gabor, D., 1946. Theory of communication. J. Inst. Elect. Eng. London 93 (III/26), 429–457.
Gao, Y., Wang, Y., Feng, X., Zhou, X., 2006. Boosting Gabor feature classifier for face recognition using random subspace. In: Proc. IEEE ICASSP, vol. 2, p. II-II.
González-Jiménez, D., Alba-Castro, J.L., 2007. Shape-driven Gabor jets for face description and authentication. IEEE Trans. Information Forensics and Security 2 (4), 769–780.
He, L., Hu, D., Jiang, C., 2006. Gabor binary codes for face recognition. In: Proc. Mexican Int. Conf. Artificial Intell., pp. 53–60.
Hen, Y.W., Khalid, M., Yusof, R., 2007. Face verification with Gabor representation and support vector machines. In: Proc. First Asia Int. Conf. Modelling and Simulation, pp. 451–459.
Ilonen, J., Kamarainen, J.-K., Paalanen, P., Hamouz, M., Kittler, J., Kälviäinen, H., 2008. Image feature localization by multiple hypothesis testing of Gabor features. IEEE Trans. Image Process. 17 (3), 311–325.
Jahanbin, S., Bovik, A.C., Choi, H., 2008. Automated facial feature detection from portrait and range images. In: Proc. IEEE Southwest Symp. Image Anal. Interpretation, pp. 25–28.
Jin, Y., Ruan, Q.-Q., 2007. Gabor-based improved locality preserving projections for face recognition. In: Proc. of IEEE ICIP, vol. 1, pp. 153–156.
Lades, M., Vorbrüggen, J.C., Buhmann, J., Lange, J., von der Malsburg, C., Würtz, R.P., Konen, W., 1993. Distortion invariant object recognition in the dynamic link architecture. IEEE Trans. Comput. 42 (3), 300–311.
Liu, C., 2004. Gabor-based kernel PCA with fractional power polynomial models for face recognition. IEEE Trans. Pattern Anal. Machine Intell. 26 (5), 572–581.
Liu, C., 2006. Capitalize on dimensionality increasing techniques for improving face recognition grand challenge performance. IEEE Trans. Pattern Anal. Machine Intell. 28 (5), 725–737.
Liu, C., Wechsler, H., 2002. Gabor feature based classification using the enhanced fisher linear discriminant model for face recognition. IEEE Trans. Image Process. 11 (4), 467–476.
Martinez, A.M., Benavente, R., 1998. The AR Face Database. CVC Technical Report #24.
Matas, J., Hamouz, M., Jonsson, K., Kittler, J., Li, Y., Kotropoulos, C., Tefas, A., Pitas, I., Tan, T., Yan, H., Smeraldi, F., Capdevielle, N., Gerstner, W., Abdeljaoued, Y., Bigun, J., Ben-Yacoub, S., Mayoraz, E., 2000. Comparison of face verification results on the XM2VTS database. In: Proc. IAPR ICPR, vol. 4, pp. 4858–4863.
Messer, K., Matas, J., Kittler, J., Luettin, J., Maitre, G., 1999. XM2VTSDB: the extended M2VTS database. In: Proc. Int. Conf. Audio and Video-based Biometric Person Authentication, pp. 72–77.
Messer, K., Kittler, J., Sadeghi, M., Marcel, S., Marcel, C., Bengio, S., Cardinaux, F., Czyz, J., Srisuk, S., Petrou, M., Kurutach, W., Kadyrov, E., Kepenekci, B., Tek, F.B., Akar, G.B., Deravi, F., 2003. Face verification competition on the XM2VTS database. In: Proc. Int. Conf. Audio and Video Based Biometric Person Authentication. LNCS, vol. 2688. Springer-Verlag, pp. 964–974.
Messer, K., Kittler, J., Sadeghi, M., Hamouz, M., Kostin, A., Cardinaux, F., Marcel, S., Bengio, S., Sanderson, C., Poh, N., Rodriguez, Y., Czyz, J., Vandendorpe, L., McCool, C., Lowther, S., Sridharan, S., Chandran, V., Paredes Palacios, R., Vidal, E., Bai, L., Shen, L., Wang, Y., Yueh-Hsuan, C., Hsien-Chang, L., Yi-Ping, H., Heinrichs, A., Müller, M., Tewes, A., von der Malsburg, C., Würtz, R., Wang, Z., Xue, F., Ma, Y., Yang, Q., Fang, C., Ding, X., Lucey, S., Goss, R., Schneiderman, H., 2004. Face authentication test on the BANCA database. In: Proc. IAPR ICPR, vol. 4, pp. 523–532.
Moreno, P., Bernardino, A., Santos-Victor, J., 2005. Gabor parameter selection for local feature detection. In: Proc. Iberian Conf. Pattern Recognition Image Anal. LNCS, vol. 3522. Springer-Verlag, pp. 11–19.
Phillips, P.J., Moon, H., Rizvi, S.A., Rauss, P.J., 2000. The FERET evaluation methodology for face-recognition algorithms. IEEE Trans. Pattern Anal. Machine Intell. 22 (10), 1090–1104.
Phillips, P.J., Grother, P., Micheals, R.J., Blackburn, D.M., Tabassi, E., Bone, M., 2003. Face Recognition Vendor Test 2002, Evaluation Report, Technical Report.
Phillips, P.J., Flynn, P.J., Scruggs, T., Bowyer, K.W., Chang, J., Hoffman, K., Marques, J., Jaesik, M., Worek, W., 2005. Overview of the face recognition grand challenge. In: Proc. IEEE Int. Conf. CVPR, vol. 1, pp. 20–25.
Phillips, P.J., Grother, P., Micheals, R.J., Blackburn, D.M., Tabassi, E., Bone, M., 2007. FRVT 2006 and ICE 2006 Large-Scale Results, Technical Report, NIST.
Qin, J., He, Z.-S., 2005. A SVM face recognition method based on Gabor-featured key points. In: Proc. Int. Conf. Machine Learning and Cybernetics, vol. 8, pp. 5144–5149.
Qing, L., Shan, S., Chen, X., Gao, W., 2006. Face recognition under varying lighting based on the probabilistic model of Gabor phase. In: Proc. IAPR ICPR, vol. 3, pp. 1139–1142.
Rizvi, S.A., Phillips, P.J., Moon, H., 1998. A verification protocol and statistical performance analysis for face recognition algorithms. In: Proc. IEEE Int. Conf. CVPR, pp. 833–838.
Samaria, F., Harter, A., 1994. Parameterisation of a stochastic model for human face identification. In: Proc. IEEE Workshop Applications of Computer Vision, pp. 138–142.
Serrano, Á., Martín de Diego, I., Conde, C., Cabello, E., Shen, L., Bai, L., 2007a. Fusion of support vector classifiers for parallel Gabor methods applied to face verification. In: Proc. MCS Conf. LNCS, vol. 4472. Springer-Verlag, pp. 141–150.
Serrano, Á., Martín de Diego, I., Conde, C., Cabello, E., Shen, L., Bai, L., 2007b. Influence of wavelet frequency and orientation in an SVM-based parallel Gabor PCA face verification system. In: Proc. IDEAL Conf. LNCS, vol. 4881. Springer-Verlag, pp. 219–228.
Shan, S., Yang, P., Chen, X., Gao, W., 2005. AdaBoost Gabor fisher classifier for face recognition. In: Proc. Int. Workshop Anal. Modelling of Faces and Gestures. LNCS, vol. 3723. Springer-Verlag, pp. 279–292.
Shen, L., Bai, L., 2004. Gabor wavelets and kernel direct discriminant analysis for face recognition. In: Proc. IAPR ICPR, vol. 1, pp. 284–287.
Shen, L., Bai, L., 2006a. A review on Gabor wavelets for face recognition. Pattern Anal. Appl. 9 (2–3), 273–292.
Shen, L., Bai, L., 2006b. Mutual boost learning for selecting Gabor features for face recognition. Pattern Recognition Lett. 27 (15), 1758–1767.
Su, Y., Shan, S., Chen, X., Gao, W., 2006. Hierarchical ensemble of Gabor Fisher classifier for face recognition. In: Proc. IEEE Int. Conf. Automatic Face and Gesture Recognition, pp. 91–96.
Turk, M.A., Pentland, A.P., 1991. Eigenfaces for recognition. J. Cognitive Neurosci. 3 (1), 71–86.
Vapnik, V., 1995. The Nature of Statistical Learning Theory. Springer-Verlag, New York.
Wang, L., Li, Y., Wang, C., Zhang, H., 2007. Face recognition using Gaborface-based 2DPCA and (2D)2PCA classification with ensemble and multichannel model. In: Proc. IEEE Symp. Comput. Intell. in Security and Defense Appl., pp. 1–6.
Wiskott, L., Fellous, J.-M., Krüger, N., von der Malsburg, C., 1997. Face recognition by elastic bunch graph matching. IEEE Trans. Pattern Anal. Machine Intell. 19 (7), 775–779.
Xie, X., Lam, K.-M., 2006. Gabor-based kernel PCA with doubly nonlinear mapping for face recognition with a single face image. IEEE Trans. Image Process. 15 (9), 2481–2492.
Yang, P., Shan, S., Gao, W., Li, S.Z., Zhang, D., 2004. Face recognition using adaboosted Gabor features. In: Proc. IEEE Int. Conf. Automatic Face and Gesture Recognition, pp. 356–361.
Zhang, W., Shan, S., Gao, W., Chen, X., Zhang, H., 2005. Local Gabor binary pattern histogram sequence (LGBPHS): a novel non-statistical model for face representation and recognition. In: Proc. IEEE ICCV, vol. 1, pp. 786–791.
Zhang, B., Shan, S., Chen, X., Gao, W., 2007. Histogram of Gabor phase patterns (HGPP): a novel object representation approach for face recognition. IEEE Trans. Image Process. 16 (1), 57–68.
Zhao, W., Chellappa, R., Rosenfeld, A., Phillips, P.J., 2003. Face Recognition: A Literature Survey. ACM Computing Surveys, pp. 399–458.
Zhou, M., Wei, H., 2006. Face verification using Gabor wavelets and AdaBoost. In: Proc. IAPR ICPR, vol. 1, pp. 404–407.
Zou, J., Ji, Q., Nagy, G., 2007. A comparative study of local matching approach for face recognition. IEEE Trans. Image Process. 16 (10), 2617–2628.
