Random subspace for an improved BioHashing for face authentication


Available online at www.sciencedirect.com

Pattern Recognition Letters 29 (2008) 295–300 www.elsevier.com/locate/patrec

Loris Nanni *, Alessandra Lumini
DEIS, IEIIT – CNR, Università di Bologna, Viale Risorgimento 2, 40136 Bologna, Italy

Received 19 May 2006; received in revised form 1 March 2007; available online 13 October 2007
Communicated by T. Tan

Abstract

Verification based on tokenised pseudo-random numbers and user-specific biometric features has received much attention. In this paper, we propose a BioHashing system for automatic face recognition based on a Fisher-based feature transform, a supervised transform for dimensionality reduction that has proved very effective for the face recognition task. Since the dimension of the Fisher-transformed space is bounded by the number of classes minus 1, we use random subspace to create K feature spaces, which are concatenated into a new higher-dimensional space, in order to obtain a long and reliable "BioHash code".
© 2007 Elsevier B.V. All rights reserved.

Keywords: Face verification; BioHashing; Random subspace

1. Introduction

Establishing the identity of a person is a task of increasing importance in various areas of modern society, such as entrance control for buildings and restricted areas, authentication in day-to-day affairs like dealing with the post office, and detection of a suspect in a particular crime in the field of criminal investigation. Biometrics, which measures a physiological or behavioural characteristic of a person, such as voice, face, fingerprints, and iris, provides an effective way to solve the problems faced by traditional methods such as passwords and IC cards. One drawback of biometrics is that when a biometric is compromised a new template cannot be assigned. Substantial research is ongoing to find solutions or alternatives to contemporary non-reissuable biometrics. Some authors, like Bolle et al. (2002) and Davida et al. (1998), have introduced the terms cancellable biometrics and private biometrics. In (Jin et al., 2004), in order to solve the problem of high false rejection and to obtain a cancellable biometric,

* Corresponding author. Tel.: +39 3493511673. E-mail address: [email protected] (L. Nanni).

0167-8655/$ - see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.patrec.2007.10.005

a novel two-factor authenticator based on iterated inner products between tokenised pseudo-random numbers (generated from a hash key) and user-specific fingerprint features is presented; in this way a set of user-specific compact codes, named the "BioHash code", can be produced. In (Teoh et al., 2004) a dual-factor authenticator is proposed, based on iterated inner products between tokenised pseudo-random numbers and user-specific facial features generated by Fisher Discriminant Analysis, again producing a set of user-specific compact codes. BioHashing is theoretically motivated in (Teoh et al., 2006). Unfortunately, when an "impostor" B steals the hash key or the pseudo-random numbers of A and tries to authenticate as A, the performance of BioHashing-based methods (e.g. Teoh et al., 2004; Jin et al., 2004) may be lower than that obtained using only the biometric data (Duda et al., 2001; Lumini and Nanni, 2007). In a recent paper (Kong et al., 2006) the authors highlighted the anomalies of the Base BioHashing approach and concluded that the claim of having achieved a zero equal error rate (EER) rests upon the impractical hidden assumption that the hash key is never stolen. Moreover, they proved that in a more realistic scenario, where an impostor steals


the hash key, the results are worse than when using the biometric alone.
In (Lumini and Nanni, 2007) we proposed an improved BioHashing where more projection spaces were exploited to generate more BioHash codes per user. The verification task was performed by training a classifier for each BioHash code and finally combining these classifiers by the SUM rule. Moreover, we showed that by combining the matcher trained using the biometric features with the matcher based on the BioHash code we could obtain a more robust system.
In this paper we propose a BioHashing system for automatic face recognition based on a Fisher-based feature transform. The Fisher feature transform is a supervised transform for dimensionality reduction that has proved very effective for the face recognition task (Bellhumer et al., 1997). Unfortunately, it is well known that a Fisher-based feature transform yields a feature space whose dimension is bounded by the number of classes of the training set minus 1, which is a strong limitation in a BioHashing approach. To overcome this constraint, our proposal is to use random subspace to create K feature spaces, to project each of these spaces onto a lower d-dimensional space by a Fisher-based feature transform, and finally to concatenate the K reduced spaces into a new space of K × d dimensions, in order to obtain a long and reliable "BioHash code". In this paper, we test Fisherface (FIS) and Non-Linear Fisher (NLF) (Loog et al., 2001), both implemented as in the PRTools 3.1.7 toolbox (http://130.161.42.18/prtools/). While the random subspace method has been applied to various machine learning tasks (e.g. on-line signature verification, Nanni, 2006), it has received less attention in face recognition than, for instance, boosting (Gao and Wang, 2006).
Wang and Tang (2004, 2006) applied a variant of the random subspace method using Linear Discriminant Analysis (Bellhumer et al., 1997) as the base classifier. In Gao and Wang (2006), instead of boosting in the original feature space, whose dimensionality is usually very high, multiple feature subspaces with lower dimensionality were randomly generated and boosting was carried out in each random subspace; the trained classifiers were then combined with a simple fusion method.
The work is organized as follows: in Section 2 the Base BioHashing and the Improved BioHashing are briefly reviewed; in Section 3 the proposed face recognition system is detailed; in Section 4 the experimental results are discussed; finally, in Section 5 some concluding remarks are given.

2. BioHashing

BioHashing (Jin et al., 2004) generates a vector of bits starting from the biometric feature set and a seed which represents the hash key. The biometric feature vector x ∈ R^N is reduced to a bit vector b ∈ {0,1}^m (m ≤ N) via uniformly distributed pseudo-random numbers generated from a secret seed K (the hash key), as follows:

(1) Given K, generate a sequence of real numbers to produce a set of vectors r_i ∈ R^N, i = 1, …, m. Check that they are linearly independent, discarding dependent ones.
(2) Apply the Gram–Schmidt ortho-normalization procedure to transform the basis r_i into an ortho-normal basis or_i, i = 1, …, m.
(3) Compute the inner products ⟨x | or_i⟩, i = 1, …, m, and compute b_i (i = 1, …, m) as

b_i = 0 if ⟨x | or_i⟩ ≤ τ,  b_i = 1 if ⟨x | or_i⟩ > τ,

where τ is a preset threshold.

The resulting bit vector b, which we name the "BioHash code", is compared by the Hamming distance for similarity matching. In Lumini and Nanni (2007) we suggested using more projection spaces to generate more BioHash codes per user; we called this method Improved BioHashing. Let k be the selected number of projection spaces: the BioHashing method is iterated k times on the same biometric vector in order to obtain k bit vectors b_i, i = 1, …, k (SPACES AUGMENTATION). The verification task is performed by training a classifier for each BioHash code and finally combining these classifiers by a fusion rule (we suggest the SUM rule). Moreover, we proposed to normalize each biometric vector by its module (NORMALIZATION) and to use several values for τ instead of a fixed one (τ VARIATION). In Fig. 1 a schema of the Improved BioHashing is reported; for a detailed explanation, please see Lumini and Nanni (2007). All the experiments reported in this paper have been performed using the Improved BioHashing approach with the following parameter configuration: τmax = 0.1, τmin = −0.1, p = 5, k = 5.

3. System proposed

We propose to use random subspace to create K feature spaces. The random subspace method (RS) is the combining technique proposed by Ho (1998).
This method modifies the training data set (generating K new training sets), builds classifiers on these modified training sets, and then combines them into a final decision rule. The new training sets contain only a subset (50% in this paper) of all the features. We project each of these spaces onto a lower d-dimensional space by a Fisher-based feature transform and concatenate the reduced spaces to obtain a long and reliable "BioHash code". In this way, the dimension of the pseudo-random vectors is N = K × d. In this paper, we test as Fisher-based feature transforms the Fisherface (FIS) and the Non-Linear Fisher (NLF) (Loog et al., 2001), both implemented as in the PRTools 3.1.7 toolbox (http://130.161.42.18/prtools/). Fisher mapping is a dimensionality reduction technique based on the optimization of the between class scatter


Fig. 1. The improved BioHashing method (from Lumini and Nanni, 2007).
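As a concrete illustration, the Base BioHashing steps reviewed in Section 2 can be sketched as follows. This is an illustrative Python sketch, not the authors' Matlab/PRTools code; the function names and the use of a Gaussian pseudo-random generator followed by QR factorization (which performs the Gram–Schmidt-style orthonormalization) are our own assumptions:

```python
import numpy as np

def biohash(x, m, seed, tau=0.0):
    """Base BioHashing: map a biometric vector x in R^N to an m-bit code
    (m <= N) using pseudo-random projections derived from a secret seed."""
    rng = np.random.default_rng(seed)            # the seed plays the role of the hash key
    R = rng.standard_normal((x.size, m))         # m pseudo-random vectors r_i in R^N
    Q, _ = np.linalg.qr(R)                       # orthonormalize them (Gram-Schmidt step)
    projections = x @ Q                          # inner products <x | or_i>, i = 1..m
    return (projections > tau).astype(np.uint8)  # threshold at tau -> BioHash code b

def hamming_distance(b1, b2):
    """Similarity matching between two BioHash codes."""
    return int(np.sum(b1 != b2))
```

With the same key (seed), the same user always produces the same code, so genuine comparisons yield small Hamming distances; a stolen key reduces the scheme to the underlying biometric, which is the scenario evaluated in Section 4.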

Fig. 2. System proposed in this paper: during training, random subspace generation produces subspaces 1, …, K; each subspace is reduced by a Fisher-based feature transform; the K reduced spaces are concatenated and passed to the BioHashing stage.
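The pipeline of Fig. 2 can be sketched as follows. This is an illustrative Python sketch under our own simplifying assumptions: a minimal eigendecomposition-based Fisher mapping stands in for the PRTools FIS/NLF implementations, and the function names are ours:

```python
import numpy as np

def fisher_transform(X, y, d):
    """Minimal Fisher mapping: project onto the top-d eigenvectors of
    Sw^{-1} Sb. d is bounded by (number of classes - 1)."""
    n_feat = X.shape[1]
    mean = X.mean(axis=0)
    Sw = np.zeros((n_feat, n_feat))              # within-class scatter
    Sb = np.zeros((n_feat, n_feat))              # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mean, mc - mean)
    Sw += 1e-6 * np.eye(n_feat)                  # regularize so the inverse exists
    vals, vecs = np.linalg.eig(np.linalg.inv(Sw) @ Sb)
    order = np.argsort(-vals.real)
    return vecs.real[:, order[:d]]               # (n_feat x d) projection matrix

def fit_random_subspace_fisher(X, y, K, d, rng):
    """K random 50% feature subsets, each reduced to d dims by the Fisher mapping."""
    n_feat = X.shape[1]
    models = []
    for _ in range(K):
        idx = rng.choice(n_feat, n_feat // 2, replace=False)
        models.append((idx, fisher_transform(X[:, idx], y, d)))
    return models

def project(x, models):
    """Concatenate the K d-dim projections into a (K*d)-dim feature vector,
    which then feeds the (Improved) BioHashing stage."""
    return np.concatenate([x[idx] @ W for idx, W in models])
```

The concatenated vector has dimension K × d, which is what lifts the "number of classes minus 1" bound on a single Fisher space.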

matrix with respect to the within scatter matrix, so as to better discriminate patterns of different classes. It has been proven (Bellhumer et al., 1997) that the Fisher transform is particularly well suited to classification tasks such as face recognition. The Non-Linear Fisher mapping has been proposed (Loog et al., 2001) to deal with the drawbacks of the Fisher transform: the Fisher transform usually gives a poor approximation of the patterns in multi-class problems, because it tends to emphasize large class distances to the detriment of very close classes. Therefore, in the Non-Linear Fisher mapping a weight factor based on the error function is introduced in the definition of the between class scatter matrix, to lower the influence of the large class distances. The feature vector obtained by concatenating the K Fisher-transformed features has a dimension (N = K × d) large enough to allow a long and reliable "BioHash code" to be obtained, which is not possible using

a single Fisher-based feature vector, whose dimension is bounded by the number of classes minus 1. In Fig. 2 the global approach is shown: the biometric feature vector obtained by concatenating the K Fisher-based reduced spaces is used to create a "BioHash code" by the Improved BioHashing approach.

4. Experiments

All the tests have been conducted on the ORL¹, YALE-B² and FERET (Wang and Tang, 2006) datasets, which are three of the most used benchmarks in this field.

ORL: It consists of 400 different images related to 40 individuals.

¹ http://www.uk.research.att.com/facedatabase.html.
² http://cvc.yale.edu/projects/yalefacesB/yalefacesB.html.


YALE-B: It consists of images related to 10 individuals; we use only the frontal poses (108 images per individual).
FERET: It was introduced in the third FERET evaluation (September 1996) and contains images of 1196 individuals (taken over a long period under varying acquisition conditions). It consists of one training set (1196 frontal images, one for each individual) and four test sets:
• Dup I: 722 images of individuals taken on different days – duplicate images.
• Dup II: 234 images of individuals taken over a year apart.
• FB: 1195 images of individuals taken on the same day with the same lighting.
• FC: 194 images of individuals taken on the same day with different lighting.
For the verification task of the methods based solely on biometric data we adopt a 1-Nearest Neighbor classifier (Duda et al., 2001). Moreover, before the random subspace step the data are pre-processed by the Karhunen–Loève Transform (Duda et al., 2001): the gray level values are projected onto a 100-dimensional space. For the ORL and YALE-B datasets, to minimize possible misleading results caused by the choice of training data, the results have been averaged over five experiments, all conducted using the same parameters. For each experiment we randomly resampled the learning and test sets (each containing half of the patterns), maintaining the distribution of the patterns in the classes (individuals). For the FERET dataset we used the given training set and evaluated the approaches on the four test sets. For the performance evaluation we adopt the equal error rate (EER) (as in Lumini and Nanni, 2007; Duda et al., 2001). The EER is the error rate at which the frequency of fraudulent accesses equals the frequency of rejections of people who should be correctly verified (Franco et al., 2006). Table 2 reports the results of the comparison among the different methods tested in this paper.
Each method is defined by three parameters:
• BioHash indicates whether BioHashing is performed; it can assume the values "YES" or "NO". "YES" means that the classification is performed by Improved BioHashing; "NO" means that the classification is performed by the 1-Nearest Neighbor classifier.
• RS indicates whether random subspace creation is performed; it can assume the values "NO", "RS1" or "RS2". "NO" means that we do not use random subspace; "RS1" and "RS2" mean that we use random subspace. In order to obtain a fair comparison, with a similar

3 Implemented as in the PRTools 3.1.7 MATLAB toolbox, http://130.161.42.18/prtools/.

Table 1
Parameter settings for the different approaches tested in this work

Method (RS)   ORL              YALE-B           FERET
NO            d = 39           d = 9            d = 50
RS1           K = 5, d = 20    K = 11, d = 9    K = 5, d = 20
RS2           K = 10, d = 20   K = 25, d = 9    K = 10, d = 20

length of the feature vector on the three datasets, we fix the parameter configurations reported in Table 1. Please note that the configuration "RS2" is aimed at evaluating the performance that may be obtained using a very long BioHash code.
• Features indicates which feature extraction has been adopted; "FIS" denotes the Fisher mapping, "NLF" denotes the Non-Linear Fisher mapping.
Please note that all the experiments are carried out under the worst-case, and very unlikely, hypothesis that an impostor always (in each match) steals the hash key. Our best results have been obtained by BioHash = "YES"; RS = "RS2"; Features = "NLF"; we call this combination BIOHRS. BIOHRS dramatically improves the performance of BioHashing (i.e. RS = "NO") and of a matcher trained using the biometric features (i.e. BioHash = "NO"), even under the worst-case and very unlikely hypothesis that an impostor always steals the hash key. We want to stress that whenever the impostor does not steal the hash key, the EER of BIOHRS on these datasets was always 0.
Other interesting results are:
• The simple BioHashing approach (configuration BioHash = "YES"; RS = "NO"; Features = "FIS"), under the worst-case hypothesis that an impostor always steals the hash key, performs worse than the pure biometric (configuration BioHash = "NO"; RS = "NO"; Features = "FIS"); however, this is no longer true under the best-case hypothesis that nobody steals the hash key, where the performance of the BioHashing approach reaches 0 EER.
• The performance of the pure biometric methods (BioHash = "NO") increases with the number of random subspaces (RS = "RS2" vs. RS = "RS1"): this is probably due to the fact that the number of subspaces in "RS1" is not sufficient to create a robust ensemble of matchers. The problem of choosing the number of subspaces in random subspace is not novel in the literature; it is also discussed in (Wang and Tang, 2004, 2006).
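As a reference for how the reported EER figures are obtained, the EER can be computed from genuine and impostor similarity scores by sweeping a decision threshold until the false acceptance and false rejection rates meet. This is a generic sketch under our own assumptions; the paper does not specify its exact threshold-sweeping procedure:

```python
def equal_error_rate(genuine, impostor):
    """Return the EER given similarity scores (higher = more similar)
    for genuine attempts and impostor attempts."""
    best_gap, eer = float("inf"), None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # false acceptance rate
        frr = sum(s < t for s in genuine) / len(genuine)     # false rejection rate
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer
```

When FAR and FRR do not intersect exactly at a sampled threshold, the midpoint of the closest pair is returned, a common practical convention.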
The training time on the ORL dataset for the Fisher mapping feature extraction is 2 s, and it grows to 10 s using random subspace with K = 10 and d = 20. The verification time using a BioHash key of length 20 is 0.1 × 10⁻⁴ s; the verification time using the method BIOHRS is 0.1 × 10⁻³ s. All the experiments were carried out on a


Table 2
Face verification EER obtained using the following parameters: τmax = 0.1, τmin = −0.1, p = 5, k = 5. DUP1, DUP2, FB and FC are the four FERET test sets.

BioHash   RS    Features   ORL     YALE    DUP1    DUP2    FB     FC
NO        NO    FIS        2.07    0.8     17      21      6.5    14
NO        NO    NLF        2.65    0.8     15      18      6      13
NO        RS1   FIS        2.9     0.23    17.2    20      6.2    14.6
NO        RS1   NLF        2.78    0.24    16.3    20      7      13.6
NO        RS2   FIS        2       0.35    16      18      6      13.2
NO        RS2   NLF        1.48    0.23    12.9    13.5    5      10.5
YES       NO    FIS        3.48    0.55    17.5    20.5    7.9    15
YES       NO    NLF        2.15    0.53    14.8    17.1    6.5    12.4
YES       RS1   FIS        3.6     0.26    16      18      6      12
YES       RS1   NLF        1.94    0.26    13      15      5.8    11.5
YES       RS2   FIS        3       0.3     16      19      4      9.4
YES       RS2   NLF        0.82    0.23    11.9    11      4.7    10

Table 3
Average Q-statistic and average EER of the single BioHash-key classifiers in the BIOH (BioHash = "YES", RS = "NO") and BIOHRS approaches

          BIOH                            BIOHRS
Dataset   Q (NLF / FIS)   EER (NLF / FIS)   Q (NLF / FIS)   EER (NLF / FIS)
ORL       0.95 / 0.8      4.7 / 14.1        0.8 / 0.65      4 / 9
YALE      0.96 / 0.96     3.4 / 3.1         0.95 / 0.95     0.6 / 1.5
DUP1      0.92 / 0.9      19 / 25           0.95 / 0.92     18 / 22
DUP2      0.91 / 0.85     20 / 25           0.94 / 0.92     19 / 22
FB        0.85 / 0.8      14 / 22           0.88 / 0.85     13.5 / 20
FC        0.83 / 0.8      15 / 21           0.86 / 0.85     14.5 / 19.5

Pentium 1.6 GHz with 512 MB RAM, using Matlab 6.5.1, without code optimization.
We motivate the effectiveness of our method using the average Q-statistic (Kuncheva and Whitaker, 2003; Lumini and Nanni, 2007). Yule's Q-statistic is a measure for evaluating the independence of classifiers: for two classifiers Di and Dk the Q-statistic is defined as

Qi,k = (ad − bc) / (ad + bc)

where a is the probability of both classifiers being correct, d is the probability of both classifiers being incorrect, b is the probability that the first classifier is correct and the second is incorrect, and c is the probability that the second classifier is correct and the first is incorrect. Qi,k varies between −1 and 1 and is 0 for statistically independent classifiers. It is well known in the literature that a good ensemble is obtained by combining statistically independent classifiers, each of which obtains a low error rate (Kuncheva, 2005). In Table 3 we report the average Q-statistic and average EER obtained by each single hash key used in the BIOH (BioHash = "YES", RS = "NO") and BIOHRS approaches.
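The pairwise Q-statistic above can be computed directly from two classifiers' per-sample correctness. This is an illustrative sketch: counts stand in for the probabilities a, b, c, d (which leaves the ratio unchanged), and returning 0 when the denominator vanishes is our own convention:

```python
def q_statistic(correct_i, correct_k):
    """Yule's Q-statistic for two classifiers, from boolean vectors marking
    the test samples on which each classifier was correct."""
    a = sum(ci and ck for ci, ck in zip(correct_i, correct_k))          # both correct
    b = sum(ci and not ck for ci, ck in zip(correct_i, correct_k))      # only D_i correct
    c = sum(ck and not ci for ci, ck in zip(correct_i, correct_k))      # only D_k correct
    d = sum(not ci and not ck for ci, ck in zip(correct_i, correct_k))  # both incorrect
    denom = a * d + b * c
    return (a * d - b * c) / denom if denom else 0.0  # 0 when undefined (our convention)
```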


The classifiers in BIOHRS have an average Q-statistic slightly higher than the classifiers in BIOH, but the classifiers in BIOHRS have a lower average EER. Moreover, the BioHash codes obtained by the Non-Linear Fisher mapping have a considerably lower average EER than those obtained by the Fisher mapping.

5. Conclusions

We have proposed a new BioHashing approach (BIOHRS), based on the several spaces obtained by random subspace for augmenting the length of the hash code, which gains a performance improvement with respect to the method based solely on biometric data, even in the worst case when an impostor always steals the hash key. Whenever the hash key is not stolen, the EER of BIOHRS was 0. As future research, we intend to investigate the use of different sets of features less sensitive to illumination conditions, such as Gabor features (Zhang et al., 2005) or Local Binary Patterns (Ahonen et al., 2004), and other methods for dimensionality reduction. Moreover, we want to study the fusion between the matcher trained using the biometric features and the


matchers trained using the BioHash code. Preliminary results reported in Lumini and Nanni (2007) show that these two matchers can be combined to obtain a more robust system.

Acknowledgements

This work has been supported by the European Commission IST-2002-507634 Biosecure NoE project. Portions of the research in this paper use the FERET database of facial images collected under the FERET program, sponsored by the DOD Counterdrug Technology Development Program Office.

References

Ahonen, T., Hadid, A., Pietikainen, M., 2004. Face recognition with local binary patterns. In: Proc. European Conference on Computer Vision, pp. 469–481.
Bellhumer, P.N., Hespanha, J., Kriegman, D., 1997. Eigenface vs. fisherfaces: Recognition using class specific linear projection. IEEE Trans. Pattern Anal. Machine Intell. 17 (7), 711–720.
Bolle, R.M., Connel, J.H., Ratha, N.K., 2002. Biometric perils and patches. Pattern Recognition 35, 2727–2738.
Davida, G., Frankel, Y., Matt, B.J., 1998. On enabling secure applications through off-line biometric identification. In: Proc. Symposium on Privacy and Security, pp. 148–157.
Duda, R., Hart, P., Stork, D., 2001. Pattern Classification. Wiley, New York.
Franco, A., Lumini, A., Maio, D., Nanni, L., 2006. An enhanced subspace method for face recognition. Pattern Recognition Lett. 27, 76–84.
Gao, Y., Wang, Y., 2006. Boosting in random subspaces for face recognition. In: Proc. ICPR 2006.
Ho, T.K., 1998. The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Machine Intell. 20 (8), 832–844.
Jin, A.T.B., Ling, D.N.C., Goh, A., 2004. BioHashing: Two factor authentication featuring fingerprint data and tokenised random number. Pattern Recognition 37 (11), 2245–2255.
Kong, B., Cheung, K., Zhang, D., Kamel, M., You, J., 2006. An analysis of BioHashing and its variants. Pattern Recognition 39 (7), 1359–1368.
Kuncheva, L.I., 2005. Diversity in multiple classifier systems. Inform. Fusion 6 (1), 3–4.
Kuncheva, L.I., Whitaker, C.J., 2003. Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Machine Learn. 51, 181–207.
Loog, M., Duin, R.P.W., Haeb-Umbach, R., 2001. Multiclass linear dimension reduction by weighted pairwise Fisher criteria. IEEE Trans. Pattern Anal. Machine Intell. 23, 762–766.
Lumini, A., Nanni, L., 2007. An improved BioHashing for human authentication. Pattern Recognition 40 (3), 1057–1065.
Nanni, L., 2006. Experimental comparison of one-class classifiers for online signature verification. Neurocomputing 69 (7–9), 869–873.
Teoh, A.B.J., Ngo, D.C.L., Goh, A., 2004. An integrated dual factor authenticator based on the face data and tokenised random number. In: Proc. ICBA 2004, LNCS 3072, pp. 117–123.
Teoh, A.B.J., Ngo, D.C.L., Goh, A., 2006. Random multispace quantization as an analytic mechanism for BioHashing of biometric and random identity inputs. IEEE Trans. Pattern Anal. Machine Intell. 28 (12), 1892–1901.
Wang, X., Tang, X., 2004. Random sampling LDA for face recognition. In: Proc. IEEE Computer Society Conf. on CVPR.
Wang, X., Tang, X., 2006. Random sampling for subspace face recognition. Internat. J. Comput. Vision 70 (1), 91–104.
Zhang, W., Shan, S., Gao, W., Chen, X., Zhang, H., 2005. Local Gabor binary pattern histogram sequence (LGBPHS): A novel non-statistical model for face representation and recognition. In: Proc. 10th Internat. Conf. on Computer Vision.