Expert Systems with Applications 38 (2011) 14284–14289
Finger-vein pattern identification using SVM and neural network technique

Jian-Da Wu*, Chiung-Tsiung Liu
Graduate Institute of Vehicle Engineering, National Changhua University of Education, 1 Jin-De Rd., Changhua City, Changhua 500, Taiwan
Keywords: Finger-vein pattern identification; Support vector machine; Neural network; Vehicle safety system
Abstract

This paper presents a support vector machine (SVM) technique for finger-vein pattern identification in a personal identification system. Finger-vein pattern identification is one of the most secure and convenient techniques for personal identification. In the proposed system, the finger-vein pattern is captured by an infrared LED and a CCD camera, because the vein pattern is not easily observed in visible light. The proposed verification system consists of image pre-processing and pattern classification. In this work, principal component analysis (PCA) and linear discriminant analysis (LDA) are applied in the image pre-processing for dimension reduction and feature extraction. For pattern classification, the system uses an SVM and an adaptive neuro-fuzzy inference system (ANFIS). The PCA method is used to remove noise residing in the discarded dimensions, and the main features are retained by LDA. The features are then used in pattern classification and identification. The classification accuracy using SVM is 98% and classification takes only 0.015 s, a performance superior to that of the ANFIS artificial neural network in the proposed system. © 2011 Elsevier Ltd. All rights reserved.
* Corresponding author. E-mail address: [email protected] (J.-D. Wu). doi:10.1016/j.eswa.2011.05.086

1. Introduction

Nowadays, personal identification using biometric verification technology is becoming more important in security systems, and it offers convenience to the user because there are no passwords to remember (Jay & Ajay, 2009). Biometrics has become a solution to security problems because biological features cannot easily be stolen or shared, and it can improve on the traditional authentication system, which usually verifies a person's identity through secret keys or smart cards. Many different biometric techniques, such as face image (Perlibakas, 2004), fingerprint (Ross, Jain, & Reisman, 2003), and iris and voice (Honeycutt, 2003), can be used to authenticate persons. Fingerprint recognition is one of the most mature biometric technologies in terms of algorithm availability and feasibility. Fingerprints are verified by comparing their minutiae with a minutiae template. However, fingerprint recognition requires physical contact, and hands are among the least clean body parts, which may cause problems with the device. Voice waveform recognition can be used to identify persons over the phone or in other environments, but it is subject to background noise disturbance, so it only works in environments where background noise can be controlled. Facial images can be corrupted by poor lighting, the angle of observation, and other parameters. Iris recognition systems are still much more expensive than other biometric systems. Therefore, these biometric methods all have some limitations for personal identification systems.

In the present study, a biometric feature authentication technology using finger-vein patterns is proposed. The finger-vein pattern is not easily observed in visible light, so in the experimental verification the finger-vein patterns are captured by an infrared LED and a CCD camera. Infrared LED radiation lies in the electromagnetic spectrum at wavelengths longer than visible light; it cannot be seen but it can be detected. Infrared LED light of 760–1000 nm is able to pass through the skin of the finger, while the hemoglobin in the vein absorbs the infrared light. Some research has used infrared light transmitted through various regions of the body (Boue, Cassagne, Massound, & Fournier, 2007). Near-infrared spectroscopy is a very useful and sensitive technique, and infrared thermography is a non-destructive technique delivering temperature images of the human body. This finger-vein identification method does not require the subject to touch the sensor, because the pattern is detected by a CCD camera through a near-infrared filter. Modern applications of the near-infrared use it to measure the composition of unknown samples; it has become a very popular technique in a wide variety of industries because of its speed, accuracy, wide applicability and avoidance of extraneous chemicals. Finger-vein identification has some advantages compared with other biometric authentication technologies, as summarized in Table 1 (Wu & Ye, 2009): non-contact operation, live-body identification, high security and low cost. A finger-vein identification system consisting of image capture, feature extraction, and classifiers is proposed. The flow chart of the finger-vein pattern
identification system is indicated in Fig. 1. The feature extraction in the proposed system uses the principal component analysis (PCA) technique. PCA is used for feature extraction to generate the most distinguishing features. It aims to find the projection directions that maximize the variance of a subspace, which is equivalent to finding the eigenvalues of the covariance matrix. In PCA, each vector x is projected from the input space. Note that the dimensionality of the feature space can be arbitrarily large; after the PCA projection, the dimensionality of each subject vector is significantly reduced. The main disadvantage of PCA-based approaches is that they do not distinguish the different roles of variations. To solve this problem, we use the label information provided by the linear discriminant analysis (LDA) projection, which is more reliable for classification purposes (Er, Wu, Lu, & Toh, 2002). For pattern classification, the support vector machine (SVM) has been successfully used in a number of applications. Recent developments in defining and training statistical classifiers make it possible to build reliable classifiers for very small sample sizes and even to find non-linear decision boundaries for small training sets. SVM has been shown to work effectively in combination with kernels that map the data to other high-dimensional spaces by non-linear transformations, where the data can be separated in a linear way.

Fig. 1. Schematic diagram of the finger-vein pattern identification system (data collection → feature extraction → data division into training and testing data → neural network classifier → classification results).

Table 1
Comparison of biometric authentication (Wu & Ye, 2009).

Biometric     Characteristic   Defect    Uniqueness   Security    Cost
Voice         Convenient       Noise     Low          Normal      Low
Face          Public           Light     Low          Normal      Low
Fingerprint   Extensive        Skin      High         Good        Low
Iris          Precision        Glasses   High         Excellent   High
Finger-vein   High security    Less      High         Excellent   Low

2. Principle of principal component analysis

PCA is mathematically defined as an orthogonal linear transformation that transforms the data to a new coordinate system such that the greatest variance of any projection of the data comes to lie on the first coordinate, termed the first principal component, the second greatest variance on the second coordinate, and so on (Ya, 2007). PCA is a powerful tool for analyzing data, identifying patterns, and expressing the data so as to highlight their differences. It describes a data set in terms of its variance. Since patterns can be hard to find in high-dimensional data, PCA can find these patterns by reducing the number of dimensions without much loss of information (Jiang, Mandal, & Kot, 2008). The PCA method reduces the number of variables and represents a multivariate data table in a low-dimensional space. PCA achieves excellent results in feature extraction and data reduction in large datasets (Ikhlas, Sara, Osama, & Sherif, 2006). PCA can be used for multivariate data analysis and is widely used in pattern analysis and compression. The principle of PCA is described as follows. For a data matrix X^T whose empirical mean has been subtracted from the data set, the PCA transformation is given by

Y^T = X^T W,   (1)

where W is obtained from the singular value decomposition of X^T. Given the set of points in the space, the first principal component corresponds to a line that passes through the mean and minimizes the sum of squared errors with those points. The second principal component corresponds to the same concept after all correlation with the first principal component has been subtracted from the points. Each eigenvalue indicates the portion of the variance associated with its eigenvector, and the sum of all the eigenvalues equals the sum of squared distances of the points from their mean divided by the number of dimensions. PCA essentially rotates the set of points around their mean to align with the first few principal components, moving as much of the variance as possible into the first few dimensions. PCA has the distinction of being the optimal linear transformation for keeping the subspace with the largest variance. PCA is an unsupervised statistical technique for extracting information from multivariate data sets. The eigenvectors of all the principal components are orthogonal to each other in the data space. The number of important principal components is less than the total number of principal components. Generally, PCA is regarded as a data reduction technique.

3. Principle of linear discriminant analysis

LDA is one of the most traditional linear dimensionality reduction methods. LDA seeks a transformation matrix W that maximizes the ratio of the between-class scatter to the within-class scatter in the projected feature space (Zhang & Jia, 2007). LDA is a method for finding the linear combination of features that best separates two or more classes of objects or events. We consider a within-class scatter matrix for the within-class scatter. The within-class scatter matrix S_W is defined as

S_W = \sum_{i=1}^{c} \sum_{x \in C_i} (x - m_i)(x - m_i)^t,   (2)
where c is the number of classes and Ci is a set of data within the ith class, and mi is the mean of the ith class. The within-class scatter matrix represents the degree of scatter within classes as a summation of covariance matrices of each class. The second term is defined as a between-class scatter matrix
S_B = \sum_{i=1}^{c} n_i (m_i - m)(m_i - m)^t,   (3)
where n_i is the number of data in the ith class and m is the mean of all the data. Based on S_W and S_B, the transformation matrix W is determined so as to maximize the criterion function
J(W) = \frac{|\tilde{S}_B|}{|\tilde{S}_W|} = \frac{|W^t S_B W|}{|W^t S_W W|}.   (4)
The transformation matrix W is obtained as the one that maximizes the criterion function J(W). The columns of the optimal W are the generalized eigenvectors w_i corresponding to the largest eigenvalues in
S_B w_i = \lambda_i S_W w_i.   (5)
LDA tries to find a set of features that best discriminates between the object classes. If S_W is full-rank, W can be computed from the eigenvectors of S_W^{-1} S_B.

4. Image classification using the support vector machine

SVM has been greatly developed and widely applied to classification and pattern recognition (Wang, Yuan, Liu, Yu, & Li, 2009). SVM is a set of related supervised learning methods and is basically a hyperplane classifier. Training an SVM classifier involves finding a hyperplane as its decision surface that separates the positive training examples from the negative ones with the largest margin (Sun, Lim, & Liu, 2009). The principle of training a linearly separable SVM is shown in Fig. 2. One of the main reasons for the wide application of SVM is its capacity to handle nonlinearly separable data. The training examples are represented as pairs (x_i, y_i), where x_i is the weighted feature vector of the training example and y_i \in \{-1, 1\} is the label of the example. For linearly separable data, we can determine a hyperplane f(x) = 0 that separates the data
f(x) = \sum_{i=1}^{n} w_i x_i + b = 0,   (6)

Fig. 3. Experimental setup of the finger-vein identification system.
where w is an n-dimensional vector and b is a scalar. The vector w and the scalar b determine the position of the separating hyperplane. For each i, either
w \cdot x_i - b \ge 1 for x_i of the first class, or
w \cdot x_i - b \le -1 for x_i of the second class.   (7)
The separating hyperplane that creates the maximum margin is termed the optimal separating hyperplane. Considering noise with slack variables \xi_i and an error penalty C, the optimal hyperplane can be found by solving the following problem
\min_{w,b,\xi} P(w, b, \xi) = \frac{1}{2} \langle w \cdot w \rangle + \frac{C}{2} \sum_{i=1}^{n} \xi_i^2,   (8)

Fig. 2. A linear separable support vector machine.
Fig. 4. Example of finger-vein pattern.
Fig. 5. Block diagram of the proposed finger-vein pattern system (image acquisition → feature extraction by PCA and LDA → image classification by SVM or ANFIS → result).
Fig. 6. Accuracy rate as a function of the number of pattern features (2–6) for PCA and PCA + LDA.
where \xi_i is the distance between the margin and the example x_i lying on the wrong side of the margin. The calculations can be simplified by converting the problem, via the Kuhn–Tucker conditions, into an equivalent Lagrange dual problem
V(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{l} \alpha_i \alpha_j y_i y_j K(x_i, x_j)   (9)

subject to

\sum_{i=1}^{l} y_i \alpha_i = 0,  C \ge \alpha_i \ge 0,  i = 1, 2, \ldots, l.   (10)
The function K(x_i, x_j), which returns a dot product of the feature-space mappings of the original data points, is called a kernel function. The number of variables in the dual problem equals the number of training data. According to the Karush–Kuhn–Tucker theorem, the equality condition holds for a training input–output pair (x_i, y_i) only if the associated \alpha_i is not 0; in this case the training example x_i is a support vector (SV). The number of SVs is considerably lower than the number of training samples, making SVM computationally very efficient. SVM is an effective classifier for the classification task.
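The soft-margin objective of Eq. (8), with squared slack variables, can be minimized directly by gradient descent. The sketch below is an illustration of that optimization with plain NumPy, not the authors' implementation; a real system would typically use a dedicated solver (e.g. an SMO-based library) on the dual problem of Eqs. (9) and (10).

```python
import numpy as np

def train_svm(X, y, C=1.0, lr=0.01, epochs=500):
    """Gradient descent on the squared-slack soft-margin objective of
    Eq. (8): P = 0.5*<w,w> + (C/2) * sum(xi_i^2),
    with xi_i = max(0, 1 - y_i * (w.x_i - b))."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w - b)
        xi = np.maximum(0.0, 1.0 - margins)   # active slack variables
        grad_w = w - C * (xi * y) @ X         # d/dw of Eq. (8)
        grad_b = C * np.sum(xi * y)           # d/db of Eq. (8)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(w, b, X):
    """Classify by the sign of the decision function f(x) = w.x - b."""
    return np.sign(X @ w - b)

# Toy linearly separable data (illustrative only).
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w, b = train_svm(X, y)
```

After training, `predict(w, b, X)` recovers the labels of the toy set; the learned w points along the direction separating the two clusters.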
5. Experimental work and analysis

5.1. Experimental work of the proposed system

The data acquisition system includes a near-infrared LED array, an infrared CCD camera and a PC. The proposed system is shown in Fig. 3. An array of near-infrared LEDs (wavelength: 850 nm), designed in the present study, is used as the light source to picture the characteristics of the vein pattern clearly, because the finger-vein is not easily observed in visible light, and the image is captured by an infrared-sensitive CCD camera. The database in the present experiment included 10 people; the fore-finger of each was examined and 10 images were taken of each finger. The size of the vein image captured by the CCD camera was 640 × 480 pixels. To simplify the vein-pattern data, the images were cropped to 130 × 130 pixels and resized to 20 × 20 pixels. An example of a finger-vein pattern is shown in Fig. 4. The block diagram of the proposed system is shown in Fig. 5. The experiment included three main procedures: first, image data were acquired by the CCD camera; second, the finger-vein pattern features were extracted and pre-processed by the PCA and LDA methods; finally, the vein-pattern images were classified by the SVM and the ANFIS network.
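The crop-and-resize step above can be sketched with plain NumPy. The crop offsets and the nearest-neighbour downsampling below are illustrative assumptions, since the paper does not specify the region of interest or the resizing method.

```python
import numpy as np

def preprocess(frame, top=175, left=255, crop=130, out=20):
    """Crop a 130x130 region from a 640x480 frame and downsample it to
    20x20 by nearest-neighbour index mapping, returning a flat,
    normalized 400-dimensional feature vector.
    The crop offsets are hypothetical placeholders."""
    roi = frame[top:top + crop, left:left + crop]
    idx = (np.arange(out) * crop // out).astype(int)  # nearest-neighbour grid
    small = roi[np.ix_(idx, idx)]
    return small.astype(np.float64).ravel() / 255.0

# A random stand-in for one 640x480 8-bit CCD frame.
frame = np.random.randint(0, 256, (480, 640), dtype=np.uint8)
vec = preprocess(frame)
```

The resulting 400-dimensional vectors are what the PCA stage then reduces.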
Table 2
Performance of identification rate versus number of training data using the SVM and ANFIS networks (method: PCA + LDA).

q/l   SVM accuracy (%)   SVM time (s)   ANFIS accuracy (%)   ANFIS time (s)
3/2   74                 0.0156         32                   0.1
4/2   74                 0.0156         28                   0.14
4/3   82                 0.0156         42                   0.26
5/3   88                 0.0156         42                   0.28
5/4   90                 0.0156         68                   1.1
6/4   94                 0.0156         92                   1.7
6/5   96                 0.0156         98                   5
7/4   98                 0.0156         98                   1.15
Fig. 7. Sample feature images: (a) original image; (b) four-feature image; (c) five-feature image; (d) six-feature image.

Table 3
Performance of identification rate for different training/testing set sizes using SVM (method: PCA + LDA).

q/l   Accuracy, train/test 10/100 (%)   20/100 (%)   50/100 (%)   Time (s)
3/2   73                                71           81           0.0156
4/2   78                                82           85           0.0156
4/3   71                                83           86           0.0156
5/3   85                                90           92           0.0156
5/4   81                                88           93           0.0156
6/4   86                                91           91           0.0156
6/5   80                                90           92           0.0156
7/4   79                                91           91           0.0156
7/5   71                                87           93           0.0156
8/4   89                                95           96           0.0156
8/5   85                                98           98           0.0156
8/6   79                                95           96           0.0156
5.2. Recognition of extracted feature

Feature extraction is a technique that pre-processes the data and reduces the computational complexity of a data system. Large feature dimensions impose a huge computation and memory cost on classifier training and classification. The finger-vein features are first extracted by PCA and by PCA + LDA, and the classification accuracy of a classifier such as SVM is then measured. The performance as a function of the number of features is shown in Fig. 6. PCA achieves excellent results in feature extraction and data reduction for large datasets. The basic idea of PCA is to reduce the dimensionality of a dataset containing many interrelated variables; the eigenvalues are sorted in order of their significance. Eigenvectors associated with larger eigenvalues carry the major information of the given data, and thus the eigenvectors related to small eigenvalues can be discarded. PCA + LDA successfully reduces the dimensionality of the image data and generates the most discriminative features. PCA + LDA obtains better classification performance by reducing irrelevant and redundant information in the data; its performance is better than that of the PCA method alone.

Fig. 8. Finger-vein patterns of four feature vectors for four testers.

5.3. Image processing and analysis
The finger-vein features are first extracted by PCA, as shown in Fig. 7. To find the differences among the groups of finger-vein patterns, PCA was applied to the principal components derived from the patterns. The principal components of the finger-vein patterns were then analyzed and the first q principal components, corresponding to the first q eigenvalues, were extracted. The number of principal components used as feature vectors and the four-feature-vector images are shown in Fig. 8. After that, the LDA transformation computes l new projection axes, onto which the features are projected to obtain maximum separability between classes. The l features form the training data input for the ANFIS network and the SVM. In the experiment, data from 10 testers were used to train and evaluate the performance of the networks in the proposed system. The training data information was used in the choice of structure and parameters of the ANFIS and BP networks. The results are summarized in Table 2. Different numbers of feature vectors yield different performance. With seven PCA features and four LDA features, the SVM achieved an identification rate of 98% in a short time. The relationship between the training set and testing set, described in Table 3, also shows good performance. Overall, the SVM performance is superior to that of the ANFIS network in the present study.
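The PCA + LDA chain of Sections 2 and 3 (Eqs. (1)–(5)) can be sketched with NumPy as follows. This is a minimal illustration, assuming an eigen-decomposition of the covariance matrix in place of the SVD for Eq. (1); the data, class counts, and dimensions q = 7 and l = 4 below are illustrative stand-ins, not the paper's database.

```python
import numpy as np

def pca(X, q):
    """Project the rows of X onto the top-q eigenvectors of the
    covariance matrix (the transformation W of Eq. (1))."""
    Xc = X - X.mean(axis=0)                   # subtract the empirical mean
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    W = vecs[:, ::-1][:, :q]                  # keep the top-q components
    return Xc @ W

def lda(X, labels, l):
    """Project onto the top-l generalized eigenvectors of
    S_W^{-1} S_B (Eqs. (2)-(5))."""
    classes = np.unique(labels)
    m = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in classes:
        Xc = X[labels == c]
        mi = Xc.mean(axis=0)
        Sw += (Xc - mi).T @ (Xc - mi)               # within-class scatter, Eq. (2)
        Sb += len(Xc) * np.outer(mi - m, mi - m)    # between-class scatter, Eq. (3)
    vals, vecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)  # Eq. (5)
    order = np.argsort(vals.real)[::-1]
    W = vecs[:, order[:l]].real                 # top-l discriminant axes
    return X @ W

# Illustrative data: 5 classes, 6 samples each, 10 raw dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 10))
labels = np.repeat(np.arange(5), 6)
feats = pca(X, 7)          # q = 7 PCA features
Z = lda(feats, labels, 4)  # l = 4 LDA features, the classifier input
```

The l-dimensional outputs of `lda` play the role of the training inputs fed to the SVM and ANFIS classifiers.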
6. Conclusions
References
In this paper, a finger-vein pattern identification system based on PCA for image pre-processing and LDA for feature extraction is proposed. Both SVM and ANFIS are used for pattern classification. A principal subspace in the feature space describes the distribution of the training data. The ability of the method to extract a large number of PCA + LDA features is useful for feature extraction. It is well known that if the dimension of the input data is too large, it overloads the system. The PCA method can easily extract the most distinguishing feature vectors and LDA can retain the main feature vectors, so the discriminative features can be accurately extracted by the PCA + LDA method. Finally, they are classified with the neural network and the support vector machine. In the identification system, the SVM and ANFIS were examined as classifiers. The experimental results indicate that the proposed system is an effective method for biometric authentication. The identification rate of SVM was as good as that of ANFIS, but SVM took much less time, indicating that it is a robust classifier. Finger-vein identification technology has high security and reliability compared with traditional authentication modes and may be widely applied in the future.
Boue, C., Cassagne, F., Massound, C., & Fournier, D. (2007). Thermal imaging of a vein of the forearm: Analysis and thermal modeling. Infrared Physics & Technology, 51, 13–20.
Er, M. J., Wu, S., Lu, J., & Toh, H. L. (2002). Face recognition with radial basis function (RBF) neural networks. IEEE Transactions on Neural Networks, 13(3), 697–709.
Honeycutt, L. (2003). Researching the use of voice recognition writing software. Computers and Composition, 20, 77–95.
Ikhlas, A. Q., Sara, P. R., Osama, A., & Sherif, Y. (2006). PCA-based algorithm for unsupervised bridge crack detection. Advances in Engineering Software, 37, 771–778.
Jay, B., & Ajay, K. (2009). On estimating performance indices for biometric identification. Pattern Recognition, 42, 1803–1815.
Jiang, X., Mandal, B., & Kot, A. (2008). Eigenfeature regularization and extraction in face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30, 383–393.
Perlibakas, V. (2004). Distance measures for PCA-based face recognition. Pattern Recognition Letters, 25, 711–724.
Ross, A., Jain, A., & Reisman, J. (2003). A hybrid fingerprint matcher. Pattern Recognition, 36, 1661–1673.
Sun, A., Lim, E. P., & Liu, Y. (2009). On strategies for imbalanced text classification using SVM: A comparative study. Decision Support Systems, 48, 191–201.
Wang, A., Yuan, W., Liu, J., Yu, Z., & Li, H. (2009). A novel pattern recognition algorithm: Combining ART network with SVM to reconstruct a multi-class classifier. Computers and Mathematics with Applications, 57, 1908–1914.
Wu, J. D., & Ye, S. H. (2009). Driver identification using finger-vein patterns with Radon transform and neural network. Expert Systems with Applications, 36, 5793–5799.
Ya, X. Z. (2007). Artificial neural networks based on principal component analysis input selection for clinical pattern recognition analysis. Talanta, 73, 68–75.
Zhang, X., & Jia, Y. (2007). Symmetrical null space LDA for face and ear recognition. Neurocomputing, 70, 842–848.
Acknowledgement The study was supported by the National Science Council of Taiwan, Republic of China, under project number NSC-97-2221-E018-008.