Accepted Manuscript
Online signature verification by continuous wavelet transformation of speed signals
Orcan Alpar
PII: S0957-4174(18)30152-0
DOI: 10.1016/j.eswa.2018.03.023
Reference: ESWA 11870
To appear in:
Expert Systems With Applications
Received date: 2 September 2017
Revised date: 14 February 2018
Accepted date: 9 March 2018
Please cite this article as: Orcan Alpar, Online signature verification by continuous wavelet transformation of speed signals, Expert Systems With Applications (2018), doi: 10.1016/j.eswa.2018.03.023
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Highlights
• We proposed a novel online signature validation system.
• The signature is visible to all users for retracing.
• A hidden subsystem extracts the speed signal, disregarding the signature matching.
• SVM is trained with spectrograms revealed by CWT for 10 signing samples.
• EER of 3.19% with 0.83% FN and 2.5% FP is achieved for 120 trials.
Online signature verification by continuous wavelet transformation of speed signals
Orcan ALPAR Center for Basic and Applied Research, Faculty of Informatics and Management, University of Hradec Kralove, Rokitanskeho 62, Hradec Kralove 50003, Czech Republic,
[email protected] +420 732 764683
Abstract: Although signatures can easily be imitated with the numerous image processing programs available, online verification systems could provide sufficient security for e-signatures. Recent developments in touchscreen technology and Android programming also allow the use of hidden interfaces that stealthily collect the unique characteristics of a signer and store key features other than geometry. Therefore, we initially designed a signing interface for touchscreens which stealthily collects the precise coordinates while an individual is signing on the screen with a fingertip. The coordinate data are extracted as a matrix consisting of x and y values with the corresponding time, and the speed array is subsequently calculated to investigate the higher frequency regions. The speed data are processed by continuous wavelet transformations (CWT) to reveal the frequency information of the signing speed with respect to time. The grayscale spectrograms created by the wavelet transforms are converted into arrays for a subsequent training session performed by support vector machines (SVM). The trained network successfully classified further attempts of real and fake signatures with a 1.67% false negative rate (FNR), a 3.33% false positive rate (FPR) and a 3.41% equal error rate (EER) for 120 signatures, even though the signature is totally public. To assess the validity of running CWT and SVM consecutively, the experiments are repeated for signatures taken from the SVC2004 and SUSIG public databases.
Keywords: Online signature verification; biometrics; forensics; continuous wavelet transformation; SVM
1. Introduction
An electronic signature represents the original signature of an individual placed on an electronic document, and it should carry the unique geometrical and signing characteristics of the owner. However, given the numerous image processing applications available, signatures could easily be copied and pasted onto an electronic document, which makes the use of e-signatures untrustworthy. Easy as the imitation of electronic signatures is, touchscreen technologies could provide sufficient security by extracting the characteristics of the signing style of the users in order to check future attempts. Therefore, analyzing the signing style of individuals by extracting the unique features while they are signing could be considered a branch of habitual biometrics, as distinct from forensics, which deals with offline signature verification.
The main unique feature to be extracted in signature verification is indeed geometrical; however, once a signature is stolen, it is simple to sign a document very similarly to the original, mostly on electronic devices. Geometrical features of signatures have been analyzed to find the similarities among instances in very recent papers such as (Hafemann, Sabourin, & Oliveira, 2017), (Serdouk, Nemmour, & Chibani, 2016), (Zalasiński, 2016), (Diaz, Fischer, Ferrer, & Plamondon, 2016), (Suryawanshi, Kale, Pawar, Kadam, & Ghule, 2016) and in various older papers (Pal, Alireza, Pal, & Blumenstein, 2011), (Wang, 2009), (Pal, Pal, & Blumenstein, 2012), (Nebti & Boukerram, 2013). What we propose here instead is extracting the unique features while signing, which can be very hard to imitate even if the signature is public. As some examples of recent work on offline signature verification: Hamadane and Chibani (2016) dealt with finding co-occurrence matrices of global geometrical features by contour transformations and directional map reconstruction. Tolasana et al. (2015) analyzed very basic features such as time, kinematics, direction, geometry and pressure for finding the similarities between signatures using the Mahalanobis distance. Eskander et al. (2014) proposed a fuzzy system to find the similarities of signatures in a feature dissimilarity space. These are some examples of recent papers; however, they do not collect the features while a user is signing, and they compare the geometric similarities only after the signing process. When we searched for the latest research on online signature analysis, we came across several prominent papers in the literature: (Guru, Manjunatha, Manjunath, & Somashekara, 2017), (Sharma & Sundaram, 2017),
(Manjunatha, Manjunath, Guru, & Somashekara, 2016), (Tang, Fang, Wu, Kang, & Zhao, 2016), (Kumar & BH, 2015), (Forhad, Poon, Amin, & Yan, 2015), (Bhateja, Chaudhury, & Saxena, 2014), (Lopez-Garcia, Ramos-Lara, Miguel-Hurtado, & Canto-Navarro, 2014), (Parodi & Gómez, 2014), (Plamondon, Pirlo, & Impedovo, 2014). In detail, Sharma and Sundaram (2017) proposed a set of features based on statistical characteristics of signatures and presented a Gaussian mixture method (GMM) for validation using dynamic time warping (DTW). They also presented a warping path feature to be fused with the normalized DTW score for enhancing the performance. However, unlike our work, they did not collect the signatures themselves and demonstrated their method on signature samples borrowed from public databases. They used 5 and 10 samples for training and achieved 3.05% and 2.2% EER respectively. Guru et al. (2017) dealt with the selection of writer-dependent features and their symbolic representation in the form of symbolic feature vectors. They afterwards fixed the writer-dependent parameters for signature verification, subsequent to a training session for creating a confidence interval with a similarity threshold. As a result of numerous experiments conducted with signatures taken from the databases, they reached various average EER values between 2.2% and 9.2% when 5 real signatures are taken into consideration. The closest recently published research seems to be that of Tang et al. (2016), since they created their own application to collect real online biometric data of individuals signing on the interface. The interface they produced is signed with fingers and the samples are reproduced in the enrollment step for further analysis by DTW. Although there is no detailed information in their paper about the classification methodology, the EER values they achieved are between 6.7% and 7.3%.
Moreover, and briefly, Forhad et al. (2015) presented a two-dimensional plane method for approximate string matching to discriminate the signatures they created. Bhateja et al. (2014) extracted speed, pressure and length features for training and testing a neural network based classifier to analyze e-signatures by dividing the process into pieces in the time domain. Kumar et al. (2015) used a special device that extracts dynamic and spatial data as the users are signing. They preferred the support vector machines algorithm for classification of the attempts and for finding the similarities between the input signature and the reference set. López-García et al. (2014) utilized the dynamic time warping (DTW) method on signing processes captured by a special input device and classified the signatures by a shape comparison methodology.
In our previous papers, we dealt with the frequency trait as the main feature of biometric systems. For instance, in (Alpar, 2017), we used frequencies as an alternative to time-domain solutions as the novelty in biometric keystroke authentication. Instead of inter-key times, we trained the neural network based classifier with keystroke frequency signals using spectrograms obtained by short-time Fourier transformations. Regarding online signature analysis itself, we have two recently published papers, (Alpar & Krejcar, 2017) and (Alpar & Krejcar, 2016), both including the frequency feature of signing styles. In (Alpar & Krejcar, 2016), we demonstrated the differences in frequencies and corresponding spectrograms: if a user is tracing the signature, even at various speeds, the spectrograms are significantly different from those of the original signing style. We used an image closeness algorithm for comparing the frequency spectrograms, which resulted in lower similarities than expected due to the excessive resolution of the RGB images. In our experiments we obtained a similarity index of 89.05%-97.75% among real signatures and 0%-84.47% among fake attempts, for 4 trials each.
As a follow-up study addressing the weaknesses of (Alpar & Krejcar, 2016), in (Alpar & Krejcar, 2017) we classified the frequency spectrograms obtained using Fourier transformations by mathematical fuzzy surfaces. Omitting any kind of fuzzy inference interface, we defined the fuzzy surfaces considering the logic table and tried to differentiate the attempts by grid histograms of high frequency regions. Discarding the training session, we transformed the displacement of the signature signals into spectrograms by short-time Fourier transformation to find the similarities between real and fake signing styles. For 30 spectrograms in each set, we obtained 4.82% EER with a 3.33% false accept rate (FAR) and a 0% false reject rate (FRR) for a threshold of 90% as the minimum similarity. The existence of the threshold enabled us to analyze the receiver operating characteristic (ROC) by changing the threshold and to calculate the area under the ROC curve.
Aside from this earlier research, what we propose in this paper is extracting the speed feature and changing the speed signal from the time domain into the frequency domain by continuous wavelet transformation. The transformation we applied produces a grayscale spectrogram that indicates the high frequency regions by whiter colors and varies with the signing style. The training set is formed by ten attempts, consisting of five real and five designated fake attempts, to train the support vector machines (SVM) for classification of future trials. The paper starts with the introduction of the validation system, including the workflow of the whole system, its subsystems and preliminaries, in Section 2. Initial experiments are presented in Section 3 with training and checking sessions, and additional experiments are conducted for signatures taken from the SVC2004 and SUSIG databases in Section 4. The paper ends with conclusion and discussion in Section 5, providing future research possibilities, strengths, weaknesses and drawbacks, and also including benchmarks of the outcomes of this research against the literature.
2. Signature Validation System
As mentioned in the previous section, the main purpose of the system is to validate the real owner of the password by frequencies transformed from the speed signal. We borrowed the main conventional protocol of biometric authentication systems where the passwords to be cracked are totally public. The basic workflow of the system is presented in Figure 1 with examples of signing styles and spectrograms of real and fake signatures.
Fig.1. Basic Workflow (displacement: extraction of the coordinates every 0.01 s, saving the coordinate data as a matrix; speed: calculation of speed, saving the speed data as an array; CWT: continuous wavelet transformations; verification: classification by the SVM trained network)
The details and components of the system are presented in following sections. Initially, the matching accuracy is introduced that calculates the ratio of the points extracted by an attempt that matches with the original signature. Furthermore, the basic components of the classification methodology, SVM and CWT, are presented as preliminaries.
2.1. Matching Accuracy
Although the signature itself is not the main concern of the classifier, it is mandatory to calculate the accuracy of the signatures for performance analysis. Let any pixel $p_{i,j}$ on the matrix $\bar{P}$ of the binary interface consisting of the master signature be represented by:

$$p_{i,j} \in \bar{P}, \quad (i = [1:w],\ j = [1:h]) \tag{1}$$

where $w$ is the width and $h$ is the height of the interface. Any signing attempt turned into a signal by extraction of points would create a matrix consisting of a set of coordinates $s[x, y]$, which also forms a binary image $\bar{S}$ after a simple operation, namely:

$$\bar{\bar{p}}_{i,j} \in \bar{S}, \quad (i = [1:w],\ j = [1:h]) \tag{2}$$

where

$$\bar{\bar{p}}_{i,j} = \begin{cases} 0, & p_{i,j} = 0 \wedge (i, j) \in s[x, y] \\ 1, & \text{otherwise} \end{cases} \tag{3}$$

and $\wedge$ is the AND operator. Therefore the accuracy $A$ could be calculated by counting the zero pixels among the new pixel values $\bar{\bar{p}}_{i,j}$ in the summation matrix $\bar{S}$ and dividing by the zeros of the signature matrix $\bar{P}$, namely:

$$A = \frac{N - \sum_{i=1}^{w} \sum_{j=1}^{h} \bar{\bar{p}}_{i,j}}{N - \sum_{i=1}^{w} \sum_{j=1}^{h} p_{i,j}} \tag{4}$$

where $N$ represents the number of pixels on the plane. The subtraction is necessary since the summations give the number of white pixels, while we need to find the match ratio of black pixels.
2.2. Continuous Wavelet Transformation
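The wavelet transformation, like the short-time Fourier transformation, is very practical for identifying the time and frequency localization of a signal. In contrast to the fixed-length window of the short-time Fourier transformation, CWT provides a variable-length windowing function while analyzing a signal. Given that $x(t)$ is the signal to be analyzed, the CWT is very basically achieved by:

$$W(a,b) = \langle x, \psi_{a,b} \rangle = \int_{-\infty}^{\infty} x(t)\, \psi^{*}_{a,b}(t)\, dt \tag{5}$$

where $a$ is the scale, $b$ is the time variable and $\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{t-b}{a}\right)$. Provided that $x(t)$ has the frequency component $\omega$, the Fourier transformation would lead to $\hat{x}(\omega)$; then we get:

$$W(a,b) = \frac{1}{2\pi} \langle \hat{x}, \hat{\psi}_{a,b} \rangle = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{x}(\omega)\, \sqrt{a}\, \hat{\psi}^{*}(a\omega)\, e^{i\omega b}\, d\omega \tag{6}$$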
Among several methods for finding the wavelet, we used the bump wavelet, which theoretically gives more proper results for oscillations in data, such as in sound signals. The bump wavelet is defined as:

$$\hat{\psi}(a\omega) = \exp\!\left(1 - \frac{1}{1 - (a\omega - \mu)^2 / \sigma^2}\right) \mathbb{1}_{\left[(\mu - \sigma)/a,\ (\mu + \sigma)/a\right]} \tag{7}$$

where $\mathbb{1}$ is the indicator function, with the parameters $\mu$ and $\sigma$ changing the frequency and time localization.
2.3. Support Vector Machines
The images extracted in wavelet transformation phase are the major representation of the frequencies formed by the signing style. For the classification of the CWT images, we preferred SVM as the main classifier that will distinguish the images by the differences of the pixels in training phase. However, unlike the other training algorithms, it is necessary to train the real images with the fake ones in SVM. Any pixel on a grayscale image extracted as a spectrogram is represented as:

$$p_{i,j}, \quad (i = [1:w],\ j = [1:h]) \tag{8}$$

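where $w$ is the width and $h$ is the height of the image. Since the images are in 200x200 resolution, we initially reshaped the matrix and turned it into an array by vectorization:

$$x = [p_{1,1}, \ldots, p_{w,h}] \tag{9}$$

The optimizer is selected as sequential minimal optimization (SMO), which could be expressed for a dataset $[(x_1, y_1), \ldots, (x_n, y_n)]$ with $y_i \in \{-1, +1\}$ as:

$$\max_{\alpha} \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} y_i y_j K(x_i, x_j)\, \alpha_i \alpha_j \tag{10}$$

where $K(x_i, x_j)$ represents the kernel function and $\alpha_i$ and $\alpha_j$ are the Lagrange multipliers, subject to

$$0 \le \alpha_i \le C \tag{11}$$

and all sum products should be zero, namely:

$$\sum_{i=1}^{n} y_i \alpha_i = 0 \tag{12}$$

where $C$ is the SVM hyperparameter; satisfying the following Karush-Kuhn-Tucker (KKT) conditions depending on the Lagrange multiplier $\alpha_i$, so that if

$$\alpha_i = 0 \Rightarrow y_i f(x_i) \ge 1 \tag{13}$$

or

$$0 < \alpha_i < C \Rightarrow y_i f(x_i) = 1 \tag{14}$$

or finally if

$$\alpha_i = C \Rightarrow y_i f(x_i) \le 1 \tag{15}$$

where $f(x_i)$ denotes the output of the SVM for $x_i$. Given these conditions, the algorithm briefly finds a Lagrange multiplier for optimization of the pair $(\alpha_i, \alpha_j)$, repeating until convergence, when the KKT equations are satisfied.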
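The training procedure described in this section can be illustrated with a short sketch. It is not the author's implementation: it uses the simplified SMO variant (the second multiplier is picked at random rather than by the full SMO heuristics), a linear kernel, and tiny synthetic 4x4 "spectrograms" instead of the 200x200 images. The function name `train_svm_smo`, all parameter values and the toy data are assumptions made for illustration.

```python
import numpy as np


def train_svm_smo(X, y, C=1.0, tol=1e-3, max_passes=10, max_sweeps=1000, seed=0):
    """Simplified SMO for a linear-kernel SVM; returns (w, b, alpha)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    K = X @ X.T                                # linear kernel K(x_i, x_j)
    alpha, b = np.zeros(n), 0.0
    passes = sweeps = 0
    while passes < max_passes and sweeps < max_sweeps:
        sweeps += 1
        changed = 0
        for i in range(n):
            E_i = (alpha * y) @ K[:, i] + b - y[i]
            # Skip multipliers that already satisfy the KKT conditions.
            if not ((y[i] * E_i < -tol and alpha[i] < C) or
                    (y[i] * E_i > tol and alpha[i] > 0)):
                continue
            j = int(rng.integers(n - 1))       # random partner with j != i
            if j >= i:
                j += 1
            E_j = (alpha * y) @ K[:, j] + b - y[j]
            a_i, a_j = alpha[i], alpha[j]
            # Box bounds that keep both multipliers inside [0, C].
            if y[i] != y[j]:
                low, high = max(0.0, a_j - a_i), min(C, C + a_j - a_i)
            else:
                low, high = max(0.0, a_i + a_j - C), min(C, a_i + a_j)
            eta = 2 * K[i, j] - K[i, i] - K[j, j]
            if low == high or eta >= 0:
                continue
            new_j = float(np.clip(a_j - y[j] * (E_i - E_j) / eta, low, high))
            if abs(new_j - a_j) < 1e-5:
                continue
            new_i = a_i + y[i] * y[j] * (a_j - new_j)
            b1 = b - E_i - y[i] * (new_i - a_i) * K[i, i] - y[j] * (new_j - a_j) * K[i, j]
            b2 = b - E_j - y[i] * (new_i - a_i) * K[i, j] - y[j] * (new_j - a_j) * K[j, j]
            if 0 < new_i < C:
                b = b1
            elif 0 < new_j < C:
                b = b2
            else:
                b = (b1 + b2) / 2
            alpha[i], alpha[j] = new_i, new_j
            changed += 1
        passes = passes + 1 if changed == 0 else 0
    return (alpha * y) @ X, b, alpha


# Toy data (assumed): five "real" and five "fake" 4x4 grayscale images.
rng = np.random.default_rng(42)
real_base = np.zeros((4, 4))
real_base[:2] = 200                            # bright upper band
fake_base = np.zeros((4, 4))
fake_base[2:] = 200                            # bright lower band
images = [real_base + rng.integers(0, 30, (4, 4)) for _ in range(5)] + \
         [fake_base + rng.integers(0, 30, (4, 4)) for _ in range(5)]
X = np.array([img.reshape(-1) for img in images]) / 255.0   # vectorization
y = np.array([+1.0] * 5 + [-1.0] * 5)          # +1 real, -1 fake

w, b, alpha = train_svm_smo(X, y)
predictions = np.sign(X @ w + b)
```

Each pairwise update keeps the sum of `alpha * y` at zero and both multipliers inside the box `[0, C]`, which is why the two multipliers are always changed together.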
3. Experimental Results
In this section we present the two types of experiment we conducted: one with a unique signature-like handwritten name which is not so difficult to forge, and one with a public signature directly taken from a database. The results of each experiment are given in the corresponding subsection and evaluated at the end of the section.
In the first experiment, an interface with a very simple signature “Orcan” is modelled, including two more buttons for real and fake attempts to collect the data for experimental purposes. The interface and the application are designed for a Samsung Galaxy Tab S 10.5 T800 tablet with a resolution of 2560x1600; however, the interface is adaptive to other touchscreen devices, including Android smartphones. Although there is a signature on the interface for calculating accuracy, it is totally concealed from the users. Every 0.01 second after the “real” or “fake” button is pressed, the coordinates are extracted with the corresponding iteration number and time, and subsequently the matrix of coordinates is sent to the server by the “send” button. The speed of signing is calculated for each iteration and stored as an array, with the coordinate data stored as a matrix. Finally, the speed signals are transformed into grayscale spectrograms by continuous wavelet transformation to reveal the high frequency regions of the signature, and the spectrograms of real and fake signatures are directly used to train the support vector machines.
Acquiring the raw images generated by CWT leads to the training session, since a single signature sample is not enough for calibrating the classifier. Moreover, since we propose the utilization of SVM, the real signatures are trained together with fake ones created by the owner for better discrimination. Although this seems plausible for experiments with the system, it does not represent real life situations; therefore we propose one more SVM without the designated fake signatures in Section 4.
3.1. Interface
There are some devices and interfaces collecting the signatures by scanning and saving them for forensic purposes; while it is still easy to forge a signature on touchscreens. We indeed need the original touch data to be extracted for computing the displacement matrices therefore an application is written to extract the coordinates for every 0.01 second after the first touch. For research purposes, the application is not finalized and two fundamental buttons are placed to accumulate the matrices in different folders as “Fake” and “Real”. The signature could be selected from the “Picture” button and subsequently the “Send” button should be touched to submit the data into corresponding folder. The blank and signed interfaces could be seen in Figure 2.
Fig.2. The interface of the signature application for a unique signature (Blank screen on the left, Signed Screen on the right)
3.2. Extraction of the coordinates
The interface is prepared for signing by fingertips, therefore it is not so easy to trace the signature identically, as seen in Figure 2 right. The traces are converted into coordinate matrices to store for estimation of average interval speed and to analyze the matching ratios afterwards, as an additional information.
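The matching ratio mentioned here (introduced in Section 2.1) can be sketched in a few lines. This is only an illustration of the idea, namely the share of the master signature's black pixels that an attempt also hits. The function name `matching_accuracy`, the set-of-pixels representation, the rounding of touch coordinates to the pixel grid and the toy coordinates are all assumptions, not the author's code.

```python
def matching_accuracy(master_pixels, attempt_points):
    """Share of master-signature (black) pixels also hit by the attempt.

    master_pixels: iterable of (x, y) integer pixels of the master signature.
    attempt_points: iterable of (x, y) touch coordinates sampled from an attempt.
    """
    master = set(master_pixels)
    if not master:
        raise ValueError("master signature has no pixels")
    # Round sampled coordinates to the pixel grid (assumed convention).
    attempt = {(round(x), round(y)) for x, y in attempt_points}
    matched = master & attempt                 # pixels that are black in both images
    return len(matched) / len(master)


master = [(0, 0), (1, 0), (2, 0), (3, 0)]          # hypothetical 4-pixel stroke
attempt = [(0.1, 0.0), (1.2, 0.1), (2.0, 1.0)]     # two samples land on the stroke
accuracy = matching_accuracy(master, attempt)
```

In this toy case two of the four master pixels are hit, so the ratio is one half.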
Along the signing process, the main algorithm stealthily extracts the coordinates for further estimation of average interval speed by linear interpolation. The coordinate data is also stored for analyzing the differences of matching ratio between real and fraud attempts. This analysis is not vital for our research and the potential outcomes are not used anywhere but only in ROC analysis. An instance of the coordinates extracted during the process is presented in Figure 3.
Fig.3. An example of extracted coordinates of the signature sample
From Fig.3, it is obvious that the intervals between the extracted points are narrower where the signing speed is slower, owing to the constant sampling interval. In addition, the first part of the signature has a lower frequency than the following parts, which will be the key feature for this paper and further analyses.
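3.3. Estimation of the average interval speed
Since the coordinates are sampled and the signal is digitized, the average speed calculation between two extracted points is only an estimation, computed by simple linear interpolation, namely:

$$\bar{v}_k = \frac{\sqrt{(x_{k+1} - x_k)^2 + (y_{k+1} - y_k)^2}}{t_{k+1} - t_k} \tag{16}$$

where $(x_k, y_k)$ is the $k$-th extracted point and $t_k$ is its extraction time.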
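As a minimal sketch of this estimation, the speed between consecutive samples is the straight-line distance divided by the elapsed time. The function name `interval_speeds` and the `(t, x, y)` tuple layout are assumptions made for illustration; the 0.01-second spacing follows the sampling interval stated in the text.

```python
import math


def interval_speeds(samples):
    """Average speed between consecutive samples.

    samples: list of (t, x, y) tuples with strictly increasing t (seconds).
    Returns one speed value per interval (pixels per second).
    """
    speeds = []
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Straight-line (linearly interpolated) distance over elapsed time.
        speeds.append(math.hypot(x1 - x0, y1 - y0) / dt)
    return speeds


# Hypothetical samples taken every 0.01 s; the finger rests during the second interval.
samples = [(0.00, 0, 0), (0.01, 3, 4), (0.02, 3, 4), (0.03, 9, 12)]
speeds = interval_speeds(samples)
```

With a constant sampling interval the division only rescales the signal, so the shape of the speed array is determined by the point-to-point distances.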
Although the average speed between two points is only a close estimation, the results are mathematically sufficient given the constant 0.01-second sampling interval used throughout the process. The speed signal created by the interpolation is presented in Fig.4 for a real signature, where the record time is limited to 5 seconds. As expected, the first part of the signature is faster than the rest; however, the main focus for discrimination of the signals is the frequency of the speed instead. Therefore, speed signals are analyzed by continuous wavelet transforms to reveal the frequency component versus time. For the selected parameters of the bump wavelet, the transformation of the signal is presented on the right of Fig.4. Although such transformations are usually colored by a colormap for identifying the power of the frequency, we generated the surfaces without a colormap, with 128bit grayscale, for further analysis.
Fig.4 Speed signal on the left and CWT of the public signature sample on the right
Since the transformation reveals only the high frequency regions of the speed signal, not the high speed regions, the whiter areas on the spectrogram should be in the mid sections, as presented on the right of Fig.4. Moreover, the main advantage of 128 bit grayscaling is finding the high frequency regions with time localization on a scale between 0 and 255. In other words, high frequency is revealed by the white areas where the pixel values are close to the upper limit of 255. Therefore, only the raw images extracted by this process, rather than the whole diagram, are crucial for training and testing the classifier. The images are stored in their present form as single-layer grayscale images with 200x200 resolution and used for training the classifier presented in the experiments section.
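The transformation from a speed signal to a grayscale image can be sketched as follows. This is a minimal frequency-domain CWT with the bump wavelet of Eq. (7), evaluated with the FFT; it is not the author's code. The function names, the parameter values `mu=5.0` and `sigma=0.6`, the two scales and the synthetic two-tone test signal are assumptions, and the result has one row per scale rather than the 200x200 resolution used in the paper.

```python
import numpy as np


def bump_cwt(x, dt, scales, mu=5.0, sigma=0.6):
    """Magnitude of a frequency-domain CWT with a bump wavelet.

    Returns an array with one row per scale and one column per sample.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    omega = 2 * np.pi * np.fft.fftfreq(n, d=dt)    # angular frequencies (rad/s)
    x_hat = np.fft.fft(x - x.mean())
    coeffs = np.empty((len(scales), n))
    for k, a in enumerate(scales):
        u = (a * omega - mu) / sigma
        psi_hat = np.zeros(n)
        inside = np.abs(u) < 1                     # indicator function of Eq. (7)
        psi_hat[inside] = np.exp(1.0 - 1.0 / (1.0 - u[inside] ** 2))
        # The bump wavelet is real in the frequency domain, so no conjugate is needed.
        coeffs[k] = np.abs(np.fft.ifft(x_hat * np.sqrt(a) * psi_hat))
    return coeffs


def to_grayscale(coeffs):
    """Scale magnitudes to 0..255 so that the largest values appear white."""
    span = coeffs.max() - coeffs.min()
    if span == 0:
        return np.zeros(coeffs.shape, dtype=np.uint8)
    return np.round(255 * (coeffs - coeffs.min()) / span).astype(np.uint8)


dt = 0.01                                          # 0.01 s sampling interval
t = np.arange(500) * dt                            # 5 s record
# Synthetic stand-in for a speed signal: 5 Hz first, 20 Hz afterwards.
signal = np.where(t < 2.5, np.sin(2 * np.pi * 5 * t), np.sin(2 * np.pi * 20 * t))
scales = 5.0 / (2 * np.pi * np.array([5.0, 20.0])) # scales centred on 5 Hz and 20 Hz
image = to_grayscale(bump_cwt(signal, dt, scales))
```

The row tuned to 5 Hz is brighter in the first half of the record and the row tuned to 20 Hz in the second half, which is the kind of time-localized frequency information the classifier is trained on.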
3.4. Training Session
The arrays described in (9) are computed for five real and five fake signature spectrograms for training by SVM, with the real and fake signing styles determined by the owner of the signature. The fake signatures of the training set designated by the user do not represent reality; therefore we wrote an additional algorithm to shuffle the speed signal. In this randomization process, the cells of the speed array created by the user are automatically shuffled without changing the amplitudes, but with the amplitudes mapped to random t values. For each real attempt in the training session, the corresponding random attempt is saved to create a distinctive grayscale wavelet as a fake attempt. The training set is presented in Table 1.

Table 1. Training set with random fake signatures (spectrogram images not reproduced)
REAL   +1   +1   +1   +1   +1
FAKE   -1   -1   -1   -1   -1
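The randomization described above, which keeps the amplitudes but maps them to random time positions, amounts to a permutation of the speed array. The sketch below illustrates this under that reading; the function name `shuffled_fake`, the `seed` parameter and the sample values are assumptions.

```python
import random


def shuffled_fake(speed, seed=None):
    """Return a copy of the speed array with its samples randomly reordered.

    The amplitudes are kept; only their positions in time change.
    """
    rng = random.Random(seed)
    fake = list(speed)          # copy, so the real attempt is left untouched
    rng.shuffle(fake)
    return fake


real = [0.0, 120.5, 310.2, 95.0, 40.3, 0.0]   # hypothetical speed samples
fake = shuffled_fake(real, seed=1)
```

Because the permutation destroys the temporal ordering, the resulting spectrogram no longer shows the time-localized frequency pattern of the real signing style.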
As mentioned above, the outputs of SVM are defined as +1 for real and -1 for fake signature samples. The first row contains the raw spectrogram images of the real signing style, which are seemingly consistent with each other in the higher frequency regions. On the contrary, the third row, consisting of fake signature spectrograms, contains randomized spectrograms for better discrimination in the training phase. Moreover, it is still necessary to validate the performance of the classifier trained by SVM using some random images from the real and fake sets, which will be introduced in the Results section.
3.5. Experimental Results
In this section, a fraud team consisting of 12 users tried to crack the signature and subsequently a total of 120 fake spectrograms are saved and analyzed as well as 120 real attempts. The fraud team is not informed about the kernel of the classifier; yet obeying the signature sample, they only tried to forge the signature. Since the standardization of the wavelets is also so crucial, the time limit is defined as five seconds per trial for this signature and the data outside of this interval is omitted. On the other hand, the accuracy is calculated per trial to achieve the receiver operating characteristics for indication of the classification performance. The results are presented in Fig.5.
Fig.5 Results Diagram (Dashed curve: Fake attempts, Dashed triangles: FP; Normal curve: Real attempts, Normal triangle: FN; curves belong to left, triangles belong to right y-axis)
In the diagram, fake attempts are represented by dotted blue lines and real attempts by normal blue lines; both correspond to the left y-axis and are sorted in ascending order. The red triangles represent erroneous classifications, where the dotted triangles show false positives and the normal triangles show false negatives. Given the results, only one real attempt is classified as fake, while three fake attempts are mistakenly accepted. Therefore, the error rates are easily computed as FAR = 3/120 = 2.5% and FRR = 1/120 = 0.83%, where FAR is the false acceptance rate and FRR is the false rejection rate, which could also be stated as FPR (false positive rate) and FNR (false negative rate) respectively.
Since the classification is binary and not dependent on thresholding or similar, the first performance criterion to be considered is the sensitivity, or true positive rate (TPR), which is derived from true positives (TP) and false negatives (FN) as TPR = TP/(TP + FN). The second basic indicator is the specificity, or true negative rate (TNR), which is found by dividing true negatives (TN) by all negative trials including false positives (FP): TNR = TN/(TN + FP). Therefore the classification accuracy is calculated as (TP + TN)/(TP + TN + FP + FN). Moreover, the main performance criterion of biometric systems is the ROC curve, usually drawn by altering thresholds for non-binary outcomes; however, this is not possible for this research. Therefore there seems to be only one way to generate the ROC curve for finding the relation between TPR and FPR, which is to declare cutpoints relevant to the classification system. Considering our parameters and outcomes, the only quantity available for determining the cutpoints is the matching accuracy. The confusion matrix is divided into intervals according to the accuracy, and the resulting matrix is given in Table 2.
Table 2. Confusion matrix with cutpoints for the second experiment

Cutpoint   n    TP   TN   FP   FN   TPR     FPR
           12   0    12   0    0    0       0
           30   1    28   1    0    1       0.0345
           57   10   47   0    0    1       0
           72   41   28   2    1    0.976   0.0667
           43   40   1    1    1    0.975   0.5
           25   25   0    0    0    1       0
           1    1    0    0    0    1       0
where FPR is calculated by FPR = FP/(FP + TN). According to Table 2, taken as the reference for the cutpoints and the true and false positive rates, the ROC curve is computed with the area under the curve (AUC), as well as the detection error trade-off (DET) curve to determine the EER. Consequently, 120 fake and 120 real attempts are classified, and the achieved results are similar to those of the first experiment and could be summarized as 1.67% FNR and 3.33% FPR. Since the main indicators of the performance of biometric systems are ROC and DET curves, the results of both analyses are presented in Figure 6.
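The rates quoted above can be checked directly from the counts in Table 2. The short sketch below recomputes the per-row TPR and FPR and the overall rates from the table rows as extracted; the helper name `rate` and the convention of returning 0 for an empty denominator (as the first row of the table appears to do) are assumptions.

```python
# Rows of Table 2: (n, TP, TN, FP, FN)
rows = [
    (12, 0, 12, 0, 0),
    (30, 1, 28, 1, 0),
    (57, 10, 47, 0, 0),
    (72, 41, 28, 2, 1),
    (43, 40, 1, 1, 1),
    (25, 25, 0, 0, 0),
    (1, 1, 0, 0, 0),
]


def rate(numerator, denominator):
    """Return numerator/denominator, or 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


# Per-interval rates: TPR = TP/(TP+FN), FPR = FP/(FP+TN).
per_row = [(rate(tp, tp + fn), rate(fp, fp + tn)) for _, tp, tn, fp, fn in rows]

# Overall counts across all intervals.
tp, tn, fp, fn = (sum(r[k] for r in rows) for k in (1, 2, 3, 4))
tpr = rate(tp, tp + fn)            # sensitivity
tnr = rate(tn, tn + fp)            # specificity
fnr = rate(fn, fn + tp)            # false negative rate (FRR)
fpr = rate(fp, fp + tn)            # false positive rate (FAR)
accuracy = rate(tp + tn, tp + tn + fp + fn)
```

Summed over the rows, the counts give 2 false negatives and 4 false positives out of 120 real and 120 fake attempts, which corresponds to the 1.67% FNR and 3.33% FPR stated in the text.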
Fig.6 ROC curve on the left and DET curve on the right, for the second experiment
The main expectation from a classifier is a ROC curve far from the imaginary diagonal, shown as the dotted line on the left of Figure 6, which is perfectly satisfied, as well as an AUC very close to 1. On the other hand, the equal error rate cannot be calculated from FAR and FRR due to the lack of thresholding; however, it is still possible to estimate it by finding the intersection point of the DET curve and the imaginary line on the right of Figure 6. The equal error rate estimated in this way is one of the lowest among the results presented earlier in the literature. Unlike in biometric classification by thresholding, the DET curve does not give any crucial information in this case; yet the EER could be found on the graph. The EER could have been interpolated by finding the intersection point of the imaginary line and a regular ROC curve with axes of 1 - FRR against FAR; however, the ROC curve we generated is strictly dependent on the cutpoints we primarily defined, and this approximation is not possible in our case.
4. Testing System Validity
Biometric systems strictly need features extracted by a dedicated data collection structure as inputs. For this research, in the best case the signature owner should know about the frequency feature and sign on the interface very consistently. However, this is a novel trait that we put forward for the first time, and therefore most signature databases are not really suitable. The first and major prerequisite is the consistency of the training set: the users should sign on the interface triggering similar frequencies at similar times while the interface is collecting the touch points. The second requirement is a constant sampling time interval with plausible coordinate extraction; yet as long as the sampling frequencies in the training and testing sets are similar, this is not strictly mandatory. Given these assumptions, two more experiments are conducted using the eminent datasets SVC2004 (Yeung, et al., 2004) and SUSIG (Kholmatov & Yanıkoglu, 2009) with protocols identical to those for the signature presented above.
4.1. SVC2004 Dataset
The SVC2004 dataset consists of 20 genuine signatures with 20 fake ones per signature, therefore we took random samples from each set to form the training set of SVM. The instance of the signature we used through this experiment is shown in Figure 7.
Fig.7. An example of extracted coordinates of the signature sample (SVC2004)
Despite what is so common in online signature verification papers presenting only the results, we decided to show the spectrograms and the obvious differences between the genuine and fake signatures instead. The real signatures are thoroughly consistent given the frequency vs time representation; while the fake signatures have more intensity and visible lag. The second difference is length of arrays since the genuine signatures have shorter arrays but considerably higher speed, which could be seen in Table 3.
Table 3. Training set for SVC2004 experiment (spectrogram images not reproduced)
REAL   +1   +1   +1   +1   +1
FAKE   -1   -1   -1   -1   -1
The trained network is tested with 15 more real and fake signatures, and the results are interesting. Deliberately or not, all signatures in the set are clearly distinguishable; even within the fake signature set, all samples look very similar. Since there is no cross-check between the global or local dynamics of the signatures, and all geometrical similarities and dissimilarities are neglected, we achieved 0% EER. This means that all of the samples are perfectly classified, owing to the very consistent real and fake signature sets. A closer look at the speed arrays of the provided data reveals that all fake signatures have faster styles and that their high-frequency regions arise significantly later than in the real signatures. This discrimination is possible only if the array size of the inputs is identical and does not change with the duration of the signing process. The same procedure is then applied to the other four signature samples by assigning the set members to the training set individually. In total, we analyzed 75 genuine and fake signatures of 5 samples in the dataset and achieved FAR=4/75=5.3% and FRR=2/75=2.7%. These rates look somewhat higher; however, this dataset was not designed for checking the higher frequency regions of the signatures, and some signature samples are not consistent even within the real signature sets.
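The two rates quoted above follow directly from the error counts. The short sketch below reproduces the arithmetic, assuming 75 fake and 75 genuine test attempts (15 per class for each of the 5 samples), which is one reading of the text; the function name `error_rates` is hypothetical.

```python
def error_rates(false_accepts, false_rejects, n_fake, n_genuine):
    """FAR: share of fake attempts accepted; FRR: share of genuine attempts rejected."""
    far = false_accepts / n_fake
    frr = false_rejects / n_genuine
    return far, frr

far, frr = error_rates(false_accepts=4, false_rejects=2, n_fake=75, n_genuine=75)
print(f"FAR = {far:.1%}, FRR = {frr:.1%}")  # FAR = 5.3%, FRR = 2.7%
```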
4.2. SUSIG Dataset
The second dataset we used to check the validity of CWT and SVM is the SUSIG database, which consists of 8 real and 10 fake signatures per sample. When we turned the data arrays provided for one signature sample, presented in Figure 8, into speed signals, very obvious similarities appeared between the real and fake signature sets.
Fig.8. An example of extracted coordinates of the signature sample (SUSIG)
Although this database does not seem suitable for frequency investigation, we took five samples from each set for the training session. We first turned the speed signals into spectrograms by CWT and trained the SVM with these spectrograms, as described before. However, the real signatures are not as consistent as they are in the SVC2004 dataset, and there is sometimes no difference between the frequency vs. time representations of fake and real signatures. The training set is shown in Table 4.
Table 4. Training set for SUSIG experiment
Class   SVM labels of the five training spectrograms
REAL    +1   +1   +1   +1   +1
FAKE    -1   -1   -1   -1   -1
Despite the very low consistency of the real-signature spectrograms in the training set, our system classified all fakes correctly; however, one real signature was erroneously classified as fake.
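The training step with +1 (real) and -1 (fake) labels can be illustrated with a small stand-in classifier. The sketch below is a linear soft-margin SVM fitted by full-batch sub-gradient descent on the hinge loss; it is not the solver or kernel used in the experiments, which this section does not specify. The four-value arrays stand for flattened spectrograms and are invented for illustration, as are the function names and hyperparameters.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=300):
    """Full-batch sub-gradient descent on the regularized hinge loss."""
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # sample lies inside the margin or is misclassified
                for j in range(dim):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Return +1 (real) or -1 (fake) from the sign of the decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Hypothetical flattened spectrograms: real ones are bright early, fake ones late.
real = [[0.9, 0.8, 0.1, 0.2], [0.8, 0.9, 0.2, 0.1], [1.0, 0.7, 0.1, 0.1],
        [0.9, 0.9, 0.2, 0.2], [0.8, 0.8, 0.1, 0.2]]
fake = [[0.2, 0.1, 0.9, 0.8], [0.1, 0.2, 0.8, 0.9], [0.1, 0.1, 0.7, 1.0],
        [0.2, 0.2, 0.9, 0.9], [0.2, 0.1, 0.8, 0.8]]
X = real + fake
y = [1] * len(real) + [-1] * len(fake)
w, b = train_linear_svm(X, y)
print(predict(w, b, [0.85, 0.9, 0.15, 0.1]), predict(w, b, [0.1, 0.15, 0.9, 0.85]))  # 1 -1
```

When the real and fake spectrograms overlap, as reported for SUSIG, no such clean separation exists and some attempts end up on the wrong side of the decision boundary.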
5. Conclusion & Discussion
We mainly dealt with the frequency component of individuals' signing styles, extracted by CWT, which creates grayscale spectrograms. An SVM is trained with these spectrograms for each experiment based on a unique signature. All analyses are done online once the user has signed on the interface after the training session, since all features are extracted during the signing process. We repeated our methodology for signatures taken from two well-known datasets, SUSIG and SVC2004. These experiments show that the SUSIG dataset is not well suited to extracting speed signals and deriving spectrograms by CWT, whereas the real signatures in SVC2004 are more consistent in time and frequency localization. Table 5 lists the results of papers dealing with online signature verification on SVC2004 to give a brief insight.
Table 5. Comparison with the papers that used SVC2004
Authors                  Method                  FAR     FRR     EER
This Paper               Wavelet+SVM             5.3%    2.7%    -
Kar et al. (2018)        SVM                     -       -       1%
Song et al. (2017)       Dynamic Time Warping    -       -       2.89%
Liu et al. (2015)        Sparse Representation   -       -       3.98%
Cpalka et al. (2014)     Neuro-fuzzy             -       -       10.7%
Radmehr et al. (2011)    SVM                     92%     10%     -
Reza et al. (2011)       Neural Networks         0.5%    0.3%    -
Gruber et al. (2010)     SVM                     4.13%   5.5%    -
The main drawback is the mandatory size of the training set and the need for both real and fake signatures in the training set so that the classifier can differentiate better. The major strength of the system is its use of frequency information that is totally independent of the signature itself, while a weakness could be the simplicity of the signatures used in this paper. As the complexity of the signature increases, the frequencies would be more precise and thus harder to mimic. Therefore, this system could be used with more complex signatures to lower the FAR and EER, though the results are already satisfactory for a very simple, public, and unique signature.
Short-time Fourier transformation is also an alternative; yet it would give very similar results for the frequency vs. time spectrograms derived from speed signals. Since we compare the spectrograms as grayscale images, it could be possible to change the kernel of the classifier while comparing the images. All signatures were recorded by users signing on a 10.5 inch touchscreen monitor with their fingers; repeating the experiments with smaller screens and digital pens could be a subject of future research.
Acknowledgement
The work and the contribution were supported by the project “Smart Solutions in Ubiquitous Computing Environments”, Grant Agency of Excellence, University of Hradec Kralove, Faculty of Informatics and Management.
References
Alpar, O. (2017). Frequency spectrograms for biometric keystroke authentication using neural network based classifier. Knowledge-Based Systems, 116, 163-171.
Alpar, O., & Krejcar, O. (2016). Hidden Frequency Feature in Electronic Signatures. In International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems (pp. 145-156). Morioka: Springer International Publishing.
Alpar, O., & Krejcar, O. (2017). Online signature verification by spectrogram analysis. Applied Intelligence, doi:10.1007/s10489-017-1009-x.
Bhateja, A. K., Chaudhury, S., & Saxena, P. K. (2014). A Robust Online Signature Based Cryptosystem. Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on, IEEE (pp. 7984).
Cpałka, K., & Zalasiński, M. (2014). On-line signature verification using vertical signature partitioning. Expert Systems with Applications, 41(9), 4170-4180.
Cpałka, K., Zalasiński, M., & Rutkowski, L. (2016). A new algorithm for identity verification based on the analysis of a handwritten dynamic signature. Applied Soft Computing, 43, 47-56.
Diaz, M., Fischer, A., Ferrer, M. A., & Plamondon, R. (2016). Dynamic signature verification system based on one real signature. IEEE Transactions on Cybernetics, doi:10.1109/TCYB.2016.2630419.
Eskander, G. S., Sabourin, R., & Granger, E. (2014). A bio-cryptographic system based on offline signature images. Information Sciences, 259, 170-191.
Forhad, N., Poon, B., Amin, M. A., & Yan, H. (2015). Online Signature Verification for Multi-modal Authentication using Smart Phone. Proceedings of the International MultiConference of Engineers and Computer Scientists.
Gruber, C., Gruber, T., Krinninger, S., & Sick, B. (2010). Online signature verification with support vector machines based on LCSS kernel functions. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 40(4), 1088-1100.
Guru, D. S., Manjunatha, K. S., Manjunath, S., & Somashekara, M. T. (2017). Interval valued symbolic representation of writer dependent features for online signature verification. Expert Systems with Applications, 80, 232-243.
Hafemann, L. G., Sabourin, R., & Oliveira, L. S. (2017). Learning features for offline handwritten signature verification using deep convolutional neural networks. Pattern Recognition, 70, 163-176.
Hamadene, A., & Chibani, Y. (2016). One-Class Writer-Independent Offline Signature Verification Using Feature Dissimilarity Thresholding. IEEE Transactions on Information Forensics and Security, 11(6), 1226-1238.
Kar, B., Mukherjee, A., & Dutta, P. K. (2018). Stroke Point Warping-Based Reference Selection and Verification of Online Signature. IEEE Transactions on Instrumentation and Measurement, 67(1), 2-11.
Kholmatov, A., & Yanıkoglu, B. (2009). SUSIG: an on-line signature database, associated protocols and benchmark results. Pattern Analysis & Applications, 12(3), 227-236.
Kumar, S., & BH, V. P. (2015). Embedded Platform For Online Signature Verification. IJSEAT, 3(4), 126-131.
Liu, Y., Yang, Z., & Yang, L. (2015). Online signature verification based on DCT and sparse representation. IEEE Transactions on Cybernetics, 45(11), 2498-2511.
Lopez-Garcia, M., Ramos-Lara, R., Miguel-Hurtado, O., & Canto-Navarro, E. (2014). Embedded System for Biometric Online Signature Verification. Industrial Informatics, IEEE Transactions on, 10(1), 491-501.
Manjunatha, K. S., Manjunath, S., Guru, D. S., & Somashekara, M. T. (2016). Online signature verification based on writer dependent features and classifiers. Pattern Recognition Letters, 80, 129-136.
Nebti, S., & Boukerram, A. (2013). Handwritten characters recognition based on nature-inspired computing and neuro-evolution. Applied Intelligence, 38(2), 146-159.
Pal, S., Alireza, A., Pal, U., & Blumenstein, M. (2011). Off-line signature identification using background and foreground information. International Conference on Digital Image Computing Techniques and Applications (DICTA) (pp. 672-677).
Pal, S., Pal, U., & Blumenstein, M. (2012). Off-line English and Chinese signature identification using foreground and background features. Neural Networks (IJCNN), The 2012 International Joint Conference on (pp. 1-7). IEEE.
Parodi, M., & Gómez, J. C. (2014). Legendre polynomials based feature extraction for online signature verification. Consistency analysis of feature combinations. Pattern Recognition, 47(1), 128-140.
Plamondon, R., Pirlo, G., & Impedovo, D. (2014). Online signature verification. In Handbook of Document Image Processing and Recognition (pp. 917-947). Springer London.
Radmehr, M., Anisheh, S. M., Nikpour, M., & Yaseri, A. (2011). Designing an offline method for signature recognition. World Applied Sciences Journal, 13(3), 438-443.
Reza, A. G., Lim, H., & Alam, M. J. (2011). An efficient online signature verification scheme using dynamic programming of string matching. In International Conference on Hybrid Information Technology (pp. 590-597). Berlin, Heidelberg: Springer.
Rua, E. A., Maiorana, E., Castro, J. L., & Campisi, P. (2012). Biometric template protection using universal background models: An application to online signature. IEEE Transactions on Information Forensics and Security, 7(1), 269-282.
Serdouk, Y., Nemmour, H., & Chibani, Y. (2016). New off-line handwritten signature verification method based on artificial immune recognition system. Expert Systems with Applications, 51, 186-194.
Sharma, A., & Sundaram, S. (2017). A novel online signature verification system based on GMM features in a DTW framework. IEEE Transactions on Information Forensics and Security, 705-718.
Song, X., Xia, X., & Luan, F. (2017). Online signature verification based on stable features extracted dynamically. IEEE Transactions on Systems, Man, and Cybernetics: Systems, doi:10.1109/TSMC.2016.2597240.
Suryawanshi, R., Kale, S., Pawar, R., Kadam, S., & Ghule, V. R. (2016). Offline signature cognition and verification using artificial neural network. International Journal of Advanced Research in Computer and Communication Engineering, 5(3), 352-354.
Tang, L., Fang, Y., Wu, Q., Kang, W., & Zhao, J. (2016). Online Finger-Writing Signature Verification on Mobile Device for Local Authentication. Chinese Conference on Biometric Recognition (pp. 409-416). Springer International Publishing.
Thumwarin, P., Pernwong, J., & Matsuura, T. (2013). FIR signature verification system characterizing dynamics of handwriting features. EURASIP Journal on Advances in Signal Processing, 183-198.
Tolosana, R., Vera-Rodriguez, R., Fierrez, J., & Ortega-Garcia, J. (2015). Feature-based dynamic signature verification under forensic scenarios. Biometrics and Forensics (IWBF), 2015 International Workshop on, IEEE (pp. 1-6).
Wang, N. (2009). Signature Identification Based on Pixel Distribution Probability and Mean Similarity Measure with Concentric Circle Segmentation. In Computer Sciences and Convergence Information Technology, 2009. ICCIT'09. Fourth International Conference on (pp. 1535-1538).
Yeung, D. Y., Chang, H., Xiong, Y., George, S., Kashi, R., Matsumoto, T., & Rigoll, G. (2004). SVC2004: First international signature verification competition. In Biometric Authentication (pp. 16-22). Berlin, Heidelberg: Springer.
Zalasiński, M. (2016). New algorithm for on-line signature verification using characteristic global features. In Information Systems Architecture and Technology: Proceedings of 36th International Conference on Information Systems Architecture and Technology–ISAT 2015–Part IV (pp. 137-146). Springer International Publishing.