Journal of Information Security and Applications 40 (2018) 63–77
A continuous combination of security & forensics for mobile devices

Soumik Mondal∗, Patrick Bours
Department of Information Security and Communication Technology, Norwegian University of Science and Technology, Norway
Keywords: Continuous authentication; Continuous identification; Pairwise user coupling; Behavioral biometrics; Mobile devices; Security and forensics
Abstract

In this research, we introduce the concept of adversary identification in combination with continuous authentication. To protect a system from session hijacking, it is important not only to use traditional access control at the beginning of a session, but also to continuously monitor, throughout the session, whether the present user is still the legitimate user. In case an impostor is detected, the system should lock to avoid loss or disclosure of personal or confidential information. In many cases, it will be important not only to secure the system but also to establish the identity of the impostor, which could be seen as a deterrence measure or could be used as evidence in court. This concept has not been introduced in this manner before, and it combines security and forensics continuously. We have performed a closed-set and an open-set experiment to validate our proof of concept with two different publicly available mobile biometrics datasets. Depending upon the dataset and the settings used, we show that the adversary can be correctly identified in 82.2% to 97.9% of the attack cases for the closed-set experiment, while for the open-set experiment this performance ranges from 73.8% to 77.6%.

© 2018 Elsevier Ltd. All rights reserved.
∗ Corresponding author. E-mail addresses: [email protected] (S. Mondal), [email protected] (P. Bours).
https://doi.org/10.1016/j.jisa.2018.03.001

1. Introduction

Technological advances in mobile devices make people more dependent on such devices day by day. Nowadays mobile devices (i.e. smartphones or tablets) can be treated as pocket computing devices, and people use them for financial transactions, email, health monitoring and social networking. Therefore these devices contain highly sensitive and private information. Securing these devices from illegitimate access to such information is one of the primary concerns for the security research community. State-of-the-art access control on mobile devices is implemented as a one-time proof of identity (i.e. password, pattern lock, face or fingerprint) during the initial login procedure, where the legitimacy of the user is assumed to remain the same during the full session [7,8,10,19,28]. Unfortunately, if the device is left unlocked and unattended, any person can have access to the same information as the legitimate user. This type of access control is referred to as static authentication or static login. On the other hand, we have continuous authentication, where the genuineness of a user is continuously monitored, based on the biometric signature present on the device, to protect the device from session hijacking. When doubt arises about the authenticity of the current user, the system can lock, and the user has to revert to the static authentication access control mechanism to be allowed to continue working. Continuous authentication is not an alternative security solution to static authentication; it provides an added security measure alongside static access control.

In most cases, it is important to detect if the current user is the legitimate user or an impostor, but in some cases it could also be interesting to establish the identity of the impostor when detected. An example where identification can be useful is an online user forum. Here it could be used to identify a person posting anonymous yet offensive or criminal comments, or posting comments under the name of someone else, e.g. after getting access to the account of the other person. In our research, we are not only looking at Continuous Authentication (CA), where the system checks if the current user is the genuine user or not, but also at Continuous Identification (CI), where the system tries to establish the identity of the current user if he or she is known to the system. During our research we address two questions:

• CA: Is an impostor currently using the system?
• CI: If an impostor is currently using the system (detected by the CA system), then who is this impostor?

The primary motivation of this research is to unveil the identity of an impostor once the system has detected that an impostor is using the system. Fig. 1 shows the system pipeline of our proposed architecture. It is divided into two major subsystems (see Fig. 1 with the dotted lines), i.e. the Continuous Authentication System (CAS) and the Continuous Identification System (CIS). In Fig. 1, after successfully providing the credentials for the static login, the user is accepted as genuine and obtains the permission to use the
Fig. 1. Block diagram representation of our proposed system.
device. During the usage of the device, the behavioral dynamics of each and every activity (i.e. swipe action or key tapping) performed by the user are compared with the stored profile by the CAS, which returns the current system Trust in the genuineness of the user. The Trust values are used in the decision module, where they are compared with a predefined threshold (Tlockout) to determine whether the user can continue to use the device or, if the trust is too low (i.e. our system feels that the device is operated by an impostor), the device will be locked. After detecting that the present user is an impostor (i.e. Trust < Tlockout), the actions performed before detection are presented to the CIS to establish the identity of that impostor.
behavior) [13,27], but most of them have used swipe gesture behavior for CA [5,12,14,15,28,29,36]. We can also find the combination of tapping and swipe gestures for CA [18]. Except for [12], all the other experiments were conducted in a controlled setting. We can also find the use of face biometrics for CA on mobile devices [9,35].

All of the above research was implemented as periodic authentication, where the system re-authenticates the user after a block of a predefined fixed number of activities or a pre-set fixed time interval. Therefore, even if the system achieves 0% EER, the impostor always gets a certain amount of activity or time to cause damage on the device. In our research, we will focus on actual CA, where each action is immediately taken into consideration to determine if the current user is genuine or not. Our contribution in this paper is not so much in CA but rather in CI, where we are more interested in identifying the current user after he/she is locked out by the CA system. Since CI is introduced here for the first time, we are unable to find related research in this domain.

2.2. Classifier(s)

We have used a machine learning based approach during the analysis. More precisely, we have used Artificial Neural Network (ANN), Counter-Propagation ANN (CPANN) and Support Vector Machine (SVM) classifiers in our research. We have also used Multi-Classifier Fusion (MCF) with a score fusion technique to obtain better performance than with a single classifier [17].

2.3. Data description

In our research, we have used two publicly available datasets for the analysis [2,14]. A detailed description of these datasets is given below.

2.3.1. Dataset - 1

A client-server application was deployed to eight different Android mobile devices (screen resolutions ranging from 320 × 480 to 1080 × 1205) for data collection, and swipe gesture data was collected from 71 volunteers (56 male and 15 female, with ages ranging from 19 to 47). Each volunteer provided an average of 202 swipe actions. To the best of our knowledge, this dataset contains the largest number of users compared to other publicly
Table 1
Summary of related CA research on mobile devices.

Ref.   Method                                                      # Users   Performance
[5]    Support Vector Machine (SVM)                                10        Equal Error Rate (EER) < 1%
[12]   k-Nearest Neighbors (k-NN) and Dynamic Time Warping (DTW)   23        Accuracy of 90%
[13]   Sliding window and 3 machine learning approaches            40        False Accept Rate (FAR) of 3.8% and False Reject Rate (FRR) of 2.8%
[14]   k-NN and SVM                                                41        EER of 3%
[15]   Distance metric                                             41        EER of 22.5%
[18]   SVM                                                         28        Accuracy of 79.74%–95.78%
[27]   Artificial Neural Network (ANN) and k-NN                    40        EER of 3.3%
[28]   Distance metric and 8 machine learning approaches           190       EER of 13.8%–33.2%
[29]   One-class SVM                                               51        FAR of 7.52% and FRR of 5.47%
[36]   Correlation Distance                                        30        EER of 2.62%
available datasets [2]. The data was collected in four different sessions with two different tasks. One task was reading an article and answering some questions about it, and the other task was surfing an image gallery. Every swipe action consists of the coordinates of the swipe position, time-stamp, orientation of the phone, finger pressure on the screen, and the area covered by the finger during the swipe as raw data. From this raw data, several distinct features were calculated for each swipe action; Antal et al. [2] describe the details of these features. The extracted features are:

1. Action duration: The total time taken to complete the action (i.e. in milliseconds);
2. Begin X: X-coordinate of the action starting point;
3. Begin Y: Y-coordinate of the action starting point;
4. End X: X-coordinate of the action end point;
5. End Y: Y-coordinate of the action end point;
6. Distance end-to-end: Euclidean distance between action starting point and end point;
7. Movement variability: The average Euclidean distance between points belonging to the action trajectory and the straight line between action starting point and end point;
8. Orientation: Orientation of the action (i.e. horizontal or vertical);
9. Direction: Slope between action starting point and end point;
10. Maximum deviation from action: The maximum Euclidean distance between points belonging to the action trajectory and the straight line between action starting point and end point;
11. Mean direction: The average slope of the points belonging to the action trajectory;
12. Length of the action: The total length of the action;
13. Mean velocity: The mean velocity of the action;
14. Mid action pressure: The pressure calculated at the midpoint of the action;
15. Mid action area: The area covered by the finger at the midpoint of the action.
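To make the feature definitions above concrete, the following sketch computes a few of the listed features from the raw samples of one swipe. This is our illustration, not the authors' code; the function name, argument layout and units (pixels, milliseconds) are assumptions.

```python
import numpy as np

def swipe_features(x, y, t, pressure, area):
    """Compute a handful of the listed swipe features from raw samples.

    x, y : screen coordinates of the touch points
    t    : timestamps in milliseconds
    pressure, area : per-point finger pressure / covered screen area
    """
    x, y, t = np.asarray(x, float), np.asarray(y, float), np.asarray(t, float)
    dx, dy = x[-1] - x[0], y[-1] - y[0]
    line_len = np.hypot(dx, dy)
    # Perpendicular distance of every trajectory point to the begin->end line
    dev = np.abs(dy * (x - x[0]) - dx * (y - y[0])) / max(line_len, 1e-9)
    seg = np.hypot(np.diff(x), np.diff(y))          # per-segment lengths
    mid = len(x) // 2
    return {
        "action_duration": t[-1] - t[0],                        # ms
        "begin_x": x[0], "begin_y": y[0],
        "end_x": x[-1], "end_y": y[-1],
        "distance_end_to_end": line_len,
        "movement_variability": dev.mean(),
        "direction": np.arctan2(dy, dx),                        # slope as angle
        "max_deviation": dev.max(),
        "length": seg.sum(),
        "mean_velocity": seg.sum() / max(t[-1] - t[0], 1e-9),   # px/ms
        "mid_pressure": pressure[mid],
        "mid_area": area[mid],
    }
```

For a perfectly straight swipe, movement variability and maximum deviation are both zero, and the trajectory length equals the end-to-end distance.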
2.3.2. Dataset - 2

During the data collection process for this dataset, a custom application was deployed to five different Android mobile devices, and swipe gestures were collected from 41 volunteers, with five to seven sessions of data per participant. Each volunteer provided an average of 516 swipe actions. To our knowledge, this is a benchmark dataset for mobile-based CA research, with a significant amount of data collected from each participant [14]. The data were collected for seven different tasks, i.e. reading four different Wikipedia articles and playing three different image comparison games. As for the previous dataset, similar raw data were stored during the experiment, and similar features, with some additional ones, were calculated during the analysis [14]. The extracted features are:
1. Inter-stroke time: Time between two consecutive strokes;
2. Stroke duration: The total time taken to complete the action or stroke;
3. Start X: X-coordinate of the stroke starting point;
4. Start Y: Y-coordinate of the stroke starting point;
5. Stop X: X-coordinate of the stroke end point;
6. Stop Y: Y-coordinate of the stroke end point;
7. Direct end-to-end distance: Euclidean distance between stroke starting point and end point;
8. Mean resultant length: This represents how directed the stroke is;
9. Up/down/left/right flag: Orientation of the stroke (i.e. horizontal, vertical, up or down);
10. Direction of end-to-end line: Slope between stroke starting point and end point;
11. 20% pairwise velocity: 20th percentile of the stroke velocity;
12. 50% pairwise velocity: 50th percentile of the stroke velocity;
13. 80% pairwise velocity: 80th percentile of the stroke velocity;
14. 20% pairwise acceleration: 20th percentile of the stroke acceleration;
15. 50% pairwise acceleration: 50th percentile of the stroke acceleration;
16. 80% pairwise acceleration: 80th percentile of the stroke acceleration;
17. Median velocity at last 3 points: This represents the velocity just before the stroke stops;
18. Largest deviation from end-to-end line: The maximum Euclidean distance between points belonging to the stroke trajectory and the straight line between starting point and end point;
19. 20% deviation from end-to-end line: 20th percentile of the stroke deviation;
20. 50% deviation from end-to-end line: 50th percentile of the stroke deviation;
21. 80% deviation from end-to-end line: 80th percentile of the stroke deviation;
22. Average direction: The average slope of the points belonging to the stroke trajectory;
23. Length of trajectory: The total length of the stroke;
24. Ratio of end-to-end distance and length of trajectory: Self-explanatory;
25. Average velocity: The mean velocity of the stroke;
26. Median acceleration for first 5 points: Self-explanatory;
27. Mid-stroke pressure: The pressure calculated at the midpoint of the stroke;
28. Mid-stroke area covered: The area covered by the finger at the midpoint of the stroke;
29. Mid-stroke finger orientation: Self-explanatory;
30. Change of finger orientation: Self-explanatory;
31. Phone orientation: Self-explanatory.
[Figure 2: six CDF panels (F(X) versus X) for the features Stroke duration, Stop x, Stop y, Stroke direction, Largest deviation and Mid-stroke area.]

Fig. 2. CDF plot of the selected features for user 8 of Dataset - 1 for CAS. The blue CDFs are generated from the data of user 8 and the red CDFs are generated from the data of the impostors for this user. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
2.4. Feature selection

Before building the classifier models, we first apply a feature selection technique for both CAS and CIS. Our proposed feature selection process is based on maximizing the separation between two multivariate cumulative distributions. Let F = {1, 2, 3, . . . , m} be the total feature set, where m is the number of feature attributes. The feature subset A ⊆ F is obtained by maximizing

FS = sup |MVCDF(x_i^A) − MVCDF(x_j^A)|,

using a Genetic Algorithm as the feature subset search technique, where MVCDF(·) is the Multivariate Cumulative Distribution Function, x_i^A is the feature subset data of the ith user, and x_j^A is the feature subset data of the impostor user(s). More details about this feature selection can be found in [22]. Fig. 2 shows the Cumulative Distribution Function (CDF) of the selected features for user 8 of Dataset - 1 for the CAS module. The significant separation between the two CDFs (i.e. the blue CDF is for the genuine user 8 and the red CDF is for the impostors) characterizes the selected features.

We also tested the feature selection technique proposed by Ververidis et al. [33] on both datasets. We found that the above feature selection method worked better for Dataset-1 and the technique proposed by Ververidis et al. worked better for Dataset-2. We decided to apply the feature selection method per dataset according to this observation.
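As a rough illustration of the idea, the sketch below scores features by the separation of their genuine and impostor empirical CDFs. This is our simplification, not the method of [22]: it uses per-feature (univariate) CDFs instead of the full multivariate CDF, and a greedy ranking instead of the Genetic Algorithm search.

```python
import numpy as np

def cdf_separation(genuine, impostor):
    """Kolmogorov-Smirnov-style separation sup|F_g(x) - F_i(x)| for one feature."""
    grid = np.sort(np.concatenate([genuine, impostor]))
    Fg = np.searchsorted(np.sort(genuine), grid, side="right") / len(genuine)
    Fi = np.searchsorted(np.sort(impostor), grid, side="right") / len(impostor)
    return np.max(np.abs(Fg - Fi))

def select_features(Xg, Xi, k=5):
    """Greedy stand-in for the GA search: keep the k feature columns whose
    genuine (Xg) and impostor (Xi) CDFs are furthest apart."""
    scores = [cdf_separation(Xg[:, j], Xi[:, j]) for j in range(Xg.shape[1])]
    return sorted(np.argsort(scores)[::-1][:k].tolist())
```

A feature whose genuine and impostor CDFs coincide scores 0 and is dropped; a perfectly separating feature scores 1.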
3. Continuous Authentication System (CAS)

CAS is the first subsystem of our proposed architecture, where we determine whether an impostor is using the device or not. From a classification point of view, it is therefore a two-class problem (i.e. legitimate user or impostor user). As mentioned before, we use a CA biometric system, which checks the genuineness of the user during the full session by considering every action performed by the user. The Trust Model was used in this research, and we will also report our CAS performance in terms of the Average Number of Genuine Actions (ANGA) and the Average Number of Impostor Actions (ANIA) [20]. Fig. 3 shows the expanded block diagram of the CAS. We discuss the major components of CAS in more detail below.

3.1. User's profile creation for CAS

Let Ui denote the extracted feature data of user i; this data is split into parts for training (denoted by Mi) and testing (denoted by Ti). At most 50% of the data of a user is used for training, i.e. |Mi| ≤ |Ti|. The CASi training model for user i is built with the training data Mi from user i as well as training data from the other users, where the total amount of data taken from the other users is of approximately the same size as Mi.

[Figure 3: training phase (Swipe Gesture → Feature Extraction → Data Separation for Training → Feature Selection → Build Classifier Models → Store Profile) and testing phase (Swipe Gesture → Feature Extraction → Feature Selection → Comparison Module → Trust Model → System Trust).]

Fig. 3. Block diagram representation of our CAS.

3.2. Comparison module

We applied two classifier models for CAS on Dataset-1, i.e. the ANN and CPANN classifiers in a weighted score fusion MCF architecture. We also tested the SVM classifier on Dataset-1, but due to lower performance, we decided not to use it for this dataset. For Dataset-2, we found that CPANN and SVM performed better in a weighted score fusion MCF architecture. Therefore, the score vector we use is (f1, f2) = (Score_ann, Score_cpann) for Dataset-1, and for Dataset-2 we use (f1, f2) = (Score_svm, Score_cpann). The resultant score that will be used in the Trust Model (see Section 3.3 and Eq. (1)) is calculated as sc = Wca × f1 + (1 − Wca) × f2, where Wca is the weight for the weighted fusion MCF technique and 0 ≤ Wca ≤ 1.

3.3. Trust Model

The concept of a Trust Model was introduced for CA [6], where the system can check the genuineness of the user after every action performed by the present user. The variable nature of the user's behavior is taken into account in this model. As we know, the user's behavior can deviate from time to time [4,34], which motivated us to use the Trust Model, where every action performed by the user is taken into consideration. In this research, we have used the Dynamic Trust Model (DTM) proposed by Mondal et al. [20]. The model uses four parameters to calculate the change in trust based on the resultant classification score, and returns the system trust in the genuineness of the current user after each action performed by the present user of the device. All the parameters of this trust model are user specific and optimized with a linear search.

The change of system trust (ΔT) is calculated according to Eq. (1), based on the resultant classification score (sc) of the action performed by the user as well as on four parameters. The parameter A represents the threshold value that determines penalty or reward. If the classification score of the current action (i.e. sc = P(xn|H1), where xn is the feature vector, after feature selection, of the nth performed action and H1 is the hypothesis for the genuine user) is exactly equal to this threshold, i.e. sc = A, then ΔT = 0. If sc > A then ΔT > 0, i.e. a reward is given, and if sc < A then ΔT < 0, i.e. the trust decreases because of a penalty. Furthermore, the parameter B is the width of the sigmoid function, i.e. it determines for which score values the system gives the maximum penalty/reward, while the parameters C and D are the upper limits of the reward and the penalty, respectively.
ΔT(sc) = min{ −D + D(1 + 1/C) / (1 + (1/C) · exp(−(sc − A)/B)), C }    (1)
If the trust value after the nth action is denoted by Trust_n, then we have the following relation between the trust Trust_{n−1} (i.e. after the (n − 1)st action) and the trust Trust_n (i.e. after the nth action), when the nth action had classification score sc:

Trust_n = min{ max{ Trust_{n−1} + ΔT(sc), 0 }, 100 }    (2)
In Eq. (2), we can see that the upper limit of the system trust is 100, to prevent a situation where an impostor user benefits from the high system trust obtained by the genuine user before he/she hijacks the system.
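A minimal sketch of the DTM update of Eqs. (1) and (2); the parameter values below are illustrative defaults (assuming scores in [0, 1]), not the user-specific optimized values used in the paper.

```python
import math

def delta_trust(sc, A, B, C, D):
    """Change of trust (Eq. (1)): zero at sc = A, a reward (capped by C) for
    sc > A, a penalty (approaching -D) for sc < A; B sets the sigmoid width."""
    return min(-D + D * (1 + 1 / C) / (1 + (1 / C) * math.exp(-(sc - A) / B)), C)

def update_trust(trust, sc, A=0.5, B=0.05, C=2.0, D=4.0):
    """Eq. (2): accumulate the change while clamping the trust to [0, 100]."""
    return min(max(trust + delta_trust(sc, A, B, C, D), 0), 100)
```

With Tlockout = 90, a session would lock as soon as repeated low-scoring actions drive `update_trust` below 90; the clamp at 100 is exactly the upper limit discussed above.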
4. Continuous Identification System (CIS)

CIS is the final subsystem of our proposed architecture, where we establish the identity of the user after CAS has found that an impostor was using the system. The actions performed by the impostor before being detected by CAS are presented to CIS to establish the identity of that impostor. This is a multi-class classification problem, where the number of classes is the number of users present in the dataset. Fig. 4 shows the complete block diagram of the CIS. In this section, we discuss the major components of CIS in more detail. The feature extraction and feature selection techniques were discussed in Section 2.

Person identification using swipe actions is a challenging task due to the limited feature set. When we applied a conventional multi-class classification approach, we found a low learning accuracy of the classifiers and were unable to achieve the desired results. Apart from that, the variable number of swipe actions/activities presented for impostor identification poses another challenge. Therefore, we have applied a solution called Pairwise User Coupling (PUC), where the multi-class classification problem is divided into several two-class classification problems [21]. We provide a comparison between the conventional multi-class classification approach and the PUC approach in Section 7.1, where we can see that the PUC method outperforms the standard multi-class classification approach.
[Figure 4: training phase (Swipe Gesture → Feature Extraction → Pairwise Training Data Preparation → Feature Selection → Build Classifier Models → Store Pairwise Profile) and testing phase (Swipe Gestures before Lockout from CAS → Feature Extraction → Feature Selection → Comparison Module → Decision Module → Adversary ID, Score).]
Fig. 4. Block diagram representation of our CIS.
4.1. Analysis of the identification algorithms

Three different identification schemes (i.e. S1, S2 and S3) have been proposed by Mondal et al. [21] using PUC. In scheme S1 we randomly arrange the set of users into pairs, and for each pair (user i, user j) we determine if the data fits better to the profile of user i or of user j. The user whose profile fits the data best proceeds to the next round of the scheme. In scheme S2 we, for each user i, randomly choose k other users and determine the mean score for user i when comparing the test data in the k pairwise comparisons with the randomly chosen other users. The user with the highest mean score is selected as the identified user. Scheme S3 is based on applying scheme S2 twice. First, scheme S2 is used to reduce the set of potential users from the original N users to only c users. In the second step, the remaining c users are compared in a full comparison, i.e. we apply scheme S2 on N = c users with k = c − 1, comparing each user with all c − 1 other remaining users.

The original research was conducted on biometric keystroke data. In this section, we study how these schemes perform on our swipe gesture based biometric data and set the different parameters of these schemes. We would like to mention that all the analysis presented in this section was done on Dataset-1. Fig. 5 shows the identification accuracy after applying Scheme 1 (S1). We can clearly see that the performance of CPANN is better than that of ANN and SVM. Therefore, we have decided to use only the CPANN classifier for the analysis later on. Fig. 6 shows the Rank-1 identification accuracy obtained from Scheme 2 (S2) with different values of k (i.e. k = 5, 10, . . . , 25). We can see that increasing the value of k increases the recognition accuracy, but it saturates after a certain value of k. Keeping in mind that increasing the value of k also increases the time complexity of the system (see Mondal et al. [21]), we have used k = 15 for the analysis later on. Fig. 7 shows the Rank-1 identification accuracy obtained after applying Scheme 3 (S3) with different values of c (i.e. c = 8, 10, . . . , 16) and k = 15. We would like to mention that S3 depends on S2, with an extra parameter c. We observed that increasing the value of c does not improve the accuracy, while the time complexity of the system (see Mondal et al. [21]) increases with c. Therefore we have used c = 8 for the analysis later on. Comparing Figs. 6 and 7 shows the improvement in the results. We have also observed that S3 performs best compared to the other schemes.

5. Experimental data separation and performance measure

In our research, we followed two experimental protocols, i.e. a closed-set experiment protocol and an open-set experiment protocol. The data separation for training and testing of these two experiment protocols and our system performance measure techniques are given below.
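Scheme S2 from Section 4.1 can be sketched as follows. This is our illustration; `pairwise_score` is a hypothetical stand-in for the trained pairwise (CPANN) models, which return a score that the test actions belong to user i rather than user j.

```python
import random

def scheme_s2(test_actions, users, pairwise_score, k=15, rng=random):
    """Scheme S2: for every candidate user i, average the pairwise scores of
    the test data against k randomly chosen other users; the candidate with
    the highest mean score is the identified user.

    pairwise_score(i, j, actions) -> score in [0, 1] favoring user i over j.
    """
    best_user, best_score = None, -1.0
    for i in users:
        others = rng.sample([u for u in users if u != i], min(k, len(users) - 1))
        mean = sum(pairwise_score(i, j, test_actions) for j in others) / len(others)
        if mean > best_score:
            best_user, best_score = i, mean
    return best_user, best_score
```

Scheme S3 would simply run this twice: once over all N users to shortlist c candidates, then once over the shortlist with k = c − 1.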
5.1. Experimental data separation

5.1.1. Closed-set experiment

In a closed-set experiment, all possible impostors are known to both the CAS and CIS subsystems. An example scenario could be an office environment, where your colleagues are your potential adversaries. In the case of CAS, the impostor part of the training data is taken from all N − 1 impostors, and all N − 1 impostors contribute approximately the same amount of data to the training of the classifier. This could, for example, be done internally in an organization where all users provide data for training of the various classifiers. For the testing, we used all the data of the genuine user and the impostor users that have not been used for the training of the classifier. This means that we have one genuine set of test data and N − 1 impostor sets of test data for each user. Fig. 8 explains this separation process for the first user, where |M1| ≈ |IMP1| and |M1| ≤ |T1|. A similar process has been followed for all the other users. In the case of CIS, the input I for all the schemes (see Mondal et al. [21] for parameter I) will be I = {1, 2, 3, . . . , N}, where N is the number of participants in the dataset.
5.1.2. Open-set experiment

In an open-set experiment, 50% of the users are known to both the CAS and CIS subsystems. An example scenario could be a university environment, where your colleagues are your potential known adversaries, while students form the group of unknown adversaries from outside the organization. In the case of CAS, the classifiers are trained with data from the genuine user as well as data of (N − 1)/2 of the impostor users. Also here we test the system with all genuine and impostor data that has not been used for training. For (N − 1)/2 of the impostors this means that their full data set is used for testing, while for the other (N − 1)/2 impostors the full data set with the exclusion of the training data is used for testing. Therefore, in this experiment we also have one genuine set of test data and N − 1 impostor sets of test data for each user. Fig. 9 explains this separation process for the first user, where again |M1| ≈ |IMP1| and |M1| ≤ |T1|. A similar process has been followed for all the other users. In the case of CIS, the input I for all the schemes (see Mondal et al. [21] for parameter I for S1, S2 and S3) will be I = {1, 2, 3, . . . , (N − 1)/2 + 1}, where N is the number of participants in the dataset. Therefore, this experiment can be seen as an open system where 50% of the probable adversaries are known to the system and the other 50% are completely unknown to the system.
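The open-set split described above can be sketched as follows (our illustration; the function and dictionary key names are assumptions):

```python
import random

def open_set_split(users, genuine, rng=random):
    """For one genuine user, split the N - 1 impostors into a 'known' half
    (usable for CAS/CIS training) and an 'unknown' half never seen in training."""
    impostors = [u for u in users if u != genuine]
    rng.shuffle(impostors)
    half = len(impostors) // 2
    return {"known": sorted(impostors[:half]), "unknown": sorted(impostors[half:])}
```

For N = 9 users this yields (N − 1)/2 = 4 known and 4 unknown impostors per genuine user, matching the protocol above.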
[Figure 5: correct identification (%) versus number of swipe actions (2–20) for the ANN, SVM and CPANN classifiers.]

Fig. 5. Results obtained from S1 with different classifiers.
5.2. Performance measure

The performance measure techniques applied in this research are discussed in this section. The CAS performance measure technique is common to both experiments. Due to the different objectives of the two experiments, we applied two different performance measure techniques for CIS. A description of these techniques is given below.
5.2.1. CAS performance measure

We found that current research on CAS reports results in terms of Equal Error Rate (EER), or in terms of False Match Rate (FMR) and False Non-Match Rate (FNMR), over either the whole test set or over fixed chunks of m actions. This means that an impostor can perform at least m actions before being detected as an impostor, even if the system achieves 0% EER. This is then in fact no longer CAS, but at best a Periodic Authentication System. In our research, we focus on actual CAS that reacts to every single action from a user. Therefore, we used the Average Number of Genuine Actions (ANGA) and the Average Number of Impostor Actions (ANIA) as the performance evaluation metrics [20]. In our study, the performed actions are the swipe gestures. In Fig. 10 we see how the system trust values change when we compare the model of genuine user 22 with the test data of impostor user 8. The trust drops below the lockout threshold (Tlockout = 90, marked with a red line) 22 times within 281 user actions; then ANIA_8^22 = 281/22 ≈ 13. We can calculate ANGA in a similar manner if the genuine user is locked out based on his test data. The goal is obviously to have ANGA as high as possible (to ensure that the system is user-friendly, and ideally a genuine user is never locked out), while at the same time the ANIA value must be as small as possible (to ensure the impostor can do little harm). We also want all the impostors to in fact be detected as impostors. In our analysis, whenever a user is locked out, we reset the trust value to 100 to simulate a new session starting (i.e. after a static login). In our research, we used an upper limit of 100 for the system trust (see Eq. (2)) to prevent the impostor from taking advantage of a higher system trust built up before taking over the system.

5.2.2. CIS performance measure for closed-set experiment

This is a straightforward performance measure for the closed-set experiment, where every time a user is locked out by the CAS system, the adversary ID is determined by the CIS system. In Fig. 10 we display these adversary IDs in green (meaning a successfully identified adversary) and red (indicating the identity of another person than the real adversary). When the system is unsuccessful in identifying the correct adversary ID, we display the Rank-1 identity along with the rank of the right impostor. In Fig. 10 one such case is present, where the system identifies the impostor as number 38, while the correct adversary ID of 8 is found at Rank-2 (R-2, marked in blue). During this experiment the CIS identified the correct adversary in 21 of the 22 lockouts; therefore, the recognition accuracy for this example is 21 out of 22, or ACC_8^22 = 95.45%.
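The ANIA computation from the example above can be sketched as follows (our illustration; in the paper the trust is additionally reset to 100 after every lockout):

```python
def ania(trust_values, t_lockout=90):
    """Average Number of Impostor Actions: total impostor actions divided by
    the number of lockouts (trust dropping below the threshold). ANGA is the
    same computation over a genuine user's trust curve."""
    lockouts = sum(1 for t in trust_values if t < t_lockout)
    return len(trust_values) / lockouts if lockouts else float("inf")
```

With 22 lockouts in 281 actions this gives 281/22 ≈ 13, as in the example; an infinite ANGA means the genuine user is never locked out, which is the ideal case.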
Fig. 6. Results obtained from S2 for different k values (k = 5, 10, 15, 20, 25): correct identification (%) versus the number of swipe actions.
5.2.3. CIS performance measure for open-set experiment

To measure the system performance for the open-set experiment, we use a threshold Topen that decides whether the adversary is within the set of known users or not. If Userscore ≥ Topen (where Userscore is the comparison score obtained from the identification schemes [21]), then we say that the adversary is within the set of known adversaries; otherwise the current user is declared to be an unknown adversary. If the adversary is found to be known to the system, the system then establishes the identity of the adversary. In our study, we use four different metrics that together describe the overall system performance for this protocol; these four metrics sum to 100%.
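The open-set decision logic of this subsection can be sketched as below. The four outcome labels are the metrics defined in this subsection; the function name, the score values and the user IDs in the example are made up for illustration, and the score itself would come from the PUC-based identification scheme [21].

```python
def open_set_outcome(user_score, t_open, identified_id, true_id, known_ids):
    """Classify one lockout event into TID / FID / TNotIn / FNotIn."""
    in_known_set = true_id in known_ids
    if user_score >= t_open:
        # System claims the adversary is a known user and outputs an ID.
        if in_known_set and identified_id == true_id:
            return "TID"    # known adversary, correctly identified
        return "FID"        # wrong ID, or the adversary was actually unknown
    # System claims the adversary is not in the known user set.
    return "TNotIn" if not in_known_set else "FNotIn"

known = {8, 22, 51}  # hypothetical known-adversary set
print(open_set_outcome(0.85, 0.8, 22, 22, known))   # -> TID
print(open_set_outcome(0.85, 0.8, 51, 65, known))   # -> FID (unknown adversary given a false ID)
print(open_set_outcome(0.60, 0.8, None, 65, known)) # -> TNotIn
```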
• True ID (TID): Userscore ≥ Topen (i.e. the adversary is within the known user set) and the adversary is correctly identified.
• False ID (FID): The sum of two components, in both of which Userscore ≥ Topen:
  - the adversary is within the known user set but is falsely identified;
  - the adversary is not in the known user set, but the system says otherwise and outputs a false adversary ID.
• True Not In (TNotIn): Userscore < Topen and the adversary was indeed not in the known user set.
• False Not In (FNotIn): The system says the adversary is not in the known user set (i.e. Userscore < Topen), but the adversary is actually within the known user set.

In Fig. 11 a case is presented for genuine user 8 and impostor user 22, where impostor user 22 is present in the known adversary user set. In this example, we see two cases where the system says the adversary is not in the known user set (marked F in red). Therefore, in this example TID_22^8 = 94.29%, FNotIn_22^8 = 5.71%, TNotIn_22^8 = 0% and FID_22^8 = 0%. In Fig. 12 another case is presented, with genuine user 8 and impostor user 65, where impostor user 65 is not in the known adversary user set. In this example, we see one case where the system says the adversary is in the known user set, with a false ID (marked 51 in red). Therefore, in this example TNotIn_65^8 = 95.24%, FID_65^8 = 4.76%, TID_65^8 = 0% and FNotIn_65^8 = 0%. In both examples (Figs. 11 and 12) we used Dataset-1 and Topen = 0.8.

6. Result analysis
We report the results with a user-specific lockout threshold (Trus) and with a fixed lockout threshold Tlockout = 90. In our research, all the algorithmic parameters and the Trus thresholds were optimized using a linear search. We have analyzed a zero-effort attack scenario. Therefore, we have in total N × N tests, of which N are genuine tests and the remainder are impostor tests, for any given dataset.

6.1. Interpretation of result tables

Based on the CAS performance, a genuine user can be categorized into four possible categories:
Fig. 7. Results obtained from S3 with k = 15 for different c values (c = 8, 10, 12, 14, 16): correct identification (%) versus the number of swipe actions.
Fig. 8. Pictorial representation of the data separation process for the first user in a closed-set experiment.
• (+/+): The genuine user is never locked out of the system (i.e. ANGA will be ∞), and all N − 1 impostors are detected as impostors. This is the best category.
• (+/−): The genuine user is also not locked out, but some impostors are not detected by the system.
• (−/+): The genuine user is locked out by the system, but all impostors are detected by the system.
• (−/−): The genuine user is locked out by the system, and some of the impostors are also not detected. This is the worst-case situation.

Fig. 9. Pictorial representation of the data separation process for the first user in an open-set experiment.

In Table 2, column # Users shows how many users fall within each of the 4 categories based on the CAS performance (i.e. the values sum up to N). In column ANGA, a value indicates the mean of the Average Number of Genuine Actions in case genuine users are locked out by the system. If the genuine users are not locked out, then ANGA is set to ∞. The column ANIA displays the mean ± standard deviation of the Average Number of Impostor Actions, based on all impostors in that category that are detected. The actions of the impostors that are not detected are not used in this calculation, but the number of undetected impostors is given in column Imp. ND. This number should be seen in relation to the number of users in that particular category. For example, for the Trus lockout threshold and the '+/−' category, we see that # Users equals 3 (i.e. there are 3 × 70 = 210 impostor test sets) and only 4 impostors are not detected by the system as being an impostor. Therefore, in this category

ANIA = ( Σ_{k ∈ I+−} Σ_{l ∈ IMP_+−^k} ANIA_l^k ) / ( |I+−| (N − 1) − Imp.ND ),

where I+− ⊆ I = {1, 2, ..., N} is the set of users that fall into the '+/−' category, IMP_+−^k = I − {k} − ND^k, and ND^k is the set of impostors not detected for user k. In the same way we can calculate the values for the other categories.

Fig. 10. CIS performance measure for closed-set experiment with genuine user 22 and impostor user 8 (accuracy 95.45%, ANIA 13): system trust versus event number, with correctly identified adversary IDs in green, a wrong Rank-1 identity in red, and the rank of the right impostor (R-2) in blue. (For interpretation of the references to color in the text, the reader is referred to the web version of this article.)

Table 2
Results for Database-1. Columns # User, ANGA, ANIA and Imp. ND give the CAS performance; S1, S2 and S3 give the CIS performance (%). The '−/+' and '−/−' categories are empty for both thresholds.

Tlockout  Category  # User  ANGA  ANIA    Imp. ND  S1           S2           S3
Trus      +/+       68      ∞     4 ± 2   0        72.9 ± 6.8   69 ± 5.2     82.3 ± 5.4
          +/−       3       ∞     14 ± 3  4        88.3 ± 3.8   81.8 ± 1.7   93 ± 2.4
          Summary   71      ∞     4.4     0.08%    73.5         69.5         82.7
Tr90      +/+       63      ∞     14 ± 3  0        94.5 ± 1     87.3 ± 1.4   97.8 ± 0.6
          +/−       8       ∞     25 ± 7  20       96.4 ± 1.2   90.8 ± 2.4   98.5 ± 0.6
          Summary   71      ∞     15.2    0.4%     94.7         87.7         97.9
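The aggregation steps used for the result tables can be sketched as follows. This is an illustrative reimplementation of the formulas in this section, not the authors' code; the dictionaries in the first example hold fabricated toy values, while the summary call uses the Trus category figures from Table 2.

```python
def category_ania(ania, not_detected, n_users):
    """Average ANIA over one category, e.g. '+/-'.

    ania: {k: {l: ANIA_l^k}} per-impostor ANIA values for the users k
    in the category; not_detected: {k: set of impostors never detected
    for user k}. Undetected impostors are excluded from the denominator.
    """
    imp_nd = sum(len(nd) for nd in not_detected.values())
    total = sum(v for per_user in ania.values() for v in per_user.values())
    return total / (len(ania) * (n_users - 1) - imp_nd)

def v_summary(cats, n_users):
    """Summary line: average over all detected impostor tests.

    cats: per-category tuples (u, nd, v) = (#users, #undetected
    impostors, value to summarize); empty categories are omitted.
    """
    t = n_users - 1
    num = sum((u * t - nd) * v for u, nd, v in cats)
    den = n_users * t - sum(nd for _, nd, _ in cats)
    return num / den

# Toy category: one genuine user, two impostors with ANIA 10 and 20.
print(category_ania({1: {2: 10, 3: 20}}, {1: set()}, 3))  # -> 15.0

# Table 2, T_rus: '+/+' has 68 users (ANIA 4, none undetected), '+/-'
# has 3 users (ANIA 14, 4 undetected); the reported summary ANIA is 4.4.
print(round(v_summary([(68, 0, 4), (3, 4, 14)], 71), 1))  # -> 4.4
```

Note that the summary call reproduces Table 2's Trus summary ANIA of 4.4 from the rounded category means, which is a useful consistency check on the formula.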
In Table 2 we can see that most of the genuine users fall into the '+/+' category, while some genuine users fall into the '+/−' category, for any given analysis technique. But there could be a situation where some genuine users fall into another category, i.e. the '−/+' or '−/−' category. In that case it would be difficult to compare the advantages and disadvantages of the applied analysis methods. Therefore, we present a summary line of the results for any given analysis. The summary line for all the tables was calculated as ImpND_summary = (nd+− + nd−−) / (N × t) × 100% for Imp. ND, and for the other values (i.e. ANIA, S1, S2 and S3) as

V_summary = [ (u++ × t × v++) + ((u+− × t − nd+−) × v+−) + (u−+ × t × v−+) + ((u−− × t − nd−−) × v−−) ] / [ (N × t) − (nd+− + nd−−) ],

where u++, u+−, u−+ and u−− are the numbers of users that fall into the corresponding categories (i.e. '+/+', '+/−', '−/+' and '−/−'), v++, v+−, v−+ and v−− are the values we want to summarize (i.e. ANIA or the CIS performances) for the corresponding categories, nd+− and nd−− are the numbers of impostors not detected for the '+/−' and '−/−' categories respectively, and t = N − 1.

Fig. 11. CIS performance measure for open-set experiment with genuine user 8 and impostor user 22 (accuracy 94.29%, ANIA 14): system trust versus event number. (For interpretation of the references to color in the text, the reader is referred to the web version of this article.)

Table 3
Results for Database-2. Columns # User, ANGA, ANIA and Imp. ND give the CAS performance; S1, S2 and S3 give the CIS performance (%). For Trus all users fall into the '+/+' category; the remaining categories are empty.

Tlockout  Category  # User  ANGA  ANIA       Imp. ND  S1           S2           S3
Trus      +/+       41      ∞     11 ± 9     0        79.7 ± 5.3   69.6 ± 5.1   82.2 ± 4.9
          Summary   41      ∞     11         0%       79.7         69.6         82.2
Tr90      +/+       39      ∞     27 ± 16    0        84.8 ± 1.8   72.9 ± 2.3   86.4 ± 1.7
          +/−       2       ∞     246 ± 141  4        95.2 ± 4.8   88.7 ± 9.3   95.5 ± 4.5
          Summary   41      ∞     37.2       0.24%    85.3         73.6         86.8

6.2. Results for closed-set experiment

Tables 2 and 3 show the system performance for the closed-set analysis with Database-1 and Database-2 respectively. In the CIS performance (%) section, columns S1, S2 and S3 show the results obtained from Schemes 1, 2 and 3 respectively for the corresponding category. The best CIS performance results are indicated in bold in both tables. We can clearly see that the S3 technique performs better than the other techniques. Therefore, we have decided to use S3 for the open-set experiment that will be presented in Section 6.3. Due to the lower value of ANIA (hence fewer data available for CIS) and also all the impostors detected
by the system, we can observe that the continuous identification accuracy is lower for the ’+/+’ category users, than the ’+/−’ category genuine users for both the lockout threshold (i.e. Trus and Tr90 ). Similar phenomena were observed between these two lockout thresholds because for Tr90 impostors can perform more actions before getting detected. We can also see that the ANIA value is a bit higher for Database-2 than for Database-1 for any given lockout threshold, but on the other hand, all the users are falling into the ’+/+’ category for Trus . 6.3. Results for open-set experiment Tables 4 and 5 show the system performance for the openset experiment for Dataset-1 and Dataset-2 respectively, with user’s specific Topen thresholds. In Table 4, we can see that 66 users qual-
Fig. 12. CIS performance measure for open-set experiment with genuine user 8 and impostor user 65 (accuracy 95.24%, ANIA 12): system trust versus event number. (For interpretation of the references to color in the text, the reader is referred to the web version of this article.)

Table 4
Results for Database-1 and Protocol-2. Columns # User, ANGA, ANIA and Imp. ND give the CAS performance; FID, FNotIn, TID and TNotIn give the CIS performance (%). The '−/+' and '−/−' categories are empty for both thresholds.
Tlockout  Category  # User  ANGA  ANIA    Imp. ND  FID          FNotIn       TID          TNotIn
Trus      +/+       66      ∞     4 ± 2   0        12.2 ± 4.2   13.2 ± 2.3   36.5 ± 2.2   38.1 ± 4
          +/−       5       ∞     15 ± 8  10       13 ± 2.6     9.8 ± 4      40.8 ± 4.2   36.4 ± 2.6
          Summary   71      ∞     4.8     0.2%     12.3         13           36.8         38
Tr90      +/+       56      ∞     15 ± 4  0        8 ± 6.3      12.7 ± 6.6   37.1 ± 6.3   42.2 ± 6.1
          +/−       15      ∞     22 ± 8  27       11.7 ± 6     7.5 ± 6.4    43.2 ± 6.7   37.6 ± 6.2
          Summary   71      ∞     16.4    0.54%    8.8          11.6         38.4         41.2
Table 5
Results for Database-2 and Protocol-2. Columns # User, ANGA, ANIA and Imp. ND give the CAS performance; FID, FNotIn, TID and TNotIn give the CIS performance (%). The '−/+' and '−/−' categories are empty for both thresholds.

Tlockout  Category  # User  ANGA  ANIA      Imp. ND  FID          FNotIn       TID          TNotIn
Trus      +/+       40      ∞     17 ± 13   0        15.5 ± 5.8   10.9 ± 2.6   38.6 ± 2.5   35 ± 5.5
          +/−       1       ∞     88        1        8.8          6.7          44.1         40.5
          Summary   41      ∞     18.7      0.06%    15.3         10.8         38.7         35.1
Tr90      +/+       38      ∞     32 ± 20   0        9 ± 1.7      13.7 ± 1.8   36 ± 1.8     41.2 ± 1.7
          +/−       3       ∞     138 ± 80  9        9.3 ± 2.6    7.6 ± 3.3    46.5 ± 4.6   36.6 ± 1.9
          Summary   41      ∞     39.2      0.55%    9            13.3         36.7         40.9
Fig. 13. Identification accuracy comparison with previous research on Dataset-1: correct identification (%) versus the number of swipe actions, for PUC and for Antal et al. (SVM).
ified in the ’+/+’ category with ANIA of 4 actions and the rest of the users qualified as ’+/−’ category with ANIA of 15 actions for Trus . In case of CIS performance the summation of TID and TNotIn, i.e. Detection and Identification Rate (DIR) is 74.8%. Similar to previous observations, the CIS performance can improve if we use Tr90 as a lockout threshold because of the higher ANIA value (i.e. more swipe actions presented to CIS to establish the adversary ID).
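The Detection and Identification Rate quoted above is simply the sum of the two correct open-set outcomes. A trivial sketch (the function name is ours; the numbers are the Trus summary line of Table 4):

```python
def detection_and_identification_rate(tid_pct, t_not_in_pct):
    # DIR counts every correct open-set decision: a known adversary
    # correctly identified (TID) plus an unknown adversary correctly
    # rejected (TNotIn). Both inputs are percentages.
    return tid_pct + t_not_in_pct

print(round(detection_and_identification_rate(36.8, 38.0), 1))  # -> 74.8
```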
7. Discussion

7.1. Comparison with previous research

Due to the novelty of this research, we did not find any work that is directly related to ours. Therefore, we are unable to compare our continuous identification results with previous results, but we can compare our proposed identification technique with previous research done on the same dataset. Fig. 13 shows a comparison with the research by Antal et al. [2] on user identification with different numbers of actions. In [2] the best result was obtained by using an SVM classifier without pairwise user coupling. We note that the amount of data used for classifier training and testing is not explicitly mentioned in [2]. We see that the proposed method outperforms the existing research. More specifically, there is a large difference in accuracy when using a small number of actions, which benefits our approach because impostor users are locked out after a low number of actions.
We can also compare the previous CAS performance on Dataset-2 with our CAS performance on the same dataset. Table 6 shows the previous research results in terms of ANIA/ANGA, obtained by using the conversion technique described in [20]. We can clearly see that our method outperforms the previous research for the closed-set experiment, while for the open-set experiment our ANIA value is higher. Note that the analysis in [14] has been done in a manner similar to our open-set experiment. The major limitation of the research in [14] is the use of fixed chunks of data, i.e. 12 actions, for the analysis. An impostor can perform 12 actions before his/her identity is checked, even if a 0% FMR were achieved. Table 7 shows the comparison between our CAS and a state-of-the-art Periodic Authentication (PA) approach on Dataset-2 with our preprocessing and classification techniques, where the block size varies from 2 to 20. After calculating the FMR/FNMR values for the different block sizes, we converted these into ANIA/ANGA by using the conversion technique described in [20]. We can see that our system performs better if we compare the ANGA values, i.e. our CA system's ANGA value is much higher than the comparable PA system's ANGA value (marked in bold). We would also like to compare our proposed system and algorithms on other datasets and against state-of-the-art algorithms; the unavailability of such datasets, unfortunately, limits our possibilities for comparison with previous research.

7.2. Summary of the major findings

• Contrary to the state-of-the-art continuous authentication research on mobile devices (i.e. periodic authentication), we have
Table 6
Comparison with previous research with Dataset-2 for CAS.

Reference          # Users  FNMR  FMR    Block size  ANGA  ANIA
[14]               41       3%    3%     12          400   12
Our (Closed-set)   41       0%    0%     NA          ∞     11 (± 9)
Our (Open-set)     41       0%    0.06%  NA          ∞     19 (± 13)
Table 7
Comparison between a state-of-the-art periodic authentication (PA) system and our CAS on Dataset-2.

          Block size  Protocol-1                  Protocol-2
                      FMR   FNMR  ANIA  ANGA      FMR   FNMR  ANIA  ANGA
PA        2           30.1  12.3  3     16        32.2  14.7  3     14
          3           28.8  11.0  4     27        30.9  13.6  4     22
          4           28.8  9.5   6     42        30.6  11.6  6     34
          5           26.5  9.7   7     51        29.4  11.7  7     43
          6           26.3  9.0   8     66        31.2  8.9   9     67
          7           27.6  7.2   10    98        30.8  8.4   10    83
          8           26.4  6.9   11    117       30.0  8.6   11    93
          9           27.4  5.1   12    176       29.6  8.4   13    107
          10          25.5  6.3   13    159       29.0  8.5   14    118
          11          25.1  5.7   15    193       27.7  8.8   15    125
          12          24.6  5.9   16    204       30.0  6.3   17    191
          13          23.6  6.1   17    213       28.2  6.7   18    193
          14          24.2  6.1   18    229       28.4  6.9   20    202
          15          23.5  6.1   20    245       28.2  6.9   21    217
          16          23.3  5.7   21    279       29.2  5.5   23    293
          17          21.0  7.5   22    227       28.4  5.1   24    333
          18          24.2  4.4   24    412       28.9  4.9   25    366
          19          22.4  5.3   24    360       28.1  5.2   26    362
          20          21.9  5.1   26    392       29.2  4.2   28    480
Our CAS   —           —     —     11    ∞         —     —     18    ∞
used an actual continuous authentication system in our research. The advantage of our CAS is that, whenever the system is confident about the illegitimacy of the current user, it does not wait for a fixed number of actions to complete before taking the lockout decision. From a classification point of view, however, this poses a huge challenge, because the classifier learns on a per-action basis, not a per-chunk basis. We overcome this problem by applying the trust model (see Section 3).
• One of the challenges in this research is that the number of data samples available to the CIS is variable: it equals the number of actions the current user could perform before he/she was locked out by the CAS module. We mitigate this problem with the PUC-based identification schemes (see Section 4).
• Due to significant intra-class variation, small inter-class variation and the limited information in behavioral biometrics (i.e. swipe gestures), authentication (a 1:1 comparison) is already a challenging task; identification (a 1:N comparison) increases this challenge even further. We overcome these challenges with a high degree of confidence by using PUC-based identification techniques. These techniques are general enough to be applied to other identification problems.
• We have observed that none of the genuine users are wrongly locked out by the CAS for any given lockout threshold (i.e. Trus or Tr90) and dataset (see Tables 2–5). This means that our CAS provides a high degree of user friendliness.
• We found that the CAS performances for Tlockout = 90 are lower than for Trus for any given dataset. Therefore, we can say that the system trust in the genuine users was always above 90%.
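The FMR/FNMR-to-ANGA/ANIA conversion used for Tables 6 and 7 in Section 7.1 can be sketched as below. This assumes the simple model underlying [20]: a periodic authenticator decides once per block of m actions, so a genuine user survives on average 1/FNMR blocks before a false lockout and an impostor survives 1/(1 − FMR) blocks before detection. The function name is ours; the example reproduces the first PA row of Table 7 after rounding.

```python
def periodic_to_anga_ania(block_size, fmr, fnmr):
    """Convert block-wise FMR/FNMR of a periodic authenticator into the
    expected number of actions before lockout (ANGA) / detection (ANIA)."""
    anga = block_size / fnmr if fnmr > 0 else float("inf")
    ania = block_size / (1.0 - fmr)
    return anga, ania

# Protocol-1, block size 2: FMR = 30.1%, FNMR = 12.3%.
anga, ania = periodic_to_anga_ania(2, 0.301, 0.123)
print(round(anga), round(ania))  # -> 16 3
```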
7.3. Practical implementation

For a practical implementation, the CAS behavior profile of a given user will be stored securely on his/her device, and the CIS behavior profiles of all users will be stored safely on a central server. There are some definite advantages to this proposed implementation:
• We reduce network traffic, because during normal activity we do not need to communicate with the server for continuous authentication.
• As the CIS performs a 1:N comparison, doing it on the central server reduces the computation and storage cost of the local devices. Also, if a single device is compromised, not all behavior profiles are compromised.
8. Conclusion

The concepts of continuous user authentication and identification have been introduced in this paper, together with an experimental evaluation. A CIS in combination with a CAS will not only protect a system against unauthorized access but will also, with high probability, identify the impostor. This can act as a deterrence measure to an impostor considering unauthorized access to another person's system. We have evaluated our proposed architecture in a closed environment, like a workplace, and also in an open environment, where some of the potential adversaries are known but some are unknown. For the open environment we found that the accuracy of the CIS in establishing the correct identity of the adversary is above 73%, i.e. a DIR of 73%, and for the closed environment the identification accuracy is above 82%. The objective of continuous identification is to use it as forensic evidence. In this research, we provide a proof of the continuous user authentication and identification concept, and in future work we will explore the possibility of presenting such evidence in a court case. We are also going to explore the possibility of including other types of user behavior patterns (based on tapping, accelerometer and gyroscope information, etc.) into our system in the future.

References

[1] Acharya S, Fridman A, Brennan P, Juola P, Greenstadt R, Kam M. User authentication through biometric sensors and decision fusion. In: 47th annual conference on information sciences and systems. IEEE; 2013. p. 1–6.
[2] Antal M, Bokor Z, Szabó LZ. Information revealed from scrolling interactions on mobile devices. Pattern Recognit Lett 2015;56:7–13.
[3] Bailey KO, Okolica JS, Peterson GL. User identification and authentication using multi-modal behavioral biometrics. Comput Secur 2014;43:77–89.
[4] Bergner RM. What is behavior? And so what? New Ideas Psychol 2011;29(2):147–55.
[5] Bo C, Zhang L, Jung T, Han J, Li X-Y, Wang Y. Continuous user identification via touch and movement behavioral biometrics. In: 2014 IEEE international performance computing and communications conference; 2014. p. 1–8.
[6] Bours P. Continuous keystroke dynamics: a different perspective towards biometric evaluation. Inf Secur Tech Rep 2012;17:36–43.
[7] Cai Z, Shen C, Wang M, Song Y, Wang J. Mobile authentication through touch-behavior features. In: Biometric recognition. Lecture notes in computer science, vol. 8232. Springer; 2013. p. 386–93.
[8] Clarke N, Furnell S. Authenticating mobile phone users using keystroke analysis. Int J Inf Secur 2007;6(1):1–14.
[9] Crouse D, Han H, Chandra D, Barbello B, Jain AK. Continuous authentication of mobile user: fusion of face image and inertial measurement unit data. In: International conference on biometrics (ICB'15). IEEE; 2015. p. 135–42.
[10] De Luca A, Hang A, Brudy F, Lindner C, Hussmann H. Touch me once and I know it's you!: implicit authentication based on touch screen patterns. In: SIGCHI conference on human factors in computing systems. CHI '12. ACM; 2012. p. 987–96.
[11] Feher C, Elovici Y, Moskovitch R, Rokach L, Schclar A. User identity verification via mouse dynamics. Inf Sci 2012;201(0):19–36.
[12] Feng T, Yang J, Yan Z, Tapia EM, Shi W.
Tips: Context-aware implicit user identification using touch screen in uncontrolled environments. In: 15th workshop on mobile computing systems and applications. ACM; 2014. 9:1–9:6. [13] Feng T, Zhao X, Carbunar B, Shi W. Continuous mobile authentication using virtual key typing biometrics. In: 12th IEEE international conference on trust, security and privacy in computing and communications. In: TRUSTCOM ’13. IEEE; 2013. p. 1547–52. [14] Frank M, Biedert R, Ma E, Martinovic I, Song D. Touchalytics: on the applicability of touchscreen input as a behavioral biometric for continuous authentication. IEEE Trans Inf Forensics Secur 2013;8(1):136–48. [15] Govindarajan S, Gasti P, Balagani K. Secure privacy-preserving protocols for outsourcing continuous authentication of smartphone users with touch data. In: IEEE 6th international conference on biometrics: theory, applications and systems (BTAS’13); 2013. p. 1–8. [16] Jakobsson M, Shi E, Golle P, Chow R. Implicit authentication for mobile devices. In: 4th USENIX conference on hot topics in security (HotSec’09); 2009. p. 1–6. [17] Kittler J, Hatef M, Duin RPW, Matas J. On combining classifiers. IEEE Trans Pattern Anal Mach Intell 1998;20(3):226–39.
[18] Li L, Zhao X, Xue G. Unobservable re-authentication for smartphones. In: 20th annual network & distributed system security symposium. The Internet Society; 2013. p. 1–16. [19] Meng Y, Wong DS, Kwok L-F. Design of touch dynamics based user authentication with an adaptive mechanism on mobile phones. In: 29th annual ACM symposium on applied computing. In: SAC ’14. ACM; 2014. p. 1680–7. [20] Mondal S, Bours P. A computational approach to the continuous authentication biometric system. Inf Sci 2015;304:28–53. [21] Mondal S, Bours P. Person identification by keystroke dynamics using pairwise user coupling. IEEE Trans Inf Forensics Secur 2017;12(6):1319–29. [22] Mondal S, Bours P. A study on continuous authentication using a combination of keystroke and mouse biometrics. Neurocomputing 2017;230:1–22. [23] Monrose F, Rubin A. Authentication via keystroke dynamics. In: 4th ACM conference on computer and communications security (CCS ’97). ACM; 1997. p. 48–56. [24] Nakkabi Y, Traor I, Ahmed A. Improving mouse dynamics biometric performance using variance reduction via extractors with separate features. IEEE Trans Syst Man Cybern Part A 2010;40:1345–53. [25] Niinuma K, Park U, Jain A. Soft biometric traits for continuous user authentication. IEEE Trans Inf Forensics Secur 2010;5(4):771–80. [26] Patel VM, Chellappa R, Chandra D, Barbello B. Continuous user authentication on mobile devices: recent progress and remaining challenges. IEEE Signal Process Mag 2016;33(4):49–61. [27] Saevanee H, Clarke N, Furnell S, Biscione V. Text-based active authentication for mobile devices. In: ICT systems security and privacy protection, vol. 428. Springer Berlin Heidelberg; 2014. p. 99–112. [28] Serwadda A, Phoha V, Wang Z. Which verifiers work?: A benchmark evaluation of touch-based authentication algorithms. In: 2013 IEEE 6th international conference on biometrics: theory, applications and systems; 2013. p. 1–8. [29] Shen C, Zhang Y, Cai Z, Yu T, Guan X. 
Touch-interaction behavior for continuous user authentication on smartphones. In: International conference on biometrics (ICB’15). IEEE; 2015. p. 157–62. [30] Shepherd S. Continuous authentication by analysis of keyboard typing characteristics. In: European convention on security and detection. IEEE; 1995. p. 111–14. [31] Sim T, Zhang S, Janakiraman R, Kumar S. Continuous verification using multimodal biometrics. IEEE Trans Pattern Anal Mach Intell 2007;29(4):687–700. [32] Traore I, Woungang I, Obaidat M, Nakkabi Y, Lai I. Combining mouse and keystroke dynamics biometrics for risk-based authentication in web environments. In: 4th international conference on digital home; 2012. p. 138–45. [33] Ververidis D, Kotropoulos C. Information loss of the mahalanobis distance in high dimensions: application to feature selection. IEEE Trans Pattern Anal Mach Intell 2009;31(12):2275–81. [34] Yampolskiy RV, Govindaraju V. Behavioural biometrics: a survey and classification. Int J Biom 2008;1(1):81–113. [35] Zhang H, Patel VM, Chellappa R. Robust multimodal recognition via multitask multivariate low-rank representations. In: IEEE international conference on automatic face and gesture recognition (FG’15). IEEE; 2015. p. 1–8. [36] Zhao X, Feng T, Shi W. Continuous mobile authentication using a novel graphic touch gesture feature. In: 2013 IEEE international conference on biometrics: theory, applications and systems; 2013. p. 1–6.