Available online at www.sciencedirect.com Available online at www.sciencedirect.com
Available online at www.sciencedirect.com
ScienceDirect
Procedia Computer Science 00 (2018) 000–000 Procedia Computer Science (2018) 000–000 Procedia Computer Science 15500 (2019) 177–184
www.elsevier.com/locate/procedia www.elsevier.com/locate/procedia
The 16th International Conference on Mobile Systems and Pervasive Computing (MobiSPC) The 16th International Conference Mobile Systems andCanada Pervasive Computing (MobiSPC) August on 19-21, 2019, Halifax, August 19-21, 2019, Halifax, Canada
Towards Towards Continuous Continuous Authentication Authentication on on Mobile Mobile Phones Phones using using Deep Deep Learning Models Learning Models Hasan Can Volakaaa , Gulfem Alptekinaa , Okan Engin Basaraa , Mustafa Isbilenbb , Ozlem a,∗ Basar , Mustafa Isbilen , Ozlem , Okan Engin Hasan Can Volaka , Gulfem Alptekin Durmaz Incel a,∗ Durmaz Incel a b Yapi b Yapi
Department of Computer Engineering, Galasaray University, 34349, Istanbul/Turkey a Department of Computer Engineering, Galasaray University, 34349, Istanbul/Turkey Kredi Teknoloji, Resitpasa Mah.Katar Cad. ITU Ayazaga Yerleskesi, Teknokent Ari 3 Sit. No:4/B201 34485 SARIYER / ISTANBUL Kredi Teknoloji, Resitpasa Mah.Katar Cad. ITU Ayazaga Yerleskesi, Teknokent Ari 3 Sit. No:4/B201 34485 SARIYER / ISTANBUL
Abstract Abstract Smartphones have become essential objects for our daily lives. Besides their original purpose of use, people use these devices Smartphones haveassistants. become essential objects for our daily lives. large Besides their storage original which purpose of use,users people use these as their personal Additionally, smartphones provide internal enables to store their devices private as their personal assistants. Additionally, smartphones provide large internal storage which enables users to store information, such as personal photos, contact details, call histories, etc. On the other hand, because of their small their sizes,private these information, as get personal details, call histories, etc. On otherofhand, because users of their smallunauthorized sizes, these devices couldsuch easily lost orphotos, stolen.contact Therefore, providing the security andthe privacy smartphone against devices easily get lost or stolen. the solutions security and privacy smartphone users against unauthorized access iscould a significant and crucial area Therefore, of research.providing One of the is the use ofofbehavioral biometrics, which tracks and access is a significant and crucial area of research. One of the solutions is the use of behavioral biometrics, which tracks and and identify users’ interaction patterns with the device. In this paper, we investigate the impact of using both touchscreen-based identify users’features interaction with themodel device.using In this paper, we investigate the impact using both touchscreen-based sensor-based in anpatterns authentication deep learning methods. Mainly, weoftrain a three-layer deep networkand on sensor-based in anand authentication model using deep learning methods. characters Mainly, we three-layer deep network on the combined features feature-sets applied classification for revealing the behavioral oftrain usersafor building an authentication the combined and applied classification for revealing theover behavioral characters of users for building an authentication model. We usefeature-sets HMOG dataset that includes data from 100 users 24 sessions. We train different networks with different model. We useof HMOG dataset thatonly includes data from 100 users overdata, 24 sessions. train different networks with combinations input data, namely touch-screen data, only sensor and their We combination. Our results show thatdifferent we can combinations of input data, namely touch-screen data, onlyclassification sensor data, when and their combination. show that we can achieve 88% accuracy and 15% EERonly values considering binary different types of Our dataresults are used together. achieve 88% accuracy and 15% EER values considering binary classification when different types of data are used together. c 2019 2018 The The Authors. Authors. Published Published by by Elsevier Elsevier B.V. B.V. © c 2018an The Authors. Published by Elsevier B.V. This This is is an open open access access article article under under the the CC CC BY-NC-ND BY-NC-ND license license (http://creativecommons.org/licenses/by-nc-nd/4.0/) (http://creativecommons.org/licenses/by-nc-nd/4.0/) This is an open access article under theConference CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/) Peer-review responsibility ofthe the Conference Program Chairs. under responsibility of Program Chairs. Peer-review under responsibility of the Conference Program Chairs. Keywords: continuous authentication; mobile sensors; deep learning. Keywords: continuous authentication; mobile sensors; deep learning.
1. Introduction 1. Introduction Smartphones are essential tools in our daily lives. A survey report by Pew Research Center[6] shows that 81% Smartphones aresmartphones. essential toolsThe in same our daily lives.also A survey by Pew Research showsnotthat of U.S. adults have research shows report that smartphone users useCenter[6] these devices just81% for of U.S. adults have smartphones. The same research also shows that smartphone users use these devices not for calling and texting but also for looking for a job, finding a date, reading a book or doing online shopping. just Online calling and texting but also for looking for a job, finding a date, reading a book or doing online shopping. Online ∗ ∗
Corresponding author. Tel.: +90-212-227-4480; Corresponding Tel.: +90-212-227-4480; E-mail address:author.
[email protected] E-mail address:
[email protected] c 2018 The Authors. Published by Elsevier B.V. 1877-0509 c 2018 1877-0509 Thearticle Authors. Published by Elsevier B.V. This is an open access under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/) 1877-0509 ©under 2019 Thearticle Authors. Published by Elsevier B.V. This is an open access under the Conference CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/) Peer-review responsibility of the Program Chairs. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/) Peer-review under responsibility of the Conference Program Chairs. Peer-review under responsibility of the Conference Program Chairs. 10.1016/j.procs.2019.08.027
Hasan Can Volaka et al. / Procedia Computer Science 155 (2019) 177–184 Hasan Can Volaka et al. / Procedia Computer Science 00 (2018) 000–000
178 2
banking, mailing, playing games can also be added to this list. People report that smartphones are the second essential tool when leaving home after wallets and keys. Providing security and privacy for these tools is essential since they carry personal and other sensitive information, such as passwords or photos. Moreover, they may get stolen, lost or can be accessed by non-users due to their small sizes. The same survey reports that 28% of U.S. smartphone owners say they do not use a screen lock or other features to secure their phone. Although a majority of smartphone users say, they have updated their phones apps or operating system, around four-in-ten say they only update when it is convenient for them. However, some smartphone users forgot updating their phones altogether: 14% say they never update their phones operating system, while 10% say they do not update the apps on their phone. Considering these statistics, securing mobile devices is a leading security challenge because it depends on human attitude or preferences. There are different approaches for securing authentication on these devices. Widely-used methods include PIN and pattern-based authentication. However, these methods are weak against shoulder-surfing, smudge, and other attacks. Additionally, they provide one-time authentication. Regarding this behavior, methods which focus on passive security are gaining importance to answer questions about how to solve those security challenges. Continuous authentication according to the interaction patterns of the user is an emerging alternative solution [9, 1]. This approach is also called behavioral biometrics where interaction patterns of a user are tracked instead of using physical bio-metrical information, such as a fingerprint. Continuous authentication has several advantages, such as working in the background and not disrupting the user experience and working over a session rather than a one-shot authentication. There are two survey papers [9, 1] that investigate the use of biometrics for continuous authentication on smartphones. In [9], it was emphasized that sensors such as camera, microphone, etc. can be used to collect physical data, while components such as accelerometers, gyroscopes, touch screens can be used to collect behavioral biometric data such as walking, screen touch gestures, and hand gestures. In the other review paper [1], the studies in the literature were examined in terms of the type and size of data collected, classifiers used in identification, and results obtained. In this paper, we investigate continuous authentication on mobile phones by the identification of users using both sensor and touch screen data. Notably, we use the HMOG dataset[10] which includes both data from motion sensors -accelerometer, gyroscope and magnetometer- available on smartphones and also the touch-screen. It was collected from 100 users over 24 sessions. Although this dataset has been analyzed using traditional machine learning algorithms [10], to the best of our knowledge, there is only one study [2] that applies LSTM and RNN based deep learning algorithms on the same dataset. They reported 81.32% accuracy using LSTM which is lower than our findings. Additionally, we use a different type of network constructed over feature sets. Instead of using the raw data, we extracted basic and simple features, such as mean, standard deviation and median from the readings of the sensors as well as the touch screen readings. The highlights and contributions can be summarized as follows: • We apply deep neural networks on a large dataset and model the problem as a binary classification problem, rather than a one-class problem, which was studied in [10]. • We explore the effect of different types of modalities, such as touch-screen and motion sensors in identifying users. • Our results show that we can identify users with 88% accuracy on a challenging dataset. Although results were similar, the combination of touch (scroll) and gyroscope features revealed a slightly better performance. Rest of the paper is organized as follows: In Section 2, we present the related work and how our method differs from the related studies. Section 3 presents our methodology particularly the parameters and experiments considered in this study. In Section 4, we present the results of model evaluation and discuss our findings. Finally, Section 5 concludes the paper and includes future studies.
2. Related Work Biometrics are mainly grouped into two categories: behavioral and physical biometrics. Physical biometrics are based on physical attributes of a person, such as a retina or an iris scan and fingerprint, etc. As mentioned, behavioral
Hasan Can Volaka et al. / Procedia Computer Science 155 (2019) 177–184 Hasan Can Volaka et al. / Procedia Computer Science 00 (2018) 000–000
179 3
biometrics are based on a persons behavior and analysis of a person’s handwriting, timing keystroke and usage style, etc. There are four broad categories of studies focusing on continuous authentication: i) keystroke-based authentication, ii) touchscreen-based authentication, iii) sensor-based authentication, iv) multi-modal authentication. Keystroke based authentication mainly focuses on the analysis of typing motions of users. However, in the utilized dataset keystrokedata was not recorded. HMOG (Hand Movement, Orientation and Grasp) dataset [10] includes recordings both from touch-screen and sensors. Accelerometer, gyroscope and magnetometer readings and tap-based features, such as x-y coordinates, finger covered area, pressure, etc., are collected from 100 smartphone users with 24 sessions. Besides the touchscreen related data, authors propose a new set of features, which are derived from micro-movements, obtained from accelerometer, gyroscope and magnetometer sensors data generated while users interact with the touchscreen. Feature selection, feature transformation with principal component analysis (PCA) and outlier removal are performed on these feature sets. They achieved EER of 15.1% using HMOG features combined with tap features. In [3], authors propose a new multi-modal biometric authentication model which is based on the features which are collected while the user slide-unlocks the smartphone to answer a call. The features were populated by slide/swipe, arm movements of the user answering a call (accelerometer, gyroscope, orientation sensors) and voice recognition. The complete system consists of four parts: slide movement recognition, pickup movement recognition, voice recognition, and fusion. Twenty-six participants (sixteen male and ten female) were recruited in various ages. Each participant performs at least 20 swipe, 20 pick-ups, and 10 voice sample. They applied to the feature set one-class Bayes-Net, one-class random forest, and one-class sequential minimal optimization (SMO) classifiers. They achieved best results with the naive Bayes network classifier with a FAR of 11.01% and an FRR of 4.12%. In recent studies, deep learning methods for continuous authentication are also utilized. In [5], a Siamese convolutional neural network is used to learn the signatures of the motion patterns from users. The network is used for deep feature extraction, while 1-class SVM is used for classification. They report verification accuracy up to 97.8% on a dataset [11] consisting of 100 users. Similarly in [4], the feature extraction process is based on a deep learning autoencoder, using only accelerometer data. 2.2% EER was reported for a re-authentication time of 20 seconds. In another study [7], Kernel Deep Regression Network is used for both feature extraction and classification. They report 0.121% EER for inter-week authentication on Touchalytics dataset [8]. However, this dataset includes only touchscreen data. In another recent study [2], HMOG dataset is utilized, and a deep learning framework For user fingerprinting is proposed. However, they achieved 81.32% accuracy using LSTM which is lower than our findings.
3. Methodology In this section, we describe the HMOG dataset, along with the details of how we processed the dataset, which features are used and how we created our deep network. 3.1. Dataset Overview As mentioned, we utilized the HMOG dataset [10] which is focused on continuous authentication studying hand movements, orientation and grasp. They collected a great amount of data from 100 participants during 8 free text typing sessions on Samsung Galaxy S4 phones. In this research, we used scroll event and sensor data from HMOG dataset to build a deep learning network. The data is sparsely diverse since the HMOG experiment was performed with 100 participants with different characteristics. When trying to track the original owner of the phone, it is crucial to have non-real user data as well. Hence, we focus on a two-class classification problem. In HMOG paper [10], they used one-class classification considering that it may not be possible to have non-user data. Regarding their research, they trained and tested the model only with correspondent users. Hence, they tried to resolve user authentication detecting the outliers. On the contrary, we have studied binary classification, as mentioned. We calculated a model for each user, using all their data and randomly picked data from other participants. Randomly sampled data has helped us to create a diverse user model which has enabled us building a more complex model.
Hasan Can Volaka et al. / Procedia Computer Science 155 (2019) 177–184 Hasan Can Volaka et al. / Procedia Computer Science 00 (2018) 000–000
180 4
3.2. Data Preprocessing HMOG dataset contains 24 sessions for each user. Each of these sessions includes sensor-based data and touch screen events, so they are time series data by nature. We need to preprocess data to use machine learning models. LSTM network can work with time series data, but in this paper, we studied the dataset with features extracted as structured data. For each user session, sensor data and scroll (touch) data is merged to understand phone micro-movements for each scroll event. This merging helped us to create a table with scroll data and their sensor reflections. Also, we separated sensor data which is not related to the scroll data. 3.2.1. Feature Extraction When working in a machine learning process, feature extraction is a crucial step to identify information of data. With feature extraction, we convert a time series data to a structured dataset on which our model can find relations between those features. Then these features become the identifiers of our classes. The sensor data; accelerometer, gyroscope, and magnetometer, are provided in time series vector so we should convert every time series to an action which becomes our input data. For calculation of our features from raw sensor data, we used median, mean, and standard deviation. One may argue that, extraction of features is unnecessary when working with deep learning algorithms. Since we are interested in developing a real-time continuous authentication scheme, rather than transmitting raw data to a server where machine learning algorithm runs, we export features as the summary of the data to reduce the data size to be transferred. Then scroll or touch-screen features are calculated using the features reported in Touchalytics study[8]. We can summarize those features as follows: • From X-Axis of scroll event, we take the first and the last value, we calculated maximum deviation and 20%, 50% and 80% deviations. • From Y-Axis of scroll event, we take the first and the last value, we calculated maximum deviation and 20%, 50% and 80% deviations. • We calculated 20%, 50% and 80% pairwise displacement, average, and median of the last three points of the Velocity. • We calculated 20%, 50% and 80% pairwise displacement, and median of the first five points of the Acceleration. • The length of trajectory. • Duration of scroll event • Median of finger pressure and finger size • Distance of scroll trajectory, • Direction of end-to-end line After the feature extraction, our dataset resulted in 66 features and one label which implies if the user is genuine or not. 3.2.2. Data Cleaning Since we have structured numeric data, edge numbers as infinite or not numbers (NaN) can cause problems. Those cases were actually observed through the dataset. As tested with those cases, the deep network failed to learn, and model training step failed. So, data cleaning was a critical step. After cleaning the data, the next step of data preparation is normalizing the data. Normalization of data can be crucial sometimes as we are handling a numerical dataset. If the difference between data in feature sets is more than 3 or 4 decimal points, this can affect the overall accuracy. As a result of our tests with normalized and un-normalized datasets, we observed that overall accuracy can differ up to 3%. In our research, we have used min-max normalization and the data is normalized between 0 and 1. Since we have a significant number of experimenters and we used the leave-one-out method to create our datasets for each user, the number of instances from non-genuine users can be more than the number of instances of real-user. For that reason, we sampled non-genuine users to reduce the number in the dataset and sampled them randomly to not
Hasan Can Volaka et al. / Procedia Computer Science 155 (2019) 177–184 Hasan Can Volaka et al. / Procedia Computer Science 00 (2018) 000–000
181 5
to lose the generalization of non-users’ patterns. After the sampling, each dataset has approximately ten thousands of row. The ratio of genuine users to non-genuine ones is 1 to 3. 3.3. Data splitting To train, validate and test our models, we created three different sub-datasets for each dataset. Since we had approximately ten thousands of rows, we did not want to lower the number of rows in the training subsets. That is why as a first step, we sampled our dataset with 10% to create our validation subset which will validate our trained model in every epoch. Then we sampled our remaining subset with 10% again to create our test subset to evaluate the performance of the created model after training phase. 3.4. Classification and Performance Metrics In this study, we used a deep learning network to identify user actions. Deep learning and neural networks are generally used with image, audio and text datasets but recently they are also used in structured data. HMOG study used one classifier SVM to find outliners; on the contrary, we used a binary classification to make our predictions. As the initial investigation, we aimed to explore how a primary deep learning network would give a model, that is why our we have created our model with three dense layers. After some shallow tests, we decided to train our every data with two different networks which have 64 and 128 nodes in each layer. Then we batched our data as tensors with shape (8192,feature count) and ran the network with 200 epochs. We collected accuracy(1), mean absolute error and mean square error for train and validation test results and f1(4), accuracy, recall(3), and precision(2) measurements and also true positive, false positive, true negative and false negative counts for test predictions. Then we calculated false acceptance rate(5) and false rejection rate(6), and equal error rate by calculating their mean. These metrics are commonly used in the literature. ACCURACY =
T P + FP T P + FP + T N + FN
(1)
PRECIS ION =
TP T P + FP
(2)
RECALL = F1 = 2 ∗
TP T P + FN
precision ∗ recall precision + recall
(3) (4)
FAR =
FP FP + T N
(5)
FRR =
FN FN + T P
(6)
4. Model Performance Evaluation In this section, we present and explain the results obtained from our testing process. As mentioned in Section 3, we divided our dataset into four datasets which are created by only scroll (touch-screen) features, accelerometer plus scroll features, gyroscope plus scroll features, and all sensors plus scroll features. As mentioned in Section 3.4, we build our deep network with three layers and we created the layers with 64 and 128 nodes. In the following, for each figure we display results for four different datasets, and we have four figures displaying test accuracy, test precision, test EER and test F1 metric. For each dataset, we also present the results for 64 nodes and 128 nodes. Although we present the average results, because of the diversity of user characteristics, we also discuss the minimum and the maximum values of each metric in the following subsections where we analyze the results for each feature-specific dataset and evaluate the model performance.
Can Volaka et al. / Procedia Computer 155000–000 (2019) 177–184 HasanHasan Can Volaka et al. / Procedia Computer ScienceScience 00 (2018)
182 6
EER
F1
0.35
1 0.96 0.92 0.88 0.84
0.3 0.25 0.2
0.8 0.76 0.72
0.15 0.1 0.05 0
ALL
GYR+SCROLL 64
ACC+SCROLL
SCROLL
0.68 0.64 0.6
ALL
128
GYR+SCROLL 64
(a) Accuracy
ACC+SCROLL
SCROLL
128
(b) Precision Fig. 1. EER & F1 Metrics of Test Results
PRECISION
ACCURACY 1
1
0.98 0.96 0.94 0.92
0.96 0.92 0.88 0.84
0.9 0.88 0.86
0.8 0.76 0.72
0.84 0.82 0.8
0.68 0.64 0.6 ALL
GYR+SCROLL 64
ACC+SCROLL
SCROLL
ALL
GYR+SCROLL 64
128
(a) Accuracy
ACC+SCROLL
SCROLL
128
(b) Precision Fig. 2. Accuracy & Precision of Test Results
4.1. Scroll Features In this section, we only used the features which are discussed in feature extraction section 3.2.1 to train our model. For each user, these features are calculated. Then those features are fed to the deep network as inputs and trained for 200 epochs. As mentioned, for each user we have data coming from the user as well as from other users, sampled randomly. In Figure 1(a), we can see the results for EER metric on the test data. When we first analyze the average results, we can see that considering the two deep networks (regarding their node counts), the results are nearly the same, except the network with 128 nodes is slightly low. For performance requirements, it may be better to work with networks with smaller size. Also when we compare these results with HMOG dataset, the 128 nodes network has given similar results with their results where they used only HMOG features; on the contrary when we look at our smallest EER result which is 0.5%, only using the scroll features revealed better than similar works. In Figure 2(a), we can see the accuracy results of test data. The test accuracy of our model which is created with only scroll features, is between 76% and 99%. Average of test accuracy is 88%. Since we are studying continuous authentication, one of the most important metrics is the precision. Precision is the ratio of true positives to all true predictions labeled as true, including false positives. In an authentication solution, not authenticating the user is not crucial but authenticating the malicious user is a serious problem. That is the reason, in our results, the precision metric is as essential as the EER metric.
Hasan Can Volaka et al. / Procedia Computer Science 155 (2019) 177–184 Hasan Can Volaka et al. / Procedia Computer Science 00 (2018) 000–000
183 7
In Figure2(b), we can see precision results of our test data. When we analyze the precision metric for scroll data training, we can see the values vary between 58% and 98% and with an average of 78-79%. With the best result, we authenticate only two persons out of 100 incorrectly. This result shows us, with enough data, we can keep precision as high as we would need in an authentication scenario. 4.2. Accelerometer & Scroll Features First of all, we analyze the average results of EER again. With this dataset, we can see that network with 128 nodes has lower EER values comparing to the networks with 64 nodes. With this dataset, change in node count affects the result significantly. The precision for this model is between 57% and 99%, and the average precision value is 79%. If we compare this model with the model created with only scroll features, the results are very similar, but we see that the accelerometer data lowered precision by 2%. The accuracy of this model is between 78% and 99% and the average of it is 89%. The node count change affects the accuracy slightly. Regarding these first results, for a network with three dense layers, accelerometer data makes a very slight change, 0.1% for accuracy. For other metrics, the change is approximately 0.01% with the network with 128 nodes. 4.3. Gyroscope & Scroll Features We start by analyzing the average results of EER. We can see that network with 128 nodes has the lowest EER values comparing to the other networks. With this dataset, the network with the 128 nodes achieved the best result. The precision for this model is between 59% and 99%, and the average precision value is 80-81%. If we compare this model with the other two models that we discussed, this model exhibited a result approximately 3 points higher comparing to the other models. The accuracy of this model is between 78% and 99.7% and the average is 89-90%. The node count change affects the accuracy slightly. But again, we have the best average result compared to the other networks. 4.4. All Sensor Features & Scroll Features When we look at the results of the model which we trained with all the features, we can see that the results did not increase. We can see that network with 128 nodes still has the lower EER value compared to the model with 64 nodes. We have the minimum EER as 0.4% and the maximum EER as 36%, and the average is 17%. The precision for this model is between 53-58% and 99%, and the average precision value is 78%. The accuracy of this model is between 76% and 99.7% and the average of it is 88%. 4.5. Comparison of All Datasets By analyzing these first results with a basic deep network with three layers, we can say that more feature does not mean better results. We want to say that we tried to feed all the features and data without feature selection algorithms. With these comparisons and the HMOG dataset, a deep network made a better success with gyroscope and scroll features. As a result of the evaluations and demonstration of the results, we observe that the dataset which has the gyroscope features and the scroll features performed better than the other datasets. However, when we analyze our user results from one by one, we can see that in all feature-split datasets, there are users who performed with 99% accuracy and 98-99% precision, and nearly 0% EER. Since we aim to create a passive continuous authentication system using user behaviors, more data means better recognition. In HMOG study, they removed 20 persons because of their not quality data; on the contrary, we wanted to use these persons to create a more realistic solution.
184 8
Hasan Can Volaka et al. / Procedia Computer Science 155 (2019) 177–184 Hasan Can Volaka et al. / Procedia Computer Science 00 (2018) 000–000
5. Conclusion and Future Work In this paper we investigated the use of deep neural networks for continuous authentication on mobile phones on a challenging dataset, that consists of data from 100 users over 24 sessions. Rather than modelling the problem as a 1-class problem, we trained a model that tries to identify users as well as non-users. We particularly analyzed the performance with different data modalities: only using touch-screen data, only using data from motion sensors and their combinations. We also trained different networks with different node sizes. Experimental results show that, when different modalities are combined, we achieve an average accuracy of 88% while 15% EER is achieved. As a future work, we plan to utilized different types of networks on the same dataset, such as autoencoders. Currently we are collecting our own dataset with a logger integrated in a mobile banking app and we plan to have a real-time classification framework using deep networks. Acknowledgements This work is supported by Tubitak under grant number 5170078 and by the Galatasaray University Research Fund under grant number 17.401.004. References [1] Alzubaidi, A., Kalita, J., 2016. Authentication of smartphone users using behavioral biometrics. IEEE Communications Surveys Tutorials 18, 1998–2026. doi:10.1109/COMST.2016.2537748. [2] Amini, S., Noroozi, V., Bahaadini, S., Yu, P.S., Kanich, C., 2018. Deepfp: A deep learning framework for user fingerprinting via mobile motion sensors, in: 2018 IEEE International Conference on Big Data (Big Data), pp. 84–91. doi:10.1109/BigData.2018.8622372. [3] Buriro, A., Crispo, B., Del Frari, F., Klardie, J., Wrona, K., 2016. Itsme: Multi-modal and unobtrusive behavioural user authentication for smartphones, in: Stajano, F., Mjølsnes, S.F., Jenkinson, G., Thorsheim, P. (Eds.), Technology and Practice of Passwords, Springer International Publishing, Cham. pp. 45–61. [4] Centeno, M.P., v. Moorsel, A., Castruccio, S., 2017. Smartphone continuous authentication using deep learning autoencoders, in: 2017 15th Annual Conference on Privacy, Security and Trust (PST), pp. 147–1478. doi:10.1109/PST.2017.00026. [5] Centeno, M.P.n., Guan, Y., van Moorsel, A., 2018. Mobile based continuous authentication using deep features, in: Proceedings of the 2Nd International Workshop on Embedded and Mobile Deep Learning, ACM, New York, NY, USA. pp. 19–24. URL: http://doi.acm.org/ 10.1145/3212725.3212732, doi:10.1145/3212725.3212732. [6] Center, P.R., accessed April 2019. Smartphone ownership is growing rapidly around the world, but not always equally. URL: https://www. pewglobal.org/2019/02/05/smartphone-ownership-is-growing-rapidly-around-the-world-but-not-always-equally/. [7] Chang, I., Low, C., Choi, S., Teoh, A.B., 2018. Kernel deep regression network for touch-stroke dynamics authentication. IEEE Signal Processing Letters 25, 1109–1113. doi:10.1109/LSP.2018.2846050. [8] Frank, M., Biedert, R., Ma, E., Martinovic, I., Song, D., 2013. Touchalytics: On the applicability of touchscreen input as a behavioral biometric for continuous authentication. Information Forensics and Security, IEEE Transactions on 8, 136 –148. doi:http://dx.doi.org/10.1109/ TIFS.2012.2225048. [9] Patel, V.M., Chellappa, R., Chandra, D., Barbello, B., 2016. Continuous user authentication on mobile devices: Recent progress and remaining challenges. IEEE Signal Processing Magazine 33, 49–61. doi:10.1109/MSP.2016.2555335. [10] Sitov, Z., Åednka, J., Yang, Q., Peng, G., Zhou, G., Gasti, P., Balagani, K.S., 2016. Hmog: New behavioral biometric features for continuous authentication of smartphone users. IEEE Transactions on Information Forensics and Security 11, 877–892. doi:10.1109/TIFS.2015. 2506542. [11] Yang, Q., Peng, G., Nguyen, D.T., Qi, X., Zhou, G., Sitov´a, Z., Gasti, P., Balagani, K.S., 2014. A multimodal data set for evaluating continuous authentication performance in smartphones, in: Proceedings of the 12th ACM Conference on Embedded Network Sensor Systems, ACM, New York, NY, USA. pp. 358–359. URL: http://doi.acm.org/10.1145/2668332.2668366, doi:10.1145/2668332.2668366.