Detection of Cyber-attacks to indoor real time localization systems for autonomous robots




Accepted Manuscript

Ángel Manuel Guerrero-Higueras, Noemí DeCastro-García, Vicente Matellán

PII: S0921-8890(17)30283-X
DOI: https://doi.org/10.1016/j.robot.2017.10.006
Reference: ROBOT 2929

To appear in: Robotics and Autonomous Systems

Please cite this article as: Á.M. Guerrero-Higueras, N. DeCastro-García, V. Matellán, Detection of Cyber-attacks to indoor real time localization systems for autonomous robots, Robotics and Autonomous Systems (2017), https://doi.org/10.1016/j.robot.2017.10.006

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Ángel Manuel Guerrero-Higueras (a,∗), Noemí DeCastro-García (b), Vicente Matellán (a)

(a) Research Institute on Applied Sciences in Cybersecurity, Universidad de León, Av. de los Jesuitas s/n, ES-24008 León (Spain)
(b) Department of Mathematics, Universidad de León, León (Spain)

Abstract

Cyber-security for robotic systems is a growing concern. Many mobile robots rely heavily on Real Time Location Systems to operate safely in different environments. As a result, Real Time Location Systems have become a vector of attack for robots and autonomous systems, a situation which has not been studied well. This article shows that cyber-attacks on Real Time Location Systems can be detected by a system built using supervised learning. Furthermore, it shows that some types of cyber-attacks on Real Time Location Systems, specifically Denial of Service and Spoofing, can be detected by a system built using Machine Learning techniques. In order to construct models capable of detecting those attacks, different supervised learning algorithms have been tested and validated using a dataset of real data recorded by a wheeled robot and a commercial Real Time Location System based on Ultra Wideband beacons. Experimental results with a cross-validation analysis have shown that Multi-Layer Perceptron classifiers get the highest test score and the lowest validation error. Moreover, it is the model with the least overfitting and the highest sensitivity for detecting Denial of Service and Spoofing cyber-attacks on Real Time Location Systems.

Keywords: Cyber-security, Indoor Positioning, Robotics, Cyber-attack, Beacon, Machine Learning


1. Introduction


Cyber-security of Cyber-physical Systems (CPSs) [1] has become an essential requirement. Specifically, cyber-security of autonomous systems is being increasingly scrutinized [2]. This is particularly worrying in critical areas such as medical or defense systems, where security and safety problems are a growing concern [3]. Conventional Intrusion Detection Systems (IDSs) are not usually suitable for autonomous systems, because they often do not take into account physical aspects such as mobility or energy consumption. There is also an increasing interest in the cyber-security of robotic systems. For instance, [4] proposes a method based on the Cumulative Sum (CUSUM) algorithm for detecting stealthy attacks on a robotic system. In [5], a method to detect cyber-attacks on robots is proposed that uses the data gathered by the on-board systems and processes to improve IDS performance.

Real Time Location Systems (RTLSs) are critical components of many robotic systems. For example, to solve autonomous navigation in mobile vehicles, which has been one of the classical problems in robotics, RTLSs are used by robotic systems to obtain their relative position on a given map, which lets them calculate trajectories, plan

∗ Corresponding author. Email addresses: [email protected] (Ángel Manuel Guerrero-Higueras), [email protected] (Noemí DeCastro-García), [email protected] (Vicente Matellán)

Preprint submitted to Robotics and Autonomous Systems


next actions, etc.

Several technologies have been proposed for self-locating robots. Simultaneous Localization and Mapping (SLAM) [6] has been one of the hot topics in robotics for many years (visual SLAM, laser SLAM, etc.). Although efficient algorithms have been developed to solve the SLAM problem, they demand considerable computing power, which is not usually available in commercial robots. Many industrial applications of mobile robots rely on external RTLSs instead of using self-localization techniques. This makes RTLSs a vector of cyber-attacks for robotic systems. Mechanisms for detecting cyber-attacks and methods for deploying more resilient RTLSs have to be provided. Besides, these methods have to adapt to the different technologies used to implement RTLSs: Global Positioning System (GPS), UWB-based systems, ultrasound-based systems, etc.

Cyber-attacks on outdoor RTLSs have been widely reported. For instance, attacks on GPS have been recently analyzed in [7]. However, little research on cyber-security of RTLSs for indoor environments, also known as Indoor Positioning Systems (IPSs), can be found in the literature. IPSs can be implemented using different technologies [8] and properties: time of flight, signal strength, angle of arrival, region inclusion, hop count, neighbor location, etc. Different types of attacks on these technologies have already been described in [9] (forced multi-path, speed-up attacks, delayed transmissions, locally elevated ambient channels, jamming, replay, modification, etc.), which also proposes statistical methods to make localization attack-tolerant.

July 31, 2017


UWB-based location systems are the most popular IPSs for mobile robots [10] because they offer a computationally affordable solution for indoor localization. UWB-based location systems estimate the position using the range between a mobile transceiver in the robot and several beacons placed in known positions in the surroundings. We will use a UWB-based IPS in the empirical evaluation of our proposal, although it can be generalized to other technologies.

UWB beacons must be distributed in a way which allows the mobile transceiver to receive measurements from several beacons at every moment. Usually, at least three simultaneous measurements are needed to estimate a 2D location, and four for a 3D location. A correct distribution of the beacons is therefore essential for the mobile transceiver to estimate the position accurately, as shown in [15].

The beacons are the main vulnerability of UWB-based IPSs because they are physically exposed. For instance, Denial of Service (DoS) attacks on one or more of the beacons are easy to carry out. Jamming or spoofing the signal of the beacons can also affect position estimation by the mobile transceiver.

These attacks (DoS and Spoofing) are very similar to the ones faced by Wireless Sensor Networks (WSNs), for which different detection methods have been proposed. For instance, [11] presents an algorithm for detecting attacks based on the analysis of anomalies. In this case, the monitoring nodes added to the network detect abrupt changes in the sequence of packets exchanged using a CUSUM algorithm.
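To make the range-based estimate described above concrete, the following is a minimal sketch of 2D trilateration from exactly three beacons. It is an illustration only: the function name `trilaterate_2d`, the beacon coordinates, and the noise-free ranges are assumptions introduced here, and the manuscript does not describe the algorithm that any particular commercial RTLS uses internally.

```python
import math


def trilaterate_2d(anchors, ranges):
    """Estimate (x, y) from three anchor positions and three measured ranges.

    Hypothetical helper for illustration; not taken from the manuscript.
    """
    (x0, y0), (x1, y1), (x2, y2) = anchors
    d0, d1, d2 = ranges
    # Subtracting the first circle equation from the other two cancels the
    # quadratic terms and leaves a 2x2 linear system in x and y.
    a11, a12 = 2 * (x1 - x0), 2 * (y1 - y0)
    a21, a22 = 2 * (x2 - x0), 2 * (y2 - y0)
    b1 = d0**2 - d1**2 + x1**2 - x0**2 + y1**2 - y0**2
    b2 = d0**2 - d2**2 + x2**2 - x0**2 + y2**2 - y0**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; the position is not unique")
    # Cramer's rule for the 2x2 system.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


# Assumed beacon layout and transceiver position (metres), noise-free ranges.
anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
true_pos = (1.0, 1.0)
ranges = [math.hypot(ax - true_pos[0], ay - true_pos[1]) for ax, ay in anchors]
estimate = trilaterate_2d(anchors, ranges)
```

With real measurements the ranges are noisy, so more than three beacons and a least-squares solution would normally be preferred; the sketch only shows why three ranges suffice in 2D and why a missing or falsified range directly distorts the estimate.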
The resilience of RTLSs to DoS and distance Spoofing attacks has also been analyzed in the literature, and different methods for secure positioning have been proposed. [12] proposed a statistical method for position verification, named Verifiable Multilateration (VM), that determines the position of a mobile transceiver from a set of known reference points, using the distances between the reference points and the device. A major limitation of this system is that the position of the reference points has to be given by a ground truth system when using the system.

The main goal of the work described in this paper is to create models that classify the integrity of the location information received by the RTLS in order to detect two types of cyber-attacks: DoS and Spoofing. These models will allow creating, in future work, an alert system that can provide robots with the prescriptive instruction to omit location estimates received when a cyber-attack has been detected.

The proposal is somewhat similar to the one given in [11]: look for anomalies in the measurements received by the transceiver. But in our case it is a centralized process, not a distributed one, and we are not analyzing the sequence of packets, but their content. So, the first specific question faced in this study is whether it would be possible to detect some types of attacks by analyzing only the measurements received by the on-board transceiver. This may

not be easy, since RTLS precision errors could easily be mistaken for a cyber-attack.

Supervised learning methods are the most appropriate techniques for this problem. We can provide training data containing explicitly labeled examples for known points, and these models will be able to produce predictions for all points that were not visited. Since classification is limited to two categories ("attack" and "no attack") and because of the nature of the chosen features, we have evaluated the capabilities of Machine Learning techniques with eight well-known classifier and predictor algorithms that are commonly used in classification and prediction problems: Adaptive Boosting (AB), Classification And Regression Tree (CART), K-Nearest Neighbors (KNN), Linear Discriminant Analysis (LDA), Logistic Regression (LR), Multi-Layer Perceptron (MLP), Naive Bayes (NB), and Random Forest (RF).

The final decision regarding the best model for our objective is based on the evaluation of different metrics: test accuracy (ability to classify using test data), validation accuracy (ability to classify using data other than the test data), and the confusion matrix. The last of these allows measuring the sensitivity of the model, which is essential in a cyber-security context due to the consequences of wrongly classified attacks. As in the method proposed in [12], we need ground truth data in the training phase, but once the models have been built, it is no longer needed.

The rest of the paper is organized as follows: Section 2 describes the empirical evaluation of the algorithms, presenting the experimental environment, materials, and methods used. Section 3 summarizes the results of the evaluation. The discussion of the results is developed in Section 4. Section 5 presents the conclusions and future lines of research.

2. Materials and methods

In order to construct our models, a set of experiments was carried out. They were conducted in an indoor mock-up apartment located at the mobile robotics lab of the University of León (Spain), shown in Fig 1. The experiments consisted of performing DoS and Spoofing attacks on the beacons of a commercial RTLS used by an autonomous robot to estimate its position. The location estimates provided by the RTLS were recorded. These location estimates were used to train, test, and later validate several learning algorithms.

We have evaluated eight supervised learning algorithms and incorporated cross-validation to choose the best model for an optimal generalization. The implementation of the evaluated algorithms has been based on the Scikit-learn [13] Python module. A four-step methodology, summarized in Fig 2, has been used to evaluate the performance of the algorithms. Each step is explained in more detail below.

Figure 1: Floor plan of the mobile robotics lab. The light gray line shows the test trajectory. The dark gray line shows the validation trajectory. Red dots show the location of the anchors.

Figure 2: Methodology proposed (four steps: getting data, preparing data, models analysis, results evaluation).

Figure 3: Orby-One and KIO RTLS.

All datasets mentioned in the paper are available at a public git repository¹, as well as an implementation of the proposed method.


2.1. Getting data

Fig 3 shows the Orby-One robot used in the experiments, a commercial wheeled robot, model RB-1, developed by Robotnik². Orby-One has two arms, each with 7 degrees of freedom and 3 fingers, and a mobile base. The software controlling the robot is based on the Robot Operating System (ROS) framework [14].

A KIO RTLS commercial solution has been used in the experiments. KIO calculates the position of a mobile transceiver, called a tag, in a two- or three-dimensional space. In order to do so, KIO uses radio beacons, called anchors, that have to be distributed in known positions in the surroundings. The red dots in Fig 1 show the position of the 6 anchors used in these experiments. The distribution of the anchors has been chosen following the method shown in [15]. This previous work empirically demonstrated that there are statistically meaningful

¹ http://niebla.unileon.es/proyectos/publications/dataset-kio-rtls.git
² http://www.robotnik.es/robotnik-rb-manipulador/


differences in the data gathered by beacon-based RTLSs between the case when there is an attacker and the case when there is not, which can be used to detect attacks. In the evaluation, different alternatives to define the distribution of beacons were considered to see which one was more discriminant, i.e., which one had more differences.

Fig 3 shows a KIO tag on Orby-One and one of the KIO anchors, which in our case was located on the ceiling of the mock-up apartment. The size of the anchor has been enlarged in the picture (tags and anchors have the same size).

The location estimates provided by this system have an average error of ±30 cm according to the manufacturer's specifications. Calibration done by the authors of this paper in the mock-up apartment shows that the error is higher in some areas and lower in others, but on average the claims of the manufacturer are correct.

KIO has four types of anchors: A, B, C and D. Each of the six anchors used in the experiments has its own identifier: 408A, 411A, 408B, 412B, 501C and 401D. This means that two A-anchors, two B-anchors, one C-anchor and one D-anchor were used. The tag needs to receive signals from at least one A-, one B-, one C- and one D-anchor in order to calculate its 3D position. If only the 2D position is needed, just the signals of three anchors of different kinds are required for the calculation, which is the case


for the wheeled Orby-One robot. Anchor distribution over the study area can be seen in Fig 1. Only one signal from each of the A- and B-anchor types is processed, since the duplicated anchors are redundant and serve merely to extend coverage of the study area.

We developed two ROS nodes to publish the KIO outputs in a topic that the Orby-One robot can consume. First, the KIO-rtls-talker node receives the outputs produced by KIO and publishes them as ROS Point messages on the ROS "/kio_rtls_talker/stdout" topic. Then, the KIO-rtls-listener node consumes the ROS Point messages published by KIO-rtls-talker. Fig 4 shows the conceptual model of the exchange of messages through the "/kio_rtls_talker/stdout" topic. Both ROS nodes are available at the public git repository.
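The anchor-type rule stated in this section (one anchor of each of the four types for a 3D fix, three anchors of different types for a 2D fix) can be expressed as a small check. This is only a sketch under assumptions: the helper names are hypothetical, and the type is assumed to be the trailing letter of identifiers such as 408A, which matches the identifiers listed here but is not documented as a rule of the KIO system.

```python
ANCHOR_KINDS = {"A", "B", "C", "D"}


def anchor_kind(anchor_id):
    # Assumption: the last character of the identifier encodes the anchor type.
    return anchor_id[-1]


def can_estimate(visible_anchor_ids, dimensions):
    """Return True if the visible anchors satisfy the stated type requirement."""
    kinds = {anchor_kind(a) for a in visible_anchor_ids} & ANCHOR_KINDS
    if dimensions == 3:
        return kinds == ANCHOR_KINDS   # one A, one B, one C and one D
    if dimensions == 2:
        return len(kinds) >= 3         # three anchors of different kinds
    raise ValueError("dimensions must be 2 or 3")


all_anchors = ["408A", "411A", "408B", "412B", "501C", "401D"]

# Losing a redundant A-anchor leaves every type available.
without_408a = [a for a in all_anchors if a != "408A"]
# Losing the only C-anchor still allows a 2D estimate, but not a 3D one.
without_501c = [a for a in all_anchors if a != "501C"]
# Losing both the C- and the D-anchor leaves only two types.
without_c_and_d = [a for a in all_anchors if a not in ("501C", "401D")]
```

The three scenarios mirror the distinction between anchors with redundancy (A and B) and anchors without it (C and D).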


Figure 4: Conceptual model of the ROS "/kio_rtls_talker/stdout" topic. Dotted lines represent one-off communications. Continuous lines represent regular communications.

Two predefined trajectories were set for the robot in the study area, as shown in Fig 1: a test trajectory (light gray line in Fig 1) was used to build a training dataset, for training and testing the models; and a validation trajectory (dark gray line in Fig 1) was used to generate a different dataset to validate the models in a different location, ensuring generalization. The robot started in both cases at the point marked "0" and finished at the point marked "1". Data were recorded by the Orby-One robot moving through the apartment, remotely controlled, following the test and validation trajectories respectively. We created a different rosbag file³ every time Orby-One made the walk, saving the location estimates gathered by the KIO device for later analysis.

We repeated the test and validation trajectories 10 times each, so that 20 rosbag files were recorded. Orby-One takes about 72 seconds to finish the walk following the test trajectory, and about 40 seconds following the validation trajectory. Each test run yielded 270 location estimates on average, and each validation run 150.

The runs were recorded in three different scenarios: without suffering any attack, suffering a DoS attack, and suffering a Spoofing attack. DoS attacks were carried out by interrupting the signal of one or more radio beacons. Spoofing attacks were carried out by changing the signal

³ A rosbag file is equivalent to a recording of the state of the robot during a period and can be used as a data set.


of the radio beacons. The affected radio beacons were selected by looking for anchors with redundancy (A-anchors) and anchors without (C- and D-anchors), at different locations. These situations were labeled as:

A0. Orby-One made the walk without suffering any attack. 4 runs were conducted for the test and validation trajectories respectively.

A1. Orby-One made the walk suffering a DoS attack. 4 runs were conducted for the test trajectory by interrupting the signal of the 408A, 408B, 501C, and 401D anchors. 3 runs were conducted for the validation trajectory by interrupting the signal of the 408A, 501C, and 401D anchors.

A2. Orby-One made the walk suffering a Spoofing attack. 2 runs were made for the test trajectory by spoofing the signal of the 408A and 501C anchors. 3 runs were made for the validation trajectory by spoofing the signal of the 408A, 501C, and 401D anchors.

Location estimates contain the following 10 features: A-anchor identifier, distance from Tag to A-anchor, B-anchor identifier, distance from Tag to B-anchor, C-anchor identifier, distance from Tag to C-anchor, D-anchor identifier, distance from Tag to D-anchor, and X and Y coordinate estimates. Other features, such as tag identifier, timestamp, and Z coordinate, have not been considered.

2.2. Preparing data

In order to build the training dataset for DoS attacks, the A0 and A1 runs following the test trajectory (Fig 1) were joined. In the same way, to build the training dataset for Spoofing attacks, the A0 and A2 runs were joined. The validation datasets were built in an analogous way, but using the runs taken on the validation trajectory.

The training datasets were randomly split to obtain a training dataset, to fit the models, and a test dataset, to evaluate them. 80% of the data have been used for training and 20% for testing the models.

2.3. Models evaluation

We want to generate a model whose inputs are quantitative and qualitative, while its output is a discrete value: WA (without attack) or A (attack). Two types of learning algorithms can be used, classifiers and predictors, of which classifiers are the better fit. We have evaluated the following methods, which we think are the most promising ones. The implementation in the Scikit-learn library was used.
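As a rough illustration of the data preparation and model evaluation steps described in this section, the sketch below performs an 80/20 split, fits the eight classifier families with Scikit-learn defaults, and runs a 10-fold cross-validation. Several things here are assumptions rather than details confirmed by the manuscript: the data are synthetic stand-ins for the 10 recorded features (not the published dataset), `GaussianNB` is chosen as the Naive Bayes variant, the random seeds are arbitrary, and cross-validation is applied to the training portion only.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic placeholder data: 300 samples with 10 numeric features.
# Label 1 stands for "A" (attack) and 0 for "WA" (without attack).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
y = (X[:, 1] + X[:, 3] > 0).astype(int)

# 80% of the samples for training, 20% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Default hyper-parameters; random_state is fixed only for repeatability.
classifiers = {
    "AB": AdaBoostClassifier(random_state=0),
    "CART": DecisionTreeClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
    "LDA": LinearDiscriminantAnalysis(),
    "LR": LogisticRegression(),
    "MLP": MLPClassifier(random_state=0),
    "NB": GaussianNB(),  # assumed variant; the text only says "Naive Bayes"
    "RF": RandomForestClassifier(random_state=0),
}

test_scores = {}
cv_scores = {}
for name, clf in classifiers.items():
    # 10-fold cross-validation accuracy on the training portion.
    cv_scores[name] = cross_val_score(clf, X_train, y_train, cv=10)
    # Accuracy on the held-out 20% after fitting on the full training portion.
    test_scores[name] = clf.fit(X_train, y_train).score(X_test, y_test)
```

Scoring the same fitted models on a dataset recorded along a different trajectory would give the validation score used to compare the models.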


AB. Ensemble methods are techniques that combine different basic classifiers, turning a weak learner into a more accurate method. Boosting is one of the most successful types of ensemble methods, and AB is one of the most popular boosting algorithms.

This algorithm addresses the question of when an efficient weak learner can be "boosted" into an efficient strong learner. This question was raised in [16] and solved in a practical way in [17], where the AdaBoost algorithm is proposed.

For multi-class classification, the Scikit-learn AdaBoostClassifier implements AdaBoost-SAMME and AdaBoost-SAMME.R, see [18].


CART. A decision tree is a method which predicts the label associated with an instance by travelling from the root node of a tree to a leaf [19]. It is a non-parametric method in which the trees are grown in an iterative, top-down process. Many algorithms are available for decision trees, such as ID3 and C4.5, see [20] and [21]. Scikit-learn uses the CART algorithm, developed in [22], and produces either classification or regression trees, depending on whether the dependent variable is categorical or numeric, respectively.


KNN classifiers. Although nearest neighbors is the foundation of many other learning methods, notably unsupervised ones, supervised neighbor-based learning is also available to classify data with discrete labels. It is a non-parametric technique which classifies new observations based on their distance to the observations in the training set. A good presentation of the analysis is given in [23] and [24].

Scikit-learn implements KNeighborsClassifier, based on the k nearest neighbors of each query point, where k is an integer value specified by the user, equal to 5 by default.


LDA. LDA is a parametric method that assumes that the distributions of the data are multivariate Gaussian [24]. LDA also assumes knowledge of the population parameters; otherwise, the maximum likelihood estimator can be used. LDA uses Bayesian approaches to select the category which maximizes the conditional probability (see [25], [26] or [27]).

The default solver in Scikit-learn is Singular Value Decomposition (SVD). It can perform both classification and transformation, and does not rely on the calculation of the covariance matrix. This can be an advantage in situations in which the number of features is large.


LR. Linear methods are intended for regressions in which the target value is expected to be a linear combination of the input variables.

LR, despite its name, is a linear model for classification rather than regression. In this model, the probabilities describing the possible outcomes of a single trial are modeled using a logistic function. Scikit-learn implements LogisticRegression, whose default solver uses a Coordinate Descent (CD) algorithm [28].

MLP. An artificial neural network is a model inspired by the structure of the brain. Neural networks are used when the type of relationship between inputs and outputs is not known. The network is assumed to be organized in layers (input layer, output layer and hidden layers). An MLP consists of multiple layers of nodes in a directed graph, so that each layer is fully connected to the next one. An MLP is a modification of the standard linear perceptron, and its main strength is that it is able to distinguish data which is not linearly separable. An MLP uses back-propagation for training the network, see [29] and [30].

Class MLPClassifier in Scikit-learn implements an MLP algorithm that trains using back-propagation. By default, MLPClassifier has 100 neurons in the hidden layer, the activation function for the hidden layer is f(x) = max(0, x), and the solver for weight optimization is "adam" (see [31]).

NB. NB is based on applying Bayes' theorem with the "naive" assumption of independence between every pair of features, see [24] and [32]. The different classifiers differ in the assumptions they make about the distribution of the data. When we estimate the parameters using the maximum likelihood principle, the resulting classifier is called the Naive Bayes classifier. These classifiers require a small amount of training data to estimate the necessary parameters and can be extremely fast compared to more sophisticated methods.

RF. An RF [33] is a classifier consisting of a collection of decision trees, in which each tree is constructed by applying an algorithm to the training set and an additional random vector that is sampled via bootstrap re-sampling. The Scikit-learn implementation combines classifiers by calculating an average of their probabilistic predictions. The RF algorithm based on randomized decision trees (see [34]) is used.

2.4. Results analysis

A 10-iteration cross-validation analysis has been used for selecting the most suitable learning algorithm.
Moreover, the accuracy classification score has been used to evaluate the performance of the models. The accuracy classification score is computed as follows:

$$\text{accuracy} = \frac{\sum T_p + \sum T_n}{\sum \text{total data}}$$

where $\sum T_p$ is the number of true positives, and $\sum T_n$ is the number of true negatives. It is important to highlight that the accuracy classification score has been calculated for the training and validation datasets. We will focus on the accuracy on the validation dataset in order to look for a better generalization.

The three models with the highest accuracy classification score have been pre-selected for in-depth evaluation by considering the following Key Performance Indicators (KPIs): Precision ($P$), Recall ($R$), and F1 score, all of which were obtained through the confusion matrix. The Precision ($P$) is computed as follows:

$$P = \frac{\sum T_p}{\sum T_p + \sum F_p}$$

where $\sum F_p$ is the number of false positives. The Recall ($R$) is computed as follows:

$$R = \frac{\sum T_p}{\sum T_p + \sum F_n}$$

where $\sum F_n$ is the number of false negatives. These quantities are also related to the F1 score, which is defined as the harmonic mean of precision and recall:

$$F_1 = 2\,\frac{P \times R}{P + R}$$

| Classifier | Class | P | R | F1-score | #examples |
|---|---|---|---|---|---|
| MLP | A | 0.96 | 0.95 | 0.95 | 219 |
| MLP | WA | 0.95 | 0.96 | 0.96 | 229 |
| MLP | total | 0.96 | 0.96 | 0.96 | 448 |
| RF | A | 0.98 | 0.99 | 0.98 | 219 |
| RF | WA | 0.99 | 0.98 | 0.98 | 229 |
| RF | total | 0.98 | 0.98 | 0.98 | 448 |
| KNN | A | 0.96 | 0.97 | 0.97 | 219 |
| KNN | WA | 0.97 | 0.97 | 0.97 | 229 |
| KNN | total | 0.97 | 0.97 | 0.97 | 448 |

Table 2: Precision, recall and F1-score for the test dataset.
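The metrics defined in this section can be computed directly from the four counts of a binary confusion matrix. The following is a minimal sketch; the function name and the example counts are hypothetical and are not taken from the experiments.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/fn/tn are the numbers of true positives, false positives,
    false negatives and true negatives, with "attack" as the positive class.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = classification_metrics(tp=45, fp=5, fn=15, tn=35)
```

In this example, 45 of 60 real attacks are detected, so the recall is 0.75 even though the precision is 0.90. This gap is why recall (sensitivity) is reported separately from accuracy.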

3. Results

The results are shown for the developed models to detect DoS and Spoofing attacks respectively.

3.1. Models results for detecting DoS attacks

Table 1 shows the accuracy classification score for the test and validation datasets, considering DoS attacks. The highest scores for the validation dataset are highlighted in bold.

| Classifier | Test score | Validation score |
|---|---|---|
| MLP | 0.955357 | **0.947559** |
| RF | 0.984375 | **0.766727** |
| KNN | 0.966518 | **0.742315** |
| CART | 0.986607 | 0.478300 |
| LR | 0.964286 | 0.443038 |
| LDA | 0.975446 | 0.429476 |
| NB | 0.872768 | 0.429476 |
| AB | 0.979911 | 0.429476 |

Table 1: Accuracy classification score for DoS attacks.

Fig 5 shows the confusion matrix computed from the highlighted models (MLP, RF, and KNN) for the test dataset.

Figure 5: Confusion matrix for the MLP (left), RF (center), and KNN (right) classifiers evaluated using the test dataset for DoS attacks.

Table 2 shows the precision, recall and F1-score for the test dataset, evaluated from the highlighted models.

Fig 6 shows the confusion matrix computed from the highlighted models for the validation dataset.

Figure 6: Confusion matrix for the MLP (left), RF (center), and KNN (right) classifiers evaluated using the validation dataset for DoS attacks.

Table 3 shows the precision, recall and F1-score for the validation dataset, evaluated from the highlighted models.

| Classifier | Class | P | R | F1-score | #examples |
|---|---|---|---|---|---|
| MLP | A | 0.93 | 0.95 | 0.94 | 475 |
| MLP | WA | 0.96 | 0.94 | 0.95 | 631 |
| MLP | total | 0.95 | 0.95 | 0.95 | 1106 |
| RF | A | 0.65 | 1.00 | 0.79 | 475 |
| RF | WA | 1.00 | 0.59 | 0.74 | 631 |
| RF | total | 0.85 | 0.77 | 0.76 | 1106 |
| KNN | A | 0.63 | 0.98 | 0.77 | 475 |
| KNN | WA | 0.97 | 0.57 | 0.71 | 631 |
| KNN | total | 0.82 | 0.74 | 0.74 | 1106 |

Table 3: Precision, recall and F1-score for the validation dataset.

3.2. Models results for detecting Spoofing attacks

Table 4 shows the accuracy classification score for the test and validation datasets, considering Spoofing attacks. The highest scores for the validation dataset are highlighted in bold.

| Classifier | Test score | Validation score |
|---|---|---|
| MLP | 0.882222 | **0.768260** |
| LR | 0.848889 | **0.663661** |
| LDA | 0.846667 | **0.587917** |
| NB | 0.802222 | 0.567178 |
| KNN | 0.931111 | 0.537421 |
| RF | 0.977778 | 0.496844 |
| CART | 0.955556 | 0.486925 |
| AB | 0.928889 | 0.467087 |

Table 4: Accuracy classification score for Spoofing attacks.


Fig 7 shows the confusion matrix computed from the highlighted models (MLP, LR, and LDA) for the test dataset.

Figure 7: Confusion matrix for the MLP (left), LR (center), and LDA (right) classifiers evaluated using the test dataset for Spoofing attacks.

Table 5 shows the precision, recall and F1-score for the test dataset, evaluated from the highlighted models.

Classifier  Class  P     R     F1-score  #examples
MLP         A      0.98  0.51  0.67      106
MLP         WA     0.87  1.00  0.93      344
MLP         total  0.90  0.88  0.87      450
LR          A      0.98  0.99  0.98      106
LR          WA     0.99  0.98  0.98      344
LR          total  0.98  0.98  0.98      450
LDA         A      0.96  0.97  0.97      106
LDA         WA     0.97  0.97  0.97      344
LDA         total  0.97  0.97  0.97      450

Table 5: Precision, recall and F1-score for the test dataset.

Fig 8 shows the confusion matrix computed from the highlighted models for the validation dataset.

Figure 8: Confusion matrix for the MLP (left), LR (center), and LDA (right) classifiers evaluated using the validation dataset for Spoofing attacks.

Table 6 shows the precision, recall and F1-score for the validation dataset, evaluated from the highlighted models.

Classifier  Class  P     R     F1-score  #examples
MLP         A      0.90  0.52  0.66      478
MLP         WA     0.73  0.95  0.82      631
MLP         total  0.80  0.77  0.75      1109
LR          A      0.65  1.00  0.79      478
LR          WA     1.00  0.59  0.74      631
LR          total  0.85  0.77  0.76      1109
LDA         A      0.63  0.98  0.77      478
LDA         WA     0.97  0.57  0.71      631
LDA         total  0.82  0.74  0.74      1109

Table 6: Precision, recall and F1-score for the validation dataset.

4. Discussion

In general, Tables 1 and 4 show that the mean test score is greater for detecting DoS attacks (0.96) than for Spoofing attacks (0.896). The same holds for the mean validation accuracy: 0.5832 and 0.5719, respectively. However, the mean overfitting (the difference between the mean test and mean validation scores) is greater for DoS than for Spoofing, although the difference is small (0.37 and 0.32, respectively). Moreover, the maximum test accuracy, the highest validation score, and the lowest overfitting all appear for detecting DoS attacks. DoS models therefore perform better overall. This is because some of the chosen features take missing values under DoS attacks, which makes it easier to find patterns in the data. For instance, a DoS attack on the 408A anchor yields zero values for the A-anchor identifier and for the distance from the Tag to the A-anchor. This situation does not arise with Spoofing attacks, where data are changed rather than lost. This makes Spoofing attacks more difficult to detect when only paying attention to the location estimates of an RTLS.

If we focus on the models developed for detecting DoS attacks, Table 1 shows that the highest test accuracy corresponds to the CART classifier (score = 0.986607). However, it would not be a good choice because of its accuracy on the validation dataset (score = 0.478300), which means that this classifier does not work well with trajectories other than the test trajectory. Since generalization of the model is essential, we consider the three models with the highest validation score: MLP, RF, and KNN, of which the first presents the lowest overfitting.

Once the best generalizable models are considered, a deeper analysis with the confusion matrix of each one is given. Another important item that should be analyzed is the sensitivity of the model for detecting an attack, i.e., the rate of true attacks that the model classifies correctly. Figs 5 and 6, and Tables 2 and 3, show that the MLP classifier gets better values for precision (P), recall (R) and F1-score than RF and KNN in both the test and validation stages.

Focusing on the models developed for detecting Spoofing attacks, Table 4 shows that RF is the model with the highest test accuracy (score = 0.977778). However, it presents very high overfitting and so is not useful for generalization. MLP, LR, and LDA get the highest classification accuracy on the validation dataset. If we analyze these three models in detail, we can observe in Figs 7 and 8, and Tables 5 and 6, that again the MLP classifier achieves the best results.
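The precision, recall and F1-score columns reported in Tables 5 and 6 are linked by the usual definition of the F1-score as the harmonic mean of precision and recall. The following minimal sketch recomputes the MLP rows of Table 6 from the published per-class values. It assumes that the "total" row is the average of the per-class rows weighted by the number of examples, which the tables do not state explicitly; the function and variable names are illustrative only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_average(values, supports):
    """Average of per-class values weighted by the number of examples."""
    return sum(v * n for v, n in zip(values, supports)) / sum(supports)


# Per-class rows for the MLP classifier in Table 6 (validation dataset):
# class label -> (precision, recall, number of examples)
mlp_validation = {
    "A": (0.90, 0.52, 478),
    "WA": (0.73, 0.95, 631),
}

f1_per_class = {c: f1_score(p, r) for c, (p, r, _) in mlp_validation.items()}
supports = [n for (_, _, n) in mlp_validation.values()]

# Assumption: the "total" row is a support-weighted mean of the class rows.
total_precision = weighted_average(
    [p for (p, _, _) in mlp_validation.values()], supports
)
total_f1 = weighted_average(list(f1_per_class.values()), supports)

print(f1_per_class, total_precision, total_f1)
```

Because the inputs are already rounded to two decimals, the recomputed values can only be expected to match the table to within about 0.01.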

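The Discussion measures overfitting as the difference between the test score and the validation score, and selects a model by shortlisting the three highest validation scores and then taking the one with the smallest difference. The sketch below expresses both steps. The first part uses only figures quoted in the text; the second part runs the selection rule on toy scores with placeholder model names, which are hypothetical and are not results from Tables 1 or 4.

```python
def overfitting_gap(test_score, validation_score):
    """Test accuracy minus validation accuracy."""
    return test_score - validation_score


def select_model(scores, top_k=3):
    """Shortlist the top_k models by validation score, then return the
    shortlisted model with the smallest overfitting gap."""
    shortlist = sorted(scores, key=lambda m: scores[m][1], reverse=True)[:top_k]
    return min(shortlist, key=lambda m: overfitting_gap(*scores[m]))


# Mean (test, validation) scores quoted in the Discussion.
mean_scores = {"DoS": (0.96, 0.5832), "Spoofing": (0.896, 0.5719)}
gaps = {attack: overfitting_gap(t, v) for attack, (t, v) in mean_scores.items()}

# CART on DoS: highest test accuracy, but poor validation accuracy.
cart_gap = overfitting_gap(0.986607, 0.478300)

# Hypothetical toy scores (NOT taken from the paper), as (test, validation).
toy_scores = {
    "model_a": (0.99, 0.48),
    "model_b": (0.95, 0.70),
    "model_c": (0.97, 0.66),
    "model_d": (0.90, 0.68),
}
chosen = select_model(toy_scores)

print(gaps, cart_gap, chosen)
```

The recomputed mean gaps are close to the 0.37 and 0.32 quoted in the Discussion, and the CART gap of about 0.51 illustrates why the highest test accuracy alone is not a sufficient selection criterion.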

5. Conclusions and further work

The experiments carried out have shown that supervised learning algorithms can be used for building models to detect cyber-attacks on the KIO RTLS. The proposed analysis allows selecting the best models to build an attack detection system, specifically for DoS and Spoofing attacks. Although no hyper-parameter tuning has been carried out, the proposed analysis shows which models, a priori, get better results. Building an optimal and scalable attack detection system implies using as few models as possible to detect as many attacks as possible. In order to do so, we need to test as many models as possible and select the best ones. Once the best ones have been pre-selected, much work remains in tuning their hyper-parameters.

Regarding the chosen features, they all relate to the position estimate: the distances from the anchors (beacons) to the tag (mobile transceiver), and the estimated values for the X and Y coordinates. Any RTLS provides, at least, these features. It can therefore be asserted that the proposed method can be used with different RTLSs.

Relative to the attacks, DoS models present better performance because of the chosen features. Missing data on certain features help to detect DoS attacks. To improve the performance on Spoofing attacks, more features could be considered.

Concerning the algorithms, although some of them present good test accuracy, they do not perform well with the validation dataset, so they are not the most suitable to ensure optimal generalization. CART and RF, for DoS and Spoofing attacks respectively, have the highest test accuracy, but poor validation accuracy. If Orby-One, or any autonomous robot using an RTLS, were moving over a fixed trajectory, CART and RF models would be the best choices, but we assume that an autonomous robot moves "freely".

So, of all the models evaluated (AB, CART, KNN, LDA, LR, MLP, NB, and RF), the MLP classifier got the best results for both DoS and Spoofing attacks if we pay attention to the validation error to ensure optimal generalization. Once the best classifier for this problem has been selected, much work remains in optimizing the neural network, especially regarding the detection of Spoofing attacks.

Although the proposed analyses have been done to look for either type of attack, models can be used to detect DoS or Spoofing attacks independently. Train and validation sets are different for each one and can be used to build specific models.

Future work will be related to the construction of an alert system which allows the robot to omit an instruction which is suspected to be the result of a cyber-attack. This system will be developed in different stages:


1. Tuning the hyper-parameters of the models for detecting Spoofing attacks in order to obtain better results.
2. Studying the activation function of an MLP in order to determine in which situations an omission instruction should be sent to the robot.
3. Integrating the above-mentioned items in a system that runs efficiently on an autonomous robotic system.
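As an illustration of the second stage, the decision to send an omission instruction could be based on the output activation of the classifier. The sketch below assumes a binary MLP with a single logistic output unit whose value is read as the estimated probability of an attack. The function names and the threshold of 0.9 are hypothetical choices for illustration, not values proposed in this work.

```python
import math


def logistic(z):
    """Numerically stable logistic (sigmoid) activation, mapping z to (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def should_omit_instruction(output_activation, threshold=0.9):
    """Return True when the estimated attack probability reaches the threshold.

    output_activation: pre-activation value of the (assumed) single output unit.
    threshold: hypothetical probability above which the instruction is omitted.
    """
    return logistic(output_activation) >= threshold


# A strongly positive output is treated as an attack, a neutral one is not.
print(should_omit_instruction(5.0), should_omit_instruction(0.0))
```

A higher threshold makes the robot omit fewer legitimate instructions at the cost of missing more attacks, so in practice it would have to be chosen from the precision and recall observed on validation data.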

Moreover, it will be interesting to apply an algorithm that lets the robot update the Machine Learning model.
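One possible way to let the robot update its model is incremental (online) learning, in which the parameters are adjusted after each newly labelled sample instead of retraining from scratch. The sketch below shows a single stochastic gradient step for a logistic regression model, used here only as a simple stand-in: it is not the update algorithm of this work, and the feature values, learning rate, and function names are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, features):
    """Estimated probability that the sample corresponds to an attack."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def online_update(weights, bias, features, label, learning_rate=0.1):
    """One stochastic gradient step on the log-loss for a single sample.

    label is 1 for an attack and 0 otherwise.
    """
    error = predict_proba(weights, bias, features) - label
    new_weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
    new_bias = bias - learning_rate * error
    return new_weights, new_bias


# Hypothetical sample: two scaled features, labelled as an attack.
weights, bias = [0.0, 0.0], 0.0
sample, label = [1.0, 2.0], 1
before = predict_proba(weights, bias, sample)
weights, bias = online_update(weights, bias, sample, label)
after = predict_proba(weights, bias, sample)
print(before, after)
```

After the update, the estimated attack probability for the same sample increases, which is the behaviour expected from a step in the direction of the correct label.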


Acronyms

AB Adaptive Boosting.
CD Coordinate Descent.
CPS Cyber-physical System.
CUSUM Cumulative Sum.
DoS Denial of Service.
CART Classification And Regression Tree.
GPS Global Positioning System.
ICS Industrial Control System.
IDS Intrusion Detection System.
IPS Indoor Positioning System.
KNN K-Nearest Neighbors.
KPI Key Performance Indicator.
LDA Linear Discriminant Analysis.
LR Logistic Regression.
ML Machine Learning.
MLP Multi-Layer Perceptron.
NB Naive Bayes.
QDA Quadratic Discriminant Analysis.
RF Random Forest.
ROS Robot Operating System.
RTLS Real Time Location System.
SLAM Simultaneous Localization and Mapping.
SVD Singular Value Decomposition.
UWB Ultra Wideband.
VM Verifiable Multilateration.
WSN Wireless Sensor Network.

Ángel Manuel Guerrero-Higueras worked as an IT engineer at several companies in the private sector (2000–2016), worked as a researcher in the Atmospheric Physics Group at Universidad de León (2011–2013), and got his Ph.D. at the University of León in 2017. He currently works as a lecturer at Universidad de León and as a researcher at the Research Institute of Applied Science to Cyber-Security. His main research interests include robotic software architectures, cyber-security, and learning algorithms applied to robotics.

Noemí DeCastro-García received her M.Sc. degree in Mathematics from the University of Salamanca (Spain) in 2009 and her Ph.D. degree in Computational Engineering from the University of León in 2016. She currently holds a position as assistant professor in the Department of Mathematics at the School of Industrial and Computing Engineering of the University of León. She is also a researcher at the Research Institute of Applied Sciences in Cybersecurity. Her research focuses on different areas of cybersecurity. More specifically, she works with convolutional codes and systems from an algebraic approach, and with data analysis using computational methods such as machine learning and statistical data modeling.

Vicente Matellán got his Ph.D. at the Technical University of Madrid (1998), and worked as Assistant Professor at Carlos III University (1993-1999) and as Associate Professor at Rey Juan Carlos University (1999-2008). He currently holds the Telefónica Professorship at the Universidad de León, leading the Robotics Group (León, Spain), and he is also affiliated with the Research Institute of Applied Science to Cyber-Security. His main research interests include robotic software architectures, cyber-security, and artificial vision applied to robotics. He has published over 150 papers in journals, books, and conferences in these areas.

Highlights

• The use of supervised learning algorithms for building models to detect cyber-attacks on Real Time Location Systems is proposed.

• The proposed method shows that some types of cyber-attacks on Real Time Location Systems, specifically Denial of Service and Spoofing, can be detected by a system built using Machine Learning techniques.

• Eight well-known classifier and predictor algorithms have been evaluated: Adaptive Boosting, Classification And Regression Tree, K-Nearest Neighbors, Linear Discriminant Analysis, Logistic Regression, Multi-Layer Perceptron, Naive Bayes, and Random Forest.

• A cross-validation analysis has shown that Multi-Layer Perceptron classifiers get the highest test score and the lowest validation error.