Accepted Manuscript
A Hybrid Approach for Improving Unsupervised Fault Detection for Robotic Systems
Eliahu Khalastchi, Meir Kalech, Lior Rokach
PII: S0957-4174(17)30221-X
DOI: 10.1016/j.eswa.2017.03.058
Reference: ESWA 11216
To appear in: Expert Systems With Applications
Received date: 24 December 2016
Revised date: 3 March 2017
Accepted date: 25 March 2017
Please cite this article as: Eliahu Khalastchi, Meir Kalech, Lior Rokach, A Hybrid Approach for Improving Unsupervised Fault Detection for Robotic Systems, Expert Systems With Applications (2017), doi: 10.1016/j.eswa.2017.03.058
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Highlights
- From unsupervised to supervised learning of a fault detection model (for robots)
- Insights into why and when it becomes more accurate
- Theoretical analysis & a prediction tool
- Empirical results on 3 real-world domains that back these insights
A Hybrid Approach for Improving Unsupervised Fault Detection for Robotic Systems

Eliahu Khalastchi¹, Meir Kalech², Lior Rokach²
[email protected], [email protected], [email protected]
¹College of Management Academic Studies, ²Ben-Gurion University of the Negev

December 22, 2016
Abstract
The use of robots in our daily lives is increasing. As we rely more on robots, it becomes more important for us that the robots continue with their missions successfully. Unfortunately, these sophisticated, and sometimes very expensive, machines are susceptible to different kinds of faults. It becomes important to apply a Fault Detection (FD) mechanism which is suitable for the domain of robots. Two important requirements of such a mechanism are high accuracy and low computational load during operation (online). Supervised learning can potentially produce very accurate FD models, and if the learning takes place offline then the online computational load can be reduced. Yet, the domain of robots is characterized by the absence of the labeled data (e.g., "faulty", "normal") required by supervised approaches, and consequently, unsupervised approaches are used. In this paper we propose a hybrid approach: an unsupervised approach labels a data set, with a low degree of inaccuracy, and then the labeled data set is used offline by a supervised approach to produce an online FD model. Now we are faced with a choice: should we use the unsupervised or the hybrid fault detector? Seemingly, there is no way to validate the choice due to the absence of (a priori) labeled data. In this paper we give an insight into why, and a tool to predict when, the hybrid approach is more accurate. In particular, the main impacts of our work are: (1) we theoretically analyze the conditions under which the hybrid approach is expected to be more accurate; (2) our theoretical findings are backed with empirical analysis, using data sets of three different robotic domains: a high fidelity flight simulator, a laboratory robot, and a commercial Unmanned Aerial Vehicle (UAV); (3) we analyze how different unsupervised FD approaches are improved by the hybrid technique; and (4) we show how well this improvement fits our prediction tool. The significance of the hybrid approach and the prediction tool is the potential benefit to expert and intelligent systems in which labeled data is absent or expensive to create.

Keywords: Fault Detection, Robotic Systems, Unsupervised.
1. INTRODUCTION
In recent years we have witnessed a rapid increase in the use of robots. Recent reports by the International Federation of Robotics (IFR, 2016) describe yearly increases of 15% and 25% in sales of service and industrial robots respectively. Robots are used for tasks that are too dangerous, too dull, too dirty or too difficult to be done by humans. Among such tasks we find surveillance and patrolling (Agmon, Kraus, & Kaminka, 2008), aerial search (Goodrich, et al., 2008), rescue (Birk & Carpin, 2006) and mapping (Thrun, 2002).

Robots are complex entities comprised of both physical (hardware) and virtual (software) components. They operate in different dynamic physical environments with a varying degree of autonomy, e.g., satellites, Mars rovers, Unmanned Aerial, Ground or Underwater Vehicles (UAV, UGV, UUV), a robotic arm in an assembly line, etc. These sophisticated, and sometimes very expensive, machines are susceptible to a wide range of physical and virtual faults such as wear and tear, noise, or software-control failures (Steinbauer, 2013). If not detected in time, a fault can quickly deteriorate into a catastrophe, endangering the safety of the robot itself or its surroundings (Dhillon, 1991). For instance, an undetected engine fault in a UAV can cause it to stall and crash. Thus, robots should be equipped with Fault Detection (hereinafter FD) mechanisms that will allow recovery processes to take place.

Different types of robots necessitate different FD requirements that challenge FD approaches in different ways. In this paper we focus on the following requirements:

1. High accuracy. In particular, a low false positive (false alarm) rate and a high true positive (detection) rate are required. These requirements are especially important where a false alarm might lead to a very expensive mission abortion and an undetected fault might lead to mission failure, e.g., in an unmanned spacecraft mission. In addition, in the dynamic context (e.g., new missions, dynamic environments) under which robots operate, one cannot account for all possible faults a priori; unknown faults might occur and must be detected as well.

2. Small computational burden. This is required where the FD process has to be executed onboard the robot. Robots are already engaged in mission-oriented real-time processes, such as vision processing, which are very demanding and compete over system resources (CPU, memory). An additional computationally heavy FD process might burden the system and thus interfere with the robot's behavior. This requirement is especially important for autonomous robots where there is less dependency on remote supporting systems, e.g., a rover on Mars cannot wait 22 minutes for a server on Earth to communicate the fact that the rover has a critical fault.
There are types of robots that necessitate both of these requirements from an FD mechanism, e.g., a UAV flying solo in a part of its mission. Unfortunately, traditional approaches for FD in the literature are challenged to meet both of these requirements at once. Three general approaches are usually used for FD: Knowledge-Based systems, Model-Based, and Data-Driven approaches. Knowledge-Based systems (Akerkar & Sajja, 2010) typically associate recognized behaviors with predefined known faults. These approaches tend to have a small computational burden but are unable to detect unknown faults, and thus accuracy is compromised. Model-Based approaches (Isermann, 2005), on the other hand, are well equipped to detect unknown faults. Instead of modeling different faults, the expected (normal) behavior is modeled. High discrepancies between the model output and the observed system output are reported as detected faults. Yet, one must consider the practicality of the model construction, and its load on the robot's computational system during runtime. Data-Driven approaches (Hodge & Austin, 2004) (Anderson, Michalski, Carbonell, & Mitchell, 1986) are model free and have a natural appeal for detecting unknown faults. Online data processing may capture the current temporal context of the robot's operation and thus achieve better accuracy (Khalastchi, Kaminka, Kalech, & Lin, 2011). Yet, these online computations increase the computational load of the robot (Chandola, Banerjee, & Kumar, 2009). Machine learning approaches are data-driven approaches which may involve offline preprocessing that reduces online computations. An FD model is learned from offline data and applied online. Supervised learning approaches rely on data that is already labeled for detected faults (Khalastchi, Kalech, & Rokach, 2014). Unfortunately, such data is typically absent in the domain of robots, and very expensive to produce.
In this paper we provide three contributions. The first contribution is a hybrid approach that overcomes the absence of labeled data. We utilize an unsupervised FD approach which could have been used to detect faults online. We use this unsupervised FD approach offline, as a black box, to label a large data set. Then, supervised learning is applied and a new classifier is produced. The produced supervised classifier is used online for FD. Since the classifier has statically captured the online behavior of the unsupervised FD approach, it has potentially less burden on system resources (CPU, memory). In addition, under some conditions, the supervised classifier can even be more accurate than the unsupervised FD approach would have been.

This proposed hybrid approach raises several interesting questions: (1) What are the conditions under which it should achieve greater accuracy? (2) Why can greater accuracy even be achieved, especially given the fact that the unsupervised FD approach is not able to perfectly label the data set? And (3) can we predict the success of such a hybrid approach in the absence of labeled data? To the best of our knowledge, although approaches that are similar in concept have been applied, there was no attempt to give theoretical answers to these questions.

Our second contribution lies in the theoretical extension of our previous work (Khalastchi, Kalech, & Rokach, 2014). We extend our theoretical findings by addressing these questions for any general binary classification problem, nominal data, and decision tree learning. We provide a theoretical analysis that (a) uncovers the conditions that make the decision tree more accurate than the original unsupervised FD approach, (b) provides an answer to why it is more accurate, and (c) provides a prediction about when it is advisable to use the proposed hybrid approach. Thus, we partially fill the theoretical gap.
The third contribution of our work is the experimental extension of our previous work. We use data sets of three different robotic domains: a new benchmark for FD in a high fidelity flight simulator, a laboratory robot, and a commercial Unmanned Aerial Vehicle (UAV). The new benchmark is very challenging since it contains (a) different types of faults, (b) varying fault durations, (c) concurrent occurrence of multiple faults, and (d) contextual faults, i.e., data instances that possess values which are valid under one context but are invalid under another. We provide an empirical analysis which is consistent with our theoretical findings. In particular, we show how the hybrid approach is more accurate than four different unsupervised fault detection approaches that are utilized to label the data sets, and how well this improvement in accuracy fits our theoretical prediction.

The paper is organized as follows. In the next section we present related work. In Section 3 we provide an overview of the proposed hybrid approach. In Section 4 we provide the theoretical analysis which serves as our main contribution. In particular, we uncover the conditions under which the technique works, answer why it works, and derive a predicting calculation. In Section 5 we provide our second contribution: the empirical experimental setup and results. We conclude with a discussion in Section 6.

2. RELATED WORK
Steinbauer conducted a survey on the nature of faults of autonomous robots (Steinbauer, 2013). The survey participants are developers competing in different leagues of the RoboCup competition (2013). The reported faults were categorized as hardware, software, algorithmic and interaction-related faults. The survey concludes that hardware faults, such as sensor, actuator and platform related faults, have a deep negative impact on mission success. In this paper we focus on improving the detection accuracy of different FD approaches which can detect such faults of sensors and actuators.

In the introduction we discussed the main FD approaches and their disadvantages when applied to different types of robotic systems, namely, Model-Based, Data-Driven, and Knowledge-Based approaches. Our arguments are consistent with those of Pettersson (2005), who presents a survey about execution monitoring in robotics. He uses the knowledge and terminology from industrial control to classify different execution monitoring approaches applied to robotics. His survey particularly focuses on mobile robots. There is a very good discussion about the advantages and disadvantages of each approach. Our research is focused on a hybrid approach which can avoid these disadvantages.
Knowledge-based systems (Akerkar & Sajja, 2010) are typically challenged by the requirement to detect unknown faults. Model-based approaches (Isermann, 2005) (Travé-Massuyès, 2014), which use a model that describes the expected behavior of the robot, can detect unknown faults. For example, Steinbauer and Wotawa (2005), and recently Wienke (2016), use a model-based approach for detecting failures in the control software of a robot. The software architecture was utilized for the model creation. Different aspects of software components are observed and expected to behave according to the model unless there is a fault of any type, including of an unknown type. Yet, model-based approaches are challenged to model the expected behavior of components and their interactions with respect to the dynamic environment. In recent work, Steinbauer and Wotawa (2010) emphasize the importance of the robot's belief management and fault detection with respect to the real-world dynamic environment. Akhtar and Kuestenmacher (2011) address this challenge by the use of naive physical knowledge. The naive physical knowledge is represented by the physical properties of objects, which are formalized in a logical framework. Yet, formalizing the physical laws demands great a priori knowledge about the context in which the robot operates (environment and task). A diagnosis model can be automatically generated (Zaman & Steinbauer, Automated Generation of Diagnosis Models for ROS-based Robot Systems, 2013) (Zaman, Steinbauer, Maurer, Lepej, & Uran, 2013). The authors utilize the Robot Operating System (ROS) (Quigley, et al., 2009) to model communication patterns and functional dependencies between nodes. A node is a process that controls a specific element of the robot, e.g., a laser range finder or path planning. The hybrid approach we propose applies decision tree learning. The decision tree is an automatically produced fault detection model, and as such, the challenge of describing expected behaviors with respect to the environment is avoided.
Data-driven techniques (Golombek, Wrede, Hanheide, & Heckmann, 2011) (Hodge & Austin, 2004) (Anderson, Michalski, Carbonell, & Mitchell, 1986) process data and produce a model, typically for fault detection. The model creation can be done in an offline preprocessing step or online as the robot operates and generates the data. Either way, the model should be applied online to detect faults as they occur. Online model creation, in particular density-based approaches such as in (Pokrajac, Lazarevic, & Latecki, 2007) (Serdio, et al., 2014) (Khalastchi, Kalech, Kaminka, & Lin, 2015) (Costa, Angelov, & Guedes, 2015) (Bezerra, Costa, Guedes, & Angelov, 2016), may produce dynamic models that fit very well the dynamic nature of the robot. Yet, such approaches face the challenge of producing the dynamic model within the constraints of time and computational load (Chandola, Banerjee, & Kumar, 2009). On the other hand, offline model creation, such as in machine learning approaches, reduces online calculations, making the approach quicker and easier on system resources. However, the produced model is static and typically cannot account for every possible scenario. This challenge is met with the use of very large data sets which represent the scope of operations of the robot. Our empirical analysis utilizes such data sets.
The absence of labeled data in the domain of robots can be handled in different ways. Some approaches, such as in (Leeke, Arif, Jhumka, & Anand, 2011) and (Christensen, O'Grady, Birattari, & Dorigo, 2008), depend on the injection of fault expressions into a data set of a non-faulty operation prior to the learning phase. Such approaches face the challenge of mimicking the fault's propagation through the system to correctly inject the fault's expressions in the data. Another possibility is to use one-class classification techniques. Hornung et al. (2014) utilize a data set of a fault-free operation. They apply two clustering techniques to produce two-class labeled data. First, the Mahalanobis Distance (Mahalanobis, 1936) is utilized in a radial basis function to cluster the data instances. The formed clusters depict the normal behavior of the robot. Then, negative selection is used to cluster the unknown data instances, i.e., an infinite amount of possible data instances that do not appear in the data and depict the abnormal behavior of the robot. Having the two labels, a Support Vector Machine (SVM) (Steinwart & Christmann, 2008) learning algorithm is applied offline, and used online as an anomaly detector. For efficiency, the high dimension of the data is reduced by techniques of projection and re-projection. In this paper we assume to have a large (unlabeled) data set of operations which already contains hidden faults. There is no need to mimic fault expressions. In addition, we assume to have an unsupervised fault detection approach which is utilized offline as a data labeler. Thus, there is no need for a generic clustering algorithm; we use an FD approach which is already suitable for the domain.

Table 1 summarizes the general advantages and challenges the archetypes of FD approaches have when applied to robotic systems. Note that this is a general summary and exceptions can be found. We can see that each approach has a certain challenge that is an advantage of the other approaches.
| Approaches | Advantages | Challenges |
|---|---|---|
| Knowledge-based | Low computational burden | Detection of unknown faults |
| Model-based | Low computational burden; Detection of unknown faults | Modeling expected behaviors of every component and their interactions, especially w.r.t. the environment |
| Data-driven | Model free; Detection of unknown faults | High computational burden; Absence of labeled data |

Table 1: Summary of advantages and challenges FD approaches have when applied to robotic systems.
The robot's operational data is a time series comprised of quantitative data streams. In this paper, we are focused on nominal data. The quantitative data streams are transformed to nominal values depicting the existence of suspicious patterns (e.g., "drift", "stuck"). Such patterns are general in nature and can be found in other works about fault detection. For instance, the Advanced Diagnostics and Prognostics Testbed (ADAPT) (Mengshoel, et al., 2008) depicts the following faults of sensors on an electrical circuit: "stuck", where all values produced by the sensor are the same; "drift", where the values show a movement towards higher (or lower) values; and "abrupt", where there is a sudden large increase (or decrease) in the sensor's values. Another example can be found in the work of Hashimoto et al. (2003). They use Kalman filters along with kinematical models to diagnose "stuck" and "abrupt" faults of sensors of a mobile robot, as well as "scale" faults, where the (gain) scale of the sensor output differs from the normal expectation. Sharma et al. (2010) discuss how to use rule-based, estimation, time series analysis, and learning-based approaches to detect 'real-world' sensor patterns such as "constant" (stuck), "short" (abrupt), and "noise", where the variance of the sensor readings increases.
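To make the transformation concrete, the sketch below implements window-based detectors for the "stuck", "noise" and "drift" patterns discussed above. This is a minimal illustration and not the paper's implementation; the window contents, thresholds and baseline variance are invented for the example.

```python
import statistics

def classify_window(window, drift_thresh=0.5, noise_factor=3.0, baseline_var=1.0):
    """Assign a nominal pattern label to one sliding window of sensor readings.

    Thresholds are illustrative assumptions; real detectors would be tuned
    per attribute on fault-free recordings.
    """
    if len(set(window)) == 1:
        return "stuck"                      # all values identical
    if statistics.variance(window) > noise_factor * baseline_var:
        return "noise"                      # variance well above the baseline
    # a shift between the means of the two window halves indicates a drift
    half = len(window) // 2
    if abs(statistics.mean(window[half:]) - statistics.mean(window[:half])) > drift_thresh:
        return "drift"
    return "normal"
```

For example, a window of identical readings yields "stuck", while a step between the two halves of the window yields "drift".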
3. AN OVERVIEW OF THE HYBRID APPROACH

Figure 1: an overview of the hybrid approach.
Figure 1 depicts an overview of the hybrid approach. The left side describes the offline process. We begin (at the top) with a large unlabeled data set. This data set depicts the nominal values of the robot's attributes as recorded during many different operations. This set already includes hidden examples of faults that the robot has suffered.

Specifically, during the robot's operations, data streams of sampled attributes are recorded. The attributes are based on sensor readings, actuator commands and actuator feedback. For instance, attributes of a UAV may include sensor readings such as indicated altitude, airspeed, heading, etc.; actuator commands such as the force to be applied on the elevators, ailerons, rudder, throttle, etc.; and actuator feedback such as the current angle of the ailerons. These recordings form a quantitative time series. Hence we require an unsupervised approach that has the ability to transform the quantitative time series into nominal data¹.

Next, an unsupervised fault detection approach is used offline, as a black box, to label the data set, i.e., each row in the data set receives the label "Fault" or "Normal". The fault detection approach used is very accurate, but not perfect. It may have a high fault detection rate (true positive rate) and a low false alarm rate (false positive rate). Yet, the resulting labeled data set contains false positive and false negative examples, i.e., normal instances deemed as faulty and faulty instances deemed as normal (respectively). Note that we do not know the fault detection and false alarm rates of the approach, nor the amount of false positives and negatives in the data. However, we evaluate that relatively small amounts of mistakes do exist within this data set.
Even so, in the next step we apply supervised decision tree learning, i.e., based on the data that the unsupervised approach has labeled. This process produces a fault-detecting decision tree which depicts, and perhaps generalizes, the fault detection decision mechanism of the utilized unsupervised approach.

The online process is depicted on the right side of the figure. The robot's sampled quantitative data is processed and transformed to nominal data. The nominal data is fed to the decision tree, which quickly determines whether or not this instance is a fault. Applying this decision tree online would typically take significantly fewer calculations than the possible online calculations done by the original unsupervised fault detection approach; these calculations are replaced with a static tree. We gain a quick and computationally easy fault detection approach. But is it more accurate than the original online approach? We answer this question in the next section.

¹Implementation detail: we used a fixed-size sliding-window technique to frame the data streams. Pattern detectors were applied to each stream. Detected patterns, such as "drift" or "stuck", that describe the data stream of an attribute were assigned as its nominal value.
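As an illustration of the offline/online split described above, the sketch below treats the unsupervised detector as a black-box labeling function and, in place of a full decision tree learner, uses the simplest possible "tree": a lookup that maps each unique nominal case to its majority label, which is exactly the per-leaf decision rule analyzed in Section 4. The function names are ours, not the paper's.

```python
from collections import Counter, defaultdict

def train_hybrid_detector(instances, unsupervised_label):
    """Offline phase: label each nominal instance with the unsupervised
    detector, then learn a lookup 'tree' mapping each unique case to the
    majority label it received. A case labeled "Fault" 80 of 100 times is
    thereafter always classified as "Fault"."""
    votes = defaultdict(Counter)
    for inst in instances:
        votes[tuple(inst)][unsupervised_label(inst)] += 1
    model = {}
    for case, counter in votes.items():
        (top, n1), *rest = counter.most_common(2)
        # ties carry no information gain, so such cases are omitted
        if not rest or n1 > rest[0][1]:
            model[case] = top
    return model

def detect(model, inst, default="Normal"):
    """Online phase: a single dictionary lookup replaces the unsupervised
    detector's heavier online computation."""
    return model.get(tuple(inst), default)
```

A toy labeler that flags any instance containing a "stuck" pattern shows the two phases: train on recorded instances offline, then call `detect` online per instance.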
4. THEORETICAL ANALYSIS
In this section we provide our main contribution. In subsection 4.1 we provide a theoretical analysis for the hybrid approach by investigating the conditions under which it is expected to achieve better accuracy than the unsupervised approach utilized as a data labeler. In addition, we derive the conditions under which greater accuracy can be achieved. In subsection 4.2 we describe a predictor that can help to decide whether to use the hybrid approach or the original unsupervised approach even though we do not possess labeled data for testing.

4.1 Why the Hybrid Approach can be More Accurate
Table 2 depicts the notations we use in this section, and their meaning. Let S be a sufficiently large unlabeled set of examples, consisting of P positive and N negative examples. Since S is unlabeled, P and N are unknown. Let U be an unsupervised classification approach. We know U is a good classifier, but not perfect. Figure 2 depicts U's classification of S. The examples of S are represented by the circle. The top half represents U+ (the share of examples classified by U as positive), and the bottom half represents U- (the share of examples classified by U as negative). U+ includes true positives TP_U and false positives FP_U, and U- includes true negatives TN_U and false negatives FN_U. We do not know these amounts, but we do know that U satisfies tpr_U > fpr_U.
| Notation | Meaning |
|---|---|
| S | A large unlabeled data set |
| P, N | Unknown numbers of Positive and Negative examples in S |
| U | An unsupervised classifier |
| U+ | The (countable) number of examples in S that U classified as Positive |
| U- | The (countable) number of examples in S that U classified as Negative |
| TP_U, FP_U | Unknown numbers of True Positives and False Positives of U |
| TN_U, FN_U | Unknown numbers of True Negatives and False Negatives of U |
| tpr_U, fpr_U | The unknown True Positive Rate and False Positive Rate of U |
| T, l | A decision tree T, learned on S as labeled by U; l is a leaf in T |
| S_l | A subset of examples which satisfy the case of l |
| l+ | The (countable) number of examples in S_l that are classified as Positive by U |
| l- | The (countable) number of examples in S_l that are classified as Negative by U |
| TP_T, FP_T | Unknown numbers of True Positives and False Positives of T |
| TN_T, FN_T | Unknown numbers of True Negatives and False Negatives of T |
| tpr_T, fpr_T | The unknown True Positive Rate and False Positive Rate of T |

Table 2: notations and their meaning
Figure 2: how S is classified by an unsupervised classifier.
We can label S with U, apply decision tree learning and produce a classification tree T. We wish to predict if T is more accurate than U when put to a test, i.e., whether T would have a higher true positive rate and a lower false positive rate. We are interested in these measures since they depict the fault detection rate and false alarm rate (respectively) of a fault detection approach. We wish to determine which classifier is more accurate even though no labeled data exist.
In decision tree classifiers, the decisions are made in the leaves of the tree. Each leaf l represents a case where different features possess different values.

Figure 3: a decision tree.

For instance, Figure 3 depicts a decision tree where a leaf l represents the case defined by the attribute values along the path from the root to l. Let S_l be the set of examples in S which satisfy the case of l. Let l+ be the number of examples in S_l that are classified as positives by U, and let l- be the number of examples in S_l that are classified as negatives by U. If l+ > l- then the tree decides that the case of l should be classified as positive. If l+ < l- then the tree decides that the case of l should be classified as negative. If l+ = l- then the tree cannot decide (the leaf has no information gain) and thus such leaves are omitted. Figure 4 represents the two types of leaves (positive and negative) and their corresponding subset of examples in S. The leaf l1 classifies as positive since l1+ > l1-. The leaf l2 classifies as negative since l2+ < l2-.

Figure 4: the two types of leaves and their corresponding subset of examples.
Generally, in each subset of S represented by a leaf, there is an unknown number of correct and incorrect labels U has made. Correct labels include a number of true positives, denoted as TP_l, and true negatives, denoted as TN_l. Incorrect labels include a number of false positives, denoted as FP_l, and false negatives, denoted as FN_l. Each of these values can be equal to or greater than zero.

Let us focus on l1, a leaf that classifies as positive, i.e., every example in S_l1 is classified by T as positive. Let us investigate the correlation between the classification by U and the classification by T. TP_l1 is the number of examples for which both U and T are correct. FP_l1 is the number of examples for which both U and T are incorrect. TN_l1 is the number of examples for which U is correct and T is incorrect, and FN_l1 is the number of examples for which U is incorrect and T is correct. Thus, if FN_l1 > TN_l1 then T correctly classifies more examples in S_l1 than does U.

Generally, the probability for U to be correct when classifying an example in S_l1 as negative is TN_U/(TN_U + FN_U). Since U is assumed to be a good classifier then this probability is greater than 1/2. Seemingly, this might mean that TN_l1 > FN_l1 in S_l1.

However, this analysis disregards two important pieces of available information. (a) l1 is a leaf. The construction process of the tree only splits nodes if there is meaningful information to be gained. This means that any way to split S_l1 would result in leaves depicting subsets of examples for which U's decision does not contribute any meaningful information; such leaves are pruned. Hence, all the
examples in S_l1, though different from one another, depict a single unique case and provide the same information with respect to the desired decision. (b) We know that U has classified this single case as positive l1+ times, and l1- times as negative, where l1+ > l1-. With this information we can state that U is certain that the case depicted by l1 is positive with a probability of l1+/(l1+ + l1-). We denote this value as p_l1. Note that p_l1 is known since l1+ and l1- are countable in S_l1.

For example, suppose the tree learning process has grouped 100 examples under a leaf l1. Among these examples, 80 were labeled by U as positive and 20 as negative. We know that all the 100 examples in S_l1 depict a single unique case with respect to the contribution of meaningful information. Thus, we can state that U is certain that the case depicted by l1 is positive with a probability of p_l1 = 80/100 = 0.8.
Now we can understand why greater accuracy can be achieved. Every example in S depicts the same case, and according to p, each example has a probability of p of being positive. Thus, we can estimate that the number of false negatives of A in S is p · neg, and the number of true negatives in S is (1 − p) · neg. Since p is greater than 1/2, T classifies more examples in S correctly than A classified.
Continuing the example above, we can estimate that the number of false negatives is 0.8 · 20 = 16 and the number of true negatives is 0.2 · 20 = 4, as Figure 5 depicts. We can see that for the case depicted by S, which T classifies as positive, T is correct 80 times and incorrect 20 times. Thus, given an example that satisfies the case of S, the tree has a probability of 80/100 = 0.8 of being correct, while A, which is expected to classify correctly only 0.8 · 80 + 0.2 · 20 = 68 of the 100 examples, has a probability of only 68/100 = 0.68 of being correct; T is better.
The same principle can be applied to negative leaves. For the examples in a subset S that corresponds to a negative leaf, we can state that T is certain with a probability of (1 − p) that they are negative. Hence, if (1 − p) is greater than 1/2, then T correctly classifies more examples in S than does A.
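The per-leaf arithmetic of this comparison can be sketched in a few lines of Python. The function below is illustrative only (it is not part of the paper's implementation); the counts 80/20 come from the running example above:

```python
def leaf_accuracies(pos: int, neg: int):
    """For a leaf subset S in which the unsupervised approach A labeled
    `pos` examples as positive and `neg` as negative, return the expected
    per-example accuracy of the tree T (which outputs the leaf's majority
    label) and of A itself."""
    total = pos + neg
    p = pos / total                  # probability that the depicted case is positive
    tree_acc = max(p, 1 - p)         # T always outputs the majority label
    # A is correct on its `pos` labels if the case is positive (prob. p)
    # and on its `neg` labels if the case is negative (prob. 1 - p).
    labeler_acc = (p * pos + (1 - p) * neg) / total
    return tree_acc, labeler_acc

# Running example: 80 examples labeled positive, 20 labeled negative.
print(leaf_accuracies(80, 20))  # approximately (0.8, 0.68)
```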
Figure 5: an example of a positive leaf.
Now, we can understand the conditions under which greater accuracy can be achieved. As there are more positive leaves in which neg > 0 and more negative leaves in which pos > 0, there are more cases in which T is correct and A is not. Understanding this condition allows us to create a predictor that can help us decide whether A or T should be used. We describe this predictor in the next subsection. As can be seen, this theoretical analysis is not restricted to the fault detection problem. It can be applied to any binary classification problem where nominal data is used and there is enough data to learn from.
4.2 Predicting Which Approach is More Accurate
Recall that in our domain, labeled data is absent and thus we cannot apply a test on a testing set. Even so, we wish to know which approach is better to use: the original unsupervised approach A, or the hybrid approach with its decision tree T? Seemingly, more information is needed. Yet, we can answer the following question: given an estimation of the true positive and false positive rates that A would have theoretically achieved on an imaginary testing set, what would be the expected true positive and false positive rates of T? The answer to this question can help us decide if T is indeed better. Formally, we define a predictor Π(etp, efp) = ⟨etp_T, efp_T⟩ such that when given etp - an estimated true positive rate of A, and efp - an estimated false positive rate of A, the predictor returns the expected true positive rate and the expected false positive rate of T, denoted as etp_T and efp_T (respectively).
For example, Alice implemented an unsupervised approach A. In addition, she applied the hybrid approach and produced a decision tree T. Due to the absence of labeled data she cannot test which classifier is better. However, she can ask: if A were to achieve a true positive rate of 0.8 and a false positive rate of 0.1 on some data set, what would T be able to achieve? Alice applies Π(0.8, 0.1) and gets an answer indicating that T is preferable.
We start by estimating P and N - the number of positive and negative examples in the data set D. In D, we know the number of examples that are labeled by A as positive and as negative; these are denoted as |D⁺| and |D⁻| respectively. Among the positive examples, etp represents the ratio of true positives; hence, the amount of true positives is etp · P. Similarly, the false positive rate efp represents the ratio of false positives among the negative examples; hence, the amount of false positives is efp · N. Note that |D⁺| = etp · P + efp · N and N = |D| − P. With some algebra we can assign the estimated number of positive examples: P = (|D⁺| − efp · |D|) / (etp − efp), and assign the estimated number of negative examples: N = |D| − P.
Continuing the running example, Alice counts |D| and the numbers of examples that A has labeled in D as positive and as negative. According to Alice's input (0.8, 0.1), she can estimate P and N accordingly.
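As a concrete sketch (in Python, with hypothetical numbers that are not taken from the paper), the estimation of P and N is a one-line rearrangement of |D⁺| = etp · P + efp · N:

```python
def estimate_pos_neg(n_total: int, n_labeled_pos: int, etp: float, efp: float):
    """Estimate the numbers of truly positive (P) and truly negative (N)
    examples in a data set of `n_total` examples, `n_labeled_pos` of which
    the unsupervised approach A labeled as positive, given A's estimated
    rates etp and efp.  Solves n_labeled_pos = etp*P + efp*(n_total - P)."""
    P = (n_labeled_pos - efp * n_total) / (etp - efp)
    return P, n_total - P

# Hypothetical: 10,000 examples, 1,500 labeled positive by A, rates (0.8, 0.1).
P, N = estimate_pos_neg(10_000, 1_500, 0.8, 0.1)
print(round(P), round(N))  # roughly 714 positives and 9286 negatives
```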
Now, we can calculate the number of true positives, false positives, true negatives, and false negatives of T. Note that we have just estimated that in D there are FN_A = (1 − etp) · P mislabeled positive examples, and FP_A = efp · N mislabeled negative examples. These examples are distributed among the subsets represented by the leaves of T. In the previous subsection we theorized what the portions of false negatives and false positives are for each type of leaf. In particular, we stated that in a subset S which corresponds to a positive leaf, the amount of false negatives is p_S · neg_S; that is, the probability of an example in S to be positive times the amount of examples labeled as negative by A. Similarly, in a subset S which corresponds to a negative leaf, the amount of false positives is (1 − p_S) · pos_S; that is, the probability of an example in S to be negative times the amount of examples labeled as positive by A. Note that all of these amounts are countable in each leaf, and thus p_S, pos_S and neg_S are known.
Let L⁺ be the set of positive leaves of T, and L⁻ be the set of negative leaves of T. With the prior estimation of P and N, we can now calculate the false negatives, false positives, true positives and true negatives of T:
FN_T = FN_A − Σ_{S∈L⁺} p_S · neg_S
FP_T = FP_A − Σ_{S∈L⁻} (1 − p_S) · pos_S
TP_T = pos_T − FP_T
TN_T = neg_T − FN_T
where pos_T and neg_T denote the (countable) numbers of examples that T classifies as positive and as negative, respectively.
The amount of false negatives for the tree, FN_T, should count all the examples in D that were falsely labeled as negative, apart from those which belong to positive leaves. The amount Σ_{S∈L⁺} p_S · neg_S counts all the examples in D that are falsely labeled as negative by A but are classified as positive by T, and thus cannot be regarded as a part of FN_T. Similarly, FP_T should count all the examples in D that were falsely labeled as positive, apart from those which belong to negative leaves, i.e., Σ_{S∈L⁻} (1 − p_S) · pos_S. The amount of true positives for the tree, TP_T, is set to be the number of examples classified as positive by the tree, which is countable, minus its false positives FP_T. Similarly, the number of true negatives for T is TN_T = neg_T − FN_T.
Continuing with the running example, assume that Alice counted the numbers of examples that T classifies as positive and as negative (pos_T and neg_T). For each positive leaf, Alice calculated the estimated number of examples that were falsely labeled by A, and summed these amounts; similarly for the negative leaves. With these sums and the prior estimations of FN_A and FP_A, she computes FN_T, FP_T, TP_T and TN_T.
Note that according to our formulas, as A and T agree on more examples, the negated sums are reduced. As a result, FN_T becomes closer to FN_A and FP_T becomes closer to FP_A. In turn, the predicted true positive rates and false positive rates of both A and T become very similar. This is in support of the conditions to achieve greater accuracy, as we discussed in the previous subsection.
Now we can estimate the true positive rate and false positive rate of T:
etp_T = TP_T / P
efp_T = FP_T / N
If etp_T is greater than etp and efp_T is lower than efp, then it is advisable to use T.
In our running example, Alice calculated etp_T and efp_T. Since the prediction for T is better than the input (0.8, 0.1) of A, Alice should prefer the use of T.
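Putting the pieces together, the whole predictor can be sketched as follows. This is an illustrative Python rendering of the formulas above; the leaf triples and all of the numbers in the usage example are hypothetical, not taken from the paper:

```python
def predict_tree_rates(leaves, etp, efp, P, N):
    """Predict the (true positive rate, false positive rate) of the tree T.
    `leaves` holds one (label, pos, neg) triple per leaf: the leaf's
    classification ('+' or '-') and the counts of examples that the
    unsupervised approach A labeled positive/negative in it.  P and N are
    the estimated numbers of truly positive/negative examples; etp and efp
    are A's estimated rates."""
    fn_t = (1 - etp) * P              # start from A's estimated false negatives
    fp_t = efp * N                    # ...and A's estimated false positives
    pos_t = 0                         # number of examples T classifies as positive
    for label, pos, neg in leaves:
        p = pos / (pos + neg)         # probability the leaf's case is positive
        if label == '+':
            pos_t += pos + neg
            fn_t -= p * neg           # false negatives of A corrected by T
        else:
            fp_t -= (1 - p) * pos     # false positives of A corrected by T
    tp_t = pos_t - fp_t
    return tp_t / P, fp_t / N

# Hypothetical two-leaf tree over 200 examples, 78 of them labeled positive by A.
leaves = [('+', 70, 10), ('-', 8, 112)]
etp, efp, n_total, n_labeled_pos = 0.8, 0.1, 200, 78
P = (n_labeled_pos - efp * n_total) / (etp - efp)
N = n_total - P
print(predict_tree_rates(leaves, etp, efp, P, N))  # better than the input (0.8, 0.1)
```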
Even without labeled data of any sort, with this predictor we can plot several fabricated instances of ⟨etp, efp⟩ and the corresponding expected ⟨etp_T, efp_T⟩ on a curve. If the prediction for T is better for these instances than the input for A, then it is advisable to use T. For example, Figure 6 depicts a case where the hybrid approach is potentially more accurate than the unsupervised approach. The X-axis represents the false positive rate and the Y-axis represents the true positive rate. As a point in the chart tends towards the upper left corner, it depicts greater accuracy. The four black dots represent the pairs of fabricated ⟨etp, efp⟩. The four gray dots represent the pairs of the corresponding predicted ⟨etp_T, efp_T⟩. We can see that for each input, the predictor yields a more accurate prediction for T. Thus, for this case one may know that it is recommendable to use the hybrid approach. On the other hand, Figure 7 depicts a case where the hybrid approach is not expected to be different from the unsupervised approach. For each fabricated pair of ⟨etp, efp⟩, the prediction for T was of insignificant difference. This case signifies that there is little room for improvement and hence, it is not recommendable to use the hybrid approach.
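One way to mechanize this visual comparison is a simple dominance check over the fabricated inputs and their predictions. The rule and the tolerance below are our own illustrative formulation, not part of the paper; the sample values in the usage example are the FlightGear estimations and predictions reported in Table 4:

```python
def hybrid_advisable(inputs, predictions, tol=0.01):
    """Advise using the hybrid approach only if every predicted (tp, fp)
    pair improves on its fabricated input by more than `tol` in at least
    one rate, without significantly worsening the other rate."""
    for (tp_a, fp_a), (tp_t, fp_t) in zip(inputs, predictions):
        improves = (tp_t - tp_a > tol) or (fp_a - fp_t > tol)
        no_worse = (tp_t >= tp_a - tol) and (fp_t <= fp_a + tol)
        if not (improves and no_worse):
            return False
    return True

# Estimations and predictions for three approaches (values from Table 4):
print(hybrid_advisable([(0.87, 0.075), (0.88, 0.13), (0.98, 0.06)],
                       [(0.90, 0.069), (0.90, 0.12), (1.00, 0.05)]))   # True
# The fourth approach predicts no change, so the hybrid is not advised:
print(hybrid_advisable([(0.96, 0.003)], [(0.96, 0.003)]))              # False
```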
Figure 6: a prediction example. The hybrid approach is expected to be more accurate than the unsupervised approach.
Figure 7: a prediction example. There is no significant difference between the hybrid and the unsupervised.
5. EVALUATION
In this section we provide our evaluation. We start by describing the experimental setup in subsection 5.1, which includes the simulated and real-world domains. We continue in subsection 5.2 with the description of the different fault detection approaches used offline to label the data sets. In subsection 5.3 we provide the results. In particular, we show that the predictor was able to indicate the right decision as to whether or not we should use the hybrid approach.
5.1 Experimental setup
In this paper we provide a way to predict whether or not it is useful to use the hybrid approach over the unsupervised approach - even in the absence of labeled data. However, in order to validate our hypothesis that the predictions are correct, we have to use labeled data sets. We use the following labeled data sets. We created a new data set that can be used as a public benchmark for general fault detection approaches as well as approaches designed more specifically for the domain of robots. We utilized the FlightGear (Perry, 2004) flight simulator. FlightGear is an open-source, high-fidelity flight simulator designed for research purposes and is used for a variety of research topics. FlightGear is a great domain for robotic FDD research, particularly for UAVs, and yet it is not widely used enough for this topic of research. In addition to this simulated domain, we experiment with two real-world domains as well: a commercial UAV and a laboratory robot. In this section we describe the FlightGear domain and the real-world domains.
5.1.1. The FlightGear Domain
The most important aspect of FlightGear as a domain for FD is the fact that FlightGear has built-in, realistically simulated instrument, system, engine, and actuator faults. For example, if the vacuum system fails, the HSI (Horizontal Situation Indicator) gyros spin down slowly with a corresponding degradation in response as well as a slowly increasing bias/error (drift) (Perry, 2004), which in turn, if not detected, can lead the aircraft miles away from its desired position. Thus, the first advantage is that FDD approaches may solve a real-world problem. More importantly, while in other domains faults' expressions are injected into the recorded data after the flight is done, in FlightGear the simulated faults are built in and can be injected during the flight. First, there is no bias; the faults were not modeled by the scientist who created and tested the FDD approach. Second, built-in faults which are injected during the flight propagate and affect the whole simulation. Hence, fault expressions are more realistic.
We created control software to fly a Cessna 172p aircraft as an autonomous UAV (see Figure 8). The flight instruments are used as sensors which sense the aircraft's state with respect to its environment. These sensor readings are fed into a decision-making mechanism which issues instructions to the flight controls (actuators) such that goals are achieved. As the UAV operates, its state changes and is again sensed by its sensors. The desired flight pattern consists of seven steps: a takeoff, 30 seconds of straight flight, a right turn, 15 seconds of straight flight, a left turn of 180°, 15 seconds of straight flight, and a descent to 1,000 feet. In total, the flight pattern duration is 6 minutes of flight.
During a flight, 23 attributes are sampled at a frequency of 4 Hz. These attributes comprise 5 flight-control feedbacks (actuators) and 18 attributes of flight instruments (sensors). The sampled flight controls are the ailerons, elevators, rudder, flaps, and engine throttle. The flight instruments are the airspeed indicator, altimeter, horizontal situation indicator, encoder, GPS, magnetic compass, slip-skid ball, turn indicator, vertical speed indicator, and engine RPM. Each instrument yields 1 to 4 attributes of data, which together add up to 18 sensor-based attributes. Note that we did not sample attributes that would have made the fault detection easy. For instance, sampling the current and voltage of the aircraft would have made the detection of an electrical system failure unchallenging.
Figure 8: flying a Cessna 172p as a UAV in the FlightGear simulator.
#  | Fault to                 | Type       | Effect
1  | Airspeed indicator       | Instrument | Stuck
2  | Altimeter                | Instrument | Stuck
3  | Magnetic compass         | Instrument | Stuck
4  | Turn indicator           | Instrument | Quick drift to minimum value
5  | Heading indicator        | Instrument | Stuck
6  | Vertical speed indicator | Instrument | Stuck
7  | Slip skid ball           | Instrument | Stuck
8  | Pitot tube               | Subsystem  | Airspeed drifts up
9  | Static                   | Subsystem  | Airspeed drifts up or down, altimeter & encoder are stuck
10 | Electrical               | Subsystem  | Turn indicator slowly drifts down
11 | Vacuum                   | Subsystem  | Horizontal situation indicator slowly drifts
12 | Flight elevator          | Actuator   | Stuck
Table 3: Summary of injected faults
The data set contains one flight which is free from faults, 5 subsets that each contain 12 recorded flights in which different faults were injected, and one special "stunt flight". In total, the data set contains 62 recorded flights with almost 90,000 data instances. We injected faults into 7 different instruments and into 4 different subsystems. Table 3 depicts the different faults and their effects. For instance, a fault injection of type 9 fails the static subsystem which, in turn, leads the airspeed indicator to drift upwards or downwards, and the altimeter and encoder to be stuck.
Four subsets represent a single-fault scenario. Each of the 12 flights in a subset corresponds to an injected fault in Table 3, i.e., flight 1 was injected with an airspeed indicator failure, flight 2 was injected with an altimeter failure, etc. Each fault was injected 3-6 times per flight, and lasted 5-30 seconds. The fifth subset represents a multi-fault scenario. Each of the 12 flights in this set was injected 3 times. Each time, a fault was injected into two different components at the same time. The double injection occurred at a random time of the flight, and for a random duration of 10-30 seconds. In total, the data set contains 290 injected faults. The first subset was used as an unlabeled data set. Subsets 2-5 were used as a testing set. That is, on these subsets, we compared the fault detection rate and the false alarm rate of the online FD approaches against the hybrid approach which has utilized these approaches. We checked if the improvement fit the theoretical prediction.
5.1.2. Real-World Domains
In addition to the FlightGear simulated domain, we experiment with two physical robots: a commercial UAV and a laboratory robot. These domains are not as complex as the simulated domain, but they serve to show the domain independence of the SFDD and its ability to handle real-world data.
Commercial UAV domain: The real UAV domain consists of 6 recorded real flights of a commercial UAV. 53 attributes were sampled at 10 Hz. The attributes consist of telemetry, inertial, engine and servo data. Flight durations vary from 37 to 71 minutes. The UAV manufacturer injected a synthetic fault into two of the flights. The first scenario is a value that drifts down to zero. The second scenario is a value that remains frozen (stuck). The detection of these two faults was challenging for the manufacturer since in both scenarios the values are in the normal range. The remaining four flights were used as a training set, where into two of these flights we injected similar synthetic faults. In total, the test set contains 65,741 instances, out of which 1,593 are expressions of faults.
Laboratory robot domain: Robotican1 is a laboratory robot (see Figure 9) that has 2 wheels, 3 sonar range detectors in the front, and 3 infrared range detectors located right above the sonars, making the sonars and infrareds redundant systems to one another. This redundancy reflects real-world domains such as unmanned vehicles. In addition, Robotican1 has an arm with 5 degrees of freedom. Each joint is held by two electrical engines. These engines provide feedback on the voltage applied by their action.
Figure 9: Robotican1 laboratory robot.
The following scenario was repeated 10 times: the robot slows its movement as it approaches a cup. Concurrently, the robot's arm is adjusted to grasp the cup. In 9 out of the 10 times, faults were injected. Faults of type stuck or drift were injected into different types of sensors (motor voltage, infrared and sonar). We sampled 15 attributes at 8 Hz. Each scenario lasted only 10 seconds, of which 1.25 seconds expressed a fault. In total, the test set contains 800 instances, out of which 90 are expressions of faults. Four scenarios were used as an unlabeled training set and the other 6 were used
as a testing set.
5.2 The Unsupervised Fault Detection Approaches
Potentially, every unsupervised approach can be used to label the data set. In order to show the strengths and weaknesses of the hybrid approach, we compare four different unsupervised approaches and analyze how the dependent hybrid approach is affected by each. We used the following unsupervised FD approaches to label the data sets offline. The first is the incremental local outlier detection approach (Pokrajac, Lazarevic, & Latecki, 2007), denoted as IOD. This approach uses the k-nearest-neighbors metric to compare densities of data instances. We arbitrarily chose a fixed threshold above which an outlier is considered a result of a fault.
The second approach is ODDAD - the Online Data Driven Anomaly Detector (Khalastchi, Kalech, Kaminka, & Lin, 2015). Data is consumed online in a sliding-window fashion. The Mahalanobis distance is utilized to compare correlated streams of current temporal data. Outliers above a calculated threshold are considered anomalies that may have been caused by faults.
The third approach is SFDD - the Sensor-based Fault Detection and Diagnosis approach (Khalastchi, Kalech, & Rokach, 2013). The SFDD is an improvement of ODDAD where suspicious pattern recognition is utilized instead of the Mahalanobis distance. In addition, a fault detection heuristic differentiates faults from normal behaviors.
The fourth approach is an extended implementation of the SFDD, denoted as SFDD*, where among other extensions, the online temporal correlation detection is replaced with an offline large-scale constant correlation detection. These correlations are learned offline from a fault-free record of operation. Since this approach is very accurate, we do not expect the decision tree to achieve greater accuracy.
We denote these unsupervised approaches as A_IOD, A_ODDAD, A_SFDD, and A_SFDD* respectively, and T_IOD, T_ODDAD, T_SFDD, and T_SFDD* as the decision trees produced for these approaches.
5.3 Results
First we wish to show the prediction ability of our predictor. For each unsupervised approach, we inputted the predictor with several fabricated instances of true positive rates and false positive rates. It predicted an improvement for all approaches besides A_SFDD*; the SFDD* approach is very accurate. Figure 6 depicts predicted values of this kind, showing an improvement with respect to the inputted estimations. Examination of the leaves of T_SFDD* showed that, apart from only one example, A_SFDD* and T_SFDD* agree on all examples in the data set. Thus, it is expected that the hybrid approach will not be better than A_SFDD*.
We classified the testing sets with the unsupervised approaches and calculated their true positive rates and false positive rates. These results were re-inputted to the predictor. We checked if the predictions are similar to the real results of the corresponding hybrid approach. Table 4 depicts this comparison. The columns depict the unsupervised approaches. The first row depicts the results of these approaches on the test set; each cell shows the true positive rate and false positive rate of the corresponding approach. These values were inputted, as an estimation, to the predictor. The second row depicts the predictions for the corresponding decision trees. The last row depicts the observed results of the trees on the test set. These results are collected from the FlightGear domain. By comparing the first row and the last row we can see how well each unsupervised approach can be improved by the hybrid approach. For instance, the SFDD managed a very high true positive rate of 0.98 and a very low false positive rate of 0.06. Yet, the hybrid approach T_SFDD managed to achieve even better results: a true positive rate of 0.996 and a false positive rate of 0.03. By comparing all the rows we can see how close the prediction is to the real obtained results. For instance, consider the column of SFDD: given the estimation (0.98, 0.06), the prediction for T_SFDD is (1.00, 0.05), i.e., it is recommendable to use T_SFDD. Indeed, T_SFDD achieved better results on the testing set, i.e., (0.996, 0.03).
                   IOD          ODDAD        SFDD         SFDD*
Estimation of A    0.87, 0.075  0.88, 0.13   0.98, 0.06   0.96, 0.003
Prediction for T   0.90, 0.069  0.90, 0.12   1.00, 0.05   0.96, 0.003
Results of T       0.99, 0.022  1.00, 0.02   0.996, 0.03  0.94, 0.004
Table 4: Observed results (true positive rate, false positive rate)
Obviously, the data of the testing set is different from the training set. As a result, the prediction cannot be 100% precise. Yet, we can see that for each approach the prediction can support the decision whether to use the hybrid or the unsupervised approach. For each of the first three approaches, the prediction favored the hybrid approach, and indeed it turned out to be better. For A_SFDD*, the predictions are exactly the same as the estimations. That is, the tree depicts the exact same decisions as A_SFDD*, and thus there is no room for improvement. Hence, we can decide not to use T_SFDD* but rather use the unsupervised A_SFDD*. In the particular case of SFDD*, not only is there no room for improvement, but the unsupervised approach is better equipped to handle previously unobserved data instances than the supervised T_SFDD*. Indeed, the results for the testing set reveal that A_SFDD* is more accurate. We can conclude that when there is no room for improvement, the unsupervised approach is preferable.
The implication of these results is the ability to use the hybrid approach in domains where labeled data is absent or expensive, and to use our predictor to indicate which classifier we should use, even when seemingly there is no way to validate the choice due to the lack of (a priori) labeled data.
Next, we wish to demonstrate different aspects of improvement of the hybrid approach. Recall that the SFDD is sliding-window based. The size of the window governs the accuracy of the approach (Khalastchi, Kalech, & Rokach, 2013). Figure 10 illustrates the degree of reduction in the average false positive rate over the FlightGear domain. Note that the false positive rate is on a logarithmic scale. We can see that with each size of sliding window, the false positive rate of the hybrid approach is significantly lower than that of the unsupervised approach.
Figure 10: FP rate vs. sliding window size. Unsupervised is A_SFDD, hybrid is T_SFDD; data taken from FlightGear.
The different parameters used by the unsupervised approach during the offline phase can be viewed as different unsupervised approaches, each with its own rate of false positives. The hybrid approach contributes to the reduction of the false positive rate for each of these unsupervised approaches. Moreover, as the false positive rate of the unsupervised approach gets lower, the false positive rate of the hybrid approach tends to 0.
Recall that the sampled data of the robot is of a quantitative nature. Again, a sliding-window technique is used online to capture frames of the data stream, apply pattern detection on these framed streams, and output a nominal value for each attribute. As in the SFDD approach, the size of the online sliding window may govern the accuracy of the hybrid approach. In particular, a smaller size yields more fault reports. The increase in reports may increase the detection rate (true positive rate) of the hybrid approach, but it may also increase the rate of false alarms (false positives). Since the false alarm rate of the hybrid approach is very low, one can decide to tolerate more false alarms in order to increase the fault detection rate.
Figure 11 illustrates the false alarm rates and the detection rates of the unsupervised approach versus the hybrid approach under the influence of a changing size of the online sliding window (62 sec - 47 sec). The X-axis represents the false alarm rate and the Y-axis represents the detection rate. As a point in the chart tends towards the upper left corner, it depicts greater accuracy. Note that the scale of Figure 11 zooms in on high detection rates (close to 1) and low false alarm rates (close to 0). The addition of false alarms to the hybrid approach is of little significance, while the effect on the unsupervised approach is apparent. As expected, the detection rate of the hybrid approach gets higher as the size of the online sliding window decreases.
Figure 11: ROC - hybrid vs. unsupervised, for different sizes of the online sliding window.
Similar trends were observed for the real-world domains - the UAV and Robotican1. As in the FlightGear domain, the predictor supported the use of the hybrid approach for all approaches but SFDD*. The injected faults were relatively easy to detect, and where the unsupervised approach achieved a detection rate of 1, so did the hybrid approach, i.e., all faults were detected. Yet, the hybrid approach achieved significantly lower false alarm rates than the unsupervised approaches, apart from SFDD*, where both approaches achieved similar false alarm rates. Table 5 depicts the false alarm rates of the unsupervised approach (A_SFDD) vs. the hybrid approach (T_SFDD) for the real-world domains; a lower score is better. We can see that T_SFDD is significantly better. Similar results were obtained for the other unsupervised approaches, signifying the success of the hybrid approach even for real-world data.
Domain       A_SFDD  T_SFDD
UAV          0.033   0.016
Robotican1   0.067   0.041
Table 5: False alarm rates
In order to test the generality of the prediction to domains other than fault detection, we tested the Breast Cancer Wisconsin data set (Wolberg & Mangasarian, 1990). The data set contains 699 instances of 10 nominal attributes with a benign/malignant classification problem. We used 80% of the instances as a training set, and the other 20% as a testing set. We treated this domain as if it were unlabeled. We removed the labels from the training set and used the (unsupervised) k-nearest-neighbors algorithm A to relabel the training set. Then, we constructed a decision tree T, and used the predictor with several parameters of true positive and false positive rates. The predictor indicated an improvement in the true positive rate at the cost of a higher false positive rate. Indeed, this was the case when we tested the classifiers on the testing set, as Table 6 depicts.
                  TP rate  FP rate
Results of A      0.823    0.025
Prediction for T  0.865    0.034
Results of T      0.86     0.089
Table 6: The breast cancer data set results
The first row depicts the results of A on the testing set. The second row depicts the predictor's output for T, where 0.823 and 0.025 are specifically used as the parameters for the true positive and false positive rates respectively. We can see that the predictor indicated a higher true positive rate at the cost of a higher false positive rate. The third row depicts the results of T on the testing set. As indicated by the predictor, T achieved higher true positive and false positive rates. This indicates that our predictor is not limited to the fault detection classification problem.
6. DISCUSSION AND FUTURE WORK
In this paper we addressed the problem of fault detection (FD) for robotic systems. In particular, we tackled the need for higher accuracy and a lower computational load of FD approaches. Given the absence of labeled data in the domain of robots, we introduced a hybrid approach that can utilize an unsupervised approach, designed for FD for robotics, to label a large unlabeled data set. Then, we apply decision tree learning to produce a fault-detecting decision tree. The online work of classifying according to the decision tree is very easy on system resources, but is it more accurate?
We answered three important questions: (1) What are the conditions under which the hybrid approach should achieve greater accuracy? We answered that a higher rate of disagreement between the approaches may leave room for improvement. (2) Why can greater accuracy even be achieved, especially given the fact that the unsupervised FD approach is not able to perfectly label the data set? The answer lies in the notion that all examples belonging to the same leaf depict a single unique case with respect to a meaningful information gain. As such, the probability of the leaf being correct can be calculated, and we showed it is greater than 1/2. (3) Can we predict the success of the hybrid approach in the absence of labeled data? We answered that indeed we can. We introduced a predictor which is based on our theoretical analysis. We empirically showed that the predictor can support the correct decision as to whether we should use the hybrid approach or the original unsupervised approach.
The implications of our work are: (1) we introduced a hybrid approach that may be more accurate than a given unsupervised approach for FD in robotics, (2) the theoretical analysis is not restricted to the fault detection problem, but rather applies to any binary classification problem with nominal data at hand, and (3) even in the absence of labeled data, one can use our predictor to estimate whether the hybrid approach is preferable.
For future work we intend to answer the following questions: (1) How would the hybrid approach work with a quantitative data stream? (2) Would the hybrid approach be suitable for a multiclass classification problem and, as such, be applied to the diagnosis problem?
REFERENCES
CR IP T
Agmon, N., Kraus, S., & Kaminka, G. A. (2008). Multi-robot perimeter patrol in adversarial settings. Proceedings of the IEEE International Conference on Robotics and Automation, (pp. 2339-2345). Akerkar, R., & Sajja, P. (2010). Knowledge-based systems. Jones & Bartlett Publishers.
Anderson, J. R., Michalski, R. S., Carbonell, J. G., & Mitchell, T. M. (1986). Machine learning: An artificial intelligence approach. Morgan Kaufmann.
AN US
Bezerra, C. G., Costa, B. S., Guedes, L. A., & Angelov, P. P. (2016). An evolving approach to unsupervised and RealTime fault detection in industrial processes. Expert Systems with Applications, 63, 134-144. Birk, A., & Carpin, S. (2006). Rescue robotics - a crucial milestone on the road to autonomous systems. Advanced Robotics, 20(5). Chandola, V., Banerjee, A., & Kumar, V. (2009). Anomaly detection: A survey. ACM Computing Surveys (CSUR), 41, 158.
M
Christensen, A. L., O’Grady, R., Birattari, M., & Dorigo, M. (2008). Fault detection in autonomous robots based on fault injection and learning. Autonomous Robots, 24, 49-67. Competition, D. (n.d.). Retrieved from https://sites.google.com/site/dxcompetition2011/
ED
Costa, B. S., Angelov, P. P., & Guedes, L. A. (2015). Fully unsupervised fault detection and identification based on recursive density estimation and self-evolving cloud-based classifier. Neurocomputing, 150, 289-303.
PT
Dhillon, B. S. (1991). Robot reliability and safety. Springer. Golombek, R., Wrede, S., Hanheide, M., & Heckmann, M. (2011). Online data-driven fault detection for robotic systems. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
CE
Goodrich, M. A., Morse, B. S., Gerhardt, D., Cooper, J. L., Quigley, M., Adams, J. A., & Humphrey, C. (2008). Supporting wilderness search and rescue using a camera-equipped mini UAV. Field Robotics, 89-110.
AC
Guizzo, E. (2010). World Robot population reaches 8.6 million. IEEE Spectrum . Hashimoto, M., Kawashima, H., & Oba, F. (2003). A multi-model based fault detection and diagnosis of internal sensors for mobile robot. International Conference on Intelligent Robots and Systems (IROS). Hodge, V., & Austin, J. (2004). A Survey of Outlier Detection Methodologies. Artificial Intelligence Review, 22, 85126. Hornung, R., Urbanek, H., Klodmann, J., Osendorfer, C., & van der Smagt, P. (2014). Model-free robot anomaly detection. International Conference on Intelligent Robots and Systems (IROS). IFR. (2016). Executive Summary World Robotics 2016 Industrial Robots. the International Dederation of Robotics (IFR).
IFR. (2016). Executive Summary World Robotics 2016 Service Robots. The International Federation of Robotics (IFR).
Isermann, R. (2005). Model-based fault-detection and diagnosis - status and applications. Annual Reviews in Control, 71-85.
Khalastchi, E., Kalech, M., & Rokach, L. (2013). Sensor fault detection and diagnosis for autonomous systems. The 12th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS-2013). Saint Paul.
Khalastchi, E., Kalech, M., & Rokach, L. (2014). A hybrid approach for fault detection in autonomous physical agents. The 13th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-2014). Paris.
Khalastchi, E., Kalech, M., Kaminka, G. A., & Lin, R. (2015). Online data-driven anomaly detection in autonomous robots. Knowledge and Information Systems, 657-688.
Khalastchi, E., Kaminka, G. A., Kalech, M., & Lin, R. (2011). Online anomaly detection in unmanned vehicles. The 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-2011). Taipei.
Khalastchi, E., Kalech, M., & Rokach, L. (2016). A sensor-based approach for fault detection and diagnosis for robotic systems. Submitted to Autonomous Robots.
Akhtar, N., & Kuestenmacher, A. (2011). Using naive physics for unknown external faults in robotics. The 22nd International Workshop on Principles of Diagnosis (DX-2011).
Leeke, M., Arif, S., Jhumka, A., & Anand, S. S. (2011). A methodology for the generation of efficient error detection mechanisms. The 41st International Conference on Dependable Systems & Networks (DSN), (pp. 25-36).
Mahalanobis, P. C. (1936). On the generalised distance in statistics. Proceedings of the National Institute of Sciences of India, 2, 49-55.
Mengshoel, O. J., Darwiche, A., Cascio, K., Chavira, M., Poll, S., & Uckun, S. (2008). Diagnosing faults in electrical power systems of spacecraft and aircraft. Association for the Advancement of Artificial Intelligence (AAAI).
Perry, A. R. (2004). The FlightGear flight simulator. USENIX Annual Technical Conference. Boston, MA.
Pettersson, O. (2005). Execution monitoring in robotics: A survey. Robotics and Autonomous Systems, 73-88.
Pokrajac, D., Lazarevic, A., & Latecki, L. J. (2007). Incremental local outlier detection for data streams. IEEE Symposium on Computational Intelligence and Data Mining (CIDM) (pp. 504-515).
Quigley, M., Conley, K., Gerkey, B., Faust, J., Foote, T., Leibs, J., . . . Ng, A. Y. (2009). ROS: an open-source Robot Operating System. ICRA Workshop on Open Source Software.
Robocup. (n.d.). Retrieved from http://www.robocup.org/
Robocup. (2013). Retrieved from http://www.robocup.org/
Robotics, W. (n.d.). Retrieved from http://www.worldrobotics.org/
Serdio, F., Lughofer, E., Pichler, K., Buchegger, T., Pichler, M., & Efendic, H. (2014). Fault detection in multi-sensor networks based on multivariate time-series models and orthogonal transformations. Information Fusion, 20, 272-291.
Sharma, A. B., Golubchik, L., & Govindan, R. (2010). Sensor faults: Detection methods and prevalence in real-world datasets. ACM Transactions on Sensor Networks (TOSN).
Steinbauer, G. (2013). A survey about faults of robots used in RoboCup. In RoboCup 2012: Robot Soccer World Cup XVI (pp. 344-355). Springer Berlin Heidelberg.
Steinbauer, G., & Wotawa, F. (2005). Detecting and locating faults in the control software of autonomous mobile robots. The 19th International Joint Conference on Artificial Intelligence (IJCAI-05), (pp. 1742-1743).
Steinbauer, G., & Wotawa, F. (2010). On the way to automated belief repair for autonomous robots. The 21st International Workshop on Principles of Diagnosis (DX-10).
Steinwart, I., & Christmann, A. (2008). Support vector machines. Springer.
Thrun, S. (2002). Robotic mapping: A survey. Exploring Artificial Intelligence in the New Millennium, 1-35.
Travé-Massuyès, L. (2014). Bridging control and artificial intelligence theories for diagnosis: A survey. Engineering Applications of Artificial Intelligence, 27, 1-16.
Wienke, J., & Wrede, S. (2016). Autonomous fault detection for performance bugs in component-based robotic systems. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
Wolberg, W. H., & Mangasarian, O. (1990). Multisurface method of pattern separation for medical diagnosis applied to breast cytology. Proceedings of the National Academy of Sciences, 87, 9193-9196.
Zaman, S., & Steinbauer, G. (2013). Automated generation of diagnosis models for ROS-based robot systems. The 24th International Workshop on Principles of Diagnosis.
Zaman, S., Steinbauer, G., Maurer, J., Lepej, P., & Uran, S. (2013). An integrated model-based diagnosis and repair architecture for ROS-based robot systems. IEEE International Conference on Robotics and Automation (ICRA).