Expert Systems with Applications 39 (2012) 6402–6418
Contents lists available at SciVerse ScienceDirect
Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa
Application of Bayesian networks in prognostics for a new Integrated Vehicle Health Management concept
Susana Ferreiro a,⇑, Aitor Arnaiz a, Basilio Sierra b, Itziar Irigoien b
a Fundación TEKNIKER, Eibar, Gipuzcoa, Spain
b UPV-EHU, San Sebastián, Gipuzcoa, Spain
Article info
Keywords: Prognosis; Bayesian network; Aircraft maintenance; Prediction; Brake degradation; PHM system; Predictive maintenance; Operability
Abstract
The aeronautics industry is attempting to implement important changes to its maintenance strategy. This article presents a new framework for making final decisions on aeroplane maintenance actions. It emphasizes the use of prognostics within this global framework to replace corrective and preventive maintenance practice with predictive maintenance, in order to minimize the cost of maintenance support and to increase aircraft/fleet operability. The main objective of the article is to show that the Bayesian network model is a useful technique for prognosis. A specific use case, predicting brake wear on the aircraft, is developed based on this technique. The network allows brake wear to be estimated from the aircraft operational plan. This model, together with other models that make predictions for the various aeroplane components that should be monitored, offers a forward-looking view of the status of the aircraft, allowing later evaluation of different operational plans based on the operational risk assessment and economic cost of each one, depending on the scheduled checks. © 2011 Elsevier Ltd. All rights reserved.
1. Introduction
At present, all airline operators strive to reduce both the amount and the cost of aircraft engineering maintenance while at the same time ensuring aircraft safety and reliability. Aircraft manufacturers are continually providing novel maintenance solutions based on new technologies. Nevertheless, current aircraft maintenance practice is still labor-intensive, and unscheduled maintenance remains a significant problem. Therefore, it is necessary to identify and explain the major weaknesses that affect maintenance practice and then, based on these findings, make recommendations for aircraft manufacturers and airline operators so that the identified weaknesses may be minimized by considering the way in which they affect airline operating costs. These cost drivers are established by sub-dividing the whole aircraft maintenance process into three convenient areas, explained below. Aircraft maintenance (from the operator’s perspective) requires that the aircraft be sufficiently reliable and easy to maintain, with minimum impact on the operations it performs; this is a vital issue. An analysis of the operational disruptions caused by technical problems identifies weaknesses in certain aircraft components (engine, air conditioning, compressed air systems, landing gear, hydraulics, etc.), taking into account their impact on maintenance
⇑ Corresponding author. E-mail address:
[email protected] (S. Ferreiro). 0957-4174/$ - see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2011.12.027
costs. Maintenance execution and maintenance management are closely related components, because those responsible for maintenance execution depend largely on management (planning, training, spare parts, logistics, etc.) to ensure the ability to perform tasks safely and efficiently. An analysis of the impact of both on maintenance costs identifies the following weaknesses: the lack of technicians and engineers, and the high cost of recurrent training; the lack of integration in the machine system; poor management decisions, leading to a lack of spare parts, materials or pieces for maintenance execution; mismanagement in complex situations, etc. All these weaknesses can be grouped under one major feature, the operability of the aeroplane, which involves ensuring operational reliability (the punctuality of flights), maximizing availability (asset utilization) and reducing maintenance cost. Operational reliability is the percentage of scheduled flights that depart and arrive without an operational interruption; improving it requires more robustness against defects, no maintenance during turn times and rapid ground servicing. Availability is the probability that the aircraft is available for service at any time during its operational life; it requires proactive maintenance, more flexible maintenance scheduling and more robustness against defects. Finally, maintenance cost groups the direct and indirect costs of maintenance activities such as check-ins, equipment, data and record keeping, planning, engineering, supervision, tooling, test equipment, facilities, logistics and administration. These costs must be reduced as much as possible, either through lower fuel consumption or through lower maintenance costs.
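As a rough illustration of the two operability measures defined above, the following Python sketch computes operational reliability as the percentage of scheduled flights completed without an operational interruption, and estimates availability as the fraction of time the aircraft is available for service. The function names and the time-fraction estimate of availability are assumptions made for this example; they are not taken from the article.

```python
def operational_reliability(scheduled_flights: int, interrupted_flights: int) -> float:
    """Percentage of scheduled flights that depart and arrive without
    an operational interruption (hypothetical helper)."""
    if scheduled_flights <= 0:
        raise ValueError("scheduled_flights must be positive")
    return 100.0 * (scheduled_flights - interrupted_flights) / scheduled_flights


def availability(hours_available: float, hours_unavailable: float) -> float:
    """Estimate of the probability that the aircraft is available for service,
    assumed here to be the fraction of time it was available."""
    total = hours_available + hours_unavailable
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return hours_available / total


if __name__ == "__main__":
    print(operational_reliability(200, 3))
    print(availability(950.0, 50.0))
```

Both measures are simple ratios; the point of the sketch is only to make explicit which counts each definition relies on.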
Nomenclature
ANN     Artificial Neural Network
BATS    Bayesian Automated Troubleshooting System
BN      Bayesian network, Bayesian net
CPT     Conditional Probability Table
CPD     Conditional Probability Distribution
CRIS    Common Relational Information Schema
DBN     Dynamic Bayesian Network
DM      Data Management
DS      Decision support
EM      Expectation maximization
FTA     Fault Tree Analysis
HMI     Human Machine Interface
HUMS    Health and Usage Monitoring Systems
IDM     Integrated Data Management
IVHM    Integrated Vehicle Health Management
MIMOSA  Machinery Information Management Open Systems Alliance
MMEL    Minimum Equipment List
MROs    Maintenance Repair Offices
MTBF    Mean Time Between Failures
NB      Naïve Bayes
The operability of the aircraft is linked to the ‘on-aircraft’ maintenance concept. This concept includes aircraft line maintenance, which refers to regularly scheduled maintenance and implies performing the proper maintenance actions between flights to ensure the punctuality, availability and reliability of the aircraft. Aircraft line maintenance determines whether the aircraft is able to perform the next flight or, on the contrary, needs to be repaired so that the flight must be delayed or cancelled. The final decision is based on a check of certain aircraft components within a ‘Minimum Equipment List’ (MMEL), carried out in the ‘Turn Around Time’ (TAT) interval between two flights. Today, decision support in aircraft line maintenance is a reactive process, focused on unexpected or deferred maintenance activities, which account for a high percentage of the reduction in operability. Fig. 1 represents current aircraft line (ramp) maintenance, limited to a go or no-go decision for the aircraft’s next flight based on a pre-flight check of certain aircraft components, where failures not detected at an early stage could cause delays or cancellations of the next flight, affecting the operational plan of the aircraft/fleet. More flexible and opportunistic maintenance planning with support for decision making is required in order to achieve a suitable
OREDA    Offshore Reliability Data Repository
OSA-EAI  Open Systems Architecture for Enterprise Application Integration
OSA-CBM  Open Systems Architecture for Condition Based Maintenance
PHM      Prognostics and Health Management
PLM      Product Lifetime Management
PM       Preventive Maintenance
RCA      Root Cause Analysis
RUL      Remaining Useful Life
SACSO    System for Automatic Customer Support Operations
SCSI     Small Computer System Interface
TAN      Tree augmented Naïve Bayes
TAT      Turn Around Time
TATEM    Technologies And Techniques for nEw Maintenance Concepts
TD       Technical Data
TTF      Time To Failure
maintenance, to avoid potential disruptions in the operation of the aircraft, to maximize asset utilization and to reduce downtimes (maintenance opportunity times). In summary, a new type of maintenance (neither corrective nor preventive) is necessary in order to maximize the operability of the aircraft. Current maintenance should be complemented with a new decision support system by means of proactive maintenance based on prognostics. Nevertheless, although more and more new maintenance solutions are appearing thanks to developing technologies, maintenance problems are still unpredictable, and this represents a significant obstacle. The article presents a new framework for making final decisions on aeroplane maintenance actions, designed for the aircraft line maintenance described above. The aim of the article is to emphasize the use of prognostics within this global framework to replace corrective maintenance and time-based preventive maintenance with predictive maintenance based on condition and on system/subsystem Remaining Useful Life estimation. Moreover, the article presents the Bayesian network model as a useful technique, among the models that could be used for making predictions, in cases where its implementation is feasible. A specific use case, predicting brake wear on the aeroplane with a Bayesian network, is developed. This network allows estimating the degradation of the brake
Fig. 1. Aircraft line maintenance execution.
from the aircraft operational plan. This model, together with other predictive models for components of the MMEL, offers a forward-looking view of the status of the aircraft, allowing the evaluation of different operational plans based on the operational risk assessment and economic cost of each one, depending on the scheduled checks (place, date, resources, ...). The article is organized as follows. Section 1 provides an introduction that discusses the current weaknesses in the aeronautics industry. Section 2 presents various frameworks that have evolved to achieve effective maintenance, shows future scenarios and explains the general framework. It also describes the basics of Bayesian networks and their principles, and concludes with a brief overview of applications in which they have been used for diagnosis and prognosis in the industrial sector. Next, Section 3 presents the new approach, which focuses on the integration of forecasting techniques for aircraft line maintenance. Section 4 then presents the use of a Bayesian network for the prediction of aircraft brake wear, the use case integrated into the global system to support final decision making. Finally, Section 5 presents the conclusions.
2. Related works
2.1. Aircraft maintenance
Many recent initiatives to improve operability have been driven by the need to change from a relatively static operational procedure to a more flexible one based on in-service experience, which provides opportunities for cost savings on maintenance and support. Recent civil aerospace studies have shown that maintenance activities can account for as much as 20% of an operator’s direct operating costs and have remained at this level for many years. There is clear scope for increasing the efficiency of the maintenance process. For example, it is estimated that line mechanics spend 30% of their time trying to access information to diagnose and rectify failures.
Additionally, the need for unscheduled maintenance can introduce costly delays and cancellations if the problem cannot be rectified in a timely manner. Lastly, in a recent survey, human error in maintenance tasks was estimated to be a contributing factor in 15% of aircraft incidents. Integrated Vehicle Health Management (IVHM) and Prognostic Health Management (PHM) systems aim to overcome unscheduled maintenance problems by integrating all the condition monitoring, health assessment and prognostics into an open modular architecture and then further supporting the operator by adding intelligent decision support tools. The concept of IVHM systems can be traced directly back to the original Health and Usage Monitoring Systems (HUMS) developed for helicopters during the 1980s and 1990s. These systems encompass the tasks needed to determine, diminish, solve and verify aircraft faults, as explained in Aaseng (2001). Benedettini, Baines, Lightfoot, and Greenough (2008) carried out a state-of-the-art review of these systems. Fox and Glass (2000) show the impact of the use of IVHM technologies on the ground operations of reusable launch vehicles and space shuttles. However, the IVHM framework does not incorporate prognostics and decision making, and in recent years PHM systems have been designed and implemented in different industrial sectors. Muller, Suhner, and Iung (2008) explain the need to develop a prognostic process within an e-maintenance architecture to improve the availability and security of systems, as well as product quality. They develop a platform called TELMA, which supports the unwinding of metal bobbins and incorporates prognosis in a dynamic model that predicts deterioration. Lee, Ni, Djurdjanovic, Qiu, and Liao (2006) present an evaluation of performance and predictive tools for proactive maintenance that avoids machinery failure. In that article, several new prognostic technologies are discussed and validated with a case study.
Using a generic framework for the development and interoperability of prognostic systems for gas turbine engines and gearboxes, Byington, Watson, Roemer, and Galie (2002) show that prognosis represents an improvement on traditional diagnostic systems. Little by little, these systems are being adopted by the aeronautical industry. The concept of PHM for engines has been widely embraced, but the remainder of the aircraft still lags some way behind, and this article looks at how IVHM could help improve availability.
Fig. 2. Aircraft Health Management.
2.1.1. Future maintenance scenarios
The future maintenance scenarios (Fig. 2) should be designed and developed to significantly improve aircraft operability, defining a new maintenance concept based on diagnostics and prognostics that avoids the problems due to unscheduled maintenance. They focus primarily on six issues for which new supporting techniques and technologies need to be studied. Aircraft Health Monitoring is a backbone function for health management and is located mainly on board the aircraft. It produces all the data required for diagnosis and prognosis of certain aircraft components (engines, avionics, utilities, structures, etc.). The monitoring calculates and updates the real state of the aircraft components based on their actual use, and it is an important factor in reducing operational interruptions. Aircraft Health Management is a high-level maintenance function for decision support that provides the current aircraft/fleet status (online or offline) related to potential aircraft defects, and a forecast of the aircraft/fleet serviceability for the next flight cycles, flight hours or days. It also takes into account diagnosis and prognosis data and the maintenance program to analyze the potential impacts on operability. This function relies mainly upon the following generic functions: (a) Enhanced diagnosis: the efficiency of today’s aircraft diagnosis cannot be considered satisfactory. It too often leads to time-consuming troubleshooting or even to no-fault-found. In the worst cases, corrective maintenance actions may lead to technical delays or even cancellations. The enhanced diagnosis concept should be seen as a concept close to the ideal immediate and error-free diagnosis, which
means diagnosis has to be right first time and provided as quickly as possible after engine power-off. It relies on an enhanced on-board maintenance system. (b) Prognosis: diagnosis alone is not sufficient to reach the new operability targets; it needs to be complemented with a capability for prognosis, that is, a capability to identify potential defects that could occur with a certain probability.
While Health Monitoring follows the evolution of the status of the aircraft components during operations, prognosis uses this information to predict the Remaining Useful Life (RUL) of components. HM-based maintenance scheduling uses the information from health management, especially prognosis, to optimize maintenance scheduling and achieve the best aircraft operability by planning maintenance actions related to potential defects that may degrade it. Mobile Maintenance means that maintenance actions can be performed in mobile conditions at the point of work, with easy access to any information necessary for the maintenance task through interaction with the maintenance information system. This concept relies on the use of wearable technologies (wearable displays, augmented reality, hands-free devices, voice activation, etc.), and maintenance applications have to be adapted for mobile use (redesign of the HMIs, new philosophy for the presentation of information). It may significantly reduce maintenance task times
Fig. 3. Future Integrated Data Management (IDM) System.
and the risk of maintenance errors (quality is increased by the use of up-to-date digital technical data). Mobile maintenance is directly linked with process-oriented Technical Data (TD), which relies on a new TD structure and content that enable users to quickly access the information they need at each stage of their process, taking into account their progress in the process as well as the industrial context. Process-oriented maintenance may be considered a big change compared with current practice. Today’s structure and content of aircraft TD cannot be considered optimized, and it may be complex and time-consuming to access the pieces of information that are really needed in a process. Lastly, Integrated Data Management (IDM) must be seen as the future integrated maintenance system. In this concept, all data can be accessed, shared and managed in an integrated way with regard to all the maintenance functions using these data. As an example, an efficient implementation of the health management concept calls for such an Integrated Data Management framework, simply because the optimization of maintenance actions can only be achieved through an integrated approach to data management that takes into account all important dimensions: fleet status, flight operations, maintenance resources, etc.
2.1.2. Integrated Data Management System
The future Integrated Data Management (IDM) System needs to obtain information from various stakeholders, and the subsequent management requires technologies different from those traditionally used in aeronautics in order to integrate all the parts in a consistent and effective framework (Fig. 3). This gives rise to the definition of an environment called the Data Management (DM) platform, which provides support for decision making and an integration framework for technology.
The IDM system has to provide a global networked infrastructure with all the data management capabilities needed to support the maintenance applications and databases in a trusted network-centric environment.
Existing aircraft systems tend to be limited in both their collection of data and the integration of the available data sources. This has tended to lead to a situation where the operator can become overwhelmed by the variety and disjointed nature of data sources and “not see the forest for the trees”. Within the IDM system concept, the aircraft is considered a mobile platform that continuously provides health data wherever it is (in flight, at the gate) and is accessible to maintenance teams for tele-maintenance purposes. According to the new concepts, the maintenance applications, the databases and the users involved in the maintenance processes are distributed and mobile. Therefore, a representative configuration of the IDM system consists, at a minimum, of the association of on-board (e.g. engines, landing gears, brakes, structures, etc.) and off-board (e.g. airline, Maintenance Repair Offices ‘MROs’, equipment manufacturers, etc.) data management platforms communicating with each other through in-flight and/or on-ground data links, as illustrated in Fig. 3. Each DM platform provides the hardware resources and basic software services that allow local applications and databases to be hosted and to communicate with the distributed systems connected to the maintenance network. It is based on ISO-13374 (OSA-CBM), which specifies an open standard architecture, a framework for implementing a condition-based maintenance system by means of standard specifications for information exchange, in order to integrate many types of components and to produce interchangeable hardware and software components from different sources. It is the basis for a Prognostic Health Management (PHM) system and is represented in terms of six functional layers or independent blocks (Fig. 4).
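The six functional blocks can be pictured as a simple processing chain. The Python sketch below is only an illustration under stated assumptions: the layer names follow the usual ISO 13374 block names (data acquisition, data manipulation, state detection, health assessment, prognostics assessment, advisory generation), while the brake wear readings, the wear limit, the alert level and the linear extrapolation are hypothetical values and choices made for this example, not the article’s implementation.

```python
from dataclasses import dataclass
from typing import List

WEAR_LIMIT_MM = 10.0   # hypothetical wear limit, not taken from the article
ALERT_FRACTION = 0.8   # hypothetical alert level as a fraction of the limit


@dataclass
class Advisory:
    health: str
    remaining_checks: float
    action: str


def data_acquisition() -> List[float]:
    """Layer 1: raw readings (hypothetical brake wear in mm at successive checks)."""
    return [2.0, 2.5, 3.1, 3.4, 4.0]


def data_manipulation(raw: List[float]) -> List[float]:
    """Layer 2: signal processing, here a two-point moving average."""
    return [(a + b) / 2 for a, b in zip(raw, raw[1:])]


def state_detection(signal: List[float]) -> bool:
    """Layer 3: flag whether the latest value exceeds the alert level."""
    return signal[-1] > ALERT_FRACTION * WEAR_LIMIT_MM


def health_assessment(alert: bool) -> str:
    """Layer 4: translate the detected state into a health label."""
    return "degraded" if alert else "healthy"


def prognostics_assessment(signal: List[float]) -> float:
    """Layer 5: checks remaining until the limit, by linear extrapolation."""
    rate = (signal[-1] - signal[0]) / (len(signal) - 1)
    if rate <= 0:
        return float("inf")
    return (WEAR_LIMIT_MM - signal[-1]) / rate


def advisory_generation(health: str, remaining: float) -> Advisory:
    """Layer 6: recommend an action from health and prognosis."""
    action = "schedule brake replacement" if remaining < 10 else "continue operation"
    return Advisory(health, remaining, action)


def run_pipeline() -> Advisory:
    signal = data_manipulation(data_acquisition())
    health = health_assessment(state_detection(signal))
    return advisory_generation(health, prognostics_assessment(signal))


if __name__ == "__main__":
    print(run_pipeline())
```

Because each block only consumes the output of the previous one, any single block could be swapped for a different implementation without touching the others, which is the property the open architecture is meant to provide.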
The Open Systems Architecture for Condition-Based Maintenance (OSA-CBM) standard avoids integration and information exchange problems within a PHM system and integrates all the functionalities of the system, from the sensors and data acquisition to prognostics and decision support. Moreover, this standard
Fig. 4. Data management platform within IDMS.
provides the system with many advantages, as explained below. Firstly, it decreases costs, because there is no need to develop new proprietary architectures and no need for each party to develop the entire system. Secondly, because the system is divided into blocks, it is possible to obtain more specialization (if each party focuses on the development of a specific technology), more competitiveness (it is now possible to compare the functionalities of each block individually instead of evaluating the system solution globally) and more cooperation. Finally, OSA-CBM allows design changes to be made effectively. This framework is maintained by the Machinery Information Management Open Systems Alliance (MIMOSA) (www.mimosa.org), which is now responsible for developing the new open standards for maintenance systems. Although OSA-CBM works in real time, companies also request information about the state of system assets and their monitoring. This information is stored in different information management systems with different data models, even within the same company, and has to be compiled to obtain an integrated general view that can be used for future decisions. The Open Systems Architecture for Enterprise Application Integration (OSA-EAI) is described by the Common Relational Information Schema (CRIS); it provides companies with a set of specifications for integrating data from disparate databases, and provides the OSA-CBM data fields with a structure.
2.2. Bayesian networks
A Bayesian network (BN) model (Pearl, 1998) is a combination of two different mathematical areas: graph theory and probability theory. It is a representation of a joint probability distribution defined on a finite set of random variables that can be discrete or continuous. Concerning knowledge modeling, a Bayesian network can be seen as a special knowledge representation system (stemming from semantic nets).
In this case, these representation mechanisms are suited to the representation of causal semantics. Moreover, the inference mechanism is tied to the pairing of probability theory and graph theory, which dictates the applicable inference laws. Given this, Bayesian networks can also be regarded as high-level reasoning tools that incorporate inherent uncertainty for use in probabilistic inference, in what are known as probabilistic expert systems (Castillo, Gutiérrez, & Hadi, 1997). Many practical tasks can be reduced to problems of classification; fault diagnostics is a case in point. A Bayesian network (BN), also called a belief network or probabilistic network, helps tackle the problem of classification efficiently, with several advantages for data analysis (Heckerman, 1996). The model encodes dependencies between the variables, and it readily handles situations where some data entries are missing (incomplete data). A BN can be used to learn causal relationships, and hence to gain understanding of the problem domain and to predict the consequences of intervention. It is an ideal representation for combining prior knowledge (which often comes in causal form) and data, because it has both causal and probabilistic semantics. Moreover, BN used in conjunction with Bayesian statistical methods offer an efficient, principled approach for avoiding overfitting of the data. A further advantage, related to the previous point, is the possibility of feeding a BN with two types of probability: ‘Bayesian’ probability, the human degree of belief that an event will occur (that is, experience), and ‘classical’ probability, which is linked to the physical properties of the event (i.e. the statistical data used to state the resulting probability distributions in each node). In both cases, this can be considered a priori knowledge.
The representation is a directed acyclic graph consisting of nodes, usually discrete, which correspond to random variables and arcs, which correspond to probabilistic dependencies between the variables. A Conditional Probability Table (CPT) is associated with each node and describes the dependency between the node
and its parents. For continuous nodes, the CPT is replaced by a conditional probability distribution (CPD). To derive the probability of a node, the probabilities of its parent nodes and the conditional probability distributions on their connecting edges are combined. Fig. 5 represents a directed acyclic graph: each arc is directed and there are no loops. This example is known as the ‘Sprinkler example’. It contains four nodes (discrete random variables) that model possible causes of the grass being wet: Cloudy, Sprinkler, Rain and Grass Wet. Each variable may be in one of two states, true or false, i.e. dom(Cloudy) = dom(Sprinkler) = dom(Rain) = dom(Grass Wet) = {true, false}. The grass can be wet either because of rain (Rain = true) or because the sprinkler was on (Sprinkler = true). From the graph it can be inferred that Rain and Sprinkler are conditionally independent given Cloudy. That is, once the state of Cloudy is known, there is no further dependency or causal relationship between the two variables; the only relationship between them is through the variable Cloudy, i.e. $P(\mathrm{Sprinkler} \mid \mathrm{Cloudy}, \mathrm{Rain}) = P(\mathrm{Sprinkler} \mid \mathrm{Cloudy})$. The joint probability distribution of Cloudy, Sprinkler and Rain is $P(\mathrm{Cloudy}, \mathrm{Sprinkler}, \mathrm{Rain}) = P(\mathrm{Sprinkler} \mid \mathrm{Cloudy})\,P(\mathrm{Rain} \mid \mathrm{Cloudy})\,P(\mathrm{Cloudy})$. Even though the combination of graph theory and probability theory is one of the key benefits of BN, a wide application of this technology would not have been possible without the development of efficient inference methods to calculate the posterior probabilities, also called belief updating. Exact inference methods (such as enumeration and variable elimination algorithms) are feasible tools for low-dimensional, unconstrained BN. However, the computational complexity of probabilistic inference in Bayesian networks becomes intractable in large and multiply connected BN (Cooper, 1990).
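To make the factorization and exact inference by enumeration concrete, the following Python sketch encodes the Sprinkler example and answers queries by summing the joint distribution over all configurations. The CPT values are illustrative numbers assumed for this example (the article gives none), and the factor for Grass Wet given Sprinkler and Rain follows from the graph structure described above. The function and variable names are hypothetical.

```python
from itertools import product

# Illustrative CPT values, assumed for this sketch (not taken from the article).
P_CLOUDY = 0.5                                  # P(Cloudy = true)
P_SPRINKLER = {True: 0.1, False: 0.5}           # P(Sprinkler = true | Cloudy)
P_RAIN = {True: 0.8, False: 0.2}                # P(Rain = true | Cloudy)
P_WET = {(True, True): 0.99, (True, False): 0.9,
         (False, True): 0.9, (False, False): 0.0}  # P(GrassWet = true | Sprinkler, Rain)

VARIABLES = ("cloudy", "sprinkler", "rain", "wet")


def bernoulli(p_true: float, value: bool) -> float:
    """Probability of a boolean value given the probability of 'true'."""
    return p_true if value else 1.0 - p_true


def joint(cloudy: bool, sprinkler: bool, rain: bool, wet: bool) -> float:
    """Joint probability as the product of each node's CPT entry given its parents."""
    return (bernoulli(P_CLOUDY, cloudy)
            * bernoulli(P_SPRINKLER[cloudy], sprinkler)
            * bernoulli(P_RAIN[cloudy], rain)
            * bernoulli(P_WET[(sprinkler, rain)], wet))


def probability(query: dict, evidence: dict = None) -> float:
    """P(query | evidence) by enumerating all 2^4 configurations."""
    evidence = evidence or {}
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(VARIABLES)):
        world = dict(zip(VARIABLES, values))
        if any(world[name] != value for name, value in evidence.items()):
            continue  # configuration inconsistent with the evidence
        p = joint(*values)
        denominator += p
        if all(world[name] == value for name, value in query.items()):
            numerator += p
    return numerator / denominator


if __name__ == "__main__":
    # Belief updating: probability of rain once the grass is observed to be wet.
    print(probability({"rain": True}, {"wet": True}))
    # Conditional independence: knowing Rain does not change Sprinkler given Cloudy.
    print(probability({"sprinkler": True}, {"cloudy": True, "rain": True}))
    print(probability({"sprinkler": True}, {"cloudy": True}))
```

Enumeration visits every configuration, so its cost doubles with each additional binary variable; this is the growth that makes exact inference impractical for large networks.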
Therefore, the development of approximate inference methods, where there is normally a trade-off between time and accuracy (normally linked to the ‘space’ of solutions that can be searched), has been of special relevance to BN. Different approaches are used today, such as direct sampling, loopy belief propagation and variational methods, though stochastic simulation (e.g. probabilistic logic sampling, likelihood weighting, Gibbs sampling, ...) may be the preferred choice in most cases. These algorithms generate samples from random configurations of the existing Bayesian network and estimate the posterior probabilities from the sampled configurations. Here, the issue is to construct a database with enough sampled cases to obtain a valid probability distribution over the BN variables, one that follows the probability distribution specified in the CPTs (or CPDs). A good introduction to these methods is provided in Jensen and Nielsen (2007) (see footnote 1).
Footnote 1: There is a second edition of Jensen’s book, dated 2007, which provides an updated reading. However, the materials that support the article can be traced back to the first edition of 2001.
2.3. Learning and adaptation
Learning graphical models has become a very active research topic and many algorithms have been developed for this purpose. Introductory and advanced information on probabilistic network learning can be found in Jordan (1999) and Neapolitan (2004). Three approaches can be distinguished:
2.3.1. I – Learning the structure
This kind of learning seeks to build the whole structure of the Bayesian network from a blend of data and expert criteria. Methods for structural learning include naïve approaches (see below), search-and-scoring-based methods and dependency analysis methods (Cheng, Greiner, & Kelly, 2002). The application of these networks within learning is also known as learning by stochastic models (Dietterich, 1997). In this topic there is a wide area of research concerning the induction of classifiers (of the complete structure)
Fig. 5. Graphic representation of a basic BN model.
using different complexity levels of BN (Larrañaga, Lozano, Peña, & Inza, 2005). A model hierarchy of increasing complexity may be established, with the unrestricted Bayesian network at the top end and the naive Bayes (NB) classifier at the bottom end. The NB classifier assumes that all the variables are conditionally independent given the class, and has been reported to perform well. Between NB classifiers and unrestricted BN there is a growing scale of admissible structural complexity: semi-naive Bayes, tree-augmented naive Bayes (TAN) (Friedman, Geiger, & Goldszmidt, 1997), k-dependence classifiers, Markov blankets and Bayesian multinets are the best-known methods. The main aim of this increase in the complexity of classifier structures is to relax network constraints such as independence assumptions, while keeping computation costs under control.
2.3.2. II – Learning probabilities in batches
In this approach, information regarding the conditional distributions is learned. It is based on initial knowledge of the existing structure of the network (the causal relationships may be established by an expert). Parameter estimation uses algorithms such as expectation–maximization (EM) to look for the best parameter distribution once the a priori graph configuration is set. This approach and the previous one need a large sample/case database and are close to traditional approaches to knowledge acquisition from data.
2.3.3. III – Learning probabilities sequentially
This approach is a little different from the other two. It relies on an initial structure and set of probabilities (which may be derived from experience or from machine learning, as indicated in the previous two cases). However, it is also necessary to let the probabilities adapt to a particular context. Therefore it is also called ‘adaptation’ or ‘sequential updating’.
Adaptation is the process of refining the (conditional) probabilities specified for a Bayesian network by taking into consideration the actual outcomes of the experiment. This is probably the most useful type of learning mechanism for machinery diagnosis, as the most important input (in learning terms) can be expected to come from local usage of automated tools as they start to be applied in maintenance and diagnosis systems. For example, every time a machine is diagnosed, the information about its symptoms and problems can be used to adapt the network’s probabilities. In Cooper (1990), fractional updating and fading mechanisms are reviewed as a means of adaptation, where adaptation is observed to
reduce certain uncertainties related to the exact value of the probabilities, which may change depending on the context (e.g. the plant where vibration diagnosis is performed); these are known as second-order uncertainties. Although the ideal way to handle adaptation is to use an extra parent node T that reflects the influence of a ‘context type’ variable on the rest of the variables, this is difficult to handle. What is normally done is to assume certain independencies (independence between the uncertainties of each variable, and independence between the configurations of parents applied to a variable in different contexts). With these assumptions, the probability P(A|B,C) can be considered as a distribution established through past cases, with the value attached to each state of A being interpreted as a fraction of the total number of past cases. Each time a new inference is produced, the number of past cases increases by one, and the number of estimated cases related to each state of A is increased by the fraction corresponding to the evidence found for that state. Because the algorithm tends to overestimate the number of cases (there may be cases where no evidence at all was retrieved for A, but a new case is still added), an associated mechanism, fading, allows the importance of the previous sample size to be decreased by reducing the number of past cases.
Bayesian networks have been widely used in the field of medicine, where they have helped in the decision-making process of diagnosis as well as in determining the condition, as shown in Onisko, Druzdzel, and Wasyluk (1998) and Sierra and Larrañaga (1998). However, until the last few years, this probabilistic model was not used in other application fields, such as robotics (Lazkano, Sierra, Astigarraga, & Martínez-Otzeta, 2007) or industry, where its use is novel and, as we will see in the following sections, offers a series of advantages as well as good results.
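The fractional updating and fading scheme described earlier in this subsection can be illustrated with a minimal sketch. It is not taken from the cited works: the function name, the argument names and the numbers are assumptions made for illustration, and a single column of the CPT (one configuration of the parents B, C) is treated in isolation.

```python
def fractional_update(probs, sample_size, evidence, fading=1.0):
    """Update one CPT column P(A | parents) with a new, possibly soft, case.

    probs       -- current probabilities, one per state of A (sum to 1)
    sample_size -- equivalent number of past cases behind `probs`
    evidence    -- belief over the states of A for the new case (sums to 1)
    fading      -- factor in (0, 1]; below 1 it discounts the past cases
    """
    if not 0.0 < fading <= 1.0:
        raise ValueError("fading must lie in (0, 1]")
    # Fading reduces the weight of the previous sample size.
    faded_size = fading * sample_size
    # Past probabilities are read as fractional counts of past cases;
    # each state receives the fraction of the new case given by the evidence.
    counts = [p * faded_size + e for p, e in zip(probs, evidence)]
    new_size = faded_size + 1.0  # one more case has been seen
    return [c / new_size for c in counts], new_size


# Hypothetical example: 10 past cases, the new case points fully to state 2.
updated, size = fractional_update([0.8, 0.2], 10, [0.0, 1.0])
faded, faded_size = fractional_update([0.8, 0.2], 10, [0.0, 1.0], fading=0.5)
```

With fading equal to 1 the old cases keep their full weight; a value below 1 shrinks the equivalent sample size before the new case is added, so that recent evidence moves the probabilities further.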
The following is a brief literature review of the implementation of this model in the industrial field, as this field is closer to aerospace than medicine is.
2.3.4. Bayesian networks for diagnosis
The first application of BNs in the area of mechanical faults is reported by Breese, Horvitz, Peot, Gay, and Quentin (1992), who describe the process of developing a BN for gas turbine performance diagnosis that covers both efficiency and operating malfunctions. After conversations with diagnostic experts, the article first states the importance of uncertain relationships among faults and observations in power generation systems. It also proposes methods to obtain the prevalence of the failures by simply using statistical data such as the Mean Time Between Failures (MTBF). It further indicates that the best way to extract the values for the CPTs is to query the experts in the causal direction. That is, it is easier for an expert to assess the likelihood of a symptom given that a fault has appeared than to work in the opposite, diagnostic direction and assess the probability of a fault being true given the observations.
The application of probabilistic graph models to the diagnostics of a motor-pump system is presented in Chen and Provan (1999). The model proposes three types of variables that are converted into nodes of the causal graph: unobservable variables, representing the states of components that cannot be directly observed; assumable variables, describing the operating characteristics of such components; and observable or evidence variables, typically system sensors. The model described focuses in particular on bearing faults. Whereas the assumable variables (rectangles in Fig. 6) represent the final status (assumption) concerning misalignment, the unobservable variables represent nodes of the causal network structure that relate different component faults to each other, in addition to the relation between faults and symptoms.
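The remark attributed to Breese et al. (1992) about eliciting CPTs in the causal direction can be made concrete with Bayes' rule for a single fault and a single symptom. The sketch below is only an illustration: the function name and all the numbers are hypothetical, and a real network would combine many faults and observations.

```python
def fault_posterior(prior_fault, p_symptom_given_fault, p_symptom_given_ok):
    """Diagnostic probability P(fault | symptom) from causal-direction inputs.

    prior_fault           -- prevalence of the fault (e.g. derived from MTBF data)
    p_symptom_given_fault -- expert estimate of P(symptom | fault)
    p_symptom_given_ok    -- expert estimate of P(symptom | no fault)
    """
    joint_fault = p_symptom_given_fault * prior_fault
    joint_ok = p_symptom_given_ok * (1.0 - prior_fault)
    total = joint_fault + joint_ok
    if total == 0.0:
        raise ValueError("the symptom has zero probability under both hypotheses")
    return joint_fault / total


# Hypothetical numbers: a rare fault with a fairly specific symptom.
posterior = fault_posterior(0.01, 0.90, 0.05)
```

The expert only supplies quantities in the causal direction (fault to symptom); the diagnostic quantity is then computed rather than elicited.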
In Bobbio, Portinale, Minichino, and Ciancamerla (2001) it is indicated that any reliability system built as a standard Fault Tree
S. Ferreiro et al. / Expert Systems with Applications 39 (2012) 6402–6418
Fig. 6. BN for the identification of bearing related faults.
Analysis (FTA) can be mapped into a BN, as BNs represent a more general graph model for cause-event relations. The article first indicates how the AND and OR gates of FTAs can be automatically mapped onto simple CPTs. In this case, the BN node corresponding to an FTA leaf is a simpler, deterministic version of the probabilistic nodes that can be constructed within any BN. More importantly, it is also possible to enhance standard FTAs (and thus the BNs extracted from them) with characteristics such as common cause failures or noisy gates, which can be reflected in the BN by using values different from 0 or 1 in the CPTs, thus converting the nodes/gates into probabilistic nodes in the causal graph (instead of deterministic ones).
Friis-Hansen (2000) also investigates the application of Bayesian networks to decision support systems related to maintenance planning and risk analysis within the maritime industry. It demonstrates that the use of BNs may be very adequate; one of the benefits is, for instance, that they outperform FTAs as knowledge modeling mechanisms. Finally, it foresees the potential of BNs for diagnosis through the combination of uncertainty and model updating features, although the lack of data impedes further research.
The University of Aalborg is also one of the main research nodes for Bayesian networks, where one of the most robust BN shells, called Hugin (Andersen, Olesen, Jensen, & Jensen, 1989), has been developed. Concerning the use of BNs as a tool for knowledge acquisition from data, Jensen et al. (2001) developed the SACSO system (System for Automatic Customer Support Operations) as an aid in the resolution of printing problems. It included a tool (called Bayesian Automated Troubleshooting System – BATS – Author) developed specifically for the capture of expert knowledge (in the form of Bayesian networks) with no need for knowledge engineers, giving rise to the development of a complete tool for diagnosis (Skaanning, 2004).
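The mapping of fault tree gates onto CPTs discussed above for Bobbio et al. (2001) can be sketched as follows. This is an illustrative sketch, not the procedure of the cited article: the function names are hypothetical, the CPT is stored as a dictionary from input configurations to P(output = 1), and the noisy-OR parameters are invented numbers.

```python
from itertools import product


def gate_cpt(n_inputs, gate):
    """Deterministic CPT of an AND or OR gate: entries are only 0 or 1."""
    if gate not in ("AND", "OR"):
        raise ValueError("gate must be 'AND' or 'OR'")
    combine = all if gate == "AND" else any
    return {bits: float(combine(bits))
            for bits in product((0, 1), repeat=n_inputs)}


def noisy_or_cpt(link_probs, leak=0.0):
    """Noisy-OR CPT: active cause i triggers the output with link_probs[i]."""
    cpt = {}
    for bits in product((0, 1), repeat=len(link_probs)):
        # Probability that no active cause (and no leak) triggers the output.
        p_none = 1.0 - leak
        for active, p in zip(bits, link_probs):
            if active:
                p_none *= 1.0 - p
        cpt[bits] = 1.0 - p_none
    return cpt


and_gate = gate_cpt(2, "AND")
or_gate = gate_cpt(2, "OR")
noisy = noisy_or_cpt([0.9, 0.8])  # hypothetical link probabilities
```

The deterministic gates need no elicitation at all, while the noisy-OR gate needs one parameter per cause instead of one value per configuration of the parents.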
Following this work, Langseth and Jensen (2003) also target the problem of fault diagnosis, seeking to efficiently generate an inspection strategy for detecting and repairing faults in a complex system, and using Bayesian networks to represent the troubleshooting domain. The work focuses on repair ‘actions’, i.e. on the sequence of actions to be taken by an operator, taking into account not only failure probabilities but also the cost of intervention (time, delay, spares, etc.).
In Weidl, Madsen, and Dahlquist (2002) a Bayesian network is presented for Root Cause Analysis (RCA) in industrial process control using condition monitoring information, applied to pulp and paper processes. It also incorporates a decision support module that ranks the most appropriate actions to follow based on a probability-cost/efficiency function. The belief part has object-oriented features, so that it is possible to embed part of the model into an ‘instance node’ that represents a sub-graph. These can be interpreted as ‘generic’ models that may be expanded multiple times at different parts of the complete graph, thus facilitating the development of complex models with repetitive areas (such as plant processes) with respect to modeling, knowledge acquisition and learning. One of these generic models (for control loop and signal classification) may be seen below. Subsequent work (Weidl, Madsen, & Israelsson, 2005) also includes supervised sequential learning for adaptation, based on the combination of
sequential updating and fading mechanisms. Here, all data arriving from the signals of the distributed control system of the process plant serve as feedback for the updating.
Anomaly detection using Naïve Bayes (NB) approaches has also been performed by Hamerly and Elkan (2001), where data sets containing relatively few failures serve to construct models that outperform statistical analysis models. In Broadwell (2002) this concept is applied to the detection of a faulty Small Computer System Interface (SCSI) unit (four positive cases) given some SCSI early warning parameters (16 SCSI units) out of a total of more than 4000 normal units.
Other recent works describing unrestricted BNs for condition monitoring are also worth mentioning. The tool Autonet Express is described in Lund and Faulkner (2006), with simple graph configuration modeling and extended links to feed CPTs from data. The technique seems to be strongly focused on the acquisition of CPT values from existing samples, through its various links to data repositories. In this sense, it is a very specific deployment of the BN potential in condition monitoring. Yan and Shi-Qi (2007) explain how to perform engine analysis based on lubricant information. First, the graph structure (Fig. 7) is clearly divided between faults and symptoms, whose causal relations have been extracted from experience. Second, the CPTs are constructed using the noisy-OR approach to reduce the number of values needed for each CPT. Here belief is also linked to a diagnosis cost function that may help to decide between corrective and ‘proactive’ actions, similar to the decision support (DS) module in Weidl et al. (2002). Graphical models are indicated as an interesting way to automate diagnosis and condition monitoring tasks within a Product Lifetime Management (PLM) approach in Fathi, Holland, Abramovici, and Neubach (2007).
In this case BNs are among the key technologies providing support in the operation phase. There is a brief example of a diagnosis network indicating causes (propulsion characteristics) that may lead to certain faults in belt conveyors.
2.3.5. Bayesian networks for prognosis
There are some articles dealing with the automation of prognosis by means of BNs that are worth examining and comparing. Langseth (1998) indicates the possibility of using BNs to analyze the survival times of mechanical systems. In this case, BNs are applied to discover the links among the covariates and with the response variable (the Time To Failure), thereby extending the application of the proportional hazards assumption within the Cox regression model. A model is built on the basis of the Gas Turbine subset of the Offshore Reliability Data repository (OREDA, 2002). The main attributes in the database are fed into an algorithm (Bayesian Knowledge Discoverer) that calculates the links between the variables (such as geographical location, environmental information, planned Preventive Maintenance, etc.) from the dataset. The final BN is shown in Fig. 8. Here it is important to notice that only two variables (Preventive Maintenance, PM, and Environment) have a direct effect on the response variable, time to failure (TTF). This means that if these two values are available for a case, no other information is needed to compute TTF. However, if either or both values are unknown, the TTF estimate can still be inferred from estimates of PM and Environment, which in turn can be inferred from the rest of the nodes that do have a value.
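The way a TTF estimate can still be obtained when PM or Environment is unknown may be sketched by marginalizing over the parents of TTF. The state names, the CPT values and the function name below are hypothetical and are not taken from Langseth (1998) or from OREDA; the sketch also assumes, for simplicity, that the beliefs about PM and Environment are independent, whereas a full BN would propagate their joint belief.

```python
# Hypothetical CPT: P(TTF | PM, Environment) with two states per variable.
TTF_CPT = {
    ("low", "mild"): {"short": 0.4, "long": 0.6},
    ("low", "harsh"): {"short": 0.7, "long": 0.3},
    ("high", "mild"): {"short": 0.1, "long": 0.9},
    ("high", "harsh"): {"short": 0.3, "long": 0.7},
}


def ttf_belief(cpt, pm_belief, env_belief):
    """Marginal belief over TTF given beliefs over its two parents."""
    belief = {}
    for pm, p_pm in pm_belief.items():
        for env, p_env in env_belief.items():
            for state, p in cpt[(pm, env)].items():
                belief[state] = belief.get(state, 0.0) + p * p_pm * p_env
    return belief


# Both parents observed: the CPT row is read directly.
observed = ttf_belief(TTF_CPT, {"high": 1.0}, {"harsh": 1.0})
# Environment unknown: its estimate (here 50/50) is averaged over.
partial = ttf_belief(TTF_CPT, {"high": 1.0}, {"mild": 0.5, "harsh": 0.5})
```

When both parents are observed the result is simply one row of the CPT; when one is missing, its belief (itself inferred from the remaining nodes in the full network) weights the corresponding rows.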
Langseth (1998) also indicates one of the main distinctions between BNs and most regression and time-series analysis algorithms: whereas the latter look only for the best correlation between the response variable and the covariates, the BN also looks for the correlations among all the covariates, which is useful when it is important to understand the effect of indirect covariates on the degradation process. This is similar to the principle
Fig. 7. BN for lubricant analysis.
Fig. 8. BN model calculating Time To Failure (TTF) (Langseth, 1998).
used in Arnaiz, Ferreiro, and Buderath (2010), which allows new information (such as usage-related variables) to be linked to existing models, to be used when no direct information on the degradation is known. However, Langseth (1998) performs an estimation based on statistical values to extract the probability of failure (the TTF) of an asset, which may also be interpreted as a diagnosis. In contrast, the current work seeks an estimation of degradation (instead of a probability of failure) based on dynamic usage plans, which are expanded to provide the RUL.
Byington, Roemer, and Galie (2002) present an example of a model-based approach to the prognosis of gear tooth failure. Here the main effort is devoted to the development of the physical model, taking into account existing standards for tooth root stress as a function of root crack initiation. The model also includes a Dempster–Shafer fusion algorithm that considers vibration signals as correlated with the probability of tooth cracking. These signals, together with the physics-based results, are considered by the algorithm for the final Time to Failure (TTF) result. An extrapolation of past speed and loading profile statistics is claimed to provide a future probability of failure, although no details are given about it or about the extrapolation into the future. It is also interesting to point out that one of the most widespread classifications of prognostic approaches and their characteristics is included in this work and extended in Roemer, Byington, Kacprzynski, and Vachtsevanos (2005). Vachtsevanos, Lewis, Roemer, Hess, and Wu (2006) also follow this work and present a case study on bearing prognosis using a model-based approach. Here again the details shown focus on the generation of the models (early fault detection, high frequency enveloping analysis, bearing spall initiation and propagation).
Muller, Suhner, and Iung (2004) describe a Dynamic Bayesian Network (DBN) for the prognosis of water valve malfunctions (see Fig. 9). The failure is modeled through four different values of the discrete nodes VBk and VBk+1, which range from the normal state (state 1 = OK) to degradation by fouling (state 4 = Fail).
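The link between the four-state nodes VBk and VBk+1 can be illustrated with a minimal sketch in which the CPT of VBk+1 given VBk acts as a transition matrix. The matrix values and function names below are hypothetical and are not those of Muller et al. (2004); the sketch also ignores the other nodes of the network and assumes that degradation can only progress.

```python
# Hypothetical CPT P(VB_{k+1} | VB_k); rows are the current state (1..4),
# columns the next state. State 4 (Fail) is absorbing.
TRANSITION = [
    [0.90, 0.08, 0.02, 0.00],
    [0.00, 0.85, 0.12, 0.03],
    [0.00, 0.00, 0.80, 0.20],
    [0.00, 0.00, 0.00, 1.00],
]


def step(belief, transition):
    """Belief over VB at slice k+1 from the belief at slice k."""
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n))
            for j in range(n)]


def predict(belief, transition, steps):
    """Apply the one-slice prediction repeatedly (k -> k + steps)."""
    for _ in range(steps):
        belief = step(belief, transition)
    return belief


start = [1.0, 0.0, 0.0, 0.0]          # valve known to be OK at slice k
one_ahead = predict(start, TRANSITION, 1)
two_ahead = predict(start, TRANSITION, 2)
ten_ahead = predict(start, TRANSITION, 10)
```

Under these assumptions the probability of the Fail state can only grow with the prediction horizon, which is the behaviour expected of a pure degradation process.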
It is also interesting that in this case all probabilities are elicited from existing knowledge. Prior probabilities are established by default, and the CPTs of variables dP and VBk+1 are filled from existing rules of operation. The CPT of VBk+1 is filled thanks to existing knowledge of the fouling process in similar valves. Given this, once the process starts, it is possible to predict the value of the valve body status at the future time t + 1 and, consequently, the value of the resulting pressure as a consequence of the change in valve body status (if the rest of the parameters remain unchanged). This can be further extended to recursively predict the status at time k + n. The example is further extended in Iung, Véron, Suhner, and Muller (2005), incorporating utility and action nodes, in a system that can be used to simulate the final status that different maintenance actions can produce in several components (through the mathematical modeling of several maintenance indicators) and the cost of the maintenance operation itself. This last system focuses on the effect that maintenance actions have on the state of components, simulating different alternatives to select the most appropriate one (minimum cost). Last, this model is also formalized in Muller et al. (2008) as part of a process model that extends that of Byington et al. (2002). Muller chooses dynamic BNs as the preferred method to treat effectively the model related to the degradation processes, to the functional analysis of the system and even to the influence of degradation indicators and maintenance actions. In conclusion, the belief models are quite similar to those of Arnaiz et al. (2010). In particular, Muller et al. (2004) replicates in the
Fig. 9. Prognosis water valve malfunctions (Muller et al., 2004).
BN a model of the operating process of a pressure valve. However, there are some differences from the current work. First, concerning the ‘‘projection of the usage profile’’ to perform the prognosis, as indicated in Thurston (2001), Muller’s work does not consider the influence of changes in the operating conditions. There is an influence of the maintenance plan, considered as a set of maintenance actions that may impact the deterioration process, but the operational plan (i.e. the existence of more severe operating conditions during the next period of time) is not modeled. Second, some concepts are not handled in these studies, such as the treatment of confidence limits around the process. Also, the model does not contemplate adaptation of the equations underlying the CPTs, nor of the a-priori values, so the potential for adaptation is not considered. Last, the nature of the models is different. Muller’s models deal with operation process monitoring, in such a way that they cannot be used to automate the diagnosis response (prior to prognosis). In this case the diagnosis process is overlooked, and the focus is on the prediction. In fact, the whole degradation prediction is constructed within the BN, as the input parameters for the next prediction at t + 1 come from the results obtained at time t. In our case, we take advantage of structures built for diagnosis modeling to forecast degradation and hence predict the RUL. In this sense there is no need for a ‘dynamic’ approach.
Przytula (2007) also presents a Bayesian network approach applied to avionics that uses both past and future usage in order to predict degradation (Fig. 10). The model does not use a single RUL measure, but rather predicts future observations of different parameters to foresee failures based on indicators exceeding thresholds.
However, the concept is similar, as it is based on the existence of diagnosis models that provide information about the observations and that can be further extended to prognosis through the inclusion of future usage parameters. The main difference from the model in Arnaiz et al. (2010) is the treatment of usage, as it applies just a single usage variable. In addition, it does not provide any treatment of confidence.
Last, there exist several approaches that use ANN, neuro-fuzzy or statistical techniques to predict failures. As an example, Wang, Golnaraghi, and Ismail (2004) develop a neuro-fuzzy approach for gear fault prognostics. The approach is based on input information coming from wavelet transform features. The models are trained with data coming from previous experiments, and even from unrelated equations, and serve to perform a one-step look-ahead (10 min) on different failures (wear, spalling, tooth cracking, ...). To sum up, here fault prediction is tackled as a time series modeling problem, which nevertheless obtains good results with respect to other data-based approaches, such as recurrent neural networks. More interestingly, El-Koujok, Gouriveau, and Zerhouni (2008) also present a neuro-fuzzy approach as a solution for prognosis and Remaining Useful Life estimation. The work presents a predictive window greater than that in Wang et al. (2004) and, more importantly, provides a normal distribution for the prediction error associated with the fuzzy output, which makes it possible to build confidence intervals on the predicted degradation. The work includes an application to the prediction of the temperature of a hair drier, taking into account four previous measurements plus the voltage of the heater, and works well for forecast windows of 1, 5 and 10 steps.
The model may also include a-priori information as part of the initial fuzzy structure development, but seems less suitable for providing inference with partial observations, or incorporating the information corresponding to the future usage of the system.
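The confidence intervals mentioned above in connection with El-Koujok et al. (2008) follow directly once the prediction error is modeled as normally distributed. The sketch below is a generic illustration rather than the cited method: the function name and the numbers are hypothetical, and the error is assumed to be defined as the actual value minus the predicted value.

```python
from statistics import NormalDist


def prediction_interval(prediction, error_mean, error_std, confidence=0.95):
    """Two-sided interval for the true value under a normal error model.

    Assumes error = actual - predicted ~ Normal(error_mean, error_std).
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # Standard normal quantile for the requested two-sided confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    center = prediction + error_mean
    half_width = z * error_std
    return center - half_width, center + half_width


# Hypothetical predicted degradation of 10.0 units with unit error spread.
low, high = prediction_interval(10.0, 0.0, 1.0)
wide_low, wide_high = prediction_interval(10.0, 0.0, 1.0, confidence=0.99)
```

A larger error spread or a higher confidence level widens the interval, which is how a loss of confidence over longer prediction windows would show up in practice.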
Fig. 10. Avionics prognosis (Przytula & Choi, 2007).
3. New approach: BNs for prognosis
Maintenance strategies have been evolving in different fields of industry due to the new requirements imposed by the needs of companies. This entails minimizing economic costs while increasing productivity, customer satisfaction and benefits. The implementation of an efficient maintenance strategy has become indispensable because it plays a critical role in competitiveness, and that is the reason why traditional maintenance practices are being transformed into a proactive process. The predictive maintenance strategy increases cost-efficiency by anticipating failure by means of prognostic techniques, which are the basis for a new decision support system intended to improve maintenance planning and to optimize its resources. This new type of maintenance based on prognosis is carried out when there is a certain level of degradation in the system/subsystem, and not after fault recognition or at predetermined intervals set according to a criterion for reducing the probability of failure. It performs condition monitoring, assesses the health status and predicts its evolution during the operating time in order to analyze the impact of future degradation, and is thereby able to predict failures and to plan future maintenance actions, which is the basis of a PHM system.
As indicated in Bengtsson (2004) and documented in Jardine, Lin, and Banjevic (2006), current development in technologies for condition-based maintenance is heading towards prognostics (that is, a diagnosis of future status), which Thurston and Lebold (2001) defined as follows: ‘‘The primary function of the prognostic layer is to project the current health state of equipment into the future taking into account estimates of future usage profiles. The prognostics layer may report health status at a future time, or may estimate the Remaining Useful Life (RUL) of an asset given its projected usage profile’’.
Prognostic techniques provide the ability to predict accurately and precisely the Remaining Useful Life of a failing component or subsystem and, as explained in Vachtsevanos et al. (2006), accurate and precise prognosis demands good probabilistic models of the fault growth and statistically sufficient samples of failure data to assist in training, validating and fine-tuning prognostic techniques. Moreover, the models need to represent and manage uncertainty, adaptation and updating of knowledge, and be able to use information from different sources such as physical models, condition data, operational data, etc. This is one of the more challenging aspects of the modern prognostic and health management (PHM) system that reduces operational and support cost and
improves the safety of many types of complex systems. Therefore, although most existing prognostic techniques are component-oriented and do not take system performance into account, prognostics is not an isolated task but is closely linked with PHM systems (with condition-based maintenance, health assessment and the decision support process); consequently, they must be integrated as a whole, which still remains to be addressed.
As explained in Section 2, each DM platform provides the hardware and software services that allow local applications and databases to be hosted and to communicate with distributed systems connected to the maintenance network. The platform provides an environment that supports a new maintenance concept based on a new decision support process which uses diagnosis and prognosis for decision making (Fig. 11). Capturing information on the status of various subsystems and processing it by means of diagnostic and prognostic algorithms makes it possible to calculate the Remaining Useful Life (RUL) of each of the components, to get an overview of the global status of the aircraft/fleet, to establish the level of risk depending on the operational plan, and to recommend future actions for future usage and maintenance scheduling. It constitutes the basis for a new decision support process.
The new decision support process (Fig. 12) examined in this article adds a proactive function to the present aircraft line maintenance procedure, where the GO and NO–GO decision is supported by aircraft health assessment. Under the ‘‘decision support’’, the ‘‘operational risk assessment’’ calculates and evaluates the operational risk for aircraft and fleet operation.
It creates or reshapes the long-term maintenance plan based on the aircraft Conditional View (the condition of the aircraft), evaluates the effect of a virtual maintenance plan on alternative future operational scenarios, and provides quantified operational risk indicators for further decision support. Aircraft condition is determined by the
‘‘Conditional View’’ module, which assesses the status and degradation of certain components of the aircraft. This article focuses on giving an overview of the function performed by the module and on the development of a forecasting model based on a Bayesian network to estimate the Remaining Useful Life (RUL) of the brakes as they degrade.
3.1. Conditional View module abilities
The ‘‘Conditional View’’ module is responsible for providing Remaining Useful Life predictions with a level of accuracy with respect to the RUL derived from real-life operation. It provides a basis for operational risk estimation, together with other sources of information such as operational constraints, economic/safety information, etc. The ‘‘Conditional View’’ module generates an operational view of the aircraft, taking into account the health status of its components and their Remaining Useful Life, and updating these data with specific information concerning future usage of the aircraft, which can be obtained from the operational plan.
In order to develop the functions of the ‘‘Conditional View’’, several issues must be addressed. Firstly, as expressed in Byington et al. (2002), there are basically three types of information that may form the basis of the RUL prediction in a prognostic approach:
• Statistical models: knowledge based just on failure probabilities coupled with expert judgments (reliability data).
• Physical or mathematical models: knowledge based on parameters and the connections between them to study the behavior of complex systems. This type of model is validated physically at test-benches.
• Models based on condition or performance monitoring: normally knowledge based on the identification of partial information (condition data) as monitored by the system.
Fig. 11. OSA-CBM layers.
Fig. 12. Operation support.
Models based on condition or performance monitoring and those based on physical/mathematical models interpret the output (RUL) as degradation information, whereas statistical models refer to the output as a perceived failure probability (making no reference to the internal degradation of the piece). It is also important to understand that there is a trend towards using a mixture of these types of information.
Secondly, it is of key importance to establish an appropriate level of accuracy/reliability. This task involves two main sources of uncertainty that must be quantified if RUL predictions are to be improved:
a. Original RUL estimations (at the current time) are normally set up as part of laboratory work including mathematical, physical and/or statistical modeling together with expert criteria. As a result of the uncertainty included in the model (incomplete data, incomplete model), an element of uncertainty is added to every RUL prediction.
b. RUL predictions (in the future) are based on predicting the RUL with model input parameters taken from expected usage.
These two sources of uncertainty result in a loss of confidence which depends on the window of the prediction. Both the confidence limits and the loss of confidence grow over time.
Finally, adaptation and updating of knowledge are vital to maintaining the optimum Conditional View. Once there is a knowledge-based system (from experience or historical data), there exist many reasons for learning and furthering knowledge, as explained in Gilabert and Arnaiz (2006). The ‘‘Conditional View’’ carries out an ongoing adaptive prognosis based on fleet feedback. Prognosis is based firstly on an initial model to predict the RUL
within safety limits, but it is adjusted as more knowledge becomes available, for example concerning degradation trends or the relation between aircraft usage and degradation. The ‘‘Conditional View’’ uses fleet feedback such as fleet statistics and operational usage to compare aircraft degradation patterns with initial RUL degradation patterns.
According to these characteristics, the ‘‘Conditional View’’ module should use technologies that provide accurate estimation for degradation and reliability models, with the capacity to include confidence information as part of the estimation. They should also include usage-based information as part of the input (influence factors) of the models, and re-assess and modify the models from feedback.
The next sections show a specific function of the ‘‘Conditional View’’ module based on a probabilistic model. It predicts an aircraft’s brake wear using a Bayesian network. The basis of this model was explained in the section above.
4. Experimental results
4.1. Experimental setup
Currently, the estimation of brake wear in aeronautics is made with the use of a physical model. This has been developed by British Aerospace (BAE) Systems using test data from Airbus UK and is based on aircraft weight, landing velocity, brake operation during landing, flap position and initial brake temperature. Besides using a physical model to make estimations about brake wear degradation, it may be possible to use Standard Degradation or Simple Extrapolation as models. Standard Degradation uses the ‘standard landing’ wear rate, taken as the mean wear rate based on experience (0.1 mm), whereas Simple Extrapolation attempts to predict future brake wear by relying on real data from historical
evidence. This article proposes two new models based on a Bayesian network: PhysicalBN and OpBN.
4.2. Final model
The PhysicalBN model examines the parameters with the greatest influence on the degradation estimation used by the physical model: aircraft weight (MassWeight), landing velocity (v_Init) and brake operation during landing (BrakeUse). The structure of the net was developed with expert help from Airbus UK and BAE Systems, while the probabilities associated with each parameter and the total/integral distribution of probabilities were obtained from a statistical analysis. This was derived from a body of data consisting of 3000 randomly generated samples, each sample reflecting the most important parameters for brake wear together with the real wear value taken from calculations based on the physical model provided by BAE Systems. Fig. 13 represents PhysicalBN and shows the information behind the main nodes corresponding to the variables mentioned above and their influence on brake wear.
The PhysicalBN model offers some advantages over the physical model, Standard Degradation or Simple Extrapolation:
• Causalities and probabilities can be established by expert criteria and statistical analysis from test-benches of the physical model.
• The model predicts approximately 0.11 mm wear per flight (which is the mean degradation in a normal landing) when there is no information about future conditions for aircraft weight, landing velocity and brake operation during landing. So, it is a good simulation of the physical model standard wear rate.
• The model associates the brake wear prediction with a confidence level of 95%.
However, brake wear may change substantially depending on flight conditions. A key point to note here is the way prognosis modeling changes according to the available information.
This information is not the same in each case: it can be operational plan information, historical lifetime measurements, trends or distributions related to the behavior of components, etc. There is important information in an operational plan that may be used to predict degradation parameters. An operational flight plan (Table 1) may determine the value of the degradation parameters and it can be known in advance. For instance, aircraft weight typically depends on flight distance, since the longer the distance the more fuel must be loaded on the aircraft for dealing with unexpected situations, more passengers, more freight, etc.
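The dependence of the PhysicalBN prediction on the available information can be illustrated with a minimal sketch. It assumes a two-state discretisation of each input node, uniform priors, and a table of mean wear values; all of these numbers are hypothetical and were chosen only so that the prediction without evidence reproduces the 0.11 mm quoted above. They are not the probabilities learned from the 3000 samples.

```python
from itertools import product

# Hypothetical priors for the three PhysicalBN input nodes (not from the paper).
PRIORS = {
    "MassWeight": {"light": 0.5, "heavy": 0.5},
    "v_Init": {"slow": 0.5, "fast": 0.5},
    "BrakeUse": {"off": 0.5, "on": 0.5},
}

# Hypothetical mean wear (mm) per (MassWeight, v_Init, BrakeUse) combination.
MEAN_WEAR = {
    ("light", "slow", "off"): 0.04,
    ("light", "slow", "on"): 0.08,
    ("light", "fast", "off"): 0.07,
    ("light", "fast", "on"): 0.12,
    ("heavy", "slow", "off"): 0.09,
    ("heavy", "slow", "on"): 0.14,
    ("heavy", "fast", "off"): 0.13,
    ("heavy", "fast", "on"): 0.21,
}


def expected_wear(evidence=None):
    """Expected wear (mm) given optional evidence on the input nodes.

    Unobserved nodes are marginalised out by enumerating every state
    combination that is consistent with the evidence.
    """
    evidence = evidence or {}
    names = list(PRIORS)
    total = 0.0
    norm = 0.0
    for combo in product(*(PRIORS[name] for name in names)):
        assignment = dict(zip(names, combo))
        if any(assignment[node] != state for node, state in evidence.items()):
            continue  # inconsistent with what has been observed
        weight = 1.0
        for name in names:
            weight *= PRIORS[name][assignment[name]]
        total += weight * MEAN_WEAR[combo]
        norm += weight
    return total / norm


if __name__ == "__main__":
    print(expected_wear())                         # no information: about 0.11
    print(expected_wear({"MassWeight": "heavy"}))  # partial information: about 0.1425
```

With no evidence the sketch returns the mean wear rate, and each additional piece of evidence moves the estimate towards the wear of the specific landing conditions.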
However, the PhysicalBN inputs (aircraft weight, landing velocity and brake operation) are not known a priori: their values are not available before the flight, so they cannot be used directly to predict brake wear. The BN structure, however, allows causal relations to be configured between operational plan features and the PhysicalBN model inputs. Thus, a second Bayesian network model (OpBN) is used to explain the influence of the ‘operational plan parameters’ on the original model input nodes. The original PhysicalBN is structurally expanded with the new information coming from the operational plan, resulting in OpBN (Fig. 14).

Now it is possible to make real predictions of the values of the input parameters for brake wear estimation for each future flight, under certain assumptions:

- FlightDistance represents hours of flight and influences the weight of the aircraft, as explained before.
- RunwayLength: both landing velocity and brake operation during landing depend on it. Landing velocity is lower when the runway is shorter, and the use of brakes diminishes as the length of the runway increases.
- Weather affects runway condition. If the weather is rainy, the runway will almost surely be wet, whereas if it is sunny, the runway will be dry.
- RunwayCondition: brake operation during landing depends on the condition of the runway. If the runway is wet, it is more probable that brake operation will be off during landing. When the runway is dry, brake operation during landing is more probable.

4.2.1. RUL computation

As a result, brake wear can be calculated (in mm) and mapped onto an estimate of RUL (in mm, or in a nominal number of landings using the ‘standard landing’ wear rate). However, the error of the RUL prediction (RUL update) increases during the computational process, which has a serious effect on the operational risk.
The computational process of RUL prediction starts with the expected usage, which is what OpBN can realistically forecast. Expected usage is linked to the RUL estimation, which in turn relies on past behavior:

$$\mathrm{RUL}(t) = \mathrm{RUL}(t-1) - f(\mathrm{ExpectedUsage}, \mathrm{processDataParameters}) \tag{1}$$

with $\mathrm{ExpectedUsage}(t)$ normally distributed and $\mathrm{RUL}(t_0)$ = initial brake thickness (mm) = 600 mm. In order to minimize this confidence loss, two questions are examined:
Fig. 13. Bayesian network (PhysicalBN).
Table 1. Operational plan.

Flight    Departure airport    Arrival airport    Flight date    Departure time    Arrival time
AF0011    CDG                  DEL                01/01/2008     10:20             22:15
AF0012    DEL                  CDG                02/01/2008     00:40             06:15
AF0022    CDG                  LAX                03/01/2008     22:30             03:50 (+1)
Check: Check-A; Wear (mm): 568
...
AF0489    SXM                  BOM                12/03/2008     19:15             03:50 (+1)
AF0577    BOM                  DEL                13/03/2008     21:10             23:55
AF0610    DEL                  CDG                14/03/2008     07:40             12:15
Check: Check-A; Wear (mm): 552

Airport code    Runway length    Runway condition    Weather
CDG             3600             Good                Wet        90%
DEL             3810             Poor                Dry        80%
BOM             3445             Fair                Dry        95%
...
Fig. 14. Bayesian network with operational plan parameters (OpBN).
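The way OpBN propagates operational plan information to a PhysicalBN input node can be sketched for one path of Fig. 14: Weather influences RunwayCondition, which together with RunwayLength influences BrakeUse. The sketch below is illustrative only. The state names and every probability are assumptions chosen to respect the qualitative rules stated for OpBN (rain makes a wet runway very likely, a wet runway makes braking less probable, a longer runway reduces brake use); they are not the conditional probabilities of the published network.

```python
# Hypothetical P(RunwayCondition | Weather).
P_CONDITION_GIVEN_WEATHER = {
    "rainy": {"wet": 0.95, "dry": 0.05},
    "sunny": {"wet": 0.05, "dry": 0.95},
}

# Hypothetical P(BrakeUse = "on" | RunwayCondition, RunwayLength).
P_BRAKE_ON = {
    ("wet", "short"): 0.4,
    ("wet", "long"): 0.2,
    ("dry", "short"): 0.9,
    ("dry", "long"): 0.6,
}


def p_brake_on(weather, runway_length):
    """P(BrakeUse = on | Weather, RunwayLength).

    RunwayCondition is not observed before the flight, so it is summed out.
    """
    return sum(
        p_condition * P_BRAKE_ON[(condition, runway_length)]
        for condition, p_condition in P_CONDITION_GIVEN_WEATHER[weather].items()
    )


if __name__ == "__main__":
    for weather in ("rainy", "sunny"):
        for runway_length in ("short", "long"):
            print(weather, runway_length, round(p_brake_on(weather, runway_length), 3))
```

The resulting distribution over BrakeUse would then be used as soft evidence for the corresponding PhysicalBN node, in the same way for MassWeight (from FlightDistance) and v_Init (from RunwayLength).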
1. Cumulative variance and confidence levels. Assuming that brake wear follows a Gaussian distribution with a 95% confidence level for each flight, and further assuming independence between flights, the loss or gain of confidence for the whole distribution is obtained by adding these Gaussian distributions:

$$\text{Expected usage}(t+1) = \text{mean}_1 + \text{mean}_2 = \text{mean}_3 \tag{2}$$

$$\text{Confidence lower}(t+1) = \text{mean}_3 - 2\sqrt{\sigma_1^2 + \sigma_2^2} \tag{3}$$

$$\text{Confidence upper}(t+1) = \text{mean}_3 + 2\sqrt{\sigma_1^2 + \sigma_2^2} \tag{4}$$

where $\sigma$ = sd.

2. Status observation. After some flights (at time $t_j$) physical checks are made at the gate and the actual brake wear is known (measured). Both the confidence limits and the predicted brake wear are then set to the value of the real degradation, so that a 100% confidence level is regained.

4.3. Results
Finally, to evaluate whether it is possible to build more accurate and adaptable models than the original, several results for the case of brake wear are shown.
Fig. 15. OpBN brake wear predictions for an operational plan of 35 flights.
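The behaviour plotted in Fig. 15 follows from combining the RUL update of Eq. (1) with the confidence accumulation of Eqs. (2)–(4) and the reset at each check. The following sketch shows one possible reading. It assumes that f in Eq. (1) is simply the expected wear of the flight, that the wear band maps directly onto a band around the RUL, and that a check replaces the prediction with the measured value. The function name `track_rul` and the example figures are hypothetical.

```python
import math


def track_rul(initial_rul, flights, checks=None):
    """Track RUL (mm) and its approximate 95% band over a sequence of flights.

    flights: list of (mean_wear, sd_wear) per flight, in mm.
    checks:  dict mapping a 1-based flight index to the RUL measured at a
             check carried out after that flight.
    Returns a list of (rul, lower, upper) tuples, one per flight.
    """
    checks = checks or {}
    rul = initial_rul
    variance = 0.0
    history = []
    for index, (mean_wear, sd_wear) in enumerate(flights, start=1):
        rul -= mean_wear            # Eq. (1), with f taken as the expected wear
        variance += sd_wear ** 2    # independent flights: variances add (Eqs. 3-4)
        if index in checks:
            rul = checks[index]     # status observation: real wear is known
            variance = 0.0          # full confidence is regained
        half_width = 2.0 * math.sqrt(variance)
        history.append((rul, rul - half_width, rul + half_width))
    return history


if __name__ == "__main__":
    plan = [(0.1, 0.03)] * 4  # hypothetical per-flight wear forecasts
    for row in track_rul(600.0, plan, checks={2: 599.75}):
        print(tuple(round(value, 3) for value in row))
```

For n flights with the same standard deviation and no check, the half-width of the band is 2·sd·√n, which is why the confidence limits widen as the number of flights between checks grows.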
In order to evaluate the PhysicalBN and OpBN models, the error rate is calculated on a new dataset of 100 samples by means of the MSE (mean squared error):

$$L(y, \hat{f}(x_i)) = \frac{1}{n}\sum_{i=1}^{n} \left(\hat{f}(x_i) - y_i\right)^2 = 0.00587 \tag{5}$$
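As a minimal illustration of Eq. (5), the mean squared error between predicted and real wear values can be computed as follows. The function name and the sample values are hypothetical and unrelated to the 100-sample dataset of the article.

```python
def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences between predictions and real values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((pred - true) ** 2 for true, pred in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    real_wear = [0.10, 0.12, 0.08]       # hypothetical measured wear (mm)
    predicted_wear = [0.11, 0.10, 0.08]  # hypothetical model output (mm)
    print(mean_squared_error(real_wear, predicted_wear))
```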
While the error of the models is not significant, it is nonetheless necessary to evaluate the error of the computational process of RUL prediction and to compare it with the error of other algorithms such as Standard Degradation or Simple Extrapolation. Fig. 15 and Table 2 represent an operational plan of 35 flights, during which three checks are performed. Fig. 15 shows how, as the number of flights increases, the degradation also increases and the thickness of the brake decreases. When a check is carried out, the real wear is known and the predicted and real wear are set to the same value, recovering 100% confidence. The first column of the table shows the number of flights made before a new check. When the check is carried out, the true condition of the brake can be observed and the error rate of each model can be calculated. The difference between the conventional models and the Bayesian networks is evident. The largest error rates appear in Standard Degradation and Simple Extrapolation, and these increase quickly over time, whereas PhysicalBN, which represents the physical model, gives improved results when data is available and, unlike the conventional models, does not underestimate brake wear: the wear estimated by PhysicalBN is greater than the real degradation. Consequently, as the operational plan is fulfilled, the predicted thickness of the brake (in mm) is lower than the real thickness, as shown in Fig. 15. Nevertheless, PhysicalBN is not suitable for prognosis because its input parameters, such as aircraft weight or landing velocity, are not available before landing. In that case, the results of PhysicalBN would be the same as those of Standard Degradation. Even so, PhysicalBN provides good levels of confidence with respect to brake
Table 2. Error rates.

Landings before a check    Standard degradation    Simple extrapolation    PhysicalBN    OpBN
3                          0.0256                  0.0256                  0.0756        0.3356
12                         1.9345                  2.0369                  0.0655        0.3045
20                         2.2                     4.0036                  0.8012        1.1088
degradation. OpBN was able to overcome this difficulty by using operational plan information to estimate the input parameters of the PhysicalBN. The evaluation of OpBN is more complex because the value of the model depends on the accuracy of the probabilities that link the operational parameters to the input parameters of the PhysicalBN. In order to evaluate OpBN, a data set was created from the data set of the initial network (PhysicalBN) based on the odds, so that the input parameters of OpBN (FlightDistance, RunwayCondition, RunwayLength and Weather) determine, in the most appropriate way, the values of the input parameters of PhysicalBN (MassWeight, v_Init and BrakeUse). Fig. 15 shows the prediction of the brake (wear or thickness) for the operational plan of 35 flights mentioned above, together with the associated confidence limits. The model provides a degree of uncertainty regarding the prediction, where the lower confidence limit never exceeds the real thickness; from another point of view, this implies that the brake wear prediction is always greater than the real degradation. In addition, Fig. 16 shows how the confidence limits widen as the time window between checks increases.

5. Conclusions and future works

The maintenance strategies used by the aeronautics industry need to evolve into a proactive process based on the condition of the components/subsystems of the aircraft (predictive
Fig. 16. Increment of confidence limits between checks.
maintenance). This process allows the state of the aircraft to be defined and a global vision of the aircraft to be obtained with some degree of foresight, providing a framework for decision making. It helps to establish the most appropriate operational plan given the current state of the aircraft, and to define the most appropriate timescale in which to carry out maintenance. The process supports decision making, avoids flight delays and cancelations caused by failures or unforeseen malfunctions, and reduces the economic costs currently associated with maintenance. However, despite all the efforts that have been made, maintenance tasks such as line and in-hangar maintenance of the aircraft continue to be reactive processes with high associated costs. This article has described a global framework which allows current maintenance (corrective or preventive) to be transformed into proactive maintenance based on the predicted condition of the aircraft through the use of prognostic techniques. This framework is complex and includes a series of emerging or development-phase technologies which need to be integrated into the whole process: from the definition of standards (e.g. OSA-CBM) to the development of state detection and prediction models which allow the RUL of the most critical components of the aircraft to be defined (e.g. the Bayesian network used to estimate brake wear). The precision of these models establishes the degree of utility and efficiency of the new prediction-based maintenance. In this sense, the ‘‘Conditional View’’ module presented in the article fulfils an important role within the global framework, as it is the module charged with implementing the functionalities needed for state detection and prediction in the most critical components of the aircraft (included in the MMEL).
It establishes the degradation and the Remaining Useful Life of these components/subsystems throughout the operational plan of the aircraft. Its models must be developed by taking into account the increased quantity of information available, by treating uncertainty appropriately, so as to establish the confidence limits under which reliable predictions may be realized, and, furthermore, it should have the capacity to adapt/readjust to new input information. The article has shown the particular example of its use for detecting and predicting the wear on the brakes of the aircraft based on a Bayesian network model. This example of use covers
part of the functionalities of the ‘‘Conditional View’’ module, and it is the part which should be highlighted in the current project due to the benefits and innovations of the Bayesian network as a predictive model. Firstly, the Bayesian network model permits the step from the diagnosis (detection of state) to the prognosis. The first BN model (PhysicalBN) imitates the behavior of the current physical model with an insignificant error, and permits a diagnosis of the current situation of the brake whilst the aircraft carries out its operational plan. Moreover, it is possible to extend the model to a second BN (OpBN), which is itself a prognostic model, by using information related to the operating plan (when the information relating to the values which the input parameters of the physical model take is unavailable). The input parameters of PhysicalBN (which simulate the behavior of the physical model) are estimated from the operational plan of the aircraft, and this, in turn, supplies the OpBN from which the RUL is calculated using the individual results of the brake wear in each flight realized. This process of calculating the RUL produces better results when the input parameter estimates of the PhysicalBN obtained from the aircraft operational plan are better fits. Likewise, the Bayesian network boasts a series of necessary properties for the production of good prediction: it allows confidence limits to be established, different types of uncertainty to be dealt with, and its behavior to be adapted and readjusted in the face of new input information. 
The Bayesian network provides all the characteristics needed for the ‘‘Conditional View’’ module to be used as a prognostic model. The use of this type of technique (or of similar techniques with the same characteristics) therefore deserves further consideration and encouragement when implementing the different use cases, so as to cover all the functionalities of the ‘‘Conditional View’’ module.

Acknowledgements

The authors gratefully acknowledge the support of the European Commission Sixth Framework Programme for Research and Technological Development. This article summarizes work performed as
part of FP6 project TATEM ‘Techniques and Technologies for nEw Maintenance concepts’. The authors also acknowledge BAE Systems for their support on the provision of data for the brake wear use case.