Proceedings of the 7th IFAC Symposium on Fault Detection, Supervision and Safety of Technical Processes Barcelona, Spain, June 30 - July 3, 2009
A Data Driven Prognostic Methodology without a Priori Knowledge

Flavien Peysson ∗, Abderrahmane Boubezoul ∗∗, Mustapha Ouladsine ∗, Rachid Outbib ∗

∗ Laboratoire des Sciences de l'Information et des Systèmes, LSIS UMR CNRS 6168, Université Paul Cézanne Aix-Marseille III, Domaine Universitaire de St-Jérôme, 13397 Marseille Cedex 20, France (e-mail: [email protected], [email protected]).
∗∗ Université Paris-Est, LEPSIS, INRETS/LCPC, 58 boulevard Lefebvre, 75732 Paris, France (e-mail: [email protected]).
Abstract: Nowadays systems are increasingly complex, and there is intense pressure to reduce and eliminate costly, unscheduled maintenance of these systems. In such cases, using a physics-based damage model is often not adequate in terms of cost/benefit. However, recent technological advances in sensors, coupled with robust processing algorithms, offer an elegant and theoretically sound approach to Condition-Based Maintenance (CBM)/Prognostic Health Management of such complex systems. A new strategy based on forecasting system degradation through a prognostic data-driven method is required. This paper introduces the development of a data-driven methodology to predict the remaining useful life (RUL) of an unspecified complex system. RUL prediction is performed by recent machine learning techniques without including any system or domain specific information. The solution is efficient and easy to implement and has the potential to be applicable to a variety of complex systems (automobiles, aerospace systems).

Keywords: Prediction methods; Methodology; Maintenance.

1. INTRODUCTION
Nowadays, Condition Based Maintenance (CBM) uses equipment run-time information to determine the equipment health and consequently its current failure condition. Health monitoring can be used to schedule maintenance and repair actions before breakdown. To complement CBM, Prognostic and Health Management (PHM) techniques have emerged in recent years. PHM extends the CBM concept by predicting future health indicators and giving the Remaining Useful Life (RUL), Vachtsevanos et al. [2006]. PHM is a system engineering discipline focusing on the detection, prediction, and management of the health and status of complex engineered systems. The PHM cycle is depicted in figure 1. PHM implementation requires a preliminary offline phase to study how the system health evolves according to the system use and which features are the most significant for determining the current system health. The online phase then extracts the features from the sensor data and predicts the future system health from the current health, in order to schedule maintenance actions and avoid unplanned outages. Our work focuses on the prognostic domain of the PHM cycle, with the aim of building an incremental model of the damage state trajectory of a complex system. Various prognostic approaches have been developed, ranging from simple historical failure-rate models to a
978-3-902661-46-3/09/$20.00 © 2009 IFAC, doi: 10.3182/20090630-4-ES-2003.0281
complex physics-based model. Byington et al. [2003] and Lebold and Thurston [2001] have classified these approaches according to their applicability to complex systems and their economic viability. The three main classes are: model based, data driven and experience based approaches. Most works in the literature deal with damage indicator evolution, where the damage indicator is an image of the health indicator of a system. More details and references on the prognostic approaches of the literature can be found in Peysson et al. [2008]. In this work, we present a novel data-driven methodology for RUL estimation, chosen under the following assumptions:
• Availability of historical run-to-failure data, which will be used in the RUL estimation.
• The propagation of the damage is manifested in sensor signatures.
• No system or domain specific information is given.

[Fig. 1. The PHM cycle: Monitoring (System → Sensors → Preprocessing → Signal processing → Features extraction), Prognostic (Health analysis → Prediction of health evolution), Maintenance (Schedule required action).]
Two main kinds of data-driven methods for RUL estimation can be distinguished. The first, designated as the direct method, consists of RUL estimation by applying a multivariate pattern matching process. The second, called the indirect method, estimates the damage progression and accumulation from the current state until a predefined threshold is met. In this case, the RUL is calculated from the intersection of the extrapolated damage and the threshold, Goebel et al. [2008]. The overall goal is to provide an accurate and efficient methodology for RUL prediction of an engineering system with limited information using data-driven methods. The solution presented here was developed entirely using machine learning techniques. No attempt was made to extract underlying features or any health indicator that may have been present in the data. The data used to evaluate the developed methodology is described in Section 2. Section 3 considers the issues concerning the representation of the data as regression inputs. In Section 4 results are presented and discussed. Finally, conclusions and an outline of further research are stated in Section 5.

2. DATA OVERVIEW

To illustrate the efficacy of our methodology, we use data from the 2008 PHM Data Challenge Competition, Saxena et al. [2008]. The data set consists of multivariate time series of an unspecified component, referred to as a unit. Each time series is from a different instance of the same complex engineered system, e.g., the data might be from a fleet of ships of the same type. The data is split into training and test sets, where a sample s is characterized by:

s = ⟨u, c, OS_i, S_j⟩ with i ∈ ⟦1, 3⟧ and j ∈ ⟦1, 21⟧    (1)

where u is the unit ID, c the cycle index, OS_i the operational settings that describe how the system is being operated and S_j the sensor measurements, which are contaminated with noise. The training data set contains 218 units. Each unit starts with an unknown, different degree of initial degradation and manufacturing variation. At this stage, this degradation and variation is considered normal. At an unspecified point the unit develops a fault which grows in magnitude until unit failure; the remaining useful life at the last operational cycle of each unit in the training data is considered as zero. In the testing data set, which also contains 218 units, the time series ends some time before system failure. The aim of the problem is to predict the number of remaining cycles before failure occurs. The data set format is depicted in figure 2.

[Fig. 2. Data set: training data of units 1, 2, 3, ..., u with N_1, N_2, N_3, ..., N_u cycles of run-to-failure history; test data of units 1, 2, ..., v truncated D_1, D_2, ..., D_v cycles before failure; each cycle carries the operational settings OS_i, i ∈ {1, 2, 3}, and the sensors S_j, j ∈ {1, ..., 21}.]

To evaluate the performance of the prognostic algorithm, a scoring function has been defined by the competition. As, for an engine degradation scenario, an early prediction is preferred over a late prediction, the scoring function is asymmetric and exponential, such that late predictions are penalized more heavily than early predictions. The score P of an algorithm is the sum of the scores P_k of all the predictions for the n units in the testing data set. P_k is calculated from d_k, the difference between the estimated and the actual RUL:

P = Σ_{k=1}^{n} P_k with P_k = { e^{−d_k/13} − 1 if d_k ≤ 0 ; e^{d_k/10} − 1 if d_k > 0 }    (2)
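Assuming the asymmetric form above (late predictions, d > 0, weighted by e^{d/10}), the score can be computed as follows; the function name is ours, not part of the challenge:

```python
import math

def phm08_score(d_list):
    """Asymmetric exponential score of the 2008 PHM Data Challenge.

    d is the difference (estimated RUL - actual RUL); late predictions
    (d > 0) are penalized more heavily than early ones (d <= 0).
    Lower total score is better.
    """
    total = 0.0
    for d in d_list:
        if d <= 0:
            total += math.exp(-d / 13.0) - 1.0
        else:
            total += math.exp(d / 10.0) - 1.0
    return total

# A late prediction by 10 cycles costs more than an early one by 10 cycles:
print(phm08_score([10]) > phm08_score([-10]))  # True
```

A perfect prediction (d = 0) contributes zero, so the score grows only with prediction error.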
3. METHODOLOGY

The methodology can be divided into two stages: model building and RUL estimation. The model building step consists in extracting knowledge from the training data set to build a RUL model according to OS_i and S_j. In the RUL estimation step, the built model is used to predict a RUL for each cycle of each unit of the test data set; then, for a given unit, the estimated RULs are aggregated to form the unit RUL.

3.1 Sensor selection

A significant characteristic of the PHM Challenge data set is that it contains time series, where time is represented by cycle indices for each unit. The last cycle of a unit always has index zero. One of the assumptions of this work is that the propagation of the damage is manifested in sensor signatures and that the RUL is a function of the damage state. The major constraint in this work is not to introduce extra information about the studied system. Therefore, sensor selection is an important and rather difficult step in this approach. The aim is to keep only a subset of sensors S_j that are representative of the system degradation. A temporal analysis of S_j on each unit alone proves to be insufficient for the damage propagation modeling. Indeed, the unit operating time is not the only factor that impacts the system degradation: the damage propagation is also linked to the operational modes of the system, i.e. how and where the system is used. In the data set, the use of the system is characterized by the operational settings OS_i. A quick analysis of the OS_i shows that the 3-tuples OS_i of all the instances fall into six clusters, reproduced on figure 3. So it seems reasonable to assume
[Fig. 3. Operational settings (OS_1, OS_2, OS_3) of all units, training and test data. The clusters indicate six discrete operational modes of the system.]
that these clusters correspond to six distinct operational modes OM_k, defined by:

        OM_1   OM_2   OM_3   OM_4   OM_5   OM_6
OS_1      0     10     20     25     35     42
OS_2      0    0.25    0.7   0.62   0.84   0.84
OS_3    100     20      0     80     60     40
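Since the six clusters are well separated, assigning a sample to its operational mode can be sketched as a nearest-centroid lookup on the table above (a minimal sketch; the function and constant names are ours, not the paper's):

```python
# Nominal settings of the six operational modes (from the table above).
OPERATIONAL_MODES = {
    "OM1": (0.0, 0.0, 100.0),
    "OM2": (10.0, 0.25, 20.0),
    "OM3": (20.0, 0.7, 0.0),
    "OM4": (25.0, 0.62, 80.0),
    "OM5": (35.0, 0.84, 60.0),
    "OM6": (42.0, 0.84, 40.0),
}

def operational_mode(os1, os2, os3):
    """Return the mode whose nominal settings are closest in Euclidean distance."""
    return min(
        OPERATIONAL_MODES,
        key=lambda m: sum((a - b) ** 2
                          for a, b in zip(OPERATIONAL_MODES[m], (os1, os2, os3))),
    )

print(operational_mode(9.8, 0.26, 20.3))  # a noisy sample near OM2 -> "OM2"
```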
The definition of the OM_k allows the sensor trends of all instances to be studied according to RUL. A few sensors exhibit a monotonic trend during the lifetime of the units. However, some of them show inconsistent end-of-life trends among the different units in the training data set. The gray scatter plots on figure 4 show examples of S_j trends extracted from the data according to RUL and OM_k, e.g. (S_14, OM_1), (S_3, OM_5), (S_15, OM_3) and (S_20, OM_6). Some of them do not show a trend as clear as others, due to high noise or their low sensitivity to degradation.

To know whether a sensor S_j has a monotonic trend with little data dispersion during the lifetime of all the units, we have defined a heuristic method based on Q_{1,j,k}, Q_{2,j,k} and Q_{3,j,k}, respectively the 25th, 50th (median) and 75th percentiles for a given OM_k. The cited percentiles are calculated for each RUL value and the obtained curves are fitted with the function h(r), where r is the RUL, Goebel et al. [2008]:

h(r) = a + b e^{c (−r + r_min)},  (a, b, c) ∈ R³,  r ∈ [0, r_min]    (3)

where r_min is the minimum RUL for which the number of points for S_j and OM_k is sufficient to calculate Q_{1,j,k}, Q_{2,j,k} and Q_{3,j,k}; a fitted percentile is noted Q^f. In this work we chose r_min = 250; the parameters (a, b, c) are estimated using the lsqnonlin method of Matlab from Mathworks. On figure 4 the dashed, plain and dash-dotted black plots represent respectively the fitted 25th, 50th and 75th percentiles. For a given OM_k, S_j is declared to have a clear trend if Q^f_{1,j,k}, Q^f_{2,j,k} and Q^f_{3,j,k} have the same trend and Q^f_{1,j,k}(0) > Q^f_{3,j,k}(r_min) in the case of an increasing trend, or Q^f_{1,j,k}(r_min) > Q^f_{3,j,k}(0) in the case of a decreasing trend.

[Fig. 4. Examples of sensors in different operating modes with different degradation trends, plotted against RUL, with the fitted percentiles overlaid.]

Applying this heuristic to the training data set gives the following table, in which a (×) mark indicates that the sensor exhibits a monotonic trend in the operational mode.

[Table: sensors S_1 to S_21 versus operational modes OM_1 to OM_6. Sensors S_3, S_4, S_11 and S_15 are marked (×) in all six modes; a few other sensors (e.g. S_2, S_12, S_13, S_20, S_21) are marked in only some modes; the remaining sensors are marked in none.]

Only four sensors have a clear monotonic trend for all OM_k. Sensors S_3, S_4, S_11 and S_15 characterize the sensor subset taken as the image of the system degradation.

3.2 Data transformation

Due to the noisy nature and the complexity of the data, it is necessary to carry out some pre-processing before submitting the data set to our RUL estimation algorithm. The selected sensor S_j data is composed of different parameters with varying orders of magnitude. Therefore a transformation based on relative residuals is performed. The relative residual corresponding to S_j is noted S̃_j and calculated as follows:

S̃_{j,k} = (S_{j,k} − Q^f_{2,j,k}(r_min)) / Q^f_{2,j,k}(r_min)    (4)

where S_{j,k} is the value of sensor S_j for operational mode OM_k. Q^f_{2,j,k}(r_min) is considered as the nominal value of S_j in OM_k, i.e. the value with negligible damage. Due to the noisy nature of the studied data, outlier detection is an important part of the data processing, since outliers can easily bias estimations. The outlier detection is carried out using the "dynamic test limits" method derived from the robust mean and robust sigma proposed in Buxton and Tabor [2003]. For each S̃_{j,k} an upper limit U_{j,k} and a lower limit L_{j,k} are defined according to the fitted 25th, 50th and 75th percentiles:

L_{j,k} = Q̃^f_{2,j,k} − 3 (Q̃^f_{2,j,k} − Q̃^f_{1,j,k})
U_{j,k} = Q̃^f_{2,j,k} + 3 (Q̃^f_{3,j,k} − Q̃^f_{2,j,k})    (5)

All samples that have at least one sensor point such that S̃_{j,k} ∉ [L_{j,k}, U_{j,k}] are eliminated.

3.3 Models for RUL estimation
The sensor selection and data transformation steps provide a clean training data set from which to build a model for RUL estimation. As said in Section 2, the data comes from various units with unknown manufacturing variation and initial damage degree, which is considered nominal. To take this information into account, a classification of the units is made. To perform this classification, the k-means clustering algorithm is used. The k-means algorithm partitions a data set of M objects into a specified number κ of disjoint subsets (κ < M), called clusters. The clustering depends on the similarity measure between a given pair of objects. For a more detailed description the reader is invited
to refer to Tom [1997]. In our study, we assume the data objects are elements of R^d, and we define the similarity measure between two objects as their Euclidean distance. The algorithm attempts to find the centers of the clusters in the data by minimizing the total intra-cluster variance, i.e. the squared error function. Each unit u is described by the vector ω_u:

ω_u = [ S̄^u_3  S̄^u_4  S̄^u_11  S̄^u_15 ]    (6)

where S̄^u_j corresponds to the nominal value of S̃_j for unit u. This value is determined by fitting, for unit u, S̃^u_j(r), r ∈ [r_min, 0], with the exponential function h(r) (3); thus S̄^u_j = a + b.
The choice of the number κ of unit classes C_p is discussed in Section 4. Before execution of the k-means clustering algorithm, the centroid c_p of C_p is initialized to:

c_p = (2(p − 1)/κ) [ Δ_{S̃_3}  Δ_{S̃_4}  Δ_{S̃_11}  Δ_{S̃_15} ]    (7)

where p is the class number and with:

Δ_{S̃_j} = max_u (S̄^u_j) − min_u (S̄^u_j)    (8)
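The per-unit nominal values used in (6) can be sketched with a nonlinear least-squares fit of h(r), using scipy's curve_fit in place of the paper's Matlab lsqnonlin (the data here is synthetic and the names are ours):

```python
import numpy as np
from scipy.optimize import curve_fit

R_MIN = 250  # RUL at which fitting starts, as chosen in the paper

def h(r, a, b, c):
    """Exponential damage model h(r) = a + b * exp(c * (-r + r_min))."""
    return a + b * np.exp(c * (-r + R_MIN))

def nominal_value(rul, residual):
    """Fit h to one unit's relative-residual series and return a + b,
    the fitted value at r = r_min, taken as the unit's nominal value."""
    (a, b, c), _ = curve_fit(h, rul, residual, p0=(0.0, 0.01, 0.01), maxfev=10000)
    return a + b

# Synthetic unit: residual drifting exponentially as RUL decreases.
rul = np.arange(R_MIN, -1, -1, dtype=float)
residual = 0.02 + 0.005 * np.exp(0.012 * (-rul + R_MIN))
print(nominal_value(rul, residual))  # close to 0.02 + 0.005 = 0.025
```

The four fitted nominal values per unit form ω_u, which can then be clustered, e.g. with scikit-learn's KMeans initialized at the centroids c_p.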
The RUL estimation goal is to build a model M to estimate the RUL r according to the sensor values. As said and shown previously, the damage propagation is also a function of the unit manufacturing and the current operating mode. Thus not one but 6 × κ models M_{k,p} are built. The form of these models is given by:

r = M_{k,p}( S̃_{3,k,p}, S̃_{4,k,p}, S̃_{11,k,p}, S̃_{15,k,p} )    (9)
We studied the problem of building M_{k,p} as a regression problem, where the collected data set is defined by:

S ≜ {(x_1, y_1), (x_2, y_2), ..., (x_N, y_N)}    (10)

where x_i ≜ [x^1_i, x^2_i, ..., x^d_i]^T ∈ X and y ∈ R. N denotes the number of records in the data set. In real applications x_i ∈ X ⊂ R^d is a multidimensional real vector. In this study (x_i, y_i) is defined as:

x_i = [ S̃_{j,k,p}(r) ], j ∈ {3, 4, 11, 15};  y_i = r    (11)
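As a concrete sketch of this regression task, one such (OM_k, C_p) sub-dataset can be fitted with an ε-SVR and a Gaussian kernel using scikit-learn; the data and parameter values below are invented for illustration and are not the tuned values of the paper:

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic stand-in for one (OM_k, C_p) sub-dataset: four residual-sensor
# features that drift exponentially as the RUL r decreases (values invented).
rul = np.linspace(0.0, 250.0, 200)
X = np.column_stack(
    [0.01 * j * np.exp(0.02 * (250.0 - rul)) for j in (3, 4, 11, 15)]
)

# One epsilon-SVR model M_{k,p} with a Gaussian (RBF) kernel.
model = SVR(kernel="rbf", C=32.0, epsilon=1.0, gamma="scale")
model.fit(X, rul)

# Training-set mean absolute error, as a rough sanity check of the fit.
mae = np.mean(np.abs(model.predict(X) - rul))
```

In the methodology, 6 × κ such regressors are trained, one per operational mode and unit class.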
To achieve this regression task we chose the well-known Support Vector Machine (SVM) algorithm, Vapnik [1998], one of the most powerful supervised machine learning algorithms. The SVM algorithm often achieves superior classification performance compared to other learning algorithms across many domains and tasks, and it is fairly insensitive to high dimensionality. It has been shown that regression problems can also be modeled using SVM, Scholkopf and Smola [2002]. SVM regression models are based on the ε-insensitive error model, Scholkopf and Smola [2002], Cristianini and Shawe-Taylor [2000]. Support vector regression is based on the structural risk minimization principle: the SVR model seeks to minimize an upper bound of the generalization error instead of the empirical error, as in other neural network models. The linear function is formulated as f(x) = ⟨w, x⟩ + b, where the coefficients w and b are determined by minimizing the regularized risk function:

R[f] = (1/2) ||w||² + C (1/N) Σ_{i=1}^{N} L_ε(f(x_i), y_i)    (12)

The regression estimation function is the one that minimizes (12) with the following ε-insensitive loss function:

L_ε(f(x), y) = { 0 if |f(x) − y| < ε ; |f(x) − y| − ε otherwise }    (13)

where both C and ε are user-determined parameters. The first term in (12) represents the function flatness and the second term the empirical error; C sets the trade-off between the empirical risk and the model flatness. Moreover, two positive slack variables, ξ and ξ*, represent the distance from the actual values to the corresponding boundary values of the ε-tube. Then, (12) is transformed into the following constrained problem:

min_{w,b} (1/2) ||w||² + C Σ_{i=1}^{N} (ξ_i + ξ*_i)
subject to: y_i − ⟨w, x_i⟩ − b ≤ ε + ξ_i ;  ⟨w, x_i⟩ + b − y_i ≤ ε + ξ*_i ;  ξ_i, ξ*_i ≥ 0 ∀i    (14)

This problem can be written in Lagrangian form by introducing the Lagrange multipliers α_i, α*_i, β_i, β*_i, i = 1, ..., N. This gives the Lagrangian:

L = (1/2) ||w||² + C Σ_{i=1}^{N} (ξ_i + ξ*_i)
    − Σ_{i=1}^{N} α_i [ε + ξ_i − y_i + ⟨w, x_i⟩ + b]
    − Σ_{i=1}^{N} α*_i [ε + ξ*_i + y_i − ⟨w, x_i⟩ − b]
    − Σ_{i=1}^{N} (β_i ξ_i + β*_i ξ*_i)    (15)

which is minimized with respect to the primal variables and maximized with respect to the multipliers, with the constraints:

Σ_{i=1}^{N} (α_i − α*_i) = 0 with α_i, α*_i ∈ [0, C]    (16)

In (16), α_i and α*_i represent the Lagrange multipliers. In the case where the data is not linearly separable, Vapnik proposes to map the training data into a higher dimensional space H by a function ϕ(·) that appears only through dot products ⟨·, ·⟩, i.e. through functions of the form K(x_1, x_2) = ⟨ϕ(x_1), ϕ(x_2)⟩. Any function that satisfies Mercer's condition can serve as the kernel function.

3.4 Hyper-parameter selection of RUL models

In this study, the Gaussian kernel with width σ is chosen. The optimal values of C and σ are problem-dependent and can heavily influence the prediction accuracy. Various methods are used to set these values, Chapelle et al. [2002], Ong et al. [2005]. In this study, a trial-and-error method (K-fold cross validation) is chosen; the value K = 10 was found to be appropriate. In each step, we calculate the mean squared error (MSE) on the test fold and choose the optimal parameters that minimize this criterion. Figure
5 depicts an example of the hyper-parameter selection result for model M_{2,3} (right plot) and the result of combining various M_{k,p} for training unit 6 (left plot); the actual RUL is in black and the estimated RUL in gray. Note that the error between the estimated and the actual RUL decreases as the RUL tends to zero.

[Fig. 5. Example of hyper-parameter selection of RUL models: training unit 6, MSE = 30.67 (left); model M_{2,3} with σ = 2⁻³, C = 32, MSE = 27.82 (right); RUL versus relative current cycle.]

3.5 RUL estimation

After building the models M_{k,p}, the aim is to use them to estimate the RUL of each unit v of the testing data set. In order to distinguish the sensors of the training and testing data sets, we note T_j the testing data set sensors. First, the transformation based on relative residuals is performed on T_{j,k} to obtain T̃_{j,k}; the nominal values Q^f_{2,j,k}(r_min) are the same as those used in (4). To know which model M_{k,p} to use for the RUL estimation, it is necessary to know the class C_p of unit v. Due to the definition of the testing data set (figure 2), the approach presented in Section 3.3 cannot be used. Therefore we assume that the mean over the l first cycles of each T̃^v_j defines the nominal value T̄^v_j of unit v. Due to the unit measurement duration, we chose l = 20. Each unit of the testing data set is described by the vector θ_v:

θ_v = [ T̄^v_3  T̄^v_4  T̄^v_11  T̄^v_15 ] with T̄^v_j = (1/l) Σ_{c=1}^{l} T̃^v_j(c)    (17)
Then the membership of unit v to a class C_p is determined using the k-Nearest Neighbor (kNN) algorithm. The kNN classifier is based on finding the k nearest examples in some reference set and taking a majority vote among the classes of these k samples. Equivalently, for every vector the posterior probability p(y|x) is estimated by the proportions of the classes among the k neighbors from the data set, Duda et al. [2001]. We use the Euclidean distance δ_{u,v}, but any other metric defining a neighborhood could be used:

δ_{u,v} = sqrt( Σ_{i=1}^{4} |(ω_u)_i − (θ_v)_i|² )    (18)

where (ω_u)_i and (θ_v)_i are the i-th elements of the vectors ω_u and θ_v. The class C_p of unit v is defined as the class for which the sum of the Euclidean distances δ_{u,v} between unit v and all units u of class C_p is minimal. The RUL r̂ of every testing data set sample is then estimated using the corresponding model M_{k,p}.
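The class-assignment rule (minimal summed Euclidean distance to the training units of each class) can be sketched as follows; the data and names are hypothetical:

```python
import math

def assign_class(theta_v, units_by_class):
    """Assign a test unit (feature vector theta_v) to the class C_p that
    minimizes the summed Euclidean distance to all of its training units."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return min(
        units_by_class,
        key=lambda p: sum(dist(omega_u, theta_v) for omega_u in units_by_class[p]),
    )

# Hypothetical training-unit vectors omega_u grouped by class:
classes = {
    "C1": [(0.01, 0.02, 0.01, 0.00), (0.02, 0.01, 0.02, 0.01)],
    "C2": [(0.20, 0.25, 0.22, 0.21), (0.24, 0.22, 0.20, 0.23)],
}
print(assign_class((0.02, 0.02, 0.01, 0.01), classes))  # -> "C1"
```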
3.6 RUL aggregation and prediction

The last step of our methodology is to aggregate all the r̂_{v,m} estimated for the samples of a unit v to give the RUL R̂_v of the unit. The aggregation is based on the observation that when a cycle is completed, the RUL decreases by one. This assumption holds if a cycle c is defined in terms of load on the system. For the PHM data challenge, the cycle definition is not given, so we assume that all cycles represent a comparable load. In this case the RUL r as a function of the cycle c can be modeled by an affine function: r = α c + β, where α = −1 and β ∈ R, Engel et al. [2000].

To aggregate all the r̂_{v,m}, we search for the α̂ and β̂ that minimize the quantity Err defined by:

Err = Σ_{m=1}^{D_v} ( r̂_{v,m} − (α̂ c_{v,m} + β̂) )²    (19)
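This least-squares aggregation can be sketched with numpy's polyfit, the analogue of the Matlab routine the paper uses; the per-cycle estimates below are synthetic:

```python
import numpy as np

def aggregate_rul(cycles, rul_estimates):
    """Fit r = alpha * c + beta to the per-cycle RUL estimates by least
    squares and return the aggregated RUL at the last recorded cycle."""
    alpha, beta = np.polyfit(cycles, rul_estimates, 1)  # Matlab polyfit analogue
    return alpha * cycles[-1] + beta

# Noisy per-cycle estimates scattered around an ideal slope of -1:
cycles = np.arange(0, 120)
true_rul = 150.0 - cycles
noisy = true_rul + np.random.default_rng(1).normal(0.0, 8.0, size=cycles.size)
print(aggregate_rul(cycles, noisy))  # typically within a few cycles of 150 - 119 = 31
```

Fitting the whole record damps the cycle-to-cycle noise of the individual M_{k,p} outputs.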
where D_v is the length of the measurement record of unit v, and c_{v,m} the relative current cycle. α̂ and β̂ are estimated using the polyfit method of Matlab from Mathworks, which is based on the least squares algorithm. The RUL R̂_v is then given at the cycle c_{v,D_v} of the last sample of unit v:

R̂_v = α̂ c_{v,D_v} + β̂    (20)

Equation (20) assumes that, whatever the value of α̂ during [c_{v,1}, c_{v,D_v}], α̂ is considered equal to −1 after c_{v,D_v}.

4. RUL PREDICTION RESULTS

Figure 7 depicts the RUL estimation applied to some units of the testing data set. The gray plain lines represent the RUL estimation by the M_{k,p} models (9). The black plain lines are the RUL aggregation, i.e. the fit of the RUL estimation by a first order polynomial (19). The dotted black lines are the RUL decision (20). The gray dotted lines are the actual RUL. The determination of κ was carried out by experimentation on the training data set: several units were considered as unknown and we estimated the RUL of these units for various values of κ. We chose κ = 8 because it minimizes the prediction error. The choice of κ has an impact on the number of samples available for the learning stage of the M_{k,p} and consequently on the model accuracy. To evaluate the performance of the RUL prediction algorithm, the relative error ER_v is used:

ER_v = (R̂_v − R_v) / |R_v|    (21)

Figure 6 shows the distribution of ER_v for the application of our algorithm on the testing data set. 80% of the RUL predictions are early (ER_v ≤ 0) and the ER_v median is −30%. Although the median is not near zero, the results are encouraging, because an early prediction makes it possible to anticipate the unit failure. This methodology placed fourth overall in the PHM competition with a score of 1095 (2).
[Fig. 6. Prediction algorithm performance: distribution of the units' relative error ER_v (%).]
[Fig. 7 panel values — testing unit 3: R̂_3 = 56, R_3 = 53; unit 30: R̂_30 = 34, R_30 = 33; unit 48: R̂_48 = 44, R_48 = 30; unit 74: R̂_74 = 113, R_74 = 111; unit 84: R̂_84 = 109, R_84 = 138; unit 110: R̂_110 = 25, R_110 = 25; unit 175: R̂_175 = 77, R_175 = 53; unit 215: R̂_215 = 100, R_215 = 97; RUL versus relative current cycle.]
Fig. 7. Remaining useful life (RUL) estimation and prediction for testing units 3, 30, 48, 74, 84, 110, 175 and 215.

5. CONCLUSION & FUTURE WORK

We designed a data driven prognostic methodology without a priori knowledge about the studied system. This methodology is flexible, efficient to implement, and works with the most popular machine learning algorithms. As a test of its effectiveness, we applied it to the PHM data; the obtained results have been encouraging. This methodology has great potential for improvement, for example:
• To classify the testing units we used the mean of the l first samples, because we do not have the whole history of these units. In a real application, the unit class would be determined upon its entry into service.
• The incorporation of prior knowledge into SVMs, by considering prior knowledge on some regions of the input space or by designing a kernel specific to the problem formulation.
Future work should also explore methods to denoise the sensor measurements in order to improve the accuracy of the algorithms. Feature extraction algorithms should be applied to find feature subsets that are more representative of the system damage. RUL aggregation and prediction should be improved by estimating the RUL of a unit over all unit classes and weighting the RUL predictions according to the distance between the tested unit and the class centroids. Moreover, Bayesian regression techniques should be tested in order to manage the uncertainty related to the CBM/PHM paradigm.

REFERENCES

Paul Buxton and Paul Tabor. Outlier detection for DPPM reduction. IEEE ITC International Test Conference, 32(4):820–827, 2003.
Carl S. Byington, Patrick W. Kalgren, Robert Johns, and Richard J. Beers. Prognosis enhancements to diagnostic system for improved condition based maintenance. In AUTOTESTCON, pages 320–329, California, USA, September 2003.
Olivier Chapelle, Vladimir Vapnik, Olivier Bousquet, and Sayan Mukherjee. Choosing multiple parameters for support vector machines.
Machine Learning, 46(1):131–159, 2002.
Nello Cristianini and John Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press, 2000.
Richard O. Duda, Peter E. Hart, and David G. Stork. Pattern Classification. John Wiley & Sons, 2001.
Stephen J. Engel, Barbara J. Gilmartin, Kenneth Bongort, and Andrew Hess. Prognostics, the real issues involved with predicting life remaining. In Aerospace Conference Proceedings, 2000 IEEE, volume 6, 2000.
Kai Goebel, Bhaskar Saha, and Abhinav Saxena. A comparison of three data-driven techniques for prognostics. In Failure Prevention for System Availability, 62nd Meeting of the MFPT Society, pages 119–131, 2008.
Mitchell Lebold and Michael Thurston. Open standards for condition-based maintenance and prognostic systems. In 5th Annual Maintenance and Reliability Conference, MARCON, Gatlinburg, USA, 2001.
Cheng Soon Ong, Alexander J. Smola, and Robert C. Williamson. Learning the kernel with hyperkernels. Journal of Machine Learning Research, 6:1043–1071, 2005.
Flavien Peysson, Mustapha Ouladsine, Rachid Outbib, J.-B. Leger, O. Myx, and C. Allemand. Damage trajectory analysis based prognostic. In IEEE International Conference on Prognostics and Health Management, Denver, USA, October 2008.
Abhinav Saxena, Kai Goebel, Don Simon, and Neil Eklund. Damage propagation modeling for aircraft engine run-to-failure simulation. In IEEE International Conference on Prognostics and Health Management, Denver, USA, October 2008.
Bernhard Schölkopf and Alex Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge, MA, 2002.
Mitchell Tom. Machine Learning. McGraw Hill, New York, 1997.
George J. Vachtsevanos, Frank L. Lewis, Michael Roemer, Andrew Hess, and Biqing Wu. Intelligent Fault Diagnosis and Prognosis for Engineering Systems. John Wiley & Sons, Hoboken, NJ, 2006.
Vladimir Vapnik. Statistical Learning Theory. Wiley, 1998.