Application of relevance vector machine and survival probability to machine degradation assessment


Expert Systems with Applications 38 (2011) 2592–2599


Achmad Widodo a, Bo-Suk Yang b,⇑

a Mechanical Engineering Department, Diponegoro University, Tembalang, Semarang 50275, Indonesia
b School of Mechanical Engineering, Pukyong National University, San 100, Yongdang-dong, Nam-gu, Busan 608-739, South Korea

Keywords: Machine prognostics; Survival probability; Relevance vector machine; Censored data; Uncensored data

Abstract

Condition monitoring (CM) of machine health, or of industrial components and systems, that can detect, classify and predict impending faults is critical to reducing operating and maintenance costs. Many papers have reported valuable models and methods for prognostic systems. However, few of them deal with censored data, which are common in machine condition monitoring practice. This work develops a machine degradation assessment system that uses both censored and complete data collected from routine CM. A relevance vector machine (RVM) is selected as the intelligent system and is trained with input data obtained from run-to-failure bearing data and with target vectors of survival probability estimated by the Kaplan–Meier (KM) and probability density function (PDF) estimators. After validation, the RVM is employed to predict the survival probability of an individual unit of a machine component. The plausibility of the proposed method is shown by applying it to bearing degradation data to predict the survival probability of individual units. © 2010 Elsevier Ltd. All rights reserved.

1. Introduction

Prognostics has emerged as an alternative to traditional reliability prediction, run-to-failure operation, and scheduled maintenance. It is also an important part of surveillance systems for machine components and equipment. Such systems are built from several modules that handle data acquisition, condition monitoring, fault diagnostics and prognostics. Condition monitoring and fault diagnostics have been well developed over several decades, while prognostics methods have only recently attracted much attention in maintenance engineering research. Interest in prognostics is growing because of the advantages it can bring, such as reductions in production downtime, spare-parts inventory, maintenance cost and safety hazards. Another reason is that the prognostics requirements of modern maintenance systems and safety-critical components have become a challenging task in engineering system design. Prognostics usually aims to accurately predict one of several related measures, such as the remaining useful life (RUL), time-to-failure (TTF) or probability-of-failure (POF) of machine components or engineering assets. These objectives are essential to support a maintenance process that can estimate future equipment health status and anticipate

⇑ Corresponding author. Tel.: +82 51 620 1604; fax: +82 51 620 1405. E-mail address: [email protected] (B.-S. Yang). 0957-4174/$ - see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2010.08.049

the problems and maintenance routines before downtime occurs. This predictive capability enables the maintainer to adopt a beneficial strategy based on the expected future condition of the machine.

Existing prognostics techniques follow TTF data-based, stress-based and effects-based approaches (Hines & Usynin, 2008). The TTF-based approach uses statistical methods such as Weibull analysis of historical time-to-failure data. It typically involves fitting a probabilistic failure distribution to historical data, and it estimates the life of an average component under average usage conditions. The logical extension of this method is to correlate the failure event history with more specific health condition data. This method was implemented by Groer (2000), who analyzed TTF with a Weibull model. Further work on the suitability of the Weibull distribution for machine failure estimation was reported by Schömig and Rose (2003).

The stress-based approach considers the environmental stresses, e.g. temperature, load and vibration, under which the equipment operates. A common method is the proportional hazard model (PHM), which uses regression and life-tables as proposed by Cox (1972). This method uses prior observations of explanatory variables such as stress, vibration, temperature, current and voltage, together with the response variable, which is usually the failure time, to predict component life. The environmental conditions, termed covariates (z0), are used to modify a baseline hazard rate (λ0) to obtain a new hazard rate. Failure data collected at the covariate operating conditions are used to solve for the unknown parameter β using maximum likelihood estimation (MLE). Research works of


prognostics and health management (PHM) were reported, e.g. by Jardine, Anderson, and Mann (1987), Jardine, Ralston, Reid, and Stafford (1989) and Mazucchi and Soyer (1989).

The effects-based approach uses degradation measures to make a prognostic prediction. These degradation measures are scalar or vector quantities that numerically represent the current ability of the system to perform its designated functions properly. This approach corresponds to the data-driven techniques of the prognostics literature. Data-driven methods are popular; however, they usually require a large amount of data to estimate RUL accurately. Here, time series analysis is used to predict the future state of a machine from its previous states. Examples of data-driven machine prognostics include Yang and Widodo (2008), Tran, Yang, Oh, and Tan (2008) and Niu and Yang (2009).

Among the expert system and intelligent techniques applied to prognostics, the artificial neural network (ANN) is one of the most popular. An ANN learns from examples and aims to capture the relationships among data. A remaining problem is that the reasoning behind its decisions is not always evident; nevertheless, ANNs are a feasible tool for practical problems and are easier to build than mathematical models of the system's physics (Vachtsevanos, Lewis, Roemer, Hess, & Wu, 2006). ANNs have been reported as a tool for prognostic systems by many researchers (Gebraeel, Lawley, Liu, & Parmeshwaran, 2004; Huang et al., 2007; Shao & Nezu, 2000; Tse & Atherton, 1999; Wen & Zhang, 2004). Other reported methods use the support vector machine (Yang & Widodo, 2008), regression trees and neuro-fuzzy systems (Tran et al., 2008; Tran, Yang, & Tan, 2009), and Dempster–Shafer regression (Niu & Yang, 2009).

This paper contributes an intelligent machine prognostics system based on probability estimation from the CM data of historical units, some of which were censored and did not undergo failure. This situation commonly occurs in practice when preventive replacements are carried out while the units under study are still operating. Moreover, CM data are integrated with reliability analysis to enable a longer-range prognostic system. Censored data from historical units are rarely considered as prognostic input and have not been fully utilized, even though censoring is very common in practice, where a system contains a population of units rather than a single unit. Therefore, the relation between CM data and the actual survival state of the assets needs to be deduced. This work complements the intelligent prognostics system of Heng, Tan, and Mathew (2008) and uses the relevance vector machine (RVM) to predict the survival probability of the units under study. The training inputs for the RVM were generated from simulated and experimental bearing defect degradation data that include censored data. The target vectors were survival probabilities obtained from survival analysis using the Kaplan–Meier (KM) and probability density function (PDF) estimators.

2. Theoretical background

2.1. Survival analysis

Survival analysis is the name for a collection of statistical techniques used to describe and quantify time-to-event data. In survival analysis, the term 'failure' denotes the occurrence of the event of interest, and the term 'survival time' denotes the length of time taken for failure to occur. Survival analysis has been used for prognostics of the lifetime of machine components, time from diagnosis to death in clinical trials, duration of industrial disputes, time from infection to disease onset, etc. Our

work uses survival analysis to estimate the remaining useful life (RUL) of machine components. We draw a random sample of these machine components, put them on test, collect and analyze the data, and then draw inferences about them. This work employs the KM and PDF estimators to generate the survival probabilities used as target vectors of our prognostics system. The KM estimator, also known as the product-limit estimator of the survivor function, is a non-parametric estimator (Kaplan & Meier, 1958) that uses intervals starting at the observed death (failure) times. The standard formula of this survivor function is given by

$$\hat{S}(t) = \prod_{j=1}^{k} \frac{n_j - d_j}{n_j} \tag{1}$$
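As an illustration of Eq. (1), the following minimal Python sketch computes the product-limit estimate from a list of (time, failed) observations. It is a sketch under stated assumptions, not the authors' implementation: the function name `kaplan_meier` and the sample data are hypothetical. Here $n_j$ is the number of units at risk just before the $j$th failure time $t_j$, $d_j$ is the number of units failing at $t_j$, and a censored time that ties with a failure time is treated as occurring just after it.

```python
from collections import Counter


def kaplan_meier(observations):
    """Product-limit estimate of Eq. (1).

    observations: list of (time, failed) pairs; failed is False for a
    censored unit. Returns a list of (t_j, S_hat(t_j)) at failure times.
    """
    failures = Counter(t for t, failed in observations if failed)
    curve = []
    survival = 1.0
    for t_j in sorted(failures):
        # n_j: units still at risk just before t_j. Units censored exactly
        # at t_j are counted, i.e. censoring is taken to occur after failure.
        n_j = sum(1 for t, _ in observations if t >= t_j)
        d_j = failures[t_j]
        survival *= (n_j - d_j) / n_j
        curve.append((t_j, survival))
    return curve


# Hypothetical sample: six units, two of them censored (failed=False).
data = [(10, True), (20, False), (30, True), (30, True), (40, False), (50, True)]
for t_j, s in kaplan_meier(data):
    print(f"t = {t_j:>3}  S = {s:.3f}")
```

Censored units never reduce the estimate directly; they only leave the risk set, which is the behaviour the next paragraph describes.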

Since, by construction, there are $n_j$ units that survive just before $t_j$ and $d_j$ failures occurring at $t_j$, the probability that a unit fails in the interval ending at $t_j$ is estimated by $d_j/n_j$. Thus, the probability that a unit survives through $[t_j, t_{j+1}]$ is estimated by $(n_j - d_j)/n_j$. Censored data influence the estimate only through the number of units, $n_j$, that survive just before $t_j$. If a censored survival time coincides with one or more unit failures, the censored survival time is taken to occur immediately after the failure time. For complete failure data, i.e. when the machine component has reached failure when removed from the machine, we adopt the approach of Heng et al. (2008), and the survivor function is calculated by

$$\hat{S}(t+k) = \begin{cases} 1, & 0 \le t+k < T \\ 0, & t+k > T \end{cases} \tag{2}$$

where $T$ is the failure time. A data set is considered censored if the machine component had not reached the failure threshold when it was removed from the machine. In this work, the standard formula of the KM estimator was modified to produce the cumulative survival probability for an individual machine component, which is given by

$$\hat{S}(t+k) = \begin{cases} 1, & 0 \le t+k < L \\ \displaystyle\prod_{L \le t_j \le t+k} \frac{n_j - d_j}{n_j}, & t+k \ge L \end{cases} \tag{3}$$

where $L$ denotes the last observed survival time of the machine component. Note that the last observed survival time $L$ of each censored unit, rather than time 0, is used as the starting time to compute appropriate training target survival probabilities. The PDF estimator is used to estimate the survivor function of each unit $i$, derived from its CM data $y_i(t)$ at time $t$. In this case, the estimated survival probability is the successive product of the probabilities that units which have survived the preceding intervals have condition indices higher than the observed index of unit $i$ but lower than the threshold; this is given by

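$$\hat{S}(t+\delta k) = \prod_{j=1}^{k} \frac{\int_{y_{i,t+\delta k}}^{Y_{\mathrm{threshold}}} f(y \mid t+\delta k)\,dy}{\int_{y_{i,t+\delta k}}^{\infty} f(y \mid t+\delta k)\,dy} \tag{4}$$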

where $\delta$ is the time interval. Finally, the training target vectors are the mean of the survival probabilities obtained by the above methods.

2.2. Relevance vector machine (RVM)

The RVM is a Bayesian treatment of a generalized linear model that has the same functional form as the support vector machine (SVM). It differs from the SVM in that its solution provides a probabilistic interpretation of its outputs (Tipping, 2000). The RVM avoids unnecessary complexity by producing models that have both a structure


and a parameterization process that, together, are appropriate to the information content of the data. As a supervised learning method, the RVM starts with a set of input data $\{x_n\}_{n=1}^{N}$ and their corresponding targets $\{t_n\}_{n=1}^{N}$. The aim is to learn a model of the dependency of the targets on the inputs in order to make accurate predictions of $t$ for unseen values of $x$. Typically, the predictions are based on a function $y(x)$ defined over the input space, and learning is the process of inferring the parameters of this function. In the context of the SVM, this function takes the form

$$y(x) = \sum_{i=1}^{N} w_i K(x, x_i) + w_0 \tag{5}$$

where $w = \{w_1, w_2, \ldots, w_N\}$ is the weight vector, $w_0$ is the bias and $K(x, x_i)$ is a kernel function. The RVM seeks to predict the target $t$ for any query $x$ according to

$$t_n = y(x_n) + \varepsilon_n \tag{6}$$

where $\varepsilon_n$ are independent samples from a noise process with mean 0 and variance $\sigma^2$. The likelihood of the data set can be written as

$$p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left\{-\frac{1}{2\sigma^2} \lVert t - \Phi w \rVert^2\right\} \tag{7}$$

where $\Phi$ is the $N \times (N+1)$ design matrix whose $i$th row is $\phi(x_i)^T$, with $\phi(x_i) = [1, K(x_i, x_1), K(x_i, x_2), \ldots, K(x_i, x_N)]^T$. Maximum likelihood estimation of $w$ and $\sigma^2$ in Eq. (7) often results in overfitting. Therefore, Tipping (2001) recommended imposing prior constraints on the parameters $w$ by adding a complexity penalty to the likelihood or error function. This a priori information controls the generalization ability of the learning process. Typically, new higher-level parameters are used to constrain an explicit zero-mean Gaussian prior probability distribution over the weights

$$p(w \mid \alpha) = \prod_{i=0}^{N} \mathcal{N}(w_i \mid 0, \alpha_i^{-1}) \tag{8}$$

where $\alpha$ is a vector of $(N+1)$ hyperparameters that control how far from zero each weight is allowed to deviate (Schölkopf & Smola, 2002). Using Bayes' rule, the posterior over all unknowns can be computed, given the defined non-informative prior distributions:

$$p(w, \alpha, \sigma^2 \mid t) = \frac{p(t \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2)}{\int p(t \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2)\, dw\, d\alpha\, d\sigma^2} \tag{9}$$

However, the posterior in Eq. (9) cannot be computed directly because the normalizing integral $p(t) = \int p(t \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2)\, dw\, d\alpha\, d\sigma^2$ cannot be performed. Instead, we decompose the posterior as

$$p(w, \alpha, \sigma^2 \mid t) = p(w \mid t, \alpha, \sigma^2)\, p(\alpha, \sigma^2 \mid t) \tag{10}$$

to facilitate the solution. The posterior distribution of the weights is given by

$$p(w \mid t, \alpha, \sigma^2) = \frac{p(t \mid w, \sigma^2)\, p(w \mid \alpha)}{p(t \mid \alpha, \sigma^2)} \tag{11}$$

Eq. (11) has an analytical solution in which the posterior covariance and mean are

$$\Sigma = (\Phi^T B \Phi + A)^{-1} \tag{12}$$

$$\mu = \Sigma \Phi^T B t \tag{13}$$

with $A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$ and $B = \sigma^{-2} I$. Note that $\sigma^2$ is also treated as a hyperparameter, which may be estimated from the data. Therefore, learning becomes a search for the most probable hyperparameter posterior. Predictions for new data are then made by integrating out the weights to obtain the marginal likelihood for the hyperparameters:

$$p(t \mid \alpha, \sigma^2) = \int p(t \mid w, \sigma^2)\, p(w \mid \alpha)\, dw = (2\pi)^{-N/2} \lvert B^{-1} + \Phi A^{-1} \Phi^T \rvert^{-1/2} \exp\left\{-\frac{1}{2} t^T (B^{-1} + \Phi A^{-1} \Phi^T)^{-1} t\right\} \tag{14}$$

3. Methodology

The machine degradation assessment methodology is depicted in Fig. 1. It uses CM data from j machine units obtained from routine CM. Feature calculation is performed to obtain features that clearly represent the progressive degradation of the machine. When multiple features are involved, feature extraction should be performed to map the calculated features from a high-dimensional space onto a lower-dimensional space. Unsupervised learning techniques such as principal or independent component analysis and the self-organizing map can be employed for feature extraction. Unsupervised learning may yield a one-dimensional feature, from which the survival probability is calculated. The survival probability is then estimated by the KM and PDF estimators and serves as the target vector for RVM training and validation. The quality of the validation process is measured by the root-mean-square error (RMSE), where lower is better, and by the correlation (R). CM data from one or more individual units can be used to test the performance of the RVM model after validation. The weights obtained from the validation process are saved and then used to test the RVM-based machine degradation assessment.

Fig. 1. Machine degradation assessment method. (Flowchart boxes: CM data of j unit machines; feature calculation and extraction; survival probability estimation; target vectors; training RVM and validation; Good (?) Yes/No; testing RVM model; machine degradation assessment.)

Fig. 2. Simulated signal of outer-race defect: (a) time domain plot of raw signal, (b) frequency spectrum of raw signal and (c) fault detection after demodulation. (Panel title: outer-race fault, 100 rpm, BPFO = 4.89 Hz, Gaussian noise 30 dB; peaks in (c) marked at 4.88, 9.77, 14.65, 19.53 and 24.41 Hz.)
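The training and validation step of Fig. 1 can be illustrated with a minimal sketch, in plain Python, of the posterior mean of Eqs. (12) and (13) for fixed hyperparameters, scored with the RMSE and correlation R used in Section 3. This is an illustration under stated assumptions rather than the authors' implementation: the function names, the toy data, the Gaussian kernel parameterization exp(-(x - x_i)^2 / width^2), the single shared value of α and the fixed σ² are all hypothetical, and the iterative re-estimation of the hyperparameters described by Tipping (2001) is omitted.

```python
import math


def gaussian_kernel(x, xi, width):
    # Assumed parameterization; the paper does not state its kernel formula.
    return math.exp(-((x - xi) ** 2) / width ** 2)


def design_matrix(xs, centers, width):
    # Rows [1, K(x, x_1), ..., K(x, x_N)], the design matrix used in Eq. (7).
    return [[1.0] + [gaussian_kernel(x, c, width) for c in centers] for x in xs]


def solve(matrix, rhs):
    """Solve matrix @ z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        known = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - known) / aug[r][r]
    return z


def posterior_mean(phi, targets, alpha, sigma2):
    """mu = Sigma Phi^T B t with Sigma = (Phi^T B Phi + A)^-1 and B = I / sigma2."""
    m = len(phi[0])
    beta = 1.0 / sigma2
    lhs = [[beta * sum(row[i] * row[j] for row in phi) + (alpha if i == j else 0.0)
            for j in range(m)] for i in range(m)]
    rhs = [beta * sum(row[i] * t for row, t in zip(phi, targets)) for i in range(m)]
    return solve(lhs, rhs)  # solving the linear system avoids forming Sigma


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def correlation(actual, predicted):
    ma, mp = sum(actual) / len(actual), sum(predicted) / len(predicted)
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    va = sum((a - ma) ** 2 for a in actual)
    vp = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(va * vp)


# Toy data: a degradation index x and a smoothly decreasing survival target.
xs = [0.1 * i for i in range(21)]
ts = [1.0 / (1.0 + math.exp(6.0 * (x - 1.0))) for x in xs]
phi = design_matrix(xs, xs, width=0.3)
mu = posterior_mean(phi, ts, alpha=1e-3, sigma2=1e-4)
pred = [sum(w * f for w, f in zip(mu, row)) for row in phi]
print(f"RMSE = {rmse(ts, pred):.4f}, R = {correlation(ts, pred):.4f}")
```

Plain Python is used only to keep the sketch self-contained; a numerical library would normally handle the linear algebra. In a full RVM the individual $\alpha_i$ are re-estimated, and weights whose $\alpha_i$ grow very large are pruned, which is what makes the model sparse.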

Fig. 3. Defective bearing signal simulation. (Three panels of amplitude versus time [s].)

Fig. 4. Peak, kurtosis and entropy estimation of simulated defective bearing signal. (Each feature plotted against time-step.)

Fig. 5. (a) Feature extraction by PCA and (b) presentation of QE obtained from different dataset. (Panel (b): quantization error (QE) versus time step for Datasets 1, 10 and 39.)

4. Application on machine degradation

The proposed method is validated using simulated bearing defect degradation data and real data obtained from experimental work. In the simulation, we generated vibration CM data that represent the defect propagation of a rolling element bearing using a Matlab program. The properties of the rolling element bearing in the simulation were as follows: pitch diameter of 23 mm, nine rolling elements, roller diameter of 8 mm and contact angle of 0°. We simulated a bearing outer-race defect at a rotating speed of 100 rpm and a sampling frequency of 5 kHz. Fig. 2(a) shows the simulated time-domain signal of the bearing with an outer-race defect. This signal was converted to the frequency domain using the fast Fourier transform (FFT), as shown in Fig. 2(b). The figure shows that the spectrum was dominated by high-frequency resonant signals. To separate the bearing fault frequency from these dominant signals, the vibration signals were band-pass filtered and rectified. Fig. 2(c) shows that peaks were detected at 4.88 Hz and its harmonics, which closely matches the calculated outer-race fault frequency (BPFO = 4.89 Hz) indicated at the top of the figure.
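The calculated outer-race fault frequency can be checked with the standard kinematic relation for the ball-pass frequency of the outer race, BPFO = (n/2) f_r (1 - (d/D) cos φ), where n is the number of rolling elements, f_r the shaft frequency, d the roller diameter, D the pitch diameter and φ the contact angle. The sketch below applies it to the simulation parameters listed above. The relation is the textbook formula and is assumed here rather than quoted from the paper; the function name is hypothetical.

```python
import math


def bpfo(n_rollers, shaft_rpm, roller_diameter, pitch_diameter, contact_angle_deg=0.0):
    """Ball-pass frequency of the outer race in Hz."""
    shaft_hz = shaft_rpm / 60.0
    ratio = roller_diameter / pitch_diameter * math.cos(math.radians(contact_angle_deg))
    return 0.5 * n_rollers * shaft_hz * (1.0 - ratio)


# Simulation parameters from the text: 9 rollers, 100 rpm, d = 8 mm, D = 23 mm, 0 deg.
print(f"BPFO = {bpfo(9, 100, 8.0, 23.0):.2f} Hz")  # prints "BPFO = 4.89 Hz"
```

The result agrees with the 4.89 Hz quoted in Fig. 2 and lies within one spectral bin of the detected 4.88 Hz peak.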

Fig. 6. Bearing test rig and sensor placement illustration (Qiu et al., 2006).

Fig. 7. Kurtosis of vibration data and threshold of failure condition. (Kurtosis of vibration versus measurement points for Data No. 1-Bearing 3, Data No. 2-Bearing 1 and Data No. 3-Bearing 3, with the threshold marked.)

Fig. 8. Validation of RVM training with QE. (Survival probability versus measurement points; RMSE = 2.73e-6, R = 0.98.)

Fig. 9. Overfitting prediction of simulation data. (Actual values and predictions with kernel widths 5e-5, 2.5e-5 and 2.5e-6, versus measurement points.)

Table 1
Performance of RVM testing w.r.t. kernel-width using bearing simulation data.

Kernel-width     RMSE     R
5.0 × 10^-4      0.170    0.92
2.5 × 10^-4      0.115    0.95
5.0 × 10^-5      0.060    0.97
2.5 × 10^-5      0.048    0.98
5.0 × 10^-6      0.046    0.98
2.5 × 10^-6      0.266    0.79

The simulated signals were generated repeatedly by a computer program based on the equations presented by McFadden and Smith (1984) and Wang and Kootsookos (1998), with the defect severity increased exponentially with random fluctuations to represent real conditions (Fig. 3). Every simulated signal has defect impulses that grow at different rates and over different measurement times. The signals were set up to have the same failure threshold, but the time to reach failure was different for each data set. Bearing life tests have shown that bearing degradation signals possess an inherent exponential growth (Gebraeel, Lawley, Rong, & Ryan, 2005; Gebraeel, 2006).
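A minimal sketch of such a degradation trend is given below, assuming a simple multiplicative noise model: each data set follows an exponential growth with its own rate, perturbed by bounded random fluctuations, and all data sets share one failure threshold. The growth rates, noise level, threshold value and function names are hypothetical and are not taken from the paper, which simulates full vibration signals rather than a scalar severity index.

```python
import math
import random


def simulate_degradation(rate, n_steps, noise=0.05, start=1.0, rng=None):
    """Exponentially growing severity index with bounded random fluctuation."""
    rng = rng or random.Random()
    return [start * math.exp(rate * step) * (1.0 + rng.uniform(-noise, noise))
            for step in range(n_steps)]


def failure_time(series, threshold):
    """First time step at which the index reaches the threshold, else None."""
    for step, value in enumerate(series):
        if value >= threshold:
            return step
    return None


rng = random.Random(0)      # fixed seed so the run is repeatable
threshold = 4.0             # shared failure threshold (hypothetical value)
for rate in (0.014, 0.020, 0.030):
    series = simulate_degradation(rate, n_steps=150, rng=rng)
    print(f"rate = {rate:.3f}  failure at step {failure_time(series, threshold)}")
```

A series that is cut off before it reaches the threshold returns `None` from `failure_time`, which corresponds to a censored unit in the sense of Section 2.1.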

We calculated three features from the time-domain signals, namely peak, kurtosis and entropy estimation (Fig. 4), and then performed feature extraction by means of PCA to reduce the dimensionality of the calculated features. This feature reduction was intended to minimize the input to the RVM network and the training time. After PCA training, the deviations between the mapped features of the simulated signals and those of the healthy state were calculated. These deviations are regarded as quantization errors (QE), as depicted in Fig. 5(a). Fig. 5(b) shows the QE of different simulated data sets generated from the defective bearing simulation. We generated 40 datasets and obtained the corresponding QE values. Thirty-six of the 40 datasets were employed for training and the remainder for testing the system. One third of the training datasets were imposed as censored data. The target vectors for the training process were obtained from the KM and PDF estimators.

Fig. 10. Bearing defect degradation of dataset No. 30. (QE versus measurement points; actual failure time marked at X: 100, Y: 4.075.)

Fig. 11. RVM prediction of bearing defect degradation dataset No. 30. (Survival probability versus measurement points; actual and prediction with kernel width 5e-6; predicted failure time marked at X: 98, Y: 0.1308.)

Fig. 12. Validation of RVM training with kurtosis of vibration experimental data. (Survival probability versus measurement points; RMSE = 1.29e-6, R = 0.99.)

Fig. 13. Overfit prediction of experimental bearing data. (Survival probability versus measurement points; predictions with kernel widths 0.5 and 1e-1.)

Fig. 14. Bearing degradation data for testing RVM. (Kurtosis of vibration versus measurement points; threshold marked at X: 692, Y: 46.)

Table 2
Performance of RVM testing w.r.t. kernel-width using experimental bearing data.

Kernel-width     RMSE      R
0.5              28.427    0.25
0.1              8.046     0.31
0.01             0.906     0.47
0.001            0.146     0.88
0.0001           0.044     0.98
0.00001          0.011     0.99
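The three time-domain features used for the simulated signals (peak, kurtosis and entropy estimation) can be sketched as follows. Peak and kurtosis follow their usual definitions, with kurtosis taken as the fourth central moment divided by the squared variance. The histogram-based Shannon entropy is only one possible reading of "entropy estimation", which the paper does not define; the bin count, the toy signals and the function names are assumptions.

```python
import math


def peak(signal):
    return max(abs(v) for v in signal)


def kurtosis(signal):
    """Fourth central moment divided by the squared variance."""
    n = len(signal)
    mean = sum(signal) / n
    m2 = sum((v - mean) ** 2 for v in signal) / n
    m4 = sum((v - mean) ** 4 for v in signal) / n
    return m4 / m2 ** 2


def histogram_entropy(signal, bins=16):
    """Shannon entropy (nats) of an equal-width histogram of the samples."""
    lo, hi = min(signal), max(signal)
    width = (hi - lo) / bins or 1.0   # guard against a constant signal
    counts = [0] * bins
    for v in signal:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    n = len(signal)
    return -sum(c / n * math.log(c / n) for c in counts if c)


# A sine wave versus the same wave with periodic impulses added (toy signals).
healthy = [math.sin(2 * math.pi * i / 50) for i in range(1000)]
faulty = [v + (5.0 if i % 100 == 0 else 0.0) for i, v in enumerate(healthy)]
for name, sig in (("healthy", healthy), ("faulty", faulty)):
    print(name, round(peak(sig), 2), round(kurtosis(sig), 2), round(histogram_entropy(sig), 2))
```

Kurtosis reacts strongly to the sparse impulses that a localized bearing defect produces, which is consistent with its use as the single feature for the experimental data.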

The experimental data were generated on a bearing test rig that is able to produce run-to-failure data. These data were downloaded from the Prognostics Center of Excellence (PCoE) prognostic data repository, to which they were contributed by the Intelligent Maintenance System (IMS) center, University of Cincinnati (Lee, Qiu, Yu, Lin, & Rexnord, 2007). The bearing test rig consists of four bearings installed on one shaft, as presented in Fig. 6. The rotation speed of the shaft was kept constant at 2000 rpm, and a radial load of 6000 lb was applied to the shaft and bearings through a spring mechanism. The bearings were Rexnord ZA-115 double-row bearings with 16 rollers in each row, a pitch diameter of 2.815 in., a roller diameter of 0.311 in., and a tapered contact angle of 15.17°. The vibration signals were acquired by eight PCB 353B33 accelerometers


(high-sensitivity quartz ICP accelerometers) that were installed in the vertical and horizontal directions. Four thermocouples were also installed on the outer race of each bearing to record the bearing temperature for lubrication monitoring. Vibration signals were collected every 20 min by an NI-DAQCard 6062E data acquisition card at a sampling rate of 20 kHz. The collected vibration data comprised 12 complete failure datasets with different failure times and four datasets regarded as normal condition. The original data were truncated because of the high dimensionality of the portion that represents normal condition. In our work, we used 2100 measurement points, which still show both the normal condition and the failure event. We calculated and used only a one-dimensional feature, namely kurtosis, to validate the proposed method. The kurtosis of the vibration data and the assumed failure threshold are shown in Fig. 7. For complete failure, we took only eight datasets and calculated the survival probability as the training target vectors. Four datasets were taken to represent the censored data, and the modified KM and PDF estimators were then used to determine the survival probability of the censored data. The one remaining dataset was used to test the performance of the system after training the RVM.

5. Result and discussion

For the simulation data, we trained the RVM using inputs from the QE of 36 CM bearing degradation datasets and target vectors obtained by the KM and PDF estimators. In RVM training, we employed a Gaussian kernel and performed 2-fold cross-validation to obtain a proper kernel-width parameter ($\gamma$). We searched the kernel-width value in the range $\{5 \times 10^{-4}, 2.5 \times 10^{-4}, \ldots, 2.5 \times 10^{-6}\}$ to optimize the RVM training process. The validation process is shown in Fig. 8, with acceptable RMSE and R of $2.73 \times 10^{-6}$ and 0.98, respectively. The effect of an improper kernel-width parameter is presented in Fig. 9, which shows overfitting in the prediction of the survival probability of the bearing data. In addition, Table 1 reports the performance of the testing process after RVM validation with respect to kernel-width. Kernel-width values were studied in the range $\{5 \times 10^{-4}, 2.5 \times 10^{-4}, \ldots, 2.5 \times 10^{-6}\}$; values above and below this range gave serious overfitting. The overfit prediction of survival probability resulted in over-prediction, as depicted in Fig. 9. The kernel-width obtained from cross-validation was $5 \times 10^{-6}$. Fig. 10 shows the testing data for the validated RVM, obtained from the QE of bearing dataset No. 30. The actual failure time is located at $t_a = 100$. The RVM prediction is presented in Fig. 11 and gives a good prediction of the failure time at $t_p = 98$. At the early measurement points the prediction is still overfitted; however, this does not significantly reduce the prognostic value because the bearing is still in normal condition. The accuracy of the prediction can be simply calculated as

$$\text{Accuracy} = \left(1 - \frac{|t_a - t_p|}{t_a}\right) \times 100\% = \left(1 - \frac{|100 - 98|}{100}\right) \times 100\% = 98.0\%$$

For the experimental data, the RVM was trained with 700 data points of the kurtosis of CM vibration data that represent run-to-failure data. Here, we also employed a Gaussian kernel and performed 4-fold cross-validation to obtain a proper kernel-width parameter ($\gamma$). We searched the kernel-width value in the range $\{0.5, 0.1, \ldots, 5 \times 10^{-6}\}$ to optimize the RVM training process. The validation process is shown in Fig. 12, with plausible RMSE and R of $1.29 \times 10^{-6}$ and 0.99, respectively. Improper kernel-width selection leads to overfitting, as presented in Fig. 13. In this case, selection of relatively high

Fig. 15. RVM prediction of bearing failure time. (Survival probability versus measurement points; overfitting region indicated; predicted failure time marked at X: 664, Y: 0.007243.)

kernel-width gave seriously overfitted predictions of survival probability in RVM testing. The complete results of RVM testing performance are summarized in Table 2. The best RVM testing performance was reached at a kernel-width of $1.0 \times 10^{-5}$, with RMSE and R of 0.01 and 0.99, respectively. In addition, selecting a kernel-width lower than $1.0 \times 10^{-5}$ produced high error and low correlation of the survival probability. Fig. 14 shows the individual bearing data used for testing the validated RVM. These data were not involved in training the RVM and reached the failure time at $t_a = 692$. The RVM-based survival probability prediction is depicted in Fig. 15 and predicts the failure time at $t_p = 664$. The point at which the kurtosis of the vibration data reached the threshold, $t = 692$, corresponds to the decrease of the survival probability to $S = 0$, which represents the failure condition of the bearing under study. The plausibility of the prediction can be shown from the accuracy given by

$$\text{Accuracy} = \left(1 - \frac{|t_a - t_p|}{t_a}\right) \times 100\% = \left(1 - \frac{|692 - 664|}{692}\right) \times 100\% = 95.9\%$$

Fig. 15 also shows overfitted predictions at the early measurement points, but this does not significantly reduce the prognostic value because the machine is still in normal condition.

6. Conclusion

This paper presents a study of machine degradation assessment based on the RVM and survival probability. The RVM was trained with simulated and experimental CM data, including censored data, to obtain a good prognostics model. The target vectors were generated by the KM and PDF estimators and represent the survival probability of the population of machines being studied. The RVM was tested and validated with simulated and experimental data and gave plausible failure time predictions. Overfitted predictions emerged at the early measurement points of both the simulated and the experimental data. However, this may be acceptable, and the predictions still carry prognostic value. The results deduced from the simulated and experimental data indicate that the method is plausible as a machine degradation assessment model.

Acknowledgement

This work was supported by the Brain Korea (BK) 21 project.


References

Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society, Series B (Methodological), 34(2), 187–220.
Gebraeel, N., Lawley, M., Liu, R., & Parmeshwaran, V. (2004). Residual life prediction from vibration-based degradation signals: A neural network approach. IEEE Transactions on Industrial Electronics, 51, 694–700.
Gebraeel, N. Z., Lawley, M. A., Rong, Li., & Ryan, J. K. (2005). Residual-life distribution from component degradation signals: A Bayesian approach. IIE Transactions, 37, 543–557.
Gebraeel, N. (2006). Sensory-updated residual life distribution for components with exponential degradation pattern. IEEE Transactions on Automation Science and Engineering, 3(4), 382–393.
Groer, P. G. (2000). Analysis of time-to-failure with a Weibull model. In Proceedings of the maintenance reliability conference, Knoxville, Tennessee, USA.
Heng, A., Tan, A., & Mathew, J. (2008). Asset health prognostics incorporating reliability data and condition monitoring histories. In J. Gao, J. Lee, J. Ni, L. Ma, & J. Mathew (Eds.), Proceeding of the 3rd world congress engineering asset management and intelligent maintenance system (WCEAM-IMS), Beijing, China (pp. 666–672).
Hines, J. W., & Usynin, A. (2008). Current computational trends in equipment prognostics. International Journal of Computational Intelligence Systems, 1(1), 94–102.
Huang, R., Xi, L., Li, X., Liu, C. R., Qiu, H., & Lee, J. (2007). Residual life predictions for ball bearings based on self-organizing map and back propagation neural network methods. Mechanical Systems and Signal Processing, 21, 193–207.
Jardine, A. K. S., Anderson, P. M., & Mann, D. S. (1987). Application of the Weibull proportional hazards model to aircraft and marine engine failure data. Quality and Reliability Engineering International, 3, 77–82.
Jardine, A. K. S., Ralston, P., Reid, N., & Stafford, J. (1989). Proportional hazards analysis of diesel engine failure data. Quality and Reliability Engineering International, 5, 207–216.
Kaplan, E. L., & Meier, P. (1958). Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53, 457–481.
Lee, J., Qiu, H., Yu, G., Lin, J., & Rexnord Technical Services (2007). 'Bearing Data Set', IMS, University of Cincinnati. NASA Ames Prognostics Data Repository, NASA Ames, Moffett Field, CA. Accessed 06.07.09.
Mazucchi, T. A., & Soyer, R. (1989). Assessment of machine tool reliability using a proportional hazards model. Naval Research Logistics, 36(6), 765–777.
McFadden, P. D., & Smith, J. D. (1984). Model for the vibration produced by a single point defect in rolling element bearing. Journal of Sound and Vibration, 96, 69–82.
Niu, G., & Yang, B. S. (2009). Dempster–Shafer regression for multi-step-ahead time-series prediction towards data-driven machinery prognosis. Mechanical Systems and Signal Processing, 23(3), 740–751.
Qiu, H., Lee, J., Lin, J., & Yu, G. (2006). Wavelet filter-based weak signature detection method and its application on rolling element bearing prognostics. Journal of Sound and Vibration, 289(4-5), 1066–1090.
Shao, Y., & Nezu, K. (2000). Prognosis of remaining bearing life using neural network. Proceedings of the Institution of Mechanical Engineers, Part I: Journal of Systems and Control Engineering, 214(3), 217–230.
Schölkopf, B., & Smola, A. J. (2002). Learning with kernels: Support vector machines, regularization, optimization, and beyond. Cambridge, MA: MIT Press.
Schömig, A., & Rose, O. (2003). On the suitability of the Weibull distribution for the approximation of machine failures. In The Proceeding of Industrial Engineering Research Conference, Portland, Oregon, USA.
Tipping, M. E. (2000). The relevance vector machine. In S. Solla, T. Leen, & K. R. Muller (Eds.). Advances in neural information processing systems (Vol. 12, pp. 287–289). Cambridge, MA: MIT Press.
Tipping, M. E. (2001). Sparse Bayesian learning and the relevance vector machine. Journal of Machine Learning Research, 1, 211–244.
Tran, V. T., Yang, B. S., Oh, M. S., & Tan, A. C. C. (2008). Machine condition prognosis based on regression trees and one-step-ahead prediction. Mechanical Systems and Signal Processing, 22(5), 1179–1193.
Tran, V. T., Yang, B. S., & Tan, A. C. C. (2009). Multi-step ahead direct prediction for the machine condition prognosis using regression trees and neuro-fuzzy systems. Expert Systems with Applications, 36(5), 9378–9387.
Tse, P., & Atherton, D. (1999).
Prediction of machine deterioration using vibration based fault trends and recurrent neural networks. Transaction of the ASME: Journal of Vibration and Acoustics, 121, 255–362. Vachtsevanos, G., Lewis, F., Roemer, M., Hess, A., & Wu, B. (2006). Intelligent fault diagnosis and prognosis for engineering systems. New Jersey: John Wiley and Sons. Wang, Y. F., & Kootsookos, P. J. (1998). Modelling of low shaft speed bearing faults for condition monitoring. Mechanical System and Signal Processing, 12(3), 415–426. Wen, G., & Zhang, X. (2004). Prediction method of machinery condition based recurrent neural network models. Journal of Applied Sciences, 4, 675–679. Yang, B. S., & Widodo, A. (2008). Support vector machine for machine fault diagnosis and prognosis. Journal of System Design and Dynamics, 2(1), 12–23.