A Wiener-process-based degradation model with a recursive filter algorithm for remaining useful life estimation

A Wiener-process-based degradation model with a recursive filter algorithm for remaining useful life estimation

Mechanical Systems and Signal Processing 35 (2013) 219–237 Contents lists available at SciVerse ScienceDirect Mechanical Systems and Signal Processi...

836KB Sizes 1 Downloads 86 Views

Mechanical Systems and Signal Processing 35 (2013) 219–237

Contents lists available at SciVerse ScienceDirect

Mechanical Systems and Signal Processing journal homepage: www.elsevier.com/locate/ymssp

A Wiener-process-based degradation model with a recursive filter algorithm for remaining useful life estimation Xiao-Sheng Si a,b, Wenbin Wang c,n, Chang-Hua Hu a, Mao-Yin Chen b, Dong-Hua Zhou b,n a b c

Department of Automation, Xi’an Institute of High-Tech, Xi’an, Shaanxi 710025, China Department of Automation, TNList, Tsinghua University, Beijing 100084, China Dongling School of Ecomonics and Management, University of Science and Technology Beijing, Beijing 100083, China

a r t i c l e i n f o

abstract

Article history: Received 2 February 2012 Received in revised form 27 July 2012 Accepted 10 August 2012 Available online 1 September 2012

Remaining useful life estimation (RUL) is an essential part in prognostics and health management. This paper addresses the problem of estimating the RUL from the observed degradation data. A Wiener-process-based degradation model with a recursive filter algorithm is developed to achieve the aim. A novel contribution made in this paper is the use of both a recursive filter to update the drift coefficient in the Wiener process and the expectation maximization (EM) algorithm to update all other parameters. Both updating are done at the time that a new piece of degradation data becomes available. This makes the model depend on the observed degradation data history, which the conventional Wiener-process-based models did not consider. Another contribution is to take into account the distribution in the drift coefficient when updating, rather than using a point estimate as an approximation. An exact RUL distribution considering the distribution of the drift coefficient is obtained based on the concept of the first hitting time. A practical case study for gyros in an inertial navigation system is provided to substantiate the superiority of the proposed model compared with competing models reported in the literature. The results show that our developed model can provide better RUL estimation accuracy. & 2012 Elsevier Ltd. All rights reserved.

Keywords: Reliability Remaining useful life Wiener process Recursive filter Expectation maximization

1. Introduction Enhancing safety, efficiency, availability, and effectiveness of industrial and military systems through prognostics and health management (PHM) paradigm has gained momentum over the last decade [1,2]. PHM is a systematic approach that is used to evaluate the reliability of a system in its actual life-cycle conditions, predict failure progression, and mitigate operating risks via management actions. There are two parts in PHM, namely, ‘‘prognostics’’ and ‘‘health management’’. Prognostics is often characterized by estimating the remaining useful life (RUL) of a system using available condition monitoring (CM) information [3–7]. Once such prognosis is available, appropriate health management actions such as repair, replacement, and logistic support can be performed to achieve the required system’s operational objectives [8–10]. In PHM, the term ‘‘RUL estimation’’ often implies to find the probability density function (PDF) of the RUL or the mean of the RUL [11], but the emphasis is often placed more on estimating the PDF of the

n

Corresponding authors. Tel.: þ 86 010 62794461; fax: þ86 010 62786911. E-mail addresses: [email protected] (X.-S. Si), [email protected] (W. Wang), [email protected] (D.-H. Zhou).

0888-3270/$ - see front matter & 2012 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.ymssp.2012.08.016

220

X.-S. Si et al. / Mechanical Systems and Signal Processing 35 (2013) 219–237

RUL than the mean RUL since a PDF can characterize the uncertainty associated with the RUL and is hence more informative for management decision making. The current RUL estimation approaches can be broadly classified as physics of failure, data driven and fusion methods. Physics of failure approaches rely on the physics of underlying failure mechanisms. Data driven approaches achieve RUL estimation via data fitting mainly including machine learning and statistics based approaches. The fusion approaches are the combination of the physics of failure and data driven approaches. However, for complex or large-scale engineering systems, it is typically difficult to obtain the physical failure mechanisms in advance or cost-expensive and timeconsuming to capture the physics of failure by experiments. In contrast, data-driven approaches attempt to derive models directly from collected degradation data or life data, and thus are more appealing and have gained much attention in recent years. Statistics-based data driven methods for RUL estimation can be classified into the models based on the indirectly observed state processes and the models based on the directly observed state processes [2]. The former models considered the data partially indicating the underlying state of the system, and assumed that the available CM data were stochastically related to the underlying health state. In this case, lifetime data must be available to establish the relationship between the CM data and failure. The latter models utilized the observed degradation data directly to describe the underlying state of the system. Therefore, the RUL is defined as the time to reach the failure threshold of the monitored degradation data for the first time, namely the first hitting time (FHT) [12]. It is noted that the observed degradation data with a threshold are easier to manipulate and implement in practice when lifetime data are scarce [13,14]. It is well recognized that degradation process is uncertain over time and thus stochastic models are frequently used to characterize the evolution of degradation process. In literature, random effect regression (RER) models and stochastic process (SP) models are two kinds of most commonly used stochastic models. Lu and Meeker in [15] first presented the RER model to characterize the degradation of a population of units, and many extensions appeared later [2,16]. However, the degradation modeling paradigm in RER models is based on the fact that a population of ‘‘identical’’ systems (or devices) has a common degradation form. However, individual systems may exhibit different degradation rates, hence the different failure times. On the other side, Pandey et al. in [17] identified that the temporal uncertainty of the degradation process was not taken into account in RER models and thus argued that SP models could remedy it well. They also showed the advantages of SP models over RER models in condition based maintenance. SP models such as Markov chain, Gamma processes, and Wiener processes have been widely used to model the degradation process [5,18–21]. A Wiener-processbased degradation model is one of statistics-based data driven models, which can characterize a non-monotonic degradation process, and provide a good description of system’s behavior due to an increased or reduced intensity of the use [22]. This type of models has been applied to model the degradation process and to estimate the RUL of a variety of industrial assets, such as rotating element bearings [19], LED lamps [23], self-regulating heating cables [24], laser generator [25], bridge beams [26], and fatigue crack dynamics [27]. Therefore, in this paper we focus on RUL estimation based on a Wiener process where the degradation process can be observed directly and a failure threshold of degradation is available. It is known that the selection of the failure threshold is an important problem in practice. However, such threshold is usually set based on either engineering domain knowledge or accepted industrial standards. For example, the ISO 2372 and ISO 10816 are frequently adopted for defining acceptable vibration threshold levels. It is therefore an issue beyond the scope of this paper. In this work, we assume that the failure threshold is known a priori. A Wiener-process-based model has a drift term characterized by its drift coefficient and a noise term by Brownian motion. It has been widely used to model degradation processes which can be observed directly, and conduct lifetime analysis. Tseng et al. in [23] used a Wiener process to determine the lifetime for the light intensity of LED lamps of contact image scanners. As an extension, Tseng and Peng proposed an integrated Wiener process to model the cumulative degradation path of a product’s quality characteristics [28]. Joseph and Yu used a Wiener process for degradation modeling and reliability improvement [14]. Other recent extensions in lifetime estimation can be found in [29–31]. However, the degradation modeling paradigm for lifetime analysis in these conventional Wiener-process-based models is based on an assumption that the estimated PDF of RUL depends only on the currently observed degradation data, which is a strong Markovian assumption. To relax this assumption, Gebraeel et al. in [19] presented an exponential degradation model for rotating element bearings based on a Wiener process, but incorporated some new and important improvements for RUL estimation. Their model established a linkage between the past and current degradation data of the same system by a Bayesian mechanism. However, it is worth noting that the Brownian motion in the Wiener process was just used as an error term in their models and the availability of the explicit distribution of the FHT from the Wiener process was not utilized. Instead, they directly estimated the RUL distribution using an implicit monotonic assumption. It is well-known that Wiener processes are non-monotonic. As such, the resulted RUL estimates in [19] are approximations. In addition, we note that the stochastic coefficients reflecting the individual-to-individual variability in [19] followed some prior distributions, but no elaborated method was presented to select the parameters in the prior distributions. Typically, several systems’ historical degradation data of the same type are required to determine the prior parameters. However, such historical degradation data of many systems are not always available in practice, particularly for newly commissioned systems. It is shown in Section IV that the inappropriate selection of the prior parameters can result in inaccurate estimation of the degradation and the RUL. Wang et al. in [32] recently proposed a Wiener-process-based model which used all past degradation data to date of the system for RUL estimation. Their model explicitly used the FHT from the Wiener process for RUL estimation.

X.-S. Si et al. / Mechanical Systems and Signal Processing 35 (2013) 219–237

221

However, we note that the model in [32] also required the data of many same systems for parameter estimation, and the distribution of the updated drift coefficient was not considered. Our results reveal that considering such distribution can lead to uncertainty reduction in the estimated RUL. One final observation from the existing literature of Wiener-processbased RUL estimation models is that all estimated parameters are not updated in line with newly observed data. From the above review of related researches, we observe that there are three issues remaining to be solved when applying Wiener process for RUL estimation. The first is how to estimate the model parameters from an individual system’s data without the need of past data from many systems. The second is to consider the distribution in the estimated drift coefficient, which is a critical parameter having impact on both the mean and variance of the RUL. The third is to update model parameters based on newly observed degradation data. As we know the system’s life is heavily influenced by the way it is operated, maintained and the environment where it has been operating. The consideration of the above three issues will make our model tailored to an individual system through its actual monitoring data which relate to its operational and environmental characteristics. In this paper, we address the above issues by utilizing a Wiener-process-based model with a recursive filter algorithm for RUL estimation. We use two techniques for the updating of the RUL estimation. A state-space model is used to recursively update the drift coefficient and an expectation maximization (EM) algorithm is used to re-estimate all unknown parameters at each time when new data are available. The new contributions of this paper are summarized as follows: (1) Different from all the previous works, our model estimates an individual system’s RUL based on its entire monitoring information to date through a recursive filter and an EM algorithm, and does not require historical degradation data of other systems in a population; (2) Unlike the work of Wang et al. in [32] where a recursive filter was also used, the drift coefficient is treated as a random variable to incorporate its distribution in estimation; (3) Our model is also different from the approximated results in [19,33] in that our result on the PDF of the RUL is exact in the sense of the FHT, and we also show that our result can ensure that the moments of the RUL exist, but this is not the case for the approximated results in [19,33]; and (4) We apply the proposed model to estimate the RUL of gyros in an inertial navigation system used in weapon systems as a case application. The remainder parts are organized as follows. In Section II, we develop a Wiener-process-based degradation model and obtain the distribution of the RUL. In Section III, we discuss the parameter estimation algorithm in detail. Section IV provides a case study to illustrate the application and usefulness of the developed model. Section V draws the main conclusions.

2. Wiener-process-based degradation modeling and RUL estimation Notation used in this paper is summarized as follows. Time of the ith CM point (can be irregularly spaced) ti XðtÞ Random variable representing the degradation at time t xi Degradation observation at t i X0:i ¼ fx0 ,x1 ,x2 ,:::,xig History of degradation observations for the system to t i Ui ¼ l0 , l1 ,. . ., li1 History of drift coefficients to ti EðUÞ Expectation operator BðtÞ Standard Brownian motion l Drift coefficient li Drift coefficient at t i l0 Initial state which follows Nða0 ,P 0 Þ s Diffusion coefficient Z Noise of li which follows Nð0,Q Þ w Failure threshold l^ i Estimated li conditional on X0:i , denoted by Eðli 9X0:i Þ l^ j9i Smoothed lj conditional on X0:i , denoted by Eðlj 9X0:i Þ P i9i Variance of estimated li , denoted by Varðli 9X0:i Þ P i9i1 Variance of estimated li at t i1 , denoted by Varðli 9X0:i1 Þ Ki Filter gain Ri RUL at t i with its realization, r i T Lifetime h Parameter column vector, denoted by h ¼ ½a0 ,P0 ,Q , s2 T h^ i Maximum likelihood estimate (MLE) of h based on X0:i ðkÞ h^ i Estimated h in the kth step based on X0:i in the EM algorithm Li ðhÞ Log-likelihood function for observed events, x0 ,x1 ,x2 ,:::,xi ‘i ðhÞ Joint log-likelihood function for observed events, X0:i and Ui ðkÞ ðkÞ ‘ðh9h^ i Þ Expectation of ‘i ðhÞ conditional on h^ i and X0:i ðkÞ Dn Order principal minor determinant of the second derivation of ‘ðh9h^ i Þ, with n ¼ 1,2,3,4 FðUÞ Standard normal CDF

222

X.-S. Si et al. / Mechanical Systems and Signal Processing 35 (2013) 219–237

2.1. An outline of Wiener-process-based degradation model for lifetime analysis In this section we briefly present the conventional Wiener-process-based degradation model for lifetime analysis. A Wiener process is typically used for modeling degradation processes where the degradation increases linearly in time with random noise. The rate of degradation is characterized by the drift coefficient. The wear of break pads on automotive wheels is a practical example. Christer and Wang in [34] modeled the wear of break pads as a linear function of time where the thickness of the break pad decreases linearly in time with a Gaussian noise. In general, a Wiener-process-based degradation model can be represented as, XðtÞ ¼ lt þ sBðtÞ,

ð1Þ

where l is the drift coefficient, s 40 is the diffusion coefficient, and BðtÞ is the standard Brownian motion representing the stochastic dynamics of the degradation process. In physics, a Wiener process aims at modeling the movement of small particles in fluids and air with tiny fluctuations. A characteristic feature of this process in the context of reliability is that the plant’s degradation can increase or decrease gradually and accumulatively over time. The small increase or decrease in degradation over a small time interval behaves similarly to the random walk of small particles in fluids and air. Therefore, this type of stochastic processes has been widely used to characterize the path of degradation processes where successive fluctuations in degradation can be observed, such as the degradation observations of rotating element bearings [19], LED lamps [23], self-regulating heating cables [24], laser generator [25], bridge beams [26] and other examples in [14] and [30–35] and our case of gyros’ drifting. Modeling a stochastic degradation process as a Wiener process implies that the mean degradation path is a linear function of time, i.e. E½XðtÞ ¼ lt. Therefore, the drift parameter l is closely related with the progression of the degradation. In addition, we have the variance of the degradation process var½XðtÞ ¼ s2 t, which represents the uncertainty of the degradation at time t. For in service lifetime estimation at time t i with the obtained degradation observation xi , we can use, XðtÞ ¼ xi þ lðtt i Þ þ sðBðtÞBðt i ÞÞ ¼ xi þ lðtt i Þ þ sBðtt i Þ,

f or t 4 t i

ð2Þ

At time t i , we assume xi ow, otherwise the degradation has crossed w and the system would have failed as defined. Although the above model setting is the same, there are two different ways to relate XðtÞ to lifetime T at t i  the degradation  in the literature. The first one is that the lifetime is directly defined as T ¼ t : XðtÞ Z w9xi , and then the lifetime   distribution can be represented by F T9xi ðt9xi Þ ¼ PrðXðtÞ Z w9xi Þ, such as [19,31,33]. To see this, using T ¼ t : XðtÞ Z w9xi , the PDF and the cumulative density function (CDF) of lifetime T at time t i can be directly obtained as, ! 2 ðwxi lðtt i ÞÞ 1 f T9xi ðt9xi Þ ¼ pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi exp  ð3Þ 2s2 ðtt i Þ s 2pðtti Þ F T9xi ðt9xi Þ ¼ 1F

  wxi lðtt i Þ pffiffiffiffiffiffiffiffiffi , s tti

ð4Þ

where FðUÞ denotes the standard normal CDF.   Another one defines the lifetime based on the concept of the FHT as T ¼ inf t : XðtÞ Z w9xi , such as [21,29,30,32,35]. Therefore, using the FHT, the following results can be obtained [36], ! 2 wxi ðwxi lðtt i ÞÞ ffi exp  f T9xi ðt9xi Þ ¼ qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi , ð5Þ 2s2 ðtt i Þ 2pðtt i Þ3 s2       wxi lðtt i Þ 2lðwxi Þ ðwxi Þlðtt i Þ pffiffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffiffi F þ exp , ð6Þ 2 s s tti s tti h    3 with the mean E T9xi ¼ t i þ ðwxi Þ=l and variance var T9xi ¼ ðwxi Þs2 =l . This reflects the relationship among the parameters used in the model, the current degradation measurement and the estimated future life of the plant modeled. Particularly, l is critical for both the mean and variance of the estimated lifetime. Clearly, the above two definitions are different from each other and also lead to different lifetime estimations. As noted   by Park and Bae in [16], T ¼ t : XðtÞ Zw9x  i completely  ignores the possible hitting events within interval ðt i , tÞ and thus is only a crude approximation to T ¼ inf t : XðtÞ Zw9xi when the degradation fluctuations are large. In addition, the CDF using the FHT as Eq. (6) is greater than the approximation by Eq. (4). For safety-critical systems, using the approximated result as Eq. (4) for maintenance scheduling may lead to under-maintenance because of the lower risk of failure estimated by Eq. (4). As such, it is necessary to consider the FHT as the lifetime, which is exact if the failure is defined as the FHT. We note that Eq. (6) uses only the current degradation data, but not its history before t i . However, as we have discussed, this is a strong Markovian assumption and ideally the future FHT should depend on the path that the degradation has involved to date. For example, in Fig. 1, at the same level of xi , case (a) would be expected to fail faster than case (b), but using Eq. (6) will give the same prediction. F T9xi ðt9xi Þ ¼ 1F

X.-S. Si et al. / Mechanical Systems and Signal Processing 35 (2013) 219–237

Degradation

223

Degradation w

w xi

xi Time

Time

Fig. 1. Two exemplar sample paths with different tracks but the same xi .

Consequently, it is desired to utilize the degradation data to date for evaluating the RUL of the degraded system. It is expected that utilizing the degradation data to date can make the RUL estimation sharper and more tailored to an individual system than only using the current data. This is our main focus in the remaining parts of this paper. 2.2. Wiener-process-based degradation modeling Now we address the issues discussed in the introduction. Since some variants introduced in [19] can be easily transformed into Eq. (1) by logarithmic transformation, we only focus on Eq. (1), which is used to describe the evolution of the monitored degradation variable over time in this paper. To incorporate the history of the observations and to maintain at the same time the nice property of the Wiener process, we consider an updating procedure for coefficient l by a random walk model li ¼ li1 þ Z over time where Z  Nð0,Q Þ. Thus the drift coefficient l evolves as a time-dependent random variable with a distribution, conditional on li1 . In fact, the diffusion coefficient, s, can also be made time-dependent. However the structure of Eq. (1) does not allow us to use the state space model shown later, and therefore a general filter has to be used, which is computationally difficult. There is also a practical reason why we are only interested in making l time-varying. It is known from Eq. (1) and the discussion in Section II-A that the mean degradation and the progression of the degradation are governed by the drift coefficient l and the time while the diffusion coefficient s controls in part the uncertainty in the degradation process. The trend in degradation is determined by the drift coefficient while the diffusion coefficient only influences the noise, which can be considered to be constant. A similar idea can also be found in statistical process control literature, in which it is frequently assumed that the process mean will change but the variance will be constant when the process shifts from ‘‘in control’’ to ‘‘out of control’’ [37]. Motivated by the state space model [38], the degradation equation can be reconstructed via a linear state-space model as,

li ¼ li1 þ Z,

ð7Þ

xi ¼ xi1 þ li1 ðt i t i1 Þ þ sei ,

ð8Þ

where t 0 ¼ 0, x0 ¼ 0, and ei  Nð0,t i t i1 Þ. The use of t i t i1 as the variance of ei is required by the property of Brownian motion. Eq. (7) is called the system equation, while Eq. (8) is the observation equation [38]. The reason to use a linear system equation is not only because of its simplicity. We can use a nonlinear system equation for li , but we expect that the gain obtained will be minor, but at the expense of a substantially long computation time. There is also a practical problem as what form of nonlinearity we should use because li is not observable. Since the system equation is to model the change of the drift coefficient over a sampling interval which is not long, we would expect that li should be around li1 adjusted by the noise term. We assume that the initial drift coefficient l0 follows a normal distribution with mean a0 and variance P 0 as required by the state space model. The drift coefficient is considered as a hidden ‘‘state’’ and can be estimated from the observations up to t i , denoted by X0:i ¼ fx0 , x1 , x2 ,:::, xi g. As such, this model establishes the linkage between the drift coefficient and the observation history up to t i . In Eq. (7), li follows a distribution which can be estimated by a recursive filter once new observation xi is available at t i . We denote its mean by l^ i ¼ Eðli 9X0:i Þ and its variance by P i9i ¼ Varðli 9X0:i Þ. In order to compute l^ i and P i9i , we need to know the PDF of the li given X0:i , denoted by pðli 9X0:i Þ. Recursion solution of pðli 9X0:i Þ can be computed from pðli1 9X0:i1 Þ by the well-known Bayesian rule as follows, R Z pðli 9li1 Þ pðxi 9li1 ,X0:i1 Þ pðli1 9X0:i1 Þdli1 ð9Þ pðli 9li1 Þ pðli1 9X0:i Þdli1 ¼ pðli 9X0:i Þ ¼ pðxi 9X0:i1 Þ: It has been well established that if Eqs. (7) and (8) are used, Eq. (9) is Gaussian with mean l^ i and variance P i9i , which can be computed by the Kalman filter [38]. As a result, the entire history is captured via recursively updating the estimate of li , which is the advantage of the state space model. The recursive estimations for l^ i and Pi9i using Kalman filtering are summarized as Algorithm 1 in Appendix A. In literature, the Kalman filter has been successfully applied when system states (here referred the drift coefficient as a state) and observations evolve in a smooth and gradually changing way. However, degradation sometimes may have jumps or sudden changes [19]. Here, we introduce an algorithm to deal with this by a strong tracking filter (STF) [39]. STF is also Kalman filter-based, but adjusts the prediction variance P i9i1 so that it is sensitive to the prediction error,

224

X.-S. Si et al. / Mechanical Systems and Signal Processing 35 (2013) 219–237

xi xi1 l^ i1 ðt i t i1 Þ, and then the filter gain K i is sensitive to the change of the system state. The details about the STF algorithm are summarized in Appendix B as Algorithm 2. Based on Eqs. (7), (8), and (9), the PDF of li conditional on X0:i is,

2 1 2Pi9i , ð10Þ pðli 9X0:i Þ ¼ qffiffiffiffiffiffiffiffiffiffiffiffiffi exp  li l^ i 2pP i9i where the dependence between li and X0:i is contained in l^ i and Pi9i . Based on this result, we derive the associated RUL distribution in the following. 2.3. Real-time updating of the RUL distribution Based on a predefined threshold w, the RUL modeling principle is that when degradation XðtÞ first reaches threshold w, the system is declared to be non-operable and its lifetime terminates. Consequently, it is natural to view the event of lifetime termination as the point that the degradation XðtÞ exceeds threshold w for the first time. In this paper, from the concept of the FHT, we define RUL Ri at time t i as   ð11Þ Ri ¼ inf r i : Xðr i þt i Þ Z w9X0:i , with CDF F Ri 9X0:i ðr i 9X0:i Þ and PDF f Ri 9X0:i ðr i 9X0:i Þ. From Eq. (2), it is direct to obtain the PDF and CDF of the RUL at time t i defined in Eq. (11) as follows [36], ! 2 wxi ðwxi li r i Þ f Ri 9li ,X0:i ðr i 9li ,X0:i Þ ¼ pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ð12Þ , r i 4 0: exp  2s2 r i 2pr i 3 s2       wxi li r i 2li ðwxi Þ ðwxi Þli r i F Ri 9li ,X0:i ðr i 9li ,X0:i Þ ¼ 1F F þexp , pffiffiffiffi pffiffiffiffi 2 s ri s ri s

r i 4 0:

ð13Þ

In Eq. (12) if we replace li by l^ i , then it is the RUL model used in [32]. We call this as Wang’s model for subsequent comparisons in Section IV. However, as mentioned above, the drift coefficient evolves as a random variable in Eq. (7) with a distribution, pðli 9X0:i Þ, conditional on the observed data up to time t i as formulated by Eq. (10). Now we want to use pðli 9X0:i Þ for deriving the estimated RUL distribution. In order to achieve this aim, we first give two lemmas, which can significantly simplify the course of the derivation for the RUL distribution. qffiffiffiffiffiffiffiffiffiffiffiffiffi Lemma 1. If Y  Nðm1 , s21 Þ, then EY ½FðYÞ can be formulated as EY ½FðYÞ ¼ Fðm1 = s21 þ 1Þ where FðUÞ denotes the standard normal CDF. Proof: From the property of the normal distribution, we have qffiffiffiffiffiffiffiffiffiffiffiffiffi  EY ½FðYÞ ¼ E EðIfZ r Y g 9YÞ9Y ¼ PrðZ r YÞ ¼ PrðZY r 0Þ ¼ Fðm1 = s21 þ 1Þ: ð14Þ In the derivation process, IfZ r Y g is the indicator function, Z is a standard normal variable and independent of Y, and ZY  Nðm1 , s21 þ1Þ. Lemma 2. If l  Nðml , s2l Þ , the PDF and CDF of the FHT of process XðtÞ ¼ lt þ sBðtÞ to first hit threshold w can be formulated as "  2 # wml t w   : exp  f T ðt Þ ¼ qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ð15Þ   2t s2l t þ s2 2pt 3 s2 t þ s2 l

0

1 0 1 !   2 2 2ml w 2s2l w2 B ml tw C B 2sl wt þ s ml t þw C qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi A: þ F@ F T ðt Þ ¼ F@qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiA þ exp s2 s4 s2l t2 þ s2 t s2 s2l t2 þ s2 t

ð16Þ

Lemma 2 is similar to the results given in [40] and [41]. This lemma can be obtained by some direct manipulations using Lemma 1 and the total law of probability [25]. From Lemma 2, we give the following theorem. Theorem 1. For the Wiener process defined by Eq. (2) and the state space model as Eqs. (7) and (8), the PDF and CDF of the updated RUL at t i based on the updated PDF of li can be obtained as 0

2 1 ^r wx  l i i i wxi B C ð17Þ f Ri 9X0:i ðr i 9X0:i Þ ¼ rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi A, r i 40:

exp@

2r i P i9i r i þ s2 2pr 3 P r þ s2 i

i9i i

X.-S. Si et al. / Mechanical Systems and Signal Processing 35 (2013) 219–237

225

0

1 0

1 ! 2 ^ 2 2l^ i ðwxi Þ 2P i9i ðwxi Þ B wxi l^ i r i C B 2P i9i ðwxi Þr i þ s li r i þwxi C qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi F Ri 9X0:i ðr i 9X0:i Þ ¼ 1F@qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiA þ exp þ F@ A: s2 s4 s2 Pi9i r i 2 þ s2 r i Pi9i r i 2 þ s2 r i Proof: Using Eqs. (10), (12), (13), (15), (16) and the total law of probability, we have Z þ1 f Ri 9li ,X0:i ðr i 9li ,X0:i Þpðli 9X0:i Þdli , f Ri 9X0:i ðr i 9X0:i Þ ¼

ð18Þ

ð19Þ

1

F Ri 9X0:i ðr i 9X0:i Þ ¼

Z

þ1

1

F Ri 9li ,X0:i ðr i 9li ,X0:i Þpðli 9X0:i Þdli :

ð20Þ

Following Lemma 2, it is straightforward to obtain Eqs. (17) and (18). Comparing Eq. (17) with Eq. (12), we observe that the observation history and the variance of li are involved in Eqs. (17) and (18), which is also recursively updated. We call Eq. (17) as our model to distinguish other models in Section IV.   Remark 1. In [19], they directly used PrðRi r r i 9X0:i Þ ¼ Pr Xðr i þ t i Þ Z w9X0:i to calculate the RUL distribution and obtained   the associated RUL distribution with similar form to Eqs. (3) and (4). However, using PrðRi r r i 9X0:i Þ ¼ Pr Xðr i þ t i Þ Z w9X0:i ignores the possible hitting events within ðt i ,t i þr i Þ as discussed in Section II-A and thus their results are approximations. Instead, Eqs. (17) and (18) are exact in the sense of the FHT. Remark 2. The moment of the RUL distribution obtained in [19] and [33], does not exist since their obtained RUL distributions belong to the family of Bernstein distributions, known without moments, but this is not the case for our result. For example, the mean of RUL can be easily formulated by pffiffiffi 2 Z ^ li wxi wxk l^ u2 l^ i 2ðwx Þ ffiÞ EðRi 9X0:i Þ ¼ E½EðRi 9li ,X0:i Þ9X0:i  ¼ Eð 9X1:k Þ ¼ expð i Þ expð Þdu ¼ qffiffiffiffiffiffiffi i Dðqffiffiffiffiffiffiffiffiffi li P i9i 2P i9i 0 2P i9i P 2P i9i

2

Rz

i9i

2

where DðzÞ ¼ expðz Þ 0 expðu Þdu is the Dawson integral for real z, which is known to exist. This property is desired in maintenance practice, since the expectation of the life estimation is required to be existent sometimes [42,43]. In Eqs. (17) and (18), parameters a0 ,P0 , Q and s2 should be estimated. As opposed to the result in [19,32], we develop a parameter estimation algorithm in the following section for this task. 3. Parameter estimation Now we return to estimate and update a0 , P 0 , Q and s2 in Eqs. (7) and (8). Unlike the method adopted in [32], we use only the data from one system from the time of installation and recursively update the estimate along with the observation process. We denote h ¼ ½a0 , P 0 , Q , s2 T as a parameter vector. We use the maximum likelihood estimation (MLE) to estimate h once new degradation observation xi is available. In this case, the log-likelihood function for X0:i can be written as Li ðhÞ ¼ log½pðX0:i 9hÞ,

ð21Þ

where pðX0:i 9hÞ is the joint PDF of the degradation data X0:i . Then the MLE estimate of h, denoted by h^ i , conditional on X0:i can be obtained by

h^ i ¼ argmax Li ðhÞ:

ð22Þ

h

If the drift coefficient is constant then maximizing Eq. (21) with respect to h is straightforward. However, since we treat li as a hidden variable which is given by Eq. (7), then directly maximizing Eq. (21) is impossible. However, the EM algorithm provides a possible framework for estimating the parameters involving hidden variables [44]. A fundamental assumption of the EM algorithm is that the hidden variables can be estimated by observed data. This is the case for our problem since pðli 9X0:i Þ can be obtained by Eq. (10). The fundamental principle of the EM algorithm is to replace the hidden variables with their expectations conditional on the observed data. Then the parameter estimation can be formulated as maximizing the joint likelihood function pðX0:i , Ui 9hÞ. Specifically, by manipulating the relationship between pðX0:i 9hÞ and pðX0:i , Ui 9hÞ, Li ðhÞ can be divided into two parts as Li ðhÞ ¼ ‘i ðhÞlog pðUi 9X0:i , hÞ,

ð23Þ

where ‘i ðhÞ ¼ log pðX0:i , Ui 9hÞ:

ð24Þ 0

Then taking the expectation operator on both sides of Eq. (23) with respect to Ui 9X0:i , h , we have Li ðhÞ ¼ ‘ðh9h0 ÞKðh9h0 Þ,

ð25Þ

226

X.-S. Si et al. / Mechanical Systems and Signal Processing 35 (2013) 219–237

where

  ‘ðh9h0 Þ ¼ EUi 9X0:i , h0 ‘i ðhÞ ,

Kðh9h0 Þ ¼ EUi 9X0:i , h0



ð26Þ

 log pðUi 9X0:i , hÞ :

ð27Þ

Finally, via Eq. (25), the following holds Li ðhÞLi ðh0 Þ ¼ ‘ðh9h0 Þ‘ðh0 9h0 Þ þKðh0 9h0 ÞKðh9h0 Þ , |fflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflffl}

ð28Þ

Z0

where the positivity of the last term is implied by the Kullback–Leibler divergence metric between pðUi 9X0:i , hÞ and pðUi 9X0:i , h0 Þ, [45]. Obviously, if ‘ðh9h0 Þ 4 ‘ðh0 9h0 Þ, then Li ðhÞLi ðh0 Þ 4 0 holds. This is achieved by the EM algorithm which takes an ðkÞ ðk þ 1Þ according to the following two steps: approximation h^ i of MLE h^ i given in Eq. (22) and updates it to a better h^ i E-step   ðkÞ ð29Þ Calculate ‘ðh9h^ i Þ ¼ E ^ ðkÞ ‘ i ðhÞ ,

Ui 9X0:i , hi

ðkÞ where y^ i ¼ ½a0i ðkÞ , P0i ðkÞ , Q i ðkÞ , si 2ðkÞ T denotes the estimated parameters in the kth step conditional on X0:i . M-step     ðk þ 1Þ Calculate h^ i ¼ argmax E : ðkÞ ‘ i ðhÞ ^

h

Ui 9X0:i , hi

ð30Þ

Then we iterate the E-step and M-step until a criterion of convergence is satisfied. In our case, we can calculate the E-step and M-step separately but just outline the properties of the estimation algorithm. The details of the algorithm are summarized in Appendix C. Interestingly, it can be observed from Theorem 2 in Appendix C that the M-step in our approach can be solved analytically and we can obtain the unique maximum point. This implies that each iteration of the EM algorithm can be performed with a single computation, which leads to an extremely fast and simple estimation procedure. This computation advantage plus the exact RUL distribution are particularly attractive for practical applications. The convergence property of the proposed algorithm can be similarly demonstrated in [46–50]. 4. A practical case study In this section, we provide a practical case study for gyros in an inertial navigation system (INS) to illustrate the application of our model and compare the performance of our model with the models presented in [19,32]. In our study, it is found that STF can generate superior results to Kalman filter to a certain extent. As such, in the following, we only use STF in the filtering step for illustration. Of course, in practice which one to use needs to be tested against the model fit. However, due to limited space, we do not discuss this issue in this paper. Actually, the detailed comparisons between STF and Kalman filter with sudden state changing can be found in [51,52]. 4.1. Problem description As a key device of the INS in weapon systems and space equipment, an inertial platform plays an important and irreplaceable role in the INS. Its operating state has a direct influence on navigation precision. The sensors fixed in an inertial platform include three gyros and three accelerometers, which measure angular velocity and linear acceleration, respectively. The gyro fixed on an inertial platform is a mechanical structure having two degrees of freedom from the driver and sense axis (see [53] for a general description of inertial navigation platforms and gyros). When the inertial platform is operating, the wheels of the gyros rotate at very high speeds and can lead to rotation axis wear. As the wear is accumulated, the bearing on the gyro’s electric motor will become deformed and such deformation can lead to the drift of the gyros. The increasing drift finally results in the failure of gyros and then the inertial platform. Past data show that almost 70% of the failures of inertial platforms result from gyroscopic drift and such drift is largely resulted from the wear of bearings as the case of rolling element bearings which are extensively investigated in the literature. However, the difference of our case is that we use the drift data of gyros to estimate the RUL rather than the vibration data as rolling element bearings since we cannot obtain the vibration data as it is not allowed to fix the vibration sensors in the inertial platform. As such, the drift of gyros is often used as a performance indicator to evaluate the health condition of an inertial platform and to schedule maintenance activities. For an illustrative purpose, we provide an illustration of the inertial platform (see Fig. 2), and an illustration of a deformed bearing of the gyro’s electric motor (see Fig. 3), which was obtained by scanning electron microscopy S-3700N. It can be found that the maximum length of the metal flake is 155 mm. Such deformation is reflected by the drifting data of gyros, which can be monitored directly. In this study, we assume that CM values of drift coefficients reflect the performance of the inertial platform, and the larger the drift coefficients monitored are, the worser the performance. Therefore, according to the CM data and technical

X.-S. Si et al. / Mechanical Systems and Signal Processing 35 (2013) 219–237

227

Fig. 2. Illustration of the inertial platform.

Fig. 3. Illustration of a deformed bearing of the gyro’s electric motor.

index of the inertial platform, failure prediction can be implemented by modeling the drift coefficients. The drift coefficients of an inertial platform mainly include K 0X , K 0Y , K 0Z , K SX , K SY , K IZ , in which K 0X , K 0Y , K 0Z denote constant drift coefficients, and K SX , K SY , K IZ are stochastic drift coefficients, where K SX , K SY denote the coefficients related to the first moment of specific force along the sense axis, and K IZ denotes the coefficient related to the first moment of specific force along the input axis. Generally, the drift degradation measurement along the sense axis, K SX , plays a dominant role in the assessment of gyro degradation. In our study, we take the CM data of K SX as the degradation signals and use them for RUL estimation of the INS. For our monitored INS in certain weapon system with the terminated life 180.5 h, 73 points of drift coefficients data were collected with regular CM intervals 2.5 h in field condition. The collected data are illustrated in Fig. 4. In the practice of the INS health monitoring, it is usually required that the drift measurement along the sense axis should not exceed 0:37ð1=hÞ. This threshold is predetermined at the design stage and is strictly enforced in practice since an INS is a critical device used in a navigated weapon system.

4.2. The implementation of our model for RUL estimation of the INS Using our model, the predictions of the gyro’s drift and the distribution of the RUL can be obtained at each CM point. Specifically, using our approach initialized by parameter vector h0 ¼ ½0:002, 0:001, 0:01, 0:01T , the one step predicted drifting path by x^ i þ 1 ¼ xi þ l^ i ðt i þ 1 t i Þ is illustrated in Fig. 4 to show the fitness of our model to the gyro’s drift degradation

X.-S. Si et al. / Mechanical Systems and Signal Processing 35 (2013) 219–237

The degradation path of gyro drift (0/hour)

228

0.4 The actual observed path

0.35

The predicted path

0.3 0.25 0.2 0.15 0.1 0.05 0

0

20

40

60 80 100 120 The CM point (hours)

140

160

180

Fig. 4. The actual gyro’s drift data and the predictions of our model.

data. Clearly, the predicted results match with the actual data well and the mean squared error (MSE) of the predictions is 1.1962E-4 which is small. This demonstrates that our developed model can model the gyro’s drift degradation data effectively. Correspondingly, the evolving path of the estimated parameter vector h^ , consisting of a0 , P 0 , Q and s2 , are illustrated in Fig. 5, which are estimated by Algorithm 4 summarized in Appendix E. Fig. 5 shows that the updated parameters converge quickly as the observed degradation data are accumulated. In this case, once the parameters converge, further updating may be unnecessary. However, such updating is needed for the case that the degradation process may be subject to unusual changes in its progression. Once the parameters in the model are updated, the PDF of the estimated RUL can be calculated at each CM point. It is noted that the RUL estimation at a CM point is not achieved by estimating the long term future degradation state with estimating errors, and instead is achieved by directly calculating the FHT of the degradation process. Fig. 6 illustrates the RUL distributions at six different CM points. It is noted that the first drift reading is zero according to the model setting. In our used practical data, the last drift reading is 0:3566 ð1=hÞ at the monitoring time t 73 ¼ 180 h. It can be found that this drift reading is very close to the failure threshold (w ¼ 0:37 ð1=hÞ). Therefore, we can consider that the data captures full life cycle history (useful life) of the machine component. In other words, the actual lifetime of the gyros in an INS is approximated to be 180.5 h and thus the actual RUL at each CM point is known from the full life cycle data. As shown in Fig. 6, the actual RUL (denoted by square) falls within the range of the estimated PDF of the RUL at each CM point from our model and further the estimated PDF of the RUL becomes sharper as the degradation data are accumulated. This implies that the uncertainty of the estimated RUL is reduced since more data are utilized during estimating the model parameters. When we use our predictive model for RUL estimation at a given CM point, we use the CM data up to that CM point. In other words, if we estimate the RUL at t i , the data X0:i are used to update the model and for RUL estimation. Therefore, at the last CM time t 73 , all the available CM data are used to estimate the RUL. Fig. 7 illustrates the performance of our predictive model against the full life cycle data. It can be observed that the estimated mean RUL and the actual RUL match each other well. For example, the relative error between the actual life and the estimated mean life at t 73 is 0.16%. This reflects that the estimated result of our developed model can match the actual result from the full life cycle data closely. 4.3. Comparative studies In this part, we conduct some comparative studies with the models presented in [19,32]. We first compare our model with Wang’s model about their performance of the RUL estimation for the INS. The detailed implementation process of Wang’s model can be found in [32]. In order to compare our model with Wang’s model, a loss function is employed to enable a direct comparison of the distributions for the RUL estimation between two models. The loss function is the MSE about the actual RUL obtained at each observation point [4], defined as Z 1   MSEi ¼ ðr i r~ i Þ2 f Ri 9X0:i r i 9X0:i dr i , ð31Þ 0

  where r~ i is the actual RUL obtained at t i and f Ri 9X0:i r i 9X0:i is the estimated PDF of the RUL. Fig. 8(a) compares the estimated RUL distributions from our model with Wang’s model. In both cases, the unknown parameters, such as a0 ,P0 , Q and s2 , are obtained by the EM algorithm, but the difference is that our model recursively updates all parameter estimates whenever a new piece of information is available, whereas the model in [32] uses the data of past systems to estimate the model parameters and once estimated they are fixed. Only l^ i is updated with the new data

X.-S. Si et al. / Mechanical Systems and Signal Processing 35 (2013) 219–237

1

229

x 10-3 0.025

0.9 0.8

0.02 Estimated a0

Estimated P0

0.7 0.6 0.5 0.4

0.015

0.01

0.3 0.2

0.005

0.1 0

0

1.2

20

40

60 80 100 120 The CM point(hours)

140 160

0

180

x 10-3

1.4

40

60 80 100 120 140 160 180 The CM point(hours)

x 10-3

1 Estimatedσ 2

0.8 0.6 0.4

0.8 0.6 0.4

0.2

0.2

0

20

40

60 80 100 120 140 160 180 The CM point(hours)

0

0

20

40

60 80 100 120 140 160 180 The CM point(hours)

Fig. 5. The updated parameters at monitoring points t 0 , t 1 ,:::, t 73 .

0.1 Probability density function (PDF)

Estimated Q

20

1.2

1

0

0

The 21th CM point The 41th CM point The 51th CM point The 61th CM point The 71th CM point The 73th CM point The actual RUL

0.09 0.08 0.07 0.06 0.05 0.04 0.03 0.02 0.01 0 0

20

40

60

80 100 120 140 160 180 200 The RUL(hours)

Fig. 6. Illustration of the PDF of the RUL at six different CM points.

230

X.-S. Si et al. / Mechanical Systems and Signal Processing 35 (2013) 219–237

200 The actual RUL The estimated RUL

180 160 RUL(hours)

140 120 100 80 60 40 20 0

0

20

40

60 80 100 120 The CM point(hours)

140

160

180

Fig. 7. The estimated mean RUL and the actual RUL at monitoring points t 0 ,t 1 ,:::,t 73 .

Our model Wang's model

Probability density function (PDF)

0.2 0.15 0.1 0.05 0 180 175 170 The CM time (hours)

3

165

0

5

10

15

20

25

The RUL (hours)

x 104 Wang's model Our model

MSEi at each CM point

2.5

2

1.5

1

0.5

0

0

20

40

60 80 100 120 The CM point(hours)

140

160

180

Fig. 8. Comparative results of our model with Wang’s model (a) the estimated PDFs of the RULs at the last six CM points, and (b) the MSEs of the RULs at all CM points (&–actual RUL).

in Wang’s model. Clearly, the PDFs of the RULs from both models can cover the actual RULs well as the obtained data are accumulated. However, it is observed from Fig. 8(a) that the PDFs of the estimated RULs are typically more dispersed when using Wang’s model. Fig. 8(b) shows the calculated MSE between the actual RUL and the estimated RUL at each CM point. We can observe that the MSEs using Wang’s model change irregularly at different sampling points with relatively large

X.-S. Si et al. / Mechanical Systems and Signal Processing 35 (2013) 219–237

231

fluctuations, particularly in the earlier stage of estimation. This implies that the estimated RUL PDFs by Wang’s model is sensitive to small changes in observations shown in Fig. 4. This may also lead to completely different health management decisions at two consecutive CM points where the actual degradation observations may only have changed a little as seen from Fig. 4. The reason for such behavior seen from Fig. 8(b) stems from the factor that the distribution pðli 9X0:i Þ of the updated drift coefficient is not considered and the estimated parameters are not recursively updated in [32]. From Fig. 8(b), we observe that our model can make the MSEs about the actual RUL at each CM point less sensitive to small changes. But most importantly our model can improve the accuracy of the estimated PDFs of the RULs with reduced variances, see Fig. 8. This reduction in variance is due to the use of the complete distribution of li via the Bayesian rule shown earlier. These comparisons reflect the superiority of our model to Wang’s model in the RUL estimation for the INS.

The drift degradation data of the INS(o/hour)

0.4 0.35 0.3 0.25 0.2 0.15 0.1 The actual degradation path Gebraeel's model with random prior parameters Gebraeel's model with appropriate prior parameters Our model with random prior parameters

0.05 0 0

20

60 80 100 120 The CM time (hours)

140

160

180

Our model Gebraeel's model with appropriate prior parameters

0.3 Probability density function (PDF)

40

0.25 0.2 0.15 0.1 0.05 0 180 175

170 The CM time (hours)

0

5

10

20

25

The RUL (hours)

Our model Gebraeel model with random prior parameters

0.3 Probability density function (PDF)

165

15

0.25 0.2 0.15 0.1 0.05 0 180 175 170 The CM time (hours)

165

0

5

10

15

20

25

The RUL (hours)

Fig. 9. Comparative results of our model with Gebraeel’s model (a) the predicted degradation path, (b) RUL estimation using appropriate prior parameters, and (c) RUL estimation using random prior parameters (&–actual RUL).

232

X.-S. Si et al. / Mechanical Systems and Signal Processing 35 (2013) 219–237

Conditional reliability at the last CM point

1 Our model Gebraeel's model

0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0

0

20

40

60

80 100 120 The RUL(hours)

140

160

180

Fig. 10. Conditional reliability at the last CM points.

Now, we further compare our model with Gebraeel’s model about their performance in the RUL estimation. The detailed implementation process of Gebraeel’s model can be found in [19]. It is noted that the model in [19] needs prior parameters as well our model. In order to compare the goodness of fit, the prior parameters of our model in the following comparison are selected at random at time zero, but we consider two cases for the prior parameters of Gebraeel’s model. One uses the appropriately selected parameters as the prior parameters and the other uses random prior parameters. Fig. 9 shows the comparative results. Fig. 9(a) shows that the predicted degradation path. The MSEs between the actual data and the corresponding predictions of the degradation path for our model with random prior parameters, Gebraeel’s model with random prior parameters, and Gebraeel’s model with appropriate prior parameters are 1.179E-4, 0.0026, and 7.9864E-4, respectively. Thus, our model has the best fitting for the degradation data. We also find that our model is robust with the choice of the prior parameters, but Gebraeel’s model is not. Fig. 9(b) illustrates several estimated PDFs of the RULs over time using Gebraeel’s model with appropriate prior parameters. It is clear that the range of PDFs of RULs covers the actual RULs, but the uncertainty in the estimated RUL of our model is less than that from Gebraeel’s model, particular for the last few sampling points, since our estimated PDFs of the RULs are sharper than the results of Gebraeel’s model. However, if the prior parameters in Gebraeel’s model are selected at random, the estimated RUL may be incorrect as illustrated in Fig. 9(c) (the observed RULs are outside of the predicted RUL PDF ranges). In comparison, our model using the random selected prior parameters can produce reasonable estimates. This demonstrates the merit of our model for RUL estimation of a particular system. Therefore, if there are no sufficient historical degradation data at the beginning for selecting the prior parameters, our model may be more appropriate for RUL estimation. As noted by Remark 2, the moments of the RUL distribution obtained by Gebraeel’s model do not exist. Therefore, we cannot calculate the MSE about the actual RUL defined in Eq. (31). But, to have a further look at the difference of the estimated RUL distributions between our exact result and Gebraeel’s approximate result, it is desired to compare the estimated reliability from both models. Here, we use F Ri 9X0:i ðr i 9X0:i Þ ¼ 1F Ri 9X0:i ðr i 9X0:i Þ as the conditional reliability obtained from our exact approach at t i , while F R0i 9X0:i ðr i 9X0:i Þ as the conditional reliability obtained from Gebraeel’s model by   PrðR0i rr i 9X0:i Þ ¼ Pr Xðr i þ t i Þ Zw9X0:i at t i . The difference between them at the last CM point is illustrated in Fig. 10. It can be found that the difference between both results is significant. This  is due in part to  the many fluctuations existing in the degradation path of gyro’s drift so that using PrðL0k r lk 9X1:k Þ ¼ Pr Xðlk þt k Þ Zw9X1:k to approximate the RUL will lead to overestimating of the reliability, as discussed in Remark 1. Consequently, when the approximated result is applied to reliability-centered maintenance, the obtained decision may be far from the reality. In sum, this practical case study demonstrates that our developed model can work well and efficiently. On the other hand, we verify that incorporating the observation history to date can improve the accuracy of the RUL estimation indeed. 5. Concluding remarks In this paper, we present a Wiener-process-based degradation model with a recursive filter algorithm and Bayesian updating to estimate the PDF of the RUL. Based on the Wiener process, we construct a RUL estimation model which depends on the system CM data to date via recursively updating. The updating is done twice in that the drift coefficient is recursively updated via the state-space model and all parameters are updated via the EM algorithm, both at the time that a new piece of CM data becomes available. During the RUL estimation, we also consider the distribution in the estimated drift coefficient and accordingly an explicit RUL distribution is obtained based on the concept of FHT. The usefulness of the proposed model is demonstrated via a real-world example for gyros in an inertial navigation system. By comparing the

X.-S. Si et al. / Mechanical Systems and Signal Processing 35 (2013) 219–237

233

proposed model with existing models, we show that the proposed model can generate better results than existing models compared both in terms of the mean and variance of the estimated RUL and is robust with respect to the choice of the prior parameters. Although the practical case studies show that our model is a simple but efficient method for RUL estimation, there are still several issues needed to be further studied. Firstly, in this paper, we only consider a Wiener process with a linear drift, but for complex systems in practice, a nonlinear model may be more appropriate. Secondly, we mainly consider the case that the failure is resulted by the gradual and continuous degradation in this paper. However, considering the sudden failure in degradation modeling and RUL estimation is interesting and of practical significance. Because the failure of complex engineering systems is often the result of gradual degradation and stresses generated either by the systems themselves or from external shocks. Such shocks frequently lead to the sudden failure. Therefore, considering both the gradual and sudden cases are necessary in future research. Thirdly, we only consider the case that the degradation process is observable in this paper. However, hidden or partially observable degradation process is frequently encountered in many engineering practices. This may encourage the research for RUL estimation subject to hidden degradation processes. The key to tackle this issue is to incorporate the estimation error or estimation uncertainty in RUL estimation due to the unobservability of the degradation state. In addition, we primarily discuss the issues associated with estimating the RUL. Much more decision-focused research under the prognostic information needs to be researched so that we can generate new insights on the effect of the estimated RUL upon decision making.

Acknowledgments The authors would like to sincerely thank and acknowledge the support and constructive comments from the Editor and the anonymous reviewers. This work was partially supported by the National 973 Project under grants 2010CB731800 and 2009CB32602, the NSFC under grants 61028010, 61021063, 60931160440, 61174030, 71071097, 71231001, 61210012, and 61104223 the National Science Fund for Distinguished Young Scholars of China under grant 61025014 and the Fundamental Research Funds for the Central Universities of China. Appendix A. Algorithm 1: Kalman filtering algorithm Step 1: Initialize l^ 0 ¼ a0 , P 0 . Step 2: State estimation at time t i P i9i1 ¼ P i19i1 þ Q K i ¼ ðt i t i1 Þ2 Pi9i1 þ s2 ðt i t i1 Þ

l^ i ¼ l^ i1 þ Pi9i1 ðti ti1 ÞK 1 xi xi1 l^ i1 ðt i t i1 Þ : i Step 3: Updating variance Pi9i ¼ P i9i1 P i9i1 ðt i t i1 Þ2 K 1 i P i9i1 :

Appendix B. Algorithm 2: Strong tracking filtering algorithm Step 1: Initialize l^ 0 ¼ a0 , P 0 , a, r. Step 2: Calculating fading factor uðt i Þ from orthogonality principle 8 < g2 ðt1 Þ, i ¼ 1 with gðt i Þ ¼ xi xi1 l^ i1 ðt i t i1 Þ V 0 ðt i Þ ¼ rV 0 ðti1 Þ þ g2 ðti Þ : , i4 1 1þr Bðt i Þ ¼ V 0 ðt i ÞQ ðt i t i1 Þ2 as2 ðt i t i1 Þ; Cðt i Þ ¼ P i19i1 ðt i t i1 Þ2 ; u0 ¼ Bðt i Þ=Cðt i Þ: ( u0 , u0 Z 1 uðt i Þ ¼ 1, u0 o 1 Step 3: State estimation P i9i1 ¼ uðt i ÞP i19i1 þ Q K i ¼ ðt i t i1 Þ2 Pi9i1 þ s2 ðt i t i1 Þ

l^ i ¼ l^ i1 þ Pi9i1 ðti ti1 ÞK 1 xi xi1 l^ i1 ðt i t i1 Þ : i

234

X.-S. Si et al. / Mechanical Systems and Signal Processing 35 (2013) 219–237

Step 4: Updating variance P i9i ¼ Pi9i1 Pi9i1 ðt i t i1 Þ2 K 1 i P i9i1 :

In Algorithm 2, a Z 1 and r denote the softening factor and the forgetting factor, respectively, which can be selected heuristically. r ¼ 0:95 has been used in general [39,51]. Appendix C. The implementation of EM algorithm for our model Following the above-mentioned procedures for the EM algorithm, the joint log-likelihood function for our problem can be expressed as ‘i ðhÞ ¼ log pðX0:i 9Ui , hÞ þ log pðUi 9hÞ Yi Yi ¼ log pðl0 9hÞ þ log j ¼ 0 pðlj 9lj1 , hÞ þ log j ¼ 0 pðxj 9lj1 , hÞ:

ðC  1Þ

  From Eqs. (7) and (8), we directly have lj 9lj1  Nðlj1 ,Q Þ, xj 9lj1  N xj1 þ lj1 ðt j t j1 Þ, s2 ðt j t j1 Þ and l0  Nða0 ,P 0 Þ. Using Eq. (C-1) and ignoring the constant terms, the joint log-likelihood function can be formulated as

Xi 2 log Q þð l  l Þ =Q 2‘i ðhÞ ¼ logP 0 ðl0 a0 Þ2 =P 0  j j1 j¼1

Xi  2 2 log s þ x x lj1 ðt j t j1 Þ =s2 ðt j t j1 Þ : ðC  2Þ  j j1 j¼1 ðkÞ To calculate the conditional expectation ‘ðh9h^ i Þ defined in Eq. (29), we have h

Xi ðkÞ 2 log Q þ ðlj lj1 Þ2 =Q 2‘ðh9h^ i Þ ¼ E ^ ðkÞ ½2‘ i ðhÞ ¼ E ^ ðkÞ logP 0 ðl0 a0 Þ =P 0  j¼1

Ui 9X1:i , hi



Xi j¼1

Ui 9X0:i , hi



 2   i : logs2 þ xj xj1 lj1 ðt j t j1 Þ = s2 ðt j t j1 Þ

ðC  3Þ

Clearly, to calculate the expectation of this expression requires to obtain E

ðkÞ

Ui 9X0:i , h^ i

2

ðlj Þ, E

ðkÞ

Ui 9X0:i , h^ i

ðlj Þ and E

ðkÞ

Ui 9X0:i , h^ i

ðlj lj1 Þ,

which are the conditional expectations with respect to Ui , given the observed history X0:i . In this paper, we use the Rauch– Tung–Striebel (RTS) smoother to provide an optimal estimation of E

ðkÞ

Ui 9X0:i , h^ i

ðlj Þ, E

2

ðkÞ

Ui 9X0:i , h^ i

ðlj Þ and E

ðkÞ

Ui 9X0:i , h^ i

ðlj lj1 Þ,

summarized as Algorithm 3, [47,54]. In Algorithm 3, we define M j9i ¼ Covðlj , lj1 9X0:i Þ. Algorithm 3. RTS smoothing algorithm Step 1: Forwards iteration by Algorithm 1 or Algorithm 2 Step 2: Backwards iteration Sj ¼ P j9j P 1 j þ 19j

l^ j9i ¼ l^ j þ Sj ðl^ j þ 19i l^ j þ 19j Þ ¼ l^ j þ Sj ðl^ j þ 19i l^ j Þ P j9i ¼ Pj9j þSj 2 ðPj þ 19i Pj þ 19j Þ Step 3: Initialize M i9i ¼ ð1K i ðt i t i1 ÞÞPi19i1 Step 4: Backwards iteration for smoothing covariance M j9i ¼ P j9j Sj1 þ Sj ðM j þ 19i P j9j ÞSj1

2

From the RTS smoothing algorithm, we can obtain the conditional expectations of E ðkÞ ðlj Þ, E ðkÞ ðlj Þ and Ui 9X0:i , h^ i Ui 9X0:i , h^ i E ðkÞ ðlj lj1 Þ in the following lemma. ^ Ui 9X0:i , hi

ðkÞ Lemma 3. Conditional on current estimated parameter h^ i and observations history X0:i , the values of E ðkÞ ðlj Þ, Ui 9X0:i , h^ i 2 E ðkÞ ðlj Þ and E ðkÞ ðlj lj1 Þ are given by Ui 9X0:i , h^ i Ui 9X0:i , h^ i ^ , E ðkÞ ðl Þ ¼ l

Ui 9X0:i , h^ i

E

ðkÞ

E

ðkÞ

Ui 9X0:i , h^ i Ui 9X0:i , h^ i

j

j9i

2 2 ðlj Þ ¼ l^ j9i þ Pj9i ,

ðlj lj1 Þ ¼ l^ j9i l^ j19i þ Mj9i ¼ Pj9j Sj1 þ Sj ðMj þ 19i Pj9jÞS

j1

þ l^ j9i l^ j19i :

ðC  4Þ

X.-S. Si et al. / Mechanical Systems and Signal Processing 35 (2013) 219–237

235

Proof: These equations are the direct results of applying the properties of variance-covariance and RTS smoothing algorithm and the proof is hence omitted. ðkÞ From Eq. (C-3) and Lemma 3, ‘ðh9h^ i Þ can be written as h

Xi ðkÞ 2 log Q þ ðlj lj1 Þ2 =Q 2‘ðh9h^ i Þ ¼ E ðkÞ ½2‘ i ðhÞ ¼ E ðkÞ logP 0 ðl0 a0 Þ =P 0  j¼1 Ui 9X0:i , h^ i Ui 9X0:i , h^ i

Xi  2   i  logs2 þ xj xj1 lj1 ðt j t j1 Þ = s2 ðt j t j1 Þ j¼1

Xi log Q þ ðC j9i 2C j,j19i þC j19i Þ=Q ¼ logP 0 ðC 09i 2a09i a0 þ a0 2 Þ=P 0  j¼1

  Xi  2   logs2 þ xj xj1 2l^ j19i xj xj1 ðt j t j1 Þ þðt j t j1 Þ2 C j19i = s2 ðt j t j1 Þ :  j¼1

ðC  5Þ

This completes the E-step and in the following we handle the M-step. ðkÞ ðk þ 1Þ After obtaining ‘ðh9h^ i Þ, the results of estimated parameter h^ i in the (kþ1)th step can be summarized in the following theorem. ðk þ 1Þ ðkÞ , by maximizing ‘ðh9h^ i Þ, is given by Theorem 2. h^ i

a0i ðk þ 1Þ ¼ a09i , P0i ðk þ 1Þ ¼ C 09i a09i 2 ¼ P09i , 1 Xi ðC j9i 2C j,j19i þC j19i Þ, Q i ðk þ 1Þ ¼ j¼1 i 0 1 2   xj xj1 2l^ j19i xj xj1 ðt j t j1 Þ þ ðt j t j1 Þ2 C j19i  2  ðk þ 1Þ 1 Xi @ A: s i ¼ j¼1 i t j t j1 ðlj Þ,a09i ¼ l^ 09i ,C j,j19i ¼ E 2

with C j9i ¼ E

ðkÞ

Ui 9X0:i , h^ i

ðkÞ

Ui 9X0:i , h^ i

ðC  6Þ

ðk þ 1Þ ðlj lj1 Þ and h^ i is uniquely determined and located at the maximum.

Proof: See Appendix D. The preceding derivations are summarized as Algorithm 4 in Appendix E via a complete specification of the EM-based algorithm for estimating the parameters in h when new observation xi is available. Appendix D. The proof of Theorem 2 ðk þ 1Þ ðkÞ The unknown parameters h^ i can be obtained by maximizing the ‘ðh9h^ i Þ with h. Therefore, the central goal in this ðkÞ step is to find the maximum of function ‘ðh9h^ i Þ on h, i.e.     ðk þ 1Þ ðkÞ h^ i ¼ argmax E ¼ argmax ‘ðh9h^ i Þ ^ ðkÞ ‘ i ðhÞ

h

Ui 9X0:i , hi

ðD  1Þ

h

ðkÞ @‘ðh9h^ i Þ=@h,

ðkÞ @‘ðh9h^ i Þ=@h

we obtain the solution by ¼ 0, which leads to Eq. (36), and taking From Eq. (C-5), taking ðkÞ @2 ‘ðh9h^ i Þ=@h@hT , the following is obtained, 2 3 2=P 0 2ða0 a09i Þ=P 20 0 0 6 7 ðkÞ 2 2 3 2 7 0 0 @2 ‘ðh9h^ i Þ 16 6 2ða0 a09i Þ=P 0 1=P0 2ðC 09i 2a09i a0 þ a0 Þ=P0 7 ¼ 6 ðD  2Þ 7, T 7 26 @h@h 0 0 i=Q 2 2c=Q 3 0 4 5 0 0 0 i=s4 2f=s6 with



Pi



Pi

j ¼ 1 ðC j9i 2C j,j19i þ C j19i Þ: j¼1

  2 ðxj xj1 Þ 2l^ j19i ðxj xj1 Þðtj tj1 Þ þ ðtj tj1 Þ2 C j19i

ðD  3Þ

t j tj1

ðk þ 1Þ We show that the matrix in (D-2) is negative definite at h ¼ h^ i , by calculating the order principal minor determinant as follows, ! 2 ða0 a09i Þ2 1 1 1 2ðC 09i 2a09i a0 þ a0 Þ D1 ¼  , D2 ¼   ,  P0 2P0 P 20 P 40 P 30

D3 ¼

    1 i 2c 1 i 2f  D , D ¼  D3 : 2 4 2 Q2 Q3 2 s4 s6

236

X.-S. Si et al. / Mechanical Systems and Signal Processing 35 (2013) 219–237 ðk þ 1Þ

Then, at h ¼ h^ i

D1 9

ðk þ 1Þ

¼

1 o0, P09i

ðk þ 1Þ

¼

1 2P09i

h ¼ h^ i

D2 9

h ¼ h^ i

D3 9



D4 9

, the following results are obtained,

ðk þ 1Þ h^ i

ðk þ 1Þ

h ¼ h^ i

¼

1 P 09i

 2

2ðC 09i 2a09i a09i þ a09i 2 Þ P 09i 3

! 

ða09i a09i Þ2 P 09i 4

¼

1 2P09i

1 P 09i

 2

2

!

P 09i 2

¼

1 2P 09i 3

4 0,

! 1 i3 2i3 c i3  D2 9 ^ ðk þ 1Þ ¼  2 D2 9 ^ ðk þ 1Þ o 0, 2 3 h ¼ h h ¼ hi 2 c i 2c c

!   1 i 2f 1 i3 2i3 f i3 ¼  D 9  D 9 D 9 ðk þ 1Þ ¼ ðk þ 1Þ ¼  ðk þ 1Þ 4 0: 3 3 ^ ^ 2 3 h ¼ h^ h ¼ hi h ¼ hi 2 s4 s6 2 f2 i 2f f3

ðk þ 1Þ ðk þ 1Þ , verifying that h^ i is located at a This completes the proof that the matrix in (D-2) is negative definite at h ¼ h^ i ðk þ 1Þ ðk þ 1Þ ðkÞ is unique since h^ i is the only solution satisfying @‘ðh9h^ i Þ=@h ¼ 0. maximum. In addition, h^ i

Appendix E. Algorithm 4: EM algorithm for parameter estimation ð0Þ 1) Initialization Initialize the initial parameters in h^ i . 2) E-Step Calculate the expectation quantities defined in Eq. (C-3) using Algorithm 3 and Eq. (C-4), with state-space model ðkÞ of Eqs. (7) and (8) parameterized by h^ i .

3)

ðkÞ

M-Step Maximize ‘ðh9h^ i Þ using Eq. (C-6) to obtain the updated parameter estimates by Theorem 2. 4) Test convergence Test the convergence of the algorithm. If converged, then stop. Otherwise set k ¼ k þ1, go to Step 2 and repeat.

References [1] M. Pecht, Prognostics and Health Management of Electronics, John Wiley, New Jersey, 2008. [2] X.S. Si, W. Wang, C.H. Hu, D.H. Zhou, Remaining useful life estimation–a review on the statistical data driven approaches, Eur. J. Oper. Res. 213 (2011) 1–14. [3] F. Camci, R.B. Chinnam, Health state estimation and prognostics in machining processes, IEEE Trans. Autom. Sci. Eng. 7 (2010) 581–597. [4] M.J. Carr, W. Wang, Modeling failure modes for residual life prediction using stochastic filtering theory, IEEE Trans. Rel. 59 (2010) 346–355. [5] M. Dong, D. He, Hidden semi-Markov model-based methodology for multi-sensor equipment health diagnosis and prognosis, Eur. J. Oper. Res. 178 (2007) 858–878. [6] Y. Peng, M. Dong, A prognosis method using age-dependent hidden semi-Markov model for equipment health prediction, Mech. Syst. Signal Process. 25 (2011) 237–252. [7] J. Sikorska, M. Hodkiewicz, L. Ma, Prognostic modelling options for remaining useful life estimation by industry, Mech. Syst. Signal Process. 25 (2011) 1803–1836. [8] M.I. Mazhar, S. Kara, H. Kaebernick, Remaining life estimation of used components in consumer products life cycle data analysis by Weibull and artificial neural networks, J. Oper. Manag. 25 (2007) 1184–1193. [9] W. Wang, A two-stage prognosis model in condition based maintenance, Eur. J. Oper. Res. 182 (2007) 1177–1187. [10] M.Y. You, L. Li, G. Meng, J. Ni, Statistically planned and individual improved predictive maintenance management for continuously monitored degrading systems, IEEE Trans. Rel. 59 (2010) 744–753. [11] A.K.S. Jardine, D. Lin, D. Banjevic, A review on machinery diagnostics and prognostics implementing condition-based maintenance, Mech. Syst. Signal Process. 20 (2006) 1483–1510. [12] M.-L.T. Lee, G.A. Whitmore, Threshold regression for survival analysis: modeling event times by a stochastic process reaching a boundary, Stat. Sci. 21 (2006) 501–513. [13] L.A. Escobar, W.Q. Meeker, A review of accelerated test models, Stat. Sci. 21 (2006) 552–577. [14] V.R. Joseph, I.T. Yu, Reliability improvement experiments with degradation data, IEEE Trans. Rel. 55 (2006) 149–157. [15] C.J. Lu, W.Q. Meeker, Using degradation measures to estimate a time-to-failure distribution, Technometrics 35 (1993) 161–174. [16] J.I. Park, S.J. Bae, Direct prediction methods on lifetime distribution of organic light-emitting diodes from accelerated degradation tests, IEEE Trans. Rel. 59 (2010) 74–90. [17] M.D. Pandey, X.-X. Yuan, J.M. van Noortwijk, The influence of temporal uncertainty of deterioration on life-cycle management of structures, Struct. Infrastruct. Eng. 5 (2009) 145–156. [18] S. Bloch-Mecier, A preventive maintenance policy with sequential checking procedure for a Markov deteriorating system, Eur. J. Oper. Res. 147 (2002) 548–576. [19] N. Gebraeel, M.A. Lawley, R. Li, J.K. Ryan, Residual-life distributions from component degradation signals: a Bayesian approach, IIE Trans. 37 (2005) 543–557. [20] M.C. Delia, P.O. Rafael, A maintenance model with failures and inspection following Markovian arrival processes and two repair modes, Eur. J. Oper. Res. 186 (2008) 694–707. [21] H.T. Liao, E.A. Elsayed, Reliability inference for field conditions from accelerated degradation testing, Nav. Res. Log. 53 (2006) 576–587. [22] C.T. Barker, M.J. Newby, Optimal non-periodic inspection for a multivariate degradation model, Reliab. Eng. Sys. Safe. 94 (2009) 33–43. [23] S.T. Tseng, J. Tang, L.H. Ku, Determination of optimal burn-in parameters and residual life for highly reliable products, Nav. Res. Log. 50 (2003) 1–14. [24] G.A. Whitmore, F. Schenkelberg, Modelling accelerated degradation data using Wiener diffusion with a time scale transformation, Lifetime Data Anal. 3 (1997) 27–45. [25] C.Y. Peng, S.T. Tseng, Mis-specification analysis of linear degradation models, IEEE Trans. Rel. 58 (2009) 444–455.

X.-S. Si et al. / Mechanical Systems and Signal Processing 35 (2013) 219–237

[26] [27] [28] [29] [30] [31] [32] [33] [34] [35] [36] [37] [38] [39] [40] [41] [42] [43] [44] [45] [46] [47] [48] [49] [50] [51] [52] [53] [54]

237

X. Wang, Wiener processes with random effects for degradation data, J. Multivariate Anal. 101 (2010) 340–351. A. Ray, S. Tangirala, Stochastic modeling of fatigue crack dynamics for on-line failure prognostics, IEEE Trans. Control Syst. Techn. 4 (1996) 443–450. S.T. Tseng, C.Y. Peng, Optimal burn-in policy by using an integrated Wiener process, IIE Trans. 36 (2004) 1161–1170. M. Crowder, J. Lawless, On a scheme for predictive maintenance, Eur. J. Oper. Res. 176 (2007) 1713–1722. J. Tang, T.S. Su, Estimating failure time distribution and its parameters based on intermediate data from a Wiener degradation model, Nav. Res. Log. 55 (2008) 265–276. Z.G. Xu, Y.D. Ji, D.H. Zhou, Real-time reliability prediction for a dynamic system based on the hidden degradation process identification, IEEE Trans. Rel. 57 (2008) 230–242. W. Wang, M. Carr, W. Xu, A.K.H. Kobbacy, A model for residual life prediction based on Brownian motion with an adaptive drift, Microeletron. Rel. 51 (2011) 285–293. A.H. Elwany, N.Z. Gebraeel, Real-time estimation of mean remaining life using sensor-based degradation models, J. Manuf. Sci. Eng. 131 (2009). 051005-1-9. A.H. Christer, W. Wang, A model of condition monitoring of a production plant, Int. J. Prod. Res. 30 (1992) 2199–2211. R. Li, J.K. Ryan, A Bayesian, inventory model using real-time condition monitoring information, Prod. Oper. Manag. 20 (2011) 754–771. D.R. Cox, H.D. Miller, The Theory of Stochastic Processes, Methuen and Company, London, 1965. D.C. Montgomery, Introduction to Statistical Quality Control, 6th Edition, John Wiley, New York, 2008. A.C. Harvey, Forecasting, Structural Time Series Models and the Kalman Filter, Cambradge University Press, Cambridge, U.K., 1989. D.H. Zhou, P.M. Frank, Strong tracking filtering of nonlinear time-varying stochastic systems with colored noise: application to parameter estimation and empirical robustness analysis, Int. J. Control. 65 (1996) 295–307. G.A. Whitmore, Normal-Gamma mixture of inverse Gaussian distributions, Scand. J. Statist. 13 (1986) 211–220. O.O. Aalen, Effects of frailty in survival analysis, Stat. Methods Med. Res. 3 (1994) 227–243. C. Derman, G.J. Lieberman, S.M. Ross, On the use of replacements to extend system life, Oper. Res. 32 (1984) 616–627. S.M. Shechter, M.D. Bailey, A.J. Schaefer, Replacing nonidentical vital components to extend system life, Nav. Res. Log. 55 (2008) 700–703. A.P. Dempster, N.M. Laird, D.B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, J. Roy. Stat. Soc. B 39 (1977) 1–38. T. Kailath, A. Sayed, B. Hassabi, Linear Estimation, Prentice-Hall, Upper Saddle River, NJ, 2000. C.F.J. Wu, On the convergence property of the EM algorithm, Ann. Statist. 11 (1983) 95–103. M. Dewar, K. Scerri, V. Kadirkamanathan, Data-driven spatio-temporal modeling using the integro-difference equation, IEEE Trans. Signal Process. 57 (2009) 83–91. S. Gibson, A. Wills, B. Ninness, Maximum-likelihood parameter estimation of bilinear systems, IEEE Trans. Automat. Contr. 50 (2005) 1581–1596. A. Wills, B. Ninness, S. Gibson, Maximum likelihood estimation of state space models from frequency domain data, IEEE Trans. Automat. Contr. 54 (2009) 19–33. ¨ A. Wills, B. Ninness, System identification of nonlinear state-space models, Automatica 47 (2011) 39–49. T.B. Schon, D.H. Zhou, P.M. Frank, Fault diagnostics and fault tolerant control, IEEE Trans. Aero. Electr. Sys. 34 (1998) 420–427. D.J. Jwo, S.H. Wang, Adaptive fuzzy strong tracking extended Kalman filtering for GPS navigation, IEEE Sens. J. 7 (2007) 778–789. O.J. Woodman, An introduction to inertial navigation, Technical report, published by the University of Cambridge Computer Laboratory, 2007, /http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-696.htmlS, available via Internet. H.E. Rauch, F. Tung, C.T. Striebel, Maximum Likelihood estimates of linear dynamic systems, AIAA J. 3 (1965) 1145–1450.