ENTROPY OPTIMIZATION FILTERING FOR FAULT ISOLATION OF NON-GAUSSIAN SYSTEMS

Lei Guo*  Hong Wang**

* Research Institute of Automation, Southeast University, Nanjing 210096, China. Email: [email protected]
** Control Systems Centre, The University of Manchester, M60 1QD, UK. Email: [email protected]
Abstract: In this paper, the fault isolation (FI) problem is investigated for nonlinear non-Gaussian dynamic systems with multiple faults (or abrupt changes of system parameters) in the presence of noises. By constructing a filter to estimate the states, the FI problem is reduced to an entropy optimization problem subject to the non-Gaussian estimation error dynamics. The design objective for the FI purpose is that the entropy of the estimation error is maximized in the presence of the diagnosed fault and minimized in the presence of the nuisance faults or noises. It is shown that the error dynamics is represented by a nonlinear non-Gaussian stochastic system, for which new relationships are applied to formulate the PDFs of the stochastic error in terms of the PDFs of the noises and faults. Renyi's entropy is used to simplify the computations in the recursive filter design algorithms. It is noted that the output can be supposed to be immeasurable (but with known stochastic distributions), which is different from the existing results, where the output is always measurable for feedback. Copyright © 2006 IFAC.

Keywords: Fault isolation; nonlinear filtering; stochastic systems; optimal control; entropy optimization; non-Gaussian systems.
1. INTRODUCTION

Fault detection and isolation (FDI) for stochastic systems has drawn considerable attention in the past decades, and many effective methodologies have been developed, as seen from the survey works (Basseville and Nikiforov, 2002; Frank and Ding, 1997; Isermann and Balle, 1997; Patton and Chen, 1996). Two kinds of approaches have been developed for the related FDI problems of stochastic systems. One originated from statistical theory, where likelihood ratio and Bayesian methods are applied to estimate the fault (or the abrupt change of the parameters), combined with numerical computations such as the
Monte Carlo method or particle filtering (see e.g. (Basseville and Nikiforov, 2002; Bar-Shalom et al., 2001; Isermann and Balle, 1997; Zhang et al., 1998) and the references therein). The other group of methodologies is the filter-based methods, where minimax techniques are used such that the estimation errors satisfy various performance indexes (see (Patton and Chen, 1996; Chen et al., 2003) and the references therein). Many previous filtering results on FDI problems study linear systems with Gaussian variables, for which the Kalman filter or the extended Kalman filter (EKF) can be applied (Chen et al., 2003). For nonlinear (stochastic) systems, some transformations or approximations are required such that the error equation satisfies some linear structure (see e.g. (Chen and Saif, 2003; Kabore and Wang, 2001; Zhang et al., 2001) and the references therein). In (Stoorvogel et al., 2002; Guo and Wang, 2005), robust control approaches such as H2 and H∞ optimization have been applied to design FDI filters for linear systems with uncertainty.

Even for systems with Gaussian noises, nonlinearity can lead to a non-Gaussian output. For a non-Gaussian variable, especially one with a non-symmetric probability density function (PDF), the mean and variance are insufficient to characterize its stochastic properties. Hence, FDI for non-Gaussian and nonlinear systems is a challenging subject, because the EKF cannot be fully applied as it only gives a minimum-variance estimation. In this regard, another general measure of randomness, namely the entropy, should be considered. Entropy has been widely used in information theory, thermodynamics and control theory (Erdogmus and Principe, 2002; Feng and Loparo, 1997; Papoulis, 1991; Renyi, 1987; Wang, 2002). In (Guo and Wang, 2006b), entropy was used as a measure of the average information contained in the PDF of the estimation errors of the filters. Since a PDF is generally a positive nonlinear function subject to an integral constraint, the main obstacle is to find the relationships between the PDFs of the inputs and the concerned errors, with which the entropy of the distribution of the errors can be optimized (Guo and Wang, 2006b). In (Wang and Lin, 2000), B-spline expansions were used for the PDFs of the system output, and the problem was transformed into a fault detection and diagnosis (FDD) problem for the weight dynamics governed by linear equations. Recently, the results in (Guo and Wang, 2005) addressed a solution to the FDD problem by using the measured output PDFs and provided a robust FDD scheme for nonlinear weighting models.
However, the results in (Guo and Wang, 2005; Wang and Lin, 2000) hold only when the output PDFs can be approximated by B-spline expansions and the weighting dynamics between the control input and the weights can be modelled on-line. Also, the FI problem was not studied there. Recently, some preliminary results on minimum entropy filtering were reported in (Guo and Wang, 2006b) and applied to fault detection in (Guo and Wang, 2006a), where neuro-fuzzy approximations of the PDFs are not required. It is noted that only fault detection was considered there, and the design procedures involved heavy computations. Moreover, the outputs were supposed to be measured, as in the classical filter-based approaches.
In this paper, the main contribution is a novel filtering approach to isolate the target fault for nonlinear non-Gaussian systems. Two cases will be considered. The first is that the output is measured, as studied in the classical methodologies. The second is that only the stochastic distribution of the output is supposed to be measurable. By constructing a filter to generate a residual, we present an entropy optimization principle to design the filter gain such that the target fault can be isolated from the nuisance faults in the presence of the disturbances. In order to simplify the computations, some auxiliary mappings are constructed to formulate the error PDFs in terms of the input PDFs, and Renyi's entropy is adopted in the design algorithms, which also differs from the previous results in (Guo and Wang, 2006b; Guo and Wang, 2006a).
2. PROBLEM FORMULATION

2.1 Plant Models and Fault Isolation Problems

Consider a nonlinear stochastic discrete-time system described by

x_{k+1} = f(x_k, δ_k, η_k, w_k),  y_k = h(x_k, v_k)   (1)

where x_k is the state sequence, y_k is the output sequence, and w_k and v_k are the stochastic disturbances acting on the state equation and the output equation, respectively. δ_k and η_k are two faults, where δ_k is the target fault to be isolated and η_k is the nuisance fault. It is noted that δ_k and η_k can also represent abrupt changes of the model parameters. This model is actually a generalization of those which only have additive fault inputs or unexpected changes of model parameters. It is supposed that δ_k, η_k, w_k and v_k are arbitrary bounded independent random vectors, rather than the Gaussian ones considered in the existing FI approaches based on Kalman filtering theory. The following assumptions are required to simplify the filter design procedures; they can be satisfied in many practical cases.

Assumption 1. f(·) and h(·) are two known Borel measurable and smooth nonlinear functions of their arguments, and f(0, 0, 0, 0) = 0 holds.

Assumption 2. The exogenous inputs δ_k, η_k, w_k and v_k (k = 0, 1, 2, ···) are bounded, mutually independent random variables with known PDFs γ_δ(z), γ_η(z), γ_w(z) and γ_v(z) defined on a bounded interval [a, b], respectively. The initial value x(0) is also supposed to be a random variable with a known PDF γ_{x_0}(z).
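Assumption 2 takes the input PDFs as known on a bounded interval. When they must instead be estimated from recorded data, a Parzen-window (kernel) estimate of the kind cited later in this paper can be used. The following sketch is illustrative only: the mixture data, interval and bandwidth are assumptions, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Bounded, non-Gaussian fault/noise samples on [a, b] = [-1, 1]:
# a two-component, non-symmetric mixture (hypothetical example data).
a, b = -1.0, 1.0
samples = np.concatenate([
    rng.beta(2, 5, 5000),             # skewed component on [0, 1]
    0.5 * rng.beta(5, 2, 5000) - 0.9  # second component inside [-0.9, -0.4]
])

def kde(z, data, h=0.05):
    """Parzen-window (Gaussian kernel) estimate of the PDF at points z."""
    u = (z[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))

z = np.linspace(a, b, 401)
gamma_w = kde(z, samples)        # estimated PDF, playing the role of gamma_w(z)
print(np.trapz(gamma_w, z))      # close to 1: a valid density on [a, b]
```

The bandwidth h trades bias against variance; standard selection rules are discussed in the Silverman (1986) reference cited above.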
The PDFs γ_δ(z), γ_η(z), γ_w(z) and γ_v(z) can be obtained by direct measurement using advanced instruments (e.g. a camera observing the flame in a combustion process), or estimated by data processing techniques (e.g. the kernel estimation technique applied to open-loop tests (Erdogmus and Principe, 2002; Schioler and Hartmann, 1992; Silverman, 1986)).

To provide a feasible FI approach with explicit design procedures, in this paper we propose a novel FI filtering approach based on an entropy optimization principle.

2.2 Fault Isolation Filter and Estimation Error

For the nonlinear dynamic system (1), a filter can be described by

x̂_{k+1} = f(x̂_k, 0, 0, 0) + U_k (y_k − h(x̂_k, 0)),  ŷ_k = h(x̂_k, 0)   (2)

where U_k is the filter gain to be determined. The resulting estimation error e_k = x_k − x̂_k satisfies

e_{k+1} = f(x_k, δ_k, η_k, w_k) − f(x̂_k, 0, 0, 0) − U_k (y_k − ŷ_k).   (3)

It is noted that for the filter-based fault diagnosis task, the residual variable ê_k = h(x_k, v_k) − h(x̂_k) should be considered rather than e_k. However, under Assumption 1, it can be verified that h(x_k) − h(x̂_k) can be rewritten as h(x_k) − h(x̂_k) = L_k e_k + ρ_k(x, x̂), where L_k is a known constant and ρ_k(x, x̂) is an unknown function satisfying M_1 ≤ ρ_k(x, x̂) ≤ M_2 for given constants M_1 and M_2. Then the entropy of the random variable ê_k satisfies (see (Guo and Wang, 2006a) for details)

H(ê_k) ≤ H(e_k) + ln|L_k| + ln|M_2 − M_1|.

In this paper, to save space, the optimization will therefore focus on H(e_k). U_k will be designed such that e_{k+1} is affected primarily by the target fault δ_k and minimally by the others, including η_k, w_k and v_k. As the estimation error, e_{k+1} ∈ [a, b], where a and b can be chosen as −∞ and +∞, respectively. In this note, differently from most previous works on filter or observer theory, two cases for the output y_k will be considered.

Case 1: y_k is measurable at every sample time k.

Case 2: y_k is immeasurable at every sample time k, but its PDF γ_{y_k}(z) can be measured or estimated.

Remark 1. Case 1 is the conventional setting in the classical filter- or observer-based FI setup. On the other hand, Case 2 is frequently encountered in networked monitoring systems and chemical processes, where only the statistical information γ_{y_k}(z) can be measured or estimated rather than y_k itself (see (Guo and Wang, 2005; Yue and Wang, 2003)). For example, one typical potential application is the FI problem for the molecular weight distribution (MWD) model in polymerization processes, where the molecular weights (MWs) are usually difficult to measure. However, the distributions of the MWs can be measured, and a model for the molecular variable can be identified (Yue and Wang, 2003). This case has received less attention, since most previous filtering approaches require the deterministic information of y_k.

In the following sections, we will establish the relationships between the PDFs of the inputs and of the estimation errors for the two different cases.

2.3 PDFs of the Error for Case 1

Under Case 1, in (3) the variable

θ_{1k} := −U_k y_k − f(x̂_k, 0, 0, 0) + U_k ŷ_k   (4)

can be regarded as a deterministic term with the unknown gain U_k at sample time k, so that e_{k+1} = f(x_k, δ_k, η_k, w_k) + θ_{1k}. The main obstacle is that f(x_k, δ_k, η_k, w_k) is both nonlinear and stochastic at sample time k.

We consider the formulation of γ_{x_k} and γ_{e_k} recursively. Suppose the initial conditions are given by x_0, x̂_0 and U_0 = 0. It is noted that (1) reduces to x_1 = f(x_0, δ_0, η_0, w_0). Using the PDFs of x_0, δ_0, η_0 and w_0 under Assumption 2, we can formulate the joint PDF of (x_0, δ_0, η_0, w_0) as

γ_{x_0 δηw}(τ_1, τ_2, τ_3, τ_4) = γ_{x_0}(τ_1) γ_δ(τ_2) γ_η(τ_3) γ_w(τ_4).

As such, the PDF of x_1 can be given in terms of multiple integrals based on the transformation theory of probability distributions. Recursively denoting the PDFs of x_k and e_k by γ_{x_k}(τ) and γ_{e_k}(τ), we can obtain the following result.

Lemma 1. For Case 1, at sample time k (k = 1, 2, ···), γ_{x_{k+1}}(τ) can be computed by

γ_{x_{k+1}}(τ) = ∂/∂τ [ ∫∫∫∫_{Ω_k} γ_{x_k δηw}(τ_1, τ_2, τ_3, τ_4) dτ_1 dτ_2 dτ_3 dτ_4 ]   (5)

where, for a given τ, the integration domain is

Ω_k := {(τ_1, τ_2, τ_3, τ_4) | f(τ_1, τ_2, τ_3, τ_4) ≤ τ}.

Thus, we have γ_{e_k}(τ) = γ_{x_k}(τ − θ_{1k}), where θ_{1k} is denoted by (4).
Proof is omitted for brevity.
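Lemma 1 expresses γ_{x_{k+1}} as the τ-derivative of the probability mass of the sub-level set Ω_k. The same construction can be mimicked numerically by differentiating an empirical CDF built from samples; the scalar map f and the input distributions below are illustrative assumptions, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical scalar dynamics f and bounded non-Gaussian inputs.
def f(x, delta, eta, w):
    return 0.5 * np.tanh(x) + 0.3 * delta + 0.2 * eta + w

n = 200_000
x     = rng.uniform(-1, 1, n)             # stands in for gamma_{x_k}
delta = rng.beta(2, 5, n)                 # target-fault density gamma_delta
eta   = rng.beta(5, 2, n) - 0.5           # nuisance-fault density gamma_eta
w     = rng.triangular(-0.2, 0.0, 0.2, n)

s = f(x, delta, eta, w)                   # samples of f(x_k, delta_k, eta_k, w_k)

# Eq. (5): gamma_{x_{k+1}}(tau) = d/dtau P{ f(...) <= tau }, i.e. the
# tau-derivative of the probability mass of Omega_k, here approximated by
# differentiating the empirical CDF on a grid.
tau = np.linspace(s.min(), s.max(), 301)
cdf = np.searchsorted(np.sort(s), tau) / n
pdf = np.gradient(cdf, tau)               # numerical derivative w.r.t. tau

print(np.trapz(pdf, tau))                 # integrates to ~1
```

The shift γ_{e_k}(τ) = γ_{x_k}(τ − θ_{1k}) then only relocates this density by the deterministic amount θ_{1k}.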
2.4 PDFs of the Error for Case 2

Under Case 2, corresponding to (4), we denote

θ_{2k} := −f(x̂_k, 0, 0, 0) + U_k ŷ_k   (6)

which can be regarded as a deterministic term with the unknown gain U_k. The remaining two terms of (3) are both nonlinear and stochastic. This means that the statistics of y_k, rather than y_k itself, have to be used in the formulation of γ_{e_k}(τ) for the FI purpose. In this case, initially (1) reduces to

x_1 = f(x_0, δ_0, η_0, w_0),  y_0 = h(x_0, v_0).

To determine the statistics of x_1 and e_1, not only the statistics of x_0, δ_0, η_0 and w_0 but also those of y_0 should be used. Recursively, at every sample time k, θ_{1k} is unavailable for computing γ_{e_k}(τ), which is different from Case 1. As such, we denote

d_{k+1} = f(x_k, δ_k, η_k, w_k) − U_k y_k   (7)

which is a sum of two independent random variables.

Lemma 2. For Case 2, at sample time k, the PDF of d_{k+1} can be given by

γ_{d_{k+1}}(τ) = ∫_a^b γ_{x_{k+1}}(τ − ρ) γ_{y_k}(U_k^{−1} ρ) U_k^{−1} dρ   (8)

where γ_{x_{k+1}}(τ) can also be calculated by (5). Thus, we have γ_{e_k}(τ) = γ_{d_k}(τ − θ_{2k}).

Proof: In this case e_{k+1} = d_{k+1} + θ_{2k}, while d_{k+1} denoted in (7) possesses the form of a sum of two random variables. Since γ_{y_k}(ρ) is supposed to be known, the sum (convolution) operation can be used to represent γ_{d_{k+1}}(τ) as (8). The rest of the proof is similar to that of Lemma 1. Q.E.D.

At this stage, γ_{e_k}(τ) can be represented by γ_δ(τ), γ_η(τ), γ_w(τ), γ_v(τ) and γ_{x_{k−1}}(τ), as well as the under-determined gain U_k. However, it is noted that in the above results the formulations of the entropies are reduced to the derivatives of multiple integrals, which depend on their integration domains.
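Lemma 2 is a convolution formula: d_{k+1} is the sum of two independent terms, so its density is the convolution of their densities. A discrete-grid check can be sketched as follows; the Gaussian-shaped stand-in for γ_{x_{k+1}}, the uniform output density and the value of U_k are all illustrative assumptions.

```python
import numpy as np

# Numerical version of eq. (8): the density of a sum of two independent
# variables is the convolution of their densities.
tau = np.linspace(-4, 4, 801)
dtau = tau[1] - tau[0]

# Stand-in for gamma_{x_{k+1}}: a Gaussian-shaped density (illustrative).
gamma_s = np.exp(-0.5 * (tau / 0.5) ** 2) / (0.5 * np.sqrt(2 * np.pi))

# Stand-in for the -U_k y_k term: y_k uniform on [-1, 1], U_k = 0.8, so the
# scaled density is 1/(2 U_k) on [-U_k, U_k].
U_k = 0.8
gamma_t = np.where(np.abs(tau) <= U_k, 1.0 / (2 * U_k), 0.0)

# Discrete convolution on the common grid; the dtau factor makes it
# approximate the integral in (8).
gamma_d = np.convolve(gamma_s, gamma_t, mode="same") * dtau
print(np.trapz(gamma_d, tau))   # ~1: the result is still a density
```

With `mode="same"` on a symmetric grid, the result stays aligned with `tau`, so the deterministic shift by θ_{2k} in γ_{e_k}(τ) = γ_{d_k}(τ − θ_{2k}) can be applied afterwards by re-indexing.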
3. ENTROPY OPTIMIZATION FILTERING

3.1 Entropy Optimization Principle

The entropy of e_k can be given by

H(e_k) = −∫_a^b γ_{e_k}(τ) ln(γ_{e_k}(τ)) dτ,  γ_{e_k}(τ) > 0   (9)

which is also a functional of γ_δ(τ), γ_w(τ), γ_v(τ) and γ_{x_k}(τ), as well as the undetermined gain U_k. It should be pointed out that H(e_k) is actually a conditional entropy, which can be further represented by H(e_k | x_k, δ_{k−1}, η_{k−1}, w_{k−1}, v_{k−1}, U_{k−1}). A desired FI approach should make the error maximally affected by δ_k and minimally affected by η_k, w_k and v_k. For this purpose, it is expected that the entropy transferred from δ_k to the error should be maximized, while that transferred from the other variables, including η_k, w_k and v_k, should be minimized. As such, e_k should be formulated in two different situations. One corresponds to the case in the presence of δ_k but in the absence of η_k, w_k and v_k, where the error is denoted as e¹_k. Concretely, the error dynamics is driven by

e¹_{k+1} = f_1(x¹_k, δ_k) + θ_{1k}   (10)

where f_1(x¹_k, δ_k) = f(x¹_k, δ_k, 0, 0), θ_{1k} is defined by (4), and the corresponding state x¹_k satisfies x¹_{k+1} = f(x¹_k, δ_k, 0, 0). The other situation corresponds to the case in the presence of η_k, w_k and v_k but in the absence of δ_k, where the error is denoted as e²_k, driven by

e²_{k+1} = f_2(x²_k, η_k, w_k) + θ_{2k}   (11)

where f_2(x²_k, η_k, w_k) = f(x²_k, 0, η_k, w_k) − U_k y_k, θ_{2k} is defined by (6), and the corresponding state satisfies x²_{k+1} = f(x²_k, 0, η_k, w_k). The design objective is to find U_k at each step such that the entropy resulting from the first situation is maximized whilst the entropy resulting from the second situation is minimized. Thus, the following two entropies have to be considered:

H(e¹_{k+1}) = H(e¹_{k+1} | x_{k+1}, δ_k, 0, 0, U_k)   (12)
H(e²_{k+1}) = H(e²_{k+1} | x_{k+1}, 0, w_k, v_k, U_k)   (13)

where 0 means that the corresponding argument is set to the deterministic constant 0, i.e., the corresponding variables make no contribution to the resultant error. For a given τ, the following two sets are denoted as the integration domains:

Π¹_k := {(τ_1, τ_2) | f(τ_1, τ_2, 0, 0) ≤ τ}
Π²_k := {(τ_1, τ_2, τ_3) | f(τ_1, 0, τ_2, τ_3) ≤ τ}.

Similarly to Lemma 1, the formulation of γ_{e¹_k} for Case 1 can be reduced as follows.

Theorem 1. For Case 1, at sample time k (k = 1, 2, ···), γ_{x¹_{k+1}}(τ) can be computed by

γ_{x¹_{k+1}}(τ) = ∂/∂τ [ ∫∫_{Π¹_k} γ_{x¹_k δ}(τ_1, τ_2) dτ_1 dτ_2 ].   (14)

Thus, we have γ_{e¹_k}(τ) = γ_{x¹_k}(τ − θ_{1k}), where θ_{1k} is denoted by (4).

Proof: This result can be proved similarly to Lemma 1. Q.E.D.

The results for Case 2 can be obtained along a similar route, following Lemma 2 and Theorem 1.

Theorem 2. For Case 2, at sample time k, the PDF of ϱ_{k+1} = f_2(x²_k, η_k, w_k) can be given by

γ_{ϱ_{k+1}}(τ) = ∫_a^b γ_{x²_{k+1}}(τ − ρ) γ_{y_k}(U_k^{−1} ρ) U_k^{−1} dρ   (15)

where

γ_{x²_{k+1}}(τ) = ∂/∂τ [ ∫∫∫_{Π²_k} γ_{x²_k ηw}(τ_1, τ_2, τ_3) dτ_1 dτ_2 dτ_3 ].

Thus, we have γ_{e²_k}(τ) = γ_{ϱ_k}(τ − θ_{2k}).

To isolate the fault and attenuate the disturbances, we propose the following performance index, called the entropy optimization principle (EOP):

J_N = Σ_{k=1}^{N} [ −H(e¹_k) + R_1 H(e²_k) + (1/2) R_2 U_k² ]   (16)

where R_1 and R_2 can be selected as weights.

Remark 2. Entropy optimization is equivalent to variance optimization for Gaussian signals. It can be verified that the entropy optimization principle is consistent with the results for linear Gaussian systems (see e.g. (Chen et al., 2003)).

If there exists a filter such that J_k is minimized at each sample time k, then it is called an entropy optimization (EO) filter (EOF).

3.2 Optimal FI Filter Design Strategy

It is noted that the above calculation procedure for the entropy will lead to a slow and computationally inefficient algorithm. To overcome this, Renyi's entropy will be used, as shown in the following (Renyi, 1987):

H_α(e_k) = (1/(1−α)) ln ∫_a^b γ_{e_k}^α(τ) dτ.   (17)

In (17), α = 2 can be selected, as shown in (Erdogmus and Principe, 2002). In this case, the minimization (maximization) of H_2(e_k) reduces to the maximization (minimization) of V(e^i_k) = ∫_a^b γ²_{e^i_k}(τ) dτ, which is called the information potential. Thus, we only consider the following instantaneous cost function

J_k(U_k) = V(e¹_k) − R_1 V(e²_k) + (1/2) R_2 U_k²   (18)

in which the logarithm computation in (17) is avoided.

The optimal filtering strategy can be obtained via ∂J_k/∂U_k = 0, where an explicit function from the other arguments to U_k should be determined. For this purpose, we consider the following recursive procedure. Denote

U_k = U_{k−1} + ΔU_k,  k = 1, 2, ···, N.   (19)

It can be approximated that

J_k(U_k) = J_{k0} + J_{k1} ΔU_k + (1/2) J_{k2} ΔU_k² + o(ΔU_k²)   (20)

where

J_{k0} = J_k(U_k)|_{U_k = U_{k−1}},  J_{k1} = ∂J_k(U_k)/∂U_k |_{U_k = U_{k−1}},  J_{k2} = ∂²J_k(U_k)/∂U_k² |_{U_k = U_{k−1}}.

Theorem 3. For Case 1, an EO filtering strategy for the FI purpose with respect to J_k, subject to the nonlinear error model (3), is given by

ΔU_k* = −(J_{k1} + R_2 U_{k−1}) / (J_{k2} + R_2)   (21)

for a weight R_2 > 0 satisfying J_{k2} + R_2 > 0. The proof is omitted for brevity.

Remark 3. For Case 1, the suboptimal EO filtering algorithm can be summarized as follows: i) initialize x_0, x̂_0 and U_0; ii) at each sample time k, compute γ_{e^i_{k+1}}(τ), τ ∈ [a, b], and V(e^i_{k+1}) (i = 1, 2); iii) calculate ΔU_k and U_k using equations (21) and (19); iv) increase k by 1 and go back to ii).

The design procedures for Case 2 follow similarly. Simulations are omitted here for brevity.
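One gain update of this recursive scheme (eqs (18)-(21)) can be sketched numerically, with the information potentials V(e^i_k) estimated by Parzen windows and J_{k1}, J_{k2} approximated by finite differences of the gain-dependent part of the cost. The scalar model, input densities, weights and bandwidth below are illustrative assumptions, not the paper's design example.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical scalar dynamics and bounded non-Gaussian inputs.
n, h = 20_000, 0.05
f = lambda x, d, e, w: 0.5 * np.tanh(x) + 0.3 * d + 0.2 * e + w

x1 = rng.uniform(-1, 1, n); delta = rng.beta(2, 5, n)      # situation 1: target fault only
x2 = rng.uniform(-1, 1, n); eta = rng.beta(5, 2, n) - 0.5  # situation 2: nuisance inputs only
w  = rng.triangular(-0.2, 0.0, 0.2, n)
y  = rng.uniform(-1, 1, n)                                  # output samples

def info_potential(e, m=1500):
    """V(e) = integral of gamma_e(tau)^2, estimated by the pairwise Parzen sum."""
    s = e[:m]                                  # subsample to keep the O(m^2) cost small
    u = (s[:, None] - s[None, :]) / h
    return (np.exp(-0.5 * u**2) / (h * np.sqrt(2 * np.pi))).mean()

def G(U, R1=1.0):
    """Gain-dependent part of J_k in (18): V(e1) - R1 V(e2).
    The deterministic theta-shifts drop out, since V is shift-invariant."""
    e1 = f(x1, delta, 0.0, 0.0)                # situation-1 error, cf. (10)
    e2 = f(x2, 0.0, eta, w) - U * y            # situation-2 error, cf. (11)
    return info_potential(e1) - R1 * info_potential(e2)

U_prev, R2, d = 0.5, 0.1, 1e-2
G0, Gp, Gm = G(U_prev), G(U_prev + d), G(U_prev - d)
J1 = (Gp - Gm) / (2 * d)                       # J_{k1} by central differences
J2 = (Gp - 2 * G0 + Gm) / d**2                 # J_{k2} by central differences
dU_star = -(J1 + R2 * U_prev) / (J2 + R2)      # eq (21)
U_new = U_prev + dU_star                       # eq (19)
print(U_new)
```

In a full run, steps ii)-iv) of Remark 3 repeat this update at every sample time; the condition J_{k2} + R_2 > 0 from Theorem 3 should be checked before applying the step.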
4. CONCLUSIONS
In this paper, we present a novel fault isolation (FI) approach for dynamic nonlinear non-Gaussian systems with faults (or abrupt changes of system parameters) and noises. By constructing a filter to estimate the states, the FI problem is reduced to an entropy optimization problem subject to the estimation error system, which is also represented by a nonlinear non-Gaussian system. After new relationships are applied to describe the PDFs of the stochastic error in terms of the PDFs of the stochastic noises and faults, we propose recursive approaches to design real-time suboptimal FI filters. The design objective is that the entropy of the stochastic estimation error is maximized with respect to the target fault and minimized with respect to the other faults and the stochastic noises. Another feature of the proposed approach is that the output can be supposed to be immeasurable: with the measured PDF of the output, similar approaches can be provided, which is different from the existing results.

Acknowledgement: This work is supported by the NSF of China (No. 60474050 and 60472065).
REFERENCES

Bar-Shalom, Y., X. R. Li and T. Kirubarajan (2001). Estimation with Applications to Tracking and Navigation. John Wiley & Sons. London.
Basseville, M. and I. Nikiforov (2002). Fault isolation for diagnosis: Nuisance rejection and multiple hypothesis testing. Annual Reviews in Control 26, 189–202.
Chen, R. H., D. L. Mingori and J. L. Speyer (2003). Optimal stochastic fault detection filter. Automatica 39, 377–390.
Chen, W. and M. Saif (2003). Fault detection and accommodation in nonlinear time-delay systems. In: Proceedings of the ACC. Denver, Colorado. pp. 4255–4260.
Erdogmus, D. and J. C. Principe (2002). Generalized information potential criterion for adaptive system training. IEEE Trans. on Neural Networks 13, 1035–1044.
Feng, X. B. and K. A. Loparo (1997). Active probing for information in control systems with quantized state measurements: a minimum entropy approach. IEEE Trans. on Automatic Control 42, 216–238.
Frank, P. M. and S. X. Ding (1997). Survey of robust residual generation and evaluation methods in observer-based fault detection systems. J. of Process Control 7, 403–424.
Guo, L. and H. Wang (2005). Fault detection and diagnosis for general stochastic systems using B-spline expansions and nonlinear filters. IEEE Trans. on Circuits and Systems-I: Regular Papers 52, 1644–1652.
Guo, L. and H. Wang (2006a). Fault detection for nonlinear non-Gaussian stochastic systems using entropy optimization principle. Transactions of the Institute of Measurement and Control 28, to appear.
Guo, L. and H. Wang (2006b). Minimum entropy filtering for multivariate stochastic systems with non-Gaussian noises. IEEE Trans. on Automatic Control 51, to appear.
Isermann, R. and P. Balle (1997). Trends in the application of model-based fault detection and diagnosis of technical processes. Control Engineering Practice 7, 709–719.
Kabore, P. and H. Wang (2001). Design of fault diagnosis filters and fault-tolerant control for a class of nonlinear systems. IEEE Trans. on Automatic Control 46, 1805–1810.
Papoulis, A. (1991). Probability, Random Variables and Stochastic Processes, 3rd ed. McGraw-Hill. New York, USA.
Patton, R. and J. Chen (1996). Control and Dynamic Systems: Robust Fault Detection and Isolation (FDI) Systems. Academic Press. London.
Renyi, A. (1987). A Diary on Information Theory. Wiley. NY.
Schioler, H. and U. Hartmann (1992). Mapping neural network derived from the Parzen window estimator. Neural Networks 5, 903–909.
Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. Chapman and Hall. London.
Stoorvogel, A. A., H. H. Niemann, A. Saberi and P. Sannuti (2002). Optimal fault signal estimation. Int. J. Robust & Nonlinear Contr. 12, 697–727.
Wang, H. (2002). Minimum entropy control of non-Gaussian dynamic stochastic systems. IEEE Trans. on Automatic Control 47, 398–403.
Wang, H. and W. Lin (2000). Applying observer based FDI techniques to detect faults in dynamic and bounded stochastic distributions. Int. J. Control 73, 1424–1436.
Yue, H. and H. Wang (2003). Recent developments in stochastic distribution control: a review. J. of the Measurement and Control 36, 209–216.
Zhang, Q., M. Basseville and A. Benveniste (1998). Fault detection and isolation in nonlinear dynamic systems: A combined input-output and local approach. Automatica 38, 1359–1373.
Zhang, X., M. Polycarpou and T. Parisini (2001). Robust fault isolation for a class of nonlinear input-output systems. Int. J. Control 74, 1295–1310.