IFAC
Copyright © IFAC Fault Detection, Supervision and Safety of Technical Processes, Washington, D.C., USA, 2003
Publications www.elsevier.com/locate/ifac
THE FAULT DETECTION PROBLEM FOR DYNAMIC SYSTEMS: GENERAL RESULTS

Darkhovski B.S.*, Staroswiecki M.**,1

* Institute for Systems Analysis RAS, 117312 Moscow B-312, Russia
** LAIL CNRS UPRESA 802, EUDIL, Université des Sciences et Technologies de Lille, 59655 Villeneuve d'Ascq cedex, France; fax: (33) 32033 71 89; e-mail:
[email protected]
Abstract: This paper considers the FD problem of a non-linear continuous time system with deterministic uncertainties and stochastic measurements, by means of the hypotheses testing problem in the presence of unknown parameters. Two problem settings, associated with two different fault models, are considered, and two ideas are used for the construction of the decision rule. The first one uses only a part of the observations for decision making and the whole sample for the parameter estimation, while the second one introduces some averaging over the parameter set according to an artificial "posterior" distribution. It is proven that the proposed approach is asymptotically equivalent to the optimal Bayesian rule with known parameters. Copyright © 2003 IFAC

Keywords: non-linear systems, hypotheses testing.
1 Corresponding author.

1. INTRODUCTION

Real-time fault detection (FD) is a decision problem in which the healthy or faulty state of a system has to be inferred from the available data. The main difficulty is that the observed data depend on nuisances (unknown or uncertain parameters, unknown inputs, unknown initial conditions). Many works have addressed this difficulty from two main points of view, the so-called geometric and statistical approaches.

Geometric approaches use a deterministic behaviour model of the system in order to eliminate the nuisances (Cocquempot et al., 1991), (Seliger and Frank, 1991). They proceed in two steps, first creating a residual vector, then applying some ad hoc decision procedure to that vector. The design of the residual vector incorporates the knowledge which is available about the system behaviour, so that design procedures can be developed in order to obtain residuals with specific geometric properties (robust, directional, structured). Many research efforts have been devoted to geometric approaches, and a good synthesis can be found in surveys and books (Chen and Patton, 1999), (Frank, 1990). However, in many cases this approach does not work, either because all the nuisances cannot be eliminated (approximate decoupling is then used, (Gertler and Kunwer, 1993), (Staroswiecki et al., 1993)) or because nuisances and faults act in the same spaces (Staroswiecki and Darkhovski, 2001).

Statistical approaches proceed in one step, considering the distributions of the observations under hypothesis $H_0$ (nominal system operation) and $H_1$ (faulty system operation). These distributions can be derived from the stochastic behaviour model of the system. Different tests exist, which depend on the way data are collected (fixed or growing number of observations), on the problem setting (hypothesis testing or change-point detection), and on the kind of optimality criterion which is used (error probabilities, mean time between errors, detection delay, etc.) (see (Basseville and Nikiforov, 1993), (Brodsky and Darkhovski, 2000) for recent syntheses). However, the problem of nuisances has still not received any complete and general solution. In applications, the hypothesis testing problem under unknown parameters is solved by computing the maximal likelihood ratio (MLR) by means of the respective maximal likelihood estimates (MLE). However, this is a heuristic rule whose optimality is not proved.

In mathematical statistics the problems mentioned above are also considered as a natural generalization of the classical hypotheses testing problem. To the authors' knowledge, the first results concerning asymptotic optimality of the MLR criterion using the respective MLE of the parameter were given in (Bahadur, 1966), who showed that the lower asymptotic bound of the type I error probability is reachable by such a criterion. Borovkov (1984, p. 349-350) notes that "probably, such MLR criterion has no exact optimality feature in the general case", and exhibits cases (p. 373-407) in which it is possible to prove its asymptotic optimality, under special assumptions on the faults. Asymptotic optimality means that the difference between the error probabilities given by the rule and those given by any other rule is non-positive, provided the number of observations $n$ grows to infinity. The two cases in which asymptotic optimality is proven are: 1) one of the parametric sets consists of only one point, and 2) both parametric sets tend to the same point, with some definite speed, when $n \to \infty$ (the so-called bringing hypotheses). Let us underline that both assumptions are quite far from real life FD problems. In (Brown, 1971), a rather sophisticated transformation of the parametric space is proposed. Under special assumptions, the transformation allows the likelihood ratio test to be constructed in the new parametric space, such that it is asymptotically optimal with respect to both error probabilities.

This paper considers the FD problem for a dynamic system with deterministic nuisances and stochastic measurements, by means of the hypotheses testing problem in the presence of unknown parameters.
Two problem settings, namely the one parametric set and the two parametric sets problems, are defined. The proposed solutions (based on the results of (Darkhovski and Staroswiecki, 2002)) are proven to be asymptotically optimal without need for any extra assumption on the faults. The results are obtained using two ideas:
• use only a part of the observations to make the decision, but the whole observation to construct the estimate of the unknown parameters,
• construct some a posteriori level of confidence associated with each of the two hypotheses.

The proposed procedure can be applied to the FD problem of dynamic systems thanks to the approximation of the continuous time system's solutions.

The organization of the paper is as follows. Section 2 introduces the one parametric set and two parametric sets decision problems in the frame of FD. In Section 3, the one parametric set problem is considered. Section 4 gives the result concerning the two parametric sets problem. Section 5 concludes the paper with some comments and a short discussion.

2. PROBLEM SETTING

2.1 System description

Consider on a given time window $[0, T]$ the following continuous time dynamic system

$$\dot{x}(t) = F(x(t), u(t), v(t), \theta, \xi(t)), \quad x(0) \in X_0 \tag{1}$$

where $x(t)$ is the state vector, $u(t)$ is the known control vector, $v(t) \in V$ is some unmeasured disturbance vector (known to belong to some given compact set $V$), $\xi(t)$ is the fault process (known to belong to some given compact set $\Xi$, $0 \in \Xi$), and $\theta$ is the system parameter vector (known to belong to some given compact set $\Theta$). All vectors belong to real vector spaces; their dimensions are not needed and are therefore not specified. The parametric set $\Theta$ is supposed to be $\Theta = \Theta_0 \cup \Theta_1$, $\Theta_0 \cap \Theta_1 = \emptyset$.

The disturbances are just supposed to belong to a known set; a stochastic description is not used, because it is impossible, even in principle, to get any knowledge about their distribution, due to the non-availability of the state vector. Assuming that the control is a differentiable function of time, and taking into account that $\dot{\theta}(t) = 0$, $\theta(0) \in \Theta_0$ or $\theta(0) \in \Theta_1$, eqn (1) is rewritten as follows, with $w = (x, u, \theta)$:

$$\dot{w}(t) = G(w(t), v(t), \xi(t)), \quad w(0) \in W \tag{2}$$

where $W = X_0 \times \{u_0\} \times \Theta$.

The observations of the system are described by

$$y_k = H(w_k, \epsilon_k), \quad k = 0, 1, \dots, n \tag{3}$$

where $H(\cdot)$ is a given vector-function, $\{\epsilon_k\}_{k \ge 0}$ is the sequence of i.i.d. random vectors and, for any vector-function $h(t)$ of continuous time $t$, $h_k = h(k\Delta)$, where $\Delta$ is the sampling period, $\Delta n = T$.

2.2 Faults description

Normal operation of system (2), (3) is described by

$$H_0 : \ \xi(t) \equiv 0, \ t \in [0, T], \quad \epsilon_k \sim \Pi_0, \ \forall k \tag{4}$$

where $\Pi_0$ is a probability measure with known density function (d.f.). From (4), it follows that the faulty operation (hypothesis $H_1$) can be defined in four ways, considering changes in the distribution of the random variables, deviations of the system parameter vector $\theta$, deviations of the fault process $\xi(t)$, or all simultaneously. In this paper only the first three types of faults will be considered. Note that these definitions do not cover the cases in which faults in the distributions occur during the time window $[0, T]$.

2.3 The hypotheses testing problems
From the above, the fault detection problem can be stated as a two-hypotheses testing problem.

Namely, each observation $y_k$ depends on the unknown parameter value $w_k$. The unknown parameter belongs either to one set $W$ (the problem is called the one parametric set problem; it corresponds to the case when there are deviations only in the distributions of the random variables) or to one of two given sets $W_0$, $W_1$, $W_0 \cap W_1 = \emptyset$ (the problem is called the two parametric sets problem; it corresponds to the case when there are no deviations in the distributions of the random variables, but there are deviations in the system parameter vector or in the fault process). Therefore, there exist two collections of probabilistic distributions, namely $\mathcal{P}_i$, $i = 0, 1$, with $\mathcal{P}_i = \{f_i(\cdot, w)\}_{w \in W}$ or $\mathcal{P}_i = \{f(\cdot, w)\}_{w \in W_i}$, where $f_i(\cdot, w)$ are given d.f. of the observable variable $y(\cdot)$, and the hypotheses testing problem is to decide to which collection of distributions the sample $Y = \{y_k\}_{k=1}^{n}$ of i.i.d. random vectors belongs.

The described hypotheses testing problems cannot be solved in their initial statement, because the dimension of the unknown parameter grows simultaneously with the number of observations. To overcome the problem, let us use a finite-dimensional approximation of the solutions of equation (2). As unmeasured disturbances and faults have compact sets of values, the set of all solutions of equation (2) is a compact set in the space of continuous functions and, consequently, it can be approximated uniformly with respect to unmeasured disturbances and faults, with any given accuracy and in any natural functional metric, by step-wise constant functions with only a finite number of jumps, i.e., actually, by finite-dimensional vectors from some compact set. As soon as the step of the approximation is chosen, the dimension of the unknown parameter becomes fixed and one can use a growing number of observations, assuming that the sampling period $\Delta$ tends to zero.
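The step-wise constant approximation can be illustrated numerically. The following is only a minimal sketch under assumptions that are not in the paper: a scalar trajectory `w(t) = sin(2*pi*t)` stands in for one solution of equation (2), each piece takes the midpoint value, and the helper names (`stepwise_params`, `stepwise_eval`, `sup_error`) are hypothetical. It shows that the parameter vector keeps a fixed dimension `m` while the number of sampling instants `n` can grow independently.

```python
import math

def stepwise_params(w, T, m):
    """Return the m values of a step-wise constant approximation of w on [0, T].

    Piece j covers [j*T/m, (j+1)*T/m) and takes the value of w at its midpoint.
    """
    h = T / m
    return [w((j + 0.5) * h) for j in range(m)]

def stepwise_eval(z, T, t):
    """Evaluate the step-wise constant function with parameter vector z at time t."""
    m = len(z)
    j = min(int(t * m / T), m - 1)  # clamp so that t == T falls in the last piece
    return z[j]

def sup_error(w, z, T, n):
    """Largest deviation between w and its approximation on the n+1 sampling instants."""
    delta = T / n  # sampling period, delta * n = T
    return max(abs(w(k * delta) - stepwise_eval(z, T, k * delta)) for k in range(n + 1))

# Hypothetical smooth trajectory (an assumption made only for this illustration).
def w(t):
    return math.sin(2.0 * math.pi * t)

T = 1.0
z_coarse = stepwise_params(w, T, 10)    # parameter vector of dimension 10
z_fine = stepwise_params(w, T, 100)     # parameter vector of dimension 100
err_coarse = sup_error(w, z_coarse, T, 1000)
err_fine = sup_error(w, z_fine, T, 1000)
```

In this sketch the error is controlled by the step of the approximation alone, so refining the sampling period does not change the dimension of the unknown parameter.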
As a result of the approximation procedure one gets the collections of probabilistic distributions described by $\mathcal{P}_i = \{f_i(\cdot, z)\}_{z \in Z}$, $i = 0, 1$, or by $\mathcal{P}_i = \{f(\cdot, z)\}_{z \in Z_i}$, $i = 0, 1$, where $z$ is the new parameter with fixed dimension and $Z$, $Z_i$, $i = 0, 1$ are given compact sets. The same hypotheses testing problems as before are considered.

For both the one parametric set and the two parametric sets problems, it is implicitly assumed, everywhere below, that there exists a sufficiently large ball $K$ which contains the considered parameter sets (therefore, the term "parametric space" will mean "some open set of a finite-dimensional space"), and that the parametric collections $f_i$, $i = 0, 1$ and $f$ are known and defined for all parameter values in $K$.

Besides, it is implicitly assumed, everywhere below, that the so-called regularity conditions for the densities hold (see Borovkov (1984), p. 238-239) for the whole parametric space. To shorten the paper, the exact formulation of these conditions is not given. Recall only that, roughly speaking, the regularity conditions mean that the densities are smooth enough with respect to the parameter and that the respective Fisher matrices exist.

3. THE ONE PARAMETRIC SET PROBLEM

3.1 Assumptions

For a correct statement, let us assume that the following detectability condition holds (5)

Denote by $P_{z^*,1}$ (resp. $E_{z^*,1}$), $P_{z^*,0}$ (resp. $E_{z^*,0}$) the probabilistic measures (resp. the mathematical expectations) associated with the hypotheses $H_1$, $H_0$ and the "true" value $z^*$ of the parameter. Let $\hat z_1$, $\hat z_0$ be the MLE estimates for the unknown parameter $z$ calculated using the whole sample $Y$ and without any constraint (i.e. $\hat z_1$, $\hat z_0$ may take their values in the whole space) with respect to the hypotheses $H_1$, $H_0$. Let $\bar z_i = \mathrm{Pr}_K(\hat z_i)$, $i = 0, 1$, where $\mathrm{Pr}_A$ is the projection operator on the set $A$. Call the estimates $\bar z_i$ the special maximal likelihood estimates (SMLE) of the parameter. By definition the SMLE estimates belong to $K$ but might not belong to $Z$. Consider the following condition: there exist sets $Z_0 \subset K$ and $Z_1 \subset K$ such that $\inf_{z \in Z_0} \mathrm{dis}(z, Z) \ge \delta > 0$, $\inf_{z \in Z_1} \mathrm{dis}(z, Z) \ge \delta > 0$ and for any $x > 0$

$$\lim_{n \to \infty} P_{z^*,1}\{\mathrm{dis}(\bar z_0, Z_0) > x\} = 0, \quad \lim_{n \to \infty} P_{z^*,0}\{\mathrm{dis}(\bar z_1, Z_1) > x\} = 0 \tag{6}$$

for any $z^* \in Z$, where $\mathrm{dis}(u, Z) \triangleq \inf_{z \in Z} \|u - z\|$.

It is easy to show that (6) implies (5) under the regularity conditions for some neighbourhood of $Z$. There are many situations in which condition (6) holds. For example, assume that the parametric vector $z \in Z$ has the same nature for both densities, and suppose that there exist homeomorphisms $\varphi_i : Z \to U$, $i = 0, 1$ such that the densities have the form $f_0(y, \varphi_0(z))$, $f_1(y, \varphi_1(z))$. Let $\hat u$ be some consistent estimator of the new parameter $u = \varphi_i(z)$ (which is the same for both densities) such that $\bar z_i = \varphi_i^{-1}(\hat u)$, $i = 0, 1$ are asymptotically normal estimators of the parameter $z$. Then

$$\bar z_0 \xrightarrow{P_{z^*,1}} (\varphi_0^{-1} \circ \varphi_1)(z^*), \quad \bar z_1 \xrightarrow{P_{z^*,0}} (\varphi_1^{-1} \circ \varphi_0)(z^*).$$

Therefore condition (6) holds if

$$(\varphi_0^{-1} \circ \varphi_1)(Z) \cap Z = \emptyset, \quad (\varphi_1^{-1} \circ \varphi_0)(Z) \cap Z = \emptyset$$

(here $Z_0 = (\varphi_0^{-1} \circ \varphi_1)(Z)$, $Z_1 = (\varphi_1^{-1} \circ \varphi_0)(Z)$). In particular, if $\varphi_i(z) = u_i + z$, $i = 0, 1$, then condition (6) holds when $(Z \pm (u_1 - u_0)) \cap Z = \emptyset$. Many other situations in which condition (6) holds can be exhibited; therefore it is assumed to hold in the sequel and is not developed further here. Furthermore, it is also assumed that all integrals below exist.
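For the shift maps $\varphi_i(z) = u_i + z$, the sufficient condition $(Z \pm (u_1 - u_0)) \cap Z = \emptyset$ can be checked directly. The sketch below assumes, purely for illustration, a scalar parameter and a closed interval $Z = [lo, hi]$; the function names are hypothetical and not taken from the paper.

```python
def shifted_interval(interval, shift):
    """Translate a closed interval (lo, hi) by the given shift."""
    lo, hi = interval
    return (lo + shift, hi + shift)

def intervals_disjoint(a, b):
    """True when the closed intervals a and b have an empty intersection."""
    return a[1] < b[0] or b[1] < a[0]

def shift_condition_holds(Z, u0, u1):
    """Check (Z + (u1 - u0)) and (Z - (u1 - u0)) against Z for phi_i(z) = u_i + z."""
    d = u1 - u0
    Z0 = shifted_interval(Z, d)     # image of Z under phi_0^{-1} o phi_1
    Z1 = shifted_interval(Z, -d)    # image of Z under phi_1^{-1} o phi_0
    return intervals_disjoint(Z0, Z) and intervals_disjoint(Z1, Z)

far_apart = shift_condition_holds((0.0, 1.0), 0.0, 1.5)   # shift larger than the width of Z
too_close = shift_condition_holds((0.0, 1.0), 0.0, 0.5)   # shifted copies overlap Z
```

For an interval, the condition reduces to the shift $|u_1 - u_0|$ exceeding the width of $Z$, which is what the two calls illustrate.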
3.2 Preliminaries

The hypotheses testing problem is set in the Bayesian statement: if $d(Y)$ is a decision function (i.e., a measurable map of the sample space into the set $\{0, 1\}$), and $\alpha > 0$, $\beta > 0$ are given numbers, the functional to be minimized (under any $z \in Z$) is

$$\alpha P_{z,1}(d(Y) = 0) + \beta P_{z,0}(d(Y) = 1) \tag{7}$$

Let $\tilde n$ be such that $\tilde n < n$ and $\tilde n / n \to 0$ as $n \to \infty$, $\tilde n \to \infty$. Consider the part of the observations $\tilde Y = (y_1, \dots, y_{\tilde n})$. Let $z^* \in Z$ be the true (unknown) value of the parameter. Put

$$I = \ln \frac{f_1(\tilde Y, z^*)}{f_0(\tilde Y, z^*)}.$$

Note now that if the parameter value $z^*$ is known, the optimal Bayesian rule for problem (7) when only the part $\tilde Y$ of the observations is used is

on $D_1 = \{\tilde Y : I > c^*\}$ accept $H_1$;
on $D_0 = \{\tilde Y : I \le c^*\}$ accept $H_0$;

where $c^* = \ln(\beta/\alpha)$.

Now, consider the problem from another point of view. Namely, suppose that from the beginning the sample of size $\tilde n < n$ is available, and the decision about the hypotheses has to be made from this sample, but in a way such that it will be asymptotically equivalent to the optimal one for any (unknown!) $z \in Z$. As noted in the introduction, the decision rule based on the MLR with the MLE of the parameter based on the whole sample is, in general, not asymptotically optimal. But if the MLE of the parameter based on the sample of size $n$ (i.e., actually, based on the real available sample, with $n > \tilde n$) is used, then it might be possible to prove such optimality. Thus, the "additional" $n - \tilde n$ observations appear to be the "cost" which has to be paid for asymptotic optimality. This is the first idea of the proposed approach.

To explain the second idea, define probabilistic measures on the $\varepsilon$-neighbourhood $Z_\varepsilon$ of the set $Z$, with densities $m_i(z \mid Y) \triangleq m_i(z)$, $i = 0, 1$, as follows:

$$m_i(z) = \begin{cases} 0 & \text{if } z \notin Z_\varepsilon \\ \dfrac{f_i(Y, z)}{\int_{Z_\varepsilon} f_i(Y, z)\,dz} & \text{if } z \in Z_\varepsilon \end{cases}$$

It is easy to see that $m_i(\cdot)$ are just the posterior d.f. on the set $Z_\varepsilon$ if it is assumed that $z$ is a random variable with prior uniform distribution. But these measures can be used without this assumption, and even for unbounded sets. A possible interpretation of these measures is that they provide a natural level of confidence with respect to each of the two hypotheses, for each point of $Z_\varepsilon$, after getting the observations. Note that such measures have recently been proposed in (Darkhovski, 1998) for the stochastic recovery problem.

It is well known (see Walker (1968), Schervish (1997)) that under the regularity conditions, the posterior density of the random variable $\sqrt{M}(z - \hat z_M)$ tends uniformly in probability to the normal density (with some finite covariance matrix), as $M \to \infty$, on any compact set. Here, $M$ is some sample size and $\hat z_M$ is the MLE estimate of the parameter $z$. Therefore, if $g(z)$ is a bounded and continuous function, then

$$\int g(z)\, p_M(z \mid \cdot)\, dz \to g(z^*) \tag{8}$$

where $z^* \in Z$ is the true parameter value and $p_M(z \mid \cdot)$ is the posterior density.

Relation (8) allows a new decision rule to be constructed, based on the densities $m_i(z)$, $i = 0, 1$. This is the second idea of the proposed approach.

3.3 The decision rule

Let $z_i^{AP} = \int_{Z_\varepsilon} z\, m_i(z)\, dz$, $i = 0, 1$, and consider the estimate

$$z^{AP} = \frac{\Delta_1}{\Delta_0 + \Delta_1}\, z_0^{AP} + \frac{\Delta_0}{\Delta_0 + \Delta_1}\, z_1^{AP},$$

where $\Delta_0 = \mathrm{dis}(\bar z_0, Z)$, $\Delta_1 = \mathrm{dis}(\bar z_1, Z)$ (by definition, here $0/0 = 1/2$).

For $i = 0, 1$ define the Kullback-Leibler information $K_i(z', z)$.

Everywhere below, the same notation $C$ will be used for (generally speaking) different positive constants which do not depend on $n$.
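The averaged estimate $z^{AP}$ can be sketched numerically on a grid. Everything model-specific below is an assumption made for illustration only and is not in the paper: scalar observations with densities $f_i(y, z) = N(z + u_i, 1)$ (the shift example of Section 3.1), $Z = [0, 1]$, $K = [-10, 10]$, a uniform grid on $Z_\varepsilon$ replacing the integrals, and hypothetical function names.

```python
import math
import random

def log_lik(sample, z, u):
    """Log-likelihood of the assumed model y ~ N(z + u, 1), up to an additive constant."""
    return -0.5 * sum((y - z - u) ** 2 for y in sample)

def posterior_mean(sample, u, grid):
    """Mean of the artificial 'posterior' density m_i, approximated on a uniform grid."""
    logs = [log_lik(sample, z, u) for z in grid]
    top = max(logs)                          # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logs]
    return sum(z * w for z, w in zip(grid, weights)) / sum(weights)

def dist_to_interval(x, lo, hi):
    """dis(x, Z) for Z = [lo, hi]."""
    return max(lo - x, x - hi, 0.0)

def averaged_estimate(sample, u0, u1, Z, eps, K, grid_size=601):
    """Return (z_AP, z0_AP, z1_AP) for the assumed Gaussian shift model."""
    lo, hi = Z
    width = hi - lo + 2.0 * eps
    grid = [lo - eps + width * j / (grid_size - 1) for j in range(grid_size)]
    z0_ap = posterior_mean(sample, u0, grid)
    z1_ap = posterior_mean(sample, u1, grid)
    mean_y = sum(sample) / len(sample)
    # Unconstrained MLEs under H0 and H1, projected on K = [-K, K] (the SMLE).
    z0_bar = min(max(mean_y - u0, -K), K)
    z1_bar = min(max(mean_y - u1, -K), K)
    d0 = dist_to_interval(z0_bar, lo, hi)
    d1 = dist_to_interval(z1_bar, lo, hi)
    if d0 + d1 == 0.0:
        w0 = w1 = 0.5                        # convention 0/0 = 1/2
    else:
        w0, w1 = d1 / (d0 + d1), d0 / (d0 + d1)
    return w0 * z0_ap + w1 * z1_ap, z0_ap, z1_ap

# Synthetic whole sample generated under H0 with true parameter 0.4 (seeded for repeatability).
rng = random.Random(0)
sample = [0.4 + rng.gauss(0.0, 1.0) for _ in range(400)]
z_ap, z0_ap, z1_ap = averaged_estimate(sample, 0.0, 2.0, (0.0, 1.0), 0.1, 10.0)
```

In this setting the SMLE under the true hypothesis falls inside $Z$, so its distance to $Z$ vanishes and the weights put all the mass on the corresponding averaged estimate.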
Let $I^* = \ln \dfrac{f_1(\tilde Y, z^{AP})}{f_0(\tilde Y, z^{AP})}$ and consider the following decision rule:

on $\tilde D_1 = \{\tilde Y : I^* > c^*\}$ accept $H_1$;
on $\tilde D_0 = \{\tilde Y : I^* \le c^*\}$ accept $H_0$. (9)
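Rule (9) amounts to thresholding a log-likelihood ratio evaluated at the plugged-in estimate. A minimal sketch follows, again assuming (for illustration only) scalar i.i.d. observations with densities $N(z + u_i, 1)$ and an estimate `z_ap` that has already been computed from the whole sample; the function names are hypothetical.

```python
import math

def log_density(y, z, u):
    """Log of the assumed density N(y; z + u, 1)."""
    return -0.5 * (y - z - u) ** 2 - 0.5 * math.log(2.0 * math.pi)

def decide(sample_part, z_ap, u0, u1, alpha, beta):
    """Return (decision, I*): decision is 1 (accept H1) if I* > c*, else 0 (accept H0)."""
    # I* is the log-likelihood ratio of the partial sample at the plugged-in estimate.
    i_star = sum(log_density(y, z_ap, u1) - log_density(y, z_ap, u0) for y in sample_part)
    c_star = math.log(beta / alpha)
    return (1 if i_star > c_star else 0), i_star

# Two artificial partial samples of size 20, chosen so that the outcome is obvious.
decision_h0, i_h0 = decide([0.4] * 20, 0.4, 0.0, 2.0, 1.0, 1.0)
decision_h1, i_h1 = decide([2.4] * 20, 0.4, 0.0, 2.0, 1.0, 1.0)
```

Only the partial sample enters `i_star`, while `z_ap` carries the information of the whole sample, which mirrors the first idea of the approach.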
Let $\lambda(y, z) = \ln \dfrac{f_1(y, z)}{f_0(y, z)}$ and let $\epsilon_1(\tilde n) = P_{z^*,0}(I > c^*)$, $\epsilon_0(\tilde n) = P_{z^*,1}(I \le c^*)$ be the respective error probabilities for the optimal Bayesian rule using only the part $\tilde Y$ of the observations, for the problem with known parameter $z^*$.

Theorem 1. Assume that for any $z \in Z$ and $i = 0, 1$ the following conditions hold:

a) the function $\lambda(y, z)$ is continuously differentiable with respect to $z$ and there exists a positive function $G(y)$ such that

$$\sup \|\lambda'_z(y, z)\| \le G(y), \quad m_i = \int G(u) f_i(u, z)\, du < \infty, \quad P_{z,i}\{|G(u) - m_i| \ge x\} \le C \exp(-Cx) \ \ \forall x > 0;$$

b) both distributions $P_{z,0}$, $P_{z,1}$ of the random variable $\lambda(y, z)$ have an absolutely continuous component;

c) $E_{z,i} \exp(-\gamma \lambda(y, z)) < \infty$ for some $\gamma > 0$;

d) the functions $K_i(z', z)$ are twice continuously differentiable with respect to $z$.

Then, under the relation $\tilde n / n \to 0$ as $n \to \infty$, $\tilde n \to \infty$, the decision rule (9) is asymptotically equivalent to the optimal Bayesian rule for any known value of the parameter in the following sense:
The proof is based on the results of (Darkhovski and Staroswiecki, 2002).

Remark 1. It is known (Borovkov, 1984, p. 245) that, under the regularity conditions, the Bayesian estimate of the parameter is asymptotically equivalent to the MLE, for any prior density function which is positive and continuous on the set $Z$. In the present case, this means that

$$\sqrt{n}\,(\hat z_i - z_i^{AP}) \to 0, \quad P_{z^*,i}\text{-a.s.}, \quad i = 0, 1.$$

Therefore, taking $Z_\varepsilon = K$ (recall that $K$ is the ball which was used in the construction of the SMLE), decision rule (9) can be performed by replacing the SMLE in $\Delta_i$ by $z_i^{AP}$.

In some cases there exist asymptotically normal estimates, with values in $K$, which are more easily calculated than $\bar z_i$ or $z_i^{AP}$. For example, in linear observation models, the projections on the set $K$ of the mean square estimates (MSE) are asymptotically normal under usual conditions and are calculated very easily. In such situations, the MSE estimates can be used in the decision rule instead of $\bar z_i$ and $z_i^{AP}$.

4. THE TWO PARAMETRIC SETS PROBLEM

In this section, the following problem

$H_1$ : d.f. of obs. $= f(Y, z)$, $z \in Z_1$
$H_0$ : d.f. of obs. $= f(Y, z)$, $z \in Z_0$

where $z$ is the unknown parameter, is considered.

In the two parametric sets problem, the detectability condition (which is assumed to hold) is expressed in the following form: $Z_0 \cap Z_1 = \emptyset$ and, on any set of positive Lebesgue measure, $f(\cdot, z_1) \ne f(\cdot, z_2)$ if $z_1 \ne z_2$.

Let $(z_0, z_1)$ be any pair such that $z_0 \in Z_0$, $z_1 \in Z_1$. Denote by $P_{z_0}$ (resp. $P_{z_1}$) the probabilistic measure associated with the parameter value $z_0$ (resp. $z_1$).

Put $Z = Z_0 \cup Z_1$. Consider a $\nu$-neighbourhood $Z_\nu$ of $Z$ and put the d.f. $m(z)$ ("posterior" density) on the set $Z_\nu$ in the same way as above, but for only one density $f$. Let

$$z^{AP} = \int_{Z_\nu} z\, m(z)\, dz, \quad z_0^{AP} = \mathrm{Pr}_{Z_0} z^{AP}, \quad z_1^{AP} = \mathrm{Pr}_{Z_1} z^{AP}.$$

Consider the criterion

$$J = \ln \frac{f(\tilde Y, z_1^{AP})}{f(\tilde Y, z_0^{AP})}$$

where, as before, $\tilde Y = \{y_1, \dots, y_{\tilde n}\}$, $\tilde n < n$. The decision rule has the form (recall that $c^* = \ln(\beta/\alpha)$):

if $J \le c^*$ accept $H_0$;
if $J > c^*$ accept $H_1$. (16)

If the values $z_0$, $z_1$ are known, then the optimal Bayesian rule for problem (7), using the sample $\tilde Y$, is

if $\ln \dfrac{f(\tilde Y, z_1)}{f(\tilde Y, z_0)} \le c^*$ accept $H_0$;
if $\ln \dfrac{f(\tilde Y, z_1)}{f(\tilde Y, z_0)} > c^*$ accept $H_1$. (17)

Denote by

$$\epsilon_1(z_0, z_1, \tilde n) = P_{z_0}\!\left(\ln \frac{f(\tilde Y, z_1)}{f(\tilde Y, z_0)} > c^*\right), \quad \epsilon_0(z_0, z_1, \tilde n) = P_{z_1}\!\left(\ln \frac{f(\tilde Y, z_1)}{f(\tilde Y, z_0)} \le c^*\right)$$

the respective error probabilities for the rule (17). For any $z^* = (z_0^*, z_1^*)$, $z_i^* \in Z_i$, $i = 0, 1$ denote

Then, as for Theorem 1, and taking into account that the estimate $z^{AP}$ is asymptotically equivalent to the MLE, one has:

Theorem 2. Assume that for $i = 0, 1$ and any $z_i^* \in Z_i$, for the functions $\lambda(y, z)$ and $K(z_i^*, z_j^*)$, assumptions similar to a) - d) of Theorem 1 hold. Then, under the relation $\tilde n / n \to 0$ as $n \to \infty$, $\tilde n \to \infty$, decision rule (16) is asymptotically equivalent to the optimal Bayesian rule for any known parameters $z_0^* \in Z_0$, $z_1^* \in Z_1$ under the respective alternatives $\varphi_1(z_0^*) = \mathrm{Pr}_{Z_1} z_0^*$, $\varphi_0(z_1^*) = \mathrm{Pr}_{Z_0} z_1^*$ in the following sense:

$$\lim_{n \to \infty} \frac{\ln P_{z_0^*}(J > c^*)}{\ln \epsilon_1(z_0^*, \varphi_1(z_0^*), \tilde n)} = 1, \quad \lim_{n \to \infty} \frac{\ln P_{z_1^*}(J \le c^*)}{\ln \epsilon_0(\varphi_0(z_1^*), z_1^*, \tilde n)} = 1.$$

5. CONCLUSION AND COMMENTS

The simplest approach for the decision step in FD in dynamic systems involves hypothesis testing. This problem is rather difficult when unknown initial conditions, unknown inputs and uncertain parameters are present, and the strict decoupling approach fails to work. In that case, optimal solutions to the statistical decision problem are known only under some restrictive assumptions.

In this paper, nuisances are collected in a parameter vector whose dimension remains finite and constant when the number of observations grows to infinity, thanks to the approximation of the solutions of the continuous system by step-wise constant functions.
Two problem settings, associated with two different approaches to fault modeling, have been distinguished, and a solution has been proposed, based on two ideas. The first is to make a decision based only on a part of the observations, $\tilde n < n$, but to construct the estimate of the parameter using the whole sample. The interpretation is that the "additional" $n - \tilde n$ observations are some "cost" which should be paid for optimality in the hypotheses testing problem with the sample size $\tilde n$.

The second idea is to use some "averaging" MLR, which is done with respect to some artificial "posterior" measure on the parametric set (it is not assumed that the parameter is random, but it is possible to construct such a measure which will have the same features as a posterior distribution).

Indeed, under natural conditions and any choice $\tilde n < n$, $\tilde n / n \to 0$ as $n \to \infty$, the asymptotic equivalence between the new rules and the optimal one under any known parameter becomes true, and no additional assumption (like the bringing hypotheses) is needed. Although such rules are theoretically worse than the optimal one (let us underline that the optimal rule is not known for the problems with one parametric set), they might be useful in practice, because the real effectiveness of the optimal rule is determined by the rate of convergence in asymptotic relations and, most probably (because $\tilde n / n \to 0$ as slowly as one wishes), will be comparable to that of the proposed methods.

Acknowledgments. This work has been carried out under a grant of the French Ministry of Research (PAST RI).

References

Bahadur, R. R. (1966). An optimal property of the likelihood ratio statistic. In Proc. Fifth Berkeley Symp. Math. Statist. Prob. Vol. 1 (pp. 13-26). Berkeley and Los Angeles: University of California Press.

Basseville, M., & Nikiforov, I. V. (1993). Detection of abrupt changes - theory and applications. Information and Systems Sciences Series. Englewood Cliffs, New Jersey: Prentice Hall.

Borovkov, A. A. (1984). Mathematical Statistics. Moscow: Nauka.

Brodsky, B. E., & Darkhovsky, B. S. (2000). Nonparametric Statistical Diagnosis. Problems and Methods. Dordrecht: Kluwer Academic Publishers.

Brown, L. D. (1971). Non-local asymptotic optimality of appropriate likelihood ratio test. The Annals of Mathematical Statistics 42(4), 1206-1240.

Chen, J., & Patton, R. J. (1999). Robust model-based fault diagnosis for dynamic systems. Boston: Kluwer Academic Publishers.

Cocquempot, V., Cassar, J.-Ph., & Staroswiecki, M. (1991). Generation of Robust Analytic Redundancy Relations. In Proc. European Control Conference, ECC'91 (pp. 309-314). Grenoble, France.

Darkhovski, B. S. (1998). On stochastic recovery problem. Theory of probability and its applications 43(2), 357-364.

Darkhovski, B. S., & Staroswiecki, M. (2002). On hypotheses testing problem under unknown parameter. Teoria veroyatnostei i primenenia 47(4), 654-671.

Frank, P. M. (1990). Fault detection in dynamic systems using analytical and knowledge-based redundancy - a survey and some new results. Automatica 26(3), 450-472.

Fu, J. C., & Kass, R. E. (1988). The exponential rates of convergence of posterior distributions. Annals of Institute Statist. Math. 40(4), 683-691.

Gertler, J. J., & Kunwer, M. M. (1993). Optimal residual decoupling for robust fault diagnosis. In Proc. Tooldiag '93 International Conference on Fault Diagnosis (pp. 436-452). Toulouse, France.

Schervish, M. J. (1997). Theory of Statistics. New York: Springer-Verlag.

Seliger, R., & Frank, P. M. (1991). Fault diagnosis by disturbance decoupled non-linear observers. In Proc. 30th IEEE Conf. on Decision and Control (pp. 2248-2253). Brighton.

Staroswiecki, M., Cassar, J.-Ph., & Cocquempot, V. (1993). Generation of optimal structured residuals in the parity space. In Proc. 12th IFAC World Congress Vol. 5 (pp. 535-542). Sydney.

Staroswiecki, M., & Darkhovski, B. (2001). On Structural and Parity Space Detectability. In Proc. European Control Conference, ECC'01 (pp. 161-165). Porto, Portugal.

Walker, A. M. (1968). On the Asymptotic Behaviour of Posterior Distributions. Journal of Royal Statistical Society, Series B, 31, 80-88.