Copyright © IFAC Stochastic Control, Vilnius, Lithuanian SSR, USSR, 1986
NONPARAMETRIC ALGORITHMS OF STATISTICAL DIAGNOSTICS
B. E. Brodskij and B. S. Darkhovskij
Central Research Institute of Complex Automation, Moscow, USSR
Abstract. The problems of detecting the moments of change of the statistical characteristics of random sequences are discussed. Nonparametric algorithms for solving such problems in different situations are proposed. The limit theorems for these algorithms are proved.
Keywords. Random processes; statistics; disorder detection.
standard situation framework. Methods of reducing other situations to the standard one are discussed toward the end of the report.
INTRODUCTION
The statistical methods of diagnostics play an important role in modern process control systems. They are often used to check the adequacy of stochastic models, to detect prealarm situations of the process and also different types of disorders and failures in control and measurement channels.
APOSTERIORI DISORDER PROBLEMS
In this section the aposteriori disorder problems are formulated and different methods of solving them in a standard situation are given. We consider the aposteriori disorder problems as problems of vector parameter estimation in a scheme of series.
Many practical questions of diagnostics can be treated as statistical problems of estimating the moments at which the probabilistic distribution of a random sequence changes. The estimation of disorder moments can be made aposteriori, i.e. by the whole observed realisation of the process, and sequentially, i.e. on-line with observations. Thus, we shall distinguish between aposteriori and sequential methods of disorder-time estimation.
Let $\theta = (\theta_1, \dots, \theta_K)$ be the vector parameter, $K \ge 1$;
O
On a certain probability space $(\Omega, \mathcal{F}, P_\theta)$ two families of random sequences $\{X^N\}$, $N > [1/\alpha]$, $X^N = \{x^N(n)\}_{n=1}^{N}$,
Problems of disorder detection were investigated by E. S. Page, A. N. Shiryayev, G. Lorden, L. Telksnys, N. Kligiene, M. Pollak, I. V. Nikiforov, L. Yu. Vostrikova, M. Basseville and others. A state-of-the-art review and detailed bibliography were given by Kligiene and Telksnys (1983).
and $\{\Xi_i\}$, $\Xi_i = \{\xi_i(n)\}_{n=1}^{\infty}$, $x^N(n) = f(n,\theta) + \xi_i(n)$, $i = 1, \dots, K+1$,
In spite of extensive research into the problem of disorder detection and diagnosis, it remains in the focus of attention, first, due to its practical importance and, secondly, because many of its theoretical aspects continue to be a challenge, with nonparametric methods being insufficiently developed. The bulk of the results have been obtained with the parametric approach, which requires knowledge of the random sequence model. Besides, several complicated, though common, types of disorder (non-abrupt or gradual changes, etc.) have not been investigated.
if $[\theta_{i-1}N] + 1 \le n \le [\theta_i N]$, $\theta_0 \equiv 0$, $\theta_{K+1} \equiv 1$,
and the deterministic function $f(n,\theta)$ is defined below. Moments $m_i = [\theta_i N]$, $i = 1, \dots, K$, will be called the moments of disorder, and the function $f(n,\theta)$ determines the type of disorder. The problem is to construct a consistent estimate of the parameter $\theta$ by the given sample $X^N$.
Let us formulate some assumptions about the family $\{\Xi_i\}$. The natural filtration $\{\mathcal{F}^i_n\}$, $\mathcal{F}^i_0 = \{\emptyset, \Omega\}$, $\mathcal{F}^i_n = \sigma\{\omega : \xi_i(1), \dots, \xi_i(n)\}$,
This report offers nonparametric methods of disorder detection and gives the results of their investigation in different situations. It should be noted that the number of disorder types is as large as the number of probabilistic characteristics of random sequences. Nevertheless, there is a standard situation, i.e. a change of the mean value, to which other cases can be reduced. What follows are the results of investigations within the
is associated with each sequence $\xi_i(n)$, $i = 1, \dots, K+1$. Let
1lst ::: o{ W : ~c: (S»)" , , ~.(S+t)}, G-n,= Vt. er;==Vrist l
'
l
Let us assume that the following conditions are satisfied:
i) each of the sequences $\xi_i(n)$ is stationary (in a wide sense);
SCF-' - J w-
$E\xi_i(n) \equiv 0$, $i = 1, \dots, K+1$, are considered, and
l
ii) $E\exp(t\xi_i(n)) < \infty$ if $|t| \le H$, for all $i$;
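iii) the $\psi$-mixing condition is fulfilled, i.e. for all events $A$ and $B$, generated respectively by the sequence up to some moment and by the sequence starting $s$ steps later, with $P(A)P(B) \ne 0$: $|P(AB) - P(A)P(B)| \le \psi(s)P(A)P(B)$, $\psi(s) \to 0$ as $s \to \infty$.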
N
t:=n+i
n.
n.=~r .· ,1II-2;
m=i, ... ,N-2.,
tt+rrt~N-1
I'lltn.-~
N\m-~
L=' /1l
WrtIXHci)1, l= rt\+
n.
$P_\theta\{\|\hat\theta - \theta\| > \varepsilon\} \le L\exp(-L\varepsilon^2 N)$
Now, let $\Delta\theta$ be unknown. For the sake of convenience let us introduce here continuous time. For this, a continuous random field $y_N(t,s)$ is built on the set $M = \{(t,s) : 0 \le t \le 1,\ 0 \le s \le 1-\alpha\}$
Abrupt disorder
$K = 1$, $f(n,\theta) = f(n,\theta_1) = a\,I(n > [\theta_1 N])$, $a \ne 0$
A'>
ae
qt4 (-
($I(A)$ is the indicator of the set $A$). For the detection of this type of disorder the
$\hat\theta_1$ be the estimate of $\theta_1$. Then $\hat\theta_2 = \hat\theta_1 + \gamma$.
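Theorem 3. For all $0 < \varepsilon < \varepsilon_0$ there exist $\Lambda(\varepsilon) > 0$, æ$(\varepsilon) > 0$, $M(\varepsilon) > 0$ such that the estimate $\hat\theta = (\hat\theta_1, \hat\theta_2)$ obtained with $\Lambda = \Lambda(\varepsilon)$, æ = æ$(\varepsilon)$ satisfies the inequality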
In th i s case
mN
in
$Y_N(n,m)$ is used. Let
Algorithm of estimation. For a certain æ > 0 the function $d(s)$, the diameter of the set of æ-maximum of the function $y_N(\cdot, s)$ on $T(s)$, is built (the æ-maximum (minimum) is the set where the maximum (minimum) is attained with the accuracy of æ). Then for a certain $\Lambda > 0$ the set of $\Lambda$-minimum of the function $d(s)$ is built and the supremum $\gamma$ of this set on $[0, 1-\alpha]$ is found. Finally, the maximum is found at the set of æ-maximum of $y_N(\cdot, \gamma)$ on $T(\gamma)$, which is taken to
er ).
Some results for aposteriori detection problems are presented below (with the letter $L$ denoting constants that may actually differ from one occurrence to another).
::(~ (n)
to
0< £<
ae>o
;
1~n~Nl(-1; 1~m~N-N~t1, N"=[oLN/2]
statistics
81
the lattice nodes ($t = n/N$, $s = m/N$). For all $0 \le s \le 1-\alpha$ put $T(s) = \{t : (t,s) \in M\}$,
l=n+m+-i
Y~(rt,m)==*(1-~l*L :X:~Q-
1:}2.= ei+ [A€)N1
field values are equal to
N-n-mL XN(l) ,
l=~
[c£N]~ 11.,*,~hL>N]
by means of linear interpolation. The
N
1
YN(n,tr9= kL XN(i) 2.
Then
s~oo
if
nLX\()-tl:nL>·--li\i.); n:=1, ,,. ,N-~ ; i.==~
wi th
Theorem 2. For all
For solving the problem, the following statistics will be used:
~ ~n. 'iN{n)=
I,(~ (n,[AaN])1
can be chosen to be the estimate e~ of
9,
e
Let $m_N$ be the set of maximums of the function $|Y_N(n)|$ ($[\alpha N] \le n \le [(1-\alpha)N]$). Then an arbitrary element of $m_N$ can be chosen to be the estimate $\hat\theta_1$ of the parameter $\theta_1$.
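The estimate for the abrupt disorder can be illustrated by a minimal sketch. It assumes that $Y_N(n)$ is the difference between the mean of the first $n$ observations and the mean of the remaining $N-n$ observations (the form consistent with the limit process in Theorem 8), and it reports the maximizing index divided by $N$ as the estimate of $\theta_1$. The function name and the toy data are hypothetical, not taken from the report.

```python
def abrupt_disorder_estimate(x, alpha):
    """Return n/N for an n maximizing |Y_N(n)| over [alpha*N] <= n <= [(1-alpha)*N].

    Assumption: Y_N(n) = mean(x[:n]) - mean(x[n:]).
    """
    N = len(x)
    total = sum(x)
    lo = max(1, int(alpha * N))
    hi = min(N - 1, int((1 - alpha) * N))
    best_n, best_val, prefix = None, -1.0, 0.0
    for n in range(1, N):
        prefix += x[n - 1]              # running sum of the first n observations
        if n < lo or n > hi:
            continue                    # stay inside the admissible range
        y = prefix / n - (total - prefix) / (N - n)
        if abs(y) > best_val:           # keep the first maximizer found
            best_n, best_val = n, abs(y)
    return best_n / N


# Noise-free toy sample: the mean jumps from 0 to 1 after observation 60.
sample = [0.0] * 60 + [1.0] * 40
theta_hat = abrupt_disorder_estimate(sample, alpha=0.1)
```

On noisy data the maximizer is no longer exact; the exponential bounds of the theorems in this section describe how fast the error probability decays with $N$.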
" Disappearing" disorder
1nN
"9 1
can be of the
The following theorem about the convergence of $\hat\theta_1$ to $\theta_1$ is true.
K==-l 1~ rt ~ [81N1 f (R, fk 9.z)= a *0
I n this case
used . I f
A€)::=
'f,t(n, m)
9.2.- 91
i~ u~d
Gradual disorder
if if
i s known, then
f(n,~.~)=aiO ( n , ~. 92.)= ~(Il/N)
;
IY!(rt,[A8N)),([~n~[(hL;41) ~1 of Si . Then
Theorem 4. For all $0 < \varepsilon < \varepsilon_0$
$P_\theta\{\|\hat\theta - \theta\| > \varepsilon\} \le L\exp(-L\varepsilon^2 N)$
where $\varphi(t)$ is a continuous function confined between two continuous and monotone (on $[\theta_1, \theta_2]$)
When $\Delta\theta$ is unknown and continuous random field $y_N$
functions. For detection the statistics
by
if
Yrf (n, m)
If
wi th
A
is used . Let Ae",
9 is known , m== [A8N1 i s
the statist i cs used.
is
as the estimate
Bz= 9r t [A9N] .
K=2, f(n,81, 8.z.):= 0
1~ n"*r~t-l1 l~NJ ~ n~ N ; f [&1 Nl ~n,[el.Nl
'1'N (m, n)
is used with $m = [\Delta\theta N]$. In this case an arbitrary point of minimum of the function
I n this case
J
For detection, statistics
Theorem 1. For all $0 < \varepsilon < \varepsilon_0$
-f(n,Bi,9..z.)==O if and [aLNl
Y~ (n, m)
I
A9 >d-> 0
(t,S)
the is buil t
as was desc r ibed above .
~- 91
Y;(n.m)
In this case an arbitrary point of maximum of the function
Algorithm of estimation. For a certain æ > 0 the function $d(s)$, the diameter of the set of æ-minimum of $y_N(\cdot, s)$ on $T(s)$, is built. Then for a certain $\Lambda > 0$ the set of $\Lambda$-minimum of the function $d(s)$ is built and the supremum $\gamma$ of this set on $[\alpha, 1-2\alpha]$ is found. Finally, the maximum is found at the set of æ-minimum of the function $y_N(\cdot, \gamma)$ on $T(\gamma)$, which is taken to be the estimate $\hat\theta_1$ of $\theta_1$. Then $\hat\theta_2 = \hat\theta_1 + \gamma$.
7,.. (.
9
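Theorem 5. For all $0 < \varepsilon < \varepsilon_0$ there exist $\Lambda(\varepsilon) > 0$, æ$(\varepsilon) > 0$, $M(\varepsilon) > 0$ such that the estimate $\hat\theta = (\hat\theta_1, \hat\theta_2)$ obtained by $\Lambda = \Lambda(\varepsilon)$, æ = æ$(\varepsilon)$ satisfies the inequality:
$P_\theta\{\|\hat\theta - \theta\| > \varepsilon\} \le L\exp(-L\,M(\varepsilon)N)$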
$K \ge 2$, $f(n,\theta) = a_i$ if $[\theta_{i-1}N] + 1 \le n \le [\theta_i N]$, $i = 1, \dots, K+1$; $a_i \ne a_j$ if $i \ne j$, and $\min\{\theta_i - \theta_{i-1} : 1 \le i \le K+1\} \ge \alpha > 0$. For detection statistics $Y_N(n,m)$ is used.
For the sake of convenience let us introduce continuous time. For this, by means of linear interpolation a continuous random field $y_N(t,s)$ is built on the set $[0,1] \times [0, 1-\alpha/2]$, which equals $Y_N(n,m)$ in the lattice nodes ($t = n/N$, $s = m/N$).
$y_N(s)$
process
on
£ER
The
. ..,
.....-'"
9K
a r e built as follows . If the se t
o f pa r a -
A.i ' j== 1,
P
K
Sg~ ~
non -i ntersect -
P
with the the t o t a l l ength o f the i n -
,
"' 1
terva l s not exceeding 1, then we pu t A
Sj ~ Sj + cI../4 ~ ==
0
91
) asymptotically unbiased
91 O(1/N)
vergence and
d
'
with the rate of con -
be l ongs , wi th ce r tain C to the c l ass j( . A
Theorem 7. For every estimate $\hat\theta_1(X^N) \in \mathcal{K}$ the following inequality is true:
Ee(ei(X~-ei'>(C/Nt'max(R(U*,8;), R(U~ &-J) i
for $N > (1-2\alpha)^{-1}$, where $u^*$ is the non-zero root of the equation $(2-u)\exp(u) = 2$,
~= Sptt)dx/p(x+a)
,
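The constant $u^*$ in Theorem 7 has no closed form but is easy to compute. The following sketch is only an illustration: the function name and the bracketing interval $[1, 2]$ are choices made here, not part of the report. It uses the facts that $g(u) = (2-u)\exp(u) - 2$ vanishes at $u = 0$, increases up to $u = 1$ and decreases afterwards, so the only non-zero root lies between 1 and 2.

```python
import math


def nonzero_root(lo=1.0, hi=2.0, tol=1e-12):
    """Bisection for the non-zero root of (2 - u) * exp(u) = 2 on [lo, hi]."""
    def g(u):
        return (2.0 - u) * math.exp(u) - 2.0

    # g(1) = e - 2 > 0 and g(2) = -2 < 0, so the root is bracketed.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


u_star = nonzero_root()
```

The value obtained this way lies between 1.59 and 1.60.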
In the above-considered problems the functional limit theorems are also proved, which enable us to find the limit distributions of the statistics used. As an example, the limit theorem for the problem of abrupt disorder is given below.
Theorem 8. If $a = 0$ in the problem of abrupt disorder (i.e. without disorder) and $\sum_{k=1}^{\infty}\sqrt{\psi(k)} < \infty$,
then the process
([Ntl)
THE PROBLEM OF SEQUENTIAL DISORDER DETECTION
and o f the number
and may b e covered by ing interva l s cen t e r s Sj
(on
l e t u s int r oduce the sets
I'
81 ,
formly
est i mate of
weakly converges in the Skorokhod space $D([\alpha, 1-\alpha])$ to the process $(W(t) - tW(1))/(t(1-t))$, where $W(\cdot)$ is the standard Wiener process.
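The form of the limit process can be made plausible by a short computation. It is a heuristic sketch only, under the assumption that $Y_N(n)$ is the difference between the mean of the first $n$ observations and the mean of the remaining ones, and that the process in Theorem 8 is $\sqrt{N}\,Y_N([Nt])$ up to a constant scale factor; $S_n$ denotes the partial sums, a notation introduced here.

```latex
\begin{align*}
Y_N(n) &= \frac{S_n}{n} - \frac{S_N - S_n}{N-n}
        = \frac{N}{n(N-n)}\Bigl(S_n - \frac{n}{N}\,S_N\Bigr),
        \qquad S_n = \sum_{i=1}^{n} x^N(i), \\
\sqrt{N}\,Y_N([Nt]) &\approx \frac{1}{t(1-t)} \cdot \frac{S_{[Nt]} - t\,S_N}{\sqrt{N}} .
\end{align*}
```

If $N^{-1/2}S_{[Nt]}$ behaves like a scaled Wiener process $W(t)$, as an invariance principle would give, the numerator behaves like $W(t) - tW(1)$, which yields the ratio stated in the theorem.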
g = [O} 1- ..(/21
Sg ={Sf 5 : lJN(S)~ B} es ti mates Elt, ... , eK , K
me t ers
c 19'- 9'//
is tru e}. It may be shown that eve r y uni -
VN'1~
De f i n e the r andom
YN (S)::: ~~~1 I ~N(t,5)1 For al l
- E~JII~O(N)\ ~
_{ueXP(-U)/(2(&td-tJ, u>2dl'n& R(U,&)- cf'/( o-M- 1) , U* 2dlhct
Multiple disorder
In this case
A l gorithm~
.....
p. ' a ll J
K ==
1
for
o t he r wi se
Theorem 6 . For all 0<£<£0, 0<6
The sequence of random variables $X = \{X_n\}_{n=1}^{\infty}$ is considered on the probability space $(\Omega, \mathcal{F}, P)$; the one-dimensional distribution function of $X_1, \dots, X_{m-1}$ is equal to $F_0$ and that of $X_m, X_{m+1}, \dots$ to $F_1$. The moment $m$ will be called the moment of disorder. In this case, the standard situation is the assumption $F_1(x) = F_0(x+a)$, $a \ne 0$.
$P_m$ ($E_m$), $P_0$ ($E_0$) designate measures (expectations) corresponding to the sequence $X$ with the moment of disorder $m$ and without disorder. $\mathcal{F}_n$ will designate the σ-algebra generated by $\{X_1, \dots, X_n\}$.
Let $N > 1$ be a fixed number. The problem of sequential detection of the moment of disorder may be formulated as the problem

Now the rate of convergence in disorder problems will be discussed. We shall consider the problem of abrupt disorder and assume that $\xi_1(n) = \xi_2(n) = \xi(n)$, which is a sequence of independent identically distributed random variables with density $p(x)$. For all $C > 0$ and $d \ge 1$, let us consider the class $\mathcal{K}(C,d) := \mathcal{K}$ of the estimates
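$\hat\theta_1(X^N)$ of $\theta_1$ which is defined as follows: $\mathcal{K}(C,d) = \{\hat\theta_1(X^N)$: if $|\theta' - \theta''| \ge d/N$, $(\theta', \theta'') \in [\alpha, 1-\alpha]$, then the inequality $|E_{\theta'}\hat\theta_1(X^N) -$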
of minimization of $\sup_{m \ge N} E_m(\tau - m)^+$ on the set of stopping times $\tau$ (adapted to $\{\mathcal{F}_n\}$) for which $E_0\tau \ge T$.
Algorithm of detection
Let us consider the method of detection with finite memory volume $N$. Two parameters of the algorithm are $0 < \alpha < 0.5$ and $C > 0$. Put
where
$Y_N(m,n) = \frac{1}{m}\sum_{i=n-N+1}^{n-N+m} X_i - \frac{1}{N-m}\sum_{i=n-N+m+1}^{n} X_i$
But the use of statistics of $Y_N(n)$ type in
$m = 1, \dots, N-1$; $n = N, N+1, \dots$ (hereinafter $N > [1/\alpha]$);
$Z_N(n) = \max_{[\alpha N] \le m \le [(1-\alpha)N]} |Y_N(m,n)|$ and introduce the decision function $d_N(n)$: $d_N(n) = 1$ (the hypothesis about the presence of disorder is accepted) if $Z_N(n) > C$, and $d_N(n) = 0$ if $Z_N(n) \le C$.
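The detection rule with finite memory can be sketched as follows. The sketch assumes that $Y_N(m,n)$ compares the mean of the first $m$ observations of the window $X_{n-N+1}, \dots, X_n$ with the mean of the remaining $N-m$, and that the alarm is raised when $Z_N(n)$ exceeds $C$; the function names and the toy data are hypothetical.

```python
def window_statistic(window, alpha):
    """Z_N(n): max over [alpha*N] <= m <= [(1-alpha)*N] of |Y_N(m, n)|."""
    N = len(window)
    total = sum(window)
    lo = max(1, int(alpha * N))
    hi = min(N - 1, int((1 - alpha) * N))
    prefix, z = 0.0, 0.0
    for m in range(1, N):
        prefix += window[m - 1]         # sum of the first m window elements
        if lo <= m <= hi:
            y = prefix / m - (total - prefix) / (N - m)
            z = max(z, abs(y))
    return z


def first_alarm(xs, N, alpha, C):
    """Smallest n >= N with d_N(n) = 1, or None if no alarm is raised."""
    for n in range(N, len(xs) + 1):
        if window_statistic(xs[n - N:n], alpha) > C:
            return n
    return None


# Noise-free toy sequence: the mean jumps from 0 to 1 after observation 50.
data = [0.0] * 50 + [1.0] * 50
alarm = first_alarm(data, N=20, alpha=0.25, C=0.5)
```

In this toy run the alarm is raised at $n = 53$, three observations after the change. A larger threshold $C$ reduces false alarms at the price of a longer delay, which is the trade-off quantified by Theorems 9 and 10.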
Theorem 9. Let $X$ be the sequence of dependent random variables. With $N = N(T) \sim \ln T$ and $C < a/2$ the following estimates are true for the method of detection $d_N$:
Eo
$\sup_{m \ge N(T)} E_m(\tau(d_N) - m)^+ \le L\ln T + o(\ln T)$, $E_0\tau(d_N) \ge T$.
Theorem 10. Let $X$ be the sequence with $\psi$-mixing and $E_0\exp(tX(n)) < \infty$ for $|t| \le H$. Then with $N \to \infty$ the following estimates are true for the method of detection $d_N$: $\sup_{m \ge N}$
Pm{\l~dN) - cI.:~ \ >E} ~Lexpt-LE2N)
$P_0\{d_N(n) = 1\} \le L\exp(-LCN)$
DISORDER PROBLEMS IN A GENERAL SITUATION
Reduction of general disorder problems to the standard situation will now be discussed. Let us first consider the problem of disorder detection in the one-dimensional distribution of the random sequence. It should be noted that the above-formulated theorems can be generalized for this case. As an example, let us consider the following result. Suppose that in the abrupt disorder problem the given sample $X^N$ is:
$x^N(n) = \xi_1(n)$ if $1 \le n \le [\theta_1 N]$, and $x^N(n) = \xi_2(n)$ if $[\theta_1 N] < n \le N$,
where $\{\xi_1(n)\}_{n=1}^{\infty}$, $\{\xi_2(n)\}_{n=1}^{\infty}$ are stationary $\psi$-mixing sequences with values within $[0,1]$, the one-dimensional distributions of them are $F_0$ and $F_1$ respectively, and $|F_i(u+h) - F_i(u)| \le c|h|^{\beta}$, $i = 0, 1$; $\beta > 0$, $c > 0$. Let
$F_0$, $F_1$ be the one-dimensional distribution functions of the sequence before and after the disorder. If
$\tau(d_N)$ will designate the stopping time generated by $d_N$: $\tau(d_N) = \inf\{n \ge N : d_N(n) = 1\}$.
SUp'
practical situations may present some difficulty. So it seems more practicable to reduce general problems to the standard problem of the change of the mean value. The following procedure can be used. Let
$F_0$ differs from the function $F_1$, then there exists a function $f: R^1 \to R^1$ such that
$\int f(x)\,dF_0(x) \ne \int f(x)\,dF_1(x)$ (1)
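Condition (1) can be illustrated on a toy example in which the mean itself does not change but the distribution does. The sketch below is an illustration under assumptions made here: the candidate functions are indicators of disjoint intervals, and the two small samples, the interval edges and the function names are hypothetical.

```python
def indicator_family(edges):
    """Indicators of the disjoint intervals [edges[k], edges[k+1])."""
    return [
        (lambda x, a=a, b=b: 1.0 if a <= x < b else 0.0)
        for a, b in zip(edges[:-1], edges[1:])
    ]


def transformed_means(sample, family):
    """Empirical mean of f(X) for every f in the family."""
    return [sum(f(x) for x in sample) / len(sample) for f in family]


before = [0.1, 0.2, 0.8, 0.9]   # spread-out values, mean 0.5
after = [0.4, 0.5, 0.5, 0.6]    # concentrated values, mean 0.5
family = indicator_family([0.0, 0.25, 0.75, 1.0])

means_before = transformed_means(before, family)
means_after = transformed_means(after, family)
```

Here the raw means coincide, so a detector of the change of the mean applied to the original values has nothing to find, while every indicator-transformed sequence changes its mean (from 0.5, 0, 0.5 to 0, 1, 0).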
In that case the mean of the sequence $Y_n = f(X_n)$ will change. If the function $f$ is apriori unknown, then one can try to find a finite family of functions $\{f_i\}_{i \in I}$ such that at least one of them fulfils (1). For example, if the distributions before and after the disorder are concentrated in a finite number of points, one can use indicators as such a family of functions. It may be recommended to use as such a family a finite number of characteristic functions of non-intersecting subsets of the range of the random sequence. After the family $\{f_i\}_{i \in I}$ is chosen, for each sequence $f_i(X_n)$ the disorder problem is solved by one of the standard algorithms described above; the decision about the disorder and the identification of the disorder time are made using the sum total of decisions (the disorder exists if it is detected in one of the sequences).
A similar method can be proposed for detection of changes in multidimensional distributions. We shall give here one simple example. Suppose one is to detect the change in the correlation function $R(\tau)$ of the sequence $\{X_n\}$ in one or several points $\tau = 0, 1, \dots, K$. Considering the problem for the sequences $\{X_n X_{n+s}\}$, $s = 0, 1, \dots, K$, we reduce it to the problem of detecting the change of the mean. But it must be emphasized that for multidimensional distributions the reduction to the standard situation results in the replacement of the abrupt disorder by the gradual one with known or unknown time of the transition process (as is seen from the above example). Therefore, in the case of multidimensional distributions the algorithms of gradual disorder detection should be applied.
A general situation is reduced to the standard one in the problem of sequential disorder detection using the same procedures.
CONCLUSIONS
Theorem 11. For all
O
e~m Ps{maX~fQH\~ -9.1> t}:::O
This report presents some results concerning different problems of detecting the moments of change in statistical characteristics of random sequences. Nonparametric algorithms for solving such problems are proposed and their characteristics are investigated. The advantage of nonparametric algorithms is that their application requires no apriori information about the distribution of the random process. This fact and the demonstrated efficacy of nonparametric algorithms of disorder detection encourage engineers to use them for failure and fault diagnosis in stochastic control systems.
It seems to us that the following questions are of interest for further studies in disorder detection problems: investigation of optimality and asymptotic optimality of different methods of disorder detection (it should be emphasized that several optimal and asymptotically optimal methods of disorder detection in different situations were given by Shiryayev (1978), Lorden (1971), and Pollak (1985)); comparative analysis of different disorder detection methods; experimental testing of the proposed algorithms of disorder detection in real situations.
Finally, we note that some problems considered in the report were discussed by Darkhovskij and Brodskij (1980), Darkhovskij (1984), and Brodskij and Darkhovskij (1983).
REFERENCES
Brodskij, B.E., and B.S. Darkhovskij (1983). On the fastest detection of the time when the probabilistic characteristics of a random sequence change (in Russian). Avtomatika i Telemekhanika, 10, pp. 101-108.
Darkhovskij, B.S. (1984). On two problems of the time estimation when probabilistic characteristics of a random sequence change. Probability Theory and Applications, 29, pp. 464-473.
Darkhovskij, B.S., and B.E. Brodskij (1980). A posteriori detection of the disorder time for a random sequence. Probability Theory and Applications, 25, pp. 635-639.
Kligiene, N., and L. Telksnys (1983). Methods to determine the times when the properties of random processes change (in Russian). Avtomatika i Telemekhanika, 10, pp. 5-56.
Lorden, G. (1971). Procedures for reacting to a change in distribution. Ann. Math. Statist., 42, pp. 1897-1908.
Pollak, M. (1985). Optimal detection of a change in distribution. The Annals of Statistics, 13, pp. 206-227.
Shiryayev, A.N. (1978). Optimal Stopping Rules. Springer-Verlag, New York.