Statistics & Probability Letters 8 (1989) 371-376
North-Holland

MARKOV PROCESSES AND EXPONENTIAL FAMILIES ON A FINITE SET

Bernard YCART
Département de Mathématiques, Faculté des Sciences, Av. de l'Université, 64000 Pau, France

Received May 1988
Revised September 1988
Abstract: We characterize all continuous time Markov processes on a finite set such that their distribution at any instant is in an exponential family with one parameter.

AMS Subject Classifications: 60J25, 62E10.

Keywords: Markov processes, exponential families.
1. Introduction

The two notions of Markov process and exponential family are usually coupled in two different ways. A first class of papers deals with Markov processes or chains the generator of which belongs to an exponential family of operators. The papers by Küchler (1982a,b) and Hudson (1982) are representative of this point of view. On the other hand, some emphasis has been given to those processes whose distribution is in an exponential family (cf., for instance, Siegmund, 1982). In a recent paper (Ycart, 1988) we characterized those birth and death processes which have at any instant a negative binomial, binomial, or Poisson distribution. The aim of the present note is to characterize all continuous time Markov processes on a finite set such that their distribution at any instant is in an exponential family with one parameter.

Definition 1.1. Let E = {e_1, ..., e_N} be a finite set, μ a positive measure on E and f a mapping from E to ℝ such that f(E) is not reduced to one point. We call exponential family with one parameter on E, generated by μ and f, the family 𝓕 = {P_θ, θ ∈ ℝ} of probability distributions on E such that: ∀θ ∈ ℝ, ∀i = 1, ..., N,

    P_θ(e_i) = μ(e_i) exp(θ f(e_i)) / Σ_{j=1}^N μ(e_j) exp(θ f(e_j)).
In the particular case where E ⊂ ℝ and f is the identity mapping, 𝓕 is called a natural exponential family. Our general reference on exponential families is the book by Barndorff-Nielsen (1978). We suppose that the reader is familiar with the elementary theory of Markov processes on a finite set (cf., for instance, Karlin, 1966). In particular we shall use freely the notion of Markov generator. We understand it as a linear operator whose matrix is non-identically null, with nonnegative coefficients outside the main diagonal (the rates of transition), the sum of the terms in each row being zero.
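To fix ideas, Definition 1.1 can be written down in a few lines of code. The following Python sketch (not part of the paper) computes P_θ from a measure μ, a map f and a parameter θ, and recovers the binomial laws of a natural family when μ(i) is the binomial coefficient and f is the identity:

```python
# A minimal sketch of Definition 1.1 (not from the paper): the exponential family
# generated by a measure mu and a map f on a finite set E, evaluated at a parameter theta.
import numpy as np
from math import comb

def exponential_family(mu, f, theta):
    """P_theta(e_i) = mu(e_i) exp(theta f(e_i)) / sum_j mu(e_j) exp(theta f(e_j))."""
    mu, f = np.asarray(mu, dtype=float), np.asarray(f, dtype=float)
    w = mu * np.exp(theta * f)
    return w / w.sum()

# Example: E = {0, ..., n} with mu(i) = C(n, i) and f the identity gives the binomial laws.
n = 4
mu = [comb(n, i) for i in range(n + 1)]
print(exponential_family(mu, range(n + 1), theta=0.0))   # Binomial(n, 1/2)
```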
Definition 1.2. Let 𝓕 = {P_θ, θ ∈ ℝ} be an exponential family with one parameter and A a Markov generator on the finite set E. The generator A is said to be stable for the family 𝓕 if there exists:
- a Markov process {X_t, t ≥ 0} with generator A,
- an open interval I in [0, +∞[,
- a differentiable mapping θ(t) from I to ℝ,
such that for all t in I, θ'(t) ≠ 0 and the distribution of X_t is P_{θ(t)}. The hypothesis θ'(t) ≠ 0 avoids the trivial case when {X_t, t ≥ 0} is a stationary process.

Definition 1.3. A Markov generator A on E is said to be exponentially stable if there exists an exponential family with one parameter 𝓕 on E such that A is stable for 𝓕.

Our basic example of an exponentially stable generator is the following:

Example 1.4. Let (a_{ij}), i, j = 0, ..., n, be the matrix of a Markov generator A on E = {0, ..., n} such that:

    a_{ij} = 0 if |i - j| ≥ 2,   and   a_{i,i+1} = (n - i)λ,   a_{i,i-1} = iμ,   ∀i = 0, ..., n,

where λ and μ are nonnegative reals such that λμ > 0. Let {X_t, t ≥ 0} be a Markov process on E with generator A; it is a linear growth birth and death process (cf. Ycart, 1988). The following result can easily be verified: if the distribution of X_0 is binomial with parameters n and p(0), then for each t > 0 the distribution of X_t is binomial with parameters n and p(t), where:

    p(t) = λ/(λ + μ) + (p(0) - λ/(λ + μ)) exp(-(λ + μ)t).

Our first step towards a complete characterization of exponentially stable generators on a finite set is the following: on a finite set of reals, the generator of Example 1.4 is, up to an affine transformation of E, the only one to be stable for a natural exponential family on E. This is the main result of Section 2 (Theorem 2.1). In Section 3, we consider the general case: Theorem 3.1 shows that any exponentially stable generator can be naturally associated to that of Example 1.4. It then completes the characterization by showing that, conversely, a generator associated to that of Example 1.4 is indeed exponentially stable. Moreover, the distribution at any instant of a process with that generator can be computed explicitly. A simple example is provided.
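The binomial claim of Example 1.4 is easy to check numerically. The following Python sketch (not part of the original paper; the values of n, λ, μ, p(0) and t are arbitrary) builds the generator of Example 1.4, propagates a binomial initial distribution with the matrix exponential, and compares the result with the binomial distribution of parameter p(t):

```python
# Numerical check of Example 1.4 (a sketch; n, lam, mu, p0, t are arbitrary test values).
import numpy as np
from scipy.linalg import expm
from math import comb

n, lam, mu, p0, t = 5, 0.7, 0.3, 0.2, 1.3

# Generator of the linear growth birth and death process on {0, ..., n}.
A = np.zeros((n + 1, n + 1))
for i in range(n + 1):
    if i < n:
        A[i, i + 1] = (n - i) * lam
    if i > 0:
        A[i, i - 1] = i * mu
    A[i, i] = -A[i].sum()

binom = lambda p: np.array([comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)])

# Distribution of X_t: row vector of initial probabilities times exp(tA).
dist_t = binom(p0) @ expm(t * A)

# Binomial(n, p(t)) with p(t) = lam/(lam+mu) + (p0 - lam/(lam+mu)) exp(-(lam+mu) t).
p_t = lam / (lam + mu) + (p0 - lam / (lam + mu)) * np.exp(-(lam + mu) * t)

print(np.allclose(dist_t, binom(p_t)))   # expected: True
```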
2. Markov processes and natural exponential families

The purpose of this section is to prove the uniqueness of the generator of Example 1.4, as an exponentially stable generator for a natural exponential family.

Theorem 2.1. Let E = {x_0, ..., x_n} be a finite subset of ℝ. With no loss of generality, we assume that x_0 < x_1 < ... < x_n. Let A, with matrix (a_{ij}), i, j = 0, ..., n, be a Markov generator on E, stable for a natural exponential family 𝓕 on E generated by a positive measure μ. Then:
(a) There exists an affine mapping g from ℝ to ℝ such that g(x_i) = i, ∀i = 0, ..., n.
(b) The exponential family 𝓕 is formed by the image measures by g^{-1} of the binomial distributions on {0, ..., n}.
(c) The matrix (a_{ij}), i, j = 0, ..., n, is that of Example 1.4.
Conversely, (a) and (c) are sufficient conditions for A to be exponentially stable; this is a straightforward extension of the result quoted in Example 1.4.

We shall give here only the main steps of the proof of Theorem 2.1. Each of these steps can be proved by elementary means. According to the notations of Definitions 1.1 and 1.2, let {X_t, t ≥ 0} be a Markov process with generator A on E, μ a positive measure on E, and θ(t) a differentiable mapping from some open interval I into ℝ such that the distribution of X_t is P_{θ(t)} ∀t ∈ I, where for all i = 0, ..., n:
    P_{θ(t)}(x_i) = μ(x_i) exp(θ(t)x_i) / Σ_{j=0}^n μ(x_j) exp(θ(t)x_j).                      (1)

The first step is the equation (2) below, obtained by comparing the derivatives in t of P_{θ(t)}(x_i), computed in two different ways. Firstly, by differentiating the expression (1) one gets, for all i = 0, ..., n and all t in I:

    dP_{θ(t)}(x_i)/dt = θ'(t) P_{θ(t)}(x_i) Σ_{j=0}^n (x_i - x_j) μ(x_j) exp(θ(t)x_j) / Σ_{j=0}^n μ(x_j) exp(θ(t)x_j).

Secondly, the general expression for the derivative of the distribution of a Markov process (cf. Karlin, 1966) gives:

    dP_{θ(t)}(x_i)/dt = Σ_{j=0}^n a_{ji} P_{θ(t)}(x_j),

that can be rewritten as:

    dP_{θ(t)}(x_i)/dt = P_{θ(t)}(x_i) Σ_{j=0}^n a_{ji} exp(θ(t)(x_j - x_i)) μ(x_j)/μ(x_i).

Comparing these two expressions for i and k, and eliminating θ'(t), gives: ∀i, k = 0, ..., n, ∀t ∈ I,

    (Σ_{j=0}^n a_{ji} exp(θ(t)(x_j - x_i)) μ(x_j)/μ(x_i)) (Σ_{j=0}^n (x_k - x_j) μ(x_j) exp(θ(t)x_j))
        = (Σ_{j=0}^n a_{jk} exp(θ(t)(x_j - x_k)) μ(x_j)/μ(x_k)) (Σ_{j=0}^n (x_i - x_j) μ(x_j) exp(θ(t)x_j)).      (2)
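The comparison leading to (2) can be illustrated numerically. The sketch below (not part of the paper; n, λ, μ, θ are arbitrary test values) evaluates both expressions for dP_{θ(t)}(x_i)/dt on the generator of Example 1.4 with a binomial base measure, using the value of θ'(t) imposed by the binomial flow of Example 1.4, and checks that they coincide; (2) then follows by eliminating θ'(t):

```python
# Sketch: the two expressions for dP_theta(t)(x_i)/dt, compared on Example 1.4 with
# mu(x_i) = binomial coefficients (so P_theta is binomial); n, lam, mu_r, theta are test values.
import numpy as np
from math import comb

n, lam, mu_r, theta = 4, 0.6, 0.9, 0.3

x = np.arange(n + 1)
w = np.array([comb(n, i) for i in range(n + 1)], dtype=float)    # base measure mu
A = np.zeros((n + 1, n + 1))                                     # generator of Example 1.4
for i in range(n + 1):
    if i < n: A[i, i + 1] = (n - i) * lam
    if i > 0: A[i, i - 1] = i * mu_r
    A[i, i] = -A[i].sum()

Z = (w * np.exp(theta * x)).sum()
P = w * np.exp(theta * x) / Z                                    # P_theta(t), eq. (1)

# theta'(t) for the binomial flow of Example 1.4: p = e^theta/(1+e^theta),
# p' = lam - (lam + mu_r) p, and p' = p (1-p) theta'.
p = np.exp(theta) / (1 + np.exp(theta))
dtheta = (lam - (lam + mu_r) * p) / (p * (1 - p))

# First expression: differentiate (1) in t.
d1 = dtheta * P * np.array([((xi - x) * w * np.exp(theta * x)).sum() for xi in x]) / Z
# Second expression: forward equation dP(x_i)/dt = sum_j a_ji P(x_j).
d2 = P @ A

print(np.allclose(d1, d2))   # expected: True; eliminating theta'(t) between the two gives (2)
```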
To derive further necessary conditions from (2), one uses extensively the following lemma of elementary analysis (cf. Barndorff-Nielsen, 1978, Theorem 7.2).

Lemma 2.2. Let z_1 < z_2 < ... < z_m be m real numbers. Suppose that there exists an open interval J in ℝ and m reals c_1, ..., c_m such that:

    ∀θ ∈ J,   Σ_{j=1}^m c_j exp(θ z_j) = 0.

Then necessarily c_1 = ... = c_m = 0.
The next step is the following:

    ∀i, j = 0, ..., n,   if |i - j| > 1, then a_{ij} = 0.                      (3)

To get (3), we first prove that a_{ij} is zero if i < j - 1: take i < j - 1 and k = i + 1 in (2), apply Lemma 2.2, and repeat the same argument. Symmetrically, one proves that a_{ij} is zero if i > j + 1, by increasing induction on j.

Now, we prove part (a) of Theorem 2.1 under the following equivalent form:

    ∀i = 1, ..., n - 1,   x_{i+1} - x_i = x_1 - x_0.                      (4)

Due to (3), one of the a_{i+1,i}'s or one of the a_{i-1,i}'s is non-zero (otherwise the generator would be identically zero). Suppose that a_{k+1,k} is non-zero for some k and zero for bigger indices. Then the strongest exponents in the left and right members of (2) have to be the same, by Lemma 2.2. This implies that x_{i+1} - x_i = x_{k+1} - x_k, which is (4). Now, since the x_i's are regularly spaced, they can be mapped onto {0, ..., n} by an affine transformation. Therefore we can replace x_i by i in equation (2) without changing it essentially. Taking k = 0 and applying again Lemma 2.2 leads to the following relations:
    ∀i = 0, ..., n - 1,   a_{i,i+1} μ(x_i)/μ(x_{i+1}) = (i + 1) a_{0,1} μ(x_0)/μ(x_1),

    ∀i = 0, ..., n,       a_{i,i} = (1 - i) a_{0,0} + i a_{1,1},                                  (5)

    ∀i = 1, ..., n,       a_{i,i-1} μ(x_i)/μ(x_{i-1}) = ((n - i + 1)/n) a_{1,0} μ(x_1)/μ(x_0).
The difference equation (6) below can be obtained by writing that a_{i,i-1} + a_{i,i} + a_{i,i+1} = 0 and using the relations (5). In view of Theorem 3.1, it is also important to remark that it can be derived directly from the equation (2). With the convention μ(x_{n+1}) = 0,

    ∀i = 1, ..., n,
    ((n - i + 1)/n) a_{1,0} (μ(x_1)/μ(x_0)) μ(x_{i-1}) + (i + 1) a_{0,1} (μ(x_0)/μ(x_1)) μ(x_{i+1})
        + [(1 - i) a_{0,0} + i a_{1,1}] μ(x_i) = 0.                                  (6)
We turn now to solving the difference equation (6). As expected, if μ(x_i), i = 0, ..., n, is a solution, then so is cμ(x_i), i = 0, ..., n, so that μ(x_0) and μ(x_1) can be chosen arbitrarily. We take μ(x_0) = 1 and μ(x_1) = n. Let g be the generating function of the μ(x_i)'s. Multiplying the equation (6) by z^i and summing over i leads to the following differential equation in g:

    g(z)[n a_{1,0} z + a_{0,0}] + g'(z)[-a_{1,0} z^2 + (a_{1,1} - a_{0,0}) z + a_{0,1}/n] = 0.

Solving this equation, one checks easily that it admits a solution g such that g(0) = 1 and g(z) > 0 for all z ≥ 0 iff:

    a_{1,1} + a_{1,0} + a_{0,0} (1 - n)/n = 0.                                  (7)

In this case this solution is g(z) = (1 + z)^n. Finally, under the condition (7), the general solution of (6) is:

    μ(x_i) = μ(x_0) \binom{n}{i} (μ(x_1)/(n μ(x_0)))^i,   ∀i = 0, ..., n,             (8)

which is part (b) of the theorem. Part (c) is a consequence of (5), (7) and (8).
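The relations (6)-(8) are easy to check numerically on the generator of Example 1.4. The following Python sketch (not part of the paper; n, λ, μ are arbitrary test values) verifies condition (7) and checks that the binomial coefficients, i.e. the solution (8) normalized by μ(x_0) = 1 and μ(x_1) = n, satisfy the difference equation (6):

```python
# Sanity check of (6)-(8) on the generator of Example 1.4 (a sketch; n, lam, mu are test values).
from math import comb, isclose

n, lam, mu = 6, 0.4, 1.1

a = lambda i, j: ((n - i) * lam if j == i + 1 else
                  i * mu        if j == i - 1 else
                  -((n - i) * lam + i * mu) if j == i else 0.0)

w = [comb(n, i) for i in range(n + 1)]       # candidate measure mu(x_i): eq. (8) with mu_0 = 1, mu_1 = n

# Condition (7): a_11 + a_10 + a_00 (1 - n)/n = 0.
print(isclose(a(1, 1) + a(1, 0) + a(0, 0) * (1 - n) / n, 0.0, abs_tol=1e-12))

# Difference equation (6) for i = 1, ..., n (with mu(x_{n+1}) taken as 0).
w_ext = w + [0.0]
ok = all(isclose(((n - i + 1) / n) * a(1, 0) * (w[1] / w[0]) * w[i - 1]
                 + (i + 1) * a(0, 1) * (w[0] / w[1]) * w_ext[i + 1]
                 + ((1 - i) * a(0, 0) + i * a(1, 1)) * w[i], 0.0, abs_tol=1e-9)
         for i in range(1, n + 1))
print(ok)   # expected: True
```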
3. General case
According to Definition 1.1, we consider an exponential family 𝓕 = {P_θ, θ ∈ ℝ} on E = {e_1, ..., e_N}, generated by μ and f. We denote by F = {x_0, ..., x_n} the set f(E). Let ν be the image measure of μ by f; it is a positive measure on the set of reals F. We denote by 𝒢 = {Q_θ, θ ∈ ℝ} the natural exponential family on F generated by ν. Notice that Q_θ is the image measure of P_θ by f. The main result of this section is that, if a generator A is stable for the family 𝓕, then it is possible to associate to A a generator B on F which is stable for the family 𝒢. Then necessarily B is the generator of Example 1.4, as Theorem 2.1 shows.

Theorem 3.1. Let A be a generator on E, stable for the family 𝓕. Denote by a_{ij}, i, j = 1, ..., N, the infinitesimal rate of transition from e_i to e_j. Then:
(a) ∀j, j' = 1, ..., N, ∀h, k = 0, ..., n, if f(e_j) = f(e_{j'}) = x_k, then

    Σ_{i s.t. f(e_i)=x_h} a_{ij} μ(e_i)/μ(e_j) = Σ_{i s.t. f(e_i)=x_h} a_{ij'} μ(e_i)/μ(e_{j'}).

We denote by B_{hk} the common value of this sum and let b_{hk} = B_{hk} ν(x_k)/ν(x_h), ∀h, k = 0, ..., n.
(b) The b_{hk}'s, h, k = 0, ..., n, are the infinitesimal rates of transition of a Markov process on F. Let B be the corresponding generator. Let {S(t), t ≥ 0} and {Z(t), t ≥ 0} be the semi-groups generated by A and B respectively.
(c) ∀t > 0, ∀θ, θ' ∈ ℝ,

    S(t)*P_θ = P_{θ'}   ⟺   Z(t)*Q_θ = Q_{θ'},

where * denotes the transposition of operators.
Part (c) of this theorem can be used in two ways. On the one hand, it states that the generator B of part (b) is stable for the natural family 𝒢, and the function θ(t) of Definition 1.2 is the same for A and B. Now, due to Theorem 2.1, B is necessarily the generator of Example 1.4, up to an affine transformation of F. Conversely, starting from the generator of Example 1.4 one can construct a generator A, a positive measure μ and a function f on an arbitrary set E that satisfy the equations of part (a) in Theorem 3.1. Then A will be stable for the family 𝓕 generated by μ and f. Let {X_t, t ≥ 0} be a Markov process with generator A. If the distribution of X_0 is in 𝓕, then so is the distribution of X_t for any t > 0, and it is possible to compute the explicit expression of this distribution from the result of Example 1.4.

Example 3.2. Consider E = {e_1, e_2, e_3}. Let μ(e_1) = 1, μ(e_2) = 2, μ(e_3) = 3, f(e_1) = f(e_2) = 0 and f(e_3) = 1. The family 𝓕 is the family of distributions on E such that: P(e_1) = p, P(e_2) = 2p, P(e_3) = 1 - 3p, for some p in ]0, 1/3[. Take a_{11} = -3, a_{12} = 1, a_{13} = 2, a_{21} = 1, a_{22} = -3/2, a_{23} = 1/2, a_{31} = 1/3, a_{32} = 2/3, a_{33} = -1. Let {X_t, t ≥ 0} be a Markov process with generator A. It is easy to check that if the distribution of X_0 is (p(0), 2p(0), 1 - 3p(0)), then for any t, the distribution of X_t is (p(t), 2p(t), 1 - 3p(t)), where:

    p(t) = (1/6)[1 + (6p(0) - 1) exp(-2t)].
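To make the construction of Theorem 3.1 concrete, the following Python sketch (not part of the paper; the time t and initial value p(0) are arbitrary test values) takes the data of Example 3.2, verifies part (a), builds the collapsed generator B of part (b) (here the two-state generator of Example 1.4 with n = 1 and λ = μ = 1), and checks the explicit distribution of X_t:

```python
# Sketch: Theorem 3.1 (a)-(b) and Example 3.2 in numbers (not from the paper; t, p0 are test values).
import numpy as np
from scipy.linalg import expm

mu_E = np.array([1.0, 2.0, 3.0])           # measure mu on E = {e_1, e_2, e_3}
f    = np.array([0, 0, 1])                 # f(e_1) = f(e_2) = 0, f(e_3) = 1
A = np.array([[-3.0, 1.0, 2.0],
              [ 1.0, -1.5, 0.5],
              [ 1/3,  2/3, -1.0]])

F  = np.unique(f)                          # F = {0, 1}
nu = np.array([mu_E[f == x].sum() for x in F])   # image measure: nu(0) = 3, nu(1) = 3

# Part (a): for j, j' with f(e_j) = f(e_j') = x_k, the sum over {i : f(e_i) = x_h}
# of a_ij mu(e_i)/mu(e_j) does not depend on the choice of j.  B_big[h, k] is the common value.
B_big = np.zeros((len(F), len(F)))
for h, xh in enumerate(F):
    for k, xk in enumerate(F):
        vals = [sum(A[i, j] * mu_E[i] / mu_E[j] for i in range(3) if f[i] == xh)
                for j in range(3) if f[j] == xk]
        assert np.allclose(vals, vals[0])  # part (a)
        B_big[h, k] = vals[0]

# Part (b): b_hk = B_hk nu(x_k)/nu(x_h) is a Markov generator on F (here: rates 1 and 1).
B = B_big * nu[np.newaxis, :] / nu[:, np.newaxis]
print(B)                                   # expected: [[-1, 1], [1, -1]]

# Example 3.2: the family (p, 2p, 1 - 3p) is preserved, with p(t) = (1/6)[1 + (6 p(0) - 1) e^{-2t}].
p0, t = 0.1, 0.8
dist_t = np.array([p0, 2 * p0, 1 - 3 * p0]) @ expm(t * A)
p_t = (1 + (6 * p0 - 1) * np.exp(-2 * t)) / 6
print(np.allclose(dist_t, [p_t, 2 * p_t, 1 - 3 * p_t]))   # expected: True
```

The computed B being the symmetric two-state generator explains the constants in p(t): the image process f(X_t) relaxes at rate λ + μ = 2 towards the stationary distribution (1/2, 1/2) on F, and P(X_t = e_1) = p(t) is one third of the mass at 0, hence the stationary value 1/6.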
The proof of parts (a) and (b) of Theorem 3.1 is analogous to that of Theorem 2.1. One first derives the analogue of the equation (2), from which part (a) follows directly by Lemma 2.2. Part (b) is also a
consequence of this equation. For part (c), one introduces the matrix M defined as follows: its rows and columns are indexed by E and F respectively. The element corresponding to row e_j and column x_h is non-zero iff f(e_j) = x_h, and its value in this case is μ(e_j)/ν(x_h). With this definition, one has:

    ∀θ ∈ ℝ,   P_θ = M Q_θ,                                  (9)

and

    A*M = M B*,                                             (10)

where A* and B* are the transposed matrices of A and B respectively. The identity (9) is a direct consequence of the definitions, while (10) is another writing of part (a) of the theorem. From (10), one deduces by induction that, for all integers m ≥ 1:
    (A^m)* M = M (B^m)*,

and this implies that for all t > 0:

    (exp(tA*)) M = M (exp(tB*)),

where the exponential of a matrix is defined in the usual way. Now, in terms of the semi-groups (S(t)) and (Z(t)), we have:

    S(t)* M = M Z(t)*   for all t > 0.
Finally:

    S(t)*P_θ = P_{θ'}  ⟺  S(t)*M Q_θ = M Q_{θ'}  ⟺  M Z(t)*Q_θ = M Q_{θ'}  ⟺  Z(t)*Q_θ = Q_{θ'}.

The last implication is the only one which is not obvious. To see that it is true, one has to define a left inverse M' to M as follows:

    M' = (m'_{hi}),   h = 0, ..., n,   i = 1, ..., N,

where

    m'_{hi} = 1 if f(e_i) = x_h,   and   m'_{hi} = 0 if not.

This ends the proof of Theorem 3.1.
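The objects used in this proof are easy to exhibit numerically. The sketch below (not in the paper; θ = 0.4 is an arbitrary test value) builds M and its left inverse M' for the data of Example 3.2 and checks M'M = I together with the identities (9) and (10):

```python
# Sketch: the matrix M of the proof, its left inverse M', and identities (9) and (10),
# checked on the data of Example 3.2 (the test value theta = 0.4 is arbitrary).
import numpy as np

mu_E = np.array([1.0, 2.0, 3.0])                     # mu on E
f    = np.array([0, 0, 1])                           # f(e_1) = f(e_2) = 0, f(e_3) = 1
A = np.array([[-3.0, 1.0, 2.0], [1.0, -1.5, 0.5], [1/3, 2/3, -1.0]])
B = np.array([[-1.0, 1.0], [1.0, -1.0]])             # collapsed generator on F = {0, 1}
F  = np.unique(f)
nu = np.array([mu_E[f == x].sum() for x in F])

# M: rows indexed by E, columns by F; entry mu(e_j)/nu(x_h) iff f(e_j) = x_h.
M = np.array([[mu_E[j] / nu[h] if f[j] == F[h] else 0.0 for h in range(len(F))]
              for j in range(len(mu_E))])
# M': left inverse, entry 1 iff f(e_i) = x_h.
Mp = np.array([[1.0 if f[i] == F[h] else 0.0 for i in range(len(mu_E))]
               for h in range(len(F))])

print(np.allclose(Mp @ M, np.eye(len(F))))           # M' M = identity on F
print(np.allclose(A.T @ M, M @ B.T))                 # identity (10): A* M = M B*

theta = 0.4
P = mu_E * np.exp(theta * f)                         # P_theta on E (unnormalized)
P /= P.sum()
Q = nu * np.exp(theta * F)                           # Q_theta on F (unnormalized)
Q /= Q.sum()
print(np.allclose(P, M @ Q))                         # identity (9): P_theta = M Q_theta
```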
References

Barndorff-Nielsen, O. (1978), Information and Exponential Families in Statistical Theory (Wiley, New York).
Hudson, I.L. (1982), Large sample inference for Markovian exponential families with application to branching processes with immigration, Austral. J. Statist. 24 (1), 98-112.
Karlin, S. (1966), A First Course in Stochastic Processes (Academic Press, New York).
Küchler, U. (1982a), Exponential families of Markov processes, Part 1: General results, Math. Operationsforsch. Statist. 13 (1), 57-69.
Küchler, U. (1982b), Exponential families of Markov processes, Part 2: Birth and death processes, Math. Operationsforsch. Statist. 13 (2), 219-230.
Siegmund, D. (1982), Large deviations for boundary crossing probabilities, Ann. Probab. 10 (3), 581-588.
Ycart, B. (1988), A characteristic property of linear growth birth and death processes, Sankhyā Ser. A 50 (2), 184-189.