
Statistics & Probability Letters 8 (1989) 371-376
North-Holland, September 1989

MARKOV PROCESSES AND EXPONENTIAL FAMILIES ON A FINITE SET

Bernard YCART
Département de Mathématiques, Faculté des Sciences, Av. de l'Université, 64000 Pau, France

Received May 1988
Revised September 1988

Abstract: We characterize all continuous time Markov processes on a finite set such that their distribution at any instant is in an exponential family with one parameter.

AMS Subject Classifications: 60J25, 62E10.

Keywords: Markov processes, exponential families.

1. Introduction

The two notions of Markov process and exponential family are usually coupled in two different ways. A first class of papers deals with Markov processes or chains the generator of which belongs to an exponential family of operators. The papers by Küchler (1982a,b) and Hudson (1982) are representative of this point of view. On the other hand, some emphasis has been given to those processes whose distribution is in an exponential family (cf., for instance, Siegmund, 1982). In a recent paper (Ycart, 1988) we characterized those birth and death processes which have at any instant a negative binomial, binomial, or Poisson distribution. The aim of the present note is to characterize all continuous time Markov processes on a finite set such that their distribution at any instant is in an exponential family with one parameter.

Definition 1.1. Let E = {e_1, ..., e_N} be a finite set, μ a positive measure on E and f a mapping from E to ℝ such that f(E) is not reduced to one point. We call exponential family with one parameter on E, generated by μ and f, the family ℱ = {P_θ, θ ∈ ℝ} of probability distributions on E such that: ∀θ ∈ ℝ, ∀i = 1, ..., N,

  P_θ(e_i) = μ(e_i) exp(θf(e_i)) / Σ_{j=1}^{N} μ(e_j) exp(θf(e_j)).

In the particular case where E ⊂ ℝ and f is the identity mapping, ℱ is called a natural exponential family. Our general reference on exponential families is the book by Barndorff-Nielsen (1978). We suppose that the reader is familiar with the elementary theory of Markov processes on a finite set (cf., for instance, Karlin, 1966). In particular we shall use freely the notion of Markov generator. We understand it as a linear operator whose matrix is non-identically null, with nonnegative coefficients outside the main diagonal (the rates of transition), the sum of the terms in each row being zero.

0167-7152/89/$3.50 © 1989, Elsevier Science Publishers B.V. (North-Holland)


Definition 1.2. Let ℱ = {P_θ, θ ∈ ℝ} be an exponential family with one parameter and A a Markov generator on the finite set E. The generator A is said to be stable for the family ℱ if there exist:
- a Markov process {X_t, t ≥ 0} with generator A,
- an open interval I in [0, +∞[,
- a differentiable mapping θ(t) from I to ℝ,
such that for all t in I, θ'(t) ≠ 0 and the distribution of X_t is P_{θ(t)}. The hypothesis θ'(t) ≠ 0 avoids the trivial case when {X_t, t ≥ 0} is a stationary process.

Definition 1.3. A Markov generator A on E is said to be exponentially stable if there exists an exponential family with one parameter ℱ on E such that A is stable for ℱ.

Our basic example of an exponentially stable generator is the following:

Example 1.4. Let (a_ij), i, j = 0, ..., n, be the matrix of a Markov generator A on E = {0, ..., n} such that:

  a_ij = 0 if |i − j| ≥ 2, and a_{i,i+1} = (n − i)λ, a_{i,i−1} = iμ, ∀i = 0, ..., n,

where λ and μ are nonnegative reals such that λμ > 0. Let {X_t, t ≥ 0} be a Markov process on E with generator A; it is a linear growth birth and death process (cf. Ycart, 1988). The following result can easily be verified: if the distribution of X_0 is binomial with parameters n and p(0), then for each t > 0, the distribution of X_t is binomial with parameters n and p(t), where:

  p(t) = λ/(λ + μ) + (p(0) − λ/(λ + μ)) exp(−(λ + μ)t).

Our first step towards a complete characterization of exponentially stable generators on a finite set is the following: on a finite set of reals, the generator of Example 1.4 is, up to an affine transformation of E, the only one to be stable for a natural exponential family on E. This is the main result of Section 2 (Theorem 2.1). In Section 3, we consider the general case: Theorem 3.1 shows that any exponentially stable generator can be naturally associated to that of Example 1.4. It then completes the characterization by showing that conversely a generator associated to that of Example 1.4 is indeed exponentially stable. Moreover, the distribution at any instant of a process with that generator can be computed explicitly. A simple example is provided.
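The binomial stability claimed in Example 1.4 can be checked numerically. The sketch below is not part of the paper; the values of n, lam, mu, t and p0 are illustrative, and the matrix exponential is computed by a truncated power series to keep the check self-contained.

```python
import numpy as np
from math import comb, exp

def generator(n, lam, mu):
    """Linear growth birth-and-death generator of Example 1.4 on {0, ..., n}."""
    A = np.zeros((n + 1, n + 1))
    for i in range(n + 1):
        if i < n:
            A[i, i + 1] = (n - i) * lam    # birth rate a_{i,i+1}
        if i > 0:
            A[i, i - 1] = i * mu           # death rate a_{i,i-1}
        A[i, i] = -A[i].sum()              # rows of a generator sum to zero
    return A

def expm(M, terms=80):
    """Matrix exponential by truncated power series (adequate for this small ||M||)."""
    out, term = np.eye(len(M)), np.eye(len(M))
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

def binom_law(n, p):
    return np.array([comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)])

n, lam, mu, t, p0 = 5, 1.0, 2.0, 0.3, 0.4  # illustrative values
A = generator(n, lam, mu)
Pt = expm(t * A).T @ binom_law(n, p0)      # law of X_t when X_0 ~ Binomial(n, p0)
pt = lam / (lam + mu) + (p0 - lam / (lam + mu)) * exp(-(lam + mu) * t)
assert np.allclose(Pt, binom_law(n, pt), atol=1e-10)
```

The check agrees with the classical argument: the count process is a superposition of n independent two-state chains, each converging at rate λ + μ.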

2. Markov processes and natural exponential families

The purpose of this section is to prove the uniqueness of the generator of Example 1.4, as an exponentially stable generator for a natural exponential family.

Theorem 2.1. Let E = {x_0, ..., x_n} be a finite subset of ℝ. With no loss of generality, we assume that x_0 < x_1 < ... < x_n. Let A = (a_ij), i, j = 0, ..., n, be a Markov generator on E which is stable for a natural exponential family ℱ on E. Then:
(a) The x_i's are regularly spaced: there exists an affine mapping g such that g(x_i) = i, ∀i = 0, ..., n.
(b) The exponential family ℱ is formed by the image measures by g^{−1} of the binomial distributions on {0, ..., n}.
(c) The matrix (a_ij), i, j = 0, ..., n, is that of Example 1.4.

Conversely (a) and (c) are sufficient conditions for A to be exponentially stable; this is a straightforward extension of the result quoted in Example 1.4.

We shall give here only the main steps of the proof of Theorem 2.1. Each of these steps can be proved by elementary means. According to the notations of Definitions 1.1 and 1.2, let {X_t, t ≥ 0} be a Markov process with generator A on E, μ a positive measure on E, and θ(t) a differentiable mapping from some open interval I into ℝ such that the distribution of X_t is P_{θ(t)}, ∀t ∈ I, where for all i = 0, ..., n:

  P_{θ(t)}(x_i) = μ(x_i) exp(θ(t)x_i) / Σ_{j=0}^{n} μ(x_j) exp(θ(t)x_j).   (1)

The first step is the equation (2) below, obtained by computing dP_{θ(t)}(x_k)/dt in two different ways. Firstly, by differentiating in t the expression (1), one gets for all i = 0, ..., n and all t in I:

  dP_{θ(t)}(x_i)/dt = θ'(t) P_{θ(t)}(x_i) Σ_{j=0}^{n} (x_i − x_j) μ(x_j) exp(θ(t)x_j) / Σ_{j=0}^{n} μ(x_j) exp(θ(t)x_j).

Secondly, the general expression for the derivative of the distribution of a Markov process (cf. Karlin, 1966) gives:

  dP_{θ(t)}(x_i)/dt = Σ_{j=0}^{n} a_{ji} P_{θ(t)}(x_j),

that can be rewritten as:

  dP_{θ(t)}(x_i)/dt = P_{θ(t)}(x_i) Σ_{j=0}^{n} a_{ji} exp(θ(t)(x_j − x_i)) μ(x_j)/μ(x_i).

Comparing these two expressions for i and k gives: ∀i, k = 0, ..., n, ∀t ∈ I,

  Σ_{j=0}^{n} a_{ji} exp(θ(t)(x_j − x_i)) μ(x_j)/μ(x_i) − Σ_{j=0}^{n} a_{jk} exp(θ(t)(x_j − x_k)) μ(x_j)/μ(x_k) = θ'(t)(x_i − x_k).   (2)

To derive further necessary conditions from (2), one uses extensively the following lemma of elementary analysis (cf. Barndorff-Nielsen, 1978, Theorem 7.2).

Lemma 2.2. Let z_1 < z_2 < ... < z_m be m real numbers. Suppose that there exists an open interval J in ℝ and m reals c_1, ..., c_m such that: ∀θ ∈ J,

  Σ_{j=1}^{m} c_j exp(θz_j) = 0.

Then necessarily c_1 = ... = c_m = 0.
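Lemma 2.2 amounts to the linear independence of the functions θ ↦ exp(θz_j) for distinct exponents z_j. The following illustration is not from the paper; the exponents and sample points are arbitrary. It checks that the generalized Vandermonde matrix (exp(θ_i z_j)) has full rank, so a combination vanishing at m distinct values of θ (in particular on an interval) must have all coefficients zero.

```python
import numpy as np

z = np.array([-1.0, 0.5, 2.0, 3.5])     # distinct exponents z_1 < ... < z_m
theta = np.linspace(0.0, 1.0, len(z))   # m sample points of theta in an interval J
V = np.exp(np.outer(theta, z))          # V[i, j] = exp(theta_i * z_j)

# The generalized Vandermonde matrix V is invertible, so V c = 0 forces c = 0.
assert np.linalg.matrix_rank(V) == len(z)
c = np.linalg.solve(V, np.zeros(len(z)))
assert np.allclose(c, 0.0)
```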


The next step is the following:

  ∀i, j = 0, ..., n, if |i − j| > 1, then a_ij = 0.   (3)

To get (3), we first prove that a_ij is zero if i < j − 1. Take i < j − 1 and k = i + 1 in (2), apply Lemma 2.2, and repeat the same argument to get a_ij = 0. Symmetrically, one proves that a_ij is zero if i > j + 1 by increasing induction on j.

Now, we prove part (a) of Theorem 2.1 under the following equivalent form:

  ∀i = 1, ..., n − 1,  x_{i+1} − x_i = x_1 − x_0.   (4)

Due to (3), one of the a_{i+1,i}'s or one of the a_{i−1,i}'s is non-zero (otherwise the generator would be identically zero). Suppose that a_{k+1,k} is non-zero for some k and zero for bigger indices. Then the strongest exponents in the left and right members of (2) have to be the same, by Lemma 2.2. This implies that x_{i+1} − x_i = x_{k+1} − x_k, which is (4).

Now, since the x_i's are regularly spaced, they can be mapped onto {0, ..., n} by an affine transformation. Therefore we can replace x_i by i in equation (2) without changing it essentially. Taking k = 0 and applying again Lemma 2.2 leads to the following relations:

  ∀i = 0, ..., n − 1,  a_{i,i+1} μ(x_i)/μ(x_{i+1}) = (i + 1) a_{0,1} μ(x_0)/μ(x_1),
  ∀i = 0, ..., n,      a_{i,i} = (1 − i) a_{0,0} + i a_{1,1},                          (5)
  ∀i = 1, ..., n,      a_{i,i−1} μ(x_i)/μ(x_{i−1}) = ((n − i + 1)/n) a_{1,0} μ(x_1)/μ(x_0).

The difference equation (6) below can be obtained by writing that a_{i,i−1} + a_{i,i} + a_{i,i+1} = 0 and using the relations (5). In view of Theorem 3.1, it is also important to remark that it can be derived directly from the equation (2). With the convention μ(x_{n+1}) = 0:

  ∀i = 1, ..., n,  ((n − i + 1)/n) a_{1,0} (μ(x_1)/μ(x_0)) μ(x_{i−1}) + (i + 1) a_{0,1} (μ(x_0)/μ(x_1)) μ(x_{i+1}) + [(1 − i) a_{0,0} + i a_{1,1}] μ(x_i) = 0.   (6)

We turn now to solving the difference equation (6). As expected, if μ(x_i), i = 0, ..., n, is a solution, then so is cμ(x_i), i = 0, ..., n, for any constant c, so that μ(x_0) and μ(x_1) can be chosen arbitrarily. We take μ(x_0) = 1 and μ(x_1) = n. Let g be the generating function of the μ(x_i)'s. Multiplying the equation (6) by z^i and summing over i leads to the following differential equation in g:

  g(z)[n a_{1,0} z + a_{0,0}] + g'(z)[−a_{1,0} z² + (a_{1,1} − a_{0,0}) z + a_{0,1}/n] = 0.

Solving this equation, one checks easily that the solution such that g(0) = 1 satisfies g(z) > 0 for all z positive iff:

  a_{1,1} + a_{1,0} + ((1 − n)/n) a_{0,0} = 0.   (7)

In this case this solution is g(z) = (1 + z)^n. Finally, under the condition (7), the general solution of (6) is:

  ∀i = 0, ..., n,  μ(x_i) = C(n, i),   (8)

the binomial coefficient, which is part (b) of the theorem. Part (c) is a consequence of (5), (7) and (8).
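As a consistency check (not part of the paper), one can verify that μ(x_i) = C(n, i) satisfies the condition (7) and the difference equation (6) when a_{0,1}, a_{0,0}, a_{1,1}, a_{1,0} are taken from the generator of Example 1.4 on {0, ..., n}; the values of n, lam and mu below are arbitrary.

```python
from math import comb

n, lam, mu = 7, 1.3, 0.8                 # illustrative values
a01, a00 = n * lam, -n * lam             # row 0 of the generator of Example 1.4
a10, a11 = mu, -((n - 1) * lam + mu)     # row 1
w = [comb(n, i) for i in range(n + 2)]   # mu(x_i) = C(n, i); comb(n, n+1) = 0

# Condition (7) holds for these coefficients ...
assert abs(a11 + a10 + ((1 - n) / n) * a00) < 1e-9

# ... and mu(x_i) = C(n, i) solves the difference equation (6) for i = 1, ..., n.
for i in range(1, n + 1):
    lhs = (((n - i + 1) / n) * a10 * (w[1] / w[0]) * w[i - 1]
           + (i + 1) * a01 * (w[0] / w[1]) * w[i + 1]
           + ((1 - i) * a00 + i * a11) * w[i])
    assert abs(lhs) < 1e-9
```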

3. General case

According to Definition 1.1, we consider an exponential family ℱ = {P_θ, θ ∈ ℝ} on E = {e_1, ..., e_N}, generated by μ and f. We denote by F = {x_0, ..., x_n} the set f(E). Let ν be the image measure of μ by f; it is a positive measure on the finite set of reals F. We denote by 𝒢 = {Q_θ, θ ∈ ℝ} the natural exponential family on F generated by ν. Notice that Q_θ is the image measure of P_θ by f. The main result of this section is that, if a generator A is stable for the family ℱ, then it is possible to associate to A a generator B on F which is stable for the family 𝒢. Then necessarily B is the generator of Example 1.4, as Theorem 2.1 shows.

Theorem 3.1. Let A be a generator on E, stable for the family ℱ. Denote by a_ij, i, j = 1, ..., N, the infinitesimal rate of transition from e_i to e_j. Then:
(a) ∀j, j' = 1, ..., N, ∀h, k = 0, ..., n, if f(e_j) = f(e_{j'}) = x_k, then

  Σ_{i s.t. f(e_i)=x_h} a_ij μ(e_i)/μ(e_j) = Σ_{i s.t. f(e_i)=x_h} a_{ij'} μ(e_i)/μ(e_{j'}).

We denote by B_{hk} the common value of this sum and let b_{hk} = B_{hk} ν(x_k)/ν(x_h), ∀h, k = 0, ..., n.
(b) The b_{hk}'s, h, k = 0, ..., n, are the infinitesimal rates of transition of a Markov process on F. Let B be the corresponding generator. Let {S(t), t ≥ 0} and {Z(t), t ≥ 0} be the semi-groups generated by A and B respectively.
(c) ∀t > 0, ∀θ, θ' ∈ ℝ,

  S(t)*P_θ = P_{θ'}  ⟺  Z(t)*Q_θ = Q_{θ'},

where * denotes the transposition of operators.

Part (c) of this theorem can be used in two ways. On the one hand, it states that the generator B of part (b) is stable for the natural family 𝒢, and the function θ(t) of Definition 1.2 is the same for A and B. Now, due to Theorem 2.1, B is necessarily the generator of Example 1.4, up to an affine transformation of F. Conversely, starting from the generator of Example 1.4 one can construct a generator A, a positive measure μ and a function f on an arbitrary set E that satisfy the equations of part (a) in Theorem 3.1. Then A will be stable for the family ℱ generated by μ and f. Let {X_t, t ≥ 0} be a Markov process with generator A. If the distribution of X_0 is in ℱ, then so is the distribution of X_t for any t > 0, and it is possible to compute the explicit expression of this distribution from the result of Example 1.4.

Example 3.2. Consider E = {e_1, e_2, e_3}. Let μ(e_1) = 1, μ(e_2) = 2, μ(e_3) = 3, f(e_1) = f(e_2) = 0 and f(e_3) = 1. The family ℱ is the family of distributions on E such that: P(e_1) = p, P(e_2) = 2p, P(e_3) = 1 − 3p, for some p in ]0, 1/3[. Take a_11 = −3, a_12 = 1, a_13 = 2, a_21 = 1, a_22 = −3/2, a_23 = 1/2, a_31 = 1/3, a_32 = 2/3, a_33 = −1. Let {X_t, t ≥ 0} be a Markov process with generator A. It is easy to check that if the distribution of X_0 is (p(0), 2p(0), 1 − 3p(0)), then for any t, the distribution of X_t is (p(t), 2p(t), 1 − 3p(t)), where

  p(t) = (1/6)[1 + (6p(0) − 1) exp(−2t)].
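Example 3.2 can be verified numerically. The sketch below is not part of the paper; the generator entries are those listed above (with a_22 = −3/2, a_31 = 1/3, a_32 = 2/3 forced by the zero row sums), and p(0) and t are arbitrary.

```python
import numpy as np

A = np.array([[-3.0,  1.0,  2.0],
              [ 1.0, -1.5,  0.5],
              [ 1/3,  2/3, -1.0]])
assert np.allclose(A.sum(axis=1), 0.0)     # rows of a Markov generator sum to zero

def expm(M, terms=60):
    """Matrix exponential by truncated power series (adequate for this small ||M||)."""
    out, term = np.eye(len(M)), np.eye(len(M))
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

p0, t = 0.2, 0.7
P0 = np.array([p0, 2 * p0, 1 - 3 * p0])    # initial distribution in the family
Pt = expm(t * A).T @ P0                    # distribution of X_t
pt = (1 + (6 * p0 - 1) * np.exp(-2 * t)) / 6
assert np.allclose(Pt, [pt, 2 * pt, 1 - 3 * pt], atol=1e-10)
```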

The proof of parts (a) and (b) of Theorem 3.1 is analogous to that of Theorem 2.1. One first derives the analogue of the equation (2), from which part (a) follows directly by Lemma 2.2. Part (b) is also a


consequence of this equation. For part (c), one introduces the matrix M defined as follows: its rows and columns are indexed by E and F respectively. The element corresponding to row e_j and column x_h is non-zero iff f(e_j) = x_h and its value in this case is μ(e_j)/ν(x_h). With this definition, one has:

  ∀θ ∈ ℝ,  P_θ = MQ_θ,   (9)

and

  A*M = MB*,   (10)

where A* and B* are the transposed matrices of A and B respectively. The identity (9) is a direct consequence of the definitions, while (10) is another writing of part (a) of the theorem. From (10), one deduces by induction that, for all n:

  (A^n)*M = M(B^n)*,

and this implies that for all t > 0:

  (exp(tA*))M = M(exp(tB*)),

where the exponential of a matrix is defined in the usual way. Now in terms of the semi-groups (S(t)) and (Z(t)), we have:

  S(t)*M = MZ(t)*  for all t > 0.

Finally:

  S(t)*P_θ = P_{θ'}  ⟺  S(t)*MQ_θ = MQ_{θ'}  ⟺  MZ(t)*Q_θ = MQ_{θ'}  ⟺  Z(t)*Q_θ = Q_{θ'}.

The last implication is the only one which is not obvious. To see that it is true, one has to define a left inverse M' to M as follows:

  M' = (m_ih), i = 0, ..., n, h = 1, ..., N, where m_ih = 1 if f(e_h) = x_i, and 0 if not.

This ends the proof of Theorem 3.1.
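The matrices of the proof can be made concrete on the data of Example 3.2 (μ = (1, 2, 3), f = (0, 0, 1), so ν = (3, 3)). The check below is not part of the paper; it verifies that M' is a left inverse of M, the identity (10), and the resulting intertwining of the semi-groups for one illustrative value of t.

```python
import numpy as np

A = np.array([[-3.0,  1.0,  2.0],
              [ 1.0, -1.5,  0.5],
              [ 1/3,  2/3, -1.0]])
B = np.array([[-1.0,  1.0],
              [ 1.0, -1.0]])
M = np.array([[1/3, 0.0],        # row e_j, column x_h: mu(e_j)/nu(x_h) iff f(e_j) = x_h
              [2/3, 0.0],
              [0.0, 1.0]])
Mp = np.array([[1.0, 1.0, 0.0],  # m_ih = 1 iff f(e_h) = x_i
               [0.0, 0.0, 1.0]])

assert np.allclose(Mp @ M, np.eye(2))   # M' is a left inverse of M
assert np.allclose(A.T @ M, M @ B.T)    # identity (10): A*M = MB*

def expm(X, terms=60):
    """Matrix exponential by truncated power series (adequate for this small ||X||)."""
    out, term = np.eye(len(X)), np.eye(len(X))
    for k in range(1, terms):
        term = term @ X / k
        out = out + term
    return out

t = 0.5                                 # hence S(t)*M = MZ(t)* for all t
assert np.allclose(expm(t * A).T @ M, M @ expm(t * B).T)
```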

References

Barndorff-Nielsen, O. (1978), Information and Exponential Families in Statistical Theory (Wiley, New York).
Hudson, I.L. (1982), Large sample inference for Markovian exponential families with application to branching processes with immigration, Austral. J. Statist. 24 (1), 98-112.
Karlin, S. (1966), A First Course in Stochastic Processes (Academic Press, New York).
Küchler, U. (1982a), Exponential families of Markov processes, Part 1: General results, Math. Operationsforsch. Statist. 13 (1), 57-69.
Küchler, U. (1982b), Exponential families of Markov processes, Part 2: Birth and death processes, Math. Operationsforsch. Statist. 13 (2), 219-230.
Siegmund, D. (1982), Large deviations for boundary crossing probabilities, Ann. Probab. 10 (3), 581-588.
Ycart, B. (1988), A characteristic property of linear growth birth and death processes, Sankhyā Ser. A 50 (2), 184-189.