On the Lebesgue decomposition of the posterior distribution with respect to the prior in regular Bayesian experiments

Statistics & Probability Letters 26 (1996) 147-152

Claudio Macci
Istituto Nazionale di Alta Matematica "Francesco Severi", Roma, Italy

Received August 1994; revised November 1994

Abstract

Given a regular Bayesian experiment we can consider the Lebesgue decomposition of the posterior distributions w.r.t. the prior. Then, by using the Lebesgue decomposition of each sampling distribution w.r.t. the predictive distribution, we show that the absolutely continuous parts of the posteriors are determined only by the absolutely continuous parts of the sampling distributions, while the singular parts of the posteriors are determined only by the singular parts.

A.M.S. Subject Classification: 60A10, 62A15, 62B15

Keywords: Regular Bayesian experiment; Predictive distribution; Lebesgue decomposition of the posterior distribution

Address for correspondence: Via Pietro Maroncelli 44, 00149 Rome, Italy.

0167-7152/96/$12.00 © 1996 Elsevier Science B.V. All rights reserved
SSDI 0167-7152(95)00004-6

1. Introduction

The aim of this short note is to point out some mathematical aspects of conditional distributions; in view of applications in the field of Bayesian statistics, our terminology will be adapted to the frame of Bayesian experiments (e.g. Florens et al., 1990). Let $(\mathcal{X}, \mathcal{A})$ (sample space) and $(L, \mathcal{C})$ (parameter space) be two Polish spaces. Let $\mu$ be a probability measure on $\mathcal{C}$ and $(P_\theta)_{\theta \in L}$ a family of probability measures on $\mathcal{A}$, and suppose that the following condition holds:

$$\forall E \in \mathcal{A}: \quad \text{the function } \theta \mapsto P_\theta(E) \text{ is } \mathcal{C}\text{-measurable.} \tag{1}$$

In other words, we shall consider a statistical model $(\mathcal{X}, \mathcal{A}, (P_\theta)_{\theta \in L})$ in the Bayesian setting, with $\mu$ having the meaning of the prior distribution on $\mathcal{C}$ and $(P_\theta)_{\theta \in L}$ being the family of sampling distributions. By condition (1), it is possible to consider the Bayesian experiment

$$\mathcal{E} = (L \times \mathcal{X}, \mathcal{C} \otimes \mathcal{A}, \Pi),$$

where $\Pi$ is the probability measure on $\mathcal{C} \otimes \mathcal{A}$ such that

$$\forall C \in \mathcal{C} \text{ and } \forall E \in \mathcal{A}: \quad \Pi(C \times E) = \int_C P_\theta(E)\, d\mu(\theta), \tag{2}$$

whence we can obtain a probability measure $P$ on $\mathcal{A}$ (predictive distribution) defined by

$$\forall E \in \mathcal{A}: \quad P(E) = \Pi(L \times E). \tag{3}$$

Since $(\mathcal{X}, \mathcal{A})$ and $(L, \mathcal{C})$ are Polish spaces, there exists (e.g. Florens et al., 1990, p. 31) a family $(\mu_x)_{x \in \mathcal{X}}$ of probability measures on $\mathcal{C}$ (posterior distributions) such that

$$\forall C \in \mathcal{C} \text{ and } \forall E \in \mathcal{A}: \quad \Pi(C \times E) = \int_E \mu_x(C)\, dP(x); \tag{4}$$

then $\mathcal{E}$ is said to be regular. Moreover, this family is $P$ a.e. unique. If each $P_\theta$ is absolutely continuous w.r.t. a fixed $\sigma$-finite measure $\lambda$ on $\mathcal{A}$ (in other words, if $(\mathcal{X}, \mathcal{A}, (P_\theta)_{\theta \in L})$ is a dominated statistical model (e.g. Bahadur, 1954)), we have (e.g. Liptser and Shiryiayev, 1977, p. 285)

$$P \text{ a.e.} \quad \forall C \in \mathcal{C}: \quad \mu_x(C) = \frac{\int_C f(\theta, x)\, d\mu(\theta)}{\int_L f(\theta, x)\, d\mu(\theta)}, \tag{5}$$

where $f$ is a jointly measurable function such that

$$\forall \theta \in L: \quad P_\theta(E) = \int_E f(\theta, x)\, d\lambda(x), \quad \forall E \in \mathcal{A}; \tag{6}$$

consequently, with probability 1 w.r.t. $P$, $\mu_x$ is absolutely continuous w.r.t. $\mu$. On the contrary, if $(\mathcal{X}, \mathcal{A}, (P_\theta)_{\theta \in L})$ is not dominated, the probability (w.r.t. $P$) of a set of posterior distributions that are not absolutely continuous w.r.t. the prior distribution can be positive, i.e. only for some particular choices of $\mu$ can we have, $P$ a.e., $\mu_x$ absolutely continuous w.r.t. $\mu$.

In what follows, for two given positive measures $Q$ and $R$ on a $\sigma$-algebra $\mathcal{G}$, we shall denote by $(Q^{(ac)}, Q^{(s)})$ the Lebesgue decomposition of $Q$ w.r.t. $R$, i.e. $(Q^{(ac)}, Q^{(s)})$ is the unique pair of positive measures on $\mathcal{G}$ such that

$$Q = Q^{(ac)} + Q^{(s)}, \quad \text{with } Q^{(ac)} \ll R \text{ and } Q^{(s)} \perp R; \tag{7}$$

then, if $(\mathcal{X}, \mathcal{A}, (P_\theta)_{\theta \in L})$ is dominated, $P$ a.e. the Lebesgue decomposition of $\mu_x$ w.r.t. $\mu$ is trivial, whatever the prior distribution $\mu$ may be.

This paper aims to study ($P$ a.e.) the Lebesgue decompositions $(\mu_x^{(ac)}, \mu_x^{(s)})$ of $\mu_x$ ($x \in \mathcal{X}$) w.r.t. $\mu$ and to find how these decompositions depend on the Lebesgue decomposition $(\Pi^{(ac)}, \Pi^{(s)})$ of $\Pi$ w.r.t. $\mu \otimes P$ and on the Lebesgue decompositions $(P_\theta^{(ac)}, P_\theta^{(s)})$ of $P_\theta$ ($\theta \in L$) w.r.t. $P$. In Section 2 it will be shown that $(\mu_x^{(ac)})_{x \in \mathcal{X}}$ is determined only by $\Pi^{(ac)}$ and $(\mu_x^{(s)})_{x \in \mathcal{X}}$ is determined only by $\Pi^{(s)}$. This result will then be used in Section 3, where we shall prove that $(\mu_x^{(ac)})_{x \in \mathcal{X}}$ is determined only by $(P_\theta^{(ac)})_{\theta \in L}$ and $(\mu_x^{(s)})_{x \in \mathcal{X}}$ is determined only by $(P_\theta^{(s)})_{\theta \in L}$. Such arguments will be illustrated by means of the example of an undominated Bayesian experiment presented in Section 4.
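As a small illustration of the Lebesgue decomposition (7), which is not part of the original argument, one may take $R$ to be the Lebesgue measure on the real line and $Q$ a mixture of a point mass and a Gaussian density. The point mass $\delta_0$, the density $\varphi$ and the weights $1/2$ are assumptions chosen only for this sketch.

```latex
% Assumed example: Q = (1/2) delta_0 + (1/2) N(0,1), R = Lebesgue measure on the real line.
\[
Q(B) = \tfrac12\,\delta_0(B) + \tfrac12 \int_B \varphi(t)\,dt,
\qquad \varphi(t) = \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}.
\]
% Absolutely continuous part: it has the density (1/2) phi with respect to R.
\[
Q^{(ac)}(B) = \tfrac12 \int_B \varphi(t)\,dt \ll R .
\]
% Singular part: it is carried by the set {0}, and R({0}) = 0.
\[
Q^{(s)}(B) = \tfrac12\,\delta_0(B) \perp R .
\]
```

The example in Section 4 has the same shape: an atom carries the singular part and a density carries the absolutely continuous part.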

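2. Decomposition of the posteriors in terms of the decomposition of $\Pi$ w.r.t. $\mu \otimes P$

From now on the following notation will be used:

$$g(\theta, x) = \frac{d\Pi^{(ac)}}{d(\mu \otimes P)}(\theta, x) \tag{8}$$

and, for a set $D \in \mathcal{C} \otimes \mathcal{A}$, $D(\cdot, x)$ and $D(\theta, \cdot)$ denote the $x$-section and the $\theta$-section respectively, i.e. we put $D(\cdot, x) = \{\theta \in L \mid (\theta, x) \in D\}$ and $D(\theta, \cdot) = \{x \in \mathcal{X} \mid (\theta, x) \in D\}$. We have the following result.

Proposition 1. Let $D \in \mathcal{C} \otimes \mathcal{A}$ be such that $\mu \otimes P(D) = 0$ and $\Pi^{(s)}$ is concentrated on $D$. Then we have

(i) $P$ a.e.

$$\forall C \in \mathcal{C}: \quad \mu_x^{(ac)}(C) = \int_C g(\theta, x)\, d\mu(\theta), \tag{9}$$

$$\forall C \in \mathcal{C}: \quad \mu_x^{(s)}(C) = \int_C 1_{D(\cdot, x)}(\theta)\, d\mu_x(\theta) = \mu_x(C \cap D(\cdot, x)); \tag{10}$$

(ii) $\mu$ a.e.

$$\forall E \in \mathcal{A}: \quad P_\theta^{(ac)}(E) = \int_E g(\theta, x)\, dP(x), \tag{11}$$

$$\forall E \in \mathcal{A}: \quad P_\theta^{(s)}(E) = \int_E 1_{D(\theta, \cdot)}(x)\, dP_\theta(x) = P_\theta(E \cap D(\theta, \cdot)). \tag{12}$$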

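Proof. We start by noting that

$$\Pi(C \times E) = \int_{C \times E} g(\theta, x)\, d(\mu \otimes P)(\theta, x) + \Pi((C \times E) \cap D)$$

$$= \int_E dP(x) \int_C g(\theta, x)\, d\mu(\theta) + \int_E dP(x)\, \mu_x(C \cap D(\cdot, x)), \quad \forall C \in \mathcal{C} \text{ and } \forall E \in \mathcal{A}.$$

Then it follows that, $P$ a.e.,

$$\mu_x(C) = \int_C g(\theta, x)\, d\mu(\theta) + \mu_x(C \cap D(\cdot, x)), \quad \forall C \in \mathcal{C},$$

and (i) holds because, $P$ a.e., we have

$$\mu_x(\cdot \cap D(\cdot, x)) \text{ is concentrated on } D(\cdot, x), \tag{13}$$

$$\mu(D(\cdot, x)) = 0. \tag{14}$$

Indeed (13) is obvious and (14) follows by observing that

$$0 = \mu \otimes P(D) = \int_L d\mu(\theta) \int_{\mathcal{X}} dP(x)\, 1_D(\theta, x) = \int_{\mathcal{X}} dP(x)\, \mu(D(\cdot, x)).$$

Similarly we can prove (ii). □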

The Bayesian experiment $\mathcal{E}$ is said to be dominated (e.g. Florens et al., 1990) if

$$\Pi \ll \mu \otimes P; \tag{15}$$

then, as an immediate consequence of Proposition 1, we have the following

Corollary. The following statements are equivalent:
(a) $(L \times \mathcal{X}, \mathcal{C} \otimes \mathcal{A}, \Pi)$ is a dominated regular Bayesian experiment;
(b) $P$ a.e. $\mu_x$ is absolutely continuous with respect to $\mu$;
(c) $\mu$ a.e. $P_\theta$ is absolutely continuous with respect to $P$.

We remark that (c) is a slight modification of the definition of "domination by $P$" for the statistical model $(\mathcal{X}, \mathcal{A}, (P_\theta)_{\theta \in L})$. The Corollary above shows that the Lebesgue decomposition of $\mu_x$ w.r.t. $\mu$ is non-trivial only when (15) fails.
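The equivalence of (a) and (b) can be traced through Proposition 1 (i) roughly as follows. This is a sketch of one possible argument rather than the author's own wording, and it assumes that (4) extends from rectangles $C \times E$ to all sets of $\mathcal{C} \otimes \mathcal{A}$. The equivalence of (a) and (c) would follow in the same way from Proposition 1 (ii).

```latex
% Proposition 1 (i), singular part: \mu_x^{(s)}(C) = \mu_x(C \cap D(\cdot,x)), P-a.e.
% (a) => (b): if \Pi \ll \mu \otimes P then \Pi^{(s)} = 0, so D = \emptyset is admissible.
\[
\mu_x^{(s)}(C) = \mu_x(C \cap \emptyset) = 0
\quad \forall C \in \mathcal{C}, \ P\text{-a.e.}
\]
% (b) => (a): \Pi^{(s)} is carried by D and \Pi^{(ac)}(D) = 0 because \mu \otimes P(D) = 0.
\[
\Pi^{(s)}(L \times \mathcal{X}) = \Pi(D)
= \int_{\mathcal{X}} \mu_x\bigl(D(\cdot,x)\bigr)\,dP(x)
= \int_{\mathcal{X}} \mu_x^{(s)}(L)\,dP(x) = 0 .
\]
```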

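3. Decomposition of the posteriors in terms of the decompositions of sampling distributions

Consider the function $g(\theta, x)$ defined by (8); by (11) we also know that, $\mu$ a.e., $g(\theta, \cdot)$ is a version of the density of $P_\theta^{(ac)}$ with respect to $P$. Thus, to present the following result, we shall write $\frac{dP_\theta^{(ac)}}{dP}(x)$ in place of $g(\theta, x)$.

Proposition 2. $P$ a.e. we have

$$\forall C \in \mathcal{C}: \quad \mu_x^{(ac)}(C) = \int_C \frac{dP_\theta^{(ac)}}{dP}(x)\, d\mu(\theta), \tag{16}$$

$$\forall C \in \mathcal{C}: \quad \mu_x^{(s)}(C) = \frac{d\left[\int_C P_\theta^{(s)}(\cdot)\, d\mu(\theta)\right]}{dP}(x). \tag{17}$$

Proof. (16) easily follows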

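from (9) and (11). To prove (17) we start by noting that

$$\forall C \in \mathcal{C} \text{ and } \forall E \in \mathcal{A}: \quad \int_E \mu_x^{(s)}(C)\, dP(x) = \int_E \mu_x(C)\, dP(x) - \int_E \mu_x^{(ac)}(C)\, dP(x). \tag{18}$$

On the other hand, by (2) and (4) we have

$$\forall C \in \mathcal{C} \text{ and } \forall E \in \mathcal{A}: \quad \int_E \mu_x(C)\, dP(x) = \int_C P_\theta(E)\, d\mu(\theta), \tag{19}$$

while, by (9) and (11), it follows (by using the Fubini theorem) that

$$\forall C \in \mathcal{C} \text{ and } \forall E \in \mathcal{A}: \quad \int_E \mu_x^{(ac)}(C)\, dP(x) = \int_E dP(x) \int_C g(\theta, x)\, d\mu(\theta) = \int_C d\mu(\theta) \int_E g(\theta, x)\, dP(x) = \int_C P_\theta^{(ac)}(E)\, d\mu(\theta). \tag{20}$$

Hence, by putting (19) and (20) in the right-hand side of (18), we obtain (17). Indeed we have

$$\int_E \mu_x^{(s)}(C)\, dP(x) = \int_C P_\theta(E)\, d\mu(\theta) - \int_C P_\theta^{(ac)}(E)\, d\mu(\theta) = \int_C P_\theta^{(s)}(E)\, d\mu(\theta), \quad \forall C \in \mathcal{C} \text{ and } \forall E \in \mathcal{A}. \quad \Box$$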

4. An example and concluding remarks

In this section we consider a very simple undominated statistical model and a prior distribution $\mu$ which give rise to an undominated Bayesian experiment. By employing (16) and (17), we obtain the Lebesgue decompositions with respect to $\mu$ for $P$ almost all the posterior distributions. This might also be done by deriving a family of posterior distributions $(\mu_x)_{x \in \mathcal{X}}$ and then considering their Lebesgue decompositions w.r.t. $\mu$. On the other hand, we remark that the real interest of (16) and (17) consists in showing that $(\mu_x^{(ac)})_{x \in \mathcal{X}}$ is determined only by $(P_\theta^{(ac)})_{\theta \in L}$ and $(\mu_x^{(s)})_{x \in \mathcal{X}}$ is determined only by $(P_\theta^{(s)})_{\theta \in L}$.

Let $(\mathbb{R}, \mathcal{B})$ be the real line equipped with the Borel $\sigma$-algebra and put

$$(\mathcal{X}, \mathcal{A}) = (\mathbb{R}, \mathcal{B}) \quad \text{and} \quad (L, \mathcal{C}) = (\mathbb{R}, \mathcal{B}); \tag{21}$$

given a positive density $q$ w.r.t. the Lebesgue measure on $\mathbb{R}$, let $\mu$ be such that

$$\mu(C) = \int_C q(\theta)\, d\theta, \quad \forall C \in \mathcal{C}, \tag{22}$$

and let $(P_\theta)_{\theta \in L}$ be such that

$$\forall \theta \in L: \quad P_\theta(E) = p\, 1_E(\theta) + (1 - p) \int_E f(\theta, x)\, dx, \quad \forall E \in \mathcal{A}, \tag{23}$$

where $p \in\, ]0, 1[$ and $f$ is a jointly measurable function such that $\{f(\theta, \cdot): \theta \in L\}$ is a family of positive densities w.r.t. the Lebesgue measure. In other words, (23) defines the conditional distribution $P_\theta$ of a statistical observation $X$ in the following situation: $X$ is equal to $\theta$ with probability $p$ and, conditionally on $X$ taking values different from $\theta$, it has a distribution admitting density $f(\theta, \cdot)$ w.r.t. the Lebesgue measure. We remark that $(P_\theta)_{\theta \in L}$ is not dominated; indeed, any positive measure $\lambda$ dominating the statistical model $(\mathcal{X}, \mathcal{A}, (P_\theta)_{\theta \in L})$ must assign a positive measure to each singleton of $\mathbb{R}$; since $\mathbb{R}$ is uncountable, $\lambda$ cannot be $\sigma$-finite. By (22) and (23) it follows that

$$P(E) = \int_L P_\theta(E)\, d\mu(\theta) = \int_E \left[ p\, q(x) + (1 - p) \int_L f(\theta, x)\, q(\theta)\, d\theta \right] dx, \quad \forall E \in \mathcal{A}, \tag{24}$$

so that, for any $\theta \in L$, the Lebesgue decomposition of $P_\theta$ w.r.t. $P$ is

$$P_\theta^{(ac)}(E) = (1 - p) \int_E f(\theta, x)\, dx \quad \text{and} \quad P_\theta^{(s)}(E) = p\, 1_E(\theta), \quad \forall E \in \mathcal{A}. \tag{25}$$
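To see why (25) gives the Lebesgue decomposition of $P_\theta$ w.r.t. $P$, one may argue as in the following sketch, which uses only (24) and the positivity of $q$ and $f$. The name $h$ for the density of $P$ with respect to the Lebesgue measure is introduced here for convenience and does not appear in the original.

```latex
% h is the Lebesgue density of P read off from (24); it is strictly positive.
\[
h(x) = p\,q(x) + (1-p)\int_L f(\theta,x)\,q(\theta)\,d\theta > 0 .
\]
% Absolutely continuous part: dividing a Lebesgue density by h gives a density w.r.t. P.
\[
(1-p)\int_E f(\theta,x)\,dx = \int_E \frac{(1-p)\,f(\theta,x)}{h(x)}\,dP(x) .
\]
% Singular part: the atom at theta is carried by {theta}, which is a P-null set.
\[
P(\{\theta\}) = \int_{\{\theta\}} h(x)\,dx = 0 .
\]
```

Under this reading, the ratio $(1-p) f(\theta, x) / h(x)$ plays the role of the density of $P_\theta^{(ac)}$ with respect to $P$.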

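(25) trivially shows that condition (c) of the Corollary in Section 2 fails, whence the Bayesian experiment defined by (21), (22) and (23) is not dominated. Now, by employing (16) and (17), we obtain ($P$ a.e.) the Lebesgue decomposition $(\mu_x^{(ac)}, \mu_x^{(s)})$ of $\mu_x$ w.r.t. $\mu$. By (16) we have

$$\mu_x^{(ac)}(C) = \int_C \frac{dP_\theta^{(ac)}}{dP}(x)\, d\mu(\theta) = \frac{(1 - p) \int_C f(\theta, x)\, q(\theta)\, d\theta}{p\, q(x) + (1 - p) \int_L f(\theta, x)\, q(\theta)\, d\theta}, \quad \forall C \in \mathcal{C}. \tag{26}$$

Furthermore, since (for any fixed $C \in \mathcal{C}$)

$$\int_C P_\theta^{(s)}(E)\, d\mu(\theta) = p \int_C 1_E(\theta)\, q(\theta)\, d\theta = p \int_{E \cap C} q(\theta)\, d\theta = p \int_E 1_C(x)\, q(x)\, dx, \quad \forall E \in \mathcal{A}, \tag{27}$$

by (17) we obtain

$$\mu_x^{(s)}(C) = \frac{d\left[\int_C P_\theta^{(s)}(\cdot)\, d\mu(\theta)\right]}{dP}(x) = \frac{p\, 1_C(x)\, q(x)}{p\, q(x) + (1 - p) \int_L f(\theta, x)\, q(\theta)\, d\theta}, \quad \forall C \in \mathcal{C}. \tag{28}$$

Then, by (26) and (28), for any family $(\mu_x)_{x \in \mathcal{X}}$ of posterior distributions we have the following:

$$P \text{ a.e.} \quad \mu_x(C) = \frac{(1 - p) \int_C f(\theta, x)\, q(\theta)\, d\theta + p\, 1_C(x)\, q(x)}{p\, q(x) + (1 - p) \int_L f(\theta, x)\, q(\theta)\, d\theta}, \quad \forall C \in \mathcal{C}. \tag{29}$$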
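The formulas for the two parts of the posterior in this example can be checked numerically. The following minimal sketch assumes a standard normal prior density q, a normal density f(θ, ·) with mean θ and unit variance, and p = 0.3; none of these choices comes from the paper, and the helper names (`normal_pdf`, `integrate`, `predictive_density`, `posterior_ac`, `posterior_singular`) are hypothetical. Integrals over the real line are approximated by a trapezoidal rule on [-12, 12].

```python
import math

P_ATOM = 0.3  # assumed value of p in (23)


def normal_pdf(t, mean=0.0, sd=1.0):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (t - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def q(theta):
    """Assumed prior density: standard normal."""
    return normal_pdf(theta)


def f(theta, x):
    """Assumed continuous part of the sampling distribution: N(theta, 1) density."""
    return normal_pdf(x, mean=theta)


def integrate(h, a, b, n=4000):
    """Composite trapezoidal rule for h on [a, b] with n subintervals."""
    step = (b - a) / n
    total = 0.5 * (h(a) + h(b))
    for i in range(1, n):
        total += h(a + i * step)
    return total * step


def predictive_density(x, lo=-12.0, hi=12.0):
    """Lebesgue density of the predictive distribution P, as read off from (24)."""
    mix = integrate(lambda t: f(t, x) * q(t), lo, hi)
    return P_ATOM * q(x) + (1.0 - P_ATOM) * mix


def posterior_ac(x, a, b):
    """Absolutely continuous part of the posterior on C = [a, b], following (26)."""
    num = (1.0 - P_ATOM) * integrate(lambda t: f(t, x) * q(t), a, b)
    return num / predictive_density(x)


def posterior_singular(x, a, b):
    """Singular part of the posterior on C = [a, b], following (28): an atom at x."""
    inside = 1.0 if a <= x <= b else 0.0
    return P_ATOM * inside * q(x) / predictive_density(x)


if __name__ == "__main__":
    x_obs = 0.7
    ac_total = posterior_ac(x_obs, -12.0, 12.0)
    s_total = posterior_singular(x_obs, -12.0, 12.0)
    # The two parts should add up to (approximately) one, in line with (29).
    print(ac_total, s_total, ac_total + s_total)
```

With these assumed densities the mixture integral in the denominator also has a closed form (a normal density with variance 2), which gives an independent check on the numerical integration.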

Note that the singular part $\mu_x^{(s)}$, being ($P$ a.e.) concentrated on $\{x\}$, means that the statistical observation $X$ may provide a piece of deterministic information about the unknown parameter $\theta$. This is what one heuristically expects, taking into account the presence, in each $P_\theta$, of a positive probability concentrated on the singleton $\{\theta\}$. Equation (29) allows one to convert this intuitive idea into a quantitative result.

Acknowledgements

Proposition 1 was suggested by Professor Giorgio Letta; I thank him for his considerable help. I also thank Professor Fabio Spizzichino and an anonymous referee for helpful comments and discussion.

References

Bahadur, R.R. (1954), Sufficiency and statistical decision functions, Ann. Math. Statist. 25, 423-462.
Florens, J., Mouchart, M., Rolin, J. (1990), Elements of Bayesian Statistics (Marcel Dekker, New York).
Liptser, R.S., Shiryiayev, A.N. (1977), Statistics of Random Processes I, General Theory (Springer, New York).