Statistics & Probability Letters 25 (1995) 351-355 (Elsevier)
On a class of exchangeable sequences

Alexander V. Gnedin
Institute of Mathematical Stochastics, University of Göttingen, Lotzestrasse 13, 37083 Göttingen, Germany

Received September 1994; revised October 1994
Abstract

Assuming that the probability distribution of a finite sequence has a density depending solely on the extreme components, we give an elementary criterion for extendibility of this sequence to an infinite exchangeable sequence of random variables, which turns out to be a mixture of i.i.d. uniformly distributed sequences. A one-sided version of this result leads to a Schoenberg-type theorem for the maximum norm.

Keywords: Exchangeability; Partial exchangeability; Extendibility
1. Introduction

An infinite sequence of exchangeable random variables is representable as a mixture of sequences of i.i.d. random variables. This does not apply to finite exchangeable sequences: a finite sequence has such a representation if and only if it is (infinitely) extendible, i.e., can be completed to an infinite exchangeable sequence. Given a "non-parametric" family of exchangeable sequences of finite length, a problem of interest is to identify all extendible sequences therein and to find explicitly their representations as mixtures.

The extendibility problem has been studied so far mainly in the case of variables with discrete distributions; some references are found in the survey of Aldous (1985) (cf. also a more recent paper by von Plato (1991)). Different aspects of finite exchangeability are treated in Diaconis et al. (1992).

In what follows we introduce an interesting class of continuous multivariate distributions characterized by their densities, which depend solely on the maximal and the minimal components, and prove that the extendible sequences are the mixtures of uniform distributions. A one-sided version of this result, when the density is concentrated in the positive orthant and depends there only on the maximal component, leads to a Schoenberg-type theorem.
0167-7152/95/$9.50 © 1995 Elsevier Science B.V. All rights reserved
SSDI 0167-7152(94)00240-1

2. The two-sided case

Given $n > 1$, let $G_n$ be the class of probability densities of the form

$$p_n(x_1, \ldots, x_n) = g_n(x_1 \vee \cdots \vee x_n,\; x_1 \wedge \cdots \wedge x_n), \tag{1}$$
where $\vee = \max$, $\wedge = \min$ and $g_n$ is a measurable non-negative function on $\mathbb{R}^2_> = \{(s,t) \in \mathbb{R}^2 : s > t\}$. Computing the total probability integral we see that such a $g_n$ must satisfy

$$n(n-1) \iint_{\mathbb{R}^2_>} g_n(s,t)\,(s-t)^{n-2}\, ds\, dt = 1. \tag{2}$$
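As an illustration of where the factor in (2) comes from (a sketch added for the reader, not part of the original argument): there are $n(n-1)$ ordered ways to choose which coordinates attain the maximum $s$ and the minimum $t$, and the remaining $n-2$ coordinates range freely over $[t,s]$, contributing the volume $(s-t)^{n-2}$. Taking as a test case the function $g_n$ equal to the constant $(\alpha-\beta)^{-n}$ on $\beta \le t < s \le \alpha$ and zero elsewhere (the uniform density on a cube), condition (2) can be checked directly:

```latex
\begin{aligned}
n(n-1)\int_{\beta}^{\alpha}\!\int_{\beta}^{s} (\alpha-\beta)^{-n}(s-t)^{n-2}\,dt\,ds
 &= n(\alpha-\beta)^{-n}\int_{\beta}^{\alpha}(s-\beta)^{n-1}\,ds \\
 &= n(\alpha-\beta)^{-n}\cdot\frac{(\alpha-\beta)^{n}}{n} = 1 .
\end{aligned}
```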
Since the right-hand side of (1) is invariant under permutations of the $x_i$'s, this formula determines the distribution of an exchangeable random sequence of length $n$. We want to describe all extendible sequences of this kind. In what follows we do not distinguish between two versions of $g_n$ differing on a set of zero Lebesgue measure, since they describe the same probability distribution.

Integrating out any $n - m$ of the $x_i$'s in (1) we find that the $m$-dimensional marginal density $p_m$ is in $G_m$ for $m < n$, with $g_m$ given by

$$
\begin{aligned}
g_m(s,t) ={}& g_n(s,t)\,(s-t)^{n-m}
 + (n-m)\int_s^{\infty} g_n(u,t)\,(u-t)^{n-m-1}\,du \\
 &+ (n-m)\int_{-\infty}^{t} g_n(s,u)\,(s-u)^{n-m-1}\,du \\
 &+ (n-m)(n-m-1)\int_s^{\infty}\!\int_{-\infty}^{t} g_n(u,v)\,(u-v)^{n-m-2}\,dv\,du.
\end{aligned}
\tag{3}
$$
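The projection (3) can be sanity-checked numerically. The following is a minimal sketch, assuming a $g_n$ that vanishes outside a bounded range so that the infinite integrals may be truncated; the function names `uniform_g` and `project`, the truncation bounds `lo`, `hi` and the midpoint rule are illustrative choices, not taken from the paper. For the uniform density on $[\beta,\alpha]^n$ the projected function should again be the constant $(\alpha-\beta)^{-m}$.

```python
def uniform_g(alpha, beta, n):
    """g_n for the uniform density on [beta, alpha]^n (test case only)."""
    c = (alpha - beta) ** (-n)
    return lambda s, t: c if beta <= t < s <= alpha else 0.0


def project(g_n, n, m, s, t, lo, hi, steps=200):
    """Approximate g_m(s, t) from g_n via formula (3) using midpoint rules.

    Assumption: g_n vanishes for u > hi or v < lo, so the infinite
    integration ranges may be truncated to [s, hi] and [lo, t].
    """
    k = n - m  # number of coordinates integrated out, k >= 1
    hu = (hi - s) / steps
    hv = (t - lo) / steps
    us = [s + (i + 0.5) * hu for i in range(steps)]
    vs = [lo + (j + 0.5) * hv for j in range(steps)]

    # all removed coordinates lie inside [t, s]
    total = g_n(s, t) * (s - t) ** k
    # one removed coordinate is the new maximum u > s
    total += k * hu * sum(g_n(u, t) * (u - t) ** (k - 1) for u in us)
    # one removed coordinate is the new minimum v < t
    total += k * hv * sum(g_n(s, v) * (s - v) ** (k - 1) for v in vs)
    # both the new maximum and the new minimum are removed coordinates
    if k >= 2:
        total += k * (k - 1) * hu * hv * sum(
            g_n(u, v) * (u - v) ** (k - 2) for u in us for v in vs
        )
    return total


g6 = uniform_g(2.0, 0.0, 6)
approx_g3 = project(g6, n=6, m=3, s=1.5, t=0.5, lo=0.0, hi=2.0)
exact_g3 = uniform_g(2.0, 0.0, 3)(1.5, 0.5)  # (alpha - beta)^(-3)
```

Here `approx_g3` and `exact_g3` should agree up to quadrature error, which reflects the fact that marginals of a uniform distribution on a cube are uniform on the lower-dimensional cube.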
Thus $\{G_n;\ n = 1, 2, \ldots\}$ form a projective family with projections $\pi_{nm} : g_n \mapsto g_m$. The projective limit $G_\infty$ is the convex set of sequences $\{p_1, p_2, \ldots\}$ satisfying $\pi_{nm} p_n = p_m$. Any point of $G_\infty$ corresponds to an infinite exchangeable sequence of random variables. A restricted version of the extendibility problem might be posed as the problem of describing $G_\infty$ or, equivalently, its extreme points. It turns out that for the family of distributions under consideration the solution of the restricted problem also provides a description of all extendible sequences.

Given $\alpha > \beta$ let

$$p_n^{\alpha,\beta}(x_1, \ldots, x_n) = (\alpha - \beta)^{-n} \prod_{i=1}^{n} 1_{[\beta,\alpha]}(x_i)$$

be the density of the uniform distribution in $[\beta, \alpha]^n$. Clearly, $p_n^{\alpha,\beta} \in G_n$.

Theorem 1. Assume $n \ge 4$ and $p_n \in G_n$. The corresponding random sequence is extendible if and only if $g_n(s,t)$ has a version which is non-increasing in $s$, non-decreasing in $t$ and satisfies

$$g_n(s',t') + g_n(s,t) - g_n(s',t) - g_n(s,t') \ge 0 \quad \text{for } t \le t' < s' \le s. \tag{4}$$

In this case, there exists a unique probability measure $\mu$ on $\mathbb{R}^2_>$ such that

$$p_n(x_1, \ldots, x_n) = \iint_{\mathbb{R}^2_>} p_n^{\alpha,\beta}(x_1, \ldots, x_n)\, d\mu(\alpha, \beta). \tag{5}$$
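The conditions of Theorem 1 are easy to probe numerically for a candidate function. The following is a minimal sketch, assuming a finite grid of test points; the function name `satisfies_conditions` and the two example functions are hypothetical. A grid check can only refute the conditions or give evidence for them, and it does not test the normalization (2).

```python
from itertools import product


def satisfies_conditions(g, grid, tol=1e-12):
    """Check monotonicity and inequality (4) of Theorem 1 on a finite grid.

    Only grid points are tested, so True is evidence rather than proof;
    the normalization (2) is not checked here.
    """
    pts = sorted(grid)
    # non-increasing in s for fixed t
    for t, s1, s2 in product(pts, repeat=3):
        if t < s1 <= s2 and g(s1, t) < g(s2, t) - tol:
            return False
    # non-decreasing in t for fixed s
    for t1, t2, s in product(pts, repeat=3):
        if t1 <= t2 < s and g(s, t1) > g(s, t2) + tol:
            return False
    # inequality (4) for t <= t' < s' <= s
    for t, tp, sp, s in product(pts, repeat=4):
        if t <= tp < sp <= s:
            if g(sp, tp) + g(s, t) - g(sp, t) - g(s, tp) < -tol:
                return False
    return True


grid = [i / 10 for i in range(11)]


def uniform(s, t):
    # uniform density on [0, 1]^n: constant on 0 <= t < s <= 1
    return 1.0 if 0.0 <= t < s <= 1.0 else 0.0


def increasing_in_s(s, t):
    # hypothetical counterexample: grows with the range s - t
    return s - t


uniform_ok = satisfies_conditions(uniform, grid)
counterexample_ok = satisfies_conditions(increasing_in_s, grid)
```

The uniform case passes, while a function that grows with the range $s - t$ fails the monotonicity requirement, so by Theorem 1 a density of that shape (for $n \ge 4$) would not be extendible.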
Proof. First of all, we will identify all product distributions in $G_n$. To do this, we need to solve the equation

$$p_n(x_1, \ldots, x_n) = p_1(x_1) \cdots p_1(x_n),$$

which in $G_n$ implies

$$g_n(x_1 \vee \cdots \vee x_n,\; x_1 \wedge \cdots \wedge x_n) = p_1(x_n)\, g_{n-1}(x_1 \vee \cdots \vee x_{n-1},\; x_1 \wedge \cdots \wedge x_{n-1}).$$
Fixing $x_1, \ldots, x_{n-1}$, we see that the left-hand side here does not depend on $x_n$ as long as this variable takes values between $x_1 \wedge \cdots \wedge x_{n-1}$ and $x_1 \vee \cdots \vee x_{n-1}$, hence $p_1$ must be constant in this range. Furthermore, choosing $x_1$ arbitrarily close to the infimum of the support of $p_1$, and $x_2, \ldots, x_{n-1}$ close to its supremum so that $p_{n-1}(x_1, \ldots, x_{n-1}) > 0$ (this is always possible for product densities), we see that $p_1$ is constant on its support, whence $p_1$ is the density of a uniform distribution on some interval.

Mixing uniforms $p_n^{\alpha,\beta}$ with a directing measure $\mu$ on $\mathbb{R}^2_>$ yields a density in $G_n$, with

$$g_n(s,t) = \int_s^{\infty}\!\int_{-\infty}^{t} (\alpha - \beta)^{-n}\, d\mu(\alpha, \beta), \tag{6}$$
non-increasing in $s$ and non-decreasing in $t$ (for simplicity, such functions are called monotone in the sequel). As a mixture of products, such a density is extendible.

Conversely, let $g_n$ be monotone in the above sense. Define a measure $\nu_n$ on $\mathbb{R}^2_>$ by setting

$$\nu_n\big((s'; s) \times (t; t')\big) = g_n(s',t') + g_n(s,t) - g_n(s',t) - g_n(s,t')$$

for $s, s', t, t'$ as in (4) and such that $g_n$ is continuous at the four points $(s,t)$, $(s',t')$, $(s',t)$ and $(s,t')$. Let the measures $\nu_0, \ldots, \nu_{n-1}$ be absolutely continuous with respect to $\nu_n$, with the Radon-Nikodym derivative

$$\frac{d\nu_m}{d\nu_n}(s,t) = (s-t)^{n-m}. \tag{7}$$
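Formulas (6) and (7) become concrete when the directing measure is discrete. The following is a minimal sketch, assuming $\mu = \nu_0$ is a finite sum of point masses at pairs $(\alpha_j, \beta_j)$; the names `nu_atoms` and `g_from_mixture` and the two-atom example are hypothetical. By (7), each atom of $\nu_m$ carries the weight of the corresponding atom of $\mu$ multiplied by $(\alpha_j - \beta_j)^{-m}$, and by (6) the value $g_n(s,t)$ sums the $\nu_n$-weights of the atoms with $\alpha_j \ge s$ and $\beta_j \le t$.

```python
def nu_atoms(atoms, m):
    """Atoms of nu_m for a discrete directing measure mu = nu_0.

    atoms: list of (weight, alpha, beta) with alpha > beta; by (7),
    nu_m has density (s - t)^(-m) with respect to mu.
    """
    return [(w * (a - b) ** (-m), a, b) for w, a, b in atoms]


def g_from_mixture(atoms, n):
    """g_n(s, t) from (6): sum over atoms with alpha >= s and beta <= t."""
    weighted = nu_atoms(atoms, n)
    return lambda s, t: sum(w for w, a, b in weighted if a >= s and b <= t)


# hypothetical directing measure: two equally likely intervals
mu = [(0.5, 1.0, 0.0), (0.5, 2.0, -1.0)]
g4 = g_from_mixture(mu, 4)
inner = g4(0.5, 0.2)  # both intervals contain [0.2, 0.5]
outer = g4(1.5, 0.2)  # only [-1, 2] contains [0.2, 1.5]
```

Widening the range from $[0.2, 0.5]$ to $[0.2, 1.5]$ drops the atom at $(1, 0)$, so `outer` is smaller than `inner`, in line with the monotonicity of $g_n$ in $s$.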
By construction, (6) holds with $\mu = \nu_0$. Integrating by parts translates (2) into $\nu_0(\mathbb{R}^2_>) = 1$, thus showing that $\nu_0$ is a probability measure, and (5) follows. In fact, a representation similar to (6) holds also for $g_m = \pi_{nm} g_n$, with $\mu = \nu_m$, $m = 1, \ldots, n-1$.

It remains to prove that extendibility implies the monotonicity of $g_n$. Fix some $g_n$, $n \ge 4$, and assume that there is an infinite exchangeable sequence $X_1, X_2, \ldots$ whose $n$-dimensional marginal density is given by (1). In what follows we combine an idea found in Aldous (1985, p. 8) with a "delta function" argument.

If $\xi_1, \xi_2, \ldots$ is an exchangeable sequence of square integrable random variables, then $E\xi_1\xi_2 \ge 0$. Indeed, write

$$0 \le E(\xi_1 + \cdots + \xi_n)^2 = n E\xi_1^2 + n(n-1) E\xi_1\xi_2,$$
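divide by $n^2$ and let $n \to \infty$.

We start with proving the monotonicity of $g_4 = \pi_{n4} g_n$. By Lusin's theorem, for any $\varepsilon > 0$ there exists a measurable set $C_\varepsilon \subset \mathbb{R}^2_>$ such that $g_4$ is continuous on $C_\varepsilon$ and the complement of $C_\varepsilon$ has Lebesgue measure less than $\varepsilon$. Removing from $C_\varepsilon$ a null set we can leave there only density points $(s,t)$ satisfying

$$(2\delta)^{-2}\, \mathrm{meas}\big(C_\varepsilon \cap [s-\delta, s+\delta] \times [t-\delta, t+\delta]\big) \to 1, \qquad \delta \to 0.$$

Pick two pairs $(s,t), (s',t') \in C_\varepsilon$ satisfying $t \le t' < s' \le s$, and set

$$A = C_\varepsilon \cap [s-\delta, s+\delta] \times [t-\delta, t+\delta]; \qquad B = C_\varepsilon \cap [s'-\delta, s'+\delta] \times [t'-\delta, t'+\delta],$$

$$\xi_i = (2\delta)^{-2}\big(1_A(X_{2i-1}, X_{2i}) - 1_B(X_{2i-1}, X_{2i})\big).$$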
The variables $\xi_i$ form an infinite exchangeable sequence since the pairs $(X_1, X_2), (X_3, X_4), \ldots$ are exchangeable. Applying the above argument, we have

$$
\begin{aligned}
0 \le E\xi_1\xi_2
 &= (2\delta)^{-4}\Big(\iiiint_{A\times A} p_4\,dx + \iiiint_{B\times B} p_4\,dx - 2\iiiint_{A\times B} p_4\,dx\Big) \\
 &= (2\delta)^{-4}\Big(\iiiint_{A\times A} g_4(x_1 \vee x_2 \vee x_3 \vee x_4,\; x_1 \wedge x_2 \wedge x_3 \wedge x_4)\,dx \\
 &\qquad\qquad + \iiiint_{B\times B} g_4(x_1 \vee x_2 \vee x_3 \vee x_4,\; x_1 \wedge x_2 \wedge x_3 \wedge x_4)\,dx \\
 &\qquad\qquad - 2\iiiint_{A\times B} g_4(x_1 \vee x_2 \vee x_3 \vee x_4,\; x_1 \wedge x_2 \wedge x_3 \wedge x_4)\,dx\Big) \\
 &= q_1 + q_2 - 2q_3,
\end{aligned}
$$

where $p_4 = p_4(x_1, x_2, x_3, x_4)$, $dx = dx_1\,dx_2\,dx_3\,dx_4$, and $q_1 \to g_4(s,t)$, $q_2 \to g_4(s',t')$, $q_3 \to g_4(s,t)$ as $\delta \to 0$, as follows from the mean value theorem for the Lebesgue integral, continuity and the density property of $(s,t)$ and $(s',t')$. Consequently, $g_4(s',t') - g_4(s,t) \ge 0$. Letting $\varepsilon \to 0$ along a countable sequence we see that $g_4$ has a version on $\mathbb{R}^2_>$ which is non-increasing in $s$ and non-decreasing in $t$. It follows that $g_4$ can be selected almost everywhere continuous and left-continuous in both arguments.

Given four continuity points $(s,t)$, $(s',t')$, $(s',t)$ and $(s,t')$, introduce another sequence of random variables

$$\zeta_i = (2\delta)^{-2}\big(1_A(X_{2i-1}, X_{2i}) + 1_B(X_{2i-1}, X_{2i}) - 1_C(X_{2i-1}, X_{2i}) - 1_D(X_{2i-1}, X_{2i})\big),$$
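where

$$
\begin{aligned}
A &= [s-\delta; s+\delta] \times [t-\delta; t+\delta]; & B &= [s'-\delta; s'+\delta] \times [t'-\delta; t'+\delta]; \\
C &= [s-\delta; s+\delta] \times [t'-\delta; t'+\delta]; & D &= [s'-\delta; s'+\delta] \times [t-\delta; t+\delta].
\end{aligned}
$$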
By the very same argument, computing $E\zeta_1\zeta_2$ and letting $\delta \to 0$, we obtain the inequality (4) for the values of $g_4$ at these points. Now left continuity guarantees that this inequality holds in fact for arbitrary $t \le t' < s' \le s$.

For $n > 4$, a monotone $g_4$ can be lifted to $G_n$: invert (7) to obtain a measure $\nu_n$ by setting $d\nu_n / d\nu_4(s,t) = (s-t)^{4-n}$ and note that $\nu_n$ specifies a monotone $g_n$ via the formula

$$g_n(s,t) = \nu_n\big([s; \infty) \times (-\infty; t)\big).$$

That is, monotonicity of $g_4$ implies monotonicity of $g_n$. The theorem is proved. □
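For readers who want the step from $E\zeta_1\zeta_2 \ge 0$ to inequality (4) spelled out, here is a sketch of the limit computation (an expansion added for illustration, not taken verbatim from the paper). For two of the rectangles $A, B, C, D$, centred at $(a,b)$ and $(c,d)$ say, the normalized probability $(2\delta)^{-4} P\big((X_1,X_2) \in U,\ (X_3,X_4) \in V\big)$ tends to $g_4(a \vee c,\ b \wedge d)$ at continuity points, because every first coordinate is at least $s'$ and every second coordinate is at most $t'$. Collecting the sixteen terms of the product, with each mixed pair counted twice by exchangeability:

```latex
\begin{aligned}
\lim_{\delta\to 0} E\zeta_1\zeta_2
 &= \underbrace{g_4(s,t)}_{A\times A} + \underbrace{g_4(s',t')}_{B\times B}
  + \underbrace{g_4(s,t')}_{C\times C} + \underbrace{g_4(s',t)}_{D\times D} \\
 &\quad + 2\underbrace{g_4(s,t)}_{A\times B} + 2\underbrace{g_4(s,t)}_{C\times D}
  - 2\underbrace{g_4(s,t)}_{A\times C} - 2\underbrace{g_4(s,t)}_{A\times D}
  - 2\underbrace{g_4(s,t')}_{B\times C} - 2\underbrace{g_4(s',t)}_{B\times D} \\
 &= g_4(s',t') + g_4(s,t) - g_4(s',t) - g_4(s,t') \;\ge\; 0 .
\end{aligned}
```

The coefficients of $g_4(s,t)$ add up to $1 + 2 + 2 - 2 - 2 = 1$, and those of $g_4(s,t')$ and $g_4(s',t)$ to $1 - 2 = -1$ each, which is exactly the left-hand side of (4).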
It is easy to show that the whole projective sequence $p_2, p_3, \ldots$ is already determined by $p_2$. The projection $g_n \mapsto g_m$, $n > m$, given by (3) can always be inverted, but for general $g_m$ this does not produce a positive $g_n$. In this way a criterion of finite extendibility in the family $\{G_n\}$ might be worked out. For $n = 2, 3$ the monotonicity of $g_n$ still implies extendibility in $G_\infty$, but for $n = 2$ this condition is not necessary for general extendibility, because any bivariate symmetric density satisfies (1). We do not know whether for $n = 3$ general extendibility implies (4).
3. The one-sided case

Next is a one-sided version of the above result. The proof is completely analogous.

Theorem 2. Assume a probability density is of the form

$$p_n(x_1, \ldots, x_n) = g_n(x_1 \vee \cdots \vee x_n)\, 1_{[0,\infty)}(x_1 \wedge \cdots \wedge x_n), \qquad n > 1, \tag{8}$$

where $g_n$ satisfies

$$n \int_{\mathbb{R}_+} g_n(x)\, x^{n-1}\, dx = 1.$$
The corresponding random sequence is extendible if and only if $g_n$ is non-increasing. In this case the sequence is a scaled mixture of uniform distributions on $[0, \beta]^n$.

The family of densities (8) has been studied in Gnedin (1994) in connection with the best choice problem, where the Markov property of the sequence of record values, $X_1, X_1 \vee X_2, \ldots, X_1 \vee \cdots \vee X_n$, was exploited. A characterization of discrete uniform distributions as extreme solutions of an equation for point probabilities analogous to (8) is found in Lauritzen (1975), Ressel (1985, Example 4), and Lauritzen (1988, p. 176), where the discrete case is treated as an illustration of a semigroup approach.

The famous Schoenberg theorem (cf. Aldous, 1985) about spherically symmetric sequences may be reformulated in the following manner: if for each $n = 1, 2, \ldots$ the $n$-dimensional marginal density of an infinite exchangeable sequence depends solely on the Euclidean norm of the argument, then the sequence is a scaled mixture of i.i.d. normal sequences with zero mean. The following result tells what happens if we use the maximum norm instead of the Euclidean one.

Corollary. An infinite exchangeable sequence with marginal densities of the form

$$p_n(x_1, \ldots, x_n) = g_n(|x_1| \vee \cdots \vee |x_n|), \qquad n = 1, 2, \ldots,$$

is representable as a mixture of i.i.d. uniform on $(-\theta, \theta)$ sequences.

Proof. Indeed, the distribution is invariant with respect to the reflections $X_i \mapsto -X_i$, and by Theorem 2 the sequence $|X_1|, |X_2|, \ldots$ is a mixture of uniforms on $[0, \theta]$. By symmetry, the distribution of $X_1, X_2, \ldots$ is also a mixture of uniforms. □
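The one-sided setting of Theorem 2 can be illustrated in the same spirit. The following is a minimal sketch, assuming a discrete mixing measure over the scale parameter; the names `one_sided_g` and `total_mass`, the two-atom example and the midpoint rule are hypothetical choices. For a mixture of uniform distributions on $[0,\theta]^n$ the function $g_n(x)$ sums $w\,\theta^{-n}$ over the atoms with $\theta \ge x$, so it is non-increasing, and the normalization stated after (8) should hold.

```python
def one_sided_g(atoms, n):
    """g_n(x) = sum of w * theta^(-n) over atoms with theta >= x.

    atoms: list of (weight, theta) describing a hypothetical discrete mixing
    measure over uniform distributions on [0, theta]^n.
    """
    return lambda x: sum(w * theta ** (-n) for w, theta in atoms if theta >= x)


def total_mass(g, n, upper, steps=2000):
    """Midpoint approximation of n * integral_0^upper g(x) x^(n-1) dx."""
    h = upper / steps
    return n * h * sum(
        g((i + 0.5) * h) * ((i + 0.5) * h) ** (n - 1) for i in range(steps)
    )


atoms = [(0.3, 1.0), (0.7, 2.0)]
g3 = one_sided_g(atoms, 3)
mass = total_mass(g3, n=3, upper=2.0)
values = [g3(0.25 * i) for i in range(1, 9)]
is_non_increasing = all(a >= b for a, b in zip(values, values[1:]))
```

Here `mass` should be close to 1 and `is_non_increasing` should hold, matching the "if" direction of Theorem 2 for this particular mixture.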
Acknowledgements

The author is grateful to Lutz Mattner for pointing out an error in an earlier version of this paper.
References

Aldous, D.J. (1985), Exchangeability and Related Topics, Lecture Notes in Mathematics, Vol. 1117 (Springer, New York).
Diaconis, P.W., M.L. Eaton and S.L. Lauritzen (1992), Finite de Finetti theorems in linear models and multivariate analysis, Scand. J. Statist. 19, 289-315.
Gnedin, A.V. (1994), Pick the largest number: a solution to the game of Googol, Ann. Probab. 23, 1588-1595.
Lauritzen, S.L. (1975), General exponential models for discrete observations, Scand. J. Statist. 11, 65-91.
Lauritzen, S.L. (1988), Extremal Families and Systems of Sufficient Statistics, Lecture Notes in Statistics, Vol. 49 (Springer, New York).
Ressel, P. (1985), De Finetti-type theorems: an analytical approach, Ann. Probab. 13, 898-922.
von Plato, J. (1991), Finite partial exchangeability, Statist. Probab. Lett. 11, 99-102.