Peakedness of linear forms in ensembles and mixtures


Statistics & Probability Letters 35 (1997) 277-282

ELSEVIER

D.R. Jensen
Department of Statistics, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA

Received 1 October 1994; received in revised form 1 October 1996

Abstract

Linear forms $W(p) = p_1X_1 + \cdots + p_nX_n$ are studied in random variables $\{X_1, \ldots, X_n\}$ having common location-scale parameters $(\mu, \sigma^2)$. For certain distributions on $\mathbb{R}^n$ having star-shaped contours and others, it is shown that if $q = [q_1, \ldots, q_n]'$ majorizes $p = [p_1, \ldots, p_n]'$, then $W(p)$ is more peaked about $\mu$ than $W(q)$ in the sense of Birnbaum (1948). In particular, the peakedness about $\mu$ of $\bar{X}(n) = (X_1 + \cdots + X_n)/n$ increases monotonically with $n$. If neither $c$ nor $d$ majorizes the other, then $\{W(c), W(d)\}$ are less peaked about $\mu$ than $W(c \wedge d)$, and are more peaked than $W(c \vee d)$. This extends the findings of Proschan (1965) and Olkin and Tong (1988). Stochastic majorants and minorants for linear estimators are given in certain ensembles, including star-contoured distributions on $\mathbb{R}^n$ if ordered by peakedness. © 1997 Elsevier Science B.V.

AMS classifications: 62E10, 62F11

Keywords: Estimating location; Weighted averages; Concentration; Majorization; Star-contoured distributions; Monotone consistency

0167-7152/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved. PII S0167-7152(97)00023-0

1. Introduction

Following Birnbaum (1948), a random variable $Y$ is said to be more peaked about $a \in \mathbb{R}^1$ than is $Z$ about $b \in \mathbb{R}^1$ if $P(|Y - a| \le t) \ge P(|Z - b| \le t)$ for every $t \in (0, \infty)$, in which case we write $(Y - a) \succeq_P (Z - b)$. Let $\{X_1, \ldots, X_n\}$ be random scalars having common location-scale parameters $(\mu, \sigma^2)$, and let $\bar{X}(n) = (X_1 + \cdots + X_n)/n$. Laws of large numbers assert that $\bar{X}(n)$ tends to concentrate in probability around $\mu$, as quantified under second moments by Chebychev's inequality $P(|\bar{X}(n) - \mu| \le c) \ge 1 - \sigma^2/nc^2$, in which the bounds increase monotonically with $n$. Although the probabilities themselves need not be monotone, it is of interest to study circumstances for which this is the case; see Proschan (1965).

More generally, let $W(p) = p_1X_1 + \cdots + p_nX_n$ such that $\{0 \le p_i \le 1;\ p_1 + \cdots + p_n = 1\}$, and suppose that $q = [q_1, \ldots, q_n]'$ majorizes $p = [p_1, \ldots, p_n]'$. If $\{X_1, \ldots, X_n\}$ are independent and identically distributed (i.i.d.) having either a symmetric log-concave density or its convolution with a Cauchy density, Proschan (1965) has shown that $W(p)$ is more peaked about $\mu$ than $W(q)$. Olkin and Tong (1988) extend these findings to include certain elliptical distributions on $\mathbb{R}^n$ that are unimodal in the sense of Anderson (1955). In consequence, Proschan (1965) demonstrated that the peakedness of $\bar{X}(n)$ about $\mu$ increases monotonically with $n$ for distributions of the types considered there.

Standard models seldom capture the complexities of modern experimental error structures. Scientific measurements often are subject to common errors of calibration and to scaling in a random experiment. Elementary error analyses show that observations $\{X_1, \ldots, X_n\}$ sharing common calibration errors are typically dependent and often are equicorrelated with parameter $\rho$. Moreover, induced correlations and scaling may vary randomly from experiment to experiment. Accordingly, we extend earlier findings on the peakedness of linear forms to include mixtures of elliptical distributions. These may have heterogeneous location parameters, conditionally exchangeable errors, and their densities may be star-contoured. It is seen that our developments require neither independence, log-concavity, elliptical symmetry, nor unimodality, as assumed in earlier studies.
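To make the contrast between the Chebychev bound and the probabilities themselves concrete, the following minimal sketch computes both in one special case. It assumes i.i.d. Gaussian observations, a choice made here purely for illustration and not one the paper singles out; the function names are hypothetical.

```python
import math


def gaussian_mean_peakedness(n, c, sigma=1.0):
    """P(|Xbar(n) - mu| <= c) when X_1..X_n are i.i.d. N(mu, sigma^2).

    Illustrative assumption: Gaussian errors, so Xbar(n) - mu ~ N(0, sigma^2/n)
    and the probability equals erf(c * sqrt(n) / (sigma * sqrt(2))).
    """
    return math.erf(c * math.sqrt(n) / (sigma * math.sqrt(2.0)))


def chebyshev_lower_bound(n, c, sigma=1.0):
    """Chebychev lower bound 1 - sigma^2 / (n c^2), floored at zero."""
    return max(0.0, 1.0 - sigma ** 2 / (n * c ** 2))


if __name__ == "__main__":
    c = 0.5
    for n in (1, 2, 4, 8, 16):
        exact = gaussian_mean_peakedness(n, c)
        bound = chebyshev_lower_bound(n, c)
        print(f"n={n:2d}  exact={exact:.4f}  Chebychev bound={bound:.4f}")
```

In this Gaussian case both the bound and the exact probability increase with $n$; the interest of the paper lies in dependent and heavy-tailed settings where the bound may be unavailable.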

2. Preliminaries

We fix notation and then catalog the types of distributions to be developed subsequently.

2.1. Notation

Here $\mathbb{R}^n$ and $\mathbb{R}^n_+$ denote Euclidean $n$-space and its positive orthant; $S_n^+$ consists of positive definite $(n \times n)$ matrices; $I_n$ is the $(n \times n)$ identity matrix, and $1_n = [1, \ldots, 1]' \in \mathbb{R}^n$; and $J(\rho) \in S_n^+$ is the equicorrelation matrix given by $J(\rho) = \sigma^2[(1 - \rho)I_n + \rho 1_n 1_n']$ with $\sigma^2 > 0$ and $\{a(n) < \rho < 1\}$, where $a(n) = -(n - 1)^{-1}$. Abbreviations include p.d.f., c.d.f., and c.h.f. as probability density, cumulative distribution and characteristic functions, respectively, and $\mathcal{L}(X)$ designates the law of distribution of $X \in \mathbb{R}^n$.

Consider $p = [p_1, \ldots, p_n]'$ and $q = [q_1, \ldots, q_n]'$ such that $\{p_1 \ge \cdots \ge p_n\}$ and $\{q_1 \ge \cdots \ge q_n\}$. If $q_1 + \cdots + q_k \ge p_1 + \cdots + p_k$ for $1 \le k \le n - 1$, and $q_1 + \cdots + q_n = p_1 + \cdots + p_n$, then $q$ is said to majorize $p$, and we write $q \succ p$. In particular, $(S(1), \succ)$ is the ordered simplex given by $S(1) = \{p \in \mathbb{R}^n: 0 \le p_i \le 1;\ p_1 + \cdots + p_n = 1\}$.

A set $S \subset \mathbb{R}^n$ is said to be star-shaped about $0 \in \mathbb{R}^n$ if for every $x \in S$, the line segment connecting $0$ to $x$ is in $S$. A nonnegative function $\varphi(\cdot)$ is said to be star-contoured about $0 \in \mathbb{R}^n$ if its level sets $C(r) = \{x \in \mathbb{R}^n: \varphi(x) > r\}$ are either star-shaped about $0 \in \mathbb{R}^n$, or are empty.

Subsequent developments prompt us to consider yet another concept of consistency in estimation. Details follow.
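As an aside on the notation before the definition is stated, the majorization ordering just introduced can be checked mechanically from partial sums of the ordered coordinates. The sketch below is only an illustration: the function name `majorizes` is hypothetical, and exact rational arithmetic is assumed so that the equal-sum condition can be tested without rounding error.

```python
from fractions import Fraction


def majorizes(q, p):
    """Return True when q majorizes p in the sense of Section 2.1.

    Both vectors are sorted in decreasing order; q majorizes p when every
    leading partial sum of q is at least that of p and the totals agree.
    """
    if len(q) != len(p):
        raise ValueError("vectors must have the same length")
    qs = sorted(q, reverse=True)
    ps = sorted(p, reverse=True)
    if sum(qs) != sum(ps):
        return False
    partial_q = partial_p = 0
    # Compare partial sums for k = 1, ..., n - 1.
    for qk, pk in zip(qs[:-1], ps[:-1]):
        partial_q += qk
        partial_p += pk
        if partial_q < partial_p:
            return False
    return True


if __name__ == "__main__":
    third, half = Fraction(1, 3), Fraction(1, 2)
    p = [third, third, third]          # equal weights
    q = [half, half, Fraction(0)]      # weights concentrated on two terms
    c = [Fraction(3, 5), Fraction(1, 5), Fraction(1, 5)]
    print(majorizes(q, p), majorizes(p, q))
    print(majorizes(c, q), majorizes(q, c))
```

In the usage example the equal-weight vector is majorized by every other element of $S(1)$, whereas the pair `c`, `q` is not comparable in either direction, which is the situation the lattice bounds of Section 3 address.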

Definition 1. Let $\{T_n;\ n = 1, 2, \ldots\}$ be weakly consistent for a parameter $\theta \in \mathbb{R}^1$. If in addition $T_n$ becomes successively more peaked about $\theta$ as $n$ increases, then $T_n$ is said to exhibit monotone consistency for $\theta$.

2.2. Distributions

Suppose that the c.h.f. of $X = [X_1, \ldots, X_n]' \in \mathbb{R}^n$ has the form $\phi_X(t) = E(e^{it'X}) = e^{it'\theta}\phi(t'\Sigma t)$ with argument $t \in \mathbb{R}^n$ and parameters $(\theta, \Sigma) \in \mathbb{R}^n \times S_n^+$. Then $\mathcal{L}(X)$ is said to be elliptically contoured on $\mathbb{R}^n$ with location-scale parameters $(\theta, \Sigma)$, and we write $\mathcal{L}(X) = E_n(\theta, \Sigma, \phi)$. Properties of these distributions are developed in Cambanis et al. (1981). Let $E(n) = \{E_n(\theta, \Sigma, \phi);\ (\theta, \Sigma) \in \mathbb{R}^n \times S_n^+,\ \phi \in \Phi\}$ comprise the class of all such distributions. Corresponding to each $\phi \in \Phi$ is a radial density $g_\phi(\cdot)$ on $[0, \infty)$. Let $U(n)$ consist of elliptical distributions on $\mathbb{R}^n$ that are unimodal in the sense of Anderson (1955); for these, $g_\phi(\cdot)$ is decreasing on $[0, \infty)$. Further, let $G(n)$ comprise the scale mixtures of Gaussian laws on $\mathbb{R}^n$ with typical member $G_n(\theta, \Sigma, F)$ having a density of the form

$$g_n(x; \theta, \Sigma, F) = \int_0^\infty (2\pi t)^{-n/2} |\Sigma|^{-1/2} e^{-Q(x;\theta,\Sigma)/2t}\, dF(t), \qquad (2.1)$$


where $Q(x; \theta, \Sigma) = (x - \theta)'\Sigma^{-1}(x - \theta)$ and $F(\cdot)$ belongs to the class $F_0$ of all c.d.f.s on $\mathbb{R}^1_+$. A standard result gives the inclusion relations $G(n) \subset U(n) \subset E(n)$.

To model equicorrelated errors with random correlations, let $G_0$ be the class of c.d.f.s on the interval $(a(n), 1)$ with $a(n) = -(n - 1)^{-1}$. Consider mixtures of corresponding elliptical measures having c.h.f.s of the type

$$\phi_X(t) = e^{it'\theta} \int_{a(n)}^{1} \phi(t'J(r)t)\, dG(r). \qquad (2.2)$$

Designate the corresponding distribution by $ME_n(\theta, \sigma^2, \phi, G)$, and by

$$ME(n) = \{ME_n(\theta, \sigma^2, \phi, G);\ (\theta, \sigma^2) \in \mathbb{R}^n \times \mathbb{R}^1_+,\ \phi \in \Phi,\ G \in G_0\} \qquad (2.3)$$

the class of all such location-scale families of mixtures. Subclasses of mixtures corresponding to $U(n)$ and $G(n)$ are designated by $MU(n)$ and $MG(n)$, typical members of which are denoted by $MU_n(\theta, \sigma^2, \phi, G)$ and by $MG_n(\theta, \sigma^2, F, G)$, respectively. These subclasses likewise satisfy the inclusion relations $MG(n) \subset MU(n) \subset ME(n)$. In particular, the p.d.f. for $MG_n(\theta, \sigma^2, F, G)$ is the mixture

$$g(x; \theta, \sigma^2, F, G) = \int_{a(n)}^{1} g_n(x; \theta, J(r), F)\, dG(r). \qquad (2.4)$$

It may be noted that all densities in $MU(n)$, and thus in $MG(n)$, are star-contoured.
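The two-stage structure of $MG(n)$, a scale variate drawn from $F$ and a correlation drawn from $G$, followed by an equicorrelated Gaussian vector, can be sketched as a sampler. This is a minimal illustration under stated assumptions rather than a construction from the paper: the correlation is restricted to $0 \le \rho < 1$ so that a single shared Gaussian factor suffices, the mixing laws (exponential scale, uniform correlation) are hypothetical, and all function names are invented here.

```python
import math
import random


def sample_mg(n, mu, sigma, draw_t, draw_rho, rng):
    """Draw one X in R^n from a Gaussian scale mixture with random equicorrelation.

    Assumption for this sketch: rho lies in [0, 1), so a shared Gaussian
    factor reproduces the equicorrelation structure of J(rho).
    """
    t = draw_t(rng)      # scale variate, in the role of t ~ F in (2.1)
    rho = draw_rho(rng)  # correlation variate, in the role of r ~ G in (2.4)
    if not 0.0 <= rho < 1.0:
        raise ValueError("this sketch only handles 0 <= rho < 1")
    scale = sigma * math.sqrt(t)
    shared = rng.gauss(0.0, 1.0)  # factor common to all coordinates
    # Given (t, rho): Var(X_i) = t*sigma^2 and Cov(X_i, X_j) = t*sigma^2*rho.
    return [
        mu + scale * (math.sqrt(rho) * shared
                      + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0))
        for _ in range(n)
    ]


def estimate_peakedness(n, c, draws=5000, seed=0):
    """Monte Carlo estimate of P(|mean(X) - mu| <= c) under hypothetical mixing laws."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0
    draw_t = lambda r: r.expovariate(1.0)     # hypothetical choice of F
    draw_rho = lambda r: r.uniform(0.0, 0.5)  # hypothetical choice of G
    hits = 0
    for _ in range(draws):
        x = sample_mg(n, mu, sigma, draw_t, draw_rho, rng)
        if abs(sum(x) / n - mu) <= c:
            hits += 1
    return hits / draws


if __name__ == "__main__":
    for n in (1, 4, 16):
        print(n, estimate_peakedness(n, 0.5))
```

Negative correlations in $(a(n), 0)$ would need a different factorization of $J(\rho)$, for instance a Cholesky factor, which is omitted to keep the sketch short.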

3. The main results

We first establish connections between majorization and peakedness orderings for weighted averages from $E_n(\theta, J(\rho), \phi)$. To these ends let $\mu(p) = p_1\theta_1 + \cdots + p_n\theta_n$. For later reference consider $c = [c_1, \ldots, c_n]' \in S(1)$ and $d = [d_1, \ldots, d_n]' \in S(1)$ such that neither $c \succ d$ nor $d \succ c$. Then there is a greatest lower bound $c \wedge d$ and a least upper bound $c \vee d$ in $S(1)$. For further details see Jensen (1992). A basic result is the following.

Theorem 1. Suppose that $\mathcal{L}(X) = ME_n(\theta, \sigma^2, \phi, G)$ with $\mathcal{L}(X|\rho) = E_n(\theta, J(\rho), \phi)$, and let $W(p) = p_1X_1 + \cdots + p_nX_n$ and $\mu(p) = p_1\theta_1 + \cdots + p_n\theta_n$.
(i) If $q \succ p$ with $\rho$ fixed, then $\mathcal{L}(W(p)|\rho)$ is more peaked about $\mu(p)$ than is $\mathcal{L}(W(q)|\rho)$ about $\mu(q)$.
(ii) If $q \succ p$, then $[W(p) - \mu(p)]$ is more peaked about $0 \in \mathbb{R}^1$ than $[W(q) - \mu(q)]$.
(iii) Suppose that $\theta = 0$ and that neither $c \succ d$ nor $d \succ c$. Then $\{W(c), W(d)\}$ are bounded in peakedness by

$$W(c \vee d) \preceq_P \{W(c), W(d)\} \preceq_P W(c \wedge d).$$

[...]

[...] $p_1 > p_2$, $p_1 + p_2 = b = q_1 + q_2$, with $0 < b < 1$, and that $\{p_i = q_i;\ 3 \le i \le n\}$. Clearly $q \succ p$. For $a \in \mathbb{R}^n$, the c.h.f. of $W(a) = a_1X_1 + \cdots + a_nX_n$, with argument $s \in \mathbb{R}^1$, has the form $e^{isa'\theta}\phi(a'J(\rho)a\, s^2)$ from standard properties of c.h.f.'s. This is the c.h.f. of a symmetric distribution on $\mathbb{R}^1$ centered at $a'\theta$ with scale parameter $a'J(\rho)a$. On letting

$$H(z) = [z^2 + (b - z)^2 + p_3^2 + \cdots + p_n^2] \qquad (3.1)$$

with $\{p_3, \ldots, p_n\}$ fixed such that $z \ge b - z$ and $b = 1 - p_3 - \cdots - p_n$, we infer that $H(z)$ is increasing for $z \in [b/2, b]$ with minimum at $b/2$. Under hypotheses of the theorem for conclusion (i) it follows that $q'q > p'p$,


whereas $[W(q) - \mu(q)]$ and $[W(p) - \mu(p)]$ have c.h.f.'s $\phi(q'J(\rho)q\, s^2)$ and $\phi(p'J(\rho)p\, s^2)$, respectively. We now apply Lemma 1 of Jensen and Foutz (1989), showing that if $\{\gamma^{-1}F(x/\gamma);\ \gamma \in (0, \infty)\}$ is a scale family of c.d.f.'s of symmetric distributions on $\mathbb{R}^1$, then this family decreases monotonically in peakedness as $\gamma$ increases. Note here that $q \succ p$ implies $q'J(\rho)q \ge p'J(\rho)p$. If $q \succ p$ generally, then $p$ can be derived from $q$ through successive applications of at most $n - 1$ T-transforms $T = \lambda I_n + (1 - \lambda)Q$ in which $0 \le \lambda \le 1$ and $Q$ is a permutation matrix that interchanges just two coordinates; see Marshall and Olkin (1979), p. 22. The foregoing developments apply at each step, to complete our proof for conclusion (i). As the ordering holds pointwise for each $\rho \in (a(n), 1)$, it thus holds unconditionally on taking expectation with respect to $G \in G_0$, giving conclusion (ii). Conclusion (iii) follows from (ii) and the fact that $(S(1), \succ)$ is a lattice. Details are given in Jensen (1992), including constructive methods for finding $c \wedge d$ and $c \vee d$. To see conclusion (iv), observe that $p'J(\rho)p = \sigma^2[(1 - \rho)p'p + \rho]$ for $p \in S(1)$. The difference in scale parameters is $p'[J(\rho_2) - J(\rho_1)]p = \sigma^2(\rho_2 - \rho_1)(1 - p'p)$, which is positive if and only if $\rho_2 > \rho_1$. The cited result of Jensen and Foutz (1989) now gives assertion (iv) as claimed. □

Specializing to the case $\theta = \mu 1_n$ shows that $W(p)$ is more peaked about $\mu$ than $W(q)$ whenever $q \succ p$.

We turn next to the comparative peakedness of $\bar{X}(n)$ as $n$ varies, where we take $\theta = \mu 1_n$. To consider $X(n) = [X_1, \ldots, X_n]'$ as $n$ varies, we proceed conditionally given $\rho$. Preserving elliptical symmetry of $\mathcal{L}(X(n)|\rho)$ as $n$ increases requires that $\mathcal{L}(X(n)|\rho)$ be extendible. In order that $\mathcal{L}(X(n)|\rho)$ should be extendible for all $n$, it is necessary and sufficient that $\mathcal{L}(X(n)|\rho)$ should be a scale mixture of Gaussian laws as in (2.1); see Schoenberg (1938).
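As a numerical aside on the proof of Theorem 1, the scale parameter $a'J(\rho)a$ can be evaluated directly and compared with the closed form $\sigma^2[(1 - \rho)\,a'a + \rho\,(a'1_n)^2]$, which follows from the definition of $J(\rho)$ in Section 2.1. The sketch below uses exact rational arithmetic; the helper names and the example vectors are hypothetical choices for illustration only.

```python
from fractions import Fraction


def equicorrelation(n, rho, sigma2=1):
    """J(rho) = sigma2 * [(1 - rho) I_n + rho 1_n 1_n'] as a nested list."""
    return [[sigma2 * (1 if i == j else rho) for j in range(n)] for i in range(n)]


def quad_form(a, J):
    """Scale parameter a' J a of the linear form W(a)."""
    n = len(a)
    return sum(a[i] * J[i][j] * a[j] for i in range(n) for j in range(n))


def closed_form(a, rho, sigma2=1):
    """sigma2 * [(1 - rho) a'a + rho (a'1)^2]; reduces to the proof's expression on S(1)."""
    return sigma2 * ((1 - rho) * sum(x * x for x in a) + rho * sum(a) ** 2)


if __name__ == "__main__":
    p = [Fraction(1, 3)] * 3
    q = [Fraction(1, 2), Fraction(1, 2), Fraction(0)]   # q majorizes p
    rho1, rho2 = Fraction(1, 4), Fraction(1, 2)
    J1, J2 = equicorrelation(3, rho1), equicorrelation(3, rho2)
    print(quad_form(p, J1), quad_form(q, J1))     # smaller scale for p
    print(quad_form(p, J2) - quad_form(p, J1))    # (rho2 - rho1)(1 - p'p)
```

A smaller scale parameter corresponds to greater peakedness within a symmetric scale family, which is how the ordering of these quadratic forms feeds into conclusions (i) and (iv).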
Accordingly, we suppose that $\mathcal{L}(X(n)|\rho) = G_n(\mu 1_n, J(\rho), F)$ for some $(\mu, \sigma^2) \in \mathbb{R}^1 \times \mathbb{R}^1_+$ and $F \in F_0$. A principal result is the following.

Theorem 2. Suppose that $\mathcal{L}(X) = MG_n(\mu 1_n, \sigma^2, F, G) \in MG(n)$, and let $\bar{X}(n) = (X_1 + \cdots + X_n)/n$. Then
(i) The peakedness of $\bar{X}(n)$ about $\mu$ increases monotonically with $n$.
(ii) If in addition $\bar{X}(n)$ is weakly consistent for $\mu$, then $\bar{X}(n)$ exhibits monotone consistency for $\mu$ according to Definition 1.

Proof. Temporarily fix $\rho$ and suppose that $\mathcal{L}(X|\rho) = G_n(\mu 1_n, J(\rho), F)$ with $F \in F_0$. For $n \ne m$, the marginal distributions of $\mathcal{L}(X|\rho)$ on $\mathbb{R}^1$ clearly have the same functional form for $G_n(\mu 1_n, J(\rho), F)$ as for $G_m(\mu 1_m, J(\rho), F)$. Arguments leading to the proof for Theorem 1 again apply, and conclusion (i) of the theorem holds for each fixed $\rho$ on noting that $[1/n, \ldots, 1/n, 0]$ majorizes $[1/(n + 1), \ldots, 1/(n + 1)]$ on $\mathbb{R}^{n+1}$. As the ordering holds pointwise for each $\rho \in (a(n), 1)$, it holds unconditionally as well on taking expectations with respect to $G \in G_0$. Conclusion (ii) follows from (i) and Definition 1, to complete our proof. □

In particular applications, the degree of concentration of $\bar{X}(n)$ about $\mu$ depends on the corresponding mixing distributions. We thus seek bounds on the comparative peakedness of $\bar{X}(n)$ under different models. In fact, the foregoing developments support stochastic bounds for ensembles of distributions on $\mathbb{R}^1$. Let $\{\nu_\omega(\cdot);\ \omega \in \Omega\}$ be an ensemble of probability measures on $\mathbb{R}^1$, and suppose that measures $\nu_m(\cdot)$ and $\nu_M(\cdot)$ can be found such that $\nu_m(\cdot) \preceq_P \{\nu_\omega(\cdot);\ \omega \in \Omega\} \preceq_P \nu_M(\cdot)$. Then $\nu_M(\cdot)$ is called a stochastic majorant, and $\nu_m(\cdot)$ a stochastic minorant, for the ensemble under ordering by peakedness on $\mathbb{R}^1$. To these ends let

$$\{MG_n(\mu 1_n, \sigma^2, F_\omega, G_\gamma);\ (\omega, \gamma) \in \Omega \times \Gamma\} \qquad (3.2)$$

be an ensemble in $MG(n)$, and let $F_m(t) = \inf_{\omega \in \Omega}\{F_\omega(t)\}$ and $F_M(t) = \sup_{\omega \in \Omega}\{F_\omega(t)\}$, both c.d.f.'s. Similarly let $G_m(t) = \inf_{\gamma \in \Gamma}\{G_\gamma(t)\}$ and $G_M(t) = \sup_{\gamma \in \Gamma}\{G_\gamma(t)\}$. Designate by $\bar{X}(F_\omega, G_\gamma)$ the sample mean from $MG_n(\mu 1_n, \sigma^2, F_\omega, G_\gamma)$. In particular, if $(F_1, F_2) \in F_0$ such that $F_1(t) \ge F_2(t)$ for every $t \in [0, \infty)$, then a result of Jensen (1984) shows that $\bar{X}(F_1, G)$ is more peaked about $\mu$ than is $\bar{X}(F_2, G)$ for each fixed $G \in G_0$.
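The envelopes $F_m$ and $F_M$ defined above are pointwise extremes of the mixing c.d.f.s and can be tabulated on a grid. The following sketch assumes a finite ensemble, so that infimum and supremum become minimum and maximum, and uses exponential c.d.f.s as a hypothetical family of scale-mixing laws; neither assumption comes from the paper.

```python
import math


def exponential_cdf(rate):
    """C.d.f. of an exponential law on [0, inf); a hypothetical mixing c.d.f. F_omega."""
    return lambda t: (1.0 - math.exp(-rate * t)) if t >= 0 else 0.0


def envelopes(cdfs, grid):
    """Pointwise lower and upper envelopes (F_m, F_M) of a finite family on a grid."""
    lower = [min(F(t) for F in cdfs) for t in grid]
    upper = [max(F(t) for F in cdfs) for t in grid]
    return lower, upper


if __name__ == "__main__":
    family = [exponential_cdf(r) for r in (0.5, 1.0, 2.0)]
    grid = [0.25 * k for k in range(0, 21)]
    f_m, f_M = envelopes(family, grid)
    for t, lo, hi in zip(grid[:5], f_m, f_M):
        print(f"t={t:.2f}  F_m={lo:.4f}  F_M={hi:.4f}")
```

For this particular family the c.d.f.s do not cross, so the envelopes coincide with the members having the smallest and largest rates; when members of an ensemble cross, the envelopes are generally not members themselves.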


Similarly from Theorem 1(iii) we infer that if $(G_1, G_2) \in G_0$ such that $G_1(r) \ge G_2(r)$ for every $r \in (a(n), 1)$, then $\bar{X}(F, G_1)$ is more peaked about $\mu$ than $\bar{X}(F, G_2)$ for each fixed $F \in F_0$. Repeated applications of these facts support the following conclusions for each fixed $n$.

Theorem 3. Consider linear estimation in the ensemble (3.2); let $\{F_m(t), F_M(t)\}$ and $\{G_m(t), G_M(t)\}$ be as defined; and let $W(a; F_\omega, G_\gamma) = a_1X_1 + \cdots + a_nX_n$ and $\bar{X}(F_\omega, G_\gamma) = (X_1 + \cdots + X_n)/n$ given the mixing distributions $(F_\omega, G_\gamma)$ in the model $MG_n(\mu 1_n, \sigma^2, F_\omega, G_\gamma)$. For $(F_1, F_2) \in F_0$, suppose that $F_1(t) \ge F_2(t)$ for every $t \in [0, \infty)$, and for $(G_1, G_2) \in G_0$, that $G_1(r) \ge G_2(r)$ for every $r \in (a(n), 1)$.
(i) If $q \succ p$, then

$$W(q; F_2, G_2) \preceq_P \{W(p; F_\omega, G_\gamma),\ W(q; F_\omega, G_\gamma)\} \preceq_P W(p; F_1, G_1) \qquad (3.3)$$

for $(\omega, \gamma) \in \{1, 2\} \times \{1, 2\}$.
(ii) Given the ensemble $\{\bar{X}(F_\omega, G_\gamma);\ (\omega, \gamma) \in \Omega \times \Gamma\}$ corresponding to (3.2), nested sequences of stochastic majorants and minorants for the ensemble are given by

$$\bar{X}(F_m, G_m) \preceq_P \{\bar{X}(F_m, G_\gamma);\ \gamma \in \Gamma\} \preceq_P \{\bar{X}(F_\omega, G_\gamma);\ (\omega, \gamma) \in \Omega \times \Gamma\} \preceq_P \{\bar{X}(F_M, G_\gamma);\ \gamma \in \Gamma\} \preceq_P \bar{X}(F_M, G_M). \qquad (3.4)$$

Proof. The theorem follows on ordering by majorization as in Theorem 1, then using elementary consequences of stochastic ordering as sketched in comments preceding the theorem. □

Note that the outer bounds in (3.4) are global bounds for the ensemble (3.2), not depending on the fine structure of mixtures so bounded. On the other hand, the inner bounds depend on the particular $G_\gamma$ with $\gamma \in \Gamma$. Alternatively, expressions within the second and fourth pairs of braces may be replaced by $\{\bar{X}(F_\omega, G_m);\ \omega \in \Omega\}$ and $\{\bar{X}(F_\omega, G_M);\ \omega \in \Omega\}$, respectively.

4. Conclusions

Variance is the standard gauge for the precision of an unbiased estimator. In contrast, ordering estimators by their comparative peakedness is stronger and more basic, implying that numerous other gauges of scatter are ordered correspondingly. In particular, ordering by peakedness implies not only that variances are reverse ordered, but so also are all even central moments where defined, as well as every symmetric interfractile range. The latter often serve as nonparametric gauges of precision. These and related conclusions follow since $Y$ is more peaked about $0 \in \mathbb{R}^1$ than $Z$ if and only if $E[\psi(Y)] \le E[\psi(Z)]$ for every function $\psi$ in the class $\Psi$ consisting of functions on $\mathbb{R}^1$ that are even, continuous, and increasing on $[0, \infty)$. For further details see Lemma 1 of Jensen and Foutz (1989).

The class $MU(n)$ contains mixtures of elliptical stable laws on $\mathbb{R}^n$, including spherical Cauchy laws. For all such distributions $\bar{X}(n)$ becomes successively more peaked about $\mu$ as $n$ increases, even without moments. Under spherical Cauchy errors it is known that $\bar{X}(n)$ is weakly consistent for $\mu$; see Jensen (1978). On combining these results as in Theorem 2(ii), we may conclude that $\bar{X}(n)$ exhibits monotone consistency for $\mu$ under spherical Cauchy errors, as in Definition 1. In contrast, if $\{X_1, \ldots, X_n\}$ instead were i.i.d. Cauchy variates, then $X_1$ and $\bar{X}(n)$ would have comparable peakedness for all $n$, as noted earlier by Proschan (1965).

References

Anderson, T.W., 1955. The integral of a symmetric unimodal function over a symmetric convex set and some probability inequalities. Proc. Amer. Math. Soc. 6, 170-176.
Birnbaum, Z.W., 1948. On random variables with comparable peakedness. Ann. Math. Statist. 19, 76-81.
Cambanis, S., Huang, S., Simons, G., 1981. On the theory of elliptically contoured distributions. J. Multivariate Anal. 11, 368-385.
Jensen, D.R., 1978. On consistent Cauchy averages. J. Statist. Comput. Simul. 7, 287-289.
Jensen, D.R., 1984. Ordering ellipsoidal measures: Scale and peakedness orderings. SIAM J. Appl. Math. 44, 1226-1231.
Jensen, D.R., 1992. Matrix extremes and related stochastic bounds. In: Shaked, M., Tong, Y.L. (Eds.), Stochastic Inequalities, pp. 133-144. IMS Lecture Notes/Monograph Series Vol. Hayward, CA.
Jensen, D.R., Foutz, R.V., 1989. The structure and analysis of spherical time-dependent processes. SIAM J. Appl. Math. 49, 1834-1844.
Marshall, A.W., Olkin, I., 1979. Inequalities: Theory of Majorization and Its Applications. Academic Press, New York.
Olkin, I., Tong, Y.L., 1988. Peakedness in multivariate distributions. In: Gupta, S.S., Berger, J.O. (Eds.), Statistical Decision Theory and Related Topics IV, vol. 2. Springer, New York.
Proschan, F., 1965. Peakedness of distributions of convex combinations. Ann. Math. Statist. 36, 1703-1706.
Schoenberg, I.J., 1938. Metric spaces and completely monotone functions. Ann. Math. 39, 811-814.