JOURNAL OF MULTIVARIATE ANALYSIS
48,
107-114
(1994)
An Identity for the Noncentral Wishart Distribution with Application PUt LAM LEUNG
The Chinese University o[ Hong Kong, Shath~, Hong Kong
This paper generalizes an identity for the Wishart distribution (derived independently by C. Stein and L. Haft) to the noncentral Wishart distribution. As an application of this noncentral Wishart identity, we consider the problem of estimating the noncentrality matrix of a noncentral Wishart distribution. This noncentral Wishart identity is used to develop a class of orthogonally invariant estimators which dominate the usual unbiased estimator. ,~ 1994AcademicPress,Inc.
1. INTRODUCTION Let S be an m x m positive definite r a n d o m matrix having a W i s h a r t d i s t r i b u t i o n with n degrees of freedom and c o v a r i a n c e matrix £', d e n o t e d by S = (s~j)~ W,,(n, X). A scalar identity concerning the expectation of functions of S was given in Haft [ 6 ] . F o r reference, we define the following s y m b o l s and state this W i s h a r t identity. Let V be a matrix and V~,~= r V + (1 - r) diag(V). We define
D = (di/) = ½( 1 + 6~j) c~/Oso
( 1.1 )
as a matrix of differential o p e r a t o r s , where 6,..,. is the K r o n e c k e r delta. D V is the formal matrix p r o d u c t of D a n d V a n d c3h(S)/OS= (Oh(S)/Os o) for a real-valued function h(S). U n d e r fairly general regularity conditions, we have
E [ h ( S ) t r ( Z ' - 1V)] = 2 E [ h ( S ) t r ( D V)] + 2 E tr[(Oh(S)/OS) V~ i/2~] + (n - m - 1 ) E [ h ( S ) t r ( S - ' V)],
1.2)
where V is a matrix whose elements are functions of S and X, h(S) is a real-valued function of S. This W i s h a r t identity is derived from Stokes' theorem, a m u l t i v a r i a t e integration by parts. The regularity c o n d i t i o n s are Received September I, 1992; revised March 26, 1993. Key words and phrases: Wishart identity, noncentral Wishart identity, noncentrality matrix, decision-theoretic estimation. 107 0047-259X/94 $6.00 Copyright I 1994 by Academic Press, Inc, All rights of reproduction in any form reserved.
108
PUi LAM LEUNG
to ensure that the function hV satisfies the conditions of Stokes' theorem and are given in Haft [2]. Haft used this identity to prove dominance results in a decision-theoretic estimation problem in his papers [3, 4]. Later on, Haft [5] extended (1.2) to matrix identities with applications in regression as well. The Wishart identity was also used in Leung and Muirhead [10]. Muirhead and Verathaworn [13] obtained an identity for the multivariate F distribution similar to (1.2). Applications of this F identity can also be found in Leung [9], Konno [7, 8]. In this paper, we generalize (1.2) to the noncentral Wishart distribution, and the result is called the "noncentral Wishart identity." In Section 2, the noncentral Wishart identity is stated and proved. As an application of this identity, we consider the problem of estimating the noncentrality matrix A of a noncentral Wishart distribution in Section 3. A class of orthogonally invariant estimators of d which dominates the usual unbiased estimator of A is proposed. This problem was also considered by Leung and Muirhead
[io3.
2.
THE NONCENTRAL WISHART IDENTITY
Assume that A is an m x m positive definite random matrix having a noncentral Wishart distribution with n degrees of freedom, covariance matrix X, and noncentrality matrix 3, denoted by A ~ W,,(n, Z, z~). Let V(A, ~) be a matrix whose elements are functions of A and A, and h(A) be a real-valued function of A. We simply write V(A, A) as V and h(A) as h for brevity. Under the same regularity conditions for the Wishart identity given in Haft [2], we have THEOREM 2.1 (Noncentral Wishart identity).
E[h tr(Z - ~V)] = 2E[h tr(D V)] + 2E tr[(c~h(A )lena ) V c~,,2~] +(n-m-
1) E[h tr(A- IV)]
+ E~ [h tr(zlA
J V)],
(2.1)
where the expectation E is taken over a W,,,(n, X, •) distribution and the expectation El in the last term of (2.1) is taken over a W,,,(n + m + l, Z, ,~) distribution, i.e.,
E~[htr(AA-'V)]= I
,4 >0
htr(AA-JV)f,(A)(dA)
109
N O N C E N T R A L ~,"ISHART D I S T R I B U T I O N
with f~(A) denoting the density of a W,,,(n+m+ I , X , J ) distribution. The identity (2.1) is the same as (1.2) except for the last term, where the expectation is taken over a noncentral Wishart distribution with degrees of freedom changed from n to n + m + 1. Note that when J = 0, (2.1) reduces to (1.2). Before we prove (2.1), we need the following Lemma. LEMMA 2.2. (i) E[tr(X ~ A ) ] = n m + t r A , (ii) E [ t r ( X tAX t A ) ] = t r ( j 2 ) + 2 ( n + m + l ) ( t r J ) + n m ( n + m + l ) , (iii)
tr D(AX ~ A ) = ( m + l)tr(X
JA).
Prot~ Part (i) can be easily obtained by noting that X ~J2AX ~/2 is distributed as W(n, L XmzlX - ~/2) and (ii) follows from Theorem 4.4 of Magnus and Neudecker Ell]. Part (iii)is given in Lemma (2.2) of Konno [7]. Now we turn to the proof of Theorem 2.1.
Proof of (2.1). The proof is similar to the proof of (1.2) given in Haft [2]. The density of A is f ( A ) = C ( d e t A ) c. . . .
V2etr(-X
~A/2)oFl(n/2;dX
~A/4),
(2.2)
where C = [2""/2F,,,(n/2)] ~ (det X) ''/-' e t r ( - A / 2 ) , n - m 1 >0, A > 0 , e t r ( . ) = e x p E t r ( - ) ] , o F ~ ( - ) i s the hypergeometric function with matrix argument, and F,,,(-) is the multivariate gamma function (see Muirhead [12] for details). For the differential operator D defined in (1.1), we have D(det A) ~.....
iv2= [ ( 1 7 - m - 1)/2](det A) I....... t~/2A l
and
Detr(-X
'A/2)=(-X
~/2)etr(-X-'A/2).
Hence, D operating o n f ( A ) i n (2.2) gives
Df(A)=[n-m- 2
l A ' - - ~1X
'] f(A)+Q(A),
(2.3)
where
Q ( A ) = C ( d e t A ) I...... ' V 2 e t r ( - X
IA/2)[DoFl(n/2:JX
JA/4)].
(2.4)
The same set of regularity conditions on hV given in Haft [2] ensures that S,~,o t r D [ h V f ( A ) ] ( d A ) = O . It follows that
O=Etr[(Oh/OA) V~'/z~]+ E [ h t r ( D V ) ] + fA>o htr[V(Df)](dA).
110
PUl LAM LEUNG
Using (2.3), we have
E[h tr(Z" ' V)] = 2E[h tr(D V)] + 2E tr[(Oh(A )/c)A ) V~,/2}] +(n-m-l)E[htr(A
IV)]
h tr( Q V)( dA ).
+2f
(2.5)
.-, > 0
Comparing (2.5) with (2.1), we see that the proof is complete if
2f
htr(QV)(dA)= I ,4>0
htr(AA-'V)f.(A)(dA), ,4>0
where f~(A) is the density of a W,,,(n + m + 1, Z, A) distribution. Therefore, it suffices to show that
2Q(A)f I I ( A ) = A A - t
a.e.
(2.6)
Q(A) defined in (2.4) involves the operation of D on oF~(. ). Although it is possible to prove (2.6) directly by differentiating the series of zonal polynomials in oF,(- ), this could be very complicated and messy. We take another approach to proving (2.6). By taking V ( A , Z ) = A S - ~ A and h ( A ) = l in (2.5) and from Lemma 2.2, we have tr(3 z) + 2(71 + m + 1 )(tr 3 ) + nm(n + m + 1) = 2 ( m + l ) ( n m + t r A ) + ( n - m - l ) ( n m + t r A)
tr(QAX 'A)(dA),
+2f ,4 >O
which implies
21
tr(QAS 'A)(dA)=tr(A'-)+(n+m+l)(trLl).
(2.7)
.4 > 0
Note that the right-hand side of (2.7) is equal to 4 >o tr(,JX IA )./'I(A )(dA ), where f](A ) is the density of a W,,,(n + m + 1, Z, A) distribution. It follows from (2.7) that tr[2QAX ~AJ'~ J ( A ) - z l X ~ A ] = O a.e. or tr{[2Q/',-'(A)-AA ~](AZ ' A ) } = 0 a.e. for all A > 0 and X > 0 , which implies (2.6), and the proof is completed.
Remark. It is possible to obtain a similar identity for a noncentral multivariate F distribution but this is not explored in the present paper.
1 11
NONCENTRAL VClSHART DISTRIBUTION
3. IMPROVED ESTIMATION OF NONCENTRALITY MATRIX The Wishart identity (1.2) is very useful for finding bounds for expectations which often appear in risk calculations. We expect that similar applications can be found for the noncentral Wishart identity (2.1) as well. To illustrate a nontrivial application of the noncentral Wishart identity, we consider the problem of estimating the noncentrality matrix of a noncentral Wishart distribution. Assume that A ~ W,,,(n, I, "4), i.e., A has a noncentral Wishart distribution with n degrees of freedom, identity covariance matrix, and noncentrality matrix '4. The usual unbiased estimator of '4 is J u = A - nL By using a decision-theoretic approach, we estimate '4 using the invariant loss function L('4, J ) = t r ( ' 4
1j_/,,,)2.
(3.1)
Let zi, = Ju + (a/(tr A))I,,,. It is shown in Theorem 3.2 that J , dominate J u for a suitable choice of ct. This problem and ,J, were also considered by Leung and Muirhead [10], using the squared error loss function L('4, "4) = tr(zJ-.4)2 (see Leung and Muirhead [10] for the significance and details of this problem). The univariate version of this problem was considered by Perlman and Rasmussen [14], Saxena and Alam [15], and Chow [1]. Before we state and prove the dominance result, we need the following lemma. LEMMA 3.1. Assume that n > 4 . Then
[trI{_-~A)l ~<.E [tr '4-21
EL
trA
_1
FtrA-~ l
L trA _ ] - 2 ( n - 4 ) E [ _ ( t r A ) Z l
[tr A-'-]_ + E l i _ trA J
[tr'4-' 1 2E~[_(trA)2j,
where E is taken over a W,,,(n, L "4) distribution and El is taken over a W,,(n + m + 1, L "4) distribution. Proof We apply the noncentral Wishart identity given in (2.1) with Z = L V = A - 2 A , and h = 1/(trA). Since tr(DV)= [ ( m + 1)/2](tr'4 -2) and ah/OA = [ - 1/(tr A) z] I,, (see Haft [4]), we have
Vtr(~---~A!]=nEV tr '4--'1-2E
EL
trA
L trA J
[tr/'4
L (tr
)] + E, I-tr '4-']
L trA J'
(3.2)
To compute the second term on the right-hand side of (3.2), we apply 683/48/1-8
112
pu1 LAM LEUNG
identity (2.1) again with V=zllZA and h = l / ( t r A ) 2. Since cOh/OA= [ - 2 / ( t r A) 3] L,, (see Haft [4]), (3.2) becomes
[tr(,a-2A)]
_Ftr,J-21
A=nLL T I-,
eL
[tr,~-']
._Ftr(zi--'A)l
L ~T-~T.J+EIL(trA)2j.
(3.3)
Using the fact that tr(A-2A)<~(trzl-2)(trA) in the second term of the right-hand side of (3.3), we have
[tr(~-2A !j] ~>( n - 4 )
EFtr~-21 + E, [-tr A-'- l
E L (trA) 2
L~-A-)SJ
(3.4)
L(tr A)2J"
With (3.4) substituted into (3.2), the proof is completed. THEOREM 3.2. Assume that n > 4. Then J , dominate 4(n-- 4).
Ju if
0 < ct <
Proof For the loss function defined in (3.1), it is straightforward to show that the risk difference between J u and zi, is G(A) = ElL(A, J u ) - L(A, A~)]
[tr,a-'l_
= 2~E k
tr A J
[trIJ-;Al] 2~E L
fr-J
[tr~-~l_~
J + 2n=E L tr A J
~
FtrA-21
EL
-z j
Using Lemma 3.1 and simplifying, we have G(A)>_. 2~E[ tr d - ' l m trA
Ftr d
'l
[tr A-' l
J-Z~E'L~J+4~E'L(trA)2 j
+ ~[4(n-4)-~]
El-tr -2l L ~ J
(3.5)
From Eq. (2.9) in Leung and Muirhead [9], E [ l / ( t r A ) ] = E2[l/(mn+ Z K - 2)] and E,[I/(tr A ) ] = E 2 { 1 / [ m ( n + m + l)+ 2 K - 2 ] } , where K has a Poisson distribution with mean ½(tr A) and E2 is taken with respect to K. Therefore (3.5) becomes m(m+ 1) ] G(zl)>~2~(trzl l ) E 2 ( m n + 2 K - 2 ) [ m ( n + m + l ) + 2 K - 2 ] +4~E,
Ftr,~-,] L(tr A)2_[+ ~[4(n - 4 ) -
~]
Ftr,,- l
E L ~ - ~ j.
The first two terms on the right-hand side are nonnegative. A sufficient condition for G(A) ~>0 is 0 ~<~ ~<4(n -- 4) and the proof is completed.
NONCENTRAL ~,rlSHART DISTRIBUTION
1 13
Remark. J u a n d zi~ are n o t necessarily positive definite. T h e y are d o m i n a t e d by their t r u n c a t e d v e r s i o n s ,~(j a n d A~+, respectively; i.e., J ÷ is f o r m e d by r e p l a c i n g the n e g a t i v e e i g e n v a l u e s in J by zero. T h e f o l l o w i n g t h e o r e m d e m o n s t r a t e s this. THEOREM 3.3. Let J be an), estimator .4 and A + be the truncated version o f d. Then E [ L ( d , ,~)] ~> E [ L ( d , d+ )] .for all d and the inequality is strict if ~ is not nonnegative definite.
Proof. Let d = H L H ' , where H is o r t h o g o n a l a n d L is a d i a g o n a l m a t r i x with d i a g o n a l e l e m e n t s 1, ..... 1,,,. T h e n J + = H L + H ', where L + = d i a g ( l , + ..... 1.+,) w i t h li + = / i if 1~ > 0 a n d /~+ = 0 if l~ ~< 0. E [ L ( A , zi)] = E [ t r ( A - l j _ i,,,),,] =Etr[A-t/2(j_d)
d
=Etr[d-'/2H(L-H'AH)
t/_,] 2
H'A
1/212
~> E t r i a l - I / 2 H ( L + - H ' A H ) H ' A -'/'-]" = E[L(zl, J+)]. T h e i n e q u a l i t y is strict if li < 0 for s o m e i, a n d the p r o o f is c o m p l e t e d .
REFERENCES [1] [2] [3] [4] [5] I-6] 1-7] 1-8] [-9] [-10]
CHOW, M. S. (1987). A complete class theorem for estimating a noncentrality parameter. Ann. Statist. 15 800-804. HAFF, L. R. (1979). An identity for the Wishart distribution with applications. J. Multivariate Anal. 9 531-544. HAFF, L. R. (1979). Estimation of the inverse covariance matrix: Random mixtures of the inverse Wishart matrix and the identity. AmL Statist. 7 1264-1276. HAH=,L. R. (1980). Empirical Bayes estimation of the multivariate normal covariance matrix. Ann. Statist. 8 586-597. HAl:F, L. R. (1981). Further identities for the Wishart distribution with applications in regression. Canad. J. Statist. 9 215-224. HAFI:, L. R. (1982). Identities for the inverse Wishart distribution with computational results in linear and quadratic discrimination. Sankhy~ 44 245-258. KONNO,Y. (1988). Exact moments of the multivariate F and beta distributions. J. Japan Statist. Soc. 18 123-130. KONNO,Y. (1991). A note on estimating eigenvalues of scale matrix of the multivariate F-distribution. Ann. Inst. Statist. Math. 43 157-165. LEUNG,P. L. (1992). Estimation of the scale matrix of the multivariate F distribution. Comm. Statist. Theory Methods 21 1845-1856. LEUNG, P. L., AND MUIRHEAO, R. J. (1987). Estimation of parameter matrices and eigenvalues in MANOVA and canonical correlation analysis. Ann. Statist. 15 1651-1666.
114
p u I LAM LEUNG
[ 11 ] MAGNUS,J. R., AND NEUDECKER, H. (1979). The commutation matrix: Some properties and applications. Ann. Statist. 7 381-394. [ 12] MUIRHEAD, R. J. (1982). Aspects of Multivariate Statistical Theory'. Wiley, New York. 1-13] MUIRHEAD, R. J., AND VERATHAWORN,T. (1985). On estimating the latent roots of --v'~Z'il. In Multivariate Analysis 11"1 (P. R. Krishnaiah, Ed.), pp. 431-447. NorthHolland, Amsterdam. [14] PERLMAN,M. D., AND RASMUSSEN, U. A. (1975). Some remarks on estimating a noncentrality parameter. Comm. Statist. 4 455--468. [15] SAXENA, K. M. L., AND ALAM, K. (1982). Estimation of the noncentrality parameter of a chi-squared distribution. Ann. Statist. 10 I012-1016.