Journal of Multivariate Analysis 101 (2010) 594–602
Estimation of means of multivariate normal populations with order restriction
Tiefeng Ma a,∗, Songgui Wang b
a Statistics College, Southwestern University of Finance and Economics, Chengdu, 610074, China
b Department of Applied Mathematics, Beijing University of Technology, Beijing, 100124, China
article info
Article history: Received 19 December 2007; Available online 20 October 2009
AMS subject classifications: primary 62F30; secondary 62H12
Keywords: Multivariate normal mean; Order restriction; Graybill–Deal estimator; Isotonic regression
abstract
Multivariate isotonic regression theory plays a key role in statistical inference under order restrictions on vector-valued parameters. Two cases of estimating multivariate normal means under an order restriction are considered: one in which the covariance matrices are known, and one in which they are unknown but restricted by a partial order. This paper shows that when the covariance matrices are known, the estimator given in this paper dominates the unrestricted maximum likelihood estimator uniformly, and that when the covariance matrices are unknown, the plug-in estimator dominates the unrestricted maximum likelihood estimator over the order-restricted set of covariance matrices. The isotonic regression estimators in this paper generalize the plug-in estimators of the univariate case.
Crown Copyright © 2009 Published by Elsevier Inc. All rights reserved.
1. Introduction
Let X_ij be the jth p × 1 observation vector of the ith population, mutually independently distributed as N(μ_i, Σ_i), i = 1, ..., k, j = 1, ..., n_i, where μ_i and Σ_i are unknown. Let ≼ be a partial order on the index set {1, ..., k}, and let μ_il be the lth element of the vector μ_i. The vectors μ_1, ..., μ_k are said to be isotonic with respect to ≼ in the Löwner sense if j ≼ i implies μ_il ≥ μ_jl, l = 1, ..., p, which is written as μ_i ≽ μ_j. Define
\Omega = \{\mu : \mu_1 \preceq \mu_2 \preceq \cdots \preceq \mu_k\}.  (1.1)
For convenience, we still use μ_j ≤ μ_i to denote μ_j ≼ μ_i in what follows. If p = 1, the above problem reduces to the univariate case and Σ_i reduces to σ_i². The estimation problem for two normal means was discussed by [1], and many authors have studied this problem for k > 2. When all variances are known and μ_1 = μ_2 = ... = μ_k = μ, a natural unbiased estimator of μ is
\hat\mu = \frac{\sum_{j=1}^{k} n_j \bar X_j/\sigma_j^2}{\sum_{j=1}^{k} n_j/\sigma_j^2}.
This research was supported by the Key Discipline Program of Statistics, 211 Project for the Southwestern University of Finance and Economics (Phase III).
∗ Corresponding author. E-mail address: [email protected] (T. Ma).
In practice, the variances are often unknown, so a feasible alternative is the Graybill–Deal estimator
\hat\mu_{GD} = \frac{\sum_{j=1}^{k} n_j \bar X_j/\hat\sigma_j^2}{\sum_{j=1}^{k} n_j/\hat\sigma_j^2}.  (1.2)
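To make the precision weighting in (1.2) concrete, here is a minimal numerical sketch (not from the paper; the function name and toy numbers are ours):

import numpy as np

def graybill_deal(xbar, s2, n):
    # Graybill-Deal estimate of a common mean: weight each sample mean by
    # its estimated precision n_j / sigma_hat_j^2, as in (1.2).
    xbar, s2, n = (np.asarray(a, dtype=float) for a in (xbar, s2, n))
    w = n / s2
    return np.sum(w * xbar) / np.sum(w)

# toy example: three samples believed to share a common mean
print(graybill_deal(xbar=[1.02, 0.95, 1.10], s2=[0.8, 1.5, 2.3], n=[20, 25, 15]))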
Misra and van der Meulen [2] extended the results of [3] and [4] and proposed an isotonic estimator of μ,
\hat\mu_T = \frac{\sum_{i=1}^{k} n_{k-i+1}\hat\tau_i \bar X_{k-i+1}}{\sum_{i=1}^{k} n_{k-i+1}\hat\tau_i},  (1.3)
which is better than the usual Graybill–Deal estimator in terms of stochastic dominance and the Pitman measure of closeness under the simple order restriction
0 < \sigma_1^2 \le \sigma_2^2 \le \cdots \le \sigma_k^2 < \infty,  (1.4)
where
\hat\tau_i = \min_{t\ge i}\max_{s\le i}\Big(\Big(\sum_{r=s}^{t} n_{k-r+1}\Big)^{-1}\sum_{r=s}^{t} n_{k-r+1}P_r\Big), \qquad P_i = \frac{1}{S_{k-i+1}^2}, \qquad S_i^2 = n_i^{-1}\sum_{j=1}^{n_i}(X_{ij}-\bar X_i)^2.
Under (1.1), Robertson et al. [5] gave the restricted maximum likelihood estimator of μ_i,
\min_{t\ge i}\max_{s\le i}\frac{\sum_{j=s}^{t} n_j\bar X_j/\sigma_j^2}{\sum_{j=s}^{t} n_j/\sigma_j^2}.  (1.5)
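A small sketch of (1.3), using the min–max weights τ̂_i in the form reconstructed above (the typography of τ̂_i in the original is garbled, so this code is illustrative only and the names are ours):

import numpy as np

def tau_hat(n, S2):
    # Min-max isotonic weights of (1.3), as reconstructed here:
    # tau_i = min_{t>=i} max_{s<=i} (sum_{r=s}^t n_{k-r+1} P_r) / (sum_{r=s}^t n_{k-r+1}),
    # with P_i = 1 / S^2_{k-i+1}, i.e. the sample precisions taken in reverse order.
    n = np.asarray(n, dtype=float)[::-1]          # n_{k-r+1}
    P = 1.0 / np.asarray(S2, dtype=float)[::-1]   # P_r
    k = len(n)
    tau = np.empty(k)
    for i in range(k):
        tau[i] = min(
            max(np.sum(n[s:t + 1] * P[s:t + 1]) / np.sum(n[s:t + 1]) for s in range(i + 1))
            for t in range(i, k)
        )
    return tau

def mu_hat_T(xbar, S2, n):
    # Order-restricted common-mean estimate (1.3).
    xbar_rev = np.asarray(xbar, dtype=float)[::-1]     # Xbar_{k-i+1}
    n_rev = np.asarray(n, dtype=float)[::-1]
    tau = tau_hat(n, S2)
    return np.sum(n_rev * tau * xbar_rev) / np.sum(n_rev * tau)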
Lee [6] and Kelly [7] showed that the estimator (1.5) uniformly improves upon \bar X_i. Shi and Jiang [8] discussed some properties of the maximum likelihood estimates of the means and the unknown variances under this restriction and proposed an algorithm for obtaining the estimates. For k = 2, [1] gave plug-in estimators of μ_i which uniformly improve upon \bar X_i over the set (1.1) when the σ_i are unknown.
The above discussions concern univariate estimation problems. In the multivariate case, i.e. p ≥ 2, Chiou and Cohen [9] showed that
\hat\mu_{GDd} = \Big(\sum_{i=1}^{k} n_i S_i^{-1}\Big)^{-1}\sum_{i=1}^{k} n_i S_i^{-1}\bar X_i,  (1.6)
dominates neither \bar X_1 nor \bar X_2 when μ_1 = μ_2 = ... = μ_k = μ and k = 2. Loh [10] estimated the mean vector from a symmetric loss function point of view as an alternative to \hat\mu_{GDd}. Jordan and Krishnamoorthy [11] considered a weighted Graybill–Deal estimator
\hat\mu_{GDW} = \Big(\sum_{i=1}^{k} c_i n_i S_i^{-1}\Big)^{-1}\sum_{i=1}^{k} c_i n_i S_i^{-1}\bar X_i,  (1.7)
where c_i = [\mathrm{Var}(T_i^2)]^{-1}\big/\sum_{i=1}^{k}[\mathrm{Var}(T_i^2)]^{-1}, T_i^2 = n_i(\bar X_i-\mu)'S_i^{-1}(\bar X_i-\mu) and \mathrm{Var}(T_i^2) = \frac{2p(n_i-1)^2(n_i-2)}{(n_i-p-2)^2(n_i-p-4)}.
While isotonic estimation for univariate normal populations has been the primary concern of past studies, very little attention has been paid to the isotonic estimation of multivariate normal means. It is clear that the Graybill–Deal estimator plays an important role in constructing isotonic estimators of normal means, in both the univariate and the multivariate case. In this paper the isotonic estimation of multivariate normal means is also based on the Graybill–Deal idea when the mean vectors are restricted by a simple order.
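For illustration, a short sketch of (1.6) and of the weights c_i in (1.7) (our own code, assuming n_i > p + 4 so that Var(T_i^2) is finite):

import numpy as np

def gd_multivariate(xbars, covs, ns):
    # Multivariate Graybill-Deal estimate (1.6):
    # (sum_i n_i S_i^{-1})^{-1} sum_i n_i S_i^{-1} xbar_i.
    A = sum(n * np.linalg.inv(S) for n, S in zip(ns, covs))
    b = sum(n * np.linalg.inv(S) @ x for n, S, x in zip(ns, covs, xbars))
    return np.linalg.solve(A, b)

def gdw_weights(ns, p):
    # Weights c_i of (1.7), proportional to 1/Var(T_i^2) with
    # Var(T_i^2) = 2p(n_i - 1)^2 (n_i - 2) / ((n_i - p - 2)^2 (n_i - p - 4)).
    ns = np.asarray(ns, dtype=float)
    var_T2 = 2 * p * (ns - 1) ** 2 * (ns - 2) / ((ns - p - 2) ** 2 * (ns - p - 4))
    return (1.0 / var_T2) / np.sum(1.0 / var_T2)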
In order to obtain the dominance results for the estimators, we consider the squared error loss of the estimators of μ_i,
L(\mu_i, \hat\mu_i, D) = (\hat\mu_i-\mu_i)'D(\hat\mu_i-\mu_i),  (1.8)
where D is a positive definite matrix. The corresponding risk is given by
R(\mu_i, \hat\mu_i, D) = E[L(\mu_i, \hat\mu_i, D)].  (1.9)
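The loss (1.8) and a Monte Carlo approximation of the risk (1.9) can be sketched as follows (illustrative only; the helper names are ours):

import numpy as np

def gen_sq_loss(mu_hat, mu, D):
    # Generalized squared error loss (1.8): (mu_hat - mu)' D (mu_hat - mu).
    d = mu_hat - mu
    return float(d @ D @ d)

def mc_risk(estimator, sampler, mu, D, reps=10000, seed=0):
    # Monte Carlo approximation of the risk (1.9), E[L(mu, mu_hat, D)]:
    # `sampler(rng)` draws one data set and `estimator(data)` returns mu_hat.
    rng = np.random.default_rng(seed)
    return np.mean([gen_sq_loss(estimator(sampler(rng)), mu, D) for _ in range(reps)])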
The main objective of this paper is to generalize the dominance results of [1] to the multivariate case and to consider the problem of estimating the mean vectors of two normal populations under the order restriction. When Σ_i is known, we give a restricted maximum likelihood type estimator, analogous to its univariate counterpart, and show that it improves upon the unrestricted maximum likelihood estimator uniformly. When Σ_i is unknown, we obtain a feasible plug-in estimator that dominates the unrestricted maximum likelihood estimator over an order-restricted set of covariance matrices. These results are presented in Section 2; the proofs are given in Section 3.
2. Main results
Let S_i = \frac{1}{n_i-1}\sum_{j=1}^{n_i}(X_{ij}-\bar X_i)(X_{ij}-\bar X_i)', i = 1, 2. Obviously, the S_i are the sample covariance matrices and (n_i-1)S_i \sim W_p(n_i-1, \Sigma_i). In this paper we consider the isotonic estimation of two normal mean vectors subject to the order restriction (1.1), i.e. k = 2. In this section we first consider the improved estimator when Σ_i is known. Motivated by the restricted maximum likelihood estimators of the univariate case, we take
\tilde\mu_{GD} = \Big(\big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}+\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}\Big)^{-1}\Big(\big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}\bar X_1+\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}\bar X_2\Big).  (2.1)
From this, the improved estimators are obtained immediately:
\tilde\mu_1 = \bar X_1(1-I_{\bar X_1>\bar X_2})+\tilde\mu_{GD}I_{\bar X_1>\bar X_2},  (2.2)
\tilde\mu_2 = \bar X_2(1-I_{\bar X_1>\bar X_2})+\tilde\mu_{GD}I_{\bar X_1>\bar X_2},  (2.3)
where I_C denotes the indicator function of the event C. We obtain the following dominance results for \tilde\mu_1 and \tilde\mu_2 under the order restriction (1.1).
Theorem 2.1. \tilde\mu_1 uniformly improves upon the unrestricted maximum likelihood estimator \bar X_1 under generalized squared error loss.
For the improved estimation of μ_2 we have a similar result.
Theorem 2.2. \tilde\mu_2 uniformly improves upon the unrestricted maximum likelihood estimator \bar X_2 under generalized squared error loss.
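A brief sketch of (2.1)–(2.3) for known covariance matrices, reading the event X̄_1 > X̄_2 componentwise (our own code):

import numpy as np

def mu_tilde(xbar1, xbar2, Sigma1, Sigma2, n1, n2):
    # Known-covariance estimators (2.1)-(2.3); returns (mu_tilde_1, mu_tilde_2).
    A1 = n1 * np.linalg.inv(Sigma1)   # (Sigma_1/n_1)^{-1}
    A2 = n2 * np.linalg.inv(Sigma2)   # (Sigma_2/n_2)^{-1}
    mu_gd = np.linalg.solve(A1 + A2, A1 @ xbar1 + A2 @ xbar2)   # (2.1)
    if np.all(xbar1 > xbar2):         # indicator I_{Xbar_1 > Xbar_2}
        return mu_gd, mu_gd           # (2.2) and (2.3) on that event
    return np.asarray(xbar1, float), np.asarray(xbar2, float)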
When Σ_i is unknown, (2.1) is not feasible, and then (2.2) and (2.3) are not feasible either. Motivated by the idea of the Graybill–Deal estimator in the univariate case, we give the GD estimator of the mean vector
\hat\mu_{GD} = \Big(\big(\tfrac{1}{n_1}S_1\big)^{-1}+\big(\tfrac{1}{n_2}S_2\big)^{-1}\Big)^{-1}\Big(\big(\tfrac{1}{n_1}S_1\big)^{-1}\bar X_1+\big(\tfrac{1}{n_2}S_2\big)^{-1}\bar X_2\Big).  (2.4)
Furthermore we have
\hat\mu_1 = \bar X_1(1-I_{W\bar X_1>W\bar X_2})+\hat\mu_{GD}I_{W\bar X_1>W\bar X_2},  (2.5)
\hat\mu_2 = \bar X_2(1-I_{W\bar X_1>W\bar X_2})+\hat\mu_{GD}I_{W\bar X_1>W\bar X_2},  (2.6)
where W = \big(\tfrac{1}{n_1}S_1+\tfrac{1}{n_2}S_2\big)^{-1}. These two estimators are called plug-in estimators. Define
R_1 = \Big\{\Sigma:\ \tfrac{1}{n_1}\Sigma_1\ge\tfrac{1}{n_2}\Sigma_2\Big\},\qquad R_2 = \Big\{\Sigma:\ \tfrac{1}{n_1}\Sigma_1\le\tfrac{1}{n_2}\Sigma_2\Big\},
which are multivariate extensions of (1.4). For p × p matrices A and B, A ≥ B means that A − B is a non-negative definite matrix. For these two feasible estimators we also have the following dominance results under the order restriction (1.1).
Theorem 2.3. The plug-in estimator \hat\mu_1 uniformly improves upon the unrestricted maximum likelihood estimator \bar X_1 under generalized squared error loss over the set R_1.
Theorem 2.4. The plug-in estimator \hat\mu_2 uniformly improves upon the unrestricted maximum likelihood estimator \bar X_2 under generalized squared error loss over the set R_2.
Analogously, we define two further order-restricted sets
R_3 = \{\Sigma:\ \Sigma_1\ge\Sigma_2\},\qquad R_4 = \{\Sigma:\ \Sigma_1\le\Sigma_2\}.
As an immediate consequence of Theorems 2.3 and 2.4, we have the following corollaries.
Corollary 2.1. If n_2 ≥ n_1, the plug-in estimator \hat\mu_1 uniformly improves upon the unrestricted maximum likelihood estimator \bar X_1 under generalized squared error loss over the set R_3.
Corollary 2.2. If n_2 ≤ n_1, the plug-in estimator \hat\mu_2 uniformly improves upon the unrestricted maximum likelihood estimator \bar X_2 under generalized squared error loss over the set R_4.
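A corresponding sketch of the plug-in estimators (2.4)–(2.6), computed from the two samples themselves (our own code; the rows of X1 and X2 are observations):

import numpy as np

def mu_hat_plugin(X1, X2):
    # Plug-in estimators (2.4)-(2.6); returns (mu_hat_1, mu_hat_2).
    n1, n2 = len(X1), len(X2)
    xbar1, xbar2 = X1.mean(axis=0), X2.mean(axis=0)
    S1 = np.cov(X1, rowvar=False)     # unbiased sample covariance, divisor n_1 - 1
    S2 = np.cov(X2, rowvar=False)
    A1, A2 = n1 * np.linalg.inv(S1), n2 * np.linalg.inv(S2)
    mu_gd = np.linalg.solve(A1 + A2, A1 @ xbar1 + A2 @ xbar2)   # (2.4)
    W = np.linalg.inv(S1 / n1 + S2 / n2)
    if np.all(W @ xbar1 > W @ xbar2):                           # indicator of (2.5)-(2.6)
        return mu_gd, mu_gd
    return xbar1, xbar2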
3. The proofs of theorems
Proof of Theorem 2.1. Put
V = \Big(\big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}+\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}\Big)^{-1}\big(\tfrac{1}{n_1}\Sigma_1\big)^{-1} = \tfrac{1}{n_2}\Sigma_2\big(\tfrac{1}{n_1}\Sigma_1+\tfrac{1}{n_2}\Sigma_2\big)^{-1},\qquad D = \big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}\big(\tfrac{1}{n_1}\Sigma_1+\tfrac{1}{n_2}\Sigma_2\big)\big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}.
Then \tilde\mu_1 is expressed as
\tilde\mu_1 = \bar X_1(1-I_{\bar X_1>\bar X_2})+(V\bar X_1+(I-V)\bar X_2)I_{\bar X_1>\bar X_2},  (3.1)
and we calculate the risk difference of \bar X_1 and \tilde\mu_1 as
R(\mu_1,\bar X_1,D)-R(\mu_1,\tilde\mu_1,D) = E[(\bar X_1-\mu_1)'D(\bar X_1-\mu_1)-(\tilde\mu_1-\mu_1)'D(\tilde\mu_1-\mu_1)]I_{\bar X_1>\bar X_2}
 = E[(\bar X_1-\mu_1)'D(\bar X_1-\mu_1)-(V\bar X_1+(I-V)\bar X_2-\mu_1)'D(V\bar X_1+(I-V)\bar X_2-\mu_1)]I_{\bar X_1>\bar X_2}.  (3.2)
Making the transformations
Z_1 = \bar X_1-\mu_1 \quad\text{and}\quad Z_2 = \bar X_2-\mu_1,  (3.3)
Z_1 and Z_2 are mutually independently distributed as N(0, \tfrac{1}{n_1}\Sigma_1) and N(\mu, \tfrac{1}{n_2}\Sigma_2), respectively, where \mu = \mu_2-\mu_1 \ge 0. From (3.2) we have
R(\mu_1,\bar X_1,D)-R(\mu_1,\tilde\mu_1,D) = E[Z_1'DZ_1-(VZ_1+(I-V)Z_2)'D(VZ_1+(I-V)Z_2)]I_{Z_1>Z_2}
 = E[Z_1'V'D(I-V)(Z_1-Z_2)+(Z_1-Z_2)'(I-V)'DVZ_1]I_{Z_1>Z_2}+E[Z_1'(I-V)'D(I-V)Z_1-Z_2'(I-V)'D(I-V)Z_2]I_{Z_1>Z_2}.  (3.4)
Denote Y_1 = Z_1-Z_2 and Y_2 = Z_1+\tfrac{1}{n_1}\Sigma_1\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}Z_2; then Y_1 \sim N\big(-\mu,\ \tfrac{1}{n_1}\Sigma_1+\tfrac{1}{n_2}\Sigma_2\big), Y_2 \sim N\big(\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\mu,\ \tfrac{1}{n_1}\Sigma_1+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\tfrac{1}{n_1}\Sigma_1\big), and Y_1 and Y_2 are mutually independent. Then
Z_1 = \big(I+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\big)^{-1}\big(Y_2+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}Y_1\big),\qquad Z_2 = \big(I+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\big)^{-1}(Y_2-Y_1).
Thus we have
E[Z_1'V'D(I-V)(Z_1-Z_2)+(Z_1-Z_2)'(I-V)'DVZ_1]I_{Z_1>Z_2}
 = E\Big[\big(Y_2+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}Y_1\big)'\big(I+(\tfrac{1}{n_2}\Sigma_2)^{-1}\tfrac{1}{n_1}\Sigma_1\big)^{-1}V'D(I-V)Y_1
 \quad +Y_1'(I-V)'DV\big(I+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\big)^{-1}\big(Y_2+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}Y_1\big)\Big]I_{Y_1>0},  (3.5)
E[Z_1'(I-V)'D(I-V)Z_1-Z_2'(I-V)'D(I-V)Z_2]I_{Z_1>Z_2}
 = E\Big[\big(Y_2+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}Y_1\big)'\big(I+(\tfrac{1}{n_2}\Sigma_2)^{-1}\tfrac{1}{n_1}\Sigma_1\big)^{-1}(I-V)'D(I-V)\big(I+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\big)^{-1}\big(Y_2+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}Y_1\big)
 \quad -(Y_2-Y_1)'\big(I+(\tfrac{1}{n_2}\Sigma_2)^{-1}\tfrac{1}{n_1}\Sigma_1\big)^{-1}(I-V)'D(I-V)\big(I+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\big)^{-1}(Y_2-Y_1)\Big]I_{Y_1>0}.  (3.6)
From (3.5) and (3.6), collecting terms, we obtain
E[Z_1'V'D(I-V)(Z_1-Z_2)+(Z_1-Z_2)'(I-V)'DVZ_1+Z_1'(I-V)'D(I-V)Z_1-Z_2'(I-V)'D(I-V)Z_2]I_{Z_1>Z_2}
 = E\Big[Y_2'\big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}\tfrac{1}{n_2}\Sigma_2\,Y_1+Y_1'\,\tfrac{1}{n_2}\Sigma_2\big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}Y_2\Big]I_{Y_1>0}+E[Y_1'Y_1]I_{Y_1>0}.  (3.7)
As Y_2 \sim N\big(\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\mu,\ \tfrac{1}{n_1}\Sigma_1+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\tfrac{1}{n_1}\Sigma_1\big) and Y_1 and Y_2 are mutually independent, we have
E\Big[Y_2'\big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}\tfrac{1}{n_2}\Sigma_2\,Y_1+Y_1'\,\tfrac{1}{n_2}\Sigma_2\big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}Y_2\Big]I_{Y_1>0} = E[\mu'Y_1+Y_1'\mu]I_{Y_1>0} > 0.  (3.8)
According to (3.7) and (3.8), we have
R(\mu_1,\bar X_1,D)-R(\mu_1,\tilde\mu_1,D) > 0.  (3.9)
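As an informal check of Theorem 2.1 (not part of the paper), a small simulation with arbitrary toy parameters satisfying μ_1 ≤ μ_2 compares the estimated risks of X̄_1 and μ̃_1 under the matrix D used in the proof:

import numpy as np

rng = np.random.default_rng(0)
p, n1, n2 = 3, 10, 15
Sigma1, Sigma2 = np.eye(p), 2.0 * np.eye(p)
mu1, mu2 = np.zeros(p), 0.3 * np.ones(p)           # mu_1 <= mu_2 componentwise
A1, A2 = n1 * np.linalg.inv(Sigma1), n2 * np.linalg.inv(Sigma2)
D = A1 @ (Sigma1 / n1 + Sigma2 / n2) @ A1           # loss matrix of the proof

risk = np.zeros(2)
reps = 20000
for _ in range(reps):
    xb1 = rng.multivariate_normal(mu1, Sigma1 / n1)
    xb2 = rng.multivariate_normal(mu2, Sigma2 / n2)
    mu_gd = np.linalg.solve(A1 + A2, A1 @ xb1 + A2 @ xb2)
    mu_t1 = mu_gd if np.all(xb1 > xb2) else xb1     # the estimator (2.2)
    d0, d1 = xb1 - mu1, mu_t1 - mu1
    risk += [d0 @ D @ d0, d1 @ D @ d1]
print("estimated risks (Xbar_1, mu_tilde_1):", risk / reps)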
Proof of Theorem 2.2. The proof is similar to the foregoing argument; the difference is that here we put D = \big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}\big(\tfrac{1}{n_1}\Sigma_1+\tfrac{1}{n_2}\Sigma_2\big)\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}. Then \tilde\mu_2 is expressed as
\tilde\mu_2 = \bar X_2(1-I_{\bar X_1>\bar X_2})+(V\bar X_1+(I-V)\bar X_2)I_{\bar X_1>\bar X_2},  (3.10)
and we calculate the risk difference of \bar X_2 and \tilde\mu_2 as
R(\mu_2,\bar X_2,D)-R(\mu_2,\tilde\mu_2,D) = E[(\bar X_2-\mu_2)'D(\bar X_2-\mu_2)-(\tilde\mu_2-\mu_2)'D(\tilde\mu_2-\mu_2)]I_{\bar X_1>\bar X_2}
 = E[(\bar X_2-\mu_2)'D(\bar X_2-\mu_2)-(V\bar X_1+(I-V)\bar X_2-\mu_2)'D(V\bar X_1+(I-V)\bar X_2-\mu_2)]I_{\bar X_1>\bar X_2}.  (3.11)
Similarly, we make the transformations
Z_1 = \bar X_1-\mu_2 \quad\text{and}\quad Z_2 = \bar X_2-\mu_2,  (3.12)
and then Z_1 and Z_2 are mutually independently distributed as N(-\mu, \tfrac{1}{n_1}\Sigma_1) and N(0, \tfrac{1}{n_2}\Sigma_2), respectively, where \mu = \mu_2-\mu_1 \ge 0. From (3.11) we have
R(\mu_2,\bar X_2,D)-R(\mu_2,\tilde\mu_2,D) = E[Z_2'DZ_2-(VZ_1+(I-V)Z_2)'D(VZ_1+(I-V)Z_2)]I_{Z_1>Z_2}
 = E[Z_2'(I-V)'DV(Z_2-Z_1)+(Z_2-Z_1)'V'D(I-V)Z_2]I_{Z_1>Z_2}+E[Z_2'V'DVZ_2-Z_1'V'DVZ_1]I_{Z_1>Z_2}.  (3.13)
Denote Y_1 = Z_1-Z_2 and Y_2 = Z_2+\tfrac{1}{n_2}\Sigma_2\big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}Z_1; then Y_2 \sim N\big(-\tfrac{1}{n_2}\Sigma_2(\tfrac{1}{n_1}\Sigma_1)^{-1}\mu,\ \tfrac{1}{n_2}\Sigma_2+\tfrac{1}{n_2}\Sigma_2(\tfrac{1}{n_1}\Sigma_1)^{-1}\tfrac{1}{n_2}\Sigma_2\big). It is obvious that Y_1 and Y_2 are mutually independent, and
Z_1 = \big(I+\tfrac{1}{n_2}\Sigma_2(\tfrac{1}{n_1}\Sigma_1)^{-1}\big)^{-1}(Y_1+Y_2),\qquad Z_2 = \big(I+\tfrac{1}{n_2}\Sigma_2(\tfrac{1}{n_1}\Sigma_1)^{-1}\big)^{-1}\big(Y_2-\tfrac{1}{n_2}\Sigma_2(\tfrac{1}{n_1}\Sigma_1)^{-1}Y_1\big).
Thus we have
E[Z_2'(I-V)'DV(Z_2-Z_1)+(Z_2-Z_1)'V'D(I-V)Z_2]I_{Z_1>Z_2}+E[Z_2'V'DVZ_2-Z_1'V'DVZ_1]I_{Z_1>Z_2}
 = E\Big[-Y_2'\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}\tfrac{1}{n_1}\Sigma_1\,Y_1-Y_1'\,\tfrac{1}{n_1}\Sigma_1\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}Y_2\Big]I_{Y_1>0}+E[Y_1'Y_1]I_{Y_1>0}.  (3.14)
Taking note of Y_2 \sim N\big(-\tfrac{1}{n_2}\Sigma_2(\tfrac{1}{n_1}\Sigma_1)^{-1}\mu,\ \tfrac{1}{n_2}\Sigma_2+\tfrac{1}{n_2}\Sigma_2(\tfrac{1}{n_1}\Sigma_1)^{-1}\tfrac{1}{n_2}\Sigma_2\big), we have
E\Big[-Y_2'\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}\tfrac{1}{n_1}\Sigma_1\,Y_1-Y_1'\,\tfrac{1}{n_1}\Sigma_1\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}Y_2\Big]I_{Y_1>0} = 2\mu'E(Y_1I_{Y_1>0}) > 0.  (3.15)
By (3.14) and (3.15), we have
R(\mu_2,\bar X_2,D)-R(\mu_2,\tilde\mu_2,D) > 0.  (3.16)
Proof of Theorem 2.3. Put
V = \Big(\big(\tfrac{1}{n_1}S_1\big)^{-1}+\big(\tfrac{1}{n_2}S_2\big)^{-1}\Big)^{-1}\big(\tfrac{1}{n_1}S_1\big)^{-1} = \tfrac{1}{n_2}S_2\big(\tfrac{1}{n_1}S_1+\tfrac{1}{n_2}S_2\big)^{-1},\qquad W = \big(\tfrac{1}{n_1}S_1+\tfrac{1}{n_2}S_2\big)^{-1},\qquad D = \big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}\big(\tfrac{1}{n_1}\Sigma_1+\tfrac{1}{n_2}\Sigma_2\big)\big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}.
Then \hat\mu_1 is expressed as
\hat\mu_1 = \bar X_1(1-I_{W\bar X_1>W\bar X_2})+(V\bar X_1+(I-V)\bar X_2)I_{W\bar X_1>W\bar X_2},  (3.17)
and we calculate the risk difference of \bar X_1 and \hat\mu_1 as
R(\mu_1,\bar X_1,D)-R(\mu_1,\hat\mu_1,D) = E[(\bar X_1-\mu_1)'D(\bar X_1-\mu_1)-(\hat\mu_1-\mu_1)'D(\hat\mu_1-\mu_1)]I_{W\bar X_1>W\bar X_2}
 = E[(\bar X_1-\mu_1)'D(\bar X_1-\mu_1)-(V\bar X_1+(I-V)\bar X_2-\mu_1)'D(V\bar X_1+(I-V)\bar X_2-\mu_1)]I_{W\bar X_1>W\bar X_2}.  (3.18)
Making the transformations
Z_1 = \bar X_1-\mu_1 \quad\text{and}\quad Z_2 = \bar X_2-\mu_1,  (3.19)
Z_1 and Z_2 are mutually independently distributed as N(0, \tfrac{1}{n_1}\Sigma_1) and N(\mu, \tfrac{1}{n_2}\Sigma_2), respectively, where \mu = \mu_2-\mu_1 \ge 0. Noting that Z_1, Z_2 and V are mutually independent, we have from (3.18)
R(\mu_1,\bar X_1,D)-R(\mu_1,\hat\mu_1,D) = E[Z_1'DZ_1-(VZ_1+(I-V)Z_2)'D(VZ_1+(I-V)Z_2)]I_{WZ_1>WZ_2}
 = E[Z_1'V'D(I-V)(Z_1-Z_2)+(Z_1-Z_2)'(I-V)'DVZ_1]I_{WZ_1>WZ_2}+E[Z_1'(I-V)'D(I-V)Z_1-Z_2'(I-V)'D(I-V)Z_2]I_{WZ_1>WZ_2}.  (3.20)
Denote Y_1 = W(Z_1-Z_2) and Y_2 = Z_1+\tfrac{1}{n_1}\Sigma_1\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}Z_2; then W^{-1}Y_1 = Z_1-Z_2 \sim N\big(-\mu,\ \tfrac{1}{n_1}\Sigma_1+\tfrac{1}{n_2}\Sigma_2\big), Y_2 \sim N\big(\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\mu,\ \tfrac{1}{n_1}\Sigma_1+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\tfrac{1}{n_1}\Sigma_1\big), and Y_1 and Y_2 are mutually independent. Then
Z_1 = \big(I+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\big)^{-1}\big(\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}W^{-1}Y_1+Y_2\big),\qquad Z_2 = \big(I+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\big)^{-1}(Y_2-W^{-1}Y_1).
Thus we have
E[Z_1'V'D(I-V)(Z_1-Z_2)+(Z_1-Z_2)'(I-V)'DVZ_1]I_{WZ_1>WZ_2}
 = E\Big[\big(Y_2+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}W^{-1}Y_1\big)'\big(I+(\tfrac{1}{n_2}\Sigma_2)^{-1}\tfrac{1}{n_1}\Sigma_1\big)^{-1}V'D(I-V)W^{-1}Y_1
 \quad +Y_1'W^{-1}(I-V)'DV\big(I+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\big)^{-1}\big(Y_2+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}W^{-1}Y_1\big)\Big]I_{Y_1>0},  (3.21)
E[Z_1'(I-V)'D(I-V)Z_1-Z_2'(I-V)'D(I-V)Z_2]I_{WZ_1>WZ_2}
 = E\Big[\big(Y_2+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}W^{-1}Y_1\big)'\big(I+(\tfrac{1}{n_2}\Sigma_2)^{-1}\tfrac{1}{n_1}\Sigma_1\big)^{-1}(I-V)'D(I-V)\big(I+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\big)^{-1}\big(Y_2+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}W^{-1}Y_1\big)
 \quad -(Y_2-W^{-1}Y_1)'\big(I+(\tfrac{1}{n_2}\Sigma_2)^{-1}\tfrac{1}{n_1}\Sigma_1\big)^{-1}(I-V)'D(I-V)\big(I+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\big)^{-1}(Y_2-W^{-1}Y_1)\Big]I_{Y_1>0}.  (3.22)
From (3.21) and (3.22), we obtain
E[Z_1'V'D(I-V)(Z_1-Z_2)+(Z_1-Z_2)'(I-V)'DVZ_1+Z_1'(I-V)'D(I-V)Z_1-Z_2'(I-V)'D(I-V)Z_2]I_{WZ_1>WZ_2}
 = E\Big[Y_2'\big(I+(\tfrac{1}{n_2}\Sigma_2)^{-1}\tfrac{1}{n_1}\Sigma_1\big)^{-1}D(I-V)W^{-1}Y_1+Y_1'W^{-1}(I-V)'D\big(I+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\big)^{-1}Y_2\Big]I_{Y_1>0}
 \quad +E\Big[Y_1'W^{-1}\big(\tfrac{1}{n_1}\Sigma_1+\tfrac{1}{n_2}\Sigma_2\big)^{-1}\tfrac{1}{n_1}\Sigma_1\,V'D(I-V)W^{-1}Y_1+Y_1'W^{-1}(I-V)'DV\,\tfrac{1}{n_1}\Sigma_1\big(\tfrac{1}{n_1}\Sigma_1+\tfrac{1}{n_2}\Sigma_2\big)^{-1}W^{-1}Y_1\Big]I_{Y_1>0}
 \quad +E\Big[Y_1'W^{-1}\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}\tfrac{1}{n_1}\Sigma_1\big(I+(\tfrac{1}{n_2}\Sigma_2)^{-1}\tfrac{1}{n_1}\Sigma_1\big)^{-1}(I-V)'D(I-V)\big(I+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\big)^{-1}\tfrac{1}{n_1}\Sigma_1\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}W^{-1}Y_1
 \qquad -Y_1'W^{-1}\big(I+(\tfrac{1}{n_2}\Sigma_2)^{-1}\tfrac{1}{n_1}\Sigma_1\big)^{-1}(I-V)'D(I-V)\big(I+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\big)^{-1}W^{-1}Y_1\Big]I_{Y_1>0}.  (3.23)
It is obvious that E\big[\big(I+(\tfrac{1}{n_2}\Sigma_2)^{-1}\tfrac{1}{n_1}\Sigma_1\big)^{-1}D(I-V)W^{-1}\big] = \big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}\tfrac{1}{n_2}\Sigma_2\big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}E\big[\tfrac{1}{n_1}S_1\big] = \big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}\tfrac{1}{n_2}\Sigma_2, so we have
E\Big[Y_2'\big(I+(\tfrac{1}{n_2}\Sigma_2)^{-1}\tfrac{1}{n_1}\Sigma_1\big)^{-1}D(I-V)W^{-1}Y_1+Y_1'W^{-1}(I-V)'D\big(I+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\big)^{-1}Y_2\Big]I_{Y_1>0}
 = 2E[Y_2']\big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}\tfrac{1}{n_2}\Sigma_2\,E(Y_1I_{Y_1>0}) = 2\mu'E(Y_1I_{Y_1>0}) > 0.  (3.24)
Due to \tfrac{1}{n_1}\Sigma_1 \ge \tfrac{1}{n_2}\Sigma_2, we have
E\Big[Y_1'W^{-1}\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}\tfrac{1}{n_1}\Sigma_1\big(I+(\tfrac{1}{n_2}\Sigma_2)^{-1}\tfrac{1}{n_1}\Sigma_1\big)^{-1}(I-V)'D(I-V)\big(I+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\big)^{-1}\tfrac{1}{n_1}\Sigma_1\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}W^{-1}Y_1
 \quad -Y_1'W^{-1}\big(I+(\tfrac{1}{n_2}\Sigma_2)^{-1}\tfrac{1}{n_1}\Sigma_1\big)^{-1}(I-V)'D(I-V)\big(I+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\big)^{-1}W^{-1}Y_1\Big]I_{Y_1>0} \ge 0.  (3.25)
Because the corresponding symmetric matrices are non-negative definite, we have
E\Big[Y_1'W^{-1}\big(\tfrac{1}{n_1}\Sigma_1+\tfrac{1}{n_2}\Sigma_2\big)^{-1}\tfrac{1}{n_1}\Sigma_1\,V'D(I-V)W^{-1}Y_1+Y_1'W^{-1}(I-V)'DV\,\tfrac{1}{n_1}\Sigma_1\big(\tfrac{1}{n_1}\Sigma_1+\tfrac{1}{n_2}\Sigma_2\big)^{-1}W^{-1}Y_1\Big]I_{Y_1>0} \ge 0.  (3.26)
According to (3.24)–(3.26), we have
R(\mu_1,\bar X_1,D)-R(\mu_1,\hat\mu_1,D) > 0.  (3.27)
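Similarly, an informal simulation (ours, with toy parameters chosen so that Σ_1/n_1 ≥ Σ_2/n_2, i.e. inside the set R_1) can be used to compare the estimated risks of X̄_1 and the plug-in estimator μ̂_1:

import numpy as np

rng = np.random.default_rng(1)
p, n1, n2 = 3, 12, 12
Sigma1, Sigma2 = 2.0 * np.eye(p), np.eye(p)        # Sigma_1/n_1 >= Sigma_2/n_2 (set R_1)
mu1, mu2 = np.zeros(p), 0.3 * np.ones(p)
B1inv = n1 * np.linalg.inv(Sigma1)
D = B1inv @ (Sigma1 / n1 + Sigma2 / n2) @ B1inv    # loss matrix used in the proof

risk = np.zeros(2)
reps = 20000
for _ in range(reps):
    X1 = rng.multivariate_normal(mu1, Sigma1, size=n1)
    X2 = rng.multivariate_normal(mu2, Sigma2, size=n2)
    xb1, xb2 = X1.mean(axis=0), X2.mean(axis=0)
    S1, S2 = np.cov(X1, rowvar=False), np.cov(X2, rowvar=False)
    A1, A2 = n1 * np.linalg.inv(S1), n2 * np.linalg.inv(S2)
    mu_gd = np.linalg.solve(A1 + A2, A1 @ xb1 + A2 @ xb2)
    W = np.linalg.inv(S1 / n1 + S2 / n2)
    mu_h1 = mu_gd if np.all(W @ xb1 > W @ xb2) else xb1   # the estimator (2.5)
    d0, d1 = xb1 - mu1, mu_h1 - mu1
    risk += [d0 @ D @ d0, d1 @ D @ d1]
print("estimated risks (Xbar_1, mu_hat_1):", risk / reps)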
Proof of Theorem 2.4. Putting D = \big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}\big(\tfrac{1}{n_1}\Sigma_1+\tfrac{1}{n_2}\Sigma_2\big)\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}, \hat\mu_2 is expressed as
\hat\mu_2 = \bar X_2(1-I_{W\bar X_1>W\bar X_2})+(V\bar X_1+(I-V)\bar X_2)I_{W\bar X_1>W\bar X_2},  (3.28)
and we calculate the risk difference of \bar X_2 and \hat\mu_2 as
R(\mu_2,\bar X_2,D)-R(\mu_2,\hat\mu_2,D) = E[(\bar X_2-\mu_2)'D(\bar X_2-\mu_2)-(\hat\mu_2-\mu_2)'D(\hat\mu_2-\mu_2)]I_{W\bar X_1>W\bar X_2}
 = E[(\bar X_2-\mu_2)'D(\bar X_2-\mu_2)-(V\bar X_1+(I-V)\bar X_2-\mu_2)'D(V\bar X_1+(I-V)\bar X_2-\mu_2)]I_{W\bar X_1>W\bar X_2}.  (3.29)
Now we make the transformations
Z_1 = \bar X_1-\mu_2 \quad\text{and}\quad Z_2 = \bar X_2-\mu_2,  (3.30)
where Z_1 and Z_2 are mutually independently distributed as N(-\mu, \tfrac{1}{n_1}\Sigma_1) and N(0, \tfrac{1}{n_2}\Sigma_2), respectively, with \mu = \mu_2-\mu_1 \ge 0. Since Z_1, Z_2 and V are mutually independent, we have from (3.29)
R(\mu_2,\bar X_2,D)-R(\mu_2,\hat\mu_2,D) = E[Z_2'DZ_2-(VZ_1+(I-V)Z_2)'D(VZ_1+(I-V)Z_2)]I_{WZ_1>WZ_2}
 = E[Z_2'(I-V)'DV(Z_2-Z_1)+(Z_2-Z_1)'V'D(I-V)Z_2]I_{WZ_1>WZ_2}+E[Z_2'V'DVZ_2-Z_1'V'DVZ_1]I_{WZ_1>WZ_2}.  (3.31)
Similarly, we denote Y_1 = W(Z_1-Z_2) and Y_2 = Z_2+\tfrac{1}{n_2}\Sigma_2\big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}Z_1; then Y_2 \sim N\big(-\tfrac{1}{n_2}\Sigma_2(\tfrac{1}{n_1}\Sigma_1)^{-1}\mu,\ \tfrac{1}{n_2}\Sigma_2+\tfrac{1}{n_2}\Sigma_2(\tfrac{1}{n_1}\Sigma_1)^{-1}\tfrac{1}{n_2}\Sigma_2\big). Obviously, Y_1 and Y_2 are mutually independent, and
Z_1 = \big(I+\tfrac{1}{n_2}\Sigma_2(\tfrac{1}{n_1}\Sigma_1)^{-1}\big)^{-1}(W^{-1}Y_1+Y_2),\qquad Z_2 = \big(I+\tfrac{1}{n_2}\Sigma_2(\tfrac{1}{n_1}\Sigma_1)^{-1}\big)^{-1}\big(Y_2-\tfrac{1}{n_2}\Sigma_2(\tfrac{1}{n_1}\Sigma_1)^{-1}W^{-1}Y_1\big).
Thus we have
E[Z_2'(I-V)'DV(Z_2-Z_1)+(Z_2-Z_1)'V'D(I-V)Z_2]I_{WZ_1>WZ_2}+E[Z_2'V'DVZ_2-Z_1'V'DVZ_1]I_{WZ_1>WZ_2}
 = E\Big[-Y_2'\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}\tfrac{1}{n_1}\Sigma_1\,Y_1-Y_1'\,\tfrac{1}{n_1}\Sigma_1\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}Y_2\Big]I_{Y_1>0}
 \quad +E\Big[Y_1'W^{-1}\big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}\tfrac{1}{n_2}\Sigma_2\big(I+(\tfrac{1}{n_1}\Sigma_1)^{-1}\tfrac{1}{n_2}\Sigma_2\big)^{-1}V'DV\big(I+\tfrac{1}{n_2}\Sigma_2(\tfrac{1}{n_1}\Sigma_1)^{-1}\big)^{-1}\tfrac{1}{n_2}\Sigma_2\big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}W^{-1}Y_1
 \qquad -Y_1'W^{-1}\big(I+(\tfrac{1}{n_1}\Sigma_1)^{-1}\tfrac{1}{n_2}\Sigma_2\big)^{-1}V'DV\big(I+\tfrac{1}{n_2}\Sigma_2(\tfrac{1}{n_1}\Sigma_1)^{-1}\big)^{-1}W^{-1}Y_1\Big]I_{Y_1>0}
 \quad +E\Big[Y_1'W^{-1}\big(I+(\tfrac{1}{n_2}\Sigma_2)^{-1}\tfrac{1}{n_1}\Sigma_1\big)^{-1}(I-V)'DVW^{-1}Y_1+Y_1'W^{-1}V'D(I-V)\big(I+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\big)^{-1}W^{-1}Y_1\Big]I_{Y_1>0}.  (3.32)
It is obvious that
E\Big[-Y_2'\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}\tfrac{1}{n_1}\Sigma_1\,Y_1-Y_1'\,\tfrac{1}{n_1}\Sigma_1\big(\tfrac{1}{n_2}\Sigma_2\big)^{-1}Y_2\Big]I_{Y_1>0} > 0.  (3.33)
Due to \tfrac{1}{n_1}\Sigma_1 \le \tfrac{1}{n_2}\Sigma_2, we have
E\Big[Y_1'W^{-1}\big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}\tfrac{1}{n_2}\Sigma_2\big(I+(\tfrac{1}{n_1}\Sigma_1)^{-1}\tfrac{1}{n_2}\Sigma_2\big)^{-1}V'DV\big(I+\tfrac{1}{n_2}\Sigma_2(\tfrac{1}{n_1}\Sigma_1)^{-1}\big)^{-1}\tfrac{1}{n_2}\Sigma_2\big(\tfrac{1}{n_1}\Sigma_1\big)^{-1}W^{-1}Y_1
 \quad -Y_1'W^{-1}\big(I+(\tfrac{1}{n_1}\Sigma_1)^{-1}\tfrac{1}{n_2}\Sigma_2\big)^{-1}V'DV\big(I+\tfrac{1}{n_2}\Sigma_2(\tfrac{1}{n_1}\Sigma_1)^{-1}\big)^{-1}W^{-1}Y_1\Big]I_{Y_1>0} \ge 0.  (3.34)
Because the corresponding symmetric matrices are non-negative definite, we obtain
E\Big[Y_1'W^{-1}\big(I+(\tfrac{1}{n_2}\Sigma_2)^{-1}\tfrac{1}{n_1}\Sigma_1\big)^{-1}(I-V)'DVW^{-1}Y_1+Y_1'W^{-1}V'D(I-V)\big(I+\tfrac{1}{n_1}\Sigma_1(\tfrac{1}{n_2}\Sigma_2)^{-1}\big)^{-1}W^{-1}Y_1\Big]I_{Y_1>0} \ge 0.  (3.35)
Summing up the above discussion, it follows that
R(\mu_2,\bar X_2,D)-R(\mu_2,\hat\mu_2,D) > 0.  (3.36)
This completes the proofs of all the results.
References
[1] Y. Oono, N. Shinozaki, Estimation of two order restricted normal means with unknown and possibly unequal variances, J. Statist. Plann. Inference 131 (2005) 349–363.
[2] N. Misra, E.C. van der Meulen, On estimation of the common mean of k (≥ 2) normal populations with order restricted variances, Statist. Probab. Lett. 36 (1997) 261–267.
[3] B.K. Sinha, Is the maximum likelihood estimate of the common mean of several populations admissible? Sankhya Ser. B 40 (1979) 192–196.
[4] A. Elfessi, N. Pal, A note on the common mean of two normal populations with order restricted variances, Comm. Statist. Theory Methods 21 (1992) 3117–3184.
[5] T. Robertson, F.T. Wright, R.L. Dykstra, Order Restricted Statistical Inference, Wiley, New York, 1988.
[6] C.I.C. Lee, The quadratic loss of isotonic regression under normality, Ann. Statist. 9 (1981) 686–688.
[7] R.E. Kelly, Stochastic reduction of loss in estimating normal means by isotonic regression, Ann. Statist. 17 (1989) 937–940.
[8] N.Z. Shi, H. Jiang, Maximum likelihood estimation of isotonic normal means with unknown variances, J. Multivariate Anal. 64 (1998) 183–195.
[9] W. Chiou, A. Cohen, On estimating a common multivariate normal mean vector, Ann. Inst. Statist. Math. 37 (1985) 499–506.
[10] W.L. Loh, Estimating the common mean of two multivariate normal distributions, Ann. Statist. 19 (1991) 283–296.
[11] S.M. Jordan, K. Krishnamoorthy, Confidence intervals for the common mean of several normal populations, Canad. J. Statist. 23 (1995) 283–297.