Robust inference strategy in the presence of measurement error


Statistics and Probability Letters 80 (2010) 726–732 Contents lists available at ScienceDirect Statistics and Probability Letters journal homepage: ...



Statistics and Probability Letters journal homepage: www.elsevier.com/locate/stapro

S. Ejaz Ahmed, Abdulkadir Hussein, Sévérien Nkurunziza ∗
Department of Mathematics and Statistics, University of Windsor, Canada

Article info

Article history: Received 18 July 2009; received in revised form 30 December 2009; accepted 31 December 2009; available online 14 January 2010.

Abstract

In this paper, we consider a statistical model where samples are subject to measurement errors. We propose a shrinkage estimation strategy that uses the maximum empirical likelihood estimator (MELE) as the base estimator. Our asymptotic results demonstrate the superiority of the proposed shrinkage strategy over the MELE. Monte Carlo simulation results show that this performance still holds in finite samples. We apply our method to a real data set. © 2010 Elsevier B.V. All rights reserved.

1. Introduction

Our aim in the present article is to improve upon the maximum empirical likelihood estimator (MELE). The proposed estimators are simple, easy to implement and very competitive with the MELE and related estimators. The proposed shrinkage-based technique can be applied to a host of statistical problems. Zhong et al. (2000) [ZCR] demonstrated that the empirical likelihood provides an effective way of combining sample measurements from one perfect instrument and several imperfect instruments in the process of estimating unknown parameters of a population. This situation often arises with survey data as well as in the industrial quality control context.

We begin by describing the principal ingredients that contribute to the development of the suggested approach. As in ZCR, we consider a sampling design where $H+1$ independent random samples $s_0, s_1, \ldots, s_H$ of sizes $n_0, n_1, \ldots, n_H$ are drawn and measurements on $s_h$ are obtained through instrument $h$. Instrument 0 is considered perfect and the remaining instruments are considered to give imperfect measurements. The actual measurements are denoted by $\{y_{hi},\ i \in s_h,\ h = 0, 1, \ldots, H\}$ and the interest is in estimating the $p$-column vector $\theta$ with $p \le H+1$. ZCR used the empirical likelihood function to derive an estimator that uses the perfect and imperfect measurements in an effective way. Following the ZCR framework, let $g(y,\theta) = (g_0(y,\theta), g_1(y,\theta), \ldots, g_H(y,\theta))'$ be a $p \times (H+1)$ matrix-valued, functionally independent and unbiased estimating equation, i.e. $E(g(Y,\theta)) = 0$. The profile log-empirical likelihood ratio (ELR) is then given by

$$ r_n(\theta) = \sum_{h=0}^{H} \sum_{i \in s_h} \ln\{1 + \lambda_h' g_h(y_{hi},\theta)\}, \tag{1} $$

where $\lambda_h$ is a vector of Lagrange multipliers satisfying

$$ \sum_{i \in s_h} \frac{g_h(y_{hi},\theta)}{1 + \lambda_h' g_h(y_{hi},\theta)} = 0 \quad \text{for } h = 0, 1, \ldots, H. \tag{2} $$
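To make the role of the Lagrange multiplier in (1) and (2) concrete, the following is a minimal sketch for the simplest possible case: a single sample and the scalar estimating function g(y, θ) = y − θ. That choice, the bisection solver and all function names are illustrative assumptions, not part of the method as published.

```python
import math

def el_lagrange_multiplier(g, tol=1e-12, max_iter=200):
    """Solve sum_i g_i / (1 + lam * g_i) = 0 for a scalar multiplier lam by bisection.

    Assumes a scalar estimating function, so g is a list of numbers g(y_i, theta).
    """
    g_min, g_max = min(g), max(g)
    if not (g_min < 0.0 < g_max):
        raise ValueError("zero must lie strictly inside the range of the g_i")
    # The weights 1 + lam * g_i are all positive only for lam in (-1/g_max, -1/g_min).
    lo, hi = -1.0 / g_max, -1.0 / g_min
    pad = 1e-10 * (hi - lo)
    lo, hi = lo + pad, hi - pad

    def score(lam):
        return sum(gi / (1.0 + lam * gi) for gi in g)

    # score is strictly decreasing on (lo, hi), so bisection finds its unique root.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def log_el_ratio(y, theta):
    """Single-sample analogue of (1) for the assumed g(y, theta) = y - theta."""
    g = [yi - theta for yi in y]
    lam = el_lagrange_multiplier(g)
    return sum(math.log(1.0 + lam * gi) for gi in g)

sample = [1.0, 2.0, 3.0, 4.0, 10.0]          # hypothetical data, sample mean 4
ratio_at_mean = log_el_ratio(sample, 4.0)    # multiplier is 0 at the sample mean
ratio_off_mean = log_el_ratio(sample, 3.0)   # strictly positive away from it
```

In this toy case the log-ELR vanishes at the sample mean and grows as θ moves away from it, which is the behavior the multi-sample criterion (1) generalizes.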

∗ Corresponding address: University of Windsor, Department of Mathematics and Statistics, 401 Sunset Avenue, N9B 3P4 Windsor, Ontario, Canada. Tel.: +1 519 253 3000 3017. E-mail address: [email protected] (S. Nkurunziza). 0167-7152/$ – see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.spl.2009.12.031

We consider estimation in measurement error models when there are many potential variables and some of them may not be relevant. Such a situation arises, for instance, in the context of regression when an investigator suspects a priori that several factors have zero effect on the outcome of interest. In such cases, a prior hypothesis is formed that the coefficients of the factors are zero. Relying completely on the prior information leads to restricted estimators, whereas completely disbelieving it leads to unrestricted estimators. We propose a combination of these two types of estimators and the construction of shrinkage-type estimators that improve the performance of the maximum empirical likelihood estimator (MELE) proposed by ZCR in the context of perfect/imperfect instruments. In particular, we consider two competing models, where one model includes all variables and the other restricts the parameters to a candidate linear sub-space based on prior knowledge. With respect to these two models, we investigate the relative performances of the Stein-type shrinkage estimators. We derive the asymptotic bias and risk of the proposed estimators. Asymptotic and Monte Carlo simulation studies show that the shrinkage estimators perform better than the MELE, especially when the rank of the candidate sub-space is large.

Uncertain Prior Information (UPI)

The parameter $\theta$ is suspected to be restricted to the sub-space

$$ \text{UPI}: L\theta = d, \tag{3} $$

where $L$ is a known $q \times p$ matrix of full rank with $q < p$, and $d$ is a known $q$-column vector. Clearly, the maximization of (1) subject to (2) yields the classical or unrestricted maximum empirical likelihood estimator (UMELE), which completely ignores the information given in (3). On the other hand, the maximization of (1) subject to (2) and (3) gives rise to the restricted estimator (RMELE).

In Section 2, we present asymptotic properties of the RMELE and UMELE. We also present shrinkage and positive shrinkage estimators for combining the RMELE and UMELE in an optimal way, and we derive the asymptotic properties of the shrinkage estimators. Section 3 gives results of Monte Carlo simulations as well as an application to real data. Section 4 gives concluding remarks, whereas technical results are relegated to the Appendix.

2. Unrestricted and restricted estimation strategies

The UMELE, $\hat\theta$, of the parameter $\theta$, in the context of the measurement error model described in Section 1, is obtained by maximizing (1) subject to (2). Let $\mu$ be a $q$-column vector. By using (1)–(3) and Lagrange multipliers, as in Qin and Lawless (1994), the restricted estimator RMELE, $\tilde\theta$, is the solution of the following system for $h = 0, 1, \ldots, H$:

$$
\begin{aligned}
\sum_{i \in s_h} \frac{g_h(y_{hi},\theta)}{1 + \lambda_h' g_h(y_{hi},\theta)} &= 0, \\
\sum_{h=0}^{H} \sum_{i \in s_h} \frac{\{\partial g_h(y_{hi},\theta)/\partial\theta'\}' \lambda_h}{1 + \lambda_h' g_h(y_{hi},\theta)} + L'\mu &= 0, \\
L\theta - d &= 0.
\end{aligned}
\tag{4}
$$

2.1. Shrinkage Estimation Strategies (SES)

Following Ahmed (2001) and others, we define the following Stein-type shrinkage estimator (SMELE) based on the UMELE and RMELE of Section 1:

$$ \hat\theta^{S} = \tilde\theta + \{1 - c\,\psi_n^{-1}\}(\hat\theta - \tilde\theta), \tag{5} $$

where $c$ is allowed to vary over $[0, 2(q-2))$, $q > 2$, and is often set to $c = q - 2$; thus we tacitly assume that $q \ge 3$. The function $\psi_n$ in (5) can be viewed as the test statistic for testing the information given in (3), which can also be termed the null hypothesis. Note that $\psi_n \ge 0$ and hence $\psi_n < c \iff 1 - c\,\psi_n^{-1} < 0$, which may cause a reversal of the sign of the estimator, a phenomenon known as over-shrinkage. Ahmed (2001) recommended that the shrinkage estimator should be used as a tool for developing the positive-rule shrinkage estimator and should not be used as an estimator in its own right. Thus, the second proposed shrinkage estimator, the positive-rule shrinkage estimator (SMELE+), is defined by

$$ \hat\theta^{S+} = \tilde\theta + \{1 - c\,\psi_n^{-1}\}^{+}(\hat\theta - \tilde\theta), \quad \text{where } z^{+} = \max(0, z). \tag{6} $$
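The two combinations in (5) and (6) can be sketched in a few lines. The function below is only an illustration under the assumption that the UMELE, the RMELE and the test statistic ψ_n have already been computed elsewhere; the function name and the numerical inputs are hypothetical.

```python
def shrinkage_estimators(theta_hat, theta_tilde, psi_n, c):
    """Return (SMELE, SMELE+) as in (5) and (6).

    theta_hat   -- unrestricted estimate (list of floats), assumed given
    theta_tilde -- restricted estimate (list of floats), assumed given
    psi_n       -- value of the test statistic, must be positive
    c           -- shrinkage constant, e.g. q - 2
    """
    if psi_n <= 0.0:
        raise ValueError("psi_n must be positive")
    factor = 1.0 - c / psi_n
    diff = [u - r for u, r in zip(theta_hat, theta_tilde)]
    smele = [r + factor * dv for r, dv in zip(theta_tilde, diff)]
    # The positive-rule version truncates a negative factor at zero.
    smele_plus = [r + max(0.0, factor) * dv for r, dv in zip(theta_tilde, diff)]
    return smele, smele_plus

# Hypothetical inputs: psi_n > c, so both estimators coincide.
s, s_plus = shrinkage_estimators([2.0, 4.0], [3.0, 3.0], psi_n=4.0, c=2.0)
# psi_n < c: the plain shrinkage estimator over-shrinks, SMELE+ returns the RMELE.
s_over, s_plus_over = shrinkage_estimators([2.0, 4.0], [3.0, 3.0], psi_n=1.0, c=2.0)
```

In the second call the factor equals −1, so the SMELE lands on the opposite side of the restricted estimate from the UMELE, which is the over-shrinkage effect that the positive-rule estimator avoids.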

In general, it is not easy to obtain the finite sample risk of the above estimators. This difficulty has been largely overcome by asymptotic methods (Ahmed, 2001; Nkurunziza and Ahmed, in press, and others), which relate primarily to convergence in distribution. However, convergence in distribution does not guarantee convergence in quadratic risk. This technicality has been taken care of by the introduction of the asymptotic distributional risk (ADR) (Ahmed, 2001), which, in turn, is based on the concept of a shrinking neighborhood of the pivot, for which the ADR serves a useful and interpretable role in asymptotic risk analysis. On a slightly different note, the confidence set problem for shrinkage estimators is developed by Ahmed (2001), Ahmed et al. (2009), and others.

For deriving the ADR of $\tilde\theta$, $\hat\theta^{S}$ and $\hat\theta^{S+}$, we set the following sequence of alternatives to (3):

$$ K_n: L\theta = d + \delta/\sqrt{n}, \quad n = 1, 2, \ldots, \tag{7} $$

where $\delta$ is a nonzero $q$-column vector with $\|\delta\| < \infty$, linearly independent of $d$.


2.2. First order asymptotics

For establishing asymptotic results pertaining to the properties of the proposed estimators, the following regularity conditions are needed.

(A1) $\sum_{h=0}^{H} \sum_{i \in s_h} g_h(y_{hi},\theta)\, g_h'(y_{hi},\theta)$ is a positive definite matrix with probability one;
(A2) $E[\|g(x,\theta)\, g'(x,\theta)\|] < \infty$ and, for every $h = 0, 1, \ldots, H$, $E[\sum_{i \in s_h} g_h(y_{hi},\theta)\, g_h'(y_{hi},\theta)]$ is a positive definite matrix;
(A3) The function $g(x,\theta)$ is twice differentiable in $\theta$ and $\partial^2 g(x,\theta)/\partial\theta\,\partial\theta'$ is continuous in a neighborhood of the true value $\theta_0$. Further, $\|\partial g(x,\theta)/\partial\theta\|$, $\|\partial^2 g(x,\theta)/\partial\theta\,\partial\theta'\|$ and $\|g(x,\theta)\|^3$ are bounded by some integrable function $G(x)$ in this neighborhood;
(A4) For each $h$, the matrices $E(\partial g_h(x,\theta_0)/\partial\theta)$ and $\sum_{i \in s_h} E(\partial g_h(y_{hi},\theta_0)/\partial\theta)$ are of full rank.

The above regularity conditions are similar to those of Lemma 1 and Theorem 1 of Qin and Lawless (1994). Now let $\rho_n = \sqrt{n}(\hat\theta - \theta)$, $\xi_n = \sqrt{n}(\hat\theta - \tilde\theta)$, $\zeta_n = \sqrt{n}(\tilde\theta - \theta)$, and let

$$ V = \left[ E\left(\frac{\partial g}{\partial\theta}\right)' \{E(g g')\}^{-1} E\left(\frac{\partial g}{\partial\theta}\right) \right]^{-1}; \quad J_0 = V L'(L V L')^{-1}; \quad \delta^* = -J_0\delta; \quad V^* = J_0 L V. $$

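With the above notation and under the regularity conditions, the UMELE and RMELE are consistent estimators of $\theta$. Furthermore, Proposition 2.1 and Corollary 2.1 given below show that the UMELE and RMELE are asymptotically normal under the local alternatives (7). These results are more general than those of Qin and Lawless (1994) and Zhong et al. (2000) in the sense that a sequence of local alternatives which includes the null hypothesis is considered and, furthermore, the joint asymptotic distribution of the restricted and unrestricted estimators is derived.

Proposition 2.1. If (A1)–(A4) hold, then under the local alternative $K_n$ in (7), we have

$$ \begin{pmatrix} \rho_n \\ \zeta_n \end{pmatrix} \xrightarrow[n\to\infty]{L} \begin{pmatrix} \rho \\ \zeta \end{pmatrix} \sim N_{2p}\left( \begin{pmatrix} 0 \\ J_0\delta \end{pmatrix}, \begin{pmatrix} V & V - V^* \\ V - V^* & V - V^* \end{pmatrix} \right), \tag{8} $$

$$ \begin{pmatrix} \xi_n \\ \zeta_n \end{pmatrix} \xrightarrow[n\to\infty]{L} \begin{pmatrix} \xi \\ \zeta \end{pmatrix} \sim N_{2p}\left( \begin{pmatrix} -J_0\delta \\ J_0\delta \end{pmatrix}, \begin{pmatrix} V^* & 0 \\ 0 & V - V^* \end{pmatrix} \right). $$

Corollary 2.1. If Proposition 2.1 holds, then $\xi_n' L'(L V L')^{-1} L \xi_n \xrightarrow[n\to\infty]{L} \psi \sim \chi_q^2(\Delta)$, where $\chi_q^2(\Delta)$ is a chi-square variate with $q$ degrees of freedom and noncentrality parameter $\Delta = \delta'(L V L')^{-1}\delta$.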

Note that Corollary 2.1 covers, as a special case, the asymptotic distribution under the UPI in (3). The proofs of the proposition and its corollary are outlined in the Appendix. In practice, the covariance matrix $V$ can be estimated by

$$ \hat V = \left[ \sum_{h=0}^{H} \hat k_h \left\{ \sum_{i=1}^{n_h} \tilde p_{hi}\, \frac{\partial g_h(y_{hi},\hat\theta)}{\partial\theta} \right\}' \left\{ \sum_{i=1}^{n_h} \tilde p_{hi}\, g_h(y_{hi},\hat\theta)\, g_h(y_{hi},\hat\theta)' \right\}^{-1} \left\{ \sum_{i=1}^{n_h} \tilde p_{hi}\, \frac{\partial g_h(y_{hi},\hat\theta)}{\partial\theta} \right\} \right]^{-1}, \tag{9} $$

where $\tilde p_{hi} = \hat p_{hi}(\hat\theta)$, $\hat k_h = n_h/n_0$ and $\partial g_h(y_{hi},\hat\theta)/\partial\theta$ denotes the derivative $\partial g_h(y_{hi},\theta)/\partial\theta$ evaluated at $\theta = \hat\theta$. Note that $\hat V$ given in (9) is a consistent estimator of $V$ (see, for example, Zhong et al., 2000). Thus, by Slutsky's theorem and Corollary 2.1, an asymptotically $\alpha$-level test can be based on the following Wald-type test statistic:

$$ \psi_n = \xi_n' L'(L \hat V L')^{-1} L \xi_n \xrightarrow[n\to\infty]{L} \psi \sim \chi_q^2(\Delta). $$
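As a rough illustration of the Wald-type statistic, note that the restricted estimator satisfies the constraint in (4), so the product of L and ξ_n reduces to √n(Lθ̂ − d). The sketch below uses that identity in pure Python. The small linear solver, the function names, the choice of n and the numerical inputs are all assumptions made for the example; the estimate V̂ is taken as given rather than computed from (9).

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in row] + [float(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def wald_statistic(theta_hat, V_hat, L, d, n):
    """psi_n = n (L theta_hat - d)' (L V_hat L')^{-1} (L theta_hat - d)."""
    p = len(theta_hat)
    resid = [sum(row[j] * theta_hat[j] for j in range(p)) - di
             for row, di in zip(L, d)]
    LV = [[sum(row[k] * V_hat[k][j] for k in range(p)) for j in range(p)]
          for row in L]
    LVLt = [[sum(lv[k] * row[k] for k in range(p)) for row in L] for lv in LV]
    x = solve(LVLt, resid)
    return n * sum(r * xi for r, xi in zip(resid, x))

# Hypothetical example: p = 3, UPI of equal component means, so q = 2.
L = [[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]]
d = [0.0, 0.0]
V_hat = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
psi_null = wald_statistic([1.0, 1.0, 1.0], V_hat, L, d, n=30)
psi_alt = wald_statistic([1.0, 0.0, 0.0], V_hat, L, d, n=30)
```

The statistic is zero when the unrestricted estimate already satisfies the constraint and increases as the estimate moves away from the constrained sub-space.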

2.3. Asymptotic distributional bias and risk

It is well known that, even for the normal distribution, the effective domain of risk dominance of shrinkage estimators over MLEs is a small neighborhood of the chosen pivot (viz., $L\theta = d$), and as the sample size $n$ grows, this domain becomes narrower. In the present context, under (A1)–(A4), Corollary 2.1 and Proposition 2.1 show that, for any fixed $L\theta \ne d$, $\psi_n \xrightarrow[n\to\infty]{L} \psi \sim \chi_q^2(\Delta)$ and $n^{-1}\psi_n \xrightarrow[n\to\infty]{L} 0$ and, as such, the shrinkage factor $c\,\psi_n^{-1} = O_p(n^{-1})$ as $n \to \infty$. Thus, asymptotically, there is no shrinkage effect. This justifies the choice of the sequence of local alternatives given in (7).

Consider a quadratic loss function of the form $L(\hat\theta^{\star}, \theta; W) = n(\hat\theta^{\star} - \theta)' W (\hat\theta^{\star} - \theta)$, where $W$ is a $p \times p$ positive semi-definite (p.s.d.) matrix and $\hat\theta^{\star}$ is an estimator of $\theta$. Suppose that the distribution of $\sqrt{n}(\hat\theta^{\star} - \theta)$ is $\tilde G_n(u)$, $u \in \mathbb{R}^p$, and $\tilde G_n \to \tilde G$ (at all points of continuity of $\tilde G$) as $n \to \infty$, and let $\Sigma_{\tilde G}$ be the dispersion matrix of $\tilde G$. The quadratic risk is then defined as the expected value of the above loss function with respect to the distribution $\tilde G_n(u)$ and is denoted by $R_n^{o}(\hat\theta^{\star}, \theta; W)$. On the other hand, the asymptotic distributional risk (ADR) of $\hat\theta^{\star}$ is defined as

$$ R^{o}(\hat\theta^{\star}, \theta; W) = \operatorname{trace}(W \Sigma_{\tilde G}) + [B(\hat\theta^{\star}, \theta)]' W [B(\hat\theta^{\star}, \theta)], $$

where $B(\hat\theta^{\star}, \theta) = \int \cdots \int x \, d\tilde G(x)$ is called the asymptotic distributional bias (ADB) of $\hat\theta^{\star}$.
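The ADR definition combines a variance term and a squared-bias term, both weighted by W. The short sketch below evaluates that definition for a dispersion matrix and a bias vector that are assumed to be known; the function name and the numbers are hypothetical and serve only to show how the two terms add up.

```python
def asymptotic_distributional_risk(W, Sigma, bias):
    """Evaluate trace(W Sigma) + bias' W bias for square matrices given as nested lists."""
    p = len(bias)
    trace_term = sum(W[i][k] * Sigma[k][i] for i in range(p) for k in range(p))
    bias_term = sum(bias[i] * W[i][j] * bias[j] for i in range(p) for j in range(p))
    return trace_term + bias_term

# Hypothetical inputs with W equal to the identity matrix.
W = [[1.0, 0.0], [0.0, 1.0]]
Sigma = [[2.0, 0.0], [0.0, 3.0]]
risk_unbiased = asymptotic_distributional_risk(W, Sigma, [0.0, 0.0])  # variance only
risk_biased = asymptotic_distributional_risk(W, Sigma, [1.0, 2.0])    # variance plus squared bias
```

With the identity weight the ADR is simply the total asymptotic variance plus the squared length of the bias vector, so an estimator with a smaller dispersion can still lose if its bias grows without bound.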


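Below, we give the ADB and ADR expressions. As outlined in the Appendix, the proofs make extensive use of Proposition 2.1 and Corollary 2.1. Let $H_\nu(x; \Delta) = P\{\chi_\nu^2(\Delta) \le x\}$, $x \in \mathbb{R}^+$.

Theorem 2.1. If Proposition 2.1 holds, then the ADB functions of the estimators are

$$
\begin{aligned}
B(\hat\theta, \theta) &= 0; \qquad B(\tilde\theta, \theta) = -\delta^*; \\
B(\hat\theta^{S}, \theta) &= -\delta^* (q-2) E\{\chi_{q+2}^{-2}(\Delta)\}; \\
B(\hat\theta^{S+}, \theta) &= -\delta^* \left[ H_{q+2}(q-2; \Delta) + (q-2) E\{\chi_{q+2}^{-2}(\Delta)\, I(\chi_{q+2}^2(\Delta) > q-2)\} \right].
\end{aligned}
\tag{10}
$$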
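The scalar factor that multiplies δ* in the ADB of the shrinkage estimator involves the inverse moment of a noncentral chi-square variate. One standard way to evaluate it is the Poisson mixture representation of the noncentral chi-square distribution. The sketch below assumes the convention in which the noncentrality parameter is the sum of squared means, so that the Poisson mixing rate is ∆/2; the function names and the truncation settings are illustrative choices, suitable for moderate ∆ only.

```python
import math

def inv_moment_noncentral_chi2(nu, delta, max_terms=2000, tol=1e-15):
    """E[1 / chi2_nu(delta)] via the Poisson mixture; requires nu > 2.

    Assumes chi2_nu(delta) is distributed as chi2_{nu + 2J} with J ~ Poisson(delta / 2),
    and uses E[1 / chi2_k] = 1 / (k - 2) for a central chi-square with k > 2.
    """
    if nu <= 2:
        raise ValueError("nu must exceed 2 for the inverse moment to exist")
    half = delta / 2.0
    weight = math.exp(-half)      # Poisson probability of J = 0
    total, cumulative = 0.0, 0.0
    for j in range(max_terms):
        total += weight / (nu + 2 * j - 2)
        cumulative += weight
        if 1.0 - cumulative < tol:
            break
        weight *= half / (j + 1)  # Poisson probability of J = j + 1
    return total

def adb_shrinkage_factor(q, delta):
    """Scalar (q - 2) E{chi^{-2}_{q+2}(delta)} appearing in the ADB of the shrinkage estimator."""
    return (q - 2) * inv_moment_noncentral_chi2(q + 2, delta)

factor_at_null = adb_shrinkage_factor(5, 0.0)   # equals (q - 2) / q at delta = 0
factor_away = adb_shrinkage_factor(5, 4.0)      # smaller, since the inverse moment decreases
```

Because the inverse moment decreases in ∆ while the length of δ* grows with it, the product stays bounded, which is the qualitative behavior of the shrinkage bias.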

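Theorem 2.2. If Proposition 2.1 holds, then the ADR functions of the estimators are:

$$
\begin{aligned}
R(\hat\theta, \theta; W) ={}& \operatorname{trace}(WV); \\
R(\tilde\theta, \theta; W) ={}& \operatorname{trace}(WV^*) + \operatorname{trace}(\delta^{*\prime} W \delta^*), \\
R(\hat\theta^{S}, \theta; W) ={}& \mathrm{ADR}(\hat\theta) + \operatorname{trace}(\delta^{*\prime} W \delta^*)(q^2 - 4) E\{\chi_{q+4}^{-4}(\Delta)\} \\
& - (q-2)\operatorname{trace}(WV^*) \left[ 2E\{\chi_{q+2}^{-2}(\Delta)\} - (q-2) E\{\chi_{q+2}^{-4}(\Delta)\} \right], \\
R(\hat\theta^{S+}, \theta; W) ={}& \mathrm{ADR}(\hat\theta^{S}) - \operatorname{trace}(WV) H_{q+2}(q-2; \Delta) \\
& + \operatorname{trace}(\delta^{*\prime} W \delta^*) \left\{ 2H_{q+2}(q-2; \Delta) - H_{q+4}(q-2; \Delta) \right\} \\
& - (q-2)\operatorname{trace}(\delta^{*\prime} W \delta^*) \left[ 2E\{\chi_{q+2}^{-2}(\Delta)\, I(\chi_{q+2}^2(\Delta) \le q-2)\} - 2E\{\chi_{q+4}^{-2}(\Delta)\, I(\chi_{q+4}^2(\Delta) \le q-2)\} \right] \\
& + (q-2)\operatorname{trace}(WV^*) \left[ 2E\{\chi_{q+2}^{-2}(\Delta)\, I(\chi_{q+2}^2(\Delta) \le q-2)\} - (q-2) E\{\chi_{q+2}^{-4}(\Delta)\, I(\chi_{q+2}^2(\Delta) \le q-2)\} \right] \\
& - (q-2)^2 \operatorname{trace}(\delta^{*\prime} W \delta^*)\, E\{\chi_{q+4}^{-4}(\Delta)\, I(\chi_{q+4}^2(\Delta) \le q-2)\}.
\end{aligned}
$$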
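For a numerical feel of the ADR of the shrinkage estimator, the expression in Theorem 2.2 can be evaluated with the same Poisson mixture device used for inverse chi-square moments. The sketch below follows the form obtained in the Appendix with c = q − 2. The traces and the quadratic form in δ* are treated as given scalars, and the function names, truncation length and inputs are assumptions made for illustration.

```python
import math

def poisson_mixture(delta, term, max_terms=2000):
    """Sum of term(j) weighted by Poisson(delta / 2) probabilities (moderate delta assumed)."""
    half = delta / 2.0
    weight = math.exp(-half)
    total = 0.0
    for j in range(max_terms):
        total += weight * term(j)
        weight *= half / (j + 1)
    return total

def adr_shrinkage(tr_WV, tr_WVstar, quad, q, delta):
    """ADR of the shrinkage estimator with c = q - 2 (requires q >= 3).

    tr_WV     -- trace(W V), assumed given
    tr_WVstar -- trace(W V*), assumed given
    quad      -- the quadratic form delta*' W delta*, assumed given
    """
    # Central moments used inside the mixture: E[1/chi2_k] = 1/(k-2),
    # E[1/chi2_k^2] = 1/((k-2)(k-4)), with k = nu + 2j.
    m2_q2 = poisson_mixture(delta, lambda j: 1.0 / (q + 2 * j))
    m4_q2 = poisson_mixture(delta, lambda j: 1.0 / ((q + 2 * j) * (q + 2 * j - 2)))
    m4_q4 = poisson_mixture(delta, lambda j: 1.0 / ((q + 2 + 2 * j) * (q + 2 * j)))
    return (tr_WV
            + quad * (q * q - 4) * m4_q4
            - (q - 2) * tr_WVstar * (2.0 * m2_q2 - (q - 2) * m4_q2))

# Hypothetical inputs at the null (delta = 0, no bias term).
risk_at_null = adr_shrinkage(tr_WV=5.0, tr_WVstar=4.0, quad=0.0, q=4, delta=0.0)
```

At ∆ = 0 the expression reduces to trace(WV) − (q − 2) trace(WV*)/q, so with these inputs the shrinkage risk is 3 against an unrestricted risk of 5.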

As far as bias is concerned, we first notice that the ADBs of $\tilde\theta$, $\hat\theta^{S}$ and $\hat\theta^{S+}$ are, up to constant factors, functions of $\delta$. Accordingly, it suffices to compare the scalar factors in $\Delta$ only. It is clear that the bias of $\tilde\theta$ is an unbounded function of $\Delta$. On the other hand, the ADBs of both $\hat\theta^{S}$ and $\hat\theta^{S+}$ are bounded in $\Delta$. Since $E\{\chi_{q+2}^{-2}(\Delta)\}$ is a decreasing log-convex function of $\Delta$, the ADB of $\hat\theta^{S}$ starts from the origin at $\Delta = 0$, increases to a maximum, and then decreases towards 0. The behavior of $\hat\theta^{S+}$ is similar to that of $\hat\theta^{S}$. Interestingly, the bias curve of $\hat\theta^{S+}$ remains below the curve of $\hat\theta^{S}$ for all values of $\Delta$.

On the other hand, for a suitable choice of the matrix $W$, the risk dominance properties of the estimators are similar to those under normal theory and can be summarized as follows:

– (i) Clearly $R(\hat\theta^{S}, \theta; W) < \operatorname{trace}(WV)$ for all $\Delta > 0$. Hence, $\hat\theta^{S}$ outperforms $\hat\theta$.
– (ii) $\hat\theta^{S+}$ is asymptotically superior to $\hat\theta^{S}$ in the entire parameter space induced by $\Delta$. Therefore, $\hat\theta^{S+}$ is also superior to the UMELE.
– (iii) Importantly, $\hat\theta^{S+}$ does not inherit the over-shrinking problem.

3. Numerical examples

3.1. Simulation studies

In this simulation study, we used samples generated from a $p$-dimensional multivariate normal model $N(\theta, \Sigma)$, with $p = 5, 6$ and, without loss of generality, $\Sigma = \sigma^2 I_p$, with $I_p$ being the $p$-dimensional identity matrix and $\sigma^2$ the component variances. We considered one sample from a perfect machine and one from an imperfect machine with equal sample sizes $n_0 = n_1 = 50, 100, 150, 200$, and we used 500 Monte Carlo simulations. We assumed that the variance of the perfect machine measurements is $\sigma_0 = 1$ and that of the imperfect machine is three times larger, $\sigma_1 = 3\sigma_0$. The uncertain prior information $L\theta = d$ was taken to be that of equality of the component means; thus, $q = p - 1$. In order to save space, we report the results for the case where $p = 5$, and these are summarized in Fig. 1. Under the null hypothesis, the shrinkage estimators SMELE and SMELE+ dominate or equal the UMELE in risk, and they are dominated by the RMELE in a small neighborhood of the uncertain prior information represented by the null hypothesis. Beyond the small interval near the null hypothesis, the risk of the RMELE explodes to infinity and thus remains dominated by the rest of the estimators.

[Figure 1: four panels, (a) n = 50, p = 5; (b) n = 100, p = 5; (c) n = 150, p = 5; (d) n = 200, p = 5. Each panel plots relative efficiency (vertical axis, 0 to 6) against ∆* (horizontal axis, 0.0 to 2.0) for UMELE, RMELE, SMELE+ and SMELE.]

Fig. 1. Relative risks of SMELE, SMELE+ and RMELE with respect to UMELE for p-variate normal data with p = 5 and n0 = n1 = n = 50, 100, 150, 200. The horizontal line represents UMELE.

3.2. Illustrative application

The data analyzed in this example are based on a Canadian child safety seats survey described in Snowdon et al. (2009). The main objective of the survey was to measure whether children traveling in vehicles on Canadian roads are correctly restrained in safety devices that are appropriate for their ages, weights and heights. Drivers of vehicles entering 200 randomly selected retail parking lots across Canada were interviewed, and variables pertaining to the drivers and child occupants were recorded. Child's weight, age and height were among a number of variables measured. The drivers were first asked to

give estimates of the children's weights and heights and then, if time permitted, the children were taken out of the vehicle and their actual weights and heights were measured. In a number of cases, drivers declined to continue the survey and thus interviewers were not able to obtain the actual weights and heights of the child occupants. Therefore, a subset of the data on these variables has measurement errors.

In this section, we estimate the average weight (in lbs) of school-aged children (4–8 years of age) in five Canadian provinces by using data on both guessed and actual weights of children, under the hypothesis that the average weights of the five provinces are equal. For this purpose, we extracted random samples of size n = 30 from the guessed and the actual data sets of each of the five provinces. Thus, the weight was considered as five-dimensional multivariate normal and the means of the five components were estimated by using the methods developed in this paper. The various estimators were then bootstrapped (using 300 bootstrap samples) and percentile bootstrap 95% confidence intervals were computed. The results are summarized in Table 1. The results for the shrinkage estimator were identical to those of the positive shrinkage estimator and are therefore not reported here. It is seen from Table 1 that the shrinkage estimator offers shorter confidence intervals as compared to the UMELE.

Table 1. Average weight (in lbs) of school-aged Canadian children, length of its 95% C.I. and the actual 95% C.I., estimated by using UMELE and SMELE+.

| Province | UMELE average weight | UMELE C.I. length | UMELE 95% C.I. | SMELE+ average weight | SMELE+ C.I. length | SMELE+ 95% C.I. |
|---|---|---|---|---|---|---|
| Alberta | 50.97 | 10 | (46.75, 56.59) | 50.97 | 8 | (47.19, 54.85) |
| Ontario | 56.2 | 16 | (50.27, 66.07) | 55.65 | 15 | (50.42, 65.46) |
| British Columbia | 57.62 | 15 | (50.70, 65.80) | 56.99 | 14 | (50.77, 64.38) |
| Quebec | 50.98 | 10 | (45.36, 55.75) | 51.22 | 9 | (46.2, 55.06) |
| Nova Scotia | 51.38 | 14 | (45.52, 57.20) | 51.58 | 12 | (43.26, 57.52) |

4. Conclusion

In this article, we constructed efficient empirical likelihood shrinkage estimators that incorporate, in the estimation process, uncertain prior information as well as information from samples affected by measurement errors. We examined the risk properties of the estimators by theoretical asymptotic methods and by simulations. We have found that the proposed

shrinkage estimators dominate the UMELE. Further, as we move away from the uncertain prior information, the shrinkage estimators also dominate the restricted estimator. We illustrated the usefulness of the proposed methods by using a survey example.

Acknowledgements

This research work was partially supported by Individual Discovery Grants from the Natural Sciences and Engineering Research Council (NSERC) of Canada. The authors thank Prof. J. N. K. Rao for helpful discussions. They also thank Transport Canada and Dr. Anne Snowdon for allowing them to use the survey data. The authors would also like to thank the Editor-in-Chief, Prof. H. Koul, and an anonymous referee for their helpful comments.

Appendix. Technical results and proofs

For simplicity, we denote the left-hand sides of (4) by $f_{1n}(\theta, \lambda_h, \mu)$, $f_{2n}(\theta, \lambda_h, \mu)$ and $f_{3n}(\theta, \lambda, \mu)$, respectively, and further we shall omit the arguments of the functions.

Proof of Proposition 2.1. Similar to ZCR and Qin and Lawless (1994, 1995), we use the Taylor series expansions of $0 = f_{1n}(\theta, \lambda_h, \mu)$, $h = 0, 1, \ldots, H$, $0 = f_{2n}(\theta, \lambda, \mu)$ and $0 = f_{3n}(\theta, \lambda, \mu)$ around $(\theta, 0, 0)$. We have

$$ \hat\theta - \theta = -\left\{ \left( \sum_i \frac{\partial g}{\partial\theta} \right)' \left( \sum_i g g' \right)^{-1} \left( \sum_i \frac{\partial g}{\partial\theta} \right) \right\}^{-1} \left( \sum_i \frac{\partial g}{\partial\theta} \right)' \left( \sum_i g g' \right)^{-1} \sum_i g + o_p\left(n_h^{-1/2}\right), \tag{11} $$

$$ \tilde\theta - \theta = (-I + J_0 L)(\hat\theta - \theta) + J_0\delta/\sqrt{n_h} + o_p\left(n_h^{-1/2}\right), \tag{12} $$

with $J_0 = V L'(L V L')^{-1}$. Further, by the central limit theorem and since $E(g) = 0$, we have

$$ (1/\sqrt{n_0}) \sum_i g \xrightarrow[n\to\infty]{L} N_p\left(0, k_h E(g g')\right), \tag{13} $$

and by using the strong law of large numbers, we get

$$ (1/n_0) \sum_i \frac{\partial g}{\partial\theta} \xrightarrow[n\to\infty]{a.s.} k_h E\left(\frac{\partial g}{\partial\theta}\right), \quad \text{and} \quad (1/n_0) \sum_i g g' \xrightarrow[n\to\infty]{a.s.} k_h E(g g'). \tag{14} $$

Therefore, combining (11)–(14) and Slutsky's theorem, we get the first statement of the proposition. Furthermore, note that

$$ (\xi_n', \zeta_n')' = \begin{pmatrix} I_p & -I_p \\ 0 & I_p \end{pmatrix} (\rho_n', \zeta_n')', \tag{15} $$

and then, combining (8) and (15) and Slutsky's theorem, we get the second statement of the proposition, which completes the proof. □

Proof of Corollary 2.1. By using Proposition 2.1, under the local alternative, we have $\xi_n \xrightarrow[n\to\infty]{L} \xi \sim N_p(\delta^*, V^*)$. Therefore, under the local alternative (7), $\xi_n' L'(L V L')^{-1} L \xi_n \xrightarrow[n\to\infty]{L} Z' L'(L V L')^{-1} L Z$, where $Z$ has the same distribution as $\xi$. Moreover, one can verify that $(V^* L'(L V L')^{-1} L)^2 = V^* L'(L V L')^{-1} L$, so that $\operatorname{rank}(V^* L'(L V L')^{-1} L) = q$, $\delta^{*\prime} (V^* L'(L V L')^{-1} L)^2 = \delta^{*\prime} V^* L'(L V L')^{-1} L$ and, similarly, $\delta^{*\prime} L'(L V L')^{-1} L V^* L'(L V L')^{-1} L \delta^* = \delta^{*\prime} L'(L V L')^{-1} L \delta^*$. Therefore, by using Theorem 4 in Styan (1970), we get $Z' L'(L V L')^{-1} L Z \sim \chi_q^2\left(\delta^{*\prime} L'(L V L')^{-1} L \delta^*\right)$. □


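Proof of Theorem 2.1. From Proposition 2.1, we get directly the first two statements, $B(\hat\theta, \theta) = E(\rho) = 0$ and $B(\tilde\theta, \theta) = E(\zeta) = -\delta^*$. For the third statement, we have

$$ B(\hat\theta^{S}, \theta) = E\left[ \zeta + \{1 - (q-2)\psi^{-1}\}\xi \right] = -\delta^* + E\left[ \{1 - (q-2)\psi^{-1}\}\xi \right], \tag{16} $$

with $\psi = \xi'(\Upsilon \Upsilon^* \Upsilon)\xi$, where $\Upsilon = V^{-1/2}$ and $\Upsilon^* = V^{1/2} L'(L V L')^{-1} L V^{1/2}$. Obviously, $\Upsilon^*$ is a symmetric idempotent matrix. So, there exists an orthogonal matrix $Q$ such that

$$ Q \Upsilon^* Q' = \begin{pmatrix} I_q & 0 \\ 0 & 0 \end{pmatrix}. \tag{17} $$

Now, let $U = Q \Upsilon \xi$. We have $U = (U_1', 0')'$ and $\psi = U'U = U_1' U_1$, and then, by using (16), we get

$$ B(\hat\theta^{S}, \theta) = -\delta^* + \Upsilon^{-1} Q' E\left[ \{1 - (q-2)(U_1' U_1)^{-1}\} U \right], \tag{18} $$

where $U_1 \sim N_q(\mu_1, I_q)$ with $\mu_1 = -[I_q, 0] Q \Upsilon \delta^*$. Then, from Theorem 1 in Judge and Bock (1978), we get

$$ B(\hat\theta^{S}, \theta) = -\delta^* + E\left[ 1 - (q-2)/\chi_{q+2}^2(\Delta) \right] \Upsilon^{-1} Q' Q \Upsilon^* Q' Q \Upsilon \delta^*, $$

with $\mu_1' \mu_1 = \delta'(L V L')^{-1}\delta = \Delta$. Further, using the fact that $\Upsilon^{-1} \Upsilon^* \Upsilon = J_0 L$ and $L J_0 = I_q$, we prove the third statement. The proof of the last statement is similar. □

Lemma A.1. Let $c$ be a real number and assume that Proposition 2.1 holds. Then

$$
\begin{aligned}
E\left[ (1 - c\psi^{-1})^2 \xi' W \xi \right] &= E\left[ (1 - c\chi_{q+2}^{-2}(\Delta))^2 \right] \operatorname{trace}(W \Upsilon^*) + E\left[ (1 - c\chi_{q+4}^{-2}(\Delta))^2 \right] \delta^{*\prime} W \delta^*; \\
E\left[ (1 - c\psi^{-1}) \zeta' W \xi \right] &= -E\left[ 1 - c\chi_{q+2}^{-2}(\Delta) \right] \delta^{*\prime} W \delta^*,
\end{aligned}
$$

where $\Delta = \delta'(L V L')^{-1}\delta$.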

Proof. The proof of the first statement follows from (17) and Theorem 2 in Judge and Bock (1978), along with some algebraic computations. For the second statement, since $\zeta$ and $\xi$ are independent, we have $E[(1 - c\psi^{-1}) \zeta' W \xi] = (\delta^*)' W E[(1 - c\psi^{-1})\xi]$, and then, using the same techniques as in the proof of Theorem 2.1, we prove the last statement of the lemma. □

Proof of Theorem 2.2. The ADRs of $\hat\theta$ and $\tilde\theta$ follow directly from Proposition 2.1. To derive $\mathrm{ADR}(\hat\theta^{S}, \theta; W)$, we first note that

$$ \mathrm{ADR}(\hat\theta^{S}, \theta; W) = \mathrm{ADR}(\tilde\theta, \theta; W) + 2E\left[ \zeta' W (1 - c\psi^{-1})\xi \right] + E\left[ \xi' W (1 - c\psi^{-1})^2 \xi \right]. $$

Also, from Lemma A.1,

$$ \mathrm{ADR}(\hat\theta^{S}, \theta; W) = \mathrm{ADR}(\tilde\theta, \theta; W) - 2E\left[ 1 - c\chi_{q+2}^{-2}(\Delta) \right] \delta^{*\prime} W \delta^* + E\left[ (1 - c\chi_{q+2}^{-2}(\Delta))^2 \right] \operatorname{trace}(W \Upsilon^*) + E\left[ (1 - c\chi_{q+4}^{-2}(\Delta))^2 \right] \delta^{*\prime} W \delta^*. $$

Hence,

$$ \mathrm{ADR}(\hat\theta^{S}, \theta; W) = \mathrm{ADR}(\hat\theta, \theta; W) + \delta^{*\prime} W \delta^* \{(c+2)^2 - 4\} E\{\chi_{q+4}^{-4}(\Delta)\} - c \operatorname{trace}(W \Upsilon^*) \left[ 2E\{\chi_{q+2}^{-2}(\Delta)\} - c E\{\chi_{q+2}^{-4}(\Delta)\} \right], $$

and then, replacing $c$ by $q - 2$, we get $\mathrm{ADR}(\hat\theta^{S}, \theta; W)$ as stated in Theorem 2.2. Further, in a similar way, we establish $\mathrm{ADR}(\hat\theta^{S+}, \theta; W)$. □

References

Ahmed, S.E., 2001. Shrinkage estimation of regression coefficients from censored data with multiple observations. In: Ahmed, S.E., Reid, N. (Eds.), Empirical Bayes and Likelihood Inference. Springer, New York, pp. 103–120.
Ahmed, S.E., Volodin, A.I., Volodin, I.N., 2009. High order approximation for the coverage probability by confident set centered at the positive-part James–Stein estimator. Statistics and Probability Letters 79 (17), 1823–1828.
Judge, G.G., Bock, M.E., 1978. The Statistical Implication of Pre-test and Stein-rule Estimators in Econometrics. Amsterdam, North Holland.
Nkurunziza, S., Ahmed, S.E., 2009. Shrinkage drift parameter estimation for multi-factor Ornstein–Uhlenbeck processes. Applied Stochastic Models in Business and Industry (in press).
Qin, J., Lawless, J., 1994. Empirical likelihood and general estimating equations. The Annals of Statistics 22 (1), 300–325.
Qin, J., Lawless, J., 1995. Estimating equations, empirical likelihood and constraints on parameters. The Canadian Journal of Statistics 23 (2), 145–159.
Snowdon, A.W., Hussein, A., Purc-Stevenson, R., Bruce, B., Kolga, C., Boase, P., Howard, A., 2009. Are we there yet? Canada's progress towards achieving road safety vision 2010 for children travelling in vehicles. International Journal of Injury Control and Safety Promotion 16 (4), 231–237.
Styan, G.P.H., 1970. Notes on the distribution of quadratic forms in singular normal variables. Biometrika 57 (3), 567–572.
Zhong, B., Chen, J., Rao, J.N.K., 2000. Empirical likelihood inference in the presence of measurement error. The Canadian Journal of Statistics 28 (4), 841–852.