Applied Mathematics and Computation 249 (2014) 371–381
Efficiency of a stochastic restricted two-parameter estimator in linear regression

Yalian Li, Hu Yang

Department of Statistics and Actuarial Science, Chongqing University, Chongqing 401331, China
Abstract

Sakallıoğlu and Kaçiranlar (2008) proposed the two-parameter estimator as an alternative to the ordinary least squares, ordinary ridge and Liu estimators in the presence of multicollinearity. In this paper, we introduce a new class of estimators by combining the ideas underlying the mixed estimator and the two-parameter estimator when stochastic linear restrictions are assumed to hold. Necessary and sufficient conditions for the superiority of the new estimator over the two-parameter estimator, the modified mixed estimator and the stochastic restricted two-parameter estimator of Yang and Wu (2012) are derived under the matrix mean square error criterion. Furthermore, the selection of the biasing parameters is discussed, and two numerical examples and a Monte Carlo simulation are given to evaluate the performance of the estimators covered by the theoretical results.

© 2014 Elsevier Inc. All rights reserved.
Keywords: Multicollinearity; Two-parameter estimator; Matrix mean square error; Stochastic restriction
1. Introduction

Let us consider the linear regression model

\[ y = X\beta + \varepsilon, \tag{1} \]

where $y = (y_1, \ldots, y_n)'$ is a random vector of response variables with mean $E(y) = X\beta$ and covariance matrix $\mathrm{Cov}(y) = \sigma^2 I_n$, $X = (x_1, \ldots, x_n)'$ is an $n \times p$ regressor matrix of full column rank with $x_i = (x_{i1}, \ldots, x_{ip})'$ for $i = 1, \ldots, n$, $\beta$ is a $p \times 1$ vector of unknown regression coefficients, and $\varepsilon$ is an $n \times 1$ vector of disturbances. The ordinary least squares (OLS) estimator corresponds to minimizing the sum-of-squared-deviations objective
\[ W_1 = (y - X\beta)'(y - X\beta) = y'y - 2\beta'X'y + \beta'X'X\beta. \tag{2} \]

Minimizing the objective function (2) with respect to $\beta$, we obtain the OLS estimator

\[ \hat{\beta} = (X'X)^{-1}X'y, \tag{3} \]
which plays an important role in regression analysis theory. However, the OLS estimator is unstable and often gives misleading information in the presence of multicollinearity. In order to deal with multicollinearity, Hoerl and Kennard [1] proposed the ordinary ridge (OR) estimator

\[ \hat{\beta}(k) = (X'X + kI)^{-1}X'y, \quad k > 0. \tag{4} \]
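As a quick illustration of (3) and (4), the following minimal NumPy sketch computes both estimators on arbitrary simulated inputs. It is not part of the original paper, and we use Python here even though the authors' own computations were done in Matlab.

```python
# Illustrative sketch (not from the paper): the OLS estimator (3) and the
# ordinary ridge estimator (4) computed with NumPy on arbitrary simulated data.
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 4
X = rng.standard_normal((n, p))
beta_true = np.array([1.0, 0.5, -0.5, 2.0])
y = X @ beta_true + 0.1 * rng.standard_normal(n)

S = X.T @ X
beta_ols = np.linalg.solve(S, X.T @ y)                     # (3): (X'X)^{-1} X'y
k = 0.1
beta_ridge = np.linalg.solve(S + k * np.eye(p), X.T @ y)   # (4): (X'X + kI)^{-1} X'y
```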
Liu [2] proposed the Liu estimator, which combines the Stein estimator with the OR estimator,

\[ \hat{\beta}(d) = (X'X + I)^{-1}(X'y + d\hat{\beta}), \tag{5} \]
and both the OR and Liu estimators have become the most common methods for overcoming the weakness of the OLS estimator. Besides these two estimators, some other biased estimators were put forward in the literature as remedies, such as the ridge-2 estimator [3], the r-k class estimator [4,5], the r-d class estimator [6], the two-parameter estimator of Sakallıoğlu and Kaçiranlar [7] and the alternative two-parameter estimator of Özkale and Kaçiranlar [8].

Another way of solving the multicollinearity problem is to consider parameter estimation with some additional information on the unknown parameters, such as exact or stochastic restrictions [9]. Mostly, interest has centered on estimators obtained when exact restrictions are assumed to hold (e.g., [10-15]), and there are relatively few results for estimators with stochastic restrictions. However, as pointed out by Arashi and Tabatabaey [16], exact restrictions are often not suitable in much applied work involving economic relations, industrial structures, production planning, etc. Meanwhile, stochastic uncertainty occurs in specifying linear programming problems arising in economic and financial studies. In addition, prior information from a previous sample usually yields relations in the form of stochastic subspace restrictions. Therefore, we deal with stochastic restrictions in this article.

Suppose we are given prior information about $\beta$ in the form of a set of $m$ independent stochastic linear restrictions as follows:
\[ r = R\beta + e, \tag{6} \]
where $R$ is an $m \times p$ known matrix with $\mathrm{rank}(R) = m$, $e$ is an $m \times 1$ vector of disturbances with expectation $0$ and covariance matrix $\sigma^2 W$, $W$ is assumed to be known and positive definite, and the $m \times 1$ vector $r$ can be interpreted as a stochastic known vector. Further, it is assumed that $e$ is stochastically independent of $\varepsilon$. In the method of mixed estimation suggested by Theil and Goldberger [17], unifying the linear model (1) with the stochastic linear restrictions (6) gives
\[ \tilde{y} = \tilde{X}\beta + \tilde{\varepsilon}, \tag{7} \]

where

\[ \tilde{y} = \begin{pmatrix} y \\ r \end{pmatrix}, \quad \tilde{X} = \begin{pmatrix} X \\ R \end{pmatrix}, \quad \tilde{\varepsilon} = \begin{pmatrix} \varepsilon \\ e \end{pmatrix}, \quad \mathrm{Cov}(\tilde{\varepsilon}) = \sigma^2\Sigma, \quad \Sigma = \begin{pmatrix} I_n & 0 \\ 0 & W \end{pmatrix}. \]
Since $W$ is a positive definite matrix, $\Sigma$ is also positive definite. The ordinary mixed estimator of $\beta$ may be obtained by minimizing

\[ W_2 = (\tilde{y} - \tilde{X}\beta)'\Sigma^{-1}(\tilde{y} - \tilde{X}\beta) \]

with respect to $\beta$, which gives

\[ \hat{\beta}_{OME} = (S + R'W^{-1}R)^{-1}(X'y + R'W^{-1}r) = \hat{\beta} + S^{-1}R'(W + RS^{-1}R')^{-1}(r - R\hat{\beta}), \tag{8} \]
where $S = X'X$ and $\hat{\beta}$ is the OLS estimator in (3).

In order to solve the multicollinearity problem, Sakallıoğlu and Kaçiranlar [7] considered the following objective function:

\[ W_3 = (y - X\beta)'(y - X\beta) + (\beta - d\hat{\beta}(k))'(\beta - d\hat{\beta}(k)), \]

where $d$ is a constant. The resulting two-parameter (TP) estimator [7] is given by

\[ \hat{\beta}(k,d) = (S + I)^{-1}(X'y + d\hat{\beta}(k)). \tag{9} \]
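Continuing the sketch above, the mixed estimator (8) and the TP estimator (9) can be computed directly from their closed forms; the restriction quantities R, W and r below are hypothetical inputs chosen only for illustration.

```python
# Continuing the sketch above: the mixed estimator (8) and the TP estimator (9).
# R, W and r are hypothetical restriction inputs chosen only for illustration.
m = 1
R = np.ones((m, p))
W = np.eye(m)
r = R @ beta_true + 0.05 * rng.standard_normal(m)

S_inv = np.linalg.inv(S)
# (8): beta_OME = beta_OLS + S^{-1} R' (W + R S^{-1} R')^{-1} (r - R beta_OLS)
beta_ome = beta_ols + S_inv @ R.T @ np.linalg.solve(W + R @ S_inv @ R.T, r - R @ beta_ols)

d = 0.9
# (9): beta(k, d) = (S + I)^{-1} (X'y + d * beta(k))
beta_tp = np.linalg.solve(S + np.eye(p), X.T @ y + d * beta_ridge)
```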
Yang and Wu [18] proposed a stochastic restricted two-parameter (SRTP) estimator, which is obtained by replacing the OLS estimator (3) in Eq. (8) by the two-parameter estimator in Eq. (9):

\[ \hat{\beta}_{SR}(k,d) = \hat{\beta}(k,d) + S^{-1}R'(W + RS^{-1}R')^{-1}(r - R\hat{\beta}(k,d)). \tag{10} \]
The purpose of this article is to find a better estimator for overcoming multicollinearity in linear regression when stochastic restrictions are assumed to hold. In order to derive the alternative stochastic restricted two-parameter estimator, we propose the modified stochastic restricted two-parameter (MSRTP) estimator by augmenting the equation $d\hat{\beta}(k) = \beta + \varepsilon_0$ to Eq. (7), which yields the objective function

\[ W_4 = \sigma^{-2}(\tilde{y} - \tilde{X}\beta)'\Sigma^{-1}(\tilde{y} - \tilde{X}\beta) + \sigma^{-2}(\beta - d\hat{\beta}(k))'(\beta - d\hat{\beta}(k)), \]

combining the TP estimator and the ordinary mixed estimator, where $d$ is a constant. Differentiating $W_4$ with respect to $\beta$ leads to the following normal equations:
\[ \tilde{X}'\Sigma^{-1}\tilde{X}\beta - \tilde{X}'\Sigma^{-1}\tilde{y} + (\beta - d\hat{\beta}(k)) = 0. \tag{11} \]
If we use $\hat{\beta}_{MSR}(k,d)$ to denote the solution of the normal equations, then we get

\[ \hat{\beta}_{MSR}(k,d) = (\tilde{X}'\Sigma^{-1}\tilde{X} + I_p)^{-1}(\tilde{X}'\Sigma^{-1}\tilde{y} + d\hat{\beta}(k)), \tag{12} \]
which can also be rewritten as

\[ \hat{\beta}_{MSR}(k,d) = (X'X + R'W^{-1}R + I_p)^{-1}(X'y + R'W^{-1}r + d\hat{\beta}(k)). \tag{13} \]
We call $\hat{\beta}_{MSR}(k,d)$ the modified stochastic restricted two-parameter (MSRTP) estimator, where $k > 0$, $-\infty < d < \infty$, and $\hat{\beta}(k)$ is the OR estimator in (4). Letting $S(1) = X'X + I_p$ and noting that

\[ (X'X + R'W^{-1}R + I_p)^{-1} = S(1)^{-1} - S(1)^{-1}R'(W + RS(1)^{-1}R')^{-1}RS(1)^{-1}, \]

we can also write the MSRTP estimator as
\[ \hat{\beta}_{MSR}(k,d) = \hat{\beta}(k,d) + S(1)^{-1}R'(W + RS(1)^{-1}R')^{-1}(r - R\hat{\beta}(k,d)), \tag{14} \]
where $\hat{\beta}(k,d)$ is the two-parameter (TP) estimator in (9). Let $M(1) = S(1)^{-1} - S(1)^{-1}R'(W + RS(1)^{-1}R')^{-1}RS(1)^{-1}$, and write $S(k) = S + kI_p$ and $S(k+d) = S + (k+d)I_p$; then the MSRTP estimator can be rewritten as

\[ \hat{\beta}_{MSR}(k,d) = M(1)S(k+d)S(k)^{-1}S\hat{\beta} + S(1)^{-1}R'(W + RS(1)^{-1}R')^{-1}r. \tag{15} \]
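Continuing the same sketch, the MSRTP estimator can be computed from the closed form (13); as a sanity check, the equivalent correction form (14) should give the same vector.

```python
# Continuing the sketch: the MSRTP estimator from the closed form (13),
# checked against the equivalent correction form (14).
W_inv = np.linalg.inv(W)
G = S + R.T @ W_inv @ R + np.eye(p)
beta_msr = np.linalg.solve(G, X.T @ y + R.T @ W_inv @ r + d * beta_ridge)   # (13)

S1_inv = np.linalg.inv(S + np.eye(p))
corr = S1_inv @ R.T @ np.linalg.solve(W + R @ S1_inv @ R.T, r - R @ beta_tp)
assert np.allclose(beta_msr, beta_tp + corr)                                # (14)
```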
Sakallıoğlu and Kaçiranlar [7] pointed out that the TP estimator $\hat{\beta}(k,d)$ is a general estimator which includes the OLS, ordinary ridge (OR) and Liu estimators as special cases:

$\hat{\beta}(0,1) = (X'X)^{-1}X'y$ is the OLS estimator;
$\hat{\beta}(k,1-k) = (X'X + kI_p)^{-1}X'y$ is the OR estimator;
$\hat{\beta}(0,d) = (X'X + I_p)^{-1}(X'y + d\hat{\beta})$ is the Liu estimator when $0 < d < 1$.

Analogously, the following special cases of the new MSRTP estimator can be obtained:
\[ \hat{\beta}_{MSR}(0,1) = \hat{\beta} + S(1)^{-1}R'(W + RS(1)^{-1}R')^{-1}(r - R\hat{\beta}) = M(1)S(1)\hat{\beta} + S(1)^{-1}R'(W + RS(1)^{-1}R')^{-1}r, \tag{16} \]

which is called the modified mixed (MM) estimator and differs from the mixed estimator;
\[ \hat{\beta}_{MSR}(k,1-k) = \hat{\beta}(k) + S(1)^{-1}R'(W + RS(1)^{-1}R')^{-1}(r - R\hat{\beta}(k)) = M(1)S(1)S(k)^{-1}S\hat{\beta} + S(1)^{-1}R'(W + RS(1)^{-1}R')^{-1}r, \tag{17} \]

which is called the modified stochastic restricted ridge (MSRR) estimator and differs from the stochastic restricted ridge estimator introduced by Özkale [20];
\[ \hat{\beta}_{MSR}(0,d) = \hat{\beta}(d) + S(1)^{-1}R'(W + RS(1)^{-1}R')^{-1}(r - R\hat{\beta}(d)) = M(1)S(d)\hat{\beta} + S(1)^{-1}R'(W + RS(1)^{-1}R')^{-1}r, \tag{18} \]

which is called the modified stochastic restricted Liu (MSRL) estimator and differs from the stochastic restricted Liu estimator introduced by Hubert and Wijekoon [21]. Thus our MSRTP estimator is a new general stochastic restricted estimator.

In Section 2, we show that the proposed MSRTP estimator $\hat{\beta}_{MSR}(k,d)$ has superior properties over the TP estimator [7], the modified mixed (MM) estimator and the SRTP estimator [18] in the matrix mean square error (MMSE) sense. In Section 3, we discuss the selection of the parameters $k$ and $d$ by the mean squared error (MSE) criterion, and we illustrate some of these results with two numerical examples and a Monte Carlo simulation in Section 4. Finally, we give the conclusion in Section 5.

2. Superiority of the new biased estimator

To compare estimators of the unknown coefficient vector $\beta$, some criteria for gauging the performance of an estimator are needed. The bias and variance of an estimator $\tilde{\beta}$ are measured simultaneously by the matrix mean square error, which is defined as
\[ \mathrm{MMSE}(\tilde{\beta}) = E[(\tilde{\beta} - \beta)(\tilde{\beta} - \beta)'] = \mathrm{Cov}(\tilde{\beta}) + \mathrm{Bias}(\tilde{\beta})\mathrm{Bias}(\tilde{\beta})', \]

where the covariance matrix of an estimator $\tilde{\beta}$ is $\mathrm{Cov}(\tilde{\beta}) = E[(\tilde{\beta} - E(\tilde{\beta}))(\tilde{\beta} - E(\tilde{\beta}))']$ and the bias is $\mathrm{Bias}(\tilde{\beta}) = E(\tilde{\beta}) - \beta$. For a given value of $\beta$, $\tilde{\beta}_2$ is preferred to an alternative estimator $\tilde{\beta}_1$ when $\mathrm{MMSE}(\tilde{\beta}_1) - \mathrm{MMSE}(\tilde{\beta}_2)$ is a nonnegative definite (n.n.d.) matrix. Another criterion measuring the goodness of an estimator is
\[ \mathrm{MSE}(\tilde{\beta}) = \mathrm{tr}\,\mathrm{MMSE}(\tilde{\beta}) = \mathrm{tr}\,\mathrm{Cov}(\tilde{\beta}) + \mathrm{Bias}(\tilde{\beta})'\mathrm{Bias}(\tilde{\beta}), \]

which is called the MSE value of $\tilde{\beta}$. If $\mathrm{MMSE}(\tilde{\beta}_1) - \mathrm{MMSE}(\tilde{\beta}_2)$ is an n.n.d. matrix, then $\mathrm{MSE}(\tilde{\beta}_1) - \mathrm{MSE}(\tilde{\beta}_2) \ge 0$; the reverse conclusion does not necessarily hold. Therefore, the MMSE criterion is stronger than the MSE criterion [9], and comparing two estimators in the MMSE sense is more suitable. In the following three subsections we compare our new biased estimator $\hat{\beta}_{MSR}(k,d)$ with the TP estimator, the modified mixed (MM) estimator and the SRTP estimator in the MMSE sense.

Lemma 2.1 (Baksalary and Kala [19]). Assume the matrix $A$ is n.n.d. and $a$ is some vector. Then $A - aa' \ge 0$ if and only if $a'A^{+}a \le 1$ and $a \in \mathcal{R}(A)$.

Lemma 2.2 (Rao et al. [9]). Assume the matrices $A$ and $B$ are n.n.d. Then
\[ A - B \ \text{is n.n.d.} \iff \lambda_{\max}(BA^{-}) \le 1 \ \text{and} \ \mathcal{R}(B) \subseteq \mathcal{R}(A), \]

where $\lambda_{\max}(BA^{-})$ denotes the maximum eigenvalue of $BA^{-}$, the condition $\lambda_{\max}(BA^{-}) \le 1$ is invariant to the choice of $A^{-}$, $A^{-}$ stands for a generalized inverse of $A$, and $\mathcal{R}(A)$ denotes the column space spanned by $A$.

Lemma 2.3 (Rao et al. [9]). Assume $A$ is positive definite (p.d.) and $B$ is n.n.d. Then $A - B$ is n.n.d. $\iff \lambda_{\max}(BA^{-1}) \le 1$.
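As an aside (not in the original paper), Lemma 2.3 suggests a simple numerical check of whether $A - B$ is n.n.d. when $A$ is positive definite; a minimal helper:

```python
# Numerical companion to Lemma 2.3 (an aside, not in the paper): for p.d. A
# and n.n.d. B, A - B is n.n.d. iff lambda_max(B A^{-1}) <= 1. The eigenvalues
# of B A^{-1} are real because it is similar to the symmetric A^{-1/2} B A^{-1/2}.
import numpy as np

def nnd_difference(A, B, tol=1e-10):
    """Return True when A - B is n.n.d., with A positive definite."""
    return np.linalg.eigvals(B @ np.linalg.inv(A)).real.max() <= 1.0 + tol
```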
2.1. MMSE performance comparison between the MSRTP estimator and the TP estimator

From (9), we compute the bias vector and covariance matrix of the TP estimator:

\[ \mathrm{Bias}(\hat{\beta}(k,d)) = S(1)^{-1}(S(k+d)S(k)^{-1}S - S(1))\beta, \]

\[ \mathrm{Cov}(\hat{\beta}(k,d)) = \sigma^2 S(1)^{-1}S(k+d)S(k)^{-1}SS(k)^{-1}S(k+d)S(1)^{-1}, \]

where $S(1) = S + I_p$, $S(k+d) = S + (k+d)I_p$, $S(k) = S + kI_p$ and $S = X'X$. Then we can derive the bias vector and covariance matrix of the MSRTP estimator as follows:
\[ \mathrm{Bias}(\hat{\beta}_{MSR}(k,d)) = M(1)(S(k+d)S(k)^{-1}S - S(1))\beta, \]

\[ \mathrm{Cov}(\hat{\beta}_{MSR}(k,d)) = \sigma^2 M(1)S(k+d)S(k)^{-1}SS(k)^{-1}S(k+d)M(1) + \sigma^2 S(1)^{-1}R'(W + RS(1)^{-1}R')^{-1}W(W + RS(1)^{-1}R')^{-1}RS(1)^{-1}. \]
Now, let us consider the following MMSE difference:

\[ \Delta_1 = \mathrm{MMSE}(\hat{\beta}(k,d)) - \mathrm{MMSE}(\hat{\beta}_{MSR}(k,d)) = A - M(1)S(1)AS(1)M(1) - \sigma^2 S(1)^{-1}R'(W + RS(1)^{-1}R')^{-1}W(W + RS(1)^{-1}R')^{-1}RS(1)^{-1} = A - B, \]

where

\[ A = \mathrm{MMSE}(\hat{\beta}(k,d)), \qquad B = M(1)S(1)AS(1)M(1) + \sigma^2 S(1)^{-1}R'(W + RS(1)^{-1}R')^{-1}W(W + RS(1)^{-1}R')^{-1}RS(1)^{-1}. \]

Since $A$ and $B$ are n.n.d., a necessary and sufficient condition for $\Delta_1 \ge 0$ is $\lambda_{\max}(BA^{-}) \le 1$ and $\mathcal{R}(B) \subseteq \mathcal{R}(A)$, according to Lemma 2.2. Now we give the following theorem.

Theorem 2.1. The MSRTP estimator is superior to the TP estimator in the MMSE sense if and only if $\lambda_{\max}(BA^{-}) \le 1$ and $\mathcal{R}(B) \subseteq \mathcal{R}(A)$.

The MMSE comparison between the MSRTP estimator and the TP estimator includes three kinds of comparisons: the modified mixed (MM) estimator versus the OLS estimator; the modified stochastic restricted ridge (MSRR) estimator versus the OR estimator; and the modified stochastic restricted Liu (MSRL) estimator versus the Liu estimator.
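To make Theorem 2.1 concrete, the following sketch (continuing the running example, with the simulated $\beta$ and $\sigma^2 = 0.01$ playing the role of the known parameters) evaluates $A$ and $B$ and applies the eigenvalue test; since $A$ is positive definite here, $\mathcal{R}(B) \subseteq \mathcal{R}(A)$ holds automatically and Lemma 2.3 applies with the ordinary inverse.

```python
# Sketch of the Theorem 2.1 check on the running example. The simulated
# beta_true and sigma2 = 0.01 play the role of the known parameters that the
# theoretical MMSE expressions require.
sigma2 = 0.01
I_p = np.eye(p)
S1, Sk, Skd = S + I_p, S + k * I_p, S + (k + d) * I_p
S1_inv, Sk_inv = np.linalg.inv(S1), np.linalg.inv(Sk)

bias_tp = S1_inv @ (Skd @ Sk_inv @ S - S1) @ beta_true
cov_tp = sigma2 * S1_inv @ Skd @ Sk_inv @ S @ Sk_inv @ Skd @ S1_inv
A = cov_tp + np.outer(bias_tp, bias_tp)            # A = MMSE of the TP estimator

V = W + R @ S1_inv @ R.T
Q = S1_inv @ R.T @ np.linalg.inv(V)
M1 = S1_inv - Q @ R @ S1_inv
B = M1 @ S1 @ A @ S1 @ M1 + sigma2 * Q @ W @ Q.T   # B as defined above

print("MSRTP superior to TP:", nnd_difference(A, B))
```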
Corollary 2.1. A necessary and sufficient condition for the MM estimator to be superior to the OLS estimator in the MMSE sense is

\[ \lambda_{\max}\Big(S\big[M(1)S(1)AS(1)M(1) + \sigma^2 S(1)^{-1}R'(W + RS(1)^{-1}R')^{-1}W(W + RS(1)^{-1}R')^{-1}RS(1)^{-1}\big]\Big) \le \sigma^2, \]

where here $A = \mathrm{MMSE}(\hat{\beta}) = \sigma^2 S^{-1}$.
Corollary 2.2. A necessary and sufficient condition for the MSRR estimator to be superior to the ridge estimator in the MMSE sense is

\[ \lambda_{\max}\Big(F^{-1}\big[M(1)S(1)FS(1)M(1) + \sigma^2 S(1)^{-1}R'(W + RS(1)^{-1}R')^{-1}W(W + RS(1)^{-1}R')^{-1}RS(1)^{-1}\big]\Big) \le 1, \]

where $F = \sigma^2 S(k)^{-1}SS(k)^{-1} + k^2 S(k)^{-1}\beta\beta'S(k)^{-1}$.
Corollary 2.3. A necessary and sufficient condition for the MSRL estimator to be superior to the Liu estimator in the MMSE sense is

\[ \lambda_{\max}\Big(G^{-1}\big[M(1)S(1)GS(1)M(1) + \sigma^2 S(1)^{-1}R'(W + RS(1)^{-1}R')^{-1}W(W + RS(1)^{-1}R')^{-1}RS(1)^{-1}\big]\Big) \le 1, \]

where $G = \sigma^2 S(1)^{-1}S(d)S^{-1}S(d)S(1)^{-1} + (1-d)^2 S(1)^{-1}\beta\beta'S(1)^{-1}$, with $S(d) = S + dI_p$.

2.2. MMSE performance comparison of the MSRTP estimator and the MM estimator

From (13), we can rewrite the MSRTP estimator and its covariance matrix as
\[ \hat{\beta}_{MSR}(k,d) = (X'X + R'W^{-1}R + I_p)^{-1}\big[(I_p + dS(k)^{-1})X'y + R'W^{-1}r\big], \]

and
\[ \mathrm{Cov}(\hat{\beta}_{MSR}(k,d)) = \sigma^2 (S + R'W^{-1}R + I_p)^{-1}\big[(I_p + dS(k)^{-1})S(I_p + dS(k)^{-1}) + R'W^{-1}R\big](S + R'W^{-1}R + I_p)^{-1}. \]

As a special case, we can get the covariance matrix of the MM estimator by putting $k = 0$, $d = 1$ in $\mathrm{Cov}(\hat{\beta}_{MSR}(k,d))$, namely,
\[ \mathrm{Cov}(\hat{\beta}_{MSR}(0,1)) = \sigma^2 (S + R'W^{-1}R + I_p)^{-1}\big[(I_p + S^{-1})S(I_p + S^{-1}) + R'W^{-1}R\big](S + R'W^{-1}R + I_p)^{-1}. \]

Thus, the difference of the covariance matrices is given by
\[ \mathrm{Cov}(\hat{\beta}_{MSR}(0,1)) - \mathrm{Cov}(\hat{\beta}_{MSR}(k,d)) = \sigma^2 M(1)\big[S + R'W^{-1}R + 2I + S^{-1} - S - R'W^{-1}R - 2dS(k)^{-1}S - d^2S(k)^{-1}SS(k)^{-1}\big]M(1) = \sigma^2 M(1)\big[2I + S^{-1} - 2dS(k)^{-1}S - d^2S(k)^{-1}SS(k)^{-1}\big]M(1). \]

A necessary and sufficient condition for $\mathrm{Cov}(\hat{\beta}_{MSR}(0,1)) - \mathrm{Cov}(\hat{\beta}_{MSR}(k,d))$ to be a positive definite matrix is

\[ \lambda_{\max}\big((2dS(k)^{-1}S + d^2S(k)^{-1}SS(k)^{-1})(2I + S^{-1})^{-1}\big) < 1. \]
Now, let us further consider the MMSE comparison of the MSRTP and MM estimators:

\[ \Delta_2 = \mathrm{MMSE}(\hat{\beta}_{MSR}(0,1)) - \mathrm{MMSE}(\hat{\beta}_{MSR}(k,d)) = \mathrm{Cov}(\hat{\beta}_{MSR}(0,1)) - \mathrm{Cov}(\hat{\beta}_{MSR}(k,d)) - M(1)(S(k+d)S(k)^{-1}S - S(1))\beta\beta'(S(k+d)S(k)^{-1}S - S(1))M(1) = M(1)\big[\sigma^2 H - (S(k+d)S(k)^{-1}S - S(1))\beta\beta'(S(k+d)S(k)^{-1}S - S(1))\big]M(1), \]

where $H = 2I + S^{-1} - 2dS(k)^{-1}S - d^2S(k)^{-1}SS(k)^{-1}$. Using Lemma 2.1, we have the following theorem.

Theorem 2.2. If $\lambda_{\max}\big((2dS(k)^{-1}S + d^2S(k)^{-1}SS(k)^{-1})(2I + S^{-1})^{-1}\big) < 1$, then the MSRTP estimator $\hat{\beta}_{MSR}(k,d)$ is superior to the MM estimator in the MMSE sense if and only if

\[ \beta'(S(k+d)S(k)^{-1}S - S(1))H^{-1}(S(k+d)S(k)^{-1}S - S(1))\beta \le \sigma^2. \]
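A corresponding numerical check of Theorem 2.2, again on the running example:

```python
# Sketch of the Theorem 2.2 check (MSRTP vs. MM) on the running example;
# H follows the covariance-difference derivation above.
S_inv = np.linalg.inv(S)
P = 2 * d * Sk_inv @ S + d**2 * Sk_inv @ S @ Sk_inv   # 2d S(k)^{-1}S + d^2 S(k)^{-1}S S(k)^{-1}
H = 2 * I_p + S_inv - P
lam_ok = np.linalg.eigvals(P @ np.linalg.inv(2 * I_p + S_inv)).real.max() < 1.0
a = (Skd @ Sk_inv @ S - S1) @ beta_true
print("MSRTP superior to MM:", lam_ok and a @ np.linalg.solve(H, a) <= sigma2)
```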
2.3. MMSE performance comparison of the MSRTP estimator and the SRTP estimator

From (10), we can derive the bias vector and covariance matrix of the SRTP estimator as

\[ \mathrm{Bias}(\hat{\beta}_{SR}(k,d)) = (S + R'W^{-1}R)^{-1}(S(1)^{-1}S(k+d)S(k)^{-1}S - I_p)S\beta, \]

\[ \mathrm{Cov}(\hat{\beta}_{SR}(k,d)) = \sigma^2 (S + R'W^{-1}R)^{-1}\big[S(1)^{-1}S(k+d)S(k)^{-1}S^3S(k)^{-1}S(k+d)S(1)^{-1} + R'W^{-1}R\big](S + R'W^{-1}R)^{-1}. \]

Now, let us consider the following MMSE difference:

\[ \Delta_3 = \mathrm{MMSE}(\hat{\beta}_{SR}(k,d)) - \mathrm{MMSE}(\hat{\beta}_{MSR}(k,d)) = \mathrm{Cov}(\hat{\beta}_{SR}(k,d)) - \mathrm{Cov}(\hat{\beta}_{MSR}(k,d)) + \mathrm{Bias}(\hat{\beta}_{SR}(k,d))\mathrm{Bias}(\hat{\beta}_{SR}(k,d))' - \mathrm{Bias}(\hat{\beta}_{MSR}(k,d))\mathrm{Bias}(\hat{\beta}_{MSR}(k,d))' = \sigma^2 D + b_2b_2' - b_1b_1', \]

where

\[ D = E(NSN' + R'W^{-1}R)E - M(1)S(1)NS^{-1}N'S(1)M(1) - QWQ', \]

\[ E = (S + R'W^{-1}R)^{-1}, \quad Q = S(1)^{-1}R'(W + RS(1)^{-1}R')^{-1}, \quad N = S(1)^{-1}S(k+d)S(k)^{-1}S, \]

\[ b_2 = E(N - I_p)S\beta, \quad b_1 = M(1)S(1)(N - I_p)\beta. \]
Theorem 2.3. Let the MSRTP estimator $\hat{\beta}_{MSR}(k,d)$ be given, and suppose

\[ \lambda_{\max}\Big((M(1)S(1)NS^{-1}N'S(1)M(1) + QWQ')\big[E(NSN' + R'W^{-1}R)E\big]^{-1}\Big) < 1. \]

Then $\hat{\beta}_{MSR}(k,d)$ is superior to the SRTP estimator $\hat{\beta}_{SR}(k,d)$ in the MMSE sense, namely $\Delta_3 \ge 0$, if and only if $b_1'(\sigma^2 D + b_2b_2')^{-1}b_1 \le 1$.

Proof. Since

\[ E(NSN' + R'W^{-1}R)E > 0, \qquad M(1)S(1)NS^{-1}N'S(1)M(1) + QWQ' \ge 0, \]

it follows from Lemma 2.3 that $\sigma^2 D > 0$ whenever

\[ \lambda_{\max}\Big((M(1)S(1)NS^{-1}N'S(1)M(1) + QWQ')\big[E(NSN' + R'W^{-1}R)E\big]^{-1}\Big) < 1. \]

Applying Lemma 2.1, we then have $\Delta_3 \ge 0$ if and only if $b_1'(\sigma^2 D + b_2b_2')^{-1}b_1 \le 1$. $\square$

From the above theorems, we can conclude that the proposed MSRTP estimator can perform better than the TP, MM and SRTP estimators under certain conditions. However, all the results above depend on the unknown parameters $\beta$, $\sigma^2$, $k$ and $d$. How to replace these unknown parameters by suitable estimators is obviously a very important issue in applications.
3. Selection of the parameters k and d

In this section, we discuss the selection of the parameters $k$ and $d$ in the MSE sense. How to find appropriate parameters is a very important issue in applications. It is well known that a linear regression model can be transformed to a canonical form by an orthogonal transformation. Let $Z = XQ$ and $\alpha = Q'\beta$, where $Q$ is an orthogonal matrix such that $Z'Z = Q'X'XQ = Q'SQ = \Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_p)$ and $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p > 0$ are the ordered eigenvalues of $X'X$. Then we rewrite model (1) in canonical form as
\[ y = Z\alpha + \varepsilon. \]

Since $\hat{\alpha}_{MSR}(k,d) = Q'\hat{\beta}_{MSR}(k,d)$ and $\mathrm{MMSE}(\hat{\alpha}_{MSR}(k,d)) = Q'\,\mathrm{MMSE}(\hat{\beta}_{MSR}(k,d))\,Q$, from (13) the MMSE of $\hat{\alpha}_{MSR}(k,d)$ can be written as
\[ \mathrm{MMSE}(\hat{\alpha}_{MSR}(k,d)) = \sigma^2(\Lambda + I + C)^{-1}\big[C + \Lambda + 2d(\Lambda + kI)^{-1}\Lambda + d^2(\Lambda + kI)^{-1}\Lambda(\Lambda + kI)^{-1}\big](\Lambda + I + C)^{-1} + (\Lambda + I + C)^{-1}\big(d(\Lambda + kI)^{-1}\Lambda - I\big)\alpha\alpha'\big(d(\Lambda + kI)^{-1}\Lambda - I\big)(\Lambda + I + C)^{-1}, \tag{19} \]

where $Q'R'W^{-1}RQ = C = \mathrm{diag}(t_1, \ldots, t_p)$. Optimal values for $k$ and $d$ can be derived by minimizing
\[ h(k,d) = \mathrm{tr}\,\mathrm{MMSE}(\hat{\alpha}_{MSR}(k,d)) = \sum_{i=1}^{p} \frac{\sigma^2\big[(\lambda_i + k)^2(\lambda_i + t_i) + 2d\lambda_i(\lambda_i + k) + d^2\lambda_i\big] + (d\lambda_i - \lambda_i - k)^2\alpha_i^2}{(\lambda_i + 1 + t_i)^2(\lambda_i + k)^2}. \tag{20} \]
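For reference, (20) is straightforward to evaluate numerically; in the sketch below (an illustration, not the authors' code), lam, t, alpha and sigma2 stand for the $\lambda_i$, $t_i$, $\alpha_i$ and $\sigma^2$.

```python
# Illustrative evaluation of (20); lam, t, alpha and sigma2 stand for the
# lambda_i, t_i, alpha_i and sigma^2 (assumed known or estimated inputs).
import numpy as np

def h(k, d, lam, t, alpha, sigma2):
    """tr MMSE of the MSRTP estimator in canonical form, Eq. (20)."""
    num = (sigma2 * ((lam + k) ** 2 * (lam + t) + 2 * d * lam * (lam + k) + d**2 * lam)
           + (d * lam - lam - k) ** 2 * alpha**2)
    den = (lam + 1 + t) ** 2 * (lam + k) ** 2
    return np.sum(num / den)
```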
Table 1
Total National Research and Development Expenditures as a Percent of Gross National Product by Country: 1972-1986.

Year    Y (USA)    X1 (France)    X2 (West Germany)    X3 (Japan)    X4 (Soviet Russia)
1972    2.3        1.9            2.2                  1.9           3.7
1975    2.2        1.8            2.2                  2.0           3.8
1979    2.2        1.8            2.4                  2.1           3.6
1980    2.3        1.8            2.4                  2.2           3.8
1981    2.4        2.0            2.5                  2.3           3.8
1982    2.5        2.1            2.6                  2.4           3.7
1983    2.6        2.1            2.6                  2.6           3.8
1984    2.6        2.2            2.6                  2.6           4.0
1985    2.7        2.3            2.8                  2.8           3.7
1986    2.7        2.3            2.7                  2.8           3.8
The value of $k$ which minimizes $h(k,d)$ for fixed $d$ can be found by differentiating $h(k,d)$ with respect to $k$:

\[ \frac{\partial h(k,d)}{\partial k} = -\sum_{i=1}^{p} \frac{2\sigma^2 d\lambda_i(\lambda_i + k) + 2\sigma^2 d^2\lambda_i + 2d\lambda_i(d\lambda_i - \lambda_i - k)\alpha_i^2}{(\lambda_i + 1 + t_i)^2(\lambda_i + k)^3}. \]

Setting $\partial h(k,d)/\partial k = 0$, we have

\[ k = \frac{(1-d)\lambda_i\alpha_i^2 - (d + \lambda_i)\sigma^2}{\sigma^2 - \alpha_i^2}. \tag{21} \]
The optimal value of $k$ in (21) depends on the unknown $\sigma^2$ and $\alpha_i^2$ when $d$ is fixed. For practical purposes, we replace them by their unbiased estimators, as suggested by Hoerl and Kennard [1] and Kibria [24], and obtain

\[ \hat{k}_i = \frac{(1-d)\lambda_i\hat{\alpha}_i^2 - (d + \lambda_i)\hat{\sigma}^2}{\hat{\sigma}^2 - \hat{\alpha}_i^2}. \tag{22} \]
Now we can propose estimators of $k$ by taking the arithmetic and geometric means of the $\hat{k}_i$ values, as Kibria [24] did, that is,

\[ \hat{k}_{AM} = \frac{1}{p}\sum_{i=1}^{p} \frac{(1-d)\lambda_i\hat{\alpha}_i^2 - (d + \lambda_i)\hat{\sigma}^2}{\hat{\sigma}^2 - \hat{\alpha}_i^2}, \tag{23} \]

\[ \hat{k}_{GM} = \left(\prod_{i=1}^{p} \frac{(1-d)\lambda_i\hat{\alpha}_i^2 - (d + \lambda_i)\hat{\sigma}^2}{\hat{\sigma}^2 - \hat{\alpha}_i^2}\right)^{1/p}, \tag{24} \]

which are the arithmetic mean and the geometric mean of the $\hat{k}_i$ values, respectively. Similarly, the value of $d$ which minimizes $h(k,d)$ for a fixed value of $k$ can be derived by differentiating $h(k,d)$ with respect to $d$:
\[ \frac{\partial h(k,d)}{\partial d} = \sum_{i=1}^{p} \frac{2\sigma^2\lambda_i(\lambda_i + k) + 2d\sigma^2\lambda_i + 2\lambda_i(d\lambda_i - \lambda_i - k)\alpha_i^2}{(\lambda_i + 1 + t_i)^2(\lambda_i + k)^2}. \]

Setting this equal to zero and replacing the unknown parameters $\sigma^2$ and $\alpha_i^2$ by their unbiased estimators, we obtain the optimal estimator of $d$ for a fixed value of $k$ as

\[ \hat{d}_{opt} = \frac{\sum_{i=1}^{p} \lambda_i(\lambda_i + k)(\hat{\alpha}_i^2 - \hat{\sigma}^2)}{\sum_{i=1}^{p} \lambda_i(\hat{\sigma}^2 + \lambda_i\hat{\alpha}_i^2)}. \tag{25} \]
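The estimators (22)-(25) can be coded directly; the sketch below is an illustrative NumPy version (the paper's own computations used Matlab), with lam, alpha_hat and sigma2_hat as assumed inputs from the canonical-form OLS fit.

```python
# Illustrative NumPy versions of (22)-(25); lam, alpha_hat and sigma2_hat
# are assumed inputs from the canonical-form OLS fit.
import numpy as np

def k_hat(d, lam, alpha_hat, sigma2_hat):
    """Per-component k-hat from (22)."""
    return ((1 - d) * lam * alpha_hat**2 - (d + lam) * sigma2_hat) / (sigma2_hat - alpha_hat**2)

def k_am(d, lam, alpha_hat, sigma2_hat):
    return k_hat(d, lam, alpha_hat, sigma2_hat).mean()             # (23)

def k_gm(d, lam, alpha_hat, sigma2_hat):
    kk = k_hat(d, lam, alpha_hat, sigma2_hat)
    return np.exp(np.log(kk).mean())                               # (24); needs all k_i > 0, cf. Theorem 3.1

def d_opt(k, lam, alpha_hat, sigma2_hat):
    num = np.sum(lam * (lam + k) * (alpha_hat**2 - sigma2_hat))
    den = np.sum(lam * (sigma2_hat + lam * alpha_hat**2))
    return num / den                                               # (25)
```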
Since $k$ is required to be positive, if we put a constraint on the $k$ values in (21) so that they are positive, then positive values of the estimators in Eqs. (22)-(24) can be obtained. In this manner, we have the following theorem.

Theorem 3.1. If

\[ \hat{d} < \min_i \frac{\lambda_i(\hat{\alpha}_i^2 - \hat{\sigma}^2)}{\lambda_i\hat{\alpha}_i^2 + \hat{\sigma}^2} \tag{26} \]

for all $i$, then $\hat{k}_{AM}$ and $\hat{k}_{GM}$ are always positive.

Proof. The values of $k$ in (21) are positive if $\big[(1-d)\lambda_i\alpha_i^2 - (d + \lambda_i)\sigma^2\big]/(\sigma^2 - \alpha_i^2) > 0$, from which we get $d < \lambda_i(\alpha_i^2 - \sigma^2)/(\lambda_i\alpha_i^2 + \sigma^2)$ for all $i$. This inequality depends on the unknown parameters $\sigma^2$ and $\alpha_i^2$, so $d$ cannot be obtained from it directly. A suitable bound for $d$ is derived as $\hat{d} < \min_i \lambda_i(\hat{\alpha}_i^2 - \hat{\sigma}^2)/(\lambda_i\hat{\alpha}_i^2 + \hat{\sigma}^2)$ by replacing $\sigma^2$ and $\alpha_i^2$ with their unbiased estimators. As $\hat{k}_{AM}$ and $\hat{k}_{GM}$ are the arithmetic and geometric means of the $\hat{k}_i$ in (22), they are always positive when $\hat{d} < \min_i \lambda_i(\hat{\alpha}_i^2 - \hat{\sigma}^2)/(\lambda_i\hat{\alpha}_i^2 + \hat{\sigma}^2)$. $\square$
Table 2
MSE values for the MM, TP, SRTP and MSRTP estimators when d = 0.9 in Example 1.

k        0        0.001    0.005    0.01     0.05     0.09     0.1
MM       0.0426   0.0426   0.0426   0.0426   0.0426   0.0426   0.0426
TP       0.0684   0.0663   0.0610   0.0593   0.0902   0.1191   0.1247
SRTP     0.0374   0.0356   0.0296   0.0242   0.0093   0.0061   0.0058
MSRTP    0.0008   0.0009   0.0010   0.0013   0.0033   0.0046   0.0048

Table 3
MSE values for the MM, TP, SRTP and MSRTP estimators when k = 0.05 in Example 1.

d        0.8      1        1.5      2        2.5      3        3.5
MM       0.0426   0.0426   0.0426   0.0426   0.0426   0.0426   0.0426
TP       0.1001   0.0817   0.0604   0.0746   0.1241   0.2090   0.3294
SRTP     0.0082   0.0106   0.0227   0.0433   0.0725   0.1103   0.1567
MSRTP    0.0042   0.0027   0.0028   0.0085   0.0198   0.0368   0.0593
Fig. 1. The MSE values of the TP, MM, SRTP and MSRTP estimators when d = 0.9.
Now the optimal values of $k$ and $d$ in $\hat{\beta}_{MSR}(k,d)$ can be found by applying the following method:

Step 1. Calculate $\hat{d}$ from Eq. (26).
Step 2. Estimate $\hat{k}_{AM}$ or $\hat{k}_{GM}$ by using $\hat{d}$ from Step 1.
Step 3. Obtain $\hat{d}_{opt}$ from (25) by using the estimator $\hat{k}_{AM}$ or $\hat{k}_{GM}$ from Step 2.

4. Numerical examples and Monte Carlo simulation

In order to illustrate our theoretical results, we now consider two numerical examples and a Monte Carlo simulation in this section. Our computations here and below were all performed using Matlab R2012a.

As a first numerical example illustrating the performance of our new MSRTP estimator, we consider a dataset on Total National Research and Development Expenditures as a Percent of Gross National Product, originally due to Gruber [22] and also analyzed by Akdeniz and Erol [23] (see Table 1). From these data, we find the following results:
(a) The eigenvalues of $X'X$:

\[ \lambda_1 = 302.9626, \quad \lambda_2 = 0.7283, \quad \lambda_3 = 0.0446, \quad \lambda_4 = 0.0345. \]

(b) The OLS estimates of $\beta$ and $\sigma^2$:

\[ \hat{\beta} = S^{-1}X'y = (0.6455, 0.0896, 0.1436, 0.1526)', \]

with $\mathrm{MSE}(\hat{\beta}) = 0.0808$ and $\hat{\sigma}^2 = 0.0015$.

We consider the following stochastic linear restriction:

\[ r = R\beta + e, \quad R = (1\ 2\ 2\ 2), \quad e \sim N(0, \hat{\sigma}^2_{OLS}). \]
For the TP, MM, SRTP and MSRTP estimators, the estimated MSE values are obtained by replacing all unknown model parameters in the corresponding theoretical MSE expressions by their OLS estimates (see Tables 2 and 3 and Fig. 1). From Tables 2 and 3 and Fig. 1, we can see that the estimated MSE value of the MSRTP estimator is much smaller than those of the TP, MM and SRTP estimators.

As a second numerical example, we consider a dataset on Portland cement originally due to Woods et al. [25], which has been analyzed by Kaçiranlar et al. [12]. These data come from an experimental investigation of the heat evolved during the setting and hardening of Portland cements of varied composition and the dependence of this heat on the percentages of four compounds in the clinkers from which the cement was produced. The four compounds considered are tricalcium aluminate (3CaO.Al2O3), tricalcium silicate (3CaO.SiO2), tetracalcium aluminoferrite (4CaO.Al2O3.Fe2O3) and beta-dicalcium silicate (2CaO.SiO2), denoted by x1, x2, x3 and x4, respectively. The heat evolved after 180 days of curing, denoted by Y, is measured in calories per gram of cement. We assemble the data as follows:
\[ X = (x_1, x_2, x_3, x_4) = \begin{pmatrix} 7 & 26 & 6 & 60 \\ 1 & 29 & 15 & 52 \\ 11 & 56 & 8 & 20 \\ 11 & 31 & 8 & 47 \\ 7 & 52 & 6 & 33 \\ 11 & 55 & 9 & 22 \\ 3 & 71 & 17 & 6 \\ 1 & 31 & 22 & 44 \\ 2 & 54 & 18 & 22 \\ 21 & 47 & 4 & 26 \\ 1 & 40 & 23 & 34 \\ 11 & 66 & 9 & 12 \\ 10 & 68 & 8 & 12 \end{pmatrix}, \qquad Y = \begin{pmatrix} 78.5 \\ 74.3 \\ 104.3 \\ 87.6 \\ 95.9 \\ 109.2 \\ 102.7 \\ 72.5 \\ 93.1 \\ 115.9 \\ 83.8 \\ 113.3 \\ 109.4 \end{pmatrix}. \]
From Tables 4 and 5, we can again see that the estimated MSE value of the MSRTP estimator is much smaller than those of the TP, MM and SRTP estimators.

To further illustrate the behavior of our proposed estimator, we perform a Monte Carlo simulation study under different levels of multicollinearity. In this study, the explanatory variables and a response variable are generated as

\[ x_{ij} = (1 - \rho^2)^{1/2}z_{ij} + \rho z_{i(p+1)}, \quad i = 1, \ldots, n, \ j = 1, \ldots, p, \]

\[ y_i = (1 - \rho^2)^{1/2}z_{ip} + \rho z_{i(p+1)}, \]

where the $z_{ij}$ and $z_{i(p+1)}$ are independent standard normal pseudo-random numbers and $\rho$ is specified so that the theoretical correlation between any two explanatory variables is $\rho^2$. In this study we consider $n = 100$, $p = 4$, $\rho = 0.99, 0.999$ and the same stochastic linear restrictions as before.
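A sketch of this generation scheme follows (our reading of the garbled equations; the seed and exact indexing are assumptions, and the authors' own simulation was run in Matlab):

```python
# Sketch of the data-generation scheme above (the seed and exact indexing
# are assumptions; the authors' own simulation was run in Matlab).
import numpy as np

rng = np.random.default_rng(1)
n, p, rho = 100, 4, 0.99
Z = rng.standard_normal((n, p + 1))
X = np.sqrt(1 - rho**2) * Z[:, :p] + rho * Z[:, [p]]   # x_ij, j = 1, ..., p
y = np.sqrt(1 - rho**2) * Z[:, p - 1] + rho * Z[:, p]  # y_i, as printed

# sample correlations between the regressors cluster around rho^2
print(np.corrcoef(X, rowvar=False).round(2))
```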
Table 4
MSE values for the MM, TP, SRTP and MSRTP estimators when d = 0.9 in Example 2.

k        0        0.001    0.005    0.01     0.05     0.09     0.1
MM       0.0627   0.0627   0.0627   0.0627   0.0627   0.0627   0.0627
TP       0.0637   0.0637   0.0637   0.0637   0.0637   0.0637   0.0637
SRTP     0.0636   0.0636   0.0636   0.0636   0.0636   0.0636   0.0636
MSRTP    0.0627   0.0627   0.0627   0.0627   0.0627   0.0627   0.0627
Table 5
MSE values for the MM, TP, SRTP and MSRTP estimators when k = 0.05 in Example 2.

d        0.8      1        1.5      2        2.5      3        3.5
MM       0.0627   0.0627   0.0627   0.0627   0.0627   0.0627   0.0627
TP       0.0636   0.0638   0.0644   0.0651   0.0661   0.0671   0.0683
SRTP     0.0635   0.0637   0.0643   0.0651   0.0660   0.0671   0.0683
MSRTP    0.0627   0.0627   0.0627   0.0630   0.0634   0.0639   0.0646
Table 6
MSE values for the MM, TP, SRTP and MSRTP estimators when rho = 0.99.

         d = 0.5                      d = 2                        d = 3
k        0.001    0.01     0.1       0.001    0.01     0.1        0.001    0.01     0.1
MM       0.0148   0.0148   0.0148    0.0148   0.0148   0.0148     0.0148   0.0148   0.0148
TP       0.0251   0.0250   0.0246    0.0649   0.0646   0.0615     0.1055   0.1048   0.0987
SRTP     0.0181   0.0180   0.0178    0.0406   0.0404   0.0389     0.0621   0.0618   0.0588
MSRTP    0.0126   0.0126   0.0126    0.0129   0.0129   0.0128     0.0141   0.0141   0.0139
Table 7
MSE values for the MM, TP, SRTP and MSRTP estimators when rho = 0.999.

         d = 0.5                      d = 0.8                      d = 1
k        0.001    0.01     0.1       0.001    0.01     0.1        0.001    0.01     0.1
MM       0.0155   0.0155   0.0155    0.0155   0.0155   0.0155     0.0155   0.0155   0.0155
TP       0.0144   0.0139   0.0113    0.0249   0.0234   0.0154     0.0353   0.0328   0.0197
SRTP     0.0085   0.0081   0.0061    0.0153   0.0145   0.0094     0.0216   0.0202   0.0125
MSRTP    0.0017   0.0017   0.0021    0.0011   0.0012   0.0015     0.0010   0.0010   0.0013
Fig. 2. The MSE values of the TP, MM, SRTP and MSRTP estimators when d = 0.9, rho = 0.999.
For the two levels of multicollinearity, the MSE values of the TP, MM, SRTP and MSRTP estimators for various values of k and d are reported in Tables 6 and 7 and Fig. 2. From Tables 6 and 7 and Fig. 2, we may conclude that the new estimator is superior to the TP, MM and SRTP estimators. From the above analysis we see that, in practice, we can choose a small k and regulate d, which agrees with our theoretical findings. Thus, our estimator is meaningful in practice.

5. Conclusion

In this paper, we introduce a modified stochastic restricted two-parameter (MSRTP) estimator for the unknown parameter vector in linear regression. Some necessary and sufficient conditions are derived for the proposed estimator to be superior to the TP, MM and SRTP estimators under the MMSE criterion. We also present several methods of estimating the shrinkage parameters and illustrate the superiority of the new estimator with two numerical examples and a Monte Carlo simulation. Thus, researchers who are inclined to use the new MSRTP estimator can judge which estimator performs better in the MSE sense by examining the numerical validation.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 11201505) and the Fundamental Research Funds for the Central Universities (No. 0208005205012).

References

[1] A.E. Hoerl, R.W. Kennard, Ridge regression: biased estimation for non-orthogonal problems, Technometrics 12 (1970) 55-67.
[2] K. Liu, A new class of biased estimate in linear regression, Commun. Stat. Theory Methods 22 (1993) 393-402.
[3] S. Toker, S. Kaçiranlar, On the performance of two parameter ridge estimator under the mean square error criterion, Appl. Math. Comput. 219 (2013) 4718-4728.
[4] G.U. Şiray, S. Sakallıoğlu, Superiority of the r-k class estimator over some estimators in a linear model, Commun. Stat. Theory Methods 41 (15) (2012) 2819-2832.
[5] G.U. Şiray, S. Kaçiranlar, S. Sakallıoğlu, r-k class estimator in the linear regression model with correlated errors, Stat. Pap. 55 (2) (2014) 393-407.
[6] S. Kaçiranlar, S. Sakallıoğlu, Combining the Liu estimator and the principal component regression estimator, Commun. Stat. Theory Methods 30 (2001) 2699-2705.
[7] S. Sakallıoğlu, S. Kaçiranlar, A new biased estimator based on ridge estimation, Stat. Pap. 49 (2008) 669-689.
[8] M. Özkale, S. Kaçiranlar, The restricted and unrestricted two-parameter estimators, Commun. Stat. Theory Methods 36 (2007) 2707-2725.
[9] C.R. Rao, H. Toutenburg, Shalabh, C. Heumann, Linear Models and Generalizations: Least Squares and Alternatives, Springer-Verlag, New York, 2008.
[10] N. Sarkar, A new estimator combining the ridge regression and the restricted least squares methods of estimation, Commun. Stat. Theory Methods 21 (1992) 1987-2000.
[11] J. Groß, Restricted ridge estimation, Stat. Probab. Lett. 65 (2003) 57-64.
[12] S. Kaçiranlar, S. Sakallıoğlu, F. Akdeniz, G.P.H. Styan, H.J. Werner, A new biased estimator in linear regression and a detailed analysis of the widely-analysed dataset on Portland cement, Sankhya Ser. B Ind. J. Stat. 61 (1999) 443-459.
[13] Z. Zhong, H. Yang, Ridge estimation to the restricted linear model, Commun. Stat. Theory Methods 36 (2007) 2099-2115.
[14] M. Özkale, Comment on ridge estimation to the restricted linear model, Commun. Stat. Theory Methods 38 (7) (2009) 1094-1097.
[15] X. Liu, F. Gao, J. Xu, Linearized restricted ridge regression estimator in linear regression, Commun. Stat. Theory Methods 41 (24) (2012) 4503-4514.
[16] M. Arashi, S. Tabatabaey, Stein-type improvement under stochastic constraints: use of multivariate Student-t model in regression, Stat. Probab. Lett. 78 (2008) 2142-2153.
[17] H. Theil, A. Goldberger, On pure and mixed statistical estimation in economics, Int. Econ. Rev. 2 (1961) 65-78.
[18] H. Yang, J. Wu, A stochastic restricted k-d class estimator, Statistics 46 (2012) 759-766.
[19] J.K. Baksalary, R. Kala, Partial orderings between matrices one of which is of rank one, Bull. Polish Acad. Sci. Math. 31 (1983) 5-7.
[20] M. Özkale, Stochastic restricted ridge regression estimator, J. Multivariate Anal. 100 (2009) 1706-1716.
[21] M. Hubert, P. Wijekoon, Improvement of the Liu estimator in linear regression model, Stat. Pap. 47 (2006) 471-479.
[22] M. Gruber, Improving Efficiency by Shrinkage: The James-Stein and Ridge Regression Estimators, Marcel Dekker, Inc., New York, 1998.
[23] F. Akdeniz, H. Erol, Mean squared error matrix comparisons of some biased estimators in linear regression, Commun. Stat. Theory Methods 32 (2003) 2389-2413.
[24] B.M.G. Kibria, Performance of some new ridge regression estimators, Commun. Stat. Simul. Comput. 32 (2003) 419-435.
[25] H. Woods, H. Steinour, H.R. Starke, Effect of composition of Portland cement on heat evolved during hardening, Ind. Eng. Chem. 24 (1932) 1207-1241.