Journal of Statistical Planning and Inference 141 (2011) 189–196
Improved Liu estimator in a linear regression model
Xu-Qing Liu
Faculty of Mathematics and Physics, Huaiyin Institute of Technology, Huai'an 223003, PR China
Article history: Received 23 October 2009; received in revised form 26 May 2010; accepted 31 May 2010; available online 8 June 2010.
Keywords: Liu estimator; Improved Liu estimator; Linear regression model; PRESS criterion
Abstract
In this paper, we introduce the notion of the improved Liu estimator (ILE) in the linear regression model $y = X\beta + \varepsilon$. The selection of the biasing parameters is investigated under the PRESS criterion, and the optimal selection is derived. A simulation study shows the performance of ILE compared with the ordinary least squares estimator and the Liu estimator. Finally, the main results are applied to the Hald data. © 2010 Elsevier B.V. All rights reserved.
1. Introduction
Consider a linear regression model
$$y = X\beta + \varepsilon, \tag{1.1}$$
where $y = (y_1, \ldots, y_n)'$ is an $n$-dimensional random vector of response variables with mean $E(y) = X\beta$ and covariance matrix $D(y) = \sigma^2 I_n$, $X = (x_1, \ldots, x_n)'$ with $x_i = (x_{i1}, \ldots, x_{ip})'$ for $i = 1, \ldots, n$ is the $n \times p$ regressor matrix of full column rank, $\beta$ is a vector of unknown parameters and $\sigma^2$ is an unknown constant. It is well known that the ordinary least squares estimator (OLSE) for $\beta$ is written as
$$\hat\beta_{\mathrm{OLS}} = (X'X)^{-1}X'y,$$
which is the best unbiased estimator when the $x_i$'s are uncorrelated. However, when multicollinearity is present in the model (1.1), OLSE performs very poorly under the mean squared error (MSE) criterion. To overcome multicollinearity, various biased estimators have been put forward in the literature. Among them, the ordinary ridge regression estimator (ORRE; cf. Hoerl and Kennard, 1970) and the Liu estimator (LE; Liu, 1993; Akdeniz and Kaçıranlar, 1995) are two popular linear biased estimators, which can locally improve OLSE with respect to the MSE criterion by an appropriate choice of the ridge parameter and the Liu parameter, respectively. Some nonlinear biased estimators such as the double-k class estimator (DKCE; cf. Ullah and Ullah, 1978) can globally improve OLSE under some conditions, but the conditions may fail in a practical problem. Moreover, the MSE of DKCE is complicated to compute and only an estimated value of it can finally be derived, which makes it difficult to choose the parameters involved in DKCE.
As is well known, any linear biased estimator that includes biasing parameter(s) reduces to a nonlinear estimator once the parameters have been selected. On the other hand, the MSE of a nonlinear estimator is hard to calculate for the following three reasons:
(i) Calculating the MSE of a nonlinear estimator requires the distribution of the response variables, which in general is not easy to determine exactly in a practical problem.
(ii) Even if the distribution of the model is known, the computations are still difficult. Usually, the normality assumption is imposed on the model, but such a hypothesis needs testing.
(iii) The expression of the MSE includes the regression parameters and the variance, which are unknown and need to be replaced by corresponding estimates. Therefore, the MSE value is in fact an estimated MSE.
For the above reasons, we recommend using the PRESS criterion, or prediction-oriented criterion (Myers, 2005, pp. 165–172), which is based on the performance of prediction and does not have the above disadvantages. The main purpose of this paper is to introduce a new biased estimator by investigating the optimal selection of the biasing parameters with respect to (w.r.t.) the PRESS criterion. Noting that both ORRE and DKCE are nonlinear functions of the parameter(s) involved while LE is a linear function of the Liu parameter, we choose LE as the object of study and generalize it.
The rest of the paper is organized as follows. In Section 2, we briefly introduce LE and the PRESS criterion and then put forward a new estimator derived from LE. The optimal selection of the biasing parameters is found under the PRESS criterion. In Section 3, a simulation study carried out with Matlab shows the superiority of the new estimator over OLSE and LE. The main results of the paper are applied to the Hald data in Section 4. Finally, we discuss the detection of influential observations and outliers.
Research supported by Grant HGC0923 and the "Green & Blue Project" Program for 2008 to Cultivate Young Core Instructors from the Huaiyin Institute of Technology.
0378-3758/$ - see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.jspi.2010.05.030
2. Improved Liu estimator
LE was proposed by Liu (1993) and has received great attention in the literature. For example, Kaçıranlar et al. (1999) introduced the restricted LE and discussed its stochastic properties. Hubert and Wijekoon (2006) improved on LE and the restricted LE, and proposed the stochastic restricted LE. The restricted LE and the stochastic restricted LE were studied by their respective authors under the MSE criterion and the MSE matrix criterion, respectively. Kaçıranlar and Sakallıoğlu (2001) combined LE with the principal component regression estimator and investigated the properties of the resulting estimator. As a generalized version of LE and ORRE, the Liu-type estimator was considered by Liu (2003). See Arslan and Billor (2000) and Torigoe and Ujiie (2006) for further studies. By Liu (1993) and Akdeniz and Kaçıranlar (1995), LE is expressible as
$$\hat\beta_d = (X'X + I)^{-1}(X'X + dI)\hat\beta_{\mathrm{OLS}}, \tag{2.1}$$
where $d \in [0,1]$ is the biasing parameter. Clearly, LE reduces to OLSE for $d = 1$. When $d \in [0,1)$, LE is a shrunken estimator of OLSE. Here, we argue that the interval restriction on $d$ is not necessary since the optimal selection of $d$ may be less than 0 or larger than 1. This assertion is similar to the argument given by Druilhet and Mom (2008, p. 233). As stated by Özkale and Kaçıranlar (2007, p. 1891), LE is a linear function of the biasing parameter $d$ and thus the selection of $d$ is relatively simple.
The prediction error sum of squares (PRESS) statistic was put forward by Allen (1971, 1974) and is usually used in the literature to detect highly influential observations. See also Myers (2005, p. 172) for details. The PRESS statistic is defined as the sum of squares of the $n$ PRESS residuals of the form
$$\hat e_{i,-i} = y_i - \hat y_{i,-i} = y_i - x_i' b_{-i},$$
where $\hat y_{i,-i}$ is the predicted value of $y_i$ based on the observations with $y_i$ set aside, and $b_{-i}$ is the estimate of $\beta$ obtained by deleting the $i$th observation from the data set. In particular, if we use OLSE to fit the model, then $b_{-i}$ has the form
$$b_{-i}^{(\mathrm{OLS})} = (X'X - x_i x_i')^{-1}(X'y - x_i y_i).$$
In this case, the $i$th PRESS residual based on OLSE is
$$\hat e_{i,-i}^{(\mathrm{OLS})} = \frac{y_i - x_i'(X'X)^{-1}X'y}{1 - x_i'(X'X)^{-1}x_i},$$
where $y_i - \hat y_i = y_i - x_i'(X'X)^{-1}X'y$ is the ordinary residual. See Myers (2005, pp. 172, 465–466) for the calculations. PRESS based on OLSE thus reduces to
$$\mathrm{PRESS} = \sum_{i=1}^{n} \big(\hat e_{i,-i}^{(\mathrm{OLS})}\big)^2 = \sum_{i=1}^{n} \left(\frac{y_i - x_i'(X'X)^{-1}X'y}{1 - x_i'(X'X)^{-1}x_i}\right)^{2}.$$
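The closed form above can be checked numerically against the definition of PRESS by explicit deletion. The following is a minimal sketch on synthetic data; the function names `press_ols_shortcut` and `press_ols_bruteforce` and the simulated `X`, `y` are illustrative assumptions, not part of the paper.

```python
import numpy as np

def press_ols_shortcut(X, y):
    """PRESS of OLSE from a single fit: sum of (e_i / (1 - h_ii))^2."""
    XtX_inv = np.linalg.inv(X.T @ X)
    resid = y - X @ (XtX_inv @ (X.T @ y))        # ordinary residuals y_i - yhat_i
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)  # leverages x_i'(X'X)^{-1}x_i
    return float(np.sum((resid / (1.0 - h)) ** 2))

def press_ols_bruteforce(X, y):
    """PRESS of OLSE by refitting n times, each time with one row deleted."""
    n = X.shape[0]
    total = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        b_minus_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        total += (y[i] - X[i] @ b_minus_i) ** 2
    return float(total)

# Synthetic example (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + rng.normal(size=20)
print(press_ols_shortcut(X, y), press_ols_bruteforce(X, y))
```

The two printed numbers should agree up to floating-point error, which is why the single-fit form is preferred in practice: it avoids the $n$ refits.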
The PRESS statistic is usually viewed as a validation criterion in the spirit of data splitting. The above representation of PRESS based on OLSE can also help analysts to find highly influential observations or outliers for further examination. Montgomery and Friedman (1993) recommended using the PRESS statistic for selecting the biasing parameter involved in ORRE. Özkale and Kaçıranlar (2007) extended the PRESS criterion to LE. PRESS based on LE is given by
$$\mathrm{PRESS} = \sum_{i=1}^{n} (y_i - x_i' b_{d;-i})^2,$$
where
$$b_{d;-i} = (X'X + I - x_i x_i')^{-1}(X'X + dI - x_i x_i')\, b_{-i}^{(\mathrm{OLS})}$$
stands for the LE of $\beta$ obtained by deleting the $i$th observation from the data set, $i = 1, \ldots, n$. LE can improve OLSE under the PRESS criterion by an appropriate choice of the biasing parameter. Özkale and Kaçıranlar (2007, p. 1894 (Theorem 2.1) and p. 1896) came to this conclusion by providing the optimal selection of $d$. They also applied their result to the well-known Hald data using unit length scaled variables and obtained PRESS values of 0.0321 for OLSE and 0.0318 for the optimal LE. The improvement of the optimal LE over OLSE in the PRESS sense is thus slight. Özkale and Kaçıranlar (2007, p. 1896, Table 1) also mentioned that for ORRE the optimal ridge parameter is approximately 0.0106, in which case the PRESS value of ORRE equals 0.02765. Again, the improvement of ORRE on OLSE is not very substantial.
In this paper, we introduce a new estimator based on LE and then apply the PRESS criterion to choose the biasing parameters optimally. The same data set will be studied to show the superiority of ILE. It will be seen that the PRESS value of the optimal ILE is far less than that of either OLSE or the optimal LE. Now, we note that LE can be rewritten as
$$\hat\beta_d = (X'X + I_p)^{-1}X'y + d\,(X'X + I_p)^{-1}(X'X)^{-1}X'y,$$
in which the second part is uniformly adjusted by $d$. As mentioned above, the improvement of such an LE on OLSE is slight. If we adjust $(X'X + I_p)^{-1}(X'X)^{-1}X'y$ non-uniformly by using potentially different $d_1, \ldots, d_p$, then the weak improvement of LE may be remedied to some extent. In this case the resulting expression is still a linear function of the parameters involved and therefore the selection of the $d_i$ remains simple. Moreover, if we also adjust the first part of LE, uniformly or non-uniformly, the improvement should be greater and the generalization more worthwhile. In general, the number of observations is far larger than the number of regression parameters in a practical problem, and thus we assume that $n \ge 2p$ in this paper, without loss of generality. By the above analysis, we give the following definition:
Definition 2.1. Assume that $n \ge 2p$. Let $k_1, \ldots, k_p, d_1, \ldots, d_p \in \mathbb{R}$ be scalars and take $K = \mathrm{diag}(k_1, \ldots, k_p)$ and $D = \mathrm{diag}(d_1, \ldots, d_p)$. The estimator
$$\hat\beta_{K,D} := K(X'X + I_p)^{-1}X'y + D(X'X + I_p)^{-1}(X'X)^{-1}X'y \tag{2.2}$$
is called an improved Liu estimator (ILE). It is clear that $\hat\beta_{\mathrm{OLS}} = \hat\beta_{I_p, I_p}$ and $\hat\beta_d = \hat\beta_{I_p, dI_p}$, so both OLSE and LE are special ILEs. In the following, we determine the $2p$ biasing parameters such that the PRESS statistic is minimized. Substituting $X'X - x_i x_i'$ and $X'y - x_i y_i$ for $X'X$ and $X'y$, respectively, in $\hat\beta_{K,D}$, we obtain the corresponding ILE of $\beta$ with the $i$th observation deleted, $i = 1, \ldots, n$, namely
$$\hat\beta_{K,D;-i} = K u_i + D v_i,$$
where
$$u_i = (X'X + I - x_i x_i')^{-1}(X'y - x_i y_i) =: (u_1^{(i)}, \ldots, u_p^{(i)})'$$
and
$$v_i = (X'X + I - x_i x_i')^{-1}(X'X - x_i x_i')^{-1}(X'y - x_i y_i) =: (v_1^{(i)}, \ldots, v_p^{(i)})'.$$
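Definition 2.1 and its two special cases can be illustrated with a short numerical sketch. The helper names `ols`, `liu`, and `ile` and the synthetic data are assumptions made for illustration; the code simply evaluates (2.1) and (2.2) and checks that $K = D = I_p$ gives OLSE and $K = I_p$, $D = dI_p$ gives LE.

```python
import numpy as np

def ols(X, y):
    """OLSE: (X'X)^{-1} X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def liu(X, y, d):
    """Liu estimator (2.1): (X'X + I)^{-1} (X'X + d I) beta_OLS."""
    p = X.shape[1]
    XtX = X.T @ X
    return np.linalg.solve(XtX + np.eye(p), (XtX + d * np.eye(p)) @ ols(X, y))

def ile(X, y, k, d):
    """Improved Liu estimator (2.2) with K = diag(k) and D = diag(d)."""
    p = X.shape[1]
    XtX, Xty = X.T @ X, X.T @ y
    first = np.linalg.solve(XtX + np.eye(p), Xty)         # (X'X+I)^{-1} X'y
    second = np.linalg.solve(XtX + np.eye(p), ols(X, y))  # (X'X+I)^{-1} (X'X)^{-1} X'y
    # Diagonal K and D act as elementwise scalings of the two parts.
    return np.asarray(k) * first + np.asarray(d) * second

# Synthetic example (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -1.0, 2.0]) + rng.normal(size=20)
ones = np.ones(3)
print(np.allclose(ile(X, y, ones, ones), ols(X, y)))
print(np.allclose(ile(X, y, ones, 0.3 * ones), liu(X, y, 0.3)))
```

Because $K$ and $D$ are diagonal, the estimator is linear in the $2p$ parameters, which is the property exploited in the derivation that follows.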
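Then, PRESS of ILE can be written as
$$\mathrm{PRESS} = \sum_{i=1}^{n} (y_i - x_i'\hat\beta_{K,D;-i})^2.$$
We denote $\mathbf{k} = (k_1, \ldots, k_p)'$, $\mathbf{d} = (d_1, \ldots, d_p)'$, $U = (u_1, \ldots, u_n)'$, $V = (v_1, \ldots, v_n)'$. By direct operations, it is concluded that
$$x_i'\hat\beta_{K,D;-i} = \sum_{t=1}^{p} \big(k_t x_{it} u_t^{(i)} + d_t x_{it} v_t^{(i)}\big) = (x_i \circ u_i)'\mathbf{k} + (x_i \circ v_i)'\mathbf{d},$$
where the operator "$\circ$" stands for the Hadamard product. Therefore,
$$\mathrm{PRESS} = [y - (X \circ U)\mathbf{k} - (X \circ V)\mathbf{d}]'[y - (X \circ U)\mathbf{k} - (X \circ V)\mathbf{d}] \tag{2.3}$$
$$\phantom{\mathrm{PRESS}} = [y - (X \circ U,\ X \circ V)(\mathbf{k}', \mathbf{d}')']'[y - (X \circ U,\ X \circ V)(\mathbf{k}', \mathbf{d}')']. \tag{2.4}$$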
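The identity between the leave-one-out definition of PRESS and the Hadamard-product form (2.3)-(2.4) can be verified numerically. This is a minimal sketch on synthetic data; `loo_uv`, `press_direct`, and `press_hadamard` are hypothetical helper names, and the chosen `k`, `d` are arbitrary.

```python
import numpy as np

def loo_uv(X, y):
    """Stack u_i' and v_i' (ith observation deleted) as the rows of U and V."""
    n, p = X.shape
    XtX, Xty, I = X.T @ X, X.T @ y, np.eye(p)
    U, V = np.empty((n, p)), np.empty((n, p))
    for i in range(n):
        A = XtX - np.outer(X[i], X[i])   # X'X - x_i x_i'
        b = Xty - X[i] * y[i]            # X'y - x_i y_i
        U[i] = np.linalg.solve(A + I, b)
        V[i] = np.linalg.solve(A + I, np.linalg.solve(A, b))
    return U, V

def press_direct(X, y, k, d):
    """PRESS from its definition: sum of (y_i - x_i'(K u_i + D v_i))^2."""
    U, V = loo_uv(X, y)
    pred = np.array([X[i] @ (k * U[i] + d * V[i]) for i in range(len(y))])
    return float(np.sum((y - pred) ** 2))

def press_hadamard(X, y, k, d):
    """PRESS via (2.4): squared norm of y - (X∘U, X∘V)(k', d')'."""
    U, V = loo_uv(X, y)
    Z = np.hstack([X * U, X * V])        # elementwise products give an n x 2p matrix
    r = y - Z @ np.concatenate([k, d])
    return float(r @ r)

# Synthetic example (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(15, 3))
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=15)
k = np.array([0.9, 1.1, 1.0])
d = np.array([0.2, -0.4, 1.5])
print(press_direct(X, y, k, d), press_hadamard(X, y, k, d))
```

In NumPy the Hadamard product of two equally shaped arrays is the ordinary `*` operator, so the matrix $(X \circ U,\ X \circ V)$ costs one horizontal stack once $U$ and $V$ are available.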
The above expression indicates that PRESS equals the "sum of squared residuals (SSR)" of the "linear regression model" $y = Zc + e$, with notation $Z = (X \circ U,\ X \circ V)$. Therefore, the $2p$-dimensional vector $c$ minimizes the "SSR" if and only if it is the "OLSE", that is, it solves the "normal equations" $Z'Zc = Z'y$; since $Z$ is $n \times 2p$, the assumption $n \ge 2p$ is necessary for $Z'Z$ to be invertible. That is,
$$(\mathbf{k}^{*\prime}, \mathbf{d}^{*\prime})' := (k_1^*, \ldots, k_p^*, d_1^*, \ldots, d_p^*)' := (Z'Z)^{-1}Z'y = [(X \circ U,\ X \circ V)'(X \circ U,\ X \circ V)]^{-1}(X \circ U,\ X \circ V)'y. \tag{2.5}$$
Then, $k_i^*$ (resp., $d_j^*$) defined by (2.5) is the optimal selection of $k_i$ (resp., $d_j$) w.r.t. the PRESS criterion for any $i, j = 1, \ldots, p$. Putting now
$$K^* := \mathrm{diag}(k_1^*, \ldots, k_p^*) \quad\text{and}\quad D^* := \mathrm{diag}(d_1^*, \ldots, d_p^*), \tag{2.6}$$
we obtain a conclusion which is summarized in the following theorem:
Theorem 2.1. Let $K^*$ and $D^*$ be defined by (2.6). Then $\hat\beta_{K^*,D^*}$ has the smallest PRESS in the set of all ILEs. Further, the smallest PRESS equals
$$\mathrm{PRESS}_{\min} = y'y - y'(X \circ U,\ X \circ V)[(X \circ U,\ X \circ V)'(X \circ U,\ X \circ V)]^{-1}(X \circ U,\ X \circ V)'y.$$
Proof. The first part follows from the above analysis. To derive the smallest value of PRESS, it suffices to insert $\mathbf{k} = \mathbf{k}^*$ and $\mathbf{d} = \mathbf{d}^*$ into (2.4). The proof is completed. □
By Theorem 2.1, $\hat\beta_{K^*,D^*}$ is the optimal ILE under the PRESS criterion. In particular, $\hat\beta_{K^*,D^*}$ is superior to OLSE and to any fixed LE, since both OLSE and LE are special ILEs, as mentioned above. Indeed, for any given diagonal matrices $K$ and $D$, with $Z = (X \circ U,\ X \circ V)$, we have
$$\left[y - Z\begin{pmatrix}\mathbf{k}\\ \mathbf{d}\end{pmatrix}\right]'\left[y - Z\begin{pmatrix}\mathbf{k}\\ \mathbf{d}\end{pmatrix}\right] \ge \left[y - Z\begin{pmatrix}\mathbf{k}^*\\ \mathbf{d}^*\end{pmatrix}\right]'\left[y - Z\begin{pmatrix}\mathbf{k}^*\\ \mathbf{d}^*\end{pmatrix}\right].$$
As a consequence, the optimal LE can be derived by virtue of (2.3). In fact, inserting $\mathbf{k} = 1_p$ and $\mathbf{d} = d\,1_p$ into (2.3), it can be readily concluded that $d^*$ is the optimal selection of the biasing parameter involved in LE, where
$$d^* = \frac{(y - (X \circ U)1_p)'(X \circ V)1_p}{1_p'(X \circ V)'(X \circ V)1_p}. \tag{2.7}$$
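Theorem 2.1 and formula (2.7) translate into two small least-squares problems. The sketch below, on synthetic data, computes the optimal ILE parameters from (2.5), the optimal Liu parameter from (2.7), and the PRESS of OLSE (the case $\mathbf{k} = \mathbf{d} = 1_p$). All function names are hypothetical, and `numpy.linalg.lstsq` is used here in place of forming $(Z'Z)^{-1}$ explicitly, which is a numerical convenience rather than the paper's stated procedure.

```python
import numpy as np

def loo_uv(X, y):
    """Rows u_i' and v_i' of U and V, with the ith observation deleted."""
    n, p = X.shape
    XtX, Xty, I = X.T @ X, X.T @ y, np.eye(p)
    U, V = np.empty((n, p)), np.empty((n, p))
    for i in range(n):
        A = XtX - np.outer(X[i], X[i])
        b = Xty - X[i] * y[i]
        U[i] = np.linalg.solve(A + I, b)
        V[i] = np.linalg.solve(A + I, np.linalg.solve(A, b))
    return U, V

def optimal_ile(X, y):
    """k*, d* from (2.5) and the minimal PRESS of Theorem 2.1."""
    U, V = loo_uv(X, y)
    Z = np.hstack([X * U, X * V])
    c, *_ = np.linalg.lstsq(Z, y, rcond=None)   # solves Z'Z c = Z'y
    p = X.shape[1]
    return c[:p], c[p:], float(np.sum((y - Z @ c) ** 2))

def optimal_le(X, y):
    """Scalar d* from (2.7) and the PRESS of the corresponding LE."""
    U, V = loo_uv(X, y)
    a = y - (X * U).sum(axis=1)   # y - (X∘U) 1_p
    b = (X * V).sum(axis=1)       # (X∘V) 1_p
    d_star = float(a @ b / (b @ b))
    return d_star, float(np.sum((a - d_star * b) ** 2))

def press_ols(X, y):
    """PRESS of OLSE, obtained from (2.3) with k = d = 1_p."""
    U, V = loo_uv(X, y)
    return float(np.sum((y - (X * (U + V)).sum(axis=1)) ** 2))

# Synthetic example (hypothetical data, for illustration only).
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=30)
k_star, d_star_vec, press_ile = optimal_ile(X, y)
d_le, press_le = optimal_le(X, y)
print(press_ols(X, y), press_le, press_ile)
```

The three printed values should be non-increasing from left to right, mirroring the nesting of OLSE within LE within ILE.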
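By comparison, the result of Özkale and Kaçıranlar (2007, p. 1894, Theorem 2.1) states that the PRESS statistic for the LE is minimized at
$$\hat d = \left\{\sum_{i=1}^{n} \frac{\tilde e_i}{1 - g_{ii}}\left(\frac{\tilde e_i}{1 - h_{1-ii}} - \frac{\hat e_i}{1 - h_{ii}}\right)\right\} \Bigg/ \left\{\sum_{i=1}^{n} \left(\frac{\tilde e_i}{1 - g_{ii}} - \frac{\hat e_i}{1 - h_{ii}}\right)^{2}\right\},$$
with $\hat e_i = y_i - x_i'(X'X - x_i x_i')^{-1}(X'y - x_i y_i)$ and $\tilde e_i = y_i - x_i'(X'X + I - x_i x_i')^{-1}(X'y - x_i y_i)$, where $g_{ii}$ and $h_{ii}$ are the $i$th diagonal elements of $G := X(X'X + I)^{-1}X'$ and $H := X(X'X)^{-1}X'$, respectively. The representation (2.7) has a simpler form.
For convenience, we write PRESS of OLSE, PRESS of the optimal LE, and PRESS of the optimal ILE as $\mathrm{PRESS}_{\mathrm{OLSE}}$, $\mathrm{PRESS}_{\mathrm{LE}}$, and $\mathrm{PRESS}_{\mathrm{ILE}}$, respectively. Clearly, $\mathrm{PRESS}_{\mathrm{OLSE}} \ge \mathrm{PRESS}_{\mathrm{LE}} \ge \mathrm{PRESS}_{\mathrm{ILE}}$, since OLSE is a special LE while the optimal LE is a special ILE. The difference between $\mathrm{PRESS}_{\mathrm{LE}}$ and $\mathrm{PRESS}_{\mathrm{ILE}}$ shows the improvement of ILE on LE, and likewise for LE (or ILE) and OLSE. In order to quantify, by means of a simulation study, how much ILE can improve LE (LE can improve OLSE; ILE can improve OLSE), we give the following three notions of improved relative efficiency (IRE):
$$\mathrm{IRE}(\mathrm{LE},\mathrm{OLSE}) := \frac{\mathrm{PRESS}_{\mathrm{OLSE}} - \mathrm{PRESS}_{\mathrm{LE}}}{\mathrm{PRESS}_{\mathrm{OLSE}}} = 1 - \frac{\mathrm{PRESS}_{\mathrm{LE}}}{\mathrm{PRESS}_{\mathrm{OLSE}}},$$
$$\mathrm{IRE}(\mathrm{ILE},\mathrm{OLSE}) := \frac{\mathrm{PRESS}_{\mathrm{OLSE}} - \mathrm{PRESS}_{\mathrm{ILE}}}{\mathrm{PRESS}_{\mathrm{OLSE}}} = 1 - \frac{\mathrm{PRESS}_{\mathrm{ILE}}}{\mathrm{PRESS}_{\mathrm{OLSE}}}$$
and
$$\mathrm{IRE}(\mathrm{ILE},\mathrm{LE}) := \frac{\mathrm{PRESS}_{\mathrm{LE}} - \mathrm{PRESS}_{\mathrm{ILE}}}{\mathrm{PRESS}_{\mathrm{LE}}} = 1 - \frac{\mathrm{PRESS}_{\mathrm{ILE}}}{\mathrm{PRESS}_{\mathrm{LE}}}.$$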
As argued above, we hope that each IRE (especially that of ILE relative to LE) is as large as possible. For a given data set following the model (1.1), the improvement of ILE on LE is ineffective if the IRE of the optimal ILE relative to LE is close to 0, and remarkable if it is close to 1. The other two IREs can be interpreted in a similar fashion. To support this standpoint, we carry out a simulation study in the next section.
3. Simulation
In this section, we make a simulation study to evaluate the performance of ILE and illustrate its superiority. The simulation study concerns a regression model of the form (1.1), including the intercept term, with $n = 20$ and $p = 6$, in which $x_{i1} = 1$ for any $i = 1, \ldots, 20$. We use the simulation procedure suggested by McDonald and Galarneau (1975) to generate the explanatory variables, that is,
$$x_{ij} = (1 - \rho^2)^{1/2} z_{ij} + \rho z_{i7}, \quad i = 1, \ldots, 20 \ \text{ and } \ j = 2, \ldots, 6,$$
where the $z_{ij}$'s are independent standard normal pseudo-random numbers and $\rho^2$ is the correlation between any two explanatory variables. We take $\rho^2$ as 0.9, 0.99, 0.999, 0.9999, 0.99999, and 0.999999, respectively. The resulting condition numbers of the generated $X$ equal 12.8020, 40.3444, 87.3587, 368.6241, 867.0480, and 4250.6398, respectively.
The simulation proceeds as follows. First, randomly generate the intercept and the regression parameters and then a set of pseudo-observations of the response variables. Second, calculate the PRESS statistics of the OLSE, the optimal LE, and the optimal ILE, and from them the values of the three types of IRE. From this point of view, the three types of IRE are regarded as random variables, and the resulting values are their pseudo-observations. Repeat the above steps $N$ times to obtain an $N \times 3$ matrix of IRE values, where $N$ denotes the sample size of the three populations. Finally, for each given $\rho^2$, calculate the sample means, which reflect the average level of the IREs. In this study, we take $N$ as 10, 100, and 1000, respectively. With the aid of Matlab 7.0, we derive the sample means of the three types of IRE for different values of $\rho^2$ and list them in Table 1, from which we find that:
(a) no matter how ill-conditioned $X'X$ is, the IRE values of the optimal LE relative to the OLSE are small (between about 0.21 and 0.35), so the improvement of LE over OLSE is weak, while those of the optimal ILE relative to the OLSE are quite large (between about 0.77 and 0.89), so the improvement of ILE over OLSE is remarkable. This may indicate that the influence of multicollinearity upon the PRESS criterion is relatively weak. Consequently, ILE is far more effective than LE in improving OLSE;
(b) the optimal LE can be improved by the corresponding optimal ILE by an average efficiency of approximately 75%; and
(c) no matter what the sample size is, the differences among the values of each type of IRE are trivial.

Table 1
Sample means of the three types of IRE for different values of $\rho^2$.
$\rho^2$                          0.9      0.99     0.999    0.9999   0.99999  0.999999
Sample mean of IRE(LE, OLSE)
  N = 10                          0.2326   0.2574   0.3468   0.3384   0.2769   0.2100
  N = 100                         0.2396   0.2682   0.3065   0.2819   0.2809   0.2764
  N = 1000                        0.2291   0.2879   0.2948   0.2996   0.3007   0.2975
Sample mean of IRE(ILE, OLSE)
  N = 10                          0.7719   0.8859   0.8554   0.7929   0.8103   0.8478
  N = 100                         0.8341   0.8342   0.8470   0.8331   0.8405   0.8455
  N = 1000                        0.8453   0.8434   0.8320   0.8360   0.8341   0.8313
Sample mean of IRE(ILE, LE)
  N = 10                          0.6904   0.8343   0.7701   0.6945   0.7313   0.7880
  N = 100                         0.7872   0.7765   0.7800   0.7651   0.7718   0.7872
  N = 1000                        0.7991   0.7780   0.7597   0.7632   0.7608   0.7555
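The data-generating step of the simulation (the McDonald and Galarneau construction of collinear regressors) can be sketched as follows. The function name `make_design`, the seed, and the use of NumPy are assumptions for illustration; the paper's computations were done in Matlab, and the condition numbers printed here depend on the random draw and are not expected to reproduce the reported values.

```python
import numpy as np

def make_design(n, p, rho2, rng):
    """n x p design: an intercept column plus p - 1 collinear regressors.

    Each regressor is sqrt(1 - rho2) * z_ij + rho * z_shared, so any two
    regressors have theoretical correlation rho2.
    """
    rho = np.sqrt(rho2)
    z = rng.standard_normal((n, p))   # p - 1 idiosyncratic columns + 1 shared column
    x = np.sqrt(1.0 - rho2) * z[:, : p - 1] + rho * z[:, [-1]]
    return np.column_stack([np.ones(n), x])

rng = np.random.default_rng(2011)     # arbitrary seed
for rho2 in (0.9, 0.99, 0.999, 0.9999, 0.99999, 0.999999):
    X = make_design(20, 6, rho2, rng)
    print(rho2, np.linalg.cond(X))
```

With $p = 6$ the shared column plays the role of $z_{i7}$ in the formula above, since the intercept occupies the index $j = 1$.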
In summary, ILE can greatly improve OLSE and LE under the PRESS criterion.
4. Numerical example
The simulation study showed that ILE can improve LE and OLSE substantially. In this section, we apply the main results of this paper to the well-known Portland cement dataset, also called the Hald data (Woods et al., 1932). This dataset has been widely investigated by many authors; see Kaçıranlar et al. (1999) and Sakallıoğlu and Kaçıranlar (2008) for example. Recently, Yang and Xu (2009), among others, employed the dataset to support their theoretical results by adding stochastic linear restrictions. The dataset comes from an experimental investigation of the heat evolved during the setting and hardening of Portland cements of varied composition and the dependence of the heat on the percentages of four compounds in the clinkers from which the cement was produced. It concerns the relationship between the heat evolved ($y$) of cement and the four compounds: tricalcium aluminate ($x_1$), tricalcium silicate ($x_2$), tetracalcium alumino ferrite ($x_3$), and dicalcium silicate ($x_4$). The observations are given in the first five columns of Table 2. The other columns present the PRESS residuals (PR) and the ordinary residuals (OR) based on the OLSE, the optimal LE, and the optimal ILE, respectively.
Let us first consider a model including the intercept term. The condition number of $X$, equal to 6056.3443, indicates that $X'X$ is severely ill-conditioned. With the aid of Matlab 7.0, we get Table 3. By the table, the PRESS values of the OLSE, the optimal LE, and the optimal ILE are 110.3466, 97.6613, and 23.3334, respectively. By direct operations, IRE(LE, OLSE) = 11.50% and IRE(ILE, OLSE) = 78.85%. This means that the improvement of ILE on OLSE is far greater than that of LE on OLSE. In addition, from the result IRE(ILE, LE) = 76.11% it is also concluded that LE is indeed improved substantially by ILE.
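The three percentages quoted above follow directly from the IRE definitions of Section 2 and the reported PRESS values. A minimal sketch of the arithmetic (the helper `ire` is a hypothetical name):

```python
def ire(press_new, press_ref):
    """Improved relative efficiency: 1 - PRESS_new / PRESS_ref."""
    return 1.0 - press_new / press_ref

# PRESS values reported for the model including the intercept term.
press = {"OLSE": 110.3466, "LE": 97.6613, "ILE": 23.3334}

ire_le_olse = ire(press["LE"], press["OLSE"])
ire_ile_olse = ire(press["ILE"], press["OLSE"])
ire_ile_le = ire(press["ILE"], press["LE"])

print(f"IRE(LE, OLSE)  = {ire_le_olse:.2%}")   # 11.50%
print(f"IRE(ILE, OLSE) = {ire_ile_olse:.2%}")  # 78.85%
print(f"IRE(ILE, LE)   = {ire_ile_le:.2%}")    # 76.11%
```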
Table 2
Portland cement dataset and the residuals under a model including intercept term.
y       x1   x2   x3   x4   PR (OLSE)  OR (OLSE)  PR (LE)   OR (LE)   PR (ILE)  OR (ILE)
78.5    7    26   6    60   0.0106     0.0048     0.9549    0.4966    1.1480    1.5031
74.3    1    29   15   52   2.2665     1.5112     2.6393    1.8685    1.7987    0.6381
104.3   11   56   8    20   3.9497     1.6709     0.9521    0.4844    0.2122    7.0470
87.6    11   31   8    47   2.4506     1.7271     1.7109    1.2777    0.4175    5.4266
95.9    7    52   6    33   0.3903     0.2508     0.0472    0.0326    0.3278    1.6553
109.2   11   55   9    22   4.4819     3.9254     4.6427    4.0959    1.5480    1.1714
102.7   3    71   17   6    2.2889     1.4487     2.4771    1.5755    1.0495    1.9973
72.5    1    31   22   44   5.3680     3.1750     5.5831    3.4280    0.5658    2.8713
93.1    2    54   18   22   1.9532     1.3783     2.4366    1.9340    3.2386    0.7597
115.9   21   47   4    26   0.9398     0.2815     0.3984    0.1248    0.2077    2.7767
83.8    1    40   23   34   3.4656     1.9910     2.6479    1.6262    0.0048    2.7511
113.3   11   66   9    12   1.3202     0.9730     0.6578    0.5130    0.6231    0.1767
109.4   10   68   8    12   3.2951     2.2943     3.6806    2.7686    1.9275    3.4946

Table 3
Estimates of the parameters and the PRESS under a model including intercept term.
Parameters   OLSE       Optimal LE   Optimal ILE
$\beta_0$    62.4054    12.2937      188.7383
$\beta_1$    1.5511     2.0555       0.3527
$\beta_2$    0.5102     1.0292       0.7682
$\beta_3$    0.1019     0.6212       1.3070
$\beta_4$    0.1441     0.3638       1.3898
PRESS        110.3466   97.6613      23.3334

We mention here that, from the viewpoint of chemistry, all coefficients of the compounds should be positive, but there is one negative estimate in the OLSE, and three in the optimal ILE. In fact, the main reason for this is the use of a model including the intercept term. If we use a model without the intercept term, the results given in Table 4 show that neither OLSE nor the optimal ILE violates the chemical regularity. In addition, the optimal ILE improves OLSE and the optimal LE w.r.t. the corresponding IREs by 29.07% and by 28.98%, respectively. In contrast, the optimal LE improves OLSE w.r.t. the IRE by only 0.13% and is therefore ineffective compared with ILE.

Table 4
Estimates of the parameters and PRESS under a model without the intercept.
Parameters   OLSE      Optimal LE   Optimal ILE
$\beta_1$    2.1930    2.1820       1.6122
$\beta_2$    1.1533    1.1559       1.2630
$\beta_3$    0.7585    0.7506       0.3836
$\beta_4$    0.4863    0.4880       0.6175
PRESS        98.5491   98.4258      69.9004

Table 5
Estimates of the parameters and PRESS under a model with unit length scaling variables.
Parameters   OLSE     Optimal LE   Optimal ILE
$\beta_1$    0.6065   0.5899       0.3235
$\beta_2$    0.5277   0.5146       0.6305
$\beta_3$    0.0434   0.0347       0.3637
$\beta_4$    0.1603   0.1649       1.5355
PRESS        0.0321   0.0318       0.0063

On the other hand, we compare ILE and LE by using the unit length scaling variables. The results are given in Table 5, from which we can see that the values of PRESS of the OLSE, the optimal LE, and the optimal ILE are equal to 0.0321, 0.0318, and 0.0063, respectively. The IRE of LE relative to OLSE equals 0.93%, while the IRE of ILE relative to OLSE is 80.37%. The result indicates that the improvement of LE on OLSE is very weak while the improvement of ILE on OLSE is quite effective.
This shows that it is reasonable to put forward the notion of ILE under the PRESS criterion. We assert that ILE is a fine alternative to OLSE and LE for a practical problem. Incidentally, Özkale and Kaçıranlar (2007, p. 1896, Table 1) mentioned that the minimal PRESS value attained by ORRE equals 0.02765, which is improved by 77.22% by the optimal ILE under the PRESS criterion.
5. Summary and discussions
In this paper, we put forward the improved Liu estimator and recommended selecting the biasing parameters under the PRESS criterion. The optimal ILE was obtained theoretically. The simulation study showed that the improvement of the new estimator on OLSE and LE is substantial for pseudo-random data. By applying the results to the Hald data, we can assert that ILE is a fine alternative to OLSE and LE in practice. Some discussions are given below:
1. The PRESS statistic can be used to detect influential observations. Table 2 lists the PRESS residuals (PR) and the ordinary residuals (OR) based on OLSE, the optimal LE, and the optimal ILE. To save space, we only discuss the 7th and the 10th columns. Intuitively, the 9th observation is highly influential on the basis of the PRESS residuals of the optimal ILE, while the 6th and the 8th observations can be regarded as influential points based on the ordinary residuals of OLSE. Hence, different methods may flag different observations as influential points. Making the final decision needs further cross validation.
2. The ordinary residuals and the PRESS residuals can be used to describe the fitting capacity and the predictive capacity, respectively.
In the following, we combine ordinary residuals and PRESS residuals to identify outliers. Viewing the 13 ordinary residuals and the 13 PRESS residuals of the optimal ILE, listed in the last two columns of Table 2, as 13 observations of a bivariate random vector $\zeta := (\xi, \eta)'$, the expected ellipsoidal region is
$$(\zeta - \mu_\zeta)' V_\zeta^{-1} (\zeta - \mu_\zeta) \le \delta,$$
where $\mu_\zeta$ and $V_\zeta$ are the mean vector and the covariance matrix, respectively, of $\zeta$, and the scalar $\delta$ is taken to be approximately the 99% quantile of a $\chi^2$ distribution with 1 degree of freedom. The result is shown in Fig. 1. By the figure, the 3rd and the 11th observations are identified as outliers. Further studies combining robust methods of detecting outliers are omitted.
3. For a given estimator, the corresponding $R^2$-like statistic can be used to reflect its prediction capability. It is defined as
$$R^2_{\mathrm{Pred}} := 1 - \frac{\mathrm{PRESS}}{\sum_{i=1}^{n} \big(y_i - \frac{1}{n}\sum_{i=1}^{n} y_i\big)^2},$$
cf. Myers (2005, p. 171, Eq. (4.5)). Similarly, we define the $R^2$-like statistics based on the optimal LE and the optimal ILE as follows:
$$R^2_{\mathrm{LE}} := 1 - \frac{\mathrm{PRESS}_{\mathrm{LE}}}{\sum_{i=1}^{n} \big(y_i - \frac{1}{n}\sum_{i=1}^{n} y_i\big)^2}, \qquad R^2_{\mathrm{ILE}} := 1 - \frac{\mathrm{PRESS}_{\mathrm{ILE}}}{\sum_{i=1}^{n} \big(y_i - \frac{1}{n}\sum_{i=1}^{n} y_i\big)^2}.$$
By direct operations, the values of the above three $R^2$-like statistics equal 0.9594, 0.9640, and 0.9914, respectively. From this viewpoint, we can also assert that ILE improves OLSE and LE effectively.
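The three $R^2$-like values can be recomputed from the response column of Table 2 and the reported PRESS values. The sketch below assumes that the PRESS values of the model including the intercept term (Table 3) are the ones intended, which appears to reproduce the quoted figures; the variable names are illustrative.

```python
import numpy as np

# Heat evolved y, taken from the first column of Table 2 (Hald data).
y = np.array([78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7,
              72.5, 93.1, 115.9, 83.8, 113.3, 109.4])
sst = float(np.sum((y - y.mean()) ** 2))   # sum of squares about the mean

# PRESS values from Table 3 (assumed to be the ones used for the R^2-like statistics).
press = {"OLSE": 110.3466, "LE": 97.6613, "ILE": 23.3334}
r2_pred = {name: 1.0 - value / sst for name, value in press.items()}

for name, value in r2_pred.items():
    print(f"R2_pred({name}) = {value:.4f}")
```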
Fig. 1. Scatter plot and the 99% confidence concentration (horizontal axis: ordinary residuals of ILE; vertical axis: PRESS residuals of ILE; labelled points: X = −7.047, Y = −0.2122 and X = 2.751, Y = −0.004826).
Acknowledgements
The author is very grateful to the referee for the detailed comments and constructive suggestions which resulted in the present version. Thanks also to Professor M. Revan Özkale for correspondence concerning Özkale and Kaçıranlar (2007) and to Professor S. Sakallıoğlu for correspondence concerning Sakallıoğlu and Kaçıranlar (2008).
References
Akdeniz, F., Kaçıranlar, S., 1995. On the almost unbiased generalized Liu estimator and unbiased estimation of the bias and MSE. Commun. Statist. Theory Meth. 24, 1789–1797.
Allen, D.M., 1971. Mean square error of prediction as a criterion for selection of variables. Technometrics 13, 469–475.
Allen, D.M., 1974. The relationship between variable selection and data augmentation and a method for prediction. Technometrics 16, 125–127.
Arslan, O., Billor, N., 2000. Robust Liu estimator for regression based on an M-estimator. J. Appl. Statist. 27, 39–47.
Druilhet, P., Mom, A., 2008. Shrinkage structure in biased regression. J. Multivariate Anal. 99, 232–244.
Hoerl, A.E., Kennard, R.W., 1970. Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12, 55–67.
Hubert, M.H., Wijekoon, P., 2006. Improvement of the Liu estimator in linear regression model. Statist. Papers 47, 471–479.
Kaçıranlar, S., Sakallıoğlu, S., Akdeniz, F., Styan, G.P.H., Werner, H.J., 1999. A new biased estimator in linear regression and a detailed analysis of the widely-analysed dataset on Portland cement. Sankhyā Indian J. Statist. (Ser. B) 61, 443–459.
Kaçıranlar, S., Sakallıoğlu, S., 2001. Combining the Liu estimator and the principal component regression estimator. Commun. Statist. Theory Meth. 30, 2699–2705.
Liu, K.J., 1993. A new class of biased estimate in linear regression. Commun. Statist. Theory Meth. 22 (2), 393–402.
Liu, K.J., 2003. Using Liu-type estimator to combat collinearity. Commun. Statist. Theory Meth. 32, 1009–1020.
McDonald, G.C., Galarneau, D.I., 1975. A Monte Carlo evaluation of some ridge-type estimators. J. Amer. Statist. Assoc. 70, 407–416.
Montgomery, D.C., Friedman, D.J., 1993. Prediction using regression models with multicollinear predictor variables. IIE Trans. 25 (3), 73–85.
Myers, R.H., 2005. Classical and Modern Regression with Application, second ed. Higher Education Press (Photocopy version).
Özkale, M.R., Kaçıranlar, S., 2007. A prediction-oriented criterion for choosing the biasing parameter in Liu estimation. Commun. Statist. Theory Meth. 36, 1889–1903.
Sakallıoğlu, S., Kaçıranlar, S., 2008. A new biased estimator based on ridge estimation. Statist. Papers 49, 669–689.
Torigoe, N., Ujiie, K., 2006. On the restricted Liu estimator in the Gauss–Markov model. Commun. Statist. Theory Meth. 35, 1713–1722.
Ullah, A., Ullah, S., 1978. Double k-class estimators of coefficients in linear regression. Econometrica 46, 705–722.
Woods, H., Steinour, H.H., Starke, H.R., 1932. Effect of composition of Portland cement on heat evolved during hardening. Ind. Eng. Chem. 24, 1207–1214.
Yang, H., Xu, J.W., 2009. An alternative stochastic restricted Liu estimator in linear regression. Statist. Papers 50, 639–647.