Communications in Statistics - Simulation and Computation
ISSN: 0361-0918 (Print), 1532-4141 (Online). Journal homepage: http://www.tandfonline.com/loi/lssp20

To cite this article: Yasin Asar (2015): Some New Methods to Solve Multicollinearity in Logistic Regression, Communications in Statistics - Simulation and Computation, DOI: 10.1080/03610918.2015.1053925
To link to this article: http://dx.doi.org/10.1080/03610918.2015.1053925

Accepted online: 31 Aug 2015.
ACCEPTED MANUSCRIPT

Some New Methods to Solve Multicollinearity in Logistic Regression

Yasin Asar
Department of Mathematics & Computer Science, Necmettin Erbakan University, Konya 42090, Turkey
[email protected]

Abstract

Binary logistic regression is a widely used statistical method when the dependent variable is binary or dichotomous. In some situations, the independent variables are collinear, which leads to the problem of multicollinearity. It is known that multicollinearity inflates the variance of the maximum likelihood estimator (MLE). Therefore, this paper introduces new methods to estimate the shrinkage parameters of the Liu-type logistic estimator proposed by Inan and Erdogan (2013), which is a generalization of the Liu-type estimator defined by Liu (2003) for the linear model. A Monte Carlo study is used to show the effectiveness of the proposed methods over MLE in terms of the mean squared error (MSE) and the mean absolute error (MAE). A real data application illustrates the benefits of the new methods. According to the results of the simulation and the application, the proposed methods perform better than MLE.

Keywords: Logistic regression, Liu-type estimators, Multicollinearity, MSE, MLE
1. Introduction

Binary logistic regression is one of the most widely used models when the outcome variable is dichotomous or binary, i.e., it has two different categories. It is a very popular model for binary data in applied sciences such as criminology, business and finance, engineering, biology, health policy, biomedical research, ecology, and linguistics (Hosmer et al., 2013). The explanatory variables may be intercorrelated, which is the problem of multicollinearity, a common issue in applied work; financial ratios, for example, are well known to be highly correlated. Therefore, this paper introduces new methods to overcome the multicollinearity problem.

Now, consider the binary logistic regression model in which the dependent variable is distributed as Bernoulli, $y_i \sim Be(P_i)$, where

$$P_i = \frac{\exp(x_i \beta)}{1 + \exp(x_i \beta)}, \quad i = 1, 2, \dots, n,$$

$X$ is an $n \times (p+1)$ data matrix having $p$ independent variables, $x_i$ is its $i$th row, and $\beta$ is the coefficient vector of order $(p+1) \times 1$. The most common method of estimating the coefficients is the maximum likelihood estimator (MLE), which can be obtained by using the iteratively weighted least squares (IWLS) algorithm as follows:

$$\hat{\beta}_{MLE} = (X'\hat{W}X)^{-1} X'\hat{W}\hat{z} \qquad (1.1)$$

where $\hat{W} = \mathrm{diag}\left(\hat{P}_i(1-\hat{P}_i)\right)$ and

$$\hat{z}_i = \log\left(\frac{\hat{P}_i}{1-\hat{P}_i}\right) + \frac{y_i - \hat{P}_i}{\hat{P}_i(1-\hat{P}_i)}$$

is the $i$th element of the vector $\hat{z}$.
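The IWLS scheme behind Equation (1.1) can be sketched as follows. This is a minimal illustration in Python (the paper's own code was written in Matlab); the function name and defaults are our choices, not the paper's:

```python
import numpy as np

def logistic_mle_irls(X, y, tol=1e-7, max_iter=100):
    """Fit binary logistic regression by iteratively reweighted least squares.

    X is assumed to already contain an intercept column. At each step the
    update is beta = (X'WX)^{-1} X'Wz as in Equation (1.1), where
    z = X beta + (y - p)/w equals logit(p) + (y - p)/(p(1-p)) at the
    current iterate.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))   # fitted P_i
        w = p * (1.0 - p)                       # diagonal of W-hat
        z = X @ beta + (y - p) / w              # adjusted response z-hat
        beta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta
```

At convergence the score vector $X'(y - \hat{P})$ is (numerically) zero, which is a convenient way to verify the fit.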
The MSE and matrix mean squared error (MMSE) of an estimator $\hat{\beta}$ of $\beta$ can be obtained respectively by

$$MSE(\hat{\beta}) = E\left[(\hat{\beta}-\beta)'(\hat{\beta}-\beta)\right] = \mathrm{tr}\left(\mathrm{Var}(\hat{\beta})\right) + \mathrm{Bias}(\hat{\beta})'\,\mathrm{Bias}(\hat{\beta}), \qquad (1.2)$$

$$MMSE(\hat{\beta}) = E\left[(\hat{\beta}-\beta)(\hat{\beta}-\beta)'\right] = \mathrm{Var}(\hat{\beta}) + \mathrm{Bias}(\hat{\beta})\,\mathrm{Bias}(\hat{\beta})', \qquad (1.3)$$

where $\mathrm{tr}$ is the trace of a matrix, $\mathrm{Var}(\hat{\beta})$ is the variance matrix and $\mathrm{Bias}(\hat{\beta}) = E(\hat{\beta}) - \beta$ is the bias of the estimator $\hat{\beta}$.

We can compute the MSE and MMSE of $\hat{\beta}_{MLE}$ by using (1.2) and (1.3) respectively and obtain

$$MSE(\hat{\beta}_{MLE}) = \mathrm{tr}\left((X'\hat{W}X)^{-1}\right) = \sum_{j=1}^{p+1} \frac{1}{\lambda_j}, \qquad (1.4)$$

$$MMSE(\hat{\beta}_{MLE}) = (X'\hat{W}X)^{-1}, \qquad (1.5)$$

where $\lambda_j$ is the $j$th eigenvalue of the matrix $X'\hat{W}X$, $j = 1, 2, \dots, p+1$.
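The eigenvalue form of Equation (1.4) can be checked numerically. The snippet below is a stand-alone sketch (the matrix and weights are synthetic, not the paper's data): for a symmetric positive definite $S = X'\hat{W}X$, the trace of the inverse equals the sum of the reciprocals of its eigenvalues.

```python
import numpy as np

# Numerical check of Equation (1.4): tr(S^{-1}) = sum_j 1/lambda_j
# for a symmetric positive definite S = X'WX.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
w = rng.uniform(0.1, 0.25, size=40)      # stand-in for the weights P_i(1 - P_i)
S = X.T @ (w[:, None] * X)
lhs = np.trace(np.linalg.inv(S))
lams = np.linalg.eigvalsh(S)             # eigenvalues of S
print(np.isclose(lhs, np.sum(1.0 / lams)))  # True
```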
When there is multicollinearity, the columns of the matrix $X'\hat{W}X$ become close to linearly dependent, which implies that some of the eigenvalues of $X'\hat{W}X$ become close to zero. Thus, the MSE of MLE is inflated, so that one cannot obtain stable estimates. In order to overcome this problem, one can use biased estimators. The well-known ridge regression, first defined by Hoerl and Kennard (1970), was successfully generalized to the logistic regression model by Schaefer et al. (1984). In Månsson and Shukur (2011) and Kibria et al. (2012), the authors investigated, in the binary logistic regression model, the performance of some ridge estimators first defined for the linear model by Kibria (2003), Muniz and Kibria (2009), Muniz et al. (2012) and Mansson et al. (2010), using the logistic ridge estimator (LRE) defined by Schaefer et al. (1984). The Liu-type estimator (LLT) was adapted to the binary logistic model by Inan and Erdogan (2013) to solve the multicollinearity problem and decrease the variance so that the estimates become stable.

This paper proposes some new methods of estimating the shrinkage parameter used in LLT in order to combat multicollinearity in the binary logistic regression model. LLT with the new estimators is expected to perform better than MLE when the explanatory variables are correlated. Moreover, we give a matrix mean squared error comparison between the estimators and conduct a Monte Carlo simulation to evaluate their performance using the MSE and the mean absolute error (MAE).

This paper is organized as follows: In Section 2, the theory and proposed methods are developed. Details of the Monte Carlo experiment are given in Section 3. We provide the results and
discussions of the simulation in Section 4. In Section 5, a real data example is given. Finally, we provide a brief summary and conclusion.
2. New Estimators and MSE Properties

In order to overcome the problem of multicollinearity, the logistic Liu-type estimator (LLT) was defined by Inan and Erdogan (2013). LLT is a logistic generalization of the Liu-type estimator defined by Liu (2003) for the linear model. LLT can be written as follows:

$$\hat{\beta}_{LLT} = (S + kI)^{-1}(S - dI)\,\hat{\beta}_{MLE} \qquad (2.1)$$

where $S = X'\hat{W}X$, $k > 0$ and $d \in \mathbb{R}$, $\hat{W}$ is the iteratively weighted covariance matrix, and $I$ is the identity matrix of order $p+1$.

We make a transformation in order to present the explicit forms of the functions $MSE(\hat{\beta}_{LLT})$ and $MMSE(\hat{\beta}_{LLT})$. Let $\alpha = Q'\beta$ and

$$\Lambda = Q'X'\hat{W}XQ = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_{p+1}),$$

where $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_{p+1} > 0$ and $Q$ is the matrix whose columns are the eigenvectors of the matrix $X'\hat{W}X$. The bias and variance of LLT are obtained as follows:

$$b = \mathrm{Bias}(\hat{\beta}_{LLT}) = -(d + k)\, Q\, \Lambda_k^{-1} \alpha, \qquad (2.2)$$

$$\mathrm{Var}(\hat{\beta}_{LLT}) = Q\, \Lambda_k^{-1} \Lambda_d^{*} \Lambda^{-1} \Lambda_d^{*} \Lambda_k^{-1} Q' \qquad (2.3)$$

where $\Lambda_k = \Lambda + kI$ and $\Lambda_d^{*} = \Lambda - dI$.
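Computing the estimator (2.1) is a single linear solve once the MLE is available. A minimal sketch (function and argument names are ours; `w_hat` is assumed to hold the fitted weights $\hat{P}_i(1-\hat{P}_i)$):

```python
import numpy as np

def llt_estimator(X, w_hat, beta_mle, k, d):
    """Liu-type logistic estimator of Equation (2.1):
    beta_LLT = (S + kI)^{-1} (S - dI) beta_MLE, with S = X'WX."""
    S = X.T @ (w_hat[:, None] * X)
    p1 = S.shape[0]
    return np.linalg.solve(S + k * np.eye(p1), (S - d * np.eye(p1)) @ beta_mle)
```

Note that with $k = 0$ and $d = 0$ the estimator reduces to $\hat{\beta}_{MLE}$, which gives an easy correctness check.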
The MMSE and MSE of LLT are then

$$MMSE(\hat{\beta}_{LLT}) = Q\, \Lambda_k^{-1} \Lambda_d^{*} \Lambda^{-1} \Lambda_d^{*} \Lambda_k^{-1} Q' + bb', \qquad (2.4)$$

$$MSE(\hat{\beta}_{LLT}) = \sum_{j=1}^{p+1} \frac{(\lambda_j - d)^2}{\lambda_j(\lambda_j + k)^2} + (d+k)^2 \sum_{j=1}^{p+1} \frac{\alpha_j^2}{(\lambda_j + k)^2} = f_1(k, d) + f_2(k, d), \qquad (2.5)$$
where the first term in Equation (2.5) is the asymptotic variance and the second term is the squared bias of the estimator. Thus, one should choose suitable values of $k$ and $d$ such that the decrease in the variance is greater than the increase in the squared bias.

2.1. MMSE and MSE comparisons of MLE and LLT

In this subsection, some results regarding the MMSE and MSE comparisons of the estimators MLE and LLT are given. Theorem 2.2 gives the necessary and sufficient condition for the MMSE difference of MLE and LLT to be non-negative definite. If $\hat{\beta}_1$ and $\hat{\beta}_2$ are two estimators of the coefficient vector, then $\hat{\beta}_2$ is superior to $\hat{\beta}_1$ if and only if (iff) $MMSE(\hat{\beta}_1) - MMSE(\hat{\beta}_2) \geq 0$. It is proved that if $MMSE(\hat{\beta}_1) - MMSE(\hat{\beta}_2)$ is a non-negative definite (n.n.d.) matrix, then $MSE(\hat{\beta}_1) - MSE(\hat{\beta}_2) \geq 0$; see Theobald (1974). However, the converse is not always true. Now, the following lemma is used to compare the estimators in the sense of MMSE theoretically.
Lemma 2.1 (Trenkler and Toutenburg, 1990): Let $\hat{\beta}_1$ and $\hat{\beta}_2$ be two estimators of the coefficient vector $\beta$. Moreover, let $D = \mathrm{Var}(\hat{\beta}_1) - \mathrm{Var}(\hat{\beta}_2)$ be a p.d. matrix, $a_1 = \mathrm{Bias}(\hat{\beta}_1)$ and $a_2 = \mathrm{Bias}(\hat{\beta}_2)$. Then $MMSE(\hat{\beta}_1) - MMSE(\hat{\beta}_2) \geq 0$ iff $a_2'(D + a_1 a_1')^{-1} a_2 \leq 1$.

Theorem 2.2: Let $(d + k)(2\lambda_j + k - d) > 0$ for $j = 1, 2, \dots, p+1$ and $b = \mathrm{Bias}(\hat{\beta}_{LLT})$. Then $MMSE(\hat{\beta}_{MLE}) - MMSE(\hat{\beta}_{LLT}) \geq 0$ iff

$$b' \left[ Q\left(\Lambda^{-1} - \Lambda_k^{-1}\Lambda_d^{*}\Lambda^{-1}\Lambda_d^{*}\Lambda_k^{-1}\right) Q' \right]^{-1} b \leq 1.$$

Proof: Let $D = \mathrm{Var}(\hat{\beta}_{MLE}) - \mathrm{Var}(\hat{\beta}_{LLT}) = Q\left(\Lambda^{-1} - \Lambda_k^{-1}\Lambda_d^{*}\Lambda^{-1}\Lambda_d^{*}\Lambda_k^{-1}\right)Q'$, which is equal to the following:

$$D = Q\,\mathrm{diag}\left(\frac{(\lambda_j + k)^2 - (\lambda_j - d)^2}{\lambda_j(\lambda_j + k)^2}\right) Q'.$$

Now, the matrix $\Lambda^{-1} - \Lambda_k^{-1}\Lambda_d^{*}\Lambda^{-1}\Lambda_d^{*}\Lambda_k^{-1}$ is p.d. if $(\lambda_j + k)^2 - (\lambda_j - d)^2 > 0$, which is equivalent to $(\lambda_j + k + \lambda_j - d)(\lambda_j + k - \lambda_j + d) > 0$. Simplifying the last inequality, one gets $(d + k)(2\lambda_j + k - d) > 0$. Thus $D$ is p.d. and the proof is finished by Lemma 2.1.

2.2. New methods to choose k and d

A new iterative method to choose the values of the parameters $k$ and $d$ is proposed in this subsection. It is expected that the following method yields a much smaller MSE value for the estimator LLT. First, following Hoerl and Kennard (1970), differentiating Equation (2.5) with respect to the parameter $k$, it is easy to obtain the following equation:
$$\frac{\partial\, MSE(\hat{\beta}_{LLT})}{\partial k} = \sum_{j=1}^{p+1} \left[ -\frac{2(\lambda_j - d)^2}{\lambda_j(\lambda_j + k)^3} + \frac{2(d+k)(\lambda_j - d)\hat{\alpha}_j^2}{(\lambda_j + k)^3} \right]. \qquad (2.6)$$

Equating the $j$th summand of Equation (2.6) to zero gives

$$(\lambda_j - d)\left[\lambda_j \hat{\alpha}_j^2 (k + d) - (\lambda_j - d)\right] = 0, \quad j = 1, 2, \dots, p+1,$$

which can be simplified further so that the individual parameter $\hat{k}_j^{LT}$ can be obtained as

$$\hat{k}_j^{LT} = \frac{\lambda_j - d\left(1 + \lambda_j \hat{\alpha}_j^2\right)}{\lambda_j \hat{\alpha}_j^2}, \quad j = 1, 2, \dots, p+1. \qquad (2.7)$$

Since each term in Equation (2.7) should be positive, the numerator must satisfy $\lambda_j - d(1 + \lambda_j \hat{\alpha}_j^2) > 0$. Hence

$$d < \min_j \frac{\lambda_j}{1 + \lambda_j \hat{\alpha}_j^2} \qquad (2.8)$$

needs to be satisfied. After obtaining an initial value for the parameter $d$, $k$ is estimated using the following new estimators. Following Kibria (2003), we propose to estimate $k$ by using the arithmetic mean, geometric mean and median functions as follows:

$$\hat{k}_{AM} = \frac{1}{p+1} \sum_{j=1}^{p+1} \frac{\lambda_j - d\left(1 + \lambda_j \hat{\alpha}_j^2\right)}{\lambda_j \hat{\alpha}_j^2} \qquad (2.9)$$
which is the arithmetic mean of $\hat{k}_j^{LT}$;

$$\hat{k}_{GM} = \left[ \prod_{j=1}^{p+1} \frac{\lambda_j - d\left(1 + \lambda_j \hat{\alpha}_j^2\right)}{\lambda_j \hat{\alpha}_j^2} \right]^{1/(p+1)} \qquad (2.10)$$

which is the geometric mean of $\hat{k}_j^{LT}$; and

$$\hat{k}_{MED} = \mathrm{median}_j\left( \frac{\lambda_j - d\left(1 + \lambda_j \hat{\alpha}_j^2\right)}{\lambda_j \hat{\alpha}_j^2} \right) \qquad (2.11)$$

which is the median of $\hat{k}_j^{LT}$. In addition, following Alkhamisi et al. (2006), we propose to use the maximum and minimum functions to obtain the following new estimators:

$$\hat{k}_{MAX} = \max_j\left( \frac{\lambda_j - d\left(1 + \lambda_j \hat{\alpha}_j^2\right)}{\lambda_j \hat{\alpha}_j^2} \right) \qquad (2.12)$$

and

$$\hat{k}_{MIN} = \min_j\left( \frac{\lambda_j - d\left(1 + \lambda_j \hat{\alpha}_j^2\right)}{\lambda_j \hat{\alpha}_j^2} \right). \qquad (2.13)$$

Following Hoerl et al. (1975), we suggest using the harmonic mean of $\hat{k}_j^{LT}$ and obtain the following new estimator:

$$\hat{k}_{HM} = \frac{p+1}{\displaystyle\sum_{j=1}^{p+1} \frac{\lambda_j \hat{\alpha}_j^2}{\lambda_j - d\left(1 + \lambda_j \hat{\alpha}_j^2\right)}}. \qquad (2.14)$$
Finally, we suggest two new estimators by using the maximum and minimum eigenvalues and canonical coefficients as follows:

$$\hat{k}_{min} = \frac{\lambda_{min} - d\left(1 + \lambda_{min} \hat{\alpha}_{min}^2\right)}{\lambda_{min} \hat{\alpha}_{min}^2}, \qquad (2.15)$$

$$\hat{k}_{max} = \frac{\lambda_{max} - d\left(1 + \lambda_{max} \hat{\alpha}_{max}^2\right)}{\lambda_{max} \hat{\alpha}_{max}^2}, \qquad (2.16)$$

where $\lambda_{max}$ and $\lambda_{min}$ are the maximum and minimum eigenvalues of $S$ and $\hat{\alpha}_{max}$ and $\hat{\alpha}_{min}$ are the maximum and minimum elements of $\hat{\alpha}$.
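The whole choice rule of Subsection 2.2 can be sketched in code. The snippet below is a minimal illustration with assumed toy inputs (`lams` for the eigenvalues $\lambda_j$, `alphas` for the canonical coefficients $\hat{\alpha}_j$; all names are ours): it computes the individual $\hat{k}_j^{LT}$ of Equation (2.7), numerically confirms that each one zeroes the corresponding summand of the derivative (2.6), and forms the summary estimators (2.9)-(2.16).

```python
import numpy as np

def individual_ks(lams, alphas, d):
    """Individual shrinkage parameters k_j of Equation (2.7)."""
    return (lams - d * (1.0 + lams * alphas ** 2)) / (lams * alphas ** 2)

def summary_ks(lams, alphas, d):
    """Summary estimators (2.9)-(2.16) built from the k_j (names assumed)."""
    k_j = individual_ks(lams, alphas, d)
    one = lambda lam, a: (lam - d * (1.0 + lam * a ** 2)) / (lam * a ** 2)
    return {
        "AM": k_j.mean(),                        # (2.9)  arithmetic mean
        "GM": np.exp(np.mean(np.log(k_j))),      # (2.10) geometric mean, needs k_j > 0
        "MED": np.median(k_j),                   # (2.11) median
        "MAX": k_j.max(),                        # (2.12) maximum
        "MIN": k_j.min(),                        # (2.13) minimum
        "HM": k_j.size / np.sum(1.0 / k_j),      # (2.14) harmonic mean
        "k_min": one(lams.min(), alphas.min()),  # (2.15) smallest eigenvalue / element
        "k_max": one(lams.max(), alphas.max()),  # (2.16) largest eigenvalue / element
    }

def mse_term(k, d, lam, alpha):
    """One summand of the scalar MSE in Equation (2.5)."""
    return (lam - d) ** 2 / (lam * (lam + k) ** 2) \
        + (d + k) ** 2 * alpha ** 2 / (lam + k) ** 2

# Assumed toy spectrum; d = 0.05 respects the bound (2.8) for these values.
lams = np.array([5.0, 2.0, 0.5, 0.1])
alphas = np.array([0.3, 0.5, 0.8, 0.2])
d = 0.05
k_j = individual_ks(lams, alphas, d)

# Each k_j should zero the derivative of its own MSE summand (finite differences).
h = 1e-6
for kj, lam, a in zip(k_j, lams, alphas):
    deriv = (mse_term(kj + h, d, lam, a) - mse_term(kj - h, d, lam, a)) / (2 * h)
    assert abs(deriv) < 1e-4
```

Because the $\hat{k}_j^{LT}$ are positive under condition (2.8), the usual mean inequalities give $\hat{k}_{MIN} \leq \hat{k}_{HM} \leq \hat{k}_{GM} \leq \hat{k}_{AM} \leq \hat{k}_{MAX}$, another quick consistency check.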
3. Monte Carlo Simulation Study

3.1. Design of the simulation

The design of the Monte Carlo simulation, which is conducted to compare the performances of LLT and MLE, is given in this section. The two criteria used to judge the performances are the MSE and the MAE, computed using Equations (3.1) and (3.2) respectively:

$$MSE(\hat{\beta}) = \frac{\sum_{r=1}^{5000} (\hat{\beta}_r - \beta)'(\hat{\beta}_r - \beta)}{5000}, \qquad (3.1)$$

$$MAE(\hat{\beta}) = \frac{\sum_{r=1}^{5000} \lVert \hat{\beta}_r - \beta \rVert}{5000}, \qquad (3.2)$$

where $\hat{\beta}_r$ is the estimate ($\hat{\beta}_{LLT}$ or $\hat{\beta}_{MLE}$) obtained in the $r$th replication and $\lVert \cdot \rVert$ is the usual Euclidean distance. Let $\hat{\alpha} = Q'\hat{\beta}_{MLE}$ and $\hat{\alpha}_j$ be the $j$th element of $\hat{\alpha}$.

The effective factors in the simulation are chosen to be the degree of correlation $\rho^2$ among the explanatory variables, the sample size $n$ and the number of explanatory variables $p$. Following McDonald and Galarneau (1975), Kibria (2003), Muniz and Kibria (2009) and Asar et al. (2014), the explanatory variables are generated by using the equation

$$x_{ij} = (1 - \rho^2)^{1/2} z_{ij} + \rho z_{ip}, \qquad (3.3)$$

where $i = 1, 2, \dots, n$, $j = 1, 2, \dots, p$, $\rho^2$ represents the correlation between the explanatory variables and the $z_{ij}$'s are independent random numbers obtained from the standard normal distribution. The observations on the dependent variable are obtained from the Bernoulli distribution $Be(P_i)$, where

$$P_i = \frac{\exp(x_i \beta)}{1 + \exp(x_i \beta)}, \qquad (3.4)$$

such that $x_i$ is the $i$th row of the matrix $X$. The parameter values of the coefficients are chosen so that $\beta'\beta = 1$, following Newhouse and Oman (1971).
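The data-generating scheme of Equations (3.3) and (3.4) can be sketched as follows. This is an illustration with names of our choosing; following the McDonald and Galarneau (1975) construction, a shared extra column $z_{i,p+1}$ is assumed as the common component that induces the pairwise correlation $\rho^2$:

```python
import numpy as np

def make_collinear_data(n, p, rho2, rng):
    """Generate collinear explanatory variables via Equation (3.3) and
    Bernoulli responses via Equation (3.4)."""
    z = rng.standard_normal((n, p + 1))
    # x_ij = sqrt(1 - rho^2) z_ij + rho z_shared; pairwise corr(x_j, x_k) = rho2
    X = np.sqrt(1.0 - rho2) * z[:, :p] + np.sqrt(rho2) * z[:, [p]]
    beta = np.ones(p) / np.sqrt(p)            # beta'beta = 1 (Newhouse-Oman choice)
    prob = 1.0 / (1.0 + np.exp(-(X @ beta)))  # Equation (3.4)
    y = (rng.random(n) < prob).astype(float)  # Bernoulli draws
    return X, y
```

For a large sample the empirical correlation between any two generated columns should be close to the chosen $\rho^2$.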
The sample size $n$ is chosen to be 50, 100 and 200, the number of explanatory variables $p$ is either 4 or 8, and the investigated degrees of correlation are 0.90, 0.95 and 0.99. Using the shrinkage parameter $d$ computed via (2.8) and the newly proposed estimators of $k$ defined in Subsection 2.2, one can determine which of the estimators performs best for each combination of $(n, p, \rho^2)$. Matlab R2013a is used to write the codes, and the convergence tolerance of the IWLS algorithm is set to $10^{-7}$.

3.2. Results and Discussion

In this subsection, the results of the Monte Carlo simulation are presented. The estimated MSE and MAE values of LLT and MLE are reported in Tables 3.1-3.4. According to the tables, all proposed methods perform better than MLE, in the sense that LLT with each of the new methods has a smaller MSE value than MLE. If the sample size is increased, the MSE values of all estimators decrease; in other words, increasing the sample size has a positive effect on the performance of every estimator, including MLE. Moreover, if the degree of correlation is increased, the MSE values of the estimators increase, except for LLT with the estimators $\hat{k}_{AM}$ and $\hat{k}_{MAX}$; the correlation has a positive effect on these two estimators. This result can be observed in Figure 3.1.

According to the tables, the new estimators can be categorized into two groups. $\hat{k}_{AM}$, $\hat{k}_{GM}$, $\hat{k}_{MED}$ and $\hat{k}_{MAX}$ form one group, for which an increase in the degree of correlation does not substantially affect performance. The other estimators form the second group, which is affected negatively by an increase in the degree of correlation. This result can be seen by comparing Figures 3.1 and 3.2. Moreover, the estimators considered in Figure 3.1 perform better than the others, considered in Figure 3.2. One can also conclude from Figure 3.2 that the MSE of MLE is inflated, especially when the sample size is low and the degree of correlation is high. Similarly, the MSE values of the estimators $\hat{k}_{MIN}$, $\hat{k}_{HM}$, $\hat{k}_{min}$ and $\hat{k}_{max}$ are inflated in the same situation; however, increasing the sample size has a great positive effect on these estimators. Furthermore, if the number of explanatory variables is increased, the MSE values of all estimators increase. Although the estimators shown in Figure 3.1 are affected only slightly by the increase in the number of independent variables, the MSE values of the others are inflated by it.

According to the tables, it may be concluded that $\hat{k}_{GM}$ has the best performance when $\rho^2 = 0.90$ and $\rho^2 = 0.95$, while $\hat{k}_{AM}$ is the best estimator when $\rho^2 = 0.99$. One obtains similar conclusions when MAE is used as the performance criterion; the only difference is that the MAE values are considerably smaller than the MSE values for the same situation.
4. Application

In this section, a data set taken from the Banks Association of Turkey is used, concerning the Asian financial crisis and its effects on the collapse of commercial banks operating in Turkey. This data set was also used by Inan and Erdogan (2013). During the period of collapse, the Savings Deposit Insurance Fund (SDIF) took over some of the banks. In the application, a binary logistic regression is used to model the dependent variable, namely whether or not a bank was taken over by SDIF. For the year 1999, a bank is coded as zero if it was taken over, and the other banks, successful during that period, are coded as one. The following financial ratios are used as independent variables: X1: (Shareholders' Equity + Total Income) / (Deposits + Non-deposit Funds), X2: Net Working Capital / Total Assets, X3: Non-performing Loans / Total Loans, X4: Liquid Assets / Total Assets, X5: Liquid Assets / (Deposits + Non-deposit Funds), X6: Interest Income / Interest Expenses, and X7: Non-interest Income / Non-interest Expenses.

The correlation matrix of the data is given in Table 4.1. According to Table 4.1, some of the bivariate correlations are high, e.g., 0.92 and 0.89. The eigenvalues of the matrix $X'X$ are 3.5658, 1.3654, 1.0109, 0.8126, 0.2022, 0.0364 and 0.0066. The condition number, a measure of the degree of collinearity, is computed as $\kappa = (\lambda_{max}/\lambda_{min})^{1/2} = 23.1578$, which shows that there is a moderate multicollinearity problem in this data set.

The estimated theoretical MSE values of the estimators are reported in Table 4.2. One can see from Table 4.2 that LLT used with each of the new methods has a smaller MSE value than MLE, whose MSE is inflated; $\hat{k}_{MED}$ has the lowest MSE value. Moreover, the coefficients, standard errors and corresponding p-values of the model are given in Table 4.3. According to Table 4.3, the standard errors of the coefficients for LLT with the new methods are clearly smaller than those of MLE; in particular, the standard errors for $\hat{k}_{MAX}$ are the lowest. Therefore, it can be concluded that LLT with the new methods is more stable than MLE.

Finally, a plot of the MSE values versus changing values of the parameter $k$, for $0 < k < 1$, is given in Figure 4.1. According to Figure 4.1, the MSE of LLT decreases for $k < 0.064$, increases otherwise, and is always less than that of MLE.
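The reported condition number can be verified from the eigenvalues quoted above, assuming it is computed as the square root of the ratio of the largest to the smallest eigenvalue of $X'X$ (the rounded eigenvalues give a value close to, but not exactly, the reported 23.1578):

```python
import numpy as np

# Eigenvalues of X'X for the bank data, as reported in the application section.
eigs = np.array([3.5658, 1.3654, 1.0109, 0.8126, 0.2022, 0.0364, 0.0066])
kappa = np.sqrt(eigs.max() / eigs.min())   # condition number sqrt(l_max / l_min)
print(round(kappa, 2))  # about 23.24, matching the reported 23.1578 up to rounding
```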
5. Conclusion

In this paper, the logistic version of the Liu-type estimator is considered to overcome the multicollinearity problem in binary logistic regression. Some new methods to obtain the shrinkage parameters are proposed, using the arithmetic mean, geometric mean, harmonic mean, median, maximum and minimum functions. A Monte Carlo simulation is designed to evaluate the performances of the estimators using the MSE and MAE. According to the simulation study, the newly proposed methods perform better than MLE. $\hat{k}_{MED}$, $\hat{k}_{AM}$ and $\hat{k}_{GM}$ have the smallest MSE values among the estimators, and increasing the degree of correlation does not seriously affect the estimators in this first group; thus they are recommended to researchers when the sample size is low and the degree of correlation is high. Moreover, a real data application is presented to give a better understanding of the new methods. Its results are consistent with those of the simulation, showing that the new methods are effective in the presence of multicollinearity.

Acknowledgments: The author wishes to thank the referees and the editor for their helpful suggestions and comments, which helped to improve the quality of the paper.
6. References

Alkhamisi, M., Khalaf, G., and Shukur, G. (2006). Some modifications for choosing ridge parameters. Communications in Statistics - Theory and Methods, 35(11), 2005-2020.

Asar, Y., Karaibrahimoğlu, A., and Genç, A. (2014). Modified ridge regression parameters: A comparative Monte Carlo study. Hacettepe Journal of Mathematics and Statistics, 43(5), 827-841.

Hoerl, A. E., and Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55-67.

Hoerl, A. E., Kennard, R. W., and Baldwin, K. F. (1975). Ridge regression: Some simulations. Communications in Statistics - Theory and Methods, 4(2), 105-123.

Hosmer, D. W., Lemeshow, S., and Sturdivant, R. X. (2013). Applied Logistic Regression (Vol. 398). John Wiley & Sons.

Inan, D., and Erdogan, B. E. (2013). Liu-type logistic estimator. Communications in Statistics - Simulation and Computation, 42(7), 1578-1586.

Kibria, B. M. G. (2003). Performance of some new ridge regression estimators. Communications in Statistics - Simulation and Computation, 32(2), 419-435.

Kibria, B. M. G., Mansson, K., and Shukur, G. (2012). Performance of some logistic ridge regression estimators. Computational Economics, 40(4), 401-414.

Liu, K. (2003). Using Liu-type estimator to combat collinearity. Communications in Statistics - Theory and Methods, 32(5), 1009-1020.

Månsson, K., and Shukur, G. (2011). On ridge parameters in logistic regression. Communications in Statistics - Theory and Methods, 40(18), 3366-3381.

Mansson, K., Shukur, G., and Kibria, B. M. G. (2010). On some ridge regression estimators: A Monte Carlo simulation study under different error variances. Journal of Statistics, 17(1), 1-22.

McDonald, G. C., and Galarneau, D. I. (1975). A Monte Carlo evaluation of some ridge-type estimators. Journal of the American Statistical Association, 70(350), 407-416.

Muniz, G., and Kibria, B. M. G. (2009). On some ridge regression estimators: An empirical comparisons. Communications in Statistics - Simulation and Computation, 38(3), 621-630.

Muniz, G., Kibria, B. M. G., Mansson, K., and Shukur, G. (2012). On developing ridge regression parameters: A graphical investigation. Sort - Statistics and Operations Research Transactions, 36(2), 115-138.

Newhouse, J. P., and Oman, S. D. (1971). An evaluation of ridge estimators. Rand Corporation (P-716-PR), 1-16.

Schaefer, R. L., Roi, L. D., and Wolfe, R. A. (1984). A ridge logistic estimator. Communications in Statistics - Theory and Methods, 13(1), 99-113.

Theobald, C. (1974). Generalizations of mean square error applied to ridge regression. Journal of the Royal Statistical Society, Series B (Methodological), 103-106.

Trenkler, G., and Toutenburg, H. (1990). Mean squared error matrix comparisons between biased estimators - an overview of recent results. Statistical Papers, 31(1), 165-179.
Table 3.1. MSE values of the estimators when p = 4

          ρ² = 0.90                 ρ² = 0.95                 ρ² = 0.99
  n     50      100     200       50      100     200       50       100      200
k̂_AM   0.4450  0.3711  0.3014    0.4528  0.3512  0.2529    0.3633   0.2728   0.2587
k̂_GM   0.3719  0.1958  0.1403    0.4826  0.2847  0.1516    0.5066   0.3342   0.2687
k̂_MED  0.4525  0.2370  0.1394    0.5430  0.2859  0.1433    0.6656   0.3342   0.2294
k̂_MAX  0.6214  0.5576  0.4471    0.5784  0.4517  0.3594    0.4865   0.3595   0.3262
k̂_MIN  2.7500  0.4930  0.2220    3.8414  1.0565  0.3903   14.6935   4.1562   2.2357
k̂_HM   1.4720  0.3012  0.1566    2.0150  0.5892  0.2304    6.7202   1.8670   1.0406
k̂_max  1.5709  0.4201  0.2004    2.0865  0.6519  0.3703    8.0360   1.8541   2.1851
k̂_min  1.8603  0.4168  0.2232    2.4602  0.8543  0.3559    8.0166   3.0219   0.4590
MLE    5.1524  1.0366  0.4344    7.3625  2.3202  0.9001   31.8218  10.1260   5.2321
Table 3.2. MSE values of the estimators when p = 8

           ρ² = 0.90                  ρ² = 0.95                  ρ² = 0.99
  n      50        100     200      50        100     200      50         100      200
k̂_AM     0.6657    0.5403  0.4697    0.6200    0.4565  0.4381     0.5362   0.3702  0.3396
k̂_GM     0.4432    0.2861  0.2111    0.4950    0.2990  0.2847     0.9000   0.6993  0.4989
k̂_MED    0.4600    0.3004  0.2409    0.5165    0.3233  0.3555     1.0053   0.7764  0.5465
k̂_MAX    0.8291    0.7583  0.7077    0.7867    0.6718  0.6482     0.7042   0.5463  0.4756
k̂_MIN  121.1414    1.5931  0.8016  247.6495    4.6483  1.4755   631.5407  36.2478  8.0369
k̂_HM    30.8679    0.6941  0.4149   61.7599    1.7583  0.7045   194.1220  13.7543  2.8939
k̂_max   62.8914    1.2029  0.6667  129.2836    2.9743  1.2956   323.6374  22.8857  5.9353
k̂_min   56.4873    0.9990  0.5979  115.2998    2.7501  0.9803   324.4524  21.1242  4.2694
MLE    234.3063    3.1156  1.4587  491.0497    8.7549  2.9104  1081.9348  63.7842 17.4108
Table 3.3. MAE values of the estimators when p = 4

          ρ² = 0.90                 ρ² = 0.95                 ρ² = 0.99
  n     50      100     200       50      100     200       50      100     200
k̂_AM   0.6401  0.5700  0.4962    0.6555  0.5584  0.4561    0.5760  0.4905  0.4706
k̂_GM   0.5806  0.4221  0.3561    0.6596  0.5029  0.3656    0.6490  0.5267  0.4745
k̂_MED  0.6308  0.4622  0.3560    0.6828  0.4994  0.3555    0.6881  0.5095  0.4415
k̂_MAX  0.7755  0.7247  0.6212    0.7471  0.6355  0.5464    0.6769  0.5729  0.5262
k̂_MIN  1.2763  0.6143  0.4277    1.5067  0.8649  0.5371    2.5165  1.6344  1.2183
k̂_HM   0.9543  0.4890  0.3655    1.1033  0.6621  0.4246    1.6331  1.0653  0.8218
k̂_max  1.0061  0.5850  0.4093    1.1036  0.7124  0.5376    1.7716  1.0901  1.2047
k̂_min  1.0453  0.5752  0.4300    1.1834  0.7787  0.5380    1.6971  1.2645  0.5826
MLE    1.8801  0.9256  0.6032    2.2780  1.3644  0.8651    4.4159  2.8564  2.0609
Table 3.4. MAE values of the estimators when p = 8

           ρ² = 0.90                 ρ² = 0.95                 ρ² = 0.99
  n      50       100     200      50       100     200      50       100     200
k̂_AM    0.8023   0.7104  0.6518    0.7719  0.6477  0.6307    0.7080  0.5777  0.5465
k̂_GM    0.6549   0.5234  0.4496    0.6842  0.5266  0.5122    0.8584  0.7449  0.6347
k̂_MED   0.6643   0.5327  0.4755    0.6904  0.5446  0.5709    0.8550  0.7453  0.6613
k̂_MAX   0.9039   0.8583  0.8228    0.8773  0.7999  0.7820    0.8209  0.7096  0.6520
k̂_MIN   7.2684   1.1006  0.8282   10.5718  1.5676  1.1162   15.9401  4.8263  2.4800
k̂_HM    3.6646   0.7394  0.6022    5.2347  0.9661  0.7714    8.6724  2.8663  1.4605
k̂_max   4.3433   0.9712  0.7643    6.2724  1.3301  1.0515    9.8672  3.8417  2.0498
k̂_min   4.0479   0.8624  0.7092    5.8160  1.1377  0.8927    9.5576  3.2132  1.7054
MLE    10.3230   1.6231  1.1490   15.2626  2.3946  1.6221   22.1051  6.8895  3.8649
Table 4.1. The correlation matrix of the data

        X1       X2       X3       X4       X5       X6       X7
X1   1.0000   0.9275  -0.6491   0.2531   0.5794   0.6134   0.3662
X2   0.9275   1.0000  -0.4592   0.2078   0.5611   0.5605   0.4568
X3  -0.6491  -0.4592   1.0000  -0.3151  -0.3453  -0.2107   0.0174
X4   0.2531   0.2078  -0.3151   1.0000   0.8997   0.1916  -0.0216
X5   0.5794   0.5611  -0.3453   0.8997   1.0000   0.4696   0.1926
X6   0.6134   0.5605  -0.2107   0.1916   0.4696   1.0000  -0.0308
X7   0.3662   0.4568   0.0174  -0.0216   0.1926  -0.0308   1.0000
Table 4.2. The estimated theoretical MSE values of LLT used with the new methods and MLE

 k̂_AM      k̂_GM      k̂_MED     k̂_MAX     k̂_MIN     k̂_HM      k̂_max     k̂_min     MLE
126.2055  106.4806  105.4388  137.3958  132.4168  107.4378  132.4168  105.6105  1905.236
Table 4.3. The coefficients, standard errors and corresponding p-values of the model

Coefficients
         k̂_AM     k̂_GM     k̂_MED    k̂_MAX    k̂_MIN    k̂_HM     k̂_max    k̂_min    MLE
beta1   0.6762   2.3001   2.5540   0.1315   3.1676   3.0838   3.1676   2.4949  -1.6266
beta2   0.6394   2.2933   2.5895   0.1232   4.2304   3.3420   4.2304   2.5186   6.5176
beta3  -0.3125  -1.2300  -1.4214  -0.0586  -3.0354  -1.9930  -3.0354  -1.3743  -6.2600
beta4   0.2557   0.1933   0.1325   0.0694  -0.2162  -0.0361  -0.2162   0.1476  -3.9726
beta5   0.4872   1.0523   1.0959   0.1122   1.1962   1.1632   1.1962   1.0865   5.3006
beta6   0.3668   0.9529   1.0148   0.0786   1.2461   1.1244   1.2461   1.0010   1.9241
beta7   0.3437   1.2875   1.4438   0.0632   2.1603   1.7988   2.1603   1.4071   2.9381

Standard errors
         k̂_AM     k̂_GM     k̂_MED    k̂_MAX    k̂_MIN    k̂_HM     k̂_max    k̂_min    MLE
beta1   0.0420   0.1568   0.1795   0.0080   0.4568   0.2503   0.4568   0.1739   2.5793
beta2   0.0408   0.1719   0.2025   0.0076   0.5292   0.3025   0.5292   0.1948   1.9947
beta3   0.0244   0.1262   0.1570   0.0043   0.6404   0.2801   0.6404   0.1490   1.9720
beta4   0.0623   0.1637   0.1774   0.0137   0.2718   0.2108   0.2718   0.1742   3.5039
beta5   0.0572   0.1372   0.1464   0.0128   0.2361   0.1693   0.2361   0.1443   4.0708
beta6   0.0619   0.2147   0.2438   0.0122   0.4580   0.3280   0.4580   0.2367   1.0607
beta7   0.0580   0.2209   0.2535   0.0109   0.4825   0.3467   0.4825   0.2456   0.9318

p-values
         k̂_AM     k̂_GM     k̂_MED    k̂_MAX    k̂_MIN    k̂_HM     k̂_max    k̂_min    MLE
beta1   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.2662
beta2   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0012
beta3   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0016
beta4   0.0001   0.1229   0.2301   0.0000   0.2158   0.4325   0.2158   0.2013   0.1323
beta5   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.1007
beta6   0.0000   0.0000   0.0001   0.0000   0.0050   0.0008   0.0050   0.0001   0.0391
beta7   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0017
Figure 3.1. The estimated MSE values of the estimators when p = 4

Figure 3.2. The estimated MSE values of the estimators when p = 4

Figure 4.1. MSE values of LLT and MLE versus the parameter k