Statistics and Probability Letters 82 (2012) 1755–1760
Testing the homogeneity of inverse Gaussian scale-like parameters

Ming Chang a,∗, Xuqun You a, Muqing Wen b

a College of Psychology, Shaanxi Normal University, Xian, 710062, China
b College of Finance and Economics, North-West University of Politics and Law, Xian, 710061, China
Article info

Article history: Received 28 September 2011; received in revised form 3 February 2012; accepted 17 May 2012; available online 4 June 2012.

Keywords: Inverse Gaussian populations; Scale-like parameters; Generalized p-value

Abstract

A test for the homogeneity of normal variances was proposed by Liu and Xu [Liu, X.H., Xu, X.Z., 2010. A generalized p-value approach for testing the homogeneity of variances. Statistics and Probability Letters 80, 1486–1491]. This article develops a parallel test for the homogeneity of inverse Gaussian scale-like parameters. The proposed test is proved to have the exact frequentist property. The proposed method is compared numerically with the existing method with respect to size and power under different scenarios. The simulation results show that the proposed approach achieves satisfactory sizes and powers. © 2012 Elsevier B.V. All rights reserved.
1. Introduction

The density function of the two-parameter inverse Gaussian (IG) distribution IG(µ, λ) is defined as

$$f(x; \mu, \lambda) = \left(\frac{\lambda}{2\pi x^{3}}\right)^{1/2} \exp\left\{-\frac{\lambda (x-\mu)^{2}}{2\mu^{2} x}\right\}, \qquad x > 0,\ \mu, \lambda > 0, \tag{1.1}$$

where µ is the mean parameter and λ is the scale-like parameter. The IG distribution is increasingly used to describe and analyze right-skewed data. Durham and Padgett (1997) used IG models to develop a new general method based on cumulative damage for describing the failure of a system, and Doksum and Hóyland (1992) developed a model for variable-stress accelerated life testing experiments based on the IG distribution. Seshadri (1993, 1999) discussed further applications of the IG distribution, for example in life testing. Many inference procedures for the IG distribution are based on t, F and $\chi^2$ distributions, as for the Gaussian distribution. We refer readers to Chhikara and Folks (1989), Seshadri (1993, 1999) and Mudholkar and Natarajan (2002) for more details of the Gaussian and IG analogies.

In many statistical applications, a test of the equality of IG scale-like parameters is of interest, that is,

$$H_0: \lambda_1 = \lambda_2 = \cdots = \lambda_k \quad \text{vs.} \quad H_1: \text{not all } \lambda_i \text{ are equal}. \tag{1.2}$$
For example, when testing for homogeneity of inverse Gaussian means, homogeneity of the scale-like parameters λ is an important assumption in the analysis of reciprocals (Chhikara and Folks, 1989). The sampling distributions of the rescaled maximum likelihood estimators (MLEs) of $\lambda^{-1}$ and $\sigma^2$ for the inverse Gaussian and normal populations, respectively, are both chi-square with n − 1 degrees of freedom. Interestingly, inference procedures concerning λ are remarkably similar to those concerning the normal scale parameter.

∗ Corresponding author: M. Chang.
0167-7152/$ – see front matter © 2012 Elsevier B.V. All rights reserved. doi:10.1016/j.spl.2012.05.013

Let $X_{i1}, X_{i2}, \ldots, X_{in_i}$ be a random sample from an IG($\mu_i$, $\lambda_i$) population for i = 1, …, k. Denote

$$V_i = \sum_{j=1}^{n_i} \left(X_{ij}^{-1} - \bar{X}_i^{-1}\right), \qquad N = \sum_{i=1}^{k} (n_i - 1) \tag{1.3}$$

and

$$\tilde{V} = \sum_{i=1}^{k} V_i, \tag{1.4}$$
where $\bar{X}_i$ is the ith sample mean. For testing the homogeneity of k IG scale-like parameters, Chhikara and Folks (1989) developed a modified likelihood ratio (MLR) test, which is analogous to the approximate test presented by Bartlett and Kendall (1946) for normal theory. The test statistic is M/C, where

$$C = 1 + \frac{1}{3(k-1)} \left[\sum_{i=1}^{k} \frac{1}{n_i - 1} - \frac{1}{\sum_{i=1}^{k} (n_i - 1)}\right]$$

and

$$M = N \ln\frac{\tilde{V}}{N} - \sum_{i=1}^{k} (n_i - 1) \ln\frac{V_i}{n_i - 1}.$$
Under H0, M/C is distributed approximately as $\chi^2_{k-1}$, and the level α rejection region for H0 is given by

$$M/C > \chi^2_{k-1,1-\alpha}, \tag{1.5}$$

where $\chi^2_{k-1,1-\alpha}$ is the 100(1 − α) percentage point of the $\chi^2$ distribution with k − 1 degrees of freedom.

The MLR test for the homogeneity of IG scale-like parameters is based on large-sample approximations whose reliability can be quite poor; in particular, little is known about their validity in small samples. Therefore, it is of practical and theoretical importance to develop a test which does not depend on large-sample approximations. This paper fills this gap by developing an approach using the concepts of generalized test variables and generalized p-values. The generalized p-value was introduced by Tsui and Weerahandi (1989), and the generalized confidence interval by Weerahandi (1993). The concepts of generalized test variables and generalized pivot quantities have been widely applied to a variety of practical settings where standard inference methods do not exist. See, for example, Zhou and Mathew (1994), Tian (2006) and Li (2009). A generalized test variable for the homogeneity of normal variances was proposed by Liu and Xu (2010). In this paper, a parallel generalized p-value procedure for testing the homogeneity of IG scale-like parameters is proposed. The proposed procedure is based on an exact probability statement, as required in the context of generalized inference, and it has the exact frequentist property. Simulation results also show that the proposed approach has satisfactory sizes and powers.

This article is organized as follows. Section 2 reviews the concept of the generalized p-value. In Section 3, a test based on the generalized test variable for testing homogeneity of IG scale-like parameters is presented, and the exact frequentist property of the proposed test is proved. In Section 4, simulation results on the sizes and powers are presented. Concluding remarks are summarized in Section 5.
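As an illustration of the MLR statistic M/C in (1.5), the following is a minimal sketch using only the Python standard library. The function names `v_statistic` and `mlr_statistic` are hypothetical and chosen here for illustration; the sketch returns the statistic and its degrees of freedom and leaves the comparison with the chi-square critical value to the reader.

```python
import math


def v_statistic(x):
    """V_i of (1.3) for one sample: sum over j of (1/x_j - 1/x_bar)."""
    x_bar = sum(x) / len(x)
    return sum(1.0 / xj - 1.0 / x_bar for xj in x)


def mlr_statistic(n, V):
    """Return (M / C, k - 1) for sample sizes n_i and statistics V_i.

    Assumes every n_i >= 2 and every V_i > 0 (illustrative sketch only).
    """
    k = len(n)
    dof = [ni - 1 for ni in n]          # n_i - 1
    N = sum(dof)                        # N of (1.3)
    V_tilde = sum(V)                    # V-tilde of (1.4)
    # Bartlett-type correction factor C
    C = 1.0 + (sum(1.0 / d for d in dof) - 1.0 / N) / (3.0 * (k - 1))
    # M = N ln(V~/N) - sum (n_i - 1) ln(V_i / (n_i - 1))
    M = N * math.log(V_tilde / N) - sum(
        d * math.log(Vi / d) for d, Vi in zip(dof, V)
    )
    return M / C, k - 1


stat, df = mlr_statistic([5, 5, 5], [1.0, 4.0, 16.0])
print(stat, df)
```

H0 would be rejected at level α when the returned statistic exceeds the upper-α quantile of the chi-square distribution with the returned degrees of freedom.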
2. Generalized test variables and generalized p-values

The generalized p-value approach is valuable whenever a standard test statistic or standard pivot quantity either does not exist or is difficult to obtain by conventional methods. To illustrate these concepts, let X be a random variable whose cumulative distribution function is F(x; ξ), where ξ = (θ, η) is a vector of unknown parameters taking values in a parameter space Θ, θ is the parameter of interest and η is the nuisance parameter. Both θ and η may be vector-valued. Let Ω be the sample space and x be the observed value of X. Consider testing H0: θ ≤ θ0 versus H1: θ > θ0, where θ0 is a specified value. A generalized test variable T(X; x, θ, η) is then defined as one satisfying the following conditions:

(a) The distribution of T(X; x, θ, η) is free of the nuisance parameter η.
(b) The observed value of T(X; x, θ, η), i.e., T(x; x, θ, η), is free of η.
(c) Pr{T(X; x, θ, η) ≥ T(x; x, θ, η)} is nondecreasing in θ, for fixed x and η. (2.1)

If the distribution of T(X; x, θ, η) is stochastically increasing in θ, then the generalized p-value for testing the above hypothesis is defined as Pr(T(X; x, θ, η) ≥ t), where t = T(x; x, θ, η). If instead the distribution of T(X; x, θ, η) is stochastically decreasing in θ, then the generalized p-value is defined as Pr(T(X; x, θ, η) ≤ t). For further details on the concept of the generalized p-value, we refer readers to the book by Weerahandi (1995).
3. The generalized p-value approach

A generalized test variable for testing homogeneity of normal variances was proposed by Liu and Xu (2010). In this section, a parallel test for IG scale-like parameters is developed as follows. Let $X_{i1}, X_{i2}, \ldots, X_{in_i}$ be a random sample from an IG($\mu_i$, $\lambda_i$) population with $\bar{X}_i$ as the ith sample mean for i = 1, …, k. Let $\bar{x}_i$ denote the ith observed sample mean for i = 1, …, k. It is well known that

$$\bar{X}_i \sim IG(\mu_i, n_i \lambda_i), \qquad \lambda_i V_i \sim \chi^2_{n_i - 1}. \tag{3.1}$$
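The second relation in (3.1) can be checked by simulation. The sketch below assumes NumPy is available and uses its `Generator.wald` sampler, whose `mean` and `scale` arguments correspond to µ and λ of (1.1); the function name `simulate_lambda_v` and the chosen parameter values are illustrative only.

```python
import numpy as np


def simulate_lambda_v(mu, lam, n, reps, seed=0):
    """Draw `reps` samples of size n from IG(mu, lam) and return lam * V."""
    rng = np.random.default_rng(seed)
    x = rng.wald(mean=mu, scale=lam, size=(reps, n))
    # V = sum_j (1/x_j - 1/x_bar) = sum_j 1/x_j - n / x_bar
    v = np.sum(1.0 / x, axis=1) - n / np.mean(x, axis=1)
    return lam * v


draws = simulate_lambda_v(mu=1.0, lam=2.0, n=8, reps=20000)
# For a chi-square variable with n - 1 = 7 degrees of freedom the mean is 7
# and the variance is 14, so both printed values should be close to those.
print(draws.mean(), draws.var(ddof=1))
```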
Denote λ = (λ1, λ2, …, λk)′ and

$$H = \begin{pmatrix} 1 & 0 & \cdots & 0 & -1 \\ 0 & 1 & \cdots & 0 & -1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -1 \end{pmatrix}_{(k-1)\times k}. \tag{3.2}$$
Then the testing problem (1.2) is equivalent to testing

$$H_0: H\lambda = 0 \quad \text{vs.} \quad H_1: H\lambda \neq 0. \tag{3.3}$$
In this section, a new generalized p-value is developed for testing (3.3). According to Tian (2006), the generalized pivot quantity for $\lambda_i$ based on the ith sample is given by

$$T_{\lambda_i} = \frac{U_i}{v_i} = \frac{\lambda_i V_i}{v_i}, \tag{3.4}$$

where $v_i$ is the observed value of $V_i$, $U_i \sim \chi^2_{n_i-1}$, i = 1, 2, …, k, and the $U_i$ are mutually independent. Obviously, $T_{\lambda_i}$ coincides with the traditional pivot quantity for $\lambda_i$. Then the generalized pivot quantity for Hλ can be derived as

$$T_{H\lambda} = H (T_{\lambda_1}, T_{\lambda_2}, \ldots, T_{\lambda_k})'. \tag{3.5}$$
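Let $V = (V_1, V_2, \ldots, V_k)'$ and $\bar{X} = (\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_k)'$. From (3.5), we see that the conditional expectation of $T_{H\lambda}$ given $(\bar{X}, V) = (\bar{x}, v)$ is

$$\mu_T = E(T_{H\lambda} \mid (\bar{x}, v)) = H E(T_{\lambda} \mid (\bar{x}, v)) = H \left(E(T_{\lambda_1} \mid (\bar{x}, v)), \ldots, E(T_{\lambda_k} \mid (\bar{x}, v))\right)' \tag{3.6}$$

and the conditional covariance matrix of $T_{H\lambda}$ given $(\bar{X}, V) = (\bar{x}, v)$ is

$$\Sigma_T = \mathrm{Cov}(T_{H\lambda} \mid (\bar{x}, v)) = H \mathrm{Cov}(T_{\lambda} \mid (\bar{x}, v)) H' = H \,\mathrm{diag}\left(\mathrm{Cov}(T_{\lambda_1} \mid (\bar{x}, v)), \ldots, \mathrm{Cov}(T_{\lambda_k} \mid (\bar{x}, v))\right) H', \tag{3.7}$$

where $E(T_{\lambda_i} \mid (\bar{x}, v)) = \dfrac{n_i - 1}{v_i}$ and $\mathrm{Cov}(T_{\lambda_i} \mid (\bar{x}, v)) = \dfrac{2(n_i - 1)}{v_i^2}$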
for i = 1, 2, …, k, respectively.

Let T denote the standardized expression of $T_{H\lambda}$, with $T = \Sigma_T^{-1/2}(T_{H\lambda} - \mu_T)$, and t the observed value of T for $(\bar{X}, V) = (\bar{x}, v)$, where $\mu_T$ and $\Sigma_T$ are given by (3.6) and (3.7), respectively. Given $(\bar{x}, v)$, the distribution of T is free of any unknown parameter. Then $\|T\|^2 = (T_{H\lambda} - \mu_T)' \Sigma_T^{-1} (T_{H\lambda} - \mu_T)$ does not depend on any unknown parameter either, and the observed value $\|t\|^2$, under H0: Hλ = 0, is equal to $\mu_T' \Sigma_T^{-1} \mu_T$, which is a known constant free of any parameter. Therefore, $\|T\|^2$ is a generalized test variable satisfying conditions similar to (2.1). Then the generalized p-value can be obtained as

$$p(\bar{x}, v) = \Pr(\|T\|^2 \ge \|t\|^2 \mid H_0) = \Pr\left((T_{H\lambda} - \mu_T)' \Sigma_T^{-1} (T_{H\lambda} - \mu_T) \ge \mu_T' \Sigma_T^{-1} \mu_T\right), \tag{3.8}$$
and H0 will be rejected whenever $p(\bar{x}, v)$ is less than the level α. Furthermore, the rejection region can be derived as follows:

$$C_\alpha(x_{11}, \ldots, x_{kn_k}) = \{(x_{11}, \ldots, x_{kn_k}) : p(\bar{x}, v) \le \alpha\}. \tag{3.9}$$
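The observed value $\mu_T' \Sigma_T^{-1} \mu_T$ in (3.8) can be computed directly from (3.6) and (3.7). The sketch below, which assumes NumPy, does so and compares the result with a closed form obtained from the standard projection identity $H'(HDH')^{-1}H = D^{-1} - D^{-1}\mathbf{1}\mathbf{1}'D^{-1}/(\mathbf{1}'D^{-1}\mathbf{1})$ for diagonal D. That identity is not stated in the text and is used here only as a cross-check; all function names are hypothetical.

```python
import numpy as np


def contrast_matrix(k):
    """H of (3.2): rows e_i - e_k, so H @ lam == 0 iff all lam_i are equal."""
    return np.hstack([np.eye(k - 1), -np.ones((k - 1, 1))])


def observed_statistic(n, v):
    """mu_T' Sigma_T^{-1} mu_T built directly from (3.6) and (3.7)."""
    n = np.asarray(n, dtype=float)
    v = np.asarray(v, dtype=float)
    H = contrast_matrix(n.size)
    mu_T = H @ ((n - 1) / v)                       # (3.6)
    Sigma_T = H @ np.diag(2 * (n - 1) / v**2) @ H.T  # (3.7)
    return float(mu_T @ np.linalg.solve(Sigma_T, mu_T))


def observed_statistic_closed_form(n, v):
    """Same quantity without forming or inverting Sigma_T."""
    n = np.asarray(n, dtype=float)
    v = np.asarray(v, dtype=float)
    w = 2 * (n - 1)
    return float(np.sum((n - 1) / 2) - np.sum(v / 2) ** 2 / np.sum(v**2 / w))


n_obs, v_obs = [5, 8, 10], [12.0, 20.5, 9.3]   # made-up illustrative values
print(observed_statistic(n_obs, v_obs), observed_statistic_closed_form(n_obs, v_obs))
```

The two printed numbers should agree up to floating-point error, which also explains the form of the expressions that appear in the proof of Theorem 1.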
The following theorem shows that the new generalized p-value method has the exact frequentist property; i.e., the p-value statistic $p(\bar{X}, V)$ has a uniform distribution on (0, 1) under the null hypothesis H0.

Theorem 1. For the above testing problem, $\Pr(C_\alpha(X_{11}, \ldots, X_{kn_k}) \mid H_0) = \Pr_{H_0}(p(\bar{X}, V) \le \alpha) = \alpha$, where $p(\bar{X}, V)$ and $C_\alpha(X_{11}, \ldots, X_{kn_k})$ are defined through (3.8) and (3.9), respectively.
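Proof. Assume that $U_i^*$ and $V_i^*$ are independent copies of $U_i$ and $V_i$; then

$$
\begin{aligned}
\Pr\nolimits_{H_0}[p(\bar{X}, V) \le \alpha]
&= \Pr\nolimits_{H_0}\Big[\Pr\nolimits^{*}\big((T_{H\lambda} - \mu_T)' \Sigma_T^{-1} (T_{H\lambda} - \mu_T) \ge \mu_T' \Sigma_T^{-1} \mu_T\big) \le \alpha\Big] \\
&= \Pr\nolimits_{H_0}\Bigg[\Pr\nolimits^{*}\Bigg(\sum_{i=1}^{k} \frac{(U_i^* - (n_i - 1))^2}{2(n_i - 1)} - \frac{1}{\sum_{i=1}^{k} \frac{V_i^{*2}}{2(n_i - 1)}} \Bigg(\sum_{i=1}^{k} \frac{V_i^* (U_i^* - (n_i - 1))}{2(n_i - 1)}\Bigg)^{2} \\
&\qquad\qquad \ge \sum_{i=1}^{k} \frac{n_i - 1}{2} - \frac{1}{\sum_{i=1}^{k} \frac{V_i^{2}}{2(n_i - 1)}} \Bigg(\sum_{i=1}^{k} \frac{V_i}{2}\Bigg)^{2}\Bigg) \le \alpha\Bigg].
\end{aligned}
$$

Under the null hypothesis H0, $\mu_T' \Sigma_T^{-1} \mu_T = (H\lambda - \mu_T)' \Sigma_T^{-1} (H\lambda - \mu_T)$, so that

$$
\begin{aligned}
\Pr\nolimits_{H_0}[p(\bar{X}, V) \le \alpha]
&= \Pr\nolimits_{H_0}\Bigg[\Pr\nolimits^{*}\Bigg(\sum_{i=1}^{k} \frac{(U_i^* - (n_i - 1))^2}{2(n_i - 1)} - \frac{1}{\sum_{i=1}^{k} \frac{V_i^{*2}}{2(n_i - 1)}} \Bigg(\sum_{i=1}^{k} \frac{V_i^* (U_i^* - (n_i - 1))}{2(n_i - 1)}\Bigg)^{2} \\
&\qquad\qquad \ge \sum_{i=1}^{k} \frac{(\lambda_i V_i - (n_i - 1))^2}{2(n_i - 1)} - \frac{1}{\sum_{i=1}^{k} \frac{V_i^{2}}{2(n_i - 1)}} \Bigg(\sum_{i=1}^{k} \frac{V_i (\lambda_i V_i - (n_i - 1))}{2(n_i - 1)}\Bigg)^{2}\Bigg) \le \alpha\Bigg] \\
&= \Pr\nolimits_{H_0}\Bigg[\Pr\nolimits^{*}\Bigg(\sum_{i=1}^{k} \frac{(U_i^* - (n_i - 1))^2}{2(n_i - 1)} - \frac{1}{\sum_{i=1}^{k} \frac{V_i^{*2}}{2(n_i - 1)}} \Bigg(\sum_{i=1}^{k} \frac{V_i^* (U_i^* - (n_i - 1))}{2(n_i - 1)}\Bigg)^{2} \\
&\qquad\qquad \ge \sum_{i=1}^{k} \frac{(U_i - (n_i - 1))^2}{2(n_i - 1)} - \frac{1}{\sum_{i=1}^{k} \frac{V_i^{2}}{2(n_i - 1)}} \Bigg(\sum_{i=1}^{k} \frac{V_i (U_i - (n_i - 1))}{2(n_i - 1)}\Bigg)^{2}\Bigg) \le \alpha\Bigg].
\end{aligned}
$$

Obviously,

$$
\begin{aligned}
\Pr\nolimits^{*}\Bigg(&\sum_{i=1}^{k} \frac{(U_i^* - (n_i - 1))^2}{2(n_i - 1)} - \frac{1}{\sum_{i=1}^{k} \frac{V_i^{*2}}{2(n_i - 1)}} \Bigg(\sum_{i=1}^{k} \frac{V_i^* (U_i^* - (n_i - 1))}{2(n_i - 1)}\Bigg)^{2} \\
&\ge \sum_{i=1}^{k} \frac{(U_i - (n_i - 1))^2}{2(n_i - 1)} - \frac{1}{\sum_{i=1}^{k} \frac{V_i^{2}}{2(n_i - 1)}} \Bigg(\sum_{i=1}^{k} \frac{V_i (U_i - (n_i - 1))}{2(n_i - 1)}\Bigg)^{2}\Bigg) \sim U(0, 1).
\end{aligned}
$$

Then, $\Pr_{H_0}(p(\bar{X}, V) \le \alpha) = \alpha$ is proved.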
In practice, the generalized p-value is computed through the following algorithm.

Algorithm 1. For a given $(n_1, \ldots, n_k)$, $(\bar{x}_1, \ldots, \bar{x}_k)$ and $(v_1, \ldots, v_k)$:
For l = 1, …, L:
  Generate $U_i \sim \chi^2_{n_i-1}$, i = 1, …, k.
  Compute $T_l = T_{H\lambda}$.
(End l loop)
Compute $\hat{\mu}_T = \frac{1}{L} \sum_{l=1}^{L} T_l$ and $\hat{\Sigma}_T = \frac{1}{L-1} \sum_{l=1}^{L} (T_l - \hat{\mu}_T)(T_l - \hat{\mu}_T)'$.
Compute $\|\hat{T}\|_l^2 = (T_l - \hat{\mu}_T)' \hat{\Sigma}_T^{-1} (T_l - \hat{\mu}_T)$, l = 1, …, L, and $\hat{\mu}_T' \hat{\Sigma}_T^{-1} \hat{\mu}_T$.
Let $W_l = 1$ if $\|\hat{T}\|_l^2 \ge \hat{\mu}_T' \hat{\Sigma}_T^{-1} \hat{\mu}_T$, else $W_l = 0$.
Then $\frac{1}{L} \sum_{l=1}^{L} W_l$ is a simulated estimate of the generalized p-value in (3.8) for testing (3.3).
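The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration that assumes NumPy; the function name `generalized_p_value`, its signature and the example inputs are hypothetical. Note that the sample means do not enter the computation, since $T_{H\lambda}$ depends on the data only through $(v_1, \ldots, v_k)$.

```python
import numpy as np


def generalized_p_value(n, v, L=2000, rng=None):
    """Monte Carlo estimate of the generalized p-value (3.8) via Algorithm 1."""
    rng = np.random.default_rng() if rng is None else rng
    n = np.asarray(n, dtype=float)
    v = np.asarray(v, dtype=float)
    k = n.size
    H = np.hstack([np.eye(k - 1), -np.ones((k - 1, 1))])  # contrast matrix (3.2)

    # One row per replicate l: U_i ~ chi-square with n_i - 1 degrees of freedom
    U = rng.chisquare(n - 1, size=(L, k))
    T = (U / v) @ H.T                     # T_l = H (U_1/v_1, ..., U_k/v_k)'

    mu_hat = T.mean(axis=0)
    # np.cov uses divisor L - 1; atleast_2d handles the k = 2 case
    sigma_hat = np.atleast_2d(np.cov(T, rowvar=False))
    sigma_inv = np.linalg.inv(sigma_hat)

    centered = T - mu_hat
    norms = np.einsum("li,ij,lj->l", centered, sigma_inv, centered)
    threshold = mu_hat @ sigma_inv @ mu_hat
    return float(np.mean(norms >= threshold))   # (1/L) * sum of W_l


rng = np.random.default_rng(0)
# Made-up inputs: estimates (n_i - 1)/v_i of 0.1, 1 and 10 differ strongly
print(generalized_p_value([30, 30, 30], [290.0, 29.0, 2.9], rng=rng))
```

Vectorizing the loop over l is a design choice made for speed; an explicit loop that follows the algorithm line by line would give the same estimate in distribution.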
4. Numerical results

In this section, some simulated type I error probabilities and powers are given in Tables 1 and 2 for the test proposed in Section 3 and the modified likelihood ratio (MLR) test proposed by Chhikara and Folks (1989). In Tables 1 and 2, GP and MLR denote the Monte Carlo estimates of the probabilities using the proposed generalized p-value test and the modified likelihood ratio test, respectively. The type I error probability and power of the test based on (3.8) were computed as follows.

Algorithm 2.
For m = 1, …, M:
  For a given $(n_1, n_2, \ldots, n_k)$ and $(\mu, \lambda_1, \lambda_2, \ldots, \lambda_k)$, generate $(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k)$ and $(v_1, v_2, \ldots, v_k)$ according to (3.1).
  Use Algorithm 1 to compute the generalized p-value $p_m$.
(End m loop)
If the parameters $\lambda_i$ are chosen such that Hλ = 0, (number of $p_m$ < α)/M is a simulated estimate of the type I error probability; otherwise (number of $p_m$ < α)/M is a simulated estimate of the power.

For each setting of sample sizes $n = (n_1, n_2, \ldots, n_k)$, we take $\mu_i = 1$, L = 2000 and M = 5000. The nominal levels are set at α = 0.05 and α = 0.1, respectively. From the numerical results in Table 1, it appears that the test based on the generalized p-value in (3.8) has a type I error probability very close to the nominal level in all cases. However, the MLR test is more liberal, even for large samples. Table 2 shows that the powers of the proposed generalized p-value test are higher than those of the MLR test except when the $n_i$ are small. But when the $n_i$ are small, the MLR test, being an asymptotic test, is not suitable in practice. The exact frequentist property of the proposed test has also been proved. Therefore, the new generalized p-value approach performs better than the MLR test.

5. Concluding remarks

For the problem of testing the homogeneity of inverse Gaussian scale-like parameters, this article derives a new generalized test variable and gives the generalized p-value based on the test variable (the test is analogous to the method proposed by Liu and Xu (2010)). Although the literature abounds with examples of generalized p-value tests, no exact frequentist property has been proved for them. The only evidence of their acceptability for practical use comes from simulation studies, which suggest that almost all reported generalized p-value test methods have type I error probabilities close to the nominal significance level. For the problem of testing the homogeneity of IG scale-like parameters, however, the proposed generalized p-value approach has an exact frequentist property. Therefore, in addition to the simulation results in Section 4, this property also shows that the new method is worth using in practice.

Table 1
Simulated type I error (based on 5000 simulations).
Setting                                         GP (α = 0.05)  MLR (α = 0.05)  GP (α = 0.1)  MLR (α = 0.1)

λ = (0.3, 0.3, 0.3)
n = (5, 5, 5)                                   0.054          0.051           0.102         0.103
n = (5, 8, 10)                                  0.051          0.058           0.094         0.109
n = (10, 10, 10)                                0.046          0.068           0.099         0.117
n = (10, 15, 20)                                0.044          0.057           0.098         0.109
n = (30, 30, 30)                                0.047          0.069           0.088         0.122

λ = (0.5, 0.5, 0.5, 0.5, 0.5)
n = (5, 5, 5, 5, 5)                             0.056          0.050           0.106         0.104
n = (7, 8, 9, 10, 12)                           0.050          0.067           0.101         0.106
n = (10, 10, 10, 10, 10)                        0.046          0.055           0.089         0.121
n = (12, 15, 16, 18, 20)                        0.051          0.068           0.092         0.115
n = (20, 20, 30, 30, 30)                        0.047          0.079           0.097         0.124

λ = (1, 1, 1, 1, 1, 1, 1, 1)
n = (5, 5, 5, 5, 5, 5, 5, 5)                    0.059          0.051           0.112         0.103
n = (7, 8, 10, 12, 12, 14, 14, 15)              0.051          0.049           0.102         0.125
n = (10, 10, 10, 10, 10, 10, 10, 10)            0.046          0.062           0.091         0.118
n = (10, 12, 15, 15, 18, 18, 20, 20)            0.048          0.074           0.100         0.119
n = (20, 20, 30, 30, 30, 30, 40, 40)            0.049          0.070           0.087         0.136

λ = (1.2, 1.2, 1.2, 1.2, 1.2, 1.2, 1.2, 1.2, 1.2, 1.2)
n = (5, 5, 5, 5, 5, 5, 5, 5, 5, 5)              0.056          0.052           0.105         0.104
n = (7, 8, 8, 10, 10, 12, 12, 12, 15, 15)       0.059          0.057           0.107         0.138
n = (10, 10, 10, 10, 10, 10, 10, 10, 10, 10)    0.048          0.074           0.099         0.124
n = (10, 10, 12, 12, 15, 15, 18, 18, 20, 20)    0.047          0.069           0.098         0.135
n = (20, 20, 20, 30, 30, 30, 30, 40, 40, 40)    0.044          0.071           0.092         0.126
Table 2
Simulated powers (based on 5000 simulations).
Setting                                         GP (α = 0.05)  MLR (α = 0.05)  GP (α = 0.1)  MLR (α = 0.1)

λ = (0.3, 0.5, 0.8)
n = (5, 5, 5)                                   0.089          0.098           0.156         0.154
n = (5, 8, 10)                                  0.112          0.109           0.186         0.191
n = (10, 10, 10)                                0.133          0.125           0.209         0.197
n = (10, 15, 20)                                0.284          0.271           0.398         0.372
n = (30, 30, 30)                                0.366          0.349           0.498         0.476

λ = (0.3, 0.5, 0.7, 1)
n = (5, 5, 5, 5)                                0.079          0.081           0.162         0.163
n = (5, 8, 10, 12)                              0.091          0.089           0.182         0.175
n = (10, 10, 10, 10)                            0.122          0.117           0.231         0.218
n = (10, 15, 18, 20)                            0.208          0.204           0.298         0.292
n = (30, 30, 30, 30)                            0.448          0.434           0.587         0.566

λ = (0.5, 0.5, 1, 1, 1, 1.2, 1.2)
n = (5, 5, 5, 5, 5, 5, 5)                       0.071          0.078           0.140         0.158
n = (7, 8, 10, 12, 12, 14, 14)                  0.109          0.098           0.172         0.174
n = (10, 10, 10, 10, 10, 10, 10)                0.187          0.171           0.244         0.237
n = (10, 12, 15, 15, 18, 18, 18)                0.214          0.202           0.278         0.270
n = (20, 20, 30, 30, 30, 40, 40)                0.407          0.405           0.561         0.550

λ = (0.3, 0.3, 0.5, 0.5, 1, 1, 1, 1.2, 1.2, 1.2)
n = (5, 5, 5, 5, 5, 5, 5, 5, 5, 5)              0.082          0.086           0.144         0.152
n = (7, 8, 8, 10, 10, 12, 12, 12, 15, 15)       0.102          0.096           0.188         0.169
n = (10, 10, 10, 10, 10, 10, 10, 10, 10, 10)    0.184          0.175           0.225         0.213
n = (10, 10, 12, 12, 15, 15, 18, 18, 20, 20)    0.279          0.274           0.362         0.363
n = (20, 20, 20, 30, 30, 30, 30, 40, 40, 40)    0.387          0.362           0.499         0.481
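The Monte Carlo procedure of Algorithm 2, which underlies entries such as those in Tables 1 and 2, can be sketched as follows. This is an illustrative sketch assuming NumPy, with hypothetical function names and with much smaller M and L than the values used in the paper, so its output is not expected to reproduce the tabulated numbers. The $v_i$ are generated as $U_i/\lambda_i$ using (3.1); the sample means are not generated because the generalized p-value does not depend on them.

```python
import numpy as np


def generalized_p_value(n, v, L, rng):
    """Algorithm 1: simulated generalized p-value for sample sizes n and observed v."""
    k = n.size
    H = np.hstack([np.eye(k - 1), -np.ones((k - 1, 1))])
    T = (rng.chisquare(n - 1, size=(L, k)) / v) @ H.T
    mu_hat = T.mean(axis=0)
    sigma_inv = np.linalg.inv(np.atleast_2d(np.cov(T, rowvar=False)))
    centered = T - mu_hat
    norms = np.einsum("li,ij,lj->l", centered, sigma_inv, centered)
    return float(np.mean(norms >= mu_hat @ sigma_inv @ mu_hat))


def rejection_rate(n, lam, alpha=0.05, M=500, L=500, seed=0):
    """Algorithm 2: fraction of M simulated data sets with p_m < alpha.

    Estimates the type I error when all lam_i are equal, the power otherwise.
    """
    rng = np.random.default_rng(seed)
    n = np.asarray(n, dtype=float)
    lam = np.asarray(lam, dtype=float)
    rejections = 0
    for _ in range(M):
        v = rng.chisquare(n - 1) / lam      # V_i = U_i / lambda_i, from (3.1)
        if generalized_p_value(n, v, L, rng) < alpha:
            rejections += 1
    return rejections / M


print(rejection_rate([10, 10, 10], [0.3, 0.3, 0.3]))   # simulated size
print(rejection_rate([30, 30, 30], [0.3, 0.5, 0.8]))   # simulated power
```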
Acknowledgments

The authors cordially thank the Editor and two referees for their kind help and comments which led to the improvement of this paper.

References

Bartlett, M.S., Kendall, D.G., 1946. The statistical analysis of variance heterogeneity and the logarithmic transformation. Journal of the Royal Statistical Society (Suppl. 8), 128–138.
Chhikara, R.S., Folks, J.L., 1989. The Inverse Gaussian Distribution. Marcel Dekker, New York.
Doksum, K.A., Hóyland, A., 1992. Models for variable-stress accelerated life testing experiments based on Wiener processes and the inverse Gaussian. Technometrics 34, 74–82.
Durham, S.D., Padgett, W.J., 1997. Cumulative damage models for system failure with application to carbon fibers and composites. Technometrics 39, 34–44.
Li, X.M., 2009. A generalized p-value approach for comparing the means of several log-normal populations. Statistics and Probability Letters 79, 1404–1408.
Liu, X.H., Xu, X.Z., 2010. A generalized p-value approach for testing the homogeneity of variances. Statistics and Probability Letters 80, 1486–1491.
Mudholkar, G.S., Natarajan, R., 2002. The inverse Gaussian analogs of symmetry, skewness and kurtosis. Annals of the Institute of Statistical Mathematics 54, 138–154.
Seshadri, V., 1993. The Inverse Gaussian Distribution: A Case Study in Exponential Families. Clarendon Press, Oxford.
Seshadri, V., 1999. The Inverse Gaussian Distribution: Statistical Theory and Applications. Springer, New York.
Tian, L., 2006. Testing equality of inverse Gaussian means under heterogeneity: based on generalized test variable. Computational Statistics and Data Analysis 51, 1156–1162.
Tsui, K.W., Weerahandi, S., 1989. Generalized p-value in significance testing of hypotheses in the presence of nuisance parameters. Journal of the American Statistical Association 84, 602–607.
Weerahandi, S., 1993. Generalized confidence intervals. Journal of the American Statistical Association 88, 899–905.
Weerahandi, S., 1995. Exact Statistical Methods for Data Analysis. Springer, New York.
Zhou, L.P., Mathew, T., 1994. Some tests for variance components using generalized p-values. Technometrics 36, 394–402.