Win-probabilities for regression models




Statistical Methodology 9 (2012) 520–527. doi:10.1016/j.stamet.2012.02.002



A.J. Hayter, Department of Business Information and Analytics, University of Denver, Denver, USA


Article history: Received 12 August 2011; Received in revised form 5 February 2012; Accepted 6 February 2012.

Keywords: Multiple linear regression; Normal distribution; Confidence interval; Prediction interval; Tolerance interval; Non-central t-distribution; Normal theory intervals; One-sided intervals; Two-sided intervals; Quantiles; Cumulative distribution function

Abstract

This paper considers inferences concerning future observations for regression models. Specifically, the differences between future observations at two designated sets of input values are considered. Win-probabilities, which are the probabilities that one of the future observations will exceed the other, constitute a special case of this analysis. These win-probabilities, together with the more general inferences on the difference between the future observations, provide a useful and easily interpretable tool with which a practitioner can assess the information provided by the regression model, and can make decisions regarding which of the two designated sets of input values would be optimal. A multiple-linear-regression model is considered in detail, although the results can be applied to any regression model with normally distributed errors. Central and non-central t-distributions are used for the analysis, and several examples of the methodologies are presented.

1. Introduction

Consider a standard multiple-linear-regression model with $k$ input variables $x = (x_1, \ldots, x_k)$ and with normal errors. Suppose that the design matrix $X$ has its $i$th row given by $(1, x_{1i}, \ldots, x_{ki})$, $1 \leq i \leq n$. Define $\bar{x}_{i\cdot} = \sum_{m=1}^{n} x_{im}/n$, and let $S_{ij} = \sum_{m=1}^{n} (x_{im} - \bar{x}_{i\cdot})(x_{jm} - \bar{x}_{j\cdot})$, with $S$ being the $k \times k$ matrix with entries $S_{ij}$. Also, let $\hat{\sigma}^2$ be the usual variance estimate, distributed as a $\sigma^2 \chi^2_\nu / \nu$ random variable with $\nu = n - k - 1$, and let $\beta_0$ and $\beta = (\beta_1, \ldots, \beta_k)$ be the regression parameters, which are estimated by $\hat{\beta}_0$ and $\hat{\beta} = (\hat{\beta}_1, \ldots, \hat{\beta}_k)$.

Consider two distinct sets of input variables $x_1^* = (x_{11}^*, \ldots, x_{k1}^*)$ and $x_2^* = (x_{12}^*, \ldots, x_{k2}^*)$, and suppose that $Y_1^*$ and $Y_2^*$ are potential future observations of the output variable for these sets of input variables. Thus, $Y_1^* \sim N(\beta_0 + x_1^*\beta', \sigma^2)$ and $Y_2^* \sim N(\beta_0 + x_2^*\beta', \sigma^2)$ are independently distributed. The objective of this paper is to make inferences concerning the difference $Y_1^* - Y_2^*$. A prediction interval



for this difference will be discussed, together with inferences concerning the cumulative distribution function through the probabilities

$$W_1(c) = P(Y_1^* - Y_2^* \geq c), \tag{1}$$

as well as inferences concerning the quantiles $c_\gamma$ defined by $W_1(c_\gamma) = P(Y_1^* - Y_2^* \geq c_\gamma) = \gamma$. A special case, obtained with $c = 0$, concerns the win-probability of Treatment 1, defined to be $W_1(0) = P(Y_1^* > Y_2^*)$, and similarly the win-probability of Treatment 2, $W_2(0) = P(Y_2^* > Y_1^*) = 1 - W_1(0)$. For a continuous output variable, as considered here, it does not matter whether the inequality in the definition of the win-probability is weak or strict.

Inference methodologies on the difference $Y_1^* - Y_2^*$ have the advantage of being easily interpretable by a practitioner, and they balance the information that is available for the difference between the fitted regression model at the two sets of input values, $\hat{\beta}_0 + x_1^*\hat{\beta}'$ and $\hat{\beta}_0 + x_2^*\hat{\beta}'$, with the importance of that difference relative to the variance $\sigma^2$ of the individual observations. The win-probabilities and the probabilities in Eq. (1), together with the quantiles $c_\gamma$, provide clear and direct information concerning the difference between outcomes for the two sets of input values. In a medical setting, for example, a patient may need to decide which of two treatments to receive, and the two sets of input variables can be chosen to correspond to the two treatment options. In a financial setting, the regression may model the risk for different kinds of projects, and these methodologies will allow an assessment of the difference between the risks of potential future projects. The probabilities in Eq. (1) and the quantiles $c_\gamma$ address these questions more directly than a comparison of the fitted regression model alone, and they allow the potential superiority of either option to be balanced against other considerations such as costs or side effects.

There has been considerable work on win-probabilities for two-sample problems where two treatments with means $\mu_1$ and $\mu_2$ are compared; see, for example, [35,31,32,36,34,2]. These win-probabilities have been proposed as a useful way to compare two treatments, incorporating aspects of both a confidence interval for the difference $\mu_1 - \mu_2$ and the distribution of potential future observations, and it has been recognized that they are based upon inferences concerning the effect size $(\mu_1 - \mu_2)/\sigma$ defined by Cohen [4]. These kinds of win-probabilities have roots in engineering, where they arise in the comparison of component strengths and stresses (see, for example, [25]).

The ideas presented in this paper are applicable in an obvious manner to any kind of regression model with a continuous output variable. The inference methodologies using the central and non-central t-distributions can be applied to any regression model with normally distributed errors, such as generalized linear or non-linear models. Again, the win-probabilities will provide useful information for comparing potential future observations, incorporating information from the model estimates together with the variability $\sigma^2$ of individual observations.

Inferences on the regression model $\beta_0 + x\beta'$ through the fitted values $\hat{\beta}_0 + x\hat{\beta}'$ generally involve confidence intervals. In contrast, an interval for the difference $Y_1^* - Y_2^*$ is referred to as a prediction interval, while confidence intervals for the quantiles $c_\gamma$ can be referred to as tolerance intervals. Consequently, the differences between the methodologies presented in this paper and standard inferences on the regression model mirror the general relative advantages and disadvantages of confidence interval, prediction interval, and tolerance interval approaches to a problem, good discussions of which are provided by Patel [24], Hahn & Meeker [7], and Vardeman [33], for example.

The layout of this paper is as follows. Section 2 contains a discussion of prediction intervals for the difference $Y_1^* - Y_2^*$. Section 3 contains a discussion of inferences concerning the probabilities in Eq. (1), while Section 4 addresses inferences concerning the quantiles $c_\gamma$. Some extensions to groups of future observations are considered in Section 5. Section 6 considers the specific problem of comparing two regression models, and finally, Section 7 presents some examples of these methodologies for different data sets, with illustrations of how they assist in decision making and in assessing the information provided by the regression models.


2. The prediction interval for $Y_1^* - Y_2^*$

Prediction bands around a fitted linear regression model are routinely used, and are given by the pointwise prediction intervals

$$Y_1^* \in \hat{\beta}_0 + x_1^*\hat{\beta}' \pm \hat{\sigma}\, t_{\alpha/2,\nu} \sqrt{1 + \mathrm{Var}(\hat{\beta}_0 + x_1^*\hat{\beta}')/\sigma^2}$$

which have an exact confidence level of $1 - \alpha$, where $t_{\alpha/2,\nu}$ is the upper $\alpha/2$ critical point of the t-distribution with $\nu$ degrees of freedom. The 1 inside the square root is present because of the variability of $Y_1^*$. Similarly, it is straightforward to show that the prediction interval

$$Y_1^* - Y_2^* \in (x_1^* - x_2^*)\hat{\beta}' \pm \hat{\sigma}\, t_{\alpha/2,\nu} \sqrt{2 + \tau^2} \tag{2}$$

also has an exact confidence level of $1 - \alpha$, where

$$\tau^2 = \mathrm{Var}((x_1^* - x_2^*)\hat{\beta}')/\sigma^2 = (x_1^* - x_2^*) S^{-1} (x_1^* - x_2^*)'.$$

This prediction interval is analogous to the interval provided by Hahn [6] for a two-sample problem. In Eq. (2), the 2 inside the square root is present because of the variability of both $Y_1^*$ and $Y_2^*$, and the interval can be modified for other regression models with normally distributed errors by adding 2 to the variability (divided by $\sigma^2$) of the difference of the fitted model estimates at $x_1^*$ and $x_2^*$.
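To make the computation concrete, here is a minimal sketch of Eq. (2) in Python. The paper presents no code, so the function and variable names below are illustrative; it assumes a design matrix X whose rows are $(1, x_{1i}, \ldots, x_{ki})$, a response vector y, and full prediction rows x1 and x2 that include the leading 1.

```python
import numpy as np
from scipy import stats

def prediction_interval_diff(X, y, x1, x2, alpha=0.05):
    """1 - alpha prediction interval for Y1* - Y2* under normal errors, Eq. (2)."""
    n, p = X.shape                      # p = k + 1, intercept column included
    nu = n - p                          # error degrees of freedom, n - k - 1
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y        # least squares estimates
    resid = y - X @ beta_hat
    sigma_hat = np.sqrt(resid @ resid / nu)
    d = x1 - x2                         # difference of the two prediction rows
    tau2 = d @ XtX_inv @ d              # Var((x1* - x2*) beta_hat) / sigma^2
    centre = d @ beta_hat
    half = sigma_hat * stats.t.ppf(1 - alpha / 2, nu) * np.sqrt(2 + tau2)
    return centre - half, centre + half
```

Since the two prediction rows share the same leading 1, the intercept cancels in d, and d @ XtX_inv @ d agrees with the $(x_1^* - x_2^*) S^{-1} (x_1^* - x_2^*)'$ form given above.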



3. Inferences on $P(Y_1^* - Y_2^* \geq c)$

Notice that

$$W_1(c) = P(Y_1^* - Y_2^* \geq c) = \Phi\left(\frac{(x_1^* - x_2^*)\beta' - c}{\sqrt{2}\,\sigma}\right)$$

where $\Phi(x)$ is the standard normal cumulative distribution function. Also,

$$D(c) = \frac{(x_1^* - x_2^*)\hat{\beta}' - c}{\hat{\sigma}\tau}$$

is distributed as $t_\nu(\delta(c))$, a non-central t-distribution with $\nu$ degrees of freedom and non-centrality parameter

$$\delta(c) = \frac{(x_1^* - x_2^*)\beta' - c}{\sigma\tau}.$$

Consequently, inferences concerning $P(Y_1^* - Y_2^* \geq c)$ can be based upon the statistic $D(c)$. This problem is similar to the one-sample problem of making inferences concerning the cumulative distribution function of a normal distribution, where the use of the non-central t-distribution has been illustrated by Odeh & Owen [23] and Hahn & Meeker [7, Section 4.5]. In fact, there has been substantial previous work concerning inferences on the non-centrality parameter of a non-central t-distribution, going back as far as [21], and the problem also arises, for example, when assessing the signal-to-noise ratio in simple linear regression models [30,22]. The construction of confidence intervals for the non-centrality parameter $\delta(c)$ is discussed on p. 352 of [15] and on p. 510 of [10]. The standard $1 - \alpha$ level two-sided confidence interval is derived using critical points for equal tail probabilities of $\alpha/2$, and is $\delta(c) \in (\delta_l(c), \delta_u(c))$, where, for a given value of $D(c)$,

$$P(t_\nu(\delta_l(c)) \geq D(c)) = \alpha/2 \quad \text{and} \quad P(t_\nu(\delta_u(c)) \geq D(c)) = 1 - \alpha/2.$$

However, the inefficiency of equal upper and lower tail probabilities $\alpha/2$ for the skewed non-central t-distribution was discussed in [14], where it was shown that improvements in confidence interval length are possible.
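The inversion that defines $\delta_l(c)$ and $\delta_u(c)$ has no closed form, but since the non-central t cdf is monotone decreasing in the non-centrality parameter, a bracketed root search suffices. A minimal sketch, using scipy's nct distribution; the function name and the bracket width are illustrative choices:

```python
import numpy as np
from scipy.stats import nct
from scipy.optimize import brentq

def delta_ci(D, nu, alpha=0.05):
    """Equal-tails 1 - alpha confidence interval (delta_l, delta_u) for the
    non-centrality parameter of a non-central t, given the observed D."""
    def root(p):
        # Solve nct.cdf(D, nu, delta) = p for delta; the cdf at fixed D is
        # monotone decreasing in delta, so the root is unique.
        f = lambda delta: nct.cdf(D, nu, delta) - p
        lo, hi = D - 50.0, D + 50.0     # crude bracket; widen if needed
        return brentq(f, lo, hi)
    # P(t_nu(delta_l) >= D) = alpha/2  <=>  nct.cdf(D, nu, delta_l) = 1 - alpha/2
    delta_l = root(1 - alpha / 2)
    # P(t_nu(delta_u) >= D) = 1 - alpha/2  <=>  nct.cdf(D, nu, delta_u) = alpha/2
    delta_u = root(alpha / 2)
    return delta_l, delta_u
```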


Writing $\delta_l(c) = t^{-1}_{\alpha/2,\nu}(D(c))$ and $\delta_u(c) = t^{-1}_{1-\alpha/2,\nu}(D(c))$, it follows that a $1 - \alpha$ level confidence interval for $W_1(c)$ is

$$W_1(c) \in \left( \Phi\left(\frac{\tau}{\sqrt{2}}\, t^{-1}_{\alpha/2,\nu}(D(c))\right),\; \Phi\left(\frac{\tau}{\sqrt{2}}\, t^{-1}_{1-\alpha/2,\nu}(D(c))\right) \right). \tag{3}$$

One-sided confidence intervals for $W_1(c)$ can be derived from one-sided confidence intervals for $\delta(c)$, and a $1 - \alpha$ level lower confidence bound is

$$W_1(c) \in \left( \Phi\left(\frac{\tau}{\sqrt{2}}\, t^{-1}_{\alpha,\nu}(D(c))\right),\; 1 \right], \tag{4}$$

while a $1 - \alpha$ level upper confidence bound is

$$W_1(c) \in \left[ 0,\; \Phi\left(\frac{\tau}{\sqrt{2}}\, t^{-1}_{1-\alpha,\nu}(D(c))\right) \right). \tag{5}$$
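Eqs. (3)–(5) then follow by transforming the endpoints for $\delta(c)$ through $\Phi(\tau \cdot / \sqrt{2})$. A short sketch, reusing the hypothetical delta_ci helper from the previous block; D_c, tau, and nu would be computed from the fitted model as in this section:

```python
import numpy as np
from scipy.stats import norm

def win_prob_ci(D_c, tau, nu, alpha=0.05):
    """Two-sided 1 - alpha confidence interval for W1(c), Eq. (3)."""
    # delta_ci is the hypothetical helper defined in the previous sketch.
    delta_l, delta_u = delta_ci(D_c, nu, alpha)
    scale = tau / np.sqrt(2.0)
    return norm.cdf(scale * delta_l), norm.cdf(scale * delta_u)

def win_prob_lower_bound(D_c, tau, nu, alpha=0.05):
    """One-sided 1 - alpha lower confidence bound for W1(c), Eq. (4)."""
    delta_l, _ = delta_ci(D_c, nu, 2 * alpha)  # puts probability alpha in one tail
    return norm.cdf(tau / np.sqrt(2.0) * delta_l)
```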

Similar confidence intervals can be obtained for other regression models with normally distributed errors by constructing analogous statistics $D(c)$ whose non-centrality parameter is related to the probability of interest.

Notice that a two-sided $1 - \alpha$ level confidence interval for the difference between the regression models at $x_1^*$ and $x_2^*$ is

$$(x_1^* - x_2^*)\beta' \in (x_1^* - x_2^*)\hat{\beta}' \pm t_{\alpha/2,\nu}\,\hat{\sigma}\tau \tag{6}$$

and this interval contains $c$ if and only if Eq. (3) contains 0.5. This is because

$$(x_1^* - x_2^*)\hat{\beta}' - t_{\alpha/2,\nu}\,\hat{\sigma}\tau > c$$

if and only if $P(t_\nu(0) \geq D(c)) < \alpha/2$, and so $\delta_l(c) > 0$, while

$$(x_1^* - x_2^*)\hat{\beta}' + t_{\alpha/2,\nu}\,\hat{\sigma}\tau < c$$

if and only if $P(t_\nu(0) \geq D(c)) > 1 - \alpha/2$, and so $\delta_u(c) < 0$. Consequently, Eq. (6) can be inferred from Eq. (3) for all values of $c$. Specifically, whether or not Eq. (6) contains zero, which indicates whether or not the difference between the fitted regression models at $x_1^*$ and $x_2^*$ is statistically significant, can be determined by whether or not the two-sided confidence interval for the win-probability $W_1(0)$ contains 0.5. Similar relationships exist between Eqs. (4) and (5) and one-sided versions of Eq. (6).

4. Inferences on the quantiles $c_\gamma$

Confidence intervals for the quantiles of a normal distribution are discussed by Odeh & Owen [23] and Hahn & Meeker [7, Section 4.4], and the standard approach can be used in this regression setting. Notice that

$$c_\gamma = (x_1^* - x_2^*)\beta' - \sqrt{2}\,\sigma z_{1-\gamma}$$

where $\Phi(z_{1-\gamma}) = \gamma$, so

$$P\left((x_1^* - x_2^*)\hat{\beta}' - \hat{\sigma}\tau L \leq c_\gamma \leq (x_1^* - x_2^*)\hat{\beta}' - \hat{\sigma}\tau U\right) = P\left(U \leq \frac{(x_1^* - x_2^*)(\hat{\beta} - \beta)' + \sqrt{2}\,\sigma z_{1-\gamma}}{\hat{\sigma}\tau} \leq L\right) = P(U \leq t_\nu(\delta_\gamma) \leq L) \tag{7}$$

for $\delta_\gamma = \sqrt{2}\, z_{1-\gamma}/\tau$.

Consequently, critical points from this non-central t-distribution can be used in Eq. (7) to construct a confidence interval for the quantile $c_\gamma$. One-sided $1 - \alpha$ level confidence intervals are obtained with $L = \infty$ and $P(t_\nu(\delta_\gamma) \leq U) = \alpha$, or with $U = -\infty$ and $P(L \leq t_\nu(\delta_\gamma)) = \alpha$. Two-sided $1 - \alpha$ level confidence intervals are obtained with $P(t_\nu(\delta_\gamma) \leq U) = \alpha_1$ and $P(L \leq t_\nu(\delta_\gamma)) = \alpha_2$ for $\alpha_1 + \alpha_2 = \alpha$. In general it is possible to take $\alpha_1 = \alpha_2 = \alpha/2$, although for given values of $\beta$ and $\tau$, optimal values of $\alpha_1$ and $\alpha_2$ can be chosen to minimize the confidence interval length, as discussed in [14,17]. If $\gamma = 0.5$, so that $z_{1-\gamma} = \delta_\gamma = 0$, then the confidence interval in Eq. (7) with $\alpha_1 = \alpha_2 = \alpha/2$ is equivalent to the confidence interval in Eq. (6).

The work in [17] on the construction of simultaneous confidence intervals for several quantiles of a normal distribution in the one-sample situation applies in a natural manner to this regression setting. Also, it should be pointed out that work on simultaneous confidence bands for the cumulative distribution function of a normal distribution (see [12,13,3,5]) can also be applied in this two-sample situation, and the confidence bands will provide simultaneous confidence intervals for the quantiles $c_\gamma$ for all $\gamma$, and also for the probabilities $P(Y_1^* - Y_2^* \geq c)$ discussed in Section 3 for all $c$.
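As a concrete illustration of Eq. (7), the following sketch computes the two-sided interval with $\alpha_1 = \alpha_2 = \alpha/2$; beta_diff_hat stands for $(x_1^* - x_2^*)\hat{\beta}'$ and, as before, all names are illustrative:

```python
import numpy as np
from scipy.stats import nct, norm

def quantile_ci(beta_diff_hat, sigma_hat, tau, nu, gamma, alpha=0.05):
    """Two-sided 1 - alpha confidence interval for c_gamma, Eq. (7),
    taking alpha1 = alpha2 = alpha/2."""
    z = norm.ppf(gamma)                          # z_{1-gamma}: Phi(z) = gamma
    delta_gamma = np.sqrt(2.0) * z / tau         # non-centrality in Eq. (7)
    U = nct.ppf(alpha / 2, nu, delta_gamma)      # P(t_nu(delta) <= U) = alpha/2
    L = nct.ppf(1 - alpha / 2, nu, delta_gamma)  # P(L <= t_nu(delta)) = alpha/2
    return (beta_diff_hat - sigma_hat * tau * L,
            beta_diff_hat - sigma_hat * tau * U)
```

For $\gamma = 0.5$ the non-centrality parameter is zero and the critical points reduce to those of the central t-distribution, recovering Eq. (6).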


5. Groups of future observations

The inference methodologies discussed in this paper can be extended to situations where $m_1$ potential future observations at $x_1^*$ and $m_2$ potential future observations at $x_2^*$ are considered. Let $\bar{Y}_1^*$ and $\bar{Y}_2^*$ be the averages of these potential future observations. The prediction interval in Eq. (2) can be extended to a prediction interval for $\bar{Y}_1^* - \bar{Y}_2^*$ by replacing the 2 inside the square root by $1/m_1 + 1/m_2$. Also, we define

$$W_{1,m_1,m_2}(c) = P(\bar{Y}_1^* - \bar{Y}_2^* \geq c) = \Phi\left(\frac{(x_1^* - x_2^*)\beta' - c}{\sigma\sqrt{\frac{1}{m_1} + \frac{1}{m_2}}}\right).$$

Eq. (3) now becomes

$$W_{1,m_1,m_2}(c) \in \left( \Phi\left(\frac{\tau}{\sqrt{\frac{1}{m_1} + \frac{1}{m_2}}}\, t^{-1}_{\alpha/2,\nu}(D(c))\right),\; \Phi\left(\frac{\tau}{\sqrt{\frac{1}{m_1} + \frac{1}{m_2}}}\, t^{-1}_{1-\alpha/2,\nu}(D(c))\right) \right) \tag{8}$$

and the one-sided confidence intervals in Eqs. (4) and (5) can be similarly modified. These modified confidence intervals have the same relationship to the confidence interval for $(x_1^* - x_2^*)\beta'$ as described in Section 3 when $m_1 = m_2 = 1$. In addition, if the confidence interval in Eq. (3) does not contain 0.5 (so that the confidence interval for $(x_1^* - x_2^*)\beta'$ in Eq. (6) does not contain $c$), then the confidence interval in Eq. (8) will tend to either the point 0 or the point 1 as both sample sizes $m_1$ and $m_2$ tend to infinity. However, if the confidence interval in Eq. (3) does contain 0.5 (so that the confidence interval for $(x_1^* - x_2^*)\beta'$ in Eq. (6) does contain $c$), then the confidence interval in Eq. (8) will tend towards the interval $[0, 1]$ as both sample sizes $m_1$ and $m_2$ tend to infinity.

Eq. (7) can still be used to obtain a confidence interval for the quantile $c_\gamma$, which is now defined by $P(\bar{Y}_1^* - \bar{Y}_2^* \geq c_\gamma) = \gamma$ and is equal to

$$c_\gamma = (x_1^* - x_2^*)\beta' - \sqrt{\frac{1}{m_1} + \frac{1}{m_2}}\,\sigma z_{1-\gamma},$$

except that the non-centrality parameter is now

$$\delta_\gamma = \frac{\sqrt{\frac{1}{m_1} + \frac{1}{m_2}}\, z_{1-\gamma}}{\tau}.$$

Eq. (7) becomes equivalent to Eq. (6) as both sample sizes $m_1$ and $m_2$ tend to infinity.
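The only change needed for the group-means version in Eq. (8) is the scale factor: $\tau/\sqrt{2}$ becomes $\tau/\sqrt{1/m_1 + 1/m_2}$, while the statistic $D(c)$ itself is unchanged. A sketch, again reusing the hypothetical delta_ci helper from the Section 3 block:

```python
import numpy as np
from scipy.stats import norm

def group_win_prob_ci(D_c, tau, nu, m1, m2, alpha=0.05):
    """Two-sided 1 - alpha confidence interval for W_{1,m1,m2}(c), Eq. (8)."""
    # delta_ci is the hypothetical helper from the Section 3 sketch.
    delta_l, delta_u = delta_ci(D_c, nu, alpha)
    scale = tau / np.sqrt(1.0 / m1 + 1.0 / m2)
    return norm.cdf(scale * delta_l), norm.cdf(scale * delta_u)
```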

6. Comparing two regression models

A commonly arising problem is that of comparing regression models for common input variables but for different groups. For example, two simple linear regression models with a common input


variable may be constructed for two groups of experimental units. The comparison of these regression models is discussed in [16, Chapters 5 and 6], and standard approaches are to use hypothesis tests and simultaneous confidence intervals or bands (see, for example, [29,1,20,19,9,18]). However, these standard approaches only involve inferences on the regression models and do not explicitly assess the prediction of future observations.

The approach taken in this paper can be applied to these problems in order to compare potential future observations from the different groups. Conceptually, this can be done by incorporating the different regression models into one overall regression model with additional input variables to designate the different groups. Predictions for the different groups at otherwise constant values of the input variables can then be made by choosing $x_1^*$ and $x_2^*$ to vary only in the specification of the groups. This can provide more specific and useful information to a practitioner about how future observations for the different groups may vary, rather than just an assessment of whether there is a statistically significant difference between the regression models.

For example, suppose that two simple linear regression models with a common input variable are fitted, $\hat{\beta}_0^{(1)} + \hat{\beta}_1^{(1)} x$ and $\hat{\beta}_0^{(2)} + \hat{\beta}_1^{(2)} x$, and that $\hat{\sigma}^2$ is an estimate of the common variance. If $Y_1^*$ and $Y_2^*$ are potential future observations from the two regression models at a common value $x^*$ of the input variable, then Eq. (2) becomes

$$Y_1^* - Y_2^* \in \hat{\beta}_0^{(1)} - \hat{\beta}_0^{(2)} + (\hat{\beta}_1^{(1)} - \hat{\beta}_1^{(2)}) x^* \pm \hat{\sigma}\, t_{\alpha/2,\nu} \sqrt{2 + \tau^2} \tag{9}$$

for suitable degrees of freedom $\nu$, where now

$$\tau^2 = \left(\mathrm{Var}(\hat{\beta}_0^{(1)} + \hat{\beta}_1^{(1)} x^*) + \mathrm{Var}(\hat{\beta}_0^{(2)} + \hat{\beta}_1^{(2)} x^*)\right)/\sigma^2.$$

Also, Eqs. (3)–(5) can be used with

$$D(c) = \frac{\hat{\beta}_0^{(1)} - \hat{\beta}_0^{(2)} + (\hat{\beta}_1^{(1)} - \hat{\beta}_1^{(2)}) x^* - c}{\hat{\sigma}\tau}$$

and the other methodologies presented in this paper can similarly be applied to this problem.
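A sketch of this two-model computation, fitting each line separately and pooling the variance estimate; the function and variable names are illustrative, and the returned values can be passed to the hypothetical win_prob_ci sketch from Section 3:

```python
import numpy as np

def two_line_D_and_tau(x1, y1, x2, y2, x_star, c=0.0):
    """D(c), tau, nu and sigma_hat for two simple linear regressions with a
    pooled (common) variance estimate, as in Section 6."""
    def fit(x, y):
        x, y = np.asarray(x, float), np.asarray(y, float)
        X = np.column_stack([np.ones(len(x)), x])
        XtX_inv = np.linalg.inv(X.T @ X)
        b = XtX_inv @ X.T @ y
        rss = np.sum((y - X @ b) ** 2)
        l = np.array([1.0, x_star])
        return b, rss, l @ XtX_inv @ l   # Var(fitted value at x_star)/sigma^2
    b1, rss1, v1 = fit(x1, y1)
    b2, rss2, v2 = fit(x2, y2)
    nu = len(x1) + len(x2) - 4           # (n1 - 2) + (n2 - 2) pooled df
    sigma_hat = np.sqrt((rss1 + rss2) / nu)
    tau = np.sqrt(v1 + v2)               # tau^2 from Section 6
    diff = (b1[0] - b2[0]) + (b1[1] - b2[1]) * x_star
    D_c = (diff - c) / (sigma_hat * tau)
    return D_c, tau, nu, sigma_hat
```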

Approximate methodologies can also be developed for the comparison of regression models when the models are considered to have unequal error variances, using the ideas in Section 3 of [9], for example.

7. Examples and illustrations of the methodologies

In this section some examples are presented to illustrate the application and interpretation of the methodologies discussed in this paper.

Example 1. Data set 210 in [8] is taken from [28] and concerns measurements of $n = 31$ cherry trees in the Allegheny National Forest, Pennsylvania. A regression model can be used to model the volume of wood in a tree (in cubic feet) from its diameter (in inches) at 54 inches above the ground and its height (in feet). Suppose that two new trees are identified, with Tree-1 having a diameter of 15 inches and a height of 70 feet, while Tree-2 has a diameter of 12 inches and a height of 80 feet. Which tree will have more wood? The fitted values of the regression model are 36.38 cubic feet for Tree-1 and 25.65 cubic feet for Tree-2, and with confidence level 95% Eq. (6) gives

$$(x_1^* - x_2^*)\beta' \in (6.96, 14.51) \tag{10}$$

so on average, trees with the dimensions of Tree-1 can be inferred to contain at least 6.96 cubic feet more wood than trees with the dimensions of Tree-2. However, at confidence level 95% Eq. (2) gives $Y_1^* - Y_2^* \in (-1.13, 22.59)$, so Tree-1 may actually have 1.13 cubic feet less wood than Tree-2. With 95% confidence level Eq. (3) gives

$$W_1(0) \in (0.867, 0.997) \quad \text{and} \quad W_1(8) \in (0.430, 0.878)$$


while Eq. (7) gives $c_{0.95} \in (-3.55, 5.52)$. Thus, there is at least an 86.7% chance that Tree-1 will have more wood than Tree-2 (and notice that this lower bound is larger than 50%, which is consistent with the interval in Eq. (10) not containing $c = 0$). However, suppose that for certain cost-related reasons Tree-1 is only preferable to Tree-2 if it contains at least 8 cubic feet more wood. Then the probability that Tree-1 is preferable may be as low as 43.0% (the interval for $W_1(8)$ contains 0.5, which is consistent with the interval in Eq. (10) containing $c = 8$). Finally, it can be seen that the lower 5% quantile of the difference between the amount of wood in Tree-1 and the amount of wood in Tree-2 may be as low as $-3.55$ cubic feet.
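Example 1 can be reproduced end-to-end under the assumption that the paper's data are the classic cherry tree data distributed as R's "trees" dataset (an assumption, although the fitted values of about 36.38 and 25.65 cubic feet do match). The snippet reuses the hypothetical delta_ci and win_prob_ci helpers sketched in Section 3:

```python
import numpy as np
from statsmodels.datasets import get_rdataset

# Assumed to match the paper's data: Girth = diameter in inches at 4 ft 6 in,
# Height in feet, Volume in cubic feet, n = 31 trees.
trees = get_rdataset("trees").data
X = np.column_stack([np.ones(len(trees)), trees["Girth"], trees["Height"]])
y = trees["Volume"].to_numpy()

nu = X.shape[0] - X.shape[1]                  # n - k - 1 = 31 - 3 = 28
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
sigma_hat = np.sqrt(np.sum((y - X @ beta_hat) ** 2) / nu)

x1 = np.array([1.0, 15.0, 70.0])              # Tree-1: 15 in diameter, 70 ft
x2 = np.array([1.0, 12.0, 80.0])              # Tree-2: 12 in diameter, 80 ft
d = x1 - x2
tau = np.sqrt(d @ XtX_inv @ d)

print(x1 @ beta_hat, x2 @ beta_hat)           # fitted values, about 36.38, 25.65
for c in (0.0, 8.0):
    D_c = (d @ beta_hat - c) / (sigma_hat * tau)
    # win_prob_ci is the hypothetical helper sketched in Section 3
    print(c, win_prob_ci(D_c, tau, nu, alpha=0.05))
```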

Example 2. Kahn [11] presents data from [27] concerning the effects of smoking on $n = 654$ youths. Kahn presents a linear regression model with forced expiratory volume (FEV) modeled from age, height, height squared, gender, and smoking (no = 0, yes = 1). Consider a new youth, and let $x_1^*$ correspond to not smoking and $x_2^*$ correspond to smoking (the age, height, and gender do not matter). With confidence level 95% Eq. (6) gives

$$(x_1^* - x_2^*)\beta' = -\beta_5 \in (0.02, 0.25)$$

so it can be claimed that the effect of smoking is statistically significant. Also, at confidence level 95% Eq. (2) gives $Y_1^* - Y_2^* \in (-0.97, 1.24)$ and Eq. (4) gives $W_1(0) \in (0.528, 1]$. Consequently, it can be claimed that there is at least a 52.8% chance that smoking will decrease the FEV of such a youth.

Example 3. Rooney & Lewis [26] construct simple linear regression models for two groups of female fireflies, relating the number of eggs laid to the firefly's weight in mg. The first group of fireflies had one mate while the second group had three mates. The two regression models are

$$\text{eggs} = -11.9 + 1.68\,\text{weight} \quad \text{and} \quad \text{eggs} = -18.1 + 2.80\,\text{weight},$$

each based on $n = 20$ fireflies. The approach discussed in Section 6 can be applied to this data set. Consider a comparison of one new firefly from each of the two groups, both weighing 25 mg. With a confidence level of 95%, Eq. (9) gives $Y_1^* - Y_2^* \in (-64.6, 20.7)$ and Eq. (3) gives $W_1(0) \in (0.026, 0.387)$. Thus, for fireflies of this weight, while a firefly who mates only once may lay up to 20 eggs more than a firefly who mates three times, there is at least a 61.3% chance that the firefly who mates three times will lay more eggs.

If five new fireflies from each group are considered, all weighing 25 mg, then the results of Section 5 can be applied. Now $\bar{Y}_1^* - \bar{Y}_2^* \in (-45.7, 1.9)$ and Eq. (8) gives $W_{1,5,5}(0) \in (0.000, 0.260)$. Consequently, there is at least a 74% chance that the five fireflies from the second group will in total lay more eggs than the five fireflies from the first group.


References

[1] P. Bhargava, J.D. Spurrier, Exact confidence bounds for comparing two regression lines with a control regression line on a fixed interval, Biometrical Journal 46 (6) (2004) 720–730.
[2] R.H. Browne, The t-test p value and its relationship to the effect size and P(X > Y), The American Statistician 64 (1) (2010) 30–33.
[3] R.C.H. Cheng, T.C. Iles, Confidence bands for cumulative distribution functions of continuous random variables, Technometrics 25 (1983) 77–86.
[4] J. Cohen, Statistical Power Analysis for the Behavioral Sciences, Academic Press, New York, 1969.
[5] J. Frey, O. Marrero, D. Norton, Minimum-area confidence sets for a normal distribution, Journal of Statistical Planning and Inference 139 (2009) 1023–1032.
[6] G.J. Hahn, A prediction interval on the difference between two future sample means and its application to a claim of product superiority, Technometrics 19 (2) (1977) 131–134.
[7] G.J. Hahn, W.Q. Meeker, Statistical Intervals: A Guide for Practitioners, Wiley, New York, 1991.
[8] D.J. Hand, F. Daly, A.D. Lunn, K.J. McConway, E. Ostrowski, A Handbook of Small Data Sets, Chapman and Hall, London, 1994.
[9] A.J. Hayter, W. Liu, H.P. Wynn, Easy-to-construct confidence bands for comparing two simple linear regression lines, Journal of Statistical Planning and Inference 137 (4) (2007) 1213–1225.
[10] N.L. Johnson, S. Kotz, N. Balakrishnan, Continuous Univariate Distributions, second ed., vol. II, Wiley, New York, 1994.
[11] M. Kahn, An exhalent problem for teaching statistics, Journal of Statistics Education 13 (2) (2005).
[12] P. Kanofsky, Parametric confidence bands on cumulative distribution functions, Sankhya A 30 (1968) 369–378.
[13] P. Kanofsky, R. Srinivasan, An approach to the construction of parametric confidence bands on cumulative distribution functions, Biometrika 59 (1972) 623–631.
[14] J. Kim, A.J. Hayter, Efficient confidence interval methodologies for the non-centrality parameter of a non-central t-distribution, Communications in Statistics—Simulation and Computation 37 (4) (2008) 660–678.
[15] E.L. Lehmann, Testing Statistical Hypotheses, second ed., Springer-Verlag, New York, 1986.
[16] W. Liu, Simultaneous Inference in Regression, Chapman and Hall, London, 2011.
[17] W. Liu, F. Bretz, A.J. Hayter, E. Glimm, Simultaneous inference for several quantiles of a normal population with applications, Technical Report, University of Southampton, 2011.
[18] W. Liu, F. Bretz, A.J. Hayter, H.P. Wynn, Assessing non-superiority, non-inferiority or equivalence when comparing two regression models over a restricted covariate region, Biometrics 65 (4) (2009) 1279–1287.
[19] W. Liu, A.J. Hayter, H.P. Wynn, Operability region equivalence: simultaneous confidence bands for the equivalence of two regression models over restricted regions, Biometrical Journal 49 (1) (2007) 144–150.
[20] W. Liu, M. Jamshidian, Y. Zhang, F. Bretz, X. Han, Some new methods for the comparison of two linear regression models, Journal of Statistical Planning and Inference 137 (2007) 57–67.
[21] A.T. McKay, Distribution of the coefficient of variation and the extended t distribution, Journal of the Royal Statistical Society 95 (1932) 695–698.
[22] T. Miwa, Statistical inference on non-centrality parameters and Taguchi's SN ratios, in: Proceedings of the International Conference on Statistics in Industry, Science and Technology, 1994, pp. 66–71.
[23] R.E. Odeh, D.B. Owen, Tables for Normal Tolerance Limits, Sampling Plans and Screening, Marcel Dekker, New York, 1980.
[24] J.K. Patel, Prediction intervals—a review, Communications in Statistics—Theory and Methods 18 (7) (1989) 2393–2465.
[25] B. Reiser, I. Guttman, Statistical inference for P(Y < X): the normal case, Technometrics 28 (3) (1986) 253–257.
[26] J. Rooney, S.M. Lewis, Fitness advantage from nuptial gifts in female fireflies, Ecological Entomology 27 (2002) 373–377.
[27] B. Rosner, Fundamentals of Biostatistics, fifth ed., Duxbury, Pacific Grove, 1999.
[28] T.A. Ryan, B.L. Joiner, B.F. Ryan, The Minitab Student Handbook, Duxbury, Boston, 1985.
[29] J.D. Spurrier, Exact confidence bounds for all contrasts of three or more regression lines, Journal of the American Statistical Association 94 (446) (1999) 483–488.
[30] G. Taguchi, Design of Experiments, third ed., Maruzen, Tokyo, 1977.
[31] L. Tian, Inferences on standardized mean difference: the generalized variable approach, Statistics in Medicine 26 (5) (2007) 945–953.
[32] L. Tian, Confidence intervals for P(Y1 > Y2) with normal outcomes in linear models, Statistics in Medicine 27 (21) (2008) 4221–4237.
[33] S.B. Vardeman, What about the other intervals? The American Statistician 46 (3) (1992) 193–197.
[34] Z. Wang, Statistical inference for P(X < Y), Statistics in Medicine 27 (2007) 257–279.
[35] J. Wu, G. Jiang, W. Wei, Confidence intervals of effect size in randomized comparative parallel-group studies, Statistics in Medicine 25 (2006) 639–651.
[36] G.Y. Zou, Exact confidence interval for Cohen's effect size is readily available, Statistics in Medicine 26 (2007) 3054–3056.