Testing unilateral versus bilateral normal contamination

Testing unilateral versus bilateral normal contamination

Statistics and Probability Letters 83 (2013) 163–167 Contents lists available at SciVerse ScienceDirect Statistics and Probability Letters journal h...

223KB Sizes 0 Downloads 29 Views

Statistics and Probability Letters 83 (2013) 163–167

Contents lists available at SciVerse ScienceDirect

Statistics and Probability Letters journal homepage: www.elsevier.com/locate/stapro

Testing unilateral versus bilateral normal contamination Richard Charnigo a,∗ , Qian Fan a , Douglas Bittel b , Hongying Dai c a

Department of Statistics, University of Kentucky, 725 Rose Street, Lexington, KY 40536, USA

b

Department of Pediatrics, 2401 Gillham Road, Children’s Mercy Hospital, Kansas City, MO 64108, USA

c

Department of Medical Research, 2420 Pershing Road, Children’s Mercy Hospital, Kansas City, MO 64108, USA

article

abstract

info

Article history: Received 27 June 2012 Received in revised form 11 September 2012 Accepted 12 September 2012 Available online 19 September 2012

This letter proposes modeling a large collection of test statistics, such as may arise in microarray data analysis, using a mixture of three normal distributions: one with mean zero, one with nonnegative mean, and one with nonpositive mean. A convenient procedure is established for testing whether this three-component mixture may be reduced to a two-component mixture with one nonzero component mean. © 2012 Elsevier B.V. All rights reserved.

Keywords: Down’s syndrome Gene expression Microarray Mixture model

1. Introduction Suppose that X1 , X2 , . . . , Xn are independent and identically distributed (iid) according to the mixture probability density function (pdf)

(1 − γ1 − γ2 ) exp



−(x−µ0 )2 2σ02



+ γ1 exp



−(x−{µ0 +µ1 })2 2σ02



+ γ2 exp

(2π σ02 )1/2



−(x−{µ0 +µ2 })2 2σ02

 ,

(1)

where γ1 , γ2 , 1 − γ1 − γ2 , µ1 , and −µ2 are all nonnegative and unknown while µ0 ∈ (−∞, ∞) and σ02 ∈ (0, ∞) are known. Without loss of generality, we hereafter assume that µ0 = 0 and σ02 = 1. We refer to (1) as the bilaterally contaminated normal (BCN) model since one of the contaminating normal distributions has a nonnegative mean µ1 while the other has a nonpositive mean µ2 . Testing the ‘‘omnibus’’ null hypothesis that γ1 µ1 + γ2 |µ2 | = 0 (i.e., that the BCN model reduces to the standard normal distribution) is straightforward. For example, one obtains an exact level α procedure by rejecting the omnibus null n hypothesis upon observing i=1 Xi2 greater than the 1 − α quantile of the central chi-square distribution on n degrees of n freedom (df). Such a procedure is unbiased and consistent because, when the omnibus null hypothesis is false, i=1 Xi2 is distributed as a mixture of central and non-central chi-square distributions on n df. A more interesting problem, and the focus of this letter, is that of developing a test of the ‘‘unilateral’’ null hypothesis that γ1 µ1 × γ2 |µ2 | = 0. If the unilateral null hypothesis is true, then the BCN model reduces to the unilaterally contaminated normal (UCN) model with mixture pdf

(1 − γ )(2π )−1/2 exp[−x2 /2] + γ (2π )−1/2 exp[−(x − µ)2 /2], where γ ∈ [0, 1] and µ is real. ∗

Corresponding author. Tel.: +1 859 218 2072; fax: +1 859 257 6430. E-mail addresses: [email protected] (R. Charnigo), [email protected] (Q. Fan), [email protected] (D. Bittel), [email protected] (H. Dai).

0167-7152/$ – see front matter © 2012 Elsevier B.V. All rights reserved. doi:10.1016/j.spl.2012.09.015

(2)

164

R. Charnigo et al. / Statistics and Probability Letters 83 (2013) 163–167

To motivate the BCN model, suppose that a microarray experiment is conducted to compare patients with a specified medical condition to healthy controls on the expression levels of n genes. Let Xi denote a test statistic for gene i with: (1) the standard normal distribution if patients and controls have the same mean expression level; (2) the normal distribution with mean µ1 and unit standard deviation if patients have greater expression; and (3) the normal distribution with mean µ2 and unit standard deviation if patients have lesser expression. Then γ1 and γ2 represent the proportions of genes that are overexpressed and underexpressed in patients. Testing the unilateral null hypothesis will allow a data analyst to ascertain whether overexpression and underexpression are simultaneously present in a collection of genes. While there is a vast literature on large-scale hypothesis testing in general (e.g., Benjamini and Hochberg, 1995; Genovese and Wasserman, 2002; Gadbury et al., 2004) and on microarray data analysis in particular (e.g., Newton et al., 2001; Kendziorski et al., 2003; Keles, 2007), practical application of the BCN model is novel, to the best of our knowledge. Although a variant of the UCN model has been applied to microarray data analysis previously (Dai and Charnigo, 2010), the UCN model has the limitation of being unable to capture gene overexpression and underexpression simultaneously; overexpression corresponds to µ > 0, underexpression corresponds to µ < 0, and µ cannot be both positive and negative at the same time. In contrast, the BCN model can capture both overexpression and underexpression. A contaminated beta (CB) model (Allison et al., 2002; Dai and Charnigo, 2008a) and modifications thereof (e.g., Markitsis and Lai, 2010) have been employed in large-scale hypothesis testing. The CB model has mixture pdf

(1 − γ ) + γ

Γ [α + β] α−1 p (1 − p)β−1 Γ [α]Γ [β]

for p ∈ (0, 1), γ ∈ [0, 1], α ∈ (0, ∞), and β ∈ (0, ∞). However, while the UCN and BCN models describe test statistics directly, the CB model describes p-values derived from test statistics. Assuming two-sided p-values that compare patients to controls on gene expression levels (e.g., Pi = 2 − 2Φ |Xi |, where Xi is as above and Φ denotes the standard normal cumulative distribution function), the CB model captures deviations from uniformity of the p-values but does not distinguish between overexpression and underexpression; both large positive and large negative test statistics translate into small p-values. Thus, even though the CB model overcomes the aforementioned limitation of the UCN model, the CB model does not obviate the BCN model. Hypothesis testing for the BCN model is described in Section 2. We present a case study, illustrating application of the BCN model to data from a microarray experiment on Down’s syndrome, in Section 3. We conclude with a discussion, including potential future research, in Section 4. 2. Testing the unilateral null hypothesis

n

s := n−1 i=1 Xis for any positive integer s. As indicated in Section 1, one way to test the omnibus Let ms := E [X1s ] and m n 2 is too large. Note that the omnibus null hypothesis can be expressed null hypothesis is to ask whether i=1 Xi2 = n × m as m2 − 1 = 0, whereas m2 − 1 is strictly positive when the omnibus null hypothesis is false. If we can exhibit another function of the moments that is 0 when the unilateral null hypothesis is true and strictly positive otherwise, then a test of the unilateral null hypothesis can be based on the evaluation of that function at the estimated moments. We consider a moment-based approach in this letter because of the difficulties inherent to likelihood ratio testing in mixture and contamination models. These difficulties arise due to violations of standard regularity conditions such as identifiability (e.g., Ghosh and Sen, 1985; Chen and Chen, 2001; Dai and Charnigo, 2008b). Likelihood ratio test statistics typically have intractable asymptotic null distributions in these settings, not only because quantiles from the distributions must be approximated via simulations but also because the distributions depend on how the parameter space is chosen; two research teams that assume different parameter spaces, even if both parameter spaces are compact, will obtain two different distributions. Approaches based on modified likelihood ratio testing (e.g., Chen et al., 2001; Chen and Kalbfleisch, 2004) and EM testing (e.g. Li et al., 2009; Li and Chen, 2010) have been fruitful in overcoming the aforementioned difficulties for some mixture and contamination models. These approaches may also prove viable for testing the unilateral null hypothesis in the BCN model. However, we leave such investigations to future research. Returning now to the moment-based approach, a quadratic function of the first three moments, namely m22 − 2m2 + 1 + 3m21 − m3 m1 = γ1 γ2 µ1 |µ2 |(µ1 − µ2 )2 , is 0 when the unilateral null hypothesis is true and is strictly positive otherwise. 22 − 2m 2 + 1 + 3m 21 − m 3 m 1 is sufficiently greater Hence, our test of the unilateral null hypothesis will be based on whether m than 0. Define the 6 × 1 vectors m := (m1 , m2 , . . . , m6 )T ,

 := ( 2 , . . . , m 6 )T , m m1 , m

and 0 := (0, 0, . . . , 0)T .

Also define the 3 × 3 matrix V(m) to contain mi+j − mi mj in row i and column j for i, j ∈ {1, 2, 3}. Put g (m) := m22 − 2m2 + 1 + 3m21 − m3 m1 and h(m) := (−m3 + 6m1 , 2m2 − 2, −m1 )T . We hereafter assume that the omnibus null hypothesis is false; otherwise there is no need to test the unilateral null hypothesis. Then h(m) ̸= 0 and we have

√ n



g ( m)



h( m)T V( m)h( m)

−

g (m) h(m)T V(m)h(m)

 L

−→ N (0, 1)

(3)

R. Charnigo et al. / Statistics and Probability Letters 83 (2013) 163–167

165

by the Cramer and Slutsky Theorems (Ferguson, 1996). Let

 Zn :=

n

g ( m),

h( m) V( m)h( m) T

(4)

zu denote the u quantile of the standard normal distribution (e.g., z0.95 = 1.645), and ma be the vector of moments implied by (γ1 , γ2 , µ1 , µ2 ) = (γ1,a , γ2,a , µ1,a , 0) for fixed positive constants γ1,a , γ2,a , and µ1,a . The following theorem summarizes the implications of relation (3). Theorem 2.1. Suppose that X1 , X2 , . . . , Xn are iid according to model (1) and that the omnibus null hypothesis is false. Under the unilateral null hypothesis, γ1 µ1 × γ2 |µ2 | = 0 and γ1 µ1 + γ2 |µ2 | > 0, lim P(Zn > z1−α ) = α.

(5)

n→∞

Under the local alternative sequence (γ1 , γ2 , µ1 , µ2 ) = (γ1,a , γ2,a , µ1,a , −τ n−1/2 ) for a fixed positive constant τ ,

 lim P(Zn > z1−α ) = Φ

n→∞

−z1−α + 

γ1,a γ2,a µ31,a τ h(ma )T V(ma )h(ma )

 .

(6)

Under the fixed alternative (γ1 , γ2 , µ1 , µ2 ) = (γ1,a , γ2,a , µ1,a , µ2,a ) for a fixed negative constant µ2,a , lim P(Zn > z1−α ) = 1.

(7)

n→∞

In words, statement (5) says that an approximate level α procedure entails rejecting the unilateral null hypothesis on observing Zn greater than z1−α , statement (6) says that the procedure is asymptotically locally unbiased, and statement (7) says that the procedure is consistent. We close Section 2 with two remarks. Remark 1. Relation (6) is useful for approximate sample size calculations. For example, suppose that µ1 = 2, γ1 = 0.10, and γ2 = 0.10. Then h(ma )T V(ma )h(ma ) = 2 and, when |µ2 | is small, an approximate sample size providing 80% power to reject the unilateral null hypothesis at level 0.05 is h(ma )T V(ma )h(ma ) (z0.80 + z1−0.05 )2

γ12 γ22 µ61 µ22

=

1932

µ22

.

Moreover, analogues to (6) exist for local alternative sequences with γ1 or γ2 or µ1 tending to 0, rather than µ2 . For instance, with γ2 = min{1, τ n−1/2 } the γ1,a γ2,a µ31,a τ in (6) is replaced by γ1,a µ1,a |µ2,a |(µ1,a − µ2,a )2 τ . Remark 2. Although the restrictions of nonnegative µ1 and nonpositive µ2 are motivated by the potential application of the BCN model to microarray data analysis, one can still test the null hypothesis that model (1) reduces to model (2) without such restrictions. Rejection of the null hypothesis then occurs on observing |Zn | greater than z1−α/2 . In this case, a different name for model (1), such as the doubly contaminated normal (DCN) model, may be more appropriate since the contamination in (1) need not be bilateral when the aforementioned restrictions are lifted. 3. A microarray data case study We analyzed microarray data, originally examined by Mao et al. (2005) and available via {http://www.partek.com/ Tutorials}, corresponding to four samples of cerebral tissue from four subjects with Down’s syndrome and seven samples from four controls without Down’s syndrome; three of the controls provided two samples, while the fourth provided one. Results are presented herein for the n = 251 genes on chromosome 21, the location of the trisomy in Down’s syndrome. For each of the 251 genes, we fit a linear mixed model Yij = β0 + β1 xi + αi + ϵij , where Yij denotes the expression level in sample j ∈ {1, 2} from subject i ∈ {1, 2, . . . , 8}, xi equals 1 if subject i had Down’s syndrome and 0 otherwise, β0 and β1 are fixed effects, αi is a random effect for subject i, and ϵij is an error term for sample j from subject i. Fixed effects and variance components were estimated using the lme function in the nlme package of R. Of particular interest were the 251 T test statistics for whether β1 = 0. Each of these T test statistics was transformed through successive application of the T cumulative distribution function (cdf) on 6 df and the inverse standard normal cdf, yielding X1 , . . . , X251 which we treated as arising from the BCN model. The unilateral null hypothesis was decisively rejected, with Zn as defined in (4) equaling 5.84. Fig. 1 depicts the transformed test statistics X1 , . . . , X251 , along with the UCN and BCN models fitted via an expectation-maximization approximation to maximum likelihood. The fitted UCN model had ( γ, µ) = (0.292, 2.41), while the fitted BCN model had ( γ1 ,  γ2 ,  µ1 ,  µ2 ) = (0.451, 0.549, 1.97, −1.04). The fitted UCN model did not describe the test statistics well; their empirical distribution was bimodal, with one mode clearly to the left of 0 and one to the right, whereas the fitted UCN model had its primary mode near the trough of the

166

R. Charnigo et al. / Statistics and Probability Letters 83 (2013) 163–167

(a) Fitted UCN model.

(b) Fitted BCN model.

Fig. 1. Fitted models from the microarray data case study. The left and right panels display the fitted UCN and BCN models, respectively, for test statistics comparing Down’s syndrome subjects to controls on mean expression levels at 251 genes on chromosome 21.

empirical distribution. The fitted BCN model described the test statistics imperfectly but more accurately; the two modes of the fitted BCN model roughly aligned with those of the empirical distribution. The large estimates of γ1 and γ2 in the BCN model may be surprising. Yet, Fig. 1 makes clear that the majority of test statistics are more plausibly realizations of normal distributions with means 1.97 and −1.04 than of the standard normal distribution. That said, the levels of differential expression suggested by a normal distribution with mean −1.04, or even a normal distribution with mean 1.97, are not particularly large. For example, a test statistic of −1.04 at a specific gene would not suffice to reject the null hypothesis of no differential expression at that gene, while a test statistic of 1.97 would not suffice to reject the null hypothesis under any reasonable adjustment for multiple comparisons. On the other hand, some of the individual test statistics were fairly large. For instance, three genes yielded test statistics greater than 3.50: CXADR at chromosome location chr21q21.1, SOD1 at chr21q22.1/21q22.11, and PRMT2 at chr21q22.3. Gene SOD1 was one of 26 genes on chromosome 21 identified by Mao et al. (2005) as consistently dysregulated, while CXADR and PRMT2 were not flagged by those authors. However, the overexpression of CXDAR in Down’s syndrome has been documented elsewhere (e.g., Wilcock, 2012), as has that of PRMT2 (e.g., Blehaut et al., 2010). The analysis of Mao et al. (2005) differed from ours, methodologically, in two key respects. First, they obtained verdicts on dysregulation based on an evaluation of differential expression across various cell types (cerebrum, cerebellum, heart, and astrocyte), whereas we focused more specifically on data pertaining to cerebral tissue. Second, they did not model the empirical distribution of test statistics to ascertain the pervasiveness, directionality, and magnitude of the differential expression. Rather, they sought to identify individual genes meeting a statistical criterion for dysregulation. Although the BCN model can be used in tandem with the filtration procedure of Dai and Charnigo (2008a) to decide which individual genes should be flagged, our purpose in this case study was to demonstrate how the BCN model (and, in particular, testing the unilateral null hypothesis) could easily but formally establish the simultaneous presence of overexpression and underexpression. Indeed, there appear to be fairly large proportions of overexpressed genes and underexpressed genes among Down’s syndrome subjects, although the magnitude of the differential expression is generally so modest that only a small minority of genes warrant being flagged individually. 4. Conclusion This letter provides a convenient procedure for testing whether a standard normal distribution is contaminated by one or by two other normal distributions. The procedure is based on whether a quadratic function of the first three sample moments is sufficiently large. Hence, resampling is not necessary to identify a critical value for rejection of the unilateral null hypothesis or to estimate power when the unilateral null hypothesis is false. As exemplified in our microarray data case study, the procedure may be used to explicitly assess whether both gene overexpression and gene underexpression are present among persons with a specified medical condition versus healthy controls. Other contamination tests previously applied to microarray data lack this capability.

R. Charnigo et al. / Statistics and Probability Letters 83 (2013) 163–167

167

s for In principle, one may also use the sample moments to estimate BCN model parameters by equating ms to m s ∈ {1, 2, 3, 4} and then solving the resulting system of four equations corresponding to four unknown parameters. Yet, our practical experience has been that the method of moments does not work well in this setting because of its sensitivity to model misspecification. In other words, if the BCN model is not the true mechanism generating the data but only an 2 is less than 1, then we ostensibly reasonable approximation, then the method of moments is apt to fail. To illustrate, if m must solve  γ1  µ21 +  γ2 µ22 = 0, which yields either negative  γ1 ,  γ2 or  µ1 ,  µ2 with nonzero imaginary part. Of course, that is a 2 is greater than 1, for more subtle reasons. This is why particularly simple example; the method of moments can also fail if m we used an expectation-maximization approximation to maximum likelihood for parameter estimation in our microarray data case study. A topic for future research will be to develop inferential tools for extensions of the BCN model that may permit more realistic descriptions of some data sets. One such extension accommodates a nuisance parameter for the within-component variances, 2 −1/2

(1 − γ1 − γ2 )(2πσ )

     −(x − µ1 )2 −(x − µ2 )2 −x2 2 −1/2 2 −1/2 + γ1 (2π σ ) exp + γ2 (2π σ ) exp , (8) exp 2σ 2 2σ 2 2σ 2 

while another accommodates unequal within-component variances,

(1 − γ1 − γ2 )(2π σ02 )−1/2 exp



     −(x − µ1 )2 −(x − µ2 )2 −x2 2 −1/2 2 −1/2 + γ ( 2 π σ + γ ( 2 π σ . (9) ) exp ) exp 1 2 1 2 2σ02 2σ12 2σ22

Above, σ 2 , σ02 , σ12 , and σ22 belong to a compact interval in (0, ∞) but are otherwise unknown. Developing tests for whether contaminating components can be removed from (8) and (9), without resorting to resampling, may be quite challenging. On the other hand, the negative twice the log likelihood ratio statistic for testing (1) versus (8) has an asymptotic χ12 distribution, and that for testing (1) versus (9) has an asymptotic χ32 distribution, when model (1) holds and the unilateral null hypothesis is false. (Convenient asymptotic distributions are available here because these tests are not questioning the number of contaminating components.) Thus, once an investigator is convinced that gene overexpression and gene underexpression are both present (e.g., after rejecting the unilateral null hypothesis using the procedure proposed in this letter), the investigator may also fit and test models more flexible than (1). Acknowledgments We thank Marepalli Rao, an associate editor, and a reviewer for helpful comments. References Allison, D.B., Gadbury, G.L., Heo, M., Fernandez, J.R., Lee, C.-K., Prolla, T.A., Weindruch, R., 2002. A mixture model approach for the analysis of microarray gene expression data. Computational Statistics & Data Analysis 39, 1–20. Benjamini, Y., Hochberg, Y., 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B 57, 289–300. Blehaut, H., Mircher, C., Ravel, A., Conte, M., de Portzamparc, V., Poret, G., de Kermadec, F., Rethore, M., Sturtz, F., 2010. Effect of leucovorin on the developmental quotient of children with Down’s syndrome and influence of thyroid status. PLoS ONE 5, e8394. Chen, H., Chen, J., 2001. The likelihood ratio test for homogeneity in the finite mixture models. Canadian Journal of Statistics 29, 201–215. Chen, H., Chen, J., Kalbfleisch, J., 2001. A modified likelihood ratio test for homogeneity in finite mixture models. Journal of the Royal Statistical Society, Series B 63, 19–29. Chen, J., Kalbfleisch, J.D., 2004. Modified likelihood ratio test in finite mixture models with a structural parameter. Journal of Statistical Planning and Inference 129, 93–107. Dai, H., Charnigo, R., 2010. Contaminated normal modeling with application to microarray data analysis. Canadian Journal of Statistics 38, 315–332. Dai, H., Charnigo, R., 2008a. Omnibus testing and gene filtration in microarray data analysis. Journal of Applied Statistics 35, 31–47. Dai, H., Charnigo, R., 2008b. Inferences in contaminated regression and density models. Sankhya 69, 842–869. Ferguson, T.S., 1996. A Course in Large Sample Theory. Chapman and Hall, New York. Gadbury, G.L., Page, G.P., Edwards, J., Kayo, T., Prolla, T.A., Weindruch, R., Permana, P.A., Mountz, J.D., Allison, D.B., 2004. Power and sample size estimation in high dimensional biology. Statistical Methods in Medical Research 13, 325–338. Genovese, C., Wasserman, L., 2002. Operating characteristics and extensions of the false discovery rate procedure. Journal of the Royal Statistical Society, Series B 64, 499–517. Ghosh, J.K., Sen, P.K., 1985. On the asymptotic performance of the log likelihood ratio statistics for the mixture model and related results. In: Le Cam, L.M., Olshen, R.A. (Eds.), Proceedings of the Berkeley Conference in Honor of J. Neyman and J. Kiefer, Vol. 2. Wadsworth, pp. 789–806. Keles, S., 2007. Mixture modeling for genome-wide localization of transcription factors. Biometrics 63, 10–21. Kendziorski, C.M., Newton, M.A., Lan, H., Gould, M.N., 2003. On parametric empirical Bayes methods for comparing multiple groups using replicated gene expression profiles. Statistics in Medicine 22, 3899–3914. Li, P., Chen, J., 2010. Testing the order of a finite mixture model. Journal of the American Statistical Association 105, 1084–1092. Li, P., Chen, J., Marriott, P., 2009. Non-finite Fisher information and homogeneity: the EM approach. Biometrika 96, 411–426. Mao, R., Wang, X., Spitznagel, E., Frelin, L., Ting, J., Ding, H., Kim, J., Ruczinski, I., Downey, T., Pevsner, J., 2005. Primary and secondary transcriptional effects in the developing human Down syndrome brain and heart. Genome Biology 6, R107. Markitsis, A., Lai, Y., 2010. A censored beta mixture model for the estimation of the proportion of non-differentially expressed genes. Bioinformatics 26, 640–646. Newton, M.A., Kendziorski, C.M., Richmond, C.S., Blattner, F.R., Tsui, K.W., 2001. On differential variability of expression ratios: improving statistical inference about gene expression changes from microarray data. Journal of Computational Biology 8, 37–52. Wilcock, D., 2012. Neuroinflammation in the aging Down syndrome brain; lessons from Alzheimer’s disease. Current Gerontology and Geriatrics Research 2012, 170276.