A generalized false discovery rate in microarray studies


Computational Statistics and Data Analysis 55 (2011) 731–737


Computational Statistics and Data Analysis journal homepage: www.elsevier.com/locate/csda

Moonsu Kang a,∗, Heuiju Chun b

a Department of Physiology, College of Medicine, Hanyang University, Seoul 133-791, Republic of Korea

b Department of Data Management, College of Commerce and Business, Pusan University of Foreign Studies, Busan, 608-347, Republic of Korea

Article info

Article history: Received 31 December 2009; Received in revised form 6 May 2010; Accepted 16 June 2010; Available online 20 June 2010

Keywords: Microarray data; Familywise error rate; k-FDR; Poisson approximation

Abstract

The problem of identifying differentially expressed genes in a microarray experiment is considered. This problem calls for a multiple testing setup suited to high-dimensional, low-sample-size testing problems in highly nonstandard settings. Control of the familywise error rate (FWER) is too conservative, whereas the less conservative false discovery rate (FDR) has received considerable attention in a wide variety of research areas such as genomics and large biological systems. Recently, the k-FDR, which generalizes the FDR and is less conservative than it, was proposed by Sarkar (2007). Most current FDR procedures assume restrictive dependence structures, which makes them less reliable. The purpose of this paper is to address these very large multiplicity problems by adopting a k-FDR controlling procedure under suitable dependence structures, based on a Poisson distributional approximation in a unified framework. We compare the performance of the proposed k-FDR procedure with that of other FDR controlling procedures, using the leukemia microarray study of Golub et al. (1999) and simulated data. For power considerations, the FDR procedures are assessed using the false negative rate (FNR). Unbiasedness is appraised by FDR ≤ α and a higher value of 1 − (FDR + FNR). The proposed k-FDR procedure is characterized by greater power without much elevation of the k-FDR.

© 2010 Elsevier B.V. All rights reserved.

∗ Corresponding author. Tel.: +82 10 6331 0853. E-mail addresses: [email protected] (M. Kang), [email protected] (H. Chun).

0167-9473/$ – see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.csda.2010.06.017

1. Introduction

A DNA microarray is a relatively new biotechnology for monitoring the expression levels of thousands of genes simultaneously. A basic task is to identify differentially expressed genes under different experimental setups, which involves simultaneously testing, for each gene, the null hypothesis of no association between the expression levels and explanatory variables or covariates (Dudoit et al., 2003). However, this technology poses statistical challenges because the data are high dimensional with very little replication, a situation known as the curse of dimensionality. Traditional multiplicity methods control the family-wise error rate (FWER), that is, the probability of committing any type I error among all hypotheses, at a preassigned level $\alpha$. With a large number of possibly correlated genes tested, however, FWER control is generally unsuitable because it misses truly differentially expressed genes. Benjamini and Hochberg (1995) introduced the FDR as an alternative; it is the expected proportion of type I errors among the rejected hypotheses (genes).

There are $m$ hypothesis testing problems, $H_{0j}$ vs. $H_{1j}$, $j = 1, \ldots, m$. Among these hypotheses, an unknown number $m_0$ are true null hypotheses and the remaining $m_1$ ($= m - m_0$) are false null hypotheses; a true (false) hypothesis refers to a null (alternative) hypothesis being true. Table 1 summarizes the possible outcomes.

Table 1
Multiple hypothesis testing.

                 Not rejected   Rejected   Total
True null        U              V          m0
Non-true null    T              S          m − m0
Total            m − R          R          m

In genomic data, $m$ is large, the test statistics $T_j$ for testing $H_{0j}$ vs. $H_{1j}$ are not necessarily independent, and the sample size $n$ may be much smaller than $m$ ($m \gg n$). Neither $m_0$ nor the indices of the true null hypotheses are known. Hence, current FDR controlling methods are highly conservative in terms of power. For example, Dudoit et al. (2003) used the Benjamini and Hochberg (1995) procedure in microarray data analysis, but this procedure requires independence of the genes tested. The significance analysis of microarrays (SAM) method (Tusher et al., 2001) seeks a null distribution by exploiting the permutation distribution of the data, resampling expression levels across all genes and experimental conditions. However, the number of permutations is too small to determine a null distribution, and the method does not generate biologically plausible null distributions for microarray data. On the other hand, Wu (2008) proposed a form of conditional dependence between genes termed spatial dependence, under which false null hypotheses tend to occur together. This dependence structure has been shown to be quite plausible for data such as microarray or fMRI data. Thus, allowing for local dependence among false null hypotheses, we provide a less conservative FDR controlling procedure, namely a k-FDR controlling procedure, which controls the expected proportion of false rejections among all rejections when at least $k$ null hypotheses are falsely rejected (Lehmann et al., 2005; Sarkar, 2007). By incorporating Poisson distributional assumptions, the proposed k-FDR controlling procedure does not rely on restrictive dependence assumptions among genes. We rigorously construct a suitable multiple testing procedure for high-dimensional data.
The Poisson approximation method is used to construct the distributions of $V$ and $R$. From the power perspective, we also introduce the unbiasedness property in terms of maintaining a high value of 1 − (k-)FDR − FNR (Sarkar, 2004). That is, a good multiple testing procedure is one with a high proportion of correct decisions relative to incorrect decisions.

Section 2 presents a general framework for the model and the corresponding assumptions under which the proposed k-FDR procedure is derived. In Section 3, we show how to use the Stein method to derive the Poisson approximation. Section 4 describes the estimation of the proposed k-FDR and of the false negative rate (FNR); we seek a k-FDR procedure satisfying the unbiasedness property. Section 5 reports a simulation study of the proposed FDR under independence and under a certain form of dependence, compared with other conventional FDR procedures, and applies the proposed procedure to the real microarray data set of Golub et al. (1999). The last section gives a general discussion.

2. Model and assumptions

We use the sample data $\{(x_i, y_i)\}_{i=1,\ldots,n}$, formed by the expression profiles $x_i$ and the response or covariate $y_i$, to test hypotheses regarding the joint distribution of the expression measures $X = (X_1, \ldots, X_m)$ and the response or covariate $Y$. All $m$ genes tested are classified into two groups: non-differentially expressed genes (NDG) and differentially expressed genes (DG). Suppose that there are $m_0$ NDGs and $m_1$ DGs among the $m$ genes, which correspond to true null hypotheses and false null hypotheses, respectively, in the multiple testing context. In general, the number of DGs is relatively small (though not too small) compared to the number of NDGs ($m_0 \gg m_1$). The expression levels of the $m$ tested genes may not be stochastically independent. NDGs tend to have small expression levels, whereas DGs tend to have large ones.
We assume that the NDGs are stochastically independent of one another and of the DG expressions, while the DGs are (locally) dependent on each other. For each gene we construct a test statistic $T_j$ for testing $H_{0j}$: gene $j$ is an NDG vs. $H_{1j}$: gene $j$ is a DG, for $j = 1, \ldots, m$. To accommodate two-sided alternatives, we take the $T_j$ to be nonnegative (if needed, by taking absolute values), so that if $t_j$ is the observed value of $T_j$ for the $j$th gene, the observed significance levels (OSL), or p-values, are

$$P_j = P(T_j > t_j \mid H_{0j}), \quad j = 1, \ldots, m.$$
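As an illustration of the p-value definition above, the following minimal sketch computes upper-tail p-values for a statistic that is assumed to be standard normal under the null hypothesis (the z-test case used later in the simulation study). The function names and the example statistics are hypothetical and serve only to show how taking absolute values turns a two-sided problem into an upper-tail one.

```python
from statistics import NormalDist

# Assumption for this sketch: the test statistic is N(0, 1) under H0.
_STD_NORMAL = NormalDist()

def upper_tail_p_value(t_obs: float) -> float:
    """P(T > t_obs | H0) for a statistic that is N(0, 1) under the null."""
    return 1.0 - _STD_NORMAL.cdf(t_obs)

def two_sided_p_value(t_obs: float) -> float:
    """P(|T| > |t_obs| | H0): the upper tail of the nonnegative statistic |T|."""
    return 2.0 * (1.0 - _STD_NORMAL.cdf(abs(t_obs)))

# Hypothetical observed statistics for three genes.
observed = [0.3, -1.96, 2.8]
p_values = [two_sided_p_value(t) for t in observed]
```

Any other null distribution could be substituted for the normal one; only the tail probability changes.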

3. Stein method and Poisson approximation

Define $\mathcal{I} = \{1, \ldots, m\}$ as the set of indices of the hypotheses and $\mathcal{I}_0 = \{1, \ldots, m_0\}$ as the set of indices of the NDGs. Further, let $\mathcal{I}_1 = \mathcal{I}_0^C$ be the set of indices of the DGs. We introduce a useful dependence structure in this context (Chen and Shao, 2004).


Definition 3.1. A sequence of nonnegative integer-valued random variables $Y_\alpha$, $\alpha \in \Gamma$, is said to be locally dependent if for each $\alpha \in \Gamma$ there exists $A_\alpha \subseteq \Gamma$ with $\alpha \in A_\alpha$ such that $Y_\alpha$ is independent of $(Y_\beta, \beta \in A_\alpha^C)$.

3.1. Distribution of true null hypotheses

For $i \in \mathcal{I}_0$, the index set of the NDGs, let $P_i$ be the p-value of NDG $i$ and $c_i$ its critical value, so that NDG $i$ is erroneously rejected when $P_i < c_i$. Let $I_i = I(P_i < c_i)$. Under the assumption that the $P_i$, $i \in \mathcal{I}_0$, are marginally uniform on $[0, 1]$, we have $q_i = \Pr(I_i = 1) = c_i$. We assume that the sequence $I_i$, $i \in \mathcal{I}_0$, is independent, in line with Section 2. By virtue of Lemma 7 in Boutsikas and Koutras (2000), we obtain an error bound for the Poisson approximation, with rate $\lambda_0$, to the distribution of the sum of the $I_i$'s.

Theorem 3.2.

$$
d_W\!\left(\mathcal{L}\!\left(\sum_{i=1}^{m_0} I_i\right), \mathrm{Po}\!\left(\sum_{i=1}^{m_0} q_i\right)\right)
= d_W\!\left(\mathcal{L}\!\left(\sum_{i=1}^{m_0} I_i\right), \mathrm{Po}(\lambda_0)\right)
\le \sum_{i=1}^{m_0} q_i^2 = \sum_{i=1}^{m_0} c_i^2
\le m_0 \cdot \left(\max_{i \in \mathcal{I}_0} c_i\right)^2, \tag{3.1}
$$

where $d_W$ is the Wasserstein distance.

This upper bound converges to 0 if and only if $\max_{i \in \mathcal{I}_0} c_i = o(m_0^{-1/2})$. Define the number of NDGs declared to be differentially expressed as $V = \sum_{i \in \mathcal{I}_0} I_i$. The distribution of $V$ is approximated by a Poisson distribution with rate $\lambda_0 = \sum_{i \in \mathcal{I}_0} c_i$.

3.2. Distribution of false null hypotheses

Stein's method for the Poisson approximation was first developed by Chen (1975). The framework of Stein's method for the Poisson process approximation has been presented by many authors (Barbour, 1988; Barbour and Brown, 1992; Barbour et al., 1992; Roos, 1994; Brown and Xia, 2001; Chen and Shao, 2004; Chen and Xia, 2004; Barbour and Xia, 2006). A general result in the Poisson process approximation is proved by taking the local approach. We do not go into the theory of Stein's method in this paper.

Assume that the $I_i$, $i \in \mathcal{I}_1$ (the index set of the DGs), are locally dependent. Let $I_i$, $i \in \mathcal{I}_1$, be a Bernoulli variable with $\Pr(I_i = 1) = q_i$, the probability that DG $i$ is declared to be differentially expressed. For each $i \in \mathcal{I}_1$, let $A_i \subset \mathcal{I}_1$ with $i \in A_i$, and let the size of each $A_i$ be $2b + 1$. Let $\Gamma_i^s$ be an index set of summands other than $i$ that are more strongly related to $I_i$, and let $S_i = \sum_{j \in \Gamma_i^s} I_j$.
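The neighbourhood construction just described can be made concrete with a small sketch. It assumes, purely for illustration, that the DGs are ordered along a line, that each neighbourhood is the window of indices within distance b of i, that the strongly related set is this window without i itself, and that windows are truncated at the two ends. None of these choices is prescribed by the text; the function names are hypothetical.

```python
def neighbourhoods(m1: int, b: int):
    """Return (A, gamma_s) for DG indices 0, ..., m1 - 1.

    A[i] is the window of at most 2b + 1 indices centred at i and
    gamma_s[i] is the same window with i removed (assumed construction).
    """
    A, gamma_s = [], []
    for i in range(m1):
        lo, hi = max(0, i - b), min(m1 - 1, i + b)  # truncate at the ends
        window = list(range(lo, hi + 1))
        A.append(window)
        gamma_s.append([j for j in window if j != i])
    return A, gamma_s

def strong_neighbour_sums(indicators, gamma_s):
    """S_i = sum of the rejection indicators I_j over j in gamma_s[i]."""
    return [sum(indicators[j] for j in g) for g in gamma_s]

# Hypothetical rejection indicators for seven DGs with half-width b = 2.
A, gamma_s = neighbourhoods(7, 2)
S = strong_neighbour_sums([1, 0, 1, 1, 0, 0, 1], gamma_s)
```

For interior indices each window has exactly 2b + 1 elements, so each sum has 2b summands.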

Using Theorem 1 in Roos (1994), with $S = \sum_{i \in \mathcal{I}_1} I_i$ and $\mathcal{I}_1$ the index set of the DGs,

$$
d_{TV}(\mathcal{L}(S), \mathrm{Po}(\lambda_a)) \le l_2(\lambda_a) \sum_{i \in \mathcal{I}_1} \left( (E(I_i))^2 + E(I_i)E(S_i) + E(I_i S_i) \right) + l_1(\lambda_a) \sum_{i \in \mathcal{I}_1} \eta_i, \tag{3.2}
$$

where $d_{TV}$ is the total variation distance, $\lambda_a = E(S) = \sum_{i \in \mathcal{I}_1} q_i$, $l_1(\lambda_a) = \sqrt{2/(e\lambda_a)}$, $l_2(\lambda_a) = \lambda_a^{-1}(1 - e^{-\lambda_a})$ and $\eta_i = E\left| E\{I_i \mid (I_j, j \in \Gamma_i^w)\} - E(I_i) \right|$, with $\Gamma_i^w = \mathcal{I}_1 \setminus (\{i\} \cup \Gamma_i^s)$. In fact, since $E\{I_i \mid (I_j, j \in \Gamma_i^w)\} = E(I_i)$, we have $\eta_i = 0$ for all $i \in \mathcal{I}_1$. Moreover,

$$
E(I_i S_i) = E(E(I_i S_i \mid I_i)) = E(I_i S_i \mid I_i = 1)\Pr(I_i = 1) + E(I_i S_i \mid I_i = 0)\Pr(I_i = 0) \le 2b \left( \max_{i \in \mathcal{I}_1} G(c_i) \right)^2,
$$

where $G(\cdot)$ is the distribution function for DGs and $b$ is half of the block size (or neighborhood size). Therefore, the upper bound equals $l_2(\lambda_a) \cdot (4b + 1) \cdot (\max_{i \in \mathcal{I}_1} G(c_i))^2$. Minimizing the upper bound gives us the Poisson approximation of $S$ with rate $\lambda_a$.

Now we need to estimate the distribution $G(\cdot)$. The block bootstrap captures the dependence of consecutive observations by resampling blocks of observations. Blocks are drawn randomly with replacement from the set of blocks and put together to form the bootstrap resample (Hall et al., 1995). The block bootstrap is a very powerful method for dependent data. We do not review the theory of the overlapping (moving) block bootstrap in this paper. Using the moving block bootstrap with an appropriate block length enables us to estimate the distribution of false null hypotheses by $\hat{G}$.
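The moving block bootstrap described above can be sketched as follows. This is a generic illustration rather than the authors' implementation: the helper names are hypothetical, and the way the estimate of G at a critical value c is formed here (the bootstrap average of the empirical proportion of DG p-values below c) is an assumption made for the example.

```python
import random

def moving_block_bootstrap(series, block_length, rng):
    """One resample: draw overlapping blocks with replacement and concatenate."""
    n = len(series)
    if not 1 <= block_length <= n:
        raise ValueError("block_length must lie between 1 and len(series)")
    starts = range(n - block_length + 1)   # all overlapping block start points
    n_blocks = -(-n // block_length)       # ceil(n / block_length)
    resample = []
    for _ in range(n_blocks):
        s = rng.choice(starts)
        resample.extend(series[s:s + block_length])
    return resample[:n]                    # trim to the original length

def estimate_G(p_values_dg, c, block_length=10, n_boot=200, seed=0):
    """Bootstrap average of the proportion of p-values below c (assumed estimator)."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(n_boot):
        res = moving_block_bootstrap(p_values_dg, block_length, rng)
        proportions.append(sum(p < c for p in res) / len(res))
    return sum(proportions) / n_boot
```

Because whole blocks of consecutive observations are kept together, short-range dependence within a block is preserved in each resample; the block length controls how much of that dependence is retained.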


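4. Proposed k-FDR procedure and false negative rate (FNR)

Based on the Poisson approximations of $V$ and $S$, $R$ also has a Poisson distribution, with parameter $\lambda = \sum_{i \in \mathcal{I}_0} c_i + \sum_{i \in \mathcal{I}_1} G(c_i)$, where $\mathcal{I}_0$ and $\mathcal{I}_1$ are the index sets of the NDGs and DGs. If $\max_{i \in \mathcal{I}_0} c_i = o(m_0^{-1/2})$ and the upper bound for the false null hypotheses satisfies $l_2(\lambda_a) \cdot (4b + 1) \cdot (\max_{i \in \mathcal{I}_1} G(c_i))^2 = o(1)$, then, given $R = r$, $V$ follows a binomial distribution with parameters $r$ and $\lambda_0/\lambda$, and $S$ follows a binomial distribution with parameters $r$ and $\lambda_a/\lambda$. By taking the Poisson approximation, the above results allow for more general dependence structures among the genes (or subjects) and lead to a more concrete form of the k-FDR procedure, as below. The k-FDR is given by

$$
\begin{aligned}
E\left[\frac{V I(V \ge k)}{R \vee 1}\right]
&= \sum_{r=1}^{m} E\left[\frac{V I(V \ge k)}{R} \,\middle|\, R = r\right] \cdot \Pr(R = r \mid R > 0) \cdot \Pr(R > 0) \\
&= \sum_{r=1}^{m} \frac{1}{r}\, E(V I(V \ge k) \mid R = r) \cdot \Pr(R = r) \\
&= \sum_{r=1}^{m} \frac{1}{r} \left[ \sum_{v=k}^{\infty} v \cdot \Pr(V = v \mid R = r) \right] \cdot \Pr(R = r) \\
&= \sum_{r=1}^{m} \frac{1}{r} \left[ \sum_{v=0}^{\infty} v \Pr(V = v \mid R = r) - \sum_{v=0}^{k-1} v \Pr(V = v \mid R = r) \right] \cdot \Pr(R = r) \\
&= \sum_{r=1}^{m} \frac{1}{r} \left[ r \cdot \frac{\lambda_0}{\lambda} - \sum_{v=0}^{k-1} v \cdot \frac{r!}{(r - v)!\, v!} \cdot \left(\frac{\lambda_0}{\lambda}\right)^{v} \cdot \left(1 - \frac{\lambda_0}{\lambda}\right)^{r - v} \right] \cdot \frac{\exp(-\lambda)\,\lambda^{r}}{r!}.
\end{aligned}
$$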
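The final expression of the k-FDR derivation can be evaluated directly once the two Poisson rates are available. The sketch below is a plain numerical transcription of that last line, not the authors' software; the function names and the input values (900 NDGs with a common critical value and 100 DGs with a common estimated G value) are hypothetical and chosen only for illustration.

```python
import math

def poisson_pmf(r: int, lam: float) -> float:
    """Pr(R = r) for R ~ Poisson(lam), evaluated on the log scale for stability."""
    return math.exp(-lam + r * math.log(lam) - math.lgamma(r + 1))

def k_fdr(lam0: float, lam_a: float, m: int, k: int = 1) -> float:
    """Evaluate the last line of the k-FDR derivation for given Poisson rates."""
    lam = lam0 + lam_a   # rate of R = V + S
    pi0 = lam0 / lam     # binomial success probability of V given R = r
    total = 0.0
    for r in range(1, m + 1):
        # sum_{v=0}^{k-1} v * Pr(V = v | R = r); terms with v > r vanish
        low = sum(
            v * math.comb(r, v) * pi0 ** v * (1.0 - pi0) ** (r - v)
            for v in range(min(k - 1, r) + 1)
        )
        total += (r * pi0 - low) / r * poisson_pmf(r, lam)
    return total

# Hypothetical inputs: 900 NDGs with critical value 0.001 each, and
# 100 DGs whose estimated G(c_i) is 0.3 each (illustrative numbers only).
lam0 = 900 * 0.001
lam_a = 100 * 0.3
for k in (1, 2, 3):
    print(k, round(k_fdr(lam0, lam_a, m=1000, k=k), 5))
```

For k = 1 the inner sum is zero and the expression collapses to (λ0/λ) times Pr(1 ≤ R ≤ m), which gives a convenient sanity check; larger k can only remove nonnegative terms, so the value cannot increase with k.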

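Using the distributional assumptions for $V$, $S$, and $R$, and writing $A = m - R$ for the number of hypotheses that are not rejected (Table 1), the FNR is given by

$$
\begin{aligned}
\mathrm{FNR} &= E\left[\frac{T}{A} \,\middle|\, A > 0\right] \cdot \Pr(A > 0) \\
&= E\left[\frac{m_1 - S}{m - R} \,\middle|\, R < m\right] \cdot \Pr(R < m) \\
&= \sum_{r=0}^{m-1} E\left[\frac{m_1 - S}{m - R} \,\middle|\, R = r\right] \cdot \frac{\Pr(R = r)}{\Pr(R < m)} \cdot \Pr(R < m) \\
&= \sum_{r=1}^{m-1} \frac{1}{m - r}\, [m_1 - E(S \mid R = r)] \cdot \frac{\exp(-\lambda)\,\lambda^{r}}{r!} \\
&= \sum_{r=1}^{m-1} \frac{1}{m - r} \left[ m_1 - r \cdot \frac{\lambda_a}{\lambda} \right] \cdot \frac{\exp(-\lambda)\,\lambda^{r}}{r!}.
\end{aligned}
$$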



Note that as $k$ increases, the corresponding k-FDR decreases without any change in the FNR. The unbiasedness criterion, defined as (k-)FDR + FNR ≤ 1, places an optimality condition on k-FDR and FNR-controlling procedures, which determines the critical points $c_i$, $i = 1, \ldots, m$. The higher 1 − (k-)FDR − FNR is, the better the corresponding multiple testing procedure. In this context, we use 1 − (k-)FDR − FNR as a measure of power.

5. Numerical analysis

5.1. Simulation study

In order to assess the improvement made by the proposed k-FDR procedure, it is numerically compared with Storey's FDR (Storey) (Storey, 2002; Storey et al., 2004), the Benjamini and Hochberg procedure (BH) (Benjamini and Hochberg, 1995), and the generalized Benjamini and Hochberg procedure (GeneralBH) (Sarkar, 2007) with k = 2 (Fig. 1). We apply the four FDR controlling procedures to simulated data, varying the proportion of true nulls π0 (= 0.10, 0.50, 0.90) at a fixed level α (= 0.05). In the independent simulation study, 1000 independent normal random variables $T_i$, $i = 1, \ldots, 1000$, with common correlation ρ = 0 are generated, and 1000 one-sided hypothesis tests of µ = 0 against µ = 2 are conducted using each of the four FDR procedures. Similarly, in the dependent simulation study, 1000 dependent random variables $T_i$ with ρ = 0.2 and ρ = 0.4, respectively, are generated and 1000 one-sided hypothesis tests are performed. The moving block bootstrap with block length 10 was used to estimate the distribution of false null hypotheses. Each individual hypothesis is tested by a z-test. Regardless of the value of ρ, the proposed k-FDR procedure is controlled at α, whereas the BH procedure and Storey's FDR fail to control the type I error. In addition, 1 − (k-)FDR − FNR values are computed in order to assess the power of each FDR controlling procedure. In Fig. 2, the power of the proposed k-FDR procedure is uniformly higher than that of the other FDR procedures for all values of ρ.

Fig. 1. FDR controlling procedures for ρ = 0, 0.2, 0.4, respectively; Storey's FDR (Storey), the Benjamini and Hochberg procedure (BH), and generalized Benjamini and Hochberg procedure (GeneralBH). [Three panels: FDR plotted against the exact proportion of true nulls.]

Fig. 2. 1 − FDR − FNR for ρ = 0, 0.2, 0.4, respectively; Storey's FDR (Storey), the Benjamini and Hochberg procedure (BH), and generalized Benjamini and Hochberg procedure (GeneralBH). [Three panels: 1 − FDR − FNR plotted against the exact proportion of true nulls.]

5.2. Application to real data

We examined how well the proposed k-FDR procedure performs in comparison with other FDR procedures on the dataset of Golub et al. (1999). Two hematologic malignancies are studied: acute lymphoblastic leukemia (ALL) and acute myeloid leukemia (AML). The aim of the study is to identify genes that are differentially expressed between ALL and AML. Affymetrix oligonucleotide chips were used to measure gene expression levels. The data have 500 genes and 38 tumor mRNA samples. The raw data had already been preprocessed.

Fig. 3. Untransformed expression level vs. transformed expression level. [Panels: gene expression level of sample 1 plotted against sample 2, and histograms (frequency) of gene expression level, each shown for the original and for the transformed data.]

Table 2
FDR procedures (real data).

α      Proposed   Storey   BH      GeneralBH
0.10   0.099      0.099    0.110   0.098
0.05   0.049      0.050    0.059   0.050
0.01   0.009      0.015    0.016   0.009

Fig. 3 compares the original expression levels with the signed square root transformed expression levels. By transforming the original expression levels, this Z-score transformation standardizes the data and allows microarray data to be compared independently of the original hybridization intensities. Based on the data normalized by the transformation, we computed each p-value using Welch's two-sample t-statistic and compared the performance of the proposed FDR with that of the others for a fixed proportion of true null hypotheses, π0 = 0.59, estimated by Storey's method. The moving block bootstrap with block length 10 was used to estimate the distribution of truly differentially expressed genes. Table 2 compares the values for the different FDR procedures. The proposed k-FDR with k = 2 is controlled at all levels α = 0.1, 0.05, and 0.01, whereas the BH procedure exceeds α at all three levels and Storey's FDR exceeds it at α = 0.01. Table 3 lists the 30 most significant genes when the FDR is controlled at α = 0.1. We conclude that the proposed k-FDR is better suited to real microarray data structures.
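Several ingredients of this analysis are standard and can be sketched generically: the signed square root transform, a Storey-type estimate of π0 from the p-values, and the Benjamini and Hochberg step-up rule used as a comparison procedure. The code below is an illustrative sketch, not the authors' implementation; the tuning value 0.5 in the π0 estimator is an assumption (the text does not state which value was used), and the function names are hypothetical.

```python
import math

def signed_sqrt(x: float) -> float:
    """Signed square root transform: sign(x) * sqrt(|x|)."""
    return math.copysign(math.sqrt(abs(x)), x)

def storey_pi0(p_values, lam: float = 0.5) -> float:
    """Storey-type estimate of the proportion of true nulls.

    lam = 0.5 is an assumed tuning value; the estimate is capped at 1.
    """
    m = len(p_values)
    return min(1.0, sum(p > lam for p in p_values) / (m * (1.0 - lam)))

def benjamini_hochberg(p_values, alpha: float):
    """Indices rejected by the BH step-up procedure at level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda j: p_values[j])
    cutoff_rank = 0
    for rank, j in enumerate(order, start=1):
        if p_values[j] <= rank * alpha / m:
            cutoff_rank = rank   # keep the largest rank meeting the bound
    return sorted(order[:cutoff_rank])

# Hypothetical p-values for ten genes.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(p, alpha=0.05)
```

The step-up rule rejects every hypothesis whose p-value is at or below the largest ordered p-value that satisfies its rank-dependent bound, which is why the loop records the last rank that passes rather than stopping at the first failure.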

6. Summary and conclusions

Analyzing high-dimensional, low-sample-size data (e.g. microarray data) requires a suitable multiple testing procedure to control the type I error. The k-FDR, as a less conservative notion, brings new insight to conventional FDR controlling procedures. In this context, we developed a k-FDR procedure using a Poisson distributional assumption, which allows for more general dependence structures. The proposed k-FDR procedure does not depend on a particular dependence structure. Results from the numerical study confirm that the procedure is controlled at the significance level α. In terms of unbiasedness, the proposed k-FDR procedure has higher power than the other FDR procedures.


Table 3
The 30 most significant genes at FDR = 0.1.

p-value        Gene
1.381111e−10   C-myb gene extracted from human (c-myb) gene, comp
2.138241e−10   FAH fumarylacetoacetate
3.837362e−09   Zyxin
6.082366e−09   Leukotriene C4 synthase (LTC4S) gene
2.221575e−08   TCF3 transcription factor 3 E2A immunoglobulin en
2.517146e−08   RETINOBLASTOMA BINDING PROTEIN P48
3.740919e−08   CTPS CTP synthetase
5.867391e−08   CCND3 cyclin D3
6.796881e−08   Clone 22 mRNA, alternative splice variant alpha-1
8.590343e−08   MB-1 gene
8.639399e−08   LEPR leptin receptor
9.888047e−08   Thrombospondin-p50 gene extracted from human throm
1.352416e−07   PROTEASOME IOTA CHAIN
1.820797e−07   RPA1 replication protein A1 (70 kD)
1.890900e−07   MYL1 myosin light chain (alkali)
2.368127e−07   TOP2B topoisomerase (DNA) II beta (180 kD)
2.574041e−07   ACADM acyl-coenzyme A dehydrogenase, C-4 to C-12 s
2.796545e−07   Cytoplasmic dynein light chain 1 (hdlc1) mRNA
3.576030e−07   CST3 cystatin C amyloid angiopathy and cerebral h
4.776810e−07   GB DEF = homeodomain protein HoxA9 mRNA
5.193354e−07   LYN V-yes-1 yamaguchi sarcoma viral related oncoge
5.590720e−07   PRG1 proteoglycan 1, secretory granule
6.749931e−07   Transcriptional activator hSNF2b
6.875291e−07   CYP2C18 cytochrome P450, subfamily IIC
7.288953e−07   Liver mRNA for interferon-gamma inducing factor IG
7.733038e−07   Inducible protein mRNA
8.367065e−07   Catalase (EC 1.11.1.6) 5primeflank and exon 1 mapp
8.565317e−07   CD33 CD33 antigen (differentiation antigen)
9.520948e−07   CARCINOEMBRYONIC ANTIGEN PRECURSOR
9.716227e−07   MCM3 minichromosome maintenance deficient

References

Barbour, A.D., 1988. Stein's method and Poisson process convergence. Journal of Applied Probability 25A, 175–184.
Barbour, A.D., Brown, T.C., 1992. Stein's method and point process convergence. Stochastic Processes and their Applications 43, 9–31.
Barbour, A., Holst, L., Janson, S., 1992. Poisson Approximation. Oxford University Press, Oxford.
Barbour, A.D., Xia, A., 2006. On Stein's factors for Poisson approximation in Wasserstein distance. Bernoulli 12 (6), 943–954.
Benjamini, Y., Hochberg, Y., 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B 57, 289–300.
Boutsikas, M.V., Koutras, M.V., 2000. A bound for the distribution of the sum of discrete associated or negatively associated random variables. Annals of Applied Probability 10, 1137–1150.
Brown, T.C., Xia, A., 2001. Stein's method and birth–death processes. Annals of Probability 29, 1373–1403.
Chen, L.H.Y., 1975. Poisson approximation for dependent trials. Annals of Probability 3, 534–545.
Chen, L.H.Y., Shao, Q.-M., 2004. Normal approximation under local dependence. Annals of Probability 32, 1985–2028.
Chen, L.H.Y., Xia, A., 2004. Stein's method, Palm theory and Poisson process approximation. Annals of Probability 32 (3B), 2545–2569.
Dudoit, S., Shaffer, J.P., Boldrick, J.C., 2003. Multiple hypothesis testing in microarray experiments. Statistical Science 18, 71–103.
Golub, T.R., Slonim, D.K., Tamayo, P., Huard, R.C., Gaasenbeek, M., Mesirov, J.P., Coller, G.H., Loh, M.L., Downing, J.R., Caligiuri, M.A., Lander, E.S., 1999. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286, 531–537.
Hall, P., Horowitz, J.L., Jing, B.Y., 1995. On blocking rules for the bootstrap with dependent data. Biometrika 82 (3), 561–574.
Lehmann, E.L., Romano, J.P., Shaffer, J.P., 2005. On optimality of stepdown and stepup multiple test procedures. Annals of Statistics 33, 1084–1108.
Roos, M., 1994. Stein's method for compound Poisson approximation: the local approach. Annals of Applied Probability 4 (4), 1177–1187.
Sarkar, S.K., 2004. FDR-controlling stepwise procedures and their false negative rates. Journal of Statistical Planning and Inference 125, 119–137.
Sarkar, S.K., 2007. Stepup procedures controlling generalized FWER and generalized FDR. Annals of Statistics 35, 2405–2420.
Storey, J., 2002. A direct approach to false discovery rates. Journal of the Royal Statistical Society, Series B 64, 479–498.
Storey, J., Taylor, J.E., Siegmund, D., 2004. Strong control, conservative point estimation, and simultaneous conservative consistency of false discovery rates: a unified approach. Journal of the Royal Statistical Society, Series B 66, 187–205.
Tusher, V.G., Tibshirani, R., Chu, G., 2001. Significance analysis of microarrays applied to the ionizing radiation response. Proceedings of the National Academy of Sciences 98 (9), 5115–5121.
Wu, W.B., 2008. On false discovery control under dependence. Annals of Statistics 36, 364–380.