Weighted p-value procedures for controlling FDR of grouped hypotheses


Journal of Statistical Planning and Inference

Journal homepage: www.elsevier.com/locate/jspi

Haibing Zhao a,b, Jiajia Zhang c,*

a School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai 200433, PR China
b Key Laboratory of Mathematical Economics (SUFE), Ministry of Education, Shanghai 200433, PR China
c Department of Epidemiology and Biostatistics, University of South Carolina, Columbia, SC 29208, USA

Article info

Article history: Received 2 April 2013; received in revised form 17 April 2014; accepted 18 April 2014.

Keywords: Grouped data; Weighted p-values; FDR control; Multiple comparisons

Abstract

In this paper, we consider the multiple testing problems for grouped hypotheses. Two procedures are proposed based on weighted p-values, where the weights for p-values are obtained by maximizing a power-related objective function. We find that the proposed procedures can control the false discovery rate asymptotically, and are more powerful than existing methods asymptotically. We further examine their performances with extensive simulations. For illustration, we apply the proposed methods to the adequate yearly progress data. © 2014 Elsevier B.V. All rights reserved.

1. Introduction

Many multiple testing procedures have been proposed to control the false discovery rate (FDR). Among them, the linear step-up procedure proposed by Benjamini and Hochberg (1995), often referred to as the BH procedure, has been widely accepted in practice. Let $\gamma_{T0}$ be the proportion of true null hypotheses and $\alpha$ be the significance level of interest. When the null distributions are identical and the tests are independent, the BH procedure is conservative: it controls the FDR at level $\gamma_{T0}\alpha$. Previous researchers proposed different testing procedures that incorporate prior information to improve the BH procedure. One direction is to construct testing methods based on weighted p-values (Holm, 1979; Benjamini and Hochberg, 1997; Storey, 2002; Genovese et al., 2006; Roeder and Wasserman, 2009; Cai and Sun, 2009; Hu et al., 2010). The weighting scheme was first proposed by Holm (1979) to improve the power of the Bonferroni procedure while controlling the family-wise error rate (FWER). For FDR control, Benjamini and Hochberg (1997) developed two weighting methods: the p-value weighting method and the error weighting method. The modified BH (MBH) procedure (Storey, 2002) improves the power of the BH procedure by weighting p-values with $1/\gamma_{T0}$. Genovese et al. (2006) and Roeder and Wasserman (2009) discussed how robust the power of weighted p-value procedures is to weight misspecification. They showed that weighted p-values improve the power when the weights are appropriately assigned, and reduce it only slightly otherwise. Roquain and Van De Wiel (2009) derived the optimal weights under the assumptions that the non-null distributions are known and the number of rejected null hypotheses is fixed. However, they did not obtain an FDR controlling procedure with non-null distributions learned from data.

* Corresponding author. E-mail address: [email protected] (J. Zhang).

http://dx.doi.org/10.1016/j.jspi.2014.04.004 0378-3758/© 2014 Elsevier B.V. All rights reserved.

Please cite this article as: Zhao, H., Zhang, J., Weighted p-value procedures for controlling FDR of grouped hypotheses. Journal of Statistical Planning and Inference (2014), http://dx.doi.org/10.1016/j.jspi.2014.04.004i

H. Zhao, J. Zhang / Journal of Statistical Planning and Inference ] (]]]]) ]]]–]]]


The main focus of this paper is on testing grouped hypotheses. The groups may be formed from prior knowledge or expert suggestions. For example, Hu et al. (2010) divided their genes into groups with respect to Biological Process; Cai and Sun (2009) divided the adequate yearly progress (AYP) data (year 2003) into three groups according to school size. For grouped data, Cai and Sun (2009) proposed their CLfdr procedure and showed some optimality properties. However, p-value weighting methods remain attractive because they are robust (Genovese et al., 2006; Roeder and Wasserman, 2009) and much easier to carry out. Hu et al. (2010) used the group information to weight p-values; their method does not require the non-null distributions of the test statistics and is uniformly more powerful than the MBH procedure under certain conditions, one of which is that the non-null distributions of the different groups are the same. We refer to the weighting method of Hu et al. (2010) as the HZZ method for the remainder of the paper. In practice, the assumption that the groups share the same non-null distribution may not always be met, in which case the HZZ method may be less powerful than the MBH procedure. Moreover, incorporating information about the non-null distributions into the weights can further improve on the power of the HZZ and MBH procedures. Because the number of hypotheses in multiple testing is often very large, and detecting more false null hypotheses is both difficult and practically important, such a power improvement is meaningful; this motivated us to develop our procedures.

In practice, we often need to test multiple hypotheses simultaneously; throughout this paper they are assumed to be independent of each other. We propose to calculate the weights for the p-values by maximizing a power-related objective function, which is based on the empirical cumulative distribution functions (ecdf) of the p-values.
We propose two weighted testing procedures, which are shown to control the FDR asymptotically and to be more powerful than the MBH and HZZ procedures in the oracle case (the non-null distributions are known). In the data-driven case (the non-null distributions are unknown and learned from data), the proposed procedures are shown to be asymptotically equivalent to the oracle procedures.

The paper is organized as follows. In Section 2, we present notation and assumptions on the hypothesis distributions, and propose our procedures along with their asymptotic properties. Simulation studies are conducted in Section 3. We analyze the AYP (2007) data with the proposed methods in Section 4. Discussion and conclusions are given in Section 5. All proofs are provided in the Appendix.

2. Approach

2.1. Notations and assumptions

Suppose there are $m$ hypotheses to be tested simultaneously. They are divided into $G$ groups:

$$\{H^g_{0i} : i = 1, \ldots, m_g\}, \quad g = 1, \ldots, G.$$

Let $P_{gi}$ be the p-value for testing $H^g_{0i}$, and let the $P_{gi}$, $i = 1, \ldots, m_g$, $g = 1, \ldots, G$, be independent of each other. Under the null hypotheses, $P_{gi} \sim F_{g0}(p)$, where $F_{g0}(p)$ is assumed to be the cumulative distribution function of the uniform distribution $U(0, 1)$ throughout this paper. Under the non-null hypotheses, $P_{gi} \sim F_{g1}(p)$. Note that the p-values in a group share the same non-null distribution, but those in different groups may have different non-null distributions. Assume that there are $m_{g0}$ null hypotheses among the $m_g$ hypotheses in group $g$ and $m_{g1} = m_g - m_{g0}$, $g = 1, \ldots, G$, where $G$ is fixed throughout this paper. Let $\gamma = (\gamma_1, \ldots, \gamma_G)$ denote the proportion vector of hypotheses, $\gamma_0 = (\gamma_{10}, \ldots, \gamma_{G0})$ denote the null proportion vector, and $\gamma_1 = 1 - \gamma_0 = (\gamma_{11}, \ldots, \gamma_{G1})$ denote the non-null proportion vector, where $\gamma_g = m_g/m$ and $\gamma_{g0} = m_{g0}/m_g$.

2.2. Proposed test

In this subsection, we first define the objective function as the expected discovery rate of the fixed cut-off procedure, and define the optimal weights as those maximizing the objective function. Second, for the case of unknown non-null distributions, we derive the empirical objective function and its optimal weights. In order to control the FDR (asymptotically) at a significance level $\alpha$, we provide an approach to choose a suitable fixed or asymptotically fixed cut-off value. Last, we discuss the asymptotic FDR control of the proposed procedures when only upwardly biased estimates of the unknown parameters can be obtained.

2.3. The non-null distributions being known

Throughout this paper, we call a procedure "oracle" if the relevant prior information is known and "data-driven" if that information is learned from the data. Since the p-values in a group are assumed to share the same non-null distribution $F_{g1}(p)$, it is suitable to give the same weight to the p-values in the same group. The MBH procedure takes $w_g \equiv 1/\gamma_{T0}$ and the HZZ procedure takes $w_g = \gamma_{g1}/(\gamma_{g0}\gamma_{T1})$ for group $g$, where $\gamma_{T0} = \sum_{g=1}^{G} \gamma_g \gamma_{g0}$ is the overall proportion of true null hypotheses and $\gamma_{T1} = 1 - \gamma_{T0}$.
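To make the two reference weightings concrete, the following is a minimal sketch (not the authors' code) that computes the MBH and HZZ group weights from the group proportions and null proportions. The function names and the numerical values of `gamma` and `gamma0` are illustrative assumptions, and the sketch assumes $0 < \gamma_{g0} < 1$ so that no division by zero occurs.

```python
def total_null_proportion(gamma, gamma0):
    """gamma_T0 = sum_g gamma_g * gamma_g0."""
    return sum(g * g0 for g, g0 in zip(gamma, gamma0))


def mbh_weights(gamma, gamma0):
    """MBH: every group receives the same weight 1 / gamma_T0."""
    gamma_t0 = total_null_proportion(gamma, gamma0)
    return [1.0 / gamma_t0 for _ in gamma]


def hzz_weights(gamma, gamma0):
    """HZZ: w_g = gamma_g1 / (gamma_g0 * gamma_T1), with gamma_g1 = 1 - gamma_g0."""
    gamma_t1 = 1.0 - total_null_proportion(gamma, gamma0)
    return [(1.0 - g0) / (g0 * gamma_t1) for g0 in gamma0]


# Hypothetical two-group example (values chosen only for illustration).
gamma = [2 / 3, 1 / 3]    # group proportions m_g / m
gamma0 = [0.8, 0.9]       # null proportions within each group
w_mbh = mbh_weights(gamma, gamma0)
w_hzz = hzz_weights(gamma, gamma0)
```

In this example the HZZ method gives a larger weight to the group with the larger non-null proportion, while the MBH weight is the same for both groups.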

To obtain our weights, define the objective function as

$$\tilde{O}(\omega, t) = \sum_{g=1}^{G} \gamma_g \gamma_{g0} F_{g0}(w_g t) + \sum_{g=1}^{G} \gamma_g \gamma_{g1} F_{g1}(w_g t),$$

where $\omega = (w_1, \ldots, w_G)$. Similar to Roeder and Wasserman (2009), Hu et al. (2010) and Roquain and Van De Wiel (2009), to ensure that the FDR is controlled asymptotically, we impose a constraint on the weights:

C.i $\; w_g \ge 0,\ g = 1, \ldots, G,\ $ and $\ \sum_{g=1}^{G} \gamma_g \gamma_{g0} w_g = 1.$

The proposed optimal weights are obtained by maximizing $\tilde{O}(\omega, t)$ in $\omega$ with $t$ fixed under condition C.i. It is easy to see that the values of the optimal weights are affected not only by $\gamma_g$ and $\gamma_{g0}$, but also by $F_{g1}(\cdot)$. Unlike the HZZ weights, which only incorporate the information in $\gamma_g$ and $\gamma_{g0}$, the proposed weights also take the information in $F_{g1}(\cdot)$ into account. Because $F_{g0}$ is the $U(0,1)$ distribution function, the first sum equals $t \sum_{g=1}^{G} \gamma_g \gamma_{g0} w_g = t$ under C.i whenever $w_g t \le 1$ for all $g$, so it does not depend on $\omega$. Hence maximizing $\tilde{O}(\omega, t)$ is equivalent to maximizing the power function

$$\widetilde{\mathrm{Pow}}(\omega, t) = \sum_{g=1}^{G} \gamma_g \gamma_{g1} F_{g1}(w_g t). \qquad (1)$$
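The oracle maximization can be sketched numerically. The block below is an illustration under stated assumptions rather than the authors' implementation: it assumes two groups, one-sided p-values $1 - \Phi(Z)$ with $Z \sim N(\mu_g, \sigma_g^2)$ under the alternative (so that $F_{g1}$ has a closed form), and a plain grid search along the constraint line of C.i. The function names, the grid size and the parameter values are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()


def f1(p, mu, sigma):
    """cdf at p of the one-sided p-value 1 - Phi(Z) when Z ~ N(mu, sigma^2)."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    z = -STD.inv_cdf(p)                      # upper-p quantile of N(0, 1)
    return 1.0 - STD.cdf((z - mu) / sigma)


def power(w, t, gamma, gamma0, alt):
    """Eq. (1): sum_g gamma_g * gamma_g1 * F_g1(w_g * t)."""
    return sum(g * (1.0 - g0) * f1(wg * t, mu, sigma)
               for g, g0, wg, (mu, sigma) in zip(gamma, gamma0, w, alt))


def best_weights_two_groups(t, gamma, gamma0, alt, steps=2000):
    """Grid search over w_1; w_2 is then fixed by the constraint C.i."""
    c1, c2 = gamma[0] * gamma0[0], gamma[1] * gamma0[1]
    best_val, best_w = -1.0, None
    for k in range(steps + 1):
        w1 = k / steps / c1                  # c1 * w1 runs over [0, 1]
        w2 = max(0.0, (1.0 - c1 * w1) / c2)  # guard against rounding below 0
        val = power((w1, w2), t, gamma, gamma0, alt)
        if val > best_val:
            best_val, best_w = val, (w1, w2)
    return best_val, best_w


# Hypothetical oracle inputs (illustrative values only).
gamma, gamma0 = [2 / 3, 1 / 3], [0.8, 0.9]
alt = [(2.0, 1.0), (4.0, 1.0)]               # (mu_g1, sigma_g1) per group
t = 0.01
best_val, best_w = best_weights_two_groups(t, gamma, gamma0, alt)
```

For more than two groups a grid search becomes expensive, and a constrained numerical optimizer would be the natural replacement; the grid is used here only to keep the sketch self-contained.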

Remark. For fixed $t$, let WFC($\omega; t$) denote the weighted fixed cut-off procedure, which rejects the $i$th hypothesis in group $g$ if and only if $P_{gi}/w_g \le t$. It is easy to see that $\tilde{O}(\omega, t)$ is the expected proportion of hypotheses rejected by the WFC($\omega; t$) procedure.

2.4. The p-values' non-null distributions being estimated

When the non-null distributions are unknown, $\tilde{O}(\omega, t)$ needs to be estimated from data. We propose to estimate $\tilde{O}(\omega, t)$ by $\hat{O}(\omega, t)$, based on the ecdf of the p-values, where

$$\hat{O}(\omega, t) = \frac{1}{m} \sum_{g=1}^{G} \sum_{i=1}^{m_g} I(P_{gi} \le w_g t)$$

and $I(\cdot)$ is the indicator function. The value of $\omega$ that maximizes $\hat{O}(\omega, t)$ under the constraint condition C.i, with $\gamma_{g0}$ estimated by $\hat{\gamma}_{g0}$, can be found by numerical search.

2.5. Choose suitable cut-off t

2.5.1. $\gamma_{g0}$ and $F_{g1}$ are known

The preceding subsections show how to obtain the optimal weights for the p-values when the cut-off is fixed and does not depend on the data. Note that the cut-off $t$ must be chosen so that the FDR is controlled (asymptotically). We propose to choose the cut-off $t$ based on the MBH and HZZ procedures. Let the cut-off, denoted by $\tilde{t}_M$, be the maximum of $\tilde{t}_{BH}$ and $\tilde{t}_H$, where

$$\tilde{t}_{BH} = \sup\Big\{t : t + \sum_{g=1}^{G} \gamma_g \gamma_{g1} F_{g1}(t/\gamma_{T0}) \ge t/\alpha\Big\}$$

comes from the MBH procedure and

$$\tilde{t}_{H} = \sup\Big\{t : t + \sum_{g=1}^{G} \gamma_g \gamma_{g1} F_{g1}\big(\gamma_{g1} t/(\gamma_{g0}\gamma_{T1})\big) \ge t/\alpha\Big\}$$

comes from the HZZ procedure (Hu et al., 2010). That is, $\tilde{t}_M = \max(\tilde{t}_{BH}, \tilde{t}_H)$. In the definitions of $\tilde{t}_{BH}$ and $\tilde{t}_H$ (and of $t_{BH}$, $t_H$, $\hat{t}_{BH}$, $\hat{t}_H$, $t^*_{BH}$ and $t^*_H$ in the remaining sections), we set $1/0 = +\infty$ and $0/0 = 0$. For example, if $\gamma_{T1} = 0$ and $\gamma_{g1} = 0$, then $\gamma_{g1}/(\gamma_{g0}\gamma_{T1}) = 0$. Once the optimal weight vector $\tilde{\omega}_o$, which maximizes $\tilde{O}(\omega, \tilde{t}_M)$ under the constraint condition C.i, is obtained, the WFC($\tilde{\omega}_o; \tilde{t}_M$) procedure can be used to test $H^g_{0i}$, $i = 1, \ldots, m_g$, $g = 1, \ldots, G$. Our proposed procedures (the Pro1 and Pro2 procedures) are closely related to the WFC($\tilde{\omega}_o; \tilde{t}_M$) procedure. Let $\tilde{\omega}_{BH}$ be the $G$-dimensional vector $(1/\gamma_{T0}, \ldots, 1/\gamma_{T0})$ and $\tilde{\omega}_H = (\gamma_{11}/(\gamma_{10}\gamma_{T1}), \ldots, \gamma_{G1}/(\gamma_{G0}\gamma_{T1}))$. Then $\tilde{\omega}_{BH}$ and $\tilde{\omega}_H$ are the weights taken by the MBH and HZZ procedures, respectively (Hu et al., 2010). Note that both $\tilde{\omega}_{BH}$ and $\tilde{\omega}_H$ satisfy the condition C.i. Lemma 1 shows a property of WFC($\tilde{\omega}_o; \tilde{t}_M$).

Lemma 1. Suppose $\gamma_0 > 0$ and $\gamma_{T0} < 1$. The FDR of the WFC($\tilde{\omega}_o; \tilde{t}_M$) procedure is no greater than $\alpha + O(1/\sqrt{m})$, and its power (defined in Eq. (1)) is better than those of WFC($\tilde{\omega}_{BH}; \tilde{t}_{BH}$) and WFC($\tilde{\omega}_H; \tilde{t}_H$).
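As a quick check of the statement that both reference weight vectors satisfy the normalization $\sum_{g} \gamma_g \gamma_{g0} w_g = 1$, the short calculation below uses only $\gamma_{T0} = \sum_g \gamma_g \gamma_{g0}$, $\gamma_{g1} = 1 - \gamma_{g0}$ and $\sum_g \gamma_g = 1$. It assumes $\gamma_{g0} > 0$ and $\gamma_{T1} > 0$, so that the conventions for $1/0$ and $0/0$ are not needed.

$$\sum_{g=1}^{G} \gamma_g \gamma_{g0} \cdot \frac{1}{\gamma_{T0}} = \frac{\gamma_{T0}}{\gamma_{T0}} = 1, \qquad \sum_{g=1}^{G} \gamma_g \gamma_{g0} \cdot \frac{\gamma_{g1}}{\gamma_{g0}\gamma_{T1}} = \frac{1}{\gamma_{T1}} \sum_{g=1}^{G} \gamma_g (1 - \gamma_{g0}) = \frac{1 - \gamma_{T0}}{\gamma_{T1}} = 1.$$

Since both vectors are feasible, the maximizer of the objective at the common cut-off $\max(\tilde{t}_{BH}, \tilde{t}_H)$ cannot do worse than either of them at that cut-off, which is the intuition behind the power comparison in Lemma 1.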
2.5.2. $\gamma_{g0}$ and $F_{g1}$ are unknown

In this subsection, we consider how to choose a cut-off for FDR control and present our proposed procedures and their asymptotic properties as $m \to \infty$ when $\gamma_{g0}$ and $F_{g1}$ are unknown. Suppose $\gamma_g \to \pi_g$ and $\gamma_{g0} \to \pi_{g0}$ as $m \to \infty$. Let $\pi_{g1} = 1 - \pi_{g0}$, $\pi_{T0} = \sum_{g=1}^{G} \pi_g \pi_{g0}$ and $\pi_{T1} = 1 - \pi_{T0}$; $\pi = (\pi_1, \ldots, \pi_G)$, $\pi_0 = (\pi_{10}, \ldots, \pi_{G0})$ and $\pi_1 = 1 - \pi_0 = (\pi_{11}, \ldots, \pi_{G1})$. Define

$$\mathrm{Pow}(\omega, t) = \sum_{g=1}^{G} \pi_g \pi_{g1} F_{g1}(w_g t),$$

$$O(\omega, t) = \sum_{g=1}^{G} \pi_g \pi_{g0} F_{g0}(w_g t) + \sum_{g=1}^{G} \pi_g \pi_{g1} F_{g1}(w_g t)$$

and a constraint condition

C.1 $\; w_g \ge 0,\ g = 1, \ldots, G,\ $ and $\ \sum_{g=1}^{G} \pi_g \pi_{g0} w_g = 1.$

It is easy to see that $\widetilde{\mathrm{Pow}}(\omega, t) = \mathrm{Pow}(\omega, t) + o(1)$, $\tilde{O}(\omega, t) = O(\omega, t) + o(1)$ and $\sum_{g=1}^{G} \pi_g \pi_{g0} w_g = \sum_{g=1}^{G} \gamma_g \gamma_{g0} w_g + o(1)$. Let

$$t_{BH} = \sup\Big\{t : t + \sum_{g=1}^{G} \pi_g \pi_{g1} F_{g1}(t/\pi_{T0}) \ge t/\alpha\Big\}, \qquad t_{H} = \sup\Big\{t : t + \sum_{g=1}^{G} \pi_g \pi_{g1} F_{g1}\big(\pi_{g1} t/(\pi_{g0}\pi_{T1})\big) \ge t/\alpha\Big\}$$

and $t_M = \max(t_{BH}, t_H)$. Let $\hat{\pi}_{g0} > 0$ be consistent estimates of $\pi_{g0}$ as $m \to \infty$, with $\hat{\pi}_{g1} = 1 - \hat{\pi}_{g0}$, and let $\hat{\pi}_{T0} > 0$ be a consistent estimate of $\pi_{T0}$. Define

$$\hat{t}_{BH} = \sup\Big\{t : \frac{1}{m} \sum_{g=1}^{G} \sum_{i=1}^{m_g} I(P_{gi} \le t/\hat{\pi}_{T0}) \ge t/\alpha\Big\},$$

which is an estimate of $t_{BH}$, and

$$\hat{t}_{H} = \sup\Big\{t : \frac{1}{m} \sum_{g=1}^{G} \sum_{i=1}^{m_g} I\big(P_{gi} \le \hat{\pi}_{g1} t/(\hat{\pi}_{g0}\hat{\pi}_{T1})\big) \ge t/\alpha\Big\},$$

which is an estimate of $t_H$. Then we choose the cut-off $t$ to be $\hat{t}_M = \max(\hat{t}_{BH}, \hat{t}_H)$.

First we list some assumptions which are required in the following theorems:

A.1 Let $B(\omega, t) = t/O(\omega, t)$. $B(\omega_H, t)$ and $B(\omega_{BH}, t)$ have nonzero derivatives at $t_H$ and $t_{BH}$, respectively. Both $\lim_{t \to 0^+} B(\omega_H, t)$ and $\lim_{t \to 0^+} B(\omega_{BH}, t)$ are less than $\alpha$.
A.2 $F_{g1}(x)$ satisfies a uniform Lipschitz condition of order $\beta_g > 0$ for any $g$.
A.3 $\omega_o$ is the unique maximum point of $O(\omega, t_M)$ under the constraint condition C.1.

Assumption A.1 is introduced to ensure that $\hat{t}_H \to t_H$ and $\hat{t}_{BH} \to t_{BH}$ in probability as $m \to \infty$ (Hu et al., 2010). In the Appendix, we show that assumptions A.1 and A.2 hold when the p-values are calculated from normal distributions, and that A.3 holds when $\pi_{g1} > 0$ and the $F_{g1}$ are strictly concave functions for all $g$. Moreover, we show in the Appendix that the $F_{g1}$ calculated from normal distributions are strictly concave.

Let $\hat{\omega}_o = \hat{\omega}_o(\hat{t}_M)$ be the maximizer of $\hat{O}(\omega, \hat{t}_M)$ under the constraint condition C.1 with $\pi_{g0}$ estimated by $\hat{\pi}_{g0}$. Theorem 1 shows that the WFC($\hat{\omega}_o(\hat{t}_M); \hat{t}_M$) procedure, i.e. the data-driven WFC($\omega_o; t_M$) procedure, asymptotically retains the performance of the WFC($\omega_o; t_M$) procedure.

Theorem 1. Suppose $0 < \pi_{g0} < 1$, $g = 1, \ldots, G$, and the assumptions A.1–A.3 hold.
Then the FDR and power of the WFC($\hat{\omega}_o(\hat{t}_M); \hat{t}_M$) procedure are asymptotically equal to those of the WFC($\omega_o; t_M$) procedure, respectively.

2.5.3. Algorithm

Hereafter, we use $\omega_o(t)$ ($\hat{\omega}_o(t)$) to denote the $\omega$ maximizing $O(\omega, t)$ ($\hat{O}(\omega, t)$) under the constraint condition C.1 (with $\pi_{g0}$ estimated by $\hat{\pi}_{g0}$). Note that $\omega_o(t_M)$ is $\omega_o$ and that the result in Theorem 1 still holds for WFC($\omega_o(\hat{t}_M); \hat{t}_M$). Because $t_M$ is not easily obtained even when $F_{g1}$ is known, we consider WFC($\omega_o(\hat{t}_M); \hat{t}_M$) instead of WFC($\omega_o; t_M$) for convenience. We refer to the WFC($\omega_o(\hat{t}_M); \hat{t}_M$) (or WFC($\hat{\omega}_o; \hat{t}_M$)) procedure as the oracle (or data-driven) Pro1 procedure, which can be summarized as follows.

Oracle (data-driven) Pro1 procedure:
Step 1: Calculate the cut-offs $\hat{t}_{BH}$ and $\hat{t}_H$.
Step 2: Given $\hat{t}_M = \max(\hat{t}_{BH}, \hat{t}_H)$, find $\omega_o$ ($\hat{\omega}_o$) to maximize $O(\omega, \hat{t}_M)$ ($\hat{O}(\omega, \hat{t}_M)$) with respect to $\omega$ under the constraint condition C.1 (with $\pi_{g0}$ estimated by $\hat{\pi}_{g0}$).
Step 3: Weight the p-values as $\tilde{P}_{gi} = P_{gi}/w_g$, $i = 1, \ldots, m_g$, $g = 1, \ldots, G$, where $w_g$ is the $g$th component of $\omega_o$ ($\hat{\omega}_o$).
Step 4: If $\tilde{P}_{gi} \le \hat{t}_M$, reject $H^g_{0i}$; otherwise, accept $H^g_{0i}$.

Oracle (data-driven) Pro2 procedure:
Steps 1–3: Same as those of the Pro1 procedure.
Step 4: Order the $\tilde{P}_{gi}$ as $\tilde{P}'_{(1)} \le \cdots \le \tilde{P}'_{(m)}$. Let $\tilde{j} = \max\{j : \tilde{P}'_{(j)} \le j\alpha/m\}$ if the maximum exists; otherwise, $\tilde{j} = 0$.
Step 5: If $\tilde{j} = 0$, accept all $H^g_{0i}$; otherwise, reject $H^g_{0(j)}$, $j \le \tilde{j}$, where $H^g_{0(j)}$ corresponds to $\tilde{P}'_{(j)}$.
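The steps above can be sketched for the data-driven case as follows. This is a minimal illustration under stated assumptions, not the authors' Matlab program: it handles exactly two groups, replaces the numerical maximization in Step 2 by a grid search along the constraint line (with the MBH and HZZ weight vectors added as candidates), takes the estimates `pi0_hat` as given instead of computing them (the demo simply plugs in the simulated proportions), and uses $m_g/m$ in place of $\pi_g$. It relies on the fact that, for an ecdf, $\sup\{t : \mathrm{ecdf}(t) \ge t/\alpha\}$ equals $\alpha k/m$, where $k$ is the BH step-up count of the weighted p-values. All function names and demo values are hypothetical.

```python
import random
from statistics import NormalDist


def bh_count(values, alpha):
    """Largest k with values_(k) <= alpha * k / m (0 if none)."""
    m = len(values)
    k_star = 0
    for k, v in enumerate(sorted(values), start=1):
        if v <= alpha * k / m:
            k_star = k
    return k_star


def weight_pvalues(groups, w):
    """Step 3: P_gi / w_g; a zero weight sends the whole group to +inf."""
    return [[p / wg if wg > 0 else float("inf") for p in ps]
            for ps, wg in zip(groups, w)]


def flatten(groups):
    return [p for ps in groups for p in ps]


def pro1_pro2(groups, pi0_hat, alpha, steps=400):
    """Data-driven Pro1 and Pro2 for two groups of p-values."""
    m = sum(len(ps) for ps in groups)
    c = [len(ps) / m * p0 for ps, p0 in zip(groups, pi0_hat)]  # (m_g/m) * pi0_hat_g
    pi_t0 = sum(c)
    w_bh = [1.0 / pi_t0, 1.0 / pi_t0]
    w_h = [(1.0 - p0) / (p0 * (1.0 - pi_t0)) for p0 in pi0_hat]
    # Step 1: t_hat_M = max(t_hat_BH, t_hat_H), each equal to alpha * k / m.
    t_m = max(alpha * bh_count(flatten(weight_pvalues(groups, w)), alpha) / m
              for w in (w_bh, w_h))

    def o_hat(w):
        # Empirical objective: proportion of weighted p-values <= t_m.
        return sum(p <= t_m for p in flatten(weight_pvalues(groups, w))) / m

    # Step 2: candidates on the constraint line c[0]*w1 + c[1]*w2 = 1.
    candidates = [w_bh, w_h]
    for k in range(steps + 1):
        w1 = k / steps / c[0]
        candidates.append([w1, max(0.0, (1.0 - c[0] * w1) / c[1])])
    w_o = max(candidates, key=o_hat)
    weighted = weight_pvalues(groups, w_o)                      # Step 3
    pro1 = [[p <= t_m for p in ps] for ps in weighted]          # Pro1, Step 4
    k2 = bh_count(flatten(weighted), alpha)                     # Pro2, Step 4
    cut = sorted(flatten(weighted))[k2 - 1] if k2 > 0 else -1.0
    pro2 = [[p <= cut for p in ps] for ps in weighted]          # Pro2, Step 5
    return w_o, t_m, pro1, pro2


# Hypothetical demo data: one-sided p-values from N(mu, 1) z-scores.
random.seed(1)
_std = NormalDist()


def simulate(m, frac_alt, mu):
    return [1.0 - _std.cdf(random.gauss(mu if i < frac_alt * m else 0.0, 1.0))
            for i in range(m)]


groups = [simulate(600, 0.2, 2.0), simulate(300, 0.1, 4.0)]
pi0_hat = [0.8, 0.9]          # stand-in for estimated null proportions
w_o, t_m, pro1, pro2 = pro1_pro2(groups, pi0_hat, alpha=0.1)
n_pro1 = sum(map(sum, pro1))
n_pro2 = sum(map(sum, pro2))
```

Because the MBH and HZZ weight vectors are among the candidates, Pro1 in this sketch rejects at least as many hypotheses as the BH count used to set the cut-off, and Pro2, being the BH step-up rule on the same weighted p-values, rejects at least as many as Pro1.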

It should be noted that the oracle (data-driven) Pro2 procedure is the $\omega_o$ ($\hat{\omega}_o$)-weighted BH procedure. The following theorem shows the asymptotic properties of the Pro2 procedure.

Theorem 2. The oracle Pro2 procedure can control the FDR asymptotically and is asymptotically more powerful than the oracle Pro1 procedure. Under the conditions of Theorem 1, the data-driven Pro2 procedure can control the FDR asymptotically and is equivalent to the oracle Pro2 procedure in the sense that both achieve the same power asymptotically.

2.6. Results when estimates of $\pi_{g0}$ are biased

In the previous subsection we assumed that consistent estimates of $\pi_{g0}$ can be obtained. However, this is not always true in practice. Various estimates of $\pi_0$ have been proposed in the literature. Genovese and Wasserman (2004) and Meinshausen and Rice (2006) provided consistent estimates of $\pi_0$ when the $F_{g1}$ are pure (see the definition in Genovese and Wasserman, 2004). Jin and Cai (2007) estimated $\pi_0$ based on the empirical characteristic function and Fourier analysis when $F_{g0}$ and $F_{g1}$ follow normal distributions. Their procedure also gives uniform estimates of $F_{g0}$ and $F_{g1}$. However, in practice, the conditions required in Genovese and Wasserman (2004) and Jin and Cai (2007) may not be met. Therefore, we discuss the case in which only inconsistent estimates of $\pi_{g0}$ can be obtained, such as the least-slope (LSL) estimate (Benjamini and Hochberg, 2000), the Two-Stage (TST) method estimate (Benjamini et al., 2006) and the Storey method estimate based on the tail proportion of p-values (Storey et al., 2004).

Let the inconsistent estimates $\hat{\pi}_{g0}$ converge to $\xi_{g0}$ in probability as $m \to \infty$, with $\xi_0 < 1$, where $\xi_0 = \sum_g \pi_g \xi_{g0}$. Let $\omega_{H\xi} = ((1 - \xi_{10})/(\xi_{10}\xi_1), \ldots, (1 - \xi_{G0})/(\xi_{G0}\xi_1))$ and $\omega_{BH\xi} = (1/\xi_0, \ldots, 1/\xi_0)$, where $\xi_1 = 1 - \xi_0$. Define $t^*_{BH} = \sup\{t : \alpha O(\omega_{BH\xi}, t) \ge t\}$, $t^*_H = \sup\{t : \alpha O(\omega_{H\xi}, t) \ge t\}$, and $t^*_M = \max(t^*_{BH}, t^*_H)$. To introduce Lemma 2 and Theorem 3, we put a constraint condition on the weights:

C.(1) $\; w_g \ge 0,\ g = 1, \ldots, G,\ $ and $\ \sum_{g=1}^{G} \pi_g \xi_{g0} w_g = 1,$

and list the following assumptions:

A.(1) $B(\omega_{BH\xi}, t)$ has a nonzero derivative at $t^*_{BH}$, and $\lim_{t \to 0^+} B(\omega_{BH\xi}, t)$ is less than $\alpha$.
A.(2) $B(\omega_{H\xi}, t)$ has a nonzero derivative at $t^*_H$, and $\lim_{t \to 0^+} B(\omega_{H\xi}, t)$ is less than $\alpha$.
A.(3) $\omega_{o\xi}$ is the unique maximum point of $O(\omega, t_M)$ under the constraint condition C.(1).
A.(4) $\xi_{g0} > b_g \pi_{g0}$ for some $b_g \ge 1$ in every group.

We restate Theorem 4 in Hu et al. (2010) here for convenience.

Lemma 2. Under the assumptions A.(2)–A.(4), we have $\hat{t}_H \to t^*_H$ in probability and $\mathrm{FDR}(\hat{t}_H) \le \alpha + o(1)$.

Hu et al. (2010) showed that the TST and LSL estimates of $\pi_{g0}$ satisfy the conditions in Lemma 2.

Theorem 3. Suppose the assumptions A.(1)–A.(4) and A.2 hold. Then, with the inconsistent estimates of $\pi_{g0}$, the data-driven Pro1 and Pro2 procedures still control the FDR asymptotically at a level no greater than $\alpha + o(1)$.

This theorem can be shown easily by following the proof of Theorem 2, and the details are omitted here. Following the proofs of Lemma 3 and Theorem 2, we can show that the power function of the data-driven Pro1 procedure is asymptotically equal to that of WFC($\omega_{o\xi}; t^*_M$), which is equal to $O(\omega_{o\xi}, t^*_M) - \sum_{g=1}^{G} \pi_g \pi_{g0} w_{g,o\xi}\, t^*_M$, where $w_{g,o\xi}$ is the $g$th component of $\omega_{o\xi}$. When the FDR is controlled asymptotically, $\sum_{g=1}^{G} \pi_g \pi_{g0} w_{g,o\xi}\, t^*_M$ is small compared to $O(\omega_{o\xi}, t^*_M)$. Thus, the power of WFC($\omega_{o\xi}; t^*_M$) is approximately equal to $O(\omega_{o\xi}, t^*_M)$. That is, the power of WFC($\omega_{o\xi}; t^*_M$) is approximately no less than those of WFC($\omega_{BH\xi}; t^*_{BH}$) and WFC($\omega_{H\xi}; t^*_H$), which are the asymptotic powers of the MBH and HZZ procedures, respectively. It should be noted that the data-driven Pro2 procedure is still more powerful than the data-driven Pro1 procedure.

3. Simulation

In this section, we conduct extensive simulation studies to evaluate the performance of the proposed procedures, including both the oracle and the data-driven procedures. For comparison, we also apply the MBH and HZZ procedures to the simulated data sets. All programs are written in Matlab.

3.1. Settings

The simulation design is similar to that in Cai and Sun (2009). For group $g$, the null and non-null distributions are $N(\mu_{g0}, \sigma^2_{g0})$ and $N(\mu_{g1}, \sigma^2_{g1})$, respectively. The null distributions of all groups are the standard normal distribution $N(0, 1)$. The nominal significance level $\alpha$ is set to 0.1. Three situations are considered with respect to the different group sizes and hypothesis distributions.

3.1.1. Two groups (G = 2)

Case (1): The group sizes are $m_1 = 3000$ and $m_2 = 1500$. $\pi_{21} = 0.1$, $(\mu_{11}, \sigma_{11}) = (2, 1)$ and $(\mu_{21}, \sigma_{21}) = (4, 1)$. $\pi_{11}$ varies over (0, 0.3).
Case (2): The group sizes are the same as those in Case (1), $(\pi_{11}, \pi_{21}) = (0.2, 0.1)$, $(\mu_{11}, \sigma_{11}) = (\mu_1, 1)$ and $(\mu_{21}, \sigma_{21}) = (2, 0.5)$. $\mu_1$ varies over (2.5, 5).
Case (3): $(\pi_{11}, \pi_{21}) = (0.2, 0.1)$, $(\mu_{11}, \sigma_{11}) = (2, 1)$ and $(\mu_{21}, \sigma_{21}) = (4, 1)$. The size of group 2 is fixed at $m_2 = 1500$, and $m_1$ varies over (500, 5000).
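For readers who want to reproduce a setting of this kind, the following sketch generates one replication of the two-group Case (1). It is not the authors' Matlab code, and several details are assumptions made here because the text does not fix them: each hypothesis is non-null independently with probability $\pi_{g1}$, and p-values are one-sided, $1 - \Phi(z)$. The function names and the seed are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_group(m, pi1, mu, sigma, rng):
    """Return (p_values, is_non_null) for one group with N(0, 1) nulls."""
    std = NormalDist()
    p_values, is_non_null = [], []
    for _ in range(m):
        non_null = rng.random() < pi1            # assumed Bernoulli mixture
        z = rng.gauss(mu, sigma) if non_null else rng.gauss(0.0, 1.0)
        p_values.append(1.0 - std.cdf(z))        # assumed one-sided p-value
        is_non_null.append(non_null)
    return p_values, is_non_null


def two_group_case1(pi11, seed=0):
    """Two groups, Case (1): m1 = 3000, m2 = 1500, pi21 = 0.1."""
    rng = random.Random(seed)
    group1 = simulate_group(3000, pi11, 2.0, 1.0, rng)
    group2 = simulate_group(1500, 0.1, 4.0, 1.0, rng)
    return group1, group2


(p1, truth1), (p2, truth2) = two_group_case1(pi11=0.2)
```

Keeping the truth indicators alongside the p-values makes it straightforward to compute the realized false discovery proportion and power of any procedure applied to the replication.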

3.1.2. Three groups (G = 3)

Case (1): The group sizes are $m_1 = 5000$, $m_2 = 3000$ and $m_3 = 1500$. $(\pi_{21}, \pi_{31}) = (0.1, 0.2)$, $(\mu_{11}, \sigma_{11}) = (2, 1)$, $(\mu_{21}, \sigma_{21}) = (4, 1)$ and $(\mu_{31}, \sigma_{31}) = (6, 1)$. $\pi_{11}$ varies over (0, 0.3).
Case (2): The group sizes are the same as those in Case (1). $(\pi_{11}, \pi_{21}, \pi_{31}) = (0.1, 0.2, 0.3)$, $(\mu_{11}, \sigma_{11}) = (2, 1)$, $(\mu_{21}, \sigma_{21}) = (\mu_2, 0.5)$ and $(\mu_{31}, \sigma_{31}) = (6, 0.75)$. $\mu_2$ varies over (2.5, 5).
Case (3): $(\pi_{11}, \pi_{21}, \pi_{31}) = (0.1, 0.2, 0.3)$, $(\mu_{11}, \sigma_{11}) = (2, 0.5)$, $(\mu_{21}, \sigma_{21}) = (4, 1)$ and $(\mu_{31}, \sigma_{31}) = (6, 0.75)$. The sizes of groups 2 and 3 are fixed at $m_2 = 3000$ and $m_3 = 1500$, and $m_1$ varies over (500, 5000).

[Figure 1: panels plotting FDR and Power against π11, u2 and m1; image not reproduced.]

Fig. 1. (a) Two groups: ◇, MBH procedure; ○, HZZ procedure; *, Pro2; +, Pro1. This figure shows the results for oracle procedures, which assume $\pi_{g0}$ are known. Case (1): top panel; case (2): middle panel; case (3): bottom panel. (b) Two groups: ◇, MBH procedure; ○, HZZ procedure; *, Pro2; +, Pro1. This figure shows the results for adaptive procedures, which assume $\pi_{g0}$ are unknown and estimated from the data. Here we estimate $\pi_{g0}$ with the LSL method. Case (1): top panel; case (2): middle panel; case (3): bottom panel.

3.1.3. Misspecified alternatives

To examine the robustness of the proposed procedures, we simulate their performance when the data are not grouped correctly.

Case (1): The data are generated in the same way as in Case (1) of the three-group setting, but the first and second groups are misspecified as one group when fitting the data.
Case (2): The data are generated in the same way as in Case (2) of the three-group setting, but the first and third groups are misspecified as one group when fitting the data.
Case (3): The data are generated in the same way as in Case (3) of the three-group setting, but the second and third groups are misspecified as one group when fitting the data.

[Figure 2: panels plotting FDR and Power against π11, u2 and m1; image not reproduced.]

Fig. 2. (a) Three groups: ◇, MBH procedure; ○, HZZ procedure; *, Pro2; +, Pro1. This figure shows the results for oracle procedures, which assume $\pi_{g0}$ and $F_{g1}$ are known. Case (1): top panel; case (2): middle panel; case (3): bottom panel. (b) Three groups: ◇, MBH procedure; ○, HZZ procedure; *, Pro2; +, Pro1. This figure shows the results for adaptive procedures, which assume $\pi_{g0}$ are unknown and estimated from the data. Here we estimate $\pi_{g0}$ with the LSL method. Case (1): top panel; case (2): middle panel; case (3): bottom panel.

3.2. Results and conclusions

The simulation results are obtained with 10 000 replications and presented in Figs. 1–3. Figs. 1a and 2a show the results for the oracle procedures, Figs. 1b and 2b those for the data-driven procedures, and Fig. 3 those for misspecified alternatives with data-driven procedures; for the data-driven procedures, $\pi_{g0}$ is estimated with the LSL method. The performances of the MBH, HZZ and proposed procedures under cases (1)/(2)/(3) of the settings are plotted in the top/middle/bottom panels of the figures.

[Figure 3: panels plotting FDR and Power against π11, u2 and m1; image not reproduced.]

Fig. 3. Three groups are misspecified as two groups: ◇, MBH procedure; ○, HZZ procedure; *, Pro2; +, Pro1. This figure shows the results for adaptive procedures, which assume $\pi_{g0}$ are unknown and estimated from data. Here we estimate $\pi_{g0}$ with the LSL method. Case (1): top panel; case (2): middle panel; case (3): bottom panel.

The simulation results confirm that the four procedures can control the FDR at level $\alpha = 0.1$ asymptotically. All cases show that Pro2 performs best among the four procedures. Pro1 is uniformly better than the HZZ and MBH procedures. Fig. 3 shows that the proposed procedures are robust when the data are grouped incorrectly. From Figs. 1 and 2, we can see that the HZZ procedure is not uniformly better than the MBH procedure under the current simulation settings, where the $\pi_{g0}$ are not sharply different. However, the HZZ procedure is free of the non-null distributions, easy to carry out, and gains power over the MBH procedure when the $\pi_{g0}$, $g = 1, \ldots, G$, differ greatly.

Table 1
Group characteristics in the AYP data.

Group   Group size   Proportion (%)   Empirical null distribution
S-G     516          6.6              N(1.02, 1.23^2)
M-G     6514         80.6             N(2.19, 1.74^2)
L-G     837          12.8             N(3.88, 3.70^2)

Table 2
Numbers of interesting schools identified, by significance level α.

Group   Method   0.01   0.025   0.04   0.055   0.07   0.085   0.10   0.115
S-G     MBH      24     32      38     40      44     52      55     61
S-G     HZZ      26     32      38     43      48     50      60     62
S-G     Pro1     26     33      40     43      48     55      59     61
S-G     Pro2     26     33      40     43      48     55      59     61
M-G     MBH      177    238     291    339     388    409     444    482
M-G     HZZ      188    250     294    336     390    424     448    474
M-G     Pro1     188    252     299    342     395    423     454    483
M-G     Pro2     188    252     300    342     395    424     454    487
L-G     MBH      30     42      52     57      64     69      77     86
L-G     HZZ      34     42      55     60      66     71      75     82
L-G     Pro1     34     43      55     58      67     73      77     86
L-G     Pro2     34     43      55     58      67     73      77     86
Total   MBH      231    312     381    436     496    530     576    629
Total   HZZ      248    324     387    439     507    545     583    618
Total   Pro1     248    328     394    443     510    551     590    630
Total   Pro2     248    328     395    443     510    552     590    634

4. An application

We apply the proposed methods to the adequate yearly progress (AYP) study of California high schools (year 2007), which includes observations from 7867 California high schools. One of the main interests of this data set for sociological research (Sparkes, 1999; Considine and Zappalà, 2002; Cai and Sun, 2009) is the association between socioeconomic status (SES) and the academic performance of students, which can be assessed by comparing the success rates in Math exams of socioeconomically advantaged (SEA) and socioeconomically disadvantaged (SED) students. For school $i$, denote the success rate by $X_i$ for SEA students and by $Y_i$ for SED students, and the numbers of scores reported for SEA and SED students by $n_{xi}$ and $n_{yi}$, respectively, $i = 1, \ldots, m$, where $m = 7867$. Define the centering constant $\delta = \mathrm{median}(X_1, \ldots, X_m) - \mathrm{median}(Y_1, \ldots, Y_m)$. A summary statistic used to construct p-values for comparing SEA students with SED students is

\[
Z_i = \frac{X_i - Y_i - \delta}{\sqrt{X_i(1 - X_i)/n_{xi} + Y_i(1 - Y_i)/n_{yi}}}, \qquad i = 1, \ldots, m.
\]
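The statistic above, together with the size-based grouping used in this section (small: $n_{xi} + n_{yi} \le 120$; large: $n_{xi} + n_{yi} \ge 900$; medium otherwise), can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `z_scores`, `size_group` and `two_sided_p` are hypothetical, the toy numbers are not AYP data, and the two-sided p-value against a normal null $N(\mu_{g0}, \sigma_{g0}^2)$ is one plausible reading of how p-values are formed from the empirical null.

```python
from math import sqrt
from statistics import NormalDist, median


def z_scores(x, y, nx, ny):
    """Z_i = (X_i - Y_i - delta) / sqrt(X_i(1-X_i)/nx_i + Y_i(1-Y_i)/ny_i)."""
    delta = median(x) - median(y)  # centering constant from the text
    out = []
    for xi, yi, nxi, nyi in zip(x, y, nx, ny):
        # Standard error; it is zero if both rates are exactly 0 or 1,
        # a case this sketch does not handle.
        se = sqrt(xi * (1 - xi) / nxi + yi * (1 - yi) / nyi)
        out.append((xi - yi - delta) / se)
    return out


def size_group(nxi, nyi):
    """Size-based group label: S-G (<= 120), L-G (>= 900), M-G otherwise."""
    total = nxi + nyi
    if total <= 120:
        return "S-G"
    if total >= 900:
        return "L-G"
    return "M-G"


def two_sided_p(z, null_mean=0.0, null_sd=1.0):
    """Two-sided p-value of z under an assumed N(null_mean, null_sd^2) null."""
    u = abs(z - null_mean) / null_sd
    return 2.0 * (1.0 - NormalDist().cdf(u))


# Toy data (hypothetical, not taken from the AYP study).
x = [0.60, 0.55, 0.70]
y = [0.40, 0.45, 0.35]
nx = [50, 400, 700]
ny = [40, 300, 600]

z = z_scores(x, y, nx, ny)
groups = [size_group(a, b) for a, b in zip(nx, ny)]
```

In an actual analysis the null mean and standard deviation passed to `two_sided_p` would be the group-specific empirical null parameters, estimated separately for each group.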

A school $i$ with a large value of $|Z_i|$ is called “interesting”. The AYP data of year 2003 have been analyzed by Efron (2007, 2008) and Cai and Sun (2009), who showed that, if the data are not grouped according to school size (pooled analysis), the results will be misleading and “may distort inferences made for separate groups because highly significant cases from one group may be hidden among the nulls from another group, while insignificant cases may be possibly enhanced” (Efron, 2004; Cai and Sun, 2009). Specifically, the pooled analysis tends to pick out too many large schools and too few small schools. Thus, Cai and Sun (2009) divided their data into three groups, and we group the 2007 AYP data in the same way. The small group (denoted as S-G in the tables) consists of schools with $n_{xi} + n_{yi} \le 120$, and the large group (denoted as L-G) consists of schools with $n_{xi} + n_{yi} \ge 900$. The schools with sizes in (120, 900) form the medium group (denoted as M-G). For each group, the null distribution is assumed to be normal. Note that the means may be nonzero and the variances may be unequal among the groups. The null proportions $\pi_{g0}$ are estimated by the method in Jin and Cai (2007). The hypotheses are

\[
H_{g0i}: E(Z_{gi}) = \mu_{g0} \quad \text{vs} \quad H_{g1i}: E(Z_{gi}) \ne \mu_{g0},
\]

where $\mu_{g0}$ are the means of the empirical null distributions in Table 1. For different significance levels $\alpha$, the results from the MBH, HZZ, and data-driven Pro1 and Pro2 procedures are presented in Table 2. All four methods find more “interesting” schools in the medium group than in the small and large groups. Fewer “interesting” schools are found as $\alpha$ decreases. Among all four methods, the data-driven Pro2 procedure finds the most “interesting” schools in total, followed by the data-driven Pro1 procedure. That is consistent with our analysis and simulation results.

5. Conclusions

We developed two weighted p-value procedures, the Pro1 and Pro2 procedures, for testing grouped hypotheses simultaneously. The weights were obtained by maximizing an objective function, which is equivalent to maximizing the power function of interest. We derived the asymptotic properties of the Pro1 procedure and proved its power advantage over the existing methods theoretically. We also showed through simulation studies that both the Pro1 and Pro2 procedures outperform the existing weighted p-value methods, and that the Pro2 procedure performs best among all four procedures considered.

The weighted p-value approach suggests several future research directions. First, the assumption of independence across hypotheses could be dropped, and a weighted p-value testing procedure for the dependent case may be developed. Second, how to obtain the minimal sample size needed to achieve a certain power in multiple hypothesis testing is also of interest. Third, many practitioners and statisticians are interested in controlling the tail probability of the false discovery proportion (FDP) instead of the FDR. Similar weighted p-value procedures should be investigated under FDP control. It is worth pointing out that Sun and McLain (2012) discussed the effect size problem, which asks “How big is A-B” (Sun and McLain, 2012). In this paper, we discussed the statistical significance problem, which asks “How certain can we tell the sign of A-B” (Sun and McLain, 2012); the effect size problem is potential future work.

Acknowledgment

The authors thank the reviewers and associate editor for insightful comments, which have led to a great improvement in the presentation of the work. The authors also thank Professor Wenguang Sun for sharing his data and code. The research of Haibing Zhao was supported by a grant from the National Natural Science Foundation of China (NSFC) [No. 11101255] and the Program for Changjiang Scholars and Innovative Research Team in University.

Appendix A

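Proof of Lemma 1. The FDR of WFC($\omega$; $t$) is calculated as

\[
\mathrm{FDR}(\omega, t)
= E\left\{ \frac{\sum_{g=1}^{G} \sum_{i \in H_{g0}} I(P_{gi} \le w_g t)}{\left(\sum_{g=1}^{G} \sum_{i=1}^{m_g} I(P_{gi} \le w_g t)\right) \vee 1} \right\}
= \frac{\sum_{g=1}^{G} \gamma_g \gamma_{g0} F_{g0}(w_g t)}{\widetilde{O}(\omega, t)} + O\!\left(\frac{1}{\sqrt{m}}\right)
= \frac{t}{\widetilde{O}(\omega, t)} + O\!\left(\frac{1}{\sqrt{m}}\right),
\]

where $H_{g0} = \{i : H_{g0i} \text{ is true}\}$. Without loss of generality, it is supposed that $\tilde{t}_{BH} > \tilde{t}_H$, implying $\tilde{t}_M = \tilde{t}_{BH}$. Note that $t + \sum_{g=1}^{G} \gamma_g \gamma_{g1} F_{g1}(t/\gamma_{T0})$ in the definition of $\tilde{t}_{BH}$ is $\widetilde{O}(\tilde{\omega}_{BH}, t)$. By Genovese and Wasserman (2002),

\[
\mathrm{FDR}(\tilde{\omega}_{BH}, \tilde{t}_{BH})
= \frac{\tilde{t}_{BH}}{\widetilde{O}(\tilde{\omega}_{BH}, \tilde{t}_{BH})} + O\!\left(\frac{1}{\sqrt{m}}\right)
\le \alpha + O\!\left(\frac{1}{\sqrt{m}}\right)
\]

according to the definition of $\tilde{t}_{BH}$. Therefore,

\[
\mathrm{FDR}(\tilde{\omega}_o, \tilde{t}_M)
= \frac{\tilde{t}_M}{\widetilde{O}(\tilde{\omega}_o, \tilde{t}_M)} + O\!\left(\frac{1}{\sqrt{m}}\right)
= \frac{\tilde{t}_{BH}}{\widetilde{O}(\tilde{\omega}_o, \tilde{t}_{BH})} + O\!\left(\frac{1}{\sqrt{m}}\right)
\le \frac{\tilde{t}_{BH}}{\widetilde{O}(\tilde{\omega}_{BH}, \tilde{t}_{BH})} + O\!\left(\frac{1}{\sqrt{m}}\right)
\le \alpha + O\!\left(\frac{1}{\sqrt{m}}\right).
\]

The second conclusion is obvious by noting that maximizing $\widetilde{O}(\omega, t)$ is equivalent to maximizing $\widetilde{\mathrm{Pow}}(\omega, t)$ and $\tilde{t}_M = \max(\tilde{t}_{BH}, \tilde{t}_H)$. □

Lemma 3. Suppose the conditions in Theorem 1 hold. Then

(1) $\sup_{\omega} |\hat{O}(\omega, \hat{t}_M) - O(\omega, t_M)| \xrightarrow{p} 0$ as $m \to +\infty$, where $\omega$ satisfies the condition C.1.
(2) $\hat{O}(\hat{\omega}_o, \hat{t}_M) - O(\hat{\omega}_o, t_M) \xrightarrow{p} 0$ as $m \to +\infty$.
(3) $\hat{O}(\hat{\omega}_o, \hat{t}_M) \xrightarrow{p} O(\omega_o, t_M)$ as $m \to +\infty$.
(4) $\hat{\omega}_o \xrightarrow{p} \omega_o$.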

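Proof. Part (1): $\hat{t}_{BH} \xrightarrow{p} t_{BH}$ and $\hat{t}_H \xrightarrow{p} t_H$ have been shown in Hu et al. (2010). Then $\hat{t}_M = \max\{\hat{t}_{BH}, \hat{t}_H\} \xrightarrow{p} t_M = \max\{t_{BH}, t_H\}$. For convenience, we denote $\sum_{i \in H_{g1}} I(P_{gi} < u)/n_{g1}$ by $\hat{F}_{g1}(u)$ and $\sum_{i \in H_{g0}} I(P_{gi} < u)/n_{g0}$ by $\hat{F}_{g0}(u)$, where $H_{g1} = \{i : H_{g0i} \text{ is false}\}$.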


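Then $\|\hat{F}_{gj} - F_{gj}\|_{\infty} = \sup_{u \in \mathbb{R}} |\hat{F}_{gj}(u) - F_{gj}(u)| \xrightarrow{a.s.} 0$, $j = 0, 1$. Note that

\[
\sup_{\omega} |\hat{O}(\omega, \hat{t}_M) - O(\omega, t_M)|
\le \sum_{j=0,1} \sum_{g=1}^{G} \sup_{\omega} \pi_g \pi_{gj} |\hat{F}_{gj}(w_g \hat{t}_M) - F_{gj}(w_g t_M)| + o_p(1).
\]

Since $\omega$ satisfies the condition C.1, $w_1, \ldots, w_G$ are bounded by $\Lambda = \max\{1/(\pi_1 \pi_{10}), \ldots, 1/(\pi_G \pi_{G0})\}$. We have $\sup_{\omega} |\hat{F}_{g1}(w_g \hat{t}_M) - F_{g1}(w_g t_M)| \xrightarrow{p} 0$, by noting

\[
\begin{aligned}
\sup_{\omega} |\hat{F}_{g1}(w_g \hat{t}_M) - F_{g1}(w_g t_M)|
&\le \sup_{\omega} |\hat{F}_{g1}(w_g \hat{t}_M) - F_{g1}(w_g \hat{t}_M)| + \sup_{\omega} |F_{g1}(w_g \hat{t}_M) - F_{g1}(w_g t_M)| \\
&\le o_p(1) + \Lambda^{\beta_g} |\hat{t}_M - t_M|^{\beta_g} = o_p(1).
\end{aligned}
\]

Similarly, $\sup_{\omega} |\hat{F}_{g0}(w_g \hat{t}_M) - F_{g0}(w_g t_M)| \xrightarrow{p} 0$. Then $\sup_{\omega} |\hat{O}(\omega, \hat{t}_M) - O(\omega, t_M)| \xrightarrow{p} 0$.

Part (2): Since $\sum_{g=1}^{G} \pi_g \hat{\pi}_{g0} \hat{w}_{go} = 1$ and $\hat{\pi}_{g0} = \pi_{g0} + o_p(1)$, we have $1/\hat{\pi}_{g0} = 1/\pi_{g0} + o_p(1)$ and $\hat{w}_{go} \le \Lambda + o_p(1) = O_p(1)$. Following the line of part (1), there is

\[
\begin{aligned}
|\hat{O}(\hat{\omega}_o, \hat{t}_M) - O(\hat{\omega}_o, t_M)|
&\le \sum_{j=0,1} \sum_{g=1}^{G} \pi_g \pi_{gj} |\hat{F}_{gj}(\hat{w}_{go} \hat{t}_M) - F_{gj}(\hat{w}_{go} t_M)| + o_p(1) \\
&\le \sum_{j=0,1} \sum_{g=1}^{G} |\hat{F}_{gj}(\hat{w}_{go} \hat{t}_M) - F_{gj}(\hat{w}_{go} \hat{t}_M)|
 + \sum_{j=0,1} \sum_{g=1}^{G} |F_{gj}(\hat{w}_{go} \hat{t}_M) - F_{gj}(\hat{w}_{go} t_M)| + o_p(1) \\
&\le o_p(1) + O_p(1) \cdot \sum_{j=0,1} \sum_{g=1}^{G} |\hat{t}_M - t_M|^{\beta_g} = o_p(1).
\end{aligned}
\]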

The proof of part (2) is completed. The technique used in the proof of part (3) is similar to that in Cai and Sun (2009).

Part (3): By part (2), we just need to prove that $O(\hat{\omega}_o, t_M) \xrightarrow{p} O(\omega_o, t_M)$. Let $\delta = (1/\pi_{T0})\left(\sum_{g=1}^{G} \pi_g \pi_{g0} \hat{w}_{go} - 1\right)$. Define $\hat{w}^*_{go} = \hat{w}_{go} - \delta$ and $\hat{\omega}^*_o = (\hat{w}^*_{1o}, \ldots, \hat{w}^*_{Go})$. Then $\sum_{g=1}^{G} \pi_g \pi_{g0} \hat{w}^*_{go} = 1$. Since $\hat{\pi}_{g0} = \pi_{g0} + o_p(1)$, $\pi_{g0} > 0$ and $\hat{\pi}_{g0} \hat{w}_{go} \le 1$, we know that $\hat{w}_{go} = O_p(1)$. Then

\[
\left| \sum_{g=1}^{G} \pi_g \pi_{g0} \hat{w}_{go} - 1 \right|
\le \sum_{g=1}^{G} \pi_g |\hat{\pi}_{g0} - \pi_{g0}| \hat{w}_{go} = o_p(1),
\]

i.e. $\delta = o_p(1)$. Then $O(\hat{\omega}^*_o, t_M) - O(\hat{\omega}_o, t_M) \xrightarrow{p} 0$ because $F_{g1}$ satisfies the uniform Lipschitz condition. If we can show $O(\hat{\omega}^*_o, t_M) \xrightarrow{p} O(\omega_o, t_M)$, then we have $O(\hat{\omega}_o, t_M) \xrightarrow{p} O(\omega_o, t_M)$. If it is not true, then there exists an event $\omega_{m'}$ with sufficiently large $m'$ and $\varepsilon_0$ such that

\[
\Pr(\{\omega_{m'} : |O(\hat{\omega}^*_o, t_M) - O(\omega_o, t_M)| \ge \varepsilon_0\}) > 2\eta_0,
\]

where $\eta_0$ is a positive number. According to the definition of $\omega_o$, there is

\[
\Pr(\{\omega_{m'} : O(\hat{\omega}^*_o, t_M) - O(\omega_o, t_M) \le -\varepsilon_0\}) > 2\eta_0.
\]

By part (1) and part (2), there are $\hat{O}(\omega_o, \hat{t}_M) \xrightarrow{p} O(\omega_o, t_M)$ and $\hat{O}(\hat{\omega}^*_o, \hat{t}_M) - O(\hat{\omega}^*_o, t_M) \xrightarrow{p} 0$, as $m \to +\infty$. Then, with sufficiently large $m'' \in \{m'\}$, there is

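\[
\begin{aligned}
&\Pr(\{\omega'_{m''} : |\hat{O}(\omega_o, \hat{t}_M) - O(\omega_o, t_M)| \le \varepsilon_0/4\}) \ge 1 - \eta_0/2, \\
&\Pr(\{\omega''_{m''} : |\hat{O}(\hat{\omega}^*_o, \hat{t}_M) - O(\hat{\omega}^*_o, t_M)| \le \varepsilon_0/4\}) \ge 1 - \eta_0/2.
\end{aligned}
\]

Then we have

\[
\begin{aligned}
&\Pr\big(|\hat{O}(\omega_o, \hat{t}_M) - O(\omega_o, t_M)| \le \varepsilon_0/4,\ |\hat{O}(\hat{\omega}^*_o, \hat{t}_M) - O(\hat{\omega}^*_o, t_M)| \le \varepsilon_0/4\big) \\
&\quad = \Pr\big(|\hat{O}(\omega_o, \hat{t}_M) - O(\omega_o, t_M)| \le \varepsilon_0/4\big) + \Pr\big(|\hat{O}(\hat{\omega}^*_o, \hat{t}_M) - O(\hat{\omega}^*_o, t_M)| \le \varepsilon_0/4\big) \\
&\qquad - \Pr\big(|\hat{O}(\omega_o, \hat{t}_M) - O(\omega_o, t_M)| \le \varepsilon_0/4 \ \text{or}\ |\hat{O}(\hat{\omega}^*_o, \hat{t}_M) - O(\hat{\omega}^*_o, t_M)| \le \varepsilon_0/4\big) \\
&\quad \ge 2 - \eta_0 - 1 = 1 - \eta_0.
\end{aligned}
\]

Therefore

\[
\begin{aligned}
&\Pr\big(O(\hat{\omega}^*_o, t_M) - O(\omega_o, t_M) \le -\varepsilon_0,\ |\hat{O}(\omega_o, \hat{t}_M) - O(\omega_o, t_M)| \le \varepsilon_0/4,\ |\hat{O}(\hat{\omega}^*_o, \hat{t}_M) - O(\hat{\omega}^*_o, t_M)| \le \varepsilon_0/4\big) \\
&\quad \ge 2\eta_0 + 1 - \eta_0 \\
&\qquad - \Pr\big(O(\hat{\omega}^*_o, t_M) - O(\omega_o, t_M) \le -\varepsilon_0 \ \text{or}\ \{|\hat{O}(\omega_o, \hat{t}_M) - O(\omega_o, t_M)| \le \varepsilon_0/4,\ |\hat{O}(\hat{\omega}^*_o, \hat{t}_M) - O(\hat{\omega}^*_o, t_M)| \le \varepsilon_0/4\}\big) \\
&\quad \ge \eta_0.
\end{aligned}
\]

This means that, with probability greater than $\eta_0$ for sufficiently large $m''$, there is

\[
\{|\hat{O}(\omega_o, \hat{t}_M) - O(\omega_o, t_M)| \le \varepsilon_0/4,\ |\hat{O}(\hat{\omega}^*_o, \hat{t}_M) - O(\hat{\omega}^*_o, t_M)| \le \varepsilon_0/4\}
\]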


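and

\[
\{O(\hat{\omega}^*_o, t_M) - O(\omega_o, t_M) \le -\varepsilon_0\}.
\]

Then, for sufficiently large $m''$, with probability greater than $\eta_0$ we have

\[
\begin{aligned}
\hat{O}(\hat{\omega}^*_o, \hat{t}_M)
&\le O(\hat{\omega}^*_o, t_M) + \varepsilon_0/4 \le O(\omega_o, t_M) - \varepsilon_0 + \varepsilon_0/4 \\
&\le \hat{O}(\omega_o, \hat{t}_M) + \varepsilon_0/4 - \varepsilon_0 + \varepsilon_0/4 = \hat{O}(\omega_o, \hat{t}_M) - \varepsilon_0/2.
\end{aligned}
\qquad (2)
\]

Let $\delta' = \left(1/\sum_{g=1}^{G} \pi_g \hat{\pi}_{g0}\right)\left(\sum_{g=1}^{G} \pi_g \hat{\pi}_{g0} w_{go} - 1\right)$. Define $w^*_{go} = w_{go} - \delta'$ and $\omega^*_o = (w^*_{1o}, \ldots, w^*_{Go})$. Then $\sum_{g=1}^{G} \pi_g \hat{\pi}_{g0} w^*_{go} = 1$. Since $\hat{\pi}_{g0} = \pi_{g0} + o_p(1)$ and $\pi_{g0} > 0$, we know that $1/\sum_{g=1}^{G} \pi_g \hat{\pi}_{g0} = O_p(1)$. Then

\[
|\delta'| = O_p(1) \left| \sum_{g=1}^{G} \pi_g \hat{\pi}_{g0} w_{go} - 1 \right|
\le O_p(1) \sum_{g=1}^{G} \pi_g |\hat{\pi}_{g0} - \pi_{g0}| w_{go} = o_p(1),
\]

i.e. $\delta' = o_p(1)$. Then $\hat{O}(\omega^*_o, \hat{t}_M) - \hat{O}(\omega_o, \hat{t}_M) \xrightarrow{p} 0$ by $\hat{O}(\omega^*_o, \hat{t}_M) \xrightarrow{p} O(\omega^*_o, t_M)$, $\hat{O}(\omega_o, \hat{t}_M) \xrightarrow{p} O(\omega_o, t_M)$ and $F_{g1}$ satisfying the uniform Lipschitz condition. Moreover, we have $\hat{O}(\hat{\omega}^*_o, \hat{t}_M) - \hat{O}(\hat{\omega}_o, \hat{t}_M) \xrightarrow{p} 0$ by $\hat{O}(\hat{\omega}^*_o, \hat{t}_M) - O(\hat{\omega}^*_o, t_M) \xrightarrow{p} 0$, $\hat{O}(\hat{\omega}_o, \hat{t}_M) - O(\hat{\omega}_o, t_M) \xrightarrow{p} 0$ and $O(\hat{\omega}^*_o, t_M) - O(\hat{\omega}_o, t_M) \xrightarrow{p} 0$.

Based on $\hat{O}(\omega^*_o, \hat{t}_M) - \hat{O}(\omega_o, \hat{t}_M) \xrightarrow{p} 0$ and $\hat{O}(\hat{\omega}^*_o, \hat{t}_M) - \hat{O}(\hat{\omega}_o, \hat{t}_M) \xrightarrow{p} 0$, with sufficiently large $m$, there is $\Pr(B(\varepsilon_0/8)) \ge 1 - \eta_0/2$, where

\[
B(\varepsilon_0/8) = \{|\hat{O}(\omega^*_o, \hat{t}_M) - \hat{O}(\omega_o, \hat{t}_M)| \le \varepsilon_0/8,\ |\hat{O}(\hat{\omega}^*_o, \hat{t}_M) - \hat{O}(\hat{\omega}_o, \hat{t}_M)| \le \varepsilon_0/8\}.
\]

Then, for sufficiently large $m''$, we have

\[
\begin{aligned}
&\Pr\big(\hat{O}(\hat{\omega}^*_o, \hat{t}_M) \le \hat{O}(\omega_o, \hat{t}_M) - \varepsilon_0/2,\ B(\varepsilon_0/8)\big) \\
&\quad = \Pr\big(\hat{O}(\hat{\omega}^*_o, \hat{t}_M) \le \hat{O}(\omega_o, \hat{t}_M) - \varepsilon_0/2\big) + \Pr(B(\varepsilon_0/8)) \\
&\qquad - \Pr\big(\hat{O}(\hat{\omega}^*_o, \hat{t}_M) \le \hat{O}(\omega_o, \hat{t}_M) - \varepsilon_0/2 \ \text{or}\ B(\varepsilon_0/8)\big) \\
&\quad \ge \eta_0 + 1 - \eta_0/2 - 1 = \eta_0/2,
\end{aligned}
\]

which implies that, with probability greater than $\eta_0/2$ for sufficiently large $m''$, there is $\hat{O}(\hat{\omega}_o, \hat{t}_M) \le \hat{O}(\omega^*_o, \hat{t}_M) - \varepsilon_0/4$. This contradicts the definition of $\hat{\omega}_o$. Combining $\hat{O}(\hat{\omega}_o, \hat{t}_M) - O(\hat{\omega}_o, t_M) \xrightarrow{p} 0$ and $O(\hat{\omega}_o, t_M) \xrightarrow{p} O(\omega_o, t_M)$, we conclude part (3).

Part (4): Let $S = \{\omega : \omega \ge 0;\ \omega \le 1/t_M;\ \sum_{g=1}^{G} \pi_g \pi_{g0} w_g = 1\}$, which is a bounded and complete set. We only need to consider $\omega$ taking values in $S$ for the function $O(\omega, t_M)$. For any $\xi > 0$, there must be an $\varepsilon > 0$ such that $|O(\omega, t_M) - O(\omega_o, t_M)| > \varepsilon$ for all $\omega$ satisfying $|\omega - \omega_o| > \xi$, where $|v| = \sqrt{v_1^2 + \cdots + v_G^2}$ and $v = (v_1, \ldots, v_G)$ is a vector. If $\hat{\omega}_o \nrightarrow \omega_o$ in probability, then $\hat{\omega}^*_o \nrightarrow \omega_o$ in probability, where $\hat{\omega}^*_o$ is the one defined in part (3). There exists an event $\omega_{m'}$ with sufficiently large $m'$ and $\xi_0$ such that

\[
\Pr(\{\omega_{m'} : |\hat{\omega}^*_o - \omega_o| \ge \xi_0\}) > 2\eta_0,
\]

where $\eta_0$ is a positive number. Then there is $\varepsilon_0$ such that

\[
\Pr(\{\omega_{m'} : O(\hat{\omega}^*_o, t_M) - O(\omega_o, t_M) \le -\varepsilon_0\}) > 2\eta_0,
\]

which contradicts $O(\hat{\omega}^*_o, t_M) \xrightarrow{p} O(\omega_o, t_M)$. □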



Proof of Theorem 1. Suppose that $\hat{\omega}_o = (\hat{w}_{1o}, \ldots, \hat{w}_{Go})$ and $\omega_o = (w_{1o}, \ldots, w_{Go})$. By Lemma 3, the FDR of the WFC($\hat{\omega}_o$; $\hat{t}_M$) procedure is calculated as

\[
\begin{aligned}
\mathrm{FDR}(\hat{\omega}_o, \hat{t}_M)
&= E\left\{ \frac{\sum_{g=1}^{G} \sum_{i \in H_{g0}} I(P_{gi} \le \hat{w}_{go} \hat{t}_M)}{\left(\sum_{g=1}^{G} \sum_{i=1}^{m_g} I(P_{gi} \le \hat{w}_{go} \hat{t}_M)\right) \vee 1} \right\} \\
&= E\left\{ \frac{\sum_{g=1}^{G} \sum_{i \in H_{g0}} I(P_{gi} \le w_{go} t_M) + o_p(1)}{\left(\sum_{g=1}^{G} \sum_{i=1}^{m_g} I(P_{gi} \le w_{go} t_M) + o_p(1)\right) \vee 1} \right\} \\
&= \mathrm{FDR}(\omega_o, t_M) + o(1).
\end{aligned}
\]

Therefore, according to Lemma 1, we get the first conclusion. Based on the conditions in this theorem and part (4) of Lemma 3, it is easy to see that the WFC($\hat{\omega}_o$; $\hat{t}_M$) procedure is equivalent to the WFC($\omega_o$; $t_M$) procedure since they achieve the same power asymptotically. □

Proof of Theorem 2. We first show the first conclusion. Let $\tilde{t}_c = \tilde{j}\alpha/m$ and

\[
t_c = \sup\left\{ t : t + \sum_{g=1}^{G} \pi_g \pi_{g1} F_{g1}(t w_{go}) \ge \frac{t}{\alpha} \right\}.
\]


Following the proof of Theorem 3 in Sun and Wei (2011), there is $\tilde{t}_c \xrightarrow{p} t_c$. The FDR of the oracle Pro2 procedure is calculated as

\[
\mathrm{FDR}(\mathrm{Pro2})
= E\left\{ \frac{\sum_{g=1}^{G} \sum_{i \in H_{g0}} I(P_{gi} \le w_{go} \tilde{t}_c)}{\left(\sum_{g=1}^{G} \sum_{i=1}^{m_g} I(P_{gi} \le w_{go} \tilde{t}_c)\right) \vee 1} \right\}
= E\left\{ \frac{\sum_{g=1}^{G} \pi_g \pi_{g0} F_{g0}(w_{go} t_c) + o_p(1)}{O(\omega_o, t_c) + o_p(1)} \right\}
= \frac{t_c}{O(\omega_o, t_c)} + o(1) \le \alpha + o(1).
\]

This means the oracle Pro2 procedure can control the FDR asymptotically. Without loss of generality, it is supposed that $t_{BH} > t_H$, implying $t_M = t_{BH}$. By the definitions of $\omega_o$, $t_c$ and $t_{BH}$, there is

\[
O(\omega_o, t_{BH}) \ge O(\omega_{BH}, t_{BH}) \ge t_{BH}/\alpha.
\]

By the definition of $t_c$, there is $t_c \ge t_{BH}$. Then there is

\[
O(\omega_o, t_{BH}) \le t_c/\alpha \le O(\omega_o, t_c),
\]

which implies the oracle Pro2 procedure is no less powerful than the oracle Pro1 procedure asymptotically.

By $\hat{\omega}_o \xrightarrow{p} \omega_o$ and $\sup_{\omega} |\hat{O}(\omega, t) - O(\omega, t)| \xrightarrow{p} 0$ for any $t \in (0, 1]$ as $m \to +\infty$, there is $\hat{O}(\hat{\omega}_o, t) \xrightarrow{p} O(\omega_o, t)$. Following the proof of Theorem 4 in Sun and Wei (2011), there is $\hat{\tilde{t}}_c \xrightarrow{p} t_c$, where $\hat{\tilde{t}}_c$ is the threshold of the data-driven Pro2 procedure. Then $\hat{O}(\hat{\omega}_o, \hat{\tilde{t}}_c) \xrightarrow{p} O(\omega_o, t_c)$, which implies the data-driven Pro2 procedure and the oracle Pro2 procedure achieve the same power asymptotically. The FDR of the data-driven Pro2 procedure is calculated as

\[
\mathrm{FDR}(\widehat{\mathrm{Pro2}})
= E\left\{ \frac{\sum_{g=1}^{G} \sum_{i \in H_{g0}} I(P_{gi} \le \hat{w}_{go} \hat{\tilde{t}}_c)}{\left(\sum_{g=1}^{G} \sum_{i=1}^{m_g} I(P_{gi} \le \hat{w}_{go} \hat{\tilde{t}}_c)\right) \vee 1} \right\}
= E\left\{ \frac{\sum_{g=1}^{G} \pi_g \pi_{g0} F_{g0}(w_{go} t_c) + o_p(1)}{O(\omega_o, t_c) + o_p(1)} \right\}
= \frac{t_c}{O(\omega_o, t_c)} + o(1) \le \alpha + o(1).
\]

This completes the proof of Theorem 2. □
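To make the oracle threshold $t_c = \sup\{t : t + \sum_g \pi_g \pi_{g1} F_{g1}(t w_{go}) \ge t/\alpha\}$ more concrete, the following Python sketch evaluates it on a grid. It is a rough numerical illustration and not the authors' implementation: it assumes one-sided normal p-values, for which $F_{g1}(x) = \Phi(\Phi^{-1}(x) + \mu_g)$, it takes the weights as given rather than optimizing them, and the names `F1`, `objective` and `oracle_threshold` as well as all parameter values are hypothetical.

```python
from statistics import NormalDist

_N = NormalDist()


def F1(x, mu):
    """CDF of a one-sided p-value when T ~ N(mu, 1): Phi(Phi^{-1}(x) + mu)."""
    x = min(max(x, 1e-12), 1.0 - 1e-12)  # keep inv_cdf strictly inside (0, 1)
    return _N.cdf(_N.inv_cdf(x) + mu)


def objective(w, t, pi, pi1, mu):
    """O(w, t) = t + sum_g pi_g * pi_g1 * F_g1(w_g * t)."""
    return t + sum(p * p1 * F1(wg * t, m) for wg, p, p1, m in zip(w, pi, pi1, mu))


def oracle_threshold(w, pi, pi1, mu, alpha, n_grid=20000):
    """Largest grid point t in (0, 1/max(w)] with O(w, t) >= t / alpha."""
    t_max = 1.0 / max(w)  # keeps every w_g * t inside [0, 1]
    best = 0.0
    for k in range(1, n_grid + 1):
        t = t_max * k / n_grid
        if objective(w, t, pi, pi1, mu) >= t / alpha:
            best = t
    return best


# Two equally sized groups, 20% non-nulls in each, common effect size 2.
pi = [0.5, 0.5]    # group proportions pi_g
pi1 = [0.2, 0.2]   # non-null proportions pi_g1 (so pi_g0 = 0.8)
mu = [2.0, 2.0]    # alternative means mu_g
w = [1.25, 1.25]   # weights with sum_g pi_g * pi_g0 * w_g = 1
alpha = 0.05

t_c = oracle_threshold(w, pi, pi1, mu, alpha)
fdr_limit = t_c / objective(w, t_c, pi, pi1, mu)  # asymptotic FDR t_c / O(w, t_c)
```

By construction the returned grid point satisfies $t_c / O(\omega, t_c) \le \alpha$, which mirrors the bound $t_c/O(\omega_o, t_c) \le \alpha$ used in the proof; a finer grid or a root finder would give a sharper value.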



Appendix B

In this section, we first show that the condition of $F_g(x)$ satisfying the uniform Lipschitz condition of order $\beta_g > 0$ can be met by the normal distributions. For simplicity, we drop $g$ and $i$ in the notation and consider the following one-sided hypotheses:

\[
H_0: \mu = 0 \quad \text{vs} \quad H_1: \mu > 0.
\]

The test statistic is $T \sim N(\mu, 1)$ and the null hypothesis is rejected for large values of $T$. The p-value is calculated as $P = 1 - \Phi(T)$, where $\Phi(\cdot)$ is the standard normal distribution function. The p-value's distribution is

\[
F(x) = \Pr(P \le x) = \bar{\Phi}(\bar{\Phi}^{-1}(x) - \mu)
\]

with density function

\[
f(x) = \exp(\mu \bar{\Phi}^{-1}(x) - \mu^2/2),
\]

where $\bar{\Phi}(x) = 1 - \Phi(x)$. Suppose $y = \bar{\Phi}^{-1}(x)$. Then

\[
\begin{aligned}
\lim_{x \to 0^+} \sqrt{x} \exp(\mu \bar{\Phi}^{-1}(x))
&= \lim_{y \to +\infty} \frac{\sqrt{\bar{\Phi}(y)}}{\exp(-\mu y)}
= \lim_{y \to +\infty} \frac{\phi(y)}{2 \sqrt{\bar{\Phi}(y)}\, \mu \exp(-\mu y)} \\
&\le \lim_{y \to +\infty} \frac{\phi(y)}{2 \sqrt{\bar{\Phi}(y) - \bar{\Phi}(\sqrt{3/2}\, y)}\, \mu \exp(-\mu y)}.
\end{aligned}
\]

By the Lagrange mean value theorem, there is $\xi \in (y, \sqrt{3/2}\, y)$ such that

\[
\bar{\Phi}(y) - \bar{\Phi}(\sqrt{3/2}\, y) = (\sqrt{3/2} - 1) \phi(\xi) y.
\]

We have

\[
\lim_{y \to +\infty} \frac{\phi(y)}{2 \sqrt{\bar{\Phi}(y) - \bar{\Phi}(\sqrt{3/2}\, y)}\, \mu \exp(-\mu y)}
= \lim_{y \to +\infty} C\, \frac{\exp\left\{-\frac{y^2}{2} + \frac{\xi^2}{4}\right\}}{y^{1/2} \exp(-\mu y)}
\le \lim_{y \to +\infty} C\, \frac{\exp\left\{-\frac{y^2}{8}\right\}}{y^{1/2} \exp(-\mu y)} = 0,
\]


which implies

\[
\lim_{x \to 0^+} \sqrt{x} \exp(\mu \bar{\Phi}^{-1}(x)) = 0,
\]

where $C > 0$ is a constant. By this, we know that $|\sqrt{x} f(x)|$ can be bounded on $[0, 1]$ by a constant $M > 0$. Then, for any $x, y \in [0, 1]$, there is $\eta \in (x, y)$ such that

\[
|F(x) - F(y)| = 2 |\sqrt{\eta} f(\eta)|\, |\sqrt{x} - \sqrt{y}| \le 2M \sqrt{|x - y|}.
\]

That means $F(x)$ satisfies a uniform Lipschitz condition of order $\beta = 1/2$. For p-values for two-sided hypotheses calculated from normal distributions, the CDF is $G(x) = 1 - [F(1 - x/2) - F(x/2)]$, $x \le 1$, where $F(\cdot)$ is defined above. Then

\[
|G(x) - G(y)| \le \left| F\!\left(1 - \frac{x}{2}\right) - F\!\left(1 - \frac{y}{2}\right) \right| + \left| F\!\left(\frac{x}{2}\right) - F\!\left(\frac{y}{2}\right) \right|
\le 2M \sqrt{\frac{|x - y|}{2}} + 2M \sqrt{\frac{|x - y|}{2}} = 2\sqrt{2}\, M \sqrt{|x - y|}.
\]

So $G(x)$ also satisfies a uniform Lipschitz condition of order 1/2.

Next, we show that the above p-values also satisfy the assumption A.1. We only demonstrate the case of one-sided p-values; the results for two-sided p-values can be obtained similarly. For any function $\Gamma(t)$, we denote the first (second) order derivative of $\Gamma(t)$ with respect to $t$ by $\Gamma'(t)$ ($\Gamma''(t)$). Consider the one-sided hypotheses:

\[
H_{g0i}: \mu_{gi} = 0 \quad \text{vs} \quad H_{g1i}: \mu_{gi} = \mu_g > 0.
\]
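The order-1/2 bounds for the one-sided CDF $F$ and the two-sided CDF $G$ derived earlier in this appendix can be probed numerically. The Python sketch below is only an illustration on a finite grid, not a proof. It uses the identity $\bar{\Phi}(\bar{\Phi}^{-1}(x) - \mu) = \Phi(\Phi^{-1}(x) + \mu)$, and the helper name `holder_ratio`, the grid size and the choice $\mu = 1$ are assumptions made for the example.

```python
from math import sqrt
from statistics import NormalDist

_N = NormalDist()


def F(x, mu):
    """One-sided p-value CDF: bar-Phi(bar-Phi^{-1}(x) - mu) = Phi(Phi^{-1}(x) + mu)."""
    return _N.cdf(_N.inv_cdf(x) + mu)


def G(x, mu):
    """Two-sided p-value CDF: G(x) = 1 - [F(1 - x/2) - F(x/2)]."""
    return 1.0 - (F(1.0 - x / 2.0, mu) - F(x / 2.0, mu))


def holder_ratio(cdf, mu, n=400):
    """Largest |cdf(x) - cdf(y)| / sqrt(|x - y|) over pairs of interior grid points."""
    grid = [(k + 0.5) / n for k in range(n)]  # stays strictly inside (0, 1)
    vals = [cdf(x, mu) for x in grid]
    worst = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            r = abs(vals[j] - vals[i]) / sqrt(grid[j] - grid[i])
            worst = max(worst, r)
    return worst


ratio_F = holder_ratio(F, 1.0)  # empirical counterpart of the bound 2M
ratio_G = holder_ratio(G, 1.0)  # empirical counterpart of the bound 2*sqrt(2)*M

# Second difference of F at x = 0.5; a negative value is consistent with concavity.
second_diff = F(0.4, 1.0) + F(0.6, 1.0) - 2.0 * F(0.5, 1.0)
```

A bounded ratio on the grid is consistent with the Lipschitz condition of order 1/2, while the plain Lipschitz ratio $|F(x) - F(y)|/|x - y|$ would grow without bound near zero because the density $f$ is unbounded there.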

The second order derivative of $F_{g1}(t)$ is

\[
F''_{g1}(t) = -\exp(\mu_g \bar{\Phi}^{-1}(t) - \mu_g^2/2)\, \mu_g / \phi(\bar{\Phi}^{-1}(t)) < 0,
\]

where $\phi(\cdot)$ is the density function of the standard normal distribution. Then the $F_{g1}(t)$ are strictly concave. For any $t \in (0, 1]$ and $w_g > 0$, there is an $\eta_g$ such that $F_{g1}(w_g t) - F_{g1}(0) = F'_{g1}(w_g \eta_g) w_g t > F'_{g1}(w_g t) w_g t$ by the concavity of $F_{g1}$. Hence

\[
O(\omega, t) - O(\omega, 0) = t + \sum_{g=1}^{G} \pi_g \pi_{g1} F'_{g1}(w_g \eta_g) w_g t > t + \sum_{g=1}^{G} \pi_g \pi_{g1} F'_{g1}(w_g t) w_g t = O'(\omega, t)\, t. \qquad (3)
\]

Then

\[
B'(\omega, t) = \frac{O(\omega, t) - t O'(\omega, t)}{[O(\omega, t)]^2} > 0
\]

for $t > 0$. $\lim_{t \to 0^+} B(\omega_H, t) < \alpha$ implies $t_H > 0$ (refer to the proof of Theorem 4 in Hu et al., 2010, for more details). Therefore $B(\omega_H, t)$ has a nonzero derivative at $t_H$. By formula (3),

\[
\lim_{t \to 0^+} \frac{O(\omega_H, t)}{t} \ge \lim_{t \to 0^+} \left( 1 + \sum_{g=1}^{G} \pi_g \pi_{g1} F'_{g1}(w_g t) w_g \right).
\]

Note that $F'_{g1}(w_g t) = \exp(\mu_g \bar{\Phi}^{-1}(w_g t) - \mu_g^2/2)$ goes to $+\infty$ as $t \to 0^+$. Then $\lim_{t \to 0^+} O(\omega_H, t)/t = +\infty$ and $\lim_{t \to 0^+} B(\omega_H, t) = 0 < \alpha$. Similarly, we can show that $B(\omega_{BH}, t)$ has a nonzero derivative at $t_{BH}$ and $\lim_{t \to 0^+} B(\omega_{BH}, t) = 0 < \alpha$. Then A.1 is satisfied.

Suppose that $\pi_{g1} > 0$ and the $F_{g1}$ are strictly concave functions for all $g$. We next show that the assumption A.3 holds. Suppose both $\omega'_o$ and $\omega_o$ are maximum points of $O(\omega, t_M)$ under the constraint condition C.1, and $\omega'_o \ne \omega_o$. Without loss of generality, we suppose $w'_{1o} \ne w_{1o}$. Define $\omega_{o\lambda} = \lambda \omega'_o + (1 - \lambda) \omega_o$, where $\lambda$ is a constant in (0, 1). It is easy to see that $\omega_{o\lambda}$ satisfies the constraint condition C.1. Note that $F_{11}(w_{1o\lambda} t_M) > \lambda F_{11}(w'_{1o} t_M) + (1 - \lambda) F_{11}(w_{1o} t_M)$ and $F_{g1}(w_{go\lambda} t_M) \ge \lambda F_{g1}(w'_{go} t_M) + (1 - \lambda) F_{g1}(w_{go} t_M)$, $g = 2, \ldots, G$, by the concavity of $F_{g1}$. We have $O(\omega_{o\lambda}, t_M) > \lambda O(\omega'_o, t_M) + (1 - \lambda) O(\omega_o, t_M) = O(\omega'_o, t_M) = O(\omega_o, t_M)$. This contradicts the definitions of $\omega'_o$ and $\omega_o$.

References

Benjamini, Y., Hochberg, Y., 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B 57 (1), 289–300.
Benjamini, Y., Hochberg, Y., 1997. Multiple hypotheses testing with weights. Scand. J. Statist. 24 (3), 407–418.
Benjamini, Y., Hochberg, Y., 2000. On the adaptive control of the false discovery rate in multiple testing with independent statistics. J. Educational Behav. Statist. 25 (1), 60–83.
Benjamini, Y., Krieger, A., Yekutieli, D., 2006. Adaptive linear step-up procedures that control the false discovery rate. Biometrika 93 (3), 491–507.
Cai, T., Sun, W., 2009. Simultaneous testing of grouped hypotheses: finding needles in multiple haystacks. J. Amer. Statist. Assoc. 104 (488), 1467–1481.
Considine, G., Zappalà, G., 2002. The influence of social and economic disadvantage in the academic performance of school students in Australia. J. Sociology 38 (2), 129–148.
Efron, B., 2004. Local False Discovery Rate. Technical Report. Stanford University, Department of Statistics, available.


Efron, B., 2007. Correlation and large-scale simultaneous significance testing. J. Amer. Statist. Assoc. 102 (477), 93–103.
Efron, B., 2008. Simultaneous inference: When should hypothesis testing problems be combined? Ann. Appl. Statist., 197–223.
Genovese, C., Wasserman, L., 2002. Operating characteristics and extensions of the fdr procedure. J. Roy. Statist. Soc. Ser. B 64 (3), 499–518.
Genovese, C., Wasserman, L., 2004. A stochastic process approach to false discovery control. Ann. Statist. 32 (3), 1035–1061.
Genovese, C., Roeder, K., Wasserman, L., 2006. False discovery control with p-value weighting. Biometrika 93 (3), 509–524.
Holm, S., 1979. A simple sequentially rejective multiple test procedure. Scand. J. Statist., 65–70.
Hu, J., Zhao, H., Zhou, H., 2010. False discovery rate control with groups. J. Amer. Statist. Assoc. 105 (491), 1215–1227.
Jin, J., Cai, T., 2007. Estimating the null and the proportion of nonnull effects in large-scale multiple comparisons. J. Amer. Statist. Assoc. 102 (478), 495–506.
Meinshausen, N., Rice, J., 2006. Estimating the proportion of false null hypotheses among a large number of independently tested hypotheses. Ann. Statist. 34 (1), 373–393.
Roeder, K., Wasserman, L., 2009. Genome-wide significance levels and weighted hypothesis testing. Statist. Sci.: A Review J. Inst. Math. Statist. 24 (4), 398.
Roquain, E., Van De Wiel, M., 2009. Multi-weighting for fdr control. Electron. J. Statist. 3, 678–711.
Sparkes, J., 1999. Schools, Education and Social Exclusion. Centre for Analysis of Social Exclusion, London School of Economics.
Storey, J., 2002. A direct approach to false discovery rates. J. Roy. Statist. Soc. Ser. B 64 (3), 479–498.
Storey, J., Taylor, J., Siegmund, D., 2004. Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach. J. Roy. Statist. Soc. Ser. B 66 (1), 187–205.
Sun, W., McLain, A., 2012. Multiple testing of composite null hypotheses in heteroscedastic models. J. Amer. Statist. Assoc. 107 (498), 673–687.
Sun, W., Wei, Z., 2011. Multiple testing for pattern identification, with applications to microarray time course experiments. J. Amer. Statist. Assoc. 106 (493), 73–88.
