Power efficiency of Efron’s biased coin design


Journal of Statistical Planning and Inference 159 (2015) 15–27

Contents lists available at ScienceDirect

Journal of Statistical Planning and Inference journal homepage: www.elsevier.com/locate/jspi

David Azriel ∗

Department of Statistics, University of Pennsylvania, 3730 Walnut Street, Philadelphia, PA 19104, United States
Faculty of Industrial Engineering and Management, The Technion, Israel

Article info

Article history:
Received 17 June 2014
Received in revised form 4 November 2014
Accepted 12 November 2014
Available online 24 November 2014

Keywords:
Design of clinical trials
Efron’s biased coin design
Large deviations
Power efficiency

Abstract

Efron’s biased coin design aims to both balance the experiment and preserve randomness. It has been noticed that, under the homoscedastic normal model, Efron’s design is uniformly more powerful than a perfect simple randomization. However, this optimality property does not hold for heteroscedastic models. For the latter, it is shown in this work that Efron’s biased coin provides more power than a perfect simple randomization for a large enough sample size. This is proved by studying the exponential rate at which the power converges to one, under the different designs, using large deviations theory. Specifically, we prove this power efficiency property for binary and normal responses, when the variances of the two treatments are different, and the probability of heads for the biased coin is equal to or greater than 2/3. A numerical study indicates that the power is larger even for small-sized experiments and the improvement can reach up to 4%.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

Maximizing the power to detect a treatment effect is a major issue in the design of clinical trials. Under the normal homoscedastic model, the balanced design minimizes the variance of the treatment effect estimate and maximizes the power of the test comparing both treatments (Silvey, 1980). Furthermore, Kalish and Harrington (1988) and Begg and Kalish (1984) consider several optimality criteria for heteroscedastic models and recommend the balanced design as being close to optimal. Recently, Azriel et al. (2012) showed that the balanced design is asymptotically optimal for power maximization under binary response, which is again heteroscedastic.

When the number of subjects is known in advance, perfect balance is possible by fixing the number of subjects in each group in advance. When the subjects arrive sequentially, however, such a fixed design is not feasible. In this case, perfect balance can still be achieved by a deterministic rule, e.g., each subject is given a different treatment than the previous subject. However, such a deterministic design may introduce bias, since the experimenter and the subjects can know in advance which treatment will be given (selection bias). Alternatively, one can consider a randomized design that assigns subjects randomly, with equal probability for both treatments. Such a design does not suffer from selection bias but has a non-negligible probability of being far from balanced.

To resolve this dilemma, Efron (1971) introduced the biased coin design, in which the smaller group is preferred but randomization is still used. Efron showed that this design is almost balanced with high probability while the selection bias is small. His results are based on the steady state distribution of the Markov chain induced by the biased coin design. Following this design, several extensions and generalizations have been considered (e.g., Wei, 1978; Baldi Antognini and Giovagnoli, 2004). Recently, Markaryan and Rosenberger (2010) provided the exact distribution of the biased coin design, rather than the steady state distribution originally studied by Efron.



Correspondence to: Department of Statistics, University of Pennsylvania, 3730 Walnut Street, Philadelphia, PA 19104, United States. E-mail address: [email protected].

http://dx.doi.org/10.1016/j.jspi.2014.11.002 0378-3758/© 2014 Elsevier B.V. All rights reserved.


Chen (2006) demonstrated, by a numerical calculation of the power, that Efron’s biased coin design is uniformly more powerful than the randomized design. He studied the case of normal responses with equal variances. This result was later proved by Baldi Antognini (2008). The proof is based on the relation between the power and the absolute value of the difference between the two group sizes, and is now presented. Let $N_A(n)$ and $N_B(n)$ be the numbers of subjects allocated to treatments A and B after n subjects, so that $N_A(n) + N_B(n) = n$. Suppose that the responses in A and B are distributed $N(\mu_A, \sigma^2)$ and $N(\mu_B, \sigma^2)$. The power of the standard α-level test for testing $H_0: \mu_A = \mu_B$ versus $H_1: \mu_B > \mu_A$ is

$$
E\,\Phi\!\left(\frac{\mu_B-\mu_A}{\sqrt{\dfrac{\sigma^2}{N_A(n)}+\dfrac{\sigma^2}{N_B(n)}}}-z_\alpha\right)
= E\,\Phi\!\left(\frac{\sqrt{n}\,(\mu_B-\mu_A)}{2\sigma}\sqrt{1-\left(\frac{D(n)}{n}\right)^{2}}-z_\alpha\right), \qquad (1)
$$

where Φ is the standard normal distribution function, $z_\alpha := \Phi^{-1}(1-\alpha)$, and $D(n) := N_A(n) - N_B(n)$ is the difference between the group sizes. The expectation in (1) is with respect to the distribution of $N_A(n)$. The random variable over which the expectation is taken decreases with |D(n)|. Baldi Antognini (2008) showed that |D(n)| under Efron’s biased coin design is stochastically smaller than under the randomized design and, therefore, the former is uniformly more powerful than the latter. A similar proof holds when the common $\sigma^2$ is not known but is estimated.

When the variances are different, equality (1) does not hold. Furthermore, in this case Efron’s biased coin is no longer uniformly more powerful. However, a weaker property holds: Efron’s biased coin provides more power than a randomized design for large enough n, where “large enough” may depend on the parameters. This property implies that for large fixed n, Efron’s biased coin is more powerful for a large part of the parameter space. The current paper is dedicated to the proof and the study of this property. The proof is based on computing the exponential rate at which the power converges to one, using large deviations theory. We show that this rate is always at least as large under Efron’s design, and strictly larger if some mild conditions are met. Our results are general and apply to the heteroscedastic normal model as well as to binary response. In particular, we show for the latter two models that if the probability of heads for the biased coin is at least 2/3, then Efron’s design is more powerful when the variances of the two treatments are different.

A numerical study indicates that the improvement in power typically starts from relatively small n (≈20) and is about 2%–4%. This is similar to the improvement in the homoscedastic normal model reported in Chen (2006). The current work studies the power under Efron’s design and not the selection bias.
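As an illustration of the homoscedastic argument above, the right-hand side of (1) can be evaluated exactly, because it depends on the design only through the distribution of |D(n)|. The following is a minimal sketch, not the author's code. It assumes that |D(n)| moves from 0 to 1 with certainty and otherwise shrinks by one with the coin probability `p` (the smaller group is chosen) or grows by one with probability 1 − p. The function names, the argument `level` (the test level, kept separate from the coin probability) and the parameter values are hypothetical choices for the example.

```python
from math import sqrt
from statistics import NormalDist


def abs_imbalance_dist(n, p):
    """Exact distribution of |D(n)| when the smaller group is chosen with probability p.

    p = 1/2 corresponds to equal randomization, p = 1 to a deterministic design.
    """
    dist = [0.0] * (n + 2)
    dist[0] = 1.0
    for _ in range(n):
        new = [0.0] * (n + 2)
        new[1] += dist[0]  # from perfect balance the next subject always gives |D| = 1
        for j in range(1, n + 1):
            new[j - 1] += p * dist[j]          # smaller group chosen: imbalance shrinks
            new[j + 1] += (1.0 - p) * dist[j]  # larger group chosen: imbalance grows
        dist = new
    return dist[: n + 1]


def power_eq1(n, p, delta, sigma=1.0, level=0.05):
    """Right-hand side of (1) with delta = mu_B - mu_A, averaged over |D(n)|."""
    normal = NormalDist()
    z = normal.inv_cdf(1.0 - level)
    scale = sqrt(n) * delta / (2.0 * sigma)
    return sum(
        prob * normal.cdf(scale * sqrt(1.0 - (d / n) ** 2) - z)
        for d, prob in enumerate(abs_imbalance_dist(n, p))
    )


if __name__ == "__main__":
    for label, p in [("randomized", 0.5), ("Efron, 2/3", 2 / 3), ("deterministic", 1.0)]:
        print(label, round(power_eq1(n=20, p=p, delta=1.0), 4))
```

Under these assumptions the three printed values should be ordered as randomized, then Efron, then deterministic, which is the stochastic ordering argument of Baldi Antognini (2008) in numerical form.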
Clearly, if power is the only consideration, one would choose 1 for the probability of heads of the biased coin, which coincides with a deterministic design. Efron suggested a probability of 2/3, since this choice has a high probability of being close to balance while the selection bias is small. We show (Proposition 2.1) that Efron’s design with 2/3 is indeed close to a deterministic design in terms of power.

The rest of the paper is organized as follows. Section 2 studies three designs: Efron’s biased coin design, perfect randomization, and a deterministic design that assigns an equal number of subjects to each treatment group. The rate at which the power converges to one, under the different designs, is computed. The theoretical results are illustrated for the binary and normal responses in Sections 3 and 4, respectively. Possible extensions and future research are discussed in Section 5. The proofs are given in Section 6.

2. Main results

In this section we state the main theoretical results of the paper. We start with some notation and definitions. Let A and B be two experimental treatments. At stage n of the experiment, $N_A(n)$, $N_B(n)$ subjects are allocated to treatments A, B, where $N_A(n) + N_B(n) = n$. We study three designs:

1. Efron’s biased coin design.
2. Equal randomization.
3. Deterministic designs.

Quantities related to these designs are denoted by superscripts E, R, D, respectively. The designs are defined as follows:

1. For Efron’s biased coin design,

$$
P\{\text{the } (n+1)\text{th subject is allocated to } A \mid N_A^E(n), N_B^E(n)\} :=
\begin{cases}
\alpha & N_A^E(n) < N_B^E(n) \\
1/2 & N_A^E(n) = N_B^E(n) \\
1-\alpha & N_A^E(n) > N_B^E(n)
\end{cases},
$$

where α ∈ (1/2, 1] is the probability of heads for the biased coin.
2. For the randomized design, $N_A^R(n) \sim \mathrm{Bin}(n, 1/2)$.
3. For the deterministic design, $N_A^D(n) = n/2$ (respectively, $N_A^D(n) = (n+1)/2$) for even (respectively, odd) n.

Efron’s biased coin design with α = 1 coincides with the deterministic design, and with α = 1/2 coincides with the randomized design.

The response of a subject assigned to A, B is distributed $F_A$, $F_B$, respectively. Let $\mu_A := \int x\,dF_A(x)$ and $\mu_B := \int x\,dF_B(x)$ denote the means, and let $\hat\mu_A(n)$, $\hat\mu_B(n)$ denote the average responses based on the $N_A(n)$, $N_B(n)$ subjects. Let $\hat V_A(n)$, $\hat V_B(n)$ be consistent estimators of the variances. For the following asymptotic results to hold, it is assumed that $\hat V_A(n)$, $\hat V_B(n)$ are bounded from above, that is, a bound $\bar V$ on the variance is known and $P\{\hat V_i(n) > \bar V\} = 0$, i = A, B, for any n. The Wald test statistic is defined by

$$
W^j(n) := \frac{\hat\mu_B^j(n)-\hat\mu_A^j(n)}{\left\{\dfrac{\hat V_A^j(n)}{N_A^j(n)}+\dfrac{\hat V_B^j(n)}{N_B^j(n)}\right\}^{1/2}}, \qquad j = E, R, D.
$$

We suppose that $\mu_A < \mu_B$ and aim to maximize the power, i.e., the probability $P\{W^j(n) > C_r\}$ for some critical value $C_r$. We study the exponential rate at which the power converges to one in the different designs. Let $M_i(t) := \int \exp(tx)\,dF_i(x)$ be the moment generating function of the response in treatment i, i = A, B. Define $f_A(K) := \inf_{t>0} M_A(t)\exp(-tK)$ and $f_B(K) := \inf_{s>0} M_B(-s)\exp(sK)$ (notice the different signs of s and t). As stated in the following theorem, the rate, or bounds on the rate, depend on the following functions:

$$
R^D(K) := \frac12\log\{f_A(K)\}+\frac12\log\{f_B(K)\},
$$

$$
R^R(K) := \log\left\{\frac12 f_A(K)+\frac12 f_B(K)\right\},
$$

$$
R^E(K) :=
\begin{cases}
R^D(K) & g(K) < 0 \\
\min\{R^R(K),\, R^D(K)+g(K)\} & g(K) \ge 0
\end{cases},
$$

where

$$
g(K) := \frac12\left|\log\left\{\frac{f_A(K)}{f_B(K)}\right\}\right| + \log\{(1-\alpha)/\alpha\}.
$$

Theorem 2.1.

(i) The power of Efron’s design satisfies

$$
\limsup_{n\to\infty}\frac1n\log[1-P\{W^E(n)>C_r\}] \le \max\left[\max_{K\in[\mu_A,\mu_B]} R^E(K),\ \min\big\{\log\{(1-\alpha)/\alpha\},\,-\log(2)\big\}\right].
$$

(ii) The power of the randomized design satisfies

$$
\limsup_{n\to\infty}\frac1n\log[1-P\{W^R(n)>C_r\}] \le \max\left[\max_{K\in[\mu_A,\mu_B]} R^R(K),\ -\log(2)\right],
$$

$$
\liminf_{n\to\infty}\frac1n\log[1-P\{W^R(n)>C_r\}] \ge \max_{K\in[\mu_A,\mu_B]} R^R(K).
$$

(iii) The power of the deterministic design satisfies

$$
\lim_{n\to\infty}\frac1n\log[1-P\{W^D(n)>C_r\}] = \max_{K\in[\mu_A,\mu_B]} R^D(K).
$$
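The forms of the rate functions for the randomized and deterministic designs can be made concrete through the moment $E[a^{N_A(n)} b^{N_B(n)}]$ with $a = f_A(K)$ and $b = f_B(K)$, which is the quantity bounded in the proofs of Section 6. The sketch below is only an illustration with hypothetical function names and arbitrary values of a and b: for $N_A \sim \mathrm{Bin}(n, 1/2)$ the binomial theorem gives exactly $\{(a+b)/2\}^n$, and for a fixed balanced allocation the moment is $a^{N_A} b^{N_B}$.

```python
from math import comb, log


def log_moment_randomized(a, b, n):
    """(1/n) log E[a^N_A b^N_B] when N_A ~ Bin(n, 1/2)."""
    expectation = sum(comb(n, k) * 0.5 ** n * a ** k * b ** (n - k) for k in range(n + 1))
    return log(expectation) / n


def log_moment_deterministic(a, b, n):
    """(1/n) log(a^N_A b^N_B) when N_A = ceil(n/2) is fixed in advance."""
    n_a = (n + 1) // 2
    return (n_a * log(a) + (n - n_a) * log(b)) / n


if __name__ == "__main__":
    a, b, n = 0.6, 0.9, 40  # arbitrary stand-ins for f_A(K), f_B(K)
    print(log_moment_randomized(a, b, n), log(0.5 * a + 0.5 * b))
    print(log_moment_deterministic(a, b, n), 0.5 * log(a) + 0.5 * log(b))
```

For even n the two printed pairs agree, matching the expressions for $R^R(K)$ and $R^D(K)$.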

Comparing the deterministic and the randomized designs, Theorem 2.1 implies that the power in the deterministic design approaches one at least as fast. Indeed, concavity of the logarithm implies that

$$
\frac12\log\{f_A(K)\}+\frac12\log\{f_B(K)\} \le \log\left\{\frac12 f_A(K)+\frac12 f_B(K)\right\},
$$

and a strict inequality holds unless $f_A(K) = f_B(K)$. Therefore, the rate is strictly larger if

$$
\max_{K\in[\mu_A,\mu_B]} R^R(K) > -\log(2) \qquad (2)
$$

and if $K^*$, the maximizer of $R^R(K)$, satisfies $f_A(K^*) \ne f_B(K^*)$. For the binomial and the normal cases, we show below that (2) is always satisfied, while the latter condition holds if the variances are different.

The rate of convergence to one in Efron’s design is equal to that of the deterministic design if $\tilde K^*$, the maximizer of $R^E(K)$, satisfies $g(\tilde K^*) \le 0$. This condition is satisfied in many cases, as shown below. However, when it is not satisfied, Theorem 2.1 establishes only a lower bound on the rate and, hence, does not determine the rate.

Our main interest is the power comparison of Efron’s and the randomized design. Theorem 2.1 implies that if

$$
\max_{K\in[\mu_A,\mu_B]} R^E(K) < \max_{K\in[\mu_A,\mu_B]} R^R(K), \qquad (3)
$$

then the rate of convergence of the power to 1 is strictly larger in Efron’s design. This is stated in the following theorem, whose proof is based on Theorem 2.1.


Theorem 2.2. (i) Suppose that (2) holds; then

$$
\limsup_{n\to\infty}\frac1n\log[1-P\{W^E(n)>C_r\}] \le \lim_{n\to\infty}\frac1n\log[1-P\{W^R(n)>C_r\}]. \qquad (4)
$$

(ii) If, further, (3) is satisfied, then the inequality in (4) is strict and therefore there exists N that depends on $F_A$, $F_B$, $C_r$ such that for every n > N,

$$
P\{W^E(n)>C_r\} > P\{W^R(n)>C_r\}.
$$

Theorem 2.2 implies that if (2) and (3) both hold, then Efron’s design is more powerful for large enough n. The following proposition provides sufficient conditions for (2) and (3).

Proposition 2.1. (i) If

$$
f_B(\mu_A) > 0 \quad\text{or}\quad f_A(\mu_B) > 0, \qquad (5)
$$

then (2) is satisfied.
(ii) If α ≥ 2/3, then (3) is satisfied unless

$$
\text{the maximizer of } R^R(K) \text{ and } R^D(K) \text{ is the same}, \qquad (6)
$$

in which case $f_A(K^*) = f_B(K^*)$, where $K^*$ is the maximizer.

Proposition 2.1 implies that if (5) holds and α ≥ 2/3 then Efron’s design is more powerful, unless (6) holds. We show below for the binomial and the normal cases that (5) always holds and (6) is satisfied only when the variances are the same.

3. Binary response

In this section we study the case where the responses are binary; the probabilities of success in treatments A and B are denoted by $p_A$ and $p_B$. Then,

$$
f_A(K) =
\begin{cases}
\left(\dfrac{p_A}{K}\right)^{K}\left(\dfrac{1-p_A}{1-K}\right)^{1-K} & K > p_A \\
1 & K \le p_A
\end{cases},
\qquad
f_B(K) =
\begin{cases}
\left(\dfrac{p_B}{K}\right)^{K}\left(\dfrac{1-p_B}{1-K}\right)^{1-K} & K < p_B \\
1 & K \ge p_B
\end{cases};
$$

notice that $-\log\{f_A(K)\}$ is the Kullback–Leibler divergence between $p_A$ and K. Here (for $p_A, p_B \notin \{0, 1\}$),

$$
f_A(p_B) = \left(\frac{p_A}{p_B}\right)^{p_B}\left(\frac{1-p_A}{1-p_B}\right)^{1-p_B} > 0
$$

and, hence, (5) is satisfied.
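The binary-response formulas above make conditions (2) and (3) easy to check numerically for a given pair $(p_A, p_B)$. The following sketch evaluates $R^D$, $R^R$ and $R^E$ on a grid of K values and compares their maxima. It is an illustration only: the grid search, the function names and the parameter values are assumptions of the example, and g(K) is coded in the absolute-value form $\frac12|\log\{f_A(K)/f_B(K)\}| + \log\{(1-\alpha)/\alpha\}$.

```python
from math import log


def f_A(K, p_A):
    return 1.0 if K <= p_A else (p_A / K) ** K * ((1 - p_A) / (1 - K)) ** (1 - K)


def f_B(K, p_B):
    return 1.0 if K >= p_B else (p_B / K) ** K * ((1 - p_B) / (1 - K)) ** (1 - K)


def rates(K, p_A, p_B, alpha):
    """Return (R^D(K), R^R(K), R^E(K)) for binary responses."""
    a, b = f_A(K, p_A), f_B(K, p_B)
    r_d = 0.5 * log(a) + 0.5 * log(b)
    r_r = log(0.5 * a + 0.5 * b)
    g = 0.5 * abs(log(a / b)) + log((1 - alpha) / alpha)
    r_e = r_d if g < 0 else min(r_r, r_d + g)
    return r_d, r_r, r_e


def max_rates(p_A, p_B, alpha, grid=2000):
    """Grid approximation of the maxima of R^D, R^R, R^E over K in [p_A, p_B]."""
    ks = [p_A + (p_B - p_A) * i / grid for i in range(grid + 1)]
    values = [rates(k, p_A, p_B, alpha) for k in ks]
    return tuple(max(v[i] for v in values) for i in range(3))


if __name__ == "__main__":
    max_d, max_r, max_e = max_rates(0.3, 0.85, alpha=2 / 3)
    print("condition (2):", max_r > -log(2))
    print("condition (3):", max_e < max_r)
```

A finer grid, or a one-dimensional optimizer, could replace the grid search when the maxima are needed to higher accuracy.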

Fig. 1(a) and (b) shows the functions RD (K ), RR (K ), RE (K ) for certain parameters; this is a typical picture where (3) (that is, max RE (K ) < max RR (K )) is satisfied and Efron’s design is more powerful. In Fig. 1(c), pA = 0.2 = 1 − pB and RR (K ), RD (K ), RE (K ) have the same maximizer. In Fig. 1(d), pA = 0.03 = 1 − pB but (3) is satisfied. Fig. 2 shows the pairs (pA , pB ) for which (3) is satisfied for different α ’s. For pairs (pA , pB ) in the gray area (3) is satisfied and Efron’s design is more powerful. For pairs in the black area (3) fails but still Efron’s design can be more powerful since (3) is sufficient but not necessary. By Proposition 2.1, when α ≥ 2/3, (3) does not hold only when (pA , pB ) = (p, 1 − p) for p ∈ (0.091, 0.5), in which case K = 0.5 is the maximizer and fA (0.5) = fB (0.5), and the deterministic design has the same rate as the randomized design. For (pA , pB ) not of this special form, if α ≥ 2/3, then (3) is satisfied and Efron’s design is more powerful. For smaller α ’s, (3) is not satisfied in more cases. In these cases pA and pB are generally far from each other; the power is quite high in both designs and a comparison is of little value. For α ’s close to 0.5, (3) does not hold more frequently but such α ’s are usually not recommended in clinical trials. The result for the binomial case is summarized in the following corollary. Corollary 3.1. When the responses are binary with pA , pB ̸∈ {0, 1}, Efron’s design is more powerful, in the sense of Theorem 2.2, if α ≥ 2/3, unless (pA , pB ) = (p, 1 − p) for p ∈ (0.091, 0.5). Recall that since (5) holds in the binomial case, the rate of convergence to one in Efron’s design is always not smaller than the rate of the randomized design. For the parameters where (3) fails, it still could be the case that Efron’s design is more powerful either because this condition is not necessary or just because same rates do not imply same power. 
Corollary 3.1 provides sufficient, but not necessary conditions, for the power efficiency of Efron’s design. Fig. 3 compares the power of Efron’s design to the randomized design for n = 20 and for different parameters (pA , pB ). The power is computed exactly using the binomial probabilities and the finite sample distribution of Efron’s design (Markaryan and Rosenberger, 2010). We consider pA = 0.5 and pB varies from 0.5 to 0.99. When pB is close to pA , the power is low and the randomized design is more powerful. In this example, when the power is smaller than 0.2 the randomized design is better but otherwise Efron’s design is better. That is, for pB bigger than ≈0.65, Efron’s design provides higher power. For example, when pB = 0.9 then the power of the randomized, Efron’s design is 0.686, 0.713, respectively, which represents an improvement of 4%. Thus, when n = 20 and pA = 0.5, Efron’s design is uniformly more powerful in the interval pB ∈ (0.65, 1). We repeated this calculation for n = 40, 50, and found out that the corresponding intervals are (0.6, 1), (0.5, 1), indicating that Efron’s design is uniformly more powerful for this parameter space for large enough n.


Fig. 1. Plots of RD (K ), RR (K ), RE (K ). In Figure (a) and (b), α = 0.55, pA = 0.3, pB = 0.85. Figure (a) shows the functions for K ∈ [pA , pB ] and Figure (b) focuses on the interval where the maximum is obtained. In Figure (c), α = 0.67, pA = 0.2, pB = 0.8. In Figure (d), α = 0.67, pA = 0.03, pB = 0.97.

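4. Normal response

We now study the normal case; the distributions in treatments A and B are assumed to be $N(\mu_A, \sigma_A^2)$ and $N(\mu_B, \sigma_B^2)$, respectively. We have that

$$
f_A(K) =
\begin{cases}
\exp\left\{-\dfrac{(K-\mu_A)^2}{2\sigma_A^2}\right\} & K > \mu_A \\
1 & K \le \mu_A
\end{cases},
\qquad
f_B(K) =
\begin{cases}
\exp\left\{-\dfrac{(\mu_B-K)^2}{2\sigma_B^2}\right\} & K < \mu_B \\
1 & K \ge \mu_B
\end{cases}.
$$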
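The closed form for $f_A$ follows from minimizing $M_A(t)\exp(-tK)$ over t > 0 with the normal moment generating function $M_A(t) = \exp(\mu_A t + \sigma_A^2 t^2/2)$; the minimizer is $t = (K-\mu_A)/\sigma_A^2$ when $K > \mu_A$. The sketch below compares the closed form with a crude grid minimization. The function names, the grid and the parameter values are hypothetical choices for the illustration.

```python
from math import exp


def f_A_closed(K, mu_A, sigma_A):
    """Closed form of f_A(K) for a N(mu_A, sigma_A^2) response."""
    if K <= mu_A:
        return 1.0
    return exp(-((K - mu_A) ** 2) / (2.0 * sigma_A ** 2))


def f_A_numeric(K, mu_A, sigma_A, t_max=10.0, steps=100_000):
    """Grid approximation of inf over t > 0 of M_A(t) exp(-tK)."""
    best = float("inf")
    for i in range(1, steps + 1):
        t = t_max * i / steps
        best = min(best, exp(mu_A * t + 0.5 * sigma_A ** 2 * t ** 2 - t * K))
    return best


if __name__ == "__main__":
    print(f_A_closed(1.3, 0.0, 1.5), f_A_numeric(1.3, 0.0, 1.5))
```

The same check applies to $f_B$ after replacing t by −s in the moment generating function.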

As in the binomial case, (2) is always satisfied since (5) holds. Fig. 4 shows the parameters for which (3) is satisfied for different values of α when $(\mu_B, \sigma_B) \in [0.05, 3] \times [0.5, 2]$ and $(\mu_A, \sigma_A) = (0, 1)$. As in the binary response case, if α ≥ 2/3, (3) fails to hold only when $\sigma_A = \sigma_B$ and $\mu_B - \mu_A$ is not too large. In the situation where $\sigma_A = \sigma_B = 1$ and $\mu_A = 0$, (3) does not hold for $\mu_B \le 2$, and for larger $\mu_B$ it holds for some values of α. In the case of equal variances, Baldi Antognini (2008) proves that Wald’s test in Efron’s design is uniformly more powerful than in the randomized design, but he considers a different test statistic than ours. For α < 2/3, (3) generally fails only when the power is quite high, as in the binary response case. The result for the normal case is summarized in the following corollary.

Corollary 4.1. When the responses are normal and the variances are different, Efron’s design is more powerful, in the sense of Theorem 2.2, if α ≥ 2/3.

Fig. 2. The pairs $(p_A, p_B)$ for which (3) is satisfied for different values of α. The gray (respectively, black) dots represent the area where (3) holds (respectively, does not hold). Panels: (a) α = 0.67, (b) α = 0.6, (c) α = 0.55, (d) α = 0.51.

Fig. 5 compares the power of Efron’s design to the randomized design. The power is computed by simulating $10^5$ standard normal random variables, and using the finite sample distribution of Efron’s design (Markaryan and Rosenberger, 2010), for certain parameters. The same simulated random variables were used for both designs in order to reduce the variance of the difference. The result is somewhat similar to the binomial case. When $\mu_B$ is close to $\mu_A$, the power is low and the randomized design is more powerful. For larger $\mu_B$, Efron’s design obtains more power. For these parameters the improvement is at most 2%, less than in the binomial case.

5. Discussion

In this work we proved that the convergence of the power to one under Efron’s design is faster than under the randomized design, pointwise. A stronger result would be to show that

$$
\limsup_{n\to\infty}\frac1n\log\left[\frac{1-P\{W^E(n)>C_r\}}{1-P\{W^R(n)>C_r\}}\right] < 0
$$

holds uniformly over a subset of the parameter space. This would imply that there exists N such that

$$
P\{W^E(n)>C_r\} > P\{W^R(n)>C_r\} \qquad (7)
$$


Fig. 3. (a) The power of the randomized (gray), Efron’s (black) design when pA = 0.5, pB = 0.5, . . . , 0.99, α = 0.67, n = 20 and Cr = 1.645. (b) The power of Efron’s design minus the power of the randomized design under the same set of parameters.

for each n > N uniformly over the above subset. This property was observed in the binomial case as reported in Section 3. We provided sufficient conditions under which (7) holds, pointwise only. The proof is already quite involved and the uniform version is left for future research.

In this paper, we considered the randomized, the deterministic and Efron’s designs. We now discuss possible extensions of our results to other designs. Wei (1978) considers a generalization of Efron’s design where

$$
P\{\text{the } (n+1)\text{th subject is allocated to } A \mid D(n)\} := f\{D(n)/n\},
$$

for f : [−1, 1] → [0, 1], a non-increasing function that satisfies f(−x) = 1 − f(x). This design forces a small-sized experiment to be balanced, but is similar to the randomized design for large experiments. Baldi Antognini and Giovagnoli (2004) introduce the adjustable biased coin design (ABCD hereafter), in which

$$
P\{\text{the } (n+1)\text{th subject is allocated to } A \mid D(n)\} := F\{D(n)\},
$$

where F : ℤ → [0, 1] is a non-increasing function that satisfies F(−x) = 1 − F(x). Thus, the tendency to allocate subjects to the smaller group increases with the imbalance. Baldi Antognini (2008) proves that |D(n)| is stochastically smaller under ABCD than under Wei’s design.

Generally, the exponential rate at which the power approaches one is

$$
-\max_{K\in[\mu_A,\mu_B]}\left[\frac12\log\{f_A(K)\}+\frac12\log\{f_B(K)\}+\lim_{n\to\infty}\frac1n\log E\left(\max\left\{\frac{f_A(K)}{f_B(K)},\frac{f_B(K)}{f_A(K)}\right\}^{|D(n)|/2}\right)\right],
$$

provided that the latter limit exists. For Efron’s design, we computed this limit for some parameters and provided bounds for other parameters. The stochastic order of |D(n)| under ABCD and Wei’s design mentioned above implies that the rate under ABCD is not lower than under Wei’s design. In order to prove strict inequality between the rates, a more detailed study of these designs is required.

Many works (e.g., Hu and Rosenberger, 2006) suggest, for heteroscedastic models, allocating subjects according to the ratio of the standard deviations. This is known as Neyman allocation, denoted hereafter by $\pi_{Neyman}$. If the variances are known, one can consider a modified version of Efron’s design that targets this allocation and a randomized design that assigns subjects to treatments A and B with probability $\pi_{Neyman}$, $1 - \pi_{Neyman}$, respectively. Which design provides more power? Azriel and Feigin (2014) showed that the latter randomized design provides (asymptotically) less power than a deterministic design that minimizes the distance between the actual allocation and $\pi_{Neyman}$. The results of the current work indicate that a modified Efron’s design will also be more powerful than this randomized design. When the variances are unknown, one can consider a response-adaptive design that estimates the parameters at each stage of the experiment and allocates subjects according to the estimated $\pi_{Neyman}$. Azriel and Feigin (2014) showed that this adaptive randomized version is asymptotically equivalent, in a large deviations sense, to the randomized design where the parameters are known. A theory of the asymptotic power of response-adaptive versions of Efron’s design is yet to be developed.

We investigated here designs that aim to balance the treatment groups, and covariates were ignored. However, improving the balance over the covariates can lead to better power (Weir and Lees, 2003).
Several adaptive designs (Baldi Antognini and Zagoraiou, 2011; Kapelner and Krieger, 2014) were recently suggested to minimize the imbalance of the covariates between the groups. The study of the improvement in power of such designs is left for future work.
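To make the allocation rules of Wei’s design and the ABCD discussed in this section concrete, the exact distribution of D(n) under any rule of the form "probability of A given D(n) and n" can be computed by a forward recursion. The sketch below is an illustration only. The specific choices $f(x) = (1-x)/2$ for Wei’s design and $F(x) = 1/(|x|^a + 1)$ for $x \ge 1$ (with F(0) = 1/2 and F(−x) = 1 − F(x)) for the ABCD are assumptions made for the example, as are all function names; they are merely functions satisfying the monotonicity and symmetry conditions stated above.

```python
def imbalance_dist(n_total, prob_A):
    """Exact distribution of D(n) = N_A(n) - N_B(n) after n_total subjects.

    prob_A(d, n) is the probability that subject n + 1 goes to A given D(n) = d.
    """
    dist = {0: 1.0}
    for n in range(n_total):
        new = {}
        for d, pr in dist.items():
            p_a = prob_A(d, n)
            new[d + 1] = new.get(d + 1, 0.0) + pr * p_a
            new[d - 1] = new.get(d - 1, 0.0) + pr * (1.0 - p_a)
        dist = new
    return dist


def randomized(d, n):
    return 0.5


def efron(alpha):
    return lambda d, n: 0.5 if d == 0 else (alpha if d < 0 else 1.0 - alpha)


def wei_linear(d, n):
    # Illustrative f(x) = (1 - x) / 2: non-increasing with f(-x) = 1 - f(x)
    return 0.5 if n == 0 else 0.5 * (1.0 - d / n)


def abcd_power(a):
    # Illustrative F: F(0) = 1/2, F(x) = 1 / (|x|^a + 1) for x >= 1, F(-x) = 1 - F(x)
    def rule(d, n):
        if d == 0:
            return 0.5
        w = abs(d) ** a
        return 1.0 / (w + 1.0) if d > 0 else w / (w + 1.0)
    return rule


def mean_abs(dist):
    return sum(abs(d) * pr for d, pr in dist.items())


if __name__ == "__main__":
    rules = {"randomized": randomized, "Efron 2/3": efron(2 / 3),
             "Wei (linear f)": wei_linear, "ABCD (a = 2)": abcd_power(2)}
    for name, rule in rules.items():
        print(name, round(mean_abs(imbalance_dist(50, rule)), 3))
```

The printed values are the expected absolute imbalances E|D(50)| under each rule, which gives a rough sense of how strongly each design forces balance.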

Fig. 4. The parameters $(\mu_B, \sigma_B)$ for which (3) is satisfied for different values of α and when $(\mu_A, \sigma_A) = (0, 1)$. The gray (respectively, black) dots represent the area where (3) holds (respectively, does not hold). Panels: (a) α = 0.67, (b) α = 0.6, (c) α = 0.55, (d) α = 0.51.

6. Proofs

Let $D(n) := D^E(n) = N_A^E(n) - N_B^E(n)$ be the difference between the two groups; notice that the superscript E is suppressed.

Proof of Theorem 2.1

For the proof of Theorem 2.1 we need two lemmas.

Lemma 6.1. Let C > 1 be a constant.
(i) If $\log\{C(1-\alpha)/\alpha\} \le 0$, then

$$
\lim_{n\to\infty}\frac1n\log E\left[C^{D(n)}\right] = 0.
$$

(ii) If $\log\{C(1-\alpha)/\alpha\} > 0$, then

$$
\limsup_{n\to\infty}\frac1n\log E\left[C^{D(n)}\right] \le \log\{C(1-\alpha)/\alpha\}.
$$


Fig. 5. (a) The power of the randomized (gray), Efron’s (black) design when n = 20, α = 0.67, (µA , σA ) = (0, 1), σB = 1.1, µB ∈ [0, 2], Cr = 1.645 and V¯ = 3. (b) The power of Efron’s design minus the power of the randomized design for the same set of parameters; pointwise confidence intervals based on two standard deviations of the simulation are shown in dashed lines.

(iii) We have,

$$
\limsup_{n\to\infty}\frac1n\log E\left[C^{D(n)}\right] \le \log\left(\frac{C^2+1}{2C}\right).
$$

Proof. (i) We have,

$$
E\left[C^{D(n)}\right] = \sum_{j=-n}^{n} C^j P\{D(n)=j\} \le nC + \sum_{j=0}^{n} C^j P\{D(n)=j\} \le nC + \sum_{j=0}^{n} C^j P\{|D(n)|=j\} = nC + E\left[C^{|D(n)|}\right].
$$

Efron (1971) shows that |D(n)| is a Markov chain with period 2, whose stationary probabilities $\pi_j$ are given by

$$
\pi_0 = \frac{r-1}{r}, \qquad \pi_j = \frac{r-1}{r}\,\frac{r+1}{r^{j}},
$$

where $r := \alpha/(1-\alpha)$. Therefore, by Fatou’s Lemma,

$$
\limsup_{n\to\infty} E\left[C^{|D(n)|}\right] \le \sum_{j=0}^{\infty} C^j\pi_j \propto \sum_{j=0}^{\infty}\{C(1-\alpha)/\alpha\}^j,
$$

which is finite if $\log\{C(1-\alpha)/\alpha\} \le 0$. Thus, under this condition,

$$
\limsup_{n\to\infty}\frac1n\log E\left[C^{D(n)}\right] \le \limsup_{n\to\infty}\frac1n\log(nC+M) = 0,
$$

for some finite M. Since, for C > 1, $\liminf_{n\to\infty}\frac1n\log E[C^{D(n)}] \ge 0$, the result follows.

(ii) We first show that there exists a constant C (that depends only on α) such that $P\{|D(n)| = j\} \le C\pi_j$ for every n, j. Since $\lim_{n\to\infty} P\{|D(n)| = 0\} = \pi_0$ and $\lim_{n\to\infty} P\{|D(n)| = 1\} = \pi_1$, there exist constants $C_0$, $C_1$ such that $P\{|D(n)| = j\} \le C_j\pi_j$ for j = 0, 1 and every n. For n = 1, $P\{|D(1)| = j\} = I(j = 1)$; therefore, for $C_2 = 1/\pi_1$, $P\{|D(1)| = j\} \le C_2\pi_j$ for every j. Define $C = \max(C_0, C_1, C_2)$. By induction on n, for j ≥ 2,

$$
P\{|D(n+1)|=j\} = (1-\alpha)P\{|D(n)|=j-1\} + \alpha P\{|D(n)|=j+1\} \le C\{(1-\alpha)\pi_{j-1}+\alpha\pi_{j+1}\} = C\pi_j.
$$

We conclude that $P\{|D(n)| = j\} \le C\pi_j$ for every n, j. Now,

$$
E\left[C^{D(n)}\right] \le nC + \sum_{j=0}^{n} C^j P\{|D(n)|=j\} \le nC + C\sum_{j=0}^{n} C^j\pi_j.
$$

We have that $C^j\pi_j \propto \{C(1-\alpha)/\alpha\}^j$. If $\log\{C(1-\alpha)/\alpha\} > 0$, then for any j = 1, . . . , n, $C^j\pi_j \le C^n\pi_n$. Therefore,

$$
E\left[C^{D(n)}\right] \le nC + nC\,C^n\pi_n;
$$

and,

$$
\limsup_{n\to\infty}\frac1n\log E\left[C^{D(n)}\right] \le \limsup_{n\to\infty}\frac1n\log\left(C^n\pi_n\right) = \limsup_{n\to\infty}\frac1n\log\left[\frac{(r-1)(r+1)}{r}\{C(1-\alpha)/\alpha\}^n\right] = \log\{C(1-\alpha)/\alpha\}.
$$

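(iii) As in Part (i),

$$
E\left[C^{D(n)}\right] \le nC + E\left[C^{|D(n)|}\right].
$$

Let $D^R(n) := N_A^R(n) - N_B^R(n) = 2N_A^R(n) - n$ be the analogue of D(n) for the randomized design. Then, $|D^R(n)|$ is stochastically larger than |D(n)| (Baldi Antognini, 2008). Therefore,

$$
\begin{aligned}
\limsup_{n\to\infty}\frac1n\log E\left[C^{|D(n)|}\right]
&\le \limsup_{n\to\infty}\frac1n\log E\left[C^{|D^R(n)|}\right]
= \limsup_{n\to\infty}\frac1n\log E\left[C^{D^R(n)}\right] \\
&= \limsup_{n\to\infty}\frac1n\log E\left[C^{2N_A^R(n)-n}\right]
= \limsup_{n\to\infty}\frac1n\log E\left[C^{2N_A^R(n)}\right] - \log(C) \\
&= \limsup_{n\to\infty}\frac1n\log\left(C^2/2+1/2\right)^n - \log C
= \log\left(C^2/2+1/2\right) - \log C. \qquad\square
\end{aligned}
$$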
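The stationary probabilities used in the proof of Lemma 6.1 can be checked against the exact finite-n distribution of |D(n)|, which follows from the same one-step recursion as in Part (ii). The sketch below is illustrative only, with hypothetical function names. Because the chain has period 2, only even states carry mass at even n, and the comparison is made on those states.

```python
def abs_imbalance_dist(n, alpha):
    """Exact distribution of |D(n)| under Efron's design with coin probability alpha."""
    dist = [0.0] * (n + 2)
    dist[0] = 1.0
    for _ in range(n):
        new = [0.0] * (n + 2)
        new[1] += dist[0]  # from balance, |D| always becomes 1
        for j in range(1, n + 1):
            new[j - 1] += alpha * dist[j]          # smaller group chosen
            new[j + 1] += (1.0 - alpha) * dist[j]  # larger group chosen
        dist = new
    return dist[: n + 1]


def limiting_prob(j, alpha):
    """pi_0 = (r - 1) / r and pi_j = (r - 1)(r + 1) / r^(j + 1), with r = alpha / (1 - alpha)."""
    r = alpha / (1.0 - alpha)
    if j == 0:
        return (r - 1.0) / r
    return (r - 1.0) * (r + 1.0) / r ** (j + 1)


if __name__ == "__main__":
    alpha, n = 2 / 3, 600
    dist = abs_imbalance_dist(n, alpha)
    for j in range(0, 9, 2):
        print(j, dist[j], limiting_prob(j, alpha))
```

For α = 2/3 the first limiting values are 1/2, 3/8 and 3/32 on the states 0, 2 and 4.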

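Lemma 6.2. Define

$$
R(C_1,C_2) :=
\begin{cases}
\dfrac12\log(C_1)+\dfrac12\log(C_2) & \tilde g(C_1,C_2) < 0 \\[2mm]
\min\left\{\log(C_1/2+C_2/2),\ \dfrac12\log(C_1)+\dfrac12\log(C_2)+\tilde g(C_1,C_2)\right\} & \tilde g(C_1,C_2) \ge 0,
\end{cases}
\qquad (8)
$$

where

$$
\tilde g(C_1,C_2) := \frac12\left|\log\left(\frac{C_1}{C_2}\right)\right| + \log\{(1-\alpha)/\alpha\}.
$$

For any $C_1, C_2 > 0$,

$$
\limsup_{n\to\infty}\frac1n\log E\left[C_1^{N_A^E(n)}C_2^{N_B^E(n)}\right] \le R(C_1,C_2).
$$

Proof. We have, $N_A^E(n) = \frac{n}{2} + \frac{D(n)}{2}$ and $N_B^E(n) = \frac{n}{2} - \frac{D(n)}{2}$; so,

$$
\lim_{n\to\infty}\frac1n\log E\left[C_1^{N_A^E(n)}C_2^{N_B^E(n)}\right] = \frac12\log C_1 + \frac12\log C_2 + \lim_{n\to\infty}\frac1n\log E\left[\{C_1/C_2\}^{D(n)/2}\right].
$$

Since D(n) is symmetric around zero,

$$
E\left[\{C_1/C_2\}^{D(n)/2}\right] = E\left[\max\left\{\frac{C_1}{C_2},\frac{C_2}{C_1}\right\}^{D(n)/2}\right],
$$

and the result now follows from Lemma 6.1. □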

The proof of Theorem 2.1 is now given.

Proof of Part (i). We use ideas similar to those in the proof of Theorem 4.2 of Azriel and Feigin (2014), together with other tools, to obtain an upper bound on the rate of the power of Efron’s design. The main steps are to show that

$$
\begin{aligned}
\limsup_{n\to\infty}\frac1n\log[1-P\{W^E(n)>C_r\}]
&\le \max_{K\in[\mu_A,\mu_B]}\limsup_{n\to\infty}\frac1n\log P\left[\{\hat\mu_A^E(n)\ge K\}\cap\{\hat\mu_B^E(n)\le K\}\right] \\
&\le \max_{K\in[\mu_A,\mu_B]}\limsup_{n\to\infty}\frac1n\log E\left[\{f_A(K)\}^{N_A^E(n)}\{f_B(K)\}^{N_B^E(n)}\right]
\le \max_{K\in[\mu_A,\mu_B]} R^E(K).
\end{aligned}
$$

Let $\hat V(n) := \dfrac{\hat V_A^E(n)}{N_A^E(n)} + \dfrac{\hat V_B^E(n)}{N_B^E(n)}$; for any $\varepsilon_1 > 0$,

$$
\begin{aligned}
1-P\{W^E(n)>C_r\} &= P\left[\hat\mu_A^E(n)-\hat\mu_B^E(n)\ge -C_r\{\hat V(n)\}^{1/2}\right] \\
&= P\left(\left[\hat\mu_A^E(n)-\hat\mu_B^E(n)\ge -C_r\{\hat V(n)\}^{1/2}\right]\cap\{|D(n)|<(1-\varepsilon_1)n\}\right) \\
&\quad + P\left(\left[\hat\mu_A^E(n)-\hat\mu_B^E(n)\ge -C_r\{\hat V(n)\}^{1/2}\right]\cap\{|D(n)|\ge(1-\varepsilon_1)n\}\right) := P_1 + P_2. \qquad (9)
\end{aligned}
$$

The large deviations rate is the maximum of the rates of $P_1$ and $P_2$. We first bound the rate of $P_2$. For t > 0,

$$
P_2 \le P\{|D(n)|\ge(1-\varepsilon_1)n\} = 2P\{D(n)\ge(1-\varepsilon_1)n\} = 2P[\exp\{tD(n)\}\ge\exp\{t(1-\varepsilon_1)n\}] \le 2E[\exp\{tD(n)\}]\exp\{-t(1-\varepsilon_1)n\}.
$$

By Lemma 6.1 Part (iii),

$$
\lim_{n\to\infty}\frac1n\log P_2 \le I[t+\log\{(1-\alpha)/\alpha\}>0]\,\log\left\{\frac{\exp(2t)+1}{2\exp(t)}\right\} - t(1-\varepsilon_1).
$$

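The minimum over t > 0 is

$$
\min\left[(1-\varepsilon_1)\log\{(1-\alpha)/\alpha\},\ \log\left\{\left(\frac{2}{\varepsilon_1}\right)^{1/2}+\left(\frac{2}{\varepsilon_1}\right)^{-1/2}\right\}-\log\left(\frac{2}{\varepsilon_1}\right)^{1/2}-\log(2)\right].
$$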

As $\varepsilon_1 \to 0$, we have

$$
\limsup_{n\to\infty}\frac1n\log P_2 \le \min\left[\log\{(1-\alpha)/\alpha\},\ -\log(2)\right]. \qquad (10)
$$

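For $P_1$ in (9), let $\bar V$ be an upper bound on $\hat V_A^E(n)$, $\hat V_B^E(n)$; then,

$$
\hat V(n) = \frac{\hat V_A^E(n)}{N_A^E(n)}+\frac{\hat V_B^E(n)}{N_B^E(n)} \le \bar V\left\{\frac{1}{N_A^E(n)}+\frac{1}{N_B^E(n)}\right\} = \frac{4\bar V}{n}\left\{\frac{1}{1-\left(D(n)/n\right)^2}\right\}.
$$

Let $\varepsilon_2 > 0$; if $|D(n)| < (1-\varepsilon_1)n$, then for large enough n, $-C_r\{\hat V(n)\}^{1/2} \ge -\varepsilon_2$. So, for large enough n,

$$
P_1 \le P\left(\left[\hat\mu_A^E(n)-\hat\mu_B^E(n)\ge-\varepsilon_2\right]\cap\{|D(n)|<(1-\varepsilon_1)n\}\right) \le P\left[\hat\mu_A^E(n)-\hat\mu_B^E(n)\ge-\varepsilon_2\right].
$$

Now, let $\varepsilon_3 > 0$ and let $\mu_A = k_1 < k_2 < \cdots < k_M = \mu_B$ be a partition of the interval $[\mu_A, \mu_B]$, such that $k_j - k_{j-1} < \varepsilon_3$ for any j = 2, . . . , M. We have,

$$
\begin{aligned}
P\left[\hat\mu_A^E(n)-\hat\mu_B^E(n)\ge-\varepsilon_2\right]
&\le P\left(\bigcup_{K\in[\mu_A,\mu_B]}\left[\{\hat\mu_A^E(n)\ge K-\varepsilon_2\}\cap\{\hat\mu_B^E(n)\le K\}\right]\ \cup\ \left[\{\hat\mu_A^E(n)\ge\mu_B\}\cup\{\hat\mu_B^E(n)\le\mu_A\}\right]\right) \\
&\le P\left(\bigcup_{j=1}^{M}\left[\{\hat\mu_A^E(n)\ge k_j-\varepsilon_2-\varepsilon_3\}\cap\{\hat\mu_B^E(n)\le k_j\}\right]\right) + P\left[\{\hat\mu_A^E(n)\ge\mu_B\}\cup\{\hat\mu_B^E(n)\le\mu_A\}\right] \\
&\le \sum_{j=1}^{M} P\left[\{\hat\mu_A^E(n)\ge k_j-\varepsilon_2-\varepsilon_3\}\cap\{\hat\mu_B^E(n)\le k_j\}\right] + P\left[\hat\mu_A^E(n)\ge\mu_B\right] + P\left[\hat\mu_B^E(n)\le\mu_A\right].
\end{aligned}
$$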

Markov’s inequality implies that for any j = 1, . . . , M and t , s > 0, P ({µ ˆ EA (n) ≥ kj − ε2 − ε3 } ∩ {µ ˆ EB (n) ≤ kj })

 N E (n)  N E (n)  ≤ E MA (t ) exp{−t (kj − ε2 − ε3 )} A MB (−s) exp{skj } B    E  E ˆ A (n) ≥ µB ≤ E {MA (t ) exp(−t µB )}NA (n) P µ    E  E P µ ˆ B (n) ≤ µA ≤ E {MB (−s) exp(sµA )}NB (n) .

26

D. Azriel / Journal of Statistical Planning and Inference 159 (2015) 15–27

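Taking the large deviations rate according to Lemma 6.2 and minimizing over $s, t$ yields
\[
\begin{aligned}
\limsup_{n\to\infty} \frac{1}{n} \log P\big( \{\hat\mu_A^E(n) \ge k_j - \varepsilon_2 - \varepsilon_3\} \cap \{\hat\mu_B^E(n) \le k_j\} \big) &\le R\{f_A(k_j - \varepsilon_2 - \varepsilon_3), f_B(k_j)\}, \\
\limsup_{n\to\infty} \frac{1}{n} \log P\{\hat\mu_A^E(n) \ge \mu_B\} &\le R\{f_A(\mu_B), 1\}, \\
\limsup_{n\to\infty} \frac{1}{n} \log P\{\hat\mu_B^E(n) \le \mu_A\} &\le R\{1, f_B(\mu_A)\},
\end{aligned}
\]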

where $R(\cdot, \cdot)$ is defined in (8). For $K \le \mu_A$ ($K \ge \mu_B$), we have $f_A(K) = 1$ ($f_B(K) = 1$); also $f_A(K)$ ($f_B(K)$) is decreasing (increasing) in $K$. Therefore, the maximum of $R\{f_A(K), f_B(K)\}$ is obtained for $K \in [\mu_A, \mu_B]$. Since $R\{f_A(K), f_B(K)\}$ is uniformly continuous on $[\mu_A, \mu_B]$, then,
\[
\max\big[ R\{f_A(k_j - \varepsilon_2 - \varepsilon_3), f_B(k_j)\},\ R\{f_A(\mu_B), 1\},\ R\{1, f_B(\mu_A)\} \big] \le \max_{K \in [\mu_A,\mu_B]} R\{f_A(K), f_B(K)\} + h(\varepsilon_2 + \varepsilon_3),
\]

where $h$ is some function which is continuous at 0 and satisfies $h(0) = 0$. We conclude that
\[
\limsup_{n\to\infty} \frac{1}{n} \log P_1 \le \max_{K \in [\mu_A,\mu_B]} R\{f_A(K), f_B(K)\} = \max_{K \in [\mu_A,\mu_B]} R^E(K). \tag{11}
\]

Tracing back to (9) through (10) and (11), we have that
\[
\limsup_{n\to\infty} \frac{1}{n} \log[1 - P\{W^E(n) > C_r\}] \le \max\Big( \max_{K \in [\mu_A,\mu_B]} R^E(K),\ \min\big[\log\{(1-\alpha)/\alpha\},\ -\log(2)\big] \Big).
\]

Proof of Part (ii). The proof of the upper bound is similar to the proof of Part (i) and therefore omitted. For the lower bound we have for any $K \in [\mu_A, \mu_B]$ that
\[
P\big( \{\hat\mu_A^R(n) \ge K\} \cap \{\hat\mu_B^R(n) \le K\} \big) \le P\{\hat\mu_A^R(n) \ge \hat\mu_B^R(n)\} \le 1 - P\{W^R(n) > C_r\}.
\]
Lemma 7.1 in Azriel and Feigin (2014) implies that
\[
\lim_{n\to\infty} \frac{1}{n} \log P\big( \{\hat\mu_A^R(n) \ge K\} \cap \{\hat\mu_B^R(n) \le K\} \big) = R^R(K).
\]
Since this is true for any $K \in [\mu_A, \mu_B]$, the lower bound follows.

Proof of Part (iii). The proof of this part is similar to the previous parts and is omitted.



Proof of Theorem 2.2

Proof of Part (i). If $\max_{K \in [\mu_A,\mu_B]} R^R(K) > -\log(2)$, then Theorem 2.1, Part (ii), implies that for the randomized design
\[
\lim_{n\to\infty} \frac{1}{n} \log[1 - P\{W^R(n) > C_r\}] = \max_{K \in [\mu_A,\mu_B]} R^R(K).
\]

For Efron’s design, we have by Theorem 2.1, Part (i), that
\[
\begin{aligned}
\limsup_{n\to\infty} \frac{1}{n} \log[1 - P\{W^E(n) > C_r\}]
&\le \max\Big( \max_{K \in [\mu_A,\mu_B]} R^E(K),\ \min\big[\log\{(1-\alpha)/\alpha\},\ -\log(2)\big] \Big) \\
&\le \max\Big\{ \max_{K \in [\mu_A,\mu_B]} R^E(K),\ -\log(2) \Big\};
\end{aligned}
\]
since for any $K$, $R^E(K) \le R^R(K)$, (4) holds true.

Proof of Part (ii). As in Part (i),
\[
\limsup_{n\to\infty} \frac{1}{n} \log[1 - P\{W^E(n) > C_r\}] \le \max\Big\{ \max_{K \in [\mu_A,\mu_B]} R^E(K),\ -\log(2) \Big\}.
\]

If $\max_{K \in [\mu_A,\mu_B]} R^E(K) \le -\log(2)$ then
\[
\limsup_{n\to\infty} \frac{1}{n} \log[1 - P\{W^E(n) > C_r\}] \le -\log(2) < \max_{K \in [\mu_A,\mu_B]} R^R(K).
\]


Otherwise, $\max_{K \in [\mu_A,\mu_B]} R^E(K) > -\log(2)$ and (3) implies that
\[
\limsup_{n\to\infty} \frac{1}{n} \log[1 - P\{W^E(n) > C_r\}] \le \max_{K \in [\mu_A,\mu_B]} R^E(K) < \max_{K \in [\mu_A,\mu_B]} R^R(K).
\]
Thus, in both cases the inequality in (4) is strict.
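The comparison between the two rates can be explored numerically for normal responses with unequal variances. The sketch below rests on several assumptions that are not stated in this section and are labeled as such in the code: $f_A(K)$ and $f_B(K)$ are taken to be the optimized Chernoff factors for a normal response, $R^R(K) = \log\{f_A(K) + f_B(K)\} - \log(2)$, $R^D(K) = \tfrac12 \log\{f_A(K) f_B(K)\}$, and $R^E(K) = R^D(K) + g(K) I\{g(K) \ge 0\}$ with $g(K) = \log\{C(K)(1-\alpha)/\alpha\}$. The parameter values are hypothetical.

```python
import math

def f_a(k, mu, sigma):
    # Assumed: inf over t > 0 of M_A(t) exp(-t k) for N(mu, sigma^2); equals 1 when k <= mu.
    return math.exp(-max(k - mu, 0.0) ** 2 / (2.0 * sigma**2))

def f_b(k, mu, sigma):
    # Assumed: inf over s > 0 of M_B(-s) exp(s k) for N(mu, sigma^2); equals 1 when k >= mu.
    return math.exp(-max(mu - k, 0.0) ** 2 / (2.0 * sigma**2))

def rates(k, mu_a, sigma_a, mu_b, sigma_b, alpha):
    """Return (R^E(k), R^R(k)) under the assumed forms described in the lead-in."""
    fa, fb = f_a(k, mu_a, sigma_a), f_b(k, mu_b, sigma_b)
    r_r = math.log(fa + fb) - math.log(2.0)      # randomized design
    r_d = 0.5 * math.log(fa * fb)                # assumed form of R^D
    c = math.sqrt(max(fa / fb, fb / fa))         # C(K) >= 1
    g = math.log(c * (1.0 - alpha) / alpha)
    r_e = r_d + max(g, 0.0)                      # g(K) * I{g(K) >= 0}
    return r_e, r_r

def compare_on_grid(mu_a=0.0, sigma_a=1.0, mu_b=1.0, sigma_b=2.0, alpha=2.0 / 3.0, steps=1000):
    # Evaluate both rates on an evenly spaced grid of K in [mu_a, mu_b].
    grid = [mu_a + (mu_b - mu_a) * i / steps for i in range(steps + 1)]
    return [rates(k, mu_a, sigma_a, mu_b, sigma_b, alpha) for k in grid]

if __name__ == "__main__":
    vals = compare_on_grid()
    print(max(re for re, _ in vals), max(rr for _, rr in vals), -math.log(2.0))
```

A smaller (more negative) rate means the probability of failing to reject decays faster, so in this hypothetical example the biased coin design has the better exponent whenever the first printed value is below the second.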



Proof of Proposition 2.1

(i) If $f_B(\mu_A) > 0$, then
\[
\begin{aligned}
\max_{K \in [\mu_A,\mu_B]} R^R(K) &= \max_{K \in [\mu_A,\mu_B]} \log\{f_A(K) + f_B(K)\} - \log(2) \\
&\ge \log\{f_A(\mu_A) + f_B(\mu_A)\} - \log(2) > \log\{f_A(\mu_A)\} - \log(2) = -\log(2),
\end{aligned}
\]
where the last equality holds true because $f_A(\mu_A) = 1$; a similar proof holds if $f_A(\mu_B) > 0$.

(ii) If $\alpha \ge 2/3$, then for
\[
C(K) := \max\left\{ \frac{f_A(K)}{f_B(K)},\ \frac{f_B(K)}{f_A(K)} \right\}^{1/2} \ge 1,
\]
we have,
\[
g(K) = \log\{C(K)(1-\alpha)/\alpha\} \le \log\{C(K)/2\} < \log\left\{ \frac{C(K)^2 + 1}{2C(K)} \right\} = R^R(K) - R^D(K),
\]

and therefore $R^D(K) + g(K) < R^R(K)$. Therefore, $R^E(K) = R^D(K) + g(K) I\{g(K) \ge 0\} \le R^R(K)$, and $\max_{K \in [\mu_A,\mu_B]} R^E(K) = \max_{K \in [\mu_A,\mu_B]} R^R(K)$ only if $K^*$, the maximizer of $R^E(K)$, satisfies $R^E(K^*) = R^D(K^*) = R^R(K^*)$, which happens only when $f_A(K^*) = f_B(K^*)$.

Acknowledgment

I thank an anonymous reviewer for a close reading of the paper and for providing helpful comments.

References

Azriel, D., Feigin, P.D., 2014. Adaptive designs to maximize power in clinical trials with multiple treatments. Sequential Anal. 33, 60–86.
Azriel, D., Mandel, M., Rinott, Y., 2012. Optimal allocation to maximize power of two-sample tests for binary response. Biometrika 99, 101–113.
Baldi Antognini, A., 2008. A theoretical analysis of the power of biased coin designs. J. Statist. Plann. Inference 138, 1792–1798.
Baldi Antognini, A., Giovagnoli, A., 2004. A new ‘biased coin design’ for the sequential allocation of two treatments. J. R. Stat. Soc. Ser. C. Appl. Stat. 53, 651–664.
Baldi Antognini, A., Zagoraiou, M., 2011. The covariate-adaptive biased coin design for balancing clinical trials in the presence of prognostic factors. Biometrika 98, 519–535.
Begg, C.B., Kalish, L.A., 1984. Treatment allocation for nonlinear models in clinical trials: the logistic model. Biometrics 40, 409–420.
Chen, Y.P., 2006. The power of Efron’s biased coin design. J. Statist. Plann. Inference 136, 1824–1835.
Efron, B., 1971. Forcing a sequential experiment to be balanced. Biometrika 58, 403–417.
Hu, F., Rosenberger, W.F., 2006. The Theory of Response-Adaptive Randomization in Clinical Trials. Wiley, New York.
Kalish, L.A., Harrington, D.P., 1988. Efficiency of balanced treatment allocation for survival analysis. Biometrics 44, 815–821.
Kapelner, A., Krieger, A., 2014. Matching on-the-fly: Sequential allocation with higher power and efficiency. Biometrics 70, 378–388.
Markaryan, T., Rosenberger, W.F., 2010. Exact properties of Efron’s biased coin randomization procedure. Ann. Statist. 38, 1546–1567.
Silvey, S.D., 1980. Optimal Designs. Chapman & Hall, London.
Wei, L.J., 1978. The adaptive biased coin design for sequential experiments. Ann. Statist. 6, 92–100.
Weir, C.J., Lees, K.R., 2003. Comparison of stratification and adaptive methods for treatment allocation in an acute stroke clinical trial. Stat. Med. 22, 705–726.