Properties of Permuted-Block Randomization in Clinical Trials
John P. Matts, PhD, and John M. Lachin, ScD
Department of Surgery, University of Minnesota, Minneapolis, Minnesota (J.P.M.) and The George Washington University, Department of Statistics/Computer and Information Systems, The Biostatistics Center, Rockville, Maryland (J.M.L.)
ABSTRACT: This article describes some of the important statistical properties of the commonly used permuted-block design, also known simply as blocked-randomization. Under a permutation model for statistical tests, proper analyses should employ tests that incorporate the blocking used in the randomization. These include the block-stratified Mantel-Haenszel chi-square test for binary data, the blocked analysis of variance F test, and the blocked nonparametric linear rank test. It is common, however, to ignore the blocking in the analysis. For these tests, it is shown that the size of a test obtained from an analysis incorporating the blocking (say T), versus an analysis ignoring the blocking (say $T_I$), is related to the intrablock correlation coefficient (R) as $T_I = T(1 - R)$. For blocks of common length 2m, the range of R is from -1/(2m - 1) to 1. Thus, if there is a positive intrablock correlation, which is more likely than not for m > 1, an analysis ignoring blocking will be unduly conservative. Permutation tests are also presented for the case of stratified analyses within one or more subgroups of patients defined post hoc on the basis of a covariate. This provides a basis for the analysis when responses from some patients are assumed to be missing-at-random. An alternative strategy that requires no assumptions is to perform the analysis using only the subset of complete blocks in which no observations are missing. The Blackwell-Hodges model is used to assess the potential for selection bias induced by investigator attempts to guess which treatment is more likely to be assigned to each incoming patient. In an unmasked trial, the permuted-block design provides substantial potential for selection bias in the comparison of treatments due to the predictability of the assignments that is induced by the requirement of balance within blocks. Further, this bias is not eliminated by the use of random block sizes. We also modify the Blackwell-Hodges model to allow for selection bias only when the investigator is able to discern the next assignment with certainty. This type of bias is reduced by the use of random block sizes and is eliminated only if the possible block sizes are unknown to the investigators. Finally, the Efron model for accidental bias is used to assess the potential for bias in the estimation of treatment effects due to covariate imbalances. For the permuted-block design, the variance of this bias approaches that of complete randomization as the half-block length $m \to \infty$. Therefore, for fixed m, as $n \to \infty$, this bias does not vanish as rapidly as it does for other randomization designs.
Address reprint requests to: John P. Matts, PhD, Department of Surgery, University of Minnesota, 2829 University Avenue SE, Room 408, Minneapolis, MN 55414
Received April 29, 1987; revised June 13, 1988.
Controlled Clinical Trials 9:327-344 (1988)
© Elsevier Science Publishing Co., Inc. 1988, 655 Avenue of the Americas, New York, New York 10010
0197-2456/1988/$3.50
KEY WORDS: Randomization, permuted-block design, permutation tests, stratified analysis, missing data, blocked-stratified analysis, Mantel-Haenszel test, intrablock correlation, selection bias, accidental bias
INTRODUCTION

Elsewhere in this volume, Lachin [1] reviews some of the important properties of simple (unrestricted) and restricted randomization procedures. Among the group of restricted randomization procedures, the permuted-block randomization [2], also known simply as blocked-randomization, is probably the most frequently employed procedure (design) for randomization of patients to treatments in a clinical trial [3]. This design overcomes the major disadvantages of simple randomization, either complete binomial randomization (e.g., coin toss) or a random allocation rule (random selection of $n_a$ out of n) [4].

With simple randomization it is possible that a chance run of treatment assignments to one of the treatment groups could occur. At various times in the recruitment process, there also may be undesirable differences in the numbers of patients assigned to each treatment. If the baseline (entry) characteristics of the patients change with time (for example, the initial patients may be healthier than the later patients), such chance runs or periodic imbalances could result in differences between treatment groups in the distribution of patient characteristics. Although the potential for bias with complete randomization vanishes in a large trial, these are matters of concern, especially in a small trial. The permuted-block design overcomes these disadvantages of simple randomization by forcing periodic balance in the numbers of patients assigned to each treatment group.

Blocking, however, has other advantages and disadvantages. In experimental design, the principle of blocking is employed to increase the power of a treatment comparison by dividing the experimental units into homogeneous strata (blocks) and then pooling the treatment group differences over blocks.
Thus, in a clinical trial, if patient characteristics change with time, a permuted-block design should provide a more powerful comparison when the resulting analysis incorporates blocking than when the analysis ignores blocking. However, the effects of ignoring blocks in the analysis have not been fully explored.

On the other hand, the permuted-block design has the disadvantage that the sequences are to some degree predictable due to the imposition of periodic balance at the end of each successive block. This leads to a potential for bias [1]. If an unmasked randomization is used, that is, the assignments are unmasked as they occur, then the number of assignments that can be correctly guessed exceeds that allowed by simple randomization designs. This provides the investigators with the opportunity to alter the composition of the treatment groups (selection bias) [1]. Also, the imposition of periodic balance alone may affect the chances for covariate imbalances, apart from the effects of investigator selection bias. This effect is termed accidental bias [1].

In this article, these properties of the two-treatment nonstratified permuted-block design are reviewed.
Table 1  Example of Permuted Block Design with m = 2

Patient   Treatment     Patient   Treatment
   1          b            9          a
   2          b           10          b
   3          a           11          a
   4          a           12          b
   5          a           13          b
   6          b           14          b
   7          b           15          a
   8          a           16          a
THE PERMUTED-BLOCK DESIGN
The permuted-block design involves randomizing patients to treatment groups in sequential blocks. In the simplest case of constant block size, there are two treatments (say a and b) and m patients per treatment within each block of size 2m. Furthermore, there are B blocks with a total sample size n = 2mB provided that all blocks are filled, in which case the total numbers assigned to each treatment are equal, $n_a = n_b = Bm$. Within each block, a permutation of size 2m is randomly selected such that m patients are assigned to each treatment. The first block is used for the first set of 2m patients and so on up to block B for the last set of 2m patients. For example, in a design where the block size is four, there are six possible ways to make treatment assignments for a block: aabb, bbaa, abab, baba, abba, and baab. A small-scale example is given in Table 1.

The permuted-block design may also employ different sized blocks. For example, more frequent balancing at the early stages of a trial may be desired. Thus, one might employ a design with increasing block sizes, for example, 2, 4, 6, 8, and then equal blocks of size 16. Another variation is to randomly choose the block sizes, for example, randomly choose between blocks of size 4 and 6.

In a nonstratified design with blocks of equal size, an equal number of patients in each treatment group is guaranteed if n is a multiple of the block size. If the block sizes are unequal, balance is guaranteed if the final block is filled, namely all assignments are made. The maximum imbalance at any given time is one half the current block size. The probability of an imbalance at any stage within a block is the same as for a random allocation rule with a length equal to the block length as described in ref. 4. For a stratified design, Hallstrom and Davis [5] evaluate the probability of an aggregate imbalance (combined over blocks) in a stratified trial where the last block within each stratum may be incomplete.
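The construction of the assignment sequence can be sketched in Python. This is a minimal illustration under stated assumptions, not the procedure used in any particular trial: the function name `permuted_block_sequence`, its parameters, and the use of Python's `random` module are introduced here for illustration only. Each block is a random permutation of m a's and m b's, and the half-block length may differ from block to block.

```python
import random
from typing import List, Optional, Sequence


def permuted_block_sequence(half_sizes: Sequence[int],
                            rng: Optional[random.Random] = None) -> List[str]:
    """Return treatment assignments ('a' or 'b') for successive permuted blocks.

    half_sizes[i] is m_i, the number of patients per treatment in block i,
    so block i has length 2 * m_i. (Hypothetical helper, for illustration.)
    """
    if rng is None:
        rng = random.Random()
    sequence: List[str] = []
    for m in half_sizes:
        block = ["a"] * m + ["b"] * m   # m assignments to each treatment
        rng.shuffle(block)              # random permutation within the block
        sequence.extend(block)
    return sequence


# Four blocks of size four (m = 2), the layout of Table 1; the seed is arbitrary.
assignments = permuted_block_sequence([2, 2, 2, 2], random.Random(1988))
```

Because every filled block contains exactly m assignments to each treatment, the running imbalance in such a sequence never exceeds one half of the current block size.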
ANALYSIS AND INFERENCE
Lachin [1] draws the distinction between a homogeneous and a heterogeneous population model. In the latter, the patient responses vary as a function of the patient characteristics. One of the most common instances is a time-heterogeneous population where the patient characteristics and responses change with time of entry into the trial. The permuted-block design
will control for the effects of this time-heterogeneity by imposing balance in treatment assignments in each successive block. However, as a result of the time-heterogeneity, there will tend to be a positive intrablock correlation. Therefore, the proper analysis under a time-heterogeneous population model is one that takes the blocking into account. On the other hand, if a homogeneous population model applies, then as shown in ref. 1, the method of randomization is irrelevant to a consideration of the proper analysis. Here the blocks can be ignored because the expected intrablock correlation is zero. This suggests that the proper method of analysis under a population model depends on whether an intrablock correlation exists.

Lachin [1] also proposes that the preferred basis for analysis is a permutation model based on the randomization employed. In the case of a permuted-block design, each block is essentially a random allocation rule [4], for which the permutational distribution is approximately equivalent to that of the population model distribution for most test statistics. However, if the blocking is ignored, there is no permutational basis for the analysis. Further, the distribution of the resulting homogeneous population model test will not necessarily be equivalent to that under the precise permutation model.

Nevertheless, it has been argued that ignoring the blocks in the analysis is acceptable because it will only result in a conservative test [6], that is, the test ignoring blocks will be smaller in value than the proper blocked test. This is usually true, especially if only a few blocks are used and the block size is large relative to the total sample size. However, a conservative test needlessly sacrifices both efficiency and power. The important issue, therefore, is the extent of this conservatism, and the degree of loss of efficiency and power.
Further, there is a possibility, especially when the block size is small, that an analysis ignoring blocks may yield an anticonservative test, that is, one that is bigger in value than the proper blocked test, thus having inflated type 1 error. This is readily demonstrated with a block size of two, in which there is simple pairwise matching. For example, for paired normally distributed variates, the appropriate test is the paired t test. For observations with a negative pairwise (intrablock) correlation, the ordinary (unblocked) Student's t test is greater than the proper paired t test and thus is anticonservative.

In the remainder of this section we examine the effects of ignoring the blocking versus the proper blocked analysis for the equal-block-size permuted-block design. Using a permutational model, we consider the contingency chi-square test, the analysis of variance F test, and the family of nonparametric linear rank tests. In each case we demonstrate that the effect of ignoring blocking is represented by the magnitude of the intrablock correlation coefficient. Details for each case are presented in the Appendix.
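The block-size-two case can be illustrated numerically. The sketch below is only an illustration: the function names and the small data set are hypothetical, and the statistics are computed directly from their textbook formulas with Python's standard library. The responses in the two arms move in opposite directions across blocks, so the intrablock correlation is negative.

```python
import math
from statistics import mean, variance


def unpaired_t(x, y):
    """Student's t statistic with pooled variance (ignores the pairing)."""
    n = len(x)                                   # equal group sizes assumed
    pooled = (variance(x) + variance(y)) / 2
    return (mean(x) - mean(y)) / math.sqrt(pooled * 2 / n)


def paired_t(x, y):
    """Paired t statistic, the blocked analysis for blocks of size two."""
    d = [xi - yi for xi, yi in zip(x, y)]
    return mean(d) / math.sqrt(variance(d) / len(d))


# Hypothetical responses: block i holds one patient on a and one on b.
a = [1, 2, 3, 4, 5]
b = [6, 5, 4, 3, 2]
t_unblocked = unpaired_t(a, b)
t_blocked = paired_t(a, b)
```

With these made-up data the unblocked statistic is larger in absolute value than the paired statistic, which is the anticonservative behavior described above; with positively correlated pairs the ordering reverses.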
Contingency Chi-Square Test

In the case of a binary outcome such as mortality, the data consist of a set of independent Bernoulli variables with possibly different expectations (probabilities) within each treatment group and within each block. The data for each block can be formed into a 2 × 2 table, as shown in Table 2.

Table 2  The 2 × 2 Table Constructed from the Binary Variable for Mortality Within the ith Block

               Dead          Alive         Total
Treatment a    a_i           b_i           m
Treatment b    c_i           d_i           m
Total          a_i + c_i     b_i + d_i     2m

Within each block, the treatment assignments are determined by a random permutation for which the total sample sizes are fixed. If the total number of events is also treated as fixed, then within each block under the null hypothesis $H_0$, the permutational distribution of the number of events in the first group ($a_i$) is the hypergeometric distribution with the following expectation and variance [7,8]:

$$E(a_i) = \frac{a_i + c_i}{2}, \qquad V(a_i) = \frac{(a_i + c_i)(b_i + d_i)}{4(2m - 1)}. \tag{1}$$
Therefore, the proper randomization-based test statistic is the "blocked" or stratified Mantel-Haenszel [7] chi-square statistic

$$\chi^2 = \frac{\left[\sum_{i=1}^{B} \{a_i - E(a_i)\}\right]^2}{\sum_{i=1}^{B} V(a_i)}, \tag{2}$$
which combines the data from all the blocks. Therefore, under the permuted-block randomization, the large-sample (asymptotic) permutational distribution of the Mantel-Haenszel statistic under $H_0$ is chi-square with one degree of freedom (df).

If one chooses to ignore the blocks, the data are formed into a single 2 × 2 table. Here the Pearson contingency chi-square statistic is usually employed. This test is based on a population model using the unconditional binomial variance of the total number of events in the first group ($a_{\cdot}$), and is expressed as
$$X_I^2 = \frac{n\,(a_{\cdot} d_{\cdot} - b_{\cdot} c_{\cdot})^2}{(a_{\cdot} + c_{\cdot})(b_{\cdot} + d_{\cdot})(Bm)^2}, \tag{3}$$
where the dot represents the summation over the subscript (e.g., $a_{\cdot} = \sum_i a_i$). Under a population model, this statistic is asymptotically distributed under $H_0$ as chi-square on 1 df. However, if the random allocation design (i.e., only one block) had been used, the randomization-based test using the hypergeometric variance [eq. (1)] can readily be shown to be

$$\chi^2 = \frac{2Bm - 1}{2Bm}\, X_I^2. \tag{4}$$

Therefore, asymptotically $X_I^2 \to \chi^2$. However, under permuted-block randomization, the distribution of these test statistics is not necessarily the usual chi-square.
The relationship between the blocked and unblocked statistics can be written as follows:

$$1 - \frac{X_I^2}{\chi^2} = 1 - \frac{2mB \sum_i (a_i + c_i)(b_i + d_i)}{(a_{\cdot} + c_{\cdot})(b_{\cdot} + d_{\cdot})(2m - 1)}. \tag{5}$$
This expression involves only the block marginal totals. Thus, the value of this expression is a constant regardless of the actual sequence of treatment assignments employed within each block. In the Appendix we show that eq. (5) is related to the intrablock correlation coefficient [9]. Specifically, let MSB be the block mean square and let MSW be the within-block mean square from an analysis of variance with just these two sources of variability. Then

$$1 - \frac{X_I^2}{\chi^2} = \frac{\mathrm{MSB} - \dfrac{B}{B - 1}\,\mathrm{MSW}}{\mathrm{MSB} + \dfrac{B}{B - 1}\,(2m - 1)\,\mathrm{MSW}}. \tag{6}$$

For large B, eq. (6) is the intrablock correlation coefficient [9],

$$R = \frac{\mathrm{MSB} - \mathrm{MSW}}{\mathrm{MSB} + (2m - 1)\,\mathrm{MSW}}. \tag{7}$$

Thus, under the permuted-block randomization, the properties of the chi-square statistic that ignores the blocking can be assessed via R. If R is zero, the two statistics are identical, that is, the blocked permutational distribution of the statistic that ignores the blocking ($X_I^2$) asymptotically is chi-square on 1 df under the permuted-block randomization. If R is negative, the usual chi-square test ignoring the blocks will be larger, and, thus, under the permuted-block randomization, it will be anticonservative. If R is positive, the test will be conservative.

The range of possible values for R is -1/(2m - 1) to 1. With a block size of two, the lower bound for R is -1. With a block size of four, the lower bound is -0.33. As the block size increases further, the lower bound for R approaches zero. Thus, as the block size increases, it is increasingly likely that an analysis ignoring the blocks will result in either a similar or a conservative test compared to the proper permutation test.
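The relationship between the blocked and unblocked chi-square statistics can be checked numerically. The sketch below is an illustration under stated assumptions: the function name `blocking_effect` and the death counts are hypothetical, the blocked statistic follows eqs. (1) and (2), the unblocked statistic is the Pearson statistic of eq. (3), and the mean squares are those of a one-way analysis of variance of the 0/1 outcome on blocks.

```python
from typing import List, Tuple


def blocking_effect(blocks: List[Tuple[int, int]], m: int):
    """blocks[i] = (a_i, c_i): deaths on treatments a and b in block i.

    Each block has m patients per treatment. Returns the blocked
    Mantel-Haenszel statistic, the Pearson statistic ignoring blocks,
    and the mean-square ratio on the right-hand side of eq. (6).
    """
    B = len(blocks)
    n = 2 * m * B
    a_tot = sum(a for a, _ in blocks)
    c_tot = sum(c for _, c in blocks)
    b_tot, d_tot = B * m - a_tot, B * m - c_tot
    deaths, alive = a_tot + c_tot, b_tot + d_tot

    # Blocked (Mantel-Haenszel) statistic with hypergeometric variances.
    num = sum(a - (a + c) / 2 for a, c in blocks) ** 2
    var = sum((a + c) * (2 * m - a - c) / (4 * (2 * m - 1)) for a, c in blocks)
    chi2_blocked = num / var

    # Pearson statistic from the single pooled 2 x 2 table.
    chi2_unblocked = (n * (a_tot * d_tot - b_tot * c_tot) ** 2
                      / (deaths * alive * (B * m) ** 2))

    # One-way ANOVA of the binary outcome: blocks versus within blocks.
    ss_within = sum((a + c) * (2 * m - a - c) / (2 * m) for a, c in blocks)
    ss_total = deaths * alive / n
    msw = ss_within / (B * (2 * m - 1))
    msb = (ss_total - ss_within) / (B - 1)
    k = B / (B - 1)
    ratio = (msb - k * msw) / (msb + k * (2 * m - 1) * msw)
    return chi2_blocked, chi2_unblocked, ratio


# Hypothetical trial: five blocks of size four (m = 2).
chi2_b, chi2_u, rhs = blocking_effect([(2, 1), (1, 0), (2, 2), (0, 1), (1, 1)], m=2)
```

For any such data set, 1 - chi2_u / chi2_b agrees with the returned mean-square ratio up to floating-point error, and the ratio approaches R of eq. (7) as the number of blocks grows.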
Analysis of Variance

In the case of continuous data, the usual analysis is to assess differences between group means with a t test or an F test. The F statistic from a blocked or unblocked analysis of variance is well known to follow an F distribution under the usual normal theory (population model) assumptions. However, this statistic is also approximately distributed as F under a permutation model.

The analysis of variance for the permuted-block design which accounts for the blocking is given in Table 3A, where $y_{ijk}$ is the response of patient j in block i to treatment k. Each mean square is the sum of squares divided by its degrees of freedom. The F statistic of interest here is mean square treatment divided by the residual mean square (MSE).

Table 3  Analysis of Variance Degrees of Freedom and Sums of Squares for a Blocked (A) and Unblocked (B) Analysis

A. Blocked Analysis of Variance

Source       df             Sum of Squares
Block        B - 1          $2m \sum_{i} \bar{y}_{i\cdot\cdot}^2 - 2Bm\,\bar{y}_{\cdot\cdot\cdot}^2$
Treatment    1              $Bm \sum_{k=1}^{2} \bar{y}_{\cdot\cdot k}^2 - 2Bm\,\bar{y}_{\cdot\cdot\cdot}^2$
Residual     2Bm - B - 1    by subtraction
Total        2Bm - 1        $\sum_{i=1}^{B} \sum_{j=1}^{m} \sum_{k=1}^{2} y_{ijk}^2 - 2Bm\,\bar{y}_{\cdot\cdot\cdot}^2$

B. Analysis of Variance Ignoring Blocking

Source       df             Sum of Squares
Treatment    1              $Bm \sum_{k=1}^{2} \bar{y}_{\cdot\cdot k}^2 - 2Bm\,\bar{y}_{\cdot\cdot\cdot}^2$
Residual     2Bm - 2        by subtraction
Total        2Bm - 1        $\sum_{i=1}^{B} \sum_{j=1}^{m} \sum_{k=1}^{2} y_{ijk}^2 - 2Bm\,\bar{y}_{\cdot\cdot\cdot}^2$

The permutational (randomization) distribution of this F statistic has been investigated analytically [10] and also by simulation [11]. Based on the results of Wilk [10], if the block sizes are equal and the variances within blocks are equal, then the permutational distribution of the F statistic has the same first two moments as the normal theory F distribution. Thus, under such conditions, the normal theory F distribution is an adequate approximation of the permutational distribution. The more general situation, including the presence of stratification, has been investigated by Matts and McHugh [11] using simulation. They also demonstrated that the normal theory F distribution is an adequate approximation of the permutational distribution of the F statistic.

If the blocking is ignored, the corresponding analysis of variance is given in Table 3B. The statistic of interest here, $F_I$, is the mean square treatment divided by the residual mean square, which in this case is not the same as that from the blocked analysis of variance (Table 3A). Under permuted-block randomization, the permutational distribution of this statistic is not necessarily well approximated by a corresponding normal theory F distribution. The relationship between the two F statistics can be expressed as follows:
$$1 - \frac{F_I}{F} = \frac{\mathrm{MSB} - \mathrm{MSE}}{\mathrm{MSB} + \mathrm{MSE}\,\dfrac{(2m - 1)B - 1}{B - 1}}. \tag{8}$$
The mean squares in eq. (8) are from the blocked analysis of variance (Table 3A), where MSB is the block mean square and MSE is the residual mean square. If B is relatively large, this expression reduces to R', the "treatment-adjusted" [9] intrablock correlation coefficient:

$$R' = \frac{\mathrm{MSB} - \mathrm{MSE}}{\mathrm{MSB} + \mathrm{MSE}\,(2m - 1)}. \tag{9}$$
This expression is not, however, constant for all possible permutations. Under the null hypothesis, different permutations of the treatment assignments within blocks will yield different values for MSE and thus for R'. If the F statistics are converted to beta statistics ($\beta$ for the blocked analysis and $\beta_I$ for the analysis ignoring blocking), the following can be derived:

$$1 - \frac{2m}{2m - 1}\,\frac{\beta_I}{\beta} = \frac{\mathrm{MSB} - \dfrac{B}{B - 1}\,\mathrm{MSW}}{\mathrm{MSB} + \dfrac{B}{B - 1}\,(2m - 1)\,\mathrm{MSW}}, \tag{10}$$
where MSW is the mean square within blocks. For large B, this is equal to the simple intrablock correlation coefficient [eq. (7)], which was derived for the ratio of the chi-square statistics. Furthermore, this expression is constant for all permutations.

If stratification is employed, it can likewise be shown that R is related to the relative magnitude of the test statistic that accounts for both the stratification and the blocking to that of the test statistic that ignores both the stratification and the blocking. To relate the test statistic that accounts for both the stratification and the blocking to the test statistic that accounts for the stratification but ignores the blocking, a modification of R is necessary such that MSB is replaced by the mean square blocks within strata, and the multiplier of MSW, B/(B - 1), is replaced by B/(B - K), where K is the number of strata.

As before, the maximum value of R is one and the minimum is -1/(2m - 1). Thus, if the intrablock correlation in eq. (10) is zero, both analyses will yield the same result. The analysis ignoring the blocks will be conservative if R is positive, and anticonservative if R is negative. Matts and McHugh [11] give examples in which there are substantial differences in the significance levels for treatment effect in a blocked and unblocked analysis.

Linear Rank Tests

For data on any scale (quantitative, ordinal, nominal) and for survival data, an alternative distribution-free or nonparametric test is provided by the family of linear rank tests as described in ref. 1. For a permuted-block design, the blocked (stratified) linear rank test is $W = S/V^{1/2}$, where S is the rank statistic with variance V and W is of the form
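$$W = \frac{\sum_{i=1}^{B} \sum_{j=1}^{2m} w_i\,(c_{ij} - \bar{c}_{i\cdot})(\tau_{ij} - 1/2)}{\left[\dfrac{m}{2(2m - 1)} \sum_{i=1}^{B} \sum_{j=1}^{2m} w_i^2\,(c_{ij} - \bar{c}_{i\cdot})^2\right]^{1/2}}. \tag{11}$$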
Here $\tau_{ij}$ denotes treatment (1 if a, 0 if b), $c_{ij}$ is the rank score for patient j in block i, which is defined as some function of the responses $\{Y_{ij}\}$ of all the patients in block i as $c_{ij} = f(Y_{i1}, \ldots, Y_{i,2m})$, $\bar{c}_{i\cdot}$ is the mean of the scores in block i, and $w_i$ is the weight for block i. For Wilcoxon scores, Lehmann [12, p. 132] shows that optimal weights are $w_i$ = (2m + 1). Thus, with equal block lengths, the $w_i$'s cancel from the numerator and denominator in eq. (11). Based on the permuted-block randomization, W is asymptotically distributed as a standard normal under the null hypothesis.
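A blocked linear rank statistic of this form can be sketched as follows. The code is an illustration only: it assumes equal block sizes (so the weights cancel), uses within-block midranks as the scores, and the function names and example responses are hypothetical.

```python
import math
from typing import List, Tuple


def midranks(values: List[float]) -> List[float]:
    """Ranks 1..n within a block, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def blocked_rank_test(blocks: List[List[Tuple[float, str]]]) -> float:
    """Blocked linear rank statistic W for equal blocks of (response, 'a' or 'b')."""
    m = len(blocks[0]) // 2
    s = 0.0    # numerator: sum of (score - block mean) * (tau - 1/2)
    ss = 0.0   # sum of squared within-block score deviations
    for block in blocks:
        scores = midranks([y for y, _ in block])
        block_mean = sum(scores) / len(scores)
        for score, (_, treatment) in zip(scores, block):
            tau = 1.0 if treatment == "a" else 0.0
            s += (score - block_mean) * (tau - 0.5)
            ss += (score - block_mean) ** 2
    variance = m / (2 * (2 * m - 1)) * ss
    return s / math.sqrt(variance)


# Hypothetical data: four blocks of size four, treatment a always higher.
example_blocks = [[(3.0, "a"), (4.0, "a"), (1.0, "b"), (2.0, "b")]] * 4
w_example = blocked_rank_test(example_blocks)
```

Under the null hypothesis the returned value would be referred to the standard normal distribution, as stated above.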
If one ignores the blocking, the linear rank test statistic is

$$W_I = \frac{\sum_{i=1}^{B} \sum_{j=1}^{2m} (c_{ij} - \bar{c}_{\cdot\cdot})(\tau_{ij} - 1/2)}{\left[\dfrac{Bm}{2(2Bm - 1)} \sum_{i=1}^{B} \sum_{j=1}^{2m} (c_{ij} - \bar{c}_{\cdot\cdot})^2\right]^{1/2}}. \tag{12}$$
If the random allocation design had been used with total n = 2Bm, then this statistic would asymptotically be distributed as a standard normal [4]. However, under the permuted-block randomization, the distribution of eq. (12) is not necessarily the standard normal. Squaring eqs. (11) and (12) gives chi-square statistics: $\chi^2 = W^2$ and $X_I^2 = W_I^2$. Using these, the relationship between the two statistics can be expressed as follows:

$$1 - \frac{X_I^2}{\chi^2} = \frac{\mathrm{MSB} - \mathrm{MSW}}{\mathrm{MSB} + \dfrac{B}{B - 1}\,(2m - 1)\,\mathrm{MSW}}, \tag{13}$$
where MSB is the block mean square of the $c_{ij}$ and MSW is the within-block mean square of the $c_{ij}$. For large B, the above expression is the intrablock correlation coefficient, R, in eq. (7). As before, the interval of possible values for R is -1/(2m - 1) to 1.

Conclusions

For each of the statistics discussed above, whether the statistical test ignoring the blocking is conservative, anticonservative, or the same, when compared to the analysis incorporating the blocking, depends on whether the value of the intrablock correlation coefficient is positive, negative, or zero, respectively. For example, if the patients recruited early in a trial are healthier than those recruited later, a similar time trend for both groups is likely to be produced, resulting in a positive intrablock correlation. If the two treatment groups had time trends that were opposite, namely one increasing and the other decreasing, a negative intrablock correlation would be produced. This latter situation, however, does not seem likely. Thus, a positive correlation, if any, is likely in most trials.

Therefore, an analysis incorporating the blocking should be performed to obtain a test of the proper size. This will provide maximum power to detect differences between groups. If blocking is ignored, it is likely that the test will be conservative. For such cases, the p value is too big and fewer significant results will be obtained. Thus, an unblocked analysis is likely to sacrifice power.

POST HOC STRATIFIED (SUBGROUP) ANALYSES

Another common type of analysis in a clinical trial is to perform a subgroup analysis (treatment comparison) within a post hoc-defined stratum based on a pretreatment (baseline) characteristic, such as an analysis only among males. Such analyses have been justified on a permutational basis for complete randomization and the random allocation rule [1,4] under the assumption that the covariate values are mutually statistically independent of the treatment assignments. Likewise, under this assumption a permutation test can be performed with a permuted-block design using only the responses from patients within each block who are members of the designated subgroup.

For binary observations, a block-stratified Mantel-Haenszel subgroup analysis will provide a test equivalent to the permutation test. For quantitative observations, the blocked analysis of variance using responses only from members of the subgroup will provide an F test that is asymptotically equivalent to the permutation test.

For the nonparametric family of linear rank tests, the proper permutation test within a subgroup under the covariate-treatment independence assumption is a generalization of eq. (11). In the following we use the developments presented in ref. 4 for the random allocation rule. Let $v_{ij}$ indicate whether patient j in block i is a member of the designated subgroup ($v_{ij} = 1$) or not ($v_{ij} = 0$). Within the ith block of size $n_i$ (= 2m for constant block sizes), let $c_{ij} = f(v_{i1}, Y_{i1}, \ldots, v_{in_i}, Y_{in_i})$ be some function (rank score) of the responses among members of the subgroup, where $c_{ij}$ is undefined if $v_{ij} = 0$. Conditional on the pattern of subgroup indicators within each block, the variance of the rank statistic for a subgroup within a block is the same as that for the random allocation rule (eq. 7 in ref. 4). Therefore, conditional on the pattern of subgroup indicators within the entire trial, the linear rank test becomes

$$W = \frac{\sum_{i=1}^{B} w_i \sum_{j=1}^{2m} v_{ij}\,(c_{ij} - \bar{c}_{i\cdot})(\tau_{ij} - q_{ia})}{\left[\sum_{i=1}^{B} w_i^2\,\dfrac{n'_{ia}(n'_i - n'_{ia})}{n'_i(n'_i - 1)} \sum_{j=1}^{2m} v_{ij}\,(c_{ij} - \bar{c}_{i\cdot})^2\right]^{1/2}}, \tag{14}$$
where $n'_{ia} = \sum_j v_{ij}\tau_{ij}$ and $n'_i = \sum_j v_{ij}$ are the numbers of subgroup members in group a and in total in block i, respectively, $q_{ia} = n'_{ia}/n'_i$ is the proportion of these assigned to treatment a, $\bar{c}_{i\cdot} = \sum_j v_{ij}c_{ij}/n'_i$ is the mean of the responses among subgroup members in the block, and $w_i$ is a weight associated with the block. It is reasonable to weight by $w_i = n'_i + 1$ as suggested by Lehmann [12] for Wilcoxon scores. Under $H_0$ and the assumption that the subgroup indicators are independent of the treatment assignments, W is asymptotically distributed as standard normal.

In the event that multiple mutually exclusive subgroups are defined on the basis of a covariate, then under the covariate-treatment independence assumption, the rank statistics for each subgroup are statistically independent [4]. Thus, a combined, covariate-adjusted analysis can be performed as though the randomization had been stratified by the covariate, as described in eq. (7) in ref. 1.
MISSING DATA
Analysis of Complete Blocks

As described in ref. 1, with any randomization or model the analysis is complicated by the presence of missing data. With a permuted-block randomization, however, this problem can easily be handled because it is statistically valid to exclude a block from the analysis due to operational deficiencies, such as missing data (unrelated to treatment effects), or incomplete recruitment (an unfilled block). Exclusion of such incomplete blocks will not affect the integrity of the remaining complete blocks, for which the resulting aggregate test statistic (with blocking) still has the usual permutational distribution. Thus, a valid permutational analysis can be performed using the subset of complete blocks without the need to invoke any additional assumptions.

When some observations are missing, however, fewer patients would enter into a complete-block permutational analysis than would enter into a complete-data permutational analysis, thus potentially resulting in a loss in efficiency. Further, such a complete-block permutational analysis strictly should only be interpreted to apply to the collection of patients in the complete blocks. In order to apply the results to the original collection of n patients randomized, it is necessary to invoke the assumption of missing-at-random observations [1]. Under this same assumption, however, a valid permutational analysis can be performed using the subset of patients with complete data.
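Selecting the complete blocks is a simple filtering step, sketched below. The data layout (a list of blocks, each a list of treatment and response pairs with `None` marking a missing response) and the function name `complete_blocks` are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# (treatment label, response or None if the observation is missing)
Patient = Tuple[str, Optional[float]]


def complete_blocks(blocks: List[List[Patient]],
                    block_size: int) -> List[List[Patient]]:
    """Keep only blocks that are filled and have no missing responses."""
    return [
        block for block in blocks
        if len(block) == block_size and all(y is not None for _, y in block)
    ]


# Hypothetical trial with block size four: the second block has a missing
# response and the third block was never filled.
trial = [
    [("a", 1.0), ("b", 2.0), ("b", 0.5), ("a", 1.5)],
    [("a", None), ("b", 2.0), ("a", 1.0), ("b", 3.0)],
    [("b", 1.0), ("a", 2.0)],
]
usable = complete_blocks(trial, block_size=4)
```

The retained blocks can then be passed unchanged to any of the blocked analyses, since each remains a balanced permutation of the two treatments.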
Analysis of the Complete-Data Subset

Under the assumption of missing-at-random observations, if the intrablock correlation (R) is approximately zero, the permuted-block design allows a simple alternative to only using the complete blocks. In this case, the proper (blocked) permutation test is approximately equivalent to the unblocked permutation or population model-based test. Therefore, if $R \approx 0$, it would be valid to perform the unblocked analysis using the subset of patients with complete data, as though a random allocation rule had been used for the randomization [4]. This would then provide an efficiency comparable to that of complete randomization.

On the other hand, if intrablock correlation exists ($R \neq 0$), then under the assumption of missing-at-random observations, a valid large sample permutation test can be performed by analyzing the subset of patients with complete data as a post hoc-defined subgroup, such as using the subgroup linear rank test [eq. (14)].

In any event, if the number of complete blocks is small, or the total number of patients with missing observations is large, the trial probably has a much more serious problem than trying to determine whether or not the blocking should be accounted for in the analysis or whether observations are missing at random. In this case, no analysis is likely to be scientifically convincing.

SELECTION BIAS

The Blackwell-Hodges Model

Based on the approach of Blackwell and Hodges [14] (also see [1]), a design is said to be subject to selection bias when the investigator is able to alter the expected difference between treatment group responses due to an inherent predictability of the treatment assignments. If the treatment assignments remain masked as they occur, then there is no potential for selection bias. However, if the treatment assignments are unmasked as they occur, then the permuted-block design is subject to selection bias. The optimal guessing strategy for any randomization is to guess the treatment with fewer prior allocations as the next assignment, or to systematically guess one of the treatments as the next assignment when both treatments have previously been allocated equally.

The potential for selection bias can be measured by the expected selection bias factor, E(F), which is the expected excess of correct guesses of treatment assignments beyond that expected by chance. Each block in a permuted-block design has a potential selection bias equal to that of a random allocation rule of the same size. Thus, from [4], the bias factor, E(F), for a permuted-block design with equal block size 2m is

$$E(F) = B\left[\frac{2^{2m-1}}{\binom{2m}{m}} - \frac{1}{2}\right] \approx B\left[\frac{\pi m}{4}\right]^{1/2} = B\,(0.886)\,m^{1/2}. \tag{15}$$

Table 4  Expected Selection Bias Factor from eq. (15) as a Function of the Block Size 2m, where m = Number of Subjects per Treatment per Block

  m     E(F)        m     E(F)
  1     0.50        6     1.716
  2     0.833       7     1.887
  3     1.10        8     2.046
  4     1.329       9     2.196
  5     1.532      10     2.338
In general, for a permuted-block design with possibly unequal block sizes (2m_i, i = 1, 2, ..., B), the overall expected selection bias factor is the sum of the individual block selection biases:

$$E(F) = \sum_{i=1}^{B}\left[\frac{2^{2m_i-1}}{\binom{2m_i}{m_i}} - \frac{1}{2}\right]. \tag{16}$$
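The per-block term in eqs. (15) and (16) can be checked by brute force. The sketch below is an illustration rather than part of the original derivation: it enumerates every balanced sequence in a single block of size 2m, applies the guessing strategy described above (guess the treatment with fewer prior allocations, and guess treatment A at ties), and compares the exact excess of correct guesses with the closed form. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def bias_by_enumeration(m):
    """Exact expected excess of correct guesses for one block of size 2m."""
    n = 2 * m
    total_correct = 0
    n_sequences = 0
    # Each balanced sequence is identified by the positions assigned to A.
    for a_positions in combinations(range(n), m):
        a_set = set(a_positions)
        count_a = count_b = 0
        for pos in range(n):
            is_a = pos in a_set
            # Guess the under-represented treatment; guess A when tied.
            guess_a = count_a <= count_b
            if guess_a == is_a:
                total_correct += 1
            if is_a:
                count_a += 1
            else:
                count_b += 1
        n_sequences += 1
    # Subtract the m correct guesses expected by chance in 2m assignments.
    return Fraction(total_correct, n_sequences) - m


def bias_by_formula(m):
    """Per-block term of eq. (15): 2^(2m-1) / C(2m, m) - 1/2."""
    return Fraction(2 ** (2 * m - 1), comb(2 * m, m)) - Fraction(1, 2)


if __name__ == "__main__":
    for m in range(1, 6):
        print(m, bias_by_enumeration(m), bias_by_formula(m))
```

Exact rational arithmetic is used so that the two quantities can be compared for equality rather than to within rounding error.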
Equations (15) and (16) give the expected number of correct guesses in excess of n/2. With complete binomial randomization only one half of the assignments, on the average, can be correctly guessed. In this case, the expected selection bias is zero. For the random allocation rule, B = 1, m = n/2, and the expected bias factor is given by eq. (15). For the permuted-block design, B > 1, the expected selection bias is always greater than that for a random allocation rule of the same total size. For example, if n = 100, then for a random allocation rule E(F) = 5.78, whereas for a permuted-block design with five blocks of size 20, E(F) = 11.69. In Table 4, the expected selection bias factor, E(F), is presented for various block sizes. For m = 1 or a block size of 2, on the average, 1.5 (75%) assignments can be correctly guessed, yielding E(F) = 0.50. For m = 10 or a block
size of 20, on the average 12.34 (61.7%) assignments can be correctly guessed, yielding E(F) = 2.34. From eq. (16), it follows that the use of random block sizes does not decrease or eliminate the potential for selection bias. A design employing multiple block sizes has a selection bias factor equal to B times the average of the bias factors over all blocks. This will approximately equal the selection bias associated with B blocks of average block size. For example, if random (equiprobable) block sizes of 6 and 10 were employed, the expected selection bias factor would be approximately the same as that of a common block size of 8.
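The numerical statements above can be reproduced directly from eqs. (15) and (16). The following is a minimal sketch, with hypothetical function names, that tabulates the per-block bias factor and evaluates the n = 100 comparison and the mixed block-size example.

```python
from math import comb


def block_bias(m):
    """Per-block expected bias factor for block size 2m (term of eq. 15)."""
    return 2 ** (2 * m - 1) / comb(2 * m, m) - 0.5


def design_bias(half_sizes):
    """Eq. (16): sum of per-block bias factors; half_sizes lists each m_i."""
    return sum(block_bias(m) for m in half_sizes)


if __name__ == "__main__":
    # Values of Table 4 for m = 1, ..., 10.
    for m in range(1, 11):
        print(m, round(block_bias(m), 3))

    # n = 100: one block of 100 (random allocation rule) vs. five blocks of 20.
    print(round(design_bias([50]), 2))      # 5.78
    print(round(design_bias([10] * 5), 2))  # 11.69

    # Equiprobable block sizes 6 and 10 vs. a common block size of 8,
    # compared per block.
    print((block_bias(3) + block_bias(5)) / 2, block_bias(4))
```

The last line compares the average per-block bias for block sizes 6 and 10 with the per-block bias for block size 8; the two differ by roughly 0.01.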
Predictions with Certainty
For the permuted-block design where the block length is known, it is also possible to assess the expected bias factor when the investigator only attempts to influence patient selection when the next assignment is known with certainty. For a block of length 2m, this occurs after there have been m assignments to one of the treatments within the block. Thus, the number of predictions with certainty can range from 1 to m within a block. The potential for this type of selection bias can be measured by the expected number of known assignments, E(F'). The expected number of known assignments within a block is 2m/(m + 1). Thus, for a permuted-block design with equal block size 2m, the bias factor is

$$E(F') = \frac{2mB}{m+1}. \tag{17}$$
(See the Appendix for details of this and cases that follow.) For a permuted-block design with unequal block sizes (2m_i, i = 1, 2, ..., B), the overall selection bias is the sum of the individual block biases:

$$E(F') = \sum_{i=1}^{B}\frac{2m_i}{m_i+1}. \tag{18}$$
If random block sizes are employed and the sequence of block sizes becomes unmasked, then the overall selection bias is the sum of the individual block biases as given previously. If the sequence of block sizes is masked but the sizes employed are unmasked, then assignments within a block will be known with certainty only when an imbalance equal to one half the largest block size occurs. When this occurs, all remaining assignments within the block are known. For the case where two block sizes, say 2m_1 and 2m_2 with 2m_1 the larger, are used with equal probability, the overall selection bias is

$$E(F') = \frac{n\,m_1}{(m_1+m_2)\binom{2m_1}{m_1}}. \tag{19}$$
The use of random block sizes decreases but does not eliminate this type of selection bias. For example, if n = 120, B = 20, and there are equal block sizes of 2m = 6, then E(F') = 30. In contrast, if random block sizes of four and six were used with n = 120, then E(F') = 3.6. As shown in the Appendix, when random block sizes are used, the selection bias factor depends partly on the proportion
of assignments made with the largest block size. Thus, decreasing the proportion of assignments made with the largest block size will decrease the expected selection bias factor. Therefore, if one views selection bias as present only when an assignment is known with certainty, as opposed to the original Blackwell-Hodges approach of just being more probable, then the use of random block sizes will provide considerable protection against this bias. Selection bias, however, is best protected against by proper masking of the blocking scheme, namely the sequences of blocks and the possible block sizes. In this case, for both random and fixed blocks, no assignment will be known with certainty and this type of selection bias is eliminated.
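The n = 120 comparison above follows directly from eqs. (17) and (19). The sketch below, with hypothetical function names, evaluates both cases; the second function assumes, as eq. (19) does, two equiprobable block sizes whose sequence is masked but whose values are known.

```python
from math import comb


def certainty_bias_fixed(n, m):
    """Eq. (17): E(F') for B = n / (2m) blocks of common size 2m."""
    n_blocks = n / (2 * m)
    return 2 * m * n_blocks / (m + 1)


def certainty_bias_two_sizes(n, m1, m2):
    """Eq. (19): E(F') for equiprobable block sizes 2*m1 > 2*m2."""
    if m1 <= m2:
        raise ValueError("m1 must correspond to the larger block size")
    return n * m1 / ((m1 + m2) * comb(2 * m1, m1))


if __name__ == "__main__":
    print(certainty_bias_fixed(120, 3))         # equal blocks of size 6
    print(certainty_bias_two_sizes(120, 3, 2))  # random blocks of size 4 and 6
```

With n = 120 these return 30 and 3.6 (up to floating-point rounding), matching the values quoted in the text.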
Block-Randomization
It should be noted, however, that the susceptibility to selection bias under either of these models arises because patients are randomized as they arrive. As was the case with the random allocation rule [4], the potential for selection bias is completely eliminated if patients are randomized as a block, rather than as they arrive for entry. In many trials this will be feasible, especially with small block sizes.
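Randomizing patients as a block can be sketched as follows. This is an illustrative outline only, not a procedure specified in the text: it assumes that all 2m patients of a block have already been enrolled, and the function name and patient identifiers are hypothetical. Because every assignment in the block is generated after enrollment is complete, no patient can be selected with knowledge of an upcoming assignment.

```python
import random


def randomize_enrolled_block(patient_ids, rng):
    """Assign a fully enrolled block of 2m patients to A and B at once."""
    if len(patient_ids) % 2 != 0:
        raise ValueError("a block must contain an even number of patients")
    m = len(patient_ids) // 2
    labels = ["A"] * m + ["B"] * m
    rng.shuffle(labels)  # uniform random permutation of the balanced labels
    return dict(zip(patient_ids, labels))


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed only so the example is repeatable
    block = ["P01", "P02", "P03", "P04", "P05", "P06"]
    print(randomize_enrolled_block(block, rng))
```

In practice the seed would not be fixed or disclosed; it is fixed here only to make the illustration reproducible.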
ACCIDENTAL BIAS
Accidental bias refers to the bias of the estimate (α̂*) of treatment effect (α) obtained from a regression model in which one (or more) important covariates are ignored (see ref. 1 and the references cited therein). This is important because the presence of accidental bias implies a covariate imbalance [1]. In the model for accidental bias first proposed by Efron [15], the vector of covariate values is treated as fixed and the distribution of the bias (α̂* − α) is then examined over the reference set of all possible permutations under the randomization design employed. For the random allocation rule (a single block of length 2m), it follows from the work of Efron [15] that var(α̂* − α) ≅ (β/n)²λ, where β is the regression coefficient for the covariate and λ = 1 + 1/(2m − 1) is the largest eigenvalue of the covariance matrix of the vector of treatment assignments [1]. For complete randomization, λ = 1, which is the absolute minimum possible. For the permuted-block design the variance of the bias is proportional to λ/n², where λ depends on the block size, not the number of blocks, and does not decrease as n increases. In the event that a sequence of block lengths 2m_i for i = 1, ..., B is employed, then (for all blocks filled) λ = 1 + 1/(2m* − 1), where m* = min(m_1, m_2, ..., m_B). Therefore, as m* → ∞, λ → 1 and the vulnerability of the permuted-block design to accidental bias approaches that of complete randomization with the same total sample size. As with the potential for selection bias, the use of random block lengths does not alleviate the vulnerability to accidental bias. Asymptotically, therefore, as n → ∞, the variance vanishes and the probability of this bias → 0. However, for fixed m (or m*), the probability of accidental bias (or covariate imbalance) for the permuted-block design is greater
than that of a random allocation rule or complete randomization with the same total sample size. For example, let the covariate be of moderate prognostic value with β = 1.0. Then for n = 100, complete randomization and the random allocation rule each yield the probability of slight bias (say 0.02) such that Pr(|α̂* − α| > 0.02) ≅ 0.0456 [4]. For a permuted-block randomization with m = 5 and B = 10 (n = 100), var(α̂* − α) ≅ (1/100)²[1 + (1/9)]. From eq. (14) in ref. 1, Z_0.02 = 1.897, and it follows that Pr(|α̂* − α| > 0.02) < 0.0574.
CONCLUSIONS
The permuted-block design is commonly used in randomized clinical trials. A major advantage of this design over simple randomization designs is that it periodically achieves balance between the number of patients in each treatment group. Furthermore, there is no treatment imbalance if the total sample size is a multiple of the block size.
Under a permutation model for statistical tests, the appropriate analysis is a blocked analysis such as the Mantel-Haenszel test, the blocked analysis of variance F test, or the blocked linear rank test. Under the null hypothesis, these blocked analyses will yield tests of proper size. If an unblocked analysis is employed, it is shown for these tests that the resulting test value can be expressed as (1 − R)T, where T is the test value from the blocked analysis and R is the intrablock correlation coefficient. Thus, if R is negative the unblocked test value will be anticonservative (too large) and if R is positive it will be conservative. The range of R is −1/(2m − 1) to 1. Thus, as the block size increases, the maximum possible degree of anticonservativeness decreases. However, the potential for extreme conservativeness remains.
When a permuted-block design has been used, the usual approach to the analysis of treatment outcomes, and also the assessment of a covariate imbalance, has been to ignore the blocking.
With regard to actual published study results, little is known about the values of the actual intrablock correlations that occurred. If they were in general close to zero, there was no harm in ignoring the blocking in the analyses. If, on the other hand, they were quite positive, the unblocked analyses needlessly sacrificed statistical power. Post hoc covariate-stratified permutation tests can also be performed under the assumption that the covariate values and treatment assignments are mutually independent. For mutually exclusive strata, such tests are statistically independent. When some observations are missing, incomplete blocks can be eliminated from the analysis without affecting the validity of an analysis restricted to the remaining complete blocks. This approach, however, sacrifices observed responses contained in the incomplete blocks. An alternative is to perform a blocked analysis containing only patients with observed responses. This analysis will be valid under the assumption that missing responses are missing-at-random. The principal disadvantage of the permuted-block design is an increased probability of bias in the assessment of treatment effects. These biases result from the increased predictability of the treatment assignments due to the
achievement of periodic balance. Under the Blackwell-Hodges model for selection bias in an unmasked trial, the potential for selection bias decreases as the block size increases, but it is still substantially greater for the permuted-block design than for simple randomization designs or an urn design. Furthermore, there is no advantage to using random block sizes. This model for selection bias can be modified to only allow for bias when assignments can be predicted with certainty. The bias is eliminated if the blocking scheme (e.g., possible block sizes) is masked. If the blocking scheme is not masked, the use of random block sizes reduces the potential for this bias. The only way to completely eliminate selection bias in an unmasked trial is to randomize patients as a block rather than individually as they arrive for entry into the trial. In many cases this will be feasible. Under the Efron model for accidental bias, the potential for accidental bias is related to the probability of a covariate imbalance. For the permuted-block design the variance of this bias decreases as the smallest block size employed increases. Thus the potential for accidental bias is greater for the permuted-block design than for simple randomization designs. Asymptotically, however, the potential for accidental bias vanishes for the permuted-block design as it does for other designs.

For Matts this work was partially supported by the Program on the Surgical Control of the Hyperlipidemias under grant HL-15265 from the National Heart, Lung, and Blood Institute, and for Lachin by the Study of ACE Inhibition in Diabetic Nephropathy under grant RO1-DI-39826 from the National Institute of Diabetes, Digestive and Kidney Diseases and a grant from E.R. Squibb and Sons, Inc.
REFERENCES
1. Lachin JM: Statistical properties of randomization in clinical trials. Controlled Clin Trials 9:289-311, 1988
2. Zelen M: The randomization and stratification of patients to clinical trials. J Chron Dis 27:365-375, 1974
3. Lagakos SW, Pocock SJ: Randomization and stratification in cancer clinical trials: An international survey. In Cancer Clinical Trials, Methods and Practice, Buyse ME, Staquet MJ, Sylvester RJ, Eds. New York: Oxford University Press, 1984, pp 276-286
4. Lachin JM: Properties of simple randomization in clinical trials. Controlled Clin Trials 9:312-326, 1988
5. Hallstrom A, Davis K: Imbalance in treatment assignments in a stratified randomization scheme. Controlled Clin Trials 9:375-382, 1988
6. Friedman LM, Furberg CD, DeMets DL: Fundamentals of Clinical Trials, 2nd ed. Littleton, MA: PSG Publishing Company, 1985
7. Mantel N, Haenszel W: Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst 22:719-748, 1959
8. Fleiss JL: Statistical Methods for Rates and Proportions, 2nd ed. New York: Wiley, 1981
9. Fleiss JL: The Design and Analysis of Clinical Experiments. New York: Wiley, 1986
10. Wilk MB: The randomization analysis of a generalized randomized block design. Biometrika 42:70-79, 1955
11. Matts JP, McHugh RB: Analysis of accrual randomized clinical trials with balanced groups in strata. J Chron Dis 31:725-740, 1978
12. Lehmann EL: Nonparametrics: Statistical Methods Based on Ranks. San Francisco: Holden-Day, 1975
13. Wei LJ, Lachin JM: Properties of the urn randomization in clinical trials. Controlled Clin Trials 9:345-364, 1988
14. Blackwell D, Hodges JL Jr: Design for the control of selection bias. Ann Math Statist 28:449-460, 1957
15. Efron B: Forcing a sequential experiment to be balanced. Biometrika 58:403-417, 1971
16. Feller W: An Introduction to Probability Theory and Its Applications, 2nd ed. New York: Wiley, 1957, Vol I, p 62
APPENDIX
Ignoring Blocks in the Analysis
Below we outline the derivations of the relationships of the intrablock correlation to the ratio of the value of the test statistic that ignores the blocking to that of the statistic that accounts for the blocking. For the contingency chi-square, note that b. = n/2 − a., d. = n/2 − c., b_i + d_i = 2m − a_i − c_i, and Bm = n/2. Substituting these into eqs. (2) and (3),

$$\frac{X_I^2}{X^2} = \frac{n\sum_{i=1}^{B}(a_i + c_i)(2m - a_i - c_i)}{(2m-1)(a_\cdot + c_\cdot)(n - a_\cdot - c_\cdot)}, \tag{A1}$$

where X_I² denotes the statistic that ignores the blocking and X² the statistic that accounts for it.
In order to see the relationship to the intrablock correlation it will be convenient to change notation at this point and let y_ij be the binary variable (0 or 1) that indicates the response of patient j in block i. Thus, (a_i + c_i) = y_i. and (a. + c.) = y.., where the "." stands for summation over the indicated index. Therefore, the ratio of the two statistics can be written as

$$\frac{n\sum_{i=1}^{B} y_{i\cdot}\,(2m - y_{i\cdot})}{(2m-1)\,y_{\cdot\cdot}\,(n - y_{\cdot\cdot})}. \tag{A2}$$
From an analysis of variance with mean square blocks (MSB) and mean square within blocks (MSW), the following expressions can be obtained:

$$(B-1)\,\mathrm{MSB} + (n-B)\,\mathrm{MSW} = \frac{y_{\cdot\cdot}\,(n - y_{\cdot\cdot})}{n} \tag{A3}$$

and

$$\mathrm{MSW} = \frac{1}{B(2m-1)}\sum_{i=1}^{B}\frac{y_{i\cdot}\,(2m - y_{i\cdot})}{2m}. \tag{A4}$$
Thus, one minus the ratio of the two statistics can be written as

$$1 - \frac{X_I^2}{X^2} = 1 - \frac{n\,\mathrm{MSW}}{(B-1)\,\mathrm{MSB} + (n-B)\,\mathrm{MSW}}. \tag{A5}$$

This then yields eq. (6) in the text.
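The identities (A3) and (A4), and the agreement between the ratio in (A2) and the ratio appearing in (A5), can be checked numerically for any set of binary responses arranged in blocks of equal size. The sketch below is a verification aid under that assumption; the function name and the example data are hypothetical and are not taken from the paper.

```python
def blocked_anova_check(y):
    """y is a list of B blocks, each a list of 2m binary (0/1) responses."""
    n_blocks = len(y)
    two_m = len(y[0])
    n = n_blocks * two_m
    block_totals = [sum(block) for block in y]
    grand_total = sum(block_totals)
    grand_mean = grand_total / n

    # Ordinary one-way ANOVA sums of squares with blocks as the factor.
    ss_blocks = sum(two_m * (t / two_m - grand_mean) ** 2 for t in block_totals)
    ss_within = sum(
        (value - t / two_m) ** 2
        for block, t in zip(y, block_totals)
        for value in block
    )
    msb = ss_blocks / (n_blocks - 1)
    msw = ss_within / (n - n_blocks)

    cross = sum(t * (two_m - t) for t in block_totals)
    return {
        "A3_lhs": (n_blocks - 1) * msb + (n - n_blocks) * msw,
        "A3_rhs": grand_total * (n - grand_total) / n,
        "A4_lhs": msw,
        "A4_rhs": cross / (2 * two_m * n_blocks * (two_m - 1)) * 2,
        "A2_ratio": n * cross / ((two_m - 1) * grand_total * (n - grand_total)),
        "A5_ratio": n * msw / ((n_blocks - 1) * msb + (n - n_blocks) * msw),
    }


if __name__ == "__main__":
    data = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 0]]  # B = 3 blocks, 2m = 4
    for name, value in blocked_anova_check(data).items():
        print(name, value)
```

For the example data both sides of (A3) equal 35/12, both sides of (A4) equal 0.25, and both ratios equal 108/105, so one minus the ratio is negative here, corresponding to a negative intrablock correlation.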
To derive the relationship for the F statistics given in eq. (8), the denominator of F_I is written as

$$\frac{(B-1)\,\mathrm{MSB} + (2Bm - B - 1)\,\mathrm{MSE}}{2Bm - 2}. \tag{A6}$$
The result follows directly. Using the equality that the sum of squares total equals the sum of squares blocks plus the sum of squares within blocks, eq. (10) is easily derived. To derive eq. (13), note that the numerators of eqs. (11) and (12) are equal since Σ_j (τ_ij − 1/2) = 0. Then, using the equality that the sum of squares total equals the sum of squares blocks plus the sum of squares within blocks, eq. (13) is obtained.
Selection Bias for Predictions with Certainty
Below we outline the derivations of the selection bias factor, E(F'), for fixed and random blocks when bias is introduced only when future assignments can be predicted with certainty. For i = 0, 1, 2, ..., m − 1, the probability that the last m − i assignments in a block are to treatment a is $\binom{m+i}{m}/\binom{2m}{m}$. Using this, the probability that the last m − i assignments are to treatment a but the last m − i + 1 assignments are not all to treatment a is

$$\frac{\binom{m+i}{m} - \binom{m+i-1}{m}}{\binom{2m}{m}}. \tag{A7}$$
Thus, the expected number of assignments known to be assigned to treatment a is

$$\sum_{i=0}^{m-1}\frac{\left[\binom{m+i}{m} - \binom{m+i-1}{m}\right](m-i)}{\binom{2m}{m}}. \tag{A8}$$
Simplifying and using eq. 12.8 from Feller [16], this reduces to m/(m + 1). The expected number of known assignments to treatment b is the same. Thus, with B blocks,

$$E(F') = \frac{2mB}{m+1}. \tag{A9}$$
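The reduction of (A8) to m/(m + 1), and hence the per-block value 2m/(m + 1) in (A9), can be confirmed for small m both by evaluating the sum and by enumerating all balanced sequences in a block. The sketch below is offered as a check rather than as part of the derivation, and the function names are hypothetical. An assignment is counted as known when one treatment has already received its full m allocations within the block.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def a8_sum(m):
    """Evaluate the sum in (A8); comb(m - 1, m) is 0 for the i = 0 term."""
    total = sum(
        (comb(m + i, m) - comb(m + i - 1, m)) * (m - i) for i in range(m)
    )
    return Fraction(total, comb(2 * m, m))


def known_assignments_by_enumeration(m):
    """Exact expected number of assignments known with certainty in one block."""
    n = 2 * m
    total_known = 0
    n_sequences = 0
    for a_positions in combinations(range(n), m):
        a_set = set(a_positions)
        count_a = count_b = 0
        for pos in range(n):
            # The next assignment is certain once either arm is full.
            if count_a == m or count_b == m:
                total_known += 1
            if pos in a_set:
                count_a += 1
            else:
                count_b += 1
        n_sequences += 1
    return Fraction(total_known, n_sequences)


if __name__ == "__main__":
    for m in range(1, 6):
        print(m, a8_sum(m), known_assignments_by_enumeration(m))
```

The enumeration counts known assignments to either treatment, so its result is twice the value of the sum in (A8).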
For the case of random block sizes, E(F') depends on the largest block size 2m_1. In a block of this size, the probability that the last m_1 assignments are made to the same treatment is $2/\binom{2m_1}{m_1}$, with the expected number of correct guesses for that block equal to $2m_1/\binom{2m_1}{m_1}$. Let P_1 be the probability of choosing a block size of 2m_1 and E(B) the expected number of blocks. Then

$$E(F') = E(B)\,P_1\,\frac{2m_1}{\binom{2m_1}{m_1}}. \tag{A10}$$
For the case of two block sizes which are chosen with equal probability, P_1 = 1/2 and E(B) = n/(m_1 + m_2):

$$E(F') = \frac{n\,m_1}{(m_1+m_2)\binom{2m_1}{m_1}}. \tag{A11}$$