J Chron Dis Vol. 36, No. 10, pp. 685-697, 1983
0021-9681/83 $3.00 + 0.00 Copyright © 1983 Pergamon Press Ltd
Printed in Great Britain. All rights reserved

THE RELATIVE EFFICIENCIES OF MATCHED AND INDEPENDENT SAMPLE DESIGNS FOR CASE-CONTROL STUDIES

DUNCAN C. THOMAS¹* and SANDER GREENLAND²

¹Department of Epidemiology and Health, McGill University, 3775 University Street, Montreal, Quebec, Canada H3A 2B4 and ²Division of Epidemiology, School of Public Health, University of California at Los Angeles, Los Angeles, California, U.S.A.

(Received in revised form 30 March 1983)

Abstract-We have studied the asymptotic and small sample efficiencies of dependent (pair-matched or stratified) and independent samples as design techniques for case-control studies, and of matched, stratified, covariance-adjusted, and crude comparisons as methods of analysis. The asymptotic efficiencies of dependent sample designs relative to independent sample designs with adjustment were found to vary with the strengths of the relationships of disease with exposure and potential confounder: as the relationship with exposure increases, dependent samples lose efficiency; as the relationship with confounder increases, dependent samples gain efficiency. The relative efficiency also depends in a complicated manner on such other factors as the distribution of exposure and the strength of the exposure-confounder relationship. In the majority of situations examined, however, dependent samples were found to be somewhat more efficient than independent samples when confounding was present, while the reverse was true when confounding was absent. Results of small sample simulations do not differ importantly from the asymptotic results, except for pair-matching on a non-confounder, where the inefficiency of matching is greater in small samples.
INTRODUCTION
ONE OF THE major challenges in nonexperimental research is the control of the effects of confounding factors. To this end, a wide range of methodological techniques has evolved, including such design strategies as pair matching and stratified sampling, and such analytical methods as stratified analysis and covariance adjustment. Early efforts [1-4] to provide some statistical basis for choosing between these methods mainly considered prospective studies and the relative capability of various strategies to eliminate bias. Miettinen [5, 6] was the first to point out that another important issue is efficiency (the precision of the estimated parameter) and, in an intuitive way, that efficiency considerations are quite different in cohort and case-control studies. He argued that, whereas in cohort studies, matching reduces the residual variance thereby allowing the exposure effect to be more easily detected, in case-control studies it will tend to restrict the variability of the exposure variable, thereby making it more difficult for the exposure effect to appear. This observation applies as well to statistical adjustment, as demonstrated by Day et al. [7]. Miettinen compared matching with independent sampling when no confounding was present but did not examine the case in which some form of adjustment was necessary to control bias. The latter situation has recently been addressed by a number of authors [8-12]; most of these considered the large sample case with binary exposure, disease, and potentially confounding variables. This case is particularly instructive because
*Author to whom correspondence should be addressed. Portions of this paper were presented at the 109th Annual Meeting of the American Public Health Association, Los Angeles, November, 1981. This research was supported in part by a grant from the National Cancer Institute of Canada and in part by Grant No. R01-CA-16042 from the National Cancer Institute of the United States.
the problem can be characterized by a minimum number of parameters and the asymptotic efficiencies easily derived. So far, only McKinlay [12] has compared pair matching (on a continuous covariate) with adjustment, and only Monte Carlo simulation was found to be feasible for her study. In this paper, we briefly review these asymptotic results for binary exposure (E), disease (D) and potentially confounding (C) variables and then describe a new Monte Carlo simulation of the small sample case. Next we consider the problem of pair matching on a continuous covariate and develop an asymptotic expression for its efficiency relative to logistic adjustment; small sample comparisons of pair-matching are limited to a binary exposure variable with confounding absent. Throughout, our comparisons are restricted to the situation of equal numbers of cases and controls (one-to-one matching in the later sections). First, however, it is instructive to consider in theoretical terms what the effect of matching or stratification in design is expected to be.
A PRIORI CONSIDERATIONS
The fundamental characteristic of case-control matching is that it induces a constant ("balanced") case-control ratio across the strata of the matching factor. For example, one-to-one matching of controls to cases on the factor C will ensure that the number of controls will equal the number of cases within each stratum of C. If stratification on C will be necessary in the analysis, then we should expect this "balancing" effect of matching to improve the efficiency of the study relative to independent sampling, for in the latter design, some strata may be left with unbalanced numbers of cases and controls; the extent of this improvement will depend on the degree of association of C and D in the source population. On the other hand, matching on a factor associated with exposure increases the correlation (concordance) of the exposure histories of cases and controls, thus increasing the variability of the estimated association between E and D [5, 6]. Nevertheless, this increase in variance will translate into a reduction in efficiency only when compared against independent samples without adjustment; when some form of statistical adjustment is needed to control confounding, the variability of this adjusted estimate will also increase as the association between E and C increases, so that the relative efficiency of the two designs will be difficult to predict. Thus, all other things being constant, the more strongly C is related to disease, the more efficient the matched design will be relative to the unmatched design, while the more strongly C is related to exposure, the larger the variance of both the matched design and the unmatched design with adjustment will be. The asymptotic comparisons presented below confirm these predictions, though the magnitudes of the effects vary depending on sample size and other factors.
STRATIFICATION
Asymptotic relations
In order to provide a basis for comparison with the new results, we review first the situation of binary exposure, disease, and potentially confounding variables. The basic design issue is whether to use stratified or independent samples of controls, and in either case, the basic analytic issue is whether to use a collapsed or stratified analysis; in the stratified sample case, a further analytic issue is whether to retain any one-to-one pairing that may have been done. The five basic strategies to be compared are therefore: (IC) independent samples with a collapsed analysis; (IS) independent samples with a stratified analysis; (SC) stratified samples with a collapsed analysis; (SS) stratified samples with a stratified analysis; and (SM) stratified samples incorporating pair matching in the analysis. In this section we consider the asymptotic bias and efficiency of each of these strategies as functions of the underlying relationships between the three factors in the population. Thus we begin by calculating the expected outcomes for each set of relationships and then
compare the expected estimates and asymptotic variances of the log odds ratio for the exposure-disease relationship resulting from each of the five strategies. The relationships described in this section agree with those given by Smith and Day [9] except that we find it more convenient to interchange the order of specifying the frequencies of E and C in the nondiseased population. The analytic results are not affected, but we prefer to think of C as the determinant of E because factors which are intermediate links in a pathway from E to D are not ordinarily considered to be true confounders [13]. Thus, let $p_C = 1 - q_C$ denote the prevalence of C in the nondiseased population and let $p_i = 1 - q_i$ denote the prevalence of E in the nondiseased subpopulations with C absent (i = 0) and present (i = 1). Then we let $p_E = p_C p_1 + q_C p_0$ be the overall prevalence of exposure and $OR_{EC} = p_1 q_0 / p_0 q_1$ be the odds ratio for the relationship between E and C in the population. Next we assume that the incidence of disease follows a multiplicative dependence on E and C, so that the rate ratio for the exposure effect is $RR_{DE} = \mathrm{rate}(EC)/\mathrm{rate}(\bar{E}C) = \mathrm{rate}(E\bar{C})/\mathrm{rate}(\bar{E}\bar{C})$ and for the potential confounder is $RR_{DC} = \mathrm{rate}(EC)/\mathrm{rate}(E\bar{C}) = \mathrm{rate}(\bar{E}C)/\mathrm{rate}(\bar{E}\bar{C})$, neither depending on the level of the other factor. Assuming either a rare disease or incidence density sampling [14], the expected distributions of cases $c_{ij}$, independent controls $u_{ij}$, and stratified controls $s_{ij}$ (i = level of C and j = level of E) are then given in Table 1. Note that the expected distribution of independent controls is merely the population distribution, while in the stratified series, the same number of controls as cases is drawn in each confounder stratum ($s_{i\cdot} = c_{i\cdot}$, where the point (.) denotes summation over the indicated index). The investigator may find it convenient to pair these cases and controls together, for example by selecting a control with the same value of C who was admitted on the same day.
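The expected distributions described above can be sketched numerically. The following is a minimal illustration, not the authors' Table 1: it assumes the multiplicative model just stated, expresses every cell as a proportion of the sample size, and recovers $p_0$ and $p_1$ from $p_E$, $p_C$ and $OR_{EC}$ by bisection. The function names are hypothetical.

```python
def solve_stratum_prevalences(p_E, p_C, or_EC, iters=200):
    """Bisection for (p_0, p_1), the exposure prevalences with C absent/present,
    given the overall prevalence p_E, the prevalence p_C and the odds ratio OR_EC."""
    q_C = 1.0 - p_C
    lo = max(0.0, (p_E - q_C) / p_C)   # lower limit keeps p_0 below 1
    hi = min(1.0, p_E / p_C)           # upper limit keeps p_0 above 0
    for _ in range(iters):
        p_1 = (lo + hi) / 2.0
        p_0 = (p_E - p_C * p_1) / q_C
        # The E-C odds ratio increases with p_1, so bisection applies.
        if p_1 * (1.0 - p_0) / (p_0 * (1.0 - p_1)) < or_EC:
            lo = p_1
        else:
            hi = p_1
    p_1 = (lo + hi) / 2.0
    return (p_E - p_C * p_1) / q_C, p_1


def expected_distributions(p_E, p_C, or_EC, rr_DE, rr_DC):
    """Return (c, u, s): expected proportions of cases, independent controls and
    stratified controls, keyed by (i, j) = (level of C, level of E)."""
    p_0, p_1 = solve_stratum_prevalences(p_E, p_C, or_EC)
    p_exp = {0: p_0, 1: p_1}
    p_conf = {0: 1.0 - p_C, 1: p_C}
    # Independent controls follow the nondiseased population distribution.
    u = {}
    for i in (0, 1):
        u[(i, 1)] = p_conf[i] * p_exp[i]
        u[(i, 0)] = p_conf[i] * (1.0 - p_exp[i])
    # Cases: population distribution weighted by the multiplicative rate ratios.
    raw = {(i, j): u[(i, j)] * rr_DC ** i * rr_DE ** j for (i, j) in u}
    total = sum(raw.values())
    c = {cell: weight / total for cell, weight in raw.items()}
    # Stratified controls: as many controls as cases in each stratum of C.
    s = {}
    for i in (0, 1):
        c_i = c[(i, 0)] + c[(i, 1)]
        s[(i, 1)] = c_i * p_exp[i]
        s[(i, 0)] = c_i * (1.0 - p_exp[i])
    return c, u, s
```

For example, `expected_distributions(0.5, 0.5, 4.0, 1.0, 4.0)` gives the case, independent control and stratified control proportions for $p_E = p_C = 0.5$, $OR_{EC} = 4$, $RR_{DE} = 1$ and $RR_{DC} = 4$; multiplying by N gives expected counts.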
[Table 1. Expected distributions of cases ($c_{ij}$), independent controls ($u_{ij}$) and stratified controls ($s_{ij}$), by level i of C and level j of E. The table body did not survive extraction.]

[Table 2. Expected distributions of case-control pairs, by exposure of the case (present, absent) and exposure of the matched control (present, absent): (A) stratum matching on factor C, where $c_{ij}$ and $s_{ij}$ are as defined in Table 1; (B) one-to-one matching inducing correlation $\rho_{10}$ of case-control exposure status, where one cell is the positive root of the quadratic equation (1) given in the Appendix. The cell formulae did not survive extraction.]

TABLE 3. FORMULAE FOR COMPUTING ESTIMATED ODDS RATIOS AND THEIR ASYMPTOTIC VARIANCES
[Columns: sampling scheme; method of analysis; estimated odds ratio*; asymptotic variance†. Rows: (IC) independent, collapsed; (IS) independent, stratified; (SC) stratified, collapsed; (SS) stratified, stratified; (SM) stratified, matched. The cell formulae did not survive extraction.]
*Formulae used in small sample simulations (not required for asymptotic comparisons).
†Formulae used in asymptotic comparisons (not required for small sample simulations). Note that for IS and SS, the asymptotic variance given is that of the unconditional maximum likelihood estimator of the common odds ratio, not that of the Mantel-Haenszel estimator given in the previous column.

Assuming that the criteria determining such pairing were effectively random, i.e. not related to either E or D, then the expected distribution of the case-control pairs is given
in Table 2(a). (The situation of unique one-to-one pairing on true, continuous confounders is addressed later.) The formulae for the odds ratio estimator of $RR_{DE}$ and the asymptotic variance of $\ln RR_{DE}$ under the various strategies are provided in Table 3. Only the asymptotic variances (column 4) are needed for the efficiency comparisons in this section. In the stratified analyses, the variance is that of the empirical logit or the unconditional or conditional maximum likelihood estimators, all of which are asymptotically equivalent and fully efficient under this model [15]. Then we define the asymptotic relative efficiency, $ARE_{x:y}$, of strategy x relative to strategy y as the ratio of the asymptotic variance under strategy y to that under strategy x. (In so doing we have followed general practice, though in practical terms the relevant scale of comparison is the standard error rather than the variance; AREs for this scale can be obtained from the tabulated values by taking square roots and will always be closer to 100%.) The asymptotic variances and AREs are tabulated for various choices of the parameters $p_E$, $p_C$, $OR_{EC}$, $RR_{DE}$ and $RR_{DC}$ in Tables 4 and 5. The nomenclature in Table 4 follows Miettinen [6], except that we have distinguished the case $RR_{DC} = OR_{EC} = 1$ by the name "irrelevance".
Results. Table 4 describes the asymptotic variances for the eight fundamentally different situations in which $OR_{EC}$, $RR_{DC}$ and $RR_{DE}$ are either "present" or "absent". The value 4 was arbitrarily selected to illustrate "presence" and $p_E$ and $p_C$ are fixed at 0.5 for each; differences between methods vary in a complicated manner with the various parameters, as explored in greater detail in Table 5.

TABLE 4. ASYMPTOTIC VARIANCES (×N) OF LOG ODDS RATIOS FROM STRATIFIED AND INDEPENDENT SAMPLES FOR CASE-CONTROL STUDIES WITH MATCHED, STRATIFIED AND UNADJUSTED ANALYSES

Strength of                              Stratified samples,                Independent samples,
relationships†                           analysis:                          analysis:
RR_DE  OR_EC  RR_DC   Nomenclature       Matched   Stratified  Collapsed    Stratified  Collapsed
                                         (SM)      (SS)        (SC)         (IS)        (IC)
Null hypothesis
1      1      1       Irrelevance         8.00      8.00        8.00         8.00        8.00
1      1      4       Futility            8.00      8.00        8.00         8.89        8.00
1      4      1       Overmatching        9.00      9.00        8.00         9.00        8.00
1      4      4       Confounding         9.00      9.00        8.33         9.99        8.17*
Alternative hypothesis
4      1      1       Irrelevance        12.50     10.25       10.25        10.25       10.25
4      1      4       Futility           12.50     10.25       10.25        11.27       10.25
4      4      1       Overmatching       14.06     11.70       10.27*       11.36       10.25
4      4      4       Confounding        15.75     13.43       12.41*       13.75       12.17*

*Expected odds ratio is biased.
†$p_E = p_C = 0.5$ for all situations.
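The Table 4 entries can be checked with a short calculation. The sketch below is an illustration under stated assumptions rather than the Table 3 formulae themselves: collapsed analyses use the usual sum of reciprocal expected cell proportions, stratified analyses use the inverse-variance (empirical logit) combination of stratum-specific log odds ratios, and the matched analysis uses the expected discordant pairs under random pairing within strata of C. All function names are hypothetical, and every cell is a proportion of N.

```python
def solve_stratum_prevalences(p_E, p_C, or_EC, iters=200):
    """Bisection for (p_0, p_1) given p_E, p_C and the odds ratio OR_EC."""
    q_C = 1.0 - p_C
    lo, hi = max(0.0, (p_E - q_C) / p_C), min(1.0, p_E / p_C)
    for _ in range(iters):
        p_1 = (lo + hi) / 2.0
        p_0 = (p_E - p_C * p_1) / q_C
        if p_1 * (1.0 - p_0) / (p_0 * (1.0 - p_1)) < or_EC:
            lo = p_1
        else:
            hi = p_1
    p_1 = (lo + hi) / 2.0
    return (p_E - p_C * p_1) / q_C, p_1


def expected_distributions(p_E, p_C, or_EC, rr_DE, rr_DC):
    """Expected proportions of cases (c), independent controls (u) and
    stratified controls (s), keyed by (level of C, level of E)."""
    p_0, p_1 = solve_stratum_prevalences(p_E, p_C, or_EC)
    p_exp, p_conf = {0: p_0, 1: p_1}, {0: 1.0 - p_C, 1: p_C}
    u = {(i, j): p_conf[i] * (p_exp[i] if j else 1.0 - p_exp[i])
         for i in (0, 1) for j in (0, 1)}
    raw = {(i, j): u[(i, j)] * rr_DC ** i * rr_DE ** j for (i, j) in u}
    total = sum(raw.values())
    c = {cell: weight / total for cell, weight in raw.items()}
    s = {(i, j): (c[(i, 0)] + c[(i, 1)]) * (p_exp[i] if j else 1.0 - p_exp[i])
         for i in (0, 1) for j in (0, 1)}
    return c, u, s


def asymptotic_variances(p_E, p_C, or_EC, rr_DE, rr_DC):
    """Asymptotic variances (x N) of the log odds ratio for SM, SS, SC, IS, IC."""
    c, u, s = expected_distributions(p_E, p_C, or_EC, rr_DE, rr_DC)

    def collapsed(ctrl):
        # Crude 2 x 2 table: sum of reciprocals of the four marginal cells.
        return sum(1.0 / (c[(0, j)] + c[(1, j)]) + 1.0 / (ctrl[(0, j)] + ctrl[(1, j)])
                   for j in (0, 1))

    def stratified(ctrl):
        # Inverse-variance combination of the two stratum-specific log odds ratios.
        information = 0.0
        for i in (0, 1):
            v_i = sum(1.0 / c[(i, j)] + 1.0 / ctrl[(i, j)] for j in (0, 1))
            information += 1.0 / v_i
        return 1.0 / information

    # Expected discordant pairs when cases and controls are paired at random
    # within each stratum of C (case exposed/control unexposed, and the reverse).
    m10 = sum(c[(i, 1)] * s[(i, 0)] / (c[(i, 0)] + c[(i, 1)]) for i in (0, 1))
    m01 = sum(c[(i, 0)] * s[(i, 1)] / (c[(i, 0)] + c[(i, 1)]) for i in (0, 1))
    return {"SM": 1.0 / m10 + 1.0 / m01, "SS": stratified(s), "SC": collapsed(s),
            "IS": stratified(u), "IC": collapsed(u)}


def are(var_x, var_y):
    """Asymptotic relative efficiency (%) of strategy x relative to strategy y."""
    return 100.0 * var_y / var_x
```

Under these assumptions, `asymptotic_variances(0.5, 0.5, 4.0, 4.0, 1.0)` (overmatching under the alternative hypothesis) returns values that round to 14.06, 11.70, 10.27, 11.36 and 10.25 for SM, SS, SC, IS and IC, agreeing with the corresponding row of Table 4. The collapsed variances are computed whether or not the collapsed estimator is unbiased, so the bias flags in the table are not reproduced.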
In the case of irrelevance, all methods are unbiased and equally efficient except matched analysis under the alternative hypothesis. This would seem to be contrary to the conventional wisdom that in the paired analyses, the "concordant pairs" $m_{11}$ and $m_{00}$ contribute no information. The latter view is correct only if cases have been individually matched with a uniquely "best" control, such as a sib or nearest available match on a number of strong confounders, as discussed below. In the present case of stratum matching, the particular pairing is but one of a large number of possible ones, so that to restrict the analysis to the small number of discordant pairs is wasteful of information [16]. The same results apply in the case of futility except that stratified analysis is less efficient than crude analysis for independent samples. In either case, if C is unrelated to E, it makes no difference to efficiency whether the sampling design is stratified on C. In the case of overmatching, only collapsed analysis of independent samples (IC) is fully efficient if $RR_{DE} \neq 1$. Of all the situations, the stratified design is the least efficient here, particularly with a matched analysis. This case also illustrates two important differences between case-control and cohort studies in the implications of a stratified design. First, in case-control studies it is not sufficient merely to stratify the design for a potential confounder; the analysis must also be stratified [17]. Indeed, in case-control studies, an unstratified analysis of a stratified design will lead to a biased estimate of $RR_{DE}$ unless $OR_{EC} = 1$ or $RR_{DE} = 1$. Second, sampling stratified on a factor which is related to E but not to D will lead to a less precise estimate in a case-control study but not in a cohort study [5, 11]. Nevertheless, it will not cause the odds ratio to be underestimated unless the incorrect unstratified analysis is done.
(These differences between case-control and cohort studies reflect the choice of parameter rather than the basic design; if an odds ratio were estimated from a cohort study, the results would be identical to those presented here for case-control studies.) In the case of confounding, a stratified analysis is necessary for validity, whether or not the design is stratified. As in the other cases, a matched analysis is less efficient than a stratified one, so hereafter we restrict our attention to comparisons of stratified and independent sample designs using only stratified analyses. The relative efficiency of the two designs varies in a complicated manner with all of the five parameters, as shown in Table 5. The strongest and most consistent effects are the following:
$p_E$: Stratified sampling becomes more efficient as $p_E$ becomes smaller, particularly if $OR_{EC}$ or $RR_{DC}$ are large.
$RR_{DC}$: Stratified sampling becomes more efficient as $RR_{DC}$ gets larger, particularly if $p_E$ is small.
$RR_{DE}$: Stratified sampling becomes less efficient as $RR_{DE}$ gets larger, particularly if $OR_{EC}$ is also large and $p_E$ and $p_C$ are similar.
$OR_{EC}$: Stratified sampling becomes less efficient as $OR_{EC}$ gets larger, provided $p_E$ is large; otherwise the effect of $OR_{EC}$ is unpredictable.
TABLE 5. [beginning of title lost in extraction] CASE OF CONFOUNDING ANALYZED BY STRATIFICATION
[Entries are asymptotic relative efficiencies (%) comparing independent with stratified samples, both with stratified analysis.]

                       RR_DE = 2                   RR_DE = 8
                       RR_DC =                     RR_DC =
p_E    p_C    OR_EC    2      4      8             2      4      8
0.2    0.2    2         95     85     72            97     89     71
0.2    0.2    4         93     81     68            99     91     79
0.2    0.2    8         91     79     66           104     97     85
0.2    0.5    2         94     86     78            96     89     81
0.2    0.5    4         91     83     76            94     87     81
0.2    0.5    8         86     80     74            91     85     80
0.5    0.2    2         99     93     82           101     96     87
0.5    0.2    4        102     99     89           104    103     96
0.5    0.2    8        106    106     99           109    110    106
0.5    0.5    2         98     92     84           100     96     91
0.5    0.5    4        100     94     86           105    101     95
0.5    0.5    8        101     95     88           112    107    100
The effects of $RR_{DC}$ and $OR_{EC}$ are consistent with the a priori considerations discussed above. On balance, most of the combinations of parameters considered here favor a stratified design. The potential loss in efficiency through stratification was rarely more than 10% but the potential gain was sometimes more than 20%.
Small sample simulations
Many of the parameter choices listed in Table 5 lead to one or more cells having very small probabilities (e.g. < 5%) so that for many realistic sample sizes, very small or even empty cells may frequently arise. Thus it is reasonable to wonder if the asymptotic results presented above and in Refs [7-11] would be applicable to small samples. To investigate this question, we carried out a Monte Carlo simulation. The $c_{ij}$ and $u_{ij}$ were assumed to be multinomially distributed with sample size N and probabilities given in Table 1, and the $s_{ij}$ to be binomially distributed with sample sizes $c_{i\cdot}$ and conditional probabilities $p_i$. The empirical variances of the estimators listed in the third column of Table 3 were then tabulated. (Again, note that in the stratified analyses, the Mantel-Haenszel estimator was used, rather than the more complex maximum likelihood estimators.) Standard errors of the simulated relative efficiencies were estimated by arranging the trials into batches of 200 each. On this basis, a total of 2000 trials was found to be needed to reduce the standard error to under 5%. As there was no systematic pattern to the standard errors, this number was used for all parameter choices and the pooled estimate of standard error of any particular choice was 3.9%. The computing cost forced us to restrict our consideration to a relatively limited choice of parameters, as given in Table 6. Though the standard errors for individual parameter choices may seem large, when considering the entire set of observations, all of the trends which were apparent in the asymptotic results were also highly significant in the simulated results.
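The simulation scheme described above can be sketched as follows. This is a minimal illustration and not the authors' program: it draws multinomial case and independent-control counts, binomial stratified-control counts within each stratum of C, and applies the standard Mantel-Haenszel estimator. Discarding trials in which the estimator is undefined is an assumption of this sketch (the text does not say how empty cells were handled), and all function names are hypothetical.

```python
import math
import random


def draw_counts(rng, n, probs):
    """Multinomial draw of n subjects over the cells of `probs`."""
    cells = list(probs)
    counts = dict.fromkeys(cells, 0)
    for cell in rng.choices(cells, weights=[probs[k] for k in cells], k=n):
        counts[cell] += 1
    return counts


def mantel_haenszel_log_or(cases, controls):
    """Log Mantel-Haenszel odds ratio over the two strata of C.
    Cells are keyed by (level of C, level of E). Returns None if undefined."""
    num = den = 0.0
    for i in (0, 1):
        n_i = sum(cases[(i, j)] + controls[(i, j)] for j in (0, 1))
        if n_i == 0:
            continue
        num += cases[(i, 1)] * controls[(i, 0)] / n_i
        den += cases[(i, 0)] * controls[(i, 1)] / n_i
    if num == 0.0 or den == 0.0:
        return None
    return math.log(num / den)


def simulate_variance(case_p, ctrl_p, p_exp, n, trials, stratified_design, seed=0):
    """Empirical variance (x n) of the log Mantel-Haenszel odds ratio.

    case_p, ctrl_p: cell probabilities for cases and independent controls.
    p_exp: exposure prevalence among the nondiseased, by level of C.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        cases = draw_counts(rng, n, case_p)
        if stratified_design:
            # As many controls as cases in each stratum of C, exposure binomial.
            controls = {}
            for i in (0, 1):
                c_i = cases[(i, 0)] + cases[(i, 1)]
                exposed = sum(rng.random() < p_exp[i] for _ in range(c_i))
                controls[(i, 1)] = exposed
                controls[(i, 0)] = c_i - exposed
        else:
            controls = draw_counts(rng, n, ctrl_p)
        estimate = mantel_haenszel_log_or(cases, controls)
        if estimate is not None:   # assumption: undefined trials are dropped
            estimates.append(estimate)
    mean = sum(estimates) / len(estimates)
    return n * sum((e - mean) ** 2 for e in estimates) / (len(estimates) - 1)
```

With all four cell probabilities equal to 0.25 (irrelevance under the null hypothesis, $p_E = p_C = 0.5$), the result for either design should lie near the asymptotic value of 8.00, subject to Monte Carlo error that shrinks as `trials` grows.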
We looked for patterns in the deviations by using a multiple regression of the simulated on the asymptotic relative efficiencies. At a sample size of 200 cases and 200 controls, no significant patterns were apparent, but at a sample size of 50 cases and 50 controls, there was a small but highly significant tendency for the simulated findings to be more extreme than the asymptotic ones. Though the pattern is difficult to see from the individual cell values, the fitted relative
[Table 6. Relative efficiencies (%) by $p_E$ and $p_C$ (0.20, 0.50), $OR_{EC}$ (1, 2, 4, 8) and N* (asymptotic, 200, 50), tabulated for $RR_{DE}$ = 1 and 4 and for $RR_{DC}$ = 1, 2, 4, 8. The title and table body did not survive extraction.]
*N = number of cases = number of controls; each estimate based on 2000 simulations, estimated standard error = 3.9.
TABLE 7. SUMMARY OF SIMULATED:ASYMPTOTIC VARIANCE AND RELATIVE EFFICIENCY COMPARISONS FOR STRATUM MATCHING; CASES OF IRRELEVANCE AND OVERMATCHING

                                                        N = number of cases = number of controls
Simulated:asymptotic parameter*                         200        50
Variances (×N) of ln RR_DE
  SM: stratified samples, matched analysis              1.042      1.122
  SS: stratified samples, stratified analysis           1.010      1.093
  IS: independent samples, stratified analysis          1.014      1.087
  IC: independent samples, collapsed analysis           0.988      1.063
Relative efficiency
  IS:SS                                                 1.002      1.011
  IC:SM                                                 1.060      1.061

*Numbers in the table are the geometric means of the ratio of the simulated to asymptotic variances or relative efficiencies, over eight parameter choices: $RR_{DE}$ = 1, 4 and $OR_{EC}$ = 1, 2, 4, 8, each with 1000 simulations (estimated SEM = 0.018).
efficiencies range from 64 at an ARE of 70 to 112 at an ARE of 110. Thus, the small sample results entirely confirm the asymptotic comparisons, with no major differences in either direction or magnitude. For comparison with the case of pair-matching on a continuous covariate discussed below, we carried out an additional simulation of the two estimators considered there, SM (matched pairs estimator for the stratified design) and IC (collapsed estimator, unstratified design), in the situations where both were valid, i.e. $RR_{DC} = 1$. Table 7 shows that the simulated variances were generally larger than the asymptotic ones, but the differences were largest for strategy SM and smallest for IC, as might be expected based on the expected cell sizes. Though the simulated efficiencies of IS relative to SS were not substantially different from their AREs, the simulated efficiencies of IC relative to SM were, on average, about 6% larger than their AREs. Thus, convergence to asymptotic relations might require larger total sample sizes in the matched case than in the stratified case.
ONE-TO-ONE MATCHING
Asymptotic relations
In our preliminary studies, we tried to approximate the case of one-to-one matching by increasing the number of levels of C and found that the differences between the five strategies become much less pronounced. (Similar findings were reported by Thompson et al. [10].) This approach is cumbersome, however, because of the arbitrariness in specifying the distribution of C and the dependence of E and D on C, and because as the number of categories of C increases, some form of parametric adjustment for C (e.g. logistic regression) may become more efficient than stratification. While multi-level categories can also arise from combinations of several binary confounders, the same difficulties in comparing the efficiencies of designs arise. We have therefore developed a more general approach to the problem, described in detail in Appendix 1, in which we allow E and C to have any joint distribution, continuous or discrete, each factor being either univariate or multivariate. We then assume some functional dependence of risk on E and C and compare the asymptotic variances of the estimates of the coefficient(s) of E in that model under the following three strategies: (MM) paired samples, exactly matched on C, with a conditional likelihood analysis; (IA) independent samples, with an unconditional likelihood analysis which includes both a constant nuisance parameter and one for C in the model; and (IC) independent samples, with an unconditional likelihood analysis which includes only a constant nuisance parameter but ignores C. Thus, except under IC, we assume any confounding effects have been eliminated either by perfect matching or by perfect adjustment. The asymptotic relative efficiency is then
computed by comparing the inverses of the second derivatives of the log likelihood functions, evaluated at the true parameter values. Appendix 1 provides the general theory, applicable to any distribution of E and C and any form of dependence of D on E and C, together with computational formulae for the special case where E and C have a bivariate normal distribution with correlation $\rho_{EC}$ and incidence rates depending on E and C according to the relation $RR(E, C) = \exp(\alpha + \beta_{DE} E + \gamma_{DC} C)$ where $\alpha$ is a nuisance parameter. Tables 8 and 9 provide the asymptotic relative efficiencies for various choices of these three parameters, calculated by numerical integration. Dichotomizing a bivariate normal population at the mean of E and C will produce an odds ratio $OR_{EC} = 4$ if $\rho_{EC} = 0.5$. Similarly the values $\beta_{DE} = \gamma_{DC} = 1.0$ will produce rate ratios $RR_{DE} = RR_{DC} = 4$ when comparing the 25th and 75th percentiles of the marginal distributions of E and C. Thus, these parameter values are roughly comparable in magnitude to those given in Table 4 and the parameter values in Table 9 were chosen in a similar manner to be comparable to those in Table 5. The patterns in Table 8 are generally very similar to those in columns SM, IS and IC of Table 4, but to the extent that the parameter choices are comparable at all, the differences between methods seem to be more pronounced for continuous matching factors. Similarly, comparing Tables 5 and 9, very similar patterns emerge, though the differences are much larger for continuous matching factors. Thus in Table 9, independent samples with adjustment become more efficient as $\beta_{DE}$ gets larger (as for $RR_{DE}$ in Table 4) but less efficient as either $\gamma_{DC}$ or $\rho_{EC}$ get larger; the effect of $\gamma_{DC}$ is comparable to that of $RR_{DC}$ in Table 4, but the corresponding effect of $OR_{EC}$ in Table 5 was quite inconsistent.
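The two comparability statements above can be checked directly. The sketch below assumes that E and C are standardized to unit variance (an assumption of this illustration), uses the standard quadrant-probability result for a bivariate normal distribution, and relies only on the Python standard library; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def odds_ratio_from_correlation(rho):
    """Odds ratio of the 2 x 2 table obtained by cutting a bivariate normal
    pair at both means, using P(both above the mean) = 1/4 + asin(rho) / (2 pi)."""
    concordant = 0.25 + math.asin(rho) / (2.0 * math.pi)   # each concordant cell
    discordant = 0.5 - concordant                           # each discordant cell
    return (concordant * concordant) / (discordant * discordant)


def interquartile_rate_ratio(coefficient):
    """Rate ratio between the 75th and 25th percentiles of a unit-variance
    normal covariate with the given log-linear coefficient."""
    standard_normal = NormalDist()
    iqr = standard_normal.inv_cdf(0.75) - standard_normal.inv_cdf(0.25)
    return math.exp(coefficient * iqr)
```

Here `odds_ratio_from_correlation(0.5)` is 4 up to rounding error, and `interquartile_rate_ratio(1.0)` is approximately 3.85, which is the sense in which a coefficient of 1.0 corresponds to a rate ratio of about 4.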
TABLE 8. ASYMPTOTIC VARIANCES (×N) OF $\beta_{DE}$ FROM PAIR-MATCHED AND INDEPENDENT SAMPLES FOR CASE-CONTROL STUDIES WITH CONDITIONAL, UNCONDITIONAL ADJUSTED AND UNCONDITIONAL UNADJUSTED ANALYSES

Exposure-      Logistic disease risk                      Pair matched     Independent samples,
confounder     coefficients                               samples,         unconditional analysis
correlation    Exposure    Confounder                     conditional      adjusted      unadjusted
rho_EC         beta_DE     gamma_DC     Nomenclature      analysis (MM)    for C (IA)    (IC)
Null hypothesis
0.0            0.0         0.0          Irrelevance       2.00             2.00          2.00
0.0            0.0         1.0          Futility          2.00             2.52          2.00
0.5            0.0         0.0          Overmatching      2.61             2.61          2.00
0.5            0.0         1.0          Confounding       2.61             3.35          2.25*
Alternative hypothesis
0.0            1.0         0.0          Irrelevance       4.21             3.05          3.05
0.0            1.0         1.0          Futility          4.27             3.67          3.05
0.5            1.0         0.0          Overmatching      4.84             3.88          3.05
0.5            1.0         1.0          Confounding       4.85             5.62          4.60*

*Expected parameter estimate is biased.

TABLE 9. [title lost in extraction; entries are asymptotic relative efficiencies (%)]

Logistic            Exposure-
disease-exposure    confounder          Logistic disease-confounder coefficient, gamma_DC
coefficient         correlation
beta_DE             rho_EC              0.5        1.0        1.5
0.5                 0.30                101         84         64
0.5                 0.50                 96         78         60
0.5                 0.67                 91         73         55
1.0                 0.30                121        100         71
1.0                 0.50                108         86         64
1.0                 0.67                 94         73         54
1.5                 0.30                152        125         95
1.5                 0.50                126         98         [illegible]
1.5                 0.67                100         75         54
TABLE 10. ASYMPTOTIC AND SIMULATED RELATIVE EFFICIENCIES (%) OF INDEPENDENT SAMPLES WITHOUT ADJUSTMENT RELATIVE TO PAIR MATCHING IN THE ABSENCE OF CONFOUNDING; CASE OF IRRELEVANCE OR OVERMATCHING ($\gamma_{DC} = 0$)

                     Asymptotic                 Simulated†
                     RR_DE =                    RR_DE =
p_E     rho_10*      1      2      4            1      2      4
0.1     0.0          100    103    114          105    106    136
0.1     0.2          125    123    134          126    149    142
0.1     0.4          167    157    169          183    183    151
0.4     0.0          100    106    124          109    109    141
0.4     0.2          125    133    160          134    143    194
0.4     0.4          167    178    225          185    190    214

*Correlation of matched-pair exposure levels (see Appendix 2).
†Number of cases = number of controls = 100; for $p_E = 0.1$ each estimate based on 2000 simulations, estimated standard error = 5.1; for $p_E = 0.4$ each estimate based on 4000 simulations, estimated standard error = 7.3.
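The asymptotic entries of Table 10 for $RR_{DE} = 1$ (100, 125 and 167) can be reproduced with a short calculation. The sketch below is not the Appendix 2 derivation; it covers only the null case, in which case and control exposures are both Bernoulli($p_E$) with correlation $\rho_{10}$, and it uses the usual large-sample variances: the sum of reciprocal discordant-pair proportions for the matched analysis and the sum of reciprocal cell proportions for the unmatched 2 × 2 table. The function names are hypothetical.

```python
def null_pair_distribution(p_E, rho_10):
    """Expected proportions of case-control pairs by exposure status when
    RR_DE = 1, so that case and control exposures share prevalence p_E and
    have correlation rho_10. Keys give (case exposed, control exposed)."""
    q_E = 1.0 - p_E
    covariance = rho_10 * p_E * q_E
    return {
        (1, 1): p_E * p_E + covariance,
        (1, 0): p_E * q_E - covariance,
        (0, 1): p_E * q_E - covariance,
        (0, 0): q_E * q_E + covariance,
    }


def null_are_independent_vs_matched(p_E, rho_10):
    """ARE (%) of independent samples with collapsed analysis relative to
    pair matching with matched analysis, under RR_DE = 1."""
    pairs = null_pair_distribution(p_E, rho_10)
    # Matched analysis: variance (x N) from the two discordant cells.
    var_matched = 1.0 / pairs[(1, 0)] + 1.0 / pairs[(0, 1)]
    # Unmatched analysis: cases and controls each contribute 1/p_E + 1/q_E.
    var_independent = 2.0 / p_E + 2.0 / (1.0 - p_E)
    return 100.0 * var_matched / var_independent
```

In this null case the ratio reduces to $100/(1 - \rho_{10})$ regardless of $p_E$, which is why the $RR_{DE} = 1$ asymptotic column is the same for both values of $p_E$.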
Small sample simulations
A Monte Carlo simulation of the full model presently appears to be infeasible because of the difficulty of specifying the degree of mismatching anticipated, and the potential misspecification of the model for parametric analysis, let alone the enormous computing required for iterative solution of the likelihood equations for each trial. We have therefore restricted our consideration of small samples to the situation where confounding is absent (so that adjustment for C in the independent samples is not needed) and where E is binary. In this case, it is shown in Appendix 2 that the expected distribution of matched pairs can be computed from the parameters $p_E$, $RR_{DE}$ and $\rho_{10}$ by the formulae given in Table 2(b), where $\rho_{10}$ is the correlation between the exposure status of cases and their matched controls induced by the correlation $\rho_{EC}$. The asymptotic and small sample simulation results are given in Table 10. The effects are consistent with those noted in Tables 4 and 8, but the small sample effects are uniformly larger in magnitude than the asymptotic ones. Thus, it appears that the loss in efficiency due to overmatching can be substantially greater than the asymptotic results would indicate.
DISCUSSION
In this paper, we have carefully distinguished between design and analysis issues, focusing on the former. Our main conclusion is that the choice between matched and independent sample designs depends on a number of considerations, but most importantly on the relative strength of the relationships of disease D to exposure E and potential confounder C. In terms of analysis issues, we have noted that failure to stratify or condition in the analysis after having used matched or stratified sampling will result in a biased estimate of the ED relation whereas to employ a paired analysis when cases and controls have in effect only been stratum matched is inefficient.
In practice, epidemiologists frequently opt for strategies which are difficult to classify as stratum or pair matching. For example, one might loosely match on broad age and gender strata (potentially strong confounders) and then select the nearest available match on neighborhood or date-of-admission (possibly pseudo-confounders). In such a situation, it is not obvious whether a matched-pairs or stratified analysis would be preferred, although correlation of neighborhood or date-of-admission with exposure would dictate matched-pairs analysis. Another issue we have not addressed is the risk of introducing bias if an adjustment strategy is used which does not completely eliminate the effects of confounding variables, e.g. through insufficiently fine stratification or incorrect specification of the analysis model or choice of variables. Finally, we have not addressed the greater complexity of stratified over pair-matched analyses when conditional likelihood methods must be used because of small stratum sizes [18, 19].
In terms of design decisions, we have arrived at the following recommendations:
(1) If a factor C is unrelated to both E and D, there is no statistical consequence of matching on it. Therefore, if both the CE and CD relationships are doubtful, matching on C is unnecessary, and practical considerations would probably argue against matching. If independent samples are chosen, adjustment for such variables should be avoided.
(2) If C is unrelated to E but strongly related to D, matching does not appear to have strong statistical consequences. Therefore, if the CE relationship is doubtful but the CD relationship is known to be strong, practical considerations may predominate in the decision to match, e.g. the enhanced study credibility due to matching on C can be weighed against the extra effort of matching. (As noted below, however, any decision to match on a strong risk factor should be considered in the decision about whether to match on other factors.)
(3) If C is strongly related to E but unrelated to D, matching necessitates stratified or conditional analysis, whereas the more efficient unadjusted estimator can be used if no matching is employed. Therefore, if the CD relationship is doubtful but the CE relationship is known to be strong, matching on C should be avoided.
(4) If C is related to both E and D, matching on it will more often than not improve statistical efficiency. In deciding whether to match, consideration should be given to the effects that the anticipated relationships between E, D, and C and their distributions will have on the efficiency of the designs considered (Tables 5 and 9). Thus matching might be avoided if the ED relation were strong, the CD relation weak, or E binary with high prevalence. The effect of the EC relationship is less clear. Obviously the variance of the matched design increases as the strength of this relationship increases for the a priori considerations given earlier, but the same is true of the variance of the various adjustment strategies.
The relative efficiency seems to depend on a number of factors and is different when comparing pair-matching with covariance adjustment or comparing stratum-matching with stratified analysis. These relationships warrant further investigation.

These recommendations are in general agreement with those of the authors previously cited [7-12]. If the necessity of controlling a variable is doubtful, we would recommend against matching. The confounding effect of such a variable can be explored in the analysis and a decision made at that stage whether or not to adjust for it.

In following these recommendations, one must of course decide which of the variables that appear to be confounders in the data are in fact true confounders (for which adjustment is necessary) and which are just “pseudo-confounders” (for which adjustment should be avoided) [7, 13]. Statistical significance testing is not a valid method for making this decision [7, 13, X-23], but when the investigator has no strong prior information about the confounding role of a variable, there is no generally agreed-upon criterion for making this decision. Work is in progress to develop criteria for deciding which estimator to use when no prior information is available, and to develop adaptive estimators with desirable properties.

Our results do not bear directly on this question because we have assumed throughout that the population associations were known. In this case, we have shown that adjustment (or matching) for variables which are not confounders is inefficient. Recently, it has been demonstrated (Robins, 1982, unpublished manuscript) that this inefficiency can also be viewed as a form of conditional bias: for example, if there is no EC association in the population, but by chance, such an association is apparent in the data, then conditional on this observed association the stratified estimator of the ED association is biased, whereas the crude estimator is unbiased.
Thus, if we knew that there was no EC association in the population, the crude independent-sample estimator would be preferred on grounds of both efficiency and validity. The same conclusion applies when there is a sample DC association, but it is known that there is no DC association in the population.

In the asymptotic case, the relative efficiencies of the various strategies are rarely strikingly different from unity, but our limited simulation studies indicate that the small sample differences could be substantially larger, at least in the case of pair matching. Hence
we feel that while asymptotic results can usually point out general trends, asymptotic differences between methods ought not to be dismissed merely because they seem small in practical terms. Further small sample simulations of more realistic designs would be highly desirable.

While our study has dealt only with a single factor C, in practice multiple covariates must be considered. Under the assumption of no effect modification, our results extend to the multiple covariate case by considering C to be a potential additional matching factor after having already decided to match on several other factors. The decision whether to match additionally on C must then be based on the residual EC and DC associations after adjustment for the other factors. This is analogous to multiple regression problems, and the analogy extends further: when there are several candidates for matching, the decision to match should not be based on separate consideration of each factor, but rather on whether a factor is a member of some “best” subset of the factors [13]. Theoretically, the “best” subset ought to be a subset that maximizes the efficiency of the design-analysis strategy. Operationally, our results indicate that such a subset would be the one that retains most of the disease-predictive power of the full set while minimizing exposure-predictive power. Identification of such a subset is required at the design stage of the study, but usually not all the information necessary for this will be available. Nevertheless, whenever there are strong intercorrelations within the set of matching candidates (e.g. if the set includes smoking, SES, and race), the disease-predictive power of the full set will be captured by a relatively small subset, and the logical first choices for matching will then be the strongest risk factors.

Although we have focused here on statistical efficiency criteria, “practical considerations” are bound to be important in choosing between designs [19, 23].
Matched studies undoubtedly carry greater credibility than do elaborate statistical adjustments, and are more convenient for controlling such nebulous factors as place of residence or family history. On the other hand, they may be more expensive or cumbersome to implement [10, 19, 23] and may suffer more losses, either through failure to find a match [24] or failure to interview the other member of a matched pair. Such losses can seriously affect both the validity and efficiency of the study results. These considerations are difficult to incorporate in statistical studies of bias and efficiency, although the cost issue has received some attention [10].

Acknowledgements-The authors would like to thank Drs Sholom Wacholder and James Robins, and the referees for many helpful suggestions in revising this paper, and Ms Marlene Dyck for assistance in preparing the manuscript.
REFERENCES
1. Billewicz WZ: Matched samples in medical investigations. Br J Prev Soc Med 18: 167-173, 1964
2. Billewicz WZ: The efficiency of matched samples: an empiric investigation. Biometrics 21: 623-644, 1965
3. Rubin DB: Matching to remove bias in observational studies. Biometrics 29: 159-183, 1973
4. Rubin DB: The use of matched sampling and regression adjustment to remove bias in observational studies. Biometrics 29: 185-203, 1973
5. Miettinen OS: The matched pairs design in the case of all-or-none responses. Biometrics 24: 339-352, 1968
6. Miettinen OS: Matching and design efficiency in retrospective studies. Am J Epid 91: 111-117, 1970
7. Day NE, Byar DP, Green SB: Overadjustment in case-control studies. Am J Epid 112: 696-706, 1980
8. Kupper LL, Karon JM, Kleinbaum DG, et al: Matching in epidemiologic studies: validity and efficiency considerations. Biometrics 37: 271-292, 1981
9. Smith PG, Day NE: Matching and confounding in the design and analysis of epidemiological case-control studies. In Perspectives in Medical Statistics. Bithell JF, Coppi R (Eds). New York: Academic, 1981
10. Thompson WD, Kelsey JL, Walter SD: Cost and efficiency in the choice of matched and unmatched case-control study designs. Am J Epid 116: 840-851, 1982
11. Samuels ML: Matching and design efficiency in epidemiological studies. Biometrika 68: 577-588, 1981
12. McKinlay SM: The effect of bias on estimation of relative risk for pair-matched and stratified samples. JAMA 70: 859-864, 1975
13. Miettinen OS, Cook EF: Confounding: essence and detection. Am J Epid 114: 593-603, 1981
14. Greenland S, Thomas DC: On the need for the rare disease assumption in case-control studies. Am J Epid 116: 547-553, 1982
15. Breslow N: Odds ratio estimators when the data are sparse. Biometrika 68: 73-84, 1981
16. McKinlay SM: Pair matching--a reappraisal of a popular technique. Biometrics 33: 725-735, 1977
17. 18. 19.
20. 21. 22. 23. 24. 25.
Siegel DG, Greenhouse SW: Validity in estimating relative risk in case
APPENDIX 1
Asymptotic variances of exactly pair-matched and independent samples with covariance adjustment when the adjustment model is correct
Let f(E, C) be the joint distribution of exposure E and potential confounder C in the population at risk. In what follows, E and C can be univariate or multivariate, continuous or discrete. Let r(E, C) denote the true rate of disease as a function of E and C. Then the expected distribution of E and C in cases will be given by

$$g(E, C) = \frac{r(E, C)\, f(E, C)}{\iint r(E, C)\, f(E, C)\, dE\, dC}. \tag{1}$$
First, suppose a case-control study has sampled $n_1$ cases from g(E, C) and $n_0$ independent controls from f(E, C), and suppose two models for the odds ratio, $R_a(E, C; \beta_0, \beta_1, \beta_2)$ and $R_c(E; \beta_0, \beta_1)$, are to be fitted to the data, where $\beta_1$ and $\beta_2$ are the parameters for E and C respectively and $\beta_0$ is a nuisance parameter defined by the condition that

$$\iint g(E, C)\, dE\, dC = 1.$$

These odds ratio models induce models for the probability of being a case of the form

$$P(D \mid E, C) = R(E, C)/[1 + R(E, C)]. \tag{2}$$
(The formal justification of these relations has been provided by Prentice and Pyke [25].) Each case therefore contributes a term of the form ln R(E, C) to the log likelihood and all cases and controls contribute terms of the form ln[1 + R(E, C)]. The expected log likelihood for the model R(E, C) is therefore of the form

$$E(\ln L) = \iint n_1\, g(E, C) \ln R(E, C)\, dE\, dC - \iint [n_1\, g(E, C) + n_0\, f(E, C)] \ln[1 + R(E, C)]\, dE\, dC. \tag{3}$$
The asymptotic covariance matrix of the maximum likelihood estimates of the parameters of R(E, C) is obtained by inverting the matrix I of negative second partial derivatives of ln L, given by

$$E(I_{ij}) = \iint n_1\, g(E, C)\, I^{(1)}_{ij}(E, C)\, dE\, dC + \iint [n_1\, g(E, C) + n_0\, f(E, C)]\, I^{(2)}_{ij}(E, C)\, dE\, dC \tag{4}$$

where

$$I^{(1)}_{ij}(E, C) = (R'_i R'_j - R''_{ij} R)/R^2, \qquad I^{(2)}_{ij}(E, C) = [R''_{ij}(1 + R) - R'_i R'_j]/(1 + R)^2$$
and, for notational simplicity, $R = R(E, C)$, $R'_i = \partial R/\partial\beta_i$, and $R''_{ij} = \partial^2 R/\partial\beta_i\,\partial\beta_j$. If, in the adjusted analysis, the true model is used, i.e. $R_a(E, C) = r(E, C)$, then the asymptotic variance of the adjusted estimate $\hat\beta_1$ is simply the 1,1-element of the inverse matrix of E(I), as evaluated at the true values of $\beta_0$, $\beta_1$, $\beta_2$. For the crude analysis, we let

$$R_c(E) = \frac{\int r(E, C)\, f(E, C)\, dC}{\int f(E, C)\, dC} \tag{5}$$

and evaluate the asymptotic variance of $\hat\beta_1$ similarly.

Now suppose controls have been exactly matched to the case distribution of C (comparable to our having assumed our adjusted analysis exactly modelled the effects of C in the unmatched case). Then the distribution of E in those controls matched to cases on value C is merely

$$h(E \mid C) = \frac{f(E, C)}{\int f(E, C)\, dE}.$$

In fitting the data, we now need only consider the risks relative to some standard covariate value, $C_0$, i.e. if we assume E and C interact multiplicatively, we fit the model

$$R_a(E, C) = RR(E; \beta_1)\, R_a(0, C; \beta_0, \beta_2).$$

The expected conditional log likelihood is

$$E(\ln L) = \iiint g(E_1, C)\, h(E_0 \mid C) \ln\{RR(E_1)/[RR(E_1) + RR(E_0)]\}\, dE_1\, dE_0\, dC \tag{6}$$

and the expected information is

$$E(I) = \iiint g(E_1, C)\, h(E_0 \mid C)\, I(E_1, E_0)\, dE_1\, dE_0\, dC \tag{7}$$

where

$$I(E_1, E_0) = (RR_1'^2 - RR_1''\, RR_1)/RR_1^2 - [(RR_1' + RR_0')^2 - (RR_1'' + RR_0'')(RR_1 + RR_0)]/(RR_1 + RR_0)^2$$

and $RR_k = RR(E_k)$, $RR_k' = \partial RR_k/\partial\beta_1$, $RR_k'' = \partial^2 RR_k/\partial\beta_1^2$. Provided the true model r(E, C) has been used in specifying the relative risk model RR(E) in equation (5), the asymptotic variance of $\hat\beta_1$ is simply 1/E(I) evaluated at the true values of $\beta_1$ and $\beta_2$.

Equations (4) and (7) can in principle be evaluated for any specification of f(E, C) and r(E, C), but the results discussed in the text are confined to the special case where f(E, C) is the bivariate normal distribution with correlation $\rho_{EC}$ and r(E, C) is the exponential model $\exp(\beta_0 + \beta_1 E + \beta_2 C)$, which induces the logistic model for equation (2). For the exponential model, the terms in equations (4) and (7) involving the cases disappear, and the remaining terms simplify to

$$I_{ij} = R(E, C)\, E^{\epsilon_{ij}} C^{\eta_{ij}}/[1 + R(E, C)]^2 \tag{8}$$

where $\epsilon_{ij} = \delta_{i1} + \delta_{j1}$, $\eta_{ij} = \delta_{i2} + \delta_{j2}$, and $\delta_{nm} = 1$ if n = m, 0 otherwise, and

$$I = RR(E_1)\, RR(E_0)\, (E_1 - E_0)^2/[RR(E_1) + RR(E_0)]^2. \tag{9}$$

Tables 7 and 8 present the results of numerical integration of equations (8) and (9) as evaluated for various combinations of $\rho_{EC}$, $\beta_1$ and $\beta_2$.
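The numerical integration of equations (8) and (9) can be illustrated with a short Python sketch. It is not the authors' program. It assumes unit-variance normal E and C, a simple equally spaced grid on [-7, 7], and that the nuisance parameter equals ln(n1/n0) minus the log of the integral of exp(b1 E + b2 C) f(E, C); the function names `var_unmatched_adjusted` and `var_pair_matched` are hypothetical.

```python
import math

def _grid(lo=-7.0, hi=7.0, n=57):
    """Equally spaced integration nodes and their spacing (assumed grid)."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)], step

def _f(e, c, rho):
    """Standard bivariate normal density with correlation rho."""
    z = (e * e - 2.0 * rho * e * c + c * c) / (1.0 - rho * rho)
    return math.exp(-0.5 * z) / (2.0 * math.pi * math.sqrt(1.0 - rho * rho))

def var_unmatched_adjusted(b1, b2, rho, n1, n0):
    """Asymptotic variance of the adjusted estimate of b1, via equation (8)."""
    pts, step = _grid()
    # log of the integral of exp(b1*E + b2*C) * f (Gaussian moment generating function)
    log_norm = 0.5 * (b1 * b1 + 2.0 * rho * b1 * b2 + b2 * b2)
    b0 = math.log(n1 / n0) - log_norm  # assumed form of the nuisance parameter
    info = [[0.0] * 3 for _ in range(3)]
    for e in pts:
        for c in pts:
            f = _f(e, c, rho)
            g = math.exp(b1 * e + b2 * c - log_norm) * f  # case density
            r = math.exp(b0 + b1 * e + b2 * c)
            w = (n1 * g + n0 * f) * r / (1.0 + r) ** 2 * step * step
            x = (1.0, e, c)
            for i in range(3):
                for j in range(3):
                    info[i][j] += w * x[i] * x[j]
    (a, b, c_), (_, d, e_), (_, _, f_) = info
    det = a * (d * f_ - e_ * e_) - b * (b * f_ - e_ * c_) + c_ * (b * e_ - d * c_)
    return (a * f_ - c_ * c_) / det  # 1,1-element of the inverse (rows b0, b1, b2)

def var_pair_matched(b1, b2, rho, n_pairs):
    """Asymptotic variance of the matched estimate of b1, via equation (9)."""
    pts, step = _grid()
    log_norm = 0.5 * (b1 * b1 + 2.0 * rho * b1 * b2 + b2 * b2)
    # g[i][k]: case density at (E_i, C_k); h[j][k]: control density of E_j given C_k
    g = [[math.exp(b1 * e + b2 * c - log_norm) * _f(e, c, rho) for c in pts] for e in pts]
    h = [[_f(e, c, rho) / (math.exp(-0.5 * c * c) / math.sqrt(2.0 * math.pi))
          for c in pts] for e in pts]
    info = 0.0
    for i, e1 in enumerate(pts):
        for j, e0 in enumerate(pts):
            rr1, rr0 = math.exp(b1 * e1), math.exp(b1 * e0)
            term = rr1 * rr0 * (e1 - e0) ** 2 / (rr1 + rr0) ** 2
            info += term * sum(gi * hj for gi, hj in zip(g[i], h[j]))
    return 1.0 / (n_pairs * info * step ** 3)
```

A ratio such as `var_unmatched_adjusted(0.5, 1.0, 0.5, 100, 100) / var_pair_matched(0.5, 1.0, 0.5, 100)` then gives one possible measure of the efficiency of pair matching relative to independent sampling with adjustment. When both coefficients are zero the two variances coincide, which provides a quick check on the integration.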
APPENDIX 2
Distribution of matched pairs with a binary exposure factor and a continuous matching variable
We assume that E is a binomial random variable. Using the notation in Appendix 1, we further assume that $RR(1, C) = RR_{DE}\, RR(0, C)$ and that RR(0, C) is a constant (i.e. $\gamma_{DC} = 0$), so that unmatched sampling and analysis yields unbiased results. Pair-matching controls to a random sample of cases will induce a correlation $\rho_{10}$ between case and control exposure levels, given by

$$\rho_{10} = \frac{m_{11} m_{00} - m_{10} m_{01}}{(c_1 c_0 m_1 m_0)^{1/2}}$$

where the left-hand term is the correlation as derived from the expected matched-pair Table 2(b). In general, $\rho_{10}$ depends in a complex manner on $\rho_{EC}$, $\beta_{DE}$, and $\gamma_{DC}$; in the case where $\beta_{DE} = 0$ and $\gamma_{DC} = 0$, it can be shown that $\rho_{10} = \rho_{EC}^2$; where $\rho_{EC} = 0$, then $\rho_{10} = 0$ for all values of $\beta_{DE}$ and $\gamma_{DC}$. Substituting as indicated in the footnote of Table 2(b) yields a quadratic equation in $m_{01}$. The positive root of this equation is the number of discordant pairs with the case unexposed, control exposed; $m_{10} = RR_{DE}\, m_{01}$ is the expected number of discordant pairs with the case exposed, control unexposed.

In simulating the case series, E was assumed to be binomially distributed with sample size N and probability parameter $RR_{DE}\, p_E/(RR_{DE}\, p_E + q_E)$. Two control series were generated in each trial. In the independent series, E was assumed to be binomial with sample size N and parameter $p_E$. The matched (correlated) series was generated using two binomials: in the first, generating controls for the exposed cases, E was binomial with $c_1$ trials and parameter $m_{11}/c_1$; in the second, generating controls for the unexposed cases, E was binomial with $c_0$ trials and parameter $m_{01}/c_0$.