The relative efficiencies of matched and independent sample designs for case-control studies


J Chron Dis Vol. 36, No. 10, pp. 685-697, 1983. 0021-9681/83 $3.00+0.00. Copyright © 1983 Pergamon Press Ltd. Printed in Great Britain. All rights reserved.


THE RELATIVE EFFICIENCIES OF MATCHED AND INDEPENDENT SAMPLE DESIGNS FOR CASE-CONTROL STUDIES

DUNCAN C. THOMAS¹* and SANDER GREENLAND²

¹Department of Epidemiology and Health, McGill University, 3775 University Street, Montreal, Quebec, Canada H3A 2B4 and ²Division of Epidemiology, School of Public Health, University of California at Los Angeles, Los Angeles, California, U.S.A.

(Received in revised form 30 March 1983)

Abstract-We have studied the asymptotic and small sample efficiencies of dependent (pair-matched or stratified) and independent samples as design techniques for case-control studies, and of matched, stratified, covariance-adjusted, and crude comparisons as methods of analysis. The asymptotic efficiencies of dependent sample designs relative to independent sample designs with adjustment were found to vary with the strengths of the relationships of disease with exposure and potential confounder: as the relationship with exposure increases, dependent samples lose efficiency; as the relationship with confounder increases, dependent samples gain efficiency. The relative efficiency also depends in a complicated manner on such other factors as the distribution of exposure and the strength of the exposure-confounder relationship. In the majority of situations examined, however, dependent samples were found to be somewhat more efficient than independent samples when confounding was present, while the reverse was true when confounding was absent. Results of small sample simulations do not differ importantly from the asymptotic results, except for pair-matching on a non-confounder, where the inefficiency of matching is greater in small samples.

INTRODUCTION

ONE OF THE major challenges in nonexperimental research is the control of the effects of confounding factors. To this end, a wide range of methodological techniques has evolved, including such design strategies as pair matching and stratified sampling, and such analytical methods as stratified analysis and covariance adjustment. Early efforts [1-4] to provide some statistical basis for choosing between these methods mainly considered prospective studies and the relative capability of various strategies to eliminate bias. Miettinen [5, 6] was the first to point out that another important issue is efficiency (the precision of the estimated parameter) and, in an intuitive way, that efficiency considerations are quite different in cohort and case-control studies. He argued that, whereas in cohort studies matching reduces the residual variance, thereby allowing the exposure effect to be more easily detected, in case-control studies it will tend to restrict the variability of the exposure variable, thereby making it more difficult for the exposure effect to appear. This observation applies as well to statistical adjustment, as demonstrated by Day et al. [7]. Miettinen compared matching with independent sampling when no confounding was present but did not examine the case in which some form of adjustment was necessary to control bias. The latter situation has recently been addressed by a number of authors [8-12]; most of these considered the large sample case with binary exposure, disease, and potentially confounding variables. This case is particularly instructive because

*Author to whom correspondence should be addressed. Portions of this paper were presented at the 109th Annual Meeting of the American Public Health Association, Los Angeles, November, 1981. This research was supported in part by a grant from the National Cancer Institute of Canada and in part by Grant No. R01-CA-16042 from the National Cancer Institute of the United States.


the problem can be characterized by a minimum number of parameters and the asymptotic efficiencies easily derived. So far, only McKinlay [12] has compared pair matching (on a continuous covariate) with adjustment, and only Monte Carlo simulation was found to be feasible for her study. In this paper, we briefly review these asymptotic results for binary exposure (E), disease (D) and potentially confounding (C) variables and then describe a new Monte Carlo simulation of the small sample case. Next we consider the problem of pair matching on a continuous covariate and develop an asymptotic expression for its efficiency relative to logistic adjustment; small sample comparisons of pair-matching are limited to a binary exposure variable with confounding absent. Throughout, our comparisons are restricted to the situation of equal numbers of cases and controls (one-to-one matching in the later sections). First, however, it is instructive to consider in theoretical terms what the effect of matching or stratification in design is expected to be.

A PRIORI CONSIDERATIONS

The fundamental characteristic of case-control matching is that it induces a constant ("balanced") case-control ratio across the strata of the matching factor. For example, one-to-one matching of controls to cases on the factor C will ensure that the number of controls will equal the number of cases within each stratum of C. If stratification on C will be necessary in the analysis, then we should expect this "balancing" effect of matching to improve the efficiency of the study relative to independent sampling, for in the latter design, some strata may be left with unbalanced numbers of cases and controls; the extent of this improvement will depend on the degree of association of C and D in the source population. On the other hand, matching on a factor associated with exposure increases the correlation (concordance) of the exposure histories of cases and controls, thus increasing the variability of the estimated association between E and D [5, 6]. Nevertheless, this increase in variance will translate into a reduction in efficiency only when compared against independent samples without adjustment; when some form of statistical adjustment is needed to control confounding, the variability of this adjusted estimate will also increase as the association between E and C increases, so that the relative efficiency of the two designs will be difficult to predict. Thus, all other things being constant, the more strongly C is related to disease, the more efficient the matched design will be relative to the unmatched design, while the more strongly C is related to exposure, the larger the variance of both the matched design and the unmatched design with adjustment will be. The asymptotic comparisons presented below confirm these predictions, though the magnitudes of the effects vary depending on sample size and other factors.

STRATIFICATION

Asymptotic relations

In order to provide a basis for comparison with the new results, we review first the situation of binary exposure, disease, and potentially confounding variables. The basic design issue is whether to use stratified or independent samples of controls, and in either case, the basic analytic issue is whether to use a collapsed or stratified analysis; in the stratified sample case, a further analytic issue is whether to retain any one-to-one pairing that may have been done. The five basic strategies to be compared are therefore: (IC) independent samples with a collapsed analysis; (IS) independent samples with a stratified analysis; (SC) stratified samples with a collapsed analysis; (SS) stratified samples with a stratified analysis; and (SM) stratified samples incorporating pair matching in the analysis. In this section we consider the asymptotic bias and efficiency of each of these strategies as functions of the underlying relationships between the three factors in the population. Thus we begin by calculating the expected outcomes for each set of relationships and then


compare the expected estimates and asymptotic variances of the log odds ratio for the exposure-disease relationship resulting from each of the five strategies. The relationships described in this section agree with those given by Smith and Day [9] except that we find it more convenient to interchange the order of specifying the frequencies of E and C in the nondiseased population. The analytic results are not affected, but we prefer to think of C as the determinant of E because factors which are intermediate links in a pathway from E to D are not ordinarily considered to be true confounders [13]. Thus, let $p_C = 1 - q_C$ denote the prevalence of C in the nondiseased population and let $p_i = 1 - q_i$ denote the prevalence of E in the nondiseased subpopulations with C absent ($i = 0$) and present ($i = 1$). Then we let $p_E = p_C p_1 + q_C p_0$ be the overall prevalence of exposure and $OR_{EC} = p_1 q_0 / (p_0 q_1)$ be the odds ratio for the relationship between E and C in the population. Next we assume that the incidence of disease follows a multiplicative dependence on E and C, so that the rate ratio for the exposure effect is $RR_{DE} = \mathrm{rate}(E\bar{C})/\mathrm{rate}(\bar{E}\bar{C}) = \mathrm{rate}(EC)/\mathrm{rate}(\bar{E}C)$ and for the potential confounder is $RR_{DC} = \mathrm{rate}(\bar{E}C)/\mathrm{rate}(\bar{E}\bar{C}) = \mathrm{rate}(EC)/\mathrm{rate}(E\bar{C})$, neither depending on the level of the other factor. Assuming either a rare disease or incidence density sampling [14], the expected distributions of cases $c_{ij}$, independent controls $u_{ij}$, and stratified controls $s_{ij}$ ($i$ = level of C and $j$ = level of E) are then given in Table 1. Note that the expected distribution of independent controls is merely the population distribution, while in the stratified series, the same number of controls as cases is drawn in each confounder stratum ($s_{i\cdot} = c_{i\cdot}$, where the point ($\cdot$) denotes summation over the indicated index). The investigator may find it convenient to pair these cases and controls together, for example by selecting a control with the same value of C who was admitted on the same day.
Assuming that the criteria determining such pairing were effectively random, i.e. not related to either E or D, then the expected distribution of the case-control pairs is given

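TABLE 1. EXPECTED DISTRIBUTIONS OF CASES, INDEPENDENT CONTROLS AND STRATIFIED CONTROLS (AS PROPORTIONS OF THE SAMPLE SIZE)

Confounder (i)   Exposure (j)   Cases c_ij                  Independent controls u_ij   Stratified controls s_ij
Absent (0)       Absent (0)     q_C q_0 / z                 q_C q_0                     c_0. q_0
Absent (0)       Present (1)    q_C p_0 RR_DE / z           q_C p_0                     c_0. p_0
Present (1)      Absent (0)     p_C q_1 RR_DC / z           p_C q_1                     c_1. q_1
Present (1)      Present (1)    p_C p_1 RR_DE RR_DC / z     p_C p_1                     c_1. p_1
Total                           1                           1                           1

where z = q_C (q_0 + p_0 RR_DE) + p_C RR_DC (q_1 + p_1 RR_DE).

TABLE 2. EXPECTED DISTRIBUTIONS OF CASE-CONTROL PAIRS

(A) Stratum matching on factor C

                        Exposure of the matched control
Exposure of the case    Present                  Absent                   Total
Present                 m_11 = Σ_i c_i1 p_i      m_10 = Σ_i c_i1 q_i      c_.1
Absent                  m_01 = Σ_i c_i0 p_i      m_00 = Σ_i c_i0 q_i      c_.0
Total                   s_.1                     s_.0

where c_ij and s_ij are as defined in Table 1.

(B) One-to-one matching on a factor inducing correlation ρ_10 of case-control exposure status

                        Exposure of the matched control
Exposure of the case    Present                  Absent                   Total
Present                 m_11 = c_1 - RR_DE m_01  m_10 = RR_DE m_01        c_1 = RR_DE p_E / z'
Absent                  m_01                     m_00 = c_0 - m_01        c_0 = q_E / z'

where z' = RR_DE p_E + q_E and m_01 is the positive root of the quadratic equation (1) in Appendix 2.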

TABLE 3. FORMULAE FOR COMPETING ESTIMATES OF ODDS RATIOS AND THEIR ASYMPTOTIC VARIANCES

(IC) Independent samples, collapsed analysis
    Estimated odds ratio*:  c_.1 u_.0 / (c_.0 u_.1)
    Asymptotic variance†:   1/c_.1 + 1/c_.0 + 1/u_.1 + 1/u_.0

(IS) Independent samples, stratified analysis
    Estimated odds ratio*:  Σ_i [c_i1 u_i0 / (c_i. + u_i.)] / Σ_i [c_i0 u_i1 / (c_i. + u_i.)]
    Asymptotic variance†:   1 / Σ_i (1/U_i), where U_i = 1/c_i1 + 1/c_i0 + 1/u_i1 + 1/u_i0

(SC) Stratified samples, collapsed analysis
    Estimated odds ratio*:  c_.1 s_.0 / (c_.0 s_.1)
    Asymptotic variance†:   1/c_.1 + 1/c_.0 + 1/s_.1 + 1/s_.0

(SS) Stratified samples, stratified analysis
    Estimated odds ratio*:  Σ_i (c_i1 s_i0 / c_i.) / Σ_i (c_i0 s_i1 / c_i.)
    Asymptotic variance†:   1 / Σ_i (1/S_i), where S_i = 1/c_i1 + 1/c_i0 + 1/s_i1 + 1/s_i0

(SM) Stratified samples, matched analysis
    Estimated odds ratio*:  m_10 / m_01
    Asymptotic variance†:   1/m_10 + 1/m_01

*Formulae used in small sample simulations (not required for asymptotic comparisons).
†Formulae used in asymptotic comparisons (not required for small sample simulations). Note that for IS and SS, the asymptotic variance is that of the unconditional maximum likelihood estimator of the common odds ratio, not that of the Mantel-Haenszel estimator given in the previous column.

in Table 2(a). (The situation of unique one-to-one pairing on true, continuous confounders is addressed later.) The formulae for the odds ratio estimator of $RR_{DE}$ and the asymptotic variance of $\ln RR_{DE}$ under the various strategies are provided in Table 3. Only the asymptotic variances (column 4) are needed for the efficiency comparisons in this section. In the stratified analyses, the variance is that of the empirical logit or the unconditional or conditional maximum likelihood estimators, all of which are asymptotically equivalent and fully efficient under this model [15]. Then we define the asymptotic relative efficiency, $ARE_{x:y}$, of strategy x relative to strategy y as $\mathrm{var}_y/\mathrm{var}_x$. (In so doing we have followed general practice, though in practical terms the relevant scale of comparison is the standard error rather than the variance; AREs for this scale can be obtained from the tabulated values by taking square roots and will always be closer to 100%.) The asymptotic variances and AREs are tabulated for various choices of the parameters $p_E$, $p_C$, $OR_{EC}$, $RR_{DE}$ and $RR_{DC}$ in Tables 4 and 5. The nomenclature in Table 4 follows Miettinen [6], except that we have distinguished the case $RR_{DC} = OR_{EC} = 1$ by the name "irrelevance".

Results. Table 4 describes the asymptotic variances for the eight fundamentally different situations in which $OR_{EC}$, $RR_{DE}$ and $RR_{DC}$ are either "present" or "absent". The value 4 was arbitrarily selected to illustrate "presence" and $p_E$ and $p_C$ are fixed at 0.5 for each; differences between methods vary in a complicated manner with the various parameters, as explored in greater detail in Table 5.

TABLE 4. ASYMPTOTIC VARIANCES OF LOG ODDS RATIOS (× N) FROM STRATIFIED AND INDEPENDENT SAMPLES FOR CASE-CONTROL STUDIES WITH MATCHED, STRATIFIED AND UNADJUSTED ANALYSES

Strength of relationships†              Stratified samples, analysis:                  Independent samples, analysis:
RR_DE  OR_EC  RR_DC  Nomenclature       Matched (SM)  Stratified (SS)  Collapsed (SC)  Stratified (IS)  Collapsed (IC)

Null hypothesis
1      1      1      Irrelevance         8.00          8.00             8.00            8.00             8.00
1      1      4      Futility            8.00          8.00             8.00            8.89             8.00
1      4      1      Overmatching        9.00          9.00             8.00            9.00             8.00
1      4      4      Confounding         9.00          9.00             8.33            9.99             8.17*

Alternative hypothesis
4      1      1      Irrelevance        12.50         10.25            10.25           10.25            10.25
4      1      4      Futility           12.50         10.25            10.25           11.27            10.25
4      4      1      Overmatching       14.06         11.70            10.27*          11.36            10.25
4      4      4      Confounding        15.75         13.43            12.41*          13.75            12.17*

*Expected odds ratio is biased.
†p_E = p_C = 0.5 for all situations.
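As a minimal sketch of how the quantities defined above fit together, the following Python code computes the expected cell proportions (cases, independent controls, stratified controls) and the five asymptotic variances, scaled by N. The function names are hypothetical, and the variance expressions are the standard inverse-cell-count forms (collapsed, stratified and matched-pair log odds ratios), assumed here to correspond to Table 3; the sketch also assumes $OR_{EC} \ge 1$ when solving for $p_0$ and $p_1$.

```python
from math import sqrt


def exposure_prevalences(p_E, p_C, or_EC):
    """Return (p_0, p_1): exposure prevalence among the nondiseased with C absent / present."""
    if or_EC == 1.0:
        return p_E, p_E
    q_C = 1.0 - p_C
    # p_C*p_1 + q_C*p_0 = p_E with odds(p_1) = or_EC * odds(p_0) is a quadratic in p_0.
    a = q_C * (or_EC - 1.0)
    b = p_C * or_EC + q_C - p_E * (or_EC - 1.0)
    p_0 = (-b + sqrt(b * b + 4.0 * a * p_E)) / (2.0 * a)  # positive root (assumes or_EC > 1)
    p_1 = or_EC * p_0 / (1.0 + (or_EC - 1.0) * p_0)
    return p_0, p_1


def asymptotic_variances(p_E, p_C, or_EC, rr_DE, rr_DC):
    """Asymptotic variances (times N) of the log odds ratio under SM, SS, SC, IS and IC."""
    p = exposure_prevalences(p_E, p_C, or_EC)
    pc = (1.0 - p_C, p_C)
    idx = (0, 1)
    # Index [i][j]: i = level of C, j = level of E (0 = absent, 1 = present).
    u = [[pc[i] * (p[i] if j else 1.0 - p[i]) for j in idx] for i in idx]  # population
    w = [[u[i][j] * rr_DC ** i * rr_DE ** j for j in idx] for i in idx]
    z = sum(map(sum, w))
    c = [[w[i][j] / z for j in idx] for i in idx]  # cases
    # Stratified controls: as many controls as cases in each stratum of C.
    s = [[sum(c[i]) * (p[i] if j else 1.0 - p[i]) for j in idx] for i in idx]

    def inv_sum(*cells):
        return sum(1.0 / x for x in cells)

    def margin(t, j):
        return t[0][j] + t[1][j]

    def collapsed(t):
        return inv_sum(margin(c, 1), margin(c, 0), margin(t, 1), margin(t, 0))

    def stratified(t):
        return 1.0 / sum(1.0 / inv_sum(c[i][1], c[i][0], t[i][1], t[i][0]) for i in idx)

    # Randomly paired within strata: discordant pair proportions.
    m10 = sum(c[i][1] * (1.0 - p[i]) for i in idx)
    m01 = sum(c[i][0] * p[i] for i in idx)
    return {"SM": 1.0 / m10 + 1.0 / m01, "SS": stratified(s), "SC": collapsed(s),
            "IS": stratified(u), "IC": collapsed(u)}


if __name__ == "__main__":
    v = asymptotic_variances(0.5, 0.5, 4.0, 1.0, 4.0)  # "confounding" under the null
    print({k: round(x, 2) for k, x in v.items()})
    print("ARE of IS relative to SS:", round(100 * v["SS"] / v["IS"]))
```

The ratio `v["SS"] / v["IS"]` is the efficiency of independent relative to stratified sampling on the variance scale; taking its square root gives the comparison on the standard error scale.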

In the case of irrelevance, all methods are unbiased and equally efficient except matched analysis under the alternative hypothesis. This would seem to be contrary to the conventional wisdom that in the paired analyses, the "concordant pairs" $m_{11}$ and $m_{00}$ contribute no information. The latter view is correct only if cases have been individually matched with a uniquely "best" control, such as a sib or nearest available match on a number of strong confounders, as discussed below. In the present case of stratum matching, the particular pairing is but one of a large number of possible ones, so that to restrict the analysis to the small number of discordant pairs is wasteful of information [16]. The same results apply in the case of futility except that stratified analysis is less efficient than crude analysis for independent samples. In either case, if C is unrelated to E, it makes no difference to efficiency whether the sampling design is stratified on C.

In the case of overmatching, only collapsed analysis of independent samples (IC) is fully efficient if $RR_{DE} \neq 1$. Of all the situations, the stratified design is the least efficient here, particularly with a matched analysis. This case also illustrates two important differences between case-control and cohort studies in the implications of a stratified design. First, in case-control studies it is not sufficient merely to stratify the design for a potential confounder; the analysis must also be stratified [17]. Indeed, in case-control studies, an unstratified analysis of a stratified design will lead to a biased estimate of $RR_{DE}$ unless $OR_{EC} = 1$ or $RR_{DE} = 1$. Second, sampling stratified on a factor which is related to E but not to D will lead to a less precise estimate in a case-control study but not in a cohort study [5, 11]. Nevertheless, it will not cause the odds ratio to be underestimated unless the incorrect unstratified analysis is done. (These differences between case-control and cohort studies reflect the choice of parameter rather than the basic design; if an odds ratio were estimated from a cohort study, the results would be identical to those presented here for case-control studies.)

In the case of confounding, a stratified analysis is necessary for validity, whether or not the design is stratified. As in the other cases, a matched analysis is less efficient than a stratified one, so hereafter we restrict our attention to comparisons of stratified and independent sample designs using only stratified analyses. The relative efficiency of the two designs varies in a complicated manner with all of the five parameters, as shown in Table 5. The strongest and most consistent effects are the following:

$p_E$: Stratified sampling becomes more efficient as $p_E$ becomes smaller, particularly if $OR_{EC}$ or $RR_{DC}$ are large.
$RR_{DC}$: Stratified sampling becomes more efficient as $RR_{DC}$ gets larger, particularly if $p_E$ is small.
$RR_{DE}$: Stratified sampling becomes less efficient as $RR_{DE}$ gets larger, particularly if $OR_{EC}$ is also large and $p_E$ and $p_C$ are similar.
$OR_{EC}$: Stratified sampling becomes less efficient as $OR_{EC}$ gets larger, provided $p_E$ is large; otherwise the effect of $OR_{EC}$ is unpredictable.

TABLE 5. ASYMPTOTIC RELATIVE EFFICIENCIES (%) OF INDEPENDENT RELATIVE TO STRATIFIED SAMPLES: CASE OF CONFOUNDING ANALYZED BY STRATIFICATION

                        RR_DE = 2              RR_DE = 8
                        RR_DC =                RR_DC =
p_E    p_C    OR_EC     2      4      8        2      4      8
0.2    0.2    2         95     85     72       97     89     77
0.2    0.2    4         93     81     68       99     91     79
0.2    0.2    8         91     79     66      104     97     85
0.2    0.5    2         94     86     78       96     89     81
0.2    0.5    4         91     83     76       94     87     81
0.2    0.5    8         86     80     74       91     85     80
0.5    0.2    2         99     93     82      101     96     87
0.5    0.2    4        102     99     89      104    103     96
0.5    0.2    8        106    106     99      109    110    106
0.5    0.5    2         98     92     84      100     96     91
0.5    0.5    4        100     94     86      105    101     95
0.5    0.5    8        101     95     88      112    107    100


The effects of $RR_{DC}$ and $OR_{EC}$ are consistent with the a priori considerations discussed above. On balance, most of the combinations of parameters considered here favor a stratified design. The potential loss in efficiency through stratification was rarely more than 10% but the potential gain was sometimes more than 20%.

Small sample simulations

Many of the parameter choices listed in Table 5 lead to one or more cells having very small probabilities (e.g. < 5%) so that for many realistic sample sizes, very small or even empty cells may frequently arise. Thus it is reasonable to wonder if the asymptotic results presented above and in Refs [7-11] would be applicable to small samples. To investigate this question, we carried out a Monte Carlo simulation. The $c_{ij}$ and $u_{ij}$ were assumed to be multinomially distributed with sample size N and probabilities given in Table 1, and the $s_{ij}$ to be binomially distributed with sample sizes $c_{i\cdot}$ and conditional probabilities $p_i$. The empirical variances of the estimators listed in the third column of Table 3 were then tabulated. (Again, note that in the stratified analyses, the Mantel-Haenszel estimator was used, rather than the more complex maximum likelihood estimators.) Standard errors of the simulated relative efficiencies were estimated by arranging the trials into batches of 200 each. On this basis, a total of 2000 trials was found to be needed to reduce the standard error to under 5%. As there was no systematic pattern to the standard errors, this number was used for all parameter choices and the pooled estimate of standard error of any particular choice was 3.9%. The computing cost forced us to restrict our consideration to a relatively limited choice of parameters, as given in Table 6. Though the standard errors for individual parameter choices may seem large, when considering the entire set of observations, all of the trends which were apparent in the asymptotic results were also highly significant in the simulated results.
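The sampling scheme just described can be sketched as follows. This is an illustrative reading, not the authors' program: the function names are hypothetical, the same simulated case series is reused for both designs, and trials in which the Mantel-Haenszel estimator is zero or undefined are simply skipped, which is an assumption the text does not address.

```python
import math
import random
import statistics


def cell_probabilities(p_C, p, rr_DE, rr_DC):
    """Case and population cell probabilities, indexed [i][j] (i = C level, j = E level)."""
    pc = (1.0 - p_C, p_C)
    u = [[pc[i] * (1.0 - p[i]), pc[i] * p[i]] for i in (0, 1)]
    w = [[u[i][j] * rr_DC ** i * rr_DE ** j for j in (0, 1)] for i in (0, 1)]
    z = sum(map(sum, w))
    c = [[x / z for x in row] for row in w]
    return c, u


def draw_counts(rng, n, probs):
    """One multinomial draw of n subjects over the four (C, E) cells."""
    flat = [probs[i][j] for i in (0, 1) for j in (0, 1)]
    counts = [[0, 0], [0, 0]]
    for k in rng.choices(range(4), weights=flat, k=n):
        counts[k // 2][k % 2] += 1
    return counts


def mh_log_or(cases, controls):
    """Log of the Mantel-Haenszel odds ratio over strata; None if it is undefined."""
    num = den = 0.0
    for case, ctrl in zip(cases, controls):
        n = sum(case) + sum(ctrl)
        if n == 0:
            continue
        num += case[1] * ctrl[0] / n  # exposed cases x unexposed controls
        den += case[0] * ctrl[1] / n  # unexposed cases x exposed controls
    if num <= 0.0 or den <= 0.0:
        return None
    return math.log(num / den)


def simulate_relative_efficiency(n, trials, p_C, p, rr_DE, rr_DC, seed=0):
    """Empirical var(SS) / var(IS): efficiency of independent relative to stratified samples."""
    rng = random.Random(seed)
    c_prob, u_prob = cell_probabilities(p_C, p, rr_DE, rr_DC)
    log_is, log_ss = [], []
    for _ in range(trials):
        cases = draw_counts(rng, n, c_prob)
        independent = draw_counts(rng, n, u_prob)
        stratified = []
        for i in (0, 1):
            n_i = sum(cases[i])  # as many controls as cases in stratum i
            exposed = sum(rng.random() < p[i] for _ in range(n_i))
            stratified.append([n_i - exposed, exposed])
        a = mh_log_or(cases, independent)
        b = mh_log_or(cases, stratified)
        if a is not None and b is not None:
            log_is.append(a)
            log_ss.append(b)
    return statistics.variance(log_ss) / statistics.variance(log_is)


if __name__ == "__main__":
    # "Futility": C related to D (RR_DC = 4) but not to E; p = (p_0, p_1).
    print(simulate_relative_efficiency(200, 2000, 0.5, (0.5, 0.5), 1.0, 4.0, seed=1))
```

With a few thousand trials the ratio fluctuates by several percent from run to run, which is in line with the standard errors quoted in the text.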
We looked for patterns in the deviations by using a multiple regression of the simulated on the asymptotic relative efficiencies. At a sample size of 200 cases and 200 controls, no significant patterns were apparent, but at a sample size of 50 cases and 50 controls, there was a small but highly significant tendency for the simulated findings to be more extreme than the asymptotic ones. Though the pattern is difficult to see from the individual cell values, the fitted relative

TABLE 6. ASYMPTOTIC AND SIMULATED RELATIVE EFFICIENCIES (%) OF INDEPENDENT RELATIVE TO STRATIFIED SAMPLES, STRATIFIED ANALYSIS

                             RR_DE = 1                   RR_DE = 4
                             RR_DC =                     RR_DC =
p_E = p_C   OR_EC   N*       1     2     4     8         1     2     4     8
0.20        1.00    ∞       100    98    90    78       100    98    91    79
                    200     102    97    94    74        95    95    96    73
                    50      101   102    84    72       103    93    81    73
0.20        2.00    ∞       100    95    84    71       101    96    87    74
                    200      94    93    80    69        99    94    89    76
                    50      102    94     ‡    63        90    95    89    65
0.20        4.00    ∞       100    92    80    66       100    95    85    72
                    200     106    88    78    63       103    97    85    74
                    50      101    87    79    60       101    91    84    71
0.20        8.00    ∞       100    91    77    63       100    95    85    73
                    200      93    86     ‡    62        98   100    81    75
                    50      106    89     ‡    57       104    94    83     ‡
0.50        1.00    ∞       100    97    90    82       100    97    91    84
                    200      99    98    92    78       102    99    93    80
                    50       99    97    85    77        97    96    83    83
0.50        2.00    ∞       100    97    90    82       101   100    94    87
                    200      98    99    96    76       100   100    93    90
                    50      100    92    93    78       108   101    95    83
0.50        4.00    ∞       100    97    90    82       103   103    98    91
                    200       ‡    99    93    89       104    99   100    89
                    50      100    93    89    81       107    98   106    94
0.50        8.00    ∞       100    97    90    82       106   108   102    94
                    200     102    97    88    84       108   108   100    88
                    50      101   103    92    82       107   114    94    95

*N = number of cases = number of controls; each estimate based on 2000 simulations, estimated standard error = 3.9.
‡Entry illegible in the source.

TABLE 7. SUMMARY OF ASYMPTOTIC-SIMULATED VARIANCE AND RELATIVE EFFICIENCY COMPARISONS FOR STRATUM MATCHING: CASES OF IRRELEVANCE AND OVERMATCHING

                                                        N = number of cases = number of controls
Simulated:asymptotic parameter*                         200        50

Variances (× N) of ln RR_DE
  SM: stratified samples, matched analysis              1.042      1.122
  SS: stratified samples, stratified analysis           1.010      1.093
  IS: independent samples, stratified analysis          1.014      1.087
  IC: independent samples, collapsed analysis           0.988      1.063

Relative efficiency
  IS:SS                                                 1.002      1.011
  IC:SM                                                 1.060      1.061

*Numbers in the table are the geometric means of the ratio of the simulated to asymptotic variances or relative efficiencies, over eight parameter choices, RR_DE = 1, 4 and OR_EC = 1, 2, 4, 8, each with 1000 simulations (estimated SEM = 0.018).

efficiencies range from 64 at an ARE of 70 to 112 at an ARE of 110. Thus, the small sample results entirely confirm the asymptotic comparisons, with no major differences in either direction or magnitude.

For comparison with the case of pair-matching on a continuous covariate discussed below, we carried out an additional simulation of the two estimators considered there, SM (matched pairs estimator for the stratified design) and IC (collapsed estimator, unstratified design), in the situations where both were valid, i.e. $RR_{DC} = 1$. Table 7 shows that the simulated variances were generally larger than the asymptotic ones, but the differences were largest for strategy SM and smallest for IC, as might be expected based on the expected cell sizes. Though the simulated efficiencies of IS relative to SS were not substantially different from their AREs, the simulated efficiencies of IC relative to SM were, on average, about 6% larger than their AREs. Thus, convergence to asymptotic relations might require larger total sample sizes in the matched case than in the stratified case.

ONE-TO-ONE MATCHING

Asymptotic relations

In our preliminary studies, we tried to approximate the case of one-to-one matching by increasing the number of levels of C and found that the differences between the five strategies become much less pronounced. (Similar findings were reported by Thompson et al. [10].) This approach is cumbersome, however, because of the arbitrariness in specifying the distribution of C and the dependence of E and D on C, and because as the number of categories of C increases, some form of parametric adjustment for C (e.g. logistic regression) may become more efficient than stratification. While multi-level categories can also arise from combinations of several binary confounders, the same difficulties in comparing the efficiencies of designs arise. We have therefore developed a more general approach to the problem, described in detail in Appendix 1, in which we allow E and C to have any joint distribution, continuous or discrete, each factor being either univariate or multivariate. We then assume some functional dependence of risk on E and C and compare the asymptotic variances of the estimates of the coefficient(s) of E in that model under the following three strategies: (MM) paired samples, exactly matched on C, with a conditional likelihood analysis; (IA) independent samples, with an unconditional likelihood analysis which includes both a constant nuisance parameter and one for C in the model; and (IC) independent samples, with an unconditional likelihood analysis which includes only a constant nuisance parameter but ignores C. Thus, except under IC, we assume any confounding effects have been eliminated either by perfect matching or by perfect adjustment. The asymptotic relative efficiency is then


computed by comparing the inverses of the second derivatives of the log likelihood functions, evaluated at the true parameter values. Appendix 1 provides the general theory, applicable to any distribution of E and C and any form of dependence of D on E and C, together with computational formulae for the special case where E and C have a bivariate normal distribution with correlation $\rho_{EC}$ and incidence rates depending on E and C according to the relation $RR(E, C) = \exp(\alpha + \beta_{DE} E + \gamma_{DC} C)$, where $\alpha$ is a nuisance parameter. Tables 8 and 9 provide the asymptotic relative efficiencies for various choices of these three parameters, calculated by numerical integration. Dichotomizing a bivariate normal population at the mean of E and C will produce an odds ratio $OR_{EC} = 4$ if $\rho_{EC} = 0.5$. Similarly the values $\beta_{DE} = \gamma_{DC} = 1.0$ will produce rate ratios $RR_{DE} = RR_{DC} = 4$ when comparing the 25th and 75th percentiles of the marginal distributions of E and C. Thus, these parameter values are roughly comparable in magnitude to those given in Table 4, and the parameter values in Table 9 were chosen in a similar manner to be comparable to those in Table 5. The patterns in Table 8 are generally very similar to those in columns SM, IS and IC of Table 4, but to the extent that the parameter choices are comparable at all, the differences between methods seem to be more pronounced for continuous matching factors. Similarly, comparing Tables 5 and 9, very similar patterns emerge, though the differences are much larger for continuous matching factors. Thus in Table 9, independent samples with adjustment become more efficient as $\beta_{DE}$ gets larger (as for $RR_{DE}$ in Table 4) but less efficient as either $\gamma_{DC}$ or $\rho_{EC}$ get larger; the effect of $\gamma_{DC}$ is comparable to that of $RR_{DC}$ in Table 4, but the corresponding effect of $OR_{EC}$ in Table 5 was quite inconsistent.
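The two calibration statements above can be checked numerically. The sketch below uses the standard orthant probability for a bivariate normal distribution, $P(E > 0, C > 0) = 1/4 + \arcsin(\rho)/(2\pi)$, and the interquartile width of a standard normal distribution; the function names are hypothetical and unit variances for E and C are assumed.

```python
import math
from statistics import NormalDist


def dichotomized_odds_ratio(rho):
    """Odds ratio of the 2x2 table obtained by cutting a standard bivariate normal at its means."""
    p_concordant = 0.25 + math.asin(rho) / (2.0 * math.pi)  # both above (or both below) the mean
    p_discordant = 0.5 - p_concordant                       # one above, one below
    return (p_concordant * p_concordant) / (p_discordant * p_discordant)


def interquartile_rate_ratio(coefficient):
    """Rate ratio comparing the 75th with the 25th percentile of a standard normal covariate."""
    nd = NormalDist()
    width = nd.inv_cdf(0.75) - nd.inv_cdf(0.25)
    return math.exp(coefficient * width)


if __name__ == "__main__":
    print(dichotomized_odds_ratio(0.5))   # 4, up to floating point error
    print(interquartile_rate_ratio(1.0))  # a little under 4
```

The first value is 4 up to rounding error, while the second is somewhat below 4, consistent with the text's description of the parameter values as only roughly comparable.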

TABLE 8. ASYMPTOTIC VARIANCES (× N) OF β_DE FROM PAIR-MATCHED AND INDEPENDENT SAMPLES FOR CASE-CONTROL STUDIES WITH CONDITIONAL, UNCONDITIONAL ADJUSTED AND UNCONDITIONAL UNADJUSTED ANALYSES

Exposure-      Logistic disease risk                        Pair matched     Independent samples,
confounder     coefficients                                 samples,         unconditional analysis
correlation    Exposure    Confounder                       conditional      adjusted       unadjusted
ρ_EC           β_DE        γ_DC         Nomenclature        analysis (MM)    for C (IA)     (IC)

Null hypothesis
0.0            0.0         0.0          Irrelevance         2.00             2.00           2.00
0.0            0.0         1.0          Futility            2.00             2.52           2.00
0.5            0.0         0.0          Overmatching        2.67             2.67           2.00
0.5            0.0         1.0          Confounding         2.67             3.35           2.25*

Alternative hypothesis
0.0            1.0         0.0          Irrelevance         4.27             3.05           3.05
0.0            1.0         1.0          Futility            4.27             3.67           3.05
0.5            1.0         0.0          Overmatching        4.84             3.88           3.05
0.5            1.0         1.0          Confounding         4.85             5.62           4.60*

*Expected parameter estimate is biased.

TABLE 9. ASYMPTOTIC RELATIVE EFFICIENCIES (%) OF INDEPENDENT SAMPLES WITH ADJUSTMENT (IA) RELATIVE TO PAIR-MATCHED SAMPLES (MM)

Logistic             Exposure-confounder      Logistic disease-confounder coefficient, γ_DC
disease-exposure     correlation
coefficient β_DE     coefficient ρ_EC         0.5       1.0       1.5
0.5                  0.30                     101        84        64
0.5                  0.50                      96        78        60
0.5                  0.67                      91        73        55
1.0                  0.30                     121       100        71
1.0                  0.50                     108        86        64
1.0                  0.67                      94        73        54
1.5                  0.30                     152       125        95
1.5                  0.50                     126        98         ‡
1.5                  0.67                     100        75        54

‡Entry illegible in the source.
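For the pair-matched strategy (MM) in the bivariate normal case, the asymptotic variance can be sketched by one-dimensional numerical integration. The derivation used here is an assumption, since Appendix 1 is not reproduced in this section: with exact matching on C and a rare disease, the case-control exposure difference $d$ is taken to be normal with mean $\beta_{DE}(1 - \rho_{EC}^2)$ and variance $2(1 - \rho_{EC}^2)$, and the conditional likelihood information per pair is $E[d^2 \pi (1 - \pi)]$ with $\pi = 1/(1 + e^{-\beta_{DE} d})$. The function name and integration settings are hypothetical.

```python
import math


def mm_asymptotic_variance(beta_DE, rho_EC, step=0.01, z_max=8.0):
    """Asymptotic variance (times the number of pairs) of the conditional estimate of beta_DE.

    Assumes unit-variance bivariate normal E and C, exact matching on C and a rare disease.
    """
    resid_var = 1.0 - rho_EC * rho_EC       # variance of E given C
    mean_d = beta_DE * resid_var            # mean of the case-minus-control exposure difference
    sd_d = math.sqrt(2.0 * resid_var)
    n_steps = int(round(2.0 * z_max / step))
    info = 0.0
    for k in range(n_steps + 1):
        z = -z_max + k * step
        d = mean_d + sd_d * z
        pi = 1.0 / (1.0 + math.exp(-beta_DE * d))  # P(the higher-exposed member is the case)
        info += d * d * pi * (1.0 - pi) * math.exp(-0.5 * z * z)
    info *= step / math.sqrt(2.0 * math.pi)        # Riemann sum against the standard normal density
    return 1.0 / info


if __name__ == "__main__":
    for beta, rho in [(0.0, 0.0), (0.0, 0.5), (1.0, 0.0), (1.0, 0.5)]:
        print(beta, rho, round(mm_asymptotic_variance(beta, rho), 2))
```

Under this sketch the result does not depend on $\gamma_{DC}$, so the irrelevance and futility rows coincide, as do the overmatching and confounding rows; the printed values can be compared with the MM column of Table 8.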

TABLE 10. ASYMPTOTIC AND SIMULATED RELATIVE EFFICIENCIES (%) OF INDEPENDENT SAMPLES WITHOUT ADJUSTMENT RELATIVE TO PAIR MATCHING IN THE ABSENCE OF CONFOUNDING: CASE OF IRRELEVANCE OR OVERMATCHING (γ_DC = 0)

                   Asymptotic               Simulated†
                   RR_DE =                  RR_DE =
p_E     ρ_10*      1      2      4          1      2      4
0.1     0.0        100    103    114        105    106    136
0.1     0.2        125    123    134        126    149    142
0.1     0.4        167    157    169        183    183    151
0.4     0.0        100    106    124        109    109    141
0.4     0.2        125    133    160        134    143    194
0.4     0.4        167    178    225        185    190    214

*Correlation of matched-pair exposure levels (see Appendix 2).
†Number of cases = number of controls = 100; for p_E = 0.1 each estimate based on 2000 simulations, estimated standard error = 5.1; for p_E = 0.4 each estimate based on 4000 simulations, estimated standard error = 7.3.

Small sample simulations

A Monte Carlo simulation of the full model presently appears to be infeasible because of the difficulty of specifying the degree of mismatching anticipated, and the potential misspecification of the model for parametric analysis, let alone the enormous computing required for iterative solution of the likelihood equations for each trial. We have therefore restricted our consideration of small samples to the situation where confounding is absent (so that adjustment for C in the independent samples is not needed) and where E is binary. In this case, it is shown in Appendix 2 that the expected distribution of matched pairs can be computed from the parameters $p_E$, $RR_{DE}$ and $\rho_{10}$ by the formulae given in Table 2(b), where $\rho_{10}$ is the correlation between the exposure status of cases and their matched controls induced by the correlation $\rho_{EC}$. The asymptotic and small sample simulation results are given in Table 10. The effects are consistent with those noted in Tables 4 and 8, but the small sample effects are uniformly larger in magnitude than the asymptotic ones. Thus, it appears that the loss in efficiency due to overmatching can be substantially greater than the asymptotic results would indicate.

DISCUSSION

In this paper, we have carefully distinguished between design and analysis issues, focusing on the former. Our main conclusion is that the choice between matched and independent sample designs depends on a number of considerations, but most importantly on the relative strength of the relationships of disease D to exposure E and potential confounder C. In terms of analysis issues, we have noted that failure to stratify or condition in the analysis after having used matched or stratified sampling will result in a biased estimate of the ED relation, whereas to employ a paired analysis when cases and controls have in effect only been stratum matched is inefficient. In practice, epidemiologists frequently opt for strategies which are difficult to classify as stratum or pair matching. For example, one might loosely match on broad age and gender strata (potentially strong confounders) and then select the nearest available match on neighborhood or date-of-admission (possibly pseudo-confounders). In such a situation, it is not obvious whether a matched-pairs or stratified analysis would be preferred, although correlation of neighborhood or date-of-admission with exposure would dictate matched-pairs analysis. Another issue we have not addressed is the risk of introducing bias if an adjustment strategy is used which does not completely eliminate the effects of confounding variables, e.g. through insufficiently fine stratification or incorrect specification of the analysis model or choice of variables. Finally, we have not addressed the greater complexity of stratified over pair-matched analyses when conditional likelihood methods must be used because of small stratum sizes [18, 19].

In terms of design decisions, we have arrived at the following recommendations:

(1) If a factor C is unrelated to both E and D, there is no statistical consequence of matching on it. Therefore, if both the CE and CD relationships are doubtful, matching on C is unnecessary, and practical considerations would probably argue against matching. If independent samples are chosen, adjustment for such variables should be avoided.

(2) If C is unrelated to E but strongly related to D, matching does not appear to have strong statistical consequences. Therefore, if the CE relationship is doubtful but the CD relationship is known to be strong, practical considerations may predominate in the decision to match, e.g. the enhanced study credibility due to matching on C can be weighed against the extra effort of matching. (As noted below, however, any decision to match on a strong risk factor should be considered in the decision about whether to match on other factors.)

(3) If C is strongly related to E but unrelated to D, matching necessitates stratified or conditional analysis, whereas the more efficient unadjusted estimator can be used if no matching is employed. Therefore, if the CD relationship is doubtful but the CE relationship is known to be strong, matching on C should be avoided.

(4) If C is related to both E and D, matching on it will more often than not improve statistical efficiency. In deciding whether to match, consideration should be given to the effects that the anticipated relationships between E, D, and C and their distributions will have on the efficiency of the designs considered (Tables 5 and 9). Thus matching might be avoided if the ED relation were strong, the CD relation weak, or E binary with high prevalence. The effect of the EC relationship is less clear. Obviously the variance of the matched design increases as the strength of this relationship increases, for the a priori considerations given earlier, but the same is true of the variance of the various adjustment strategies. The relative efficiency seems to depend on a number of factors and is different when comparing pair-matching with covariance adjustment or comparing stratum-matching with stratified analysis. These relationships warrant further investigation.

These recommendations are in general agreement with those of the authors previously cited [7-12]. If the necessity of controlling a variable is doubtful, we would recommend against matching. The confounding effect of such a variable can be explored in the analysis and a decision made at that stage whether or not to adjust for it.

In following these recommendations, one must of course decide which of the variables that appear to be confounders in the data are in fact true confounders (for which adjustment is necessary) and which are just "pseudo-confounders" (for which adjustment should be avoided) [7, 13]. Statistical significance testing is not a valid method for making this decision [7, 13, X-23], but when the investigator has no strong prior information about the confounding role of a variable, there is no generally agreed-upon criterion for making this decision. Work is in progress to develop criteria for deciding which estimator to use when no prior information is available, and to develop adaptive estimators with desirable properties. Our results do not bear directly on this question because we have assumed throughout that the population associations were known. In this case, we have shown that adjustment (or matching) for variables which are not confounders is inefficient. Recently, it has been demonstrated (Robins, 1982, unpublished manuscript) that this inefficiency can also be viewed as a form of conditional bias: for example, if there is no EC association in the population, but by chance such an association is apparent in the data, then conditional on this observed association the stratified estimator of the ED association is biased, whereas the crude estimator is unbiased.
Thus, if we knew that there was no EC association in the population, the crude independent-sample estimator would be preferred on grounds of both efficiency and validity. The same conclusion applies when there is a sample DC association, but it is known that there is no DC association in the population. In the asymptotic case, the relative efficiencies of the various strategies are rarely strikingly different from unity, but our limited simulation studies indicate that the small sample differences could be substantially larger, at least in the case of pair matching. Hence
we feel that while asymptotic results can usually point out general trends, asymptotic differences between methods ought not to be dismissed merely because they seem small in practical terms. Further small sample simulations of more realistic designs would be highly desirable. While our study has dealt only with a single factor C, in practice multiple covariates must be considered. Under the assumption of no effect modification, our results extend to the multiple covariate case by considering C to be a potential additional matching factor after having already decided to match on several other factors. The decision whether to match additionally on C must then be based on the residual EC and DC associations after adjustment for the other factors. This is analogous to multiple regression problems, and the analogy extends further: when there are several candidates for matching, the decision to match should not be based on separate consideration of each factor, but rather on whether a factor is a member of some “best” subset of the factors [13]. Theoretically, the “best” subset ought to be a subset that maximizes the efficiency of the design-analysis strategy. Operationally, our results indicate that such a subset would be the one that retains most of the disease-predictive power of the full set while minimizing exposure-predictive power. Identification of such a subset is required at the design stage of the study, but usually not all the information necessary for this will be available. Nevertheless, whenever there are strong intercorrelations within the set of matching candidates (e.g. if the set includes smoking, SES, and race), the disease predictive power of the full set will be captured by a relatively small subset, and the logical first choices for matching will then be the strongest risk factors. Although we have focused here on statistical efficiency criteria, “practical considerations” are bound to be important in choosing between designs [19,23]. 
Matched studies undoubtedly carry greater credibility than do elaborate statistical adjustments, and are more convenient for controlling such nebulous factors as place of residence or family history. On the other hand, they may be more expensive or cumbersome to implement [10, 19, 23] and may suffer more losses, either through failure to find a match [24] or failure to interview the other member of a matched pair. Such losses can seriously affect both the validity and efficiency of the study results. These considerations are difficult to incorporate in statistical studies of bias and efficiency, although the cost issue has received some attention [10].

Acknowledgements-The authors would like to thank Drs Sholom Wacholder and James Robins, and the referees for many helpful suggestions in revising this paper, and Ms Marlene Dyck for assistance in preparing the manuscript.

REFERENCES

1. Billewicz WZ: Matched samples in medical investigations. Br J Prev Soc Med 18: 167-173, 1964
2. Billewicz WZ: The efficiency of matched samples: an empiric investigation. Biometrics 21: 623-644, 1965
3. Rubin DB: Matching to remove bias in observational studies. Biometrics 29: 159-183, 1973
4. Rubin DB: The use of matched sampling and regression adjustment to remove bias in observational studies. Biometrics 29: 185-203, 1973
5. Miettinen OS: The matched pairs design in the case of all-or-none responses. Biometrics 24: 339-352, 1968
6. Miettinen OS: Matching and design efficiency in retrospective studies. Am J Epid 91: 111-117, 1970
7. Day NE, Byar DP, Green SB: Overadjustment in case-control studies. Am J Epid 112: 696-706, 1980
8. Kupper LL, Karon JM, Kleinbaum DG, et al: Matching in epidemiologic studies: validity and efficiency considerations. Biometrics 37: 271-292, 1981
9. Smith PG, Day NE: Matching and confounding in the design and analysis of epidemiological case-control studies. In Perspectives in Medical Statistics. Bithell JF, Coppi R (Eds) New York: Academic, 1981
10. Thompson WD, Kelsey JL, Walter SD: Cost and efficiency in the choice of matched and unmatched case-control study designs. Am J Epid 116: 840-851, 1982
11. Samuels ML: Matching and design efficiency in epidemiological studies. Biometrika 68: 577-588, 1981
12. McKinlay SM: The effect of bias on estimation of relative risk for pair-matched and stratified samples. JASA 70: 859-864, 1975
13. Miettinen OS, Cook EF: Confounding: essence and detection. Am J Epid 114: 593-603, 1981
14. Greenland S, Thomas DC: On the need for the rare disease assumption in case-control studies. Am J Epid 116: 547-553, 1982
15. Breslow N: Odds ratio estimators when the data are sparse. Biometrika 68: 73-84, 1981
16. McKinlay SM: Pair matching: a reappraisal of a popular technique. Biometrics 33: 725-735, 1977
17. Siegel DG, Greenhouse SW: Validity in estimating relative risk in case
APPENDIX 1

Asymptotic variances of exactly pair-matched and independent samples with covariance adjustment when the adjustment model is correct

Let $f(E, C)$ be the joint distribution of exposure $E$ and potential confounder $C$ in the population at risk. In what follows, $E$ and $C$ can be univariate or multivariate, continuous or discrete. Let $r(E, C)$ denote the true rate of disease as a function of $E$ and $C$. Then the expected distribution of $E$ and $C$ in cases will be given by

$$g(E, C) = \frac{r(E, C)\, f(E, C)}{\iint r(E, C)\, f(E, C)\, dE\, dC}. \tag{1}$$
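As a quick check on equation (1), consider the simplest special case: a binary exposure and no covariate. The symbols used here are illustrative assumptions: $p_E$ is the exposure prevalence in the population at risk, $q_E = 1 - p_E$, $r_0$ is the disease rate in the unexposed, and $RR_{DE}$ is the relative risk for exposure. The integrals in (1) then reduce to sums over $E = 0, 1$:

$$f(1) = p_E, \quad f(0) = q_E, \quad r(1) = RR_{DE}\, r_0, \quad r(0) = r_0,$$

$$g(1) = \frac{r(1)\, f(1)}{r(1)\, f(1) + r(0)\, f(0)} = \frac{RR_{DE}\, p_E}{RR_{DE}\, p_E + q_E}.$$

For example, with $p_E = 0.3$ and $RR_{DE} = 2$, the expected proportion of exposed cases is $0.6/1.3 \approx 0.46$. This is the same probability parameter that Appendix 2 uses when simulating the case series.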

First, suppose a case-control study has sampled $n_1$ cases from $g(E, C)$ and $n_0$ independent controls from $f(E, C)$, and suppose two models for the odds ratio, $R_a(E, C; \beta_0, \beta_1, \beta_2)$ and $R_c(E; \beta_0, \beta_1)$, are to be fitted to the data, where $\beta_1$ and $\beta_2$ are the parameters for $E$ and $C$ respectively and $\beta_0$ is a nuisance parameter defined by the condition that

$$\iint g(E, C)\, dE\, dC = 1.$$

These odds ratio models induce models for the probability of being a case of the form

$$P(D \mid E, C) = \frac{R(E, C)}{1 + R(E, C)}. \tag{2}$$

(The formal justification of these relations has been provided by Prentice and Pyke [25].) Each case therefore contributes a term of the form $\ln R(E, C)$ to the log likelihood, and all cases and controls contribute terms of the form $-\ln[1 + R(E, C)]$. The expected log likelihood for the model $R(E, C)$ is therefore of the form

$$E(\ln L) = \iint n_1\, g(E, C) \ln R(E, C)\, dE\, dC - \iint [n_1\, g(E, C) + n_0\, f(E, C)] \ln[1 + R(E, C)]\, dE\, dC. \tag{3}$$
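The likelihood just described can be illustrated with a small numerical sketch. It is not the authors' program: it assumes, for illustration only, that $f(E, C)$ is a bivariate normal with zero means, unit variances and correlation `rho`, and that the rate follows the exponential model $\exp(\beta_0 + \beta_1 E + \beta_2 C)$, in which case the case distribution $g$ is the same normal with its mean shifted by the covariance matrix times $(\beta_1, \beta_2)$. The function names and parameter values are hypothetical. The fit maximizes the log likelihood by Newton-Raphson and returns the inverse of the matrix of negative second derivatives as the estimated covariance matrix.

```python
import numpy as np

def simulate_case_control(n1, n0, beta1, beta2, rho, rng):
    """Draw n1 cases from g and n0 controls from f (assumed bivariate normal)."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    # exp(b'x) times N(0, cov) is proportional to N(cov @ b, cov)
    case_mean = cov @ np.array([beta1, beta2])
    cases = rng.multivariate_normal(case_mean, cov, size=n1)
    controls = rng.multivariate_normal(np.zeros(2), cov, size=n0)
    # Design matrix columns: intercept, E, C
    design = np.column_stack([np.ones(n1 + n0), np.vstack([cases, controls])])
    y = np.concatenate([np.ones(n1), np.zeros(n0)])
    return design, y

def information(design, beta):
    """Negative second derivatives of ln L when R = exp(design @ beta)."""
    r = np.exp(design @ beta)
    weight = r / (1.0 + r) ** 2
    return design.T @ (design * weight[:, None])

def fit_logistic(design, y, n_iter=25):
    """Newton-Raphson fit of P(case) = R / (1 + R)."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        r = np.exp(design @ beta)
        score = design.T @ (y - r / (1.0 + r))
        beta = beta + np.linalg.solve(information(design, beta), score)
    return beta, np.linalg.inv(information(design, beta))

rng = np.random.default_rng(0)
design, y = simulate_case_control(5000, 5000, 0.5, 0.5, 0.3, rng)
beta_hat, cov_hat = fit_logistic(design, y)
print("estimate of beta1:", beta_hat[1])
print("estimated variance of beta1:", cov_hat[1, 1])
```

Dropping the `C` column from the design matrix gives the corresponding crude analysis, so the two variances for the exposure coefficient can be compared for any chosen `rho`, `beta1` and `beta2`.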

The asymptotic covariance matrix of the maximum likelihood estimates of the parameters of $R(E, C)$ is obtained by inverting the matrix $I$ of negative second partial derivatives of $\ln L$, given by

$$E(I_{ij}) = \iint n_1\, g(E, C)\, I^{(1)}_{ij}(E, C)\, dE\, dC + \iint [n_1\, g(E, C) + n_0\, f(E, C)]\, I^{(2)}_{ij}(E, C)\, dE\, dC \tag{4}$$

where

$$I^{(1)}_{ij}(E, C) = (R'_i R'_j - R''_{ij} R)/R^2$$

$$I^{(2)}_{ij}(E, C) = [R''_{ij}(1 + R) - R'_i R'_j]/(1 + R)^2$$

and, for notational simplicity, $R = R(E, C)$, $R'_i = \partial R/\partial\beta_i$, and $R''_{ij} = \partial^2 R/\partial\beta_i\, \partial\beta_j$. If in the adjusted analysis the true model is used, i.e. $R_a(E, C) = r(E, C)$, then the asymptotic variance of the adjusted estimate $\hat\beta_1$ is simply the 1,1-element of the inverse matrix of $E(I)$, as evaluated at the true values of $\beta_0$, $\beta_1$, $\beta_2$. For the crude analysis, we let

$$R_c(E) = \frac{\int r(E, C)\, f(E, C)\, dC}{\int f(E, C)\, dC} \tag{5}$$

and evaluate the asymptotic variance of $\hat\beta_1$ similarly.

Now suppose controls have been exactly matched to the case distribution of $C$ (comparable to our having assumed our adjusted analysis exactly modelled the effects of $C$ in the unmatched case). Then the distribution

of $E$ in those controls matched to cases on value $C$ is merely

$$h(E \mid C) = \frac{f(E, C)}{\int f(E, C)\, dE}. \tag{6}$$

In fitting the data, we now need only consider the risks relative to some standard covariate value, $C_0$, i.e. if we assume $E$ and $C$ interact multiplicatively, we fit the model

$$R_a(E, C) = RR(E; \beta_1)\, R_a(0, C; \beta_0, \beta_2).$$

The expected conditional log likelihood is

$$E(\ln L) = \iiint g(E_1, C)\, h(E_0 \mid C) \ln\{RR(E_1)/[RR(E_1) + RR(E_0)]\}\, dE_1\, dE_0\, dC$$

and the expected information is

$$E(I) = \iiint g(E_1, C)\, h(E_0 \mid C)\, I(E_1, E_0)\, dE_1\, dE_0\, dC \tag{7}$$

where

$$I(E_1, E_0) = (RR_1'^2 - RR_1''\, RR_1)/RR_1^2 - [(RR_1' + RR_0')^2 - (RR_1'' + RR_0'')(RR_1 + RR_0)]/(RR_1 + RR_0)^2$$

and $RR_k = RR(E_k, C)$, $RR'_k = \partial RR_k/\partial\beta_1$, $RR''_k = \partial^2 RR_k/\partial\beta_1^2$ for $k = 0, 1$. Provided the true model $r(E, C)$ has been used in specifying the relative risk model $RR(E)$ in equation (5), the asymptotic variance of $\hat\beta_1$ is simply $1/E(I)$ evaluated at the true values of $\beta_1$ and $\beta_2$.

Equations (4) and (7) can in principle be evaluated for any specification of $f(E, C)$ and $r(E, C)$, but the results discussed in the text are confined to the special case where $f(E, C)$ is the bivariate normal distribution with correlation $\rho_{EC}$ and $r(E, C)$ is the exponential model $\exp(\beta_0 + \beta_1 E + \beta_2 C)$, which induces the logistic model for equation (2). For the exponential model, the terms in equations (4) and (7) involving the cases disappear, and the remaining terms simplify to

$$I_{ij} = \frac{R(E, C)\, E^{e_{ij}}\, C^{c_{ij}}}{[1 + R(E, C)]^2} \tag{8}$$

where $e_{ij} = \delta_{i1} + \delta_{j1}$, $c_{ij} = \delta_{i2} + \delta_{j2}$, and $\delta_{nm} = 1$ if $n = m$, 0 otherwise, and

$$I = \frac{RR(E_1)\, RR(E_0)\, (E_1 - E_0)^2}{[RR(E_1) + RR(E_0)]^2}. \tag{9}$$

Tables 7 and 8 present the results of numerical integration of equations (8) and (9) as evaluated for various combinations of $\rho_{EC}$, $\beta_1$ and $\beta_2$.

APPENDIX 2

Distribution of matched pairs with a binary exposure factor and a continuous matching variable

We assume that $E$ is a binomial random variable. Using the notation in Appendix 1, we further assume that $RR(1, C) = RR_{DE}\, RR(0, C)$ and that $RR(0, C)$ is a constant (i.e. $\gamma_{DC} = 0$), so that unmatched sampling and analysis yields unbiased results. Pair-matching controls to a random sample of cases will induce a correlation $\rho_{10}$ between case and control exposure levels, given by

$$\rho_{10} = \frac{m_{11}\, m_{00} - m_{10}\, m_{01}}{(c_1\, c_0\, m_1\, m_0)^{1/2}}$$

where the left-hand term is the correlation as derived from the expected matched-pair Table 2(b). In general, $\rho_{10}$ depends in a complex manner on $\rho_{EC}$, $\beta_{DE}$, and $\gamma_{DC}$; in the case where $\beta_{DE} = 0$ and $\gamma_{DC} = 0$, it can be shown that $\rho_{10} = \rho_{EC}^2$; where $\rho_{EC} = 0$, then $\rho_{10} = 0$ for all values of $\beta_{DE}$ and $\gamma_{DC}$. Substituting as indicated in the footnote of Table 2(b) yields a quadratic equation in $m_{01}$. The positive root of this equation is the expected number of discordant pairs with the case unexposed, control exposed; $m_{10} = RR_{DE}\, m_{01}$ is the expected number of discordant pairs with the case exposed, control unexposed.

In simulating the case series, $E$ was assumed to be binomially distributed with sample size $N$ and probability parameter $RR_{DE}\, p_E/(RR_{DE}\, p_E + q_E)$. Two control series were generated in each trial. In the independent series, $E$ was assumed to be binomial with sample size $N$ and parameter $p_E$. The matched (correlated) series was generated using two binomials: in the first, generating controls for the exposed cases, $E$ was binomial with $c_1$ trials and parameter $m_{11}/c_1$; in the second, generating controls for the unexposed cases, $E$ was binomial with $c_0$ trials and parameter $m_{01}/c_0$.
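The simulation scheme described above can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the authors' program: the expected table is expressed in proportions of pairs, $m_1$ and $m_0$ are taken to be the exposed and unexposed control margins, the number of trials for each matched binomial is the simulated (not the expected) number of exposed or unexposed cases, and the two estimators (the cross-product odds ratio for independent samples and the discordant-pair ratio for matched pairs) are the standard ones rather than choices confirmed by the text. The function names and the value `m01 = 0.13` are hypothetical.

```python
import numpy as np

def expected_table(p_e, rr, m01):
    """Expected matched-pair proportions implied by p_E, RR and m01 (assumed layout)."""
    c1 = rr * p_e / (rr * p_e + (1.0 - p_e))   # exposed cases
    c0 = 1.0 - c1                              # unexposed cases
    m10 = rr * m01                             # case exposed, control unexposed
    m11, m00 = c1 - m10, c0 - m01              # concordant pairs
    m1, m0 = m11 + m01, m10 + m00              # control margins (assumption)
    rho10 = (m11 * m00 - m10 * m01) / np.sqrt(c1 * c0 * m1 * m0)
    return {"c1": c1, "c0": c0, "p11": m11 / c1, "p01": m01 / c0, "rho10": rho10}

def simulate_trial(n_pairs, p_e, rr, p11, p01, rng):
    """One trial: case series, independent control series, matched control series."""
    c1 = rng.binomial(n_pairs, rr * p_e / (rr * p_e + (1.0 - p_e)))
    c0 = n_pairs - c1
    k1 = rng.binomial(n_pairs, p_e)   # exposed controls, independent series
    n11 = rng.binomial(c1, p11)       # exposed controls matched to exposed cases
    n01 = rng.binomial(c0, p01)       # exposed controls matched to unexposed cases
    return {"c1": c1, "c0": c0, "k1": k1,
            "n11": n11, "n10": c1 - n11, "n01": n01, "n00": c0 - n01}

def log_or_estimates(trial, n_pairs):
    """Log odds ratios: cross-product (independent) and n10/n01 (matched)."""
    k0 = n_pairs - trial["k1"]
    cells = [trial["c1"], trial["c0"], trial["k1"], k0, trial["n10"], trial["n01"]]
    if min(cells) == 0:               # estimator undefined for this trial
        return np.nan, np.nan
    crude = np.log(trial["c1"] * k0 / (trial["c0"] * trial["k1"]))
    matched = np.log(trial["n10"] / trial["n01"])
    return crude, matched

rng = np.random.default_rng(1)
table = expected_table(p_e=0.3, rr=2.0, m01=0.13)
draws = np.array([
    log_or_estimates(simulate_trial(200, 0.3, 2.0, table["p11"], table["p01"], rng), 200)
    for _ in range(2000)
])
print("induced correlation rho10:", table["rho10"])
print("variance of log OR (independent, matched):", np.nanvar(draws, axis=0))
```

The ratio of the two printed variances gives a small-sample relative efficiency for one parameter combination; repeating the run over a grid of `p_e`, `rr` and `m01` would give a comparison in the spirit of Table 10.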