The Sufficient-Component Discrete-Cause Model and Its Extension to Several Risk Factors
B. RAJA RAO AND PHILIP E. ENTERLINE
Department of Biostatistics, University of Pittsburgh, Pittsburgh, Pennsylvania 15261
Received 14 March 1983
ABSTRACT This paper is concerned with Koopman's (1981) discussion of using the interaction contrast of disease rate (ICDR) as a measure of deviation from an additive model of no interaction. This measure has been shown to be zero or slightly negative when the assumptions of no interaction hold in the sufficient-component discrete-cause model. The discussion centers around a theoretical structure of sufficient causes for juvenile-onset diabetes where there is a component cause common to all sufficient causes and two binary factors or risks X and Y are measured (X: Coxsackie B4 infection; Y: HLA4). It is supposed that these factors, together with a set of immunologic conditions that permit dissemination of the enterovirus infection and permit an autoimmune response, will in conjunction inevitably result in beta-cell destruction. In this paper, we extend Koopman's discussion to the case where three or more binary factors X, Y, Z, ... are measured. We consider interaction between factors in every subset and prove that the ICDR is again either zero or negative when the assumptions of no interaction hold in the sufficient-component cause model.
1. INTRODUCTION
As Rothman [1] defines it, a cause is an act or event or a state of nature which initiates or permits, alone or in conjunction with other causes, a sequence of events resulting in an effect. A cause which inevitably produces the effect is sufficient. In other words, a sufficient cause is a concurrence of conditions which inevitably results in the outcome under consideration. A number of component causes often make up a sufficient cause, and there are almost always various sets of component causes that will act as sufficient causes. Thus most causes in the health field are components of sufficient causes, but are not sufficient in themselves. Drinking contaminated water, for example, is not sufficient to produce cholera, and smoking is not sufficient to produce lung cancer, but both of these are components of sufficient causes. Koopman [2] calls this the sufficient-component-cause model. If there exists a component cause which is a member of every sufficient cause, such a component is called a necessary cause.

MATHEMATICAL BIOSCIENCES 68:267-278 (1984)
© Elsevier Science Publishing Co., Inc., 1984, 52 Vanderbilt Ave., New York, NY 10017. 0025-5564/84/$03.00
Koopman [2] discusses the theoretical structure of sufficient causes for juvenile-onset diabetes where there is a component cause common to all sufficient causes. In Figure 1, the circles do not describe the arrangement of component causes in an individual, but rather they describe a system of interaction between causes where each component cause is discrete and requires the presence of all other component causes to cause disease. The source of interaction between component causes in the sufficient-component cause model, as implied by Rothman, is the joint participation of component causes in a sufficient cause. In Figure 1, Koopman assumes that three events in conjunction will inevitably result in beta-cell destruction: (1) a specific type of disseminated enterovirus infection; (2) a specific HLA (human leukocyte antigen) involved in an autoimmune response to beta-cell infection; and (3) a set of immunologic conditions that permit dissemination of the enterovirus infection and permit an autoimmune response. He discusses the specific case when two binary factors X (Coxsackie B4 infection) and Y (HLA4 presence) are being measured. If Coxsackie B4 is capable of interacting with HLA4 in this scheme, there will be a sufficient cause of type 1. Otherwise there will be no such sufficient cause of type 1. Koopman uses this model and proves that the interaction contrast of disease rate (ICDR), a measure of deviation from an additive model of no interaction, is always either zero or negative when the assumptions of no interaction hold in the sufficient-component discrete-cause model. He uses this result to test the null hypothesis of no interaction. As is well known, and as demonstrated by Rao and Enterline [4], asymmetries of interaction do not exist at the two-dichotomous-risk level. This is an oversimplification of the problem, and a lot of beautiful mathematics is lost.
On the other hand, if one measures three or more dichotomous factors, there exist several asymmetries and one has to discuss interactions between any two, any three, ... factors. The object of the present communication is twofold. First, we demonstrate that simpler and more elegant forms may be obtained for Koopman's results by using elementary set-theoretic definitions. This achieves a sort of unification. Second, we extend Koopman's result to several measured factors. Again the basic set-theory notation makes the discussion elegant. The case of three binary risk factors is presented in detail, the mathematics being fairly manageable. It is shown that the ICDR is either zero or negative when there is no interaction. With three risk factors X, Y, and Z, we define what is called the marginal ICDR. This is the ICDR corresponding to two factors when information on the third factor is suppressed. If the three factors interact, it is proved that their joint ICDR not only is positive, but has a lower bound which is the average pairwise ICDR. If the three factors do not interact, it is shown that their joint ICDR not only is negative but has a negative upper bound. A numerical example is worked out to illustrate the extension to three factors.

FIG. 1. Theoretical structure of sufficient causes for juvenile-onset diabetes where there is a component cause common to all sufficient causes (diagram of Sufficient Causes I, II, III, and IV). Component-cause labels for juvenile-onset diabetes: X: HLA4; Y: Coxsackie B4 infection; A: a set of conditions necessary for all HLA-enterovirus interactions; B: Coxsackie A, infection; C: HLA14; D: all other sets of interacting HLA and enterovirus infections (these do not involve X or Y).

2. KOOPMAN'S COMMON-COMPONENT-CAUSE MODEL WITH TWO FACTORS
If the factor X (Coxsackie B4) is capable of interacting with Y (HLA4), there will be a sufficient cause of type 1. If such an interaction does not take place, there will be no sufficient cause of type 1. Consider the rates in various exposure categories in terms of the probabilities of the four unmeasured component causes in Figure 1. Koopman uses the symbols $\phi_{XY}$, $\phi_{XY'}$, $\phi_{X'Y}$, and $\phi_{X'Y'}$, where $\phi_{XY} = P(\text{disease when exposed to both } X \text{ and } Y)$. Under the assumption of no sufficient cause of type 1 and the assumption of no cause of interaction other than a sufficient cause of type 1, we get

$$\phi_{XY} = P(AB \cup AC \cup AD) = P\{A \cap (B \cup C \cup D)\}.$$

Similarly

$$\begin{aligned}
\phi_{XY'} &= P(\text{disease when exposed to } X \text{ but not to } Y) = P(AB \cup AD) = P\{A \cap (B \cup D)\},\\
\phi_{X'Y} &= P(\text{disease when exposed to } Y \text{ but not to } X) = P(AC \cup AD) = P\{A \cap (C \cup D)\},\\
\phi_{X'Y'} &= P(\text{disease when exposed to neither } X \text{ nor } Y) = P(A \cap D) = P(AD).
\end{aligned}$$

Then under the assumption of no interaction, the ICDR becomes

$$\begin{aligned}
D_{X,Y} &= \phi_{XY} - \phi_{XY'} - \phi_{X'Y} + \phi_{X'Y'}\\
&= P\{A \cap (B \cup C \cup D)\} - P\{A \cap (B \cup D)\} - P\{A \cap (C \cup D)\} + P(A \cap D)\\
&= P(A \cap B \cap C \cap D) - P(A \cap B \cap C).
\end{aligned}$$
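The last equality can be checked numerically. The following is a minimal sketch, assuming independent component causes with hypothetical probabilities (neither assumption comes from the paper; the set identity itself does not depend on them). It enumerates the sixteen outcomes of A, B, C, D exactly and compares the ICDR with the closed form.

```python
from fractions import Fraction
from itertools import product

NAMES = "ABCD"
# Hypothetical probabilities of the unmeasured components (illustration only).
# Independence is assumed solely to obtain a concrete joint distribution.
P = {"A": Fraction(1, 2), "B": Fraction(1, 5), "C": Fraction(3, 10), "D": Fraction(1, 10)}

def prob(event):
    """Exact probability of the outcomes on which event(a, b, c, d) is true."""
    total = Fraction(0)
    for outcome in product([False, True], repeat=len(NAMES)):
        weight = Fraction(1)
        for name, present in zip(NAMES, outcome):
            weight *= P[name] if present else 1 - P[name]
        if event(*outcome):
            total += weight
    return total

# Disease rates when there is no sufficient cause of type 1.
phi_xy = prob(lambda a, b, c, d: a and (b or c or d))   # exposed to X and Y
phi_xy0 = prob(lambda a, b, c, d: a and (b or d))       # exposed to X but not Y
phi_x0y = prob(lambda a, b, c, d: a and (c or d))       # exposed to Y but not X
phi_x0y0 = prob(lambda a, b, c, d: a and d)             # exposed to neither

icdr = phi_xy - phi_xy0 - phi_x0y + phi_x0y0
closed_form = (prob(lambda a, b, c, d: a and b and c and d)
               - prob(lambda a, b, c, d: a and b and c))
print(icdr, closed_form)
```

Exact fractions are used rather than floats so that the comparison between the two expressions is an equality test and not a tolerance check.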
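Now since $A \cap B \cap C \cap D \subset A \cap B \cap C$, we get $D_{X,Y} \le 0$. Thus under the assumption of no interaction, the ICDR is either negative or zero. It should be observed that when testing shows that the ICDR is significantly positive, we reject the null hypothesis of no interaction. If there were interaction, that is, if sufficient cause I existed, then we would have

$$A \subset B \cup C \cup D \quad \text{and} \quad A \cap (B \cup C \cup D) = A,$$

which gives $\phi_{XY} = P\{A \cap (B \cup C \cup D)\} = P(A)$ and the measure

$$\begin{aligned}
D_{X,Y} &= P(A) - P\{A \cap (B \cup D)\} - P\{A \cap (C \cup D)\} + P(A \cap D)\\
&= P(A) - P(A \cap B) - P(A \cap C) - P(A \cap D) + P(A \cap C \cap D) + P(A \cap B \cap D)\\
&= \underbrace{P(A) - P\{A \cap (B \cup C \cup D)\}}_{=\,0} + P(A \cap B \cap C) - P(A \cap B \cap C \cap D)\\
&= P(A \cap B \cap C) - P(A \cap B \cap C \cap D),
\end{aligned} \tag{1}$$

which is positive, since $A \cap B \cap C \cap D \subset A \cap B \cap C$.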
3. THE MARGINAL EFFECTS OF X AND Y
The present authors [4] have defined what is called the marginal effect of an agent or risk. This is the effect of an agent when it is acting in the presence of or in conjunction with a second agent. The difference between the marginal effect of an agent X and its effect when it is used alone (in the absence of the second) is called the increase $S_X$ in the effect of the agent X. In other words, $S_X$ denotes the increase in the effect of X when it is used in conjunction with agent Y. A simple relationship has been established between the ICDR and the increases in the effects of agents. The effects of the agents acting alone are

$$\begin{aligned}
E(X \text{ alone}) &= \phi_{XY'} - \phi_{X'Y'} = P\{A \cap (B \cup D)\} - P(AD) = P(AB) - P(ABD),\\
E(Y \text{ alone}) &= \phi_{X'Y} - \phi_{X'Y'} = P\{A \cap (C \cup D)\} - P(AD) = P(AC) - P(ACD).
\end{aligned}$$
The marginal effects of the factors X and Y are defined as

$$\begin{aligned}
ME(X) &= \phi_{XY} - \phi_{X'Y} = P\{A \cap (B \cup C \cup D)\} - P\{A \cap (C \cup D)\},\\
ME(Y) &= \phi_{XY} - \phi_{XY'} = P\{A \cap (B \cup C \cup D)\} - P\{A \cap (B \cup D)\}.
\end{aligned}$$

The increase in the effect of X when it acts in conjunction with Y is

$$\begin{aligned}
S_X &= ME(X) - E(X \text{ alone})\\
&= P\{A \cap (B \cup C \cup D)\} - P\{A \cap (C \cup D)\} - P\{A \cap (B \cup D)\} + P(AD)\\
&= \phi_{XY} - \phi_{XY'} - \phi_{X'Y} + \phi_{X'Y'},
\end{aligned} \tag{2}$$

which is exactly the ICDR. Similarly the increase in the effect of Y is seen to be the same. This shows that the ICDR may be expressed as

$$D_{X,Y} = \tfrac{1}{2}(S_X + S_Y). \tag{3}$$
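Equations (2) and (3) can be illustrated with a short sketch. The function name and the four disease rates below are hypothetical and chosen only for illustration. The code computes the effects alone, the marginal effects, and the two increases from the four rates.

```python
from fractions import Fraction

def two_factor_effects(phi_xy, phi_xy0, phi_x0y, phi_x0y0):
    """Return (S_X, S_Y, ICDR) from the four exposure-specific disease rates.

    phi_xy   : rate when exposed to both X and Y
    phi_xy0  : rate when exposed to X but not Y
    phi_x0y  : rate when exposed to Y but not X
    phi_x0y0 : rate when exposed to neither
    """
    e_x = phi_xy0 - phi_x0y0     # E(X alone)
    e_y = phi_x0y - phi_x0y0     # E(Y alone)
    me_x = phi_xy - phi_x0y      # ME(X): effect of X in the presence of Y
    me_y = phi_xy - phi_xy0      # ME(Y): effect of Y in the presence of X
    s_x = me_x - e_x             # increase in the effect of X, Equation (2)
    s_y = me_y - e_y             # increase in the effect of Y
    icdr = phi_xy - phi_xy0 - phi_x0y + phi_x0y0
    return s_x, s_y, icdr

# Hypothetical rates, not taken from any data set.
s_x, s_y, icdr = two_factor_effects(Fraction(30, 100), Fraction(12, 100),
                                    Fraction(10, 100), Fraction(5, 100))
print(s_x, s_y, icdr)
```

Both increases reduce to the same four-term contrast, which is why Equation (3) is simply the average of two equal quantities.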
It is clear from this equation that there can be no asymmetries at the two-dichotomous-risk level. If the two factors X and Y interact, then it is seen that $S_X = S_Y = P(ABC) - P(ABCD)$.

4. EXTENSION OF THE SUFFICIENT-COMPONENT DISCRETE-CAUSE MODEL TO THREE FACTORS
As mentioned in [2], a sufficient cause is a concurrence of conditions which inevitably results in the outcome under consideration. There are almost always a number of component causes making up a sufficient cause, and various sets of component causes exist that will act as sufficient causes. Sets of environmental conditions that will result in an adequate exposure to cause disease may make up component causes. These component causes may be attributes of the host or attributes of the environment. We now suppose that we are measuring three factors X, Y, and Z. This situation is reflected in Figure 2. With the use of set-theoretic rules, reasonably simple relationships may be obtained, as we shall show. As in the two-factor case, we define

$$\begin{aligned}
\phi_{XYZ} &= P(\text{disease when exposed to } X, Y, \text{ and } Z),\\
\phi_{XY'Z'} &= P(\text{disease when exposed to } X \text{ but not to } Y \text{ or } Z),\\
\phi_{X'YZ'} &= P(\text{disease when exposed to } Y \text{ but not to } X \text{ or } Z),\\
\phi_{X'Y'Z} &= P(\text{disease when exposed to } Z \text{ but not to } X \text{ or } Y).
\end{aligned}$$
FIG. 2. (Diagram of Sufficient Causes I through VIII.)
We also define the probabilities of disease with exposure to two factors but not to the third:

$$\begin{aligned}
\phi_{XYZ'} &= P(\text{disease when exposed to } X \text{ and } Y \text{ but not to } Z),\\
\phi_{X'YZ} &= P(\text{disease when exposed to } Y \text{ and } Z \text{ but not to } X),\\
\phi_{XY'Z} &= P(\text{disease when exposed to } X \text{ and } Z \text{ but not to } Y),
\end{aligned}$$

and finally

$$\phi_{X'Y'Z'} = P(\text{disease when exposed to neither } X \text{ nor } Y \text{ nor } Z).$$
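We now define the sets

$$J = E \cup F \cup G \cup H, \quad L = C \cup D \cup J, \quad M = B \cup D \cup J, \quad N = B \cup C \cup J.$$

Then it will be seen that

$$\begin{aligned}
\phi_{X'Y'Z'} &= P(A \cap H),\\
\phi_{XY'Z'} &= P(AB \cup AE \cup AG \cup AH) = P\{A \cap (B \cup E \cup G \cup H)\},\\
\phi_{X'YZ'} &= P\{A \cap (C \cup E \cup F \cup H)\},\\
\phi_{X'Y'Z} &= P\{A \cap (D \cup F \cup G \cup H)\},\\
\phi_{XYZ'} &= P\{A \cap N\}, \quad \phi_{X'YZ} = P(A \cap L), \quad \phi_{XY'Z} = P\{A \cap M\},
\end{aligned}$$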
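and under the assumption of no interaction,

$$\phi_{XYZ} = P\{AB \cup AC \cup AD \cup AE \cup AF \cup AG \cup AH\} = P\{A \cap (B \cup C \cup D \cup E \cup F \cup G \cup H)\}.$$

L, M, and N are subsets of the set $B \cup C \cup D \cup J$. In view of this, the following effects are all nonnegative:

$$\begin{aligned}
E_X &= E(X \text{ alone}) = \phi_{XY'Z'} - \phi_{X'Y'Z'} = P\{A \cap (B \cup E \cup G \cup H)\} - P(AH),\\
E_Y &= P\{A \cap (C \cup E \cup F \cup H)\} - P(AH),\\
E_Z &= P\{A \cap (D \cup F \cup G \cup H)\} - P(AH).
\end{aligned}$$

The ICDR in the three-factor case is

$$\begin{aligned}
D_{X,Y,Z} &= \phi_{XYZ} - \phi_{XY'Z'} - \phi_{X'YZ'} - \phi_{X'Y'Z} + 2\phi_{X'Y'Z'}\\
&= P\{A \cap (B \cup C \cup D \cup J)\} - P\{A \cap (B \cup E \cup G \cup H)\}\\
&\quad - P\{A \cap (C \cup E \cup F \cup H)\} - P\{A \cap (D \cup F \cup G \cup H)\} + 2P(A \cap H).
\end{aligned} \tag{4}$$

The marginal effect of X when it acts with Y but not Z is

$$ME(X : Y \cap Z') = \phi_{XYZ'} - \phi_{X'YZ'} = P(A \cap N) - P\{A \cap (C \cup E \cup F \cup H)\}.$$

Similarly

$$ME(X : Y' \cap Z) = \phi_{XY'Z} - \phi_{X'Y'Z} = P(A \cap M) - P\{A \cap (D \cup F \cup G \cup H)\}.$$

The marginal effect of X when it interacts with Y and Z is

$$ME(X : Y \cap Z) = \phi_{XYZ} - \phi_{X'YZ} = P\{A \cap (B \cup L)\} - P(A \cap L).$$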
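The sign of the no-interaction ICDR in Equation (4) can be explored numerically. The sketch below assumes independent component causes A through H with hypothetical probabilities (both assumptions are for illustration and are not part of the paper). It evaluates the set expressions exactly by enumerating all 256 outcomes.

```python
from fractions import Fraction
from itertools import product

NAMES = "ABCDEFGH"
# Hypothetical component probabilities; independence is assumed only to
# obtain a concrete joint distribution for the check.
P = dict(zip(NAMES, [Fraction(1, 2), Fraction(1, 10), Fraction(1, 5), Fraction(3, 20),
                     Fraction(1, 10), Fraction(1, 20), Fraction(1, 4), Fraction(1, 10)]))

def rate(components):
    """Exact P(A occurs and at least one of the listed components occurs)."""
    total = Fraction(0)
    for outcome in product([False, True], repeat=len(NAMES)):
        present = dict(zip(NAMES, outcome))
        if present["A"] and any(present[c] for c in components):
            weight = Fraction(1)
            for name in NAMES:
                weight *= P[name] if present[name] else 1 - P[name]
            total += weight
    return total

phi_xyz = rate("BCDEFGH")   # exposed to X, Y, and Z, no interaction
phi_x = rate("BEGH")        # exposed to X only
phi_y = rate("CEFH")        # exposed to Y only
phi_z = rate("DFGH")        # exposed to Z only
phi_0 = rate("H")           # exposed to none of the factors

icdr = phi_xyz - phi_x - phi_y - phi_z + 2 * phi_0          # Equation (4)
effects_alone = [phi_x - phi_0, phi_y - phi_0, phi_z - phi_0]
print(icdr, icdr <= 0, all(e >= 0 for e in effects_alone))
```

A single distribution is of course only a spot check. The general statement follows from the set inclusions used in the text, not from this enumeration.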
The increase in the effect of X when it interacts with Y but not with Z is

$$S_{X,Y \cap Z'} = ME(X : Y \cap Z') - E(X \text{ alone}) = \phi_{XYZ'} - \phi_{X'YZ'} - (\phi_{XY'Z'} - \phi_{X'Y'Z'}). \tag{5}$$

The increase in the effect of X when it interacts with Z but not with Y is

$$S_{X,Y' \cap Z} = ME(X : Y' \cap Z) - E(X \text{ alone}) = \phi_{XY'Z} - \phi_{X'Y'Z} - (\phi_{XY'Z'} - \phi_{X'Y'Z'}). \tag{6}$$
The increase in the effect of X when it interacts with both Y and Z is

$$S_{X,Y \cap Z} = ME(X : Y \cap Z) - E(X \text{ alone}) = \phi_{XYZ} - \phi_{X'YZ} - (\phi_{XY'Z'} - \phi_{X'Y'Z'}). \tag{7}$$
Let us define the increase in the effect of X when it interacts either with Y or Z or $Y \cap Z$ as

$$S_X = \frac{S_{X,Y \cap Z'} + S_{X,Y' \cap Z}}{2} + S_{X,Y \cap Z}. \tag{8}$$
We also define the increase in the effect of Y when it interacts with X but not with Z as

$$S_{Y,X \cap Z'} = ME(Y : X \cap Z') - E(Y \text{ alone}) = \phi_{XYZ'} - \phi_{XY'Z'} - (\phi_{X'YZ'} - \phi_{X'Y'Z'}). \tag{9}$$
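We make the following observations:

(1) We have

$$S_{X,Y \cap Z'} = S_{Y,X \cap Z'}. \tag{10}$$

This simply says that in the absence of the factor Z, the two factors X and Y have the same increases in their effects when they interact. Observe also that $S_{X,Y \cap Z} \ne S_{Y,X \cap Z}$.

(2) $S_{X,Y \cap Z'}$ is exactly the ICDR measure corresponding to the factors X and Y among those individuals not exposed to the factor Z.

Let us define

$$A_0 = \phi_{XYZ}, \quad A_1 = \phi_{XYZ'} + \phi_{XY'Z} + \phi_{X'YZ}, \quad A_2 = \phi_{XY'Z'} + \phi_{X'YZ'} + \phi_{X'Y'Z}, \quad A_3 = \phi_{X'Y'Z'}.$$

Then it will be seen from Equation (4) that

$$D_{X,Y,Z} = A_0 - A_2 + 2A_3.$$

Also from (7)

$$S_{X,Y \cap Z} + S_{Y,X \cap Z} + S_{Z,X \cap Y} = 3A_0 - A_1 - A_2 + 3A_3,$$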
and

$$S_{X,Y \cap Z'} + S_{X,Y' \cap Z} + S_{Y,X \cap Z'} + S_{Y,X' \cap Z} + S_{Z,X \cap Y'} + S_{Z,X' \cap Y} = 2A_1 - 4A_2 + 6A_3.$$

Defining $S_Y$ and $S_Z$ analogously to Equation (8), it will be seen that

$$S_X + S_Y + S_Z = 3A_0 - 3A_2 + 6A_3,$$

which shows that

$$D_{X,Y,Z} = \tfrac{1}{3}(S_X + S_Y + S_Z). \tag{11}$$

We note that at the three-dichotomous-risk level, there may exist asymmetries. We remark here that if no sufficient cause of type 1 exists and if we assume that no cause of interaction exists other than a sufficient cause of type 1, the ICDR will be negative or zero. If there is interaction, the ICDR will be positive. Also from our Equations (8), (10), and (11) it can be seen that

$$3D_{X,Y,Z} = S_X + S_Y + S_Z = (S_{X,Y \cap Z'} + S_{Y,Z \cap X'} + S_{Z,X \cap Y'}) + (S_{X,Y \cap Z} + S_{Y,Z \cap X} + S_{Z,X \cap Y}). \tag{12}$$
THE MARGINAL ICDR
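We now define the ICDR between the factors X and Y among those individuals not exposed to the third factor Z as the marginal ICDR $D_{X,Y}$. This gives

$$\begin{aligned}
D_{X,Y} &= \phi_{XYZ'} - \phi_{X'YZ'} - \phi_{XY'Z'} + \phi_{X'Y'Z'},\\
D_{Y,Z} &= \phi_{X'YZ} - \phi_{X'Y'Z} - \phi_{X'YZ'} + \phi_{X'Y'Z'},\\
D_{Z,X} &= \phi_{XY'Z} - \phi_{X'Y'Z} - \phi_{XY'Z'} + \phi_{X'Y'Z'}.
\end{aligned} \tag{13}$$

Using Equation (13) in (12), we get

$$3D_{X,Y,Z} - (D_{X,Y} + D_{Y,Z} + D_{Z,X}) = S_{X,Y \cap Z} + S_{Y,Z \cap X} + S_{Z,X \cap Y}. \tag{14}$$

If the factors X, Y, and Z interact pairwise, then

$$S_{X,Y \cap Z} > 0, \quad S_{Y,Z \cap X} > 0, \quad S_{Z,X \cap Y} > 0,$$

and we get from (14) the inequality

$$D_{X,Y,Z} > \tfrac{1}{3}(D_{X,Y} + D_{Y,Z} + D_{Z,X}). \tag{15}$$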
If the factors X, Y, and Z interact with each other, we proved in Equation (11) that $D_{X,Y,Z} > 0$. The inequality (15) shows that $D_{X,Y,Z}$ has a lower bound. This lower bound may be called the average pairwise ICDR. On the other hand, if the factors X, Y, and Z do not interact pairwise, we know that $S_{X,Y \cap Z'} < 0$, since $S_{X,Y \cap Z'}$ is the ICDR measure with the factors X and Y in the absence of Z. Similarly $S_{Y,Z \cap X'} < 0$ and $S_{Z,X \cap Y'} < 0$. Then from Equation (14) we get

$$D_{X,Y,Z} < \tfrac{1}{3}(D_{X,Y} + D_{Y,Z} + D_{Z,X}).$$

But $D_{X,Y}$ is also an ICDR between the factors X and Y when information on Z is suppressed, so that $D_{X,Y} < 0$. Similarly $D_{Y,Z} < 0$ and $D_{Z,X} < 0$. Thus when the factors X, Y, and Z do not interact, their joint ICDR is negative and has a negative upper bound.

EXTENSION TO k RISK FACTORS
The method presented in this paper can be extended to the case of k binary risk factors. Interaction between two or more factors has to be considered in the $2^k - k - 1$ subsets with at least two factors. A simple method is to approach the problem by finding the marginal effect and the increase in the effect of a factor when it interacts with one or more of the other factors. A simple relationship [analogous to our Equations (2) and (11)] may be established between their joint ICDR and the increases in the effects of these factors, of the form

$$D_{1,2,\ldots,k} = \frac{1}{k}(S_1 + S_2 + \cdots + S_k).$$
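The count of subsets quoted above, $2^k - k - 1$, can be confirmed by direct enumeration. This is a small sketch with hypothetical factor labels. It lists every subset containing at least two factors, which are the subsets in which interaction has to be considered.

```python
from itertools import combinations

def interaction_subsets(factors):
    """All subsets of the risk factors that contain at least two factors."""
    return [subset
            for size in range(2, len(factors) + 1)
            for subset in combinations(factors, size)]

factors = ["X1", "X2", "X3", "X4"]   # hypothetical labels, k = 4
subsets = interaction_subsets(factors)
k = len(factors)
print(len(subsets), 2 ** k - k - 1)
```

For three factors the same function returns the three pairs and the one triple considered in the preceding sections.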
Marginal ICDRs have to be defined for all the $2^k - k - 1$ subsets (with at least 2 risk factors each), and a simple relationship or an inequality can be established between the joint and the marginal ICDRs. In the above discussion, the risk factors need not be binary. They may have several levels. The $i$th risk factor $X_i$ may have $n_i$ ($i = 1, 2, \ldots, k$) levels. A general result can be proved that the ICDR is positive when there is interaction and is zero or negative when there is no interaction.

A NUMERICAL EXAMPLE
McDonough et al. [3] have examined 654 white males aged 40-47 for the prevalence of coronary heart disease (CHD) in relation to three factors (social class, cholesterol level, and blood pressure). They have done this for each of the three factors separately and then used the weighted average of these statistics as a chi-squared variable with 1 d.f. to test the homogeneity of these tables.
These three agents are dichotomized as high or low according to an index defined in their paper. The data are shown in Table 1. The total effect of the three agents X, Y, and Z is

$$TE(X \cup Y \cup Z) = \phi_{XYZ} - \phi_{X'Y'Z'} = 0.2121 - 0.0219 = 0.1902.$$

Also

$$\begin{aligned}
E(X \text{ alone}) &= \phi_{XY'Z'} - \phi_{X'Y'Z'} = 0.0206,\\
E(Y \text{ alone}) &= \phi_{X'YZ'} - \phi_{X'Y'Z'} = 0.0329,\\
E(Z \text{ alone}) &= \phi_{X'Y'Z} - \phi_{X'Y'Z'} = 0.0293.
\end{aligned}$$

The three-factor ICDR is

$$D_{X,Y,Z} = 0.1902 - 0.0828 = 0.1074.$$

It is also seen that

$$S_{X,Y \cap Z'} = S_{Y,X \cap Z'} = 0.0625, \quad S_{X,Y' \cap Z} = S_{Z,X \cap Y'} = 0.0393, \quad S_{Y,X' \cap Z} = S_{Z,X' \cap Y} = 0.0387.$$

Further

$$S_{X,Y \cap Z} = 0.0687, \quad S_{Y,X \cap Z} = 0.0681, \quad S_{Z,X \cap Y} = 0.0449,$$

giving

$$S_{X,Y \cap Z} + S_{Y,X \cap Z} + S_{Z,X \cap Y} = 0.1817.$$
TABLE 1

              Total Cohort      CHD Cases       Rate of CHD
              Z       Z'        Z      Z'       Z         Z'
X    Y        33      58        7      8        0.2121    0.1379
     Y'       27      94        3      4        0.1111    0.0425
X'   Y        57      164       7      9        0.1228    0.0548
     Y'       39      182       2      4        0.0512    0.0219

X: High blood pressure; X': Low blood pressure
Y: High social class; Y': Low social class
Z: High cholesterol; Z': Low cholesterol
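The rates in Table 1 can be recomputed from the cell counts. The sketch below makes two labeled assumptions: the cohort size of the cell exposed to X and Z but not Y is taken as 27 (the value consistent with the printed rate 0.1111 and with the total of 654), and the printed rates are assumed to be truncated, not rounded, to four decimals, since truncation reproduces every printed value.

```python
from fractions import Fraction

# Counts keyed by (x, y, z): 1 = high level (X, Y, Z), 0 = low level (X', Y', Z').
cohort = {(1, 1, 1): 33, (1, 0, 1): 27, (1, 1, 0): 58, (1, 0, 0): 94,
          (0, 1, 1): 57, (0, 0, 1): 39, (0, 1, 0): 164, (0, 0, 0): 182}
cases = {(1, 1, 1): 7, (1, 0, 1): 3, (1, 1, 0): 8, (1, 0, 0): 4,
         (0, 1, 1): 7, (0, 0, 1): 2, (0, 1, 0): 9, (0, 0, 0): 4}

def truncated_rate(n_cases, n_total, places=4):
    """Prevalence n_cases / n_total truncated (not rounded) to `places` decimals."""
    scale = 10 ** places
    return Fraction(n_cases * scale // n_total, scale)

rates = {cell: truncated_rate(cases[cell], cohort[cell]) for cell in cohort}

print(sum(cohort.values()))              # total cohort size
for cell in sorted(rates, reverse=True):
    print(cell, float(rates[cell]))
```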
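We also get

$$S_X = \tfrac{1}{2}(S_{X,Y \cap Z'} + S_{X,Y' \cap Z}) + S_{X,Y \cap Z} = 0.1196, \quad S_Y = 0.1187, \quad S_Z = 0.0839.$$

This gives

$$\tfrac{1}{3}(S_X + S_Y + S_Z) = 0.1074 = D_{X,Y,Z}.$$

In the data of McDonough et al. the two-factor marginal ICDRs are

$$D_{X,Y} = 0.0625, \quad D_{Y,Z} = 0.0387, \quad D_{Z,X} = 0.0393.$$

These give

$$D_{X,Y} + D_{Y,Z} + D_{Z,X} = 0.1405.$$

Also

$$3D_{X,Y,Z} - (D_{X,Y} + D_{Y,Z} + D_{Z,X}) = 0.1817 = S_{X,Y \cap Z} + S_{Y,X \cap Z} + S_{Z,X \cap Y}.$$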
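The figures of this example can be reproduced from the eight printed rates of Table 1. The following is a sketch with hypothetical function names. It implements the patterns of Equations (4), (7), (8), and (13), indexing the factors X, Y, Z as 0, 1, 2 and using exact decimal fractions so that the results match the printed four-decimal values.

```python
from fractions import Fraction

# Printed CHD rates from Table 1, keyed by (x, y, z): 1 = exposed, 0 = not exposed.
phi = {(1, 1, 1): "0.2121", (1, 0, 1): "0.1111", (1, 1, 0): "0.1379", (1, 0, 0): "0.0425",
       (0, 1, 1): "0.1228", (0, 0, 1): "0.0512", (0, 1, 0): "0.0548", (0, 0, 0): "0.0219"}
phi = {cell: Fraction(value) for cell, value in phi.items()}
BASE = phi[(0, 0, 0)]

def cell(*exposed):
    """Exposure pattern in which exactly the listed factor indices are present."""
    return tuple(1 if i in exposed else 0 for i in range(3))

def pairwise_icdr(i, j):
    """ICDR of factors i and j among those not exposed to the third, as in Eq. (13)."""
    return phi[cell(i, j)] - phi[cell(i)] - phi[cell(j)] + BASE

def joint_increase(i):
    """Increase in the effect of factor i acting with both other factors, as in Eq. (7)."""
    others = [j for j in range(3) if j != i]
    return phi[(1, 1, 1)] - phi[cell(*others)] - (phi[cell(i)] - BASE)

def total_increase(i):
    """S for factor i: half the sum of its two pairwise increases plus the joint one, Eq. (8)."""
    others = [j for j in range(3) if j != i]
    return sum(pairwise_icdr(i, j) for j in others) / 2 + joint_increase(i)

joint_icdr = phi[(1, 1, 1)] - sum(phi[cell(i)] for i in range(3)) + 2 * BASE   # Eq. (4)
S = [total_increase(i) for i in range(3)]                                      # S_X, S_Y, S_Z
marginal = [pairwise_icdr(0, 1), pairwise_icdr(1, 2), pairwise_icdr(2, 0)]

print(float(joint_icdr), [float(s) for s in S], [float(d) for d in marginal])
print(float(3 * joint_icdr - sum(marginal)))   # left-hand side of Eq. (14)
```

Since the pairwise increase of a factor equals the corresponding marginal ICDR by Equation (10), one helper serves both purposes here.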
REFERENCES
1 K. J. Rothman, Causes, Amer. J. Epidemiol. 104:587-592 (1976).
2 J. S. Koopman, Interaction between discrete causes, Amer. J. Epidemiol. 113:716-724 (1981).
3 J. R. McDonough, C. G. Hames, S. C. Stulb, and G. E. Garrison, Coronary heart disease among Negroes and whites in Evans County, Ga., J. Chron. Dis. 18:443-468 (1965).
4 B. Raja Rao and P. E. Enterline, Interaction, contrast disease rates for assessing synergism (or antagonism) in multifactor-multilevel disease risks, Accepted for publication in Biometrical Journal.