SOCIAL SCIENCE RESEARCH 5, 95-105 (1976)

Theory Trimming

J. MILLER MCPHERSON
University of Nebraska
This paper attempts to show that the "Theory Trimming" technique makes little contribution to theory in sociology. A brief history of the technique is given, and several key assertions of its proponents are abstracted. These assertions are shown to be highly questionable. The general alternative to Theory Trimming is presented, and the relationship between Theory Trimming and more general orientations to research is discussed.
In 1954 Herbert A. Simon set the stage for what has become a major methodological revolution in sociology by providing a clear causal interpretation for spurious correlation (Simon, 1954). Hubert Blalock made this revolution a reality by clarifying and extending this interpretation in a series of articles in major sociological journals (Blalock, 1960a, 1960b, 1961a, 1961b, etc.). The synthesis of these ideas with the techniques of path analysis in the late 1960s provided the final impetus for an explosion of methodological articles in sociology (cf. Borgatta et al., 1969, 1970; Costner, 1971, 1972, 1973-1974; Blalock, 1968, 1971, 1974; Goldberger and Duncan, 1973; and many others). One outgrowth of the combination of path analysis with the Simon-Blalock tradition is the "Theory Trimming" technique first demonstrated by Duncan (1966) and later given its name by Heise (1969). This technique, in various forms, has been put to use in a number of subsequent articles (see Britt and Galle, 1969; Laumann et al., 1974; Cartwright and Schwartz, 1973; Braungart, 1971; Duncan and Featherman, 1972; Mulford et al., 1972; and many others). The purpose of this report is to trace the development of the technique, and to show that it is simply a disguised form of data dredging (Selvin and Stuart, 1966). The paper begins with a brief history of the technique, and abstracts a set of assertions from the work of its advocates. These assertions are then shown to be untenable. Finally, alternatives to the procedure are discussed.
Send reprint requests to the author, Department of Sociology, University of Nebraska, Lincoln, NB 68588.

Copyright © 1976 by Academic Press, Inc. All rights of reproduction in any form reserved.
HISTORY
In 1964 Blalock introduced what has since come to be known as the Simon-Blalock technique for evaluating causal models. This technique involves testing particular partial correlations against the null hypothesis that they are zero:

If we are willing to assume the correctness of the above recursive causal model, we can estimate each of the b's by ordinary least squares. In order to do so we merely study each equation separately, estimating the b's by whatever standard computing routine is most convenient. But we are seldom in a theoretical position to make an out and out assumption that a given model is in fact correct. Instead, we wish to test the adequacy of the model in some way. Keeping in mind that the numerators of the correlation coefficients and slopes have the same values, we see that the vanishing of any given b is equivalent to the disappearance of the comparable partial correlation. It turns out that the prediction equations can therefore be written in the form of some partial correlation set equal to zero. (1964, p. 64)
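A minimal sketch of the kind of test Blalock describes, assuming a chain model x1 → x2 → x3 has been specified a priori (the data and sample size below are hypothetical): the model predicts that the partial correlation of x1 and x3 controlling x2 vanishes, and the sample value is tested against that prediction.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical data generated from the causal chain x1 -> x2 -> x3
# (no direct x1 -> x3 effect), the structure the a priori model asserts.
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
x3 = 0.7 * x2 + rng.normal(size=n)

r12 = np.corrcoef(x1, x2)[0, 1]
r13 = np.corrcoef(x1, x3)[0, 1]
r23 = np.corrcoef(x2, x3)[0, 1]

# First-order partial correlation of x1 and x3, controlling x2.
r13_2 = (r13 - r12 * r23) / np.sqrt((1 - r12**2) * (1 - r23**2))

# t statistic for H0: partial correlation = 0 (df = n - 3).
t = r13_2 * np.sqrt((n - 3) / (1 - r13_2**2))
print(f"r13.2 = {r13_2:.3f}, t = {t:.2f}")
# |t| well below roughly 1.96 is consistent with the model's a priori
# prediction that this partial correlation vanishes.
```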
Now it is crucial to note that at this point, the Simon-Blalock technique is designed to test a model that has been specified a priori. In fact, Blalock later in the discussion emphasizes this: "It should be explicitly noted that we have not established the validity of Model II. We have merely eliminated Model I" (p. 76). The next development of interest takes place in 1966, in a much cited article by O. D. Duncan:

In problems of this kind, Blalock (1964) has been preoccupied with the question of whether one or more path coefficients may be deleted without loss of information. As compared with his rather tedious search procedure, the procedure followed here seems more straightforward. Had some of the (path coefficients) turned out both non-significant and negligible in magnitude, one could have erased the corresponding paths from the diagram and run the regressions over, retaining only those independent variables found to be statistically and substantively significant. (Duncan, 1966, p. 7)
Duncan here is suggesting the postulation of a completely specified exactly identified model (all possible causal effects included), the estimation of the path coefficients, and the deletion of the effects from the model if they are found not to be significant. Duncan does not, however, emphasize that this procedure is totally post hoc, giving the impression that models derived in this fashion have the same logical status as models derived analytically from theory and tested empirically. This confusion reaches its purest form in a much cited discussion of path analysis (Heise, 1969, p. 59):
Fig. 1. Heise's three models. [Path diagrams for models (a), (b), and (c), relating x1, x2, and x3, are not reproduced here.]
Now presuming all assumptions about specifications,¹ identification, and measurements have been met so that one has valid estimations of the path coefficients, he actually is in a position to go beyond his limited initial theoretical knowledge and choose between the three models indicated in Figure 8. [Reproduced here as Fig. 1.] To do this, one operates with the model in Figure 8a, and then examines the resulting path coefficients. If he were to find p31 is zero, he could conclude that there is no direct causal linkage between x1 and x3 and, henceforth, would hold that model 8b is valid. If he were to find p32 is zero, he would conclude that model 8c is valid.

The theory trimming technique, then, has evolved from Blalock's method of testing a priori causal structures to a method of finding a supposedly "valid" model by running a series of regression analyses and throwing out the "negligible" path coefficients. The most recent extensions of the technique hinge upon the development of the confirmatory factor analysis estimating procedure of Jöreskog (1969) (see, for instance, Burt, 1974).

¹Heise's assumptions about specification are restricted to the causal ordering of the variables; the question of direct vs indirect linkages is not addressed in this assumption.
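The procedure Duncan and Heise describe can be sketched concretely for the three-variable case of Fig. 1: estimate the saturated equation of model (a), erase whichever path coefficient is negligible, and label the surviving structure as model (a), (b), or (c). The data, the |t| > 2 cutoff, and the variable names in the sketch below are all hypothetical.

```python
import numpy as np

def ols(X, y):
    """OLS estimates and t ratios; X includes a leading constant column."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    s2 = resid @ resid / (len(y) - X.shape[1])
    return b, b / np.sqrt(np.diag(s2 * XtX_inv))

def std(v):
    """Standardize a variable so the coefficients are path coefficients."""
    return (v - v.mean()) / v.std()

rng = np.random.default_rng(2)
n = 400
# Simulated data in which x1 affects x3 only through x2 (i.e., model (b) holds).
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
x3 = 0.7 * x2 + rng.normal(size=n)

# Estimate the saturated equation of model (a): x3 on x1 and x2 (standardized).
X = np.column_stack([np.ones(n), std(x1), std(x2)])
b, t = ols(X, std(x3))
p31, p32 = b[1], b[2]        # direct paths x1 -> x3 and x2 -> x3
t31, t32 = t[1], t[2]

# "Trim": erase whichever path is not significant (|t| < 2, an arbitrary
# cutoff) and declare the corresponding model "valid"; this is the step
# the text goes on to question.
if abs(t31) < 2.0:
    choice = "model (b): drop the x1 -> x3 path"
elif abs(t32) < 2.0:
    choice = "model (c): drop the x2 -> x3 path"
else:
    choice = "model (a): keep both paths"
print(f"p31 = {p31:.3f} (t = {t31:.1f}), p32 = {p32:.3f} (t = {t32:.1f}) -> {choice}")
```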
Before dealing with these extensions, however, it is useful to develop the major premises of the technique, and discuss them.

Assertion No. 1: that the assumptions about specifications, identification, and measurement are sufficient for "valid estimation" of path coefficients. The crucial phrase here is "valid estimation." The problems of estimation are typically given short shrift in the sociological literature, and the article in question is a good example.² Since the meaning of "valid estimation" is ambiguous, we will discuss two possible meanings, beginning with the most restrictive and proceeding to the less restrictive possibility. (1) If the phrase means that the estimate is equal to the population parameter, then the assertion must be rejected out of hand, since all estimators in finite samples possess sampling variability. At best, estimators may be unbiased (the mean of the sampling distribution is the population value) and efficient (the sampling distribution has minimum variance).³ (2) If the phrase means that the estimates produced by the procedure are "the best possible" according to accepted criteria, the assertion is still incorrect. In order to prove this, we must introduce the notion of a "true" equation. By "true" equation is meant simply that if we could measure the variables in the population, rather than the sample, then the values we observe for the parameters in the equation would describe exactly the true causal relationship among the variables. A crucial feature of "true" equations is that parameters corresponding to independent variables not belonging to the equation are exactly zero. Thus, in the hypothetical situation of knowing the true equation, we know exactly which variables affect a given variable and which do not. This information is not provided by the theory trimmer's assumptions; indeed, the Theory Trimming technique purports to deduce it from the data. Unless the independent variables in an estimated regression equation actually appear also in the true regression equation governing the process being researched, the estimates do not possess the minimum variance property mentioned earlier.⁴ Unless all of the variables actually belong in the equation, each of the estimates in the equation has greater variance than the "best possible" estimate.

²For sociological discussions of estimation, see particularly Hauser and Goldberger (1971) and Blalock et al. (1970).
³We do not mean to imply that the author did not know the difference between a sample estimate and a population parameter. The point is that his argument may be taken to suggest an equivalence between the two. Later on in the discussion, it should become clear that a researcher could make the conclusions said to be possible using the Theory Trimming procedure only if the population parameters were estimated exactly.
⁴A scalar-algebraic proof of this assertion appears in Rao and Miller (1971, pp. 56-57). A more general matrix-algebraic proof may be found in Goldberger (1964).
Therefore, the only case in which the best estimates occur is precisely the case in which the Theory Trimming procedure is useless: the case in which all causal assertions made in the initial stage are true. Actually, the situation is even worse. It can be shown that if an independent variable appearing in the true equation is omitted from the estimated equation, then the other estimates in the estimated equation are biased.⁵ This possibility is not formally ruled out by the initial assumptions of Theory Trimming, although in practice the procedure "dumps in" all eligible independent variables, decreasing the probability of the latter problem while increasing the probability of the former. Therefore, we must conclude that Assertion No. 1 cannot be supported unless the phrase "valid estimates" means simply "estimates," not ruling out the possibility of bias or inefficiency.

Assertion No. 2: that it is possible to evaluate the statistical significance of each regression coefficient by the procedure described.⁶ The first aspect of the problem is that there is no single hypothesis being evaluated. For instance, in an equation with K possible independent variables, under the assumptions, it is possible for any combination of them to be "irrelevant," in the sense discussed by Rao and Miller (1971). Each of the 2^K possible true equations constitutes a distinct hypothesis, with a different null and/or alternative. These hypotheses, of course, are not independent. To proceed as if each coefficient could be evaluated in isolation is to violate the rules of elementary statistical inference. Unless there is an a priori reason to test a particular hypothesis, it is very difficult to evaluate the amount of support for any given hypothesis. The data cannot tell the researcher which hypothesis to test; at best, the data may tell when a particular hypothesis is supported or unsupported, when a priori grounds exist for testing it. However, there are even more problems. As already demonstrated, estimates from equations with irrelevant variables do not possess minimum variance. Unfortunately, the tables typically used for testing hypotheses are constructed under the assumption that the estimator is the minimum variance estimator. Thus, the critical value for the evaluation of an hypothesis about a given estimate is likely to be in error. Therefore, we must reject the assertion that scanning the model as Duncan suggests allows evaluation of the statistical significance of the path coefficients.

⁵See Rao and Miller (1971).
⁶Duncan notes in his addenda (1971) that there are problems of statistical inference in the paper, but does not discuss directly the issue raised here.
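The two estimation problems developed under Assertion No. 1, inefficiency when an irrelevant variable is included and bias when a relevant variable is omitted, are easy to see in a small simulation. The sketch below uses made-up equations; it merely reproduces the standard results proved in Rao and Miller (1971) and Goldberger (1964).

```python
import numpy as np

def coef_x1(X, y):
    """OLS coefficient on x1, assuming column order (const, x1, ...)."""
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

rng = np.random.default_rng(3)
n, reps = 100, 2000
true_b1 = 1.0

est_correct, est_irrelevant, est_omitted = [], [], []
for _ in range(reps):
    x1 = rng.normal(size=n)
    x2 = 0.6 * x1 + rng.normal(size=n)       # x2 is correlated with x1
    y_a = true_b1 * x1 + rng.normal(size=n)  # true equation A: x2 irrelevant
    y_b = true_b1 * x1 + 0.8 * x2 + rng.normal(size=n)  # true equation B: x2 relevant

    ones = np.ones(n)
    est_correct.append(coef_x1(np.column_stack([ones, x1]), y_a))
    est_irrelevant.append(coef_x1(np.column_stack([ones, x1, x2]), y_a))
    est_omitted.append(coef_x1(np.column_stack([ones, x1]), y_b))

for label, est in [("true equation estimated", est_correct),
                   ("irrelevant x2 included", est_irrelevant),
                   ("relevant x2 omitted", est_omitted)]:
    est = np.asarray(est)
    print(f"{label:26s} mean = {est.mean():.3f}   var = {est.var():.4f}")
# Typical result: the first two means are near 1.0, but the second has the
# larger sampling variance; the third mean is biased well away from 1.0.
```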
Assertion No. 3: that the Theory Trimming procedure results in a "valid" model. Here again, there is a problem of ambiguity in the meaning of "valid." However, if we assume that it means something like "best estimate of the true model," then we must reject the assertion. We have already shown the difficulty in establishing whether a particular estimate is zero in a single equation.⁷ Now, we are faced with the problem of combining these equations into a "valid" model.⁸ Recent research has shown that the alpha level for hypotheses on multiple equation path (diagonal recursive) models is related to the alpha level for hypotheses on the component equations in a relatively complex fashion (McPherson and Huang, 1974). For instance, the alpha level for each equation included in the general hypothesis must be more stringent than would be the case if the equation were evaluated in isolation. Thus, in a five equation model with hypotheses in each equation, if each hypothesis is tested at the .05 alpha level, the resulting overall alpha level for the model is greater than .2, not a very stringent test.⁹ And this test is, of course, based upon a priori predictions. The case for the significance of "discovered" relationships is quite a bit more subtle and complex (see Scheffe, 1959, Chap. 3 for example). Therefore, we must reject Assertion No. 3 on both the grounds that the individual equations are suspect, and that the method of relating each equation to the whole model is not viable.

Assertion No. 4: that the Theory Trimming procedure trims theory; that it is a method which results in a new or "more realistic" theory. This assertion is derived from one of the final sections of the cited article (Heise, 1969, p. 69):

Given a well-identified recursive system, it is possible to trim a theory down to a more parsimonious version by deleting causal linkages associated with zero path coefficients. In other words, given most of a theory, one can infer from correlational data that some causal relations do not exist. The Simon-Blalock technique of hypothesizing a theory and then testing it with correlational data is seen to be applicable to rejecting some theories, but not to the inductive problems of developing new theories or modifying the old theories so that they are more realistic.
This assertion is the most difficult to address, since it hinges upon one's beliefs about the logic of science. An issue which remains unresolved in the literature is whether or not a path model is a theory. Without debating the issue here, we can usefully examine the information contained in path models of the sort used in the Theory Trimming procedure, and determine whether a "new" or "more realistic" theory is a result. The most basic type of information contained in a path model is simply the delimitation of the relevant variables. This information derives from a priori logic and research in a particular content area. The second type of information has to do with causal priorities among the variables.

⁷In fact, concluding that the estimate is zero is the hoary fallacy of "accepting the null hypothesis" (Blalock, 1960b, p. 123).
⁸Note that now there are 2^(K(K-1)/2) possible true models, given K variables with known causal priorities.
⁹The formula is (1 - αT) = (1 - α1)(1 - α2) ... (1 - αk), where αT is the alpha level for the total model, and k is the number of equations.
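As a quick numerical check of the formula in footnote 9, using the five-equation, .05-per-equation example from the text (the code is purely illustrative):

```python
# Overall alpha for a model of k equations, each tested at its own level
# (footnote 9): 1 - alpha_T is the product of the (1 - alpha_i).
alphas = [0.05] * 5
prod = 1.0
for a in alphas:
    prod *= (1 - a)
alpha_total = 1 - prod
print(round(alpha_total, 3))   # 0.226, i.e., greater than .2
```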
As all treatments of basic path analysis point out (cf. Duncan, 1966; Land, 1969), it is necessary to assume some particular causal ordering for all variables. This information enters the model as an untested assumption (particularly for cross-section, diagonal recursive models), and is not changed during the analysis (for an interesting example of confusion on this point, see Rehberg et al., 1970). The third type of information in a path model is relational. This type of information relates the variables to one another through assertions about direct causation. Each of the N(N-1)/2 relations in a model contains a bit of relational information in the form of the presence or absence of an arrow between the two variables. However, the presence of an arrow contains less information than the absence of an arrow. The presence of an arrow denotes that the independent variable in the relationship causes variation directly in the dependent variable, with the amount unspecified. The absence of an arrow asserts that the independent variable causes exactly no direct variation in the dependent variable within the limits of sampling error. Obviously, the latter is a much more restrictive statement than the former, in that it specifies an exact value for the relationship. This difference, of course, is the basis for the Simon-Blalock theory testing technique, in the sense that the a priori information that two variables are not causally linked (directly) provides a prediction of exactly zero for some parameter of the model. Referring to Fig. 1, we see that the difference between models (a) and (c) is that (c) gives us a value of zero for the parameter relating X2 to X3. This value of zero becomes the value for the null hypothesis. As McPherson and Huang (1974) show, testing this parameter estimate against zero gives the same result as testing the partial correlation against zero (the Simon-Blalock procedure). However, in the Theory Trimming technique, there is no prediction to be tested with the data; that is, there is no exact value under the null hypothesis. The delimitory and causal priority information cannot be tested at all,¹⁰ and none of the relational information provides exact values with which estimates can be compared. The relevant question is, "What theory is being trimmed here?" The delimitory and causal priority information may be thought of as theoretical information, but the relational information is simply that "everything causes everything else" (within the confines of delimitory and causal priority assumptions), which is no theory at all. Since the procedure can modify only the relational information, which in this case does not come from theory, no theory is being trimmed. At best, estimates of the magnitudes of direct causal effects are obtained. However, these estimates are appropriate (i.e., efficient and unbiased; see Goldberger, 1964, p. 262) only when the model is fully specified a priori.
¹⁰One could argue that if a variable has all zero paths relating to it, then one could "test" the delimitory information. Of course, this "test" could be as easily performed with simple correlations as with path analysis.
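The equivalence invoked above, that testing a path coefficient against zero gives the same result as testing the corresponding partial correlation against zero, can be verified numerically. The sketch below uses simulated data and ordinary least squares; it illustrates the identity only, not the McPherson and Huang (1974) derivation.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
x3 = 0.3 * x1 + 0.4 * x2 + rng.normal(size=n)

# t ratio for the coefficient of x2 in the regression of x3 on x1 and x2.
X = np.column_stack([np.ones(n), x1, x2])
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ x3
resid = x3 - X @ b
s2 = resid @ resid / (n - 3)
t_coef = b[2] / np.sqrt(s2 * XtX_inv[2, 2])

# t ratio for the partial correlation of x3 and x2 controlling x1:
# correlate the residuals of x3 and x2 after each is regressed on (1, x1).
Z = np.column_stack([np.ones(n), x1])
proj = Z @ np.linalg.inv(Z.T @ Z) @ Z.T
r_part = np.corrcoef(x3 - proj @ x3, x2 - proj @ x2)[0, 1]
t_partial = r_part * np.sqrt((n - 3) / (1 - r_part**2))

print(f"{t_coef:.6f}  {t_partial:.6f}")   # identical up to rounding error
```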
ALTERNATIVES: THEORETICAL AND STATISTICAL
The basic criterion for a researcher's deciding whether or not to "theory trim" is whether he believes that the data can form his hypothesis for him. It should be clear to the reader of the first section of this paper that there are serious problems in this belief. It is possible to put this issue into a more general context: a form of logical positivism versus operationism. The logical positivist (as exemplified in the work of Popper, 1959) sees the basic aim of research as falsification. Falsification occurs when a proposition or hypothesis is contradicted by experience (Popper, 1959, pp. 40-41). A scientific theory, according to Popper, is one which could be falsified; the purpose of research is to test explanations against data. By extension, a model is an expression of an explanation which can be directly related to data. The operationist, on the other hand, sees the model as the theory itself. The concepts of the theory are the variables in the model, as witnessed by the famous assertion that "intelligence is what intelligence tests measure" (Kaplan, 1964, p. 40). Trimming a model, then, is trimming a theory; finding the best fitting model is generating the best theory. The basic question which the operationist cannot answer is why a model might be important enough to estimate in the first place. If the model is the theory, then the only criterion for judging the importance of a model is how well it fits the data. Taking this position to its logical extreme, any model is just as important or scientifically interesting as any other, as long as it fits the data. There is no guide to what area of research should be concentrated upon. This guide is readily available for the positivist because the model is a consequence of theory as opposed to being theory. The most important research in this framework occurs when two different theories predict different outcomes. This is the well-known case of the "crucial test" (see, for instance, Stinchcombe, 1968). The simplest instance of a crucial test is the elementary test for spurious correlation, when one theory posits direct causation, and the other posits spuriousness. Referring to Fig. 1, one theory produces model (a), and the other theory produces (c).¹¹ In this instance the crucial test dictates estimating the parameter relating X2 to X3, controlling for X1, and testing it against zero.

¹¹An example of exactly this situation occurs in Galle et al. (1972). The research problem centers upon whether the relationship between population density and social pathology is direct [i.e., model (a)] or spurious through social class [model (c)]. The theory behind model (a) is a classical biological one, deriving from animal studies; the theory behind (c) is sociological, in that the explanation posits social structure as the source of covariation in density and pathology for humans. Note that exactly the same research problem generates an entirely different model in McPherson (1975), because of additional data.
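A minimal sketch of this crucial test follows. The variable names echo the density-pathology example of footnote 11, but the data are simulated (here under the spuriousness scenario of model (c)) and do not reproduce the Galle et al. analysis.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000

# Model (c) world: class (X1) drives both density (X2) and pathology (X3);
# density has no direct effect on pathology.
x1_class = rng.normal(size=n)
x2_density = 0.7 * x1_class + rng.normal(size=n)
x3_pathology = 0.6 * x1_class + rng.normal(size=n)

# Crucial test: estimate the X2 -> X3 parameter controlling for X1,
# and test it against the exact value (zero) predicted by model (c).
X = np.column_stack([np.ones(n), x1_class, x2_density])
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ x3_pathology
resid = x3_pathology - X @ b
s2 = resid @ resid / (n - 3)
t = b[2] / np.sqrt(s2 * XtX_inv[2, 2])

# |t| beyond roughly 1.96 would count against the theory behind model (c);
# a t near zero counts against the direct-causation theory of model (a).
print(f"b(density, net of class) = {b[2]:.3f}, t = {t:.2f}")
```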
The estimation procedure might be undertaken in the context of path analysis, cross-tabulation, or experimental design.¹² The point is that the magnitude of the estimate is important only in the context of theoretical expectations.¹³ If the parameter is demonstrably nonzero, we reject the theory that generated model (c). If it appears to be zero, we reject the theory that generated model (a). The impetus for building and estimating a model in the first place must come from a theoretical problem; this problem derives from logically or mathematically inconsistent predictions from different explanations. At worst, the problem may occur when the adequacy of a single theory may be called into question by the absence of causal linkages specified to exist by the theory. When a theory says nothing about direct causal links, as in the theory trimming case, the estimates, whatever they are, are only empirical regularities which at best might become the focus of some future explanation.
CONCLUSIONS

Our discussion has come full circle, in that we must conclude that the original Simon-Blalock procedure is in no way improved upon by Theory Trimming. The Simon-Blalock procedure itself is a version of the scientific method of comparing a priori expectations with empirical outcomes, and rejecting or modifying the premises which led to the expectations in the face of contrary evidence. The theory trimming procedure ultimately produces information which is no different qualitatively from simple observation, in the sense that no theory is tested, and no falsification can result. While simple observation is undeniably important to the scientific endeavor, it is absolutely crucial to make a distinction between observing empirical regularities and predicting them. This distinction may well have been clear to the formulators of Theory Trimming, but subsequent researchers seem to be less aware of it than is desirable. It is hoped that this paper has contributed to a clearer understanding of the distinction.
¹²The essential equivalence of means, differences of means, and regression coefficients as estimates of parameters is clear in the relevant statistical literature (see, for instance, Christ, 1966; Goldberger, 1964).
¹³Unconvinced operationists should note that recent advances in estimation (cf. Jöreskog, 1969, 1970, 1973) allow a direct test of any hypothesized value against any other value, for any set of parameters. Thus, it is now possible to test whether the exact values found in a model estimated with data set "a" are consistent with different sets of data "b," "c," and so forth. Specht and Warren (1975) provide an explicit outline for comparing the results of estimated models across populations. Once again, we would argue that these procedures are useful only when the differences across populations are subject to some theoretical interpretation leading to new hypotheses. For instance, a comparison of the mobility process among Blacks and Whites may generate new insight into the dynamics of discrimination which could be tested in a totally different context.
REFERENCES

Blalock, H. M. (1960a), "Correlational analysis and causal inferences," American Anthropologist 62, 624-631.
Blalock, H. M. (1960b), Social Statistics. McGraw-Hill, New York.
Blalock, H. M. (1961a), "Correlation and causality: The multivariate case," Social Forces 34, 246-251.
Blalock, H. M. (1961b), "Evaluating the relative importance of variables," American Sociological Review 26, 866-874.
Blalock, H. M. (1962a), "Spuriousness vs. intervening variables: The problem of temporal sequences," Social Forces 40, 330-336.
Blalock, H. M. (1962b), "Four variable causal models and partial correlations." In H. M. Blalock (Ed.), Causal Models in the Social Sciences, pp. 18-32. Aldine-Atherton, New York.
Blalock, H. M. (1964), Causal Inferences in Nonexperimental Research. University of North Carolina Press, Chapel Hill. (See also review by McGinnis in Social Forces 44, 1966, pp. 584-586.)
Blalock, H. M. (1968), "The measurement problem: A gap between the languages of theory and research." In H. M. Blalock and Ann B. Blalock (Eds.), Methodology in Social Research, Chap. 1. McGraw-Hill, New York.
Blau, P. M., and Duncan, O. D. (1967), The American Occupational Structure. Wiley, New York.
Borgatta, Edgar F. (1969), Sociological Methodology 1969. Jossey-Bass, San Francisco.
Borgatta, Edgar F., and Bohrnstedt, G. W. (1970), Sociological Methodology 1970. Jossey-Bass, San Francisco.
Braungart, R. G. (1971), "Family status, socialization, and student politics: A multivariate analysis," American Journal of Sociology 77, 101-130.
Britt, David W., and Galle, Omer R. (1969), "Industrial conflict and unionization," American Sociological Review 37, 46-56.
Burt, R. S. (1974), "Confirmatory factor analytic structures and the theory construction process," Sociological Methods and Research 2, 131-190.
Cartwright, B. L., and Schwartz, R. D. (1973), "The invocation of legal norms: An empirical investigation of Durkheim and Weber," American Sociological Review 38, 340-354.
Christ, C. F. (1966), Economic Models and Methods. John Wiley and Sons, New York.
Costner, H. L. (1969), "Theory, deduction, and rules of correspondence," American Journal of Sociology 75, 245-263.
Costner, H. L. (1971), Sociological Methodology 1971. Jossey-Bass, San Francisco.
Costner, H. L. (1972), Sociological Methodology 1972. Jossey-Bass, San Francisco.
Costner, H. L. (1974), Sociological Methodology 1973-74. Jossey-Bass, San Francisco.
Duncan, O. D. (1966), "Path analysis: Sociological examples," American Journal of Sociology 72, 1-16.
Duncan, O. D., Haller, A. O., and Portes, A. (1968), "Peer influence on aspirations: A reinterpretation," American Journal of Sociology 74.
Duncan, O. D., and Featherman, D. L. (1972), "Psychological and cultural factors in the process of occupational achievement," Social Science Research 1, 121-146.
Galle, Omer R., Gove, W., and McPherson, J. M. (1972), "Population density and social pathology: What are the relations for man?" Science 176, 23-30.
Goldberger, A. S. (1964), Econometric Theory. John Wiley, New York.
Goldberger, A. S., and Duncan, O. D. (1973), Structural Equation Models in the Social Sciences. Seminar Press, New York.
Gordon, R. A. (1968), "Issues in multiple regression," American Journal of Sociology 73, 592-616.
Hauser, R. M., and Goldberger, A. S. (1971), "The treatment of unobservable variables in path analysis." In Sociological Methodology 1971, pp. 81-117. Jossey-Bass, San Francisco.
Heise, D. R. (1969), "Problems in path analysis and causal inference." In E. F. Borgatta (Ed.), Sociological Methodology 1969, Chap. 2. Jossey-Bass, San Francisco.
Jöreskog, K. G. (1969), "A general approach to confirmatory maximum likelihood factor analysis," Psychometrika 34, 183-202.
Jöreskog, K. G. (1970), "A general method for the analysis of covariance structures," Biometrika 57, 239-251.
Jöreskog, K. G. (1973), "A general method for estimating a linear structural equation." In Goldberger and Duncan (Eds.), Structural Equation Models in the Social Sciences, Chap. 5. Seminar Press, New York.
Kaplan, Abraham (1964), The Conduct of Inquiry. Chandler Publishing, San Francisco.
Land, K. C. (1969), "Principles of path analysis." In Borgatta (Ed.), Sociological Methodology 1969, pp. 3-37. Jossey-Bass, San Francisco.
Laumann, E. O., Verbrugge, L. M., and Pappi, F. U. (1974), "A causal modelling approach to the study of a community elite's influence structure," American Sociological Review 39, 162-174.
McPherson, J. M. (1975), "Population density and social pathology: A reexamination," Sociological Symposium 14, 77-92.
McPherson, J. M., and Huang, C. W. (1974), "Hypothesis testing in path analysis," Social Science Research 3, 127-139.
Mulford, C. L., Klonglan, G. E., Warren, R. D., and Schmitz, P. E. (1972), "A causal model of effectiveness in organizations," Social Science Research 1, 61-78.
Popper, K. R. (1959), The Logic of Scientific Discovery. Harper & Row, New York.
Rao, Potluri, and Miller, R. L. (1971), Applied Econometrics. Wadsworth, Belmont, CA.
Rehberg, R. A., Schafer, W. E., and Sinclair, J. (1970), "Toward a temporal sequence of adolescent achievement variables," American Sociological Review 35, 34-38.
Scheffe, Henry (1959), The Analysis of Variance. John Wiley and Sons, New York.
Selvin, H. L., and Stuart, J. (1966), "Data dredging procedures in survey analysis," The American Statistician 20(3), 20-23.
Simon, H. A. (1954), "Spurious correlation: A causal interpretation," Journal of the American Statistical Association 49, 467-479.
Specht, David A., and Warren, R. D. (1975), "Comparing causal models." In D. R. Heise (Ed.), Sociological Methodology 1976, Chap. 2. Jossey-Bass, San Francisco.