New Ideas in Psychol. Vol. 13, No. 3, pp. 259-279, 1995
Copyright © 1995 Elsevier Science Ltd
Printed in Great Britain. All rights reserved
0732-118X/95 $9.50 + 0.00
Pergamon
0732-118X(95)00012-7
IN DEFENSE OF EXPERIMENTAL DATA IN A RELATIVISTIC MILIEU

SIU L. CHOW*
Department of Psychology, University of Regina, Regina, Saskatchewan, Canada S4S 0A2

Abstract--The objectivity and utility of experimental data as evidential support for knowledge-claims may be found suspect when it is shown that (a) the interpretation of experimental data is inevitably complicated by social factors like experimenter effects, subject effects and demand characteristics, (b) social factors which affect experimental data are themselves sensitive to societal conventions or cultural values, (c) all observations (including experimental observations) are necessarily theory-dependent, and (d) experimental data have limited generality because they are collected in artificial settings. These critiques of experimental data are answered by showing that (i) not all empirical studies are experiments, (ii) experimental methodology is developed to exclude alternate interpretations of data (including explanations in terms of social influences), (iii) theoretical disputes and their settlement take place in the context of a particular frame of reference, and (iv) objectivity can be achieved with observations neutral to the to-be-corroborated theory despite theory-dependent observations if distinctions are made (a) between prior observation and evidential observation and (b) between a to-be-corroborated theory and the theory underlying the identity of evidential response.

Two recent publications are eloquent accounts of why we may be dissatisfied with the experimental approach to psychological research. The influences of social factors on psychological research are emphasized in both critiques of experimental studies in psychology.
While Danziger (1990) is more concerned with the influences of various professional, institutional, societal and political constraints on psychologists' investigative practice, Gergen (1991) emphasizes that psychologists' appeal to experimental data is inevitably circular because all observations are theory-dependent. They also find psychologists' appeal to empirical data, as well as their reliance on statistics, wanting because psychologists seem to pursue these technical niceties at the expense of conceptual sophistication or social relevance. These critics of experimental data seem to suggest that psychologists collect data in artificial settings because of their servitude to methodology. Moreover, in using statistics, psychologists overlook the importance of individual or group differences. Psychologists' knowledge-claims are thereby necessarily suspect.

*I am grateful for the helpful comments of an anonymous reviewer on an earlier version of the paper.

To Danziger (1990) and Gergen (1991), what psychologists have acquired through their research efforts are various knowledge-claims constructed in specific social contexts. The meanings, significance or usefulness of these knowledge-claims cannot be properly appreciated unless the social context in which they are constructed is taken into account. In short, it is inevitable that psychological knowledge-claims are not true in an absolute sense. They are assertions relative to a specific theoretical perspective in the context of a particular social milieu. Hence, psychological knowledge-claims cannot be assessed in objective, quantitative terms.

This assessment of experimental psychologists' current investigative practice will be called relativistic critique in subsequent discussion. Critics of psychologists' investigative practice from the relativistic critique perspective will be called relativistic
critics.

It will be shown that much of what is said about properly collected experimental data in the relativistic critique is debatable. The putative influences of social factors on research outcome iterated in the relativistic critique argument are neither as insidiously pervasive nor as validly established as they are said to be because (a) not all empirical research is experimental research, and (b) the logic of experimental research renders it possible to exclude the various social factors as artifacts. How non-intellectual factors may affect research practice (e.g., the practical importance of the research result) is irrelevant to assessing the validity of knowledge-claims if a distinction is made between the rules of a game and how the game is played. A case can be made for objectivity despite theory-dependent observations if certain distinctions are made. Moreover, unambiguous theory-corroboration data can only be obtained in artificial settings.

The present discussion of the relativistic critique begins with a description of an experiment so as to provide a helpful frame of reference. Also described is the affinity of the experimental design with the formal structure of Mill's (1973) inductive methods. The meanings of validity and control are explicated with reference to the experimental design so as to provide a rejoinder to the "methodolatry" characterization of experimentation. A case is made for ecological invalidity. The objectivity and non-circular nature of experimental data are illustrated with reference to a few meta-theoretical distinctions. The putative effects of the social context on experimental evidence are assessed by revisiting the social psychology of the psychological experiment, or SPOPE, for short.

THE LOGIC OF EMPIRICAL JUSTIFICATION

Consider the following commonplace experience.
Suppose that you are introduced to a group of people at a party. By the time you meet the sixth individual, you most probably have forgotten the name of the first person to whom you have just been introduced. This phenomenon is to be explained in terms of a storage mechanism, called the short-term store (Baddeley, 1990), which retains information for only a few seconds (see the "Phenomenon" and "Hypothetical mechanism" rows in Table 1). How information is lost from the short-term store depends on the theoretical properties attributed to the hypothetical memory store in question. Two theoretical possibilities readily suggest themselves (see the two entries in the "Theoretical property" row in Table 1). First, there is a finite number of storage slots in the short-term store. Old information is displaced when incoming information exceeds the storage capacity of the short-term store (i.e., the "Interference" theory). The second explanation is that it takes time to get introduced to several people. Information in
Table 1. A schematic representation of the relationship among to-be-explained phenomenon, theory, experimental expectation, and evidential data in support of theory

Phenomenon:
    Information learned casually is forgotten rapidly.
Hypothetical mechanism:
    A short-term storage mechanism: the short-term store.
Theoretical property of mechanism:
    Interference: Forgetting occurs when the task demand exceeds the fixed number of storage slots.
    Decay: Forgetting is a function of delay of information retrieval.
Implication:
    Interference: More forgetting if more to remember, regardless of time.
    Decay: More forgetting if longer to retain, regardless of memory load.
Experimental task:
    Waugh and Norman's (1965) probed recall task. (a) Fast rate of presentation: 4 digits per second. (b) Slow rate of presentation: 1 digit per second. [See Sub-Table 1a.]
Identity of response:
    The digit following the probe-digit.
Theoretical prescription (experimental expectation):
    % of digits correctly recalled as a function of the time interval between the 2 occurrences of the to-be-recalled digit:
    Interference: Fast presentation < Slow presentation.
    Decay: Fast presentation = Slow presentation.
Experimental outcome:
    Recall performance (% of digits correctly recalled): Fast presentation < Slow presentation.
Experimental conclusions:
    (a) The decay theory can be rejected. (b) There is no reason to reject the interference theory.

Sub-Table 1a. A schematic representation of Waugh and Norman's (1965) probed recall task

Digit sequence:                            3 8 7 6 2 8
Probe digit:                               8 (the digit accompanied by a tone)
Response required:                         The digit following the probe-digit
Correct response:                          7
Number of intervening digits (inclusive):  4
the short-term store simply dissipates if not processed further, in much the same way alcohol evaporates if left in an uncovered beaker. Whatever is retained about the name of the first person would have decayed by the time you come to the sixth person (viz., the "Decay" theory).

Cognitive psychologists' modus operandi is first to work out what else should be true if the to-be-corroborated theory is true (viz., the implications of the theory). Acceptance of the theory is to be assessed in terms of its implications. This is done one implication at a time, a process called "converging operations" (Garner, Hake & Eriksen, 1956). Suppose that a certain implication is used to test the theory in the study. The specific theoretical prescription is made explicit by expressing the to-be-tested implication in terms of the chosen experimental task, independent variable and dependent variable. This abstract description of the theory-corroboration process may be explicated as follows.

Waugh and Norman (1965) used a probed-recall task to choose between the Interference and Decay theories. On any trial, subjects were presented auditorily with a sequence of decimal digits at either a fast rate (viz., 4 digits per second) or a slow rate (i.e., 1 digit per second). A digit might be repeated after being intervened by
a predetermined number of other digits. In the example shown in Sub-Table 1a, the digit "8" is accompanied by a tone which serves as a probe to recall. The digit "8" is called the probe-digit. Subjects were to recall the digit which followed the probe-digit in its previous occurrence (i.e., "7" in the example in Sub-Table 1a).

An implication of the Interference theory is that the amount of forgetting increases as the work load increases, irrespective of the retention interval (see the "Implication" row in Table 1). This implication is given a more concrete expression by Waugh and Norman (1965) in terms of their experimental task, as well as the chosen dependent variable (see the "Theoretical prescription (experimental expectation)" row in Table 1). Specifically, performance should be worse in the fast-presentation condition than in the slow-presentation condition when the percent of digits correctly recalled is plotted as a function of the delay between the two occurrences of the probe digit (e.g., 2 seconds).

It is implied in the Decay theory that the amount of forgetting should increase as the delay between original learning and retrieval is extended (see the "Implication" row in Table 1). When expressed in terms of Waugh and Norman's (1965) task, the Decay theory implies that recall performance should be the same at the same delay (e.g., 2 seconds) at both rates of presentation (see the "Theoretical prescription (experimental expectation)" row in Table 1).

It was found that recall performance was poorer at the same delay when items were presented at the fast rate than at the slow rate (see the "Experimental outcome" row in Table 1). The data do not warrant accepting the Decay theory because they contradict what is prescribed by the theory. At the same time, there is no reason to reject the Interference theory because the data are consistent with its implication.
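The task and the two theoretical prescriptions described above can be sketched in a few lines of Python. This is only an illustrative sketch, not Waugh and Norman's procedure: the function names, the string coding of the prescriptions, and the reading of "inclusive" (counting every digit after the earlier occurrence up to and including the probe) are assumptions introduced here.

```python
def score_probed_recall(sequence):
    """Return (correct_response, intervening_inclusive) for one trial.

    The last digit of `sequence` is the probe (the digit presented with the
    tone). The correct response is the digit that followed the probe digit
    at its earlier occurrence.
    """
    probe = sequence[-1]
    earlier = sequence.index(probe)      # position of the earlier occurrence
    last = len(sequence) - 1
    if earlier == last:
        raise ValueError("the probe digit must occur earlier in the sequence")
    correct_response = sequence[earlier + 1]
    # Assumed reading of "inclusive": every digit after the earlier
    # occurrence, up to and including the probe itself.
    intervening_inclusive = last - earlier
    return correct_response, intervening_inclusive


def digits_presented(delay_seconds, digits_per_second):
    """Number of digits presented during a given delay at a given rate."""
    return delay_seconds * digits_per_second


# Ordering of % correct at the SAME delay prescribed by each theory
# (hypothetical string coding of the "Theoretical prescription" row).
PRESCRIPTION = {
    "Interference": "fast < slow",
    "Decay": "fast = slow",
}


def theories_not_rejected(observed_ordering):
    """Theories whose prescription matches the observed ordering."""
    return [name for name, expected in PRESCRIPTION.items()
            if expected == observed_ordering]
```

For the sequence in Sub-Table 1a, `score_probed_recall([3, 8, 7, 6, 2, 8])` gives the correct response 7 with 4 intervening digits. At a 2-second delay, `digits_presented` yields 8 digits at the fast rate against 2 at the slow rate, which is why the Interference theory prescribes poorer recall at the fast rate while the Decay theory prescribes no difference.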
In this sense, the data are said to warrant the acceptance of the Interference theory (see the "Experimental conclusions" row in Table 1).

The inductive rule underlying Waugh and Norman's (1965) experimental design is Mill's (1973) Method of Concomitant Variation, as may be seen from Table 2 (see also Copi, 1982). Variables 1 and 2 are the independent variables, Delay between the two occurrences of the target digit (A) and Rate of digit presentation (X), respectively. The dependent variable is Variable N + 1 (viz., the percent of digits recalled correctly).

Variables 3 through 5 are three control variables, namely, (i) a high-frequency tone was used as the probe tone (i.e., B in Table 2), (ii) digits were read at a steady rate in a male monotone voice (viz., C in Table 2), and (iii) the stimulus presentation was done by using a recorded string of digits (i.e., D in Table 2). Variables 3 through 5 are called control variables because each of them is represented at the same level at all six Delay by Rate combinations of the two independent variables.

In theory, there are numerous extraneous variables (viz., Variables 6 through N). They are assumed to have been controlled (i.e., held constant at the same level at all Delay by Rate combinations) by virtue of the fact that all subjects received many trials of each of the six Delay by Rate combinations in a random order (i.e., the repeated-measures design is used).

Consider the experimental expectation, or theoretical prescription, of the Interference theory. Given any delay, the fast rate of digit presentation imposes greater demands on the short-term store than the slow rate because more digits are
presented at the fast rate. The demand on the short-term store due to the fast presentation rate is amplified the longer the delay. In other words, the Delay by Rate combinations in Rows (i) through (vi) in Table 2 represent increasingly greater demands on the short-term store according to the Interference theory. Larger subscripts of the dependent variable (Y) represent poorer performance. In other words, an inverse relationship is expected between the treatment combinations of the two independent variables and the dependent variable. It is for this reason that Waugh and Norman's (1965) experimental design is said to be an exemplification of Mill's (1973) Method of Concomitant Variation.*

Table 2. The basic design of Waugh and Norman's (1965) experiment and its affinity to Mill's (1973) method of concomitant variation

         Independent    Control        Extraneous       Dependent
         variables      variables      variables        variable
         1     2        3    4    5    6    7 ... N     N + 1
(i)      A1    X1       B    C    D    e    f ... n     Y1
(ii)     A1    X2       B    C    D    e    f ... n     Y2
(iii)    A2    X1       B    C    D    e    f ... n     Y3
(iv)     A2    X2       B    C    D    e    f ... n     Y4
(v)      A3    X1       B    C    D    e    f ... n     Y5
(vi)     A3    X2       B    C    D    e    f ... n     Y6

A = Delay between the two occurrences of the target digit
X = Rate of presentation of digits
A1 X1 = Short delay at slow rate
A1 X2 = Short delay at fast rate
A2 X1 = Medium delay at slow rate
A2 X2 = Medium delay at fast rate
A3 X1 = Long delay at slow rate
A3 X2 = Long delay at fast rate
B = A high-frequency tone as the probe tone
C = Digits read at a steady rate in a male monotone voice
D = Auditory presentation (recorded)
e, f ... n = Variables 6, 7, through N are represented by their respective levels (viz., Levels e, f etc., through n)
Y1 = % of digits recalled correctly at A1 X1
Y2 = % of digits recalled correctly at A1 X2
Y3 = % of digits recalled correctly at A2 X1
Y4 = % of digits recalled correctly at A2 X2
Y5 = % of digits recalled correctly at A3 X1
Y6 = % of digits recalled correctly at A3 X2

*The other exemplification of the Method of Concomitant Variation is the situation in which the functional relationship between the independent and dependent variables is a positive one. What should be added is that the said functional relationship must be established in the presence of all recognized control variables.

ECOLOGICAL VALIDITY REVISITED

Relativistic critics would find Waugh and Norman's (1965) experiment objectionable because their experimental task and test condition were artificial. The message of the relativistic position is that, as the whole situation was so unlike any everyday activity, Waugh and Norman's (1965) conclusion could not possibly be applied to forgetting in real-life situations. This is a good example of the
complaint that experimental data do not have ecological validity (called ecological invalidity in subsequent discussion). It can be shown that the ecological invalidity feature of experimental data can be defended. In fact, it is this feature which renders experimental data non-circular.

The necessity of ecological invalidity may be seen by supposing that city streets are wet early one morning. This phenomenon may suggest to Observer A that it rained overnight. However, Observer B may interpret the wet city streets to mean that city employees cleaned the city overnight. An appeal to wet highways as evidential data has ecological validity (Neisser, 1976, 1988) because wet highways are like wet city streets in terms of their location and in being exposed to the same weather as city streets. This example illustrates why relativistic critics suggest that empirical data are not useful as theory-corroboration evidence, or as the means to choose between contending explanations of observable phenomena. In the words of a relativistic critic,

    For if theoretical assumptions create the domain of meaningful facts, then rigorous observation can no longer stand as the chief criterion for evaluating theoretical positions. Theories cannot be falsified in principle because the very facts that stand for and against them are the products of theoretical presuppositions. To falsify (or verify) a theory thus requires preliminary acceptance of a network of assumptions which themselves cannot be empirically justified. (Gergen, 1991, p. 15) [Quote 1]

Gergen's (1991) point is well made if A and B appeal to wet city streets themselves to justify their respective theories because to do so would be circular. It is also a good point if Observers A and B appeal to wet highways to settle their theoretical dispute (on the mere grounds that they are consistent with their positions) for the following reason.
To Observer A, wet highways are also indicative of the possibility that it has rained overnight. Observer B, on the other hand, would still see wet highways as the consequence of someone having cleaned the city. For either observer, the observer's theoretical assumption influences how data are being interpreted. That is, wet highways are ambiguous as data for choosing between the "It rained" and "Someone cleaned the city" explanations.

This ecological invalidity critique is reasonable if the purpose of Waugh and Norman's (1965) experiment was to demonstrate the phenomenon of remembering a series of names learned hastily in succession. However, this was not what the experiment was about. It was conducted to test two theories about forgetting in the short-term memory situation.

It is often neglected in meta-theoretical discussion that psychological research is not about psychological phenomena per se. This is the case because phenomena of interest to psychologists are seldom in dispute (e.g., forgetting, tip-of-the-tongue, hyperactivity, emotions, etc.). Disputes among psychologists are often disagreements as to how to account for an agreed-upon phenomenon in terms of unobservable hypothetical mechanisms. In other words, the subject-matter of psychological research is theories about phenomena, not the phenomena themselves.

A prerequisite for an explanatory theory being non-circular is that it can be used
to explain phenomena other than the original phenomenon. This is an iteration of cognitive psychologists' modus operandi that they have to stipulate what else can be explained by the theory, in addition to the original phenomenon which invites theorization in the first place. What may be added here is that the "what else" emphasized in bold is more helpful the more different it is from the original phenomenon. This state of affairs is responsible for the fact that good theory-corroboration data do not have (and do not need) ecological validity.

In short, it is necessary to show that the "It rained" or "Someone cleaned the city" explanation can be used to explain phenomena other than wet city streets. Specifically, the "It rained" theory, but not the "Someone cleaned the city" theory, can explain wet leaves on the top of a tall tree. Alternatively, the "Someone cleaned the city," but not the "It rained," theory can explain the fact that the floor of the bandstand in the park is wet.*

Suppose Observers A and B wish to test A's theory. They should examine the leaves on the top of a tall tree. If it did rain, the leaves should be wet. Hence, the "It rained" theory can be unambiguously rejected if the leaves are dry. In other words, Gergen's (1991) assertion that theories cannot be falsified in principle [see Quote 1] is debatable.

Of importance to the ecological invalidity argument is the fact that wet leaves are different from wet city streets. Yet, wet leaves are the data to use to test the "It rained" theory. It has been shown that the unambiguous evidence is unlike the to-be-explained phenomenon. This feature is reflected in cognitive psychologists' modus operandi depicted in Table 1.
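The logic of the wet-streets example can be laid out as a small Python sketch. It is a hedged illustration only: the dictionary of expectations, the function names, and the simplification that "cannot explain X" is coded as "expects not-X" are assumptions made here, and a calm day is assumed as in the footnote to the example.

```python
# What each explanation leads an observer to expect (True = wet, False = dry).
# Simplifying assumption: "cannot explain X" is coded as "expects not-X";
# a calm day is assumed.
EXPECTATIONS = {
    "It rained": {
        "city streets": True,
        "highways": True,
        "treetop leaves": True,
        "bandstand floor": False,
    },
    "Someone cleaned the city": {
        "city streets": True,
        "highways": True,
        "treetop leaves": False,
        "bandstand floor": True,
    },
}


def is_diagnostic(site):
    """A site can settle the dispute only if the theories expect different things."""
    return len({expected[site] for expected in EXPECTATIONS.values()}) > 1


def rejected_by(site, observed_wet):
    """Theories contradicted by the observation at `site` (modus tollens)."""
    return [name for name, expected in EXPECTATIONS.items()
            if expected[site] != observed_wet]
```

Under these assumptions, `is_diagnostic("highways")` is false because both explanations expect wet highways, whereas `is_diagnostic("treetop leaves")` is true, and `rejected_by("treetop leaves", False)` returns only the "It rained" explanation. The observation that discriminates between the explanations is precisely the one that least resembles the original phenomenon.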
It can be seen that the choice between the Interference and Decay theories is settled with data which are unlike the to-be-explained phenomenon (viz., recalling the digit after the probe-digit in the laboratory versus recalling the names of several strangers in an information-rich social gathering). In other words, the ecological validity criterion is the incorrect criterion to use when assessing whether or not experimental data warrant a theoretical position (Mook, 1983).

PRECEDENCE: DATA OR THEORY
The circular theory-data relationship envisaged in the relativistic account is given inadvertent support when psychologists say, in their meta-theoretical moments, that they infer a theory from their data. This may be the result of their having been exposed to how experimentation is being described and explained in statistics textbooks (Chow, 1992a, 1992b). This misleading way of talking about the theory-corroboration procedure suggests that (a) observation precedes theory, and (b) the original observation itself, or observations very similar to the original observation, is used to support the theory. That is, the sequence is phenomenon → theory → phenomenon. This is how a Skinnerian would talk about experimentation
*There is the complication due to auxiliary assumptions. Specifically, a calm day is assumed in the example. If it is not reasonable to assume a calm day, the disputants would have to come up with a different test which takes into account the wind condition. Be that as it may, suffice it to say that this complication does not affect the present argument (see Chow, 1992a).
(Skinner, 1938), and its conceptual difficulties have been made explicit by Chomsky (1959).

Even though they agree with relativistic critics that no observation is theory-independent, contemporary experimental psychologists' modus operandi is based on an observation1 → theory → observation2 sequence. This may be illustrated with reference to two distinctions which have to be made when we assess whether or not empirical data can be used in a non-circular way in theory corroboration. The two distinctions are (i) prior data (i.e., observation1) versus evidential data (viz., observation2), and (ii) a to-be-corroborated theory versus the theory underlying the evidential observation.

Not distinguishing between studying a phenomenon and studying a theory about a phenomenon, critics of experimental psychology tend to lose sight of the fact that contemporary cognitive psychologists draw their theoretical insight from everyday psychological experiences and phenomena (e.g., the "Phenomenon" row in Table 1). Both the Interference and Decay theories are attempts to account for phenomena of forgetting experienced in everyday life. In other words, the phenomenon of forgetting hastily learned information is a datum which invites theorization. The said phenomenon may hence be characterized as prior data (or observation1) vis-à-vis either the Interference or Decay theory.

Contrary to some relativistic critics' characterization (e.g., Gergen, 1988), psychologists do not use prior data to test a theory or to choose among contending theories. To do so would indeed be an exercise in circularity. Instead, they collect data in artificial settings (e.g., the artificial task in Sub-Table 1a), and they assess the experimental data thus collected with reference to an implication of the to-be-corroborated theory. Experimental data are evidential data (or observation2) vis-à-vis the to-be-corroborated theory.
They are collected after the theory has been proposed. Hence, relativists' phenomenon → theory → phenomenon sequence should be replaced by the observation1 → theory → observation2 sequence. More important, evidential data are not identical, or similar, to prior data in theory corroboration. As a matter of fact, in order to be unambiguous, evidential data have to be different from prior data.

TO-BE-CORROBORATED THEORY VERSUS THEORY UNDERLYING EVIDENTIAL RESPONSE

Now consider the "theoretical assumptions create the domain of meaningful facts" assertion in [Quote 1] with reference to the "Identity of response" row in Table 1 and its entry, "The digit following the probe-digit." As a digit, the response is a decimal number. Decimal numbers are one of several kinds of numbers defined by different conventions (viz., hexadecimal, decimal, octal and binary numbers). In other words, the identity of the response used as evidence in Waugh and Norman's (1965) experiment (i.e., evidential response) is dependent on a non-empirical frame of reference (i.e., the decimal number system).

More important to Gergen's (1991) "domain of meaningful facts" is the issue of how the response was categorized in the experiment (viz., as "correct" or "incorrect"). Note that there is an implicit distinction in Gergen's (1991) argument, namely, between the observation (viz., the response) and the "fact" represented by the observation, or the "factual status" of
the response (i.e., the fact of being correct or incorrect). The categorization was done with reference to the "probed-recall" feature of the task. That is to say, the factual status of a response is dependent on the experimental task, not on the Interference or Decay theory. This is contrary to an implication of [Quote 1] that the factual status of the evidential response is dependent on the to-be-tested theory.

At the same time, there is something in the experiment which is indeed dependent on the to-be-corroborated theory. Specifically, the two entries in the "Theoretical prescription" row in Table 1 are dependent on the Interference and Decay theories, respectively. In other words, it seems that the necessary distinction between the factual status of the observation and the theoretical prescription of the to-be-corroborated theory is not made in [Quote 1].

As may be seen, we are dealing with frames of reference, or theories, in different domains when we consider (a) the identity of the response, (b) the factual status of the response (viz., what fact is being represented by the response), and (c) the theoretical prescription of the experiment. There is no inter-dependence among (a), (b) and (c). Hence, it is possible to have atheoretical or theory-neutral observations (vis-à-vis a to-be-corroborated theory) even when all observations are theory-dependent.

In sum, there is no reason why individuals subscribing to different theories cannot agree on (a) what the evidential response is, (b) what "fact" is represented by the response, and (c) what should be the case if a theory is true. Consequently, it is possible to achieve objectivity (viz., inter-observer agreement on experimental observations, regardless of the observers' theoretical commitment), despite the theory-dependence of research observations.
That is the case because multiple theories belonging to different domains or levels of abstraction are being implicated.

It is also important to note that the choice between the Interference and Decay theories is made within the same frame of reference (viz., cognitive psychology). Some relativistic critics may argue that what is said in either theory might be very different had the phenomenon of forgetting been studied in the context of a different frame of reference (e.g., psychodynamics or positivistic psychology). This is true. However, the dispute envisaged by relativistic critics is not one between two theories of forgetting in the context of a frame of reference which is not in dispute. The relativistic critics are effectively suggesting an examination of the relative merits of two different frames of reference (e.g., cognitive psychology versus psychodynamics).

Whether or not such a choice can be made with empirical data, let alone experimental data, depends on a host of issues which go beyond the scope of the present discussion. The crucial point is that questions about the choice between two meta-theoretical frames of reference are not questions as to whether or not a theory can be warranted by experimental data.

KNOWLEDGE-CLAIMS AND VALIDITY

The conclusion of Waugh and Norman's (1965) experiment is that interference is responsible for forgetting in the short-term memory situation. Important to the present discussion is whether or not the conclusion is warranted by their finding, a criterion called "warranted assertability" by Manicas and Secord (1983). This
criterion is germane to the relativistic charge that experimental psychologists pay undue attention to their methodology.

To consider the "warranted assertability" of an experimental conclusion is to examine the theory-data relationship implicated in the experiment (viz., the theory → observation2 part of the observation1 → theory → observation2 sequence). A theory is justified, or warranted, by the data collected for the explicit purpose of testing the theory (i.e., observation2) when Cook and Campbell's (1979) criteria of statistical conclusion validity, internal validity and construct validity are met. The term, validity, is used in the present discussion to highlight the fact that "warranted assertability" is achieved when the three aforementioned kinds of validity are secured.*

Issues about statistical conclusion validity are concerns that the correct statistical procedure is used to analyse the data (e.g., using the related-sample t test for the repeated-measures 1-factor, 2-level design). To examine the internal validity of the study is to ensure that the observed relationship between the independent and dependent variables cannot be explained in terms of the other features of the study. This amounts to excluding all alternative explanations in terms of the structural or procedural characteristics of the study (viz., the design of the study, the choice of independent and control variables, and the control procedures). This kind of concern is actually the concern with the formal requirement of the inductive principle underlying the design of the study. Hence, Cook and Campbell's (1979) internal validity is characterized as inductive conclusion validity (Chow, 1987, 1992a).
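As an illustration of the related-sample t test mentioned above, the following Python sketch computes the statistic for a repeated-measures 1-factor, 2-level design using only the standard library. The function name and the scores are hypothetical and introduced purely for illustration; they are not data from Waugh and Norman (1965) or from any other study.

```python
import math
import statistics


def related_sample_t(condition_a, condition_b):
    """Return (t, df) for paired scores obtained from the same subjects."""
    if len(condition_a) != len(condition_b):
        raise ValueError("each subject needs one score per condition")
    if len(condition_a) < 2:
        raise ValueError("at least two subjects are required")
    # The test is carried out on the within-subject differences.
    differences = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(differences)
    mean_difference = statistics.mean(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    # Assumes the differences are not all identical; otherwise the
    # standard error is zero and t is undefined.
    return mean_difference / standard_error, n - 1


# Hypothetical % correct for four subjects (NOT data from any study).
fast_rate = [40, 35, 50, 45]
slow_rate = [55, 50, 60, 65]
t_value, df = related_sample_t(fast_rate, slow_rate)
```

Because every subject contributes a score at both levels of the factor, the analysis is based on difference scores rather than on two independent groups. Treating such data with an independent-samples procedure would be an instance of the wrong statistical procedure, which is the kind of issue statistical conclusion validity is concerned with.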
To Cook and Campbell (1979), an empirical study has construct validity when it is established that the observed relationship between the independent and dependent variables actually informs us about the to-be-investigated theoretical construct. This may be interpreted to mean the exclusion of alternative conceptual explanations of the data (as opposed to alternative explanations in terms of the procedural features of the study at the methodological level implicated in inductive conclusion validity). However, the term construct is often used in non-experimental validation of psychometric tests. That is, it has too heavy a psychometric overtone (Chow, 1987). Hence, Wampold, Davis and Good's (1990) hypothesis validity seems more appropriate than construct validity in the context of experimental studies. In short, the hypothesis validity of a study is the extent to which an alternative conceptual explanation of the data is unambiguously excluded. Hence, Waugh and Norman's (1965) study has hypothesis validity because the Decay theory was unambiguously excluded in favor of the Interference theory.

*An explanation is necessary for the exclusion of Cook and Campbell's (1979) external validity. Some criteria of external validity are generality, social relevance and practical usefulness, a point suggested by an anonymous reviewer of an earlier version of this paper. However, the assessment of the theory-data relationship is not affected by any of these considerations. For example, universality (or generality) refers to the extent to which a theory is applicable. The theory-data relation found in a research conclusion is not contingent on generalizing the conclusion beyond the well-defined population about which the research is conducted. That it may be nice to be able to do that is a different matter. That is, the applicability of the conclusion may be restricted to a well-defined population. Nonetheless, it must be warranted by a set of research data (Greenwood, 1991).

VALIDITY OR METHODOLATRY

It is instructive to examine why the criterion validity has received scant notice, if at all, in the relativistic critique. This can be done by considering the exchange between Barber and Silver (1968a, 1968b) and Rosenthal (1966, 1968) regarding the experimenter bias effect (viz., the view that the experimenter would, knowingly or unknowingly, induce the subjects to behave in a way consistent with the experimenter's vested interests). Barber and Silver (1968a) argued that the putative evidence for the experimenter bias effect is debatable because, in studies purportedly in its support, (a) questionable levels of significance were sometimes used (e.g., p > 0.10), (b) some investigators failed to use the appropriate statistical analyses, (c) necessary statistical analyses were not carried out in some studies, (d) inappropriate data collection procedures were found in some studies, and (e) the necessary controls (in the sense of provisions for excluding specific alternative interpretations) were absent in many studies.

Barber and Silver's (1968a, 1968b) criticisms of the evidential support for the experimenter bias effect are a good example of what is wrong with psychology from the relativistic critique perspective.* To begin with, the methodological and procedural details of concern to Barber and Silver (1968a, 1968b) are far removed from real-life experiences. Nor do these details have anything to do with the to-be-studied phenomenon (viz., experimenter bias effect). Second, psychologists effectively disregard findings which may have important practical and social implications when they reject studies for some arcane methodological reasons. Hence, psychologists' concern with methodological issues or technical details is characterized as a rhetorical device by Gergen (1991) and a fetish or "methodolatry" by Danziger (1990).
In other words, the derogatory characterizations of psychologists' attention to statistics and research design issues used in relativistic critique are reasonable if psychologists attend to these issues for merely pedantic reasons. However, theoretical discussions and the assessment of research findings are not everyday activities. Theoretical discussions necessarily require formal reasoning and abstract criteria. These intellectual activities require specialized skills and terminology to bring into focus formal and abstract considerations not important to everyday or literary discourse (see "misplaced abstractness" emphasized in italics in [Quote 2] below). For example, researchers are reminded of the influences of chance factors by considering statistical significance. Psychologists adopt a conventional level of significance (viz., α = 0.05) because they wish to have a well-defined criterion of stringency when they reject chance factors as the explanation of data (Chow, 1988, 1991). Researchers concerned with reducing ambiguity would have to pay particular attention to controls as the means to exclude alternate interpretations of data. In other words, technical terms such as statistical significance and control are used to talk about the formal requirement necessary for establishing the validity of empirical studies.
*It should be made clear that neither Danziger (1990) nor Gergen (1991) used the exchange between Barber and Silver (1968a, 1968b) and Rosenthal (1966, 1968) as an example. At the same time, neither Danziger nor Gergen actually said in explicit terms why they found it objectionable to use technical criteria to assess research results. I think that Barber and Silver's (1968a, 1968b) papers best represent what Danziger (1990) and Gergen (1991) may disapprove of in general terms.
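The role of a conventional level of significance as a well-defined criterion for rejecting chance can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: an exact sign-flip randomization test is swapped in here because it needs no distribution tables, and it is not claimed to be the procedure used in any study cited in this paper. The function names and the difference scores are hypothetical.

```python
from itertools import product

ALPHA = 0.05  # the conventional level of significance


def sign_flip_p_value(diffs):
    """Two-sided exact p value for paired difference scores.

    If chance alone were at work, each difference would be equally
    likely to be positive or negative, so every pattern of signs is
    enumerated and compared with the observed total.
    """
    observed = abs(sum(diffs))
    as_extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed:
            as_extreme += 1
    return as_extreme / total


def chance_excluded(diffs, alpha=ALPHA):
    """Apply the criterion: reject chance only when p falls below alpha."""
    return sign_flip_p_value(diffs) < alpha


# Hypothetical within-subject differences for six and for five subjects.
six_subjects = [2, 3, 1, 3, 1, 2]
five_subjects = [2, 3, 1, 3, 1]
print(sign_flip_p_value(six_subjects), chance_excluded(six_subjects))
print(sign_flip_p_value(five_subjects), chance_excluded(five_subjects))
```

In this hypothetical case the same direction of difference passes the criterion with six subjects but not with five, which shows that the criterion is a fixed standard of stringency and not a judgment made after inspecting the data.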
EMPIRICAL RESEARCH: EXPERIMENTAL VERSUS NON-EXPERIMENTAL

Some relativistic critics (e.g., Danziger, 1990) do recognize the importance of arriving at unambiguous research conclusions. It has just been shown that questions about statistics and research design are concerns about reducing ambiguity. Yet, relativistic critics dismiss technical concerns as methodolatry. It may be seen that this inconsistency is due to a misunderstanding of the concept, control, in the relativistic critique.

Danziger (1990) argues that contemporary psychologists owe their insistence on using empirical data to three origins, namely, (a) Wundt's psychophysics, (b) Binet's study of hypnosis with a comparison group and (c) Galton's anthropometric measurement. Although these pioneers of psychological research used radically different data-collection procedures, their procedures have all been characterized as experimental by Danziger (1990). Specifically, it is said,

Discussions of experimentation in psychology have often suffered from a misplaced abstractness. Not uncommonly, they have referred to something called the experimental method, as though there was only one . . . . (Danziger, 1990, p. 24, emphasis in italics added) [Quote 2]

To Danziger (1990), the prevalent characteristic of psychological research is to achieve different kinds of control, as may be seen by considering an attempt in psychophysics to establish the absolute threshold for brightness with the Method of Constant Stimuli. An appropriate range of brightness is first chosen. This is "stimulus control." Each of these brightness values is presented to an observer many times. The entire ensemble of stimuli is presented to an observer in a random order. The observer is given only two response options on every trial, namely, to respond either "Yes" (to indicate that a light is seen) or "No" (to indicate that no light is seen). This is "response control."
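The Method of Constant Stimuli described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the brightness levels and proportions are hypothetical, the threshold is taken by the usual convention as the brightness reported as seen on 50% of trials, and locating that point by linear interpolation between neighboring levels is a simplification chosen for the sketch.

```python
import random


def presentation_order(levels, repeats, seed=0):
    """Present every chosen brightness level `repeats` times in random order."""
    trials = [level for level in levels for _ in range(repeats)]
    random.Random(seed).shuffle(trials)
    return trials


def proportion_yes(trials, responses):
    """Proportion of "Yes" responses at each brightness level."""
    counts = {}
    for level, said_yes in zip(trials, responses):
        yes, n = counts.get(level, (0, 0))
        counts[level] = (yes + int(said_yes), n + 1)
    return {level: yes / n for level, (yes, n) in sorted(counts.items())}


def absolute_threshold(p_yes, criterion=0.5):
    """Interpolate the brightness at which p("Yes") reaches the criterion."""
    levels = sorted(p_yes)
    for lo, hi in zip(levels, levels[1:]):
        p_lo, p_hi = p_yes[lo], p_yes[hi]
        if p_lo <= criterion <= p_hi and p_hi > p_lo:
            return lo + (criterion - p_lo) * (hi - lo) / (p_hi - p_lo)
    return None  # the criterion is never crossed in the chosen range


# Hypothetical proportions of "Yes" at four brightness levels.
order = presentation_order([1, 2, 3, 4], repeats=5)
print(len(order), absolute_threshold({1: 0.0, 2: 0.25, 3: 0.75, 4: 1.0}))
```

Fixing the set of levels in advance and restricting the response to "Yes" or "No" is what makes the proportion at each level, and hence the threshold, well defined.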
Although stimulus and response controls were achieved in different ways in Wundt's, Binet's and Galton's approaches, "control" was, nonetheless, an integral component of their research. Hence, Danziger (1990) treats them as different ways of conducting experiments. In any case, using stimulus control or response control is objectionable to Danziger because their introduction constitutes "social control" (Danziger, 1990, p. 137) in the Skinnerian sense of shaping the subjects' behavior. At the same time, it is said that stimulus and response controls are introduced in the data collection procedure as a means "to impose a numerical structure on the data . . ." (Danziger, 1990, p. 137) in order to use statistics. An implication of this argument is that "control" would not be (or need not be) a necessary component of research had psychologists not been so obsessed with quantification and statistics.

However, two rejoinders to this interpretation of "control" may be offered. First, imagine simply asking an observer in an absolute-threshold task to watch the screen and give a response within a predetermined period of time (e.g., 2 seconds). This is an ambiguous instruction in an ill-defined situation. It is reasonable to assume that the observer would not know what to do, especially at the beginning. A likely outcome is either (a) that the observer responds in a way unrelated to the purpose of the study, or (b) that the observer asks for explicit instruction as to what to do. This is not unlike the following situation.
Suppose that a biology teacher instructs a novice student to report what is seen under a microscope. The novice student may report things which have nothing to do with the class work. Another likely outcome is that the student asks for specific instruction as to what to look for. That is, people ask for information so as to reduce ambiguity when placed in an uncertain situation. In other words, it is debatable to say that giving an observer explicit instructions as to what to do is an attempt to shape the observer's behavior. That is, it is debatable to characterize giving unambiguous instruction to experimental subjects as "response control" in the sense of shaping the subjects' behavior. Nor is following specific instructions by experimental subjects a contrived way of behaving which happens only in the laboratory. For example, drivers do stop at red lights even when they are in a hurry. That is, people do follow rules, conventions or instructions in everyday life.

The second rejoinder to Danziger's (1990) interpretation of control is based on three technical meanings of control identified by Boring (1954, 1969), none of which refers to shaping the behavior of research participants. Control is achieved in research when the following conditions are met:

(1) There is a valid baseline for making a comparison. This function is served in Waugh and Norman's (1965) experiment by the condition represented by Row (i) in Table 2. Performance in Row (ii) through Row (vi) is compared to that in Row (i). Moreover, the conditions depicted in Rows (i) through (vi) are identical in all aspects but one (viz., the specific level of the Delay by Rate combination). Hence, an experiment may be defined as an empirical study in which data are collected in two, or more, conditions which are identical in all aspects but one (viz., the specific level of the independent variable used).
This feature, however, was absent in psychophysical or anthropometric measurements where data are collected in one condition only. Nor is it present in non-experimental research (e.g., a correlation study).

(2) Constancy of condition is achieved by (a) holding the control variables at the same level* in all data-collection conditions (e.g., the control variables, Variables 3, 4 and 5 in Table 2, found in Waugh & Norman's, 1965, experiment), and (b) using the predetermined levels of the manipulated variables as prescribed in the design of the study (viz., Danziger's, 1990, "stimulus control"; e.g., A₁X₁ through A₃X₂ in Table 2).

(3) There are provisions for excluding potential confounding variables. An example is the use of the repeated-measures design in Waugh and Norman's (1965) experiment. Whatever the extraneous variables, Variables 6 through N, might be (including social or personality factors), it seems reasonable to assume that each of them was represented at the same level at all six treatment combinations when the same subject was tested in all of them.

*It may be helpful to reiterate the distinction between a variable (e.g., color) and its levels (viz., red, blue, green, etc.). The relation between a variable and its levels is like that between a class and its members. If the hue blue is found in every data collection condition, the variable color is said to be held at the same (or constant) level blue.
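The requirement that data-collection conditions be identical in all aspects but one can be expressed as a simple check. The Python sketch below is only an illustration: the variable names (delay_by_rate, probe_tone, list_length) and their levels are hypothetical stand-ins and do not reproduce Table 2 or Waugh and Norman's (1965) actual design.

```python
def varying_variables(conditions):
    """Names of the variables whose level differs across conditions."""
    names = set().union(*(condition.keys() for condition in conditions))
    return {
        name
        for name in names
        if len({condition.get(name) for condition in conditions}) > 1
    }


def unexcluded_confounds(conditions, manipulated):
    """Variables that vary across conditions without being manipulated.

    An empty result means that every control variable is held at a
    constant level, so only the manipulated variable distinguishes
    the conditions.
    """
    return varying_variables(conditions) - set(manipulated)


# Hypothetical design with six treatment combinations.
controlled = [
    {"delay_by_rate": k, "probe_tone": "high", "list_length": 16}
    for k in range(1, 7)
]
# The same design with the probe tone left free to vary.
confounded = [
    {"delay_by_rate": k, "probe_tone": "high" if k % 2 else "low", "list_length": 16}
    for k in range(1, 7)
]
print(unexcluded_confounds(controlled, {"delay_by_rate"}))
print(unexcluded_confounds(confounded, {"delay_by_rate"}))
```

A check of this kind covers only variables that are explicitly recorded; extraneous variables that are not recorded have to be handled by design provisions such as the repeated-measures arrangement mentioned in (3).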
It may readily be seen that the purpose of instituting stimulus control is to ensure that the pre-determined levels of a manipulated variable are used as prescribed. Contrary to Danziger's (1990) characterization, stimulus control is not used as a form of social control because it is not instituted in the experiment for the purpose of shaping the subject's behavior. It is used to satisfy the formal requirement of the inductive method underlying the experimental design. Moreover, by itself, stimulus control is not the whole meaning of control. Hence, not all three components of control are found in a typical psychophysical study. This illustrates the point that not all empirical methods satisfy the three meanings of control. Specifically, neither Wundt's nor Galton's approach satisfied all three criteria.

As a matter of fact, empirical research methods are divided into experimental, quasi-experimental and non-experimental methods. Specifically, (a) an experiment is one in which all recognized controls are present, (b) a quasi-experiment is one in which at least one recognized control is absent due to practical or logistical constraints (Campbell & Stanley, 1966; Cook & Campbell, 1979; Binet's approach, strictly speaking, fell into this category), and (c) a non-experimental research study is one in which no attempt is made to ensure the presence of control (e.g., research data collected by conducting an interview or by naturalistic observations).

The importance of control may be seen by supposing that Variable 3 in Table 2 (viz., B or using a high frequency tone as the probe tone) were absent in Waugh and Norman's (1965) experiment. What is the implication if tones of different frequencies were used for the six Delay and Rate combinations?
Under such circumstances, the variation in the subjects' performance (viz., Y₁ through Y₆ in Table 2) under the six treatment combinations might be accounted for by the difference in frequency among the probe-tones. The data would be ambiguous as to the verity of the Interference theory under such circumstances. However, as the frequency of the probe tone was held constant in Waugh and Norman's (1965) experiment, it can unambiguously be excluded as the explanation of the data (Cohen & Nagel, 1934). This "exclusion" function is the utility of the three kinds of control which must be present in a properly conducted experiment. This becomes important when the SPOPE factors are discussed in subsequent sections.

In sum, ambiguities in data interpretation are reduced by excluding alternate explanatory accounts. At the same time, the purpose of instituting controls in empirical research is not to impose a numerical structure on data, nor to shape the subjects' behavior. It is to exclude alternate interpretations of data. In other words, to provide controls is to attempt to reduce ambiguity. As controls may be achieved to various degrees of success, some empirical studies (viz., experimental studies) give less ambiguous results than other empirical studies (e.g., quasi-experimental or non-experimental).

It is unfortunate for meta-theoretical or methodological discussion that the role of control in research is misrepresented in the relativistic critique account. This misrepresentation may be responsible for the failure in the relativistic critique view to distinguish between experimental and non-experimental research. This failure, in turn, may be responsible for the aforementioned inconsistency between (a) the
recognition of the need to reduce ambiguity, and (b) the relativistic critics' disdain for technical rigor.

SOCIAL PSYCHOLOGY OF THE PSYCHOLOGICAL EXPERIMENT (SPOPE)

There is a set of factors collectively known as social psychology of the psychological experiment (or SPOPE for short) (Orne, 1962; Rosenthal, 1963) which putatively renders experimental data inevitably invalid. This set of social factors consists of experimenter effects (Rosenthal, 1966), subject effects (Rosenthal & Rosnow, 1969, 1975) and demand characteristics (Orne, 1962, 1969, 1973). The essence of the SPOPE critique of experimental data is that the outcome of an experiment is inevitably influenced by who conducts the experiment with whom as subjects. These influences are the consequence of, among other things, (a) the experimenter's personal characteristics and expectation regarding the experimental result, (b) experimental subjects' individual characteristics, and (c) the perceived roles entertained by the experimenter and subjects, the subjects' desire to ingratiate themselves with the experimenter, and rumors about the experiment. In short, the SPOPE argument may be the psychological reason why the relativistic critique argument is correct.

Danziger (1990) takes the SPOPE factors for granted. However, it may be suggested that there are no valid empirical data in support of the SPOPE arguments (Chow, 1987, 1992a, 1994). A thesis of the demand characteristics charge is that subjects are willing to undertake any task, however meaningless or futile, in the context of experimental research. For example, thirsty subjects persevered in eating salty crackers upon request, or an individual would patiently cross out a particular letter from rows and rows of random letter strings (Orne, 1962, 1969, 1973). This is the extent of the evidential support for the demand characteristics claim. However, asking people to carry out a meaningless or futile task is not conducting an experiment.
Berkowitz and Donnerstein (1982) are correct in saying that there is no evidential support for the demand characteristics claim.

The fundamental problem with the subject effects claim is the failure in the SPOPE argument to distinguish between group differences and the effect of group differences on experimental result. For the subject effects claim to be true, it is necessary to demonstrate the latter. Demonstrating the former is insufficient. This difficulty may be illustrated with Goldstein, Rosnow, Goodstadt and Suls' (1972) verbal-conditioning experiment. They reinforced their subjects with a verbal reinforcer (e.g., "good") whenever the subjects emitted the first-person pronoun "I" or "We." There were two groups of subjects, namely, volunteers and non-volunteers. Half of each group was knowledgeable of the principle of operant conditioning; the other half of each group was not knowledgeable. All subjects were tested in four 20-trial blocks. The dependent variable was the difference between the number of first-person pronouns emitted in Block 4 and Block 1. This may be interpreted as a measure of how readily a subject can be conditioned. That is to say, (X) in Table 3 represents the difference between Block 4 and Block 1 for volunteers, and (Y) represents the same difference for non-volunteers. (D_V) represents the difference between the experimental and control conditions for the volunteers, whereas (D_NV) is the difference between the experimental and control conditions for the non-volunteers.

Table 3. A distinction between group difference and the effect of group difference on the result of Goldstein et al.'s (1972) verbal-conditioning experiment

                  E       C       Mean    Difference
Volunteers        X_E     X_C     (X)     (D_V)
Non-volunteers    Y_E     Y_C     (Y)     (D_NV)

E = Experimental condition (knowledgeable of verbal conditioning)
C = Control condition (not knowledgeable of verbal conditioning)
(X) = Mean of X_E and X_C
(Y) = Mean of Y_E and Y_C
(D_V) = X_E - X_C
(D_NV) = Y_E - Y_C

Goldstein et al. (1972) found that (X) was significantly larger than (Y). They concluded that willing subjects were "good" subjects in the sense that these subjects were more inclined to fulfill the demand of the experimental task (see Rosenthal & Rosnow, 1975, pp. 157-165). The subject effects claim is thereby established in the SPOPE view.

As (X) and (Y) represent the rate of conditioning of the volunteers and non-volunteers, respectively, the difference between (X) and (Y) means that there is a difference between the two groups. This finding merely reinforces the knowledge that the two groups of subjects were different to begin with (viz., their being volunteers or non-volunteers). It is important not to lose sight of the fact that the knowledgeability of verbal conditioning was included in the study. This manipulation means that the study was one about the differential consequences of the knowledgeability of verbal conditioning for the two groups. To demonstrate the effect of the group difference on the effect of knowledgeability on operant conditioning, it is necessary to show that (D_V) is larger than (D_NV). However, there was no difference between (D_V) and (D_NV).

The issue is not that there are group or individual differences. What critics of experimental psychology have not shown is how group or individual differences actually affect experimental results. Goldstein et al.'s (1972) experiment is typical of the genre of experiments used to support the subject effects claim. As has been seen, what is demonstrated is group difference when the required demonstration is an effect of group difference on experimental data. In other words, data from studies of this genre (however numerous) are insufficient as evidence for the subject effects claim.
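The distinction between a group difference and the effect of a group difference on the experimental result can be worked through with the quantities defined for Table 3. The Python sketch below uses hypothetical cell means chosen only to mirror the pattern described above (a difference between the groups, but equal experimental-control differences); they are not Goldstein et al.'s (1972) data, and the function name is likewise an assumption.

```python
def table_3_summary(x_e, x_c, y_e, y_c):
    """Compute the Table 3 quantities from the four cell means.

    x_e, x_c: volunteers in the experimental and control conditions
    y_e, y_c: non-volunteers in the experimental and control conditions
    """
    mean_x = (x_e + x_c) / 2   # (X)
    mean_y = (y_e + y_c) / 2   # (Y)
    d_v = x_e - x_c            # (D_V)
    d_nv = y_e - y_c           # (D_NV)
    return {
        "X": mean_x,
        "Y": mean_y,
        "D_V": d_v,
        "D_NV": d_nv,
        # Group difference: the two groups differ in overall conditioning.
        "group_difference": mean_x - mean_y,
        # Effect of the group difference on the experimental result:
        # the experimental-control difference depends on the group.
        "effect_of_group_difference": d_v - d_nv,
    }


# Hypothetical cell means (Block 4 minus Block 1 pronoun counts).
summary = table_3_summary(x_e=8, x_c=6, y_e=5, y_c=3)
print(summary["group_difference"], summary["effect_of_group_difference"])
```

With these hypothetical values the first quantity is non-zero while the second is zero, which is the pattern that supports a group difference without supporting the subject effects claim. In practice each comparison would also need its own significance test.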
The difficulty with the experimenter effects claim may be shown with reference to Table 4. In the top panel is depicted the design of Rosenthal and Fode's (1963) photo-rating study. "A" represents Rosenthal and Fode; "V" and "Y" represent two different groups of data-collectors who presented photographs to their respective subjects. The task of the subjects was to rate how successful, or unsuccessful, the person in the photograph was. "V" and "Y" were given different "theoretical orientations." Specifically, data-collectors in "V" were told to expect a mean rating of +5, whereas those in "Y" were informed to expect a mean rating of -5. This differential instruction provides
effectively different data-collection conditions for the "V" and "Y" groups. Hence, from the perspective of A (the investigator), the study had the formal structure of an experiment. M₁ was larger than M₂. Rosenthal and Fode (1963) concluded that the data were consistent with the experimenter expectancy effects.

The problem is that M₁ and M₂ are not experimental data to "V" or "Y". The necessary condition for the experimenter expectancy effects claim is that each of the individuals in Group "V" and "Y" must be given an experiment to conduct. However, the individuals in Group "V" and "Y" collected data under one condition only. In other words, M₁ and M₂ are only measurement data, not experimental data. In sum, the design depicted in the upper panel of Table 4 is insufficient for investigating the experimenter expectancy effects.

A reason why this inadequacy is not recognized is the failure to distinguish between meta-experiment (viz., an experiment about experimentation) and experiment, as may be seen in the lower panel of Table 4. Investigator B has two groups of experimenters. Regardless of the theoretical orientation, each individual in "W" or "Z" collects data in two conditions, namely, the experimental and control conditions. Consequently, a difference between the two conditions is available from every individual in "W" or "Z." From the perspective of "W" or "Z," the study is an experiment because data are collected in two conditions which are identical in all aspects but one. The study is an experiment about the experiment conducted by "W" or "Z" from the perspective of the investigator, B. For this reason, it is a "meta-experiment" to the investigator. Using Rosenthal and Fode's (1963) photo-rating task, Chow (1994) reported data from a meta-experiment. There was no difference between the two sets of experimental data, d₁ and d₂.
The inescapable conclusions are that (i) the experimenter expectancy effects claim is based on data which are inadequate as the evidential data, and (ii) data from a properly designed meta-experiment do not support the experimenter expectancy effects claim.

Table 4. The distinction between measurement and experiment (upper panel) and between meta-experiment and experiment (lower panel)

Upper panel:
Investigator                A
Data-collector              V             Y
Theoretical orientation     T₁            T₂
Measurement condition       MC₁           MC₂
Result                      M₁            M₂

Lower panel:
Investigator                B
Experimenter                W             Z
Theoretical orientation     T₁            T₂
Test conditions*            E    C        E    C
Experimental result†        E - C = d₁    E - C = d₂

*E and C stand for the experimental and control conditions, respectively.
†E - C stands for the difference between the means of the experimental and control groups.
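The lower panel of Table 4 can be sketched in Python to show what a meta-experiment yields that a measurement design does not. The sketch is hypothetical throughout: the ratings are invented for illustration and are not Chow's (1994) data, and the function names are assumptions. Each experimenter supplies an E - C difference, and the investigator compares the mean difference under the two theoretical orientations.

```python
from statistics import mean


def experimental_result(e_scores, c_scores):
    """E - C for one experimenter: an experimental datum in its own right."""
    return mean(e_scores) - mean(c_scores)


def meta_experiment(groups):
    """Mean E - C difference for each group of experimenters.

    `groups` maps a theoretical orientation to a list of
    (experimental scores, control scores) pairs, one per experimenter.
    """
    return {
        orientation: mean(experimental_result(e, c) for e, c in experiments)
        for orientation, experiments in groups.items()
    }


# Hypothetical ratings collected by two experimenters in each group.
groups = {
    "W": [([3, 5], [1, 3]), ([4, 4], [2, 2])],   # orientation T1
    "Z": [([2, 4], [1, 1]), ([5, 5], [2, 4])],   # orientation T2
}
result = meta_experiment(groups)
print(result)
```

In the upper-panel design each data-collector would contribute a single mean and no E - C difference, so there would be nothing to pass to experimental_result; this is the sense in which such data are measurement data only.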
In sum, while the SPOPE argument may be reasonable as a critique of non-experimental data, it is not applicable to experimental data. Specifically, controls are means with which experimenters can exclude alternate interpretations of data. By the same token, even if social conventions, cultural values and professional constraints may influence non-experimental data, appropriate controls can be (and are routinely) used to rule out these non-intellectual influences on experimental data. This is another reason why being attentive to technical details is not servitude to methodology.

SUMMARY AND CONCLUSIONS

The relativistic critique argument begins with the correct view that all data are theory-dependent. However, it is debatable to say that all assumptions underlying empirical research (particularly experimental research) have to be empirically justified simply because many assumptions are non-empirical in nature (e.g., rules in deductive logic, inductive principles, assumptions underlying statistical procedures, etc.).

Relativistic critics fault cognitive psychologists who practice Popper's (1968a/1959, 1968b/1962) conjectures and refutations approach for paying too much attention to problems of justification at the expense of questions about discovery. The question becomes the discovery of what. Psychologists do not invent phenomena to study. Instead, they speculate about the causes or reasons underlying varying mental and experiential phenomena found in everyday life. At this level, no contemporary experimental psychologist would disagree with relativistic critics that psychological theories are psychologists' active constructions rather than passive reading of what is revealed to the psychologists. In exploring the hypothetical mechanisms capable of explaining observable phenomena, experimental psychologists do engage in the discovery process.
Many cognitive psychologists take to heart Popper's (1968a/1959, 1968b/1962) view that neither the origin of, nor the manner of acquiring, these constructivistic speculations really matters. Relativistic critics, on the other hand, find it necessary to make explicit, with adduction, circumstantial factors which may affect psychologists (Danziger, 1990). It is legitimate for relativistic critics to emphasize the meaning of psychological phenomena in the experiential sense. However, this is not the theory-data relation issue raised in the relativistic critiques of experimental data. Moreover, that relativistic critics may have good reasons to adopt their meta-theoretical preference should not be construed as an indictment against experimental psychologists' sensitivity towards methodological issues. Nor should the relativistic meta-theoretical preference be used to argue against experimental psychologists' practice of using technical terminology when they assess, or talk about, research results.

Contrary to the relativistic contention, issues of justification are important because (a) theories are intellectual constructions, (b) it is often possible to have multiple accounts for the same phenomenon, (c) the multiple explanatory accounts may be mutually compatible, and (d) it is necessary to reduce ambiguity by excluding theoretical alternatives which are not warranted by data.
Many of the objections to empirical data raised in relativistic critique can readily be answered. Specifically, the ecological validity criterion loses its apparent attractiveness if it is realized that research is conducted to assess theories about phenomena, not the phenomena themselves. Accepting the legitimacy of data collected in artificial settings, experimental psychologists find it easier to honor the three technical meanings of control. To institute control in empirical research is to attempt to reduce ambiguity.

Experimental psychologists pay attention to methodological details not for any pedantic reason. Nor does this mean that they are oblivious of other non-intellectual issues. It simply means that questions about validity are the only relevant issues when the discussion is about the relation between a theoretical assertion and its evidential support. Psychologists are meticulous about methodological details because they have to reduce ambiguity.

As recounted by Copi (1965), Euclid made a mistake in his first attempt to establish a geometry proof because Euclid allowed the content of his argument to interfere with his reasoning when he should have been guided by the form of his reasoning. That is to say, there is a time for content, as well as a time for form. By the same token, an exclusive concern for (a) the content of an assertion, (b) its practical utility, social implications or political correctness, or (c) the validity of the theory-data relationship may be called for on different occasions. This is not to say that psychologists are oblivious to pragmatic issues and/or utilitarian consequences of an assertion when they criticize a study on technical grounds.
At the same time, to suggest that paying attention to technical details is a fetish or mere scientific rhetoric is to suggest that it is not necessary to consider the relation between a knowledge-claim and its evidential support.

REFERENCES

Baddeley, A. (1990). Human memory: Theory and practice. Needham Heights, MA: Allyn and Bacon.
Barber, T. X., & Silver, M. J. (1968a). Fact, fiction, and the experimenter bias effect. Psychological Bulletin Monograph Supplement, 70 (6, Part 2), 1-29.
Barber, T. X., & Silver, M. J. (1968b). Pitfalls in data analysis and interpretation: A reply to Rosenthal. Psychological Bulletin Monograph Supplement, 70 (6, Part 2), 48-62.
Berkowitz, L., & Donnerstein, E. (1982). External validity is more than skin deep: Some answers to criticisms of laboratory experiments. American Psychologist, 37, 245-257.
Boring, E. G. (1954). The nature and history of experimental control. American Journal of Psychology, 67, 573-589.
Boring, E. G. (1969). Perspective: Artifact and control. In R. Rosenthal & R. L. Rosnow (Eds.), Artifact in behavioral research (pp. 1-11). New York: Academic Press.
Campbell, D. T., & Stanley, J. L. (1966). Experimental and quasi-experimental designs for research. Chicago: Rand McNally.
Chomsky, N. (1959). Review of Skinner's "Verbal Behavior." Language, 35, 26-58.
Chow, S. L. (1987). Experimental psychology: Rationale, procedures and issues. Calgary: Detselig.
Chow, S. L. (1988). Significance test or effect size? Psychological Bulletin, 103, 105-110.
Chow, S. L. (1991). Conceptual rigor versus practical impact. Theory & Psychology, 1, 337-360.
Chow, S. L. (1992a). Research methods in psychology: A primer. Calgary: Detselig.
Chow, S. L. (1992b). Positivism and cognitive psychology: A second look. In C. W. Tolman (Ed.), Positivism in psychology: Historical and contemporary problems (pp. 119-144). New York: Springer-Verlag.
Chow, S. L. (1994). The experimenter's expectancy effect: A meta-experiment. Zeitschrift für Pädagogische Psychologie/German Journal of Educational Psychology, 8 (2), 89-97.
Cohen, M. R., & Nagel, E. (1934). An introduction to logic and scientific method. London: Routledge & Kegan Paul.
Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design & analysis issues for field settings. Chicago: Rand McNally.
Copi, I. M. (1965). Symbolic logic (2nd ed.). New York: Macmillan.
Copi, I. M. (1982). Symbolic logic (6th ed.). New York: Macmillan.
Danziger, K. (1990). Constructing the subject: Historical origins of psychological research. Cambridge: Cambridge University Press.
Garner, W. R., Hake, H. W., & Eriksen, C. W. (1956). Operationalism and the concept of perception. Psychological Bulletin, 87, 564-567.
Gergen, K. J. (1988). The concept of progress in psychological theory. In W. J. Baker, L. P. Mos, H. V. Rappard, & H. J. Stam (Eds.), Recent trends in theoretical psychology (pp. 1-14). New York: Springer-Verlag.
Gergen, K. J. (1991). Emerging challenges for theory and psychology. Theory & Psychology, 1, 13-35.
Goldstein, J. J., Rosnow, R. L., Goodstadt, B., & Suls, J. M. (1972). The "good subject" in verbal operant conditioning research. Journal of Experimental Research in Personality, 6, 29-33.
Greenwood, J. D. (1991). Relations & representations. London: Routledge.
Manicas, P. T., & Secord, P. F. (1983). Implications for psychology of the new philosophy of science. American Psychologist, 38, 103-115.
Mill, J. S. (1973). A system of logic: Ratiocinative and inductive. Toronto: University of Toronto Press.
Mook, D. G. (1983). In defense of external invalidity. American Psychologist, 38, 379-387.
Neisser, U. (1976). Cognition and reality. San Francisco: W. H. Freeman.
Neisser, U. (1988). New vistas in the study of memory. In U. Neisser & E. Winograd (Eds.), Remembering reconsidered: Ecological approaches to the study of memory (pp. 1-10). Cambridge: Cambridge University Press.
Orne, M. T. (1962). On the social psychology of the psychological experiment: With particular reference to demand characteristics and their implications. American Psychologist, 17, 776-783.
Orne, M. T. (1969). Demand characteristics and the concept of quasi-controls. In R. Rosenthal & R. L. Rosnow (Eds.), Artifact in behavioral research (pp. 143-179). New York: Academic Press.
Orne, M. T. (1973). Communication by the total experimental situation: Why it is important, how it is evaluated, and its significance for the ecological validity of findings. In P. Pliner, L. Krames, & T. Alloway (Eds.), Communication and affect (pp. 157-191). New York: Academic Press.
Popper, K. R. (1968a). The logic of scientific discovery (2nd ed.). New York: Harper & Row. (Original work published 1959)
Popper, K. R. (1968b). Conjectures and refutations. New York: Harper & Row. (Original work published 1962)
Rosenthal, R. (1963). On the social psychology of the psychological experiment: The experimenter's hypothesis as unintended determinant of experimental results. American Scientist, 51, 268-283.
Rosenthal, R. (1966). Experimenter effects in behavioral research. New York: Appleton-Century-Crofts.
Rosenthal, R. (1968). Experimenter expectancy and the reassuring nature of the null hypothesis decision procedure. Psychological Bulletin Monograph Supplement, 70 (6, Part 2), 48-62.
Rosenthal, R., & Fode, K. L. (1963). Three experiments in experimenter bias. Psychological Reports, 12, 491-511.
Rosenthal, R., & Rosnow, R. L. (1969). The volunteer subject. In R. Rosenthal & R. L. Rosnow (Eds.), Artifact in behavioral research (pp. 59-118). New York: Academic Press.
Rosenthal, R., & Rosnow, R. L. (1975). The volunteer subject. New York: John Wiley.
Skinner, B. F. (1938). The behavior of organisms: An experimental analysis. New York: Appleton-Century.
Wampold, B. E., Davis, B., & Good, R. H., II. (1990). Methodological contributions to clinical research: Hypothesis validity of clinical research. Journal of Consulting and Clinical Psychology, 58, 360-367.
Waugh, N. C., & Norman, D. A. (1965). Primary memory. Psychological Review, 72, 89-104.