JOURNAL OF EXPERIMENTAL SOCIAL PSYCHOLOGY 16, 479-496 (1980)

Some Implications of Temporal Drift in Social Parameters

DENNIS H. NAGAO AND JAMES H. DAVIS
University of Illinois

Received October 2, 1979

Social and political factors often affect the way we perceive and think about particular events. While it is generally acknowledged that these factors are likely to change over time, the implications of such temporal changes for empirical research are not often considered. In this paper, some of these implications are illustrated by examining data collected from a sequence of studies conducted over a 3-year period (1973-1976). Subjects viewed the same videotaped mock (rape) trial and, prior to experimental manipulations, gave their own personal verdict. An examination of the proportion of mock jurors preferring guilty in each of the studies revealed a drift of 16% (.53 to .69) from 1973 to 1976. This trend toward harsher judgments was observed for both sexes, with females uniformly more likely than males to favor conviction. Ancillary evidence suggests that the upward drift reflected changes in the perception of rape during that period. The implications of temporal drift in parameter values for empirical research were illustrated by simulating jury size effects using social decision scheme theory (J. H. Davis, Psychological Review, 1973, 80, 97-125). The results indicate that the magnitude of theoretical differences due to jury size must have decreased as the probability of an individual guilty vote increased over time. Thus, failures to replicate findings may not be due entirely to sampling error, methodological imprecision, or the like, but to temporal changes in the social phenomenon under investigation.
Changes in social phenomena over time are often quite subtle, and thus are difficult to evaluate. Programmatic research, for all of its virtues, seems as susceptible as one-shot investigations to unrecognized temporal drift in social parameters. However, a sequence of studies over time also provides a means of exploring research implications of time-dependent changes, albeit sometimes as an unplanned inquiry. The purpose of this paper is to derive and evaluate some implications of apparent social change in an ostensibly constant population of subjects. Although the data which are central to this paper come from a sequence of mock jury studies, we wish to emphasize that the points to be raised are applicable to any empirical investigation of social phenomena, of which mock jury studies are only one example.

A number of investigations (Davis, Holt, Spitzer, & Stasser, 1980; Davis, Kerr, Atkin, Holt, & Meek, 1975; Davis, Kerr, Stasser, Meek, & Holt, 1977; Davis, Spitzer, Nagao, & Stasser, 1978; Davis, Stasser, Spitzer, & Holt, 1976; Kerr, Atkin, Stasser, Meek, Holt, & Davis, 1976; Davis, Nagao, Spitzer, & Stasser, Note 1) have been carried out under highly similar conditions during the past several years (see the summary by Davis, 1980). All featured the same prerecorded mock trial, general procedure (except for the experimental manipulations of interest), subject population, and immediate surrounds. In brief, randomly selected undergraduates responded to a few short general items about crime and punishment upon arrival at the laboratory, watched a mock trial in which the defendant was accused of rape, gave their private opinions about the defendant’s guilt, deliberated, rendered a verdict unless they could not reach a consensus, and privately answered several questions about the preceding events, including their final personal opinion about the defendant’s guilt. We will focus only on aspects of the studies that were constant throughout the sequence. Other aspects were the site of experimental manipulations, and these treatments, of course, varied from study to study.

This research was supported by National Science Foundation Grant BNS 77-15216 to the second author. We are grateful for comments on a preliminary draft from Blair Sheppard, David Vollrath, Verlin Hinsz, Thomas Srull, William Lapworth, and Toni Tumonis. Requests for reprints should be sent to Dennis H. Nagao, Department of Psychology, University of Illinois, Champaign, IL 61820.

Copyright © 1980 by Academic Press, Inc. All rights of reproduction in any form reserved.

MOCK JUROR PREFERENCES
The statistic of primary interest is the proportion, p̂G, of individual jurors favoring a guilty verdict prior to jury deliberation. Since they were obligated to favor acquittal if not convinced of guilt, 1 - p̂G = p̂NG, the proportion of jurors inclined toward not-guilty verdicts. The very large samples provide some confidence in the estimates of the parameter, pG, the probability of a subject favoring guilty given the trial as presented.¹ The value of p̂G from each study is given in Fig. 1, along with the date of data collection. Note that the first study is offset in distinctive fashion to reflect the fact that the content of the trial was slightly altered following that investigation in order to contain somewhat more incriminating evidence, but remained constant thereafter.²

¹ The sizes of the samples from which p̂G was calculated were 642, 544, 66, 864, 654, 821, and 870 for Studies 1 through 7, respectively.

² The experimental manipulations for two of the studies occurred before subjects indicated their personal verdict preferences. For one of these (Kerr et al., 1976), only the data from a control condition, which did not differ from previous studies in procedure, were retained for analysis here. In the other (Davis et al., 1977b), mock jurors, prior to viewing the trial on videotape, read purported newspaper articles which attempted to manipulate their perceptions of the consequences the victim suffered and the severity of the punishment the defendant might receive if convicted. Analysis of the data revealed that the manipulations did not significantly affect individuals’ initial verdict preferences. Hence, these data were retained.

FIG. 1. Proportion of guilty verdicts favored by predeliberation jurors over successive semesters, 1973-1976. [Horizontal axis: mock jury studies (year): 1 (Spring, 1973), 2 (Fall, 1973), 3 (Spring, 1974), 4 (Fall, 1974), 5 (Spring, 1975), 6 (Fall, 1975), 7 (Spring, 1976).]

The trend clearly seems to suggest increasingly harsher judgments about the defendant’s guilt in response to the same trial. There appears to have been a change in p̂G of about 16% in the relatively short period (3 years) between Studies 2 and 7, the interval with which we will be concerned hereafter. Since estimates from such large samples are quite stable, we must entertain the possibility that pG is indeed drifting upward with time. The immediate conditions of research, within which estimates of p̂G were obtained, had been carefully held constant, at least up to the state of the art. Experimenters differed over the years, but their presence was minimized; constant instructions were delivered by intercom; questionnaire administration, response recording, etc., were automated or prearranged in booklets. In other words, the general procedures and stimulus materials in the section of the experiments where the estimates were obtained were uncommonly consistent over time. We suspect that the more parsimonious explanation is that our subjects differed over time.
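The remark that estimates from samples of this size are quite stable can be made concrete with a rough check. The sketch below is only illustrative: it assumes that the rounded proportions .53 and .69 belong to Studies 2 and 7 (sample sizes 544 and 870, from the footnote on sample sizes), that the two samples are independent, and that an unpooled normal approximation is adequate. The function names are hypothetical and are not part of the original analysis.

```python
import math


def proportion_se(p_hat: float, n: int) -> float:
    """Standard error of a sample proportion: sqrt(p(1 - p) / n)."""
    return math.sqrt(p_hat * (1.0 - p_hat) / n)


def two_proportion_z(p1: float, n1: int, p2: float, n2: int) -> float:
    """z statistic for p2 - p1, treating the samples as independent (unpooled SE)."""
    se_diff = math.sqrt(proportion_se(p1, n1) ** 2 + proportion_se(p2, n2) ** 2)
    return (p2 - p1) / se_diff


# Assumed pairing: .53 with Study 2 (n = 544) and .69 with Study 7 (n = 870).
se_study2 = proportion_se(0.53, 544)
se_study7 = proportion_se(0.69, 870)
z_drift = two_proportion_z(0.53, 544, 0.69, 870)

print(f"SE, Study 2: {se_study2:.3f}")
print(f"SE, Study 7: {se_study7:.3f}")
print(f"z for the drift: {z_drift:.1f}")
```

Under these assumptions each estimate has a standard error of roughly two percentage points or less, so a 16-point change is large relative to sampling variability alone.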
Although we cannot rule out the possibilities that the University somehow attracted different students, that the introductory general and social psychology courses enrolled different types of students, and so on, a straightforward possibility is that the general social factors that determine pG (norms, opinions, etc.) themselves changed over the period in question. In other words, we suspect that our subjects brought different social cognitions with them over the period in which the mock trials took place. Unfortunately, we lack the data to address this question decisively; none of the studies was originally designed to assess social changes. In any event, the important point of our discussion is not the discovery of the causal agent, but rather the exploration of some consequences of time-dependent parameter changes, whatever the reason for the trend.

Another notable feature of Fig. 1 is that females are uniformly more likely than males to favor conviction, and the difference is quite large. The importance of the type of crime, rape, to such a difference should not be overlooked, and serves to remind one of the probable critical relationship between particular crime, pretrial opinion distribution, and juror response.

IMPLICATIONS FOR JURY VERDICTS
Although we may intuitively appreciate change in opinions and cognitive dispositions, the trend displayed in Fig. 1 may seem small, especially in view of typical research practices oriented toward small samples and evaluation of null hypotheses. Thus, we explore below some implications of these data for jury verdicts, implications not altogether obvious from juror reactions alone. The trend of jury verdicts, of course, may not be the same as inclinations of individual jurors.

The proportion of guilty verdicts, P̂G, observed for a sample of r-person mock juries is the statistic of major interest, and depends in important ways on juror preferences, as indexed by p̂G, which was graphed in Fig. 1. However, experimental manipulations also influenced the value of P̂G from study to study; any changes in P̂G over the comparable period would be confounded with independent variables important to the several studies. That is to say, while the guilt preferences of our mock jurors (which yielded p̂G as described earlier) were elicited under quite constant conditions from study to study, the group verdicts (P̂G) were not. The problem is not uncommon in even programmatic experimental research, but also has its counterpart in archival data. For example, it is very rare to obtain surveys from court records that are not also ambiguous in the same sense: even within a given category of cases (e.g., rape), there is great variation due to participants (judges, attorneys, witnesses, and the like), courtroom procedures, and different criminal law in various states.

However, we can explore some implications for jury verdicts by resorting to conceptual or thought experiments (e.g., Davis, 1980). Not only can thought experiments aid policy decisions in the absence of empirical data, but they also offer a means of understanding sometimes conflicting results by examining theoretical outcomes under a variety of assumptions about parameter values, especially extremes. For example, both jury size and the social decision rule for aggregating votes or verdict preferences would appear to exert fairly simple but important influences. Unfortunately, the numerous studies of jury size seem to have produced conflicting results in the opinion of many researchers; the relatively fewer studies of assigned social decision rule have excited less interest, but as much puzzlement.

Part of the continuing puzzlement over the ostensibly simple issues of size and decision rule is due to the unappealing complexity of the associated theoretical arguments. But part is also due to the overwhelming preference for empirical data. Unfortunately, it is difficult to imagine that social researchers will ever be able to attack each nuance of a social policy question or application with a matching empirical study. We shall argue that while data are indispensable for the usual, and obvious, reasons, theory is also essential in order to extrapolate from existing data sets and interpolate between available data points.

One hardly needs a complicated model to realize that the more jurors who prefer a guilty verdict, the more likely a guilty verdict is to occur. Unfortunately for simplicity, the difference in the probability of a guilty verdict, PG, between groups of different size is not linear, and not even monotonic, with changes in input preferences, pG. Both size and decision rule obviously subsume important social interaction processes.
Moreover, the effects of both are dependent upon the input members bring with them to deliberation, summarized here in the probability distribution (pG, pNG) that a member prefers a guilty or not-guilty verdict. We would now like to consider some implications of changes actually observed over time in the estimates (p̂G, p̂NG) when considered in conjunction with jury size and social decision rule.

SOCIAL DECISION SCHEMES

The general model of social decision schemes is described more fully elsewhere (Davis, 1973; Stasser, Kerr, & Davis, 1980), and will only briefly be described here. Imagine that a decision maker must select one of n mutually exclusive and exhaustive response alternatives, Aj, j = 1, 2, ..., n. We assume that individual decisions are characterized by a discrete probability distribution, p = (p1, p2, ..., pn), over these alternatives, and that of groups by a probability distribution, P = (P1, P2, ..., Pn), over the same alternatives. In some cases, however, as with the mock jury task, the number of response alternatives available for groups (guilty, not guilty, or hung) may be different from the number defined for individuals (guilty or not guilty).
At the beginning of the discussion, the r individual members may disagree on the response alternative for the group and may array themselves over the n response alternatives in

$$ m = \binom{n + r - 1}{r} $$

different ways. For example, in the case of a six-person jury there are seven possible ways the members might distribute themselves over the alternatives guilty and not guilty, i.e., (6,0), (5,1), (4,2), (3,3), (2,4), (1,5), (0,6). An array is sometimes referred to as a distinguishable distribution since alternatives but not people are distinguishable. The probability of the ith distribution of member preferences occurring, πi, may be estimated directly in some research applications by counting the relative frequency, π̂i, with which the ith array occurs. But in other instances, πi must be estimated indirectly by substituting the estimates p̂1, p̂2, ..., p̂n from a sample of individuals in the expression (with rj denoting the number of members favoring alternative Aj in the ith array)

$$ \pi_i = \binom{r}{r_1, r_2, \ldots, r_n} p_1^{r_1} p_2^{r_2} \cdots p_n^{r_n}, $$
the well-known multinomial distribution which defines the probability of the ith array of members occurring.

Given a particular distribution or array of opinions in a group, what is the probability that a group will choose a particular alternative? This process is obviously a function of the social interaction that establishes the consensus and may reflect as well the constitution or bylaws of the group, custom or tradition, the law, and the like. This social process may be extraordinarily complex, but it can be given an explicit summary form by defining the conditional probability, dij, the probability of the group choosing the jth response alternative given the ith distinguishable distribution. The general statement of the theoretical relations between initial distribution and final outcome may be cast in an m x n social decision scheme matrix, D. Examples of social decision scheme matrices for the six-person case are given in Table 1. Some of the social decision schemes we have chosen for study (summarized as D1 and D2 in Table 1) have accurately described mock jury verdict distributions in previous research (Davis et al., 1975; Davis et al., 1977b; Kerr et al., 1976), whereas others (viz., D3 and D4) offer interesting contrasts.

As a final step, we can now relate the individual probability distribution, p, with the group probability distribution, P, by postmultiplying the row vector, π, by the social decision scheme matrix D. Thus,

$$ (P_1, P_2, \ldots, P_n) = (\pi_1, \pi_2, \ldots, \pi_m) \cdot D. $$

(Note that in the following sections, n = 2, since the criminal trial is the referent here. However, the preceding discussion assumed n > 2, the general case.)

The consequence of each (p̂G, p̂NG) distribution displayed in Fig. 1 must be assessed with regard to some social process (e.g., those given in Table 1). We have carried through calculations, assuming each social decision scheme of Table 1 in turn and each value graphed in Fig. 1 as a parameter. These projections or extrapolations are given in Fig. 2 for several different jury sizes. It is evident from inspection that the trend in (pG, pNG) is influencing the results (PG, PNG, PH), given the social decision schemes under consideration, in some very interesting ways.

TABLE 1
SIX-PERSON EXAMPLES OF PLAUSIBLE SOCIAL DECISION SCHEME MATRICES, Dh, USED TO GENERATE POSSIBLE CONSEQUENCES IN JURY VERDICTS OF THE APPARENT TREND AMONG JURORS

Initial (rG, rNG)                        Jury outcomes
opinion                 D1                D2                D3                D4
distribution       G    NG     H     G    NG     H     G    NG     H     G    NG     H
(6,0)            1.00  .00   .00   1.00  .00   .00   1.00  .00   .00   1.00  .00   .00
(5,1)            1.00  .00   .00   1.00  .00   .00    .83  .17   .00    .67  .00   .33
(4,2)            1.00  .00   .00   1.00  .00   .00    .67  .33   .00    .33  .33   .33
(3,3)             .00  .00  1.00    .00  .75   .25    .50  .50   .00    .00  .60   .40
(2,4)             .00 1.00   .00    .00 1.00   .00    .33  .67   .00    .00 1.00   .00
(1,5)             .00 1.00   .00    .00 1.00   .00    .17  .83   .00    .00 1.00   .00
(0,6)             .00 1.00   .00    .00 1.00   .00    .00 1.00   .00    .00 1.00   .00

Note. G = guilty, NG = not guilty, H = hung. Principles: (D1) two-thirds majority establishes guilty, not-guilty verdict with probability near 1.0, otherwise hung with probability near 1.0; (D2) two-thirds majority establishes guilty, not-guilty verdict with probability near 1.0, otherwise P(NG) = .75, P(H) = .25; (D3) probability of guilty verdict is proportional to majority strength, otherwise not guilty; (D4) majority for not guilty establishes a not-guilty verdict; even split for guilty and not guilty establishes P(NG) = .60, P(H) = .40; unanimous majority for guilty establishes a guilty verdict; otherwise: if minority proportion favoring not guilty (MP) < 1/3, P(H) = 2(MP), P(G) = 1 - P(H); if MP ≥ 1/3, P(H) = P(NG) = MP, P(G) = 1 - 2(MP). (See Figures 1 and 2.)

FIG. 2. The trends in jury outcome probabilities (PG, PNG, PH) as a function of the probability, pG, estimated for each study, Spring 1973 through Spring 1976 (see Fig. 1), and selected jury sizes. [Horizontal axis of each panel: study, 1-7.]

Several important conclusions are illustrated in Fig. 2. For example, consider the practical question of jury size. Observe that the magnitude of the “real” difference between, say, 6- and 12-person juries with regard to PG is quite small for Study 1, increases substantially at Study 2, but then decreases steadily toward the end of that time period, given, of course, a social decision scheme such as D1. This discrepancy due to size is nowhere very large. However, the change in magnitude, and in some cases the change in direction of the difference, is the important consequence of changes in pG. If these models are even approximately correct, it would not be surprising to obtain “conflicting” empirical results in studies where pG is free to vary. In other words, past results may be accurate, but due to the studies’ location along the pG continuum, size-related differences in verdicts are not those expected intuitively.

Those relationships that are nonmonotonic must seem especially counterintuitive in character. Consequently, to aid our intuition we calculated guilty verdict probabilities, PG, for both 6- and 12-person juries over the full range of individual guilt preference values, pG, assuming the two-thirds majority principle embodied in social decision schemes D1 and D2. These results, as well as the location of each of the jury studies along the pG continuum, are depicted in Fig. 3. Several interesting findings are illustrated there. For one thing, it is clear that the relationship between jury size differences and pG, given a two-thirds majority social decision scheme, is not as simple as one might initially imagine. Specifically, when pG = 0, .73, or 1, note that no difference due to size is predicted. However, when pG ranges from .01 to .72, juries of 6 persons are predicted to convict with greater probability than those of 12, while the opposite is true in the interval .74-.99. Furthermore, the magnitude of the predicted size difference within each of these intervals first increases, reaches a maximum (i.e., 15% at .5 and 2.7% at .83), and then steadily decreases as pG increases. Also evident from Fig. 3 is that the best opportunity to detect “true” size differences in jury conviction rates, given the jury studies we have considered and a two-thirds majority, occurred at the time of Study 2.
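The calculation behind these size comparisons can be sketched in a few lines. The code below is a minimal illustration, not the authors' program: it assumes n = 2 individual alternatives (so the multinomial reduces to the binomial), builds a D1-style matrix in which the two-thirds majority is taken as the smallest whole number of votes at or above 2r/3, and forms the group distribution as the product of the row vector of array probabilities and that matrix. All function names are hypothetical.

```python
from math import comb


def array_probs(r: int, p_guilty: float) -> list:
    """pi_i for the arrays (k guilty, r - k not guilty), k = r, r-1, ..., 0."""
    return [comb(r, k) * p_guilty ** k * (1.0 - p_guilty) ** (r - k)
            for k in range(r, -1, -1)]


def two_thirds_scheme(r: int) -> list:
    """D1-style matrix: one row per array, columns (G, NG, H)."""
    needed = (2 * r + 2) // 3  # ceiling of 2r/3, an assumed reading of the rule
    rows = []
    for k in range(r, -1, -1):
        if k >= needed:            # guilty faction reaches two thirds
            rows.append((1.0, 0.0, 0.0))
        elif r - k >= needed:      # not-guilty faction reaches two thirds
            rows.append((0.0, 1.0, 0.0))
        else:                      # no two-thirds majority: hung
            rows.append((0.0, 0.0, 1.0))
    return rows


def group_outcomes(r: int, p_guilty: float) -> tuple:
    """(P_G, P_NG, P_H): the row vector pi postmultiplied by D."""
    pi = array_probs(r, p_guilty)
    d = two_thirds_scheme(r)
    return tuple(sum(pi_i * row[j] for pi_i, row in zip(pi, d)) for j in range(3))


for p_g in (0.50, 0.70, 0.76):
    g6 = group_outcomes(6, p_g)[0]
    g12 = group_outcomes(12, p_g)[0]
    print(f"pG = {p_g:.2f}: PG(6) = {g6:.3f}, PG(12) = {g12:.3f}, difference = {g6 - g12:+.3f}")
```

For r = 6 the generated matrix matches D1 in Table 1. Under these assumptions the 6-person jury is the more likely to convict at pG = .50 and .70, and the 12-person jury at pG = .76, which agrees with the crossover described in the text.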
FIG. 3. Probability of conviction under a two-thirds majority decision scheme (D1 and D2) for 6- and 12-person juries as a function of the probability of an individual guilty vote. Also shown are the maximum size differences on each side of the crossover and the relative location of Studies 1-7 on the pG continuum. [Horizontal axis: probability of individual guilty preference, pG.]
Moreover, the upward drift that occurred in pG suggests that the particular mock trial used in Studies 2-7 might not now be optimally suited for the purpose of investigating jury size effects. In particular, Fig. 3 suggests that those wishing to research jury size are well advised to use a mock trial for which the value of pG more closely approximates .50.

We can illustrate some further implications for empirical research by asking how many subjects would be required in order to detect the (true) difference at the time of Study 1 (where the gap is smallest, 2.3%), as opposed to Study 2 (where it is largest, 14.5%). The relevant calculations,³ assuming an equal number of groups and a conventional Type I error of .05, indicate that 125 juries of each size are required in the former instance and 80 in the latter. The total number of subjects needed, however, would then be 6(125) + 12(125) = 2250 and 6(80) + 12(80) = 1440, respectively.⁴ Both hypothetical samples are quite large. Given current research practices, which rarely use samples of such magnitude, it would be difficult to detect either theoretical difference empirically. The same point was made without reference to actual data by Davis, Bray, and Holt (1977), and has been discussed further in Vollrath and Davis (1980).

It is important to remember that we are assuming a value for pG that was actually observed, p̂G, from a large sample in each study. Recall, too, that the mock trial was changed from Study 1 to Study 2, but remained constant thereafter. Hence, the increase in p̂G from Study 1 to Study 2 reflects that fact as well as any changes that may have occurred in the social milieu during the same period.
Therefore, a better illustration of the implications of parameter drift for empirical research may be obtained by calculating the number of subjects required to detect the difference between 6-person and 12-person juries preferring guilty at the time of Study 7 (2.9%) and comparing it with the difference obtained at Study 2 (again assuming a social decision scheme such as D1). The relevant calculations yield the unexpectedly large figure of 6(1876) + 12(1876) = 33,768. This result is all the more surprising in light of our previous finding that “only” 2250 subjects would be required to detect a difference of similar magnitude (2.3%). This somewhat counterintuitive result is due to the fact that the statistical detection of a given difference between any two theoretical proportions requires greater power when the proportions are near .50 than when they are near 0.00 or 1.00. Such a lack of uniformity of effects in the closed interval [0, 1] is due to larger variances near the center of the interval; note for example that the squared standard error of a proportion, σ² = p(1 - p)/N, is .25/N when p = .50, but is only about .0099/N when p = .01.

These consequences are further illustrated in Fig. 4, which graphically depicts the total sample sizes required to detect theoretical differences between verdicts of 6- and 12-person juries of 5, 7.5, 10, 12.5, and 15% at various points along the continuum from 0.00 to 1.00. For example, a 10% difference in PG between 6- and 12-person juries occurring around .20 (say, PG(6) = .15 and PG(12) = .25, or vice versa) would require 6(122) + 12(122) = 2196 subjects to detect (once again assuming an equal number of groups in the 6- and 12-person samples, and a Type I error rate of .05), whereas the same 10% difference occurring around .50 (say, PG(6) = .45 and PG(12) = .55, or vice versa) would require 6(192) + 12(192) = 3456 subjects to detect, an increase of 57%. It is clear from Fig. 4 that verdict differences due to jury size which are less than 15% will generally be quite difficult to detect without very large samples,⁵ at least with social decision processes such as we have assumed here. Unfortunately, note that those places where the power to detect such jury size differences is greatest (near 0.00 or 1.00) also correspond to those where the likelihood of obtaining a difference is least.⁶ See Diamond (1974) for a discussion of “slanted” cases. Moreover, the value of pG where size differences are likely to be the largest (see Fig. 3) ironically corresponds to that point where the power to detect such differences is weakest.

³ The required sample sizes were obtained by solving for N in the expression

$$ z = \frac{2\sin^{-1} q_1^{1/2} - 2\sin^{-1} q_2^{1/2}}{(1/N_1 + 1/N_2)^{1/2}} = 1.96, $$

where z is a normal deviate, q1 and q2 are sample proportions set at the difference to be illustrated, N = N1 = N2 (i.e., equal size samples), and a conventional Type I error of .05 is assumed. Thus,

$$ N = 2\left[\frac{2\sin^{-1} q_1^{1/2} - 2\sin^{-1} q_2^{1/2}}{1.96}\right]^{-2}. $$

⁴ Note that the requirement of 1440 subjects to detect a 14.5% difference by means of usual statistical tests contrasts with a somewhat similar value reported by Davis et al. (1979a) as necessary to detect an 8% difference (viz., 1116 subjects). A closer examination of the latter calculation reveals that the figure reported actually underestimates the number of subjects needed by a factor of two! The correct value is 2232.
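The sample-size expression in the footnote is easy to evaluate directly. The sketch below is illustrative only: it assumes that fractional results are rounded up to the next whole jury (an assumption, though it reproduces the 122 and 192 juries quoted in the text), and the function names are hypothetical.

```python
import math


def juries_per_size(q1: float, q2: float, z_crit: float = 1.96) -> int:
    """Juries needed at each size: N = 2 * [(2 asin sqrt(q1) - 2 asin sqrt(q2)) / z]^-2."""
    h = 2.0 * math.asin(math.sqrt(q1)) - 2.0 * math.asin(math.sqrt(q2))
    # Requires q1 != q2; rounding up to a whole jury is an assumption of this sketch.
    return math.ceil(2.0 * (z_crit / h) ** 2)


def total_subjects(n_juries: int, sizes: tuple = (6, 12)) -> int:
    """Total subjects when n_juries groups are run at each jury size."""
    return sum(size * n_juries for size in sizes)


near_tail = juries_per_size(0.15, 0.25)    # a 10% difference located around .20
near_center = juries_per_size(0.45, 0.55)  # the same difference located around .50

print(near_tail, total_subjects(near_tail))
print(near_center, total_subjects(near_center))
```

The same 10% difference therefore costs noticeably more subjects near .50 than near .20, which is the pattern the arcsine transformation is meant to capture.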
(Although we shall not discuss the matter here, the normal distribution approximation to the binomial distribution becomes less accurate as p nears .00 or 1.00, and this requires additional care in data analysis.)

The foregoing discussion illustrates anew the old lesson that bare empirical results are not sufficient and can sometimes mislead. Empirically grounded conclusions concerning the efficacy of jury size depend on the state of the system at the time of assessment. Remember, we are untypically holding constant (within the state of the art) a variety of conditions (case, participants, etc.) that would not prevail in the world at large. But the subjects come from the world at large (undergraduates) and, while the subject population is clearly defined, and nominally the same from study to study, it never in fact contains socially constant material. With a change in (pG, pNG) at the individual level there is thus a necessary change in (PG, PNG, PH) at the group level. However, this change is not necessarily linear, nor is it always regular. Accurate extrapolation is complicated and depends upon explicit theory.

⁵ The rather large samples required to detect some of the differences between 6- and 12-person jury conviction rates (PG) portrayed in Fig. 3 might lead one to question the practical significance of such differences. Recall, however, that the differences graphed are “true” theoretical differences. Hence, while one might question the significance of, say, a 5% difference on the grounds of the statistical power required to detect it, we are fairly confident that a defendant brought to trial before a jury is likely to view such a difference as quite important.

⁶ The discussion of sample sizes is intended to dramatize the magnitude of size-related differences in verdicts. Of course, the magnitude and direction of the difference over time are themselves the important points.

FIG. 4. Graphs show sample size needed to detect (with Type I error at .05) a given difference (5, 7.5, 10, 12.5, or 15%) between the proportion of 6- versus 12-person juries preferring guilty as a function of the region of the interval, .00 ≤ PG(6) - PG(12) ≤ 1.00, in which it is located. [Horizontal axis: location of difference, PG(6) - PG(12).]

Other features of interest in Fig. 2 also illustrate the impact of the temporal trend at issue. Compare the shape of the PNG and PH surfaces given for social decision scheme D1 with those for D2 (the PG surfaces are identical). Differences in these surfaces illustrate the power of the subscheme, which only comes into play when the primary plan (two-thirds majority in this instance) does not apply, in assigning a substantial portion of the probability to the group outcomes. Observe that while the drift in the parameter pG seems to affect D1 and D2 in opposite ways with respect to PNG and PH, the overall effect for both decision schemes is a decrease in the magnitude of the size differences. The two surfaces depicted for D3 illustrate the case where no group size differences are predicted, because the proportionality scheme reproduces in the group output the original input distribution. The proportionality social decision scheme is given here for comparison purposes. The surfaces shown for D4, however, illustrate some verdict consequences of an asymmetrical scheme. Note that while the drift in pG results in a decrease in the magnitude of theoretical size differences in PG, an increase is observed in PNG and PH.

Other points of interest in Fig. 2 are the “folds” that appear at group size 8 in the surfaces given for decision schemes D1 and D2. In several instances, these serve to illustrate counterintuitive findings. For example, the PH surface for D1 seems to fold downward from group size 8 to size 12. This reflects the fact that, given a two-thirds majority decision scheme, 8-person juries are predicted to hang with a greater probability than 10-person juries, which, in turn, are more likely to do so than 12-person juries. The reason for this apparent anomaly lies in the fact that a two-thirds majority is not well defined for groups of 8 or 10, wherein a consensus of 5.33 and 6.67 “persons,” respectively, would be required. Thus, the application of a two-thirds rule requires a majority of at least 6 for an 8-person group and 7 for a 10-person group before the majority can “win.” Notice that what we have actually formalized as a two-thirds majority for groups of size 8 and 10 is coincident with a three-fourths majority for the former and a 7/10 majority for the latter. In both cases these are more “stringent” decision rules and result in the folds in the surfaces of D1 and D2. Where the decision rule is well defined, as in D3 and D4, the effect of group size on the theoretical predictions is much more regular. Finally, notice that the depth of the folds just described changes with the study. That is, the increase in pG over time (i.e., study) alters the size effect for a given social decision scheme. Such temporal results constitute our major concern.

SOCIAL DECISION PROCESSES
Up to this point we have been assuming a social decision process with a constant effect over time. That is, our projections of implications from individuals’ inclinations have used the social decision scheme matrices of Table 1, in which d, was assumed to remain constant over time. But it seems likely that the social decision process, of course only summarized by a social decision scheme, would also be subject to the changes in social norms that we believe influenced pc, the probability of individual guilt preference. Earlier we noted that the experimental manipulations in each study took place after subjects were formed into juries, and thus such treatments were typically confounded with time. However, the experimental manipulations did not produce significant effects in the (chronologically) second study (Davis et al., 1977b) we conducted. The observed social decision scheme matrix from that study along with a more recent experiment (Davis et al., 1980), in which some jury conditions were free of
manipulations, are given in Table 2. Inspection suggests that the majority effect is perhaps not symmetrical in either data set, but with time seems to have become less powerful, at least in the direction of conviction. In other words, majority effects persist, but the proportion of a guilty-favoring majority prevailing is much reduced; not-guilty majorities, however, remain nearly invincible. Regarding the two matrices as layers in a three-dimensional contingency table yields χ²(8) = 28.31, p < .001, where the reduced degrees of freedom appropriately reflect zero frequencies (Bishop, Fienberg, & Holland, 1977). The general differences between the two data matrices might be characterized as an increase in “defendant protection,” implying a decreasing tendency to find for guilty. The possibility of such a moderating trend is particularly interesting in view of the increase observed in p_G, or the proportion of guilty-inclining individual jurors faced with the same mock trial. However, observe that the retreat from the guilty verdicts in the earlier study was toward the hung category in the later experiment, and did not result in an increase in acquittals as our “defendant protection” label might imply. Hung outcomes are ambiguous, and may reflect unresolved conflict from “hardened” positions rather than simply a recoil from guilty verdicts. (Time allotted was the same in both studies.) It is also worth remembering that the lack of significant treatment effects observed within the earlier study reflects a statistical decision, and cannot ultimately be a (null hypothesis) assertion of no (real) lingering treatment effects. Consequently, the results of the temporal comparison in Table 2 should be regarded as suggestive rather than definitive. Moreover, social decision schemes estimated at only two points in time cannot define much of a trend.
Yet, disturbing as it may be, we would be prudent to regard social explanations such as these as perhaps holding within a region, and as subject to temporal dynamics that are as yet incompletely understood.

TABLE 2
SOCIAL DECISION SCHEME MATRICES OBSERVED IN STUDY 2 AND STUDY 5

                        Observed relative frequencies
Initial distribution    Study 2^a (N = 90)       Study 5^b (N = 48)
(G, NG)                 G      NG     H          G      NG     H
(6,0)                   1.00   .00    .00        1.00   .00    .00
(5,1)                   .93    .07    .00        .82    .00    .18
(4,2)                   .84    .16    .00        .47    .11    .42
(3,3)                   .16    .68    .16        .14    .50    .36
(2,4)                   .06    .94    .00        .00    1.00   .00
(1,5)                   .00    1.00   .00        .00    1.00   .00
(0,6)                   -^c    -^c    -^c        -^c    -^c    -^c

a Davis et al., 1977b.
b Davis et al., 1980.
c No juries with this initial distribution were observed.
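The projection from individual inclinations to jury verdicts can be sketched with the two observed matrices of Table 2. This is a minimal illustration rather than the authors' computation. It assumes independent jurors with a common guilty probability p_G, weights each initial (G, NG) split by its binomial probability, and multiplies by the observed row of verdict proportions. Two things are assumptions made only for the example: the unobserved (0,6) row is treated as a certain acquittal, and the same p_G = .69 (the 1976 value) is applied to both matrices so that only the decision scheme differs.

```python
from math import comb

# Rows run from the (6,0) split down to (1,5), as read from Table 2.
# Columns are the proportions of juries ending guilty, not guilty, and hung.
STUDY_2 = [
    (1.00, 0.00, 0.00),
    (0.93, 0.07, 0.00),
    (0.84, 0.16, 0.00),
    (0.16, 0.68, 0.16),
    (0.06, 0.94, 0.00),
    (0.00, 1.00, 0.00),
]
STUDY_5 = [
    (1.00, 0.00, 0.00),
    (0.82, 0.00, 0.18),
    (0.47, 0.11, 0.42),
    (0.14, 0.50, 0.36),
    (0.00, 1.00, 0.00),
    (0.00, 1.00, 0.00),
]
# Assumption for illustration: no (0,6) juries were observed in either study,
# so a unanimous not-guilty start is taken to end in acquittal.
UNANIMOUS_NOT_GUILTY = (0.0, 1.0, 0.0)


def verdict_distribution(p_guilty, scheme, size=6):
    """Return (P_G, P_NG, P_H) implied by p_guilty and a decision scheme matrix."""
    rows = list(scheme) + [UNANIMOUS_NOT_GUILTY]
    totals = [0.0, 0.0, 0.0]
    for guilty_votes, row in zip(range(size, -1, -1), rows):
        # Binomial probability of this initial split among independent jurors.
        weight = (
            comb(size, guilty_votes)
            * p_guilty**guilty_votes
            * (1 - p_guilty) ** (size - guilty_votes)
        )
        for verdict, proportion in enumerate(row):
            totals[verdict] += weight * proportion
    return tuple(totals)


for label, scheme in (("Study 2", STUDY_2), ("Study 5", STUDY_5)):
    p_g, p_ng, p_h = verdict_distribution(0.69, scheme)
    print(f"{label}: P_G={p_g:.3f} P_NG={p_ng:.3f} P_H={p_h:.3f}")
```

Holding p_G fixed isolates the change in the decision process: with the later matrix the projected conviction probability is lower and the hung probability higher, in line with the shift toward the hung category described above.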
DISCUSSION
Gergen’s (1973) pithy analysis of theory and data in social psychology constitutes one of the central pieces of the crisis-in-social-psychology literature. Without endorsing the idea that the perishability of social knowledge is due to any specific set of influences in the way Gergen suggests (e.g., the bias inherent in prescriptive theory, or the influence of social knowledge on subsequent social behavior), or subscribing to the notion that social-historical analyses should become the paramount analytical perspective, it seems prudent to recognize that social phenomena indeed may change. Moreover, both temporal and cultural inconstancies may constitute the very stuff that should be studied but is now often ignored. As we implied at the outset, the recognition of social change is hardly new. But the particular implications of social trends are not always sufficiently reflected in our conceptual work, whether intuitive or formal. For example, one can easily conclude from the current literature on the role of jury size (see recent reviews by Davis et al., 1977a; Vollrath & Davis, 1980) that there exists to be discovered a reasonably unqualified empirical answer to the “best size” question; yet the results we have reported here and elsewhere suggest that such is not the case. (See Davis, 1980, for a more comprehensive discussion of the problem of various inconstancies to be reckoned with in regard to social engineering questions, especially those pertaining to psychology and the law.) Consider, too, the implications of temporal trends for that revered canon of science: replication. From the above results, it is obvious that a reader of an early report who set out to confirm (or disconfirm) the effect of size on mock jury verdicts by the most exacting replication (i.e., using precisely the same procedure, case, conditions, and the like) might well have failed to duplicate the results in question.
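A short calculation shows how such a replication could fail even in the absence of sampling error. The sketch below is illustrative only. It assumes independent jurors, a rule under which conviction requires at least two-thirds of the jury, and the two individual guilty probabilities that bound the observed drift (.53 and .69). The function names and the choice of this particular rule are assumptions made for the example and do not reproduce the schemes of Table 1.

```python
from math import comb


def conviction_probability(size, p_guilty):
    """Probability that at least two-thirds of the jurors initially favor guilty."""
    quorum = (2 * size + 2) // 3  # integer form of ceil(2 * size / 3)
    return sum(
        comb(size, k) * p_guilty**k * (1 - p_guilty) ** (size - k)
        for k in range(quorum, size + 1)
    )


def size_effect(p_guilty, small=6, large=12):
    """Theoretical gap in conviction probability between two jury sizes."""
    return conviction_probability(small, p_guilty) - conviction_probability(large, p_guilty)


for p_guilty in (0.53, 0.69):
    print(f"p_G={p_guilty:.2f}  P_G(6) - P_G(12) = {size_effect(p_guilty):.3f}")
```

Under these assumptions the theoretical gap between 6- and 12-person juries is roughly .14 when p_G = .53 but only about .03 when p_G = .69, so an exact replication run a few years later would be looking for a much smaller effect.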
The “replication failure” need not have been due to sampling error (the usual interpretation), careless work, etc., but rather to changes in the social processes themselves which have gone unrecognized. Clearly, the mock jury results discussed here constitute only one example. The same approach or points of view can be extrapolated by the reader to other areas in which research questions may be similarly vulnerable to social dynamics. Our point is that while repeating experiments or other data-gathering efforts is an essential part of the research enterprise, implications of apparently affirming or disconfirming results are not always straightforward. One often encounters the lament that “different” (but largely unspecified) methods are required for social psychology (methods not inherited from the natural sciences), and the apparent parameter drift illustrated herein might offer a prima facie example of such a need. However, other disciplines, ranging from meteorology to virology, likewise face dynamic
phenomena. Each field may at some point be best served by particular research methods, but we are unconvinced that there are philosophical differences between social psychology and other natural sciences. We do, however, endorse an historical perspective (or at least an appreciation of temporal factors) as an important general disposition in social research. But, it would surely be prudent to cultivate a sensitivity in our theoretical work to parameter inconstancies of all sorts, whether temporal, cultural, personal, or whatever. As we have seen, there may be some very practical reasons for holding such a view. In the study of many social phenomena (e.g., jury deliberations) the development of adequate data sets is often difficult or impossible. Theory, particularly formal theory capable of generating explicit statements under various interesting and plausible parameter values, allows us to interpolate between data points which are available as well as extrapolate from existing data to those regions for which no data exist. As we have seen from Fig. 4, those data that do exist may not always be adequate to allow the detection of subtle effects (true theoretical differences in our examples). Such results illustrate how the grounding of social policy on the results of a study or two may prove hazardous. One means of evaluating the possible consequences of social policy changes is through simulation. The use of simulation techniques is routine in various areas of meteorology, engineering, and other disciplines confronted by research constraints and inconstancies not unlike those common to social psychology. For example, the feasibility of a particular airframe design can be evaluated by “flying” it under different computer-simulated weather conditions. These results may then point to problems in the design which were not evident before the simulation.
In a similar manner, the conceptual simulation of social phenomena may also point to counterintuitions or anomalies that may exist in the data record. (See Davis, 1980, for a more comprehensive discussion of theory and the issue of interpolation/extrapolation in regard to empirical data sets.) There is no substitute for data to guide theory, intuition, speculation, and the like. However, there seems at present a very strong disposition toward empirical research. Perhaps some redress favoring social theorizing is in order, particularly efforts to construct explicit models useful for evaluating social policy alternatives, and assessing the consequences of social inconstancies, temporal or whatever.
REFERENCES
Bishop, Y. M. M., Fienberg, S. E., & Holland, P. W. Discrete multivariate analysis. Cambridge, Mass.: MIT Press, 1977.
Davis, J. H. Group decision and social interaction: A theory of social decision schemes. Psychological Review, 1973, 80, 97-125.
Davis, J. H. Group decision and procedural justice. In M. Fishbein (Ed.), Progress in social psychology. Hillsdale, N.J.: Erlbaum, 1980.
Davis, J. H., Bray, R. M., & Holt, R. W. The empirical study of decision processes in juries: A critical review. In J. L. Tapp & F. J. Levine (Eds.), Law, justice, and the individual in society: Psychological and legal issues. New York: Holt, 1977. (a)
Davis, J. H., Holt, R. W., Spitzer, C. E., & Stasser, G. The effects of consensus requirements and multiple decisions on mock juror verdict preferences. Journal of Experimental Social Psychology, in press.
Davis, J. H., Kerr, N. L., Atkin, R. S., Holt, R. W., & Meek, D. The decision processes of 6- and 12-person mock juries assigned unanimous and 2/3 majority rules. Journal of Personality and Social Psychology, 1975, 32, 1-14.
Davis, J. H., Kerr, N. L., Stasser, G., Meek, D., & Holt, R. W. Victim consequences, sentence severity, and decision processes in mock juries. Organizational Behavior and Human Performance, 1977, 18, 346-365. (b)
Davis, J. H., Spitzer, C. E., Nagao, D. H., & Stasser, G. The nature of bias in social decisions by individuals and groups: An example from mock juries. In H. Brandstatter, J. H. Davis, & H. Schuler (Eds.), Dynamics of group decisions. Beverly Hills, Calif.: Sage, 1978.
Davis, J. H., Stasser, G., Spitzer, C. E., & Holt, R. W. Changes in group member preferences during discussion: An illustration with mock juries. Journal of Personality and Social Psychology, 1976, 34, 1177-1187.
Diamond, S. S. A jury experiment reanalyzed. University of Michigan Journal of Law Reform, 1974, 7, 520-532.
Gergen, K. J. Social psychology as history. Journal of Personality and Social Psychology, 1973, 26, 309-320.
Kerr, N. L., Atkin, R. S., Stasser, G., Meek, D., Holt, R. W., & Davis, J. H. Guilt beyond a reasonable doubt: Effects of concept definition and assigned decision rule on the judgments of mock jurors. Journal of Personality and Social Psychology, 1976, 34, 282-294.
Schram, D. D. Rape. In J. R. Chapman & M. Gates (Eds.), The victimization of women. Beverly Hills, Calif.: Sage, 1978.
Stasser, G., Kerr, N. L., & Davis, J. H. Group performance and decision making. In P. Paulus (Ed.), Psychology of group influence. Hillsdale, N.J.: Erlbaum, 1980.
Vollrath, D. A., & Davis, J. H. Jury size and decision rule. In R. Simon (Ed.), The jury: Its role in American society. Lexington, Mass.: Lexington, 1980.
REFERENCE NOTE
1. Davis, J. H., Nagao, D. H., Spitzer, C. E., & Stasser, G. Decisions in mock juries with multiple minorities and majorities (tentative title). Manuscript in preparation, University of Illinois, 1980.