Anchoring in sequential judgments


Organizational Behavior and Human Decision Processes 122 (2013) 69–79

Contents lists available at SciVerse ScienceDirect

Organizational Behavior and Human Decision Processes journal homepage: www.elsevier.com/locate/obhdp

Preface

Daniel Mochon a,⇑,¹, Shane Frederick b,¹

a A.B. Freeman School of Business, Tulane University, United States
b School of Management, Yale University, United States

Article info

Article history: Received 20 August 2011. Accepted 15 April 2013. Available online 25 May 2013. Accepted by Harris Sondak.

Keywords: Anchoring; Judgment; Response bias; Context effect; Heuristics

Abstract

Building on the scale distortion theory (Frederick & Mochon, 2012), we explore the boundary conditions of anchoring outside of the standard paradigm. We argue that the conditions needed for anchoring effects are much more restrictive than those suggested by some theories, but much less restrictive than those suggested by others. Our findings illuminate both the scope and limits of this well-known effect and provide a framework for predicting its occurrence in novel settings.

© 2013 Elsevier Inc. All rights reserved.

Introduction

Anchoring is the term applied to situations in which numeric judgments assimilate towards a previously encountered standard. In most experimental demonstrations, participants first judge whether some target quantity is greater or smaller than a presented numeric standard and then render their best point estimate, as in the example below, taken from Jacowitz and Kahneman (1995):

(1) Is the Mississippi river shorter or longer than 70 miles? [ ] shorter [ ] longer
(2) How long is the Mississippi river? _______ miles

Since a seminal demonstration by Tversky and Kahneman (1974), this “standard” or “classic” paradigm has dominated research on anchoring (for reviews, see Chapman & Johnson, 2002; Epley, 2004). However, potential anchors precede or accompany quantitative judgments in many situations that deviate from this paradigm, and it is important to understand the range of conditions in which anchoring typically occurs. Toward that end, we pursue two complementary goals in this paper: (1) we move beyond the standard paradigm to explore the boundary conditions of anchoring, and (2) we attempt to explain the presence or absence of anchoring in terms of our scale distortion theory (Frederick & Mochon, 2012).

We begin by reviewing the most common theories of anchoring and attempt to derive the predictions of those theories in judgmental contexts that differ from those in which the theories were formulated. Later, we move beyond the standard paradigm to help delineate the circumstances in which a potential anchor is likely to influence judgments and to test how well each theory can predict the pattern of results observed.

⇑ Corresponding author. Address: A.B. Freeman School of Business, Tulane University, 7 McAlister Dr., New Orleans, LA 70118, United States. Fax: +1 504 865-6751. E-mail address: [email protected] (D. Mochon).
¹ These authors contributed equally to the work.
0749-5978/$ - see front matter © 2013 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.obhdp.2013.04.002

Anchoring and adjustment

The first theory proposed to explain anchoring effects was “anchoring and adjustment” (Tversky & Kahneman, 1974), which suggested that respondents serially adjust from an anchor toward the correct value, but stop too soon, perhaps at the first value which seems plausible (Quattrone, Lawrence, Finkel, & Andrus, 1981). Later, Epley and Gilovich (2001, 2004, 2005, 2006) extended this theory to instances in which participants self-generate the numeric standard by considering some relevant benchmark. For example, if asked to estimate the freezing point of vodka, most respondents would first consider 32 °F (because it is the only freezing point they know), and then adjust downward from this value to reflect their recollection that alcoholic beverages typically do not freeze solid in their freezers.

The anchoring and adjustment process seems most plausible when the anchor provides a reasonable starting point for the judgment, as in the vodka example. The account seems less satisfying when the numeric standard is very far from a plausible answer to the target judgment. Indeed, some
research has failed to find evidence for the adjustment process within the standard paradigm, as various factors that ought to encourage participants to deliberate longer – and, thus, adjust more – do not, in fact, reduce anchoring (Chapman & Johnson, 2002; Epley & Gilovich, 2005; Mussweiler & Strack, 1999a; though see Simmons, LeBoeuf, & Nelson, 2010, for an alternate account of this).

Numeric priming

Several accounts hold that any number present in the environment or highly salient in memory at the moment of judgment may function as an anchor, regardless of its source, context, scale, or relevance. We will collectively refer to this class of theories as numeric priming; all essentially imply that “if people pay at least minimal amount of attention to arbitrary numbers, these numbers can anchor numerical answers to unrelated questions” (Wilson, Houston, Etling, & Brekke, 1996, p. 401). On this view, anything that enhances the salience of a number could increase the likelihood that it will be recruited as a possible answer in subsequent judgments requiring numeric responses. Results cited in support of this view include Critcher and Gilovich’s (2008) finding that a linebacker wearing a higher jersey number is judged to be more likely to record a sack in the next game and the finding by Wilson et al. (1996) that having students copy 35 four-digit numbers for an ostensibly unrelated purpose affected their judgments about the incidence of cancer among their classmates (see also Adaval & Monroe, 2002; Mussweiler & Englich, 2005; Wong & Kwong, 2000).

Oppenheimer, LeBoeuf, and Brewer (2007) have even suggested that such priming effects may occur for non-numeric stimuli, as they find that participants judged the average temperature in Honolulu to be lower when they first drew three short lines than when they first drew three long lines. Like numeric priming, this theory of conceptual priming suggests that anchoring may occur in a broad range of situations and that the anchor need not be related to the target stimulus or be on the same scale on which the judgment is rendered.

Claims of numeric priming effects remain controversial, however. Many researchers have argued that the mere presence of a number is generally not sufficient to cause anchoring (Brewer & Chapman, 2002; Mussweiler & Strack, 2001) and that these effects may be limited to situations in which participants’ cognitive resources are experimentally constrained (Reitsma-van Rooijen & Daamen, 2006; Wegener, Petty, Blankenship, & Detweiler-Bedell, 2010). Indeed, Brewer and Chapman (2002) failed to replicate many of the findings reported in Wilson et al. (1996).

Selective accessibility

Selective accessibility argues that anchoring occurs within the standard paradigm because of two related principles: hypothesis-consistent testing and semantic priming (Mussweiler & Strack, 1999a). When participants are presented with the comparative question, they treat it as a hypothesis to be tested. For instance, asking whether the Mississippi river is shorter or longer than 70 miles is treated as the hypothesis: Is the Mississippi river 70 miles long? Since hypothesis testing tends to evoke a search for confirmatory information (Klayman & Ha, 1987; Wason, 1960), information consistent with the provided anchor tends to be differentially activated. This information then lingers to influence the subsequent absolute judgment through a process akin to semantic priming (Higgins, 1996). For example, having focused on reasons why the Mississippi river could be 70 miles long, the subsequent absolute estimate is driven downward by those facts most consistent with the river being that short (Chapman & Johnson, 1994, 1999; Mussweiler, 2003; Mussweiler & Strack, 1999b, 2000; Strack & Mussweiler, 1997).

Various results have been cited in support of this account. Mussweiler and Strack (1999b) found that responses depended on the precise hypothesis implied by the comparative judgment. Specifically, participants judged the Elbe river to be longer if first asked whether it was longer than 890 km than if first asked whether it was shorter than 890 km.² Mussweiler and Strack (2000) found that participants were quicker to identify words consistent with expensive cars (e.g. BMW) if the prior comparative question about the average price of cars involved a higher monetary value: 40,000 German Marks (vs. 20,000). Such results support the theory of selective accessibility by showing that the numeric value used in the comparative question can affect the activation of concepts, but do not demonstrate that this activation mediates the observed anchoring effect.

Strack and Mussweiler (1997) construe anchoring as a form of semantic priming, and have argued that knowledge primed during the comparative judgment will be influential to the extent it is deemed applicable to the target judgment (e.g. Higgins & Brendl, 1995; Higgins, Rholes, & Jones, 1977). For example, selective accessibility would predict that estimates of the length of the Mississippi river would be more affected by preceding judgments about the length of rivers than by preceding questions involving the weight of animals or the completion percentages of NFL quarterbacks. However, the applicability criterion has also been invoked to make less intuitive predictions. For example, Strack and Mussweiler (1997) predict that the numeric anchors provided in a comparative question about the width of the Brandenburg Gate will not affect subsequent judgments about its height, and report data consistent with that prediction. We will revisit these results later in the paper. For now, we use them only to make the point that the threshold for relevance is very much an open question in several theories of anchoring.

Scale distortion

The scale distortion theory of anchoring (Frederick & Mochon, 2012) attributes anchoring to a distortion in the mapping of judgments to the provided response scale. On this view, numeric anchors do not affect the representation of, or beliefs about, the target stimulus, but rather the use of the response scale on which judgments are rendered. For example, in the context of a 70 mile anchor, subsequent estimates of the length of the Mississippi river are lower, not because participants believe that it is shorter, but rather because they use a different number on the miles scale to communicate their unchanged mental representation of length. Whereas 2000 miles might have seemed like a reasonable response when no anchor is present, this number seems too large when contrasted with 70, causing respondents to select a smaller number.

In support of their theory, Frederick and Mochon (2012) showed that anchors can affect the mapping between numbers and stimuli. For example, respondents who were first asked to estimate the weight of a wolf later judged ‘1000 lbs’ as bigger – they chose larger exemplars when asked to indicate an animal that weighs around 1000 lbs. They also showed that anchoring effects are “shallow” – the influence of anchors on responses appears to be unaccompanied by any corresponding effects on mental representations of the object being evaluated. For example, an anchoring manipulation may affect the number of calories that a specified serving of French fries is judged to have, without affecting any related judgments, such as how many grams of fat they contain or how unhealthy they are relative to other foods (see also Brewer, Chapman, Schwartz, & Bergus, 2007; Chapman & Johnson, 1994; Kahneman & Knetsch, 1993).

² This particular result is also consistent with conversational norms (Grice, 1975), as the direction of the comparative question may imply something about the answer expected.

Overview of studies

In this paper, we move outside the standard paradigm to explore the boundary conditions of anchoring. Building on the scale distortion theory (Frederick & Mochon, 2012), we will show that the conditions needed for anchoring are much more restrictive than those suggested by a numeric priming account, but much less restrictive than the selective accessibility account would seem to imply. Note, in particular, that champions of the selective accessibility account have proposed that the anchor and target stimulus must be very closely related – to the extent that Strack and Mussweiler (1997) predict that the width of the Brandenburg Gate (in meters) will be an inapplicable standard for judging the height of the Brandenburg Gate (in meters). By contrast, we will show that the hurdle of conceptual relevance is quite low: a wide range of stimuli can act as anchors for each other if the stimulus to which the numeric standard is attached shares the response scale with the target judgment.

To better understand which experimental features are critical for the emergence of anchoring effects, our first study manipulates two things that are typically held constant in the standard paradigm: (1) the conceptual relevance of the comparative question and (2) the manner in which the anchor is represented. This study enables us to contrast two of the aforementioned theories, as selective accessibility predicts that the critical factor is the conceptual relevance of the hypothesis being tested, whereas scale distortion predicts that the crucial factor will be the presence of a numeric standard on a common scale.

In Study 2, we abandon the standard paradigm altogether to examine whether prior judgments affect subsequent ones when no comparison is requested or even strongly suggested. This study further tests the role of conceptual relevance in anchoring and expands the boundaries of anchoring by extending it to a somewhat more common context, in which people render multiple judgments about different stimuli. Studies 3A and 3B examine whether numeric and conceptual priming could explain the anchoring effects we observe outside of the standard paradigm. As a whole, these studies show that scale preservation is essential for anchoring to occur, but that conceptual relevance does not play a large role.

The final set of studies examines whether conceptual relevance plays any role in anchoring. Though conceptual relevance was not part of the original formulation of the scale distortion theory (Frederick & Mochon, 2012), we refine the theory here to suggest that conceptual relevance can play a role, but a different one than what is suggested by selective accessibility. The selective accessibility model argues that similarity moderates anchoring by affecting the information respondents access during the comparison (Mussweiler, 2003). When two stimuli are similar, people focus on their similarities, leading to assimilation effects. When two stimuli are dissimilar, people focus on their dissimilarities, leading to contrast effects. We instead propose that conceptual relevance can moderate anchoring by affecting whether a comparison is made, rather than by affecting the type of comparison. Even a scale distortion account of anchoring assumes a comparison between numeric values on the scale. However, if the anchor is perceived as a completely irrelevant standard, no comparison to that number will be made, and, thus, no scale distortion will occur. In the final set of studies we show that highly dissimilar anchors do eliminate the anchoring effect, though they do not lead to contrast effects (Studies 4A and 4B). However, consistent with the above formulation, even in these cases, anchoring re-emerges when comparisons are experimentally enforced (Study 4C). This suggests that conceptual similarity moderates anchoring by reducing the likelihood that a comparison is made, rather than by changing the type of comparison that is made or the information to which one attends during that comparison.

Study 1

To begin our exploration of the boundaries of anchoring effects, our first study retains the traditional paradigm (in which a comparative question involving a provided standard precedes an absolute judgment), but manipulates two aspects of the comparative question that are rarely varied: (1) whether the magnitude of the stimulus being compared is implicitly or explicitly represented with a number and (2) whether the comparative and absolute judgments pertain to the same stimulus.

The selective accessibility account of anchoring (Mussweiler & Strack, 1999b, 2000; Strack & Mussweiler, 1997) emphasizes the hypothesis being tested in the comparative judgment and the relevance of the activated information to the target judgment, not the presence of a number, per se. On this view, the conceptual relevance of the stimulus should matter more than whether the anchor is expressed with a number. By contrast, the scale distortion account (Frederick & Mochon, 2012) emphasizes the presence of a numeric standard on a shared scale. Therefore, a numeric representation should matter more than whether the number in the comparative judgment pertains to the same stimulus for which an absolute judgment is ultimately made. This prediction is consistent with some prior work on anchoring suggesting that comparisons not involving numbers do not cause anchoring (Chapman & Johnson, 1994; Markovsky, 1988).³

³ Markovsky (1988) reported studies using numeric standards and others not using them, though never manipulated this factor within the same study. Chapman and Johnson (1994) simultaneously manipulated whether the anchor was expressed as a number and the way in which the comparison was phrased. The current study builds on this prior work while manipulating only whether the comparative judgment involves a numeric value.

Method

We recruited 143 participants from Amazon Mturk to complete a short online study. The target judgment was the current price at Best Buy of a Canon Powershot camera. All participants first made a comparative judgment, though we manipulated three aspects of it using a 2 × 2 × 2 between-subjects design: (1) whether the comparison involved the stimulus of the target judgment (a Canon Powershot camera) or not (a Garmin Nuvi GPS), (2) whether that stimulus was being compared with something more or less expensive, and (3) whether the high or low standards were specified as numbers ($900 and $6, respectively) or as something costing that much (an LG ultra-capacity washing machine or a four-pack of Duracell AA batteries, respectively). Table 1 specifies the eight conditions.

Since selective accessibility is, at its core, a semantic theory about knowledge made accessible during the comparative task, the crucial elements should be the act of comparison and the pertinence of the selectively activated information to the target judgment. Therefore, the selective accessibility account (Mussweiler & Strack, 1999b, 2000; Strack & Mussweiler, 1997) suggests that anchoring effects should occur when the camera is compared to a high or low anchor, regardless of how that anchor is represented. By contrast, the scale distortion theory of anchoring implies that the presence of a numeric standard would be crucial and that the identity of the comparative stimulus would be less important. In short, we predict anchoring in places where the selective accessibility account would not expect it and predict no anchoring in circumstances that seem sufficient for its occurrence given a reasonable reading of that theory.

Table 1
Mean estimates (with standard errors in parentheses) of the price of the camera in Study 1.

Comparative judgment                                     Target judgment    Mean estimate (SE)
Does a [ ] cost more or less than...   ...[ ]
Camera                                 $6                        Price of camera    $164 (16.0)
Camera                                 $900                      Price of camera    $376 (47.0)
Camera                                 a pack of AA batteries    Price of camera    $158 (17.3)
Camera                                 a washing machine         Price of camera    $189 (26.4)
GPS device                             $6                        Price of camera    $165 (21.8)
GPS device                             $900                      Price of camera    $247 (18.6)
GPS device                             a pack of AA batteries    Price of camera    $202 (34.9)
GPS device                             a washing machine         Price of camera    $158 (17.1)
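The Study 1 design and the two competing predictions can be made concrete with a short sketch. The Python below is only an illustration, not the authors' materials: the names STIMULI, MAGNITUDES, REPRESENTATIONS, conditions, and predicts_anchoring are hypothetical, and each theory's prediction is reduced to a single yes/no rule paraphrased from the Method description.

```python
from itertools import product

# Factor levels of the 2 x 2 x 2 between-subjects design (labels are illustrative).
STIMULI = ("camera", "GPS device")        # stimulus named in the comparative question
MAGNITUDES = ("low", "high")              # whether the standard is cheap or expensive
REPRESENTATIONS = ("number", "product")   # standard given as a price or as a product

# The standard shown for each magnitude/representation pair, as described in the text.
STANDARDS = {
    ("low", "number"): "$6",
    ("high", "number"): "$900",
    ("low", "product"): "a pack of AA batteries",
    ("high", "product"): "a washing machine",
}


def conditions():
    """Enumerate the eight cells of the design."""
    return [
        {
            "stimulus": stimulus,
            "magnitude": magnitude,
            "representation": representation,
            "standard": STANDARDS[(magnitude, representation)],
        }
        for stimulus, representation, magnitude in product(
            STIMULI, REPRESENTATIONS, MAGNITUDES
        )
    ]


def predicts_anchoring(theory, cell):
    """Simplified reading of each account (an assumption of this sketch)."""
    if theory == "scale_distortion":
        # A numeric standard on the shared scale is what matters.
        return cell["representation"] == "number"
    if theory == "selective_accessibility":
        # The comparison must concern the target stimulus itself.
        return cell["stimulus"] == "camera"
    raise ValueError(f"unknown theory: {theory}")


if __name__ == "__main__":
    for cell in conditions():
        print(
            cell["stimulus"],
            cell["standard"],
            predicts_anchoring("scale_distortion", cell),
            predicts_anchoring("selective_accessibility", cell),
        )
```

The two rules disagree in exactly four cells (GPS device with a numeric standard, camera with a product standard), which is where the design can separate the accounts.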

Results

The results are summarized in Table 1. As predicted, anchoring occurred when the comparative judgment involved a number, but not otherwise. An ANOVA revealed main effects for anchor magnitude (F(1, 135) = 12.2, p < .001, η² = .08) and anchor representation (F(1, 135) = 9.3, p < .01, η² = .06), and the predicted interaction between them (F(1, 135) = 14.7, p < .001, η² = .1). When the anchor was represented as a number, we found significant anchoring effects, whether the comparative stimulus matched the target stimulus (t(33) = 3.8, p < .001, d = 1.3), or not (t(33) = 2.9, p < .01, d = 1.0). However, if the comparative judgment lacked a numeric value, we found no significant anchoring effects (t’s < 1.1, p’s > .25).

Anchoring effects were larger when the comparative question involved the camera than when it involved the GPS device, as shown by a significant interaction effect between anchor stimulus and value (F(1, 135) = 6.5, p < .05, η² = .05). As will be discussed later, this effect is consistent with selective accessibility, potentially consistent with scale distortion, and consistent with other accounts of anchoring, including conversational norms (Frederick, Mochon, & Danilowitz, 2013; Grice, 1975). When the comparative question involves the camera, it more strongly suggests that the anchor should be used as information for the absolute estimate of the camera. None of the other effects in the ANOVA were statistically significant.

Discussion

At least within the standard paradigm (in which a comparative judgment precedes an absolute judgment), our results suggest different boundary conditions than those emphasized by advocates of selective accessibility. The comparative judgment appears important only to the extent that it provides a numeric value against which the continuous response can be compared. The exact contents of the information evoked when answering the comparative question seem less important, as anchoring occurs even when the comparative judgment involves a different stimulus than the target judgment (see also Frederick & Mochon, 2012).⁴

The finding that anchoring occurs even when the comparative judgment pertains to a different object than the one being assessed in the continuous judgment may seem surprising in light of a much cited result that Strack and Mussweiler (1997) advance in support of selective accessibility. Specifically, they predicted and found that a comparative question involving the width of the Brandenburg Gate would not act as an anchor for estimates of its height, purportedly because information activated about its width is not sufficiently applicable to affect estimates of its height. Such results imply anchoring over a much narrower range of situations than our results seem to imply.

There may, however, be no mystery to solve, as there are grounds to question the reliability of this result, which was based on the responses of just 8 participants per cell. Wong and Kwong (2000) found that a comparative question involving the height of the Buddha at the Po Lin Monastery did significantly affect subsequent estimates of its width. Similarly, using a 2 × 2 between-subjects design, we assigned 117 participants from eLab (an online survey site hosted by the Yale School of Management) to one of four conditions in which we crossed anchor magnitude (low vs. high) and anchor dimension (same vs. different). Participants judged whether the Eiffel Tower was more or less than [30/3000] feet [wide/tall] before estimating its height in feet. A between-subjects ANOVA yielded a significant main effect for anchor magnitude (F(1, 113) = 30.3, p < .001, η² = .2), but no significant effect for anchor dimension (F(1, 113) = .7, p = .4, η² = .006), nor an interaction (F(1, 113) = .1, p = .7, η² = .001). Height judgments were significantly influenced by anchor magnitude, whether the comparative question involved the tower’s width (t(58) = 3.5, p < .001, d = .9) or its height (t(55) = 4.8, p < .001, d = 1.3; see Table 2).

Table 2
Mean estimates (with standard errors in parentheses) for the height of the Eiffel Tower.

               Same dimension comparison (height)    Different dimension comparison (width)
Low anchor     541 (132.3)                           702 (198.6)
High anchor    2216 (335.8)                          2603 (501.9)

⁴ This paradigm is admittedly rather unusual, since there is no obvious reason for the simultaneous variation in the type of response required (binary vs. continuous) and the target being judged. The sequential judgments paradigm examined in the remaining studies is much more natural, since only the target varies.

Study 2

Study 1 demonstrates that a comparative question involving one stimulus can act as an anchor for an absolute judgment of another stimulus.
Building on this finding, we examine whether the comparative question is necessary at all, by testing whether anchoring occurs for sequential absolute judgments of different stimuli – a situation in which preceding judgments are available as potential numeric standards, but where no comparison is requested or even strongly suggested. Frederick and Mochon (2012) provided some initial evidence that anchoring can occur in such situations. In the current study, we replicate this finding across various stimuli and test for possible effects of stimuli further back in the judgmental sequence. The effect of multiple anchors is rarely examined in the anchoring literature, but arises naturally in this paradigm.

Table 3
Mean estimates (with standard errors in parentheses) for target and anchor questions in Study 2.

Judgmental domain / anchor condition             Small anchor    Medium anchor    Target judgment

Manufacturing cost of products (dollars)         Clock radio     Fax machine      Television
  None                                           -               -                $613ᵃ (56.2)
  Small                                          $8ᵃ (0.8)       -                $458ᵇ (73.6)
  Medium                                         -               $53ᵃ (8.4)       $372ᵇ (36.6)
  Both                                           $7ᵃ (1.0)       $31ᵇ (3.3)       $328ᶜ (44.8)

Weight of animals (pounds)                       Raccoon         Wolf             Giraffe
  None                                           -               -                916ᵃ (129.8)
  Small                                          21ᵃ (1.8)       -                739ᵇ (66.5)
  Medium                                         -               86ᵃ (5.1)        803ᵇ (106.1)
  Both                                           19ᵃ (1.4)       64ᵇ (3.1)        657ᵇ (61.3)

Weight of bowling-ball sized spheres (pounds)    Rubber          Glass            Lead
  None                                           -               -                150ᵃ (16.7)
  Small                                          15ᵃ (1.2)       -                109ᵇ (12.3)
  Medium                                         -               30ᵃ (4.0)        93ᵇ (12.7)
  Both                                           15ᵃ (2.4)       26ᵃ (3.5)        84ᵇ (10.8)

Population of countries (millions)               Finland         Italy            Japan
  None                                           -               -                127ᵃ (15.4)
  Small                                          21ᵃ (4.5)       -                117ᵃ,ᵇ (14.2)
  Medium                                         -               38ᵃ (5.8)        128ᵃ,ᵇ (16.7)
  Both                                           11ᵃ (1.7)       32ᵃ (6.0)        91ᵇ (12.2)

Means with different superscripts are significantly different (based on pair-wise Mann–Whitney tests).

Method

We recruited 553 participants from Yale’s eLab. All respondents made four “target” judgments: (1) the manufacturing cost of a fifty inch flat-screen LCD television (in dollars), (2) the weight of an average adult giraffe (in pounds), (3) the weight of a bowling ball sized sphere of lead (in pounds), and (4) the population of Japan (in millions). Each judgment appeared on a different page, and the order was randomized. Depending on condition, each target judgment was presented by itself, with no preceding judgment (None), or preceded by the judgment of a Small stimulus, a Medium stimulus, or Both.⁵ For each participant, we randomized which of the four conditions was assigned to which of the four domains. Each target judgment was made under different conditions, with the conditions assigned to domains counterbalanced across participants. Table 3 lists the stimuli we used.

⁵ For expositional ease, we refer to one set of anchors as ‘small’ and another as ‘medium,’ though the values of both are generally more similar to each other than to the target judgment, and, thus, there is no reason to suspect that they would have substantially different effects (see Chapman & Johnson, 1994).

Results

Since each participant made one target judgment in each of the four experimental conditions, our formal analysis involved a within-subjects ANOVA, with anchor condition as the lone factor. After performing a log transformation on the raw responses to reduce skewness, we compared z-scores across conditions and found a significant main effect of anchor condition (F(3, 550) = 16.7, p < .001, η² = .08). Post hoc pairwise tests revealed that target judgments were significantly lower when preceded by judgments of the small or medium objects (p’s < .001) and that making two such judgments resulted in lower estimates than in any of the other three conditions (p’s < .001). The small and medium anchor conditions did not differ (p > .5).

Discussion

These results expand the scope of anchoring in two ways. First, we show that anchoring occurs in sequential judgments without any explicit request for a comparison to a provided standard – e.g., merely asking people to estimate the cost of a clock radio significantly lowered a subsequent estimate of the cost of an LCD television. Moreover, this occurred even though the preceding numeric referents were not seen as relevant standards. Although the sequential judgment paradigm entails the production of responses that could be used as inputs to subsequent judgments, we did not anticipate that participants would regard these as highly relevant benchmarks, a condition generally regarded as necessary to initiate the adjustment process in most anchoring and adjustment accounts (Epley & Gilovich, 2001, 2004, 2005, 2006).

To confirm this, we conducted a brief survey in which respondents judged the relevance of various standards, on a scale ranging from 0 (totally irrelevant) to 10 (extremely relevant). Nine were the “self-generated anchors” that Epley and Gilovich (2001) proposed respondents would spontaneously summon when judging a related quantity. The remaining four were the actual values for the “Small” stimuli from Study 2. Twenty-seven participants were recruited from Yale’s eLab to complete this survey. The order of the thirteen judgments was randomized between-subjects, each appearing on a different page. As expected, the stimuli in Study 2 were judged to be much less relevant than the “self-generated anchors” from Epley and Gilovich’s paradigm. The average of the four relevance judgments for the stimuli in this study (M = 2.8, SE = .38) was significantly lower than for the nine stimuli used by Epley and Gilovich (M = 5.3, SE = .44; t(25) = 12.1, p < .001). For example, the weight of a raccoon was deemed to have almost no relevance for judging the weight of a giraffe, whereas the year that the U.S. declared independence was deemed highly relevant for judging the year in which Washington was elected president (1.8 vs. 7.9; t(26) = 12.5; p < .001).

A second important implication of these results is that two anchors can be more effective than one: judgments in the Both condition were significantly lower than those in the Small or Medium conditions. This “multiple-anchor” result is novel in the anchoring literature and of some theoretical importance, as it further opposes notions of anchoring and adjustment, in which respondents stop at the boundary of the range of plausible values (Quattrone et al., 1981) or effortfully adjust from some standard (Epley & Gilovich, 2001, 2004, 2005, 2006). Neither of these theories readily accounts for why two anchors work better than either individually – e.g., why estimates of the cost of an LCD television are more affected by preceding estimates of the cost of both a clock radio and a fax machine than by either of these questions alone.

Selective accessibility also does not fully account for these results. Strack and Mussweiler (1997, p. 438) suggest that for anchors falling outside the plausible range, respondents test for the possibility that the target possesses the nearest plausible value, and that information recruited during this search affects the final value. It is unclear how or why two extreme anchors would fulfill this function differently. Such a finding appears consistent with the scale distortion account of anchoring (Frederick & Mochon, 2012), as contrast effects are often larger in contexts with multiple standards (Massaro & Anderson, 1971).

Table 4
Mean estimates (with standard errors in parentheses) for target and anchor questions in Study 3A.

Condition         Judgment 1                                      Judgment 2                                        Target (projector)
Control           -                                               -                                                 $529.6 (72.8)
Numeric primes    Calories in strawberry (calories): 28.4 (7.6)   Weight of air conditioner (pounds): 88.1 (13.8)   $582.5 (103.0)
Same dimension    Cost of mouse: $26.0 (1.8)                      Cost of toaster oven: $80.7 (9.3)                 $311.5 (39.3)

Study 3A

The results of the first two studies weigh against selective accessibility but are consistent with numeric priming or “basic” anchoring, which predict that sequential judgments would assimilate towards any prior numeric response, regardless of the stimulus or scale to which the number pertains (Adaval & Monroe, 2002; Critcher & Gilovich, 2008; Mussweiler & Englich, 2005; Wilson et al., 1996; Wong & Kwong, 2000). We next test whether the production of small numbers alone is sufficient to generate anchoring effects.

Method

In exchange for a chance to win Amazon gift certificates, 121 respondents from Yale’s eLab estimated the price, at Best Buy, of a Sony LCD home-theater projector. Participants were randomly assigned to one of three conditions. In the Control condition, respondents made only that judgment. In the Same Dimension condition, they first estimated the price of a Microsoft wireless optical mouse and an Oster toaster oven. In the Numeric Primes condition, they first estimated the number of calories in a strawberry and the weight in pounds of a window air conditioner – two unrelated judgments we suspected would yield similar numeric estimates (though in different units). If the previous findings were driven by numeric priming, we would expect similar results in the Same Dimension and the Numeric Primes conditions.

Results and discussion

The mean estimates for each condition are shown in Table 4. A between-subjects ANOVA on log(price estimate) yielded a significant effect of anchor condition (F(2, 118) = 5.0, p < .01, η² = .08). Replicating Study 2, the estimated price of a projector was significantly lower when preceded by judgments of the price of a computer mouse and toaster oven (p < .01). However, prior estimates of the number of calories in a strawberry and the weight in pounds of a window air conditioner had no effect (p > .8), even though these yielded similar numeric values, as intended. Thus, these results suggest that the mere presence or production of a number in the preceding judgment is generally not sufficient to cause anchoring.
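For readers who wish to check the reported effect size, the short sketch below recovers η² from the F statistic and its degrees of freedom. It is only an illustration: the function name is hypothetical, the identity is exact for a one-way design (it yields partial η² in factorial designs), and because the input F is itself rounded, the result is approximate.

```python
def eta_squared_from_f(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover eta squared from an F statistic and its degrees of freedom.

    Uses eta^2 = (F * df_effect) / (F * df_effect + df_error).
    Hypothetical helper for illustration; not part of the original analysis.
    """
    scaled_f = f_value * df_effect
    return scaled_f / (scaled_f + df_error)


# Study 3A omnibus test as reported in the text: F(2, 118) = 5.0
eta_sq = eta_squared_from_f(5.0, 2, 118)
print(f"eta squared = {eta_sq:.2f}")  # prints "eta squared = 0.08"
```

The recovered value agrees with the reported η² = .08.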

Study 3B

Study 3A suggests that anchoring does not generalize to any preceding numeric response. We probe the boundary conditions for the effect further here, by holding constant the preceding stimulus and manipulating only the dimensions on which that stimulus is being judged. This design also allows us to test for the possibility of conceptual priming (Oppenheimer et al., 2007), since the target judgment is not only preceded by the production of a smaller number, but also by the consideration of a smaller object.

Method

We recruited 251 respondents from Amazon Mturk to participate in an online survey.⁶ All participants first made a judgment about a jug of windshield fluid and then about a heavy duty punching bag. Both stimuli were depicted with photos as well as verbal labels. We chose these two stimuli because they weigh (in pounds) roughly what they cost (in dollars). The dimension on which each stimulus was judged (weight in pounds or cost in dollars) was manipulated between subjects to yield four conditions. Numeric priming and conceptual priming (Oppenheimer et al., 2007) predict similar estimates in all four cells, since all participants are exposed to the same stimulus and a small number prior to the target judgment. By contrast, scale distortion and selective accessibility predict that the punching bag should be judged as lighter only when the preceding judgment pertained to weight and cheaper only when the preceding judgment pertained to price.

⁶ Two extreme outliers were removed because their target judgment exceeded the criteria described by Tukey (1977).

Results and discussion

As predicted, the punching bag was judged to be significantly cheaper after estimating the cost of a jug of windshield fluid than after estimating the jug’s weight (M = $123.4, SE = 10.1 vs. M = $168.7, SE = 12.0; t(120) = 2.9, p < .01, d = .5) and significantly lighter after estimating the jug’s weight than after estimating the jug’s cost (M = 88.8, SE = 8.8 vs. M = 116.5, SE = 9.6; t(125) = 2.1, p < .05, d = .4). Correspondingly, a 2 × 2 between-subjects ANOVA revealed the predicted interaction between anchor and target dimension (F(1, 245) = 12.7, p < .001, η² = .05; see Fig. 1). The ANOVA also showed a significant main effect for target dimension (F(1, 245) = 18.0, p < .001, η² = .07), with the punching bag being judged as a bit more expensive than heavy, and no significant main effect for anchor dimension (F(1, 245) = .73, p = .39, η² = .003). Neither of these main effects is of any theoretical importance, so we will not discuss them further.

These results (and those of Study 3A) dovetail with other results that challenge the robustness of anchoring effects based on numeric or conceptual priming (Brewer & Chapman, 2002; Mussweiler & Strack, 2001) and with prior research suggesting the importance of correspondence between the potential anchor and target response scale (e.g. Brewer et al., 2007; Chapman & Johnson, 1994; Frederick & Mochon, 2012; Kahneman & Knetsch, 1993; Markovsky, 1988). Numbers in the environment appear to affect subsequent judgments only to the extent that the two share a scale – a result that is compatible with both scale distortion (Frederick & Mochon, 2012) and selective accessibility. However, selective accessibility fares less well at predicting why conceptual relevance plays such a small role. If the width of an object is deemed insufficiently relevant to affect judgments of its height (Strack & Mussweiler, 1997), the weight of a jug of windshield washer fluid should have limited relevance for judging the weight of a punching bag. Yet we find strong anchoring effects in such contexts.

Fig. 1. Mean estimates (with standard error bars) for the target judgment in Study 3B.
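The pairwise statistics reported for Study 3B can be approximately reconstructed from the published means and standard errors. The sketch below is a rough consistency check rather than the original analysis: it assumes an unpooled standard error for the difference and two groups of equal size when converting t to Cohen’s d, neither of which is stated in the text, and the function names are hypothetical.

```python
import math


def t_from_summary(mean_a: float, se_a: float, mean_b: float, se_b: float) -> float:
    """t statistic for a difference in means, using an unpooled standard error."""
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


def d_from_t(t_value: float, df: int) -> float:
    """Approximate Cohen's d from t and df, assuming two equal-sized groups."""
    return 2 * t_value / math.sqrt(df)


# Cost of the punching bag: after a weight judgment vs. after a cost judgment
t_cost = t_from_summary(168.7, 12.0, 123.4, 10.1)
d_cost = d_from_t(t_cost, 120)

# Weight of the punching bag: after a cost judgment vs. after a weight judgment
t_weight = t_from_summary(116.5, 9.6, 88.8, 8.8)
d_weight = d_from_t(t_weight, 125)

print(f"cost:   t = {t_cost:.1f}, d = {d_cost:.1f}")      # cost:   t = 2.9, d = 0.5
print(f"weight: t = {t_weight:.1f}, d = {d_weight:.1f}")  # weight: t = 2.1, d = 0.4
```

Under these assumptions the recomputed values match the reported t(120) = 2.9, d = .5 and t(125) = 2.1, d = .4 to the precision given.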

Study 4A

The first three studies show that the scale distortion theory (Frederick & Mochon, 2012) provides a parsimonious account of anchoring effects across several paradigms. Anchoring effects appear to require that the anchor and target are evaluated on the same response scale, but do not require that the judgments pertain to the same object or that the preceding stimulus is judged as particularly relevant to the target stimulus. We next test whether conceptual relevance plays any role in anchoring effects. Though the theory of scale distortion de-emphasizes conceptual relations between stimuli, even this theory requires that comparisons occur (since without a comparison, there is no contrast effect among scale values). Thus, some modest degree of similarity may be required for the emergence of anchoring effects. In the hopes of better understanding anchoring and refining the theory of scale distortion, Studies 4A and 4B test whether anchoring effects attenuate with sufficient conceptual distance between sequential stimuli.

The role of conceptual relevance is viewed differently by the two theories of anchoring we have focused upon most. Advocates of selective accessibility propose that conceptual relevance matters either because it affects the nature of the information recruited or because it affects the relevance of that information for the target judgment. On this view, conceptual discrepancies between successive stimuli either attenuate anchoring (to the extent that the activated information is not pertinent to the focal judgment) or reverse anchoring (to the extent that the discrepancy between the stimuli activates a search for aspects of dissimilarity, as proposed by Mussweiler, 2003). By contrast, we propose that conceptual relevance may affect whether scale values are compared, but nothing else. These distinctions yield two predictions we test next: (1) Sufficient conceptual distance between sequentially judged stimuli may eliminate anchoring effects but will not induce contrast effects (Studies 4A and 4B), and (2) Forcing participants to compare conceptually distant stimuli will resurrect anchoring effects (Study 4C).

Method

We recruited 121 respondents from Amazon Mturk to participate in an online survey. All participants rendered two estimates. The target judgment was always the number of calories in a medium order of McDonald’s French fries. Using a 2 × 2 between-subjects design, we manipulated the preceding judgment in terms of both the magnitude and relevance of the generated response. The two low anchors were either the number of calories in a Hershey’s chocolate kiss (low and relevant) or the daily caloric requirements of a goldfish (low and irrelevant). The two high anchor conditions were the number of calories in a Domino’s large cheese pizza (high and relevant) or the daily caloric requirements of a horse (high and irrelevant).⁷

Results and discussion

As seen in Fig. 2, calorie estimates about the McDonald’s fries were affected by preceding judgments about the caloric content of other foods (t(57) = 3.9, p < .001, d = 1.0), but not by preceding judgments about the caloric requirements of other animals (t(58) = .1, p = .9, d = .03).⁸ Since both of these estimates involve numeric judgments rendered on the same scale, they show that the presence of numeric standards on the same scale is not a sufficient condition for anchoring effects; a small amount of conceptual relevance is apparently also necessary. If this (low) hurdle is not met, the numeric referents appear to be ignored and lead to neither assimilation nor contrast effects.

Fig. 2. Mean estimates (with standard error bars) for the target judgment in Study 4A.

⁷ The median estimates for the low and high anchors in the relevant condition were 50 and 2176 respectively, while the corresponding estimates in the irrelevant condition were 25 and 4000.

⁸ We excluded two extreme responses from the analyses (with estimates for the target judgment greater than 3 SDs from the mean). A 2 × 2 between-subjects ANOVA confirmed the predicted interaction effect. This analysis revealed a significant effect of anchor magnitude (F(1, 115) = 9.4, p < .01, η² = .08), no main effect of standard relevance (F(1, 115) = .02, p = .88, η² = .00), and a significant interaction between these two (F(1, 115) = 8.5, p < .01, η² = .07).

Study 4B

We replicate the results from Study 4A here, by holding the response scale constant while manipulating the degree of similarity between the anchor and target. We again predict that sufficient dissimilarity will attenuate anchoring effects, but not induce contrast effects.


Table 5. Mean estimates (with standard errors in parentheses) for target and anchor questions in Study 4B.

Condition | Judgment 1 (weight of ...) | Judgment 2 (weight of ...) | Target (giraffe)
Control | – | – | 1065 (85.2)
Objects | Tricycle: 18 (1.3) | Air conditioner: 72 (6.9) | 1170 (122.4)
Birds | Turkey: 31 (2.4) | Penguin: 66 (6.1) | 757 (64.0)
Mammals | Raccoon: 23 (1.7) | Wolf: 77 (4.2) | 698 (62.3)

Method

We recruited 488 respondents from Yale’s eLab to participate in an online survey in exchange for a chance to win Amazon gift certificates. All respondents estimated the weight of an average adult giraffe. In the control condition, that was the only judgment. In three other conditions, respondents first estimated the weight of two smaller Mammals (a raccoon and a wolf), two smaller Birds (a wild turkey and an emperor penguin) or two smaller Objects (a tricycle and a window air conditioner). These pairs of preceding stimuli were selected to yield similar weight estimates, but to differ in terms of their conceptual similarity to a giraffe.

Results

As Table 5 reveals, relative to the control group, giraffe estimates were significantly lower in the mammals condition and the birds condition, but not in the objects condition. An ANOVA conducted on the log of the judgments confirms a significant effect of anchor (F(3, 484) = 10.0, p < .001, η² = .06). Post hoc pairwise comparisons confirm that the birds and mammals conditions were each significantly lower than the control and objects conditions (p’s < .01), which did not differ from each other (p = .5). The birds and mammals conditions also did not significantly differ (p = .2).
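To convey the size of these shifts, the sketch below expresses each condition’s mean giraffe estimate from Table 5 as a percentage change from the control mean. This is descriptive arithmetic on the raw means only; the significance tests above were run on log-transformed judgments, so these percentages are not the quantities that were tested. The variable and function names are illustrative.

```python
# Mean giraffe-weight estimates taken from Table 5
control_mean = 1065.0
condition_means = {"Objects": 1170.0, "Birds": 757.0, "Mammals": 698.0}


def percent_change(mean: float, baseline: float) -> float:
    """Percentage change of a condition mean relative to the baseline mean."""
    return 100.0 * (mean - baseline) / baseline


for name, mean in condition_means.items():
    print(f"{name}: {percent_change(mean, control_mean):+.1f}% vs. control")
# Objects: +9.9% vs. control
# Birds: -28.9% vs. control
# Mammals: -34.5% vs. control
```

On this descriptive measure, prior judgments about birds and mammals lowered the mean giraffe estimate by roughly 29% and 34%, whereas the objects condition was, if anything, slightly higher than control.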

Discussion

Our finding that estimates of a giraffe’s weight are markedly affected by preceding judgments about the weights of other mammals and birds again suggests that only modest conceptual relevance is required for anchoring to occur. However, the fact that a tricycle and air conditioner had no effect suggests that some degree of similarity is necessary. These results help refine the theory of scale distortion (Frederick & Mochon, 2012), by showing that some minimum hurdle of conceptual relevance must be met for the scale values to be compared. Exploring the nature of relations between stimuli that induce anchoring effects in various paradigms is an important future research topic. For example, would estimates of a giraffe’s weight be influenced by a preceding estimate of the weight of a statue of a raccoon? Would the anchoring effect be reduced by first noting that the animals occupy different continents? Would judgments about a giraffe’s weight be anchored less by a prior judgment about the average weight of a human than by a prior judgment about the weight of a non-human animal of similar size? Though such questions are beyond the scope of this paper, our results suggest that anchoring effects would be amplified by manipulations emphasizing relations between stimuli. For example, judgments about a pen may have greater effects on subsequent judgments about fax machines if both products are listed as “office supplies.” Similarly, judgments about a battery and a fax machine may interact more if both are listed as examples of “electronics.”

Study 4C

The previous two studies demonstrate that anchoring effects disappear when the conceptual distance between sequential stimuli becomes sufficiently great. We propose that conceptual relevance moderates anchoring by affecting whether respondents make a comparison between two numbers, rather than by affecting the applicability of the information being activated or the type of comparison being made (Mussweiler, 2003; Mussweiler & Strack, 1999b, 2000; Strack & Mussweiler, 1997). Correspondingly, we predict that for stimuli that are too dissimilar to be spontaneously compared, anchoring will re-emerge if a comparison is experimentally forced.

Method

Drawing 656 participants from Amazon Mturk, we had all participants estimate the weight of an adult giraffe, manipulating only the preceding judgments. In the Control condition, nothing preceded the giraffe judgment. In a Raccoon condition, participants first estimated the weight of a raccoon. In the Tricycle condition, participants first estimated the weight of a tricycle. In the Tricycle Comparison condition, participants first estimated the weight of a tricycle and then indicated whether a giraffe weighs more or less than a tricycle (before giving their best estimate of its exact weight). From the results of the prior two studies, we predicted that the tricycle judgment would not affect the giraffe judgment unless respondents were required to compare them – an experimental contrivance intended to recreate the spontaneous comparisons we imagine occur when the preceding stimulus is sufficiently relevant.⁹

⁹ Note that the tricycle comparison condition is very close to the standard anchoring paradigm, except that the numeric standard is self-generated, and only implicitly contained within the comparative question.

Results and discussion

An ANOVA conducted on the log of the judgments confirms a significant effect of anchor (F(3, 652) = 4.8, p < .01, η² = .02), and as shown in Table 6, our specific predictions were supported. As in the previous study, sufficient conceptual distance eliminated anchoring. However, anchoring effects re-emerged if participants were forced to compare the giraffe and tricycle. Post hoc pairwise comparisons confirm that the raccoon and tricycle comparison conditions were nearly identical (p = .98) and were each significantly lower than either the control or the tricycle conditions (p’s < .01), which did not differ from each other (p = .92). These results suggest that conceptual relevance moderates anchoring by affecting the likelihood that a comparison is made, rather than through the type of information that is made accessible by the comparison.

Table 6. Mean estimates (with standard errors in parentheses) for target and anchor questions in Study 4C.

Condition | Judgment 1 (weight of ...) | Judgment 2 (giraffe heavier than ...) | Target (giraffe)
Control | – | – | 1206 (86.2)
Raccoon | Raccoon: 42 (7.8) | – | 925 (60.0)
Tricycle | Tricycle: 30 (4.6) | – | 1293 (106.0)
Tricycle comparison | Tricycle: 41 (7.0) | Tricycle: 99% | 1035 (88.0)

General discussion

In this paper, we explored the boundaries of anchoring. We proposed that the emergence or disappearance of anchoring across various paradigms is most compatible with the scale distortion theory of anchoring (Frederick & Mochon, 2012), and challenges, in one or more ways, most prevailing theories, including: numeric priming, selective accessibility, and anchoring and adjustment. In Study 1, we tweaked the standard paradigm only slightly; we showed that the comparative judgment, per se, does not cause anchoring unless accompanied by a numeric standard, and that the numeric standard does cause anchoring, even when it pertains to a different stimulus than the one for which the absolute judgment is being sought. In Studies 2 and 3, we showed that numbers typically act as anchors, even in the absence of any explicit comparison, as long as they share a response scale with the target judgment. As in the first study, we observe strong anchoring effects in situations in which the relation between the target and presumptive anchor is much weaker than a selective accessibility account would seem to require, and we do not observe anchoring effects in situations in which numeric priming would predict they should occur. In the final three studies, we showed that if the comparative stimulus is sufficiently unrelated to the target, no anchoring occurs, even when both judgments are made on the same scale. However, we also showed that anchoring effects re-emerge for highly dissimilar stimuli if respondents are forced to compare them. This suggests that the effect of similarity operates by moderating whether the numeric value associated with the anchor is referenced, rather than through the type of information respondents recruit. Moreover, although much theorizing around selective accessibility would seem to suggest that contrast effects will result from highly dissimilar stimuli (Mussweiler, 2003), we have not yet observed an instance of this. Our results both constrain and expand the boundaries of anchoring.
They constrain them by showing that most numbers do not typically act as anchors; at a minimum, anchoring appears to require that the standard and the target are assessed on the same scale.¹⁰ Nonetheless, our results expand the boundaries of anchoring in three ways: (1) We show that comparative judgments are not an essential experimental feature, as large anchoring effects occur in situations in which no comparative judgment is explicitly requested or even strongly suggested; (2) we show that conceptual relevance plays a much smaller role in anchoring effects than previously thought; and (3) we show that anchoring effects may be influenced even by stimuli and judgments further back in the judgmental sequence. These results have implications for when one would expect to observe anchoring effects, as well as for why these effects occur.

¹⁰ For example, estimates of the number of rushing yards attained by last year’s NFL leader will not be affected by whether last year is instead expressed as the number 2012, and a CEO’s bid for a large company is not likely to be influenced by the area code of the number of the person he dials to negotiate.

Theoretical implications

The varied theories of anchoring that have been brought forth to explain the welter of experimental results compete with one another because the explanatory sufficiency of each theory reduces the value of alternate theories. But these alternatives are not logically contradictory; the psychological processes being posited are not exclusive. Multiple processes could co-occur, and their relative importance may differ markedly across contexts. For example, though our results cast further doubt on numeric priming as a general account of anchoring, such effects may occur if mental resources are experimentally constrained (Reitsma-van Rooijen & Daamen, 2006; Wegener et al., 2010); selective accessibility might play a larger role in judgments when there is a large pool of background knowledge which might be differentially activated (e.g. Englich & Mussweiler, 2001); and anchoring and adjustment may operate when the standard is seen as relevant and the direction for the adjustment is clear (Epley & Gilovich, 2001; Simmons et al., 2010). Such conditions do not apply to the studies reported here, however, and, thus, we believe that the scale distortion theory of anchoring (Frederick & Mochon, 2012) is the most tenable explanation for the range of effects (and non-effects) we observe.

Our results also help refine that theory: Though the presence of a numeric standard on a shared scale does, indeed, appear to be a necessary condition for anchoring, it is not a sufficient condition. For example, a judgment about the weight of a giraffe is affected by a preceding judgment about a raccoon, or wolf, or turkey, or penguin, but not an air conditioner or a tricycle. Similarly, estimates of the number of calories in an order of French fries are affected by preceding judgments about the caloric content of other food items, but not by preceding judgments of the caloric requirements of goldfish or horses. Thus, some small amount of similarity is required for anchoring to occur. Nonetheless, similarity plays a smaller role in moderating the effect than what selective accessibility would imply: Even judgments about a jug of windshield washer fluid exert strong influences on judgments about punching bags. Moreover, even when this low hurdle of conceptual similarity is not met, merely enforcing the comparison is sufficient for anchoring effects to re-emerge.

Generated vs. provided standard

In most of the studies we present, the putative numeric anchor was generated by participants themselves, via their preceding judgment(s).
The scale distortion theory places no special emphasis on this aspect of the experimental design, and we would expect similar (or even larger) anchoring effects if the anchor were externally provided. Consider a recent paper (Attari, DeKay, Davidson, & Bruine de Bruin, 2010) in which participants estimated the energy consumption of eight household appliances (e.g. laptop computer, central AC) after being told that a 100-W incandescent electric light bulb uses 100 units of energy in 1 h. Though no comparison was explicitly requested, the provided standard was clearly intended to be used as a referent, and presumably functioned in the same way that respondents’ prior estimates did in our studies. Scale distortion would clearly apply here as well, and, indeed, Frederick, Meyer, and Mochon (2011) later demonstrated that when a comparatively high standard was instead provided (a 9000-W electric furnace), underestimates became overestimates. Though the self-generation of standards has no special significance to the scale distortion theory, it is worth noting that one’s guess about the value of a different stimulus that happens to precede the target judgment seems much less likely to carry a suggestion of relevance than an externally provided standard. Thus, though the scale distortion theory is potentially applicable to any situation in which a standard shares a response scale with the target judgment, conversational norms and inference likely account for part of the effect in many of these contexts, including many of the results in the classic anchoring paradigm, where respondents could reasonably interpret the experimentally provided standard as conveying information about the expected response (Grice, 1975). 
Note also that experimental attempts to inhibit this inference – say, by having the last two digits of one’s social security number be the source of the numeric standard (Chapman & Johnson, 1999) – remain odd within the classic paradigm, because respondents might wonder why they have been asked to provide both a comparative and absolute judgment when the latter implies the former. This redundancy is highlighted by reversing the order of these two judgments: Experimenter: “What is the weight of the world record pumpkin?” Participant: “Um, 250 pounds?” Experimenter: “Is it heavier or lighter than 100 pounds?” Participant: (quizzically), “Heavier.” Even when the binary judgment precedes the continuous judgment, it is conversationally redundant to ask both unless the binary judgment is seen as an attempt to convey clues about the magnitude of the subsequent continuous judgment. Otherwise, respondents must either repudiate the inference the two-part task compels or search for possible reasons why an ostensibly random number might somehow be informative. By contrast, there is nothing odd about asking for sequential absolute judgments of two different stimuli (e.g., How much do you think your cat weighs? ____ How much do you think your horse weighs? ____), and conversational norms likely play a smaller role within this paradigm.

Conclusion

In this paper, we explored the boundaries of anchoring beyond the standard paradigm and proposed a framework for when to expect such effects. We suggest that future research should not only continue to examine the antecedents and boundary conditions for anchoring effects, but also their proper interpretation. For example, in one study not yet mentioned, 303 participants estimated the number of calories in a pint of Ben & Jerry’s chocolate ice cream. Half made only that judgment and half first estimated the number of calories in a carrot. Those who first estimated the caloric content of the carrot gave markedly lower estimates of the calories in the ice cream (M = 680, SE = 41.9) than those who did not have this anchor (M = 1100, SE = 99.0; t(301) = 3.69, p < .001). It is important to know whether ice cream is actually perceived to be less fattening following judgments about carrots or whether respondents are, temporarily, using the calorie scale differently.
Similarly, in the aforementioned study involving estimates of energy consumption, it is important to know whether the standard is actually affecting people’s beliefs about the efficiency of various appliances – which might translate into differing adoption rates of energy-efficient technologies – or whether the effect is more superficial. As noted by Lynch, Chakravarti, and Mitra (1991), the implications of contextual effects depend on whether they are pure scaling effects or deeper representational ones (see also Campbell, Lewis, & Hunt, 1958; Stevens, 1958; Upshaw, 1978, 1984). The scale distortion theory of anchoring (Frederick & Mochon, 2012) suggests the former, though the evidence is not yet decisive. On the same note, if survey results are not properly understood, researchers will draw the wrong conclusions, since responses on objective scales (watts, calories, pounds) do not typically elicit any further scrutiny of what the responses “mean” – the set of inferences that are permissible when comparing numeric responses drawn from different contexts. Few papers have examined anchoring in this important context (though see Adaval & Monroe, 2002; Chernev, 2011; Payne, Schkade, Desvousges, & Aultman, 2000), but we presume that continued research on this topic will ultimately permit the development of a sort of calibration factor capable of reversing conclusions drawn from analyses which take numeric responses at face value. For example, a complete understanding of anchoring effects may permit the conclusion that a judgment of 45% likelihood in a low-anchor context actually suggests a greater degree of belief than a judgment of 60% in a high anchor context. We are 80% sure.

Acknowledgments

We thank Daylian Cain, Zoë Chance, Paul Cohen, Ravi Dhar, Nick Epley, Sean Hundtofte, Daniel Kahneman, Andrew Meyer, Nathan Novemsky, Daniel Oppenheimer, Sebastian Park, Daniel Read, and Joe Simmons for comments on prior drafts. A special thanks to Brett Boshco, who helped design and collect some preliminary research whose findings stimulated the research we present here.

References

Adaval, R., & Monroe, K. B. (2002). Automatic construction and use of contextual information for product and price evaluations. Journal of Consumer Research, 28(March), 572–588.
Attari, S. Z., DeKay, M. L., Davidson, C. I., & Bruine de Bruin, W. (2010). Public perceptions of energy consumption and savings. Proceedings of the National Academy of Sciences of the United States of America, 107(37), 16054–16059.
Brewer, N. T., & Chapman, G. B. (2002). The fragile basic anchoring effect. Journal of Behavioral Decision Making, 15, 66–77.
Brewer, N. T., Chapman, G. B., Schwartz, J. A., & Bergus, G. R. (2007). The influence of irrelevant anchors on the judgments and choices of doctors and patients. Medical Decision Making, 27, 203–211.
Campbell, D. T., Lewis, N. A., & Hunt, W. A. (1958). Context effects with judgmental language that is absolute, extensive, and extra-experimentally anchored. Journal of Experimental Psychology, 55(3), 220–228.
Chapman, G. B., & Johnson, E. J. (1994). The limits of anchoring. Journal of Behavioral Decision Making, 7(4), 223–242.
Chapman, G. B., & Johnson, E. J. (1999). Anchoring, activation, and the construction of values. Organizational Behavior and Human Decision Processes, 79(2), 115–153.
Chapman, G. B., & Johnson, E. J. (2002). Incorporating the irrelevant: Anchors in judgments of belief and value. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and biases: The psychology of intuitive judgment (pp. 120–138). New York, NY, US: Cambridge University Press.
Chernev, A. (2011). Anchoring in sequential evaluations of vices and virtues. Journal of Consumer Research, 37(5), 761–774.
Critcher, C. R., & Gilovich, T. (2008). Incidental environmental anchors. Journal of Behavioral Decision Making, 21, 241–251.
Englich, B., & Mussweiler, T. (2001). Sentencing under uncertainty: Anchoring effects in the courtroom. Journal of Applied Social Psychology, 31(7), 1535–1551.
Epley, N. (2004). A tale of tuned decks? Anchoring as accessibility and anchoring as adjustment. In The Blackwell handbook of judgment and decision making (pp. 240–257).
Epley, N., & Gilovich, T. (2001). Putting adjustment back in the anchoring and adjustment heuristic: Differential processing of self-generated and experimenter-provided anchors. Psychological Science, 12(5), 391–396.
Epley, N., & Gilovich, T. (2004). Are adjustments insufficient? Personality and Social Psychology Bulletin, 30(4), 447–460.
Epley, N., & Gilovich, T. (2005). When effortful thinking influences judgmental anchoring: Differential effects of forewarning and incentives on self-generated and externally provided anchors. Journal of Behavioral Decision Making, 18, 199–212.
Epley, N., & Gilovich, T. (2006). The anchoring-and-adjustment heuristic: Why the adjustments are insufficient. Psychological Science, 17(4), 311–318.
Frederick, S., Meyer, A. B., & Mochon, D. (2011). Characterizing perceptions of energy consumption. Proceedings of the National Academy of Sciences of the United States of America, 108(8), E23.
Frederick, S., & Mochon, D. (2012). A scale distortion theory of anchoring. Journal of Experimental Psychology: General, 141(1), 124–133.
Frederick, S., Mochon, D., & Danilowitz, J. (2013). Anchoring as inference. Yale University working paper.
Grice, H. P. (1975). Logic and conversation. In P. Cole & J. L. Morgan (Eds.), Syntax and semantics 3: Speech acts. New York: Academic Press.
Higgins, E. T. (1996). Knowledge activation: Accessibility, applicability, and salience. In E. T. Higgins & A. W. Kruglanski (Eds.), Social psychology: Handbook of basic principles (pp. 133–168). New York: Guilford Press.
Higgins, E. T., & Brendl, C. M. (1995). Accessibility and applicability: Some “activation rules” influencing judgment. Journal of Experimental Social Psychology, 31(3), 218–243.
Higgins, E. T., Rholes, W. S., & Jones, C. R. (1977). Category accessibility and impression formation. Journal of Experimental Social Psychology, 13(2), 141–154.
Jacowitz, K. E., & Kahneman, D. (1995). Measures of anchoring in estimation tasks. Personality and Social Psychology Bulletin, 21(11), 1161–1166.
Kahneman, D., & Knetsch, J. L. (1993). Anchoring or shallow inference: The effect of format. Unpublished manuscript.
Klayman, J., & Ha, Y.-W. (1987). Confirmation, disconfirmation, and information in hypothesis testing. Psychological Review, 94(2), 211–228.
Lynch, J. G., Chakravarti, D., & Mitra, A. (1991). Contrast effects in consumer judgments: Changes in mental representation or in the anchoring of rating scales. Journal of Consumer Research, 18(December), 284–297.
Markovsky, B. (1988). Anchoring justice. Social Psychology Quarterly, 51(3), 213–224.
Massaro, D. W., & Anderson, N. H. (1971). Judgmental model of the Ebbinghaus illusion. Journal of Experimental Psychology, 89(1), 147–151.
Mussweiler, T. (2003). Comparison processes in social judgment: Mechanisms and consequences. Psychological Review, 110(3), 472–489.
Mussweiler, T., & Englich, B. (2005). Subliminal anchoring: Judgmental consequences and underlying mechanisms. Organizational Behavior and Human Decision Processes, 98(2), 133–143.
Mussweiler, T., & Strack, F. (1999a). Comparing is believing: A selective accessibility model of judgmental anchoring. In W. Stroebe & M. Hewstone (Eds.), European review of social psychology (pp. 135–167). Chichester, UK: Wiley.
Mussweiler, T., & Strack, F. (1999b). Hypothesis-consistent testing and semantic priming in the anchoring paradigm: A selective accessibility model. Journal of Experimental Social Psychology, 35(2), 136–164.
Mussweiler, T., & Strack, F. (2000). The use of category and exemplar knowledge in the solution of anchoring tasks. Journal of Personality and Social Psychology, 78(6), 1038–1052.
Mussweiler, T., & Strack, F. (2001). The semantics of anchoring. Organizational Behavior and Human Decision Processes, 86(2), 234–255.
Oppenheimer, D. M., LeBoeuf, R. A., & Brewer, N. T. (2007). Anchors aweigh: A demonstration of cross-modality anchoring and magnitude priming. Cognition, 106(1), 13–26.
Payne, J. W., Schkade, D. A., Desvousges, W. H., & Aultman, C. (2000). Valuation of multiple environmental programs. Journal of Risk and Uncertainty, 21(1), 95–115.
Quattrone, G. A., Lawrence, C. P., Finkel, S. E., & Andrus, D. C. (1981). Explorations in anchoring: The effects of prior range, anchor extremity, and suggestive hints. Unpublished manuscript. Stanford University.
Reitsma-van Rooijen, M., & Daamen, D. D. L. (2006). Subliminal anchoring: The effects of subliminally presented numbers on probability estimates. Journal of Experimental Social Psychology, 42, 380–387.
Simmons, J. P., LeBoeuf, R. A., & Nelson, L. D. (2010). The effect of accuracy motivation on anchoring and adjustment: Do people adjust from provided anchors? Journal of Personality and Social Psychology, 99(December), 917–932.
Stevens, S. S. (1958). Adaptation-level vs. the relativity of judgment. The American Journal of Psychology, 71(4), 633–646.
Strack, F., & Mussweiler, T. (1997). Explaining the enigmatic anchoring effect: Mechanisms of selective accessibility. Journal of Personality and Social Psychology, 73(3), 437–446.
Tukey, J. W. (1977). Exploratory data analysis. Reading, MA: Addison-Wesley.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124–1131.
Upshaw, H. S. (1978). Social influence on attitudes and on anchoring of congeneric attitude scales. Journal of Experimental Social Psychology, 14, 327–339.
Upshaw, H. S. (1984). Output processes in judgment. In R. S. Wyer & T. K. Srull (Eds.), Handbook of social cognition (pp. 237–256). Hillsdale, NJ: Lawrence Erlbaum Associates.
Wason, P. C. (1960). On the failure to eliminate hypotheses in a conceptual task. Quarterly Journal of Experimental Psychology, 12, 129–140.
Wegener, D. T., Petty, R. E., Blankenship, K. L., & Detweiler-Bedell, B. T. (2010). Elaboration and numerical anchoring: Implications of attitude theories for consumer judgment and decision making. Journal of Consumer Psychology, 20, 5–17.
Wilson, T. D., Houston, C. E., Etling, K. M., & Brekke, N. (1996). A new look at anchoring effects: Basic anchoring and its antecedents. Journal of Experimental Psychology: General, 125(4), 387–402.
Wong, K. F. E., & Kwong, J. Y. Y. (2000). Is 7300 m equal to 7.3 km? Same semantics but different anchoring effects. Organizational Behavior and Human Decision Processes, 82(2), 314–333.