Journal of Economic Behavior & Organization 73 (2010) 21–23

Discussion

Laboratory experiments: Challenges and promise
A review of “Theory and Experiment: What are the Questions?” by Vernon Smith
Gary Charness
Department of Economics, University of California at Santa Barbara, Santa Barbara, United States

Article history: Received 12 November 2008; Accepted 12 November 2008

In “Theory and Experiment: What are the Questions?” Vernon Smith presents a series of issues and questions that pertain to the practice of laboratory experimentation. These issues include how we frame decision problems to participants, how we make inferences from our data, and how our own perspectives color our interpretation of the evidence. He points out that there are often “auxiliary hypotheses on which we rely in order to extract meaning from tests of formal hypotheses.” The goal in this exercise is to initiate a discourse in which the experimental community takes stock of our methodology and can better formulate how to avoid some of the pitfalls and traps that dot the landscape. In this short article, I briefly discuss some of Vernon Smith’s many important points, and I also point out some of the advantages of standard laboratory experiments.

In the article, Vernon mentions “common assumptions that experimentalists make in order to construct and implement tasks.” Assumption 1 is that people will use backward induction to analyze their situation. While I feel that people can and do use backward induction at least to some degree (for example, we typically do see unraveling for a period or two before the end of an experiment in which unraveling should in principle go all the way back to the first period), I certainly agree that there are limits to the cognitive resources that people (choose to) bring to experimental tasks. The notion of bounded rationality is both a challenge for experimentalists and an important research issue in its own right.

Theory often presumes that people are able to readily perform complex calculations, making subtle inferences in the process. While experimenters are likely to be a bit more realistic about the behavior and ability of our human participants, it is altogether too easy to slip into the habit of thinking that these participants reason like trained and sophisticated economists. In fact, we must be careful to interpret behavior without making this assumption. People make mistakes in decision environments that seem transparent to economists, at least in part because they think about the decision problems in different ways.

Consider, for example, the Acquire-a-Company problem, which is a game-theoretic analog of the famous “lemons” problem studied by Akerlof (1970) and was first described in Samuelson and Bazerman (1985). One can make an offer to buy a firm with a value (unknown to the buyer, but known to the firm) that is an integer in the interval [0,100], with each integer value equally likely. The firm is worth 50% more in the buyer’s hands, and ownership is transferred if the bid is at least as large as the current value. The typical range of bids is between 50 and 60, even though simple reasoning leads to the conclusion that the optimal bid is zero.1

E-mail address: [email protected].
1 Suppose one bids x; in this case, values above x are irrelevant. Thus, the support of the relevant distribution is [0, x], with an expected value of x/2. Since 150% of x/2 is 3x/4, one loses x/4 on average; thus, the bid that minimizes one’s expected loss is zero. In post-experimental briefings, there is widespread and immediate agreement among participants that zero is the best bid once this chain of logic is laid out.
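To make the logic in footnote 1 concrete, the expected payoff from any bid can be checked numerically. The sketch below is not part of the original article; it assumes Python with NumPy, an illustrative sample size, and an arbitrary random seed, and it simply restates the problem as described above: the firm's value is an integer drawn uniformly from 0 to 100, the firm is worth 50% more to the buyer, and the offer is accepted whenever the bid is at least the current value.

```python
import numpy as np

rng = np.random.default_rng(0)
# Firm value: an integer drawn uniformly from {0, 1, ..., 100}, known only to the seller.
values = rng.integers(0, 101, size=1_000_000)

def expected_profit(bid: int, values: np.ndarray) -> float:
    """Estimate the buyer's expected profit from offering `bid`."""
    accepted = values <= bid          # the seller accepts whenever the bid covers the value
    gain = 1.5 * values - bid         # the firm is worth 50% more in the buyer's hands
    return float(np.where(accepted, gain, 0.0).mean())

for bid in (0, 10, 25, 50, 60, 100):
    print(f"bid = {bid:3d}: expected profit = {expected_profit(bid, values):7.2f}")
```

Consistent with the footnote's argument, the estimated expected profit is negative for every positive bid (roughly -x^2/400 for a bid of x, since acquisition occurs with probability about x/100 and the conditional loss is x/4), so the loss-minimizing bid is indeed zero.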



Charness and Levin (2009) find that this phenomenon is robust to quite a number of simplifications. The problem seems to be that people have difficulty in reasoning about contingent events. In Vernon Smith’s language, “subjects need to experience the future to take it into account.” An even simpler example is the famous Linda problem (Tversky and Kahneman, 1983), where many people think it is more probable that “Linda is a bank teller and is active in the feminist movement” than “Linda is a bank teller.” Economists see this problem as p(A ∩ B) ≤ p(A) and have difficulty understanding how people get this wrong. Yet, as Vernon Smith states: “Subjects do not apply reason to the tasks . . . the same way that we do in our constructivist models.” People bring their own decision heuristics to the task at hand; understanding these heuristics and seeking methods to ameliorate cognitive limitations are important issues in economics.

It is also true that people are highly susceptible to framing, as they bring to the laboratory attitudes and beliefs from their own lives and try to apply these to unfamiliar environments. Participants often look for clues regarding what they “should” do in an experiment, particularly in stark environments such as the Dictator game. Minor changes in framing can induce substantial changes in behavior. Liberman et al. (2004) find very different behavior in a Prisoner’s Dilemma game, depending on whether the game is labeled “The Wall St. Game” or “The Community Game.” Harking back to the Linda problem, Charness et al. (forthcoming) find that the omission of the word “single” in the description of Linda leads to a marked and significant decrease in the proportion of people who fall prey to the conjunction fallacy. Framing can also affect different sub-populations differently; Cooper et al. (1999) find that providing a work-environment context (as opposed to an abstract frame) to Chinese managers aided their comprehension in a ratchet-effect game, but had no effect on Chinese students. One must be very careful in this aspect of experimental design, as unintended consequences are all too common.

Another important point raised in the article is summarized in Assumption 2, which is the presumption that play in a one-stage game is made without reference to one’s personal history or tendency to try to build reputations. Vernon states: “the abstract concept of single play invokes conditions sufficiently remote from much human experience that it may be operationally difficult to penetrate.” People do bring their own experiences to the laboratory, and they try to make sense of a task in terms of their own cultural norms, which typically prescribe behavior in a richer (and more open-ended) environment than the lab. Vernon points out that just the mention of a possible future is enough to significantly affect behavior. Participants have been known to mention “karma” when asked why they chose to sacrifice money to help someone anonymously. This point also relates to Assumption 4 in the article, which states that people always choose the larger amount of money for themselves regardless of the circumstances. It is clear that this assumption is frequently violated, at least in part because of the shadow of a future that seems to us to be irrelevant to the task at hand.

1. Many ladders to the same roof

Vernon Smith points out many of the challenges facing experimenters today. There is no doubt that these are very serious issues that must be addressed.
But in this section I will point out some of the major advantages of laboratory experiments with standard subject pools. It is incumbent upon the experimental community to embrace these advantages and to continue to use experiments for the many reasons pointed out in Roth (1988) and elsewhere.

In recent years, it has become somewhat fashionable to attack laboratory experiments on the grounds that they do not generalize well to the economic environments of interest: the laboratory itself may be a poor match for the field, the students who typically serve as participants may be a poor match for the people making these decisions in the field, and people in laboratory experiments know that they are in an experiment and are being observed. Vernon Smith also points out that using “other people’s money” may lead to unrealistic behavior. All of these points are well taken, of course. But one should be careful not to “throw the baby out with the bath water.” There are inherent limitations to laboratory experiments, much as there are for all research approaches. Yet these various approaches should be seen as complementary rather than being in competition.2

As noted by many observers, one obvious advantage of controlled laboratory experiments is that one may systematically vary one aspect of the environment while keeping all other aspects constant. While one may not be able to realistically extend the levels of behavior observed in the lab to the field environment, laboratory experiments provide an unparalleled opportunity to investigate treatment effects cleanly.

However, it seems that another important feature has been neglected in the recent discussion of the merits and disadvantages of various research techniques: replicability. In recent years, there have been many veiled and not-so-veiled accusations concerning the truthfulness of data or the self-serving (and even incorrect) ways in which data have been processed. These problems pertain to field experiments, to empirical analyses using proprietary data sets, and to laboratory experiments using non-standard subject pools. To the best of my knowledge, there have been no such accusations leveled against laboratory experiments with standard subject pools. Perhaps this is not entirely a coincidence. It is typically quite difficult to replicate field experiments, as these may involve specialized groups or circumstances and are also often rather expensive; much the same issue applies to laboratory experiments using specialized groups of participants. A somewhat different issue is that one cannot know how a researcher has selected data from a proprietary data set.

2 A Sufi saying is “There are many ladders to the same roof.”


While economists like to consider themselves academics and seekers of truth first and rational, selfish economic actors second, there is nevertheless an inherent moral-hazard problem in handling data, and the stakes are high when such work is being published in the top journals.

Laboratory experiments offer a broad-based platform that can be used by many, many researchers. If a result seems anomalous, it is possible to try to test it using the instructions provided along with the result. While the journals provide little incentive to perform replications, one must nevertheless often provide a baseline for research that extends previous work; if this baseline does not match the previous work, there may be grounds for some suspicion. Thus, it might well be misguided to engage in data fraud in the lab, since so many people can attempt to replicate one’s results (this was true when the community was small, and it is even more true today as experiments become more popular). The “open-source” nature of standard laboratory experiments is clearly beneficial in this regard.

Second, research innovations can happen quickly, and dialogues in the literature are often particularly fruitful. Since people can use publicly available experimental instructions and make changes to particular variables or design elements to identify their effect, we can quickly get a more complete understanding of the participants’ underlying behavior. Furthermore, this can generate lively debate and further our knowledge of the underlying phenomenon in question.3 There are also incentives for developing effective, clever experimental paradigms, since others are likely to adopt them as well; this encourages thoughtful experimental design. When conducted with the proper scientific method, laboratory experiments are particularly useful; for an earlier comment along these lines, see Roth (1994).

I believe that it behooves the experimental profession to provide more incentives for replication and more incentives for methodological research. In the past, there were relatively few practitioners, so that being an experimenter was a bit like being a member of a private club. Perhaps the relatively small social distance served to facilitate a high degree of integrity. While this may plausibly be attenuating with the increasing popularity of experimental methods in economics, I nevertheless hold firm to my belief that experimental researchers do not create their data out of whole cloth. I propose that replications be encouraged and published by good journals; one approach is to treat them as “technical notes,” as is done in many of the journals in the physical sciences. A second point is that negative results are often quite valuable, but usually are much more difficult to publish than are dramatic successes. One wonders how many non-results have been obtained but not reported; it may be the case that dozens of researchers have spent time and energy to learn something that is already known, but only privately.

Vernon Smith has identified a number of important concerns in how we as experimenters interpret the behavior of participants in our experiments. I believe that all of these points are well taken. At the same time, it is clear that traditional laboratory experiments can be quite valuable, despite their shortcomings. If we wish to make economics more of a science, experimental methods should be embraced and replication rewarded, thereby serving to limit the moral-hazard problem.
Traditional laboratory experiments hold the promise of providing useful data that are unlikely to be tainted by opportunistic behavior on the part of the researchers. This point should be emphasized in the current debate about methodology. Once again, there are many ladders to the same roof.

Acknowledgement

I would like to acknowledge useful comments from Vincent Crawford, Martin Dufwenberg, and Judd Kessler. Nevertheless, all claims (and errors) made are my own responsibility.

References

Akerlof, G., 1970. The market for “lemons”: quality uncertainty and the market mechanism. Quarterly Journal of Economics 84, 488–500.
Charness, G., Dufwenberg, M., 2006. Promises and partnership. Econometrica 74, 1579–1601.
Charness, G., Karni, E., Levin, D., forthcoming. On the conjunction fallacy in probability judgment: new experimental evidence. Games and Economic Behavior.
Charness, G., Levin, D., 2009. The origin of the winner’s curse: a laboratory study. American Economic Journal: Microeconomics 1, 207–236.
Cooper, D., Kagel, J., Lo, W., Gu, Q., 1999. Gaming against managers in incentive systems: experimental results with Chinese students and Chinese managers. American Economic Review 89, 781–804.
Ellingsen, T., Johannesson, M., Torsvik, G., Tjotta, S., 2008. Testing guilt aversion. Working paper, Stockholm School of Economics.
Liberman, V., Samuels, S., Ross, L., 2004. The name of the game: predictive power of reputations versus situational labels in determining prisoner’s dilemma game moves. Personality and Social Psychology Bulletin 30, 1175–1185.
Roth, A., 1988. Laboratory experimentation in economics: a methodological overview. Economic Journal 98, 974–1031.
Roth, A., 1994. Lets keep the con out of experimental econ.: a methodological note. Empirical Economics 19, 279–289.
Samuelson, W., Bazerman, M., 1985. The winner’s curse in bilateral negotiations. Research in Experimental Economics 3, 105–137.
Tversky, A., Kahneman, D., 1983. Extensional versus intuitive reasoning: the conjunction fallacy in probability judgment. Psychological Review 90, 293–315.
Vanberg, C., 2008. Why do people keep their promises? An experimental test of two explanations. Econometrica 76, 1467–1480.

3 A case in point involves Ellingsen et al. (2008) and Vanberg (2008), two papers written in response to Charness and Dufwenberg (2006). Since the experimental instructions are complete and are public, it is relatively easy to extend tests to slightly different designs.