Imperfect information processing in sequential bargaining games with present biased preferences

Journal of Economic Psychology 30 (2009) 642–650


Imperfect information processing in sequential bargaining games with present biased preferences Zafer Akin * Department of Economics, TOBB University of Economics and Technology, Sogutozu, Ankara 06560, Turkey

Article info

Article history: Received 14 August 2008 Received in revised form 28 April 2009 Accepted 25 May 2009 Available online 6 June 2009

JEL classification: C78 D83

Abstract

This paper studies an alternating-offers bargaining game between a time-consistent player and a time-inconsistent player who processes information on future self-preferences imperfectly. Time-inconsistency and information processing are modeled by using cognitive and mood state approaches, respectively. This model structure allows for the learning of the partially naive time-inconsistent agent. The results characterize the relationship among the level of naiveté, the level of learning probability and the equilibrium. We find critical values of the model parameters that specify whether the agreement is delayed and characterize the probabilistic nature of the agreement. In addition, comparative static results are reported with respect to time preferences. © 2009 Elsevier B.V. All rights reserved.

PsycINFO classification: 2340; 2343

Keywords: Quasi-hyperbolic discounting; Imperfect information processing; Sequential bargaining

1. Introduction

Can some parties in strategic environments exploit opponents who have behavioral biases? How are equilibrium outcomes affected depending on whether people learn over time about the biases they possess? Does one bias exacerbate or alleviate the good or bad effects of another? These important questions are not addressed sufficiently in the economics literature, although there is ongoing research on each of them. The aim of this paper is to partially answer these comprehensive questions in a strategic environment where agents who potentially have present biased preferences and process information imperfectly play a sequential bargaining game.

People inherently have self-serving biases, which constitute a subset of behavioral biases, in contrast to the perfect rationality assumption of standard economic theory. Self-serving biases are defined as holding beliefs, or having directed preferences, that favor one's own payoff or satisfaction; they are mainly driven by beliefs that may concern payoffs, other players' types, available strategies and the environment. One of these biases is present biased preferences. It is frequently observed in experimental studies that valuation drops sharply in the short run and declines at a faster rate in the short run than in the long run. In other words, people have a preference for immediate gratification that probably will not be appreciated by their future selves. These observations are properly captured by the quasi-hyperbolic discounting model.1
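To illustrate the kind of behavior the quasi-hyperbolic (β-δ) model captures, the following sketch (our own, with hypothetical parameter values, not taken from the paper) shows the short-run preference reversal between a smaller-sooner and a larger-later reward:

```python
# Illustrative sketch: quasi-hyperbolic (beta-delta) discount weights and a
# preference reversal. Parameter values are hypothetical.

def qh_weight(t, beta=0.7, delta=0.95):
    """Quasi-hyperbolic discount weight: 1 at t = 0, beta * delta**t afterwards."""
    return 1.0 if t == 0 else beta * delta ** t

# Today (t = 0) the agent compares 100 now against 110 tomorrow:
now_small   = 100 * qh_weight(0)    # 100.0
later_large = 110 * qh_weight(1)    # 110 * 0.7 * 0.95 = 73.15
prefers_small_now = now_small > later_large   # True: immediate gratification

# Viewed ten periods in advance (t = 10 vs t = 11), the same trade-off flips,
# because from afar both dates carry the beta factor and only delta matters:
small_at_10 = 100 * qh_weight(10)
large_at_11 = 110 * qh_weight(11)
prefers_large_later = large_at_11 > small_at_10   # True: 110 * 0.95 > 100
```

The reversal (preferring the small reward only when it is immediate) is exactly the planned-versus-actual-action discrepancy discussed below.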

* Tel.: +90 5326677960. E-mail address: [email protected]
1 For an extensive discussion on hyperbolic discounting, see Frederick, O'Donoghue, and Loewenstein (2002).
0167-4870/$ - see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.joep.2009.05.005


This type of behavior can be seen as a lack of self-control and leads to time-inconsistency.2 There is a growing literature incorporating this type of preferences in a wide range of contexts (Laibson, 1997; O'Donoghue & Rabin, 1999a). In strategic environments, the implications of the existence of players endowed with present biased preferences are not trivial. Several papers show the possibility of exploitation of these agents by other parties (DellaVigna & Malmendier, 2004, 2006; Eliaz & Spiegler, 2006; Gilpatric, 2008; Hafalir, 2008; Heidhues & Kőszegi, 2008; Sarafidis, 2005). Note that agents who are aware of their own biases can act rationally, although the very nature of these biases is to be unaware of them in most cases. The documented possibility of exploitation of these agents underscores the importance of the question of whether people are aware of this bias and whether they are able to learn it. The existing literature takes the types (based on awareness level) of present biased agents as given and does not examine the case of dynamic preferences. Only a few papers mention learning and its effects on equilibrium behavior (O'Donoghue & Rabin, 2001; DellaVigna & Malmendier, 2006). Learning of hyperbolic discounters, to the best of our knowledge, is examined in a noncooperative game context only by Akin (2007). He analyzes naive learning in a bargaining framework and characterizes the conditions for bargaining delay caused by naive learning, where the naive agent systematically learns her own type during the game. Contrary to the existing literature showing the detrimental nature of naiveté, he shows that the more naive the player is, the higher the share she receives under some informational assumptions. We, in this paper, allow people to be naive about self-preferences and incorporate a learning framework.
In order to do this, we model partial naiveté in such a way that it allows learning in the sense of increasing awareness. Specifically, the agent can cognitively be in two different states based on perceptions about future preferences: a sophisticated state and a naive state, which are mutually exclusive. This implies that whenever a signal about the states is processed, the true state is revealed. The level of naiveté is determined by the probability assigned to the naive state. There are some papers modeling partial naiveté whose applications and focuses are different. In O'Donoghue and Rabin (2001), partially naive agents are characterized by a parameter that represents an underestimation of the actual level of present bias; the higher this parameter, the higher the degree of naiveté. Loewenstein, O'Donoghue, and Rabin (2003) generalize the notion in O'Donoghue and Rabin (2001) by using state dependent utility functions. In their model, predicted future utility lies in between the utility given the current state and the true future utility. Hence, the agent knows that her own preferences will change but she systematically underestimates the magnitude. Similarly, Eliaz and Spiegler (2006) do not particularly focus on present biased preferences but employ a very general formulation of time-inconsistency. In their model, the agent has a belief about current and future utility functions: the agent is uncertain whether or not her own preferences will change, but she knows exactly which future utility functions she may have. Heidhues and Kőszegi (2009) employ possibly the most general method, in which the conjecture on the future preference parameter is continuously distributed with a density. They show that this approach explains the interesting case where becoming more sophisticated can decrease welfare in a context of consumption of a harmful product.
Finally, in Asheim (2008), the partially naive agent believes that the current preferences will persist at each future period with a fixed probability p ∈ (0, 1), and the model's implications are similar to the ones in Heidhues and Kőszegi (2009). Our modeling approach is similar to the multi-period3 version of Eliaz and Spiegler's model in the sense that whether or not the agent's preferences will change is uncertain, but she knows exactly what kind of preferences she may have. A specific case (binary beliefs on future preferences) in Heidhues and Kőszegi (2009) is also similar to our modeling.

In addition to modeling partial naiveté as a cognitive phenomenon, we add another behavioral dimension: a psychological characteristic common to most people. Specifically, the agent has a psychological state (mood state) during which she potentially processes information imperfectly.4 Our mood state approach can be interpreted as follows: the agent may or may not take into account relevant information in her decision making process. That is, she disregards the information (treats it as irrelevant) with a fixed probability (this phenomenon is called "confirmation bias"; for more details, see Wason, 1960). Specifically, the information is processed if it supports previously held beliefs, but it may be omitted if it does not support these beliefs. Rabin and Schrag (1999) present a general model that analyzes the implications of confirmatory bias. In their model, there are two states of the world and the agent initially views them as equally likely. The agent receives (i.i.d.) signals correlated with the true state. With positive probability, the agent reads a signal conflicting with the hypothesis that he currently believes more likely as supporting evidence for that hypothesis. He is also unaware of this biased interpretation characteristic.

Similarly, in our model, the agent has two cognitive states and processes signals in a similar manner. However, our model involves strategic interaction, and the signals are sent by the player who has private information through the actions taken during the course of the game. Moreover, confirmation bias is modeled a little differently

2 Agents with time-inconsistent preferences, quasi-hyperbolic discounting and self-control problems are used interchangeably in the paper.
3 Since these models are in general two-period models, they are not interested in whether the agent realizes her own time-inconsistency. Rather, they analyze the implication of having these preferences causing the agents' planned and actual actions to differ. What we are interested in is (a discrete type of) learning of these agents about self-preferences. In addition, our model is similar to Rubinstein's model (1985) in the sense that only one period of delay reveals the true type of the agent if the information is perfectly processed.
4 Imperfect information processing can be explained mainly by self-serving biases. However, here, we will approach the imperfect information processing phenomenon from a more affective, visceral and noncognitive perspective (Raghunathan & Pham, 1999; Sanbonmatsu & Kardes, 1988). For a more formal discussion of information processing imperfections as the source of bounded rationality, see Lipman (1995). For a similar modeling of limited information processing, see Compte and Postlewaite (2008). For other sources of this phenomenon, see Antonides (1996, chap. 7).


and the structure of our model necessitates that the signals always point to one hypothesis. Other than these, the models are similar, and the current model can be treated as a variant of the model in Rabin and Schrag (1999) in a strategic environment.

Given the mentioned behavioral assumptions, we analyze an infinite horizon alternating-offers bargaining game (Rubinstein, 1982) played between a partially naive discounter, the agent, and an exponential discounter, the principal. The agent has prior beliefs regarding the state of the world. This prior is cognitively updated with the information flowing during the course of the game but filtered depending on the mood state.5 There are several reasons why we focus on the alternating-offers bargaining game as the strategic environment. Firstly, it is a sequential (not a one shot) and multi-period (potentially, infinite horizon) game, which allows the modeler to observe the effect of present biased preferences explicitly. Secondly, it is a two person game where different types of hyperbolic agents can interact (not a one person game played by different selves of one player). Thirdly, it is a dynamic game in the sense that, during the course of the game, information is generated about the underlying uncertainty if there is any. Hence, it allows learning. Last, but not least, it has a wide range of real life applications and there is considerable related literature.

Given the preferences and mentioned behavioral characteristics, we characterize the equilibrium of this game (when and with what shares the game ends). We show that there is a critical level of naiveté below which the agent behaves as if she is sophisticated and above which she behaves as if she is naive. We then show that if the agent is not sophisticated enough, there exists a critical value for the learning probability below which the game ends immediately. Moreover, above this threshold level, the agreement may be delayed depending on the parameter values. If so, the agreement is probabilistic and the average delay time is calculated accordingly. Comparative statics imply that if both players get more patient (a higher δ), then delay is more likely. More interestingly, the exponential agent makes a higher offer for himself more frequently against opponents with less severe self-control problems (higher β) when β is small enough. On the other hand, if β is relatively high (the agent's self-control problem is mild), a higher β implies a more frequent offer of a small share for himself (by "more frequent", we mean "for a wider range of learning probabilities").

The remainder of the paper is organized as follows. Section 2 introduces the model in detail. Section 3 characterizes the equilibrium behavior and presents some comparative statics. Section 4 concludes with a discussion of the presented model's implications and limitations and some extensions.

2. Model

Let T = {0, 1, 2, ...} be the infinite set of possible agreement times. Let i ≠ j ∈ {1, 2} and t, s ∈ T represent players and dates, respectively. Let u = (u₁, u₂) ∈ U be a utility pair and U be the set of feasible utility pairs, where U = {u ∈ [0, 1]² | u₁ + u₂ ≤ 1}. Initially, i offers a utility pair at t = 0. If j accepts the offer, the game ends. If j rejects, then at t = 1, j offers a utility pair. The game continues, with the players switching roles, as long as there is no agreement. Each player gets zero if they never agree.

There are mainly two types of players: exponential players (EA) and hyperbolic players. The hyperbolic players are further categorized into three types: naive (NHA), partially naive (PNHA) and sophisticated hyperbolic players (SHA). The EA is time-consistent and has the discount factor sequence {1, δ, δ², δ³, ...}, where δ is the standard time-consistent impatience. All hyperbolic agents currently have the discount factor sequence {1, βδ, βδ², βδ³, ...}, where β captures the agent's preference for immediate gratification, or degree of self-control problem (for the EA, β = 1). The categorization of the hyperbolic agents is mainly based on their beliefs about future preferences. Note that, regardless of the type, they will all employ the current discount factor sequence {1, βδ, βδ², βδ³, ...} at each future period. Note also that all players have the same δ and all the ones who have self-control problems have the same β. Given these, the type who assigns probability one to using {1, βδ, βδ², βδ³, ...} at each future period is called sophisticated. On the other hand, the type who assigns probability zero to using this discount factor sequence, and who believes that she will use the discount factor sequence {1, δ, δ², δ³, ...} at each future period with probability one, is called naive.6 Contrary to these extreme cases, the type who assigns a probability 0 < λ < 1 to using {1, βδ, βδ², βδ³, ...} at each future period is called partially naive. The higher the agent's λ, the higher the degree of sophistication. Therefore, for the SHA, λ = 1 and for the NHA, λ = 0. Fig. 1 summarizes the model.

There are two possible states of the world.7 One is the sophisticated state, in which the agent has the beliefs of a sophisticated agent, and the other is the naive state, in which she has the beliefs of a naive agent.8 The agent's prior beliefs about the two states are λ and 1 − λ, respectively. The actual state of the world is the sophisticated state, as opposed to her current beliefs, 0 < λ < 1. Given the structure of the model, any delay in the game turns out to be a signal about the states of the world. We allow belief updating based on the signals received during the course of the game in a boundedly rational sense.

5 It is generally considered that agents' decisions are based purely on "cold" cognitive processes. However, emotions interact with the cognitive system. For dual self/system models, see Fudenberg and Levine (2006); Metcalfe and Mischel (1999).
6 We refer to all hyperbolic types as "she" and to the exponential type as "he" for convenience.
7 States of the world are basically the hypotheses that the agent deems possible. We model the state space such that it has just two mutually exclusive elements, which is easy to understand and more tractable. However, this can be generalized into overlapping multiple states or even a continuum of states (that allow gradual Bayesian learning).
8 The naive state can be called the hot state because the agent is totally optimistic about her future preferences in this state. The sophisticated state can be called the cool state since the agent is totally aware of her future preferences and behaves accordingly. The agent's beliefs are on the discount factors, and the states are named based on where the agent's beliefs stand relative to reality. If they overlap, it is the sophisticated state; if they differ, it is the naive state.
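The discount structures above can be sketched as follows (our own illustration; symbols follow the paper, but the function names and parameter values are ours):

```python
# Sketch of the model's discount structures. The EA discounts with
# {1, d, d^2, ...}; every hyperbolic agent actually discounts with
# {1, b*d, b*d^2, ...} at each period, regardless of her beliefs.

def exponential_seq(delta, n):
    """First n exponential discount weights: delta**t."""
    return [delta ** t for t in range(n)]

def quasi_hyperbolic_seq(beta, delta, n):
    """First n quasi-hyperbolic weights: 1, then beta * delta**t."""
    return [1.0] + [beta * delta ** t for t in range(1, n)]

delta, beta = 0.9, 0.6   # illustrative values
ea_weights  = exponential_seq(delta, 4)             # 1, d, d^2, d^3
hyp_weights = quasi_hyperbolic_seq(beta, delta, 4)  # 1, b*d, b*d^2, b*d^3

# The hyperbolic types differ only in beliefs about which sequence their
# future selves will use: the SHA assigns probability lam = 1 to the
# hyperbolic sequence, the NHA assigns lam = 0, the PNHA some 0 < lam < 1.
```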


Fig. 1. Dynamics of the model.

The signals received during the game are processed by the PNHA based on the mood state. She disregards the information (treats it as irrelevant) with a fixed probability ρ. With the complement probability, the agent processes the signal.9 By adding this dimension, we embed different approaches into the same model. If ρ = 1, this refers to the no learning case; if ρ = 0, this refers to perfect Bayesian learning; and if 0 < ρ < 1, this refers to imperfect information processing and incorporates a bounded rationality aspect into the agent's behavior. The last case implies a self-serving assessment in bargaining since, as long as the game continues, the information flowing signals the agent's disadvantageousness. Note that the parameter ρ reflects a behavioral characteristic, is exogenously given and does not change over time. The EA does not have any psychological state like the PNHA's.

With regard to the beliefs, we assume that each player's available strategies, each player's current preferences and the future preferences of the EA are common knowledge. The agent's beliefs about self-preferences are as specified above. The EA knows the actual state of the world. The previous statement is also common knowledge. All this implies that the PNHA does not fully appreciate her own time-inconsistency but is sophisticated about the EA's preferences. Moreover, the EA knows everything that the modeler knows about the psychological state structure of the PNHA, but the PNHA herself does not.

3. Equilibrium characterization

In this part, we characterize the equilibrium of the game. We report some comparative analysis results at the end. Since this is a dynamic game of incomplete information, we use the perfect Bayesian equilibrium concept. Before proceeding, it will be useful to state the equilibrium shares of Rubinstein's alternating-offers bargaining game between different types of players:

Remark 1.

1. In the infinite horizon alternating-offers bargaining game where both players have exponential discounting with discount factors δ₁ and δ₂, the equilibrium payoffs are (x*, 1 − x*) = ((1 − δ₂)/(1 − δ₁δ₂), δ₂(1 − δ₁)/(1 − δ₁δ₂)). If δ₁ = δ₂ = δ, then (x*, 1 − x*) = (1/(1 + δ), δ/(1 + δ)).
2. A bargaining game between an EA and an SHA ends with the shares (x*, 1 − x*) = ((1 − βδ)/(1 − βδ²), (βδ − βδ²)/(1 − βδ²)) if the EA offers first, and (x*, 1 − x*) = ((1 − δ)/(1 − βδ²), (δ − βδ²)/(1 − βδ²)) if the SHA offers first.
3. Moreover, a bargaining game between an EA and an NHA (who does not learn) always ends with the shares (x*, 1 − x*) = (1 − βδ/(1 + δ), βδ/(1 + δ)) if the EA offers. If the NHA offers, she offers (x*, 1 − x*) = (1/(1 + δ), δ/(1 + δ)) but the EA rejects this offer. Here x* is the share of the first proposer.

Briefly, the EA prefers an SHA over an NHA as the opponent, and he prefers another EA the least. For notational convenience, we denote S = (1 − βδ)/(1 − βδ²), S̃ = 1 − βδ/(1 + δ) and S̲ = 1/(1 + δ), satisfying S > S̃ > S̲, which refer to the shares explained in Remark 1. From here on, all mentioned shares are those of the EA, who is the first proposer, unless otherwise noted.

9 By processing the signal, we mean that the agent learns the true state. By disregarding it, we mean that the beliefs do not change. Thus, in a sufficiently long time, learning occurs unless ρ = 1. In a different model, there may be two or more conflicting signals that result in ambiguous dynamics of beliefs (Rabin & Schrag, 1999).
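The three benchmark shares and their ordering can be checked numerically (our own sketch; the helper name and parameter values are illustrative):

```python
# Numerical check of the benchmark shares in Remark 1, all expressed as the
# EA's share when he proposes first:
#   S     = (1 - b*d)/(1 - b*d**2)  against an SHA,
#   S_til = 1 - b*d/(1 + d)         against an NHA who does not learn,
#   S_low = 1/(1 + d)               against another EA.
def benchmark_shares(beta, delta):
    S     = (1 - beta * delta) / (1 - beta * delta ** 2)
    S_til = 1 - beta * delta / (1 + delta)
    S_low = 1 / (1 + delta)
    return S, S_til, S_low

S, S_til, S_low = benchmark_shares(beta=0.5, delta=0.9)
# Ordering S > S_til > S_low: the EA most prefers facing an SHA and least
# prefers facing another EA, as stated after Remark 1.
assert S > S_til > S_low
```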


The first result is due to Rubinstein (1982). The other two are due to Akin (2007). The second result is actually obtained from the first one by setting δ₁ = δ and δ₂ = βδ. This is because an SHA in our framework can be treated as an EA with discount factor βδ, given her beliefs (λ = 1). The third result follows from the fact that the NHA who does not learn, given her beliefs (λ = 0), is an obstinate agent who insists on her current beliefs whatever the history of the game, and this is common knowledge.10

In this paper, we look for the answer to the following question: what is the equilibrium of an alternating-offers bargaining game played between a time-consistent player and a time-inconsistent player who is not certain what her future preferences will be and who processes information imperfectly? The latter can be one of two types and holds beliefs about the types. The former, on the other hand, knows the actual type of the PNHA. In order to answer this question, we use Rubinstein's (1985) results. Rubinstein in his seminal 1982 paper introduces the alternating-offers bargaining protocol and characterizes its equilibrium in a complete information setting. Rubinstein (1985) analyzes an extension of his previous model in which the second player has private information about his own type. The first player has initial binary beliefs about the type of the second player. He believes that the second player is either a weak type (2_w) with probability x₀ or a strong type (2_s) with probability 1 − x₀ (δ_s > δ_w). The main result characterizes the relationship between the unique bargaining sequential equilibrium and player 1's belief. It specifically states that "...there exists a cut-off point x* such that if x₀ is strictly below x*, player 1 gives up: he offers the partition he would have offered if he thought he was playing against 2_s, and player 2, whatever his type, accepts it. If x₀ is strictly above x*, some continuation of the bargaining is possible. In equilibrium, player 1 offers P1, 2_w would accept it, and 2_s would reject it and offer P2, which is accepted by 1. The partitions P1 and P2 are the only ones satisfying: (i) player 1 is indifferent between P2 today and the lottery of P1 tomorrow with probability x₀ and P2 after tomorrow with probability 1 − x₀; (ii) player 2_w is indifferent between P1 today and P2 tomorrow."

Rubinstein's (1985) approach and ours are similar in terms of informational assumptions, but there are two main distinctions. Firstly, the uncertainty in our framework is about the agent's own type, and the agent herself does not know which type she is. She holds an initial belief about her own type. The EA has private information about the types. Secondly, the aspect of imperfect information processing further differentiates our model from Rubinstein's (1985) model. Despite these distinctions, the structural similarities allow us to use the mentioned result in our framework.

Lemma 1. Assume ρ = 0. There exists a threshold level of sophistication λ* such that for every λ ≥ λ*, the agent behaves as if she is an SHA. If λ < λ*, the PNHA behaves as if she is an NHA. Specifically, the PNHA, as the first proposer, offers (S̲, 1 − S̲), which is rejected. Then, the EA offers (S, 1 − S), which is accepted. The EA, as the first proposer, offers (S̃, 1 − S̃), which is accepted if S̃ ≥ δ²S. If S̃ < δ²S, the EA, as the first proposer, offers a share (s, 1 − s) where s ≥ S, which will be rejected. Then, the PNHA offers (1 − δS, δS), which is accepted.

Dropping the imperfect information processing assumption, ρ = 0 makes our model very similar to the one in Rubinstein (1985). Hence, the existence of a threshold level of probability, λ*, and the equilibrium shares offered follow directly from Rubinstein (1985). Here, we provide only an intuitive explanation for Lemma 1.

Firstly, for the PNHA who has binary beliefs, it is optimal either to make the offer a sophisticate would make or to make an offer as if she were an exponential agent. The reason is that any intermediate offer is rejected anyway if the sophisticated state is the true state, since it is lower than the offer a sophisticated agent makes, which is the only acceptable offer for the EA. On the other hand, if the exponential state is the true state, any intermediate offer is accepted anyway, since it is higher than the offer made by a naive agent and accepted by the EA. Therefore, no intermediate offer is made. Secondly, if the prior belief of the PNHA that the current preferences will persist is high enough, λ ≥ λ*, she gives up and offers as an SHA does, which is the highest offer the EA expects (Remark 1, point 2). Otherwise, λ < λ*, she offers as if she will be an EA from tomorrow on. This offer is rejected by the EA since, after the EA's rejection, the PNHA becomes an SHA, and it is optimal for him to reject and make a higher counter offer for himself. This is also known by the PNHA, who also knows that the EA can pretend that she is actually weak. However, updating here occurs due to the PNHA's self observation of having the present biased preference again.

On the other hand, the EA offers (S̃, 1 − S̃) if S̃ ≥ δ²S, and this is accepted by the PNHA. The reason is that, from the perspective of the EA, S̃ is at least as high as the highest share he can possibly get, δ²S. The PNHA also accepts this offer since it is the highest share she can achieve. However, if S̃ < δ²S, the EA knows that the PNHA does not accept any share less than 1 − S̃. Since δ²S > S̃, it is optimal to make an offer (s, 1 − s) where s ≥ S, which is rejected since 1 − s ≤ 1 − S < 1 − S̃. In the next period, the PNHA becomes an SHA and offers as in Remark 1, point 2. In short, Lemma 1 emphasizes the relationship between the equilibrium and the PNHA's prior belief. Intuitively, if the agent is pessimistic enough about being a strong type, in Rubinstein's terminology, she consents to the lower share for herself, (1 − S). Otherwise, the game is possibly delayed one period.

10 Akin (2007) uses an equilibrium concept called naive backwards induction (NBI), introduced first by Sarafidis (2005), to obtain these results. NBI is a twisted version of subgame perfect Nash equilibrium in which agents have inconsistent beliefs, which contradicts the Nash equilibrium concept (for details, see Akin, 2007). However, although we do not prove it formally, by definition the NBI concept overlaps with the perfect Bayesian Nash equilibrium concept for the PNHA, given her beliefs. Thus, we can use these results as benchmarks to analyze the agent's behavior.


We now turn to the case of imperfect information processing. Here, whenever there is a rejection and this information is processed by the PNHA, it is revealed that the state of the world is the sophisticated state. This occurs with a fixed probability 0 < ρ < 1 at each t ≥ 0 (sudden learning), and after becoming sophisticated, the agent remains sophisticated.11 The following proposition characterizes the relationship between the equilibrium structure of the bargaining game and the information processing probability, ρ.

Proposition 1. Assume that ρ ∈ (0, 1), λ < λ* and the EA is the first proposer. Then, there exists a learning probability ρ* such that:

1. The EA offers (S, 1 − S) for all ρ > ρ* if S̃ < δ²S, in which case the agreement is delayed with probability 1 − ρ. In case of a delay, when the PNHA offers, she either offers (1 − δS, δS) with probability ρ, which is accepted, or she offers (S̲, 1 − S̲) with probability 1 − ρ, which is rejected.
2. The EA offers (S̃, 1 − S̃) and the agreement occurs immediately, either for all ρ ≤ ρ* if S̃ < δ²S, or for all ρ ∈ (0, 1) if S̃ ≥ δ²S.
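Proposition 1's case analysis for the EA's first offer can be sketched as follows (our own helper; the threshold rho_star comes from the proof below and is passed in as a parameter here):

```python
# Hedged sketch of Proposition 1's decision rule for the EA's first offer.
# rho_star is the threshold characterized in the proof (the root of H);
# the function and variable names are ours, not the paper's.
def ea_first_offer(rho, rho_star, beta, delta):
    S     = (1 - beta * delta) / (1 - beta * delta ** 2)   # share vs an SHA
    S_til = 1 - beta * delta / (1 + delta)                 # share vs an NHA
    if S_til >= delta ** 2 * S:
        # Waiting is worth at most delta**2 * S today, so the EA gives up.
        return ("offer S_til", "immediate agreement")
    if rho > rho_star:
        # Learning is likely enough that gambling on S pays off.
        return ("offer S", "delayed with probability 1 - rho")
    return ("offer S_til", "immediate agreement")
```

For instance, with β = 0.5 the condition S̃ ≥ δ²S holds at δ = 0.9 (immediate agreement for any ρ) but fails at δ = 0.99, where the outcome depends on ρ versus ρ*.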

Proof. Suppose 0 < ρ < 1, λ < λ* and the EA is the first proposer (the case in which the PNHA is the first proposer can easily be adapted). First suppose that S̃ < δ²S. This means that the EA can take his chance and offer the highest share he can get. Define two value functions for the EA's problem as follows:

V(t) = max_{S̃, S} { R(t), ρS + (1 − ρ)δV(t + 1) },
V(t + 1) = max_{Accept, Reject} { R(t + 1), δV(t + 2) },   where t ∈ {0, 2, 4, ...}.

At each even period, the value function takes the form of V(t) because the EA offers either S or S̃: the opponent is either sophisticated, in which case the EA gets S, or partially naive, in which case the EA gets S̃. In other words, the EA does not make any intermediate offer because a partially naive opponent rejects it anyway and a sophisticated opponent is ready to accept S. If he offers S̃, both types accept and he gets S̃ for sure. Hence, the immediate reward function R(t) is equal to S̃ for all t ∈ {0, 2, 4, ...}. However, the payoff from offering S is not deterministic. The expected value of offering S is given by ρS + (1 − ρ)δV(t + 1), since S is accepted with probability ρ and rejected with probability 1 − ρ, in which case he gets his continuation payoff in the next period, V(t + 1).12 With a slight abuse of notation, we use E(S) to refer to the expected value of offering S, E(S) = ρS + (1 − ρ)δV(t + 1).

At each odd period, the EA either accepts or rejects the opponent's offer. The opponent is either sophisticated and offers δS to the EA (which is accepted) or still partially naive, in which case she offers 1 − S̲. By Lemma 1 in Akin (2007), offers of the PNHA are never accepted. In other words, the game can end only at even periods, in which the EA offers, unless the PNHA becomes sophisticated during an odd period. The immediate reward function at odd periods, R(t + 1), takes the two values δS and 1 − S̲ with probability ρ and 1 − ρ, respectively. However, V(t + 1) = max_{Accept, Reject} {δS, δV(t + 2)} = δS, since V(t + 2) ≤ S for all t ∈ {0, 2, 4, ...}; and V(t + 1) = max_{Accept, Reject} {1 − S̲, δV(t + 2)} = δV(t + 2), since V(t + 2) ≥ S̃ ⇒ δV(t + 2) ≥ δS̃ > 1 − S̲ for all t ∈ {0, 2, 4, ...}. Thus, we can write the following:

V(t + 1) = ρδS + (1 − ρ)δV(t + 2),
V(t) = V(t + 2).

The last expression is satisfied because of the static environment in which the opponent's type is exogenously determined with a fixed probability. Then,

V(t) = max_{S̃, S} { S̃, E(S) },
V(t) = max_{S̃, S} { S̃, ρS + (1 − ρ)δ(ρδS + (1 − ρ)δ max_{S̃, S} { S̃, E(S) }) }.

Now, if E(S) > S̃, which means that the EA offers S, then

E(S) = ρS + (1 − ρ)δ(ρδS + (1 − ρ)δE(S)).   (1)

If ρS + (1 − ρ)δV(t + 1) ≤ S̃, which means that the EA offers S̃, then

E(S) = ρS + (1 − ρ)δ(ρδS + (1 − ρ)δS̃).   (2)

A consistency condition is imposed: E(S) > S̃ together with expression (1), and E(S) ≤ S̃ together with expression (2), should each be internally consistent. By using E(S) > S̃ and (1), we get:

11 Alternatively, the agent may interpret the relevant information (contradicting her beliefs) as mistakes. In this case, we can argue that, rather than gradual learning, there may exist a threshold accumulated (affective) information level above which the agent inevitably concedes its implications. Although this information is cognitively ignored, affectively, it accumulates. This may seem to be sudden learning, but its underlying process may be as described above.
12 We assume that the type of the PNHA, sophisticated or naive, is revealed by nature at the very beginning of the period.


H(δ, β, q) = qS(1 + (1 − q)δ²) / (1 − δ²(1 − q)²) − S̃ > 0.   (3)

By using E(S) ≤ S̃ and (2), we get:

H(δ, β, q) ≤ 0.   (4)

Note that dH(δ, β, q)/dq > 0, H(δ, β, 0) < 0 and H(δ, β, 1) > 0 since S > S̃. Hence, there must be a threshold value q* satisfying H(δ, β, q*) = 0 such that for all q ≤ q*, (4) is satisfied; otherwise, (3) is satisfied. Thus, if q > q*, he offers S, in which case the game is delayed with probability 1 − q. When the PNHA offers, she is still naive with probability 1 − q, in which case she offers 1 − S and the game is delayed. If she is sophisticated and offers δS, then the game ends immediately. If q ≤ q*, the EA offers S̃, in which case the game ends immediately. Finally, suppose S̃ ≥ δ²S. This means that the present value of what the EA can at most get in the next period is smaller than or equal to the offer that he makes to an NHA who does not learn, and that offer is accepted. This is why the EA gives up and offers (S̃, 1 − S̃), which is accepted by the PNHA. □
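Because H is increasing in q with H(δ, β, 0) < 0 < H(δ, β, 1), the threshold q* can be located numerically by bisection. A minimal Python sketch, with hypothetical values for δ and for the shares S and S̃ (which the model derives from β and δ):

```python
# Sketch: bisection for the threshold q* satisfying H(delta, beta, q*) = 0.
# S, S_tilde, d are hypothetical stand-ins for the model's shares.

def H(q, S, S_tilde, d):
    # Left-hand side of Eq. (3): qS(1+(1-q)d^2)/(1 - d^2(1-q)^2) - S_tilde
    return q * S * (1 + (1 - q) * d**2) / (1 - d**2 * (1 - q)**2) - S_tilde

def q_star(S, S_tilde, d, tol=1e-10):
    # H(0) = -S_tilde < 0, H(1) = S - S_tilde > 0, and H increases in q,
    # so a unique root exists in (0, 1).
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if H(mid, S, S_tilde, d) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

S, S_tilde, d = 0.8, 0.5, 0.9  # hypothetical values with S > S_tilde
qs = q_star(S, S_tilde, d)
assert H(qs - 1e-6, S, S_tilde, d) < 0 < H(qs + 1e-6, S, S_tilde, d)
```

For q above the computed threshold the EA offers S and risks delay; below it he offers S̃ and the game ends immediately.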

Proposition 1 states that if the present value of the best the EA can get in the next period is not higher than the share he can guarantee today, then the game ends immediately with the share that he offers to an NHA. Moreover, if the immediate reward is not large enough, then the EA takes his chance and offers the highest share he can get only if the probability that the PNHA processes the signal correctly is high enough. Otherwise, he again gives up and offers the share that he offers to an NHA, which immediately ends the game. Note that, given β, provided that δ is large enough, S̃ < δ²S, and given δ, provided that β is large enough, S̃ > δ²S. Intuitively, this means that if the EA is patient enough then the game is probably delayed, and if the self-control problem of the agent is not so severe, then the EA does not delay the game. The following corollary characterizes probabilistically when the game ends.

Corollary 1. If S̃ < δ²S and q > q*, then the game is delayed t̃ periods with probability q(1 − q)^t̃.

Proof. Suppose S̃ < δ²S and q > q*. By Proposition 1, we know that as long as the PNHA does not become an SHA, the game is delayed. Then, a delay of t̃ periods has probability (1 − q)^t̃. Moreover, the game ends at period t̃ + 1 with probability q. Hence, the game is delayed t̃ periods and ends at t̃ + 1 with probability q(1 − q)^t̃. □

This corollary specifies how many periods the game is delayed, and with what probability, if the EA decides to offer the highest share he can get.

3.1. Comparative statics

We now examine how the value of q* changes with the parameters of the model by applying the implicit function theorem to H(δ, β, q*). First, we explore the relationship between q* and β by examining ∂q*/∂β = −(dH(δ, β, q*)/dβ)/(dH(δ, β, q*)/dq*). We know that dH(·)/dq* > 0. Note that the function H(δ, β, q*) is concave in β, since d²H(δ, β, q)/dβ² < 0, and there is a unique β* satisfying dH(δ, β*, q)/dβ = 0 such that for all β > β*, dH(δ, β, q)/dβ < 0, and for all β < β*, dH(δ, β, q)/dβ > 0. Thus, for all β > β*, ∂q*/∂β > 0, and for all β < β*, ∂q*/∂β < 0. This means that, given β > β*, as the agent's self-control problem gets milder (a higher β), the EA is more likely to offer S̃. The reason is that as β increases, both E(S) and S̃ decrease, but the former decreases more than the latter. This leads to an increase in the tendency to offer S̃, which implies a higher q*. On the other hand, if β < β*, as the self-control problem of the agent lessens, the EA is more likely to offer S. This is because both E(S) and S̃ decrease as β increases, but the latter decreases more than the former. This leads to an increase in the tendency to offer S, which implies a lower q*. In other words, when the self-control problem of the agent is severe, as it gets milder the EA offers the higher share for himself more frequently, up to a point after which a further increase in β leads the EA to offer the higher share less frequently.
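The delay distribution in Corollary 1 is geometric, which can be checked by simulation. A short sketch with a hypothetical learning probability q (the value 0.3 is illustrative only):

```python
# Sketch: Monte Carlo check that P(delay = t) = q(1-q)^t, as in Corollary 1.
# Each even-period offer of S is accepted iff the PNHA has become an SHA,
# which happens with (hypothetical) probability q per period.
import random

def simulate_delay(q, rng):
    t = 0
    while rng.random() >= q:  # offer rejected: opponent still naive
        t += 1
    return t

rng = random.Random(42)
q, n = 0.3, 200_000
counts = {}
for _ in range(n):
    t = simulate_delay(q, rng)
    counts[t] = counts.get(t, 0) + 1

for t in range(4):
    empirical = counts.get(t, 0) / n
    theoretical = q * (1 - q)**t
    assert abs(empirical - theoretical) < 0.01
```

The empirical frequencies of each delay length match the geometric probabilities q(1 − q)^t up to sampling error.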
Secondly, the sign of ∂q*/∂δ = −(dH(δ, β, q*)/dδ)/(dH(δ, β, q*)/dq*) gives us the relationship between q* and δ. The critical value of the learning probability plays a role only if the condition S̃ < δ²S is satisfied, by Proposition 1. Using this expression, it is easy to show that dH(·)/dδ > 0. This, together with dH(·)/dq* > 0, implies that ∂q*/∂δ < 0. This means that as the common time-consistent discount factor δ increases, the EA is more likely to offer S. Intuitively, as δ increases, E(S) increases but S̃ decreases. This implies an increase in the tendency to offer S, implying a lower q*. Hence, as the EA gets more patient, delay becomes more likely.
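The comparative static with respect to δ can be illustrated numerically by recomputing the threshold for increasing δ and verifying that it falls. Note that this sketch holds the shares S and S̃ fixed at hypothetical values, so it isolates only the direct effect of δ through H; in the full model the shares themselves also depend on β and δ:

```python
# Sketch: q* is decreasing in d (the common discount factor), i.e.
# a more patient EA is more willing to offer S and risk delay.
# S and S_tilde are hypothetical constants here.

def H(q, S, S_tilde, d):
    # Left-hand side of Eq. (3)
    return q * S * (1 + (1 - q) * d**2) / (1 - d**2 * (1 - q)**2) - S_tilde

def q_star(S, S_tilde, d, tol=1e-12):
    # Bisection: H(0) < 0 < H(1) and H is increasing in q.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if H(mid, S, S_tilde, d) < 0 else (lo, mid)
    return (lo + hi) / 2

S, S_tilde = 0.8, 0.5
thresholds = [q_star(S, S_tilde, d) for d in (0.80, 0.85, 0.90, 0.95)]
# dq*/d(delta) < 0: the sequence of thresholds is strictly decreasing
assert all(a > b for a, b in zip(thresholds, thresholds[1:]))
```

Since H is increasing in δ for fixed q, the root q* must fall as δ rises, which the computed thresholds confirm.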

4. Discussion and conclusion

Identifying the behavioral biases of people, determining their implications and then, if possible, suggesting operational policies are crucial issues, since people frequently exhibit bounded, rather than perfect, rationality. In this paper, we investigate the behavior of agents in a strategic environment who potentially have two well-known biases. This helps us to better understand the behavior of players who try to take advantage of other players with these biases. We believe that having a preference for immediate gratification emerges as a very important factor, and it produces surprising and seemingly irrational outcomes both in individual decision making problems and in strategic environments. Savings and investment behavior, project completion, intertemporal individual decision making problems, price determination, several advertising strategies employed by firms and bargaining situations are some examples. Moreover, since the very nature of acting rationally necessitates being aware of what the agent desires and how she optimizes under both feasibility and behavioral constraints, learning in the sense of increasing awareness is also a crucial part of rationality. This is the main reason why we choose imperfect information processing, along with the present bias, among the many well documented biases. In addition, together they elucidate the intuition behind the persistence of the implied anomalous behavior. This indicates that one bias may wash out or exacerbate the other's effect.13 In our context, being naive about one's own preferences is a bias that makes the agent better off, but learning one's own preferences over time alleviates the positive effect of this bias. On the other hand, imperfect information processing is another bias, and in the presence of both, the latter prevents the lessening of the effect of the former. In this sense, one bias provides permanence of the other's effect. Our mood state approach turns out to be a formal explanation of this persistence: imperfect information processing causes the present bias to be persistent. Although restricting the bias space to these two biases may seem to be a limitation, this result can be stated for any combination of biases with similar implications. While we find that being naive about having present biased preferences is advantageous, most of the industrial organization literature finds that it hurts the decision maker. In those environments, we observe the persistence of different biases (Loewenstein, 1996), and people do not or cannot correct these biases even though holding them is costly. In this case, their permanence is less likely.
In our framework, however, people are actually better off by failing to appreciate the bias they have; hence they do not have any incentive to overcome this bias in the sense of increasing awareness.14 The bias is therefore more likely to be permanent. Moreover, even if they become aware of it, they have an incentive to pretend to be naive. In this sense, examining an alternating-offers bargaining game as the medium of interaction is more interesting in terms of its implications, but it is possible to extend this work by considering different types of games. In this paper, we model information processing and partial naiveté in a simple way. However, we conjecture that even if different stochastic structures for learning are assumed, the results do not change significantly. Moreover, in this paper, learning is better understood as case-by-case learning that can be called meta-learning. It would not be correct to generalize this in the sense that agents become fully aware of a bias they have regardless of the decision making problem.15 As Babcock, Loewenstein, Issacharoff, and Camerer (1995) argue, people may recognize the self-serving biases of other people but tend to overlook the ones they may have. Rabin and Schrag (1999) raise the question of how economic implications might depend on people's awareness of others' confirmatory bias, and they continue: ''One possibility is that people might exploit the bias of others. A principal may, for instance, design an incentive contract for an agent that yields the agent lower wages on average than the agent anticipates, because the agent will be overconfident about her judgments in ways that may lead her to exaggerate her yield from a contract." We assume that the principal is aware of the existence of the biases the agent has and acts accordingly. In our context, the principal's design is the bargaining game on the wage.
The naive agent anticipates a higher wage in advance, which (as a result of learning) turns out to be less than what she anticipated, as long as information processing is close to perfect. In conclusion, this paper gives a taste of how to characterize the behavior of agents in strategic environments who have self-serving biases. However, extending this to more general settings remains an issue that needs further research. Possible extensions are to relax the informational assumptions, to allow for correlated cognitive and mood states, to generalize the cognitive state structure and the learning dynamics (especially, to allow for conflicting signals), to relax the information processing structure and to consider different types of strategic environments.

Acknowledgements

I thank Matthias Sutter and two anonymous referees whose suggestions improved the exposition substantially. For helpful comments, I am indebted to Ahu Genis Gruber, Derek Laing, Emin Karagozoglu, Isa Emin Hafalir, Ismail Saglam and Thomas Sjöström. I also thank Burak Erdeniz, who provided excellent research assistance. All remaining errors are mine.

References

Akin, Z. (2007). Time-inconsistency and learning in bargaining games. International Journal of Game Theory, 36(2), 275–299.
Antonides, G. (1996). Psychology in economics and business: An introduction to economic psychology. Dordrecht: Kluwer Academic Publishers.
Asheim, G. B. (2008). Procrastination, partial naiveté, and behavioral welfare analysis. Working paper, University of Oslo.
Babcock, L., Loewenstein, G., Issacharoff, S., & Camerer, C. (1995). Biased judgments of fairness in bargaining. American Economic Review, 85(5), 1337–1343.
Besharov, G. (2004). Second-best considerations in correcting cognitive biases. Southern Economic Journal, 71(1), 12–20.
Compte, O., & Postlewaite, A. (2008). Repeated relationships with limits on information processing. PIER working paper.
DellaVigna, S., & Malmendier, U. (2004). Contract design and self-control: Theory and evidence. Quarterly Journal of Economics, 119(2), 353–402.
DellaVigna, S., & Malmendier, U. (2006). Paying not to go to the gym. American Economic Review, 96, 694–719.

13 Besharov (2004), using examples of time-inconsistency, regret, and overconfidence, examines how biases may offset each other's effects, and the implications of correcting biases.
14 Especially in intertemporal games, being naive refers to the concept that we call ''believing is as if actually being," under, of course, some informational assumptions.
15 The agent may deem different hypotheses about the states of the world more or less likely depending on the nature of the problem she faces.


Eliaz, K., & Spiegler, R. (2006). Contracting with diversely naive agents. Review of Economic Studies, 73(3), 689–714.
Frederick, S., O'Donoghue, T., & Loewenstein, G. (2002). Time discounting and time preference: A critical review. Journal of Economic Literature, 40(2), 351–401.
Fudenberg, D., & Levine, D. K. (2006). A dual self model of impulse control. Harvard Institute of Economic Research Discussion Paper No. 2112.
Gilpatric, S. M. (2008). Present biased preferences, self-awareness and shirking. Journal of Economic Behavior and Organization, 67, 735–754.
Hafalir, E. I. (2008). Credit card competition and naive hyperbolic consumers. Working paper, Carnegie Mellon University.
Heidhues, P., & Kőszegi, B. (2008). Exploiting naiveté about self-control in the credit market. Working paper, University of California, Berkeley.
Heidhues, P., & Kőszegi, B. (2009). Futile attempts at self-control. Journal of the European Economic Association, 7(2–3), 423–434.
Laibson, D. (1997). Golden eggs and hyperbolic discounting. Quarterly Journal of Economics, 112(2), 443–478.
Lipman, B. L. (1995). Information processing and bounded rationality: A survey. The Canadian Journal of Economics/Revue Canadienne d'Economique, 28(1), 42–67.
Loewenstein, G. (1996). Out of control: Visceral influences on behavior. Organizational Behavior and Human Decision Processes, 65, 272–292.
Loewenstein, G., O'Donoghue, T., & Rabin, M. (2003). Projection bias in predicting future utility. Quarterly Journal of Economics, 118(4), 1209–1248.
Metcalfe, J., & Mischel, W. (1999). A hot/cool-system analysis of delay of gratification: Dynamics of willpower. Psychological Review, 106(1), 3–19.
O'Donoghue, T., & Rabin, M. (1999a). Doing it now or later. American Economic Review, 89(1), 103–124.
O'Donoghue, T., & Rabin, M. (2001). Choice and procrastination. Quarterly Journal of Economics, 116(1), 121–160.
Rabin, M., & Schrag, J. L. (1999). First impressions matter: A model of confirmatory bias. Quarterly Journal of Economics, 114(1), 37–82.
Raghunathan, R., & Pham, M. T. (1999). All negative moods are not equal: Motivational influences of anxiety and sadness on decision making. Organizational Behavior and Human Decision Processes, 79(1), 56–77.
Rubinstein, A. (1982). Perfect equilibrium in a bargaining model. Econometrica, 50(1), 97–110.
Rubinstein, A. (1985). A bargaining model with incomplete information about time preferences. Econometrica, 53(5), 1151–1172.
Sanbonmatsu, D. M., & Kardes, F. R. (1988). The effects of physiological arousal on information processing and persuasion. Journal of Consumer Research, 15(3), 379–385.
Sarafidis, Y. (2005). Inter-temporal price discrimination with time-inconsistent consumers. Mimeo, CRA International.
Wason, P. C. (1960). On the failure to eliminate hypotheses in a conceptual task. Quarterly Journal of Experimental Psychology, 12, 129–140.