Copyright © IFAC Supplemental Ways for Improving International Stability, Laxenburg, Austria, 1983
IN QUEST OF A THEORY OF RECONCILIATION H. Nurmi Department of Philosophy, University of Turku, SF-20500 Turku 50, Finland
Abstract. Traditional 2x2 game theory focuses on conflict situations where each participant perceives the game in an identical manner. Hypergame theory does away with this assumption. The hybrid games consisting of the Assurance Game, Prisoner's Dilemma and Chicken are discussed in an effort to find out what difference, from the stability viewpoint, the perceptual differences might make. The extreme type of knowledge concerning the opponent's choices is omniscience. It seems that in certain games like Chicken omniscience is not beneficial but even downright harmful for the omniscient player, provided that his opponent knows of it. The equilibrium concept of one-shot game theory seems too restrictive to be taken as a model of real world conflicts where the games are often sequential. The concept of nonmyopic equilibrium of Brams and his associates is used to analyze the hybrid games and their stable outcomes. Finally, a brief review of the state of the art of institutional design is presented in order to show the relevance of social choice theory for research on international stability and to point to some of the problems that lie ahead in this field.

Keywords. Prisoner's Dilemma, Chicken, Assurance Game, nonmyopic equilibrium, Nash equilibrium, omniscience, social choice, stability.

INTRODUCTION

The control engineer is naturally inclined to think of the international system as a control system and, consequently, of the malfunctionings of the system as information gathering and control problems. Therefore, the relevant questions to be asked from this viewpoint when looking at international conflicts are 1) what controls do the various agents (typically states or coalitions of states) possess vis-a-vis each other in the issue at hand, and 2) what kind of information do they have of the system and its components. But obviously the agents also have interests in the behaviour of the system or in some of its states.
Moreover, the conflict issue has a more or less limited scope. The main argument of this paper is that each of these four parts must enter any satisfactory theory of conflict resolution or reconciliation of conflicting interests. Fortunately the theory does not have to be built from scratch. There is a rich literature of interest-control
interactions, viz. the theory of games. Particularly pertinent parts of game theory deal with non-constant-sum games like the Prisoner's Dilemma and Chicken. The questions that are addressed in this paper are: what is the nature of the conflict and what kind of resolution is called for in these situations? Some authors have argued that the emergence of the nation-state can be made intelligible by the "necessity" to find ways out of the Pareto-suboptimal and individually rational equilibrium in the Prisoner's Dilemma-like situation. And yet no arrangement of a similar nature seems to succeed on the international level. Why? Several possible explanations can be given: (i) Prisoner's Dilemma-like situations can be resolved without establishing new institutions as contract-enforcers, (ii) there is a generally perceived need for a transnational body, but its design faces extremely difficult theoretical problems, viz. problems of just and fair representation and of selecting a good social choice procedure, and (iii) our
theories of stability are not in accord with the world we live in: there is a great deal of stability in the world, and yet our theories tell us that many of our institutions are chaotic. So, there is more stability in our institutions than our theories suggest. This paper discusses explanations (ii) and (iii) fairly extensively, making only some brief remarks on (i), as the latter has been extensively studied in the literature.

This paper focuses on some problem formulations of game theory from a specific normative viewpoint, viz. we shall be dealing with the ways in which reconciliatory outcomes can be reached in various game situations. The ambitious word "theory" in the title of this paper suggests nothing more than the fact that we are looking at games from this normative angle. The games we shall be dealing with are mostly well known from the literature. The basic types are all two-person non-constant-sum finite games. One could well ask why no effort is made to cover the constant-sum games, as they are often thought of as paramount examples of conflicts of the most antagonistic kind. The answer is contained in what was just said, as the very nature of constant-sum games makes it inconceivable for the parties to look for reconciliation on rational grounds. Each of the players is winning and, therefore, the act of giving up some of one's achievable payoff to reach a more reconciliatory settlement is presumably due to the embedding of the game in a larger game or in some extraneous state of affairs. Hence, it seems reasonable to deal with non-constant-sum games in which per definitionem there are some common interests combined with purely individual ones.

SETTING THE STAGE: THE BASIC STRUCTURE OF THE CONFLICT

The Prisoner's Dilemma

Probably the best known single game in game theory is the Prisoner's Dilemma (PD, for short). It can be defined by means of the following payoff matrix (Fig. 1).

                 Column
                 C        D
Row      C      3,3      1,4
         D      4,1      2,2

Fig. 1.
Here the payoffs have only ordinal significance, i.e. it is assumed that 4 Pi 3, 3 Pi 2, 2 Pi 1, where x Pi y means "x is strictly preferred to y by i", i = Row, Column. As usual the payoffs of Row (Column, respectively) are presented as the left (right) entry in each cell of the matrix.

Both players in PD have a dominating strategy, viz. D (D = "defection"). However, if both choose D, the outcome is (2,2), which is Pareto-suboptimal. The Pareto-change can be made if both players choose C (C = "cooperation"), which in turn is made difficult by the fact that by unilateral defection one player can then get the maximum payoff 4. The (2,2) outcome is the Nash equilibrium of PD, i.e. no player can benefit from a unilateral departure from it (assuming that the other player sticks to his strategy). The characteristic feature of PD is thus the fact that the C-choice by both players would be beneficial vis-a-vis the D-choice by both, but, on the other hand, the best result is achieved when one is the only defector. The seriousness of the dilemma would seem to increase with the increase of the temptation 4-3. The other parameters of the game are risk: 2-1 and gain: 3-2 (see Bonacich 1970; Nurmi 1977).

Chicken

In Chicken (CG, for short), the worst outcome for both players results from their joint D-choice, while the second worst outcome is "the sucker's payoff" resulting from one's C-choice when the opponent chooses D. The game can thus be defined as follows (Fig. 2).
                 Column
                 C        D
Row      C      3,3      2,4
         D      4,2      1,1
Fig. 2.

There are two Nash equilibria in this game, viz. (4,2) and (2,4). On the other hand, there are no dominant strategies for either player. C is better for Row if Column chooses D, while D is better for Row if Column chooses C. So, the game is more cooperation-oriented than PD, where both dominant strategy and equilibrium considerations speak in favour of the D-choice. We observe, moreover, that (1,1) is not an equilibrium even though it results from the D-choice by both players. Consequently, the incentives for cooperative choice are definitely larger in CG than in PD. To this one must add, however, that
the dilemma is still there, as the largest payoff for either player results when he chooses D and his opponent chooses C.

Assurance Game

In the Assurance Game (AG, for brevity) the dilemma is still smaller than in CG. Now the largest achievable payoff for both players results from the cooperative choice by both. The second best payoff accrues from a concerted action as well, viz. from the D-choice by both. Only when the players choose differently is there a difference between payoffs, so that the cooperative chooser gets his worst payoff whereas the D-chooser gets his next to worst one (Fig. 3).

                 Column
                 C        D
Row      C      4,4      1,2
         D      2,1      3,3
Fig. 3.

Again we have two Nash equilibria, (4,4) and (3,3). This time the equilibria are located on the main diagonal of the matrix, indicating that the same action by both players results in an equilibrium. There are no dominant strategies in this game, as C is better for Row if Column chooses C, whereas D is better for Row if Column chooses D. We see that in contrast to the equilibria in CG, one of the AG equilibria is Pareto-superior to the other. In CG both equilibria are Pareto-optimal, while in PD the equilibrium is Pareto-dominated by a non-equilibrium outcome.

RESOLVING THE CONFLICTS

Let us now turn to the resolution of the games discussed above, i.e. to the ways in which the cooperative outcome could be rendered individually rational. An obvious method of resolution would, of course, be to modify the payoff matrix so as to end up with one in which the cooperative C-choice is the dominant strategy for both players. As a matter of fact one does not need to do quite that much: all that is needed to guarantee the cooperative outcome is that one of the players has a dominant cooperative strategy and the other cannot benefit from defecting when the choice of his opponent is cooperative. In PD this means that for one player positive payoffs A and D are added to 3 and 1, respectively, so that 3+A is larger than 4 and 1+D is larger than 2 (or alternatively negative payoffs are added to 4 and 2), while for the other player
the game is transformed into AG by eliminating temptation 4-3, e.g. by adding a reward to 3. In CG the elimination of temptation is sufficient to make the cooperative choice individually rational for both players. In AG there per definitionem is no temptation. This is, however, not sufficient to elicit cooperation. The modification in the payoff matrix needed for this is the elimination of the difference 3-1 by, say, subtracting a positive payoff B from 3 so that 3-B is smaller than 1.

From a game-theoretic point of view these modifications are trivial, although in practice one could argue that conflict resolution often, if not always, takes the form of such transformations. The reason I am somewhat hesitant to claim that conflict resolution always takes the form of payoff modifications is that the role of perceptions is intuitively crucial. In resolving conflicts it is not only relevant to know how the players conceptualize, in terms of their own payoffs, the nature of the conflict, but it is also quite essential to find out how the players perceive the game that their opponent plays. These together determine the game that one of the players thinks he is playing and, consequently, the modifications must be made with respect to this game. We shall return to this problematique later on.

A feature that seems important in conflict resolution is the number of times the game is played by the same players. In game-theoretic terminology the distinction that captures this feature differentiates one-shot and sequential games. Up to now we have dealt with the former only, but surely we encounter situations which would more naturally be describable as game sequences than as one-shot games. Only one of the three games discussed above has been studied with the aim of finding resolutions in sequences of games not achievable in one-shot ones (see esp. Taylor 1976; Smale 1980). Taylor finds that in PD supergames, i.e.
games consisting of a sequence of one-shot PD's, there are strategies which are optimal when one is confronted with an opponent choosing from a restricted set of supergame strategies and which, moreover, dictate the cooperative choice for both players in each composite PD. A crucial role in Taylor's supergame solution is played by the rate at which the players discount the future payoffs ensuing from the composite games. Smale, in turn, shows that in sequential PD the requirement of a "good
strategy" is that the cooperative choice is made in each subgame of the sequential PD and, moreover, that one cannot do better by choosing some other than a good strategy.

A more detailed explanation of Smale's results may be in order. Let us denote by S the convex set of points in the two-dimensional real space $\mathbb{R}^2$ generated by the points (4,1), (3,3), (2,2) and (1,4). We consider the iterated PD in which the players make their strategy choices conditional upon their past payoffs. Smale assumes that a suitable summary measure of past performance is given. Now a strategy in the iterated game is a rule which assigns to each value of the summary measure a choice (C or D) for the next round. For the sake of simplicity let us assume that the summary measure is the arithmetical mean of the past payoffs. Suppose that one PD is played in one unit of time. Starting from 1 we thus get a sequence $x_1, x_2, \ldots, x_T$ ($x_i \in S$ for all $i$, $1 \le i \le T$) of T payoffs up to a given time point T. The average payoff vector during the period from 1 to T is then

$$\bar{x}_T = \frac{1}{T} \sum_{i=1}^{T} x_i .$$
Assuming that the average past performance is the only consideration that enters a player's deliberation of his next choice, we get $s: S \to A$, where $s = (s_1, s_2)$, $A = A_1 \times A_2$, $A_1 = A_2 = \{C, D\}$, $s_1: S \to A_1$ and $s_2: S \to A_2$.
Obviously the payoffs in the first play and the strategies $s_1$ and $s_2$ chosen by the players completely determine the evolution of the dynamic PD. A solution for this game is defined by Smale as a pair $(s, x)$ such that $s$ is a strategy pair and $x$ is stationary with respect to $s$. This means that if $x = \bar{x}_j$, then $\bar{x}_{T+j} = x$ for $T = 1, 2, \ldots$ The global stability of a solution $(s, x)$, on the other hand, means that for any $\bar{x}_1 \in S$ and $\bar{x}_{T+1} = h_T(\bar{x}_T)$, $T = 1, 2, \ldots$: $\bar{x}_T \to K$ as $T \to \infty$, where $K$ is a fixed payoff vector and $h_T$ is a function defining the dynamical system where the average payoffs up to T are mapped into the average payoffs at T+1. Naturally $h_T$ depends on the strategies of the players. In other words,

$$h_T(x) = \frac{T x + g \circ s(x)}{T + 1}, \qquad x \in S.$$
Here $s(x)$ specifies a pair of choices of the players and $g$ is simply the payoff function in PD. Now the concept of a good strategy can be defined. A good strategy satisfies the following three conditions: (1) $s_1(a,b) = D$ if $a < 2$, where $(a,b) \in S$; that is, a good strategy dictates the choice of D for player 1 if his average payoff up to the present time period is less than 2, which is the maximin payoff for each player and, therefore, can be unilaterally secured; (2) $s_1(a,b) = D$ if $b > 3$, which means that 1 should choose D if 2's average exceeds the payoff that 2 could possibly get by choosing C (i.e. 2 must have been exploiting 1's cooperation at least part of the time); and (3) $s_1(a,b) = C$ if $b \le a$, and there is in S an open set which contains the segment $\{(c,c) \in S \mid 2 < c \le 3\}$ on which $s_1 = C$. Condition (3) is needed to avoid being locked in the noncooperative equilibrium.

Now Smale proves that by playing a good strategy a player gets on the average, as $T \to \infty$, at least the maximin payoff in PD, i.e. 2. Moreover, a good strategy is an equilibrium strategy in the sense that if player 1 plays a good strategy then player 2 cannot do better than by playing a good strategy as well. Furthermore, Smale shows that if 1 and 2 play a good strategy $(s_1, s_2) = s$, then the payoff vector $x' = (3,3)$ is a solution payoff distribution and $(s, x')$ is the unique solution. In addition, $(s, x')$ is also globally stable.

Resolving CG is in the following sense easier than resolving PD (see Taylor & Ward 1981). While in the latter the only equilibrium is found at the intersection of two noncooperative strategies, in the former there are two equilibria, both of which entail that one of the players chooses noncooperatively and the other cooperatively.
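The averaging dynamics and the three conditions on a good strategy can be made concrete with a small simulation. The following is only a sketch under stated assumptions: the rule `good_strategy` is one hypothetical rule intended to satisfy conditions (1)-(3), the tolerance `TOL` that keeps the player cooperating slightly above the diagonal is our own choice, and the function names are not Smale's. The update line implements the averaging map that takes the mean payoff after T rounds to the mean payoff after T+1 rounds.

```python
from fractions import Fraction

# Ordinal PD payoffs of Fig. 1: (payoff of player 1, payoff of player 2).
PD = {("C", "C"): (3, 3), ("C", "D"): (1, 4),
      ("D", "C"): (4, 1), ("D", "D"): (2, 2)}

# Assumption: width of the band above the diagonal where the rule still
# cooperates, so that C is chosen on a neighbourhood of the diagonal.
TOL = Fraction(1, 4)

def good_strategy(own, other):
    """One illustrative rule intended to satisfy conditions (1)-(3)."""
    if own < 2:                  # condition (1): below the maximin payoff
        return "D"
    if other > 3:                # condition (2): the opponent has been exploiting
        return "D"
    if other <= own or other - own < TOL:   # condition (3) plus the band
        return "C"
    return "D"                   # assumption: defect elsewhere

def simulate(first_play, rounds):
    """Average payoff vector after `rounds` plays when both use the rule."""
    x = tuple(Fraction(v) for v in PD[first_play])   # average after round 1
    for t in range(1, rounds):
        moves = (good_strategy(x[0], x[1]), good_strategy(x[1], x[0]))
        g = PD[moves]
        # averaging map: x_{t+1} = (t * x_t + g(s(x_t))) / (t + 1)
        x = tuple((t * xi + gi) / (t + 1) for xi, gi in zip(x, g))
    return x

if __name__ == "__main__":
    for start in PD:
        print(start, [float(v) for v in simulate(start, 1000)])
```

Under these assumptions every initial play leads both averages towards 3, in line with the statement that (3,3) is the solution payoff when both players use a good strategy. Exact rational arithmetic is used so that the comparison of the two averages on the diagonal is not disturbed by rounding.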
This, in conjunction with the fact that the worst outcome results from both players' choosing noncooperatively, means that if the players make their choices consecutively, the one who chooses first can "hijack" the other to play cooperatively.

Similarly, AG can be resolved by constructing a dynamic version of the game so that the choices are made consecutively and with full information of the choices already made. Indeed, in AG the player choosing first does not pose any kind of threat to the other player. Rather he is proposing the equilibrium outcome which is the superior of the two equilibria from both players' viewpoints.

What is particularly worth observing,
however, is that these resolutions are crucially game-dependent, that is, a stratagem that would seem to resolve one game does not necessarily resolve another game. In fact, if a player in CG chooses D in order to hijack his opponent to choose C to avoid the worst outcome, the same choice would certainly not elicit cooperation if the opponent thinks he is playing a dynamical PD or AG. This once again points to the importance of the perception of games.

HOW THE CONFLICTS ARE SEEN

It can be argued that any successful theory of conflict resolution must take into account both the objective and perceived features of the situation. The latter include both the subjective analysis of the type of game that is being played and the interpretation of the moves of a player by his opposing party. We shall deal with the former analysis only.

Suppose a two-person conflict situation in which Row interprets the payoffs ensuing from the strategy choices as a PD as far as his own payoffs are concerned, while Column sees the game as a CG as far as his payoffs go. The situation is then the following (Fig. 4).

                 Column
                 C        D
Row      C      3,3      1,4
         D      4,2      2,1
Fig. 4.

In this game there is one equilibrium, viz. at the intersection of Row's D-choice and Column's C-choice. Obviously this outcome is an improvement over the noncooperative PD-equilibrium from Row's viewpoint. Moreover, Row can achieve this outcome by committing himself to D, leaving Column no other option but to yield, i.e. to choose C in order to avoid the disastrous DD-outcome. It is obviously beneficial for a player to give the impression that he is playing PD when the opponent is playing CG. It seems that this very asymmetry of PD and CG, viz. that the player perceiving himself in a CG situation is in a disadvantaged position vis-a-vis the player seeing the game as a PD, is utilized to some extent by the superpowers, both pretending to be playing a PD in disarmament negotiations while the objective facts of the matter make the CG a much more applicable model.

Let now Row see the game as a PD and Column see it as an AG. Then we have the following payoff matrix (Fig. 5).

                 Column
                 C        D
Row      C      3,4      1,2
         D      4,1      2,3

Fig. 5.

There is one equilibrium in this game, to wit the DD-outcome. Interestingly enough, the fact that Column perceives the situation as an AG does not change the equilibrium outcome from what it would have been had both players seen it as a PD. We observe, however, that the DD-outcome is no longer a dominant strategy intersection, but it is an equilibrium all the same. But then not much is achieved by way of guaranteeing a cooperative solution if one of the players is persuaded to switch to AG from PD.

Finally the AG-CG combination looks like this (Fig. 6).

                 Column
                 C        D
Row      C      4,3      1,4
         D      2,2      3,1

Fig. 6.

There are no equilibria in this game. Surely the combination AG-CG is more conducive to reconciliation than a PD-PD combination, but as long as there remains some temptation, i.e. as long as 4 > 3 in the payoff matrix, there is no assurance that the CC-outcome will be reached.
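The equilibrium and dominance claims made for the three basic games (Figs. 1-3) and the three hybrid games (Figs. 4-6) can be checked by brute force. The sketch below is illustrative only: the dictionary layout and the helper names `game`, `pure_nash` and `dominant` are our own, and only pure strategies are considered, as in the text.

```python
MOVES = ("C", "D")

def game(cc, cd, dc, dd):
    """2x2 game as {(row_move, col_move): (row_payoff, col_payoff)}."""
    return {("C", "C"): cc, ("C", "D"): cd, ("D", "C"): dc, ("D", "D"): dd}

GAMES = {
    "PD": game((3, 3), (1, 4), (4, 1), (2, 2)),              # Fig. 1
    "CG": game((3, 3), (2, 4), (4, 2), (1, 1)),              # Fig. 2
    "AG": game((4, 4), (1, 2), (2, 1), (3, 3)),              # Fig. 3
    "PD-CG (Fig. 4)": game((3, 3), (1, 4), (4, 2), (2, 1)),
    "PD-AG (Fig. 5)": game((3, 4), (1, 2), (4, 1), (2, 3)),
    "AG-CG (Fig. 6)": game((4, 3), (1, 4), (2, 2), (3, 1)),
}

def other(move):
    return "D" if move == "C" else "C"

def pure_nash(g):
    """Payoff pairs of all cells from which no unilateral switch pays."""
    cells = []
    for r in MOVES:
        for c in MOVES:
            row_ok = g[(r, c)][0] >= g[(other(r), c)][0]
            col_ok = g[(r, c)][1] >= g[(r, other(c))][1]
            if row_ok and col_ok:
                cells.append(g[(r, c)])
    return cells

def dominant(g, player):
    """Strictly dominant move of player 0 (Row) or 1 (Column), or None."""
    for m in MOVES:
        if player == 0:
            wins = all(g[(m, c)][0] > g[(other(m), c)][0] for c in MOVES)
        else:
            wins = all(g[(r, m)][1] > g[(r, other(m))][1] for r in MOVES)
        if wins:
            return m
    return None

if __name__ == "__main__":
    for name, g in GAMES.items():
        print(name, pure_nash(g), dominant(g, 0), dominant(g, 1))
```

Run on these matrices, the check reproduces the statements in the text: one equilibrium in PD and in the PD-CG and PD-AG hybrids, two in CG and in AG, and none in the AG-CG hybrid, where in the PD-AG hybrid only Row retains a dominant strategy.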
What kind of measures would one then need to make the cooperative outcome more likely in these "hybrid" games? First of all, the fact that a player sees a game as a PD instead of a CG makes things a lot worse for an outside umpire trying to work out a CC-outcome. The reason is obvious. The player in question is only more convinced of his D-choice when he learns that the other player is playing a CG. Certainly this information does not induce him to choose C instead of D. On the other hand, if the player who sees the game as a CG is told that his opponent views the situation as a PD, this information would no doubt make it more likely that he chooses C in order to avoid the disastrous DD-outcome. Consequently there is an asymmetry in the effect of this information: one of the players is more likely to choose C upon being told what kind of game his opponent is playing, while there is a presumption that the other player takes exactly the opposite action if the information about the opponent's game were given. Moreover, the benefits of
trying to strengthen one's own position by giving false information about one's preferences over outcomes are obvious.

Secondly, if a player who sees a game as a PD is told that his opponent thinks of it as an AG, this actually gives him an additional incentive to choose D, since he knows that his opponent no longer has a dominating strategy D but may choose C, which would guarantee the largest payoff to the PD-player choosing his dominant strategy D. On the other hand, if an AG-player is informed about his opponent's playing PD, this would probably induce him to anticipate a D-choice from his opponent's side. The best response to this would obviously be D, whereby the DD-outcome results. In this hybrid game, then, information about the opponent's game does not increase the likelihood of the CC-outcome no matter which of the players is informed about the opponent's game.

Thirdly, if a CG-player learns that his opponent does not see the game as a CG but as an AG, this fact certainly prevents him from benefiting from hijacking his opponent by a commitment to D. Consequently, this information would seem to increase the likelihood of the CC-outcome. On the other hand, if the AG-player is told that he is facing a CG-player instead of an AG one, this piece of information leaves the situation indeterminate, as how he acts in this new situation now depends on the player's risk-posture. The largest payoff in AG results from both players' choosing C, while in CG the largest payoffs cannot be achieved simultaneously by both players. When an AG is converted into a CG one increases the range of payoffs possibly resulting from a D-choice, while decreasing the range related to the C-choice.
It seems then that if the AG-player is told that his opponent is a risk-averse CG-player, the likelihood of the CC-outcome increases. If the CG-player is risk-acceptant, then it would seem that the possibilities of reaching a CC-outcome are diminished by the fact that the AG-player learns this fact.

The above remarks show that the increase in information about one's opponent's preferences does not necessarily increase the probability of the cooperative outcome. This conclusion follows from the kind of hypergame analysis of AG, CG and PD which we have conducted above. The hypergame models have mainly been utilized in the reconstructions of real event sequences such as military manoeuvres (see Bennett 1980; Said, Hartley and Bennett 1981). We have not considered the possible complications ensuing from different interpretations of actions by various players, even though these would no doubt prove important in analyses aiming at supporting decisions. This, however, is beyond our focus.

OMNISCIENCE

A fairly commonly shared belief about the stability of the duopolistic superpower system, which certainly is held by the intelligence officers in the scientific-military-industrial complexes of the superpowers, is that increasing information about the opponent's plans and motives also increases the stability of the system. This is why e.g. intelligence satellites are now a widely accepted part of the interaction system of the two blocs. It can be argued, however, that taken to its extreme an increase in the knowledge about the opponent's intentions may be both harmful for the knower and highly destabilizing for the system. To see why this is so, let us consider the paradox of omniscience discussed by Brams in several works (1980; 1981). We focus on a two-person CG in which Row is an "ordinary" player, whereas Column is an omniscient player in the sense that he knows the choice of Row before making his own choice (Fig. 7).
                 Column
                 C        D
Row      C      3,3      2,4
         D      4,2      1,1
Fig. 7.

Now suppose that Row knows of the omniscience of Column. Under this assumption Row certainly has an advantage over Column, since by choosing D he can rest assured that omniscient Column will not choose D but is hijacked into choosing C. Hence the result is (4,2), bringing the maximum benefit to the opponent of the omniscient player. This has been called the paradox of omniscience.

It is noteworthy that in addition to assuming that one of the players is omniscient and that the other knows this, we have to assume that the game is a CG. There are, however, other games, altogether six out of the 78 different 2 x 2 games (Rapoport & Guyer 1966; Brams 1980) to be exact, in which the paradox of omniscience can occur. In fact it can occur in all such games in which the non-omniscient player's best outcome coincides with the omniscient player's second or third best outcome and the other outcome on the same row (assuming that Column is the omniscient player) is the third best or worst for the omniscient player.

It is straightforward to show that neither AG nor PD is among the six games in which the paradox of omniscience can occur. In AG the best outcome for the non-omniscient player occurs simultaneously with the best outcome for the omniscient one. Therefore, no paradox can ensue. Similarly in PD the non-omniscient player may commit himself to the choice of D, but this will necessarily be accompanied by the D-choice of the omniscient opponent. Thus, neither in AG nor in PD can the paradox of omniscience possibly occur.

In the hybrid PD-CG discussed above (Fig. 4) we observe the paradox of omniscience if Column, who is playing CG, is omniscient. Row, who plays PD, can then hijack Column into a (4,2) outcome. It is easy to check that the paradox does not occur if the PD-player is assumed to be omniscient. Similarly, it can be seen that there is no possibility of the paradox in the game where the PD-player faces the AG-player (Fig. 5). The same holds for the CG-AG combination game (Fig. 6).

The effect of omniscience is identical with the effect of restructuring CG so that the moves are made consecutively and not simultaneously. Being omniscient so that one's opponent also knows this means effectively that one makes the last choice in the game. The same advantage for the non-omniscient player results from an irrevocable commitment to the D-choice. As Schelling (1960) has shown, one can benefit from such a commitment, especially if it is followed by the cutting of all information channels from one's opponent's side. This seems to work particularly well in CG.
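Since omniscience that is known to the opponent amounts to moving last, the outcome it induces can be computed by a two-step backward induction. The sketch below is an illustration under our own assumptions: the helper names are hypothetical, and the `paradox` flag is merely a shorthand for "the non-omniscient player obtains his best payoff 4 while the omniscient player does not", which is narrower in intent than the general condition on the 78 games quoted above.

```python
MOVES = ("C", "D")

def game(cc, cd, dc, dd):
    """2x2 game as {(row_move, col_move): (row_payoff, col_payoff)}."""
    return {("C", "C"): cc, ("C", "D"): cd, ("D", "C"): dc, ("D", "D"): dd}

GAMES = {
    "PD": game((3, 3), (1, 4), (4, 1), (2, 2)),
    "CG": game((3, 3), (2, 4), (4, 2), (1, 1)),
    "AG": game((4, 4), (1, 2), (2, 1), (3, 3)),
    "PD-CG (Fig. 4)": game((3, 3), (1, 4), (4, 2), (2, 1)),
    "PD-AG (Fig. 5)": game((3, 4), (1, 2), (4, 1), (2, 3)),
    "AG-CG (Fig. 6)": game((4, 3), (1, 4), (2, 2), (3, 1)),
}

def induced_outcome(g):
    """Outcome when Column is omniscient and Row knows it.

    Column best-responds to whatever Row chooses; Row anticipates this
    and picks the move whose induced cell is best for himself.
    """
    def column_reply(r):
        return max(MOVES, key=lambda c: g[(r, c)][1])
    best_r = max(MOVES, key=lambda r: g[(r, column_reply(r))][0])
    return g[(best_r, column_reply(best_r))]

def swap_roles(g):
    """Same game with the players exchanged, so the old Row becomes Column."""
    return {(c, r): (pc, pr) for (r, c), (pr, pc) in g.items()}

def paradox(g):
    """Shorthand flag: non-omniscient Row gets 4, omniscient Column gets less."""
    row_payoff, col_payoff = induced_outcome(g)
    return row_payoff == 4 and col_payoff < 4

if __name__ == "__main__":
    for name, g in GAMES.items():
        print(name, induced_outcome(g), paradox(g))
    # Fig. 4 with the PD-player as the omniscient one:
    print(induced_outcome(swap_roles(GAMES["PD-CG (Fig. 4)"])))
```

With this shorthand the flag is raised for CG and for the PD-CG hybrid with an omniscient CG-player, and for none of the other cases, which agrees with the discussion above. The same function also gives the outcome of the consecutive-move versions of the games in which Row chooses first.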
The paradox of omniscience seems to point to an absolute limit beyond which one can no longer benefit from intelligence information. As such the paradox gives a powerful instrument to the player who finds himself in a seriously disadvantaged position in so far as the gathering of intelligence information is concerned. But in games of strategy all things hang together. So, the omniscient player may, if he sees the possibility of being hijacked into the second worst outcome, give the opponent the impression that he is not in the end playing CG at all, but AG. But surely this is not enough to convince the non-omniscient player to switch to C. What is called for is an assurance that the latter player is playing an AG and not CG. What could possibly
be easier would be to try to convince the non-omniscient player that one is not omniscient after all. Be that as it may, the paradox of omniscience shows that intelligence information may be a destabilizing factor rather than a stabilizing one.

NONMYOPIC EQUILIBRIA

Much of game theory centers around equilibrium-related notions. In 2 x 2 games the traditional equilibrium concept is that of the Nash equilibrium, which we have already used in the preceding. An outcome is a Nash equilibrium if and only if neither player can unilaterally benefit from departing from it, assuming that his opponent sticks to his choice. In a sense this concept is very plausible, especially in one-shot games. However, when sequential games are considered one could argue for other equilibrium concepts that take into account the wider prospects of players following from move-countermove sequences over time.

Let us consider the nonmyopic equilibria recently introduced by Brams and Wittman (1982). The concept is defined with respect to what is called a supergame, which consists of an infinite number of constituent games in normal form. The payoff matrix of each constituent game is the same, e.g. PD, AG or what not. The players are assumed to make their strategy choices in alternating fashion so that at time t Row makes his choice and moves the outcome from the payoff matrix cell where it was after Column has made his choice at t-1. Alternatively, Row (or Column) may terminate the game by not making any choice when his turn comes. The basic idea of nonmyopic equilibrium is that the players trace the consequences of their choices all the way, i.e. for all possible move-countermove sequences that can ensue.
An outcome $(a_1, a_2)$ is a nonmyopic equilibrium for Row (Column) if and only if all continuations of the supergame result in final outcomes $(x_1, x_2)$ such that $a_1$ ($a_2$) is strictly preferred to $x_1$ ($x_2$) by Row (Column). An outcome is a nonmyopic equilibrium if and only if it is a nonmyopic equilibrium for both Row and Column. A nonmyopic equilibrium thus captures the intuitively plausible idea that rational players will not start processes that will eventually result in worse outcomes than the prevailing one.

Let us now see if this new equilibrium concept sheds any light on our hybrid games. In the PD-CG situation (Fig. 4) there is one nonmyopic equilibrium, viz. the (4,2) outcome. To see that this is the case suppose that Column switches to D so that the outcome (2,1) would ensue. But Row would certainly have no incentive to switch to C to come up with a (1,4) outcome, because the game would then be terminated, there being no way in which Column could improve upon payoff 4. Hence, Row would terminate the game at (2,1). But this is worse than (4,2) from Column's viewpoint. Hence (4,2) is a nonmyopic equilibrium for Column. It is also a nonmyopic equilibrium for Row, as it guarantees him the maximum payoff. Since there are no other nonmyopic equilibria in this game, we observe that the Nash equilibrium and the nonmyopic equilibrium coincide in this game (which, incidentally, was used by Brams and Wittman (1982) as a model of the State-Solidarity conflict in Poland).

In the PD-AG game (Fig. 5) we have a similar situation, to wit, there is one nonmyopic equilibrium giving Row 3 and Column 4. Obviously Column has no incentive to depart from (3,4), and Row cannot expect Column to settle for (4,1). Instead (2,3) would be the likely result if Row were to depart from (3,4) by choosing D. Clearly 3 is better than 2 for Row and, hence, a nonmyopic Row would stick to C. Consequently, (3,4) is a nonmyopic equilibrium. It is the only one in this game and, interestingly enough, does not coincide with the Nash equilibrium (2,3). (This seems to contradict Brams' and Wittman's claim that in all 2 x 2 games with such nonmyopic equilibria that one player obtains his best payoff while the other does not, the nonmyopic equilibrium coincides with the pure-strategy Nash equilibrium.)

Turning finally to the AG-CG situation (Fig. 6) we observe that there is a nonmyopic equilibrium, viz. (4,3), which is unique. In this game there are no Nash equilibria. The fact that (4,3) is a nonmyopic equilibrium can be seen from the following.
Row's best outcome coincides with (4, 3) and, therefore, he has no incentive to depart therefrom. Column, on the other hand, would be willing to move to (1, 4), but then Row would take the outcome to (3, 1), whereupon Column's C-choice would end up with (2, 2), clearly an outcome inferior to (4, 3) for Column. Hence, (4, 3) is a nonmyopic equilibrium.

Brams and Wittman (1982) show that the concept of nonmyopic equilibrium does indeed suggest new solutions to important classes of games. For our purposes it is interesting to notice that there are two nonmyopic equilibria in PD, viz. the (2, 2) and (3, 3) outcomes.
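The equilibria of PD and CG can be checked mechanically. The following Python sketch is an illustration only and is not the procedure of Brams and Wittman (1982) itself. It assumes the standard ordinal payoff matrices for PD and CG (4 = best, 1 = worst), and it assumes one reading of the sequential rules: the players alternate in switching their own strategy, play stops as soon as a player declines to move, and a fourth move returns play to the starting outcome. All function and variable names are hypothetical.

```python
from itertools import product

# Strategies: 0 = C (cooperate), 1 = D (defect).
# game[(row_strategy, col_strategy)] = (row_rank, col_rank); 4 = best, 1 = worst.
# Assumed standard ordinal matrices, not copied from the paper's figures.
PD = {(0, 0): (3, 3), (0, 1): (1, 4), (1, 0): (4, 1), (1, 1): (2, 2)}
CG = {(0, 0): (3, 3), (0, 1): (2, 4), (1, 0): (4, 2), (1, 1): (1, 1)}

ROW, COL = 0, 1


def switch(state, player):
    """Return the state reached when `player` changes his own strategy."""
    s = list(state)
    s[player] = 1 - s[player]
    return tuple(s)


def pure_nash(game):
    """Outcomes that neither player can improve by one unilateral switch."""
    result = []
    for state in product((0, 1), repeat=2):
        if all(game[state][p] >= game[switch(state, p)][p] for p in (ROW, COL)):
            result.append(game[state])
    return sorted(result)


def final_state(game, state, mover, moves_made=0):
    """Backward induction: the state where play ends if `mover` decides at `state`."""
    nxt = switch(state, mover)
    if moves_made == 3:
        # The fourth switch closes the cycle at the starting state.
        after_move = nxt
    else:
        after_move = final_state(game, nxt, 1 - mover, moves_made + 1)
    # The mover departs only if the eventual outcome is strictly better for him.
    if game[after_move][mover] > game[state][mover]:
        return after_move
    return state


def nonmyopic_equilibria(game):
    """Outcomes that neither player, when given the first move, chooses to leave."""
    result = []
    for state in product((0, 1), repeat=2):
        if all(final_state(game, state, p) == state for p in (ROW, COL)):
            result.append(game[state])
    return sorted(result)


if __name__ == "__main__":
    for name, game in (("PD", PD), ("CG", CG)):
        print(name, "Nash:", pure_nash(game), "nonmyopic:", nonmyopic_equilibria(game))
```

Under these assumptions the routine reports (2, 2) and (3, 3) as the nonmyopic equilibria of PD, whereas (2, 2) is its only pure-strategy Nash equilibrium. Applying the same functions to a hybrid game only requires a payoff table that combines Row's ranks from one game with Column's ranks from the other.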
In CG, on the other hand, there is just one nonmyopic equilibrium, and that is (3, 3). So in both PD and CG the cooperative outcome is a nonmyopic equilibrium. In PD, however, the other nonmyopic equilibrium, which also coincides with the unique Nash equilibrium, the (2, 2) outcome, is absorbing in the sense that departures from the (4, 1), (1, 4) and (2, 2) outcomes all return to (2, 2), whereas if the starting outcome is (3, 3), the process of sequential moves does not depart from it if the players are nonmyopic. In CG the unique nonmyopic equilibrium differs from the Nash equilibria, thus showing that we are dealing with an essentially different solution concept. It is easy to see that in AG there is only one nonmyopic equilibrium, viz. (4, 4), whereas there are two Nash equilibria, (4, 4) and (3, 3). Again long-run considerations speak in favour of the cooperative outcome.

Does this then mean that those who have been worried about the emergence of PD- or CG-like situations in the real world have simply misunderstood the whole issue by looking at it from a too restrictive, myopic rationality angle? Not necessarily. The nonmyopic equilibria, although in many respects intuitively more plausible than the traditional equilibrium concepts, are defined for rather restricted classes of games. One could argue, for instance, that the well-ordered move-countermove sequence with no backtracking and no possibility of skipping one's turn to move is simply not an adequate model of the strategic intricacies of the world. But surely the equilibrium concept per se is more realistic than the myopic concepts. The nonmyopic equilibrium provides a hypothetical explanation for why so many PDs do not end up with (2, 2) Nash equilibria or why CGs often result in (3, 3) solutions. This prompts the following more general observation concerning the interaction of the model-builder and the system he is trying to comprehend.
While the way in which real-world games are seen by the players obviously has some effect on the outcomes and should, therefore, be built into the models themselves, the concept of a rational actor should also be given a thorough analysis. In this section we have seen how a modification of the rationality concept defines new equilibrium outcomes. From the viewpoint of conflict resolution the message is perhaps too obvious to be stated: the actors should take into account the nonmyopic consequences of their actions in their deliberations. Once again we notice that PD is inherently a more difficult game to resolve in a cooperative way than CG or AG: while in CG and AG the cooperative CC-outcome is the only nonmyopic equilibrium, in PD the CC-outcome, although a nonmyopic equilibrium, is reached only if the process starts from it.

PROBLEMS OF INSTITUTIONAL DESIGN

Perhaps the most straightforward way of enhancing international stability is to design institutions to settle conflicts in an orderly and peaceful way. This would in fact be equivalent to turning the noncooperative games into cooperative ones by submitting one's decision-making power to a larger collectivity of, say, representatives of states. We know that such a surrender of power to transnational bodies has not in practice occurred in matters of major national interest. And yet in the literature one encounters accounts explaining the emergence of the state by the need to overcome the Pareto-suboptimal Nash equilibrium deadlock in a PD-like "original" (i.e. stateless) situation. Why doesn't a similar need elicit a similar response on the transnational level? The preceding section provides a partial answer: PD-like situations can result in (3, 3) nonmyopic equilibria without the help of an outside contract-enforcer (the state). Similarly, Taylor's (1976) supergame solution suggests the possibility of ending up with (3, 3) equilibria in games consisting of an infinite number of PDs played in a sequence. A crucial determinant of Taylor's solution is the rate at which future payoffs are discounted by the players. Smale's idea of a good strategy, which we touched upon earlier, also aims at explaining why (3, 3) could be the outcome rational players end up with. But surely a transnational body handling conflicts would make things more stable by effectively excluding downright confrontations.
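The role of the discount rate in the supergame argument can be illustrated with a standard textbook calculation, which is not Taylor's own derivation. Assume, purely for illustration, that the ordinal ranks of PD are treated as cardinal payoffs ($T = 4$ for unilateral defection, $R = 3$ for mutual cooperation, $P = 2$ for mutual defection), that each player weights the payoff of the next round by a discount factor $\delta$ with $0 < \delta < 1$, and that both players cooperate until the first defection and defect forever afterwards. Cooperating in every round is then at least as good as defecting once if

```latex
\[
\frac{R}{1-\delta} \;\ge\; T + \frac{\delta P}{1-\delta}
\quad\Longleftrightarrow\quad
\delta \;\ge\; \frac{T-R}{T-P} \;=\; \frac{4-3}{4-2} \;=\; \frac{1}{2}.
\]
```

Under these assumed numbers the (3, 3) outcome can be sustained in every round only by players who value the next round at least half as much as the current one. More impatient players revert to the (2, 2) deadlock.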
Why then do the main actors in the international arena reject the idea of a world government in issues of national security? This question is too complex to be answered even by experts in international politics. Let us, therefore, recast it in a form that is slightly more accessible: do we know enough about the properties of social institutions to be able to design a good transnational body? This, I think, is the crucial question to be addressed by expert groups. A lot of the social choice literature would seem to suggest that the answer has to be "no". We have all kinds of impossibility results, from Arrow (1963) and Sen (1970) to the theorems of Gibbard (1973),
Satterthwaite (1975), McKelvey (1979) and Schofield (1981). All these theorems tell of the difficulty of designing good preference aggregation procedures. The results of McKelvey and Schofield in particular can be interpreted as saying that the simple majority principle is typically entirely chaotic in multi-dimensional Euclidean policy spaces where all points represent alternatives and where all voters have well-defined continuous utility functions. The absence of a core alternative implies that any alternative can be rendered the majority rule winner. Surely "the will of the people" is not reflected by such a decision method. In fact, the agenda-setter determines the outcome. The Gibbard-Satterthwaite theorem, on the other hand, has often been invoked by those who want to emphasize the ubiquity of strategic manipulation possibilities.

In his recent book Riker (1982) argues that the two sets of results just mentioned justify the conclusion that the concept of democracy needs a reinterpretation in which no reference is made to the quality of the decision output. One does not have to accept Riker's view to notice that the negative results present a major challenge to efforts to design social institutions. If equilibria are as rare and strategic manipulation possibilities as ubiquitous as Riker argues, then surely the case for creating transnational collective bodies with extensive powers over their constituent members becomes a lot weaker. But the results on the paucity of equilibria and on the generic manipulability of social choice mechanisms are of questionable validity when it comes to real-world social institutions.
The emptiness of cores in multi-dimensional real spaces has been shown to be a generic property, but it is not at all the case that this emptiness typically characterizes situations involving only a finite number of alternatives (see Nurmi 1980, Shepsle 1979). Also, the notion of equilibrium underlying the concept of the core is perhaps somewhat too myopic to be really relevant. Other solution concepts have, indeed, been introduced in the literature, such as the minimax set of Kramer (1977). These are not, however, equilibrium concepts in the usual sense. Therefore, they do not have the normative importance of equilibrium concepts, even though their explanatory and predictive power in experiments may be considerable. Nevertheless, the relevance of the results stating the general absence of equilibria is not obvious.
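The agenda-setter's power in the absence of a core alternative, and the contrast with a finite profile that does have a core, can be illustrated with three voters and three alternatives. The Python sketch below is a minimal illustration. The two preference profiles and all function names are assumptions made for the example and are not taken from the works cited.

```python
# Each ranking lists the alternatives from best to worst for one voter.
CYCLIC = [("x", "y", "z"), ("y", "z", "x"), ("z", "x", "y")]
WITH_CORE = [("x", "y", "z"), ("x", "z", "y"), ("y", "z", "x")]


def majority_prefers(a, b, profile):
    """True if a strict majority of voters rank a above b."""
    votes_for_a = sum(r.index(a) < r.index(b) for r in profile)
    return votes_for_a > len(profile) / 2


def core(alternatives, profile):
    """Alternatives that are not beaten by any other alternative under majority rule."""
    return [a for a in alternatives
            if not any(majority_prefers(b, a, profile)
                       for b in alternatives if b != a)]


def amendment_winner(agenda, profile):
    """Sequential pairwise voting: the survivor meets the next item on the agenda."""
    winner = agenda[0]
    for challenger in agenda[1:]:
        if majority_prefers(challenger, winner, profile):
            winner = challenger
    return winner


if __name__ == "__main__":
    print("core (cyclic profile):", core("xyz", CYCLIC))
    for agenda in (("x", "y", "z"), ("y", "z", "x"), ("z", "x", "y")):
        print("agenda", agenda, "->", amendment_winner(agenda, CYCLIC))
    print("core (second profile):", core("xyz", WITH_CORE))
```

In the cyclic profile the core is empty and each of the three alternatives wins under some voting order, so whoever fixes the order fixes the result. In the second profile x beats both rivals and the order of voting no longer matters.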
Similarly, the ubiquity of strategic manipulation possibilities is based on an idea of manipulation that presupposes more or less perfect knowledge of the decision-making body. Surely a more relevant picture of the seriousness of the manipulation problem emerges from a study of the types of manipulability along with the ease of manipulation (Nurmi 1983b). There are definitely varying degrees of manipulability in different voting systems when measured by the amount of information one needs in order to manipulate successfully. Some systems are also subject to more numerous types of manipulability than others. For instance, some systems are vulnerable to what is called the sincere truncation of preferences (Brams 1982), others allow for more possibilities of agenda manipulation, and so on. All this suggests that even though none of the current voting systems is perfect, there are definitely systems that fare better than others with respect to a set of criteria of goodness which includes strategic and other theoretical properties (Nurmi 1983a; 1983b).

The problems of designing good social institutions are not made any easier if one notices that before the social choice principle is agreed upon, another equally if not more difficult problem has to be solved, viz. that of guaranteeing the participants a fair representation (see Balinski & Young 1982). Without going into the details of this problem, one could just point to the difficulty of assigning seats to parties in proportional representation systems. Even perfect proportionality, i.e. a% of the seats to a party with a% of the total support, would not guarantee that legislative power is distributed proportionally among the parties. It is simply not the case that a party with a% of the parliamentary seats controls a% of the legislation. But how can one measure legislative power?
Several non-equivalent indices have been proposed, but there seems to be no general answer to this question (see Holler 1981).

A further problem, which actually connects social choice theory to the theory of representation, is our habitual way of thinking of representation in terms of the plurality principle. For instance, in proportional representation systems the principle of proportionality is defined with respect to the voters' first preferences only. And yet it is well known that the voters could be asked many other things besides their first preference. In designing social institutions the possibilities of a
more general view of the representation-choice linkage could be useful.

Even though there are many central problems involved in the design of social institutions that would increase international stability, I dare to conclude this section with a couple of recommendations for the establishment of such bodies. Some of the recommendations are based on what is known from research in this field, others are more conjectural:

1. The number of alternatives upon which the vote is to be taken should be as small as possible, preferably two. However, in eliminating alternatives one should not use elimination procedures that fail to satisfy preference monotonicity or the Condorcet criteria. Rather, one should allow bargaining processes to take place so that the voters could decide unanimously which alternatives are to be voted upon.

2. In case the number of alternatives considered is larger than two, one should look for systems that have as many intuitively desirable properties as possible, like Condorcet-satisfaction, monotonicity, Pareto-optimality, consistency and so on. Which weights are given to the satisfaction of the various criteria depends on the kinds of issues dealt with as well as on the ideal one has of social choice.

3. In allocating voting weights to the various participants one should consider not only their "size" as measured by, say, population and/or GNP, but also the decision rule used in making collectively binding decisions. It is obvious that certain rules favour participants with certain voting weights and are disadvantageous to others, etc. In guaranteeing a reasonably fair representation one could vary both the total voting weight of the body and the decision rules.
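The gap between seat shares and legislative power, and the dependence of power on the decision rule noted in the third recommendation, can be made concrete with one of the indices found in the literature, the normalized Banzhaf index. The Python sketch below is a minimal illustration. The three-party seat distribution and the two quotas are invented for the example, the function name is hypothetical, and the quota is assumed not to exceed the total weight.

```python
from itertools import combinations


def banzhaf(weights, quota):
    """Normalized Banzhaf index: each member's share of all swings.

    A member has a swing in a winning coalition if the coalition
    falls below the quota once that member withdraws.
    """
    n = len(weights)
    swings = [0] * n
    for size in range(1, n + 1):
        for coalition in combinations(range(n), size):
            total = sum(weights[i] for i in coalition)
            if total < quota:
                continue  # losing coalition, nobody is critical
            for i in coalition:
                if total - weights[i] < quota:
                    swings[i] += 1
    all_swings = sum(swings)
    return [s / all_swings for s in swings]


if __name__ == "__main__":
    seats = [50, 49, 1]  # hypothetical 100-seat body
    print("simple majority (51):", banzhaf(seats, 51))
    print("two-thirds rule (67):", banzhaf(seats, 67))
```

With a quota of 51 the party holding 49 seats has exactly as much power by this measure as the party holding a single seat, while under the quota of 67 the one-seat party has none and the two large parties are equally powerful. Other indices may rank the same parties differently, which is in line with the remark that the proposed indices are non-equivalent.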
CONCLUSION

The theory of reconciliation has in the preceding been approached by pointing to some obvious situations in which already existing game-theoretical results can be of use for processes aiming at stable solutions in situations involving a conflict of interest. We observed the importance of perceptions in determining the stable outcomes in game-like situations. Just as hybrid decision-making methods may have unexpected new properties, so the combined games can have properties that are new vis-a-vis the "parent" games. The role of information in ascertaining the type of game one's opponent is playing is a quite crucial strategic
characteristic. Curiously enough, taken to the extreme, the knowledge of one's opponent's aims and choices may turn against the possessor of such knowledge. This is the case in CG, for example. So the theory of reconciliation should take account of the perceptions and information of the players.

But also the basic praxeological concept of rational choice may need closer scrutiny. In one-shot games there is very little one can say against the Nash equilibrium. But in sequential games there are grounds for arguing for a less myopic equilibrium concept. The nonmyopic equilibria touched upon above provide new insights into dynamic games, especially AG, CG and PD and their hybrid forms. When looked upon from the nonmyopic rationality perspective, the chances for cooperative solutions are considerably greater than from the myopic rationality angle.

The idea of some kind of world government in security matters underlies many efforts to attain greater stability in the international arena. However, the knowledge we have of social institutions is largely negative in nature. We know that present-day voting bodies use decision methods that have several undesirable properties. Similarly, the criteria of an adequate and just representation often conflict with other goals regarded as important. Despite these findings there definitely are better and worse voting and representation systems. Awareness of the limitations of social institutions is the first step towards their prudent use and further development.

REFERENCES

Arrow, K. J. (1963). Social Choice and Individual Values, 2nd ed., New York: Wiley.
Balinski, M. L. and Young, H. P. (1982). Fair Representation, New Haven: Yale University Press.
Bennett, P. G. (1980). Hypergames: developing a model of conflict, Futures 12, 489-507.
Bonacich, P. (1970). Putting the dilemma back into prisoner's dilemma, Journal of Conflict Resolution 14, 379-387.
Brams, S. J. (1980). Mathematics and theology: game-theoretic implications of God's omniscience, Mathematics Magazine 53, 277-282.
Brams, S. J. (1981). Omniscience and omnipotence: how they may help - or hurt - in a game, Meeting of the American Political Science Association, September 3-6.
Brams, S. J. (1982). The AMS nomination procedure is vulnerable to truncation of preferences, Notices of the American Mathematical Society 29, 136-138.
Brams, S. J. and Wittman, D. (forthcoming 1982). Nonmyopic equilibria in 2 x 2 games, Conflict Management and Peace Science.
Gibbard, A. (1973). Manipulation of voting schemes, Econometrica 41, 587-601.
Holler, M. J. (ed.) (1981). Power, Voting, and Voting Power, Würzburg: Physica-Verlag.
Kramer, G. H. (1977). A dynamical model of political equilibrium, Journal of Economic Theory 16, 310-334.
McKelvey, R. D. (1979). General conditions for global intransitivities in formal voting models, Econometrica 47, 1085-1112.
Nurmi, H. (1977). Ways out of the prisoner's dilemma, Quality and Quantity 11, 135-165.
Nurmi, H. (1980). Majority rule: second thoughts and refutations, Quality and Quantity 14, 743-765.
Nurmi, H. (1983a). Voting procedures: a summary analysis, British Journal of Political Science 13, 181-208.
Nurmi, H. (1983b). On the strategic properties of some modern methods of group decision making, International Scientific Seminar on Modern Methods in Decision Making, Sofia, 6th-10th June.
Rapoport, A. and Guyer, M. (1966). A taxonomy of 2 x 2 games, General Systems 11, 203-214.
Riker, W. H. (1982). Liberalism against Populism, San Francisco: W. H. Freeman.
Said, A. K., Hartley, D. A. and Bennett, P. G. (July 1981, mimeographed). Hypergame models for decision planning.
Satterthwaite, M. A. (1975). Strategy-proofness and Arrow's conditions, Journal of Economic Theory 10, 187-217.
Schelling, T. C. (1960). The Strategy of Conflict, Cambridge: Harvard University Press.
Schofield, N. (March 1981). Equilibrium and cycles in voting compacta, Essex Economic Papers No. 173.
Sen, A. K. (1970). Collective Choice and Social Welfare, London: Boyden.
Shepsle, K. (1979). Institutional arrangements and equilibrium in multidimensional voting models, American Journal of Political Science 23, 27-59.
Taylor, M. (1976). Anarchy and Co-operation, London: Wiley.
Taylor, M. and Ward, H. (1981). Chicken, whales and lumpy goods: alternative models of public goods provision, Political Studies Association: Annual Meeting.