Games and Economic Behavior 35, 339–348 (2001), doi:10.1006/game.2000.0806, available online at http://www.idealibrary.com
Equilibria in Automated Interactions
Nir Vulkan*
Department of Economics, University of Bristol, 8 Woodland Road, Bristol BS8 1TN, United Kingdom
Received April 30, 1998; published online January 24, 2001
Automated agents are increasingly being used to interact on behalf of individuals and organizations in electronic markets. Agents are used, for example, to explore the set of possible contracts and to trade in communication bandwidth. Users choose agents which are then matched on various locations (hosts). Hosts will normally check the agents (i.e., the code) to verify that it is compatible with the communication protocol. In this paper we explore the equilibrium outcomes of such interactions. We show that users will choose agents who reveal parts of their code in order to coordinate on welfare improving outcomes which otherwise would not have been supported in equilibrium. Journal of Economic Literature Classification Numbers: C72, D83. © 2001 Academic Press
Key Words: e-commerce; automated interactions; equilibrium.
*E-mail: N.[email protected]. I thank Ian Jewitt, In-Uck Park, Sjaak Hurkens, and Dov Monderer for helpful comments. Financial support from the EPSRC Award GR MO7052 is gratefully acknowledged. 0899-8256/01 $35.00. Copyright © 2001 by Academic Press. All rights of reproduction in any form reserved.
1. INTRODUCTION
The Internet (and the World Wide Web in particular) is emerging as one of the most efficient media for carrying out business-to-consumer and business-to-business transactions. New companies, like Amazon.com or Microsoft's Expedia, which specialize in sales over the Web, are emerging as key players in their industries. On-line sales are growing rapidly. Large organizations use the Internet for much of their trading with suppliers. In the last year alone, General Electric's turnover of Internet based trade was in excess of $1 billion, twice the total of world-wide Internet sales in the same year.
These figures are likely to increase further because of advances in agent technology. In this framework, individuals and organizations interact via the network using software agents. A software agent is a program that acts independently on behalf of its user and in its interests. The main characteristic of these interactions is that the user delegates the authority to search, match, and even to transact business to the agent. Automated agents are increasingly being used for scheduling tasks, resource allocation, automated negotiations, and on-line auctions. This trend is supported by growing standardization of communication infrastructures over which different organizations can interact and safely carry out transactions, and by a growing number of applications where agents either significantly reduce negotiation overheads, or even allow for negotiations which otherwise would not have been possible, hence increasing the feasible set of possible agreements and the overall efficiency of the system.
For example, agents can be used to trade communication bandwidth in double auction markets which clear every few seconds (see Vulkan and Preist, 1999). Another example is ADEPT (Advanced Decision Environment for Process Tasks), a system which views the management decision making process as a process of negotiation between various self-interested entities. ADEPT is successfully being used by British Telecom in order to reduce negotiation overheads and to increase the efficiency of decision making within the organization. In this system agents represent the different departments, for example the legal department or the sales department, involved in the process of providing quotes and tariffs for consumers. The system allows for both external and internal negotiations, and imposes only minimal restrictions on the negotiation process (see Vulkan and Jennings, 2000, and the references therein, for more details). Other examples include MIT's Kasbah, a trading system where users create simple buying and selling agents which then interact in an electronic marketplace (see Chavez and Maes, 1996), and an agent-based double auction marketplace for electronic components operated by the American company Fastparts (see http://www.fastparts.com/).
As already mentioned, agent-based markets for communication bandwidth, currently being developed by several organizations such as Band-X and RateXchange, seem likely to become the central mechanism for trading in these resources.
Since the outcomes of these interactions depend on the behavior of all agents, the situation can be described as a game. It is therefore not surprising that game theory is commonly used as the framework for models of automated interactions. In particular, the game theoretical notion of players as preprogrammed algorithms seems more suitable for software agents than it is for humans (a point recognized by Rosenschein and Zlotkin, 1994, who pioneered the use of game theory for automated interactions). Still, software agents differ from their human counterparts in several important respects, and the current challenge for researchers in this field is to find out which of the insights offered by game theory can be useful for the design of automated agents and the environments where
they interact. To do that, existing models have to be adapted to the specifics of such problems. Moreover, some of these applications present us with a new set of problems which have not yet been studied in sufficient detail by economists or game theorists (see Vulkan, 1999, for a discussion of these issues).
New technologies are emerging based on game-theoretical analysis of such automated interactions. One such example is one-to-one negotiation between automated agents facing strict deadlines (which are private information) by which they must reach an agreement. The large literature on bargaining does not offer much insight into this problem. It was addressed by Sandholm and Vulkan (1999), who found a simple and efficient mechanism which resolves these negotiations in such a way that it becomes optimal for self-interested agents to truthfully report their deadlines. In a somewhat different context, Sandholm and Lesser (1996) have devised a technology known as level commitment contracts. This technology allows agents to specify penalties for unilateral decommitting from agreements, expanding the set of possible agreements, and hence increasing the efficiency of such negotiations.
Even in situations already studied by economists, automated negotiations may lead to different outcomes. One reason is the fact that agents, and especially mobile agents (i.e., agents which migrate between hosts), will need to be checked by the host in order to ensure that the agent's code is harmless (i.e., it is not a virus) and is compatible with the communication protocol.
For example, an agent bidding in an English auction can safely reveal its code to the host, encrypting only its user's reservation price:
    Begin strategy:
      If ((current bid) < (encrypted reservation price) and (auction continues))
      then (bid = current bid + constant)
    End strategy
If agents are designed to always interact under the same conditions and always on the same host, then no such checks are required. In fact, in many existing agent-based applications, users receive their agents directly from the host (normally in the form of a Java script), and key in the relevant information. However, the current trend is toward applications which allow for participation of any agents compatible with the communication protocol. In this framework users choose or design agents to interact on their behalf on a number of potential hosts, and under varying circumstances. In other words, users will not know in advance exactly when and where agents are matched. A computerized agent in such a setting is likely to use a case-base memory which will allow it to pursue (possibly)
different strategies in different cases. Specifically, agents can credibly demonstrate to one another how their behavior depends (or not) on the identity of the host:
    Begin strategy:
      If (host 1) then (encrypted)
      else if (host 2) then (encrypted)
      ......
      else if (host n) then (encrypted)
    End strategy
or:
    Begin strategy:
      If (hosts 1, 2, ..., n-1) then (encrypted)
      else if (host n) then (encrypted)
    End strategy
and so on.
In this paper we consider the implications for the set of possible equilibrium contracts of the fact that agents may be able to credibly reveal parts of their behavioral strategy to each other. We show that the set of outcomes supported by these types of interactions is typically larger than when codes cannot be revealed. Moreover, we show that if agents can choose whether or not to reveal their code, then this will occur in equilibrium only if the outcome is welfare improving compared to the equilibrium outcomes of the game where code revelation is not possible. We provide a simple two-agent example which demonstrates how agents can coordinate on actions which otherwise would have been strictly dominated.
To demonstrate this point, we consider the following stylized example: There are several providers of bandwidth. Agents representing bandwidth consumers (for example, Internet service providers) are randomly matched to a provider by the market clearing protocol used by the double auction server (as in Band-X or RateXchange). We reduce their choices of consumption pattern to only two strategies: "greedy" and "cooperative." Unless hosts have specific mechanisms in place to induce cooperation, agents are locked in a version of the Prisoner's Dilemma, where the "greedy" consumption pattern is dominant. However, some hosts penalize "greedy" bandwidth consuming agents in order to resolve this dilemma. Sufficiently high penalties are imposed to ensure that the "cooperative" strategy becomes a best response to itself.
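The effect of a host-imposed penalty can be illustrated with a minimal sketch. The payoff numbers and the penalty size below are hypothetical, chosen only so that the unpenalized game is a Prisoner's Dilemma; the function names are likewise illustrative and do not come from any system discussed here. The sketch enumerates pure-strategy Nash equilibria before and after the penalty is applied.

```python
from itertools import product

ACTIONS = ("c", "g")  # cooperative, greedy

# HYPOTHETICAL Prisoner's Dilemma payoffs (row agent, column agent).
BASE = {
    ("c", "c"): (4, 4),
    ("c", "g"): (0, 5),
    ("g", "c"): (5, 0),
    ("g", "g"): (3, 3),
}


def with_penalty(payoffs, penalty):
    """Copy of the game in which the host subtracts `penalty` from each greedy agent."""
    return {
        (a1, a2): (u1 - penalty * (a1 == "g"), u2 - penalty * (a2 == "g"))
        for (a1, a2), (u1, u2) in payoffs.items()
    }


def pure_nash(payoffs):
    """All pure-strategy Nash equilibria of a two-agent, two-action game."""
    equilibria = []
    for a1, a2 in product(ACTIONS, repeat=2):
        u1, u2 = payoffs[(a1, a2)]
        row_ok = all(u1 >= payoffs[(d, a2)][0] for d in ACTIONS)
        col_ok = all(u2 >= payoffs[(a1, d)][1] for d in ACTIONS)
        if row_ok and col_ok:
            equilibria.append((a1, a2))
    return equilibria


print(pure_nash(BASE))                   # host without a penalty
print(pure_nash(with_penalty(BASE, 4)))  # host with an assumed penalty of 4
```

With these assumed numbers, greedy play is the only equilibrium on the host without a penalty, and cooperative play is the only equilibrium once the penalty is large enough.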
If codes cannot be revealed, agents will always be greedy when matched on a host which does not reward cooperation. However, if codes are revealed, and if on average cooperation with cooperators is beneficial, then agents can in equilibrium choose not to distinguish between hosts and to always cooperate with each other. The intuition for this is that the choice of host-based strategy is used as a coordination device, signaling the willingness to play a particular continuation equilibrium.
The rest of the paper is organized in the following way: Section 2 formally analyzes the above example. In Section 3 we consider a general framework for modeling automated interactions where users can choose agents who reveal, or do not reveal, their behavioral strategy. We characterize the set of sequential equilibria if agents do not reveal their code, and we give a sufficient condition for code revelation. Section 4 concludes.
2. THE EXAMPLE
Figure 1 shows the payoffs for the two agents as a function of the state (the host), $\omega$. Let $g$ denote the greedy strategy, and $c$ the cooperative strategy. In $\omega_2$, the strategy $g$ strictly dominates $c$, and the game therefore has a unique Nash equilibrium where both players play $g$. In $\omega_1$ the situation is reversed and $(c, c)$ is the only Nash equilibrium. We assume that agents are assigned to $\omega_1$-type hosts with probability 3/8, and to $\omega_2$-type hosts with probability 5/8. We analyze the outcome of the following four-stage game:
Stage I: Users choose agents.
Stage II: Agents are assigned to a host, according to the above probabilities.
Stage III: Each agent observes the nonencrypted part of its opponent's strategy.
Stage IV: Agents simultaneously choose bandwidth consumption patterns.
FIGURE 1
For example, a user can choose an agent which distinguishes between the states $\omega_1$ and $\omega_2$ and which, if matched with a state-distinguishing agent, plays $g$ in $\omega_1$ and $c$ in $\omega_2$, and which plays $c$ in $\omega_1$ and $g$ in $\omega_2$ if matched with a nondistinguishing agent. Let D denote a state-distinguishing agent, and N.D. a nondistinguishing agent. In order to compute the sequential equilibria of the above four-stage game, Fig. 2 lists the continuation equilibria, and the expected payoffs associated with these equilibria, as functions of the users' choice of agents (for example, if both agents can distinguish between states then there is a unique continuation equilibrium where agents cooperate in $\omega_1$ and are greedy in $\omega_2$).
Now a sequential equilibrium of the four-stage game is a Nash equilibrium of the meta-game described in Fig. 2. It is easy to see that in addition to the equilibrium where agents play the dominant strategies in each state, there exists an additional sequential equilibrium of $\Gamma$ where players choose agents who cannot distinguish between states, and cooperate with each other on either type of host (when the continuation equilibrium $(c, c)$ is picked in the subgame (N.D., N.D.)). The existence of this equilibrium crucially depends on agents' ability to credibly reveal the case-base part of their code. Otherwise agents will have incentives to only pretend to not be able to distinguish between states, to become secretly informed about the state, and to play $g$ in $\omega_2$.
In most settings there would be more than one continuation equilibrium for a given choice of agents. Consequently, there would be multiple sequential equilibria corresponding to the Nash equilibria of the meta-games defined by fixing any of the possible continuation equilibria for each choice of agents' structure. In the following section we show how this set of sequential equilibria relates to the equilibria of the underlying games.
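The meta-game construction can be sketched numerically. The payoff tables below are hypothetical stand-ins (they are not the entries of Fig. 1 or Fig. 2); only the probabilities 3/8 and 5/8 are taken from the text. The labels `w1`, `w2`, `D`, and `ND` and all function names are illustrative assumptions. The sketch restricts attention to pure strategies, computes the continuation equilibria for each pair of agent types, and then lists the Nash equilibria of the resulting meta-game when the cooperative continuation is selected wherever it exists.

```python
from fractions import Fraction
from itertools import product

# Host-type probabilities as stated in the text.
PROB = {"w1": Fraction(3, 8), "w2": Fraction(5, 8)}
STATES = ("w1", "w2")

# HYPOTHETICAL payoffs (row agent, column agent): c is dominant in w1, g in w2.
GAMES = {
    "w1": {("c", "c"): (4, 4), ("c", "g"): (0, 1), ("g", "c"): (1, 0), ("g", "g"): (-1, -1)},
    "w2": {("c", "c"): (4, 4), ("c", "g"): (0, 5), ("g", "c"): (5, 0), ("g", "g"): (3, 3)},
}

# A plan lists the action in w1 followed by the action in w2. A distinguishing
# agent (D) may use any plan; a nondistinguishing agent (ND) must play the
# same action in both states.
PLANS = {"D": ("cc", "cg", "gc", "gg"), "ND": ("cc", "gg")}
TYPES = ("D", "ND")


def expected(plan1, plan2):
    """Expected payoffs of both agents for a pair of plans."""
    u = [Fraction(0), Fraction(0)]
    for k, state in enumerate(STATES):
        u1, u2 = GAMES[state][(plan1[k], plan2[k])]
        u[0] += PROB[state] * u1
        u[1] += PROB[state] * u2
    return tuple(u)


def continuation_equilibria(type1, type2):
    """Pure-strategy equilibria of the subgame after both types are observed."""
    found = []
    for p1, p2 in product(PLANS[type1], PLANS[type2]):
        u1, u2 = expected(p1, p2)
        ok1 = all(u1 >= expected(d, p2)[0] for d in PLANS[type1])
        ok2 = all(u2 >= expected(p1, d)[1] for d in PLANS[type2])
        if ok1 and ok2:
            found.append((p1, p2))
    return found


def select(type1, type2):
    """Pick the cooperative continuation when it exists, else the first one found."""
    eqs = continuation_equilibria(type1, type2)
    return ("cc", "cc") if ("cc", "cc") in eqs else eqs[0]


# Meta-game: expected payoffs as a function of the users' choice of agent type.
META = {(t1, t2): expected(*select(t1, t2)) for t1, t2 in product(TYPES, repeat=2)}

META_NASH = [
    (t1, t2)
    for t1, t2 in product(TYPES, repeat=2)
    if all(META[(t1, t2)][0] >= META[(d, t2)][0] for d in TYPES)
    and all(META[(t1, t2)][1] >= META[(t1, d)][1] for d in TYPES)
]

for types, payoffs in META.items():
    print(types, select(*types), payoffs)
print("meta-game equilibria:", META_NASH)
```

Under these assumed payoffs the meta-game has two pure equilibria: both users choose distinguishing agents that play the state-wise dominant actions, or both choose nondistinguishing agents that always cooperate. The second equilibrium relies on the observed type being credible, as argued above.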
Suppose that users can choose whether their agents reveal their nonencrypted code. That is, the game has an additional stage, II.5, where agents choose whether or not to reveal their type (i.e., D or N.D.). Since $c$ is a dominant strategy in $\omega_1$ and $g$ is dominant in $\omega_2$, an agent which does not reveal its type will always become informed and play the corresponding dominant strategy. The only possible continuation equilibrium when at least one of the players picks an agent which does not reveal its code is therefore $(cg, cg)$, that is, both agents play $c$ in $\omega_1$ and $g$ in $\omega_2$. However, the sequential equilibrium where both players pick agents of type N.D. and continue with $c$ remains an equilibrium of the five-stage game. This is because the cooperative equilibrium is welfare improving. In the following section we generalize this intuition.
FIGURE 2
3. THE MODEL
Let $\omega_i$ denote the game specified by the communication protocol used by host $i$. Let $\Omega = \{\omega_1, \omega_2, \ldots, \omega_n\}$ denote the set of games (states). Let $A_j$ be the set of actions for agent $j = 1, 2$. We assume that these sets are constant over all states in $\Omega$ (that is, all games have the same set of strategies). Nature chooses an element in $\Omega$ according to a probability measure $\mu$, where $\mu$ is common knowledge. Players choose, in the first stage of the game, a partition of $\Omega$. Formally, $P = \{P_1, \ldots, P_k\}$ is a partition of $\Omega$ if (i) $P_i \neq \emptyset$ (all $i$), (ii) $P_i \cap P_j = \emptyset$ (all $i \neq j$), and (iii) for all $\omega \in \Omega$ there exists $P_i$ with $\omega \in P_i$. Let $\mathcal{P}$ denote the set of all possible partitions of $\Omega$. Denote by $P^i$ Player $i$'s partition. Once Nature chooses the state, both players receive a (perfect) signal. Player $i$'s belief is determined by $\mu$ conditional on $P^i$.
Formally we define the following four- or five-stage game, $\Gamma$:
Stage I: Users choose agents.
Stage II: Nature chooses $\omega$ (agents are assigned to a host), according to $\mu$.
(Stage II.5: Agents choose whether to make their partition public.)
Stage III: Agents observe public partitions.
Stage IV: Agents choose actions and payoffs are distributed.
An agent in this context is identical to the game-theoretical notion of strategy in the stage-game. If we do not include stage II.5, then an agent for Player $i$ is a pair $s_i = (P^i, z_i)$, where $z_i : P^i \times \mathcal{P} \to A_i$. If stage II.5 is included then an agent is also a pair $s_i = (P^i, z_i)$, but with the difference that now $z_i : P^i \times \{\text{Reveal}, \text{Not-reveal}\} \times \{\mathcal{P} \cup (P^{-i}\ \text{Not Revealed})\} \to A_i$.
Expected payoffs are defined in the following way: Fix the partitions of both players, $P^1$ and $P^2$. Denote by $u_i^{\omega_k}(a_1, a_2)$ Player $i$'s payoff when the pair of actions $(a_1, a_2)$ is chosen in state $\omega_k$, and by $P^i(\omega_k)$ the element of $P^i$ which contains $\omega_k$. Expected payoffs in $\Gamma$ for Player $i$ are given by:
$$\pi_i(s_1, s_2) \equiv \sum_{k=1}^{n} u_i^{\omega_k}\big(z_1(P^1(\omega_k)),\, z_2(P^2(\omega_k))\big)\,\mu(\omega_k)$$
for $i = 1, 2$.
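The partition conditions and the expected-payoff formula above can be made concrete with a short sketch. It assumes the two-state example of Section 2 with probabilities 3/8 and 5/8, uses hypothetical payoff tables, and simplifies an agent to a pair (partition, rule) in which the rule depends only on the agent's own partition cell; the dependence on the opponent's revealed partition is omitted. All names are illustrative.

```python
from fractions import Fraction

STATES = ("w1", "w2")
MU = {"w1": Fraction(3, 8), "w2": Fraction(5, 8)}  # probabilities from Section 2

# HYPOTHETICAL payoffs (Player 1, Player 2) in each state.
U = {
    "w1": {("c", "c"): (4, 4), ("c", "g"): (0, 1), ("g", "c"): (1, 0), ("g", "g"): (-1, -1)},
    "w2": {("c", "c"): (4, 4), ("c", "g"): (0, 5), ("g", "c"): (5, 0), ("g", "g"): (3, 3)},
}


def is_partition(cells, states=STATES):
    """Check conditions (i)-(iii): nonempty cells, pairwise disjoint, covering."""
    covered = set().union(*cells)
    nonempty = all(len(cell) > 0 for cell in cells)
    disjoint = sum(len(cell) for cell in cells) == len(covered)
    return nonempty and disjoint and covered == set(states)


def cell_of(partition, state):
    """Return the cell of the partition that contains the given state."""
    return next(cell for cell in partition if state in cell)


def expected_payoff(i, agent1, agent2):
    """Expected payoff of player i (0 or 1) for agents given as (partition, rule)."""
    (p1, z1), (p2, z2) = agent1, agent2
    return sum(
        MU[w] * U[w][(z1[cell_of(p1, w)], z2[cell_of(p2, w)])][i] for w in STATES
    )


FINE = [frozenset({"w1"}), frozenset({"w2"})]    # distinguishes the two hosts
COARSE = [frozenset(STATES)]                     # cannot distinguish them
informed = (FINE, {FINE[0]: "c", FINE[1]: "g"})  # dominant action in each state
uninformed = (COARSE, {COARSE[0]: "c"})          # always cooperates

print(expected_payoff(0, uninformed, uninformed))
print(expected_payoff(0, informed, informed))
print(expected_payoff(0, informed, uninformed))
```

With these assumed numbers, an informed agent facing an opponent that keeps cooperating earns more than it would by cooperating itself, which illustrates why a merely claimed coarse partition would not be believed.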
We are now able to prove the following results for the stage-game $\Gamma$ which includes stage II.5:
PROPOSITION 1. For any sequential equilibrium $s$ of $\Gamma$ where both players choose not to reveal their code, $s(\omega_i)$ is a Nash equilibrium of the game $\omega_i$ for $i = 1, \ldots, n$.
Proof. Suppose, on the contrary, that there exists a sequential equilibrium $s^1$ of $\Gamma$ in which neither player reveals its code, and a state $\omega_i$ such that $s^1(\omega_i)$ is not a Nash equilibrium of $\omega_i$. Then there is a player who has an incentive to change its strategy in $\omega_i$. This player can profitably deviate by picking an agent which chooses the best response in state $\omega_i$ (this may require refining the partition, separating state $\omega_i$ from other states). Since partitions are kept secret, such a deviation cannot affect the player's expected payoff in any way other than by improving it in state $\omega_i$. ∎
Let $S^k$ be the set of equilibria of the game $\omega_k$, $k = 1, \ldots, n$. Let $S$ be the set of all combinations of these equilibria over all states in $\Omega$ (i.e., $|S| = |S^1| \times |S^2| \times \cdots \times |S^n|$). Proposition 1 states that any sequential equilibrium of $\Gamma$ where players choose agents which do not reveal their code must correspond to an element of $S$. We now give a sufficient condition for the existence of a sequential equilibrium of $\Gamma$ which does not correspond to any element of $S$.
PROPOSITION 2. Let $s$ be a Nash equilibrium of the game with fixed partitions $P^1$ and $P^2$. If $\pi_i(s) \geq \pi_i(s')$ for all $s' \in S$ (for $i = 1, 2$), then there exists a sequential equilibrium of $\Gamma$ where, on the equilibrium path, players choose agents with partitions $P^1$ and $P^2$, which then reveal their code and continue with $s$.
Proof. Pick any equilibrium $s'$ in $S$. The following strategies constitute a sequential equilibrium of $\Gamma$: Players choose agents with partitions $P^1$ and $P^2$, which reveal their code and continue with $s$. If a player deviates by picking an agent which does not reveal its code, then agents continue with $s'$. Since $\pi_i(s) \geq \pi_i(s')$, such a deviation would not be profitable. ∎
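The payoff condition of Proposition 2 can be checked mechanically in small examples. The following sketch assumes the two-state setting of Section 2, hypothetical payoff tables, and pure strategies only. It builds the set of state-by-state equilibrium combinations and tests whether a candidate profile weakly improves on every element of that set for both players. That the candidate is itself a Nash equilibrium of the game with fixed partitions is taken as given here and is not verified.

```python
from fractions import Fraction
from itertools import product

ACTIONS = ("c", "g")
MU = {"w1": Fraction(3, 8), "w2": Fraction(5, 8)}  # probabilities from Section 2

# HYPOTHETICAL payoffs (Player 1, Player 2) in each state.
U = {
    "w1": {("c", "c"): (4, 4), ("c", "g"): (0, 1), ("g", "c"): (1, 0), ("g", "g"): (-1, -1)},
    "w2": {("c", "c"): (4, 4), ("c", "g"): (0, 5), ("g", "c"): (5, 0), ("g", "g"): (3, 3)},
}


def pure_nash(game):
    """Pure-strategy Nash equilibria of a single state game."""
    return [
        (a1, a2)
        for a1, a2 in product(ACTIONS, repeat=2)
        if all(game[(a1, a2)][0] >= game[(d, a2)][0] for d in ACTIONS)
        and all(game[(a1, a2)][1] >= game[(a1, d)][1] for d in ACTIONS)
    ]


def payoff(profile):
    """Expected payoffs of a profile that maps each state to an action pair."""
    return tuple(sum(MU[w] * U[w][profile[w]][i] for w in MU) for i in (0, 1))


# S: every combination of state-wise equilibria (pure strategies only).
S = [dict(zip(MU, combo)) for combo in product(*(pure_nash(U[w]) for w in MU))]

# Candidate: both agents cooperate in every state (coarsest partitions).
candidate = {"w1": ("c", "c"), "w2": ("c", "c")}

condition_holds = all(
    payoff(candidate)[i] >= payoff(s_prime)[i] for s_prime in S for i in (0, 1)
)
print(S, payoff(candidate), condition_holds)
```

In this assumed example the set of state-wise combinations contains a single profile, and always cooperating yields a higher expected payoff for both players, so the inequality in Proposition 2 is satisfied.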
4. DISCUSSION
Security is one of the most important issues in the design of agent-based e-commerce systems. Since agents carry information about their users' preferences and willingness to pay, there are potentially substantial gains from reverse engineering these agents. For this reason, anonymizing and synchronizing technologies are being developed to ensure secure transactions between automated agents. In this paper we show that the fact that
parts of the agents' strategies can be credibly revealed can also be used by agents to increase the utility of their users. Specifically, agents can, by credibly revealing how their behavior is conditioned on features like the identity of the host, increase the set of equilibrium outcomes. The choice to reveal these behavioral strategies can be interpreted as a signal to continue with a welfare-improving continuation equilibrium.
Game theory teaches us that the value of information to a player may depend on the behavior of the other players and may even be negative (see, however, Neyman, 1991, for a different approach to evaluating the value of information). That is, a player can be better off not knowing some of the game parameters. But how can players credibly signal their "ignorance" in these cases? If an economic agent can gather information secretly, then she will do so, and only pretend to be ignorant. In equilibrium, this will be anticipated, and therefore information which is useful (i.e., information which has a positive value in the Blackwell framework) will always be obtained.
Interactions between automated agents provide a new framework where these intuitions may prove useful. An automated agent, being a computer program, can credibly demonstrate that it is, in some sense, ignorant. For example, in repeated interactions, Monderer and Tennenholtz (1999) study the equilibrium behavior of software agents capable of demonstrating that they cannot "remember" the previous round (i.e., cannot condition on outcomes of previous rounds). Moreover, these types of code revelations are likely to be supported because hosts will need to check compatibility with the communication protocol. If our goal is to design agents that operate in open environments like the Internet, we need to take this into account. A better understanding of the relationship between the outcomes of these stage games and the equilibria of the underlying games is therefore required.
This paper can be seen as a step in that direction.
REFERENCES
Chavez, A., and Maes, P. (1996). "Kasbah: An Agent Marketplace for Buying and Selling Goods," Proceedings of the First International Conference on the Practical Application of Intelligent Agents and Multiagent Systems, London, UK.
Monderer, D., and Tennenholtz, M. (1999). "Distributed Games," Games Econom. Behav. 28, 55–72.
Neyman, A. (1991). "The Positive Value of Information," Games Econom. Behav. 3, 350–355.
Rosenschein, J. S., and Zlotkin, G. (1994). Rules of Encounter, MIT Press.
Sandholm, T. W., and Lesser, V. R. (1996). "Advantages of a Level Commitment Contracting Protocol," in Proceedings of the National Conference on Artificial Intelligence, Portland, Oregon, pp. 126–133.
Sandholm, T. W., and Vulkan, N. (1999). "Bargaining with Deadlines," National Conference on Artificial Intelligence (AAAI), Orlando, FL.
Vulkan, N. (1999). "Economic Implications of Agent Technology and E-Commerce," Econom. J. 109, F67–F90.
Vulkan, N., and Jennings, N. R. (2000). "Efficient Mechanisms for the Supply of Services in Multi-Agent Environments," Decision Support Systems 28, 5–19.
Vulkan, N., and Preist, C. (1999). "Optimal Trading in Automated Markets for Communication Bandwidth," Technical Report, Hewlett-Packard Laboratories, and the Economics Department, University of Bristol.