Evaluating NDF-based negotiation mechanism within an agent-based environment


Robotics and Autonomous Systems 43 (2003) 1–27

Kung-Jeng Wang∗, Chung-How Chou
Department of Industrial Engineering, Chung-Yuan Christian University, Chung Li 320, Taiwan, ROC
Received 20 April 2001; received in revised form 5 September 2002

Abstract

We investigate the properties of a negotiation mechanism based on negotiation decision functions (NDFs) in an agent-based system. The study employs both analytical and simulation approaches. Analysis and evaluation of negotiation tactics and strategies indicate that the negotiation deadline significantly influences the convergence performance of the negotiation. Some important properties of negotiation convergence are analyzed, and a set of experiments is carried out among agents for typical negotiation tactics and strategies to investigate negotiation fairness with respect to the mean difference of deal values and convergence characteristics. A preliminary study of the efficiency of NDF agents compared with TAC agents and with the Pareto optimum is also presented. © 2003 Elsevier Science B.V. All rights reserved.

Keywords: Agent systems; Negotiation model; Convergence analysis

1. Introduction

Negotiation is critical in resolving conflicts in multi-agent systems in distributed artificial intelligence. Single-shot negotiation models (such as contract net) are typically employed in research and practice along the lines first developed by Davis and Smith [1]. However, this class of single-shot negotiation models gives no guarantee of reaching the global optimum, owing to its myopic nature [2]. Negotiation is characterized by iterations among agents: agents share information extensively and uncover the intentions of their opponents through such communication. Negotiation is commonly addressed in the domain of multi-agent systems, where it focuses on the exchange of partial plans to attain a global goal [3]. Kraus et al. [4] addressed the issue of negotiation time, and Kraus and Lehmann [5] established an artificial diplomat system to address negotiation issues. Greenwald and Stone [6] presented a TAC market game run over the Internet and operating within a travel shopping scenario, and discussed the strategies developed for trading agents. Multi-agent negotiation shows promise, particularly in recent applications of intelligent production control systems (e.g. [7,8]).

Faratin et al. [9] proposed the negotiation decision function (NDF) as a basis of formal negotiation. In-depth experimental results regarding NDF are reported in [10]. Faratin et al. [11] further proposed a trade-off algorithm of negotiation to increase social welfare. Negotiation models based on disclosure of information among agents (such as game theory) are of limited use in real applications, while NDF-based negotiation is characterized by its autonomous (private) behavior and its consideration of timing and issues. NDF-based negotiation can be applied to numerous real-world application domains, such as industrial production control and pricing in e-markets. Hence, empirical and analytical investigation of NDF-based negotiation is encouraged to better understand its properties and efficiency. Such a study can also provide a strong basis for building an incentive mechanism in which agents use certain negotiation parameters to achieve socially desirable outcomes.

The rest of this paper is organized as follows. Section 2 discusses NDFs and how they can be used as building blocks of the multi-lateral negotiation model. Section 3 analyzes some important properties of the NDF-based negotiation model, and Section 4 evaluates the model for a specified set of negotiation tactics and strategies via simulation. Section 5 summarizes the work.

∗ Corresponding author. Tel.: +886-3-265-4411; fax: +886-3-265-4499. E-mail address: [email protected] (K.-J. Wang).
0921-8890/03/$ – see front matter © 2003 Elsevier Science B.V. All rights reserved. doi:10.1016/S0921-8890(02)00359-7

2. An overview of a negotiation mechanism based on NDFs

We adopt NDFs as a decision tool to convey offers among the agents in an agent-based system. The characteristics and fundamental assumptions of NDFs are described in [9]. Herein, we consider negotiation as a process by which a joint decision is made by two or more parties that have individual and conflicting objectives. The parties first express contradictory demands and then move towards agreement by a process of concession making or a search for new alternatives [12]. NDF-based negotiation allows many agents to be involved and to address multiple issues (for example, price and quantity). The NDF framework contains negotiation strategies, negotiation tactics, and multi-lateral negotiation functions.
The set of tactics includes time-dependent tactics, resource-dependent tactics and behavior-dependent tactics, all of which are based on the multi-lateral negotiation function. Multi-lateral negotiation functions are derived from bilateral negotiation functions [13], and were first proposed by Faratin et al. [9]. The functions employ a rating concept to conduct negotiation and support one-to-many negotiation. The agents involved are called parties, and the subjects to be bargained over are called issues.

Negotiation involves offers and counter offers proposed according to issues raised sequentially by opposing sides. $x^{t}_{a\to b}$ denotes the offer proposed by agent a to agent b at time $t$, whereas $x^{t'}_{b\to a}$ represents a counter offer by agent b to agent a at time $t'$. The collection of all the transactions during a finite period, $t_n$, is called the negotiation thread,

\[ X^{t_n}_{a\leftrightarrow b} = \left(x^{t_1}_{a\to b},\ x^{t_2}_{b\to a},\ x^{t_3}_{a\to b},\ x^{t_4}_{b\to a},\ x^{t_5}_{a\to b},\ x^{t_6}_{b\to a},\ \ldots\right). \]

The final offer/counter offer can be an acceptance or a rejection. The scoring function is computed according to the value of the issues. Suppose that there are $n$ issues, indexed by $j = 1, \ldots, n$. Let $x_j \in [\min^i_j, \max^i_j]$, where the interval is the acceptable range for agent $i$ on issue $j$, and let $v^i_j : [\min^i_j, \max^i_j] \to [0, 1]$ denote the normalized value of issue $j$ to agent $i$:

\[ v^i_j(x_j) = \frac{x_j - \min^i_j}{\max^i_j - \min^i_j} \quad \text{(as $v^i_j$ is an increasing function of $x_j$)}, \]

\[ v^i_j(x_j) = 1 - \frac{x_j - \min^i_j}{\max^i_j - \min^i_j} \quad \text{(as $v^i_j$ is a decreasing function of $x_j$)}, \]

and

\[ V^i(x) = \sum_{1 \le j \le n} w^i_j\, v^i_j(x_j), \]

where $w^i_j$ is a weighting.

For agent $i$, a higher $V^i$ represents a more favored offer.


The decision function is used to determine whether to accept or reject an offer, or to continue to negotiate. It is defined as

\[ I^a(t', x^{t}_{b\to a}) = \begin{cases} \text{reject} & \text{if } t' > t^a_{\max}, \\ \text{accept} & \text{if } V^a(x^{t}_{b\to a}) \ge V^a(x^{t'}_{a\to b}),\ t' > t, \\ x^{t'}_{a\to b} & \text{otherwise.} \end{cases} \tag{1} \]

An agent considers the counter party's offer, $x^{t}_{b\to a}$. If period $t'$ exceeds the negotiation deadline $t^a_{\max}$, a rejection is returned. Otherwise, the values of the counter offer, $x^{t}_{b\to a}$, and the next offer to be proposed, $x^{t'}_{a\to b}$, are compared. If $V^a(x^{t}_{b\to a}) \ge V^a(x^{t'}_{a\to b})$, $x^{t}_{b\to a}$ is accepted; otherwise a new offer $x^{t'}_{a\to b}$ is proposed.

The family of negotiation tactics consists of time-dependent tactics, resource-dependent tactics, and behavior-dependent tactics. The details can be found in [9], and a brief description is given in Appendix A. Finally, we consider a negotiation strategy as a linear combination of tactics. A strategy is used to determine the best course of action leading to an agreement on a contract $x$ which maximizes the scoring function $V^a$.

To facilitate the evaluation of the NDF-based negotiation model, it has been implemented in Java and coupled with a Java-based Swarm simulation platform (for Swarm platform details, refer to [14]).
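The scoring function and the decision function of formula (1) can be illustrated with a short sketch. This is not the authors' Java/Swarm implementation; it is a minimal Python illustration, and the names `Issue`, `issue_value`, `score` and `interpret` are hypothetical, introduced here only for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence, Union


@dataclass
class Issue:
    """One negotiable issue as seen by a single agent (illustrative type)."""
    lo: float                # lower end of the acceptable range (min)
    hi: float                # upper end of the acceptable range (max)
    weight: float = 1.0      # weighting w of this issue
    increasing: bool = True  # True if the agent prefers larger values


def issue_value(issue: Issue, x: float) -> float:
    """Normalized value of x in [0, 1], increasing or decreasing in x."""
    frac = (x - issue.lo) / (issue.hi - issue.lo)
    return frac if issue.increasing else 1.0 - frac


def score(issues: Sequence[Issue], offer: Sequence[float]) -> float:
    """Weighted sum of the per-issue values (the scoring function V)."""
    return sum(i.weight * issue_value(i, x) for i, x in zip(issues, offer))


def interpret(t_next: float, t_max: float, issues: Sequence[Issue],
              received: Sequence[float],
              planned: Sequence[float]) -> Union[str, List[float]]:
    """Decision function in the spirit of formula (1).

    Returns "reject" after the deadline, "accept" if the received offer
    scores at least as high as the agent's own planned next offer, and
    otherwise the planned counter offer itself.
    """
    if t_next > t_max:
        return "reject"
    if score(issues, received) >= score(issues, planned):
        return "accept"
    return list(planned)


# A seller (prefers higher prices) planning to ask 20 accepts a bid of 23.
seller_view = [Issue(lo=5, hi=35)]
print(interpret(11, 50, seller_view, received=[23.0], planned=[20.0]))
```

The sketch treats an offer as a list with one entry per issue, so the single-issue case used in Section 3 is simply a list of length one.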

3. Convergence analysis of the NDF-based negotiation

NDF-based negotiation considers the timing factor regarding the termination of conversation among agents, and this vital factor may significantly influence deals. This section thus analyzes some important aspects of the impact of the negotiation deadline on the proposed negotiation model. One-to-one negotiation on a single issue is assumed for clarity. The first property of the negotiation model is that a negotiation process can converge in a finite time when proper tactics are employed.

Theorem 1. If both agents a and b apply the time-dependent tactic, the ratio of the negotiation deal time to the negotiation deadline (the maximal length of time allowed to negotiate) will converge to a finite constant. That is, for parties a and b with negotiation deadline $t_{\max}$, the negotiation deal time at which agent a accepts an offer from b is $T$ ($T < t_{\max}$), such that $T/t_{\max} \approx R$, $0 < R \le 1$, where $R$ is a finite constant.

Proof. Suppose agent a accepts an offer from b at time $T$ during the negotiation. According to formula (1), offer and counter offer values are related as follows:

\[ V^a(x^{T}_{b\to a}) \ge V^a(x^{T'}_{a\to b}), \quad T' > T. \]

During the negotiation process ($t < T$), $V^a(x^{t}_{b\to a})$ is an increasing function of $t$ from the standpoint of agent a (because agent b yields to agent a by increasing $x^{t}_{b\to a}$) and $V^a(x^{t}_{a\to b})$ is a decreasing function (since agent a yields to b). Hence at time $t$

\[ V^a(x^{t}_{a\to b}) \ge V^a(x^{t-\Delta t}_{b\to a}). \]

From formula (A.1)

\begin{align*}
V^a(x^{T}_{b\to a}) &= \frac{x^{T}_{b\to a} - \min^a}{\max^a - \min^a} = \frac{\min^b + (1 - \alpha^b(T))(\max^b - \min^b) - \min^a}{\max^a - \min^a} \\
&= \frac{\max^b - \min^a - (\max^b - \min^b)\,\alpha^b(T)}{\max^a - \min^a}, \\
V^a(x^{T'}_{a\to b}) &= \frac{x^{T'}_{a\to b} - \min^a}{\max^a - \min^a} = \frac{\min^a + \alpha^a(T')(\max^a - \min^a) - \min^a}{\max^a - \min^a} = \alpha^a(T'). \tag{2}
\end{align*}


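Two cases are considered.

Case (i). $\alpha(t)$ for the polynomial model:

\[ \alpha^b(T) = k^b + (1 - k^b)\left(\frac{T}{t_{\max}}\right)^{1/\beta}, \qquad \alpha^a(T') = k^a + (1 - k^a)\left(\frac{T'}{t_{\max}}\right)^{1/\beta}. \]

Then

\begin{align*}
& V^a(x^{T}_{b\to a}) \ge V^a(x^{T'}_{a\to b}) \\
\Rightarrow\ & \frac{\max^b - \min^a - (\max^b - \min^b)\,\alpha^b(T)}{\max^a - \min^a} \ge \frac{(\max^a - \min^a)\,\alpha^a(T')}{\max^a - \min^a} \\
\Rightarrow\ & \max^b - \min^a - (\max^b - \min^b)\left[k^b + (1 - k^b)\left(\frac{T}{t_{\max}}\right)^{1/\beta}\right] \ge (\max^a - \min^a)\left[k^a + (1 - k^a)\left(\frac{T'}{t_{\max}}\right)^{1/\beta}\right] \\
\Rightarrow\ & \max^b - \min^a - (\max^b - \min^b)k^b - (\max^b - \min^b)(1 - k^b)\left(\frac{T}{t_{\max}}\right)^{1/\beta} \ge (\max^a - \min^a)k^a + (\max^a - \min^a)(1 - k^a)\left(\frac{T'}{t_{\max}}\right)^{1/\beta} \\
\Rightarrow\ & \max^b - \min^a - (\max^b - \min^b)k^b - (\max^a - \min^a)k^a \ge (\max^a - \min^a)(1 - k^a)\left(\frac{T'}{t_{\max}}\right)^{1/\beta} + (\max^b - \min^b)(1 - k^b)\left(\frac{T}{t_{\max}}\right)^{1/\beta}.
\end{align*}

Furthermore, $T' = T + \Delta t$, so

\[ \frac{T'}{t_{\max}} = \frac{T + \Delta t}{t_{\max}} \ge \frac{T}{t_{\max}}, \]

since $t_{\max}$ is large.

Then

\begin{align*}
& \max^b - \min^a - (\max^b - \min^b)k^b - (\max^a - \min^a)k^a \\
&\qquad \ge (\max^a - \min^a)(1 - k^a)\left(\frac{T'}{t_{\max}}\right)^{1/\beta} + (\max^b - \min^b)(1 - k^b)\left(\frac{T}{t_{\max}}\right)^{1/\beta} \\
&\qquad > (\max^a - \min^a)(1 - k^a)\left(\frac{T}{t_{\max}}\right)^{1/\beta} + (\max^b - \min^b)(1 - k^b)\left(\frac{T}{t_{\max}}\right)^{1/\beta} \\
\Rightarrow\ & \max^b - \min^a - (\max^b - \min^b)k^b - (\max^a - \min^a)k^a > \left[(\max^a - \min^a)(1 - k^a) + (\max^b - \min^b)(1 - k^b)\right]\left(\frac{T}{t_{\max}}\right)^{1/\beta} \\
\Rightarrow\ & \frac{\max^b - \min^a - (\max^b - \min^b)k^b - (\max^a - \min^a)k^a}{(\max^a - \min^a)(1 - k^a) + (\max^b - \min^b)(1 - k^b)} > \left(\frac{T}{t_{\max}}\right)^{1/\beta} \\
\Rightarrow\ & \left[\frac{\max^b - \min^a - (\max^b - \min^b)k^b - (\max^a - \min^a)k^a}{(\max^a - \min^a)(1 - k^a) + (\max^b - \min^b)(1 - k^b)}\right]^{\beta} > \frac{T}{t_{\max}}.
\end{align*}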


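Again, from formula (2), at $t = T'$ with $T' = T - \Delta t$, we know

\begin{align*}
& V^a(x^{T}_{b\to a}) \le V^a(x^{T'}_{a\to b}) \\
\Rightarrow\ & \frac{\max^b - \min^a - (\max^b - \min^b)\,\alpha^b(T)}{\max^a - \min^a} \le \frac{(\max^a - \min^a)\,\alpha^a(T')}{\max^a - \min^a}, \quad T' = T - \Delta t \\
\Rightarrow\ & \max^b - \min^a - (\max^b - \min^b)\left[k^b + (1 - k^b)\left(\frac{T}{t_{\max}}\right)^{1/\beta}\right] \le (\max^a - \min^a)\left[k^a + (1 - k^a)\left(\frac{T'}{t_{\max}}\right)^{1/\beta}\right] \\
\Rightarrow\ & \max^b - \min^a - (\max^b - \min^b)k^b - (\max^b - \min^b)(1 - k^b)\left(\frac{T}{t_{\max}}\right)^{1/\beta} < (\max^a - \min^a)k^a + (\max^a - \min^a)(1 - k^a)\left(\frac{T}{t_{\max}}\right)^{1/\beta} \\
\Rightarrow\ & \max^b - \min^a - (\max^b - \min^b)k^b - (\max^a - \min^a)k^a < (\max^a - \min^a)(1 - k^a)\left(\frac{T}{t_{\max}}\right)^{1/\beta} + (\max^b - \min^b)(1 - k^b)\left(\frac{T}{t_{\max}}\right)^{1/\beta} \\
\Rightarrow\ & \frac{\max^b - \min^a - (\max^b - \min^b)k^b - (\max^a - \min^a)k^a}{(\max^a - \min^a)(1 - k^a) + (\max^b - \min^b)(1 - k^b)} \le \left(\frac{T}{t_{\max}}\right)^{1/\beta} \\
\Rightarrow\ & \left[\frac{\max^b - \min^a - (\max^b - \min^b)k^b - (\max^a - \min^a)k^a}{(\max^a - \min^a)(1 - k^a) + (\max^b - \min^b)(1 - k^b)}\right]^{\beta} \le \frac{T}{t_{\max}}.
\end{align*}

Hence

\begin{align*}
& \left[\frac{\max^b - \min^a - (\max^b - \min^b)k^b - (\max^a - \min^a)k^a}{(\max^a - \min^a)(1 - k^a) + (\max^b - \min^b)(1 - k^b)}\right]^{\beta} \le \frac{T}{t_{\max}} \le \left[\frac{\max^b - \min^a - (\max^b - \min^b)k^b - (\max^a - \min^a)k^a}{(\max^a - \min^a)(1 - k^a) + (\max^b - \min^b)(1 - k^b)}\right]^{\beta} \\
\Rightarrow\ & \frac{T}{t_{\max}} = \left[\frac{\max^b - \min^a - (\max^b - \min^b)k^b - (\max^a - \min^a)k^a}{(\max^a - \min^a)(1 - k^a) + (\max^b - \min^b)(1 - k^b)}\right]^{\beta},
\end{align*}

which is a finite constant.

Case (ii). $\alpha(t)$ is that of the exponential model:

\[ \alpha^b(T) = e^{(1 - T/t_{\max})^{\beta} \ln k^b}, \qquad \alpha^a(T') = e^{(1 - T'/t_{\max})^{\beta} \ln k^a}. \]

Then

\begin{align*}
& V^a(x^{T}_{b\to a}) \ge V^a(x^{T'}_{a\to b}) \\
\Rightarrow\ & \frac{\max^b - \min^a - (\max^b - \min^b)\,\alpha^b(T)}{\max^a - \min^a} \ge \frac{(\max^a - \min^a)\,\alpha^a(T')}{\max^a - \min^a} \\
\Rightarrow\ & \max^b - \min^a - (\max^b - \min^b)\, e^{(1 - T/t_{\max})^{\beta} \ln k^b} \ge (\max^a - \min^a)\, e^{(1 - T'/t_{\max})^{\beta} \ln k^a} \\
\Rightarrow\ & \max^b - \min^a \ge (\max^a - \min^a)\, e^{(1 - T'/t_{\max})^{\beta} \ln k^a} + (\max^b - \min^b)\, e^{(1 - T/t_{\max})^{\beta} \ln k^b}.
\end{align*}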


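Furthermore, $T' = T + \Delta t$ and $t_{\max} \ge T' > T$, so

\[ \frac{T'}{t_{\max}} = \frac{T + \Delta t}{t_{\max}} \ge \frac{T}{t_{\max}} \ \Rightarrow\ e^{(1 - T'/t_{\max})} \le e^{(1 - T/t_{\max})}. \]

We know $0 < k^a, k^b \le 1$, so

\[ e^{(1 - T'/t_{\max}) \ln k^a} \ge e^{(1 - T/t_{\max}) \ln k^a}. \]

Let $k_{\min} = \min(k^a, k^b)$. Then

\begin{align*}
\max^b - \min^a &\ge (\max^a - \min^a)\, e^{(1 - T'/t_{\max})^{\beta} \ln k^a} + (\max^b - \min^b)\, e^{(1 - T/t_{\max})^{\beta} \ln k^b} \\
&\ge (\max^a - \min^a)\, e^{(1 - T/t_{\max})^{\beta} \ln k^a} + (\max^b - \min^b)\, e^{(1 - T/t_{\max})^{\beta} \ln k^b} \\
&\ge (\max^a - \min^a)\, e^{(1 - T/t_{\max})^{\beta} \ln k_{\min}} + (\max^b - \min^b)\, e^{(1 - T/t_{\max})^{\beta} \ln k_{\min}} \\
\Rightarrow\ \max^b - \min^a &\ge \left[(\max^a - \min^a) + (\max^b - \min^b)\right] e^{(1 - T/t_{\max})^{\beta} \ln k_{\min}} \\
\Rightarrow\ \frac{\max^b - \min^a}{(\max^a - \min^a) + (\max^b - \min^b)} &\ge e^{(1 - T/t_{\max})^{\beta} \ln k_{\min}} \\
\Rightarrow\ \ln\left[\frac{\max^b - \min^a}{(\max^a - \min^a) + (\max^b - \min^b)}\right] &\ge \ln\left(e^{(1 - T/t_{\max})^{\beta} \ln k_{\min}}\right) \\
\Rightarrow\ \ln(\max^b - \min^a) - \ln\left((\max^a - \min^a) + (\max^b - \min^b)\right) &\ge \left(1 - \frac{T}{t_{\max}}\right)^{\beta} \ln k_{\min} \\
\Rightarrow\ \frac{\ln(\max^b - \min^a) - \ln\left((\max^a - \min^a) + (\max^b - \min^b)\right)}{\ln k_{\min}} &\le \left(1 - \frac{T}{t_{\max}}\right)^{\beta} \quad (\because \ln k_{\min} < 0) \\
\Rightarrow\ 1 - \left[\frac{\ln(\max^b - \min^a) - \ln\left((\max^a - \min^a) + (\max^b - \min^b)\right)}{\ln k_{\min}}\right]^{1/\beta} &\ge \frac{T}{t_{\max}}.
\end{align*}

Let $k_{\max} = \max(k^a, k^b)$. Then, using formula (2),

\begin{align*}
& V^a(x^{T}_{b\to a}) \le V^a(x^{T'}_{a\to b}) \\
\Rightarrow\ & \max^b - \min^a \le (\max^a - \min^a)\, e^{(1 - T/t_{\max})^{\beta} \ln k^a} + (\max^b - \min^b)\, e^{(1 - T/t_{\max})^{\beta} \ln k^b} \\
& \phantom{\max^b - \min^a} \le (\max^a - \min^a)\, e^{(1 - T/t_{\max})^{\beta} \ln k_{\max}} + (\max^b - \min^b)\, e^{(1 - T/t_{\max})^{\beta} \ln k_{\max}} \\
\Rightarrow\ & \max^b - \min^a \le \left[(\max^a - \min^a) + (\max^b - \min^b)\right] e^{(1 - T/t_{\max})^{\beta} \ln k_{\max}} \\
\Rightarrow\ & \frac{\max^b - \min^a}{(\max^a - \min^a) + (\max^b - \min^b)} \le e^{(1 - T/t_{\max})^{\beta} \ln k_{\max}} \\
\Rightarrow\ & \ln\left[\frac{\max^b - \min^a}{(\max^a - \min^a) + (\max^b - \min^b)}\right] \le \ln\left(e^{(1 - T/t_{\max})^{\beta} \ln k_{\max}}\right) \\
\Rightarrow\ & \ln(\max^b - \min^a) - \ln\left((\max^a - \min^a) + (\max^b - \min^b)\right) \le \left(1 - \frac{T}{t_{\max}}\right)^{\beta} \ln k_{\max} \\
\Rightarrow\ & \frac{\ln(\max^b - \min^a) - \ln\left((\max^a - \min^a) + (\max^b - \min^b)\right)}{\ln k_{\max}} \ge \left(1 - \frac{T}{t_{\max}}\right)^{\beta} \quad (\because \ln k_{\max} < 0) \\
\Rightarrow\ & 1 - \left[\frac{\ln(\max^b - \min^a) - \ln\left((\max^a - \min^a) + (\max^b - \min^b)\right)}{\ln k_{\max}}\right]^{1/\beta} \le \frac{T}{t_{\max}}.
\end{align*}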


Table 1
Experimental parameters of experiment (No. 1)

                    Agent a               Agent b
Issue               Price                 Price
Offer range         5–35                  5–35
Initial value       30                    10
Scoring function    Increasing            Decreasing
Tactic              Time-dependent        Time-dependent
Parameter           Polynomial, β = 1     Polynomial, β = 1
Deadline range      5–200                 5–200



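Combining the two bounds gives

\[ 1 - \left[\frac{\ln(\max^b - \min^a) - \ln\left((\max^a - \min^a) + (\max^b - \min^b)\right)}{\ln k_{\max}}\right]^{1/\beta} \le \frac{T}{t_{\max}} \le 1 - \left[\frac{\ln(\max^b - \min^a) - \ln\left((\max^a - \min^a) + (\max^b - \min^b)\right)}{\ln k_{\min}}\right]^{1/\beta}. \]

Since $k^a \approx k^b$,

\[ \frac{T}{t_{\max}} \approx 1 - \left[\frac{\ln(\max^b - \min^a) - \ln\left((\max^a - \min^a) + (\max^b - \min^b)\right)}{\ln k^a}\right]^{1/\beta}, \]

which is also a finite constant. Thus the proof is complete. □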

Simulation experiment No. 1 supports Theorem 1. Table 1 lists the corresponding experimental parameters. We plot the ratio $T/t_{\max}$ during the simulations. The experimental results show that the steady-state ratio is 41.92% when the deadline is sufficiently late. Fig. 1 depicts the transitional ratio in terms of single runs and moving averages over nine observations of ratios. Fig. 1 displays oscillations in the $T/t_{\max}$ ratio, which occur for the following reason. In the experiments, the increment of $t_{\max}$ is fixed at 1, and one run is performed for each setting. $T$ increases by 1 each time $t_{\max}$ increases by 2–3 unit times, a pattern with cyclic periods of 2 and 3 that causes the oscillations in the $T/t_{\max}$ ratio. From Theorem 1, when the negotiation deadline equals 50, we have

\begin{align*}
\frac{T}{t_{\max}} &= \left[\frac{\max^b - \min^a - (\max^b - \min^b)k^b - (\max^a - \min^a)k^a}{(\max^a - \min^a)(1 - k^a) + (\max^b - \min^b)(1 - k^b)}\right]^{\beta} \\
\Rightarrow\ \frac{T}{t_{\max}} &= \left[\frac{35 - 5 - (35 - 5)\,0.16667 - (35 - 5)\,0.16667}{(35 - 5)(1 - 0.16667) + (35 - 5)(1 - 0.16667)}\right]^{1} = \frac{20}{50} = 0.4.
\end{align*}
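The arithmetic of this worked example can be checked with a few lines of Python. The function name `deal_time_ratio` is hypothetical and introduced only for illustration; it evaluates the closed-form ratio of Theorem 1 for the polynomial model, using the Table 1 settings as quoted in the example above.

```python
def deal_time_ratio(min_a: float, max_a: float, k_a: float,
                    min_b: float, max_b: float, k_b: float,
                    beta: float) -> float:
    """Closed-form T / t_max of Theorem 1, case (i) (polynomial alpha)."""
    numerator = ((max_b - min_a)
                 - (max_b - min_b) * k_b
                 - (max_a - min_a) * k_a)
    denominator = ((max_a - min_a) * (1.0 - k_a)
                   + (max_b - min_b) * (1.0 - k_b))
    return (numerator / denominator) ** beta


# Offer range 5-35 for both agents, k = 0.16667, beta = 1.
ratio = deal_time_ratio(5, 35, 0.16667, 5, 35, 0.16667, beta=1.0)
print(round(ratio, 4))  # 0.4
```

With these inputs the numerator is approximately 20 and the denominator approximately 50, which reproduces the ratio of 0.4 stated in the text.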

In the following experiment (No. 2), both agents a and b combine a relative tit-for-tat behavior-dependent tactic (period δ = 3) with a polynomial time-dependent tactic (β = 1) in equal parts. Fig. 2 presents the ratio $T/t_{\max}$ over the transition. The ratio is 40.00% (moving average 40.47%) when the deadline is 30, and 38.00% (moving average 38.65%) when the deadline is 50. Experimental data show that the average steady-state ratio is 41.64%. In summary, we conclude from both the analytical example and the experimental results that the ratio of the negotiation deal time to the deadline converges to a finite constant when a time-dependent tactic is employed.


Fig. 1. Ratio of T/tmax against tmax (α(t) is a polynomial function, β = 1).

Simulation study No. 1 reveals that the difference between the final offer and the counter offer is related to the negotiation deadline. We state this property as follows.

Observation 1. The length of the negotiation deadline affects the difference between the final offer and the counter offer. That is, the offers of the two counter parties considering a deal will coincide (reach a consensus) if the deadline is sufficiently long.

Fig. 2. Ratio of T/tmax against tmax (a negotiation strategy combining 50% relative tit-for-tat behavior-dependent tactic (period δ = 3) and 50% polynomial time-dependent tactic (β = 1)).

Fig. 3. $\Delta X^T$ against $t_{\max}$ (polynomial time-dependent tactic, β = 1).

Fig. 4. $\Delta V^T$ vs. $t_{\max}$ (polynomial time-dependent tactic, β = 1).

As noted in previous research by Faratin et al. [9], the original NDF cannot guarantee Pareto optimality, and a trade-off algorithm was hence developed to achieve better 'social welfare'. Similarly, in this study, we are interested not in optimality but in fairness, measured by the difference between the final offer and the counter offer, $\Delta X^T$, which captures the 'loss' of a participant. $\Delta X^T$ is chosen to be observed accordingly. For instance, if a seller is planning to offer a price of 20 in the next ply and the buyer has provided a price of 23 in this ply, then both participants reach the deal and the buyer "loses" 23 − 20 = 3 dollars. Eq. (3) measures the difference between the final offer and the counter offer:

\[ \Delta X^T = X^{T}_{b\to a} - X^{T'}_{a\to b}. \tag{3} \]

Fig. 3 depicts the change of $\Delta X^T$ with $t_{\max}$ in experiment No. 1. The sign of $\Delta X^T$ is set from the perspective of the distinct agents in the figure. For instance, $\Delta X^T$ is negative if agent b accepts an offer from a. Fig. 4 gives the difference between the final deal values of agents a and b, $\Delta V^T$. $\Delta V^T$ clearly approaches zero as $t_{\max}$ tends to infinity.

In experiment No. 3, a 50–50% combination of a time-dependent tactic with β = 0.5 and a relative tit-for-tat tactic with δ = 1 is applied to both agents. Figs. 5 and 6 show the simulation results. A relative tit-for-tat tactic cannot be applied while $t < 2\delta$; until then, an offer is determined using the time-dependent tactic. The offer can be derived, for instance, using Eq. (4) at time point 3:

\begin{align*}
x^{t_{n+1}}_{a\to b}[j] &= 0.5\,\min\!\left(\max\!\left(\frac{x^{t_{n-2}}_{b\to a}[j]}{x^{t_{n}}_{b\to a}[j]}\, x^{t_{n-1}}_{a\to b}[j],\ \min^a_j\right),\ \max^a_j\right) + 0.5\,\mathrm{TDep}^a(t = 3, \beta = 0.5) \\
x^{2+1}_{a\to b}[j] &= 0.5\,\min\!\left(\max\!\left(\frac{x^{2-2}_{b\to a}[j]}{x^{2-2+2}_{b\to a}[j]}\, x^{2-1}_{a\to b}[j],\ \min^a_j\right),\ \max^a_j\right) + 0.5\,\mathrm{TDep}^a(t = 3, \beta = 0.5) \\
x^{3}_{a\to b}[j] &= 0.5\,\min\!\left(\max\!\left(\frac{x^{0}_{b\to a}[j]}{x^{2}_{b\to a}[j]}\, x^{1}_{a\to b}[j],\ \min^a_j\right),\ \max^a_j\right) + 0.5\,\mathrm{TDep}^a(t = 3, \beta = 0.5) \\
&= 0.5\,\min\!\left(\max\!\left(\frac{\mathrm{TDep}^b(t = 0, \beta = 0.5)}{\mathrm{TDep}^b(t = 2, \beta = 0.5)}\, \mathrm{TDep}^a(t = 1, \beta = 0.5),\ \min^a_j\right),\ \max^a_j\right) + 0.5\,\mathrm{TDep}^a(t = 3, \beta = 0.5) \\
&= 0.5\,\min\!\left(\max\!\left(\frac{\min^b_j + (1 - \alpha^b_j(0))(\max^b_j - \min^b_j)}{\min^b_j + (1 - \alpha^b_j(2))(\max^b_j - \min^b_j)}\left(\min^a_j + \alpha^a_j(1)(\max^a_j - \min^a_j)\right),\ \min^a_j\right),\ \max^a_j\right) \\
&\quad + 0.5\,\mathrm{TDep}^a(t = 3, \beta = 0.5). \tag{4}
\end{align*}

Fig. 5. $\Delta X^T$ vs. $t_{\max}$ (a combination of 50% time-dependent tactic with β = 0.5 and 50% relative tit-for-tat tactic with δ = 1).

Fig. 6. $\Delta V^T$ vs. $t_{\max}$ (a combination of 50% time-dependent tactic with β = 0.5 and 50% relative tit-for-tat tactic with δ = 1).

$\mathrm{TDep}^a(t = 3, \beta = 0.5)$ in Eq. (4) represents the offer computed by agent a using a time-dependent tactic with β = 0.5. As the negotiation process progresses, the formula forms a recursive function as in Eq. (5), in which $\mathrm{BDep}^b(t = 2, \delta = 1)$ represents the offer computed by agent b at time 2 using the relative tit-for-tat tactic with δ = 1. Accordingly, this negotiation strategy inherits the property of the time-dependent tactic, and Observation 1 is expected to hold for the negotiation strategy as well:

\begin{align*}
x^{t_{n+1}}_{a\to b}[j] &= 0.5\,\min\!\left(\max\!\left(\frac{x^{t_{n-2}}_{b\to a}[j]}{x^{t_{n}}_{b\to a}[j]}\, x^{t_{n-1}}_{a\to b}[j],\ \min^a_j\right),\ \max^a_j\right) + 0.5\,\mathrm{TDep}(t = n + 1, \beta = 0.5) \\
x^{4+1}_{a\to b}[j] &= 0.5\,\min\!\left(\max\!\left(\frac{x^{4-2}_{b\to a}[j]}{x^{4-2+2}_{b\to a}[j]}\, x^{4-1}_{a\to b}[j],\ \min^a_j\right),\ \max^a_j\right) + 0.5\,\mathrm{TDep}(t = 5, \beta = 0.5) \\
x^{5}_{a\to b}[j] &= 0.5\,\min\!\left(\max\!\left(\frac{x^{2}_{b\to a}[j]}{x^{4}_{b\to a}[j]}\, x^{3}_{a\to b}[j],\ \min^a_j\right),\ \max^a_j\right) + 0.5\,\mathrm{TDep}(t = 5, \beta = 0.5) \\
&= 0.5\,\frac{0.5\,\mathrm{BDep}^b(t = 2, \delta = 1) + 0.5\,\mathrm{TDep}^b(t = 2, \beta = 0.5)}{0.5\,\mathrm{BDep}^b(t = 4, \delta = 1) + 0.5\,\mathrm{TDep}^b(t = 4, \beta = 0.5)} \left(0.5\,\mathrm{BDep}^a(t = 3, \delta = 1) + 0.5\,\mathrm{TDep}^a(t = 3, \beta = 0.5)\right) \\
&\quad + 0.5\,\mathrm{TDep}(t = 5, \beta = 0.5). \tag{5}
\end{align*}
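The 50–50% mix of relative tit-for-tat and a polynomial time-dependent tactic used in Eqs. (4) and (5) can be sketched as follows. This is a hedged illustration, not the authors' code: the function names are hypothetical, the opponent's offers are assumed to be stored in a list with the most recent last, and the time-dependent offer is assumed to take the form min + α(t)(max − min) used for agent a in formula (2).

```python
from typing import Sequence


def clamp(x: float, lo: float, hi: float) -> float:
    """min(max(x, lo), hi), as in Eqs. (4) and (5)."""
    return min(max(x, lo), hi)


def alpha_poly(t: float, t_max: float, k: float, beta: float) -> float:
    """Polynomial alpha(t) = k + (1 - k) * (t / t_max) ** (1 / beta)."""
    return k + (1.0 - k) * (min(t, t_max) / t_max) ** (1.0 / beta)


def tdep_offer(t: float, t_max: float, lo: float, hi: float,
               k: float, beta: float) -> float:
    """Time-dependent offer, assumed form: lo + alpha(t) * (hi - lo)."""
    return lo + alpha_poly(t, t_max, k, beta) * (hi - lo)


def relative_tft(their_offers: Sequence[float], my_last_offer: float,
                 lo: float, hi: float, delta: int = 1) -> float:
    """Relative tit-for-tat: scale own last offer by the opponent's
    proportional change over the last `delta` of its offers."""
    ratio = their_offers[-1 - delta] / their_offers[-delta]
    return clamp(ratio * my_last_offer, lo, hi)


def mixed_offer(t: float, t_max: float, their_offers: Sequence[float],
                my_last_offer: float, lo: float, hi: float,
                k: float, beta: float, delta: int = 1) -> float:
    """0.5 * relative tit-for-tat + 0.5 * time-dependent offer.

    Falls back to the pure time-dependent offer while the opponent's
    history is too short for tit-for-tat (the t < 2*delta case).
    """
    td = tdep_offer(t, t_max, lo, hi, k, beta)
    if len(their_offers) < delta + 1:
        return td
    tft = relative_tft(their_offers, my_last_offer, lo, hi, delta)
    return 0.5 * tft + 0.5 * td


# Agent a at time 3: opponent offered 30 then 25; own last offer was 10.
print(mixed_offer(3, 50, [30.0, 25.0], 10.0, lo=5, hi=35, k=1 / 6, beta=0.5))
```

Because half of every offer comes from the time-dependent term, the mixed strategy keeps moving toward the reservation value as the deadline approaches, which is the sense in which it inherits the time-dependent property.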

The simulation results of the negotiation strategy confirm the conjecture of Observation 1. The average difference between the final offer and the counter offer is 0.557927, and the average difference between the deal values is 0.0185976. At time 50, $\Delta X^{50}_{a\to b} = -0.555986$ and $\Delta V^{50}_{a\to b} = 0.018533$, values that resemble those in experiment No. 2. In summary, a later negotiation deadline leads to a smaller difference between the final offer and the counter offer reached. That is, the two parties reach a consensus if the deadline is sufficiently late. We further establish Theorem 2 to quantify the difference between deal values in the long term.

Theorem 2. Let both agents a and b use a time-dependent tactic. If the negotiation deadline is sufficiently late, the difference between deal values will converge to zero. That is, as $t_{\max}$ approaches infinity and agent a accepts a deal at time $T$ ($T < t_{\max}$), $\lim_{t_{\max} \to \infty} \Delta V^T \approx 0$.

Proof. (i) For a polynomial time-dependent tactic

\begin{align*}
\Delta V^T &= V^a(x^{T}_{b\to a}) - V^a(x^{T'}_{a\to b}) \\
&= \frac{1}{\max^a - \min^a}\left[\max^b - \min^a - (\max^b - \min^b)k^b - (\max^a - \min^a)k^a \right. \\
&\qquad \left. - (\max^b - \min^b)(1 - k^b)\left(\frac{T}{t_{\max}}\right)^{1/\beta} - (\max^a - \min^a)(1 - k^a)\left(\frac{T'}{t_{\max}}\right)^{1/\beta}\right],
\end{align*}

where $T' = T + \Delta t$. According to Theorem 1, we know $T/t_{\max} = R$ as $\Delta t \to 0$. Again by Theorem 1,

\[ R = \left[\frac{\max^b - \min^a - (\max^b - \min^b)k^b - (\max^a - \min^a)k^a}{(\max^a - \min^a)(1 - k^a) + (\max^b - \min^b)(1 - k^b)}\right]^{\beta}, \]

so that

\begin{align*}
\Delta V^T &= \frac{1}{\max^a - \min^a}\left[\max^b - \min^a - (\max^b - \min^b)k^b - (\max^a - \min^a)k^a \right. \\
&\qquad \left. - (\max^a - \min^a)(1 - k^a)\left(\frac{T'}{t_{\max}}\right)^{1/\beta} - (\max^b - \min^b)(1 - k^b)\left(\frac{T}{t_{\max}}\right)^{1/\beta}\right] \\
&= \frac{1}{\max^a - \min^a}\left[\max^b - \min^a - (\max^b - \min^b)k^b - (\max^a - \min^a)k^a \right. \\
&\qquad \left. - \left((\max^a - \min^a)(1 - k^a) + (\max^b - \min^b)(1 - k^b)\right) R^{1/\beta}\right] \\
&= \frac{1}{\max^a - \min^a}\left[\max^b - \min^a - (\max^b - \min^b)k^b - (\max^a - \min^a)k^a \right. \\
&\qquad \left. - \left(\max^b - \min^a - (\max^b - \min^b)k^b - (\max^a - \min^a)k^a\right)\right] = 0.
\end{align*}

(ii) For an exponential time-dependent tactic, from formulas (A.1) and (2), we know that

\begin{align*}
& V^a(x^{T'}_{a\to b}) \le V^a(x^{T}_{b\to a}) < V^a(x^{T}_{a\to b}) \\
\Rightarrow\ & 0 \le V^a(x^{T}_{b\to a}) - V^a(x^{T'}_{a\to b}) < V^a(x^{T}_{a\to b}) - V^a(x^{T'}_{a\to b}) \quad (\because V^a(x^{t}_{a\to b}) \text{ is decreasing}) \\
\Rightarrow\ & 0 \le \Delta V^T < \alpha^a(T) - \alpha^a(T') = \left[k^a + (1 - k^a)\, e^{(1 - T/t_{\max})^{\beta} \ln k^a}\right] - \left[k^a + (1 - k^a)\, e^{(1 - T'/t_{\max})^{\beta} \ln k^a}\right],
\end{align*}

and

\[ \frac{T'}{t_{\max}} = \frac{T + \Delta t}{t_{\max}} \ge \frac{T}{t_{\max}} \ \Rightarrow\ e^{(1 - T'/t_{\max})} \le e^{(1 - T/t_{\max})}. \]

Also, $0 < k^a, k^b \le 1$, so

\[ e^{(1 - T'/t_{\max}) \ln k^a} \ge e^{(1 - T/t_{\max}) \ln k^a}. \]

Then

\begin{align*}
0 \le \Delta V^T \le \alpha^a(T) - \alpha^a(T') &= \left[k^a + (1 - k^a)\, e^{(1 - T/t_{\max})^{\beta} \ln k^a}\right] - \left[k^a + (1 - k^a)\, e^{(1 - T'/t_{\max})^{\beta} \ln k^a}\right] \\
&\le \left[k^a + (1 - k^a)\, e^{(1 - T'/t_{\max})^{\beta} \ln k^a}\right] - \left[k^a + (1 - k^a)\, e^{(1 - T'/t_{\max})^{\beta} \ln k^a}\right] \\
\Rightarrow\ 0 \le \Delta V^T \le 0,
\end{align*}

which concludes the proof. □



Recall that in Fig. 4, the difference between the deal values approached zero as the negotiation deadline increased from 50 ($\overline{\Delta V^T} = 0.016$) to 100 ($\overline{\Delta V^T} = 0.0083$). When the deadline is extremely large (10,000), the difference approaches zero ($\overline{\Delta V^T} = 0.0000416$). These simulation data agree with Theorem 2. Theorems 1 and 2, together with Observation 1, show that the convergence of a deal is affected by the negotiation deadline. However, experiments Nos. 1–3 show that an excessively late deadline does not support quick convergence. Setting an appropriate negotiation deadline is therefore essential to the performance of the NDF-based negotiation mechanism.

4. Empirical study of NDF-based negotiation

This section investigates NDF tactics and their combinations. One-to-one and one-to-many experiments are conducted to understand the actual behavior of agents under the NDF-based negotiation model. NDF agents are also compared with TAC agents [15] and with the Pareto optimum.

4.1. Single tactic

4.1.1. Time-dependent tactic

The value of α(t) for a time-dependent tactic can be computed in two ways: with the polynomial model or with the exponential model. For a fixed convex parameter β, an exponential α(t) resembles the Boulware mode, in which an agent tends to insist on its offer value. Therefore, we conjecture that an agent using an exponential α(t) should dominate one using a polynomial α(t). To verify the conjecture, one-to-one experiments are run. The experimental parameters (issue, offer range, and scoring function) of each agent are the same as in Table 1. Both agents use a time-dependent tactic with β = 0.5; agent a uses the polynomial mode and agent b uses the exponential mode. A positive $\Delta X^T$ (defined in Eq. (3)) indicates that agent a accepts the offer of agent b. The experiments show that, with a negotiation deadline varying between 5 and 200 unit times, positive values appear in 129 of 196 runs; that is, agent a tends to accept its opponent's offer.
The simulation outcomes support our conjecture.
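The difference between the two ways of computing α(t) can be seen numerically. The sketch below implements the polynomial and exponential forms as they appear in cases (i) and (ii) of the proof of Theorem 1; the function names and the sample values of k, β and t are chosen here for illustration only.

```python
import math


def alpha_poly(t: float, t_max: float, k: float, beta: float) -> float:
    """Polynomial model: k + (1 - k) * (t / t_max) ** (1 / beta)."""
    return k + (1.0 - k) * (min(t, t_max) / t_max) ** (1.0 / beta)


def alpha_exp(t: float, t_max: float, k: float, beta: float) -> float:
    """Exponential model: exp((1 - t / t_max) ** beta * ln k)."""
    return math.exp((1.0 - min(t, t_max) / t_max) ** beta * math.log(k))


# Both models start at k and reach 1 at the deadline; in between, the
# exponential model concedes more slowly for these sample parameters.
k, beta, t_max = 1 / 6, 0.5, 100
for t in (0, 25, 50, 75, 100):
    print(t, round(alpha_poly(t, t_max, k, beta), 4),
          round(alpha_exp(t, t_max, k, beta), 4))
```

For these sample parameters the exponential α(t) stays below the polynomial one before the deadline, which is consistent with the Boulware-like behavior described above.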


Fig. 7. α(t) against β.

Furthermore, the offer values of a time-dependent tactic are strongly related to the convex parameter β. An agent uses β to shape how it constructs offers when the time-dependent tactic is adopted. For instance, an agent insisting on an issue should adopt a β below 1 to slow down the speed at which it yields to its counter party (for example, using the Boulware mode). On the contrary, a β above unity will accelerate yielding (for example, using the Conceder mode). Fig. 7 depicts α(t) against β. The curves represent three different β settings. Curve #1: β is fixed at 1; Curve #2: β is a geometric series starting at 0.01 with a ratio of 1.2; Curve #3: β is a geometric series starting at 0.01 with a ratio of 1.5 in the first half and 0.66 in the latter half. Experimental results reveal that α(t) is linear when β is fixed at 1, α(t) increases if β increases, and α(t) oscillates when β oscillates. Notably, manipulating β can disclose a change in behavior during negotiation.

4.1.2. Behavior-dependent tactic

The behavior-dependent tactic has three variations: random absolute tit-for-tat, relative tit-for-tat, and averaged tit-for-tat. The latter two change offers on the basis of the counter party's offers. One-to-one experiments are run to examine this set of tactics. Our experiments reveal a problem that arises when both agents use the behavior-dependent, relative tit-for-tat tactic (δ = 1): no deal can be reached. Therefore, an agent must incorporate other tactics, or the counter party must use a distinct tactic (such as the negotiation strategy used in experiment No. 3). The random absolute tit-for-tat tactic incorporates a random number and thus is unpredictable to the counter party. Fig. 8 displays $\Delta X^T$ when the same random seed is used. Distinct random seeds should be used so that the offers of the agents differ.

4.1.3. Resource-dependent tactic

The resource-dependent tactic includes two types: dynamic deadline tactics and resource estimation tactics.
In dynamic deadline tactics, the negotiation deadline rather than the offer value is adjusted. Hence, the dynamic deadline tactic should be used with other tactics. The tactic indirectly affects negotiation results by changing the deadline. Resource estimation tactics deduct resources to adjust the offer value. This tactic can be employed independently (as can the time-dependent tactic), and can also be used to model the effect of the number of counter parties. Usually resources decline as negotiation time passes.

A one-to-many negotiation is conducted in experiment No. 4 to demonstrate how the number of parties in a negotiation can affect the result. The experiment involves five participants: four clients and one server. The time-dependent tactic with β = 15, 10, 1 and 0.1 is employed on the client side for the four clients, respectively.


Fig. 8. Using the same random seed (behavior-dependent, random absolute tit-for-tat tactic, M = 1).

Meanwhile, the server uses a resource-dependent, dynamic deadline tactic (Eq. (6)) with µ = 50, and the four negotiation threads are developed independently via NDF negotiation. Fig. 9 presents the outcomes of the experiment; every intersection in the figure represents a deal. The amount of resources at each intersection time falls as the number of participants declines, and deal values decline accordingly (for example, at t = 23 and 75). The convergence rate of the negotiation deals depends on the tactics selected by the participating agents. Agents using higher β values (Conceder tactics) find it easier to reach deals with the server.

\[ \mathrm{resource}^a(t) = \mu^a\, \frac{|N^a(t)|^2}{\sum_i |X^{t}_{i\leftrightarrow a}|}. \tag{6} \]
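Eq. (6) can be evaluated directly once the number of active counter parties and the lengths of the negotiation threads are known. The sketch below is only an illustration: the function name `resource` and the representation of each thread by its message count are assumptions made here, not part of the original model description.

```python
from typing import Sequence


def resource(mu: float, n_partners: int,
             thread_lengths: Sequence[int]) -> float:
    """Eq. (6): mu * |N(t)|^2 / (sum of the thread lengths |X_i|).

    n_partners     -- |N(t)|, number of agents currently negotiating with a
    thread_lengths -- |X_i|, number of messages in each negotiation thread
    """
    total = sum(thread_lengths)
    if total == 0:
        raise ValueError("no messages have been exchanged yet")
    return mu * n_partners ** 2 / total


# With mu = 50: four clients with 10 messages each, then only two left.
print(resource(50, 4, [10, 10, 10, 10]))  # 20.0
print(resource(50, 2, [10, 10]))          # 10.0
```

In this toy calculation the value halves when the number of participants drops from four to two, in line with the observation that resources fall as participants leave.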

Fig. 9. A one-to-many negotiation (clients use time-dependent tactic with β = 15, 10, 1 and 0.1. Server uses resource-dependent, dynamic deadline tactics (Eq. (6)) with µ = 50).


Fig. 10. Ratio of T/tmax vs. tmax (both agents use resource-dependent, linear-time tactic).

The resource-dependent, linear-time tactic ($\mathrm{resource}^a(t) = \min(0, t^a_{\max} - t)$) is characterized by its late negotiation convergence. Fig. 10 depicts the late convergence when both agents use a resource-dependent, linear-time tactic. The ratio of negotiation deal time to deadline is nearly 100% in most cases, implying that a deal is not obtained until the deadline approaches.

4.2. Negotiation strategy

Combined tactics, which consist of a linear combination of tactics, are regarded as a negotiation strategy in the NDF model (see [9] for a formal definition). We examine the performance of negotiation strategies when they are set against single tactics. A set of typical tactics is selected for the experiments. We mainly use the experimental set-up of [9] with a minor modification. Table 2 presents the tactics under consideration. β ∈ [0.01, 0.2] denotes a random number between 0.01 and 0.2. All experiments use the polynomial mode when the time-dependent tactic is applied, and dynamic estimation when the resource-dependent tactic is used (refer to Eq. (6)).

Table 2
Experimental set-up to compare negotiation strategy and single tactics

| Tactic family | Tactic name | Abbreviation | Parameter ranges | Description |
|---|---|---|---|---|
| Time-dependent | Boulware | B | β ∈ [0.01, 0.2] | Increasing rate of approach to reservation as β increases |
| | Linear | L | β = 1 | |
| | Conceder | C | β ∈ [20, 40] | |
| Resource-dependent | Impatient | IM | µ = 1, n = 1 | Decreasing rate of approach to reservation as µ increases |
| | Steady | ST | µ ∈ [10, 50], n = 1 | |
| | Patient | PA | µ ∈ [50, 100], n = 1 | |
| Behavior-dependent | Relative tit-for-tat | RE | δ = 1 | Percent imitation of last two offers |
| | Random tit-for-tat | RA | δ = 1, M ∈ [1, 3] | Fluctuating absolute imitation of last two offers |
| | Average tit-for-tat | AV | γ = 2 | Average imitation of last four offers |
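A negotiation strategy in this sense can be sketched as a weighted sum of the offers proposed by individual tactics. The sketch below is a simplified reading of the NDF definition in [9]; the names `Tactic` and `combined_offer` and the two toy tactics are hypothetical and serve only to show the weighting.

```python
from typing import Callable, Sequence

# A tactic maps the negotiation time t to a proposed value for one issue.
Tactic = Callable[[float], float]


def combined_offer(tactics: Sequence[Tactic], weights: Sequence[float], t: float) -> float:
    """Linear combination of tactic offers; weights must be non-negative and sum to 1."""
    if len(tactics) != len(weights):
        raise ValueError("one weight per tactic is required")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * tactic(t) for tactic, w in zip(tactics, weights))


# Two toy tactics for a price issue (illustrative only, not the tactics of Table 2).
def slow(t: float) -> float:
    return 5.0 + t / 20.0


def fast(t: float) -> float:
    return 5.0 + t / 10.0


# A 50-50 weighting of the two toy tactics at time t = 100.
offer = combined_offer([slow, fast], [0.5, 0.5], t=100.0)
print(offer)
```

The combined offer lies between the offers of the two constituent tactics, so a strategy can temper a tactic that yields too slowly or too quickly on its own.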


Table 3
Notations and parameters

| Notation | Description | Value/range |
|---|---|---|
| min^c | Minimal value of issues at client end | 5 |
| max^c | Maximal value of issues at client end | max^c = min^c + θ^c |
| θ^c | Ranges at client end | θ^c ∈ [10, 30] |
| min^s | Minimal value of issues at server end | 5 |
| max^s | Maximal value of issues at server end | max^s = min^s + θ^s |
| θ^s | Ranges at server end | θ^s ∈ [10, 30] |

A one-to-one negotiation is employed between two agents, the client and the server. Price is the only issue. Table 3 specifies the parameters used, with the minima all set to five to ensure that offers and counter-offers will eventually overlap. The experiment is set up as follows. Three time-dependent tactics are compared to a set of negotiation strategies using a 50–50% weighting. On the client side, alternatives include (i) three single time-dependent tactics, (ii) three negotiation strategies with time-dependent tactics, and (iii) 18 negotiation strategies combining a time-dependent tactic with one of the other six tactics. A total of 24 negotiation strategies are investigated from the perspective of the client. On the server side, one of the three time-dependent tactics is applied in each experiment. Accordingly, 72 (3 × 24) experiments are performed. The negotiation deadline is 200. Each experiment is repeated 1000 times and average performance indices are computed. The first table in Appendix B presents the simulation outcomes. Experiment names are abbreviated; for instance, BSBC stands for a Boulware Server and a Boulware Client, and BSBCC stands for a Boulware Server and a Boulware-Conceder-combination Client. Negotiation results (denoted as the averaged offer $X^a(x^T_{a \to b})$ and the averaged deal value $V^a(x^T_{a \to b})$) are collected from the perspective of the agent which accepts a deal. The agent which accepts the final offer is designated as agent a and the one offering is agent b.

Fig. 11. $V^T$ (Boulware tactic in server and a combination of Boulware and another tactic in client).

Fig. 11 depicts the average difference in the deal values ($V^T$) for agents a and b when the server uses the Boulware tactic and the client uses a combination of Boulware and another tactic. A small difference between deal values is favored. The three behavior-dependent tactics used with the Boulware tactic are inferior to the single Boulware tactic in terms of the mean difference in deal values. We conclude that, when facing a server agent using the Boulware tactic, all the other tactics (i.e., BC, BL, BIM, BST, and BPA) outperform the single Boulware tactic at the client end. Among them, BL and BIM perform the best. Fig. 12 depicts the negotiation time ($\bar{T}$) required to reach a deal. Negotiation strategies require less negotiation time to reach a deal. Fig. 11 shows that BSBCC (Boulware vs. Boulware-Conceder-combination), BSBLC (Boulware vs. linear), and BSBIMC have a smaller $V^T$ value than the others. Conceder and linear tactics converge quickly while the Boulware tactic yields reluctantly and converges very late, as shown in Fig. 12. When two agents with two such distinct attitudes encounter each other, the one with the Boulware tactic is less likely to yield at the last minute. The difference in the deal values is reduced accordingly. Fig. 11 shows that the value of $V^T$ for BSBIMC, BSBSTC, and BSBPAC increases with µ. For a fixed number of participants (n = 1 in this case), a lower µ implies faster convergence to a deal.

Figs. 13 and 14 show the values of $V^T$ and $\bar{T}$ when the linear tactic is used in the server and a combination of linear and another tactic is used in the client. When facing a linear-type server, all the other tactics (i.e., LB, LIM, LST, LPA, LRE, and LAV) outperform the single Boulware tactic at the client end, except LC and LRA. However, LC requires the least negotiation time to reach a deal.

In summary, the experiments described herein reveal some interactive effects when various tactics are set against each other. The experimental results imply that negotiation strategies typically outperform a single tactic in terms of reduced $V^T$ and $\bar{T}$.

Fig. 12. $\bar{T}$ (Boulware tactic in server and a negotiation strategy combining the Boulware and another tactic in client).


Fig. 13. $V^T$ (linear tactic in server and a combination of linear and another tactic in client).

Fig. 14. $\bar{T}$ (linear tactic in server and a combination of linear and another tactic in client).

4.3. Efficiency of NDF-based negotiation

This section presents a preliminary study on the efficiency of NDF compared to TAC agents [15] and the Pareto optimal. Since this study focuses on the analytical study of the tactics, the simulation-based comparison with other strategies is kept as brief as possible. The NDF-based agent in this work uses MediocreAgent of the University of Melbourne [16] as a template, where hotel price negotiation uses a polynomial-mode, time-dependent tactic with β = 1. Four competitions are performed, as detailed in the second table in Appendix B. The first game (ID 11,763) contains one NDF-based agent and seven TAC default dummy agents, and shows that the NDF-based agent (i.e., Hank) outperforms the others. The second game (ID 11,743) involves four NDF-based agents (Hank1–4), three TAC default dummy buyers, and one 'noise' agent, Tong, and demonstrates that NDF-based agents perform diversely but one of them achieves the highest score among the eight agents. Meanwhile, the third game (ID 11,767) reveals that the purely NDF-based agent performs second best among seven MediocreAgents. Finally, the last game (ID 11,755) contains seven NDF-based agents, where Hank6–8 use β = 0.01 (Boulware mode) and Hank9–12 use β = 10 (Conceder mode), and one noise agent. The average performance of NDF agents with the Boulware attitude is better than that of the other NDF agents.

A rudimentary study of the efficiency of NDF as compared to the Pareto optimal is also conducted. Pareto optimal curves are plotted assuming that a one-to-one agent negotiation is performed. Both agents employ the polynomial-mode, time-dependent tactic with β = 1. We consider two issues, price and volume. Agent a has an issue boundary vector of price = [9, 25], volume = [5, 30] and an issue weight vector [price = 0.2, volume = 0.8], while agent b has an issue boundary vector of price = [5, 20], volume = [10, 40] and an issue weight vector [price = 0.8, volume = 0.2]. Agent a is assumed to prefer high price and low volume, while b is assumed to prefer the opposite. Fig. 15 presents the negotiation thread and the Pareto optimal curve. NDF cannot guarantee reaching the Pareto optimal; Faratin et al. [11] have investigated a trade-off algorithm to improve 'social welfare'.

Fig. 15. NDF efficiency vs. Pareto optimal.

5. Conclusions

Autonomous agents have been applied to a wide range of industrial and business domains. This paper evaluated a negotiation model based on NDFs to facilitate agreement among agents. The NDF-based negotiation model is analyzed for its convergence characteristics using analytical and simulation approaches. Our analysis of tactics and strategies revealed that the negotiation deadline significantly influences convergence performance. Simulations of tactics and strategies are conducted to examine negotiation fairness with respect to the mean difference between deal values and the corresponding convergence properties. We conclude that (i) when negotiation time is large, the ratio of deal time to negotiation time is almost constant; (ii) the final agreements of two counter parties who are considering a fair deal will coincide if the deadline is sufficiently late; (iii) when both agents use a time-dependent tactic, the difference between deal values will converge to zero if the negotiation deadline is late enough; (iv) some tactics cannot be suitably applied alone, such as the relative tit-for-tat and averaged tit-for-tat behavior-dependent tactics; (v) in most cases negotiation strategies outperform a single tactic with respect to mean differences between deal values and mean deal time. Specifically, when facing a server agent using the Boulware tactic, all the other tactics (i.e., BC, BL, BIM, BST, and BPA) outperform the single Boulware tactic at the client end. Among them, BL and BIM perform the best. In addition, negotiation strategies require less negotiation time to reach a deal.
When facing a linear-type server, all the tactics (i.e., LB, LIM, LST, LPA, LRE, and LAV) outperform a single Boulware tactic at the client end, except LC and LRA. However, LC requires the least negotiation time to reach a deal. A preliminary study on the efficiency of NDF as compared to TAC agents and the Pareto optimal has also been conducted. The experimental outcomes demonstrate that NDF-based agents outperform others on average, but are incapable of reaching the Pareto optimal. Hence, research efforts such as the trade-off algorithm [11] are worthy of exploration.

This study analyzes the deadline factor. Additional questions regarding asymmetries in reservation values and their effect on convergence would make interesting topics for future research. The use of a general-purpose theorem prover to conduct proof procedures for efficiency is also encouraged. Such a theorem prover could be implemented by a "center", which could then calculate possible outcomes given that agents have an incentive to provide their true model parameters. In addition, a comprehensive simulation-based evaluation of the strategies proposed by the NDF model against other strategies can be carried out using the TAC platform, and an in-depth study of Pareto efficiency would also make a good topic for future research. We are also investigating how to employ NDF-based negotiation in production control systems by considering machine utilization, order due date, and processing time as the issues of factory agents. We expect that such a model would allow the system to autonomously produce an efficient and effective production schedule.

Acknowledgements

The authors gratefully acknowledge the helpful comments and suggestions of the anonymous referees. The authors would like to thank the National Science Council of the Republic of China for financially supporting this research under Contract No. NSC-90-2218-E-033-001.

Appendix A. Time-dependent tactics

Time-dependent tactics provide a means of computing an offer (formula (A.1)) according to the negotiation time passed:


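$$x^t_{a \to b}[j] = \begin{cases} \min^a_j + \alpha^a_j(t)\,(\max^a_j - \min^a_j), & V^a_j \text{ is an increasing function of } x_j, \\ \min^a_j + (1 - \alpha^a_j(t))\,(\max^a_j - \min^a_j), & V^a_j \text{ is a decreasing function of } x_j, \end{cases} \tag{A.1}$$

where $\alpha^a_j(t)$ is a function of the negotiation time passed, $0 \le \alpha^a_j(t) \le 1$, $\alpha^a_j(0) = k^a_j$, $\alpha^a_j(t^a_{\max}) = 1$, and $k^a_j$ is a constant. Suppose agent a's initial offer is $x^0_j$; then $k^a_j$ is defined as follows:

$$k^a_j = \begin{cases} \dfrac{x^0_j - \min^a_j}{\max^a_j - \min^a_j}, & V^a_j \text{ is an increasing function of } x_j, \\[2ex] \dfrac{\max^a_j - x^0_j}{\max^a_j - \min^a_j}, & V^a_j \text{ is a decreasing function of } x_j. \end{cases}$$

α(t) is defined as a polynomial or exponential function as in (3).

$$\begin{aligned} \text{polynomial mode:} \quad & \alpha^a_j(t) = k^a_j + (1 - k^a_j) \left( \frac{\min(t, t^a_{\max})}{t^a_{\max}} \right)^{1/\beta}, \\ \text{exponential mode:} \quad & \alpha^a_j(t) = \exp\!\left( \left( 1 - \frac{\min(t, t^a_{\max})}{t^a_{\max}} \right)^{\beta} \ln k^a_j \right). \end{aligned} \tag{A.2}$$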
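The two modes in (A.2) can be sketched in a few lines of Python. The function names and the values of k and the deadline are assumptions chosen for illustration; only the two formulas themselves come from (A.2).

```python
import math


def alpha_polynomial(t: float, t_max: float, k: float, beta: float) -> float:
    """Polynomial mode of (A.2)."""
    ratio = min(t, t_max) / t_max
    return k + (1.0 - k) * ratio ** (1.0 / beta)


def alpha_exponential(t: float, t_max: float, k: float, beta: float) -> float:
    """Exponential mode of (A.2); requires k > 0 because of ln k."""
    ratio = min(t, t_max) / t_max
    return math.exp((1.0 - ratio) ** beta * math.log(k))


T_MAX, K = 200.0, 0.1  # assumed deadline and initial constant, for illustration only

# Compare a small, a unit, and a large beta halfway through the negotiation.
for beta in (0.1, 1.0, 10.0):
    print(beta,
          round(alpha_polynomial(100.0, T_MAX, K, beta), 4),
          round(alpha_exponential(100.0, T_MAX, K, beta), 4))
```

Both functions start at k for t = 0 and reach 1 at the deadline; in this sketch a small β keeps α(t) low at mid-negotiation while a large β pushes it close to 1 in either mode.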

The behavior of this tactic depends on the convex parameter β. Fig. 16 demonstrates how the modes and β affect α(t). A time-dependent tactic is called a Boulware tactic if β < 1, and a Conceder tactic if β > 1, regardless of the mode used. The Boulware tactic represents reluctance to increase an offer until the negotiation deadline is approached, while the Conceder tactic yields quickly.

Fig. 16. (a) Polynomial mode and (b) exponential mode of α(t).

A.1. Resource-dependent tactics

Resource-dependent tactics consist of two sets: dynamic deadline tactics and resource estimation tactics. Time-dependent tactics are a special set of resource-dependent tactics when time is considered to be a resource. The α function represents a change of resources. In dynamic deadline tactics, the deadline $t^a_{\max}$ is a dynamically changing resource. A lower value of $t^a_{\max}$ indicates less time intended by an agent to reach a deal. $t^a_{\max}$ can be defined as

$$t^a_{\max} = \mu^a \frac{|N^a(t)|^2}{\sum_i |X^t_{i \leftrightarrow a}|}, \tag{A.3}$$

where $N^a(t) = \{ i \mid X^t_{i \leftrightarrow a} \text{ is active} \}$ denotes the set of agents negotiating with agent a at time t, and $|N^a(t)|$ is its size. $\mu^a$ represents the period for which agent a considers negotiation with a single agent to be reasonable. $|X^t_{i \leftrightarrow a}|$ describes the length of the current negotiation thread between i and a. Resource estimation tactics compute an offer according to remaining resources:

$$\alpha^a_j(t) = k^a_j + (1 - k^a_j)\, e^{-\mathrm{resource}^a(t)},$$

where $\mathrm{resource}^a(t)$ can take one of the following forms: (1) $\mathrm{resource}^a(t) = |N^a(t)|$, regarding the agent population as a resource; (2) $\mathrm{resource}^a(t) = \mu^a |N^a(t)|^2 / \sum_i |X^t_{i \leftrightarrow a}|$, regarding the agent and thread length as resources; and (3) $\mathrm{resource}^a(t) = \max(0, t^a_{\max} - t)$, regarding time as a resource.

A.2. Behavior-dependent tactics

Behavior-dependent tactics compute the next offer according to the most recent attitude of the counter party. This group consists of relative tit-for-tat, random absolute tit-for-tat and averaged tit-for-tat tactics [9]. The negotiation thread is represented as follows:

$$\{ \ldots, x^{t_{n-2\delta}}_{b \to a}, x^{t_{n-2\delta+1}}_{a \to b}, x^{t_{n-2\delta+2}}_{b \to a}, \ldots, x^{t_{n-2}}_{b \to a}, x^{t_{n-1}}_{a \to b}, x^{t_n}_{b \to a} \}, \quad \delta \ge 1.$$

In the relative tit-for-tat tactic, an agent reproduces, in percentage terms, the behavior that its counter party exhibited δ ≥ 1 steps ago. This tactic applies for n > 2δ:

$$x^{t_{n+1}}_{a \to b}[j] = \min\!\left( \max\!\left( \frac{x^{t_{n-2\delta}}_{b \to a}[j]}{x^{t_{n-2\delta+2}}_{b \to a}[j]}\, x^{t_{n-1}}_{a \to b}[j],\ \min^a_j \right),\ \max^a_j \right). \tag{A.4}$$

The random absolute tit-for-tat tactic is the same as above but in absolute terms. If the other agent increases the offer by a certain amount, then the response should be an increase by the same amount, except that a random number is added:

$$x^{t_{n+1}}_{a \to b}[j] = \min\!\left( \max\!\left( x^{t_{n-1}}_{a \to b}[j] + \left( x^{t_{n-2\delta}}_{b \to a}[j] - x^{t_{n-2\delta+2}}_{b \to a}[j] \right) + (-1)^s R(M),\ \min^a_j \right),\ \max^a_j \right), \tag{A.5}$$

where R(M) is a function that generates a random integer in the interval [0, M] and

$$s = \begin{cases} 0 & \text{if } V^a_j \text{ is decreasing}, \\ 1 & \text{if } V^a_j \text{ is increasing}. \end{cases}$$

The averaged tit-for-tat tactic computes the average percentage change in the opponent's history in a window of size γ ≥ 1 when determining a new offer. This tactic applies when n > 2γ:

$$x^{t_{n+1}}_{a \to b}[j] = \min\!\left( \max\!\left( \frac{x^{t_{n-2\gamma}}_{b \to a}[j]}{x^{t_n}_{b \to a}[j]}\, x^{t_{n-1}}_{a \to b}[j],\ \min^a_j \right),\ \max^a_j \right). \tag{A.6}$$
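The three tit-for-tat rules (A.4) to (A.6) can be sketched as follows for a single issue. This is a minimal illustration under stated assumptions: the function names, the list-based thread representation (own and opponent offers kept in chronological order) and the example prices are hypothetical and not taken from the authors' implementation.

```python
import random
from typing import Optional, Sequence


def _clip(x: float, lo: float, hi: float) -> float:
    """Keep an offer inside the agent's acceptable range [lo, hi]."""
    return min(max(x, lo), hi)


def relative_tft(own: Sequence[float], opp: Sequence[float],
                 lo: float, hi: float, delta: int = 1) -> float:
    """(A.4): imitate, in percentage terms, the opponent's change delta steps ago."""
    if not own or len(opp) <= delta:
        raise ValueError("thread too short for this delta")
    ratio = opp[-1 - delta] / opp[-delta]
    return _clip(ratio * own[-1], lo, hi)


def random_absolute_tft(own: Sequence[float], opp: Sequence[float],
                        lo: float, hi: float, delta: int = 1, m: int = 0,
                        increasing: bool = False,
                        rng: Optional[random.Random] = None) -> float:
    """(A.5): imitate the opponent's absolute change plus a random integer in [0, m]."""
    if not own or len(opp) <= delta:
        raise ValueError("thread too short for this delta")
    rng = rng or random.Random()
    s = 1 if increasing else 0                 # s as defined below (A.5)
    step = opp[-1 - delta] - opp[-delta]
    noise = (-1) ** s * rng.randint(0, m)      # randint includes both end points
    return _clip(own[-1] + step + noise, lo, hi)


def averaged_tft(own: Sequence[float], opp: Sequence[float],
                 lo: float, hi: float, gamma: int = 1) -> float:
    """(A.6): imitate the opponent's percentage change over a window of size gamma."""
    if not own or len(opp) <= gamma:
        raise ValueError("thread too short for this gamma")
    ratio = opp[-1 - gamma] / opp[-1]
    return _clip(ratio * own[-1], lo, hi)


# Made-up thread: the opponent lowers its asking price, the agent raises its bid.
opp_offers = [30.0, 27.0, 25.0]   # x_{b->a} at t_{n-4}, t_{n-2}, t_n
own_offers = [10.0, 12.0]         # x_{a->b} at t_{n-3}, t_{n-1}

rel = relative_tft(own_offers, opp_offers, 5.0, 35.0, delta=1)
rnd = random_absolute_tft(own_offers, opp_offers, 5.0, 35.0, delta=1, m=0)
avg = averaged_tft(own_offers, opp_offers, 5.0, 35.0, gamma=2)
print(rel, rnd, avg)
```

In the usage example M is set to 0 so that the random term vanishes and the result is reproducible; with M > 0 a seeded `random.Random` instance can be passed as `rng`.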


Appendix B

Three time-dependent tactics are compared to a set of negotiation strategies using a 50–50% weighting (rows: negotiation strategy; columns: merit)

| Negotiation strategy | max($V^T$) | min($V^T$) | Range of $V^T$ | $V^T$ (average) | $X^a(x^T_{a \to b})$ | $V^a(x^T_{a \to b})$ | Averaged deal time $\bar{T}$ | $X^a(x^T_{b \to a})$ | $V^a(x^T_{b \to a})$ | No. of times client accepting an offer | No. of times server accepting an offer |
|---|---|---|---|---|---|---|---|---|---|---|---|
| BSBC | 0.443497 | 0.000070 | 0.443427 | 0.032043 | 14.354900 | 0.476622 | 185.499 | 14.418565 | 0.508664 | 499 | 501 |
| BSBCC | 0.453840 | 0.000003 | 0.453836 | 0.023352 | 17.438901 | 0.509888 | 157.061 | 17.678023 | 0.533239 | 379 | 621 |
| BSBIMC | 0.435049 | 0.000008 | 0.435041 | 0.021302 | 16.703449 | 0.505635 | 168.56 | 17.068558 | 0.526937 | 360 | 640 |
| BSBSTC | 0.512386 | 0.000019 | 0.512367 | 0.027490 | 14.738245 | 0.479521 | 182.061 | 15.171901 | 0.507010 | 357 | 643 |
| BSBPAC | 0.490917 | 0.000029 | 0.490889 | 0.030467 | 13.472756 | 0.460369 | 185.81 | 13.936693 | 0.490836 | 330 | 670 |
| BSBREC | 0.453891 | 0.000002 | 0.453890 | 0.033642 | 14.378998 | 0.468826 | 185.705 | 14.369784 | 0.502468 | 537 | 463 |
| BSBRAC | 0.453891 | 0.000096 | 0.453795 | 0.032818 | 14.818504 | 0.468721 | 184.348 | 14.930862 | 0.501539 | 480 | 520 |
| BSBAVC | 0.453891 | 0.000037 | 0.453854 | 0.035960 | 14.652059 | 0.467298 | 185.189 | 14.655755 | 0.503259 | 527 | 473 |
| BSCC | 0.572673 | 0.000016 | 0.572657 | 0.114061 | 23.313974 | 0.368668 | 84.36 | 21.026860 | 0.482729 | 540 | 460 |
| BSCLC | 0.463145 | 0.000003 | 0.463142 | 0.016565 | 20.300451 | 0.493483 | 123.145 | 20.432613 | 0.510048 | 495 | 505 |
| BSCIMC | 0.445392 | 0.000005 | 0.445387 | 0.023602 | 20.572957 | 0.493055 | 103.51 | 20.600382 | 0.516657 | 482 | 518 |
| BSCSTC | 0.530403 | 0.000052 | 0.530350 | 0.020933 | 18.772172 | 0.530909 | 135.611 | 19.048166 | 0.551842 | 363 | 637 |
| BSCPAC | 0.482650 | 0.000000 | 0.482650 | 0.025758 | 17.473441 | 0.525922 | 148.513 | 17.863273 | 0.551679 | 293 | 707 |
| BSCREC | 0.464234 | 0.000008 | 0.464226 | 0.041347 | 21.524210 | 0.440487 | 85.861 | 21.044626 | 0.481834 | 541 | 459 |
| BSCRAC | 0.464234 | 0.000010 | 0.464224 | 0.047851 | 21.675303 | 0.425602 | 83.164 | 21.049976 | 0.473453 | 550 | 450 |
| BSCAVC | 0.464234 | 0.000009 | 0.464225 | 0.041356 | 21.525440 | 0.443172 | 85.818 | 21.045855 | 0.484528 | 538 | 462 |
| BSLC | 0.462107 | 0.000018 | 0.462088 | 0.013836 | 19.595226 | 0.476426 | 151.082 | 19.812296 | 0.490262 | 520 | 480 |
| BSLBC | 0.452802 | 0.000034 | 0.452768 | 0.020496 | 16.733486 | 0.498654 | 175.004 | 17.032324 | 0.519149 | 412 | 588 |
| BSLIMC | 0.444354 | 0.000003 | 0.444351 | 0.015161 | 19.695087 | 0.499305 | 137.141 | 19.966001 | 0.514466 | 475 | 525 |
| BSLSTC | 0.528840 | 0.000016 | 0.528823 | 0.019000 | 17.967425 | 0.531316 | 160.863 | 18.346935 | 0.550316 | 347 | 653 |
| BSLPAC | 0.481157 | 0.000006 | 0.481150 | 0.023240 | 16.652695 | 0.514088 | 169.946 | 17.132616 | 0.537328 | 312 | 688 |
| BSLREC | 0.463196 | 0.000001 | 0.463195 | 0.013954 | 19.719583 | 0.472771 | 151.405 | 19.929302 | 0.486725 | 529 | 471 |
| BSLRAC | 0.463196 | 0.000016 | 0.463181 | 0.017567 | 20.061229 | 0.461282 | 146.428 | 20.179757 | 0.478849 | 544 | 456 |
| BSLAVC | 0.463196 | 0.000020 | 0.463177 | 0.013960 | 19.809731 | 0.477731 | 150.986 | 20.017418 | 0.491691 | 522 | 478 |
| CSBC | 0.135017 | 0.000000 | 0.135017 | 0.003432 | 6.056068 | 0.137283 | 48.509 | 6.104388 | 0.140715 | 85 | 915 |
| CSBCC | 0.422367 | 0.000443 | 0.421923 | 0.287167 | 13.640210 | 0.512393 | 1.938 | 7.799951 | 0.799560 | 938 | 62 |
| CSBIMC | 0.319195 | 0.000013 | 0.319182 | 0.021180 | 7.360219 | 0.632657 | 8.345 | 7.052508 | 0.653836 | 699 | 301 |
| CSBSTC | 0.135017 | 0.000005 | 0.135013 | 0.004234 | 6.363484 | 0.465598 | 25.348 | 6.393215 | 0.469832 | 454 | 546 |
| CSBPAC | 0.135017 | 0.000003 | 0.135015 | 0.003611 | 6.199541 | 0.328671 | 34.612 | 6.241338 | 0.332282 | 294 | 706 |
| CSBREC | 0.416030 | 0.000643 | 0.415387 | 0.210198 | 11.721421 | 0.589362 | 1.938 | 7.799951 | 0.799560 | 938 | 62 |
| CSBRAC | 0.443370 | 0.000643 | 0.442727 | 0.276840 | 13.130409 | 0.522720 | 1.938 | 7.799951 | 0.799560 | 938 | 62 |
| CSBAVC | 0.446822 | 0.000278 | 0.446544 | 0.234261 | 11.753669 | 0.545984 | 3.764 | 7.430990 | 0.780246 | 888 | 112 |
| CSCC | 0.853225 | 0.000643 | 0.852582 | 0.671293 | 21.224078 | 0.128267 | 1.938 | 7.799951 | 0.799560 | 938 | 62 |
| CSCLC | 0.429553 | 0.000643 | 0.428910 | 0.293833 | 13.771785 | 0.505727 | 1.938 | 7.799951 | 0.799560 | 938 | 62 |
| CSCIMC | 0.739412 | 0.000445 | 0.738967 | 0.310927 | 14.083905 | 0.488633 | 1.938 | 7.799951 | 0.799560 | 938 | 62 |
| CSCSTC | 0.422367 | 0.000443 | 0.421923 | 0.287167 | 13.640210 | 0.512393 | 1.938 | 7.799951 | 0.799560 | 938 | 62 |
| CSCPAC | 0.422367 | 0.000443 | 0.421923 | 0.287167 | 13.640210 | 0.512393 | 1.938 | 7.799951 | 0.799560 | 938 | 62 |
| CSCREC | 0.812165 | 0.000643 | 0.811522 | 0.594324 | 19.305289 | 0.205236 | 1.938 | 7.799951 | 0.799560 | 938 | 62 |
| CSCRAC | 0.838531 | 0.000643 | 0.837888 | 0.660966 | 20.714277 | 0.138594 | 1.938 | 7.799951 | 0.799560 | 938 | 62 |
| CSCAVC | 0.422367 | 0.000443 | 0.421923 | 0.287167 | 13.640210 | 0.512393 | 1.938 | 7.799951 | 0.799560 | 938 | 62 |
| CSLC | 0.135017 | 0.000006 | 0.135011 | 0.006486 | 6.921426 | 0.542122 | 9.933 | 6.922167 | 0.548609 | 557 | 443 |
| CSLBC | 0.135017 | 0.000002 | 0.135015 | 0.004666 | 6.680639 | 0.489916 | 14.71 | 6.708546 | 0.494583 | 488 | 512 |
| CSLIMC | 0.326252 | 0.000011 | 0.326241 | 0.022896 | 7.475460 | 0.627898 | 6.978 | 7.140030 | 0.650794 | 698 | 302 |
| CSLSTC | 0.135017 | 0.000001 | 0.135017 | 0.005112 | 6.709848 | 0.515748 | 14.002 | 6.729530 | 0.520860 | 520 | 480 |
| CSLPAC | 0.135017 | 0.000002 | 0.135015 | 0.004729 | 6.683244 | 0.496549 | 14.648 | 6.710201 | 0.501278 | 496 | 504 |
| CSLREC | 0.422797 | 0.000643 | 0.422154 | 0.216864 | 11.852996 | 0.582696 | 1.938 | 7.799951 | 0.799560 | 938 | 62 |
| CSLRAC | 0.450157 | 0.000643 | 0.449514 | 0.283506 | 13.261984 | 0.516054 | 1.938 | 7.799951 | 0.799560 | 938 | 62 |
| CSLAVC | 0.452019 | 0.000018 | 0.452001 | 0.241127 | 11.856057 | 0.525297 | 3.707 | 7.439838 | 0.766423 | 871 | 129 |
| LSBC | 0.035157 | 0.000025 | 0.035132 | 0.006407 | 8.415065 | 0.536877 | 162.787 | 8.374928 | 0.543284 | 567 | 433 |
| LSBCC | 0.169517 | 0.000005 | 0.169512 | 0.008098 | 15.158110 | 0.509956 | 83.52 | 15.129671 | 0.518054 | 168 | 832 |
| LSBIMC | 0.059345 | 0.000000 | 0.059345 | 0.004241 | 13.641549 | 0.469737 | 100.686 | 13.704392 | 0.473979 | 230 | 770 |
| LSBSTC | 0.010408 | 0.000002 | 0.010406 | 0.004047 | 10.805738 | 0.433362 | 133.909 | 10.853053 | 0.437409 | 335 | 665 |
| LSBPAC | 0.012745 | 0.000004 | 0.012741 | 0.004158 | 9.521037 | 0.441583 | 148.786 | 9.552634 | 0.445740 | 398 | 602 |
| LSBREC | 0.030385 | 0.000017 | 0.030369 | 0.006088 | 8.392885 | 0.529893 | 162.946 | 8.359804 | 0.535981 | 560 | 440 |
| LSBRAC | 0.080426 | 0.000002 | 0.080423 | 0.012166 | 9.099455 | 0.573982 | 155.929 | 8.944778 | 0.586149 | 655 | 345 |
| LSBAVC | 0.027965 | 0.000002 | 0.027963 | 0.006142 | 8.473570 | 0.516009 | 162.084 | 8.438137 | 0.522151 | 548 | 452 |
| LSCC | 0.575779 | 0.000009 | 0.575770 | 0.108256 | 23.131460 | 0.393230 | 29.466 | 20.570823 | 0.501487 | 514 | 486 |
| LSCLC | 0.176320 | 0.000002 | 0.176318 | 0.008181 | 17.528246 | 0.523271 | 60.742 | 17.447711 | 0.531452 | 392 | 608 |
| LSCIMC | 0.375895 | 0.000011 | 0.375884 | 0.013859 | 19.381727 | 0.496074 | 43.799 | 19.147338 | 0.509933 | 483 | 517 |
| LSCSTC | 0.169517 | 0.000019 | 0.169498 | 0.007744 | 16.867961 | 0.524247 | 67.306 | 16.811773 | 0.531991 | 352 | 648 |
| LSCPAC | 0.169517 | 0.000011 | 0.169506 | 0.007662 | 15.959071 | 0.520191 | 76.073 | 15.913609 | 0.527853 | 295 | 705 |
| LSCREC | 0.217608 | 0.000006 | 0.217602 | 0.035066 | 21.246283 | 0.450344 | 30.352 | 20.509822 | 0.485410 | 536 | 464 |
| LSCRAC | 0.265616 | 0.000006 | 0.265610 | 0.040473 | 21.496603 | 0.413916 | 29.18 | 20.622630 | 0.454389 | 578 | 422 |
| LSCAVC | 0.219596 | 0.000009 | 0.219587 | 0.035937 | 21.318357 | 0.450315 | 29.872 | 20.560285 | 0.486252 | 536 | 464 |
| LSLC | 0.009783 | 0.000008 | 0.009775 | 0.004734 | 14.562296 | 0.495906 | 94.072 | 14.559898 | 0.500640 | 516 | 484 |
| LSLBC | 0.013211 | 0.000000 | 0.013210 | 0.004187 | 11.899987 | 0.470679 | 122.278 | 11.931999 | 0.474866 | 400 | 600 |
| LSLIMC | 0.070757 | 0.000001 | 0.070756 | 0.004949 | 16.223429 | 0.506082 | 75.066 | 16.228947 | 0.511031 | 444 | 556 |
| LSLSTC | 0.009938 | 0.000010 | 0.009928 | 0.004570 | 13.915543 | 0.492424 | 100.791 | 13.928376 | 0.496994 | 463 | 537 |
| LSLPAC | 0.009798 | 0.000008 | 0.009790 | 0.004413 | 12.909320 | 0.483474 | 111.597 | 12.933250 | 0.487887 | 429 | 571 |
| LSLREC | 0.010097 | 0.000001 | 0.010096 | 0.004698 | 14.570190 | 0.495727 | 94.11 | 14.562929 | 0.500425 | 514 | 486 |
| LSLRAC | 0.063279 | 0.000008 | 0.063271 | 0.011208 | 15.264535 | 0.479122 | 88.059 | 15.106058 | 0.490330 | 673 | 327 |
| LSLAVC | 0.010238 | 0.000007 | 0.010232 | 0.004895 | 14.664208 | 0.495976 | 93.172 | 14.660067 | 0.500871 | 524 | 476 |


Competition in TAC games

| Game ID | Player name | Score in TAC |
|---|---|---|
| 11,763 | Hank (NDF) | 5003.28 |
| | Dummy buyer1 (TAC) | 3285.14 |
| | Dummy buyer0 (TAC) | 3121.39 |
| | Dummy buyer4 (TAC) | 2684.74 |
| | Dummy buyer3 (TAC) | 2109.47 |
| | Dummy buyer5 (TAC) | 2035.22 |
| | Dummy buyer6 (TAC) | 1971.51 |
| | Dummy buyer2 (TAC) | −3489.75 |
| 11,743 | Hank2 (NDF) | 4464.50 |
| | Dummy buyer0 (TAC) | 4084.00 |
| | Dummy buyer1 (TAC) | 3769.00 |
| | Dummy buyer2 (TAC) | 3741.62 |
| | Hank3 (NDF) | 3017.00 |
| | Hank1 (NDF) | 2468.50 |
| | Tong (Noise) | 2261.00 |
| | Hank4 (NDF) | 181.38 |
| 11,767 | Hank2 (MediocreAgent) | 4056.00 |
| | Hank (NDF) | 3763.00 |
| | Hank7 (MediocreAgent) | 3112.00 |
| | Hank6 (MediocreAgent) | 3110.00 |
| | Hank3 (MediocreAgent) | 1148.00 |
| | Hank4 (MediocreAgent) | 196.00 |
| | Hank1 (MediocreAgent) | 146.00 |
| | Hank5 (MediocreAgent) | −337.00 |
| 11,755 | Hank12 (NDF with β = 10) | 4841.00 |
| | Hank7 (NDF with β = 0.01) | 4626.00 |
| | Hank8 (NDF with β = 0.01) | 4424.00 |
| | Hank10 (NDF with β = 10) | 3334.00 |
| | Hank9 (NDF with β = 10) | 475.00 |
| | Tvad (Noise) | 0.00 |
| | Hank6 (NDF with β = 0.01) | −266.00 |
| | Hank11 (NDF with β = 10) | −556.00 |

Experiment setting for comparison with Pareto optimal

| Run number | Agent | Tactic/strategy | Parameter, β | Tactic weight |
|---|---|---|---|---|
| 1 | a | TimeDepTactic.POLYNOMIAL | 0.5 | 1 |
| | b | TimeDepTactic.POLYNOMIAL | 0.5 | 1 |
| 2 | a | TimeDepTactic.POLYNOMIAL | 0.5 | 1 |
| | b | TimeDepTactic.POLYNOMIAL | 10 | 1 |
| 3 | a | TimeDepTactic.POLYNOMIAL | 0.7 | 0.7 |
| | a | TimeDepTactic.EXPONENTIAL | 0.7 | 0.3 |
| | b | TimeDepTactic.POLYNOMIAL | 2 | 0.7 |
| | b | TimeDepTactic.EXPONENTIAL | 2 | 0.3 |

References

[1] R. Davis, R.G. Smith, Negotiation as a metaphor for distributed problem solving, Artificial Intelligence 20 (1) (1983) 63–109.
[2] D. Veeramani, K.J. Wang, Performance analysis of auction-based distributed shop-floor control schemes from the perspective of the communication system, International Journal of Flexible Manufacturing Systems 9 (1997) 121–143.
[3] E.H. Durfee, V.R. Lesser, Negotiating task decomposition and allocation using partial global planning, Distributed Artificial Intelligence 2 (1989) 229–243.
[4] S. Kraus, J. Wilkenfeld, G. Zlotkin, Multiagent negotiation under time constraints, Artificial Intelligence 75 (2) (1995) 297–345.
[5] S. Kraus, D. Lehmann, Designing and building a negotiating autonomous agent, Computational Intelligence 11 (1) (1995) 132–171.
[6] A. Greenwald, P. Stone, The first international trading agent competition: autonomous bidding agents, Journal of Electronic Commerce Research, in press. http://auction2.eecs.umich.edu/researchreport.html.
[7] W. Shen, D.H. Norrie, Agent-based systems for intelligent manufacturing: a state-of-the-art survey, International Journal Knowledge and Information Systems 1 (2) (1999) 129–156.
[8] H.V.D. Parunak, Agents in overalls: experiences and issues in the development and deployment of industrial agent-based systems, International Journal of Cooperative Information Systems 9 (3) (2000) 209–227.
[9] P. Faratin, C. Sierra, N.R. Jennings, Negotiation decision functions for autonomous agents, Robotics and Autonomous Systems 24 (3) (1998) 159–182.
[10] P. Faratin, Automated service negotiation between autonomous computational agents, Ph.D. Dissertation, Department of Electronic Engineering, Queen Mary College, University of London, 2000.
[11] P. Faratin, C. Sierra, N.R. Jennings, Using similarity criteria to make negotiation trade-offs, in: Proceedings of the Fourth International Conference on Multiagent Systems, 2000, pp. 119–126.
[12] D.G. Pruitt, Negotiation Behavior, Academic Press, New York, 1981.
[13] H. Raiffa, The Art and Science of Negotiation, Harvard University Press, Cambridge, MA, 1982.
[14] Swarm Development Group (SDG), Introduction to Swarm, 1999. http://www.swarm.org.
[15] TAC, 2002. http://auction2.eecs.umich.edu/.
[16] MediocreAgent, 2002. http://www.cs.mu.oz.au/~scv/tac/gettingstarted.html.

Kung-Jeng Wang has been a faculty member in the Industrial Engineering Department of Chung-Yuan Christian University since 1997. He received a BA in industrial engineering from CYCU, and an MS in computer science and a Ph.D. in industrial engineering from the University of Wisconsin at Madison, USA. He is now the Director of the Laboratory for Intelligent Production Decision Technologies of CYCU. Dr. Wang currently works closely with semiconductor companies in Taiwan on issues of manufacturing management. He has published articles in the International Journal of Flexible Manufacturing Systems, IIE Transactions, Production Planning and Control, International Journal of Computer Integrated Manufacturing, Journal of Robotics and CIM, Journal of the Chinese Society of Mechanical Engineers, and Journal of the Chinese Institute of Industrial Engineers. His current research is in the areas of performance analysis of production systems and intelligent manufacturing control.

Chung-How Chou received his master's degree in industrial engineering from Chung-Yuan Christian University in 2000. His research focuses on agent-based production systems. He currently works in an MIS branch of Chunghwa Picture Tubes Co., one of the leading LCD manufacturing companies.