Chapter 7
The Competition Problem
A procedure for choosing decisions is proposed in a mathematical model of competition between two similar economies. Games with separable and with mirror (anticommutative) vector-valued payoff functions are analyzed. At the end of the chapter an optimal solution is obtained in a model of the competitive exploration of a scientific problem.
1. Mathematical Model of Competition
1.1. A Model of a Specific System
Two similar economic units (factories, corporations, concerns, or states) have conflicting interests. The dynamics of one economy are assumed to be described by the vector differential equation

$$\dot{x} = f(t, x, u), \qquad x[t_0] = x_0, \tag{1.1}$$

where $x \in \mathbb{R}^n$ is the state vector; $t \in [t_0, \theta]$ is time; $\theta > t_0 \geq 0$ are constants; $u \in \mathbb{R}^h$ is the control vector; and $u \in H$, the collection of control actions in the system. Numerous kinds of right-hand sides of equation (1.1) used in economic applications of the theory of differential games have been reported [12]. In the model of the simultaneous exploration of a scientific problem by two similar firms, $f = u$ (section 4). The right-hand side $f(t, x, u)$ of the system (1.1) will be assumed to satisfy Condition 1.1 of Chapter 1, that is, $f(t, x, u)$ is continuous over the set of arguments and locally Lipschitz with respect to $x$, $H \in \operatorname{comp} \mathbb{R}^h$, and $\exists \gamma = \text{const} > 0$: $\|f\| \leq \gamma(1 + \|x\|)$ uniformly with respect to $t$, $x$, and $u$.
Disregarding the competition with the other economy, (1) the set of strategies is

$$\mathcal{U} = \{U \div u(t, x) \mid u(t, x) \in H,\ \forall (t, x) \in [t_0, \theta] \times \mathbb{R}^n\}; \tag{1.2}$$
(2) the performance of this system is described by the set of criteria $\mathbb{F}(x[\theta]) = (F_1(x[\theta]), \ldots, F_N(x[\theta]))$.
For instance, in comparing specific designs built around different control systems, a choice has to be made among the following criteria [25, pp. 25-27]: (1) adaptability to variations of the process gain; (2) insensitivity to variations of the gain; (3) static accuracy of the system; (4) noise immunity of the system; (5) ease of system restructuring. In addition, the collections of criteria reflect: (1) ease of design of the systems; (2) adaptability of the system to streamlined manufacture; (3) complexity of system start-up; (4) system performance; (5) system safety.

The scalar functions $F_i(x)$, $i \in N$, will be assumed hereafter to be continuous on $\mathbb{R}^n$, and Condition 1.1 of Chapter 5 will be assumed to hold.

Disregarding the interaction of this system with the other system and with the environment, the economic management process may be described in the following way (Fig. 7.1.1). Let us assume that a certain strategy $U \div u(t, x)$ is chosen and that, depending upon the accuracy of approaching the optimal values of the criteria $F_i(x)$, $i \in N$, the following are established: a partition $\Delta$: $t_0 = \tau_0 < \tau_1 < \cdots < \tau_{m(\Delta)} = \theta$, the initial value of the vector $x_0$, and the number $\alpha$ reflecting the sum of the steps of stepwise quasimotions at the partition points. At time $\tau_0$ the controller generates a vector $\hat{x}_0$, where $\|x_0 - \hat{x}_0\|$ is chosen so that the sum of the steps of the stepwise quasimotion is less than $\alpha$, and a value of the strategy $U$, i.e., the vector $u(\tau_0, \hat{x}_0)$, which is fed to the system $\Sigma$. This system generates the first interval $x(t)$, $\tau_0 \leq t \leq \tau_1$, of the stepwise quasimotion by means of the stepwise equation

$$x(t) = \hat{x}_0 + \int_{\tau_0}^{t} f(\tau, x(\tau), u(\tau_0, \hat{x}_0))\, d\tau.$$

Figure 7.1.1. The system $\Sigma$ and the regulator $U \div u(t, x)$.
The measuring unit determines at time $t = \tau_1$ the value of the state vector $x(\tau_1) = x^0(\tau_1)$, which is passed to the regulator; the regulator forms, in accordance with $\alpha$, the vector $\hat{x}_1$ and the value $u(\tau_1, \hat{x}_1)$ and transmits them to $\Sigma$. By means of the stepwise equation

$$x(t) = \hat{x}_1 + \int_{\tau_1}^{t} f(\tau, x(\tau), u(\tau_1, \hat{x}_1))\, d\tau,$$

the system generates the second interval $x(t)$, $\tau_1 < t \leq \tau_2$, of the stepwise quasimotion, and the measuring unit determines $x(\tau_2) = x^0(\tau_2)$, which is fed to the controller. Following the strategy $U$ and the number $\alpha$, the controller determines the vectors $\hat{x}_2$, $u(\tau_2, \hat{x}_2)$, and so on. As a result we have a stepwise quasimotion $x(t)$, $\tau_0 \leq t \leq \theta$, whose right end $x(\theta)$ specifies the values of the criteria $F_i(x(\theta))$, $i \in N$. Note again that the 3-tuple $(\Delta, x^{(\Delta)}, \alpha^{(\Delta)})$ is governed, by subsection 2.3 of Chapter 1, by the admissible approach of $F_i(x(\theta))$, $i \in N$, to the values of $F_i(x[\theta, t_0, x_0, U])$, where $x[\cdot, t_0, x_0, U]$ are quasimotions of the system (1.1) generated by the strategy $U$ from the initial position $(t_0, x_0)$.
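The stepwise construction just described can be sketched numerically. The following is a minimal illustration, not the book's procedure: it assumes error-free measurement (so the regulator's vector coincides with the measured state at each partition point), and it approximates each integral with explicit Euler substeps. The names `stepwise_motion` and `substeps` are hypothetical.

```python
import numpy as np

def stepwise_motion(f, u, x0, partition, substeps=200):
    """Return the states at the partition points of a stepwise motion.

    f(t, x, u) is the right-hand side, u(t, x) the positional strategy.
    Assumption: the measured state is exact, so the control on
    [tau_k, tau_{k+1}] is frozen at u(tau_k, x(tau_k)).
    """
    x = np.asarray(x0, dtype=float)
    nodes = [x.copy()]
    for tau_k, tau_next in zip(partition[:-1], partition[1:]):
        u_k = np.asarray(u(tau_k, x), dtype=float)  # control held on the interval
        h = (tau_next - tau_k) / substeps
        t = tau_k
        for _ in range(substeps):  # explicit Euler approximation of the integral
            x = x + h * np.asarray(f(t, x, u_k), dtype=float)
            t += h
        nodes.append(x.copy())
    return nodes

# Usage with the right-hand side f = u mentioned in the text and a
# hypothetical constant strategy:
nodes = stepwise_motion(lambda t, x, u: u,
                        lambda t, x: np.array([0.5, -0.5]),
                        [0.0, 0.0],
                        np.linspace(0.0, 1.0, 5))
```

The last element of `nodes` plays the role of the right end $x(\theta)$ at which the criteria are evaluated.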
1.2. A Model of Competition
If there were no competition, the goal of the first economy would be to choose a strategy $U \in \mathcal{U}$ so as to obtain the largest possible values of all the criteria $F_i(x[\theta])$, $i \in N$, simultaneously. In this case the results of Chapters 2-4 could be used. Now let us proceed to a mathematical model of competition between two economies that are similar in that:
(1) the dynamics of the second economy are specified by the same differential equation

$$\dot{y} = f(t, y, v), \qquad y[t_0] = y_0; \tag{1.3}$$

(2) the initial values of the state vectors are identical, $x_0 = y_0$;
(3) the control action $v \in Q$, that is, the sets of values of $u$ and $v$ coincide, whence the set of strategies of the second economy is $\mathcal{V} = \{V \div v(t, x) \mid v(t, x) \in Q\}$;
(4) the performance of the second economy is evaluated in terms of the same set of criteria $\mathbb{F}(y[\theta]) = (F_1(y[\theta]), \ldots, F_N(y[\theta]))$.
Consequently, the competition between these two systems can be modeled as a differential zero-sum positional game with vector-valued payoff function

$$\langle \{1, 2\},\ \Sigma^c,\ \{\mathcal{U}^c, \mathcal{V}^c\},\ \mathbb{F}^c(x[\theta], y[\theta]) \rangle. \tag{1.4}$$

Here the system $\Sigma^c$ is described by the ordinary differential equations

$$\dot{x} = f(t, x, u), \qquad \dot{y} = f(t, y, v), \tag{1.5}$$

where $x, y \in \mathbb{R}^n$; the control action of the first (second) player is $u \in Q \in \operatorname{comp} \mathbb{R}^q$ ($v \in Q$); time $t \in [t_0, \theta]$; the constants $\theta > t_0 \geq 0$; the right-hand sides satisfy Condition 1.1 of Chapter 1; and the initial position $(t_0, x_0, y_0) = (t_0, x_0, y_0 = x_0) \in [0, \theta) \times \mathbb{R}^n \times \mathbb{R}^n$. The set of strategies of the first (second) player is $\mathcal{U}^c$ ($\mathcal{V}^c$).
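The vector-valued payoff function is

$$\mathbb{F}^c(x[\theta], y[\theta]) = \mathbb{F}(y[\theta]) - \mathbb{F}(x[\theta]) \tag{1.7}$$

or, in coordinate form,

$$F_i^c(x[\theta], y[\theta]) = F_i(y[\theta]) - F_i(x[\theta]), \qquad i \in N.$$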
The differential positional zero-sum game (1.4) will be referred to as the competition problem. In the game (1.4) the second player chooses $V \in \mathcal{V}^c$ so as to obtain the largest possible values of the components of the payoff function

$$F_i^c(x[\theta], y[\theta]) = F_i(y[\theta]) - F_i(x[\theta]), \qquad i \in N,$$

and thus to increase the gap between the values of the similar scalar criteria $F_i(y[\theta])$ and $F_i(x[\theta])$. Using the strategy $U \in \mathcal{U}^c$, the first player tries to obtain the lowest possible values of the components $F_i^c(x[\theta], y[\theta])$, $i \in N$. Because

$$\min [F_i(y) - F_i(x)] = -\max [F_i(x) - F_i(y)],$$

both players try to increase the gap between the values of the similar scalar criteria $F_i(x[\theta])$ and $F_i(y[\theta])$, $i \in N$.
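A small numerical check may help fix the sign conventions of (1.7). The sketch below is only an illustration: `example_criteria` is a hypothetical pair of criteria that does not come from the text, and the identity is verified over a finite sample of terminal states rather than over reachability sets.

```python
import numpy as np

def competition_payoff(F, x_end, y_end):
    """Vector payoff of the competition problem: F^c(x, y) = F(y) - F(x)."""
    return np.asarray(F(y_end), dtype=float) - np.asarray(F(x_end), dtype=float)

def example_criteria(state):
    """Hypothetical two-criterion evaluation of a terminal state."""
    s = np.asarray(state, dtype=float)
    return np.array([s[0] + s[1], s[0] - 2.0 * s[1]])

def min_equals_minus_max(F, pairs):
    """Check min[F_i(y) - F_i(x)] = -max[F_i(x) - F_i(y)] for each i over sampled pairs."""
    gaps_second = np.array([competition_payoff(F, x, y) for x, y in pairs])  # F(y) - F(x)
    gaps_first = np.array([competition_payoff(F, y, x) for x, y in pairs])   # F(x) - F(y)
    return bool(np.allclose(gaps_second.min(axis=0), -gaps_first.max(axis=0)))

pairs = [((0.0, 0.0), (1.0, 1.0)), ((1.0, 2.0), (0.5, -1.0)), ((2.0, -1.0), (2.0, -1.0))]
identity_holds = min_equals_minus_max(example_criteria, pairs)
```

Swapping the two arguments of `competition_payoff` changes the sign of every component, which is why minimizing the second player's gap is the same as maximizing the first player's gap.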
1.3. Optimal Decision-Making in Competition Problems
By the solution of the competition model (1.4) we will understand the Slater saddle point $(U^{ZS}, V^{ZS}) \in \mathcal{U}^c \times \mathcal{V}^c$ at which the value of the payoff function is equal to both the Slater maximin and the Slater minimax.

Definition 1.1. The ZS-solution of (1.4) is the situation $(U^{ZS}, V^{ZS}) \in \mathcal{U}^c \times \mathcal{V}^c$ which

(1) is the Slater saddle point of this game:

$$\mathbb{F}^c(x[\theta, t_0, x_0, U^{ZS}], y[\theta, t_0, x_0, V]) \ngtr \mathbb{F}^c(x[\theta, t_0, x_0, U^{ZS}], y[\theta, t_0, x_0, V^{ZS}]) \ngtr \mathbb{F}^c(x[\theta, t_0, x_0, U], y[\theta, t_0, x_0, V^{ZS}]) \tag{1.8}$$

for any strategies $V \in \mathcal{V}^c$, $U \in \mathcal{U}^c$ and all quasimotions $x[\cdot, t_0, x_0, U]$, $y[\cdot, t_0, x_0, V]$;

(2) there are quasimotions $(\hat{x}[\cdot, t_0, x_0, U^{ZS}], \hat{y}[\cdot, t_0, x_0, V^{ZS}])$ for which there exist a Slater maximin and a Slater minimax such that

$$\mathbb{F}^c(\hat{x}[\theta, t_0, x_0, U^{ZS}], \hat{y}[\theta, t_0, x_0, V^{ZS}]) = \min{}^S \bigcup_{U \in \mathcal{U}^c} \max{}^S_{x[\cdot],\, y[\cdot]} \mathbb{F}^c(x[\theta, t_0, x_0, U], y[\theta, t_0, x_0, V \div Q]) = \max{}^S \bigcup_{V \in \mathcal{V}^c} \min{}^S_{x[\cdot],\, y[\cdot]} \mathbb{F}^c(x[\theta, t_0, x_0, U \div Q], y[\theta, t_0, x_0, V]). \tag{1.9}$$
Note that it follows from (1.9) and Proposition 2.8 from Chapter 6 that the strategy $U^{ZS}$ is Slater-minimax and $V^{ZS}$ is Slater-maximin, while the associated points $(\hat{x}^S[U^{ZS}], \hat{y}^S[V \div Q])$ and $(\hat{x}_S[U \div Q], \hat{y}_S[V^{ZS}])$ coincide with $(\hat{x}[\theta, t_0, x_0, U^{ZS}], \hat{y}[\theta, t_0, x_0, V^{ZS}])$.

1.4. A Geometric Interpretation of the ZS-Solution
A geometric interpretation of the ZS-solution will be provided for the case N = {1,2}.
First, let us give an interpretation of the equality

$$\mathbb{F}^c(\hat{x}[\theta, t_0, x_0, U^{ZS}], \hat{y}[\theta, t_0, x_0, V^{ZS}]) = \min{}^S_{x[\cdot],\, y[\cdot]} \mathbb{F}^c(x[\theta, t_0, x_0, U \div Q], y[\theta, t_0, x_0, V^{ZS}]). \tag{1.10}$$

Because $V^{ZS}$ is a Slater-maximin strategy and

$$(\hat{x}[\theta, t_0, x_0, U^{ZS}], \hat{y}[\theta, t_0, x_0, V^{ZS}]) = (\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$$

are the associated points of Definition 1.1 in Chapter 6, it follows from the equality (1.10) that

$$\mathbb{F}^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}]) \ngtr \mathbb{F}^c(x[\theta, t_0, x_0, U], y[\theta, t_0, x_0, V^{ZS}])$$

for any strategies $U \in \mathcal{U}^c$ and quasimotions $x[\cdot, t_0, x_0, U]$, $y[\cdot, t_0, x_0, V^{ZS}]$ of the system (1.5) with $x_0 = y_0$. The latter relation signifies that none of the points of the set

$$\mathbb{F}^c(X[U \div Q], Y[V^{ZS}]) = \bigcup_{x \in X[U \div Q],\ y \in Y[V^{ZS}]} \mathbb{F}^c(x, y)$$

will land strictly inside the angular region $G_1$ with vertex $\mathbb{F}^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$ (Fig. 7.1.2).

Figure 7.1.2.
Consequently, if the maximizing player has chosen and has been using the Slater-maximin strategy $V^{ZS}$, it will not be possible for all the components of the vector-valued payoff function $\mathbb{F}^c(x[\theta], y[\theta])$ to be less than those of $\mathbb{F}^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$; in other words, by choosing $V^{ZS}$, the maximizing player assures for himself components of the vector-valued payoff function that are at least equal to those of $\mathbb{F}^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$. On the other hand, by (1.10), the values of the vector-valued payoff function $\mathbb{F}^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$ are such that

$$\mathbb{F}^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}]) = \max{}^S \bigcup_{V \in \mathcal{V}^c} \min{}^S_{x[\cdot],\, y[\cdot]} \mathbb{F}^c(x[\theta, t_0, x_0, U \div Q], y[\theta, t_0, x_0, V]) \tag{1.11}$$
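for any $V \in \mathcal{V}^c$. Let us denote by $X$ the reachability domain of the first subsystem in (1.5), while $Y[V]$ denotes the set of right ends $y[\theta, t_0, x_0, V]$ of the quasimotions of the second subsystem generated by the strategy $V$.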
Then (1.11) implies that the values $\mathbb{F}^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$ of the vector-valued payoff function become equal to the Slater-minimal values of the set $\mathbb{F}^c(X, Y[V])$ for every fixed strategy $V \in \mathcal{V}^c$, or coincide with the points of the set

$$\mathcal{F}r_{(S)} \mathbb{F}^c(X, Y[V]) = \{\mathbb{F}^c_S \in \mathbb{F}^c(X, Y[V]) \mid \mathbb{F}^c_S \ngtr \mathbb{F}^c,\ \forall\, \mathbb{F}^c \in \mathbb{F}^c(X, Y[V])\}.$$

Therefore, by (1.11), for no $V \in \mathcal{V}^c$ will the points of $\mathcal{F}r_{(S)} \mathbb{F}^c(X, Y[V])$ land inside the right angle $G_2$ with vertex at the point $\mathbb{F}^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$ but with sides directed opposite to those of $G_1$ (Fig. 7.1.3). Figure 7.1.3 shows the possible positions of three sets $\mathcal{F}r_{(S)} \mathbb{F}^c(X, Y[V^{(k)}])$, $k = 1, 2, 3$, relative to the point $\mathbb{F}^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$.

Figure 7.1.3.

Consequently, in addition to being the value at the Slater saddle point, the Slater maximin $\mathbb{F}^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$ defines the northeast boundary of the sets $\mathcal{F}r_{(S)} \mathbb{F}^c(X, Y[V])$: no values of $\mathbb{F}^c$ from the sets $\mathcal{F}r_{(S)} \mathbb{F}^c(X, Y[V])$, for any $V \in \mathcal{V}^c$, can exceed $\mathbb{F}^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$ componentwise. Here $\mathbb{F}^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$ is the Slater maximum of all Slater minima $\mathcal{F}r_{(S)} \mathbb{F}^c(X, Y[V])$, $\forall V \in \mathcal{V}^c$. Consequently, if for every strategy $V \in \mathcal{V}^c$ a set of Slater minima $\mathcal{F}r_{(S)} \mathbb{F}^c(X, Y[V])$ is obtained in the multicriterial static problem

$$\langle X \times Y[V],\ \mathbb{F}^c(x, y) \rangle, \tag{1.12}$$

the best (Slater-maximal) value for the maximizing player is $\mathbb{F}^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$, which is obtained for the Slater-maximin strategy $V^{ZS}$. The associated point $(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}]) \in X \times Y[V^{ZS}]$ is the Slater-maximal solution of (1.12) with $V = V^{ZS}$. A similar geometric interpretation of the Slater-minimax strategy $U^{ZS}$ is shown in Fig. 7.1.4. Here the reachability domain of the second subsystem from (1.5) is
moreover,
is the set of Slater maxima of the two-criterion problem
for every fixed strategy $U \in \mathcal{U}^c$ of the first player.

Figure 7.1.4.
Consequently, using the strategy $U^{ZS}$ the first player restricts to the angle $G$ the maximal values of $\mathbb{F}^c(X[U^{ZS}], Y)$ by $\mathbb{F}^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$. Simultaneously, the same value of the payoff function is the least of all Slater maxima $\mathbb{F}^c \in \mathcal{F}r^{(S)} \mathbb{F}^c(X[U], Y)$ of problem (1.13) for any $U \in \mathcal{U}^c$. Consequently, the best, or Slater-minimal, value of $\mathbb{F}^c(x[\theta], y[\theta])$ for the minimizing first player is $\mathbb{F}^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$ (the angular region $G$ is the lower bound). Because, by (1.9), these values $\mathbb{F}^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$ are the same for the first and second players, the situation $(U^{ZS}, V^{ZS})$ of Definition 1.1 is the most acceptable (for both players) solution of a differential zero-sum game with vector-valued payoff function.

We have, therefore, chosen the ZS-solution as the optimal solution of the competition problem (1.4) for the following reasons (regarding (2)-(4), see the end of subsection 2.5 in Chapter 6).
(1) The ZS-solution $(U^{ZS}, V^{ZS})$ of the game (1.4) with initial position $(t_0, x_0)$ is dynamically stable, that is, $(U^{ZS}, V^{ZS})$ remains the ZS-solution of this game with the current initial position $(t, x[t, t_0, x_0, U^{ZS}], y[t, t_0, x_0, V^{ZS}])$ at any $t \in [t_0, \theta]$ and for any quasimotions $x[\cdot, t_0, x_0, U^{ZS}]$ and $y[\cdot, t_0, x_0, V^{ZS}]$.
(2) The ZS-solution is internally stable.
(3) The ZS-solution is nonimprovable by using Slater saddle points.
(4) The ZS-solution ensures the maximal possible vector-valued guarantees for both players simultaneously.
1.5. Procedure for the Construction of a ZS-Solution
The most important finding of this chapter is the following procedure for the construction of the ZS-solution:
(1) Find at least one Slater-maximal strategy $U^S \in \mathcal{U}$ ($U^S \div u^S(t, x)$) of the multicriterial dynamic problem

$$\langle \Sigma,\ \mathcal{U},\ \mathbb{F}(x[\theta]) \rangle, \tag{1.14}$$

where the set of strategies $\mathcal{U}$ is defined in (1.2), $H = Q$, and the system $\Sigma$ in (1.1); simultaneously, identify at least one quasimotion $\hat{x}[\cdot, t_0, x_0, U^S]$ of the system (1.1) generated from the initial position $(t_0, x_0)$ by the strategy $U^S$.
(2) The ZS-solution of (1.4) is the situation $(U^S, V^S)$, where $V^S \div u^S(t, y)$ and

$$\mathbb{F}^c(\hat{x}[U^S], \hat{y}[V^S]) = \mathbb{F}^c(\hat{x}[\theta, t_0, x_0, U^S], \hat{y}[\theta, t_0, x_0, V^S]),$$

where

$$\hat{y}[t, t_0, x_0, V^S] = \hat{x}[t, t_0, x_0, U^S], \qquad t_0 \leq t \leq \theta.$$
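Step (2) of the procedure can be illustrated numerically: when the second player copies the first player's feedback rule and both start from the same state, the two trajectories coincide and the payoff (1.7) vanishes. The sketch below assumes that a Slater-maximal feedback has already been found in step (1); the rule `u_S` and the criteria `F` used here are hypothetical stand-ins, and the motion is approximated by explicit Euler steps.

```python
import numpy as np

def simulate(f, feedback, x0, t0, theta, steps=400):
    """Explicit Euler approximation of the motion generated by a feedback rule."""
    x = np.asarray(x0, dtype=float)
    h = (theta - t0) / steps
    t = t0
    for _ in range(steps):
        x = x + h * np.asarray(f(t, x, feedback(t, x)), dtype=float)
        t += h
    return x

def mirrored_outcome(f, F, u_S, x0, t0, theta):
    """Both players apply the same rule u_S from the same initial state."""
    x_end = simulate(f, u_S, x0, t0, theta)  # first player: U^S ÷ u^S(t, x)
    y_end = simulate(f, u_S, x0, t0, theta)  # second player: V^S ÷ u^S(t, y)
    payoff = np.asarray(F(y_end), dtype=float) - np.asarray(F(x_end), dtype=float)
    return x_end, y_end, payoff

# Hypothetical data: f = u as in the exploration model, a clipped feedback,
# and the identity map as the criteria vector.
x_end, y_end, payoff = mirrored_outcome(
    lambda t, x, u: u,
    lambda s: s,
    lambda t, x: np.clip(1.0 - x, -1.0, 1.0),
    [0.0, 0.0], 0.0, 1.0)
```

The sketch only shows that the mirrored situation yields the zero payoff vector; it does not verify that the copied rule is Slater-maximal in problem (1.14).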
This result follows from an analysis of two kinds of differential zero-sum game with vector-valued payoff function:
(1) games with separable payoff function

$$\mathbb{F}(x[\theta], y[\theta]) = \mathbb{F}^{(1)}(x[\theta]) + \mathbb{F}^{(2)}(y[\theta]),$$

the game dynamics being described by the system

$$\dot{x} = f^{(1)}(t, x, u), \quad x[t_0] = x_0, \qquad \dot{y} = f^{(2)}(t, y, v), \quad y[t_0] = y_0;$$

(2) games with mirror (anticommutative) payoff function

$$\mathbb{F}(x[\theta], y[\theta]) = -\mathbb{F}(y[\theta], x[\theta])$$

with dynamics (1.5), where $x_0 = y_0$.
The game (1.4) is easily seen to be a game with both separable and mirror payoff function. Games with a separable payoff function are discussed in section 2 and those with a mirror function, in section 3. In section 4, the last of this chapter, the ZS-solution is constructed by the procedure of subsection 3.3 in a model of competitive exploration of a scientific problem.

2. A Game with Separable Payoff Function
2.1. Problem Statement

In the differential game

$$\langle \{1, 2\},\ \Sigma^d,\ \{\mathcal{U}^d, \mathcal{V}^d\},\ \mathbb{F}^{(1)}(x[\theta]) + \mathbb{F}^{(2)}(y[\theta]) \rangle \tag{2.1}$$

the system $\Sigma^d$ is described by two separate systems of ordinary differential equations

$$\dot{x} = f^{(1)}(t, x, u), \tag{2.2}$$

$$\dot{y} = f^{(2)}(t, y, v), \tag{2.3}$$
where $(x, y) \in \mathbb{R}^n \times \mathbb{R}^m$ is the state vector; $t \in [t_0, \theta]$ is time; the initial position $(t_0, x_0, y_0)$ and the time $\theta > t_0$ when the game ends are fixed; as before, the control action of the first player is $u \in H \in \operatorname{comp} \mathbb{R}^h$ and of the second player, $v \in Q \in \operatorname{comp} \mathbb{R}^q$; the vector functions $f^{(1)}(t, x, u)$ and $f^{(2)}(t, y, v)$ are each assumed to satisfy Condition 1.1 (Chapter 1), and the components of the $N$-vector functions $\mathbb{F}^{(1)}(x)$ and $\mathbb{F}^{(2)}(y)$ are continuous on $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively. The sets of strategies $\mathcal{U}^d$ for the first ($\mathcal{V}^d$ for the second) player are

$$\mathcal{U}^d = \{U \div u(t, x, y) \mid u(t, x, y) \in H\}, \qquad \mathcal{V}^d = \{V \div v(t, x, y) \mid v(t, x, y) \in Q\}.$$

In effect, what we have is a special case of the game (1.1, Chapter 6), where the system of differential equations has been reduced to two equations, and each component

$$F_i(x[\theta], y[\theta]) = F_i^{(1)}(x[\theta]) + F_i^{(2)}(y[\theta]), \qquad i \in N,$$

of the payoff function is formed by the sum of two functions $F_i^{(1)}(x[\theta])$ and $F_i^{(2)}(y[\theta])$, each depending only on $x[\theta]$ or $y[\theta]$. This game will be referred to as a zero-sum positional differential game with separable vector-valued payoff function. The next natural step is to obtain an insight into the properties of the set of solutions to the game (2.1), such as vector-valued saddle points, maximins, and minimaxes. The game (2.1) will be associated with two multicriterial dynamic problems,

$$\langle \Sigma^{(1)},\ \mathcal{U},\ \mathbb{F}^{(1)}(x[\theta]) \rangle \tag{2.4}$$

and

$$\langle \Sigma^{(2)},\ \mathcal{V},\ \mathbb{F}^{(2)}(y[\theta]) \rangle. \tag{2.5}$$
The control system $\Sigma^{(1)}$ is described by equation (2.2) and $\Sigma^{(2)}$, by equation (2.3); the initial positions $(t_0, x_0)$ for (2.2) and $(t_0, y_0)$ for (2.3) are fixed. The sets of strategies are

$$\mathcal{U} = \{U \div u(t, x) \mid u(t, x) \in H\}, \qquad \mathcal{V} = \{V \div v(t, y) \mid v(t, y) \in Q\}.$$

For problem (2.4) we denote by $\mathcal{U}^S$ the set of Slater-minimal strategies $U^S$; by $\mathcal{U}^P$, the set of Pareto-minimal strategies $U^P$; by $\mathcal{U}^G$, the set of Geoffrion-minimal strategies $U^G$; and by $\mathcal{U}^A$, the set of $A$-minimal strategies $U^A$, where $A$ is a constant fixed $(N \times N)$-dimensional matrix with positive elements.
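Similarly, for the problem (2.5), $\mathcal{V}^S$ denotes the set of Slater-maximal strategies $V^S$; $\mathcal{V}^P$, the set of Pareto-maximal strategies $V^P$; $\mathcal{V}^G$, the set of Geoffrion-maximal strategies $V^G$; and $\mathcal{V}^A$, the set of $A$-maximal strategies $V^A$. By virtue of (3.23) and Proposition 3.6, both from Chapter 4,

$$\mathcal{U}^S \supset \mathcal{U}^P \supset \mathcal{U}^G \supset \mathcal{U}^A, \qquad \mathcal{V}^S \supset \mathcal{V}^P \supset \mathcal{V}^G \supset \mathcal{V}^A.$$

2.2. Vector-Valued Saddle Points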
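The differential game (2.1) is associated with a static zero-sum game with vector-valued payoff function

$$\langle \{1, 2\},\ \{X, Y\},\ \mathbb{F}(x, y) = \mathbb{F}^{(1)}(x) + \mathbb{F}^{(2)}(y) \rangle, \tag{2.6}$$

where the reachability set of the system (2.2) from position $(t_0, x_0)$ is

$$X = X[\theta, t_0, x_0, U \div H] = \bigcup_{U \in \mathcal{U}} X[\theta, t_0, x_0, U],$$

and that of the system (2.3) from position $(t_0, y_0)$ is

$$Y = Y[\theta, t_0, y_0, V \div Q] = \bigcup_{V \in \mathcal{V}} Y[\theta, t_0, y_0, V].$$

The strategy of the first (minimizing) player is $x \in X$ and that of the second (maximizing) player is $y \in Y$. We let $X_S$ be the set of Slater-minimal strategies $x_S$ of the static multicriterial problem

$$\langle X,\ \mathbb{F}^{(1)}(x) \rangle \tag{2.7}$$

or

$$X_S = \{x_S \in X \mid \mathbb{F}^{(1)}(x_S) \ngtr \mathbb{F}^{(1)}(x),\ \forall x \in X\};$$

$Y^S$ is the set of Slater-maximal strategies $y^S$ in the multicriterial problem

$$\langle Y,\ \mathbb{F}^{(2)}(y) \rangle \tag{2.8}$$

or

$$Y^S = \{y^S \in Y \mid \mathbb{F}^{(2)}(y^S) \nless \mathbb{F}^{(2)}(y),\ \forall y \in Y\}.$$

By Definition 3.1 in Chapter 6, the situation $(x_S, y^S) \in X \times Y$ is the Slater saddle point of the game $\langle \{1, 2\}, \{X, Y\}, \mathbb{F}(x, y) \rangle$ if

$$\mathbb{F}(x, y^S) \nless \mathbb{F}(x_S, y^S) \nless \mathbb{F}(x_S, y), \qquad \forall x \in X,\ y \in Y.$$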
Lemma 2.1. The situation $(x_S, y^S)$ is the Slater saddle point of the game (2.6) iff $x_S \in X_S$ and $y^S \in Y^S$. Consequently, the entire set of Slater saddle points of (2.6) is $X_S \times Y^S$ and the set of all values of the vector-valued payoff function is equal to the algebraic sum of the sets $\mathbb{F}^{(1)}(X_S) + \mathbb{F}^{(2)}(Y^S)$, where

$$\mathbb{F}^{(1)}(X_S) = \bigcup_{x \in X_S} \mathbb{F}^{(1)}(x), \qquad \mathbb{F}^{(2)}(Y^S) = \bigcup_{y \in Y^S} \mathbb{F}^{(2)}(y).$$

Proof. Necessity. Because $(x_S, y^S)$ is the Slater saddle point of (2.6),

$$\mathbb{F}^{(1)}(x) + \mathbb{F}^{(2)}(y^S) \nless \mathbb{F}^{(1)}(x_S) + \mathbb{F}^{(2)}(y^S) \nless \mathbb{F}^{(1)}(x_S) + \mathbb{F}^{(2)}(y), \qquad \forall x \in X,\ y \in Y.$$

From the left-hand side of the latter relation it follows that

$$\mathbb{F}^{(1)}(x) \nless \mathbb{F}^{(1)}(x_S), \qquad \forall x \in X,$$

since the sign $\nless$ remains true in these relations if the same constant $N$-vector $\mathbb{F}^{(2)}(y^S)$ is added to or subtracted from both sides. In turn, the latter relation signifies that $x_S \in X_S$, the set of Slater-minimal solutions of (2.7). The inclusion $y^S \in Y^S$ is proved in the same way.

Sufficiency. Let $x_S \in X_S$ and $y^S \in Y^S$. These inclusions are equivalent to the relations

$$\mathbb{F}^{(1)}(x) \nless \mathbb{F}^{(1)}(x_S), \quad \forall x \in X, \qquad \text{and} \qquad \mathbb{F}^{(2)}(y^S) \nless \mathbb{F}^{(2)}(y), \quad \forall y \in Y.$$

The first (second) relation remains true if the constant vector $\mathbb{F}^{(2)}(y^S)$ (respectively, $\mathbb{F}^{(1)}(x_S)$) is added to both sides:

$$\mathbb{F}^{(1)}(x) + \mathbb{F}^{(2)}(y^S) \nless \mathbb{F}^{(1)}(x_S) + \mathbb{F}^{(2)}(y^S), \quad \forall x \in X,$$

$$\mathbb{F}^{(1)}(x_S) + \mathbb{F}^{(2)}(y^S) \nless \mathbb{F}^{(1)}(x_S) + \mathbb{F}^{(2)}(y), \quad \forall y \in Y.$$

Hence, by Definition 3.1 (Chapter 6), the situation $(x_S, y^S)$ is the Slater saddle point of (2.6).

Remark 2.1. The following proposition is established in a similar way: For the situation $(x_k, y^k) \in X \times Y$ to be the Pareto (with $k = p$), Geoffrion ($k = g$), or $A$ ($k = a$) saddle point of Definition 3.1 from Chapter 6, it is necessary and sufficient that $x_k \in X_k$ and $y^k \in Y^k$. Furthermore, the entire set of associated saddle points is $X_k \times Y^k$, while the set of values of the vector-valued payoff function on $X_k \times Y^k$ is $\mathbb{F}^{(1)}(X_k) + \mathbb{F}^{(2)}(Y^k)$. Here $X_k$ is the set of Pareto ($k = p$), Geoffrion ($k = g$), or $A$-minimal ($k = a$) solutions of (2.7) and $Y^k$, the set of associated maximal solutions of (2.8).
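Lemma 2.1 can be checked by brute force on a small finite game. The sketch below is an illustration under stated assumptions: the sets `X` and `Y` are hypothetical finite samples (the text works with reachability sets), both payoff parts are taken to be the identity map, and all function names are invented for this example.

```python
import itertools
import numpy as np

def strictly_less(a, b):
    """True when a < b holds in every component."""
    return bool(np.all(np.asarray(a) < np.asarray(b)))

def slater_minimal(points, F):
    """Points p such that no q gives F(q) < F(p) componentwise."""
    return [p for p in points if not any(strictly_less(F(q), F(p)) for q in points)]

def slater_maximal(points, F):
    """Points p such that no q gives F(p) < F(q) componentwise."""
    return [p for p in points if not any(strictly_less(F(p), F(q)) for q in points)]

def slater_saddle_points(X, Y, F1, F2):
    """Brute-force search over X x Y for F(x, y) = F1(x) + F2(y)."""
    def total(x, y):
        return np.asarray(F1(x), dtype=float) + np.asarray(F2(y), dtype=float)
    saddles = []
    for xs, ys in itertools.product(X, Y):
        value = total(xs, ys)
        no_better_x = not any(strictly_less(total(x, ys), value) for x in X)
        no_better_y = not any(strictly_less(value, total(xs, y)) for y in Y)
        if no_better_x and no_better_y:
            saddles.append((xs, ys))
    return saddles

identity = lambda p: np.array(p, dtype=float)
X = [(0, 0), (1, -1), (1, 1), (-1, 1)]   # hypothetical strategies of the minimizer
Y = [(1, 0), (0, 1), (0, 0), (-1, 0)]    # hypothetical strategies of the maximizer
saddles = slater_saddle_points(X, Y, identity, identity)
product_set = list(itertools.product(slater_minimal(X, identity),
                                     slater_maximal(Y, identity)))
```

For this sample the brute-force list `saddles` and the product `product_set` contain the same pairs, in line with the statement that the set of Slater saddle points is $X_S \times Y^S$.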
Proposition 2.1. If $U^S \in \mathcal{U}^S$ and $V^S \in \mathcal{V}^S$, the situation $(U^S, V^S)$ is the Slater saddle point of the differential game (2.1).

Note that the sets $\mathcal{U}^S$ and $\mathcal{V}^S$ are defined in the concluding part of subsection 2.1.

Proof. By virtue of the structure of solutions to the multicriterial dynamic problem (Proposition 3.1, Chapter 2), if the strategy $V^S$ is Slater-maximal in problem (2.5), then $y[\theta, t_0, y_0, V^S] \in Y^S$ for any quasimotion $y[\cdot, t_0, y_0, V^S]$ of (2.3) generated from the initial position $(t_0, y_0)$ by the strategy $V^S$. In a similar way,

$$x[\theta, t_0, x_0, U^S] \in X_S, \qquad \forall U^S \in \mathcal{U}^S,\ x[\cdot, t_0, x_0, U^S] \in \mathcal{X}[t_0, x_0, U^S].$$

Because $X$ is the reachability set of the system (2.2) from the position $(t_0, x_0)$,

$$x[\theta, t_0, x_0, U] \in X$$

for any strategies $U \in \mathcal{U}^d$ and the quasimotions $x[\cdot, t_0, x_0, U]$ which they generate. In a similar way,

$$y[\theta, t_0, y_0, V] \in Y$$

for every $V \in \mathcal{V}^d$ and quasimotions $y[\cdot, t_0, y_0, V]$. Then, by Lemma 2.1,

$$\mathbb{F}^{(1)}(x[\theta, t_0, x_0, U]) + \mathbb{F}^{(2)}(y[\theta, t_0, y_0, V^S]) \nless \mathbb{F}^{(1)}(x[\theta, t_0, x_0, U^S]) + \mathbb{F}^{(2)}(y[\theta, t_0, y_0, V^S]) \nless \mathbb{F}^{(1)}(x[\theta, t_0, x_0, U^S]) + \mathbb{F}^{(2)}(y[\theta, t_0, y_0, V])$$

for any strategies $U \in \mathcal{U}^d$ and $V \in \mathcal{V}^d$ and the quasimotions they generate, which implies, by Definition 1.1 from Chapter 5, that $(U^S, V^S)$ is the Slater saddle point of the differential game (2.1). Q.E.D.

Remark 2.2. Because $\mathcal{U}^S \neq \emptyset$ and $\mathcal{V}^S \neq \emptyset$ (Corollary 1.1, Chapter 2), in the differential game (2.1) the set of Slater saddle points is not empty.
Remark 2.3. By Lemma 2.1 and the proof of Proposition 2.1, in addition to singleton strategies $U^S$ and $V^S$ (for which $x[\theta, t_0, x_0, U^S] \in X_S$ and $y[\theta, t_0, y_0, V^S] \in Y^S$ are points), the Slater saddle point may be generated by strategies $U^S \in \mathcal{U}$ and $V^S \in \mathcal{V}$ for which the sets $X[\theta, t_0, x_0, U^S] \subset X_S$ and $Y[\theta, t_0, y_0, V^S] \subset Y^S$, as well as by any other strategies from the sets $\mathcal{U}^d$ and $\mathcal{V}^d$. The only constraint is that these strategies must satisfy the inclusions

$$X[\theta, t_0, x_0, U^S] \subset X_S, \qquad Y[\theta, t_0, y_0, V^S] \subset Y^S. \tag{2.9}$$

Consequently, the entire set of Slater saddle points of (2.1) is obtained by finding the entire set $\mathcal{U}^S \subset \mathcal{U}^d$ ($\mathcal{V}^S \subset \mathcal{V}^d$) of strategies $U^S \in \mathcal{U}^d$ and $V^S \in \mathcal{V}^d$ that satisfy the inclusions (2.9). Then the entire set of Slater saddle points is $\mathcal{U}^S \times \mathcal{V}^S$, and the set of associated values of the vector-valued payoff function is the algebraic sum $\mathbb{F}^{(1)}(X_S) + \mathbb{F}^{(2)}(Y^S)$.

Let $\mathcal{S}$ be the set of Slater saddle points of (2.1). Unlike the general case (see Example 4.3, Chapter 5), the Slater saddle points of (2.1) are interchangeable. Specifically,
Proposition 2.2. If $(U^{(1)}, V^{(1)})$ and $(U^{(2)}, V^{(2)})$ are arbitrary Slater saddle points of (2.1), the situations $(U^{(1)}, V^{(2)})$ and $(U^{(2)}, V^{(1)})$ are also Slater saddle points of this game.
(2.10)
for all $U \in \mathcal{U}^d$ and $V \in \mathcal{V}^d$, $x[\cdot, t_0, x_0, U]$ and $y[\cdot, t_0, y_0, V]$. In obtaining (2.10) the following two properties were used:
(1) The reachability domains of (2.2) and (2.3) are

$$X = X[\theta, t_0, x_0, U \div H] = \bigcup_{U \in \mathcal{U}} X[\theta, t_0, x_0, U] = \bigcup_{U \in \mathcal{U}^d} X[\theta, t_0, x_0, U],$$

$$Y = Y[\theta, t_0, y_0, V \div Q] = \bigcup_{V \in \mathcal{V}} Y[\theta, t_0, y_0, V] = \bigcup_{V \in \mathcal{V}^d} Y[\theta, t_0, y_0, V];$$

(2) $\mathbb{F}^{(1)} \ngtr \mathbb{F}^{(2)} \iff \mathbb{F}^{(1)} + a \ngtr \mathbb{F}^{(2)} + a$ for any constant vector $a \in \mathbb{R}^N$.

The relations in (2.10) signify that the situation $(U^{(1)}, V^{(2)})$ is the Slater saddle point of (2.1). In a similar way the situation $(U^{(2)}, V^{(1)})$ is also found to be a Slater saddle point. Q.E.D.

Propositions 2.1 and 2.2 may be proved in a similar way for the other saddle points of (2.1).
Proposition 2.3. If the strategies $U^K \in \mathcal{U}^K$ and $V^K \in \mathcal{V}^K$, the situation $(U^K, V^K)$ is the Pareto ($K = P$), Geoffrion ($K = G$), or $A$ ($K = A$) saddle point of (2.1). The saddle points (for every fixed $K = P, G, A$) are interchangeable, their set is not empty, and, by Property 1.1 from Chapter 5,

$$\mathcal{A} \subset \mathcal{G} \subset \mathcal{P} \subset \mathcal{S},$$

where $\mathcal{A}$ is the set of $A$-saddle points, $\mathcal{G}$ is the set of Geoffrion saddle points, and $\mathcal{P}$ is the set of Pareto saddle points.
Corollary 2.1. The above findings suggest a procedure for obtaining vector-valued saddle points of (2.1). For illustration, this tool is applied to Pareto saddle points.
(1) Find at least one Pareto-maximal strategy $V^P$ in the multicriterial problem (2.5).
(2) Obtain the Pareto-minimal strategy $U^P$ of (2.4).
Then the Pareto saddle point of (2.1) is the situation $(U^P, V^P)$. By Remark 2.1 and Proposition 3.1 from Chapter 3, if the entire sets $\mathcal{U}^P$ and $\mathcal{V}^P$ of such Pareto-optimal strategies have been found, the set of values of the vector-valued payoff function $\mathbb{F}(x[\theta], y[\theta]) = \mathbb{F}^{(1)}(x[\theta]) + \mathbb{F}^{(2)}(y[\theta])$ on all Pareto saddle points is

$$\mathbb{F}^{(1)}(X[\mathcal{U}^P]) + \mathbb{F}^{(2)}(Y[\mathcal{V}^P]),$$
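where $X[\mathcal{U}^P] = \bigcup_{U \in \mathcal{U}^P} X[\theta, t_0, x_0, U]$ and $Y[\mathcal{V}^P] = \bigcup_{V \in \mathcal{V}^P} Y[\theta, t_0, y_0, V]$.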
Example 2.2. Consider the game (2.1), where the system $\Sigma$ is described by the equations

$$\dot{x} = u, \qquad \dot{y} = v, \qquad x[0] = y[0] = 0_2, \tag{2.11}$$

where $x = (x_1, x_2)$, $y = (y_1, y_2)$; the sets are

$$H = \{u = (u_1, u_2) \mid u_1 \leq 1,\ u_2 \leq 1,\ u_1 + u_2 \geq 0\}, \qquad Q = \{v = (v_1, v_2) \mid v_1^2 + v_2^2 \leq 1\};$$

time $t \in [0, \theta]$, $\theta = 1$; the two-component payoff function is

$$\mathbb{F}(x[\theta], y[\theta]) = (F_1(x[\theta], y[\theta]), F_2(x[\theta], y[\theta])) = x[\theta] + y[\theta] = (x_1[\theta] + y_1[\theta],\ x_2[\theta] + y_2[\theta]). \tag{2.12}$$

We will denote the game as $\Gamma_{(2.1)}$. The reachability domains $X$ and $Y$ of the first and second subsystems from (2.11) are shaded in Fig. 7.2.1. By Remark 2.1, for the static two-criterion problem

$$\langle X,\ \mathbb{F}^{(1)}(x) = x \rangle,$$

we obtain a set $X_P$ of Pareto-minimal strategies $x_P$ (in Fig. 7.2.1 the set is shown as a heavy line $AB$), and for the problem

$$\langle Y,\ \mathbb{F}^{(2)}(y) = y \rangle,$$

we obtain a set $Y^P$ (the quadrant $CD$ of the circle) of Pareto-maximal strategies $y^P$.

Figure 7.2.1.

Then the set of values of the two-component payoff function (2.12) which can be reached on the entire set of Pareto saddle points will, by Corollary 2.1, coincide with the algebraic sum of the sets (Fig. 7.2.2),

$$\mathbb{F}(X_P, Y^P) = (F_1(X_P, Y^P), F_2(X_P, Y^P)) = \mathbb{F}^{(1)}(X_P) + \mathbb{F}^{(2)}(Y^P) = X_P + Y^P.$$

Figure 7.2.2.

Following the procedure of Corollary 2.1, we obtain a subset of Pareto saddle points for the differential game $\Gamma_{(2.1)}$. We wish to find the strategies $U^P$ of the first player such that

$$x[\theta, t_0, x_0, U^P] \in X_P,$$

and $V^P$ of the second player such that

$$y[\theta, t_0, y_0, V^P] \in Y^P.$$

These conditions are easily seen to be satisfied by any strategies of the form

$$U^P \div u(t, x) = (u_1, u_2 \mid u_1 + u_2 = 0,\ u_1 = \text{const} \leq 1,\ u_2 = \text{const} \leq 1),$$

$$V^P \div v(t, y) = (v_1, v_2 \mid v_1^2 + v_2^2 = 1,\ v_1 = \text{const} \geq 0,\ v_2 = \text{const} \geq 0).$$

Every situation $(U^P, V^P)$ is a Pareto saddle point of $\Gamma_{(2.1)}$. The entire set of Pareto saddle points of $\Gamma_{(2.1)}$ is larger than the set of situations $(U^P, V^P)$ thus obtained. These situations do not, for instance, include the Pareto saddle points $(U^P, V^P)$ for which $X[\theta, t_0, x_0, U^P]$ is a subset (not a point) of $X_P$ and $Y[\theta, t_0, y_0, V^P]$ is also a subset of the set $Y^P$. Besides, they do not include the strategies $U^P \in \mathcal{U}^d$ or $V^P \in \mathcal{V}^d$ of the form $U \div u(t, x, y)$ and $V \div v(t, x, y)$ such that

$$X[\theta, t_0, x_0, U^P] \subset X_P, \qquad Y[\theta, t_0, y_0, V^P] \subset Y^P.$$

The sets $\mathcal{U}^d$ and $\mathcal{V}^d$ are defined in (1.6).
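The sets in this example are simple enough to sample numerically. The sketch below is an illustration rather than part of the example: it assumes that the reachability domains coincide with the control sets ($X = H$, $Y = Q$), which is the case for $\dot{x} = u$, $\dot{y} = v$ from the origin over unit time with convex compact $H$ and $Q$, and it samples the segment $AB$ and the quarter arc $CD$ to form the algebraic sum $X_P + Y^P$. All function names are hypothetical.

```python
import numpy as np

def in_H(u, tol=1e-12):
    """Membership in H = {u : u1 <= 1, u2 <= 1, u1 + u2 >= 0}."""
    return bool(u[0] <= 1 + tol and u[1] <= 1 + tol and u[0] + u[1] >= -tol)

def in_Q(v, tol=1e-12):
    """Membership in the unit disk Q."""
    return bool(v[0] ** 2 + v[1] ** 2 <= 1 + tol)

def pareto_min_segment(n):
    """Sample the segment AB: x1 + x2 = 0 with -1 <= x1 <= 1."""
    s = np.linspace(-1.0, 1.0, n)
    return np.column_stack([s, -s])

def pareto_max_arc(n):
    """Sample the quarter arc CD: y1^2 + y2^2 = 1 with y1, y2 >= 0."""
    phi = np.linspace(0.0, np.pi / 2.0, n)
    return np.column_stack([np.cos(phi), np.sin(phi)])

def minkowski_sum(A, B):
    """All pairwise sums a + b, one row per pair."""
    return (A[:, None, :] + B[None, :, :]).reshape(-1, A.shape[1])

segment = pareto_min_segment(21)
arc = pareto_max_arc(21)
values = minkowski_sum(segment, arc)   # sampled payoff values on Pareto saddle points
component_sums = values.sum(axis=1)    # F1 + F2 for each sampled value
```

Because $x_1 + x_2 = 0$ on $AB$, every sampled value satisfies $F_1 + F_2 = \cos\varphi + \sin\varphi$, so `component_sums` ranges between $1$ and $\sqrt{2}$.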
2.3. Vector-Valued Maximin and Minimax
For the case of the differential game (2.1) with separable payoff function, Proposition 2.7 from Chapter 6 may be made more specific regarding the position of the vector-valued minimaxes and maximins relative to the set of values of the vector-valued payoff function that can be obtained on the set of all saddle points. For instance, when the notion of Slater optimality is used, the set of Slater maximins coincides with $\mathcal{F}r^{(S)} \mathbb{F}(\mathcal{S})$, the Slater-maximal boundary of $\mathbb{F}(\mathcal{S})$, i.e., of the set of values of the payoff function that can be obtained on the set of all Slater saddle points, while the set of Slater minimaxes coincides with $\mathcal{F}r_{(S)} \mathbb{F}(\mathcal{S})$, the set of Slater minima for $\mathbb{F}(\mathcal{S})$. This fact is established in the next proposition.
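Proposition 2.4. For the differential game (2.1),

$$\max{}^S \bigcup_{V \in \mathcal{V}^d} \min{}^S_{x[\cdot],\, y[\cdot]} \mathbb{F}(x[\theta, t_0, x_0, V], y[\theta, t_0, y_0, V]) = \mathcal{F}r^{(S)} [\mathbb{F}^{(1)}(X_S) + \mathbb{F}^{(2)}(Y^S)], \tag{2.13}$$

$$\min{}^S \bigcup_{U \in \mathcal{U}^d} \max{}^S_{x[\cdot],\, y[\cdot]} \mathbb{F}(x[\theta, t_0, x_0, U], y[\theta, t_0, y_0, U]) = \mathcal{F}r_{(S)} [\mathbb{F}^{(1)}(X_S) + \mathbb{F}^{(2)}(Y^S)]. \tag{2.14}$$

Here

$$\mathbb{F}^{(1)}(X_S) = \bigcup_{x \in X_S} \mathbb{F}^{(1)}(x), \qquad \mathbb{F}^{(2)}(Y^S) = \bigcup_{y \in Y^S} \mathbb{F}^{(2)}(y),$$

and $X_S$ ($Y^S$) is the set of Slater-minimal (Slater-maximal) solutions of problem (2.7) (problem (2.8)).

Let us prove (2.13); equation (2.14) may be proved in a similar way. The proof will proceed in two stages. At the first stage we establish the following inclusion for the set of Slater maximins:

$$\max{}^S \bigcup_{V \in \mathcal{V}^d} \min{}^S_{x[\cdot],\, y[\cdot]} \mathbb{F}(x[\theta, t_0, x_0, V], y[\theta, t_0, y_0, V]) \subset \mathcal{F}r^{(S)} [\mathbb{F}^{(1)}(X_S) + \mathbb{F}^{(2)}(Y^S)], \tag{2.15}$$

and at the second stage, the converse inclusion

$$\mathcal{F}r^{(S)} [\mathbb{F}^{(1)}(X_S) + \mathbb{F}^{(2)}(Y^S)] \subset \max{}^S \bigcup_{V \in \mathcal{V}^d} \min{}^S_{x[\cdot],\, y[\cdot]} \mathbb{F}(x[\theta, t_0, x_0, V], y[\theta, t_0, y_0, V]). \tag{2.16}$$

From these two inclusions (2.13) will follow.

Proof. First stage. The system (2.2), (2.3) is represented in the form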
Here the vector $z = (x, y)$. By Definition 1.1 (Chapter 6) of the maximin strategy $V^{(S)}$ of the game (1.1, Chapter 6), every strategy $V \in \mathcal{V}^d$ is associated with the set $Z_{(S)}[V]$ of Slater-minimal solutions $z_S[V] = (x_S[V], y_S[V])$ of the problem

$$\langle Z[V],\ \mathbb{F}(z) = \mathbb{F}^{(1)}(x) + \mathbb{F}^{(2)}(y) \rangle, \tag{2.17}$$

where

$$Z[V] = \bigcup_{z[\cdot]} z[\theta, t_0, z_0, V] = \bigcup_{x[\cdot],\, y[\cdot]} (x[\theta, t_0, x_0, V],\ y[\theta, t_0, y_0, V]).$$
Because of the specific form of the dynamic system (2.2), (2.3), the first subsystem (2.2) is obviously independent of the control action $v$. Then, by the way the quasimotions are obtained (subsection 2.3, Chapter 1), the set

$$Z[V] = X \times Y[V],$$

where the reachability domain of the system (2.2) is

$$X = \bigcup_{U \in \mathcal{U}} X[\theta, t_0, x_0, U]$$

and the set $Y[V] = \bigcup_{y[\cdot]} y[\theta, t_0, y_0, V]$ is the intersection of the bundle $\mathcal{Y}[t_0, y_0, V]$ of quasimotions $y[\cdot, t_0, y_0, V]$ of the system (2.3) and the hyperplane $t = \theta$. Therefore, for every point $z_S[V] = (x_S[V], y_S[V])$ of the set $Z_{(S)}[V]$,

$$\mathbb{F}^{(1)}(x_S[V]) + \mathbb{F}^{(2)}(y_S[V]) \ngtr \mathbb{F}^{(1)}(x) + \mathbb{F}^{(2)}(y[V]) \tag{2.18}$$

for any $x \in X$ and $y[V] \in Y[V]$. Assuming in (2.18) the vector $y[V] = y_S[V]$, we have

$$\mathbb{F}^{(1)}(x_S[V]) \ngtr \mathbb{F}^{(1)}(x), \qquad \forall x \in X.$$
Consequently, for any strategy $V \in \mathcal{V}^d$ the collection of all $x_S[V]$ forms the set $X_S$ of Slater-minimal solutions of the static multicriterial problem (2.7). This set $X_S$ is independent of the choice of the strategy $V$, and every strategy $V$ in (2.18) may be associated with any point $x_S[V] = x_S \in X_S$, which is also independent of the choice of $y_S[V]$. Assuming now in (2.18) the vector $x = x_S = x_S[V]$, we have

$$\mathbb{F}^{(2)}(y_S[V]) \ngtr \mathbb{F}^{(2)}(y), \qquad \forall y \in Y[V].$$

Therefore, the set of all points $y_S[V]$ coincides with the set $Y_S[V]$ of Slater-minimal solutions $y_S[V]$ of the problem

$$\langle Y[V],\ \mathbb{F}^{(2)}(y) \rangle,$$

and the choice of the specific $y_S[V] \in Y_S[V]$ is also independent of the point $x_S = x_S[V]$ in (2.18). Consequently, we have

$$Z_{(S)}[V] = X_S \times Y_S[V] \tag{2.19}$$

and any point $x_S[V] \in X_S$.
With $V^{(S)}$ the Slater-maximin strategy of (2.1), by Definition 1.1 in Chapter 6 this implies that there exists

$$\hat{z}_S[V^{(S)}] = (\hat{x}_S[V^{(S)}], \hat{y}_S[V^{(S)}])$$

such that

$$\mathbb{F}^{(1)}(\hat{x}_S[V^{(S)}]) + \mathbb{F}^{(2)}(\hat{y}_S[V^{(S)}]) \nless \mathbb{F}^{(1)}(x_S[V]) + \mathbb{F}^{(2)}(y_S[V]) \tag{2.20}$$

for any $V \in \mathcal{V}^d$, $x_S[V] \in X_S$, and $y_S[V] \in Y_S[V]$. Assuming now in (2.20) the vector $x_S[V] = \hat{x}_S[V^{(S)}] = x_S \in X_S$, we have

$$\mathbb{F}^{(2)}(\hat{y}_S[V^{(S)}]) \nless \mathbb{F}^{(2)}(y_S[V]) \tag{2.21}$$
for all $V \in \mathcal{V}_d$. The set of strategies $\mathcal{V} \subset \mathcal{V}_d$ ($\mathcal{V}$ is defined in subsection 2.1), and among the strategies in $\mathcal{V}$ there are singleton strategies: for any $y^* \in Y$, the reachability domain of the system (2.3) from the position $(t_0, y_0)$, there exists (see Corollary 4.1, Chapter 1) a singleton strategy $V_{y^*} \in \mathcal{V}$ such that $y^* = y[\vartheta, t_0, y_0, V_{y^*}]$ for all quasimotions $y[\cdot, t_0, y_0, V_{y^*}]$ of the system (2.3). For such strategies
$$y^* = Y[V_{y^*}] = Y_S[V_{y^*}],$$
since the entire set $Y[V_{y^*}]$ degenerates to the point $y^*$. For such singleton strategies, relation (2.21) signifies that
$$\mathbb{F}^{(2)}(\hat y_S[V^{(S)}]) \not< \mathbb{F}^{(2)}(y), \qquad \forall y \in Y;$$
thus
$$\hat y_S[V^{(S)}] \in Y^S, \tag{2.22}$$
the set of Slater-maximal solutions of problem (2.8). In light of the inclusions (2.19) and (2.22),
$$\mathbb{F}^{(1)}(\hat x_S[V^{(S)}]) + \mathbb{F}^{(2)}(\hat y_S[V^{(S)}]) \in \mathbb{F}^{(1)}(X_S) + \mathbb{F}^{(2)}(Y^S).$$
For singleton strategies $V_y$, where $y \in Y$, bearing in mind the inclusion (2.19), (2.20) may be reformulated as
$$\mathbb{F}^{(1)}(\hat x_S[V^{(S)}]) + \mathbb{F}^{(2)}(\hat y_S[V^{(S)}]) \not< \mathbb{F}^{(1)}(x_S) + \mathbb{F}^{(2)}(y)$$
for every $x_S \in X_S$ and $y \in Y$. But then this relation holds also with any $(x_S, y) \in X_S \times Y^S$; in other words, the situation $(\hat x_S[V^{(S)}], \hat y_S[V^{(S)}])$ terminates in
the Slater maximum on the set $\mathbb{F}^{(1)}(X_S) + \mathbb{F}^{(2)}(Y^S)$. Consequently, any Slater maximin of (2.1) satisfies the inclusion (2.15).
Second stage. Proceeding now to the proof of (2.16), suppose that for some strategy $V^* \in \mathcal{V}_d$ and point $\hat z_S[V^*] = (\hat x_S[V^*], \hat y_S[V^*])$ the following inclusion holds:
$$\mathbb{F}^{(1)}(\hat x_S[V^*]) + \mathbb{F}^{(2)}(\hat y_S[V^*]) \in \mathcal{F}^{(S)}[\mathbb{F}^{(1)}(X_S) + \mathbb{F}^{(2)}(Y^S)]. \tag{2.23}$$
In this case the strategy $V^*$ will be shown to be Slater-maximin for the game (2.1), while $\hat z_S[V^*] = (\hat x_S[V^*], \hat y_S[V^*])$ is the associated point from Definition 1.1 (Chapter 6). From the inclusion (2.23), it follows that
$$\mathbb{F}^{(1)}(\hat x_S[V^*]) + \mathbb{F}^{(2)}(\hat y_S[V^*]) \not< \mathbb{F}^{(1)}(x_S) + \mathbb{F}^{(2)}(y^S) \tag{2.24}$$
for every $x_S \in X_S$ and $y^S \in Y^S$. By the strategy $V \in \mathcal{V}_d$ of the second player, a set $Z_{(S)}[V]$ is obtained of Slater-minimal solutions of (2.17). By the first stage of the proof, $Z_{(S)}[V] = X_S \times Y_S[V]$. The set $Y^S$ of Slater-maximal solutions $y^S$ of (2.8) is externally stable [70, p. 158]. Therefore, for any quasimotion $y[\cdot, t_0, y_0, V]$ of the system (2.3) (including those for which $y[\vartheta, t_0, y_0, V] = y_S[V] \in Y_S[V]$), there exists $y^S \in Y^S$ such that
$$\mathbb{F}^{(2)}(y_S[V]) \le \mathbb{F}^{(2)}(y^S).$$
By this inequality and the proof, in the first stage, of the equality $X_S = \{x_S[V]\}$ for any $V \in \mathcal{V}_d$, (2.24) may be given in the form
$$\mathbb{F}^{(1)}(\hat x_S[V^*]) + \mathbb{F}^{(2)}(\hat y_S[V^*]) \not< \mathbb{F}^{(1)}(x_S[V]) + \mathbb{F}^{(2)}(y_S[V])$$
for any $V \in \mathcal{V}_d$ and $(x_S[V], y_S[V]) \in Z_{(S)}[V]$. Hence, by Definition 1.1 (Chapter 6), $V^*$ is the Slater-maximin strategy of the differential game (2.1). This proves the proposition.
Remark 2.4. The above procedure of proving Proposition 2.4 is applicable to the cases of Pareto, Geoffrion, and A-optima. In this case the following equalities may be established for (2.1):
$$\mathbb{F}^{(K)} = \max^{K} \bigcup_{V \in \mathcal{V}_d} \min^{K}_{x[\cdot],\, y[\cdot]} \mathbb{F}(x[\vartheta, t_0, x_0, V], y[\vartheta, t_0, y_0, V]),$$
$$\mathbb{F}_{(K)} = \min^{K} \bigcup_{U \in \mathcal{U}_d} \max^{K}_{x[\cdot],\, y[\cdot]} \mathbb{F}(x[\vartheta, t_0, x_0, U], y[\vartheta, t_0, y_0, U]), \tag{2.25}$$
where $K, k = (P, p;\ G, g;\ A, a)$. These equalities and Remark 2.3 help reveal the structure, i.e., with $K, k = P, p$, of the Pareto-optimal solutions of (2.1). First, the set
$$\mathbb{F}(\mathcal{P}) = \bigcup_{(U, V) \in \mathcal{P}}\ \bigcup_{x[\cdot],\, y[\cdot]} \mathbb{F}(x[\vartheta, t_0, x_0, U], y[\vartheta, t_0, y_0, V])$$
of values of the vector-valued payoff function $\mathbb{F}(x[\vartheta], y[\vartheta])$ on all Pareto saddle points is the sum
$$\mathbb{F}^{(1)}(X_P) + \mathbb{F}^{(2)}(Y^P),$$
where $X_P$ is the set of Pareto-minimal solutions of problem (2.7) and $Y^P$ is the set of Pareto-maximal solutions of problem (2.8). Second, the Pareto maximins of this game form the Pareto-maximal boundary of the set $\mathbb{F}(\mathcal{P})$, while the Pareto minimaxes form the Pareto-minimal boundary of this set. Thus in the game $\Gamma_{(2.1)}$ of Example 2.1, any point of the curve LTHK (heavy line) is a Pareto maximin, while every point of the segment LK (double line) is a Pareto minimax.
2.4. Existence of ZS-Solutions With N = 2
Recall that by the ZS-solution of the differential game (1.1, Chapter 6) we understand the Slater saddle point $(U^{ZS}, V^{ZS})$ for which there exist a point $\hat x[U^{ZS}, V^{ZS}] = \hat x[\vartheta, t_0, x_0, U^{ZS}, V^{ZS}]$ and a Slater maximin and Slater minimax such that
$$\mathbb{F}(\hat x[U^{ZS}, V^{ZS}]) = \max^{S} \bigcup_{V \in \mathcal{V}} \min^{S}_{x[\cdot]} \mathbb{F}(x[\vartheta, t_0, x_0, V]) = \min^{S} \bigcup_{U \in \mathcal{U}} \max^{S}_{x[\cdot]} \mathbb{F}(x[\vartheta, t_0, x_0, U]).$$
In this subsection the existence of a ZS-solution will be demonstrated in the differential game (2.1) with two-component payoff function ($N = \{1, 2\}$). In effect, we consider a differential game (2.1), where $\mathbb{F}^{(j)} = (F_1^{(j)}, F_2^{(j)})$, $j = 1, 2$. This game is associated with two two-criterial static problems:
$$\Gamma^{(1)} = \langle X, \{F_1^{(1)}(x), F_2^{(1)}(x)\} \rangle, \qquad \Gamma^{(2)} = \langle Y, \{F_1^{(2)}(y), F_2^{(2)}(y)\} \rangle,$$
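where $X$ is the reachability domain of the system (2.2) with $U \in \mathcal{U}$ and $Y$ is the analogous domain for the system (2.3) with $V \in \mathcal{V}$. In this context some propositions from the theory of multicriterial problems will be helpful.
Lemma 2.2. For every $i = 1, 2$ in problem $\Gamma^{(1)}$,
$$\min_{x \in X} F_i^{(1)}(x) = \min_{x \in X_S} F_i^{(1)}(x),$$
where $X_S$ is the set of Slater-minimal solutions of $\Gamma^{(1)}$; similarly,
$$\max_{y \in Y} F_i^{(2)}(y) = \max_{y \in Y^S} F_i^{(2)}(y),$$
where $Y^S$ is the set of Slater-maximal solutions of $\Gamma^{(2)}$.
This lemma is established in [70, p. 75].
Lemma 2.3. If
$$\bar F_1^{(1)} = \min_{x \in X_S} F_1^{(1)}(x), \tag{2.26}$$
there exists
$$\bar F_2^{(1)} = \max_{x \in X_S} F_2^{(1)}(x) \tag{2.27}$$
such that
$$(\bar F_1^{(1)}, \bar F_2^{(1)}) \in \mathbb{F}^{(1)}(X_S); \tag{2.28}$$
similarly, if $\bar F_1^{(2)} = \max_{y \in Y^S} F_1^{(2)}(y)$, there exists
$$\bar F_2^{(2)} = \min_{y \in Y^S} F_2^{(2)}(y)$$
such that $(\bar F_1^{(2)}, \bar F_2^{(2)}) \in \mathbb{F}^{(2)}(Y^S)$.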
Here, as before,
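Proof. Only the implication (2.26) $\Rightarrow$ (2.27), (2.28) is proved; the second part of the lemma may be established in a similar way. The set $X_S$ of Slater-minimal solutions of $\Gamma^{(1)}$ is a compact subset of $X$ (provided $X$ is a compactum and $\mathbb{F}^{(1)}(x)$ is continuous). Under a continuous mapping, the image of a compactum is a compactum, thus the set $\mathbb{F}^{(1)}(X_S)$ is closed and bounded in $\mathbb{R}^2$. But then the set
$$\mathbb{F}^{(1)}(X_S) \cap \{F_1^{(1)} = \bar F_1^{(1)}\}$$
is also a compactum. Therefore there exists a number $f_2^{(1)}$ such that
$$f_2^{(1)} = \max_{(F_1^{(1)}, F_2^{(1)}) \in \mathbb{F}^{(1)}(X_S) \cap \{F_1^{(1)} = \bar F_1^{(1)}\}} F_2^{(1)}. \tag{2.29}$$
Let us show that $f_2^{(1)}$ in (2.29) satisfies conditions (2.27) and (2.28). We assume the contrary, i.e., that there exists a vector $\mathbb{F}^{(1)} = (F_1^{(1)}, F_2^{(1)}) \in \mathbb{F}^{(1)}(X_S)$ such that $F_1^{(1)} \ne \bar F_1^{(1)}$ and
$$F_2^{(1)} > f_2^{(1)}, \tag{2.30}$$
where $f_2^{(1)}$ is defined in (2.29). Because $\mathbb{F}^{(1)} \in \mathbb{F}^{(1)}(X_S)$ and $g^{(1)} = (\bar F_1^{(1)}, f_2^{(1)}) \in \mathbb{F}^{(1)}(X_S)$, from (2.30) and the internal stability of the set $\mathbb{F}^{(1)}(X_S)$, we have
$$F_1^{(1)} \le \bar F_1^{(1)}.$$
But $F_1^{(1)} \ne \bar F_1^{(1)}$ because of the way $\mathbb{F}^{(1)}$ is obtained in (2.30), and $F_1^{(1)} < \bar F_1^{(1)}$ cannot be true, for by Lemma 2.2,
$$\bar F_1^{(1)} = \min_{x \in X_S} F_1^{(1)}(x) = \min_{x \in X} F_1^{(1)}(x) \tag{2.31}$$
and $\mathbb{F}^{(1)} \in \mathbb{F}^{(1)}(X_S) \subset \mathbb{F}^{(1)}(X)$. The resultant contradiction proves the equality
$$f_2^{(1)} = \max_{\mathbb{F}^{(1)} \in \mathbb{F}^{(1)}(X_S) \cap \{F_1^{(1)} = \bar F_1^{(1)}\}} F_2^{(1)} = \max_{\mathbb{F}^{(1)} \in \mathbb{F}^{(1)}(X_S)} F_2^{(1)} = \bar F_2^{(1)},$$
whence follows (2.27). The inclusion (2.28) is a corollary of (2.31).
Remark 2.5. Propositions similar to Lemmas 2.2 and 2.3 in the case of a Pareto optimum have been proved in [70, pp. 75, 118].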
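The two lemmas can be checked by hand on a small finite problem. The sketch below is only an illustration under stated assumptions: the six value vectors are hypothetical stand-ins for the image $\mathbb{F}^{(1)}(X)$, and a finite set replaces the compact reachability domain used in the text.

```python
def slater_minimal(points):
    """Keep the points p for which no q is strictly smaller in every component."""
    return [p for p in points
            if not any(all(q_i < p_i for q_i, p_i in zip(q, p)) for q in points)]


# Hypothetical image of a finite set X under the two criteria (F_1, F_2).
values = [(1, 5), (1, 3), (2, 2), (3, 1), (3, 3), (4, 4)]
slater = slater_minimal(values)

# Lemma 2.2: each criterion attains the same minimum on X and on X_S.
lemma_2_2 = all(
    min(v[i] for v in values) == min(v[i] for v in slater) for i in range(2)
)

# Lemma 2.3: on the slice F_1 = min F_1 there is a point whose F_2
# equals the maximum of F_2 over the whole Slater-minimal set.
f1_bar = min(v[0] for v in slater)
f2_on_slice = max(v[1] for v in slater if v[0] == f1_bar)
f2_bar = max(v[1] for v in slater)
lemma_2_3 = f2_on_slice == f2_bar and (f1_bar, f2_bar) in slater
```

Here the points (3, 3) and (4, 4) are strictly dominated by (2, 2) and drop out, and the point (1, 5) plays the role of $(\bar F_1^{(1)}, \bar F_2^{(1)})$.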
Proposition 2.5. In the differential game (2.1) with $N = 2$ there exists a ZS-solution, i.e., a Slater saddle point $(U^{ZS}, V^{ZS})$ such that
$$\mathbb{F}(x[\vartheta, t_0, x_0, U^{ZS}], y[\vartheta, t_0, y_0, V^{ZS}]) = \max^{S} \bigcup_{V \in \mathcal{V}_d} \min^{S}_{x[\cdot],\, y[\cdot]} \mathbb{F}(x[\vartheta, t_0, x_0, V], y[\vartheta, t_0, y_0, V])$$
$$= \min^{S} \bigcup_{U \in \mathcal{U}_d} \max^{S}_{x[\cdot],\, y[\cdot]} \mathbb{F}(x[\vartheta, t_0, x_0, U], y[\vartheta, t_0, y_0, U])$$
for any quasimotions $x[\cdot, t_0, x_0, U^{ZS}]$, $y[\cdot, t_0, y_0, V^{ZS}]$.
Proof. Note that by virtue of the way quasimotions are obtained and of the specific form of the systems (2.2) and (2.3),
$$x[\vartheta, t_0, x_0, V] = x[\vartheta, t_0, x_0, U \div H] = X, \qquad y[\vartheta, t_0, y_0, U] = y[\vartheta, t_0, y_0, V \div Q] = Y,$$
where $X$ (respectively, $Y$) is the reachability domain of the system (2.2) (or (2.3)). By Lemmas 2.3 and 2.2, there exists a Slater-minimal solution $x_S \in X$ of the problem $\Gamma^{(1)}$ such that
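$$F_1^{(1)}(x_S) = \min_{x \in X_S} F_1^{(1)}(x) = \min_{x \in X} F_1^{(1)}(x), \qquad F_2^{(1)}(x_S) = \max_{x \in X_S} F_2^{(1)}(x), \tag{2.32}$$
and the point $x_S \in X_S \subset X$. Similarly, in problem $\Gamma^{(2)}$ there exists a Slater-maximal solution $y^S \in Y^S$ for which
$$F_1^{(2)}(y^S) = \max_{y \in Y^S} F_1^{(2)}(y) = \max_{y \in Y} F_1^{(2)}(y), \qquad F_2^{(2)}(y^S) = \min_{y \in Y^S} F_2^{(2)}(y). \tag{2.33}$$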
Now let us consider a multicriterial problem
$$\langle X_S \times Y^S,\ \mathbb{F}^{(1)}(x) + \mathbb{F}^{(2)}(y) \rangle, \tag{2.34}$$
where, as noted above, the sets $X_S$ and $Y^S$ are compact. By virtue of (2.32) the situation $(x_S, y^S)$ is Slater-minimal in problem (2.34) and, by (2.33), the same situation is the Slater-maximal solution of this multicriterial problem. Consequently, at the point $(x_S, y^S)$,
$$\mathbb{F}^{(1)}(x_S) + \mathbb{F}^{(2)}(y^S) = \max^{S}_{(x, y) \in X_S \times Y^S} [\mathbb{F}^{(1)}(x) + \mathbb{F}^{(2)}(y)] = \min^{S}_{(x, y) \in X_S \times Y^S} [\mathbb{F}^{(1)}(x) + \mathbb{F}^{(2)}(y)],$$
or
$$\mathbb{F}^{(1)}(x_S) + \mathbb{F}^{(2)}(y^S) \in \mathcal{F}^{(S)}[\mathbb{F}^{(1)}(X_S) + \mathbb{F}^{(2)}(Y^S)], \qquad \mathbb{F}^{(1)}(x_S) + \mathbb{F}^{(2)}(y^S) \in \mathcal{F}_{(S)}[\mathbb{F}^{(1)}(X_S) + \mathbb{F}^{(2)}(Y^S)]. \tag{2.35}$$
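In addition, by Lemma 2.1 the situation $(x_S, y^S)$ is the Slater saddle point of the game (2.6) with $N = 2$, because $x_S \in X_S$ and $y^S \in Y^S$; thus
$$\mathbb{F}^{(1)}(x) + \mathbb{F}^{(2)}(y^S) \not< \mathbb{F}^{(1)}(x_S) + \mathbb{F}^{(2)}(y^S) \not< \mathbb{F}^{(1)}(x_S) + \mathbb{F}^{(2)}(y) \tag{2.36}$$
for every $x \in X$ and $y \in Y$. By Corollary 4.1 from Chapter 1 there exist strategies $U^{ZS} \in \mathcal{U}$ and $V^{ZS} \in \mathcal{V}$ such that
$$x[\vartheta, t_0, x_0, U^{ZS}] = x_S, \qquad y[\vartheta, t_0, y_0, V^{ZS}] = y^S \tag{2.37}$$
for all quasimotions $x[\cdot, t_0, x_0, U^{ZS}]$ and $y[\cdot, t_0, y_0, V^{ZS}]$. Since for any $U \in \mathcal{U}_d$ and $V \in \mathcal{V}_d$,
$$x[\vartheta, t_0, x_0, U] \in X, \qquad y[\vartheta, t_0, y_0, V] \in Y,$$
it follows from (2.36) and (2.37) that
$$\mathbb{F}^{(1)}(x[\vartheta, t_0, x_0, U]) + \mathbb{F}^{(2)}(y[\vartheta, t_0, y_0, V^{ZS}]) \not< \mathbb{F}^{(1)}(x[\vartheta, t_0, x_0, U^{ZS}]) + \mathbb{F}^{(2)}(y[\vartheta, t_0, y_0, V^{ZS}])$$
$$\not< \mathbb{F}^{(1)}(x[\vartheta, t_0, x_0, U^{ZS}]) + \mathbb{F}^{(2)}(y[\vartheta, t_0, y_0, V])$$
for any strategies $U \in \mathcal{U}_d$ and $V \in \mathcal{V}_d$ and quasimotions $x[\cdot, t_0, x_0, U]$, $y[\cdot, t_0, y_0, V]$. The latter chain of relations signifies that the situation $(U^{ZS}, V^{ZS})$ is the Slater saddle point of the differential game (2.1) with $N = 2$. By (2.36) and (2.35), for the Slater saddle point $(U^{ZS}, V^{ZS})$,
$$\mathbb{F}^{(1)}(x[\vartheta, t_0, x_0, U^{ZS}]) + \mathbb{F}^{(2)}(y[\vartheta, t_0, y_0, V^{ZS}]) \in \mathcal{F}^{(S)}[\mathbb{F}^{(1)}(X_S) + \mathbb{F}^{(2)}(Y^S)],$$
$$\mathbb{F}^{(1)}(x[\vartheta, t_0, x_0, U^{ZS}]) + \mathbb{F}^{(2)}(y[\vartheta, t_0, y_0, V^{ZS}]) \in \mathcal{F}_{(S)}[\mathbb{F}^{(1)}(X_S) + \mathbb{F}^{(2)}(Y^S)].$$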
Then, by Proposition 2.4, the value of the payoff function at this saddle point is simultaneously the Slater maximin and the Slater minimax. Consequently, the situation $(U^{ZS}, V^{ZS})$ obtained in (2.32), (2.33) and (2.37) is the ZS-solution of the differential game (2.1) with $N = 2$. Note also that, by Proposition 2.8 from Chapter 6, the Slater-maximin strategy in this ZS-solution is $V^{ZS}$ with the associated point $(\hat x[U^{ZS}], \hat y[V^{ZS}])$, and the Slater-minimax strategy is $U^{ZS}$ with the same associated point. Thus, in Example 4.2 in Chapter 5 there are two ZS-solutions, to which the points C and D in Fig. 5.4.4 correspond. The point C is associated with the first ZS-solution, the Slater saddle point
$$(U^{(1)}, V^{(1)}) \div (u^{(1)}, v^{(1)}) = ((-1, +1), (0, 1)),$$
with $U^{(1)}$ the Slater-minimax and $V^{(1)}$ the Slater-maximin strategies. The point D represents the second ZS-solution
$$(U^{(2)}, V^{(2)}) \div (u^{(2)}, v^{(2)}) = ((+1, -1), (1, 0)).$$
Remark 2.6. In a similar way it may be proved that in (2.1) with $N = 2$ there exists a Pareto saddle point $(U^P, V^P)$ that is the ZP-solution; that is, $\mathbb{F}(x[\vartheta, t_0, x_0, U^P], y[\vartheta, t_0, y_0, V^P])$ coincides with the Pareto maximin and minimax. In particular, there are two such solutions in Example 2.1. In Fig. 7.2.2 they are represented as the points L and K. For L the Pareto saddle point is $(U^{(L)}, V^{(L)}) \div ((-1, +1), (0, +1))$, the associated Pareto-maximin strategy is $V^{(L)}$ with point
$$(\hat x[\vartheta, t_0, x_0, U^{(L)}], \hat y[\vartheta, t_0, y_0, V^{(L)}]) = ((-1, +1), (0, +1)),$$
and the Pareto-minimax strategy is $U^{(L)}$ with the same point. The point K is represented as the Pareto saddle point $(U^{(K)}, V^{(K)}) \div ((+1, -1), (+1, 0))$; here the Pareto-minimax strategy is $U^{(K)}$ with point $((1, -1), (1, 0))$ and the Pareto-maximin strategy is $V^{(K)}$ with the same point. Finally, the point L represents the Pareto maximin
$$\max^{P} \bigcup_{V \in \mathcal{V}} \min^{P}_{x[\cdot],\, y[\cdot]} \mathbb{F}(x[\vartheta], y[\vartheta]),$$
which is equal to $(-1, 2)$ and to the Pareto minimax
$$\min^{P} \bigcup_{U \in \mathcal{U}} \max^{P}_{x[\cdot],\, y[\cdot]} \mathbb{F}(x[\vartheta], y[\vartheta]),$$
and for the point K,
$$\max^{P} \bigcup_{V \in \mathcal{V}} \min^{P}_{x[\cdot],\, y[\cdot]} \mathbb{F}(x[\vartheta], y[\vartheta]) = \min^{P} \bigcup_{U \in \mathcal{U}} \max^{P}_{x[\cdot],\, y[\cdot]} \mathbb{F}(x[\vartheta], y[\vartheta]) = (2, 1).$$
3. The ZS-solution in the Competition Problem
This section considers mainly the properties of vector-valued saddle points, maximins, and minimaxes in a differential game with mirror vector-valued payoff function. Then, using the properties of such solutions for games with separable and mirror payoff functions, a procedure is proposed for deriving a ZS-solution in a competition model.
3.1. A Game with Mirror Payoff Function (Saddle Points)
In a differential game
$$\langle \Sigma_M, \mathcal{U}_M, \mathcal{V}_M, \mathbb{F}(x[\vartheta], y[\vartheta]) \rangle, \tag{3.1}$$
the control system $\Sigma_M$ is described by equations (1.5),
$$\dot x = f(t, x, u), \qquad \dot y = f(t, y, v), \tag{3.2}$$
where $x, y \in \mathbb{R}^n$; time $t \in [t_0, \vartheta]$; the constants $\vartheta > t_0 \ge 0$; the control actions of the players $u \in Q \in \operatorname{comp} \mathbb{R}^q$, $v \in Q$ (or $H = Q$); the vector function $f(t, x, u)$ satisfies Condition 1.1 of Chapter 1. The initial positions of both subsystems of (3.2) are assumed to be the same, i.e., the initial position is
$$(t_0, x_0, y_0 = x_0). \tag{3.3}$$
The sets of strategies of the players are defined in (1.6), where $H = Q$, that is,
$$\mathcal{U}_M = \{U \div u(t, x, y) \mid u(t, x, y) \in Q\}, \qquad \mathcal{V}_M = \{V \div v(t, x, y) \mid v(t, x, y) \in Q\}. \tag{3.4}$$
Because the right-hand sides of the subsystems in (3.2) are the same, the initial positions $(t_0, x_0)$ and $(t_0, y_0)$ coincide, and the constraints (inclusions) in (3.4) are the same, the reachability sets of the two subsystems in (3.2) coincide, that is,
$$X[\vartheta, t_0, x_0, U \div Q] = X = Y = Y[\vartheta, t_0, y_0 = x_0, V \div Q]. \tag{3.5}$$
The components of the vector-valued function $\mathbb{F}(x, y)$ are continuous on $X \times Y = X^2$. Besides, $\mathbb{F}(x, y)$ is assumed to be mirror (anticommutative):
$$\mathbb{F}(x, y) = -\mathbb{F}(y, x) \tag{3.6}$$
for any $x \in X$ and $y \in Y$. In particular, it follows from (3.6) that
$$\mathbb{F}(x, x) = 0_N \in \mathbb{R}^N, \qquad \forall x \in X. \tag{3.7}$$
Among the mirror vector-valued functions are
$$\mathbb{F}(x, y) = [x \circ y](x - y), \quad \text{where } x \circ y = (x_1 y_1, \ldots, x_n y_n); \qquad \mathbb{F}(x, y) = \mathbb{F}(x) - \mathbb{F}(y).$$
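Properties (3.6) and (3.7) are easy to verify numerically for concrete functions. The following sketch is an illustration only: reading the product of $x \circ y$ and $x - y$ componentwise is an assumption about the notation, and the function `g` standing in for the one-argument criterion is hypothetical.

```python
def hadamard_mirror(x, y):
    """Components x_i * y_i * (x_i - y_i); componentwise reading is an assumption."""
    return tuple(a * b * (a - b) for a, b in zip(x, y))


def g(z):
    """Hypothetical one-argument vector criterion used to build a difference payoff."""
    return tuple(c * c + 1 for c in z)


def difference_mirror(x, y):
    """Payoff of the form G(x) - G(y)."""
    return tuple(p - q for p, q in zip(g(x), g(y)))


def is_mirror(f, points):
    """Check F(x, y) == -F(y, x) on every pair of sample points, as in (3.6)."""
    return all(
        f(x, y) == tuple(-c for c in f(y, x)) for x in points for y in points
    )


points = [(0, 1), (2, -1), (3, 3), (-2, 5)]
zero = (0, 0)

mirror_ok = is_mirror(hadamard_mirror, points) and is_mirror(difference_mirror, points)
# Property (3.7): the payoff vanishes on the diagonal x = y.
diagonal_ok = all(
    f(x, x) == zero for f in (hadamard_mirror, difference_mirror) for x in points
)
```

Integer sample points are used so that the equality checks are exact.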
The example $\mathbb{F}(x, y) = \mathbb{F}(x) - \mathbb{F}(y)$ forces us to take a special look at the class of differential games (3.1) with mirror vector-valued payoff function. The vector-valued payoff function $\mathbb{F}(x[\vartheta], y[\vartheta])$ of (3.1) that has the property (3.6) will be referred to as mirror, and the game (3.1) itself as the zero-sum game with mirror vector-valued payoff function. The following is an auxiliary proposition that will be helpful in the proofs of this subsection. Let $U^* \div u^*(t, x, y)$ and $V^* \div v^*(t, x, y)$ be some strategies from the sets $\mathcal{U}_M$ and $\mathcal{V}_M$, respectively, and let the sets $X[U^*] = \bigcup_{x[\cdot]} x[\vartheta, t_0, x_0, U^*]$ and $Y[V^*] = \bigcup_{y[\cdot]} y[\vartheta, t_0, y_0, V^*]$. Consider the strategies $U_x^* \div u^*(t, y, x)$ and $V_x^* \div v^*(t, y, x)$; these differ from $U^* \div u^*(t, x, y)$ and $V^* \div v^*(t, x, y)$ in that in the functions $u^*(t, x, y)$ and $v^*(t, x, y)$ the vectors $x$ and $y$ are interchanged. Let
$$X[V_x^*] = \bigcup_{x[\cdot]} x[\vartheta, t_0, x_0, V_x^*], \qquad Y[U_x^*] = \bigcup_{y[\cdot]} y[\vartheta, t_0, x_0, U_x^*].$$
Lemma 3.1. The following equalities hold:
$$X[U^*] = Y[U_x^*], \qquad Y[V^*] = X[V_x^*].$$
Indeed, in devising stepwise quasimotions of the first (respectively, second) subsystem of (3.2) with $U = V_x^* \div v^*(t, y, x)$ (respectively, with $V = U_x^* \div u^*(t, y, x)$), we have, in fact, the same system of stepwise equations, where the vectors $x$ and $y$ have been interchanged. Hence the lemma follows.
Proposition 3.1. If the situation $(V_x^S, U_x^S)$ is the Slater saddle point of (3.1), so is the situation $(U^S, V^S)$.
Proof. Indeed, for the Slater saddle point $(U^S, V^S)$, we have, by Definition 1.1 in Chapter 5,
$$\mathbb{F}(x[\vartheta, t_0, x_0, U], y[\vartheta, t_0, x_0, V^S]) \not< \mathbb{F}(x[\vartheta, t_0, x_0, U^S], y[\vartheta, t_0, x_0, V^S]) \not< \mathbb{F}(x[\vartheta, t_0, x_0, U^S], y[\vartheta, t_0, x_0, V])$$
for any $U \in \mathcal{U}_M$ and $V \in \mathcal{V}_M$ and quasimotions $x[\cdot, t_0, x_0, U]$, $y[\cdot, t_0, x_0, V]$. In light of (3.6), these relations are equivalent to
$$-\mathbb{F}(y[\vartheta, t_0, x_0, V^S], x[\vartheta, t_0, x_0, U]) \not< -\mathbb{F}(y[\vartheta, t_0, x_0, V^S], x[\vartheta, t_0, x_0, U^S]) \not< -\mathbb{F}(y[\vartheta, t_0, x_0, V], x[\vartheta, t_0, x_0, U^S])$$
or
$$\mathbb{F}(y[\vartheta, t_0, x_0, V^S], x[\vartheta, t_0, x_0, U]) \not> \mathbb{F}(y[\vartheta, t_0, x_0, V^S], x[\vartheta, t_0, x_0, U^S]) \not> \mathbb{F}(y[\vartheta, t_0, x_0, V], x[\vartheta, t_0, x_0, U^S])$$
for any $U \in \mathcal{U}_M$, $V \in \mathcal{V}_M$, and quasimotions $x[\cdot, t_0, x_0, U]$, $y[\cdot, t_0, x_0, V]$. By (3.5) and Lemma 3.1, the latter relations may be represented as
$$\mathbb{F}(x[\vartheta, t_0, x_0, V_x^S], y[\vartheta, t_0, x_0, U_x]) \not> \mathbb{F}(x[\vartheta, t_0, x_0, V_x^S], y[\vartheta, t_0, x_0, U_x^S]) \not> \mathbb{F}(x[\vartheta, t_0, x_0, V_x], y[\vartheta, t_0, x_0, U_x^S])$$
for any strategies $U \in \mathcal{U}_M$ and $V \in \mathcal{V}_M$ and quasimotions $x[\cdot, t_0, x_0, V_x^S]$, $y[\cdot, t_0, x_0, U_x]$, $x[\cdot, t_0, x_0, V_x]$, $y[\cdot, t_0, x_0, U_x^S]$, which proves Proposition 3.1.
In a similar way the following proposition is found to be true: for the situation $(U^K, V^K)$ to be the Pareto (with $K = P$), Geoffrion ($K = G$), or A-saddle point ($K = A$) of (3.1), it is necessary and sufficient that the situation $(V_x^K, U_x^K)$ be the saddle point of the corresponding type.
Remark 3.2. By Theorem 2.1 in Chapter 5 the set of A-saddle points is nonempty in the differential game (3.1, Chapter 4) and, consequently, by Property 1.1 from Chapter 5, there exist Geoffrion, Pareto, and Slater saddle
points. But in a static (nondifferential) game the mirror nature of the game is not sufficient for the existence of vector saddle points, which, in the case $N = 1$ (scalar payoff function), coincide with the ordinary saddle point. Let us demonstrate this for a matrix zero-sum game with scalar payoff function.
Example 3.1. Both players in a static zero-sum game with scalar payoff function $\langle X, Y, F_1(x, y) \rangle$ have three strategies each:
$$X = (x^{(1)}, x^{(2)}, x^{(3)}), \qquad Y = (y^{(1)}, y^{(2)}, y^{(3)}).$$
Now let
$$a_{ij} = F_1(x^{(i)}, y^{(j)}), \qquad i, j = 1, 2, 3.$$
If in this game the strategies of the first player are associated with the rows, and those of the second one with the columns of the matrix, we have a table (payoff matrix)
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}.$$
The strategies of the first player can then be associated with the ordinal numbers of the appropriate rows of the matrix $A$, and those of the second one with the ordinal numbers of its columns. The game itself with a specified payoff matrix is usually represented as a collection
$$\Gamma_{(1.1,\ \text{Chapter 6})} = \langle \hat X, \hat Y, A \rangle,$$
where now $\hat X = \{1, 2, 3\}$ and $\hat Y = \{1, 2, 3\}$. Consequently, $\Gamma_{(3.1)}$ is a matrix game whose analysis is a central concern of general game theory. This game has a mirror payoff function if
$$a_{ij} = -a_{ji},$$
and the matrix $A$ must be skew-symmetric. Let us see that in $\Gamma_{(3.1)}$ with mirror payoff matrix
there exists no vector-valued saddle point that, for the matrix game $\Gamma_{(3.1)}$, coincides with the ordinary saddle point $(i_0, j_0) \in \hat X \times \hat Y$, that is,
$$\max_{j \in \hat Y} \min_{i \in \hat X} a_{ij} = \min_{i \in \hat X} a_{i j_0} = \max_{j \in \hat Y} a_{i_0 j} = \min_{i \in \hat X} \max_{j \in \hat Y} a_{ij}.$$
Indeed, for the game $\Gamma_{(3.1)}$ with $A = A^*$,
$$\max_{j} \min_{i} a_{ij} = -1 < +1 = \min_{i} \max_{j} a_{ij},$$
thus in $\Gamma_{(3.1)}$ with $A = A^*$ there is no Slater or, consequently, Pareto, Geoffrion, or A-saddle point. What is important is that the symmetry of the matrix $A$ is also not sufficient for the saddle point to exist. Thus, for a symmetric matrix
$$A = \begin{bmatrix} -1 & +1 & 0 \\ +1 & 0 & -1 \\ 0 & -1 & +1 \end{bmatrix},$$
it is true, as in the preceding case, that
$$\max_{j} \min_{i} a_{ij} = -1 < +1 = \min_{i} \max_{j} a_{ij}.$$
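The gap between the maximin and the minimax in Example 3.1 can be reproduced with a few lines of code. This is a sketch under stated assumptions: both matrices below are hypothetical illustrations chosen here (a rock-paper-scissors style skew-symmetric matrix and a symmetric one), not necessarily the matrices of the example; rows belong to the minimizing first player and columns to the maximizing second player, as in the text.

```python
def maximin_minimax(a):
    """Return (max over columns of column minima, min over rows of row maxima)."""
    rows, cols = range(len(a)), range(len(a[0]))
    maximin = max(min(a[i][j] for i in rows) for j in cols)
    minimax = min(max(a[i][j] for j in cols) for i in rows)
    return maximin, minimax


def pure_saddle_points(a):
    """Entries that are minimal in their column and maximal in their row."""
    rows, cols = range(len(a)), range(len(a[0]))
    return [
        (i, j) for i in rows for j in cols
        if a[i][j] == min(a[k][j] for k in rows)
        and a[i][j] == max(a[i][m] for m in cols)
    ]


def is_skew_symmetric(a):
    """Mirror property of a matrix game: a_ij = -a_ji."""
    n = len(a)
    return all(a[i][j] == -a[j][i] for i in range(n) for j in range(n))


# Hypothetical skew-symmetric (mirror) payoff matrix.
skew = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]
# Hypothetical symmetric payoff matrix with the same gap.
symmetric = [[-1, 1, 0], [1, 0, -1], [0, -1, 1]]

skew_values = maximin_minimax(skew)            # (-1, 1)
symmetric_values = maximin_minimax(symmetric)  # (-1, 1)
```

In both cases every column contains the entry -1 and every row contains the entry +1, so no entry can be a column minimum and a row maximum at once.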
3.2. Vector-Valued Maximin and Minimax of the Game (3.1)
By the way quasimotions are obtained (subsection 2.3, Chapter 1), every fixed strategy $V \in \mathcal{V}_M$ is associated, at the time $t = \vartheta$ when the game ends, with the set
$$Z[V] = X \times Y[V],$$
where
$$X = \mathcal{X}[t_0, x_0, U \div Q] \cap \{t = \vartheta\}$$
is the reachability domain of the first subsystem in (3.2), and $Y[V] = \mathcal{Y}[t_0, x_0, V] \cap \{t = \vartheta\}$ is the intersection of the bundle of quasimotions of the second subsystem in (3.2), generated by the strategy $V$ from the initial position $(t_0, x_0)$, and the hyperplane $t = \vartheta$.
Similarly, every strategy $U \in \mathcal{U}_M$ of the first player is associated with the set
$$Z[U] = X[U] \times Y,$$
where
$$Y = \mathcal{Y}[t_0, x_0, V \div Q] \cap \{t = \vartheta\}, \qquad X[U] = \mathcal{X}[t_0, x_0, U] \cap \{t = \vartheta\}.$$
Now, let us consider two multicriterial static games,
Lemma 3.2. For every strategy $V \in \mathcal{V}_M$, the set $Z_{(S)}[V]$ coincides with $Z^{(S)}[V_x]$ and, for any $U \in \mathcal{U}_M$, $Z^{(S)}[U] = Z_{(S)}[U_x]$.
Proof. Let $V$ be some strategy from the set $\mathcal{V}_M$. If $z_S[V] \in Z_{(S)}[V]$, then, for every $x \in X$ and $y[V] \in Y[V]$,
$$\mathbb{F}(z_S[V]) = \mathbb{F}(x_S[V], y_S[V]) \not> \mathbb{F}(x, y[V])$$
or, recalling (3.6),
$$-\mathbb{F}(y_S[V], x_S[V]) \not> -\mathbb{F}(y[V], x).$$
Therefore, for every $y[V] \in Y[V] = X[V_x]$ (Lemma 3.1) and $x \in X = Y$ (see (3.5)),
$$\mathbb{F}(y_S[V], x_S[V]) \not< \mathbb{F}(x[V_x], y) \tag{3.8}$$
for every $x[V_x] \in X[V_x]$ and $y \in Y$. But, by the form of the system (3.2), for the strategy $V_x \in \mathcal{U}_M$ there exist quasimotions $x[\cdot, t_0, x_0, V_x]$ and $y[\cdot, t_0, x_0, V_x]$ such that
$$x^S[V_x] = x[\vartheta, t_0, x_0, V_x] = y_S[V], \qquad y^S[V_x] = y[\vartheta, t_0, x_0, V_x] = x_S[V].$$
Therefore (3.8) means that
$$\mathbb{F}(x^S[V_x], y^S[V_x]) \not< \mathbb{F}(x[V_x], y), \qquad \forall y \in Y,\ x[V_x] \in X[V_x],$$
or
$$(y_S[V], x_S[V]) = (x^S[V_x], y^S[V_x]) \in Z^{(S)}[V_x].$$
Consequently, the following inclusion is true:
$$Z_{(S)}[V] \subset Z^{(S)}[V_x].$$
The converse inclusion may be proved in a similar way. This proves the lemma.
Proposition 3.2. If the strategy $V^{(S)} \in \mathcal{V}_M$ is Slater-maximin in (3.1) with point $(\hat x_S[V^{(S)}], \hat y_S[V^{(S)}])$, then $V_x^{(S)}$ is the Slater-minimax strategy in this game, the associated point is $(\hat x^S[V_x^{(S)}], \hat y^S[V_x^{(S)}]) = (\hat y_S[V^{(S)}], \hat x_S[V^{(S)}])$, and the associated Slater maximin satisfies $\mathbb{F}(\hat x_S[V^{(S)}], \hat y_S[V^{(S)}]) = -\mathbb{F}(\hat x^S[V_x^{(S)}], \hat y^S[V_x^{(S)}])$, where $\mathbb{F}(\hat x^S[V_x^{(S)}], \hat y^S[V_x^{(S)}])$ is the Slater minimax. Conversely, if the strategy $U^{(S)} \in \mathcal{U}_M$ of the first player is Slater-minimax in the game (3.1) with $(\hat x^S[U^{(S)}], \hat y^S[U^{(S)}])$, then $U_x^{(S)}$ is Slater-maximin, while $(\hat x_S[U_x^{(S)}], \hat y_S[U_x^{(S)}]) = (\hat y^S[U^{(S)}], \hat x^S[U^{(S)}])$ and $\mathbb{F}(\hat x^S[U^{(S)}], \hat y^S[U^{(S)}]) = -\mathbb{F}(\hat x_S[U_x^{(S)}], \hat y_S[U_x^{(S)}])$.
Proof. If $V^{(S)}$ is the Slater-maximin strategy of (3.1), then, by Definition 1.1 from Chapter 6, there exists a point $(\hat x_S[V^{(S)}], \hat y_S[V^{(S)}]) \in Z_{(S)}[V^{(S)}]$ such that
$$\mathbb{F}(\hat x_S[V^{(S)}], \hat y_S[V^{(S)}]) \not< \mathbb{F}(x_S[V], y_S[V])$$
for every $V \in \mathcal{V}_M$ and $(x_S[V], y_S[V]) \in Z_{(S)}[V]$. Because $\mathbb{F}(x, y)$ is mirror (3.6),
$$\mathbb{F}(\hat y_S[V^{(S)}], \hat x_S[V^{(S)}]) \not> \mathbb{F}(y_S[V], x_S[V]) \tag{3.9}$$
for any $V \in \mathcal{V}_M$ and $(x_S[V], y_S[V]) \in Z_{(S)}[V]$. By Lemma 3.2, $Z_{(S)}[V] = Z^{(S)}[V_x]$, thus
$$(x_S[V], y_S[V]) = (y^S[V_x], x^S[V_x]).$$
Among the quasimotions $(x[\cdot, t_0, x_0, V_x^{(S)}], y[\cdot, t_0, x_0, V_x^{(S)}])$ of the system (3.2) there are those for which
$$\hat x^S[V_x^{(S)}] = x[\vartheta, t_0, x_0, V_x^{(S)}] = \hat y_S[V^{(S)}], \qquad \hat y^S[V_x^{(S)}] = y[\vartheta, t_0, x_0, V_x^{(S)}] = \hat x_S[V^{(S)}]$$
and
$$(\hat x^S[V_x^{(S)}], \hat y^S[V_x^{(S)}]) \in Z^{(S)}[V_x^{(S)}],$$
because $Z_{(S)}[V^{(S)}] = Z^{(S)}[V_x^{(S)}]$. Consequently, (3.9) may be represented in the form
$$\mathbb{F}(\hat x^S[V_x^{(S)}], \hat y^S[V_x^{(S)}]) \not> \mathbb{F}(x^S[V_x], y^S[V_x])$$
for every $V_x \in \mathcal{U}_M$ and $(x^S[V_x], y^S[V_x]) \in Z^{(S)}[V_x]$. The latter relation implies that $V_x^{(S)}$ is the Slater-minimax strategy of (3.1) and that the associated point is
$$(\hat x^S[V_x^{(S)}], \hat y^S[V_x^{(S)}]) = (\hat y_S[V^{(S)}], \hat x_S[V^{(S)}]).$$
The latter part of the proposition is proved in a similar way.
Proposition 3.3. In the differential game (3.1) with mirror payoff function, the set $\mathbb{F}^{S}$ of all Slater maximins is symmetric, with respect to the point $\mathbb{F} = 0_N \in \mathbb{R}^N$, to the set $\mathbb{F}_{S}$ of all its minimaxes.
Proof. By (3.6) and Proposition 3.2, the following chain of equations holds:
$$\max^{S} \bigcup_{V \in \mathcal{V}_M} \min^{S}_{x[\cdot],\, y[\cdot]} \mathbb{F}(x[\vartheta, t_0, x_0, V], y[\vartheta, t_0, x_0, V]) = \mathbb{F}(\hat x_S[V^{(S)}], \hat y_S[V^{(S)}]) = -\mathbb{F}(\hat x^S[V_x^{(S)}], \hat y^S[V_x^{(S)}])$$
$$= -\min^{S} \bigcup_{U \in \mathcal{U}_M} \max^{S}_{x[\cdot],\, y[\cdot]} \mathbb{F}(x[\vartheta, t_0, x_0, U], y[\vartheta, t_0, x_0, U]),$$
that is, for every Slater maximin there exists a minimax that is symmetric with respect to the point $\mathbb{F} = 0_N$. Analogously, for any Slater-minimax strategy of (3.1),
$$\min^{S} \bigcup_{U \in \mathcal{U}_M} \max^{S}_{x[\cdot],\, y[\cdot]} \mathbb{F}(x[\vartheta, t_0, x_0, U], y[\vartheta, t_0, x_0, U]) = -\max^{S} \bigcup_{V \in \mathcal{V}_M} \min^{S}_{x[\cdot],\, y[\cdot]} \mathbb{F}(x[\vartheta, t_0, x_0, V], y[\vartheta, t_0, x_0, V]),$$
that is, every Slater minimax is associated with a maximin symmetric with respect to $\mathbb{F} = 0_N$, whence follows Proposition 3.3.
Finally, the next proposition may be proved in a similar way.
Proposition 3.4. In (3.1) the set of all vector-valued maximins $\mathbb{F}^{(K)}$ is symmetric (with respect to the point $\mathbb{F} = 0_N$) to the set of minimaxes $\mathbb{F}_{(K)}$ ($K = P, G, A$).
In other words, the following sets are symmetric (with respect to $\mathbb{F} = 0_N$): (1) the sets of Pareto maximins and minimaxes; (2) the sets of Geoffrion maximins and minimaxes; (3) the sets of A-maximins and A-minimaxes.
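The symmetry of maximins and minimaxes can be observed in a finite static analogue of the mirror game. The sketch below is an illustration under explicit assumptions: both players choose from the same hypothetical finite set of points, the payoff is the mirror function $x - y$, the first player minimizes and the second maximizes, and the inner bundles of quasimotions are replaced by the finite set of the opponent's choices.

```python
def slater_min(points):
    """Points with no other point strictly smaller in every component."""
    pts = set(points)
    return {p for p in pts
            if not any(all(q_i < p_i for q_i, p_i in zip(q, p)) for q in pts)}


def slater_max(points):
    """Points with no other point strictly greater in every component."""
    pts = set(points)
    return {p for p in pts
            if not any(all(q_i > p_i for q_i, p_i in zip(q, p)) for q in pts)}


def payoff(x, y):
    """Hypothetical mirror payoff F(x, y) = x - y, so F(x, y) = -F(y, x)."""
    return tuple(a - b for a, b in zip(x, y))


def slater_maximins(strategies, f):
    """Slater-maximal boundary of the union of Slater minima over x, for each fixed y."""
    inner = set()
    for y in strategies:
        inner |= slater_min(f(x, y) for x in strategies)
    return slater_max(inner)


def slater_minimaxes(strategies, f):
    """Slater-minimal boundary of the union of Slater maxima over y, for each fixed x."""
    inner = set()
    for x in strategies:
        inner |= slater_max(f(x, y) for y in strategies)
    return slater_min(inner)


strategies = [(0, 3), (1, 1), (2, 2), (3, 0)]  # common strategy set X = Y
maximins = slater_maximins(strategies, payoff)
minimaxes = slater_minimaxes(strategies, payoff)
reflected = {tuple(-c for c in m) for m in minimaxes}
```

Reflecting the set of minimaxes through the origin reproduces the set of maximins, which is the finite counterpart of the symmetry stated above.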
3.3. Obtaining ZS-Solutions in the Competition Problem
Now let us prove that the procedure leading to a ZS-solution (subsection 1.5) is valid. We consider a multicriterial dynamic problem
$$\Gamma_y = \langle \Sigma \div (1.3), \mathcal{V}, \mathbb{F}(y[\vartheta]) \rangle,$$
where the system $\Sigma$ is described by equation (1.3) with initial position $(t_0, x_0)$; the set of strategies
$$\mathcal{V} = \{V \div v(t, y) \mid v(t, y) \in Q\};$$
and the vector criterion $\mathbb{F}(y[\vartheta])$ is defined on the reachability domain $Y$ of the system (1.3).
Proposition 3.5. If $V^S \div v^S(t, y)$ is the Slater-maximal strategy of problem $\Gamma_y$, the situation $(V_x^S, V^S)$, where $V_x^S \div v^S(t, x)$, is the ZS-solution of the competition problem (1.4), while the associated point, see Definition 1.1 (2), $(\hat x[\vartheta, t_0, x_0, V_x^S], \hat y[\vartheta, t_0, x_0, V^S])$ is defined in such a way that $\hat x[\vartheta, t_0, x_0, V_x^S] = \hat y[\vartheta, t_0, x_0, V^S]$ and $\hat x[\cdot, t_0, x_0, V_x^S]$ is any quasimotion of the system (1.1) generated by the strategy $V_x^S$ from position $(t_0, x_0)$.
Proof. Let us show that the situation $(V_x^S, V^S)$ is the Slater saddle point of (1.4). Because $V^S \in \mathcal{V}^S$, the set of Slater-maximal strategies of problem $\Gamma_y$, by Proposition 2.1 it would be sufficient to show that the strategy $V_x^S \div v^S(t, x)$ is Slater-minimal in the problem
$$\langle \Sigma \div (1.1), \mathcal{U}, -\mathbb{F}(x[\vartheta]) \rangle \tag{3.10}$$
or Slater-maximal in the problem
$$\Gamma_x = \langle \Sigma \div (1.1), \mathcal{U}, \mathbb{F}(x[\vartheta]) \rangle.$$
But the problem $\Gamma_x$ coincides with $\Gamma_y$ when $x$ is replaced by $y$. Therefore, $V_x^S$ is the Slater-maximal strategy in $\Gamma_x$ and, consequently, Slater-minimal in problem (3.10).
Let us show now that, for any quasimotion
$$(\hat x[\cdot, t_0, x_0, V_x^S], \hat y[\cdot, t_0, x_0, V^S])$$
of the system (1.5) such that
$$\hat x[\vartheta] = x[\vartheta, t_0, x_0, V_x^S] = y[\vartheta, t_0, x_0, V^S] = \hat y[V^S],$$
the point
$$0_N = \mathbb{F}(\hat y[V^S]) - \mathbb{F}(\hat x[V_x^S]) \in \mathcal{F}^{(S)}[\mathbb{F}(Y^S) - \mathbb{F}(X_S)], \tag{3.11}$$
where $X_S$ and $Y^S$ are the sets of Slater-maximal solutions of the associated problems $\langle X, \mathbb{F}(x) \rangle$ and $\langle Y, \mathbb{F}(y) \rangle$; that is, the Slater saddle point $(V_x^S, V^S)$ is such that the value of the payoff function on it is the Slater maximum of the set $\mathbb{F}(Y^S) - \mathbb{F}(X_S)$. Since $X = Y$,
$$X_S = Y^S. \tag{3.12}$$
Indeed,
$$\hat y[V^S] = y[\vartheta, t_0, x_0, V^S] \in Y^S, \qquad \hat x[V_x^S] = x[\vartheta, t_0, x_0, V_x^S] \in X_S,$$
by Proposition 3.1 in Chapter 2 on the structure of solutions to dynamic multicriterial problems. Therefore,
$$\mathbb{F}(\hat y[V^S]) - \mathbb{F}(\hat x[V_x^S]) \in \mathbb{F}(Y^S) - \mathbb{F}(X_S).$$
Now (3.11) will be proved by contradiction. Assume that there exists a saddle point $(x^S, y^S)$ of the game
$$\langle X, Y, \mathbb{F}(y) - \mathbb{F}(x) \rangle$$
such that
$$\mathbb{F}(y^S) - \mathbb{F}(x^S) > \mathbb{F}(\hat y[V^S]) - \mathbb{F}(\hat x[V_x^S]) = 0_N.$$
Hence
$$\mathbb{F}(y^S) > \mathbb{F}(x^S). \tag{3.13}$$
However, by Lemma 2.1, $y^S \in Y^S$ and $x^S \in X_S$. Then inequality (3.13) contradicts (3.12) and the internal stability of the Slater-maximal solutions of the static multicriterial problem $\langle Y, \mathbb{F}(y) \rangle$. The resultant contradiction proves (3.11). The inclusion
$$\mathbb{F}(\hat y[V^S]) - \mathbb{F}(\hat x[V_x^S]) \in \mathcal{F}_{(S)}[\mathbb{F}(Y^S) - \mathbb{F}(X_S)]$$
is proved in a similar way. From this expression, (3.11), and Proposition 2.4, it follows that there exist a Slater maximin and a Slater minimax equal to $\mathbb{F}(\hat y[V^S]) - \mathbb{F}(\hat x[V_x^S])$, which proves Proposition 3.5.
Analogously, the ZS-solution of (1.4) is the situation $(U^S, U_x^S)$, where the strategy $U^S$ is Slater-maximal in the multicriterial problem $\langle \Sigma \div (1.1), \mathcal{U}, \mathbb{F}(x[\vartheta]) \rangle$, while the associated point $(\hat x[\vartheta, t_0, x_0, U^S], \hat y[\vartheta, t_0, x_0, U_x^S])$ is such that $\hat x[\vartheta, t_0, x_0, U^S] = \hat y[\vartheta, t_0, x_0, U_x^S]$. This is exactly the proposition used in subsection 1.5 to obtain the ZS-solution of the competition problem (1.4).
Remark 3.2. A similar procedure for obtaining a vector-valued saddle point, where the associated vector-valued maximin is equal to the minimax, can also be formulated in the cases of the Pareto, Geoffrion, and A-optimum.
Thus, in the case of the Pareto optimum (for the ZP-solution) the procedure is as follows:
(1) Find the Pareto-maximal strategy $U^P \in \mathcal{U}^P$, $U^P \div u^P(t, x)$, of the multicriterial problem $\Gamma_x$ and, simultaneously, obtain at least one quasimotion $\hat x[\cdot, t_0, x_0, U^P]$ of the system (1.1);
(2) The situation $(U^P, U_x^P)$, $U_x^P \div u^P(t, y)$, is the Pareto saddle point such that, for $\hat y[\vartheta, t_0, x_0, U_x^P] = \hat x[\vartheta, t_0, x_0, U^P]$,
$$\mathbb{F}(\hat y[\vartheta, t_0, x_0, U_x^P]) - \mathbb{F}(\hat x[\vartheta, t_0, x_0, U^P]) = \max^{P} \bigcup_{V \in \mathcal{V}} \min^{P}_{x[\cdot],\, y[\cdot]} [\mathbb{F}(y[\vartheta, t_0, x_0, V]) - \mathbb{F}(x[\vartheta, t_0, x_0, V])]$$
$$= \min^{P} \bigcup_{U \in \mathcal{U}} \max^{P}_{x[\cdot],\, y[\cdot]} [\mathbb{F}(y[\vartheta, t_0, x_0, U]) - \mathbb{F}(x[\vartheta, t_0, x_0, U])],$$
that is, $(U^P, U_x^P)$ is the ZP-solution of the competition problem (1.1). This procedure is implemented in the next subsection in constructing a mathematical model of an identical research problem that is solved by two competing teams.
4. Model of Competing Research Activities
Sets of Pareto minimaxes and maximins are found and a ZP-solution is obtained in the case of the Pareto optimum.
4.1. Single-Firm Model
The model is constructed along the lines of [77], where only the Nash equilibrium was investigated. Two teams are engaged in identical research. The time $\vartheta > 0$ allotted for this research is specified in advance. If neither team is successful within this time, the research ceases, since by that time the problem is no longer relevant. Let $t_i$ denote a random time $t_i \in (0, \vartheta]$ at which the $i$-th ($i = 1, 2$) team scores a success. If $u_i(t)$ is the rate at which the $i$-th team acquires (at time $t \in [0, \vartheta]$) the knowledge needed to solve the problem, the knowledge acquired by the $i$-th team increases by the equation
$$\dot z_i = u_i(t), \qquad z_i(0) = 0, \qquad i = 1, 2, \tag{4.1}$$
where the scalar functions $u_i(t)$ are Borel-measurable and $u_i(t) \in [0, a_i]$ at every $t \in [0, \vartheta)$, with $a_i$ a specified positive constant. The sets of such functions $u_i(\cdot)\colon [0, \vartheta) \to [0, a_i]$ will be denoted as $\mathcal{U}_i$, $i = 1, 2$. The initial conditions
$z_i(0) = 0$, $i = 1, 2$, signify that at the initial time $t = 0$ neither team has any insight into the problem. The probability that the $i$-th team will solve the problem if the amount of knowledge about it is $z_i$ is described [77] by the formula $F_i(z_i) = 1 - \exp[-\lambda z_i]$. From this distribution law it follows, in particular, that
$$P\{t_i \in (t, t + dt) \mid t_i > t\} = \lambda u_i(t)\, dt, \qquad t \in [0, \vartheta],$$
or, if by time $t$ the $i$-th team has not solved the problem, the probability of a success over the nearest time span of $dt$ is directly proportional to $u_i(t)$, while the probability of failure by time $t$ is $\exp[-\lambda z_i(t)]$. If the value of the patent by the current time is a constant $L$, the mean income from securing a patent by the $i$-th team is described by the functional
$$J_i^{(1)}(u_i) = \lambda L \int_0^{\vartheta} u_i(t) \exp[-\lambda z_i(t)]\, dt. \tag{4.2}$$
Let us now proceed to estimating the costs of the $i$-th team in the course of the research activities. The cost of obtaining additional knowledge at time $t$ is estimated [77] as $0.5 u_i^2(t)$. Therefore the mean payment over the entire time span $\vartheta$ is given by the functional
$$\frac{1}{2} \int_0^{\vartheta} u_i^2(t) \exp[-rt] \exp[-\lambda z_i(t)]\, dt,$$
where the rate of discounting $r$ is assumed to be a specified positive constant. Then the second criterion of the team's performance is the functional
$$J_i^{(2)}(u_i) = -\frac{1}{2} \int_0^{\vartheta} u_i^2(t) \exp[-rt - \lambda z_i(t)]\, dt. \tag{4.3}$$
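For a constant control $u_i(t) \equiv c$ the state is $z_i(t) = ct$ and both functionals integrate in closed form: $J_i^{(1)} = L(1 - e^{-\lambda c \vartheta})$ and $J_i^{(2)} = -c^2 (1 - e^{-(r + \lambda c)\vartheta}) / (2(r + \lambda c))$. The sketch below compares these expressions with a simple midpoint-rule quadrature of (4.2) and (4.3). All parameter values are hypothetical, and the quadrature routine is an illustration rather than a method taken from the text.

```python
import math


def criteria(control, lam, patent_value, r, theta, steps=20000):
    """Midpoint-rule estimates of (J1, J2) in (4.2), (4.3) for a control u(t)."""
    dt = theta / steps
    z = 0.0                      # knowledge z(t), with z(0) = 0 as in (4.1)
    j1 = 0.0
    j2 = 0.0
    for k in range(steps):
        t_mid = (k + 0.5) * dt
        u = control(t_mid)
        z_mid = z + 0.5 * u * dt  # knowledge at the midpoint of the step
        j1 += lam * patent_value * u * math.exp(-lam * z_mid) * dt
        j2 -= 0.5 * u * u * math.exp(-r * t_mid - lam * z_mid) * dt
        z += u * dt
    return j1, j2


def constant_control_criteria(c, lam, patent_value, r, theta):
    """Closed-form (J1, J2) for the constant control u(t) = c."""
    k = r + lam * c
    j1 = patent_value * (1.0 - math.exp(-lam * c * theta))
    j2 = -0.5 * c * c * (1.0 - math.exp(-k * theta)) / k
    return j1, j2


# Hypothetical parameters: hazard coefficient, patent value, discount rate, horizon.
lam, patent_value, r, theta, c = 0.5, 10.0, 0.1, 2.0, 0.8
numeric = criteria(lambda t: c, lam, patent_value, r, theta)
exact = constant_control_criteria(c, lam, patent_value, r, theta)
```

The income criterion is positive and the cost criterion is negative, which is the trade-off behind the two-criteria formulation that follows.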
If it is required to analyze the optimal behavior of the $i$-th team alone, regardless of the other team, the mathematical model could be the two-criterial dynamic problem
$$\langle \dot z_i = u_i,\ z_i(0) = 0;\ \mathcal{U}_i;\ \{J_i^{(1)}(u_i), J_i^{(2)}(u_i)\} \rangle, \tag{4.4}$$
where the functionals $J_i^{(1)}(u_i)$ and $J_i^{(2)}(u_i)$ are described in (4.2) and (4.3), respectively. In this problem, however, the $i$-th team generates the control $u_i(\cdot) \in \mathcal{U}_i$ in order to maximize both criteria $J_i^{(1)}(u_i)$ and $J_i^{(2)}(u_i)$. This signifies that the $i$-th team tries to minimize the cost $-J_i^{(2)}(u_i)$ and, simultaneously, maximize the income $J_i^{(1)}(u_i)$. In terms of the theory of multicriterial problems
the solution of (4.4) may be the Pareto-maximal control $u_i^P(\cdot) \in \mathcal{U}_i$, for which, for any $u_i(\cdot) \in \mathcal{U}_i$, the following system of inequalities is inconsistent:
$$J_i^{(1)}(u_i) \ge J_i^{(1)}(u_i^P), \qquad J_i^{(2)}(u_i) \ge J_i^{(2)}(u_i^P),$$
at least one of which is strict; now let $\mathcal{U}_i^P = \{u_i^P\}$. Note that the Pareto-minimal solution $u_i^*(\cdot) \in \mathcal{U}_i$ of (4.4) is defined by the inconsistency of the system of inequalities
$$J_i^{(1)}(u_i) \le J_i^{(1)}(u_i^*), \qquad J_i^{(2)}(u_i) \le J_i^{(2)}(u_i^*),$$
at least one of which, as in the preceding definition, is a strict inequality. Consequently, in the case of seeking the optimal solution for one team, disregarding the other team, it is sufficient to find, and then use, the Pareto-optimal control for problem (4.4). This, however, is insufficient in the case of two competing teams.
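The Pareto-maximality test can be illustrated on a restricted family of controls. The sketch below is an assumption-laden illustration, not the solution of (4.4): it considers only constant controls $u_i(t) \equiv c$ on a grid, uses the closed-form values of $J_i^{(1)}$ and $J_i^{(2)}$ that follow from $z_i(t) = ct$, and takes hypothetical parameter values.

```python
import math


def constant_control_criteria(c, lam, patent_value, r, theta):
    """(J1, J2) of (4.2), (4.3) for the constant control u(t) = c, where z(t) = c*t."""
    k = r + lam * c
    j1 = patent_value * (1.0 - math.exp(-lam * c * theta))
    j2 = -0.5 * c * c * (1.0 - math.exp(-k * theta)) / k
    return j1, j2


def pareto_maximal(pairs):
    """Keep (control, value) pairs whose value no other value dominates."""
    def dominates(q, p):
        # q dominates p: no component is worse and the vectors differ.
        return all(a >= b for a, b in zip(q, p)) and q != p

    return [(c, v) for c, v in pairs if not any(dominates(w, v) for _, w in pairs)]


# Hypothetical parameters and a grid of constant controls in [0, a_i] with a_i = 1.
lam, patent_value, r, theta = 0.5, 10.0, 0.1, 2.0
grid = [k / 10 for k in range(11)]
pairs = [(c, constant_control_criteria(c, lam, patent_value, r, theta)) for c in grid]
front = pareto_maximal(pairs)
```

Within this family the income grows and the cost criterion falls as $c$ increases, so no constant control dominates another and the whole grid is Pareto-maximal. Time-varying controls are outside the scope of this sketch.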
4.2. Game-theoretic Model of Competition
In the competition the $i$-th team tries to do better than the other team in terms of both criteria or at least in terms of one criterion without having the other team's chances. For the first team this means trying to choose the appropriate $u_1(\cdot) \in \mathcal{U}_1$ so as to increase both components of the vector
$$I(u_1, u_2) = (I_1(u_1, u_2), I_2(u_1, u_2)), \tag{4.5}$$
where
$$I_i(u_1, u_2) = J_1^{(i)}(u_1) - J_2^{(i)}(u_2), \qquad i = 1, 2. \tag{4.6}$$
The objective of the other team is just the opposite, i.e., to find its own control $u_2(\cdot) \in \mathcal{U}_2$ in order to reduce the components of the vector $I(u_1, u_2)$, which, by virtue of the fact that $\min[-J] = -\max J$, implies increasing the differences
$$J_2^{(1)}(u_2) - J_1^{(1)}(u_1) = -I_1(u_1, u_2), \qquad J_2^{(2)}(u_2) - J_1^{(2)}(u_1) = -I_2(u_1, u_2).$$
The process dynamics are described by the system
$$\dot z_i = u_i, \qquad z_i(0) = 0, \qquad i = 1, 2,$$
with constraints $u_i(\cdot) \in \mathcal{U}_i$.
Then the mathematical competition model may be represented as a differential zero-sum game (4.7) with a two-component payoff function, where $\mathcal{U}_i$ is the set of strategies of the i-th player (subsection 4.1); the control system C is described by the ordinary differential equations (4.1); and the vector-valued payoff function $I(u_1, u_2)$ is specified in (4.5), (4.6). Now let us proceed to the notion of a solution to (4.7). The situation $(u_1^0(\cdot), u_2^0(\cdot)) \in \mathcal{U}_1 \times \mathcal{U}_2$ will be referred to as the Pareto saddle point of the game (4.7) if
$$I(u_1, u_2^0) \not\geq I(u_1^0, u_2^0), \quad \forall u_1(\cdot) \in \mathcal{U}_1,$$
and
$$I(u_1^0, u_2^0) \not\geq I(u_1^0, u_2), \quad \forall u_2(\cdot) \in \mathcal{U}_2.$$
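On finite candidate sets the two saddle-point conditions can be checked directly. The sketch below assumes the payoff of (4.5), (4.6) and reads the relation in the definition as Pareto non-domination (componentwise $\geq$ with at least one strict inequality); that reading, the criterion values `J1`, `J2`, and all function names are assumptions made for illustration.

```python
from itertools import product
from typing import List, Tuple

Vector = Tuple[float, float]


def dominates(a: Vector, b: Vector) -> bool:
    """a >= b componentwise with at least one strict inequality."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def payoff(j1: Vector, j2: Vector) -> Vector:
    """I(u1, u2) = J_1(u1) - J_2(u2), taken componentwise."""
    return (j1[0] - j2[0], j1[1] - j2[1])


def is_pareto_saddle(i: int, k: int, J1: List[Vector], J2: List[Vector]) -> bool:
    here = payoff(J1[i], J2[k])
    # Team 1 (maximizer): no deviation of its own gives a payoff dominating `here`.
    ok1 = not any(dominates(payoff(j1, J2[k]), here) for j1 in J1)
    # Team 2 (minimizer): no deviation of its own gives a payoff dominated by `here`.
    ok2 = not any(dominates(here, payoff(J1[i], j2)) for j2 in J2)
    return ok1 and ok2


# Hypothetical criterion vectors for four candidate controls per team.
J1 = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (1.0, 1.0)]
J2 = list(J1)  # assumption: both teams face the same kind of problem

saddles = [(i, k) for i, k in product(range(len(J1)), range(len(J2)))
           if is_pareto_saddle(i, k, J1, J2)]
```

In this toy instance the saddle points are exactly the pairs in which each team uses one of its own non-dominated candidates (indices 0 to 2); the dominated candidate with index 3 never appears.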
The set of all Pareto saddle points is also denoted as $\mathcal{P}$. The solution of (4.7) for the first player, the Pareto-maximin strategy $u_1^P(\cdot)$, is defined in the following way. A new class of strategies, called the counter-strategies of the second player, will be needed. The second player's counter-strategy is associated with the Borel-measurable (over $t$, $u_1$) functions $u_2(t, u_1)$ such that $u_2(t, u_1) \in [0, a_2]$ for every $(t, u_1) \in [0, \vartheta) \times [0, a_1]$. The set of such counter-strategies will be denoted $\mathcal{U}_2(u_1)$, the inclusion $\mathcal{U}_2 \subset \mathcal{U}_2(u_1)$ being obvious. In a similar way the set $\mathcal{U}_1(u_2)$ of counter-strategies $u_1(t, u_2): [0, \vartheta) \times [0, a_2] \to [0, a_1]$ of the first player is introduced. The strategy $u_1^*(\cdot) \in \mathcal{U}_1$ of the first player will be referred to as the Pareto maximin in the game (4.7) if
(1) for every strategy $u_1(\cdot) \in \mathcal{U}_1$ of the first player there exists a counter-strategy $\tilde u_2(\cdot) = \tilde u_2(\cdot, u_1(\cdot)) \in \mathcal{U}_2(u_1)$ of the second player such that
$$I(u_1, \tilde u_2) \not\geq I(u_1, u_2), \quad \forall u_2(\cdot) \in \mathcal{U}_2(u_1);$$
the set of such counter-strategies $\tilde u_2(\cdot)$ will be denoted $\tilde{\mathcal{U}}_2(u_1)$;
(2) there exists a counter-strategy $\tilde u_2^* = \tilde u_2^*(\cdot, u_1^*(\cdot)) \in \tilde{\mathcal{U}}_2(u_1^*)$ for which
$$I(u_1^*, \tilde u_2^*) \not\leq I(u_1, \tilde u_2), \quad \forall u_1(\cdot) \in \mathcal{U}_1, \quad \tilde u_2(\cdot) \in \tilde{\mathcal{U}}_2(u_1).$$
The constant vector $I(u_1^*, \tilde u_2^*)$ will be referred to as the Pareto maximin of (4.7) and denoted
$$I(u_1^*, \tilde u_2^*) = \max^{P}_{u_1(\cdot) \in \mathcal{U}_1} \min^{P}_{u_2(\cdot) \in \mathcal{U}_2(u_1)} I(u_1, u_2).$$
The notions of a Pareto-minimax strategy $u_2^*(\cdot)$ of the second player (using the set $\mathcal{U}_1(u_2)$ of counter-strategies of the first player) and of the Pareto minimax of the game (4.7) are introduced in the same way:
$$I(\tilde u_1^*, u_2^*) = \min^{P}_{u_2(\cdot) \in \mathcal{U}_2} \max^{P}_{u_1(\cdot) \in \mathcal{U}_1(u_2)} I(u_1, u_2).$$
Definition 4.1. The solution of (4.7) is the situation $(u_1^*, u_2^*)$ for which
(1) the situation is a Pareto saddle point in this game;
(2) the strategy $u_1^*(\cdot)$ is the Pareto-maximin and $u_2^*(\cdot)$, the Pareto-minimax of (4.7);
(3) the value of the two-component payoff function $I(u_1, u_2)$ in the saddle point $(u_1^*, u_2^*)$ coincides with the Pareto minimax and maximin, that is,
$$\begin{aligned} I(u_1^*, u_2^*) &= \max^{P}_{u_1(\cdot) \in \mathcal{U}_1} \min^{P}_{u_2(\cdot) \in \mathcal{U}_2(u_1)} I(u_1, u_2) = \min^{P}_{u_2(\cdot) \in \mathcal{U}_2(u_1^*)} I(u_1^*, u_2) \\ &= \max^{P}_{u_1(\cdot) \in \mathcal{U}_1(u_2^*)} I(u_1, u_2^*) = \min^{P}_{u_2(\cdot) \in \mathcal{U}_2} \max^{P}_{u_1(\cdot) \in \mathcal{U}_1(u_2)} I(u_1, u_2). \end{aligned}$$
This solution $(u_1^*, u_2^*)$ is internally stable over the entire set of Pareto saddle points of (4.7), specifically
$$I(u_1^*, u_2^*) \not\leq I(u_1^0, u_2^0), \qquad I(u_1^*, u_2^*) \not\geq I(u_1^0, u_2^0)$$
for any Pareto saddle points $(u_1^0(\cdot), u_2^0(\cdot)) \in \mathcal{P}$ of the game (4.7).
The solution of (4.7) introduced by Definition 4.1 is the ZP-solution introduced in Definition 2.1 of Chapter 6. In the last subsection of this section the analytic form of this solution will be found.
4.3. Pareto-Optimal Controls
The validity of the following method of obtaining a Pareto-maximal control is proved in [102]. Specifically, it is required to find the optimal control $u_i^0(\cdot) \in \mathcal{U}_i$ in the problem
$$J_i(u_i) = \beta J_i^{(1)}(u_i) + (1 - \beta) J_i^{(2)}(u_i) \to \max_{u_i(\cdot) \in \mathcal{U}_i} \tag{4.8}$$
with constraints
$$\dot z_i = u_i, \quad z_i(0) = 0 \tag{4.9}$$
and at least one constant $\beta \in (0, 1)$. Then the control $u_i^0(\cdot) \in \mathcal{U}_i$ is Pareto-optimal in problem (4.4). The solution of (4.8) and (4.9) will be obtained by the dynamic programming method combined with the method of Lyapunov functions [102]; for brevity of notation, the subscript i will be omitted in the further discussion. The dynamic programming equation for (4.8) and (4.9) takes the form
$$\varphi(\vartheta, z) = 0, \quad \forall z \in \mathbb{R}. \tag{4.10}$$
In solving this equation, the constraint $u(\cdot) \in [0, a]$ will be neglected for now and its validity will be tested below. Since we have a quadratic function of $u$ in the braces of (4.10), the maximum in (4.10) is attained with the control $u^0$ of (4.11). Substituting $u^0$ found in (4.11) in the first equality of (4.10) gives us the partial differential equation (4.12) for obtaining the function $\varphi(t, z)$.
The solution of (4.12) is sought in the form
$$\varphi(t, z) = h(t) e^{-\lambda z}. \tag{4.13}$$
Substituting (4.13) in (4.12), we find an ordinary differential equation for determining the function $h(t)$; its solution is given by (4.14).
Hence, in light of (4.13) and (4.11),
' + 2(1 A2- B)r
(1 - B)[(BL)-
The resultant Pareto-maximal control $u^0(\cdot)$ has certain properties. First, $u^0(\cdot)$ is not explicitly dependent on the state variable; thus in problem (4.4) it would be sufficient to use only programmed controls, that is, controls depending exclusively on the time $t$. Second, $u^0(t) > 0$ at every $t \in [0, \vartheta]$ for $\beta = \text{const} \in (0, 1)$, which implies that it is desirable to conduct the research over the entire time span $[0, \vartheta]$. Third,
that is, $u^0(t)$ is bounded at every $t \in [0, \vartheta]$. This requirement is obvious if an allowance for the research cost is to be made. Now, let us proceed to obtain in the criterion space $\{J^{(1)}, J^{(2)}\}$ the set of values of the functionals $(J^{(1)}(u), J^{(2)}(u))$ that can be achieved on the set of all Pareto-maximal controls of problem (4.4). Since the set of controls $\mathcal{U}$ is convex and the criteria $J^{(1)}(u)$ and $J^{(2)}(u)$ are strictly quasiconcave [70, p. 99], then, following [70, p. 107], we have the entire set $\mathcal{U}^0$ of Pareto-maximal controls $u^0(\cdot)$ for problem (4.4) if the optimal control problem (4.8), (4.9) is solved (i.e., if $u^0(\cdot)$ is found) as the parameter $\beta$ varies from 0 to 1. From (4.13) and (4.14),
Hence we arrive at (4.16). Simultaneously, by the dynamic programming method,
$$\varphi(0, 0) = \beta J^{(1)}(u^0) + (1 - \beta) J^{(2)}(u^0).$$
Denoting in (4.16)
we have
$$\varphi(0, 0) = \frac{a L^2 \cos\psi}{\sin\psi + a L \cos\psi}.$$
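With $\psi = \pi/2$ we have $\varphi(0, 0) = 0$, that is, the scalar product $(J^{(1)}(u^0), J^{(2)}(u^0)) \cdot (0, 1) < 0 \Rightarrow J^{(2)}(u^0) < 0$ for all $u^0(\cdot) \in \mathcal{U}^0$. With $\psi = 0$ we have $\varphi(0, 0) = L$, that is, $(J^{(1)}(u^0), J^{(2)}(u^0)) \cdot (1, 0) < L \Rightarrow J^{(1)}(u^0) < L$, $\forall u^0(\cdot) \in \mathcal{U}^0$. In Fig. 7.4.1 the set $J(\mathcal{U}^0)$ of values of the criteria $(J^{(1)}(u), J^{(2)}(u))$ on the set $\mathcal{U}^0$ of Pareto-maximal controls $u^0(\cdot)$ is shown as a heavy line. Here also
$$J(\mathcal{U}) = \bigcup_{u(\cdot) \in \mathcal{U}} (J^{(1)}(u), J^{(2)}(u)), \qquad J(\mathcal{U}^0) = \bigcup_{u(\cdot) \in \mathcal{U}^0} (J^{(1)}(u), J^{(2)}(u)).$$
Figure 7.4.1. (Axis label $J^{(2)}$; marked value $L$.)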
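The scalarization (4.8) can be illustrated numerically on a finite set of candidates: for each weight $\beta \in (0, 1)$ the maximizer of $\beta J^{(1)} + (1 - \beta) J^{(2)}$ is checked to be Pareto-maximal. The quarter-circle criterion values below are a hypothetical stand-in chosen for convenience; they are not the set $J(\mathcal{U})$ of Fig. 7.4.1, and the function names are likewise assumptions.

```python
import math
from typing import List, Tuple

Vector = Tuple[float, float]


def dominates(a: Vector, b: Vector) -> bool:
    """a >= b componentwise with at least one strict inequality."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def scalarized_best(criteria: List[Vector], beta: float) -> int:
    """Index of the candidate maximizing beta*J1 + (1 - beta)*J2."""
    return max(range(len(criteria)),
               key=lambda k: beta * criteria[k][0] + (1.0 - beta) * criteria[k][1])


# Hypothetical candidates: points on a quarter circle (mutually non-dominated)
# plus the same points shrunk by half (each dominated by its original).
angles = [k * math.pi / 16 for k in range(9)]
front = [(math.cos(a), math.sin(a)) for a in angles]
inner = [(0.5 * x, 0.5 * y) for x, y in front]
criteria = front + inner

betas = [k / 10 for k in range(1, 10)]  # weights strictly inside (0, 1)
chosen = sorted({scalarized_best(criteria, b) for b in betas})

# Every scalarized maximizer should be Pareto-maximal among all candidates.
all_pareto = all(not any(dominates(q, criteria[k]) for q in criteria)
                 for k in chosen)
```

On a finite grid of weights only some points of the front are selected. The statement in the text that the whole set of Pareto-maximal controls is recovered as $\beta$ runs over the interval from 0 to 1 relies on the convexity and strict quasiconcavity assumptions cited from [70].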
4.4. Decision Making in the Competition Model
In this subsection the results of the preceding subsection are shown to lead to the explicit form of a solution of the game (4.7) (Definition 4.1). For brevity of the notation, we introduce two vectors
$$J_i(u_i) = (J_i^{(1)}(u_i), J_i^{(2)}(u_i)), \quad i = 1, 2,$$
and consider a two-criterial dynamic problem
$$\Gamma_1 = \langle \dot z_1 = u_1, \ z_1(0) = 0; \ \mathcal{U}_1; \ J_1(u_1) \rangle. \tag{4.17}$$
Let $\mathcal{U}_1^* = \mathcal{U}^0$ be the set of Pareto-maximal controls of $\Gamma_1$ and
$$J_1(\mathcal{U}_1^*) = \bigcup_{u_1(\cdot) \in \mathcal{U}_1^*} (J_1^{(1)}(u_1), J_1^{(2)}(u_1))$$
the set of values of the vector-valued criterion $J_1(u_1)$ obtained on $\mathcal{U}_1^*$. Simultaneously with $\Gamma_1$ let us consider a two-criterial dynamic problem
$$\Gamma_2 = \langle \dot z_2 = u_2, \ z_2(0) = 0; \ \mathcal{U}_2; \ -J_2(u_2) \rangle,$$
where $-J_2(u_2) = (-J_2^{(1)}(u_2), -J_2^{(2)}(u_2))$; in this subsection it is assumed that
Thus the set of controls $\mathcal{U}_1$ for the first player coincides with the set of controls of the second player. Because the multicriterial problems $\Gamma_1$ and $\Gamma_2$ are of the same kind, the set $\mathcal{U}_1^*$ of Pareto-maximal controls of $\Gamma_1$ satisfies
$$\mathcal{U}_1^* = \mathcal{U}_2^*,$$
where $\mathcal{U}_2^*$ is the set of Pareto-minimal controls of $\Gamma_2$ or, equivalently, the Pareto-maximal controls of the two-criterial problem
$$\langle \dot z_2 = u_2, \ z_2(0) = 0; \ \mathcal{U}_2; \ J_2(u_2) \rangle.$$
Recall that, by (4.15),
As in Lemma 2.1, we may establish the following proposition.
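Proposition 4.1. Any situations $(u_1^*(\cdot), u_2^*(\cdot)) \in \mathcal{U}_1^* \times \mathcal{U}_2^*$, and only such situations, are Pareto saddle points of (4.7). The values of the vector-valued payoff function $I(u_1, u_2)$ obtained on the set $\mathcal{U}_1^* \times \mathcal{U}_2^*$ of all Pareto saddle points are defined by the difference
$$I(\mathcal{U}_1^*, \mathcal{U}_2^*) = J_1(\mathcal{U}_1^*) - J_2(\mathcal{U}_2^*) = \bigcup_{u_1^*(\cdot) \in \mathcal{U}_1^*, \ u_2^*(\cdot) \in \mathcal{U}_2^*} [J_1(u_1^*) - J_2(u_2^*)]. \tag{4.18}$$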
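Figure 7.4.2 shows the sets $J_1(\mathcal{U}_1^*)$ and $J_2(\mathcal{U}_2^*)$ and Fig. 7.4.3, the set
$$I(\mathcal{U}_1^*, \mathcal{U}_2^*) = J_1(\mathcal{U}_1^*) - J_2(\mathcal{U}_2^*).$$
Pareto minimaxes and maximins are obtained by solving the dynamic two-criterial problem
$$\langle \dot z_i = u_i, \ z_i(0) = 0; \ \mathcal{U}_1^* \times \mathcal{U}_2^*; \ I(u_1, u_2) \rangle, \tag{4.19}$$
where the solutions $(u_1^*(\cdot), u_2^*(\cdot))$ are defined on the set $\mathcal{U}_1^* \times \mathcal{U}_2^*$ of Pareto saddle points of the game (4.7).
Figure 7.4.2.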
Proposition 4.2. The set of Pareto maximins of the game (4.7) coincides with the Pareto-maximal values of the vector-valued payoff function $I(u_1, u_2)$ in problem (4.19) obtainable on the set $\mathcal{U}_1^* \times \mathcal{U}_2^*$ of all Pareto saddle points, or
where $J_1(\mathcal{U}_1^*) - J_2(\mathcal{U}_2^*)$ is defined in (4.18).
The set of Pareto minimaxes of (4.7) coincides with the Pareto-minimal values of the vector-valued payoff function $I(u_1, u_2)$ in problem (4.19) obtainable on the set of all Pareto saddle points, or
The set of Pareto minimaxes is symmetric (with respect to the point $0_2 \in \mathbb{R}^2$) to the set of its maximins, and the set of Pareto-maximin (Pareto-minimax) strategies coincides with the set of Pareto-maximal (Pareto-minimal) controls of problem (4.19).
Figure 7.4.3.
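The symmetry stated in Proposition 4.2 can be checked on a small example. The sketch assumes that both teams share one finite, mutually non-dominated set of criterion vectors (a hypothetical stand-in for $J_1(\mathcal{U}_1^*) = J_2(\mathcal{U}_2^*)$), forms the difference set of (4.18), and extracts its Pareto-maximal and Pareto-minimal points. Integer values are used so that all comparisons are exact; the names are illustrative.

```python
from typing import List, Set, Tuple

Vector = Tuple[int, int]


def dominates(a: Vector, b: Vector) -> bool:
    """a >= b componentwise with at least one strict inequality."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


# Hypothetical common set of criterion values on the Pareto-maximal controls.
front: List[Vector] = [(0, 3), (2, 2), (3, 0)]

# Difference set J_1(U_1*) - J_2(U_2*), as in (4.18).
diff: Set[Vector] = {(a[0] - b[0], a[1] - b[1]) for a in front for b in front}

# Pareto-maximal values of the payoff on the saddle points (maximins)
# and Pareto-minimal values (minimaxes).
maximins = {p for p in diff if not any(dominates(q, p) for q in diff)}
minimaxes = {p for p in diff if not any(dominates(p, q) for q in diff)}

# Reflection of the maximin set through the origin.
mirrored = {(-x, -y) for (x, y) in maximins}
```

In this example `minimaxes` equals `mirrored`, and the origin belongs to both sets, which corresponds to the situation in which the second team repeats the control of the first.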
Proofs may be obtained using the procedures applied to prove Propositions 2.4, .2, and 3.3, with the relation $>$ replaced by $\geq$. In Fig. 7.4.3 the set of Pareto maximins is shown as a double line and the set of Pareto minimaxes, as a broken line. In the latter figure the point $I(0, 0)$ is seen to agree with the solution $(u_1^*(\cdot), u_2^*(\cdot))$ of the game (4.7), Definition 4.1. There the Pareto minimax coincides with the maximin and the value of the payoff function $I(u_1, u_2)$ in the Pareto saddle point. This solution $(u_1^*(\cdot), u_2^*(\cdot))$ is implemented, in particular, by the situation $(u_1^*(\cdot), u_2^*(\cdot))$, where $u_2^*(t) = u_1^*(t)$, $u_1^*(\cdot) \in \mathcal{U}_1^*$. Then
Proposition 4.3. Any situation $(u_1^*(\cdot), u_2^*(\cdot))$, where $u_2^*(t) = u_1^*(t)$ at $t \in [0, \vartheta)$ and $u_1^*(\cdot) \in \mathcal{U}_1^*$ (the set of Pareto-maximal controls of problem (4.17)), is the solution of (4.7) (Definition 4.1).
The proof is analogous to that of Proposition 3.5. From Proposition 4.3 we have an algorithm for obtaining a solution (Definition 4.1) in the model (4.7) of two research projects run simultaneously by two different teams:
Algorithm.
(1) Choose some Pareto-optimal control $u_1^*(\cdot)$ of (4.17) using the formula (4.20).
(2) The solution of (4.7) is represented by the situation $(u_1^*(\cdot), u_2^*(\cdot))$, where $u_2^*(t) = u_1^*(t)$ at $t \in [0, \vartheta)$.
The choice (on stage (1)) of the Pareto-maximal control $u_1^*(\cdot)$ is equivalent to "assigning" a specific constant $\beta \in (0, 1)$ and then using (4.20). Note that $\beta$ must also satisfy the constraint