Game models of expert study


U.S.S.R. Comput. Maths Math. Phys. Vol. 17, pp. 113-128. © Pergamon Press Ltd. 1978. Printed in Great Britain

0041-5553/77/0801-0113$07.50/0

GAME MODELS OF EXPERT STUDY*

B. S. METEV

Sofia, Bulgaria

(Received 12 December 1975)

TO OBTAIN a description of expert study, a one-decision-making-person (dmp) game with Nature is considered. The dmp has no information about Nature’s choices. A second person, the expert (E), participates in the expert study; he knows Nature’s choices precisely and his interests are known to the dmp. Two game constructions, reflecting the interplay between the dmp and E, are given. We show that, in the conditions considered, the dmp obtains a guaranteed result which may be greater than his ordinary max-min in the game with Nature.

1. Introduction

There have been many publications in recent years dealing with applications of expert study to the most diverse practical topics. Attempts at formalizing the processing and analysis of expert opinions have also been published, though these attempts have invariably lacked clarity at certain points. In the present paper we aim to employ some ideas underlying the theory of games with unopposed interests (with a fixed sequence of moves and with information exchange) [1-3], in order to obtain game constructions which reflect the interplay between the persons participating in the expert study.

*Zh. vychisl. Mat. mat. Fiz., 17, 4, 932-947, 1977.

2. Basic model of expert study

To obtain our basic model of expert study, we make the following assumptions. There is a decision-making person (dmp) who plays a game with Nature [4]. The dmp has a finite set $A$ of alternatives $e_i \in A$, $i \in M$, among which he chooses just one. Every time the dmp makes his choice, say of $e_i$, Nature is in a single state $\theta_j$, $j \in N$, where $N$ is a finite set, and the dmp obtains the pay-off (result) $a_{ij}$. The matrix $A = \|a_{ij}\|$, $i \in M$, $j \in N$, is dmp's criterion, and $|M| = m_0$, $|N| = n_0$. If, prior to each choice, dmp has no information about the actual state $\theta_j$ of Nature, he sticks to the concept of the maximum guaranteed result (mgr), and if he does not want to perform an expert study, he can choose the alternative $e_{i^*}$ for which

$$\max_{i \in M} \min_{j \in N} a_{ij} = L_H$$

is reached.

A second person participates in an expert study; he is called the expert (E), and has his own inherent criterion $B = \|b_{ij}\|$, $i \in M$, $j \in N$, on the set $A$ of alternatives chosen by the dmp. If the dmp chooses alternative $e_i$ and Nature is in the state $\theta_j$, then E receives the pay-off (result) $b_{ij}$. Moreover, E has certain reliable information about the actual state of Nature $\theta_j$. It will be assumed throughout that the dmp knows exactly the matrix $B$. He wants the expert to supply information which will enable him (the dmp) to make a good choice of alternative from the set $A$. On the other hand, if E tells dmp nothing, then he cannot guarantee himself more than

$$\min_{i \in M} b_{ij} \quad \forall j \in N.$$

It will also be assumed that E sticks to the mgr principle in the following sense: if, in some situation, E's pay-off depends only on his actions (more precisely, on his communications), he maximizes his pay-off; if, whatever E's action, only some set of his pay-offs is determined, to which his actual pay-off belongs, then he chooses the action which gives him the maximum guaranteed pay-off.
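The quantities of the basic model can be illustrated numerically. The following is a minimal sketch, not part of the original paper: the matrices `A` and `B` are hypothetical data (3 alternatives, 2 states of Nature), and the function names are chosen here for illustration only. It computes the ordinary max-min of the dmp, the min-max of his matrix, and the amount E can count on in each state if he tells the dmp nothing.

```python
from typing import List

Matrix = List[List[int]]


def lower_value(a: Matrix) -> int:
    """Ordinary max-min of the dmp: the best row minimum (no expert study)."""
    return max(min(row) for row in a)


def upper_value(a: Matrix) -> int:
    """Min-max of the dmp's matrix: the smallest column maximum."""
    return min(max(row[j] for row in a) for j in range(len(a[0])))


def silent_expert_guarantee(b: Matrix) -> List[int]:
    """For each state j, min over i of b[i][j]: what E can count on if he says nothing."""
    return [min(row[j] for row in b) for j in range(len(b[0]))]


# Hypothetical data (an assumption of this sketch, not taken from the paper):
# rows are alternatives, columns are states of Nature.
A = [[5, 0], [0, 5], [1, 2]]  # dmp's pay-offs a_ij
B = [[2, 1], [3, 3], [1, 2]]  # expert's pay-offs b_ij

print(lower_value(A), upper_value(A), silent_expert_guarantee(B))
# prints: 1 5 [1, 1]
```

In this hypothetical instance the dmp can secure only 1 on his own, while the min-max of his matrix is 5, so there is room for an expert study to improve his guaranteed result.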

3. DMP strategies in the form i(j)

Let us make more detailed assumptions about the players' information patterns. Assume that E always knows exactly the actual state of Nature $\theta_j$ and that this fact is known to dmp. Then dmp, having the right to the first move, can propose to E some function $i(j)$, while asserting (and holding to this in the future) that, if E says that Nature is in the state $\theta_j$, $j \in N$, then dmp will choose $e_{i(j)}$.

Case A. If there are no alternatives among the $e_i$ ($\forall i \in M$, $\forall j \in N$) which are equivalent for E, then

$$b_{ij} \neq b_{kj}, \quad i \neq k, \quad \forall j \in N. \qquad (1)$$

Assume that dmp has chosen and told E some function $i(j)$, associating with every $j \in N$ a unique $i \in M$. This function then maps $N$ onto a subset $\bar M \subseteq M$. Suppose that Nature is in the state $\theta_{j_1}$. Then E, wanting to maximize his pay-off, sends dmp a message $j_2$ such that, in accordance with the strategy $i(j)$, dmp is required to make a choice $i(j_2)$ such that

$$b_{i(j_2),\,j_1} = \max_{k \in \bar M} b_{k j_1}.$$

It is thereby discovered that, after fixing the set $\bar M \subseteq M$, there is no need to consider all the possible functions $i(j)$ mapping $N$ onto $\bar M$, and it is sufficient to consider the function $i_{\bar M}(j)$ for which

$$b_{i_{\bar M}(j),\,j} = \max_{k \in \bar M} b_{kj}.$$

Since the matrix $B$ satisfies condition (1), a unique function $i_{\bar M}(j)$ exists for any $\bar M \subseteq M$.

For every $\bar M \subseteq M$, dmp correspondingly receives for any $j \in N$ a pay-off $a_{i_{\bar M}(j),\,j}$, while his guaranteed result is equal to

$$\min_{j \in N} a_{i_{\bar M}(j),\,j}.$$

The mgr for dmp is then given by

$$L_1 = \max_{\bar M \subseteq M} \min_{j \in N} a_{i_{\bar M}(j),\,j}.$$

Denote

$$L_B = \min_{j \in N} \max_{i \in M} a_{ij}.$$

We have the inequalities

$$L_H \le L_1 \le L_B.$$

The left-hand inequality holds because the set of all possible strategies $i_{\bar M}(j)$ contains the trivial strategies $i(j)$ which are independent of $j$. The right-hand inequality holds because dmp's mgr cannot be greater than the minimum of his maximal results for all $j \in N$. From this we have the following. If the equation $L_H = L_B$ holds (saddle points are present) for the matrix $A$, then it is meaningless for the dmp to perform an expert study in the form considered here, since the alternative $e_{i^*}$ that gives him the max-min also gives him the mgr.

Note. The fact that dmp proposes to E the strategy $i_{\bar M}(j)$ does not mean that, whatever the actual state of Nature $\theta_{j_1}$, E will only send a true message $j_1$. Since, by means of this message, E can guarantee for himself a maximum pay-off in the set $\bar M$, the message $j_1$ is possible. But the case is not excluded in which E can obtain the highest pay-off in the set $\bar M$ with the aid of a false message $j_2$. However, since the matrix $B$ satisfies the condition (1), this is only possible for $i_{\bar M}(j_1) = i_{\bar M}(j_2)$, i.e., the false message does not decrease dmp's pay-off. In short, by his choice of the strategy $i_{\bar M}(j)$ providing $L_1$ among all the possible such strategies, dmp does not claim that false messages are forbidden.

Consider an algorithm for finding the set $\bar M \subseteq M$ which ensures the result $L_1$ for dmp. Put

$M_0 = M$, $N_0 = N$.

Step 1. Consider the set $M_0$ and the function $i_{M_0}(j)$. Put

$$\min_{j \in N_0} a_{i_{M_0}(j),\,j} = d_0.$$

We discard from $M_0$ all the numbers $i$ for which dmp realizes $d_0$ for some $j \in N_0$, and we obtain $M_1$. For every number (element) of the set $M_0 \setminus M_1$ (i.e., for the discarded elements), we have

$$\exists\, j \in N_0: \quad \max_{k \in M_0} b_{kj} = b_{x_0 j} \quad \text{and} \quad x_0 \in M_0 \setminus M_1,$$

i.e., $a_{x_0 j} = d_0$. This means that, for every $\bar M \subseteq M_0$ for which $\bar M \cap (M_0 \setminus M_1) \neq \emptyset$, dmp does not obtain a guaranteed result greater than $d_0$.

. . . . . . . . . .

Step (k + 1). Consider the set $M_k$ and the function $i_{M_k}(j)$. Put

$$\min_{j \in N_0} a_{i_{M_k}(j),\,j} = d_k.$$

We exclude from $M_k$ all the numbers for which $d_k$ is realized for some $j \in N_0$, and we obtain $M_{k+1}$. For every number (element) of the set $M_k \setminus M_{k+1}$ we have the assertion

$$\exists\, j \in N_0: \quad \max_{c \in M_k} b_{cj} = b_{x_k j} \quad \text{and} \quad x_k \in M_k \setminus M_{k+1},$$

i.e., $a_{x_k j} = d_k$. This means that, for every $\bar M \subseteq M_k$ for which $\bar M \cap (M_k \setminus M_{k+1}) \neq \emptyset$, dmp does not obtain a guaranteed result greater than $d_k$.

Clearly, by using this algorithm we obtain a finite sequence of sets

$$M_0 \supset M_1 \supset \dots \supset M_p \supset \dots \supset M_q$$

and a corresponding sequence of numbers $d_0, d_1, \dots, d_p, \dots, d_q$. Here, $q \le m_0$.

Theorem 1

We have

$$\max_{\bar M \subseteq M} \min_{j \in N} a_{i_{\bar M}(j),\,j} = \max_p \{d_p\} = d_{\max}.$$

To the smallest $l$ for which $d_l = d_{\max}$ there corresponds the greatest set $M_l$ for which the max-min introduced above is realized.

Note. "The greatest set $M_l$" contains all the remaining sets $\bar M$ in which this max-min is reached.

To prove the first part of the theorem, we have to show that

$$\min_{j \in N} a_{i_{\bar M}(j),\,j} \le d_p \quad \text{for any } \bar M \subseteq M_p \text{ with } \bar M \cap (M_p \setminus M_{p+1}) \neq \emptyset.$$

The assertion then follows from the algorithm. A different method of argument may be used.

In fact, $\bar M$ contains some elements of the set $M_p \setminus M_{p+1}$, and hence

$$\exists\, j \in N: \quad \max_{k \in \bar M} b_{kj} = b_{cj} \quad \text{and} \quad c \in M_p \setminus M_{p+1}.$$

For these $j$, dmp obtains $d_p$. If, for the remaining $j$, dmp obtains more, then his guaranteed result remains $d_p$, while if dmp obtains less, his guaranteed result is reduced. The second part of the theorem follows immediately from the first.

Case B. If there are alternatives among the $e_i$ ($\forall i \in M$, $\forall j \in N$) which are equivalent for E, then $B$ is any real $m_0 \times n_0$ matrix. Given any $\bar M \subseteq M$, E obtains the maximum in the set $M^j(\bar M) \subseteq \bar M$ for every $j \in N$. Then, for fixed $\bar M$, dmp obtains for every $j \in N$ the guaranteed result

$$\min_{c \in M^j(\bar M)} a_{cj},$$

and for all $j \in N$, the guaranteed result

$$\min_{j \in N} \min_{c \in M^j(\bar M)} a_{cj}.$$

Hence the mgr for dmp is given by

$$L_1 = \max_{\bar M \subseteq M} \min_{j \in N} \min_{c \in M^j(\bar M)} a_{cj}.$$

As earlier, we have the double inequality

$$L_H \le L_1 \le L_B.$$

It may be noted that the algorithm used in case A may also be used in case B, but with certain modifications. Put $M_0 = M$. Then we have:

Step (k + 1). Consider the set $M_k$ and the corresponding subsets $M^j(M_k) \subseteq M_k$ for every $j \in N$. Put

$$\min_{j \in N} \min_{c \in M^j(M_k)} a_{cj} = d_k.$$

Since dmp can obtain, for every $j$, the result $\min_{c \in M^j(M_k)} a_{cj}$, we exclude from $M_k$ all the numbers for which $d_k$ is realized, and we obtain $M_{k+1}$. As earlier, for each $\bar M \subseteq M_k$ for which $\bar M \cap (M_k \setminus M_{k+1}) \neq \emptyset$, dmp cannot obtain a guaranteed result greater than $d_k$.

In the same way as in case A, we obtain finite sequences of sets

$$M_0 \supset M_1 \supset \dots \supset M_p \supset \dots \supset M_q$$

and corresponding numbers $d_0, d_1, \dots, d_q$, where $q \le m_0$. We have:

Theorem 2

$$\max_{\bar M \subseteq M} \min_{j \in N} \min_{c \in M^j(\bar M)} a_{cj} = \max_p \{d_p\} = d_{\max}.$$

The smallest $l$ for which $d_l = d_{\max}$ corresponds to the greatest $M_l$ for which the max-min introduced above is realized.

The proof of Theorem 2 is similar to the proof of Theorem 1 in case A. It may be noted that the algorithms in cases A and B represent a simple realization of the branch and bound method.
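The elimination procedure of Section 3 can be sketched in a few lines. The code below is an illustrative sketch rather than the author's implementation: the matrices `A` and `B` are hypothetical, the function names are assumptions, and ties in E's matrix are handled as in case B (the dmp counts on the worst of E's equally good alternatives), which reduces to case A when condition (1) holds. A brute-force search over all non-empty subsets is included only to compare against the elimination sequence.

```python
from itertools import combinations


def guaranteed_result(a, b, subset):
    """dmp's guaranteed result when E picks, in every state j, a b-maximal row of `subset`."""
    worst = None
    for j in range(len(a[0])):
        best_b = max(b[i][j] for i in subset)
        favourites = [i for i in subset if b[i][j] == best_b]
        value = min(a[i][j] for i in favourites)  # worst case among E's equally good rows
        worst = value if worst is None else min(worst, value)
    return worst


def brute_force_l1(a, b):
    """Maximum of the guaranteed result over all non-empty subsets of alternatives."""
    rows = range(len(a))
    return max(
        guaranteed_result(a, b, s)
        for r in range(1, len(a) + 1)
        for s in combinations(rows, r)
    )


def elimination(a, b):
    """Return the list of pairs (d_k, M_k) produced by the successive discarding steps."""
    current = set(range(len(a)))
    history = []
    while current:
        d_k = guaranteed_result(a, b, current)
        history.append((d_k, frozenset(current)))
        realised = set()
        for j in range(len(a[0])):
            best_b = max(b[i][j] for i in current)
            for i in current:
                if b[i][j] == best_b and a[i][j] == d_k:
                    realised.add(i)  # row i realises d_k for state j
        current -= realised
    return history


# Hypothetical data: 3 alternatives (rows), 2 states of Nature (columns).
A = [[5, 0], [0, 5], [1, 2]]
B = [[2, 1], [3, 3], [1, 2]]

history = elimination(A, B)
d_max = max(d for d, _ in history)
best_set = next(sorted(m) for d, m in history if d == d_max)
print([d for d, _ in history], d_max, best_set, brute_force_l1(A, B))
# prints: [0, 2, 0] 2 [0, 2] 2
```

On this instance the sequence of numbers is 0, 2, 0, so the largest guaranteed result is 2 and it is reached on the second set of the sequence, in agreement with the exhaustive search.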

4. Strategy of dmp in the form M(j)

All the main assumptions made in Section 2 are retained here. We assume, as in Section 3, that E always knows exactly the actual state of Nature $\theta_j$, and that dmp knows about this. But this time, dmp offers E a function $M(j)$, while claiming (and holding to this in the future) that, if E says that Nature is in the state $\theta_j$, $j \in N$, then dmp will choose $e_i$, $i \in M(j) \subseteq M$. An attempt by E to obtain the mgr can manifest itself here in full measure.

Notice that the notation $M(j)$ will be used in two different senses. First, $M(j)$, with arbitrary $j \in N$, denotes the subset of the set $M$ whose elements give the numbers of the rows chosen by dmp when he obtains the message $j$ (or $\theta_j$). Second, $M(j)$ denotes the actual function which associates with every $j \in N$ some subset of the set $M$. The set of all such functions is denoted by $\tilde M$, and in this sense, $M(j) \in \tilde M$. Since, for any $j$, dmp can choose one among $2^{m_0}$ sets, we have $|\tilde M| = 2^{m_0 n_0}$.

Apart from the strategy $M(j) \in \tilde M$, for any $M(j)$ dmp finds the function $i(j) = i(M(j))$, which determines his actual choice. The dmp does not tell E this function, but sticks to it.

The function $i(j)$ satisfies the natural constraint

$$i(j) = i(M(j)) \in M(j) \quad \forall j \in N. \qquad (3)$$

On the other hand, given any fixed $M(j)$, there are different $i(j)$ satisfying (3), and the set of all such functions is denoted by $I_M$; hence

$$i(j) = i(M(j)) \in I_M \quad \forall M(j) \in \tilde M.$$

Case A. If no alternatives among the $e_i$ ($\forall i \in M$, $\forall j \in N$) are equivalent for E, then, as earlier in this case, the matrix $B$ satisfies the condition $b_{ij} \neq b_{kj}$, $i \neq k$, $\forall j \in N$.

We introduce some notation. For a fixed strategy $M(j)$, we denote by $M^S$ the set

$$M^S = \{M(1), \dots, M(n_0)\}.$$

The set $M^S$ depends on the strategy $M(j)$, though for typographical simplicity no indication is made of this. Also for fixed $M(j)$, we denote by $M^*_k$, for every $k \in N$, the set of the $\bar M \in M^S$ for which

$$\max_{\bar M \in M^S} \min_{i \in \bar M} b_{ik}$$

is realized.

In other words,

$$M^*_k = \Bigl\{ \bar M \in M^S : \min_{i \in \bar M} b_{ik} = \max_{M' \in M^S} \min_{i \in M'} b_{ik} \Bigr\}.$$

With the $M(j)$ thus fixed, we introduce for every $k \in N$ the set

$$J_k = \{\, j \in N : M(j) \in M^*_k \,\}.$$

It is called the set of E's possible messages when the state of Nature is $\theta_k$, while dmp offers the strategy $M(j)$. Obviously, corresponding to different strategies $M(j)$ we have different systems of sets $M^*_k$ and $J_k$ with arbitrary $k \in N$, though, again for typographical simplicity, we do not indicate this. Moreover, with a fixed strategy $M(j)$, we introduce the following sets:

the set $\mu_j$ of all false messages $k \neq j$ which E may send when the state of Nature is $\theta_j$; it depends on $M(j)$, but we do not indicate this;

the set $\nu_j$ of the states of Nature $\theta_k \neq \theta_j$ in which E may send the message $j$; it depends on $M(j)$, but we do not indicate this.

Proposition 1. In the conditions described above, if dmp performs an expert study, telling E the strategy $M(j)$ and wanting to obtain from him the message $j$, then his (the dmp's) maximum guaranteed result is given by

$$\max_{M(j) \in \tilde M} \; \max_{i(j) \in I_M} \; \min_{k \in N} \; \min_{j \in J_k} a_{i(j),\,k} = L_2.$$

Proof. We choose a strategy $M(j) \in \tilde M$ and a function $i(j) = i(M(j)) \in I_M$, and we assume that the state of Nature is $\theta_k$, $k \in N$. Since E may send any message $j \in J_k$ and dmp chooses the corresponding $i(j)$, dmp's guaranteed result is

$$\min_{j \in J_k} a_{i(j),\,k}.$$

For all $k \in N$, dmp is guaranteed the result

$$\min_{k \in N} \min_{j \in J_k} a_{i(j),\,k}.$$

If $M(j)$ is chosen, dmp can choose in the best case $i(j) = i(M(j)) \in I_M$ in such a way that

$$\max_{i(j) \in I_M} \min_{k \in N} \min_{j \in J_k} a_{i(j),\,k}$$

is realized. Then, obviously, dmp obtains the mgr if he chooses the strategy $M(j)$ for which

$$\max_{M(j) \in \tilde M} \; \max_{i(j) \in I_M} \; \min_{k \in N} \; \min_{j \in J_k} a_{i(j),\,k}$$

is reached.
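The expression of Proposition 1 can be evaluated directly on a small instance. The sketch below is illustrative only: the matrices `A` and `B` are hypothetical, the function names are assumptions, and the search is a plain enumeration of all strategies built from non-empty sets M(j) together with all admissible choice functions i(j), which is feasible only for very small matrices.

```python
from itertools import combinations, product


def possible_messages(b, strategy, k):
    """J_k: messages j whose set M(j) maximises E's guaranteed pay-off in state k."""
    guarantees = [min(b[i][k] for i in m_j) for m_j in strategy]
    top = max(guarantees)
    return [j for j, g in enumerate(guarantees) if g == top]


def guaranteed_result(a, b, strategy, choice):
    """min over states k and over possible messages j in J_k of a[i(j)][k]."""
    return min(
        a[choice[j]][k]
        for k in range(len(a[0]))
        for j in possible_messages(b, strategy, k)
    )


def nonempty_subsets(m):
    return [s for r in range(1, m + 1) for s in combinations(range(m), r)]


def brute_force_l2(a, b):
    """Enumerate every strategy M(j) and every choice function i(j) with i(j) in M(j)."""
    m, n = len(a), len(a[0])
    best = None
    for strategy in product(nonempty_subsets(m), repeat=n):
        for choice in product(*strategy):
            value = guaranteed_result(a, b, strategy, choice)
            if best is None or value > best[0]:
                best = (value, strategy, choice)
    return best


# Hypothetical data: 3 alternatives (rows), 2 states of Nature (columns).
A = [[5, 0], [0, 5], [1, 2]]
B = [[2, 1], [3, 3], [1, 2]]

print(brute_force_l2(A, B))
# prints: (5, ((0,), (1, 2)), (0, 1))
```

In this hypothetical instance the dmp secures 5 by answering the second message with a set that also contains an alternative E dislikes in the first state, although he never actually chooses that alternative. This is the role played by the sets M(j) as opposed to single alternatives.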

Proposition 2. We have L,

Every strategy $M(j)$ defines a unique $M^S$. Every $M^S$ defines uniquely a set $M^*_k$, i.e., a family $\{M^*_k\}$. On the other hand, there are different strategies $M(j)$ defining the same set $M^S$. (The number of them is not greater than $n_0!$.) In the same way, they all define the same set $\{M^*_k\}$ for every $k \in N$. In other words, corresponding to every family $\{M^*_k\}$ we have a set of strategies $M(j)$, only some of which have the inherent property:

Property $\alpha$. The strategy $M(j)$ satisfies the condition

$$M(k) \in M^*_k \quad \forall k \in N.$$

The meaning of this property is as follows. If the dmp offers E the strategy $M(j)$ having the property $\alpha$, then every time Nature is in the state $\theta_k$, $k \in N$, E ensures for himself the maximum guaranteed result by means of a true message $k$, though he can also obtain this result for himself by means of false messages $j \in \mu_k$. The set of all strategies $M(j)$ having the property $\alpha$ is denoted by $\tilde M_\alpha$. On analyzing the expert study, the dmp can consider only the strategies of the set $\tilde M_\alpha$, for the following reasons. Let dmp offer E some strategy $M(j) \notin \tilde M_\alpha$ and, say for $j = j_1$, let the set $M(j_1)$ not provide the expert with the mgr with respect to the entire set $M^S$. When Nature is in the state $\theta_{j_1}$, E does not give the message $j_1$, but gives some $j \in \mu_{j_1}$, and thereby forces dmp to choose $i$ from among the elements of the relevant $M(j) \in M^*_{j_1}$. Throughout what follows, strategies $M(j) \in \tilde M_\alpha$ will be considered.

Apparatus for analyzing the strategies M(j). Given any strategy

$M(j) \in \tilde M_\alpha$, we construct a matrix $B' = \|b'_{st}\|$, in which row $s$ corresponds to the set $M(s)$ for each $s \in N$; column $t$ corresponds to the state of Nature $\theta_t$ for every $t \in N$; and the element $b'_{st}$ is the guaranteed result of E if he sends the message $s$ when the state of Nature is $\theta_t$:

$$b'_{st} = \min_{i \in M(s)} b_{it}.$$

Since $M(j) \in \tilde M_\alpha$, we have

$$b'_{ss} = \max_{r \in N} b'_{rs}.$$

The number $b'_{ss}$ is called the cost of the message $s$.

Naturally, among all the elements of a given column of the matrix $B'$, the maximum is not necessarily reached only on the principal diagonal. The set $\mu_j$ consists of all the messages $c \in N$, $c \neq j$, for which $b'_{cj} = b'_{jj}$. The set $\nu_j$ consists of all the states $\theta_c$, $c \in N$, $c \neq j$, for which $b'_{jc} = b'_{cc}$.

In addition, $J_j = \mu_j \cup \{j\}$. Given any fixed $M(j)$, the function $i(M(j))$, guaranteeing the highest minimum for dmp and called the "locally optimal function", is found as follows. If $\nu_j = \emptyset$ for $j = j_1$, then dmp chooses $i(M(j_1)) = i(j_1)$ in such a way that

$$a_{i(j_1),\,j_1} = \max_{i \in M(j_1)} a_{i j_1}.$$

If, for $j = j_2$, the set $\nu_{j_2} = \tilde N \subseteq N$, $\tilde N \neq \emptyset$, then dmp solves the following problem. For the matrix $A_{j_2} = \|a_{ij}\|$, which is a submatrix of $A$ with $i \in M(j_2)$, $j \in \tilde N \cup \{j_2\}$, he has to solve the max-min problem, i.e., he has to find the minimum in each row and choose the row in which the minimum is a maximum. In this way the losses due to false messages arising when choosing $M(j)$ are minimized. After this, a matrix $A' = \|a'_{st}\|$ can be constructed, in which $s, t \in N$, and in its row number $s$ the row number $i(M(s))$ of the matrix $A$ is written. The matrix $A'$ shows the dmp's gain for each $j \in N$ when $M(j)$ is fixed and, in these conditions, for the best (locally optimal) $i(M(j)) \in I_M$.

Definition. The strategy $M(j)$ is said to be perfect if the following condition holds: $M(j)$, for all $j \in N$, contains all the $i \in M$ for which

$$b_{ij} \ge \min_{c \in M(j)} b_{cj}.$$

If the strategy $M(j)$ does not satisfy this condition, it is said to be imperfect. Any strategy $M(j) \in \tilde M$ is either perfect or imperfect.

Theorem 3

For every imperfect strategy $M(j) \in \tilde M_\alpha$, there is one perfect $M'(j)$ such that the mgr for dmp with $M'(j)$ is not less than his mgr with $M(j)$.

In Theorem 3, the expression mgr means that dmp chooses, for each of the compared strategies $M(j) \in \tilde M_\alpha$, the locally optimal $i(M(j))$.

Proof. Assume e.g., that the strategy $M(j)$ is such that, for $j = j_1$, the set $M(j_1)$ does not satisfy the condition of the above definition. Consider a new strategy $M_H(j)$: $M_H(j) = M(j)$ for $j \neq j_1$; $M_H(j_1)$ contains all the $i \in M$ for which

$$b_{i j_1} \ge \min_{c \in M(j_1)} b_{c j_1}.$$

We construct the matrices $B'$ and $A'$ for $M(j)$, and $B'_H$, $A'_H$ for $M_H(j)$, and obtain sets $\mu_j$, $\nu_j$ for $M(j)$, and $\mu_j^H$, $\nu_j^H$ for $M_H(j)$. Since the elements of the principal diagonal of the matrix $B'$ are equal to the corresponding elements of the principal diagonal of the matrix $B'_H$, we have $\nu_j^H = \nu_j$ for $j \neq j_1$ and $\mu_{j_1}^H = \mu_{j_1}$. But $\nu_{j_1}^H \subseteq \nu_{j_1}$, since the guaranteed result obtainable by E with the aid of $M_H(j_1)$, when Nature is in a state $\theta_j \neq \theta_{j_1}$, is not greater than his guaranteed result obtainable in the same state with the aid of $M(j_1)$. Hence the set of false messages with the strategy $M_H(j)$ lies in the set of false messages with the strategy $M(j)$, and the mgr for dmp with $M_H(j)$ is not less than his mgr with $M(j)$.

It can thus be seen that, for any imperfect $M(j) \in \tilde M_\alpha$, one perfect $M'(j)$ can be found for which the theorem holds.

Denote by $\tilde M_p$ the set of all perfect strategies $M(j) \in \tilde M$. Clearly, $\tilde M_p$ contains some part of the strategies $M(j)$ which realize the mgr for dmp with respect to the whole of $\tilde M$. Here $|\tilde M_p| = m_0^{n_0}$. Obviously, it is sufficient for the dmp to consider strategies $M(j) \in \tilde M_{\alpha p} = \tilde M_\alpha \cap \tilde M_p$. To obtain a fuller acquaintance with the strategies of the set $\tilde M_{\alpha p}$, some further properties of them might be described; but these properties have no direct connection with the method to be given below for finding the strategies providing the mgr for the dmp.
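The definition of a perfect strategy and the construction used in the proof of Theorem 3 can be checked mechanically. The following sketch uses a hypothetical expert matrix `B` that satisfies condition (1); the names `perfect_hull` and `is_perfect` are introduced here for illustration and do not appear in the paper. The hull enlarges every set M(j) to all alternatives whose pay-off for E in column j is at least the smallest pay-off already present in M(j).

```python
from itertools import combinations, product


def perfect_hull(b, strategy):
    """Enlarge each M(j) to all rows i with b[i][j] >= min over M(j) of b[c][j]."""
    hull = []
    for j, m_j in enumerate(strategy):
        cost = min(b[i][j] for i in m_j)
        hull.append(tuple(i for i in range(len(b)) if b[i][j] >= cost))
    return hull


def is_perfect(b, strategy):
    """A strategy is perfect exactly when it coincides with its own hull."""
    return [tuple(sorted(m_j)) for m_j in strategy] == perfect_hull(b, strategy)


# Hypothetical expert matrix: 3 alternatives (rows), 2 states (columns),
# with pairwise distinct entries in every column (condition (1)).
B = [[2, 1], [3, 3], [1, 2]]

subsets = [s for r in range(1, len(B) + 1) for s in combinations(range(len(B)), r)]
perfect = [s for s in product(subsets, repeat=len(B[0])) if is_perfect(B, s)]

print(len(perfect))                      # prints: 9
print(is_perfect(B, [(0,), (1, 2)]))     # prints: False
print(perfect_hull(B, [(0,), (1, 2)]))   # prints: [(0, 1), (1, 2)]
```

With 3 alternatives and 2 states the enumeration finds 9 perfect strategies among those built from non-empty sets, which matches the count $m_0^{n_0}$ for this instance.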

Definition of the operation of union of strategies. Given two strategies $M_1(j)$ and $M_2(j)$, we shall say that they are united (combined) if we associate them with a strategy $M_3(j)$ for which

$$M_3(j) = M_1(j) \cup M_2(j) \quad \forall j \in N.$$

We shall call $M_3(j)$ the union of strategies $M_1(j)$ and $M_2(j)$. Bearing in mind the inclusion

$$M_3(j) \supseteq M_1(j) \quad \forall j \in N,$$

we shall say that the strategy $M_3(j)$ contains the strategy $M_1(j)$.

Theorem 4

Given the strategies $M_1(j) \in \tilde M_{\alpha p}$ and $M_2(j) \in \tilde M_{\alpha p}$, each of which guarantees dmp the amount $d$. Then their union $M_3(j)$ will also guarantee him $d$.

Proof. Let us find a partition of the set $N$ into subsets $N_1$, $N_2$, $N_3$; $N_k \cap N_l = \emptyset$, $k \neq l$; $\bigcup_k N_k = N$; $k, l = 1, 2, 3$, such that

$$M_1(j) = M_2(j) \quad \forall j \in N_1,$$
$$M_1(j) \supset M_2(j) \quad \forall j \in N_2,$$
$$M_1(j) \subset M_2(j) \quad \forall j \in N_3.$$

Consider the strategy $M_3(j)$, representing the union of strategies $M_1(j)$ and $M_2(j)$:

$$M_3(j) = M_1(j) \quad \forall j \in N_1 \cup N_2,$$
$$M_3(j) = M_2(j) \quad \forall j \in N_3.$$

In the strategy $M_3(j)$, no $j \in N_3$ can be used as a new (relative to $M_2(j)$) false message when Nature is in the state $\theta_{j_1}$, $j_1 \in N_3$, since $M_3(j) = M_2(j)$ for all $j \in N_3$. Similar statements can be made for the sets $N_1$ and $N_2$.

In the strategy $M_3(j)$, no $j \in N_3$ can be used as a new (relative to $M_1(j)$) false message when Nature is in the state $\theta_{j_1}$, $j_1 \in N_1 \cup N_2$, since, in $M_3(j)$, all the $j \in N_3$ have a cost less than or equal to their cost in the case of $M_1(j)$. The possible false messages $j \in N_1 \cup N_2$, when Nature is in a state $\theta_{j_1}$, $j_1 \in N_1 \cup N_2$, are not harmful, since, in this set, $M_3(j) = M_1(j)$.

There is no $j \in N_1$ in the strategy $M_3(j)$ that can be used as a new (relative to $M_2(j)$) false message when Nature is in the state $\theta_{j_1}$, $j_1 \in N_3$, since $M_3(j) = M_2(j)$ for $N_1 \cup N_3$. It remains to show that there is no $j \in N_2$ in $M_3(j)$ that can be used as a new (relative to $M_2(j)$) false message when Nature is in a state $\theta_{j_1}$, $j_1 \in N_3$. In fact, if $M_2(j)$ admitted the false messages $j \in N_2$ when Nature is in a state $\theta_{j_1}$, $j_1 \in N_3$, then, since the cost of messages $j \in N_2$ in $M_3(j)$ is less than or equal to their cost in $M_2(j)$, there can only be certain of these false messages $j \in N_2$ in $M_3(j)$ when Nature is in the state $\theta_{j_1}$, $j_1 \in N_3$, by comparison with $M_2(j)$, and there can be no new messages.

It follows from all that has been said that, for every $j$ in $M_3(j)$, we can choose $i(M_3(j))$ which is no worse than in $M_1(j)$ and $M_2(j)$; but this means that $M_3(j)$ also guarantees dmp the amount $d$.
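The union of strategies can be tried out on a small instance. The sketch below is only a numerical illustration of the statement of Theorem 4, not a proof: the matrices `A` and `B` are hypothetical, the function names are assumptions, and the dmp's best guaranteed result for a fixed strategy is found by enumerating all admissible choice functions. Both sample strategies are perfect and have the property α for this `B`; "guarantees d" is read here as "guarantees at least d".

```python
from itertools import product


def possible_messages(b, strategy, k):
    """Messages whose set maximises E's guaranteed pay-off in state k."""
    guarantees = [min(b[i][k] for i in m_j) for m_j in strategy]
    top = max(guarantees)
    return [j for j, g in enumerate(guarantees) if g == top]


def mgr_for(a, b, strategy):
    """Best guaranteed result of the dmp for a fixed strategy M(j)."""
    n = len(a[0])
    senders = [possible_messages(b, strategy, k) for k in range(n)]
    return max(
        min(a[choice[j]][k] for k in range(n) for j in senders[k])
        for choice in product(*strategy)
    )


def union(first, second):
    """Union of two strategies, taken message by message."""
    return [tuple(sorted(set(x) | set(y))) for x, y in zip(first, second)]


# Hypothetical data: 3 alternatives (rows), 2 states of Nature (columns).
A = [[5, 0], [0, 5], [1, 2]]
B = [[2, 1], [3, 3], [1, 2]]

M1 = [(0, 1, 2), (1, 2)]
M2 = [(0, 1), (0, 1, 2)]
M3 = union(M1, M2)

print(M3, mgr_for(A, B, M1), mgr_for(A, B, M2), mgr_for(A, B, M3))
# prints: [(0, 1, 2), (0, 1, 2)] 1 0 1
```

Here both strategies guarantee at least 0, and their union guarantees 1, so the union is no worse than the common amount.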

Corollary. There is a unique perfect strategy $M^*(j)$ which ensures the mgr for dmp and contains all other strategies that ensure him the mgr.

Partition of the set $\tilde M_{\alpha p}$ into classes, and an algorithm for finding the mgr for dmp. We can specify any strategy $M^t(j) \in \tilde M_{\alpha p}$ with the aid of a finite sequence of $n_0$ numbers:

$$[c_{t1}, \dots, c_{tj}, \dots, c_{tn_0}],$$

where $c_{tj}$ is the cost of the message $j$. We shall consider the strategy $M^t(j)$ and, after finding the locally optimal $i(M^t(j))$, find the message $j_0$ such that dmp obtains the minimal result. There are two possibilities.

Possibility 1. The message $j_0$ is used as a true message.

This means that dmp obtains his minimum when Nature is in the state $\theta_{j_0}$ and E sends the message $j_0$. Here, the following must be taken into account.

Case 1. $\nu_{j_0} = \emptyset$, $\mu_{j_0} = \emptyset$.

Since $\nu_{j_0} = \emptyset$, dmp can choose from $M^t(j_0)$ the number $i$ that is best for himself. All the strategies $M^u(j) \in \tilde M_{\alpha p}$ such that $c_{uj} \le c_{tj}$ for $j \neq j_0$, $c_{uj} = c_{tj}$ for $j = j_0$, do not guarantee dmp more than is guaranteed by $M^t(j)$. These strategies form a strategy class, for which $M^t(j)$ is the principal strategy. It is denoted by the sequence

$$[c_{t1}, \dots, c_{t,j_0-1},\, c'_{tj_0},\, c_{t,j_0+1}, \dots, c_{tn_0}].$$

In this sequence, only the element $c_{tj_0}$ is marked by a prime. The subsequent classes and the subsequent strategy are obtained with the aid of the least possible decrease of the cost $c'_{tj_0}$.

Case 2. $\nu_{j_0} = \emptyset$, $\mu_{j_0} \neq \emptyset$.

Here, $b'_{jj_0} = b'_{j_0 j_0}$ for $j \in \mu_{j_0}$. Since $M^t(j) \in \tilde M_{\alpha p}$, we have $M^t(j) \subseteq M^t(j_0)$, $j \in \mu_{j_0}$. The minimum over all $j$ for dmp is reached with $j = j_0$, and since $\nu_{j_0} = \emptyset$, the dmp cannot obtain more, i.e., the minimum over all $j$ for dmp is reached for all $j \in \{j_0\} \cup \mu_{j_0}$. We mark $c_{tj_0}$ with one prime, while the $c_{tj}$ for all $j \in \mu_{j_0}$ receive two primes. Next, for all $j \in \mu_{j_0}$ we find the corresponding $\mu_j$. If $\mu_j = \emptyset$ for all $j \in \mu_{j_0}$, no other costs receive a prime. If there exist $j \in \mu_{j_0}$ for which $\mu_j \neq \emptyset$, they are combined into the set $Q'_{j_0}$:

$$Q'_{j_0} = \{\, j \in \mu_{j_0} \mid \mu_j \neq \emptyset \,\}.$$

The costs of all messages $j_1 \in \mu_j$, $j \in Q'_{j_0}$, receive three primes.

The process of marking with primes stops when, for all messages $j$ whose costs have $k$ primes, the corresponding $\mu_j = \emptyset$ or contain only primed messages.

The strategy $M^t(j)$ takes the form

$$[c_{t1},\, c'_{t2},\, \dots,\, c''_{tj},\, \dots,\, c^{(k)}_{tn_0}],$$

where, for $j \in G_t$, none of the $c_{tj}$ have primes; for $j \in G'_t$, all the $c_{tj}$ have one prime: $c'_{tj}$; and for $j \in G_t^{(k)}$, all the $c_{tj}$ have $k$ primes: $c_{tj}^{(k)}$. All the strategies $M^u(j) \in \tilde M_{\alpha p}$ for which $c_{uj} = c_{tj}$ for at least one $j \in G'_t \cup G''_t$, while $c_{uj} \le c_{tj}$ for the remaining $j$, do not guarantee the dmp more than is guaranteed by $M^t(j)$. They form the class of strategies for which $M^t(j)$ is the principal strategy. The subsequent classes are obtained with the aid of a simultaneous minimal decrease in the costs of all messages $j \in G'_t \cup G''_t \cup \dots \cup G_t^{(k)}$.

Case 3. $\nu_{j_0} \neq \emptyset$, $\mu_{j_0} = \emptyset$.

Here, the dmp has already solved a max-min problem in order to find the locally optimal $i(M^t(j))$, and he cannot always choose the number $i$ that is best for himself from $M^t(j_0)$. The cost of the message $j_0$ is marked by a single prime: $c'_{tj_0}$. All the strategies $M^u(j) \in \tilde M_{\alpha p}$ for which $c_{uj} = c_{tj}$ for $j = j_0$, and $c_{uj} \le c_{tj}$ for the remaining $j$, do not guarantee for dmp more than $M^t(j)$. They form a class for which $M^t(j)$ is the principal strategy. The subsequent classes are obtained by means of a minimal decrease in the cost $c'_{tj_0}$ of the message $j_0$.

Case 4. $\nu_{j_0} \neq \emptyset$, $\mu_{j_0} \neq \emptyset$.

Here, dmp cannot always choose for himself the best number in $M^t(j_0)$. The dmp obtains his minimum when Nature is in the state $\theta_{j_0}$ and E sends the message $j_0$; but in this state of Nature, E can also send messages $j \in \mu_{j_0}$, for which the minimum may or may not be reached for dmp. As in case 2, priming of the costs of the strategy $M^t(j)$ is performed. Let the minimum be reached for dmp not only with $j = j_0$, but also with all $j \in S_t \subseteq G''_t$. All the strategies $M^u(j) \in \tilde M_{\alpha p}$ for which $c_{uj} = c'_{tj}$ (or $c''_{tj}$) for at least one $j \in \{j_0\} \cup S_t$, and $c_{uj} \le c_{tj}$ for the remaining $j$, do not guarantee dmp more than is guaranteed by $M^t(j)$. They form a class for which $M^t(j)$ is the principal strategy. The subsequent classes are obtained by simultaneous minimal reduction of the costs of all messages $j \in G'_t \cup G''_t \cup \dots \cup G_t^{(k)}$.

Possibility 2. The message $j_0$ is used as a false message when the state of Nature is $\theta_{j_1}$.

Then, obviously, $\nu_{j_0} \neq \emptyset$ and $\mu_{j_1} \neq \emptyset$. There are two further cases.

Case 1. $\mu_{j_0} = \emptyset$, $\nu_{j_1} = \emptyset$.

Here, given the message $j_1$, dmp can choose for himself the best $i \in M^t(j_1)$; but moreover,

when Nature is in the state Oil, a false message j. is possible, and it is precisely then that the dmp obtains the least result. In the strategy M’(j) the cost cltjO has one prime. All the strategies M” (j) E%,~, for which cUI,=cf;,, and c,,)
Here, in the strategy il@(i>,the cost ctjo receives one prime. All the strategies W(i), for which u, , while c,,,
The subsequent classes are obtained by a minimal decrease in the cost $c'_{tj_0}$.

Case 4. $\mu_{j_0} \neq \emptyset$, $\nu_{j_1} \neq \emptyset$.

Here, the assertions of case 2 hold.

The algorithm can now be written:

1) we specify $M^1(j)$ as follows: with every $j$ is associated the $i$ for which E obtains a max-min;

2) for $M^t(j)$, the locally optimal function $i(M^t(j))$ is found;

3) for $M^t(j)$, knowing the locally optimal function $i(M^t(j))$, the message $j_0$ is found such that dmp obtains a minimum; in accordance with the rules of possibilities 1 and 2, we find the strategy class such that dmp is guaranteed no more than he is by $M^t(j)$, and such that $M^t(j)$ is the principal strategy, realizing the lowest upper bound of the dmp's guaranteed result in this strategy class; if possible, a new strategy $M^{t+1}(j)$ is obtained in accordance with the same rules, and we return to Para. 2); if it is not possible to obtain a new strategy, the algorithm terminates.

The greatest of the guaranteed results of the principal strategies found is simultaneously the dmp's mgr with respect to the entire set $\tilde M_{\alpha p}$.
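The apparatus of message costs, the matrix B' and the locally optimal function can be combined into a small exhaustive search. The sketch below is not the class-by-class algorithm described above: it simply enumerates every vector of costs, builds the corresponding perfect strategy, keeps those with the property α, and evaluates each with the locally optimal function. The matrices `A` and `B` and all function names are hypothetical, introduced for illustration only, and the enumeration is practical only for small matrices.

```python
from itertools import product


def strategy_from_costs(b, costs):
    """Perfect strategy with message costs `costs`: M(j) = rows i with b[i][j] >= costs[j]."""
    return [tuple(i for i in range(len(b)) if b[i][j] >= c) for j, c in enumerate(costs)]


def message_matrix(b, strategy):
    """Matrix B': entry [s][t] is E's guaranteed pay-off for message s in state t."""
    n = len(strategy)
    return [[min(b[i][t] for i in strategy[s]) for t in range(n)] for s in range(n)]


def has_property_alpha(bp):
    """Every diagonal element of B' is maximal in its column."""
    n = len(bp)
    return all(bp[s][s] == max(bp[r][s] for r in range(n)) for s in range(n))


def locally_optimal_result(a, bp, strategy):
    """dmp's guaranteed result when he answers every message with its max-min row."""
    n = len(strategy)
    worst = None
    for j in range(n):
        # States in which E may send message j: the true state j and the states of nu_j.
        states = [j] + [c for c in range(n) if c != j and bp[j][c] == bp[c][c]]
        value = max(min(a[i][k] for k in states) for i in strategy[j])
        worst = value if worst is None else min(worst, value)
    return worst


def search(a, b):
    """Best (value, costs, strategy) over all perfect strategies with property alpha."""
    n = len(a[0])
    columns = [sorted({b[i][j] for i in range(len(b))}, reverse=True) for j in range(n)]
    best = None
    for costs in product(*columns):
        strategy = strategy_from_costs(b, costs)
        bp = message_matrix(b, strategy)
        if not has_property_alpha(bp):
            continue
        value = locally_optimal_result(a, bp, strategy)
        if best is None or value > best[0]:
            best = (value, costs, strategy)
    return best


# Hypothetical data: 3 alternatives (rows), 2 states of Nature (columns).
A = [[5, 0], [0, 5], [1, 2]]
B = [[2, 1], [3, 3], [1, 2]]

print(search(A, B))
# prints: (5, (2, 2), [(0, 1), (1, 2)])
```

The cost vectors are visited in decreasing order, which mirrors the idea of starting from the highest message costs and lowering them; the class structure is what allows the algorithm in the text to skip most of these candidates.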

TABLE 1. Pay-off matrix of E: alternatives 1 to 6 (rows) against the states of Nature a to g (columns). [Entries not recoverable.]

TABLE 2. Pay-off matrix of dmp: alternatives 1 to 6 (rows) against the states of Nature a to g (columns). [Entries not recoverable.]

The number of principal strategies (and accordingly, of strategy classes) is not greater than $m_0 \times n_0$.

It remains to show that all the strategy classes obtained by means of the algorithm form a partition of the set $\tilde M_{\alpha p}$. All the principal strategies obtained, expressed by message costs, can be written as a table. It contains $n_0$ columns and not more than $m_0 \times n_0$ rows. Any column (for arbitrary $j$) is a monotonically decreasing number sequence as the row number increases (the principal strategy number increases), and consists of series of identical numbers. The last number of each such series is primed. Any such column with number $j$ contains all the $b_{ij}$, and given any $i$, there is just one row (with number $k$) of the table in which $b_{ij}$ is primed. To show that the strategy classes form a partition, we have to show that any strategy $M(j) \in \tilde M_{\alpha p}$ is uniquely classified. In fact, choose an arbitrary strategy $M^k(j) \in \tilde M_{\alpha p}$, expressed by means of a sequence $\{c_{kj}\}$ for every $j \in N$.

For every $j$, we find the number $r_j$ of the row in the table in which the number located in column $j$ is equal to $c_{kj}$ and is primed. The classified strategy is located in one of the classes whose numbers are less than or equal to $r_j$. The number $p = \min_j \{r_j\}$ is equal to the number of the row in the table in which is written the principal strategy of the class containing the classified strategy.

When obtaining successively the principal strategies by means of the algorithm, we can stop in the same way as before without completing the operation of the algorithm if, at some step, the sum guaranteed to dmp by the relevant strategy is equal to the min-max of the matrix $A$. Thereby the upper bound of dmp's mgr is reached, and obviously, there is no strategy guaranteeing more for dmp.

Case B. If some of the alternatives $e_i$ ($\forall i \in M$, $\forall j \in N$) are equivalent for E, then, as before, given any $j \in N$, it is possible for some of the numbers $b_{ij}$ to be equal to one another. As before, any strategy of $\tilde M$ may be perfect (see the definition of a perfect strategy in case A) or imperfect. It is easily shown, by the same method as in case A, that, given any imperfect strategy, there is one perfect strategy that guarantees dmp not less than the initial imperfect strategy. Consequently, dmp receives his mgr by means of a strategy which belongs to the set $\tilde M_p \subseteq \tilde M$ of perfect strategies. Again in the same way as before, it is possible for the dmp to consider only the strategies

that have the property $\alpha$, i.e., strategies of the set $\tilde M_{\alpha p} = \tilde M_\alpha \cap \tilde M_p$.

TABLE 3

j       a  b  c  d  e  f  g
i(j)    6  5  ?  1  5  1  1
a_ij    2  6  5  4  2  2  4

TABLE 4

Message (column 1)    M(j) (column 2)
a                     000111
b                     010010
c                     000101
d                     110001
e                     100011
f                     101100
g                     111000

[Columns 3 to 17 of Table 4 (the locally optimal i(M(j)), the matrix B' and the dmp's pay-offs) are not recoverable.]

The expression that gives the mgr $L_2$ for dmp in case A also holds in case B. To find one strategy $M(j)$ realizing this result, we can use the algorithm developed for case A; but at every step, when the cost of a message is reduced minimally, all the alternatives with the new minimal cost appear in the new strategy.

Example.

Let E and dmp have the matrices shown in Tables 1 and 2 respectively.

For convenience, the states of Nature are denoted by the letters a, b, . . . , g. The dmp chooses among 6 alternatives. It is at once clear that the ordinary max-min for dmp is equal to 1. But his mgr, in the set of all type $i(j)$ strategies, is equal to 2. The relevant strategy has the form shown in Table 3. It can easily be seen that, in order for dmp to obtain less than 2, E must depart from his max-min, proposed to him by dmp. Further, strategies of the type $M(j)$ bring in an mgr for dmp of up to 5. This is clear from Table 4. In column 1 of Table 4 are located E's messages about the state of Nature. In column 2 are the strategies $M(j)$. The table is read as follows: the symbols 000111 in the first row of column 2 mean that, if E's message is a, then dmp chooses the 4th, 5th or 6th alternative; the symbols 010010 in the second row mean that, when dmp is told b, he chooses the 2nd or 5th alternative, etc. In column 3 are located the locally optimal $i(M(j))$, giving the actual choice of dmp. The numbers written there refer to the alternatives actually chosen by dmp. In columns 4 to 10 is located the matrix $B'$, indicating the results guaranteed for E by this strategy when Nature is in a certain state (the appropriate column) and E sends a certain message (the corresponding row).

It is clear that every element of the principal diagonal (in heavier type) is maximal in its column. Moreover, only column 4 contains another maximal element, located in the 3rd row. This means that, when Nature is in state a, E can send the message c as well as a. But dmp chooses in both cases the alternative number 4 and gains 5. This is clear in the columns numbered 11 to 17, where dmp's pay-offs are written.
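The reading rule for column 2 of Table 4 can be stated as a one-line decoding. This is a small illustrative sketch; the helper name `decode` is an assumption, and only the two bit strings explained in the text are used.

```python
def decode(bits: str) -> list:
    """Return the 1-based numbers of the alternatives marked by '1' in the string."""
    return [position + 1 for position, symbol in enumerate(bits) if symbol == "1"]


print(decode("000111"))  # message a, prints: [4, 5, 6]
print(decode("010010"))  # message b, prints: [2, 5]
```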

5. Conclusion

It is easily seen that these constructions can be developed and generalized on the basis of the representations of actual expert studies. For instance, one possible problem is the description of a situation with several E's.

The idea of constructing and studying such games is due to Yu. B. Germeier. The author thanks I. A. Vatel', F. I. Ereshko, Ya. N. Dranev, and Ya. I. Rabinovich for reading the manuscript and making extremely useful comments.

Translated by D. E. Brown

REFERENCES

1. GERMEIER, YU. B., Introduction to operations research theory (Vvedenie v teoriyu issledovaniya operatsii), Nauka, Moscow, 1971.

2. GERMEIER, YU. B., On two-person games with a fixed sequence of moves, Dokl. Akad. Nauk SSSR, 198, No. 5, 1001-1004, 1971.

3. GERMEIER, YU. B., Games with unopposed interests (theory of decision-making with imperfect unity) (Igry s neprotivopolozhnymi interesami (teoriya prinyatiya reshenii pri nepolnom edinstve)), Izd-vo Mosk. un-ta, Moscow, 1972.

4. LUCE, R. D., and RAIFFA, H., Games and decisions, Wiley, 1957.