Copyright © IFAC Stochastic Control, Vilnius, Lithuanian SSR, USSR, 1986

ONE OPTIMAL CONTROL METHOD IN A CERTAIN CLASS OF PROBLEMS

V. I. Grishin and N. V. Efimova

Faculty of Applied Mathematics, Moscow Aviation Institute, Moscow, USSR
Abstract. The determination of the optimal control vector of parameters for a many-step-type game problem is considered in the multi-dimensional case. The method consists of a system of two algorithms. The first one represents an iteration procedure for finding a solution of any many-step-type game problem, provided the criterion to be optimized is a strictly concave-convex function of the control parameters; at the same time, any method may be employed to find the optimal control in the non-game problem. It is proved that the proposed algorithm is convergent. The second algorithm represents a procedure for determining the optimal control in the multi-dimensional non-game problem. It is supposed in this case that the criterion is a strictly concave (or convex) function of the control parameters admitting linearization at any point, and that the restrictions imposed upon the control are linear. The method is based on linearization of the criterion function; the optimal control is found by the linear programming method, with a subsequent correction for nonlinearity.

Keywords. Optimal control; game theory; iterative methods; linear programming; many-step-type process; large dimension.

INTRODUCTION
The problem of determining the optimal control for the operation of a large system is considered in application to the many-step-type game situation. The phase coordinates vary according to given relations; their variation depends on an aggregate of constant parameters that will be called below the control vector.

Let W be some criterion function of the phase coordinates. Obviously W is also a function of the control vector. A control vector that provides the extremum of W, evaluated at the end of the given time interval (at the end of a 'step'), is to be found. Let us call this vector the optimal control.

The method of solving this problem under some conditions, acceptable for many practical problems, is considered below. The method consists of two iteration procedures, each with an independent purpose.

Remark 1. It is supposed that the initial conditions and the description of the system operation are known, so the phase coordinates at the end of the step are determined.

Remark 2. The system may be stochastic. In this case the system operation is described by means of probability characteristics rather than relations between the phase coordinates, for example by means of the first-order moments (Pugachov, 1979). These characteristics will also be called the phase coordinates.

Remark 3. The control vector may consist of two vectors: the optimal value of the first one corresponds to the maximum of the criterion function, and the optimal value of the other to the minimum. Hence a game situation takes place. The first vector will be denoted Kx, the other Ky.

Remark 4. The problem may be considered over several steps rather than one, and the optimal control is then to be found over all steps. One should note that the optimal control over all steps is not a superposition of the optimal controls over each step.

These explanations and remarks determine the meaning of the problem formulated above. The difficulty of the problem is caused by its three particular features: large dimension, many-step type, and game situation. In its general formulation, under the most arbitrary conditions, the problem has no solution at the present time.

THE DETERMINATION OF OPTIMAL CONTROL IN THE MANY-STEP-TYPE GAME PROBLEM

Let us assume that we can determine the control vector maximizing (or minimizing) the criterion function. Then, starting from an arbitrary value Ky0, we can determine the control vector Kx1 and the corresponding value of the function W(Kx1, Ky0). Then, fixing Kx1, determine Ky1 and the corresponding value W̲(Kx1, Ky1). Further, fixing Ky1 again, determine the optimal vector Kx2 and the corresponding value.
Repeating this process, we obtain the two sequences

W:  W(Kx1, Ky0), W(Kx2, Ky1), ...   and
W̲:  W̲(Kx1, Ky1), W̲(Kx2, Ky2), ....

Let the certain line belonging to the surface Z = W and containing the points W(Kx(l+1), Ky(l)), l = 0,1,2,..., be called the maximum line and denoted Lmax. Analogously we can define the minimum line (denoted Lmin) for the points W̲(Kx(p), Ky(p)), p = 1,2,..., belonging to the surface Z = W. Further, all statements with respect to the maximum line remain correct when correspondingly restated with respect to the minimum line.

The iteration process of obtaining the sequence W, which may be called fixing the control vectors in turn, converges if the criterion function is bounded and concave-convex. The proof of this statement represents the main content of the first part of this work.

It is known (Balakrishnan, 1976; Rockafellar, 1970; Ekeland and Temam, 1976) that a strictly concave-convex bounded function has a saddle point (K̂x, K̂y) and a saddle value C, and the relation

C = W(K̂x, K̂y) = max_Kx min_Ky W(Kx, Ky) = min_Ky max_Kx W(Kx, Ky)

is true. If the dimension of the vector Kx is n, and that of the vector Ky is m, then W(Kx, Ky) represents a surface in R^(n+m+1). Let the plane containing the given point Kx, orthogonal to the basis of the vectors Kx and parallel to the Z-axis, be called the section Sx(Ky). Analogously define Sy(Kx).

Consider some properties of the maximum line Lmax.

1). Every point (Kx(l+1), Ky(l)) obtained according to the algorithm described above satisfies the relation

W(Kx(l+1), Ky(l)) = max_Kx W(Kx, Ky(l)),  l = 0,1,2,....

2). The relation pointed out in 1) is true for an arbitrary point of the line Lmax. The statements 1) and 2) take place for the points corresponding to the sequence W, and serve as the definitions for the other points.

3). The line Lmax intersects the section Sy(Kx) at only one point; this is a corollary of the property of a strictly concave function to have a sole maximum.

4). Any segment of the line Lmax, bounded by the points (Kx', Ky') and (Kx'', Ky''), does not contain points (Kx*, Ky*) that satisfy the relation

W(Kx', Ky') ≤ W(Kx*, Ky*) ≥ W(Kx'', Ky'').

This statement follows from the fact that the intersection L of the surface Z = W with the section Sx(Ky) containing the point (Kx*, Ky*) is convex; therefore, drawing the sections Sy(Kx) containing the points (Kx', Ky') and (Kx'', Ky'') and using the properties 2) and 3), we see that there are points in the intersection of L with the planes Sy(Kx) at which the values of W are greater than at (Kx', Ky') or (Kx'', Ky''); this conclusion contradicts the fact that the point (Kx', Ky') or (Kx'', Ky'') belongs to the maximum line.

5). The maximum line is continuous. To prove this we must show, for any ε > 0, the existence of δ > 0 such that |W(Kx', Ky') − W(Kx'', Ky'')| < ε whenever ||(Kx' − Kx'', Ky' − Ky'')|| < δ, where (Kx', Ky') and (Kx'', Ky'') are arbitrary points lying on Lmax.

Let us draw the section Sx(Ky) through the point (Kx', Ky') and the section Sy(Kx) through (Kx'', Ky''). These sections and the surface Z = W intersect at a point (K̄x, K̄y). Keeping in mind that the length of an orthogonal projection of a straight line segment is the lesser one, and that the points W(K̄x, K̄y) and W(Kx'', Ky'') lie on the concave (and therefore continuous) intersection of the plane Sy(Kx) with the surface Z = W (respectively, the points W(K̄x, K̄y) and W(Kx', Ky') belong to the convex intersection of the section Sx(Ky)), we can state the existence of such δ1 and δ2 that for any ε > 0 the inequalities

|W(Kx', Ky') − W(K̄x, K̄y)| < ε/2  when  ||(Kx' − K̄x, Ky' − K̄y)|| < δ1

and

|W(Kx'', Ky'') − W(K̄x, K̄y)| < ε/2  when  ||(Kx'' − K̄x, Ky'' − K̄y)|| < δ2

are true. Denoting δ = min{δ1, δ2}, we obtain

|W(Kx'', Ky'') − W(Kx', Ky')| ≤ |W(Kx', Ky') − W(K̄x, K̄y)| + |W(K̄x, K̄y) − W(Kx'', Ky'')| < ε,

which is the proof of the
maximum line continuity.

6). The maximum line is unique. Indeed, it is impossible for two points with coordinates (Kx', Ky) and (Kx'', Ky) to lie on Lmax for the given Ky, since this would contradict the fact that W is a strictly concave function of Kx.

7). From the properties 4), 5) and 6) we deduce that any point on the maximum line Lmax (except for the saddle point) divides Lmax into two rays: on one of them the values of W are greater, and on the other lesser, than that of the division point. We will call these rays the increasing and the decreasing (till the value C) branches, respectively.
8). Let us draw through any point (Kx, Ky) of Lmax (except for the saddle point) a section Sx(Ky). The convex intersection L of this section with the surface Z = W may also be divided into increasing and decreasing parts according to the value W(Kx, Ky). The section Sy(Kx) drawn through any point of the decreasing part of L intersects the decreasing part of Lmax at some point P. This is not true if (Kx, Ky) is the saddle point (we have eliminated this case) or if the saddle value C lies between W(Kx, Ky) and the value W(P).

9). Choosing an arbitrary point (Kx(l), Ky(l−1)), l = 1,2,..., and finding Ky(l) from the relation

W̲(Kx(l), Ky(l)) = min_Ky W(Kx(l), Ky),

we determine Kx(l+1) from the relation

W(Kx(l+1), Ky(l)) = max_Kx W(Kx, Ky(l)).

Now we prove that the point (Kx(l+1), Ky(l)) belongs to the decreasing branch of Lmax. Let us draw the section Sy(Kx) through the point (Kx(l+1), Ky(l)) and the section Sx(Ky) through the point (Kx(l), Ky(l−1)). The intersection point of these sections and the surface Z = W lies on the decreasing part of L corresponding to the Sx(Ky) section. Thus, according to the property 8), the point W(Kx(l+1), Ky(l)) belongs to the decreasing part of Lmax, provided there is no saddle point between the points W(Kx(l), Ky(l−1)) and W(Kx(l+1), Ky(l)). So we obtain

W(Kx(l), Ky(l−1)) ≥ W(Kx(l+1), Ky(l)),

and this inequality fails only if the saddle point lies between the points defined above.

Using this procedure we obtain the decreasing sequence {W(Kx(l), Ky(l−1))}, l = 1,2,..., which is bounded and therefore converges. Analogously, the sequence {W̲(Kx(p), Ky(p))}, p = 1,2,..., monotonely increases and also converges. Thus, by the iteration process of fixing the control vectors in turn, we construct two monotone sequences bounded by C: one, Wi, decreases, the other, W̲j, increases (i, j = 1,2,...).

The determination of the unknown position of the saddle point is the aim of such an iteration process. Since all Wi ≥ C and W̲j ≤ C, the process obviously should be stopped when WN − W̲N < μ, where μ is the given precision of the solution.
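The turn-by-turn fixing procedure, the maximum and minimum lines, and the stopping rule above can be sketched numerically. The concave-convex test function W(x, y) = −2x² + 2xy + 2y² with scalar controls, and the closed-form best responses derived from it, are illustrative assumptions of ours, not taken from the paper.

```python
# Illustrative strictly concave-convex criterion (assumed, not the paper's):
# W(x, y) = -2x^2 + 2xy + 2y^2 -- strictly concave in x, strictly convex in y,
# with saddle point (0, 0) and saddle value C = 0.
def W(x, y):
    return -2 * x * x + 2 * x * y + 2 * y * y

def best_x(y):  # argmax_x W(x, y): dW/dx = -4x + 2y = 0  =>  x = y/2
    return y / 2

def best_y(x):  # argmin_y W(x, y): dW/dy = 2x + 4y = 0  =>  y = -x/2
    return -x / 2

C = W(0.0, 0.0)  # saddle value

# Maximum and minimum lines: W on L_max stays >= C, W on L_min stays <= C,
# and both touch C exactly at the saddle point (cf. properties 1)-7)).
grid = [i / 10 for i in range(-20, 21)]
max_line = [W(best_x(y), y) for y in grid]
min_line = [W(x, best_y(x)) for x in grid]
assert all(v >= C for v in max_line) and all(v <= C for v in min_line)

def saddle_search(y0, mu=1e-8, max_iter=1000):
    """Fix the control vectors in turn until the upper and lower sequences
    squeeze the saddle value to within the given precision mu."""
    y = y0
    for _ in range(max_iter):
        x = best_x(y)        # K_x^{l+1}: maximize at fixed K_y^l
        w_upper = W(x, y)    # member of the decreasing sequence (>= C)
        y = best_y(x)        # K_y^{l+1}: minimize at fixed K_x^{l+1}
        w_lower = W(x, y)    # member of the increasing sequence (<= C)
        if w_upper - w_lower < mu:
            return x, y, w_upper, w_lower
    raise RuntimeError("required precision not reached")

x, y, w_up, w_lo = saddle_search(y0=1.0)
```

With this curvature each round of best responses shrinks the control (y maps to −y/4), so the two sequences converge monotonely to C = 0 from above and below, mirroring the convergence argument of the text.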
We remark that if the extremal values K̂x and K̂y of the function W do not coincide with the saddle point due to the restrictions on the control, the presented algorithm will still guarantee convergence to these values. However, in this case the iteration process should be stopped if

|W(N+1) − W(N)| < μ  and  |W̲(N+1) − W̲(N)| < μ,

where μ is the given precision of the estimates.

It is possible that, when jumping over the saddle point at some k-th step (k-th iteration), the inequality W(k−1) ≥ W(k) does not hold for the points of the maximum line. In this case we consider the segment joining the points A = (Kx(k−1), Ky(k−2)) and B = (Kx(k−1), Ky(k−1)). Let P(λ) ∈ C[0,1] be a certain continuous parametrization of the segment, with P(0) = A and P(1) = B. The term Ky(λ) = λKy(k−2) + (1−λ)Ky(k−1) corresponds to every λ. Now we determine Kx(λ) for every value of Ky(λ) such that the relation

W(Kx(λ), Ky(λ)) = max_Kx W(Kx, Ky(λ))

takes place. Due to the properties of the maximum line, its segment between the points W(k−1) and W(k) is divided by the saddle point into the increasing and decreasing parts. Since the maximum line is continuous, then, varying λ continuously, we obtain all the points of the segment between W(k−1) and W(k), and the saddle point among them (carrying out the check for reaching the evaluation precision).

Sometimes for practical purposes different criterion functions are used to determine the control vectors Kx and Ky. Our method permits solving this problem by means of comparatively simple operations. For this it is necessary to determine the optimal vectors K̂x(1) and K̂y(1) with respect to the criterion W1 and the optimal control vectors K̂x(2), K̂y(2) with respect to the criterion W2, and to take the vectors K̂x(1) and K̂y(2) as the required ones. It is easy to see that they satisfy the optimality conditions for the different criteria W1 and W2.

THE DETERMINATION OF OPTIMAL CONTROL IN THE NON-GAME MULTI-DIMENSIONAL PROBLEM

Now consider the question of determining the optimal control vector that turns the criterion function W into a maximum or minimum. Let the initial state of a system be characterized by the vector X0. With the given X0, the vector X1 at the end of the system's operation is determined by the control vector K; the latter is to be chosen in such a way that the function W(X1) becomes optimal. Suppose that the following restrictions are imposed upon the vector K:

Σi aij Kij ≤ sj.
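In display form, the non-game problem just posed reads (our transcription of the paper's notation; the range of j is an assumption):

```latex
\max_{K}\; W_1(X_0, K)
\qquad \text{subject to} \qquad
\sum_{i} a_{ij}\, K_{ij} \le s_j , \quad j = 1, \dots, n .
```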
The known methods of optimal control determination (for example, dynamic programming (Bellman, 1957), gradient methods and others (Vasiliev, 1974; Pshenichny and Danilin, 1975; Arrow, Hurwicz, and Uzawa, 1958)) are very complicated because of the large dimension of the vector K. An approximate method of solving the given problem of optimal control determination is presented below.

If the function W(X1) were linear with respect to K (in what follows we write W(X1) = W1(X0, K)), then the determination of the optimal vector K could be conducted by the linear programming method, because the restrictions imposed upon K are linear too. The method is based on the possibility of linearizing the function W1(X0, K), the solution being found by the linear programming method. However, this solution may only be considered as a first approximation that is to be corrected by some means.

If we suppose the criterion W1(X0, K) to be a strictly concave (or strictly convex) bounded function with respect to K, having first derivatives, then one may use the following correction algorithm for the approximate solution of the problem (with any degree of precision). Let K0 be an arbitrary first approximation of the optimal control vector. Then one may represent the required vector K as K = K0 + ΔK0. It is easy to see that the restrictions imposed upon ΔK0 are also linear.
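The scheme described here (linearize, find the optimum of the linearized criterion by linear programming, then correct along the segment by repeated subdivision, as detailed below) can be sketched as follows. It is close in spirit to a conditional-gradient iteration. The box restrictions −1 ≤ Kj ≤ 1 (for which the LP solution is just the corner read off the gradient signs), the concave quadratic W1, and the target vector t are illustrative assumptions of ours, since the paper leaves the concrete system unspecified.

```python
# Sketch of the linearize -> linear-programming -> correction scheme.
# Assumptions (ours, for illustration): restrictions are simple bounds
# -1 <= K_j <= 1, so the LP maximizer of a linear function is the box corner
# picked by the gradient signs; W1 is a strictly concave quadratic.

t = [1.7, -0.4, 2.3]  # illustrative target: W1(K) = -sum_j (K_j - t_j)^2
LO, HI = -1.0, 1.0

def W1(K):
    return -sum((k - tj) ** 2 for k, tj in zip(K, t))

def grad_W1(K):
    return [-2 * (k - tj) for k, tj in zip(K, t)]

def lp_corner(g):
    """LP step: maximize the linearized criterion sum_j g_j * K_j over the box."""
    return [HI if gj > 0 else LO for gj in g]

def segment_search(K0, K1, tol=1e-9):
    """Correction step: divide the segment [K0, K1] into five parts, keep the
    neighbourhood of the best point as the new segment, and repeat."""
    point = lambda s: [a + s * (b - a) for a, b in zip(K0, K1)]
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        s_grid = [lo + i * (hi - lo) / 5 for i in range(6)]
        best = max(range(6), key=lambda i: W1(point(s_grid[i])))
        lo, hi = s_grid[max(best - 1, 0)], s_grid[min(best + 1, 5)]
    return point((lo + hi) / 2)

K = [0.0, 0.0, 0.0]  # K^0: arbitrary first approximation
for _ in range(50):
    K_lp = lp_corner(grad_W1(K))      # maximizer of the linearized W1
    K_next = segment_search(K, K_lp)  # corrected point on the segment
    done = abs(W1(K_next) - W1(K)) < 1e-10
    K = K_next
    if done:
        break
# K converges to the constrained maximizer -- here the box projection of t.
```

Because W1 is strictly concave and the restrictions are linear, each correction step cannot decrease the criterion, which is exactly the chain of inequalities the text builds below.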
Linearizing the function W1(X0, K0 + ΔK0) with respect to ΔK0, we obtain:

W1(X0, K) = W1(X0, K0) + Σi,j [∂W1(X0, K0)/∂Kij] ΔKij.
Taking this function as the aim function and taking into account the linear restrictions, by the linear programming method we determine ΔK0, which turns the linearized function W̃1 into its maximum (or minimum) at the point K1 = K0 + ΔK0. At this point the value of W1 will be lesser than W̃1 (for the concave function). The value of W1 thus found, corresponding to the maximum of the linearized function W̃1, may not yet provide the maximum of the given function. If in this case we construct the vector equal to the difference of the vectors K0 and K1 and draw the orthogonal hyperplane, then we obtain an intersection of the surface Z = W with the hyperplane mentioned above. Now, dividing the segment connecting the ends of the vectors K0 and K1 into five parts, we choose the three points with the greatest values of the criterion (they will be neighbouring ones, because any section of W is strictly concave). The two extreme of these points determine a new segment. Continuing this process of dividing the new segment and the next segments, one may obtain, with the required precision, a value of the control vector K (let us denote it K̃1) at which W1 has the maximum value on this section. At the same time the inequality W1(X0, K0) ≤ W1(X0, K̃1) certainly takes place.

Then we again linearize the function W1 at the point K̃1 (not at K1), and by the above method search for the point K̃2 where the criterion function has its maximum on the new section. If the inequality W1(X0, K̃2) ≥ W1(X0, K̃1) holds, we continue the procedure of finding the maximum of the initial function. As a result, having the system of consequent inequalities W1(X0, K0) ≤ W1(X0, K̃1) ≤ W1(X0, K̃2) ≤ ..., one may state that, as the function W1 is bounded, the above process converges to some point K̃ where the value W1(X0, K̃) is the greatest for all values of K.

If the inequality W1(X0, K̃2) ≥ W1(X0, K̃1) does not hold on this or further steps, one ought to repeat the procedure of dividing
a new segment [K1, K̃2] ([Ki, K̃(i+1)]), not the segment [K̃1, K2] ([K̃i, K(i+1)]), just as was done when finding the point K̃2 (K̃(i+1)). The value of W1(X0, K̃2) (W1(X0, K̃(i+1))) thus defined already satisfies the above series of inequalities.

We recommend following this iteration process until the difference |W1(X0, K̃i) − W1(X0, K̃(i+1))| (i = 1,2,...) of the successively obtained maximum values of the initial criterion function becomes lesser than an a priori chosen precision of calculations ε, provided the inequality |K̃i − K̃(i+1)| < δ holds with rather small δ.

The described process of determining the optimal control vector K for a concave function W1 is easily adapted to convex criterion functions and a minimizing control vector.

CONCLUSION

Our method permits finding the optimal control in game problems of any dimension and any number of steps. Some requirements on the criterion (concave-convexity) or on the restrictions (linearity in non-game problems) limit the area of the method's applications. However, some supplementary research gives us hope that, in the algorithm for determining the optimal control in the non-game problem, the requirements imposed upon the criterion function and the restrictions could be weakened, and the method could become more general.

REFERENCES

Arrow, K.J., L. Hurwicz, and H. Uzawa (1958). Studies in Linear and Non-linear Programming. Stanford University Press, Stanford, California.
Balakrishnan, A.V. (1976). Applied Functional Analysis. Springer-Verlag, New York, Heidelberg, Berlin.
Bellman, R. (1957). Dynamic Programming. Princeton University Press, Princeton, New Jersey.
Ekeland, I., and R. Temam (1976). Convex Analysis and Variational Problems. North-Holland Publishing Company, Amsterdam, Oxford; American Elsevier Publishing Company, New York.
Pshenichny, B.N., and Y.M. Danilin (1975). Numerical Methods for Extremal Problems. Nauka, Moscow (in Russian).
Pugachov, V.S. (1979). Theory of Probabilities and Mathematical Statistics. Nauka, Moscow (in Russian).
Rockafellar, R.T. (1970). Convex Analysis. Princeton University Press, Princeton, New Jersey.
Vasiliev, F.P. (1974). Lectures on Methods of Solutions of Extremal Problems. MGU, Moscow (in Russian).