Optimization of linear variable-structure systems

U.S.S.R. Comput. Maths. Math. Phys. Vol. 21, No. 2, pp. 48-56, 1981.

Printed in Great Britain

0041-5553/81/020048-09$07.50/0 © 1982 Pergamon Press Ltd.

OPTIMIZATION OF LINEAR VARIABLE-STRUCTURE SYSTEMS*

S. L. KAGANOVICH

Leningrad

(Received 6 June 1979; revised 14 March 1980)

AN ITERATIVE algorithm is described and proved for finding an optimal control that satisfies the conditions of Pontryagin's maximum principle. The main advantages of the algorithm from the practical standpoint are: universality in the class of variable-structure systems, stability of the computational process, monotonic convergence to the required solution, and the capacity to discover, maintain, and evaluate the parameters of sliding optimal control modes.

1. The problem

Linear variable-structure systems (l.v.s.s.) have come to be widely used in automatic control and operations research theory [1-4] for the mathematical modelling of actual phenomena [5-7]. Experience in solving l.v.s.s. optimization problems [3-7] shows that both particular methods, such as Miele's method (based on Green's theorem [6, 8]) or the method of analyzing the switching points [4-6], and general methods, using the Krylov-Chernous'ko [9], Newton-Raphson [10], and Bellman [11] algorithms, have a limited range of application. We describe below an l.v.s.s. optimization algorithm having several advantages over the above-mentioned methods and easily realized on a modern computer.

We consider a controlled dynamic system described by the equation

ẋ = A(u)x + b(u),   x(0) = x_0,   (1.1)

where x = (x^1, ..., x^n) ∈ R^n is the vector of the running state of the system, x_0 ∈ R^n is the given initial state vector, u ∈ U are the control values, chosen from a given compact topological space U; A(u) and b(u) are respectively the n × n matrix and the n × 1 vector corresponding to the accepted control value u ∈ U; and the functions A(·) and b(·), expressing the dependence of the system structure on the control values, are assumed to be continuous. System (1.1) is controlled by choosing a function from the class of measurable functions

𝒟 = { ū(·): [0, T] → U },   (1.2)

where T > 0 is a given finite instant.

*Zh. vychisl. Mat. mat. Fiz., 21, 2, 306-314, 1981.

The effectiveness of the control ū ∈ 𝒟 is described by the criterion functional

I[ū] = <c, x_ū(T)>,   (1.3)

where c ∈ R^n is the given vector characterizing the "value" of the individual components of the final state vector; x_ū(·) is the trajectory of system (1.1) generated by the control ū; and throughout, <·, ·> denotes the scalar product of vectors. Under conditions (1.1)-(1.3) we pose the following problem: to find

ū* = arg max_{ū ∈ 𝒟} I[ū],   (1.4)

i.e. an admissible control of maximum effectiveness.
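To make the setting (1.1)-(1.4) concrete, the state equation and criterion can be approximated on a uniform time grid. The sketch below is illustrative only: the two-structure system, its matrices, and the step count are invented here, not taken from the paper. It integrates (1.1) with forward Euler for a piecewise-constant control and evaluates I[ū] = <c, x_ū(T)>.

```python
import numpy as np

def simulate(A_of, b_of, u_path, x0, T):
    """Forward-Euler integration of x' = A(u)x + b(u), Eq. (1.1),
    for a control held constant on each of len(u_path) uniform steps."""
    dt = T / len(u_path)
    x = np.array(x0, dtype=float)
    for u in u_path:
        x = x + dt * (A_of(u) @ x + b_of(u))
    return x

def criterion(c, x_T):
    """The terminal criterion I[u] = <c, x_u(T)> of Eq. (1.3)."""
    return float(np.dot(c, x_T))

# Hypothetical two-structure system: u in {0, 1} selects the pair (A, b).
A_of = lambda u: np.array([[-1.0, 0.0], [0.0, -2.0]]) if u == 0 \
    else np.array([[-2.0, 1.0], [0.0, -1.0]])
b_of = lambda u: np.array([0.0, 1.0]) if u == 0 else np.array([1.0, 0.0])

x_T = simulate(A_of, b_of, [0] * 50 + [1] * 50, x0=[1.0, 0.0], T=1.0)
I_val = criterion(np.array([1.0, 1.0]), x_T)
```

A finer grid (or a higher-order integrator) would be needed for quantitative work; the point here is only the shape of the data: a control path, a trajectory, and a scalar criterion.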

2. Algorithm of ψ-freezing

We introduce in the control space the mapping K: 𝒟 → 𝒟 as follows. For the control ū_0 ∈ 𝒟 we find the corresponding vector function of conjugate variables ψ^0(·), by solving in backward time the equation

ψ̇^0 = -ψ^0 A(ū_0(t)),   ψ^0(T) = c.   (2.1)

Then, in forward time t ∈ [0, T], we simultaneously solve the system of equations

ẋ = A(ū_1(t))x + b(ū_1(t)),   x(0) = x_0,
                                                        (2.2)
ū_1(t) = arg max_{u ∈ U} <ψ^0(t), A(u)x + b(u)>.

Since the function arg max_{u ∈ U} usually has discontinuities, the first equation in (2.2) is a differential equation with discontinuous right-hand side. For the definition of the solution of such an equation, and a proof of the solvability of the corresponding Cauchy problem, see [12]. After this we put

ū_1(·) = Kū_0(·).   (2.3)
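Numerically, one application of the mapping K of (2.1)-(2.3) amounts to a backward pass for the conjugate variables followed by a forward pass in which ψ is frozen and x(·) and the new control emerge together. The following is a minimal Euler sketch on a uniform grid, with invented data; it is not the paper's own implementation.

```python
import numpy as np

def apply_K(u_path, A_of, b_of, c, x0, T, U):
    """One application of K: backward pass for psi, Eq. (2.1), then a
    forward pass, Eq. (2.2), in which psi is frozen and the new control
    is chosen pointwise as argmax_u <psi(t), A(u)x + b(u)>."""
    N = len(u_path)
    dt = T / N
    # Backward time: psi' = -psi A(u_0(t)), psi(T) = c (row vector).
    psi = np.zeros((N + 1, len(c)))
    psi[N] = np.asarray(c, dtype=float)
    for k in range(N, 0, -1):
        psi[k - 1] = psi[k] + dt * psi[k] @ A_of(u_path[k - 1])
    # Forward time: x and the new control are determined together.
    x = np.array(x0, dtype=float)
    new_u = []
    for k in range(N):
        u = max(U, key=lambda v: float(psi[k] @ (A_of(v) @ x + b_of(v))))
        new_u.append(u)
        x = x + dt * (A_of(u) @ x + b_of(u))
    return new_u, x

# Hypothetical two-structure system with control set U = (0, 1).
A_of = lambda u: np.array([[-1.0, 0.5], [0.0, -1.0]]) if u == 0 \
    else np.array([[-1.0, 0.0], [0.5, -1.0]])
b_of = lambda u: np.array([1.0, 0.0]) if u == 0 else np.array([0.0, 1.0])

new_u, x_T = apply_K([0] * 100, A_of, b_of, c=[1.0, 1.0],
                     x0=[0.0, 0.0], T=1.0, U=(0, 1))
```

Note that the pointwise arg max over a finite U is exact; for a continuum U one would need a maximization routine at each step.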

Notice that, in system (2.2), only the vector function ψ^0(·) of conjugate variables is fixed, whereas the trajectory x(·) and the control ū_1(·) are determined in the course of solving the system (for this reason, the proposed algorithm is called the algorithm of ψ-freezing).

Theorem.

Given the initial control ū_0 ∈ 𝒟, let the following sequence be obtained:

ū_0,  ū_1 = Kū_0,  ū_2 = Kū_1,  ...,  ū_{n+1} = Kū_n,  ... .   (2.4)

Then:

1) the sequence of corresponding values of the criterion functional is non-decreasing, i.e.

I[ū_0] ≤ I[ū_1] ≤ ... ;   (2.5)

2) if, for some N ≥ 0,

I[ū_{N+1}] = I[ū_N],   (2.6)

the control ū_N is the required solution of problem (1.1)-(1.4);

3) sequence (2.4) is weakly convergent to the solution of problem (1.1)-(1.4), if the problem is not degenerate, i.e. has a unique solution.

Proof. 1. For n ≥ 0 we introduce the auxiliary function F(t) = I[ū_n^t], where ū_n^t is the control which is the same as ū_{n+1} in the interval [0, t], and the same as ū_n in the interval (t, T]. Then (we assume that F(·) is differentiable, see (2.9))

I[ū_{n+1}] = I[ū_n] + ∫_0^T Ḟ(t) dt.   (2.7)

The control ū_n^{t+Δt} is obtained from ū_n^t by a needle-type variation in the interval [t, t + Δt], and we therefore use the well-known increment formula, allowing for the special feature that the needle-type variation is started at the point t of discontinuity of the control ū_n^t; the formula is proved for this case in [13, pp. 22-28]. We obtain as a result:

F(t + Δt) - F(t) = <ψ̃(t), (A(ū_{n+1}(t)) - A(ū_n(t)))x(t) + b(ū_{n+1}(t)) - b(ū_n(t))> Δt + o(Δt),   (2.8)

where ψ̃(·) is the vector function of conjugate variables corresponding to the control ū_n^t. This function is defined solely by the control values in the interval (t, T] (this follows from Eq. (2.1)), so that ψ̃(t) = ψ^n(t). We then obtain from (2.8):

Ḟ(t) = <ψ^n(t), (A(ū_{n+1}(t)) - A(ū_n(t)))x(t) + b(ū_{n+1}(t)) - b(ū_n(t))>,   (2.9)

which is non-negative, in view of the second equation of system (2.2). Using this in (2.7), we arrive at the conclusion that I[ū_{n+1}] ≥ I[ū_n], n = 0, 1, ..., whence (2.5) follows.

2. Using the fact that the quantity (2.9) is non-negative, we obtain as a consequence of (2.7) under condition (2.6) the identity Ḟ(t) ≡ 0, t ∈ [0, T]. In the light of (2.9) and (2.2), this implies that

<ψ^N(t), A(ū_N(t))x + b(ū_N(t))> ≡ max_{u ∈ U} <ψ^N(t), A(u)x + b(u)>,

i.e. the control ū_N satisfies the Pontryagin maximum principle [14]. In the class of l.v.s.s. this condition is sufficient for optimality [15, 16], and hence ū_N is the required solution.


3. Let 𝒲 be the space of generalized controls, whose values are probability measures on U. Then 𝒲 is a compact topological space [17]; 𝒟 is naturally imbedded in 𝒲; and the mapping K of (2.3), and the functional I of (1.3), are, by definition, continuous on 𝒲. We extract from sequence (2.4) a subsequence

π_1 = { K^{n_1}ū_0, ..., K^{n_i}ū_0, ... }.

Since 𝒲 is compact, π_1 will contain a subsequence

π_2 = { K^{m_1}ū_0, ..., K^{m_i}ū_0, ... },

which is convergent, say to w, where m_i ≥ n_i, i = 1, 2, ... . Here,

lim_{i→∞} I[K^{m_i}ū_0] = I[w].

On again applying the mapping K to π_2, we obtain

Kπ_2 = { K^{m_1+1}ū_0, ..., K^{m_i+1}ū_0, ... } → Kw,

and hence

lim_{i→∞} I[K^{m_i+1}ū_0] = I[Kw].

Since m_{i+1} ≥ m_i + 1, i = 1, 2, ..., then, by claim 1) of our theorem, we have

I[w] = lim_{i→∞} I[K^{m_{i+1}}ū_0] ≥ lim_{i→∞} I[K^{m_i+1}ū_0] = I[Kw].

But, in accordance with 1), the converse relation is also true, and hence I[w] = I[Kw]. It now follows from claim 2) that w is the solution of problem (1.1)-(1.4). In short, we have shown that any subsequence of sequence (2.4) has itself a subsequence, convergent to the unique solution w of problem (1.1)-(1.4) (using our assumption that the problem is non-degenerate). By using standard arguments, we then conclude that sequence (2.4) itself is convergent to w. This proves the theorem.

The algorithm of ψ-freezing consists of the following operations.

Step 1. We put n = 0 and

ψ^(0)(t) ≡ c,   t ∈ [0, T],   (2.10)

which corresponds to an attempt to find a locally optimal control.

Step 2. We solve system (2.2), where, instead of ψ^0(·), we substitute ψ^(0)(·). As a result we obtain the locally optimal control ū^(0), see [18]. This control has the following properties:

a) for weakly controlled systems it serves as a first approximation to the optimal control;


b) if the mapping K of (2.3) leaves it unchanged, then it is completely optimal, i.e. optimal with respect to any time interval [0, t], 0 ≤ t ≤ T; the existence of such a solution is of great practical interest. These properties justify the use of a locally optimal control as the initial approximation in numerical optimization algorithms.

Step 3. We solve system (2.1), substituting ū^(n) for ū_0. As a result we obtain the vector function of the conjugate variables ψ^(n+1)(·).

Step 4. We solve system (2.2), using ψ^(n+1)(·) instead of ψ^0(·). Denote the control obtained by ū^(n+1).

Step 5. We check the condition

I[ū^(n+1)] > I[ū^(n)].   (2.11)

If it is satisfied, we put n = n + 1 and return to Step 3; if not, we stop the computations and print out ū^(n+1) as the required solution.

In the light of our theorem above, the algorithm may be seen to have the following properties: a) it can be applied to any variable-structure system; b) it is monotonically convergent (in the weak topology) to the optimal solution of a non-degenerate problem of l.v.s.s. optimization; c) it is stable, because the systems of differential equations are only solved in the natural direction: the equations of dynamics in forward time, and the conjugate equations in backward time.

The algorithm of ψ-freezing has some features in common with the familiar algorithms, but differs from them in the following respects: 1) in the Krylov-Chernous'ko algorithm [9] the control is found for fixed, "frozen" x(·) and ψ(·), whereas now x(·) is "unfrozen", so that singular and sliding optimal modes can be computed; 2) in the Newton-Raphson algorithm [10], the control is found for variable x(·) and ψ(·), whereas now ψ(·) is "frozen", so that the computational process is stable; 3) in Bellman's algorithm of successive approximations [11], the control is constructed in the form of a synthesis, whereas in our case programmed controls are used, so that a substantial economy is achieved in computer resources.
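Steps 1-5 above can be sketched as a single loop. The sketch below uses an illustrative Euler discretization on an invented two-structure system; the grid size, tolerance, and data are choices of this sketch, not parameters from the paper. It records the criterion values so that the monotone behaviour of claim 1) can be observed.

```python
import numpy as np

def psi_freezing(A_of, b_of, c, x0, T, U, N=200, max_iter=25, tol=1e-9):
    """Sketch of the psi-freezing algorithm, Steps 1-5, on a uniform grid."""
    dt = T / N
    c = np.asarray(c, dtype=float)

    def forward(psi):
        # Solve (2.2) with psi frozen: x and the control emerge together.
        x = np.array(x0, dtype=float)
        u_path = []
        for k in range(N):
            u = max(U, key=lambda v: float(psi[k] @ (A_of(v) @ x + b_of(v))))
            u_path.append(u)
            x = x + dt * (A_of(u) @ x + b_of(u))
        return u_path, float(np.dot(c, x))

    def backward(u_path):
        # Solve (2.1) in backward time: psi' = -psi A(u(t)), psi(T) = c.
        psi = np.zeros((N + 1, len(c)))
        psi[N] = c
        for k in range(N, 0, -1):
            psi[k - 1] = psi[k] + dt * psi[k] @ A_of(u_path[k - 1])
        return psi

    psi = np.tile(c, (N + 1, 1))          # Step 1: psi^(0) == c, Eq. (2.10)
    u_path, I = forward(psi)              # Step 2: locally optimal control
    history = [I]
    for _ in range(max_iter):
        psi = backward(u_path)            # Step 3
        u_path, I_new = forward(psi)      # Step 4
        history.append(I_new)
        if I_new <= I + tol:              # Step 5: stop when no improvement
            break
        I = I_new
    return u_path, history

# Hypothetical data for a small demonstration run.
A_of = lambda u: np.array([[-1.0, 0.5], [0.0, -1.0]]) if u == 0 \
    else np.array([[-1.0, 0.0], [0.5, -1.0]])
b_of = lambda u: np.array([1.0, 0.0]) if u == 0 else np.array([0.0, 1.0])

u_opt, history = psi_freezing(A_of, b_of, c=[1.0, 1.0],
                              x0=[0.0, 0.0], T=1.0, U=(0, 1))
```

Because the discretization is only approximate, the non-decreasing property of (2.5) holds here up to a small numerical tolerance rather than exactly.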


3. Application of the algorithm of ψ-freezing

1. A characteristic feature of our algorithm is that it allows stable realization of sliding optimal modes; the latter are understood, in automatic control and discontinuous system theory [19], as sequences of "infinitely" rapid switchings of the control. Even the simplest problems of l.v.s.s. optimization [5-7] reveal the existence of such solutions. This is also confirmed by experience in the practical use of the ψ-freezing algorithm. It therefore seems worth giving special consideration to the problems that then arise.

The existence of a sliding mode is easily detected from the structure of the elements of sequence (2.4): starting with some n, we observe in the function ū_n, in certain time periods, rapid switchings of the control; as the integration step of system (2.2) is reduced, not only do these switchings not disappear, but they in fact become more rapid. The limits (in the weak topology) of these rapidly varying functions are the generalized controls [17], which take values from the space 𝒫(U) of normalized measures on the set U. By widening the class 𝒟 up to the class 𝒲 of generalized controls, we can pass to singular modes, for which numerous optimality conditions are known [20]. But it is very difficult to use these conditions for performing numerical computations, due to their non-constructive nature.

When using the algorithm of ψ-freezing to obtain optimal generalized controls, the concepts of micro- and macro-time can be introduced. The function of micro-time is to obtain a complete impression of all the control switchings. Hence the integration of system (2.2) is performed in micro-time with a sufficiently small micro-step δt_μ. The generalized control is built up in macro-time, by smoothing the switchings that occur in micro-time. For this, we choose a macro-step δt_M, with a size which is sufficiently large compared with the micro-step δt_μ, but sufficiently small compared with the total duration T of the process.

Definition. The generalized value of the control in problem (1.1)-(1.4) at the instant t ∈ [0, T] is defined as the normalized measure on the space U of control values, given by the relation

μ_t(V) = lim_{δt_M → 0} lim_{δt_μ → 0} (δt_μ / δt_M) N_{δt_M}^{(δt_μ)}(V),   V ⊆ U,   (3.1)

where N_{δt_M}^{(δt_μ)}(V) is the number of micro-steps (of duration δt_μ) in the time interval [t, t + δt_M], during which control values from the set V ⊂ U were used; this quantity is determined during the operation of the ψ-freezing algorithm. Computations from (3.1) are quite easy to realize on a modern computer; expression (3.1) gives a constructive means of finding the optimal generalized controls without having recourse to special optimality conditions for singular controls.

2. To demonstrate the objective scope of the ψ-freezing algorithm, we take the problem of optimal control of cell inspection, stated in [7]. We shall indicate the concrete initial data, the parameters of the computational process, and the results obtained, i.e. all the information needed to judge the practical value of the algorithm.

Given n cells, in one of which the object G is located. In the course of time, the object passes randomly from one cell to another. We assume that the transitions are Markov and are described by the matrix A_0 = {a_ij: i, j = 1, 2, ..., n}, where a_ij is the intensity of transition from the j-th to the i-th cell, i, j = 1, 2, ..., n, i ≠ j;


If the i-th cell is inspected at the instant when object G is in it, it will be detected with intensity v_i, i = 1, 2, ..., n; if the object is not in the inspected cell, it is not detected. We wish to organize the inspection of cells during the time interval [0, T] in such a way as to maximize the probability of detecting the object.

The control values in this case are the inspected cells u_i, i = 1, 2, ..., n, so that U = {u_1, ..., u_n}. Denote by p_i(t) the probability of object G being in the i-th cell at instant t yet not being detected. Using the familiar rules for describing continuous Markov chains, we obtain ṗ = A(u)p, where p(t) = (p_1(t), ..., p_n(t)) and A(u) = A_0 - v_i D_i if u = u_i, i = 1, 2, ..., n; here, D_i is the n × n matrix with elements (D_i)_ii = 1 and (D_i)_kl = 0 for k ≠ i or l ≠ i. We assume that the initial values of the probabilities {p_i(0), i = 1, 2, ..., n} are given and equal to

p_i(0) = p_i^(0),   i = 1, 2, ..., n.
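The structure matrices of this model are easy to assemble: choosing cell u_i selects A(u_i) = A_0 - v_i D_i. The sketch below uses a hypothetical 3-cell instance with invented data (the paper's own example is a 7-cell one, given further on).

```python
import numpy as np

def structure_matrices(A0, v):
    """Build A(u_i) = A0 - v_i * D_i, where D_i is zero except for a
    single 1 in position (i, i)."""
    n = len(v)
    out = []
    for i in range(n):
        D = np.zeros((n, n))
        D[i, i] = 1.0
        out.append(A0 - v[i] * D)
    return out

# Hypothetical 3-cell instance (invented data): each column of the
# transition-intensity matrix A0 sums to zero, so p' = A0 p conserves
# total probability; v holds the detection intensities.
A0 = np.array([[-0.5,  0.2,  0.1],
               [ 0.3, -0.4,  0.2],
               [ 0.2,  0.2, -0.3]])
v = [1.0, 0.5, 0.8]
As = structure_matrices(A0, v)
```

Inspecting cell i thus drains probability from the i-th diagonal entry at rate v_i, while the off-diagonal transition intensities are untouched.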

The probability of the object remaining undetected up to the instant T is Σ_{i=1}^n p_i(T), so that maximization of the detection probability is equivalent to maximization of

I = Σ_{i=1}^n c_i p_i(T),   c_i = -1,   i = 1, 2, ..., n.

It follows from these relations that the problem amounts to the optimization of an l.v.s.s. (cf. (1.1)-(1.4)). It is solved analytically in [7] in the class of generalized controls for the case n = 2, and the difficulties of obtaining a solution for arbitrary n are emphasized. As a typical version of the initial data, we take:

n = 7,   δt_μ = 0.01,   δt_M = 0.26,   T = 5,

      | -3.3    0.12   0.37   0.1    0      0      0    |
      |  1.3   -0.48   0      0.22   0      0.17   0    |
      |  1.2    0     -1.75   0.18   0      0      0.35 |
A_0 = |  0.8    0.15   1.1   -1.9    0.9    0      0    |
      |  0      0      0      1.4   -1.22   0.21   0.23 |
      |  0      0.21   0      0      0.14  -0.64   0.14 |
      |  0      0      0.28   0      0.18   0.26  -0.72 |

v = (1.5, 0.60, 0.56, 0.40, 0.25, 0.12, 0.20)′,   p^(0) = (0.1, 0, 0.2, 0, 0, 0.45, 0.25)′.
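Since the digits of the data above were recovered from a degraded scan, a mechanical consistency check is worthwhile: each column of the transition-intensity matrix A_0 must sum to zero (so that ṗ = A_0 p conserves total probability), and p^(0) must be a probability distribution. A sketch:

```python
import numpy as np

# Data of the 7-cell example as printed (digits recovered from a poor
# scan; the checks below serve as a consistency test of the recovery).
A0 = np.array([
    [-3.3 ,  0.12,  0.37,  0.1 ,  0.0 ,  0.0 ,  0.0 ],
    [ 1.3 , -0.48,  0.0 ,  0.22,  0.0 ,  0.17,  0.0 ],
    [ 1.2 ,  0.0 , -1.75,  0.18,  0.0 ,  0.0 ,  0.35],
    [ 0.8 ,  0.15,  1.1 , -1.9 ,  0.9 ,  0.0 ,  0.0 ],
    [ 0.0 ,  0.0 ,  0.0 ,  1.4 , -1.22,  0.21,  0.23],
    [ 0.0 ,  0.21,  0.0 ,  0.0 ,  0.14, -0.64,  0.14],
    [ 0.0 ,  0.0 ,  0.28,  0.0 ,  0.18,  0.26, -0.72],
])
v  = np.array([1.5, 0.60, 0.56, 0.40, 0.25, 0.12, 0.20])
p0 = np.array([0.1, 0.0, 0.2, 0.0, 0.0, 0.45, 0.25])

# Column sums of an intensity matrix vanish; p0 sums to one.
col_sums = A0.sum(axis=0)
p0_total = p0.sum()
```

Both checks pass for the entries as transcribed, which lends some confidence to the recovered matrix.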


FIG. 1

FIG. 2

The results of applying the ψ-freezing algorithm are illustrated in Figs. 1 and 2 in the form of generalized controls, indicating for each instant the distribution of forces between the inspected cells (the numbers above the lines denote the cell numbers). The locally optimal control, obtained after the first iteration, is shown in Fig. 1. The criterion value for this control is I_1 = -0.718827. Performance of the iteration took approximately 10 minutes on the BESM-6 computer. Notice the typical "waves", which have been observed in many computations. With successive iterations (each of which also took approximately 10 minutes), the criterion values obtained were: I_2 = -0.712857, I_3 = -0.711288, I_4 = -0.710835, I_5 = -0.710276, I_6 = -0.710161, I_7 = -0.710117, I_8 = -0.710096, I_9 = -0.710090, I_10 = -0.710045, I_11 = -0.710071, I_12 = -0.709995. There is only one break in the monotonicity here, I_10 > I_11; it may be explained by computational errors. In Fig. 2 we show the control obtained after 12 iterations. Further iterations change the control values by less than 0.1.


This example reveals that realization of the ψ-freezing algorithm on a modern computer provides an efficient means of optimizing l.v.s.s. in conditions of quite high dimensionality, when sliding optimal control modes are present.

Translated by D. E. Brown

REFERENCES

1. EMEL'YANOV, S. V., Automatic control systems with variable structure (Sistemy avtomaticheskogo upravleniya s peremennoi strukturoi), Nauka, Moscow, 1967.
2. EMEL'YANOV, S. V., et al., Theory of variable-structure systems (Teoriya sistem s peremennoi strukturoi), Nauka, Moscow, 1970.
3. BUYAKAS, V. I., Optimal control of variable-structure systems, Avtomatika i telemekhan., No. 4, 57-68, 1966.
4. BUYAKAS, V. I., Singular solutions of the maximum principle in the optimal control of variable-structure systems, in: Optimal automatic control systems (Optimal'nye sistemy avtomatich. upravleniya), Nauka, Moscow, 1967.
5. DAVIS, B. E., and ELZINGA, D. J., The solution of an optimal control problem in financial modelling, Operat. Res., 19, 1419-1433, 1971.
6. SETHI, S. P., Optimal control of the Vidale-Wolfe advertising model, Operat. Res., 21, No. 4, 998-1013, 1973.
7. DOBBIE, J. M., A two-cell model of search for a moving target, Operat. Res., 22, No. 1, 79-92, 1974.
8. MIELE, A., Determination of extrema of line integrals by Green's theorem, in: Optimization methods with applications to space flight mechanics (collection of Russian translations), Nauka, Moscow, 1965.
9. KRYLOV, I. A., and CHERNOUS'KO, F. L., Algorithm of the method of successive approximations for optimal control problems, Zh. vychisl. Mat. mat. Fiz., 12, No. 1, 14-34, 1972.
10. KALMAN, R., et al., Topics in mathematical system theory, McGraw-Hill, 1968.
11. BELLMAN, R., Dynamic programming, Princeton Univ. Press, 1957.
12. FILIPPOV, A. F., Differential equations with discontinuous right-hand side, Matem. sb., 51, No. 1, 99-128, 1960.
13. GABASOV, R., and KIRILLOVA, F. M., Maximum principle in optimal control theory (Printsip maksimuma v teorii optimal'nogo upravleniya), Nauka i tekhnika, Minsk, 1974.
14. PONTRYAGIN, L. S., et al., Mathematical theory of optimal processes (Matematicheskaya teoriya optimal'nykh protsessov), Nauka, Moscow, 1969.
15. LEONOV, V. V., Study of numerical methods for constructing optimal control in the case of controlled processes described by systems of ordinary differential equations, Probl. kibernetiki, No. 27, 75-86, 1973.
16. LANSDOWNE, Z. F., The theory and applications of generalized linear control processes, Techn. Rept. No. 10, Dept. Operat. Res., Stanford Univ., 1970.
17. WARGA, J., Optimal control of differential and functional equations (Russian translation), Nauka, Moscow, 1977.
18. CHERNOUS'KO, F. L., Some optimal control problems with a small parameter, Prikl. matem. mekhan., 32, No. 1, 15-26, 1968.
19. UTKIN, V. I., Sliding modes and their application in variable-structure systems (Skol'zyashchie rezhimy i ikh primenenie v sistemakh s peremennoi strukturoi), Nauka, Moscow, 1974.
20. GABASOV, R., and KIRILLOVA, F. M., Singular optimal controls (Osobye optimal'nye upravleniya), Nauka, Moscow, 1973.