An application of discrete optimal control theory


AN APPLICATION OF DISCRETE OPTIMAL CONTROL THEORY*

BY E. POLAK ¹

SUMMARY

This paper treats a quasi-optimal regulator which operates in two modes. The first mode is saturating sampled-data relay, whereas the second mode may be either continuous or discrete. A possible method of realizing the proposed control law by means of a special purpose analog computer is discussed.

I. INTRODUCTION

It is well known that some of the assumptions made in a theoretical analysis of an optimal regulator are, in practice, not valid when the system state is near the desired state. Among such assumptions are the absence of noise in the signals, absolute accuracy of measurement and computation, absence of a constantly present disturbing force, etc. In view of this, one is frequently willing to settle for a suboptimal system which operates in two modes. The first mode is used when the error is large and is in some sense close to the optimal mode of operation, while the second mode is specifically designed to cope with the situation prevailing near the desired state.

In a series of papers (1, 2, 3),² Desoer and Wing have shown that a simple analog computer, with few active elements, may be used to implement time optimal control in PAM regulators with force saturation. Such computers are very attractive in situations where speed of computing and physical size are dominant factors, and it is therefore interesting to examine the possibility of using such computers in a two-mode scheme.

This paper introduces an analog computer different from the one of Desoer-Wing for the time optimal control of PAM regulators with saturation, and shows how this may be modified for use in a two-mode system. The first mode is sampled-data relay; the second mode is left undefined, but could be PAM, PWM, etc., with linear compensation. It is shown that in the first mode, the system goes to a small closed sphere about the origin in the same number of sampling periods as an optimal PAM control would require to take it to the origin itself.

Finally, it should be pointed out that W. Nelson (4) has shown sampled-data relay systems to be uncontrollable with respect to the origin or a set, and thus one could not define a time optimal control problem for such systems.

* This research was supported by the National Science Foundation under Grant No. G-15965.
¹ Electronics Research Laboratory, University of California, Berkeley, Calif.
² The boldface numbers in parentheses refer to the references appended to this paper.

Aug., 1963.] DISCRETE OPTIMAL CONTROL THEORY

II. DESCRIPTION OF THE SYSTEM

The systems considered consist of a linear plant, a power modulator and a computer (Fig. 1). These regulators, for which the desired state $x_d$ is the origin 0, are operated in two modes: outside a closed spherical neighborhood of the state space origin, they are sampled-data saturating relay systems; inside this neighborhood, they operate in some linear continuous or discrete mode. This paper will concern itself only with the operation of these systems outside the above-mentioned neighborhood.

FIG. 1. Block diagram of system (computer, power modulator, plant).

The Plant

It will be supposed that the plant may be described by a linear differential equation of the form

$$\dot{x} = Bx + ud \tag{1}$$

where $x = \operatorname{col}(x_1, x_2, \ldots, x_n)$, $B = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ is a constant $n \times n$ matrix in which $\operatorname{Re} \lambda_i < 0$, $i = 1, 2, \ldots, n$, $u$ is the scalar output of the power modulator and $d = \operatorname{col}(d_1, d_2, \ldots, d_n)$ is a constant vector.

The Computer

This is a device with the vector input $x$ and three scalar outputs: $f(x)$, $f^*(x)$ and $f^\circ(x)$. The functions $f$ and $f^*$ will be called control laws, while the function $f^\circ$ will be referred to as the mode law. Let $\Omega$ be a sphere of radius $r$ such that $\Omega = \{x : \|x\| \le r\}$, where $\|x\|^2 = \sum_{i=1}^{n} x_i^2$ and $r$ will be specified later.

The mode law may now be specified as follows: $f^\circ = \operatorname{sgn}(\|x\| - r)$.

The control law $f$ will be determined in this paper, while the control law $f^*$ will be left undefined. It is assumed that $f^*$ is chosen to give satisfactory system performance in the vicinity of the origin.

E. POLAK [J. F. I.

The Power Modulator

Let $T$ be the sampling period for relay sampled-data mode operation, and let $M$ be the value at which the power modulator saturates. Then the power modulator output is defined by

$$u = \tfrac{1}{2} M \bigl(1 + f^\circ(x_k)\bigr)\operatorname{sgn} f(x_k) + \tfrac{1}{2} M \bigl(1 - f^\circ(x_k)\bigr)\operatorname{sat} f^*(x), \qquad kT < t \le (k+1)T, \quad x_k = x(kT), \quad k = 1, 2, 3, \ldots \tag{2}$$
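The two-mode switching in Eq. 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Euclidean norm is assumed for $\|x\|$, and the function names `mode_law` and `modulator_output`, as well as the example laws passed in for $f$ and $f^*$, are hypothetical and not taken from the paper.

```python
import math


def sgn(v):
    """Signum: -1, 0 or +1."""
    return (v > 0) - (v < 0)


def sat(v):
    """Unit saturation: clip v to the interval [-1, 1]."""
    return max(-1.0, min(1.0, v))


def mode_law(x, r):
    """Mode law f0(x) = sgn(||x|| - r); the Euclidean norm is an assumption."""
    return sgn(math.sqrt(sum(c * c for c in x)) - r)


def modulator_output(x_k, x_t, f, f_star, r, M):
    """Power modulator output in the form of Eq. 2.

    x_k    -- sampled state x(kT), used by the mode law and the relay law f
    x_t    -- current state x(t), used by the linear-mode law f_star
    f      -- relay-mode control law (hypothetical placeholder)
    f_star -- linear-mode control law (left undefined in the paper)
    """
    f0 = mode_law(x_k, r)
    relay_term = 0.5 * M * (1 + f0) * sgn(f(x_k))
    linear_term = 0.5 * M * (1 - f0) * sat(f_star(x_t))
    return relay_term + linear_term
```

Outside the sphere the relay term alone is active and the output is $\pm M$; inside it only the saturated linear term contributes.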

It is clear that one of the two terms in the sum above is zero: the first when $x_k \in \Omega$, and the second when $x_k \notin \Omega$. When $x_k \notin \Omega$ the system clearly operates in the saturating sampled-data relay mode and the power modulator output is then

$$u = M \operatorname{sgn} f(x_k) \quad \text{for } kT < t \le (k+1)T, \quad k = 1, 2, \ldots \tag{2a}$$

III. STATEMENT OF THE PROBLEM

Solving Eq. 1 with Eq. 2a, it is found that for $x \notin \Omega$ the behavior of the system is described by the state transition equation

$$x_{k+1} = A[x_k - f_k r_1] \tag{3}$$

where $f_k = \operatorname{sgn} f(x_k)$, $r_1 = -M \int_0^T \exp(-tB)\, d \, dt$, and $A = \exp TB$.
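For a diagonal $B$ with real entries, the quantities in Eq. 3 have closed forms, $\alpha_i = e^{\lambda_i T}$ and $r_{1,i} = -M d_i (1 - e^{-\lambda_i T})/\lambda_i$. The following sketch evaluates them and performs one relay-mode step. It assumes real, nonzero $\lambda_i$; the helper names `discretize` and `relay_step` are illustrative only.

```python
import math


def discretize(lams, d, T, M):
    """Return the diagonal of A = exp(TB) and the vector r_1 of Eq. 3.

    Assumes B = diag(lams) with real, nonzero entries, so that
    integral_0^T exp(-lam * t) dt = (1 - exp(-lam * T)) / lam.
    """
    A = [math.exp(lam * T) for lam in lams]
    r1 = [-M * di * (1.0 - math.exp(-lam * T)) / lam
          for lam, di in zip(lams, d)]
    return A, r1


def relay_step(x, f_k, A, r1):
    """One step of Eq. 3, x_{k+1} = A [x_k - f_k r_1], for diagonal A."""
    return [a * (xi - f_k * ri) for a, xi, ri in zip(A, x, r1)]
```

The step can be cross-checked against the exact solution of Eq. 1 over one period with the input held at $u = M f_k$, namely $x_i(T) = e^{\lambda_i T} x_i(0) + M f_k d_i (e^{\lambda_i T} - 1)/\lambda_i$.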

It is now necessary to establish the relationship between systems described by Eq. 3 and PAM regulators. In the PAM mode, the state transition equation is found to be (2, 3)

$$x_{k+1} = A[x_k - g_k r_1], \qquad |g_k| \le 1 \tag{4}$$

where $A$ and $r_1$ are as defined above and the power modulator output is given by

$$u = M g_k \quad \text{for } kT < t \le (k+1)T. \tag{4a}$$

The time-optimal regulator problem for systems described by Eq. 4 has been solved by Desoer and Wing (1, 2, 3). This problem may be stated as follows: given a system as described in Eq. 4 with the restriction that $|g_k| \le 1$, $k = 0, 1, 2, \ldots$, it is required to find a control law $g(x)$ so that if $g_k = g(x_k)$ then, given any initial state $x_0$, the system is taken from $x_0$ to the origin in the minimum number of steps possible. For this problem to have a solution, it is necessary that the plant be controllable (3). This is achieved by making the following two assumptions:

(i) let $r_k = A^{(1-k)} r_1$; it will be assumed that $r_1, r_2, \ldots, r_n$ are independent;

(ii) it will be assumed that $A = \operatorname{diag}(\alpha_1, \alpha_2, \ldots, \alpha_n)$ where $|\alpha_i| < 1$, $i = 1, 2, \ldots, n$. This assumption follows from the assumed form of $B$.

Return to the systems described in Eq. 3. With each initial state of this system it is possible to associate a number $N(x_0)$, $N = 0, 1, 2, \ldots$, which is the number of steps required for the time optimal PAM system described by Eq. 4 to go from $x_0$ to 0. In this paper the problem is related to the time-optimal PAM regulator problem in the following way: given an initial state $x_0 \notin \Omega$, it is required to take the state of system (3) from $x_0$ into the sphere $\Omega$ in no more steps than are required to steer system (4) from the same state $x_0$ to the origin by means of time-optimal PAM control. This problem may be formulated more precisely, thus:

Let $r = \bigl\| r_1 + \sum_{k=1} A^k r_1 \bigr\|$ and let $\Omega = \{x : \|x\| \le r\}$; given an arbitrary initial state $x_0 \notin \Omega$ and the minimum number, $N$, of steps in which this state can be steered into the origin by PAM control, it is required to find a control law $f(x)$ such that the solution of Eq. 3, $x_k$, be inside the sphere $\Omega$ for some $k \le N$.

IV. SUMMARY OF RESULTS FOR TIME OPTIMAL PAM REGULATORS

It is proposed to solve the problem stated by modifying a computer for the implementation of time optimal control in a PAM regulator described by Eq. 4. Therefore, it will be necessary to summarize and enlarge upon the known theoretical results obtained by Desoer and Wing in a form somewhat different from the one given by them.

As shown in (2, 3), there exists a critical hypersurface $C$ which divides the state space into two separate parts, so that any line parallel to the vector $r_1$ intersects it only once. As a result, any state $x_k$ may be expressed uniquely in the form $x_k = c_k + \phi(x_k) r_1$, $-\infty < \phi(x_k) < +\infty$, where $c_k \in C$. It was also shown that the control law $g(x_k) = \operatorname{sat} \phi(x_k)$ is optimal and that the critical hypersurface is made up of $(n-1)$-dimensional parallelepipeds whose edges are parallel to some of the vectors $r_k$, $k = 1, 2, 3, \ldots$.
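The role of the vectors $r_k = A^{1-k} r_1$ can be checked numerically. By Eq. 4, a state of the form $x_0 = \sum_{k=1}^{N} g_{k-1} r_k$ with $|g_{k-1}| \le 1$ is steered to the origin in $N$ steps by applying $g_0, g_1, \ldots, g_{N-1}$ in order. The sketch below assumes a diagonal $A$ and does not construct the critical hypersurface $C$ or $\phi$; the function names are illustrative only.

```python
def r_vectors(A, r1, count):
    """Return [r_1, ..., r_count] with r_k = A^(1-k) r_1, for diagonal A."""
    return [[ri * a ** (1 - k) for a, ri in zip(A, r1)]
            for k in range(1, count + 1)]


def pam_step(x, g, A, r1):
    """One step of Eq. 4, x_{k+1} = A [x_k - g r_1], with |g| <= 1."""
    assert abs(g) <= 1.0
    return [a * (xi - g * ri) for a, xi, ri in zip(A, x, r1)]


def state_from_controls(gs, A, r1):
    """Build x_0 = sum_k gs[k-1] * r_k, a state reachable by the sequence gs."""
    rs = r_vectors(A, r1, len(gs))
    return [sum(g * r[i] for g, r in zip(gs, rs)) for i in range(len(r1))]


def run_pam(x0, gs, A, r1):
    """Apply the control sequence gs to Eq. 4 and return the final state."""
    x = list(x0)
    for g in gs:
        x = pam_step(x, g, A, r1)
    return x
```

For example, with `A = [0.5, 0.25]`, `r1 = [1.0, 1.0]` and `gs = [0.3, -1.0, 0.7]`, `state_from_controls` returns a state that `run_pam` drives to the origin (up to rounding) in three steps.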

V. ALGORITHMS FOR THE CONSTRUCTION OF THE CRITICAL SURFACE

It is necessary to divide the systems considered into two classes. The first is characterized by a matrix $A = \operatorname{diag}(\alpha_1, \ldots, \alpha_n)$ where $1 > \alpha_1 > \alpha_2 > \cdots > \alpha_n > 0$, and the second consists of the remainder.

Algorithm for Systems Class I ³

The critical hypersurface for this case is the collection of all the $(n-1)$-dimensional parallelepipeds $P^{+}_{i_1 i_2 \cdots i_{n-1}}$ ($i_1 < i_2 < \cdots < i_{n-1}$ integers) and $P^{-}_{i_1 i_2 \cdots i_{n-1}}$ such that

$$P^{+}_{i_1 i_2 \cdots i_{n-1}} = \Bigl\{ x : x = \sum_{k=i_{n-2}+1}^{i_{n-1}} r_k - \sum_{k=i_{n-3}+1}^{i_{n-2}} r_k + \cdots + (-1)^{n} \sum_{k=2}^{i_1} r_k - \beta_{i_{n-1}} r_{i_{n-1}} + 2\beta_{i_{n-2}} r_{i_{n-2}} - \cdots + (-1)^{n+1} 2\beta_{i_1} r_{i_1}, \ \text{where } i_1 < i_2 < \cdots < i_{n-1} \text{ and } 0 \le \beta_{i_j} \le 1, \ j = 1, 2, \ldots, n-1 \Bigr\} \tag{5}$$

and

$$P^{-}_{i_1 i_2 \cdots i_{n-1}} = \Bigl\{ x : x = -\sum_{k=i_{n-2}+1}^{i_{n-1}} r_k + \sum_{k=i_{n-3}+1}^{i_{n-2}} r_k - \cdots + (-1)^{n+1} \sum_{k=2}^{i_1} r_k + \beta_{i_{n-1}} r_{i_{n-1}} - 2\beta_{i_{n-2}} r_{i_{n-2}} + \cdots + (-1)^{n} 2\beta_{i_1} r_{i_1}, \ \text{where } i_1 < i_2 < \cdots < i_{n-1} \text{ and } 0 \le \beta_{i_j} \le 1, \ j = 1, 2, \ldots, n-1 \Bigr\}. \tag{6}$$

The vector

$$a^{+}_{i_1 i_2 \cdots i_{n-1}} = \sum_{k=i_{n-2}+1}^{i_{n-1}} r_k - \cdots + (-1)^{n} \sum_{k=2}^{i_1} r_k$$

as in Eq. 5 and the vector

$$a^{-}_{i_1 i_2 \cdots i_{n-1}} = -\sum_{k=i_{n-2}+1}^{i_{n-1}} r_k + \cdots + (-1)^{n+1} \sum_{k=2}^{i_1} r_k$$

as in Eq. 6 will be referred to as reference corners for the parallelepipeds $P^{+}_{i_1 i_2 \cdots i_{n-1}}$ and $P^{-}_{i_1 i_2 \cdots i_{n-1}}$, respectively. It is clear that for each value of $i_{n-1} = N$ ($N = n, n+1, \ldots$) the parallelepipeds specified by Eqs. 5 and 6 can be enumerated. In fact there are a total of $2\binom{N-2}{n-2}$ such parallelepipeds.

³ A form of this algorithm is given in Ref. 2.

Algorithm for the General Case

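This algorithm was not derived by Desoer and Wing, but it may be deduced from the results of (3). Let $v$ be the normal to the $(n-1)$-dimensional plane

$$\Bigl\{ x : x = \sum_{k=1}^{n-1} \zeta_k r_k, \ -\infty \le \zeta_k \le +\infty \Bigr\},$$

that is, $\langle v, r_k \rangle = 0$ for $k = 1, 2, \ldots, n-1$. Let $s_k = r_k$ for $k = 1, 2, \ldots, n$, and let $s_k = \operatorname{sgn}[\,\ldots\,]$ [remainder of this definition illegible in the source]. Then

$$P^{+}_{i_1 i_2 \cdots i_{n-1}} = \Bigl\{ x : x = \sum_{k=i_{n-2}+1}^{i_{n-1}} s_k - \sum_{k=i_{n-3}+1}^{i_{n-2}} s_k + \cdots + (-1)^{n} \sum_{k=2}^{i_1} s_k - \beta_{i_{n-1}} s_{i_{n-1}} + 2\beta_{i_{n-2}} s_{i_{n-2}} - \cdots + (-1)^{n+1} 2\beta_{i_1} s_{i_1}, \ 0 \le \beta_{i_j} \le 1, \ j = 1, 2, \ldots, n-1 \Bigr\} \tag{7}$$

$$P^{-}_{i_1 i_2 \cdots i_{n-1}} = \Bigl\{ x : x = -\sum_{k=i_{n-2}+1}^{i_{n-1}} s_k + \sum_{k=i_{n-3}+1}^{i_{n-2}} s_k - \cdots + (-1)^{n+1} \sum_{k=2}^{i_1} s_k + \beta_{i_{n-1}} s_{i_{n-1}} - 2\beta_{i_{n-2}} s_{i_{n-2}} + \cdots + (-1)^{n} 2\beta_{i_1} s_{i_1}, \ 0 \le \beta_{i_j} \le 1, \ j = 1, 2, \ldots, n-1 \Bigr\}. \tag{8}$$

The vectors

$$a^{+}_{i_1 i_2 \cdots i_{n-1}} = \sum_{k=i_{n-2}+1}^{i_{n-1}} s_k - \cdots + (-1)^{n} \sum_{k=2}^{i_1} s_k$$

as in Eq. 7 and

$$a^{-}_{i_1 i_2 \cdots i_{n-1}} = -\sum_{k=i_{n-2}+1}^{i_{n-1}} s_k + \cdots + (-1)^{n+1} \sum_{k=2}^{i_1} s_k$$

as in Eq. 8 will again be called reference corners for the corresponding parallelepipeds. As in the previous case, it is evident that for each value of $i_{n-1} = N$ ($N = n, n+1, \ldots$) there are $2\binom{N-2}{n-2}$ such parallelepipeds.

VI. PROPOSED CONTROL LAW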

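It now becomes convenient to introduce the $n$-dimensional cylinders

$$S^{+}_{i_1 i_2 \cdots i_{n-1}} = \bigl\{ c + \phi r_1 : c \in P^{+}_{i_1 i_2 \cdots i_{n-1}}, \ -\infty \le \phi \le +\infty \bigr\} \tag{9}$$

and

$$S^{-}_{i_1 i_2 \cdots i_{n-1}} = \bigl\{ c + \phi r_1 : c \in P^{-}_{i_1 i_2 \cdots i_{n-1}}, \ -\infty \le \phi \le +\infty \bigr\}. \tag{10}$$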


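Suppose $x \in S^{+}_{i_1 i_2 \cdots i_{n-1}}$ ($x \in S^{-}_{i_1 i_2 \cdots i_{n-1}}$) and let $c \in P^{+}_{i_1 i_2 \cdots i_{n-1}}$ ($P^{-}_{i_1 i_2 \cdots i_{n-1}}$) be such that $x = c + \phi r_1$, $-\infty < \phi < +\infty$; then one may define

$$\phi(x) = \phi. \tag{11}$$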

Let

$$\xi(x) = \operatorname{col}(0, \xi_2, \ldots, \xi_{i_{n-1}}), \qquad \xi_k \in \{-1, +1\}, \quad k = 2, \ldots, i_{n-1} \tag{12}$$

where $\sum_k \xi_k r_k = a^{+}_{i_1 i_2 \cdots i_{n-1}}$ ($a^{-}_{i_1 i_2 \cdots i_{n-1}}$) is the reference corner for the parallelepiped $P^{+}_{i_1 i_2 \cdots i_{n-1}}$ ($P^{-}_{i_1 i_2 \cdots i_{n-1}}$) in the relevant form indicated in Eqs. 5, 6, 7 and 8. The control law may now be stated in the following concise form:

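$$f(x_k) = \begin{cases} \operatorname{sgn} \phi(x_k) & \text{if } |\phi(x_i)| \ge 1 \ \ \forall\, i = 0, 1, \ldots, k \\ \langle \xi(x_j), z_{k-j+1} \rangle & \text{if } |\phi(x_i)| \ge 1 \ \ \forall\, i = 0, 1, \ldots, j-1 < k \ \text{and} \ |\phi(x_j)| < 1, \ k \ge j \\ 0 & \text{for } (k - j) \ge i_{n-1} = N(x_j) \end{cases} \tag{13}$$

where $z_m = \operatorname{col}(0, 0, \ldots, 1, \ldots, 0)$ is an $i_{n-1} \times 1$ column vector with a one in the $m$th place and zeros in all the others.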
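The case structure of Eq. 13 can be written out as a small selection routine. This is a sketch under stated assumptions: the values $\phi(x_0), \ldots, \phi(x_k)$ and the vector $\xi(x_j)$ are taken as given inputs (computing them requires the critical surface, which is not constructed here), and the name `relay_command` is hypothetical.

```python
def sgn(v):
    """Signum: -1, 0 or +1."""
    return (v > 0) - (v < 0)


def relay_command(phis, xi):
    """Select f(x_k) following the three cases of Eq. 13.

    phis -- [phi(x_0), ..., phi(x_k)]; entries after the first one with
            magnitude below 1 are ignored (the system then runs open loop)
    xi   -- the vector xi(x_j) of Eq. 12 as a list of length i_{n-1};
            it is only consulted once some |phi(x_j)| < 1 has occurred
    """
    k = len(phis) - 1
    # j is the first sampling instant at which |phi(x_j)| < 1, if any.
    j = next((i for i, p in enumerate(phis) if abs(p) < 1), None)
    if j is None:
        return sgn(phis[k])        # first case: relay on the sign of phi
    m = k - j + 1                  # 1-based position selected by z_{k-j+1}
    if m <= len(xi):
        return xi[m - 1]           # second case: stored sequence read-out
    return 0                       # third case: (k - j) >= i_{n-1}
```

With `xi = [0, 1, -1]`, the call `relay_command([2.5, -1.2], xi)` falls in the first case, while `relay_command([2.5, 0.4, 9.0, 9.0], xi)` reads out the third stored entry.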

Proof

Let $x_j \in S^{+}_{i_1 i_2 \cdots i_{n-1}}$ ($\in S^{-}_{i_1 i_2 \cdots i_{n-1}}$) be such that $|\phi(x_j)| < 1$. Then, using Eqs. 5, 6, 7 or 8, whichever is appropriate, the state $x_j$ may be expressed in the canonical form

$$x_j = \sum_{l=1}^{i_{n-1}} \eta_{l-1+j}\, r_l, \qquad \eta_j = \phi(x_j);$$

$$-1 \le \eta_{l-1+j} \le 1 \ \ \forall\, l \in \{i_1, \ldots, i_{n-1}\}, \qquad |\eta_{l-1+j}| = 1 \ \ \forall\, l \notin \{i_1, \ldots, i_{n-1}\}, \ l > 1. \tag{14}$$

As shown by Desoer and Wing, for such an initial state $x_j$ the minimum number of sampling periods, $N(x_j)$, in which the state $x_j$ can be taken by system (4) to the origin, is $i_{n-1}$.


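Compare the response of system (3) with the control law (13) to that of system (4) with the optimal control law stated in Section IV, from the same initial state $x_0$. Let the $j$th ($j = 0, 1, 2, \ldots$) sampling instant be the first time when $|\phi(x_j)| < 1$. Clearly, for $k = 0, 1, \ldots, j$, the states $x_k$ as given by Eqs. 3 and 4 are the same. The state $x_j$ is in some $S^{+}_{i_1 i_2 \cdots i_{n-1}}$ ($S^{-}_{i_1 i_2 \cdots i_{n-1}}$) and system (4) will take it to the origin in $i_{n-1}$ sampling periods, which is the minimum possible. Now, let us find $x_{j+i_{n-1}}$ for system (3). With the aid of the appropriate equation (7, 8, 9 or 10), $x_j$ may be written in the form

$$x_j = \sum_{k=1}^{i_{n-1}} \xi_k r_k - 2 \sum_{l=1}^{n-2} \xi_{i_l} \beta_{i_l} r_{i_l} - \xi_{i_{n-1}} \beta_{i_{n-1}} r_{i_{n-1}} + \phi(x_j) r_1,$$

$$|\phi(x_j)| < 1, \qquad \xi_k \in \{-1, +1\}, \qquad 0 \le \beta_{i_l} \le 1, \quad l = 1, 2, \ldots, n-1. \tag{15}$$

Applying Eqs. 3 and 13, one may compute $x_{j+1}, x_{j+2}, \ldots, x_{j+i_{n-1}}$ as shown below.

$$x_{j+1} = A \Bigl[ \sum_{k=1}^{i_{n-1}} \xi_k r_k - 0 \cdot r_1 \Bigr] - A \Bigl[ 2 \sum_{l=1}^{n-2} \xi_{i_l} \beta_{i_l} r_{i_l} + \xi_{i_{n-1}} \beta_{i_{n-1}} r_{i_{n-1}} - \phi(x_j) r_1 \Bigr]$$

$$\phantom{x_{j+1}} = \sum_{k=1}^{i_{n-1}-1} \xi_{k+1} r_k - A \Bigl[ 2 \sum_{l=1}^{n-2} \xi_{i_l} \beta_{i_l} r_{i_l} + \xi_{i_{n-1}} \beta_{i_{n-1}} r_{i_{n-1}} - \phi(x_j) r_1 \Bigr] \tag{16a}$$

$$x_{j+2} = A \Bigl[ \sum_{k=1}^{i_{n-1}-1} \xi_{k+1} r_k - \xi_2 r_1 \Bigr] - A^2 \Bigl[ 2 \sum_{l=1}^{n-2} \xi_{i_l} \beta_{i_l} r_{i_l} + \xi_{i_{n-1}} \beta_{i_{n-1}} r_{i_{n-1}} - \phi(x_j) r_1 \Bigr] \tag{16b}$$

$$\phantom{x_{j+2}} = \sum_{k=1}^{i_{n-1}-2} \xi_{k+2} r_k - A^2 \Bigl[ 2 \sum_{l=1}^{n-2} \xi_{i_l} \beta_{i_l} r_{i_l} + \xi_{i_{n-1}} \beta_{i_{n-1}} r_{i_{n-1}} - \phi(x_j) r_1 \Bigr] \tag{16c}$$

$$\vdots$$

$$x_{j+i_{n-1}} = 0 - A^{i_{n-1}} \Bigl[ 2 \sum_{l=1}^{n-2} \xi_{i_l} \beta_{i_l} r_{i_l} + \xi_{i_{n-1}} \beta_{i_{n-1}} r_{i_{n-1}} - \phi(x_j) r_1 \Bigr]. \tag{16d}$$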
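The telescoping in Eqs. 16a to 16d rests only on the identity $A r_k = r_{k-1}$, so it can be checked numerically for any bracketed deviation term. In the sketch below the vector `D` stands for that bracketed term (an arbitrary vector here, not computed from a critical surface), $A$ is assumed diagonal, and the function name is illustrative only.

```python
def open_loop_residual(A, r1, xi, D):
    """Run Eq. 3 open loop with the stored sequence xi, starting from
    x_j = sum_k xi[k-1] * r_k - D, where r_k = A^(1-k) r_1 (diagonal A).

    Returns (final_state, expected) with expected = -A^N D and N = len(xi),
    which is the form of Eq. 16d.
    """
    n = len(r1)
    steps = len(xi)
    # Starting state in the form of Eq. 15, with D as the deviation term.
    x = [sum(xi[k - 1] * r1[i] * A[i] ** (1 - k) for k in range(1, steps + 1))
         - D[i] for i in range(n)]
    # Apply x <- A (x - f r_1) with f read from the stored sequence.
    for f in xi:
        x = [A[i] * (x[i] - f * r1[i]) for i in range(n)]
    expected = [-(A[i] ** steps) * D[i] for i in range(n)]
    return x, expected
```

Since every $|\alpha_i| < 1$, the residual $-A^{N} D$ shrinks as the number of open-loop steps $N$ grows.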


It will be observed that, since $A = \operatorname{diag}(\alpha_1, \alpha_2, \ldots, \alpha_n)$ with $\alpha_i = \exp \lambda_i T$ and $|\alpha_i| < 1$ for $i = 1, 2, \ldots, n$, $\|Ax\| \le \|x\|$ for all $x$. It follows that

where Hence

[A <--:(2 ~ ~
r~ + 2 ~

A~r:

. (17)

This completes the proof. Clearly, if one is allowed to fix the sampling period, one may make $\Omega$ as small a sphere as desired.

VII. THE COMPUTER ⁴

The computer is to be made up of cells connected in parallel and a memory unit as shown in Fig. 2. Each cell corresponds to some $S^{+}_{i_1 i_2 \cdots i_{n-1}}$ (or $S^{-}_{i_1 i_2 \cdots i_{n-1}}$) and performs the following operations: it resolves the vector $x_k - a^{+}_{i_1 i_2 \cdots i_{n-1}}$, $k = 0, 1, 2, \ldots$, into its components along $r_1, r_{i_1}, r_{i_2}, \ldots, r_{i_{n-1}}$ and by measuring the magnitude and direction of these components determines if $x_k$ is in the corresponding cylinder or not. If not, then there is no output from the cell. If the state is in the cylinder in question, then the cell determines $\phi(x_k)$ and, if $|\phi(x_k)| \ge 1$, gives the command $f_k = \operatorname{sat} \phi(x_k)$ to the modulator; if $|\phi(x_k)| < 1$, then the cell gives a command to the memory unit to read out into the modulator the stored sequence $\{\xi_k\}_{k=1}^{i_{n-1}}$ at clocked intervals. It is clear that for optimal PAM control, the memory unit would not be required and the control $g_k$ would simply be $\operatorname{sat} \phi(x_k)$. The computing cells can be built mostly from passive elements.

FIG. 2. Block diagram of the computer (input x; parallel cells $S^{+}_{i_1 i_2 \cdots i_{n-1}}$ and $S^{-}_{i_1 i_2 \cdots i_{n-1}}$; memory; output to modulator).

⁴ J. Eaton, of the University of California, has described (5) a considerably simpler computer for time optimal PAM control. This computer could also be modified for quasi-optimal control with the additional advantage that a memory unit would not be required.

VIII. CONCLUSION

This paper is an example of how optimal control theory can be applied to produce a quasi-optimal system in an engineering situation in which oversimplified assumptions do not always hold. Furthermore, it demonstrates how a special purpose analog computer may be used in the realization of such control. In those cases where running the system open loop for the last $i_{n-1}$ sampling periods is undesirable, one may simulate the system and compare the predicted state with the observed state. If these differ by more than a predetermined amount, the memory output is stopped and the whole computational operation starts over again.

REFERENCES

(1) C. A. DESOER AND J. WING, "An Optimal Strategy for a Saturating Sampled-Data System," IRE Trans. Automatic Control, Vol. AC-6, pp. 5-15 (1961).
(2) C. A. DESOER AND J. WING, "A Minimal Time Discrete System," IRE Trans. Automatic Control, Vol. AC-6, pp. 111-125 (1961).
(3) C. A. DESOER AND J. WING, "The Minimal Time Regulator Problem for Linear Sampled-Data Systems: General Theory," JOUR. FRANKLIN INST., Vol. 272, pp. 208-228 (1961).
(4) W. L. NELSON, "Pulse Width Control of Sampled-Data Systems," Ph.D. Thesis, Columbia University, 1959.
(5) J. H. EATON, "On Line Optimal Control of a Discrete System." To be published.