Nonlinear state feedback for ℓ1 optimal control


Systems & Control Letters 21 (1993) 265-270 North-Holland


Jeff S. Shamma
Department of Aerospace Engineering and Engineering Mechanics, The University of Texas at Austin, Austin, TX 78712, USA

Received 18 November 1992
Revised 16 February 1993

Abstract: This paper considers ℓ1 optimal control problems with full state feedback. In contrast to H∞ optimal control, previous work has shown that linear ℓ1 optimal controllers can be dynamic and of arbitrarily high order. However, this paper shows that continuous memoryless nonlinear state feedback performs as well as dynamic linear state feedback. The derivation, which is nonconstructive, relies on concepts from viability theory.

Keywords: Disturbance rejection; nonlinear compensation; ℓ1 optimal control; state feedback; viability theory.

1. Introduction

This paper investigates the structure of ℓ1 optimal control problems with full state feedback. It has been shown recently [6] that even in the case of full state feedback, optimal and near-optimal linear controllers can be dynamic and of arbitrarily high order. This is in contrast to H∞ near-optimal control (cf. [7] and references therein), for which full state feedback controllers can be static. This property ultimately reveals an underlying separation structure for H∞ optimal control with output feedback. In light of the results of [6] (see also [8]), it seems unlikely that such a separation property holds for linear ℓ1 optimal controllers.

In this paper, we follow on the work of [6] and consider the utility of nonlinear state feedback. We show that continuous nonlinear static state feedback performs as well as dynamic linear state feedback. Thus, the admission of nonlinear feedback removes the necessity of controller dynamics.

The derivation, which is nonconstructive, relies on concepts from viability theory [1-3]. The main idea is to show that disturbance rejection with a known bounded disturbance set is equivalent to restricting the plant state to evolve in a particular bounded region. In the terminology of viability theory, this bounded region is viable for the closed-loop dynamics with disturbances. Viability theory gives conditions for the existence of state feedback leading to viable trajectories. This feedback is then scaled to assure the desired performance for all disturbances.

The remainder of this paper is organized as follows. Section 2 reviews some material from ℓ1 optimal control, set-valued analysis, and viability theory. Section 3 contains the main results. Section 4 presents an illustrative example. Finally, Section 5 gives some concluding remarks.

2. Preliminaries

First, we review some notation regarding ℓ1 optimal control and operator norms (cf. [5, 10]). Let $\mathbb{Z}$ denote the set of integers, $\mathbb{Z}^+$ the set of nonnegative integers, and $\mathbb{R}^+$ the set of nonnegative real numbers. For $x \in \mathbb{R}^n$, let $x_i$ denote the $i$th component of $x$, and define

$$|x| \overset{\text{def}}{=} \max_i |x_i|.$$

Correspondence to: J.S. Shamma, Department of Aerospace Engineering and Engineering Mechanics, The University of Texas at Austin, Austin, TX 78712, USA * Supported by AFOSR Research Initiation grant #92-70 subcontracted to the Research and Development Laboratory and NSF grants # ECS-9296074 and # ECS-9258005.

Let $\ell_\infty(\mathbb{Z}^+)$ denote the set of bounded one-sided sequences in $\mathbb{R}^n$. For $f = \{f(0), f(1), f(2), \ldots\} \in \ell_\infty(\mathbb{Z}^+)$, define

$$\|f\| \overset{\text{def}}{=} \sup_{k \in \mathbb{Z}^+} |f(k)|.$$
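As a small numerical illustration of the two norms just defined, the following Python sketch evaluates the vector max-norm and the sequence norm on a finite truncation of a sequence, where the supremum reduces to a maximum. The function names `vec_norm` and `seq_norm` are illustrative and do not appear in the paper.

```python
def vec_norm(x):
    """Max-norm |x| = max_i |x_i| of a vector given as a list of floats."""
    return max(abs(xi) for xi in x)


def seq_norm(f):
    """Norm ||f|| = sup_k |f(k)| of a sequence of vectors.

    Assumption: f is a finite truncation (a list of vectors), so the
    supremum is attained and equals a maximum over the stored samples.
    """
    return max(vec_norm(fk) for fk in f)


if __name__ == "__main__":
    f = [[1.0, 0.0], [0.0, -2.0], [0.5, 0.5]]
    print(vec_norm([1.0, -3.0, 2.0]))
    print(seq_norm(f))
```

For a genuinely infinite sequence the truncated value is only a lower bound on the true norm.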

0167-6911/93/$06.00 © 1993 - Elsevier Science Publishers B.V. All rights reserved

Let $\ell_1(\mathbb{Z}^+)$ denote the standard space of absolutely summable one-sided sequences in $\mathbb{R}$. A causal operator $H: \ell_\infty(\mathbb{Z}^+) \to \ell_\infty(\mathbb{Z}^+)$ is called finite-gain-stable if

$$\|H\| \overset{\text{def}}{=} \sup_{f \in \ell_\infty(\mathbb{Z}^+),\ f \neq 0} \frac{\|Hf\|}{\|f\|} < \infty.$$

Similarly, let $\ell_\infty(\mathbb{Z})$ denote the set of bounded two-sided sequences in $\mathbb{R}^n$ with associated vector norm and induced operator norm, both denoted by $\|\cdot\|$.

In case $H$ corresponds to a linear time-invariant system, it is characterized by its impulse response $\{h(0), h(1), h(2), \ldots\}$ with $h(k) \in \mathbb{R}^{m \times n}$, and the input-output relationship $y = Hu$ is given by the convolution

$$y(k) = \sum_{i=0}^{k} h(i)\,u(k-i).$$

For $H$ finite-gain-stable, let $H^{\mathrm{ext}}: \ell_\infty(\mathbb{Z}) \to \ell_\infty(\mathbb{Z})$ denote the extended system acting on two-sided sequences with input-output relationship $y = H^{\mathrm{ext}}u$ given by

$$y(k) = \sum_{i=0}^{\infty} h(i)\,u(k-i).$$

Note the induced norm equality $\|H\| = \|H^{\mathrm{ext}}\|$.

We now collect some concepts and facts from set-valued analysis and viability theory following [1-3].

Given sets $X$ and $Y$, let $F: X \rightsquigarrow Y$ denote the set-valued map $F$, i.e., a mapping from points $x \in X$ to subsets $F(x) \subset Y$.

Definition 2.1 (Aubin [1, p. 56]). Let $X$ and $Y$ be metric spaces. A set-valued map $F: X \rightsquigarrow Y$ is called lower semicontinuous if, for any $x \in X$, $y \in F(x)$, and sequence $x_n$ converging to $x$, there exists a sequence of elements $y_n \in F(x_n)$ converging to $y$.

An important consequence of lower semicontinuity is the existence of a continuous single-valued function, called a selection function, within the set-valued map.

Theorem 2.2 (Aubin [1, p. 228]). Let $X$ be a compact metric space and $Y$ be a Banach space. Let $F: X \rightsquigarrow Y$ be a lower semicontinuous set-valued map with closed convex values. Then there exists a continuous $f: X \to Y$ such that, for all $x \in X$, $f(x) \in F(x)$.

Finally, we have the following condition for lower semicontinuity.

Theorem 2.3 (Aubin and Cellina [2, p. 49]). Let $X$ be a metric space and $F$ and $V$ be Banach spaces. Let $f: X \times V \to F$ be a continuous map such that, for all $\xi \in X$, $v \mapsto f(\xi, v)$ is affine. Let $S: X \rightsquigarrow F$ and $U: X \rightsquigarrow V$ be lower semicontinuous set-valued maps with closed convex values, and let $U$ be locally bounded. Suppose there exists an $\alpha > 0$ such that, for all $\xi \in X$, there exists a $v \in U(\xi)$ such that

$$f(\xi, v) + r \in S(\xi) \quad \forall\, \|r\| \le \alpha.$$

Then the set-valued map $C: X \rightsquigarrow V$ defined by

$$C(\xi) \overset{\text{def}}{=} \{v \in U(\xi) : f(\xi, v) \in S(\xi)\}$$

is lower semicontinuous.

3. Static nonlinear state feedback

We consider the following disturbance rejection problem. The plant dynamics are given by

$$x(k+1) = Ax(k) + B_1 d(k) + B_2 u(k),$$
$$z(k) = Cx(k) + D_{11} d(k) + D_{12} u(k),$$

where the vector signals $z$, $d$, and $u$ denote regulated outputs, exogenous inputs, and control inputs, respectively. Let the state have dimension $n$ and control inputs have dimension $m$. The two full state feedback controller configurations under consideration are the following:

(1) Linear dynamic feedback ($K_{dy}$):

$$w(k+1) = A_K w(k) + B_K x(k),$$
$$u(k) = C_K w(k) + D_K x(k).$$

(2) Static nonlinear feedback ($K_{st}$):

$$u(k) = g(x(k)),$$

where $g: \mathbb{R}^n \to \mathbb{R}^m$ is continuous and $g(0) = 0$.

Definition 3.1. A controller $K_{dy}$ or $K_{st}$ is said to be internally stabilizing with a performance of $\rho$ if (1) the unforced dynamics ($d = 0$) are globally exponentially stable and (2) the forced dynamics with zero initial conditions satisfy $\|d \mapsto z\| < \rho$.
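The performance measure in Definition 3.1 is an $\ell_\infty$-induced norm. For a finite-gain-stable linear time-invariant map with the max-norm $|\cdot|$ on vectors, a standard fact in ℓ1 control is that this induced norm equals $\max_i \sum_j \sum_k |h_{ij}(k)|$, i.e., the largest row sum of the $\ell_1$ norms of the impulse response entries. The Python sketch below approximates this quantity for a closed-loop map $d \mapsto z$ written in state-space form. The function names, the list-of-lists matrix representation, and the truncation horizon are illustrative assumptions and are not part of the paper.

```python
def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][l] * Q[l][j] for l in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def impulse_response(A, B, C, D, horizon):
    """Return [h(0), ..., h(horizon-1)] for x(k+1) = A x + B d, z = C x + D d.

    h(0) = D and h(k) = C A^(k-1) B for k >= 1.
    """
    h = [D]
    M = B  # M holds A^(k-1) B at the start of iteration k
    for _ in range(1, horizon):
        h.append(matmul(C, M))
        M = matmul(A, M)
    return h


def induced_linf_norm(h):
    """Truncated induced norm: max over output rows i of sum_j sum_k |h_ij(k)|."""
    rows, cols = len(h[0]), len(h[0][0])
    return max(sum(abs(hk[i][j]) for hk in h for j in range(cols))
               for i in range(rows))


if __name__ == "__main__":
    # Hypothetical scalar closed loop: x(k+1) = 0.5 x(k) + d(k), z(k) = x(k).
    h = impulse_response([[0.5]], [[1.0]], [[1.0]], [[0.0]], horizon=60)
    print(induced_linf_norm(h))
```

Because the sum is truncated, the computed value is a lower bound that converges to the induced norm as the horizon grows for an exponentially stable closed loop.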

See [5] for further discussion and motivation of such performance objectives. Our main result is the following theorem.

Theorem 3.2. There exists an internally stabilizing linear dynamic controller, $K_{dy}$, with a performance of $\rho$ only if there exists an internally stabilizing continuous static nonlinear feedback, $K_{st}$, with a performance of $\rho$.

The remainder of this section is devoted to the proof of Theorem 3.2. Before proceeding with the proof, we make the following simplifying assumption, without loss of generality.

Assumption 3.3. The matrix $B_1$ is invertible.

The end of this section outlines how this assumption may be removed.

First, to establish some notation, let $K_{dy}$ be an internally stabilizing linear dynamic controller with a performance of $\rho$. Let $T_{xd}$ denote the corresponding disturbance to state dynamics $d \mapsto x$ with zero initial conditions acting on $\ell_\infty(\mathbb{Z}^+)$. Since $T_{xd}$ is finite-gain-stable by assumption, let $T_{xd}^{\mathrm{ext}}$ denote the extended system acting on $\ell_\infty(\mathbb{Z})$. Similarly, define $T_{zd}$ and $T_{ud}$.

We begin by defining three nested sets $\underline{S} \subset S \subset \bar{S} \subset \mathbb{R}^n$. Define the set $\bar{S} \subset \mathbb{R}^n$ as follows. For $d \in \ell_\infty(\mathbb{Z})$, let $s_0(d)$ denote the state response $[T_{xd}^{\mathrm{ext}} d](k)$ evaluated at time $k = 0$. Note that by causality, only the negative time portion of $d$ contributes to $s_0(d)$. Let $\varepsilon, \mu > 0$ be such that $(1 + \varepsilon)\|T_{zd}\| < \mu < \rho$. Then define

$$\bar{S} \overset{\text{def}}{=} \bigcup_{\|d\| \le 1 + \varepsilon} s_0(d).$$

In words, $\bar{S}$ is the set of reachable plant states under disturbances $\|d\| \le 1 + \varepsilon$. Note that the set $\bar{S}$ depends on the dynamic state feedback $K_{dy}$.

Claim 3.4. The set $\bar{S}$ is compact, convex, and has nonempty interior containing the origin.

Proof. Convexity and boundedness of $\bar{S}$ follow from the linearity and finite-gain stability of $T_{xd}^{\mathrm{ext}}$. That $\bar{S}$ has nonempty interior containing the origin follows from Assumption 3.3. To see that $\bar{S}$ is closed, and hence compact, let $\{x_n\}$ be a sequence in $\mathbb{R}^n$ converging to $x_*$, and let $x_n = s_0(d_n)$. Since $\{d_n\}$ forms a bounded sequence in $\ell_\infty(\mathbb{Z})$, there exists a convergent subsequence in the weak* topology on $\ell_\infty(\mathbb{Z})$. Let $d_*$ be the weak* limit. Then $\|d_*\| \le (1 + \varepsilon)$. Let $h(i)$ be the impulse response of $T_{xd}^{\mathrm{ext}}$. Since

$$s_0(d_n) = \sum_{i=0}^{\infty} h(i)\,d_n(-i),$$

by viewing components of the impulse response of $T_{xd}^{\mathrm{ext}}$ as elements of $\ell_1(\mathbb{Z}^+)$, it follows that $x_* = s_0(d_*)$. □

We now define the sets $\underline{S}$ and $S$ as follows:

$$\underline{S} \overset{\text{def}}{=} \{\xi \in \mathbb{R}^n : \xi + B_1\delta \in \bar{S} \quad \forall\, |\delta| \le 1 + \varepsilon\},$$
$$S \overset{\text{def}}{=} \{\xi \in \mathbb{R}^n : \xi + B_1\delta \in \bar{S} \quad \forall\, |\delta| \le 1\}.$$

Clearly, $\underline{S} \subset S \subset \bar{S}$. In fact, we can make the following stronger statements.

Claim 3.5. There exists an $\alpha > 0$ such that $\xi \in \underline{S}$ implies $\xi + r \in S$ for all $|r| \le \alpha$.

Proof. Let $\xi \in \underline{S}$. Then $\xi + B_1\delta \in \bar{S}$ for all $|\delta| \le 1 + \varepsilon$. Alternatively, $\xi + B_1\delta_a + B_1\delta_b \in \bar{S}$ for all $|\delta_a| \le \varepsilon$ and $|\delta_b| \le 1$. Thus, $\xi + B_1\delta_a \in S$ for all $|\delta_a| \le \varepsilon$. Using that $B_1$ is invertible completes the proof. □

Claim 3.6. The set $S$ is compact and convex.

Proof. This follows from the compactness and convexity of $\bar{S}$. □

Claim 3.7. The set-valued map $U: \bar{S} \rightsquigarrow \mathbb{R}^m$ defined by

$$\xi \mapsto \{v \in \mathbb{R}^m : |v| \le (1 + \varepsilon)\|T_{ud}\| \text{ and } |C\xi + D_{11}\delta + D_{12}v| \le \mu \quad \forall\, |\delta| \le 1\}$$

is lower semicontinuous.

Proof. There exists an $\alpha > 0$ such that, for all $\xi \in \bar{S}$, there exists a $v \in \mathbb{R}^m$ such that

$$|C\xi + D_{11}\delta + D_{12}v + r| \le \mu$$

for all $|\delta| \le 1$ and $|r| \le \alpha$. For example, if $\xi = s_0(d)$, set $v = [T_{ud}^{\mathrm{ext}} d](k)$ at time $k = 0$. An appropriate application of Theorem 2.3 then leads to the desired result. □

We are now in a position to apply Theorem 2.3 as follows. Let $X \overset{\text{def}}{=} \bar{S}$, $F \overset{\text{def}}{=} \mathbb{R}^n$, $V \overset{\text{def}}{=} \mathbb{R}^m$, and $f(\xi, v) \overset{\text{def}}{=} A\xi + B_2 v$. Let $U(\xi)$ be as in Claim 3.7, and let $S: X \rightsquigarrow F$ be the constant map $\xi \mapsto S$. The stated assumptions and previous claims assure that the hypotheses of Theorem 2.3 are satisfied. Thus, the set-valued mapping $C: \bar{S} \rightsquigarrow \mathbb{R}^m$ defined by

$$C(\xi) \overset{\text{def}}{=} \{v \in U(\xi) : f(\xi, v) \in S(\xi)\} = \{v \in U(\xi) : A\xi + B_2 v \in S\}$$

is lower semicontinuous. From Theorem 2.2, there exists a continuous selection function $\hat{g}: \bar{S} \to \mathbb{R}^m$ from $C(\xi)$.

Claim 3.8. Consider the dynamics

$$x(k+1) = Ax(k) + B_1 d(k) + B_2\hat{g}(x(k)), \quad x(0) \in \bar{S},$$
$$z(k) = Cx(k) + D_{11} d(k) + D_{12}\hat{g}(x(k)).$$

For all $d \in \ell_\infty(\mathbb{Z}^+)$ with $\|d\| \le 1$, $x(k) \in \bar{S}$ for all $k \in \mathbb{Z}^+$ and $\|z\| \le \mu$.

Proof. The claim follows from the construction of $\hat{g}$. □

We will define $g: \mathbb{R}^n \to \mathbb{R}^m$ corresponding to $K_{st}$ by using $\hat{g}$ evaluated only on the boundary of $\bar{S}$. Define $p: \mathbb{R}^n \to \mathbb{R}^+$ as follows:

$$p(\xi) \overset{\text{def}}{=} \inf\{\alpha \in \mathbb{R}^+ : \xi \in \alpha\bar{S}\}.$$

We now define $g: \mathbb{R}^n \to \mathbb{R}^m$ corresponding to $K_{st}$ as follows. Set $g(0) = 0$ and

$$g(\xi) \overset{\text{def}}{=} p(\xi)\,\hat{g}(\xi/p(\xi)).$$

Claim 3.9. The function $p$ defines a norm on $\mathbb{R}^n$.

Proof. From [4, p. 106], $p$ is the Minkowski functional of the interior of $\bar{S}$; hence, $p$ is a seminorm. Since $\bar{S}$ is compact, $p(\xi) = 0$ only if $\xi = 0$. □

Since for all nonzero $\xi \in \mathbb{R}^n$, the vector $\xi/p(\xi)$ lies on the boundary of $\bar{S}$, we see that $g$ and $\hat{g}$ agree on the boundary of $\bar{S}$. Furthermore, we see that $g$ exhibits the scaling property that, for all $\alpha \in \mathbb{R}^+$,

$$g(\alpha\xi) = \alpha g(\xi).$$

Claim 3.10. The function $g$ satisfies the same hypotheses met by $\hat{g}$ in the statement of Claim 3.8.

Proof. First, we show that $\xi \in \bar{S}$ implies

$$|C\xi + D_{11}\delta + D_{12}g(\xi)| \le \mu$$

for all $|\delta| \le 1$. Given any $\delta$ with $|\delta| \le 1$, define the two vectors $v_a$ and $v_b$ as follows:

$$v_a = C\xi/p(\xi) + D_{11}\delta + D_{12}\hat{g}(\xi/p(\xi))$$

and

$$v_b = D_{11}\delta.$$

Note that $|v_a| \le \mu$ and $|v_b| \le \mu$, and hence $|\beta v_a + (1 - \beta)v_b| \le \mu$ for all $0 \le \beta \le 1$. Set $\beta = p(\xi)$. Note that $p(\xi) \le 1$ since $\xi \in \bar{S}$. Then applying the definition of $g$, we have that

$$\beta v_a + (1 - \beta)v_b = C\xi + D_{11}\delta + D_{12}g(\xi),$$

which proves the desired inequality. Using parallel arguments and that $S$ is convex, we can similarly show that $x(0) \in \bar{S}$ implies

$$x(1) = Ax(0) + B_1\delta + B_2 g(x(0)) \in \bar{S}$$

for all $|\delta| \le 1$. It follows by induction that, for all $d \in \ell_\infty(\mathbb{Z}^+)$ with $\|d\| \le 1$, $x(0) \in \bar{S}$ leads to $x(k) \in \bar{S}$ for all $k \in \mathbb{Z}^+$ and $\|z\| \le \mu$. Hence, $g$ satisfies the same hypotheses met by $\hat{g}$. □

Claim 3.11. The state feedback $g$ is internally stabilizing with a performance of $\rho$.

Proof. We first prove the desired induced norm condition. As before, let $T_{ud}(g)$ denote the disturbance to control dynamics but now under state feedback $g$ with zero initial conditions. Similarly, define $T_{zd}(g)$. It is easy to see that the scaling property of $g$ implies a similar scaling property for $T_{ud}(g)$. That is, for any $\alpha \in \mathbb{R}^+$ and $d \in \ell_\infty(\mathbb{Z}^+)$, $T_{ud}(g)\alpha d = \alpha T_{ud}(g)d$. The same holds for $T_{zd}(g)$. It follows that

$$\|T_{zd}(g)\| = \sup_{d \in \ell_\infty(\mathbb{Z}^+),\ \|d\| \le 1} \|T_{zd}(g)d\|.$$

From Claim 3.10, we have that $\|T_{zd}(g)\| \le \mu < \rho$. Finally, we must show that the unforced dynamics under $g$ are globally exponentially stable, i.e., there exist positive constants $m \ge 1$ and $\gamma < 1$ such that $|x(k)| \le m\gamma^k|x(0)|$.

Since $p$ defines a norm on $\mathbb{R}^n$, it suffices to show that

$$p(x(k)) \le m\gamma^k p(x(0))$$

for appropriate $m$ and $\gamma$. Towards this end, set $m = 1$ and let

$$\gamma = \max_{\xi \in S} p(\xi).$$

Since $S$ is strictly contained in $\bar{S}$ and since $B_1$ is invertible, it follows that $\gamma < 1$. The inequality $p(x(k+1)) \le \gamma p(x(k))$ is clearly satisfied for $x(k)$ on the boundary of $\bar{S}$, since the control value $g(x(k))$ assures

$$Ax(k) + B_2 g(x(k)) = x(k+1) \in S.$$

However, the scaling property of $g$ assures that $p(x(k+1)) \le \gamma p(x(k))$ is satisfied for any $x(k)$. □

The above construction of $g$ and its derived properties complete the proof of Theorem 3.2.

We now show that Assumption 3.3 may be removed. In case $B_1$ is not invertible, pick $B_1'$ so that the matrix $(B_1 \ B_1')$ has full column rank. Then a compensator is internally stabilizing with a performance of $\rho$ for a disturbance matrix $B_1$ if and only if it is internally stabilizing with a performance of $\rho$ for a disturbance matrix $(B_1 \ \varepsilon B_1')$ with $\varepsilon > 0$ sufficiently small.

4. An example

In this section, we illustrate the ideas behind the derivation of Theorem 3.2. Consider the disturbance rejection problem

[plant equations illegible in the source]

It is easy to show that the dynamic compensation $K_{dy}$ given by $u(k) = -x_2(k-1)$, or in state-space form

$$w(k+1) = -x_2(k),$$
$$u(k) = w(k),$$

is internally stabilizing and $\|d \mapsto z\| = \rho = 3$. We will construct a nonlinear state feedback which achieves the same norm. In this case, there is no need to resort to suboptimal arguments. Hence, the construction here is simpler than, but not entirely parallel to, that in Section 3.

Let $\bar{S}$ denote the set of reachable plant states with $\|d\| \le 1$. It is easy to show that

$$\bar{S} = \{\xi \in \mathbb{R}^2 : |\xi_1| \le 3,\ |\xi_2| \le 2\}.$$

Similarly, let

$$S = \{\xi \in \mathbb{R}^2 : \xi + B_1\delta \in \bar{S} \quad \forall\, |\delta| \le 1\}.$$

Then

$$S = \{\xi \in \mathbb{R}^2 : |\xi_1| \le 2,\ |\xi_2| \le 1\}.$$

These are illustrated in Figure 1.

Fig. 1. Illustration of sets $\bar{S}$ and $S$ (a boxed $i$ denotes the $i$th boundary).

To achieve the desired performance, it suffices to make $\bar{S}$ a viable set for all $\|d\| \le 1$. This may be achieved by assuring that, for any $\xi \in \bar{S}$, there exists an admissible $v$ such that $A\xi + B_2 v \in S$. Towards this end, define the set-valued map $U: \bar{S} \rightsquigarrow \mathbb{R}$ by

$$U(\xi) = \{v \in \mathbb{R} : |C\xi + D_{11}\delta + D_{12}v| \le \rho\}.$$

Then $U(\xi) = [-3, 3]$. Now define the set-valued map $C: \bar{S} \rightsquigarrow \mathbb{R}$ by

$$C(\xi) = \{v \in U(\xi) : A\xi + B_2 v \in S\}.$$

From the previous section, we can construct the desired state feedback $g$ if we can find a continuous selection function $\hat{g}$ for $C(\cdot)$ on the boundary of $\bar{S}$. Let us inspect $C(\xi)$ for $\xi$ on the boundary of $\bar{S}$. From Figure 1, boundary #1 is characterized by $\xi_1 = 3$ and $\xi_2 \in [-2, 2]$. For these vectors,

[equation illegible in the source]

Hence, $C(\xi) = [-4, -2] \cap [-3, 3] = [-3, -2]$.
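The scaling construction of Section 3, $g(\xi) = p(\xi)\,\hat{g}(\xi/p(\xi))$, can be sketched numerically for the box $\bar{S} = \{|\xi_1| \le 3,\ |\xi_2| \le 2\}$ of this example, whose Minkowski functional is $p(\xi) = \max\{|\xi_1|/3, |\xi_2|/2\}$. The Python sketch below is only an illustration under stated assumptions: the boundary function $\hat{g}(\xi) = -\xi_1$ is a hypothetical choice that takes the value $-3 \in [-3, -2]$ on boundary #1, and its admissibility on the remaining boundaries is assumed rather than derived here. All function names are illustrative.

```python
def minkowski_norm(xi, half_widths=(3.0, 2.0)):
    """p(xi) = inf{alpha >= 0 : xi in alpha * box} for an axis-aligned box.

    For the box {|xi_1| <= 3, |xi_2| <= 2} this is max(|xi_1|/3, |xi_2|/2).
    """
    return max(abs(c) / w for c, w in zip(xi, half_widths))


def extend_from_boundary(g_hat, xi, half_widths=(3.0, 2.0)):
    """Homogeneous extension g(xi) = p(xi) * g_hat(xi / p(xi)), with g(0) = 0.

    g_hat is only ever evaluated at xi / p(xi), a point on the box boundary.
    """
    p = minkowski_norm(xi, half_widths)
    if p == 0.0:
        return 0.0
    boundary_point = tuple(c / p for c in xi)
    return p * g_hat(boundary_point)


def g_hat_example(xi_boundary):
    """Hypothetical boundary selection (an assumption): g_hat(xi) = -xi_1."""
    return -xi_boundary[0]


if __name__ == "__main__":
    xi = (1.5, -0.4)
    print(extend_from_boundary(g_hat_example, xi))
    # Scaling property g(alpha * xi) = alpha * g(xi) for alpha >= 0.
    print(extend_from_boundary(g_hat_example, (3.0, -0.8)))
```

With this particular boundary function the extension reduces to the linear map $\xi \mapsto -\xi_1$; a boundary function that is not the restriction of a linear map would in general produce a feedback that is continuous and positively homogeneous but not linear.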

Similar analyses lead to

$$C(\xi) = \begin{cases} [-3, -2], & \xi \in \text{boundary \#1}, \\ [2, 3], & \xi \in \text{boundary \#3}, \\ [-1 - \xi_1,\ 1 - \xi_1] \cap [-3, 3], & \xi \in \text{boundary \#2 or 4}. \end{cases}$$

We see that

$$\hat{g}(\xi) = \begin{cases} -3, & \xi \in \text{boundary \#1}, \\ 3, & \xi \in \text{boundary \#3}, \\ -\xi_1, & \xi \in \text{boundary \#2 or 4} \end{cases}$$

is one of several continuous selection functions for $C(\xi)$ on the boundary of $\bar{S}$. We set $g$ to be the extension of $\hat{g}$ defined over $\mathbb{R}^2$ as follows. The set $\bar{S}$ defines the norm

$$p(\xi) \overset{\text{def}}{=} \inf\{\alpha \in \mathbb{R}^+ : \xi \in \alpha\bar{S}\} = \max\{|\xi_1|/3,\ |\xi_2|/2\}.$$

Then

$$g(\xi) = p(\xi)\,\hat{g}(\xi/p(\xi)).$$

In this case $g(\xi) = -\xi_1$, the deadbeat state feedback. We emphasize that the final $g$ is derived from the original selection function $\hat{g}$. For example,

$$\hat{g}(\xi) = \begin{cases} -2, & \xi \in \text{boundary \#1}, \\ 2, & \xi \in \text{boundary \#3}, \\ -2\xi_1/3, & \xi \in \text{boundary \#2 or 4} \end{cases}$$

leads to $g(\xi) = -2\xi_1/3$.

5. Concluding remarks

Using the tools of set-valued analysis and viability theory, we have shown that nonlinear state feedback performs as well as linear dynamic state feedback for ℓ1 optimal controllers. The main idea is to recognize that the $\ell_\infty$ norm of a signal may be monitored in a memoryless manner. Hence, the maintenance of an $\ell_\infty$ performance specification becomes equivalent to the viability problem of keeping the state in a specified set.

This result provides further insight into the ℓ1 optimal control problem, and reopens the theoretical question of the possibility of a separation structure. One drawback is that the present derivation is nonconstructive. Hence, it does not seem to lead to new computation techniques for ℓ1 optimal control. Furthermore, the static nonlinear state feedback, while continuous, will not be differentiable at the origin in general. If it were, then the arguments presented in [9] imply the existence of internally stabilizing static linear state feedback which achieves the same performance.

Acknowledgements

The author thanks Tryphon Georgiou for helpful discussions and the University of West Florida for a visiting position during part of this research.

References

[1] J.P. Aubin, Viability Theory (Birkhäuser, Boston, 1991).
[2] J.P. Aubin and A. Cellina, Differential Inclusions (Springer, New York, 1984).
[3] J.P. Aubin and H. Frankowska, Set-valued Analysis (Birkhäuser, Boston, 1990).
[4] J.B. Conway, A Course in Functional Analysis (Springer, New York, 1985).
[5] M.A. Dahleh and J.B. Pearson, Jr., ℓ1-optimal feedback controllers for MIMO discrete-time systems, IEEE Trans. Automat. Control 32 (1987) 314-322.
[6] I.J. Diaz-Bobillo and M.A. Dahleh, State feedback ℓ1-optimal controllers can be dynamic, Systems Control Lett. 19 (1992) 87-93.
[7] J. Doyle, K. Glover, P. Khargonekar and B. Francis, State-space solutions to standard H2 and H∞ control problems, IEEE Trans. Automat. Control 34 (1989) 821-830.
[8] D.G. Meyer, Two properties of ℓ1-optimal controllers, IEEE Trans. Automat. Control 33 (1988) 876-878.
[9] J.S. Shamma and M.A. Dahleh, Rejection of persistent bounded disturbances: nonlinear controllers, Systems Control Lett. 18 (1992) 245-252.
[10] J.C. Willems, The Analysis of Feedback Systems (MIT Press, Cambridge, MA, 1971).