Copyright © IFAC Nonlinear Control Systems, Stuttgart, Germany, 2004
IFAC Publications, www.elsevier.com/locate/ifac
NONLINEAR MODEL PREDICTIVE CONTROL WITH ENLARGED TERMINAL SETS USING SUPPORT VECTOR MACHINE

Dan Sui,¹ Chong Jin Ong, S. Sathiya Keerthi
National University of Singapore, 10 Kent Ridge Crescent, Singapore 119260
Abstract: In this paper, model predictive control (MPC) of nonlinear systems subject to input and state constraints is considered, for which nominal closed-loop stability is guaranteed. We propose the use of a large terminal invariant set and an estimate of the terminal cost to reduce the online computational burden of MPC. These terminal sets and costs are learned off-line via the support vector machine method. The main advantage with respect to other well-known techniques is the reduction of online computational effort achieved by relaxing the terminal constraints. An example illustrates the efficiency of the approach. Copyright © 2004 IFAC

Keywords: Model predictive control; Constraints; Stability; Terminal conditions
1. INTRODUCTION
Model predictive control (MPC) is a popular control algorithm for dealing with multi-variable state and input constrained control problems. An important topic in the literature concerning MPC is the aspect of stability. Adding a terminal cost F(x) and a terminal set X_f to the optimal control problem is a popular method of guaranteeing closed-loop stability of MPC. When the system is both nonlinear and constrained, many of the approaches, for example (Chen and Allgower, 1998; Magni et al., 2001; Mayne and Michalska, 1990; Michalska and Mayne, 1993), are set-theoretic and use a terminal constraint set X_f and a terminal cost F(x) to minimize online computational effort. Properties of X_f and F(x) that ensure closed-loop asymptotic stability of the origin are also given. Unlike the set-theoretic approach, an approximation approach using neural networks (Parisini and Zoppoli, 1995) has also been attempted. In this approach, a neural network directly approximates the control law in the feasible space, without the need for X_f or F(x).

Unlike these previous works, this paper presents an approach for MPC that characterizes X_f and F(x) based on support vector learning. As many choices of X_f and F(x) exist, such an approach requires a lower accuracy of approximation to ensure closed-loop stability of the origin. The bounds on the approximation accuracy needed for closed-loop stability, among other properties, are developed. The resulting X_f so obtained is much larger than those given in previous works. Consequently, the domain of attraction of MPC is larger and the online computational effort is lower.

The paper is organized as follows. The basic formulation of MPC and the relevant literature are reviewed in section 2. Section 3 shows the formulation of the terminal conditions as a support vector learning problem. The stability of the closed-loop system using the proposed terminal constraint set X_f and cost F(x) is established in section 4. A numerical example is given in section 5 and conclusions are drawn in section 6.
¹ Supported by the National University of Singapore.
Standard notation is adopted. ||x|| refers to the norm of a vector x ∈ R^n. For any square matrix A, A > 0 means A is positive definite and λ_max(A) refers to the maximum eigenvalue of A.
2. MODEL PREDICTIVE CONTROL

2.1 Formulation of MPC

Consider the nonlinear discrete-time constrained dynamic system:

    x(i+1) = f(x(i), u(i))    (1)
    u(i) ∈ U, x(i) ∈ X, ∀i ≥ 0    (2)

where x(i) ∈ R^n and u(i) ∈ R^m are respectively the state and control of the system. f is a C² continuous function of its arguments with f(0,0) = 0, and U, X are convex, compact subsets of R^m and R^n respectively with the origin contained in their interiors. The Model Predictive Control (MPC) approach starts by optimizing over u = {u(0|t), u(1|t), ..., u(N−1|t)} at time t the following Finite Horizon (FH) problem:

    J(x, u) = Σ_{i=0}^{N−1} ℓ(x(i|t), u(i|t)) + F(x(N|t))    (3)

subject to

    x(i+1|t) = f(x(i|t), u(i|t)), ∀i ≥ 0    (4)
    x(0|t) = x    (5)
    u(i|t) ∈ U, ∀i ≥ 0    (6)
    x(i|t) ∈ X, ∀i ≥ 0    (7)
    x(N|t) ∈ X_f    (8)

In the above, x(i|t) and u(i|t) refer to the predicted state and control of the system at index i based on the state measured at time t. The stage cost ℓ(·,·) is such that c||(x,u)|| ≤ ℓ(x,u) ≤ c̄||(x,u)|| for all (x,u) ∈ X × U and some c̄ ≥ c > 0. X_f ⊂ R^n and F : X_f → R are appropriate terminal constraint set and terminal cost function respectively. Let

    u°(t) = {u°(0|t), u°(1|t), ..., u°(N−1|t)}    (9)
    J°_N(x) := J(x, u°(t))    (10)

be the optimal control sequence and the optimal value function respectively of (3)-(8). The first control vector u°(0|t) is applied to the physical plant at time t. At the next time instance, the FH problem is solved again using the state measured at time t+1. The MPC approach proceeds by repeating this procedure as time evolves, establishing a feedback law κ_N(x) = u°(0|t).

2.2 Closed-loop stability

Mayne et al. (2000) provide a set of conditions (together with some minor assumptions) on X_f and F(x) for the asymptotic stability of the origin under the MPC approach. The set of conditions uses a local feedback law u = κ_f(x) in X_f and is stated below. The proof of this result can be found in Mayne et al. (2000).

Theorem 2.1. (Mayne et al., 2000) Suppose the terminal cost function F : X_f → R, the terminal constraint set X_f and the local control law κ_f : X_f → R^m satisfy:

(A1) X_f ⊆ X, X_f is closed and 0 ∈ X_f,
(A2) κ_f(x) ∈ U, ∀x ∈ X_f,
(A3) f(x, κ_f(x)) ∈ X_f, ∀x ∈ X_f,
(A4) F(f(x, κ_f(x))) − F(x) + ℓ(x, κ_f(x)) ≤ 0, ∀x ∈ X_f.

Then

    J°_N(f(x, κ_N(x))) − J°_N(x) + ℓ(x, κ_N(x)) ≤ 0    (11)

for all x ∈ X_N, the set of states steerable to X_f by an admissible control in time N or less.

2.3 Literature on the choice of terminal conditions

For nonlinear, constrained systems, there are several popular choices of terminal conditions for the constrained optimization problem. In Chen and Allgower (1998), a continuous-time system is considered and ℓ(·,·) is chosen to be the quadratic cost ℓ(x,u) = x^T Q x + u^T R u with Q^T = Q > 0 and R^T = R > 0. Let

    A := ∂f(x,u)/∂x |_{(x,u)=(0,0)}, B := ∂f(x,u)/∂u |_{(x,u)=(0,0)}    (12)

and A_K = A + BK for any choice of K such that A_K is asymptotically stable. The terminal cost proposed by Chen and Allgower (1998) is a Lyapunov function of the form F(x) = x^T P x with P being the solution of the Lyapunov equation

    (A_K + μI)^T P + P(A_K + μI) = −(Q + K^T R K)    (13)

for any choice of μ, 0 < μ < −λ_max(A_K). Arising from this choice of F, X_f := {x : x^T P x ≤ α} is a level set of the Lyapunov function under the local feedback law

    u = Kx    (14)

when x ∈ X_f. The value of α is chosen to satisfy the following three conditions:

(1) X_f is a positively invariant set for the autonomous system ẋ = f(x, Kx).
(2) Kx ∈ U for all x ∈ X_f.
(3) For all x_0 ∈ X_f, x_0^T P x_0 is an upper bound on ∫_0^∞ ℓ(x(t), Kx(t)) dt for the system ẋ = f(x, Kx) with the initial condition x(0) = x_0,
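The terminal ingredients of Chen and Allgower (1998) can be computed numerically. The sketch below is illustrative only: it uses the linearization of the bilinear example in section 5, the sign of K is an assumption chosen here so that A_K = A + BK is Hurwitz (the paper reports the gain magnitude [2.118, 2.118]), and α = 0.7 is taken as a candidate level without verifying conditions (1)-(3).

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Linearization of the bilinear example of section 5 at the origin (mu = 0.5).
A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[0.5], [0.5]])
# Assumption: sign of K chosen here so that A_K = A + B K is Hurwitz.
K = np.array([[-2.118, -2.118]])
Q = 0.5 * np.eye(2)
R = np.array([[1.0]])

A_K = A + B @ K
assert np.max(np.linalg.eigvals(A_K).real) < 0.0  # A_K asymptotically stable

mu = 0.5  # must satisfy 0 < mu < -lambda_max(A_K)
# Solve (A_K + mu*I)^T P + P (A_K + mu*I) = -(Q + K^T R K), i.e. equation (13).
lhs = (A_K + mu * np.eye(2)).T
rhs = -(Q + K.T @ R @ K)
P = solve_continuous_lyapunov(lhs, rhs)

# Terminal set X_f = {x : x^T P x <= alpha} for a candidate level alpha.
alpha = 0.7
def in_terminal_set(x):
    return float(x @ P @ x) <= alpha
```

Since the right-hand side is negative definite and A_K + μI is Hurwitz for this μ, the solution P is positive definite, as the Lyapunov-function interpretation requires.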
for the particular choice of μ in (13). Using these choices of X_f, F(x) and other mild assumptions, it is shown that the origin of the closed-loop system is asymptotically stable. The main drawback of this work is that the set X_f is usually a small set in a neighborhood of the origin. Hence, a large value of N is needed, resulting in a heavy online computational load.

In De Nicolao et al. (1998), the discrete-time system is considered and the terminal cost

    F(x(N|t)) = Σ_{i=N}^∞ ℓ(x(i|t), Kx(i|t))    (15)

is proposed. Again, the trajectory of x(i|t) is based on (14) and (4) in the form of

    x(i+1|t) = f(x(i|t), Kx(i|t)), i ≥ N    (16)

for some stabilizing K. However, the infinite summation is time consuming to evaluate, and several suggestions were given in De Nicolao et al. (1998) for its simplification.

Magni et al. (2001) enlarge the feasibility region of MPC on (1) by optimizing (3) over the control {u(0|t), ..., u(N_c|t)} where N_c ≤ N_p. The control from u(N_c + 1|t) to u(N_p|t) is again based on a local stabilizing control law (14). The terminal cost F(x(N_p|t)) = Σ_{i=N_p}^{N_p+M−1} ℓ(x(i|t), Kx(i|t)) is used under the same local control law. Sufficient conditions on M such that the closed-loop system is stable are also given. However, the value of M needed to satisfy these sufficient conditions is large. Also, the terminal set X_f is a Lyapunov level set, chosen such that conditions very similar to the three conditions above are satisfied. The restricted size of X_f and a large value of M mean that a large N_p is needed.

3. TERMINAL CONDITIONS

Our approach retains the online optimization of the FH problem but uses characterizations of X_f and F(x) obtained via support vector learning. Such an approach has several advantages: (i) there are many choices of X_f that satisfy (A1)-(A3) and many choices of F(x) that satisfy (A4); for example, if F_1(x) satisfies (A4), then any F_2(x) := ρF_1(x) with ρ > 1 also satisfies (A4); (ii) the control u(t), which is based on the solution of the FH problem (3)-(8), is not sensitive to small errors in the approximation of X_f, as (8) is only a set inclusion requirement.

These considerations give strong support to the approximation approach. Several desirable properties motivate our choice of the support vector machine (SVM): universal approximation ability, a theoretical error bound, a global optimal solution defined by a convex quadratic programming problem, and the flexibility of its formulation to accommodate special requirements. This flexibility is particularly important for our purpose. In the following, we show the support vector formulations incorporating the special requirements for the characterizations of X_f and F(x). Like previous similar works, this process of characterization is performed off-line with no implication for the online computational effort.

Our characterization of X_f follows the general approach described in Ong et al. We consider the constrained nonlinear autonomous dynamic system

    x(k+1) = f(x(k), Kx(k)), x(k) ∈ X, Kx(k) ∈ U, ∀k ≥ 0    (17)

Remark 3.1. The preferred method of choosing K is such that Kx is the optimal controller for the unconstrained infinite-horizon problem. When ℓ(x,u) = ||x||²_Q + ||u||²_R, Q = Q^T > 0, R = R^T > 0, K is computed from the equation

    K = −(R + B^T P B)^{−1} B^T P A    (18)

where P is the positive definite solution of the algebraic Riccati equation

    P = A^T P A + Q − A^T P B (R + B^T P B)^{−1} B^T P A.    (19)

According to Ong et al., the domain of attraction (stability region) of the origin that we obtain is:

    X_f = {x(0) ∈ R^n : x(k) ∈ X, Kx(k) ∈ U, ∀k ≥ 0 and x(k) → 0 as k → ∞}    (20)

The characterization of X_f takes the form of a function θ(x) such that {x : θ(x) ≥ 0} closely approximates X_f. The computation of θ(x) utilizes a modified version of the standard SVM.
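The gain of Remark 3.1 is obtained from the discrete-time algebraic Riccati equation (19). A minimal SciPy sketch follows; the double-integrator plant here is a hypothetical stand-in for the linearization (12), not the paper's example.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Hypothetical discrete-time plant standing in for the linearization (12):
# a double integrator sampled at dt = 0.1.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)           # l(x,u) = ||x||_Q^2 + ||u||_R^2
R = np.array([[0.1]])

# (19): P = A'PA + Q - A'PB (R + B'PB)^{-1} B'PA
P = solve_discrete_are(A, B, Q, R)
# (18): the unconstrained infinite-horizon LQ gain, with u = Kx.
K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# The closed loop A + BK is then Schur stable (spectral radius < 1).
rho = np.max(np.abs(np.linalg.eigvals(A + B @ K)))
```

Substituting (18) into (19) gives the compact identity P = A^T P (A + BK) + Q, which is a convenient numerical check on the solution.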
Our choice of terminal cost is F_R(x) : X_f → R and, like previous works, it is associated with a local feedback law u = Kx. There are many choices for the desired terminal cost F(x). Ours is based on (15) and takes the form

    F_N(x_i) = (1 + ε) Σ_{j=0}^∞ ℓ(x(j), Kx(j))    (21)

for some small ε > 0 under the system

    x(j+1) = f(x(j), Kx(j)), ∀j ≥ 0    (22)
    x(0) = x_i ∈ X_f    (23)

To construct F_R using SVM, we assume that (22)-(23) can be numerically solved for many initial points x_i ∈ X_f; for each solution trajectory starting from x_i, F_N(x_i) is computed based on (21). Let y_i = F_N(x_i) and define the error in approximating F_N(x) as

    ε̃(x) := F_N(x) − F_R(x)    (24)

For reasons given in section 4, we require ε̃(x) to satisfy

    0 < ε̃(x) < ε ℓ(x, 0).    (25)

These requirements are imposed on F_R via the support vector regression (SVR) formulation. The function

    F_R(x) = w · φ(x) + b    (26)

is constructed from the data set {x_i, y_i}_{i=1}^q, where q is the number of generated data points; the weight vector w ∈ H and bias b ∈ R are obtained from the following optimization problem:

    min_{w,b} (1/2) ||w||²    (27)
    s.t. y_i − w · φ(x_i) − b ≤ C_i, i = 1, ..., q    (28)
         w · φ(x_i) + b − y_i ≤ 0, i = 1, ..., q    (29)

with C_i := ε ℓ(x_i, 0). The above formulation differs from standard SVR in that the conditions of (25) are imposed as the inequalities (28) and (29). The above optimization problem is converted to its dual for numerical solution. Suppose β, β̄ ∈ R^q are the vectors of Lagrange multipliers of (28) and (29) respectively. The Wolfe dual (Fletcher, 1987) can be shown to be:

    min_{β,β̄} (1/2) Σ_i Σ_j (β_i − β̄_i)(β_j − β̄_j) φ(x_i) · φ(x_j) − Σ_i (β_i − β̄_i) y_i + Σ_i β_i C_i    (30)
    s.t. Σ_i (β_i − β̄_i) = 0    (31)
         β_i ≥ 0, β̄_i ≥ 0, i = 1, ..., q    (32)

Again, a modification of the standard numerical algorithms (Smola and Scholkopf, 1998; Shevade et al., 2000) allows the solution of the above to be computed. Solving the above, we have

    w = Σ_i (β_i* − β̄_i*) φ(x_i)    (33)

and hence

    F_R(x) = Σ_i (β_i* − β̄_i*) φ(x_i) · φ(x) + b    (34)

where β_i* and β̄_i* are the optimal Lagrange multipliers corresponding to the solution of (30)-(32). The inner product φ(x_i) · φ(x) can be evaluated easily depending on the choice of the kernel function. For example, if the Gaussian kernel is used, φ(x) · φ(x_i) = exp(−||x − x_i||²/(2σ²)) for some σ > 0.

Remark 3.2. Condition (25) shows that the error in approximation can be large when ||x|| is large. Clearly, any reasonable choice of F_R will satisfy (25) when ||x|| is large; the accuracy of approximation is only needed when ||x|| is small. It is therefore important to ensure that sufficient data points are represented near the origin for an accurate representation of F_R. Of course, other solutions exist. Suppose ℓ(x, u) is the conventional LQ cost; then the terminal cost (1 + ε) x^T P x, where P is the solution of the corresponding Lyapunov equation, is a good approximation to F_N. Hence, a composite F_R can easily be defined that uses the value of the Lyapunov function when ||x|| is small and the SVR solution when ||x|| is large, to ensure global satisfaction of (25).

Remark 3.3. While C_i := ε ℓ(x_i, 0) is sufficient to ensure condition (25), it is also reasonable to impose some accuracy requirement when ||x|| is large. Letting C_i := min{ε ℓ(x_i, 0), ε_1} for some small ε_1 > 0 serves such a purpose.

4. ASYMPTOTIC CLOSED-LOOP STABILITY

With X_f and F_R(x) characterized, we denote by AFH (Approximate Finite Horizon) the FH problem that optimizes (3) with F_R replacing F, subject to (4)-(8), where (8) is given by θ(x(N|t)) ≥ 0. We now state the assumptions needed for our stability result.

(A5) The linearized system of (1) at (x, u) = (0, 0) is stabilizable.
(A6) The optimization of AFH yields the minimum solution u°(t) at every t.
(A7) The function F_R(x) is such that (25) is satisfied for all x ∈ X_f.
(A8) The set X_f satisfies (A1)-(A3) with κ_f(x) = Kx, where K is such that A + BK is asymptotically stable with A, B given by (12).

Theorem 4.1. Suppose assumptions (A5)-(A8) are satisfied. Using AFH under the MPC approach, the origin of (1) is asymptotically stable for all x ∈ X_N, the set of states steerable to X_f by an admissible control in time N or less.
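The receding-horizon mechanism analysed above can be sketched in a few lines. The example below is a minimal, hypothetical setup: a discrete-time double-integrator plant with quadratic costs and the DARE solution as terminal weight; the terminal-set check θ(x(N|t)) ≥ 0 of the AFH problem is omitted for brevity.

```python
import numpy as np
from scipy.linalg import solve_discrete_are
from scipy.optimize import minimize

# Hypothetical discrete-time linear plant (double integrator, dt = 0.1).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[0.1]])
P = solve_discrete_are(A, B, Q, R)  # terminal weight
N = 10                              # horizon length

def fh_cost(u_seq, x0):
    """FH cost (3): sum of stage costs plus terminal cost x_N' P x_N."""
    x, J = np.asarray(x0, float), 0.0
    for u in u_seq:
        J += x @ Q @ x + R[0, 0] * u * u
        x = A @ x + B[:, 0] * u
    return J + x @ P @ x

def mpc_control(x0, u_warm):
    res = minimize(fh_cost, u_warm, args=(x0,),
                   bounds=[(-1.0, 1.0)] * N, method="SLSQP")
    return res.x

# Receding-horizon loop: apply only the first control, then re-solve.
x = np.array([1.0, 0.0])
u_warm = np.zeros(N)
for _ in range(60):
    u_seq = mpc_control(x, u_warm)
    x = A @ x + B[:, 0] * u_seq[0]
    u_warm = np.roll(u_seq, -1)   # shifted warm start for the next solve
```

The shifted warm start mirrors the standard MPC argument: appending one step of the local control to the tail of the previous solution yields a feasible candidate for the next problem.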
Proof 4.1. The assumption (A5) is needed for (A8). Let x⁺ = f(x, Kx). We first show that (A7) implies

    F_R(x⁺) − F_R(x) + ℓ(x, Kx) ≤ 0, ∀x ∈ X_f.    (35)

From (21), we have

    F_N(x) = (1+ε) Σ_{i=0}^∞ ℓ(x(i), Kx(i)) = (1+ε) ℓ(x, Kx) + (1+ε) Σ_{i=1}^∞ ℓ(x(i), Kx(i)) = (1+ε) ℓ(x, Kx) + F_N(x⁺)

so that F_N(x⁺) − F_N(x) = −(1+ε) ℓ(x, Kx). Using (24), we have, for all x, x⁺ ∈ X_f,

    F_R(x⁺) − F_R(x) = F_N(x⁺) − F_N(x) − ε̃(x⁺) + ε̃(x)
                     = −(1+ε) ℓ(x, Kx) − ε̃(x⁺) + ε̃(x)
                     ≤ −(1+ε) ℓ(x, Kx) + ε ℓ(x, Kx)
                     ≤ −ℓ(x, Kx)

where the first inequality comes from (25) of (A7): ε̃(x⁺) > 0 and ε̃(x) < ε ℓ(x, 0) ≤ ε ℓ(x, Kx). Hence condition (35) follows. Condition (35), which is equivalent to assumption (A4), together with (A8) satisfies all the requirements of Theorem 2.1. Hence we have, for all x ∈ X_N,

    J°_N(f(x, κ_N(x))) − J°_N(x) + ℓ(x, κ_N(x)) ≤ 0    (36)

when the AFH problem is solved at every step (with assumption (A6)) using the MPC law u = κ_N(x). Let x(t+1) = f(x(t), κ_N(x(t))). It follows from (36) that {J°_N(x(t))} is a monotonically decreasing sequence. The sequence is also bounded below by zero from (3). This implies that {J°_N(x(t))} is Cauchy and

    lim_{t→∞} J°_N(x(t)) − J°_N(x(t+1)) = 0.    (37)

From (36), we have

    J°_N(x(t)) − J°_N(x(t+1)) ≥ ℓ(x(t), κ_N(x(t))) ≥ 0    (38)

which implies that lim_{t→∞} ℓ(x(t), κ_N(x(t))) = 0 from (37). Using the property of ℓ(x, u), we have lim_{t→∞} x(t) = 0.

5. NUMERICAL EXAMPLES

We illustrate our approach with one example, conducted on a Pentium 4 machine at 1.8 GHz. The optimization routines used are based on the algorithms provided in the Matlab Optimization Toolbox. The choice of ε in F_R is 0.05.

Example 5.1. This example is taken from Chen and Allgower (1998): a continuous-time bilinear constrained system described by

    ẋ_1 = x_2 + u(μ + (1−μ)x_1)
    ẋ_2 = x_1 + u(μ − 4(1−μ)x_2)

The equilibrium point x = 0 is unstable when u = 0 and the linearized system is stabilizable for any μ ∈ (0,1). For the discrete implementation, a sampling period of 0.1 second is used. In addition, the input u has to satisfy the constraint

    U = {u ∈ R | −2.0 ≤ u ≤ 2.0}    (39)

We follow the settings used by Chen and Allgower (1998): ℓ(x,u) = x^T Q x + u^T R u with Q = 0.5I and R = 1.0, μ = 0.5 and K = [2.118, 2.118]. Using this K value, X_f is characterized by θ(x). A total of 1047 examples are generated but only 150 are used in the training process; these 150 examples are selected judiciously from the 1047 via the adaptive procedure described in Ong et al. The overall tuning/training exercise takes 30 seconds and terminates with θ(x) determined by 36 support vectors and b = 5.6795. The Gaussian kernel is used and the optimal kernel width is σ = 0.3. Similarly, F_R is obtained according to the procedure described in section 3. A total of 2530 examples is generated with only 350 selected for the training process. The total time needed for the overall tuning/training exercise is 30 minutes.

Let superscript A denote the results obtained using our proposed approach and superscript B the method of Chen and Allgower (1998). Figure 1 shows the terminal regions of the two methods, X_f = {x : θ(x) ≥ 0} and the ellipse X_Ω = {x ∈ R² : x^T P x ≤ 0.7}, P = [16.59 11.59; 11.59 16.59]. For each starting point, 100 iterations of MPC are performed. Table 1 shows the results of the experiment. In the table, N_A and N_B refer to the shortest N for which a feasible solution exists for the respective controllers; t_i, i = A, B, are the CPU times of the two controllers using their shortest horizons.

Figure 1 also shows the closed-loop trajectories of five initial points using the two methods. The solid curves are the trajectories based on our approach while the dotted curves are those of Chen and Allgower (1998).

Fig. 1. Comparison of the terminal regions and closed-loop trajectories.

In Table 1, initial states marked * are the same as those chosen in Chen and Allgower (1998); initial states marked △ are chosen inside the terminal set X_f but outside the terminal set X_Ω; initial states marked • are chosen outside the terminal set X_f. The advantage of the proposed approach is clear: the CPU times using controller A are significantly better than those of controller B in all cases. From Table 1, it is evident that our method can use a shorter control horizon N and needs less online computation time than the method of Chen and Allgower (1998). Even when a shorter horizon is chosen to reduce the online computational burden, the performance of our approach remains satisfactory.
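The local closed-loop behaviour inside the terminal region can be reproduced with a short forward-Euler simulation of the bilinear example under the linear feedback. The feedback sign below is an assumption chosen so that the linearized loop is Hurwitz (the paper reports the gain magnitude 2.118), and the saturation implements constraint (39).

```python
import numpy as np

mu, dt, k = 0.5, 0.1, 2.118   # gain magnitude as reported in section 5

def f(x, u):
    """Continuous-time dynamics of the bilinear example."""
    x1, x2 = x
    return np.array([x2 + u * (mu + (1.0 - mu) * x1),
                     x1 + u * (mu - 4.0 * (1.0 - mu) * x2)])

def step(x):
    # Assumption: feedback sign chosen so the linearized loop is Hurwitz.
    u = float(np.clip(-k * (x[0] + x[1]), -2.0, 2.0))  # input constraint (39)
    return x + dt * f(x, u)                            # forward Euler, 0.1 s

x = np.array([0.1, 0.1])  # a point inside the ellipse x' P x <= 0.7
for _ in range(100):
    x = step(x)
```

From this initial point the saturation never activates and the trajectory contracts towards the origin, consistent with the invariance conditions imposed on the terminal level set.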
6. CONCLUSION

With the rapid increase in processor speed, the use of interpolation or approximation techniques in control is increasingly appealing. In this paper, an efficient implementation of a nonlinear MPC controller using support vector learning has been presented. The purpose is to lower the online computational burden by relaxing the terminal constraints and approximating the terminal cost F_R(x) off-line.
REFERENCES

Chen, H., & Allgower, F. (1998). A quasi-infinite horizon nonlinear model predictive control scheme with guaranteed stability. Automatica, 34(10), 1205-1217.