Automatica, Vol. 34, No. 8, pp. 1031–1034, 1998

Technical Communique

Discrete-Time Optimal Control with Control-Dependent Noise and Generalized Riccati Difference Equations*

ALESSANDRO BEGHI† and DOMENICO D'ALESSANDRO‡

Key Words—Discrete-time optimal control; control-dependent noise; generalized Riccati equations.

* Received 10 October 1995; revised 14 August 1997; received in final form 27 February 1998. This paper was recommended for publication in revised form by Editor Peter Dorato. Corresponding author Alessandro Beghi. Tel. 00-39-09-8277626; Fax 00-39-09-8277699; E-mail [email protected].
† Dipartimento di Elettronica e Informatica, Università di Padova, via Gradenigo 6/A, 35131 Padova, Italy.
‡ Department of Mechanical and Environmental Engineering, University of California, Santa Barbara, CA 93106, USA.


Abstract—The optimal control law is derived for discrete-time linear stochastic systems with quadratic performance criterion and control-dependent noise. The analysis includes the study of a generalized Riccati difference equation and of the asymptotic behavior of its solutions. © 1998 Elsevier Science Ltd. All rights reserved.

1. Introduction

The analysis of systems with state- and/or control-dependent noise is a classical problem in stochastic control theory. In recent years, there has been a renewed interest in this subject. In particular, by letting the statistical description of the noise not be known a priori but depend on the control and state evolution, the robustness of the overall control system can be improved (see the work of Skelton and coworkers, e.g. Skelton and Shi, 1995). In several situations this framework is more realistic than the classical one, and the control law derived in the standard LQG framework is no longer optimal; its use may lead to a serious degradation of the performance of the system, which may even become unstable (Ruan and Choudhury, 1993; Królikowski, 1997). With this motivation, we study in this communiqué the linear quadratic optimal control problem for discrete-time systems with control-dependent noise. A fundamental role in the synthesis of the control law is played by a generalized Riccati difference equation (GRDE) (De Souza and Fragoso, 1990). We investigate the asymptotic behavior of the solutions of this equation and derive results on the existence of an equilibrium solution and on its attractiveness properties. Connections between the results presented here and related problems dealt with in the literature are also reported.

2. Derivation of the optimal feedback control law

We consider the optimal control problem, over a finite interval I = [0, N], of possibly time-varying systems described by the state-space model

\[
x(t+1) = A(t)\,x(t) + B(t)\,u(t) + \sum_{i=1}^{m} u_i(t)\, G_i(t)\, v(t) + D(t)\, w(t), \tag{1}
\]

with $x(0) \sim N(0, \Sigma(0))$, $x(\cdot) \in \mathbb{R}^n$, $u(\cdot) \in \mathbb{R}^m$, $v(\cdot) \in \mathbb{R}^{d_1}$, $w(\cdot) \in \mathbb{R}^{d_2}$, where $u_i$ denotes the $i$-th component of the control vector $u$, and the matrices $A(\cdot)$, $B(\cdot)$, $G_i(\cdot)$, $D(\cdot)$ have appropriate dimensions. In equation (1), $v(t)$ and $w(t)$ are zero-mean white Gaussian noise (WGN) processes, independent of $x(0)$ and of each other, with intensities $V(t)$ and $W(t)$, respectively. The noise dependence on the control is modeled as in Wonham (1967, 1968) and McLane (1971), although the approach used in this paper can easily be adapted to cope with different choices, such as the one proposed in Skelton and Shi (1995) (see Beghi and D'Alessandro, 1997). The problem we consider is then stated as follows.

Problem 1. For the system described by equation (1), find the state feedback law $u(t) = -F(t)\,x(t)$ such that the performance index

\[
J = \frac{1}{2}\, E\!\left[ \sum_{t=0}^{N-1} \bigl(x^T(t) Q(t) x(t) + u^T(t) R(t) u(t)\bigr) + x^T(N)\, M\, x(N) \right] \tag{2}
\]

is minimized. In equation (2), $M \ge 0$ and $Q(t) \ge 0$, $R(t) > 0$, $t = 0, \dots, N-1$.

As a first step, we write the performance index J in equation (2) as

\[
J = \frac{1}{2}\, \mathrm{Tr}\!\left[ \sum_{t=0}^{N-1} \bigl(Q(t) + F^T(t) R(t) F(t)\bigr)\,\Sigma(t) + M\,\Sigma(N) \right], \tag{3}
\]

where $\Sigma(t)$ is the state variance of the controlled process. A recursion for $\Sigma(t)$ can be obtained by computing $\Sigma(t+1)$ as $E[\,E[x(t+1)\,x^T(t+1) \mid x(t)]\,]$. We find that

\[
\Sigma(t+1) = (A(t) - B(t)F(t))\,\Sigma(t)\,(A(t) - B(t)F(t))^T + \sum_{i=1}^{m}\sum_{j=1}^{m} \bigl(F(t)\,\Sigma(t)\,F^T(t)\bigr)_{ij}\, G_i(t)\, V(t)\, G_j^T(t) + D(t)\, W(t)\, D^T(t), \quad 0 \le t \le N-1, \tag{4}
\]

with initial condition $\Sigma(0)$, where $(X)_{ij}$ denotes the $(i,j)$-th element of the matrix X. The solution of Problem 1 is equivalent to the search for the sequence of matrices $F(t)$, $t = 0, \dots, N-1$, which minimizes equation (3), subject to the constraint (4) on the controlled process variance. This can be seen as a deterministic optimization problem, with the elements of $\Sigma(t)$ as state variables, the recursion (4) as nonlinear state dynamics, the elements of $F(t)$ as control variables, and equation (3) as the index to be minimized. In the following theorem, we solve this problem by using a matrix version of the maximum principle (Athans and Tse, 1967).
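For illustration, the variance recursion (4) and the cost (3) are straightforward to evaluate numerically once the system matrices and a feedback gain are fixed. The following sketch (Python with NumPy; all matrices and the horizon below are illustrative placeholders, not data from the paper) propagates Σ(t) under a constant gain F and evaluates (3); it is not part of the original communique.

```python
import numpy as np

def propagate_variance(A, B, G, D, V, W, F, Sigma0, N):
    """Propagate the state variance Sigma(t) of the controlled process,
    with control-dependent noise, following the recursion (4)."""
    m = B.shape[1]
    Sigma = Sigma0
    history = [Sigma0]
    for _ in range(N):
        Acl = A - B @ F
        S = Acl @ Sigma @ Acl.T + D @ W @ D.T
        FSF = F @ Sigma @ F.T                     # m x m matrix (F Sigma F^T)
        for i in range(m):
            for j in range(m):
                S = S + FSF[i, j] * (G[i] @ V @ G[j].T)
        Sigma = S
        history.append(Sigma)
    return history

def cost(Q, R, M, F, history):
    """Evaluate the index (3) along the variance trajectory produced above."""
    J = 0.5 * np.trace(M @ history[-1])
    for Sigma in history[:-1]:
        J += 0.5 * np.trace((Q + F.T @ R @ F) @ Sigma)
    return J

# Illustrative data (placeholders): n = 2 states, m = 1 input, horizon N = 20.
A = np.array([[1.0, 0.1], [0.0, 0.95]])
B = np.array([[0.0], [0.1]])
G = [np.array([[0.05], [0.02]])]          # one G_i per input component
D = np.eye(2)
V = np.array([[1.0]]); W = 0.01 * np.eye(2)
Q = np.eye(2); R = np.array([[1.0]]); M = np.eye(2)
F = np.array([[0.5, 1.0]])
hist = propagate_variance(A, B, G, D, V, W, F, np.eye(2), 20)
print(cost(Q, R, M, F, hist))
```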

* Received 10 October 1995; revised 14 August 1997; receivd in final form 27 February 1998. This paper was recommended for publication in revised form by Editor Peter Dorato. Corresponding author Alessandro Beghi. Tel. 00-39-09-8277626; Fax 00-39-09-8277699; E-mail [email protected]. - Dipartimeto di Elettronica e Informatica, Universita` di Padova, via Gradenigo 6/A, 35131 Padova, Italy. ‡ Department of Mechanical and Environmental Engineering, University of California, Santa Barbara, CA 93106, USA.

Theorem 1. The feedback control law solving Problem 1 is given by $u(t) = -F(t)\,x(t)$, where

\[
F(t) = \bigl[R(t) + B^T(t) P(t+1) B(t) + \Omega(P(t+1))\bigr]^{-1} B^T(t)\, P(t+1)\, A(t), \tag{5a}
\]

with

\[
\bigl(\Omega(P(t+1))\bigr)_{ij} = \mathrm{Tr}\bigl[P(t+1)\, G_i(t)\, V(t)\, G_j^T(t)\bigr], \tag{5b}
\]


and P(t) solves the following generalized Riccati difference equation (GRDE):

\[
P(t) = A^T(t) P(t+1) A(t) - A^T(t) P(t+1) B(t) \bigl[R(t) + B^T(t) P(t+1) B(t) + \Omega(P(t+1))\bigr]^{-1} B^T(t) P(t+1) A(t) + Q(t), \tag{5c}
\]

with terminal condition P(N) = M.
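A minimal numerical sketch of the recursion in Theorem 1 is given below (Python/NumPy; time-invariant matrices are used only for brevity, and the function names are our own illustrative choices, not part of the paper). It evaluates the operator Ω of (5b), iterates (5c) backward from P(N) = M, and returns the gains F(t) of (5a). Combined with the variance recursion sketched after equation (4), this suffices to evaluate the controlled process and its cost numerically.

```python
import numpy as np

def omega(P, G, V):
    """Operator of equation (5b): (Omega(P))_{ij} = Tr[P G_i V G_j^T]."""
    m = len(G)
    Om = np.empty((m, m))
    for i in range(m):
        for j in range(m):
            Om[i, j] = np.trace(P @ G[i] @ V @ G[j].T)
    return Om

def grde_gains(A, B, G, V, Q, R, M, N):
    """Backward recursion (5c) from P(N) = M, returning the gains F(t) of (5a)
    and P(0). Time-invariant data are assumed here only to keep the sketch short."""
    P = M
    gains = [None] * N
    for t in reversed(range(N)):
        Om = omega(P, G, V)
        S = R + B.T @ P @ B + Om
        F = np.linalg.solve(S, B.T @ P @ A)       # gain of equation (5a)
        P = A.T @ P @ A - A.T @ P @ B @ F + Q     # equation (5c)
        gains[t] = F
    return gains, P

# Example with the illustrative data from the previous sketch:
# gains, P0 = grde_gains(A, B, G, V, Q, R, M, N=20)
```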

Notice that Ω(·) in equation (5b) is a linear operator on the space of symmetric matrices and is such that if P ≥ 0 then Ω(P) ≥ 0, i.e., it is a linear positive operator (Wonham, 1967).

Proof. For the sake of notational convenience, from now on we shall often omit the time index whenever it is t. We proceed as in a standard optimization problem by adjoining the dynamics (4) to the performance index (3) to form

\[
\bar{J} = \frac{1}{2} \sum_{t=0}^{N-1} \mathrm{Tr}\bigl[(Q + F^T R F)\,\Sigma\bigr] + \frac{1}{2}\,\mathrm{Tr}\bigl[M\,\Sigma(N)\bigr]
+ \frac{1}{2} \sum_{t=0}^{N-1} \mathrm{Tr}\, P(t+1) \Bigl( (A - BF)\,\Sigma\,(A - BF)^T + \sum_{i=1}^{m}\sum_{j=1}^{m} (F\Sigma F^T)_{ij}\, G_i V G_j^T + D W D^T - \Sigma(t+1) \Bigr)
= \sum_{t=0}^{N-1} H(t) + \frac{1}{2}\,\mathrm{Tr}\bigl[M\,\Sigma(N)\bigr], \tag{6}
\]

where P(t), t = 0, …, N−1, are symmetric matrices of Lagrange multipliers, and H(t), t = 0, …, N−1, is the scalar sequence of Hamiltonians. The first-order necessary conditions for optimality are

\[
\frac{\partial H(t)}{\partial F(t)} = 0, \qquad P(t) = \frac{\partial H(t)}{\partial \Sigma(t)}, \qquad P(N) = M. \tag{7}
\]

The matrix differentiation formulas reported in Rogers (1980) show that relations (7) are equivalent to

\[
0 = \bigl[-B^T P(t+1)(A - BF) + \Omega(P(t+1))\,F + R F\bigr]\,\Sigma, \tag{8a}
\]
\[
P(t) = (A - BF)^T P(t+1)(A - BF) + F^T\bigl(\Omega(P(t+1)) + R\bigr) F + Q, \tag{8b}
\]

with Ω(P(t+1)) defined as in equation (5b). We see from equation (8b) that, since P(N) = M is a symmetric matrix, all the P(t), t = 0, …, N−1, are indeed symmetric. Also, since Q(t) ≥ 0, we have that P(t) ≥ 0, t = 0, …, N. Equation (8a) is solved by choosing F(t) as in equation (5a), where the existence of the indicated inverse is guaranteed by the standing assumption R(t) > 0. Finally, plugging equation (5a) into equation (8b) we obtain equation (5c). ∎

The form of the solution of Problem 1 is similar to the one obtained by the classical LQG method, the optimal feedback depending on the solution of the Riccati-type difference equation (5c). The GRDE (5c) differs from the standard Riccati difference equation (RDE) by the presence of the extra term Ω(P(t+1)) in the matrix under inversion, which depends on the given noise structure. This term can be interpreted as a further weighting on the control. As a consequence, the effect of the control-dependent noise is to force the use of a more "cautious" control: the larger the control signal, the larger the noise introduced by the actuator and, therefore, the degradation of the performance.

Remark 1. It is not difficult to show that P(t) ≥ P_std(t) for t = N−1, N−2, …, 0, where P_std(t) is the solution of the standard Riccati equation obtained by imposing Ω(P(t+1)) = 0 in equation (5c) and choosing P_std(N) = M.

Remark 2. A direct extension of a standard result yields the following expression for the minimum value of the index (3) corresponding to the control law specified by equations (5a) and (5b):

\[
J_{\min} = \frac{1}{2}\,\mathrm{Tr}\!\left[ \Sigma(0)\, P(0) + \sum_{t=0}^{N-1} D(t)\, W(t)\, D^T(t)\, P(t+1) \right]. \tag{9}
\]

3. Analysis of the generalized Riccati difference equation

We devote this section to the analysis of the time-invariant GRDE arising in the infinite-horizon version of Problem 1. In particular, we are interested in finding conditions for the convergence of the GRDE to an equilibrium point and for ensuring that the resulting constant closed-loop system matrix A − BF is stable. Some of the results will only be sketched, since they are natural generalizations of the ones known in the standard LQG context. We rewrite equation (5c) as

\[
P(t) = A^T P(t+1) A - A^T P(t+1) B \bigl[R + B^T P(t+1) B + \Omega(P(t+1))\bigr]^{-1} B^T P(t+1) A + C^T C, \tag{10}
\]

where A, B, C, R are now constant matrices, with R > 0 and C^T C = Q. The equilibrium solutions of this equation are solutions of the generalized discrete-time algebraic Riccati equation (GDARE)

\[
P = A^T P A - A^T P B \bigl[R + B^T P B + \Omega(P)\bigr]^{-1} B^T P A + C^T C.
\]

To establish our results, we need the hypotheses of detectability of the pair (A, C) and stabilizability of the pair (A, B). Moreover, in order to obtain boundedness results, we shall consider the following assumption on the "magnitude" of the control-dependent noise, which is the discrete-time analogue of the one considered in Wonham (1968).

Assumption 1. It holds that

\[
\inf_F |X_F| < 1, \tag{11}
\]

where

\[
X_F \triangleq \sum_{i=0}^{\infty} \bigl((A - BF)^T\bigr)^i F^T \Omega(I)\, F\, (A - BF)^i. \tag{12}
\]

In equation (11), the infimum is taken over the matrices F such that A − BF is stable, and |X| is the Euclidean norm of a symmetric matrix X (i.e., the absolute value of the numerically largest eigenvalue of X).

It is easy to verify that the above norm coincides with the norm, induced by the Euclidean norm, of the linear operator

\[
\mathcal{A}_F(\cdot) \triangleq \sum_{i=0}^{\infty} \bigl((A - BF)^T\bigr)^i F^T \Omega(\cdot)\, F\, (A - BF)^i
\]

on the symmetric matrices.

We start by introducing some preliminary lemmas which will be needed in the following. The first one is a result which is proved, in the more general context of positive operators in Hilbert spaces, in Kantorovich and Akilov (1964, p. 189).

Lemma 1. Consider a sequence {P_n}, n = 0, 1, 2, …, of symmetric matrices which is monotonically non-decreasing (non-increasing) in the matrix sense, i.e. such that P_0 ≤ P_1 ≤ P_2 ≤ … (P_0 ≥ P_1 ≥ P_2 ≥ …). Assume this sequence is bounded, namely, there exists a matrix P̄ such that |P_n| ≤ |P̄| (|P_n| ≥ |P̄|) for each n = 0, 1, 2, … . Then lim_{n→∞} P_n exists, and |lim_{n→∞} P_n| ≤ |P̄| (|lim_{n→∞} P_n| ≥ |P̄|).

Then we give two monotonicity and comparison results for solutions of the GRDE. They are stated without proof, since they can be derived analogously to the standard case (Bitmead and Gevers, 1989).

Lemma 2. Assume P_1(t) and P_2(t) are symmetric solutions of the GRDE (5c) in (−∞, N]. If 0 ≤ P_2(t̄) ≤ P_1(t̄) for a given time t̄ ≤ N, then 0 ≤ P_2(t) ≤ P_1(t) for any t ≤ t̄.
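As a numerical companion to the analysis of this section, the time-invariant recursion (10) can simply be iterated from a nonnegative terminal condition and monitored for convergence to a fixed point of the GDARE. The sketch below (Python/NumPy, reusing the `omega` helper and the illustrative placeholder data of the earlier sketches) is one possible implementation and is not part of the original paper; the conditions under which the iteration converges and the resulting closed loop is stable are the subject of the results that follow.

```python
import numpy as np

def gdare_fixed_point(A, B, C, R, G, V, tol=1e-10, max_iter=10000):
    """Iterate the time-invariant GRDE (10) from P = 0 until a fixed point
    (a solution of the GDARE) is numerically reached."""
    n = A.shape[0]
    P = np.zeros((n, n))
    for _ in range(max_iter):
        S = R + B.T @ P @ B + omega(P, G, V)
        P_new = A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(S, B.T @ P @ A) + C.T @ C
        if np.max(np.abs(P_new - P)) < tol:
            return P_new
        P = P_new
    return P

# Stationary gain and closed-loop spectral radius (illustrative data as above):
# P = gdare_fixed_point(A, B, np.eye(2), R, G, V)
# F = np.linalg.solve(R + B.T @ P @ B + omega(P, G, V), B.T @ P @ A)
# print(np.max(np.abs(np.linalg.eigvals(A - B @ F))))   # < 1 under the hypotheses below
```

Starting the iteration from the zero matrix corresponds to the terminal condition P(N) = 0 considered later in this section.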

Lemma 3. Let P_0(t) be the solution of the GRDE corresponding to the terminal condition P(N) = 0. Then P_0(t) is always nondecreasing in the reverse time order, i.e. P_0(t−1) ≥ P_0(t) for t ≤ N.

We proceed now to proving a stability property of the symmetric nonnegative-definite (s.n.d.) solutions of the GRDE. To this aim, we first state a boundedness result. It is at this stage that Assumption 1 plays a fundamental role.

Lemma 4. Under the hypothesis of stabilizability of the pair (A, B), and Assumption 1, every s.n.d. solution of the GRDE is bounded over (−∞, N].

Proof. It is easy to show that the optimal feedback matrix F(t) in equation (5a) can be obtained by minimizing equation (8b) over F. This can be directly verified by solving the equation ∂P(t)/∂F(t) = 0 for F(t). Therefore,

\[
P(t) = \inf_F \bigl\{ (A - BF)^T P(t+1)(A - BF) + F^T\bigl(R + \Omega(P(t+1))\bigr) F + C^T C \bigr\}, \tag{13}
\]

with P(N) ≥ 0. As a consequence, defining P̄(t) as

\[
\bar{P}(t) \triangleq (A - B\bar{F})^T P(t+1)(A - B\bar{F}) + \bar{F}^T\bigl(R + \Omega(P(t+1))\bigr)\bar{F} + C^T C, \tag{14}
\]

with F̄ a given matrix and P̄(N) = P(N), we have that P̄(t) ≥ P(t) (this is true for t = N−1 and follows for the other times using Lemma 2). Therefore, it is enough to show that P̄(t), defined by equation (14), is bounded over (−∞, N] when P̄(N) = P(N) ≥ 0, for a certain matrix F̄. We choose F̄ so that

\[
\Bigl| \sum_{i=0}^{\infty} \bigl((A - B\bar{F})^T\bigr)^i \bar{F}^T \Omega(I)\, \bar{F}\, (A - B\bar{F})^i \Bigr| = k < 1. \tag{15}
\]

This is possible by the assumption of stabilizability of the pair (A, B) and Assumption 1. Let Ā ≜ A − BF̄. We write the solution of the Lyapunov-type equation (14) as

\[
\bar{P}(t) = (\bar{A}^T)^{N-t} \bar{P}(N)\, \bar{A}^{N-t}
+ \sum_{i=1}^{N-t} (\bar{A}^T)^{N-t-i} \bigl(\bar{F}^T R \bar{F} + C^T C\bigr)\, \bar{A}^{N-t-i}
+ \sum_{i=1}^{N-t} (\bar{A}^T)^{N-t-i}\, \bar{F}^T \Omega\bigl(\bar{P}(N-i+1)\bigr)\, \bar{F}\, \bar{A}^{N-t-i}.
\]

Define

\[
M \triangleq \sup_{t \le N} \Bigl\{ \Bigl| (\bar{A}^T)^{N-t} \bar{P}(N)\, \bar{A}^{N-t} + \sum_{i=1}^{N-t} (\bar{A}^T)^{N-t-i} \bigl(\bar{F}^T R \bar{F} + C^T C\bigr)\, \bar{A}^{N-t-i} \Bigr| \Bigr\}. \tag{16}
\]

Observe that M in equation (16) is well defined thanks to the stability of Ā. Using equation (15), we have

\[
|\bar{P}(t)| \le M + k \max_{t+1 \le j \le N} \bigl\{ |\bar{P}(j)| \bigr\}. \tag{17}
\]

Notice that, from equation (16), M ≥ |P̄(N)|, and therefore M + k|P̄(N)| ≥ |P̄(N)|. Using this, we can build from equation (17) the following set of inequalities:

\[
|\bar{P}(N-1)| \le M + k|\bar{P}(N)|,
\]
\[
|\bar{P}(N-2)| \le M + k \max\{ |\bar{P}(N)|, |\bar{P}(N-1)| \} \le M + k \max\{ |\bar{P}(N)|, M + k|\bar{P}(N)| \} = M + kM + k^2 |\bar{P}(N)|
\]

and, by iteration,

\[
|\bar{P}(t)| \le M + kM + \dots + k^{N-t-1} M + k^{N-t} |\bar{P}(N)| \le M + kM + \dots + k^{N-t-1} M + k^{N-t} M.
\]

The latter is bounded from above by (1 − k)^{-1} M. ∎

It is now easy to prove the convergence of the solution of the GRDE corresponding to P(N) = 0 to a solution of the GDARE. In fact, under the hypotheses of Lemma 4, we have that P_0(t) is bounded and, by Lemma 3, it is also monotone; therefore, by Lemma 1 it has a limit, which is a s.n.d. solution of the GDARE. Assumption 1 together with stabilizability of (A, B) is then sufficient to guarantee the existence of a s.n.d. solution of the GDARE. We observe that, in the scalar case, it can easily be shown that this condition is also necessary.

We are now interested in proving the stabilizing properties of s.n.d. solutions of the GDARE. We shall use the following well-known inertia result for solutions of Lyapunov equations.

Lemma 5. Assume that there exists a s.n.d. matrix P̄ which satisfies P̄ = Ā^T P̄ Ā + L^T L. If the pair (Ā, L) is detectable, then Ā is a stable matrix.

Lemma 6. Let P̄ be a s.n.d. solution of the GDARE. Assume detectability of the pair (A, C). Then Ā ≜ A − BF̄, with F̄ the feedback matrix relative to P̄, is a stable matrix.

Proof. Write the GDARE for P̄ in the form

\[
\bar{P} = (A - B\bar{F})^T \bar{P}\,(A - B\bar{F}) + \bar{F}^T\bigl(\Omega(\bar{P}) + R\bigr)\bar{F} + C^T C, \tag{18}
\]

which is equivalent to the form P̄ = Ā^T P̄ Ā + L^T L if L is such that L^T L = F̄^T(Ω(P̄) + R)F̄ + C^T C. It can be shown that, if there exists an undetectable mode of the pair (Ā, L), it is also an undetectable mode of (A, C). Therefore, under our assumptions, (Ā, L) is detectable and the result follows from Lemma 5. ∎

Using the stability result of Lemma 6, a standard argument proves that the s.n.d. solution of the GDARE is unique. Furthermore, the monotonicity and boundedness results of Lemmas 2–4, together with Lemma 1, yield convergence of every s.n.d. solution of the GRDE. Therefore, we can state the following result.

Theorem 2. Assume that (A, B) is stabilizable, (A, C) is detectable, and Assumption 1 holds. Then every s.n.d. solution of the GRDE converges to the unique s.n.d. solution of the GDARE.
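Theorem 2 rests on Assumption 1, which can be checked numerically for a candidate stabilizing gain F: the matrix X_F of equation (12) satisfies the discrete Lyapunov equation X_F = (A − BF)^T X_F (A − BF) + F^T Ω(I) F (this fact is also used in the proof of Proposition 1 below), so it can be approximated by a simple fixed-point iteration. The sketch below is illustrative only (Python/NumPy, reusing the `omega` helper and placeholder data of the earlier sketches).

```python
import numpy as np

def X_F(A, B, F, G, V, n_iter=2000):
    """Approximate X = (A-BF)^T X (A-BF) + F^T Omega(I) F by fixed-point iteration
    (valid when A - BF is stable); X is the matrix X_F of equation (12)."""
    n = A.shape[0]
    Acl = A - B @ F
    Q_F = F.T @ omega(np.eye(n), G, V) @ F
    X = np.zeros((n, n))
    for _ in range(n_iter):
        X = Acl.T @ X @ Acl + Q_F
    return X

# Assumption 1 holds if, for some stabilizing F, the numerically largest absolute
# eigenvalue of X_F is below 1 (equation (11)).
# F = ...  (any gain with A - BF stable, e.g. from the GRDE sketch above)
# print(np.max(np.abs(np.linalg.eigvalsh(X_F(A, B, F, G, V)))))
```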

Observe that we only require detectability of the pair (A, C) to establish convergence, while in Wonham (1968) the stronger hypothesis of observability is required for the same result.

We now give a sufficient condition, in terms of the system parameters A, B, and Ω(I), for verifying whether Assumption 1 holds.

Proposition 1. Assume Ω(I) > 0. If

\[
A^T \bigl(I + B\,\Omega(I)^{-1} B^T\bigr)^{-1} A < I, \tag{19}
\]

then equation (11) is verified.

Proof. Since the infimum in Assumption 1 is over the stabilizing F's, X_F can be written as the solution of the Lyapunov equation

\[
X_F = (A - BF)^T X_F (A - BF) + F^T \Omega(I)\, F. \tag{20}
\]

As a consequence, verifying the validity of Assumption 1 is equivalent to determining whether there exists an F such that X_F in equation (20) satisfies |X_F| < 1. Let Y ≜ Ω(I) and A_F ≜ A − BF. We rewrite equation (20) as

\[
(I - X_F) = A_F^T (I - X_F) A_F + I - A_F^T A_F - F^T Y F. \tag{21}
\]

If there exists an F such that |A_F| < 1 and I − A_F^T A_F − F^T Y F > 0, then it follows from standard Lyapunov theory that I − X_F > 0, i.e. X_F < I, which in turn ensures that |X_F| < 1. Therefore, the proposition is proved if we show that, under condition (19), such an F exists.


Let

\[
\hat{F} = (B^T B + Y)^{-1} B^T A \tag{22}
\]

and Â ≜ A_{F̂} = A − BF̂. Observe that the inverse in equation (22) exists since Y > 0. We first show that |Â| < 1. In fact,

\[
\hat{A}^T \hat{A} = A^T A - 2 A^T B (B^T B + Y)^{-1} B^T A + A^T B (B^T B + Y)^{-1} B^T B\, (B^T B + Y)^{-1} B^T A. \tag{23}
\]

Using the Matrix Inversion Lemma, equation (19) can be rewritten as

\[
A^T A < I + A^T B (B^T B + Y)^{-1} B^T A. \tag{24}
\]

Plugging equation (24) into equation (23), after some algebra we get

\[
\hat{A}^T \hat{A} < I - A^T B (B^T B + Y)^{-1} Y (B^T B + Y)^{-1} B^T A \le I, \tag{25}
\]

from which it easily follows that Â is a stable matrix. We complete the proof by showing that I − Â^T Â − F̂^T Y F̂ > 0, or, equivalently, that Â^T Â + F̂^T Y F̂ < I. Now,

\[
\hat{A}^T \hat{A} + \hat{F}^T Y \hat{F} = A^T A - A^T B (B^T B + Y)^{-1} B^T A = A^T \bigl(I - B (B^T B + Y)^{-1} B^T\bigr) A = A^T \bigl(I + B Y^{-1} B^T\bigr)^{-1} A < I, \tag{26}
\]

where the last relation follows from the Matrix Inversion Lemma and equation (19). ∎

It is interesting to observe that F̂ minimizes the scalar function

\[
y(F) = \|A_F\|^2 + \|F\|_Y^2, \tag{27}
\]

where ‖P‖ = (Tr P^T P)^{1/2} and ‖P‖_Q = (Tr P^T Q P)^{1/2}. The minimization of y(F) can be interpreted as an attempt to mediate between the need for an F that is as "stabilizing" as possible and that of avoiding large control actions, which would introduce too much noise into the system.
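For completeness, condition (19) and the gain F̂ of equation (22) are easy to evaluate numerically. The following sketch (Python/NumPy; illustrative placeholder data, reusing the `omega` and `X_F` helpers defined in the earlier sketches) checks (19), builds F̂, and verifies that A − BF̂ is stable and that |X_F̂| < 1, as guaranteed by Proposition 1; it is an illustration of ours, not part of the paper.

```python
import numpy as np

def check_condition_19(A, B, G, V):
    """Return True if A^T (I + B Omega(I)^{-1} B^T)^{-1} A < I (condition (19)).
    Requires Omega(I) > 0, as assumed in Proposition 1."""
    n = A.shape[0]
    Y = omega(np.eye(n), G, V)
    Mmat = A.T @ np.linalg.solve(np.eye(n) + B @ np.linalg.solve(Y, B.T), A)
    return np.min(np.linalg.eigvalsh(np.eye(n) - Mmat)) > 0

def f_hat(A, B, G, V):
    """Gain of equation (22): F_hat = (B^T B + Y)^{-1} B^T A, with Y = Omega(I)."""
    n = A.shape[0]
    Y = omega(np.eye(n), G, V)
    return np.linalg.solve(B.T @ B + Y, B.T @ A)

# With the illustrative data used above:
# F = f_hat(A, B, G, V)
# print(check_condition_19(A, B, G, V))
# print(np.max(np.abs(np.linalg.eigvals(A - B @ F))))            # stability of A - B F_hat
# print(np.max(np.abs(np.linalg.eigvalsh(X_F(A, B, F, G, V)))))  # |X_F| < 1 when (19) holds
```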

4. Conclusions

We have considered the optimal control of discrete-time linear stochastic systems with control-dependent noise and quadratic performance criterion. Similar problems are dealt with in De Koning (1982), where the dependence of the noise on the control is modeled via stochastic matrices and the additive part is neglected. In that paper, the linear operators used for determining the stability of the second-moment equation and the convergence of the generalized Riccati difference equation considered there coincide; this led to convergence conditions formulated in terms of mean-square stabilizability. In our approach, the above-mentioned operators are different. This is a consequence of having explicitly modeled the noise dependence on the control. We have shown how this treatment leads to explicit conditions for the solvability of the control problem, such as the one given in Proposition 1. Moreover, the control law derived in this way is amenable to a system-theoretic interpretation as a compromise between stability and optimality. Discrete-time models with control-dependent noise are also considered in Bernstein and Haddad (1987) and De Koning (1992), in the context of the minimization of different cost criteria. The counterpart of our results for the continuous-time case can be found in Wonham (1967, 1968).

References

Athans, M. and E. Tse (1967). A direct derivation of the optimal linear filter using the maximum principle. IEEE Trans. Automat. Control, AC-12, 690–698.
Beghi, A. and D. D'Alessandro (1997). Some remarks on FSN models and generalized Riccati equations. In Proc. of the 4th European Control Conf., Paper no. 662, Brussels, July 1997.
Bernstein, D. S. and W. M. Haddad (1987). Optimal projection equations for discrete-time fixed-order dynamic compensation of linear systems with multiplicative white noise. Int. J. Control, 46, 65–73.
Bitmead, R. B. and M. R. Gevers (1989). Riccati difference and differential equations: Convergence, monotonicity and stabilizability. In The Riccati Equation, eds S. Bittanti, A. Laub and J. C. Willems, pp. 263–291. Springer, New York.
Kantorovich, L. V. and G. P. Akilov (1964). Functional Analysis in Normed Spaces. Macmillan, New York.
De Koning, W. (1982). Infinite horizon optimal control of linear discrete time systems with stochastic parameters. Automatica, 18, 443–453.
De Koning, W. (1992). Compensatability and optimal compensation of systems with white parameters. IEEE Trans. Automat. Control, AC-37, 579–588.
De Souza, C. E. and M. D. Fragoso (1990). On the existence of maximal solution for generalized algebraic Riccati equations arising in stochastic control. Systems Control Lett., 14, 233–239.
Królikowski, A. (1997). Steady-state optimal discrete-time control of first-order systems with actuator noise variance linearly related to actuator signal variance. IEEE Trans. Automat. Control, AC-42, 277–280.
McLane, P. J. (1971). Optimal stochastic control of linear systems with state- and control-dependent disturbances. IEEE Trans. Automat. Control, AC-16, 793–798.
Rogers, G. S. (1980). Matrix Derivatives. Lecture Notes in Statistics. Marcel Dekker, New York.
Ruan, M. and A. K. Choudhury (1993). Steady-state optimal controller with actuator variance linearly related to actuator signal variance. IEEE Trans. Automat. Control, AC-38, 133–135.
Skelton, R. E. and G. Shi (1995). Finite signal-to-noise models: Covariance control by state feedback. In Proc. 3rd European Control Conf., Roma, Italy, September 1995, pp. 77–82.
Wonham, W. M. (1967). Optimal stationary control of linear systems with state-dependent noise. SIAM J. Control Optim., 9, 185–198.
Wonham, W. M. (1968). On a matrix Riccati equation of stochastic control. SIAM J. Control Optim., 6, 681–697.