Iterative Solution of Algebraic Matrix Riccati Equations in Open Loop Nash Games


Copyright © IFAC Control Applications of Optimisation, Visegrád, Hungary, 2003

ELSEVIER

IFAC PUBLICATIONS www.elsevier.com/locate/ifac

ITERATIVE SOLUTION OF ALGEBRAIC MATRIX RICCATI EQUATIONS IN OPEN LOOP NASH GAMES

Teresa Paula Azevedo-Perdicoulis * Gerhard Jank **

* ISR - pólo de Coimbra, Departamento Matemática, UTAD, P-5000-59J Vila Real, Portugal, e-mail: tazevedo@utad.pt

** Lehrstuhl II für Mathematik, RWTH Aachen, Templergraben 55, D-52056 Aachen, Germany, e-mail: [email protected]

Abstract: In this note we study a fixpoint iteration approach to solve algebraic Riccati equations as they appear in general two player Nash differential games on an infinite time horizon, where the information structure is of open loop type. We obtain conditions for existence and uniqueness of non negative solutions. The performance of the numerical algorithm is shown in an example. Copyright © 2003 IFAC

Keywords: Linear quadratic games, Nash games, non-symmetric algebraic Riccati equations, fixpoint iteration.

1. INTRODUCTION

The type of algebraic matrix Riccati equations studied in this article appears in Nash games with quadratic performance criteria and a linear system described by a linear differential equation (see (Başar and Olsder, 1995)). For simplicity we restrict ourselves to the two player case, and the information structure is assumed to be of open loop type. The Nash equilibrium controls can then be represented in feedback form. For the game on a finite time horizon the feedback operator mainly depends on a solution of a Riccati differential equation (see (Başar and Olsder, 1995)), while in the time invariant situation with an infinite time horizon it mainly depends on the solutions of the algebraic Riccati equations considered in this article (see (Engwerda, 1998; Kremer and Stefan, 2002)).

Typically the algebraic Riccati equations as they appear in game theory are non-symmetric, which makes efficient numerical methods developed for symmetric equations from control theory not applicable. Alternatively, general solvers for nonlinear equations usually do not perform sufficiently well, and solution methods which need to determine invariant subspaces of an associated high dimensional matrix (see (Reid, 1972; Abou-Kandil et al., 1993)) are restricted to lower dimensional systems.

For further information about solutions of general non-symmetric algebraic Riccati equations and Riccati differential equations see also (Abou-Kandil et al., 2003b). In (Jank and Kun, 1998), see also (van den Broek, 2001), a system of coupled algebraic Riccati equations was encountered, appearing in Nash games with feedback information structure and admitting several positive semi-definite stabilizing solutions. This makes the convergence of any algorithm quite difficult.


In (Guo and Laub, 2000), a Newton and a fixpoint iteration procedure for solving non-symmetric algebraic Riccati equations appearing in transport theory are considered. Unfortunately, equations encountered in open loop Nash games do not fit into the algorithms developed there. Therefore, we present here another algorithm.

Throughout the paper we use the following order relation in the set of real $m \times n$ matrices $M_{m \times n}$:

$$A < B \ \text{ if } \ a_{ij} < b_{ij}, \qquad A \le B \ \text{ if } \ a_{ij} \le b_{ij}, \tag{1}$$

for all $i = 1, \dots, m$, $j = 1, \dots, n$. A real square matrix $A \in M_{n \times n}$ is called a Z-matrix if there exist $s \in \mathbb{R}$ and $B \in M_{n \times n}$, $B \ge 0$, such that $A = sI_n - B$, where $I_n$ denotes the $n$-dimensional unit matrix.

A Z-matrix is a monotonic matrix in the order cone $\mathcal{K}_n := \{(x_1, \dots, x_n) \in \mathbb{R}^n \mid x_i \ge 0,\ i = 1, \dots, n\}$, called an M-matrix, if $s > \rho(B)$, where $\rho : M_{n \times n} \to \mathbb{R}_{\ge 0}$ denotes the spectral radius of the matrix. If $s = \rho(B)$ then $A$ is called a singular M-matrix.

We recall the following results on Z-matrices being M-matrices (see (Berman and Plemmons, 1979; Guo and Laub, 2000)).

Theorem 1. For a Z-matrix $A$, the following are equivalent:

(i) $A$ is an M-matrix.
(ii) $A^{-1} \ge 0$.
(iii) $Av > 0$ for some vector $v > 0$.
(iv) All eigenvalues of $A$ have positive real parts.

2. FIXPOINT ITERATION

The open loop Nash algebraic Riccati equations are defined by $R_1(K_1, K_2) = 0$, $R_2(K_1, K_2) = 0$, where:

$$\begin{aligned} R_1(K_1, K_2) &= -K_1 A - A^T K_1 - Q_1 + K_1 S_1 K_1 + K_1 S_2 K_2, \\ R_2(K_1, K_2) &= -K_2 A - A^T K_2 - Q_2 + K_2 S_2 K_2 + K_2 S_1 K_1. \end{aligned} \tag{2}$$

In the theory of Nash games with an infinite planning horizon the existence of stabilizing solutions of (2), i.e. solutions for which

$$A_{cl} := A - \begin{pmatrix} S_1 & S_2 \end{pmatrix} \begin{pmatrix} K_1 \\ K_2 \end{pmatrix}$$

is a stable matrix, is the most important question in order to determine a Nash equilibrium (see (Engwerda, 1998; Kremer and Stefan, 2002)). This equilibrium is then given by

$$u_i = -R_{ii}^{-1} B_i^T K_i x, \qquad i = 1, 2,$$

where $x$ is a solution of

$$\dot{x} = A_{cl}\, x, \qquad x(0) = x_0.$$

The system of coupled equations (2) can be written as a single equation $R(K_1, K_2) = 0$, where:

$$R(K_1, K_2) = -\begin{pmatrix} K_1 \\ K_2 \end{pmatrix} A - \begin{pmatrix} A^T & 0 \\ 0 & A^T \end{pmatrix} \begin{pmatrix} K_1 \\ K_2 \end{pmatrix} - \begin{pmatrix} Q_1 \\ Q_2 \end{pmatrix} + \begin{pmatrix} K_1 \\ K_2 \end{pmatrix} \begin{pmatrix} S_1 & S_2 \end{pmatrix} \begin{pmatrix} K_1 \\ K_2 \end{pmatrix}. \tag{3}$$

Hence, we obtain the short notation of the open loop Nash algebraic Riccati equations as:

$$R(K) = -KA - DK - Q + KSK = 0, \tag{4}$$

where

$$K := \begin{pmatrix} K_1 \\ K_2 \end{pmatrix}, \quad Q = \begin{pmatrix} Q_1 \\ Q_2 \end{pmatrix}, \quad S = \begin{pmatrix} S_1 & S_2 \end{pmatrix} \quad \text{and} \quad D = \begin{pmatrix} A^T & 0 \\ 0 & A^T \end{pmatrix}.$$

We apply a fixpoint iteration scheme to study equation (4); this means we deal with the iteration procedure

$$C(X_{k+1}) := (-D) X_{k+1} + X_{k+1} (-A) = Q - X_k S X_k, \qquad k = -1, 0, 1, 2, \dots \tag{5}$$

Theorem 2. In equation (4), let $S \ge 0$, $Q > 0$ and $-A$ be a Z-matrix such that

$$-A^T \otimes \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} + I_n \otimes (-D)$$

is an M-matrix. Then

$$-DX - XA = Q \tag{6}$$

has a solution $X_0 > 0$.

i) If $X_0$ is such that the solution $X_1$ of

$$-DX - XA = Q - X_0 S X_0 \tag{7}$$

is non negative, i.e. $X_1 \ge 0$, then the fixpoint iteration as defined in (5), initialized with $X_{-1} = 0$, yields a sequence with the following monotonicity properties:

$$0 \le X_1 \le X_3 \le \dots \le X_{2k+1} \le \dots \le X_{2k} \le X_{2k-2} \le \dots \le X_0. \tag{8}$$

ii) Let $X$ be any non negative solution of (4); then $0 \le K^- \le X \le K^+ \le X_0$, where $K^- = \lim_{k \to \infty} X_{2k+1}$ and $K^+ = \lim_{k \to \infty} X_{2k}$.


iii) If the iteration sequence $X_k$ converges, i.e. if $K^- = K^+$, then there exists a unique non negative solution of (4).

Proof. Notice that the linear operator $C$ in (5) is, by our assumptions, invertible and $C^{-1}$ is a positive operator by Theorem 1, i) and the assumption that

$$-A^T \otimes \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} + I_n \otimes (-D)$$

is an M-matrix.

Together with $Q > 0$ this yields first $X_0 > 0$. Let furthermore the solution $X_1$ of (7) be non negative, i.e. $X_1 \ge 0$. Then from $C(X_1 - X_0) = -X_0 S X_0 \le 0$ we conclude $0 \le X_1 \le X_0$.

From $C(X_2 - X_1) = -X_1 S X_1 + X_0 S X_0 \ge X_1 S (X_0 - X_1) \ge 0$ we now infer $X_2 \ge X_1$ and also that $X_2 \ge 0$. Then we obtain the inequality for $C(X_3 - X_1) = -X_2 S X_2 + X_0 S X_0 \ge X_2 S (X_0 - X_2) \ge 0$, where we first used that $S X_0 \ge 0$, $X_2 \le X_0$ and then $X_2 \ge 0$, $X_2 \le X_0$, to obtain the second inequality. Finally this reveals $X_3 \ge X_1$.

To set up now mathematical induction, observe that we have shown

$$0 \le X_1 \le X_3 \le \dots \le X_{2k+1} \le X_{2k} \le X_{2k-2} \le \dots \le X_2 \le X_0$$

for $k = 0, 1$.

Assume now this to be true for $k = 0, 1, \dots, l$. To show that it is true for $l + 1$, consider first:

$$C(X_{2l+2} - X_{2l}) = -X_{2l+1} S X_{2l+1} + X_{2l-1} S X_{2l-1} \le 0,$$

which yields $X_{2l+2} \le X_{2l}$.

Next we consider:

$$C(X_{2l+2} - X_{2l+1}) = -X_{2l+1} S X_{2l+1} + X_{2l} S X_{2l} \ge 0,$$

hence $X_{2l+2} \ge X_{2l+1} \ge 0$, and from:

$$C(X_{2l+3} - X_{2l+1}) = -X_{2l+2} S X_{2l+2} + X_{2l} S X_{2l} \ge 0,$$

we finally obtain the conclusion. This proves i).

To prove ii) assume that $X$ is a solution of (4), which equivalently can be written as

$$C(X) = Q - XSX.$$

If $X \ge 0$ we have $C(X - X_0) = -XSX$, hence $X \le X_0$.

From $C(X - X_1) = -XSX + X_0 S X_0 \ge 0$ we then infer $X \ge X_1$, and further from $C(X - X_2) = -XSX + X_1 S X_1 \le 0$ we get $X \le X_2$. By mathematical induction it is now easily obtained that

$$X_{2k+1} \le X \le X_{2k}$$

for $k = 1, 2, \dots$. This proves ii). Part iii) is a consequence of i) and ii).

In (Jank et al., 2002) a more general type of an open loop Nash algebraic Riccati equation has been studied, applying the Newton method to obtain an iteration sequence. The algebraic Riccati equation studied there is of the following type:

$$R(K) := -KA - DK - Q + (L_1 + K B_1) R^{-1} (L_2 + B_2^T K) = -\tilde{D} K - K \tilde{A} - \tilde{Q} + K \tilde{S} K, \tag{9}$$

where $\tilde{A} := A - B_1 R^{-1} L_2$, $\tilde{D} := D - L_1 R^{-1} B_2^T$, $\tilde{Q} := Q - L_1 R^{-1} L_2$, $\tilde{S} = (S_1 \ S_2) := B_1 R^{-1} B_2^T$, with the blocked matrices

$$D := \begin{pmatrix} A^T & 0 \\ 0 & A^T \end{pmatrix}, \quad Q := \begin{pmatrix} Q_1 \\ Q_2 \end{pmatrix}, \quad R := \begin{pmatrix} R_{11} & N_1 \\ N_2 & R_{22} \end{pmatrix},$$

and block matrices $B_1$, $B_2$, $L_1$, $L_2$ whose blocks lie in $\mathbb{R}^{n \times m_1}$ and $\mathbb{R}^{n \times m_2}$, where $R_{11} \in \mathbb{R}^{m_1 \times m_1}$, $R_{22} \in \mathbb{R}^{m_2 \times m_2}$, $N_1 \in \mathbb{R}^{m_1 \times m_2}$, $N_2 \in \mathbb{R}^{m_2 \times m_1}$, $A, Q_1, Q_2 \in \mathbb{R}^{n \times n}$, and $R$ should be invertible.

It is quite clear that Theorem 2 also applies to equation (9), whereas on the other hand the results obtained in (Jank et al., 2002) unfortunately do not apply to the standard open loop Nash algebraic Riccati equation (4).

Moreover, we have:

Remark 1. Let in equation (9) $\tilde{A}$, $\tilde{D}$, $\tilde{Q}$, $\tilde{S}$ be any matrices of appropriate size such that $\tilde{S} \ge 0$, $\tilde{Q} > 0$, $-\tilde{A}$ being a Z-matrix and $\tilde{D}$ such that

$$-\tilde{A}^T \otimes \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} + I_n \otimes (-\tilde{D})$$

is an M-matrix such that (6) has a solution $X_0$ such that, furthermore, (7) admits a non negative solution. Then the conclusions of Theorem 2 hold true.

In particular in the case of a Nash game with discount rate in the performance criterion (see (Abou-Kandil and Jank, 2003a)) there appear the following algebraic Riccati equations:


where $\alpha_i$, $i = 1, 2$, denotes the constant time preference rate or discount rate ($\alpha_i \ge 0$).

The fixpoint iteration does not provide an answer to the question whether the positive solution $K$ it yields, if it exists, stabilizes the system, i.e., yields a stable closed loop matrix $A_{cl} = A - SK$.

Notice also that in Theorem 2 it is not necessary that $X_1$ be non negative. What really counts is that after a finite number of pre-iteration steps an even indexed $X_{2k_0} \ge 0$ is followed by an odd indexed $X_{2k_0+1} \ge 0$. From there on the process runs as described in Theorem 2.

3. NUMERICAL EXAMPLES

In a first example we use in (4):

$$A := \begin{pmatrix} -190 & 42 \\ 41 & -199 \end{pmatrix}, \quad Q := \begin{pmatrix} 203 & 9 \\ 9 & 199 \\ 101 & 8 \\ 8 & 99 \end{pmatrix}, \quad S := \begin{pmatrix} 150 & 7 & 156 & 6 \\ 7 & 153 & 6 & 149 \end{pmatrix}.$$

Starting with a solution of (6) and then getting the solution $K_1$,

$$K_1 = \begin{pmatrix} 0.3248372 & -0.0077127 \\ -0.0074981 & 0.3224781 \\ 0.1611768 & 0.0019546 \\ 0.0018668 & 0.1600956 \end{pmatrix},$$

which is not strictly positive. Fortunately in the next iteration step we obtain a strictly positive matrix $K_2$. Then in the 11-th step we obtain

$$K_{11} := \begin{pmatrix} 0.425698 & 0.065663 \\ 0.065756 & 0.407434 \\ 0.212617 & 0.039970 \\ 0.037841 & 0.201637 \end{pmatrix},$$

which eventually converges towards $K^-$. While in step 12 we obtain

$$K_{12} := \begin{pmatrix} 0.437763 & 0.076560 \\ 0.076634 & 0.417260 \\ 0.218810 & 0.045565 \\ 0.045539 & 0.208591 \end{pmatrix},$$

converging eventually towards $K^+$. It seems likely here that $K^+ = K^- = K^*$, where $K^*$ is the unique positive solution

$$K^* = \begin{pmatrix} 0.432521 & 0.071826 \\ 0.071908 & 0.412991 \\ 0.216119 & 0.043134 \\ 0.043105 & 0.206392 \end{pmatrix}.$$

In the second example we apply the iteration scheme to the equation (9) with

$$\tilde{A} := \begin{pmatrix} -239.67 & 103.14 \\ 133.45 & -279.42 \end{pmatrix}, \quad \tilde{D} := \begin{pmatrix} -215.292 & 134.954 & 48.368 & 50 \\ 99.154 & -261.923 & 45.632 & 55 \\ 211.713 & 212.323 & -214.379 & 140.5 \\ 211.477 & 213.338 & 144.989 & -216.5 \end{pmatrix},$$

$$\tilde{Q} := \begin{pmatrix} 86.50722 & 177.31148 \\ 163.41843 & 0.74101 \\ 181.16389 & 726.22338 \\ 319.30772 & 606.3118 \end{pmatrix}, \quad \tilde{S} := \begin{pmatrix} 0.18 & 0.16 & 0.1 & 0.6 \\ 0.16 & 0.19 & 0.6 & 0.9 \end{pmatrix}.$$

After 11, respectively 12 iteration steps we obtain

$$K_{11} = \begin{pmatrix} 0.041441 & 0.397693 \\ 0.152017 & 0.144731 \\ 0.206981 & 1.480906 \\ 0.404210 & 1.329006 \end{pmatrix} \to K^-, \qquad K_{12} = \begin{pmatrix} 5.9784 & 4.8608 \\ 5.1765 & 3.9219 \\ 12.9416 & 11.0542 \\ 13.2088 & 10.9548 \end{pmatrix} \to K^+.$$

4. CONCLUSIONS

We study an iterative scheme to compute fixpoints of an operator related with algebraic Riccati equations appearing in Nash games on an infinite time horizon. The stabilizing solutions yield a Nash equilibrium.

From the iterative process we obtain two sequences, one monotonically increasing and the other monotonically decreasing. If they converge to the same limit then this is the unique positive solution (positive with respect to the order cone of matrices with positive entries). The process does not reveal if the solution obtained is stabilizing the system, i.e. that the closed loop matrix $A_{cl}$ is stable.

If there is a solution of equation (9) one can use such a fixpoint iteration as preprocessing and then start a Newton iteration with $K_{11}$ or $K_{12}$ as initial point.


If the sequences do not converge to the same limit this iteration procedure can be used for preprocessing followed by a Newton iteration which then is more likely to converge since one comes closer to possible solutions.
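The fixpoint scheme (5) summarised in this section can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `kron_operator`, `is_m_matrix` and `fixpoint_iterates` are hypothetical, the linear step is solved through the Kronecker form that appears in Theorem 2, and the toy data (with n = 1) are purely illustrative and not taken from the numerical examples.

```python
import numpy as np


def kron_operator(A, D):
    """Matrix of the linear map X -> (-D) X + X (-A) acting on vec(X).

    vec() stacks columns, so for X of size p x n:
    vec(-D X) = (I_n kron -D) vec(X) and vec(-X A) = (-A^T kron I_p) vec(X).
    """
    p, n = D.shape[0], A.shape[0]
    return np.kron(-A.T, np.eye(p)) + np.kron(np.eye(n), -D)


def is_m_matrix(M, tol=1e-12):
    """Z-matrix test (off-diagonal entries <= 0) plus Theorem 1 (iv)."""
    off_diagonal = M[~np.eye(M.shape[0], dtype=bool)]
    if np.any(off_diagonal > tol):
        return False
    return bool(np.all(np.linalg.eigvals(M).real > tol))


def fixpoint_iterates(A, D, Q, S, steps):
    """Return [X_0, ..., X_{steps-1}] of iteration (5), started at X_{-1} = 0."""
    p, n = Q.shape
    M = kron_operator(A, D)
    X = np.zeros((p, n))
    iterates = []
    for _ in range(steps):
        rhs = Q - X @ S @ X
        x = np.linalg.solve(M, rhs.reshape(-1, order="F"))
        X = x.reshape((p, n), order="F")
        iterates.append(X)
    return iterates


# Hypothetical toy data with n = 1 (an assumption, not from the paper).
A = np.array([[-2.0]])
D = np.array([[-2.0, 0.0], [0.0, -2.0]])  # D = diag(A^T, A^T)
Q = np.array([[1.0], [1.0]])              # Q = (Q_1; Q_2)
S = np.array([[0.5, 0.5]])                # S = (S_1  S_2)

# Hypothesis of Theorem 2 on the Kronecker matrix.
assert is_m_matrix(kron_operator(A, D))

X = fixpoint_iterates(A, D, Q, S, steps=40)
K_minus, K_plus = X[39], X[38]  # odd index approximates K^-, even index K^+
residual = -X[-1] @ A - D @ X[-1] - Q + X[-1] @ S @ X[-1]
print("K^- approx:", K_minus.ravel())
print("K^+ approx:", K_plus.ravel())
print("max |R(K)|:", np.abs(residual).max())
```

For larger problems a dedicated Sylvester solver would avoid forming the Kronecker matrix explicitly. When the two subsequences settle at different limits, either one could serve as the starting point of a Newton iteration, as suggested above.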

REFERENCES

Abou-Kandil, H. and G. Jank (2003a). Open-loop Nash Riccati equations in games with discount rate. To appear in Proc. IFAC Workshop on Control Applications of Optimisation.

Abou-Kandil, H., G. Freiling and G. Jank (1993). Necessary conditions for constant solutions of coupled Riccati equations in Nash games. Systems & Control Letters 21, 295-306.

Abou-Kandil, H., G. Freiling, V. Ionescu and G. Jank (2003b). Matrix Riccati Equations in Control and Systems Theory. Birkhäuser Verlag. Basel.

Başar, T. and G.J. Olsder (1995). Dynamic Noncooperative Game Theory. Academic Press. New York.

Berman, A. and R.J. Plemmons (1979). Nonnegative Matrices in the Mathematical Sciences. Academic Press. New York.

Engwerda, J.C. (1998). On the open-loop Nash equilibrium in LQ-games. Journal of Economic Dynamics and Control 22, 729-762.

Guo, C.-H. and A.J. Laub (2000). On the iterative solution of a class of nonsymmetric algebraic Riccati equations. SIAM J. Matrix Anal. Appl. 22, 376-391.

Jank, G. and G. Kun (1998). Solutions of generalized Riccati differential equations and their approximation. In: Computational Methods and Function Theory (CMFT '97) (St. Ruscheweyh and E.B. Saff, Eds.). pp. 1-18.

Jank, G., D. Kremer, Y.N. Njike and T.-P. Azevedo-Perdicoulis (2002). Newton method for algebraic open loop Riccati equations in Nash games. Proc. Portuguese Control Conference, CONTROLO 2002, Aveiro, Portugal.

Kremer, D. and R. Stefan (2002). Non-symmetric Riccati theory and linear quadratic Nash games. Proc. MTNS Conference, Notre Dame, USA.

Reid, W.T. (1972). Riccati Differential Equations. Academic Press. New York.

van den Broek, B. (2001). Uncertainty in differential games. PhD thesis. Tilburg University. Tilburg, Netherlands.
