Automatica 48 (2012) 625–631
Brief paper
Feedback Nash equilibria for linear quadratic descriptor differential games✩
J.C. Engwerda a,1, Salmah b
a Department of Econometrics and O.R., P.O. Box 90153, 5000 LE Tilburg, The Netherlands
b Department of Mathematics, Yogyakarta-55281, Indonesia
Article info
Article history: Received 23 July 2010; received in revised form 9 February 2011; accepted 22 August 2011; available online 15 February 2012.
Abstract
In this paper, we consider the non-cooperative linear feedback Nash quadratic differential game with an infinite planning horizon for descriptor systems of index one. The performance function is assumed to be indefinite. We derive both necessary and sufficient conditions under which this game has a Nash equilibrium. © 2012 Elsevier Ltd. All rights reserved.
Keywords: Descriptor systems; Differential games; Linear-quadratic regulators; Feedback control; Nash games; Linear systems; Riccati equations
1. Introduction
In the past decades, there has been an increased interest in studying diverse problems in economics and optimal control theory using dynamic games. In particular, in environmental economics and macroeconomic policy coordination, dynamic games are a natural framework to model policy coordination problems (see e.g., the books and references in Dockner, Jørgensen, Long, and Sorger (2000), Jørgensen and Zaccour (2003), Plasmans, Engwerda, Aarle, Di Bartolomeo, and Michalak (2006) and Grass, Caulkins, Feichtinger, Tragler, and Behrens (2008)). Moreover, it is well-known (see e.g., Başar and Bernhard (1995), Kun (2001) and Broek, Engwerda, and Schumacher (2003)) that the problem of obtaining robust control strategies can be approached as a dynamic game problem.
In this paper, we consider the linear quadratic differential game under the assumption that the dynamics of the underlying process are described by a descriptor system, that is, by both a set of differential equations and a set of linear equations. Problems of this kind appear in studying systems which operate on different timescales, e.g., in mechanical engineering, where an electrically driven robot manipulator typically has slow mechanical dynamics and fast electrical dynamics, or in environmental economics, where global warming is assumed to be a system with slow dynamics that is affected by various processes with fast dynamics. They also sometimes appear naturally in modeling systems, e.g., in economics, where the Leontief model describes the relationship between the levels of production of a number of interrelated production sectors, or in finance in portfolio selection optimization. See e.g., Lewis (1986), Hodge and Newcomb (2002) or Grimsrud and Huffaker (2006) and the references therein.
The "one-player" regulator problem for descriptor systems that have index one (see Section 2 for a formal introduction of this notion) has been considered by many authors. One of the first to consider this problem was Pandolfi (1981). His results were later generalized by many authors, e.g., Cobb (1983) and Bender and Laub (1987). The major reason to consider index one descriptor systems is that for all initial states of the system there exists a smooth control that generates a smooth state trajectory if and only if the system is controllable at infinity (or impulse controllable); see e.g., Geerts (1993). Further, all impulsive modes of a system that is impulse controllable can be transformed into finite dynamic modes using a static state feedback control. After this transformation the system is then a descriptor system that has index one. Since we do not want to consider impulsive control actions in this paper, we will therefore restrict our attention to systems that have index one too.

✩ The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor Delin Chu under the direction of Editor Ian R. Petersen. We would like to thank Hans Schumacher and the referees for their comments on an earlier draft of this paper.
E-mail addresses: [email protected] (J.C. Engwerda), [email protected] (Salmah).
1 Tel.: +31 134662174; fax: +31 134663280.
0005-1098/$ – see front matter © 2012 Elsevier Ltd. All rights reserved. doi:10.1016/j.automatica.2012.01.004
J.C. Engwerda, Salmah / Automatica 48 (2012) 625–631
We will consider here the linear quadratic differential game. We assume a linear feedback information structure. The reason to consider this information structure is that the corresponding linear feedback Nash equilibria (FNE) have the nice property of strong time consistency, which for instance does not hold under an open-loop information structure. For systems just described by a set of differential equations this problem has been considered by many authors and dates back to the seminal work of Starr and Ho (1969). It can be shown that for a fixed finite planning horizon the problem has at most one FNE (see e.g., Lukes (1971) and Papavassilopoulos, Medanic, and Cruz (1979)). For the infinite planning horizon it is well-known that the problem may have up to an infinite number of FNE. References and related issues can be found in, e.g., Başar and Olsder (1999) and Engwerda (2005).
For index one descriptor systems, this problem was solved in Engwerda and Salmah (2009), assuming an open-loop information structure. In Engwerda, Salmah, and Wijayanti (2009), some first results were obtained for higher order index systems. Probably the first reference where this problem was addressed for descriptor systems assuming a feedback information structure is Gardner and Cruz (1978) (see also Gardner (1977) for the zero-sum case). For descriptor systems with a closed-loop perfect state information structure Xu and Mizukami (1994) derived for a finite planning horizon some first results for zero-sum games. Further, these authors considered the leader–follower information structure in Mizukami and Xu (1992) and Mizukami and Xu (1996). Glizer (2000) considered the asymptotic behavior of the zero-sum game solution from a cheap control perspective. Engwerda et al. solved in Engwerda, Salmah, and Wijayanti (2009) the linear feedback game problem and provided a complete parametrization of all FNE strategies.
Their solvability conditions are posed in terms of a transformed system. The results were derived under the assumption that the state costs are nonnegative. In particular this implies that the obtained results are not directly applicable to a zero-sum game setting. In this paper, we generalize these results to performance criteria that also include "cross-terms", i.e., products of the state and control variables, and with no definiteness assumptions concerning the involved state cost. We provide a complete parametrization of all FNE strategies. As a special case we derive existence results for the multi-player zero-sum game.
The outline of this note is as follows. Section 2 introduces the problem and contains some preliminary results. The main results of this paper are stated in Section 3, whereas Section 4 illustrates the presented theory with an example. The final section contains some concluding remarks.
2. Preliminaries
Consider the N-person linear quadratic differential game with dynamics

$$E\dot{x}(t) = Ax(t) + \sum_{i=1}^{N} B_i u_i(t), \quad x(0) = x_0 \in \mathbb{R}^n, \quad u_i \in \mathbb{R}^{m_i}, \quad \mathrm{rank}(E) = r < n. \tag{1}$$

Player $i$ wants to choose his control $u_i = F_i x$ to minimize $\lim_{t_f \to \infty} J_i(t_f, x_0, u_1, \ldots, u_N)$, where

$$J_i(t_f, x_0, u_1, \ldots, u_N) := \int_0^{t_f} [x^T(t), u_1^T(t), \ldots, u_N^T(t)]\, M_i\, [x^T(t), u_1^T(t), \ldots, u_N^T(t)]^T \, dt, \tag{2}$$

$$M_i = \begin{bmatrix} Q_i & V_{i11} & \cdots & \cdots & V_{i1N} \\ V_{i11}^T & R_{i1} & V_{i22} & \cdots & V_{i2N} \\ \vdots & \vdots & \ddots & & \vdots \\ V_{i1N}^T & V_{i2N}^T & \cdots & \cdots & R_{iN} \end{bmatrix}, \quad M_i = M_i^T, \quad i \in \bar{N}.²$$

No definiteness assumptions are made w.r.t. matrix $Q_i$.

² $\bar{N} := \{1, \ldots, N\}$.

Next, we recall some standard definitions and results about descriptor systems. Consider

$$E\dot{x}(t) = Ax(t), \quad x(0) = x_0. \tag{3}$$

An initial state $x_0$ in (3) is called consistent if with this choice of the initial state the system has a solution. System (3) (or the matrix pair $(E, A)$) is called regular if $\det(\lambda E - A)$ is not identically zero as a polynomial in the complex variable $\lambda$. System (3) has a unique solution, for any consistent initial state, if and only if $(E, A)$ is regular. From Gantmacher (1959) we recall the so-called Weierstrass canonical form.

Theorem 1. (1) If (3) is regular there exist nonsingular matrices $X$ and $Y$ such that

$$Y^T E X = \mathrm{diag}(\tilde{I}, N) \quad \text{and} \quad Y^T A X = \mathrm{diag}(A_1, \hat{I}). \tag{4}$$

Here $A_1$ and $N$ are matrices in Jordan form. Moreover, $N$ is nilpotent. $\tilde{I}$ and $\hat{I}$ are both identity matrices. The size of $\tilde{I}$ corresponds with that of $A_1$ and the size of $\hat{I}$ corresponds with that of $N$. $A_1$ and $N$ are unique up to permutation of Jordan blocks.
(2) If $\hat{X}$ and $\hat{Y}$ are another set of nonsingular matrices that satisfy (4), there exists a nonsingular matrix $T = \mathrm{diag}(T_1, T_2)$ such that $Y^T = T\hat{Y}^T$, $X = \hat{X}T^{-1}$, $T_1 A_1 = A_1 T_1$ and $T_2 N = N T_2$.

System (3) is said to be of index one³ if $N = 0$ (its degree of nilpotency is one). Assuming that system (3) has index one it follows that, with

$$[x_1^T(t)\ \ x_2^T(t)]^T := X^{-1}x, \ \text{where } x_1 \in \mathbb{R}^r,\ x_2 \in \mathbb{R}^{n-r}, \quad B_{i1} := [I_r\ \ 0]\,Y^T B_i, \quad B_{i2} := [0\ \ I_{n-r}]\,Y^T B_i, \tag{5}$$

and $F$ the block-column matrix $F = [F_1^T, \ldots, F_N^T]^T$, $J_i$ in (2) can be rewritten as:

$$J_i(x_0, F) = \int_0^\infty \{[x_1^T(t)\ \ x_2^T(t)]\,X^T [I_n\ \ F^T]\, M_i\, [I_n\ \ F^T]^T X\, [x_1^T(t)\ \ x_2^T(t)]^T\}\,dt. \tag{6}$$

Here the variables $x_i(\cdot)$ satisfy (see (1))

$$\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \end{bmatrix} = \left( \begin{bmatrix} A_1 & 0 \\ 0 & I_{n-r} \end{bmatrix} + \sum_{i=1}^{N} \begin{bmatrix} B_{i1} \\ B_{i2} \end{bmatrix} F_i X \right) \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}; \quad \begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} = X^{-1} x_0. \tag{7}$$

³ This is equivalent (see e.g., Kautsky, Nichols, and Chu (1989)) to the assumption that $\mathrm{rank}([E\ \ AN_E]) = n$, where the image of matrix $N_E$ equals the null space of $E$. Notice furthermore that in that case $|\lambda E - A|$ is not a constant.

Of course problem (6) and (7) is not completely specified, as we have not yet specified the set of admissible feedback strategies, nor paid attention to the fact that there exist initial states for which (7) has no solution. The set of all consistent initial states of (3) is given by $\{x_0 \mid x_0 = X[\bar{x}_1^T\ \ 0]^T,\ \bar{x}_1 \in \mathbb{R}^r\}$. As shown in Kalogeropoulos and Arvanitis (1998) this set does not depend on the choice of matrix $X$ that is chosen in (4). In the sequel we will assume that if $x_0 = X[\bar{x}_1^T\ \ \bar{x}_2^T]^T$ is an inconsistent initial state of (3), the system's state will jump to $x_0^+ := X[\bar{x}_1^T\ \ 0]^T$. This assumption is motivated by the idea that the part of the state of the system that is governed by the fast dynamics of the system can, compared to the part of the state governed by the slow dynamics, easily adapt.

We will restrict the analysis to the set of linear state feedback controls that stabilize the system for all consistent initial states. A necessary and sufficient condition for the existence of such a stabilizing feedback matrix is that system (1) (or $(E, A, B)$, where $B$ is the block-row matrix $B = [B_1, \ldots, B_N]$) is finite dynamics stabilizable.⁴ These requirements lead then to the assumption that $F \in \mathcal{F}$, where

$$\mathcal{F} := \{F \mid \text{all finite eigenvalues of } (E, A + BF) \text{ are stable and } (E, A + BF) \text{ has index one}\}.$$
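The regularity and index-one notions above can be checked mechanically for small pencils. The following is a minimal sketch for the 2×2 case only, using exact arithmetic. It relies on the standard fact that, for a regular pencil (E, A), N = 0 in (4) exactly when the degree of det(λE − A) equals rank(E); this degree test is used here in place of the rank condition of footnote 3. The function names and the numerical matrices are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def rank2(m):
    """Rank of a 2x2 matrix (exact arithmetic assumed)."""
    if det2(m) != 0:
        return 2
    return 1 if any(x != 0 for row in m for x in row) else 0


def pencil_det_coeffs(E, A):
    """Coefficients (c2, c1, c0) of det(lam*E - A) = c2*lam**2 + c1*lam + c0."""
    c2 = det2(E)
    c1 = -(E[0][0] * A[1][1] + A[0][0] * E[1][1]
           - E[0][1] * A[1][0] - A[0][1] * E[1][0])
    c0 = det2(A)
    return c2, c1, c0


def is_regular(E, A):
    """(E, A) is regular iff det(lam*E - A) is not the zero polynomial."""
    return any(c != 0 for c in pencil_det_coeffs(E, A))


def has_index_at_most_one(E, A):
    """Regular and deg det(lam*E - A) == rank(E), i.e. N = 0 (or absent) in (4)."""
    c2, c1, c0 = pencil_det_coeffs(E, A)
    if c2 == 0 and c1 == 0 and c0 == 0:
        return False
    degree = 2 if c2 != 0 else (1 if c1 != 0 else 0)
    return degree == rank2(E)


# Hypothetical data: E = diag(1, 0) and an upper triangular A with A[1][1] != 0.
E = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(0)]]
A = [[Fraction(1), Fraction(-1)], [Fraction(0), Fraction(2)]]
# A regular pencil whose nilpotent part has degree two: E nilpotent, A = I.
E_bad = [[Fraction(0), Fraction(1)], [Fraction(0), Fraction(0)]]
A_bad = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]

print(is_regular(E, A), has_index_at_most_one(E, A))
print(is_regular(E_bad, A_bad), has_index_at_most_one(E_bad, A_bad))
```

For larger systems one would typically work with a numerically stable decomposition instead of explicit determinants; the sketch is only meant to make the definitions concrete.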
We assume that the matrix pairs $(A_1, B_{i1})$, $i \in \bar{N}$, are stabilizable. So, in principle, each player is capable of stabilizing the first part of the transformed system on his own. Notice that the question whether the matrix pair $(A_1, B_{i1})$ is stabilizable does not depend on the choice of the matrices $X$ and $Y$ yielding the canonical form (4). Using Theorem 1.(2) it is easily verified that $\mathrm{rank}[A_1 - \lambda I\ \ B_{i1}] = \mathrm{rank}[A_1 - \lambda I\ \ \hat{B}_{i1}]$ if $\hat{B}_{i1} := [I_r\ \ 0]\hat{Y}^T B_i$, where $\hat{X}$ and $\hat{Y}$ satisfy (4) too. If $\bar{B}_2$ is the block-row matrix $\bar{B}_2 := [B_{12}, \ldots, B_{N2}]$ and $W := I_{n-r} + \bar{B}_2 F X_2$, it follows from Lemma 7 in the Appendix that for any fixed matrix $F \in \mathcal{F}$, $W$ is invertible. Therefore, for a fixed $F \in \mathcal{F}$, the set of consistent initial states for (7) equals

$$\{(x_1(0), x_2(0)) \mid x_2(0) = -W^{-1}\bar{B}_2 F X_1 x_1(0),\ x_1(0) \in \mathbb{R}^r\}. \tag{8}$$
Notice that the assumption that the players use simultaneously stabilizing controls introduces the cooperative meta-objective of all players to stabilize the system (see e.g., Engwerda (2005) for a discussion). Now, $u^* := (u_1^*, \ldots, u_N^*) \in \mathcal{F}$ is called a feedback Nash equilibrium if the usual inequalities apply. That is, no player can improve his performance by a unilateral deviation from this set of equilibrium actions. Introducing the notation $u_{-i}^*(\alpha) := u^*$ where $u_i^*$ has been replaced by the arbitrary input function $\alpha$, the formal definition reads as follows.

Definition 2. $F^*$ or $u^* \in \mathcal{F}$ is called a feedback Nash equilibrium (FNE) if for $i \in \bar{N}$, $J_i(x_0, u^*) \le J_i(x_0, u_{-i}^*(\alpha))$ for every $x_0$ and input $\alpha$ such that $u_{-i}^*(\alpha) \in \mathcal{F}$.

The main problem addressed in this paper reads then as follows.

Problem 1. Consider the performance criterion (2) and system (1), where $\mathrm{rank}(E) = r < n = \dim(x)$. Assume $(E, A)$ is regular and has index one, and $(E, A, B)$ is finite dynamics stabilizable. Let $X$ be as defined in (4). Decompose $X =: [X_1\ \ X_2]$, with $X_1 \in \mathbb{R}^{n \times r}$ and $X_2 \in \mathbb{R}^{n \times (n-r)}$. Find conditions under which (1) and (2) has an FNE solution $u_i = F_i x$, with $F \in \mathcal{F}$.

⁴ (1) is finite dynamics stabilizable if there exists a feedback $u(t) = Fx(t)$ such that all finite eigenvalues (i.e., all solutions of $|\lambda E - A| = 0$) of the system $E\dot{x}(t) = (A + BF)x(t)$ are stable. This property holds if and only if $\mathrm{rank}[\lambda E - A\ \ B] = n$ for all $\lambda \in \mathbb{C}_0^+$. $\mathbb{C}_0^+$ is the set of complex numbers with non-negative real part.

3. Main results

To facilitate a brief statement of our main Theorem 3, the proof of which can be found in the Appendix, we first introduce a number of variables.

Notation. $\bar{m} = \sum_{i=1}^{N} m_i$; $\bar{B}_d$ is the block-diagonal matrix $\bar{B}_d^T := \mathrm{diag}(B_{11}^T, B_{21}^T, \ldots, B_{N1}^T)$;

$$\bar{M}_i := \begin{bmatrix} X_1^T & 0 \\ -\bar{B}_2^T X_2^T & I_{\bar{m}} \end{bmatrix} M_i \begin{bmatrix} X_1 & -X_2\bar{B}_2 \\ 0 & I_{\bar{m}} \end{bmatrix} =: \begin{bmatrix} \bar{Q}_i & \bar{V}_{i11} & \cdots & \cdots & \bar{V}_{i1N} \\ \bar{V}_{i11}^T & \bar{R}_{i1} & \bar{V}_{i22} & \cdots & \bar{V}_{i2N} \\ \vdots & \vdots & \ddots & & \vdots \\ \bar{V}_{i1N}^T & \bar{V}_{i2N}^T & \cdots & \cdots & \bar{R}_{iN} \end{bmatrix}. \tag{9}$$

$\bar{B}_1$ is the block-row matrix $\bar{B}_1 := [B_{11}, \ldots, B_{N1}]$; $\bar{K}$ is the block-column matrix $\bar{K} := [\bar{K}_1^T, \ldots, \bar{K}_N^T]^T$; $\bar{E}_{i+1}$ is obtained from the block-column matrix containing $N + 1$ zero blocks, where block $i + 1$ is replaced by the identity matrix, i.e., $\bar{E}_{i+1}^T = \mathrm{Block}[0_{m_i \times r}\ \ 0_{m_i \times m_1} \cdots 0_{m_i \times m_{i-1}}\ \ I_{m_i}\ \ 0_{m_i \times m_{i+1}} \cdots 0_{m_i \times m_N}]$, $i \in \bar{N}$; row $i$ of matrix $\bar{G}$ equals block-row $i + 1$ of $\bar{M}_i$, excluding its first block-entry, $i \in \bar{N}$, i.e., $\bar{G} := [\bar{M}_1\bar{E}_2 \cdots \bar{M}_N\bar{E}_{N+1}]^T [0_{\bar{m} \times r}\ \ I_{\bar{m}}]^T$. Block-entry $i$ of block-column matrix $\bar{Z}$ is the $(i + 1)$th block-entry of the first block-column of matrix $\bar{M}_i$, i.e., $\bar{Z} := [\bar{M}_1\bar{E}_2 \cdots \bar{M}_N\bar{E}_{N+1}]^T [I_r\ \ 0 \cdots 0]^T = [\bar{V}_{111} \cdots \bar{V}_{N1N}]^T$. We now have the following result.

Theorem 3. Assume $\bar{G}$ is invertible and $\bar{R}_{ii} > 0$, $i \in \bar{N}$. Then $F$ is an FNE for (1) and (2) for every initial state if and only if

$$F_i = \bar{F}_i P^+ + S_i(I_n - PP^+), \quad P = X[I_r,\ -(\bar{B}_2\bar{F})^T]^T, \tag{10}$$

where $S_i \in \mathbb{R}^{m_i \times n}$, and

$$\bar{F} = -\bar{G}^{-1}(\bar{Z} + \bar{B}_d^T\bar{K}). \tag{11}$$

Here $\bar{K}_i$, $i \in \bar{N}$, are symmetric solutions of the coupled algebraic Riccati-type equations

$$\bar{A}_{cl}^T\bar{K}_i + \bar{K}_i\bar{A}_{cl} + [I_r\ \ \bar{F}^T]\,\bar{M}_i\,[I_r\ \ \bar{F}^T]^T = 0, \quad i \in \bar{N}, \tag{12}$$

satisfying $\sigma(\bar{A}_{cl}) \subset \mathbb{C}^-$, where $\bar{A}_{cl} := A_1 + \bar{B}_1\bar{F}$. Further, $J_i = x_0^T X^{-T}[I_r\ \ 0]^T\bar{K}_i[I_r\ \ 0]X^{-1}x_0$.

Remark 4.
1. Since matrix $X$ is invertible and $L := [I,\ -(\bar{B}_2\bar{F})^T]^T$ has full column rank, $P = XL$ has full column rank too, so its Moore–Penrose inverse equals $P^+ = (P^TP)^{-1}P^T$. The matrix $L^+X^{-1} = (L^TL)^{-1}L^TX^{-1}$ is also a left inverse of $P$; it coincides with $P^+$ if, e.g., $X$ is orthogonal, but not in general.
2. If $\hat{X}$ and $\hat{Y}$ satisfy (4) too, $\hat{X}_2\hat{\bar{B}}_2 = \hat{X}[0\ \ I_{n-r}]^T[0\ \ I_{n-r}]\hat{Y}^TB = X\,\mathrm{diag}(T_1, T_2)[0\ \ I_{n-r}]^T[0\ \ I_{n-r}]\,\mathrm{diag}(T_1^{-1}, T_2^{-1})Y^TB = X_2\bar{B}_2$. Therefore, both $\bar{G}$ and $\bar{R}_{ii}$ do not depend on the specific choice of $X$ and $Y$. So the invertibility and positive definiteness assumptions made in Theorem 3 do not depend on the choice of $X$ and $Y$.

Consider the multi-player zero-sum game $D[J_1 \cdots J_N]^T = 0$, where $D \in \mathbb{R}^{(N-1) \times N}$ has full row rank and the basis of the null-space of $D$ has no zero entries. Furthermore assume that, with

$$\bar{M}_1 =: \begin{bmatrix} \bar{Q}_1 & \bar{V}_1 \\ \bar{V}_1^T & \bar{R}_1 \end{bmatrix},$$

$\bar{R}_1$ is invertible and the standard assumption $\bar{R}_{ii} > 0$ holds, $i \in \bar{N}$. Then it follows that there exist scalars $d_i$ such that $J_i = d_iJ_1$, $i \in \bar{N}$ (with $d_1 = 1$). Since this property holds for every $x_0$ it follows from Theorem 3 that $\bar{K}_i = d_i\bar{K}_1$, $i \in \bar{N}$. Notice that, since $\bar{M}_i = d_i\bar{M}_1$, we have $\bar{G} = \mathrm{diag}(d_iI_{m_i})\bar{R}_1$, $\bar{Z} = \mathrm{diag}(d_iI_{m_i})\bar{V}_1^T$ and $\bar{F} = -\bar{R}_1^{-1}(\bar{V}_1^T + \bar{B}_1^T\bar{K}_1)$. Consequently, (12) has a set of stabilizing solutions if and only if the next algebraic Riccati equation has a stabilizing solution

$$\bar{A}_{cl}^T\bar{K}_1 + \bar{K}_1\bar{A}_{cl} + [I_r\ \ \bar{F}^T]\,\bar{M}_1\,[I_r\ \ \bar{F}^T]^T = 0. \tag{13}$$
Notice that (13) is a standard algebraic Riccati equation, for which it is known that it has at most one stabilizing solution. Numerically this can be verified, e.g., by checking whether the Hamiltonian matrix

$$\begin{bmatrix} A_1 - \bar{B}_1\bar{R}_1^{-1}\bar{V}_1^T & -\bar{B}_1\bar{R}_1^{-1}\bar{B}_1^T \\ -\bar{Q}_1 + \bar{V}_1\bar{R}_1^{-1}\bar{V}_1^T & -(A_1 - \bar{B}_1\bar{R}_1^{-1}\bar{V}_1^T)^T \end{bmatrix}$$

has a stable $r$-dimensional graph subspace (see e.g., Engwerda (2005)). Summarizing we have then from Theorem 3 the next result.

Theorem 5. The multi-player zero-sum game has an FNE for every initial state if and only if the standard algebraic Riccati equation (13) has a stabilizing solution $\bar{K}_1$. In case the game has an FNE the set of equilibrium strategies for the multi-player zero-sum game is $F_i = \bar{F}_iP^+ + S_i(I_n - PP^+)$, $P = X[I,\ -(\bar{B}_2\bar{F})^T]^T$, with $S_i \in \mathbb{R}^{m_i \times n}$, and $\bar{F} = -\bar{R}_1^{-1}(\bar{V}_1^T + \bar{B}_1^T\bar{K}_1)$.

Remark 6. Gardner and Cruz (1978) considered convergence of the feedback Nash solution of the singularly perturbed system

$$\dot{x}_1(t) = x_1(t) + 2x_2(t) + u_1(t) + u_2(t), \quad x_1(0) = x_{10};$$
$$\epsilon\dot{x}_2(t) = -x_1(t) - 2x_2(t) + 2u_1(t) + 2u_2(t), \quad x_2(0) = x_{20};$$

with performance criteria

$$J_i = \int_0^\infty \{[x_1^T(t)\ \ x_2^T(t)]\,Q_i\,[x_1^T(t)\ \ x_2^T(t)]^T + u_i^T(t)R_{ii}u_i(t) + u_j^T(t)R_{ij}u_j(t)\}\,dt, \quad i, j = 1, 2,\ i \ne j,$$

where $Q_i = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$, $R_{ii} = 1$, $R_{ij} = 2$. In this example, the converged (symmetric) feedback Nash equilibrium strategies are $F_i^* = \lim_{\epsilon \downarrow 0} F_i(\epsilon) = -[1.5908\ \ 0.7321]$, $i = 1, 2$.

If $\epsilon = 0$, with $Y^T = \begin{bmatrix} 1 & 1 \\ 0 & \frac{1}{2} \end{bmatrix}$ and $X = \begin{bmatrix} 1 & 0 \\ -\frac{1}{2} & -1 \end{bmatrix}$, the transformed system parameters (cf. (5) and (9)) are $A_1 = 0$, $B_{i1} = 3$, $B_{i2} = 1$, $\bar{Q}_i = \frac{3}{2}$, $\bar{V}_{i11} = 0$, $\bar{V}_{i12} = 0$, $\bar{R}_{ii} = 3$, $\bar{R}_{ij} = 4$, $j \ne i$, and $\bar{V}_{i22} = 2$, $i = 1, 2$. Straightforward calculations show that the set of FNE for this game are (see (10)):

$$F_i = -\frac{1}{\sqrt{6}}\left(1 + \left(\tfrac{1}{2} + \sqrt{\tfrac{2}{3}}\right)^2\right)^{-1}\left[1\ \ \ -\left(\tfrac{1}{2} + \sqrt{\tfrac{2}{3}}\right)\right] + \lambda_i\left[\tfrac{1}{2} + \sqrt{\tfrac{2}{3}}\ \ \ 1\right], \quad i = 1, 2; \quad -0.4906 < \lambda_1 + \lambda_2 < 1.1966.$$

So both $F_1$ and $F_2$ are located on the same line. It is easily verified that $F_1^*$ does not lie on this line. Therefore, $F_i^*$ does not belong to this class of FNE solutions. So, this example demonstrates that in general the FNE strategies of the singularly perturbed system do not converge to an FNE strategy of the corresponding reduced order system. This happens even though the fast dynamics in the singularly perturbed system are stable (i.e., $A_{22} = -2 < 0$). The next example shows that there are also cases where, independent of the stability properties of the fast dynamics, convergence always occurs.

4. An example

In this section, we illustrate some interesting issues that arise within this framework concerning the choice of FNE strategies by means of a small example. Consider the following model of competition between two players where player 1 controls the slow dynamics of the system (using $u_1$) and player 2 controls the fast dynamics of the system (using $u_2$):

$$\dot{f}(t) = \beta f(t) - p(t) - u_1(t), \quad f(0) = f_0; \quad \text{and} \quad \epsilon\dot{p}(t) = \alpha p(t) + u_2(t), \quad \text{where } \alpha \ne 0; \tag{14}$$

with revenue functions

$$\tilde{J}_1 = \int_0^\infty \{\tau_f f^2(t) - u_1^2(t)\}\,dt \quad \text{and} \quad \tilde{J}_2 = \int_0^\infty \{\tau_p p^2(t) - u_2^2(t)\}\,dt. \tag{15}$$

Both players like to maximize their utility measured by $\tau_f f^2$ and $\tau_p p^2$, respectively, using as little control effort as possible, which is measured by $u_i^2$. Furthermore there is some third party which requires from both players that, as time proceeds, $f(\cdot)$ and $p(\cdot)$ should converge to zero. For an interpretation and elaboration of some technical details of this model we refer to Engwerda and Salmah (2010). Introducing $x^T(t) := [f(t)\ \ p(t)]$, we can rewrite the above model (14) and (15) as

$$\tilde{E}\dot{x}(t) = Ax(t) + B_1u_1(t) + B_2u_2(t)$$

with cost functions

$$J_i = \int_0^\infty [x^T(t)\ \ u_1(t)\ \ u_2(t)]\,M_i\,[x^T(t)\ \ u_1(t)\ \ u_2(t)]^T\,dt,$$

where $\tilde{E} = \begin{bmatrix} 1 & 0 \\ 0 & \epsilon \end{bmatrix}$; $A = \begin{bmatrix} \beta & -1 \\ 0 & \alpha \end{bmatrix}$; $B_1 = \begin{bmatrix} -1 \\ 0 \end{bmatrix}$; $B_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$; $M_1 = \mathrm{diag}(-\tau_f, 0, 1, 0)$; and $M_2 = \mathrm{diag}(0, -\tau_p, 0, 1)$.

If $\epsilon = 0$ we have, with $Y^T := \frac{1}{\alpha^2}\begin{bmatrix} -\alpha & -1 \\ 0 & \alpha \end{bmatrix}$ and $X := \begin{bmatrix} -\alpha & 0 \\ 0 & 1 \end{bmatrix}$, that the transformed matrices are $Y^TEX = \mathrm{diag}(1, 0)$; $Y^TAX = \mathrm{diag}(\beta, 1)$; $Y^TB_1 = [\frac{1}{\alpha},\ 0]^T$; $Y^TB_2 = [-\frac{1}{\alpha^2},\ \frac{1}{\alpha}]^T$. Denoting $F_2 := [f_{21}\ \ f_{22}]$ it is easily verified that $(E, A + BF)$ has index one iff $f_{22} \ne -\alpha$. From (9) it follows directly that $\bar{M}_1 = \mathrm{diag}(-\alpha^2\tau_f, 1, 0)$ and $\bar{M}_2 = \mathrm{diag}(0, 0, \frac{\tilde{\tau}_p}{\alpha^2})$, with $\tilde{\tau}_p := \alpha^2 - \tau_p$. Therefore $\bar{G} = \mathrm{diag}(1, \frac{\tilde{\tau}_p}{\alpha^2})$. Assuming that $\tilde{\tau}_p > 0$ and $\beta - \frac{k_1}{\alpha^2} - \frac{k_2}{\alpha^2\tilde{\tau}_p} < 0$, we have from Theorem 3 that $[\bar{F}_1,\ \bar{F}_2] = [-\frac{k_1}{\alpha},\ \frac{k_2}{\tilde{\tau}_p}]$. Here $k_i$ solve the equations

$$\left(2\beta - \frac{2}{\alpha^2\tilde{\tau}_p}k_2\right)k_1 - \frac{1}{\alpha^2}k_1^2 - \alpha^2\tau_f = 0 \quad \text{and} \quad \left(2\beta - \frac{2}{\alpha^2}k_1 - \frac{1}{\alpha^2\tilde{\tau}_p}k_2\right)k_2 = 0. \tag{16}$$

In case $k_2 \ne 0$ it follows next from (10) that the feedback matrices $F_i$ yielding an FNE satisfy

$$F_1 = \left[0\ \ \ \frac{\tilde{\tau}_pk_1}{k_2}\right] + \lambda_1\left[\frac{-k_2}{\alpha\tilde{\tau}_p}\ \ \ \alpha\right] \quad \text{and} \quad F_2 = [0\ \ \ -\alpha] + \lambda_2\left[\frac{-k_2}{\alpha\tilde{\tau}_p}\ \ \ \alpha\right]. \tag{17}$$

That is, geometrically, all equilibrium feedback matrices $F_1$ are located on a line that is parallel to a line where the corresponding $F_2$ matrix lies. Depending on the choice of the parameters either zero up to three different equilibrium points $(J_1, J_2)$ can occur (see Engwerda and Salmah (2010)). In the rest of this example we take $\tau_f = \beta^2$, in which case just one equilibrium occurs (see Engwerda and Salmah (2010)). Substitution of $(k_1, k_2) = \left(\frac{-\alpha^2\beta}{3},\ \frac{8\alpha^2\beta\tilde{\tau}_p}{3}\right)$ into (16) shows that $(k_1, k_2)$ solve these equations. Further, from (17), we have that the equilibrium actions are

$$F_1 = \left[0\ \ \ -\tfrac{1}{8}\right] + \lambda_1\left[-\tfrac{8}{3}\beta\ \ \ 1\right] \quad \text{and} \quad F_2 = [0\ \ \ -\alpha] + \lambda_2\left[-\tfrac{8}{3}\beta\ \ \ 1\right], \quad \text{with } \lambda_2 \ne 0. \tag{18}$$

Choosing any $F$ from this set (18) of FNE it follows from (8) that the corresponding consistent initial state for the $p$ variable is $p_0 = \frac{8}{3}\beta f_0$. The resulting closed-loop dynamics is then described by the trajectories $p(t) = \frac{8}{3}\beta f(t)$, where $f(t) = e^{-\frac{4}{3}\beta t}f_0$. Finally notice that by Theorem 3 the corresponding costs are $J_1 = -\frac{\beta}{3}f_0^2$ and $J_2 = \frac{8}{3}\beta\tilde{\tau}_pf_0^2$.

In Engwerda and Salmah (2010) also the feedback Nash equilibria are determined in case $\epsilon \ne 0$. In particular it is shown there that the corresponding equilibrium costs coincide with the above formulas in the limiting case ($\epsilon \downarrow 0$). Furthermore the limiting equilibrium actions are obtained by choosing $\lambda_1 = \frac{1}{8}$ and $\lambda_2 = -\sqrt{\tilde{\tau}_p}$ in (18).

To analyze the question which choice of $\lambda_i$ in (18) might provide some additional robustness properties, we next consider the system if one uses the equilibrium actions $F_i$ from (18) to control system (14) if $\epsilon \ne 0$. Then the system dynamics are described by

$$\begin{bmatrix} \dot{f}(t) \\ \dot{p}(t) \end{bmatrix} = \begin{bmatrix} \beta\left(1 + \frac{8}{3}\lambda_1\right) & -\left(\lambda_1 + \frac{7}{8}\right) \\ -\frac{8}{3}\frac{\beta\lambda_2}{\epsilon} & \frac{\lambda_2}{\epsilon} \end{bmatrix}\begin{bmatrix} f(t) \\ p(t) \end{bmatrix} =: A_\epsilon\begin{bmatrix} f(t) \\ p(t) \end{bmatrix}, \quad \begin{bmatrix} f(0) \\ p(0) \end{bmatrix} = \begin{bmatrix} f_0 \\ p_0 \end{bmatrix}.$$

Notice that this system is stable iff $\lambda_2 < 0$ and $\frac{\lambda_2}{\epsilon} + \beta\left(1 + \frac{8}{3}\lambda_1\right) < 0$. From a robustness point of view one option (see Engwerda and Salmah (2010) for other options) could be that players choose $\lambda_i$ in such a way that the closed-loop system becomes as stable as possible. Or, stated differently, to determine $\lambda_i$ in such a way that the real part of the largest eigenvalue of matrix $A_\epsilon$,

$$\lambda(\lambda_1, \lambda_2) := \frac{1}{2}\left\{\frac{\lambda_2}{\epsilon} + \beta\left(1 + \frac{8}{3}\lambda_1\right) + \sqrt{\left(\frac{\lambda_2}{\epsilon} + \beta\left(1 + \frac{8}{3}\lambda_1\right)\right)^2 + \frac{16}{3}\frac{\lambda_2\beta}{\epsilon}}\right\},$$

is minimal. First notice that for fixed $\lambda_i$, $\lim_{\epsilon \downarrow 0}\lambda(\lambda_1, \lambda_2) = -\frac{4}{3}\beta$. Choosing, for a fixed negative $\lambda_2$, $\lambda_1$ such that $1 + \frac{8}{3}\lambda_1 = -\frac{\lambda_2}{\epsilon\beta} - \sqrt{-\frac{16}{3}\frac{\lambda_2}{\epsilon\beta}}$, we have that $\lambda(\lambda_1, \lambda_2) = -\frac{1}{2}\sqrt{-\frac{16}{3}\frac{\beta\lambda_2}{\epsilon}}$. So, in case players would agree to coordinate their actions, as far as it concerns the choice of the $\lambda_i$ parameters, we see that this largest eigenvalue can be made arbitrarily small by an appropriate choice of $\lambda_i$. A clear disadvantage of this choice is that its implementation requires knowledge of $\epsilon$. Furthermore, it easily results in high gain feedbacks.

On the other hand we see that if player 1 chooses $\lambda_1 = -\frac{7}{8}$, $\lambda(-\frac{7}{8}, \lambda_2) = -\frac{4}{3}\beta$, whatever player 2's choice of $\lambda_2$ ($< 0$) is. Furthermore, $\lambda(\lambda_1, \lambda_2) > -\frac{4}{3}\beta$ if $\lambda_1 < -\frac{7}{8}$ and $\lambda(\lambda_1, \lambda_2) < -\frac{4}{3}\beta$ if $-\frac{7}{8} < \lambda_1 < -\frac{3\lambda_2}{8\epsilon\beta} - \frac{11}{8}$. So, in case it is unclear what the value of $\epsilon$ will be and, moreover, player 1 does not know the choice by player 2 of $\lambda_2$, player 1 can always enforce a certain stability of the closed-loop system. This comes at the expense that he might have earned more profits (for small values of $\epsilon$) or that, in retrospect, a more stabilizing control could have been implemented if he had known the action of player 2.

5. Concluding remarks

In this paper, we considered the regular indefinite infinite-planning horizon linear-quadratic differential game for index
one descriptor systems. Both necessary conditions and sufficient conditions were derived for the existence of a feedback Nash equilibrium. These conditions were stated in terms of a transformed system. A complete parametrization was derived for the set of FNE strategies. The transformation used here was based on the Weierstrass canonical form. Notice, however, that for numerical purposes it suffices to make a singular value decomposition of matrix E. Using this, one can then easily calculate matrices Y and X yielding the same structural form (4), with matrix A1 a square matrix instead of a Jordan matrix (see Engwerda et al. (2009, Remark 3.2)).
In case there exists an FNE, usually there exists an infinite number of feedback Nash equilibria which all give rise to the same closed-loop behavior of the system, due to the informational non-uniqueness of the problem. This makes it possible to look for equilibria within this set that satisfy some additional properties, e.g., robustness. We illustrated in an example how different criteria may yield different choices for the feedback design. In particular we showed that in this example one of these equilibria can be interpreted as the converged FNE of the corresponding singularly perturbed system. By means of a counterexample (based on Gardner and Cruz (1978)) we showed that this property does not hold for arbitrary systems. Furthermore we showed that in this example one of the players can enforce a certain stability of the closed-loop system for the corresponding singularly perturbed system by an appropriate choice of his strategy, independent of the FNE strategy choice of the other player and independent of the time-scale of the fast system. Clearly these observations ask for a more detailed study in a more general setting.
Since Qi are allowed to be indefinite, the obtained results were used to solve the multi-player zero-sum game. We showed that in that case all FNE strategies yield the same closed-loop system.

Appendix

Lemma 7. Assume (E, A + BF) is regular and has index one. Then for all F ∈ F, W := In−r + B̄2 F X2 is invertible.
Proof. See Engwerda et al. (2009).

Lemma 8. Assume C ∈ Rn×m and D ∈ Rm×n. Then the following holds.
1. In + CD is invertible if and only if Im + DC is invertible.
2. If In + CD is invertible then C(Im + DC)−1 = (In + CD)−1 C.
Proof. See, e.g., De Carlo and Saeks (1981, p. 97).
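Lemma 8 can be illustrated numerically. The sketch below checks both statements for the smallest non-trivial case n = 2, m = 1 with exact rational arithmetic. The function name and the test vectors are hypothetical choices made only for illustration.

```python
from fractions import Fraction


def push_through(c, d):
    """Compare C (I_m + DC)^{-1} with (I_n + CD)^{-1} C for n = 2, m = 1.

    c = (c1, c2) holds the column C, d = (d1, d2) holds the row D.
    Returns (det(I_m + DC), det(I_n + CD), lhs, rhs); lhs and rhs are None
    when the matrices are singular.
    """
    s = 1 + d[0] * c[0] + d[1] * c[1]                 # I_m + DC (a scalar)
    m = [[1 + c[0] * d[0], c[0] * d[1]],
         [c[1] * d[0], 1 + c[1] * d[1]]]              # I_n + CD
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if s == 0 or det == 0:
        return s, det, None, None
    lhs = (c[0] / s, c[1] / s)                        # C (I_m + DC)^{-1}
    inv = [[m[1][1] / det, -m[0][1] / det],
           [-m[1][0] / det, m[0][0] / det]]
    rhs = (inv[0][0] * c[0] + inv[0][1] * c[1],
           inv[1][0] * c[0] + inv[1][1] * c[1])       # (I_n + CD)^{-1} C
    return s, det, lhs, rhs


s, det, lhs, rhs = push_through((Fraction(1), Fraction(2)),
                                (Fraction(3), Fraction(-1)))
print(s, det, lhs == rhs)
```

In this small case both determinants coincide, which reflects statement 1, and the two products agree, which reflects statement 2. The same identity is what allows the inverse to be moved across the product in the derivation of (25).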
The following lemma can be proved directly from the definition of Nash equilibrium.

Lemma 9. $F^*$ is an FNE for the game defined by the cost $J_i(x_0, FH)$ and the system $\dot{x}(t) = (A + BFH)x(t)$, $x(0) = x_0$ if and only if $\hat{F}^*$ is an FNE for the game defined by the cost $J_i(x_0, F)$ and the system $\dot{x}(t) = (A + BF)x(t)$, $x(0) = x_0$ and $F^*$ solves the set of equations $F^*H = \hat{F}^*$.

Next, we introduce a number of variables which facilitate a brief statement of Theorem 10, below. $B_d$ is the block-diagonal matrix $B_d^T := \mathrm{diag}(B_1^T, B_2^T, \ldots, B_N^T)$; $K$ is the block-column matrix $K := [K_1^T, \ldots, K_N^T]^T$; $E_{i+1}$ is obtained from the block-column matrix containing $N + 1$ zero blocks, where block $i + 1$ is replaced by the identity matrix, i.e., $E_{i+1}^T = \mathrm{Block}[0_{m_i \times n}\ \ 0_{m_i \times m_1} \cdots 0_{m_i \times m_{i-1}}\ \ I_{m_i}\ \ 0_{m_i \times m_{i+1}} \cdots 0_{m_i \times m_N}]$, $i \in \bar{N}$; row $i$ of matrix $G$ equals block-row $i + 1$ of $M_i$, excluding its first block-entry, $i \in \bar{N}$, i.e., $G := [M_1E_2 \cdots M_NE_{N+1}]^T[0_{\bar{m} \times n}\ \ I_{\bar{m}}]^T$. Block-entry $i$ of block-column matrix $Z$ is the $(i + 1)$th block-entry of the first block-column of matrix $M_i$, i.e., $Z := [M_1E_2 \cdots M_NE_{N+1}]^T[I_n\ \ 0 \cdots 0]^T = [V_{111} \cdots V_{N1N}]^T$. Using this, we recall from Engwerda and Salmah (submitted for publication) the following result.

Theorem 10. Assume $G$ is invertible. The differential game (1) and (2), with $E = I_n$ and $R_{ii} > 0$, has a feedback Nash equilibrium $F$ for every initial state if and only if

$$F = -G^{-1}(Z + B_d^TK). \tag{19}$$

Here $K_i$, $i \in \bar{N}$, are symmetric solutions of the coupled algebraic Riccati-type equations

$$A_{cl}^TK_i + K_iA_{cl} + [I_n\ \ F^T]\,M_i\,[I_n\ \ F^T]^T = 0, \quad i \in \bar{N}, \tag{20}$$

satisfying $\sigma(A_{cl}) \subset \mathbb{C}^-$, where $A_{cl} := A + BF$. Further, $J_i = x_0^TK_ix_0$.
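To see how (19) and (20) are used, consider a scalar two-player game of the form covered by Theorem 10. The sketch below takes the reduced game reported in Remark 6 (A = 0, B1 = B2 = 3, state weight 3/2, own control weight 3, weight 4 on the other player's control, control cross-term 2, no state-control cross-terms) and checks that the symmetric candidate k = 5/(3√6) satisfies (19) and (20). The candidate value of k and all variable names are assumptions of this illustration: k was worked out by hand from the Remark 6 data and is not quoted from the paper.

```python
import math

# Reduced scalar game data (transformed parameters reported in Remark 6).
a, b = 0.0, 3.0
M1 = [[1.5, 0.0, 0.0], [0.0, 3.0, 2.0], [0.0, 2.0, 4.0]]   # ordering (x, u1, u2)
M2 = [[1.5, 0.0, 0.0], [0.0, 4.0, 2.0], [0.0, 2.0, 3.0]]

# G stacks row 2 of M1 and row 3 of M2 without their first entries.
G = [[M1[1][1], M1[1][2]], [M2[2][1], M2[2][2]]]
# Z collects the matching first-column entries (both zero here).
Z = [M1[1][0], M2[2][0]]


def feedback(k1, k2):
    """F = -G^{-1} (Z + B_d^T K) for a 2x2 matrix G, cf. (19)."""
    rhs = [Z[0] + b * k1, Z[1] + b * k2]
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    f1 = -(G[1][1] * rhs[0] - G[0][1] * rhs[1]) / det
    f2 = -(-G[1][0] * rhs[0] + G[0][0] * rhs[1]) / det
    return f1, f2


def riccati_residual(M, k, f1, f2):
    """Left-hand side of (20) for a scalar state: 2*a_cl*k + v^T M v, v = (1, f1, f2)."""
    a_cl = a + b * (f1 + f2)
    v = (1.0, f1, f2)
    quad = sum(v[i] * M[i][j] * v[j] for i in range(3) for j in range(3))
    return 2.0 * a_cl * k + quad


k = 5.0 / (3.0 * math.sqrt(6.0))       # hand-derived symmetric candidate
f1, f2 = feedback(k, k)
a_cl = a + b * (f1 + f2)
res1 = riccati_residual(M1, k, f1, f2)
res2 = riccati_residual(M2, k, f1, f2)
print(f1, f2, a_cl, res1, res2)
```

Both residuals vanish up to rounding and the closed-loop coefficient is negative, so the candidate is a stabilizing solution of (20) for this data. The resulting feedback gain equals −1/√6 for both players.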
Proof of Theorem 3. By Lemma 7, $W$ is invertible. From our jump assumption on inconsistent initial states it follows therefore directly from (7) that for all $t > 0$, $x_2 = Hx_1$, where $H := -W^{-1}\bar{B}_2FX_1$. Substitution of this into (6) and (7) shows that Problem 1 has an FNE if and only if $F$ is an FNE for

$$J_i = \int_0^\infty \{x_1^T(t)[I_r\ \ H^T]X^T[I_n\ \ F^T]\,M_i\,[I_n\ \ F^T]^TX[I_r\ \ H^T]^Tx_1(t)\}\,dt \tag{21}$$

subject to

$$\dot{x}_1(t) = (A_1 + \bar{B}_1FX[I_r\ \ H^T]^T)x_1(t), \quad x_1(0) = [I_r\ \ 0]X^{-1}x_0. \tag{22}$$

Introducing $\bar{F} := FX[I_r\ \ H^T]^T$, we can rewrite (21) and (22) as

$$J_i = \int_0^\infty \{x_1^T(t)\,[[I_r\ \ H^T]X^T\ \ \ \bar{F}^T]\,M_i\,[[I_r\ \ H^T]X^T\ \ \ \bar{F}^T]^T\,x_1(t)\}\,dt, \tag{23}$$

and

$$\dot{x}_1(t) = (A_1 + \bar{B}_1\bar{F})x_1(t), \quad x_1(0) = [I_r\ \ 0]X^{-1}x_0. \tag{24}$$

By Lemma 8, $H$ can be rewritten as

$$\begin{aligned} H &= -(I_{n-r} + \bar{B}_2FX_2)^{-1}\bar{B}_2FX_1 \\ &= -\bar{B}_2F(I_n + X_2\bar{B}_2F)^{-1}X_1 \\ &= -\bar{B}_2F\left(X_1 - (I_n + X_2\bar{B}_2F - I_n)(I_n + X_2\bar{B}_2F)^{-1}X_1\right) \\ &= -\bar{B}_2F\left(X_1 - X_2(I_{n-r} + \bar{B}_2FX_2)^{-1}\bar{B}_2FX_1\right) \\ &= -\bar{B}_2F(X_1 + X_2H) = -\bar{B}_2\bar{F}. \end{aligned} \tag{25}$$

Using (25) we can rewrite (23) as

$$J_i = \int_0^\infty \{x_1^T(t)[I_r\ \ \bar{F}^T]\,\bar{M}_i\,[I_r\ \ \bar{F}^T]^Tx_1(t)\}\,dt. \tag{26}$$

So, by Lemma 9, $F$ is an FNE for (1) and (2) for every initial state if and only if $\bar{F}$ is an FNE for the game defined by (24) and (26) and $F$ solves the set of equations $\bar{F} = FX[I_r\ \ H^T]^T$. The last-mentioned equation always has a solution; see e.g., Abadir and Magnus (2005, p. 295). Further, its solutions are parameterized by (10). Finally, by Theorem 10, $\bar{F}$ is an FNE for every initial state for game (24) and (26) if and only if (11) holds where $\bar{K}$ is a stabilizing solution of (12).
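As a numerical cross-check of how Theorem 3 is applied in the example of Section 4, the sketch below evaluates the Riccati-type equations of that example at the reported solution for τf = β². It assumes the transformed quantities that follow from (9) and (11) for this model under one admissible choice of X and Y, namely F̄ = [−k1/α, k2/τ̃p], Ācl = β − k1/α² − k2/(α² τ̃p) and P = [−α, −k2/(α τ̃p)]ᵀ. These expressions, the function name and the parameter values are assumptions of this sketch.

```python
from fractions import Fraction


def check_example(alpha, beta, tau_p, lam1, lam2):
    """Exact checks for the Section 4 example with tau_f = beta**2."""
    tau_f = beta ** 2
    tau_t = alpha ** 2 - tau_p                    # tilde tau_p, assumed positive
    k1 = -alpha ** 2 * beta / 3
    k2 = 8 * alpha ** 2 * beta * tau_t / 3

    # Closed-loop coefficient and the two scalar Riccati-type residuals.
    a_cl = beta - k1 / alpha ** 2 - k2 / (alpha ** 2 * tau_t)
    res1 = 2 * a_cl * k1 + k1 ** 2 / alpha ** 2 - alpha ** 2 * tau_f
    res2 = 2 * a_cl * k2 + k2 ** 2 / (alpha ** 2 * tau_t)

    # Reduced feedbacks and the direction P of consistent states (f, p).
    fbar1, fbar2 = -k1 / alpha, k2 / tau_t
    P = (-alpha, -k2 / (alpha * tau_t))

    # One-parameter families of equilibrium feedback rows.
    null_dir = (-k2 / (alpha * tau_t), alpha)     # orthogonal to P
    F1 = (lam1 * null_dir[0], tau_t * k1 / k2 + lam1 * null_dir[1])
    F2 = (lam2 * null_dir[0], -alpha + lam2 * null_dir[1])

    def dot(row):
        return row[0] * P[0] + row[1] * P[1]

    return {
        "res1": res1,
        "res2": res2,
        "a_cl": a_cl,
        "F1P_minus_fbar1": dot(F1) - fbar1,
        "F2P_minus_fbar2": dot(F2) - fbar2,
        "p_over_f": P[1] / P[0],
        "J1_per_f0_sq": k1 / alpha ** 2,
        "J2_per_f0_sq": k2 / alpha ** 2,
    }


# Hypothetical parameter values: alpha = 2, beta = 3/2, tau_p = 1.
out = check_example(Fraction(2), Fraction(3, 2), Fraction(1),
                    Fraction(1, 8), Fraction(-1))
print(out)
```

With these values both residuals are exactly zero, the closed-loop coefficient equals −4β/3, the ratio p/f equals 8β/3, and the products F_i P reproduce F̄_i for the chosen λ_i. This is consistent with the closed-loop trajectories and costs reported in Section 4.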
References
Abadir, K., & Magnus, J. (2005). Matrix algebra. Cambridge: Cambridge University Press.
Başar, T., & Bernhard, P. (1995). H∞-optimal control and related minimax design problems. Boston: Birkhäuser.
Başar, T., & Olsder, G. J. (1999). Dynamic noncooperative game theory. Philadelphia: SIAM.
Bender, D. J., & Laub, A. J. (1987). The linear-quadratic optimal regulator for descriptor systems. IEEE TAC, 32, 672–688.
Broek, W. A. van den, Engwerda, J. C., & Schumacher, J. M. (2003). Robust equilibria in indefinite linear-quadratic differential games. JOTA, 119, 565–595.
Cobb, D. (1983). Descriptor variable systems and optimal state regulation. IEEE TAC, 28, 601–611.
De Carlo, R. A., & Saeks, R. (1981). Interconnected dynamical systems. New York: Marcel Dekker.
Dockner, E., Jørgensen, S., Long, N. van, & Sorger, G. (2000). Differential games in economics and management science. Cambridge: Cambridge University Press.
Engwerda, J. C. (2005). LQ dynamic optimization and differential games. Chichester: John Wiley & Sons.
Engwerda, J. C., & Salmah (2009). The open-loop linear quadratic differential game for index one descriptor systems. Automatica, 45, 585–592.
Engwerda, J. C., & Salmah (2010). Feedback Nash equilibria for linear quadratic descriptor differential games. CentER Discussion paper no. 2010-79. Tilburg University, The Netherlands. http://center.uvt.nl/pub/dp2010.html.
Engwerda, J. C., & Salmah (2010). Necessary and sufficient conditions for feedback Nash equilibria for the affine-quadratic differential game. CentER Discussion paper no. 2010-78. Tilburg University, The Netherlands. http://center.uvt.nl/pub/dp2010.html (submitted for publication).
Engwerda, J. C., Salmah, & Wijayanti, I. E. (2009). The open-loop discounted linear quadratic differential game for regular higher order index descriptor systems. International Journal of Control, 82, 2365–2374.
Engwerda, J. C., Salmah, & Wijayanti, I. E. (2009). The (multi-player) optimal linear quadratic feedback state regulator problem for index one descriptor systems. In Proceedings 10th European control conference. Budapest.
Gantmacher, F. (1959). Theory of matrices: vols. I, II. New York: Chelsea.
Gardner, B. F. (1977). Zero-sum Nash strategy for systems with fast and slow modes. In Proceedings 15th Allerton conference on communication, computers and control (pp. 96–103). Urbana: University of Illinois.
Gardner, B. F., & Cruz, J. B. (1978). Well-posedness of singularly perturbed Nash games. Journal of the Franklin Institute, 306, 355–374.
Geerts, T. (1993). Solvability conditions, consistency, and weak consistency for linear differential–algebraic equations and time-invariant singular systems: the general case. Linear Algebra and its Applications, 181, 111–130.
Glizer, V. Y. (2000). Asymptotic solution of zero-sum linear-quadratic differential game with cheap control for minimizer. Nonlinear Differential Equations and Applications, 7, 231–258.
Grass, D., Caulkins, J. P., Feichtinger, G., Tragler, G., & Behrens, D. A. (2008). Optimal control of nonlinear processes: with applications in drugs. Berlin: Springer-Verlag.
Grimsrud, K. M., & Huffaker, R. (2006). Solving multidimensional bioeconomic problems with singular-perturbation reduction methods: application to managing pest resistance to pesticidal crops. Journal of Environmental Economics and Management, 51, 336–353.
Hodge, A. M., & Newcomb, R. W. (2002). Semistate theory and analogue VLSI design. IEEE Circuits and Systems Magazine, 2, 30–51.
Jørgensen, S., & Zaccour, G. (2003). Differential games in marketing. Deventer: Kluwer.
Kalogeropoulos, G., & Arvanitis, K. G. (1998). A matrix-pencil-based interpretation of inconsistent initial conditions and system properties of generalized state-space. IMA Journal of Mathematical Control and Information, 15, 73–91.
Kautsky, J., Nichols, N. K., & Chu, E. K.-W. (1989). Robust pole assignment in singular control systems. Linear Algebra and its Applications, 121, 9–37.
Kun, G. (2001). Stabilizability, controllability, and optimal strategies of linear and nonlinear dynamical games. Ph.D. Thesis. RWTH-Aachen, Germany.
Lewis, F. L. (1986). A survey of linear singular systems. Circuits, Systems and Signal Processing, 5, 3–36.
Lukes, D. L. (1971). Equilibrium feedback control in linear games with quadratic costs. SIAM Journal on Control and Optimization, 9, 234–252.
Mizukami, K., & Xu, H. (1992). Closed-loop Stackelberg strategies for linear-quadratic descriptor systems. JOTA, 74, 151–170.
Pandolfi, L. (1981). On the regulator problem for linear degenerate control systems. JOTA, 33, 241–254.
Papavassilopoulos, G. P., Medanic, J., & Cruz, J. (1979). On the existence of Nash strategies and solutions to coupled Riccati equations in linear-quadratic games. JOTA, 28, 49–75.
Plasmans, J., Engwerda, J., Aarle, B. van, Di Bartolomeo, B., & Michalak, T. (2006). In series: dynamic modeling and econometrics in economics and finance: vol. 8. Dynamic modeling of monetary and fiscal cooperation among nations. Berlin: Springer-Verlag.
Starr, A. W., & Ho, Y. C. (1969). Nonzero-sum differential games. JOTA, 3, 184–206.
Xu, H., & Mizukami, K. (1996). Linear feedback closed-loop Stackelberg strategies for descriptor systems with multilevel hierarchy. JOTA, 88, 209–231.
Xu, H., & Mizukami, K. (1994). On the Isaacs equation of differential games for descriptor systems. JOTA, 83, 405–419.
J.C. Engwerda was born in Nieuweschans, The Netherlands, in 1958. He received his master's degree from Groningen University and his Ph.D. degree in mathematics, with a specialization in systems and control theory, from the Technical University of Eindhoven. Since 1988 he has worked as an associate professor at the Department of Econometrics and Operations Research at Tilburg University. He is the author of “LQ Dynamic Optimization and Differential Games” and a coauthor of “Dynamic Modeling of Monetary and Fiscal Cooperation Among Nations”. His research focuses on the development of system-theoretic methods for control problems in economics.
Salmah was born in Surakarta, Indonesia, in 1964. She received her master's degree in Mathematics from the Institute of Technology Bandung, Indonesia, in 1995 and her doctoral degree in Mathematics from Gadjah Mada University, Yogyakarta, Indonesia, in 2006. She is currently a lecturer in Mathematics at Gadjah Mada University, Yogyakarta, Indonesia. Her research interests are in the areas of dynamic games and their applications to economic problems.