Linear Algebra Appl. 481 (2015) 1–35
Yet another characterization of solutions of the Algebraic Riccati Equation

A. Sanand Amita Dilip, Harish K. Pillai *
Department of Electrical Engineering, IIT Bombay, Powai, India

Article history: Received 1 September 2014; Accepted 22 April 2015; Available online 2 May 2015; Submitted by B. Meini
MSC: 15A24; 15A45; 06B99
Keywords: Riccati equation; Invariant subspaces; Schur complement; Lyapunov equation; Lattice
Abstract. This paper deals with a characterization of the solution set of the algebraic Riccati equation (ARE) (over the reals) for both controllable and uncontrollable systems. We characterize all solutions using simple linear algebraic arguments. It turns out that solutions of the ARE of maximal rank have lower rank solutions encoded within them. We demonstrate how these lower rank solutions are encoded within the full rank solution and how one can retrieve the lower rank solutions from the maximal rank solution. We characterize situations where there are no full rank solutions to the ARE. We also characterize situations when the number of solutions to the ARE is finite, when they are infinite, and when they form a bounded set. We also explore the poset structure on the solution set of the ARE, which in some specific cases turns out to be a lattice isomorphic to the lattice of invariant subspaces of a certain matrix. We provide several examples that bring out the essence of these results.
© 2015 Elsevier Inc. All rights reserved.
* Corresponding author.
E-mail addresses: [email protected] (A. Sanand Amita Dilip), [email protected] (H.K. Pillai).
http://dx.doi.org/10.1016/j.laa.2015.04.026
1. Introduction and preliminaries

The algebraic Riccati equation occurs naturally in control theory, filtering, numerical analysis and many other engineering applications. In optimal control, the algebraic Riccati equation (ARE) arises in the infinite horizon continuous time LQR problem. The ARE is also related to the power method and QR factorization in matrix computations [3,10], and to spectral factorization [17,7,8]. The Riccati equation shows up in Kalman filters too [13]. Refer to [15] for the Riccati equation arising in the linear quadratic control problem. In [17], solutions of the ARE are used in the study of acausal realizations of stationary processes; it is further shown in [17] how AREs are involved in spectral factorization and in a balancing algorithm (related to stochastic balancing). In [7,8], solutions of the ARE are used for the parametrization of minimal spectral factors. In [6], solutions of the ARE are used to obtain a parametrization of minimal stochastic realizations. For a treatment of the discrete-time ARE, refer to [9,24]. Recently, the study of the ARE has appeared in papers on the behavioral theory of systems [4,16]. For recent applications of Riccati equations to fluid queue models and transport equations, refer to [2] and the references therein. For more literature on the Riccati equation, refer to [1].

We concentrate on the ARE of the form

−A^T K − K A − Q + K B B^T K = 0

where A, B, Q are real constant matrices of dimensions n × n, n × m and n × n respectively, with Q symmetric. AREs with Q = 0 are known in the literature as homogeneous AREs. One of the methods for solving AREs uses the eigenvectors of the Hamiltonian matrix H [13,3,21]. There are several other methods for solving the ARE (see [21,23]). Details about constructing solutions of the ARE can be found in Chapters 2 and 3 of [3] and in Chapters 7–11 of [13]. It has been proved in [21] that if (A, B) is controllable, then there is a one-to-one correspondence between real symmetric solutions K of the ARE and n-dimensional H-invariant subspaces, complementary to the span of [0  I]^T, satisfying some special properties. It is known that if the pair (A, B) is controllable and the column span of the 2n × n matrix [U^T  V^T]^T is an H-invariant subspace satisfying some special property (i.e., the subspace being Lagrangian), then U is invertible and K = V U^{-1} gives a solution of the ARE [13,20]. From the literature, we know that a solution of the ARE exists if there exists an n-dimensional Lagrangian H-invariant subspace. We assume that this is the case and fix an arbitrary solution K0 of the ARE.

Let K = K0 + X, where X can be thought of as a perturbation from K0. We can then rewrite −A^T K − K A − Q + K B B^T K as

−A^T (K0 + X) − (K0 + X)A − Q + (K0 + X)B B^T (K0 + X)
= −A^T K0 − K0 A − Q + K0 B B^T K0 − A^T X − X A + K0 B B^T X + X B B^T K0 + X B B^T X
= −(A − B B^T K0)^T X − X(A − B B^T K0) + X B B^T X.    (1)
(Since −A^T K0 − K0 A − Q + K0 B B^T K0 = 0.) Let A0 = A − B B^T K0. Then solutions of the equation −A_0^T X − X A0 + X B B^T X = 0 characterize all the solutions of the original ARE. A similar construction has been employed previously in several papers, for example [23,19,14,17,20,7]. We denote −A_0^T X − X A0 + X B B^T X by Ric(X). Note that we are interested in real, symmetric solutions of Ric(X) = 0. For a vector v ∈ R^n, we define its norm ||·|| by ||v|| = √(v^T v). In this paper, we characterize all the solutions of Ric(X) = 0 using purely linear algebraic techniques. This characterization is done in terms of eigenspaces/invariant subspaces of the newly constructed matrix A_0^T. In addition, we demonstrate some remarkable relations between the various solutions of the ARE, which to the best of our knowledge have not been reported earlier.

2. Basic building blocks for solutions of Ric(X) = 0

We are interested in finding all the solutions of Ric(X) = 0. Clearly X = 0 is a solution of Ric(X) = −A_0^T X − X A0 + X B B^T X = 0. We now look for nonzero X that satisfy Ric(X) = 0.

2.1. Rank one solutions of Ric(X) = 0

We begin with the simplest case, i.e., matrices X that have rank one. Since X is symmetric, let X = αvv^T where α ∈ R and v ∈ R^n with ||v|| = 1. Further, let β = ||B^T v||.

Theorem 1. Let X = αvv^T, with ||v|| = 1. Then
1. if v is an eigenvector of A_0^T, the rank of Ric(X) is at most one and Ric(X) is definite;
2. for all other v, Ric(X) is indefinite.

Proof. Let X = αvv^T, where v is a right eigenvector of A_0^T with λ the corresponding eigenvalue. Now Ric(X) takes the following form:

Ric(X) = −αA_0^T vv^T − αvv^T A0 + α² vv^T B B^T vv^T
       = −2αλ vv^T + α²β² vv^T    (where β = ||B^T v||)
       = (−2αλ + α²β²) vv^T.    (2)

Clearly, Ric(X) = (−2αλ + α²β²)vv^T has rank at most one and, depending on the sign of (−2αλ + α²β²), it is positive or negative semidefinite. Now we prove the second statement. Let X = αvv^T where v is not an eigenvector of A_0^T. Let A_0^T v = γu with ||u|| = 1. Therefore
Ric(X) = −αγ(uv^T + vu^T) + (αβ)² vv^T = [u  v] [[0, −αγ],[−αγ, (αβ)²]] [u  v]^T.

Now [[0, −αγ],[−αγ, (αβ)²]] is indefinite, which in turn implies Ric(X) is indefinite. □
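To make the rank-one computation concrete, the following small Python/NumPy sketch (ours, not part of the paper) checks the formula Ric(αvv^T) = (−2αλ + α²β²)vv^T for an eigenvector v of A_0^T, and the choice α = 2λ/β² that will reappear in Theorem 3 below. The matrices A0 and B are arbitrary illustrative choices with real, nonzero, controllable eigenvalues.

```python
import numpy as np

# Illustrative data (not from the paper): 3x3 A0 with real nonzero eigenvalues, 3x1 B.
A0 = np.array([[3.0, 1.0, 0.0],
               [0.0, 2.0, 1.0],
               [0.0, 0.0, 1.0]])
B = np.array([[1.0], [1.0], [1.0]])

def ric(X):
    """Ric(X) = -A0^T X - X A0 + X B B^T X."""
    return -A0.T @ X - X @ A0 + X @ B @ B.T @ X

# Take a right eigenvector v of A0^T with (real) eigenvalue lam.
lam_all, V = np.linalg.eig(A0.T)
lam, v = lam_all[0].real, V[:, [0]].real
v = v / np.linalg.norm(v)
beta = np.linalg.norm(B.T @ v)            # beta = ||B^T v||, nonzero here (controllable mode)

# Theorem 1: for X = alpha v v^T, Ric(X) = (-2*alpha*lam + alpha^2*beta^2) v v^T.
alpha = 0.7
X = alpha * (v @ v.T)
print(np.allclose(ric(X), (-2 * alpha * lam + alpha**2 * beta**2) * (v @ v.T)))  # True

# Choosing alpha = 2*lam/beta^2 gives a rank-one solution of Ric(X) = 0
# (this requires lam != 0 and beta != 0, i.e. a controllable mode).
X1 = (2 * lam / beta**2) * (v @ v.T)
print(np.allclose(ric(X1), np.zeros((3, 3))))  # True
```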
Observe that the zero matrix is a very special case of a definite matrix. From the above theorem it is clear that if v is not an eigenvector of A_0^T, then we are not going to get solutions of Ric(X) = 0 along X = vv^T.

Consider the pair of matrices (A0, B). Using a change of basis, it is possible to write A0 and B in the following form [26,12]:

A0 = [[A0^{11}, A0^{12}],[0, A0^{22}]],   B = [[B1],[0]],

where (A0^{11}, B1) form a controllable pair. Eigenvalues of A0^{11} are controllable while eigenvalues of A0^{22} are uncontrollable. Left eigenvectors corresponding to uncontrollable eigenvalues are of the form v = [0  u^T]^T. All such vectors v are right eigenvectors of A_0^T such that B^T v = 0. All right eigenvectors of A_0^T that belong to the kernel of B^T are called eigenvectors that correspond to uncontrollable modes. If v is an eigenvector of A_0^T associated with a controllable eigenvalue, then B^T v ≠ 0, and so these eigenvectors correspond to controllable modes.

Theorem 2. If A_0^T has an uncontrollable zero eigenvalue, then the rank one solutions of Ric(X) = 0 form an unbounded set.

Proof. Let v be an eigenvector of A_0^T corresponding to the zero eigenvalue, associated to an uncontrollable mode. Therefore, A_0^T v = 0 and B^T v = 0. Let X = αvv^T where α ∈ R. Then Ric(X) = 0 for all α ∈ R, and therefore one has an unbounded set of rank one solutions of Ric(X) = 0. □

For the next theorem, we therefore assume that there are no uncontrollable modes corresponding to the zero eigenvalue.

Theorem 3. There is a one-to-one correspondence between rank one solutions of Ric(X) = 0 and eigenvectors v of A_0^T associated with nonzero real eigenvalues that correspond to controllable modes.

Proof. Let X = αvv^T where α ∈ R and v is an eigenvector of A_0^T with corresponding eigenvalue λ. Therefore, Ric(X) = (−2αλ + α²β²)vv^T. If A_0^T v = 0, then λ = 0 and
Ric(X) = (α²β²)vv^T. Hence Ric(X) = (αβ)²vv^T = 0 has only one solution, given by α = 0. This solution of Ric(X) = 0 is a rank zero solution and not a rank one solution. Assume λ ≠ 0 and let v ∈ R^n be an associated eigenvector of A_0^T corresponding to a controllable mode. Hence β = ||B^T v|| ≠ 0, and therefore Ric(X) = 0 for α = 2λ/β², i.e. for X = (2λ/β²)vv^T. If the eigenvector v corresponds to an uncontrollable mode, then β = 0 and Ric(X) = −2αλvv^T, which is zero only when α = 0. But α = 0 implies X = 0, which is of rank zero. □

2.2. Rank two solutions of Ric(X) = 0

Notice that all real rank one solutions of the ARE are related to eigendirections corresponding to real eigenvalues. The complex eigenvalues of A0 play no role in the rank one solutions. As the complex eigenvalues come in conjugate pairs, we can expect them to play a role in determining the real rank two solutions of the ARE. We therefore now characterize all rank two solutions of Ric(X) = 0. Let X = LLL^T where L = [u  v] and

L = [[α1, α3],[α3, α2]]

is a rank two symmetric matrix. Here u, v ∈ R^n, with ||u|| = ||v|| = 1.

Theorem 4. If X = LLL^T (where L is n × 2 and L is 2 × 2) such that the two columns of L are linearly independent, then Ric(X) is definite only if the column span of L is an A_0^T-invariant subspace.

Proof. Let L = [u  v], where ||u|| = ||v|| = 1 and u, v are linearly independent. Suppose the column span of L is not A_0^T-invariant. Let A_0^T u = γy where ||y|| = 1 and A_0^T v = δz where ||z|| = 1. Further, let β1 = ||B^T u||, β2 = ||B^T v|| and β3 = u^T B B^T v. Then

Ric(X) = [u  v  y  z] T [u  v  y  z]^T,

where

T =
[ T11    T12    −α1γ   −α3δ ]
[ T21    T22    −α3γ   −α2δ ]
[ −α1γ   −α3γ    0       0  ]
[ −α3δ   −α2δ    0       0  ]

with

T11 = (α1β1)² + (α3β2)² + 2α1α3β3,
T12 = T21 = α1α2β3 + α1α3β1² + α2α3β2² + α3²β3,
T22 = (α2β2)² + (α3β1)² + 2α2α3β3.

Case 1: Assume u, v, y, z are linearly independent, i.e., the columns of L and A_0^T L together span a four dimensional space. Notice that γ and δ are necessarily nonzero in this case. Using this
fact, along with the fact that the rank of L is two and the anti-diagonal (2 × 2) block structure of T, one can conclude that the rank of T is four. Now u, y are linearly independent and the determinant of the corresponding 2 × 2 principal submatrix of T is negative. Similarly, v, z are linearly independent and the determinant of the corresponding principal submatrix is negative. Hence T must be indefinite, and therefore Ric(X) is indefinite.

Case 2: Now consider the case when u, v, y, z span a three dimensional subspace. By assumption u and v are linearly independent. Without loss of generality, assume that y is linearly independent of u and v. We consider two sub-cases here. In the first sub-case, let δ = 0. In this case, it is enough to look at the top (3 × 3) block of the matrix T obtained above. Again, as the principal minor of T corresponding to u, y is negative, Ric(X) is indefinite. For the second sub-case, we assume that δ ≠ 0. Then z = σ1u + σ2v + σ3y, where σ1, σ2, σ3 ∈ R. Therefore −A_0^T X − X A0 + X B B^T X = [u  v  y] R T R^T [u  v  y]^T, where

R =
[ 1  0  0  σ1 ]
[ 0  1  0  σ2 ]
[ 0  0  1  σ3 ].

Since T is indefinite as proved above, Ric(X) is indefinite in this case too.

If u, v, y, z span a two dimensional space, then the span of u, v forms an A_0^T-invariant subspace. Thus we have shown that if the columns of L do not form an A_0^T-invariant subspace, then Ric(X) is indefinite. This implies that if Ric(X) is definite, then the columns of L must form an A_0^T-invariant subspace. □

From the above theorem, it is clear that to find solutions of Ric(X) = 0 where X = LLL^T, the column span of L must be A_0^T-invariant.

2.2.1. Complex eigenvalues of A0

Let pA0(x) ∈ R[x] be the characteristic polynomial of A0. Every two dimensional A_0^T-invariant subspace has a minimal polynomial given by a degree two polynomial which is a factor of pA0(x). Consider the case when A0 has a pair of complex conjugate eigenvalues λ ± iμ. Let v = v1 + iv2 be the complex eigenvector of A_0^T for λ + iμ. Therefore, A_0^T v1 = λv1 − μv2 and A_0^T v2 = μv1 + λv2. Thus v1, v2 span a two dimensional A_0^T-invariant subspace whose minimal polynomial is a degree two irreducible factor of pA0(x). Let X = LLL^T where L is the n × 2 matrix having columns v1 and v2 and L is a 2 × 2 symmetric matrix.

Theorem 5. Let X = LLL^T, where L is a symmetric 2 × 2 matrix of rank 2 and the two columns of L are the real and imaginary parts of a complex eigenvector corresponding to an eigenvalue λ + iμ. Further assume λ ≠ 0 and that the eigenvalues λ ± iμ are controllable. Then Ric(X) = 0 has a unique rank two solution of the given form. Further, this rank two solution is definite.
Proof. Ric(X) = L(−DL − LD^T + LML)L^T, where D = [[λ, μ],[−μ, λ]] and M = L^T B B^T L; since the complex conjugate eigenvalues are controllable, B^T L ≠ 0. Ric(X) = 0 iff −DL − LD^T + LML = 0. If L is of rank two, pre- and post-multiplying by Y = L^{-1}, we get −YD − D^T Y + M = 0. If Y = [yij] and M = [mij], then this last equation gives the following linear system of equations:

[ 2λ  −2μ   0  ] [ y11 ]   [ m11 ]
[  μ   2λ  −μ  ] [ y12 ] = [ m12 ] .    (3)
[  0   2μ   2λ ] [ y22 ]   [ m22 ]

The determinant of the 3 × 3 matrix in Eq. (3) equals 8λ(λ² + μ²), which is nonzero iff λ ≠ 0. Hence a unique solution Y exists, and we need to show that Y is invertible. Suppose Y is not invertible, i.e., Y has rank one. Let Y = αvv^T and let D^T v = w. As D does not induce a one dimensional real invariant subspace, v, w are linearly independent. Therefore

YD + D^T Y = α(vv^T D + D^T vv^T) = α(vw^T + wv^T) = α [v  w] [[0, 1],[1, 0]] [v  w]^T.

This implies YD + D^T Y is indefinite. But M = L^T B B^T L ≥ 0. Thus one obtains a contradiction, as YD + D^T Y = M. Therefore Y cannot have rank one. Hence Y must have rank two, and LY^{-1}L^T gives the unique rank two solution of Ric(X) = 0 of the prescribed form.

Let L̄ be such that X = LL̄L^T is the unique rank two solution of Ric(X) = 0. Assume that λ > 0 and suppose L̄ is indefinite. As L̄ is symmetric, it has some eigenvalue γ < 0 with corresponding eigenvector v. Observe that

v^T(DL̄ + L̄D^T)v = γ v^T(D + D^T)v = 2γλ v^T v < 0,
v^T(DL̄ + L̄D^T)v = v^T L̄ M L̄ v = γ² v^T M v ≥ 0,

a contradiction. Therefore one concludes that all eigenvalues of L̄ are positive and X = LL̄L^T is positive semidefinite. Similarly, one can show that if λ < 0, then L̄ is negative definite. □

Now we consider the case when A0 has purely imaginary eigenvalues ±iμ.

Theorem 6. If X = LLL^T where the columns of L form the two dimensional A_0^T-invariant subspace associated with a complex conjugate pair of purely imaginary eigenvalues ±iμ of A_0^T, then Ric(X) = 0 is not satisfied by any nonzero X of the given form.
Proof. Let X = LLL^T. We may assume that the columns v1, v2 of L are such that

A_0^T [v1  v2] = [v1  v2] [[0, μ],[−μ, 0]].

Let L = [[a, b],[b, c]] and D = [[0, μ],[−μ, 0]]. Therefore,

Ric(X) = L(−DL − LD^T + LML)L^T = L{ μ [[−2b, a − c],[a − c, 2b]] + [[a, b],[b, c]] M [[a, b],[b, c]] }L^T.

Note that the last term is positive semidefinite since M ≥ 0. Consider μ [[−2b, a − c],[a − c, 2b]], with μ > 0. The determinant of [[−2b, a − c],[a − c, 2b]] is −4b² − (a − c)² ≤ 0; therefore this matrix is indefinite. So we have the sum of a positive semidefinite matrix and an indefinite matrix: this sum cannot be zero. (Suppose A ≥ 0, C ≤ 0, B is indefinite and A + B = C. Then A − C = −B, but A and −C are positive semidefinite while −B is indefinite, which is a contradiction. Therefore C cannot be ≤ 0.) Therefore the only solution of Ric(X) = 0 of the given form is the zero solution. □

We have therefore characterized all solutions of Ric(X) = 0 that can arise from a two dimensional A_0^T-invariant subspace associated to a pair of complex conjugate eigenvalues.

2.2.2. Real eigenvalues of A0

We now consider a two dimensional subspace spanned by two independent eigenvectors/generalized eigenvectors of A_0^T, say v1 and v2, corresponding to real eigenvalues λ1 and λ2 respectively such that λ1 + λ2 ≠ 0. Let X = LLL^T where v1, v2 form the columns of L and L is a rank two 2 × 2 symmetric matrix.

Theorem 7. Let X = LLL^T (L is 2 × 2 with rank 2) where the columns of L span an A_0^T-invariant subspace. Further assume that the two columns of L are linearly independent (generalized) eigenvectors of A_0^T corresponding to a pair of controllable modes associated to nonzero real eigenvalues λ1, λ2 such that λ1 + λ2 ≠ 0. Then Ric(X) = 0 has a unique rank two solution of the given form.

Proof. Ric(X) = L(−DJ L − L DJ^T + L L^T B B^T L L)L^T, where DJ is a 2 × 2 upper triangular matrix in Jordan canonical form. It is enough to show that

−Y DJ − DJ^T Y + M = 0    (4)

has a rank two solution Y (here M = L^T B B^T L). Clearly Y = 0 is not a solution of Eq. (4). Suppose Y is non-invertible, i.e., the rank of Y is one.
Case (i): Suppose DJ is in trivial Jordan form. Note that Y(i, j) = M(i, j)/(λi + λj). Let v ∈ ker(Y). We have v^T(−Y DJ − DJ^T Y + M)v = 0, which implies v^T M v =
v^T(L^T B B^T L)v = 0, hence B^T L v = 0. Post-multiplying Eq. (4) by v, we get Y DJ v = 0 and hence v must be an eigenvector of DJ. Hence v = e1 or v = e2. This implies Y(1, 1) = 0 or Y(2, 2) = 0. Since M(1, 1) and M(2, 2) are both nonzero, this is a contradiction.
Case (ii): Now suppose DJ is in nontrivial Jordan form. Then v = e1 (since span{e1} is the only one dimensional DJ-invariant subspace). Again, this implies Y(1, 1) = 0, which is impossible since M(1, 1) ≠ 0. Therefore Y has rank two and X = LY^{-1}L^T is a rank two solution of Ric(X) = 0. □

Next we consider the case when A0 has eigenvalues λ and −λ.

Theorem 8. If X = LLL^T (L is 2 × 2 with rank 2) where the two columns of L are linearly independent eigenvectors of A_0^T corresponding to a pair of controllable modes associated to nonzero real eigenvalues λ1, λ2 such that λ1 + λ2 = 0, then Ric(X) = 0 has either (a) no rank two solution of the given form or (b) infinitely many rank two solutions of the given form.

Proof. Following the proof of the previous theorem, one concludes that Ric(X) = 0 iff −DL − LD^T + LML = 0. Assuming L is of rank two, pre- and post-multiplying by Y = L^{-1}, one gets −YD − D^T Y + M = 0, and therefore (λi + λj)yij = mij. Taking into account λ1 + λ2 = 0, one sees that m12 must necessarily be zero for a solution to exist. Thus, if m12 ≠ 0, then there is no solution Y and therefore no rank two solution of −DL − LD^T + LML = 0; that is, Ric(X) = 0 has no rank two solution of the given form. On the other hand, if m12 = 0, then one can find an infinite number of solutions Y of −YD − D^T Y + M = 0, given by yii = mii/(2λi) and y12 = y21 = α for any α ∈ R. For almost all values of α, this Y has rank two, and one obtains a corresponding rank two solution of Ric(X) = 0 of the prescribed form. □

We demonstrate the above result with an example.

Example 1. Let

A0 = [[−1, 0],[0, 1]],   B = [[1],[1]].

Clearly A0 = A_0^T = D has eigenvalues −1, 1, and M = B B^T = [[1, 1],[1, 1]] (since we can choose L as the identity matrix). One wants to find rank two solutions of this simplified ARE: −DL − LD^T + LML = 0. If such a solution exists, then its inverse Y satisfies the Lyapunov equation YD + D^T Y = M. Let Y = [yij]. Clearly, (−1 + 1)y12 = m12. Since m12 ≠ 0, the Lyapunov equation YD + D^T Y = M has no solution. Thus, there are no rank two solutions of Ric(X) = 0 in this case.
Now consider a system with the same A0 as above but B = [[1, 0],[0, 1]]. Therefore, M = B B^T is also the identity matrix. Since m12 = m21 = 0, Y = [[−1/2, α],[α, 1/2]] (for any α ∈ R) satisfies the Lyapunov equation YD + D^T Y = M (where D = A0). Thus one obtains several invertible Y s, and their inverses give infinitely many rank two solutions of Ric(X) = 0.

Now we consider the case when (A0, B) is uncontrollable. For the rank one situation, Theorem 2 states that an infinite number of rank one solutions exist if there is an eigenvector of A_0^T corresponding to an uncontrollable mode associated to the zero eigenvalue. Meanwhile, Theorem 3 states that there are no rank one solutions of Ric(X) = 0 corresponding to eigenvectors of nonzero eigenvalues of A_0^T associated to uncontrollable modes. For the rank two situation, if the concerned eigenvalues are a complex conjugate pair, then M = 0 if these eigenvalues are uncontrollable. As a result, a solution of Ric(X) = 0 reduces to a solution of −DL − LD^T = 0, which has no nonzero solution (as this is a linear equation of full rank). Finally, we consider the rank two situation where there are two real eigenvalues, one of which is controllable and the other not. If the real eigenvalues satisfy λ1 ≠ λ2 and λ1 + λ2 ≠ 0, then M turns out to be a 2 × 2 symmetric matrix with three entries equal to zero. Thus there is only a rank one solution, corresponding to the controllable eigenvalue (which was anyway predicted by Theorem 3). On the other hand, if λ1 = λ2, and one of the columns of L corresponds to a controllable mode whereas the other column does not, then again no rank two solution of Ric(X) = 0 of the prescribed form exists. On the other hand, an infinite number of rank one solutions of Ric(X) = 0 exist, because nearly all linear combinations of the two columns of L correspond to controllable modes. We demonstrate this with an example.

Example 2. Let

A0 = [[1, 0],[0, 1]],   B = [[1],[0]].

In this case, the eigenvector v = [0, 1]^T corresponds to an uncontrollable mode and does not give any solution of Ric(X) = 0. All rank one matrices of the form X = (2/||B^T v||²)vv^T where v = [α, 1]^T (α ≠ 0) give rank one solutions of Ric(X) = 0. Thus we have infinitely many rank one solutions but no rank two solutions.

Next we consider the case of a repeated eigenvalue in a Jordan block, where the whole block is not completely controllable.
Theorem 9. If λ is a nonzero repeated eigenvalue of A0 in a Jordan block such that the Jordan block is only partially controllable, then the A_0^T-invariant subspace associated to the Jordan block does not support any nontrivial solution of Ric(X) = 0.

Proof. Assume A0 is a lower Jordan block of size 2 with eigenvalue λ. Thus e1 is the eigenvector of A_0^T corresponding to λ, and since the Jordan block is not completely controllable, we can assume without loss of generality that B = e2, so that B^T e1 = 0. In this case, writing X = [[x1, x2],[x2, x3]], one obtains the set of three equations

2λx1 + 2x2 = x2²
2λx2 + x3 = x2 x3
2λx3 = x3².

From the last equation, it is clear that x3 = 0 or x3 = 2λ. Substituting x3 = 2λ into the second equation gives 2λ = 0, which is absurd since λ ≠ 0. Therefore x3 = 0; the second equation then gives x2 = 0 and the first gives x1 = 0, so one concludes that X is the zero matrix. For Jordan blocks of larger size, one obtains a similar set of equations where, starting from the last equation, one eliminates all other possibilities except the zero solution. □

Note from the proof above that, for the case when the repeated eigenvalue is zero, there are several nontrivial rank two solutions, given by X = [[α, 2],[2, 0]] with α ∈ R. This is what we expect from our earlier results, which claim that an uncontrollable zero eigenvalue can still contribute nontrivial solutions.

Finally, we come to a nontrivial case when (A0, B) is uncontrollable. This is when A0 has eigenvalues λ, −λ, with one of them having a controllable mode and the other an uncontrollable mode.

Theorem 10. If λ and −λ are controllable and uncontrollable eigenvalues of A0 respectively, then the two dimensional subspace spanned by the eigenvectors of A_0^T corresponding to λ and −λ gives infinitely many rank two solutions of the ARE of the prescribed form.

Proof. As before, we reduce to the simplified 2 × 2 ARE −DL − LD + LML = 0 using X = L2 L L2^T, where L2 = [vc  vun] (vc is the controllable eigenvector and vun the uncontrollable one) and D = [[λ, 0],[0, −λ]]. Here M = L2^T B B^T L2 = [[m11, 0],[0, 0]] (since vun^T B = 0). Assuming a rank two solution exists, one uses Y (the inverse of this rank two solution) to obtain YD + DY = M. Clearly, y22 = 0. Further, one can choose any nonzero y12 since m12 = 0 (note that this is precisely where the case of λ1 + λ2 = 0
differs from this case). Thus, one gets a Y which is invertible, and there are infinitely many rank two solutions of the ARE of the form X = L2 L L2^T. □

Example 3. Let

A0 = [[−1, 0],[0, 1]],   B = [[1],[0]].

Clearly, A0 = D in this case. The eigenvector corresponding to the eigenvalue −1 is controllable and that corresponding to 1 is uncontrollable. Note that M = I B B^T I = [[1, 0],[0, 0]]. Observe that any rank two matrix Y = [[−0.5, α],[α, 0]] (where α ≠ 0) satisfies the Lyapunov equation YD + D^T Y = M. Since there are infinitely many rank two solutions of this Lyapunov equation, one obtains infinitely many rank two solutions of the ARE.

Thus, we observe that even though eigenspaces corresponding to uncontrollable eigenvalues do not correspond to rank one solutions of Ric(X) = 0, if λ and −λ belong to Spec(A0) with one of them controllable and the other not, then their combined invariant subspace corresponds to infinitely many rank two solutions. Observe that if one uses a feedback to modify Example 3, then this example reduces to Example 2. Note that in Example 3, X = [[−2, 0],[0, 0]] is a solution of −A_0^T X − X A0 + X B B^T X = 0. Taking A1 = A0 − B B^T X, one obtains exactly the matrix given in Example 2. Thus one can establish a one-to-one correspondence between the infinitely many rank one solutions of Example 2 and the infinitely many rank two solutions of Example 3. This last case, λ1 + λ2 = 0 with one eigenvalue controllable and its negative uncontrollable, is in reality a disguised version of the situation of a repeated eigenvalue having both controllable and uncontrollable modes.

In this section, we have enumerated all situations that give rise to either rank one or rank two solutions of Ric(X) = 0. As it turns out, this is all that is required to completely understand all the solutions of Ric(X) = 0. These rank one and rank two solutions are building blocks from which all the other solutions can be built up. We therefore now look for higher rank solutions of Ric(X) = 0.

3. Solutions of Ric(X) = 0 of general rank

We can write any rank k solution of Ric(X) = 0 as X = LLL^T where L is n × k and L is a k × k symmetric matrix. One can show, using arguments similar to those in the proof of Theorem 4, that if the columns of L do not form an A_0^T-invariant subspace, then Ric(X) is indefinite. Therefore, to get rank k solutions of Ric(X) = 0, the columns of L must form an A_0^T-invariant subspace. When A_0^T is diagonalizable, without loss of generality
we can take the columns of L to be eigenvectors of A_0^T. When A0 has complex eigenvalues, one takes the real and imaginary parts of the complex eigenvectors of A_0^T as columns of L. For the more general case of repeated eigenvalues in a Jordan block, one takes generalized eigenvectors of A_0^T as the columns of L. Using X = LLL^T (where the middle L is a k × k matrix), Ric(X) reduces to the expression L(−DJ L − L DJ^T + LML)L^T, where M = L^T B B^T L and DJ is the Jordan form associated with the A_0^T-invariant subspace.

Theorem 11. If (A0, B) is controllable and A0 has a zero eigenvalue, then the A_0^T-invariant subspace associated to the zero eigenvalue does not correspond to any nonzero solution of Ric(X) = 0.

Proof. Let the columns of Lj span the eigenspace of A_0^T corresponding to the zero eigenvalue; hence A_0^T Lj = 0. For X = Lj L Lj^T, Ric(X) = Lj (L Mj L) Lj^T ≥ 0, where Mj = Lj^T B B^T Lj. Now since (A0, B) is controllable, Lj^T B ≠ 0. Therefore Mj ≠ 0 and Ric(X) = 0 has no nonzero solution of this form.

Now suppose the columns of Lj are generalized eigenvectors of A_0^T. For simplicity assume that there is only one Jordan block for the zero eigenvalue, so A_0^T Lj = Lj DJj where all diagonal entries of DJj are zero. For X = Lj L Lj^T, Ric(X) = Lj(−DJj L − L DJj^T + L Mj L)Lj^T. Let L = [l1  l2  · · ·  lj] (the symmetric middle factor). Therefore,

−DJj L − L DJj^T = −[l2  l3  · · ·  lj  0]^T − [l2  l3  · · ·  lj  0] = C.

Consider the determinant of the 2 × 2 principal submatrix obtained from the first and j-th rows and columns of this matrix C. This determinant is negative, which implies that −DJj L − L DJj^T is indefinite. But L Mj L is positive semidefinite. Hence −DJj L − L DJj^T + L Mj L ≠ 0 for any nonzero L, i.e., Ric(X) ≠ 0, and the ARE is not satisfied for any nonzero X. The general case, when there is more than one Jordan block, follows in a similar fashion. □

Corollary 11.1. If A0 has a zero eigenvalue which is uncontrollable, then X = Luc L Luc^T is a solution of Ric(X) = 0 for any symmetric L, where the columns of Luc are eigenvectors of A_0^T associated with the uncontrollable modes of the zero eigenvalue.

Proof. Note that A_0^T Luc = 0, Luc^T A0 = 0 and Luc^T B B^T Luc = 0, which proves the corollary. □

If A0 has purely imaginary eigenvalues with nontrivial Jordan structure, then, using arguments similar to those in the proof of Theorem 6, one can show that the
invariant subspace corresponding to these purely imaginary eigenvalues of A_0^T does not contribute nonzero solutions of Ric(X) = 0.

Using X = LLL^T (where the columns of L are eigenvectors/generalized eigenvectors of A_0^T corresponding to real eigenvalues, together with the real and imaginary parts of the complex eigenvectors/generalized eigenvectors associated with complex eigenvalues), the problem reduces to the simplified ARE −DJ L − L DJ^T + LML = 0. By assumption, all eigenvalues of DJ are nonzero and not purely imaginary. If DJ has repeated eigenvalues whose algebraic multiplicity equals the geometric multiplicity, then DJ has a block diagonal structure, with 1 × 1 blocks corresponding to real eigenvalues and 2 × 2 blocks corresponding to complex eigenvalues. When DJ has this block diagonal structure, we denote it by D.

Lemma 1. If the spectrum of A0 simultaneously contains an eigenvalue λ and its negative −λ, then Ric(X) = 0 has either (a) no full rank solution or (b) an infinite number of full rank solutions.

Proof. Assume that Ric(X) = 0 has a full rank solution X∗. Then, pre- and post-multiplying by Y∗ = (X∗)^{-1}, one gets −Y∗A_0^T − A0Y∗ + BB^T = 0. The linear (Lyapunov) operator that maps a symmetric matrix Z to −ZA_0^T − A0Z has zero in its spectrum [11, Theorem 4.4.6]. Let Z0 be a corresponding eigenvector, i.e. −Z0A_0^T − A0Z0 = 0. It is immediate to check that, for any real k, Y := Y∗ + kZ0 is a symmetric solution of −YA_0^T − A0Y + BB^T = 0. Moreover, since Y∗ is invertible, Y∗ + kZ0 is invertible for any sufficiently small k, and therefore we get infinitely many solutions [Y∗ + kZ0]^{-1} of Ric(X) = 0. On the other hand, the range of the Lyapunov operator does not contain all symmetric matrices, since zero is an eigenvalue. Therefore Ric(X) = 0 has no full rank solution if −BB^T is not in the range of the Lyapunov operator. □

Theorem 12. If (A0, B) is controllable and A0 has eigenvalues λi (1 ≤ i ≤ n) such that λi + λj ≠ 0 for all 1 ≤ i, j ≤ n, then Ric(X) = 0 has a unique full rank solution.

Proof. Clearly it is sufficient to show that the equation

−YA_0^T − A0Y + BB^T = 0    (5)

has a unique solution and that this solution is invertible. Existence and uniqueness are guaranteed by [11, Theorem 4.4.6]. As for non-singularity, let Y be the solution of (5) and let v ∈ ker(Y). Pre- and post-multiplying (5) by v^T and v, respectively, we see that v ∈ ker(B^T), so that ker(B^T) ⊇ ker(Y). Now, post-multiplying (5) by v ∈ ker(Y), we see that YA_0^T v = 0, so that ker(Y) is A_0^T-invariant. Since (A0, B) is controllable, (A_0^T, B^T) is observable and hence the only A_0^T-invariant subspace contained in ker(B^T) is the trivial space {0}. Then ker(Y) = {0}, i.e. Y is invertible. □
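As a numerical illustration of Theorem 12 and Eq. (5), the following Python/SciPy sketch (ours, not from the paper; A0 and B are illustrative choices satisfying the hypotheses) computes the full rank solution as the inverse of the Lyapunov solution and verifies the Riccati residual.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Illustrative data (not from the paper): eigenvalues of A0 are 1, 2, 4,
# so lambda_i + lambda_j != 0, and (A0, B) is controllable.
A0 = np.array([[1.0, 1.0, 0.0],
               [0.0, 2.0, 1.0],
               [0.0, 0.0, 4.0]])
B = np.array([[0.0], [0.0], [1.0]])

# Eq. (5): -Y A0^T - A0 Y + B B^T = 0, equivalently A0 Y + Y A0^T = B B^T.
Y = solve_continuous_lyapunov(A0, B @ B.T)

# Theorem 12: Y is invertible and X = Y^{-1} is the unique full rank solution
# of Ric(X) = -A0^T X - X A0 + X B B^T X = 0.
X = np.linalg.inv(Y)
residual = -A0.T @ X - X @ A0 + X @ B @ B.T @ X
print(np.allclose(residual, np.zeros_like(residual), atol=1e-8))  # True
```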
Note that the above theorem rules out any eigenvalue of A0 on the imaginary axis. A discrete-time counterpart of the above theorem has appeared in [24] and [9]. Putting together the results so far in this section, one can conclude that a unique full rank solution of Ric(X) = 0 exists whenever all the eigenvalues of A0 are nonzero and controllable and the sum of any two eigenvalues is never equal to zero. It has been demonstrated in Theorem 11 that a controllable zero eigenvalue of A0 results in rank deficient solutions of Ric(X) = 0. On the other hand, Corollary 11.1 shows that an uncontrollable zero eigenvalue of A0 does not hinder the existence of full rank solutions of Ric(X) = 0, but uniqueness is then lost. Theorem 12 gives the conditions for the existence of a unique full rank solution of Ric(X) = 0 when all the eigenvalues are controllable. Finally, when two nonzero eigenvalues of A0 add up to zero, a full rank solution may or may not exist (Lemma 1); if a full rank solution does exist, then there are infinitely many such full rank solutions.

One can isolate several special cases where all the conditions listed above are satisfied. For example, if all the eigenvalues of A0 are controllable and lie in the open right half complex plane (i.e., have strictly positive real parts), then Ric(X) = 0 has a unique full rank solution. In this case, one can in fact show that this full rank solution of Ric(X) = 0 is positive definite; for instance, in [22] it is shown that −A0Y − YA_0^T + BB^T = 0 has a unique positive definite solution. On similar lines, one can also conclude that if all eigenvalues of A0 are controllable and lie in the open left half plane, then Ric(X) = 0 has a full rank solution which is negative definite.

On the other hand, if (A0, B) is not controllable, then one can divide the eigenvalues into two sets: the controllable eigenvalues (denoted Spec(A0)c) and the uncontrollable eigenvalues (denoted Spec(A0)uc). If the controllable subspace is k-dimensional, then one can guarantee a rank k solution, provided λi + λj ≠ 0 for all λi, λj ∈ Spec(A0)c. If this condition is not satisfied, then following Lemma 1 either no rank k solution exists or infinitely many rank k solutions exist. As for the uncontrollable part, if all the eigenvalues are nonzero, then in general no nontrivial solution of Ric(X) = 0 comes from the A_0^T-invariant subspace corresponding to the uncontrollable eigenvalues. The only exception to this rule arises from the situation outlined by Theorem 10: if an eigenvalue λ ∈ Spec(A0)c and an eigenvalue −λ ∈ Spec(A0)uc, then one gets solutions whose rank is greater than k. Finally, if 0 ∈ Spec(A0)uc, then again one gets solutions whose rank is greater than k. In all these cases, where the uncontrollable eigenspaces contribute nontrivially to solutions of Ric(X) = 0, the uniqueness of the maximal rank solution is lost.

We end this section by characterizing when the set of solutions of Ric(X) = 0 is finite and when it is bounded.

Lemma 2. Let (A0, B) be controllable and let A0 have only nonzero eigenvalues. The number of solutions of Ric(X) = 0 is finite only if the cyclic index of A0 is equal to one. If λi + λj ≠ 0 for all λi, λj ∈ Spec(A0), then the above condition is also sufficient.
Proof. If the cyclic index of A0 is greater than one, then A_0^T necessarily has multiple eigenvectors corresponding to some eigenvalue λ. As (A0, B) is controllable, if λ is real, then each of these eigenvectors gives rise to a rank one solution of Ric(X) = 0. Similarly, if λ is complex, then one obtains multiple rank two solutions of Ric(X) = 0, corresponding to the various two dimensional A_0^T-invariant subspaces associated to λ. Therefore, the cyclic index of A0 being one is a necessary condition for the number of solutions of Ric(X) = 0 to be finite. Even when the cyclic index of A0 is one, situations can arise where both λ and −λ belong to Spec(A0); in such a situation, by Lemma 1, the number of solutions of Ric(X) = 0 may not be finite. Avoiding this situation guarantees that the number of solutions of Ric(X) = 0 is finite. □

Note that the above lemma considers only those A0 which have nonzero eigenvalues. As seen in Theorem 11, if A0 has controllable zero eigenvalues, then the corresponding eigenspace does not contribute nontrivially to solutions of Ric(X) = 0. This holds even when the dimension of ker A0 is greater than one, i.e., when the cyclic index of A0 is greater than one. If (A0, B) is not controllable, then a nonzero eigenvalue of A_0^T that has multiple eigenvectors can still produce a situation where the number of solutions of Ric(X) = 0 is finite; this happens precisely when the entire eigenspace associated to this eigenvalue lies outside the controllable subspace. Thus the cyclic index of A0 being one is not a necessary condition for the number of solutions to be finite in the uncontrollable case. Notice that an uncontrollable zero eigenvalue results in an infinite number of solutions of Ric(X) = 0. Similarly, if a nonzero eigenvalue of A0 has multiple eigenvectors, with at least one of the corresponding modes controllable, then there are an infinite number of solutions of Ric(X) = 0. Finally, if an eigenvalue λ and its negative −λ, not purely imaginary, belong to Spec(A0) with one of them controllable and the other not, then again there are an infinite number of solutions of Ric(X) = 0.

The following lemma characterizes when the solutions of Ric(X) = 0 lie within a bounded set.

Lemma 3. The set of solutions of Ric(X) = 0 is bounded if all the following conditions are satisfied:
• A0 has no uncontrollable zero eigenvalue;
• there exists no eigenvalue λ which belongs to both Spec(A0)c and Spec(A0)uc;
• there exists no eigenvalue λ ∈ Spec(A0)c with −λ ∈ Spec(A0)uc.

Proof. By Corollary 11.1, it is clear that if A0 has an uncontrollable zero eigenvalue, then the solution set of Ric(X) = 0 is unbounded. If λ ∈ Spec(A0)c and λ ∈ Spec(A0)uc, then clearly the geometric multiplicity of λ is greater than one. Following Example 2 and the discussion preceding it, it is clear that such a situation leads to the solution set of
Ric(X) = 0 being unbounded. Finally, by Theorem 10, it is clear that the third condition must be satisfied for the solution set of Ric(X) = 0 to be bounded. □

4. Information content in a full rank solution of Ric(X) = 0

In this section, we assume that (A0, B) is controllable and that a unique full rank solution of Ric(X) = 0 exists. From our earlier discussions, it is clear that this full rank solution of Ric(X) = 0 has the form X = LL∗L^T, where L is a real n × n matrix whose columns are either the (generalized) eigenvectors of A_0^T corresponding to real eigenvalues or the real and imaginary parts of complex (generalized) eigenvectors of A_0^T corresponding to complex eigenvalues. The matrix L∗ is the solution of the simplified ARE −DJ L − L DJ^T + LML = 0, where M = L^T B B^T L and DJ is a block diagonal Jordan matrix. Note that the eigenvalues of A0 are such that λi + λj ≠ 0 for all λi, λj ∈ Spec(A0).

In order to find a rank k solution of Ric(X) = 0 (where k < n), we take X = Lk L Lk^T, where the columns of Lk are either (generalized) eigenvectors of A_0^T corresponding to real eigenvalues or the real and imaginary parts of complex (generalized) eigenvectors of A_0^T corresponding to complex eigenvalues. In other words, Lk is an n × k submatrix of L. Therefore, A_0^T Lk = Lk Dk, where Dk is the corresponding k × k submatrix of DJ. Then the ARE −A_0^T X − X A0 + X B B^T X = 0 becomes

−A_0^T X − X A0 + X B B^T X = −A_0^T Lk L Lk^T − Lk L Lk^T A0 + Lk L Lk^T B B^T Lk L Lk^T
= −Lk Dk L Lk^T − Lk L Dk^T Lk^T + Lk L Mk L Lk^T
= Lk(−Dk L − L Dk^T + L Mk L)Lk^T,

where Mk = Lk^T B B^T Lk. Note that Mk is the appropriate k × k submatrix of the original matrix M. Further observe that the rank k solution is obtained by solving the simplified ARE −Dk L − L Dk^T + L Mk L = 0, which is a chopped up version of the original full rank simplified ARE −DJ L − L DJ^T + LML = 0. Assuming that L has rank k, we can pre- and post-multiply the chopped up version of the ARE by Y = L^{-1}, thereby obtaining the linear equation

−Y Dk − Dk^T Y + Mk = 0.    (6)

This Lyapunov equation has a unique solution Yk (note that our standing assumptions in this section, controllability and λi + λj ≠ 0 for λi, λj ∈ Spec(A0), are necessary for this conclusion). From Yk, one obtains a rank k solution (Yk)^{-1} of the chopped up ARE −Dk L − L Dk^T + L Mk L = 0. Using arguments similar to those used in the proof of Theorem 12, one can prove that Yk is invertible. But clearly, this rank k solution of Ric(X) = 0 is related to the full rank solution of Ric(X) = 0, as the governing equation of the former is a chopped up version of that of the latter. We now bring out this relationship between the various solutions of Ric(X) = 0.
Theorem 13. Let (A0, B) be controllable. If A0 has nonzero real and distinct eigenvalues λ1, . . ., λn such that λi + λj ≠ 0 (for 1 ≤ i, j ≤ n), then the 2^n − 2 lower rank nonzero solutions of the simplified ARE −DL − LD + LML = 0 are obtained from the Schur complements of all the 2^n − 2 strict principal submatrices of L∗, the unique full rank solution of the simplified ARE.

Proof. We consider Dk, the leading principal k × k submatrix of the diagonal matrix D. Let

L∗ = [[R11, R12],[R12^T, R22]]

be the corresponding splitting of L∗, and let Y∗ = (L∗)^{-1} = [[Y11∗, ∗],[∗, ∗]], where Y11∗ is the corresponding leading k × k principal submatrix of Y∗. Let

L = [[(Y11∗)^{-1}, 0],[0, 0]],

and let M11 be the corresponding k × k principal submatrix of M. Then −Y∗D − DY∗ + M = 0, and reading off the leading k × k block (D is diagonal),

−Y11∗ Dk − Dk Y11∗ + M11 = 0,

so, pre- and post-multiplying by (Y11∗)^{-1},

−Dk(Y11∗)^{-1} − (Y11∗)^{-1}Dk + (Y11∗)^{-1}M11(Y11∗)^{-1} = 0.    (7)

Now

LML = [[(Y11∗)^{-1}, 0],[0, 0]] [[M11, ∗],[∗, ∗]] [[(Y11∗)^{-1}, 0],[0, 0]] = [[(Y11∗)^{-1}M11(Y11∗)^{-1}, 0],[0, 0]],
−DL − LD = [[−Dk(Y11∗)^{-1} − (Y11∗)^{-1}Dk, 0],[0, 0]].    (8)

By Eqs. (7) and (8), −DL − LD + LML = 0. Therefore L = [[(Y11∗)^{-1}, 0],[0, 0]] gives a rank k solution of −DL − LD + LML = 0. Note that, using the Schur complement, (Y11∗)^{-1} = R11 − R12 R22^{-1} R12^T. Similarly, the Schur complements of all leading principal submatrices of L∗ give solutions of −DL − LD + LML = 0. For a principal submatrix which is not a leading principal submatrix, one arrives at a similar result by applying a change of basis with a permutation matrix. □

Note from the above theorem that if L of rank k satisfies the simplified ARE −DL − LD + LML = 0, then X = LLL^T (with the outer L the matrix of eigenvectors) gives a rank k solution of Ric(X) = 0. Thus, the above theorem states that the full rank solution of −DL − LD + LML = 0 has all the other solutions encoded within it. It is enough to find the full rank solution, and all other solutions can be read off from it.
Example 4. ⎡
1 ⎢2 ⎢ A=⎢ ⎣0 0
1 4 2 0
⎤ ⎡ 1 1 ⎢1 1⎥ ⎥ ⎢ ⎥,B = ⎢ ⎣1 0⎦ 4 1
1 5 4 5
⎤ ⎡ −1 4 ⎢1 1 ⎥ ⎥ ⎢ ⎥,Q = ⎢ ⎣1 1 ⎦ 2 1
1 5 1 1
1 1 2 1
⎤ 1 1⎥ ⎥ ⎥. 1⎦ 5
Fixing K0 (a solution of the ARE −A^T K − KA − Q + KBB^T K = 0), we get A0 = A − BB^T K0 with eigenvalues {10.4939, 3.8178, 2.6097, 1.5784}. Forming the matrix L from the eigenvectors of A_0^T, the Riccati equation is brought into the diagonal form −DL − LD + LML = 0 with

M = L^T B B^T L =
[ 10.7996    6.8110    5.8111    1.3180 ]
[  6.8110   26.5759   39.5777  −19.1530 ]
[  5.8111   39.5777   61.0133  −31.5026 ]
[  1.3180  −19.1530  −31.5026   18.0857 ].

Let Y∗ be the solution of the Lyapunov equation −Y∗D − DY∗ + M = 0. The full rank solution of −DL − LD + LML = 0 is given by L∗ = (Y∗)^{-1}:

L∗ =
[  4.0715   −5.3083    3.0651    0.6581 ]
[ −5.3083   28.9839  −22.2105  −11.1036 ]
[  3.0651  −22.2105   17.8956    9.6775 ]
[  0.6581  −11.1036    9.6775    5.9889 ].

For the (1, 1) principal submatrix of L∗, one obtains a rank one solution by taking the appropriate Schur complement:

L1 =
[ 1.9434  0  0  0 ]
[ 0       0  0  0 ]
[ 0       0  0  0 ]
[ 0       0  0  0 ].

Similarly, for the leading 2 × 2 principal submatrix, taking the appropriate Schur complement gives

L12 =
[  2.2247  −0.3042  0  0 ]
[ −0.3042   0.3289  0  0 ]
[  0        0       0  0 ]
[  0        0       0  0 ].

A rank one solution of Ric(X) = 0 is given by X1 = LL1L^T. Similarly, a rank two solution is given by X12 = LL12L^T, and the full rank solution is given by X∗ = LL∗L^T. Thus, all fourteen lower rank nonzero solutions can be obtained by taking appropriate Schur complements of the full rank solution L∗. Finally, all sixteen solutions of the ARE are obtained as K0 + X, where X is a solution of Ric(X) = 0.

Corollary 13.1. Let (A0, B) be controllable. Let A0 have nonzero distinct eigenvalues λ1, · · ·, λn such that λi + λj ≠ 0 for all 1 ≤ i, j ≤ n. Let λ ± iμ be a complex conjugate
pair of eigenvalues of A0. Then the Schur complement of the principal submatrix of L∗ associated with all the other eigenvalues of A0 determines a rank two solution of the simplified ARE −DL − LD^T + LML = 0.

Proof. Let v1 + iv2 be the complex eigenvector for the complex eigenvalue λ + iμ. Then v1, v2 span a two dimensional real subspace associated with the complex conjugate pair λ ± iμ. Without loss of generality, we assume that the matrix L has v1, v2 as its first two columns. Therefore, A_0^T L = LD, where the leading 2 × 2 principal submatrix of D is [[λ, μ],[−μ, λ]]. Let D2 denote this block. Let L∗ and Y∗ be as in the previous theorem. Let Y2 be the leading 2 × 2 submatrix of Y∗ and let L2 = [[Y2^{-1}, 0],[0, 0]]. Clearly L2 satisfies −DL − LD^T + LML = 0. Y2^{-1} can be obtained by taking the Schur complement with respect to the trailing (n − 2) × (n − 2) principal submatrix of L∗. □

Example 5. Let

A = [[1, 1, 0],[0, 1, 1],[0, 0, 1]],   B = [[0],[0],[1]],   Q = [[1, 0, 0],[0, 1, 0],[0, 0, 1]].

Fixing K0 (a solution of the ARE −A^T K − KA − Q + KBB^T K = 0), one gets A0 = A − BB^T K0 with eigenvalues {1.4142, 1.0987 + i0.4551, 1.0987 − i0.4551}. Let L be a matrix whose first column is an eigenvector of A_0^T corresponding to the eigenvalue 1.4142, and whose second and third columns are the real and imaginary parts, respectively, of the complex eigenvector of A_0^T corresponding to the eigenvalue 1.0987 + i0.4551. Then

M = L^T B B^T L = [[14.6489, 15.7125, 10.8940],[15.7125, 16.8533, 11.6850],[10.8940, 11.6850, 8.1016]].

Let Y∗ be the solution of the Lyapunov equation −YD − D^T Y + M = 0. The rank 3 solution is L∗ = (Y∗)^{-1}:

L∗ = [[87.3255, −60.2755, −24.9669],[−60.2755, 42.8452, 16.1228],[−24.9669, 16.1228, 8.3025]].

From the Schur complement of the lower 2 × 2 principal submatrix of L∗, one obtains

L1 = [[0.1931, 0, 0],[0, 0, 0],[0, 0, 0]],

and from the Schur complement of the (1, 1) principal submatrix,
one obtains

L23 = [[0, 0, 0],[0, 1.2407, −1.1103],[0, −1.1103, 1.1643]].

From these solutions, one constructs the other three solutions of the original Riccati equation.

Remark 1. Let (A0, B) be controllable. If A0 has n nonzero real distinct eigenvalues λ1, . . ., λn such that λi + λj ≠ 0 for all 1 ≤ i, j ≤ n, then Ric(X) = 0 has exactly 2^n solutions. If A0 has m nonzero real distinct eigenvalues and l nonzero distinct complex conjugate pairs of eigenvalues, such that λi + λj ≠ 0 for λi, λj ∈ Spec(A0), then Ric(X) = 0 has precisely 2^(m+l) real solutions.

Now assume that A0 has repeated eigenvalues whose algebraic multiplicity equals the geometric multiplicity. We continue to impose the conditions that (A0, B) is controllable and λi + λj ≠ 0 for λi, λj ∈ Spec(A0). Let the columns of L be real eigenvectors of A_0^T, or real/imaginary parts of complex eigenvectors of A_0^T associated with complex eigenvalues. Clearly, A_0^T L = LD. The choice of L is not unique in this case. Suppose that some other choice of the eigenvectors gives the matrix L_T, such that A_0^T L_T = L_T D. Then L_T = LT, where T is an invertible matrix. Note that A_0^T L_T = L_T D = LTD. On the other hand, A_0^T L_T = A_0^T LT = LDT. From these, one can conclude that TD = DT. Further, let M_T = T^T M T; then it is easily seen that the unique solution L∗ of the simplified ARE −DL − LD + LML = 0 and the unique solution L_T∗ of the simplified ARE −DL − LD + LM_T L = 0 are related by L∗ = T L_T∗ T^T. But both these solutions translate to the same full rank solution of Ric(X) = 0, and therefore the full rank solution of Ric(X) = 0 is unique. For each choice of L, one gets a unique rank n solution L∗, and the Schur complements of appropriate (n − k) × (n − k) principal submatrices (k < n) give rank k solutions of the simplified ARE −DL − LD^T + LML = 0. Note that for different choices of L, the matrix M changes and therefore one gets different L∗s. Thus, the Schur complements of the principal submatrices of the various L∗ need not give the same solutions, as the choices of the columns of L differ. As a result, the number of rank k solutions, for k < n, need not be finite. This is at variance with the cases discussed earlier, where the eigenvalues of A0 were assumed to be distinct.

Corollary 13.2. Let (A0, B) be controllable. If A0 has nonzero repeated eigenvalues with trivial Jordan structure such that λ and −λ do not co-exist in Spec(A0), then the Schur complements of appropriate principal submatrices of L∗ give lower rank nonzero solutions of −DL − LD^T + LML = 0. Here L∗ is the full rank solution of −DL − LD^T + LML = 0.

Proof. Follows from Theorem 13 and the discussion above. □

Next we consider the case of nontrivial Jordan blocks. In this case, the Schur complement of every principal submatrix of the maximal solution L∗ need not give a solution
of the equation −DJ L − L DJ^T + LML = 0. We have already seen that when DJ has a real eigenvalue λ of multiplicity one as its (i, i)-th diagonal entry, there are two choices as far as taking Schur complements is concerned: either include or exclude the i-th row and column in the principal submatrix whose Schur complement is to be taken. Similarly, if DJ has a 2 × 2 block corresponding to a complex conjugate pair of eigenvalues appearing in its i-th and (i + 1)-th rows and columns, then again there are only two choices: either include or exclude both the i-th and (i + 1)-th rows and columns in the principal submatrix whose Schur complement is to be taken. If DJ has a Jordan block of size k corresponding to a real eigenvalue λ occupying the i-th to (i + k − 1)-th rows and columns, then there are precisely (k + 1) choices. The rule to be followed is that if the (i + r)-th row and column is to be chosen in the principal submatrix whose Schur complement is to be taken, then one should also take the (i + r + 1)-th, · · ·, (i + k − 1)-th rows and columns in this principal submatrix. Notice that r can take values from 0 to k − 1, which gives k choices; the last choice is that none of the rows and columns corresponding to the nontrivial Jordan block are included in the concerned principal submatrix.
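A small Python helper (ours, not from the paper; the block-description format is our own) can make this counting rule explicit by enumerating the admissible index sets that may be kept, i.e. the complements of the blocks whose Schur complements are taken.

```python
from itertools import product

def admissible_index_sets(blocks):
    """Enumerate the index sets that may be kept when forming Schur complements
    of the maximal solution L*, following the selection rule described above.

    `blocks` is a list of (kind, size) pairs in the order they appear in D_J:
      ("real", 1)    - simple real eigenvalue: keep it or not (2 choices);
      ("complex", 2) - complex conjugate pair: keep both or neither (2 choices);
      ("jordan", k)  - real Jordan block of size k: the kept indices must form a
                       leading segment of the block (equivalently, the discarded
                       principal block a trailing one): k + 1 choices.
    """
    per_block = []
    start = 0
    for kind, size in blocks:
        idx = list(range(start, start + size))
        if kind in ("real", "complex"):
            per_block.append([[], idx])
        elif kind == "jordan":
            per_block.append([idx[:r] for r in range(size + 1)])
        start += size
    return [sum(choice, []) for choice in product(*per_block)]

# Example 6 below has a Jordan block of size 2 (eigenvalue 2) and a simple real
# eigenvalue sqrt(2): (2 + 1) * (1 + 1) = 6 admissible choices, hence six solutions.
sets = admissible_index_sets([("jordan", 2), ("real", 1)])
print(len(sets), sets)
```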
Example 6. Consider the ARE with

A = [[1, 1, 0],[1, 1, 0],[0, 0, 1]],   B = [[2, 0],[0, 0],[0, 1]],   Q = [[1, 1, 0],[1, 5, 0],[0, 0, 1]].

Choosing the solution

K0 = [[−0.5, 0.5, 0],[0.5, −2.5, 0],[0, 0, −0.4142]]

of the ARE, we can form the matrix

A_0^T = [[3, 1, 0],[−1, 1, 0],[0, 0, √2]].

Let X = LLL^T, where

L = [[1, 1, 0],[−1, 0, 0],[0, 0, 1]]

contains generalized eigenvectors of A_0^T. Ric(X) = 0 is reduced to L(−DJ L − L DJ^T + LML)L^T = 0, where

DJ = [[2, 1, 0],[0, 2, 0],[0, 0, √2]].

Observe that DJ has a Jordan block of size 2 corresponding to the eigenvalue 2 and a block of size 1 corresponding to the eigenvalue √2. Thus there are (2 + 1)(1 + 1) = 6 solutions of Ric(X) = 0. Observe that A_0^T has two eigenvectors, which generate two rank one solutions of Ric(X) = 0. There are two subspaces which are two dimensional and A_0^T-invariant that generate rank two solutions. The full space generates a rank three solution and the zero subspace generates the zero solution. So there are six solutions of Ric(X) = 0.
We obtain the full rank solution from

L∗ = [[10, −12, 0],[−12, 16, 0],[0, 0, 2.8284]].

Taking the Schur complement of the principal submatrix involving row and column indices 2 and 3 gives a rank one solution. Similarly, taking the Schur complement of the principal submatrix involving row and column indices 1 and 2 gives another rank one solution. Similarly, the Schur complements of the 1 × 1 submatrices L∗(2, 2) and L∗(3, 3) give the two rank two solutions. We give below the solution obtained by taking the Schur complement of the principal submatrix involving row and column indices 2 and 3, which turns out to be a rank one solution:

X1 = LL1L^T = [[1, −1, 0],[−1, 1, 0],[0, 0, 0]].

The rank three solution involves all of L∗, whereas the last solution is X = 0.

Remark 2. Note that we began this section under the assumption that a unique full rank solution exists for the case under consideration. There are, however, cases where λ and −λ simultaneously belong to Spec(A0) and an infinite number of full rank solutions exist. In these cases too, taking Schur complements of any full rank solution L∗ of the equation −DJ L − L DJ^T + LML = 0, one arrives at lower rank solutions of the equation. The proof of Theorem 13 and the associated corollary go through without any change.

In the previous section, we enumerated various special cases where there is no possibility of a full rank solution of Ric(X) = 0. For these special cases, where a full rank solution of the simplified ARE −DJ L − L DJ^T + LML = 0 does not exist, one can still obtain useful information from those solutions (of the simplified ARE) which have the largest rank. Let A0 have l eigenvalues which are either zero or purely imaginary. Then one takes the matrix L whose columns are eigenvectors (real and imaginary parts of the eigenvectors, as the case may be) corresponding to the remaining (n − l) eigenvalues. Let X = LLL^T with L = [w1 . . . wn−l] (the outer L is an n × (n − l) matrix, the middle L an (n − l) × (n − l) matrix). This reduces to one of the cases considered before, with the simplified ARE −DJ L − L DJ^T + LML = 0 for (n − l) × (n − l) matrices. After solving this reduced ARE, we can get solutions of the original ARE using X = LLL^T. Note that if A0 has only zero and purely imaginary eigenvalues, then Ric(X) = 0 has a unique solution, namely X = 0 (under the assumption of controllability). Further, note that the appropriate Schur complements of the rank (n − l) solution of the reduced ARE once again give the lower rank solutions of the reduced ARE. Thus, in general, the method of Schur complements can be applied to any maximal rank solution of the simplified ARE −DJ L − L DJ^T + LML = 0 to obtain lower rank solutions of this simplified ARE. We now demonstrate a case of the simplified ARE that does not have a full rank solution.
Example 7.

$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & -1 & 2 \end{bmatrix}, \qquad B = \begin{bmatrix} 2 & 1 \\ 5 & 0 \\ 4 & 2 \\ 1 & 1 \end{bmatrix}, \qquad Q = 0.$$

As Q = 0, X = 0 is a solution of the ARE $-A^TX - XA - Q + XBB^TX = 0$, and therefore we can take A0 = A. A0 has two purely imaginary eigenvalues and two complex eigenvalues whose real part is positive. Since the invariant subspace corresponding to the purely imaginary eigenvalues can be ignored, let L be a matrix whose columns span the invariant subspace corresponding to the complex conjugate eigenvalues in the right half plane. Writing $X = L\mathcal{L}L^T$, where $\mathcal{L}$ is a 2 × 2 matrix, we get the Riccati equation $-D\mathcal{L} - \mathcal{L}D^T + \mathcal{L}M\mathcal{L} = 0$ for 2 × 2 matrices. Solving this, we get

$$\mathcal{L} = \begin{bmatrix} 1.8102 & 2.4526 \\ 2.4526 & 4.6131 \end{bmatrix}, \qquad X = L\mathcal{L}L^T = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0.9051 & -1.2263 \\ 0 & 0 & -1.2263 & 2.3066 \end{bmatrix}.$$
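For the record, Example 7 can be reproduced with a short computation. The following sketch (Python; the reduction of the 2 × 2 equation to a Lyapunov equation for $\mathcal{L}^{-1}$ is our own way of solving it, not a method prescribed in the paper) extracts a real basis of the $A^T$-invariant subspace for the eigenvalue pair 2 ± i, solves the reduced equation, and recovers X; the result does not depend on the particular basis chosen for the invariant subspace.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Data of Example 7 (Q = 0, so we may take A0 = A).
A = np.array([[ 0., 1., 0., 0.],
              [-1., 0., 0., 0.],
              [ 0., 0., 2., 1.],
              [ 0., 0., -1., 2.]])
B = np.array([[2., 1.],
              [5., 0.],
              [4., 2.],
              [1., 1.]])

# Real basis L of the A^T-invariant subspace for the right half plane pair 2 +/- i.
w, V = np.linalg.eig(A.T)
v = V[:, np.argmax(w.real)]                  # eigenvector for an eigenvalue with largest real part
L = np.column_stack([v.real, v.imag])

# Reduced data: A^T L = L D and M = L^T B B^T L.
D = np.linalg.lstsq(L, A.T @ L, rcond=None)[0]
M = L.T @ B @ B.T @ L

# Invertible solution of -D*Lam - Lam*D^T + Lam*M*Lam = 0 via Y = Lam^{-1},
# which satisfies the Lyapunov equation D^T Y + Y D = M.
Lam = np.linalg.inv(solve_continuous_lyapunov(D.T, M))

X = L @ Lam @ L.T
print(np.round(X, 4))                                      # should match the X displayed above
print(np.max(np.abs(-A.T @ X - X @ A + X @ B @ B.T @ X)))  # residual ~ 0
```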
Throughout this section, we have assumed that (A0, B) is controllable. If we relax this condition, then one needs to consider the controllable subspace. If k is the dimension of the controllable subspace, then the maximum rank of a solution of Ric(X) = 0 that one can generically expect is k, provided there are no uncontrollable modes corresponding to the zero eigenvalue, and the situation where an eigenvalue is controllable while its negative is another, uncontrollable eigenvalue does not arise. This is clear from the results in the earlier sections. Eigenvectors of $A_0^T$ that correspond to uncontrollable modes eliminate the quadratic term in Ric(X) = 0, thereby leaving only the linear terms. In case the zero eigenvalue is uncontrollable, these linear terms also get eliminated, leaving every choice as a solution. Theorem 10 details the other situation, when the rank could be greater than k. In this section, a complete characterization of all solutions of Ric(X) = 0 was given in terms of the unique maximal rank solution of Ric(X) = 0. In case of multiple maximal rank solutions too, this characterization holds, but now the Schur complements of each one of the maximal rank solutions yield solutions of Ric(X) = 0.

5. Poset structure on solutions of Ric(X) = 0

In the last section we saw that the maximal rank solution of Ric(X) = 0 contains within it information about other lower rank solutions of Ric(X) = 0 (with appropriate assumptions). In this section, we build a partially ordered set (poset) from the solution set of Ric(X) = 0 and relate some of the earlier results to this poset structure. A lattice is a special case of a poset, in which every two elements have a join and a meet.
We shall state conditions under which the poset of solutions of Ric(X) = 0 is actually a lattice. Lattice structure on solutions of ARE has appeared before in [5], though the approach we take is a bit different from the one in [5]. We begin by specifying a partial order on the set of real solutions of Ric(X) = 0. Clearly, there is a one-to-one correspondence between real solutions of Ric(X) = 0 and the set of real solutions of the ARE $-A^TK - KA - Q + KBB^TK = 0$. Recall that $\mathrm{Ric}(X) = -A_0^TX - XA_0 + XBB^TX = 0$ was obtained from the ARE $-A^TK - KA - Q + KBB^TK = 0$ by fixing some arbitrary solution K0 of the ARE and setting $A_0 = A - BB^TK_0$. Therefore, a partial order on the set of real solutions of Ric(X) = 0 naturally induces a partial order on the set of real solutions of the ARE. We define a partial order ≺ in the following fashion: (a) if X1 and X2 are two solutions of Ric(X) = 0, then X1 ≺ X2 if the column span of X1 is strictly contained within the column span of X2; (b) in case the column spans of X1 and X2 are the same, then X1 ≺ X2 if X1 < X2 as matrices.

By the results in the earlier sections, we know that the column span of any solution of Ric(X) = 0 is an $A_0^T$-invariant subspace. Given any matrix A0, the set of all real $A_0^T$-invariant subspaces has a natural lattice structure induced by the relation of subspace inclusion. One would expect this lattice of $A_0^T$-invariant subspaces to induce a lattice structure on the set of real solutions of Ric(X) = 0. Unfortunately, this need not happen, for several reasons that are closely related to the observations in Lemma 1. Given any solution X1 of Ric(X) = 0, one can identify X1 with the $A_0^T$-invariant subspace given by the column span of X1. There are $A_0^T$-invariant subspaces that cannot be identified with any solution of Ric(X) = 0. More crucially, there are $A_0^T$-invariant subspaces which get identified with multiple solutions of Ric(X) = 0.

In the poset induced by ≺ on the set of real solutions of Ric(X) = 0, one can associate levels with the rank of the solutions. Thus, the zeroth level consists of just one solution, X = 0, the first level consists of all rank one real solutions of Ric(X) = 0, the second level consists of all rank two real solutions, and so on. The top (or n-th) level contains the rank n solution (if it exists). When the highest non-empty level contains a unique member, this poset of real solutions of Ric(X) = 0 becomes a lattice. One can now translate the results from the earlier sections to obtain several special cases. We list some of these below.

• Assume 0 ∉ Spec(A0). Further assume that (A0, B) is controllable and that $\lambda_i + \lambda_j \neq 0$ for all eigenvalues $\lambda_i, \lambda_j$ of A0. Then a unique full rank real solution of Ric(X) = 0 exists. Thus, the poset of real solutions of Ric(X) = 0 with ordering ≺ is a lattice which is isomorphic to the lattice of real $A_0^T$-invariant subspaces ordered by subspace inclusion.

• Let S ⊂ Spec(A0) be closed under complex conjugation and such that $\lambda_i, \lambda_j \in S$ implies $\lambda_i + \lambda_j \neq 0$. Let S∗ be a maximal set amongst all S ⊂ Spec(A0) with the property stated above. Let k = |S∗|. Assume 0 ∉ Spec(A0). Further assume that
(A0, B) is controllable. Then the highest non-empty level in the poset of real solutions of Ric(X) = 0 induced by ≺ is greater than or equal to the k-th one.

• If 0 ∈ Spec(A0), (A0, B) is controllable and the nonzero eigenvalues of A0 are distinct with $\lambda_i + \lambda_j \neq 0$ for all nonzero eigenvalues $\lambda_i, \lambda_j$ of A0, then the poset of real solutions of Ric(X) = 0 is a lattice which is isomorphic to a sublattice of the lattice of real $A_0^T$-invariant subspaces ordered by subspace inclusion.

• In the previous item, if 0 ∈ Spec(A0) is not controllable, with all other conditions remaining the same, then the poset of real solutions of Ric(X) = 0 is no longer a lattice. This is because multiple solutions are associated to the $A_0^T$-invariant subspaces that contain the eigenspace associated to the uncontrollable zero eigenvalue.

Recall that one fixed an arbitrary solution K0 of the original ARE $-A^TK - KA - Q + KBB^TK = 0$ to obtain the equation Ric(X) = 0. Instead of K0, if some other solution K1 of the original ARE were used, then one obtains the equation $\mathrm{Ric}_1(X) = -A_1^TX - XA_1 + XBB^TX = 0$ where $A_1 = A - BB^TK_1$. Observe that the real solutions of Ric1(X) = 0 can now be given a poset structure along the lines employed for defining the poset structure on the real solutions of Ric(X) = 0. Clearly, the real solutions of Ric1(X) = 0 are related to $A_1^T$-invariant subspaces, just as the real solutions of Ric(X) = 0 were identified with $A_0^T$-invariant subspaces. Therefore, one way to obtain more information about these posets of real solutions of Ric(X) = 0 and Ric1(X) = 0 would be to compare the $A_0^T$-invariant subspaces and the $A_1^T$-invariant subspaces. We therefore examine relations between $A_0^T$-invariant subspaces and $A_1^T$-invariant subspaces. Observe that as both K0 and K1 are solutions of the original ARE, $X_* = K_1 - K_0$ is a solution of Ric(X) = 0. Conversely, $-X_*$ is a solution of Ric1(X) = 0. For now, we proceed under the assumption that the algebraic multiplicity equals one for all eigenvalues of A0. This implies that A0 is diagonalizable as a complex matrix.

Lemma 4. Let X∗ be a real rank one solution of Ric(X) = 0. Then A0 and A1 have a common left eigenvector $v_i^T$ such that $A_0^Tv_i = \lambda_iv_i$ and $A_1^Tv_i = -\lambda_iv_i$. Further, all other eigenvalues (and the corresponding right eigenspaces) of A0 are eigenvalues (and the corresponding right eigenspaces, respectively) of A1.

Proof. Recall from Theorem 3 that if X∗ has rank one, then $X_* = (2\lambda_i/\|B^Tv_i\|^2)\,v_iv_i^T$, where $v_i^T$ is a left eigenvector of A0. Then

$$A_1 = A - BB^TK_1 = A - BB^T\big(K_0 + (2\lambda_i/\|B^Tv_i\|^2)v_iv_i^T\big) = A_0 - (2\lambda_i/\|B^Tv_i\|^2)BB^Tv_iv_i^T,$$
$$\therefore\; v_i^TA_1 = v_i^T\big(A_0 - (2\lambda_i/\|B^Tv_i\|^2)BB^Tv_iv_i^T\big) = \lambda_iv_i^T - (2\lambda_i/\|B^Tv_i\|^2)\,\|B^Tv_i\|^2\,v_i^T = -\lambda_iv_i^T.$$
Therefore, A0 and A1 have a common left eigenvector $v_i^T$, such that $A_0^Tv_i = \lambda_iv_i$ and $A_1^Tv_i = -\lambda_iv_i$. Let the columns of U (denoted by $u_1, \ldots, u_n$) be such that $A_0U = UD$, where D is a block diagonal matrix. Thus, the columns of U are the eigenvectors of A0 corresponding to real eigenvalues, or the real and imaginary parts of eigenvectors of A0 corresponding to complex eigenvalues. Then $U^{-1}A_0 = DU^{-1}$, and therefore the rows of $U^{-1}$ are left eigenvectors of A0 corresponding to real eigenvalues and the real and imaginary parts of left eigenvectors of A0 corresponding to complex eigenvalues. Since $v_i^T$ is a left eigenvector of A0, $v_i^Tu_j = 0$ if $j \neq i$. Therefore, for $j \neq i$,

$$A_1u_j = \big(A_0 - (2\lambda_i/\|B^Tv_i\|^2)BB^Tv_iv_i^T\big)u_j = A_0u_j.$$

Therefore, A0 and A1 have (n − 1) common right eigenvectors, and the corresponding A0-invariant subspace is the same as the A1-invariant subspace, given by all vectors $u_j$ such that $v_i^Tu_j = 0$. □

Note that the above result also holds if we relax the assumption that the algebraic multiplicity of all eigenvalues of A0 is one to the assumption that the algebraic and geometric multiplicities are equal for all eigenvalues of A0. In the proof above, observe that one can choose appropriate right eigenvectors of A0 corresponding to λi, such that $v_i^T$ is a row of $U^{-1}$. With this addition, the above proof goes through for this more general case.

Theorem 14. Let X∗ (a solution of Ric(X) = 0) be a rank k matrix, such that the column span of X∗ is an $A_0^T$-invariant subspace corresponding to k real eigenvalues of A0, say $\lambda_1, \cdots, \lambda_k$. Then A0 and A1 have (n − k) common eigenvalues and a common right eigenspace corresponding to these eigenvalues. Further, the column span of X∗ is an $A_1^T$-invariant subspace corresponding to eigenvalues $-\lambda_1, \cdots, -\lambda_k$ of A1.

Proof. Let $X_* = L_k\mathcal{L}L_k^T$ where the columns of $L_k$ are eigenvectors of $A_0^T$ corresponding to the eigenvalues $\lambda_1, \ldots, \lambda_k$. Let the columns of U, given by $u_1, \ldots, u_n$, be right (generalized) eigenvectors of A0, such that $u_1, \cdots, u_k$ are eigenvectors corresponding to $\lambda_1, \cdots, \lambda_k$. Then

$$A_1 = A - BB^T(K_0 + X_*) = A_0 - BB^TL_k\mathcal{L}L_k^T.$$

Therefore, for all j > k, we have $A_1u_j = (A_0 - BB^TL_k\mathcal{L}L_k^T)u_j = A_0u_j$ since $L_k^Tu_j = 0$ for all j > k. Thus A0 and A1 have (n − k) common eigenvalues and the corresponding right eigenspaces are the same.
As X∗ is a solution of Ric(X) = 0, we have

$$0 = -A_0^TX_* - X_*A_0 + X_*BB^TX_* = -A_0^TL_k\mathcal{L}L_k^T - L_k\mathcal{L}L_k^TA_0 + L_k\mathcal{L}L_k^TBB^TL_k\mathcal{L}L_k^T = L_k\left(-D_k\mathcal{L} - \mathcal{L}D_k + \mathcal{L}M_k\mathcal{L}\right)L_k^T$$

where $D_k = \mathrm{diag}(\lambda_1, \cdots, \lambda_k)$ and $M_k = L_k^TBB^TL_k$. From the above, one concludes that $D_k - M_k\mathcal{L} = -\mathcal{L}^{-1}D_k\mathcal{L}$, a matrix whose eigenvalues are $-\lambda_1, \cdots, -\lambda_k$. Observe that the column span of X∗ is identical to the column span of $L_k$, and therefore

$$L_k^TA_1 = L_k^T(A_0 - BB^TX_*) = L_k^T(A_0 - BB^TL_k\mathcal{L}L_k^T) = D_kL_k^T - L_k^TBB^TL_k\mathcal{L}L_k^T = (D_k - M_k\mathcal{L})L_k^T.$$

Thus, the column span of $L_k$ (and therefore of X∗) is an $A_1^T$-invariant subspace corresponding to the eigenvalues $-\lambda_1, \cdots, -\lambda_k$. □

Note that in the theorem above, as X∗ is a solution of Ric(X) = 0, the set of corresponding eigenvalues $\lambda_1, \cdots, \lambda_k$ is likely to have a constraint like $\lambda_i + \lambda_j \neq 0$ for $1 \le i \le j \le k$. Importantly, the above theorem also holds for cases where this constraint is absent, by virtue of the first sentence in the above theorem. An analogous result exists for more general X∗ whose column span involves $A_0^T$-invariant subspaces corresponding to complex eigenvalues. To show this, let $v_1, v_2$ be the real and imaginary parts of the complex eigenvector associated to an eigenvalue $\lambda + i\mu$ of $A_0^T$ and let X∗ be the rank two solution associated with the two dimensional $A_0^T$-invariant subspace spanned by $v_1, v_2$. Observe that $\lambda \neq 0$. Then $X_* = \begin{bmatrix} v_1 & v_2 \end{bmatrix}\mathcal{L}\begin{bmatrix} v_1^T \\ v_2^T \end{bmatrix}$.

Lemma 5. Let X∗ be a real rank two solution of Ric(X) = 0 such that its column span is an $A_0^T$-invariant subspace corresponding to a complex conjugate pair of eigenvalues $\lambda \pm i\mu$. Then $A_1 = A_0 - BB^TX_*$ has eigenvalues $-\lambda \pm i\mu$ and the $A_1^T$-invariant subspace corresponding to these eigenvalues is identical to the column span of X∗. All other eigenvalues of A1 are precisely the other eigenvalues of A0.
Proof. Let $A_0^T\begin{bmatrix} v_1 & v_2 \end{bmatrix} = \begin{bmatrix} v_1 & v_2 \end{bmatrix}D_2$ where $D_2 = \begin{bmatrix} \lambda & \mu \\ -\mu & \lambda \end{bmatrix}$. Let $L_2 = \begin{bmatrix} v_1 & v_2 \end{bmatrix}$ and $M_2 = L_2^TBB^TL_2$. Let $X_* = L_2\mathcal{L}L_2^T$ be the rank two solution associated with the two dimensional $A_0^T$-invariant subspace associated with $v_1, v_2$. Therefore, as seen in the proof of Theorem 5, $D_2\mathcal{L} + \mathcal{L}D_2^T = \mathcal{L}M_2\mathcal{L}$. Clearly,

$$D_2^T - M_2\mathcal{L} = -\mathcal{L}^{-1}D_2\mathcal{L} \qquad (9)$$

$$A_1 = A - BB^T(K_0 + X_*) = A_0 - BB^TL_2\mathcal{L}L_2^T$$
$$L_2^TA_1 = L_2^T(A_0 - BB^TL_2\mathcal{L}L_2^T) = D_2^TL_2^T - L_2^TBB^TL_2\mathcal{L}L_2^T = (D_2^T - M_2\mathcal{L})L_2^T. \qquad (10)$$
From Eq. (9), it is clear that $(D_2^T - M_2\mathcal{L})$ has eigenvalues $-\lambda \pm i\mu$. Further, the column span of $L_2$ is the same as the column span of X∗ and corresponds to the $A_1^T$-invariant subspace associated to the complex conjugate pair of eigenvalues $-\lambda \pm i\mu$. The last statement, about all the other eigenvalues of A1, follows from arguments similar to those used in Theorem 14. □

One can use the above lemma to generalize Theorem 14. Thus, in Theorem 14, X∗ could be any rank k solution whose column span corresponds to some k-dimensional real $A_0^T$-invariant subspace. Then A0 and A1 have (n − k) common eigenvalues. Further, the other k eigenvalues of A1 are given by $-\lambda_i$ if the remaining eigenvalues of A0 were given by $\lambda_i$.

Remark 3. If A0 has nonzero distinct eigenvalues such that $\lambda_i + \lambda_j \neq 0$ for any $\lambda_i, \lambda_j \in \mathrm{Spec}(A_0)$, then for all A1 associated with other real solutions of Ric(X) = 0, a similar property holds. In other words, every A1 has nonzero distinct eigenvalues such that the sum of any two eigenvalues is nonzero. In this case it follows from Remark 1 that the poset of real solutions of Ric(X) = 0 has $2^{m+l}$ elements (where m is the number of real eigenvalues and l is the number of complex conjugate pairs of eigenvalues of A0). In fact, this poset of solutions of Ric(X) = 0 is a lattice. The lattice obtained in this case looks identical no matter what initial solution K0 we fix, i.e. it is symmetric about every point in the lattice. We can make additional observations about such a lattice based on simple counting arguments. Suppose A0 has m real eigenvalues and l pairs of complex conjugate eigenvalues, i.e. n = m + 2l. Further, the eigenvalues are distinct and the sum of any two eigenvalues is always nonzero. If k < m is even, then the number of rank k real solutions of Ric(X) = 0 is given by $\binom{m}{k} + \binom{l}{1}\binom{m}{k-2} + \cdots + \binom{l}{(k/2)-1}\binom{m}{2} + \binom{l}{k/2}$. Similarly, if k < m is odd, then the number of rank k real solutions of Ric(X) = 0 is given by $\binom{m}{k} + \binom{l}{1}\binom{m}{k-2} + \cdots + \binom{l}{(k-1)/2}\binom{m}{1}$.

Next, we consider the case when A0 has repeated eigenvalues such that, for all eigenvalues, the algebraic multiplicity is equal to the geometric multiplicity. Observe that Theorem 14 holds for this more general case. If X∗ is a rank k real solution of Ric(X) = 0, then the column span of X∗ is an $A_0^T$-invariant subspace. If the eigenvalues corresponding to this $A_0^T$-invariant subspace contain some instances of repeated eigenvalues, then the proof of Theorem 14 goes through with a slight modification. The choice of the right eigenvectors forming the matrix U in the proof of Theorem 14 must be made in such
a way that the first k rows of $U^{-1}$ span the same space as the column span of X∗. Modifications of Theorem 14 involving complex eigenvalues (where the algebraic and geometric multiplicities are equal) are also valid.

Finally, we consider the case of repeated eigenvalues of A0, where the algebraic multiplicity is not equal to the geometric multiplicity. For the sake of simplicity, we consider A0 having only one eigenvalue λ.

Lemma 6. Let A0 have only one real nonzero eigenvalue λ, with geometric multiplicity one and algebraic multiplicity n. Let X∗ be a rank k solution of Ric(X) = 0. Then $A_1 = A_0 - BB^TX_*$ has eigenvalues λ and −λ, both of geometric multiplicity one, with algebraic multiplicities (n − k) and k respectively.

Proof. Let $X_* = L_k\mathcal{L}L_k^T$, where the columns of $L_k$ are generalized eigenvectors of $A_0^T$ spanning a k-dimensional $A_0^T$-invariant subspace ($\mathcal{L}$ is a k × k matrix of rank k and $D_{J_k}$ is the k × k submatrix of the Jordan matrix $D_J$ corresponding to $L_k$). Therefore, $\mathcal{L}$ satisfies $-D_{J_k}\mathcal{L} - \mathcal{L}D_{J_k}^T + \mathcal{L}M_k\mathcal{L} = 0$ (where $M_k = L_k^TBB^TL_k$). Hence,

$$D_{J_k}^T - M_k\mathcal{L} = -\mathcal{L}^{-1}D_{J_k}\mathcal{L} \qquad (11)$$

$$A_1 = A - BB^T(K_0 + X_*) = A_0 - BB^TL_k\mathcal{L}L_k^T$$
$$L_k^TA_1 = L_k^T(A_0 - BB^TL_k\mathcal{L}L_k^T) = D_{J_k}^TL_k^T - L_k^TBB^TL_k\mathcal{L}L_k^T = (D_{J_k}^T - M_k\mathcal{L})L_k^T. \qquad (12)$$

Therefore, from Eq. (11), $(D_{J_k}^T - M_k\mathcal{L})$ has Jordan structure $-D_{J_k}$. This implies (from Eq. (12)) that A1 has eigenvalue −λ with multiplicity k. Further,

$$\begin{bmatrix} L_k^T \\ L_{(n-k)}^T \end{bmatrix} A_1 = \begin{bmatrix} L_k^T \\ L_{(n-k)}^T \end{bmatrix}(A_0 - BB^TL_k\mathcal{L}L_k^T) = \left( \begin{bmatrix} \lambda & 0 & \cdots & 0 & 0 \\ 1 & \lambda & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & \lambda & 0 \\ 0 & 0 & \cdots & 1 & \lambda \end{bmatrix} - \begin{bmatrix} M_k\mathcal{L} & 0 \\ L_{(n-k)}^TBB^TL_k\mathcal{L} & 0 \end{bmatrix} \right) \begin{bmatrix} L_k^T \\ L_{(n-k)}^T \end{bmatrix} = \begin{bmatrix} D_{J_k}^T - M_k\mathcal{L} & 0 \\ A_{21} & D_{J_{(n-k)}}^T \end{bmatrix} \begin{bmatrix} L_k^T \\ L_{(n-k)}^T \end{bmatrix}$$

where $D_{J_{(n-k)}}^T$ is the (n − k) × (n − k) matrix given by $\begin{bmatrix} \lambda & 0 & \cdots & 0 & 0 \\ 1 & \lambda & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & \lambda & 0 \\ 0 & 0 & \cdots & 1 & \lambda \end{bmatrix}$. Thus $\mathrm{Spec}(A_1) = \mathrm{Spec}(D_{J_k}^T - M_k\mathcal{L}) \cup \mathrm{Spec}(D_{J_{(n-k)}}^T)$, where $\mathrm{Spec}(D_{J_k}^T - M_k\mathcal{L}) = \mathrm{Spec}(-D_{J_k})$ (from Eq. (11)) and $D_{J_{(n-k)}}^T$ has eigenvalue λ with algebraic multiplicity (n − k). Note that the geometric multiplicity of λ and of −λ is one. □
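The mechanics of Lemma 6 can be illustrated numerically. The sketch below (in Python; the particular matrices and the reduction via a Lyapunov equation for $\mathcal{L}^{-1}$ are our own illustrative choices, not taken from the paper) builds an A0 consisting of a single Jordan block, computes a rank k solution supported on the k-dimensional $A_0^T$-invariant subspace, and checks that $A_1 = A_0 - BB^TX_*$ has eigenvalues λ and −λ with algebraic multiplicities (n − k) and k.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

lam, n, k = 1.0, 3, 2

# A0: a single Jordan block for lam (geometric multiplicity 1, algebraic multiplicity n),
# with B chosen so that (A0, B) is controllable.
A0 = lam * np.eye(n) + np.diag(np.ones(n - 1), 1)
B = np.zeros((n, 1)); B[-1, 0] = 1.0

# For this A0, the k-dimensional A0^T-invariant subspace is spanned by the last k
# standard basis vectors (the tail of the Jordan chain of A0^T).
Lk = np.eye(n)[:, n - k:]
Dk = np.linalg.lstsq(Lk, A0.T @ Lk, rcond=None)[0]   # A0^T Lk = Lk Dk
Mk = Lk.T @ B @ B.T @ Lk

# Invertible solution of -Dk*Lam - Lam*Dk^T + Lam*Mk*Lam = 0 via Y = Lam^{-1},
# which satisfies Dk^T Y + Y Dk = Mk (uniquely solvable since lam != 0).
Y = solve_continuous_lyapunov(Dk.T, Mk)
Lam = np.linalg.inv(Y)

X = Lk @ Lam @ Lk.T
A1 = A0 - B @ B.T @ X
print(np.max(np.abs(-A0.T @ X - X @ A0 + X @ B @ B.T @ X)))  # residual ~ 0
print(np.sort(np.linalg.eigvals(A1).real))  # lam with multiplicity n-k, -lam with multiplicity k
```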
From the results above, it is clear that if A0 has repeated eigenvalues (with geometric multiplicity greater than one and at least one of the Jordan blocks corresponding to the repeated eigenvalue being completely controllable), then there exist real solutions X∗ of Ric(X) = 0 such that the matrix $A_1 = A_0 - BB^TX_*$ contains eigenvalues $\lambda_i, \lambda_j$ with $\lambda_i + \lambda_j = 0$. Conversely, if A0 has eigenvalues $\lambda_i, \lambda_j$ such that $\lambda_i + \lambda_j = 0$ and at least one of them is controllable, then there exist real solutions X∗ of Ric(X) = 0 such that the matrix $A_1 = A_0 - BB^TX_*$ has repeated eigenvalues. Further, if A0 has repeated eigenvalues (with geometric multiplicity greater than one and at least one of the Jordan blocks corresponding to the repeated eigenvalue being completely controllable), then there are infinitely many solutions of Ric(X) = 0; this is a direct consequence of the fact that $A_0^T$ has infinitely many eigenvectors that correspond to controllable modes. Equivalently, if A0 has eigenvalues $\lambda_i, \lambda_j$ such that $\lambda_i + \lambda_j = 0$ and at least one of them is controllable, then there are infinitely many solutions of Ric(X) = 0. In these cases, unlike Remark 3, the poset of solutions of Ric(X) = 0 is not a lattice in general. If one compares the posets of solutions of Rici(X) = 0 corresponding to the various choices of $A_i = A_0 - BB^TX_i$, then these posets may look very different from one another, unlike the case in Remark 3. One can however characterize all initial choices of K0 that give rise to posets that are isomorphic. Further, if one assumes that (A0, B) is controllable, then there are special choices of K0 that make the poset of solutions of Ric(X) = 0 a lattice. We demonstrate this with a generic example.

Example 8. Let (A, B) be controllable in the original ARE. Assume that a solution K0 (of the ARE) was chosen such that the resulting matrix $A_0 = A - BB^TK_0$ has all real eigenvalues $\lambda_1, \lambda_2, \cdots, \lambda_k$ with multiplicities $m_1, m_2, \cdots, m_k$ respectively. Assume further that the algebraic multiplicity equals the geometric multiplicity for all the eigenvalues. Further assume that $\lambda_i + \lambda_j \neq 0$ for $1 \le i \le j \le k$. In this case, the poset of real solutions of Ric(X) = 0 is indeed a lattice. In this particular case, there are precisely $(2^k - 1)$ nonzero solutions Xi of Ric(X) = 0 such that the posets of solutions of $\mathrm{Ric}_i(X) = -A_i^TX - XA_i + XBB^TX = 0$, where $A_i = A_0 - BB^TX_i$, are lattices. In fact, all these lattices are isomorphic to the lattice of solutions of Ric(X) = 0. The eigenvalues of the matrices Ai are precisely $\pm\lambda_1$ with multiplicity $m_1$, $\pm\lambda_2$ with multiplicity $m_2$, $\cdots$, $\pm\lambda_k$ with multiplicity $m_k$.
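As a small numerical illustration of the preceding paragraph, the following sketch (the matrices A0 and B are hypothetical data of our own choosing, and the Lyapunov-based construction of each Xi is our own way of computing it) takes a diagonalizable A0 with eigenvalues $\lambda_1 = 1$ (multiplicity 2) and $\lambda_2 = 2$ (multiplicity 1), computes the $2^2 - 1 = 3$ nonzero solutions Xi obtained by flipping the sign of one or both eigenvalue groups, and prints the spectrum of each $A_i = A_0 - BB^TX_i$.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov
from itertools import combinations

A0 = np.diag([1.0, 1.0, 2.0])          # lambda_1 = 1 (m1 = 2), lambda_2 = 2 (m2 = 1)
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])             # (A0, B) is controllable
groups = {1.0: [0, 1], 2.0: [2]}       # eigenvalue -> indices of its eigenvectors (standard basis here)

def solution_on(indices):
    # Rank |indices| solution of Ric(X) = 0 supported on the A0^T-eigenspace
    # spanned by the chosen standard basis vectors (Lyapunov trick for Lam^{-1}).
    Lk = np.eye(3)[:, indices]
    Dk = np.linalg.lstsq(Lk, A0.T @ Lk, rcond=None)[0]
    Mk = Lk.T @ B @ B.T @ Lk
    Lam = np.linalg.inv(solve_continuous_lyapunov(Dk.T, Mk))
    return Lk @ Lam @ Lk.T

# The 2^2 - 1 = 3 nonzero sign-pattern solutions: flip lambda_1, flip lambda_2, or flip both.
for r in (1, 2):
    for chosen in combinations(groups, r):
        idx = sum((groups[g] for g in chosen), [])
        X = solution_on(idx)
        A1 = A0 - B @ B.T @ X
        print(chosen, np.sort(np.linalg.eigvals(A1).real))   # flipped groups appear with a minus sign
```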
Let $m_1 > 1$. Then there are infinitely many solutions Xi of Ric(X) = 0 such that the posets of solutions of $\mathrm{Ric}_i(X) = -A_i^TX - XA_i + XBB^TX = 0$, where $A_i = A_0 - BB^TX_i$, are all isomorphic to one another, but none of these posets are lattices. This is because Rici(X) = 0 has infinitely many full rank solutions. For example, all matrices $A_i = A_0 - BB^TX_i$ with eigenvalues given by $\pm\lambda_1$ with multiplicity $(m_1 - 1)$, $\mp\lambda_1$ with multiplicity 1, $\pm\lambda_2$ with multiplicity $m_2$, $\cdots$, $\pm\lambda_k$ with multiplicity $m_k$ generate posets of solutions for $\mathrm{Ric}_i(X) = -A_i^TX - XA_i + XBB^TX = 0$ which are isomorphic to one another, but none of these posets are lattices. For this example, one can deduce the various classes of solutions Xi of Ric(X) = 0 such that the posets of solutions of Rici(X) = 0 are isomorphic to each other. In general, this deduction is not easy as it involves nontrivial Jordan blocks. From the above, it is clear that there are several classes of solutions Xi of Ric(X) = 0 such that the posets of solutions of $\mathrm{Ric}_i(X) = -A_i^TX - XA_i + XBB^TX = 0$, where $A_i = A_0 - BB^TX_i$, are all isomorphic to each other. For cases corresponding to this example, an upper bound on the number of such classes is given by $\prod_{i=1}^{k}\left\lceil \frac{m_i+1}{2}\right\rceil$, where $\lceil x\rceil$ is the smallest integer greater than or equal to x. This upper bound is attained when the multiplicities $m_i$ are all distinct.

Note, in the above example, that each class of solutions of Ric(X) = 0 that gives isomorphic posets contains an infinite number of elements, except for one distinguished class that contains only a finite number of elements. This distinguished class, containing only a finite number of solutions Xi of Ric(X) = 0, has associated matrices Ai with eigenvalues given by $\pm\lambda_1$ with multiplicity $m_1$, $\pm\lambda_2$ with multiplicity $m_2$, $\cdots$, $\pm\lambda_k$ with multiplicity $m_k$, where $\lambda_i + \lambda_j \neq 0$ for all $1 \le i < j \le k$. In fact, the posets associated to this distinguished class are lattices. Further note that if all the multiplicities $m_i = 1$, then the upper bound given in the example evaluates to one, that is, there is only one class of solutions and it is the distinguished class. The example discussed in Remark 3 is precisely this very special case, where only the distinguished class of solutions is present. Hence, in that case, the posets are all lattices which are isomorphic irrespective of the choice of initial solution.

6. Galois connection

Let the rows of $L^T$ be left eigenvectors/generalized eigenvectors of A0 and the columns of U be right eigenvectors/generalized eigenvectors of A0. We may assume that $L^TU = I$. Let $X = L_k\mathcal{L}L_k^T$ be a rank k solution of Ric(X) = 0. Therefore, the row span of X forms a k-dimensional left invariant subspace of A0. Without loss of generality, assume that the rows of $L_k^T$ are the first k rows of $L^T$. Let $U_{n-k}$ be the matrix whose columns are the last (n − k) columns of U. Clearly, $XU_{n-k} = L_k\mathcal{L}L_k^TU_{n-k} = 0$ and the columns of $U_{n-k}$ form the (n − k)-dimensional kernel of X. Thus, we can identify (n − k)-dimensional right invariant subspaces of A0 with the kernels of rank k solutions of Ric(X) = 0. This induces a partial order on solutions
of Ric(X) = 0 as follows. We say $X_i \prec_2 X_j$ if $\ker(X_i) \subseteq \ker(X_j)$. Thus, right invariant subspaces of A0 also induce a partial order on solutions of Ric(X) = 0. We have already seen the partial order induced by left invariant subspaces of A0 in the previous section.

Definition 1. ([18]) Let P and Q be posets. A Galois connection on the pair (P, Q) is a pair (Π, Ω) of maps Π : P → Q and Ω : Q → P, where we write Π(p) = p∗ and Ω(q) = q′, with the following properties:
1) For all p, q ∈ P and r, s ∈ Q, p ≤ q ⇒ q∗ ≤ p∗ and r ≤ s ⇒ s′ ≤ r′.
2) For all p ∈ P, q ∈ Q, p ≤ (p∗)′ and q ≤ (q′)∗.

Let ≺1 be the partial order induced by left invariant subspaces of A0 and ≺2 be the partial order induced by right invariant subspaces of A0 on solutions of Ric(X) = 0. Let P be the poset of solutions of Ric(X) = 0 with partial order ≺1 and Q be the poset of solutions of Ric(X) = 0 with partial order ≺2. It is clear that these two posets form a Galois connection. In other words, if $X_i \prec_1 X_j$, then $X_j \prec_2 X_i$ and vice versa. The relation between right invariant subspaces of A0 and the kernel of X has been noted in [19]. A Galois correspondence between the sets of solutions of two different AREs is obtained in [25] under appropriate hypotheses.

7. Conclusions

In this paper, we have given yet another characterization of all solutions of the algebraic Riccati equation. Importantly, we have only used simple linear algebraic arguments to obtain this characterization. We converted the ARE into an equivalent problem Ric(X) = 0 that eliminates the constant term in the original ARE. The matrix X may then be viewed as a perturbation from a particular solution of the original ARE. We then characterized all the solutions of Ric(X) = 0, ordering them by their rank. We demonstrated how all the solutions are, in some sense, built up from rank one and rank two solutions, which may be associated to the real and complex eigenvalues of a matrix related to the ARE. We obtained the characterization for both controllable and uncontrollable situations. We also provided conditions under which the set of solutions of the ARE is finite, and conditions under which the set of solutions is bounded.

We provided conditions under which a unique full rank solution exists for the equation Ric(X) = 0. Special situations that prevent the existence of a unique full rank solution were enumerated and demonstrated. When it exists, the unique full rank solution is obtained by solving a set of linear equations. It was then demonstrated how this unique full rank solution encodes within it all the lower rank solutions (under some special conditions). Even when these special conditions are not satisfied, the full rank solution does encode several low rank solutions. Further, it was shown that for the cases when a unique full rank solution of Ric(X) = 0 does not exist, all the maximal rank solutions encode within them information about the lower rank solutions.
We then explored a poset structure that we imposed on the set of solutions of Ric(X) = 0. We provided conditions under which this poset structure actually defines a lattice, namely the existence of a unique maximal rank solution. We then characterized those feedback matrices Ai which result in the corresponding solution set of Rici(X) = 0 being a lattice. Whenever the poset was a lattice, it was isomorphic to the lattice of invariant subspaces of $A_i^T$ corresponding to eigenvalues which are nonzero and not purely imaginary.

Acknowledgements

We thank the two reviewers of this paper, whose suggestions have considerably improved this article. We thank an anonymous reviewer for providing a much shorter proof of Theorem 12.

References

[1] H. Abou-Kandil, G. Freiling, V. Ionescu, G. Jank, Matrix Riccati Equations in Control and Systems Theory, Systems & Control: Foundations & Applications, Birkhäuser Verlag, Basel, 2003.
[2] D.A. Bini, B. Iannazzo, B. Meini, Numerical Solution of Algebraic Riccati Equations, Fundamentals of Algorithms, vol. 9, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2012.
[3] S. Bittanti, A.J. Laub, J.C. Willems, The Riccati Equation, Springer-Verlag, New York, 1991.
[4] T. Brüll, Generalizing the algebraic Riccati equations to higher-order behavioral systems, SIAM J. Control Optim. 51 (2013) 2544–2567.
[5] W.A. Coppel, Matrix quadratic equations, Bull. Aust. Math. Soc. 10 (1974) 377–401.
[6] A. Ferrante, A parametrization of minimal stochastic realizations, IEEE Trans. Automat. Control AC-39 (1994) 2122–2126.
[7] A. Ferrante, A homeomorphic characterization of minimal spectral factors, SIAM J. Control Optim. 35 (1997) 1508–1523.
[8] A. Ferrante, G. Michaletzky, M. Pavon, Parametrization of all minimal square spectral factors, Systems Control Lett. 21 (1993) 249–254.
[9] A. Ferrante, L. Ntogramatzidis, The generalized discrete algebraic Riccati equation in linear-quadratic optimal control, Automatica 49 (2) (2013) 471–478.
[10] U. Helmke, J. Moore, Optimization and Dynamical Systems, Springer-Verlag, 1994.
[11] R.A. Horn, C.R. Johnson, Topics in Matrix Analysis, Cambridge University Press, 1999.
[12] T. Kailath, Linear Systems, Prentice Hall Inc., 1980.
[13] P. Lancaster, L. Rodman, Algebraic Riccati Equations, Clarendon Press, Oxford, 1995.
[14] V. Mehrmann, E. Tan, Defect correction methods for the solution of algebraic Riccati equations, IEEE Trans. Automat. Control 33 (1988) 695–698.
[15] V.L. Mehrmann, The Autonomous Linear Quadratic Control Problem: Theory and Numerical Solution, Lecture Notes in Control and Information Sciences, vol. 163, Springer-Verlag, Berlin, 1991.
[16] D. Pal, M.N. Belur, Dissipativity of uncontrollable systems, storage functions and Lyapunov functions, SIAM J. Control Optim. 47 (2008) 2930–2966.
[17] G. Picci, S. Pinzoni, Acausal models and balanced realizations of stationary processes, Linear Algebra Appl. 205–206 (1994) 997–1043.
[18] S. Roman, Lattices and Ordered Sets, Springer, 2008.
[19] C. Scherer, Solution set of the algebraic Riccati equation and the algebraic Riccati inequality, Linear Algebra Appl. 153 (1991) 99–122.
[20] C. Scherer, The Riccati inequality and state-space H∞-optimal control, Ph.D. thesis, University of Würzburg, 1995.
[21] M.A. Shayman, Geometry of the algebraic Riccati equation I, SIAM J. Control Optim. 21 (1983) 375–394.
[22] J. Snyders, M. Zakai, On nonnegative solutions of the equation AD + DA′ = −C∗, SIAM J. Appl. Math. 18 (1970) 704–714.
[23] J.C. Willems, Least squares stationary optimal control and the algebraic Riccati equation, IEEE Trans. Automat. Control AC-16 (1971) 621–634.
[24] H.K. Wimmer, Unmixed solutions of the discrete-time algebraic Riccati equation, SIAM J. Control Optim. 30 (1992) 867–878.
[25] H.K. Wimmer, A Galois correspondence between sets of semidefinite solutions of continuous-time algebraic Riccati equations, Linear Algebra Appl. 205–206 (1994) 1253–1270.
[26] W. Wonham, Linear Multivariable Control, Springer-Verlag, 1984.