LU factorization for matrices in quasiseparable form via orthogonal transformations

LU factorization for matrices in quasiseparable form via orthogonal transformations

JID:LAA AID:13529 /FLA [m1L; v1.172; Prn:8/02/2016; 13:43] P.1 (1-36) Linear Algebra and its Applications ••• (••••) •••–••• Contents lists availa...

2MB Sizes 3 Downloads 108 Views

JID:LAA

AID:13529 /FLA

[m1L; v1.172; Prn:8/02/2016; 13:43] P.1 (1-36)

Linear Algebra and its Applications ••• (••••) •••–•••

Contents lists available at ScienceDirect

Linear Algebra and its Applications www.elsevier.com/locate/laa

LU factorization for matrices in quasiseparable form via orthogonal transformations P. Dewilde a , Y. Eidelman b,∗ , I. Haimovici b a

TUM-IAS, Lichtenbergstrasse 2a, 85748 Garching, Germany School of Mathematical Sciences, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel-Aviv University, Ramat-Aviv 69978, Israel b

a r t i c l e

i n f o

Article history: Received 21 January 2015 Accepted 9 January 2016 Available online xxxx Submitted by V. Mehrmann MSC: 15A09 15A23 65F05 Keywords: Structured matrix Quasiseparable representations Fast algorithms LU factorization

a b s t r a c t The paper presents a new algorithm to compute the LUfactorization of a matrix represented in a quasiseparable or semiseparable form (i.e., using generators). It obtains the quasiseparable representations of the factors L and U of an N × N block matrix via O(N ) arithmetic operations on the block entries. The algorithm uses recursions based exclusively on unitary transformations which provide numerical stability even in singular cases. The method of the paper is based on the theory developed in [1] and provides an alternative to the approach proposed in [7] for strongly regular matrices. The algorithm presented here works also for some matrices with possibly singular principle submatrices. The results of numerical tests show that also for strongly regular matrices the new algorithm is comparable with the previous methods. © 2016 Elsevier Inc. All rights reserved.

* Corresponding author. E-mail addresses: [email protected] (P. Dewilde), [email protected] (Y. Eidelman), [email protected] (I. Haimovici). http://dx.doi.org/10.1016/j.laa.2016.01.017 0024-3795/© 2016 Elsevier Inc. All rights reserved.

JID:LAA

AID:13529 /FLA

2

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

[m1L; v1.172; Prn:8/02/2016; 13:43] P.2 (1-36)

1. Introduction In this paper we present a detailed algorithmic treatment of a new algorithm for LU factorization for finite matrices that have a quasiseparable representation. The method was originally derived in the more general (and theoretical) context of infinite dimensional quasiseparable (often also called semiseparable) systems in [1]. The (seemingly only recently discovered) new algorithm has the property that it uses, in contrast to classical methods, exclusively orthogonal or (in the complex case) unitary transformations in the recursions, while still keeping the typical low numerical complexity of quasiseparable computations. The adaptation of the original operator treatment to the standard matrix case is not altogether trivial, because the starting point of the matrix algorithm is different. This is covered in detail in the present paper, while the connection to the original (which has some independent interest) is also discussed in detail. It is an interesting question whether such a fully numerically stable algorithm can compete with existing, highly optimized algorithms that use a more traditional approach in which the pivots (i.e., the subsequent principal minors) are computed and used explicitly. A highly performing algorithm for quasiseparable matrices using pivots recursively (as in the traditional Gaussian elimination) is given in [7,5], see also [6]. In a later section of the present paper we compare the traditional and the new approach experimentally, by comparing the two algorithms in a number of model circumstances. To anticipate on the results, let us just mention here that in most cases the two algorithms have similar performances, but the new algorithm outperforms the traditional algorithms in extreme ill-conditioned or singular cases, where it never breaks down (it shall soon become clear why not). The distinctive quality of the new algorithm is that it uses exclusively two subsequent so called “inner–outer” factorizations based on the quasiseparable representation. Inner– outer factorizations for quasiseparable matrices were originally presented in a number of papers and then systematically treated in the book [2], including their stability and numerical properties. To summarize briefly: they use a recursive sequence of orthogonal (or unitary) transformations which are not only backward numerically stable (because of the orthogonal property of the transformation matrices), but such that the recursion is also strongly stable in itself, because the effect of a (numerical) error made at each stage dies down exponentially in subsequent stages. Because these are well documented (if not well known) classical results, they are not repeated in this paper, although the treatment given here may suggest proof (in the numerical literature such “system” stability is hardly considered an issue, but it plays a crucial role when one attempts to solve infinite dimensional systems of equations). Although Gaussian elimination, and in its wake LU-factorization à la Crout–Doolittle, was developed originally to solve systems of equations, nowadays this can be done more effectively using orthogonal algorithms such as QR-factorization, the more general URV-factorization (in which U and V are isometries and R is an invertible upper matrix), orthogonal Faddeev/Faddeeva [3] (which is of QR-type but integrates the back-

JID:LAA

AID:13529 /FLA

[m1L; v1.172; Prn:8/02/2016; 13:43] P.3 (1-36)

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

3

substitution and succeeds in computing the result of a system of equations or the inverse of a matrix using orthogonal or unitary transformations except for a last, numerically fully stable normalization step) or even the SVD (for highest possible accuracy). Algorithms of this nature have been developed very effectively for quasiseparable matrices, see e.g., [2,4–6], and, just as the presently presented algorithm, they only use orthogonal transformations in the recursions. The importance of LU-factorization is the determination of the factors L (for lower) and U (for upper) themselves, not the solution of systems of equations. The factors L and U play a key role in a number of important problems: stochastic estimation, stochastic modeling, optimal control (both linear and non-linear), optimization theory and system (or network) synthesis. In some cases this reduces to just Cholesky factorization, where the U factor is the transpose of the L factor and the original matrix is positive definite, but in a good number of other cases, a full determination with inertia is needed. There is a large and classical literature on these issues, let us just suffice to mention popular textbooks [8,9]. For a long time it was thought that there is no orthogonal algorithm for the LU- or the spectral factorization problem, but, as we show here, there is one and actually, an attractive one. In this paper, we have chosen to present the detailed algorithmic version of the method first with direct matrix-algebraic proofs of the various properties of the factors obtained (sections 2–5), followed by an operator-theoretic treatment that links the algorithm with the operator theory literature where it appeared originally (section 6, see [1] and [11] in somewhat different versions). This choice has been made to stay in line with the treatment of related matrix algorithms as presented in the book [7] and in [5,6,4]. Although both the new and the former algorithms use similar basic recursive steps, they are substantially different both in the order in which steps are executed and in the results they produce. (In particular, QR and LU factorizations produce very different resulting matrices!) The there upon following operator-theoretic treatment puts the (new) algorithm presented here in the more general (infinite dimensional) framework of quasiseparable systems, which clarifies the role the various operations play and illustrates how and why the numerical algorithm differs slightly from its original. In addition, it indicates how general stability and existence properties can be derived. Change of notation. In order to stay consistent with the notation in the book [7] and the habit to use the symbol U for a unitary matrix, we henceforth write “LR” for “LU”, in which it is understood that the R factor is an upper factor. By the LR factorization we then mean the presentation of a matrix A in the form A = LR with a lower invertible and lower triangular matrix L with only identities on the main diagonal and an upper invertible, upper triangular matrix R. If a matrix is invertible then it admits an LR factorization if and only if it is strongly regular. However, the algorithm we present is not restricted to such matrices, actually it never breaks down

JID:LAA

AID:13529 /FLA

4

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

[m1L; v1.172; Prn:8/02/2016; 13:43] P.4 (1-36)

and yields interesting results in the case of non strongly regular matrices, that are not further explored here. To ease the reading of the paper, here is a short summary of the method. It consists essentially in two so called “algebraic inner–outer factorizations” (in our case implemented by simple QR recursions) applied on quasiseparable representations of, in this case, globally square matrices with block entries: the first one [Step 1] converts a general, lower–upper matrix with block entries into a (well-chosen!) purely upper matrix, while the second [Step 2] reconverts this upper matrix into a mixed form that determines the lower factor L (and hence also the upper factor U) uniquely, leaving as a remainder an invertible upper matrix (which is not used further). Although these transformations are in a quasiseparable sense minimal, they do not yield intermediate matrices with scalar entries even when the original matrix consists of scalar entries. This is due to the special nature of quasiseparable representations of upper or lower unitary matrices: in Step 1 exclusively transformations with elementary orthogonal upper transformations are used (these are necessarily upper matrices with non-scalar diagonal entries), while in Step 2 transformations with elementary orthogonal lower matrices are used exclusively (which are also necessarily non-scalar lower matrices). The computations of these two inner factors (called V and U in the sequel) produce all the data needed for the end-result: the generators for the L-factor are derived partially from the V -matrix and partially as a bilinear ratio of quantities expressed in the generators of both the V and the U -matrices [Step 3]. This last step consists in no more than a representation of one of the generators of the L-factor and does not contribute to the numerical properties of the algorithm (because a “parametric representation” of a number as the ratio of two stably computed numbers does not affect numerical stability when these numbers or their ratio are not used further in any computation—this is also the reason why the classical QR-Faddeev algorithm produces a back substitution in a numerically stable way). The new feature of the algorithm, in contrast to the older quasiseparable URV factorization, is that Step 1 as well as Step 2 are executed at the same side of the original matrix (in the present paper the left side). This seemingly small difference is absolutely essential. To a sketchy reader, the Gaussian-type algorithm given in the book [7] may appear to be almost the same as the algorithm presented here, but the matrix-computational analysis of the new algorithm (as given in sections 2–5) as well as the theoretical analysis of section 6 show the very substantial differences. Strictly speaking, the LU or Gaussian factorization of invertible matrices with scalar entries only exists when the matrix is what has been called “strongly regular”, i.e., when it has non-singular leading minors. Fast algorithms for the LU factorization of strongly regular matrices with quasiseparable representations were developed in [12,5], see also [7]. A Levinson-like algorithm for quasiseparable matrices was suggested in [13]. It turned out that the operations used in this algorithm are similar to ones used in the LU factorization in [7]. However, there are singular matrices with possibly singular leading minors for which the LU factorization also exists. Such cases for scalar matrices were studied in details in the paper by F.M. Dopico, C.R. 
Johnson and J.M. Molera [10].

JID:LAA

AID:13529 /FLA

[m1L; v1.172; Prn:8/02/2016; 13:43] P.5 (1-36)

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

5

Notice that a Levinson-like algorithm for quasiseparable matrices was suggested by R. Vandebril, M. Van Barel and N. Mastronardi in the monograph [13]. It turned out that the operations used in this algorithm are similar to ones used in the LU factorization in [7]. Various factorization algorithms for hierarchically semiseparable matrices are studied in the paper [14] by J. Xia, S. Chandrasekaran, M. Gu and X. Lie. The paper consists of nine sections. Section 1 is this introduction. In Section 2 the basic definitions concerning quasiseparable generators of matrices are given. In Section 3 we present the unitary-triangular factorization algorithm described in [7], that also corresponds to the gist of the new algorithm (Step 1 and Step 2). In Section 4 we derive the necessary and sufficient condition for the existence of the lower–upper factorization for the unitary factor in the latter representation and represent the solution in terms of the quantities computed in the two previous steps (Step 3). Section 5 summarizes the whole procedure. In section 6 we derive the global operators that correspond to steps 1–3 of the algorithm. This is mostly done by removal of the cumbersome index notation necessary in a detailed matrix-algebraic treatment, while keeping the original symbols. It is remarkable that this can be done seamlessly, but it requires a slight extension of matrix calculus. Section 6 also contains a direct proof for the properties painfully derived in sections 2–5. Section 7 contains a counterexample demonstrating that the condition used for the considering class of matrices is not necessary in general. A disadvantage of the new algorithm is that the Step 1 inner–outer recursion is a backward recursion while the Step 2 inner–outer recursion goes forward. This unpleasant situation can be remedied in a number of cases by a careful replacement of the Step 1 backward recursion with a forward one, based on the fact that in many cases, sufficient data is available on the quasiseparable representation in the forward direction. How this is done is shown in section 8. Finally, in Appendix A we present results of numerical tests that compare the Gaussian elimination approach of [7] with the new approach. 2. Quasiseparable generators In this section we define the quasiseparable representation that will be the starting point of the factorization algorithm. In many cases, such a representation for a matrix is given a priori, e.g., when it is derived from what is known as a “state space model” of a system under study. In some cases it may be advantageous to derive such a representation from partial matrix data (as we do in section 8). Let {a(k)} be a family of matrices of sizes rk × rk−1 . For positive integers i, j, i > j > > define the operation a> ij as follows: aij = a(i − 1) · . . . · a(j + 1) for i > j + 1, aj+1,j = Irj . Let {b(k)} be a family of matrices of sizes rk−1 × rk . For positive integers i, j, j > i < < define the operation b< ij as follows: bij = b(i + 1) · . . . · b(j − 1) for j > i + 1, bi,i+1 = Iri . It is easy to see that > > a> ij = aik ak+1,j ,

i>k≥j

(2.1)

JID:LAA

AID:13529 /FLA

6

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

[m1L; v1.172; Prn:8/02/2016; 13:43] P.6 (1-36)

and < < b< ij = bi,k+1 bk,j ,

i ≤ k < j.

(2.2)

Let A = {Aij }N i,j=1 be a matrix with block entries Aij of sizes mi × nj . Assume that the entries of this matrix are represented in the form ⎧ > ⎪ ⎨ p(i)aij q(j), 1 ≤ j < i ≤ N, Aij = d(i), 1 ≤ i = j ≤ N, ⎪ ⎩ g(i)b< h(j), 1 ≤ i < j ≤ N. ij

(2.3)

Here p(i) (i = 2, . . . , N ), q(j) (j = 1, . . . , N − 1), a(k) (k = 2, . . . , N − 1) are matrices of L L sizes mi ×ri−1 , rjL ×nj , rkL ×rk−1 respectively, g(i) (i = 1, . . . , N −1), h(j) (j = 2, . . . , N ), U U b(k) (k = 2, . . . , N − 1) are matrices of sizes mi × riU , rj−1 × nj , rk−1 × rkU respectively, d(i) (i = 1, . . . , N ) are mi × ni matrices. The representation of a matrix A in the form (2.3) is called a quasiseparable representation. The elements p(i) (i = 2, . . . , N ), q(j) (j = 1, . . . , N − 1), a(k) (k = 2, . . . , N − 1); g(i) (i = 1, . . . , N − 1), h(j) (j = 2, . . . , N ), b(k) (k = 2, . . . , N − 1); d(i) (i = 1, . . . , N ) are called quasiseparable generators of the matrix A. The numbers rkL , rkU (k = 1, . . . , N −1) are called the orders of these generators. The elements p(i) (i = 2, . . . , N ), q(j) (j = 1, . . . , N − 1), a(k) (k = 2, . . . , N − 1) and g(i) (i = 1, . . . , N − 1), h(j) (j = 2, . . . , N ), b(k) (k = 2, . . . , N − 1) are called also lower quasiseparable generators and upper quasiseparable generators of the matrix A. For scalar matrices the elements d(i) are numbers and the generators p(i), g(i) and q(j), h(j) are rows and columns of the corresponding sizes. 3. The QR-like algorithm [Step 1 and Step 2] In Step 1 of the algorithm, a (block) matrix A given in full, quasiseparable form, i.e., a representation with generators for both the lower part and the upper part of the matrix, is converted into block-upper form, using QR factorizations recursively executed on the lower generators. It also converts these generators into a minimal set, produces the generators of a lower unitary matrix V which is such that V ∗ A is upper, and determines the resulting generators for V ∗ A. All this is done by a QR-recursion that proceeds backwards, starting from the highest (block) index down. In Step 2, the resulting remainder is then, using a further series of QR-steps, into the product of an upper unitary matrix U and a right invertible upper remainder R0 . This step is again recursive, starting from the first index onwards (a forward recursion). It turns out that these two consecutive QR-steps produce all the data needed for the result (Step 3 of the next section). Theorem 3.1. Let A = {Aij }N i,j=1 be a block matrix with entries of sizes mi × mj , lower quasiseparable generators p(i) (i = 2, . . . , N ), q(j) (j = 1, . . . , N − 1), a(k) (k = 2, . . . , N − 1) of orders rkL (k = 1, . . . , N − 1), upper quasiseparable generators

JID:LAA

AID:13529 /FLA

[m1L; v1.172; Prn:8/02/2016; 13:43] P.7 (1-36)

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

7

g(i) (i = 1, . . . , N − 1), h(j) (j = 2, . . . , N ), b(k) (k = 2, . . . , N − 1) of orders rkU (k = 1, . . . , N − 1) and diagonal entries d(k) (k = 1, . . . , N ). Set ρN = 0,

L ρk−1 = min{mk + ρk , rk−1 },

k = N, . . . , 2, ρ0 = 0,

νk = mk + ρk − ρk−1 , k = 1, . . . , N.

(3.1) (3.2)

Then the matrix A admits the factorization A = V U R0 ,

(3.3)

where V is a unitary block lower triangular matrix with block entries of sizes mi × νj (i, j = 1, . . . , N ), U is a unitary block upper triangular matrix with block entries of sizes νi × mj (i, j = 1, . . . , N ) and R0 is a block upper triangular matrix with block entries of sizes mi × mj (i, j = 1, . . . , N ). Moreover lower quasiseparable generators pV (i) (i = 2, . . . , N ), qV (j) (j = 1, . . . , N − 1), aV (k) (k = 2, . . . , N − 1) of orders ρk (k = 1, . . . , N − 1) and diagonal entries dV (k) (k = 1, . . . , N ) of the matrix V as well as upper quasiseparable generators gU (i) (i = 1, . . . , N − 1), hU (j) (j = 2, . . . , N ), bU (k) (k = 2, . . . , N − 1) of orders ρk (k = 1, . . . , N − 1) and diagonal entries dU (k) (k = 1, . . . , N ) are obtained via the following algorithm. Step 1 1.1. Using QR factorization or another algorithm compute the factorization  p(N ) = VN



XN L 0νN ×rN −1

,

(3.4)

L where VN is a unitary matrix of sizes mN × mN , XN is a matrix of sizes ρN −1 × rN −1 . Determine the matrices pV (N ), dV (N ) of sizes mN × ρN −1 , mN × νN from the partition

 VN = pV (N ) dV (N ) .

(3.5)

1.2. For k = N − 1, . . . , 2 using QR factorization or another algorithm compute the factorization

p(k) Xk+1 a(k)

 = Vk

Xk L 0νk ×rk−1

 ,

(3.6)

where Vk is a unitary matrix of sizes (mk + ρk ) × (mk + ρk ), Xk is a matrix of sizes L ρk−1 × rk−1 . Determine the matrices pV (k), qV (k), aV (k), dV (k) of sizes mk × ρk−1 , ρk × νk , ρk × ρk−1 , mk × νk from the partition

Vk =

pV (k) dV (k) . aV (k) qV (k)

(3.7)

JID:LAA

AID:13529 /FLA

8

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

[m1L; v1.172; Prn:8/02/2016; 13:43] P.8 (1-36)

1.3. Set V1 to be a ν1 × ν1 unitary matrix. Determine the matrices qV (1), dV (1) of sizes ρ1 × ν1 , m1 × ν1 from the partition

V1 =

dV (1) . qV (1)

(3.8)

Step 2 2.1. Compute Δ1 = d∗V (1)d(1) + qV∗ (1)X2 q(1).

(3.9)

Compute the QR factorization

x1

Δ1 = U1

,

0ρ1 ×m1

(3.10)

where U1 is a ν1 × ν1 unitary matrix and x1 is an upper triangular m1 × m1 matrix. Determine the matrices dU (1), gU (1) of sizes ν1 × m1 , ν1 × ρ1 from the partition  U1 = dU (1) gU (1) .

(3.11)

Compute ∗ P1 = gU (1)d∗V (1)g(1),

∗ Q1 = gU (1)qV∗ (1).

(3.12)

2.2. For k = 2, . . . , N − 1 perform the following. Compute Δk = d∗V (k)d(k) + qV∗ (k)Xk+1 q(k), zk = Pk−1 h(k) +

Qk−1 (p∗V

(k)d(k) +

a∗V

(3.13)

(k)Xk+1 q(k)).

(3.14)

,

(3.15)

Compute the QR factorization

zk Δk

= Uk

xk 0ρk ×mk

where Uk is an (mk + ρk ) × (mk + ρk ) unitary matrix and xk is an mk × mk upper triangular matrix. Determine the matrices dU (k), gU (k), hU (k), bU (k) of sizes νk × mk , νk × ρk , ρk−1 × mk , ρk−1 × ρk from the partition

Uk =

hU (k) bU (k) . dU (k) gU (k)

(3.16)

JID:LAA

AID:13529 /FLA

[m1L; v1.172; Prn:8/02/2016; 13:43] P.9 (1-36)

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

9

Compute ∗ Pk = b∗U (k)(Pk−1 b(k) + Qk−1 p∗V (k)g(k)) + gU (k)d∗V (k)g(k),

Qk =

b∗U (k)Qk−1 a∗V

(k) +

∗ gU (k)qV∗

(k).

(3.17) (3.18)

2.3. Compute ΔN = d∗V (N )d(N ),

(3.19)

QN −1 p∗V

(3.20)

zN = PN −1 h(N ) +

(N )d(N ).

Compute the QR factorization

zN ΔN

= UN xN ,

(3.21)

where UN is a unitary matrix of sizes (νN + ρN −1 ) × (νN + ρN −1 ), xN is an upper triangular matrix of sizes mN × mN . Determine matrices dU (N ), hU (N ) of sizes νN × mN , ρN −1 × mN from the partition

UN =

hU (N ) . dU (N )

(3.22)

The proof follows from Theorem 20.5 and Theorem 20.7 from [7] or the algorithm on p. 1041 of [1]—the connection between the two approaches is further detailed in section 6. 4. The LR factorization. Factorization of the matrix V U [Step 3] The final (Step 3) of the algorithm gives the generators for the L (and as a further derivative also the R matrix) in terms of the generators of the V and the U matrices of the previous section. These latter matrices always exist and can be stably computed. We show that, given invertibility or the original matrix, the L and R matrices will only exist when a sequence of (simple) matrices Qk computed from the generators of V and U is square and invertible. This fact is proven in the next theorem by showing that the L factor in the A = LR factorization is also a minimal left factor in an LR-factorization of V U just as well, so that its generators easily follow from those of the factors V and U (however, the product V U is not computed!). Somewhat hidden in the operation is what in normal Gaussian elimination appears to be the pivot, which turns out, if one wishes to determine it, to be represented in a bilinear form, but it is not used in the recursion. A further analysis of the situation is presented in section 6. Theorem 4.1. In the conditions of Theorem 3.1 the matrix V U admits an LR factorization if and only if the matrices Qk , k = 1, . . . , N −1 in (3.12), (3.18) are invertible. Moreover

JID:LAA

AID:13529 /FLA

10

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

[m1L; v1.172; Prn:8/02/2016; 13:43] P.10 (1-36)

in this case lower quasiseparable generators pL (i) (i = 2, . . . , N ), qL (j) (j = 1, . . . , N −1), aL (k) (k = 2, . . . , N − 1) of orders ρk (k = 1, . . . , N − 1) of the lower triangular matrix L are determined by the formulas ∗ ∗ qL (1) = −Q−1 1 gU (1)dV (1),

qL (k) =

∗ ∗ −Q−1 k (bU (k)Qk−1 pV

(4.1)

∗ (k) + gU (k)d∗V (k)), k = 2, . . . , N − 1;

(4.2)

aL (k) = aV (k), k = 2, . . . , N − 1.

(4.3)

pL (k) = pV (k), k = 2, . . . , N,

Proof. Assume that the matrix V U admits the factorization V U = LG with a block lower triangular matrix L with identity matrices on the main diagonal and a block upper triangular matrix G. Since the matrices V and U are unitary the matrix G is invertible and we get L = V U G−1 .

(4.4)

Using Lemma 5.1 from [7] we have V V (k + 1 : N, 1 : k) = Pk+1 QVk ,

k = 1, . . . , N − 1

> V N V k with auxiliary matrices Pk+1 = col(pV (i)(aV )> i,k )i=k+1 , Qk = row((aV )k+1,i qV (i))i=1 . −1 Hence using (4.4) and the fact that V and U G are (block) lower and upper triangular matrices we get V L(k + 1 : N, k) = V (k + 1 : N, 1 : k)(U G−1 )(1 : k, k) = Pk+1 qL (k),

k = 1, . . . , N − 1

with qL (k) = QVk (U G−1 )(1 : k, k). From here using Lemma 5.3 from [7] we conclude that the lower triangular matrix L with identity entries on the main diagonal has a set of lower quasiseparable generators pV (i) (i = 2, . . . , N ), qL (j) (j = 1, . . . , N − 1), aV (k) (k = 2, . . . , N − 1) with the elements pV (i), aV (k) defined in Theorem 3.1. Next consider the matrix Z = V ∗ L. Here V ∗ is a (block) upper triangular matrix with upper quasiseparable generators qV∗ (i) (i = 1, . . . , N − 1), p∗V (j) (j = 2, . . . , N ), a∗V (k) (k = 2, . . . , N − 1) and diagonal entries d∗V (k) (k = 1, . . . , N ). Applying Corollary 17.5 from [7] we obtain quasiseparable generators of the matrix Z via relations qZ (k) = qL (k), gZ (k) = qV∗ (k),

k = 1, . . . , N − 1,

(4.5)

aZ (k) = aV (k), bZ (k) = a∗V (k),

k = 2, . . . , N − 1

(4.6)

and

γN hZ (N ) pZ (N ) dZ (N )

=

p∗V (N )  pV (N ) ImN , ∗ dV (N )

(4.7)

JID:LAA

AID:13529 /FLA

[m1L; v1.172; Prn:8/02/2016; 13:43] P.11 (1-36)

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••



γk hZ (k) pZ (k) dZ (k)

=

p∗V (k) a∗V (k) d∗V (k) qV∗ (k)





Imk 0 0 γk+1

pV (k) Imk , aV (k) qL (k)

k = N − 1, . . . , 2, dZ (1) =

d∗V

(1) +

qV∗

11

(1)γ2 qL (1)

(4.8) (4.9)

with the auxiliary variables γk which are ρk−1 × ρk−1 matrices. From (4.7) using the fact that VN in the formula (3.5) is a unitary matrix we get γN = IρN −1 ,

pZ (N ) = 0.

Using (4.8) we have γk = p∗V (k)pV (k) + a∗V (k)γk+1 aV (k), pZ (k) = d∗V (k)pV (k) + qV∗ (k)γk+1 aV (k), (4.10) k = N − 1, . . . , 2.

(4.11)

Hence using the equality γN = IρN −1 and the fact that Vk in the formulas (3.7) are unitary matrices we conclude that γk = Iρk−1 , pZ (k) = 0,

k = N − 1, . . . , 2.

Now the relations pZ (k) = 0, k = 2, . . . , N imply that Z is a block upper triangular matrix. Inserting γk = Iρk−1 , k = N, . . . , 2 in (4.7), (4.8), (4.9) we obtain the relations



hZ (N ) p∗V (N ) = , dZ (N ) d∗V (N )





hZ (k) p∗V (k) a∗V (k) Imk = , k = N − 1, . . . , 2, qL (k) dZ (k) d∗V (k) qV∗ (k)

I  m 1 dZ (1) = d∗V (1) qV∗ (1) . qL (1) Combining this with (4.5) we obtain the equalities



hZ (N ) dZ (N )

=

p∗V (N ) , d∗V (N )

Imk 0 , k = N − 1, . . . , 2, = qL (k) Iρk

 I  0 m1 ∗ ∗ . dZ (1) gZ (1) = dV (1) qV (1) qL (1) Iρ1

hZ (k) bZ (k) dZ (k) gZ (k)



p∗V (k) a∗V (k) d∗V (k) qV∗ (k)



(4.12)

Hence using Lemma 20.3 from [7] we obtain factorization of the matrix Z in the form

JID:LAA

AID:13529 /FLA

12

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

[m1L; v1.172; Prn:8/02/2016; 13:43] P.12 (1-36)

2 · · · Z N Z = Z 1 · Z with N = diag{Iχ , AN }, 1 = diag{Z1 , Iφ }; Z k = diag{Iχ , Zk , Iφ }, k = 2, . . . , N − 1; Z Z 1 N k k where χk =

k−1

N

mi , φk =

i=1



i=k+1





Z1 = dZ (1) gZ (1) ; Zk =

mi and

hZ (k) bZ (k) hZ (N ) , k = 2, . . . , N − 1; ZN = . dZ (k) gZ (k) dZ (N )

Hence the matrix Z −1 admits the factorization Z −1 = S N · S N −1 · · · S 1 with S 1 = diag{S1 , Iγ1 }; S k = diag{Iηk , Sk , Iγk }, k = 2, . . . , N − 1; S N = diag{IηN , SN }, where ηk =

k−1 i=1

mi , γk =

S1 =

N i=k+1

mi and

0 Im1 −qL (1) Iρ1

V1 ,

(4.13)

Sk =

0 Imk −qL (k) Iρk

Vk , k = 2, . . . , N − 1,

SN = VN .

(4.14)

Therefore by Lemma 20.2 from [7] we conclude that S = Z −1 is a (block) lower triangular matrix with generators determined via partitions

S1 =

dS (1) ; qS (1)



pS (k) dS (k) Sk = , k = 2, . . . , N − 1; aS (k) qS (k)  SN = pS (N ) dS (N ) .

(4.15)

In particular using (4.14), (4.13) and (3.5), (3.7) we get

pS (k) aS (k)





=

0 Imk −qL (k) Iρk

pV (k) , k = 2, . . . , N − 1, aV (k)

pS (N ) = pV (N ).

(4.16)

Now by Theorem 17.4 from [7] we obtain the formulas for quasiseparable generators of the matrix G = Z −1 U :

JID:LAA

AID:13529 /FLA

[m1L; v1.172; Prn:8/02/2016; 13:43] P.13 (1-36)

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

13







0 dG (1) gG (1) Im1 dV (1)  = (4.17) dU (1) gU (1) , qG (1) β1 −qL (1) Iρ1 qV (1)









0 Imk pV (k) dV (k) hU (k) bU (k) βk−1 0 dG (k) gG (k) = , qG (k) βk −qL (k) Iρk aV (k) qV (k) 0 Iνk dU (k) gU (k) k = 2, . . . , N − 1

(4.18)

and aG (k) = aS (k), k = 2, . . . , N − 1.

pG (k) = pS (k), k = 2, . . . , N,

(4.19)

The matrix G is upper triangular if and only if the relations k = 1, . . . , N − 1

qG (k) = 0,

(4.20)

hold. Indeed using Corollary 5.2 from [7] we have k = 1, . . . , N − 1

G G(k + 1 : N, k) = Pk+1 qG (k),

(4.21)

with the matrices PkG (k = N, . . . , 2) defined via recursive relations  PNG = pG (N ),

 pG (k) G aG (k) Pk+1

PkG =

, k = N − 1, . . . , 2.

Using the equalities (4.19) we get  PNG = pS (N ),

PkG =

 pS (k) S aS (k) Pk+1



 =



Imk 0 G 0 Pk+1

pS (k) aS (k)





, k = N − 1, . . . , 2.

Combining this with (4.16) we obtain  PNG = pV (N ),

Imk 0 G 0 Pk+1

PkG =

Imk 0 ∗ Iρk

 pV (k) aV (k)

,

k = N − 1, . . . , 2.

(4.22)



 pV (k) Since the matrices pV (N ), , k = N − 1, . . . , 2 are composed of ρk−1 , k = aV (k) N, . . . , 2 columns of unitary matrices Vk , k = N, . . . , 2 these matrices have full column ranks, i.e.  rankpV (N ) = ρN −1 ,

rank

 pV (k) aV (k)

= ρk−1 , k = N − 1, . . . , 2.

(4.23)

JID:LAA

AID:13529 /FLA

14

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

[m1L; v1.172; Prn:8/02/2016; 13:43] P.14 (1-36)

We prove by induction that rankPkG = ρk−1 ,

k = N, . . . , 2.

(4.24)

G For k = N it is clear. Assume that for some k with N −1 ≥ k ≥ 2 one has rankPk+1 = ρk .    Imk 0 Imk 0 This means that the matrix has full column rank. Hence using G 0 Pk+1 ∗ Iρk   pV (k) G (4.23) one obtains rankPk = rank = ρk−1 . Thus from (4.21), (4.24) it follows aV (k) that the equalities G G(k + 1 : N, k) = Pk+1 qG (k),

k = 1, . . . , N − 1

are equivalent to (4.20). Inserting (4.20) in (4.17), (4.18) we obtain the conditions

d (1)  V dU (1) gU (1) , 0 β1 = −qL (1) Iρ1 qV (1)





 p (k) d (k)  hU (k) bU (k) βk−1 0 V V , 0 βk = −qL (k) Iρk aV (k) qV (k) 0 Iνk dU (k) gU (k) 





k = 2, . . . , N − 1.

(4.25)

(4.26)

Using (4.25) and the fact that the matrix V1 in (3.8) and the matrix U1 in (3.11) are unitary we get  rank 0 β1 = rankβ1 = ρ1 and therefore β1 is an invertible matrix. Assume that for some k with 2 ≤ k ≤ N − 1 the matrix βk−1 is invertible. Using (4.26) and the fact that the matrix Vk in (3.7) and the matrix Uk in (3.16) are unitary we get  rank 0 βk = rankβk = ρk and therefore βk is an invertible matrix. Moreover using (4.25), (4.26) we get

 d∗ (1)  U (4.27) 0 Iρ1 d∗V (1) qV∗ (1) = −β1−1 qL (1) β1−1 , ∗ gU (1)





 h∗ (k) d∗ (k) −1 ∗ ∗ 0 (k) a (k) p β U U V V k−1 = −βk−1 qL (k) βk−1 , ∗ ∗ ∗ ∗ bU (k) gU (k) 0 Iνk dV (k) qV (k) 

 0 Iρk

k = 2, . . . , N − 1.

(4.28)

JID:LAA

AID:13529 /FLA

[m1L; v1.172; Prn:8/02/2016; 13:43] P.15 (1-36)

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

15

Comparing (4.27), (4.28) with (3.12), (3.18) we conclude that Qk = βk−1 , k = 1, . . . , N −1. Hence it follows that the matrices Qk , k = 1, . . . , N − 1 are invertible. Assume that the matrices Qk , k = 1, . . . , N − 1 in (3.12), (3.18) are invertible. Define the lower triangular matrix L via lower quasiseparable generators (4.1), (4.2), (4.3) and diagonal entries dL (k) = Imk (k = 1, . . . , N ). Using (4.1), (4.2) and (3.12), (3.18) we have  ∗ ∗ ∗ g (1) = Q−1 , d (1) q (1) U 1 V V



Q  ∗ ∗ 0 (k) a (k) p k−1 V V ∗ , = Q−1 (k) b∗U (k) gU k 0 Iνk d∗V (k) qV∗ (k)

 

−qL (k) Iρk

−qL (1) Iρ1



k = 2, . . . , N − 1.

(4.29)

(4.30)

Inserting (4.29) in (4.17) and using the fact that the matrix V1 in (3.8) and the matrix U1 in (3.11) are unitary we get 

qG (1) β1

d (1)  V = d∗V (1) qV∗ (1) dU (1) gU (1) = qV (1)   ∗ −1 Q−1 g (1) , = d (1) g (1) 0 Q U U U 1 1 ∗ Q−1 1 gU (1)



−1 i.e. qG (1) = 0 and β1 = Q−1 1 . Let for some k with 2 ≤ k ≤ N −1 the relation βk−1 = Qk−1 holds. Inserting (4.30) in (4.18) and using the fact that the matrix Vk in (3.7) and the matrix Uk in (3.16) are unitary we get



qG (k) βk

=

Q−1 k





 h (k) b (k) U U ∗ = 0 Q−1 , b∗U (k) gU (k) k dU (k) gU (k)

i.e. qG (k) = 0 and βk = Q−1 k . The equalities qG (k) = 0, k = 1, . . . , N − 1 imply that the matrix G is upper triangular. Hence it follows that the formula V U = LG yields an LR factorization of the matrix V U . 2 Corollary 4.2. In the conditions of Theorem 3.1 assume that the matrix A is invertible. Then the matrix A admits an LR factorization if and only if the matrices Qk , k = 1, . . . , N − 1 in (3.12), (3.18) are invertible. Proof. Assume that the matrices Qk , k = 1, . . . , N − 1 in (3.12), (3.18) are invertible. Then the matrix V U admits the LR factorization V U = LG. Using the formula (3.3) we obtain the factorization A = LR with the upper triangular matrix R = GR0 . Assume that the matrix A admits the factorization A = LR with desired block lower triangular and upper triangular matrices L and R. The formula (3.3) implies V U R0 = LR. Since A is invertible the matrix R0 is invertible and we get V U = LG with the

JID:LAA

AID:13529 /FLA

16

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

[m1L; v1.172; Prn:8/02/2016; 13:43] P.16 (1-36)

upper triangular matrix G = RR0−1 . Hence the matrices Qk , k = 1, . . . , N − 1 in (3.12), (3.18) are invertible. 2 Notice that from the corollary it follows that if the matrix A and the matrices Qk , k = 1, . . . , N − 1 in (3.12), (3.18) are invertible then the matrix A is strongly regular, i.e. all the principal leading submatrices A(1 : k, 1 : k), k = 1, . . . , N are invertible. 5. The LR algorithm Here we obtain the LR factorization of the matrix A via LR factorization of the corresponding matrix V U . This step assembles the work done in Step 1 and Step 2 to produce all the data of the end result. Theorem 5.1. In the conditions of Theorem 3.1 assume that the matrices Qk , k = 1, . . . , N − 1 in (3.12), (3.18) are invertible. Then the matrix A admits the LR factorization. Moreover in this case lower quasiseparable generators pL (i) (i = 2, . . . , N ), qL (j) (j = 1, . . . , N −1), aL (k) (k = 2, . . . , N −1) of orders ρk (k = 1, . . . , N −1) of the lower triangular matrix L as well as upper quasiseparable generators gR (i) (i = 1, . . . , N −1), hR (j) (j = 2, . . . , N ), bR (k) (k = 2, . . . , N −1) of orders rkU (k = 1, . . . , N − 1) and diagonal entries dR (k) (k = 1, . . . , N ) of the upper triangular matrix R in the factorization A = LR are determined by the formulas (4.1), (4.2), (4.3) and by the formulas dR (1) = d(1), dR (k) = d(k) +

gR (1) = g(1),

pV (k)Q−1 k−1 Pk−1 h(k),

k = 2, . . . , N,

gR (k) = g(k) + pV (k)Q−1 k−1 Pk−1 b(k), k = 2, . . . , N − 1; hR (k) = h(k), k = 2, . . . , N,

(5.1)

bR (k) = b(k), k = 2, . . . , N − 1.

(5.2) (5.3)

Proof. Theorem 4.1 implies that the matrix V U admits the factorization V U = LG with a block lower triangular matrix L with the lower triangular quasiseparable generators determined in (4.1), (4.2), (4.3) and diagonal entries dL (k) = Imk (k = 1, . . . , N ) and a block upper triangular matrix G. Using the formula (3.3) we obtain the factorization A = LR with the upper triangular matrix R = GR0 . We compute upper quasiseparable generators and diagonal entries of the upper triangular matrix R via the formula R = L−1 A. Using Corollary 17.10 from [7] we obtain the relations



dR (1) gR (1) Im1  = (5.4) d(1) g(1) , ∗ α1 qL (1)





Imk −pV (k) h(k) b(k) αk−1 0 dR (k) gR (k) = , aV (k) − qL (k)pV (k) qL (k) ∗ αk 0 Imk d(k) g(k)

JID:LAA

AID:13529 /FLA

[m1L; v1.172; Prn:8/02/2016; 13:43] P.17 (1-36)

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

17

k = 2, . . . , N − 1,

(5.5)

dR (N ) = −pV (N )αN −1 h(N ) + d(N ).

(5.6)

From here we obtain the relations dR (1) = d(1),

dR (k) = d(k) − pV (k)αk−1 h(k), k = 2, . . . , N − 1,

gR (1) = g(1),

gR (k) = g(k) − pV (k)αk−1 b(k), k = 2, . . . , N − 1

and α1 = qL (1)g(1),

(5.7)

αk = (aV (k) − qL (k)pV (k))αk−1 b(k) + qL (k)g(k), k = 2, . . . , N − 1.

(5.8)

One should prove that αk = −Q−1 k Pk ,

k = 1, . . . , N − 1.

(5.9)

For k = 1 the relation (5.9) follows from (5.7) and (4.1), (3.12). Let for some k with 2 ≤ k ≤ N − 1 the relation αk−1 = −Q−1 k−1 Pk−1 holds. Using (5.8) we have αk = aV (k)αk−1 b(k) + qL (k)(g(k) − pV (k)αk−1 b(k)). Inserting (4.2) we have ∗ ∗ ∗ ∗ αk = aV (k)αk−1 b(k) − Q−1 k (bU (k)Qk−1 pV (k) + gU (k)dV (k))(g(k) − pV (k)αk−1 b(k))

= −Q−1 k Dk with ∗ Dk = (b∗U (k)Qk−1 p∗V (k) + gU (k)d∗V (k))(g(k) − pV (k)αk−1 b(k)) − Qk aV (k)αk−1 b(k).

Thus one should check that Dk = Pk . We have ∗ Dk = b∗U (k)Qk−1 p∗V (k)g(k) + gU (k)d∗V (k)g(k) + Ck ,

where ∗ Ck = −[(b∗U (k)Qk−1 p∗V (k) + gU (k)d∗V (k))pV (k) − Qk aV (k)]αk−1 b(k).

Hence using (3.18) and the equality αk−1 = −Q−1 k−1 Pk−1 we get ∗ Ck = [b∗U (k)Qk−1 p∗V (k)pV (k) + gU (k)d∗V (k)pV (k) + ∗ (k)qV∗ (k)aV (k)]Q−1 b∗U (k)Qk−1 a∗V (k)aV (k) + gU k−1 Pk−1 b(k).

(5.10)

JID:LAA

AID:13529 /FLA

18

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

[m1L; v1.172; Prn:8/02/2016; 13:43] P.18 (1-36)

From here since Vk in (3.7) is a unitary matrix we conclude that ∗ Ck = b∗U (k)Qk−1 Q−1 k−1 Pk−1 b(k) = bU (k)Pk−1 b(k).

Inserting this in (5.10) and using (3.17) we obtain Dk = Pk .

2

Thus we obtain the following algorithm to compute LR factorization of a block matrix if such factorization exists. Algorithm 5.2. Let A = {Aij }N i,j=1 be a block matrix with entries of sizes mi × mj , lower quasiseparable generators p(i) (i = 2, . . . , N ), q(j) (j = 1, . . . , N − 1), a(k) (k = 2, . . . , N − 1) of orders rkL (k = 1, . . . , N − 1), upper quasiseparable generators g(i) (i = 1, . . . , N − 1), h(j) (j = 2, . . . , N ), b(k) (k = 2, . . . , N − 1) of orders rkU (k = 1, . . . , N − 1) and diagonal entries d(k) (k = 1, . . . , N ). 1. Using algorithm from Theorem 3.1 compute the numbers ρk (k = 1, . . . , N − 1), lower quasiseparable generators pV (i) (i = 2, . . . , N ), qV (j) (j = 1, . . . , N − 1), aV (k) (k = 2, . . . , N − 1) of orders ρk (k = 1, . . . , N − 1) and diagonal entries dV (k) (k = 1, . . . , N ) of the matrix V , upper quasiseparable generators gU (i) (i = 1, . . . , N −1), hU (j) (j = 2, . . . , N ), bU (k) (k = 2, . . . , N − 1) of orders ρk (k = 1, . . . , N − 1) and diagonal entries dU (k) (k = 1, . . . , N ) of the unitary matrices V and U from the factorization (3.3) and auxiliary matrices Pk , Qk , k = 1, . . . , N − 1. 2. If the matrices Qk , k = 1, . . . , N − 1 are invertible compute lower quasiseparable generators pL (i) (i = 2, . . . , N ), qL (j) (j = 1, . . . , N − 1), aL (k) (k = 2, . . . , N − 1) of orders ρk (k = 1, . . . , N −1) of the lower triangular matrix L by the formulas (4.1), (4.2), (4.3) and upper quasiseparable generators gR (i) (i = 1, . . . , N −1), hR (j) (j = 2, . . . , N ), bR (k) (k = 2, . . . , N − 1) of orders rkU (k = 1, . . . , N − 1) and diagonal entries dR (k) (k = 1, . . . , N ) of the upper triangular matrix R by the formulas (5.1), (5.2), (5.3). 6. Global operators In this section we show how the detailed notation of the previous sections leads to global operator-theoretic formulas, from which further properties may then be deduced (also concerning numerical properties). We make, in addition, the correspondence with the system theoretic notation of [2]. Full proofs are given in [1], and are not repeated here (they give an alternative to the proofs of the previous sections). The main ingredients for the globalization are: • indices are suppressed, thanks to the introduction of zero as a potential dimension. In particular, ‘−’ denotes an element of dimensions 0 × 1, ‘|’ an element of dimension 1 × 0 and ‘·’ an element of dimension 0 × 0. Moreover, some new calculation rules must be introduced that are compatible with normal matrix calculus (with ‘∗’ as multiplication operator: | ∗ − = [0], − ∗ | = ·, · ∗ − = −, etc..., these are

JID:LAA

AID:13529 /FLA

[m1L; v1.172; Prn:8/02/2016; 13:43] P.19 (1-36)

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

• •

• •

19

actually all placeholders or dummies). Dimensional correspondence in addition and multiplication of (block) matrices must of course always be respected; systematic use is made of block-diagonal matrices. E.g., p = diag(pi ) is a blockdiagonal matrix constructed from the blocks pi ; operators map dimensionally compatible input sequences to dimensionally compatible output sequences, eventually augmented with dummies when appropriate. Thanks to such (zero-dimensional) augmentations, all indices can be thought of running from −∞ to +∞; a general “causal shift operator” σ maps a sequence u with elements ui forward: [σu]i = ui−1 . The inverse operator is then the transpose σ ∗ , defined by [σ ∗ u]i = ui+1 ; a symbol is used to represent so called “system realizations” (following the traditional terminology of Dynamical System Theory, originally introduced by R.E. Kalman), which higher up were called “generators”. We write

a q p d

T ∼c

(6.11)

for T = d + p(I − σa)−1 σq, in which a, q, p, d are (block) diagonal operators and σ is the causal shift. In the notation introduced before, this would be annotated as Ti,j = pi a> i,j qj for j < i, Ti,i = di and Ti,j = 0 for i < j. (The proof is by Neumann

N expansion of (I −σa)−1 = k=0 (σa)k and noticing that σa is again a diagonal matrix

a q with shifted output indices, for more explanation see e.g., [2]— is a 2×2 matrix p d of compatible block diagonals that has the important system theoretic meaning of mapping the current “state” and the input to the “next state” and the output.) T is a block lower triangular operator, often called “causal”, because the computation progresses with increasing indices. Dually, there are block upper triangular operators, now using the anti-causal shift σ ∗ . An upper or anti-causal realization is then

A ∼a for A = d + g(I − σ ∗ b)−1 σ ∗ h.

b h g d

(6.12)



a q • Realizations are not unique. Given a causal realization , let R be a non-singular p d −1 (square) block-diagonal operator, dimensionally compatible with the product aR , R<−1> aR−1 R<−1> q then an equivalent realization is . R defines what is often d pR−1 called a “state equivalence”. It can be used to reduce the realization further (see the next subsection).

JID:LAA

AID:13529 /FLA

20

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

[m1L; v1.172; Prn:8/02/2016; 13:43] P.20 (1-36)

Besides shifts on vectors and matrices, it is also useful to introduce shifts on diagonals: a forward (causal) shift on some matrix A is denoted A<+1> := σAσ ∗ , while a backward, anti-causal shift on A is A<−1> := σ ∗ Aσ. Some further conventions are: an operator A ∈ c when it is lower (causal), or A ∈ a when it is upper (or anti-causal). An operator is said to be “isometric inner” when it is a causal and isometric operator and “isometric co-inner” when it is an isometric anticausal operator. In the setting of finite matrices all operators with isometric (respect. co-isometric) realizations are isometric (respect. co-isometric) as well, and vice-versa, a causal isometric (respect. anti-causal isometric) has a causal isometric (respect. anticausal isometric) realization. An inner operator is by definition causal unitary etc... In the case the operator A is infinite dimensional the situation becomes more complex, see e.g., [2]. With the conventions just established, the translation of the algebraic recursions of the previous sections to an operator form just consists in dropping indices and writing out realizations. The original, double-sided operator is then written as A =: d + p(I − σa)−1 σq + g(I − σ ∗ b)−1 σ ∗ h

(6.13)

and Theorem 3.1 produces a factorization A = V U R0 , in which V ∈ c is inner (i.e., by definition: causal and unitary), U ∈ a is co-inner (i.e., anti-causal and unitary) and R0 ∈ a has a right anti-causal inverse. Step 1 The method starts with the determination of V as a minimal, causal and unitary factor that makes V ∗ A upper (or anti-causal). Let a unitary realization for V be written as V =: dV + pV (I − σaV )−1 σqV . (6.14)

a The theorem does not assume to be isometric to begin with, nor even of minimal p dimensions, so the determination of V requires a reduction of the generators to a minimal form, because a minimal unitary realization for V will be needed (both its reachability and observability gramians will be unit matrices). Identifying Λ := X ∗ X as the “observability gramian” satisfying the recursion Λ = a∗ Λ<−1> a + p∗ p, the corresponding square root algorithm reduces to a (partial) state transformation matrix X and the definition of a realization for the inner factor V , in which we require X to be minimal, i.e., has independent rows (typical: row echelon form):

p X <−1> a





=:

p V dV aV q V

X 0

(=

pV X ). aV X

(6.15)

This is a backward orthogonal recursion on diagonal operators. One assumes knowledge of X <−1> , and then computes the components of V and the next X via a QR-factorization,

JID:LAA

AID:13529 /FLA

[m1L; v1.172; Prn:8/02/2016; 13:43] P.21 (1-36)

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

21

or any other algorithm that computes a row basis using orthogonal transformations (notice that X is not necessarily square and that the backward recursion starts with an empty matrix Xk = [·] for k > N ). A new equivalent representation for A follows, with reduced anti-causal part:



A = d + pV (I − σaV )−1 σX <−1> q + g(I − σ ∗ b)−1 σ ∗ h

(6.16)

aV isometric—a form that is often assumed from the start [1]. This actually pV produces the reduction: with





a q p 0

∼c

aV X <−1> q . 0 pV

(6.17)

(Often one does not directly start from a state realization or generators, but from the matrix itself. In that case the normal form can be directly derived from the matrix through a forward recursion—by determining orthonormal columns of the subsequent Hankel operators, see Appendix A.) A (potentially non-minimal!) realization for V ∗ A is then found by working out the factorization: ⎤ ⎡ ⎤ ⎤⎡ a∗V a∗V p∗V g p∗V d + a∗V X <−1> q I X <−1> q p∗V ⎥ ⎢ ⎥ ⎢ ⎥⎢ Tu := V ∗ A ∼c ⎣ 0 b h I 0 ⎦⎣ b h ⎦! = ⎣ ⎦ (6.18) qV∗ d∗V g d∗V d + qV∗ X <−1> q qV∗ 0 d∗V g d ⎡

Step 2 Next is the determination of U as a left co-inner factor of V ∗ A. Writing a unitary realization for U as

U ∼a

bU hU gU dU

(6.19)

the co-inner, co-outer factorization of V ∗ A is next obtained by the “square root” recursion

∗ b∗U gU ∗ ∗ hU d U



Y Au Y Bu Cu Du

=

Y <−1> 0 Co Do

(6.20)

Here, the sub-index u indicates a realization for Tu , Ro will borrow {Au , Bu } from Tu and Y is a diagonal operator that is recursively computed with a forward recursion. One has to assume knowledge of the initial Y0 (it is empty) and then one can recursively compute all subsequent Y1 , Y2 , . . . . Do must be right invertible, hence it must form a row basis.

JID:LAA

AID:13529 /FLA

22

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

As Au =

[m1L; v1.172; Prn:8/02/2016; 13:43] P.22 (1-36)

 a∗V p∗V g , Y decomposes as Y1 Y2 , and hence 0 b

  Y Au = Y1 a∗V Y1 p∗V g + b , Y Bu = Y1 (p∗V d + a∗V X <−1> q) + Y2 h

∼a

(6.21)

bU hU gU dU

is defined by producing a row basis D0 in:



Y Bu [∼c U ] ∗ dV d + qV∗ X <−1> q ∗

=

0 Do

(6.22)

 according to (6.20), and then formulas for Y1<−1> Y2<−1> follow: 

Y1<−1>

Y2<−1>



 =

b∗U

∗ gU

Y a∗ Y p ∗ g + Y b 1 V 1 V 2 d∗V g qV∗

(6.23)

In the detailed notation of section 3.1 one gets: Y1,k = Qk−1 , Y2,k = Pk−1 , or, globally, Y1 = Q<+1> and Q2 = P <+1> , and then further (again in the notation of the former section, dropping indices): Δ = d∗V d + qV∗ X <−1> q z = Y1 (p∗V d + a∗V X <−1> q) + Y2 h x = Do .

(6.24)

Step 3: LR A crucial observation provides a shortcut to produce the formulas for L and R in the A = LR factorization, assuming that A is invertible and can be factorized. Let us take, as before, the main diagonal of L as I, then we have necessarily

L ∼c

aV F pV I

(6.25)

for a new block diagonal matrix F to be determined—the generators aV and pV are already known. (This is a direct of the minimality of the realization (6.16),

consequence aV and the fact that in both cases defines the observability operator, because A = LR pV and the projection on the causal part must hence be the same—the standard proof can be found, e.g., in [2].) F can be derived directly from the previously computed factors V and U . A remarkable property is that V and U actually determine F uniquely. To see this, one

JID:LAA

AID:13529 /FLA

[m1L; v1.172; Prn:8/02/2016; 13:43] P.23 (1-36)

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

23

remarks that L−1 V has to be lower, and, moreover, there is automatic cancellation of the “quadratic term”:    L−1 V = I − pV [I − σ(aV − F pV )]−1 σF dV + pV (I − σaV )−1 σqV = dV + pV [I − σ(aV − F pV )]−1 σ(qV − F dV )

aV − F p V q V − F dV ∼c pV dV

(6.26)

and it is not too hard to see that the anti-causal U must move this to anti-causal: 



∼c

a V − F p V q V − F dV pV dV

 

∼a

 bU hU gU d U

∈a

(6.27)

because (1) the given realization for L is necessarily minimal, since its observability space cannot be further reduced due to the fact that it is also the observability space of the causal part of A, for which we have a minimal realization, (2) the realizations for L−1 and L−1 V have to be minimal, and (3) R−1 (L−1 V ) = Ro−1 U −1 with both R−1 and Ro−1 required to be upper. This now means (as before with V , but now on the inverses) that there must be a state transformation R such that 



 a q V V ∗ = R−<−1> b∗U R R−<−1> gU I −F p V dV

(6.28)

This equation determines both R and F (after inversion of the unitary factor): ∗ ∗ R<−1> = b∗U Ra∗V + gU qV ∗ ∗ F = −[b∗U Rp∗V + gU dV ]R−<−1> .

(6.29)

From this, one concludes that R = Y1 =!Q<1> (i.e., Rk = Qk−1 ) and hence has to be an invertible matrix. Of course, this proof hinges strongly on the requirements of invertibility of A and outer-ness of L and R. A more refined proof might give conditions when invertibility is not required (as in the ‘counterexample’ in section 7), but has not been derived so far. It follows (as we know already from the detailed proof in the previous sections) that the generator F can be written as a bilinear form, which is guaranteed to exist when the original operator is invertible. However, it is not needed in the derivation: this is an end result. In system theoretic terms, the crucial operator R can be given an interesting geometric interpretation: it is the generalized set of directional cosines between the observability operator of U and the reachability operator of V . The factorization will not exist if part of these spaces are orthogonal. Moreover, the given representation is the best possible, as both spaces are represented by orthonormal bases.

JID:LAA

AID:13529 /FLA

24

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••

[m1L; v1.172; Prn:8/02/2016; 13:43] P.24 (1-36)

Some further considerations on the factorization of the matrix V U V U is a double-sided quasiseparable matrix like the original A, except that it is unitary. Clearly V and U , as defined for A, are for this case just the V and U in the product, and hence the same left factor L will come out, provided the factorization V U is minimal, because the computation for L as given above only uses V and U to compute the result. In a more general case, V U might not be minimal (in an extreme case it could be that U = V ∗ ), and then the factorization would require a reduction, as we did before for A. Assuming a realization for L = I + pV (I − σaV )−1 σqL with qL := F still to be determined, one has Z = V ∗ L = (d∗V + qV∗ σ ∗ (I − a∗V σ ∗ )−1 p∗V ) · (I + pV (I − σaV )−1 σqL ) = d∗V

+ qV∗ σ ∗ (I − a∗V σ ∗ )−1 p∗V + d∗V pV (I − σaV )−1 σqL + qV∗ σ ∗ (I − a∗V σ ∗ )−1 p∗V pV (I − σaV )−1 σqL

(6.30)

the last “quadratic” term being obviously equal to qV∗ [σ ∗ (I − a∗V σ ∗ )−1 a∗V + I + aV (I − σaV )−1 σ]qL , because of the unitarity of the realization of V . It follows that Z = (d∗V + qV∗ qL ) + qV∗ σ ∗ (I − a∗V σ ∗ )−1 (p∗V + a∗V qL )

(6.31)

defining the realization for Z. Z is what is called a “maximal phase” operator in [1]—i.e., an invertible causal operator whose inverse is anti-causal, and section 4 of the present paper shows that it, and its inverse can be factored in elementary factors, a fact that has interest in its own right. 7. The counterexample In this section we show that the invertibility of matrices Qk in (3.12), (3.18) is not necessary for the existence of LR factorization of a matrix when the original matrix is not invertible. Consider the matrix ⎛

1 ⎜0 ⎜ A=⎜ ⎝0 1

1 0 0 1

This matrix admits the LR factorization with

1 0 1 1

⎞ 1 0⎟ ⎟ ⎟. 0⎠ 1

JID:LAA

AID:13529 /FLA

[m1L; v1.172; Prn:8/02/2016; 13:43] P.25 (1-36)

P. Dewilde et al. / Linear Algebra and its Applications ••• (••••) •••–•••



1 ⎜0 ⎜ L=⎜ ⎝0 1

0 1 0 0

0 0 1 0

⎞ 0 0⎟ ⎟ ⎟, 0⎠ 1



1 ⎜0 ⎜ R=⎜ ⎝0 0

1 0 0 0

1 0 1 0

25

⎞ 1 0⎟ ⎟ ⎟. 0⎠ 0

However in the factorization A = V U R0 we have ⎛

1 ⎜0 ⎜ V =⎜ ⎝0 0

0 0 0 1

0 0 1 0

⎛ 1 ⎞ √ − √1 0 2 2 ⎜ √1 √1 1⎟ ⎜ 2 ⎟ 2 ⎟, U = ⎜ ⎝ 0 0⎠ 0 0 0 0

0 0 1 0

⎛√ ⎞ 0 2 ⎜ 0 0⎟ ⎜ ⎟ ⎟ , R0 = ⎜ ⎝ 0 0⎠ 0 1

√ √ 2 2 0 0 0 1 0 0

√ ⎞ 2 0 ⎟ ⎟ ⎟ 0 ⎠ 0

and ⎛

⎞ − √12 0 0 ⎜ 0 0 0 1⎟ ⎜ ⎟ VU =⎜ ⎟. ⎝ 0 0 1 0⎠ √1 √1 00 2 2 √1 2

Here the matrix V U is invertible but not strongly regular. Hence V U does not admit the LR factorization and therefore by Corollary 4.2 some of the matrices Qk are not invertible. 8. Numerical experiments All the numerical experiments have been performed on a laptop with an Intel i5—3470 processor, 3 gigabytes of memory and the machine precision epsilon of 2.2204 e-16 as it is given by the Matlab function eps. All matrices checked are square. In the first experiments (Figs. 1–5), matrices of size 20, 40, . . . , 200 are checked. For each of the 20 matrices of each of these sizes considered, the LR algorithm of the present paper, the book [7] (p. 331, §18.2) LDU algorithm and Matlab’s LU algorithm (with no other parameter than the matrix itself) are performed. However, in some cases the book algorithm does not work at all and then that algorithm is not plotted (Figs. 7, 8). In the last series of experiments (see Figs. 9–11) real square matrices of 8 sizes which are a power of 2, more precisely from N = 32 up to N = 4, 096 are checked. For each of these sizes three kinds of matrices are considered: scalar order one quasiseparable N × N matrices, 2 × 2 block 5-band matrices and 3 × 3 block 7-band matrices. For each size and type we consider 5 random matrices, thus in this last series of experiments a total of 8 · 3 · 5 = 120 are analyzed. Note that the underlying scalar matrices for the size 4, 096 and 3 × 3 blocks are in fact of size 12, 288 × 12, 288. Note also that the 2 × 2 block 5-band matrix could be regarded as a 10-band scalar matrix, while the 3 × 3 block 7-band matrix could be regarded as a


Fig. 1. Plots for matrices with real generators given by rand (between 0 and 1): the time (left) and the relative norm error (right). Zero time entries do not appear. The execution time for the present algorithm does not increase for higher orders. For order 2 the book algorithm errors increase drastically.

We also recall that 5-band matrices are quasiseparable of order 2, while 7-band matrices are quasiseparable of order 3. For each of the 120 matrices, the LR algorithm of the present paper, the LDU algorithm of the book [7] (p. 331, §18.2) and Matlab's LU algorithm are performed. For our purposes, for each considered matrix A a structure containing random quasiseparable generators is built in the following way. For each of the order one quasiseparable N × N matrices, complex quasiseparable generators are built by using the Matlab function rand twice (once for the real part and once for the imaginary part). The diagonal entries d(k), k = 1, ..., N, are also built as complex numbers. The random numbers given by rand lie between 0 and 1. For matrices which are quasiseparable of order 2, each of the 4 complex entries of each quasiseparable generator and of each 2 × 2 diagonal entry is built in the same fashion, except for those entries which ensure that these matrices are 5-band 2 × 2 block matrices. The random generators of order 3 quasiseparable 7-band matrices are built analogously. However, in the particular case of order 2 semiseparable matrices, the 4 entries of the generators a(k), b(k), k = 2, ..., N − 1, are not random, since all these small matrices are in this case the 2 × 2 identity matrix. In the experiments where the book algorithm does not work, the size N of the considered matrices is 580 or 160. A sketch of the generator construction for the scalar order one case is given below.
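As a hedged illustration (this is our own sketch, not the code used for the experiments; the names crand, random_generators and assemble are hypothetical, and we assume the usual order one quasiseparable convention $A_{ij} = p(i)a(i-1)\cdots a(j+1)q(j)$ for $i>j$, $A_{ij} = g(i)b(i+1)\cdots b(j-1)h(j)$ for $i<j$, $A_{ii} = d(i)$), the construction may look as follows:

```python
# A sketch (not the experiments' code) of random complex order one
# quasiseparable generators and of the dense matrix assembled from them.
# Convention assumed: A[i,j] = p(i)a(i-1)...a(j+1)q(j) for i > j,
#                     A[i,j] = g(i)b(i+1)...b(j-1)h(j) for i < j,
#                     A[i,i] = d(i)   (indices 1-based, as in the paper).
import numpy as np

def crand():
    # rand is called twice: once for the real and once for the imaginary part
    return np.random.rand() + 1j * np.random.rand()

def random_generators(N):
    d = {k: crand() for k in range(1, N + 1)}
    p = {k: crand() for k in range(2, N + 1)}
    q = {k: crand() for k in range(1, N)}
    a = {k: crand() for k in range(2, N)}
    g = {k: crand() for k in range(1, N)}
    h = {k: crand() for k in range(2, N + 1)}
    b = {k: crand() for k in range(2, N)}
    return d, p, q, a, g, h, b

def assemble(N, d, p, q, a, g, h, b):
    A = np.zeros((N, N), dtype=complex)
    for i in range(1, N + 1):
        A[i - 1, i - 1] = d[i]
        for j in range(1, i):                   # strictly lower part
            prod = q[j]
            for k in range(j + 1, i):
                prod = a[k] * prod
            A[i - 1, j - 1] = p[i] * prod
        for j in range(i + 1, N + 1):           # strictly upper part
            prod = g[i]
            for k in range(i + 1, j):
                prod = prod * b[k]
            A[i - 1, j - 1] = prod * h[j]
    return A
```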


Fig. 2. Top: matrices with order 2 generators; bottom: order 1. The left figure shows, for matrices with real input from rand, plots of the average absolute value of the $20N^2$ entries of the 20 difference matrices considered. The right figure shows plots of their maximal values. Note that for order one matrices the book algorithm performs better, but for order 2 the present algorithm takes the lead. As in the preceding figure, for order 2 the book algorithm errors increase drastically. Even the worst entry is still on the order of $10^{-12}$.

For each checked matrix and each of the three algorithms we computed a set of parameters which show the time and also the deviation of the matrix A from its computed decomposition. Namely, we computed the matrix A out of its given generators and also, for the book algorithm, we computed the matrices L, D and U and recorded the usual relative 2-norm (relative Euclidean norm) $\|A - LDU\|_2 / \|A\|_2$, as well as the average and the maximal absolute value of the $N^2$ entries of the matrix

$$A - LDU. \tag{8.1}$$

In exact arithmetic all the numbers whose computation is described above would equal zero. For the paper algorithm we also computed (besides the matrix A, which we already have) the matrices L and R out of their quasiseparable generators. We recorded the same parameters, but for the difference matrix

$$A - LR. \tag{8.2}$$


Fig. 3. Order 1 generators with complex rather than real input.

Fig. 4. Condition numbers (the ratio between the largest and the smallest singular values) for real input generators (left) and for complex input (right).


Fig. 5. Orders 1 and 2 semiseparable matrices with complex input. Unlike the preceding figures, this figure is for matrices with scalar entries.

Finally, for the Matlab decomposition algorithm we did the same for the matrix

$$A - LU. \tag{8.3}$$
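A minimal sketch (ours; the function name is hypothetical) of the three recorded error measures, shown for the paper algorithm's difference matrix (8.2); the cases (8.1) and (8.3) are analogous:

```python
# A sketch of the recorded error measures for one factorization, here A ~ LR;
# A, L, R are dense arrays already assembled from their generators.
import numpy as np

def error_measures(A, L, R):
    E = A - L @ R                                           # error matrix (8.2)
    rel_norm = np.linalg.norm(E, 2) / np.linalg.norm(A, 2)  # relative 2-norm
    avg_abs = np.mean(np.abs(E))                            # average |entry|
    max_abs = np.max(np.abs(E))                             # worst |entry|
    return rel_norm, avg_abs, max_abs
```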

Note that the lower triangular matrix L produced by Matlab differs from the L of the other two algorithms, and it must be built up again from the corresponding data of each of the three methods. The R matrix should equal the product DU. We call the difference matrices (8.1)–(8.3) error matrices, and we plotted their norms relative to the norm of A. We also plotted for each of them the average absolute entry. In our first series of experiments we completely checked four kinds of random non-band matrices: those with real input generators as opposed to those with complex input, both for order one (scalar) quasiseparable generators and for order two quasiseparable generators. For each of these four cases we checked different sizes of matrices, for instance the 10 sizes 20, 40, ..., 200 or, in other experiments, the 25 sizes 200, 400, ..., 5000. We repeated each experiment 20 times (i.e., for 20 input matrices) and plotted in the respective figures the averages over the 20 matrices of that particular size, order and input field. This means that in this first series of experiments we analyzed in total 2 · 2 · (10 + 25) · 20 = 2,800 input matrices.


Fig. 6. Execution times for the Matlab, paper and book algorithms for matrices with 2 × 2 block entries and order 2 quasiseparable generators, of sizes 200, 400, ..., 5000 blocks. The time increases linearly, as opposed to $N^2$ for Matlab, and is much lower than the Matlab time for larger matrices. Among other reasons, Matlab uses 400 MB of memory for a matrix of size 10,000 just to keep the input matrix and the results, while our algorithms use only 0.3 MB.

In most of the figures we plotted each order of the blocks and generators, namely order one (almost always below) and order two (almost always above). For each field of scalars (real or complex) we built separate figures. Note that for order 2 the size is in fact up to 400 and 10,000, respectively. In those of the accompanying figures which also deal with order 2 matrices, the measured results (both time and error) are higher than the corresponding results for matrices with scalar entries and scalar generators. Considerable time and relative error are saved when one works with real random generators as opposed to complex ones. The figures illustrate that the time for the paper algorithm is larger than the time needed for the book algorithm; for order 2 matrices, however, the time does not increase further for the paper algorithm, while the time for the book algorithm does increase. Figs. 1, 2 show the data for matrices with order 2 generators (top) and with order 1 generators (bottom). These figures are for matrices with real input from random generators given by rand. We plotted the average absolute value of the $N^2$ entries of the 20 difference matrices of size N considered, namely $20N^2$ absolute values of entries of A − LR for the paper algorithm and of A − LDU for the book algorithm. The right part of each figure shows the maximal (worst) absolute value of the $20N^2$ entries of the respective difference matrices.


Fig. 7. For matrices that are not strongly regular, for instance matrices for which the first two columns are equal, only data for the paper algorithm is plotted.

Fig. 8. The order 2 case: plots of the mean entry and relative norm of the difference matrix A − LR versus Matlab’s corresponding A − LU . In the left plot, column 2 of the matrix is null. In the right plot it is 5 times column 1.


Fig. 9. Matlab time for size 12,288 matrices is almost 38 seconds, while for the paper algorithm it is almost 1.5 s.

Fig. 3 does the same when the input generators are complex. For order 2, the farther an entry lies from the diagonal, the more of the generators a(k), k = 2, ..., N − 1, are multiplied together. This produces huge entries. Such matrices, which behave in the opposite way to the band matrices that appear in applications, are not interesting to study. Fig. 4 shows how bad the condition numbers are for non-band matrices of sizes up to 400. Fig. 5 presents data on semiseparable matrices of orders 1 and 2. In both cases the algorithm works very well, with an error in the range of the machine precision. Fig. 6 shows that the execution time of the developed linear complexity algorithms is indeed linear, and that these algorithms are essentially faster than the standard Matlab procedure. Figs. 7 and 8 are for matrices that are not strongly regular, for instance matrices whose first two columns are equal (which happens if $d_1 = g_1 h_2$, $d_2 = p_2 q_1$, $q_2 = a_2 q_1$). In that case the book algorithm does not work at all. Matlab reports: "Warning: Matrix is singular to working precision" and then: "??? Error using ==> norm Input to SVD must not contain NaN or Inf". Whenever the algorithm in the book [7] is not applicable we checked only the present paper algorithm against Matlab. In order to check matrices that are not strongly regular, for which the book algorithm does not work at all, we made some experiments introducing two proportional columns (shown in the last two figures). This is achieved by choosing

$$d(1) = \frac{1}{\tau}\, g(1) h(2), \qquad d(2) = \tau\, p(2) q(1), \qquad q(2) = \tau\, a(2) q(1),$$

where τ is a complex nonzero number; then column 2 is τ times column 1. A sketch of this construction is given below.
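The following sketch reuses the hypothetical random_generators and assemble helpers from the earlier sketch (so, again, ours and not the authors' code):

```python
# Sketch: force column 2 to be tau times column 1 by overriding d(1), d(2)
# and q(2) as in the displayed formulas; random_generators and assemble are
# the hypothetical helpers from the earlier sketch.
import numpy as np

N, tau = 200, 5.0
d, p, q, a, g, h, b = random_generators(N)
d[1] = g[1] * h[2] / tau
d[2] = tau * p[2] * q[1]
q[2] = tau * a[2] * q[1]
A = assemble(N, d, p, q, a, g, h, b)
# Column 2 now equals tau times column 1, so A is not strongly regular:
print(np.max(np.abs(A[:, 1] - tau * A[:, 0])))   # ~ 0 up to rounding
```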


Fig. 10. Plots of the average absolute value among all the $5N^2$ entries of the difference matrix A − LR versus Matlab's corresponding A − LU and the book's A − LDU.

Fig. 11. Plots of the relative norm of the difference matrix A − LR versus Matlab's corresponding A − LU and the book's A − LDU, all divided by the norm of A.


In other experiments we introduced an almost null column by scaling h(3), d(3) and q(3) by a very small value. In still other experiments we forced two columns to be almost zero. In these cases, only the remaining quasiseparable generators are random. When the second column equals (as above) τ = 5 times the first one, for instance for the order 2 semiseparable matrix which has $a(k) = b(k) = I_2$, the other quasiseparable generators $d(k) = I_2$, and the first entries of p(k), q(k), g(k), h(k) equal to

$$1 - 1/k, \quad 1 - 2/k, \quad 1 - 5/k, \quad 1 - 3/k, \qquad k = 1, \ldots, N,\ k \neq 2,$$

respectively, while their second entries are minus the first entries, the book algorithm does not work for any size N of the matrix (and not only for the plotted sizes of 580 and 160), while the present paper and the Matlab algorithms both do.

The main conclusion is that our algorithms are some orders of magnitude faster than Matlab for larger matrices (see Fig. 9), while for smaller matrices Matlab is faster. Zero time entries are not plotted. Although Matlab is more exact (see Fig. 10), one can see that even the worst of our results, even for matrices of 4,096 × 4,096 blocks of size 3 × 3, still gives a precision of 9 decimal digits. The figures illustrate that the time for the paper algorithm is larger than the time needed for the book algorithm, although the order of the generators does not increase; between order 2 and order 3 quasiseparable matrices the time does not increase further for these algorithms, while the time for the Matlab algorithm increases drastically, since the matrix is twice as large. Of course, the memory for both the book and the paper algorithms is linear in the size of the matrix A, which is the square root of the memory used by Matlab. We plotted in Fig. 10 the average absolute value of the entries of the 5 difference matrices of size N considered, namely $5N^2$ absolute values of entries, in A − LU for the Matlab algorithm, in A − LR given by the paper algorithm and in A − LDU for the book algorithm, and in Fig. 11 the corresponding relative norms. Matlab gains a lot concerning its errors, since for the other algorithms we first need to build the full matrices A, L, R, D and U out of their quasiseparable generators in order to check the algorithms' errors, and building the full matrices already introduces collateral errors besides the true errors of the checked algorithms.

Appendix A

In this section we show that the (orthonormal) backward recursion to reduce $\begin{pmatrix} p \\ a \end{pmatrix}$ to an isometric normal form can be replaced by an orthonormal forward recursion, when it is combined with a generator identification process. In general the cost of this operation would be a prohibitive $O(n^2)$, where n is the dimension of the matrix, but in many cases it can be reduced to $O(n)$. The latter will be the case when it is known a priori that the maximal state dimension (the maximal dimension of the matrices $a_k$) is at most a small constant δ, and that it is achieved with an equally small number of consecutive columns


in the matrix A. This will, in particular, be the case with band matrices, and with many matrices resulting from 1D or 2D finite difference or finite element problems. The method is based on the properties of the so called lower "Hankel operators" related to the matrix A. The kth lower Hankel operator of A is defined as $H_k := A(k{:}N, 1{:}k{-}1)$ (in MATLAB notation). Their ranks determine the subsequent dimensions of the matrices $a_k$. The principle is: determine an orthonormal basis for the subsequent lower Hankel operators, starting with the first block-column and then keeping on adding block-columns and deleting irrelevant rows. The property used is:

Proposition A.1. Suppose that $H_{k-1} = Q_{k-1} R_{k-1}$ is a partial QR-decomposition of $H_{k-1}$, in which $Q_{k-1}$ is isometric and its columns form a basis for the range of $H_{k-1}$, and let $\bar{Q}_{k-1} := Q_{k-1,\,k:N}$; then the QR-decomposition $H_k = Q_k R_k$ is such that

$$\begin{bmatrix} \bar{Q}_{k-1} & A_{k:N,\,k-1} \end{bmatrix} = Q_k \begin{bmatrix} a_{k-1} & q_{k-1} \end{bmatrix}. \tag{A.4}$$

In addition, $p_k$ can be chosen minimally as $p_k := Q_{k,k}$.

The full proof of the proposition is in [2]. One can check the correctness of the expression by assuming the existence of a realization with isometric $\begin{pmatrix} p \\ a \end{pmatrix}$, in which case one has

$$\begin{bmatrix} \bar{Q}_{k-1} & A_{k:N,\,k-1} \end{bmatrix} = \begin{bmatrix} p_k a_{k-1} & p_k q_{k-1} \\ p_{k+1} a_k a_{k-1} & p_{k+1} a_k q_{k-1} \\ \vdots & \vdots \end{bmatrix} = \begin{bmatrix} p_k \\ p_{k+1} a_k \\ \vdots \end{bmatrix} \begin{bmatrix} a_{k-1} & q_{k-1} \end{bmatrix}, \tag{A.5}$$

the latter being the QR-factorization sought, because the left factor is isometric by assumption. The point is that the subsequent determinations of the realization can be obtained from the previous assessment of the basis $Q_{k-1}$ and the new column in A (namely $A_{k:N,k-1}$). Whether this is really efficient will depend on the original matrix and on whether it has a low-order realization, but it is probably the best one can hope for if one has to go forward (a priori knowledge of a realization which is not in ONF would require a backward recursion). The eventual efficiency is due to the potentially (much) lower dimension of the basis as compared to the Hankel operator itself. If it is known that the state dimension is no more than δ, then the computational complexity would be at most of the order of $N^2\delta^2$; not terrific (with the backward recursion, assuming a realization is known, it is only $N\delta^3$), but better than $N^3$! In special cases the situation is much better; e.g., in the case of band matrices with non-singular highest order off-diagonal blocks, the number of consecutive columns needed is just 1. If it is known beforehand that a basis is reached with just partial columns, then the complexity of the identification drops further. A sketch of the forward recursion in the scalar case follows.
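The following sketch (ours, for the scalar case on a dense input matrix) implements the forward identification recursion of Proposition A.1; instead of a rank-revealing QR it uses an SVD to obtain an isometric basis of each lower Hankel operator, which is a convenience assumption rather than the authors' implementation.

```python
# A sketch (scalar case, dense input) of the forward identification recursion
# of Proposition A.1. In 0-based indexing the k-th lower Hankel operator is
# H_k = A[k-1:, :k-1]. An SVD is used instead of a rank-revealing QR to get
# an isometric basis Q_k; any such basis serves the identification.
import numpy as np

def forward_lower_generators(A, tol=1e-12):
    """Lower generators p, q, a with A[i-1, j-1] = p[i] a[i-1] ... a[j+1] q[j]
    for i > j (1-based indices kept in dictionaries, as in the paper)."""
    N = A.shape[0]
    p, q, a = {}, {}, {}
    Qbar = np.zeros((N - 1, 0))          # previous basis with its first row dropped
    for k in range(2, N + 1):
        # columns: previous basis restricted to rows k..N, plus A(k:N, k-1)
        M = np.hstack([Qbar, A[k - 1:, k - 2:k - 1]])
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        r = int((s > tol * max(1.0, s[0])).sum())   # numerical rank
        Qk = U[:, :r]                    # isometric basis of H_k
        C = s[:r, None] * Vt[:r]         # M = Qk @ C (up to rounding)
        a[k - 1] = C[:, :-1]             # a_{k-1}
        q[k - 1] = C[:, -1:]             # q_{k-1}
        p[k] = Qk[:1, :]                 # p_k := first row of Q_k, chosen minimally
        Qbar = Qk[1:, :]                 # drop the row corresponding to index k
    return p, q, a

# Usage sketch: reconstruct one strictly lower entry from the generators.
A = np.tril(np.random.rand(6, 6), -1)
p, q, a = forward_lower_generators(A)
i, j = 5, 2
prod = q[j]
for m in range(j + 1, i):
    prod = a[m] @ prod
print(A[i - 1, j - 1], (p[i] @ prod)[0, 0])   # the two values should agree
```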


References

[1] P.M. Dewilde, On the LU factorizations of infinite systems of semi-separable equations, Indag. Math. 23 (2012) 1028–1052.
[2] P.M. Dewilde, A.J. van der Veen, Time-Varying Systems and Computations, Kluwer, 1998.
[3] K. Jainandunsing, Ed F. Deprettere, A new class of parallel algorithms for solving systems of linear equations, SIAM J. Sci. Statist. Comput. 10 (5) (1989) 880–912.
[4] S. Chandrasekaran, P. Dewilde, M. Gu, T. Pals, X. Sun, A.J. van der Veen, D. White, Some fast algorithms for sequentially semi-separable representations, SIAM J. Matrix Anal. Appl. 27 (2) (2005) 341–364.
[5] Y. Eidelman, I. Gohberg, Fast inversion algorithms for a class of block structured matrices, in: V. Olshevsky (Ed.), Structured Matrices in Mathematics, Computer Science and Engineering. II, Contemp. Math., vol. 281, American Mathematical Society, Providence, RI, 2001, pp. 17–38.
[6] Y. Eidelman, I. Gohberg, A modification of the Dewilde–van der Veen method for inversion of finite structured matrices, Linear Algebra Appl. 385 (2004) 149–185.
[7] Y. Eidelman, I. Gohberg, I. Haimovici, Separable Type Representations of Matrices and Fast Algorithms. Volume 1. Basics, Completion Problems, Multiplication and Inversion Algorithms, Birkhäuser, 2014.
[8] B.D.O. Anderson, S. Vongpanitlerd, Network Analysis and Synthesis, Prentice Hall, 1973.
[9] B.D.O. Anderson, J.B. Moore, Optimal Filtering, Prentice Hall, 1979.
[10] F.M. Dopico, C.R. Johnson, J.M. Molera, Multiple LU factorizations of a singular matrix, Linear Algebra Appl. 419 (2006) 24–36.
[11] A.E. Frazho, M.A. Kaashoek, Canonical factorization of rational matrix functions. A note on a paper by P. Dewilde, Indag. Math. 23 (2012) 1154–1164.
[12] I. Gohberg, T. Kailath, I. Koltracht, Efficient solution of linear systems with recursive structure, Linear Algebra Appl. 80 (1986) 81–113.
[13] R. Vandebril, M. Van Barel, N. Mastronardi, Matrix Computations and Semiseparable Matrices: Eigenvalue and Singular Value Methods, The Johns Hopkins University Press, 2008.
[14] J. Xia, S. Chandrasekaran, M. Gu, X.S. Li, Fast algorithms for hierarchically semiseparable matrices, Numer. Linear Algebra Appl. 17 (2010) 953–976.