Applied Mathematics and Computation 142 (2003) 79–97 www.elsevier.com/locate/amc
Condition number of Bott–Duffin inverse and their condition numbers
Yimin Wei a,*, Wei Xu b
a Department of Mathematics, Fudan University, Shanghai 200433, China
b Institute of Mathematics, Fudan University, Shanghai 200433, China
Abstract
Various normwise relative condition numbers that measure the sensitivity of the Bott–Duffin inverse and of the solution of constrained linear systems are characterized. The sensitivity of the condition number itself is then investigated. Finally, upper bounds are derived for the sensitivity of componentwise condition numbers. © 2002 Elsevier Science Inc. All rights reserved.
Keywords: Condition number; Bott–Duffin inverse; Sensitivity
* Corresponding author. E-mail address: [email protected] (Y. Wei).
0096-3003/02/$ - see front matter © 2002 Elsevier Science Inc. All rights reserved. doi:10.1016/S0096-3003(02)00285-0
1. Introduction
Constrained linear systems of the form
\[
  Ax + y = b, \qquad x \in L,\; y \in L^{\perp}, \tag{1}
\]
where $A \in \mathbb{R}^{n \times n}$, $b \in \mathbb{R}^{n}$ and $L$ is a subspace of $\mathbb{R}^{n}$, arise in electrical network theory [2]. It is easy to show that the consistency of (1) is equivalent to the consistency of the equation
\[
  (A P_L + P_{L^{\perp}}) z = b, \tag{2}
\]
and that $(x, y)$ is a solution of (1) if and only if
\[
  x = P_L z, \qquad y = P_{L^{\perp}} z = b - A P_L z, \tag{3}
\]
Y. Wei, W. Xu / Appl. Math. Comput. 142 (2003) 79–97
where $z$ is a solution of (2), and $P_L$ and $P_{L^{\perp}}$ are the orthogonal projections onto $L$ and $L^{\perp}$, respectively. If $A P_L + P_{L^{\perp}}$ is nonsingular then, for any given $b \in \mathbb{R}^{n}$, the unique solution of (1) is $x = P_L (A P_L + P_{L^{\perp}})^{-1} b$ and $y = b - Ax$. The definition of the Bott–Duffin inverse is given below.
Definition 1.1. Given $A \in \mathbb{R}^{n \times n}$ and a subspace $L \subseteq \mathbb{R}^{n}$, if $A P_L + P_{L^{\perp}}$ is nonsingular then the Bott–Duffin inverse of $A$ with respect to $L$ is
\[
  A^{(-1)}_{(L)} = P_L (A P_L + P_{L^{\perp}})^{-1}. \tag{4}
\]
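Definition 1.1 can be illustrated by a minimal sketch in exact rational arithmetic. The matrix `A`, the projector `P` (onto $L = \mathrm{span}\{e_1, e_2\}$), the right-hand side `b` and all helper names below are assumptions made for this example; they are not taken from the paper.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def inverse(M):
    """Exact Gauss-Jordan inverse over the rationals; raises if M is singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[piv][col] == 0:
            raise ValueError("A P_L + P_Lperp is singular")
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def bott_duffin_inverse(A, P):
    """Evaluate P (A P + (I - P))^{-1}, with P the orthogonal projector onto L."""
    n = len(A)
    AP = matmul(A, P)
    M = [[AP[i][j] + (int(i == j) - P[i][j]) for j in range(n)] for i in range(n)]
    return matmul(P, inverse(M))


# Illustrative data (assumed): L = span{e1, e2} in R^3.
A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
P = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
b = [1, 2, 3]

X = bott_duffin_inverse(A, P)                                    # formula (4)
x = [sum(X[i][j] * b[j] for j in range(3)) for i in range(3)]     # x = X b, lies in L
y = [b[i] - sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]  # y = b - A x
print(x, y)
```

In this example `x` has a zero third component and `y` has zero first and second components, which is the constraint $x \in L$, $y \in L^{\perp}$ of system (1).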
With this definition, the following characterizations are easily obtained.
Theorem 1.1 [1]. If $A P_L + P_{L^{\perp}}$ is nonsingular, then
\[
  A^{(-1)}_{(L)} = P_L A^{(-1)}_{(L)} = A^{(-1)}_{(L)} P_L, \qquad
  R(A^{(-1)}_{(L)}) = L, \qquad N(A^{(-1)}_{(L)}) = L^{\perp}, \qquad
  A^{(-1)}_{(L)} = A^{(2)}_{L,L^{\perp}} = (P_L A P_L)^{+}, \tag{5}
\]
where $R(A^{(-1)}_{(L)})$ and $N(A^{(-1)}_{(L)})$ denote the range and the null space of $A^{(-1)}_{(L)}$, respectively, and $(P_L A P_L)^{+}$ is the pseudoinverse (or Moore–Penrose generalized inverse) of $P_L A P_L$.
The classical normwise relative condition number measures the sensitivity of a matrix inverse. In this paper we discuss the normwise relative condition number of the Bott–Duffin inverse. Given $A \in \mathbb{R}^{n \times n}$ and a matrix norm $\|\cdot\|$, this condition number is defined as
\[
  \mathrm{cond}^{BD}(A) := \lim_{\epsilon \to 0^{+}} \sup_{\|\Delta A\| \le \epsilon \|A\|}
  \frac{\|(A + \Delta A)^{(-1)}_{(L)} - A^{(-1)}_{(L)}\|}{\epsilon \, \|A^{(-1)}_{(L)}\|}. \tag{6}
\]
Note that, in order to reduce the sensitivity measure to a single number, two specifications have been introduced:
(1) We look at the largest relative change in $A^{(-1)}_{(L)}$ compared with a relative change in $A$ of size $\epsilon$.
(2) We take the limit as $\epsilon \to 0^{+}$.
Hence a condition number records the worst-case sensitivity to small perturbations. When the matrix norm is induced by a vector norm, we can prove that $\mathrm{cond}^{BD}(A)$ has the characterization
\[
  \mathrm{cond}^{BD}(A) = \kappa(A) := \|A\| \, \|A^{(-1)}_{(L)}\|. \tag{7}
\]
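As a small worked illustration of (4) and (7), consider the 2-norm in dimension two. The example is not taken from the paper: the choice $n = 2$, $L = \mathrm{span}\{e_1\}$ and the assumption $a_{11} \neq 0$ are made here only for concreteness.

```latex
% Illustrative example (assumed data): n = 2, L = span{e_1}, a_{11} \neq 0.
\[
  A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \quad
  P_L = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad
  A P_L + P_{L^{\perp}} = \begin{bmatrix} a_{11} & 0 \\ a_{21} & 1 \end{bmatrix},
\]
\[
  A^{(-1)}_{(L)} = P_L \begin{bmatrix} 1/a_{11} & 0 \\ -a_{21}/a_{11} & 1 \end{bmatrix}
                 = \begin{bmatrix} 1/a_{11} & 0 \\ 0 & 0 \end{bmatrix},
  \qquad
  \kappa_2(A) = \|A\|_2 \, \|A^{(-1)}_{(L)}\|_2 = \frac{\|A\|_2}{|a_{11}|}.
\]
```

In this instance the Bott–Duffin inverse exists whenever $a_{11} \neq 0$, even if $A$ itself is singular.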
Since the constrained linear system (1) is widely applied in electrical network theory, it is of interest to define the corresponding condition number for the system $Ax + y = b$, $x \in L$, $y \in L^{\perp}$:
\[
  \mathrm{cond}^{BD}(A, b) := \lim_{\epsilon \to 0^{+}}
  \sup_{\substack{\|\Delta A\| \le \epsilon \|A\| \\ \|\Delta b\| \le \epsilon \|b\|}}
  \frac{\|(A + \Delta A)^{(-1)}_{(L)} (b + \Delta b) - A^{(-1)}_{(L)} b\|}{\epsilon \, \|A^{(-1)}_{(L)} b\|}. \tag{8}
\]
Here we measure the sensitivity of the solution $x$ to relative perturbations in $A$ and $b$. We prove below that
\[
  \mathrm{cond}^{BD}(A, b) = \kappa(A) + \frac{\|A^{(-1)}_{(L)}\| \, \|b\|}{\|A^{(-1)}_{(L)} b\|} \tag{9}
\]
when $\|\cdot\|$ in (8) denotes any vector norm together with the matrix norm it induces. It follows from (7) and (9) that
\[
  \mathrm{cond}^{BD}(A) \le \mathrm{cond}^{BD}(A, b). \tag{10}
\]
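The step from (7) and (9) to (10) can be spelled out as follows. This is a short sketch added for the reader; it relies only on the consistency of an induced norm and assumes $A^{(-1)}_{(L)} b \neq 0$.

```latex
\[
  \|A^{(-1)}_{(L)} b\| \le \|A^{(-1)}_{(L)}\| \, \|b\|
  \quad\Longrightarrow\quad
  \frac{\|A^{(-1)}_{(L)}\| \, \|b\|}{\|A^{(-1)}_{(L)} b\|} \ge 1
  \quad\Longrightarrow\quad
  \mathrm{cond}^{BD}(A, b) \ge \kappa(A) + 1 > \mathrm{cond}^{BD}(A).
\]
```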
Condition numbers are useful for two distinct reasons. When the data $\{A, b\}$ contain errors, either experimental or numerical, the condition number bounds the level of uncertainty inherent in the solution before a numerical algorithm is applied. Also, when combined with a backward error estimate, it provides an approximate upper bound on the error in a computed solution.
Our notation for vector and matrix norms is as follows. The Hölder vector $p$-norm is written $\|\cdot\|_p$, that is, $\|x\|_p := (\sum_{i=1}^{n} |x_i|^p)^{1/p}$ $(1 \le p < \infty)$ and $\|x\|_{\infty} := \max_{1 \le i \le n} |x_i|$. Given arbitrary vector norms $\|\cdot\|_{\alpha}$ and $\|\cdot\|_{\beta}$, we define the corresponding operator norm $\|\cdot\|_{\alpha,\beta}$ by [4]
\[
  \|A\|_{\alpha,\beta} := \max_{\|x\|_{\alpha} = 1} \|Ax\|_{\beta}. \tag{11}
\]
Note that, in general, the submultiplicative property $\|AB\|_{\alpha,\beta} \le \|A\|_{\alpha,\beta} \|B\|_{\alpha,\beta}$ does not hold, but we do have
\[
  \|AB\|_{\alpha,\beta} \le \|A\|_{\gamma,\beta} \, \|B\|_{\alpha,\gamma} \tag{12}
\]
for any third vector norm $\|\cdot\|_{\gamma}$. If $\alpha = 1$ and $\beta = \infty$, we obtain the max norm
\[
  \|A\|_{1,\infty} = \|A\|_{\max} := \max_{1 \le i,j \le n} |a_{ij}|.
\]
For simplicity, we write the induced norm $\|\cdot\|_{\alpha,\alpha}$ as $\|\cdot\|_{\alpha}$. The Frobenius norm is defined by $\|A\|_F := (\sum_{i,j=1}^{n} a_{ij}^2)^{1/2}$, and we recall that if
\[
  A = \begin{bmatrix} U_1 & U_2 \end{bmatrix}
      \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix}
      \begin{bmatrix} V_1 & V_2 \end{bmatrix}^{T},
\]
where $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_k)$, $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_k > 0$, $[\,U_1 \; U_2\,]$ and $[\,V_1 \; V_2\,]$ are orthogonal matrices and $k = \mathrm{rank}(A)$, is an SVD of $A$ [4], then the Frobenius and spectral norms satisfy $\|A\|_F = (\sum_{i=1}^{k} \sigma_i^2)^{1/2}$ and $\|A\|_2 = \sigma_1$.
In the following section we give characterizations of the condition number $\mathrm{cond}^{BD}(A)$ in (6) that arise when the F-norm or max-norm is used, and also when a general $\|\cdot\|_{\alpha,\beta}$ norm is used to measure $A$, with $\|\cdot\|_{\beta,\alpha}$ measuring $A^{(-1)}_{(L)}$. In Section 3 we characterize $\mathrm{cond}^{BD}(A, b)$ in (8) when $\|\cdot\|_2$ is used to measure $x$ and $b$, and $\|\cdot\|_F$ is used for $A$. We also characterize the case where $\|\cdot\|_{\alpha}$ and $\|\cdot\|_{\beta}$ are used to measure $x$ and $b$, respectively, and $\|\cdot\|_{\alpha,\beta}$ is used to measure $A$. In Section 4 we prove that the condition numbers are approximately as sensitive as the original problems that they describe. Finally, in Section 5 we review results on componentwise condition numbers and derive upper bounds on their sensitivity. When the matrix is nonsingular, our results cover those of Higham in [5].
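2. Bott–Duffin inversion
We begin this section with a characterization of the Frobenius-norm version of $\mathrm{cond}^{BD}(A)$ in (6).
Theorem 2.1. The condition number
\[
  \mathrm{cond}^{BD}_F(A) := \lim_{\epsilon \to 0^{+}} \sup_{\|\Delta A\|_F \le \epsilon \|A\|_F}
  \frac{\|(A + \Delta A)^{(-1)}_{(L)} - A^{(-1)}_{(L)}\|_F}{\epsilon \, \|A^{(-1)}_{(L)}\|_F} \tag{13}
\]
satisfies
\[
  \mathrm{cond}^{BD}_F(A) = \frac{\|A\|_F \, \|A^{(-1)}_{(L)}\|_2^{2}}{\|A^{(-1)}_{(L)}\|_F}. \tag{14}
\]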
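Proof. With $\|\Delta A\|_F \le \epsilon \|A\|_F$, neglecting $O(\epsilon^2)$ terms in a standard expansion [7] gives
\[
  (A + \Delta A)^{(-1)}_{(L)} - A^{(-1)}_{(L)} = -A^{(-1)}_{(L)} \, \Delta A \, A^{(-1)}_{(L)}.
\]
Hence, the result is proved if we can show that
\[
  \sup_{\|\widehat{\Delta A}\|_F \le 1} \|A^{(-1)}_{(L)} \, \widehat{\Delta A} \, A^{(-1)}_{(L)}\|_F = \|A^{(-1)}_{(L)}\|_2^{2}. \tag{15}
\]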
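From the inequalities $\|BC\|_F \le \|B\|_2 \|C\|_F$ and $\|BC\|_F \le \|B\|_F \|C\|_2$, we get
\[
  \|A^{(-1)}_{(L)} \, \widehat{\Delta A} \, A^{(-1)}_{(L)}\|_F
  \le \|A^{(-1)}_{(L)}\|_2 \, \|\widehat{\Delta A} \, A^{(-1)}_{(L)}\|_F
  \le \|A^{(-1)}_{(L)}\|_2 \, \|\widehat{\Delta A}\|_F \, \|A^{(-1)}_{(L)}\|_2.
\]
Hence
\[
  \sup_{\|\widehat{\Delta A}\|_F \le 1} \|A^{(-1)}_{(L)} \, \widehat{\Delta A} \, A^{(-1)}_{(L)}\|_F \le \|A^{(-1)}_{(L)}\|_2^{2}.
\]
Next we show that equality is attainable. Let $\mathrm{rank}(A^{(-1)}_{(L)}) = r \le n$. Since $A^{(-1)}_{(L)} = (P_L A P_L)^{+}$, we may write
\[
  A^{(-1)}_{(L)} = \begin{bmatrix} U_1 & U_2 \end{bmatrix}
                  \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix}
                  \begin{bmatrix} V_1 & V_2 \end{bmatrix}^{T},
\]
an SVD of $A^{(-1)}_{(L)}$, where $[\,U_1 \; U_2\,]$ and $[\,V_1 \; V_2\,]$ are orthogonal and $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_r)$. Let $\widehat{\Delta A} = [\,V_1 \; V_2\,] \, e_1 e_1^{T} \, [\,U_1 \; U_2\,]^{T}$. Then
\[
  A^{(-1)}_{(L)} \, \widehat{\Delta A} \, A^{(-1)}_{(L)}
  = \begin{bmatrix} U_1 & U_2 \end{bmatrix}
    \mathrm{diag}(\sigma_1^{2}, 0, \ldots, 0)
    \begin{bmatrix} V_1 & V_2 \end{bmatrix}^{T},
\]
so that $\|A^{(-1)}_{(L)} \, \widehat{\Delta A} \, A^{(-1)}_{(L)}\|_F = \sigma_1^{2} = \|A^{(-1)}_{(L)}\|_2^{2}$, as required.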
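The product in the attainability step can be verified directly from the orthogonality of the singular vectors. The lines below are a sketch added for the reader; the symbols $u_1$ and $v_1$ (the first columns of $[\,U_1 \; U_2\,]$ and $[\,V_1 \; V_2\,]$) are notation introduced here.

```latex
\[
  \widehat{\Delta A} = v_1 u_1^{T}, \qquad
  \|\widehat{\Delta A}\|_F = \|v_1\|_2 \, \|u_1\|_2 = 1, \qquad
  A^{(-1)}_{(L)} v_1 = \sigma_1 u_1, \qquad
  u_1^{T} A^{(-1)}_{(L)} = \sigma_1 v_1^{T},
\]
\[
  A^{(-1)}_{(L)} \, \widehat{\Delta A} \, A^{(-1)}_{(L)}
  = (A^{(-1)}_{(L)} v_1)(u_1^{T} A^{(-1)}_{(L)})
  = \sigma_1^{2} \, u_1 v_1^{T},
  \qquad
  \|\sigma_1^{2} \, u_1 v_1^{T}\|_F = \sigma_1^{2}.
\]
```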
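Using the inequalities $\|A\|_2 \le \|A\|_F \le \sqrt{n} \, \|A\|_2$, it follows that
\[
  \frac{\kappa_F(A)}{n} \le \mathrm{cond}^{BD}_F(A) \le \kappa_F(A),
\]
where $\kappa_F(A) := \|A\|_F \, \|A^{(-1)}_{(L)}\|_F$. Next we characterize the condition number that arises when $\|\cdot\|_{\max}$ is used in (6).
Theorem 2.2. The condition number
\[
  \mathrm{cond}^{BD}_{\max}(A) := \lim_{\epsilon \to 0^{+}} \sup_{\|\Delta A\|_{\max} \le \epsilon \|A\|_{\max}}
  \frac{\|(A + \Delta A)^{(-1)}_{(L)} - A^{(-1)}_{(L)}\|_{\max}}{\epsilon \, \|A^{(-1)}_{(L)}\|_{\max}} \tag{16}
\]
satisfies
\[
  \mathrm{cond}^{BD}_{\max}(A) = \frac{\|A\|_{\max} \, \|A^{(-1)}_{(L)}\|_{\infty} \, \|A^{(-1)}_{(L)}\|_{1}}{\|A^{(-1)}_{(L)}\|_{\max}}. \tag{17}
\]
Proof. By analogy with the proof of Theorem 2.1, we must prove that
\[
  \sup_{\|\widehat{\Delta A}\|_{\max} \le 1} \|A^{(-1)}_{(L)} \, \widehat{\Delta A} \, A^{(-1)}_{(L)}\|_{\max}
  = \|A^{(-1)}_{(L)}\|_{\infty} \, \|A^{(-1)}_{(L)}\|_{1}. \tag{18}
\]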
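Using (12) gives
\[
  \|A^{(-1)}_{(L)} \, \widehat{\Delta A} \, A^{(-1)}_{(L)}\|_{\max}
  \le \|A^{(-1)}_{(L)}\|_{\infty} \, \|\widehat{\Delta A} \, A^{(-1)}_{(L)}\|_{1,\infty}
  \le \|A^{(-1)}_{(L)}\|_{\infty} \, \|\widehat{\Delta A}\|_{1,\infty} \, \|A^{(-1)}_{(L)}\|_{1,1}
  \le \|A^{(-1)}_{(L)}\|_{\infty} \, \|A^{(-1)}_{(L)}\|_{1}.
\]
To show that equality is attainable, let $r_i^{T}$ and $c_j$ denote the $i$th row and $j$th column of $A^{(-1)}_{(L)}$, respectively. Suppose that $k$ and $l$ are such that $\|r_k\|_1 = \|A^{(-1)}_{(L)}\|_{\infty}$ and $\|c_l\|_1 = \|A^{(-1)}_{(L)}\|_{1}$. Let $E$ denote the matrix of ones and let $D_1 = \mathrm{diag}(\mathrm{sign}(r_k))$, $D_2 = \mathrm{diag}(\mathrm{sign}(c_l))$. Then choosing $\widehat{\Delta A} = D_1 E D_2$ gives $\|\widehat{\Delta A}\|_{\max} = 1$ and
\[
  \|A^{(-1)}_{(L)} \, \widehat{\Delta A} \, A^{(-1)}_{(L)}\|_{\max}
  \ge |(A^{(-1)}_{(L)} \, \widehat{\Delta A} \, A^{(-1)}_{(L)})_{k,l}|
  = |r_k|^{T} E \, |c_l|
  = \|r_k\|_1 \, \|c_l\|_1
  = \|A^{(-1)}_{(L)}\|_{\infty} \, \|A^{(-1)}_{(L)}\|_{1},
\]
as required.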
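The sign-matrix construction used in the attainability step can be checked on a small integer matrix. This is a hedged sketch: the matrix `X` is a hypothetical stand-in for $A^{(-1)}_{(L)}$, the helper names are invented for the example, and `sign(0)` is taken as `+1` so that every entry of the perturbation has modulus one.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def max_norm(M):
    """Largest entry in absolute value."""
    return max(abs(v) for row in M for v in row)


def inf_norm(M):
    """Largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in M)


def one_norm(M):
    """Largest absolute column sum."""
    return max(sum(abs(row[j]) for row in M) for j in range(len(M[0])))


def sign(v):
    # Convention assumed here: sign(0) = +1.
    return 1 if v >= 0 else -1


# Hypothetical stand-in for the Bott-Duffin inverse (assumed data).
X = [[3, -1, 0], [-1, 2, 0], [0, 0, 0]]
n = len(X)

# Row k attains the infinity norm, column l attains the 1-norm.
k = max(range(n), key=lambda i: sum(abs(v) for v in X[i]))
l = max(range(n), key=lambda j: sum(abs(X[i][j]) for i in range(n)))

d1 = [sign(X[k][j]) for j in range(n)]   # diagonal of D1 = diag(sign(r_k))
d2 = [sign(X[i][l]) for i in range(n)]   # diagonal of D2 = diag(sign(c_l))
dA = [[d1[i] * d2[j] for j in range(n)] for i in range(n)]   # D1 E D2 with E all ones

product = matmul(matmul(X, dA), X)
print(max_norm(dA), max_norm(product), inf_norm(X) * one_norm(X))
```

For this data the perturbation has max norm one, and the max norm of the product coincides with $\|X\|_{\infty} \|X\|_{1}$, as (18) predicts.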
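Finally, we consider the case where the operator norm $\|\cdot\|_{\alpha,\beta}$ is used to measure $A$. This suggests the use of $\|\cdot\|_{\beta,\alpha}$ to measure $A^{(-1)}_{(L)}$. To analyze the corresponding condition number, we require the following lemma.
Lemma 2.1 [5]. Given vector norms $\|\cdot\|_{\alpha}$ and $\|\cdot\|_{\beta}$ and vectors $x, y \in \mathbb{R}^{n}$ such that $\|x\|_{\alpha} = \|y\|_{\beta} = 1$, there exists a matrix $B$ with $\|B\|_{\alpha,\beta} = 1$ such that $Bx = y$.
Theorem 2.3. The condition number
\[
  \mathrm{cond}^{BD}_{\alpha,\beta}(A) := \lim_{\epsilon \to 0^{+}} \sup_{\|\Delta A\|_{\alpha,\beta} \le \epsilon \|A\|_{\alpha,\beta}}
  \frac{\|(A + \Delta A)^{(-1)}_{(L)} - A^{(-1)}_{(L)}\|_{\beta,\alpha}}{\epsilon \, \|A^{(-1)}_{(L)}\|_{\beta,\alpha}} \tag{19}
\]
satisfies
\[
  \mathrm{cond}^{BD}_{\alpha,\beta}(A) = \|A\|_{\alpha,\beta} \, \|A^{(-1)}_{(L)}\|_{\beta,\alpha}. \tag{20}
\]
Proof. Following the proofs of the previous two theorems, the required result is
\[
  \sup_{\|\widehat{\Delta A}\|_{\alpha,\beta} \le 1} \|A^{(-1)}_{(L)} \, \widehat{\Delta A} \, A^{(-1)}_{(L)}\|_{\beta,\alpha}
  = \|A^{(-1)}_{(L)}\|_{\beta,\alpha}^{2}. \tag{21}
\]
From (12) we get
\[
  \|A^{(-1)}_{(L)} \, \widehat{\Delta A} \, A^{(-1)}_{(L)}\|_{\beta,\alpha}
  \le \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, \|\widehat{\Delta A} \, A^{(-1)}_{(L)}\|_{\beta,\beta}
  \le \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, \|\widehat{\Delta A}\|_{\alpha,\beta} \, \|A^{(-1)}_{(L)}\|_{\beta,\alpha}
  \le \|A^{(-1)}_{(L)}\|_{\beta,\alpha}^{2}.
\]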
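To show the opposite inequality, note that
\[
  \|A^{(-1)}_{(L)} \, \widehat{\Delta A} \, A^{(-1)}_{(L)}\|_{\beta,\alpha}
  = \max_{\|y\|_{\beta} = 1} \|A^{(-1)}_{(L)} \, \widehat{\Delta A} \, A^{(-1)}_{(L)} y\|_{\alpha}.
\]
Take $\hat{y}$ with $\|\hat{y}\|_{\beta} = 1$ such that $\|A^{(-1)}_{(L)} \hat{y}\|_{\alpha} = \|A^{(-1)}_{(L)}\|_{\beta,\alpha}$, and set $\hat{x} = A^{(-1)}_{(L)} \hat{y} / \|A^{(-1)}_{(L)} \hat{y}\|_{\alpha}$, so that $\|\hat{x}\|_{\alpha} = 1$. By Lemma 2.1 there exists a matrix $\widehat{\Delta A}$ with $\|\widehat{\Delta A}\|_{\alpha,\beta} = 1$ such that $\widehat{\Delta A} \hat{x} = \hat{y}$, and it satisfies
\[
  \max_{\|y\|_{\beta} = 1} \|A^{(-1)}_{(L)} \, \widehat{\Delta A} \, A^{(-1)}_{(L)} y\|_{\alpha}
  \ge \|A^{(-1)}_{(L)} \, \widehat{\Delta A} \, A^{(-1)}_{(L)} \hat{y}\|_{\alpha}
  = \|A^{(-1)}_{(L)} \, \widehat{\Delta A} \, \hat{x}\|_{\alpha} \, \|A^{(-1)}_{(L)}\|_{\beta,\alpha}
  = \|A^{(-1)}_{(L)}\|_{\beta,\alpha}^{2},
\]
as required.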
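Lemma 2.1 is quoted from [5] without proof. One standard way to build such a matrix, sketched here as an aside and not taken from the paper, uses a vector $z$ dual to $x$ with respect to $\|\cdot\|_{\alpha}$; the existence of $z$ (a consequence of the Hahn–Banach theorem) is assumed.

```latex
% Assumption: z is a dual vector of x, i.e.
%   z^T x = 1   and   max_{\|w\|_\alpha = 1} |z^T w| = 1.
\[
  B := y z^{T}, \qquad
  Bx = y \, (z^{T} x) = y, \qquad
  \|B\|_{\alpha,\beta} = \max_{\|w\|_{\alpha} = 1} |z^{T} w| \, \|y\|_{\beta} = \|y\|_{\beta} = 1.
\]
```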
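3. The constrained linear systems
The solution of the constrained linear system [2]
\[
  Ax + y = b, \qquad x \in L,\; y \in L^{\perp},
\]
is $x = A^{(-1)}_{(L)} b$ and $y = b - Ax = (I - A A^{(-1)}_{(L)}) b$. We assume henceforth that the perturbed system is
\[
  (A + \Delta A)(x + \Delta x) + (y + \Delta y) = b + \Delta b,
\]
where $\Delta x \in L$, $\Delta y \in L^{\perp}$ and $\|A^{(-1)}_{(L)}\| \, \|\Delta A\| < 1$.
Theorem 3.1. For the constrained linear system $Ax + y = b$, the condition number
\[
  \mathrm{cond}^{BD}_F(A, b) := \lim_{\epsilon \to 0^{+}}
  \sup_{\substack{\|\Delta A\|_F \le \epsilon \|A\|_F \\ \|\Delta b\|_2 \le \epsilon \|b\|_2}}
  \frac{\|(A + \Delta A)^{(-1)}_{(L)} (b + \Delta b) - A^{(-1)}_{(L)} b\|_2}{\epsilon \, \|A^{(-1)}_{(L)} b\|_2} \tag{22}
\]
satisfies
\[
  \mathrm{cond}^{BD}_F(A, b) = \|A\|_F \, \|A^{(-1)}_{(L)}\|_2 + \frac{\|A^{(-1)}_{(L)}\|_2 \, \|b\|_2}{\|A^{(-1)}_{(L)} b\|_2}. \tag{23}
\]
Proof. Suppose $\|\Delta A\|_F \le \epsilon \|A\|_F$ and $\|\Delta b\|_2 \le \epsilon \|b\|_2$. Let
\[
  (A + \Delta A)(x + \Delta x) + (y + \Delta y) = b + \Delta b,
\]
where $\Delta x \in L$, $\Delta y \in L^{\perp}$, $\Delta b \in \mathbb{R}^{n}$ and $\|A^{(-1)}_{(L)}\| \, \|\Delta A\| < 1$. Writing $x = P_L z$, $y = P_{L^{\perp}} z$, $\Delta x = P_L \Delta z$ and $\Delta y = P_{L^{\perp}} \Delta z$, the equation
\[
  (A + \Delta A) P_L (z + \Delta z) + P_{L^{\perp}} (z + \Delta z) = b + \Delta b
\]
gives
\[
  A P_L z + \Delta A P_L z + A P_L \Delta z + \Delta A P_L \Delta z + P_{L^{\perp}} z + P_{L^{\perp}} \Delta z = b + \Delta b.
\]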
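Neglecting $O(\epsilon^2)$ terms and using $(A P_L + P_{L^{\perp}}) z = b$, we get
\[
  \Delta A P_L z + A P_L \Delta z + P_{L^{\perp}} \Delta z = \Delta b,
  \qquad\text{that is,}\qquad
  (A P_L + P_{L^{\perp}}) \Delta z = \Delta b - \Delta A P_L z,
\]
from which it follows that
\[
  \Delta z = (A P_L + P_{L^{\perp}})^{-1} (\Delta b - \Delta A x).
\]
Hence
\[
  \Delta x = P_L \Delta z = A^{(-1)}_{(L)} (\Delta b - \Delta A x), \qquad \Delta y = P_{L^{\perp}} \Delta z.
\]
Only $\Delta x$ is needed here, and
\[
  \|\Delta x\|_2 = \|A^{(-1)}_{(L)} (\Delta b - \Delta A x)\|_2
  \le \|A^{(-1)}_{(L)}\|_2 \, \|\Delta b\|_2 + \|A^{(-1)}_{(L)}\|_2 \, \|\Delta A\|_2 \, \|x\|_2
  \le \epsilon \, \|A^{(-1)}_{(L)}\|_2 \, (\|b\|_2 + \|A\|_F \, \|x\|_2). \tag{24}
\]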
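Now using (24),
\[
  \mathrm{cond}^{BD}_F(A, b) = \lim_{\epsilon \to 0^{+}}
  \sup_{\substack{\|\Delta A\|_F \le \epsilon \|A\|_F \\ \|\Delta b\|_2 \le \epsilon \|b\|_2}}
  \frac{\|(A + \Delta A)^{(-1)}_{(L)} (b + \Delta b) - A^{(-1)}_{(L)} b\|_2}{\epsilon \, \|A^{(-1)}_{(L)} b\|_2}
  \le \|A\|_F \, \|A^{(-1)}_{(L)}\|_2 + \frac{\|A^{(-1)}_{(L)}\|_2 \, \|b\|_2}{\|A^{(-1)}_{(L)} b\|_2}. \tag{25}
\]
Now suppose $\|\hat{y}\|_2 = 1$ and $\|A^{(-1)}_{(L)} \hat{y}\|_2 = \|A^{(-1)}_{(L)}\|_2$, and let $\Delta b = \epsilon \, \|b\|_2 \, \hat{y}$ and $\Delta A = -\epsilon \, \hat{y} x^{T} \|A\|_F / \|x\|_2$, so that $\|\Delta A\|_F = \epsilon \|A\|_F$. With these perturbations,
\[
  \|A^{(-1)}_{(L)} (\Delta b - \Delta A x)\|_2
  = \Big\| A^{(-1)}_{(L)} \Big( \epsilon \, \hat{y} \, \|b\|_2 + \epsilon \, \hat{y} x^{T} x \, \frac{\|A\|_F}{\|x\|_2} \Big) \Big\|_2
  = \epsilon \, \| A^{(-1)}_{(L)} \hat{y} \, \|b\|_2 + A^{(-1)}_{(L)} \hat{y} \, \|A\|_F \, \|x\|_2 \|_2
  = \epsilon \, \|A^{(-1)}_{(L)}\|_2 \, (\|b\|_2 + \|A\|_F \, \|x\|_2),
\]
giving equality.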
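The first-order relation $\Delta x = A^{(-1)}_{(L)}(\Delta b - \Delta A x)$ on which the proof rests can be checked in exact arithmetic: the discrepancy between the true perturbed solution and the first-order prediction should be of order $\epsilon^2$. The sketch below is illustrative only; the data `A`, `P`, `b`, the perturbations `dA`, `db` and the helper names are assumptions made for the example.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def apply(M, v):
    """Matrix-vector product."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def inverse(M):
    """Exact Gauss-Jordan inverse over the rationals; raises if M is singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[piv][col] == 0:
            raise ValueError("singular matrix")
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def bott_duffin_inverse(A, P):
    """Evaluate P (A P + (I - P))^{-1}, with P the orthogonal projector onto L."""
    n = len(A)
    AP = matmul(A, P)
    M = [[AP[i][j] + (int(i == j) - P[i][j]) for j in range(n)] for i in range(n)]
    return matmul(P, inverse(M))


# Illustrative data (assumed): L = span{e1, e2} and a perturbation of size eps.
A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
P = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
b = [1, 2, 3]
eps = Fraction(1, 1000)
dA = [[eps, -eps, 0], [0, eps, eps], [-eps, 0, eps]]
db = [eps, -eps, eps]

X = bott_duffin_inverse(A, P)
x = apply(X, b)
A_pert = [[A[i][j] + dA[i][j] for j in range(3)] for i in range(3)]
b_pert = [b[i] + db[i] for i in range(3)]
x_pert = apply(bott_duffin_inverse(A_pert, P), b_pert)

dAx = apply(dA, x)
first_order = apply(X, [db[i] - dAx[i] for i in range(3)])   # X (db - dA x)
residual = max(abs(x_pert[i] - x[i] - first_order[i]) for i in range(3))
print(float(residual), float(eps ** 2))
```

The residual is compared with `eps ** 2` only as an order-of-magnitude check; the constant in the $O(\epsilon^2)$ term depends on the data.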
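The next result concerns the condition number in the $\alpha,\beta$-norms.
Theorem 3.2. For the constrained linear system $Ax + y = b$, the condition number
\[
  \mathrm{cond}^{BD}_{\alpha,\beta}(A, b) := \lim_{\epsilon \to 0^{+}}
  \sup_{\substack{\|\Delta A\|_{\alpha,\beta} \le \epsilon \|A\|_{\alpha,\beta} \\ \|\Delta b\|_{\beta} \le \epsilon \|b\|_{\beta}}}
  \frac{\|(A + \Delta A)^{(-1)}_{(L)} (b + \Delta b) - A^{(-1)}_{(L)} b\|_{\alpha}}{\epsilon \, \|A^{(-1)}_{(L)} b\|_{\alpha}} \tag{26}
\]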
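satisfies
\[
  \mathrm{cond}^{BD}_{\alpha,\beta}(A, b) = \|A\|_{\alpha,\beta} \, \|A^{(-1)}_{(L)}\|_{\beta,\alpha}
  + \frac{\|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, \|b\|_{\beta}}{\|A^{(-1)}_{(L)} b\|_{\alpha}}. \tag{27}
\]
Proof. Suppose $\|\Delta A\|_{\alpha,\beta} \le \epsilon \|A\|_{\alpha,\beta}$ and $\|\Delta b\|_{\beta} \le \epsilon \|b\|_{\beta}$. As in the proof of Theorem 3.1, the key quantity is $A^{(-1)}_{(L)} (\Delta b - \Delta A x)$. We have, using (12),
\[
\begin{aligned}
  \|A^{(-1)}_{(L)} (\Delta b - \Delta A x)\|_{\alpha}
  &\le \|A^{(-1)}_{(L)} \Delta b\|_{\alpha} + \|A^{(-1)}_{(L)} \Delta A x\|_{\alpha} \\
  &\le \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, \|\Delta b\|_{\beta} + \|A^{(-1)}_{(L)} \Delta A\|_{\alpha} \, \|x\|_{\alpha} \\
  &\le \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, \|\Delta b\|_{\beta} + \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, \|\Delta A\|_{\alpha,\beta} \, \|x\|_{\alpha} \\
  &\le \epsilon \, \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, (\|b\|_{\beta} + \|A\|_{\alpha,\beta} \, \|x\|_{\alpha}),
\end{aligned}
\]
giving
\[
  \mathrm{cond}^{BD}_{\alpha,\beta}(A, b) \le \|A\|_{\alpha,\beta} \, \|A^{(-1)}_{(L)}\|_{\beta,\alpha}
  + \frac{\|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, \|b\|_{\beta}}{\|A^{(-1)}_{(L)} b\|_{\alpha}}.
\]
Now suppose $\|\hat{y}\|_{\beta} = 1$ with $\|A^{(-1)}_{(L)} \hat{y}\|_{\alpha} = \|A^{(-1)}_{(L)}\|_{\beta,\alpha}$, and choose $\Delta b = \epsilon \, \|b\|_{\beta} \, \hat{y}$. From Lemma 2.1 there exists a matrix $B$ with $\|B\|_{\alpha,\beta} = 1$ such that $Bx / \|x\|_{\alpha} = -\hat{y}$. Letting $\Delta A = \epsilon \, \|A\|_{\alpha,\beta} B$, we have
\[
  \|A^{(-1)}_{(L)} (\Delta b - \Delta A x)\|_{\alpha}
  = \epsilon \, \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, (\|b\|_{\beta} + \|A\|_{\alpha,\beta} \, \|x\|_{\alpha}),
\]
showing that equality is possible.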
We see from (20) and (27) that, with the $\|\cdot\|_{\alpha}$ and $\|\cdot\|_{\beta}$ measures, the inequalities (10) still hold, with equality for some $b$. It is also clear from the proof of Theorem 3.2 that if we alter the definition of $\mathrm{cond}^{BD}_{\alpha,\beta}(A, b)$ so that $b$ cannot be perturbed, then $\mathrm{cond}^{BD}_{\alpha,\beta}(A)$ and $\mathrm{cond}^{BD}_{\alpha,\beta}(A, b)$ become equal.
4. Condition number sensitivity
In general, a condition number cannot be computed exactly, and hence it is of interest to consider the sensitivity of the condition number itself, that is, "the condition number of the condition number". This concept was investigated by Demmel [3]. Our results below are more specialized, since they apply only to Bott–Duffin inversion and the solution of constrained linear systems, and consequently they are sharper. To motivate the analysis, we consider a constrained linear system
\[
  Ax + y = b, \qquad x \in L,\; y \in L^{\perp}.
\]
Typically, an a priori rounding error analysis or an a posteriori residual check will allow us to conclude that a computed pair $\hat{x}$, $\hat{y}$ satisfies a nearby system
\[
  (A + \Delta A) \hat{x} + \hat{y} = b + \Delta b, \qquad \hat{x} \in L,\; \hat{y} \in L^{\perp},
\]
where $\|\Delta A\|$ and $\|\Delta b\|$ are small, say $\max\{\|\Delta A\| / \|A\|, \|\Delta b\| / \|b\|\} = c_1 u$, where $c_1$ is close to unity and $u$ is the unit roundoff. Using appropriate norms for $\mathrm{cond}^{BD}(A, b)$, we have the approximate error bound
\[
  \frac{\|x - \hat{x}\|}{\|x\|} \le \mathrm{cond}^{BD}(A, b) \, c_1 u. \tag{28}
\]
Now, even when $\mathrm{cond}^{BD}(A, b)$ has a simple characterization, it cannot normally be computed exactly. Given that $A$ and $b$ may contain errors before an algorithm to compute $\mathrm{cond}^{BD}(A, b)$ is applied, perhaps the best that we can hope for is to compute $\mathrm{cond}^{BD}(A + \widetilde{\Delta A}, b + \widetilde{\Delta b})$, where $\max\{\|\widetilde{\Delta A}\| / \|A\|, \|\widetilde{\Delta b}\| / \|b\|\} = c_2 u$, with $c_2$ close to unity. The error in the computed version of the bound (28) may be analyzed by considering the level-2 condition number
\[
  \mathrm{cond}^{BD[2]}(A, b) := \lim_{\epsilon \to 0^{+}}
  \sup_{\substack{\|\Delta A\| \le \epsilon \|A\| \\ \|\Delta b\| \le \epsilon \|b\|}}
  \frac{|\mathrm{cond}^{BD}(A + \Delta A, b + \Delta b) - \mathrm{cond}^{BD}(A, b)|}{\epsilon \, \mathrm{cond}^{BD}(A, b)}. \tag{29}
\]
We then have the approximate inequality
\[
  |\mathrm{cond}^{BD}(A + \widetilde{\Delta A}, b + \widetilde{\Delta b}) \, c_1 u - \mathrm{cond}^{BD}(A, b) \, c_1 u|
  \le \mathrm{cond}^{BD}(A, b) \, c_1 u \, \mathrm{cond}^{BD[2]}(A, b) \, c_2 u.
\]
We conclude that if $\mathrm{cond}^{BD}(A, b) \le u^{-1}$ then using $\mathrm{cond}^{BD}(A + \widetilde{\Delta A}, b + \widetilde{\Delta b})$ instead of $\mathrm{cond}^{BD}(A, b)$ in (28) will not affect the order of magnitude of the error bound. The results below show that, for Bott–Duffin inversion or for solving a constrained linear system, the sensitivity of the condition number is approximately given by the condition number itself. The first result concerns Bott–Duffin inversion, and relies on the following lemma.
Lemma 4.1. As $\epsilon \to 0^{+}$,
\[
  \max_{\|\Delta A\|_{\alpha,\beta} \le \epsilon \|A\|_{\alpha,\beta}}
  \big| \, \|(A + \Delta A)^{(-1)}_{(L)}\|_{\beta,\alpha} - \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, \big|
  = \epsilon \, \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, \mathrm{cond}^{BD}_{\alpha,\beta}(A) + O(\epsilon^2). \tag{30}
\]
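Proof. Using (12), if $\|\Delta A\|_{\alpha,\beta} \le \epsilon \|A\|_{\alpha,\beta}$ then "$\le$" in (30) follows by taking norms in the expansion $(A + \Delta A)^{(-1)}_{(L)} = A^{(-1)}_{(L)} - A^{(-1)}_{(L)} \Delta A A^{(-1)}_{(L)} + O(\epsilon^2)$ (see [7]). Now let $\|\hat{y}\|_{\beta} = 1$ and $\|A^{(-1)}_{(L)} \hat{y}\|_{\alpha} = \|A^{(-1)}_{(L)}\|_{\beta,\alpha}$, and let $\hat{x} = A^{(-1)}_{(L)} \hat{y} / \|A^{(-1)}_{(L)} \hat{y}\|_{\alpha}$. Then by Lemma 2.1 there exists a matrix $B$ such that $\|B\|_{\alpha,\beta} = 1$ and $B \hat{x} = -\hat{y}$. Choosing $\Delta A = \epsilon B \|A\|_{\alpha,\beta}$ gives
\[
  \|A^{(-1)}_{(L)} - A^{(-1)}_{(L)} \Delta A A^{(-1)}_{(L)}\|_{\beta,\alpha}
  \ge \|(A^{(-1)}_{(L)} - A^{(-1)}_{(L)} \Delta A A^{(-1)}_{(L)}) \hat{y}\|_{\alpha}
  = \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, (1 + \epsilon \, \|A\|_{\alpha,\beta} \, \|A^{(-1)}_{(L)}\|_{\beta,\alpha}), \tag{31}
\]
showing that the bound in (30) is attainable.
Theorem 4.1. The level-2 condition number
\[
  \mathrm{cond}^{BD[2]}_{\alpha,\beta}(A) := \lim_{\epsilon \to 0^{+}} \sup_{\|\Delta A\|_{\alpha,\beta} \le \epsilon \|A\|_{\alpha,\beta}}
  \frac{|\mathrm{cond}^{BD}_{\alpha,\beta}(A + \Delta A) - \mathrm{cond}^{BD}_{\alpha,\beta}(A)|}{\epsilon \, \mathrm{cond}^{BD}_{\alpha,\beta}(A)} \tag{32}
\]
satisfies
\[
  \mathrm{cond}^{BD}_{\alpha,\beta}(A) - 1 \le \mathrm{cond}^{BD[2]}_{\alpha,\beta}(A) \le \mathrm{cond}^{BD}_{\alpha,\beta}(A) + 1.
\]
Proof. If $\|\Delta A\|_{\alpha,\beta} \le \epsilon \|A\|_{\alpha,\beta}$, then using $\|A + \Delta A\|_{\alpha,\beta} \le \|A\|_{\alpha,\beta} (1 + \epsilon)$ and Lemma 4.1, it follows that
\[
  \|A + \Delta A\|_{\alpha,\beta} \, \|(A + \Delta A)^{(-1)}_{(L)}\|_{\beta,\alpha}
  \le \mathrm{cond}^{BD}_{\alpha,\beta}(A) \, [1 + \epsilon \, \mathrm{cond}^{BD}_{\alpha,\beta}(A) + \epsilon] + O(\epsilon^2),
\]
so that
\[
  \frac{\mathrm{cond}^{BD}_{\alpha,\beta}(A + \Delta A) - \mathrm{cond}^{BD}_{\alpha,\beta}(A)}{\epsilon \, \mathrm{cond}^{BD}_{\alpha,\beta}(A)}
  \le \mathrm{cond}^{BD}_{\alpha,\beta}(A) + 1 + O(\epsilon). \tag{33}
\]
Similarly, using $\|A + \Delta A\|_{\alpha,\beta} \ge \|A\|_{\alpha,\beta} (1 - \epsilon)$ and Lemma 4.1, we can derive a lower bound of $-\mathrm{cond}^{BD}_{\alpha,\beta}(A) - 1 + O(\epsilon)$ for the left-hand side of (33), and hence, in (32),
\[
  \mathrm{cond}^{BD[2]}_{\alpha,\beta}(A) \le \mathrm{cond}^{BD}_{\alpha,\beta}(A) + 1.
\]
To get a lower bound, we may choose $\Delta A$ as in (31), giving
\[
  \|A + \Delta A\|_{\alpha,\beta} \, \|(A + \Delta A)^{(-1)}_{(L)}\|_{\beta,\alpha}
  \ge \|A\|_{\alpha,\beta} (1 - \epsilon) \, \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, [1 + \epsilon \, \mathrm{cond}^{BD}_{\alpha,\beta}(A)] + O(\epsilon^2)
\]
and hence
\[
  \mathrm{cond}^{BD}_{\alpha,\beta}(A + \Delta A) \ge \mathrm{cond}^{BD}_{\alpha,\beta}(A) \, [1 + \epsilon \, \mathrm{cond}^{BD}_{\alpha,\beta}(A) - \epsilon] + O(\epsilon^2). \tag{34}
\]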
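This rearranges to
\[
  \frac{\mathrm{cond}^{BD}_{\alpha,\beta}(A + \Delta A) - \mathrm{cond}^{BD}_{\alpha,\beta}(A)}{\epsilon \, \mathrm{cond}^{BD}_{\alpha,\beta}(A)}
  \ge \mathrm{cond}^{BD}_{\alpha,\beta}(A) - 1 + O(\epsilon).
\]
So, in (32),
\[
  \mathrm{cond}^{BD[2]}_{\alpha,\beta}(A) \ge \mathrm{cond}^{BD}_{\alpha,\beta}(A) - 1.
\]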
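A one-dimensional sanity check of Theorem 4.1, added here as an illustration, shows that the lower bound can be attained. The scalar case $n = 1$ with $L = \mathbb{R}$ is an assumption chosen for simplicity, and the subscripts $\alpha, \beta$ are dropped because all norms reduce to the absolute value.

```latex
% Assumed setting: n = 1, L = \mathbb{R}, A = a \neq 0, so that A^{(-1)}_{(L)} = 1/a.
\[
  \mathrm{cond}^{BD}(a) = |a| \, |1/a| = 1 \quad \text{for every } a \neq 0
  \quad\Longrightarrow\quad
  \mathrm{cond}^{BD}(a + \Delta a) - \mathrm{cond}^{BD}(a) = 0,
\]
\[
  \mathrm{cond}^{BD[2]}(a) = 0 = \mathrm{cond}^{BD}(a) - 1.
\]
```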
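Next we consider constrained linear systems.
Theorem 4.2. The level-2 condition number
\[
  \mathrm{cond}^{BD[2]}_{\alpha,\beta}(A, b) := \lim_{\epsilon \to 0^{+}}
  \sup_{\substack{\|\Delta A\|_{\alpha,\beta} \le \epsilon \|A\|_{\alpha,\beta} \\ \|\Delta b\|_{\beta} \le \epsilon \|b\|_{\beta}}}
  \frac{|\mathrm{cond}^{BD}_{\alpha,\beta}(A + \Delta A, b + \Delta b) - \mathrm{cond}^{BD}_{\alpha,\beta}(A, b)|}{\epsilon \, \mathrm{cond}^{BD}_{\alpha,\beta}(A, b)} \tag{35}
\]
satisfies
\[
  \mathrm{cond}^{BD[2]}_{\alpha,\beta}(A, b) \le 3 \, \mathrm{cond}^{BD}_{\alpha,\beta}(A, b) + 2.
\]
Proof. Suppose $\|\Delta A\|_{\alpha,\beta} \le \epsilon \|A\|_{\alpha,\beta}$ and $\|\Delta b\|_{\beta} \le \epsilon \|b\|_{\beta}$. From Lemma 4.1, we have
\[
  \|(A + \Delta A)^{(-1)}_{(L)}\|_{\beta,\alpha} \, \|b + \Delta b\|_{\beta}
  \le \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, \|b\|_{\beta} \, [1 + \epsilon \, \mathrm{cond}^{BD}_{\alpha,\beta}(A) + \epsilon] + O(\epsilon^2). \tag{36}
\]
Also, using the definition of $\mathrm{cond}^{BD}_{\alpha,\beta}(A, b)$,
\[
  \frac{1}{\|x + \Delta x\|_{\alpha}}
  \le \frac{1}{\|x\|_{\alpha} - \|\Delta x\|_{\alpha}}
  = \frac{1}{\|x\|_{\alpha}} \Big( 1 + \frac{\|\Delta x\|_{\alpha}}{\|x\|_{\alpha}} \Big) + O(\epsilon^2)
  \le \frac{1}{\|x\|_{\alpha}} \, [1 + \epsilon \, \mathrm{cond}^{BD}_{\alpha,\beta}(A, b)] + O(\epsilon^2). \tag{37}
\]
Combining (36) and (37), we find
\[
  \frac{\|(A + \Delta A)^{(-1)}_{(L)}\|_{\beta,\alpha} \, \|b + \Delta b\|_{\beta}}{\|x + \Delta x\|_{\alpha}}
  \le \frac{\|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, \|b\|_{\beta}}{\|x\|_{\alpha}}
  \, [1 + \epsilon \, \mathrm{cond}^{BD}_{\alpha,\beta}(A) + \epsilon \, \mathrm{cond}^{BD}_{\alpha,\beta}(A, b) + \epsilon] + O(\epsilon^2),
\]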
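from which it follows that
\[
  \frac{\|(A + \Delta A)^{(-1)}_{(L)}\|_{\beta,\alpha} \, \|b + \Delta b\|_{\beta} / \|x + \Delta x\|_{\alpha}
        - \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, \|b\|_{\beta} / \|x\|_{\alpha}}
       {\epsilon \, (\mathrm{cond}^{BD}_{\alpha,\beta}(A) + \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, \|b\|_{\beta} / \|x\|_{\alpha})}
  \le 1 + \mathrm{cond}^{BD}_{\alpha,\beta}(A) + \mathrm{cond}^{BD}_{\alpha,\beta}(A, b) + O(\epsilon). \tag{38}
\]
A similar analysis gives a lower bound of $-1 - \mathrm{cond}^{BD}_{\alpha,\beta}(A) - \mathrm{cond}^{BD}_{\alpha,\beta}(A, b) + O(\epsilon)$ for the left-hand side of (38), and hence we have
\[
  \frac{\big| \, \|(A + \Delta A)^{(-1)}_{(L)}\|_{\beta,\alpha} \, \|b + \Delta b\|_{\beta} / \|x + \Delta x\|_{\alpha}
        - \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, \|b\|_{\beta} / \|x\|_{\alpha} \, \big|}
       {\epsilon \, (\mathrm{cond}^{BD}_{\alpha,\beta}(A) + \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, \|b\|_{\beta} / \|x\|_{\alpha})}
  \le 1 + \mathrm{cond}^{BD}_{\alpha,\beta}(A) + \mathrm{cond}^{BD}_{\alpha,\beta}(A, b) + O(\epsilon). \tag{39}
\]
Now from Theorem 4.1,
\[
  \frac{|\mathrm{cond}^{BD}_{\alpha,\beta}(A + \Delta A) - \mathrm{cond}^{BD}_{\alpha,\beta}(A)|}
       {\epsilon \, (\mathrm{cond}^{BD}_{\alpha,\beta}(A) + \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, \|b\|_{\beta} / \|x\|_{\alpha})}
  \le \frac{|\mathrm{cond}^{BD}_{\alpha,\beta}(A + \Delta A) - \mathrm{cond}^{BD}_{\alpha,\beta}(A)|}{\epsilon \, \mathrm{cond}^{BD}_{\alpha,\beta}(A)}
  \le \mathrm{cond}^{BD[2]}_{\alpha,\beta}(A) + O(\epsilon)
  \le \mathrm{cond}^{BD}_{\alpha,\beta}(A) + 1 + O(\epsilon). \tag{40}
\]
Using the characterization (27), it follows from (39) and (40) that
\[
  \mathrm{cond}^{BD[2]}_{\alpha,\beta}(A, b)
  \le 2 + 2 \, \mathrm{cond}^{BD}_{\alpha,\beta}(A) + \mathrm{cond}^{BD}_{\alpha,\beta}(A, b)
  \le 2 + 3 \, \mathrm{cond}^{BD}_{\alpha,\beta}(A, b). \tag{41}
\]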
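In practice, condition numbers will usually be computed via their characterizations; for example, $\mathrm{cond}^{BD}_{\alpha,\beta}(A) = \|A\|_{\alpha,\beta} \, \|A^{(-1)}_{(L)}\|_{\beta,\alpha}$. In this case, it could be argued that the best that we can hope to compute is $\|A + \Delta A_1\|_{\alpha,\beta} \, \|(A + \Delta A_2)^{(-1)}_{(L)}\|_{\beta,\alpha}$, where $\Delta A_1$ and $\Delta A_2$ are different small perturbations. By examining the proofs of Theorems 4.1 and 4.2, it is clear that allowing different perturbations in this manner does not significantly affect the level-2 condition numbers. In fact, as we show below, for the case of matrix inversion the upper bound in Theorem 4.1 becomes an exact characterization.
Theorem 4.3. The alternative level-2 condition number
\[
  \overline{\mathrm{cond}}^{BD[2]}_{\alpha,\beta}(A) := \lim_{\epsilon \to 0^{+}}
  \sup_{\substack{\|\Delta A_1\|_{\alpha,\beta} \le \epsilon \|A\|_{\alpha,\beta} \\ \|\Delta A_2\|_{\alpha,\beta} \le \epsilon \|A\|_{\alpha,\beta}}}
  \frac{\big| \, \|A + \Delta A_1\|_{\alpha,\beta} \, \|(A + \Delta A_2)^{(-1)}_{(L)}\|_{\beta,\alpha}
        - \|A\|_{\alpha,\beta} \, \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, \big|}
       {\epsilon \, \|A\|_{\alpha,\beta} \, \|A^{(-1)}_{(L)}\|_{\beta,\alpha}} \tag{42}
\]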
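satisfies
\[
  \overline{\mathrm{cond}}^{BD[2]}_{\alpha,\beta}(A) = \mathrm{cond}^{BD}_{\alpha,\beta}(A) + 1. \tag{43}
\]
Proof. Let $\Delta A_1 = \epsilon A$ and $\Delta A_2 = \epsilon B \|A\|_{\alpha,\beta}$, where $B$ is the matrix used to obtain (31) in the proof of Lemma 4.1. Lemma 4.1 then gives
\[
\begin{aligned}
  \|A + \Delta A_1\|_{\alpha,\beta} \, \|(A + \Delta A_2)^{(-1)}_{(L)}\|_{\beta,\alpha}
  &= \|A\|_{\alpha,\beta} (1 + \epsilon) \, \|A^{(-1)}_{(L)}\|_{\beta,\alpha} \, (1 + \epsilon \, \|A\|_{\alpha,\beta} \, \|A^{(-1)}_{(L)}\|_{\beta,\alpha}) + O(\epsilon^2) \\
  &= \mathrm{cond}^{BD}_{\alpha,\beta}(A) \, (1 + \epsilon + \epsilon \, \|A\|_{\alpha,\beta} \, \|A^{(-1)}_{(L)}\|_{\beta,\alpha}) + O(\epsilon^2).
\end{aligned}
\]
From the definition of $\overline{\mathrm{cond}}^{BD[2]}_{\alpha,\beta}(A)$, (43) follows easily.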
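The scalar case again gives a quick consistency check of (43). This sketch is an added illustration under the assumption $n = 1$, $L = \mathbb{R}$, $A = a \neq 0$, where the matrix $B$ of Lemma 4.1 reduces to $-\mathrm{sign}(a)$.

```latex
% Assumed setting: n = 1, L = \mathbb{R}, \Delta A_1 = \epsilon a, \Delta A_2 = -\epsilon a.
\[
  |a + \Delta A_1| \, \Big| \frac{1}{a + \Delta A_2} \Big|
  = \frac{|a| (1 + \epsilon)}{|a| (1 - \epsilon)}
  = 1 + 2 \epsilon + O(\epsilon^2),
\]
\[
  \frac{(1 + 2\epsilon + O(\epsilon^2)) - 1}{\epsilon \cdot 1} \;\longrightarrow\; 2
  = \mathrm{cond}^{BD}(a) + 1 \qquad (\epsilon \to 0^{+}).
\]
```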
5. Componentwise measure
As an alternative to the normwise measures considered in the previous sections, it is possible to treat perturbations in a componentwise manner. We discuss componentwise perturbations for Bott–Duffin inversion and for constrained linear systems in turn, and define the corresponding componentwise condition numbers. For nonsingular $A$, Rohn [6] has given a series of results. In this paper we generalize them to Bott–Duffin inversion and constrained linear systems.
Theorem 5.1. The componentwise condition number of Bott–Duffin inversion
\[
  c^{BD}_{ij}(A) := \lim_{\epsilon \to 0^{+}} \sup_{|\Delta A| \le \epsilon |A|}
  \frac{|(A + \Delta A)^{(-1)}_{(L)} - A^{(-1)}_{(L)}|_{ij}}{\epsilon \, |A^{(-1)}_{(L)}|_{ij}} \tag{44}
\]
satisfies
\[
  c^{BD}_{ij}(A) = \frac{(|A^{(-1)}_{(L)}| \, |A| \, |A^{(-1)}_{(L)}|)_{ij}}{|A^{(-1)}_{(L)}|_{ij}}, \tag{45}
\]
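where $|A|$ denotes the matrix $(|a_{ij}|)$ and $A \le B$ means $a_{ij} \le b_{ij}$ for $1 \le i, j \le n$.
Proof. From [7], we have
\[
  (A + \Delta A)^{(-1)}_{(L)} - A^{(-1)}_{(L)} = -A^{(-1)}_{(L)} \Delta A A^{(-1)}_{(L)} + O(\epsilon^2).
\]
Hence, for each $i, j$,
\[
  \max\{ |(A + \Delta A)^{(-1)}_{(L)} - A^{(-1)}_{(L)}|_{ij} : |\Delta A| \le \epsilon |A| \}
  = \max\{ |A^{(-1)}_{(L)} \Delta A A^{(-1)}_{(L)} + O(\epsilon^2)|_{ij} : |\Delta A| \le \epsilon |A| \},
\]
and
\[
  |A^{(-1)}_{(L)} \Delta A A^{(-1)}_{(L)} + O(\epsilon^2)|_{ij}
  = |A^{(-1)}_{(L)} \Delta A A^{(-1)}_{(L)}|_{ij} + O(\epsilon^2)
  \le (|A^{(-1)}_{(L)}| \, |\Delta A| \, |A^{(-1)}_{(L)}|)_{ij} + O(\epsilon^2)
  \le \epsilon \, (|A^{(-1)}_{(L)}| \, |A| \, |A^{(-1)}_{(L)}|)_{ij} + O(\epsilon^2). \tag{46}
\]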
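Let
\[
  C_1 = \mathrm{diag}(\mathrm{sign}(a^{(-1)}_{i1}), \mathrm{sign}(a^{(-1)}_{i2}), \ldots, \mathrm{sign}(a^{(-1)}_{in})), \qquad
  C_2 = \mathrm{diag}(\mathrm{sign}(a^{(-1)}_{1j}), \mathrm{sign}(a^{(-1)}_{2j}), \ldots, \mathrm{sign}(a^{(-1)}_{nj})),
\]
where $a^{(-1)}_{ij}$ denotes the $(i, j)$th component of the Bott–Duffin inverse $A^{(-1)}_{(L)}$; $C_1$ and $C_2$ are diagonal matrices whose diagonal elements are $1$ or $-1$. Let $\Delta A = \epsilon \, C_1 |A| C_2$. Then equality holds in (46):
\[
  |A^{(-1)}_{(L)} \Delta A A^{(-1)}_{(L)}|_{ij} + O(\epsilon^2)
  = \epsilon \, |(A^{(-1)}_{(L)} C_1) \, |A| \, (C_2 A^{(-1)}_{(L)})|_{ij} + O(\epsilon^2)
  = \epsilon \, (|A^{(-1)}_{(L)}| \, |A| \, |A^{(-1)}_{(L)}|)_{ij} + O(\epsilon^2).
\]
Hence,
\[
  \max\{ |(A + \Delta A)^{(-1)}_{(L)} - A^{(-1)}_{(L)}|_{ij} : |\Delta A| \le \epsilon |A| \}
  = \epsilon \, (|A^{(-1)}_{(L)}| \, |A| \, |A^{(-1)}_{(L)}|)_{ij} + O(\epsilon^2).
\]
From the definition of $c^{BD}_{ij}(A)$,
\[
  c^{BD}_{ij}(A) = \frac{(|A^{(-1)}_{(L)}| \, |A| \, |A^{(-1)}_{(L)}|)_{ij}}{|A^{(-1)}_{(L)}|_{ij}}.
\]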
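Formula (45) and the sign-matrix perturbation from the proof can be evaluated on a small example. The sketch below is illustrative: `A` is an assumed test matrix, `X` is its Bott–Duffin inverse on $L = \mathrm{span}\{e_1, e_2\}$ hard-coded as exact fractions, and the helper names are invented for the example.

```python
from fractions import Fraction as F


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def absm(M):
    """Entrywise absolute value |M|."""
    return [[abs(v) for v in row] for row in M]


def sign(v):
    """Returns 1, 0 or -1."""
    return (v > 0) - (v < 0)


# Assumed data: A, and its Bott-Duffin inverse X on L = span{e1, e2}.
A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
X = [[F(3, 5), F(-1, 5), 0], [F(-1, 5), F(2, 5), 0], [0, 0, 0]]
n = 3
i, j = 0, 0   # component under study

bound = matmul(matmul(absm(X), absm(A)), absm(X))   # |X| |A| |X|
c_ij = bound[i][j] / abs(X[i][j])                   # formula (45)

c1 = [sign(X[i][k]) for k in range(n)]   # diagonal of C1: signs of row i of X
c2 = [sign(X[k][j]) for k in range(n)]   # diagonal of C2: signs of column j of X
dA_hat = [[c1[r] * abs(A[r][c]) * c2[c] for c in range(n)] for r in range(n)]  # C1 |A| C2

# Coefficient of eps in X (eps C1 |A| C2) X; its (i, j) entry attains the bound.
first_order = matmul(matmul(X, dA_hat), X)
print(c_ij, first_order[i][j], bound[i][j])
```

Entries of `X` that vanish contribute nothing to the sums, so the sign convention at zero does not affect the attained value in this example.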
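Theorem 5.2. For the constrained linear system $Ax + y = b$, the componentwise condition number
\[
  c^{BD}_{i}(A, b) := \lim_{\epsilon \to 0^{+}}
  \sup_{\substack{|\Delta A| \le \epsilon |A| \\ |\Delta b| \le \epsilon |b|}}
  \frac{|(A + \Delta A)^{(-1)}_{(L)} (b + \Delta b) - A^{(-1)}_{(L)} b|_{i}}{\epsilon \, |A^{(-1)}_{(L)} b|_{i}} \tag{47}
\]
satisfies
\[
  c^{BD}_{i}(A, b) = \frac{(|A^{(-1)}_{(L)}| \, |A| \, |A^{(-1)}_{(L)} b| + |A^{(-1)}_{(L)}| \, |b|)_{i}}{|A^{(-1)}_{(L)} b|_{i}}. \tag{48}
\]
Proof. Suppose $|\Delta A| \le \epsilon |A|$ and $|\Delta b| \le \epsilon |b|$. By analogy with the proof of Theorem 5.1,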
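\[
\begin{aligned}
  |(A + \Delta A)^{(-1)}_{(L)} (b + \Delta b) - A^{(-1)}_{(L)} b|_{i}
  &= |A^{(-1)}_{(L)} (\Delta b - \Delta A A^{(-1)}_{(L)} b) + O(\epsilon^2)|_{i} \\
  &= |A^{(-1)}_{(L)} \Delta b - A^{(-1)}_{(L)} \Delta A x|_{i} + O(\epsilon^2) \\
  &\le \epsilon \, (|A^{(-1)}_{(L)}| \, |b| + |A^{(-1)}_{(L)}| \, |A| \, |A^{(-1)}_{(L)} b|)_{i} + O(\epsilon^2).
\end{aligned}
\]
Let
\[
  C_1 = \mathrm{diag}(\mathrm{sign}(a^{(-1)}_{i1}), \mathrm{sign}(a^{(-1)}_{i2}), \ldots, \mathrm{sign}(a^{(-1)}_{in})), \qquad
  C_2 = \mathrm{diag}(\mathrm{sign}(x_1), \ldots, \mathrm{sign}(x_n)),
\]
and choose $\Delta b = \epsilon \, C_1 |b|$ and $\Delta A = -\epsilon \, C_1 |A| C_2$. Then
\[
  |(A + \Delta A)^{(-1)}_{(L)} (b + \Delta b) - A^{(-1)}_{(L)} b|_{i}
  = \epsilon \, (|A^{(-1)}_{(L)}| \, |A| \, |A^{(-1)}_{(L)} b| + |A^{(-1)}_{(L)}| \, |b|)_{i} + O(\epsilon^2).
\]
From the definition of $c^{BD}_{i}(A, b)$,
\[
  c^{BD}_{i}(A, b) = \frac{(|A^{(-1)}_{(L)}| \, |A| \, |A^{(-1)}_{(L)} b| + |A^{(-1)}_{(L)}| \, |b|)_{i}}{|A^{(-1)}_{(L)} b|_{i}}.
\]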
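The characterization (48) can be evaluated component by component. The following sketch uses assumed data (`A`, `b`, and the Bott–Duffin inverse `X` of `A` on $L = \mathrm{span}\{e_1, e_2\}$, hard-coded as exact fractions); components with $x_i = 0$ are returned as `None` because (48) divides by $|x_i|$.

```python
from fractions import Fraction as F


def apply(M, v):
    """Matrix-vector product."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def absm(M):
    """Entrywise absolute value of a matrix."""
    return [[abs(v) for v in row] for row in M]


def absv(v):
    """Entrywise absolute value of a vector."""
    return [abs(t) for t in v]


# Assumed data: A, b, and the Bott-Duffin inverse X of A on L = span{e1, e2}.
A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
X = [[F(3, 5), F(-1, 5), 0], [F(-1, 5), F(2, 5), 0], [0, 0, 0]]
b = [1, 2, 3]

x = apply(X, b)                                     # solution component x = X b
term_A = apply(absm(X), apply(absm(A), absv(x)))    # |X| |A| |X b|
term_b = apply(absm(X), absv(b))                    # |X| |b|

# Formula (48); undefined (None) where x_i = 0.
c = [(term_A[i] + term_b[i]) / abs(x[i]) if x[i] != 0 else None
     for i in range(len(x))]
print(c)
```

In this example the third component of `x` is zero because $x \in L$, so only the first two componentwise condition numbers are defined.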
The overall componentwise relative condition numbers can be defined as
\[
  c^{BD}_{\max}(A) := \max_{i,j} \, c^{BD}_{ij}(A), \qquad
  c^{BD}_{\max}(A, b) := \max_{i} \, c^{BD}_{i}(A, b).
\]
Note that, because of the presence of the denominators $|A^{(-1)}_{(L)}|_{ij}$ and $|A^{(-1)}_{(L)} b|_{i}$ in (45) and (48), $c^{BD}_{\max}(A)$ and $c^{BD}_{\max}(A, b)$ can be arbitrarily larger than any given normwise condition number, and it is not possible to relate $c^{BD}_{\max}(A)$ and $c^{BD}_{\max}(A, b)$ via inequalities like (10). The theorem below gives upper bounds on the level-2 condition numbers that correspond to $c^{BD}_{ij}(A)$ and $c^{BD}_{i}(A, b)$. Although it is clear from the proof that the bounds may be far from sharp, we do have the pleasing result that the level-2 condition numbers cannot be significantly larger than the level-1 condition numbers. As in the normwise case, allowing different perturbations $\Delta A$ to different factors in the characterizations (45) and (48) would not affect the results significantly.
Theorem 5.3. The level-2 condition numbers
\[
  c^{BD[2]}_{ij}(A) := \lim_{\epsilon \to 0^{+}} \sup_{|\Delta A| \le \epsilon |A|}
  \frac{|c^{BD}_{ij}(A + \Delta A) - c^{BD}_{ij}(A)|}{\epsilon \, c^{BD}_{ij}(A)} \tag{49}
\]
and
\[
  c^{BD[2]}_{i}(A, b) := \lim_{\epsilon \to 0^{+}}
  \sup_{\substack{|\Delta A| \le \epsilon |A| \\ |\Delta b| \le \epsilon |b|}}
  \frac{|c^{BD}_{i}(A + \Delta A, b + \Delta b) - c^{BD}_{i}(A, b)|}{\epsilon \, c^{BD}_{i}(A, b)} \tag{50}
\]
satisfy
\[
  c^{BD[2]}_{ij}(A) \le 3 \, c^{BD}_{\max}(A) + 1
  \qquad\text{and}\qquad
  c^{BD[2]}_{i}(A, b) \le 3 \, c^{BD}_{\max}(A, b) + 2 \, c^{BD}_{\max}(A) + 2.
\]
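Proof. If $|\Delta A| \le \epsilon |A|$ then $(A + \Delta A)^{(-1)}_{(L)} = A^{(-1)}_{(L)} - A^{(-1)}_{(L)} \Delta A A^{(-1)}_{(L)} + O(\epsilon^2)$, and it follows that
\[
  |A^{(-1)}_{(L)}| - \epsilon \, |A^{(-1)}_{(L)}| \, |A| \, |A^{(-1)}_{(L)}| + O(\epsilon^2)
  \le |(A + \Delta A)^{(-1)}_{(L)}|
  \le |A^{(-1)}_{(L)}| + \epsilon \, |A^{(-1)}_{(L)}| \, |A| \, |A^{(-1)}_{(L)}| + O(\epsilon^2).
\]
Hence, from (45),
\[
  |A^{(-1)}_{(L)}|_{ij} \, [1 - \epsilon \, c^{BD}_{ij}(A)] + O(\epsilon^2)
  \le |(A + \Delta A)^{(-1)}_{(L)}|_{ij}
  \le |A^{(-1)}_{(L)}|_{ij} \, [1 + \epsilon \, c^{BD}_{ij}(A)] + O(\epsilon^2).
\]
From $|A + \Delta A| \le |A| (1 + \epsilon)$,
\[
  (|(A + \Delta A)^{(-1)}_{(L)}| \, |A + \Delta A| \, |(A + \Delta A)^{(-1)}_{(L)}|)_{ij}
  \le (|A^{(-1)}_{(L)}| \, |A| \, |A^{(-1)}_{(L)}|)_{ij} \, [1 + \epsilon + 2 \epsilon \, c^{BD}_{\max}(A)] + O(\epsilon^2).
\]
Hence,
\[
  c^{BD}_{ij}(A + \Delta A)
  = \frac{(|(A + \Delta A)^{(-1)}_{(L)}| \, |A + \Delta A| \, |(A + \Delta A)^{(-1)}_{(L)}|)_{ij}}{|(A + \Delta A)^{(-1)}_{(L)}|_{ij}}
  \le \frac{(|A^{(-1)}_{(L)}| \, |A| \, |A^{(-1)}_{(L)}|)_{ij}}{|A^{(-1)}_{(L)}|_{ij}} \, [1 + \epsilon + 3 \epsilon \, c^{BD}_{\max}(A)] + O(\epsilon^2). \tag{51}
\]
Similarly, using $|A + \Delta A| \ge |A| (1 - \epsilon)$, it is easy to show that
\[
  c^{BD}_{ij}(A + \Delta A)
  = \frac{(|(A + \Delta A)^{(-1)}_{(L)}| \, |A + \Delta A| \, |(A + \Delta A)^{(-1)}_{(L)}|)_{ij}}{|(A + \Delta A)^{(-1)}_{(L)}|_{ij}}
  \ge \frac{(|A^{(-1)}_{(L)}| \, |A| \, |A^{(-1)}_{(L)}|)_{ij}}{|A^{(-1)}_{(L)}|_{ij}} \, [1 - \epsilon - 3 \epsilon \, c^{BD}_{\max}(A)] + O(\epsilon^2). \tag{52}
\]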
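Using (51) and (52),
\[
  c^{BD[2]}_{ij}(A) \le 1 + 3 \, c^{BD}_{\max}(A).
\]
For the constrained linear system $Ax + y = b$ we have, to first order,
\[
  \Delta x = A^{(-1)}_{(L)} \Delta b - A^{(-1)}_{(L)} \Delta A x,
\]
so that
\[
  |A^{(-1)}_{(L)} b| - \epsilon \, |A^{(-1)}_{(L)}| \, |b| - \epsilon \, |A^{(-1)}_{(L)}| \, |A| \, |A^{(-1)}_{(L)} b| + O(\epsilon^2)
  \le |x + \Delta x|
  \le |A^{(-1)}_{(L)} b| + \epsilon \, |A^{(-1)}_{(L)}| \, |b| + \epsilon \, |A^{(-1)}_{(L)}| \, |A| \, |A^{(-1)}_{(L)} b| + O(\epsilon^2).
\]
Hence, by (48),
\[
  |x|_{i} \, [1 - \epsilon \, c^{BD}_{\max}(A, b)] + O(\epsilon^2)
  \le |x + \Delta x|_{i}
  \le |x|_{i} \, [1 + \epsilon \, c^{BD}_{\max}(A, b)] + O(\epsilon^2).
\]
So,
\[
\begin{aligned}
  \frac{1}{\epsilon} \left| \frac{(|(A + \Delta A)^{(-1)}_{(L)}| \, |A + \Delta A| \, |x + \Delta x|)_{i}}{|x + \Delta x|_{i}}
        - \frac{(|A^{(-1)}_{(L)}| \, |A| \, |x|)_{i}}{|x|_{i}} \right|
  &\le \frac{1}{\epsilon} \left| \frac{(|A^{(-1)}_{(L)}| \, |A| \, |x|)_{i} \, (1 + \epsilon + \epsilon \, c^{BD}_{\max}(A) + \epsilon \, c^{BD}_{\max}(A, b))}{|x|_{i} \, (1 - \epsilon \, c^{BD}_{\max}(A, b))}
        - \frac{(|A^{(-1)}_{(L)}| \, |A| \, |x|)_{i}}{|x|_{i}} \right| \\
  &\le [1 + c^{BD}_{\max}(A) + 2 \, c^{BD}_{\max}(A, b)] \, \frac{(|A^{(-1)}_{(L)}| \, |A| \, |x|)_{i}}{|x|_{i}} + O(\epsilon)
\end{aligned} \tag{53}
\]
and
\[
\begin{aligned}
  \frac{1}{\epsilon} \left| \frac{(|(A + \Delta A)^{(-1)}_{(L)}| \, |b + \Delta b|)_{i}}{|x + \Delta x|_{i}}
        - \frac{(|A^{(-1)}_{(L)}| \, |b|)_{i}}{|x|_{i}} \right|
  &\le \frac{1}{\epsilon} \left| \frac{(|A^{(-1)}_{(L)}| \, |b|)_{i} \, (1 + \epsilon + \epsilon \, c^{BD}_{\max}(A))}{|x|_{i} \, (1 - \epsilon \, c^{BD}_{\max}(A, b))}
        - \frac{(|A^{(-1)}_{(L)}| \, |b|)_{i}}{|x|_{i}} \right| \\
  &\le [1 + c^{BD}_{\max}(A) + c^{BD}_{\max}(A, b)] \, \frac{(|A^{(-1)}_{(L)}| \, |b|)_{i}}{|x|_{i}} + O(\epsilon).
\end{aligned} \tag{54}
\]
From (53), (54) and (48), since each of the two fractions on the right is bounded by $c^{BD}_{i}(A, b)$,
\[
  c^{BD[2]}_{i}(A, b) \le 2 + 3 \, c^{BD}_{\max}(A, b) + 2 \, c^{BD}_{\max}(A).
\]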
Acknowledgements
Project 19901006 supported by National Natural Science Foundation of China and Doctoral Point Foundation of China.
References
[1] A. Ben-Israel, T.N.E. Greville, Generalized Inverses: Theory and Applications, Wiley, New York, 1974.
[2] R. Bott, R.J. Duffin, On the algebra of networks, Trans. Am. Math. Soc. 74 (1953) 99–109.
[3] J.W. Demmel, On condition numbers and the distance to the nearest ill-posed problem, Numer. Math. 51 (1987) 251–289.
[4] G.H. Golub, C.F. Van Loan, Matrix Computations, 3rd ed., Johns Hopkins University Press, Baltimore, 1996.
[5] D.J. Higham, Condition numbers and their condition numbers, Linear Algebra Appl. 214 (1995) 193–213.
[6] J. Rohn, New condition numbers for matrices and linear systems, Computing 41 (1989) 167–169.
[7] G. Wang, Y. Wei, Perturbation theory for the Bott–Duffin inverse and its applications, J. Shanghai Normal Univ. 22 (4) (1993) 1–6.