Condition numbers for functions of matrices


Applied Numerical Mathematics 12 (1993) 107-117
North-Holland

APNUM 389

I. Gohberg
Department of Mathematics, Tel Aviv University, Ramat Aviv, Tel Aviv 49978, Israel

I. Koltracht
Department of Mathematics, University of Connecticut, Storrs, CT 06269-3009, USA

Abstract

Gohberg, I. and I. Koltracht, Condition numbers for functions of matrices, Applied Numerical Mathematics 12 (1993) 107-117.

Formulas and estimates for condition numbers of functions of matrices are obtained. They are based on a certain formula of the Frechet derivative of a function of a matrix.

Keywords. Derivative of a function of a matrix; condition number.

1. Introduction

In this paper we give a certain representation for the Frechet derivative of a function of a matrix. Our main application is to the condition numbers of the corresponding map,

$$F: A \to F(A), \qquad A, F(A) \in \mathbb{R}^{n \times n},$$

which we describe now in some detail. We consider here differentiable maps only, in which case the usual condition number of $F$ at $A \ne 0$, with $F(A) \ne 0$, is given by

$$k(F, A) = \frac{\|F'(A)\| \, \|A\|}{\|F(A)\|}$$

(see, for example, Gohberg and Koltracht [1] and references therein). It has the following significance: if $\|X - A\| \le \varepsilon \|A\|$, then

$$\frac{\|F(X) - F(A)\|}{\|F(A)\|} \le \varepsilon\, k(F, A) + o(\varepsilon).$$

Correspondence to: I. Koltracht, Department of Mathematics, University of Connecticut, Storrs, CT 06269-3009, USA.
* This work was partly supported by the NSF (Grant DMS-9007030).
0168-9274/93/$06.00 © 1993 - Elsevier Science Publishers B.V. All rights reserved


I. Gohberg, I. Koltracht / Condition numbers for functions of matrices

Another useful condition number is the mixed condition number, which is given by the formula

$$m(F, A) = \frac{\|F'(A) D_A\|_\infty}{\|F(A)\|_u},$$

where $\|A\|_u = \max_{i,j=1,\dots,n} |A_{ij}|$ (see [1]). We identify here an $n \times n$ matrix $A$ with the $n^2$ vector consisting of concatenated rows of $A$, $A = (A_{11}, A_{12}, \dots, A_{1n}, A_{21}, \dots, A_{nn})$, so that the $u$-norm is the $\infty$-norm of $A$ as a vector in $\mathbb{R}^{n^2}$. The matrix $D_A$ is the corresponding $n^2 \times n^2$ diagonal matrix,

$$D_A = \operatorname{diag}\{A_{11}, A_{12}, \dots, A_{1n}, A_{21}, \dots, A_{2n}, \dots, A_{nn}\}.$$

The mixed condition number relates normwise errors in $F(A)$ to componentwise errors in $A$: if

$$|X_{ij} - A_{ij}| \le \varepsilon |A_{ij}|, \qquad i, j = 1, \dots, n,$$

then

$$\frac{\|F(X) - F(A)\|_u}{\|F(A)\|_u} \le \varepsilon\, m(F, A) + o(\varepsilon).$$
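The two definitions above can be made concrete for a map whose derivative is known in closed form. The following is a minimal sketch for $F(A) = A^2$, where $F'(A)Y = AY + YA$. The function names are hypothetical, and the choice of the Frobenius norm for $k(F, A)$ is an assumption made here so that $\|F'(A)\|$ becomes the spectral norm of an $n^2 \times n^2$ matrix; the text itself leaves the norm unspecified.

```python
import numpy as np

def frechet_matrix_square(A):
    """n^2 x n^2 matrix of F'(A): Y -> A Y + Y A for F(A) = A^2.

    Matrices are flattened row by row, as in the text, so that
    vec(A Y B) = kron(A, B.T) @ vec(Y).
    """
    n = A.shape[0]
    I = np.eye(n)
    return np.kron(A, I) + np.kron(I, A.T)

def usual_condition_number(A):
    # Assumption: Frobenius norm on matrices, so ||F'(A)|| is the
    # spectral norm of the n^2 x n^2 matrix built above.
    K = frechet_matrix_square(A)
    return np.linalg.norm(K, 2) * np.linalg.norm(A, "fro") / np.linalg.norm(A @ A, "fro")

def mixed_condition_number(A):
    K = frechet_matrix_square(A)
    D_A = np.diag(A.ravel())               # diag{A_11, ..., A_1n, A_21, ..., A_nn}
    num = np.linalg.norm(K @ D_A, np.inf)  # maximum absolute row sum
    return num / np.max(np.abs(A @ A))     # divide by ||F(A)||_u

A = np.array([[1.0, 2.0], [3.0, 4.0]])
print(usual_condition_number(A), mixed_condition_number(A))
```

Building the full $n^2 \times n^2$ matrix is only practical for small $n$; it is used here because it makes both norms in the definitions directly computable.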

The obtained representation of the Frechet derivative gives explicit formulas for directional derivatives of the corresponding map. These formulas are also useful for the analysis of structured condition numbers of $F$ at $A$, which correspond to perturbations of $A$ within a structured subset of the domain of definition of $F$. By definition, a structured subset of the domain of definition of $F$ is the range of another differentiable map, say, $H: \mathbb{R}^n \to \mathbb{R}^p$ with n ... then it can be shown that ... and

$$m(F \circ H, a) = \frac{\left\| \left[ a_1 \frac{\partial F}{\partial h_1}, \dots, a_n \frac{\partial F}{\partial h_n} \right] \right\|_\infty}{\|F(A)\|_u},$$

where $\partial F / \partial h_k$ is the directional derivative of $F$ in the direction $h_k$. A more detailed discussion of usual and mixed condition numbers and of structured condition numbers can be found in [1] and [2] respectively.

A representation of the Frechet derivative is obtained in Section 2 for functions $F(\lambda)$ for which we can write

$$\frac{F(u) - F(v)}{u - v} = \sum_{k=1}^{m} \int_0^1 G_k(s, u) H_k(s, v)\, ds.$$

In Section 3 we obtain such formulas for some elementary functions: a power, an exponential, a logarithm, and a sine and a cosine, which lead to some old and sometimes new expressions for condition numbers of corresponding matrix maps. These expressions help to obtain useful estimates on condition numbers, and they can also be used for efficient calculation of condition numbers (see Mathias [3] for the matrix exponential). It is interesting to note that the mixed condition number, $m(F, A)$, for matrices with a certain sign pattern is often identical to $m(F, \alpha)$, the condition number of $F$ as a function of a number $\alpha$.

2. A representation for the derivative

We assume that $F(\lambda)$ is analytic in a disk of radius $R$, $D_R$, and that the spectral radius of the $n \times n$ matrix $A$, $\rho(A)$, satisfies the inequality $\rho(A) < R$.

Theorem 2.1. Suppose that for all $u, v \in D_R$, $u \ne v$,

$$\frac{F(u) - F(v)}{u - v} = \sum_{k=1}^{m} \int_0^1 G_k(s, u) H_k(s, v)\, ds, \tag{2.1}$$

where for each $s \in [0, 1]$ the functions $G_k(s, u)$ and $H_k(s, u)$ are analytic in $D_R$ as functions of $u$, and are continuous functions of two variables in $[0, 1] \times D_R$. Then

$$F'(A): Y \to \sum_{k=1}^{m} \int_0^1 G_k(s, A)\, Y\, H_k(s, A)\, ds. \tag{2.2}$$
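To see what Theorem 2.1 asserts in practice, the sketch below evaluates the right-hand side of (2.2) by numerical quadrature for the exponential, using $G(s, u) = e^{(1-s)u}$ and $H(s, v) = e^{sv}$ with $m = 1$, and compares it with a central finite difference of $e^A$. The helper names, the Taylor-series exponential and the Gauss-Legendre rule are choices made for this illustration only, not part of the paper.

```python
import numpy as np

def expm_taylor(A, terms=40):
    """Matrix exponential by a truncated Taylor series (adequate for small ||A||)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

def frechet_from_representation(G, H, A, Y, nodes=20):
    """Approximate int_0^1 G(s, A) Y H(s, A) ds (the case m = 1 of (2.2))
    with Gauss-Legendre quadrature mapped from [-1, 1] to [0, 1]."""
    x, w = np.polynomial.legendre.leggauss(nodes)
    s_vals, weights = 0.5 * (x + 1.0), 0.5 * w
    return sum(wt * G(s, A) @ Y @ H(s, A) for s, wt in zip(s_vals, weights))

# Representation of the exponential: G(s, u) = e^{(1-s)u}, H(s, v) = e^{sv}.
G = lambda s, A: expm_taylor((1.0 - s) * A)
H = lambda s, A: expm_taylor(s * A)

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) / 2
Y = rng.standard_normal((3, 3))

L_rep = frechet_from_representation(G, H, A, Y)
h = 1e-6
L_fd = (expm_taylor(A + h * Y) - expm_taylor(A - h * Y)) / (2 * h)  # central difference
print(np.max(np.abs(L_rep - L_fd)))
```

The same `frechet_from_representation` helper accepts any pair `G`, `H` of the kind described in the theorem; for $m > 1$ one would sum the result over $k$.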

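Proof. We will prove the representation (2.2) for the case $m = 1$ only. The generalization to arbitrary $m$ is straightforward. We will use the following well-known formula

$$F'(A): Y \to \frac{1}{2\pi i} \int_{|\lambda| = R} F(\lambda) R_\lambda Y R_\lambda \, d\lambda, \tag{2.3}$$

where $R_\lambda = (\lambda I - A)^{-1}$ is the resolvent. We assume first that $A$ is diagonalizable:

$$A = X \operatorname{diag}\{\lambda_1, \dots, \lambda_n\} X^{-1},$$

where $\operatorname{diag}\{\lambda_1, \dots, \lambda_n\}$ denotes the diagonal matrix with $\lambda_1, \dots, \lambda_n$ on the diagonal. It follows from (2.3) that

$$F'(A)Y = X \left[ \frac{1}{2\pi i} \int_{|\lambda| = R} F(\lambda) \operatorname{diag}\big((\lambda - \lambda_1)^{-1}, \dots, (\lambda - \lambda_n)^{-1}\big)\, Z\, \operatorname{diag}\big((\lambda - \lambda_1)^{-1}, \dots, (\lambda - \lambda_n)^{-1}\big)\, d\lambda \right] X^{-1},$$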


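where $Z = X^{-1} Y X$. Therefore

$$[X^{-1} F'(A)Y X]_{k,j} = Z_{kj}\, \frac{1}{2\pi i} \int_{|\lambda| = R} \frac{F(\lambda)}{(\lambda - \lambda_k)(\lambda - \lambda_j)}\, d\lambda.$$

If $\lambda_k \ne \lambda_j$, then

$$\frac{1}{2\pi i} \int_{|\lambda| = R} \frac{F(\lambda)}{(\lambda - \lambda_k)(\lambda - \lambda_j)}\, d\lambda = (\lambda_k - \lambda_j)^{-1}\, \frac{1}{2\pi i} \int_{|\lambda| = R} F(\lambda) \left[ \frac{1}{\lambda - \lambda_k} - \frac{1}{\lambda - \lambda_j} \right] d\lambda = \frac{F(\lambda_k) - F(\lambda_j)}{\lambda_k - \lambda_j}. \tag{2.4}$$

If $\lambda_k = \lambda_j$, then

$$\frac{1}{2\pi i} \int_{|\lambda| = R} \frac{F(\lambda)}{(\lambda - \lambda_k)^2}\, d\lambda = F'(\lambda_k). \tag{2.5}$$

Let now, for the same $Y$ as above,

$$Q_Y = \int_0^1 G(s, A)\, Y\, H(s, A)\, ds$$

(where we omit the subscript 1 in $G$ and $H$), and hence

$$X^{-1} Q_Y X = \int_0^1 \operatorname{diag}\{G(s, \lambda_1), \dots, G(s, \lambda_n)\}\, Z\, \operatorname{diag}\{H(s, \lambda_1), \dots, H(s, \lambda_n)\}\, ds.$$

Thus

$$[X^{-1} Q_Y X]_{k,j} = Z_{kj} \int_0^1 G(s, \lambda_k) H(s, \lambda_j)\, ds.$$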

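For $\lambda_k \ne \lambda_j$ we have

$$\int_0^1 G(s, \lambda_k) H(s, \lambda_j)\, ds = \frac{F(\lambda_k) - F(\lambda_j)}{\lambda_k - \lambda_j},$$

and for $\lambda_k = \lambda_j$ we have

$$\int_0^1 G(s, \lambda_k) H(s, \lambda_k)\, ds = \lim_{u \to \lambda_k} \int_0^1 G(s, \lambda_k) H(s, u)\, ds = \lim_{u \to \lambda_k} \frac{F(\lambda_k) - F(u)}{\lambda_k - u} = F'(\lambda_k).$$

Comparing with (2.4) and (2.5) we see that

$$X^{-1} F'(A)Y X = X^{-1} Q_Y X$$

for all $Y \in \mathbb{R}^{n \times n}$, and hence the operators $F'(A)$ and $\int_0^1 G(s, A)(\cdot)H(s, A)\, ds$ coincide.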
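The diagonalizable case of the proof also suggests a direct way to evaluate $F'(A)Y$: in the eigenbasis, each entry of $Z$ is multiplied by the divided difference from (2.4), or by the derivative from (2.5) when two eigenvalues coincide. Below is a minimal sketch restricted to real symmetric $A$ (an assumption made here so that `numpy.linalg.eigh` applies and $X^{-1} = X^T$); the function names are hypothetical.

```python
import numpy as np

def frechet_via_divided_differences(f, df, A, Y):
    """F'(A)Y for a real symmetric A, using (2.4) and (2.5):
    [X^{-1} F'(A)Y X]_{kj} = Z_{kj} * (f(l_k) - f(l_j)) / (l_k - l_j),
    with the quotient replaced by f'(l_k) when l_k == l_j."""
    lam, X = np.linalg.eigh(A)            # A = X diag(lam) X^T, X orthogonal
    Z = X.T @ Y @ X
    dl = lam[:, None] - lam[None, :]
    same = np.isclose(dl, 0.0)
    safe = np.where(same, 1.0, dl)        # avoid 0/0 on (near-)equal eigenvalues
    dd = np.where(same, df(lam)[:, None], (f(lam)[:, None] - f(lam)[None, :]) / safe)
    return X @ (Z * dd) @ X.T

def sym_fun(f, A):
    """f(A) for real symmetric A through its eigendecomposition."""
    lam, X = np.linalg.eigh(A)
    return (X * f(lam)) @ X.T

A = np.array([[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 1.0]])    # eigenvalues 1, 1, 3
Y = np.array([[0.0, 1.0, 2.0], [1.0, 0.0, -1.0], [2.0, -1.0, 3.0]])  # symmetric direction

L_dd = frechet_via_divided_differences(np.exp, np.exp, A, Y)
h = 1e-6
L_fd = (sym_fun(np.exp, A + h * Y) - sym_fun(np.exp, A - h * Y)) / (2 * h)
print(np.max(np.abs(L_dd - L_fd)))
```

The test matrix has a repeated eigenvalue on purpose, so that both branches (2.4) and (2.5) are exercised. The direction $Y$ is taken symmetric only so that the finite-difference reference can reuse `sym_fun`.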

Suppose now that $A$ is an arbitrary $n \times n$ matrix. Then $A$ can be approximated by diagonalizable matrices as follows: replace each Jordan block of $A$ by

$$\begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda & 1 \\ \varepsilon & 0 & \cdots & 0 & \lambda \end{pmatrix}.$$

The resulting matrix is diagonalizable and becomes arbitrarily close to $A$ as $\varepsilon$ goes to zero. The representation (2.2) follows now from the continuity of $G$ and $H$. □

The representation (2.1) is not unique. Given one such representation, one can obtain similar representations by a change of variables. For example (see Section 3.2 below),

$$\frac{e^u - e^v}{u - v} = \int_0^1 e^{(1-s)u} e^{sv}\, ds.$$

Setting $t = 1 - s$ we also have

$$\frac{e^u - e^v}{u - v} = \int_0^1 e^{tu} e^{(1-t)v}\, dt.$$

The representation (2.2) allows the following estimate of the norm of $F'(A)$ when $A$ is normal.

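Corollary 2.2. Let $AA^* = A^*A$ and let for $k = 1, \dots, m$,

$$\gamma_k(s) = \max_{j=1,\dots,n} |G_k(s, \lambda_j)|, \qquad \gamma_k = \max_{0 \le s \le 1} \gamma_k(s),$$

$$\eta_k(s) = \max_{j=1,\dots,n} |H_k(s, \lambda_j)|, \qquad \eta_k = \max_{0 \le s \le 1} \eta_k(s),$$

where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $A$. Then

$$\|F'(A)\|_2 \le \sum_{k=1}^{m} \int_0^1 \gamma_k(s) \eta_k(s)\, ds \le \sum_{k=1}^{m} \gamma_k \eta_k. \tag{2.6}$$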

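Proof. We will indicate the proof for $m = 1$ only. Since $A = QDQ^*$ where $Q$ is unitary and $D = \operatorname{diag}\{\lambda_1, \dots, \lambda_n\}$, it follows that

$$\|F'(A)Y\|_2 = \left\| Q \int_0^1 \operatorname{diag}\{G(s, \lambda_1), \dots, G(s, \lambda_n)\}\, Z\, \operatorname{diag}\{H(s, \lambda_1), \dots, H(s, \lambda_n)\}\, ds\, Q^* \right\|_2,$$

where $Z = Q^* Y Q$ and hence $\|Z\|_2 = \|Y\|_2$. Thus

$$\|F'(A)Y\|_2 \le \int_0^1 \gamma(s) \eta(s)\, ds\, \|Y\|_2,$$

as $\|\operatorname{diag}\{G(s, \lambda_1), \dots, G(s, \lambda_n)\}\|_2 = \gamma(s)$ and similarly for $H$ and $\eta$. □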
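The bound (2.6) can be checked numerically on a small normal matrix. The sketch below uses the power function $F(A) = A^3$, for which $G_k(s, u) = u^{k-1}$ and $H_k(s, v) = v^{3-k}$ do not depend on $s$. It assumes the Frobenius norm on $Y$, so that $\|F'(A)\|_2$ is the largest singular value of the $n^2 \times n^2$ matrix of $F'(A)$; that choice, and the function names, are assumptions of this illustration.

```python
import numpy as np

def power_frechet_matrix(A, m):
    """Matrix of Y -> sum_k A^(k-1) Y A^(m-k) acting on row-wise flattened Y."""
    mp = np.linalg.matrix_power
    return sum(np.kron(mp(A, k - 1), mp(A, m - k).T) for k in range(1, m + 1))

def corollary_bound_power(A, m):
    """Right-hand side of (2.6) for G_k = u^(k-1), H_k = v^(m-k)."""
    lam = np.linalg.eigvals(A)
    gamma = [np.max(np.abs(lam) ** (k - 1)) for k in range(1, m + 1)]
    eta = [np.max(np.abs(lam) ** (m - k)) for k in range(1, m + 1)]
    return sum(g * e for g, e in zip(gamma, eta))

A = np.array([[1.0, -2.0], [2.0, 1.0]])   # normal: A A^T = A^T A
m = 3
K = power_frechet_matrix(A, m)
lhs = np.linalg.norm(K, 2)                # operator norm under the Frobenius-norm assumption
rhs = corollary_bound_power(A, m)
print(lhs, rhs)
```

For a non-normal matrix the same comparison can fail, which is why the corollary is stated for normal $A$ only.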


We remark that if $A$ is not normal but still diagonalizable, say, $A = XDX^{-1}$, then

$$\|F'(A)\|_2 \le \left( \|X\|_2 \|X^{-1}\|_2 \right)^2 \sum_{k=1}^{m} \int_0^1 \gamma_k(s) \eta_k(s)\, ds.$$

3. Examples

3.1. Powers of A

First let us consider $F(A) = A^m$, $m = 1, 2, \dots$. Then clearly

$$\frac{u^m - v^m}{u - v} = \sum_{k=1}^{m} u^{k-1} v^{m-k} = \sum_{k=1}^{m} \int_0^1 u^{k-1} v^{m-k}\, ds,$$

such that $G_k(s, u) = u^{k-1}$ and $H_k(s, v) = v^{m-k}$. Thus

$$F'(A): Y \to \sum_{k=1}^{m} A^{k-1} Y A^{m-k}.$$

It is clear that $\|F'(A)\| \le m \|A\|^{m-1}$, and hence

$$k(F, A) \le \frac{m \|A\|^m}{\|A^m\|}.$$
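As a small worked check of the derivative formula for powers, the case $m = 2$ can be expanded by hand. This is a sketch added for illustration, assuming a submultiplicative matrix norm in the last line.

```latex
\begin{aligned}
(A + Y)^2 - A^2 &= AY + YA + Y^2, \\
F'(A)Y &= AY + YA = \sum_{k=1}^{2} A^{k-1} Y A^{2-k}, \\
\|F'(A)Y\| &\le 2\, \|A\|\, \|Y\| .
\end{aligned}
```

The term $Y^2$ is of second order in $Y$ and drops out of the derivative, and the last line is the $m = 2$ instance of $\|F'(A)\| \le m \|A\|^{m-1}$.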

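If $A$ is normal, then, with respect to the Euclidean norm, $k(F, A) \le m$, which can also be seen from (2.6). Let us consider now the mixed condition number. Denoting $A = (a_{pq})_{p,q=1}^{n}$ and $A^m = (\alpha_{ij})_{i,j=1}^{n}$ we have

$$\frac{\partial \alpha_{ij}}{\partial a_{pq}} = \sum_{k=1}^{m} (A^{k-1})_{ip} (A^{m-k})_{qj},$$

and hence

$$\|F'(A) D_A\|_\infty = \max_{i,j=1,\dots,n} \sum_{p,q=1}^{n} \left| a_{pq} \sum_{k=1}^{m} (A^{k-1})_{ip} (A^{m-k})_{qj} \right|.$$

The condition number $m(F, A)$ is obtained by dividing this expression by $\|A^m\|_u = \max_{i,j=1,\dots,n} |\alpha_{ij}|$. Suppose now that $A$ has nonnegative entries only, $A = |A|$. Then we can delete the absolute value signs and get

$$\|F'(A) D_A\|_\infty = \max_{i,j=1,\dots,n} \sum_{k=1}^{m} \left[ (A^{k-1}) A (A^{m-k}) \right]_{ij} = \max_{i,j=1,\dots,n} (m A^m)_{ij} = m \|A^m\|_u.$$

Thus in this case $m(F, A) = m$. Note that, for a number $\alpha$, $k(F, \alpha) = m(F, \alpha) = m$.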
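The statement $m(F, A) = m$ for entrywise nonnegative $A$ is easy to test numerically. The sketch below builds the $n^2 \times n^2$ matrix of $F'(A)$ for $F(A) = A^3$ with row-wise flattening and evaluates the definition of the mixed condition number directly; the function name and the random test matrices are illustrative assumptions.

```python
import numpy as np

def mixed_condition_power(A, m):
    """m(F, A) for F(A) = A^m, computed from the definition."""
    mp = np.linalg.matrix_power
    # Matrix of F'(A) acting on row-wise flattened matrices.
    K = sum(np.kron(mp(A, k - 1), mp(A, m - k).T) for k in range(1, m + 1))
    D_A = np.diag(A.ravel())
    return np.linalg.norm(K @ D_A, np.inf) / np.max(np.abs(mp(A, m)))

rng = np.random.default_rng(1)
A_pos = rng.random((4, 4))      # entries in [0, 1)
A_mixed = A_pos - 0.5           # entries of both signs

m_pos = mixed_condition_power(A_pos, 3)
m_mixed = mixed_condition_power(A_mixed, 3)
print(m_pos, m_mixed)
```

With entries of both signs the absolute values can no longer be dropped, and the triangle inequality only gives $m(F, A) \ge m$, so `m_mixed` is not expected to fall below 3.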


Let now $F(\lambda) = \lambda^{-m}$, $m = 1, 2, \dots$. If $A$ is nonsingular, then $F(\lambda)$ is analytic outside a disk which contains no points of the spectrum of $A$, and (2.2) still holds. In this case

$$\frac{u^{-m} - v^{-m}}{u - v} = -\sum_{k=1}^{m} u^{-k} v^{-m-1+k},$$

and hence

$$F'(A): Y \to -\sum_{k=1}^{m} A^{-k} Y A^{-m-1+k}.$$

In particular, for $m = 1$ we get

$$F'(A): Y \to -A^{-1} Y A^{-1}.$$

As in the case of a positive power, $\|F'(A)\| \le m \|A^{-1}\|^{m+1}$, and hence

$$k(F, A) \le \frac{m \|A^{-1}\|^{m+1} \|A\|}{\|A^{-m}\|}.$$

For $m = 1$ we get the familiar $k(F, A) \le \|A^{-1}\| \|A\|$ (which in fact is an equality), and when $A$ is normal then $k(F, A) \le m \|A\| \|A^{-1}\|$ for any $m$. For $m = 1$ we also know (see [1]) that

$$m(F, A) = \frac{\big\|\, |A^{-1}|\, |A|\, |A^{-1}|\, \big\|_u}{\|A^{-1}\|_u},$$

where for $A = (a_{ij})$, by definition, $|A| = (|a_{ij}|)$.
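The closed form for the mixed condition number of the inverse can be compared with the definition $\|F'(A) D_A\|_\infty / \|F(A)\|_u$ evaluated directly. In the sketch below, the function names and the test matrix are illustrative assumptions; the derivative $Y \to -A^{-1} Y A^{-1}$ is the one given in the text.

```python
import numpy as np

def mixed_condition_inverse_direct(A):
    """m(F, A) for F(A) = A^{-1}, straight from the definition."""
    B = np.linalg.inv(A)
    K = -np.kron(B, B.T)                  # matrix of Y -> -A^{-1} Y A^{-1}, row-wise vec
    D_A = np.diag(A.ravel())
    return np.linalg.norm(K @ D_A, np.inf) / np.max(np.abs(B))

def mixed_condition_inverse_formula(A):
    """The closed form || |A^{-1}| |A| |A^{-1}| ||_u / ||A^{-1}||_u."""
    B = np.abs(np.linalg.inv(A))
    return np.max(B @ np.abs(A) @ B) / np.max(B)

A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
m_direct = mixed_condition_inverse_direct(A)
m_formula = mixed_condition_inverse_formula(A)
print(m_direct, m_formula)
```

The two values agree because the absolute row sum of row $(i, j)$ of $F'(A) D_A$ is exactly the $(i, j)$ entry of $|A^{-1}|\, |A|\, |A^{-1}|$.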

3.2. The exponent, F(A) = e^A

Here, see Van Loan [4],

$$\frac{e^u - e^v}{u - v} = \int_0^1 e^{(1-s)u} e^{sv}\, ds,$$

and hence

$$F'(A) = \int_0^1 e^{(1-s)A} (\cdot)\, e^{sA}\, ds. \tag{3.1}$$

When $A$ is normal it follows from (2.6) that $\|F'(A)\|_2 \le \|e^A\|_2$. Indeed, in this case $\gamma(s) = |e^{(1-s)\lambda}|$ and $\eta(s) = |e^{s\lambda}|$, for $0 \le s \le 1$, where $\lambda$ is the eigenvalue of $A$ with the largest real part (see [4]), and hence

$$\int_0^1 \gamma(s) \eta(s)\, ds = |e^{\lambda}| = \|e^A\|_2.$$


Thus, with respect to the Euclidean norm,

$$k(F, A) \le \frac{\|e^A\|_2 \|A\|_2}{\|e^A\|_2} = \|A\|_2.$$

(It is shown in [4] that this is, in fact, an equality.) Let us consider now the mixed condition number. First observe that

$$m(F, A) \le \frac{\left\| \int_0^1 |e^{(1-s)A}|\, |A|\, |e^{sA}|\, ds \right\|_u}{\|e^A\|_u}. \tag{3.2}$$

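Indeed, if $e^{tA} = (\alpha_{ij}(t))_{i,j=1}^{n}$ and $A = (a_{pq})_{p,q=1}^{n}$, then it follows from (3.1) that

$$\frac{\partial \alpha_{ij}(1)}{\partial a_{pq}} = \int_0^1 \alpha_{ip}(1-s)\, \alpha_{qj}(s)\, ds.$$

Hence the $(i, j)$th row of $F'(A) D_A$ is of the form

$$q_{ij} = \left( \int_0^1 \alpha_{ip}(1-s)\, a_{pq}\, \alpha_{qj}(s)\, ds \right)_{p,q=1}^{n},$$

and

$$\|q_{ij}\|_1 = \sum_{p,q=1}^{n} \left| \int_0^1 \alpha_{ip}(1-s)\, a_{pq}\, \alpha_{qj}(s)\, ds \right| \le \int_0^1 \sum_{p,q=1}^{n} |\alpha_{ip}(1-s)|\, |a_{pq}|\, |\alpha_{qj}(s)|\, ds = \left[ \int_0^1 |e^{(1-s)A}|\, |A|\, |e^{sA}|\, ds \right]_{ij}.$$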

The inequality (3.2) follows now from the fact that

$$\|F'(A) D_A\|_\infty = \max_{i,j=1,\dots,n} \|q_{ij}\|_1 \le \left\| \int_0^1 |e^{(1-s)A}|\, |A|\, |e^{sA}|\, ds \right\|_u.$$

If the matrix $A$ has nonnegative entries, then

$$\int_0^1 |e^{(1-s)A}|\, |A|\, |e^{sA}|\, ds = \int_0^1 e^{(1-s)A} A e^{sA}\, ds = A e^A,$$

and hence $\|F'(A) D_A\|_\infty = \|A e^A\|_u \le \|A\|_\infty \|e^A\|_u$. Thus in this case

$$m(F, A) \le \|A\|_\infty. \tag{3.3}$$
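For a small nonnegative matrix, both the identity $\|F'(A) D_A\|_\infty = \|A e^A\|_u$ and the bound (3.3) can be checked by assembling the matrix of (3.1) with numerical quadrature. The following is a sketch under stated assumptions: the Taylor-series exponential, the Gauss-Legendre rule and all function names are choices made for this illustration.

```python
import numpy as np

def expm_taylor(A, terms=40):
    """Matrix exponential by a truncated Taylor series (adequate for small ||A||)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

def exp_frechet_matrix(A, nodes=20):
    """Matrix of (3.1) on row-wise flattened Y, by Gauss-Legendre quadrature on [0, 1]."""
    x, w = np.polynomial.legendre.leggauss(nodes)
    return sum(0.5 * wt * np.kron(expm_taylor((1.0 - s) * A), expm_taylor(s * A).T)
               for s, wt in zip(0.5 * (x + 1.0), w))

A = np.array([[0.5, 1.0], [0.2, 0.3]])                 # nonnegative entries
E = expm_taylor(A)
K = exp_frechet_matrix(A)
num = np.linalg.norm(K @ np.diag(A.ravel()), np.inf)   # ||F'(A) D_A||_inf
m_FA = num / np.max(np.abs(E))                         # mixed condition number
bound = np.linalg.norm(A, np.inf)                      # right-hand side of (3.3)
print(num, np.max(A @ E))
print(m_FA, bound)
```

The first printed pair should agree up to quadrature and rounding error, and the second pair illustrates (3.3).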


3.3. The logarithm, F(λ) = log(1 + λ)

A direct calculation shows that for all $|u|, |v| < 1$, $u \ne v$:

$$\frac{\log(1 + u) - \log(1 + v)}{u - v} = \int_0^1 \frac{1}{1 + su} \cdot \frac{1}{1 + sv}\, ds.$$

Thus for a matrix $A$ with spectral radius less than 1,

$$F'(A) = \int_0^1 (I + sA)^{-1} (\cdot)\, (I + sA)^{-1}\, ds. \tag{3.4}$$
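A quick consistency check of (3.4) is to apply it to $Y = I$. Since $I$ commutes with $A$, the integral reduces to $\int_0^1 (I + sA)^{-2}\, ds = (I + A)^{-1}$, the scalar derivative $1/(1 + \lambda)$ evaluated at $A$. The sketch below confirms this with Gauss-Legendre quadrature; the function name and the test matrix are illustrative assumptions.

```python
import numpy as np

def log1p_frechet(A, Y, nodes=40):
    """Approximate (3.4): int_0^1 (I + sA)^{-1} Y (I + sA)^{-1} ds."""
    n = A.shape[0]
    x, w = np.polynomial.legendre.leggauss(nodes)
    total = np.zeros((n, n))
    for s, wt in zip(0.5 * (x + 1.0), 0.5 * w):
        R = np.linalg.inv(np.eye(n) + s * A)   # (I + sA)^{-1}
        total += wt * R @ Y @ R
    return total

A = np.array([[0.2, -0.3], [0.1, -0.4]])       # spectral radius less than 1
L_I = log1p_frechet(A, np.eye(2))
err = np.max(np.abs(L_I - np.linalg.inv(np.eye(2) + A)))
print(err)
```

The same routine can be called with any direction `Y`; only the comparison value $(I + A)^{-1}$ is specific to $Y = I$.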

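Suppose first that $A$ is normal and let

$$|1 + \lambda| = \min_{k=1,\dots,n} |1 + \lambda_k|,$$

where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $A$. We exclude the case when all eigenvalues have nonnegative real parts because in this case $\|F'(A)\|_2 \le 1$ and the map is well-conditioned. So let $\operatorname{Re} \lambda < 0$. Then for any $s \in [0, 1]$ and $k = 1, \dots, n$:

$$\sqrt{2}\, |1 + s\lambda_k| \ge |1 + \lambda|.$$

Indeed, it is an elementary fact that, if $|\lambda| < 1$ and $\operatorname{Re} \lambda < 0$, then $|1 + \lambda| \le \sqrt{2}\, |1 + s\lambda|$ for all $s \in [0, 1]$. Hence, if $\operatorname{Re} \lambda_k < 0$, then

$$\sqrt{2}\, |1 + s\lambda_k| \ge |1 + \lambda_k| \ge |1 + \lambda|,$$

and, if $\operatorname{Re} \lambda_k \ge 0$, then

$$\sqrt{2}\, |1 + s\lambda_k| \ge \sqrt{2} \ge |1 + \lambda|.$$

Thus

$$\|F'(A)\|_2 \le \int_0^1 \max_{k=1,\dots,n} \frac{1}{|1 + s\lambda_k|^2}\, ds \le 2 \max_{k=1,\dots,n} \frac{1}{|1 + \lambda_k|^2} = 2\, \|(I + A)^{-1}\|_2^2. \tag{3.5}$$

On the other hand, if $y_k$ is the eigenvector of $A$ corresponding to $\lambda_k$, then

$$F'(A)(y_k y_k^*) = \int_0^1 \frac{ds}{(1 + s\lambda_k)^2}\; y_k y_k^* = \frac{1}{1 + \lambda_k}\, y_k y_k^*.$$

Hence

$$\|F'(A)\|_2 \ge \max_{k=1,\dots,n} \frac{1}{|1 + \lambda_k|} = \|(I + A)^{-1}\|_2.$$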


Thus

$$\frac{2\, \|A\|_2\, \|(I + A)^{-1}\|_2^2}{\|\log(I + A)\|_2} \ge k(F, A) \ge \frac{\|A\|_2\, \|(I + A)^{-1}\|_2}{\|\log(I + A)\|_2},$$

and we see that when $A$ is normal, $\log(I + A)$ is ill-conditioned if $A$ has eigenvalues close to $-1$ and well-conditioned otherwise. Note that when $A$ is a number, say $\alpha$, then

$$k(F, \alpha) = \frac{|\alpha|\, |F'(\alpha)|}{|F(\alpha)|} = \frac{|\alpha (1 + \alpha)^{-1}|}{|\log(1 + \alpha)|}.$$

Note also that, when $\operatorname{Re} \alpha \ge 0$, then $|F'(\alpha)| \le 1$ and

$$k(F, \alpha) \le \frac{|\alpha|}{|\log(1 + \alpha)|} \to 1,$$

when $\alpha \to 0$.

Let us consider now the mixed condition number for a general $A$. Using the same techniques as in Section 3.2 for the exponential function, it is straightforward to see that

$$\|F'(A) D_A\|_\infty \le \left\| \int_0^1 |(I + sA)^{-1}|\, |A|\, |(I + sA)^{-1}|\, ds \right\|_u,$$

and hence

$$m(F, A) \le \frac{\left\| \int_0^1 |(I + sA)^{-1}|\, |A|\, |(I + sA)^{-1}|\, ds \right\|_u}{\|\log(I + A)\|_u}.$$

If $A$ has nonpositive entries, then $(I + sA)^{-1}$ has nonnegative entries, since the spectral radius of $sA$ is less than 1. Thus we can remove absolute value signs in the integral. In this case the integral equals $-A(I + A)^{-1}$ and

$$m(F, A) \le \frac{\|A\|_\infty\, \|(I + A)^{-1}\|_u}{\|\log(I + A)\|_u}.$$

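3.4. Trigonometric functions, F(λ) = cos λ and F(λ) = sin λ

For $F(\lambda) = \cos \lambda$ we have

$$\frac{\cos u - \cos v}{u - v} = -\frac{1}{2i} \int_0^1 \left[ e^{i(1-s)u} e^{isv} - e^{-i(1-s)u} e^{-isv} \right] ds,$$

and hence

$$F'(A)Y = -\frac{1}{2i} \int_0^1 \left[ e^{i(1-s)A}\, Y\, e^{isA} - e^{-i(1-s)A}\, Y\, e^{-isA} \right] ds.$$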

Substituting $Y = I$ we see that

$$F'(A)I = -\frac{1}{2i} \left[ e^{iA} - e^{-iA} \right] = -\sin A,$$

and hence $\|F'(A)\| \ge \|\sin A\|$. If $A$ is selfadjoint, then it follows from (2.6) that $\|F'(A)\|_2 \le 1$. Thus for a selfadjoint $A$,

$$\frac{\|\sin A\|_2\, \|A\|_2}{\|\cos A\|_2} \le k(F, A) \le \frac{\|A\|_2}{\|\cos A\|_2}.$$
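The identity $F'(A)I = -\sin A$ and the two-sided estimate for selfadjoint $A$ can be illustrated on a small real symmetric matrix. In the sketch below, matrix functions are evaluated through the eigendecomposition, and the derivative in the direction $I$ is approximated by a central difference; the helper names and the test matrix are assumptions of this illustration.

```python
import numpy as np

def sym_fun(f, A):
    """f(A) for real symmetric A through its eigendecomposition."""
    lam, X = np.linalg.eigh(A)
    return (X * f(lam)) @ X.T

A = np.array([[1.0, 0.5], [0.5, -0.3]])
I2 = np.eye(2)
h = 1e-6
# Directional derivative of cos at A in the direction Y = I.
fd = (sym_fun(np.cos, A + h * I2) - sym_fun(np.cos, A - h * I2)) / (2 * h)
err = np.max(np.abs(fd + sym_fun(np.sin, A)))   # should be close to zero

norm2 = lambda M: np.linalg.norm(M, 2)
lower = norm2(sym_fun(np.sin, A)) * norm2(A) / norm2(sym_fun(np.cos, A))
upper = norm2(A) / norm2(sym_fun(np.cos, A))
print(err, lower, upper)
```

The gap between `lower` and `upper` is exactly the factor $\|\sin A\|_2 \le 1$, so the estimate is tight when some eigenvalue of $A$ is close to an odd multiple of $\pi/2$.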

For $F(\lambda) = \sin \lambda$ we have in a similar way,

$$\frac{\sin u - \sin v}{u - v} = \frac{1}{2} \int_0^1 \left[ e^{i(1-s)u} e^{isv} + e^{-i(1-s)u} e^{-isv} \right] ds,$$

$$F'(A)Y = \frac{1}{2} \int_0^1 \left[ e^{i(1-s)A}\, Y\, e^{isA} + e^{-i(1-s)A}\, Y\, e^{-isA} \right] ds,$$

$\|F'(A)\| \ge \|\cos A\|$, and for $A = A^*$,

$$\frac{\|\cos A\|_2\, \|A\|_2}{\|\sin A\|_2} \le k(F, A) \le \frac{\|A\|_2}{\|\sin A\|_2}.$$

We do not have a representation of the form (2.1) for $F(\lambda) = \sqrt{\lambda}$, or more generally, for $F(\lambda) = \lambda^{p/q}$, where $p$ and $q \ne 0$ are integers.

References

[1] I. Gohberg and I. Koltracht, Componentwise mixed and structured condition numbers, SIAM J. Matrix Anal. 14 (3) (1993).
[2] I. Gohberg and I. Koltracht, Structured condition numbers for linear structures, IMA Preprint Series No. 1030, Institute for Mathematics and its Applications, University of Minnesota, Minneapolis, MN (1992); also in: Proceedings Linear Algebra and Signal Processing Workshop, Minneapolis, MN (1992).
[3] R. Mathias, Evaluating the Frechet derivative of the matrix exponential, Preprint (1992).
[4] C.F. Van Loan, The sensitivity of matrix exponential, SIAM J. Numer. Anal. 14 (1977) 971-981.