Applied Numerical Mathematics 12 (1993) 107-117
North-Holland

APNUM 389
Condition numbers for functions of matrices *

I. Gohberg
Department of Mathematics, Tel Aviv University, Ramat Aviv, Tel Aviv 49978, Israel

I. Koltracht
Department of Mathematics, University of Connecticut, Storrs, CT 06269-3009, USA
Abstract

Gohberg, I. and I. Koltracht, Condition numbers for functions of matrices, Applied Numerical Mathematics 12 (1993) 107-117.

Formulas and estimates for condition numbers of functions of matrices are obtained. They are based on a certain formula of the Frechet derivative of a function of a matrix.

Keywords. Derivative of a function of a matrix; condition number.
1. Introduction
In this paper we give a certain representation for the Frechet derivative of a function of a matrix. Our main application is to the condition numbers of the corresponding map,

$$F: A \to F(A), \qquad A, F(A) \in \mathbb{R}^{n \times n},$$

which we describe now in some detail. We consider here differentiable maps only, in which case the usual condition number of $F$ at $A \neq 0$, with $F(A) \neq 0$, is given by

$$k(F, A) = \frac{\|F'(A)\| \, \|A\|}{\|F(A)\|}$$

(see, for example, Gohberg and Koltracht [1] and references therein). It has the following significance: if $\|X - A\| \le \varepsilon \|A\|$, then

$$\frac{\|F(X) - F(A)\|}{\|F(A)\|} \le \varepsilon\, k(F, A) + o(\varepsilon).$$

Correspondence to: I. Koltracht, Department of Mathematics, University of Connecticut, Storrs, CT 06269-3009, USA.
* This work was partly supported by the NSF (Grant DMS-9007030).
0168-9274/93/$06.00 © 1993 - Elsevier Science Publishers B.V. All rights reserved
I. Gohberg, I. Koltracht / Condition numbers for functions of matrices
Another useful condition number is the mixed condition number, which is given by the formula

$$m(F, A) = \frac{\|F'(A) D_A\|_\infty}{\|F(A)\|_u},$$

where $\|A\|_u = \max_{i,j=1,\dots,n} |A_{ij}|$ (see [1]). We identify here an $n \times n$ matrix $A$ with the $n^2$ vector consisting of concatenated rows of $A$, $A = (A_{11}, A_{12}, \dots, A_{1n}, A_{21}, \dots, A_{nn})$, so that the u-norm is the $\infty$-norm of $A$ as a vector in $\mathbb{R}^{n^2}$. The matrix $D_A$ is the corresponding $n^2 \times n^2$ diagonal matrix,

$$D_A = \operatorname{diag}\{A_{11}, A_{12}, \dots, A_{1n}, A_{21}, \dots, A_{2n}, \dots, A_{nn}\}.$$

The mixed condition number relates normwise errors in $F(A)$ to componentwise errors in $A$: if

$$|X_{ij} - A_{ij}| \le \varepsilon |A_{ij}|, \qquad i, j = 1, \dots, n,$$

then

$$\frac{\|F(X) - F(A)\|_u}{\|F(A)\|_u} \le \varepsilon\, m(F, A) + o(\varepsilon).$$
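The definition of the mixed condition number can be tried out numerically. The following is a minimal sketch, not taken from the paper: it approximates $F'(A)$ by central differences, uses the hypothetical test map $F(A) = A^2$, and compares the observed componentwise-perturbation error with the first-order bound $\varepsilon\, m(F, A)$. The helper names `jacobian_fd` and `mixed_condition_number` are illustrative assumptions.

```python
import numpy as np

def jacobian_fd(F, A, h=1e-5):
    """Central-difference Jacobian of vec(F(A)) with respect to vec(A).

    Matrices are flattened row by row, matching the identification of A
    with the vector (A_11, A_12, ..., A_nn).
    """
    n = A.shape[0]
    J = np.zeros((n * n, n * n))
    for p in range(n):
        for q in range(n):
            E = np.zeros_like(A)
            E[p, q] = h
            J[:, p * n + q] = ((F(A + E) - F(A - E)) / (2 * h)).reshape(-1)
    return J

def mixed_condition_number(F, A):
    """m(F, A) = ||F'(A) D_A||_inf / ||F(A)||_u, with F'(A) approximated numerically."""
    J = jacobian_fd(F, A)
    D_A = np.diag(A.reshape(-1))
    return np.linalg.norm(J @ D_A, np.inf) / np.max(np.abs(F(A)))

# Hypothetical test map (an assumption for illustration): F(A) = A^2.
F = lambda A: A @ A
rng = np.random.default_rng(0)
A = rng.uniform(0.5, 2.0, size=(3, 3))
m = mixed_condition_number(F, A)

# Componentwise relative perturbation |X_ij - A_ij| <= eps |A_ij|.
eps = 1e-6
X = A * (1.0 + eps * rng.uniform(-1.0, 1.0, size=A.shape))
rel_err = np.max(np.abs(F(X) - F(A))) / np.max(np.abs(F(A)))
print(f"m(F, A) = {m:.6f}, observed relative error / eps = {rel_err / eps:.6f}")
```

The finite-difference Jacobian is only a convenience here; whenever a closed form for $F'(A)$ is available it is the better choice.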
The obtained representation of the Frechet derivative gives explicit formulas for directional derivatives of the corresponding map. These formulas are also useful for the analysis of structured condition numbers of $F$ at $A$, which correspond to perturbations of $A$ within a structured subset of the domain of definition of $F$. By definition, a structured subset of the domain of definition of $F$ is the range of another differentiable map, say, $H: \mathbb{R}^n \to \mathbb{R}^p$ with n
then it can be shown that
and aF al$..-Yanah, m(FoH,a)=
II F(A)
aF Ill
llu
m ’
where $\partial F / \partial h_k$ is the directional derivative of $F$ in the direction $h_k$. A more detailed discussion of usual and mixed condition numbers and of structured condition numbers can be found in [1] and [2], respectively.
A representation of the Frechet derivative is obtained in Section 2 for functions $F(\lambda)$ for which we can write

$$\frac{F(u) - F(v)}{u - v} = \sum_{k=1}^{m} \int_0^1 G_k(s, u) H_k(s, v)\, ds.$$
In Section 3 we obtain such formulas for some elementary functions: a power, an exponential, a logarithm, and a sine and a cosine, which lead to some old and sometimes new expressions for condition numbers of corresponding matrix maps. These expressions help to obtain useful estimates on condition numbers, and they can also be used for efficient calculation of condition numbers (see Mathias [3] for the matrix exponential). It is interesting to note that the mixed condition number, $m(F, A)$, for matrices with a certain sign pattern is often identical to $m(F, \alpha)$, the condition number of $F$ as a function of a number $\alpha$.
2. A representation for the derivative
We assume that $F(\lambda)$ is analytic in a disk of radius $R$, $D_R$, and that the spectral radius of the $n \times n$ matrix $A$, $\rho(A)$, satisfies the inequality $\rho(A) < R$.

Theorem 2.1. Suppose that for all $u, v \in D_R$, $u \neq v$,

$$\frac{F(u) - F(v)}{u - v} = \sum_{k=1}^{m} \int_0^1 G_k(s, u) H_k(s, v)\, ds, \qquad (2.1)$$

where for each $s \in [0, 1]$ the functions $G_k(s, u)$ and $H_k(s, u)$ are analytic in $D_R$ as functions of $u$, and are continuous functions of two variables in $[0, 1] \times D_R$. Then

$$F'(A): Y \to \sum_{k=1}^{m} \int_0^1 G_k(s, A)\, Y\, H_k(s, A)\, ds. \qquad (2.2)$$
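As a small numerical illustration of how (2.1) turns into (2.2), the sketch below evaluates the right-hand side of (2.2) by Gauss-Legendre quadrature and compares it with a central finite difference. The test function $F(u) = 1/(1-u)$ is an assumption chosen for illustration (it is not an example from the paper); for it $(F(u) - F(v))/(u - v) = (1-u)^{-1}(1-v)^{-1}$, so $m = 1$ and $G$, $H$ do not depend on $s$. The function name `frechet_via_representation` is hypothetical.

```python
import numpy as np

def frechet_via_representation(G_list, H_list, A, Y, num_nodes=20):
    """Evaluate sum_k int_0^1 G_k(s, A) Y H_k(s, A) ds by Gauss-Legendre quadrature."""
    x, w = np.polynomial.legendre.leggauss(num_nodes)   # nodes and weights on [-1, 1]
    s_nodes, weights = 0.5 * (x + 1.0), 0.5 * w          # mapped to [0, 1]
    total = np.zeros_like(A, dtype=float)
    for G, H in zip(G_list, H_list):
        for s, wt in zip(s_nodes, weights):
            total += wt * (G(s, A) @ Y @ H(s, A))
    return total

# Assumed example: F(u) = 1/(1 - u), analytic in the unit disk.
I3 = np.eye(3)
resolvent = lambda s, M: np.linalg.inv(I3 - M)   # G(s, .) = H(s, .) = (I - .)^{-1}
F = lambda M: np.linalg.inv(I3 - M)

rng = np.random.default_rng(1)
B = rng.standard_normal((3, 3))
A = 0.5 * B / np.linalg.norm(B, 2)               # spectral radius at most 0.5 < 1
Y = rng.standard_normal((3, 3))

D_formula = frechet_via_representation([resolvent], [resolvent], A, Y)
h = 1e-6
D_fd = (F(A + h * Y) - F(A - h * Y)) / (2 * h)
fd_gap = np.max(np.abs(D_formula - D_fd))
print("max deviation from finite difference:", fd_gap)
```

The same routine accepts $s$-dependent pairs $G_k$, $H_k$ without change, since the quadrature nodes are passed to each callable.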
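Proof. We will prove the representation (2.2) for the case $m = 1$ only. The generalization to arbitrary $m$ is straightforward. We will use the following well-known formula

$$F'(A): Y \to \frac{1}{2\pi i} \oint_{|\lambda| = R} F(\lambda) R_\lambda Y R_\lambda\, d\lambda, \qquad (2.3)$$

where $R_\lambda = (\lambda I - A)^{-1}$ is the resolvent. We assume first that $A$ is diagonalizable:

$$A = X \operatorname{diag}\{\lambda_1, \dots, \lambda_n\} X^{-1},$$

where $\operatorname{diag}\{\lambda_1, \dots, \lambda_n\}$ denotes the diagonal matrix with $\lambda_1, \dots, \lambda_n$ on the diagonal. It follows from (2.3) that

$$F'(A)Y = X \left[ \frac{1}{2\pi i} \oint_{|\lambda| = R} F(\lambda) \operatorname{diag}\{(\lambda - \lambda_1)^{-1}, \dots, (\lambda - \lambda_n)^{-1}\}\, Z\, \operatorname{diag}\{(\lambda - \lambda_1)^{-1}, \dots, (\lambda - \lambda_n)^{-1}\}\, d\lambda \right] X^{-1},$$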
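where $Z = X^{-1} Y X$. Therefore

$$[X^{-1} F'(A) Y X]_{k,j} = Z_{kj}\, \frac{1}{2\pi i} \oint_{|\lambda| = R} \frac{F(\lambda)}{(\lambda - \lambda_k)(\lambda - \lambda_j)}\, d\lambda.$$

If $\lambda_k \neq \lambda_j$, then

$$\frac{1}{2\pi i} \oint_{|\lambda| = R} \frac{F(\lambda)}{(\lambda - \lambda_k)(\lambda - \lambda_j)}\, d\lambda = (\lambda_k - \lambda_j)^{-1} \frac{1}{2\pi i} \oint_{|\lambda| = R} F(\lambda) \left[ \frac{1}{\lambda - \lambda_k} - \frac{1}{\lambda - \lambda_j} \right] d\lambda = \frac{F(\lambda_k) - F(\lambda_j)}{\lambda_k - \lambda_j}. \qquad (2.4)$$

If $\lambda_k = \lambda_j$, then

$$\frac{1}{2\pi i} \oint_{|\lambda| = R} \frac{F(\lambda)}{(\lambda - \lambda_k)^2}\, d\lambda = F'(\lambda_k). \qquad (2.5)$$

Let now for the same $Y$ as above,

$$Q_Y = \int_0^1 G(s, A)\, Y\, H(s, A)\, ds$$

(where we omit the subscript 1 in $G$ and $H$), and hence

$$X^{-1} Q_Y X = \int_0^1 \operatorname{diag}\{G(s, \lambda_1), \dots, G(s, \lambda_n)\}\, Z\, \operatorname{diag}\{H(s, \lambda_1), \dots, H(s, \lambda_n)\}\, ds.$$

Thus

$$[X^{-1} Q_Y X]_{k,j} = Z_{kj} \int_0^1 G(s, \lambda_k) H(s, \lambda_j)\, ds.$$

For $\lambda_k \neq \lambda_j$ we have, by (2.1),

$$\int_0^1 G(s, \lambda_k) H(s, \lambda_j)\, ds = \frac{F(\lambda_k) - F(\lambda_j)}{\lambda_k - \lambda_j},$$

and for $\lambda_k = \lambda_j$ we have

$$\int_0^1 G(s, \lambda_k) H(s, \lambda_k)\, ds = \lim_{u \to \lambda_k} \int_0^1 G(s, \lambda_k) H(s, u)\, ds = \lim_{u \to \lambda_k} \frac{F(\lambda_k) - F(u)}{\lambda_k - u} = F'(\lambda_k).$$

Comparing with (2.4) and (2.5) we see that

$$X^{-1} F'(A) Y X = X^{-1} Q_Y X$$

for all $Y \in \mathbb{R}^{n \times n}$, and hence the operators $F'(A)$ and $\int_0^1 G(s, A)(\cdot)H(s, A)\, ds$ coincide.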
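The diagonalizable case of the proof is itself a usable recipe: by (2.4) and (2.5), $X^{-1} F'(A) Y X$ is the entrywise product of $Z = X^{-1} Y X$ with the matrix of divided differences of $F$ at the eigenvalues. A minimal sketch follows, assuming a diagonalizable $A$ with a reasonably conditioned eigenvector matrix. The function name `frechet_diagonalizable` and the test function $F(\lambda) = \lambda^3$ are illustrative choices, not from the paper.

```python
import numpy as np

def frechet_diagonalizable(F, dF, A, Y, tol=1e-12):
    """F'(A)Y = X (Z * Delta) X^{-1}, with Delta_kj the divided differences (2.4)-(2.5)."""
    lam, X = np.linalg.eig(A)
    Z = np.linalg.solve(X, Y @ X)                 # Z = X^{-1} Y X
    lk, lj = np.meshgrid(lam, lam, indexing="ij")
    diff = lk - lj
    close = np.abs(diff) < tol                    # treat as lambda_k = lambda_j
    safe = np.where(close, 1.0, diff)             # avoid division by zero
    Delta = np.where(close, dF(lk), (F(lk) - F(lj)) / safe)
    return X @ (Z * Delta) @ np.linalg.inv(X)

# Assumed test function: F(lambda) = lambda^3, whose derivative at A in the
# direction Y is A^2 Y + A Y A + Y A^2 by the product rule.
F = lambda z: z ** 3
dF = lambda z: 3 * z ** 2

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
Y = rng.standard_normal((4, 4))
D = frechet_diagonalizable(F, dF, A, Y)
reference = A @ A @ Y + A @ Y @ A + Y @ A @ A
max_gap = np.max(np.abs(D - reference))
print("max deviation from product-rule derivative:", max_gap)
```

For a real matrix with complex eigenvalues the result is returned as a complex array whose imaginary part is at rounding level.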
Suppose now that $A$ is an arbitrary $n \times n$ matrix. Then $A$ can be approximated by diagonalizable matrices as follows: replace each Jordan block of $A$ by

$$\begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda & 1 \\ \varepsilon & 0 & \cdots & 0 & \lambda \end{pmatrix}.$$

The resulting matrix is diagonalizable and becomes arbitrarily close to $A$ as $\varepsilon$ goes to zero. The representation (2.2) follows now from the continuity of $G$ and $H$. □

The representation (2.1) is not unique. Given one such representation, one can obtain similar representations by a change of variables. For example (see Section 3.2 below),

$$\frac{e^u - e^v}{u - v} = \int_0^1 e^{(1-s)u} e^{sv}\, ds.$$

Setting $t = 1 - s$ we also have

$$\frac{e^u - e^v}{u - v} = \int_0^1 e^{tu} e^{(1-t)v}\, dt.$$

The representation (2.2) allows the following estimate of the norm of $F'(A)$ when $A$ is normal.
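Corollary 2.2. Let $AA^* = A^*A$ and let for $k = 1, \dots, m$,

$$\gamma_k(s) = \max_{j=1,\dots,n} |G_k(s, \lambda_j)|, \qquad \gamma_k = \max_{0 \le s \le 1} \gamma_k(s),$$

$$\eta_k(s) = \max_{j=1,\dots,n} |H_k(s, \lambda_j)|, \qquad \eta_k = \max_{0 \le s \le 1} \eta_k(s),$$

where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $A$. Then

$$\|F'(A)\|_2 \le \sum_{k=1}^{m} \int_0^1 \gamma_k(s) \eta_k(s)\, ds \le \sum_{k=1}^{m} \gamma_k \eta_k. \qquad (2.6)$$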
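Proof. We will indicate the proof for $m = 1$ only. Since $A = QDQ^*$ where $Q$ is unitary and $D = \operatorname{diag}\{\lambda_1, \dots, \lambda_n\}$, it follows that

$$\|F'(A)Y\|_2 = \left\| Q \int_0^1 \operatorname{diag}\{G(s, \lambda_1), \dots, G(s, \lambda_n)\}\, Z\, \operatorname{diag}\{H(s, \lambda_1), \dots, H(s, \lambda_n)\}\, ds\, Q^* \right\|_2,$$

where $Z = Q^* Y Q$ and hence $\|Z\|_2 = \|Y\|_2$. Thus

$$\|F'(A)Y\|_2 \le \int_0^1 \gamma(s) \eta(s)\, ds\, \|Y\|_2,$$

as $\|\operatorname{diag}\{G(s, \lambda_1), \dots, G(s, \lambda_n)\}\|_2 = \gamma(s)$ and similarly for $H$ and $\eta$. □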
We remark that if $A$ is not normal but still diagonalizable, say, $A = XDX^{-1}$, then

$$\|F'(A)\|_2 \le (\|X\|_2 \|X^{-1}\|_2)^2 \sum_{k=1}^{m} \int_0^1 \gamma_k(s) \eta_k(s)\, ds.$$
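The bound (2.6) can be sanity-checked on a concrete normal matrix. The sketch below is an illustration under stated assumptions, not part of the paper's argument: it takes the test function $F(\lambda) = \lambda^m$, for which $G_k(s, u) = u^{k-1}$ and $H_k(s, v) = v^{m-k}$ do not depend on $s$, uses a random real symmetric $A$, samples random directions $Y$, and also evaluates one rank-one direction in which the bound is attained. The helper `power_frechet` is a hypothetical name.

```python
import numpy as np

def power_frechet(A, Y, m):
    """F'(A)Y for F(A) = A^m, i.e. the sum of A^(k-1) Y A^(m-k) over k = 1..m."""
    mp = np.linalg.matrix_power
    return sum(mp(A, k - 1) @ Y @ mp(A, m - k) for k in range(1, m + 1))

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                      # real symmetric, hence normal
m = 3
lam, Q = np.linalg.eigh(A)
rho = np.max(np.abs(lam))

# gamma_k = max_j |lam_j|^(k-1) and eta_k = max_j |lam_j|^(m-k), independent of s.
bound = sum(rho ** (k - 1) * rho ** (m - k) for k in range(1, m + 1))

# Sampled ratios ||F'(A)Y||_2 / ||Y||_2 must stay below the bound.
ratios = []
for _ in range(200):
    Y = rng.standard_normal((4, 4))
    ratios.append(np.linalg.norm(power_frechet(A, Y, m), 2) / np.linalg.norm(Y, 2))

# Rank-one direction built from the eigenvector of largest |eigenvalue|.
v = Q[:, np.argmax(np.abs(lam))]
sharp = np.linalg.norm(power_frechet(A, np.outer(v, v), m), 2)
print(max(ratios), sharp, bound)
```

Random sampling only gives a lower estimate of $\|F'(A)\|_2$; the rank-one direction shows that, for this particular test function, the right-hand side of (2.6) is reached.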
3. Examples

3.1. Powers of A

First let us consider $F(A) = A^m$, $m = 1, 2, \dots$. Then clearly

$$\frac{u^m - v^m}{u - v} = \sum_{k=1}^{m} u^{k-1} v^{m-k} = \sum_{k=1}^{m} \int_0^1 u^{k-1} v^{m-k}\, ds,$$

such that $G_k(s, u) = u^{k-1}$ and $H_k(s, v) = v^{m-k}$. Thus

$$F'(A): Y \to \sum_{k=1}^{m} A^{k-1} Y A^{m-k}.$$

It is clear that $\|F'(A)\| \le m \|A\|^{m-1}$, and hence

$$k(F, A) \le \frac{m \|A\|^m}{\|A^m\|}.$$
If $A$ is normal, then, with respect to the Euclidean norm, $k(F, A) \le m$, which can also be seen from (2.6). Let us consider now the mixed condition number. Denoting $A^m = (a_{ij}^{(m)})_{i,j=1}^{n}$ we have

$$\frac{\partial a_{ij}^{(m)}}{\partial a_{pq}} = \sum_{k=1}^{m} (A^{k-1})_{ip} (A^{m-k})_{qj},$$

and hence

$$\|F'(A) D_A\|_\infty = \max_{i,j=1,\dots,n} \sum_{p,q=1}^{n} \left| a_{pq} \sum_{k=1}^{m} (A^{k-1})_{ip} (A^{m-k})_{qj} \right|.$$

The condition number $m(F, A)$ is obtained by dividing this expression by $\|A^m\|_u = \max_{i,j=1,\dots,n} |a_{ij}^{(m)}|$. Suppose now that $A$ has nonnegative entries only, $A = |A|$. Then we can delete the absolute value signs and get

$$\|F'(A) D_A\|_\infty = \max_{i,j=1,\dots,n} \left[ \sum_{k=1}^{m} (A^{k-1}) A (A^{m-k}) \right]_{ij} = \max_{i,j=1,\dots,n} (m A^m)_{ij} = m \|A^m\|_u.$$

Thus in this case $m(F, A) = m$. Note that, for a number $\alpha$, $k(F, \alpha) = m(F, \alpha) = m$.
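The two statements about powers can be reproduced with an explicit matrix for $F'(A)$. The sketch below is an illustration with assumed names (`power_jacobian`): it builds the $n^2 \times n^2$ matrix of $Y \to \sum_k A^{k-1} Y A^{m-k}$ from Kronecker products, using row-wise flattening as in the definition of $D_A$, checks that $m(F, A) = m$ for a matrix with nonnegative entries, and compares the usual condition number with the bound $m\|A\|^m / \|A^m\|$. The Frobenius norm is an assumed choice of submultiplicative norm.

```python
import numpy as np

def power_jacobian(A, m):
    """Matrix of F'(A) for F(A) = A^m acting on row-wise flattened matrices."""
    mp = np.linalg.matrix_power
    # For row-major flattening, vec(B Y C) = kron(B, C^T) vec(Y).
    return sum(np.kron(mp(A, k - 1), mp(A, m - k).T) for k in range(1, m + 1))

rng = np.random.default_rng(4)
A = rng.uniform(0.0, 1.0, size=(4, 4))    # nonnegative entries, A = |A|
m = 5
J = power_jacobian(A, m)
Am = np.linalg.matrix_power(A, m)

# Mixed condition number ||F'(A) D_A||_inf / ||A^m||_u.
mixed = np.linalg.norm(J @ np.diag(A.reshape(-1)), np.inf) / np.max(np.abs(Am))

# Usual condition number in the Frobenius norm and the bound m ||A||^m / ||A^m||.
usual = np.linalg.norm(J, 2) * np.linalg.norm(A, "fro") / np.linalg.norm(Am, "fro")
usual_bound = m * np.linalg.norm(A, "fro") ** m / np.linalg.norm(Am, "fro")
print(mixed, usual, usual_bound)
```

The Kronecker matrix has $n^4$ entries, so this construction is meant for small test cases only.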
Let now $F(\lambda) = \lambda^{-m}$, $m = 1, 2, \dots$. If $A$ is nonsingular, then $F(\lambda)$ is analytic outside a disk which contains no points of the spectrum of $A$, and (2.2) still holds. In this case

$$\frac{u^{-m} - v^{-m}}{u - v} = -\sum_{k=1}^{m} u^{-k} v^{-m-1+k},$$

and hence

$$F'(A): Y \to -\sum_{k=1}^{m} A^{-k} Y A^{-m-1+k}.$$

In particular, for $m = 1$ we get

$$F'(A): Y \to -A^{-1} Y A^{-1}.$$

As in the case of a positive power, $\|F'(A)\| \le m \|A^{-1}\|^{m+1}$, and hence

$$k(F, A) \le \frac{m \|A^{-1}\|^{m+1} \|A\|}{\|A^{-m}\|}.$$

For $m = 1$ we get the familiar $k(F, A) \le \|A^{-1}\| \|A\|$ (which in fact is an equality), and when $A$ is normal then $k(F, A) \le m \|A\| \|A^{-1}\|$ for any $m$. For $m = 1$ we also know (see [1]) that

$$m(F, A) = \frac{\| \, |A^{-1}| \, |A| \, |A^{-1}| \, \|_u}{\|A^{-1}\|_u},$$

where for $A = (a_{ij})$, by definition, $|A| = (|a_{ij}|)$.
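For the inverse ($m = 1$) both the derivative and the mixed condition number formula are easy to confirm numerically. The following sketch is an illustration under assumptions of our own: a strictly diagonally dominant test matrix (so that it is certainly nonsingular) and row-wise flattening, under which $Y \to -A^{-1} Y A^{-1}$ has the matrix $-A^{-1} \otimes A^{-T}$.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
# Assumed test matrix: strictly diagonally dominant, hence nonsingular.
A = rng.uniform(-1.0, 1.0, size=(n, n)) + 5.0 * np.eye(n)
B = np.linalg.inv(A)

# F'(A)Y = -A^{-1} Y A^{-1}; on row-wise flattened Y this is -kron(B, B^T).
J = -np.kron(B, B.T)
mixed_from_jacobian = np.linalg.norm(J @ np.diag(A.reshape(-1)), np.inf) / np.max(np.abs(B))

# Closed form || |A^{-1}| |A| |A^{-1}| ||_u / ||A^{-1}||_u.
mixed_from_formula = np.max(np.abs(B) @ np.abs(A) @ np.abs(B)) / np.max(np.abs(B))

# Finite-difference check of the derivative in a random direction Y.
Y = rng.standard_normal((n, n))
h = 1e-6
fd = (np.linalg.inv(A + h * Y) - np.linalg.inv(A - h * Y)) / (2 * h)
fd_gap = np.max(np.abs(fd + B @ Y @ B))
print(mixed_from_jacobian, mixed_from_formula, fd_gap)
```

The two values of the mixed condition number agree because the $(i, j)$th row of $F'(A) D_A$ consists of the products $-(A^{-1})_{ip}\, a_{pq}\, (A^{-1})_{qj}$, whose absolute values sum to $(|A^{-1}|\,|A|\,|A^{-1}|)_{ij}$.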
3.2. The exponent, $F(A) = e^A$

Here, see Van Loan [4],

$$\frac{e^u - e^v}{u - v} = \int_0^1 e^{(1-s)u} e^{sv}\, ds,$$

and hence

$$F'(A) = \int_0^1 e^{(1-s)A} (\cdot)\, e^{sA}\, ds. \qquad (3.1)$$

When $A$ is normal it follows from (2.6) that

$$\|F'(A)\|_2 \le \|e^A\|_2.$$

Indeed, in this case $\gamma(s) = |e^{(1-s)\lambda}|$ and $\eta(s) = |e^{s\lambda}|$, for $0 \le s \le 1$, where $\lambda$ is the eigenvalue of $A$ with the largest real part (see [4]), and hence

$$\int_0^1 \gamma(s) \eta(s)\, ds = |e^{\lambda}| = \|e^A\|_2.$$
Thus, with respect to the Euclidean norm,

$$k(F, A) \le \frac{\|e^A\|_2 \|A\|_2}{\|e^A\|_2} = \|A\|_2.$$

(It is shown in [4] that this is, in fact, an equality.) Let us consider now the mixed condition number. First observe that

$$m(F, A) \le \frac{\left\| \int_0^1 |e^{(1-s)A}| \, |A| \, |e^{sA}|\, ds \right\|_u}{\|e^A\|_u}. \qquad (3.2)$$
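Indeed, if $e^{tA} = (a_{ij}(t))_{i,j=1}^{n}$ then it follows from (3.1) that

$$\frac{\partial a_{ij}(1)}{\partial a_{pq}} = \int_0^1 a_{ip}(1 - s)\, a_{qj}(s)\, ds.$$

Hence the $(i, j)$th row of $F'(A) D_A$ is of the form

$$q_{ij} = \left( \int_0^1 a_{ip}(1 - s)\, a_{pq}\, a_{qj}(s)\, ds \right)_{p,q=1}^{n},$$

and

$$\|q_{ij}\|_1 = \sum_{p,q=1}^{n} \left| \int_0^1 a_{ip}(1 - s)\, a_{pq}\, a_{qj}(s)\, ds \right| \le \sum_{p,q=1}^{n} \int_0^1 |a_{ip}(1 - s)| \, |a_{pq}| \, |a_{qj}(s)|\, ds = \left[ \int_0^1 |e^{(1-s)A}| \, |A| \, |e^{sA}|\, ds \right]_{ij}.$$

The inequality (3.2) follows now from the fact that

$$\|F'(A) D_A\|_\infty = \max_{i,j=1,\dots,n} \|q_{ij}\|_1 \le \left\| \int_0^1 |e^{(1-s)A}| \, |A| \, |e^{sA}|\, ds \right\|_u.$$

If the matrix $A$ has nonnegative entries, then

$$\int_0^1 |e^{(1-s)A}| \, |A| \, |e^{sA}|\, ds = \int_0^1 e^{(1-s)A} A\, e^{sA}\, ds = A e^A,$$

and hence

$$\|F'(A) D_A\|_\infty = \|A e^A\|_u \le \|A\|_\infty \|e^A\|_u.$$

Thus in this case

$$m(F, A) \le \|A\|_\infty. \qquad (3.3)$$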
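Formula (3.1) and the bound (3.3) can be checked with a few lines of NumPy. This is a sketch under assumptions, not the computational method of [3] or [4]: the integral is approximated by Gauss-Legendre quadrature, and `expm` below is a simple scaling-and-squaring Taylor routine written for this illustration rather than a library call.

```python
import numpy as np

def expm(M, terms=20):
    """Scaling-and-squaring Taylor approximation of e^M (a simple stand-in routine)."""
    norm = np.linalg.norm(M, 1)
    j = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    S = M / 2.0 ** j                       # scaled so that ||S||_1 <= 1/2
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ S / k
        E = E + term
    for _ in range(j):                     # undo the scaling by repeated squaring
        E = E @ E
    return E

def exp_frechet(A, Y, num_nodes=20):
    """Gauss-Legendre approximation of int_0^1 e^{(1-s)A} Y e^{sA} ds, cf. (3.1)."""
    x, w = np.polynomial.legendre.leggauss(num_nodes)
    total = np.zeros_like(A, dtype=float)
    for s, wi in zip(0.5 * (x + 1.0), 0.5 * w):
        total += wi * (expm((1 - s) * A) @ Y @ expm(s * A))
    return total

rng = np.random.default_rng(6)
A = rng.uniform(0.0, 1.0, size=(3, 3))     # nonnegative entries
eA = expm(A)

# For nonnegative A the absolute values in (3.2) can be dropped.
integral = exp_frechet(A, np.abs(A))
mixed_bound = np.max(np.abs(integral)) / np.max(np.abs(eA))

# Finite-difference check of (3.1) in a random direction Y.
Y = rng.standard_normal((3, 3))
h = 1e-6
fd = (expm(A + h * Y) - expm(A - h * Y)) / (2 * h)
fd_gap = np.max(np.abs(exp_frechet(A, Y) - fd))
print(mixed_bound, np.linalg.norm(A, np.inf), fd_gap)
```

Since the integrand is an entire function of $s$, a modest number of quadrature nodes already reproduces $A e^A$ to rounding accuracy in this example.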
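3.3. The logarithm, $F(\lambda) = \log(1 + \lambda)$

A direct calculation shows that for all $|u|, |v| < 1$, $u \neq v$:

$$\frac{\log(1 + u) - \log(1 + v)}{u - v} = \int_0^1 \frac{1}{1 + su} \cdot \frac{1}{1 + sv}\, ds.$$

Thus for a matrix $A$ with spectral radius less than 1,

$$F'(A) = \int_0^1 (I + sA)^{-1} (\cdot)\, (I + sA)^{-1}\, ds. \qquad (3.4)$$

Suppose first that $A$ is normal and let

$$|1 + \lambda| = \min_{k=1,\dots,n} |1 + \lambda_k|,$$

where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $A$. We exclude the case when all eigenvalues have nonnegative real parts because in this case $\|F'(A)\|_2 \le 1$ and the map is well-conditioned. So let $\operatorname{Re} \lambda < 0$. Then for any $s \in [0, 1]$ and $k = 1, \dots, n$:

$$\sqrt{2}\, |1 + s\lambda_k| \ge |1 + \lambda|.$$

Indeed, it is an elementary fact that, if $|\lambda| < 1$ and $\operatorname{Re} \lambda < 0$, then $|1 + \lambda| \le \sqrt{2}\, |1 + s\lambda|$ for all $s \in [0, 1]$. Hence, if $\operatorname{Re} \lambda_k < 0$, then

$$\sqrt{2}\, |1 + s\lambda_k| \ge |1 + \lambda_k| \ge |1 + \lambda|,$$

and, if $\operatorname{Re} \lambda_k \ge 0$, then

$$\sqrt{2}\, |1 + s\lambda_k| \ge \sqrt{2} \ge |1 + \lambda|.$$

Thus

$$\|F'(A)\|_2 \le \int_0^1 \max_{k=1,\dots,n} \frac{1}{|1 + s\lambda_k|^2}\, ds \le 2 \max_{k=1,\dots,n} \frac{1}{|1 + \lambda_k|^2} = 2 \|(I + A)^{-1}\|_2^2. \qquad (3.5)$$

On the other hand, if $y_k$ is the normalized eigenvector of $A$ corresponding to $\lambda_k$, then

$$F'(A)(y_k y_k^*) = \int_0^1 (1 + s\lambda_k)^{-2}\, ds\; y_k y_k^* = (1 + \lambda_k)^{-1} y_k y_k^*.$$

Hence

$$\|F'(A)\|_2 \ge \max_{k=1,\dots,n} \frac{1}{|1 + \lambda_k|} = \|(I + A)^{-1}\|_2.$$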
Thus

$$\frac{2 \|A\|_2 \|(I + A)^{-1}\|_2^2}{\|\log(I + A)\|_2} \ge k(F, A) \ge \frac{\|A\|_2 \|(I + A)^{-1}\|_2}{\|\log(I + A)\|_2},$$

and we see that when $A$ is normal, $\log(I + A)$ is ill-conditioned if $A$ has eigenvalues close to $-1$ and well-conditioned otherwise. Note that when $A$ is a number, say $\alpha$, then

$$k(F, \alpha) = \frac{|\alpha| \, |F'(\alpha)|}{|F(\alpha)|} = \frac{|\alpha (1 + \alpha)^{-1}|}{|\log(1 + \alpha)|}.$$

Note also that, when $\operatorname{Re} \alpha \ge 0$, then $|F'(\alpha)| \le 1$ and

$$k(F, \alpha) \le \frac{|\alpha|}{|\log(1 + \alpha)|} \to 1,$$

when $\alpha \to 0$. Let us consider now the mixed condition number for a general $A$. Using the same techniques as in Section 3.2 for the exponential function, it is straightforward to see that

$$\|F'(A) D_A\|_\infty \le \left\| \int_0^1 |(I + sA)^{-1}| \, |A| \, |(I + sA)^{-1}|\, ds \right\|_u,$$

and hence

$$m(F, A) \le \frac{\left\| \int_0^1 |(I + sA)^{-1}| \, |A| \, |(I + sA)^{-1}|\, ds \right\|_u}{\|\log(I + A)\|_u}.$$

If $A$ has nonpositive entries, then $(I + sA)^{-1}$ has nonnegative entries, since the spectral radius of $sA$ is less than 1. Thus we can remove absolute value signs in the integral. In this case the integral equals $-A(I + A)^{-1}$ and

$$m(F, A) \le \frac{\|A\|_\infty \|(I + A)^{-1}\|_u}{\|\log(I + A)\|_u}.$$
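The logarithm formulas can be checked in the same spirit. The sketch below is an illustration under assumptions of our own: $A$ is taken real symmetric with nonpositive entries and $\|A\|_\infty = 0.5$, so that $\log(I + A)$ can be formed from an eigendecomposition; the helper names `sym_fun` and `log_frechet` are hypothetical. It verifies that the integral in the mixed bound equals $-A(I + A)^{-1}$ for such a matrix and compares (3.4) with a finite difference in a symmetric direction.

```python
import numpy as np

def sym_fun(M, f):
    """f(M) for a real symmetric M via its eigendecomposition."""
    lam, Q = np.linalg.eigh(M)
    return (Q * f(lam)) @ Q.T

def log_frechet(A, Y, num_nodes=40):
    """Gauss-Legendre approximation of int_0^1 (I+sA)^{-1} Y (I+sA)^{-1} ds, cf. (3.4)."""
    I = np.eye(A.shape[0])
    x, w = np.polynomial.legendre.leggauss(num_nodes)
    total = np.zeros_like(A, dtype=float)
    for s, wi in zip(0.5 * (x + 1.0), 0.5 * w):
        R = np.linalg.inv(I + s * A)
        total += wi * (R @ Y @ R)
    return total

rng = np.random.default_rng(7)
C = rng.uniform(0.0, 1.0, size=(3, 3))
A = -(C + C.T)
A *= 0.5 / np.linalg.norm(A, np.inf)   # nonpositive, symmetric, spectral radius <= 0.5
I = np.eye(3)
logIA = sym_fun(I + A, np.log)

# With nonpositive A, |A| = -A and the resolvents are entrywise nonnegative.
integral = log_frechet(A, np.abs(A))
closed_form = -A @ np.linalg.inv(I + A)
mixed_bound = np.max(np.abs(integral)) / np.max(np.abs(logIA))
looser_bound = (np.linalg.norm(A, np.inf) * np.max(np.abs(np.linalg.inv(I + A)))
                / np.max(np.abs(logIA)))

# Finite-difference check of (3.4) in a symmetric direction (keeps eigh applicable).
Ysym = rng.standard_normal((3, 3))
Ysym = Ysym + Ysym.T
h = 1e-6
fd = (sym_fun(I + A + h * Ysym, np.log) - sym_fun(I + A - h * Ysym, np.log)) / (2 * h)
fd_gap = np.max(np.abs(log_frechet(A, Ysym) - fd))
print(mixed_bound, looser_bound, fd_gap)
```

The restriction to symmetric matrices is only there to avoid a general matrix-logarithm routine; formula (3.4) itself does not require it.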
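3.4. Trigonometric functions, $F(A) = \cos A$ and $F(A) = \sin A$

For $F(A) = \cos A$ we have

$$\frac{\cos u - \cos v}{u - v} = -\frac{1}{2i} \int_0^1 \left[ e^{i(1-s)u} e^{isv} - e^{-i(1-s)u} e^{-isv} \right] ds,$$

and hence

$$F'(A)Y = -\frac{1}{2i} \int_0^1 \left[ e^{i(1-s)A}\, Y e^{isA} - e^{-i(1-s)A}\, Y e^{-isA} \right] ds.$$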
Substituting $Y = I$ we see that

$$F'(A)I = -\frac{1}{2i} \left[ e^{iA} - e^{-iA} \right] = -\sin A,$$

and hence $\|F'(A)\| \ge \|\sin A\|$. If $A$ is selfadjoint, then it follows from (2.6) that $\|F'(A)\|_2 \le 1$. Thus for a selfadjoint $A$,

$$\frac{\|\sin A\|_2 \|A\|_2}{\|\cos A\|_2} \le k(F, A) \le \frac{\|A\|_2}{\|\cos A\|_2}.$$
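The cosine formulas above can be verified numerically for a selfadjoint matrix. The following sketch rests on assumptions of our own: $A$ is real symmetric so that $e^{\pm itA}$, $\cos A$ and $\sin A$ can be formed from one eigendecomposition, and the integral is approximated by Gauss-Legendre quadrature; `sym_fun` and `cos_frechet` are hypothetical helper names.

```python
import numpy as np

def sym_fun(M, f):
    """f(M) for a real symmetric M via its eigendecomposition (f may be complex valued)."""
    lam, Q = np.linalg.eigh(M)
    return (Q * f(lam)) @ Q.T

def cos_frechet(A, Y, num_nodes=30):
    """Quadrature of -(1/2i) int_0^1 [e^{i(1-s)A} Y e^{isA} - e^{-i(1-s)A} Y e^{-isA}] ds."""
    x, w = np.polynomial.legendre.leggauss(num_nodes)
    total = np.zeros(A.shape, dtype=complex)
    for s, wi in zip(0.5 * (x + 1.0), 0.5 * w):
        E1 = sym_fun(A, lambda lam: np.exp(1j * (1 - s) * lam))
        E2 = sym_fun(A, lambda lam: np.exp(1j * s * lam))
        # For real symmetric A, e^{-itA} is the complex conjugate of e^{itA}.
        total += wi * (E1 @ Y @ E2 - E1.conj() @ Y @ E2.conj())
    return (-total / 2j).real              # real for real A and Y

rng = np.random.default_rng(8)
B = rng.standard_normal((3, 3))
A = (B + B.T) / 2                          # selfadjoint test matrix
D_identity = cos_frechet(A, np.eye(3))
sinA = sym_fun(A, np.sin)

# Finite-difference check in a symmetric direction, and the bound ||F'(A)||_2 <= 1.
Y = rng.standard_normal((3, 3))
Y = Y + Y.T
h = 1e-6
fd = (sym_fun(A + h * Y, np.cos) - sym_fun(A - h * Y, np.cos)) / (2 * h)
D_Y = cos_frechet(A, Y)
fd_gap = np.max(np.abs(D_Y - fd))
norm_ratio = np.linalg.norm(D_Y, 2) / np.linalg.norm(Y, 2)
print(fd_gap, norm_ratio)
```

The same construction with a plus sign between the two exponential terms and the prefactor $1/2$ gives the corresponding check for $\sin A$.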
For $F(A) = \sin A$ we have in a similar way,

$$\frac{\sin u - \sin v}{u - v} = \frac{1}{2} \int_0^1 \left[ e^{i(1-s)u} e^{isv} + e^{-i(1-s)u} e^{-isv} \right] ds,$$

$$F'(A)Y = \frac{1}{2} \int_0^1 \left[ e^{i(1-s)A}\, Y e^{isA} + e^{-i(1-s)A}\, Y e^{-isA} \right] ds,$$

$\|F'(A)\| \ge \|\cos A\|$, and for $A = A^*$,

$$\frac{\|\cos A\|_2 \|A\|_2}{\|\sin A\|_2} \le k(F, A) \le \frac{\|A\|_2}{\|\sin A\|_2}.$$

We do not have a representation of the form (2.1) for $F(\lambda) = \sqrt{\lambda}$, or more generally, for $F(\lambda) = \lambda^{p/q}$, where $p$ and $q \neq 0$ are integers.
References

[1] I. Gohberg and I. Koltracht, Componentwise mixed and structured condition numbers, SIAM J. Matrix Anal. 14 (3) (1993).
[2] I. Gohberg and I. Koltracht, Structured condition numbers for linear structures, IMA Preprint Series No. 1030, Institute for Mathematics and its Applications, University of Minnesota, Minneapolis, MN (1992); also in: Proceedings Linear Algebra and Signal Processing Workshop, Minneapolis, MN (1992).
[3] R. Mathias, Evaluating the Frechet derivative of the matrix exponential, Preprint (1992).
[4] C.F. Van Loan, The sensitivity of the matrix exponential, SIAM J. Numer. Anal. 14 (1977) 971-981.