Chapter I
Operators on Finite-Dimensional Spaces
1. Spectrum of an operator.

Let $E$ be a finite-dimensional, complex, normed space, and let $\dim E = d$. Let also $T$ be a linear operator from $E$ into itself. First, we observe that $T$ is automatically continuous: indeed, take an algebraic basis $(e_1, \ldots, e_d)$ of $E$, and write $x = \sum_{1}^{d} \alpha_j e_j$, $y = \sum_{1}^{d} \beta_j e_j$. Then:

$$\|Tx - Ty\| \le \sum_{1}^{d} |\alpha_j - \beta_j|\, \|Te_j\| \le K \max_{j} |\alpha_j - \beta_j|,$$

with $K = \sum_{1}^{d} \|Te_j\|$, and continuity follows, since the coordinates $\alpha_j$ are continuous functions of $x$. We define the operator norm of $T$:

$$\|T\|_{op} = \sup\{\|Tx\| \; ; \; \|x\| \le 1\}. \tag{1}$$
Since the unit ball is compact (see for instance B.B. [1]), and since $T$ is continuous, this supremum is actually attained, and is a maximum. We will usually drop the subscript "op", and just write $\|T\|$. We denote by $\mathcal{L}(E)$ the set of linear continuous operators on $E$, equipped with this norm.

When a basis $(e_1, \ldots, e_d)$ is chosen, $T$ can be represented by a matrix

$$M = (t_{i,j})_{i,j = 1, \ldots, d}\,,$$

the $j$-th column of which is made of the components of $Te_j$ on the basis, that is $Te_j = \sum_{1}^{d} t_{i,j} e_i$. If a vector $x$ is given by its components $X = (x_1, \ldots, x_d)$ on the basis, the vector $Tx$ has $MX$ as components on the same basis. Of course, the matrix depends on the choice of the basis; the operator norm does not.
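To make the matrix representation concrete, here is a minimal numerical sketch in Python; the operator and the basis below are arbitrary choices made for the illustration, and numpy is used. The $j$-th column of $M$ collects the components of $Te_j$, and the components of $Tx$ are obtained as $MX$.

```python
import numpy as np

# An arbitrary operator on C^3, described by its action on the canonical basis.
# The j-th column of M is made of the components of T e_j.
images = [np.array([1.0, 2.0, 0.0]),    # T e_1
          np.array([0.0, 1.0, 1.0]),    # T e_2
          np.array([3.0, 0.0, 2.0])]    # T e_3
M = np.column_stack(images)

x = np.array([1.0, -1.0, 2.0])          # components of x on the basis
print(M @ x)                            # components of T x on the same basis
```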
We are not going to deal here with the theory of matrices, which is a huge theory in itself; we will instead restrict ourselves to the notions related to our future study of operators on infinite-dimensional Banach spaces.

The characteristic polynomial of $T$ is the determinant of $T - \lambda I$:

$$c(\lambda) = \det(T - \lambda I), \quad \lambda \in \mathbb{C},$$

and this polynomial is also the determinant of the matrix $M - \lambda I$, if $M$ represents $T$ in a given basis. It is easily seen to be independent of the choice of the basis. This determinant is of the form:

$$c(\lambda) = (-1)^d \lambda^d + a_{d-1} \lambda^{d-1} + \cdots + a_1 \lambda + a_0.$$
It has degree $d$, and therefore has $d$ complex roots, not necessarily distinct. If $\lambda$ is one of them, $\det(T - \lambda I) = 0$, which means that $T - \lambda I$ is not invertible. Since the space is finite-dimensional, this is equivalent to the fact that $T - \lambda I$ is not injective (or not surjective). Consequently, there exists a vector $x$, $x \neq 0$, such that $(T - \lambda I)x = 0$, or $Tx = \lambda x$. The vector $x$ is called an eigenvector associated with the eigenvalue $\lambda$. To simplify our notation, we write $T - \lambda$ instead of $T - \lambda I$.

Let now $\lambda_1, \ldots, \lambda_m$ be an enumeration of the roots ($m \le d$). The set

$$\sigma(T) = \{\lambda_1, \ldots, \lambda_m\}$$

is called the spectrum of $T$. It is a finite subset of the complex plane, consisting of at most $d$ points. It is never empty, but may consist of a single point: the spectrum of the identity is $\{1\}$. The spectrum of any non-trivial projection is $\{0, 1\}$.
Conversely, given any $d$ points $\lambda_1, \ldots, \lambda_d$ in the plane, it is easy to see that there is an operator $T$ on $\mathbb{C}^d$ with $\sigma(T) = \{\lambda_1, \ldots, \lambda_d\}$. Indeed, let $(e_1, \ldots, e_d)$ be the canonical basis of $\mathbb{C}^d$, and define $Te_j = \lambda_j e_j$. Therefore, the spectrum has a simple structure. This does not mean, however, that all questions in finite-dimensional operator theory are necessarily easy. Among the hardest (and, quite often, with no satisfactory solution), let us mention: precise computation of the operator norm, approximation of a given operator by operators in a given class, and so on.
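As a numerical illustration (a minimal sketch using numpy; the matrices below are arbitrary examples chosen for the illustration), the spectrum of a matrix is the set of its eigenvalues, and a diagonal operator realizes any prescribed finite set as its spectrum:

```python
import numpy as np

# A diagonal operator T e_j = lambda_j e_j has exactly the prescribed
# points as its spectrum (each point kept once, whatever its multiplicity).
prescribed = np.array([2.0 + 1.0j, -1.0, -1.0, 0.5j])
T = np.diag(prescribed)
print(set(np.round(np.linalg.eigvals(T), 10)))   # sigma(T): at most d points

# For a general matrix, the spectrum is the set of roots of det(T - lambda I).
A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
print(np.linalg.eigvals(A))                      # approximately {i, -i}
```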
2. Minimal polynomial.

Let $p(z) = \sum_{0}^{n} a_k z^k$ be a polynomial with complex coefficients. We define:

$$p(T) = \sum_{0}^{n} a_k T^k. \tag{1}$$
We will show that there is a "smallest" polynomial $m$ such that $m(T) = 0$. This polynomial will be called the minimal polynomial of $T$.

For this, let $(e_1, \ldots, e_d)$ be a basis of $E$. The vectors $e_1, Te_1, \ldots, T^d e_1$ cannot be independent: there exists a linear combination, with coefficients not all zero,

$$\sum_{k=0}^{d} a_k T^k e_1 = 0,$$

that is, by definition (1), a polynomial $s_1$, $d^{\circ} s_1 \le d$, with $s_1(T)e_1 = 0$. The same way, we find polynomials $s_2, \ldots, s_d$, with:

$$s_j(T) e_j = 0, \quad \text{for all } j = 2, \ldots, d.$$

Put $s = s_1 \cdot s_2 \cdots s_d$. Then $s(T)e_j = 0$ for all $j = 1, \ldots, d$, and therefore $s(T)x = 0$ for all $x$ in $E$.
We have found a polynomial $s$ such that $s(T) = 0$. But $s$ has to be reduced: its degree may be $d^2$, and we will see later that such a polynomial exists with $d^{\circ} s \le d$. We factor $s(\lambda) = \alpha \prod_i (\lambda - \lambda_i)^{\alpha_i}$, and we may assume $\alpha = 1$. If a point $\lambda_i$ is not in the spectrum, the corresponding term $T - \lambda_i I$ is invertible, so it can be removed from $s$. Let

$$s(\lambda) = \prod_{\lambda_i \in \sigma(T)} (\lambda - \lambda_i)^{\alpha_i}.$$
Now, we look at all $x$'s such that $(T - \lambda_1)^{\alpha_1} x = 0$. If for all of them we also have $(T - \lambda_1)^{\alpha_1 - 1} x = 0$, we replace $\alpha_1$ by $\alpha_1 - 1$ in $s$. We start again with $\alpha_1 - 1$, and so on, until we cannot proceed further. We then pass to $\alpha_2$, and so on. More precisely, we define, for $i = 1, \ldots, m$ (where $m$ is the number of points in $\sigma(T)$, $m \le d$):

$$\nu_i = \inf\{k \in \mathbb{N} \; ; \; (T - \lambda_i)^k x = 0, \text{ for all } x \text{ s.t. } (T - \lambda_i)^{k+1} x = 0\}.$$

Then we put:

$$m(\lambda) = \prod_{i=1}^{m} (\lambda - \lambda_i)^{\nu_i},$$
and we have obtained a polynomial $m$, still satisfying $m(T) = 0$.

By construction, all roots $\lambda_i$ of $m(\lambda)$ belong to $\sigma(T)$. Conversely, let $\lambda_0 \in \sigma(T)$, $y_0$ a corresponding eigenvector. Then:

$$0 = m(T)\, y_0 = m(\lambda_0)\, y_0,$$

so $m(\lambda_0) = 0$ (since $y_0 \neq 0$), and the roots of $m(\lambda)$ are exactly the points of $\sigma(T)$.
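The exponents $\nu_i$ can be computed exactly as in the construction above, by detecting when the kernels of $(T - \lambda_i)^k$ stop growing. The following Python helper is a numerical sketch written for this illustration; it uses numpy ranks and assumes the distinct eigenvalues of $T$ are well separated.

```python
import numpy as np

def minimal_polynomial_exponents(T, tol=1e-8):
    """Return pairs (lambda_i, nu_i), so that m(lambda) = prod (lambda - lambda_i)**nu_i.
    nu_i is the smallest k >= 1 at which the kernels of (T - lambda_i)**k stop
    growing, detected through matrix ranks.  Numerical sketch only: it assumes
    the distinct eigenvalues of T are separated by much more than `tol`."""
    d = T.shape[0]
    distinct = []
    for lam in np.linalg.eigvals(T):
        if all(abs(lam - mu) > 100 * tol for mu in distinct):
            distinct.append(lam)
    result = []
    for lam in distinct:
        A = T - lam * np.eye(d)
        ranks = [np.linalg.matrix_rank(np.linalg.matrix_power(A, k), tol=tol)
                 for k in range(d + 2)]
        nu = next(k for k in range(1, d + 1) if ranks[k] == ranks[k + 1])
        result.append((lam, nu))
    return result

# A nilpotent Jordan block of size 3: sigma(J) = {0}, nu = 3, m(lambda) = lambda**3.
J = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
print(minimal_polynomial_exponents(J))   # approximately [(0.0, 3)]
```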
The polynomial $m$ is called the minimal polynomial of $T$, for the following reason:
Proposition 2.1. - The polynomial $m$ divides every polynomial $p$ such that $p(T) = 0$.
Proof. - Let $p$ be such a polynomial. First, as we already observed, $p$ must have the points of $\sigma(T)$ among its roots; we eliminate the others, and write:

$$p(\lambda) = \prod_{i=1}^{m} (\lambda - \lambda_i)^{\beta_i}. \tag{3}$$
We now show that $\beta_i \ge \nu_i$, for $i = 1, \ldots, m$. Assume on the contrary that, for instance, $\beta_1 < \nu_1$. Then, by the definition of $\nu_1$, there is a point $x_1$ such that:

$$(T - \lambda_1)^{\beta_1 + 1} x_1 = 0, \qquad y_1 = (T - \lambda_1)^{\beta_1} x_1 \neq 0.$$

From (3), we deduce

$$p(\lambda) = (\lambda - \lambda_1)^{\beta_1} q(\lambda), \quad \text{with } q(\lambda_1) \neq 0.$$

We have $Ty_1 = \lambda_1 y_1$, so:

$$p(T)\, x_1 = q(T)(T - \lambda_1)^{\beta_1} x_1 = q(T)\, y_1 = q(\lambda_1)\, y_1 \neq 0,$$

which contradicts $p(T) = 0$ and proves our claim.
Corollary 2.2. - If $p$, $q$ are two polynomials, we have $p(T) = q(T)$ if and only if every $\lambda_i \in \sigma(T)$ is a root of $p - q$ of order $\ge \nu_i$. Indeed, this says that $p - q$ is divisible by the minimal polynomial.
Among all polynomials satisfying $p(T) = 0$, the most noticeable one is the characteristic polynomial, which we have already defined:

$$c(\lambda) = \det(T - \lambda).$$
Theorem 2.3 (Cayley-Hamilton). - The characteristic polynomial satisfies c(T) = 0.
Proof. - Elementary linear algebra (see for instance I. N. Herstein [1]) allows us to write, in a proper basis, the matrix of $T$ in triangular form:

$$\begin{pmatrix}
\lambda'_1 & a_{1,2} & \cdots & a_{1,d} \\
0 & \lambda'_2 & \cdots & a_{2,d} \\
\vdots & & \ddots & \vdots \\
0 & \cdots & 0 & \lambda'_d
\end{pmatrix}$$

where $\lambda'_1, \ldots, \lambda'_d$ are the points of $\sigma(T)$, but this time each of them repeated according to its multiplicity. If $(v_1, \ldots, v_d)$ is this basis, we get:

$$Tv_1 = \lambda'_1 v_1, \quad Tv_2 = a_{1,2} v_1 + \lambda'_2 v_2, \quad \ldots, \quad Tv_d = a_{1,d} v_1 + a_{2,d} v_2 + \cdots + a_{d-1,d} v_{d-1} + \lambda'_d v_d,$$

which means:

$$(T - \lambda'_1) v_1 = 0, \quad (T - \lambda'_2) v_2 = a_{1,2} v_1, \quad \ldots, \quad (T - \lambda'_d) v_d = a_{1,d} v_1 + \cdots + a_{d-1,d} v_{d-1}.$$

Therefore $(T - \lambda'_1) v_1 = 0$, $(T - \lambda'_1)(T - \lambda'_2) v_2 = 0$, ..., $(T - \lambda'_1) \cdots (T - \lambda'_d) v_d = 0$. So the product $(T - \lambda'_1) \cdots (T - \lambda'_d)$ annihilates all the vectors of the basis. Since the $\lambda'_i$'s are the roots of the characteristic polynomial, we get $c(T) = 0$.

From Theorem 2.3 and Corollary 2.2 it follows obviously that $d^{\circ} m \le d$. There are obvious examples in which the minimal polynomial has degree strictly less than the degree of the characteristic polynomial (which is $d$, the dimension of the space). For example, a projection always satisfies $T^2 - T = 0$, so that, for a non-trivial projection, $m(\lambda) = \lambda^2 - \lambda$.
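A quick numerical check of the Cayley-Hamilton theorem, and of the fact that the minimal polynomial may have strictly smaller degree, can be done with numpy; the helper and the matrices below are arbitrary choices made for this sketch, and np.poly returns the coefficients of the monic polynomial $\det(\lambda I - T)$, which differs from $c(\lambda)$ only by the sign $(-1)^d$.

```python
import numpy as np

def polynomial_of_matrix(coeffs, T):
    """Evaluate p(T), for p given by its coefficients in numpy order
    (highest degree first), using Horner's scheme with matrices."""
    d = T.shape[0]
    P = np.zeros((d, d), dtype=complex)
    for c in coeffs:
        P = P @ T + c * np.eye(d)
    return P

# A Jordan block at 2 together with the simple eigenvalue 5.
T = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
char_coeffs = np.poly(T)                    # coefficients of det(lambda*I - T)
print(np.linalg.norm(polynomial_of_matrix(char_coeffs, T)))          # ~0 (Cayley-Hamilton)

# For a diagonalizable matrix with a repeated eigenvalue, deg m < d:
D = np.diag([2.0, 2.0, 5.0])
print(np.linalg.norm(polynomial_of_matrix(np.poly([2.0, 5.0]), D)))  # ~0: m(l) = (l-2)(l-5)
print(np.linalg.norm(polynomial_of_matrix(np.poly([2.0, 5.0]), T)))  # not 0: here nu(2) = 2
```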
3. The analytic functional calculus.
We have already defined $p(T)$, when $p$ is a polynomial. We now extend this definition to a larger class of functions.

Let $f$ be a function from $\mathbb{C}$ into itself. We say that $f$ belongs to the space $\mathcal{F}(T)$ if there exists a neighborhood $V$ of $\sigma(T)$ on which $f$ is analytic (for an elementary theory of analytic functions, we refer the reader to H. Cartan [1]). We recall that $f$ is said to be analytic on a compact set if it is analytic on some neighborhood of this compact set. The neighborhood need not be connected, and depends on the function.

Let $f \in \mathcal{F}(T)$. For every $\lambda_i \in \sigma(T)$, we consider the derivatives $f^{(k)}(\lambda_i)$, $k < \nu_i$ (there are $\nu_1 + \cdots + \nu_m \le d$ such derivatives). Let $p$ be a polynomial such that

$$f^{(k)}(\lambda_i) = p^{(k)}(\lambda_i), \quad \text{for all } \lambda_i \in \sigma(T), \text{ all } k < \nu_i.$$

We then put:

$$f(T) = p(T).$$
This definition does not depend on the choice of the polynomial $p$: if $q$ is another one with the same properties, then $p(T) = q(T)$, by Corollary 2.2. We now list some elementary properties of this definition:
Theorem 3.1. - If $f, g \in \mathcal{F}(T)$, $\alpha, \beta \in \mathbb{C}$, then:

a) $\alpha f + \beta g \in \mathcal{F}(T)$, and $(\alpha f + \beta g)(T) = \alpha f(T) + \beta g(T)$,

b) $f \cdot g \in \mathcal{F}(T)$, and $(f \cdot g)(T) = f(T)\, g(T)$,

c) if $f(\lambda) = \sum_{0}^{\infty} a_k \lambda^k$, the series converging on a neighborhood of $\sigma(T)$, then $f(T) = \sum_{0}^{\infty} a_k T^k$,

d) $f(T) = 0$ if and only if $f^{(k)}(\lambda_i) = 0$, for all $\lambda_i \in \sigma(T)$, all $k < \nu_i$.
The proof is left to the reader.

The first quality of the class $\mathcal{F}(T)$ is that it contains functions with values 0 and 1 only, thus allowing us to build projections which commute with $T$. For every $\lambda_i \in \sigma(T)$, let $e_i(\lambda)$ be defined by:

$$e_i(\lambda) = 1 \text{ on some neighborhood of } \lambda_i, \qquad e_i(\lambda) = 0 \text{ on some neighborhood of all other } \lambda_j\text{'s}, \; j \neq i.$$

We put $E_i = e_i(T)$; this is an operator, with the following properties:
Proposition 3.2. - For $i, j = 1, \ldots, m$, we have:

a) $E_i^2 = E_i$,

b) $E_i E_j = 0$, for $i \neq j$,

c) $I = \sum_{1}^{m} E_i$.
These properties follow immediately from Theorem 3.1.
For $i = 1, \ldots, m$, we call $X_i$ the image of the operator $E_i$. Then we get the formula:

$$E = X_1 \oplus \cdots \oplus X_m.$$

Indeed, by Proposition 3.2, c), every $x$ can be written as $x = \sum_{1}^{m} E_i x$, and this decomposition is unique by b). Since $E_i$ is in fact defined as a polynomial in $T$, it clearly commutes with it. Therefore:

$$T X_i \subset X_i, \quad i = 1, \ldots, m,$$

which means that $X_i$ is invariant under $T$.
Also, we have:

$$(T - \lambda_i)^{\nu_i} E_i = 0. \tag{1}$$

Indeed, $E_i = p_i(T)$, where $p_i$ is a polynomial satisfying:

$$p_i(\lambda_i) = 1, \qquad p_i^{(k)}(\lambda_i) = 0, \quad 1 \le k < \nu_i,$$

$$p_i^{(k)}(\lambda_j) = 0, \qquad \forall j \neq i, \; \forall k, \; 0 \le k < \nu_j.$$

Therefore, $p_i$ factors as:

$$p_i(\lambda) = q(\lambda) \prod_{j \neq i} (\lambda - \lambda_j)^{\nu_j},$$

and $(\lambda - \lambda_i)^{\nu_i} p_i(\lambda)$ is a multiple of the minimal polynomial $m$, which proves (1).

So we get:

$$(T - \lambda_i)^{\nu_i} X_i = 0. \tag{2}$$
For $i = 1, \ldots, m$, we put $N_i = \operatorname{Ker}(T - \lambda_i)^{\nu_i}$, $N_i' = \operatorname{Ker} \prod_{j \neq i}(T - \lambda_j)^{\nu_j}$.

Lemma 3.3. - For $i = 1, \ldots, m$, $N_i \cap N_i' = \{0\}$.

Proof. - The polynomials $p_i(\lambda) = (\lambda - \lambda_i)^{\nu_i}$ and $q_i(\lambda) = \prod_{j \neq i}(\lambda - \lambda_j)^{\nu_j}$ have no roots in common. By Bezout's Identity, there exist polynomials $r_1(\lambda)$, $r_2(\lambda)$ such that:

$$r_1 p_i + r_2 q_i = 1,$$

which implies:

$$r_1(T)(T - \lambda_i)^{\nu_i} + r_2(T) \prod_{j \neq i}(T - \lambda_j)^{\nu_j} = I,$$

from which Lemma 3.3 follows obviously: if $x \in N_i \cap N_i'$, applying this identity to $x$ gives $x = 0$.
We may now prove:

Proposition 3.4. - For $i = 1, \ldots, m$, $X_i = N_i$.
Proof. - 1) We have seen that $(T - \lambda_i)^{\nu_i} X_i = 0$, thus $X_i \subset \operatorname{Ker}(T - \lambda_i)^{\nu_i} = N_i$.

2) By Lemma 3.3, the sum of the $N_i$ is a direct sum; the sum of the $X_i$ is also direct. Since the latter is $E$ (Prop. 3.2, c)), so is the former. Since $X_i \subset N_i$ for every $i$ and both sums are direct and equal to $E$, comparing dimensions gives $X_i = N_i$. The proposition follows.
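Numerically, the decomposition $E = N_1 \oplus \cdots \oplus N_m$ and the projections $E_i$ can be recovered from the null spaces $\operatorname{Ker}(T - \lambda_i)^d$. The helper below is a sketch written for this illustration: it relies on scipy.linalg.null_space, assumes the distinct eigenvalues are well separated, and then checks the properties of Proposition 3.2 on an arbitrary example.

```python
import numpy as np
from scipy.linalg import null_space

def spectral_projections(T, tol=1e-8):
    """Return the list [(lambda_i, E_i)], where E_i is the projection onto
    N_i = Ker (T - lambda_i)^d along the other generalized eigenspaces.
    Numerical sketch, assuming well-separated distinct eigenvalues."""
    d = T.shape[0]
    distinct = []
    for lam in np.linalg.eigvals(T):
        if all(abs(lam - mu) > 100 * tol for mu in distinct):
            distinct.append(lam)
    blocks = [null_space(np.linalg.matrix_power(T - lam * np.eye(d), d), rcond=tol)
              for lam in distinct]
    S = np.hstack(blocks)                      # columns: a basis adapted to the N_i
    S_inv = np.linalg.inv(S)
    projections, start = [], 0
    for lam, B in zip(distinct, blocks):
        k = B.shape[1]
        sel = np.zeros(d)
        sel[start:start + k] = 1.0             # keep only the coordinates lying in N_i
        projections.append((lam, S @ np.diag(sel) @ S_inv))
        start += k
    return projections

T = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
E = [P for _, P in spectral_projections(T)]
print(np.linalg.norm(E[0] @ E[0] - E[0]))      # ~0 : E_i^2 = E_i
print(np.linalg.norm(E[0] @ E[1]))             # ~0 : E_i E_j = 0
print(np.linalg.norm(sum(E) - np.eye(3)))      # ~0 : sum E_i = I
print(np.linalg.norm(E[0] @ T - T @ E[0]))     # ~0 : E_i commutes with T
```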
The projections $E_i$, $i = 1, \ldots, m$, will also allow us to give an expression of any function $f(T)$, $f \in \mathcal{F}(T)$:
Proposition 3.5. - If $f \in \mathcal{F}(T)$, we may write:

$$f(T) = \sum_{i=1}^{m} \sum_{k=0}^{\nu_i - 1} \frac{f^{(k)}(\lambda_i)}{k!} (T - \lambda_i)^k E_i.$$
Proof. - We consider the function:

$$g(\lambda) = \sum_{i=1}^{m} \sum_{k=0}^{\nu_i - 1} \frac{f^{(k)}(\lambda_i)}{k!} (\lambda - \lambda_i)^k e_i(\lambda).$$

One checks immediately that, for $i = 1, \ldots, m$ and $k < \nu_i$,

$$g^{(k)}(\lambda_i) = f^{(k)}(\lambda_i),$$

and therefore $f(T) = g(T)$, by Theorem 3.1, d).

We now study the convergence of a sequence of operators $f_n(T)$:
Proposition 3.6. - Let $(f_n)_{n \ge 0}$ be a sequence of functions in $\mathcal{F}(T)$. Then $f_n(T)$ converges in operator norm if and only if the sequences $(f_n^{(k)}(\lambda_i))_{n \ge 0}$, for $i = 1, \ldots, m$, $k < \nu_i$, converge in $\mathbb{C}$.
Proof. - 1) Assume that all these complex sequences converge. Then Proposition 3.5 indicates that $f_n(T)$ is Cauchy, and therefore converges.

2) Assume that the sequence $(f_n(T))_{n \in \mathbb{N}}$ converges in $\mathcal{L}(E)$. We know that $(T - \lambda_1)^{\nu_1 - 1} E_1 \neq 0$, and therefore we may find $x$ such that $(T - \lambda_1)^{\nu_1 - 1} E_1 x \neq 0$. Put $y = E_1 x$, $r = \nu_1 - 1$, and $y_k = (T - \lambda_1)^k y$ for $k \le r$. By Proposition 3.5,

$$f_n(T)\, y_r = \sum_{i=1}^{m} \sum_{k=0}^{\nu_i - 1} \frac{f_n^{(k)}(\lambda_i)}{k!} (T - \lambda_i)^k E_i\, y_r.$$

Since $y = E_1 x$, all terms with $i \neq 1$ are 0, by Proposition 3.2, b). So:

$$f_n(T)\, y_r = \sum_{k=0}^{\nu_1 - 1} \frac{f_n^{(k)}(\lambda_1)}{k!} (T - \lambda_1)^k y_r.$$

All terms with $k \ge 1$ are 0, by the definition of $r = \nu_1 - 1$ and formula (2). Since moreover $E_1 y = y$, we get:

$$f_n(T)\, y_r = f_n(\lambda_1)(T - \lambda_1)^r y = f_n(\lambda_1)\, y_r.$$

Since the sequence $(f_n(T) y_r)_{n \in \mathbb{N}}$ converges, this implies that $(f_n(\lambda_1))_{n \in \mathbb{N}}$ converges. Then we write:

$$f_n(T)\, y_{r-1} = f_n(\lambda_1)\, y_{r-1} + f_n'(\lambda_1)\, y_r,$$

and therefore $(f_n'(\lambda_1))$ converges. Proceeding in the same way with $y_{r-2}, \ldots, y_0$, all the sequences $(f_n^{(k)}(\lambda_1))$, $k < \nu_1$, converge. The same holds for the other points in the spectrum.
To end this chapter, we give a Cauchy formula for operators:

Proposition 3.7. - Let $U$ be an open set containing $\sigma(T)$. We assume that the boundary $\Gamma$ of $U$ consists of a finite number of simple closed curves, oriented in the direct sense. Then, if $f \in \mathcal{F}(T)$ is analytic on $\overline{U}$, we have:

$$f(T) = \frac{1}{2\pi i} \int_{\Gamma} f(\lambda)\, (\lambda - T)^{-1} \, d\lambda.$$
Proof. - For $\lambda \notin \sigma(T)$, we set $R_\lambda(z) = 1/(\lambda - z)$. Then $R_\lambda(T) = (\lambda - T)^{-1}$, and, by Proposition 3.5, since $R_\lambda^{(k)}(\lambda_i)/k! = 1/(\lambda - \lambda_i)^{k+1}$:

$$(\lambda - T)^{-1} = \sum_{i=1}^{m} \sum_{k=0}^{\nu_i - 1} \frac{(T - \lambda_i)^k E_i}{(\lambda - \lambda_i)^{k+1}},$$

and so:

$$\frac{1}{2\pi i} \int_{\Gamma} f(\lambda)\, (\lambda - T)^{-1}\, d\lambda
= \sum_{i=1}^{m} \sum_{k=0}^{\nu_i - 1} (T - \lambda_i)^k E_i \cdot \frac{1}{2\pi i} \int_{\Gamma} \frac{f(\lambda)}{(\lambda - \lambda_i)^{k+1}}\, d\lambda
= \sum_{i=1}^{m} \sum_{k=0}^{\nu_i - 1} \frac{f^{(k)}(\lambda_i)}{k!} (T - \lambda_i)^k E_i,$$

by the usual Cauchy formula, and finally, this last expression is $f(T)$, by Proposition 3.5, and Proposition 3.7 is proved.
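The formula of Proposition 3.7 can be checked numerically by discretizing the contour. The sketch below (the matrix, the contour parameters and the helper name are arbitrary choices for this illustration) approximates the integral over a circle enclosing $\sigma(T)$ by the trapezoidal rule, and compares the result with scipy's matrix exponential for $f(\lambda) = e^\lambda$.

```python
import numpy as np
from scipy.linalg import expm

def cauchy_formula(f, T, center=0.0, n_points=400):
    """Approximate f(T) = (1/(2*pi*i)) * integral_Gamma f(lambda) (lambda - T)^{-1} dlambda,
    where Gamma is a circle around `center` enclosing sigma(T), using the
    trapezoidal rule with n_points nodes.  Numerical sketch only."""
    d = T.shape[0]
    radius = 1.0 + np.max(np.abs(np.linalg.eigvals(T) - center))   # circle encloses sigma(T)
    total = np.zeros((d, d), dtype=complex)
    for t in np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False):
        lam = center + radius * np.exp(1j * t)
        dlam = 1j * radius * np.exp(1j * t) * (2.0 * np.pi / n_points)
        total += f(lam) * np.linalg.inv(lam * np.eye(d) - T) * dlam
    return total / (2.0j * np.pi)

T = np.array([[2.0, 1.0],
              [0.0, 3.0]])
print(np.linalg.norm(cauchy_formula(np.exp, T) - expm(T)))   # small: the two computations agree
```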
4. Computing the operator norm on a Hilbert space.
The formula 1.(1), given as the definition of the operator norm, is not suitable for practical purposes. Instead, practical computations are made the following way. Let $M$ be the matrix representing $T$ in some orthonormal basis. Let $M^*$ be its adjoint, that is $M^* = {}^{t}\overline{M}$. Then the matrix $M^* M$ is self-adjoint, and, by the results of Chapter IV, Section 3, or Chapter VII, it has real, non-negative eigenvalues, and the largest of them, $\lambda$, satisfies $\lambda = \|M^* M\| = \|M\|^2$. So $\sqrt{\lambda}$ is the required value of $\|T\|$.
Programs do exist to find the eigenvalues of a matrix; however, the computations become obviously longer and less precise when the dimension increases.
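For instance, with numpy (a minimal sketch; the random matrix below is an arbitrary example), the recipe above reads:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))   # an arbitrary matrix

# ||T|| is the square root of the largest eigenvalue of M* M.
eigenvalues = np.linalg.eigvalsh(M.conj().T @ M)    # real, non-negative, sorted ascending
print(np.sqrt(eigenvalues[-1]))

# Same value, computed directly as the largest singular value of M.
print(np.linalg.norm(M, 2))
```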
Exercises on Chapter I.
Exercise 1. - Show that the following are equivalent:

a) $\frac{1}{n} \sum_{k=0}^{n-1} T^k$ converges,

b) $\frac{T^n}{n}$ converges to $0$,

c) $\sigma(T)$ is contained in $\overline{D}$ (the closed unit disk), and $\nu(\lambda) = 1$ for all $\lambda \in \sigma(T)$ with $|\lambda| = 1$.

(Hint: consider the sequences of functions $f_n(\lambda) = \frac{1}{n}\sum_{k=0}^{n-1} \lambda^k$, $g_n(\lambda) = \lambda^n / n$.)
Exercise 2. - Let $T$ be an operator on a Banach space $E$, and assume that $T$ has a cyclic vector $x_0$, that is:

$$E = \operatorname{span}(x_0, Tx_0, T^2 x_0, \ldots).$$

Assume moreover that the minimal polynomial of $T$ can be written (with $d = \dim E$):

$$\gamma_0 + \gamma_1 \lambda + \cdots + \gamma_{d-1} \lambda^{d-1} + \lambda^d.$$

Find a basis in which the matrix of $T$ is of the form:

$$\begin{pmatrix}
0 & 0 & \cdots & 0 & -\gamma_0 \\
1 & 0 & \cdots & 0 & -\gamma_1 \\
0 & 1 & \ddots & \vdots & -\gamma_2 \\
\vdots & & \ddots & 0 & \vdots \\
0 & \cdots & 0 & 1 & -\gamma_{d-1}
\end{pmatrix}.$$

Prove that in this case, the characteristic polynomial of $T$ is equal to the minimal polynomial (up to a change of sign).
Exercise 3. - Let $E$, $\dim E = d$, and $T$ an operator such that $T^d = 0$. Show that there exists a basis $(x_1, \ldots, x_d)$ of $E$ and $k+1$ integers $1 \le n_0 < n_1 < \cdots < n_k = d$, such that $Tx_i = x_{i+1}$, except if $i \in \{n_0, n_1, \ldots, n_k\}$, in which case $Tx_i = 0$.
Exercise 4. - In $\ell_2^n$, what is the operator norm of the matrix whose entries are all 1's?
Exercise 5. - Let

$$M = \begin{pmatrix} \cdot & \cdot \\ \cdot & \cdot \end{pmatrix}.$$

Find $A$ such that $e^A = M$.
Exercise 6. - Let

$$M = \begin{pmatrix} \cdot & \cdot \\ \cdot & \cdot \end{pmatrix}.$$

Compute the distance between $M$ and the set of diagonal matrices.
Notes and Comments.

The results of this chapter are mostly reproduced from Dunford-Schwartz [1], vol. 1, with a few modifications in the presentation. The reader may consult this book, as well as the book by P. Halmos [1].
Complements on Chapter I.

Exercise 4 involves the computation of the operator norm of the matrix whose entries are all 1's. In $\ell_2^n$, the result is $n$. But if, instead of 1, one puts $\varepsilon_{i,j}$ ($i, j = 1, \ldots, n$), independent random variables with values $\pm 1$, then the operator norm is, with great probability, of the order of $\sqrt{n}$ (see Bennett - Goodman - Newman [1]), thus dropping considerably.

An open problem (communicated by David Larson): We consider $3 \times 3$ matrices. Let $P$ be a "diagonal projection", that is a matrix of the form:

$$P = \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix},$$

where $a$, $b$ and $c$ take the values 0 or 1.

For any matrix $M$, let:

$$a(M) = \sup\{\|(I - P) M P\| \; ; \; P \text{ is a diagonal projection}\}$$

and

$$d(M) = \inf\{\|M - D\| \; ; \; D \text{ is a diagonal matrix}\}.$$

Compute:

$$K(M) = \frac{d(M)}{a(M)},$$

and then:

$$K_3 = \sup\{K(M) \; ; \; M \text{ is a non-diagonal } 3 \times 3 \text{ matrix}\}.$$
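To illustrate the remark made at the beginning of these Complements about random $\pm 1$ matrices, here is a Monte Carlo sketch (the sizes and the random seed are arbitrary choices for the illustration): the all-1's matrix has norm exactly $n$, while the random sign matrix stays of the order of $\sqrt{n}$.

```python
import numpy as np

rng = np.random.default_rng(1)
for n in (50, 200, 800):
    ones_norm = np.linalg.norm(np.ones((n, n)), 2)        # exactly n
    signs = rng.choice([-1.0, 1.0], size=(n, n))          # independent +-1 entries
    rand_norm = np.linalg.norm(signs, 2)
    print(n, ones_norm, rand_norm / np.sqrt(n))           # the last ratio stays bounded (about 2)
```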