On condition numbers of a basis


Linear Algebra and its Applications 439 (2013) 3359–3377

Contents lists available at ScienceDirect

Linear Algebra and its Applications www.elsevier.com/locate/laa

Mario Ahues (a), Balmohan V. Limaye (b,∗)

(a) Unité CNRS 5208, Institut Camille Jordan, Université de Lyon, Faculté des Sciences, Université Jean Monnet, 42023 Saint Étienne, France
(b) Department of Mathematics, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India

Article info

Article history: Received 9 March 2012; accepted 9 September 2013; available online 14 October 2013. Submitted by N.J. Higham.

MSC: 15A; 15A12; 15A15; 65A35

Keywords: Condition number; Optimal scaling; Orthogonality in a Banach space; Gram matrix; Cholesky decomposition; Implicit function theorem; Bernstein polynomials; Lagrange and Legendre polynomials; Power basis; Hat functions; Pyramid functions; Simple functions; Lebesgue constant

Abstract

The condition number $\kappa_p(x)$ of an ordered basis $x$ of a finite dimensional subspace of an $L^p$-space, $1 \le p \le \infty$, measures the extent to which small relative changes in the basis coefficients lead to small relative changes in the function values, and vice versa. We consider an ordered basis $x$ of an arbitrary finite dimensional normed space and introduce numbers $\varkappa_{p,r}(x)$, $1 \le p, r \le \infty$, which control 'near linear dependence' of the basis elements as well as overflow/underflow during computations. These numbers are defined in terms of the norms of the basis elements and the norms of the elements of the ordered dual basis. Optimal scaling strategies are determined. Effects of matrix transformation on these numbers are explored. If $x$ is an ordered basis of an inner product space, then $\varkappa_{p,r}(x)$ can be calculated explicitly in terms of the diagonal entries of the Gram matrix corresponding to $x$ and the diagonal entries of its inverse. In the last section, we define a condition number of $x$ relative to the problem of computing the ordered dual basis, and relate it to $\varkappa_{p,q}(x)$, where $(1/p) + (1/q) = 1$. © 2013 Elsevier Inc. All rights reserved.

∗ Corresponding author. E-mail addresses: [email protected] (M. Ahues), [email protected] (B.V. Limaye).

0024-3795/$ – see front matter © 2013 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.laa.2013.09.015


M. Ahues, B.V. Limaye / Linear Algebra and its Applications 439 (2013) 3359–3377

1. Introduction

The condition number $K(A) := \|A\|\,\|A^{-1}\|$ of a nonsingular matrix $A$ with respect to a matrix norm $\|\cdot\|$ has been effectively used to measure the reliability of computations involving the matrix $A$. If $M$ is a finite dimensional subspace of an $L^p$-space with $1 \le p \le \infty$, and $x := [x_1, \ldots, x_m]$ is an ordered basis of $M$, then the condition number of $x$ is defined by

$$\kappa_p(x) := \max_{0 \ne u \in \mathbb{C}^{m\times 1}} \frac{\bigl\|\sum_{j=1}^m u(j)\,x_j\bigr\|_p}{\|u\|_p} \;\cdot\; \max_{0 \ne u \in \mathbb{C}^{m\times 1}} \frac{\|u\|_p}{\bigl\|\sum_{j=1}^m u(j)\,x_j\bigr\|_p},$$

where $\mathbb{C}^{m\times 1}$ is the m-dimensional Euclidean space consisting of all column vectors $u := [u(1), \ldots, u(m)]^t$ with complex entries $u(1), \ldots, u(m)$, $\|u\|_p$ denotes the Hölder p-norm on $\mathbb{C}^{m\times 1}$, and $\|x\|_p$ denotes the p-norm of $x \in L^p$. The number $\kappa_p(x)$ measures the sensitivity of computing the coefficients in the expression of an arbitrary element $x \in M$ as a unique linear combination of the basis elements $x_1, \ldots, x_m$ of $M$. (See, for example, [15,6,3,8].)

In the present article, we consider an arbitrary normed space $M$ over $\mathbb{C}$ of dimension $m \ge 2$, and for an ordered basis $x := [x_1, \ldots, x_m]$ of $M$, we focus on the possibility of the basis elements $x_1, \ldots, x_m$ being 'nearly dependent', that is, the angle between a basis element $x_j$ and the subspace spanned by the remaining basis elements being too small, and also on the possibility of overflow/underflow during computations. A case in point is the subspace iteration method, on which the well-known QR algorithm for computing the eigenvalues of a matrix is based.

Let $M'$ denote the dual space of $M$ consisting of all (continuous) linear functionals defined on $M$. There is a unique ordered basis $x' := [x'_1, \ldots, x'_m]$ of $M'$ such that $x'_i(x_j) = \delta_{i,j}$ for $1 \le i, j \le m$. It is called the ordered dual basis of $M'$. Let $\|\cdot\|$ denote the given norm on $M$, and let $\|\cdot\|'$ denote the induced norm on $M'$, that is, let

$$\|f\|' := \sup\bigl\{|f(x)| : x \in M \text{ and } \|x\| \le 1\bigr\} \quad \text{for } f \in M'.$$

For positive real numbers $a_1, \ldots, a_m$, consider the pth power mean of $a_1, \ldots, a_m$ defined by

$$\mu_p(a_1, \ldots, a_m) := \begin{cases} \bigl(\frac{1}{m}\sum_{j=1}^m a_j^p\bigr)^{1/p} & \text{if } 0 < p < \infty, \\ \max\{a_1, \ldots, a_m\} & \text{if } p = \infty. \end{cases}$$

For $1 \le p, r \le \infty$, we define

$$\varkappa_{p,r}(x) := \mu_p\bigl(\|x_1\|, \ldots, \|x_m\|\bigr)\,\mu_r\bigl(\|x'_1\|', \ldots, \|x'_m\|'\bigr).$$
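As a concrete illustration of this definition, the following Python sketch evaluates the power mean and the product of the two means from given lists of norms. It assumes the simplest possible setting, $M = \mathbb{R}^2$ with the Euclidean norm, where the dual norm of the functional $u \mapsto v \cdot u$ is the Euclidean norm of $v$. The basis $x_1 = (1, 0)$, $x_2 = (1, 1)$ and the function names are illustrative choices, not taken from the paper.

```python
import math

def power_mean(values, p):
    """p-th power mean of positive numbers; p = math.inf gives the maximum."""
    if p == math.inf:
        return max(values)
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)

def varkappa(basis_norms, dual_norms, p, r):
    """Product of the p-mean of the basis norms and the r-mean of the dual norms."""
    return power_mean(basis_norms, p) * power_mean(dual_norms, r)

# Illustrative basis of R^2 (Euclidean norm): x1 = (1, 0), x2 = (1, 1).
# The dual basis is given by the rows of the inverse matrix: (1, -1) and (0, 1).
basis_norms = [math.hypot(1, 0), math.hypot(1, 1)]
dual_norms = [math.hypot(1, -1), math.hypot(0, 1)]

for p, r in [(1, 1), (2, 2), (math.inf, math.inf)]:
    print(p, r, varkappa(basis_norms, dual_norms, p, r))
```

For this basis the three printed values increase with p and r, and a basis whose elements and dual elements all have norm 1 gives the value 1 for every p and r.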

In [7], the number $\varkappa_{\infty,\infty}(x)$ was denoted by cond(x), and it was used to treat 'uniformly conditioned' bases of finite dimensional spectral subspaces. We shall show that $\varkappa_{p,r}(x) \ge 1$.

Let us mention a well-known example of an ordered basis, consisting of the hat functions $e_1, \ldots, e_m$ defined on an interval $[a, b]$ with nodes at $t_1, \ldots, t_m$. The ordered dual basis is given by the evaluation functionals $e'_1, \ldots, e'_m$, namely $e'_j(x) := x(t_j)$, where $x$ belongs to the linear span $M$ of $e_1, \ldots, e_m$, and $j = 1, \ldots, m$. Working with the sup norm, we note that the norm of each $e_j$ as well as the dual norm of each $e'_j$ is equal to 1. Thus, if $e := [e_1, \ldots, e_m]$, then $\varkappa_{p,r}(e) = 1$ for all $p$ and $r$. We shall often refer to this example.

This paper is organized as follows. In Section 2, we compare the numbers $\varkappa_{p,r}(x)$ with the condition numbers $\kappa_p(x)$, where $1 \le p, r \le \infty$. In Section 3, we express $\varkappa_{p,r}(x)$ in terms of the norms $\|x_1\|, \ldots, \|x_m\|$ and the distances of the $x_j$'s from their complementary subspaces. This shows how the numbers $\varkappa_{p,r}(x)$ may control 'near linear dependence' of $x_1, \ldots, x_m$ as well as overflow/underflow during computations. We show that each of the numbers $\varkappa_{p,r}(x)$ is equal to 1 precisely when the norms $\|x_1\|, \ldots, \|x_m\|$ are equal and the ordered basis $x$ is 'orthogonal'. In Section 4, we obtain optimal scaling strategies for these numbers. In Section 5, we find upper bounds for the numbers $\varkappa_{p,r}(x)$ when $p, r = 1, 2, \infty$ and the basis $x$ is subjected to a matrix transformation. In case the norm on $M$ is induced by an inner product, we show that for any ordered basis $x$ of $M$, $\varkappa_{p,r}(x)$ can be expressed in terms of the diagonal entries of the corresponding Gram matrix and the diagonal entries of its inverse. In Section 6, we calculate $\varkappa_{p,r}(x)$, when $x$ is obtained by a matrix transformation from an 'orthonormal' basis in three well-known special cases. In Section 7, we define, in the classical sense, a condition number $\gamma_{p,r}(x)$ of an ordered basis $x$ of $M$ relative to the problem of computing the ordered dual basis $x' := [x'_1, \ldots, x'_m]$ of $M'$, treating the given basis $x := [x_1, \ldots, x_m]$ as data. We show that $\gamma_{p,r}(x) \le m\,\varkappa_{p,q}(x)$ for all $1 \le p, r \le \infty$, where $(1/p) + (1/q) = 1$, and comment on the sharpness of this bound. As illustrations of our results, we treat ordered bases consisting of Bernstein polynomial functions, Lagrange polynomial functions (with Chebyshev nodes as well as equidistant nodes), Legendre polynomial functions, power functions, pyramid functions and simple measurable functions.

2. Comparison of $\varkappa_{p,r}(x)$ and $\kappa_p(x)$

Let $\|\cdot\|$ be a prescribed norm on an m-dimensional linear space $M$, and let $x$ be an ordered basis of $M$. In this section, we explore relationships among the numbers $\varkappa_{p,r}(x)$ for various $p$ and $r$, and then compare the numbers $\varkappa_{p,r}(x)$ with the condition numbers $\kappa_p(x)$ of $x$, where $1 \le p, r \le \infty$. Although $\varkappa_{p,r}(x)$ and $\kappa_p(x)$ measure different sensitivities as mentioned in the Introduction, they can be nicely bounded by each other (Proposition 2.3).

Proposition 2.1. Let $x$ be an ordered basis of $M$. Consider $1 \le p \le \tilde p \le \infty$ and $1 \le r \le \tilde r \le \infty$. Then

$$\varkappa_{p,r}(x) \le \varkappa_{\tilde p,\tilde r}(x) \quad \text{and} \quad m^{(1/\tilde p)+(1/\tilde r)}\,\varkappa_{\tilde p,\tilde r}(x) \le m^{(1/p)+(1/r)}\,\varkappa_{p,r}(x).$$

Proof. Let $x := [x_1, \ldots, x_m]$, and let $x' := [x'_1, \ldots, x'_m]$ be the ordered dual basis of $M'$. For $j = 1, \ldots, m$, let $a_j := \|x_j\|$ and $b_j := \|x'_j\|'$. Then by the power mean inequality [2, p. 17],

$$\varkappa_{p,r}(x) = \mu_p(a_1, \ldots, a_m)\,\mu_r(b_1, \ldots, b_m) \le \mu_{\tilde p}(a_1, \ldots, a_m)\,\mu_{\tilde r}(b_1, \ldots, b_m) = \varkappa_{\tilde p,\tilde r}(x).$$

Also, by Jensen's inequality,

$$m^{1/\tilde p}\,\mu_{\tilde p}(a_1, \ldots, a_m) = \|(a_1, \ldots, a_m)\|_{\tilde p} \le \|(a_1, \ldots, a_m)\|_p = m^{1/p}\,\mu_p(a_1, \ldots, a_m).$$

Similarly, $m^{1/\tilde r}\,\mu_{\tilde r}(b_1, \ldots, b_m) \le m^{1/r}\,\mu_r(b_1, \ldots, b_m)$. Multiplying respective terms in these two inequalities, we obtain the desired inequality. □

Corollary 2.2. Let $x$ be an ordered basis of $M$. For $1 \le p, r \le \infty$,

$$1 \le \varkappa_{1,1}(x) \le \varkappa_{p,r}(x) \le \varkappa_{\infty,\infty}(x)$$

and

$$\varkappa_{\infty,\infty}(x) \le m^{(1/p)+(1/r)}\,\varkappa_{p,r}(x) \le m^2\,\varkappa_{1,1}(x).$$

Proof. We note that $\varkappa_{1,1}(x) = \mathrm{AM}\{\|x_1\|, \ldots, \|x_m\|\}\,\mathrm{AM}\{\|x'_1\|', \ldots, \|x'_m\|'\}$. Since $1 = |x'_j(x_j)| \le \|x'_j\|'\,\|x_j\|$, that is, $\|x'_j\|' \ge 1/\|x_j\|$ for each $j = 1, \ldots, m$, the AM–HM inequality shows that $1 \le \varkappa_{1,1}(x)$. The remaining inequalities follow directly from Proposition 2.1. □

Turning now to the condition number $\kappa_p(x)$, we remark that although $\kappa_p(x)$ has been defined only for an ordered basis $x$ of a finite dimensional subspace of $L^p$, the definition extends naturally to an ordered basis of any finite dimensional normed space $M$ if we replace the norm $\|x\|_p$ by the prescribed norm $\|x\|$ of $x \in M$.

Let us consider the product spaces $M^{1\times m}$ and $(M')^{1\times m}$, where $m$ is the dimension of $M$, and define the p-norm $\|\cdot\|_p$ on $M^{1\times m}$ by

$$\|x\|_p := \begin{cases} \bigl(\sum_{j=1}^m \|x_j\|^p\bigr)^{1/p} & \text{if } 1 \le p < \infty, \\ \max\{\|x_j\| : j = 1, \ldots, m\} & \text{if } p = \infty \end{cases}$$

for $x := [x_1, \ldots, x_m] \in M^{1\times m}$. The p-norm $\|\cdot\|'_p$ on $(M')^{1\times m}$ is defined similarly. These definitions extend the definition of the Hölder p-norm on $\mathbb{C}^{m\times 1}$, that is, when we let $M := \mathbb{C}$.

Given an ordered basis $x := [x_1, \ldots, x_m]$ of $M$, define $\|\cdot\|_x : \mathbb{C}^{m\times 1} \to \mathbb{R}$ by

$$\|u\|_x := \|u(1)x_1 + \cdots + u(m)x_m\| \quad \text{for } u = [u(1), \ldots, u(m)]^t \in \mathbb{C}^{m\times 1}.$$

Clearly, $\|\cdot\|_x$ is a norm on $\mathbb{C}^{m\times 1}$, and it is equivalent to the p-norm $\|\cdot\|_p$ on $\mathbb{C}^{m\times 1}$, that is, there are constants $\alpha > 0$ and $\beta > 0$ such that $\|u\|_p/\beta \le \|u\|_x \le \alpha\|u\|_p$ for all $u \in \mathbb{C}^{m\times 1}$. The best (smallest) such constants are

$$\alpha_p(x) := \max_{0 \ne u \in \mathbb{C}^{m\times 1}} \frac{\|u\|_x}{\|u\|_p} \quad \text{and} \quad \beta_p(x) := \max_{0 \ne u \in \mathbb{C}^{m\times 1}} \frac{\|u\|_p}{\|u\|_x}.$$

Define $\Psi_x : (\mathbb{C}^{m\times 1}, \|\cdot\|_p) \to (M, \|\cdot\|)$ by $\Psi_x(u) := u(1)x_1 + \cdots + u(m)x_m$ for $u = [u(1), \ldots, u(m)]^t$. Then $\alpha_p(x) = \|\Psi_x\|_p$ and $\beta_p(x) = \|\Psi_x^{-1}\|_p$, so that $\kappa_p(x) = \alpha_p(x)\beta_p(x) = \|\Psi_x\|_p\,\|\Psi_x^{-1}\|_p$.

Proposition 2.3. Let $x$ be an ordered basis of $M$ and $x'$ be the ordered dual basis of $M'$. Let $1 \le p \le \infty$ and $(1/p) + (1/q) = 1$. Then

$$\|x\|_\infty \le \alpha_p(x) \le \|x\|_q \quad \text{and} \quad \|x'\|'_\infty \le \beta_p(x) \le \|x'\|'_p.$$

Consequently,

$$\varkappa_{\infty,\infty}(x) \le \kappa_p(x) \le m\,\varkappa_{q,p}(x).$$

Proof. For $j = 1, \ldots, m$, if we let $u := [0, \ldots, 0, 1, 0, \ldots, 0]^t \in \mathbb{C}^{m\times 1}$, where only the jth entry is 1, then $\|u\|_x = \|x_j\|$ and $\|u\|_p = 1$, and hence $\|x\|_\infty \le \alpha_p(x)$. Next, for any $u \in \mathbb{C}^{m\times 1}$, $\|u\|_x \le \|u\|_p\,\|x\|_q$ by Hölder's inequality, and hence $\alpha_p(x) \le \|x\|_q$. Thus the inequalities involving $\alpha_p(x)$ follow.

We now prove the inequalities involving $\beta_p(x)$. By the Hahn–Banach theorem, $\|u\|_p = \max\{|v^t u| : v \in \mathbb{C}^{m\times 1} \text{ and } \|v\|_q = 1\}$ for each $u \in \mathbb{C}^{m\times 1}$, so that

$$\beta_p(x) = \max_{0 \ne u \in \mathbb{C}^{m\times 1}} \frac{\|u\|_p}{\|u\|_x} = \max\left\{\frac{|v^t u|}{\|v\|_q\,\|u\|_x} : u, v \in \mathbb{C}^{m\times 1},\ u \ne 0,\ v \ne 0\right\}.$$

On the other hand, for $0 \ne v \in \mathbb{C}^{m\times 1}$ and for $y = u(1)x_1 + \cdots + u(m)x_m \in M$, we have $v(1)x'_1(y) + \cdots + v(m)x'_m(y) = v^t u$, so that

$$\max\left\{\frac{|v^t u|}{\|u\|_x} : u \in \mathbb{C}^{m\times 1},\ u \ne 0\right\} = \max\left\{\frac{\bigl|\sum_{j=1}^m v(j)\,x'_j(y)\bigr|}{\|y\|} : y \in M,\ y \ne 0\right\}$$

and

$$\beta_p(x) = \max\left\{\frac{\bigl\|\sum_{j=1}^m v(j)\,x'_j\bigr\|'}{\|v\|_q} : v \in \mathbb{C}^{m\times 1},\ v \ne 0\right\}.$$

Hence the inequalities involving $\beta_p(x)$ follow exactly in the same manner as the inequalities involving $\alpha_p(x)$. Multiplying respective terms in these two sets of inequalities, we obtain the remaining inequalities. □

Proposition 2.3 allows us to obtain estimates for $\varkappa_{p,r}(x)$ if estimates for $\kappa_p(x)$ are known, and vice versa. This is illustrated by the following example.
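The two-sided bound of Proposition 2.3 can be checked numerically in a small case. The sketch below assumes $M = \mathbb{R}^2$ with the Euclidean norm and $p = q = 2$, so that $\alpha_2(x)$ and $1/\beta_2(x)$ are the largest and smallest singular values of the matrix whose columns are the basis vectors. The matrix and the helper names are hypothetical choices made for illustration only.

```python
import math

def singular_values_2x2(A):
    """Singular values of a real 2x2 matrix via the eigenvalues of A^T A."""
    (a, b), (c, d) = A
    g11, g22, g12 = a * a + c * c, b * b + d * d, a * b + c * d
    half_trace = (g11 + g22) / 2
    det = g11 * g22 - g12 * g12
    disc = math.sqrt(max(half_trace ** 2 - det, 0.0))
    return math.sqrt(half_trace + disc), math.sqrt(half_trace - disc)

def rms(values):
    """Power mean with exponent 2."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# Columns of A are the basis vectors x1 = (1, 0) and x2 = (1, 1).
A = [[1.0, 1.0], [0.0, 1.0]]
(a, b), (c, d) = A
det_A = a * d - b * c
A_inv = [[d / det_A, -b / det_A], [-c / det_A, a / det_A]]

col_norms = [math.hypot(a, c), math.hypot(b, d)]   # norms of the basis vectors
row_norms = [math.hypot(*row) for row in A_inv]    # norms of the dual basis

s_max, s_min = singular_values_2x2(A)
kappa_2 = s_max / s_min                            # alpha_2(x) * beta_2(x)
vk_inf_inf = max(col_norms) * max(row_norms)
vk_2_2 = rms(col_norms) * rms(row_norms)
m = len(A)

# Expected ordering: vk_inf_inf <= kappa_2 <= m * vk_2_2
print(vk_inf_inf, kappa_2, m * vk_2_2)
```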


Example 2.4. Let $m \ge 2$ and $M$ denote the linear space of all polynomial functions of degree at most $m - 1$ defined on $[-1, 1]$, and let the prescribed norm on $M$ be the $L^p$-norm, where $1 \le p \le \infty$. Consider the ordered basis $x = [x_1, \ldots, x_m]$ consisting of the Bernstein polynomial functions given by

$$x_j(t) := \binom{m-1}{j-1}\left(\frac{1+t}{2}\right)^{j-1}\left(\frac{1-t}{2}\right)^{m-j} \quad \text{for } j = 1, \ldots, m \text{ and } t \in [-1, 1].$$

Then $x_j \ge 0$ for each $j = 1, \ldots, m$ and $\sum_{j=1}^m x_j = 1$. The following estimates are known [8, 7.18, 2.10, 2.11]:

$$\frac{\sqrt{\pi}}{\sqrt{m}}\,2^{m-1} \le \kappa_1(x) \le \frac{3}{\sqrt{m+1}}\,2^{m-1},$$

$$\frac{1}{(\pi(m+\frac12))^{1/4}}\,2^{m-\frac12} \le \kappa_2(x) \le \frac{1}{(\pi m)^{1/4}}\,2^{m-\frac12},$$

$$\frac{m-2}{m-1}\,2^{m-\frac32} \le \kappa_\infty(x) \le \frac{m}{m-1}\,2^{m-\frac32}.$$

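By Proposition 2.3, we obtain the following estimates. For the $L^1$-norm on $M$,

$$\frac{\sqrt{\pi}}{m\sqrt{m}}\,2^{m-1} \le \varkappa_{\infty,1}(x) \le \varkappa_{\infty,\infty}(x) \le \frac{3}{\sqrt{m+1}}\,2^{m-1}.$$

For the $L^2$-norm on $M$,

$$\frac{1}{m(\pi(m+\frac12))^{1/4}}\,2^{m-\frac12} \le \varkappa_{2,2}(x) \le \varkappa_{\infty,\infty}(x) \le \frac{1}{(\pi m)^{1/4}}\,2^{m-\frac12}.$$

For the $L^\infty$-norm, that is, for the sup norm on $M$,

$$\frac{m-2}{m(m-1)}\,2^{m-\frac32} \le \varkappa_{1,\infty}(x) \le \varkappa_{\infty,\infty}(x) \le \frac{m}{m-1}\,2^{m-\frac32}.$$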

It is instructive to consider another ordered basis of $M$ consisting of the Lagrange polynomial functions given by

$$\ell_j(t) := \prod_{k=1,\,k\ne j}^{m} (t - t_k) \Big/ \prod_{k=1,\,k\ne j}^{m} (t_j - t_k) \quad \text{for } j = 1, \ldots, m \text{ and } t \in [-1, 1],$$

where $-1 \le t_1 < \cdots < t_m \le 1$. It is clear that $\ell_j(t_i) = \delta_{i,j}$ for $i, j = 1, \ldots, m$, and $x = \sum_{j=1}^m x(t_j)\,\ell_j$ for every $x \in M$. In particular, $\sum_{j=1}^m \ell_j = 1$. However, $\ell_1, \ldots, \ell_m$ take both positive and negative values, unlike the Bernstein polynomial functions. Let us consider the sup norm on $M$, and let $\ell := [\ell_1, \ldots, \ell_m]$. For $i = 1, \ldots, m$ and $x \in M$, define $\ell'_i(x) := x(t_i)$. Then $\ell' := [\ell'_1, \ldots, \ell'_m]$ is the ordered dual basis of $M'$, and $\|\ell'_i\|' = 1$ for each $i = 1, \ldots, m$. In order to obtain estimates for the norms of $\ell_1, \ldots, \ell_m$, consider the mth Lebesgue constant defined by

$$\Lambda_m := \max\left\{\sum_{j=1}^m |\ell_j(t)| : t \in [-1, 1]\right\} \quad \text{for } m \in \mathbb{N},$$

which plays an important role in polynomial approximation. It can be easily seen that

$$\|\ell\|_\infty \le \Lambda_m \le \|\ell\|_1.$$

The norms of $\ell_1, \ldots, \ell_m$ depend heavily on the nodes $t_1, \ldots, t_m$. For example, consider the Chebyshev nodes $t_j := \cos\theta_j$, where $\theta_j := (2j - 1)\pi/2m$ for $j = 1, \ldots, m$. According to a result of Fejér, the sup norm of $\ell_j$ is less than or equal to $\sqrt{2}$ for each $j = 1, \ldots, m$ [5, p. 5]. Since $\|\ell'\|'_\infty = 1$, we obtain $\varkappa_{q,p}(\ell) \le \varkappa_{\infty,\infty}(\ell) \le \sqrt{2}$, where $1 \le p \le \infty$ and $(1/p) + (1/q) = 1$. By Proposition 2.3, we obtain $\kappa_p(\ell) \le m\sqrt{2}$ for $1 \le p \le \infty$.

On the other hand, consider the equidistant nodes $t_j := (2j - 1 - m)/(m - 1)$ for $j = 1, \ldots, m$ in $[-1, 1]$. In this case, the following estimates for the mth Lebesgue constant $\Lambda_m$ are given in [14, Theorem 2]:

$$\frac{2^{m-3}}{(m-1)^2} < \Lambda_m < \frac{2^{m+2}}{m-1}.$$

Since $\Lambda_m \to \infty$, we see that $\|\ell\|_1 \to \infty$ as $m \to \infty$. Now since $\|\ell'\|'_\infty = 1$ and $\|\ell'\|'_1 = m$, we see that $\varkappa_{\infty,\infty}(\ell) = \|\ell\|_\infty$ and $\varkappa_{1,1}(\ell) = \|\ell\|_1/m$. Hence

$$\frac{2^{m-3}}{m(m-1)^2} < \varkappa_{1,1}(\ell) \le \varkappa_{p,r}(\ell) \le \varkappa_{\infty,\infty}(\ell) < \frac{2^{m+2}}{m-1}$$

for all $1 \le p, r \le \infty$ by Proposition 2.1. Note that $\varkappa_{p,r}(\ell)$ tends to infinity exponentially; in fact its mth root tends to 2 as $m \to \infty$. Again, by Proposition 2.3, we obtain

$$\frac{2^{m-3}}{m(m-1)^2} < \kappa_p(\ell) < \frac{m\,2^{m+2}}{m-1} \quad \text{for } 1 \le p \le \infty.$$
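The contrast between the two node families can be observed numerically. The following Python sketch approximates the Lebesgue constant by sampling the sum of the absolute values of the Lagrange polynomials on a uniform grid, so the computed value is only a lower estimate of the true maximum. The choice $m = 10$, the grid size and the function name are assumptions made for illustration.

```python
import math

def lebesgue_constant(nodes, grid_size=2001):
    """Approximate the max over [-1, 1] of sum_j |l_j(t)| on a uniform grid."""
    m = len(nodes)
    best = 0.0
    for i in range(grid_size):
        t = -1.0 + 2.0 * i / (grid_size - 1)
        total = 0.0
        for j in range(m):
            value = 1.0
            for k in range(m):
                if k != j:
                    value *= (t - nodes[k]) / (nodes[j] - nodes[k])
            total += abs(value)
        best = max(best, total)
    return best

m = 10
chebyshev = [math.cos((2 * j - 1) * math.pi / (2 * m)) for j in range(1, m + 1)]
equidistant = [(2 * j - 1 - m) / (m - 1) for j in range(1, m + 1)]

lam_cheb = lebesgue_constant(chebyshev)
lam_equi = lebesgue_constant(equidistant)
print("Chebyshev nodes:  ", lam_cheb)
print("equidistant nodes:", lam_equi)
# Bounds quoted above for equidistant nodes
print(2 ** (m - 3) / (m - 1) ** 2, "<", lam_equi, "<", 2 ** (m + 2) / (m - 1))
```

Increasing m in this sketch shows the equidistant value growing rapidly while the Chebyshev value stays small, in line with the estimates quoted above.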

Let us now reverse our procedure and begin with an ordered basis of the dual space $M'$ of $M$.

Proposition 2.5. Let $x' := [x'_1, \ldots, x'_m]$ be an ordered basis of $M'$. Then there is a unique ordered basis $x$ of $M$ such that $x'_i(x_j) = \delta_{i,j}$ for $i, j = 1, \ldots, m$. Let $1 \le p, r \le \infty$. Then

$$\varkappa_{p,r}(x') = \varkappa_{r,p}(x) \quad \text{and} \quad \kappa_p(x') = \kappa_q(x), \quad \text{where } \frac{1}{p} + \frac{1}{q} = 1.$$

Proof. Let $x'' := [x''_1, \ldots, x''_m]$ be the ordered dual basis of $M''$, and let $J : M \to M''$ denote the canonical isometry of $M$ onto $M''$. Then there are $x_1, \ldots, x_m \in M$ such that $x''_j = J(x_j)$ for $j = 1, \ldots, m$, and so $x := [x_1, \ldots, x_m]$ is an ordered basis of $M$, $x'_i(x_j) = J(x_j)(x'_i) = x''_j(x'_i) = \delta_{i,j}$ for $i, j = 1, \ldots, m$ and $\|x_j\| = \|x''_j\|''$ for $j = 1, \ldots, m$. Clearly, $\varkappa_{p,r}(x') = \mu_p(\|x'_1\|', \ldots, \|x'_m\|')\,\mu_r(\|x''_1\|'', \ldots, \|x''_m\|'') = \mu_p(\|x'_1\|', \ldots, \|x'_m\|')\,\mu_r(\|x_1\|, \ldots, \|x_m\|) = \varkappa_{r,p}(x)$. Also, $\kappa_p(x') = \alpha_p(x')\beta_p(x') = \beta_q(x)\alpha_q(x) = \kappa_q(x)$, where $(1/p) + (1/q) = 1$, as in the proof of Proposition 2.3. □

As a final point in the comparison of $\varkappa_{p,r}(x)$ and $\kappa_p(x)$, we remark that for any normed space $M$ of dimension $m$, there is an ordered basis $x = [x_1, \ldots, x_m]$ of $M$ such that $\|x_j\| = 1 = \|x'_j\|'$ for each $j = 1, \ldots, m$ [4, Lemma 2.1], so that $\varkappa_{p,r}(x) = 1$ for all $p$ and $r$. (We shall give specific illustrations in Examples 6.2.) This is not the case for the condition numbers $\kappa_p(x)$. For example, if we let $M = \mathbb{C}^{m\times 1}$ along with the 2-norm $\|\cdot\|_2$, then $\kappa_\infty(x) \ge \sqrt{m}$ for every ordered basis $x$ of $M$, that is, the so-called Banach–Mazur distance between $(\mathbb{C}^{m\times 1}, \|\cdot\|_2)$ and $(\mathbb{C}^{m\times 1}, \|\cdot\|_\infty)$ is $\sqrt{m}$. (See [12, Chapter 9].)

3. Orthogonality

The notion of orthogonality in an inner product space can be generalized to an arbitrary normed space $M$ in many ways. We adopt the following generalization as given in [10, p. 215]. An element $x \in M$ is said to be orthogonal to a subspace $N$ of $M$ if $\|x\| = \mathrm{dist}(x, N)$. Accordingly, an ordered basis $x := [x_1, \ldots, x_m]$ of a normed space $M$ of dimension $m \ge 2$ is called orthogonal if

$$\|x_j\| = \mathrm{dist}(x_j, \widetilde{M}_j), \quad \text{where } \widetilde{M}_j := \mathrm{span}\{x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_m\},$$

for each $j = 1, \ldots, m$.

Let $d_j := \mathrm{dist}(x_j, \widetilde{M}_j)$ for $j = 1, \ldots, m$. Clearly, $\|x_j\| \ge d_j$ for each $j = 1, \ldots, m$. In the following lemma, we express the numbers $\varkappa_{p,r}(x)$ in terms of $\|x_1\|, \ldots, \|x_m\|$ and $d_1, \ldots, d_m$, that is, without involving the ordered dual basis $x' = [x'_1, \ldots, x'_m]$.

Lemma 3.1. Let $x := [x_1, \ldots, x_m]$ be an ordered basis of $M$ and $x' := [x'_1, \ldots, x'_m]$ be the ordered dual basis of $M'$. Then $d_j = 1/\|x'_j\|'$ for each $j = 1, \ldots, m$. Consequently,

$$\varkappa_{p,r}(x) = \mu_p\bigl(\|x_1\|, \ldots, \|x_m\|\bigr)\,\mu_r(1/d_1, \ldots, 1/d_m) \quad \text{for } 1 \le p, r \le \infty.$$

In particular,

$$\varkappa_{1,1}(x) = \frac{\mathrm{AM}\{\|x_1\|, \ldots, \|x_m\|\}}{\mathrm{HM}\{d_1, \ldots, d_m\}}, \qquad \varkappa_{2,2}(x) = \frac{(\mathrm{AM}\{\|x_1\|^2, \ldots, \|x_m\|^2\})^{1/2}}{(\mathrm{HM}\{d_1^2, \ldots, d_m^2\})^{1/2}},$$

$$\varkappa_{1,\infty}(x) = \frac{\mathrm{AM}\{\|x_1\|, \ldots, \|x_m\|\}}{\min\{d_1, \ldots, d_m\}}, \qquad \varkappa_{\infty,1}(x) = \frac{\max\{\|x_1\|, \ldots, \|x_m\|\}}{\mathrm{HM}\{d_1, \ldots, d_m\}},$$

$$\varkappa_{\infty,\infty}(x) = \frac{\max\{\|x_1\|, \ldots, \|x_m\|\}}{\min\{d_1, \ldots, d_m\}},$$

where AM and HM denote the arithmetic and harmonic means respectively.

Proof. Fix $j \in \{1, \ldots, m\}$. Then

$$\|x'_j\|' = \max\bigl\{|c_j| : c_1, \ldots, c_m \in \mathbb{C} \text{ and } \|c_1x_1 + \cdots + c_mx_m\| = 1\bigr\}.$$

Since $\|c_1x_1 + \cdots + c_mx_m\| \ge |c_j|\,d_j$ for all $c_1, \ldots, c_m \in \mathbb{C}$, we obtain $\|x'_j\|' \le 1/d_j$. On the other hand, there are $c_1, \ldots, c_{j-1}, c_{j+1}, \ldots, c_m \in \mathbb{C}$ such that $d_j = \|x_j - c_1x_1 - \cdots - c_{j-1}x_{j-1} - c_{j+1}x_{j+1} - \cdots - c_mx_m\|$. If we let $x := (x_j - c_1x_1 - \cdots - c_{j-1}x_{j-1} - c_{j+1}x_{j+1} - \cdots - c_mx_m)/d_j$, then $\|x\| = 1$ and $x'_j(x) = 1/d_j$, so that $\|x'_j\|' \ge 1/d_j$. Thus $\|x'_j\|' = 1/d_j$. The expressions for $\varkappa_{1,1}(x)$, $\varkappa_{2,2}(x)$, $\varkappa_{1,\infty}(x)$, $\varkappa_{\infty,1}(x)$ and $\varkappa_{\infty,\infty}(x)$ follow by noting that $\mu_1(a_1, \ldots, a_m) = \mathrm{AM}\{a_1, \ldots, a_m\} = 1/\mathrm{HM}\{1/a_1, \ldots, 1/a_m\}$ and $\max\{a_1, \ldots, a_m\} = 1/\min\{1/a_1, \ldots, 1/a_m\}$ for any positive real numbers $a_1, \ldots, a_m$. □

If $x$ is orthogonal, then

$$\varkappa_{1,1}(x) = \frac{\mathrm{AM}\{\|x_1\|, \ldots, \|x_m\|\}}{\mathrm{HM}\{\|x_1\|, \ldots, \|x_m\|\}}, \qquad \varkappa_{\infty,\infty}(x) = \frac{\max\{\|x_1\|, \ldots, \|x_m\|\}}{\min\{\|x_1\|, \ldots, \|x_m\|\}},$$

and similar expressions hold for all other numbers $\varkappa_{p,r}(x)$.

Proposition 3.2. Let $x := [x_1, \ldots, x_m]$ be an ordered basis of $M$, and let $1 \le p, r \le \infty$. Then $\varkappa_{p,r}(x) = 1$ if and only if $\|x_1\| = \cdots = \|x_m\|$ and $x$ is orthogonal.

Proof. Suppose $\varkappa_{1,1}(x) = 1$, that is, $\mathrm{AM}\{\|x_1\|, \ldots, \|x_m\|\} = \mathrm{HM}\{d_1, \ldots, d_m\}$. Since $\mathrm{AM}\{\|x_1\|, \ldots, \|x_m\|\} \ge \mathrm{HM}\{\|x_1\|, \ldots, \|x_m\|\}$ and $\|x_j\| \ge d_j$ for $j = 1, \ldots, m$, we obtain

$$\mathrm{AM}\{\|x_1\|, \ldots, \|x_m\|\} = \mathrm{HM}\{\|x_1\|, \ldots, \|x_m\|\} = \mathrm{HM}\{d_1, \ldots, d_m\}.$$

The first equality above implies that $\|x_1\| = \cdots = \|x_m\|$, and the second implies that $\|x_j\| = d_j$ for $j = 1, \ldots, m$, that is, $x$ is orthogonal. Conversely, suppose $\|x_1\| = \cdots = \|x_m\|$ and $x$ is orthogonal. Then

$$\max\{\|x_1\|, \ldots, \|x_m\|\} = \min\{\|x_1\|, \ldots, \|x_m\|\} = \min\{d_1, \ldots, d_m\},$$

that is, $\varkappa_{\infty,\infty}(x) = 1$. The inequalities $1 \le \varkappa_{1,1}(x) \le \varkappa_{p,r}(x) \le \varkappa_{\infty,\infty}(x)$ proved in Corollary 2.2 complete the proof. □
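The identity of Lemma 3.1 between the distance to the complementary subspace and the reciprocal of the dual norm can be verified on a small sup-norm example. The sketch below uses the vectors $[1, 1]^t$ and $[0, 3/2]^t$, which the paper itself treats in Remarks 4.2. It relies on two assumptions stated here rather than in the text: the dual of the sup norm on a coordinate space is the 1-norm, and, because the vectors are real, minimizing over real scalars on a grid suffices. The helper names are hypothetical.

```python
def sup_norm(v):
    return max(abs(c) for c in v)

def dist_to_span(x, y, lo=-3.0, hi=3.0, steps=60001):
    """Grid search for the min over real c of ||x - c*y||_inf (a crude estimate)."""
    best = float("inf")
    for i in range(steps):
        c = lo + (hi - lo) * i / (steps - 1)
        best = min(best, sup_norm([xi - c * yi for xi, yi in zip(x, y)]))
    return best

x1, x2 = [1.0, 1.0], [0.0, 1.5]

# Dual basis: rows of the inverse of the matrix whose columns are x1 and x2.
det = x1[0] * x2[1] - x2[0] * x1[1]
dual = [[x2[1] / det, -x2[0] / det], [-x1[1] / det, x1[0] / det]]
# Under the sup norm, the dual norm of u -> v.u is the 1-norm of v.
dual_norms = [sum(abs(c) for c in row) for row in dual]

d1 = dist_to_span(x1, x2)
d2 = dist_to_span(x2, x1)
print(d1, 1 / dual_norms[0])
print(d2, 1 / dual_norms[1])
```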


As we have already mentioned in the Introduction, hat functions constitute a prime example of an ordered basis satisfying the conditions stated in Proposition 3.2. Several other examples will be given in Section 6. They consist of pyramid functions, characteristic functions and Legendre polynomial functions (Proposition 6.1 and Examples 6.2).

Results in this section show that if the number $\varkappa_{p,r}(x)$ is of modest size, then the distance $d_j$ of any $x_j$ from $\mathrm{span}\{x_i : i = 1, \ldots, m \text{ and } i \ne j\}$ would not be much smaller than $\|x_j\|$, so that 'near linear dependence' among $x_1, \ldots, x_m$ would not cause problems, and moreover, the norm of any $x_j$ is neither too large nor too small in comparison with the norms of the other $x_i$'s, so that overflow and/or underflow can be avoided during computations. This is borne out, for example, by the following simple inequalities:

$$\frac{\|x_j\|}{\varkappa_{\infty,\infty}(x)} \le d_j \le \|x_j\| \quad \text{and} \quad \frac{\|x_i\|}{\varkappa_{\infty,\infty}(x)} \le \|x_j\| \le \varkappa_{\infty,\infty}(x)\,\|x_i\|$$

for each fixed $j \in \{1, \ldots, m\}$ and all $i = 1, \ldots, m$.

4. Scaling

In this section, we consider the effect of scaling on the numbers $\varkappa_{p,r}(x)$. If we multiply all $x_j$'s by a nonzero scalar, then clearly $\varkappa_{p,r}(x)$ does not alter. This is, however, not the case if $x_j$ is multiplied by a nonzero scalar $\alpha_j$ for each $j = 1, \ldots, m$. We look for optimal scalars $\alpha_1, \ldots, \alpha_m$ which would minimize $\varkappa_{p,r}([\alpha_1x_1, \ldots, \alpha_mx_m])$.

Proposition 4.1. Let $x := [x_1, \ldots, x_m]$ be an ordered basis of $M$. For nonzero scalars $\alpha_1, \ldots, \alpha_m$, consider the ordered basis $x_\alpha := [\alpha_1x_1, \ldots, \alpha_mx_m]$ of $M$.

(i) Let $1 \le p, r < \infty$ and define $t := pr/(p + r)$. Then $\varkappa_{p,r}(x_\alpha) \ge \mu_t(\|x_1\|/d_1, \ldots, \|x_m\|/d_m)$, and this minimum is attained if and only if $|\alpha_1|^{p+r}\|x_1\|^p d_1^r = \cdots = |\alpha_m|^{p+r}\|x_m\|^p d_m^r$.

(ii) Let $1 \le p < \infty$. Then $\varkappa_{p,\infty}(x_\alpha) \ge \mu_p(\|x_1\|/d_1, \ldots, \|x_m\|/d_m)$, and this minimum is attained if and only if $|\alpha_1|d_1 = \cdots = |\alpha_m|d_m$.

(iii) Let $1 \le r < \infty$. Then $\varkappa_{\infty,r}(x_\alpha) \ge \mu_r(\|x_1\|/d_1, \ldots, \|x_m\|/d_m)$, and this minimum is attained if and only if $|\alpha_1|\|x_1\| = \cdots = |\alpha_m|\|x_m\|$.

(iv) $\varkappa_{\infty,\infty}(x_\alpha) \ge \max\{\|x_1\|/d_1, \ldots, \|x_m\|/d_m\}$, and this minimum is attained if and only if for $i, j \in \{1, \ldots, m\}$, there is $k \in \{1, \ldots, m\}$ such that $|\alpha_i|\,\|x_i\|\,d_k \le |\alpha_j|\,d_j\,\|x_k\|$.
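Part (i) of Proposition 4.1 can be illustrated with a short Python sketch. It assumes the data of a basis are given only through the norms of its elements and the distances $d_j$, and uses Lemma 3.1 to replace the dual norms by $1/d_j$. The numerical values correspond to the illustrative basis $(1, 0)$, $(1, 1)$ of Euclidean $\mathbb{R}^2$, and the function names are hypothetical. The scalars chosen below are one choice satisfying the equality condition in (i).

```python
import math

def power_mean(values, p):
    """p-th power mean of positive numbers; p = math.inf gives the maximum."""
    if p == math.inf:
        return max(values)
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)

def varkappa_scaled(norms, dists, alphas, p, r):
    """Value for the scaled basis [alpha_j x_j], using dual norms 1/(|alpha_j| d_j)."""
    a = [abs(al) * n for al, n in zip(alphas, norms)]
    b = [1.0 / (abs(al) * d) for al, d in zip(alphas, dists)]
    return power_mean(a, p) * power_mean(b, r)

# Illustrative data: x1 = (1, 0), x2 = (1, 1) in Euclidean R^2.
norms = [1.0, math.sqrt(2.0)]          # norms of x1, x2
dists = [1.0 / math.sqrt(2.0), 1.0]    # distances d1, d2

p, r = 2, 2
t = p * r / (p + r)
lower_bound = power_mean([n / d for n, d in zip(norms, dists)], t)

# Scalars making |alpha_j|^(p+r) * ||x_j||^p * d_j^r the same for every j.
optimal = [n ** (-p / (p + r)) * d ** (-r / (p + r)) for n, d in zip(norms, dists)]

unscaled_value = varkappa_scaled(norms, dists, [1.0, 1.0], p, r)
optimal_value = varkappa_scaled(norms, dists, optimal, p, r)
print(unscaled_value, optimal_value, lower_bound)
```

In this example the unscaled basis gives a value strictly above the bound, while the scaled basis attains it up to rounding.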

Proof. Observe that $\mathrm{span}\{\alpha_1x_1, \ldots, \alpha_{j-1}x_{j-1}, \alpha_{j+1}x_{j+1}, \ldots, \alpha_mx_m\} = \widetilde{M}_j$ and $\mathrm{dist}(\alpha_jx_j, \widetilde{M}_j) = |\alpha_j|\,\mathrm{dist}(x_j, \widetilde{M}_j) = |\alpha_j|\,d_j$, just as $\|\alpha_jx_j\| = |\alpha_j|\,\|x_j\|$ for $j = 1, \ldots, m$.

The desired results follow from the general Hölder inequality which states that for positive real numbers $a_1, \ldots, a_m$, $b_1, \ldots, b_m$ and $1 \le p, r \le \infty$,

$$\mu_t(a_1b_1, \ldots, a_mb_m) \le \mu_p(a_1, \ldots, a_m)\,\mu_r(b_1, \ldots, b_m), \quad \text{where } \frac{1}{p} + \frac{1}{r} = \frac{1}{t},$$

and, in fact, equality holds if and only if the following condition is satisfied:

$a_1^p/b_1^r = \cdots = a_m^p/b_m^r$, in case $1 \le p, r < \infty$,

$b_1 = \cdots = b_m$, in case $1 \le p < \infty$ and $r = \infty$,

$a_1 = \cdots = a_m$, in case $p = \infty$ and $1 \le r < \infty$,

for $i, j \in \{1, \ldots, m\}$, there is $k \in \{1, \ldots, m\}$ such that $a_ib_j \le a_kb_k$, in case $p = \infty = r$.

The general Hölder inequality is easily derived from the usual Hölder inequality since $t = \infty$ if and only if $p = \infty = r$, and otherwise $(t/p) + (t/r) = 1$.


The results in (i)–(iv) are obtained by letting $a_j := |\alpha_j|\,\|x_j\|$ and $b_j := 1/(|\alpha_j|\,d_j)$ for $j = 1, \ldots, m$ in the above statements. □

Remarks 4.2. In accordance with the inequalities $\varkappa_{1,1}(x) \le \varkappa_{p,r}(x) \le \varkappa_{\infty,\infty}(x)$ proved in Corollary 2.2, we observe that the minimum value attained by the number $\varkappa_{p,r}(x)$ after optimal scaling, as given in Proposition 4.1, satisfies

$$\mu_{1/2}(a_1, \ldots, a_m) \le \mu_t(a_1, \ldots, a_m) \le \mu_\infty(a_1, \ldots, a_m),$$

where $a_j = \|x_j\|/d_j$ for $j = 1, \ldots, m$. This follows from the power mean inequality since $t = pr/(p + r)$, so that $1/2 \le t \le \infty$. Equality holds in these two inequalities if and only if $\|x_1\|/d_1 = \cdots = \|x_m\|/d_m$ [2, p. 17].

Proposition 4.1 determines the optimal scaling strategy when not both $p$ and $r$ are equal to $\infty$: Choose

$$\alpha_j := \|x_j\|^{-p/(p+r)}\,d_j^{-r/(p+r)} \quad \text{for } j = 1, \ldots, m.$$

On the other hand, when $p = \infty = r$, there can be several optimal scaling strategies. For example, we may choose

$$\alpha_j := \|x_j\|^{-1} \text{ for } j = 1, \ldots, m \quad \text{or} \quad \alpha_j := d_j^{-1} \text{ for } j = 1, \ldots, m.$$

We give an example of yet another choice. Let $M := \mathbb{C}^{2\times 1}$ with the norm given by $\|u\| := \max\{|u(1)|, |u(2)|\}$ for $u := [u(1), u(2)]^t \in \mathbb{C}^{2\times 1}$. Consider $x_1 := [1, 1]^t$ and $x_2 := [0, 3/2]^t$. Then $\|x_1\| = 1$ and $\|x_2\| = 3/2$. It can be easily seen that $d_1 := \min\{\|x_1 - cx_2\| : c \in \mathbb{C}\} = 1$ and $d_2 := \min\{\|x_2 - cx_1\| : c \in \mathbb{C}\} = 3/4$. Hence the minimum value of $\varkappa_{\infty,\infty}(x_\alpha)$ is $\max\{\|x_1\|/d_1, \|x_2\|/d_2\} = 2$, and it is attained by choosing $\alpha_1 = 1 = \alpha_2$, for which $|\alpha_1|\,\|x_1\| \ne |\alpha_2|\,\|x_2\|$ and $|\alpha_1|d_1 \ne |\alpha_2|d_2$.

Example 4.3. Let $M := \mathbb{C}^{m\times 1}$, and $u := [u_1, \ldots, u_m]$ be an ordered basis of $\mathbb{C}^{m\times 1}$. If $u_j := [u_j(1), \ldots, u_j(m)]^t$ for $j = 1, \ldots, m$, then $u$ is given by the $m \times m$ nonsingular matrix $A := [a_{i,j}]$, where the entry $a_{i,j}$ in the ith row and the jth column is equal to $u_j(i)$ for $i, j = 1, \ldots, m$. Also, the scaled ordered basis $u_\alpha$ is given by the matrix $A_\alpha := [\alpha_ja_{i,j}]$ obtained by scaling the columns of $A$. While Proposition 4.1 gives a satisfactory answer to the problem of finding an optimal scaling for $\varkappa_{p,r}(u)$, an answer to the corresponding problem of finding an optimal scaling for a condition number $K(A) = \|A\|\,\|A^{-1}\|$ of the matrix $A$ is not easily available in general, as can be seen from [1,11]. We quote a couple of results in [11, Theorem 2.5]. For $A \in \mathbb{C}^{m\times m}$, let $\|A\|_p := \max\{\|Au\|_p : u \in \mathbb{C}^{m\times 1} \text{ and } \|u\|_p = 1\}$, $1 \le p \le \infty$, and let $\|A\|$ denote $\|A\|_p$ or the Frobenius norm of $A$.

(i) If $K(A) := \|A\|_1\,\|A^{-1}\|$, then $K(A_\alpha)$ is minimized if $\|\alpha_1u_1\|_1 = \cdots = \|\alpha_mu_m\|_1$.

(ii) If $K(A) := (\max|a_{i,j}|)\,\|A^{-1}\|$, then $K(A_\alpha)$ is minimized if $\|\alpha_1u_1\|_\infty = \cdots = \|\alpha_mu_m\|_\infty$.

These results can be compared with the following result. Let $\|\cdot\|$ be any norm on $\mathbb{C}^{m\times 1}$. For $1 \le r < \infty$, the number $\varkappa_{\infty,r}(u_\alpha)$ is minimized if and only if $\|\alpha_1u_1\| = \cdots = \|\alpha_mu_m\|$, and $\varkappa_{\infty,\infty}(u_\alpha)$ is minimized if $\|\alpha_1u_1\| = \cdots = \|\alpha_mu_m\|$. This follows from Proposition 4.1 and Remarks 4.2.

5. Transformed bases

In this section, we investigate the change in a condition number of an ordered basis when it is transformed to another ordered basis by a nonsingular matrix. For this purpose, it is convenient to use the Gram product of $x := [x_1, \ldots, x_m] \in M^{1\times m}$ and $f := [f_1, \ldots, f_m] \in (M')^{1\times m}$ defined by

$$[f, x] := [f_i(x_j)] \in \mathbb{C}^{m\times m},$$

where $f_i(x_j)$ is the entry in the ith row and the jth column of the $m \times m$ matrix $[f, x]$. We note that if $x$ is an ordered basis of $M$ and $[f, x] = [g, x]$ for some $f, g$ in $(M')^{1\times m}$, then $f = g$. Similarly, if $f$ is an ordered basis of $M'$ and $[f, x] = [f, y]$ for some $x, y$ in $M^{1\times m}$, then $x = y$.

Let $I_m$ denote the $m \times m$ identity matrix. For $x$ in $M^{1\times m}$ and $f$ in $(M')^{1\times m}$, $[f, x] = I_m$ if and only if $x$ is an ordered basis of $M$ and $f$ is the ordered dual basis of $M'$. Next, for $x := [x_1, \ldots, x_m] \in M^{1\times m}$ and $\Theta := [\theta_{i,j}] \in \mathbb{C}^{m\times m}$, we define

$$x\Theta := \left[\sum_{i=1}^m \theta_{i,1}x_i, \ldots, \sum_{i=1}^m \theta_{i,m}x_i\right],$$

in accordance with the usual matrix multiplication. In Section 4, we have already considered the particular case $x_\alpha = x\Theta$, where $\Theta := \mathrm{diag}(\alpha_1, \ldots, \alpha_m)$. Similarly, we define $f\Theta$ for $f \in (M')^{1\times m}$ and $\Theta \in \mathbb{C}^{m\times m}$. It follows that

$$[f, x\Theta] = [f, x]\Theta \quad \text{and} \quad [f\Theta, x] = \Theta^t[f, x]$$

for all $x \in M^{1\times m}$, $f \in (M')^{1\times m}$ and $\Theta \in \mathbb{C}^{m\times m}$.

Let $p \in \{1, 2, \infty\}$. For $x \in M^{1\times m}$ and $f \in (M')^{1\times m}$, recall the norms $\|x\|_p$ and $\|f\|'_p$ introduced in Section 2. For $\Theta := [\theta_{i,j}] \in \mathbb{C}^{m\times m}$, let

$$\|\Theta\|_1 := \max_{j=1,\ldots,m} \sum_{i=1}^m |\theta_{i,j}|, \qquad \|\Theta\|_\infty := \max_{i=1,\ldots,m} \sum_{j=1}^m |\theta_{i,j}|, \qquad \|\Theta\|_F := \left(\sum_{i,j=1}^m |\theta_{i,j}|^2\right)^{1/2}.$$

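Then $\|\cdot\|_1$, $\|\cdot\|_F$ and $\|\cdot\|_\infty$ are submultiplicative norms on $\mathbb{C}^{m\times m}$. Also,

$$\|x\Theta\|_1 \le \|x\|_1\,\|\Theta\|_\infty, \qquad \|x\Theta\|_\infty \le \|x\|_\infty\,\|\Theta\|_1$$

and

$$\|x\Theta\|_2 \le \|x\|_2\,\min\bigl\{\|\Theta\|_F, \sqrt{\|\Theta\|_1\,\|\Theta\|_\infty}\bigr\}$$

for all $x \in M^{1\times m}$ and $\Theta \in \mathbb{C}^{m\times m}$.

Proposition 5.1. Let $x$ be an ordered basis of $M$ and $x'$ be the ordered dual basis of $M'$. Let $\Theta \in \mathbb{C}^{m\times m}$ be nonsingular and define $y := x\Theta$. Then $y' := x'\Theta^{-t}$ is the ordered dual basis of $M'$. As a consequence,

$$\varkappa_{p,r}(y) \le \|\Theta\|_q\,\|\Theta^{-1}\|_r\,\varkappa_{p,r}(x) \quad \text{for } p, r \in \{1, \infty\} \text{ and } \frac{1}{p} + \frac{1}{q} = 1,$$

$$\varkappa_{2,2}(y) \le \min\bigl\{K_F(\Theta), \sqrt{K_1(\Theta)\,K_\infty(\Theta)}\bigr\}\,\varkappa_{2,2}(x).$$

Further, if $\Theta$ is Hermitian, skew-Hermitian, symmetric or skew-symmetric, then $\varkappa_{p,r}(y) \le K_1(\Theta)\,\varkappa_{p,r}(x)$ if $p, r \in \{1, \infty\}$, or if $p = 2 = r$.

Proof. Since $[x', x] = I_m$, we see that $[y', y] = [x'\Theta^{-t}, x\Theta] = \Theta^{-1}[x', x]\Theta = I_m$. Hence $y'$ is the ordered dual basis of $M'$. Note that $\|\Theta^{-t}\|_\infty = \|\Theta^{-1}\|_1$, $\|\Theta^{-t}\|_1 = \|\Theta^{-1}\|_\infty$ and $\|\Theta^{-t}\|_F = \|\Theta^{-1}\|_F$. Suppose $p, r \in \{1, \infty\}$. Let $(1/p) + (1/q) = 1$ and $(1/r) + (1/s) = 1$. Then $\|y\|_p \le \|x\|_p\,\|\Theta\|_q$ and

$$\|y'\|'_r = \|x'\Theta^{-t}\|'_r \le \|x'\|'_r\,\|\Theta^{-t}\|_s = \|x'\|'_r\,\|\Theta^{-1}\|_r.$$

Hence

$$\varkappa_{p,r}(y) = \frac{\|y\|_p}{m^{1/p}}\,\frac{\|y'\|'_r}{m^{1/r}} \le \|\Theta\|_q\,\|\Theta^{-1}\|_r\,\frac{\|x\|_p}{m^{1/p}}\,\frac{\|x'\|'_r}{m^{1/r}} = \|\Theta\|_q\,\|\Theta^{-1}\|_r\,\varkappa_{p,r}(x).$$

Next,

$$m\,\varkappa_{2,2}(y) = \|y\|_2\,\|y'\|'_2 \le \|\Theta\|_F\,\|\Theta^{-t}\|_F\,\|x\|_2\,\|x'\|'_2 = m\,K_F(\Theta)\,\varkappa_{2,2}(x),$$

that is, $\varkappa_{2,2}(y) \le K_F(\Theta)\,\varkappa_{2,2}(x)$. Similarly,

$$\|y\|_2\,\|y'\|'_2 \le \sqrt{\|\Theta\|_1\,\|\Theta\|_\infty}\,\sqrt{\|\Theta^{-t}\|_1\,\|\Theta^{-t}\|_\infty}\;\|x\|_2\,\|x'\|'_2,$$

so that $\varkappa_{2,2}(y) \le \sqrt{K_1(\Theta)\,K_\infty(\Theta)}\;\varkappa_{2,2}(x)$.

In particular, if $\Theta$ is Hermitian, skew-Hermitian, symmetric or skew-symmetric, then $\Theta^{-1}$ is also so, and hence $\|\Theta\|_\infty = \|\Theta\|_1$, $\|\Theta^{-1}\|_\infty = \|\Theta^{-1}\|_1$ and $K_\infty(\Theta) = K_1(\Theta)$. Hence the desired results follow. □

In the next section, we shall see that the estimates of a condition number of a transformed ordered basis provided by the above proposition can be improved in three important special cases. (See Remark 6.3.)

Inner product spaces. Suppose that there is an inner product $\langle\cdot\,,\cdot\rangle$ on $M$ such that $\langle x, x\rangle = \|x\|^2$ for all $x \in M$, that is, there is a function $\langle\cdot\,,\cdot\rangle : M \times M \to \mathbb{C}$ which is positive definite, linear in the first variable and conjugate-symmetric. For a fixed $y \in M$, define $f_y : M \to \mathbb{C}$ by $f_y(x) := \langle x, y\rangle$, $x \in M$. Then $f_y \in M'$, $\|f_y\|' = \|y\|$, and $y \mapsto f_y$ is a conjugate-linear map from $M$ onto $M'$. Hence for $x = [x_1, \ldots, x_m] \in M^{1\times m}$ and $y = [y_1, \ldots, y_m] \in M^{1\times m}$, we define

$$[y, x] := [f, x], \quad \text{where } f = [f_{y_1}, \ldots, f_{y_m}].$$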

It can be easily seen that $[y, x\Theta] = [y, x]\Theta$ and $[y\Theta, x] = \Theta^*[y, x]$ for all $x, y \in M^{1\times m}$ and $\Theta \in \mathbb{C}^{m\times m}$, where the superscript $*$ denotes conjugate-transpose. The entry in the ith row and the jth column of the matrix $[y, x]$ is $\langle x_j, y_i\rangle$, $i, j = 1, \ldots, m$. In particular, when $y_i = x_i$ for $i = 1, \ldots, m$, we obtain the Gram matrix $G := [\langle x_j, x_i\rangle]$ corresponding to $x$. It is nonsingular if and only if $\{x_1, \ldots, x_m\}$ is a linearly independent subset of $M$, and then $G$ is in fact positive definite, and so it has the Cholesky decomposition $G = C^*C$, where $C$ is an upper triangular matrix with positive diagonal entries. Further, $\{x_1, \ldots, x_m\}$ is an orthonormal subset of $M$ if and only if $G = I_m$.

Proposition 5.2. Suppose the norm on $M$ is induced by an inner product $\langle\cdot\,,\cdot\rangle$. Let $x := [x_1, \ldots, x_m]$ be an ordered basis of $M$, and let $G := [g_{i,j}]$ and $G^{-1} := [\gamma_{i,j}]$ denote the corresponding Gram matrix and its inverse. Then $\|x_j\| = \sqrt{g_{j,j}}$ and $\|x'_j\|' = \sqrt{\gamma_{j,j}}$ for each $j = 1, \ldots, m$. As a consequence,

$$\varkappa_{p,r}(x) = \mu_p\bigl(\sqrt{g_{1,1}}, \ldots, \sqrt{g_{m,m}}\bigr)\,\mu_r\bigl(\sqrt{\gamma_{1,1}}, \ldots, \sqrt{\gamma_{m,m}}\bigr).$$

In particular,

$$\varkappa_{2,2}(x) = \frac{1}{m}(\mathrm{tr}\,G)^{1/2}\bigl(\mathrm{tr}\,G^{-1}\bigr)^{1/2} = \frac{1}{m}K_F(C),$$

where $G = C^*C$ is the Cholesky decomposition, and

$$\varkappa_{\infty,\infty}(x) = \bigl(\max\{g_{1,1}, \ldots, g_{m,m}\}\bigr)^{1/2}\bigl(\max\{\gamma_{1,1}, \ldots, \gamma_{m,m}\}\bigr)^{1/2}.$$

Also, for an optimally scaled basis $x_\alpha$,

$$\varkappa_{2,2}(x_\alpha) = \mathrm{AM}\bigl\{\sqrt{g_{1,1}\gamma_{1,1}}, \ldots, \sqrt{g_{m,m}\gamma_{m,m}}\bigr\}, \qquad \varkappa_{\infty,\infty}(x_\alpha) = \max\bigl\{\sqrt{g_{1,1}\gamma_{1,1}}, \ldots, \sqrt{g_{m,m}\gamma_{m,m}}\bigr\}.$$

Further, if $\|\cdot\|$ is any submultiplicative norm on $\mathbb{C}^{m\times m}$ such that $\|A\|$ is greater than or equal to the modulus of each diagonal entry of $A$ for all $A \in \mathbb{C}^{m\times m}$, then $\varkappa_{\infty,\infty}(x) \le \sqrt{K(G)}$.

Proof. Fix $j \in \{1,\dots,m\}$. Then $\|x_j\|^2 = \langle x_j, x_j\rangle = g_{j,j}$, the $j$th diagonal entry of $G$. Let $y := xG^{-1}$. Since $G^* = G$, we see that $[y, x] = [xG^{-1}, x] = G^{-1}[x, x] = I_m$. Hence $[f_{y_1},\dots,f_{y_m}]$ is the ordered dual basis of $M'$. Note that $\|y_j\| = \|f_{y_j}\| = 1/d_j$ for $j = 1,\dots,m$. Also, $G^{-1}$ is the Gram matrix corresponding to the ordered basis $y$ of $M$ since

$$[y, y] = [xG^{-1}, xG^{-1}] = G^{-1}[x, x]G^{-1} = G^{-1}.$$

Hence $\|y_j\|^2 = \langle y_j, y_j\rangle = \gamma_{j,j}$, the $j$th diagonal entry of $G^{-1}$. Thus we obtain the desired expression for $\varkappa_{p,r}(x)$.

Letting $p = 2 = r$, we obtain $\varkappa_{2,2}(x) = (\operatorname{tr} G)^{1/2}(\operatorname{tr} G^{-1})^{1/2}/m$. Further, $\operatorname{tr}(G) = \operatorname{tr}(C^*C) = \|C\|_F^2$ and $\operatorname{tr}(G^{-1}) = \operatorname{tr}(C^{-1}C^{-*}) = \|C^{-1}\|_F^2$. Hence the desired expressions for $\varkappa_{2,2}(x)$ follow.

By Proposition 4.1, $\varkappa_{2,2}(x_\alpha)$ is minimum if $\alpha_j = (\gamma_{j,j}/g_{j,j})^{1/4}$ for $j = 1,\dots,m$ and its minimal value is $\mathrm{AM}\{\sqrt{g_{1,1}\gamma_{1,1}},\dots,\sqrt{g_{m,m}\gamma_{m,m}}\}$, while $\varkappa_{\infty,\infty}(x_\alpha)$ is minimum if $\alpha_j = 1/\sqrt{g_{j,j}}$ for $j = 1,\dots,m$ and its minimal value is $\max\{\sqrt{g_{1,1}\gamma_{1,1}},\dots,\sqrt{g_{m,m}\gamma_{m,m}}\}$.

Let now $\|\cdot\|$ be a submultiplicative norm on $\mathbb{C}^{m\times m}$ such that $|a_{j,j}| \le \|A\|$ for all $A := [a_{i,j}] \in \mathbb{C}^{m\times m}$ and $j = 1,\dots,m$. Then $|g_{j,j}| \le \|G\|$ and $|\gamma_{j,j}| \le \|G^{-1}\|$, so that

$$\varkappa_{\infty,\infty}(x) \le \|G\|^{1/2}\,\|G^{-1}\|^{1/2} = \sqrt{K(G)}.$$ □
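As a rough numerical illustration of Proposition 5.2, the following Python sketch evaluates the number of the proposition from the diagonal of a Gram matrix and the diagonal of its inverse. It assumes that $\mu_p$ denotes the $p$-th power mean (the maximum when $p = \infty$), which agrees with the special cases displayed in the proposition. The function names and the $2\times 2$ matrix are hypothetical choices made only for this example.

```python
from fractions import Fraction
from math import sqrt, inf


def inverse(A):
    """Exact Gauss-Jordan inverse of a nonsingular square matrix of Fractions."""
    n = len(A)
    # Augment A with the identity matrix.
    M = [list(row) + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                factor = M[r][c]
                M[r] = [a - factor * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]


def power_mean(values, p):
    """Assumed meaning of mu_p: p-th power mean, maximum for p = inf."""
    if p == inf:
        return max(values)
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


def kappa_from_gram(G, p, r):
    """mu_p of sqrt(g_jj) times mu_r of sqrt(gamma_jj), as in Proposition 5.2."""
    G_inv = inverse(G)
    m = len(G)
    norms = [sqrt(G[j][j]) for j in range(m)]
    dual_norms = [sqrt(G_inv[j][j]) for j in range(m)]
    return power_mean(norms, p) * power_mean(dual_norms, r)


# Hypothetical Gram matrix; its inverse is [[1/2, -1/2], [-1/2, 1]].
G = [[Fraction(4), Fraction(2)], [Fraction(2), Fraction(2)]]
kappa_22 = kappa_from_gram(G, 2, 2)
kappa_inf_inf = kappa_from_gram(G, inf, inf)
```

For this matrix, $\operatorname{tr} G = 6$ and $\operatorname{tr} G^{-1} = 3/2$, so the case $p = r = 2$ should give $\sqrt{9}/2 = 3/2$, while the case $p = r = \infty$ should give $\sqrt{4}\cdot\sqrt{1} = 2$.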

Proposition 5.2 enables us to calculate $\varkappa_{p,r}(x)$ in terms of the diagonal entries $g_{1,1},\dots,g_{m,m}$ of the Gram matrix corresponding to $x$ and the diagonal entries $\gamma_{1,1},\dots,\gamma_{m,m}$ of its inverse. While the $g_{j,j}$'s may be readily available, it may not be easy to find the $\gamma_{j,j}$'s. We illustrate this in the following result about the power basis. We refer to [6] for a computation of $\kappa_\infty(x)$, where $x$ is the power basis on the interval $[a, b]$.

Proposition 5.3. Let $M$ denote the linear space of all polynomial functions on $[-1, 1]$ of degree at most $m - 1$ along with the inner product $\langle x, y\rangle := \int_{-1}^{1} x(t)\,\overline{y(t)}\,dt$ for $x, y \in M$. Consider the power basis of $M$ given by

$$x := [x_1,\dots,x_m], \quad\text{where } x_j(t) := t^{j-1} \text{ for } j = 1,\dots,m \text{ and } t \in [-1, 1].$$

Let $G := [g_{i,j}]$ denote the corresponding Gram matrix and let $G^{-1} := [\gamma_{i,j}]$ be its inverse. Then for $j = 1,\dots,m$, $g_{j,j} = 2/(2j - 1)$ and

$$\gamma_{j,j} = \sum_{k=j}^{m} \frac{2k-1}{2}\cdot\frac{[(k+j-2)!]^2}{\big(2^{k-1}\,[(k-j)/2]!\,[(k+j-2)/2]!\,(j-1)!\big)^2},$$

where the summation is taken over all odd $k \in \{j,\dots,m\}$ if $j$ is odd, while the summation is taken over all even $k \in \{j,\dots,m\}$ if $j$ is even.

Proof. Since $g_{i,j} = \int_{-1}^{1} t^{i+j-2}\,dt$, we obtain

$$g_{i,j} = \begin{cases} \dfrac{2}{i+j-1} & \text{if } i + j \text{ is even},\\[1ex] 0 & \text{if } i + j \text{ is odd} \end{cases} \qquad\text{for } 1 \le i, j \le m.$$

Thus $g_{j,j} = 2/(2j - 1)$ for $j = 1,\dots,m$. To find the diagonal entries $\gamma_{1,1},\dots,\gamma_{m,m}$ of $G^{-1}$, consider the Cholesky decomposition $G = C^*C$. Let $e := xC^{-1}$. Then $[e, e] = C^{-*}[x, x]C^{-1} = C^{-*}GC^{-1} = I_m$. Hence $e$ is an orthonormal ordered basis of $M$. If $e := [e_1,\dots,e_m]$, then $e_j$ is a polynomial function on $[-1, 1]$ of degree $j - 1$ for $j = 1,\dots,m$, since $C^{-1}$ is upper triangular. It is called the Legendre polynomial of degree $j - 1$, normalized to make its $L^2$-norm equal to 1. For $n = 0, 1, 2, \dots$, the Legendre polynomial $P_n$ of degree $n$ satisfying $P_n(1) = 1$ is given by

$$P_n(t) = \sum_{k=0}^{\lfloor n/2\rfloor} (-1)^k\,\frac{(2n-2k)!}{2^n\,k!\,(n-k)!\,(n-2k)!}\;t^{n-2k} \qquad\text{for } t \in [-1, 1],$$

where $\lfloor n/2\rfloor$ denotes the greatest integer less than or equal to $n/2$. Also, $\|P_n\|_2 = \sqrt{2}/\sqrt{2n+1}$ for $n = 0, 1, 2, \dots$. (See, for example, [9, p. 339 and p. 342].) Note that if $n$ is even, then each term of $P_n$ is of even degree and if $n$ is odd, then each term of $P_n$ is of odd degree.
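The explicit sum for $P_n$ and the two stated properties, $P_n(1) = 1$ and $\|P_n\|_2^2 = 2/(2n+1)$, can be checked in exact rational arithmetic. The sketch below is only an illustration; the helper names are hypothetical, and the norm is computed from the moments $\int_{-1}^{1} t^d\,dt$, which equal $2/(d+1)$ for even $d$ and $0$ for odd $d$.

```python
from fractions import Fraction
from math import factorial


def legendre_coeffs(n):
    """Return c with c[d] the coefficient of t**d in P_n, from the explicit sum."""
    c = [Fraction(0)] * (n + 1)
    for k in range(n // 2 + 1):
        numerator = (-1) ** k * factorial(2 * n - 2 * k)
        denominator = (2 ** n * factorial(k) * factorial(n - k)
                       * factorial(n - 2 * k))
        c[n - 2 * k] = Fraction(numerator, denominator)
    return c


def value_at_one(c):
    """Evaluate the polynomial with coefficients c at t = 1."""
    return sum(c)


def squared_l2_norm(c):
    """Integral of the squared polynomial over [-1, 1], term by term."""
    total = Fraction(0)
    for i, a in enumerate(c):
        for j, b in enumerate(c):
            if (i + j) % 2 == 0:  # odd powers integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total


p2 = legendre_coeffs(2)  # coefficients of (3t^2 - 1)/2
```

Because the arithmetic is exact, the comparison with $2/(2n+1)$ involves no rounding, at the price of a quadratic number of terms per polynomial.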

For $1 \le i \le j \le m$, let $\tau_{i,j}$ denote the entry in the $i$th row and the $j$th column of $C^{-1}$. Then for $j = 1,\dots,m$, $e_j = \sum_{i=1}^{j} \tau_{i,j}x_i$, that is, $e_j(t) = \sum_{i=1}^{j} \tau_{i,j}t^{i-1}$ for $t \in [-1, 1]$. Thus for $1 \le i \le j \le m$, $\tau_{i,j}$ is the coefficient of $t^{i-1}$ in the normalized Legendre polynomial of degree $j - 1$, namely

$$\sqrt{\frac{2j-1}{2}}\;\sum_{k=0}^{\lfloor (j-1)/2\rfloor} (-1)^k\,\frac{(2j-2k-2)!}{2^{j-1}\,k!\,(j-k-1)!\,(j-2k-1)!}\;t^{j-2k-1}.$$

As a result, if either $i$ or $j$ is odd and the other is even, then $\tau_{i,j} = 0$, while if both $i$ and $j$ are odd or both are even, then

$$\tau_{i,j} = (-1)^{(j-i)/2}\,\sqrt{\frac{2j-1}{2}}\;\frac{(j+i-2)!}{2^{j-1}\,[(j-i)/2]!\,[(j+i-2)/2]!\,(i-1)!}.$$

Since $G^{-1} = C^{-1}C^{-*}$, we see that

$$\gamma_{j,j} = \sum_{k=j}^{m} |\tau_{j,k}|^2 \qquad\text{for } j = 1,\dots,m,$$

where the summation is taken over all odd $k \in \{j,\dots,m\}$ if $j$ is odd, while the summation is taken over all even $k \in \{j,\dots,m\}$ if $j$ is even. This completes the proof. □

Example 5.4. To find $\varkappa_{2,2}(x)$ and $\varkappa_{\infty,\infty}(x)$ for the power basis $x$, we need to know $\operatorname{tr} G$, $\operatorname{tr} G^{-1}$, $\max\{g_{1,1},\dots,g_{m,m}\}$ and $\max\{\gamma_{1,1},\dots,\gamma_{m,m}\}$ according to Proposition 5.2. Now by Proposition 5.3, $\max\{g_{1,1},\dots,g_{m,m}\} = 2$ and $\operatorname{tr}(G) = 2\sum_{j=1}^{m} 1/(2j-1)$. On the other hand, it is not so easy to calculate $\operatorname{tr}(G^{-1})$ and $\max\{\gamma_{1,1},\dots,\gamma_{m,m}\}$. We, therefore, restrict to some initial values of $m$ for calculating $\varkappa_{\infty,\infty}(x)$. Let $m = 1,\dots,6$, and let $x := [x_1,\dots,x_m]$ be the power basis. Noting that $\sqrt{\max\{g_{1,1},\dots,g_{m,m}\}} = \sqrt{2}$, we list the values of $\sqrt{2\gamma_{j,j}}$, $j = 1,\dots,m$, in the following table, where the six rows correspond to $m = 1,\dots,6$ and the six columns correspond to $j = 1,\dots,6$.

$$\begin{array}{c|cccccc}
m \backslash j & 1 & 2 & 3 & 4 & 5 & 6\\ \hline
1 & 1 & & & & & \\
2 & 1 & \sqrt{3} & & & & \\
3 & 3/2 & \sqrt{3} & 3\sqrt{5}/2 & & & \\
4 & 3/2 & 5\sqrt{3}/2 & 3\sqrt{5}/2 & 5\sqrt{7}/2 & & \\
5 & 15/8 & 5\sqrt{3}/2 & 21\sqrt{5}/4 & 5\sqrt{7}/2 & 105/8 & \\
6 & 15/8 & 35\sqrt{3}/8 & 21\sqrt{5}/4 & 45\sqrt{7}/4 & 105/8 & 63\sqrt{11}/8
\end{array}$$

It can be easily checked that $\max\{\gamma_{j,j} : j = 1,\dots,m\} = \gamma_{m,m}$ when $m = 1,\dots,5$, while $\max\{\gamma_{j,j} : j = 1,\dots,6\} = \gamma_{4,4}$ when $m = 6$. Hence Proposition 5.2 yields the following values:

$$\begin{array}{c|cccccc}
m & 1 & 2 & 3 & 4 & 5 & 6\\ \hline
\varkappa_{\infty,\infty}(x) & 1 & \sqrt{3} & 3\sqrt{5}/2 & 5\sqrt{7}/2 & 105/8 & 45\sqrt{7}/4
\end{array}$$

Further, by choosing $\alpha_j := 1/\sqrt{g_{j,j}} = \sqrt{2j-1}/\sqrt{2}$ for $j = 1,\dots,m$, we obtain an optimally scaled ordered basis $x_\alpha$ for which $\varkappa_{\infty,\infty}(x_\alpha) = \max\{\sqrt{g_{1,1}\gamma_{1,1}},\dots,\sqrt{g_{m,m}\gamma_{m,m}}\}$ has the following values:

$$\begin{array}{c|cccccc}
m & 1 & 2 & 3 & 4 & 5 & 6\\ \hline
\varkappa_{\infty,\infty}(x_\alpha) & 1 & 1 & 3/2 & 5/2 & 21/4 & 45/4
\end{array}$$
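The tabulated values can be reproduced from the closed form for $\gamma_{j,j}$ in Proposition 5.3. The following Python sketch is one way to do so, not the authors' computation. It works with exact fractions and returns the squares of the two quantities, so that no square roots are needed. All function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def gamma_diag(j, m):
    """gamma_{j,j} for the power basis on [-1, 1], via the sum in Proposition 5.3."""
    total = Fraction(0)
    for k in range(j, m + 1, 2):  # k runs over indices with the parity of j
        numerator = factorial(k + j - 2) ** 2
        denominator = (2 ** (k - 1) * factorial((k - j) // 2)
                       * factorial((k + j - 2) // 2) * factorial(j - 1)) ** 2
        total += Fraction(2 * k - 1, 2) * Fraction(numerator, denominator)
    return total


def kappa_inf_squared(m):
    """Square of the unscaled value: max_j g_jj times max_j gamma_jj."""
    g = [Fraction(2, 2 * j - 1) for j in range(1, m + 1)]
    gam = [gamma_diag(j, m) for j in range(1, m + 1)]
    return max(g) * max(gam)


def kappa_inf_scaled_squared(m):
    """Square of the optimally scaled value: max_j of g_jj * gamma_jj."""
    return max(Fraction(2, 2 * j - 1) * gamma_diag(j, m)
               for j in range(1, m + 1))


unscaled = [kappa_inf_squared(m) for m in range(1, 7)]
scaled = [kappa_inf_scaled_squared(m) for m in range(1, 7)]
```

For instance, the entry for $m = 6$ in the unscaled list is $14175/16 = (45\sqrt{7}/4)^2$, and the corresponding scaled entry is $2025/16 = (45/4)^2$.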

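Let us consider a typical case $m = 3$. The first three normalized Legendre polynomial functions on $[-1, 1]$ are given by $e_1(t) := 1/\sqrt{2}$, $e_2(t) := \sqrt{3}\,t/\sqrt{2}$ and $e_3(t) := \sqrt{10}\,(3t^2 - 1)/4$ for $t \in [-1, 1]$. If $G = C^*C$ is the Cholesky decomposition, then $e := xC^{-1}$, where

$$C^{-1} = \begin{bmatrix} 1/\sqrt{2} & 0 & -\sqrt{10}/4\\ 0 & \sqrt{3}/\sqrt{2} & 0\\ 0 & 0 & 3\sqrt{10}/4 \end{bmatrix}, \quad\text{and so}\quad C = \begin{bmatrix} \sqrt{2} & 0 & \sqrt{2}/3\\ 0 & \sqrt{2}/\sqrt{3} & 0\\ 0 & 0 & 4/(3\sqrt{10}) \end{bmatrix}.$$

Then $\|C\|_1 = \sqrt{2}$ and $\|C^{-1}\|_\infty = 3\sqrt{10}/4$. We note that $\varkappa_{\infty,\infty}(x) = 3\sqrt{5}/2 = \|C\|_1\,\|C^{-1}\|_\infty$.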
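The case $m = 3$ is small enough to verify in floating point. The sketch below checks that the stated factor $C$ satisfies $C^{*}C = G$ for the Gram matrix of the power basis, that the stated $C^{-1}$ is indeed its inverse, and that the product of the two norms equals $3\sqrt{5}/2$. The helper names are hypothetical, and $C$ is real here, so the conjugate-transpose is the ordinary transpose.

```python
from math import sqrt

# Gram matrix of [1, t, t^2] on [-1, 1]: entries 2/(i+j-1) or 0.
G = [[2.0, 0.0, 2 / 3],
     [0.0, 2 / 3, 0.0],
     [2 / 3, 0.0, 2 / 5]]

C = [[sqrt(2), 0.0, sqrt(2) / 3],
     [0.0, sqrt(2) / sqrt(3), 0.0],
     [0.0, 0.0, 4 / (3 * sqrt(10))]]

C_inv = [[1 / sqrt(2), 0.0, -sqrt(10) / 4],
         [0.0, sqrt(3) / sqrt(2), 0.0],
         [0.0, 0.0, 3 * sqrt(10) / 4]]


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def norm_1(A):
    """Maximum absolute column sum."""
    return max(sum(abs(v) for v in col) for col in zip(*A))


def norm_inf(A):
    """Maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in A)


gram_from_c = matmul(transpose(C), C)
identity_check = matmul(C, C_inv)
product = norm_1(C) * norm_inf(C_inv)
```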

6. Transformation of orthonormal bases

We say that an ordered basis $e := [e_1,\dots,e_m]$ of a normed space $M$ is orthonormal if it is orthogonal and $\|e_j\| = 1$ for each $j = 1,\dots,m$. In this case, $\varkappa_{p,r}(e) = 1$ for all $p$ and $r$ satisfying $1 \le p, r \le \infty$. Let $x$ be an ordered basis of $M$. If $p, r \in \{1, \infty\}$ and $(1/p) + (1/q) = 1$, then by Proposition 5.1,

$$\varkappa_{p,r}(x) \le \inf\big\{\|\Theta\|_q\,\|\Theta^{-1}\|_r : x = e\Theta,\ e \text{ orthonormal and } \Theta \text{ nonsingular}\big\},$$

and a similar inequality holds for $\varkappa_{2,2}(x)$ if we replace $\|\Theta\|_q\,\|\Theta^{-1}\|_r$ by $\min\{K_F(\Theta), \sqrt{K_1(\Theta)K_\infty(\Theta)}\}$. If $x$ itself is orthonormal, then letting $\Theta := I_m$, we note that equality holds in these inequalities. This can also happen when $x$ is not orthogonal. For example, if $m = 3$ and $x = [x_1, x_2, x_3]$ is the power basis given by $x_1(t) = 1$, $x_2(t) = t$ and $x_3(t) = t^2$ for $t \in [-1, 1]$, then we have seen toward the end of Example 5.4 that $x = eC$, where $e$ is the orthonormal basis consisting of the first three normalized Legendre polynomial functions on $[-1, 1]$, and $\varkappa_{\infty,\infty}(x) = \|C\|_1\,\|C^{-1}\|_\infty$.

In this section, we consider orthonormal bases of subspaces of three important normed spaces, and show how exact values of condition numbers of $e\Theta$ can be calculated, where $\Theta$ is any nonsingular matrix. First we prove three general results in abstract settings, and then we give concrete examples to illustrate these results.

Proposition 6.1. Let $e := [e_1,\dots,e_m]$, $x := e\Theta$, where $\Theta$ is a nonsingular matrix, and let $e'$, $x'$ be the respective ordered dual bases.

(i) Let $X = C(T)$, the set of all scalar-valued continuous functions on a compact Hausdorff space $T$ (such as a closed and bounded subset of $\mathbb{C}^{n\times 1}$), and $\|x\| := \sup\{|x(t)| : t \in T\}$ for $x \in X$. Consider distinct points $t_1,\dots,t_m$ in $T$. Suppose $e_1,\dots,e_m$ are in $X$ such that $e_j(t_i) = \delta_{i,j}$ for $i, j = 1,\dots,m$, each $e_j$ is nonnegative on $T$ and $\sum_{j=1}^{m} e_j = 1$. Then for $i, j = 1,\dots,m$,

$$\|x_j\| = \text{the } \infty\text{-norm of the } j\text{th column of } \Theta, \quad\text{and}\quad \|x'_i\| = \text{the } 1\text{-norm of the } i\text{th row of } \Theta^{-1}.$$

(ii) Let $X = L^1(T, \sigma)$, the set of all scalar-valued integrable functions on a measure space $(T, \sigma)$, and $\|x\| := \int_T |x(t)|\,d\sigma(t)$ for $x \in X$. Consider disjoint measurable subsets $T_1,\dots,T_m$ of $T$. Suppose $e_1,\dots,e_m$ are in $X$ such that $\int_{T_j} e_j\,d\sigma = 1$, each $e_j$ is nonnegative on $T$ and $e_j = 0$ on $T \setminus T_j$ for $j = 1,\dots,m$. Then for $i, j = 1,\dots,m$,

$$\|x_j\| = \text{the } 1\text{-norm of the } j\text{th column of } \Theta, \quad\text{and}\quad \|x'_i\| = \text{the } \infty\text{-norm of the } i\text{th row of } \Theta^{-1}.$$

(iii) Let $(X, \langle\cdot,\cdot\rangle)$ be an inner product space and $\|x\| := \langle x, x\rangle^{1/2}$ for $x \in X$. Consider a subset $\{e_1,\dots,e_m\}$ of $X$ such that $\langle e_j, e_i\rangle = \delta_{i,j}$ for $i, j = 1,\dots,m$. Then for $i, j = 1,\dots,m$,

$$\|x_j\| = \text{the } 2\text{-norm of the } j\text{th column of } \Theta, \quad\text{and}\quad \|x'_i\| = \text{the } 2\text{-norm of the } i\text{th row of } \Theta^{-1}.$$

Proof. In each of the three parts above, $e$ is an ordered orthonormal basis of $M := \operatorname{span}\{e_1,\dots,e_m\}$. Since $x = e\Theta$, it follows, as in Proposition 5.1, that the ordered dual basis of $M'$ is given by $x' := e'\Theta^{-t}$. Let $\theta_{i,j}$ denote the entry in the $i$th row and the $j$th column of $\Theta$, and $\tau_{i,j}$ denote the entry in the $i$th row and the $j$th column of $\Theta^{-1}$. Then

$$x_j = \sum_{i=1}^{m} \theta_{i,j}e_i \ \text{ for } j = 1,\dots,m \quad\text{and}\quad x'_i = \sum_{j=1}^{m} \tau_{i,j}e'_j \ \text{ for } i = 1,\dots,m.$$

(i) The ordered dual basis $e'$ of $M'$ is given by $e'_i(x) := x(t_i)$ for $i = 1,\dots,m$ and $x \in M$. Fix $j \in \{1,\dots,m\}$. Then

$$|x_j(t)| \le \sum_{i=1}^{m} |\theta_{i,j}|\,e_i(t) \le \max\{|\theta_{1,j}|,\dots,|\theta_{m,j}|\} \quad\text{for all } t \in T,$$

since each $e_i$ is nonnegative on $T$ and $\sum_{i=1}^{m} e_i = 1$. Also, $x_j(t_i) = \theta_{i,j}$ for each $i = 1,\dots,m$. Hence $\|x_j\| = \sup\{|x_j(t)| : t \in T\} = \max\{|\theta_{1,j}|,\dots,|\theta_{m,j}|\}$. Next, fix $i \in \{1,\dots,m\}$. Then

$$|x'_i(x)| \le \sum_{j=1}^{m} |\tau_{i,j}|\,|x(t_j)| \le \|x\| \sum_{j=1}^{m} |\tau_{i,j}| \quad\text{for all } x \in M.$$

Also, if we let $y_0 := \sum_{j=1}^{m} (\operatorname{sgn}\tau_{i,j})e_j$, then $\|y_0\| \le 1$ and $x'_i(y_0) = \sum_{j=1}^{m} |\tau_{i,j}|$. Hence $\|x'_i\| = |\tau_{i,1}| + \cdots + |\tau_{i,m}|$, as desired.

(ii) The ordered dual basis $e'$ of $M'$ is given by $e'_i(x) := \int_{T_i} x\,d\sigma$ for $i = 1,\dots,m$ and $x \in M$. Fix $j \in \{1,\dots,m\}$. Then

$$|x_j(t)| = \Big|\sum_{i=1}^{m} \theta_{i,j}e_i(t)\Big| = \sum_{i=1}^{m} |\theta_{i,j}|\,e_i(t) \quad\text{for } t \in T,$$

since $e_i$ is nonnegative on $T$ and $e_i = 0$ on $T \setminus T_i$ for $i = 1,\dots,m$. Hence

$$\|x_j\| = \int_T |x_j|\,d\sigma = \sum_{i=1}^{m} |\theta_{i,j}| \int_{T_i} e_i\,d\sigma = |\theta_{1,j}| + \cdots + |\theta_{m,j}|.$$

Next, fix $i \in \{1,\dots,m\}$. Let $\tau_i := \max\{|\tau_{i,1}|,\dots,|\tau_{i,m}|\}$. Then

$$|x'_i(x)| \le \sum_{j=1}^{m} |\tau_{i,j}|\,\Big|\int_{T_j} x\,d\sigma\Big| \le \tau_i \sum_{j=1}^{m} \int_{T_j} |x|\,d\sigma \le \tau_i \int_T |x|\,d\sigma \quad\text{for all } x \in M,$$

since $T_1,\dots,T_m$ are disjoint. Also, if we let $y_j := (\operatorname{sgn}\tau_{i,j})e_j$, then $\|y_j\| \le 1$ and $x'_i(y_j) = |\tau_{i,j}|$ for $j = 1,\dots,m$. Hence $\|x'_i\| = \tau_i$, as desired.

(iii) The ordered dual basis $e'$ of $M'$ is given by $e'_i(x) := \langle x, e_i\rangle$ for $i = 1,\dots,m$ and $x \in M$. Fix $j \in \{1,\dots,m\}$. Then

$$\|x_j\|^2 = \langle x_j, x_j\rangle = \Big\langle \sum_{i=1}^{m} \theta_{i,j}e_i,\ \sum_{i=1}^{m} \theta_{i,j}e_i \Big\rangle = \sum_{i=1}^{m} |\theta_{i,j}|^2,$$

since $\langle e_j, e_i\rangle = \delta_{i,j}$ for $i, j = 1,\dots,m$. Next, fix $i \in \{1,\dots,m\}$. Let $y_i := \sum_{j=1}^{m} \overline{\tau_{i,j}}\,e_j$. Then $x'_i(x) = \sum_{j=1}^{m} \tau_{i,j}\langle x, e_j\rangle = \big\langle x, \sum_{j=1}^{m} \overline{\tau_{i,j}}\,e_j \big\rangle = \langle x, y_i\rangle$ for all $x \in M$, and so

$$\|x'_i\|^2 = \|y_i\|^2 = \langle y_i, y_i\rangle = \Big\langle \sum_{j=1}^{m} \overline{\tau_{i,j}}\,e_j,\ \sum_{j=1}^{m} \overline{\tau_{i,j}}\,e_j \Big\rangle = \sum_{j=1}^{m} |\tau_{i,j}|^2,$$

as desired. □
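Proposition 6.1 reduces the norms of the basis elements and of the dual basis elements to column norms of $\Theta$ and row norms of $\Theta^{-1}$. The sketch below turns this into a small calculator. It assumes that the two-index number of a basis is the product of a $p$-th power mean of the basis norms and an $r$-th power mean of the dual norms. It also takes $\Theta^{-1}$ as an input rather than computing it. The names `SETTINGS`, `basis_norms` and `kappa`, and the matrix `theta`, are hypothetical.

```python
from math import inf


def vec_norm(v, p):
    """p-norm of a finite sequence of scalars; p = inf gives the max-norm."""
    if p == inf:
        return max(abs(a) for a in v)
    return sum(abs(a) ** p for a in v) ** (1.0 / p)


def power_mean(values, p):
    """Assumed meaning of mu_p: p-th power mean, maximum for p = inf."""
    if p == inf:
        return max(values)
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


# (norm used on columns of theta, norm used on rows of theta_inv)
# for parts (i), (ii) and (iii) of Proposition 6.1.
SETTINGS = {"i": (inf, 1), "ii": (1, inf), "iii": (2, 2)}


def basis_norms(theta, theta_inv, part):
    """Norms of x_1..x_m and of the dual basis elements for x = e * theta."""
    col_p, row_p = SETTINGS[part]
    norms = [vec_norm(col, col_p) for col in zip(*theta)]
    dual_norms = [vec_norm(row, row_p) for row in theta_inv]
    return norms, dual_norms


def kappa(theta, theta_inv, part, p, r):
    norms, dual_norms = basis_norms(theta, theta_inv, part)
    return power_mean(norms, p) * power_mean(dual_norms, r)


theta = [[1.0, 1.0], [0.0, 1.0]]
theta_inv = [[1.0, -1.0], [0.0, 1.0]]
k_part_i = kappa(theta, theta_inv, "i", inf, 1)
k_part_iii = kappa(theta, theta_inv, "iii", 2, 2)
```

For this $\Theta$, part (i) gives basis norms $1, 1$ and dual norms $2, 1$, so the value with $p = \infty$, $r = 1$ is $3/2$, whereas $K_1(\Theta) = \|\Theta\|_1\,\|\Theta^{-1}\|_1 = 4$. In part (iii) the value with $p = r = 2$ is $K_F(\Theta)/m = 3/2$.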

Examples 6.2. We now give specific examples of ordered orthonormal bases satisfying the conditions mentioned in the three parts of Proposition 6.1.

(i) Let $T = [a, b]$, a closed and bounded interval in $\mathbb{R}$, and consider $m$ points $a = t_1 < \cdots < t_m = b$. Let $e_1,\dots,e_m$ be continuous real-valued functions on $[a, b]$ such that $e_j(t_i) = \delta_{i,j}$ for $i, j = 1,\dots,m$, and each $e_j$ is an affine function on each of the subintervals $[a, t_2],\dots,[t_{m-1}, b]$. They are known as the hat functions with nodes at $t_1,\dots,t_m$, because of the shape of their graphs. It is easy to see that each $e_j$ is nonnegative and $\sum_{j=1}^{m} e_j = 1$. Similarly, let $T$ be a polygonal region in $\mathbb{R}^2$ with a prescribed triangulation, $(s_1, t_1),\dots,(s_m, t_m)$ being the vertices of the triangles. Let $e_1,\dots,e_m$ be continuous real-valued functions on $T$ such that $e_j(s_i, t_i) = \delta_{i,j}$ for $i, j = 1,\dots,m$, and each $e_j$ is an affine function (of two variables) on each of the triangular regions that make up $T$. They are called pyramid functions with nodes at $(s_1, t_1),\dots,(s_m, t_m)$ because of the shape of their graphs. Again, one can see that each $e_j$ is nonnegative and $\sum_{j=1}^{m} e_j = 1$. These functions are extensively used in piecewise linear interpolation and in finite element approximation.

(ii) Let $T$ be a bounded Lebesgue measurable subset of $\mathbb{R}^n$, let $\sigma$ denote the Lebesgue measure on $T$, and consider disjoint measurable subsets $T_1,\dots,T_m$ of $T$ having positive measures. For $j = 1,\dots,m$, let $e_j$ be the characteristic function of $T_j$ divided by $\sigma(T_j)$, that is,

$$e_j(t) := \begin{cases} 1/\sigma(T_j) & \text{if } t \in T_j,\\ 0 & \text{if } t \in T \setminus T_j. \end{cases}$$

Clearly, $\int_{T_j} e_j\,d\sigma = 1$, each $e_j$ is nonnegative on $T$ and is equal to 0 on $T \setminus T_j$ for $j = 1,\dots,m$. Linear combinations of these functions are known as simple measurable functions. They are commonly used for approximating integrable functions.

(iii) For $j = 1,\dots,m$, let $e_j$ denote the Legendre polynomial function on $[-1, 1]$ of degree $j - 1$ satisfying $\int_{-1}^{1} |e_j(t)|^2\,dt = 1$. Then $\{e_1,\dots,e_m\}$ is an orthonormal subset of $L^2([-1, 1])$, and its linear span $M$ is the set of all polynomial functions on $[-1, 1]$ having degree at most equal to $m - 1$. These functions are heavily used in polynomial approximation.

Remark 6.3. In each of the three parts of Proposition 6.1, $\|e_i\| = 1 = \|e'_i\|$ for $i = 1,\dots,m$, so that $e := [e_1,\dots,e_m]$ is an orthonormal basis of $M$. Further, if $x := e\Theta$, where $\Theta$ is nonsingular, then we can find exact values of $\varkappa_{p,r}(x)$ for $1 \le p, r \le \infty$. For example, in part (i),

$$\varkappa_{\infty,1}(x) = \frac{1}{m}\,\max\{|\theta_{i,j}| : 1 \le i, j \le m\}\,\sum_{i,j=1}^{m} |\tau_{i,j}|,$$

which is, in general, much smaller than the upper bound $K_1(\Theta) = \|\Theta\|_1\,\|\Theta^{-1}\|_1$ given by Proposition 5.1. Similarly, in part (ii),

$$\varkappa_{1,\infty}(x) = \frac{1}{m}\,\max\{|\tau_{i,j}| : 1 \le i, j \le m\}\,\sum_{i,j=1}^{m} |\theta_{i,j}|,$$

which is, in general, much smaller than the upper bound $K_\infty(\Theta) = \|\Theta\|_\infty\,\|\Theta^{-1}\|_\infty$ given by Proposition 5.1. Next, in part (iii),

$$\varkappa_{2,2}(x) = \frac{1}{m}\Big(\sum_{i,j=1}^{m} |\theta_{i,j}|^2\Big)^{1/2}\Big(\sum_{i,j=1}^{m} |\tau_{i,j}|^2\Big)^{1/2} = \frac{K_F(\Theta)}{m},$$

which is certainly smaller than the upper bound $K_F(\Theta)$ given by Proposition 5.1, whenever $m \ge 2$. If $G$ is the Gram matrix corresponding to $x$ and $G = C^*C$ is its Cholesky decomposition, then $e := xC^{-1}$ is orthonormal. Letting $\Theta = C$, we recover the result $\varkappa_{2,2}(x) = K_F(C)/m$ obtained in Proposition 5.2. We remark that $K_F(\Theta)/m$ is, in general, smaller than the upper bound $\sqrt{K_1(\Theta)K_\infty(\Theta)}$ given in Proposition 5.1. Finally, in part (iii), we also obtain

$$\varkappa_{\infty,\infty}(x) = \Big(\max_{j=1,\dots,m} \sum_{i=1}^{m} |\theta_{i,j}|^2\Big)^{1/2}\Big(\max_{i=1,\dots,m} \sum_{j=1}^{m} |\tau_{i,j}|^2\Big)^{1/2},$$

which is, in general, much smaller than the upper bound $\|\Theta\|_1\,\|\Theta^{-1}\|_\infty$ given in Proposition 5.1. These and similar exact calculations using the results of Proposition 6.1 confirm the upper bounds for $\varkappa_{p,r}(x)$ given in Proposition 5.1, and also show that these upper bounds can be improved in special cases.

7. Derivation of a condition number

A condition number relative to a mathematical problem is supposed to give a sharp upper bound for the relative error in the solution of that problem when the relative error in the data of that problem is known. Let us consider a given ordered basis of a finite dimensional normed space $M$ as the data, and address the problem of computing the corresponding ordered dual basis of $M'$. We proceed in a classical fashion using the Implicit Function Theorem.

Proposition 7.1. Let $x := [x_1,\dots,x_m] \in M^{1\times m}$ be an ordered basis of $M$ and let $x' := [x'_1,\dots,x'_m] \in (M')^{1\times m}$ be the ordered dual basis of $M'$. For $1 \le p, r \le \infty$, consider the $p$-norm $\|\cdot\|_p$ on $M^{1\times m}$ and the $r$-norm $\|\cdot\|_r$ on $(M')^{1\times m}$. There is a function $\varphi$ from a neighbourhood of $x$ in $M^{1\times m}$ to $(M')^{1\times m}$ such that $\varphi(x) = x'$ and $[\varphi(x + h), x + h] = I_m$ whenever $h \in M^{1\times m}$ and $\|h\|_p$ is small. The relative error $\|h\|_p/\|x\|_p$ in the data $x$ and the relative error $\|\varphi(x + h) - \varphi(x)\|_r/\|\varphi(x)\|_r$ in the computation of $x'$ satisfy

$$\frac{\|\varphi(x+h) - \varphi(x)\|_r}{\|\varphi(x)\|_r} \le \gamma_{p,r}(x)\,\frac{\|h\|_p}{\|x\|_p} + o\big(\|h\|_p\big) \quad\text{as } \|h\|_p \to 0,$$

where

$$\gamma_{p,r}(x) = \frac{\|x\|_p}{\|x'\|_r}\,\sup_{\|h\|_p = 1} \big\|x'[x', h]^t\big\|_r.$$

Further, $\gamma_{p,r}(x) \le \|x\|_p\,\|x'\|_q$, where $q$ satisfies $(1/p) + (1/q) = 1$.

Proof. Define $F : M^{1\times m} \times (M')^{1\times m} \to \mathbb{C}^{m\times m}$ by

$$F(y, f) := [f, y] - I_m,$$

where $[\cdot,\cdot]$ denotes the Gram product. Note that $F(x, x') = I_m - I_m = O_m$. The partial Fréchet derivatives $D_1F(y, f) : M^{1\times m} \to \mathbb{C}^{m\times m}$ and $D_2F(y, f) : (M')^{1\times m} \to \mathbb{C}^{m\times m}$ of $F$ at $(y, f) \in M^{1\times m} \times (M')^{1\times m}$ are given by

$$D_1F(y, f)(h) = [f, h], \quad h \in M^{1\times m}, \qquad\text{and}\qquad D_2F(y, f)(k) = [k, y], \quad k \in (M')^{1\times m}.$$

It is easy to see that for $k \in (M')^{1\times m}$ and $\Theta \in \mathbb{C}^{m\times m}$,

$$D_2F(x, x')(k) = \Theta \quad\text{if and only if}\quad k = x'\Theta^t.$$

Hence the partial Fréchet derivative $D_2F(x, x')$ of $F$ at $(x, x')$ is a bijection from $(M')^{1\times m}$ to $\mathbb{C}^{m\times m}$. By the Implicit Function Theorem, there is an open subset $U$ of $M^{1\times m}$ containing $x$, an open subset $V$ of $(M')^{1\times m}$ containing $x'$, and a function $\varphi : U \to V$ such that for $y \in U$ and $f \in V$,

$$F(y, f) = O_m, \ \text{ that is, } \ [f, y] = I_m \quad\text{if and only if}\quad f = \varphi(y),$$

and the Fréchet derivative $D\varphi(x) : M^{1\times m} \to (M')^{1\times m}$ of $\varphi$ at $x$ is given by

$$D\varphi(x)(h) = -D_2F(x, x')^{-1}\,D_1F(x, x')(h), \quad h \in M^{1\times m}.$$

By the very definition of a Fréchet derivative, the relative errors $\|h\|_p/\|x\|_p$ and $\|\varphi(x + h) - \varphi(x)\|_r/\|\varphi(x)\|_r$ satisfy

$$\frac{\|\varphi(x+h) - \varphi(x)\|_r}{\|\varphi(x)\|_r} \le \gamma_{p,r}(x)\,\frac{\|h\|_p}{\|x\|_p} + o\big(\|h\|_p\big) \quad\text{as } \|h\|_p \to 0,$$

where

$$\gamma_{p,r}(x) := \frac{\|x\|_p}{\|\varphi(x)\|_r}\,\sup_{\|h\|_p = 1} \big\|D\varphi(x)(h)\big\|_r.$$

Now for $h := [h_1,\dots,h_m] \in M^{1\times m}$,

$$D\varphi(x)(h) = -x'[x', h]^t = -\Big[\sum_{i=1}^{m} x'_1(h_i)\,x'_i,\ \dots,\ \sum_{i=1}^{m} x'_m(h_i)\,x'_i\Big].$$

Also, by Hölder's inequality,

$$\Big\|\sum_{i=1}^{m} x'_j(h_i)\,x'_i\Big\| \le \|x'_j\| \sum_{i=1}^{m} \|h_i\|\,\|x'_i\| \le \|x'_j\|\,\|h\|_p\,\|x'\|_q \quad\text{for } j = 1,\dots,m,$$

where $(1/p) + (1/q) = 1$. Since $\varphi(x) = x'$, it follows that

$$\gamma_{p,r}(x) = \frac{\|x\|_p}{\|x'\|_r}\,\sup_{\|h\|_p = 1} \big\|x'[x', h]^t\big\|_r \le \frac{\|x\|_p}{\|x'\|_r}\,\|x'\|_q\,\|x'\|_r = \|x\|_p\,\|x'\|_q.$$ □
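The derivative formula $D\varphi(x)(h) = -x'[x', h]^t$ can be checked in a concrete finite-dimensional instance. The sketch below is an illustration under our own identification, which is not taken from the text: $M = \mathbb{R}^2$, the basis vectors are the columns of a nonsingular matrix `X`, and the dual basis elements are the rows of its inverse. Then $[x', h]$ is the inverse of `X` times `H`, and the formula reads $-X^{-1}HX^{-1}$, the familiar derivative of matrix inversion. The code compares it with a finite difference in exact rational arithmetic. The matrices `X` and `H` are arbitrary choices.

```python
from fractions import Fraction


def inv2(A):
    """Exact inverse of a nonsingular 2x2 matrix with Fraction entries."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Columns of X play the role of the basis x; columns of H are the perturbation h.
X = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(1)]]
H = [[Fraction(1), Fraction(0)], [Fraction(2), Fraction(-1)]]

X_inv = inv2(X)  # rows of X_inv play the role of the dual basis x'

# Predicted derivative of the dual basis in direction H: -X^{-1} H X^{-1}.
predicted = [[-v for v in row] for row in matmul(matmul(X_inv, H), X_inv)]

# Finite-difference quotient with a small exact step t.
t = Fraction(1, 10 ** 6)
X_t = [[X[i][j] + t * H[i][j] for j in range(2)] for i in range(2)]
X_t_inv = inv2(X_t)
finite_diff = [[(X_t_inv[i][j] - X_inv[i][j]) / t for j in range(2)]
               for i in range(2)]

max_gap = max(abs(finite_diff[i][j] - predicted[i][j])
              for i in range(2) for j in range(2))
```

The gap between the two matrices is of the order of the step `t`, in line with the $o(\|h\|_p)$ remainder in the definition of the Fréchet derivative.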

Remarks 7.2. The quantity

$$\gamma_{p,r}(x) = \frac{\|x\|_p}{\|\varphi(x)\|_r}\,\sup_{\|h\|_p = 1} \big\|x'[x', h]^t\big\|_r$$

is indeed a condition number in the classical sense as described in [13, Lecture 12, pp. 89–90]. In fact, it is of the form $\|x\|\,\|J(x)\|/\|f(x)\|$, where $J(x)$ is the Jacobian of $f$ at $x$. The upper bound $\|x\|_p\,\|x'\|_q$ of the condition number $\gamma_{p,r}(x)$ obtained above may be compared with the upper bound $\|x\|_q\,\|x'\|_p$ of the condition number $\kappa_p(x)$ obtained in Proposition 2.3 with $(1/p) + (1/q) = 1$.

We now show that the upper bound $\|x\|_p\,\|x'\|_q$ of $\gamma_{p,r}(x)$ is sharp, where $1 \le p \le \infty$ and $(1/p) + (1/q) = 1$. Let $M := \mathbb{C}^{m\times 1}$ along with the norm given by $\|u\| := \max\{|u(1)|,\dots,|u(m)|\}$. Let $x := [x_1,\dots,x_m]$ denote the standard ordered basis of $\mathbb{C}^{m\times 1}$ and $x' := [x'_1,\dots,x'_m]$ be the corresponding ordered dual basis, that is, $x'_i(u) = u(i)$ for $i = 1,\dots,m$ and $u := [u(1),\dots,u(m)]^t \in \mathbb{C}^{m\times 1}$. Since $\|x'_i\| = 1$ for each $i = 1,\dots,m$, we see that $\|x'\|_r = m^{1/r}$.

Let $h_0 := [1/m^{1/p},\dots,1/m^{1/p}]^t \in M$ and $h := [h_0,\dots,h_0] \in M^{1\times m}$. Since $\|h_0\| = 1/m^{1/p}$, we see that $\|h\|_p = 1$. Also,

$$x'[x', h]^t = \Big[\sum_{i=1}^{m} h_0(1)\,x'_i,\ \dots,\ \sum_{i=1}^{m} h_0(m)\,x'_i\Big] = \frac{1}{m^{1/p}}\Big[\sum_{i=1}^{m} x'_i,\ \dots,\ \sum_{i=1}^{m} x'_i\Big].$$

Since $\big\|\sum_{i=1}^{m} x'_i\big\| = m$, we obtain

$$\big\|x'[x', h]^t\big\|_r = \frac{m\,m^{1/r}}{m^{1/p}} = m^{1/q}\,m^{1/r} = \|x'\|_q\,\|x'\|_r.$$

Thus

$$\gamma_{p,r}(x) = \frac{\|x\|_p}{\|x'\|_r}\,\sup_{\|h\|_p = 1} \big\|x'[x', h]^t\big\|_r = \|x\|_p\,\|x'\|_q,$$

and so the upper bound $\|x\|_p\,\|x'\|_q$ of $\gamma_{p,r}(x)$ is sharp.

Since $\|x_j\|\,\|x'_j\| \ge 1$ for each $j = 1,\dots,m$, the Hölder inequality shows that the upper bound $\|x\|_p\,\|x'\|_q$ of $\gamma_{p,r}(x)$ is at least equal to $m$. In order to bring down this minimum value from $m$ to 1, we have defined $\varkappa_{p,q}(x) := \|x\|_p\,\|x'\|_q/m$, so that $\gamma_{p,r}(x) \le m\,\varkappa_{p,q}(x)$. The cases $p = 1$, $2$ and $\infty$ yield the most commonly used product norms $\|\cdot\|_1$, $\|\cdot\|_2$ and $\|\cdot\|_\infty$ on $M^{1\times m}$. Hence the numbers $\varkappa_{1,\infty}(x)$, $\varkappa_{2,2}(x)$ and $\varkappa_{\infty,1}(x)$ are of greater relevance.

Acknowledgements

This work is supported by Project 4101-1 of the Indo-French Centre for the Promotion of Advanced Research, New Delhi. The authors are grateful to the referees for their valuable suggestions toward improvement of the content as well as the presentation.

References

[1] F.L. Bauer, Optimally scaled matrices, Numer. Math. 5 (1963) 73–87.
[2] E.F. Beckenbach, R. Bellman, Inequalities, third printing, Springer-Verlag, New York, 1971.
[3] C. de Boor, The exact condition of the B-spline basis may be hard to determine, J. Approx. Theory 60 (1990) 344–359.
[4] K.P. Deepesh, S.H. Kulkarni, M.T. Nair, Approximation numbers of operators on normed spaces, Integral Equations and Operator Theory 65 (2009) 529–542.
[5] L. Fejér, Lagrangesche Interpolation und die zugehörigen konjugierten Punkte, Math. Ann. 106 (1932) 1–55.
[6] W. Gautschi, The condition of polynomials in power form, Math. Comp. 33 (1979) 343–352.
[7] B.V. Limaye, Uniformly conditioned bases of spectral subspaces, Numer. Funct. Anal. Optim. 34 (2013) 180–206.
[8] T. Lyche, K. Scherer, On the $L^1$-condition number of the univariate Bernstein basis, Constr. Approx. 18 (2002) 503–528.
[9] G.F. Simmons, Differential Equations with Applications and Historical Notes, second edition, McGraw–Hill, New York, 1991.
[10] I. Singer, Bases in Banach Spaces I, Springer-Verlag, New York, 1970.
[11] A. van der Sluis, Condition numbers and equilibration of matrices, Numer. Math. 14 (1969/1970) 14–23.
[12] N. Tomczak-Jaegermann, Banach–Mazur Distances and Finite-Dimensional Operator Ideals, Longman Sci. & Tech, New York, 1989.
[13] L.N. Trefethen, D. Bau III, Numerical Linear Algebra, SIAM, Philadelphia, 1997.
[14] L.N. Trefethen, J.A.C. Weideman, Two results on polynomial interpolation in equally spaced points, J. Approx. Theory 65 (1991) 247–260.
[15] J.M. Varah, On the condition number of local bases for piecewise cubic polynomials, Math. Comp. 31 (1977) 37–44.