Chapter 7

General rational interpolation

As explained in Chapters 2 and 5, the recurrence relations described in this book allow us to compute Padé approximants along different paths in the Padé table: on a diagonal, a staircase, an antidiagonal, a row, . . . . In this chapter, we develop a general framework generalizing the above in two directions. Firstly, a more general interpolation problem is considered from which, e.g., the Padé approximation problem can be derived as a special case. Secondly, recurrence relations are constructed allowing us to follow an arbitrary path in the "solution table" connected to the new interpolation problem. In Padé approximation, the approximant matches a maximal number of coefficients in a power series in z or in z^{-1}. This corresponds to a number of interpolation conditions at the origin or at infinity. In the Toeplitz case, we had two power series: one at the origin and one at ∞. As shown in (4.38), (4.39), the interpolation conditions are distributed over the two series. The general framework will allow us to match a number of coefficients in several formal series which are given at several interpolation points. Again all the degrees of freedom in the approximant are used to satisfy a maximal number of interpolation conditions that are distributed over the different formal series. This could therefore be called a multipoint Padé approximation problem.
7.1
General framework
The main ideas for this general framework are given by Beckermann and Labahn [12, 13] and Van Barel and Bultheel [228, 229]. The reader can
consult these papers for more detailed information. We sketch the main ideas. As before, F denotes a (finite or infinite, commutative) field, F[z] denotes the set of polynomials, F[[z − ζ]] denotes the set of formal power series around ζ ∈ F, and F⟨z − ζ⟩ denotes the set of formal Laurent series around ζ with finitely many negative powers of (z − ζ). We need these formal series for different points ζ ∈ F, called interpolation points. The set (finite or infinite) of all these interpolation points is denoted by Z. Now suppose Z = {ζ1, . . . , ζn} and that we want to find a rational form of type (β, α) whose series expansion at ζi matches ki coefficients of the given series gi ∈ F[[z − ζi]] for i = 1, . . . , n. If k1 + · · · + kn is equal to the number of degrees of freedom in the approximant, namely α + β + 1, then the approximant is a multipoint Padé approximant for this collection of series {gi : i = 1, . . . , n}. In the special case that ki = 1 for all i, the multipoint Padé approximant is just a rational interpolant. If n = 1 and hence k1 = α + β + 1, then we have an ordinary Padé approximant at ζ1. All the information which is used for the interpolation (the ki terms of gi, i = 1, . . . , n) could of course be collected in one Newton polynomial, which could be the start of a formal Newton series expansion of a function. However, in this formal framework, infinite power series need not converge and thus need not represent functions. So we have to replace the notion of function by a set of formal series at the interpolation points. In principle, these series are totally unrelated. Therefore we define a formal Newton series as a collection of power series {gζ}ζ∈Z with gζ ∈ F[[z − ζ]].

Definition 7.1 (formal Newton series) The set F[[z]]Z of formal Newton series with respect to the set of interpolation points Z ⊂ F is defined as
F[[z]]Z = { g = {gζ}ζ∈Z : gζ ∈ F[[z − ζ]] }.
We call gζ the expansion of g at ζ. Elements in F[[z]]Z can be multiplied as follows. With f, g ∈ F[[z]]Z, the product h = fg ∈ F[[z]]Z is defined as

h = fg = {hζ}ζ∈Z  with  hζ = fζ gζ.

Also division can be defined. When gζ ≠ 0 for all ζ ∈ Z, the quotient h = f/g = {hζ}ζ∈Z is defined with hζ = fζ/gζ. Note that in general hζ = fζ/gζ belongs to F⟨z − ζ⟩ and not necessarily to F[[z − ζ]].
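In finite precision, the componentwise product can be sketched directly; the dict-based representation and the helper names (`series_mul`, `newton_mul`) below are illustrative, not from the text.

```python
# A formal Newton series g = {g_zeta} is modelled as a dict mapping each
# interpolation point zeta to a truncated list of Taylor coefficients of
# the expansion of g at zeta.  The product h = f*g is taken expansion by
# expansion: a truncated Cauchy product at every point.

def series_mul(f, g, nterms):
    """Truncated Cauchy product of two Taylor coefficient lists."""
    h = [0.0] * nterms
    for i in range(nterms):
        for j in range(nterms - i):
            h[i + j] += f[i] * g[j]
    return h

def newton_mul(f, g, nterms):
    """Componentwise product of two formal Newton series."""
    assert f.keys() == g.keys(), "series must share the interpolation points"
    return {zeta: series_mul(f[zeta], g[zeta], nterms) for zeta in f}

# The polynomial z has expansion [0, 1, 0] at 0 and [1, 1, 0] at 1;
# z + 1 has [1, 1, 0] and [2, 1, 0].  Their product z^2 + z expands to
# [0, 1, 1] at 0 and [2, 3, 1] at 1.
fz = {0: [0.0, 1.0, 0.0], 1: [1.0, 1.0, 0.0]}
gz = {0: [1.0, 1.0, 0.0], 1: [2.0, 1.0, 0.0]}
hz = newton_mul(fz, gz, 3)
```

Note that the expansions at the different points stay completely independent of each other, exactly as in the definition above.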
Because polynomials can be written as an element of F[[z − ζ]] for any ζ ∈ Z, we can consider the set of polynomials F[z] as a subset of the set of formal Newton series: F[z] ⊂ F[[z]]Z. Hence, the product of g ∈ F[[z]]Z and p ∈ F[z] is well defined, resulting in an element of F[[z]]Z. Similarly for the quotient. F[[z]]Z^{m×s} denotes the set of m × s matrices whose entries are in F[[z]]Z, and similarly for F[z]^{m×s} etc. Recall the Padé interpolation condition (5.9) for power series in z:

ord (f(z)a(z) − c(z)) ≥ α + β + 1.

Setting

G(z) = [1  −f(z)],  P(z) = [c(z) a(z)]^T,  and  w(z) = z^{α+β+1},  (7.1)

this can be rewritten as

G(z)P(z) = w(z)R(z),  R(z) ∈ F[[z]].  (7.2)
The power series R(z) is called the residual. For power series in z − ζ, the factor w in the right-hand side should be replaced by w(z) = (z − ζ)^{α+β+1} and R should contain only nonnegative powers of (z − ζ). Generalizing this notion further to our situation of formal Newton series where multiple interpolation points are involved, w(z) can be any monic polynomial having interpolation points as zeros. When G(z) is not a row vector, but a general matrix, we introduce not just one polynomial w, but a vector of polynomials ω⃗.
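Numerically, an order condition like ord(f(z)a(z) − c(z)) ≥ α + β + 1 is easy to check with truncated coefficient lists. The sketch below uses the classical [1/1] pair of exp z (c = 1 + z/2, a = 1 − z/2) purely as illustrative data; the helper names are ours.

```python
from fractions import Fraction as Fr

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (low first)."""
    out = [Fr(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def order_at_zero(coeffs):
    """ord at 0: index of the first nonzero coefficient."""
    for k, ck in enumerate(coeffs):
        if ck != 0:
            return k
    return len(coeffs)  # the order is at least this large

# f = exp z (truncated), and the [1/1] Pade pair c = 1 + z/2, a = 1 - z/2.
f = [Fr(1), Fr(1), Fr(1, 2), Fr(1, 6), Fr(1, 24)]
c = [Fr(1), Fr(1, 2)]
a = [Fr(1), Fr(-1, 2)]

fa = poly_mul(f, a)
residual = [fa[k] - (c[k] if k < len(c) else Fr(0)) for k in range(len(fa))]
# alpha + beta + 1 = 3 interpolation conditions at the origin:
order = order_at_zero(residual)
```

The residual f a − c starts with the z^3 term, so all three conditions at the origin are met.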
Definition 7.2 (order vector) Let m be an integer with m ≥ 2. An order vector ω⃗ = (ω1, . . . , ωm) with respect to Z is defined as a vector of monic polynomials having interpolation points as zeros.

Now we can express interpolation conditions for the polynomial matrix P, given a matrix series G, by requiring that GP = diag(ω⃗)R with residual R containing only nonnegative powers of (z − ζ) for all ζ ∈ Z. We say that P has G-order ω⃗.

Definition 7.3 (G-order) Let G ∈ F[[z]]Z^{m×m} and ω⃗ an order vector. The polynomial matrix P ∈ F[z]^{m×s} is said to have G-order ω⃗ iff

GP = diag(ω1, . . . , ωm)R  with R ∈ F[[z]]Z^{m×s}.  (7.3)

The matrix R is called the order residual (for P). The set of all polynomial vectors having G-order ω⃗ is denoted by S(G, ω⃗), i.e.,

S(G, ω⃗) = {Q ∈ F[z]^{m×1} : Q has G-order ω⃗}.
Example 7.1 The G-order of a polynomial vector P will depend on the set of interpolation points Z. For example, suppose

G(z) = [ z − 1      1              ]        P(z) = [ 1 ]
       [ z(z − 2)   (z − 1)(z − 2) ],               [ 1 ].

Then

G(z)P(z) = [ z               ]
           [ (z − 2)(2z − 1) ].

Therefore, if Z = {0, 1} or Z = {0}, then P has G-order ω⃗ = (z^k, 1) for k = 0, 1. If Z = {0, 2}, then P has G-order ω⃗ = (z^k, (z − 2)^l) for k, l = 0, 1. If

G(z) = [ z(z − 1)   −z(z − 1) ]
       [ z − 1      z(z − 1)  ],

then

G(z)P(z) = [ 0              ]
           [ (z − 1)(z + 1) ],

so that for Z = {0, 1}, this P has G-order ω⃗ = (z^p(z − 1)^q, (z − 1)^k) for p, q = 0, 1, 2, . . . and k ∈ {0, 1}. When Z = {0, 1, 2}, we can add the factor (z − 2)^r to the first component of ω⃗ with r any nonnegative integer.
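For polynomial data, having G-order ω⃗ just means that ωi divides the ith entry of GP exactly, so that the residual is again a polynomial (hence a power series at every point of Z). A sketch over the rationals, with data chosen so that GP = [z, (z − 2)(2z − 1)]^T; all names are ours.

```python
from fractions import Fraction as Fr

def poly_mul(a, b):
    out = [Fr(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_add(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else Fr(0)) + (b[i] if i < len(b) else Fr(0))
            for i in range(n)]

def poly_divmod(a, b):
    """Long division a = q*b + r over the rationals (coefficient lists)."""
    r = list(a)
    q = [Fr(0)] * max(1, len(a) - len(b) + 1)
    while len(r) >= len(b) and any(r):
        while r and r[-1] == 0:      # strip trailing zeros of the remainder
            r.pop()
        if len(r) < len(b):
            break
        k = len(r) - len(b)
        coef = r[-1] / b[-1]
        q[k] = coef
        for i, bi in enumerate(b):
            r[i + k] -= coef * bi
    return q, r

def has_G_order(G, P, omega):
    """Check (7.3): each entry of G*P must be exactly divisible by omega_i."""
    for i, row in enumerate(G):
        gp = [Fr(0)]
        for entry, p in zip(row, P):
            gp = poly_add(gp, poly_mul(entry, p))
        _, rem = poly_divmod(gp, omega[i])
        if any(rem):
            return False
    return True

# G*P = [z, (z-2)(2z-1)]^T for this data (coefficient lists, low degree first)
G = [[[Fr(-1), Fr(1)], [Fr(1)]],                       # [z - 1, 1]
     [[Fr(0), Fr(-2), Fr(1)], [Fr(2), Fr(-3), Fr(1)]]]  # [z(z-2), (z-1)(z-2)]
P = [[Fr(1)], [Fr(1)]]
ok = has_G_order(G, P, [[Fr(0), Fr(1)], [Fr(1)]])       # omega = (z, 1)
```

The same routine shows that ω⃗ = (z, z − 2) is a valid G-order while ω⃗ = (z², 1) is not.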
Before we look into the algebraic structure of S(G, ω⃗), we need the following definition; see [17, p. 3, Th. 7.6, 7.4, p. 105].

Definition 7.4 (module, free module, finite-dimensional) Let R be a ring. Then an additive commutative group M with operators R is a (left) R-module if the law of external composition R × M → M : (λ, a) ↦ λa is subject to the following axioms, for all elements λ, μ ∈ R and a, b ∈ M:

λ(a + b) = λa + λb,
(λ + μ)a = λa + μa,
(λμ)a = λ(μa),
1a = a.

A (left) R-module M is free if it has a basis, i.e., if every element of M can be expressed in a unique way as a (left) linear combination of the elements of the basis. The dimension dim M of a free R-module M is the cardinality of any of its bases.
We have a right R-module when instead of (λμ)a = λ(μa) we have (λμ)a = μ(λa). Since we only need the left version here, we drop the adjective "left" everywhere. It is clear that F[z]^m is an m-dimensional F[z]-module. A possible basis is given by the set {ek}_{k=1}^m with ek = [0, . . . , 0, 1, 0, . . . , 0]^T where the one is at position k. A principal ideal domain is defined as follows [66, p. 301, 318].

Definition 7.5 (ideal, principal ideal domain) An ideal A in a ring R is a subgroup of the additive group of R such that RA ⊂ A and AR ⊂ A. In any commutative ring R, an ideal is said to be principal if it can be generated by a single element, i.e., when it can be written in the form aR with a ∈ R. An integral domain in which every ideal is principal is called a principal ideal domain.

The Euclidean domain of polynomials F[z] is a principal ideal domain [66, Th. 1, p. 319]. Every submodule of a free R-module will also be free if R is a principal ideal domain. We have indeed [17, Th. 16.3]

Theorem 7.1 Let R be a principal ideal domain and let M be a free R-module. Then every submodule N of M is free with dim N ≤ dim M.

Because the set of all polynomial m-tuples is an m-dimensional (free) module over the principal ideal domain of the polynomials F[z], we know that each submodule of F[z]^m is also free with dimension smaller than or equal to the dimension m of the complete module. The key idea in this chapter is the following fact.

Theorem 7.2 The set S(G, ω⃗) of all polynomial vectors Q having G-order ω⃗ forms a submodule of the F[z]-module F[z]^m. A basis for the submodule S(G, ω⃗) always consists of exactly m elements, i.e., dim S(G, ω⃗) = m.

Proof. It is easy to see that

Q1 − Q2 ∈ S(G, ω⃗) for all Q1, Q2 ∈ S(G, ω⃗), and pQ ∈ S(G, ω⃗) for all Q ∈ S(G, ω⃗) and p ∈ F[z].

Hence, we have proved that S(G, ω⃗) is a submodule. To prove the second part of the theorem, we refer to the updating steps of Section 7.2, keeping always just m basis vectors. □

A basis for S(G, ω⃗) is called a (G, ω⃗)-basis. The polynomial m × m matrix B whose columns form a (G, ω⃗)-basis is called a (G, ω⃗)-basis matrix.
The following property allows us to characterize a basis matrix in terms of its determinant.

Theorem 7.3 For S = S(G, ω⃗) there exists a unique monic polynomial χS such that

(a) a polynomial matrix B is a (G, ω⃗)-basis matrix iff the columns of B belong to S(G, ω⃗) and det B = cχS with 0 ≠ c ∈ F;

(b) for any polynomial matrix P ∈ F[z]^{m×m} whose columns belong to S(G, ω⃗): det P = c · χS with c ∈ F[z].

Proof. (a) Take a (G, ω⃗)-basis matrix B and define χS as the monic polynomial such that det B = cχS. Any other (G, ω⃗)-basis matrix B′ can be written as B′ = BU with U a unimodular polynomial matrix (i.e., a matrix whose determinant is a nonzero constant). By taking determinant values, part (a) is proved in one direction. On the other hand, if the columns of P are elements of S(G, ω⃗) and det P = cχS with 0 ≠ c ∈ F, we can write the columns of P in terms of a (G, ω⃗)-basis matrix B as P = BU with U a polynomial matrix, since the columns of P are in S(G, ω⃗). Taking the determinant of the left- and right-hand side, it turns out that U is unimodular. Hence, P is also a basis matrix.

(b) The polynomial matrix P can be written as P = BC with C ∈ F[z]^{m×m} and B a (G, ω⃗)-basis matrix. Again by taking determinant values, part (b) is proved. □

From this theorem, we can conclude

Corollary 7.4 A polynomial matrix B is a (G, ω⃗)-basis matrix iff the columns of B belong to S(G, ω⃗), det B ≠ 0 and deg det B is as small as possible.
Definition 7.6 (characteristic polynomial) The polynomial χS as described in the previous theorem is called the characteristic polynomial for S = S(G, ω⃗).

In [13], the characteristic polynomial is called generating polynomial. However, we think that characteristic polynomial is a more appropriate name. As we said before, our formal series need not converge and thus do not represent functions. However, if G ∈ F[[z]]Z^{m×m}, we can define the "function value" G(ζ) for all ζ ∈ Z as the constant term in the formal series Gζ of G which is the formal expansion of G at ζ ∈ Z, i.e., G(ζ) = Gζ(ζ). We say that G is regular iff for each point ζ ∈ Z, the function value G(ζ) is nonsingular. The characteristic polynomial has the following property.
Theorem 7.5 Let χS be the characteristic polynomial of S(G, ω⃗). Let Ω = det diag ω⃗ = ∏_{k=1}^m ωk. Then

(a) If G is regular and B is a (G, ω⃗)-basis matrix with order residual R, then the following statements hold:
(1) R is regular,
(2) χS = Ω,
(3) det G = c det R with 0 ≠ c ∈ F.

(b) If G ∈ F[[z]]Z^{m×m}, then the characteristic polynomial χS is a divisor of Ω.

Proof. (a) (1) Suppose R is not regular, i.e., ∃ζ ∈ Z such that det R(ζ) = 0. Hence, there exists a nonzero vector x ∈ F^{m×1}, having, e.g., the kth component different from zero, such that R Ix Dζ ∈ F[[z]]Z^{m×m}, with Ix the identity matrix with the kth column replaced by x and Dζ the identity matrix with the kth diagonal element replaced by (z − ζ)^{−1}. Therefore, because G is regular, B Ix Dζ belongs to F[z]^{m×m}; its columns are in S(G, ω⃗) while the degree of its determinant is lower. Hence, this contradicts the fact that B is a basis matrix.
(2) Taking the determinant of both sides of GB = diag ω⃗ · R gives us χS = Ω because G and R are regular and χS and Ω are monic.
(3) From (2) it follows that det G = c det R with 0 ≠ c ∈ F.
(b) The proof follows from the updating steps of Section 7.2 below. □

Now that we can describe all the polynomials P that are in S(G, ω⃗), i.e., all polynomials that satisfy a certain number of interpolation conditions, we should select some that satisfy certain degree restrictions. Conditions on the degree can be seen as interpolation conditions at infinity. Recall that in the Padé example, we did not want all the polynomials P = [c a]^T which satisfied [1 −f]P = O(z^{v+1}), but we had Padé approximants only if deg a ≤ α, deg c ≤ β and v = α + β. Even then, the problem had many different solutions. In the minimal Padé approximation problem (see Section 5.6), we introduced a shift parameter σ to impose a certain structure on the degrees of numerator and denominator. We defined the mPA indeed as the "simplest possible" rational form which satisfied the interpolation conditions and for which a degree structure was imposed by the parameter σ, i.e., the degree of the numerator was at most δ − σ and the degree of
the denominator at most δ. In other words, we have the componentwise inequality

deg (a(z), z^σ c(z)) ≤ (δ, δ),

or

H(z)P(z) = [ z^σ c(z) ] = S(z) z^δ,  with S ∈ F[[z^{−1}]].  (7.4)
           [ a(z)     ]

As we know from Section 5.6, either deg a = δ or deg c = δ − σ. Thus it is impossible to decrease δ in the right-hand side since S(∞) ≠ 0. We shall therefore call δ the H-degree of the vector P (see below). Similarly, in the general situation, we have to select within the submodule S(G, ω⃗) those polynomial m-tuples which satisfy additional conditions concerning their degree structure. In other words, we have to impose interpolation conditions at infinity. However, since we now have a rectangular matrix P ∈ F[z]^{m×s} instead of just one column, we shall have to replace the scalar z^δ in the right-hand side by some diagonal matrix, which we denote as
z^{δ⃗} = diag(z^{δ1}, z^{δ2}, . . . , z^{δs}).

For the left-hand side matrix H, we shall allow a more general form H(z) = z^{σ⃗}, where σ⃗ = (σ1, . . . , σm) is a vector of shift parameters, which imposes the relative importance of the degrees of the different rows of P. The role of H will become clear below. To introduce the generalization we are heading for in a more formal way, we need a precise definition of the concepts H-degree and H-reducedness. Recall that F⟨z^{−1}⟩ denotes the set of formal Laurent series over F with finitely many positive powers of z.
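For the 2 × 1 situation of (7.4), the H-degree and the value S(∞) can be read off directly: with H = diag(z^σ, 1), δ = max(σ + deg c, deg a) and S(∞) collects the coefficients at that degree. A small sketch (the helper name is ours; it assumes the trailing coefficients of c and a are nonzero):

```python
def h_degree(c, a, sigma):
    """H-degree of P = [c, a]^T for H = diag(z^sigma, 1).

    c, a are coefficient lists (low degree first) with a nonzero leading
    coefficient; returns (delta, S_inf), where S_inf is the H-highest
    degree coefficient, i.e. the value at infinity of [z^sigma*c, a]/z^delta.
    """
    deg_c = len(c) - 1
    deg_a = len(a) - 1
    delta = max(sigma + deg_c, deg_a)
    s_inf = [c[deg_c] if sigma + deg_c == delta else 0,
             a[deg_a] if deg_a == delta else 0]
    return delta, s_inf

# [1/1] approximant of 1/(1-z): c = 1, a = 1 - z, with shift sigma = 0.
delta, s_inf = h_degree([1], [1, -1], 0)
```

Here δ = 1 with S(∞) = [0, −1] ≠ 0, so δ indeed cannot be decreased.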
Definition 7.7 (H-degree, H-reduced) Let H ∈ F⟨z^{−1}⟩^{m×m} with det H ≠ 0 and P ∈ F⟨z^{−1}⟩^{m×s} without zero columns. Then P is said to have H-degree δ⃗ if

HP = S z^{δ⃗}  (7.5)

with S ∈ F[[z^{−1}]]^{m×s} and S(∞) not containing zero columns. S is called the degree residual and S(∞) the H-highest degree coefficient of P. A zero column of P has corresponding H-degree equal to −∞ and the corresponding S-column is zero. The matrix P is called H-reduced if the matrix S(∞) has full rank. A polynomial matrix is called column reduced when it is H-reduced with H = I, the identity matrix.

The I-degree corresponds to our usual notion of degree and the I-highest degree coefficient is just the usual highest degree coefficient. Note that if
P is H-reduced, then deg det S = 0. When P is column reduced, then deg det P = δ1 + · · · + δm. Thus the degree of the determinant is the sum of the degrees of its columns. If a (G, ω⃗)-basis matrix B is H-reduced, it is called a (G, H, ω⃗)-basis matrix.

It is easy to transform the basis of a submodule into an H-reduced one. If the basis is not already H-reduced, take the basis element Bi with largest H-degree δi whose H-highest degree coefficient is linearly dependent on the H-highest degree coefficients (hdc) of the other basis elements:

Σ_{j=1}^m (H-hdc Bj) dj = 0

with di ≠ 0. Replacing Bi by Σ_{j=1}^m dj z^{δi−δj} Bj gives us another basis for S(G, ω⃗), but the H-degree of Bi is decreased. Repeating this process, we finally get an H-reduced basis. The transformation, in each elementary step, can be obtained by multiplying the basis matrix from the right with an elementary matrix which differs from the unit matrix only by its column i. There can be powers of z in that column, but it has 0 ≠ di ∈ F on its diagonal. Thus this elementary matrix has a constant, nonzero determinant. Thus each elementary transformation is a right multiplication with a unimodular matrix. The product of all these unimodular matrices is of course again unimodular. For further details, we refer to [239]. Once we have a (G, H, ω⃗)-basis, it is easy to parametrize all elements of the submodule with a given upper bound on the H-degree.

Theorem 7.6 Given an H-reduced basis for the submodule S(G, ω⃗) with
basis elements Bi having H-degree δi, all elements of the submodule S(G, ω⃗) having H-degree ≤ δ can be parametrized uniquely as Σ_{i=1}^m ci Bi with ci a polynomial of degree ≤ δ − δi.

Proof. Let us denote the H-highest degree coefficient of a polynomial vector P by 𝓗(P). The H-highest degree coefficient of Σ_{i=1}^m ci Bi is determined by

𝓗(Σ_{i=1}^m ci Bi) = Σ_{γj+δj = max{γi+δi}} (hdc cj) · 𝓗(Bj)

with γi = deg ci. This coefficient cannot become zero because the H-highest degree coefficients of the basis elements are linearly independent. Hence,

H-deg (Σ_{i=1}^m ci Bi) = max{γi + δi}.
Because the H-highest degree coefficients of the basis elements are linearly independent, different choices of the polynomials ci lead to different elements of the submodule. This proves the theorem. □

We introduce the following notation: if δ⃗ ∈ Z^m is a vector of integers, then |δ⃗| = δ1 + · · · + δm. Note that this is not a norm since the δi can be negative. To characterize a (G, H, ω⃗)-basis matrix B, we can use the following property.

Theorem 7.7 Let B ∈ F[z]^{m×m} with H-deg B = δ⃗ and denote ∂(H) = deg det H. Then we have

(a) |δ⃗| ≥ deg det B + ∂(H),

(b) |δ⃗| = deg det B + ∂(H) iff B is H-reduced,

(c) B is a (G, H, ω⃗)-basis matrix iff det B ≠ 0, |δ⃗| = deg χS + ∂(H) and the columns of B belong to S(G, ω⃗).

Proof. These properties follow directly by taking determinants in (7.5). □

Following [13], we introduce the following definition for later use.
Definition 7.8 (normal data, weakly normal data) We say that the data (G, H, ω⃗) are normal if all components of the H-degree of an H-reduced basis matrix are equal. The data are called weakly normal if the components of the H-degree differ at most by one.
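The reduction procedure sketched above (repeatedly cancelling a dependent highest degree coefficient by a unimodular column operation) can be implemented directly for H = I, i.e., for making a polynomial matrix column reduced. The code below is an illustrative sketch over the rationals, for a nonsingular input matrix; all names are ours.

```python
from fractions import Fraction as Fr

def degree(col):
    """Degree of a polynomial column (a list of coefficient lists)."""
    d = -1
    for entry in col:
        for k, ck in enumerate(entry):
            if ck != 0:
                d = max(d, k)
    return d

def hdc(col, d):
    """Highest degree coefficient: the vector of coefficients of z^d."""
    return [entry[d] if 0 <= d < len(entry) else Fr(0) for entry in col]

def null_vector(cols):
    """A nonzero c with sum c_j * cols[j] = 0, or None (Gaussian elim.)."""
    m, n = len(cols[0]), len(cols)
    A = [[cols[j][i] for j in range(n)] for i in range(m)]
    piv = {}                              # pivot row -> pivot column
    for j in range(n):
        row = next((i for i in range(m)
                    if A[i][j] != 0 and i not in piv), None)
        if row is None:                   # column j depends on the pivots
            c = [Fr(0)] * n
            c[j] = Fr(1)
            for i, pj in piv.items():
                c[pj] = -A[i][j]
            return c
        piv[row] = j
        A[row] = [x / A[row][j] for x in A[row]]
        for i in range(m):
            if i != row and A[i][j] != 0:
                f = A[i][j]
                A[i] = [x - f * y for x, y in zip(A[i], A[row])]
    return None

def column_reduce(B):
    """Make a nonsingular B column reduced by unimodular column steps."""
    while True:
        degs = [degree(col) for col in B]
        c = null_vector([hdc(col, d) for col, d in zip(B, degs)])
        if c is None:                     # highest degree coeffs independent
            return B
        # update the column of largest degree among those with c_l != 0
        l = max((j for j in range(len(B)) if c[j] != 0), key=lambda j: degs[j])
        new = [[Fr(0)] * (degs[l] + 1) for _ in B[l]]
        for j, cj in enumerate(c):
            if cj == 0:
                continue
            shift = degs[l] - degs[j]     # multiply column j by z^(dl-dj)
            for r, entry in enumerate(B[j]):
                for k, ck in enumerate(entry):
                    if ck != 0:
                        new[r][shift + k] += cj * ck
        B[l] = new                        # the degree of column l decreases

# Example: columns [1, z] and [z, z^2 + 1]; det = 1, so both final
# column degrees must be 0.
M = [[[Fr(1)], [Fr(0), Fr(1)]],
     [[Fr(0), Fr(1)], [Fr(1), Fr(0), Fr(1)]]]
M = column_reduce(M)
```

After reduction, the sum of the column degrees equals deg det (here 0), in agreement with Theorem 7.7(b).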
7.2
Elementary updating and downdating steps
To motivate the elementary steps described in this section, think of the minimal Padé approximants [ω, σ] to the series f ∈ F[[z]]. We characterized the entries of the Padé table not with the degrees of numerator and denominator, but with the parameters ω and σ. Elementary steps (from one entry to a neighboring one) correspond to changing ω or σ by one. Increasing or decreasing ω by one meant satisfying one more or one less interpolation condition. Decreasing or increasing σ by one meant changing the degree structure of the mPA. In the more general situation, we characterized the interpolation conditions by an order vector ω⃗ and the degree structure was imposed, mainly, by the shift vector σ⃗. Thus elementary steps in this multi-dimensional interpolation table shall correspond to adding or removing one interpolation
condition in one of the points of Z or to increasing or decreasing a component of σ⃗ by 1. Let us assume that we start with an H-reduced basis B = [B1, B2, . . . , Bm] for the submodule S = S(G, ω⃗) of Theorem 7.2 and that we want to make one of the elementary changes mentioned above.

1. If we want to change the order vector ω⃗ → ω⃗′ by changing only one component ωi(z) → ωi′(z) = (z − ζ)ωi(z) with ζ ∈ Z, or ωi(z) → ωi′(z) = ωi(z)/(z − ζ) (if ωi(z) is divisible by (z − ζ)), then we have to transform the H-reduced basis matrix B with G-order ω⃗ into an H-reduced basis matrix B′ with G-order ω⃗′.

2. If we want to change the shift vector σ⃗ → σ⃗′ by changing only one component σj → σj′ = σj ± 1, then we want to transform the H-reduced basis matrix B with G-order ω⃗ into an H′-reduced basis matrix B′ with G-order ω⃗, where H′(z) = z^{σ⃗′−σ⃗} H(z).
Let us consider the procedures to do these elementary operations in more detail.

1. Let us consider the change ω⃗ → ω⃗′ by changing only one component ωi(z) → ωi′(z) = (z − ζ)ωi(z). Any polynomial vector P ∈ F[z]^m with G-order ω⃗ satisfies the set of linear homogeneous equations (7.3), i.e., GP = diag ω⃗ · R with R ∈ F[[z]]Z^{m×1}. If it has to get a G-order ω⃗′, then it should satisfy one additional equation, namely Ri(ζ) = 0 in

Gi(z)P(z) = ωi(z)Ri(z)  with Ri ∈ F[[z]]Z,  (7.6)

where Gi and Ri denote the ith row of G and R respectively. Thus P satisfies this extra equation (7.6) iff 𝓡i(P) = 0, where 𝓡i(P) denotes the ith component of the order residual for P evaluated at ζ: 𝓡i(P) = Ri(ζ). Thus the simplest case is when 𝓡i(Bj) = 0 for all j = 1, . . . , m, since then obviously S′ = S and we can keep the same H-reduced basis. In this case, χS = χS′. Note that this is only possible when G(ζ) is singular. When at least one of the residuals 𝓡i(Bj) ≠ 0, we can use the following algorithm to determine an H-reduced basis for S′:
• take a basis element having nonzero order residual and smallest H-degree; suppose this is the basis element Bl;

• take the following m − 1 polynomial m-tuples

Bj′ = Bj − (𝓡i(Bj)/𝓡i(Bl)) Bl,  j = 1, . . . , m but j ≠ l;

• add the polynomial m-tuple Bl′(z) = (z − ζ)Bl(z) to get a set of m polynomial vectors.
It is clear that all elements Bj′, j = 1, . . . , m belong to S′. Moreover, the matrix B′ is H-reduced while the degree of its determinant

deg det B′(z) = deg χS′(z) = deg [χS(z) · (z − ζ)]

is as small as possible. Hence, according to Corollary 7.4, B′ is an H-reduced basis matrix for S′.

2. Let us consider the change ω⃗ → ω⃗′ by changing only one component ωi(z) → ωi′(z) = ωi(z)/(z − ζ) (if ωi(z) is divisible by (z − ζ)). This is the reverse of the previous step. The new basis matrix B′ should satisfy
G(z)B′(z) = diag(ω1, . . . , ωi−1, ωi/(z − ζ), ωi+1, . . . , ωm) R′(z)

with R′ ∈ F[[z]]Z^{m×m}. Consider the set of linear equations

R(ζ)c = ei.  (7.7)

When R(ζ) is nonsingular, there is a unique solution c = [c1, . . . , cm]^T ≠ 0. Note that R(ζ) is always nonsingular if G is regular. If R(ζ) is singular, consider all nonzero solutions of (7.7). One can show that the following set of polynomial vectors forms an H-reduced basis for the submodule S′:
• try to find (one of) the solution(s) c such that the polynomial vector Σ_{i=1}^m ci Bi(z) is divisible by (z − ζ) (note that, if G is regular, the unique solution gives a linear combination divisible by (z − ζ)); if such a solution does not exist, we take B′ = B and we are done.
• Otherwise, if such a solution does exist, look for the index l with cl ≠ 0 and δl as large as possible; take the polynomial vector

Bl″(z) = Σ_{i=1}^m ci Bi(z).

We take Bl′(z) = Bl″(z)/(z − ζ) as the first member of the new basis; the other m − 1 members of the new basis are the elements of the old basis, excluding the lth one, i.e., Bi′ = Bi for i = 1, . . . , m but i ≠ l.
3. Consider next the change σ⃗ → σ⃗′ where σj′ = σj + 1. That is, H′ is obtained from H by multiplying the jth row of H by z. Note that the interpolation conditions (7.3) do not change, i.e., the submodule S(G, ω⃗) considered is the same; only the degree structure is different. Let us denote the jth component of the H-highest degree coefficient of a polynomial vector P as 𝓗j(P). Note the analogy with the interpolation at ζ in our previous procedure: degree conditions are like interpolation conditions at ∞, and 𝓗j(P) is indeed the jth component of the degree residual evaluated at ∞. We now transform the H-reduced basis {B1, . . . , Bm} for S(G, ω⃗) into an H′-reduced basis as follows:

• take a basis element Bl having the smallest H-degree and such that 𝓗j(Bl) ≠ 0. This is chosen as Bl′;

• add the following m − 1 polynomial m-tuples

Bi′(z) = Bi(z) − (𝓗j(Bi)/𝓗j(Bl)) z^{δi−δl} Bl(z),  i = 1, . . . , m but i ≠ l,

with δi the H-degree of Bi.

It is easy to see that by this procedure B′ is generated from B by multiplying B to the right by a unimodular matrix. Thus, the new matrix B′ is also a basis matrix for S(G, ω⃗). Moreover, it is obviously H′-reduced.

4. Similarly, we can divide the jth row of H by z, resulting in H′. This is equivalent to decrementing one of the shift parameters σj. Suppose
that the H-degree of B is δ⃗ = (δ1, . . . , δm) and let us denote the H-highest degree coefficient of a polynomial vector P excluding the jth component as 𝓗≠j(P). The set of linear equations

Σ_{i=1}^m 𝓗≠j(Bi) ci = 0
has a nontrivial solution (c1, c2, . . . , cm) ≠ 0. Once again, it is easy to show that the following set of polynomial vectors forms an H′-reduced basis for the same submodule S(G, ω⃗):

• look for the index l with cl ≠ 0 and δl as large as possible; take the polynomial vector

Bl′(z) = Σ_{i=1}^m ci z^{δl−δi} Bi(z)

as the first member of the new basis;

• the other m − 1 members of the new basis are the elements of the old basis, excluding the lth one, i.e., Bi′ = Bi for i = 1, . . . , m but i ≠ l.

When we combine the basis elements as columns of a square m × m basis matrix, then the transformation for a change in the shift parameters σ⃗, i.e., in case 3 and case 4, involves a right multiplication by a unimodular matrix. For cases 1 and 2, where an interpolation condition is added or removed, this is also described by a right multiplication with a unimodular matrix, now possibly followed by a multiplication or a division of one of the columns by (z − ζ). Each of these elementary steps is very simple, efficient and straightforward. They do not only allow us to update or downdate an already computed solution, but by concatenating several of these elementary steps, we can go from one "point" (ω⃗, σ⃗) to any other point (ω⃗′, σ⃗′). Hence, we can follow any path in the solution table. For example, considering the scalar linearized Padé approximation problem reformulated in Definition 7.9, one can follow a diagonal, a row, a column, an antidiagonal, and any combination of these in the Padé table. We can even make circular walks. See Section 7.4. For example, to go from one entry of a row in the Padé table to the next entry, one increments the interpolation index v (connected to the interpolation point 0) and decrements σ2 (or equivalently increments σ1). This method is not only an efficient updating or downdating procedure, but it allows us to compute very efficiently the solution at any point (ω⃗, σ⃗),
starting from scratch. This can be done as follows. First, note that the columns of the identity matrix Im×m are H-reduced for any H which is column reduced, thus forming a basis for the module F[z]^m. If at a certain stage H is not column reduced, we can make it column reduced by right multiplication with a unimodular matrix. The columns of this unimodular matrix form an H-reduced basis. Starting from this basis, we can recursively compute a z^{σ⃗}H-reduced basis for any choice of the order vector ω⃗ and for any choice of the shift parameter vector σ⃗, using the three basic updating steps 1, 3 and 4 described above.
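As an illustration, the scalar Padé case of Section 7.4 can be computed from scratch in exactly this way: start from the identity basis and apply updating step 1 at ζ = 0 a total of v + 1 times (growing the order z^k → z^{k+1}), always pivoting on a column with nonzero residual and minimal H-degree for the shift σ⃗ = (−β, −α). The sketch below uses exact rational arithmetic; all helper names are ours, and the minimal-degree column of the final basis is the Padé pair.

```python
from fractions import Fraction as Fr

def hdeg(col, beta, alpha):
    """H-degree of a nonzero column [c, a] for H = diag(z^-beta, z^-alpha)."""
    c, a = col
    cands = []
    dc = max((k for k, x in enumerate(c) if x != 0), default=None)
    da = max((k for k, x in enumerate(a) if x != 0), default=None)
    if dc is not None:
        cands.append(dc - beta)
    if da is not None:
        cands.append(da - alpha)
    return max(cands)  # a basis column is never the zero vector

def pade_basis(f, beta, alpha):
    """Basis of S(G, (z^(v+1), 1)) for G = [[1, -f], [0, 1]], built with
    v + 1 = alpha + beta + 1 elementary order updates at zeta = 0."""
    cols = [[[Fr(1)], [Fr(0)]], [[Fr(0)], [Fr(1)]]]   # identity basis
    for k in range(alpha + beta + 1):
        def res(col):  # coefficient of z^k in c - f*a (the order residual)
            c, a = col
            r = c[k] if k < len(c) else Fr(0)
            for j, aj in enumerate(a):
                if 0 <= k - j < len(f):
                    r -= f[k - j] * aj
            return r
        r = [res(col) for col in cols]
        if all(x == 0 for x in r):
            continue                       # condition satisfied for free
        piv = min((j for j in (0, 1) if r[j] != 0),
                  key=lambda j: hdeg(cols[j], beta, alpha))
        for j in (0, 1):                   # cancel the other residuals
            if j != piv and r[j] != 0:
                q = r[j] / r[piv]
                for t in (0, 1):
                    s = cols[piv][t]
                    p = cols[j][t] + [Fr(0)] * (len(s) - len(cols[j][t]))
                    cols[j][t] = [pi - q * si for pi, si in
                                  zip(p, s + [Fr(0)] * (len(p) - len(s)))]
        for t in (0, 1):                   # B_piv <- z * B_piv
            cols[piv][t] = [Fr(0)] + cols[piv][t]
    return cols

# [1/1] Pade approximant of f = 1/(1-z) = 1 + z + z^2 + ...:
f = [Fr(1)] * 4
cols = pade_basis(f, 1, 1)
c, a = min(cols, key=lambda col: hdeg(col, 1, 1))
```

For this data the minimal column is c = −1, a = z − 1, i.e., the expected approximant 1/(1 − z).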
7.3
A general recurrence step
In the previous section we have described the elementary steps to obtain a solution of our general interpolation problem corresponding to a certain point (ω⃗, σ⃗) in the solution table. In this section, we shall look at this from a higher level and describe how to find a (G^(2), H^(2), ω⃗^(2))-basis matrix B^(2) in two steps. First we find a (G^(1), ω⃗^(1))-basis matrix B^(1) and then we can update B^(1) into B^(2). Such a global updating step is described in the following theorem.

Theorem 7.8 (division into two subproblems) Let ω⃗^(1) and ω⃗^(2) be two order vectors such that ω⃗^(1) ≤ ω⃗^(2), i.e., such that ω⃗^(1) divides ω⃗^(2) (componentwise). Let ω⃗^(1,2) = ω⃗^(2)/ω⃗^(1) (componentwise). Let B^(1) be a (G^(1), ω⃗^(1))-basis matrix with order residual R^(1). Set H^(1,2) = H^(2)B^(1). If B^(1,2) is a (R^(1), H^(1,2), ω⃗^(1,2))-basis matrix, then B^(2) = B^(1)B^(1,2) is a (G^(2), H^(2), ω⃗^(2))-basis matrix. They have the same order residual and the same degree residual, and the H^(2)-degree of B^(2) is equal to the H^(1,2)-degree of B^(1,2). Conversely, if B^(2) is a (G^(2), H^(2), ω⃗^(2))-basis matrix, then B^(1,2) = (B^(1))^{−1}B^(2) is an (R^(1), H^(1,2), ω⃗^(1,2))-basis matrix. They have the same order residual, the same degree residual, and the H^(1,2)-degree of B^(1,2) is equal to the H^(2)-degree of B^(2).

Proof. Because
G^(1)B^(1) = diag(ω⃗^(1))R^(1)  and  R^(1)B^(1,2) = diag(ω⃗^(1,2))R^(1,2),

we get that

G^(1)B^(1)B^(1,2) = diag(ω⃗^(1)ω⃗^(1,2))R^(1,2)  or  G^(1)B^(2) = diag(ω⃗^(2))R^(1,2) = diag(ω⃗^(2))R^(2).

Because H^(1,2) = H^(2)B^(1), we can rewrite

H^(1,2)B^(1,2) = S^(1,2)z^{δ⃗^(1,2)}

as

H^(2)B^(1)B^(1,2) = S^(1,2)z^{δ⃗^(1,2)}  or  H^(2)B^(2) = S^(1,2)z^{δ⃗^(1,2)} = S^(2)z^{δ⃗^(2)}.

Because S^(2)(∞) is nonsingular, B^(2) is H^(2)-reduced. Any (G^(2), ω⃗^(2))-basis matrix B can be written as B = B^(1)B′. Hence, deg det B is minimal iff deg det B′ is minimal. This proves the first part of the theorem. The second part can be proven in a similar way. □

We shall see now how the general theory can be applied to the Padé approximation problem.
7.4
Padé approximation
Let us write the Frobenius-type Definition 5.3 of a Padé approximant in another way. Observe how close this is to the definition of a minimal Padé approximant (Definition 5.5).

Definition 7.9 (reformulation PA-Frobenius) Given a formal power series f(z) = Σ_{k=0}^∞ fk z^k ∈ F[[z]], with fk ∈ F, we define a (right) Padé approximant (PA) [β/α] of type (β, α), β, α ≥ 0, as a rational fraction c(z)a(z)^{−1}, which can be written as a polynomial vector Q(z) = [c(z) a(z)]^T (or as a polynomial couple (c(z), a(z)) when appropriate), satisfying

ord (G(z)Q(z)) ≥ (v + 1, 0) (row-wise) with v = β + α,  (7.8)

deg (H(z)Q(z)) ≤ δ⃗ (column-wise, i.e., H-degree),  (7.9)

δ⃗ as low as possible,  (7.10)

where G(z) and H(z) are defined as

G(z) = [ 1   −f(z) ]      and      H(z) = [ z^{−β}   0      ]
       [ 0    1    ]                      [ 0        z^{−α} ].  (7.11)
Note that in the definition, we can always multiply H by a power of z and still obtain the same approximant. This can be used to make one component of δ⃗ equal to 0. This is what was done in (7.4), since multiplying the above H(z) with z^α gives the H matrix of (7.4). The above definition fits into the general definition when we set Z = {0}, ω⃗ = (z^{v+1}, 1), σ⃗ = (−β, −α), H(z) = z^{σ⃗} and G(z) as above. In this definition, H is used to impose a
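A quick numerical check of conditions (7.8) and (7.9), using the [1/1] approximant c = 1, a = 1 − z of f = 1/(1 − z) as illustrative data:

```python
from fractions import Fraction as Fr

# f = 1/(1-z), truncated far enough for v + 1 = 3 conditions at 0.
f = [Fr(1)] * 4
c, a = [Fr(1)], [Fr(1), Fr(-1)]          # [1/1]: c = 1, a = 1 - z
beta = alpha = 1

# Row 1 of G(z)Q(z) is c - f*a; it must vanish to order v + 1 = 3 at 0
# (here it vanishes identically, since f is exactly rational).
row1 = [(c[k] if k < len(c) else Fr(0))
        - sum(f[k - j] * a[j] for j in range(len(a)) if 0 <= k - j < len(f))
        for k in range(len(f))]
order_ok = all(x == 0 for x in row1[:alpha + beta + 1])

# (7.9): the H-degree of Q for H = diag(z^-beta, z^-alpha) is
# max(deg c - beta, deg a - alpha); here max(0 - 1, 1 - 1) = 0.
delta = max(len(c) - 1 - beta, len(a) - 1 - alpha)
```

The H-degree 0 of this column, together with the other basis column of H-degree 1, matches the count δ1 + δ2 = 1 derived below.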
degree structure on the solution. This is essentially the difference between the upper bounds for the numerator and denominator degrees. The solution is found among the rational fractions satisfying the order condition and having a minimal H-degree, i.e., it has the lowest possible degree, taking the imposed degree structure into account. All solutions Q(z) of (7.8) form a submodule of the F[z]-module F[z]² of polynomial couples. A basis for such a submodule consists of two elements of F[z]². When the basis is H-reduced, it is easy to write down all solutions of (7.8) and (7.9) for any δ. Hence, it is easy to get the solution having minimal H-degree.

We call a (G, H, ω)-basis matrix for the data of Definition 7.9 a (β, α)-basis matrix. Because 0 is the only interpolation point, the order vector ω will be of the form ω(z) = (z^{ν+1}, 1). Our aim is to use Theorem 7.8 in the present situation to construct the recurrence to compute the (β + η, α + ξ)-basis matrix when the (β, α)-basis matrix is known. Before we come to that, we have to analyse the situation a bit deeper to see what kind of updates will be needed. We shall first see why the data are not normal, but weakly normal, and then we shall find out why there are two possible types of weakly normal data. Note that ω is completely defined in terms of (β, α). Thus, when G and H are known, then, instead of calling the data (G, H, ω) normal or weakly normal, we shall also say that the point (β, α) is normal or weakly normal, and assume that the matrices G and H are understood. One can remark that also H is known when (β, α) is known, but we should warn here that this is only true when H has indeed the simple form of (7.11). We shall meet situations further on where H will not be that simple. Because G(0) is nonsingular and 0 is the only interpolation point, G is regular.
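The order condition (7.8) of Definition 7.9 is just a homogeneous linear system for the coefficients of c and a. A minimal numerical sketch (a dense nullspace computation stands in for the fast recurrences of this chapter; the function name and the use of floating point instead of an exact field F are illustrative assumptions):

```python
import numpy as np
from math import factorial

def pade(f, beta, alpha):
    """Solve the order condition of Definition 7.9:
    ord(c(z) - f(z) a(z)) >= nu + 1 with nu = alpha + beta,
    deg c <= beta, deg a <= alpha.  Dense nullspace sketch; the
    recurrences of this chapter achieve the same in O(nu^2)."""
    nu = alpha + beta
    # unknowns: [c_0, ..., c_beta, a_0, ..., a_alpha]
    M = np.zeros((nu + 1, beta + alpha + 2))
    for k in range(nu + 1):           # coefficient of z^k in c - f*a
        if k <= beta:
            M[k, k] = 1.0
        for j in range(alpha + 1):
            if 0 <= k - j < len(f):
                M[k, beta + 1 + j] = -f[k - j]
    q = np.linalg.svd(M)[2][-1]       # one-dimensional nullspace at a normal point
    c, a = q[:beta + 1], q[beta + 1:]
    return c / a[0], a / a[0]         # normalization a(0) = 1

f = [1.0 / factorial(k) for k in range(8)]   # the series of exp(z)
c, a = pade(f, 2, 2)                  # classic [2/2] approximant of exp
```

For exp the computed couple is (1 + z/2 + z²/12, 1 − z/2 + z²/12), and the residual c − f·a indeed vanishes through z⁴.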
Hence, the characteristic polynomial χ_S(z) = z^{α+β+1} and, by Theorem 7.7, a polynomial matrix B with H-degree δ is a (β, α)-basis matrix iff all columns of B are elements of S and |δ| = deg χ_S + ∂(H); thus δ_1 + δ_2 = (α + β + 1) + (−α − β) = 1. Therefore, δ_1 ≠ δ_2, i.e., the point (β, α) cannot be normal. Because there always exists a Frobenius-PA of type (β, α), we can always assume that this PA forms the second basis vector. So, δ_2 ≤ 0. Hence, δ_1 = 1 − δ_2 > 0. Let

    B = [ c_1  c_2 ]
        [ a_1  a_2 ]

and introduce
    ρ = ord r,   γ = deg gcd(c_2, a_2),   β* = deg c_2 − γ ≤ β,   α* = deg a_2 − γ ≤ α.

Taking into account the block structure of the Padé table (with f_0 ≠ 0) described in Theorem 5.2, we know that c_2/a_2 is a PA (Frobenius) of type (β, α) lying in the singular block with top-left corner (β*, α*) and dimension κ = ν − γ + ρ − (α* + β*). Note that min{γ, ρ} = 0. If κ = 0, the singular block is trivial, i.e., (β*, α*) = (β, α) and ρ = 0.

Lemma 7.9 Let G, H be as in (7.11) and ω = (z^{α+β+1}, 1). Then the following are equivalent:

1. The data triple (G, H, ω) is weakly normal.
2. There exists no nontrivial solution in S(G, ω) of negative H-degree.
3. The dimension of the subspace of all solutions in S(G, ω) having nonpositive H-degree is one.
The data triple (G, H, ω) at the point (β, α) cannot be normal. However, we can assume it is weakly normal, i.e., δ_1 = 1 and δ_2 = 0. There are two types to be distinguished, which will correspond to familiar situations when interpreted in terms of their position in the Padé table (see Corollary 7.11). Indeed, when δ_2 = 0, there are two possibilities: either γ = 0 or γ > 0, i.e., the couple (c_2, a_2) is coprime or not.

(a) If δ_2 = 0 and γ = 0, then β* = deg c_2 = β or α* = deg a_2 = α, i.e., the point (β, α) lies at the top or the left border of a singular block.

(b) If δ_2 = 0 and γ > 0, it easily follows that ρ = 0. Hence, we have the equalities β* = deg c_2 − γ = β − γ and γ = α − α*, or the equalities α* = deg a_2 − γ = α − γ and γ = β − β*. Therefore, if γ > 0, the point (β, α) lies under the antidiagonal in a singular block, at the right or at the bottom border of that block.

The two situations are depicted on the Padé tables of Figure 7.1. Referring to the numbering used, we call them weakly normal points of type (a) and type (b) respectively. These two possibilities for weakly normal points, depending on γ, appear also in the following theorem.

Theorem 7.10 Let the data (G, H, ω) of Definition 7.9 correspond to the weakly normal point (β, α). Thus ω(z) = (z^{ν+1}, 1) with ν = α + β.
Figure 7.1: Weakly normal points of type (a) and type (b).
Let α ≥ 0 and β ≥ 0, and hence ν ≥ 0. Let B be a (β, α)-basis matrix and define

    [1  −f] B = [s  r].

Then B has one of the two (mutually exclusive) forms:

(a) B has the unique form

    B = [ v     c ]
        [ z²u′  a ]

with the following normalization: a(0) = 1 and s(0) = 1.

(b) B has the (nonunique) form

    B = [ v  zc ]
        [ u  za ]

with the following normalization: u(0) = 1 and r(0) = 1.
Proof. From Theorem 7.6, we get that all solutions (c, a) having H-degree ≤ 0 and satisfying

    c(z) − f(z)a(z) = r(z) z^{α+β+1}        (7.12)

form a subspace of dimension 1. There are two possibilities:

(a) a(0) ≠ 0. Hence, we can take for the second column of B the unique polynomial couple (c, a) with a(0) = 1. From Theorem 7.6, we get that there exists a basis vector (v″, u″) of H-degree 1, linearly independent of (c, a) and (zc, za). Because a(0) ≠ 0, it is clear that there exist (unique) γ_1, γ_2 ∈ F such that (v″, u″) + γ_1(c, a) + γ_2(zc, za) has the form (v, z²u′) with H-degree 1. This polynomial couple is unique up to a nonzero constant factor. By taking the determinant of the equality GB = diag(ω)R, we obtain that R(0) is nonsingular, where u = z²u′. Because u(0) = 0, s(0) ≠ 0 and (v, u) can be normalized such that s(0) = 1.

(b) a(0) = 0. Because ν = α + β ≥ 0, this implies that also c(0) = 0. Because R(0) is nonsingular and a(0) = 0, r(0) ≠ 0 and u(0) ≠ 0. Hence, we can scale (c, a) such that r(0) = 1 and (v, u) such that u(0) = 1. Note that (v, u) is not unique: a linear combination of (c, a) and (zc, za) can be added. Thus the theorem is proved. □
Because the basis matrix of the previous theorem in case (a) is unique, we will call it the canonical (β, α)-basis matrix for the weakly normal data (G, H, ω) of type (a). An example of weakly normal data of type (b) is given below.

Example 7.2 (weakly normal data of type (b)) Let f(z) = 1 + 0z + 0z² − 1z³ + ··· and consider the case α = 1 and β = 2. It turns out that each solution having H-degree ≤ 0 has a(0) = 0 and has exact H-degree 0.
Hence, the data are weakly normal of type (b). A possible choice for a basis matrix is

    B(z) = [ 1 + z³  * ]
           [ z       * ]

with r(0) = 1. We summarise our results about the two different types of weakly normal data in the following corollary.

Corollary 7.11 The point (β, α) is weakly normal of type (a) iff the point (β, α) in the Padé table is at the top or at the left of a singular block. The point (β, α) is weakly normal of type (b) iff the point (β, α) in the Padé table is below the antidiagonal in a singular block and at the bottom or the right side of a singular block.

Weakly normal data of type (a) are particularly interesting. We look at this in more detail below. From now on we often use the term weakly normal without specifying the type; we then always mean that the data are of type (a).

Theorem 7.12 The point (β, α) is weakly normal of type (a) iff the Hankel
matrix

    H(β, α) = [ f_{β−α+1}  f_{β−α+2}  ⋯  f_β       ]
              [ f_{β−α+2}                f_{β+1}   ]
              [  ⋮                        ⋮        ]
              [ f_β        ⋯            f_{β+α−1} ]

is nonsingular. If moreover α + β ≥ 1, then the first column of the canonical (β, α)-basis matrix is the (unique with s(0) = 1) PA of type (β − 1, α − 1) multiplied by z².
Proof. From Lemma 7.9, we know that the data (G, H, ω) are weakly normal of type (a) iff the dimension of the space of solutions in S(G, ω) having H-degree ≤ 0 is one and each of these solutions (c, a) satisfies a(0) ≠ 0. Let a(z) = a_0 + a_1 z + ⋯ + a_α z^α. The a-part of a solution (c, a) ∈ S(G, ω) having H-degree ≤ 0 satisfies the following equation:

    H(β, α) [a_α  ⋯  a_1]^T = −a_0 [f_{β+1}  ⋯  f_{β+α}]^T.

This proves the first part of the theorem. The proof of the second part is trivial.
□
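Theorem 7.12 gives a directly computable test for type (a) weak normality. A sketch in floating point (the helper name is ours; coefficients f_k with k < 0 are taken as 0, and the singularity test uses a crude determinant threshold):

```python
import numpy as np

def is_weakly_normal_type_a(f, beta, alpha):
    """Test of Theorem 7.12: (beta, alpha) is weakly normal of type (a)
    iff the alpha x alpha Hankel matrix [f_{beta-alpha+i+j-1}] is
    nonsingular.  Assumptions: f_k = 0 for k < 0, floating point."""
    def fk(k):
        return f[k] if 0 <= k < len(f) else 0.0
    if alpha == 0:
        return True          # empty matrix: trivially nonsingular
    H = np.array([[fk(beta - alpha + i + j - 1)
                   for j in range(1, alpha + 1)]
                  for i in range(1, alpha + 1)])
    return abs(np.linalg.det(H)) > 1e-12 * max(1.0, np.linalg.norm(H))

f = [1.0, 0.0, 0.0, -1.0, 0.0, 0.0]   # the series of Example 7.2
```

For this series the point (2, 1) of Example 7.2 is rejected (it is of type (b)), while for instance (3, 1) passes the test.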
Now we are finally ready to construct the recurrence relation to compute the (β + η, α + ξ)-basis matrix B^(2) when the (β, α)-basis matrix B^(1) is known (always assuming that (β, α) as well as (β + η, α + ξ) are weakly normal). We suppose that the number of interpolation conditions does not decrease, i.e., η + ξ ≥ 0. We also assume that ξ ≥ η. The other case η > ξ is similar. For the notation, see Section 7.3. That is, we set B^(2) = B^(1)B^(1,2) and H^(1,2) = H^(2)B^(1), where

    H^(1) = diag(z^{−β}, z^{−α})  and  H^(2) = diag(z^{−β−η}, z^{−α−ξ}).

Furthermore, with ω^(1) = (z^{α+β+1}, 1) and ω^(2) = (z^{α+ξ+β+η+1}, 1),

    G^(x)B^(x) = diag(ω^(x))R^(x),   H^(x)B^(x) = S^(x) z^{δ^(x)}

and

    R^(x) = [ s^(x)  *     ]
            [ *      r^(x) ]

where x = 1 or 2. Also

    B^(x) = [ v^(x)  c^(x) ]
            [ u^(x)  a^(x) ]

where (x) can be one of (1), (2) or (1,2). Let us look at the degrees of the polynomial matrix B^(1,2):

    deg B^(1,2)(z) = deg[(B^(1)(z))^{−1} B^(2)(z)]
                   = deg[z^{−δ^(1)} (S^(1)(z))^{−1} H^(1)(z) (H^(2)(z))^{−1} S^(2)(z) z^{δ^(2)}]
                   ≤ [ ξ    ξ−1 ]
                     [ ξ+1  ξ   ]   (componentwise).
Because B^(1) and B^(2) are the canonical basis matrices, we obtain that a^(1,2)(0) = 1 and ord(u^(1,2)) = 2. The formal Laurent series H^(1,2) is defined as

    H^(1,2)(z) = H^(2)(z)B^(1)(z) = [ z^{−β−η} v^(1)(z)   z^{−β−η} c^(1)(z) ]
                                    [ z^{−α−ξ} u^(1)(z)   z^{−α−ξ} a^(1)(z) ].
Hence,

    deg H^(1,2) ≤ [ −η+1  −η ]
                  [ −ξ+1  −ξ ]   (componentwise).
Therefore, to find the second column of B^(1,2), we have to compute the unique solution with a_0^(1,2) = 1 of the following set of linear homogeneous equations:

    [ s_0^(1)                            r_0^(1)                             ] [ c_0^(1,2)     ]
    [ s_1^(1)       s_0^(1)              r_1^(1)       r_0^(1)               ] [  ⋮            ]
    [  ⋮                   ⋱              ⋮                   ⋱              ] [ c_{ξ−1}^(1,2) ]
    [ s_{ξ+η−1}^(1)  ⋯     s_η^(1)       r_{ξ+η−1}^(1)  ⋯     r_{η−1}^(1)    ] [ a_0^(1,2)     ] = 0.   (7.13)
    [ v_{β+η+1}^(1)  ⋯                   c_{β+η+1}^(1)  ⋯                    ] [  ⋮            ]
    [  ⋮                                  ⋮                                  ] [ a_ξ^(1,2)     ]
    [ v_{β+ξ}^(1)    ⋯                   c_{β+ξ}^(1)    ⋯                    ]

The first block row contains ξ + η rows. These equations express the additional ξ + η interpolation conditions. The second block row has ξ − η rows. In view of the degrees of c^(2) and a^(2), the update would give an approximant of type (β + ξ, α + ξ). The equations of the second block row express that the numerator degree should not be β + ξ but β + η = β + ξ + (η − ξ); hence the ξ − η equations. For the columns, the subdivision is ξ columns for the first block column and ξ + 1 columns for the second one. The coefficient matrix of (7.13) with the first column of the second block column deleted is square and will be denoted by T_{1,2}.
Corollary 7.13 The following are equivalent:

1. The matrix T_{1,2} is nonsingular.
2. The data (G, H^(2), ω^(2)) are weakly normal, which means that (β + η, α + ξ) is a weakly normal point for this problem.
3. The data (R^(1), H^(1,2), ω^(1,2)) are weakly normal.
Taking into account that ord(u^(1,2)) = 2, we can compute the first column of B^(1,2) as the solution of the following set of linear equations if ξ + η > 0:

    T_{1,2} [ v_1^(1,2)  ⋯  v_ξ^(1,2)  u_2^(1,2)  ⋯  u_{ξ+1}^(1,2) ]^T = [ 0  ⋯  0  1  0  ⋯  0 ]^T,   (7.14)

where the right-hand side has its single 1 in row ξ + η. If ξ + η > 0, it follows that v_0^(1,2) = 0. If no new interpolation conditions are added, i.e., when ξ + η = 0, then v_0^(1,2) = 1/s_0^(1) ≠ 0 and the first column of B^(1,2) can be computed as the solution of the following set of linear equations:

    T_{1,2} [ v_1^(1,2)  ⋯  v_ξ^(1,2)  u_2^(1,2)  ⋯  u_{ξ+1}^(1,2) ]^T = −v_0^(1,2) [ v_{β+η+2}^(1)  ⋯  v_{β+ξ+1}^(1) ]^T.
Let us consider two special cases of the previous general recurrence between two arbitrary weakly normal data points in the Padé table. We shall look for the next weakly normal data point on a diagonal and on a row in the Padé table.
Diagonal path
Suppose we know the canonical (β, α)-basis matrix B^(1) for the weakly normal data point (β, α). We want to construct B^(2), the canonical basis matrix for the next weakly normal data point on the same diagonal in the Padé table, i.e., of the form (β + ξ, α + ξ). For general ξ, not necessarily minimal, the coefficients of the polynomial elements of B^(1,2) can be found as the solution of the following set of linear equations (we drop the superscripts in
u^(1,2), v^(1,2), c^(1,2) and a^(1,2), and the superscripts in s^(1) and r^(1)):

    T(ξ) [ v_1      c_0     ]   [ 0  −r_0      ]
         [ v_2      c_1     ]   [ 0  −r_1      ]
         [  ⋮        ⋮      ]   [ ⋮    ⋮       ]
         [ v_ξ      c_{ξ−1} ] = [ 0  −r_{ξ−1}  ]   (7.15)
         [ u_2      a_1     ]   [ 0  −r_ξ      ]
         [  ⋮        ⋮      ]   [ ⋮    ⋮       ]
         [ u_{ξ+1}  a_ξ     ]   [ 1  −r_{2ξ−1} ]
where

    T(ξ) = [ s_0                 0       0                  0       ]
           [ s_1       s_0               r_0                        ]
           [  ⋮                  ⋱        ⋮                 ⋱       ]
           [ s_{ξ−1}   ⋯        s_0     r_{ξ−2}   ⋯        0       ]
           [ s_ξ       ⋯        s_1     r_{ξ−1}   ⋯        r_0     ]
           [  ⋮                          ⋮                          ]
           [ s_{2ξ−1}  ⋯        s_ξ     r_{2ξ−2}  ⋯        r_{ξ−1} ]

and where v_0 = u_0 = u_1 = 0 and a_0 = 1. Because (β + ξ, α + ξ) has to be a weakly normal data point, the matrix T(ξ) should be nonsingular. The smallest nonsingular matrix T(ξ) occurs for ξ = k when r_0 = r_1 = ⋯ = r_{k−2} = 0 and r_{k−1} ≠ 0. This T(k) has a lower triangular form:
So
0
ao - 1. B e c a u s e (f~ + ( , a + ( ) has to be m a t r i x T ( ~ ) should be n o n s i n g u l a r . T h e occurs for ~ - k w h e n ro - r l = " " has a lower t r i a n g u l a r f o r m
...
0
0
...
0
0
"'.
0
0
0
0
0
0
9
81
80
9
.
0
8k-1
8k-2
" 9"
80
0
Sk
8k-1
9
s1
rk-1
9
T(k)
-
9
9
9
9
9
9
82k-2
82k-3
82k-1
82k-2
.
9
...
"'" 9
9 9
9
.
9
9
9. .
8k__ 1
r2k-3
"''
Sk
72k-2
"""
rk-1 ?'k
.
.
0 ?'k-1
and the system becomes
    T(k) [ v_1      c_0     ]   [ 0  0         ]
         [  ⋮        ⋮      ]   [ ⋮  ⋮         ]
         [ v_k      c_{k−1} ]   [ 0  −r_{k−1}  ]
         [ u_2      a_1     ] = [ 0  −r_k      ]
         [  ⋮        ⋮      ]   [ ⋮  ⋮         ]
         [ u_k      a_{k−1} ]   [ 0  −r_{2k−2} ]
         [ u_{k+1}  a_k     ]   [ 1  −r_{2k−1} ]
Because of the lower triangular structure of the coefficient matrix T(k) and the initial zeros in the columns of the right-hand side, we get that

    v(z) = 0,   u(z) = u_{k+1} z^{k+1}   and   c(z) = c_{k−1} z^{k−1}

with u_{k+1} = 1/r_{k−1} ≠ 0 and c_{k−1} = −r_{k−1}/s_0 ≠ 0. The polynomial a(z) of degree ≤ k is the solution of

    ord(c_{k−1} z^{k−1} s(z) + r(z) a(z)) = 2k,

or

    c_{k−1} s(z) + z r(z) q(z^{−1}) = r′(z)

with ord(c_{k−1}s) = 0, ord(zr) = k, ord(r′) ≥ k + 1 and q(z^{−1}) = z^{−k} a(z). Hence, according to Definition 1.12, we can write

    q(z^{−1}) = −(s div zr) c_{k−1}   or   a(z) = z^k q(z^{−1}) = −z^k (s div zr) c_{k−1}.   (7.16)

Note that c_{k−1} was chosen such that a(0) = 1. Finally, the basis matrix B^(1,2) looks as follows:

    B^(1,2)(z) = [ 0                c_{k−1} z^{k−1} ]
                 [ u_{k+1} z^{k+1}  a(z)            ]

with a defined by (7.16). We obtain the same results as in Section 5.2, more precisely, as in Theorem 5.3, taking into account that the (v, u) here are z times the (v, u) polynomial couple of Theorem 5.3. In a similar way, canonical basis matrices can be computed on an antidiagonal, following a sequence of subsequent weakly normal data points.
Row path
To find the elements of the basis matrix B^(1,2) transforming the canonical basis matrix B^(1) for the weakly normal data point (β, α) to the canonical basis matrix B^(2) for the weakly normal data point (β, α + ξ), we have to solve the following set of linear equations as a special case of (7.13) and (7.14) (also here we drop the superscripts, but denote the u, v, c and a with a prime to distinguish them from the diagonal case):
    T(ξ) [ v′_1      c′_0     ]   [ 0  −r_0     ]
         [ v′_2      c′_1     ]   [ ⋮    ⋮      ]
         [  ⋮         ⋮       ]   [ 0  −r_{ξ−2} ]
         [ v′_ξ      c′_{ξ−1} ] = [ 1  −r_{ξ−1} ]   (7.17)
         [ u′_2      a′_1     ]   [ 0    0      ]
         [  ⋮         ⋮       ]   [ ⋮    ⋮      ]
         [ u′_{ξ+1}  a′_ξ     ]   [ 0    0      ]

with a′_0 = 1, v′_0 = u′_0 = u′_1 = 0, and where the coefficient matrix T(ξ) now stands for
    T(ξ) = [ s_0                0         0                  0         ]
           [ s_1      s_0                 r_0                          ]
           [  ⋮                 ⋱          ⋮                 ⋱         ]
           [ s_{ξ−1}  ⋯        s_0       r_{ξ−2}   ⋯        0         ]
           [ 0        ⋯        v_{β+1}   0         ⋯        c_{β+1}   ]
           [  ⋮       ⋰         ⋮         ⋮        ⋰         ⋮        ]
           [ v_{β+1}  ⋯        v_{β−ξ+2} c_β       ⋯        c_{β−ξ+1} ]

If c_β ≠ 0, the matrix

    T(1) = [ s_0      0   ]
           [ v_{β+1}  c_β ]

is nonsingular. If c_β = 0, then v_{β+1} ≠ 0, because the highest degree coefficient matrix

    [ v_{β+1}  c_β ]
    [ u_{α+1}  a_α ]

is nonsingular. Suppose now that c_β = c_{β−1} = ⋯ = c_{β−k+2} = 0 and c_{β−k+1} ≠ 0, and that r_0 = r_1 = ⋯ = r_{l−1} = 0 and r_l ≠ 0. It can easily be
checked that the smallest nonsingular matrix T(ξ) is reached for ξ = k + l. This T(k + l) is built from the blocks

    L(s_0, …, s_{k+l−1}),   L(r_l, …, r_{k+l−2}),
    R(v_{β+1}, …, v_{β−k−l+2})   and   R(c_{β−k+1}, …, c_{β−k−l+2}),

together with the corresponding zero blocks, where L denotes a lower triangular Toeplitz matrix

    L(t_0, t_1, …, t_k) = [ t_0        0   ]
                          [  ⋮    ⋱        ]
                          [ t_k   ⋯   t_0 ]

and R a lower triangular Hankel matrix

    R(t_0, t_1, …, t_k) = [ 0          t_0 ]
                          [       ⋰    ⋮   ]
                          [ t_0   ⋯    t_k ].
The system (7.17) then has the same two right-hand side columns as before: one with a single 1, and one with the entries −r_l, …, −r_{k+l−1} and zeros elsewhere. Here we have split up a′(z) as

    a′(z) = a′_L(z) + z^k a′_H(z) = [1  z  ⋯  z^{k−1}] a′_L + z^k [1  z  ⋯  z^l] a′_H.

The notation a′_{L,0} refers to the fact that we dropped the constant term (which is 1). Similarly, we have set

    u′(z) = z² [1  z  ⋯  z^{k−2}] u′_L + z^{k+1} [1  z  ⋯  z^l] u′_H.
It is clear that v′(z) = 0 and u′(z) = u′_k z^k. Hence, we can write the above system of linear equations as

    ord(s(z)c′(z) + r(z)a′_L(z)) ≥ k + l,              (7.18)
    deg(v(z)c′(z) + c(z)z^k a′_H(z)) ≤ β + l + 1.      (7.19)

Because ord(s) = 0, ord(r) = l and ord(a′_L) = 0, we get that ord(c′) = l from (7.18). Because deg(v) = β + 1, deg(c) = β − k + 1 and deg(a′_H) ≤ l, we obtain from (7.19) that deg(c′) ≤ l. Hence, c′(z) = c′_l z^l with c′_l ≠ 0. From (7.18), we derive that

    a′_L(z) = −(s(z) div r(z)z^{k−l−1}) c′_l z^{k−1}

and from (7.19), we obtain that

    a′_H(z) = −(v(z)z^l div c(z)z^k) c′_l.

The previous recurrence on a row in the Padé table allows to compute the canonical basis matrix at subsequent weakly normal data points (β, α), i.e., points where the Toeplitz matrix

    T_α^β = [f_{β+i−j}],   i, j = 1, 2, …, α,
is nonsingular. If we want to solve a set of linear equations

    T_{α*} x = b,                                      (7.20)

we can apply the recurrence until we finally reach the point (β, α*). Because the basis matrix also contains the parameters for the inversion formula, we can solve (7.20) in an efficient way, i.e., using O((α*)²) operations in the field F. If exact arithmetic is used, there can be no problems of numerical instability, although scaling could be required. On the other hand, when finite precision is used, the computed solution of the linear system could differ a lot from the exact solution, even if T_{α*} is well-conditioned. To increase numerical stability, it is necessary to avoid the weakly normal data points in the Padé table for which the Toeplitz matrix T_α is ill-conditioned. Then, in general, we have to use the recurrence relation jumping from one weakly normal data point to a next one, not necessarily the immediately next one. The criterion to decide whether T_α is ill-conditioned or not is the so-called look-ahead criterion.
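A crude version of such a look-ahead criterion: walk along the row, form the nested Toeplitz matrices with entries f_{β+i−j}, and accept only the points whose condition number stays below a tolerance. (A practical implementation would estimate the condition number from the inversion parameters it already carries; the dense np.linalg.cond below is only a stand-in, and the function name is ours.)

```python
import numpy as np

def accepted_points_on_row(f, beta, alphas, tol=1e8):
    """Keep (beta, alpha) only if the row-path Toeplitz matrix
    T = [f_{beta+i-j}] (i, j = 1..alpha, alpha >= 1) is safely
    conditioned.  Assumption: f_k = 0 for k < 0."""
    def fk(k):
        return f[k] if 0 <= k < len(f) else 0.0
    accepted = []
    for alpha in alphas:
        T = np.array([[fk(beta + i - j) for j in range(1, alpha + 1)]
                      for i in range(1, alpha + 1)])
        with np.errstate(divide="ignore", invalid="ignore"):
            cond = np.linalg.cond(T)   # inf/nan for (numerically) singular T
        if np.isfinite(cond) and cond <= tol:
            accepted.append(alpha)
    return accepted

# on the row beta = 1 of f(z) = 1 + z^3, the points alpha = 1, 2 are skipped
good = accepted_points_on_row([1.0, 0.0, 0.0, 1.0], 1, [1, 2, 3])
```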
7.5 Other applications
In this chapter we have seen how the ideas of recursive computations in Padé approximation can be generalized to general interpolation problems. We have illustrated how this specializes in the familiar case of Padé approximation. In fact, we showed that all the solutions satisfying the interpolation conditions form an F[z]-module and that all these solutions can be described by a basis matrix. By imposing degree conditions on the elements of the basis, one can easily write down all solutions having a given degree structure. By adding new interpolation conditions and/or changing the degree conditions, the basis matrix changes accordingly.

Padé approximation is not the only application of this general idea. There are many other applications that fit in this general framework. Explaining them all in detail would lead us too far. We therefore give some examples that have appeared in the literature and refer to the papers for details.

• In [227] an algorithm is presented which computes Padé-Hermite approximants along a diagonal in the Padé-Hermite table.

• Beckermann develops a similar algorithm for the M-Padé approximation problem in [11].

• In [226] the vector rational interpolation problem was solved in a similar way.

As we have seen in Section 7.3, an updating formula for an arbitrary change of the degree conditions and arbitrary new interpolation conditions is possible. This general updating formula can be used to jump from a weakly normal data point in the Padé table to another such point. As a special case, the recurrence formula was given when the other point is the immediately next weakly normal data point along a diagonal or along a row in the Padé table. When following a diagonal path, the basis matrix at each point of the path can be found by solving a linear system of equations where the coefficient matrix is Hankel. These Hankel matrices are nested, i.e., the previous one involved is a leading principal submatrix of all the following ones. At each point of the path, the basis matrix contains the parameters of an inversion formula for the Hankel matrix involved. This leads to a procedure to solve a Hankel system of dimension n using O(n²) arithmetic operations. Similar results are obtained for nested Toeplitz matrices when a row path is followed.
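The Hankel and Toeplitz cases are two faces of the same structure: reversing the row order of a Hankel matrix gives a Toeplitz matrix, so a Hankel system H x = b is the Toeplitz system (JH)x = Jb with J the flip matrix. A sketch verifying this (a dense solve stands in for the O(n²) structured solvers; the 1/(k+1) moment sequence is just a convenient nonsingular example):

```python
import numpy as np

n = 5
h = np.array([1.0 / (k + 1) for k in range(2 * n - 1)])         # sample "moments"
H = np.array([[h[i + j] for j in range(n)] for i in range(n)])  # Hankel (here: Hilbert)
J = np.eye(n)[::-1, :]                                          # flip matrix
T = J @ H                                                       # T[i, j] = h[n-1-i+j]: Toeplitz
b = H @ np.ones(n)
x = np.linalg.solve(T, J @ b)                                   # Hankel solve via Toeplitz form
```

Any Toeplitz solver (Levinson-type, O(n²)) can thus be reused for Hankel systems, and vice versa.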
As we have stressed before, in this book we did not investigate the numerical stability of the recurrence relations when they are computed in finite precision. However, the main tools are available to enhance the numerical stability of the computations by using look-ahead techniques. Indeed, the general updating formula can be used to jump from one weakly normal point to another one, which need not be the next one on the path that one is following (e.g., a diagonal or a row in the Padé table). For any general interpolation problem, one can decide to make an update only when the weakly normal point is well-conditioned, i.e., a weakly normal point where the basis matrix (or the system to be solved at that point) is well-conditioned. It is not an easy task to decide whether this is the case or not. Condition numbers for matrices involve the norms of the matrix and its inverse. However, as illustrated in the Hankel and Toeplitz case, we do have the parameters available at every stage to compute the inverse, and it might even be possible to estimate the norm of the inverse, or the condition number itself, without ever computing the inverse. This gives a criterion to decide whether or not to update. This is the basic idea of look-ahead. We give some examples in the literature where these ideas are used.

• For Hankel matrices, Freund and Zha [99] develop a look-ahead Trench algorithm, Cabay and Meleshko [49] construct a look-ahead algorithm for Padé approximants, while Bojanczyk and Heinig [18] give a look-ahead Schur algorithm.

• An overview of these three methods generalized to the block Hankel case, as well as several connections, can be found in [231].

• For scalar Toeplitz matrices, several of these look-ahead algorithms have been designed [51, 52, 93, 98, 103, 123, 150, 155].

• Even superfast look-ahead algorithms, i.e., requiring O(n log² n) operations, were developed for Toeplitz matrices [126, 129, 130].

• For the block Toeplitz case we refer the reader to [232, 233].

• In [117] stable look-ahead versions of the Euclidean and Chebyshev algorithms are handled.

• An error analysis is made for Hankel matrices in [49], for generalized Sylvester matrices (connected to multi-dimensional Padé systems) in [44, 45], and for block Toeplitz matrices in [233].
• In [46] the results of numerical experiments are shown using the weakly stable algorithm of [45] for computing Padé-Hermite and simultaneous Padé approximants along a diagonal in the associated Padé tables.

• In [230] a look-ahead method is designed to compute vector Padé-Hermite approximants, going from one perfect point to another on a diagonal in a vector Padé-Hermite table. This generalizes [48], where the scalar Padé-Hermite problem is handled. The connection with matrix Padé approximants is given, and it is also explained how this method can be used to solve block Hankel systems of equations where the blocks can be rectangular. Another generalization is given in [47].
Besides using a look-ahead strategy, only recently a totally different approach was taken to overcome the possible instabilities of the classical algorithms. The Gaussian elimination method uses (partial or complete) pivoting to enhance numerical stability. Unfortunately, pivoting destroys the structure of a Hankel or Toeplitz matrix. However, other classes of structured matrices maintain their structure after pivoting, and thus fast as well as numerically stable methods can be designed. In [136], Heinig proposed for the first time to transform structured matrices from one class into another and to use pivoting strategies to enhance the numerical stability. It is shown how Toeplitz matrices can be transformed into Cauchy-like matrices by the discrete Fourier transform, which is a fast and stable procedure (see [182]). For Toeplitz-plus-Hankel matrices this was done in [137]. Real trigonometric transformations were studied in [111, 140, 141]. A matrix M = [m_{kl}] is called Cauchy-like if, for certain numbers y_k and z_l, the rank of the matrix [(y_k − z_l)m_{kl}] is small compared to the order of M. Pivoting does not destroy the Cauchy-like structure. For Cauchy-like systems several fast algorithms exist [110, 111, 113, 136, 144, 166]. Instead of transforming into a Cauchy-like matrix, [140] explains how to transform a Toeplitz matrix into paired Vandermonde or paired Chebyshev-Vandermonde matrices and how to solve the corresponding systems of linear equations. In [139] a Toeplitz system is also transformed into a paired Vandermonde system, which is solved as a tangential Lagrange interpolation problem. In [174], a Hankel system is transformed into a Loewner system; the parameters of an inversion formula for this Loewner matrix are computed by solving two rational interpolation problems on the unit circle. Recently, Gu [122] has designed a fast algorithm for structured matrices that incorporates an approximate complete pivoting strategy.
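The structure these methods exploit can be quantified by a displacement rank. For a Toeplitz matrix T and the down-shift matrix Z, the displacement ZT − TZ is nonzero only in its first row and last column, so its rank is at most 2; Cauchy-like matrices, as defined above, are the analogous low-displacement class for diagonal displacement operators. A sketch verifying the Toeplitz case on random data (the shift displacement used here is one of several equivalent displacement operators in the literature):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 8
t = rng.standard_normal(2 * n - 1)             # diagonal values t_{-(n-1)}, ..., t_{n-1}
T = np.array([[t[n - 1 + i - j] for j in range(n)] for i in range(n)])
Z = np.diag(np.ones(n - 1), -1)                # down-shift: (Zx)_i = x_{i-1}
D = Z @ T - T @ Z                              # displacement of T
```

In the interior, (ZT)_{ij} = t_{i−1−j} and (TZ)_{ij} = t_{i−j−1} cancel, which is exactly why O(n) parameters suffice to describe T.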
For an overview of different transformation techniques and algorithms we refer the reader to [102, 140, 141, 142] and the references cited therein.
Very recently Chandrasekaran and Sayed [53] derived an algorithm that is provably both fast and backward stable for solving linear systems of equations involving nonsymmetric structured coefficient matrices (e.g., Toeplitz, quasi-Toeplitz, Toeplitz-like). The algorithm is based on a modified fast QR factorization of the coefficient matrix. To develop this algorithm, the theory of low displacement structure is used [166]. In [132], Hansen and Yalamov perturb the original Toeplitz matrix when ill-conditioned leading principal submatrices are encountered. Hence, also the solution of the corresponding linear system is perturbed. Its accuracy is improved by applying a small number of iterative refinement steps. See also [241].