On zeros of polynomial and vector solutions of associated polynomial system from Vieta theorem

On zeros of polynomial and vector solutions of associated polynomial system from Vieta theorem

Applied Numerical Mathematics 44 (2003) 415–423 www.elsevier.com/locate/apnum On zeros of polynomial and vector solutions of associated polynomial sy...

85KB Sizes 4 Downloads 93 Views

Applied Numerical Mathematics 44 (2003) 415–423 www.elsevier.com/locate/apnum

On zeros of polynomial and vector solutions of associated polynomial system from Vieta theorem Xinyuan Wu 1 Department of Mathematics, Nanjing University, 210093 Nanjing, People’s Republic of China

Abstract We show equivalence of a polynomial equation to a system of equations defined by Vieta (Viete) theorem and simple recursive derivation of this system, specify application of Broyden’s method to this system using O(n2 ) flops per iteration, and demonstrate its rapid local convergence in our preliminary experiments.  2002 IMACS. Published by Elsevier Science B.V. All rights reserved. Keywords: Polynomial zeros; Bezout theorem; Iterative approximation of all zeros; Simultaneous approximation of real zeros

1. Introduction Let Pn (z) be a complex coefficient polynomial of degree n zn + a1 zn−1 + a2 zn−2 + · · · + an−1 z + an . There are two different types of zero-finder for pn (z), one of which is referred to as single-zero method and the other is a simultaneous one. For the algorithms that compute the zeros one a time, if a zero has been determined with sufficient accuracy, a linear factor is removed from the polynomial by the Horner algorithm, and the algorithm is employed again to compute a zero of the deflated polynomial. The method of successive removal of linear factors has the following disadvantage. In applications sometimes it is not necessary to determine the zeros to any great accuracy. If the method of successive deflation is used, explicitly it is not possible to take advantage of this special situation for if the early linear factors are very inaccurate the polynomials obtained by successive deflation may be falsified to an extent that make the remaining approximate zeros meaningless. Therefore, in recent years, the numerical methods which simultaneously approximate all the zeros of a polynomial are increasingly popular (see [1–4]). For a good 1 Current address: Department of Mathematics, Universität Tübingen, Auf der Morgenstelle 10, D-72076 Tübingen, Germany. The revised form of this paper was completed at the department of Mathematics, Universität Tübingen, during my academic visit there.

0168-9274/02/$30.00  2002 IMACS. Published by Elsevier Science B.V. All rights reserved. PII: S 0 1 6 8 - 9 2 7 4 ( 0 2 ) 0 0 1 4 5 - 9

416

X. Wu / Applied Numerical Mathematics 44 (2003) 415–423

review, some recent papers are available (see Bini and Pan [5] and Pan [6]). Given the coefficients of an nth-degree polynomial Pn (z) having zeros z1∗ , z2∗ , . . . , zn∗ , we would like to simultaneously approximate all the zeros of the polynomial. For this purpose, it is a customary strategy to employ Vieta theorem which relates the coefficients of an nth-order polynomial and its zeros. Namely, we can form a polynomial system F (z) = 0 whose solutions is a vector, z∗ = (z1∗ , z2∗ , . . . , zn∗ )T and in the next section, we prove the equivalence between finding the zeros of a polynomial and finding the solutions of a polynomial system from Vieta theorem. This result is of no surprise, and many popular effective algorithms of the Weierstrass (Durand–Kerner) type amount to modifications of Newton’s method applied to this system.We propose an alternative approach based on Broyden’s method, which in our specification uses O(n2 ) flops per iteration and has superlinear local convergence. By both features, it is similar to the cited variation of Newton’s method, but we believe that our alternative algorithm or its future modification should be superior in handling polynomials with multiple and clustered zeros. We organize the paper as follows. In the next section we prove equivalence of polynomial equation to Vieta’s system and specify Broyden’s method for the system. In Section 3 we show recursive reduction from a polynomial to the system. In Section 4, we show the results of our preliminary numerical tests. 2. The polynomial system F (z) = 0 associated with polynomial Pn (z) by Vieta theorem Consider the following real coefficient polynomial of degree n Pn (z) = zn + a1 zn−1 + a2 zn−2 + · · · + an−1 z + an ,

(2.1)

which has zeros z1∗ , z2∗ , . . . , zn∗ . It is well known that those zeros satisfy the following simultaneous nonlinear system F (z) = 0,

(2.2)

where z = (z1 , z2 , . . . , zn ) and F (z) = (f1 (z), f2 (z), . . . , fn (z)) , and   f1 (z) = 1j1 n zj1 + a1 = 0,     f2 (z) =   1j1
T

from Vieta’s theorem. Generally speaking  T z∗ = z1∗ , z2∗ , . . . , zn∗ ,

(2.3)

(2.4)

is not a unique solution of the nonlinear system (2.3), except z1∗ = z2∗ = · · · = zn∗ , however each of them gives only the zeros z1∗ , z2∗ , . . . , zn∗ of the nth-degree polynomial (2.1) without regard to the permutation of the subscript. In fact, we have Theorem 2.1. The number of solutions of the polynomial system (2.3) is bounded by n!. Moreover, every solution of the polynomial (2.3) can be expressed in the form   (2.5) z∗ = zσ∗ 1 , zσ∗ 2 , . . . , zσ∗ n , where (σ 1, σ 2, . . . , σ n) is a permutation of the n items {1, 2, . . . , n}.

X. Wu / Applied Numerical Mathematics 44 (2003) 415–423

417

Proof. From the polynomial system (2.3), we have   deg fi (z) = i, i = 1, 2, . . . , n. So, by the classical Bezout Theorem we conclude that the number of solution of the polynomial system (2.3) is bounded by n!. Besides, it should be noticed that in (2.3) all fi (z), i = 1, 2, . . . , n, are elementary symmetric polynomials. It follows consequently that fi (z∗ ) = 0,

i = 1, 2, . . . , n

and F (z∗ ) = 0. Therefore, the number of solutions of the polynomial system (2.3) is bounded by n!, and every solution of the polynomial (2.3) can be expressed in the form (2.5). ✷ Corollary 2.1. Suppose that zi∗ = zj∗ ,

i = j, i, j = 1, 2, . . . , n,

then there are n! solutions to the polynomial system of Eq. (2.3) and the polynomial system (2.3) has a unique solution  T if and only if z1∗ = z2∗ = · · · = zn∗ . z∗ = z1∗ , z2∗ , . . . , zn∗ Theorem 2.1 means that in general, the number of isolated solutions of (2.3) is at most n!, but each of them gives all zeros z1∗ , z2∗ , . . . , zn∗ of the nth degree polynomial (2.1) without regard to the permutation of the zi∗ , i = 1, 2, . . . , n. In this sense we would like to say that finding the zeros of a polynomial and finding the solutions of a polynomial system from Vieta theorem is equivalent. Thus any available numerical methods such as Newton iterative method and quasi-Newton methods, can be used to deal with the polynomial system (2.3). Because most practical numerical methods are directly or indirectly related to Jacobian matrix of F (z), in the next we consider this issue. Explicitly, for the particular nonlinear system (2.3) we have Jacobian matrix of F (z),     1 1 ... 1 ∇f1 (z)T  ∇f2 (z)T   f21 (z) f22 (z) . . . f2n (z)      , DF (z) =  = . .. .. ..  .. .    . . . . .  ∇fn (z)T fn1 (z) fn2 (z) . . . fnn (z) where fkj (z) =

∂fk = ∂zj



zj1 zj2 . . . zjk−1 ,

k = 2, . . . , n, j = 1, 2, . . . , n.

1j1
For the Jacobian matrix DF (z) we have Theorem 2.2. Suppose that zi∗ = zj∗ , i = j, i, j = 1, 2, . . . , n, then   det DF (z∗ ) = 0.

(2.6)

(2.7)

418

X. Wu / Applied Numerical Mathematics 44 (2003) 415–423

Proof. By applying a sequence of elementary operations,  1 1 ... ...  ∗ ∗  z1 − z2 . . . ...   .. ..  .   . ∗ k−1 ∗ det DF (z ) =  ∗ i=1 (zi − zk )        = z1∗ − z2∗ · · ·

k−1 



 zi∗ − zk∗ · · ·

i=1

n−1 



we have ... ... .. . ... .. .

        k−1 ∗ ∗  i=1 (zi − zn )   ..  .  n−1 ∗ ∗  (z − z ) 1 ∗ z1 − zn∗ .. .

i=1

i

n

 zi∗ − zn∗ .

i=1

Thus it follows immediately that det(DF (z∗ )) = 0 from the assumption of zi∗ = zj∗ , i = j, i, j = 1, 2, . . . , n. ✷ Corollary 2.2. If polynomial (2.1) has multiple zeros, then det(DF (z∗)) = 0. From Theorem 2.2, since DF (z∗ )) is nonsingular provided zi∗ = zj∗ , i = j, i, j = 1, 2, . . . , n, it follows that the mapping F (z) is a local homeomorphism at each solution z∗ of the polynomial system (2.3), namely, the solution z∗ of (2.3) is locally unique. Now let Ω is a closed subregion containing the solutions z∗ to the polynomial system (2.3), then it is easy to see that DF (z) is Lipschitz continuous at z∗ from (2.6) and (2.7). In fact, D 2 F (z) is continuous and DF (z) − DF (z∗ )  sup0t 1 D 2 F (z∗ + t (z − z∗ )) z − z∗  L z − z∗ , for all z ∈ Ω. From Corollary 2.2 we can see that Newton iteration (unless modified) is not suitable for the polynomial system (2.3), if Pn (z) has multiple zeros and/or clusters of zeros. The problem of multiple zeros is an ill-conditioned one which needs to be treated with caution (sometimes a transformation can be used, Wu [7]). However, super-linear convergence of Broyden’s method is achieved without requiring that the sequence {Hn } converge to the inverse of Jacobian matrix DF (z∗ ) of F (z) at the zero z∗ (see Dennis et al. [8]). This is one of the advantages for Broyden’s method. The other advantage is that Broyden’s method for the polynomial system (2.3) associated with polynomial Pn (z) defined in (2.1) is free from calculating Jacobian matrix. In the simplest form Broyden’s method is of the form zk+1 = zk − Hk F (zk ),

(2.8)

where, given an approximation H0 to (DF (z0))−1 , the matrices {Hk } are generated by Hk+1 = Hk − (Hk yk − pk )pkT Hk /pkT Hk yk ,

(2.9)

and yk = F (zk+1 ) − F (zk ),

pk = zk+1 − zk .

(2.10)

The motivation of employing Broyden’s method for the nonlinear system (2.3) is that the matrices generated by (2.9) are good approximations to the inverse of Jacobian matrix DF (z) and thus (2.8) resemble Newton’s method, but with the difference that (2.9) only requires one function evaluation and O(n2 ) arithmetic operations while the Jacobian matrix requires the evaluation of O(n2) partial derivatives. Moreover, (2.8) can be carried out in O(n2) operation. As for convergence of Broyden’s method for system (2.2) we have

X. Wu / Applied Numerical Mathematics 44 (2003) 415–423

419

Theorem 2.3. Assume that F (z∗ ) = 0 and zi∗ = zj∗ , i = j, i, j = 1, 2, . . . , m. Consider Broyden’s method as defined by (2.8), (2.9) and (2.10). Then Broyden’s method is locally and Q-super-linearly convergent at z∗ . Proof. From the assumption of zi∗ = zj∗ , i = j, i, j = 1, 2, . . . , m, it follows immediately that DF (z∗) is nonsingular by Theorem 2.2. Besides, DF (z) is Lipschitz continuous at z∗ . In fact, D 2 F (z) is continuous and DF (z) − DF (z∗ )  sup0t 1 D 2 F (z∗ + t (z − z∗ )) z − z∗  L z − z∗ for all x ∈ Ω, where Ω is a closed subregion containing the solutions z∗ of the polynomial system (2.3). So F (z) defined as (2.2) satisfies the conditions of convergence for Broyden’s method (see Broyden et al. [9]). Therefore, Broyden’s method for system (2.2) is locally and Q-super-linear convergent at z∗ . ✷ To be more precise, the conclusion of this theorem means that there is an ε > 0 and a δ > 0 such that if z0 − z∗ < ε and H0 − (DF (z∗ ))−1 < δ, then Broyden’s method is well defined, and if {zn } is the sequence produced by (2.8), (2.9) and (2.10), then either zn = z∗ for some n at which place the iteration stops, or {zn } converges Q-super-linearly to z∗ , the unique solution of system (2.3). Here we do not discuss the global convergence of Broyden’s method for (2.3).

3. Recursion algorithm for computing F (z) associated with Pn (z) Although from Vieta’s theorem we may express F (z) as (2.3), in practice it is very difficult to calculate F (z) directly by (2.3) as n is large. This motivates a recursion algorithm, by which F (z) can be generated automatically by computer. We formulate the recursion algorithm as follows. Theorem 3.1. Without loss of generality, let Pn (z) = zn + a1 zn−1 + · · · + an−1 z + an = (z − z1 )(z − z2 ) · · · (z − zn ). If q1 =

n 

zi ,

p1 = q1 − z1 ,

i=1

pj = pj −1 − zj , t1j = zj pj , n−i+2  ti−1,k , tij = zj qi =

k=j +1 n−i+1 

ti−1,j ,

j = 2, . . . , n − 1, j = 1, 2, . . . , n − 1, i = 2, . . . , n, j = 1, . . . , n − i + 1, i = 2, . . . , n,

j =1

then fi (z) = qi − (−1)i ai , namely

i = 1, 2, . . . , n,

T  F (z) = q1 + a1 , q2 − a2 , . . . , qn − (−1)n an .

420

X. Wu / Applied Numerical Mathematics 44 (2003) 415–423

Proof. By the definition of q1 , p1 and pj , j = 2, . . . , n − 1, q1 = z1 + z2 + · · · + zn ,

explicitly we have f1 (z) = q1 + a1 .

By the hypothesis of the theorem, p1 = q1 − z1 ,

pj = pj −1 − zj ,

t1j = zj pj ,

j = 1, 2, . . . , n − 1,

and tij = zj

n−i+2 

ti−1,k ,

k=j +1

we have t11 = z1 (z2 + · · · + zn ), t12 = z2 (z3 + · · · + zn ), ··· t1,n−1 = zn−1 zn .  So q2 = n−1 j =1 t1j and f2 (z) = q2 − a2 , t21 = z1 (z2 z3 + z2 z4 + · · · + zn−1 zn ) = z1 (t12 + t13 + · · · + t1,n−1 ), t22 = z2 (z3 z4 + z3 z5 + · · · + zn−1 zn ) = z2 (t13 + t14 + · · · + t1,n−1 ), ··· t2,n−2 = zn−2 zn−1 zm .  So q3 = n−2 j =1 t2j and f3 (z) = q3 + a3 . Generally, by induction principle, we have tij =

n−i+2 

ti−1,k ,

i = 2, . . . , n, j = 1, . . . , n − i + 1

ti−1,j ,

i = 2, . . . , n

k=j +1

and qi =

n−i+1  j =1

and then fi (z) = qi − (−1)i ai , i = 2, . . . , n, T  T  F (z) = f1 (z), . . . , fn (z) = q1 + a1 , . . . , qn − (−1)n an .



4. Numerical experiments Here for polynomial system (2.3) we use Broyden’s method. With Broyden’s method, some polynomials have been tested as in the following examples:

X. Wu / Applied Numerical Mathematics 44 (2003) 415–423

421

Table 1 P4 (z) = (z − 1)4 with zeros z∗ = (1, 1, 1, 1)T z0 = (0.999, 0.999, 0.999, 0.999)T

z6 = (1.000000000, 1.000000000, 1.000000000, 1.000000000)T

Table 2 P10 (z) = (z − 1)(z − 2)(z − 3)(z − 4)(z − 5)(z − 6)(z − 7)(z − 8)(z − 9)(z − 10) with zeros z∗ = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)T z0 = (0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5)T z40 = (

1.000000000, 6.000000000,

Table 3 with zeros

1.999999999, 7.000000001,

3.000000000, 7.999999999,

4.000000001, 9.000000000,

4.999999999, 10.00000000

)T

 P12 (z) = 12 i=1 (z − 0.1 × i) z∗ = (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2)T

z0 = (0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95, 1.05, 1.15)T z57 = (

Table 4 with zeros

0.09999999746, 0.4999999976, 0.9000000464,

0.2000000083, 0.5999999760, 0.9999999513,

0.2999999884, 0.7000000227, 1.100000006,

0.4000000163, 0.7999999771, 1.200000013

)T

 P16 (z) = 16 i=1 (z − 0.5 × i) z∗ = (0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0)T

z0 = (0.6, 1.1, 1.6, 2.1, 2.6, 3.1, 3.6, 4.1, 4.6, 5.1, 5.6, 6.1, 6.6, 7.1, 7.6, 8.1)T z39 = ( 0.500000000, 2.500000001, 4.500000430, 6.500007964,

0.999999984, 3.000000063, 4.999997518, 6.999995411,

1.500000020, 3.499999893, 5.500005921, 7.500001543,

2.000000010, 4.000000059, 5.999991380, 7.999999794

)T

All computations were done in double precision. It should be pointed out that for Example 1, Matlab 5.3 gives the following incorrect results (1.0002, 1 + 0.0002i, 1 − 0.0002i, 0.9998)T and for Example 5, Matlab 5.3 again gives the following incorrect results (0, 2.2002, −1.7, 2.0971, −1.6, 2.0125, 1.8629 + 0.0443i, 1.8629 − 0.0443i, −1.5, 1.6541 + 0.0789i, 1.6541 − 0.0789i, −1.4, 1.4406 + 0.0691i, 1.4406 − 0.0691i, −1.3, 1.2399 + 0.0142i, 1.2399 − 0.0142i, −1.2, −1.1, 1.0939, 1.0015, −1, 0.8998, −0.9, 0.8, −0.8, 0.7, −0.7, 0.6, −0.6, 0.5, −0.5, 0.4, −0.4, 0.3, −0.3, 0.2, −0.2, 0.1, −0.1)T .

422

X. Wu / Applied Numerical Mathematics 44 (2003) 415–423 Table 5

 P40 (z) = 40 i=1 (z − (2.2 − 0.1 × (i − 1))) with zeros zi∗ = 2.2 − 0.1 × (i − 1), i = 1, 2, . . . , 40 z0i = 2.18 − 0.1 × (i − 1), i = 1, 2, . . . , 40, z0 = (z01 , z02 , . . . , z040 )T

z57 = (

Table 6

2.199999028, 1.800007207, 1.400004126, 0.9999985115, 0.6000007295, 0.2000007165, −0.2000010330, −0.5999991901, −1.000001343, −1.400004813,

2.100005103, 1.699992837, 1.299998017, 0.8999989753, 0.5000010931, 0.1000001960, −0.3000009014, −0.6999992509, −1.099995358, −1.499997929,

1.999995305, 1.600012362, 1.199998868, 0.7999995585, 0.4000012169, −0.0000003638, −0.4000004226, −0.7999999006, −1.200007205, −1.600000533,

1.899994504, 1.499994146, 1.099998436, 0.7000001858, 0.3000010819, −0.1000008205, −0.4999997491, −0.9000005653, −1.299992983, −1.699999842

)T

 P28 (z) = 28 i=1 (z − (2.7 − 0.2 × (i − 1))) with zeros zi∗ = 2.7 − 0.2 × (i − 1), i = 1, 2, . . . , 28 z0i = 2.68 − 0.2 × (i − 1), i = 1, 2, . . . , 28, z0 = (z01 , z02 , . . . , z028 )T

z41 = (

2.700000000, 1.900000008, 1.099999995, 0.3000000031, −0.5000000004, −1.299999997, −2.099999983,

2.500000000, 1.700000003, 0.8999999965, 0.1000000039, −0.7000000027, −1.500000004, −2.300000008,

2.299999988, 1.499999998, 0.6999999988, −0.09999999649, −0.9000000039, −1.699999982, −2.499999998,

2.100000006, 1.299999996, 0.5000000012, −0.2999999981, −1.100000004, −1.900000023, −2.700000000

)T

Acknowledgement I am grateful to the anonymous referees for helpful comments on the presentation.

References [1] Z.H. Zhang, An iteration method of finding the roots of algebraic equation without repeated root, Numer. Math. J. Chinese Univ. 23 (1) (2001) 38–44. [2] M.S. Petkovi´c, On initial conditions for the convergence of simultaneous root finding methods, Computing 57 (2) (1996) 163–177. [3] W.S. Luk, Finding roots of a real polynomial simultaneously by means of Bairstow’s method, BIT 36 (2) (1996) 302–308. [4] M.S. Petkovi´c, A family of simultaneous zeros-finding methods, Comput. Math. Appl. 34 (10) (1997) 49–59. [5] D. Bini, V.Y. Pan, Computing matrix eigenvalues and polynomial zeros where the output is real, SIAM J. Comput. 27 (4) (1998) 1099–1115. [6] V.Y. Pan, Solving a polynomial equation: Some history and recent progress, SIAM Rev. 39 (2) (1997) 187–220. [7] X.Y. Wu, J.X. Lin, R. Shao, Quadratically convergent multiple roots finding method without derivatives, Comput. Math. Appl. 42 (1–2) (2001) 115–119.

X. Wu / Applied Numerical Mathematics 44 (2003) 415–423

423

[8] J.E. Dennis Jr., J.J. Moré, A characterization of superlinear convergence and its application to quasi-Newton methods, Math. Comp. 28 (1974) 549–560. [9] C.G. Broyden, J.E. Dennis Jr., J.J. Moré, On the local and superlinear convergence of quasi-Newton methods, J. Inst. Math. Appl. 12 (1973) 223–245.