Journal of Computational and Applied Mathematics 365 (2020) 112369

On the numerical stability of Newton's formula for Lagrange interpolation

André Pierro de Camargo
Centro de Matemática, Computação e Cognição, Universidade Federal do ABC - UFABC, Rua Santa Adélia, 166, bairro Bangu, cep 09210-170, Santo André, SP, Brazil
E-mail address: [email protected]

Article history: Received 25 March 2019; received in revised form 24 July 2019.
MSC: 65G50; 65G30
Keywords: Lagrange interpolation; Newton's formula; Newton's divided differences; Numerical stability

Abstract. It was previously noticed in the literature that the computation of Newton's formula can be very sensitive to the ordering of the input data and that it may be numerically unstable even for interpolation at the Chebyshev nodes. Nevertheless, to the best of our knowledge, the effects of rounding errors in the computation of Newton's formula have not been properly quantified yet. In this note we make a full stability analysis of the propagation of rounding errors in the computation of Newton's formula for Lagrange interpolation. We derive sharp upper bounds for the numerical error in the computations of three algorithms for computing polynomial interpolants in Newton form that can be used to understand the instabilities mentioned above. We throw light on and give a practical meaning to the condition number defined in Carnicer et al. (2019) by showing that it is the right thermometer for measuring the effects of rounding errors in the computation of Newton's formula.
1. Introduction

In Lagrange interpolation, the effects of rounding errors in the computation of polynomials given in barycentric form are measured by the Lebesgue constant

$$\Lambda(x) := \max_{x \in [a,b]} \left( \sum_{i=0}^{n} \prod_{\substack{k=0 \\ k \neq i}}^{n} \left| \frac{x - x_k}{x_i - x_k} \right| \right), \qquad (1)$$
which is the norm of the linear operator y ↦ p(·, x, y) ∈ C[a, b] that maps the real vector y = (y_0, y_1, ..., y_n) to the polynomial of degree at most n that interpolates y at the nodes x = (x_0, x_1, ..., x_n), x_i ∈ [a, b] for all i, that is, p(x_i, x, y) = y_i, i = 0, 1, ..., n.
In fact, if x, x and y are formed by floating point numbers with x ∈ [a, b] and the machine precision ϵ satisfies (5n + 5)ϵ < 0.001, the computed value fl(p(x, x, y)) of p(x, x, y) with the first barycentric formula,

$$p(x, x, y) = \left( \prod_{i=0}^{n} (x - x_i) \right) \sum_{i=0}^{n} \frac{y_i \lambda_i}{x - x_i}, \qquad \lambda_i = \frac{1}{\prod_{j \neq i} (x_i - x_j)},$$

satisfies [1]

$$|\mathrm{fl}(p(x, x, y)) - p(x, x, y)| \le 1.01\,(5n+5)\,\epsilon\,\Lambda(x)\,\|y\|_\infty. \qquad (2)$$
Similar bounds also exist for the second barycentric formula,

$$p(x, x, y) = \sum_{i=0}^{n} \frac{y_i \lambda_i}{x - x_i} \Bigg/ \sum_{i=0}^{n} \frac{\lambda_i}{x - x_i},$$

when Λ(x) is small [2,3]. Bound (2) is sharp and can be attained up to constant factors in some cases. For this reason, the Lebesgue constant is usually called the condition number for Lagrange interpolation. For the Chebyshev nodes of the first and second kinds, for example, the Lebesgue constant is O(log(n)) [4], and (2) tells us that the barycentric formula is numerically stable in this case. However, for the computation of p(x, x, y) in the Newton form
$$p(x, x, y) = \sum_{i=0}^{n} y[x_0, x_1, \ldots, x_i] \prod_{j=0}^{i-1} (x - x_j), \qquad (3)$$
we have a completely different scenario. It was already observed by Higham [1] that the computation of the right-hand side of (3) can be very sensitive to the ordering of the input data (x, y) and may be numerically unstable for large n, even for interpolation at the Chebyshev nodes. This problem was previously studied by Tal-Ezer [5], who argued that, due to the very fine spacing between adjacent points, the computation of the divided differences suffers from large roundoff errors. Nevertheless, to the best of our knowledge, the effects of rounding errors in the computation of the right-hand side of (3) have not been properly quantified yet.

In this note we present a full rounding error analysis for three algorithms that can be used to compute Lagrange interpolants in Newton form. The first one (Inv) computes y[x_0, x_1, ..., x_i], i = 0, 1, ..., n, in the closed form ([6], p. 247):
$$y[x_0, x_1, \ldots, x_i] = \sum_{j=0}^{i} \left( y_j \Bigg/ \prod_{\substack{k=0 \\ k \neq j}}^{i} (x_j - x_k) \right), \quad i = 0, 1, \ldots, n. \qquad (4)$$
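A minimal Python sketch of (Inv) may help fix ideas; the function name and the use of NumPy are our own illustrative choices, not part of the paper:

```python
import numpy as np

def newton_coefficients_inv(x, y):
    """(Inv): the divided differences y[x0], y[x0,x1], ..., y[x0,...,xn]
    computed via the closed form (4): each coefficient is a sum of the y_j
    divided by products of node differences."""
    n = len(x) - 1
    coeffs = np.empty(n + 1)
    for i in range(n + 1):
        s = 0.0
        for j in range(i + 1):
            denom = 1.0
            for k in range(i + 1):
                if k != j:
                    denom *= x[j] - x[k]
            s += y[j] / denom
        coeffs[i] = s
    return coeffs
```

This direct form costs O(n³) operations; an O(n²) variant based on the inverse of the matrix V(x), requiring no matrix storage, is mentioned later in Section 3.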
The second one (For) computes y[x0 , x1 , . . . , xi ], i = 0, 1, . . . , n, by forward substitution as the solution w of the lower triangular system
$$V(x) \times \begin{bmatrix} w_0 \\ w_1 \\ \vdots \\ w_n \end{bmatrix} = \begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_n \end{bmatrix}, \qquad V(x) = \begin{bmatrix} p_0(x_0) & & & \\ p_0(x_1) & p_1(x_1) & & \\ \vdots & \vdots & \ddots & \\ p_0(x_n) & p_1(x_n) & \cdots & p_n(x_n) \end{bmatrix}, \qquad (5)$$
with

$$p_i(x) := \prod_{j=0}^{i-1} (x - x_j), \quad i = 0, 1, \ldots, n, \qquad (6)$$
and the third one (Diff) computes y[x_0, x_1, ..., x_i], i = 0, 1, ..., n, by Newton's divided differences

$$y[x_i, x_{i+1}, \ldots, x_{i+k+1}] = \frac{y[x_{i+1}, x_{i+2}, \ldots, x_{i+k+1}] - y[x_i, x_{i+1}, \ldots, x_{i+k}]}{x_{i+k+1} - x_i}. \qquad (7)$$
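For comparison, here are minimal sketches of (For) and (Diff), together with the evaluation of the Newton form (3), under the same illustrative assumptions as the previous snippet:

```python
import numpy as np

def newton_coefficients_for(x, y):
    """(For): solve the lower triangular system (5) by forward substitution."""
    n = len(x) - 1
    w = np.empty(n + 1)
    for i in range(n + 1):
        acc = float(y[i])
        basis = 1.0              # p_j(x_i), built up incrementally; p_0 = 1
        for j in range(i):
            acc -= w[j] * basis  # subtract w_j * p_j(x_i)
            basis *= x[i] - x[j] # advance to p_{j+1}(x_i)
        w[i] = acc / basis       # divide by the diagonal entry p_i(x_i)
    return w

def newton_coefficients_diff(x, y):
    """(Diff): classical divided-difference table via the recurrence (7)."""
    n = len(x) - 1
    d = np.array(y, dtype=float)  # at level k, d[i] holds y[x_i, ..., x_{i+k}]
    coeffs = np.empty(n + 1)
    coeffs[0] = d[0]
    for k in range(1, n + 1):
        for i in range(n + 1 - k):
            d[i] = (d[i + 1] - d[i]) / (x[i + k] - x[i])
        coeffs[k] = d[0]
    return coeffs

def newton_eval(x, coeffs, t):
    """Evaluate the Newton form (3) at t: sum of coeffs[i] * prod_{j<i}(t - x_j)."""
    total, prod = 0.0, 1.0
    for i in range(len(coeffs)):
        total += coeffs[i] * prod
        prod *= t - x[i]
    return total
```

The three coefficient routines are mathematically equivalent, so any differences observed between their computed results are purely rounding effects, which is precisely the subject of the analysis below.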
We derive some upper bounds for the numerical error in the computation of the right-hand side of (3) by (Inv), (For) and (Diff). For instance, for some algorithms and configurations of the nodes, we have

$$|\mathrm{fl}(p(x, x, y)) - p(x, x, y)| \le 1.01\,(6n)\,\epsilon\,\Lambda_N(x)\,\|y\|_\infty, \qquad (8)$$
where
$$\Lambda_N(x) = \max_{x \in [a,b]} \left( \sum_{i=0}^{n} c_i \prod_{j=0}^{i-1} |x - x_j| \right), \qquad c_i = \sum_{j=0}^{i} \frac{1}{\prod_{\substack{k=0 \\ k \neq j}}^{i} |x_j - x_k|}, \qquad (9)$$

i = 0, 1, \ldots, n.
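Λ_N(x) is straightforward to estimate numerically by maximizing over a fine grid; the following sketch is our own illustrative helper, not the authors' code:

```python
import numpy as np

def lambda_newton(x, a=-1.0, b=1.0, grid_size=10001):
    """Approximate Lambda_N(x) of (9) by maximizing over an even grid in [a, b]."""
    x = np.asarray(x, dtype=float)
    n = len(x) - 1
    # the numbers c_i of (9)
    c = np.empty(n + 1)
    for i in range(n + 1):
        s = 0.0
        for j in range(i + 1):
            denom = 1.0
            for k in range(i + 1):
                if k != j:
                    denom *= abs(x[j] - x[k])
            s += 1.0 / denom
        c[i] = s
    # maximize sum_i c_i * prod_{j<i} |t - x_j| over the grid
    t = np.linspace(a, b, grid_size)
    total = np.zeros_like(t)
    prod = np.ones_like(t)
    for i in range(n + 1):
        total += c[i] * prod
        prod *= np.abs(t - x[i])
    return total.max()
```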
As we shall see, the bound (8) is sharp and can be attained up to constant factors in some cases. In addition, notice that the right-hand sides of (1) and (9) have exactly the same shape. For these and some other reasons that we shall explain, we claim that Λ_N(x) should be regarded as the condition number for Lagrange interpolation in Newton form. Actually, the quantity Λ_N(x) was previously defined in [7], in a more general context, as an upper bound for the Lebesgue constant (1), but no connection with the propagation of rounding errors in the computation of p(x, x, y) was made there (this is explained in detail in [8]).

Our approach also helps to understand some phenomena previously reported in the literature. For instance, it was noted by Tal-Ezer [5] and Reichel [9] that, for the Chebyshev nodes, the error in the computation of the right-hand side of (3) can be considerably diminished by rearranging the nodes in certain orderings, and this is accompanied by a reduction of Λ_N(x). A natural question in this context is whether this strategy would work for equidistant nodes

$$(x_{eq})_i = a + ih, \quad i = 0, 1, \ldots, n, \quad h = \frac{b-a}{n}.$$

Unfortunately, we have a negative answer to this question. As we shall see in Section 2.1, we have

$$\max_{1 \le i \le n} \Lambda\big(x^{(i)}\big) \;\le\; \Lambda_N(x), \qquad (10)$$

where Λ(x^{(i)}) is the Lebesgue constant of x^{(i)} = (x_0, x_1, ..., x_i). Thus the error in the computation of Newton's formula will be small only if all subgrids x^{(i)}, i = 1, 2, ..., n, of x are almost optimal in the classical sense. This does not hold for equidistant nodes because, for every permutation x'_{eq} of x_{eq}, we have [10]

$$\Lambda_N(x'_{eq}) \;\overset{(10)}{\ge}\; \Lambda(x'_{eq}) = \Lambda(x_{eq}) \;\ge\; \frac{2^n}{n^2}.$$
In Section 2 we present our upper bounds and we discuss the magnitude of Λ_N(x) in some cases. We prove that Λ_N(x) grows exponentially in n for the Chebyshev nodes, and this explains the instability reported in [1]. In Section 3 we present some numerical experiments to show how our upper bounds perform in practice, and in Section 4 we make a thorough discussion of the effects of rounding errors in the computation of Newton's divided differences and of the related literature. Section 5 is devoted to the proofs of the theorems stated in Section 2.

Remark 1. We emphasize that the goal of this article is solely to understand why Newton's formula may be numerically unstable. Thus, we shall not make any comparison with its barycentric counterparts in terms of accuracy, number of flops, etc. (this was already done in [1,11] and [12]).

2. Upper bounds for the numerical error in the computation of Newton's formula

For the algorithm (Inv), we have

Theorem 1. If x, x and y are formed by floating point numbers, with x ∈ [a, b] and the machine precision ϵ satisfies (6n)ϵ < 0.001, the right-hand side of (3) can be computed with (Inv) so that

$$|\mathrm{fl}(p(x, x, y)) - p(x, x, y)| \le 1.01\,(6n)\,\epsilon\,\Lambda_N(x)\,\|y\|_\infty. \qquad (11)$$
Our upper bound for the numerical error for the algorithm (For) depends on the condition numbers

$$\mathrm{cond}\big(V(x^{(\mu)})\big) := \big\|V(x^{(\mu)})\big\|_\infty \big\|V(x^{(\mu)})^{-1}\big\|_\infty, \qquad (12)$$

of the submatrices V(x^{(µ)}) = V((x_0, x_1, ..., x_µ)), µ = 0, 1, ..., n, of V(x) = V(x^{(n)}) (here, ∥·∥_∞ is the matrix norm induced by the supremum norm ∥·∥_∞). Notice that, by (4), we have cond(V(x^{(µ)})) in closed form:

$$\mathrm{cond}\big(V(x^{(\mu)})\big) = \left[ \max_{0 \le i \le \mu} \left( \sum_{j=0}^{i} |p_j(x_i)| \right) \right] \left[ \max_{0 \le i \le \mu} \left( \sum_{j=0}^{i} 1 \Bigg/ \prod_{\substack{k=0 \\ k \neq j}}^{i} |x_j - x_k| \right) \right], \qquad (13)$$

µ = 0, 1, \ldots, n, with p_i(x) defined by (6).
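Since (13) gives cond(V(x^{(µ)})) in closed form, it can be evaluated without forming or inverting any matrix; a sketch under the same illustrative assumptions as before:

```python
import numpy as np

def cond_V(x, mu=None):
    """cond(V(x^(mu))) via the closed form (13): the maximum row sum of V
    times the maximum row sum of its inverse."""
    x = np.asarray(x, dtype=float)
    if mu is None:
        mu = len(x) - 1
    norm_V = 0.0     # max over i of sum_j |p_j(x_i)|
    norm_Vinv = 0.0  # max over i of sum_j 1 / prod_{k != j} |x_j - x_k|
    for i in range(mu + 1):
        row_V = 0.0
        prod = 1.0   # |p_j(x_i)|, built incrementally; p_0 = 1
        for j in range(i + 1):
            row_V += prod
            prod *= abs(x[i] - x[j])
        row_inv = 0.0
        for j in range(i + 1):
            denom = 1.0
            for k in range(i + 1):
                if k != j:
                    denom *= abs(x[j] - x[k])
            row_inv += 1.0 / denom
        norm_V = max(norm_V, row_V)
        norm_Vinv = max(norm_Vinv, row_inv)
    return norm_V * norm_Vinv
```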
We have

Theorem 2. If x, x and y are formed by floating point numbers, with x ∈ [a, b] and the machine precision ϵ satisfies

$$1.01\,(3n)\,\epsilon\,\mathrm{cond}(V) \le 0.5, \qquad (14)$$

the right-hand side of (3) can be computed with (For) so that

$$|\mathrm{fl}(p(x, x, y)) - p(x, x, y)| \le 3.03\,(4n+1)\,\epsilon\,\Theta_N[x, y], \qquad (15)$$

for

$$\Theta_N[x, y] := \max_{x \in [a,b]} \left( \sum_{i=0}^{n} v_i \prod_{j=0}^{i-1} |x - x_j| \right) \quad \text{and} \quad v_i = \Big[\mathrm{cond}\big(V(x^{(i)})\big) + 1\Big] \Big[\max_{0 \le j \le i} \big|y[x_0, x_1, \ldots, x_j]\big|\Big], \quad i = 0, 1, \ldots, n.$$
Of course, we can eliminate the dependence of Θ_N[x, y] in Theorem 2 on the numbers y[x_0, x_1, ..., x_i], i = 0, 1, ..., n, by using

$$\big|y[x_0, x_1, \ldots, x_i]\big| \;\overset{(4)}{\le}\; \|y\|_\infty \sum_{j=0}^{i} \left( 1 \Bigg/ \prod_{\substack{k=0 \\ k \neq j}}^{i} |x_j - x_k| \right), \quad i = 0, 1, \ldots, n. \qquad (16)$$
However, this may overestimate the error, because the multipliers of ∥y∥_∞ in (16) may be large for large n.

We were unable to obtain a useful upper bound for the numerical error of the algorithm (Diff) for all node configurations. It can be shown that computing y[x_0, x_1, ..., x_i], i = 0, 1, ..., n, with Newton's divided differences is equivalent to solving the linear system (5) by Neville elimination. Thus, in principle, one could adapt the techniques employed in [13] to handle this case. Nevertheless, we suspect that the resulting upper bounds would be neither very sharp nor easy to interpret. A remarkable exception is when the nodes are sorted in increasing (or decreasing) order. In this case, the upper bound (8), which holds for the algorithm (Inv) independently of the distribution of the nodes, also holds for the algorithm (Diff).

Theorem 3. If x, x and y are formed by floating point numbers, x_0 < x_1 < · · · < x_n, with x ∈ [a, b] and the machine precision ϵ satisfies (6n)ϵ < 0.001, the right-hand side of (3) can be computed with (Diff) so that

$$|\mathrm{fl}(p(x, x, y)) - p(x, x, y)| \le 1.01\,(6n)\,\epsilon\,\Lambda_N(x)\,\|y\|_\infty. \qquad (17)$$
In the sequel we estimate the magnitude of Λ_N(x) for some families of nodes.

2.1. On the magnitude of Λ_N(x)

In the following discussion of Λ_N(x) we assume that [a, b] = [−1, 1], because Λ_N(x) is invariant under translations and dilations of the interpolation interval. In this case, the following lemma by Erdős and Turán [14] states that the numbers c_i defined in (9) grow exponentially regardless of the distribution of the nodes.

Lemma 1. If x_i ∈ [−1, 1], i = 0, 1, ..., n, then

$$c_i \ge 2^{i-2}, \quad i = 0, 1, \ldots, n. \qquad (18)$$

Thus, using that

$$\Lambda_N(x) \ge c_i \prod_{j=0}^{i-1} |1 - x_j|,$$

we have

Corollary 1. If −1 ≤ x_0 < x_1 < · · · < x_n ≤ 1 and x_{⌊n/2⌋} ≤ 0, then

$$\Lambda_N(x) \ge 2^{\lfloor n/2 \rfloor - 2}. \qquad (19)$$

Indeed, taking i = ⌊n/2⌋ in the display above, each factor |1 − x_j|, j < ⌊n/2⌋, is at least 1 because x_j ≤ x_{⌊n/2⌋} ≤ 0, and c_{⌊n/2⌋} ≥ 2^{⌊n/2⌋−2} by Lemma 1.
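As a quick, hypothetical numerical check of Corollary 1, one can feed sorted Chebyshev nodes to the lambda_newton sketch given above:

```python
import numpy as np

# Chebyshev nodes of the second kind are already in increasing order,
# and roughly half of them lie at or below 0, so (19) applies:
for n in (10, 20, 40):
    x = -np.cos(np.arange(n + 1) * np.pi / n)
    print(n, lambda_newton(x), 2.0 ** (n // 2 - 2))  # Lambda_N vs. the bound
```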
In particular, Λ_N(x) grows exponentially for the Chebyshev nodes in the usual (increasing) order, and this explains the instability reported by Higham [15]. We believe that Λ_N(x) will be large for large n whenever x_0 < x_1 < · · · < x_n, but this still needs to be investigated.

Another approach to lower bounds for Λ_N(x) is to note that

$$\big|y[x_0, x_1, \ldots, x_i]\big| \;\overset{(4)}{\le}\; c_i\, \|y^{(i)}\|_\infty, \quad i = 0, 1, \ldots, n,$$
where y^{(i)} denotes the vector formed by the first i + 1 elements (y_0, ..., y_i) of y. Therefore,

$$\max_{x \in [a,b]} \big|p\big(x, x^{(i)}, y^{(i)}\big)\big| \;\le\; \left[ \max_{x \in [a,b]} \sum_{k=0}^{i} c_k \prod_{j=0}^{k-1} |x - x_j| \right] \|y^{(i)}\|_\infty \;\le\; \Lambda_N(x)\, \|y^{(i)}\|_\infty,$$

and we have (10):

$$\Lambda\big(x^{(i)}\big) = \max_{\|y^{(i)}\|_\infty = 1} \left( \max_{x \in [a,b]} \big|p\big(x, x^{(i)}, y^{(i)}\big)\big| \right) \;\le\; \Lambda_N(x).$$

For upper estimates, Carnicer et al. [7] proved that
$$\Lambda_N(x) \le \mathrm{cond}(V(x))\, \Lambda(x).$$

Unfortunately, this does not help to understand why the errors in the computation of the right-hand side of (3) can be reduced for interpolation at Chebyshev nodes by permuting the nodes because, by (12) and (18), cond(V(x)) grows exponentially in n regardless of the distribution of the nodes. Therefore, this is a relevant open topic for future research.

3. Numerical experiments

We start with the results of a numerical simulation in which we interpolate the function f(x) = sin(x) at the Chebyshev nodes of the second kind

$$x_i = -\cos\left(\frac{i\pi}{n}\right), \quad i = 0, 1, \ldots, n.$$
In Fig. 1 we plot the maximum error

$$E_n = \max_{x \in M} |f(x) - p(x, x, y)|$$
over a grid M formed by 10⁴ + 1 evenly spaced nodes distributed in [−1, 1], for several values of n. According to the well-known formula for the interpolation error of smooth functions ([16], p. 78), the error E_n for n > 20 is below the machine precision ϵ ≈ 2.2 · 10⁻¹⁶, so what we see is purely the error in the computation of p(x, x, y) due to rounding. The values (BInv) and (BFor) are our upper bounds (11) and (15), respectively, for the numerical errors in the computation of p(x, x, y) by the algorithms (Inv) and (For). The upper bound (BInv) is the right-hand side of (8). In this specific case, our upper bound for the algorithm (Diff) is also (BInv). As we can clearly see, the numerical error grows exponentially in this case for all three algorithms, and our upper bounds for the numerical error are attained up to constant factors.

Using the standard floating point arithmetic model of [15] (see Section 5.1), one can prove that the upper bound (BInv), which is the right-hand side of (8), can always be computed with high relative precision. On the other hand, the upper bound (BFor) depends on y[x_0, x_1, ..., x_i], i = 0, 1, ..., n (see Section 2), and this can be inconvenient in some cases because these numbers might not be computed with good relative precision. Consequently, the analysis based on the computed values (BFor) may be compromised by large numerical errors in the computation of y[x_0, x_1, ..., x_i], i = 0, 1, ..., n. This is the case in this experiment, and that is why the values (BFor) in Fig. 1 look odd for large n. Of course, this problem can be solved by computing the numbers y[x_0, x_1, ..., x_i], i = 0, 1, ..., n, in high precision, but this is ad hoc.
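A sketch of this experiment, reusing the illustrative helpers defined earlier (the function, the grid size and the node family follow the text; the scaffolding is our own):

```python
import numpy as np

def chebyshev_second_kind(n):
    """Chebyshev nodes of the second kind on [-1, 1], in increasing order."""
    return -np.cos(np.arange(n + 1) * np.pi / n)

def max_error(n, coeff_fn, f=np.sin, grid_size=10**4 + 1):
    """E_n: maximum of |f(x) - p(x, x, y)| over an even grid M in [-1, 1]."""
    x = chebyshev_second_kind(n)
    y = f(x)
    coeffs = coeff_fn(x, y)
    t = np.linspace(-1.0, 1.0, grid_size)
    # evaluate the Newton form (3) on the whole grid at once
    total = np.zeros_like(t)
    prod = np.ones_like(t)
    for i in range(n + 1):
        total += coeffs[i] * prod
        prod *= t - x[i]
    return np.abs(f(t) - total).max()

# e.g. max_error(80, newton_coefficients_diff) grows far above machine
# precision for Chebyshev nodes in increasing order, as in Fig. 1.
```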
3.1. Leja and other orderings

It was observed empirically in [9] and [5] that the effects of rounding errors in the computation of Newton's formula can be diminished by ordering the Chebyshev nodes differently. An explicit way of doing that, suggested by [9], is to use a Leja ordering:

• choose x'_0 ∈ {x_0, x_1, ..., x_n} such that |x'_0| = max_i |x_i|;
• for 1 ≤ j ≤ n, choose x'_j ∈ {x_0, x_1, ..., x_n} such that

$$\prod_{k=0}^{j-1} \big|x'_j - x'_k\big| = \max_{0 \le \ell \le n} \prod_{k=0}^{j-1} \big|x_\ell - x'_k\big|.$$
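This greedy selection is easy to implement in O(n²) operations; a minimal sketch with names of our own choosing:

```python
import numpy as np

def leja_order(x):
    """Reorder the nodes x greedily: start from the node of largest modulus,
    then repeatedly pick the remaining node maximizing the product of
    distances to the nodes already chosen (products kept incrementally)."""
    x = np.asarray(x, dtype=float)
    remaining = list(range(len(x)))
    order = [max(remaining, key=lambda i: abs(x[i]))]
    remaining.remove(order[0])
    # dist_prod[i] = product over chosen k of |x[i] - x'[k]|
    dist_prod = np.abs(x - x[order[0]])
    for _ in range(len(x) - 1):
        j = max(remaining, key=lambda i: dist_prod[i])
        order.append(j)
        remaining.remove(j)
        dist_prod *= np.abs(x - x[j])
    return x[order]
```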
In Fig. 2 we repeat the experiment of the previous section using a Leja ordering. As we can see, a remarkable reduction of the numerical errors is observed in this case, accompanied by a reduction of the upper bounds (BInv) and (BFor).
Fig. 1. The error En for algorithms (Inv), (For) and (Diff) (in logarithmic scale, base 10) and the upper bounds (BInv ) and (BFor ) for Chebyshev nodes, for n up to 100.
Fig. 2. The error En for algorithms (Inv), (For) and (Diff) (in logarithmic scale, base 10) and the upper bounds (BInv ) and (BFor ), for Chebyshev nodes in a Leja ordering, for n up to 100.
Therefore, our analysis gives solid theoretical support for the use of Leja orderings in Newton interpolation, corroborating the empirical evidence of [9]. However, we remark that, in view of (10), bounding Λ_N(x') in this case is challenging (this is discussed a little further in Section 2.1).

We also repeated the experiment using the Chebyshev nodes randomly ordered. A similar reduction of the effects of rounding errors was also observed in this case, although not as extreme as with the Leja ordering (see Fig. 3). We emphasize that the computation of the values in Fig. 3 was not selective, in the sense that only one random permutation was used for each n. This suggests that, among the many possible orderings of the Chebyshev nodes, only a few may lead to large numerical errors. The values of Fig. 4, which shows the distribution of rounding errors for n = 75 over 10 000 random permutations of the Chebyshev nodes, corroborate this suspicion (the algorithm used here is (Diff)).

We repeated the experiment above on random permutations, but now for equally spaced nodes. The results, shown in Fig. 5, corroborate the claim in the introductory section that permuting the nodes might not lead to a significant reduction of the effects of rounding errors for interpolation at equally spaced nodes.

In the computation of Newton's divided differences, rounding errors may cancel at intermediate steps. In the algorithm (Inv) this does not happen, because the interactions between the values of y only occur at the final stage. Thus, the worst-case scenario is more or less expected for the algorithm (Inv) and, for this reason, numerical errors below the right-hand side of (8) are expected for all three algorithms. We emphasize that this possible cancellation of rounding errors is the only significant practical difference between the algorithms (Inv) and (Diff). In fact, using the relations

$$v'_{i+1,j} = \frac{v'_{i,j}}{x_{j-1} - x_i}, \quad j = 1, 2, \ldots, i, \qquad \text{and} \qquad v'_{i+1,i+1} = 1 \Bigg/ \prod_{k=0}^{i-1} (x_i - x_k)$$

between the elements of the inverse V' of the matrix V(x) in (5), all the numbers in the right-hand side of (4) can be computed with a total of O(n²) arithmetic operations and no matrix storage. In addition, [5] suggested rescaling the interpolation interval to [−2, 2] in order to avoid overflow in the computation of Newton's divided differences for large n, and a nice strategy to avoid overflow/underflow in the computation of products with a large number of factors (which introduces no additional rounding errors) was presented in Section 3.1 of [17]. Moreover, note that the upper bound in Theorem 2 is at least as great as the upper bound (17) for ∥y∥_∞ = |y_0|. Therefore, for the reasons explained above, it is reasonable to regard the number Λ_N(x) defined in (9) as the condition number for Newton's formula for Lagrange interpolation.
Fig. 3. The error En for algorithms (Inv), (For) and (Diff) (in logarithmic scale, base 10) and the upper bounds (BInv ) and (BFor ), for Chebyshev nodes randomly ordered, for n up to 100.
Fig. 4. The distribution of the errors En (in logarithmic scale, base 10) for n = 75 and for 10 000 random permutations of Chebyshev nodes.
Fig. 5. The distribution of the errors En (in logarithmic scale, base 10) for n = 75 and for 1000 random permutations of equally spaced nodes in [−1, 1].
Table 1. The sensitivity of cond(V(x)) and cond(V(x')) with respect to scaling (multiplication of the nodes by a factor sca) for Chebyshev nodes x and Chebyshev nodes x' in a Leja ordering.

n            | 5         | 10        | 20         | 50         | 100
x,  sca = 1  | 3.194e+02 | 1.170e+05 | 1.3599e+10 | 2.118e+25  | 4.466e+50
x,  sca = 2  | 6.792e+02 | 4.774e+05 | 1.887e+11  | 1.315e+28  | 1.531e+56
x', sca = 1  | 6.474e+01 | 2.088e+03 | 2.198e+06  | 2.139e+15  | 2.412e+30
x', sca = 2  | 22.180    | 46.566    | 93.489     | 349.656    | 913.902
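The entries of Table 1 could, in principle, be reproduced with the illustrative cond_V, leja_order and chebyshev_second_kind sketches given earlier (a hypothetical driver, not the authors' code):

```python
# 'sca' rescales the nodes before the conditioning of V is measured.
for n in (5, 10, 20, 50, 100):
    nodes = chebyshev_second_kind(n)
    for label, ordered in (("x", nodes), ("x'", leja_order(nodes))):
        for sca in (1.0, 2.0):
            print(label, n, sca, cond_V(sca * ordered))
```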
4. The instability of Newton's divided differences and some proposed remedies

In the previous section we checked empirically that interpolation at the Chebyshev nodes with Newton's formula is stable when the interpolation nodes are in a Leja ordering. However, although stability is guaranteed in this case by the low values of the upper bounds (11) and (15) in Fig. 2, this does not necessarily mean that the computation of Newton's divided differences y[x_0, x_1, ..., x_i], i = 0, 1, ..., n (which is just a part of the whole computational process), is itself stable. In fact, by (13), (9) and the Erdős–Turán lemma (18), we have the exponential estimate

$$\mathrm{cond}(V(x)) \ge 2^{n-2} \qquad (20)$$
for the condition number defined by (12). Therefore, cond(V(x)) grows exponentially in n independently of the distribution of the nodes x = (x_0, x_1, ..., x_n) (provided that all the nodes lie in [−1, 1]). Consequently, the computation of Newton's divided differences may be unstable even for nodes in [−1, 1] in a Leja ordering.

Another very important characteristic that distinguishes the stability of the complete Newton's formula from the stability of Newton's divided differences is that, while Λ_N(x) is invariant under scaling, the condition number cond(V(x)) is not! Our experiments suggested that the optimal scaling factor (sca) for Chebyshev nodes in the usual order is sca = 1, while for Chebyshev nodes in a Leja ordering the optimal scaling factor is sca = 2. The values in the last row of Table 1 reveal that cond(V(x')) does not grow too fast for Chebyshev nodes in a Leja ordering when the nodes are scaled to [−2, 2]. This fact is also curious because Tal-Ezer [5] argued that using intervals of length four helps to prevent underflow and overflow in the computation of Newton's formula. However, the phenomenon observed in Table 1 was not taken into account in [5] and, therefore, it can be an interesting topic for future research.

4.1. Some other strategies for computing Newton's divided differences

Let us briefly mention that, when the interpolated values y come from an analytic function f, that is, y = (f(x_0), f(x_1), ..., f(x_n)), there are some closed-form expressions for its Newton's divided differences. These expressions were already explored in the literature in order to obtain stable algorithms for computing Newton's divided differences. For instance, [18] proposed an algorithm for computing Newton's divided differences of f based on its representation by contour integrals (the trapezoidal rule is used for the computation of the integrals in [18]). Another approach, employed in [19], is to use the representation of the vector of Newton's divided differences as a matrix function depending on f. In this case, what is actually computed is a polynomial matrix based on a truncated Taylor series of f. However, although we do not question the benefits of these approaches, it is important to remark that it does not make much sense to compare them with the classical recurrence (7) in terms of stability, because they use extra information beyond the sampled values (f(x_0), f(x_1), ..., f(x_n)).

5. Proofs

5.1. The standard floating point arithmetic

Our approach is based on the standard floating point arithmetic model presented in [15]:

$$\mathrm{fl}(w \,\mathrm{op}\, z) = (w \,\mathrm{op}\, z)(1 + \delta), \quad |\delta| \le \epsilon, \quad \mathrm{op} = +, -, *, /, \qquad (21)$$
for all floating point numbers w and z. We will also use Stewart's relative error counter [15]

$$\langle q \rangle = \prod_{i=1}^{q} (1 + \xi_i)^{\rho_i}, \quad \text{for } \rho_i = \pm 1 \text{ and } |\xi_i| \le \epsilon. \qquad (22)$$

In view of (22), we can rewrite (21) as

$$\mathrm{fl}(w \,\mathrm{op}\, z) = (w \,\mathrm{op}\, z)\langle 1 \rangle, \quad \mathrm{op} = +, -, *, /. \qquad (23)$$
It is also convenient to write ⟨q⟩_u when we want to give a label u to the specific q rounding errors we are concerned with. The basic properties of Stewart's error counter can be summarized as

$$\langle q \rangle_{u_1} \langle p \rangle_{u_2} = \langle p + q \rangle_{v}, \qquad \frac{1}{\langle q \rangle_u} = \langle q \rangle_w; \qquad (24)$$

$$\text{if } p \le q \text{ then } \langle p \rangle_u = \langle q \rangle_v \qquad (25)$$

and (Lemma 3.1 of [15])

$$\text{if } q\epsilon < 0.001 \text{ then } |\langle q \rangle - 1| \le \frac{q\epsilon}{1 - q\epsilon} \le 1.01\, q\epsilon. \qquad (26)$$

By (24) and (25), it is easy to show that if a sum of m + 1 floating point numbers a_0, a_1, ..., a_m is evaluated recursively, $\sum_{i=0}^{r+1} a_i = \left[\sum_{i=0}^{r} a_i\right] + a_{r+1}$, then the computed sum satisfies

$$\mathrm{fl}\left( \sum_{i=0}^{m} a_i \right) = \sum_{i=0}^{m} a_i \langle m \rangle_i. \qquad (27)$$
5.2. Proof of Theorem 1

By (4), (23), (24) and (27) we have

$$\mathrm{fl}(y[x_0, x_1, \ldots, x_i]) = \sum_{j=0}^{i} \langle 3i \rangle_j \left( y_j \Bigg/ \prod_{\substack{k=0 \\ k \neq j}}^{i} (x_j - x_k) \right). \qquad (28)$$
Analogously, by (23), (24) and (25), we obtain

$$\mathrm{fl}\left( \prod_{j=0}^{i-1} (x - x_j) \right) = \langle 2i - 1 \rangle' \prod_{j=0}^{i-1} (x - x_j). \qquad (29)$$
Then, by (24) and (27),

$$\mathrm{fl}(p(x, x, y)) = \sum_{i=0}^{n} \left[ \sum_{j=0}^{i} \langle n + 5i \rangle'_j \left( y_j \Bigg/ \prod_{\substack{k=0 \\ k \neq j}}^{i} (x_j - x_k) \right) \right] \prod_{j=0}^{i-1} (x - x_j) \overset{(25)}{=} \sum_{i=0}^{n} \left[ \sum_{j=0}^{i} \langle 6n \rangle'_j \left( y_j \Bigg/ \prod_{\substack{k=0 \\ k \neq j}}^{i} (x_j - x_k) \right) \right] \prod_{j=0}^{i-1} (x - x_j).$$
Therefore,

$$|\mathrm{fl}(p(x, x, y)) - p(x, x, y)| \le \|y\|_\infty \sum_{i=0}^{n} \left[ \sum_{j=0}^{i} |\langle 6n \rangle'_j - 1| \left( 1 \Bigg/ \prod_{\substack{k=0 \\ k \neq j}}^{i} |x_j - x_k| \right) \right] \prod_{j=0}^{i-1} |x - x_j| \overset{(26)}{\le} 1.01\,(6n)\,\epsilon\,\|y\|_\infty\,\Lambda_N(x).$$
5.3. Proof of Theorem 2

Let us denote by ŵ_i the computed value of w_i, i = 0, 1, ..., µ. We have

$$\hat{w}_i = \mathrm{fl}\left( y_i - \sum_{j=0}^{i-1} \hat{w}_j \prod_{k=0}^{j-1} (x_i - x_k) \right).$$
By (23), (24) and (27), we obtain

$$\hat{w}_i = y_i \langle i+1 \rangle_i - \sum_{j=0}^{i-1} \langle i+1 \rangle_i \langle 2j-1 \rangle_{i,j}\, \hat{w}_j \prod_{k=0}^{j-1} (x_i - x_k) = y_i \langle i+1 \rangle_i - \sum_{j=0}^{i-1} \hat{w}_j \langle 2j+i \rangle_{i,j} \left( \prod_{k=0}^{j-1} (x_i - x_k) \right), \qquad (30)$$
i = 0, 1, ..., µ. Hence, we have

$$(M + \delta M) \times (w + \delta w) = y + \delta y,$$

with M := V(x^{(µ)}), δy_i = y_i(⟨i+1⟩_i − 1), i = 0, 1, ..., µ, and

$$\delta M := \begin{bmatrix} p_0(x_0)\langle 0 \rangle_{0,0} & p_1(x_0)\langle 2 \rangle_{0,1} & \cdots & p_\mu(x_0)\langle 2\mu \rangle_{0,\mu} \\ p_0(x_1)\langle 1 \rangle_{1,0} & p_1(x_1)\langle 3 \rangle_{1,1} & \cdots & p_\mu(x_1)\langle 2\mu+1 \rangle_{1,\mu} \\ \vdots & \vdots & & \vdots \\ p_0(x_\mu)\langle \mu \rangle_{\mu,0} & p_1(x_\mu)\langle \mu+2 \rangle_{\mu,1} & \cdots & p_\mu(x_\mu)\langle 3\mu \rangle_{\mu,\mu} \end{bmatrix} - M.$$

By the classical analysis of perturbed systems ([6], p. 37), we have

$$\frac{\|\delta w\|_\infty}{\|w\|_\infty} \le \frac{\mathrm{cond}(M, \|\cdot\|_\infty)}{1 - \mathrm{cond}(M)\,\|\delta M\|_\infty/\|M\|_\infty} \left( \frac{\|\delta y\|_\infty}{\|y\|_\infty} + \frac{\|\delta M\|_\infty}{\|M\|_\infty} \right). \qquad (31)$$
Moreover, (25) and (26) tell us that

$$\frac{\|\delta y\|_\infty}{\|y\|_\infty} \le 1.01\,(n+1)\,\epsilon \qquad \text{and} \qquad \frac{\|\delta M\|_\infty}{\|M\|_\infty} \le 1.01\,(3n)\,\epsilon.$$
∥δ w∥∞ ∥w∥∞
1.01(4n + 1)
≤ (14)
cond(M) 1 − 1.01(3n)ϵ cond(M)
ϵ
(32)
2.02(4n + 1)cond(M) ϵ
≤ and, in particular,
|w ˆµ − wµ |
(µ)
2.02(4n + 1)cond(V x
(
≤
)
[
]
) max |wi | 1≤i≤µ
ϵ,
(33)
µ = 0, 1, ..., n. On the other hand, by (29) we have

$$\mathrm{fl}\left( \prod_{j=0}^{i-1} (x - x_j) \right) = \langle 2i - 1 \rangle' \prod_{j=0}^{i-1} (x - x_j).$$

Then, by (24) and (27),

$$\mathrm{fl}(p(x, x, y)) = \sum_{i=0}^{n} \langle n + 2i \rangle\, \hat{w}_i \prod_{j=0}^{i-1} (x - x_j) \overset{(25)}{=} \sum_{i=0}^{n} \langle 3n \rangle_i\, \hat{w}_i \prod_{j=0}^{i-1} (x - x_j).$$
Therefore,

$$|\mathrm{fl}(p(x, x, y)) - p(x, x, y)| \le \sum_{i=0}^{n} \langle 3n \rangle_i\, |\hat{w}_i - w_i| \prod_{j=0}^{i-1} |x - x_j| + \sum_{i=0}^{n} |\langle 3n \rangle_i - 1|\, |w_i| \prod_{j=0}^{i-1} |x - x_j| \overset{(33),\,(26)}{\le} 3.03\,(4n+1)\,\epsilon \left[ \sum_{i=0}^{n} \Big[\mathrm{cond}\big(V(x^{(i)})\big) + 1\Big] \Big[\max_{0 \le j \le i} |w_j|\Big] \prod_{j=0}^{i-1} |x - x_j| \right].$$
5.4. Proof of Theorem 3

For each i and k, we have that y[x_i, x_{i+1}, ..., x_{i+k}] is a linear function of y_i, y_{i+1}, ..., y_{i+k}, say

$$y[x_i, x_{i+1}, \ldots, x_{i+k}] = \sum_{j=0}^{k} \alpha_{i,i+1,\ldots,i+k,j}\, y_{i+j},$$

i = 0, 1, ..., n, k = 0, 1, ..., n − i. Using (7) and that x_0 < x_1 < · · · < x_n, we can easily conclude, by induction on k, that

$$\alpha_{i,i+1,\ldots,i+k,j} = \sum_r \gamma_{i,i+1,\ldots,i+k,j,r} \qquad (34)$$

is a sum of terms with the same sign, j = 0, 1, ..., k, and that

$$\big(\alpha_{i,i+1,\ldots,i+k,j}\big)\, \big(\alpha_{i+1,\ldots,i+k+1,j}\big) \le 0, \quad i = 0, 1, \ldots, n - k. \qquad (35)$$
In fact, by (7), we have

$$\sum_r \gamma_{i,i+1,\ldots,i+k+1,j,r} = \frac{\displaystyle\sum_{r'} \gamma_{i+1,i+2,\ldots,i+k+1,j-1,r'} - \sum_{r''} \gamma_{i,i+1,\ldots,i+k,j,r''}}{x_{i+k+1} - x_i}$$

and

$$\sum_r \gamma_{i+1,i+2,\ldots,i+k+2,j,r} = \frac{\displaystyle\sum_{r'} \gamma_{i+2,i+3,\ldots,i+k+2,j-1,r'} - \sum_{r''} \gamma_{i+1,i+2,\ldots,i+k+1,j,r''}}{x_{i+k+2} - x_{i+1}},$$
with the convention that empty sums must be interpreted as zero. Thus, given that x_{i+k+1} − x_i > 0 and x_{i+k+2} − x_{i+1} > 0, if (34) and (35) hold for k, then (34) and (35) hold for k + 1.

Let ŷ[x_i, x_{i+1}, ..., x_{i+k}] be the computed value of y[x_i, x_{i+1}, ..., x_{i+k}]. By (23), (24) and (27), we have

$$\hat{y}[x_i, \ldots, x_{i+k+1}] = \mathrm{fl}\left( \frac{\hat{y}[x_{i+1}, \ldots, x_{i+k+1}] - \hat{y}[x_i, \ldots, x_{i+k}]}{x_{i+k+1} - x_i} \right) = \frac{\big(\hat{y}[x_{i+1}, \ldots, x_{i+k+1}] - \hat{y}[x_i, \ldots, x_{i+k}]\big)\langle 1 \rangle_1}{(x_{i+k+1} - x_i)\langle 1 \rangle_2} = \frac{\hat{y}[x_{i+1}, \ldots, x_{i+k+1}]\langle 3 \rangle_3 - \hat{y}[x_i, \ldots, x_{i+k}]\langle 3 \rangle_4}{x_{i+k+1} - x_i}. \qquad (36)$$
By induction on k, it follows by (24) and (36) that

$$\hat{y}[x_i, x_{i+1}, \ldots, x_{i+k}] = \sum_{j=0}^{k} \left( \sum_r \gamma_{i,i+1,\ldots,i+k,j,r}\, \langle 3k \rangle_{i,i+1,\ldots,i+k,j,r} \right) y_{i+j},$$

i = 0, 1, ..., n, k = 0, 1, ..., n − i. By (29), we also have that

$$\mathrm{fl}\left( \prod_{j=0}^{i-1} (x - x_j) \right) = \langle 2i - 1 \rangle' \prod_{j=0}^{i-1} (x - x_j).$$

Then, by (24) and (27),

$$\mathrm{fl}(p(x, x, y)) = \sum_{i=0}^{n} \left[ \sum_{j=0}^{i} \left( \sum_r \gamma_{0,1,\ldots,i,j,r}\, \langle n + 5i \rangle_{0,1,\ldots,i,j,r} \right) y_j \right] \prod_{j=0}^{i-1} (x - x_j) \overset{(25)}{=} \sum_{i=0}^{n} \left[ \sum_{j=0}^{i} \left( \sum_r \gamma_{0,1,\ldots,i,j,r}\, \langle 6n \rangle_{0,1,\ldots,i,j,r} \right) y_j \right] \prod_{j=0}^{i-1} (x - x_j).$$
Therefore,

$$|\mathrm{fl}(p(x, x, y)) - p(x, x, y)| \le \|y\|_\infty \sum_{i=0}^{n} \left[ \sum_{j=0}^{i} \left( \sum_r |\gamma_{0,1,\ldots,i,j,r}|\, |\langle 6n \rangle_{0,1,\ldots,i,j,r} - 1| \right) \right] \prod_{j=0}^{i-1} |x - x_j|$$

$$\overset{(26)}{\le} 1.01\,(6n)\,\epsilon\,\|y\|_\infty \sum_{i=0}^{n} \left[ \sum_{j=0}^{i} \sum_r |\gamma_{0,1,\ldots,i,j,r}| \right] \prod_{j=0}^{i-1} |x - x_j|$$

$$\overset{(34)}{=} 1.01\,(6n)\,\epsilon\,\|y\|_\infty \sum_{i=0}^{n} \left[ \sum_{j=0}^{i} |\alpha_{0,1,\ldots,i,j}| \right] \prod_{j=0}^{i-1} |x - x_j|$$

$$\overset{(4)}{=} 1.01\,(6n)\,\epsilon\,\|y\|_\infty \sum_{i=0}^{n} \left[ \sum_{j=0}^{i} 1 \Bigg/ \prod_{\substack{k=0 \\ k \neq j}}^{i} |x_j - x_k| \right] \prod_{j=0}^{i-1} |x - x_j| \;\le\; 1.01\,(6n)\,\epsilon\,\Lambda_N(x)\,\|y\|_\infty.$$
Acknowledgments

We thank two anonymous referees for including references [18,19] and [9] and also for some valuable remarks that motivated the inclusion of Section 4.

References

[1] N. Higham, The numerical stability of barycentric Lagrange interpolation, IMA J. Numer. Anal. 24 (2004) 547–556.
[2] W.F. Mascarenhas, A. Camargo, The backward stability of the second barycentric formula for interpolation, Dolomites Res. Notes Approx. 7 (2014) 1–12, arXiv:1310.2516.
[3] W. Mascarenhas, A. Camargo, The effects of rounding errors in the nodes on barycentric interpolation, Numer. Math. 135 (1) (2017) 113–141, arXiv:1309.7970.
[4] S. Smith, Lebesgue constants in polynomial interpolation, Ann. Math. Inform. 33 (2006) 109–123.
[5] H. Tal-Ezer, High degree polynomial interpolation in Newton form, SIAM J. Sci. Stat. Comput. 12 (3) (1991) 648–667.
[6] E. Isaacson, H. Keller, Analysis of Numerical Methods, Dover Publications Inc., New York, 1994.
[7] J.M. Carnicer, Y. Khiar, J.M. Peña, Optimal stability of the Lagrange formula and conditioning of the Newton formula, J. Approx. Theory 238 (2019) 52–66.
[8] A. Camargo, The practical meaning of the condition number by Carnicer, Khiar and Peña for Lagrange interpolation, submitted for publication.
[9] L. Reichel, Newton interpolation at Leja points, BIT 30 (1990) 332–346.
[10] L.N. Trefethen, J.A.C. Weideman, Two results on polynomial interpolation in equally spaced points, J. Approx. Theory 65 (1991) 247–260.
[11] J.P. Berrut, L.N. Trefethen, Barycentric Lagrange interpolation, SIAM Rev. 46 (3) (2004) 501–517.
[12] W. Werner, Polynomial interpolation: Lagrange versus Newton, Math. Comp. 43 (167) (1984) 205–217.
[13] P. Alonso, M. Gasca, J.M. Peña, Backward error analysis of Neville elimination, Appl. Numer. Math. 23 (1997) 193–207.
[14] P. Erdős, P. Turán, On interpolation. III. Interpolatory theory of polynomials, Ann. of Math. 41 (3) (1940) 510–553.
[15] N. Higham, Accuracy and Stability of Numerical Algorithms, SIAM, Philadelphia, 2002.
[16] W. Gautschi, Numerical Analysis, second ed., Birkhäuser, New York, 2012.
[17] W. Mascarenhas, The stability of barycentric interpolation at the second kind Chebyshev points, Numer. Math. 128 (2) (2014) 265–300.
[18] M. López-Fernández, S. Sauter, Fast and stable contour integration for high order divided differences via elliptic functions, Math. Comp. 84 (2015) 1291–1315.
[19] A. McCurdy, K.C. Ng, B.N. Parlett, Accurate computations of divided differences of the exponential functions, Math. Comp. 43 (168) (1984) 501–528.