The effect of rounding errors in iterational processes


U.S.S.R. Comput. Maths. Math. Phys., Vol. 25, No. 4, pp. 7-13, 1985. Printed in Great Britain.

0041-5553/85 $10.00+0.00 Pergamon Journals Ltd.

THE EFFECT OF ROUNDING ERRORS IN ITERATIONAL PROCESSES*

E.I. FILIPPOVICH

The effect of rounding errors on the accuracy of the result in iterational processes with an ultralinear convergence is examined. The well-known result for processes with linear convergence follows from the estimates obtained. It is shown that in the case of ultralinear convergence the overall effect of rounding errors on all the steps of the iterations is equivalent to the effect on several of the last steps.

Introduction. Suppose an iterational process for the quantities $x_k\in X$ is defined in the metric space $X$ with the distance $\rho(x,x')$ for the elements $x,x'\in X$, and it is established that the rate of convergence of the approximation $x_{k+1}$ to the exact value $x^*$ is given by the relation

$$\rho(x_{k+1},x^*)\le q\,\rho(x_k,x^*)^{\nu}, \tag{1}$$

where $q$ and $\nu$ are certain constants which do not depend on $k$. When $\nu=1$, $q<1$ the convergence is called linear, and when $\nu>1$ it is called ultralinear /1/.

Because of rounding errors, the approximate values $\tilde x_{k+1}$, rather than the accurate values $x_{k+1}$, are obtained in the calculations. Suppose at step $k$ the value $\tilde x_k$ is obtained and $x_{k+1}$ is the accurate value which would be obtained from $\tilde x_k$ if there were no rounding errors, while $\tilde x_{k+1}$ is the corresponding value distorted by the rounding errors. We shall further assume that for all $k\ge 1$

$$\rho(\tilde x_k,x_k)\le\sigma, \tag{2}$$

i.e. the rounding errors at all steps have an upper limit set by the quantity $\sigma$. Under these conditions the following questions are of practical importance:

I. What is the asymptotic form of the behaviour of the quantities $\rho(\tilde x_k,x^*)$ as $k\to\infty$?

II. Since the value $x^*$ remains unknown, and only the values $\tilde x_k$ are known, can we obtain estimates for the quantity $\rho(\tilde x_k,x^*)$ using the values $\rho(\tilde x_k,\tilde x_{k+1})$ under conditions (1) and (2)?

III. Usually, when implementing the iterational process, it is required that the final approximation $\tilde x_k$ for $x^*$ should satisfy the relation

$$\rho(\tilde x_k,x^*)\le\varepsilon \tag{3}$$

and the process stops if for some fairly large $k$

$$\rho(\tilde x_k,\tilde x_{k+1})\le\delta, \tag{4}$$

where $\varepsilon$ and $\delta$ are certain small numbers determining the accuracy with which $x^*$ can be found. The quantities $\varepsilon$ and $\delta$ must obviously be consistent with each other and with the quantity $\sigma$. It is clear, for example, that it is absurd to seek the solution $x^*$ with the accuracy $\varepsilon<\sigma$ or to assume $\delta<\sigma$, since in the sphere $\{x:\rho(x,x^*)$ [...]

[...] For processes with linear convergence the well-known estimates are

$$\rho(\tilde x_k,x^*)\le\frac{\sigma}{1-q}+q^k\rho(\tilde x_0,x^*), \tag{5}$$

$$\limsup_{k\to\infty}\rho(\tilde x_k,x^*)\le\frac{\sigma}{1-q}. \tag{6}$$
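One way to see where an estimate of the form (5) comes from is to unroll the one-step bound. The following is only a sketch, assuming $\nu=1$, $0<q<1$, the rounding bound (2), and relation (1) applied to the exact step taken from $\tilde x_k$; $\rho_k$ is used here as shorthand for $\rho(\tilde x_k,x^*)$:

```latex
\begin{aligned}
\rho_{k+1} &\le \rho(\tilde x_{k+1},x_{k+1}) + \rho(x_{k+1},x^*) \le \sigma + q\rho_k ,\\
\rho_k &\le q^k\rho_0 + \sigma\,(1+q+\dots+q^{k-1}) \le q^k\rho_0 + \frac{\sigma}{1-q} .
\end{aligned}
```

Letting $k\to\infty$ in the last line gives the limiting bound $\sigma/(1-q)$, in agreement with (6).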

The corresponding estimates for ultralinear processes are established below, and estimates (5) and (6) follow from the general theory as a special case. The whole complex of questions I-IV is examined.

The value $\nu$ is usually relatively easily obtained in the theory of the specific iterational method, whereas it is often difficult to establish an estimate of the constant $q$, and its value depends not only on the method chosen, but also on the individual peculiarities of the problem being solved. For example, inverse interpolation methods /1/, for which $\nu>3$, are well-known. For the secant and Stevenson method $\nu=1.618$ (see /1/), and for the Newton and Newton-Kantorovich method (see /2, 3/) $\nu=2$. In all the above cases the value of $q$ is estimated using the behaviour of the derivatives of the functions or operators considered, and it is difficult, as a rule, to obtain estimates.

In this paper it is shown that, unlike linearly convergent processes, for processes with ultralinear convergence there are simple estimates for $\rho(\tilde x_k,x^*)$. They depend only on $\nu$ and $\sigma$, do not depend on $q$, and enable us to answer questions I-III, which is essential for the practical utilization of the results obtained.

*Zh. vychisl. Mat. mat. Fiz., 25, 7, 973-982, 1985

1. The asymptotic form of the quantities $\rho(\tilde x_k,x^*)$

Using the triangle inequality and conditions (1) and (2), we obtain

$$\rho_{k+1}\le q\rho_k^{\nu}+\sigma, \tag{1.1}$$

where $\rho_k\equiv\rho(\tilde x_k,x^*)$. Inequality (1.1) is fundamental for what follows.

The following lemma is required.

Lemma. If:

1) $F(z_p,\dots,z_0)$ is a function which does not decrease with respect to each of the arguments $z_i$, $i=0,1,\dots,p$, in the domain of definition;

2) $y_{n+1}\le F(y_n,\dots,y_{n-p})$ for all $n\ge p$;

3) $u_{n+1}$ is found recurrently from the relation

$$u_{n+1}=F(u_n,\dots,u_{n-p}) \tag{1.2}$$

for all $n\ge p$;

4) $y_i\le u_i$, $i=0,1,\dots,p$,

then

$$y_n\le u_n\quad\forall n\ge 0. \tag{1.3}$$

Proof. According to condition 4), inequality (1.3) holds for all $i=0,1,\dots,p$. We shall assume that it holds for all indices up to some $r\ge p$. Then, according to conditions 2), 1) and 3),

$$y_{r+1}\le F(y_r,\dots,y_{r-p})\le F(u_r,\dots,u_{r-p})=u_{r+1},$$

which proves (1.3) by induction.

The lemma enables us to proceed from an analysis of inequalities of the type of condition 2) to an investigation of equations of type (1.2), which is often simpler. In accordance with what has been asserted, if we proceed from inequality (1.1) to the equation

$$\eta_{k+1}=q\eta_k^{\nu}+\sigma \tag{1.4}$$

and investigate the asymptotic form of $\eta_k$ as $k\to\infty$, we can obtain an estimate of the quantities $\rho_k$ as $k\to\infty$, assuming $\eta_0=\rho_0$. We can consider relation (1.4) as setting the iterational process of finding the root of the equation

$$\eta=q\eta^{\nu}+\sigma. \tag{1.5}$$

The function $f(\eta)=q\eta^{\nu}+\sigma$ when $\nu>1$ and $\eta>0$ is convex (downwards). Depending on $\sigma$, Eq.(1.5) can have two, one or no real roots. If we solve Eq.(1.5) graphically, using (1.4), it is easy to satisfy ourselves that the values $\eta_k$ approach either the least root $\underline\eta$ if $\eta_0<\bar\eta$, where $\bar\eta$ is the largest root, or $\infty$ when $\eta_0>\bar\eta$. The critical value of the quantity $\sigma$ for which Eq.(1.5) has one root corresponds to the tangency of the straight line $f(\eta)=\eta$ and the curve $f(\eta)=q\eta^{\nu}+\sigma$, whilst

$$\sigma_{cr}=(1-\nu^{-1})(q\nu)^{1/(1-\nu)}. \tag{1.6}$$

Thus, for an asymptotic estimate of the quantities $\rho_k$ as $k\to\infty$ it is sufficient to estimate from above the minimum root $\underline\eta$ of Eq.(1.5), having real positive roots, under the condition

$$\sigma\le\sigma_{cr}. \tag{1.7}$$
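The behaviour of the comparison sequence (1.4) can be checked numerically. The sketch below is illustrative only: the function names, the sample values $q=1$, $\nu=2$, $\sigma=0.01$ and the blow-up threshold are assumptions made for the example and are not taken from the paper.

```python
import math


def majorant_limit(q, nu, sigma, eta0, steps=200, blow_up=1e6):
    """Iterate eta_{k+1} = q * eta_k**nu + sigma, i.e. relation (1.4).

    Returns the last iterate, or math.inf as soon as the sequence exceeds
    the (arbitrary, illustrative) threshold `blow_up`.
    """
    eta = eta0
    for _ in range(steps):
        eta = q * eta ** nu + sigma
        if eta > blow_up:
            return math.inf
    return eta


def sigma_critical(q, nu):
    """Critical rounding level (1.6): (1 - 1/nu) * (q*nu)**(1/(1 - nu))."""
    return (1.0 - 1.0 / nu) * (q * nu) ** (1.0 / (1.0 - nu))


if __name__ == "__main__":
    q, nu, sigma = 1.0, 2.0, 0.01      # sample values with sigma < sigma_critical
    print("sigma_cr:", sigma_critical(q, nu))
    # Start below the larger root: the iterates settle at the smaller root.
    print("eta0 = 0.9:", majorant_limit(q, nu, sigma, eta0=0.9))
    # Start above the larger root: the iterates grow without bound.
    print("eta0 = 1.0:", majorant_limit(q, nu, sigma, eta0=1.0))
```

With these sample values the larger root of (1.5) lies just below 0.99, so the two starting points fall on either side of it.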
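Theorem 1. If

$$0\le\sigma\le\sigma_{cr}, \tag{1.8}$$

then the following hold:

1) for the roots $\underline\eta$ and $\bar\eta$ of Eq.(1.5) the following estimates hold:

$$\sigma\le\underline\eta\le\frac{\sigma}{1-q\,[\sigma/(1-\nu^{-1})]^{\nu-1}}\le\frac{\sigma}{1-\nu^{-1}}, \tag{1.9a}$$

$$(q\nu)^{1/(1-\nu)}\le\bar\eta\le q^{1/(1-\nu)}; \tag{1.9b}$$

2) in the case of strict inequalities in (1.8) the strict inequalities in (1.9) also hold.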

Proof. 1. When $\sigma=0$, $\underline\eta=0$ follows from (1.5) and $\bar\eta=q^{1/(1-\nu)}$. Using the graphical solution, it is easy to satisfy ourselves that as $\sigma$ increases the root $\underline\eta$ increases and the root $\bar\eta$ decreases, such that when $\sigma=\sigma_{cr}$ they are identical. Hence follows the right-hand inequality in (1.9b). Since $\bar\eta$ is the repulsion point of the process (1.4), the derivative, with respect to $\eta$, of the right-hand side of (1.5) when $\eta=\bar\eta$ is greater than, or equals, unity, whence follows the left-hand inequality in (1.9b).

Suppose, further, that $g(\eta)=q\eta^{\nu}+\sigma-\eta$. Then $g(\sigma)=q\sigma^{\nu}\ge 0$ and under the conditions (1.8)

$$g\!\left(\frac{\sigma}{1-\nu^{-1}}\right)=\frac{\sigma}{\nu-1}\left[\left(\frac{\sigma}{\sigma_{cr}}\right)^{\nu-1}-1\right]\le 0.$$

Therefore, $\underline\eta\in[\sigma;\ \sigma/(1-\nu^{-1})]$, which gives the extreme inequalities in (1.9a). We shall write (1.5) in the form $\eta=\sigma/(1-q\eta^{\nu-1})$ and shall substitute the larger value $\sigma/(1-\nu^{-1})$ into the right-hand side instead of $\eta$. The denominator will decrease but, taking account of (1.8) and (1.6), will remain positive, whence follows the inequality in (1.9a) which is second from the left.

2. When $0<\sigma<\sigma_{cr}$, $g(\sigma)>0$, $g(\sigma/(1-\nu^{-1}))<0$ and, therefore, we can formulate the strict inequalities in (1.9a). Since the root $\bar\eta$ decreases from $q^{1/(1-\nu)}$ when $\sigma=0$ to $(q\nu)^{1/(1-\nu)}$ when $\sigma=\sigma_{cr}$ as $\sigma$ increases, the strict inequalities in (1.9b) follow from the strict inequalities in (1.8).

Corollary 1. When conditions (1.8) are satisfied, for the quantities $\rho(\tilde x_k,x^*)$ the asymptotic form

[...] (1.10)

holds, where $\lim_{k\to\infty}\varepsilon_k^{(1)}=\lim_{k\to\infty}\varepsilon_k^{(2)}=0$, and

[...] (1.11)

Bearing in mind that [...], it is easy to obtain (6) from (1.11).

Corollary 2. If for some $k_0\ge 0$ [...], then, using the relations [...], we obtain [...] and, therefore,

$\rho(\tilde x_k,x^*)\le$ [...] (1.12)

whence

[...] (1.13)
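The root bounds of Theorem 1 can be spot-checked numerically. The sketch below assumes the bounds in the form $\sigma\le\underline\eta\le\sigma/(1-q[\sigma/(1-\nu^{-1})]^{\nu-1})\le\sigma/(1-\nu^{-1})$ and $(q\nu)^{1/(1-\nu)}\le\bar\eta\le q^{1/(1-\nu)}$, as they follow from the proof; the function names, the use of bisection and the sample parameters are choices made for this illustration only.

```python
def roots_eq_1_5(q, nu, sigma, iters=200):
    """Both roots of eta = q*eta**nu + sigma, found by bisection.

    Assumes nu > 1 and 0 < sigma < sigma_cr, so two positive roots exist.
    """
    def g(eta):
        return q * eta ** nu + sigma - eta

    def bisect(lo, hi):
        # On entry g(lo) and g(hi) have opposite signs.
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if (g(mid) > 0) == (g(lo) > 0):
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    eta_t = (q * nu) ** (1.0 / (1.0 - nu))  # tangency point, where g is minimal (< 0)
    eta_0 = q ** (1.0 / (1.0 - nu))         # larger root for sigma = 0; g(eta_0) = sigma
    return bisect(0.0, eta_t), bisect(eta_t, eta_0)


def bounds_1_9(q, nu, sigma):
    """Upper bounds for the smaller root and both bounds for the larger root."""
    loose = sigma / (1.0 - 1.0 / nu)
    tight = sigma / (1.0 - q * loose ** (nu - 1.0))
    return tight, loose, (q * nu) ** (1.0 / (1.0 - nu)), q ** (1.0 / (1.0 - nu))


if __name__ == "__main__":
    q, nu, sigma = 10.0, 1.618, 1e-4        # sample values with sigma < sigma_cr
    small, large = roots_eq_1_5(q, nu, sigma)
    tight, loose, low, high = bounds_1_9(q, nu, sigma)
    print(sigma <= small <= tight <= loose)  # chain for the smaller root
    print(low <= large <= high)              # bounds for the larger root
```

Bisection is used here only because the sign pattern of $g$ on each bracket is known in advance; any bracketing root finder would serve.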

Relations (1.12) and (1.13) are a generalization of (5) and (6) for the ultralinear case.

Notes. 1. When [...] the quantities $\eta_k$, defined by (1.4), decrease monotonically as $k$ increases and approach $\underline\eta$. Therefore $k_0$ exists, such that when $k>k_0$ the inequalities (1.10) [...]

2. [...] the estimates

[...] (1.14)

which do not contain the constant $q$, an unknown or difficult-to-estimate parameter. Note that as $\nu\to 1$ the right-hand sides of (1.14) approach $\infty$ and these estimates lose their meaning.

3. Given the quantities $q$, $\nu$, $\sigma$, we can immediately calculate the roots $\underline\eta$ and $\bar\eta$ and use them for an asymptotic estimate of the quantities $\rho(\tilde x_k,x^*)$ and to analyse the convergence conditions of the type $\rho_0<\bar\eta$. Since the limits of the roots are known from (1.9), we can use either the secant method or one of the other known methods. For small $\sigma$ Newton's method rapidly converges with the starting values $\underline\eta_0=\sigma$ for the smaller root and $\bar\eta_0$ for the larger root. For methods with quadratic convergence ($\nu=2$)

$$\underline\eta=\frac{2\sigma}{1+(1-4q\sigma)^{1/2}},\qquad \bar\eta=\frac{1+(1-4q\sigma)^{1/2}}{2q}.$$

It should be noted, however, that for practical purposes in many cases estimates (1.14) are quite sufficient, since the length of the interval $[\sigma;\ \sigma/(1-\nu^{-1})]$, which contains the root $\underline\eta$, equals $\sigma/(\nu-1)$ and decreases as $\nu$ increases, being $1.618\sigma$ for $\nu=1.618$ and $\sigma$ or $0.5\sigma$ for $\nu=2$ or $\nu=3$. Therefore the root $\underline\eta$ can differ from $\sigma/(1-\nu^{-1})$ by a factor of not more than 2-3, which for small $\sigma$ is unimportant in practice for estimating the order of magnitude of $\rho(\tilde x_k,x^*)$.

4. When $\nu=1.618$, 2, 3 the coefficient $(1-\nu^{-1})^{-1}$ on the right-hand side of (1.14) equals 2.618, 2, 1.5 respectively. Thus, in ultralinearly convergent processes the effect of rounding errors on the final result does not asymptotically exceed the sum of the errors of the last 2-3 steps of the iteration for large $k$.

5. Table 1 shows the values $\sigma_{cr}$ for different $\nu$ and $q$. It is obvious from it that, for the important practical values of $\nu$, for calculations to eight decimal places, condition (1.7) holds even for $q=10^4$, if $\sigma<5\cdot10^{-8}$. Thus, condition (1.7) is not too restrictive, and we can always assume it is satisfied by increasing, if necessary, the number of digits in the calculations, for example by proceeding to double precision in the computer calculations.

6. When there are no rounding errors ($\sigma=0$), it is easy to obtain

$$\rho_k\le q^{1/(1-\nu)}\left[\rho_0\,q^{1/(\nu-1)}\right]^{\nu^k},$$

so that convergence is guaranteed if $\rho_0<q^{1/(1-\nu)}$. When there are rounding errors, according to (1.9b) convergence is guaranteed for $\rho_0=\eta_0$ if

$$\rho_0<\nu^{1/(1-\nu)}q^{1/(1-\nu)}. \tag{1.15}$$

For $\nu=1.618$, 2 and 3 the values of the coefficient $\nu^{1/(1-\nu)}$ of $q^{1/(1-\nu)}$ in (1.15) equal 0.45904, 0.5 and 0.57735 respectively. Thus, for important practical values of $\nu$ the convergence of the initial process, when there are rounding errors, will be guaranteed if the initial error $\rho_0$ is roughly half the size of the initial error which guarantees convergence when there are no rounding errors. It is easy to verify that when $\nu\ge 1$ the inequalities $e^{-1}\le\nu^{1/(1-\nu)}\le 1$ hold, whilst equalities are achieved in the limit as $\nu\to 1+0$ and $\nu\to\infty$. Therefore, for all $\nu>1$, convergence of the initial process will certainly be guaranteed if the initial error $\rho_0$ is one third of the error which guarantees convergence when $\sigma=0$.

The above provides an answer to Question IV.

Table 1. Values of $\sigma_{cr}$

q        ν = 1.618              ν = 2               ν = 3
1        $1.7533\cdot10^{-1}$   $2.5\cdot10^{-1}$   $3.8490\cdot10^{-1}$
10       $4.2241\cdot10^{-3}$   $2.5\cdot10^{-2}$   $1.2172\cdot10^{-1}$
10^2     $1.0177\cdot10^{-4}$   $2.5\cdot10^{-3}$   $3.8490\cdot10^{-2}$
10^3     $2.4518\cdot10^{-6}$   $2.5\cdot10^{-4}$   $1.2172\cdot10^{-2}$
10^4     $5.9070\cdot10^{-8}$   $2.5\cdot10^{-5}$   $3.8490\cdot10^{-3}$
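The entries of Table 1 follow directly from formula (1.6), so they can be regenerated with a few lines of code. This is a minimal sketch; the function name `sigma_cr` and the print layout are choices made here, and $\nu=1.618$ is used as the rounded value quoted in the text.

```python
def sigma_cr(q, nu):
    """Critical value (1.6): (1 - 1/nu) * (q*nu)**(1/(1 - nu)), for nu > 1."""
    return (1.0 - 1.0 / nu) * (q * nu) ** (1.0 / (1.0 - nu))


if __name__ == "__main__":
    nus = (1.618, 2.0, 3.0)
    # One row per value of q, one column per value of nu.
    print("q".ljust(8) + "".join(f"nu={nu:<10}" for nu in nus))
    for q in (1, 10, 100, 1000, 10000):
        print(str(q).ljust(8) + "".join(f"{sigma_cr(q, nu):<13.4e}" for nu in nus))
    # Coefficient (1 - 1/nu)**-1 mentioned in Note 4.
    for nu in nus:
        print(nu, 1.0 / (1.0 - 1.0 / nu))
```

For $\nu=2$ the formula reduces to $1/(4q)$, and for $\nu=3$ to $(2/3)(3q)^{-1/2}$, which is a quick way to sanity-check the two right-hand columns by hand.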

2. Estimates of $\rho(\tilde x_k,x^*)$ using $\rho(\tilde x_k,\tilde x_{k+1})$

Stopping the initial iterational process when condition (4) holds requires caution. Generally speaking, inequality (4) characterizes the nearness of the successive approximations $\tilde x_k$ and $\tilde x_{k+1}$, and not that of the approximation $\tilde x_k$ to $x^*$. Methods and problems in which convergence is slow at the initial stages, and accelerates as $x^*$ is approached, are well-known. In these cases, for an unfortunate choice of $\delta$, inequality (4) can hold at the initial stages, not because the specified accuracy of the solution is already achieved, but because of the low rate of convergence, when the successive values $\tilde x_k$ and $\tilde x_{k+1}$ differ only slightly. Without going into details of the difficulties which thus emerge, we note that we can only judge the accuracy $\rho(\tilde x_k,x^*)$ of the approximation $\tilde x_k$ to $x^*$ using the quantity $\rho(\tilde x_k,\tilde x_{k+1})$ at the final stages, when $\tilde x_{k+1}$ is already close to $x^*$. In other words, for a convergent iterational process we must know the asymptotic relation between $\rho(\tilde x_k,x^*)$ and $\rho(\tilde x_k,\tilde x_{k+1})$ for large $k$, and, when condition (4) holds, we must know an estimate of $\rho(\tilde x_k,x^*)$ to be able to verify condition (3).

For complex problems with a large number of calculations it is sometimes sufficient to find a solution with a relatively low accuracy, corresponding to the accuracy of the setting of the initial data. The calculations stop for $\varepsilon$ and $\delta$ much greater than $\sigma$. In these cases we can expect that, although the process is stopped with respect to condition (4) for large $\delta$, it is nevertheless possible that the approximation $\tilde x_k$ obtained is much nearer to $x^*$ than follows from (4). The corresponding investigation is presented below.

Using the triangle inequality in the form

$$\rho(\tilde x_{k+1},\tilde x_k)\le\rho(\tilde x_{k+1},x^*)+\rho(\tilde x_k,x^*) \tag{2.1}$$

and the results of Sect.1, we obtain that for fairly large $k$ and $\sigma$ satisfying (1.7),

$$\rho(\tilde x_k,\tilde x_{k+1})\le\frac{2\sigma}{1-\nu^{-1}}, \tag{2.2}$$

or, more accurately,

$$\rho(\tilde x_k,\tilde x_{k+1})\le\begin{cases}2\underline\eta & \text{for } \nu>1,\\ 2\sigma/(1-q) & \text{for } \nu=1.\end{cases}$$

Thus, for fairly large $k$ the quantities $\rho(\tilde x_k,\tilde x_{k+1})$ become of the same order as $\sigma$.
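A small experiment illustrates how the step between successive iterates settles at the level of the rounding error. The sketch below uses Newton's iteration for $\sqrt{2}$ (so $\nu=2$) and Python's built-in `round` as a stand-in for fixed-point rounding to eight decimals; the function name, the starting value and the choice of test problem are assumptions made for this illustration.

```python
import math


def rounded_newton_sqrt(a, x0, decimals=8, steps=12):
    """Newton's iteration x <- (x + a/x)/2 with every iterate rounded.

    round(..., decimals) models fixed-point rounding, so the rounding bound
    is approximately sigma = 0.5 * 10**(-decimals).
    """
    xs = [x0]
    for _ in range(steps):
        xs.append(round(0.5 * (xs[-1] + a / xs[-1]), decimals))
    return xs


if __name__ == "__main__":
    sigma, nu = 0.5e-8, 2.0
    xs = rounded_newton_sqrt(2.0, 1.5)
    last_step = abs(xs[-1] - xs[-2])
    error = abs(xs[-1] - math.sqrt(2.0))
    print("last step:", last_step, "bound 2*sigma/(1 - 1/nu):", 2 * sigma / (1 - 1 / nu))
    print("error:    ", error, "bound sigma/(1 - 1/nu):  ", sigma / (1 - 1 / nu))
```

Once the iterates reach the rounded fixed point, further steps no longer improve the result, which is the situation the bound (2.2) describes.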

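Theorem 2. If condition (1.7) holds and $\rho(\tilde x_k,\tilde x_{k+1})>\sigma$, then for the quantities $\rho(\tilde x_k,x^*)$ for large $k$ the following bilateral estimates hold asymptotically:

$$\rho(\tilde x_k,x^*)\le\frac{\rho(\tilde x_k,\tilde x_{k+1})+\sigma}{1-q(1-\nu^{-1})^{1-\nu}[\rho(\tilde x_k,\tilde x_{k+1})+\sigma]^{\nu-1}}\le\frac{\rho(\tilde x_k,\tilde x_{k+1})+\sigma}{1-\nu^{-1}}, \tag{2.3}$$

$$\rho(\tilde x_k,x^*)\ge\frac{\rho(\tilde x_k,\tilde x_{k+1})-\sigma}{1+q[\rho(\tilde x_k,\tilde x_{k+1})-\sigma]^{\nu-1}}\ge\frac{\rho(\tilde x_k,\tilde x_{k+1})-\sigma}{1+\nu^{-1}(1-\nu^{-1})^{\nu-1}[\rho(\tilde x_k,\tilde x_{k+1})/\sigma-1]^{\nu-1}}\ge\frac{\rho(\tilde x_k,\tilde x_{k+1})-\sigma}{1+[\rho(\tilde x_k,\tilde x_{k+1})/\sigma-1]^{\nu-1}}. \tag{2.4}$$

Proof. Using the triangle inequality $\rho(\tilde x_k,x^*)\le\rho(\tilde x_k,\tilde x_{k+1})+\rho(\tilde x_{k+1},x^*)$ and bearing (1.1) in mind, we obtain

$$\rho_k\le q\rho_k^{\nu}+\sigma_k, \tag{2.5}$$

where $\sigma_k=\sigma+\rho(\tilde x_k,\tilde x_{k+1})$.

As we can easily satisfy ourselves graphically, inequality (2.5) is possible in two cases: $\rho_k\le\underline\eta(\sigma_k)$ or $\rho_k\ge\bar\eta(\sigma_k)$. According to Sect.1, for fairly large $k$ the quantity $\rho_k$ does not asymptotically exceed $\underline\eta(\sigma)$, and since $\sigma_k>\sigma\Rightarrow\underline\eta(\sigma_k)>\underline\eta(\sigma)$, only

$$\rho(\tilde x_k,x^*)\le\underline\eta(\sigma_k)\le\frac{\rho(\tilde x_k,\tilde x_{k+1})+\sigma}{1-\nu^{-1}}$$

is possible for fairly large $k$. Further, as in Sect.1, the middle part of inequality (2.3) is proved by replacing $\sigma$ by $\sigma_k$.

To prove (2.4) we shall consider (2.1) once more. Using (1.1) we obtain

$$\rho_k\ge\hat\sigma_k-q\rho_k^{\nu}, \tag{2.6}$$

where $\hat\sigma_k=\rho(\tilde x_k,\tilde x_{k+1})-\sigma$. Inequality (2.6) is only possible when

$$\rho_k\ge\hat\eta(\hat\sigma_k), \tag{2.7}$$

where $\hat\eta(\hat\sigma_k)$ is the only root of the equation $\eta=\hat\sigma_k-q\eta^{\nu}$. The function $g(\eta)=q\eta^{\nu}+\eta-\hat\sigma_k$ equals $-\hat\sigma_k$ when $\eta=0$ and $q\hat\sigma_k^{\nu}$ when $\eta=\hat\sigma_k$. Therefore $\hat\eta(\hat\sigma_k)\in(0;\hat\sigma_k)$ and, consequently,

$$\hat\eta(\hat\sigma_k)<\hat\sigma_k. \tag{2.8}$$

Now from the relation $\hat\eta(\hat\sigma_k)=\hat\sigma_k/[1+q\hat\eta^{\nu-1}(\hat\sigma_k)]$, using (2.8) and (2.6), we obtain

$$\hat\eta(\hat\sigma_k)\ge\frac{\rho(\tilde x_k,\tilde x_{k+1})-\sigma}{1+q[\rho(\tilde x_k,\tilde x_{k+1})-\sigma]^{\nu-1}}. \tag{2.9}$$

The first inequality in (2.4) follows from (2.7) and (2.9). The second inequality is obtained from the first using (1.7) in the form

$$q\le\nu^{-1}(1-\nu^{-1})^{\nu-1}\sigma^{1-\nu}.$$

The last inequality (2.4) is obtained from the second if we take into account that $(\nu-1)^{\nu-1}/\nu^{\nu}\le 1$ for all $\nu>1$.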

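Corollary 3. When $\nu>1$ the estimates

$$\frac{\rho(\tilde x_k,\tilde x_{k+1})-\sigma}{1+[\rho(\tilde x_k,\tilde x_{k+1})/\sigma-1]^{\nu-1}}\le\rho(\tilde x_k,x^*)\le\frac{\rho(\tilde x_k,\tilde x_{k+1})+\sigma}{1-\nu^{-1}}, \tag{2.10}$$

obtained from the extreme inequalities (2.3) and (2.4), hold asymptotically for large $k$.

Corollary 4. For linearly convergent processes, when $\nu=1$ we obtain from the first inequalities (2.3) and (2.4)

$$\frac{\rho(\tilde x_k,\tilde x_{k+1})-\sigma}{1+q}\le\rho(\tilde x_k,x^*)\le\frac{\rho(\tilde x_k,\tilde x_{k+1})+\sigma}{1-q}. \tag{2.11}$$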
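The two-sided estimates of Corollaries 3 and 4 translate directly into a helper that turns an observed step $\rho(\tilde x_k,\tilde x_{k+1})$ into an error interval. The sketch below assumes the estimates in the form obtained from the extreme inequalities of (2.3) and (2.4) for $\nu>1$, and from their first inequalities with $\nu=1$ for the linear case; the function names and sample numbers are hypothetical.

```python
def bounds_ultralinear(step, sigma, nu):
    """Asymptotic bounds on rho(x~_k, x*) for nu > 1.

    `step` is the observed distance rho(x~_k, x~_{k+1}); requires step > sigma > 0.
    """
    lower = (step - sigma) / (1.0 + (step / sigma - 1.0) ** (nu - 1.0))
    upper = (step + sigma) / (1.0 - 1.0 / nu)
    return lower, upper


def bounds_linear(step, sigma, q):
    """Asymptotic bounds on rho(x~_k, x*) for linear convergence, 0 < q < 1."""
    lower = (step - sigma) / (1.0 + q)
    upper = (step + sigma) / (1.0 - q)
    return lower, upper


if __name__ == "__main__":
    # Sample numbers: rounding level 1e-8 and an observed step of 3e-8.
    print(bounds_ultralinear(3e-8, 1e-8, nu=2.0))
    print(bounds_linear(3e-8, 1e-8, q=0.5))
```

Only the upper bound is needed for a stopping test; the lower bound indicates how pessimistic that test may be for a given step size.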

Notes. 7. Conditions (1.10), (1.11) and (1.13) provide estimates of the limiting values of the reduction in the error $\rho(\tilde x_k,x^*)$, but do not enable us to establish the value of $k$ for which this really happens, which would be important for determining the step for stopping the iterational process. The right-hand inequalities (2.10) and (2.11) enable us to estimate the error $\rho(\tilde x_k,x^*)$ using the known quantity $\rho(\tilde x_k,\tilde x_{k+1})$ and, therefore, to formulate the stop criterion for achieving a specified accuracy for obtaining $x^*$.

8. As can be seen from the left-hand inequalities (2.10) and (2.11), a discontinuation of the calculations for relatively large $\rho(\tilde x_k,\tilde x_{k+1})$ gives, generally speaking, a large error $\rho(\tilde x_k,x^*)$. At the same time the presence of bilateral estimates enables us to judge how far the approximate solution $\tilde x_k$ is from the accurate solution $x^*$ if it is obtained under condition (4).

3. Agreement of the accuracy of the solution with the rounding errors and the rate of convergence.

In accordance with estimate (2.2), condition (4) is attainable if $\delta$ is chosen such that when $\nu>1$

$$\delta\ge\frac{2\sigma}{1-\nu^{-1}}\equiv s_\nu\sigma. \tag{3.1}$$

Similarly, when $\nu=1$ condition (4) is attainable if

$$\delta\ge\frac{2\sigma}{1-q}\equiv s_q\sigma. \tag{3.2}$$

When condition (4) holds, condition (3) is attainable for fairly large $k$ if

$$\varepsilon\ge\begin{cases}(\delta+\sigma)/(1-\nu^{-1}), & \nu>1,\\ (\delta+\sigma)/(1-q), & \nu=1.\end{cases} \tag{3.3}$$

Relations (3.3) are obtained from (3) taking (2.10) and (2.11) into account. Using (3.1) and (3.2), we can write conditions (3.3) in the form

$$\varepsilon\ge r_\nu\sigma\ \ (\nu>1),\qquad \varepsilon\ge r_q\sigma\ \ (\nu=1). \tag{3.4}$$

Table 2. Coefficients $s_\nu$ and $r_\nu$ for different $\nu$ ($\nu=1.618$, [...])

Table 3. Coefficients $s_q$ and $r_q$ for different $q$ ($q=0.8$, 0.9, 0.95, [...])
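The coefficients $s$ and $r$ can be computed from (3.1)-(3.3). The sketch below is a hedged reading, not the paper's own tabulation: it assumes that $r$ is obtained by substituting the smallest admissible $\delta=s\sigma$ into (3.3), and the function name `s_and_r` and the sample values of $\nu$ and $q$ are chosen here for illustration.

```python
def s_and_r(gap):
    """Coefficients s and r for a given gap.

    gap = 1 - 1/nu for ultralinear convergence (nu > 1),
    gap = 1 - q    for linear convergence (nu = 1, 0 < q < 1).
    Assumption: r follows from (3.3) with delta set to its minimum s*sigma.
    """
    s = 2.0 / gap               # delta >= s * sigma, from (3.1) or (3.2)
    r = (s + 1.0) / gap         # eps >= (delta + sigma)/gap with delta = s*sigma
    return s, r


if __name__ == "__main__":
    for nu in (1.618, 2.0, 3.0):
        print("nu =", nu, s_and_r(1.0 - 1.0 / nu))
    for q in (0.8, 0.9, 0.95):
        print("q  =", q, s_and_r(1.0 - q))
```

Under this reading, $\nu=2$ gives $s=4$ and $r=10$, which is consistent with the statement in the text that $\delta$ should exceed $\sigma$ by roughly one decimal order and $\varepsilon$ by one to two.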

Tables 2 and 3 present the values of the coefficients $s_\nu$, $r_\nu$, $s_q$ and $r_q$ for different $\nu$ and $q$. As can be seen from this data, in the widely known methods with $\nu>1$, for agreement of the accuracy $\delta$ must be roughly 1 decimal or 3 binary orders greater than $\sigma$, and the quantity $\varepsilon$ must be 1-2 decimal or 4-5 binary orders greater than $\sigma$. The corresponding quantities for linear processes for large $q$ must be 2-3 decimal orders greater than the quantity $\sigma$. We can roughly assume that, for agreement of the accuracy, the quantity $\delta$ must be chosen approximately 1 decimal order less than $\varepsilon$ and 1-2 decimal orders greater than $\sigma$. Naturally, in cases when a lower accuracy is permissible, the quantities $\delta$ and $\varepsilon$ can be chosen greater than those stated.

Infringement of the conditions of agreement of the accuracy (e.g. choosing $\delta$ of the same order as $\sigma$) can and, as the examples show, does lead to the phenomenon, paradoxical at first glance, when the iterational process first rapidly converges and then recycles near the solution $x^*$.

Conclusion. 1. On the basis of the above, we can draw the following conclusions:

1) in ultralinear iterational processes rounding errors are not accumulated, do not exceed the sum of the rounding errors of several of the last steps and, for fairly small $\sigma$ (condition (1.7)), their effect does not depend in practice on the constant $q$, but only on the index $\nu$ and on $\sigma$;

2) in practical calculations the required accuracy of the result and the condition for stopping the iterational process must be in agreement with the rounding errors $\sigma$ and with the rate of convergence, e.g. in the form (3.1)-(3.4).

2. The estimate $\sigma$ of the rounding errors at one step of the iterations is not considered, since it must be produced individually for each method, taking into account the word length (for computer calculations) and, possibly, the features of the specific problem.

3. Strictly speaking, our analysis is related to the absolute errors for fixed-point calculations; it is easy to see, however, that its basic conclusions also remain valid when analysing relative errors for floating-point calculations. For example, if the space $X$ is normed, then instead of (1) we can write

$$\|x_{k+1}-x^*\|\le q\,\|\tilde x_k-x^*\|^{\nu},$$

or, proceeding to the relative errors,

$$\frac{\|x_{k+1}-x^*\|}{\|x^*\|}\le q'\left(\frac{\|\tilde x_k-x^*\|}{\|x^*\|}\right)^{\nu},$$

where $q'=q\|x^*\|^{\nu-1}$. Inequality (2) is replaced by

$$\|\tilde x_{k+1}-x_{k+1}\|\le\sigma\|x_{k+1}\|,$$

where $\sigma$ is the relative error of one step of the iterations for floating-point calculations. The basic inequality (1.1) is replaced by

$$\rho_{k+1}\le q'\rho_k^{\nu}+\sigma_{k+1},\qquad \sigma_{k+1}=\sigma\|x_{k+1}\|/\|x^*\|.$$

For a convergent iterational process it is obvious that

$$\lim_{k\to\infty}\sigma_k=\sigma,$$

and, consequently, the basic conclusions concerning the asymptotic form of the quantities $\rho_k=\|\tilde x_k-x^*\|/\|x^*\|$ will remain valid. There are no fundamental difficulties, although the above is complicated by certain details.

4. The results discussed relate to iterational methods usually considered. Methods of successive approximations exist, however, which are formally analogous to iterational processes with linear and ultralinear convergence, in which an abnormal accumulation of rounding errors can occur that does not correspond to the results presented above.

REFERENCES

1. OSTROVSKII A.M., The solution of equations and sets of equations. Moscow: Izd-vo inostr. lit., 1963.
2. KRASNOSEL'SKII M.A. et al., An approximate solution of operator equations. Moscow: Nauka, 1969.
3. KANTOROVICH L.V. and AKILOV G.P., Functional analysis. Moscow: Nauka, 1977.

Translated by H.Z.

U.S.S.R. Comput. Maths. Math. Phys., Vol. 25, No. 4, pp. 13-25, 1985. Printed in Great Britain.

0041-5553/85 $10.00+0.00 Pergamon Journals Ltd.

THE CONVERGENCE OF DIFFERENCE APPROXIMATIONS AND THE REGULARIZATION OF OPTIMAL-CONTROL PROBLEMS FOR ELLIPTIC EQUATIONS*

F.V. LUBYSHEV

The problem of minimizing a quadratic functional, which depends on solving an equation of the elliptic type with variable coefficients, is considered. The boundary condition and right-hand side of the equation serve as the control. The weak continuity of the functional in the space of the controls is proved. Using the summator-identity method, the problem is reduced to an optimal-control difference problem. The convergence of the difference scheme to a generalized solution of the initial boundary value problem is proved. The functional convergence and weak control convergence are established. The process of regularizing difference approximations to obtain a minimizing sequence which strongly converges in the space of the controls is carried out.

Many papers consider the problems of approximating different classes of extremal problems (see, e.g., /1/ and the literature cited therein). These questions have been fairly fully examined for optimal-control problems connected with ordinary equations, and have a much less completed form for control problems connected with equations in partial derivatives.

This paper considers the optimal-control problem with a quadratic functional determined by the solutions of an elliptic-type equation with variable coefficients and mixed boundary conditions, with control in the boundary condition and in the right-hand side of the equation. The paper touches upon /2-5/, where it is shown that the functional $Z(g)$ of the problem is convex and $Z(g)\in C^{1,1}(H)$, where $H$ is the space of the controls; optimality conditions in the form of variational inequalities are obtained; it is shown that $U_*\subset H$, the set of points of the minimum of $Z(g)$, is non-empty, convex, closed and bounded, and any minimizing sequence for $Z(g)$ converges to $U_*$ weakly in $H$. In fact, as will be shown below, the functional $Z(g)$ is weakly continuous in $H$.

A difference approximation using the summator-identity method /6, 7/ is considered for an approximate solution of the control problem. This approach is particularly suitable when approximating optimal-control problems in which the functional is determined by solutions of problems which differ from the first boundary value problem, since it enables us to simplify the complex problem of constructing difference analogues of boundary conditions and to preserve the basic properties of the boundary value problem operator for the difference operator also.

*Zh. vychisl. Mat. mat. Fiz., 25, 7, 983-1000, 1985