U.S.S.R. Comput.Maths.Math.Phys.,Vol.27,No.6,pp.lO5-109,1987 Printed in Great Britain
0041~5553/87 $lO.CQ+O.CQ 01989 Pergamon Press plc
COARSENED METHODS OF CONJUGATE DIRECTIONS* A.B. BAKUSHINSKII and V.N. THUSHNIICOV
A modification of the methods of conjugate directions, in particular the conjugate - gradient method, is proposed. The modified methods are suitable for the numerical solution of ill-posed linear problems.
1. Conjugate - direction methods (CDM), in particular the conjugate - gradient method (CGM), are extremely popular procedures for unconditional minimization; they are widely used, for example, in the numerical solution of systems of linear algebraic equations (which is equivalent to the quadratic optimizaion problem). Along with the recent interest in approximate methods for solving ill-posed problems, such as linear operator equations with non-invertible operators (e.g., Fredholm equations of the first kind), there have been significant developments in the theory of numerical methods for solving such problems. One of the most important results has been the realization that standard linear iterative methods (i.e., methods with a linear transition operator) for linear problems (or quadratic optimization problems) can be used to solve the corresponding ill-posed problems, provided suitablecriteria are established for stopping the procedure (the number of iterations a(8) as a function of the accuracy of the input data , e.g., the right-hand side of the equation). The approximation thus obtained is an approximate solution in the sense of regularization theory /l/. It is natural to extend the consideration of such questions to non-linear iterative methods (i.e., methods with a non-linear transition operator). Typical representativesof this group of methods, together with the method of steepest descent, are the CDMs. Unfortunately, the results obtained to date are not as definitive as in the linear case. In addition, numerical experiments show that CDMs for the approximate solution of ill-posedproblems need some modification. In this paper, following the principle of iterative regularization /2/, we propose certain "coarsened" versions of CDMs. We have been able both to prove that these methods converge to the exact solution and to formulate a simple stopping rule. 2. We will outline some notions from the theory of regularization of linear equations that will be used below. Let A be a continuous linear operator from a Hilbert space 8, to a space HZ. The operator A need not be invertible, but in order to avoid unimportant technical details we shall assume that KerA-{O}. Consider the operator equation AZ-U, (1) where uMf,,z~H,, and s is an unknown element of HI. When Eq.(l) is being considered as a mathematical model for some real problem, the quantity on the right is usually known as a result of various measurements, which naturally involve inaccuracies. In many cases, the following error model is an adequate reflection of the real situation: instead of the "exact" right-hand side u, one if given a certain approximation E:[jZ-ullt6,where 6 is a known quantity. From the formal point of view, Z- the "approximate" solution of (1) - is obtained by applying a certain operator R(i7,6)--I.If it is also true that lim sup ~R(~?,6)-:~-0, 64 UI;-.ul~a
(2)
then R(li,6) is known as a regularizing algorithm (or operator) for solving Eq.(l) (see /l/j. It is readily seen that Eq.(2) is a formal version of the following important condition, which should presumably be satisfied by any "reasonable" approximate method: if the accuracy of the input data is successively improved (i.e., they tend to the exact data),:then the approximate solutions produced by the method should approach an exact solution. Unfortunately, it is, in principle, impossible to estimate the error of the approximation in (2) uniformly for all u (or z) when the operators A-' are unbounded. This is the characteristicproperty of an illposed problem. In practice - other conditions being equal - the "ideal" algorithms R should have the property that, given a fixed error level ii the value of R(Z,6) depends as "weakly" as possible on the actual location of iiin the neighbourhood of the exact u. Since one usually has no estimates of the accuracy of the approximations, the "quality" of a specific approximate solution, as in the well-posed case, must be checked on the basis of indirect criteria; e.g., one may verify that certain functionals of the approximate solution (suchastheresiduals, etc.) are small. The actual algorithms R(*,6) are usually constructed using continuous approximations of the solutions of Eq.(l). L.etR, bea sequence of continuous (not necessarily linear) operators defined throughout H,, suchthat limIjR,Az-z[]=O V=H,. (3) *-.. lZh.vychisl.Mat.mat.Fiz.,27,12,1763-1770,1987 105
106 In the classical situation, when the operator A in (1) has a continuous inverse, any convergent numerical method, essentially, generates a sequence of such operators. For example, in any convergent iterative method for solving cl), R, is an operator representing the action of n steps of the process on u (or d:). If u is known only approximately (Z:~jii--u[~96), one takes the element R.(E) as an approximate solution of Eq.(l). The error /Ill.(n)-~(1 is usually unknown. In well-posed problems, however, A-‘2 exists and is close to A-‘u for small 6, because A-’ is continuous; hence no a priori relationship is assumed between n and &Given Zone computes the operators R, as long as this is technically feasible or until the result is established up to some preassigned number of significant figures, and so on. It is assumed in such situations that the greater n , the more accuracte the computed A-'c, and for small & the values of A-‘Z and Z-A-'u are assuredly close together. In an ill-posed problem (l), i.e., when the operator does not possess a continuous inverse, many classical approximation schemes are formally applicable. It has been shown that Eq.(3) is then true, but it is, in principle, impossible to use such schemes in actual computations as just described for the well-posed case. Indeed, in the ill-posed case 1? may not belong to the image A(H,), and therefore limR,fi may not exist or, even if A-% does exist, it may deviate considerably from n-d) A-In, even if g and u are close together. In the ill-posed case, therefore, there is no point in computing R-ii with n increasing without limit. It turns out, however,thatone canmeaningfullyaskwhetherastoppingrule (orcriterion) exists -thespecificationofsomenumber n(6) Itisinthis sensethatonecanspeakofusingclassical suchthat (2)holds for R.,a,(ii)=R(E,6). iterativeprocessesfortheapproximatesolutionof ill-posedproblems (1). zany specific stopping criteria are known for linear iterative processes - iterations with a linear transition operator (see, e.g., /3, 4/). However, for linear processes (steepestdescent or CDM) it has not been possible to prove more than the approximatingproperty (3) (see, e.g., /5/J. The problem of stopping criteria that might produce regularizing algorithms (2) is as yet unsolved. Interesting results in this direction may be found in /6-g/. However, these papers adopt certain additional a priori assumptions. In this paper we shall demonstrate the use of iteratively regularized analogues (in the sense of /2/) of standard non-linear methods. These analogues are no more complicated to use than the underlying classical methods. Moreover, since they involve additional parameters (apart from the number of iterations), one can exert additional control on the numerical stability of the process, making it more sensitive to perturbation of the input data (in particular, the right-hand side u of cl)), which is usually of paramount improtanceinpractical work. 3. Assume that the operator A in (1) is a positive selfadjoint operator in a Hilbert space H. The important point is that A is not assumed to have either a bounded inverse, or indeed and R(A) (the range of A) is not closed. any inverse at all; i.e., in general, KerA#{O} Coarsened methods of conjugate directions will be constructed as follows. Assume we are given a certain integer La1 and a sequence (e.}>O, lime.-0. Additional n-.oI restrictions on the choice of (e.} will be formulated later. Choose some initial point zo=H. The iterations are constructed as follows. If t, has already been computed, r"+L is obtained by applying L steps of the CDM to the operator equation (A+e,E)r--u (4) with initial point .rn. There is a uniform scheme for proving the convergence of iteratively regularized CDMS /2/. The simplest instance of this scheme is obtained when the underlying method is that of conjugate errors /lo, p.356/. the iterations in this method are defined by If A-A'>O,
We may assume without loss of generality that 11.411-~.
(6)
If A,cE, l=-c=-0, it follows from convergence estimates for (51 (see /lo, pp.348-349/) that ~~r,-~~~~~~(~-~*)~"~lz~-z'll~. (7) an exact solution of Eq.(l). This estimate is adequate for an investigation of the iteratively regularized (coarsened)conjugate error method by the general scheme of /2/. Here
z’
is
Theorem 1. Let Eq.(l) be solvable and let r'be a normal solution (a solutionwithminimum norm); in addition assume that the normalization condition (6) is satisfied. If the sequence 1/2>(e,)>O satisfies the conditions
107
then for any La1 the sequence (2.) produced by the coarsened conjugate error method (41, (5) converges in the norm of H to a normal solution of (1). Proof. Let
II.
denote the solution of (4) (which exists and is unique since e,>O.
It
is shown in /2/ that
limllr."--l'ija = 0,
(9)
In...
Following /2/, we shall derive a recurrence relation for x.=+,--rI.(I. Using (7) and (lo), we have
Hence I E.--em+1I
1+2e,
x,+t G (l+e.)‘%+o
(
e.
1.
By (91, we deduce from (11) (see /2, ll/) that limx,-0. ,X-D
(12)
Combining this with (9), we obtain the statement of Theorem 1. The proof of the analogous results for other iteratively regularized CDMs is technically rather more complicated, since the estimate (7) is valid only in the "energy norm", which is different for each specific method. In particular, for the CGM, ~~Zs-t'~~~*~(l-C)"~~Z~-t'[~n, (ll~ll”A==bhd”)~ We shall need the following identity:
II~-~~lls~‘-f*~~~-fr~z’~, f*(r)==_(Az, z)-%(u, z). The iteratively regularized CGM is investigated as follows. Put
(13)
Il*j/sA.en,+j!,,. Proceeding as
in the proof of Theorem 1, we see that for any L>l,
Using (13), we can transform the right-hand side of (14):
IlG+r~. .*$+l-+~+l-~. (.h+,,Z.+,)
+E
,y”-
n+~~Znt~r~n+~~-2~u,~“+~~-l,1+‘~+,~,~~~n+r
~,(+,+~,1.+~)+2(u,1.+,)if,,+.“.,(r~
)-
)-L4~.+~,z~+d.
Consequently,
)I. Ilr.+r-z~n+,lL -ll~.+i-~.,llt - (e++,-cn)Il~n+,llz+[f.(~.~)-f.+~(~~.+,
(W
If we stipulate that {e,}be a monotone decreasing sequence, the first term on the right of (15) is negative, and may be omitted when (15) is substituted into (14). The second term on the right of (15) is easily estimated in absolute value, using (10) and (9). After the appropriate reduction we readily see that
With these remarks in mind, we deduce from (14) the following analogue of (11):
,.,,,(~)2,.+o(.y.q,
(16)
n
where xn-IIr,-r,,.(l~*. _ Noting that x,>enxn' and also that lim(ern+dE,)-
1
33-m
(by (a)), we deduce (12) from (16). We have thus proved the following theorem. Theorem 2. Under the assumptions of Theorem 1, if e.40, the sequence produced by the regularized conjugate gradient method (the method of steepest descent) converges in the norm
108
of X to a normal solution of Eq.(l). It is worth noting that when L-l iteratively regularized CDMs reduce to iteratively regularized IIa-processes" /12/ (the method of steepest descent, minimum error method, etc.). 4. Suppose now that instead of u one has Z:iiS-u,iG6. The following theorem establishes a stopping rule for the iteratively regularized conjugate error method (5) Theorem 3.
Under the assumptions of Theorem 1, if n(6) is so chosen that lim(6/en(a,)-0, d-4
(17)
then Eq.(Z) holds with R(S, 6)p&,(g), where f?,(m) is the operator representing n steps of the iteratively regularized conjugate error method.
Let (z,,(Z)}, {z,.(a)} denote the sequence produced by the iterative process for Proof. input I and the sequence of solutions of Eqs.(l), respectively. As in the proof of Theorem 1, one shows that
the estimate IIt,..,(r..(S)jl depends on 6, inawaynowto
be determined. By (101,
1~~...,(8)-Z..(~)il(ll~.n.,(~)~~le,--e.+,l/E,. But
l&...,~s~ll~ll~~..,~~~Il+ll~...~~u’~-~...~~~~ll. (n)-~...,(u)ll~O(6/~~+,), the norms il~..,(u)il are bounded independently of n and, by BY (2), llr..., the last relatioship of (8), en/en+,=O(l).Finally, Ilt...,(u')ll-0(6/e.). The inequality analogous to
(11) is A.+,g
(18)
Now fix the sequence (e.)and choose n(6) so that (17) is true. We may assume without loss of generality that b/e,<1 for any fixed 6,O
But
,(I91 The second term on the right was estimated in the course of proving (18):
lb. “(4, 03 -~~“(b,(~)Il-O(6k,~,). Recalling (9) and noting that n(b)+= as 6-0, we deduce the statement of the theorem from (19). Analogous assertions hold for other iteratively regularized CDMs. 5. We now consider a numerical example which, to our mind, demonstrates the value of iteratively regularized CDMs. In this example H,4ip=Elo, (lCO-dimensionalEuclidean space), the operator Aisgenerated by the Rilbert matrix
A-(“u”(~],
Li-l,..., lOO.
This matrix is non-singular, but system (1) with this matrix is so ill-posed that attempts to solve it produce all the effects characteristic for ill-posed problems. The computations proceeded as follows. The role of the exact solution was played by a on which basis the right-hand side u=Az was computed. Then, given vector z-(L,,...,z,~~)~, using the values u-(u~,...,u,~~)~, possibly distorted by noise, the exact solution was reconstructed by the above scheme. The computations were done in B-byte arithmetic. The results will be presented for the case in which the noise was assumed to have the A,-6(-i)',where 6 is a certain numerical parameter and L-O.Oii, i-l,..., form A=(A,, ....A..,)', 100. Different versions of the methods were compared in terms of the function $00 ERR(G,k)- &<(6,+-z,)', 1-1
(20)
with zti(6,0)-0, i--1,...,100‘ in all versions. The number k in (20) denotes the number of iterations in the specific algorithm. Fig.1
109
for the "pure" method (5). shows the form of ERR(&k) For illustrative purposes, we list a few individual values of the function: ERR(O,zO)O.OO21,ERR(O.OOI,20)=ii.22...,ERR(O.OOl. 20)>10'. It is clear from Fig.1 that if the input data are exact, the conjugate error method is quite efficient here: after 20 iterations the error falls to l/lo4 of its initial value, and when the iterations are continued the error decreases monotonically. In particular, if k-100 (accordingto the theory, this number of iterations should certainly produce the exact ERR(., 0) is strongly dependent on solution), ERR(0,1OO)=iO-'. However, for "large" Is,X-=-7, 6,andeven when 6 is not large (a relative error from 0.1 to 1%) there is a marked minimum with respect to k.
Fig.1
Fig.2
represents the performance of the iteratively regularized (coarsened)method (5). In the construction, we took L-10 andvaried E* each10 iterations: Fig.2
el,=[iO((k/lO] + 1)]-',k < 30, the
parameter was s>O. values of the function ERR(k, 6,e): ERR(20, 0, 0.001)-0.04873..., ERR(20, 0.001, 0X101)-0.617, .., ERR(20, 0.01, O.i)-2.96. . . . It is evident that the coarsened method also yields the exact solution fairly efficiently of its when the input data are exact. After 20 iterations, the error falls to ml/lo3 original value (s=O.OOl).However, the dependence of the function ERR(.,.) on 6 is markedly weaker than for the pure method (5). There is no sharp minimum even for large &This indicates that the method is relatively non-critical with regard to the choice of a stopping rule (fully agreeing with the theoretical result represented by Theorem 3) and illustrates the value of the iteratively regularized method (5) in the numerical solution of linear ill-posed problems. variable A few
REFERENCES 1. TIKHONOV A.N. and ARSENIN V.YA., Methods for solving ill-posed problems, Moscow, Nauka, 1979. A.B., Methods for solving monotone variational inequalities based on the 2. 3AKUSHINSKII principle of iterative regularization. Zh. Vychisl. Mat. mat. Fiz., 17, 6, 1350-1362, 1977. 3. BAKUSHINSKIIA.B., Selected problems in the approximate solution of ill-posed problems, MOSCOW, Izd-vo MGU, 1968. 4. JAINIKKO G-M., Methods for solving linear ill-posed problems in Hilbert spaces, Tartu, Izd-vo TGU, 1982. 5. EMIROVSKII A.S. and POLYAK B.T., Iterative methods for solving linear ill-posed problems given exact information, I. Izv. Akad. Nauk SSSR. Tekhn. Kibernetika, 2, 13-25, 1984. 6. XLYAZOV S.F., On the stable solution of linear operator equations of the first kind by the method of steepest descent. Vestnik MGU, Ser. 15, 3, 26-32, 1980. 7. SAW L., A family of non-linear iterative methods for solving ill-posed problems. Izv. Akad. Nauk EstSSR. Fiz.-Mat., 3, 261-268, 1982. Cm regularizing algorithms based on the method of conjugate gradients. Zh. 8. SILYAZOV S.F., Vychisl. Mat. mat. Fiz., 26, 1, 15-22, 1986. 9. !TCMIROVSKII A.S., Cm the regularizing properties of the method of conjugate gradients applied to ill-posed problems. Zh. Vychisl. Mat. mat. Fiz., 26, 3, 332-347, 1986. 10. SAMABSKII A.A. and NIKOLAYEV E.S., Methods for solving difference equations, Moscow, Nauka, 1978. 11. VASIL'YEVF.P.,Numericalmethods for solvingextremalproblems. Moscow, Nauka, 1980. P.P. et al., Approximate SOhtiOn of 12. KBASNOSEL'SKII M.A., VAINIKKO G.M. and ZABREIKO operator equations, Moscow, Nauka, 1969. Translated by D.L.