Multiplicative Iterative Algorithms for Convex Programming

P. P. B. Eggermont
Centre for Mathematics and Computer Science, PO Box 4079, 1009 AB Amsterdam, The Netherlands
and
Department of Mathematical Sciences, University of Delaware, Newark, Delaware 19716
Submitted by Tommy Elfving
ABSTRACT
We study multiplicative iterative algorithms for the minimization of a differentiable, convex function defined on the positive orthant of $\mathbb{R}^N$. If the objective function has compact level sets and a locally Lipschitz continuous gradient, then these algorithms converge to a solution of the minimization problem. Moreover, the convergence is nearly monotone in the sense of Kullback-Leibler divergence. Special cases of these algorithms have been applied in positron emission tomography and are formally related to the EM algorithm for positron emission tomography.
1. INTRODUCTION

In this paper we study multiplicative iterative algorithms for the minimization problem

$$\text{minimize } l(x) \quad\text{subject to}\quad x \ge 0, \tag{1.1}$$
where $l$ is a convex, continuously differentiable function on $\mathbb{R}^N$ with compact level sets and locally Lipschitz continuous gradient. The interest in such algorithms is sparked by the emergence of the EM algorithm for
maximum likelihood estimation in positron emission tomography (Shepp and Vardi [11]) and by the more or less ad hoc variation proposed by Daube-Witherspoon and Muehllehner [3]. The EM algorithm belongs to a class of algorithms which arise as the method of successive substitution for the complementarity equations of the Kuhn-Tucker conditions for a solution of (1.1), viz., $x$ is a solution of (1.1) if and only if
$$\nabla l(x) \ge 0, \qquad x \ge 0, \qquad x_j[\nabla l(x)]_j = 0, \quad j = 1,2,\ldots,N; \tag{1.2}$$

see e.g. Mangasarian [8]. The resulting algorithms have the form

$$x_j^{n+1} = x_j^n\{1 - \omega_n[\nabla l(x^n)]_j\}, \qquad j = 1,2,\ldots,N, \tag{1.3}$$
where $\omega_n$ is a relaxation/steplength parameter. Unfortunately these algorithms are rather complicated to analyze, even in the special case of the EM algorithm (Vardi et al. [14]). One notable feature of the convergence proof of the EM algorithm is the predominant role played by the Kullback-Leibler informational divergence between two vectors in $\mathbb{R}^N_+$, which we feel is only partly explained by the fact that the negative log-likelihood function can, up to a constant, be written as a Kullback-Leibler divergence [14]. The alternative interpretation is to consider these multiplicative algorithms as certain approximate "proximal point methods" for (1.1) (Rockafellar [10]), viz., $x^{n+1}$ is determined/interpreted as an approximate positive solution of the equation
$$x_j + \omega_n x_j[\nabla l(x)]_j = x_j^n, \qquad j = 1,2,\ldots,N, \tag{1.4}$$

or equivalently, via the Kuhn-Tucker conditions, as an approximate solution of the problem
$$\text{minimize } l(x) + \omega_n^{-1}\,d(x^n\|x) \quad\text{subject to}\quad x \ge 0, \tag{1.5}$$

where $d(x\|y)$ is the Kullback-Leibler divergence (see [7] or [14])

$$d(x\|y) = \sum_{j=1}^N x_j\log\frac{x_j}{y_j} + y_j - x_j. \tag{1.6}$$
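For readers who want to experiment numerically, the following minimal sketch of (1.6) may be helpful; it is not part of the original paper, and the function name and the use of NumPy are our own choices.

```python
import numpy as np

def kl_divergence(x, y):
    """Kullback-Leibler divergence d(x||y) of (1.6) for nonnegative x and positive y.

    The standard convention 0 * log 0 = 0 is used for zero entries of x.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        log_term = np.where(x > 0, x * np.log(x / y), 0.0)
    return float(np.sum(log_term + y - x))
```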
Since $d(x\|y) \ge 0$ always, and $d(x\|y)$ is convex in both $x$ and $y$, there is at least a superficial similarity between the problem (1.5) and the standard proximal point algorithms [10]. We will not exploit this connection, but it is obviously a very intriguing one. Besides the "exact" algorithm, in which $x^{n+1}$ is the unique positive solution of the equation

$$x_j^{n+1}\{1 + \omega_n[\nabla l(x^{n+1})]_j\} = x_j^n, \qquad j = 1,2,\ldots,N, \tag{1.7}$$

we can now state the more practical approximate method

$$x_j^{n+1}\{1 + \omega_n[\nabla l(x^n)]_j\} = x_j^n, \qquad j = 1,2,\ldots,N. \tag{1.8}$$

We refer to these two algorithms as the implicit and the explicit method, respectively. After appropriate scaling, the ISRA algorithm of Daube-Witherspoon and Muehllehner [3] is of the form (1.8).
The implicit algorithm (1.7) has a decided theoretical advantage, but is not an applicable algorithm as is, whereas (1.8) has the same computational complexity as the algorithm (1.3) and is still easy to analyze. The key ingredients in the convergence proof for the implicit algorithm (1.7) are two kinds of monotonicity. On the one hand we have that

$$l(x^n) > l(x^{n+1}) \quad\text{unless}\quad x^n = x^{n+1}, \tag{1.9}$$

as well as the unexpected

$$d(x^*\|x^n) \ge d(x^*\|x^{n+1}), \tag{1.10}$$

where $x^*$ is any accumulation point of $\{x^n\}_n$, thus showing that there is at most one accumulation point. The existence of such an accumulation point follows from the boundedness of $\{x^n\}_n$ [from (1.9) and the compact level sets assumption on $l$], and it is easily shown to be a solution of our minimization problem (1.1). The analysis of the explicit algorithm (1.8) is only slightly more complicated, but we need to make a suitable choice for $\omega_n$. Apart from establishing the crucial inequalities (1.9)-(1.10), these convergence proofs are virtually identical with the one given by Vardi et al. [14, Appendix] for the convergence of the EM algorithm.
Upon reflection, the difference between (1.3) and (1.7) is precisely the difference between the explicit and implicit Euler methods for the numerical solution of the system of ordinary differential equations

$$\frac{dx_j}{dt} = -x_j[\nabla l(x)]_j, \qquad j = 1,2,\ldots,N. \tag{1.11}$$
Indeed, the continuous analogues of the inequalities (1.9) and (1.10) are easily established. In the parlance of stability theory for ordinary differential equations these inequalities say that $l(x)$ and $d(x^*\|x)$ are Lyapunov functionals for the differential equation (1.11). For more details, and applications, see Hofbauer and Sigmund [4]. On the theoretical side this study sheds new light on the emergence of the Kullback-Leibler divergence in the theory of multiplicative iterative algorithms. There is a modest contribution on the practical side in that a special case of the algorithm (1.8) is the image space reconstruction algorithm of Daube-Witherspoon and Muehllehner [3] for emission tomography: the algorithm converges regardless of whether the solution is unique or not. In the case of uniqueness the convergence had been shown by de Pierro [5]. (In a later paper [6], de Pierro improves on this; see Section 6.2.) It should also be noted that an inequality similar to (1.10) for the original multiplicative iterative algorithm (1.3) does not seem to be available, except for the EM algorithm. See Vardi et al. [14, Appendix] for a complicated proof.
2. ASSUMPTIONS

In this section we state the precise assumptions on the objective function and make a choice for the relaxation parameters in the explicit algorithm. We let $\|\cdot\|_2$ ($\|\cdot\|_\infty$) denote the Euclidean (the maximum) norm on $\mathbb{R}^N$. We make three assumptions about $l$:

ASSUMPTION 2.1. The objective function $l(x)$ is continuously differentiable and convex on $\mathbb{R}^N_+$, so for all $x, y \in \mathbb{R}^N_+$,

$$(\nabla l(x) - \nabla l(y),\, x - y) \ge 0; \tag{2.1}$$

cf. Mangasarian [8]. Here $(\cdot\,,\cdot)$ is the usual inner product on $\mathbb{R}^N$.
ASSUMPTION 2.2. The level sets of $l$ are compact, i.e., for every $y \in \mathbb{R}^N_+$ the set

$$\{x \in \mathbb{R}^N_+ : l(x) \le l(y)\} \tag{2.2}$$

is compact.
ASSUMPTION 2.3. The objective function $l(x)$ has a locally Lipschitz continuous gradient, i.e., for every compact subset $C$ of $\mathbb{R}^N_+$ there exists a constant $L$ such that

$$\|\nabla l(x) - \nabla l(y)\|_2 \le L\|x - y\|_2 \quad\text{for all } x, y \in C. \tag{2.3}$$

We now choose the relaxation parameter $\omega_n$ in the algorithm (1.8) as $\omega_n = \omega(x^n)$, where

$$\omega(x) = \bigl[\max\{M,\ 2\|\nabla l(x)\|_\infty,\ 2L\|D(x)\nabla l(x)\|_\infty\}\bigr]^{-1}, \tag{2.4}$$

in which $M$ is an arbitrary, fixed constant, and $L$ is the Lipschitz constant of $\nabla l(x)$ for the line segment

$$\{x - tD(x)\nabla l(x) : 0 \le t \le (2\|\nabla l(x)\|_\infty)^{-1}\}. \tag{2.5}$$
Here $D(x)$ is the diagonal matrix with diagonal elements $x_1, x_2, \ldots, x_N$. For the algorithm (1.7) any $\omega_n$ will do, as long as $\inf_n \omega_n > 0$. The existence of $x^{n+1} > 0$ as "defined" by (1.7) follows from its interpretation as the unique, positive solution of (1.5): note that $d(x^n\|x)$ is strictly convex in $x$ and tends to infinity as $x$ does so. It is obvious that we need that $1 + \omega_n[\nabla l(x)]_j > 0$ for all $j$, so $\omega(x) < (2\|\nabla l(x)\|_\infty)^{-1}$ is a reasonable choice. The modification (2.4) is motivated by the desire that $l(x^n) > l(x^{n+1})$; see Lemma 3.1. We comment briefly on the above assumptions. Assumptions 2.1 and 2.2 are pretty much essential. Assumption 2.3 justifies the particular choice of $\omega(x)$ for which we can prove that $\{l(x^n)\}_n$ is decreasing. Any alternative version of Assumption 2.3 and choice of $\omega(x)$ for which this holds true should be fine, but some of the details will be different. A case in point is the treatment (in Section 6) of the image space reconstruction algorithm of Daube-Witherspoon and Muehllehner [3] with an implicit choice of $\omega_n$
which is unlikely to satisfy the above assumptions. It seems reasonable to expect that if $\omega_n$ is determined by line minimization, then a similar treatment will be possible.
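To fix ideas, here is a minimal sketch of one step of the explicit algorithm (1.8). It is not from the paper: it uses only the simple safeguard $\omega(x) = [\max\{M, 2\|\nabla l(x)\|_\infty\}]^{-1}$ discussed above, omitting the Lipschitz term of (2.4), and the function names are our own.

```python
import numpy as np

def explicit_step(x, grad_l, M=1.0):
    """One step of the explicit multiplicative algorithm (1.8).

    x      : current iterate (strictly positive NumPy array)
    grad_l : callable returning the gradient of l at x
    M      : the fixed constant appearing in the relaxation rule

    The relaxation parameter below is the simplified safeguard
    omega(x) = 1 / max(M, 2 * ||grad l(x)||_inf); the Lipschitz term
    of (2.4) is omitted in this sketch.
    """
    g = grad_l(x)
    omega = 1.0 / max(M, 2.0 * float(np.max(np.abs(g))))
    return x / (1.0 + omega * g)
```

Since $\omega(x)\|\nabla l(x)\|_\infty \le \tfrac{1}{2}$, every denominator lies in $[\tfrac{1}{2}, \tfrac{3}{2}]$, so a strictly positive iterate stays strictly positive, as required above.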
3. THE DOUBLE MONOTONICITY OF THE IMPLICIT ALGORITHM

In this section we give the deceptively simple proofs of the monotonicity relations (1.9) and (1.10). We define the mapping $H : \mathbb{R}^N_+ \to \mathbb{R}^N_+$ as follows. For $x > 0$ we let $y$ be the positive solution of the equation

$$y + \omega(x)\,D(y)\nabla l(y) = x \tag{3.1}$$

and define $H(x)$ as $H(x) = y$.

LEMMA 3.1. For any $x \in \mathbb{R}^N_+$ we have $l(x) > l(H(x))$ unless $x = H(x)$.

Proof. Let $y = H(x)$. By convexity we have

$$l(x) - l(y) \ge (\nabla l(y), x - y) = \omega(x)(\nabla l(y), D(y)\nabla l(y)).$$

Since $y > 0$, this shows that $l(x) - l(y) > 0$ unless $D(y)\nabla l(y) = 0$, in which case $x = y = H(x)$. ∎
We state the second monotonicity result in slightly different form. Let $m$ be any number with

$$\inf\{l(x) : x \ge 0\} \;\le\; m \;\le\; \inf_n l(x^n), \tag{3.2}$$

and let

$$\Omega = \{x^* \in \mathbb{R}^N_+ : l(x^*) \le m\}. \tag{3.3}$$

Every solution of (1.1) is in $\Omega$, so that $\Omega$ is not empty.

LEMMA 3.2. Let $x^* \in \Omega$. For any $x^0 > 0$ we have $d(x^*\|x^n) \ge d(x^*\|x^{n+1})$ for all $n$.
Proof. Since $x^0 > 0$, so are all $x^n$, and thus $d(x^*\|x^n)$ is well defined for all $n$. Now

$$d(x^*\|x^n) - d(x^*\|x^{n+1}) = \sum_j x_j^*\log\frac{x_j^{n+1}}{x_j^n} + x_j^n - x_j^{n+1}.$$

Since $\log(1+t)^{-1} = -\log(1+t) \ge -t$, we thus get that the above expression is greater than or equal to

$$\omega_n(\nabla l(x^{n+1}), x^{n+1} - x^*) \ge \omega_n[\,l(x^{n+1}) - l(x^*)\,],$$

with the last inequality due to convexity. By the choice of $x^* \in \Omega$ the last expression is nonnegative. ∎

The general convergence of the implicit algorithm is now easily shown along the outline of the convergence proof for the EM algorithm [14, Appendix]. Since the proof is essentially the same as for the explicit algorithm, we omit it here.

4. MONOTONICITY OF THE EXPLICIT ALGORITHM
In this section we prove the monotonicity results for the explicit algorithm. They are slightly messier than for the implicit case. Let $G : \mathbb{R}^N_+ \to \mathbb{R}^N_+$ be defined as

$$[G(x)]_j = \frac{x_j}{1 + \omega(x)[\nabla l(x)]_j}, \qquad j = 1,2,\ldots,N. \tag{4.1}$$

By the choice of $\omega(x)$, obviously $G(x) > 0$ if $x > 0$, and $[G(x)]_j = 0$ only if $x_j = 0$.
LEMMA 4.1. For all $x \ge 0$ we have $l(x) > l(G(x))$ unless $x = G(x)$.

Proof. From the convexity of $l$ we have

$$l(x) - l(G(x)) \ge (\nabla l(G(x)), x - G(x)),$$

which equals

$$(\nabla l(x), x - G(x)) - (\nabla l(x) - \nabla l(G(x)), x - G(x)).$$

Since $\nabla l$ is locally Lipschitz continuous, this expression is greater than or equal to

$$\omega(x)\sum_j \frac{x_j\,|[\nabla l(x)]_j|^2}{1 + \omega(x)[\nabla l(x)]_j}\left\{1 - \frac{\omega(x)Lx_j}{1 + \omega(x)[\nabla l(x)]_j}\right\} \tag{4.2}$$

for the appropriate constant $L$. By the choice of $\omega(x)$ this dominates

$$\tfrac{1}{4}\,\omega(x)\sum_j x_j\,|[\nabla l(x)]_j|^2 = \tfrac{1}{4}\,\omega(x)(\nabla l(x), D(x)\nabla l(x)), \tag{4.3}$$

and the conclusion follows as in the proof of Lemma 3.1. ∎

We state some consequences of this lemma before proceeding with the second monotonicity result.

COROLLARY 4.2. $\{l(x^n)\}_n$ is decreasing.

COROLLARY 4.3. $\{x^n\}_n$ is bounded, and every subsequence itself has a convergent subsequence.

Proof. Since $l(x^n) \ge l(x^{n+1})$, for every $n$ we have $x^n \in \{x \ge 0 : l(x) \le l(x^0)\}$, which is a compact set. ∎
COROLLARY 4.4. Let $V(x) = \omega(x)(\nabla l(x), D(x)\nabla l(x))$. Then

$$V(x) \le 4[\,l(x) - l(G(x))\,].$$
This last result just summarizes the proof of Lemma 4.1. We now introduce the following notation. For $x \in \mathbb{R}^N_+$ and $y > 0$ let

$$e(x\|y) = d(x\|y) + \frac{8[\,l(y) - l(x)\,]}{M}. \tag{4.4}$$

Note that $e(x\|y)$ is a measure for $\|x - y\|_2$, provided $x$ solves the minimization problem (1.1), or is otherwise known to satisfy $l(x) \le l(y)$. Let $m$ and $\Omega$ be defined by (3.2)-(3.3) with $\{x^n\}_n$ generated by the explicit algorithm.
LEMMA 4.5. Let $x^* \in \Omega$. Then $e(x^*\|x^n) \ge e(x^*\|x^{n+1})$ for all $n$.
Proof. Since $x^0 > 0$, we have $x^n > 0$ for all $n$, and so $d(x^*\|x^n)$ is well defined. Again

$$d(x^*\|x^n) - d(x^*\|x^{n+1}) = \sum_j x_j^*\log\frac{x_j^{n+1}}{x_j^n} + x_j^n - x_j^{n+1}.$$

Once more, $\log(1+t)^{-1} \ge -t$, and so the above expression dominates

$$\omega_n(\nabla l(x^n), x^n - x^*) - \sum_j \frac{\omega_n^2\,x_j^n\,[\nabla l(x^n)]_j^2}{1 + \omega_n[\nabla l(x^n)]_j}.$$

By the choice of $\omega_n$ the last sum is dominated by $2\omega_n V(x^n)$ (see Corollary 4.4) and thus by $8[\,l(x^n) - l(x^{n+1})\,]/M$. We have thus shown that

$$d(x^*\|x^n) - d(x^*\|x^{n+1}) \ge \omega_n[\,l(x^n) - l(x^*)\,] - \frac{8[\,l(x^n) - l(x^{n+1})\,]}{M},$$

where we have once more used convexity. Since $l(x^n) \ge l(x^*)$, the result follows. ∎

5. CONVERGENCE OF THE EXPLICIT ALGORITHM
With the Lemmas 3.1-3.2 and 4.1-4.5 in hand, the convergence proofs of the two algorithms are virtually identical. We will just consider the explicit algorithm, since this is the slightly more complicated case. The outline of the proof is from Vardi et al. [14, Appendix].

LEMMA 5.1. The sequence $\{x^n\}_n$ generated by the explicit algorithm converges.
Proof. By Corollary 4.3 the sequence $\{x^n\}_n$ has a convergent subsequence $\{x^{n_k}\}_k$ with limit $x^*$, say. Then $l(x^*) = \lim_{k\to\infty} l(x^{n_k})$. We may now apply Lemma 4.5 to our $x^*$; thus $e(x^*\|x^n) \ge e(x^*\|x^{n+1})$ for all $n$. Now for our subsequence $\{x^{n_k}\}$ we have $e(x^*\|x^{n_k}) \to 0$ as $k \to \infty$, and by the monotonicity then

$$\lim_{n\to\infty} e(x^*\|x^n) = 0.$$

Then also $d(x^*\|x^n) \to 0$, and the convergence of $\{x^n\}_n$ to $x^*$ follows. ∎
We finally need to show that the $x^*$ obtained above solves the minimization problem (1.1). First we have that $x^*$ is a fixed point of the iteration $x^{n+1} = G(x^n)$, so that

$$x_j^*[\nabla l(x^*)]_j = 0, \qquad j = 1,2,\ldots,N. \tag{5.1}$$

We have of course also that

$$x_j^* \ge 0, \qquad j = 1,2,\ldots,N. \tag{5.2}$$
In order to show that $\nabla l(x^*) \ge 0$, consider for a fixed $j$ the ratios $r^n = x_j^{n+1}/x_j^n$. Since $\{x^n\}_n$ is bounded, it follows that $\liminf_{n\to\infty} r^n \le 1$, or

$$\liminf_{n\to\infty}\ \omega_n[\nabla l(x^n)]_j \ge 0.$$

It follows that $[\nabla l(x^*)]_j \ge 0$. Since $j$ was arbitrary,

$$[\nabla l(x^*)]_j \ge 0, \qquad j = 1,2,\ldots,N. \tag{5.3}$$

Thus, $x^*$ satisfies the Kuhn-Tucker conditions, and is a solution of the minimization problem (1.1). We have thus proven
THEOREM 5.2. The sequence $\{x^n\}_n$ generated by the explicit algorithm starting from an $x^0 > 0$ converges to a solution of the minimization problem (1.1).

It should be remarked that in the above proof we can show that $x^*$ satisfies the Kuhn-Tucker conditions only after we have shown that $\{x^n\}_n$ converges to $x^*$. In the absence of Lemma 4.5 we do not have this convergence, and indeed it does not appear to be easy to prove that $x^*$ is optimal only from knowing that a subsequence of $\{x^n\}$ converges to $x^*$ [so $l(x^*) = \lim_{n\to\infty} l(x^n)$], without some further assumptions on $l(x)$.
6. APPLICATIONS TO LINEAR MODELS

In this section we discuss some applications of multiplicative iterative algorithms. We consider the following class of convex programming problems. Suppose some physical phenomenon is modeled by the system of linear equations

$$Ax = b, \tag{6.1}$$
where $A \in \mathbb{R}^{M\times N}$ is a nonnegative matrix with nonzero column sums, the data vector $b \in \mathbb{R}^M$ is nonnegative, and $x \in \mathbb{R}^N_+$ describes the physical quantity to be identified. Due to all kinds of approximations in the modeling process (linearization, statistical effects), the notion of a good solution of the system (6.1) needs to be clarified. A common choice is to interpret (6.1) in the least squares sense. The "solution" of (6.1) is then defined to be the solution of the minimization problem

$$\text{minimize } \|Ax - b\|_2^2 \quad\text{subject to}\quad x \ge 0. \tag{6.2}$$

Slightly more generally, let $L(y)$ be a strictly convex function on $\mathbb{R}^M_+$, with
compact level sets and locally Lipschitz continuous gradient. Typically, $L$ will also depend on the data vector $b$, but we suppress this dependence in the notation. Now choose as interpretation of (6.1)

$$\text{minimize } L(Ax) \quad\text{subject to}\quad x \ge 0, \tag{6.3}$$

and we are interested in applying the explicit multiplicative iterative algorithm to the function $l(x) = L(Ax)$. We first show that $l$ satisfies the assumptions of Section 2. First of all, it is easy to show that $l(x)$ is convex and has a locally Lipschitz continuous gradient. It should be remarked that
iterative algo1 satisfies the Z(x) is convex remarked that
Z(X) is not strictly convex if A has a nontrivial nullspace. There remains the question of compact level sets of Z(X). To settle this, note that for fixed y E rWy the level set of L
is compact,
and so is its intersection
with
Thus it follows that for every z E rW$’ the set
{Ax : Z(x) =sZ(z),
x E
lq}
(6.4)
is compact. Since A is nonnegative and has nonzero column sums, we have that N(A)n lRy = {O}, where N(A) = {x E IWN: AX = 0) is the nullspace of A, and so the compactness of (6.4) implies the compactness of
(x E i.e.,
2 has compact
level
sets.
rwy:Z(r) Thus
<
Z(z)},
Z satisfies
all our assumptions.
The conclusion is that the algorithm

$$x^0 > 0, \qquad x_j^{n+1} = \frac{x_j^n}{1 + \omega_n[A^T\nabla L(Ax^n)]_j}, \qquad j = 1,2,\ldots,N,$$

converges to a solution of the minimization problem (6.3).
6.1. Constrained Least Squares
We already mentioned the minimization problem (6.2). It is not clear whether the original multiplicative algorithm (1.3) converges in case $N(A) \ne \{0\}$. In case $N(A) = \{0\}$ the solution of (6.2) is unique, and it is not hard to show that then convergence follows. The explicit multiplicative algorithm

$$x_j^{n+1} = \frac{x_j^n}{1 + \omega_n[A^T(Ax^n - b)]_j}, \qquad j = 1,2,\ldots,N, \tag{6.5}$$

converges for appropriate $\omega_n$, whether $A$ has full column rank or not.
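A short sketch of the iteration (6.5) for nonnegative least squares, reusing the simplified relaxation rule from Section 2, may make this concrete; the starting point, iteration count, and names are our own illustrative choices, and the factor 2 of the gradient of $\|Ax - b\|_2^2$ is absorbed into $\omega_n$ as in (6.5).

```python
import numpy as np

def nnls_multiplicative(A, b, n_iter=500, M=1.0):
    """Explicit multiplicative iteration (6.5) for min ||Ax - b||^2 over x >= 0.

    Assumes A is nonnegative with nonzero column sums and b is nonnegative.
    The relaxation is the simplified safeguard omega_n = 1 / max(M, 2*||g||_inf),
    where g = A^T (A x^n - b).
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.ones(A.shape[1])              # any strictly positive start will do
    for _ in range(n_iter):
        g = A.T @ (A @ x - b)
        omega = 1.0 / max(M, 2.0 * float(np.max(np.abs(g))))
        x = x / (1.0 + omega * g)
    return x
```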
6.2. Image Space Reconstruction Algorithm (ISRA)
If we apply a diagonal scaling in the constrained least squares problem, we get

$$\text{minimize } \|AD(\lambda)y - b\|_2^2 \quad\text{subject to}\quad y \ge 0, \tag{6.6}$$

where $D(\lambda)$ is a diagonal matrix with diagonal elements $\lambda_1, \lambda_2, \ldots, \lambda_N$ which are all strictly positive, and we take $x = D(\lambda)y$ as our solution of (6.1). The algorithm (6.5) applied to (6.6) gives the iteration formula

$$y_j^{n+1} = \frac{y_j^n}{1 + \omega_n\lambda_j[A^T(AD(\lambda)y^n - b)]_j}, \qquad j = 1,2,\ldots,N,$$

and scaling back to $x^n$,

$$x_j^{n+1} = \frac{x_j^n}{1 + \omega_n\lambda_j[A^T(Ax^n - b)]_j}, \qquad j = 1,2,\ldots,N. \tag{6.7}$$

In emission tomography the column sums of $A$ are normalized to one,

$$\sum_{i=1}^M a_{ij} = 1, \qquad j = 1,2,\ldots,N,$$

and the above algorithm is applied with $\omega_n = 1$ and $\lambda_j^{-1} = [A^Tb]_j$. The resulting algorithm has the elegant form

$$x_j^{n+1} = x_j^n\,\frac{[A^Tb]_j}{[A^TAx^n]_j}, \qquad j = 1,2,\ldots,N. \tag{6.8}$$

This algorithm is the image space reconstruction algorithm (ISRA) of Daube-Witherspoon and Muehllehner [3]. de Pierro [5] proves that for (6.8)

$$\|Ax^n - b\|_2^2 \ge \|Ax^{n+1} - b\|_2^2, \tag{6.9}$$

and proves convergence when the solution of (6.2) is unique. When there is no uniqueness, it is tempting to appeal to Lemma 4.5 and Section 5. However, Lemma 4.5 has been proved for a restricted choice of $\omega_n$ only. We prove here an analogue of Lemma 4.5 for the present circumstances. First we quote the result (6.9) as proved by de Pierro [5].
LEMMA 6.1. Let $D_n \in \mathbb{R}^{N\times N}$ be the diagonal matrix with diagonal elements $x_j^n/[A^TAx^n]_j$, $j = 1,2,\ldots,N$. Then

$$\|Ax^n - b\|_2^2 - \|Ax^{n+1} - b\|_2^2 \ge (x^n - x^{n+1}, D_n^{-1}(x^n - x^{n+1})).$$
MULTIPLICATIVE ITERATIVE ALGORITHMS matrix with diagonal elements
e(x*llr)
LEMMA6.2. Proof.
hj = [Arb]j,
= d(Ax*)lhx)+
39
j = 1,2,. . . , N. Finally
bll; - &4x*-
&4x -
let
big.
e(x*((x”) > e(x*J1x”+l).
As before,
we have
$$d(\Lambda x^*\|\Lambda x^n) - d(\Lambda x^*\|\Lambda x^{n+1}) = \sum_j \lambda_j x_j^*\log\frac{x_j^{n+1}}{x_j^n} + \lambda_j(x_j^n - x_j^{n+1}).$$

Since $\log t = -\log t^{-1} \ge 1 - t^{-1}$, the last expression dominates

$$\sum_j \lambda_j x_j^*\,\frac{x_j^{n+1} - x_j^n}{x_j^{n+1}} + \lambda_j(x_j^n - x_j^{n+1}),$$

which equals

$$\sum_j \frac{\lambda_j(x_j^n - x_j^{n+1})(x_j^{n+1} - x_j^*)}{x_j^{n+1}}.$$

Since $\lambda_j/x_j^{n+1} = [A^TAx^n]_j/x_j^n$, the above expression is equal to

$$(x^n - x^{n+1}, D_n^{-1}(x^{n+1} - x^*)) = -(x^n - x^{n+1}, D_n^{-1}(x^n - x^{n+1})) + (x^n - x^{n+1}, D_n^{-1}(x^n - x^*)).$$

Since $D_n^{-1}(x^n - x^{n+1}) = A^T(Ax^n - b)$, the last inner product can be written as

$$(x^n - x^*, A^T(Ax^n - b)),$$
which, by convexity, dominates

$$\tfrac{1}{2}\bigl[\|Ax^n - b\|_2^2 - \|Ax^* - b\|_2^2\bigr].$$

By Lemma 6.1, this last expression is nonnegative. Summarizing now, we have shown that

$$d(\Lambda x^*\|\Lambda x^n) - d(\Lambda x^*\|\Lambda x^{n+1}) \ge -(x^n - x^{n+1}, D_n^{-1}(x^n - x^{n+1})),$$

which, again by Lemma 6.1, dominates

$$-\|Ax^n - b\|_2^2 + \|Ax^{n+1} - b\|_2^2,$$

and the lemma follows. ∎
The above two lemmas now imply the convergence of the algorithm (6.8), by the material in Section 5.
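For completeness, here is a minimal sketch of the ISRA iteration (6.8); the initialization and iteration count are our own illustrative choices, not part of the original paper.

```python
import numpy as np

def isra(A, b, n_iter=100):
    """Image space reconstruction algorithm (6.8):
    x_j^{n+1} = x_j^n * [A^T b]_j / [A^T A x^n]_j.

    Assumes A >= 0 with nonzero column sums and b >= 0, so that the
    denominators stay positive and a positive start stays positive.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.ones(A.shape[1])
    ATb = A.T @ b
    for _ in range(n_iter):
        x = x * ATb / (A.T @ (A @ x))
    return x
```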
It should be remarked that (6.8) has been applied in remote sensing for the reconstruction of temperature profiles (see Chahine [1], Chu [2], Twomey [13], and references therein), but then in the form [cf. Equation (6.1)]

$$x_j^{n+1} = x_j^n\,\frac{b_j}{[Ax^n]_j}, \qquad j = 1,2,\ldots,N. \tag{6.10}$$
This must be regarded with even more skepticism than (6.8), since the above algorithm necessarily generates an oscillating sequence $\{x^n\}_n$ if the system (6.1) is not consistent. The same criticism would apply to the variation proposed by Twomey [12]. Note that de Pierro [5] considers the minimization problem

$$\text{minimize } (x, Mx) - 2(b, x) \quad\text{subject to}\quad x \ge 0,$$

with $M$ positive definite and $M > 0$ elementwise. Our theory applies to this case as well, even when $M$ is only nonnegative definite (but still $M > 0$ elementwise), since then the level sets remain compact. In [6], de Pierro studies the above minimization problem without the assumption of definiteness (convexity), but with $M \ge 0$ elementwise, and with strictly positive diagonal.
6.3. Maximum Likelihood Estimation in Emission Tomography
In emission tomography the problem can be stated as the minimization of $l(x) = d(b\|Ax)$ over $\mathbb{R}^N_+$. Here $d(\cdot\|\cdot)$ is the Kullback-Leibler divergence (1.6). The EM algorithm of Shepp and Vardi [11] has the simple form

$$x_j^{n+1} = x_j^n[A^Tr^n]_j, \qquad j = 1,2,\ldots,N, \tag{6.11}$$

with $r_i^n = b_i/[Ax^n]_i$, $i = 1,2,\ldots,M$. It is a special case of the algorithm (1.3); in this case

$$x_j^{n+1} = x_j^n\{1 - \omega_n(1 - [A^Tr^n]_j)\}, \qquad j = 1,2,\ldots,N. \tag{6.12}$$
Vardi et al. [14] were able to prove Lemma 4.1 (easy) and Lemma 4.5 (hard) in this case, but it only works for $0 < \omega_n \le 1$. However, Lewitt and Muehllehner [9] in their experiments advocated the choice $\omega_n = 4$ after a few iterations, so convergence in this case is still an open question. In our version of the algorithm we would have

$$x_j^{n+1} = \frac{x_j^n}{1 + \omega_n(1 - [A^Tr^n]_j)}, \qquad j = 1,2,\ldots,N, \tag{6.13}$$

and there are no restraints on the size of $\omega_n$ other than the (easily modifiable) choice (2.4); see the comments on this choice in Section 2. It remains of course to be seen what the practical difference (if any) is between the algorithms (6.12) and (6.13).
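To make the comparison concrete, here is a small sketch of the relaxed EM step (6.12) next to the step (6.13); it assumes the columns of $A$ sum to one, and the function names are our own. For (6.13), $\omega_n$ must still keep the denominators positive, which the choice of Section 2 guarantees.

```python
import numpy as np

def em_step(x, A, b, omega=1.0):
    """Relaxed EM step (6.12); omega = 1 gives the Shepp-Vardi step (6.11).

    Assumes the columns of A sum to one, A >= 0, b >= 0, and x > 0.
    """
    r = b / (A @ x)                      # r_i^n = b_i / [A x^n]_i
    return x * (1.0 - omega * (1.0 - A.T @ r))

def multiplicative_step(x, A, b, omega):
    """Step (6.13): the explicit multiplicative algorithm for l(x) = d(b||Ax)."""
    r = b / (A @ x)
    return x / (1.0 + omega * (1.0 - A.T @ r))
```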
REFERENCES

1. M. T. Chahine, Inverse problems in radiative transfer: Determination of atmospheric parameters, J. Atmospheric Sci. 27:960-967 (1970).
2. W. P. Chu, Convergence of Chahine's nonlinear relaxation inversion method used for limb viewing remote sensing, Appl. Opt. 24:445-447 (1985).
3. M. E. Daube-Witherspoon and G. Muehllehner, An iterative image space reconstruction algorithm suitable for volume ECT, IEEE Trans. Med. Imaging MI-5:61-66 (1986).
4. J. Hofbauer and K. Sigmund, The Theory of Evolution and Dynamical Systems, Cambridge U.P., Cambridge, 1988.
5. A. R. de Pierro, On the convergence of the iterative image space reconstruction algorithm for volume ECT, IEEE Trans. Med. Imaging MI-6:174-175 (1987).
6. A. R. de Pierro, Nonlinear relaxation methods for solving symmetric linear complementarity problems, J. Optim. Theory Appl., to appear.
7. F. Liese and I. Vajda, Convex Statistical Distances, Teubner, Leipzig, 1987.
8. O. L. Mangasarian, Nonlinear Programming, McGraw-Hill, New York, 1969.
9. R. M. Lewitt and G. Muehllehner, Accelerated iterative reconstruction for positron emission tomography based on the EM algorithm for maximum likelihood estimation, IEEE Trans. Med. Imaging MI-5:16-22 (1986).
10. R. T. Rockafellar, Monotone operators and the proximal point algorithm, SIAM J. Control Optim. 14:877-898 (1976).
11. L. A. Shepp and Y. Vardi, Maximum likelihood reconstruction in emission tomography, IEEE Trans. Med. Imaging MI-1:113-122 (1982).
12. S. Twomey, Comparison of constrained linear inversion and an iterative nonlinear algorithm applied to the indirect estimation of particle size distribution, J. Comput. Phys. 18:188-200 (1975).
13. S. Twomey, Introduction to the Mathematics of Inversion in Remote Sensing and Indirect Measurements, Elsevier, Amsterdam, 1977.
14. Y. Vardi, L. A. Shepp, and L. Kaufman, A statistical model for positron emission tomography, J. Amer. Statist. Assoc. 80:8-38 (1985).

Received 23 August 1988; final manuscript accepted 28 April 1989