A Theoretical and Computational Method for Determining Optimal Treatment Schedules in Fractionated Radiation Therapy* K. J. ALMQUIST AND
H. T. BANKS Lefschetz Center for Dynamical Providence, R. I. 02912 Communicated
Systems, Division of Applied Mathematics,
Brown Uniwrsi@
by R. Vasudevan
ABSTRACT Using a mathematical model based on existing models in the literature, the response of tumor and tumor-bed cell populations to fractionated radiation therapy is investigated. Problems of determining the optimal dose schedule for a given treatment calendar (specified number of doses and time intervals between doses) are formulated. A theoretical and computational method for solving such problems is proposed. Representative results,
I.
which support
the efficacy
of the method,
are presented
and discussed.
INTRODUCTION
A number of authors [6-9,12,13,15] have suggested the study of fractionated radiation therapy for tumors via deterministic mathematical models. In addition, probabilistic models for general growth processes of particle populations in a host have been developed [4,21]. In these latter references the authors determine expressions for the expected lifetime of a host as a function of treatments (chemical or radiation) with both dosage and frequency variable. “Optimal” treatment levels are discussed. While others have suggested simulation techniques to choose “best” treatment schedules for deterministic models, Hethcote and Waltman [15] are the first, to our knowledge, to propose the use of a tool from what has become known as modern optimal control theory to systematically delllns research was supported in part by the Office of Army Research under DAARO-D-31-124-73-G130 28931X2. MATHEMATICAL
and
in part
BIOSCIENCES
by the National
29, 159-179 (1976)
0 American Elsevier Publishing Company, Inc., 1976
Science
Foundation
under
GP
159
160
K. J. ALMQUIST AND H. T. BANKS
termine optimal schedules. Specifically, they demonstrate the use of dynamic programming to find optimal fractionation schedules with a model based on that of Fischer [ 121. Fischer’s basic model describes four classes of cells in a solid tumor and calculates the theoretical response of these classes to any given radiation treatment program. Hethcote and Waltman, to accommodate an inherent practical limitation in the method of dynamic programming, simplify Fischer’s model so as to retain only two classes of cells from his model. In the present paper we present another method, also arising from the mathematical theory of modern optimal control theory, which enjoys a number of advantages over dynamic programming. We apply this method to a model based on those of Fischer and of Hethcote and Waltman but which has five classes of cells. We point out that this method is a viable one for even more complex models and can be applied with relative ease to other models of this nature which are developed as new experimental evidence in support of or in contradiction to various mechanisms becomes available. Indeed, the ideas presented here are applicable to general radiation-response models based on the fundamental premises that (1) untreated tumors and normal tissue grow according to some specified laws and (2) irradiation inhibits (again as described by specified laws) this growth by killing portions of subsections of the cell populations involved. We emphasize that the reader should distinguish between the mathematical method we present below and the particular model for growth dynamics to which we happen to apply the method. In addition to alleviating the problem of dimensionality posed by dynamic programming (see [ 15]), our method (which is actually quite well known to investigators familiar with the literature in mathematical control theory) has the added advantage of permitting greater flexibility in the choice of the “cost” or “payoff” criteria. The macro mathematical model for fractionated therapy that we choose for demonstration of our proposed method is no doubt a naive approach to the in uiuo cellular situation [for example, the model ignores completely the difficult questions associated with variation in cell sensitivity in the different phases of the mitotic cycle (see [2] and [12])] and is adopted here mainly to facilitate discussion of the mathematical method. The effects of radiotherapy (as well as chemotherapy [2]) on tumor growth, while the subject of a substantial amount of experimental and clinical research, are still not sufficiently well understood to enable us to formulate confidently a mathematical model which would gain wide acceptance. Indeed, there have been and continue to be a large number of investigations (see the surveys [2,24] and their bibliographies) aimed at improving the mathematical models used for cell population kinetics. Thus the emphasis of this presentation is not on the modeling aspect of fractionated therapy but on demonstrating use of an alternative method to that proposed by Hethcote and Waltman.
OPTIMAL
2.
RADIATION
161
THERAPY
THE MODEL
The model employed here, based on that of Fischer, retains his four classes of cells: live-oxygenated, dead-oxygenated, live-anoxic, and deadanoxic. These four classes make up the tumor, and in addition, we have added a fifth class, called the tumor bed, which is to represent the healthy normal tissue surrounding the tumor. We adopt the basic assumptions of Fischer [12], which, for the sake of completeness, will be briefly outlined here. The rationale behind the four distinct subdivisions of the tumor is the widely recognized differences in response to irradiation which these classes of cells exhibit. These differences are due primarily to the fact that oxygenated tumor cells are much more susceptible to radiation damage than are anoxic cells, We shall also take into account the fact that welloxygenated cells are capable of rapid reproduction through mitosis, while anoxic cells are not. The standard formula (based on cell culture and transplantation experiments) for the survival of live-oxygenated cells after irradiation is given by (see [l]) s= 1 .-.(I -e-WDo)no,
(1)
where S = the surviving fraction of cells (the fraction of the total left alive after irradiation), D = the dose (in rads) of radiation administered, D, = the “normal” or “mean lethal” dose (a “characteristic dose” parameter representing the dose required so that fractional survival is e- 1 when no= l), and no= the multi-target, single-hit extrapolation number. A proper choice of the parameters D,,n, will produce dose-survival curves which agree with empirical observations (see [l] and [12]). This formula (with different parameters) will also be used for the surviving fraction of the live-anoxic and the tumor-bed cells. The “characteristic” OI “normal” dose parameters will be denoted by D,,(liveanoxic) and Db (tumor bed), while n, and nb will denote the respective extrapolation numbers. Irradiation, in addition to “killing” some cells, also produces sub-lethal damage to other cells. We shall make the common assumption (see [ 10,191) that this damage is repaired by the cell within the first 24 hours. Since our programs will specify only treatments at least 24 hours apart, the only effect due to sub-lethal damage in the model will be a small mitotic delay (proportional by a factor y to the dose) as described below. If N,,(t) represents the number at any time t of live-oxygenated tumor cells, we assume a growth process after irradiation with dose D of the form
dNox
-=
dt
aNox,
0,
t>
yD,
t
K. J. ALMQUIST
162
i.e., oxygenated cells multiply with a rate constant number of live-oxygenated cells left immediately of the above yields
N,,(t) =
No,(O), N,,(O)e"('-YD),
AND H. T. BANKS
(Y.If N,,(O) represents the after irradiation, solution
t yD.
For the small time intervals (no more than 7710 days) between treatments in our program, this exponential expression closely approximates the familiar Gompertzian expression for growth (see [ 11): N,,(t)=
No,(0)e-bemc’.
As has been pointed out [ 1,2,12], both of these expressions (2), (3) approximate experimental findings well. It has been suggested [l] that irradiated “dead” cells can sometimes produce several generations before the daughter cells finally disintegrate; however, this appears not to be the general situation. In any case, we assume that cells in the dead classes remain intact in the bulk of the tumor until they attempt mitosis, at which time they disintegrate. If N&t) represents the number of dead-oxygenated tumor cells at time t (t =0 corresponds to that time immediately after irradiation), the above assumptions can be expressed mathematically by
Ndo(t) =
Ncto(O), Ndo(0)e-*('-YD),
t YD,
where we assume the growth parameter (Yis the same as that in (2) for the live-oxygenated cells. As indicated above, we shall use an expression as in (1) with parameters Db and nb to determine the surviving fraction of tumor-bed cells after irradiation. We shall not include a repopulation mechanism term in our model for the tumor bed. There are two reasons for this: (i) few data [ 1,5,23] are available to support a choice of repopulation rate parameters, and (ii) inclusion of such a term would have little effect on our demonstration of the efficacy of the proposed method. Let us elaborate on this latter comment. Hethcote and Waltman compare sample results for their model without repopulation of the bed with results for their model with a repopulation growth mechanism where the bed growth rate 6 is chosen in a somewhat arbitrary manner (so that mean regeneration time-10 days). In the latter case they let the initial tumor-bed size be arbitrary and record bed surviving fraction as a measure of the tumor-bed size. As is clear from the
OPTIMAL RADIATION THERAPY
163
results presented in [15], the ratio of 6 to the tumor growth parameter a! is important (with respect to comparison of results predicted by the optimal schedules) and can be chosen so that the corresponding optimal treatment schedules have larger recovery periods between treatments once the entire tumor is well oxygenated (e.g., near the end of the treatment period for the model used by Hethcote and Waltman). In our numerical results detailed below, we have fixed the recovery periods, and this trend could not be produced in our responses even if we included a bed growth mechanism. Our method is flexible, and we could, of course, have also included a study of our model with a repopulation growth mechanism for the tumor bed and variable times between treatments. We have not done this, since our main goal in this note is to describe use of the mathematical method and not to perform comparisons on variations in modeling assumptions. Thus our tumor-bed variable represents only the number of cells from the original tumor bed which have survived irradiation, and no provision is made for the generation of new tumor-bed cells during the treatment program. While this is clearly not the situation in oiuo (repopulation of the tumor bed is without doubt very important), we believe that this variable for the tumor bed provides as good an indicator of tumor-bed damage (for the purposes of demonstration of the feasibility of the method) as any other we might choose based on somewhat arbitrary repopulation assumptions. The mechanisms which govern oxygenation and reoxygenation are not well understood at all. There does seem to be a large amount of empirical evidence to support the conjecture that the ratio of oxygenated to anoxic cells increases as a fractionated radiation treatment program is carried out. (For further discussion see, for example, [16,17].) In order to attempt to model this, we assume (as do Fischer and Hethcote and Waltman) that this ratio is roughly inversely proportional to the size of the tumor. If f denotes the fraction of well-oxygenated cells present, then we specifically assume that
fce-BN,
(5)
where /3 is a parameter and N is the total number of cells in the tumor at the given time we compute the ratio. We admit here that this is a somewhat arbitrary specific expression which is of course difficult to justify at this time. However, it does reproduce in the model the same overall patterns that have been observed in experimental studies. Until more precise information regarding the mechanisms involved in oxygenation and reoxygenation (see [17]) is available, modeling of the changes in the ratio of oxygenated to anoxic cells during fractionated therapy must by necessity remain rather crude. In simulating the growth and death of the tumor and tumor bed, the calculations are carried out as follows. Assume that values are given for the
164
K. J. ALMQUIST
AND
H. T. BANKS
number of cells in each of the five classes immediately before irradiation and that a radiation dose D is specified. One first calculates how many live cells are killed. Secondly, for a period of time t (in general, t is the time in hours between the current treatment and the next treatment), one calculates the repopulation of live-oxygenated cells and the disintegration of deadoxygenated cells. Finally, at the end of this time period t, one “realigns” the tumor-cell populations so that the oxygenated cell fraction equals e-ON. In this final step, it is assumed that cells can change their state of oxygenation, but cannot change their state of “living” or “dead”. Hence the equations for this step must insure the conservation of the total tumor-cell number and conservation of “living” tumor-cell number (and thus consequently conservation of the “dead” tumor-cell count). The equations also cause the transfer of cells to be in the proper live-to-dead ratio, since the process by which state of oxygenation is changed is assumed incapable of distinguishing between living and dead cells. The above procedure results in the following sets of equations. First initial values are given for N, = live-oxygenated N2 = dead-oxygenated N3 = live-anoxic
cell count,
cell count,
N4 = dead-anoxic N, = tumor-bed
cell count,
(6)
cell count, (normal
tissue) cell count.
Next, one may combine the steps involving “kill” and “reproduction and disintegration” and use (1) with appropriate parameters along with (2) and (4) to obtain new values for NJ at time t > yD given by
k,(t)=
{N,+N,-N,[
1-(1-e-D/D~)s]}e-“(‘-7D),
(7) 13,(t)= N3+ N4-- T&(t),
165
OPTIMAL RADIATION THERAPY
Finally, one must realign the cell count to obtain the proper oxygenated-toanoxic ratio. Letting
N(t)=Ij,(t)+m,(t)+1S3(t)+~~(t), one finds that the current
(8)
ratio R (t) is given by
oxygenated-to-anoxic
(9)
while the desired ratio R (t) is given by e-BNw
R (1) =
Letting Nj(r) represent the respective computes these as follows: If R (2) > R’(t),
N2( t) =
N3
(9
=
1 _
e-BN(O
’
“realigned”
(10)
cell counts at time t, one
A,(t) + &(t) - L, $3 (2) IJ,OL
(11)
Nd(f)=L,
where
N(f) ‘-[l+R(r)][l+Ij,(t)/lj,(t)]
K. J. ALMQUIST
166
AND H. T. BANKS
On the other hand, if R (t) < k(t),
fi,
N3(t)=i+,(f)+&(t)-
(t> M
m2(t)’ N4( t) =
fi,( t) + &(t) -
(14
M,
where
R (ON (t> M=
[l+R(t)l[l+fi,(t)/fi,(t)]
One thus finds new values N,(t) for the cell counts after irradiation of dose D, a time period t, and realignment. These values may now be used as initial values for the next treatment (assuming another dose is to be given at time t). By performing a finite sequence of the above computations, one can simulate a complete fractionated treatment program once the initial tumor and bed cell counts, the doses, and the time periods between doses have been specified. For a more detailed derivation of the equations presented above, along with a discussion of the underlying assumptions, the reader should consult the paper of Fischer [12]. 3.
MATHEMATICAL
FORMULATION
OF THE PROBLEM
Let us suppose that we have chosen a particular “treatment calendar”, by which we mean a specified number of treatments to be given, along with a set of specified time intervals between the treatments. These time intervals need not be equal; they simply must be pre-determined in the formulation discussed below. By a “treatment program” we shall mean (i) a treatment calendar and (ii) a set of specified doses to be given at the treatment times. The doses in a treatment program need not be equal-indeed, one of our beliefs is that equal-dose treatment programs are, in general, not the “best” choices. The object here, then, will be to formulate and discuss the solution of the problem of choosing a treatment program that is optimal in some sense among all treatment programs with a pre-specified treatment calendar.
OPTIMAL RADIATION THERAPY
167
Assume for the moment that we have chosen a treatment calendar specifying m treatments along with a set {t,, t,, . . . , t,,_ ,} of time intervals between treatments. That is, ti is the time interval between the ith and (i + 1)st treatment, i = 1,2,. . . , m - 1. Let u(i) be the dose (in rads) administered at the (i + 1)st treatment, i = 0, 1,. . , m - 1. Let
denote the numbers in the five classes [in the same order as listed in (6) above] of cells remaining immediately before the (i+ 1)st treatment u(i) is given, i=O,l,..., m - 1, and x(m) represent the corresponding values after the last (mth) treatment is given. Letting -F; denote the composition of the functions described in the previous section, we may then write xi(i+l)=4(x,(i),x,(i)
,..., x,(i),u(i)),
(13)
j= 1,2 ,..., 5, i=O,l,..., m - 1. That is, identifying N,, . . . , N, from (6) with D with u(i), and x,(i+ 1) ,..., x,(i+ 1) with xl(i), . ..,xs(i), Nt(ti+ I>,***3N,(ti+ ,) as given by (11) or (12), the equation (13) is a mathematical expression to indicate that xj(i+ 1) is computed as described in (6)-(12). This, of course, assumes that the initial tumor and bed size is given. Schematically, this process may be dex(O)=(x,(O),...,x,(O)) picted as in Fig. 1.
x(o)~
049 pJ
“2’
#’
xy . . . xc”,-
1)
”yj- 1J+x(m)
FIG. 1
The equations
(13) are usually written in vector form: x(i+
l)=F(x(i),u(i)),
(14)
and the process is then seen to be in the basic form of a multi-stage or discrete-time control system, where u is the “control” and x is the “state” variable (see [3]). Given system dynamics (13) [or (14)], one may wish to “control” the process in order to achieve some goal, say minimizing a “cost” or “payoff” function. We choose this cost or payoff function here to be of the form m-l
J=@P(x(m))+
x
i-o
Li(X(i),U(i)),
(15)
K. J. ALMQUIST
168
AND H. T. BANKS
where we assume that the functions @ and L, have continuous first partial derivatives with respect to each of their argument variables. Note that for any given initial values x(O), a specification of values {u(O), U(l)>. . . , u(m - 1)) uniquely determines the values x(1),x(2), . .,x(m) through (14). Hence specification of a control program (i.e., a treatment program-it is understood that the time intervals are also to be specified) along with values x(0) will determine the value of J in (15). Hence we may write J=J(u;x(O)),
(16)
where u represents the chosen control program. Thus, we have an optimization problem which requires minimization of .I in (16) with respect to u, given the initial values x(0). Formally we state the problem as follows: Assume a given treatment calendar and initial tumor and bed size x(0). Find the set {u*(O), u*( I), . . . , u*(m - I)} = u* of control values which, when used with the given treatment calendar, minimizes the payoff J( u;x(O)) over the class of all control programs with the given treatment calendar. The above constitutes what is known in the literature as an optimalcontrol problem with multi-stage system dynamics. First-order necessary conditions (in the form of Pontryagin’s maximum principle or a generalized Lagrange-multiplier rule) for such problems are well known and are essentially the analogue of the usual first-order conditions g’ = dg/& =0 which one encounters in an elementary calculus course when studying the problem of minimizing a real-valued function g of a real variable y. We state one form of these necessary conditions which we have used to generate the results discussed in Sec. 5 and 6 below. Introducing multiplier variables h(i) = (X,(i), h2( i), . . . ,X,(i)), we define a modified cost functional J by m-l
J=J+
5
2
x
i-0
j=l
~j(i+l){~(x(i),~(i))-xj(i+l)},
(17)
where J is given by (15) and Fj are the functions in (13) [or (14)]. Note that if the system equations (13) are satisfied, then j reduces to J, the original cost function. The necessary conditions we state here essentially replace the problem of minimizing J subject to the constraints (13) by the problem of unconstrained minimization of j as defined in (17). Defining the Hamiltonian function
Hi=Li(x(i),u(i))+
x j-l
Aj(i+
l)F,(x(i),u(i)),
(18)
OPTIMAL RADIATION THERAPY
169
the necessary conditions for a minimum pliers h(i) satisfying the equations aHi )Ik(j)=
ax,(i)
the existence
of multi-
5 =
3+ ax,(i)
cAj(i+l)+$), j-
k= 1,2,..., 5, such that the optimal satisfy aHi*
aL,(x*(i),u*(i)) &4(i)
i= 1,2 ,..., m-
1,
(19)
1
b(m)=
o=au(i)=
guarantee
a@(x(m)> control
+ c
’
(20)
’
h(m)
a* and corresponding
qi+
state x*
aFj(x*(i),td*(i)) 1)
&4(i)
(21)
j=l
for i=O,l,..., m-l. Because of widespread misuse in practice, we feel compelled to make several points about the above conditions. First of all, these conditions should be treated as necessary conditions for J to possess a local minimum at u*. They are, in general, not sufficient, and one cannot claim that a particular solution provides either an absolute or a local minimum without further arguments. Thus, when combining these conditions with numerical techniques, one must be watchful for local minima or even saddle points which may result. A second comment pertains to the question of “normality”. Most derivations of the first-order necessary conditions as given here either (i) carefully state the conditions with an extra multiplier A, [i.e., in place of (18) one defines the Hamiltonian Hi=h,Li+Zj,,Aj~], or (ii) incorrectly use the Lagrange-multiplier rule to obtain the necessary conditions with he= 1 (for a discussion of this, see [25, pp. 214224]), or (iii) involve some type of tacit assumption (a form of “controllability” or “regularity” of the nonlinear system equations when linearized along the optimal trajectory-see [3] or Chapter 9, section 5 of [ 181) which yields what is usually called “normality”. This latter situation essentially guarantees that one can choose )b= 1 and state the necessary conditions as we have done above in (19)-(21). In certain types of free-endpoint problems one can argue normality directly from the so-called transverslity conditions (boundary conditions for the A’s), but in general one needs to invoke the assumptions mentioned in (iii). These assumptions for normality involve (in the case of nonlinear systems) the optimal trajectory and control and are thus usually impossible to verify a
K. J. ALMQUIST
170
AND H. T. BANKS
priori. Common practice (and what we shall do here) in use of the maximum principle in computational procedures is to assume normality and carry out the computaions, seeking possible candidates for an extremum (or at least candidates that offer improvements over existing policies). In some instances this can be justified by further arguments particular to the problem under investigation. Also, one often wishes to use these conditions in connection with sufficiency arguments. That is, one uses the necessary conditions to identify “stationary” or “extremal” trajectories and then verifies that additional conditions (usually involving convexity and normality-see [3, p. 21 l] or [20]) obtain which strengthen the necessary conditions to sufficient conditions. We do not wish to go into these matters in detail here, but do point out that this procedure is not unlike that used in seeking the minimum of a real variable function g by first solving for roots of g’(v) = 0 and then testing these roots with further conditions [e.g., g”(y,) > 0 or g”(y,,)
NUMERICAL
PROCEDURES
The necessary conditions given in the previous section constitute what is known as a two-point boundary-value problem, since the boundary values for x are given at i = 0 [i.e., x(O)] and those for h are given at i = rn [see (20)]. Equations (14) and (19) must be solved using these boundary values while simultaneously requiring that (21) hold. Even in very simple examples, such a system can be exceedingly difficult to solve exactly. Thus one usually turns to some type of method, quite often involving an iterative procedure, for,an approximate solution. We do that here. The method we have chosen is a first-order conjugate gradient method based on the method of Fletcher and Reeves [ 14,221. We chose a first-order method because such a scheme requires only that we calculate aF/ax and aF/ au, while second- (and higher-) order methods require the calculation of higher partial derivatives. This would be straightforward enough, but for the model under investigation here would prove quite time-consuming (and thus perhaps prohibitively expensive). On the other hand, a simple “search” technique, requiring no derivatives, would be ill suited for such a large and complex system as ours and would prove very expensive to execute on the computer. As we have already mentioned, the system to be optimized is also rather large and expensive for easy use of the techniques of dynamic programming. We chose the conjugate gradient method over the gradient method (for a discussion see [18]) because of its convergence properties. While both
OPTIMAL RADIATION THERAPY
171
methods exhibit rapid convergence at points away from the optimum, experience indicates that the conjugate gradient method converges much faster at points near the minimum. (Here a “point” is a set of values for the control vector u, hence a point in Euclidean m-space.) The method we describe below essentially solves Eq. (14), (19) and (20) exactly while iterating to (hopefully) a solution of (21). The algorithm is as follows: (i) Choose an initial guess for u; call it u”. (Here we shall use superscripts to denote the iteration number in the algorithm.) (ii) Given u” [the first time through these steps we have Y=0 with uy coming from (i) above], and the initial data x(O), use (13) to compute the corresponding trajectory and denote it by x”. (iii) Using u”, xy with (19) and (20), calculate the multipliers A, which are denoted h “. (iv) Compute the gradient vector G’ at u = u”, x = x”, X =A”, where
aH, au03
aff, auo>...>
affm-1 au(m--
1) ’
by using the expression in (21). Note that G’ is an m-vector. (v) Find the new conjugate gradient direction s”, also an m-vector, by rV=
-GO -G’+/~“s’-’
if if
v=O, v+O,
where P”=(G’,G’)/(G’-‘,G”-‘). Here the notation (a,b) means the usual inner product of two m-vectors. (vi) Minimize J along this conjugate gradient direction from u”. That is, find the non-negative scalar w* that minimizes q(w) = J(u’ + WS”;x(0)). (vii) Choose u “+’ = u”+ w*s” and return to step (ii) to begin the next iteration. The above procedure will generate a sequence of control vectors u0,u’,u2 ,...) and corresponding values of J. The iteration continues until (hopefully) convergence is obtained (i.e., there is little or no change in successive values of J). In carrying out step (vi), it is common in practice not to insist on an exact minimum. Instead, we “march out” in steps along the conjugate gradient direction until the J values begin to increase. Or, if J increases on the first step, we cut the step size until we get a decrease in J. In either case, J decreases and then increases at points along this direction. We then use cubic interpolation to obtain an approximate answer for the minimum in this interval. We remark that this part of the method is what requires most
172
K. J. ALMQUIST
AND H. T. BANKS
of the computer time used, since for each “step” we take (that is, for each new value of w that we test) we have to recompute the values for u, then for x, then for J. For the model described above, evaluation of the trajectories x turns out to be rather time-consuming on the computer. Again, we feel compelled to offer a word of caution here. There are many theoretical results for the method given above, and authors often speak of “quadratic convergence”. The careful reader will note that quadratic convergence when used in this context simply refers to the fact that the method will converge in an finite number of steps when applied to a quadratic function. Nothing is guaranteed for non-quadratic problems. Theoretical results on convergence for this method (and several others) used for general non-quadratic functions are practically nonexistent. Therefore it is not advisable to use these methods in an uncritical way, and we emphasize the need for very stringent numerical convergence criteria when carrying out such computations. 5.
NUMERICAL
RESULTS
In this section we present results from our investigations, using the methods of Sec. 4, of the problem formulated in Sec. 3. But first we must discuss the choice of a payoff function. For optimization problems arising in biology, as well as in economics, engineering, etc., this choice is often one of the more difficult aspects of problem formulation. Our choice, while clearly subject to debate and criticism, is made up of a combination of terms representing factors which we feel merit inclusion. In our study here we have included terms representing respectively the “costs” associated with (1) remaining live tumor cells, (2) damage to tumor bed, (3) “real” cost of treatment, and (4) violation of constraints on dosage levels. The form of these terms is listed below. (1) J,= wi[xi(m)+~~(m)]: the sum x,(m)+xs(m) represents the total number of live tumor cells (oxygenated plus anoxic) remaining after treatment is completed, while wi is a scalar weighting factor. (2) Jz= w&x5(m)), where f2(Y)=P(Y)12 if Y < 1, h(r)= -[Wvf if y > 1. The term x,(m) represents the amount of undamaged tumor bed (normal tissue) left after treatment, while fi is a function that increases rapidly (without bound) as its argument approaches zero. Again we have a scalar weighting factor w2. (3) J3= Zy&iws{.5 +(l/m)arctan[O.l2u(i)1.81): this term represents a cumulative “cost of treatments”. The arctan function, with the parameters as chosen, approximates a function which is 0 if there is no treatment at i and is 1 if there is a treatment at i. In seeking an optimal dose program, this term should drive to zero any insignificant doses, say 10 rads, which clinically are so small that one in practice would not administer them. The
OPTIMAL RADIATION THERAPY
173
parameter values are set so that at u(i) > 30 rads, one is charged approximately a “cost” of 1 (scaled by the weighting factor w,) for the treatment. (4) Jb = Zy&jf,(u(i)), where f4(z) = 0 if z < 500, f4(z) = (z - 500)2 if z > 500. This is a “penalty” term designed to keep all doses within some acceptable range. Here we chose 500 rads as a maximum desirable dose. Our choice of payoff used for the results reported below is then J=J,+J,+J,+J,,
(22)
where the J, are as described above. We recognize that others might argue for different terms, depending on the importance they place on certain “goals” of a treatment schedule. For example, one might absolutely insist that one attain a target value for x,(m) (live oxygenated tumor cells at end of treatment), say x,(m) Q E for a given E. A problem incorporating this feature could also be treated using the methods outlined above [the restriction on x,(m) could be handled as a terminal boundary condition, or could be included as a penalty term much like the one involving f4 above]. Since the purpose of our presentation here is to demonstrate the use of a certain method in selecting “optimal” treatment programs, we have not varied the form of our payoff. We simply point out that the method is equally feasible for a large class of other payoff functions that one might consider to be of interest. For our numerical simulations, we chose to use the same parameters in the model as those proposed by Fischer [ 121 and also used by Hethcote and Waltman [15]. That is, CY -0.004, y =0.005, D,= 100.0, n,=4.0, 0.3 250.0, n, = 1.0, Da = 120.0, n, =4.0, /3 = lo- lo, if t is taken in hours in the expressions of Sec. 2. Initial values for the cell populations were x,(O)= 3.679 x 109, x2(O) = 0.0, x3(O) = 6.321 x 109, x4(O) = 0.0, x,(O) = 1O8. After numerous runs in which the weighting factors in the payoff were varied, we chose the values w , = 100, wz = 100, wg= 50. These values were used for all of the schedules given in the tables and figure below. From a collection of more than 85 “optimal” treatment programs obtained using the methods of Sec. 4, we have given in Tables 1 and 2 a representative sample of the results obtained. Three types of results are given in Table 1. First, we present the “optimal variable-dose programs” in which one specifies a “treatment calendar” (m = number of treatments and At = period of time, chosen constant in these programs, between treatments) and then seeks the optimal treatment program as described in Sec. 3. The second block in Table 1 represents the final results if one uses the same treatment calendar and the same total dose as in the corresponding program in the “optimal variable dose program” but, instead of optimizing, gives the dose in m equal treatments. This we have labeled “equal-dose program”.
174
K. J. ALMQUIST
AND
H. T. BANKS
Finally, the third block presents results called “optimal equal-dose programs” computed as follows. The same treatment calendar as in the corresponding program in block 1 is used, and then one asks to find the treatment program that minimizes J among the class of all programs (with the given calendar) with equal doses. Thus the underlying problems for the results reported in blocks 1 and 3 of Table 1 are the same except for the stipulation in the latter that one only consider equal-dose programs in the optimization procedure. In Table 2 the optimal variable-dose schedules for the programs given in block 1 of Table 1 are listed. The graph in Fig, 2 below again refers to results obtained for the problem of finding “optimal variable-dose programs”. Each point on a curve represents an adjusted value of the optimal payoff for a given treatment calendar. The values at points (e.g., 10,15) refer to the number m of treatments specified by the calendar. Thus, the curves connect values of the optimal payoff for a fixed number of treatments as the time At between treatments (and thus the total time of treatments) varies from one calendar
COMPARISON
Treatment
TABLE 1 OF OPTIMAL VARIABLE-DOSE, EQUAL-DOSE, OPTIMAL EQUAL-DOSE PROGRAMS
code number
m (number of treatments) Ar (time between treatments, in days)
K46
K53
K64
K69
KY’2
AND
K71
K82
10 7
15 3.5”
20 2
25 1
30 1
35 1.4a
40 1.48
Optimal variable dose program
Total dose (in rads) Live tumor cells left Undamaged tumor bed Payoff (adjusted)
4167 7.0 0.05 1638
4635 2.8 0.2 1 516
5135 1.24 0.52 157
5669 0.41 0.82 29
5687 0.68 0.70 106
5968 0.56 0.71 104
6150 0.53 0.73 120
Equal dose program
Total dose Live tumor cells left Undamaged tumor bed Payoff (adjusted)
4167 21.0 0.06 2942
4635 10.6 0.32 1180
5135 3.9 0.82 381
5669 0.89 0.11 72
5687 66.9 25.2 5630
5968 410.2 77.4 39,102
6150 3995.9 302.4 396,285
Optimal equal dose program
Total dose Live tumor cells left Undamaged tumor bed Payoff (adjusted)
4289 8.6 0.02 2369
4758 4.43 0.13 860
5238 1.90 0.40 274
5727 0.55 0.77 62
6301 0.67 0.66 83
6836 0.72 0.63 93
7355 0.84 0.57 115
“The programs with At=3.5 and At= 1.4 refer to treatment calendars with 2 and 5 treatments per week respectively. The time intervals between treatments for these cases were (4,3,4,3,4,3,. ) for the twice-weekly treatments and { 1, 1, 1,1,3,1,1,. ) for the calendars specifying 5 treatments per week (corresponding to daily treatments, Monday through
Friday).
OPTIMAL
RADIATION
OPTIMAL Treatment
code no.
175
THERAPY TABLE 2 VARIABLE-DOSE TREATMENT
K46 (in = 10)
K53 (m=l:)
K64 (m=20)
K69 (m=25)
364.2 287.9 279.8 351.4 424.2 468.7 489.7 501.3 500.9 499.3
243.9 223.3 226.8 228.6 232.6 240.3 252.1 272.1 292.9 322.3 348.2 382.7 412.5 456.1 501.4
213.1 197.3 192.8 192.8 195.8 201.2 208.3 216.4 224.9 233.6 242.3 251.1 260.0 269.6 280.3 293.2 309.8 333.7 372.3 446.3
202.2 188.2 184.0 183.7 185.3 188.0 191.4 195.2 199.1 203.1 207.0 211.0 215.0 219.1 223.3 227.8 232.7 238.1 244.3 251.7 260.8 272.5 288.3 310.9 346.0
&Doses u(i) listed sequentially,
with i increasing
from 0 to m - 1
PROGRAMSa K72 (m=30)
K71 (m=35)
0.0 0.0 0.0 0.0 0.0 165.7 173.9 178.8 181.6 182.7 185.7 187.9 191.6 196.3 201.6 214.2 218.7 223.3 227.9 232.6 243.6 247.6 251.9 256.7 262.2 276.9 282.9 290.4 300.0 312.4
0.0 0.0 0.0 0.0 0.0 0.0 0.0 85.0 114.1 133.0 163.7 170.4 176.2 182.1 188.0 200.8 205.1 209.1 213.0 216.7 224.6 227.2 229.8 232.5 235.1 241.5 243.9 246.4 249.1 252.0 259.5 262.3 265.5 269.0 272.9
K82 (m=40) 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 62.9 98.5 130.4 146.4 154.8 169.6 174.5 179.0 183.4 187.6 197.3 200.9 204.4 207.8 211.2 219.2 222.0 224.9 227.8 230.7 237.6 240.1 242.7 245.4 248.3 255.2 257.7 260.3 263.2 266.4
K. J. ALMQUIST
176
AND H. T. BANKS
1000-
0.
’
0
20
40 TOTAL
60 TREATMENT
60 TlME(,h.
I00
120
140
OAYS)
FIG. 2
to another. The payoff values are “adjusted” by subtracting the “cost of treatment” term J, which essentially shifts the curves down by a constant factor (IV, X number of non-zero treatments). Several comments are perhaps of interest with regard to our use of the techniques in Sec. 4 to obtain results, of which the above forms a reasonably representative sample. First, the actual cost of determining an “optimal variable-dose program” was relatively low, on the order of several dollars for computer time, and thus is a strong argument for the feasibility of extended use of such schemes in comparing models and in studying trends predicted by these models. The rate of convergence of the iterative scheme was rapid in most cases, say roughly in 30-50 iterations. (For example, the numbers of iterations to “convergence” for the results reported in Tables 1
OPTIMAL RADIATION THERAPY
177
and 2 were 20 for K64 and K69; 30 for K53; 40 for K46, K71, and K72; and 50 for KS2.) As is to be expected with such schemes, choice of the initial guess u” is a relatively important factor influencing the rate of convergence. In our work, we normally specified some constant-dose program for the initial guess [say, u”(i) = 200 for each i = 0, 1,. . . , m - 11. This worked very well except in cases where the “treatment calendar” called for a large number of treatments (more than 30) when the optimal schedule actually appeared to need fewer treatments (as in K82, where the first 10 treatments are wiped out in the optimal treatment schedule). In these instances convergence was somewhat slower. 6.
DISCUSSION
AND SUMMARY
A number of general patterns present in our results are discernible in the sample presented in Tables 1 and 2 and Fig. 2. In comparing the optimal variable-dose program results (block 1 of Table 1) with those for the equal-dose program (block 2) with the same total dose, one observes that the payoff is substantially smaller for the variable-dose treatments. Furthermore, the resulting final values for tumor bed remaining are “better” for the variable-dose programs. These comments are true for all of the 34 different such comparisons we considered, not just the ones listed in detail in the above tables. A comparison of the optimal variable-dose programs with the optimal equal-dose programs reveals that the corresponding optimal equal-dose program required a larger dose to produce either approximately the same final results with regard to the surviving bed and tumor (see, e.g., K7 1, K72) or, as in most cases (see K53), less desirable final values for these factors. Again, this general pattern persisted throughout all of our results, thus promoting the conclusion that one obtains “more for one’s investment (dosage)” via use of optimal variable-dose programs than with optimal equal-dose programs We note that in a number of cases the final tumor value for the cell count was greater than one. However, absolute numbers here are probably not as meaningful as one might at first expect. A change in the weighting factors w,, w2, wg would serve to drive the final tumor value below a given E >0 if one placed great importance on such absolute values. Or, a slight reformulation of the problem as described in Sec. 5 could easily handle such constraints if so desired. Also, observe that in some cases the surviving tumor-bed values are smaller than the surviving tumor-cell count. Recalling that the model does not contain terms to allow for growth of the undamaged tumor bed, one again sees that the comparison of these absolute numbers is not so meaningful. Two general patterns were present in the optimal dose schedules for the above model. Either the doses u(i) are increasing as i increases (this was
K. J. ALMQUIST AND H. T. BANKS
178
typical of the calendars that called for large numbers of treatments-see, for example, K71) or in the case of a small number of treatments (see K46) there was a slight initial drop in u(i) for the early values of i, after which u(i) increased with i. We point out that the increasing pattern was also observed by Hethcote and Waltman [ 151 in their work. A possible explanation for these patterns (in the context of this mathematical model) is that initially one kills mainly oxygenated tumor cells (the anoxic being more resistant to radiation damage), thus reducing tumor size and promoting reoxygenation. As the percentage of the tumor which is oxygenated increases during the course of the treatments, one gives increasingly larger doses to kill the oxygenated cells. In studying Table 1 and Fig. 2, one concludes
that one can improve
the
final results by increasing the number of treatments up to a certain point. (For the weights used in our payoff, the “best” number of treatments appears to be in the range of 25 to 30.) There also appears to be a “best” choice of overall treatment time for a given particular number of treatments. This can probably be explained by observing that there is a tradeoff between re-oxygenation and regrowth-one enhancing destruction of the tumor cells, the other, of course, detrimental to this goal. In conclusion, we summarize the contributions of this paper as follows. Choosing a model based on current information available in the literature, we have shown how to formulate the problem of choosing fractionated radiation therapy programs as a problem in the mathematical theory of optimal control. Methods, both theoretical and computational, for solving such problems are discussed. We offer support for the feasibility of such procedures by carrying out a substantial number of satisfactory applications of the method to our model. A sample of the typical results obtained in the course of our investigations is presented. Given some confidence in such a model, we demonstrate how one can use such techniques to develop insight into the characreristics of responses to fractionated therapy. The formulation and method presented here are rather general ones, capable of handling larger and more complex models with relative ease as these newer models are developed on the basis of findings in radiation research. We Indeed, problem express draft of
wish to thank Paul Waltman for a number of helpful discussions. it was through our conversations with him that our initial interest in the of “optimal” fractionated radiation therapy arose. We also wish to our appreciation to R. A. Vitale for critical comments on an earlier this manuscript.
REFERENCES 1 J. R. Andrew, delphia, 1968.
The Radiobiologv
of Human Cancer Radiotherapy,
Saunders, Phila-
OPTIMAL RADIATION THERAPY 2
3 4 5
6 I 8 9 10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 2.5
179
J. Aroesty, T. Lincoln, N. Shapiro, and G. Boccia, Tumor growth and chemotherapy: mathematical methods, computer simulations, and experimental foundations, Math. Biosci. 17, 243-300 (1973). A. E. Bryson and Y. C. Ho, Applied Optimal Conrrol, Blaisdell, Waltham, Mass., 1969. E. A. Chemiavsky and H. M. Taylor, Control of a general lethal growth process, Math. Biosci. 13, 235-252 (1972). L. Cohen, Radiation response and recovery: radiobiological principles and their relation to clinical practice, in The Biological Basis of Radiation Therapy (E. Schwartz, Ed.), Lippincott, Philadelphia, 1966, pp. 208-348. L. Cohen, A cell population kinetic model for fractionated radiation therapy, I. Normal tissues, Radiology 101,419427 (1971). L. Cohen and M. J. Scott, Fractionation procedures in radiation therapy: a computerized approach to evaluation, Br. J. Radiol. 41, 529-533 (1968). G. J. Dienes, A kinetic model of biological radiation response, Radiat. Res. 28, 183-202 ( 1966). G. J. Dienes, Survival curves and dose fractionation: some general characteristics, Radiat. Res. 48, 551-564 (1971). M. M. Elkind, Cellular aspects of tumor therapy, Radiology 74, 529-541 (1960). G. M. Ewing, Calculus of Variations with Applications, Norton, New York, 1969. J. J. Fischer, Mathematical simulation of radiation therapy of solid tumors, I. Calculations, Acta Radiol. Ther.: Phys. Biol. 10, 73-85 (1971). J. J. Fischer, Mathematical simulation of radiation therapy of solid tumors, II. Fractionation, Acta Radio. Ther.: Phys. Biol. 10, 267-278 (1971). R. Fletcher and C. M. Reeves, Function minimization by conjugate gradients, Comput. J. 7, 149-154 (1964). H. Hethcote and P. Waltman, Theoretical determination of optimal treatment schedules for radiation therapy, Radiat. Res. 56, 15&161 (1973). A. E. Howes, An estimation of changes in the proportions and absolute numbers of hypoxic cells after irradiation of transplanted C,H mouse mammary tumours, Br. J. Radiol. 42, 441447 (1969). R. F. Kallman, The phenomenon of reoxygenation and its implications for fractionated radiotherapy, Radiology 105, 135-142 (1972). D. G. Luenberger, Optimization by Vector Space Methoak, Wiley, New York, 1969. N. J. McNally, Recovery from sub-lethal damage by hypoxic tumor cells in vivo, Br. J. Radiol. 45, 116119 (1972). L. W. Neustadt, Optimization: A Theory of Necessary Conditions, Princeton U. P., Princeton, N. J., 1975. M. F. Neuts, Controlling a lethal growth process, Math. Biosci. 2, 41-55 (1968). J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic, New York, 1973. P. Rubin and G. Casarett, A direction for clinical radiation pathology, in Radiation Effec? and Tolerance, Normal Tissue, F’roc. 6th AMU. Cancer Symp. (J. Vaeth, Ed.), University Park Press, Baltimore, 1972. A. J. Tarr and R. A. Vitale, Mathematical models of tumor growth, Pattern Anal.’ Tech. Rep., Brown Univ., Providence, R. I.,May, 1975. L. C. Young, Calculus of Variations and Optimal Control Theory, Saunders, Philadelphia, 1969.