0005-1098/83/010015-14 $03.00/0
Automatica, Vol. 19, No. 1, pp. 15-28, 1983
Pergamon Press Ltd. © 1983 International Federation of Automatic Control
Printed in Great Britain.
Decomposition Methods in Multiobjective Discrete-time Dynamic Problems*

K. TARVAINEN,† Y. Y. HAIMES‡ and I. LEFKOWITZ‡
Decomposition methods for discrete-time multiobjective dynamic problems, where hierarchical-multiobjective subsystems are coordinated via the surrogate worth trade-off (SWT) method and its extensions, provide lower dimension subproblems in terms of variables and possibly the number of objectives. Key Words--Decision theory; discrete-time systems; dynamic programming; hierarchical systems; industrial control; large-scale systems; multiobjective optimization; multicriteria optimization; system analysis; system theory.
Abstract--Decomposition methods for multicriteria dynamic (discrete-time) problems are derived. In these methods, the original problem is reduced to a series of multicriteria subproblems related to individual stages. Hence, the dimensionality of decision variables in each subproblem is smaller than in the original problem. The following decomposition procedures for such problems are developed: (1) a dynamic programming method, (2) a two-point boundary value problem method, (3) multilevel methods, and (4) the formulation of a temporal hierarchy. For completeness, methods for multicriteria dynamic problems are reviewed that, at the outset, transform a problem into a series of single-objective problems. Formulation of the multiobjective problem in the context of a multilayer temporal hierarchy is also presented. The temporal structure motivates problem simplification by decomposing the overall decision-making problem according to relative time scales.

*Received 17 July 1981; revised 14 June 1982. The original version of this paper was presented at the 8th IFAC Congress on Control Science and Technology for the Progress of Society which was held in Kyoto, Japan during August 1981. The published proceedings of this IFAC meeting may be ordered from Pergamon Press Ltd, Headington Hill Hall, Oxford OX3 0BW, U.K. This paper was recommended for publication in revised form by editor A. Sage.
†Electrical Engineering Department, Technical University of Helsinki, Espoo 15, Finland.
‡Systems Engineering Department, Case Western Reserve University, University Circle, Cleveland, OH 44106, U.S.A.

1. INTRODUCTION

DECOMPOSITION methods for solving multicriteria large-scale static problems have been studied, e.g. by Geoffrion and Hogan (1972); Olenik and Haimes (1979); Haimes and Tarvainen (1980); Tarvainen and Haimes (1980a, b). The purpose of these studies was to develop solution schemes where each subproblem to be solved is of lower dimension in its variables, and possibly in the number of objectives, than the original problem. In the single-objective case, the decomposition approach has been studied extensively (Mesarovic, Macko and Takahara, 1970; Lefkowitz, 1966; Lasdon, 1970; Wismer, 1971; Himmelblau, 1973; Haimes, 1977; Singh and Titli, 1978, 1979; Mahmoud, 1977). In the following discussion, decomposition methods for multicriteria dynamic problems are considered. In this context, we define a decomposition method as an algorithm in which the problem is reduced to a series of multicriteria problems relating to single points of time. Since a solution of a multicriteria problem involves interaction with a decision maker, only discrete-time problems, with a finite number of corresponding multicriteria subproblems, are of interest. Two decomposition methods are first derived: a method based on dynamic programming, arrived at by applying the principle of optimality (Section 4), and a two-point boundary value problem method (Section 5). In these methods, the multicriteria subproblems are related consecutively to each other. In multilevel methods (feasible, Section 6.1; nonfeasible, Section 6.2), the subproblems are solved (or can be solved) in parallel, coordinated by a master program (or coordinator). Derivation of the multilevel methods is based on corresponding multicriteria static studies (Haimes and Tarvainen, 1980; Tarvainen and Haimes, 1980a, b). For completeness, Section 3 deals with multicriteria dynamic methods that are not decomposition methods in the above sense. In these methods, the original problem is reduced to a series of single-objective dynamic problems. The following methods are reviewed: the weighting method (Section 3.1), the surrogate worth trade-off (SWT)-Pontryagin method (Section 3.2),
and the constraint method/dynamic programming (Section 3.3). In Section 3.4, a method is derived by applying a method of Geoffrion and co-workers to the dynamic case.

In single-objective dynamic optimization problems, there are four main classes of algorithms: (1) gradient methods, (2) dynamic programming, (3) two-point boundary value problem methods (Pontryagin's principle), and (4) finite difference/mathematical programming methods (cf. Sage, 1968). In the following discussion, we present multicriteria counterparts for these classes, respectively: (1) the gradient method, Section 3.4; (2) dynamic programming, Section 4; (3) the two-point boundary value problem method, Section 5; and (4) multilevel methods, Section 6.

In Section 7, problems with decomposition with respect to time spans are dealt with. In practice, decision variables can often be decomposed into classes (layers) with different optimization horizons. The multiobjective generalization of this temporal hierarchy in the single-objective case [as addressed by Lefkowitz (1977) and Mesarovic, Macko and Takahara (1970)] is formulated.

2. PROBLEM FORMULATION
Consider the discrete-time system shown, in block diagram form, in Fig. 1. In the ith stage, i = 0, ..., N-1, let x_i = state vector; f^i = (f_1^i, ..., f_{n_i}^i)^T = vector of objectives; n_i = the number of objectives; and m_i = decision vector. Furthermore, let the state change at stage i according to

x_{i+1} = H_i(x_i, m_i),   i = 0, ..., N-1,   x_0 given.   (1)

The stages' objectives are assumed to be functions of the stage variables

f_j^i = f_j^i(x_i, m_i),   i = 0, ..., N-1;   j = 1, ..., n_i.   (2)

For brevity, let

x = (x_0, ..., x_{N-1})^T,
f = (f^0, ..., f^{N-1})^T = vector of all stage objectives,
m = (m_0, ..., m_{N-1})^T = vector of all decision variables.

If the dependence of x on x_0 is suppressed, then x = x(m), as determined by (1). Finally, let the decision vectors m_i be constrained by the following equation

g_i(x_i, m_i) ≤ 0,   i = 0, ..., N-1.   (3)

Also, other constraints can be imposed: see the handling of other types of constraints in Haimes and Tarvainen (1980) and Tarvainen and Haimes (1980b). The overall objectives F_1, ..., F_n in which the decision maker is interested are, in general, some functions of f (generalization to the case where the F_j depend also on the final state x_N is straightforward). We can now state the multicriteria discrete-time problem as follows:

min_m {F_1(f^0, ..., f^{N-1}), ..., F_n(f^0, ..., f^{N-1})}   (4)

with f^i = (f_1^i(x_i, m_i), ..., f_{n_i}^i(x_i, m_i))^T, subject to (1) and (3).

In this paper, a solution of a vector minimization problem means a vector that a decision maker prefers over other feasible vectors. That is, the theory developed in this paper is designed to be used by a decision maker. (We are not dealing with the entire Pareto-optimal set.) Thus, solutions of the multiobjective problems depend on the preference structure of the decision maker. On the other hand, we assume, mainly for notational simplicity, that a solution of a multiobjective problem obtained by a particular decision maker is unique. For brevity, F denotes (F_1, ..., F_n)^T.

The problem formulation is rather general, allowing, for example, objectives that may vary with time. In one common case of interest, we have n_i = n (i = 0, ..., N-1) and, for j = 1, ..., n,

F_j = Σ_{i=0}^{N-1} f_j^i.   (5)

That is, the decision maker is interested in the total (or average) of the objectives over the optimization period (Tarvainen, 1981).

3. METHODS USING SINGLE-OBJECTIVE SUBPROBLEMS

Most multicriteria techniques actually reduce the multiobjective problem to a series of single-objective problems (Chankong and Haimes, 1983). Hence, we can derive several methods for the dynamic case by applying such techniques developed for the static case to the dynamic one and solving the ordinary dynamic problems with standard optimization methods. Below, three such dynamic multicriteria methods are briefly reviewed.
FIG. 1. Structure of the system (solid line) and objectives (broken line).
3.1. Weighting method
In the weighting method, problem (4) is converted into a series of single-objective problems of the following form (subject to the same constraints)

min Σ_{i=1}^{n} w_i F_i   (6)

where w_i ≥ 0. By varying the weights, different Pareto-optimal points result. All subproblems of the form (6) are single-objective discrete-time problems. They can be solved by any appropriate algorithm developed for discrete-time dynamic problems. There are plenty of techniques available, especially when we have objectives F_j of the form given in (5) (Chankong and Haimes, 1983). The weighting method, although efficient computationally, has a severe limitation: in order to generate all Pareto-optimal points, the Pareto-optimal surface in the objective space must be convex (and even strictly convex, to avoid additional complications). For more details see Haimes, Hall and Freedman (1975), in which the continuous case is treated. As can readily be seen, there is no essential difference between the discrete-time and the continuous-time cases.

3.2. SWT-Pontryagin
Here the ε-constraint method, where one objective function constitutes the primary objective and all other objectives act as constraints [see Haimes, Hall and Freedman (1975) and Haimes and Hall (1974)], is utilized. Pareto-optimal solutions for problem (4) are generated using the following ε-constraint form

min F_1   (7)

subject to

F_2 ≤ ε_2, ..., F_n ≤ ε_n   (8)

and the original constraints [equations (1) and (3)]. By varying the ε's, different Pareto-optimal points are generated. The SWT method, which is based on the ε-constraint method, does not suffer from the convexity limitations of the weighting method. A drawback is the additional computational work caused by the additional constraints. For more details see Haimes, Hall and Freedman (1975), in which continuous-time cases are treated. In the SWT-Pontryagin method (Haimes and Hall, 1974; Haimes, 1977; Chankong and Haimes, 1978), the subproblems are solved using Pontryagin's principle; the objectives are of an integral form equivalent to (5) in the continuous case.
3.3. Constraint method/dynamic programming
A multiobjective dynamic programming method for capacity expansion was developed by Chankong, Haimes and Gemperline (1981). This approach circumvents the increase in the dimensionality of the state space that the constraints F_j ≤ ε_j (j = 2, ..., n) might otherwise cause, as discussed by Tauxe and co-workers (1979a, b, 1980). In their work, separability conditions for the objective functions and constraints were assumed.
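The ε-constraint form (7)-(8) underlying Sections 3.2 and 3.3 can likewise be sketched on an invented toy problem; the dynamics, objectives, decision grid, and ε values are assumptions for illustration, and the subproblems are solved by enumeration rather than by Pontryagin's principle or dynamic programming:

```python
# ε-constraint form (7)-(8): min F_1 subject to F_2 <= ε, on invented toy data.
import itertools

N, x0 = 2, 2.0
moves = [-1.0, -0.5, 0.0, 0.5, 1.0]    # admissible decisions m_i

def F(m):
    x, F1, F2 = x0, 0.0, 0.0
    for mi in m:
        F1, F2, x = F1 + x * x, F2 + mi * mi, x + mi   # (1) and form (5)
    return F1, F2

points = []
for eps in [0.0, 0.25, 1.0, 2.0]:
    feasible = [m for m in itertools.product(moves, repeat=N) if F(m)[1] <= eps]
    # ties in F_1 are broken by F_2, so the point found is Pareto-optimal
    # and not merely weakly Pareto-optimal
    best = min(feasible, key=lambda m: F(m))
    points.append(F(best))

for F1, F2 in points:
    print(f"F1={F1:.2f}  F2={F2:.2f}")
```

Unlike the weighting method, varying ε can also reach Pareto-optimal points in nonconvex parts of the Pareto surface, at the price of the extra constraints.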
3.4. Gradient method
This method is derived below by directly applying the method of Geoffrion, Dyer and Feinberg (1972) to the dynamic case. The method is called the gradient method because it reduces to the usual gradient method when the number of objectives is one, and because the starting point of its derivation is the application of the gradient method to the utility function. Consider problem (4) subject to (1) [without the inequality constraints (3)]. Assume that there exists a differentiable underlying loss (negative utility) function L(F_1, ..., F_n): R^n → R for the decision maker's preferences. The loss function is not directly used in the method. Instead, indifferent trade-offs (marginal rates of substitution) given by the decision maker are used. The following relationship between the loss function and these trade-offs is used:
[∂L(F_1, ..., F_n)/∂F_j] / [∂L(F_1, ..., F_n)/∂F_i] = λ*_{ij}(F_1, ..., F_n)   (9)

where λ*_{ij}(F_1, ..., F_n) = the decision maker's indifferent trade-off between the ith and jth objectives when the objectives have the values F_1, ..., F_n. For notational convenience, we define λ*_{ij} also for i = j; from (9), we get λ*_{ii} = 1, which we use as the definition of λ*_{ii}.

The method is an iterative one; the iteration index is denoted by the superscript k. At the kth iteration we have a decision vector m^k with a corresponding value f^k for f and a corresponding value F^k for the overall objective vector. To serve as a new direction Δm^k for seeking a better value of m, the direction in which the loss function decreases fastest is selected. Hence, we evaluate the total derivative of L with respect to m at the values determined at the kth iteration:

(d/dm) L[F_1[f(x(m), m)], ..., F_n[f(x(m), m)]]|_{m=m^k}
  = Σ_{s=1}^{n} (∂L/∂F_s)(F^k) (d/dm) F_s[f(x(m), m)]|_{m=m^k}
  = (∂L/∂F_1)(F^k) Σ_{s=1}^{n} [(∂L/∂F_s)(F^k)/(∂L/∂F_1)(F^k)] (d/dm) F_s[f(x(m), m)]|_{m=m^k}
  = (∂L/∂F_1)(F^k) Σ_{s=1}^{n} λ*^k_{1s} (d/dm) F_s[f(x(m), m)]|_{m=m^k}
  = (∂L/∂F_1)(F^k) (d/dm) [Σ_{s=1}^{n} λ*^k_{1s} F_s[f(x(m), m)]]|_{m=m^k}.   (10)

Here, in the first equality, the chain rule of differentiation is used. In the second equality, ∂L/∂F_1 is factored out (this derivative is assumed to be nonzero, indeed strictly positive, because we are minimizing objectives). The third equality results from the use of (9). In the fourth equality, constants are simply moved inside the derivative. The term ∂L/∂F_1 in (10) is a positive constant, and it can be left out in determining the direction of the gradient. Note that the remaining d/dm term in (10) is the same as in the single-objective case with a performance index J(x(m), m) = Σ_i λ*^k_{1i} F_i[f(x(m), m)]. So, the standard techniques (Bryson and Ho, 1969, Section 2.2) can be applied to evaluate the d/dm term with the aid of an adjoint variable. Without repeating this procedure in this particular case, the results are given in the following (the ∂L/∂F_1 term is dropped):

dL/dm_i|_{m=m^k} = Σ_{s=1}^{n} λ*^k_{1s} [∂F_s(f^k)/∂f^i] [∂f^i(x_i^k, m_i^k)/∂m_i] + δ^T_{i+1} [∂H_i(x_i^k, m_i^k)/∂m_i],   i = 0, ..., N-1   (11)

where δ_i (i = 1, ..., N) is calculated (backwards) from the following difference equation

δ_i = [∂H_i(x_i^k, m_i^k)/∂x_i]^T δ_{i+1} + [Σ_{s=1}^{n} λ*^k_{1s} [∂F_s(f^k)/∂f^i] [∂f^i(x_i^k, m_i^k)/∂x_i]]^T,   i = 1, ..., N-1;   δ_N = 0.   (12)
Using (11) and (12), the gradient can be calculated. Then we can proceed as proposed by Geoffrion, Dyer and Feinberg (1972); that is, we give the decision maker the values of the objectives along the direction of the negative gradient (note: we have a minimization task here). The decision maker decides how far to go in this direction. The indifferent trade-offs at that point are then asked of the decision maker. Then a new gradient is calculated using (11) and (12), and so on, until the preferred solution is found. Computationally, each subproblem of this algorithm is no more tedious than a single-objective problem. Note that it was not necessary to make any assumptions concerning the structure of the objectives [such as (5)]. Note also that no constraints, such as those found in (3), were assumed in the development. Constraints on the decision vector of the form g_i(m_i) ≤ 0 are usually easily incorporated into a gradient method, but constraints on the state variable x are generally not easy to handle.
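A minimal numerical sketch of one ingredient of this method, the evaluation of the gradient via (11) and (12): the scalar dynamics, the stage objectives, and the fixed trade-off λ*_{12} = 0.5 are invented for illustration; the objectives are taken in the additive form (5), so ∂F_s/∂f^i = 1; and the interactive steps (the decision maker choosing the step length and re-stating the trade-offs at each iterate) are replaced by a fixed step:

```python
import numpy as np

N = 3
lam = np.array([1.0, 0.5])     # [λ*_11, λ*_12]: assumed, fixed trade-offs

def H(x, m):     return x + m                       # state equation (1), scalar
def df_dx(x, m): return np.array([2.0 * x, 0.0])    # ∂f^i/∂x_i for f^i = (x^2, m^2)
def df_dm(x, m): return np.array([0.0, 2.0 * m])    # ∂f^i/∂m_i

def gradient(x0, m):
    # forward pass: simulate the state trajectory under (1)
    x = [x0]
    for i in range(N):
        x.append(H(x[i], m[i]))
    # backward pass: adjoint recursion (12) with delta_N = 0 (here ∂H/∂x = 1)
    delta = [0.0] * (N + 1)
    for i in range(N - 1, 0, -1):
        delta[i] = delta[i + 1] + lam @ df_dx(x[i], m[i])
    # stage gradients (11) (here ∂H/∂m = 1)
    return np.array([lam @ df_dm(x[i], m[i]) + delta[i + 1] for i in range(N)])

m = np.zeros(N)
for _ in range(200):             # steepest descent with a fixed step; in the
    m -= 0.1 * gradient(2.0, m)  # method proper the decision maker picks the
print(np.round(m, 3))            # step and updates the trade-offs each iterate
```

For this quadratic example the iteration converges to m = (-16/11, -4/11, 0); with a real decision maker, the trade-offs, and hence the weighted performance index, would change along the way.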
As can readily be seen, a corresponding algorithm for the continuous-time problem can be derived.
4. DYNAMIC PROGRAMMING

Consider problem (4) with the objective structure given in (5) and with the constraints in (3). Let us apply the principle of optimality, starting with the last stage.* The last stage's decision depends on, in addition to the value of x_{N-1}, the cumulated values of the objectives. That is, if the decision maker knows the value of Σ_{i=0}^{N-2} f_j^i (j = 1, ..., n) and the value of x_{N-1}, he is able to make the decision on m_{N-1}. In other words, the state vector for the ith stage is

s_i = [x_i, F_1^i, ..., F_n^i]^T,   i = 0, ..., N-1   (13)

where

F_j^i = Σ_{t=0}^{i-1} f_j^t,   i = 1, ..., N-1;   F_j^0 = 0;   j = 1, ..., n.   (14)

For the last stage, we then have, for every value of [x_{N-1}, F_1^{N-1}, ..., F_n^{N-1}], the following multicriteria problem:

min_{m_{N-1}} {F_1^{N-1} + f_1^{N-1}(x_{N-1}, m_{N-1}), ..., F_n^{N-1} + f_n^{N-1}(x_{N-1}, m_{N-1})}.   (15)

Denote the value of f_j^{N-1} at the preferred solution by P_j^{N-1}(s_{N-1}), j = 1, ..., n. Next, consider the (N-2)th stage. Let s_{N-2} be given. The decision vector is m_{N-2}, which will cause the following value for the next state:

s_{N-1}(s_{N-2}, m_{N-2}) = [H_{N-2}(x_{N-2}, m_{N-2}), F_1^{N-2} + f_1^{N-2}(x_{N-2}, m_{N-2}), ..., F_n^{N-2} + f_n^{N-2}(x_{N-2}, m_{N-2})]^T.

The corresponding preferred decisions for the (N-1)th stage result in values given by P_j^{N-1}(s_{N-1}). By the principle of optimality, we can now define the problem for the (N-2)th stage as follows, for each s_{N-2}:

min_{m_{N-2}} {F_1^{N-2} + f_1^{N-2}(x_{N-2}, m_{N-2}) + P_1^{N-1}(s_{N-1}(s_{N-2}, m_{N-2})), ..., F_n^{N-2} + f_n^{N-2}(x_{N-2}, m_{N-2}) + P_n^{N-1}(s_{N-1}(s_{N-2}, m_{N-2}))}.

In general, let

P_j^i(s_i) = Σ_{t=i}^{N-1} f_j^t(x_t^0, m_t^0)

where x_t^0 and m_t^0 (t = i, ..., N-1) are preferred values when starting from s_i; i = 1, ..., N-1; j = 1, ..., n. That is, P_j^i(s_i) is the resulting cumulative increase of the F_j objective when starting from s_i. The general optimization problem is correspondingly, for each s_i and i = 0, ..., N-2,

min_{m_i} {F_1^i + f_1^i(x_i, m_i) + P_1^{i+1}(s_{i+1}(s_i, m_i)), ..., F_n^i + f_n^i(x_i, m_i) + P_n^{i+1}(s_{i+1}(s_i, m_i))}   (16)

where

s_{i+1}(s_i, m_i) = [H_i(x_i, m_i), F_1^i + f_1^i(x_i, m_i), ..., F_n^i + f_n^i(x_i, m_i)]^T.

Note [see (15)] that problem (16) applies also for i = N-1 when we define P_j^N = 0, j = 1, ..., n. When solving problems (16) backwards, the last problem (i = 0) is solved once for the initial state s_0 = (x_0, 0, ..., 0)^T [see (14)], and the whole problem is solved.

The only essential difference between this multiobjective case and a single-objective case is the inclusion here of the past performance via the F_j^i. This is because, in a general case, preferences for performance in the future depend also on the past levels of the objectives. Hence, the dimension of the state space increases by n. Note that if we, as above, let n = 1, we have a single-objective problem. The F_1^i in the corresponding problems in (16) drop out as constants in single-objective problems, and we have a usual dynamic programming problem. Compared to the method of Section 3.3, the above method assumes decision making with respect to each stage, which sometimes may be a drawback. On the other hand, the above method produces a feedback law.

*Brown and Strauch (1965) show the applicability of the principle of optimality to the Pareto-optimality problem. Tarvainen (1981) shows the applicability of the principle of optimality with respect to preferred solutions.
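A compact sketch of the backward recursion (15)-(16) on an invented toy problem: the grids, the dynamics, and the stage objectives are assumptions for illustration, and the stage-wise decision maker is simulated by a fixed scalarization of the cumulated objectives (in the method proper, a preferred solution is elicited from the decision maker at each stage):

```python
N = 3
moves = [-1, 0, 1]                               # decisions m_i (finite grid)
def H(x, m): return min(max(x + m, 0), 3)        # state equation (1), clipped grid
def f(x, m): return (x, abs(m))                  # stage objectives (f_1^i, f_2^i)
def prefer(F1, F2): return F1 + 0.5 * F2         # stand-in for the decision maker

# P[i] caches, per augmented state s_i = (x_i, F_1^i, F_2^i) of (13)-(14), the
# preferred cumulative objective increases (P_1^i, P_2^i) and the chosen decision.
P = [dict() for _ in range(N + 1)]

def solve(i, s):
    if i == N:
        return (0, 0, None)                      # P_j^N = 0
    if s not in P[i]:
        x, F1, F2 = s
        best = None
        for m in moves:
            f1, f2 = f(x, m)
            nxt = (H(x, m), F1 + f1, F2 + f2)    # s_{i+1}(s_i, m_i)
            p1, p2, _ = solve(i + 1, nxt)
            cand = (f1 + p1, f2 + p2, m)
            # the stage-i choice compares F_j^i + f_j^i + P_j^{i+1}, as in (16)
            if best is None or (prefer(F1 + cand[0], F2 + cand[1])
                                < prefer(F1 + best[0], F2 + best[1])):
                best = cand
        P[i][s] = best
    return P[i][s]

s, policy = (2, 0, 0), []                        # s_0 = (x_0, 0, ..., 0)
for i in range(N):
    p1, p2, m = solve(i, s)
    policy.append(m)
    x, F1, F2 = s
    f1, f2 = f(x, m)
    s = (H(x, m), F1 + f1, F2 + f2)
print("decisions:", policy, " objectives:", s[1:])
```

Note how the cache key carries the cumulated objectives, so the state space grows by the number of objectives, exactly as discussed above.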
5. TWO-POINT BOUNDARY VALUE METHODS

In the following discussion, the same necessary condition that was derived for static hierarchical multiobjective methods (Tarvainen and Haimes, 1980b) is used. Given the problem

min {F_1(z), ..., F_n(z)} subject to w(z) ≤ 0,   (17)

then, under general assumptions in connection with trade-offs, a necessary condition for z° to be a preferred solution of the decision maker(s) is that the following set of equations be satisfied:

Λ^T(F°) (∂F/∂z)(z°) + p^T (∂w/∂z)(z°) = 0,
p^T w(z°) = 0,   p ≥ 0,   w(z°) ≤ 0.   (18)

Here, F = (F_1, ..., F_n)^T, F° = F(z°), and Λ(F°) = [1, λ*_{12}(F°), ..., λ*_{1n}(F°)]^T, with λ*_{1k}(F°) = the decision maker's indifferent trade-off (the marginal rate of substitution) between F_1 and F_k (k = 2, ..., n); p is a multiplier vector, and the superscript T denotes transpose. Applying this necessary condition to the problem given by (1)-(4) yields the following equations (suppressing some arguments) in addition to (1) and (3):
-δ_i^T + Λ_s^{iT}(f) (∂f^i/∂x_i) + λ^{i(i+1)}(f) δ_{i+1}^T (∂H_i/∂x_i) + η_i^T (∂g_i/∂x_i) = 0,   i = 1, ..., N-1   (19)

with

δ_N = 0,   (20)

Λ_s^{iT}(f) (∂f^i/∂m_i) + λ^{i(i+1)}(f) δ_{i+1}^T (∂H_i/∂m_i) + η_i^T (∂g_i/∂m_i) = 0,   i = 0, ..., N-1,   (21)

η_i ≥ 0,   η_i^T g_i = 0,   i = 0, ..., N-1,   (22)

where*

Λ_s^i(f) = Λ^i(f)/λ_1^i(f),   i = 0, ..., N-1   (23)

with

Λ^i(f) = [λ_1^i(f), ..., λ_{n_i}^i(f)]^T,   i = 0, ..., N-1,   (24)

λ^{ij}(f) = λ_1^j(f)/λ_1^i(f),   i, j = 0, ..., N-1,   (25)

and, in particular,

λ^{i(i+1)}(f) = λ_1^{i+1}(f)/λ_1^i(f),   i = 0, ..., N-2.   (26)

Here δ_i (i = 1, ..., N) and η_i (i = 0, ..., N-1) are vectors of multipliers. Above, it is assumed that one component of Λ^i(f), arbitrarily the first one, is nonzero. Furthermore, we assume here that the components of Λ^i(f) (i = 0, ..., N-1) are nonnegative for all f. If this is not the case, the resulting multicriteria problems have to be interpreted in a generalized sense; compare this with the corresponding situation in the static case in Tarvainen and Haimes (1980b).

In the remainder of this section, we consider a special case without inequality constraints. Thus, (3) and (22) drop out, as well as the g_i terms in (19) and (21). Note that (21) is, by the general necessary condition, a necessary condition for the following multicriteria problems (i = 0, ..., N-1):

min_{m_i} {f_1^i(x_i, m_i) + λ^{i(i+1)}(f) δ_{i+1}^T H_i(x_i, m_i), f_2^i(x_i, m_i), ..., f_{n_i}^i(x_i, m_i)}   (27)

with indifferent trade-offs given by the vector Λ_s^i = (1, λ_{s2}^i, ..., λ_{sn_i}^i)^T (that is, the indifferent trade-off between f_1^i and f_j^i to be used is λ_{sj}^i). Furthermore, note that (19) and (20) are recursion formulas for δ:

δ_i = (∂f^i/∂x_i)^T Λ_s^i(f) + λ^{i(i+1)}(f) (∂H_i/∂x_i)^T δ_{i+1},   i = 1, ..., N-1,   (28)

δ_N = 0.   (29)

*Subscript s denotes scaling; Λ_s^i is a vector.
Altogether we have the elements of a two-point boundary value problem, as in the single-objective case: the system equation (1) with an initial value x_0, an auxiliary variable δ with a difference equation and the boundary value δ_N = 0, and an optimization problem (27) for each stage. One way this differs from the single-objective case is that, at each stage of a general case, there is a dependence on the values of objectives at other stages (via f). A natural way to overcome this difficulty is to use a relaxation method. That is, first a value for the vector f is guessed. At the beginning, in solving problem (27) or evaluating δ_i [(28)] at a stage i, guessed values of objectives at other stages are used when needed. Every time a new value for a component of f is evaluated in solving problem (27), the new value is used in the other stages when needed.
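A toy sketch of iterating these elements with a forward-backward sweep (guess the adjoint, solve the stage problems forward, recompute the adjoint backward by (28)-(29)): the problem data, the damping factor, and the fixed trade-off 0.5 standing in for the decision maker are all assumptions for illustration, and the objectives are additive as in (5), so the stage problems (27) have a closed-form solution:

```python
# Damped relaxation for the two-point boundary value formulation of Section 5,
# on invented toy data (scalar state, x_{i+1} = x_i + m_i, f^i = (x^2, m^2)).
N = 3
x0 = 2.0
lam2 = 0.5                     # simulated trade-off between f_1^i and f_2^i
delta = [0.0] * (N + 1)        # guessed adjoint trajectory; delta[N] = 0 by (29)
alpha = 0.2                    # damping; plain relaxation diverges on this example

for _ in range(100):
    # forward sweep: solve each stage problem (27) with the guessed delta;
    # here min_m [x^2 + 0.5*m^2 + delta_{i+1}*(x + m)] gives m = -delta_{i+1}
    x, m = [x0], []
    for i in range(N):
        m.append(-delta[i + 1])
        x.append(x[i] + m[i])
    # backward sweep: recompute the adjoint by the recursion (28), delta_N = 0
    new = [0.0] * (N + 1)
    for i in range(N - 1, 0, -1):
        new[i] = 2.0 * x[i] + new[i + 1]
    delta = [(1 - alpha) * d + alpha * nd for d, nd in zip(delta, new)]

print([round(mi, 3) for mi in m])
```

The damping step mirrors the convergence-area improvement referred to below; with a real decision maker, the trade-offs used in (27) and (28) would be elicited rather than fixed.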
Several techniques developed in the single-objective case can now be directly modified to handle the above two-point boundary value problem. As an example, consider the relaxation method. Its basic idea is first to guess a trajectory for the auxiliary variable (δ, here), then to integrate the state equation, solving at each stage for the decision variable using the guessed value of δ. Then the auxiliary variable is solved backwards, using the obtained values for x and m, and so on in turn. For improvement of the convergence area, refer to the damping technique (PDE method) in Steven (1971).

As in the single-objective case, a strong optimization result for every stage (like Pontryagin's theorem) does not hold for this discrete-time problem in a general case. Hence, here too, some convexity requirements are generally needed; it is difficult to handle stationary solutions of the multicriteria problem (27). For convexity conditions, see the corresponding treatment of nonfeasible methods in Tarvainen and Haimes (1980b). Note that, as in the gradient method of Section 3.4, the objective structure may be general and constraints of the form g_i(m_i) ≤ 0 are easily incorporated, but constraints on the x_i are difficult to handle.

6. MULTILEVEL METHODS

6.1. Feasible method
In the feasible method, x is used as a coordination variable. Consider the necessary conditions (1), (3), and (19)-(26) when x = x^k. Note that (1), (3), (21), and (22) represent a necessary condition for the following multicriteria problems (i = 0, ..., N-1):

min_{m_i} {f_1^i(x_i^k, m_i), ..., f_{n_i}^i(x_i^k, m_i)}   (30)

subject to

H_i(x_i^k, m_i) - x_{i+1}^k = 0,   g_i(x_i^k, m_i) ≤ 0,

with indifferent trade-offs given by Λ_s^i(f(x^k, m)) [(23), (24)]. In a general case, the trade-offs for the ith problem (30) depend on the values of objectives in other stages. In such a case, we can use the relaxation approach explained in Section 5; that is, we use the latest values for objectives in other stages. Alternatively, we can take objectives that affect other subsystems via the trade-offs as additional coordination variables (Tarvainen and Haimes, 1980a, b).

Assume, for a moment, that the problems in (30) have solutions m_i^k (i = 0, ..., N-1). Then the corresponding necessary conditions (1), (3), (21), and (22) are satisfied with some values of the multiplier vectors δ_i, η_i, denoted by δ_i^k, η_i^k, i = 0, ..., N-1. Furthermore, if we assume that the regularity conditions (Luenberger, 1973) hold for constraints (1) and (3), then the δ_i^k and η_i^k are unique. Note that it follows from (22) that, for every component of g_i(x_i^k, m_i^k) that is nonzero, the corresponding η_i-component is zero. The remaining η_i-components and the δ_i are solvable from (21), which is a linear equation with respect to these multipliers.

When the η_i^k and δ_i^k multiplier vectors are solved, they, along with the m_i^k and x_i^k, can be substituted into (19) to check whether these remaining necessary equations are satisfied. If they are not satisfied, a new value of x, x^{k+1}, is determined, and the procedure is repeated. Determining x^{k+1} can be based on the same idea as the selection of m^{k+1} in the method of Section 3.4. That is, we first determine a direction Δx^k in which the underlying loss function decreases the fastest. Then the decision maker decides, by direct binary comparisons or via trade-offs (Tarvainen and Haimes, 1980b), how far to proceed in this direction.

In contrast to the gradient method of Section 3.4, the g_i(x_i, m_i) ≤ 0 constraints do not present a principal difficulty, but computationally they are naturally not always easy. When they are present, feasible direction methods, for example, can be tried. In our case, applying the feasible direction method in its simplest form, without any jamming prevention, results in the following linear programming problem, from which the solution Δx^k is determined:

min Σ_{i=1}^{N-1} λ^{1i}(f^k) {(G_x^i)^T Δx_i - (η_i^k)^T [(∂g_i/∂x_i) Δx_i + (∂g_i/∂m_i) Δm_i]}   (31)

subject to

Δx_{i+1} = (∂H_i/∂x_i) Δx_i + (∂H_i/∂m_i) Δm_i,
(∂(g_i)_j/∂x_i) Δx_i + (∂(g_i)_j/∂m_i) Δm_i ≤ 0   for j ∈ {j | (g_i(x_i^k, m_i^k))_j = 0},
-1 ≤ Δx_i ≤ 1 (component-wise),   i = 1, ..., N-1;   Δx_0 = 0,

where

(G_x^i)^T = Λ_s^{iT}(f^k) (∂f^i/∂x_i) - (δ_i^k)^T + λ^{i(i+1)}(f^k) (δ_{i+1}^k)^T (∂H_i/∂x_i) + (η_i^k)^T (∂g_i/∂x_i).
For a derivation of problem (31), see a similar case in Tarvainen and Haimes (1980b). For a more elaborate feasible direction algorithm in a less general multicriteria case, see Geoffrion and Hogan (1972). It was assumed above that the problems in (30) have solutions. In general, the only difficulty that may exist in this respect is related to feasibility (Tarvainen and Haimes, 1980b). That is, x^k, if arbitrarily selected, may be such that there is no value of m_i that even satisfies the constraints. Similarly, even if Δx^k is locally feasible, proceeding in that direction may result in nonfeasible points. Figure 2 summarizes the feasible scheme. Other feasible schemes may be derived, as in Haimes and Tarvainen (1980) and Tarvainen and Haimes (1980a, b).

6.2. Nonfeasible method
In the nonfeasible method, the connections between the stages are relaxed. That is, the value of the state x_i of stage i is not necessarily the same as the value determined by stage i-1 until the coordination is ended. Consequently, we make the following changes in the problem formulation. Let y_i denote the final state of stage i, i = 0, ..., N-1. System equation (1) is replaced by the equation

y_i = H_i(x_i, m_i),   i = 0, ..., N-1.   (32)

And, as additional constraints, we take

y_i - x_{i+1} = 0,   i = 0, ..., N-2.   (33)
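A toy numerical sketch of this relaxation, with price coordination on the interconnection constraints (33) in the manner described below: the two-stage data, the fixed trade-off 0.5 standing in for the decision maker in stage 0, and the step size are invented for illustration, and each stage problem is solved in closed form:

```python
# Nonfeasible (price) coordination of Section 6.2 on a two-stage scalar problem.
# All problem data are invented for illustration.
x0 = 2.0
v = 0.0          # multiplier for the relaxed connection y_0 - x_1 = 0, eq. (33)
c = 0.5          # coordinator step size

for _ in range(100):
    # Stage 0: min over m_0 of f_1^0 + 0.5*f_2^0 + v*y_0, with y_0 = x_0 + m_0,
    # f_1^0 = m_0^2, f_2^0 = (m_0 + 1)^2.  Closed-form minimizer:
    m0 = -(1.0 + v) / 3.0
    y0 = x0 + m0
    # Stage 1: min over x_1 of f_1^1 - v*x_1, with f_1^1 = (x_1 - 1)^2:
    x1 = 1.0 + v / 2.0
    # Coordinator: update the price from the interconnection discrepancy (33)
    v = v + c * (y0 - x1)

print(round(m0, 3), round(y0, 3), round(x1, 3), round(v, 3))
```

At convergence the discrepancy y_0 - x_1 vanishes, and the decisions coincide with those of the original (non-relaxed) problem; in the scheme proper, the trade-offs would also be updated by the coordinator.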
FIG. 2. A feasible scheme. [Coordinator: check whether the remaining necessary conditions (19) hold for i = 1, ..., N-1; if yes, stop; otherwise determine x^{k+1} = x^k + c^k Δx^k, where c^k is selected in interaction with the decision maker and Δx^k is the solution of problem (31). Stage i: solve problem (30) and evaluate Λ_s^i, δ_i, η_i, ∂f^i/∂x_i, ∂H_i/∂x_i, and ∂g_i/∂x_i at the preferred solution.]

(Note that x_N does not appear in the objectives; hence we do not have to write (33) for i = N-1.) Otherwise, the problem formulation remains the same. Note that the change in the formulation is a technical one; the problem remains the same. To derive nonfeasible schemes, the general necessary condition for a preferred solution (Section 5) is applied to the new problem formulation. This results in a set of equations like (19)-(26). Now, instead of fixing x, the coordinator fixes the v_i (i = 1, ..., N-1), multiplier vectors related to the new constraints in (33). As in deriving the feasible scheme, a set of equations is noticed to be necessary conditions for multicriteria problems; these are given in Fig. 3 (for i = 0, ..., N-1). Note that the trade-off vector Λ_s^i is evaluated at the values of the original objectives, not of the modified objectives. In Fig. 3, v_i = 0 for i = 0, N. The remaining necessary conditions consist of (33). They are checked by the upper level. Based on the discrepancies y_i^k - x_{i+1}^k, a new value for the v_i's can be assigned essentially by the same rule as that used in the single-objective case (Tarvainen and Haimes, 1980b). In Fig. 3, the λ^{i(i+1)}'s (i = 0, ..., N-2), which are the trade-offs between the first objectives in two consecutive stages, are also taken as coordination variables because, in a general case, they depend on values of objectives in several stages. If the Λ_s^i (i = 0, ..., N-1) depend on values of objectives other than f^i, the relaxation method explained above may be used.

In the nonfeasible case, basically because the stages are optimized completely separately, the correspondence between a stage optimization and the corresponding necessary conditions is generally not as problem-free as it is in the feasible method. For example, it may happen that solutions to the necessary conditions are only stationary solutions, and not even unique ones, of the corresponding subproblems. Correspondingly, the coordination algorithm fails. As in the single-objective case, some convexity properties are needed. These are now
FIG. 3. A non-feasible scheme. [Coordinator: check whether y_i^k - x_{i+1}^k = 0 (i = 0, ..., N-2) and λ^{i(i+1)}(f^k) = (λ^{i(i+1)})^k; if yes, stop; otherwise set v_i^{k+1} = v_i^k + c^k e_i^k with e_i^k = y_i^k - x_{i+1}^k, c^k > 0 (i = 1, ..., N-1), and (λ^{i(i+1)})^{k+1} = λ^{i(i+1)}(f^k) (i = 0, ..., N-2). Stage i: solve min over (x_i, m_i, y_i) of {f_1^i + (λ^{i(i+1)})^k (v_i^k)^T y_i - (v_{i-1}^k)^T x_i, f_2^i, ..., f_{n_i}^i} subject to y_i = H_i(x_i, m_i) and g_i(x_i, m_i, y_i) ≤ 0, with preferences given by the trade-off vector Λ_s^i(f^i).]
more complex because we have the trade-offs as variables. The trade-offs are generally not known beforehand, as needed for applying these conditions (Tarvainen and Haimes, 1980b). Hence, in some practical cases, the method must simply be tried. However, as a rule of thumb, and as seen from the form of the general conditions, at least one objective in each stage should be strictly convex.

7. TEMPORAL CONTROL HIERARCHY
7.1. Motivating application
In an industrial plant, raw materials, energy, labor, etc. are transformed into finished products, often through very complex sequences of operations and production processes. There are a great many control and decision-making functions involved in the determination of operating conditions, resource allocations, scheduling of production units, etc., which affect plant performance. These control and decision-making functions organize into a hierarchy based on the relative time scales of the associated actions. We term this structure a temporal control hierarchy. A particular example which serves to illustrate the concepts presented here is steelmaking [see Lefkowitz and Cheliustkin (1976) for a discussion of the various production processes involved in the production of rolled sheet steel and the associated control/decision-making problems]. In the interest of brevity, we consider only a small part of a modern steel works and have eliminated much of the detail. The system shown in Fig. 4 identifies four subsystems (stages) consisting of:

Stage 1. The impurities in the raw iron are
removed to produce steel of specified composition and properties. Stage 2. The molten steel is cast into slabs of specified dimensions which are stored until ready to be rolled. Stage 3. The slabs are heated to a temperature appropriate for the rolling operation. Stage 4. The heated slabs are drawn through successive sets of rolls which progressively reduce the thickness of the steel until the specified gauge is achieved. Each subsystem has its own computer control which generates the control inputs to the subsystem in order to best satisfy one or more objectives associated with the subsystem. The nature of the control problem formulation for each stage may be outlined as follows:
Stage 1.

f1 = { maximize productivity; minimize energy and material losses; minimize probability of off-standard product; satisfy target dates specified by customer orders }

m1 = { specify grade of steel to be produced in each heat; specify operating conditions (e.g. charge composition, blow time, oxygen flow rate, lance position) }

Stage 2.

f2 = { minimize down-time; minimize rejects; minimize maintenance; satisfy target dates }
[Figure: a coordinating control above the local controls for the steel-making furnaces, casting machines, slab heating furnaces and rolling mills; material flows from molten iron through molten steel and slabs to the rolled product.]

FIG. 4. Hierarchical structure in steel-making system.
m2 = { specify dimensions for each slab; specify machine settings, etc. }

Stage 3.
f3 = { minimize deviation of slab temperature from specified value; minimize fuel consumption; minimize delay in slab exit time }

m3 = { specify firing rate of furnace; specify push rate of slabs into furnace }

Stage 4.

f4 = { maximize rolling rate; maximize time between roll changes; minimize power consumption; minimize amount of off-spec. product; minimize delay in meeting due date }

m4 = { select slab sequence for rolling from available set; specify roll speed, roll gap settings, temperature for each slab }
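The four stage-wise formulations above can be condensed into a simple data structure, which is convenient when the stage subproblems are set up programmatically. The encoding below is ours (the key names and wording are illustrative, condensed from the text):

```python
# Illustrative encoding of the stage-wise multiobjective formulations
# from the steel-making example; names are ours, not an API of the paper.
STAGES = {
    1: {"process": "steel-making furnaces",
        "objectives": ["maximize productivity",
                       "minimize energy and material losses",
                       "minimize probability of off-standard product",
                       "satisfy target dates"],
        "decisions": ["grade of steel per heat",
                      "operating conditions (charge, blow time, oxygen)"]},
    2: {"process": "casting machines",
        "objectives": ["minimize down-time", "minimize rejects",
                       "minimize maintenance", "satisfy target dates"],
        "decisions": ["slab dimensions", "machine settings"]},
    3: {"process": "slab heating furnaces",
        "objectives": ["minimize slab-temperature deviation",
                       "minimize fuel consumption",
                       "minimize delay in slab exit time"],
        "decisions": ["furnace firing rate", "slab push rate"]},
    4: {"process": "rolling mills",
        "objectives": ["maximize rolling rate",
                       "maximize time between roll changes",
                       "minimize power consumption",
                       "minimize off-spec product",
                       "minimize delay in meeting due date"],
        "decisions": ["slab sequence", "roll speed, gap, temperature"]},
}
```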
Selection of the objectives associated with each subsystem is based on considerations of ease of implementation of local controls, satisfaction of local constraints and compatibility with overall system objectives. The vector of overall objectives may be formulated thus
F = { maximize productivity; maximize efficiency; minimize costs; minimize fuel consumption; minimize in-storage inventory; minimize lateness in meeting due dates }
The components of F are dependent on the values of the objective functions achieved for each of the subsystems. For example, overall fuel consumption depends on the energy efficiencies achieved in the various operations. The system described above forms part of a hierarchical decision-making structure organized according to the temporal relationships (or relative time scales) among the different control/decision-making functions; e.g. the weekly scheduling function may specify target dates for each stage of production relevant to a given customer's order in order to assure meeting the promised delivery date.

7.2. Temporal control hierarchy

We may consider that the vector of objectives associated with each stage is comprised of two classes of objectives: (a) objectives that are
intrinsically related to system performance but are noncommensurate and perhaps conflicting, and hence are true components of a multiobjective optimization problem, and (b) objectives that are artificially induced by the process of decomposing the overall decision-making problem into subproblems. An example of the first class is the pair of objectives {maximize profit, minimize environmental impact}; an example of the second class is provided by f4, e.g. maximizing the time between roll changes is not a basic objective for the system, but it is consistent with the overall objectives of minimizing costs and maximizing productivity.

The argument extends to the multilayer temporal control hierarchy, where the (k + 1)th layer controller specifies targets for the kth layer decision problem; these targets provide a convenient basis for simplifying the complex multistage optimization problem while at the same time providing feedback mechanisms for reducing the effects of uncertainties with respect to future inputs and events. Thus, we extend the concept of the multilayer, temporal control hierarchy to include not only the partitioning of the overall decision-making problem into subproblems based on different time scales but also the identification with each temporal layer of an associated multiobjective decision-making problem.

Figure 5 shows the basic structure of the temporal control hierarchy. It is assumed that the control problem is partitioned to form an L-layer hierarchy where C^k, the kth layer control function, generates a decision or control action every T^k units of time (on the average), with

T^(k+1) > T^k,    k = 1, 2, ..., L − 1.
Associated with the kth layer control function are the following inputs and outputs:

x^k -- information set describing the state of the plant and environmental factors relevant to the kth layer decision process;

u^(k+1) -- decisions of the (k + 1)th layer controller that exert priority over the kth layer control process; in particular, u^(k+1) provides targets and/or constraints for the kth layer problem such that the actions of C^k are consistent with goals set for the overall problem;

v^(k−1) -- information from the infimal unit C^(k−1) relevant to the kth layer function, e.g. feedback of the results of prior actions of C^k;

m^k -- actions of the kth layer controller applied
FIG. 5. Temporal control hierarchy.
directly to the plant; tasks to be carried out in conjunction with the control output u^k.
Let t^k denote the most recent time, prior to some specified time t, at which a kth layer control action has occurred. We assume that a kth layer action automatically triggers actions at all lower layers in order to ensure consistency with the notion that C^k exerts priority over C^(k−1). Accordingly, we have the ordering

t^(k+1) ≤ t^k ≤ t < t^k + T^k ≤ t^(k+1) + T^(k+1),    (k = 1, 2, ..., L − 1).
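The timing ordering can be checked concretely. In the sketch below (ours, not from the paper) we assume nested periods, with each T^(k+1) a multiple of T^k, so that a higher-layer action instant coincides with (and hence triggers) lower-layer action instants; the most recent kth-layer action time before t is then floor(t / T^k)·T^k.

```python
# Sketch of the layer timing under the assumption of nested periods
# (T^{k+1} a multiple of T^k); the periods chosen are illustrative.
import math

def last_action(t, Tk):
    """Most recent kth-layer action time t^k <= t."""
    return math.floor(t / Tk) * Tk

T = [1, 4, 8]                # T^1 < T^2 < T^3 (illustrative units)
t = 10.5
tk = [last_action(t, Tk) for Tk in T]   # -> [10, 8, 8]

# verify t^{k+1} <= t^k <= t < t^k + T^k <= t^{k+1} + T^{k+1}
for k in range(len(T) - 1):
    assert tk[k + 1] <= tk[k] <= t < tk[k] + T[k] <= tk[k + 1] + T[k + 1]
```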
At time t^k, the kth layer controller generates the output u^k(t^k) based on input information currently available. Thus

u^k(t^k) = f^k[u^(k+1)(t^(k+1)), x^k(t^k), v^(k−1)(t^k)].

The control u^k is assumed to remain fixed over the time interval (t^k, t^k + T^k) at the value determined at time t^k, i.e.

u^k(t) = u^k(t^k),    t^k ≤ t < t^k + T^k.
In a similar manner, the temporal structure implies the relations

v^k(t^k) = g^k[x^k(t^k), v^(k−1)(t^k)],
m^k(t) = h^k[u^k(t^k)],    t ∈ [t^k, t^k + T^k].
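The relations above imply a top-down computation order: u^L is formed first, then each u^k from the supremal decision, the local information set, and the infimal feedback. A minimal sketch (the blending rule standing in for f^k is our placeholder, not the paper's):

```python
# Top-down pass through an L-layer hierarchy:
# u^k(t^k) = f^k[u^{k+1}, x^k, v^{k-1}], computed from layer L down to 1.
L = 3
x = {k: float(k) for k in range(1, L + 1)}   # measured info sets x^k
v = {k: 0.0 for k in range(0, L)}            # infimal feedbacks v^{k-1}

def f_k(u_sup, x_k, v_inf):
    # stand-in decision rule: blend supremal target, local info, feedback
    return 0.5 * u_sup + 0.4 * x_k + 0.1 * v_inf

u = {L + 1: 0.0}                             # no supremal unit above layer L
for k in range(L, 0, -1):                    # priority order: L, L-1, ..., 1
    u[k] = f_k(u[k + 1], x[k], v[k - 1])

m1 = u[1]    # m^k(t) = h^k[u^k(t^k)]; here h^1 is taken as the identity
```

Each u^k would then be held constant (as in the zero-order-hold relation above) until the next kth-layer action time.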
We assume further that the information set x^k(t^k) is obtained by measurement of various plant inputs and outputs that are important to the kth layer action (and are available); in particular

x^k(t^k) = θ^k[ȳ^k(t^k), z̄^k(t^k)]

where ȳ^k(t^k) and z̄^k(t^k) denote, respectively, the subsets of plant outputs and external inputs, averaged over the interval [t^k − T^k, t^k], that are observed and are relevant to C^k. Since y^k is functionally related to the inputs m^k and z^k via the plant model, we have also

x^k(t^k) = φ^k[m̄^k(t^k), z̄^k(t^k)]

where z^k denotes the subset of environmental inputs to the plant that are significant in their effect on the kth layer decision process. The overbar again denotes an averaging or aggregating process over the interval [t^k − T^k, t^k]. Thus, z represents the system's external inputs (disturbances, order inputs, etc.), which are partitioned into subsets, where the ith subset z^i is associated with the control period T^i, i = 1, 2, ..., L. In effect then, the kth layer controller generates a control action at time t^k based on the following information sets:

v^(k−1) -- characterizing the residual effects of z^i over the interval (t^k − T^i, t^k), i = 1, 2, ..., k − 1;
x^k -- characterizing the effect of z^k over the interval (t^k − T^k, t^k);

u^(k+1) -- characterizing the residual effects of z^i over the interval (t^k − T^i, t^k), i = k + 1, k + 2, ..., L.

The temporal control hierarchy provides a basis for simplification and aggregation of the models used in generating the control functions. The classification of the disturbance inputs into subsets z^i, i = 1, 2, ..., L, may be based on correlation methods, sensitivity analysis, spectral analysis, etc. For example, if ω₀ is the lowest frequency at which a particular input exhibits significant energy content (in terms of its power density spectrum) and T₀ = 2π/ω₀, then:

T^k > T₀ -- we may consider the effects of variations of the input to average out over T^k, and hence C^k needs to use only the mean or expected value of the input;

T^k ≈ T₀ -- the disturbance appears as a nonstationary input with respect to C^k, and hence we should attempt to apply compensating action;

T^k ≪ T₀ -- the disturbance is essentially constant over the period T^k, and it can be absorbed within the model as a parameter (which may be updated by an adaptive function at a higher layer).

Thus, with respect to the kth layer control function, (i) the model excludes input variations whose effects tend to average out over T^k, (ii) variables whose effects are relatively constant over the decision horizon are parameterized at their mean values, and (iii) the remaining variables are aggregated. The period T^k is selected (out of a set of feasible control periods) so that the results actually achieved by the plant show an acceptably small mean deviation from the results described by the model.

The planning and scheduling process represents a special application of the temporal hierarchy. The essence of the problem is that the overall decision horizon is long, the system is complex with many diverse inputs, and we have only very limited information concerning the inputs.
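The frequency-based classification rule can be sketched directly. The factor-of-ten bands below are our illustrative choice; the text gives only the qualitative comparison of T^k with T₀ = 2π/ω₀.

```python
# Hedged sketch of the disturbance classification for layer C^k:
# compare the control period Tk with T0 = 2*pi/w0, where w0 is the
# lowest significant frequency of the disturbance's power spectrum.
import math

def classify(Tk, w0):
    T0 = 2 * math.pi / w0
    if Tk >= 10 * T0:
        return "average out"         # C^k uses only the mean value
    if Tk <= T0 / 10:
        return "constant parameter"  # absorb in model; adapt above
    return "compensate"              # nonstationary w.r.t. C^k

print(classify(100.0, 2 * math.pi))  # T0 = 1, Tk >> T0
```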
The temporal control hierarchy formalizes a rational basis for mitigating these difficulties through aggregated models and through feedbacks that tend to reduce the effects of uncertainties and model approximations. We consider, in this application, the following special characteristics:

1. The decision horizon is an integral multiple of the control period, τ^k = rT^k, r > 1.
2. The kth layer control period equals the decision horizon for the (k − 1)th layer, T^k = τ^(k−1).
3. The kth layer control problem is solved repetitively every T^k time units. The solution at time nT^k is associated with the interval [nT^k, nT^k + τ^k]; however, u^k(nT^k) need reflect only the initial segment of the solution function, i.e. the output segment for the interval [nT^k, (n + 1)T^k]. This constitutes the allocation or target for the (k − 1)th layer control problem.

4. The control output u^k(nT^k) is determined on the basis of: (a) the target or allocation u^(k+1) set by the supremal unit; (b) the current estimate of the nature of the environmental inputs forecast over the interval [nT^k, nT^k + τ^k], as presented by the information set x^k(nT^k); (c) the feedback v^(k−1)(nT^k) from the infimal unit identifying deviations of performance from the target values (with respect to the (k − 1)th layer subproblem and the preceding control interval [(n − 1)T^k, nT^k]).

5. The above sequence is repeated with period T^k; the result is a continual updating of u^k based on the current information available.

6. The above procedure extends to each layer of the hierarchy, resulting in an articulated L-layer decision-making structure.
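Items 1-5 amount to a rolling-horizon scheme: re-solve over a horizon of r control periods, but issue only the first segment as the (k − 1)th layer target. A minimal sketch, with a stand-in "optimizer" of our own choosing in place of the actual kth-layer subproblem:

```python
# Rolling-horizon sketch: at each step n, solve over the horizon
# tau^k = r*T^k, but pass down only the initial segment u^k(n T^k).
# "solve_over_horizon" is a placeholder (simple average tracking).

def solve_over_horizon(forecast):
    """Stand-in optimizer: one planned value per control period."""
    avg = sum(forecast) / len(forecast)
    return [avg] * len(forecast)          # planned outputs over tau^k

def rolling_targets(demand, r):
    """Issue u^k(n T^k): the first segment of each horizon solution."""
    targets = []
    for n in range(len(demand) - r + 1):
        plan = solve_over_horizon(demand[n:n + r])  # horizon r*T^k
        targets.append(plan[0])           # only the initial segment
    return targets

print(rolling_targets([4, 8, 6, 2], r=2))  # [6.0, 7.0, 4.0]
```

Re-solving at every period is what lets new forecasts and infimal feedback continually correct the targets, as item 5 describes.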
In summary, the following benefits arise from the above approach: (a) a rational basis is provided for aggregating the variables, permitting simplification of the complex initial formulation of the problem; (b) the effects of uncertainty are reduced because the subproblem solutions (at the lower layers) are based on predictions of the disturbance inputs over shorter horizons; (c) local constraints and locally dominant factors are handled at the lowest control layer consistent with timing, information requirements, and related considerations; (d) there is a natural mechanism for the introduction of feedback of experience, both in plant operation subject to prior control and in the prediction of disturbance inputs over the horizon period; (e) features of the multilayer functional hierarchy may be superimposed to provide for information processing, implementation (direct control), adaptive functions, and the handling of contingency occurrences; (f) systems integration is achieved through a well-defined and clear-cut assignment of tasks and responsibilities to the various layers of control and through information feedback,
all of which provide the basis for coordination of the interacting decision functions; (g) motivation is provided for identifying different objectives with each layer of the temporal hierarchy, which further decompose (and thereby simplify) the overall problem. The hierarchical structure may help the decision maker in assessing the trade-off factors associated with the various objectives, particularly as they reflect the effects of risk and uncertainty on the overall decision-making process.

8. CONCLUSION

Methods for solving multicriteria dynamic problems have been reviewed and derived here. Selection of an algorithm for a particular case depends on the properties of the problem: is it a discrete- or continuous-time problem; are the objectives changing in time or are they constant; what type are the constraints and functions; how available is the decision maker? As in single-objective problems, there is no best method: every method discussed has its best application area. The multicriteria problem is also examined in the context of multilevel methods for solution. Finally, the temporal control hierarchy is presented as motivation for identifying multiple objectives in a dynamic system optimization problem where decision-making sub-problems are structured according to relative time scale and priority of actions.
Acknowledgements--This research was supported in part by the National Science Foundation, grant no. ENG79-30605, "The Integration of the Hierarchical and Multiobjective Approaches", and the Department of Energy, grant no. DEACO-180-RA-50256, "Industry Functional Modeling".

REFERENCES

Brown, T. A. and R. E. Strauch (1965). Dynamic programming in multiplicative lattices. J. Math. Anal. Appl., 12, 364.
Bryson, A. E. and Y. C. Ho (1969). Applied Optimal Control. Blaisdell, Massachusetts.
Chankong, V. and Y. Y. Haimes (1978). The interactive surrogate worth trade-off (ISWT) method for multiobjective decision-making. In S. Zionts (Ed.), Multicriteria Problem Solving. Springer, New York.
Chankong, V. and Y. Y. Haimes (1983). Multiobjective Decision Making: Theory and Methodology. North Holland, New York (to be published).
Chankong, V., Y. Y. Haimes and D. Gemperline (1981). A multiobjective dynamic programming method for capacity expansion. IEEE Trans. Aut. Control, AC-26, 1195.
Geoffrion, A. M. and W. W. Hogan (1972). Coordination of two-level organizations with multiple objectives. In A. V. Balakrishnan (Ed.), Techniques of Optimization. Academic Press, New York, pp. 455-466.
Geoffrion, A., J. Dyer and A. Feinberg (1972). An interactive approach for multi-criterion optimization with an application to the operation of an academic department. Management Sci., 19, 357.
Haimes, Y. Y. and W. A. Hall (1974). Multiobjectives in water resources system analysis: the surrogate worth trade-off method. Water Resources Res., 10, 614.
Haimes, Y. Y., W. A. Hall and H. Y. Freedman (1975). Multiobjective Optimization in Water Resource Systems: The Surrogate Worth Trade-off Method. Elsevier, Amsterdam.
Haimes, Y. Y. (1977). Hierarchical Analyses of Water Resources Systems: Modeling and Optimization of Large-scale Systems. McGraw-Hill, New York.
Haimes, Y. Y. and K. Tarvainen (1980). Hierarchical-multiobjective framework for large scale systems. In P. Nijkamp and J. Spronk (Eds), Multicriteria Analysis in Practice. Gower Press, London.
Himmelblau, D. M. (Ed.) (1973). Decomposition of Large-scale Problems. Elsevier, New York.
Lasdon, L. (1970). Optimization Theory for Large Systems. Macmillan, London.
Lefkowitz, I. (1966). Multilevel approach applied to control system design. Trans ASME, 88D, 392.
Lefkowitz, I. and A. Cheliustkin (1976). Integrated systems control in the steel industry. Report of the International Institute for Applied Systems Analysis, Laxenburg, Austria, CP 76-13.
Lefkowitz, I. (1977). Integrated control of industrial systems. Trans R. Soc., London, 287, 443.
Luenberger, D. G. (1973). Introduction to Linear and Nonlinear Programming. Addison-Wesley, Reading, Mass.
Mahmoud, M. S. (1977). Multilevel systems control and applications: a survey. IEEE Trans. Syst., Man & Cybern., SMC-7, 125.
Mesarovic, M. D., D. Macko and Y. Takahara (1970). Theory of Hierarchical Multilevel Systems. Academic Press, New York.
Olenik, S. C. and Y. Y. Haimes (1979). A hierarchical-multiobjective method for water resources planning. IEEE Trans. Syst., Man & Cybern., SMC-9, 534.
Sage, A. P. (1968). Optimum Systems Control. Prentice-Hall, New York.
Singh, M. G. and A. Titli (1978). Systems--Decomposition, Optimization and Control. Pergamon Press, Oxford.
Singh, M. G. and A. Titli (Eds) (1979). Handbook of Large-scale Systems Engineering Applications.
North-Holland, Amsterdam.
Sobel, M. J. (1975). Ordinal dynamic programming. Management Sci., 21, 967.
Steven, A. (1971). A hybrid method for the solving of optimal control problems in a real time environment. SIMS Meeting, Copenhagen.
Tarvainen, K. and Y. Y. Haimes (1980a). Basic hierarchical-multiobjective optimization techniques. Case Western Reserve University, Cleveland, Ohio, report no. SED-WRP-1-80.
Tarvainen, K. and Y. Y. Haimes (1980b). Coordination of hierarchical-multiobjective systems: theory and methodology. Case Western Reserve University, Cleveland, Ohio, report no. SED-WRP-2-80.
Tarvainen, K. (1981). Hierarchical-multiobjective optimization. Ph.D. dissertation, Department of Systems Engineering, Case Western Reserve University, Cleveland, Ohio.
Tauxe, G. W., R. R. Inman and D. M. Mades (1979a). Multiobjective dynamic programming: a classic problem redressed. Water Resources Res., 15, 1398.
Tauxe, G. W., R. R. Inman and D. M. Mades (1979b). Multiobjective dynamic programming with application to a reservoir. Water Resources Res., 15, 1403.
Tauxe, G. W., D. M. Mades and R. R. Inman (1980). Multiple objectives in reservoir operation. J. Water Resources Planning and Management Division, Proc. Am. Soc. Civ. Engng, 106, 225.
Wismer, D. A. (Ed.) (1971). Optimization Methods for Large Scale Systems with Applications. McGraw-Hill, New York.