COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING 16 (1978) 303-311
© NORTH-HOLLAND PUBLISHING COMPANY
CONGAU - CONSTRAINED MINIMIZATION OF LEAST SQUARES OBJECTIVE FUNCTIONS
L. BELLAGAMBA
School of Aeronautics and Astronautics, Purdue University, West Lafayette, Indiana

Received 1 June 1977
Revised manuscript received 7 July 1977
A technique for constrained parameter optimization is presented and applied to some test problems. The procedure employs an exterior penalty function to transform the constrained objective function into an unconstrained index of performance which is minimized by Gauss’ method. Gauss’ method recasts the minimization problem to one of solving simultaneous linear equations with the variation of the parameters as the unknowns. For the special case of a quadratic objective function and linear constraints, the penalized critical value is obtained in one step. The restriction that the objective function be of least squares form results from using Gauss’ method. The restriction on each constraint is that first derivatives with respect to all parameters be obtainable.
1. Introduction
The general nonlinear programming problem may be stated: Minimize the objective function $f(x)$, $x = \{x_1, x_2, \ldots, x_n\}$, subject to

$$h_i(x) = 0 , \quad i = 1, 2, \ldots, p , \qquad g_j(x) \le 0 , \quad j = 1, 2, \ldots, q . \eqno(1)$$

The solution of the nonlinear programming problem will be specified as $f^*$ at $x^*$. For the special case of no constraints ($p = q = 0$) and an objective function of the form

$$f(x) = \theta'(x)\,\theta(x) , \eqno(2)$$

with the $m \times 1$ column matrix $\theta(x) = \{\theta_1(x), \theta_2(x), \ldots, \theta_m(x)\}$, provided (i) $\mathrm{grad}\,\theta_i \times \mathrm{grad}\,\theta_j \ne 0$ for $i \ne j$, $i, j = 1, 2, \ldots, m$, and (ii) $m \ge n$, Gauss' method has demonstrated excellent performance [1]. A modification of Gauss' method,
the Marquardt-Fletcher algorithm, has been reported to converge for a wider range of objective functions by sacrificing speed of convergence [2]. Briefly, Gauss' method seeks a solution of the $m$ simultaneous equations

$$\theta(x) = 0 , \eqno(3)$$

which minimizes the squared error $f$. Thus, provision (i) is imposed to guarantee that the elements of the column matrix $\theta$ are functionally independent [3]. Provision (ii) guarantees that the number of unknowns does not exceed the number of equations. Though an exact solution of eq. (3) may not exist, if $f$ is continuous in the $n$-space defined by its parameters, then $f$ has an absolute minimum $f^*$ [4]. It is this property which motivates the use of Gauss' method for minimization problems. With regard to potential application, the form of eq. (2) is restrictive but widely used. Examples include total squared error, total structural mass, and crankshaft balance. Section 2 describes the extension of Gauss' method, while section 3 presents the corresponding algorithm. Section 4 presents the results of the application of the formulated algorithm, entitled CONGAU, to a number of test problems.
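Gauss' method itself reduces the unconstrained least squares problem to repeated linear solves: for residuals $\theta$ with derivative matrix $J = \partial\theta/\partial x$, each step solves $(J'J)\,\delta x = -J'\theta$. The following is a minimal Python sketch of one such iteration (the published implementation is FORTRAN IV; the function names here are illustrative assumptions):

```python
import numpy as np

def gauss_step(theta, jac, x):
    """One Gauss step: solve (J'J) dx = -J'theta for the parameter change dx."""
    t = theta(x)                      # m x 1 column matrix of residuals
    Jm = jac(x)                       # m x n matrix of first derivatives
    return x + np.linalg.solve(Jm.T @ Jm, -Jm.T @ t)

# Residuals theta = {x1 - 2, x2 - 1}, so f = theta'theta = (x1-2)^2 + (x2-1)^2.
theta = lambda x: np.array([x[0] - 2.0, x[1] - 1.0])
jac   = lambda x: np.eye(2)           # theta is linear, so J is constant
print(gauss_step(theta, jac, np.array([5.0, 5.0])))   # one step reaches (2, 1)
```

Because $\theta$ is linear here, the quadratic objective is minimized in a single step, the property exploited for the penalized problem in section 2.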
2. Derivation of method

The solution of eq. (1) will be sought by finding the minimum of an index of performance $IP$, where

$$IP(x, w) = \theta'(x)\theta(x) + w^2\big(h'(x)h(x) + l'(x)l(x)\big) . \eqno(4)$$

Eq. (4) is a standard exterior penalty function form consisting of the scalar penalty weight $w$ and the squared error $e$ of the unsatisfied constraints, where

$$e = h'h + l'l . \eqno(5)$$

The column matrix $l$ consists of the elements of $g$ which violate the inequality constraints; hence the dimension of $l$ is $r \times 1$ $(r \le q)$. Thus the second provision of Gauss' method stipulates that $m + p + r \ge n$.

A necessary condition for a local minimum is that the variation of $IP$ be zero, that is,

$$\delta IP = 0 . \eqno(6)$$

Taking the variation of eq. (4) yields

$$\delta IP = 2\big(\theta'\delta\theta + w^2(h'\delta h + l'\delta l)\big) , \eqno(7)$$

where
$$\delta\theta = \frac{\partial\theta}{\partial x}\,\delta x , \eqno(8)$$

$$\delta h = \frac{\partial h}{\partial x}\,\delta x , \eqno(9)$$

$$\delta l = \frac{\partial l}{\partial x}\,\delta x . \eqno(10)$$

Substituting eqs. (8)-(10) into eq. (7) yields

$$\delta IP = 2\left(\theta'\frac{\partial\theta}{\partial x} + w^2\Big(h'\frac{\partial h}{\partial x} + l'\frac{\partial l}{\partial x}\Big)\right)\delta x . \eqno(11)$$

The minimization problem is: Given an $x$ which leaves eq. (6) unsatisfied, find a $\delta x$ such that

$$\delta IP(x + \delta x) = 0 . \eqno(12)$$
From eq. (11) this implies

$$\theta'(x+\delta x)\,\frac{\partial\theta}{\partial x}(x+\delta x) + w^2\left(h'(x+\delta x)\,\frac{\partial h}{\partial x}(x+\delta x) + l'(x+\delta x)\,\frac{\partial l}{\partial x}(x+\delta x)\right) = 0 . \eqno(13)$$

Using Taylor's formula while neglecting second-order terms results in the following approximations:

$$\theta(x+\delta x) = \theta(x) + \frac{\partial\theta}{\partial x}(x)\,\delta x , \eqno(14)$$

$$h(x+\delta x) = h(x) + \frac{\partial h}{\partial x}(x)\,\delta x , \eqno(15)$$

$$l(x+\delta x) = l(x) + \frac{\partial l}{\partial x}(x)\,\delta x , \eqno(16)$$

$$\frac{\partial\theta}{\partial x}(x+\delta x) = \frac{\partial\theta}{\partial x}(x) , \eqno(17)$$

$$\frac{\partial h}{\partial x}(x+\delta x) = \frac{\partial h}{\partial x}(x) , \eqno(18)$$

$$\frac{\partial l}{\partial x}(x+\delta x) = \frac{\partial l}{\partial x}(x) . \eqno(19)$$

Defining

$$J \equiv \frac{\partial\theta}{\partial x} , \qquad K \equiv \frac{\partial h}{\partial x} , \qquad L \equiv \frac{\partial l}{\partial x}$$
and substituting eqs. (14)-(19) into eq. (13) yields

$$\theta'J + \delta x'J'J + w^2\big(h'K + \delta x'K'K + l'L + \delta x'L'L\big) = 0 . \eqno(20)$$

Thus

$$A\,\delta x = b , \eqno(21)$$

where

$$A = J'J + w^2(K'K + L'L)$$

and

$$b = -\big(J'\theta + w^2(K'h + L'l)\big) .$$
For a given $x$ and $w$, eq. (21) represents $n$ simultaneous linear equations, which can be solved for $\delta x$ by any of a variety of methods. Three notes are appropriate. First, the approximations of eqs. (14)-(19) are exact if the column matrices $\theta$, $h$ and $l$ have components which are linear functions in $x$. In that case the objective function is quadratic with respect to the variables $x$, and the critical value for a given penalty weight is obtained in one step. Second, the use of an exterior penalty function allows for an infeasible starting point, and termination generally occurs at a slightly infeasible point [5]. Finally, computational efficiency can be obtained by noting that the matrix $A$ is symmetric.
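To make eq. (21) concrete, the following Python sketch assembles $A$ and $b$ and takes one step. It is a minimal reading of the method, not the published FORTRAN IV code, and the function names are illustrative; only the violated inequality constraints enter $l$ and $L$.

```python
import numpy as np

def congau_step(x, w, theta, J, h, K, g, G):
    """One step of the penalized Gauss method, eq. (21):
    A dx = b, A = J'J + w^2 (K'K + L'L), b = -(J'theta + w^2 (K'h + L'l))."""
    t, Jm = theta(x), J(x)        # least squares residuals and derivatives (m x n)
    hv, Km = h(x), K(x)           # equality constraints and derivatives (p x n)
    gv, Gm = g(x), G(x)           # inequality constraints and derivatives (q x n)
    viol = gv > 0.0               # violated inequalities form l and L (r x n)
    l, Lm = gv[viol], Gm[viol]
    A = Jm.T @ Jm + w**2 * (Km.T @ Km + Lm.T @ Lm)   # symmetric n x n matrix
    b = -(Jm.T @ t + w**2 * (Km.T @ hv + Lm.T @ l))
    return x + np.linalg.solve(A, b)
```

Because $A$ is symmetric (and positive semidefinite), a symmetric solver such as Cholesky factorization could replace the general solve, which is the efficiency noted above.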
3. Algorithm description

The following algorithm describes a solution procedure:

Step 1: Select: (a) starting point $x$, (b) initial weight $w$, (c) tolerances $e_1$, $e_2$, $e_3$, $e_4$.

Step 2: Calculate: (a) matrix $A$, (b) column vector $b$.

Step 3: Solve $A\,\delta x = b$ for $\delta x$; calculate $y = x + \delta x$.

Step 4: Sequence termination check:
type 1: Is $IP(y) - IP(x) < e_1$ if $IP(x) \le e_2$, or is $(IP(y) - IP(x))/IP(x) < e_1$ if $IP(x) > e_2$?
type 2: Is $\delta x'\delta x < e_3$?
If the selected check procedure is not satisfied, set $x = y$ and go to step 2. Otherwise, go to step 5.

Step 5: Termination check: Is $e < e_4$? If not satisfied, set $w$ equal to $cw$ $(c > 1)$, set $x = y$, and go to step 2. Otherwise, terminate with $x^* = y$.
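A minimal sketch of this five-step loop follows, reusing congau_step from the sketch in section 2. The iteration cap and the use of the magnitude of the change in $IP$ in the type 1 check are assumptions of this sketch, not statements of the paper.

```python
import numpy as np

def congau(x, w, funcs, c=10.0, e1=1e-7, e2=1e-10, e4=1e-5, max_forms=100):
    """Sketch of steps 1-5 with the type 1 sequence check.
    funcs = (theta, J, h, K, g, G) as in congau_step; max_forms is an
    assumed safeguard, not part of the published algorithm."""
    theta, J, h, K, g, G = funcs

    def IP(z):                      # index of performance, eq. (4)
        t, hv, gv = theta(z), h(z), g(z)
        l = gv[gv > 0.0]
        return t @ t + w**2 * (hv @ hv + l @ l)

    def e(z):                       # constraint squared error, eq. (5)
        hv, gv = h(z), g(z)
        l = gv[gv > 0.0]
        return hv @ hv + l @ l

    for _ in range(max_forms):
        y = congau_step(x, w, theta, J, h, K, g, G)      # steps 2 and 3
        ip_x, ip_y = IP(x), IP(y)
        # step 4, type 1 check (magnitude of the change assumed here)
        small = abs(ip_y - ip_x) < (e1 if ip_x <= e2 else e1 * ip_x)
        x = y
        if small:                   # step 5: constraint error check
            if e(x) < e4:
                return x            # terminate with x* = y
            w *= c                  # increase penalty weight, go to step 2
    return x
```

With $c = 10$ and these tolerances, the sketch mirrors the settings reported for the runs in section 4.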
4. Numerical examples

The above algorithm was formulated as a FORTRAN IV program entitled CONGAU and was applied to the following test problems:

1. Minimize $f(x) = (x_1 - 2)^2 + (x_2 - 1)^2$ subject to
$g_1(x) = x_1^2 - x_2 \le 0$,
$g_2(x) = x_1 + x_2 - 2 \le 0$.
Starting point: $x = \{2, 2\}$. Solution: $f^* = 1$ at $x^* = \{1, 1\}$ [6].

2. Minimize $f(x) = (x_1 - 2)^2 + (x_2 - 1)^2$ subject to
$h_1(x) = x_1 - 2x_2 + 1 = 0$,
$g_1(x) = x_1^2 + 4x_2^2 - 4 \le 0$.
Starting point: $x = \{2, 2\}$. Solution: $f^* = 1.393$ at $x^* = \{0.823, 0.911\}$ [1].

3. Minimize $f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2$ subject to
$g_1(x) = -x_2 + 4x_1^2 \le 0$,
$g_2(x) = 0.5 - x_1 x_2 \le 0$.
Starting point: $x = \{2, 2\}$. Solution: $f^* = 56.5$ at $x^* = \{0.5, 1\}$.

4. Minimize $f(x) = 9x_1^2 + x_2^2 + 9x_3^2$ subject to
$h_1(x) = x_1^2 + x_2^2 - 1 = 0$,
$g_1(x) = x_2 + 0.5 \le 0$.
Starting point: $x = \{2, 2, 2\}$. Solution: $f^* = 1$ at $x^* = \{0, -1, 0\}$ [7].

5. Minimize $f(x) = (x_1 - 1)^2 + (x_1 - x_2)^2 + (x_2 - x_3)^4$ subject to
$h_1(x) = x_1(1 + x_2^2) + x_3^4 - 4 - 3\sqrt{2} = 0$.
Starting point: $x = \{2, 2, 2\}$. Solution: $f^* = 0.3256 \times 10^{-1}$ at $x^* = \{1.1048, 1.1966, 1.5352\}$ [8].

6. Minimize $f(x) = (x_1 - x_2)^2 + (x_2 + x_3 - 2)^2 + (x_4 - 1)^2 + (x_5 - 1)^2$ subject to
$h_1(x) = x_1 + 3x_2 = 0$,
$h_2(x) = x_3 + x_4 - 2x_5 = 0$,
$h_3(x) = x_2 - x_5 = 0$.
Starting point: $x = \{2, 2, 2, 2, 2\}$. Solution: $f^* = 4.0930$ at $x^* = \{-0.7674, 0.2558, 0.6279, -0.1163, 0.2558\}$ [7].

7. Minimize $f(x) = (x_1 - 1)^2 + (x_1 - x_2)^2 + (x_3 - 1)^2 + (x_4 - 1)^4 + (x_5 - 1)^6$ subject to
$h_1(x) = x_4 x_1^2 + \sin(x_4 - x_5) - 2\sqrt{2} = 0$,
$h_2(x) = x_2 + x_3^4 x_4^2 - 8 - \sqrt{2} = 0$.
Starting point: $x = \{2, 2, 2, 2, 2\}$. Solution: $f^* = 0.2415$ at $x^* = \{1.1661, 1.1821, 1.3802, 1.5060, 0.6109\}$ [7].

8. Minimize $f(x) = (x_1 - 1)^2 + (x_1 - x_2)^2 + (x_2 - x_3)^2 + (x_3 - x_4)^4 + (x_4 - x_5)^4$ subject to
$h_1(x) = x_1 + x_2^2 + x_3^3 - 2 - 3\sqrt{2} = 0$,
$h_2(x) = x_2 - x_3^2 + x_4 + 2 - 2\sqrt{2} = 0$,
$h_3(x) = x_1 x_5 - 2 = 0$.
Starting point: $x = \{2, 2, 2, 2, 2\}$. Solution: $f^* = 0.7877 \times 10^{-1}$ at $x^* = \{1.1911, 1.3626, 1.4728, 1.6350, 1.6790\}$ [8].
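As a usage illustration (a hypothetical driver, not the published program), test problem 2 maps onto the earlier sketches as follows; per table 2, the iterates should approach $x^* \approx \{0.824, 0.912\}$ with $f^* \approx 1.390$.

```python
import numpy as np

# Test problem 2: f = (x1-2)^2 + (x2-1)^2,
# h1 = x1 - 2 x2 + 1 = 0, g1 = x1^2 + 4 x2^2 - 4 <= 0 (analytic derivatives).
theta = lambda x: np.array([x[0] - 2.0, x[1] - 1.0])
J     = lambda x: np.eye(2)
h     = lambda x: np.array([x[0] - 2.0 * x[1] + 1.0])
K     = lambda x: np.array([[1.0, -2.0]])
g     = lambda x: np.array([x[0]**2 + 4.0 * x[1]**2 - 4.0])
G     = lambda x: np.array([[2.0 * x[0], 8.0 * x[1]]])

x_star = congau(np.array([2.0, 2.0]), 30.0, (theta, J, h, K, g, G))
```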
Table 1. Summary of formulation of test problems

Problem number | n | m | p | q | θ1 | θ2 | θ3 | θ4 | θ5
1 | 2 | 2 | 0 | 2 | x1 - 2 | x2 - 1 | | |
2 | 2 | 2 | 1 | 1 | x1 - 2 | x2 - 1 | | |
3 | 2 | 2 | 0 | 2 | 10(x2 - x1^2) | 1 - x1 | | |
4 | 3 | 3 | 1 | 1 | 3x1 | x2 | 3x3 | |
5 | 3 | 3 | 1 | 0 | x1 - 1 | x1 - x2 | (x2 - x3)^2 | |
6 | 5 | 4 | 3 | 0 | x1 - x2 | x2 + x3 - 2 | x4 - 1 | x5 - 1 |
7 | 5 | 5 | 2 | 0 | x1 - 1 | x1 - x2 | x3 - 1 | (x4 - 1)^2 | (x5 - 1)^3
8 | 5 | 5 | 3 | 0 | x1 - 1 | x1 - x2 | x2 - x3 | (x3 - x4)^2 | (x4 - x5)^2
Table 2. Summary of results of application of CONGAU

Problem number | f* | x1* | x2* | x3* | x4* | x5* | Final squared error e | Matrix A formulations | Final penalty weight w
1 | 0.999507 | 1.00025 | 1.00012 | | | | 2.14E-11 | 5 | 30
2 | 1.39017 | 0.824259 | 0.911687 | | | | 1.83E-6 | 5 | 30
3 | 56.5006 | 0.500652 | 1.00066 | | | | 3.80E-6 | 28 | 300
4 | 0.999444 | 5.61E-6 | -0.999722 | 0. | | | 3.09E-7 | 8 | 30
5 | 0.0325681 | 1.10485 | 1.19666 | 1.53526 | | | 3.54E-11 | 8 | 30
6 | 4.09278 | -0.767416 | 0.255809 | 0.627944 | -0.116247 | 0.255842 | 1.38E-9 | 4 | 300
7 | 0.241501 | 1.16614 | 1.18208 | 1.38028 | 1.50599 | 0.610692 | 2.58E-9 | 16 | 30
8 | 0.0787758 | 1.19113 | 1.36260 | 1.47282 | 1.63501 | 1.67908 | 5.51E-10 | 5 | 30
Table 1 summarizes the formulation of the objective function for each problem. Table 2 summarizes the results (the notation 2.74E-7 means $2.74 \times 10^{-7}$). Indirect comparison of optimization algorithms is difficult and potentially ambiguous. However, the presentation of a new method requires some justification for its existence. Table 3 summarizes the best previously published application results of alternate algorithms. Since computers of different manufacture were employed to implement each algorithm, there are no comparisons of required central processor times. For each test problem, all derivatives were calculated from exact analytical expressions. In each case the initial penalty weight was taken to be 30, and a value of 10 was used for $c$. The tolerances were $e_1 = 10^{-7}$, $e_2 = 10^{-10}$, $e_3$ not used, and $e_4 = 10^{-5}$. The form of the employed termination check (type 1) dictated the number of objective function evaluations; these exceeded the required matrix A formulations by one (see table 2). The solution of the linear equations was obtained by Gauss elimination with back substitution and pivoting [9]. The Purdue University Computing Center CDC 6500 was employed in all executions, using single precision arithmetic and the Minnesota FORTRAN compiler.
Table 3. Summary of performance of referenced algorithms

Problem number | Reference | Algorithm iterations | Objective function evaluations | Derivatives employed?
1 | [6] | † | |
2 | [1] | | 138 | no
4 | [7] | 7 | | yes
5 | [8] | 11 | 75 | yes
6 | [7] | 4 | | yes
7 | [7] | 11 | 40 | yes
8 | [8] | 9 | 105 | yes

† Various algorithms employed with time as only reported comparison.
The initial selection and subsequent modification of the penalty weight are two difficulties inherent to all penalty function methods. To determine the performance sensitivity with respect to variations in initial penalty weight, CONGAU was applied to each test problem using 10 as a value for $c$ and alternatively initial $w$'s of 1, 10, 30, 100 and 1000. The tolerances were $e_1 = 10^{-7}$, $e_2 = 10^{-10}$, $e_3 = 10^{-8}$ and $e_4 = 10^{-5}$. The results of this investigation are presented in table 4.
Table 4. Summary of initial penalty weight effect on performance (in each column the number of matrix A formulations is on the left, and the final value of w is on the right)

Problem number | w = 1 | w = 10 | w = 30 | w = 100 | w = 1000
1 | 12, 100 | 8, 100 | 5, 30 | 5, 100 | 5, 1000
2 | 12, 100 | 8, 100 | 5, 30 | 5, 100 | 6, 1000
3 | 39, 1000 | 33, 1000 | 28, 300 | 9, 1000 | †, 1000
4 | 23, 100 | 11, 100 | 8, 30 | 8, 100 | 8, 1000
5 | 10, 10 | 8, 10 | 8, 30 | 8, 100 | 9, 1000
6 | 6, 100 | 4, 100 | 4, 300 | 2, 100 | 2, 1000
7 | †, 1 | 16, 10 | 16, 30 | 15, 100 | 17, 1000
8 | 8, 10 | 5, 10 | 5, 30 | 5, 100 | 6, 1000

† Failed to meet all tolerances after 20 matrix A formulations.
Apparently, when the initial value of the ratio $w^2 e/\theta'\theta$ exceeds approximately $10^4$, performance ceases to improve. This observation suggests a selection criterion for the initial $w$, the only justification being that it has worked. With regard to the sequential modification of $w$, several techniques have been developed making use of gradient information [1]. However, the simplistic modification of multiplying by a constant greater than one remains embarrassingly efficient. Finally, it should be noted that the test problems are relatively well scaled. Such a condition is to the benefit of all minimization techniques and is often imposed by some procedure. Unfortunately, no scaling procedure is guaranteed to be effective. A common procedure is to weight the objective function and the constraints by multiplying each by constants. The weight of the objective function is its initial value. The constraint weights are chosen to equalize the initial gradients, subject to the condition that the initial sum of all the squared weighted constraints be unity. No scaling procedure was employed by CONGAU.
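For concreteness, one plausible reading of the common weighting procedure just described is sketched below; the normalization details are assumptions of this sketch, and, as noted, CONGAU itself applies no such scaling.

```python
import numpy as np

def scaling_weights(f, cons, con_grads, x0):
    """Weights for the common scaling procedure described above (illustrative).
    The objective weight is its initial value f(x0); each constraint weight is
    inversely proportional to its initial gradient norm (equalizing the initial
    weighted gradients), then all constraint weights are rescaled so the
    initial sum of squared weighted constraints is unity."""
    w_obj = f(x0)
    w = np.array([1.0 / np.linalg.norm(gr(x0)) for gr in con_grads])
    c0 = np.array([c(x0) for c in cons])
    s = np.linalg.norm(w * c0)        # sqrt of the initial sum of squares
    if s > 0.0:
        w = w / s                     # enforce unit initial sum of squares
    return w_obj, w
```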
5. Conclusion

A method for constrained minimization with objective functions in least squares form is presented. The method allows for infeasible starting points. First derivatives with respect to the parameters are required of the objective function terms and all constraints. For the test problems examined, good results were obtained.
References

[1] D.M. Himmelblau, Applied nonlinear programming (McGraw-Hill, New York, 1972).
[2] J. Wood, Non-linear least squares methods for parameter estimation with applications in reactor physics, J. Inst. Nucl. Eng. (1973) 106-110, 113-121.
[3] E.C. Zachmanoglou and D.W. Thoe, Introduction to partial differential equations with applications (Williams and Wilkins Co., Baltimore, 1976).
[4] W. Kaplan, Advanced calculus (Addison-Wesley, Reading, MA, 1973).
[5] R.L. Fox, Optimization methods for engineering design (Addison-Wesley, Reading, MA, 1971).
[6] B.V. Sheela and P. Ramamoorthy, SWIFT - a new constrained optimization technique, Comp. Meth. Appl. Mech. Eng. (1975) 309-318.
[7] W.A. Gruver and N.H. Engersbach, A variable metric algorithm based on an augmented Lagrangian, Int. J. Num. Meths. Eng. (1976) 1047-1056.
[8] A. Miele, E.E. Cragg and A.V. Levy, Use of the augmented penalty function in mathematical programming problems, Part 2, J. Opt. Th. Appls. (1971) 131-153.
[9] B. Carnahan, H.A. Luther and J.O. Wilkes, Applied numerical methods (Wiley, New York, 1969).