A sensitivity interpretation of adjoint variables in optimal design


Computer Methods in Applied Mechanics and Engineering 48 (1985) 81-89
North-Holland

A SENSITIVITY INTERPRETATION OF ADJOINT VARIABLES IN OPTIMAL DESIGN *

Ashok D. BELEGUNDU
Department of Mechanical Engineering, GMI Engineering & Management Institute, Flint, MI 48502, U.S.A.

Jasbir S. ARORA
Civil and Mechanical Engineering, The University of Iowa, Iowa City, IA 52242, U.S.A.

Received 26 January 1984
Revised manuscript received 3 August 1984

The adjoint method of computing derivatives of cost and constraint functions with respect to design variables requires the calculation of certain adjoint variables. Until now, the adjoint variables have been looked upon only as some intermediate vectors needed to calculate design derivatives. In this paper, they are shown to have an important significance. They represent the sensitivity of the cost and constraint functions with respect to the loading or forcing function in the design problem. A sensitivity theorem for the adjoint variables is presented for structural, mechanical dynamic, and distributed parameter systems. These results offer some immediate practical advantages, such as a method for computing influence coefficients for structural systems, and a method for verifying (debugging) the analytical calculation of adjoint variables in development of a computer code.

1. Introduction

The adjoint variables have been used fairly extensively in computing derivatives of cost and constraint functions with respect to design parameters [1]. The adjoint method has been borrowed from optimal control theory [2,3] and modified for structural and mechanical systems. Until now, the adjoint variable has been considered as some dummy or virtual displacement vector [4,5]. It has been treated as some intermediate calculation that is needed just to obtain the design derivatives of implicit functions in optimal design. These variables, however, have an important significance. They represent the sensitivity of the cost and constraint functions with respect to the loading or forcing function in the design problem. It is the purpose of the present paper to bring out this significance of the adjoint variables. Though this result (as the basic adjoint method) has its origin in optimal control theory, its realization in engineering design is of considerable importance.

The sensitivity theorem presented herein is derived by treating the load as a design variable vector. However, this result is also motivated by showing that the adjoint variable is actually a Lagrange multiplier. Then, a sensitivity theorem for Lagrange multipliers that already exists in the field of nonlinear programming leads to the same result. The sensitivity result is shown for structural, dynamic, and distributed parameter problems. Relatively straightforward models of the design problems are used here to keep the notation simple.

* Part of this research is supported by the NSF Grant No. 82-13851.
0045-7825/85/$3.30 © 1985, Elsevier Science Publishers B.V. (North-Holland)

2. Structural systems

Consider the scalar valued function $\psi = \psi(b, z)$, where $b \in R^k$ is a column vector of design variables, and $z \in R^n$ is a vector of nodal displacements determined from the finite element equations (state equations)

$$\phi(b, z) = K(b)\,z - F = 0. \tag{1}$$

Here $K(b)$ is $n \times n$, and $z$, $F$ and $\phi$ are $n \times 1$.

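In (1), $F$ is a vector of nodal loads on the structure, and $K(b)$ is a stiffness matrix. The function $\psi$ can be a cost or constraint function in the optimization problem. The adjoint method has been developed to compute the sensitivity coefficients $d\psi/db$. A Lagrangian approach, which is different from the approach used in [1], is now used to derive the sensitivity coefficients. Form the Lagrangian $L$ as

$$L = \psi + \lambda^T \phi, \tag{2}$$

where the $n \times 1$ vector $\lambda$ is the Lagrange multiplier vector associated with the equations in (1). Since $\delta\phi = 0$, we have $\delta L = \delta\psi + \lambda^T \delta\phi = \delta\psi$. Thus,

$$\delta\psi = \frac{\partial L}{\partial b}\,\delta b + \frac{\partial L}{\partial z}\,\delta z, \tag{3}$$

where $\delta\psi = \dfrac{d\psi}{db}\,\delta b$. Now, choose $\lambda$ such that

$$\frac{\partial L}{\partial z} = 0. \tag{4}$$

Then, in view of (1)-(4), (3) yields

$$\frac{d\psi}{db} = \frac{\partial\psi}{\partial b} + \lambda^T \frac{\partial}{\partial b}\left[K(b)\,\tilde z - F\right], \tag{5}$$

where the tilde over $z$ means that $z$ is treated as a constant while carrying out the partial differentiation $\partial/\partial b$. This is done to avoid working with third-order tensors. In (5), $d\psi/db$ and $\partial\psi/\partial b$ are $1 \times k$, $\lambda^T$ is $1 \times n$, and $\partial[K(b)\,\tilde z - F]/\partial b$ is $n \times k$. Equation (4) reduces to

$$\frac{\partial\psi}{\partial z} + \lambda^T \frac{\partial}{\partial z}\left[Kz - F\right] = 0$$

or, since the stiffness matrix $K$ is symmetric,

$$K\lambda = -\left(\frac{\partial\psi}{\partial z}\right)^T. \tag{6}$$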
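As a minimal numerical sketch of (5) and (6), assume a hypothetical two-spring chain fixed at one end, with the spring stiffnesses $b_1$, $b_2$ as design variables and the tip displacement $z_2$ as the function $\psi$. The structure, the load values and all function names below are illustrative assumptions, not taken from the paper. The adjoint gradient is compared with a forward-difference estimate.

```python
def solve2(K, r):
    """Solve the 2x2 linear system K x = r by Cramer's rule."""
    (a, b), (c, d) = K
    det = a * d - b * c
    return [(r[0] * d - b * r[1]) / det, (a * r[1] - c * r[0]) / det]


def stiffness(b):
    """Stiffness matrix K(b) of the assumed two-spring chain (k1 = b[0], k2 = b[1])."""
    k1, k2 = b
    return [[k1 + k2, -k2], [-k2, k2]]


# dK/db1 and dK/db2 for the chain above.
DK_DB = [[[1.0, 0.0], [0.0, 0.0]], [[1.0, -1.0], [-1.0, 1.0]]]


def psi(b, F):
    """Assumed response function: tip displacement z2."""
    return solve2(stiffness(b), F)[1]


def adjoint_gradient(b, F):
    """Return (dpsi/db, lambda) using equations (5) and (6)."""
    K = stiffness(b)
    z = solve2(K, F)                # state equation (1): K z = F
    lam = solve2(K, [0.0, -1.0])    # adjoint equation (6) with dpsi/dz = [0, 1]
    grad = []
    for dK in DK_DB:
        dKz = [dK[0][0] * z[0] + dK[0][1] * z[1],
               dK[1][0] * z[0] + dK[1][1] * z[1]]
        # (5) with no explicit dependence of psi on b: lam^T (dK/db_j) z, z held fixed
        grad.append(lam[0] * dKz[0] + lam[1] * dKz[1])
    return grad, lam


b = [2.0, 4.0]   # assumed stiffness values
F = [1.0, 3.0]   # assumed nodal loads
grad, lam = adjoint_gradient(b, F)

# Forward-difference estimate of each design derivative for comparison.
eps = 1e-6
fd = []
for j in range(2):
    bp = list(b)
    bp[j] += eps
    fd.append((psi(bp, F) - psi(b, F)) / eps)
```

For this assumed chain the closed-form derivatives are $-(F_1+F_2)/b_1^2$ and $-F_2/b_2^2$, so `grad` can be checked by hand as well as against `fd`. One adjoint solve serves all design variables, which is the usual reason for preferring the adjoint route when $k$ is large.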

Equations (5) and (6) are the same as those derived in [1]. Thus, we see that the adjoint vector $\lambda$ in [1] turns out to be simply the Lagrange multiplier vector for the equality constraints given in (1). Now, let us treat the load vector $F$ as the design vector $b$; i.e., $b = F$. Then (5) yields the sensitivity of $\psi$ with respect to $F$:

$$\frac{d\psi}{dF} = \frac{\partial\psi}{\partial F} + \lambda^T \frac{\partial}{\partial F}\left(K\tilde z - F\right) \tag{7}$$

or, since $\psi$ does not depend explicitly on $F$ and $\partial(K\tilde z - F)/\partial F = -I$,

$$\frac{d\psi}{dF} = -\lambda^T, \tag{8}$$

with $\lambda$ determined from (6). Equation (8) simply states that the adjoint vector $\lambda$ is the sensitivity of the function $\psi$ with respect to $F$, which is the result we wanted to establish. We have also shown that the adjoint variable $\lambda$ is a Lagrange multiplier vector associated with $\psi$ and the state equations in (1). This fact, together with a standard sensitivity theorem for Lagrange multipliers in mathematical programming ([6, p. 231], for example), also leads to the result in (8). Further, the first-order change $\delta\psi$ due to a change $\delta F$ in $F$ can be written as

$$\delta\psi = -\lambda^T \delta F. \tag{9}$$
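The load-sensitivity statements (8) and (9) can be checked numerically. The sketch below is only an illustration under stated assumptions: a hypothetical $2 \times 2$ symmetric stiffness matrix, an assumed nonlinear function $\psi = z_1^2 + z_2^2$, and illustrative helper names. It solves (6) for $\lambda$, compares $-\lambda$ with a forward difference of $\psi$ with respect to each load component, and compares the first-order prediction (9) with an actual change in $\psi$.

```python
def solve2(K, r):
    """Solve the 2x2 linear system K x = r by Cramer's rule."""
    (a, b), (c, d) = K
    det = a * d - b * c
    return [(r[0] * d - b * r[1]) / det, (a * r[1] - c * r[0]) / det]


K = [[6.0, -4.0], [-4.0, 4.0]]   # assumed symmetric stiffness matrix
F = [1.0, 3.0]                   # assumed nodal loads


def psi_of_load(load):
    """Assumed function psi = z1^2 + z2^2, with z solved from K z = load."""
    z = solve2(K, load)
    return z[0] ** 2 + z[1] ** 2


# Adjoint equation (6): K lam = -(dpsi/dz)^T, with dpsi/dz = [2 z1, 2 z2].
z = solve2(K, F)
lam = solve2(K, [-2.0 * z[0], -2.0 * z[1]])

# Forward differences of psi with respect to each load component, to compare with -lam (8).
eps = 1e-6
fd = []
for i in range(2):
    Fp = list(F)
    Fp[i] += eps
    fd.append((psi_of_load(Fp) - psi_of_load(F)) / eps)

# First-order change (9) for a small, arbitrary load perturbation.
dF = [1e-3, -2e-3]
first_order = -(lam[0] * dF[0] + lam[1] * dF[1])
actual = psi_of_load([F[0] + dF[0], F[1] + dF[1]]) - psi_of_load(F)
```

Because $\psi$ is quadratic here, `actual` and `first_order` differ by a second-order term in `dF`, which is what a first-order sensitivity result should show.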

The result in (9) can be of great significance in practical applications. It tells us how the function $\psi(b, z)$ changes if we change the load vector $F$. An alternate method of deriving (8) is given in Section 5. Besides the above interpretation, the sensitivity result of the paper can be used to verify (debug) the analytical calculation of adjoint variables in the development of a computer code. For example, the following finite difference approximation obtained from (8) can be compared to the analytical calculation of $\lambda$ in a structural problem, with $z(F)$ denoting the solution of (1) for the load $F$ and $e_i$ the $i$th unit vector:

$$\lambda_i \approx -\frac{\psi\big(b, z(F + \varepsilon e_i)\big) - \psi\big(b, z(F)\big)}{\varepsilon}, \qquad i = 1, \ldots, n,$$

where $\varepsilon$ is a small number.

3. Dynamic systems

A result similar to the above result will now be shown for dynamic systems. Consider the scalar valued functional

$$\Psi = \int_0^T \psi(t, z(t), \dot z(t), b)\,dt, \tag{10}$$

where $b \in R^k$ is a design vector and $z(t) \in R^n$ is a vector of generalized displacements satisfying

$$\phi(t, z(t), \dot z(t), b) = P(b)\,\dot z - F(t, z, b) = 0. \tag{11}$$

Here $P(b)$ is $n \times n$, and $\dot z$, $F$ and $\phi$ are $n \times 1$.

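Equation (11) represents the generalized equations of motion in first-order form. The functional $\Psi$ can represent either a cost or constraint functional in the problem. The adjoint method will now be derived using a Lagrangian approach. The approach is also valid when $b$ depends on the independent variable $t$, as discussed in the next section. Form the Lagrangian $L$ as

$$L = \psi + \lambda^T(t)\,\phi, \tag{12}$$

where $\lambda(t)$ is the Lagrange multiplier vector. Since $\lambda^T\phi = 0$, we have $\delta\psi = \delta L$. Thus,

$$\delta\Psi = \int_0^T \delta L\,dt, \tag{13}$$

which gives

$$\delta\Psi = \int_0^T \left[\frac{\partial L}{\partial z}\,\delta z + \frac{\partial L}{\partial \dot z}\,\delta\dot z + \frac{\partial L}{\partial b}\,\delta b\right] dt. \tag{14}$$

Now choose $\lambda(t)$ such that

$$\int_0^T \left[\frac{\partial L}{\partial z}\,\delta z + \frac{\partial L}{\partial \dot z}\,\delta\dot z\right] dt = 0. \tag{15}$$

Performing an integration by parts on (15) yields the Euler-Lagrange equation

$$\frac{\partial L}{\partial z} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot z}\right) = 0, \tag{16}$$

with the terminal conditions

$$\frac{\partial L}{\partial \dot z}(T)\,\delta z(T) - \frac{\partial L}{\partial \dot z}(0)\,\delta z(0) = 0. \tag{17}$$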

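In view of (11) and (12), (16) reduces to

$$\frac{\partial\psi}{\partial z} - \lambda^T \frac{\partial F}{\partial z} - \frac{d}{dt}\left(\frac{\partial\psi}{\partial \dot z} + \lambda^T P\right) = 0, \tag{18}$$

and since $\delta z(0) = 0$, (17) yields

$$P^T\lambda(T) = -\left(\frac{\partial\psi}{\partial \dot z}\right)^T(T). \tag{19}$$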
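The adjoint system (18)-(19) is a terminal-value problem: $\lambda$ is prescribed at $t = T$ and is integrated backward in time. A minimal sketch follows, assuming a hypothetical scalar system $\dot z = -cz + f(t)$ (so $P = 1$ and $F = -cz + f$) with integrand $\psi = z$. For that assumed case, (18) and (19) give $\dot\lambda = 1 + c\lambda$ with $\lambda(T) = 0$, whose closed form is $\lambda(t) = (e^{c(t-T)} - 1)/c$. The classical fourth-order Runge-Kutta integrator and all names are illustrative choices, not part of the paper.

```python
import math


def integrate_backward(rhs, lam_T, T, steps=1000):
    """Integrate d(lam)/dt = rhs(t, lam) from t = T back to t = 0 with classical RK4."""
    h = -T / steps            # negative step: marching backward in time
    t, lam = T, lam_T
    for _ in range(steps):
        k1 = rhs(t, lam)
        k2 = rhs(t + 0.5 * h, lam + 0.5 * h * k1)
        k3 = rhs(t + 0.5 * h, lam + 0.5 * h * k2)
        k4 = rhs(t + h, lam + h * k3)
        lam += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return lam


c, T = 0.5, 2.0                       # assumed decay rate and final time


def adjoint_rhs(t, lam):
    """Right-hand side of the adjoint ODE for the assumed scalar system."""
    return 1.0 + c * lam


# Terminal condition lam(T) = 0, integrated back to t = 0.
lam0_numeric = integrate_backward(adjoint_rhs, 0.0, T)
# Closed-form solution evaluated at t = 0, for comparison.
lam0_exact = (math.exp(-c * T) - 1.0) / c
```

Comparing `lam0_numeric` with `lam0_exact` is a quick sanity check on the sign conventions and on the direction of integration, both of which are easy to get wrong in an adjoint code.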

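Also, (14) yields

$$\delta\Psi = \int_0^T \left\{\frac{\partial\psi}{\partial b} + \lambda^T \frac{\partial}{\partial b}\left[P(b)\,\dot z - F\right]\right\} \delta b\,dt. \tag{20}$$

Equations (18)-(19) are the adjoint equations derived in [1]. Again, we have established that the adjoint variable $\lambda(t)$ is a Lagrange multiplier. If we now let the load function $F$ in (11) be the design vector ($b = F$), then (20) yields

$$\delta\Psi = \int_0^T \left\{\frac{\partial\psi}{\partial F} + \lambda^T \frac{\partial}{\partial F}\left[P\dot z - F\right]\right\} \delta F\,dt$$

or

$$\delta\Psi = -\int_0^T \lambda^T \delta F\,dt, \tag{21}$$

which is analogous to the result in (9) for structural systems. Equation (21) is again a reflection of the 'economic interpretation' for Lagrange multipliers [7].

4. Distributed parameter systems

The sensitivity theorem for the adjoint variable can also be derived for distributed parameter systems. Consider the scalar valued functional

$$\Psi = \int_{x_0}^{x_1} \psi(x, z, z', u)\,dx, \tag{22}$$

where $x \in R^1$, $u$ is a scalar valued function, $z' = dz/dx$, and $z = \{z_1, \ldots, z_n\}^T$ satisfies

$$\phi(x, z, z', u) = z' - F(x, z, u) = 0, \qquad x_0 \le x \le x_1, \tag{23}$$

and the boundary conditions

$$d_j(z(x_0), z(x_1)) = 0, \qquad j = 1, \ldots, n. \tag{24}$$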

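Equations (23) and (24) define a boundary value problem. In (22), $\Psi$ can be a cost or constraint function in the problem, and $u(x)$ is a design variable function. For example, $u(x)$ is the cross-sectional area of a beam and $x$ is the coordinate along the axis of the beam. The Lagrangian approach to determine the sensitivity of $\Psi$ with respect to $u$ is similar to the dynamics problem addressed in Section 3. Therefore, that derivation is not repeated. Instead, the major results are summarized. If $L = \psi + \lambda^T(x)\,\phi$ is the Lagrangian, then $\lambda(x)$ is chosen to satisfy the Euler-Lagrange equation

$$\frac{\partial L}{\partial z} - \frac{d}{dx}\left(\frac{\partial L}{\partial z'}\right) = 0, \tag{25}$$

together with the end conditions

$$\frac{\partial L}{\partial z'}(x_1)\,\delta z(x_1) - \frac{\partial L}{\partial z'}(x_0)\,\delta z(x_0) = 0. \tag{26}$$

Substituting for $\phi$ from (23), (25) and (26) reduce to

$$\frac{\partial\psi}{\partial z} - \lambda^T \frac{\partial F}{\partial z} - \frac{d}{dx}\left(\frac{\partial\psi}{\partial z'} + \lambda^T\right) = 0 \tag{27}$$

and

$$\left[\frac{\partial\psi}{\partial z'}(x_1) + \lambda^T(x_1)\right]\delta z(x_1) - \left[\frac{\partial\psi}{\partial z'}(x_0) + \lambda^T(x_0)\right]\delta z(x_0) = 0. \tag{28}$$

Also, $\delta\Psi$ is given as (see (20))

$$\delta\Psi = \int_{x_0}^{x_1} \left\{\frac{\partial\psi}{\partial u} + \lambda^T \frac{\partial}{\partial u}\left[z' - F\right]\right\} \delta u\,dx. \tag{29}$$

Equations (27)-(29) are the same as those derived in [1]. Now, if we let $u = F$, then from (29) the change $\delta\Psi$ in $\Psi$ due to a change $\delta F$ in $F$ is

$$\delta\Psi = -\int_{x_0}^{x_1} \lambda^T \delta F\,dx. \tag{30}$$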
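As a worked illustration of the adjoint equation, the end conditions and the load-sensitivity result of this section, consider a hypothetical scalar problem. It is an assumption made here for illustration, not an example from the paper: $n = 1$, state equation $z' = F(x)$ with the forcing itself taken as the design function, boundary condition $z(x_0) = 0$, and integrand $\psi = z$. The adjoint route and a direct calculation then give the same first-order change:

```latex
\begin{aligned}
&\text{adjoint equation:}\quad 1 - \frac{d\lambda}{dx} = 0, \qquad
 \text{end conditions with } \delta z(x_0) = 0,\ \delta z(x_1)\ \text{free:}\quad \lambda(x_1) = 0
 \;\Longrightarrow\; \lambda(x) = x - x_1, \\[4pt]
&\text{adjoint route:}\quad \delta\Psi = -\int_{x_0}^{x_1} \lambda\,\delta F\,dx
 = \int_{x_0}^{x_1} (x_1 - x)\,\delta F(x)\,dx, \\[4pt]
&\text{direct route:}\quad \delta z(x) = \int_{x_0}^{x} \delta F(s)\,ds
 \;\Longrightarrow\;
 \delta\Psi = \int_{x_0}^{x_1}\!\!\int_{x_0}^{x} \delta F(s)\,ds\,dx
 = \int_{x_0}^{x_1} (x_1 - s)\,\delta F(s)\,ds .
\end{aligned}
```

In this assumed case $-\lambda(x) = x_1 - x$ is the weight with which a load perturbation at station $x$ contributes to $\Psi$, so perturbations near $x_0$ matter most.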

5. Examples

EXAMPLE 5.1. An alternate derivation is now given to illustrate the sensitivity result for the adjoint variable in structural systems. Consider a structure whose equilibrium is governed by (1). Consider

$$\psi = \psi(b, z). \tag{31}$$

Then

$$\frac{d\psi}{dF} = \frac{\partial\psi}{\partial z}\,\frac{dz}{dF}. \tag{32}$$

Since $z = K^{-1}F$, $dz/dF = K^{-1}$. Thus, (32) yields

$$\frac{d\psi}{dF} = \frac{\partial\psi}{\partial z}\,K^{-1}. \tag{33}$$

Comparing (33) and (6),

$$\frac{d\psi}{dF} = -\lambda^T.$$

This is the same result as given in (8). If $\psi$ is a linear function of $z$, and $\psi = 0$ when $z = 0$, then the components of $\lambda$ are simply the influence coefficients associated with $\psi$ [8]. That is, $-\lambda_i$ equals the value of $\psi$ due to a unit load applied along the $i$th degree of freedom of the structure.

EXAMPLE 5.2. The following example illustrates the sensitivity result for dynamic systems. Consider the functional

$$\Psi = \int_0^T (g + \dot g)\,dt,$$

with $g$ satisfying

$$\dot g = at, \qquad g(0) = 1, \tag{35}$$

where $g$ is a scalar-valued function and $a$ is a scalar. The adjoint equations in (18) and (19) for this problem yield

$$\dot\lambda = 1, \qquad \lambda(T) = -1. \tag{36}$$

Equations (35) and (36) yield

$$g(t) = \tfrac{1}{2}at^2 + 1, \qquad \lambda(t) = t - 1 - T. \tag{37}$$

The sensitivity result in (21) will now be shown below. Two different choices of $\delta F$ are considered.

Choice 1. In (35), we have the forcing function $F = at$. Let

$$\delta F = t\,\delta a. \tag{38}$$

Then $\delta\dot g = t\,\delta a$ and $\delta g = \tfrac{1}{2}t^2\,\delta a$, from which we obtain

$$\delta\Psi = \int_0^T (\delta g + \delta\dot g)\,dt = \left(\tfrac{1}{6}T^3 + \tfrac{1}{2}T^2\right)\delta a. \tag{39}$$

Now,

$$-\int_0^T \lambda^T \delta F\,dt = -\int_0^T (t - 1 - T)(t\,\delta a)\,dt = \left(\tfrac{1}{6}T^3 + \tfrac{1}{2}T^2\right)\delta a,$$

which is the same as (39), in accordance with the sensitivity result in (21).

Choice 2. Instead of (38), let $\delta F = \Delta$, where $\Delta$ is a scalar. Then we have $\delta\dot g = \Delta$ and $\delta g = t\Delta$, which yields

$$\delta\Psi = \int_0^T (\delta g + \delta\dot g)\,dt = \left(\tfrac{1}{2}T^2 + T\right)\Delta. \tag{40}$$

Also,

$$-\int_0^T \lambda^T \delta F\,dt = -\int_0^T (t - 1 - T)(\Delta)\,dt = \left(\tfrac{1}{2}T^2 + T\right)\Delta,$$

which is the same as (40) and hence satisfies (21).
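The two choices in Example 5.2 can also be checked numerically. The sketch below is a hedged illustration rather than part of the paper: it assumes $T = 2$, uses a composite Simpson rule (an arbitrary choice of quadrature), and uses illustrative function names. It evaluates the adjoint expression $-\int_0^T \lambda\,\delta F\,dt$ with $\lambda(t) = t - 1 - T$ and compares it with the direct variation $\int_0^T (\delta g + \delta\dot g)\,dt$, where $\delta\dot g = \delta F$ and $\delta g(0) = 0$.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


T = 2.0                                # assumed final time


def lam(t):
    """Adjoint variable of Example 5.2: lambda(t) = t - 1 - T."""
    return t - 1.0 - T


def delta_psi_adjoint(dF):
    """Adjoint route: minus the integral of lambda * dF over [0, T]."""
    return -simpson(lambda t: lam(t) * dF(t), 0.0, T)


def delta_psi_direct(dF):
    """Direct route: integrate (dg + dg_dot), with dg_dot = dF and dg(0) = 0."""
    def dg(t):
        return simpson(dF, 0.0, t)     # dg(t) is the integral of dF from 0 to t
    return simpson(lambda t: dg(t) + dF(t), 0.0, T)


da, Delta = 1.0, 1.0                   # unit perturbations for the two choices
choice1_adjoint = delta_psi_adjoint(lambda t: t * da)
choice1_direct = delta_psi_direct(lambda t: t * da)
choice2_adjoint = delta_psi_adjoint(lambda t: Delta)
choice2_direct = delta_psi_direct(lambda t: Delta)
```

With $T = 2$ the closed forms give $T^3/6 + T^2/2 = 10/3$ for Choice 1 and $T^2/2 + T = 4$ for Choice 2. Simpson's rule is exact for the polynomial integrands involved, so the two routes should agree to rounding error.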

6. Conclusion

A sensitivity result for the adjoint variable is derived. It is shown that the adjoint variable represents the sensitivity of a given function with respect to the load vector. This result is shown for structural, mechanical system dynamics, and distributed parameter problems in optimal design. The sensitivity result is derived by treating the load vector as the design variable. It is also shown that the adjoint variable is a Lagrange multiplier for the state equation. Therefore, an established theorem regarding the sensitivity of constraint variation in nonlinear programming lends further insight into this result. Both derivations lead to the same interpretation. Examples illustrating the sensitivity interpretation of the adjoint variable are given. The interpretation of the adjoint variable as a sensitivity of a function with respect to the load vector or forcing function (Lagrange multiplier) is regarded as a basic result of considerable practical significance. With this result, optimum loading configurations can be determined, as can the optimum utilization of a structure [1]. In addition, the result can help in verifying the analytical calculation of adjoint variables in a computer code.

References

[1] E.J. Haug and J.S. Arora, Applied Optimal Design (Wiley, New York, 1979).
[2] D. Burghes and A. Graham, Introduction to Control Theory Including Optimal Control (Ellis Horwood, Chichester, 1980).
[3] A.E. Bryson and Y.C. Ho, Applied Optimal Control (Wiley, New York, 1975).
[4] J.S. Arora and E.J. Haug, Methods of design sensitivity analysis in structural optimization, AIAA J. 17 (9) (1979) 970-974.
[5] G.N. Vanderplaats, Comment on 'Methods of design sensitivity analysis in structural optimization', AIAA J. 18 (11) (1980) 1406-1408.
[6] D.G. Luenberger, Introduction to Linear and Nonlinear Programming (Addison-Wesley, Reading, MA, 1973).
[7] D.G. Luenberger, Optimization by Vector Space Methods (Wiley, New York, 1969).
[8] A.D. Belegundu, Interpretation of the adjoint equations in optimal design, ASCE J. Structural Div., submitted.