Decomposition algorithms for on-line estimation with nonlinear DAE models


Computers chem. Engng Vol. 21, No. 3, pp. 283-299, 1997. Copyright © 1996 Elsevier Science Ltd. Printed in Great Britain. All rights reserved. S0098-1354(96)00002-6.




DECOMPOSITION ALGORITHMS FOR ON-LINE ESTIMATION WITH NONLINEAR DAE MODELS

JOÃO S. ALBUQUERQUE and L. T. BIEGLER

Chemical Engineering Department, Carnegie Mellon University, Pittsburgh, PA 15213, U.S.A.

(Received 26 September 1994; revised 18 August 1995)

Abstract--In a previous paper we presented an algorithm for solving the large nonlinear problems that arise in dynamic state and parameter estimation. These problems are described by a set of Ordinary Differential Equations (ODE) subject to initial values, discretized and solved under a simultaneous solution strategy within an Error-in-all-Variable-Measurements (EVM) framework. In this paper we extend this work to systems described by Differential and Algebraic Equations (DAE), for both initial value (IVP) and boundary value (BVP) problems. The resulting discretized NLP problems are usually large both in the number of variables and in the number of degrees of freedom, and the size of the problem grows linearly with the number of data samples or the number of discretizations, making them difficult to solve with general-purpose NLP solvers. We discretize the DAE model with an implicit Runge-Kutta (IRK) scheme so that higher-index and stiff problems can be solved, and use the SQP method to solve the resulting NLP. Under the assumption that the measurement errors are independently distributed, the optimality conditions for each data set at the QP subproblem level are decoupled, so that the first-order conditions in the state variables, input variables and stage derivatives can be solved recursively and expressed as functions of the optimality conditions in the parameters, thus reducing the size of the QP subproblem and making the solution effort linear in both the number of sample points and the number of discretization points. We use two decoupling procedures, depending on whether the DAE system is expressed as an IVP or a BVP. For the IVP we extend the technique used in our previous paper and use an affine transform which expresses the directions in the variables and the multipliers as functions of the steps in the parameters. For the BVP we use a selected variable elimination procedure so that the optimality conditions in the state and input variables, stage derivatives and multipliers of the QP subproblem are transformed into an almost block diagonal system of linear equations. We then solve this system using SOLVEBLOK, a solution method linear in the number of blocks, to reduce the size of the QP subproblem to the dimension of the parameters. As seen in our examples, these approaches are therefore much faster than general-purpose NLP solvers. In this paper we also present a case where an IVP is unstable and difficult to solve, but stable and well behaved when rewritten as a BVP. Copyright © 1996 Elsevier Science Ltd

1. INTRODUCTION

In the previous paper (Albuquerque and Biegler, 1995) we discussed the role and importance of dynamic parameter estimation in plant operation and reviewed the work done in this field. We also discussed the difference between the classical regression formulation and the EVM formulation, and concluded that the EVM formulation is more realistic for chemical engineering problems, since all the variables being measured have errors associated with them, and we cannot assume the independent variables to be error-free. Sometimes we cannot even distinguish between independent and dependent variables (e.g. liquid-vapor equilibrium). We discussed the two competing strategies for solving the dynamic parameter estimation problem: sequential and simultaneous. In the sequential strategy, the DAE system is integrated by some numerical package within the optimization procedure. An example of such a strategy is the one implemented by Kim et al. (1991), where GRG2 was used as the optimizer and DASSL was used as the

* To whom all correspondence should be addressed; e-mail: [email protected]

integrator. This approach keeps the size of the optimization problem small, but the repeated integration of the model can be very costly. In the simultaneous approach, the dynamic problem is approximated by an NLP by discretizing the differential equations and transforming them into approximating algebraic constraints, so that the model is integrated as the problem converges to an optimal point. This approach is potentially more efficient, since feasibility is achieved only at the solution. However, if the process is meant to operate dynamically, or has persistent disturbances over time and never really reaches steady state, then the number of data points will be large and therefore the problem will be large; if an efficient solution method is desired, we must take advantage of the problem structure. So far, little work has been done on parameter estimation when the model is described by a DAE, a frequent situation in multistage processes (Campbell, 1994). These systems have two problems that do not appear in ODE systems: consistency of initial conditions and the index problem (Brenan et al., 1989). Also, less attention has been given to parameter estimation of BVPs. Although most practical problems are IVPs by nature, a BVP formulation is still very attractive for systems with


forward increasing modes in the fundamental solution. Here the BVP form is stable, while the IVP formulation will fail due to numerical instability (Ascher et al., 1988). Since any IVP can be rewritten as a BVP, the ability to solve the latter type of problem is needed (Tanartkit and Biegler, 1995). In the next section we formulate our dynamic parameter estimation problem. In Section 3 we discuss the discretization of the DAE constraints. Next, we show the resulting discretized NLP problem and discuss the difficulties involved in its solution (Section 4). We also briefly describe the SQP algorithm and the kinds of updates and line searches we used. Sections 5 and 6 contain the full description and derivation of the decoupling strategies used for solving the IVP and the BVP, respectively. Finally, results and comparisons with other solvers are given in Section 7. In particular, Section 7.3 deals with a problem which is unstable when solved with initial conditions but stable when solved with boundary conditions. A discussion of the results is given in Section 8.

2. GENERAL PROBLEM STATEMENT

Let Z_i and U_i be the state and the control or input variables (or incidental parameters) at time t_i, and let Z̃_i and Ũ_i be their respective measurements. These variables are related to their measurements according to the following state-space model:

Z̃_i = Z_i + ε_i
Ũ_i = U_i + δ_i   (1)

Suppose that the noises ε_i and δ_i are uncorrelated and independent in time. Also, at this stage we consider only bounds on the parameters as inequality constraints. Then the problem with the data collected from t = t_1 to t_{N+1} can be formulated as:

min Σ_{i=1}^{N} l_i(Z_i, U_i, θ) + l_{N+1}(Z_{N+1}, θ)
s.t. Φ(Ż, Z, U, θ, t) = 0
     Z_1 = Z(t_1)                                  for the IVP
     R_I Z(t_1) = Z_I,  R_T Z(t_{N+1}) = Z_T       for the BVP
     h(θ) ≤ 0   (P1)

Z ∈ R^n, U ∈ R^m, θ ∈ R^p, Φ: R^{2n+m+p+1} → R^n, h: R^p → R^q, Z_I ∈ R^{n_I}, Z_T ∈ R^{n_T}, R_I ∈ R^{n_I×n}, R_T ∈ R^{n_T×n}, where N is the number of data samples and θ are the unknown parameters. For convenience we omit the measured quantities (Z̃_i, Ũ_i). The objective function is derived from distributional assumptions regarding the measurement noise and reflects the stochastic model assumed for the noise. For instance, if ε_i and δ_i are normally distributed with zero mean, then the objective function of problem (P1) derived from the Maximum Likelihood principle would be the residual sum of squares. The DAE system is the deterministic process model in the absence of noise.

3. DISCRETIZING THE DAE SYSTEM

The solution of DAE systems poses several difficulties which do not show up when solving ODE systems (Petzold, 1982). Closely related to the level of difficulty associated with a DAE are the notion of index and the structure of the DAE. In this section we discretize the general DAE system using Implicit Runge-Kutta (IRK) methods. These methods have two important advantages over multi-step methods. One is that, being one-step methods, they are self-starting and allow easy step-size changes; the other is that it is possible to construct higher-order A-stable methods. This property is desirable when dealing with stiff systems. Convergence, accuracy and stability proofs have been given for IRK methods when applied to fully implicit index-one, semi-explicit index-two, and any linear constant-coefficient systems for consistent initial values (Brenan et al., 1989; Brenan and Petzold, 1989) and for consistent boundary values (Ascher and Petzold, 1991). Moreover, several techniques (Gear, 1988; Bachmann et al., 1990; Chung and Westerberg, 1991) have been developed to reformulate high-index DAEs into lower-index forms. While the original, high-index DAE may have a desirable structure, we will also rely on these methods when the limit of our IRK discretization is exceeded. It is beyond the scope of this paper to address the issues of consistency of initial and final conditions, stiffness, index and stability in detail. In this study we assume consistent initialization, and the IRK discretization and stepsizes are chosen to ensure accuracy and stability. For a more detailed discussion of these subjects, please refer to the references mentioned above. Consider the following general DAE system:

Φ(Ż, Z, U, θ, t) = 0   (2)

Using an M-stage implicit Runge-Kutta method (see Brenan et al., 1989) over the time horizon [t_1, t_{N+1}], and assuming that the control variables U_i remain constant in the time interval [t_i, t_{i+1}], we get:

Φ(Z_i^{k'}, Z_i + h Σ_{j=1}^{M} a_{kj} Z_i^{j'}, U_i, θ, t_i + h c_k) = 0,  k = 1, …, M   (3)

Z_{i+1} = Z_i + h Σ_{k=1}^{M} b_k Z_i^{k'}   (4)

where Z_i^{k'} ∈ R^n are the stage derivatives, which approximate dZ/dt at t = t_i + h c_k, and h = t_{i+1} − t_i is the time step. The stage derivatives can be uniquely determined from equations (3). For a linear constant-coefficient DAE the solution of (3) must be unique if the DAE system is to be solvable (Brenan et al., 1989). For nonlinear systems this translates into requiring that the Jacobian of (3) with respect to the stage derivatives be nonsingular. This is also a necessary condition for the convergence of Newton's method when solving these equations. We now group variables and terms so that (3) and (4) are transformed into a more convenient and compact system of equations. Let α = h(b_1, …, b_M) ⊗ I_n, α ∈ R^{n×s}, and Z_i'^T = (Z_i^{1'T}, …, Z_i^{M'T}), Z_i' ∈ R^s, with s = nM. Then equation (4) can be rewritten as:

Z_{i+1} = Z_i + α Z_i'   (5)

System (3) can be written more compactly as:

Φ_i^k(Z_i', Z_i, U_i, θ) = 0,  k = 1, …, M   (6)

If we group the functions Φ_i^k into a vector function G_i^T = (Φ_i^{1T}, …, Φ_i^{MT}), G_i: R^{s+n+m+p} → R^s, then system (3) is further compressed into:

G_i(Z_i, Z_i', U_i, θ) = 0   (7)
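To make the stage system (6)-(7) concrete, the sketch below applies the simplest IRK scheme (implicit Euler: M = 1, a_11 = b_1 = c_1 = 1) to a toy semi-explicit index-1 DAE and solves the stage equations with Newton's method. The model, step size and tolerances are illustrative assumptions, not data from the paper:

```python
import numpy as np

# Toy semi-explicit index-1 DAE (an illustrative assumption, not the
# paper's example):  dx/dt = -x*y,   0 = y - (1 + x**2).
# With implicit Euler the internal stage value is Z_s = Z_i + h*Z_i',
# and the stage system G_i(Z_i, Z_i', U_i, theta) = 0 of (6)-(7) is a
# nonlinear system in the stage derivative Z_i' = (x', y').

h = 0.01
Z0 = np.array([1.0, 2.0])            # consistent start: y0 = 1 + x0**2

def G(Zp, Z):
    xs, ys = Z + h * Zp              # internal stage value Z_i + h*a11*Z_i'
    xp, yp = Zp
    return np.array([xp + xs * ys,            # differential residual
                     ys - (1.0 + xs ** 2)])   # algebraic residual at the stage

def solve_stage(Z, tol=1e-10):
    """Newton's method on G; its Jacobian w.r.t. Z' must be nonsingular."""
    Zp = np.zeros(2)
    for _ in range(50):
        r = G(Zp, Z)
        if np.linalg.norm(r) < tol:
            break
        J = np.empty((2, 2))         # finite-difference Jacobian
        for j in range(2):
            e = np.zeros(2); e[j] = 1e-7
            J[:, j] = (G(Zp + e, Z) - r) / 1e-7
        Zp = Zp - np.linalg.solve(J, r)
    return Zp

Zp = solve_stage(Z0)
Z1 = Z0 + h * Zp                     # equation (4) with b1 = 1
```

Because the algebraic equation is enforced at the internal stage, the step lands on the algebraic manifold: y_1 = 1 + x_1^2 to within the Newton tolerance.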

A much more appealing and intuitive form of equation (7) has the stage derivatives on the left-hand side, multiplied by a matrix E which separates the differential variables from the algebraic variables. This is not necessary, but it makes (7) easier to understand when discretizing ODEs and semi-explicit DAEs. We then rewrite (7) as:

E Z_i' = F_i(Z_i, Z_i', U_i, θ),  E ∈ R^{s×s}   (8)

Note that form (8) is absolutely general. A fully implicit system would lead to E = 0, with full dependence of F_i on Z_i'. A pure ODE would lead to E = I. Finally, for a semi-explicit DAE an identity matrix multiplies those components of Z_i' which are related to the differential variables. Thus the discretized form of DAE (2) under the M-stage IRK scheme will be:

E Z_i' = F_i(Z_i, Z_i', U_i, θ)
Z_{i+1} = Z_i + α Z_i'   (9)
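The payoff of using more stages in the IRK scheme can be seen on a scalar linear problem. The sketch below integrates the test equation z' = −z with implicit Euler (order 1) and with the standard two-stage Radau IIA tableau (order 3, A-stable); the test equation and step size are arbitrary illustrative choices:

```python
import numpy as np

# Both methods follow the tableau form of equations (3)-(4): the stage
# derivatives K solve K = f(z + h * A @ K), then z_next = z + h * b @ K.
# For the linear right-hand side f(z) = lam*z the stage system is solved
# exactly with one linear solve.

def irk_step_linear(z, h, A, b, lam=-1.0):
    M = len(b)
    # (I - lam*h*A) K = lam * z * 1
    K = np.linalg.solve(np.eye(M) - lam * h * A, lam * z * np.ones(M))
    return z + h * b @ K

def integrate(z0, h, n, A, b):
    z = z0
    for _ in range(n):
        z = irk_step_linear(z, h, A, b)
    return z

A_ie, b_ie = np.array([[1.0]]), np.array([1.0])       # implicit Euler (M = 1)
A_r = np.array([[5/12, -1/12],
                [3/4,   1/4]])                        # Radau IIA (M = 2)
b_r = np.array([3/4, 1/4])

z_ie = integrate(1.0, 0.1, 10, A_ie, b_ie)
z_r = integrate(1.0, 0.1, 10, A_r, b_r)
exact = np.exp(-1.0)
```

At h = 0.1 over [0, 1], implicit Euler is off in the second digit while Radau IIA is several orders of magnitude more accurate, at the cost of a 2x2 stage solve per step.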

4. FINAL PROBLEM STATEMENT AND SQP SOLVER

The final problem statement for the initial value problem, using the distributional assumptions described in Section 2 and the integration scheme described in Section 3, is:

min Σ_{i=1}^{N} l_i(Z_i, U_i, θ) + l_{N+1}(Z_{N+1}, θ)
s.t. E Z_i' = F_i(Z_i, Z_i', U_i, θ)
     Z_{i+1} = Z_i + α Z_i'
     Z_1 = Z(t_1)                          for the IVP
     R_I Z_1 = Z_I,  R_T Z_{N+1} = Z_T     for the BVP
     h(θ) ≤ 0   (P2)

Here we assume that the initial or boundary conditions are consistent and that the chosen IRK integration scheme is stable and consistent for the DAE. It may appear from this formulation that the discretization of the DAE is done at the same points where the data are sampled. This need not be so. By a data set we mean the collection of measurements sampled at a point in time (the terms in the objective function). We need to evaluate the discretized DAEs at these sampling points because the control variables are piecewise constant between sampling times. However, depending on the size of the sampling intervals, the DAE may be discretized more often, between the sampling times, to obtain an accurate solution. This can be done easily by using more stages in the IRK formulation. Suppose we are sampling at times t_i and t_{i+1} and want to discretize the DAE (2) twice in the interval (t_i, t_{i+1}), say at points t_i and t_{i'} (t_i < t_{i'} < t_{i+1}):

Z_{i'} = Z_i + α Z_i'   (10)
G_i(Z_i, Z_i', U_i, θ) = 0   (11)
Z_{i+1} = Z_{i'} + α Z_{i'}'   (12)
G_{i'}(Z_{i'}, Z_{i'}', U_{i'}, θ) = 0   (13)

Since the control variables are piecewise constant, U_{i'} = U_i, and replacing (10) in (12) and (13) we get:

Z_{i+1} = Z_i + α Z_i' + α Z_{i'}'   (14)
G̃_i(Z_i, Z_i', Z_{i'}', U_i, θ) = 0   (15)

where G̃_i(Z_i, Z_i', Z_{i'}', U_i, θ) = G_{i'}(Z_i + α Z_i', Z_{i'}', U_i, θ). Equations (11) and (15), together with (14), can be written as a 2M-stage IRK scheme on the (t_i, t_{i+1}) interval. The terms l_i in the objective function only include the N+1 sampling points. The number of variables in this problem for the IVP case is N(n + m + nM) + p. The number of equality constraints is N(nM + n). The problem grows linearly with both the number of data sets and the number of time steps, although an increase in the number of time steps only contributes more variables, whereas an increase in the number of data sets also adds degrees of freedom to the problem. Since with a general NLP solver the effort


of solving a problem grows with a quadratic or even cubic trend, a problem like this can easily become too difficult or too expensive to solve as N becomes large.

Problem (P2) is solved using the Successive Quadratic Programming (SQP) method, which is known to take fewer function and gradient evaluations than reduced-gradient methods. The general framework for SQP is:

1. Solve a QP approximation of the problem to obtain the descent directions.
2. Perform a line search.
3. Update the Hessian matrix of the Lagrange function.
4. Return to (1) until convergence is achieved.

Step (1) is the most important step, since a large QP subproblem may be very expensive to solve, and it is the step to which most of our work was devoted: we developed a strategy that solves the QP subproblem efficiently by taking advantage of the problem's structure. Before describing this step we briefly summarize the remaining ones. A line-search method developed by Cuthrell and Biegler (1985) is used. This line search uses an augmented Lagrange function and possesses both global and superlinear convergence properties when used with an SQP algorithm. As for the Hessian update, we use the BFGS formula, unless the residuals are small or the constraints are nearly linear. In those cases the Hessian can be approximated by a Gauss-Newton (GN) approximation, using only the second (analytic) derivatives of the objective function. This approximation is at least positive semidefinite if the objective is derived from the Maximum Likelihood method, and it is trivial to evaluate if the least-squares function is used. Since we do not know a priori the size of the residuals, we monitor the progress of the objective function to decide whether the residuals are small or not. If we define

ω_1 = 1 − φ_{k+1}/φ_k

where φ_k is the objective function at iteration k, then as φ_k converges superlinearly, ω_1 should tend to one for small residuals (φ_{k+1} → 0) and to zero if the residuals are large (φ_{k+1} ≈ φ_k). As for the constraints, the closer they are to linearity, the better a first-order Taylor approximation will be. Since a Newton step obtains immediate feasibility with linear constraints, we compare the progress of the constraint violation through

ω_2 = ||C_{k+1}|| / ||C_k||

where C_k is the constraint value at iteration k and d is the direction in the variables. Because ||C_{k+1}|| = O(||d||²), the numerator is proportional to the size of the second derivatives, so the more linearly the constraints behave, the smaller ω_2 will be. Based on these behaviors, we choose the update according to the following two switching rules:

H_i = GN if ω_1 > τ_1 or ω_2 < τ_2;  H_i = BFGS otherwise,  i = 1, …, N+1   (16)

From our experience, τ_1 = 0.1 and τ_2 = 0.1 are recommended. This approach is particularly useful in problems where the residuals are small or the constraints are almost linear and the analytical Hessian is at least positive semidefinite. In such cases the BFGS update may become ill-conditioned or singular, while with Gauss-Newton we merely apply a modified Cholesky factorization (Gill et al., 1981). Also, since the GN update gives a better approximation under these circumstances, we can expect the method to converge in fewer iterations.

5. THE INITIAL VALUE PROBLEM

The approximating NLP to problem (P1) subject to initial conditions is:

min Σ_{i=1}^{N} l_i(Z_i, U_i, θ) + l_{N+1}(Z_{N+1}, θ)
s.t. E Z_i' = F_i(Z_i, Z_i', U_i, θ)
     Z_{i+1} = Z_i + α Z_i'          i = 1, …, N
     Z_1 = Z(t_1)
     h(θ) ≤ 0   (P3)

We solve this problem using SQP. For the sake of simplicity we drop the inequalities for the moment. The Lagrange function of problem (P3) is:

L = l_{N+1}(Z_{N+1}, θ) + Σ_{i=1}^{N} { l_i(Z_i, U_i, θ) − λ_{i+1}^T [Z_{i+1} − Z_i − α Z_i'] − μ_i^T [E Z_i' − F_i(Z_i, Z_i', U_i, θ)] }   (17)

with λ_i ∈ R^n and μ_i ∈ R^s, or

L = Σ_{i=1}^{N+1} L_i,  where L_i = l_{N+1} for i = N+1 and L_i = l_i − λ_{i+1}^T (Z_{i+1} − Z_i − α Z_i') − μ_i^T (E Z_i' − F_i) for i = 1, …, N   (18)
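A minimal sketch of the switching logic in rule (16), with the two monitors ω_1 and ω_2 computed from successive objective values and constraint norms; the function name and calling convention are illustrative assumptions, not from the paper:

```python
# Hessian update selection per rule (16): use the Gauss-Newton (GN)
# approximation when the residuals are small (omega_1 near one) or the
# constraints behave almost linearly (omega_2 small), and BFGS otherwise.
# tau1 = tau2 = 0.1 are the values recommended in the text.

def choose_update(phi_prev, phi_curr, c_prev_norm, c_curr_norm,
                  tau1=0.1, tau2=0.1):
    omega1 = 1.0 - phi_curr / phi_prev     # -> 1 for small residuals
    omega2 = c_curr_norm / c_prev_norm     # small for near-linear constraints
    return "GN" if omega1 > tau1 or omega2 < tau2 else "BFGS"
```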

The QP subproblem will be, for Z_1 initialized at the known consistent initial conditions:


min Σ_{i=1}^{N+1} [ (∂l_i/∂Z_i)^T ΔZ_i + (∂l_i/∂U_i)^T ΔU_i + (∂l_i/∂θ)^T Δθ + ½ ΔY_i^T H_i ΔY_i ],
    ΔY_i^T = (ΔZ_i^T, ΔZ_i'^T, ΔU_i^T, Δθ^T)

s.t. ΔZ_{i+1} = ΔZ_i + α ΔZ_i' + (−Z_{i+1} + Z_i + α Z_i')
     E ΔZ_i' = (∂F_i/∂Z_i) ΔZ_i + (∂F_i/∂Z_i') ΔZ_i' + (∂F_i/∂U_i) ΔU_i + (∂F_i/∂θ) Δθ + (−E Z_i' + F_i)
     ΔZ_1 = 0   (P4)

where i = 1, …, N. The symmetric matrix H_i containing the second-order information is defined as follows:

H_i = [ B_11^i   B_12^i   B_13^i   B_14^i
        B_12^iT  B_22^i   B_23^i   B_24^i
        B_13^iT  B_23^iT  B_33^i   B_34^i
        B_14^iT  B_24^iT  B_34^iT  B_44^i ]

B_11^i = ∂²L_i/∂Z_i², B_12^i = ∂²L_i/∂Z_i∂Z_i', B_13^i = ∂²L_i/∂Z_i∂U_i, B_14^i = ∂²L_i/∂Z_i∂θ, B_22^i = ∂²L_i/∂Z_i'², B_23^i = ∂²L_i/∂Z_i'∂U_i, B_24^i = ∂²L_i/∂Z_i'∂θ, B_33^i = ∂²L_i/∂U_i², B_34^i = ∂²L_i/∂U_i∂θ, B_44^i = ∂²L_i/∂θ²,

with B_11^i ∈ R^{n×n}, B_12^i ∈ R^{n×s}, B_13^i ∈ R^{n×m}, B_14^i ∈ R^{n×p}, B_22^i ∈ R^{s×s}, B_23^i ∈ R^{s×m}, B_24^i ∈ R^{s×p}, B_33^i ∈ R^{m×m}, B_34^i ∈ R^{m×p}, B_44^i ∈ R^{p×p}. Defining the quantities

A_i = ∂F_i/∂Z_i,  B_i = ∂F_i/∂Z_i',  C_i = ∂F_i/∂U_i,
s_i = −Z_{i+1} + Z_i + α Z_i',  s_i' = −E Z_i' + F_i + (∂F_i/∂θ) Δθ,
a_i = ∂l_i/∂Z_i + B_14^i Δθ,  b_i = B_24^i Δθ,  c_i = ∂l_i/∂U_i + B_34^i Δθ,

with A_i ∈ R^{s×n}, B_i ∈ R^{s×s}, C_i ∈ R^{s×m}, ∂F_i/∂θ ∈ R^{s×p}, s_i ∈ R^n, s_i' ∈ R^s, a_i ∈ R^n, b_i ∈ R^s, c_i ∈ R^m, and inserting them into problem (P4), the QP subproblem becomes:

min Σ_{i=1}^{N+1} [ a_i^T ΔZ_i + b_i^T ΔZ_i' + c_i^T ΔU_i + ½ ΔZ_i^T B_11^i ΔZ_i + ½ ΔZ_i'^T B_22^i ΔZ_i' + ½ ΔU_i^T B_33^i ΔU_i + ΔZ_i^T B_12^i ΔZ_i' + ΔZ_i^T B_13^i ΔU_i + ΔZ_i'^T B_23^i ΔU_i + (∂l_i/∂θ)^T Δθ + ½ Δθ^T B_44^i Δθ ]

s.t. ΔZ_{i+1} = ΔZ_i + α ΔZ_i' + s_i
     E ΔZ_i' = A_i ΔZ_i + B_i ΔZ_i' + C_i ΔU_i + s_i'
     ΔZ_1 = 0   (P5)

for i = 1, …, N. The optimality conditions in ΔZ_i, ΔZ_i', ΔU_i and the multipliers are:

a_i + B_11^i ΔZ_i + B_12^i ΔZ_i' + B_13^i ΔU_i + λ_{i+1} − λ_i + A_i^T μ_i = 0   (19)
b_i + B_12^iT ΔZ_i + B_22^i ΔZ_i' + B_23^i ΔU_i + α^T λ_{i+1} − Φ̃_i^T μ_i = 0   (20)
c_i + B_13^iT ΔZ_i + B_23^iT ΔZ_i' + B_33^i ΔU_i + C_i^T μ_i = 0   (21)
ΔZ_{i+1} = ΔZ_i + α ΔZ_i' + s_i   (22)
Φ̃_i ΔZ_i' = A_i ΔZ_i + C_i ΔU_i + s_i'   (23)
a_{N+1} + B_11^{N+1} ΔZ_{N+1} − λ_{N+1} = 0   (24)

where Φ̃_i = E − B_i and i = 1, …, N. The following transformations are substituted into the optimality conditions (19)-(24):

λ_i = T_i ΔZ_i + t̂_i Δθ + t̃_i   (25)
μ_i = M_i ΔZ_i + m̂_i Δθ + m̃_i   (26)
ΔU_i = N_i ΔZ_i + n̂_i Δθ + ñ_i   (27)
ΔZ_i' = Q_i ΔZ_i + q̂_i Δθ + q̃_i   (28)

with T_i ∈ R^{n×n}, t̂_i ∈ R^{n×p}, t̃_i ∈ R^n, M_i ∈ R^{s×n}, m̂_i ∈ R^{s×p}, m̃_i ∈ R^s, N_i ∈ R^{m×n}, n̂_i ∈ R^{m×p}, ñ_i ∈ R^m, Q_i ∈ R^{s×n}, q̂_i ∈ R^{s×p}, q̃_i ∈ R^s. Note that this affine transformation includes both algebraic and differential state variables on the right-hand sides. This allows us to treat higher-index DAEs (index ≥ 2) in a natural way. From (24) and (25) one gets the following terminal conditions:

T_{N+1} = B_11^{N+1},  t̂_{N+1} = B_14^{N+1},  t̃_{N+1} = ∂l_{N+1}/∂Z_{N+1}   (29)

By using transformations (25) to (28), substituting (22), rearranging equations (19), (20), (21) and (23) into the form [·]ΔZ_i + [·]Δθ + [·] = 0, and imposing the condition that each bracketed coefficient vanish:

[·] = 0   (30)

the following systems for Q_i, N_i, M_i, q̂_i, n̂_i, m̂_i, q̃_i, ñ_i and m̃_i are obtained:

A' [Q_i; N_i; M_i] = −[ B_12^iT + α^T T_{i+1};  B_13^iT;  −A_i ]   (31)
A' [q̂_i; n̂_i; m̂_i] = −[ B_24^i + α^T t̂_{i+1};  B_34^i;  −∂F_i/∂θ ]   (32)
A' [q̃_i; ñ_i; m̃_i] = −[ α^T (t̃_{i+1} + T_{i+1} s_i);  ∂l_i/∂U_i;  −(−E Z_i' + F_i) ]   (33)

with

A' = [ B_22^i + α^T T_{i+1} α   B_23^i   −Φ̃_i^T
       B_23^iT                  B_33^i    C_i^T
       Φ̃_i                     −C_i      0      ],  A' ∈ R^{(2s+m)×(2s+m)},

for i = 1, …, N. The coefficient matrix A' is always nonsingular if problem (P5) has a unique solution; the proof is given in Appendix 1. Backward recursive relations are also derived:

T_i = (B_11^i + T_{i+1}) + (B_12^i + T_{i+1} α) Q_i + B_13^i N_i + A_i^T M_i   (34)
t̂_i = (B_14^i + t̂_{i+1}) + (B_12^i + T_{i+1} α) q̂_i + B_13^i n̂_i + A_i^T m̂_i   (35)
t̃_i = (∂l_i/∂Z_i + t̃_{i+1} + T_{i+1} s_i) + (B_12^i + T_{i+1} α) q̃_i + B_13^i ñ_i + A_i^T m̃_i   (36)

Using the initial condition for ΔZ_1 and the linearized constraints (22) and (23), together with the transformations (25)-(28), we can collect terms for:

ΔZ_i = ΔẐ_i Δθ + ΔZ̃_i   (37)
ΔZ_i' = ΔẐ_i' Δθ + ΔZ̃_i'   (38)
ΔU_i = ΔÛ_i Δθ + ΔŨ_i   (39)
λ_i = λ̂_i Δθ + λ̃_i   (40)
μ_i = μ̂_i Δθ + μ̃_i   (41)

with ΔẐ_i ∈ R^{n×p}, ΔZ̃_i ∈ R^n, λ̂_i ∈ R^{n×p}, λ̃_i ∈ R^n, μ̂_i ∈ R^{s×p}, μ̃_i ∈ R^s, and the following forward recursive relations can be derived:

ΔẐ_1 = 0,  ΔZ̃_1 = 0   (42)
ΔU_i = (N_i ΔẐ_i + n̂_i) Δθ + (N_i ΔZ̃_i + ñ_i)   (43)
ΔZ_i' = (Q_i ΔẐ_i + q̂_i) Δθ + (Q_i ΔZ̃_i + q̃_i)   (44)
μ_i = (M_i ΔẐ_i + m̂_i) Δθ + (M_i ΔZ̃_i + m̃_i)   (45)
λ_i = (T_i ΔẐ_i + t̂_i) Δθ + (T_i ΔZ̃_i + t̃_i)   (46)
ΔZ_{i+1} = (ΔẐ_i + α ΔẐ_i') Δθ + (ΔZ̃_i + α ΔZ̃_i' + s_i)   (47)

By inserting these last relations into the QP subproblem (P4) and taking into account the inequalities in θ, we get:

min Σ_{i=1}^{N+1} [ ∇l_i^T (Ŷ_i Δθ + Ỹ_i) + ½ (Ŷ_i Δθ + Ỹ_i)^T H_i (Ŷ_i Δθ + Ỹ_i) ]
s.t. h(θ) + ∇h(θ)^T Δθ ≤ 0   (P6)

where ΔY_i = Ŷ_i Δθ + Ỹ_i collects relations (37), (38) and (39) together with Δθ itself, ∇l_i^T = ((∂l_i/∂Z_i)^T, 0, (∂l_i/∂U_i)^T, (∂l_i/∂θ)^T), ΔY_i, ∇l_i ∈ R^{n+s+m+p}, and H_i is given in (P4).

Problem (P6) is the reduced QP subproblem in the space of the parameters. It is equivalent to the much bigger problem (P4), since the optimality conditions in ΔZ_i, ΔZ_i' and ΔU_i and the linearized constraints are embedded in it. An efficient QP solver, QPKWIK (Schmid and Biegler, 1994), was used to solve it. The algorithm for solving the QP subproblem is:

1. Backward cycle:
   1.1. Initialize T_{N+1}, t̂_{N+1} and t̃_{N+1} using (29).
   1.2. For i = N, …, 1 compute:
      1.2.1. Q_i, N_i, M_i, q̂_i, n̂_i, m̂_i, q̃_i, ñ_i and m̃_i by solving systems (31), (32) and (33);
      1.2.2. T_i, t̂_i and t̃_i using equations (34), (35) and (36) (if i = 1, skip this step).
2. Forward cycle:
   2.1. Initialize ΔẐ_1 and ΔZ̃_1 using (42).
   2.2. For i = 1, …, N compute:
      2.2.1. ΔÛ_i, ΔŨ_i, ΔẐ_i', ΔZ̃_i', μ̂_i and μ̃_i using the bracketed terms in equations (43), (44) and (45);
      2.2.2. ΔẐ_{i+1}, ΔZ̃_{i+1}, λ̂_{i+1} and λ̃_{i+1} using the bracketed terms in equations (46) and (47).
3. Finally, build the reduced QP subproblem (P6) and solve it for Δθ.
4. With Δθ and equations (37)-(41) compute ΔZ_i, ΔZ_i', ΔU_i, λ_i and μ_i.

Note that the dimensions of the linear systems (31), (32) and (33) and of the QP problem (P6) are independent of the number of data sets. What changes with the number of data sets is the number of recursions.

6. THE BOUNDARY VALUE PROBLEM

The NLP problem we now want to solve, with consistent separable linear boundary value conditions, is:

min Σ_{i=1}^{N} l_i(Z_i, U_i, θ) + l_{N+1}(Z_{N+1}, θ)
s.t. E Z_i' = F_i(Z_i, Z_i', U_i, θ)
     Z_{i+1} = Z_i + α Z_i'
     R_I Z_1 = Z_I
     R_T Z_{N+1} = Z_T
     h(θ) ≤ 0   (P7)

Again, we drop the inequalities in the parameters and use the SQP algorithm, approximating (P7) by a QP subproblem at each iteration. Using the quantities defined in the previous section in problem (P5), we get the following QP:

min Σ_{i=1}^{N+1} [ a_i^T ΔZ_i + b_i^T ΔZ_i' + c_i^T ΔU_i + ½ ΔZ_i^T B_11^i ΔZ_i + ½ ΔZ_i'^T B_22^i ΔZ_i' + ½ ΔU_i^T B_33^i ΔU_i + ΔZ_i^T B_12^i ΔZ_i' + ΔZ_i^T B_13^i ΔU_i + ΔZ_i'^T B_23^i ΔU_i + (∂l_i/∂θ)^T Δθ + ½ Δθ^T B_44^i Δθ ]

s.t. ΔZ_{i+1} = ΔZ_i + α ΔZ_i' + s_i
     E ΔZ_i' = A_i ΔZ_i + B_i ΔZ_i' + C_i ΔU_i + s_i'
     R_I ΔZ_1 = Z_I − R_I Z_1
     R_T ΔZ_{N+1} = Z_T − R_T Z_{N+1}   (P8)

This problem is equal to problem (P5) except that the initial condition is replaced by boundary conditions. Here R_I and R_T are full-row-rank matrices which define the initial and terminal conditions, respectively. The optimality conditions for problem (P8) are (19) to (23) plus the following for the first and last time intervals:

a_1 + B_11^1 ΔZ_1 + B_12^1 ΔZ_1' + B_13^1 ΔU_1 + λ_2 + A_1^T μ_1 − R_I^T μ_0 = 0   (48)
R_I ΔZ_1 = Z_I − R_I Z_1   (49)
a_{N+1} + B_11^{N+1} ΔZ_{N+1} − λ_{N+1} − R_T^T μ_{N+1} = 0   (50)
R_T ΔZ_{N+1} = Z_T − R_T Z_{N+1}   (51)

where μ_0 ∈ R^{n_I} and μ_{N+1} ∈ R^{n_T} are the multipliers associated with boundary conditions (49) and (51), respectively. The system of equations defined by equations (19)-(23) and (48)-(51) can be transformed into an almost block diagonal form after some algebraic manipulation. First we define the following:

ΔR_i^T = (ΔZ_i^T, λ_i^T, μ_i^T),  i = 2, …, N;  ΔW_i^T = (ΔZ_i'^T, ΔU_i^T),  i = 1, …, N

with ΔR_i ∈ R^{2n+s} and ΔW_i ∈ R^{s+m}. Then we write the optimality conditions (19)-(23) in terms of ΔR_i and ΔW_i:

a_i + (B_11^i  −I  A_i^T) ΔR_i + (0  I  0) ΔR_{i+1} + (B_12^i  B_13^i) ΔW_i = 0   (52)
b_i + (B_12^iT  0  −Φ̃_i^T) ΔR_i + (0  α^T  0) ΔR_{i+1} + (B_22^i  B_23^i) ΔW_i = 0   (53)
c_i + (B_13^iT  0  C_i^T) ΔR_i + (B_23^iT  B_33^i) ΔW_i = 0   (54)
−s_i + (I  0  0) ΔR_{i+1} + (−I  0  0) ΔR_i + (−α  0) ΔW_i = 0   (55)
−s_i' + (−A_i  0  0) ΔR_i + (Φ̃_i  −C_i) ΔW_i = 0   (56)

Combining equations (54) and (56) and solving for ΔW_i, we get:

ΔW_i = −B̃_i^{-1} (d_i + Â_i ΔR_i)   (57)

where

d_i = [ c_i; −s_i' ],  Â_i = [ B_13^iT  0  C_i^T
                               −A_i     0  0     ],  B̃_i = [ B_23^iT  B_33^i
                                                             Φ̃_i     −C_i   ],

with d_i ∈ R^{s+m}, Â_i ∈ R^{(s+m)×(2n+s)}, B̃_i ∈ R^{(s+m)×(s+m)}. Matrix B̃_i is nonsingular if B_33^i is sufficiently positive definite and Φ̃_i is nonsingular (Appendix 2). We now combine equations (52), (53) and (55):

b̄_i + C̄_i ΔR_i + D̄ ΔR_{i+1} + Ē_i ΔW_i = 0   (58)

where

b̄_i = [ a_i; b_i; −s_i ],  C̄_i = [ B_11^i   −I  A_i^T
                                    B_12^iT   0  −Φ̃_i^T
                                    −I        0   0     ],  D̄ = [ 0  I    0
                                                                  0  α^T  0
                                                                  I  0    0 ],  Ē_i = [ B_12^i  B_13^i
                                                                                        B_22^i  B_23^i
                                                                                        −α      0      ],

with b̄_i ∈ R^{2n+s}, C̄_i, D̄ ∈ R^{(2n+s)×(2n+s)}, Ē_i ∈ R^{(2n+s)×(s+m)}. We then eliminate ΔW_i using equation (57) and get the resulting expression:

c̄_i + P_i ΔR_i + D̄ ΔR_{i+1} = 0   (59)

where c̄_i ∈ R^{2n+s} and P_i ∈ R^{(2n+s)×(2n+s)} can be computed through the following expressions:

c̄_i = b̄_i − Ē_i B̃_i^{-1} d_i   (60)
P_i = C̄_i − Ē_i B̃_i^{-1} Â_i   (61)

The transformed system of optimality conditions, without the initial and terminal conditions, will then be:

[ P_2  D̄                ] [ ΔR_2     ]   [ −c̄_2 ]
[      P_3  D̄           ] [ ΔR_3     ] = [ −c̄_3 ]
[           ⋱    ⋱      ] [   ⋮      ]   [   ⋮   ]
[                P_N  D̄ ] [ ΔR_{N+1} ]   [ −c̄_N ]   (62)

and this system has an almost block diagonal structure. To treat the initial conditions we define ΔR_1^T = (ΔZ_1^T, μ_0^T, μ_1^T) and use the same mathematical treatment as before, combining (48), (53) and (55) for i = 1 to get the following expression, which differs from (58) only in that the matrix C̃_1 ∈ R^{(2n+s)×(n+n_I+s)} is slightly different from C̄_i:

b̃_1 + C̃_1 ΔR_1 + D̄ ΔR_2 + Ē_1 ΔW_1 = 0   (63)

C̃_1 = [ B_11^1   −R_I^T  A_1^T
        B_12^1T   0      −Φ̃_1^T
        −I        0       0     ]

We then eliminate ΔW_1 as before using equation (57):

c̃_1 + P̃_1 ΔR_1 + D̄ ΔR_2 = 0   (64)

where c̃_1 and P̃_1 are computed the same way as before, using equations (60) and (61) with C̃_1 in place of C̄_i. We now add the initial conditions (49), rewritten in terms of ΔR_1:

[ R_I  0  0 ] ΔR_1 = Z_I − R_I Z_1   (65)

The optimality conditions for the first time interval are then fully expressed by combining (64) and (65):

[ R_I  0  0 ] ΔR_1 + [ 0 ] ΔR_2 = [ Z_I − R_I Z_1 ]
[    P̃_1    ]         [ D̄ ]        [ −c̃_1         ]   (66)

Finally, we treat the optimality conditions for the last time interval by defining ΔR_{N+1}^T = (ΔZ_{N+1}^T, λ_{N+1}^T, μ_{N+1}^T) and grouping equations (50) and (51):

[ B_11^{N+1}  −I  −R_T^T ] ΔR_{N+1} = [ −a_{N+1}           ]
[ R_T          0    0    ]             [ Z_T − R_T Z_{N+1} ]   (67)

Finally, putting systems (62), (66) and (67) together we get the following transformed system of optimality conditions:

[ [R_I 0 0]                               ] [ ΔR_1     ]   [ Z_I − R_I Z_1      ]
[  P̃_1   D̄                               ] [ ΔR_2     ]   [ −c̃_1               ]
[        P_2   D̄                          ] [   ⋮      ] = [ −c̄_2               ]
[              ⋱    ⋱                     ] [          ]   [   ⋮                 ]
[                   P_N    D̄              ] [ ΔR_{N+1} ]   [ −c̄_N               ]
[                        [(67) block]     ]                [ −a_{N+1}; Z_T − R_T Z_{N+1} ]   (68)

This system has an almost block diagonal form and we solve it using SOLVEBLOK (De Boor and Weiss, 1980). This solver LU-decomposes the coefficient matrix by pivoting in the nonzero square blocks of the matrix. The requirements for these blocks to be nonsingular are that equations (8), which define the stage derivatives, be nondegenerate, so that the multipliers μ_i are unique, and that the matrices that define the initial and terminal conditions, R_I and R_T, have full row rank, so that the first and last blocks of system (68) are nonsingular. Note that the RHS has a linear dependence on Δθ, since it is built from the vectors a_i, b_i, c_i, s_i and s_i'. System (68) is therefore decomposed into two systems with the same coefficient matrix but with two different right-hand sides, such that the solution is expressed as ΔR_i = ΔR̂_i Δθ + ΔR̃_i. Once ΔR̂_i and ΔR̃_i are computed, the corresponding terms ΔŴ_i and ΔW̃_i can be computed using equation (57). By using the definitions of ΔR_i and ΔW_i, the relations (37)-(41) can be built, plus the following:

μ_0 = μ̂_0 Δθ + μ̃_0   (69)
μ_{N+1} = μ̂_{N+1} Δθ + μ̃_{N+1}   (70)

Using these, and replacing into the QP subproblem (P8), we get the reduced QP subproblem (P6), from which Δθ is computed. Using equations (37)-(41), (69) and (70), we then compute ΔZ_i, ΔZ_i', ΔU_i, λ_i and μ_i. Thus, the strategy to solve the QP subproblem of the BVP is simple:

1. Build system (68).
2. Solve for ΔR̂_i and ΔR̃_i.
3. Using (57), compute ΔŴ_i and ΔW̃_i.
4. Using the definitions of ΔR_i and ΔW_i, compute relations (37)-(41), plus (69) and (70).
5. Build the reduced QP subproblem (P6). Solve this problem to get Δθ.
6. Using relations (37) to (41), (69) and (70), compute ΔZ_i, ΔZ_i', ΔU_i, λ_i and μ_i.
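The benefit of exploiting the almost block diagonal structure can be sketched on a simplified block-bidiagonal system, a stand-in for (62) (this is not the SOLVEBLOK code): one small factorization per block makes the effort linear in the number of blocks, and it reproduces the dense solution:

```python
import numpy as np

# Illustrative block lower-bidiagonal system: P_1 x_1 = r_1, then
# D x_{i-1} + P_i x_i = r_i.  Blocks P_i are random but shifted to be
# diagonally dominant (hence nonsingular); all data here is made up.

rng = np.random.default_rng(0)
nb, N = 4, 6                        # block size, number of blocks
P = [rng.standard_normal((nb, nb)) + 5 * np.eye(nb) for _ in range(N)]
D = rng.standard_normal((nb, nb))   # constant off-diagonal block
rhs = [rng.standard_normal(nb) for _ in range(N)]

# Forward block elimination: one nb-by-nb solve per block (linear in N).
x = [np.linalg.solve(P[0], rhs[0])]
for i in range(1, N):
    x.append(np.linalg.solve(P[i], rhs[i] - D @ x[i - 1]))
x_block = np.concatenate(x)

# Dense reference solve of the assembled (N*nb) x (N*nb) system.
big = np.zeros((N * nb, N * nb))
for i in range(N):
    big[i * nb:(i + 1) * nb, i * nb:(i + 1) * nb] = P[i]
    if i > 0:
        big[i * nb:(i + 1) * nb, (i - 1) * nb:i * nb] = D
x_dense = np.linalg.solve(big, np.concatenate(rhs))
```

The dense solve costs O((N·nb)^3), while the block elimination costs N solves of size nb, which is the kind of scaling the decomposition above achieves for system (68).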

7. RESULTS AND COMPARISONS

In this section we solve three examples and compare the performances of our solvers, IVP_SQP for the IVP problems and BVP_SQP for the BVP, with general NLP solvers. In example 1 we treat a semi-explicit index-1 problem, while example 2 is a semi-explicit index-2 DAE. Finally, in example 3 we treat an unstable initial value ODE, solving it both as an IVP and as an equivalent BVP, and we compare and discuss stability issues.

7.1. Example 1: isothermal flash vaporization

In this example we treat an index-1 stiff system. This problem is described in Perry (1984). Consider an isothermal flash vaporization initially at steady state with four components. At time t = 0+ a substantial change in feed composition occurs, including the addition of two more components, leading to a second steady state for the same temperature and pressure conditions (Fig. 1). Let F, V and L be the feed, top and bottom molar flow rates, z_i, y_i and x_i the feed, top and bottom molar fractions, and M the molar liquid hold-up. If the molar liquid hold-up is maintained constant (open-loop condition), the mixture inside the drum is perfectly mixed, and the hold-up in the vapor phase is assumed to be negligible,

[Fig. 1. Steady-state conditions, 300 psia, all flow rates in mol/min: (a) before perturbation; (b) after perturbation.]


then the dynamic material-balance equations and phase-equilibrium relations for the six components are:

dx_i/dt = f·z_i − v·y_i − l·x_i,  i = 1, …, 6   (71)
f = v + l   (72)
y_i = K_i·x_i,  i = 1, …, 6   (73)
Σ_{i=1}^{C} x_i = Σ_{i=1}^{C} y_i   (74)

where f = F/M, v = V/M and l = L/M, and the K_i are the equilibrium constants. Equations (71)-(74) constitute a semi-explicit index-1 DAE which could easily be reduced to an ODE system; however, this system can be stiff, so it is important to integrate it using an implicit method. The state variables are v, l, y_i and x_i, the input variables are f and z_i, and the parameters are the equilibrium constants K_i. In this problem the measured variables were both the states and the inputs. The system was simulated from t = 0 to 3 min, and Gaussian noise was added to the simulated values of the state and input variables. The model was discretized using an implicit Euler method (the simplest IRK scheme) with a constant time step equal to the time horizon divided by the number of data sets (3/N min), with consistent initial values on the state variables. This approach is accurate and stable because the DAE is index 1. A regression was performed using a normal least-squares objective function on the measured variables. The regression was performed for several data sets using both our implementations (IVP_SQP and BVP_SQP), MINOS, and CONOPT, a GRG solver. Table 1 shows the size of the problem in the number of variables, constraints and degrees of freedom for several data sets. In Table 2 and Fig. 2 we compare the computational performance of our strategies, the IVP solver IVP_SQP (Section 5) and the BVP

solver BVP_SQP (section 7), with MINOS and CONOPT, two reduced gradient general purpose NLP solvers, for several data sets. To examine the behavior of the switching rules, we compare the number of iterations necessary for convergence of a 10 data set problem, for several noise levels in the data, when only BFGS updates are used, when only GN updates are used, and when both switching rules are active. This comparison is shown in Fig. 3. As seen from our results, the effort of both general NLP solvers grows polynomially with the number of data sets, whereas with our solvers the computational time is only linear, so they are far more efficient for large problems. Here the IVP_SQP solver is more efficient than the BVP_SQP solver and is able to solve larger problems since it uses less memory. Although this problem has a small time horizon, in a real application the system may never reach a steady state because of the disturbances in the feed flow and compositions, and a larger time window must be used. In this problem, BFGS updates take about 4 times more iterations than GN updates, except for very high noise levels in the data. The switching rules performed quite well, keeping the number of iterations to at most one more than GN.

7.2. Example 2: two connected tanks, a higher index problem. Two tanks are connected by a valve (Fig. 4). When the resistance of this valve is zero, the dynamic model of this system is index 2 and has a semi-explicit structure. A1 and A2 are the cross section areas of the tanks, F0, F1 and F2 are volumetric flow rates, and h1 and h2 are the liquid heights in the two tanks. For a valve resistance R, the flow balances and the head loss equations are:

Table 1. Number of variables, constraints and degrees of freedom vs number of data sets, for the isothermal flash problem. n = 14, m = 7, p = 6, M = 1.

Number of data sets, N    Variables    Constraints    Degrees of freedom
10                        356          280            76
100                       3506         2800           706
400                       14006        11200          2806
800                       28006        22400          5606
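To make the discretization concrete, the sketch below assembles the implicit Euler residual of one step of the flash DAE (71)-(74) and solves it with a Newton iteration using a finite-difference Jacobian. The feed composition is taken from Fig. 1, but the K-values, step length and initial guess are illustrative placeholders, not the values used in the paper.

```python
import numpy as np

def flash_residual(u, x_prev, f, z, K, h):
    """Implicit Euler residual for one step of the index-1 flash DAE (71)-(74).
    Unknowns u = [x (6), y (6), v, l]; x_prev holds the previous liquid mole
    fractions, f and z are the (fixed) inputs, K the equilibrium constants."""
    x, y = u[:6], u[6:12]
    v, l = u[12], u[13]
    r = np.empty(14)
    r[:6] = (x - x_prev) / h - (f * z - v * y - l * x)   # material balances (71)
    r[6:12] = y - K * x                                   # phase equilibrium (73)
    r[12] = f - (v + l)                                   # overall balance (72)
    r[13] = x.sum() - y.sum()                             # summation equation (74)
    return r

def implicit_euler_step(x_prev, f, z, K, h, tol=1e-10, maxit=50):
    """Solve one implicit Euler step by Newton with a finite-difference Jacobian."""
    u = np.concatenate([x_prev, K * x_prev, [0.3, 0.7]])  # crude initial guess
    for _ in range(maxit):
        r = flash_residual(u, x_prev, f, z, K, h)
        if np.linalg.norm(r) < tol:
            break
        J = np.empty((14, 14))
        for j in range(14):
            up = u.copy()
            up[j] += 1e-7
            J[:, j] = (flash_residual(up, x_prev, f, z, K, h) - r) / 1e-7
        u = u - np.linalg.solve(J, r)
    return u

z = np.array([0.02, 0.03, 0.05, 0.10, 0.20, 0.60])  # feed composition, Fig. 1
K = np.array([3.0, 2.5, 1.2, 0.9, 0.5, 0.2])        # illustrative K-values only
u = implicit_euler_step(x_prev=z.copy(), f=1.0, z=z, K=K, h=0.1)
```

In the simultaneous strategy the discretized equations are not solved step by step like this, but all at once, as equality constraints of the NLP; each time step contributes a residual block of roughly this form.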

Table 2. CPU time, gradient evaluations and function evaluations for CONOPT, MINOS, IVP_SQP and BVP_SQP for the isothermal flash problem, standard deviation in noise level of 0.01. Each entry lists CPU time, gradient evaluations, function evaluations.

Number of data sets, N    CONOPT                MINOS               IVP_SQP         BVP_SQP
10                        21, 324, 163212       4, 9, 298           21, 5, 12       48, 5, 12
20                        144, 813, 988991      -Fail-              42, 5, 12       95, 5, 12
50                        946, 1375, 3678572    100, 29, 1397       107, 5, 12      244, 5, 12
100                       -Fail-                267, 46, 1410       222, 5, 12      485, 5, 12
150                       -                     537, 64, 1528       447, 7, 16      994, 7, 16
200                       -                     930, 81, 1507       604, 7, 16      1305, 7, 16
400                       -                     4471, 148, 1026     1291, 7, 16     2827, 7, 16
600                       -                     18120, 217, 978     1508, 7, 16     4430, 7, 16
800                       -                     -                   1970, 7, 16     -
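The claimed scaling can be checked directly from the CPU entries of Table 2. The sketch below fits the apparent growth exponent of CPU time versus N on a log-log scale for the MINOS and IVP_SQP columns; the fitted slopes come out close to 2 and close to 1 respectively, consistent with the polynomial-versus-linear behavior discussed in the text.

```python
import math

# CPU time (first entry of each cell) versus number of data sets N, Table 2
minos   = [(10, 4), (50, 100), (100, 267), (150, 537),
           (200, 930), (400, 4471), (600, 18120)]
ivp_sqp = [(10, 21), (20, 42), (50, 107), (100, 222),
           (200, 604), (400, 1291), (800, 1970)]

def growth_exponent(points):
    """Least-squares slope of log(CPU) against log(N): the apparent
    order of growth of the solution effort with problem size."""
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(t) for _, t in points]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den
```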

Fig. 2. CPU time for CONOPT, MINOS, IVP_SQP and BVP_SQP for the isothermal flash problem, standard deviation in noise level of 0.01.

Fig. 3. Number of iterations for BFGS, GN and switching rules updates for the 10 data set isothermal flash problem.


A1·dh1/dt = F0 − F1   (75)

A2·dh2/dt = F1 − F2   (76)

F1·R = f1(h1, h2)   (77)

F2 = f2(h2)   (78)

Suppose there are no resistances from the valve, pipe, inlets and outlets. Then equations (77) and (78) are replaced by:

h1 = h2   (79)

F2 = A2·√(2·g·h2)   (80)
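With h1 = h2 enforced, the two balances (75) and (76) can be added to eliminate the index-2 variable F1, and an implicit Euler step then has a closed form. The sketch below uses the A1 = A2 = 2 m², F0 = 9 m³/s data of this example, but with an assumed linear outflow law F2 = c·h2 standing in for the outflow relation, because a linear law keeps the step explicit; c is an illustrative constant.

```python
# Implicit Euler on the zero-resistance tank model, with an assumed
# linear outflow law F2 = c*h2 (illustrative only, not the paper's relation).
A1, A2 = 2.0, 2.0        # tank cross sections [m^2]
F0, c = 9.0, 4.0         # feed rate [m^3/s]; c is a made-up outflow constant
N = 100
dt = 10.0 / N            # 10 s horizon split into N steps

h = 0.0                  # both tanks start empty; h1 = h2 = h by (79)
traj = []
for _ in range(N):
    # Adding (75) and (76) with h1 = h2 = h gives
    # (A1 + A2)*(h_new - h)/dt = F0 - c*h_new, solvable for h_new:
    h_new = ((A1 + A2) * h / dt + F0) / ((A1 + A2) / dt + c)
    F2 = c * h_new
    F1 = A2 * (h_new - h) / dt + F2   # index-2 variable recovered from (76)
    h = h_new
    traj.append((h, F1, F2))
```

The recursion settles at h = F0/c with F1 = F2 = F0; the point is only that the algebraic constraint h1 = h2 determines F1 through differentiation, which is what makes the system index 2.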

The system composed of equations (75), (76), (79) and (80) is an index 2 problem. The state variables are h1, h2, F1 and F2. The input variable, or incidental parameter, is F0. There are no parameters in this problem. We used A1 = A2 = 2 m² and F0 = 9 m³/s, simulated the system for a time horizon of 10 seconds starting with both tanks empty, and added Gaussian noise to the state and input variables to simulate measurements. We used a normal least squares objective function on the states and inputs and an implicit Euler discretization of the DAE system with a constant time step equal to the time horizon divided by the number of data sets (10/N s). This IRK discretization is adequate, as it can be shown to be stable and sufficiently accurate for index 2 DAEs (Brenan et al., 1989; Brenan and Petzold, 1989). Table 3 shows the size of the problem in the number of variables, constraints and degrees of freedom for several data sets. In Table 4 and Fig. 5 we compare the performance of both our strategies, IVP_SQP (section 6) and BVP_SQP (section 7), with MINOS when the data is corrupted by noise with a standard deviation of 0.05. Finally, we compare the number of iterations required

to solve the 10 data set problem for several noise levels in the data, when only BFGS updates are used, when only GN updates are used, and when the switching rules are active. This comparison is drawn in Fig. 6. Both our solvers scaled linearly with problem size and were far more efficient for large problems, whereas the general NLP solver scaled polynomially. IVP_SQP was more efficient than BVP_SQP, and is able to solve larger problems since it consumes less computer memory. In this problem, the BFGS update takes almost twice as many iterations as the GN update. The switching rules require only one more iteration than GN, even for very large standard deviations in the noise. In this case, the second switching rule takes over because the problem does not seem to have a high degree of nonlinearity.
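The two rules can be summarized as a simple predicate. The sketch below is only a schematic of the logic described here, with made-up thresholds and measures; it is not the paper's actual test quantities.

```python
def choose_update(residual_norm, nonlinearity_measure,
                  res_tol=1e-3, lin_tol=1e-2):
    """Schematic switching rule for the SQP Hessian approximation:
    take the Gauss-Newton (GN) Hessian when the residuals are small,
    or when the constraints look nearly linear (the second rule),
    and fall back on a BFGS update otherwise. Thresholds are made up."""
    if residual_norm < res_tol or nonlinearity_measure < lin_tol:
        return "GN"
    return "BFGS"
```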

7.3. Example 3: a problem unstable in the initial values, but stable in the boundary conditions. Consider the following linear constant coefficient ODE problem:

dz1/dt = z2,  dz2/dt = μ²·z1 − (μ² + Θ²)·sin(Θ·t)   (81)

with the following initial conditions:

z(0) = [0, π]ᵀ   (82)

where μ is some positive constant and Θ is the parameter to be estimated, nominally equal to π. This problem has no control variables. However, the greater μ is, the more difficult the problem is to solve numerically. Bock (1982) showed that it is extremely difficult to solve a parameter estimation problem with this IVP when the ODE is integrated in a single shooting manner, but had much greater success when using a multiple shooting framework. Although the

Table 3. Number of variables, constraints and degrees of freedom vs number of data sets, for the tanks problem. n = 4, m = 1, p = 0, M = 1.

Number of data sets, N    Variables    Constraints    Degrees of freedom
10                        90           80             10
400                       3600         3200           400
800                       7200         6400           800
1400                      12600        11200          1400

Table 4. CPU time, gradient evaluations, function evaluations for MINOS, IVP_SQP and BVP_SQP for the tanks problem, standard deviation in noise level of 0.05. Results obtained on a VAX 3200.

Number of data sets, N    MINOS             IVP_SQP        BVP_SQP
10                        6, 8, 197         4, 5, 12       5, 5, 12
50                        60, 12, 536       23, 6, 14      31, 6, 14
100                       214, 20, 920      48, 6, 14      61, 6, 14
200                       841, 37, 1566     109, 7, 16     140, 7, 16
400                       3196, 73, 3385    218, 7, 16     279, 7, 16
800                       -                 430, 7, 16     581, 7, 16
1400                      -                 757, 7, 16     -

Fig. 5. CPU time for MINOS, IVP_SQP and BVP_SQP for the tanks problem, standard deviation in noise level of 0.05.

multiple shooting approach is numerically more stable than single shooting, it is still difficult to solve an unstable problem such as the one shown above, since these problems have forward exponentially increasing modes which make round-off and discretization errors propagate in a disastrous way. As was shown by Tanartkit and Biegler (1995), the instability can be removed if the IVP is rewritten as an equivalent BVP. In this section we compare the performance of our IVP estimator (IVP_SQP) with our BVP estimator (BVP_SQP) in terms of stability, and we explain the different behaviors through a stability analysis. We then

explain how a problem with initial values (IVP) can be written as an equivalent problem with boundary conditions (BVP). The analytic solution of (81) with initial conditions (82) for a given Θ is unique, since the coefficient matrix of the associated homogeneous ODE is continuous and the problem is linear with constant coefficients (Ascher et al., 1988), and the analytical solution is:

z(t) = [sin(πt), π·cos(πt)]ᵀ  for Θ = π   (83)

Fig. 6. Number of iterations for BFGS, GN and switching rules updates for the 10 data set tanks problem.
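The forward instability that defeats the IVP formulation can be reproduced in a few lines. The homogeneous part of the ODE has modes proportional to exp(±μt), so any perturbation picks up a factor of roughly exp(μT) over the horizon, regardless of the integration scheme. The sketch below propagates a tiny error through implicit Euler with μ = 20; with the μ = 60 used in this example the effect is even more extreme.

```python
import math

# Error growth in forward integration of the homogeneous part of the ODE:
# z1' = z2, z2' = mu^2 * z1, whose modes are exp(+mu*t) and exp(-mu*t).
mu, T, N = 20.0, 1.0, 2000
dt = T / N

z1, z2 = 1e-8, 1e-8 * mu          # tiny error along the growing mode (1, mu)
for _ in range(N):
    # implicit Euler step: (I - dt*J) z_new = z_old with J = [[0, 1], [mu^2, 0]]
    det = 1.0 - (dt * mu) ** 2
    z1, z2 = (z1 + dt * z2) / det, (dt * mu ** 2 * z1 + z2) / det

amplification = z1 / 1e-8          # comparable to exp(mu*T), i.e. about 5e8
```

Even an unconditionally stable scheme like implicit Euler tracks the growing mode faithfully; the exponential amplification of errors is a property of the solution, not of the discretization.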

Table 5. Performance of IVP_SQP and BVP_SQP in CPU seconds, iterations, value of the objective function and estimated parameter for several data sets. Results obtained on a VAX 3200.

Data sets, N    IVP_SQP: CPU (s), iterations, objective, Θ    BVP_SQP: CPU (s), iterations, objective, Θ
10              4, 7, 0.667, 3.2893                           2, 2, 1.206, 3.1407
20              6, 6, 0.326, 3.2155                           3, 2, 0.605, 3.1417
50              -Fail-                                        10, 3, 0.239, 3.1424
100             -Fail-                                        20, 3, 0.118, 3.1423

An associated BVP with the same unique solution is the one with the following boundary conditions:

z1(0) = 0,  z1(1) = 0   (84)

The solutions of (81) with initial conditions (82) or with boundary conditions (84) are the same; the stability of the two formulations, however, is a different matter. We simulated data for z1 and z2 with Θ = π using (83). We then used a normal least squares objective function and integrated equations (81) with implicit Euler and μ = 60. We ran IVP_SQP with initial conditions (82), integrating from t = 0 to t = 1, and BVP_SQP with boundary conditions (84), for several data sets. Table 5 shows the results of the regression for several data sets and for both solvers. The results show that the IVP formulation not only takes more iterations than the BVP formulation, but also fails to converge for small integration steps. This happens because the fundamental solution has exponentially increasing modes: when we integrate forward, the global errors arising from discretization and round-off grow exponentially. The smaller the integration step, the more acute this problem becomes, and it occurs independently of the discretization procedure, since the instability comes from the solution itself. However, if the problem is solved as a BVP, and if the boundary conditions are carefully selected, the problem becomes dichotomous and is stabilized (Ascher et al., 1988). To visualize this, we briefly describe the result of a stability analysis. The solution of the IVP problem is bounded by:

|z(t)| ≤ κ·(|z(0)| + max_t |q(t)|)   (85)

where q(t) = [0, −(μ² + Θ²)·sin(Θt)]ᵀ and κ is the condition number of the IVP problem. When μ is large, the condition number κ is approximated by:

κ ≈ ((μ + 1)/(2μ))·e^μ   (86)

meaning that κ will be very large and the problem will be ill conditioned. Although decreasing the step size will decrease the discretization error, it will create large round-off errors which will propagate in a catastrophic way. For our BVP, the solution bound is:

|z(t)| ≤ κ1·|β| + κ2·max_t |q(t)|   (87)

where κ1 and κ2 are the conditioning constants and β is the right hand side of the boundary conditions (84). For large μ, κ1 is given by:

κ1 = 1 + μ   (88)

Since the homogeneous ODE associated to (81) is exponentially dichotomous with proportionality constant κ = 1, and since under these conditions κ2 ≤ κ·(2κ1 + 1) (Ascher et al., 1988), κ2 satisfies:

κ2 ≤ 3 + 2μ   (89)

In our example β = 0, so the solution is bounded by κ2 alone. For large μ, κ2 ≪ κ, meaning that the BVP is far better conditioned than the IVP. In a situation like this, where the given or natural problem is an IVP, it may be desirable to rewrite it as an equivalent BVP and solve the latter, since it has better stability properties. The consistent initial conditions of an IVP problem:

Z1 = Z(t0)   (90)

can be replaced with an equivalent set of boundary conditions:

R0·Z1 = R0·Z(t0) = Z0,  RT·Z_{N+1} = ZT   (91)

where ZT is unknown and can be included in the parameter vector. This parameter has to be such that the initial conditions (90) are enforced. The strategy described in section 7 can be used to solve this problem, with the necessary modifications so that ZT is treated as a parameter. The directions in the variables and the multipliers are then affine transforms of ΔZT as well. To ensure that all the initial values are satisfied, we add the following constraint to the reduced QP subproblem (P6):

A1^Z·[ΔΘ; ΔZT] + a1^Z = Z(t0) − Z1   (92)
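The gap between the two conditioning bounds is easy to quantify. The sketch below evaluates them for the μ = 60 used in this example, taking the linear constants κ1 = 1 + μ (88) and κ2 ≤ 3 + 2μ (89) and an e^μ-type bound of the kind in (86) for the IVP; only the e^μ factor matters for the comparison.

```python
import math

# IVP vs BVP conditioning for mu = 60, as in Example 3. The IVP condition
# number grows like exp(mu), while the BVP constants grow only linearly.
mu = 60.0
kappa_ivp = (mu + 1.0) / (2.0 * mu) * math.exp(mu)   # exponential-type bound
kappa1 = 1.0 + mu                                    # (88): 61
kappa2 = 3.0 + 2.0 * mu                              # (89): 123
ratio = kappa_ivp / kappa2                           # roughly 5e23: BVP wins by far
```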

A similar approach has been used in the model predictive control (Santos et al., 1995) of the Tennessee Eastman problem. In both cases endpoint constraints were required to stabilize the problem.

8. CONCLUSIONS

The optimization of dynamic systems presents several difficulties that do not show up in steady-state systems. In our NLP approach, the differential equations describing the dynamic behavior of the system are discretized and transformed into algebraic equations. In this paper we presented a strategy in which implicit Runge-Kutta methods are used to integrate the DAE models. IRK methods have been shown to be able to solve higher index problems, so we do not limit ourselves to a particular structure or a particular index; in fact, we solved an index 2 problem successfully. Another difficulty arising from the simultaneous strategy is that the resulting NLP problems become very expensive as either the number of data sets or the number of time steps in the discretization increases. Here we developed two decomposition strategies for EVM problems, one for IVPs and one for BVPs, and as seen from our examples, the computational effort of solving a problem increases linearly with both the number of data sets and the number of time steps with both decomposition techniques. With general NLP solvers, and with the tailored solvers reported in the literature, this effort grows with the square or even the cube. Thus our approaches are much more efficient for large numbers of data sets or discretization points. Since in this kind of problem it is desirable to have many data sets for statistical significance and many time steps for numerical accuracy, our decomposition techniques are better suited to parameter estimation problems than general NLP solvers. Also, global and superlinear convergence can be proved, since the methods are SQP methods and all the corresponding properties hold. In nonlinear problems, the type of second order update used in the SQP algorithm can have a substantial influence on the convergence rate.
Switching rules based on the size of the residuals have been used previously to accelerate convergence (Albuquerque and Biegler, 1995; Fletcher and Xu, 1979; Tjoa and Biegler, 1991), and they performed well in this study. Here we also added another rule which exploits nearly linear constraints, so that quadratic convergence can be obtained even when the residuals are large. We also showed that, although most problems are naturally formulated as initial value problems, there are situations where it is desirable to rewrite them as boundary value problems, since the latter may be much more stable than the former, and thus much easier to solve, and we showed how this can be done.

APPENDIX 1

In this appendix we prove the nonsingularity of the matrices A_yy^i, given that problem (P5) has a unique solution (see section 5). The coefficient matrix of the optimality conditions for the QP subproblem (P5), defined by equations (19) to (24), solving for [ΔZ1, ..., ΔZN, ΛN, ΔZN^s, ΔUN, μN, ΔZ_{N+1}, Λ_{N+1}], has the following structure:

[Coefficient matrix of the optimality conditions: a block banded system assembled, stage by stage, from the Hessian blocks B11^i, B12^i, B13^i, B31^i, B32^i, B33^i, the Jacobian blocks A_i, Φ_i, C_i, the matrix α and identity blocks.]

with the last block matrix as:

[Last block of the matrix, with rows (1)-(6) and columns (1)-(7), assembled from B11^N, B12^N, B13^N, A_N^T, Φ_N, C_N, α and identity blocks.]

Since (P5) has a unique solution, all the rows and columns of the optimality conditions must be linearly independent. Looking at the system above, it is easy to see that the matrix defined by rows (1)-(6) and columns (1) and (3)-(7) must be nonsingular, because any linear dependence among these rows or columns would make the whole system singular. With this in mind, and using the definition of T_{N+1}, we start to factorize the system using these columns. 1. Post-multiply column (7) by T_{N+1} and add it to column (6). 2. Add the resulting column (6) to column (1). 3. Post-multiply the resulting column (6) by α and add it to column (3). At this point we arrive at the following equivalent system:

[Equivalent system after these column operations, in which combined blocks involving T_{N+1} (e.g. B11^N + T_{N+1} and B12^N + T_{N+1}·α) now appear in the last block rows.]

The block matrix formed by the elements in rows (2), (3) and (4) and columns (3), (4) and (5) is A_yy^N, and this matrix has to be nonsingular if we wish to keep pivoting, since it is our only choice of pivot block. We now define the following vectors:


[Definitions of the vectors v_N, assembled from the blocks B12^N + T_{N+1}·α and related terms.] By pre-multiplying the first block row and adding the result to the second block row, we get an equivalent system. Note that, from equations (31) and (34), T_N is given by:

T_N = (B̃_N + T_{N+1}) − v_N^T·(A_yy^N)^{-1}·v_N

We now group terms in the system above, obtaining the following:

" •I+1TN+I

- !

vn

0

0

0

0

I

0

0

0

0

0

-1

(I)

(2)

(3)

(4)

(5)

TN+ I

v#

A,~..,L~

1

~

Iol

Acknowledgements--Financial support of the first author from the Programa Ciencia, a scholarship program from JNICT, Lisbon, Portugal, is gratefully acknowledged. Facilities support from the Engineering Design Research Center is also acknowledged.

We keep pivoting by performing the following operations: • Post-multiply column (3) by (A_yy^N)^{-1}. • Post-multiply the resulting column (3) by −v_N and add it to column (1). We get the following:

[Resulting system, in which the stage-N blocks have been eliminated; the remaining matrix has the structure of the original system with N replaced by N − 1.]

REFERENCES

Albuquerque J.S. and L.T. Biegler, Decomposition Algorithms for On-line Estimation with Nonlinear Models. Computers Chem. Engng. 19, 1031 (1995).
Kim I.W., M.J. Liebman and T.F. Edgar, A Sequential Error-in-Variables Method for Nonlinear Dynamic Systems. Computers Chem. Engng. 15, 663 (1991).
Campbell S.L., High Index Differential Algebraic Equations, accepted for publication in Mechanics of Structures and Machines (1994).
Brenan K.E., S.L. Campbell and L.R. Petzold, Numerical Solution of Initial-Value Problems in Differential-Algebraic Equations, Elsevier, New York (1989).
Ascher U.M., R.M.M. Mattheij and R.D. Russell, Numerical Solution of Boundary Value Problems for Ordinary Differential Equations, Prentice Hall, New Jersey (1988).

with the last block matrix as:

[Last block of the reduced system, assembled from T_N, v_N·(A_yy^N)^{-1}, T_{N+1} and identity blocks.]

At this point we repeat the outlined procedure and, by recursion, prove the nonsingularity of A_yy^i, i = N, ..., 1.

APPENDIX 2

Matrix B_i (section 7) is nonsingular if the block of second derivatives of problem (P7) with respect to the control variables, B33^i, is sufficiently positive definite and Φ_i is nonsingular. To see this, consider the structure of matrix B_i:

[Block structure of B_i, with Φ_i and −C_i in the first block row.]

This matrix is nonsingular if the entry in the second block row and second block column, the Schur complement B33^i − B21^i·Φ_i^{-1}·C_i, is nonsingular. This happens if B33^i is sufficiently positive definite, so that this Schur complement is also positive definite, and therefore nonsingular.
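The Schur-complement argument is easy to check numerically. The sketch below builds a block matrix of this shape from small arbitrary blocks (the 2×2 sizes and values are assumptions for illustration, not the paper's matrices) and verifies the determinant identity det(M) = det(Φ)·det(S) that underlies the nonsingularity claim.

```python
import numpy as np

# Numerical check of the Schur complement argument of Appendix 2, with
# small arbitrary 2x2 blocks (illustrative values only).
rng = np.random.default_rng(0)
Phi = rng.standard_normal((2, 2)) + 3.0 * np.eye(2)  # nonsingular Phi_i
C = rng.standard_normal((2, 2))
B21 = rng.standard_normal((2, 2))
B33 = 10.0 * np.eye(2)                               # "sufficiently" positive definite

M = np.block([[Phi, C], [B21, B33]])
S = B33 - B21 @ np.linalg.solve(Phi, C)              # Schur complement of Phi in M

# det(M) = det(Phi) * det(S): Phi and S nonsingular together imply M nonsingular
```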

Tanartkit P. and L.T. Biegler, Stable Decomposition for Dynamic Optimization. Industrial and Engineering Chemistry Research 34, 1253 (1995).
Petzold L.R., Differential/Algebraic Equations are not ODEs. SIAM J. Sci. Stat. Comput. 3, 367 (1982).
Brenan K.E. and L.R. Petzold, The Numerical Solution of Higher Index Differential/Algebraic Equations by Implicit Methods. SIAM J. Numer. Anal. 26, 976 (1989).
Ascher U.M. and L.R. Petzold, Projected Implicit Runge-Kutta Methods for Differential-Algebraic Equations. SIAM J. Numer. Anal. 28, 1097 (1991).
Gear C.W., Differential-Algebraic Equation Index Transformations. SIAM J. Sci. Stat. Comput. 9, 39 (1988).
Bachmann R., L. Brull, T. Mrziglod and U. Pallaske, On Methods for Reducing the Index of Differential Algebraic Equations. Computers Chem. Engng. 14, 1271 (1990).
Chung Y. and A.W. Westerberg, Solving Index and Near Index Problems in Dynamic Simulation. PhD dissertation, Department of Chemical Engineering, Carnegie Mellon University (1991).
Cuthrell J.E. and L.T. Biegler, Improved Infeasible Path Optimization for Sequential Modular Simulators II: The Optimization Algorithm. Computers Chem. Engng. 9, 257 (1985).
Gill P.E., W. Murray and M.H. Wright, Practical Optimization, Academic Press, London (1981).
Schmid C. and L.T. Biegler, Quadratic Programming Methods for Tailored Reduced Hessian SQP. Computers Chem. Engng. 18, 817 (1994).

De Boor C. and R. Weiss, SOLVEBLOK: A Package for Solving Almost Block Diagonal Linear Systems. ACM Transactions on Mathematical Software 6, 80 (1980).
Perry R.H., Perry's Chemical Engineering Handbook, McGraw-Hill, New York (1984).
Bock H.G., Recent Advances in Parameter Identification Techniques for O.D.E. In: Numerical Treatment of Inverse Problems in Differential and Integral Equations (P. Deuflhard and E. Hairer, eds), Birkhäuser (1982).
Santos L.O., N.M.C. Oliveira and L.T. Biegler, Reliable and Efficient Optimization Strategies for Nonlinear Model Predictive Control. Proceedings of the 4th IFAC Symposium on Dynamics and Control of Chemical Reactors, Distillation Columns and Batch Processes, Helsingør, Denmark, June 1995, 33-38.
Fletcher R. and C. Xu, Hybrid Methods for Nonlinear Least Squares. IMA J. Numer. Anal. 7, 371 (1979).
Tjoa I.B. and L.T. Biegler, Simultaneous Solution and Optimization Strategies for Parameter Estimation of Differential-Algebraic Equation Systems. I and EC Res. 30, 376 (1991).