Engineering Applications of Artificial Intelligence 15 (2002) 41–51
Trajectory tracking in batch processes using neural controllers
Jonas Sjöberg a,*, Mukul Agarwal b,1
a Department of Machine and Vehicle Systems, Division of Mechatronics, Chalmers University of Technology, S-412 96 Gothenburg, Sweden
b Department of Chemical Engineering, LTC, Swiss Federal Institute of Technology, CH-8092 Zurich, Switzerland
Abstract
Optimal operation of batch processes usually involves closely following a pre-optimized batch trajectory, e.g., the temperature trajectory in an exothermic batch reactor. Controllers for trajectory tracking have previously been designed and tuned based on a physical or empirical plant model. In batch processes where it is difficult to build a sufficiently accurate model, it is attractive to tune a nonlinear parameterized controller directly on the plant data, provided the number of batch runs required for the iterative tuning remains acceptably low. In a recent work, the authors have proposed a tuning method that makes the best use of each plant run to rigorously calculate the correct gradient for the iterative tuning optimization. In this work, this method is applied to obtain a tuned neural-network controller for tracking the temperature trajectory in an exothermic batch reactor example taken from the literature. Results indicate the efficacy of the method for optimizing a neural controller without requiring an excessive number of batch runs for the trial-and-error iterations. © 2002 Elsevier Science Ltd. All rights reserved.
Keywords: Batch-process control; Data-based controller tuning; Nonlinear parameterized control; Optimal batch operation; Tracking setpoint trajectory
*Corresponding author. Tel.: +46-31-772-1855; fax: +46-31-7723690. E-mail addresses: [email protected] (J. Sjöberg), [email protected] (M. Agarwal).
1 Present address: Corporate Technology, Buhler Ltd., CH-9240 Uzwil, Switzerland.
1. Introduction
Processes operating in the batch or semi-batch mode are usually required to repetitively follow a given trajectory that is prescribed based on operator experience or on model-based optimization of performance criteria, such as product yield in chemical processes. The trajectory is specified either for easily measured secondary variables, such as temperature, or for key variables, such as concentrations, that are often difficult to measure on line and must be inferred from other measurements to enable tracking.
Trajectory tracking represents a difficult control problem for batch processes. The dynamic nature of the operating point, the nonlinear nature of the process, and the limited end-time of the operation hinder the applicability of well-established continuous-system control strategies due to infeasibility of local linearization and futility of asymptotic performance guarantees. Asymptotic results are especially unhelpful, since, in many batch processes such as reactors, the operation in the early stages of the batch run is more crucial for the final end-point performance than the operation in late stages (Terwiesch et al., 1994). Successful control must therefore perform well from the very beginning of a batch run.
1.1. Previous approaches
Various approaches have been used to tackle the trajectory tracking problem. PID control with fixed tuning, corresponding to a model linearized around a single operating point, cannot perform well over the entire dynamic trajectory (Kiparissides and Shah, 1983). Time-varying linearization of the nonlinear first-principles model has been used to adapt the PID tuning to changes in the operating point (Perne, 1990; Rotstein and Lewin, 1992) or to select a conservative PID tuning that guarantees robust stability for the entire range of the linearized models (Rotstein and Lewin, 1992). Performance of a PID controller with fixed conservative tuning has been improved by addition of a feedforward control determined from the nonlinear first-principles model (Kravaris et al., 1989). The response of a fixed-tuning PID controller has been preprocessed through a nonlinear transformation that varies with the operating point as dictated by the nonlinear first-principles model,
0952-1976/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved. PII: S0952-1976(02)00018-0
so that the plant receives a control input that is adapted to the dynamic trajectory in spite of the fixed PID tuning (Kravaris and Chung, 1987; Soroush and Kravaris, 1992; Wang et al., 1993).
Standard minimum variance control (Kuhn, 1989), self-tuning control (Kiparissides and Shah, 1983), stable adaptive control (Kiparissides and Shah, 1983), pole placement control (Bhat et al., 1990), globally linearizing control (Chang et al., 1996), generalized predictive control (Sanchez del Rio et al., 1990; Rafalimanana et al., 1992) and model predictive control (Defaye et al., 1993; Lee and Datta, 1994; Peterson et al., 1989) have been employed based on linear black-box models adapted to on-line data (Kiparissides and Shah, 1983; Sanchez del Rio et al., 1990; Rafalimanana et al., 1992; Defaye et al., 1993; Chang et al., 1996), or based on nonlinear first-principles models (Kuhn, 1989; Peterson et al., 1989; Bhat et al., 1990), or based on local linearizations of the latter (Lee and Datta, 1994). Controllers based on linear black-box models adapted to on-line data not only require an adequate model parameterization, but also attain peak performance only late into the batch run owing to their asymptotic properties. Controllers based on nonlinear first-principles models yield inferior performance in the presence of model errors (Bhat et al., 1990; Soroush and Kravaris, 1992; Lee and Datta, 1994). This problem has been alleviated through classical modifications such as adapting the first-principles model based on on-line data (Kuhn, 1989; Kravaris et al., 1989; Wang et al., 1993; Lee and Datta, 1994) or estimating the unmodelled dynamics by filtering the past data (Bhat et al., 1990; Lee and Datta, 1994).
1.2. Proposed approach
The drawbacks of the above methods stem from the fact that, in all cases, the controller is derived and tuned based on a process model and its performance is limited by the accuracy of the structure and parameters of that model.
A different approach is taken in this work by using a generic controller structure and tuning the controller parameters directly on the plant data. In the face of the highly nonlinear nature of batch processes, a nonlinear controller parameterization is warranted, in order that a fixed tuning, independent of the operating point, may work well over the entire dynamic trajectory. For nonlinear black-box functional mappings, generic neural-network structures present one of the attractive alternatives (Sjöberg et al., 1995); a simple neural controller that essentially realizes a nonlinear-gain P-controller is therefore used for the trajectory tracking of the batch process in Section 3. However, the tuning method that is applied in Section 3 is more general and is applicable to any smoothly parameterized controller. Note that the use of a
generic controller structure does not imply that the feedback adjustments made by the controller are ad hoc in any way; the controller is tuned using an algorithm that minimizes the control criterion imposed by the user, so that the feedback adjustments of the controller conform to that criterion.
There are several ways to tune a neural controller directly on the plant data (Agarwal, 1997b). In general, gradient-based iterative optimization of the performance index is preferable over gradient-free algorithms, and use of a model-based, analytical gradient is more efficient than numerical approximations of the gradient through perturbations in the controller parameters. Computation of an analytical gradient requires knowledge of the dynamic input–output sensitivity of the unknown plant. Crude ad hoc approximations of this sensitivity suffice for adequate tuning in many applications, since each iteration improves the performance index towards the optimum so long as the inner product of the approximated and actual plant sensitivities is positive.
Economy of the number of iterations needed to attain satisfactory convergence of the tuning parameters is paramount in the case of batch processes, since each iteration entails an entire new batch run, which is often expensive. Efficient utilization of each batch run is therefore important, which requires the best possible estimation of the plant sensitivity in each iteration in order to expedite the convergence. In the absence of an accurate first-principles model, the best estimation of the plant sensitivity is provided by an appropriate black-box model that represents a best fit for the entire batch run in the iteration under consideration. For this purpose, in this work, a linear time-varying black-box sensitivity model is adapted to the entire trajectory obtained in each iteration (see Fig. 1).
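The remark that a crude sensitivity approximation still improves the performance index, as long as its inner product with the actual sensitivity is positive, can be illustrated on a toy problem. The sketch below is not taken from the paper: the quadratic index `J`, the deliberately distorted gradient, and the step length are hypothetical choices made only for illustration.

```python
def J(w):
    """Toy performance index (hypothetical): convex quadratic, minimum at (1, -2)."""
    return 0.5 * ((w[0] - 1.0) ** 2 + 4.0 * (w[1] + 2.0) ** 2)


def true_gradient(w):
    """Exact gradient of the toy index."""
    return [w[0] - 1.0, 4.0 * (w[1] + 2.0)]


def crude_gradient(w):
    """Distorted gradient: each component is wrongly scaled but keeps its sign."""
    g = true_gradient(w)
    return [0.3 * g[0], 2.0 * g[1]]


def inner(a, b):
    """Inner product of two vectors stored as lists."""
    return sum(x * y for x, y in zip(a, b))


def descend(w, grad, step=0.1, iterations=50):
    """Plain gradient iteration; records the index value after every step."""
    history = [J(w)]
    for _ in range(iterations):
        g = grad(w)
        w = [wi - step * gi for wi, gi in zip(w, g)]
        history.append(J(w))
    return w, history


w0 = [5.0, 3.0]
w_end, hist = descend(w0, crude_gradient)
```

With a sufficiently small step, every iteration in this toy run lowers `J` even though the gradient is wrong in magnitude; the price of the crude approximation is slower convergence, which is the cost the paper aims to avoid when each iteration is a batch run.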
A (possibly time-invariant) nonlinear black-box model is not used, since it would anyway need to be linearized for rigorous computation of the gradient (see Section 2). Notice that a time-varying black-box model is used here only for the purpose of gradient determination. Provided the time-variance in the model is adequately rapid, the optimum performance index is dictated entirely by the actual plant data on which the controller is tuned. The same holds for the nonparameterized fixed-trajectory feedforward controller of Lee et al. (1994), which is iteratively optimized on the plant data while using linearizations of the first-principles model for the gradient calculation. This is in contrast to the above-mentioned model-based methods, where the controller is designed and tuned on the model itself, so that the model error directly impairs attainability of the optimum of the actual plant performance index.
Use of a plant model in determining an analytical gradient for the tuning of neural controllers based on
Fig. 1. The neural control scheme for trajectory tracking.
plant data is well known in the neural-network literature (Agarwal, 1997b). Gradient calculation accounting for the model dynamics has been implemented relatively rarely (Sastry et al., 1994; Lightbody and Irwin, 1995), whereas only the static model sensitivity has been used in most applications (Ishikawa et al., 1993; Narendra and Mukhopadhyay, 1994; Ku and Lee, 1995) at the expense of poorer convergence. In both cases, the analytical gradient is invariably calculated based only on the open-loop path between the controller parameters and the plant output. That is, the overall change in the plant output, $y$, caused by an infinitesimal change in the controller parameters, $w$ (giving the needed gradient of $y$ with respect to $w$), is computed only over the forward open-loop path, accounting only for the change in the control input, $u$, due to the change in $w$ and for the consequent change in $y$ due to the change in $u$. This does not account for the control feedback path, over which the change in $y$ effects additional change in $u$, which in turn additionally changes $y$, and so on (see Fig. 1).
The previously reported applications could afford to disregard the control feedback path, since they invariably involved only a one-step performance index (Agarwal, 1997b) and since the number of required iterations was not critical for the continuous processes that they addressed. For trajectory tracking in batch processes, the performance index involves the entire trajectory over all batch times and it is desirable
to keep the number of required iterations as low as possible. This work therefore derives the rigorous gradient, accounting for the control feedback loop and for the model dynamics, by applying an approach recently proposed by the authors in a nonneural context (Sjöberg and Agarwal, 1997), which extended the linear approach of Hjalmarsson et al. (1995) to nonlinear plants.
The novelty of the approach lies in the use of a time-varying model to provide the gradient to tune a time-invariant controller on actual plant data. Previously, as mentioned above, either time-invariant controllers have been tuned using time-invariant (physical or black-box) global models whose accuracy restricts the attainment of optimal tuning, or time-varying (on-line adapted) models have been used to tune time-varying (on-line adapted) controllers for continuous plants where asymptotic optimality of the tuning suffices. In the approach used here, rapid time-variance in the model allows adaptation of a model with low bias (and high variance), while the time-invariance of the controller smooths out the effect of the high variance and allows optimal tuning to be effective from the very beginning of the batch run (instead of only asymptotically). Tuning of a neural controller using the proposed approach is demonstrated for trajectory tracking in a simulated batch reactor example taken from the literature.
2. The tuning method
2.1. The plant
Most batch processes exhibit a nonlinear dynamic input–output behavior
$$z(t) = f(z(t-1), \ldots, z(t-n_z), u(t-1), \ldots, u(t-n_u)), \tag{1}$$
$$y(t) = z(t) + e(t), \tag{2}$$
where $t$ is the discrete time instant, $u$, $z$, and $y$ are the control input, the undisturbed output, and the measured output, respectively, $e$ is a zero-mean disturbance signal, $n_z$ and $n_u$ are integers, and $f(\cdot)$ belongs to some unknown class of smooth functions. The above plant description is not known; it is only assumed that such a description exists. In order to facilitate insight into the derivation and in view of the batch reactor example presented in Section 3, a single-input single-output description is assumed here; extension to the multi-input multi-output case is trivial. The above discrete-time description represents the sampled behavior of the usually continuous-time batch process.
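To make the notation of Eqs. (1) and (2) concrete, the following minimal sketch simulates a plant of that form with $n_z = n_u = 1$. The particular map `f`, the noise level, and all function names are hypothetical; the method itself assumes only that some smooth, unknown $f$ exists.

```python
import math
import random


def f(z_prev, u_prev):
    """Hypothetical smooth one-step map standing in for the unknown f of Eq. (1)."""
    return 0.8 * z_prev + 0.5 * math.tanh(u_prev)


def simulate_plant(u, z0=0.0, noise_std=0.0, seed=0):
    """Return (z, y) for an input sequence u[0..N-1].

    z[t] = f(z[t-1], u[t-1])   undisturbed output, Eq. (1) with n_z = n_u = 1
    y[t] = z[t] + e[t]         measured output, Eq. (2), with Gaussian e
    """
    rng = random.Random(seed)
    z = [z0]
    for t in range(1, len(u)):
        z.append(f(z[t - 1], u[t - 1]))
    y = [zt + rng.gauss(0.0, noise_std) for zt in z]
    return z, y
```

Calling `simulate_plant([1.0] * 5)` returns identical `z` and `y`, since the disturbance is switched off by default; a positive `noise_std` makes the measured output differ from the undisturbed one.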
2.2. The controller
In the absence of a plant model, a nonlinear black-box controller with generic parameterization in the form of a neural network is postulated for tracking the reference trajectory $r$,
$$u(t) = g(w, y(t), y(t-1), \ldots, y(t-n_y), r(t), \ldots, r(t-n_r)), \tag{3}$$
where $w$ denotes the network weights, and the specification of the integers $n_y$ and $n_r$ is related to, and requires rough knowledge of, the plant integers $n_z$ and $n_u$. It is assumed that the structure of the neural network $g$ is known by specification based on experience or trial and error. An initialization of the network weights can be obtained in various ways (Sjöberg, 1997); it is assumed that certain initial weights $w = w_0$ are available that stabilize the plant in the sense that small changes in the reference trajectory $r$ lead to small changes in the closed-loop response $y$ in successive batch runs. Stabilizing initial weights can often be obtained by correspondence with a simpler, possibly nonneural, stabilizing controller, as in the example in Section 3. The initial neural controller needs to be tuned so as to optimize its performance over an entire given reference trajectory $r(t)$, in order to ensure good performance for this trajectory and for other similar trajectories close to it.
2.3. Performance criterion
For a given tuning, the tracking performance is judged using any differentiable objective function; for concreteness, consider the typical performance index
$$J(w) = \frac{1}{2} E[(y(w) - y_d)^2] + \frac{\gamma}{2} E[u(w)^2], \tag{4}$$
where $E$ is the expectation operator with respect to the disturbance $e(t)$, $\gamma$ is a user-specified design parameter, and $y_d$ is the desired plant output. This performance index is useful when the user wants to penalize the absolute amplitude of the input, $u$. If the user prefers an offset-free control instead, the term $u(w)^2$ in the above equation can be replaced by $(u(t, w) - u(t-1, w))^2$ so as to penalize the input variation instead of the input. The desired output, $y_d$, is specified as a filtered version of the reference signal $r$,
$$y_d(t) = T_d\, r(t). \tag{5}$$
The filter $T_d$ is typically specified as linear time-invariant, but is not restricted to this class. Tuning of the neural controller (see Fig. 1) involves seeking the optimal weights $w^*$ that minimize the performance index, such that
$$w^* = \arg\min_{w} J(w). \tag{6}$$
2.4. Need for derivatives
The optimization in Eq. (6) must be performed by trial-and-error iteration. Since the number of required trial runs needs to be economized for batch processes, optimization algorithms that make use of an analytical gradient of the objective function are preferable. One such efficient algorithm iteratively updates the weights as
$$w_{i+1} = w_i - \mu_i R_i^{-1} J'(w_i), \tag{7}$$
where the prime denotes the derivative with respect to $w$, $i$ is the iteration count, $\mu_i$ is a step length that is chosen so as to obtain down-hill steps, and the matrix $R_i$ is a Hessian approximation that guides the search from the steepest-descent to a more favorable direction (Dennis and Schnabel, 1983). Given optimal choices of $\mu_i$ and $R_i$, the rate of convergence of the weights and the number of required trial runs depend on the accuracy of the gradient value $J'(w_i)$ that needs to be calculated anew in each iteration. This gradient is given by
$$J'(w) = E[(y(w) - y_d)\, y'(w)] + \gamma E[u(w)\, u'(w)]. \tag{8}$$
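A minimal sketch of one update of Eq. (7), using a sample-average version of Eq. (8), is given here. It is only an illustration under stated assumptions: expectations are replaced by averages over the $N$ samples of one batch, $R_i$ is taken as a Gauss-Newton matrix with a small hypothetical `ridge` term for invertibility, and all function names are invented for this example.

```python
def gradient(y, y_d, u, dy_dw, du_dw, gamma=0.0):
    """Batch-average version of Eq. (8).

    dy_dw[t][k] and du_dw[t][k] hold the derivatives of y(t) and u(t)
    with respect to weight k.
    """
    n_w, N = len(dy_dw[0]), len(y)
    grad = [0.0] * n_w
    for t in range(N):
        for k in range(n_w):
            grad[k] += ((y[t] - y_d[t]) * dy_dw[t][k]
                        + gamma * u[t] * du_dw[t][k]) / N
    return grad


def gauss_newton_matrix(dy_dw, du_dw, gamma=0.0, ridge=1e-8):
    """One possible R_i: Gauss-Newton Hessian approximation plus a small ridge."""
    n_w, N = len(dy_dw[0]), len(dy_dw)
    R = [[ridge if i == j else 0.0 for j in range(n_w)] for i in range(n_w)]
    for t in range(N):
        for i in range(n_w):
            for j in range(n_w):
                R[i][j] += (dy_dw[t][i] * dy_dw[t][j]
                            + gamma * du_dw[t][i] * du_dw[t][j]) / N
    return R


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def update_weights(w, grad, R, mu=1.0):
    """Eq. (7): w_{i+1} = w_i - mu_i * R_i^{-1} * J'(w_i)."""
    direction = solve(R, grad)
    return [wk - mu * dk for wk, dk in zip(w, direction)]
```

For a model that is linear in the weights, a single step with `mu=1.0` lands (up to the ridge term) on the least-squares solution; in the nonlinear closed-loop setting of the paper, `mu` would instead be reduced until the step is down-hill.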
In practice, the right-hand sides of Eqs. (4) and (8) can be calculated only after replacing the expectation operator by a summation operator. The accuracy of the gradient in Eq. (8) is determined by the accuracy of the derivatives $y'(w)$ and $u'(w)$. Rigorous calculation of both these derivatives must account for the control feedback loop and the model dynamics, as discussed in Section 1. In the absence of an accurate plant model, a linear time-varying sensitivity model is postulated to enable rigorous estimation of
these two derivatives. The linearization of the nonlinear plant facilitates accounting for the control feedback path and the model dynamics in the gradient calculation. The procedure below is adapted from the previous work by the authors (Sjöberg and Agarwal, 1997). It ensures that the optimization in Eq. (6) is performed with an efficient and accurate gradient computation. As is customary in nonlinear optimization problems, the number of iterations needed for convergence cannot be specified a priori.
2.5. Approach to obtain derivatives
Given controller weights $w_i$ from the $i$th iteration in Eq. (7), a batch run is made with a reference signal $r_1(t)$ to obtain the trajectory $\{r_1(t), u_1(t), z_1(t), y_1(t)\}_{t=1}^{N}$. The reference signal $r_1(t)$ is assumed to be given, for example, by an optimal control technique. The plant is linearized around this trajectory at each of the $N$ discrete time instants comprising the batch duration. For small perturbations around this trajectory, the smoothness of the plant and the controller allows Taylor expansions of Eqs. (1) and (3) to give, together with Eq. (2), the linearized time-varying plant
$$\Delta z(t) = F(t)\,\Delta u(t), \tag{9}$$
$$\Delta y(t) = \Delta z(t) + \Delta e(t), \tag{10}$$
or, eliminating the unknown $\Delta z(t)$,
$$\Delta y(t) = F(t)\,\Delta u(t) + \Delta e(t), \tag{11}$$
and the linearized time-varying controller
$$\Delta u(t) = g'\,\Delta w + S(t)\,\Delta y(t) + T(t)\,\Delta r(t), \tag{12}$$
where $\Delta$ denotes deviation from the base trajectory $\{r_1(t), u_1(t), z_1(t), y_1(t)\}_{t=1}^{N}$, i.e., $\Delta x(t) = x(t) - x_1(t)$ for any variable $x(t)$, $\Delta w = w - w_i$, and $F(t)$, $S(t)$, $T(t)$ are time-varying transfer functions
$$F(t) = \frac{\sum_{n=1}^{n_u} \dfrac{\partial f(\cdot)}{\partial u(t-n)}\, q^{-n}}{1 - \sum_{n=1}^{n_z} \dfrac{\partial f(\cdot)}{\partial z(t-n)}\, q^{-n}}, \tag{13}$$
$$S(t) = \sum_{n=0}^{n_y} \frac{\partial g(\cdot)}{\partial y(t-n)}\, q^{-n}, \tag{14}$$
$$T(t) = \sum_{n=0}^{n_r} \frac{\partial g(\cdot)}{\partial r(t-n)}\, q^{-n}, \tag{15}$$
where $q^{-1}$ is the shift operator such that $q^{-1} x(t) = x(t-1)$, and all the partial derivatives, including $g'$ in Eq. (12), are time varying and are evaluated at the base trajectory $\{r_1(t), u_1(t), z_1(t), y_1(t)\}_{t=1}^{N}$. Notice that $F(t)$ is unknown, but $S(t)$ and $T(t)$ can be calculated, given a trajectory and a controller.
The derivatives $y'$ and $u'$ needed in Eq. (8) can be obtained in terms of the derivatives $\Delta y'$ and $\Delta u'$, as
$$\Delta y'(t) \equiv y'(t) - y_1'(t) = y'(t), \tag{16}$$
$$\Delta u'(t) \equiv u'(t) - u_1'(t) = u'(t), \tag{17}$$
since the base trajectory $\{r_1(t), u_1(t), z_1(t), y_1(t)\}_{t=1}^{N}$ can be any pre-specified fixed trajectory around which the linearization is carried out, so that it is independent of $w$. The derivatives $\Delta y'$ and $\Delta u'$ could be evaluated using Eqs. (11) and (12) in the open-loop, i.e., by setting $\Delta y(t) = 0$ in Eq. (12). The derivatives would then account only for the plant and controller dynamics, but not for the effect of the feedback path. In order to account also for the feedback loop, these derivatives must be obtained from the closed-loop expressions for $\Delta y$ and $\Delta u$, namely,
$$\Delta y(t) = F(t)\,g'\,\Delta w + F(t)S(t)\,\Delta y(t) + F(t)T(t)\,\Delta r(t) + \Delta e(t), \tag{18}$$
$$\Delta u(t) = g'\,\Delta w + S(t)F(t)\,\Delta u(t) + S(t)\,\Delta e(t) + T(t)\,\Delta r(t), \tag{19}$$
that are obtained by using Eq. (12) in Eq. (11) and vice versa. Taking the derivative with respect to $w$ of the closed-loop expressions in Eqs. (18) and (19), and noting that $F(t)$, $\Delta r(t)$, and $\Delta e(t)$ are independent of $w$, i.e., $F'(t) = 0$, $\Delta r'(t) = 0$, $\Delta e' = 0$, the derivatives accounting for the feedback loop are
$$\Delta y'(t) = F(t)\,g''\,\Delta w + F(t)\,g'\,\Delta w' + F(t)S'(t)\,\Delta y(t) + F(t)S(t)\,\Delta y'(t) + F(t)T'(t)\,\Delta r(t), \tag{20}$$
$$\Delta u'(t) = g''\,\Delta w + g'\,\Delta w' + S'(t)F(t)\,\Delta u(t) + S(t)F(t)\,\Delta u'(t) + S'(t)\,\Delta e(t) + T'(t)\,\Delta r(t). \tag{21}$$
These expressions give the derivatives valid under operation at the trajectory $\{r_1(t) + \Delta r(t), u_1(t) + \Delta u(t), z_1(t) + \Delta z(t), y_1(t) + \Delta y(t)\}_{t=1}^{N}$ using the controller weights $w_i + \Delta w$ for any arbitrary (small) values of $\Delta r$, $\Delta u$, $\Delta z$, $\Delta y$, $\Delta w$. However, the derivatives $y'$ and $u'$ in Eq. (8) are needed for operation at the base trajectory $\{r_1(t), u_1(t), z_1(t), y_1(t)\}_{t=1}^{N}$ using the current controller weights $w_i$. Thus, for evaluation of the required derivatives, $\Delta r = 0$, $\Delta u = 0$, $\Delta z = 0$, $\Delta y = 0$, $\Delta w = 0$ (and, consequently, from Eq. (10), $\Delta e = 0$), and the definition $\Delta w = w - w_i$ implies $\Delta w' = w' - w_i' = I - 0 = I$, where $I$ denotes the identity matrix, so that
$$\Delta y'(t) = F(t)\,g' + F(t)S(t)\,\Delta y'(t), \tag{22}$$
$$\Delta u'(t) = g' + S(t)F(t)\,\Delta u'(t). \tag{23}$$
Rearranging and using Eqs. (16) and (17) gives
$$y'(t) = \Delta y'(t) = [1 - F(t)S(t)]^{-1} F(t)\, g', \tag{24}$$
$$u'(t) = \Delta u'(t) = [1 - S(t)F(t)]^{-1} g', \tag{25}$$
as noisy estimates of the gradients for use in Eq. (8). Notice that the gradients for only the open-loop path, not accounting for the feedback loop, would have been simply $y'(t) = F(t)\,g'$ and $u'(t) = g'$.
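The operator expressions in Eqs. (22)-(25) can be evaluated as a recursion in time. The sketch below does this under two assumptions that are not part of the general derivation: the sensitivity model is first order, $F(t) = b(t) q^{-1} / (1 - a(t) q^{-1})$ (the structure used later in Eq. (39)), and the controller is static ($n_y = 0$), so that $S(t)$ is a scalar gain. Function names and the data layout are hypothetical.

```python
def closed_loop_sensitivities(a, b, g_prime, S):
    """Propagate Eqs. (22)-(23) in time for a first-order model.

    a, b    : model parameters a(t), b(t) for each time step
    g_prime : g_prime[t][k] = dg/dw_k evaluated on the base trajectory
    S       : S[t] = dg/dy(t) evaluated on the base trajectory
    Returns (dy, du) with dy[t][k] = dy(t)/dw_k and du[t][k] = du(t)/dw_k.
    """
    N, n_w = len(g_prime), len(g_prime[0])
    dy = [[0.0] * n_w for _ in range(N)]
    du = [[0.0] * n_w for _ in range(N)]
    for t in range(N):
        for k in range(n_w):
            if t > 0:
                # y'(t) = F(t) u'(t) for the first-order model
                dy[t][k] = a[t] * dy[t - 1][k] + b[t] * du[t - 1][k]
            # u'(t) = g' + S(t) y'(t): the second term is the feedback path
            du[t][k] = g_prime[t][k] + S[t] * dy[t][k]
    return dy, du


def open_loop_sensitivities(a, b, g_prime):
    """Same recursion with the feedback path cut, i.e. y' = F g' and u' = g'."""
    return closed_loop_sensitivities(a, b, g_prime, [0.0] * len(g_prime))


def run_batch(w, a, b, r, y0=0.0):
    """Toy closed loop used only to check the recursion:
    y(t) = a(t) y(t-1) + b(t) u(t-1),  u(t) = w[0] * (r(t) - y(t)) + w[1]."""
    y, u = [y0], []
    for t in range(len(r)):
        if t > 0:
            y.append(a[t] * y[t - 1] + b[t] * u[t - 1])
        u.append(w[0] * (r[t] - y[t]) + w[1])
    return y, u
```

For the toy loop in `run_batch`, `g_prime[t]` is `[r[t] - y[t], 1.0]` and `S[t]` is `-w[0]`; the closed-loop recursion then agrees with finite-difference derivatives of the simulated outputs, whereas the open-loop variant does not.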
2.6. Identification of the linear time-varying model
Since the structure $g(\cdot)$ of the neural network is known, computation of $g'$ is trivial and the polynomial $S(t)$ can readily be evaluated from Eq. (14). As the plant $f(\cdot)$ and the undisturbed output $z(t)$ are unknown, the transfer function $F(t)$ needed in Eqs. (24) and (25) cannot directly be evaluated from Eq. (13), but is obtained by fitting the linearized model in Eq. (11) to plant data on and around the base trajectory $\{r_1(t), u_1(t), z_1(t), y_1(t)\}_{t=1}^{N}$. For this purpose, in each iteration of Eq. (7), a second batch run is conducted with the same controller weights $w_i$ for a reference signal $r_1(t) + \Delta r(t)$ that represents a small perturbation around the base trajectory. The deviation $\{\Delta u(t), \Delta y(t)\}_{t=1}^{N}$ of the new perturbed trajectory with respect to the base trajectory provides the input–output data for fitting the open-loop transfer function $F(t)$ of the linear model in Eq. (11) (see Fig. 1). Thereby, the integers $n_z$ and $n_u$ in Eq. (13) are either available from prior knowledge or deduced as part of the model design and validation. The best-fit $F(t)$ is obtained by minimizing the loss criterion
$$V(t, F) = \frac{1}{2N} \sum_{j=1}^{t} \lambda^{t-j}(t)\,[\Delta y(j) - F(j)\,\Delta u(j)]^2 + \frac{1}{2N} \sum_{j=t+1}^{N} \lambda^{j-t}(t)\,[\Delta y(j) - F(j)\,\Delta u(j)]^2, \tag{26}$$
off-line, based on the $\{\Delta u(t), \Delta y(t)\}_{t=1}^{N}$ data only from the two batch runs in the current iteration of Eq. (7); data from batch runs in the previous iterations are not cumulated, since those data are valid only along the different trajectories obtained in the respective iterations. Note that this procedure does not imply any "real-time" adjustment of the process model during a batch experiment; the model parameters are updated only between two successive batches. Here $0 < \lambda(t) \le 1$ is a user-specified discounting factor for data at previous and later $t$ values in the data set. The value of $\lambda$ jointly expresses the confidence in the sensor readings and in the model structure. $\lambda(t) = 1$ corresponds to a time-invariant model, whereas low $\lambda$ values lead to rapid time-variance in the model. Lower values of $\lambda$ give lower bias and higher variance in the identified $F(t)$ values, but the disadvantage of the higher variance is largely mitigated by the averaging effect due to the time-invariant nature of the controller weights for whose tuning the $F(t)$ values are needed. Of course, loss criteria other than that in Eq. (26) are also feasible.
2.7. Summary of the tuning algorithm
The tuning algorithm is described by the following steps:
1. Perform two experiments on the plant with reference signals $r(t)$ and $r(t) + \Delta r(t)$.
2. Estimate a linear time-varying model of the plant linearization, $F(t)$, along the trajectories of the data from the experiments.
3. Compute estimates of the derivatives $y'(t)$ and $u'(t)$ according to Eqs. (24) and (25).
4. Update the control parameters according to Eqs. (7) and (8).
5. Go to 1 if the tuning has not converged.
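Step 2 of the algorithm can be sketched as a two-sided exponentially weighted least-squares fit. The sketch assumes a first-order model structure, $\Delta y(j) = a\,\Delta y(j-1) + b\,\Delta u(j-1)$, and reads Eq. (26) as fitting, for each time $t$, one local parameter pair with weights $\lambda^{|t-j|}$; this reading, the small `ridge` regularization, and the function name are assumptions made for illustration.

```python
def identify_ltv_first_order(du, dy, lam=0.95, ridge=1e-9):
    """Estimate a(t), b(t) for every t from deviation data of one iteration.

    du, dy : input and output deviations between the perturbed and base runs
    lam    : discounting factor, 0 < lam <= 1 (lam = 1 gives a constant model)
    For each t the 2x2 weighted normal equations are solved in closed form.
    """
    N = len(dy)
    a, b = [0.0] * N, [0.0] * N
    for t in range(N):
        m11 = m22 = ridge   # regularized diagonal of the normal matrix
        m12 = r1 = r2 = 0.0
        for j in range(1, N):
            weight = lam ** abs(t - j)   # discount data far from time t
            y1, u1 = dy[j - 1], du[j - 1]
            m11 += weight * y1 * y1
            m12 += weight * y1 * u1
            m22 += weight * u1 * u1
            r1 += weight * dy[j] * y1
            r2 += weight * dy[j] * u1
        det = m11 * m22 - m12 * m12
        a[t] = (r1 * m22 - r2 * m12) / det
        b[t] = (m11 * r2 - m12 * r1) / det
    return a, b
```

The returned sequences would then feed the derivative computation of step 3. Because the fit uses data on both sides of $t$, it can only be run off-line after the two batch runs are complete, in line with the remark that the model is updated only between batches.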
3. Batch reactor control
3.1. Reactor simulation
The usefulness of the method of Section 2 in tuning neural controllers for trajectory tracking in batch processes is demonstrated using the example of a batch reactor (Kravaris and Chung, 1987) for the consecutive reactions
$$A \xrightarrow{k_1} B \xrightarrow{k_2} C, \tag{27}$$
that is simulated using the balance equations
$$\dot{c}_A = -k_1(T)\, c_A^2, \tag{28}$$
$$\dot{c}_B = k_1(T)\, c_A^2 - k_2(T)\, c_B, \tag{29}$$
$$\dot{T} = \gamma_1 k_1(T)\, c_A^2 + \gamma_2 k_2(T)\, c_B + (a_1 + a_2 T) + (b_1 + b_2 T)\, u. \tag{30}$$
Here $c_A$ and $c_B$ are concentrations of A and B, respectively, in the reaction mass, $T$ is the reaction temperature, $k_1$ and $k_2$ are rate constants following Arrhenius temperature dependence
$$k_1(T) = A_1 \exp(-E_1/RT), \tag{31}$$
$$k_2(T) = A_2 \exp(-E_2/RT) \tag{32}$$
with frequency factors $A_1$ and $A_2$, activation energies $E_1$ and $E_2$, and gas constant $R$; $u$ is a single dimensionless control input representing common scaling of two physical manipulated variables; $\gamma_1$, $\gamma_2$, $a_1$, $a_2$, $b_1$, $b_2$ are certain combinations of physical parameters; and all variables are in SI units as listed in the original paper (Kravaris and Chung, 1987). The parameter values given in the original example are modified to make the process more nonlinear, and the control problem more challenging, for demonstration of the tuning method. The parameter values
A1 = 3 103; E1/R = 1 101; A2 = 3 103; E2/R = 5 103; γ1 = 3.33; γ2 = 6.67 101; a1 = 1 102; a2 = 2 102; b1 = 1; b2 = 3 102 (33)
and the initial conditions
$$c_A(0) = 20, \quad c_B(0) = 0, \quad T(0) = 5, \tag{34}$$
are used for simulation. The process output $y$ is the temperature
$$y(t) = T(t) + e(t), \tag{35}$$
which is measured with a unit sampling period and with measurement noise $e$ simulated as zero in order to allow more transparent interpretation of the results. Using $T_d = 1$ in Eq. (5), the reference and the desired-output trajectory that is to be tracked is given by
r(t) = y_d(t) = 20 exp(0.02 t) + e(t), (36)
where $e(t)$ is a noise sequence simulated as a unit-variance white Gaussian signal filtered through the first-order low-pass filter $1/(s + 1)$. This disturbance is added to simulate fluctuations in the desired trajectory that is often obtained by on-line re-optimization of an upper-level control in response to incoming data (Terwiesch et al., 1994) and therefore inherits the fluctuations in those data.
3.2. Neural controller
A neural controller structure is specified as shown in Fig. 1. The network contains two nodes in the input layer, two nodes in the single hidden layer, and one node in the output layer. The input and output nodes are linear, while the hidden nodes use the standard sigmoidal nonlinearity $\sigma(\cdot)$. One input node receives the control error $e_y$,
$$e_y(t) = r(t) - y(t), \tag{37}$$
and the other receives a unit bias. Both input nodes are connected to both the hidden nodes, and the output node is connected to all the other four nodes. The controller weights $w_i$, $i = 1, \ldots, 8$ are as shown in Fig. 1, so that the control law can be expressed as
$$u(t) = w_1 e_y(t) + w_2 + w_7\, \sigma(w_3 e_y(t) + w_4) + w_8\, \sigma(w_5 e_y(t) + w_6). \tag{38}$$
This simple neural controller can also be interpreted as a P-controller with a nonlinear gain; it therefore allows an easier analysis.
The controller weights $w_i$, $i = 1, \ldots, 8$ are to be tuned so as to minimize the deviation from the desired trajectory $y_d(t)$ of Eq. (36), as given by Eq. (4) with $\gamma = 0$. The minimization is performed with the algorithm of Eq. (7), using $\mu_i$ and $R_i$ as suggested in (Dennis and Schnabel, 1983) for the Gauss–Newton method. The structure of the linear time-varying model of Eq. (11), which is used for the gradient calculation, is taken to be
$$y(t) = a(t)\, y(t-1) + b(t)\, u(t-1), \tag{39}$$
which corresponds to under-modeling of the above third-order batch reactor description that is used to simulate the process. The time-varying parameters $a(t)$ and $b(t)$ are identified in each iteration of Eq. (7) by performing two batch runs with reference trajectories given by Eq. (36), so that
$$\Delta r(t) = \Delta e(t), \tag{40}$$
and minimizing for the input–output deviation data $\{\Delta u(t), \Delta y(t)\}_{t=1}^{N}$ the loss function in Eq. (26) with a time-independent discounting factor $\lambda = 0.95$.
3.3. Results
First, a linear P-controller
$$u(t) = e_y(t), \tag{41}$$
where $e_y(t)$ is the control error, is simulated that does not have any parameters to be tuned. Closed-loop performance of this controller is shown in Figs. 2 and 3 and is expectedly poor. Since this controller offers closed-loop stability, it can be used to initialize controllers that are to be tuned using the method of Section 2, which requires an initial controller tuning that is closed-loop stable.
The above controller is then extended to an affine controller
$$u(t) = w_1 e_y(t) + w_2, \tag{42}$$
Fig. 2. Desired (dashed) and measured (solid) temperature trajectory for the P-controller.
Fig. 3. Control input for the P-controller.
Fig. 4. Desired (dashed) and measured (solid) temperature trajectory for the tuned affine controller.
Fig. 5. Control input for the tuned affine controller.
Fig. 6. Parameters $a(t)$ (solid) and $b(t)$ (dashed) of the linear time-varying model (Eq. (39)) identified in the final tuning iteration for the affine controller around the actual input–output trajectories shown in Figs. 4 and 5.
that can be seen as a linear neural network with two linear input nodes directly connected to a single linear output node. The weights of this controller are initialized as $w_1 = 1$ and $w_2 = 0$ to correspond to the P-controller of Eq. (41) and are tuned using the same procedure as those of the desired neural controller in Eq. (38), to obtain
$$w_1 = 2.4, \quad w_2 = 0.94. \tag{43}$$
Closed-loop performance of this controller, in Figs. 4 and 5, shows significant improvement over that of the P-controller in Figs. 2 and 3, but also exhibits stability problems towards the end of the batch. The limited degrees of freedom in this low-parameterized linear controller do not allow a small error to be achieved during the entire batch run, so that the closed-loop system is forced close to instability towards the end of the run in order to obtain significant improvement near the beginning of the run. That the controller is underparameterized is also evident from the identified parameters $a(t)$ and $b(t)$ of the linear time-varying model (Eq. (39)), which show drastic variation over the batch duration, as shown in Fig. 6. The strongly nonlinear nature of the process seems to require a nonaffine controller.
The neural controller of Eq. (38), with significantly more degrees of freedom, is implemented next. Using a procedure described in detail in a previous work (Sjöberg, 1997), the controller weights are initialized as follows: $w_1$, $w_2$ are set to the values $w_1 = 1$ and $w_2 = 0$ corresponding to the stable P-controller in Eq. (41), $w_3$–$w_6$ are chosen so as to place the sigmoid functions in regions occupied by the available data, and
$w_7$, $w_8$ are set to zero in order to obtain a stable closed-loop system with the initialized controller. Initializing $w_1$ and $w_2$ as in Eq. (43) for the tuned affine controller of Eq. (42) is ill-advised, not only because the initial controller would then be at the limit of stability, but also because the extra batch runs involved in obtaining the weights in Eq. (43) might outweigh any resultant reduction in the number of iterations needed for the neural-controller tuning. The weights obtained using the tuning procedure of Section 2 are

$$w_1 = 2.9, \quad w_2 = 1.5, \quad w_3 = 2.7, \quad w_4 = 4.0, \quad w_5 = 0.05, \quad w_6 = 3.4, \quad w_7 = 2.1, \quad w_8 = 0.047. \tag{44}$$

Fig. 8. Control input for the tuned neural controller.
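The effect of the initialization described above can be illustrated with a minimal sketch. The controller structure used here is an assumption made only for illustration, since Eq. (38) is not reproduced in this section: a linear part with weights w1, w2, plus two sigmoid units whose slopes and offsets are w3–w6 and whose output weights are w7, w8. The inputs `x1`, `x2`, the function names, and the placeholder sigmoid parameters are all hypothetical. The sketch only shows that zero output weights make the initial network coincide with its linear part, whatever values are chosen for w3–w6.

```python
import math


def sigmoid(z):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))


def controller(w, x1, x2):
    """Hypothetical 8-weight controller (assumed structure, not Eq. (38) itself).

    w1, w2 : weights of the linear part
    w3..w6 : slopes and offsets of two sigmoid units (assumed arrangement)
    w7, w8 : output weights of the two sigmoid units
    """
    w1, w2, w3, w4, w5, w6, w7, w8 = w
    linear_part = w1 * x1 + w2 * x2
    sigmoid_part = w7 * sigmoid(w3 * x1 + w4) + w8 * sigmoid(w5 * x2 + w6)
    return linear_part + sigmoid_part


def initial_weights(sigmoid_params):
    """w1 = 1, w2 = 0 (linear part), w7 = w8 = 0 (sigmoid units switched off)."""
    return [1.0, 0.0, *sigmoid_params, 0.0, 0.0]


# Placeholder values for w3..w6; in the paper these are placed using the data.
w0 = initial_weights([0.5, -1.0, 2.0, 0.3])

# Arbitrary test inputs (hypothetical).
samples = [(-2.0, 1.0), (0.0, 0.0), (3.5, -0.7)]
outputs = [controller(w0, x1, x2) for x1, x2 in samples]

for (x1, x2), u in zip(samples, outputs):
    print(f"x1={x1:5.1f}  x2={x2:5.1f}  u={u:5.1f}")
```

With these initial weights the controller output equals `x1` for every sample, so under the assumed structure the tuning starts from the behaviour of the linear controller and the sigmoid units only contribute once w7 and w8 move away from zero.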
The tuning optimization converged after 4 iterations of Eq. (7), thus requiring 8 batch runs for the controller tuning. The performance index $J(w)$ attained the values 18.4, 17.9, 15.4, 13.0, and 6.7 after iterations 0–4 and showed small, noise-induced fluctuations around the value 6.7 during further iterations. For other noise sequences $e(t)$ in Eq. (36), the optimization converged to the same fluctuation range of the minimum $J(w)$ after 4 to 5 iterations.

The closed-loop performance of the tuned neural controller is shown in Figs. 7 and 8. The performance is superior to that of the simpler controllers over the entire batch duration. This shows that satisfactory tracking is possible with a neural controller structure that is relatively simple compared to the process complexity; further improvement may be possible with a more complex neural-network structure, but would entail the usual disadvantages that come with added complexity.

Fig. 7. Desired (dashed) and measured (solid) temperature trajectory for the tuned neural controller.
Fig. 9. Parameters $a(t)$ (solid) and $b(t)$ (dashed) of the linear time-varying model (Eq. (39)) identified in the final tuning iteration for the neural controller around the actual input–output trajectories shown in Figs. 7 and 8.

Fig. 9 shows the parameters of the linear time-varying model (Eq. (39)) for the tuned neural controller, which reflect the strongly nonlinear nature of the batch process. Comparison with Fig. 6 shows that the underparameterized affine controller (Eq. (42)) faces drastically different process dynamics, which is partly responsible for the stability problem it encountered towards the end of the batch run. Since the plant is under-modeled by the time-varying model, the difference in the actual trajectories in Figs. 4 and 7 leads to completely different parameter estimates in Figs. 6 and 9. The difference in the parameter estimates is so large because the neural controller has a higher bandwidth, which changes the frequency at which the low-order model fits the plant. Note that a first-order model has
been chosen, because it does not make sense to use a model of higher order than the order of the controller. The extent of restriction of the degrees of freedom due to this under-parameterization is also apparent in Fig. 10, which shows the time-varying weights $w_1(t)$ and $w_2(t)$ that the affine controller would have needed in order to achieve tracking performance similar to that of the neural controller.

Fig. 10. Time-varying weights $w_1(t)$ (solid) and $w_2(t)$ (dashed) for the affine controller (Eq. (42)) obtained through linearization of the tuned neural controller (Eq. (38)) around the reference trajectory $y_d$ (Eq. (36)).

The proposed approach of tuning a time-invariant controller based on a time-varying adaptive model is especially suited to the control of poorly known batch processes, where
* the poor prior knowledge about the process precludes the use of a sufficiently accurate time-invariant model, and
* the batch nature of the process requires satisfactory control at early batch times to ensure adequate endpoint performance, thus ruling out the use of time-varying adaptive controllers that attain satisfactory performance only asymptotically.

Although the approach enables satisfactory trajectory tracking in batch processes with only scant prior information, such as a rough knowledge of the plant order, any useful information available in the form of a first-principles model could be incorporated into the scheme in one of several ways discussed previously (Agarwal, 1997a). The most beneficial way of utilizing a physical nonlinear model in the present scheme is an issue that merits further research. Since the method relies solely on the plant data, gradual drifts in the process characteristics and unknown persistent disturbances are automatically accounted for in the controller response, provided that one or a few tuning iterations are repeated intermittently during the application of the initially tuned controller.

4. Conclusion
Temperature control in an exothermic batch reactor is taken as an example of a trajectory-tracking control problem in batch processes. In the absence of an accurate physical model, a black-box nonlinear controller is postulated in the form of a neural network. A method recently proposed by the authors is applied to tune the controller weights based only on actual plant data from trial batch runs. In order to economize the number of expensive batch runs required, the method expedites the tuning convergence by computing a rigorous objective-function gradient that accounts for the feedback path and the plant dynamics. The rigorous gradient calculation utilizes a rapidly time-varying, linear, parameterized model of the process, which is adapted anew to the plant data in each optimization iteration. Results demonstrate the viability of using generic neural controllers for trajectory tracking in batch processes where the available prior knowledge based on physical insight is inadequate for the design and tuning of a satisfactory controller. The optimization results also show the efficacy of the proposed gradient-calculation method for economically tuning the neural controller weights based on only a limited number of trial batch runs.
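As a rough check on what "a limited number of trial batch runs" means for the example of Section 3, the reported convergence figures can be tabulated with a few lines of arithmetic. The sketch below uses only the numbers quoted in the text (the performance index after iterations 0–4, and 8 batch runs for 4 iterations). The variable names are illustrative, and the even split of runs across iterations is an inference from those two totals rather than a statement from the paper.

```python
# Performance index J(w) after iterations 0..4, as reported in Section 3.
J = [18.4, 17.9, 15.4, 13.0, 6.7]

iterations = len(J) - 1   # 4 tuning iterations
batch_runs = 8            # reported total number of batch runs

# Average number of batch runs per iteration (inferred from the two totals).
runs_per_iteration = batch_runs / iterations

# Decrease of J in each iteration and overall relative reduction.
drops = [J[k - 1] - J[k] for k in range(1, len(J))]
total_reduction = (J[0] - J[-1]) / J[0]

for k, d in enumerate(drops, start=1):
    print(f"iteration {k}: J drops by {d:.1f}")
print(f"runs per iteration: {runs_per_iteration:.0f}")
print(f"overall reduction of J: {100 * total_reduction:.0f}%")
```

On these figures the index falls by roughly 64% over the four iterations, and the largest single decrease (from 13.0 to 6.7) occurs in the last iteration before the noise-induced fluctuations set in.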
References
Agarwal, M., 1997a. Combining neural and conventional paradigms for modeling, prediction, and control. International Journal of Systems Science 28 (1), 65–81.
Agarwal, M., 1997b. A systematic classification of neural-network-based control. IEEE Control Systems Magazine 17 (2), 75–93.
Bhat, J., Chidambaram, M., Madhavan, K.P., 1990. Robust control of batch reactors. Chemical Engineers Communications 87, 195–204.
Chang, J.S., Hsu, J.-S., Sung, Y.-T., 1996. Trajectory tracking of an optimizing path in a batch reactor: experimental study. Industrial Engineering and Chemical Research 35, 2247–2260.
Defaye, G., Regnier, N., Chabanon, J., Caralp, L., Vidal, C., 1993. Adaptive-predictive temperature control of semi-batch reactors. Chemical Engineering Science 48 (19), 3373–3382.
Dennis, J.E., Schnabel, R.B., 1983. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Prentice-Hall, Englewood Cliffs, NJ.
Hjalmarsson, H., Gunnarsson, S., Gevers, M., 1995. Model-free tuning of a robust regulator for a flexible transmission system. European Journal of Control 1 (2), 148–156.
Ishikawa, T., Tsuji, J., Ohmori, H., Sano, A., 1993. Novel configuration of nonlinear adaptive control incorporating neural network. Proceedings of the 12th IFAC World Congress, Vol. 9. Sydney, Australia, pp. 483–488.
Kiparissides, C., Shah, S.L., 1983. Self-tuning and stable adaptive control of a batch polymerization reactor. Automatica 19 (3), 225–235.
Kravaris, C., Chung, C.-B., 1987. Nonlinear state feedback synthesis by global input/output linearization. A.I.Ch.E Journal 33 (4), 592–603.
Kravaris, C., Wright, R.A., Carrier, J.F., 1989. Nonlinear controllers for trajectory tracking in batch processes. Computers in Chemical Engineering 13 (1/2), 73–82.
Ku, C.-C., Lee, K.Y., 1995. Diagonal recurrent neural networks for dynamic systems control. IEEE Transactions on Neural Networks 6 (1), 144–156.
Kuhn, K.P., 1989. A comparison of adaptive controllers on nonlinear systems. In: Kümmel, M. (Ed.), Adaptive Control of Chemical Processes 1988 (ADCHEM’88): selected papers from the 2nd International IFAC Symposium, Lyngby, Denmark. Pergamon Press, UK, pp. 27–31.
Lee, J.H., Datta, A.K., 1994. Nonlinear inferential control of pulp digesters. A.I.Ch.E Journal 40 (1), 50–64.
Lee, K.S., Bang, S.H., Chang, K.S., 1994. Feedback-assisted iterative learning control based on an inverse process model. Journal of Process Control 4 (2), 77–89.
Lightbody, G., Irwin, G.W., 1995. Direct neural model reference adaptive control. IEE Proceedings - Control Theory and Applications 142 (1), 31–43.
Narendra, K.S., Mukhopadhyay, S., 1994. Adaptive control of nonlinear multivariable systems using neural networks. Neural Networks 7 (5), 737–752.
Perne, R., 1990. Model-based control system for exothermic chemical reactions. Proceedings of the 11th IFAC World Congress, Tallinn, Estonia, pp. 205–207.
Peterson, T., Hernandez, E., Arkun, Y., Schork, F.J., 1989. Nonlinear predictive control of a semi batch polymerization reactor by an extended DMC. Proceedings of the American Control Conference, Pittsburgh, PA, pp. 1534–1539.
Rafalimanana, A., Cabassud, M., le Lann, M.V., Casamatta, G., 1992. Adaptive control of a multipurpose and flexible semi-batch pilot plant reactor. Computers in Chemical Engineering 16 (9), 837–848.
Rotstein, G.E., Lewin, D.R., 1992. Control of an unstable batch chemical reactor. Computers in Chemical Engineering 16 (1), 27–49.
Sanchez del Rio, J.A., le Lann, M.V., Cabassud, M., Casamatta, G., 1990. Adaptive control of a semi-batch pilot reactor using a generalized predictive controller. Proceedings of the 11th IFAC World Congress, Tallinn, Estonia, pp. 168–173.
Sastry, P.S., Santharam, G., Unnikrishnan, K.P., 1994. Memory neuron networks for identification and control of dynamical systems. IEEE Transactions on Neural Networks 5 (2), 306–319.
Sjöberg, J., 1997. On estimation of nonlinear black-box models: how to obtain a good initialization. IEEE Workshop in Neural Networks for Signal Processing, Amelia Island Plantation, FL, pp. 72–81.
Sjöberg, J., Agarwal, M., 1997. Nonlinear controller tuning based on linearized time-variant model. Proceedings of the American Control Conference, Vol. 5. Albuquerque, NM, pp. 3336–3340.
Sjöberg, J., Zhang, Q., Ljung, L., Benveniste, A., Delyon, B., Glorennec, P.-Y., Hjalmarsson, H., Juditsky, A., 1995. Non-linear black-box modeling in system identification: a unified overview. Automatica 31 (12), 1691–1724.
Soroush, M., Kravaris, C., 1992. Nonlinear control of a batch polymerization reactor: an experimental study. A.I.Ch.E Journal 38 (9), 1429–1448.
Terwiesch, P., Agarwal, M., Rippin, D.W.T., 1994. Batch unit optimization with imperfect modeling: a survey. Journal of Process Control 4 (4), 238–258.
Wang, Z.L., Corriou, J.P., Pla, F., 1993. Nonlinear control of a batch polymerization reactor with on-line parameter and state estimations. Proceedings of the 32nd IEEE Conference on Decisions and Control, San Antonio, TX, pp. 3858–3863.