Enhancing dynamic data reconciliation performance through time delays identification





Chemical Engineering and Processing 46 (2007) 1251–1263

Ignacio Yélamos, Carlos Méndez, Luis Puigjaner∗

Universitat Politècnica de Catalunya, Chemical Engineering Department, Av. Diagonal 647, 08028 Barcelona, Spain

Received 16 October 2006; accepted 16 October 2006; available online 30 November 2006

Abstract

In order to improve the performance of data reconciliation methods, an efficient genetic algorithm (GA) for determining time delays has been developed. Delays are identified by searching for the maximum correlation among the process variables. The delay vector (DV) is integrated within a dynamic data reconciliation (DDR) procedure based on the Kalman filter, through the measurements error model. The proposed approach can be satisfactorily applied not only off-line but also on-line. It was first validated in a dynamic process with recycles and feedback control loops. Then, the methodology was successfully applied to a highly non-linear and complex challenging control case study, the Tennessee Eastman benchmark process, demonstrating its robustness in complex industrial problems. This case study required implementing an extended Kalman filter to deal with the existing non-linearities.

© 2006 Elsevier B.V. All rights reserved.

Keywords: Genetic algorithm; Time delay identification; Dynamic data reconciliation; Kalman filter

Abbreviations: CPU, central processing unit; DR, data reconciliation; DDR, dynamic data reconciliation; EKF, extended Kalman filter; GA, genetic algorithm; KF, Kalman filter; PRD, process related delay; SRD, sensor related delay; TDI, time delay identification
∗ Corresponding author. Tel.: +34 93 401 66 78; fax: +34 93 401 09 79. E-mail address: [email protected] (L. Puigjaner).
0255-2701/$ – see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.cep.2006.10.013

1. Introduction

Data reconciliation (DR) is a model-based filtering technique that attempts to reduce the inconsistency between measured process variables and a process model. By decreasing this inconsistency, measurement errors are reduced and more accurate process variable values are obtained, which can then be used for process control and optimization. Nevertheless, the performance of DR can be seriously affected by the presence of any event that increases this inconsistency, such as modeling errors, gross errors and/or delays in sampling data. In such situations, an error in the estimation can be generated by forcing a match between a process model and measured signals that are incompatible. To deal with the presence of gross errors, several approaches have been presented in the literature [1], covering steady-state and dynamic systems as well as different kinds of errors (e.g. bias, process leak). However, less attention has been devoted to managing situations in which delays are present. Since time delays appear as a time variable, they introduce non-linearities even if the process is a linear one.

Correlation measures have been previously used to determine process delays. Wachs and Lewin [2] proposed an algorithm that applies relative shifts between process signals within a process data matrix in order to find the position in which the correlation among variables is maximum. The optimal shifts are those that minimize the determinant of the associated correlation matrix. Such a mechanism was applied to improve the resolution of a PCA-based fault diagnosis system. When the number of process variables $n$ is large or the maximum delay $d_{max}$ is high in terms of sampling time, exhaustive search for the maximum correlation by direct data matrix manipulation is impracticable due to the unmanageable combinatorial size (it involves $(d_{max})^n$ determinant calculations). Wachs and Lewin [2] recognized this problem and proposed a new algorithm that reduces the problem size to $d_{max} \times s \times (n - s)$, where $s$ is the number of inputs. However, this algorithm assumes that the output variables are correlated among themselves with no delays present and that the process inputs are independent. Apart from this limitation, the reduced but still exhaustive search strategy retained in the methodology makes it unsuitable for actual industrial problems involving continuous


(not discrete) time delays. Since real industrial problems have to deal with continuous operation times, such a delay identification methodology becomes infeasible in on-line implementations because of CPU time limitations. In that sense, more efficient and systematic approaches should be developed to tackle real-time delay identification.

In this work an efficient time delay identification (TDI) procedure based on GA optimization is presented for determining all the existing delays with respect to a reference process signal (e.g. the most or the least delayed process signal). This information is then introduced into the dynamic data reconciliation (DDR) procedure by means of the measurements error model. An academic dynamic case study with recycles and a challenging industrial control problem reported in the literature, the Tennessee Eastman benchmark [3], have been selected to illustrate the advantages of the DDR approach. The main goal of this work is to properly combine DDR and TDI techniques in such a way that data/model delay mismatches are reduced as much as possible.

2. Data reconciliation

DR is a technique that takes advantage of the redundancy between the process model and the measurements model (Eq. (1)). Therefore, the existence of both models is a prerequisite to reconcile redundant measured data. The measurements model is constructed by assigning to each one of the $n$ measured process variables a normal probability distribution around its true value. In the absence of gross errors the measurements model can be written as:

$$y_{t_k} = y^*_{t_k} + \varepsilon_{t_k}, \quad y_{t_k} \in \mathbb{R}^n \qquad (1)$$

where $y$ is the $n \times 1$ measurements vector, $y^*$ the $n \times 1$ vector of unknown true values, and $\varepsilon$ stands for the $n \times 1$ vector of random measurement errors, whose expected value is the null vector and which has a known positive definite variance matrix (measurements are assumed to be independent). Additional information must be introduced through the process model equations (constraints). The process model is in general represented by a set of differential equation constraints:

$$f\left(\frac{d\hat{y}_{t_k}}{dt}, \hat{y}_{t_k}, \hat{\delta}_{t_k}, \hat{\theta}_{t_k}\right) = 0, \quad f \in \mathbb{R}^f \qquad (2)$$

algebraic equality constraints:

$$g(\hat{y}_{t_k}, \hat{\delta}_{t_k}, \hat{\theta}_{t_k}) = 0, \quad g \in \mathbb{R}^g \qquad (3)$$

and inequality constraints:

$$h(\hat{y}_{t_k}, \hat{\delta}_{t_k}, \hat{\theta}_{t_k}) \leq 0, \quad h \in \mathbb{R}^h \qquad (4)$$

where $\hat{y}_{t_k}$ represents the values of the estimated vector at discrete time $t_k$, and $\hat{\delta}_{t_k}$ and $\hat{\theta}_{t_k}$ are the $p \times 1$ vector of estimated unmeasured variables and the $q \times 1$ vector of adjusted model parameters at discrete time $t_k$, respectively. The adjustment of measurements to compensate for random errors usually leads to a constrained optimization problem.

In most applications, the objective function (OF) to be minimized is a weighted least-squares of the differences from the measured values, subject to the process model constraints:

$$\mathrm{OF}(y, \hat{y}, \sigma) = \sum_{t_k=0}^{K} [\hat{y}_{t_k} - y_{t_k}]^T \, \Sigma^{-1} \, [\hat{y}_{t_k} - y_{t_k}] \qquad (5)$$
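As a minimal illustration, the objective of Eq. (5) can be sketched in NumPy. The function name, argument order and array shapes below are our own conventions, not the paper's implementation; the diagonal weighting $\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2)$ follows the definition given in the text.

```python
import numpy as np

def wls_objective(y_meas, y_hat, sigma):
    """Weighted least-squares DR objective of Eq. (5).

    y_meas, y_hat : (K, n) arrays of measured and reconciled values
        over K sample times.
    sigma : (n,) vector of measurement standard deviations, so that
        Sigma = diag(sigma**2).
    """
    r = y_hat - y_meas               # residuals [y_hat - y] at each t_k
    w = 1.0 / sigma**2               # diagonal of Sigma^{-1}
    return float(np.sum(r**2 * w))   # sum over t_k of r^T Sigma^{-1} r
```

Perfect reconciliation gives a zero objective, and a residual of one standard deviation in a single measurement contributes exactly 1 to the sum.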

where $\Sigma$ is a diagonal matrix, $\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2)$, and $\sigma_i^2$ is the variance of the $i$th measured variable. Different approaches have been proposed to solve this optimization problem [4,5]. If the process does not show high non-linearities and no inequality constraints are present, the Kalman filter (KF) technique can be used efficiently [6].

2.1. Kalman filter

The Kalman filter (KF) is an efficient recursive filter which estimates the state of a dynamic system from a series of incomplete and noisy measurements, assuming that the true current state evolves from the previous one. The algorithm uses the following state-space and measurements models, respectively:

$$x_{t_k} = A_{t_k} x_{t_{k-1}} + B_{t_k} u_{t_k} + w_{t_k} \qquad (6)$$

$$y_{t_k} = H_{t_k} x_{t_k} + v_{t_k} \qquad (7)$$

where $t_k$ represents a sample time, $x_{t_k}$ is the $n_x$-dimensional vector of state variables, $u_{t_k}$ the $n_u$-dimensional vector of manipulated input variables and $y_{t_k}$ is the $n_y$-dimensional vector of measured variables. The state transition matrix $A_{t_k}$, the control gain matrix $B_{t_k}$ and the observation matrix $H_{t_k}$ are matrices of appropriate dimension; if their coefficients are time independent the subscript $t_k$ can be dropped. The Kalman filter assumes errors in the process model and in the measured data. The process noise $w_{t_k}$ represents errors in the state transition model. This noise is assumed to be white, with zero mean and variance $Q$. $v_{t_k}$ represents a measurement noise with variance $R$. Using process measurements, the error covariance matrix $P_{t_k/t_k}$ associated with the estimated state vector $\hat{x}_{t_k/t_k}$ is updated as follows:

$$P_{t_k/t_k} = (I - K_{t_k} H_{t_k}) P_{t_k/t_{k-1}} \qquad (8)$$

where $K_{t_k}$ is the Kalman filter gain given by:

$$K_{t_k} = P_{t_k/t_{k-1}} H_{t_k}^T (H_{t_k} P_{t_k/t_{k-1}} H_{t_k}^T + R_{t_k})^{-1} \qquad (9)$$

Then, the process expected measurements and the model current state estimation can be evaluated by means of Eqs. (10) and (11), respectively:

$$\hat{y}_{t_k} = y_{t_k} - H_{t_k} \hat{x}_{t_k/t_{k-1}} \qquad (10)$$

$$\hat{x}_{t_k/t_k} = \hat{x}_{t_k/t_{k-1}} + K_{t_k} \hat{y}_{t_k} \qquad (11)$$

where $\hat{x}_{t_k/t_k}$ is the model state prediction and $\hat{y}_{t_k}$ is the measurements model prediction. $y_{t_k} - H_{t_k}\hat{x}_{t_k/t_{k-1}}$ is called the residual or innovation and reflects the discrepancy between the predicted measurements $H_{t_k}\hat{x}_{t_k/t_{k-1}}$ and the real measurement $y_{t_k}$. The gain matrix $K_{t_k}$ is a factor that allows weighting more or less


the residual, depending on the confidence of the system in the process measurements or in the previous process estimations. In industrial cases, applicability of the simple Kalman filter is limited because of the commonly non-linear nature of real-life problems. More accurate non-linear techniques have been proposed to deal with such problems.

2.2. Extended Kalman filter

In case the process to be estimated or the measurements relationship is not linear, the simple Kalman filter is in many cases not an accurate state estimator. In such cases, the extended Kalman filter (EKF), which linearizes about the current mean and covariance, is the best choice. The extended Kalman filter assumes that the process has an associated state vector $x \in \mathbb{R}^n$, but now the process is governed by the following non-linear stochastic difference equations:

$$x_{t_k} = \gamma(x_{t_{k-1}}, u_{t_k}, w_{t_{k-1}}) \qquad (12)$$

$$y_{t_k} = \lambda(x_{t_k}, v_{t_k}) \qquad (13)$$

where the function $\gamma$ is used to compute the predicted state from the previous estimate and, similarly, the function $\lambda$ is used to compute the predicted measurement from the predicted state. However, $\gamma$ and $\lambda$ cannot be applied to the covariance directly. Instead, a Jacobian matrix is computed. At each time step ($t_k$) the Jacobian is evaluated at the current predicted states. Basically, this process linearizes the non-linear functions around the current estimate, which allows updating the error covariance matrix $P_{t_k/t_k}$ and estimating the current process state $\hat{x}_{t_k/t_k}$ and the process measurements vector $\hat{y}_{t_k}$ from the real process measurements, as was discussed for the simple Kalman filter. In practice, the individual values of the noise $w_{t_k}$ and $v_{t_k}$ are not known, but the state and measurement vectors can be approximated. For details regarding the complete extended Kalman filter formulation we refer interested readers to Ref. [7], which includes the procedure followed in this work.

3. The role of time delay in data reconciliation

In most cases, KF and EKF can manage certain modeling errors by assuming uncertainties in the process model. However, a poor performance can be obtained when the data correlation is seriously affected by unknown time delays in the measured variables. This section introduces an efficient method for time delay identification that intends to overcome this critical limitation.
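The linearization step described above can be sketched with a finite-difference Jacobian. The snippet below is an illustrative simplification in which the noise enters additively (Eqs. (12)-(13) allow it inside $\gamma$ and $\lambda$); function names and signatures are our own assumptions.

```python
import numpy as np

def num_jacobian(f, x, eps=1e-6):
    """Finite-difference Jacobian of f evaluated at x."""
    fx = f(x)
    J = np.zeros((len(fx), len(x)))
    for j in range(len(x)):
        dx = np.zeros_like(x)
        dx[j] = eps
        J[:, j] = (f(x + dx) - fx) / eps
    return J

def ekf_step(x, P, u, y, gamma, lam, Q, R):
    """One EKF predict/update cycle for x_tk = gamma(x, u), y_tk = lam(x)."""
    # Predict: propagate the state and linearize gamma around the estimate
    A = num_jacobian(lambda s: gamma(s, u), x)
    x_pred = gamma(x, u)
    P_pred = A @ P @ A.T + Q
    # Update: linearize lam around the predicted state, then correct
    H = num_jacobian(lam, x_pred)
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    x_new = x_pred + K @ (y - lam(x_pred))
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```

For linear $\gamma$ and $\lambda$ the numerical Jacobians are exact and the step reduces to the simple Kalman filter of Section 2.1, which is a convenient sanity check.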


Two types of time delays can be distinguished: (1) process related delays (PRD) are caused by the intrinsic process dynamics, the control algorithm and/or the placement of the sensors in the plant (e.g. a thermocouple or flowmeter in a pipeline); generally these delays are propagated to the subsequent part of the process under consideration and are unknown to process operators. (2) Sensor related delays (SRD) have a local effect and can be obtained from supplier specifications. Nevertheless, from the DR perspective, these "local" delays will certainly induce a biased estimation in the rest of the involved redundant process variables. Although different, both kinds of delays cause a decrease in DR performance. Independently of the delay source, knowing the overall delay between variables is crucial to obtaining better DDR results.

As discussed in previous sections, the approach presented in Ref. [2] assumes that the output variables are correlated among themselves with no delays present and that the inputs are independent. Nevertheless, in real processes it is difficult to discriminate between input and output variables, since this distinction is process-unit dependent (a variable may be at the same time an output of one process unit and the input of the subsequent one). These distinctions are even more difficult to assume in industrial problems, where the use of recycles and feedback control systems is common. Moreover, this methodology has to simplify the real continuous problem by using time discretization because of the computational cost the search would otherwise require. This work presents a TDI method that overcomes all the distinctions discussed above and is able to deal efficiently with continuous time delays. This approach is also able to tackle real situations without requiring any a priori knowledge about the number and location of time delays or about the process input and output variables.
Since time delay identification in a problem with a large number of variables and delays requires an intense computational effort, we adopted a genetic algorithm (GA) based optimization method with the purpose of increasing the efficiency of the exhaustive search methodology. The proposed stochastic optimization approach cannot guarantee the global optimality of the solution. However, it is able to generate near-optimal solutions in a short time, which is crucial in on-line applications, where techniques that give good solutions with modest CPU time are preferred to those that can only generate optimal solutions in a time that exceeds the process requirements. GA-based methods take advantage of well-known genetic schemes so as to be able to cope with numerous and complex optimization problems in a reasonable time (see Ref. [8]). Moreover, it will be shown later that near-optimal solutions are also able to produce remarkable and consistent improvements in the DDR performance.

3.1. Time delay identification

Time delays are present in almost all industrial processes. They affect the performance of control, monitoring and DDR systems. Nevertheless, the usual approaches to most of these techniques do not take delays into account, or simply consider them known and constant over time.

3.2. Genetic algorithm

GAs constitute a vast class of stochastic algorithms used to solve optimization problems [9,10]. In comparison with other stochastic procedures for optimization, GAs consider many points in the search space simultaneously and therefore have a reduced chance of converging to local optima [11]. In the classical GA formalism (Fig. 1), a set of $N_{ind}$ candidate solutions (population) is generated randomly. Each potential solution (individual) in the multidimensional search space is coded as a vector called a chromosome (which consists of a string of genes, each one representing a feature). The goodness of each individual in the population is evaluated by utilizing a prespecified fitness criterion $\phi$. Upon assessing the fitness of all the chromosomes in the population, a new generation of individuals is created from the current population by using the selection, crossover and mutation operators.

Fig. 1. Genetic algorithm loop.

In this work, each chromosome, also known as a delay vector (DV), is defined as shown in Eq. (14). It contains the delay of each process signal ($d_i$) with respect to a reference signal. This reference signal can be the most or the least delayed process signal, but any other one can also be selected.

$$DV = [d_1, d_2, \ldots, d_n] \qquad (14)$$

The fitness of each individual DV is evaluated considering the correlation among the process variables. Let us consider a data matrix:

$$Y = [y_{1,t_k}, y_{2,t_k}, \ldots, y_{n,t_k}], \quad t_k = 1, \ldots, m \qquad (15)$$

containing data corresponding to $m$ time samples of $n$ process variables; $y_{i,t_k}$ represents a sample of process variable $i$ at time $t_k$. The error in $Y$ is supposed to follow a normal probability distribution, and each data vector must be normalized to zero mean and unit variance [1]. Let $\bar{Y}$ be this normalized data matrix. Introducing DV into $\bar{Y}$ allows us to obtain a new delay-adjusted data matrix ($Y_{DA}$) as:

$$Y_{DA} = [\bar{y}_{1,t_k-d_1}, \bar{y}_{2,t_k-d_2}, \ldots, \bar{y}_{n,t_k-d_n}] \qquad (16)$$

The correlation matrix $R$ for two random variables is formally defined as:

$$R_{x,y} = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y} \qquad (17)$$

where $\mathrm{cov}(x, y)$ is the covariance of both variables and $\sigma_x$ and $\sigma_y$ are their respective standard deviations. Then, extending that formulation to $Y_{DA}$, the correlation matrix of the adjusted variables matrix can be easily evaluated by:

$$R_{Y_{DA}} = \frac{\mathrm{cov}(Y_{DA})}{\prod_{i=1}^{n} \sigma_{\bar{y}_{i,t_k-d_i}}} \qquad (18)$$

The norm of $R$, which gives us quantitative information about the correlation of matrix $Y_{DA}$, is calculated as its maximum singular value, $\max SV(R)$. The fitness of DV is considered to be proportional to $\max SV(R)$. The complete procedure to evaluate the signals correlation is summarized next:

• Calculate the correlation matrix $R$ of $Y_{DA}$.
• Calculate the norm of $R$ as its maximum singular value $\max SV(R)$.
• Assign to each individual a fitness proportional to $\max SV(R)$.

To implement this algorithm, the Matlab genetic algorithm toolbox developed by the University of Sheffield [12] has been used.

3.3. Illustrative example

To demonstrate the performance of the proposed algorithm, an illustrative example is presented in Fig. 2. In Fig. 3 the evolution of the three main process variables can be seen: the valve command ($V_c$), the vapour flow through the valve ($F_v$) and the temperature of the tank heated by means of the vapour ($T$). Two types of delays are present:

• The valve response is delayed from its command signal.
• The controlled variable $T$ is delayed from the manipulated variable $F_v$.

Fig. 2. Illustrative example.
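The delay-vector fitness described in Section 3.2 (normalize each signal, shift it by its entry in DV, then score the correlation matrix of the aligned data by its maximum singular value) can be sketched in NumPy. The function name and the slicing convention used to build $Y_{DA}$ are illustrative assumptions, not the authors' toolbox code.

```python
import numpy as np

def delay_fitness(Y, dv):
    """Fitness of a delay vector DV = [d1, ..., dn] (Eqs. (14)-(18)).

    Y : (m, n) data matrix, one column per process signal.
    dv : integer delays, in sample-time units, of each signal with
        respect to the reference (most delayed) signal.
    """
    m = len(Y)
    dmax = max(dv)
    # Normalize each signal to zero mean and unit variance
    Yn = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    # Build the delay-adjusted matrix Y_DA (Eq. (16)), truncating so
    # that all shifted columns cover the same time window
    cols = [Yn[dmax - d : m - d, i] for i, d in enumerate(dv)]
    Yda = np.column_stack(cols)
    R = np.corrcoef(Yda, rowvar=False)            # correlation matrix
    return np.linalg.svd(R, compute_uv=False)[0]  # max singular value

# Two copies of one signal, the second delayed by 3 samples: the true
# delay vector aligns them perfectly and maximizes the fitness.
t = np.arange(200)
s = np.sin(0.1 * t)
Y = np.column_stack([s, np.roll(s, 3)])
print(delay_fitness(Y, [3, 0]) > delay_fitness(Y, [0, 0]))  # True
```

For perfectly aligned identical signals the correlation matrix is all ones and the maximum singular value reaches its ceiling of $n$, which is the behavior the GA exploits.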

Fig. 3. Raw process measured data with actual delay (illustrative example).

The reference signal selected in this case has been the most delayed one ($T$). The GA-based optimization method was run with the following parameters: 50 individuals, 10 generations, 0.7 crossover probability and 0.07 mutation probability. Also, a maximum delay of 32 sampling times was considered. In most simple cases, maximum delays can be easily estimated by plotting the process variables, as shown in Fig. 3. The algorithm converges to the best individual, $DV_b = [0, 26, 29]$ (in sample time units), in three generations. This is the same vector as the one obtained by means of an exhaustive search. Nevertheless, a slight difference from the real delay vector, $[0, 20, 24]$, has been observed due to the inherent process dynamics. Adjusting the initial data matrix (see Fig. 3) using the obtained delay vector ($DV_b$) leads to more consistent data ($Y_{DA}$), as shown in Fig. 4. In Fig. 5 the eigenvectors of the correlation data matrix of this illustrative example are shown. It can be seen that the system variance is concentrated in the first eigenvector after the delay adjustment (in other words, the process variance is explained in a unique direction). Therefore, by

Fig. 4. Adjusted process data by means of TDI (illustrative example).


Fig. 5. Variance explained in the eigenvectors of the correlation matrix. Illustrative example.

maximizing the variance explained by the first eigenvector, the process correlation is also maximized. Thus delays have been correctly identified and their negative effect has been properly eliminated. It is important to note that in the proposed approach the combined effect of all process delays is globally compensated.

4. Integrating TDI into DDR

Integration of the time delay identification procedure and DDR can be achieved by considering the identified time delays within the measurement model (Eq. (1)) as follows:

$$y_{i,t_k} = y^*_{i,t_k-d_i} + \varepsilon_{i,t_k-d_i}, \quad y_{t_k} \in \mathbb{R}^n \qquad (19)$$

In the case of sensor related delays, DDR can be performed simply by using the known sensor delay specifications and modifying the measurement model according to Eq. (19). In the case of process related delays, or any unknown delay in the process (such as those caused by the sum of SRD and PRD), the time delay identification can be performed efficiently by using the approach proposed in the previous section. Although the effect of delays in the process may not be completely compensated, the performance of the system can be highly enhanced, as will be shown in the next section.

Time delays (DV) can change dynamically over time, so an on-line adjustment must be performed regularly in order to maintain the DR performance. A quick delay identification is crucial because of on-line time limitations; the efficiency of the proposed GA search procedure therefore makes it very attractive for real-world industrial environments. In that sense, DV is estimated periodically using the GA-based procedure (Fig. 6). If the square sum of the residuals is less than a threshold value, the same DV is used to perform DDR, but if the threshold is exceeded, the TDI is activated in order to re-calculate DV. The residuals are evaluated based on the differences between process measurements and reconciled data. That evaluation is carried out


Fig. 6. Adaptive time delay identification.

periodically, at a frequency established from a previous process dynamic analysis. Seeking to reduce the computation time of TDI, the current DV is included in the initial population of the new delay estimation. Although the current DV may be outdated or erroneous, the proposed GA-based approach can quickly improve it through the mutation and crossover operations.
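The adaptive loop of Fig. 6 can be sketched as follows. Here `reconcile` stands in for the KF-based DDR step and `identify_delays` for the GA-based TDI; both names, signatures and the window structure are illustrative assumptions.

```python
import numpy as np

def adaptive_ddr(windows, reconcile, identify_delays, dv0, threshold):
    """Sketch of the adaptive TDI/DDR loop (Fig. 6).

    windows : iterable of measurement windows (arrays).
    dv0 : initial delay vector; threshold : limit on the squared
        sum of residuals that triggers delay re-identification.
    """
    dv = dv0
    reconciled = []
    for y in windows:
        y_rec = reconcile(y, dv)
        residual = float(np.sum((y - y_rec) ** 2))
        if residual > threshold:
            # Threshold exceeded: re-identify delays, seeding the GA
            # population with the current DV to speed up convergence
            dv = identify_delays(y, seed=dv)
        reconciled.append(y_rec)
    return dv, reconciled
```

The key design point mirrors the text: TDI runs only when the residuals indicate that the current DV no longer explains the data, keeping the on-line CPU cost low.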

5. Case studies

5.1. Academic case study

The proposed methodology can be applied both to PRD and SRD. However, a case study presenting PRDs is first introduced to highlight the complexity of delay propagation effects. Fig. 7 shows a flowsheet of the process considered to evaluate the proposed DDR methodology. It was taken from Ref. [13] and consists of eight streams and four storage tanks. In this example, the flow rates (Q) and mass hold-ups (W) are the process variables to be monitored. Because of the simplicity and the linear relationships existing among the monitored variables, the simple Kalman filter is enough to obtain an accurate and reliable dynamic data reconciliation. The good performance of the proposed DDR methodology is demonstrated next. The original example [13] has been modified by adding three process delays ($[d_a, d_b, d_c] = [200, 100, 10]$, in sample time units) at the positions shown in Fig. 7. The combined delay effect leads

Fig. 8. Academic case study: Q1 measured data (disturbed during 250 samples).

to a considerable discrepancy between the mass balance (model) around tank 1 and the measured values. Therefore, a bias in the DDR is expected. In addition to the predefined delays, a disturbance due to a sudden but lasting change causes a modification in the operating conditions: the flow Q1 rises suddenly to the value of 34.5 and is maintained there during 250 sample times; it then drops back to its original value of 29.5 (see Fig. 8).

First, the DDR has been applied to the case study without any delay. The performance of such an application can be seen in Fig. 9, showing some ability to reduce the measurement noise

Fig. 7. Academic case study with three added delays in positions indicated by the black boxes.


Fig. 9. Academic case study: reconciled Q3 values without delays.

and a notable estimation accuracy in the reconciliation of variable Q3. In this figure the continuous line is the true value behavior, the discontinuous line is the estimated value and the points represent the measurements. When the PRDs are incorporated, DDR can still perform correctly for some variables (Fig. 10a), due to the fact that KF can manage modeling errors, even when the non-linearities introduced by the delay effect are present.

Fig. 10. Academic case study: (a) reconciled Q3 values with delays and (b) reconciled W1 values with delays.


Nevertheless, it shows unacceptable results in other variables where the combined delay effect is more critical (Fig. 10b). By using the proposed combined methodology (DDR-TDI), a delay vector was obtained as previously described. The GA-based optimization method was run with the following parameters: 1000 individuals, 100 generations, 0.7 crossover probability, 0.0007 mutation probability and a maximum delay of 300 sampling times. This delay vector is used to adjust the measurement signals. Large differences between delayed and adjusted variables can be easily observed in Fig. 11. The improved correlation of the adjusted signals, shown in Fig. 11b, can be quantitatively evaluated by calculating the variance distribution along the extracted eigenvectors, as was done for the illustrative example in Section 3. Without any adjustment the variance explained by the first singular value was equal to 55.93, whereas in the adjusted case it took a value of 68.77, showing a much better variable correlation because of the time delay adjustment. This fact can be graphically checked by revising the variance distribution throughout the eigenvectors in both cases, depicted in Fig. 12. This significant correlation improvement between process variables positively affects the later DDR performance, as can be observed in all the examples addressed in this paper.
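The "variance explained" diagnostic used here and in Fig. 5 can be computed directly from the eigenvalues of the correlation matrix. A minimal NumPy sketch (the function name is ours):

```python
import numpy as np

def variance_explained(Y):
    """Fraction of total variance carried by each eigenvector of the
    correlation matrix of Y (one column per process variable)."""
    R = np.corrcoef(Y, rowvar=False)
    eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]  # descending order
    return eigvals / eigvals.sum()

# Perfectly correlated signals: the first eigenvector explains
# essentially all of the variance, as after a correct delay adjustment.
t = np.linspace(0, 10, 500)
Y = np.column_stack([np.sin(t), 2 * np.sin(t), -np.sin(t)])
print(variance_explained(Y)[0])  # close to 1.0
```

Delay-adjusted data concentrates variance in the first eigenvector, so this fraction rises as the delay vector approaches the true delays.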

Fig. 11. Academic case study: process variables (a) with delays and (b) with adjusted delays.


Table 2
Academic case study: DDR errors (VE_i and GE)

Variables    Without actual delays   With delays   With adjusted delays
                                                   AD1      AD2      AD3
Q1           0.172                   0.249         0.164    0.165    0.157
Q2           0.169                   0.198         0.160    0.158    0.146
Q3           0.210                   0.137         0.153    0.140    0.145
Q4           0.189                   0.147         0.127    0.139    0.153
Q5           0.185                   0.142         0.130    0.126    0.130
Q6           0.124                   0.172         0.115    0.127    0.129
Q7           0.133                   0.127         0.130    0.132    0.135
Q8           0.137                   0.129         0.129    0.135    0.134
W1           3.268                   8.395         3.722    3.780    3.562
W2           3.425                   3.451         3.907    3.882    3.837
W3           3.053                   3.389         3.219    2.975    3.406
W4           3.267                   3.628         3.260    3.246    3.416
Total (GE)   14.332                  20.164        15.216   15.007   15.352

Fig. 12. Academic case study: variance explained in the eigenvectors of the correlation matrix.

In order to quantitatively compare the DDR performance before and after the delay adjustment, a quadratic error function for each variable is proposed:

$$VE_i = \sum_{t_k=1}^{m} (x_{i,t_k} - \hat{x}_{i,t_k})^2 \qquad (20)$$

$VE_i$ represents the estimation error of variable $i$ along time ($m$ is the number of considered sample times), $x_{i,t_k}$ is the actual true value of variable $i$ at sampling time $t_k$ and $\hat{x}_{i,t_k}$ represents the value of variable $i$ estimated by DDR at sampling time $t_k$. The global error (GE) can be evaluated considering all the studied variables ($n$):

$$GE = \sum_{i=1}^{n} VE_i \qquad (21)$$

Two additional optimization experiments were carried out to demonstrate the robustness of the methodology. The parameters of the optimization algorithm and the corresponding delay vectors achieved are summarized in Table 1. The DDR results obtained without delay adjustment are compared with those obtained from the adjusted ones in Table 2. The results show an important decrease in estimation errors when the delay vectors are used. Furthermore, it should be noted that all three experiments highly increase the DDR performance without appreciable differences between them, even when sub-optimal solutions are generated. This DDR performance improvement can be visually confirmed by comparing Figs. 13–16. They represent the DDR obtained with the best delay vector from Table 2 (AD2), and the DDR obtained without removing the delays. The performance of the reconciliation is considerably enhanced by applying any of the optimized delay vectors.

5.2. Industrial case study

The Tennessee Eastman (TE) benchmark [3] has also been tested with the proposed DDR methodology in order to check the actual scope and robustness of the technique in an industrial problem. The case involves the production of two products from four reactants. There are 41 measured process variables and 12 manipulated variables. The process (Fig. 17) is composed of five main unit operations: an exothermic two-phase reactor, a product condenser, a flash separator, a reboiled stripper and a recycle compressor. Such a process gathers the required non-linearities and recycles to definitively check the wide robustness of the proposed DDR methodology. The case study was simulated in MATLAB, using the original Fortran code given by Ref. [3] imported in MATLAB 6.5. The implemented control strategy is the one proposed by Ref. [14]. Besides the highly coupled variables and the non-linearities of the problem, different variables were deliberately delayed to produce a more challenging case study. The EKF was applied to deal better with such a non-linear complex problem. The state-space model required to apply the methodology is based on the TE model proposed by Ref. [15]. It is a dynamic model which covers the material balance of the process, consisting of 26 states and 15 parameters. It satisfactorily demonstrates the good performance of the model by giving accurate estimation

Table 1
Academic case study: optimization parameters and obtained DV

Experiments          Optimization parameters             Delay vectors (sample time units)
                     Nind   NG    Pc    Pm       dmax    [Q1, Q2, Q3, Q4, Q5, Q6, Q7, Q8, W1, W2, W3, W4]
Adjustment 1 (AD1)   1000   100   0.7   0.0007   300     [270, 90, 70, 80, 70, 60, 0, 220, 270, 70, 70, 70]
Adjustment 2 (AD2)   250    100   0.7   0.0028   300     [270, 100, 80, 70, 70, 60, 0, 210, 270, 70, 60, 80]
Adjustment 3 (AD3)   1000   10    0.7   0.0007   300     [270, 90, 80, 50, 60, 100, 0, 270, 270, 90, 110, 90]


Fig. 13. Academic case study: reconciled Q1 values (a) with delays and (b) with adjusted delays.

Fig. 14. Academic case study: reconciled Q2 values (a) with delays and (b) with adjusted delays.

of the 26 considered state variables and the 23 involved output variables. For further details of the applied dynamic model, interested readers should consult Refs. [15] and [16]. In order to apply the EKF, the model was linearized as described in Ref. [16]. Data used in this study were collected from 35 h of simulated operation of the process, sampling every 2 min. The 23 output variables considered correspond to different XMEAS given in Ref. [3]. Table 3 shows the relationship between the 23 model output variables and the original process variables, and also lists the delay forced on each of the considered variables. All the forced delays are applied to composition measurements and equal 36 min each.

Because the first principles that rule the TE process are unknown, no previous knowledge is available to determine the actual values of the process variables for comparison with the model estimations. To perform a comparison similar to that of the studied academic example, wavelet filtering was applied to the process signals to extract the essential features of the process measurements, which could then be considered as an estimation of the true values of the variables. Visual inspection of the wavelet-filtered signals was maintained to ensure that this approximation was valid.

Table 3
TE benchmark: elements of the output vector and associated delays

V. number   Description               Downs–Vogel index   Delay (min)
1           Reactor pressure           7                   0
2           Reactor liquid level       8                   0
3           Separator pressure        13                   0
4           Separator liquid level    12                   0
5           Stripper bottoms level    15                   0
6           Stripper pressure         16                   0
7           Reactor feed flow rate     6                   0
8           A in reactor feed         23                  36
9           B in reactor feed         24                  36
10          C in reactor feed         25                  36
11          D in reactor feed         26                  36
12          E in reactor feed         27                  36
13          F in reactor feed         28                  36
14          A in purge                29                  36
15          B in purge                30                  36
16          C in purge                31                  36
17          D in purge                32                  36
18          E in purge                33                  36
19          F in purge                34                  36
20          G in purge                35                  36
21          H in purge                36                  36
22          G in product              40                  36
23          H in product              41                  36


Fig. 15. Academic case study: reconciled Q6 values (a) with delays and (b) with adjusted delays.

Fig. 16. Academic case study: reconciled W1 values (a) with delays and (b) with adjusted delays.

Fig. 17. Tennessee Eastman process flowsheet.


The "Daubechies 6" wavelet with a three-level decomposition was applied in a soft-thresholding procedure to denoise the process signals [17]. Examples of the filtering results are given in Fig. 18, where process measurements are represented by dots (observed) and the wavelet-filtered signals by bold lines (true). The results obtained for the variables shown in Fig. 18 under disturbance IDV(1) (following the disturbance description in Ref. [3]) are similar for the remaining variables considered under any normal or abnormal event tested. They show very good denoising capability, which led us to adopt the filtered signals as reliable true values of the process variables.

Fig. 18. TE case: wavelet filtering results under IDV(1). The bold line represents the wavelet-filtered signal and the dots the variable measurements. (a) Reactor liquid level and (b) reactor feed flow rate.

To make the effects of the DDR methodology evident, several disturbances were introduced into the process under the conditions described so far. These disturbances correspond to the first five included in the original TE paper [3]. Their description and correspondence with the original paper are shown in Table 4, next to the results obtained with the tested methodology. In each case the disturbance was introduced in the very early moments of the simulation and maintained until the end of the experiment.

The GA was applied to each of the five cases above with the same parameter settings: 100 individuals, 50 generations, 0.7 crossover probability, 0.0007 mutation probability and a maximum delay of 40 min. Approximately 10 min of CPU time were required to obtain the final solution, which was considered reasonable for an on-line application such as the proposed DDR approach. First, we introduced disturbance IDV(1) from Ref. [3], a sudden step in the A/C feed ratio composition that modified the operating conditions, changing the normal behavior of many variables. Under this disturbance, different DDR results were obtained depending on whether the delay adjustment procedure was active or not. We then tested the four remaining disturbances in the same way. Table 4 summarizes the disturbances addressed and the global estimation error before and after the delay adjustment. The CPU times required to obtain the DV in each case are shown in the last column of the table. The short calculation times allow a new DV to be applied approximately every 10 min in all the cases tested, which makes the DDR flexible and adaptable to the process dynamics. Significant reductions in the GE from DDR were obtained across all the tested cases: the proposed TDI approach provided good and fast time delay identification that clearly improved the DDR. The reduction of the estimation errors can also be checked graphically in Figs. 19 and 20. These figures show variables 14 and 7 from Table 3, respectively, under the IDV(1) disturbance arising in the early moments of the simulation. In both
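The soft-thresholding step used to obtain the "true" reference signals can be sketched in plain numpy. The example below is a simplified stand-in: it uses a one-level Haar transform instead of the "db6" wavelet with three-level decomposition used in the paper (the full variant is available, e.g., in PyWavelets), and the universal threshold sigma*sqrt(2 ln n) is an assumed choice, not taken from the paper.

```python
import numpy as np

def haar_soft_denoise(signal, sigma):
    """Soft-threshold wavelet denoising, sketched with a one-level
    Haar transform standing in for the paper's db6/3-level scheme.
    sigma is the (known or estimated) noise standard deviation."""
    n = len(signal) - len(signal) % 2           # work on an even length
    x = signal[:n].reshape(-1, 2)
    approx = (x[:, 0] + x[:, 1]) / np.sqrt(2)   # Haar approximation coeffs
    detail = (x[:, 0] - x[:, 1]) / np.sqrt(2)   # Haar detail coeffs
    t = sigma * np.sqrt(2 * np.log(n))          # assumed universal threshold
    detail = np.sign(detail) * np.maximum(np.abs(detail) - t, 0.0)  # soft
    out = np.empty(n)
    out[0::2] = (approx + detail) / np.sqrt(2)  # inverse Haar transform
    out[1::2] = (approx - detail) / np.sqrt(2)
    return out

# Usage: denoise a slow trend corrupted with white noise.
rng = np.random.default_rng(1)
t = np.linspace(0, 1, 1024)
clean = np.sin(2 * np.pi * t)
noisy = clean + 0.2 * rng.standard_normal(t.size)
denoised = haar_soft_denoise(noisy, sigma=0.2)
```

Soft thresholding shrinks every detail coefficient toward zero by the threshold, so small (noise-dominated) coefficients vanish while the slow trend carried by the approximation coefficients is preserved.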

Fig. 19. TE case: reconciled variable 14 from Table 3 under IDV(1), (a) with delays and (b) with adjusted delays.


Table 4
DDR results in the TE case under five different disturbances

Disturbance   Description                                GE with delays   GE with adjusted delays   Calc. time (min)
IDV(1)        A/C feed ratio step                        5.2748           5.0106                    10.97
IDV(2)        B composition step                         2.2920           2.2825                    10.91
IDV(3)        D feed temperature step                    1.0731           1.0564                    11.63
IDV(4)        Reactor cooling water inlet temp. step     1.0898           1.0652                    10.89
IDV(5)        Condenser cooling water inlet temp. step   1.0802           1.0555                    10.95
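From the figures reported in Table 4, the relative GE reduction achieved by the delay adjustment can be computed directly:

```python
# Relative reduction in global estimation error (GE) for the five
# disturbances of Table 4, before vs. after delay adjustment.
ge = {
    "IDV(1)": (5.2748, 5.0106),
    "IDV(2)": (2.2920, 2.2825),
    "IDV(3)": (1.0731, 1.0564),
    "IDV(4)": (1.0898, 1.0652),
    "IDV(5)": (1.0802, 1.0555),
}
for name, (before, after) in ge.items():
    print(f"{name}: {100 * (before - after) / before:.2f}% GE reduction")
```

The largest relative improvement corresponds to IDV(1) (about 5%), the disturbance whose variables are discussed in Figs. 19 and 20.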

figures, process measurements, true values and reconciled variables are represented together. Each sample time shown corresponds to 2 min of simulation time. On the one hand, Fig. 19 shows that the model estimates variable 14 better than variable 7, represented in Fig. 20. On the other hand, in both cases a better estimation is reached by adjusting the existing delays with the proposed TDI.

Fig. 20. TE case: reconciled variable 7 from Table 3 under IDV(1), (a) with delays and (b) with adjusted delays.

6. Conclusions

In most dynamic processes, various sources of time delay between causes and effects appear in the system. In these situations, the performance of DDR can be greatly improved by integrating it with TDI, in both on-line and off-line settings. One important advantage of the presented TDI method is that no a priori process knowledge is needed to define the number and location of the time delays, or the nature of the process variables, i.e. whether they are input or output variables. Moreover, the efficiency of the proposed TDI procedure makes the approach suitable when delay adjustment is performed regularly to maintain on-line plant performance. First, a complex dynamic case study was successfully addressed, highlighting the performance of the proposed TDI approach. The procedure was then tested on an industrial case study (the Tennessee Eastman case), demonstrating promising robustness and applicability to highly non-linear processes with coupled variables. Quantitative and qualitative evidence was included in the paper to show the significant improvement obtained through the proposed GA-based optimization method for TDI. Although this approach shows good results in the case of PRD, it is especially suited to SRD since it deals with the measurement model. An extension of this approach to time-varying delays has also been sketched.

Acknowledgements

Financial support received from the Spanish Ministerio de Educación y Ciencia through the FPI program is fully appreciated. Furthermore, support from the European Community (contracts No. G1RD-CT-2001-00466, MRTN-CT-2004-512233 and RFC-CR-04006) is gratefully acknowledged.

Appendix A. Nomenclature

dmax     maximum delay
di       delay element
DV       delay vector
f        set of differential equation constraints
g        set of algebraic equality constraints
GE       global estimation error
h        set of inequality constraints
m        number of samples
max SV   maximum singular value
n        number of process variables
Nind     number of individuals in each generation
NG       number of generations
Pc       crossover probability
Pm       mutation probability
R        correlation matrix
TE       Tennessee Eastman process
û        estimated unmeasured variables vector
VE       variable estimation error
x        state variables vector
x̂        estimated state variables vector
y        measurement variables vector
ŷ        estimated measurement variables vector
y*       unknown true process variable values
Y        process data matrix
Ȳ        normalized process data matrix
YDA      delay-adjusted data matrix

Greek letters
γ        function for predicting the current state in the EKF
δ̂        estimated unmeasured variables vector
ε        random measurement errors
θ̂        vector of adjusted model parameters
λ        function for predicting the current measurements in the EKF
σ        variable standard deviation
Σ        variance matrix
φ        fitness function

Subscripts
i        process variables
tk       sample time

References

[1] S. Narasimhan, C. Jordache, Data Reconciliation and Gross Error Detection, Gulf Publishing Co., Houston, TX, 2000.


[2] A. Wachs, D.R. Lewin, Improved PCA methods for process disturbance and failure identification, AIChE J. 45 (1999) 1688–1700.
[3] J. Downs, E. Vogel, A plant-wide industrial process control problem, Comput. Chem. Eng. 17 (3) (1993) 245–255.
[4] M.J. Liebman, T.F. Edgar, L. Lasdon, Efficient data reconciliation and estimation for dynamic processes using nonlinear programming techniques, Comput. Chem. Eng. 16 (10) (1992) 963–986.
[5] J.S. Albuquerque, L.T. Biegler, Data reconciliation and gross-error detection for dynamic systems, AIChE J. 42 (10) (1996) 2841.
[6] F.I. Lewis, Optimal Estimation with an Introduction to Stochastic Control Theory, John Wiley and Sons Inc., 1986.
[7] J.H. Lee, N.L. Ricker, Extended Kalman filter based nonlinear model predictive control, Ind. Eng. Chem. Res. 33 (1994) 1530–1541.
[8] E.G. Shopova, N.G. Vaklieva-Bancheva, A genetic algorithm for engineering problems solution, Comput. Chem. Eng. 30 (8) (2006) 1293–1309.
[9] J. Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press, 1975.
[10] D. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, 1989.
[11] C. Karr, Fuzzy control of pH using genetic algorithms, IEEE Trans. Fuzzy Syst. (1993) 146–153.
[12] Genetic Algorithm Toolbox [on-line], URL: http://www.shef.ac.uk/uni/projects/gaipp/ga-toolbox/, consulted August 1, 2003.
[13] M. Darouach, M. Zasadzinski, Data reconciliation in generalized linear dynamic systems, AIChE J. 37 (1991) 193.
[14] T.J. McAvoy, N. Ye, Base control for the Tennessee Eastman problem, Comput. Chem. Eng. 18 (1994) 383.
[15] N.L. Ricker, J.H. Lee, Nonlinear modeling and state estimation for the Tennessee Eastman challenge process, Comput. Chem. Eng. 19 (9) (1995) 983–1005.
[16] N.L. Ricker, J.H. Lee, Nonlinear predictive control of the Tennessee Eastman challenge process, Comput. Chem. Eng. 19 (9) (1995) 961–981.
[17] R.V. Tona, C. Benqlilou, A. Espuña, L. Puigjaner, Dynamic data reconciliation based on wavelet trend analysis, Ind. Eng. Chem. Res. 44 (2005) 4323.