The influence of measurement noise on PLS-based batch-end quality prediction*

The influence of measurement noise on PLS-based batch-end quality prediction*

Proceedings of the 18th World Congress The International Federation of Automatic Control Milano (Italy) August 28 - September 2, 2011 The influence o...

401KB Sizes 0 Downloads 19 Views

Proceedings of the 18th World Congress The International Federation of Automatic Control Milano (Italy) August 28 - September 2, 2011

The influence of measurement noise on PLS-based batch-end quality prediction ? J. Vanlaer P. Van den Kerkhof G. Gins J.F.M. Van Impe BioTeC, Department of Chemical Engineering, Katholieke Universiteit Leuven, W. de Croylaan 46, B-3001 Leuven, Belgium (e-mail: {jef.vanlaer, pieter.vandenkerkhof, geert.gins, jan.vanimpe}@cit.kuleuven.be). Abstract: The development of automated process monitoring systems to assist human operators in their decisions is an important challenge for today’s chemical and biochemical companies. Especially for batch processes, close monitoring is required to achieve a satisfactory product quality at the end of the batch operation. Techniques based on Partial Least Squares (PLS) were developed to obtain online predictions of the batch-end quality (e.g., product purity or concentration). However, a lot of (bio)chemical companies are still reluctant to implement these monitoring techniques since, among other things, not much is known about the influence of measurement noise on the prediction performance. In this paper, the influence of measurement noise on (i ) input selection, (ii ) model order selection, and (iii ) on- and offline prediction performance for PLS-based prediction of batch-end quality is investigated. A (simulated) process for penicillin production is selected as a case study. The noise level influences selected model inputs since more variables become uninformative when more noise is present in the data. The model order is influenced as well as more important underlying phenomena are masked by the noise. As expected, higher noise levels result in lower offline prediction performance. However, online predictions can improve when more noise is present in the data, due to the selection of different inputs. Even at very large noise levels, accurate and stable predictions of the final penicillin concentration are obtained. Keywords: Batch control; data processing; fermentation processes; multivariate quality control; noise levels. 1. INTRODUCTION Today’s chemical and biochemical production processes and plants are equipped with numerous sensors that record various flows, temperatures, pressures, pH, concentrations, . . . Despite the frequent use of automated low-level control actions (e.g., PID-control for opening and closing valves), responding to abnormal events –one of the most important control tasks– most often remains a manual operation. Human operators investigate the information arising from sensors in the process and compare this information to measurements from previous process runs to detect a departure from normal operation. However, the size and complexity of modern process plants (e.g., the very high number of sensors) largely complicates this task. As a result, incorrect operator decisions often occur. ? Work supported in part by Project PVF/10/002 (OPTEC Optimization in Engineering Center) of the Research Council of the Katholieke Universiteit Leuven, Project KP/09/005 (SCORES4CHEM) of the Industrial Research Council of the Katholieke Universiteit Leuven, and the Belgian Program on Interuniversity Poles of Attraction initiated by the Belgian Federal Science Policy Office. J. Vanlaer en P. Van den Kerkhof have a Ph.D grant of the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT-Vlaanderen). J. Van Impe holds the chair Safety Engineering sponsored by the Belgian chemistry and life sciences federation essenscia. The scientific responsibility is assumed by its authors.

978-3-902661-93-7/11/$20.00 © 2011 IFAC

Therefore, the development of automated systems for fault detection and identification, assisting human operators in their decisions, is an important challenge (Venkatasubramanian et al., 2003). This especially holds true for (bio)chemical batch processes. Because batch processes are commonly used for the manufacture of products with a high added value (e.g., medicines, enzymes, high-performance polymers), the loss of a batch due to process faults is very costly. The dynamic nature of batch processes leads to low controllability. Hence, close monitoring of these processes is required to achieve a satisfactory product quality at the end of the batch operation. Batch runs that deviate from normal process behavior should be detected as soon as possible so that corrective actions can be taken. Most research effort in this area has been directed towards fault detection using techniques based on Principal Component Analysis (PCA; Eriksson et al., 2002; Nomikos and MacGregor, 1994; Simoglou et al., 2005). These techniques detect deviations from nominal process behavior, but are unable to provide estimates of the final batch quality (e.g., product purity or concentration). Techniques based on Partial Least Squares (PLS; Geladi and Kowalski, 1986) take quality measurements into account and are not only used to detect process faults, but also to obtain batch-end

7156

10.3182/20110828-6-IT-1002.02775

18th IFAC World Congress (IFAC'11) Milano (Italy) August 28 - September 2, 2011

quality predictions (Garc´ıa-Munoz et al., 2004; Nomikos ¨ and MacGregor, 1995b; Undey et al., 2003). In this work, the influence of measurement noise on online batch-end quality prediction is investigated based on an extensive dataset from a simulator for industrial penicillin production (Birol et al., 2002). While PLS has been developed to deal with large datasets with correlated measurements and to filter noise from these measurements, it will never be able to remove all noise present in the data. Hence, the noise will negatively influence the predictive performance of the models. Furthermore, the presence of measurement noise in the data will influence the selection of optimal input variables for the model and the model order. However, the importance of these effects has never been studied, which is probably one of the reasons why a lot of industrial companies are still reluctant to implement these techniques. The structure of this paper is as follows. In Section 2, Multivariate Partial Least Squares is briefly explained. Next, Section 3 describes the online implementation of this technique for batch-end quality estimation. Section 4 provides an explanation of the techniques for model order and input variable selection, after which the case study is presented in Section 5. Section 6 presents the results, and final conclusions are drawn in Section 7. 2. MULTIVARIATE PARTIAL LEAST SQUARES MODELLING

2.2 Multiway Partial Least Squares (MPLS) After data matrix unfolding, a regression is made between the unfolded (input) data matrix X and L quality measurements for each batch contained in the (output) data matrix Y (I × L) using standard two-dimensional Partial Least Squares (PLS; Geladi and Kowalski, 1986).  X = TPT (+EX ) (1) Y = TQT (+EY ) The PLS model projects the input and output matrices onto a lower-dimensional space, each dimension of which is defined by one of the R components or latent variables. These latent variables are chosen in such a way that the input variables in the reduced space contain as much information (correlation) about the original input and output measurements as possible. The projections of X and Y are defined by the loading matrices P (JK × R) and Q (L × R) respectively. The scores matrix T (I × R) represents the data matrices in the reduced space. The matrices E contain the modelling errors or residuals. The matrix P is not invertible, nor are its columns orthonormal. Hence, a JK × R weight matrix W with orthonormal columns is introduced to compute the scores T and quality prediction Y for a given measurement set X. PT W is invertible, so that the projection of the inputs X on the scores space T, with corresponding regression matrix B (JK × R), is calculated as follows: T = XB

To predict the end quality of a batch process, a Multiway Partial Least Squares (MPLS; Nomikos and MacGregor, 1995b) model is used. The modelling consists of two steps. Due to the threedimensional structure of the data matrix of a batch process, this matrix has to be unfolded to a two-dimensional matrix in a first step (Section 2.1). In the second step, a general Partial Least Squares (PLS; Geladi and Kowalski, 1986) model is constructed based on this two-dimensional data matrix, as explained in Section 2.2. 2.1 Data matrix unfolding For I batches, measurements of J different variables are available over K time points. This results in a threedimensional data matrix X of size I × J × K. To deal with this specific three-dimensional structure, the dimensionality of the matrix X is reduced by means of batch-wise data matrix unfolding (Nomikos and MacGregor, 1994, 1995a). The procedure is illustrated in Figure 1. By dividing the matrix X in K slices of size I × J and placing these slices side by side, an unfolded data matrix X of size I × JK is obtained. The technique preserves the batch direction: every row of the unfolded matrix corresponds with one complete batch. Other techniques for data matrix unfolding exist (e.g., variable-wise unfolding (Wold et al., 1987)). However, since final product quality is related to the complete batch history, batch-wise unfolding is used for batch-end quality prediction.

4

(2) T

−1

B=W P W . (3) The relation between the quality variables Y and the measurements X then becomes Y = TQT = XBQT . (4) 3. ONLINE BATCH-END QUALITY PREDICTION While running a new batch online, only the measurements up until the current time k are known. Missing data techniques are used to compensate for the unknown future measurements. Garc´ıa-Munoz et al. (2004) investigated different techniques, of which Trimmed Scores Regression (TSR; Arteaga and Ferrer, 2002) showed the best performance. The advantage of TSR is that it only requires a single PLS model to predict the final batch quality at every sample instance throughout the batch instead of K different models. Previous research by the authors has shown that it shows similar performance to the training of a new PLS model for every time at which a prediction is needed (Gins et al., 2009). K I

X J

I

X JK

Fig. 1. Illustration of batch-wise data matrix unfolding.

7157

18th IFAC World Congress (IFAC'11) Milano (Italy) August 28 - September 2, 2011

TSR uses a regression model to estimate the final scores Tnew,k of a new batch based on the trimmed scores T∗new,k . These trimmed scores are calculated by multiplying the known part of the data matrix Xnew (the first k columns of this data matrix, referred to as Xnew,k ) with a matrix Bk , consisting of the first k rows of the PLS regression matrix B. T∗new,k = Xnew,k Bk (5) The time-varying regression matrix Ak that links the estimation of the final scores to these trimmed scores is calculated by means of a least-squares regression on the training data, for which both the complete scores Ttrain and the trimmed scores T∗train,k are known. Ttrain = T∗train,k Ak (+ET ) ⇓ −1 ∗T ∗T ∗ Ak = Ttrain,k Ttrain,k Ttrain,k Ttrain

(6)

Using this regression matrix the final scores of a new batch can be estimated from the trimmed scores of this new batch. ˆ new,k = T∗ T (7) new,k Ak Substituting Equation (6) into Equation (7) and exploiting the PLS relations from Equations (2) and (4), the online estimation of the final batch quality is obtained. −1 TSR Ynew,k = Xnew,k Bk BTk XTtr,k Xtr,k Bk · ... (8) . . . · BTk XTtr,k Xtr BQT 4. MODEL ORDER AND INPUT VARIABLE SELECTION To obtain good quality predictions, it is important to select the correct model order (number of principal components) for the PLS model. The procedure for model order selection is explained in Section 4.1. Furthermore, not all available measurements are necessarily correlated with the final quality of the process. Hence, a selection of the most relevant model inputs, as explained in Section 4.2, will improve the batch-end quality prediction. 4.1 Model order selection The optimal model order R (number of principal components) of a PLS model is selected by means of a leave-oneout cross-validation procedure. Every batch in the training dataset is left out once and PLS models of different model orders are trained based on the other available batches. Next, the models are validated on the left out batch and the mean Sum of Squared Errors (SSE ) over all training batches is calculated for every model order. Instead of taking the number of latent variables corresponding to the observed minimum in the SSE -curve, an adjusted Wold’s criterion with a threshold of 0.9 is used, as proposed by Li et al. (2002). Thus, the number of latent variables is determined as the smallest model order R for which the following equation holds. SSE(R + 1) > 0.9 (9) SSE(R) SSE(R) is the (crossvalidation) SSE of the MPLS model with model order R. According to Wold’s criterion, the

(R + 1)th component is only added if it significantly decreases the crossvalidation error (i.e., improves the prediction). 4.2 Input variable selection PLS models are capable of dealing with noisy data, but the elimination of useless measurements that are not correlated with the final batch quality improves the model prediction. Under the assumption that the optimal set of j input variables also contains the optimal set of j − 1 variables, the optimal input set can be selected using a bottom-up branch-and-bound procedure. Given J available measurement variables, J single input models are trained in a first step, each using one of the available variables as input. The input of the model that yields the lowest leave-one-out cross-validation SSE is selected as the most important input variable. In a second step, the selected variable is combined with all remaining measurement variables resulting in J − 1 combinations of two inputs, which are used for the training of new PLS models. Again, the input combination that yields the lowest cross-validation SSE is selected. This optimal set of two inputs is then combined with the remaining variables and the procedure continues until all available measurements are ranked from most to least important. Finally, the cross-validation SSE of the best models for each number of inputs is compared. Initially, the SSE decreases when adding extra input variables. However, at a certain number of inputs, the SSE curve reaches a minimum after which it starts climbing. The number of input variables that corresponds to this minimum SSE value, is selected as the optimal number of inputs. 5. CASE STUDY As a case study, an industrial biochemical process for penicillin fermentation is simulated via an extended version of the Pensim simulator (Birol et al., 2002). Initially, the bioreactor is operated in batch mode. Once the substrate concentration is lower than 0.3 g/L, the fed-batch phase is started, during which additional substrate is fed into the reactor at a constant feed rate. The fermentation is terminated after the addition of 25 L of substrate. The quality variable of interest is the penicillin concentration at the end of the batch. To investigate the influence of measurement noise on the prediction of the final penicillin concentration, a total of 200 batches is simulated with varying initial substrate and biomass concentration, and culture volume. During the fermentation, 15 concentrations and flows, and the bioreactor temperature and pH are available from the simulator. In practice, only 11 of these measurements are generally acquired by online sensors and thus available for online prediction of the final penicillin concentration. Gaussian noise of 20 different levels is added to the measurements of these variables after simulation (to eliminate the influence of noise on PID controller outputs and isolate the effect on the PLS model performance). The different levels will be denoted with respect to a reference noise level described in Table 1. The measured signals are aligned and resampled to a length of 602 samples via indicator variables,

7158

18th IFAC World Congress (IFAC'11) Milano (Italy) August 28 - September 2, 2011

Table 1. Overview of available online measurements with their mean nominal values and the standard deviation of the reference noise level σnoise,ref for these measurements. Variable Time [h] DO [mmol/L] Volume [L] pH [-] Reactor temp. [K] Feed rate [L/h]

Mean 1.1 107.5 5.0 298.0 0.05

σnoise,ref 0 1.333e−2 6.667e−1 3.333e−2 3.333e−1 1.667e−3

Variable Aeration rate [L/h] Agitator power [W] Feed temperature [K] Water flow rate [L/h] Base flow rate [L/h] Acid flow rate [L/h]

Mean 8.0 30.0 296.0 64.2 2.5e−5 7.9e−6

σnoise,ref 1.667e−1 3.333e−1 3.333e−1 1.667 6.667e−6 6.667e−7

comparable to the procedure of Birol et al. (2002). For the alignment of the batch phase, a straight line is fitted through the measurements of the bioreactor volume to make it monotonous. Since the time signal is added as an extra (aligned) variable, 12 online measurement signals are available for each batch. Hence, the training data matrix X is of size 200 × 12 × 602.

over the 200 training batches. In an attempt to filter out the noise 6 (correlated) inputs are again selected.

To improve the final quality prediction, the optimal model inputs and the model order are selected as explained in Section 4. Model predictions, both after conclusion of the batch operation (offline) and online, are compared based on the leave-one-out cross-validation Root Mean Squared Error (RMSE ). The prediction performance of models that use all 12 available measurements as inputs is also compared for different noise levels to assess the influence of measurement noise on quality measurements for models with the same input variables. All calculations are performed thrice with different noise values sampled from the respective Gaussian distributions.

6.2 Model order selection

At higher noise levels, DO measurements become uninformative due to the noise. The reactor volume is then selected as the most important variable. The number of inputs varies between 2 and 5.

The model order for the different noise levels, selected via leave-one-out cross-validation as explained in Section 4.1, shows a decreasing trend, ranging from 9 for the noiseless case to 1-2 for a noise level of 6 times the reference level. In an ideal case, the model order is a measure for the number of independent underlying phenomena that determine the course of a batch. With more noise present in the measurements, some of these phenomena are masked, which results in the selection of lower model orders. 6.3 Prediction performance

6. RESULTS AND DISCUSSION In the next sections, the results of the study are presented and discussed. Section 6.1 gives an overview of the selected input variables for the different noise levels, Section 6.2 discusses the influence of noise on the model order selection and finally, Section 6.3 assesses the noise influence on the offline and online prediction performance of the models. 6.1 Input variable selection After noise addition and alignment of the data, the optimal set of input variables for every noise level is selected according to the procedure in Section 4.2.

With respect to the prediction performance, the influence of measurement noise on both offline and online quality prediction is investigated. Offline quality prediction Offline quality prediction refers to the estimation of the batch-end quality (in this case the final penicillin concentration) after completion of the batch operation. In this case, no compensation for missing variables is needed since the complete data matrix X is known. The full curve in Figure 2 shows the average offline prediction RMSE as a function of the noise level for MPLS models with optimal inputs. As expected, the RMSE increases (so the prediction performance decreases) with increasing noise level. However, the increase

In the noiseless case, 6 input variables are selected: Dissolved Oxygen (DO) concentration, feed rate, time, pH, reactor temperature, and water flow rate. When low amounts of measurement noise (i.e., smaller than 1/8th of the reference noise level) are present in the data, a lower number of inputs (mostly 3) is selected. Several variables in the input set of the noiseless case (e.g., pH and temperature) are controlled and vary only slightly. While these slight variations are still informative for prediction of the final batch quality when no noise is present in the data, even low amounts of noise render these variables uninformative. Up to a noise level equal to 1/8th of the reference level, DO, feed rate, and time remain the most important input variables. At this point, the size of the noise on the DO approaches the normal variation of the DO measurements

Fig. 2. Leave-one-out cross-validation RMSE for offline prediction of the final penicillin concentration in function of the noise level: with selection of model inputs (—) and with all available inputs (- -).

7159

18th IFAC World Congress (IFAC'11) Milano (Italy) August 28 - September 2, 2011

ation quickly drops below 1% and the prediction evolves towards the correct value at the end of the batch. For noise of 6 times the reference level, the decrease of the relative deviation is slower. A stable prediction that deviates less than 1% from the real final penicillin concentration is obtained in fewer than 200 samples.

(a)

(b) Fig. 3. Offline prediction of the final penicillin concentration versus real penicillin concentration for (a) the noiseless case and (b) a noise level of 6 times the reference level. is most obvious at low noise levels. At higher noise levels, the increase is less pronounced and the RMSE saturates. In Figure 3(a), the offline leave-one-out cross-validation prediction of the final penicillin concentration is plotted against the real value of the penicillin concentration for the noiseless case. Without measurement noise, a nearly perfect prediction is obtained (RMSE equals 0.001). However, even for noise levels much higher than those encountered in industry (e.g., 6 times the reference level), very good offline quality predictions are obtained, as evidenced by Figure 3(b). Hence, the PLS models succeed in removing the noise from the data in this case study.

Table 2(a) gives an overview of the maximal relative prediction deviation for different noise levels and sample times. At first sight, the results are somehow unexpected: adding noise sometimes improves the online prediction. This is especially visible during the batch phase, which corresponds to the first 101 samples of the process: better predictions are obtained with noise of the reference level than at a noise level which is 16 times smaller. This counterintuitive result is caused by the input selection, which aims at an optimal offline prediction of the batchend quality. The selected inputs do not guarantee optimal online predictions, especially not when the process consists of different phases. As discussed in Section 6.1, different model inputs are selected for low and high noise levels. The selected inputs for the low noise levels are not informative enough to obtain good online predictions during the batch phase. This is evidenced by the results in Table 2(b), which shows the prediction deviation in function of time and noise level for models that use all available online measurements as inputs. These models result in better online predictions during the batch phase for low noise levels. Consequently, it is better to use all available model inputs to obtain good online predictions from the start of the process. Another option is employing different models (with different inputs) for different process phases. It is clear from these results that higher noise levels result in lower prediction performance. The predictions improve

Figure 2 also shows the prediction RMSE at different noise levels for models employing all 12 available online measurements as input variables (dashed curve). The curve follows the same trend as the RMSE for models with selected inputs. Selecting the optimal input variables leads to lower values of the offline prediction RMSE and hence better offline estimations of the batch quality. Online quality prediction Online predictions of the batch-end penicillin concentration are obtained employing Trimmed Scores Regression (TSR) to compensate for missing future measurements, as explained in Section 3. Figure 4 shows the evolution in time of the maximal relative deviation of the online prediction from the real final penicillin concentration both for the noiseless case and for noise of 6 times the reference level, much higher than in industrial practice. At the start of the batch, the prediction for the noiseless case deviates considerably from the real final value since very few measurements are available. However, the devi-

(a)

(b) Fig. 4. Maximal relative deviation of the online prediction from the real final penicillin concentration in function of time for (a) the noiseless case and (b) noise of 6 times the reference level.

7160

18th IFAC World Congress (IFAC'11) Milano (Italy) August 28 - September 2, 2011

the selection of different model inputs. Inputs selected based on optimization of the offline prediction performance are not always informative during all process phases. The use of all available input variables improves the online prediction in the first part of the process for this case study. When all online available input variables are used, higher noise levels result in lower prediction performance, but accurate and stable online predictions are obtained for all noise levels.

Table 2. Influence of noise on the maximal relative deviation of the online prediction from the real final penicillin concentration for models with (a) selected inputs and (b) all available inputs. Time 1 50 100 200 300 400 500 602

No noise 12.4% 1.3% 0.9% 0.7% 0.6% 0.6% 0.5% 0.1%

Level 1/16 12.5% 10.8% 7.9% 1.0% 0.7% 0.6% 0.5% 0.3%

Level 1 1.7% 1.0% 1.0% 0.9% 0.8% 0.8% 0.7% 0.7%

Level 3 4.3% 1.1% 1.0% 1.0% 0.9% 0.9% 0.9% 0.9%

Level 6 7.1% 1.4% 1.2% 1.0% 1.0% 0.9% 0.9% 0.9%

Level 1 3.1% 1.1% 1.2% 1.2% 1.1% 1.1% 1.1% 1.0%

Level 3 5.0% 1.3% 1.3% 1.6% 1.5% 1.4% 1.3% 1.2%

Level 6 7.5% 1.6% 1.4% 1.3% 1.3% 1.2% 1.2% 1.2%

REFERENCES

(a) Time 1 50 100 200 300 400 500 602

No noise 0.9% 0.9% 1.1% 0.8% 0.7% 0.6% 0.5% 0.3%

Level 1/16 2.5% 1.1% 1.1% 0.9% 0.8% 0.7% 0.6% 0.5% (b)

with time. For all noise levels, good and stable predictions are obtained in fewer than 200 samples. 7. CONCLUSIONS In this paper, the influence of measurement noise on PLS-based batch-end quality prediction is investigated. A simulated fed-batch process for penicillin production is selected as a case study. Gaussian noise of different levels is added to the process measurements. The effect of noise on (i ) input selection, (ii ) model order selection and (iii ) on- and offline prediction performance is studied. As the noise level increases, the information content of the measurements decreases. Measurements of controlled variables vary only slightly. They are informative in the noiseless case, but are soon rendered uninformative. When the noise size approaches the normal variation of informative measurements and PLS is no longer able to filter out the noise by selecting more (correlated) inputs, other optimal input variables are selected. The model order decreases with the noise level since higher noise values mask more and more important underlying phenomena. As expected, the offline prediction performance of the PLS models decreases with increasing noise level until the prediction error stagnates at the higher investigated noise levels. Even for noise levels much higher than those encountered in industry, very good offline quality predictions are obtained. This proves the ability of PLS models to filter the noise from the data. Online prediction using Trimmed Scores Regression (TSR) leads to stable estimations of the final penicillin concentration. Counterintuitively, the online prediction performance does not necessarily decrease with increasing noise level when optimal model inputs are selected. The addition of more noise sometimes leads to better predictions due to

Arteaga, F. and Ferrer, A. (2002). Dealing with missing data in MSPC: several methods, different interpretations, some examples. Journal of Chemometrics, 16, 408–418. ¨ Birol, G., Undey, C., and C ¸ inar, A. (2002). A modular simulation package for fed-batch fermentation: penicillin production. Computers and Chemical Engineering, 26, 1553–1565. Eriksson, L., Johansson, E., Kettaneh, N., and Wold, S. (2002). Multi- and Megavariate Data Analysis: Principles and Applications. Umetrics Academy. Garc´ıa-Munoz, S., Kourti, T., and MacGregor, J. (2004). Model predictive monitoring for batch processes. Industrial and Engineering Chemistry Research, 43, 5929– 5941. Geladi, P. and Kowalski, B. (1986). Partial least-squares regression: a tutorial. Analytica Chimica Acta, 185, 1– 17. Gins, G., Vanlaer, J., and Van Impe, J. (2009). Online batch-end quality estimation: does laziness pay off? In J. Quevedo, T. Escobet, and V. Puig (eds.), Proceedings of the 7th IFAC International Symposium on Fault Detection, Supervision and Safety of Technical Processes (SafeProcess2009), 1246–1251. Li, B., Morris, J., and Martin, E. (2002). Model selection for partial least squares regression. Chemometrtics and Intelligent Laboratory Systems, 64, 79–89. Nomikos, P. and MacGregor, J. (1994). Monitoring batch processes using multiway principal component analysis. AIChE Journal, 40(8), 1361–1375. Nomikos, P. and MacGregor, J. (1995a). Multivariate SPC charts for monitoring batch processes. Technometrics, 37(1), 41–59. Nomikos, P. and MacGregor, J. (1995b). Multiway partial least squares in monitoring batch processes. Chemometrics and Intelligent Laboratory Systems, 30, 97–108. Simoglou, A., Georgieva, P., Martin, E., Morris, A., and de Azevedo, S. (2005). On-line monitoring of a sugar crystallization process. Computers and Chemical Engineering, 29(6), 1411–1422. ¨ Undey, C., Ertun¸c, S., and C ¸ inar, A. (2003). Online batch/fed-batch process performance monitoring, quality prediction, and variable-contribution analysis for diagnosis. Industrial and Engineering Chemistry Research, 42, 4645–4658. Venkatasubramanian, V., Rengaswamy, R., Yin, K., and Kavuri, S. (2003). A review of process fault detection and diagnosis. Part I: Quantitative model-based methods. Computers and Chemical Engineering, 27, 293–311. ¨ Wold, S., Geladi, P., Ebensen, K., and Ohman, J. (1987). Multi-way principal components- and PLS-analysis. Journal of Chemometrics, 1(1), 41–56.

7161