Reliability of analytical systems: use of control charts, time series models and recurrent neural networks (RNN)


Chemometrics and Intelligent Laboratory Systems 40 (1998) 1–18

Tutorial

A. Rius *, I. Ruisánchez, M.P. Callao, F.X. Rius

Departament de Química, Universitat Rovira i Virgili, Pl. Imperial Tàrraco 1, 43005 Tarragona, Spain

Abstract

In this tutorial, the techniques used to study the reliability of analytical systems over time are discussed. The most classical approach is to use statistical process control (SPC) with control charts, and its principal characteristics, benefits and limitations are shown. The advanced process control (APC) approach, developed and mainly used in the field of engineering, is also studied and its possibilities for monitoring chemical measurement processes evaluated. The fundamentals and potentialities of recurrent neural networks (RNN) in this field are also presented. The bases of these three approaches are described, and their advantages and drawbacks discussed. They are applied to a simulated time series and to real process analytical data, and the results obtained for these data are compared. © 1998 Elsevier Science B.V. All rights reserved.

Keywords: Statistical process control; Control charts; Advanced process control; Time series models; Recurrent neural networks

Contents

1. Introduction
2. Statistical process control (SPC)
   2.1. Shewhart charts
      2.1.1. Control charts for individual measurements
      2.1.2. Control charts for subgroups
   2.2. Cumulative sum (cusum) charts
   2.3. Exponentially weighted moving average (EWMA) charts
   2.4. Comparison between the different control charts
3. Introduction to advanced process control (APC)
   3.1. The identification of time series models
      3.1.1. Autoregressive models of order p (AR_p)
      3.1.2. Moving average model of order q (MA_q)
      3.1.3. Autoregressive integrated moving average model of orders p, d and q (ARIMA_pdq)
   3.2. Integration of SPC and time series modelling
4. Artificial neural networks (ANN) applied to time series models
   4.1. Backpropagation learning strategy
   4.2. Recurrent neural networks, RNN
   4.3. Counterpropagation learning strategy
5. A case study in automated analytical measurements
6. Discussion
Acknowledgments
References

* Corresponding author. Tel.: +34-77-558122; fax: +34-77-559563; e-mail: [email protected].

0169-7439/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII S0169-7439(97)00085-3

1. Introduction

It is often useful to visualize the behavior of a chemical process over time as a box with several inputs and one or several outputs. There is always a series of unavoidable perturbations, sometimes uncontrolled, that cause the response of the system to have an inherent variability. These perturbations may be, for example, changes in the environmental conditions, variations in the inputs, etc. The control of a process should, at least, sample the output products, analyze them, and adjust the process as a function of the results of the assays performed and the desired level of performance. This scheme is shown in Fig. 1a. Every analytical process that works continuously (for example in routine analysis, or a method used for a long period of time) can also be viewed in the same way [1]. The simplest analytical system can be considered as a process with inputs (the samples) and an output (the result of the analysis), as shown in Fig. 1b. So, the basic ideas about quality control applied in industry are also useful when applied to the results of the analysis. The analytical process should be validated prior to the application of quality control techniques, whose goal is then to detect deviations over time from the controlled state. Establishing such a controlled state is not as trivial a question as it sometimes appears to be in physical engineering processes. First of all, its traceability should be proved. Reference methodologies, reference materials (the so-called 'certified reference materials' currently available on the market) or reference values given by intercomparison exercises are characterized so that reference values are demonstrably traceable to the measurement units in which they are expressed. Reliability

over time requires references that are traceable and stable over time, which are the only means to control and detect systematic influences on measurement systems. Otherwise, variations in the reference sample can produce false variations in the system or hide real deviations. Another topic related to the reliability of analytical systems is the measurement uncertainty, for a number of reasons. One of them is that the preparation of a valid uncertainty budget for an analytical procedure requires an explicit understanding of the processes affecting the measurement; this evaluation leads to a better knowledge of the system and of the processes affecting the measurement. Such knowledge is a very important

Fig. 1. (a) Scheme of a process control. (b) An analytical system viewed as a continuous process.


point to be considered prior to the application of mathematical or statistical models. Another reason is that the evaluated uncertainty of an analytical process is central to setting the process control limits themselves; yet another is that the measurement uncertainty of the reference (or control) samples is very germane to assessing compliance with those limits and to quantitatively evaluating confidence in that compliance. Once the analytical procedure has been validated, in order to be considered reliable it should be regularly checked to ensure that its quality characteristics (e.g. accuracy, trueness, precision, selectivity, etc.) are maintained over time [2]. So, stability over time is a quality parameter that should be ensured, particularly when the analytical system is designed to work continuously on line. The classical method for checking the reliability of a system is to repeat the analysis of a reference sample at regular intervals and monitor it with statistical process control (SPC) techniques [3,4]. The basic tools are Shewhart charts, cumulative sum (cusum) charts, and exponentially weighted moving average (EWMA) charts. If a deviation in the results is found, the system must be adjusted, if this is possible. In general, the variation in the response of any analytical system can be interpreted as the sum of two components [5]: a random component, which is due to the unavoidable variability of the system, and a non-random component that might be time dependent and that can be caused by instrument use, changes in the environmental conditions, etc. Most present-day analytical systems contain mechanical, electrical and/or optical parts, in which the time-dependent factor can be of great importance. Some examples of the kinds of problems that might arise are: aging of the photometric detector lamp, a decrease in the separation capacity of the chromatographic column, changes in the detector response with use, the influence of points of contamination on subsequent responses in continuous flow analytical systems (FIA, SIA, ...), or sensor deterioration. Other analytical systems can be affected by changes in the environmental conditions, such as temperature, humidity, etc. These changes can cause drifts in the results, i.e. systematic trends in the results as a function of time, and the variation of the result with time for a reference sample can be affected by small time-dependent changes,


as shown in the plot in Fig. 1b. In many systems, the presence of drift causes consecutive measurements to be dependent (autocorrelated data) [6]. When classical SPC techniques are applied to such measurements, false alarms occur, the control limits are often inadequate and/or outliers are not detected [7]. A methodology which is complementary to SPC within the domain of time series analysis, and which is used in classical process control, is so-called advanced (or automatic) process control (APC), which can reveal autocorrelated data, forecast system behavior, or adjust the process [8,9]. APC analyses and identifies time series models using the shape and structure of the simple and partial autocorrelation functions. Once the model has been identified, it is validated by inspecting the prediction residuals, and it can then be used to predict the system behavior. Artificial neural networks were applied to the study of time series problems from the very beginning, but it was not until the development of the backpropagation learning strategy that practical results were obtained. Recurrent neural networks (RNN) have been shown to be useful for time series prediction [10,11]. In the literature, RNN have been successfully applied to model complex time-dependent systems in continuous chemical processes [12], financial data [11], word recognition, etc. RNN can be described as non-linear extensions of traditional linear models [13] and should be able to represent relationships between events in time. As has been stated, these methodologies can be applied in analytical chemistry to the data obtained from the analysis of reference or control samples at regular intervals, and the prediction of system behavior one position ahead is of great importance because many unknown samples are analyzed between two consecutive results for reference samples. It is important to remark that all these techniques are intended to assist the analyst in the interpretation of data and in the knowledge of the system, not to replace the analyst's experience and knowledge of the measurement system. Actions taken on the measurement system should not rely only on the results obtained by the application of these techniques. No statistical or mathematical model can



do away with the need for the analyst to understand the measurement process being evaluated. In this tutorial, the basics of SPC and time series models are described and applied to the monitoring of analytical systems using reference samples. In Section 2, the various control charts used in SPC are briefly described and compared, and their advantages and drawbacks discussed. Section 3 describes the statistical models for time series and discusses the different models, ways of identifying them, and their complementary use with SPC. Section 4 discusses recurrent neural networks (RNN) applied to time-dependent series, including the backpropagation and counterpropagation learning strategies. Section 5 shows how these techniques can be applied to analytical systems, and Section 6 discusses their main characteristics, advantages and drawbacks.

2. Statistical process control (SPC)

All chemical processes, including analytical processes, are subject to a certain variability, and the goal of SPC is to separate the common (random) causes from the special (systematic) causes of variability; control charts are the main tools used to achieve this goal. The idea of control charts is to test the hypothesis that there are only common causes against the alternative that there are special causes of variation. In the former case, the system is said to work under statistical control or to be in a controlled state; in the latter, the system is out of control and needs to be adjusted. Control charts are applied in different ways depending on the data. The data are called variables when the quality characteristic is a measured value, and attributes when the quality characteristic is the number of times that the item conforms or fails to conform to the requirements. In this paper, we consider only the control charts for variables, because this is the usual situation for analytical data. The plots also differ depending on whether the represented values are obtained from an individual measurement or from several replicated measurements; both cases are common in analytical work, so both will be explained. Below, the three most important types of control charts are described and compared.

As a pedagogical example for comparing the different approaches, a series of 75 autocorrelated points with mean 100 was used. This series was generated from a normally distributed white noise series with standard deviation 1 and an order-one autoregressive model.

2.1. Shewhart charts

The first type of control chart was developed during the 1920s by W.A. Shewhart [14]. The simplest control chart consists of plotting the measured data on the vertical axis versus the order (or time) in which those data were obtained on the horizontal axis [15–17]. The central line and the limits for these charts are different depending on whether the observations are obtained from individual values or from subgroups.

2.1.1. Control charts for individual measurements

The most usual charts in this case are the X and R charts. The X chart controls the mean of the process and the R chart controls the variability. As well as the individual points, $X_i$, the X chart plots a central line at $\mu$ and warning and control limits at $\pm 2\sigma$ and $\pm 3\sigma$, where $\mu$ is the mean and $\sigma$ the standard deviation of the population. Often, in practice, $\mu$ and $\sigma$ are not known and must be estimated. Usually $\mu$ is estimated by the mean, $\hat{\mu} = \bar{X}$; several authors suggest using at least 50 observations [16]. There are several approximations for the estimation of $\hat{\sigma}$ [16–18], the most usual one being Eq. (1):

$$\hat{\sigma} = \frac{s}{c_4} \qquad (1)$$

where $s$ is the standard deviation of at least 50 observations, and $c_4$ a value that can be calculated by Eq. (2):

$$c_4 = \sqrt{\frac{2}{n-1}}\,\frac{\Gamma(n/2)}{\Gamma\bigl((n-1)/2\bigr)} \qquad (2)$$

where $n$ is the number of observations and $\Gamma$ the gamma function. The values of $c_4$ calculated in this way approach unity as $n$ increases [16], and the value obtained for $s/c_4$ is an unbiased estimator of $\sigma$. Discussion of several other ways of estimating $\mu$ and $\sigma$ can be found in the bibliography [16–18].
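To make Eqs. (1) and (2) concrete, the following Python/NumPy sketch (ours, not from the original tutorial; the function names are hypothetical) computes $c_4$ and the resulting estimate of $\sigma$. The log-gamma function is used to avoid overflow for large $n$.

```python
import numpy as np
from math import exp, lgamma, sqrt

def c4(n):
    """Bias-correction constant of Eq. (2); lgamma avoids overflow for large n."""
    return sqrt(2.0 / (n - 1)) * exp(lgamma(n / 2.0) - lgamma((n - 1) / 2.0))

def estimate_sigma(x):
    """Estimate of the population standard deviation, Eq. (1): sigma_hat = s / c4."""
    x = np.asarray(x, dtype=float)
    s = x.std(ddof=1)                 # sample standard deviation s
    return s / c4(len(x))

print(round(c4(5), 4), round(c4(50), 4))   # c4 approaches 1 as n increases
```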


The R chart (moving range chart) is used to control the variability, and its central line is plotted at $\overline{MR}$, calculated by Eq. (3):

$$\overline{MR} = \frac{1}{n-1}\sum_{i=1}^{n-1} MR_i \qquad (3)$$

where $MR_i = |x_{i+1} - x_i|$ and $x_i$ is the observation at time $i$. The control limits for this R chart are plotted at $D_3\overline{MR}$ and $D_4\overline{MR}$; the values for $D_3$ and $D_4$ can be found in table form [16]. As an example of this kind of chart, Fig. 2 shows the X and R charts corresponding to the simulated series. In the X chart, all the points are between the control limits and, consequently, no out-of-control points are detected; in the R chart, two points are beyond the control limits, pointing to a possible increase in the data variability at these sampling times. Several criteria, often called run rules, have been suggested for the detection of slight trends in the data. The application of these supplementary run rules [19–21] increases the effectiveness of the Shewhart charts. For example, one of these run rules states that when 8 consecutive points lie on one side of the central line, the system should be suspected of being out of control. When this run rule is applied to the X chart in Fig. 2, seven out-of-control points are detected; these are shown as solid points in the figure.

2.1.2. Control charts for subgroups

When it is possible to obtain several replicate measurements, the most often used control charts are the $\bar{X}$–R and the $\bar{X}$–s charts. In both kinds of chart, the central tendency is controlled in the $\bar{X}$ chart, in which the mean of each subgroup is computed ($\bar{X}_i$) and the central line is estimated by the mean of the means, $\bar{\bar{X}} = (\sum_i \bar{X}_i)/N$. The warning and control limits are plotted at $\pm 2\sigma$ and $\pm 3\sigma$, where $\sigma$ is estimated by $\bar{s}/(c_4\sqrt{n})$, $\bar{s}$ is the mean of the standard deviations of all subgroups ($\bar{s} = (\sum_i s_i)/N$), $c_4$ is the same constant as for individual measurements, $n$ is the number of replicates in each subgroup, and $N$ the number of subgroups. In the $\bar{X}$–R chart, the variability is controlled by the ranges $R$, the differences between the highest and the lowest values in each subgroup at time $i$; in the $\bar{X}$–s chart it is controlled by the standard deviation $s$. Usually the latter is preferred to the former. When the standard deviation is used to control the variability, the central line is plotted at $\bar{s}$ and the control limits at $B_3\bar{s}$ and $B_4\bar{s}$; the values for $B_3$ and $B_4$ have been tabulated for different values of $n$ [16]. The same additional criteria, or run rules, discussed above for the X chart can also be used in the $\bar{X}$ chart.

Fig. 2. X–R chart for the simulated series. Solid points indicate the out-of-control measurements. (UCL and LCL: upper and lower control limits; UWL and LWL: upper and lower warning limits.)
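A minimal sketch (our addition, with hypothetical names) of an X chart for individual measurements combined with the 8-points-on-one-side run rule described above; $\sigma$ is assumed to have been estimated beforehand, e.g. with Eq. (1).

```python
import numpy as np

def shewhart_x_chart(x, sigma, run_length=8):
    """Flag points beyond mu +/- 3 sigma and runs of `run_length`
    consecutive points on the same side of the centre line."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    beyond = np.abs(x - mu) > 3.0 * sigma        # classical out-of-control signal
    side = np.sign(x - mu)
    run = np.zeros(len(x), dtype=bool)
    count = 1
    for i in range(1, len(x)):
        count = count + 1 if side[i] == side[i - 1] and side[i] != 0 else 1
        if count >= run_length:                   # run rule: suspect out of control
            run[i] = True
    return beyond, run
```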

2.2. Cumulative sum (cusum) charts

The cumulative sum control chart was first designed by E.S. Page during the 1950s [15,22] in order to improve the effectiveness of control charts for slight drifts. In a cusum chart, the deviations of the sample measurements, $X_i$, from the mean of the population, estimated by $\bar{X}$, are accumulated as $\mathrm{cusum}_i = \sum_i (X_i - \bar{X})$ and plotted on the ordinate axis. In order to detect the out-of-control points, a V-mask is used, as shown in Fig. 3a for the simulated data. The parameters of the mask are the angle $\theta$ and the distance $d$, calculated using Eqs. (4) and (5) [23]:

$$\tan\theta = \frac{\Delta}{2w} \qquad (4)$$

$$d = \frac{2\sigma^2}{\Delta^2}\,\ln\frac{1-\alpha}{\beta} \qquad (5)$$

where $\Delta$ is the change in the process to be detected, the value of which is usually 2 or 3 times the standard deviation of the data, $w$ is a scaling factor related to the geometry of the control chart and the mask dimensions [15], $\sigma$ is the standard deviation of the observations, estimated by $s/c_4$ as in the Shewhart chart, $\alpha$ is the probability of a false out-of-control signal, and $\beta$ the probability of not detecting a change of magnitude $\Delta$. While the points lie within the V-mask arms, the system is said to be under control. Another way of using the cusum chart is to represent the $S_H$ and $S_L$ parameters [16,23–25], which are calculated for each measurement by Eqs. (6) and (7):

$$S_{H(i)} = \max\left[0,\; S_{H(i-1)} + X_i - \bar{X} - \Delta/2\right] \qquad (6)$$

$$S_{L(i)} = \max\left[0,\; S_{L(i-1)} - X_i + \bar{X} - \Delta/2\right] \qquad (7)$$

with $S_{H(0)} = S_{L(0)} = 0$. The control limits for these charts are $\pm h$, calculated by $h = d\Delta/2$, where $d$ and $\Delta$ are the same as before. The system is said to be out of control if $S_H > h$ or $S_L > h$. This type of cusum chart is easier to apply in computer-based systems because there is no need to use the V-mask [25]. A proposal has been made to enhance the cusum chart [26]: the so-called fast initial response (FIR) cusum consists of starting with $S_{H(0)} = S_{L(0)} \neq 0$. This modification enables faster detection of out-of-control points than the normal cusum. Fig. 3b shows an example of the application of cusum charts to the simulated data. The parameters used to compute the control limits are $\Delta = 2$, a value which is twice the standard deviation of the data, and $\alpha = \beta = 0.05$. It can be seen that 9 points are beyond the control limits (the first one is the 8th) and, accordingly, the system should be revised at this point.

Fig. 3. Cusum chart for the simulated series. (a) The classical V-mask cusum chart. (b) Plot of the cusum chart using the parameters $S_H$ and $S_L$. Solid points indicate the out-of-control measurements.
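The tabular form of the cusum chart, Eqs. (6) and (7), is easy to program. The following sketch (our addition, with hypothetical names) flags an out-of-control signal whenever $S_H$ or $S_L$ exceeds $h$.

```python
import numpy as np

def tabular_cusum(x, delta, h):
    """S_H and S_L of Eqs. (6) and (7); a point is out of control
    when either statistic exceeds the limit h = d * delta / 2."""
    x = np.asarray(x, dtype=float)
    xbar = x.mean()
    sh = sl = 0.0                    # S_H(0) = S_L(0) = 0 (no FIR head start)
    flags = []
    for xi in x:
        sh = max(0.0, sh + xi - xbar - delta / 2.0)   # Eq. (6)
        sl = max(0.0, sl - xi + xbar - delta / 2.0)   # Eq. (7)
        flags.append(sh > h or sl > h)
    return np.array(flags)
```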

2.3. Exponentially weighted moving average (EWMA) charts

S.W. Roberts introduced the exponentially weighted moving average control chart in 1960 [27]. The EWMA statistic, $z_i$, is calculated [28,29] according to Eq. (8):

$$z_i = \lambda X_{i-1} + (1-\lambda)\, z_{i-1} = \lambda X_{i-1} + \lambda(1-\lambda) X_{i-2} + \lambda(1-\lambda)^2 X_{i-3} + \cdots \qquad (8)$$

where $X_i$ is the observed value at time $i$ and $\lambda$ is the EWMA parameter ($0 < \lambda < 1$). The start value $z_0$ is usually taken to be $\bar{X}$. In the EWMA chart the


statistic $z_i$ is plotted versus the sample number. The control limits are calculated according to $z_0 \pm 3[\lambda/(2-\lambda)]^{1/2}\sigma$, where $\sigma$ is estimated by $s$ as in the Shewhart chart [28]. One important point when using EWMA control charts is the determination of the parameter $\lambda$. In practice it is often selected from experience, or it is estimated from the data in the following way: the sum of the squared errors, $\sum (z_i - X_i)^2$, is calculated for different values of $\lambda$, and the value that gives the smallest sum of squares is selected. An example, for the simulated data, of the plot of the sum of squares versus $\lambda$ is shown in Fig. 4a; it can be seen that, in this case, the curve's minimum is at $\lambda = 0.85$. When this value is used, the EWMA control chart is the one shown in Fig. 4b, and it can be seen that no out-of-control points are detected.

Fig. 4. (a) Plot of the sum of squares versus the parameter $\lambda$. (b) EWMA chart for the simulated data using a value of $\lambda = 0.85$.
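A sketch (ours) of the EWMA statistic and its control limits, written in the standard recursive form $z_i = \lambda X_i + (1-\lambda)z_{i-1}$, together with the grid search for $\lambda$ illustrated in Fig. 4a; note that the tutorial writes Eq. (8) in the forecast form, with $X_{i-1}$ in place of $X_i$.

```python
import numpy as np

def ewma_chart(x, lam):
    """EWMA statistic with z0 = mean(x) and 3-sigma control limits."""
    x = np.asarray(x, dtype=float)
    z = np.empty(len(x))
    z_prev = x.mean()                          # start value z0
    for i, xi in enumerate(x):
        z[i] = lam * xi + (1.0 - lam) * z_prev
        z_prev = z[i]
    half_width = 3.0 * np.sqrt(lam / (2.0 - lam)) * x.std(ddof=1)
    return z, x.mean() - half_width, x.mean() + half_width

def best_lambda(x, grid=np.linspace(0.05, 0.95, 19)):
    """Select lambda by minimising the one-step-ahead squared errors (Fig. 4a)."""
    x = np.asarray(x, dtype=float)
    sse = lambda lam: np.sum((ewma_chart(x, lam)[0][:-1] - x[1:]) ** 2)
    return min(grid, key=sse)
```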


Frequently, the EWMA statistic $z_{i+1}$ is considered to be a forecast of the system's behavior, so its interpretation is somewhat different from that of the Shewhart and cusum charts, as will be shown in the next section.

2.4. Comparison between the different control charts

In general, it is accepted that cusum charts are faster than Shewhart charts at detecting slight deviations from the mean of the population, even when supplementary run rules are used in the Shewhart charts [19]. This difference depends, of course, on the value chosen for the parameter $\Delta$ when calculating $S_H$, $S_L$ and the control limits in the cusum chart: the lower the value of $\Delta$, the faster the detection of an out-of-control value, but the higher the probability of finding a false out-of-control value. In contrast, the Shewhart charts are generally quicker at detecting a sudden change in the mean of the process, and are easier to apply, since the calculations are much simpler and the plots easier to interpret. Several studies have been published [30] which compare Shewhart and cusum charts and measure their effectiveness by computing their average run length (ARL). The ARL is the average number of points that must be measured before a change of magnitude $\Delta$ is detected. The ARL values are computed for $\Delta = 0$ (ARL$_0$) and $\Delta \neq 0$ (ARL$_\Delta$); ARL$_0$ reflects the probability of a false out-of-control signal. Ideally, ARL$_0$ will be large, that is to say there is a low probability of a false out-of-control value, and ARL$_\Delta$ will be low, which means that a change $\Delta$ is detected with only a few measurements [7,15,18,19,27]. The ARL$_0$ values are higher for a cusum chart, and the ARL$_\Delta$ values (with $\Delta \neq 0$) are higher for the Shewhart chart [18]. These values reinforce the fact that cusum charts are more sensitive to slight changes. EWMA charts are interpreted differently from Shewhart or cusum charts. The most important difference is in how the previous information is taken into account when calculating the statistic: in a Shewhart chart the last plotted point depends only on the last measurement, in a cusum chart it depends on all the previous measurements, and in an EWMA control chart it depends on the value of $\lambda$: the larger the value of $\lambda$, the greater the influence of the last measurements. As $\lambda \to 0$ the EWMA plot takes the



appearance of the cusum plot, and as $\lambda \to 1$ it takes the appearance of the Shewhart plot. Another difference is that the EWMA statistic can be considered a one-step-ahead prediction of the process [29], and so it allows the system's behavior to be forecast. The EWMA statistic therefore has the advantage that it can be used for feedback control, because the prediction can be used to adjust the process. It has been shown that this statistic can be interpreted as a particular case of the time series models [31].
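ARL values such as those cited above are often obtained by simulation. The following Monte Carlo sketch (our addition, not from the cited studies; censoring at `n_obs` slightly biases long run lengths downward) estimates the ARL of any chart expressed as a function returning out-of-control flags.

```python
import numpy as np

def average_run_length(chart_flags, shift=0.0, n_series=2000, n_obs=1000, seed=0):
    """Monte Carlo estimate of the ARL for i.i.d. N(shift, 1) data.
    shift = 0 gives ARL_0 (false alarms); shift = Delta gives ARL_Delta."""
    rng = np.random.default_rng(seed)
    lengths = []
    for _ in range(n_series):
        x = rng.normal(shift, 1.0, n_obs)
        hits = np.flatnonzero(chart_flags(x))
        lengths.append(hits[0] + 1 if hits.size else n_obs)  # censored at n_obs
    return float(np.mean(lengths))

# 3-sigma Shewhart chart with known mu = 0, sigma = 1: ARL_0 is about 370
print(average_run_length(lambda x: np.abs(x) > 3.0))
```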

3. Introduction to advanced process control (APC)

The philosophy behind the treatment of time series from the point of view of advanced process control is completely different from the treatment given by the control charts explained in the section above. The application of SPC assumes that the data are independent and identically distributed (iid) random variables, i.e. uncorrelated data with the same distribution for all the measurements, and the goal is to detect whether there has been a deviation in the mean or the variability. On the other hand, APC and process adjustment are useless in an iid process; they are applied to detect autocorrelations and to forecast the system behavior accordingly. It is in these cases that the time series treatment has advantages over SPC. Correlation in the data is usual in the process industry environment, and so these methodologies have traditionally been developed and applied in this field [8,9]. But, as was stated in the introduction, there are reasons to believe that many analytical systems might produce correlated data, and when this is so the application of control charts is severely compromised [7,31], in spite of the modifications that have been proposed for the computation of the control limits [32]. The main steps involved in the advanced process control approach are the following: (1) identification of the model behind the data; (2) estimation of the parameters of the identified model; (3) validation of the model; (4) if the model is correctly validated, use of the model for prediction; otherwise, identification of a new model. These steps are described in the sections below, and the integration of SPC and APC is presented.

3.1. The identification of time series models

The first step in the application of APC is to detect autocorrelations in the data of the time series by computing the correlation coefficients. Let us suppose that there are N individual points in a time series: $X_1, X_2, X_3, \ldots, X_N$, as shown in the first row of Table 1. If we shift the values one position to the right in each row, we can compute the correlation coefficient between the first and each of the other rows using Eq. (9):

$$r_k = \frac{\sum_{t=1}^{N-k} (X_t - \bar{X}_t)(X_{t-k} - \bar{X}_{t-k})/(N-k-1)}{S_{X_t}\, S_{X_{t-k}}}, \qquad k = 0, 1, 2, 3, \ldots \qquad (9)$$

where

$$\bar{X}_{t-k} = \frac{\sum_{t=1}^{N-k} X_t}{N-k} \qquad \text{and} \qquad S_{X_{t-k}} = \sqrt{\frac{\sum_{t=1}^{N-k} (X_t - \bar{X}_{t-k})^2}{N-k-1}}$$

The plot of $r_k$ versus $k$ is called the simple autocorrelation function (simple acf), where $k$ (the lag) is the number of values that separate the data for which the correlation is computed: we calculate the correlation of each observation with itself when $k = 0$, the correlation of each observation with the previous one when $k = 1$, and so on. For $k = 0$, the correlation coefficient $r_0$ equals 1, since the correlation of each observation with itself is calculated.

Table 1
Representation of N points ($X_1 \ldots X_N$) corresponding to a time series, and the same data shifted k positions, in order to easily compute the autocorrelation coefficients. See text for explanations.

Time     1    2    3    ...   t        ...   N
Series   X1   X2   X3   ...   Xt       ...   XN
Shift 1  —    X1   X2   ...   Xt-1     ...   XN-1
Shift 2  —    —    X1   ...   Xt-2     ...   XN-2
...      ...  ...  ...  ...   ...      ...   ...
Shift k  —    —    —    ...   Xt-k     ...   XN-k
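For illustration (our addition), the simple acf can be computed with a few lines of NumPy. This sketch uses the standard estimator with a common denominator, whereas Eq. (9) normalises each lag with its own mean and standard deviation, so the values differ slightly.

```python
import numpy as np

def simple_acf(x, max_lag=20):
    """Autocorrelation coefficients r_k for k = 0 ... max_lag
    (standard common-denominator estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    denom = np.sum(x * x)
    return np.array([np.sum(x[k:] * x[:n - k]) / denom
                     for k in range(max_lag + 1)])

# Values outside roughly +/- 2/sqrt(N) are usually taken as non-zero
```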


The next step is to compute the correlation coefficients between $X_t$ and $X_{t-k}$ after eliminating the effects of $X_{t-1}, X_{t-2}, \ldots, X_{t-k+1}$. There are several ways of computing these correlation coefficients [39], one of which is based on the so-called Yule–Walker equations:

$$\begin{pmatrix} 1 & r_1 & r_2 & \cdots & r_{k-1} \\ r_1 & 1 & r_1 & \cdots & r_{k-2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ r_{k-1} & r_{k-2} & \cdots & \cdots & 1 \end{pmatrix} \begin{pmatrix} r'_{k1} \\ r'_{k2} \\ \vdots \\ r'_{kk} \end{pmatrix} = \begin{pmatrix} r_1 \\ r_2 \\ \vdots \\ r_k \end{pmatrix} \qquad (10)$$

Solving these equations successively for $k = 1, 2, 3, \ldots$, we obtain the values of $r'_{11}, r'_{22}, \ldots, r'_{kk}$. The plot of $r'_{kk}$ versus $k$ is called the partial autocorrelation function (partial acf). The values of the partial acf measure the correlation between variables shifted $k$ lags, without the influence of the intermediate values. The shapes of the simple and the partial acf give information about the model behind the data. If the data are uncorrelated, the values of both functions are not statistically different from zero, as shown, for example, in Fig. 5a and b; in this case control charts can be applied as explained in the previous section. On the other hand, if the data are correlated, the values of these functions are clearly different from zero;


for example, Fig. 5c shows the simple acf and Fig. 5d the partial acf for the simulated data. The most usual shapes of these plots correspond to the autoregressive, the moving average and the non-stationary models, which are discussed below. The number of points needed for autocorrelation to be detected in the data depends on the structure and the noise, but it is usually at least 50.

3.1.1. Autoregressive models of order p (AR$_p$)

In an autoregressive model of order p, each individual value, $z_t$, is expressed as a finite sum of the p previous values:

$$z_t = \Phi_1 z_{t-1} + \Phi_2 z_{t-2} + \cdots + \Phi_p z_{t-p} + a_t \qquad (11)$$

where $z_t = X_t - \mu$, with $\mu$ being the mean of the population (usually estimated by $\bar{X}$), $a_t$ is the white noise, a sequence of random normally distributed variables with mean zero and variance $\sigma^2$, and $\Phi_i$ are the parameters of the model. Orders 1 and 2 are the most frequent models of this kind. As stated above, these models are identified by the shape of the autocorrelation functions: the simple acf of an AR(1) model has $r_k$ values that decay exponentially with $k$, and only the first value of its partial acf is statistically different from zero. For an AR(2) model, the simple acf is a mixture of exponentials and sines, and only the first two values of the partial acf are statistically different from zero. An example of the simple acf of a simulated autoregressive model of order 1 (with $\Phi_1 = 0.8$) is shown in Fig. 5c, and its partial acf in Fig. 5d. The parameters $\Phi_i$ can be estimated from the Yule–Walker equations, a set of linear equations in terms of $r_k$, but they can also be estimated in other ways [31,33].
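As a sketch of both ideas (our addition, with hypothetical names), the following code simulates an AR(1) series like the pedagogical example of Section 2 and recovers $\Phi_1$ by solving the Yule–Walker system of Eq. (10).

```python
import numpy as np

def simulate_ar1(phi, n=75, mu=100.0, sigma=1.0, seed=1):
    """AR(1) series like the simulated example of Section 2: z_t = phi z_{t-1} + a_t."""
    rng = np.random.default_rng(seed)
    z = np.zeros(n)
    for t in range(1, n):
        z[t] = phi * z[t - 1] + rng.normal(0.0, sigma)
    return mu + z

def fit_ar_yule_walker(x, p):
    """Estimate the AR(p) parameters Phi_i by solving the linear system of Eq. (10)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.sum(x * x)
    r = np.array([np.sum(x[k:] * x[:len(x) - k]) / denom for k in range(p + 1)])
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, r[1:])

x = simulate_ar1(phi=0.8)
print(fit_ar_yule_walker(x, p=1))   # should be close to 0.8
```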

Ž 11 .

where z t s X t y m , with m being the mean of the population Žusually estimated by X, a t is the white noise, which is a sequence of random normally-distributed variables with mean zero and variance s 2 , and F i are the parameters of the model. The orders 1 or 2 are the most frequent models of this kind. As has been stated above, these models are identified by the shape of the autocorrelation functions. So, the simple acf of an ARŽ1. model has r k values that decay exponentially with k, and only the first value of its partial acf is statistically different from zero. And for an ARŽ2. model, the simple acf has a mixture of exponentials and sines, and only the first two values of the partial acf are statistically different from zero. An example of the simple acf of a simulated autoregressive model of order 1 Žwith F 1 s 0.8. is shown in Fig. 5c, and its partial acf in Fig. 5d. The parameters F i can be estimated from the Yule–Walker equations, a set of linear equations in terms of r k , but they can be estimated in other ways w31,33x. 3.1.2. MoÕing aÕerage model of order q (MA q ) In a moving average model of order q the current value, z t , is expressed as a finite sum of the q previous a t : z t s a t y u 1 a ty1 y u 2 a ty2 y . . . yuq a tyq

Fig. 5. Some examples of the shape of the autocorrelation functions. Ža. and Žb.: simple and partial autocorrelation functions for uncorrelated data; Žc. and Žd.: simple and partial autocorrelation functions for a simulated autoregressive model of order 1 ŽARŽ1..; Že. and Žf.: simple and partial autocorrelation functions for a moving average model of order 1 ŽMAŽ1...

Ž 12 .

where z t s X t y m , with m being the mean of the population Žusually estimated by X ., a t are the residuals, which have a white noise structure with mean zero and constant variance Žfor the past observations a t is the difference between the value mea-

A. Rius et al.r Chemometrics and Intelligent Laboratory Systems 40 (1998) 1–18

10

sured at this time, z t , and the value predicted from the previous data at the same time., and u i are the parameters of the model. Orders 1 and 2 are the most frequent, and they are identified by the shape of the autocorrelation functions: in a MAŽ1. model, only the first value of the simple acf is different from zero, and the partial acf decays exponentially. In a MAŽ2. model, only the first two values of the simple acf are different from zero, and the partial acf has a mixture of exponentials and sines. An example of the simple acf of a simulated moving average model of order 1 Žwith u 1 s 0.8. is shown in Fig. 5e, and its partial acf in Fig. 5f. The model parameters, u i , may be estimated from a set of nonlinear equations in terms of the autocorrelations r k using an iterative method w31x.

3.1.3. Autoregressive integrated moving average model of orders p, d and q (ARIMA$_{pdq}$)

The time series models described in the sections above are all stationary, that is to say, the measurements vary around a fixed mean. This section discusses non-stationary models. If they are to be treated, these series must be transformed into stationary series. This is usually done by differencing:

$$w_t = z_t - z_{t-1} = \nabla z_t \qquad (13)$$

Sometimes one differencing is not enough, and stationarity can be achieved by differencing d times; thus, the original series is transformed by $w_t = \nabla^d z_t$. Once it has been transformed, the resulting series may contain autoregressive and moving average terms. A general ARIMA series is expressed as a sum of several autoregressive and moving average terms:

$$w_t = \Phi_1 w_{t-1} + \Phi_2 w_{t-2} + \cdots + \Phi_p w_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q} \qquad (14)$$

where $w_t$ is the dth difference of the original series ($w_t = \nabla^d z_t$), $a_t$ is the white noise, and $\Phi_i$ and $\theta_i$ are the parameters of the model. When the series is stationary, the value of d is zero and the general model is an ARMA$_{pq}$. Non-stationary series can be identified because, in their simple acf, the values of $r_k$ decrease in a fairly regular way. In such cases, the series should be differenced, and then the shape of the autocorrelation functions used to identify the model. So, an ARMA(1,1) model (first order autoregressive and first order moving average) has simple and partial autocorrelation functions with exponential decays (their shapes correspond to the sum of the functions of an AR(1) and an MA(1) model). Once the model has been identified, its parameters must be estimated; this can be done by several methods, as explained above. When the parameters of the model have been computed, the model can be used to forecast the system's behavior. But first the model must be validated, as will be shown in the example in Section 5.
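Differencing, Eq. (13), is a one-line operation; the following sketch (ours) shows how a non-stationary random walk becomes stationary after one difference.

```python
import numpy as np

def difference(z, d=1):
    """Apply the differencing of Eq. (13) d times: w_t = z_t - z_{t-1}."""
    w = np.asarray(z, dtype=float)
    for _ in range(d):
        w = np.diff(w)
    return w

# A random walk is non-stationary; its first difference is white noise again
rng = np.random.default_rng(0)
walk = np.cumsum(rng.normal(size=200))
print(difference(walk).std(), walk.std())
```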

3.2. Integration of SPC and time series modelling

As stated above, SPC is useful when the data are independent and identically distributed (iid) random variables, but the use of SPC techniques is not limited to cases where there is no correlation in the data: the control charts can be used as a tool complementary to the time series modelling techniques. The term algorithmic statistical process control (ASPC) [8,9] has been proposed for the approach that uses the capabilities of time series analysis to detect autocorrelations (and then to adjust the process) together with the capabilities of the control charts to detect special causes of variability in the process. When time series models are applied, the dynamic behavior of the system is modelled so that the measured parameters can be predicted. It is possible to adjust the system continuously, so that every time a new measurement is obtained the process can be adjusted in order to reduce the variability [34]; of course, this will be useful only if the cost of the adjustments is low. Once the model has been correctly identified, the residuals, i.e. the differences between the predicted and actual values, are uncorrelated, and so none of the values of their simple acf should be statistically different from zero. In this case, control charts can be applied to these residuals in order to find any special variability produced in the system. On the other hand, if autocorrelation is detected in the residuals, it means that the system has not been well identified and another model should be tried. Usually, the shape of the simple acf of the



residuals gives an indication of what the new model should be; for example, if the model is initially identified as an AR(1) and the residuals indicate an MA(1) model, then the correct model will probably be an ARMA(1,1). So, time series modelling can be interpreted as a filter that transforms the correlated data into uncorrelated data, and SPC is applied to the latter in order to detect changes in the model or the presence of out-of-control signals.
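A miniature version of the ASPC idea (our addition, with an AR(1) filter and hypothetical names): the model residuals are computed and then monitored with Shewhart-type limits.

```python
import numpy as np

def aspc_residual_chart(x, phi):
    """Filter the series with an AR(1) model and apply Shewhart-type
    3-sigma limits to the (ideally uncorrelated) residuals."""
    z = np.asarray(x, dtype=float)
    z = z - z.mean()
    residuals = z[1:] - phi * z[:-1]          # a_t = z_t - Phi_1 z_{t-1}
    limit = 3.0 * residuals.std(ddof=1)
    return residuals, np.abs(residuals) > limit
```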

4. Artificial neural networks (ANN) applied to time series models

This section presents recurrent neural networks (RNN) applied to time-dependent data, both as a complement to and as an alternative to the linear models of ASPC, especially for tackling problems that cannot be solved by the traditional methods, as is the case when dealing with complicated systems. Among the well-known neural network learning strategies, backpropagation and counterpropagation have attracted increasing interest and development, and are therefore used in many applications. Their application to analytical time-dependent data is shown and, at the same time, we try to clarify the confusing terminology that is often found when working with RNN. The most common types of RNN are addressed, taking into account the network design (topology, connection scheme), the training or learning method, and the interpretation. The main difference between the types of RNN is the connection scheme, in our case a recurrent connection that can be considered as a weighted past observation providing a local memory in a neuron. The RNN are thereby able to memorize, and extrapolation in time can be used to predict or forecast future signals.

4.1. Backpropagation learning strategy

Backpropagation of errors is the name of a learning method, a strategy for the correction of the weights. Backpropagation networks consist of multiple layers of neurons, the neurons being connected to each other through connection parameters called weights (w). Fig. 6 shows a simple

Fig. 6. Multilayer feedforward neural network with three, two and one input, hidden and output units, respectively.

topology with one hidden layer and one output neuron. In this case, the relationship between the predicted value $\hat{y}$ and the input variables $x_i$ can be expressed as Eq. (15):

$$\hat{y} = g\left(\sum_{j=1}^{m} v_j\, f\left(\sum_{i=1}^{n} w_{ji}\, x_i\right)\right) \qquad (15)$$

where $g(\cdot)$ is a monotone function, usually a linear function, $f(\cdot)$ is a non-linear monotone function, usually a sigmoid function, $n$ is the number of input variables ($1 \ldots i \ldots n$), $m$ is the number of hidden neurons ($1 \ldots j \ldots m$), and $w_{ji}$ and $v_j$ are the weights between the input variables and the hidden neurons and between the hidden neurons and the output neuron, respectively, both estimated during the training process. The sum of the squared residuals is minimized during training (Eq. (16)):

$$\sum_{s=1}^{k} (y_s - \hat{y}_s)^2, \qquad k = \text{number of training samples } (1 \ldots s \ldots k) \qquad (16)$$

The correction of the weights ($\Delta\text{weights}$) is proportional to the error obtained and is done using the well-known gradient descent procedure, the generalized delta rule [10,35,36] (Eq. (17)):

$$\Delta\text{weights} = \eta\, \delta\, \text{input} + \mu\, \Delta\text{weights}_{(\text{previous})} \qquad (17)$$

where 'input' is the input value to the neuron of the layer where the corrections are being made, $\eta$ is the learning rate, an adaptive parameter which



determines the speed of the weight changes, $\mu$ is the momentum, a constant parameter, usually between 0.01 and 0.9, which makes it possible to escape from a local minimum, and $\delta$ is a correction factor that depends on the layer where the corrections are being made, that is, the output or the hidden layer. In the output layer, $\Delta\text{weights} = \Delta v_j$ and $\delta = (y - \hat{y})\,\hat{y}\,(1 - \hat{y})$; in the hidden layer, $\Delta\text{weights} = \Delta w_{ji}$ and $\delta_j = (\delta\, v_j)\, \text{out}_j\, (1 - \text{out}_j)$, where $\text{out}_j$ is the output value of neuron $j$. Once all the objects from the training set have been introduced into the network and the weights adjusted according to Eq. (17), prediction is carried out using Eq. (15): for any unknown object defined by the input variables $x_i$, the predicted value $\hat{y}$ of the output variable is calculated.
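For concreteness (our addition, with hypothetical names), a minimal training loop for a 1:m:1 network using the delta rule with momentum, Eq. (17). A linear output is used, as in the case study of Table 2, so the output-layer $\delta$ reduces to $y - \hat{y}$; biases are omitted for brevity.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_mlp(x_in, y, m=3, eta=0.1, mom=0.9, epochs=200, seed=0):
    """1:m:1 network, sigmoid hidden layer, linear output; weights are
    corrected with the generalised delta rule with momentum, Eq. (17)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(0.0, 0.5, m)               # input -> hidden weights
    v = rng.normal(0.0, 0.5, m)               # hidden -> output weights
    dw_prev = np.zeros(m)
    dv_prev = np.zeros(m)
    for _ in range(epochs):
        for xi, yi in zip(x_in, y):
            out = sigmoid(w * xi)             # hidden activations
            y_hat = np.dot(v, out)            # linear output, cf. Eq. (15)
            delta_out = yi - y_hat            # output delta (linear transfer)
            delta_hid = delta_out * v * out * (1.0 - out)
            dv = eta * delta_out * out + mom * dv_prev   # Eq. (17)
            dw = eta * delta_hid * xi + mom * dw_prev
            v += dv
            w += dw
            dv_prev, dw_prev = dv, dw
    return w, v
```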

4.2. Recurrent neural networks, RNN

Recurrent neural networks are extensions of the multilayer feedforward neural networks. Broadly, a subdivision can be made between open and closed loop neural networks. An open loop neural network is one in which there is no feedback of outputs or activations (outputs of hidden neurons) during the training step; strictly speaking, in this case no recurrent connection is implemented. In such networks, the input and output values are the same but shifted in time, and the network is able to memorize in time and therefore to predict or forecast future values. Fig. 7 shows an open loop network topology applied to time series.

Fig. 7. Multilayer feedforward neural network applied to time series data. The network topology has n, m and one input, hidden and output units, respectively.

Fig. 8. Recurrent neural network applied to time series data. The network topology has n, m and one input, hidden and output units, respectively, and q delays. The recurrence implemented back to the inputs, r, can be either the predicted output $\hat{x}_t$ or the residual $\hat{e}_t = x_t - \hat{x}_t$.

In this case, the relationship between output and input variables can be expressed as Eq. (18):

$$\hat{x}_t = g\left(\sum_{j=1}^{m} v_j\, f\left(\sum_{i=1}^{n} w_{ji}\, x_{t-i}\right)\right) \qquad (18)$$

where the functions $g(\cdot)$ and $f(\cdot)$ and the parameters $w_{ji}$ and $v_j$ are the same as those specified in Eq. (15). Eq. (18) resembles the autoregressive model of order p described in Section 3.1.1; therefore, this type of neural network can be compared with an AR$_p$ model. In the engineering field, it is also called an infinite impulse response (IIR) filter. Closed loop neural networks are those types of neural network in which a recurrent connection is implemented. Taking into account the type of recurrence implemented (the real outputs, the outputs predicted during training, or any activations), several types of RNN can be distinguished. Roughly, the closed loop or recurrent networks can be divided into two types. (i) Networks in which the output of the neural network is fed back as an input (Fig. 8). This network is known as the Jordan type, and it can be compared with a non-linear ARMA model. The mathematical expression relating the output and input variables is Eq. (19):

$$\hat{x}_t = g\left(\sum_{j=1}^{m} v_j\, f\left(\sum_{i=1}^{n} w_{ji}\, x_{t-i} + \sum_{p=1}^{q} w'_{jp}\, \hat{x}_{t-p}\right)\right) \qquad (19)$$

where the functions $g(\cdot)$ and $f(\cdot)$ and the parameters $w_{ji}$ and $v_j$ are the same as those specified in Eq. (15), $q$ is the number of delay units ($1 \ldots p \ldots q$), and $w'_{jp}$ are the recurrent weights estimated during the training process. A similar expression is found when, instead of feeding back the output of the network, the recurrence is implemented with the difference between the real and the predicted output, $\hat{e}_t = x_t - \hat{x}_t$. (ii) Networks in which the outputs of the hidden neurons are fed back to the inputs (Fig. 9). Because this last type can be implemented in different configurations, several names have been used for it: Kernel, Elman, tap-delay neural network (TDNN), time-delay and, in the engineering field, finite impulse response (FIR) filter. In all the described types of recurrent neural networks, the time dependency (delay) has to be found by different means, such as the autocorrelation functions specified in Section 3.2.

Fig. 9. Recurrent neural network applied to time series data. The network topology has n, m and one input, hidden and output units, respectively, and q delays. The recurrence implemented back to the inputs, r, is the output of the hidden layer.
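A sketch (ours, not from the paper) of the closed-loop use of a Jordan-type network with $n = q = 1$, cf. Eq. (19): beyond one step ahead, the network's own predictions are fed back as inputs. The weights `w`, `w_rec` and `v` are assumed to have been trained already, e.g. by backpropagation as in Section 4.1.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def jordan_forecast(x, w, v, w_rec, steps=5):
    """Closed-loop forecasting with one true input and one fed-back
    prediction per step (n = q = 1 in Eq. (19))."""
    x_prev = float(x[-1])      # last measured value
    xhat_prev = float(x[-1])   # seed the feedback loop
    forecasts = []
    for _ in range(steps):
        hidden = sigmoid(w * x_prev + w_rec * xhat_prev)  # recurrent term
        x_hat = float(np.dot(v, hidden))                  # linear output
        forecasts.append(x_hat)
        x_prev = xhat_prev = x_hat   # predictions replace data beyond t+1
    return np.array(forecasts)
```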

4.3. Counterpropagation learning strategy

The philosophy behind counterpropagation learning is completely different from that of backpropagation learning, but it has been proved that the counterpropagation strategy can be as good at modelling as the backpropagation model [37]. Counterpropagation learning uses a two-layer neural network, the first or input layer being a Kohonen layer [38]. Both layers have the same number and layout of neurons, and in each layer every neuron has as many weights ($W_j$) as there are input values. With counterpropagation learning, only one type of time-dependency architecture, and therefore of data presentation, has been implemented, namely the moving window [37]. Moving window means presenting the data in the same way as explained for the open loop neural networks. As in the backpropagation model, the inputs $X$ ($x_{t-1}, \ldots, x_{t-q}$) are presented to the Kohonen layer, but the main difference is that here the output, $Y$ ($x_t$), is presented as an input to the second or output layer. Fig. 10 shows a counterpropagation architecture applied to time-dependent data with $q$ time delays.

Fig. 10. Counterpropagation neural network applied to time series data. The neurons are arranged in a quadratic matrix and the length of the columns ($t-q$) corresponds to the number of weights (circles) and consequently to the dimension of the input vector ($X$). The output layer has only one weight, which corresponds to the predicted value $\hat{X}_t$.



As in the RNN based on the backpropagation neural network, the time dependency (delay) has to be found by means of the autocorrelation functions. As in any modelling technique, the goal of counterpropagation neural network learning is to provide an output value (the predicted value $\hat{x}_t$) as similar as possible to the desired (real) one. In counterpropagation this is done by correcting the weights independently in both layers, the Kohonen layer and the output layer. Since this is supervised learning, during training a data pair $\{X, Y\}$ for each object is presented to the network, and one neuron among all of them is selected in the Kohonen layer. The most widely used criterion is to select the neuron whose weight vector $W_j$ ($w_{t-1}, \ldots, w_{t-q}$) is most similar to the input vector $X$ ($x_{t-1}, \ldots, x_{t-q}$) (Eq. (20)):

$$\text{selected neuron} \leftarrow \min\left\{\sum_{i=t-1}^{t-q} \left(x_i - w_{ji}^{k}\right)^2\right\} \qquad (20)$$

where $q$ is the number of input values or time delays, and $w_{ji}^{k}$ are the weights in the Kohonen layer (superscript k), in neuron $j$, for the input object $i$. Once the neuron has been selected, the weights are corrected independently in each layer, input and output. The correction is done in such a way as to minimize the difference between the weights and the inputs: the weights in the input layer are corrected so as to be most similar to the input values ($x_{t-1}, \ldots, x_{t-q}$), while the weights in the output layer are corrected so as to be most similar to the output value ($x_t$). The correction of the weights depends on the size of the network and on the stage of training: at the beginning of training, the correction is done over the entire layer (the weights in all neurons are corrected), while at the end of training the correction is done only in the selected neuron [37,38]. In the prediction step, the input values of an unknown object are entered into the Kohonen layer, and the neuron whose weights are most similar to the input values is selected (Eq. (20)). In the output layer, the neuron in the same position as the one selected in the Kohonen layer is chosen, and its weight is associated with the object, so the predicted value $\hat{x}_t$ is assigned to the unknown object.
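A simplified counterpropagation sketch (our addition, with hypothetical names): winner selection follows Eq. (20), but only the winning neuron is corrected, whereas the full algorithm also adapts neighbouring neurons early in training, e.g. with a triangular function.

```python
import numpy as np

def train_counterprop(X, y, n_neurons=64, epochs=200, seed=0):
    """Kohonen layer W (one weight vector per neuron) plus a one-weight
    output layer; winner selection follows Eq. (20)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 1.0, (n_neurons, X.shape[1]))
    out = rng.normal(0.0, 1.0, n_neurons)
    for epoch in range(epochs):
        lr = 0.5 * (1.0 - epoch / epochs) + 0.01   # shrinking correction
        for xi, yi in zip(X, y):
            j = np.argmin(np.sum((W - xi) ** 2, axis=1))   # winner, Eq. (20)
            W[j] += lr * (xi - W[j])      # move weights towards the inputs
            out[j] += lr * (yi - out[j])  # move output weight towards y
    return W, out

def predict_counterprop(W, out, xi):
    """Return the output weight of the neuron closest to the input vector."""
    return out[np.argmin(np.sum((W - xi) ** 2, axis=1))]
```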

5. A case study in automated analytical measurements

The three approaches discussed, SPC, APC and RNN, were applied to the data of an analytical process: the results of 56 consecutive sulphate determinations in a control sample using a SIA methodology [39]. The samples were analyzed regularly over two days, and the control charts for these data are shown in Fig. 11. Fig. 11a shows the X and R charts, with the central line and the upper and lower warning and control limits. In the X chart, all the points are in the range $(-3s/c_4, +3s/c_4)$ and, consequently, no out-of-control points are detected. In the R chart, two out-of-control signals are obtained, for sample numbers 9 and 15. But by applying supplementary run rules to the X chart, 13 out-of-control points are detected (shown as solid points in the figure); the first out-of-control point detected is number 9. In Fig. 11b, the cusum chart for the same data is shown. The parameters used to compute the chart are $\alpha = \beta = 0.05$ and $\Delta = 15$ (twice the standard deviation of the data). It can be seen that 15 points are beyond the control limits (the first one is the 15th). Finally, the EWMA control chart is shown in Fig. 11c; the value of the $\lambda$ parameter used is 0.8 (in this case, the plot of the sum of squares versus $\lambda$ has no minimum, but above $\lambda \approx 0.8$ it is almost constant). It can be seen that two out-of-control points (numbers 35 and 36) are detected. The detection of several out-of-control points in the control charts suggests that the system should be revised at different points to find the causes of these individual results. The results of applying the APC approach to the same data are shown in Fig. 12. The simple acf is shown in Fig. 12a, where it can be seen that the data are clearly correlated, so the application of time series models is appropriate. Moreover, the shape of this simple acf indicates that the model can probably be identified as a first order autoregressive model (AR(1)). This is confirmed by the partial acf shown in Fig. 12b, where only the first value is statistically different from zero. Assuming the model to be first order autoregressive, the estimation of the model parameter yields $\Phi_1 = 0.84$, so the identified model is $z_t = 0.84\, z_{t-1} + a_t$,



Fig. 11. Control charts for 56 consecutive determinations of sulphate in a control sample. (a) X–R chart. (b) Cusum chart. (c) EWMA chart.

where $z_t = x_t - \mu$, and $\mu$ is the mean of the population (estimated by $\bar{x}$). When an observation is made at time $t-1$ ($z_{t-1}$), the model can be used to predict the next observation $z_t$ (prediction one step ahead) assuming $a_t = 0$, and this predicted value can be used to predict the following observation $z_{t+1}$ (prediction two steps ahead), and so on. In Fig. 13a the predicted values and the actual values are shown:

the prediction at time t is made using the known values up to time $t-1$. The model is validated by calculating the residuals of the predictions and computing their simple acf. When this is done with the sulphate data, none of the values of the simple acf are statistically different from zero (not shown), and so the model has been identified correctly.



Fig. 12. (a) Simple autocorrelation function for the data corresponding to the determination of sulphate. (b) Partial autocorrelation function for the same data.

SPC complements the time series model by plotting the Shewhart chart of the residuals under the assumed AR(1) model. This control chart is shown in Fig. 13b, where one out-of-control signal is detected, in contrast to the 13 signals detected when SPC was applied to the original data. Backpropagation and counterpropagation learning strategies were applied using the open loop configuration (architecture), which is the most similar to the linear model (AR$_p$) described in the section above. Taking into account the information obtained from the correlation analysis of the sulphate data (Fig. 12), a delay equal to one was implemented to train both neural networks. So, the networks were trained with input and output pairs of data shifted by one time unit: $\{X\, Y\} = \{x_{t-1}\, x_t\}, \{x_{t-2}\, x_{t-1}\}, \ldots, \{x_{t-(n-1)}\, x_{t-n}\}$. When an observation is made at time $t$ ($x_t$), the neural network predicts the next observation at time $t+1$ ($x_{t+1}$). With the backpropagation learning strategy, there was only one neuron in the input and output layers, as the data are shifted by one time unit, and there were three neurons in the hidden layer, so the final backpropagation network architecture was 1:3:1 neurons. The prediction results after 200 epochs, using the parameters specified in Table 2, are shown in Fig. 13a. With the counterpropagation learning strategy, a neural network with 64 (8 × 8) neurons was used. Each neuron, both in the input or Kohonen layer and

Fig. 13. (a) Comparison of the predicted values obtained with the autoregressive (- - -), backpropagation NN (—·—) and counterpropagation NN (– – –) models with the measured values for the sulphate determination data. (b) Shewhart chart of the residuals for the autoregressive model.

in the output layer, has only one weight. The optimized network architecture is non-toroidal and uses a triangular function to adapt (correct) the weights. An autoscaling pre-treatment was applied to the input and output values. The prediction results obtained after 200 epochs are shown in Fig. 13a.

Table 2
Parameters used to train the recurrent backpropagation neural network

Parameter                  Specification
Data pre-treatment         inputs and outputs scaled between 1 and 0
Weights initialization     method of Nguyen and Widrow
Hidden transfer function   sigmoid
Output transfer function   linear
Learning rate              0.1
Momentum                   0.9


As can be seen in Fig. 13a, the predictions obtained with the autoregressive model and with the RNN trained by backpropagation are very similar, and in general the predictions made with the RNN trained by counterpropagation seem to be slightly better. In order to compare the predictive ability of the AR model and the RNN models, the root mean square error of prediction (RMS) is calculated from the measured and predicted values according to Eq. (21):

$$\mathrm{RMS} = \sqrt{\frac{\sum_i (y_i - \hat{y}_i)^2}{I}} \qquad (21)$$

where $y_i$ are the measured values, $\hat{y}_i$ the predicted values, and $I$ the number of samples. The results obtained are RMS = 4.8 for the first order autoregressive model, RMS = 4.5 for the backpropagation RNN, and RMS = 3.2 for the counterpropagation RNN.
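Eq. (21) in code (our addition), for completeness:

```python
import numpy as np

def rms(y, y_hat):
    """Root mean square error of prediction, Eq. (21)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))
```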

6. Discussion

Statistical process control (SPC) is the classical method for checking the reliability of analytical systems over time. Shewhart, cusum and EWMA charts are the basic tools used for monitoring a system. The Shewhart chart is the simplest; it allows the mean and the variability of the process to be controlled, and it can be used for individual measurements as well as for subgroups. The main drawback of the Shewhart chart in relation to the other control charts is that it is slower at detecting slight deviations; in this case, the cumulative sum (cusum) chart is preferred, although the calculations and interpretation are not as simple as for Shewhart charts. The third kind of control chart is the EWMA chart. It is somewhat different from the others, because the EWMA statistic can be interpreted as a prediction of the system behavior one position ahead; this is an advantage over the other control charts but, as has been stated, this statistic is a particular case of the time series models described in the advanced process control (APC) section. SPC methodologies have two important drawbacks: the first is that they assume that the observations are independent; the second is that the control charts allow a current observation ('in vivo') or a completed time series ('post mortem') to be studied,


SPC methodologies have two important drawbacks. The first is that they assume that the observations are independent. The second is that the control charts allow a current observation ('in vivo') or a completed time series ('post mortem') to be studied, but they do not allow the system behavior to be predicted ('in futurum') [6], except in the particular case of the EWMA statistic. In the field of analytical chemistry the assumption of independence is clearly compromised, because in present-day analytical systems a time-dependent factor can be important; for this reason, the control chart information sometimes leads to erroneous conclusions.

In order to overcome these limitations, APC was introduced. Its tools detect autocorrelations, model the data, validate the model and predict the system behavior. The predictions can be made one position ahead, i.e. at time t+1, or further ahead, at times t+2, t+3, . . . ; of course, the predictions become much less reliable for longer times. The drawbacks of this approach are that a model must be assumed to underlie the data, and that a considerable number of measurements is needed to identify the model, especially if the autocorrelation in the data is low, although this is also true for computing the control limits in the SPC approach. When only a few measurements are available these techniques can still be applied, although the results are not as reliable as they would be with more data.

The recurrent neural networks (RNN) that we have studied, with backpropagation-of-errors and counterpropagation learning strategies, are well suited to time series prediction. The results of both types of network are comparable with those of the autoregressive model (see Fig. 13a). Since an RNN does not assume a model, in principle it may be more suitable than APC for fitting complicated time-dependent data. However, an important drawback is that the system cannot be interpreted. Another limitation of RNN, as of any type of neural network, is the lack of rules for selecting the parameters used to train the network, which is sometimes reflected in very long training times. When the data are quite simple, however, the neural network is also a simple one and training times are not long, as can be concluded from the example discussed in this paper, although for such systems a simpler approach would be recommended.

Finally, for RNN, as for the other models described in this paper, the input configuration is critical for good prediction performance: the time dependency, i.e. the input data configuration, has to be known or extracted from an autocorrelation analysis, as sketched below.
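A minimal sketch of such an autocorrelation analysis, assuming NumPy (the function name and lag range are illustrative):

```python
import numpy as np

def acf(x, max_lag=20):
    """Sample autocorrelation function of a 1-D series; a large value
    at lag k suggests using x_{t-k} as a model input (an AR term of
    order k, or a network input delayed by k time units)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    c0 = np.dot(x, x)                 # lag-0 autocovariance (times n)
    return np.array([np.dot(x[:-k], x[k:]) / c0
                     for k in range(1, max_lag + 1)])
```

Lags whose sample autocorrelation falls outside roughly ±2/√n (the usual large-sample significance band) are candidates for the delays used to configure the model inputs.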


Once the network has been trained, it is able to predict data several time steps ahead, at t+1, t+2, etc.; as for APC, the predictions become much less reliable for longer times.

The comparison of the three approaches has shown that SPC leads to inadequate conclusions when there is autocorrelation in the data, so it is better to use APC or RNN. With our data, the results obtained with the counterpropagation RNN are slightly better than those obtained with the autoregressive model and the backpropagation RNN. The APC and RNN approaches are still not widely used for time series in analytical chemistry, so many related problems remain to be dealt with: for example, the frequency of analysis of control samples, the effect of changing the control sample, how to detect a change in the system's behavior (the appearance of drifts or autocorrelations), how to use the knowledge of the system that these techniques provide to achieve better predictions, and what can be done when a deviation is detected or anticipated.

None of the techniques described can do away with the need for the analyst, or the assessor of a measurement system, to understand the measurement process being evaluated: that understanding is the raw material on which the tools discussed in this tutorial work.

Acknowledgements

Financial support given by the Spanish Ministry of Education and Science (DGICyT project No. BP930366) is acknowledged.

References

[1] C.B.G. Limonard, Clin. Chim. Acta 94 (1979) 137–154.
[2] D.L. Massart, B.G.M. Vandeginste, S.N. Deming, Y. Michotte, L. Kaufman, Chemometrics: a textbook, Elsevier, 1988.
[3] K. Doerffel, G. Herfurth, V. Liebich, E. Wendlandt, Fresenius J. Anal. Chem. 341 (1991) 519–523.
[4] K. Doerffel, Fresenius J. Anal. Chem. 348 (1994) 183–187.
[5] C. Liteanu, I. Rica, Statistical Theory and Methodology of Trace Analysis, Ellis Horwood Limited, Chichester, England, 1980.
[6] K. Doerffel, R. Niedtner, U. Raschke, S. Blase, Anal. Chim. Acta 238 (1990) 55–62.

[7] J.H. Harris, W.H. Boss, Can. J. Chem. Eng. 69 (1991) 48–57.
[8] G. Box, T. Kramer, Technometrics 34 (1992) 251–285.
[9] W.T. Tucker, F.W. Faltin, S.A. Vander Wiel, Technometrics 35 (1993) 363–375.
[10] Y. Chauvin, D.E. Rumelhart, Backpropagation: Theory and Applications, Lawrence Erlbaum Associates, Hillsdale, NJ, 1995.
[11] A.S. Weigend, N.A. Gershenfeld (Eds.), Time Series Prediction: Forecasting the Future and Understanding the Past, SFI Studies in the Sciences of Complexity, Proc. Vol. XV, Addison-Wesley, 1993.
[12] O. Nerrand, P. Roussel-Ragot, D. Urbani, L. Personnaz, G. Dreyfus, IEEE Trans. Neural Networks 5 (1994) 178–184.
[13] J.T. Connor, R. Douglas, L.E. Atlas, IEEE Trans. Neural Networks 5 (1994) 240–254.
[14] W.A. Shewhart, Economic Control of Quality, Van Nostrand, 1931.
[15] E.L. Grant, R.S. Leavenworth, Statistical Quality Control, 6th ed., McGraw-Hill, New York, 1988.
[16] T.P. Ryan, Statistical Methods for Quality Improvement, John Wiley, New York, 1989.
[17] K.C.B. Roes, R.J.M.M. Does, Y. Schurink, J. Qual. Technol. 25 (1993) 188–198.
[18] A.J. Duncan, Control de Calidad y Estadística Industrial, Ed. Alfaomega, México, 1989.
[19] C.W. Champ, W.H. Woodall, Technometrics 29 (1987) 393–399.
[20] A.F. Bissell, Bull. Appl. Stat. 5 (1978) 113–128.
[21] D.J. Wheeler, J. Qual. Technol. 15 (1983) 155–170.
[22] E.S. Page, Biometrika 42 (1955) 523–527.
[23] E.R. Ott, E.G. Schilling, Process Quality Control, McGraw-Hill, 1990.
[24] W.H. Woodall, J. Qual. Technol. 18 (1986) 99–102.
[25] J.M. Lucas, J. Qual. Technol. 8 (1976) 1–12.
[26] J.M. Lucas, R.B. Crosier, Technometrics 24 (1982) 199–205.
[27] S.W. Roberts, Technometrics 1 (1959) 239–250.
[28] J.S. Hunter, J. Qual. Technol. 18 (1986) 203–210.
[29] J.M. Lucas, M.S. Saccucci, Technometrics 32 (1990) 1–12.
[30] C.W. Champ, W.H. Woodall, Technometrics 29 (1987) 393–399.
[31] G.E. Box, G.M. Jenkins, G.C. Reinsel, Time Series Analysis: Forecasting and Control, 3rd ed., Prentice Hall, Englewood Cliffs, NJ, 1994.
[32] A.P. Vasilopoulos, A.P. Stamboulis, J. Qual. Technol. 10 (1978) 20–30.
[33] The System Identification Toolbox, MATLAB, The MathWorks, South Natick, MA, USA.
[34] J.F. MacGregor, Chemical Engineering Progress, 1988, pp. 21–31.
[35] J.R.M. Smits, W.J. Melssen, L.M.C. Buydens, G. Kateman, Chemom. Intell. Lab. Syst. 22 (1994) 165–189.
[36] B.J. Wythoff, Chemom. Intell. Lab. Syst. 18 (1993) 115–155.
[37] J. Zupan, J. Gasteiger, Neural Networks for Chemists: An Introduction, VCH Verlagsgesellschaft, Weinheim, 1993.
[38] J. Dayhoff, Neural Network Architectures: An Introduction, Van Nostrand Reinhold, New York, 1990.
[39] A. Rius, M.P. Callao, F.X. Rius, The Analyst 122 (1997) 737–741.