Environmental statistical process control using an augmented neural network classification approach



European Journal of Operational Research 174 (2006) 1631–1642 www.elsevier.com/locate/ejor

Production, Manufacturing and Logistics

Deborah F. Cook a,*, Christopher W. Zobel a, Mary Leigh Wolfe b

a Department of Business Information Technology (0235), Virginia Tech., 1007 Pamplin Hall, Blacksburg, VA 24061, United States
b Department of Biological Systems Engineering, Virginia Tech., 305 Seitz Hall, Blacksburg, VA 24061, United States

Received 1 December 2003; accepted 27 April 2005
Available online 19 July 2005

Abstract

Shifts in the values of monitored environmental parameters can help to indicate changes in an underlying system. For example, increased concentrations of copper in water discharged from a manufacturing facility might indicate a problem in the wastewater treatment process. The ability to identify such shifts can lead to early detection of problems and appropriate remedial action, thus reducing the risk of long-term consequences. Statistical process control (SPC) techniques have traditionally been used to identify when process parameters have shifted away from their nominal values. In situations where there are correlations among the observed outputs of the process, however, as in many environmental processes, the underlying assumptions of SPC are violated and alternative approaches such as neural networks become necessary. A neural network approach that incorporates a geometric data preprocessing algorithm and identifies the need for increased sampling of observations was applied to facilitate early detection of shifts in autocorrelated environmental process parameters. Utilization of the preprocessing algorithm and the increased sampling technique enabled the neural network to accurately identify the process state of control. The algorithm was able to identify shifts in the highly correlated process parameters with accuracies ranging from 96.4% to 99.8%.

© 2005 Elsevier B.V. All rights reserved.

Keywords: Environmental quality; Neural networks; Statistical process control; Correlation

* Corresponding author. Tel.: +1 540 231 4847; fax: +1 540 231 3752.
E-mail addresses: [email protected] (D.F. Cook), czobel@vt.edu (C.W. Zobel), [email protected] (M.L. Wolfe).

0377-2217/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.ejor.2005.04.035

1. Introduction

Environmental researchers and practitioners are often interested in monitoring the behavior of an environmental parameter over time. The goal of this monitoring is to identify when a change, or shift, has occurred in an environmental parameter of interest, such as dissolved oxygen, nitrate, or temperature. Knowledge of a change or shift in the value of the parameter allows appropriate action to be taken. For example, increased concentrations of copper in water discharged from a manufacturing facility might indicate a problem in the wastewater treatment process. Increased numbers of fecal bacteria in a stream near a dairy farm might indicate a failure in a manure storage tank. The ability to identify changes or shifts in the environmental parameter allows for early detection of problems.

Statistical techniques are typically used to identify changes or shifts in the values of parameters. Statistical process control (SPC) is the most widely used technique, primarily in manufacturing. The SPC process identifies whether an observed output or measurement from a system represents a process that is in control or one that has shifted out of control. While variation is present in virtually all processes, only natural, or random, variation is present in an in control process. In contrast, an out of control condition signals the presence of assignable or special cause variation. This type of variation must be identified and eliminated to return the process to a state of statistical control.

There is currently a high level of interest in the use of SPC techniques in environmental data management [6,19,20,24,33]. Corbett and Pan [6] reported great potential for the application of industrial statistics, particularly SPC control charts, in environmental management. They described the development and application of control charts to analyze nitrate blank measurements and nitrate concentration data. These control charts allowed the identification of the presence of a special cause, signaling that the process from which samples were derived should be investigated. Maurer et al. [20] used SPC to identify long- and short-term trends and outliers of sediment cadmium concentration data. Zimmerman et al. [33] used SPC to examine water quality data sets collected from the Mobile Bay and illustrated capabilities of SPC in identifying special causes such as measurement errors and spills.

Traditional SPC techniques have been shown to be quite effective in discrete manufacturing operations practice [8,18]. However, these techniques are typically not applicable in situations where autocorrelation is present in a data set. The presence of autocorrelation indicates that the value of a parameter depends upon its previous values, which violates a basic assumption used to develop the discrete SPC techniques: statistical independence. When autocorrelation is present, not as much information is gained from an additional observation as there would be if that observation was independent from previous observations. Autocorrelation is present in many environmental data streams; various researchers have identified environmental parameters as time series [14,21,31,32,36]. Hipel and McLeod [13] provide an extensive description of time series modeling of water resources and other environmental systems. The presence of autocorrelation within the data stream requires that alternative approaches to identifying shifts in the process mean be considered.

Complex statistical techniques exist for SPC in the presence of autocorrelation. These techniques have seen very limited application and no technique has demonstrated high performance in its ability to detect shifts. Techniques that do not require typical underlying statistical assumptions, such as data independence and data normality, might be more widely applied. Recently, Zobel et al. [34] applied neural network theory to develop such a technique.

The overall goal of this study was to evaluate the application of the neural network based technique to environmental processes. The specific objective was to evaluate the capability of the technique to identify shifts in the mean of autocorrelated environmental data streams.
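The independence assumption discussed in this section can be checked informally on a data stream by estimating its lag-1 sample autocorrelation. The following Python sketch is illustrative only: the function name `lag1_autocorrelation` and the example series are assumptions made for demonstration and are not part of the original study.

```python
def lag1_autocorrelation(series):
    """Return the lag-1 sample autocorrelation of a sequence of numbers."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(series) / n
    # Sum of squared deviations from the mean (denominator of the estimate).
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Sum of products of consecutive deviations (numerator of the estimate).
    numer = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    return numer / denom


# Hypothetical example: a steadily trending series is positively autocorrelated.
print(lag1_autocorrelation([1, 2, 3, 4, 5]))
```

A value near zero is consistent with independent observations, while a value well away from zero suggests that traditional SPC limits may be unreliable for that stream.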

2. Techniques for analyzing correlated process data

The primary techniques available for analyzing correlated process data are statistical approaches. Recently, neural network models have been developed as an alternative approach to statistical techniques. The following sections describe applications of statistical approaches and neural networks to analyzing correlated process data.


2.1. Statistical techniques

Several authors [1,12,29] have recommended the use of time series modeling techniques when monitoring the mean of correlated process data. Wardell et al. [28] evaluated the performance of time series control charting techniques by determining the properties of an X chart of residuals for process data from several common time series models. The results of these studies showed that an X chart of residuals from an AR(1), AR(2), or ARMA(1,1) model performed poorly for most parameter values likely to occur with process control data. Specifically, an X chart required an average of 123–223 samples to identify a standardized shift of one standard deviation, for positive values of the autoregressive parameter of 0.5–0.9. Process measurements are often taken hourly, implying that an average of 123–223 hours could pass before the SPC time series control chart identified a shift in the mean value of a process parameter.

Wardell et al. [28] also studied the run-length distribution of the special-cause control chart (X chart of the residuals) proposed by Alwan and Roberts [1] for correlated observations, given that the result of the assignable cause to be detected is a shift in the process mean. Reynolds et al. [25] used a variable sampling interval (VSI) to monitor the mean in the presence of correlation. VanBrackle and Reynolds [27] evaluated the use of EWMA and CUSUM control charts for the process mean when the observations are correlated. They found that although correlation shortens the time required to detect small to moderate shifts, it also has a higher false alarm rate (signaling an out of control process when the process is actually in control). False alarms can result in lack of confidence in the information, leading to process operators ignoring control chart signals. In addition, the time needed to detect shifts is actually lengthened for larger shifts under this approach.

No consistently high performing statistical methods have been developed to detect shifts in the process mean under the presence of autocorrelation. Corbett and Pan [6] pointed to the presence of correlation in environmental data streams and identified SPC for correlated data as a largely open research area to be investigated.

2.2. Neural network models

A neural network consists of a number of simple, highly interconnected processing elements or nodes and incorporates the ability to process information by a dynamic response of these nodes and their connections to external inputs. A primary advantage of neural network models for monitoring environmental data is that they do not make the assumptions of data normality and independence that underlie traditional SPC. Detailed descriptions of neural networks are available in Freeman and Skapura [10] and Haykin [11].

Neural networks have been applied to process control as an alternative to strictly statistical techniques. Much of the existing research on this use of neural networks has assumed statistical independence of the process data [15–17,23,35]. A more limited set of existing research, described in the following paragraphs, addresses the presence of correlation.

Ruis et al. [26] used a counterpropagation recurrent neural network (RNN) and a backpropagation RNN for monitoring chemical measurement processes. The authors concluded that the counterpropagation RNN is better than the backpropagation RNN when analyzing correlated data. West et al. [30] investigated the ability of radial basis function neural networks to monitor and control complex manufacturing processes that exhibit both auto- and cross-correlation and showed that their method is superior to the classical SPC methods such as the multivariate Shewhart and the multivariate EWMA. Cook and Chiu [7] and Chiu et al. [5] were successful in separating data that exhibited a shift of one, two, and three standard deviations from non-shifted data for simulated data. The neural networks outperformed the traditional SPC control charts as in the previously cited neural network studies.

Cook and Chiu [7] trained radial basis function (RBF) neural networks to identify mean shifts in highly correlated parameters (φ = 0.9) by using training data sets consisting of pairs of consecutive observations of the parameter values: $(x_{t-1}, x_t)$. Each bivariate data point was drawn from one of two possible subsets, non-shifted = 0 or shifted = 1, according to whether or not a mean shift occurred in the time interval between the initial and following observation. Their simulations assumed that if a shift occurred, it occurred between the first and second points of the bivariate pair. Consequently, the resulting bivariate pairs were either (non-shifted, shifted) or (non-shifted, non-shifted). Their goal was to determine if a neural network could be trained to recognize the occurrence of a process shift. After suitable training, the RBF neural networks of Cook and Chiu [7] were able to identify shifts of 1.5 and 2 standard deviations with a fair degree of consistency.


2.3. Combined classification/neural network approach

One of the shortcomings of Cook and Chiu's [7] research was that all chances of detection were lost if the shift was not detected immediately. Zobel et al. [34] developed a neural network based approach that overcomes this shortcoming. Their approach initially pre-classifies neural network training data into mutually exclusive subclasses of observations. When a large shift in the mean occurs, as shown in Fig. 1(a), the two subsets of shifted and non-shifted data will typically occupy fairly distinct regions within the output space; consequently, the neural networks are able to distinguish between the two types of observations with a high degree of certainty. When a smaller shift occurs, however, as displayed in Fig. 1(b), there will be a great deal of overlap between the two subsets, and the networks have a more difficult time accurately classifying individual observations. An observed point falling in the overlap area might signal the need for the collection of an additional sample or samples to accurately classify the process condition. Accurate identification of the overlapping region and resampling increase the chances of detecting a shift, if one occurs, between time (t − 1) and time (t).

Fig. 1. Illustration of shifts in the process mean. (a) Large mean shift (2 sigma); (b) small mean shift (0.5 sigma).

In order to determine the region of overlap between the in control and the out of control observations, Zobel et al. [34] developed an algorithm that identifies, for each class of points, the smallest convex polygon that contains all of the observations within that class. The algorithm creates three distinct classes of observations based upon two initial sets (Set 1 and Set 2) of possibly overlapping bivariate data points. Individual observations are identified as belonging to one of the original two classes of points if they lie within one of these polygons but not within the other (Classes A and B, respectively), and any observations that lie within the intersection of the two polygons are classified as members of the new third class of points (Class C). Once the neural network is trained to classify bivariate data points into one of these three classes, it can be used to monitor an ongoing process to identify the presence or absence of a mean shift.
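The pre-classification step described above can be sketched in Python as follows. This is a minimal illustration, not the implementation of Zobel et al. [34]: it assumes Andrew's monotone chain algorithm for the convex polygons (the source does not state which hull algorithm was used), treats points on a polygon boundary as inside it, and uses hypothetical function names (`cross`, `convex_hull`, `inside`, `preclassify`).

```python
def cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Smallest convex polygon containing the points, counter-clockwise.

    Assumed algorithm: Andrew's monotone chain.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside(hull, p):
    """True if p is inside or on the boundary of a counter-clockwise hull."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


def preclassify(set1, set2):
    """Label each point 'A' (only in polygon 1), 'B' (only in polygon 2),
    or 'C' (in the intersection of the two polygons)."""
    hull1, hull2 = convex_hull(set1), convex_hull(set2)
    if len(hull1) < 3 or len(hull2) < 3:
        raise ValueError("each set needs at least three non-collinear points")
    labels = {}
    for p in set(set1) | set(set2):
        in1, in2 = inside(hull1, p), inside(hull2, p)
        labels[p] = "C" if (in1 and in2) else ("A" if in1 else "B")
    return labels


# Hypothetical example: two overlapping squares of bivariate points.
set1 = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
set2 = [(1, 1), (3, 1), (3, 3), (1, 3), (2.5, 2.5)]
print(preclassify(set1, set2))
```

Points are passed as tuples so that they can serve as dictionary keys; the resulting labels would then be used as the three-class training targets.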


If the trained neural network model identifies a new observation as falling into Class C, immediate increased sampling may be used to clarify the true state of the process. Such an increased sampling approach could be feasible in environmental monitoring applications such as wastewater discharges from manufacturing, food processing, and other industrial facilities. This neural network classification system showed significant improvement over current statistical and analytical techniques for identifying step shifts in correlated manufacturing processes.

Zobel et al. [34] automated the augmented neural network technique in Excel, incorporating both the initial pre-classification algorithm and an appropriate neural network model for identifying the process behavior. The system helps to automate the data analysis and serves to provide a process manager with effective advice as to when additional sampling is required or when a special cause has been introduced into the system.
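The resampling rule described in this section can be sketched as a small loop around any three-class classifier. The Python sketch below is an illustration under stated assumptions: `classify` stands in for the trained network, `resample` stands in for collecting a new observation of the second point, and the `max_resamples` cap is a hypothetical safeguard that is not mentioned in the source.

```python
def classify_with_resampling(classify, previous, current, resample, max_resamples=50):
    """Classify a data pair, resampling the second point while it is inconclusive.

    classify(previous, current) must return 'A', 'B', or 'C'.
    resample() must return a new observation of the second point.
    Returns (label, number_of_additional_samples).
    """
    label = classify(previous, current)
    n_extra = 0
    # Class C means the pair fell in the overlap region, so sample again.
    while label == "C" and n_extra < max_resamples:
        current = resample()
        label = classify(previous, current)
        n_extra += 1
    return label, n_extra


# Hypothetical stand-in classifier based only on the second observation.
def toy_classifier(previous, current):
    if current < 0.5:
        return "A"
    if current > 1.5:
        return "B"
    return "C"


new_values = iter([1.0, 1.2, 2.0])
print(classify_with_resampling(toy_classifier, 0.0, 1.0, lambda: next(new_values)))
```

If the cap is reached the function still returns "C", leaving the decision about further sampling to the process manager.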

3. Application to environmental process data

Application of the augmented neural network procedure to environmental data monitoring consisted of two steps. The first step was to apply the network procedure to simulated correlated data sets, representative of environmental parameter values. Simulated data were used so that known shifts could be induced in the simulated data stream, allowing quantification of the ability of the neural network model to recognize shifts. An Excel-based system was developed to simulate the time series data used in the neural network training. The second part of the study included application of the network procedure to measured in-stream nitrate concentrations presented by Burt et al. [4]. The first part of the evaluation process, as shown in the following sections, clearly demonstrated that the augmented neural network procedure can accurately identify mean shifts in correlated data. The second part of the evaluation describes and demonstrates the potential for use in practice.
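A simulation of this kind can be sketched in Python. The sketch assumes a standardized first-order autoregressive, AR(1), process with unit variance and follows the form of the pair-generation equations given in Section 3.1. The function names, the `seed` argument, and the equal number of shifted and non-shifted pairs are illustrative assumptions; the original study used an Excel-based system rather than Python.

```python
import math
import random


def simulate_pair(phi, shift=0.0, rng=None):
    """Return one (x_prev, x_curr) pair from a standardized AR(1) process.

    `shift` is the mean shift (in standard deviations) assumed to occur
    between the two observations; shift=0 gives a non-shifted pair.
    """
    if not -1.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between -1 and 1")
    if rng is None:
        rng = random
    # Noise standard deviation that keeps the process variance at 1:
    # sigma^2 = sigma_e^2 / (1 - phi^2) with sigma = 1.
    sigma_e = math.sqrt(1.0 - phi ** 2)
    x_prev = rng.gauss(0.0, 1.0)
    x_curr = shift + phi * (x_prev - shift) + rng.gauss(0.0, sigma_e)
    return x_prev, x_curr


def make_training_pairs(phi, shift, n_per_class, seed=None):
    """Return a list of (x_prev, x_curr, label); label 1 marks a shifted pair."""
    rng = random.Random(seed)
    data = []
    for label, mu in ((0, 0.0), (1, shift)):
        for _ in range(n_per_class):
            x_prev, x_curr = simulate_pair(phi, mu, rng)
            data.append((x_prev, x_curr, label))
    return data


# Hypothetical example: a few pairs with phi = 0.9 and a 1.5 sigma shift.
for row in make_training_pairs(0.9, 1.5, n_per_class=3, seed=1):
    print(row)
```

Under these assumptions, a call such as `make_training_pairs(0.9, 1.5, 1250, seed=1)` would produce 2500 labeled pairs, the size of each training data set reported in the study.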


NeuralWare's Predict [22], an add-in for Microsoft Excel, was chosen for the development of neural networks for environmental SPC. Predict is an automated tool for network development and utilizes a backpropagation neural network that incorporates an adaptive gradient learning rule with a weight decay factor that is tuned automatically by the Predict software. In general, neural networks develop a functional mapping of the relationship between input and output parameters based on data examples provided to the network training algorithm. The particular backpropagation neural networks used in this research consisted of an input layer, a single hidden layer, and an output layer. The hidden layer is essential in the computation to develop a representation of the relationship between the input and output parameters. The number of nodes in the hidden layer was determined by Predict using a constructive method known as cascade learning. In cascade learning, hidden nodes are generally added one at a time, new hidden nodes have connections with input nodes as well as with the previously established hidden nodes, and construction is stopped when performance shows no further improvement [22].

3.1. Testing and results with simulated training data sets

To initially develop and test the neural network models, training data sets, consisting of synthetic bivariate data points representing shifted and non-shifted data, were generated using Microsoft Excel. Each training data set consisted of 2500 data vectors. The data vectors consisted of sequential pairs of (non-shifted, non-shifted) or (non-shifted, shifted) points and were simulated observations from an AR(1) time series with varying correlation coefficient (φ) values. An AR(1) process can be represented by the following equation:

$$X_t = \mu + \phi (X_{t-1} - \mu) + \varepsilon_t, \tag{1}$$

where $X_t$ is the value of the time series at time t, μ is the mean of the data series, $X_{t-1}$ is the value of the time series at time (t − 1), $\varepsilon_t$ is a normal, independently distributed error term, and φ is the


autoregressive coefficient restricted to lie between −1 and 1. A standardized version of Eq. (1) with a range of correlation coefficients (φ = 0.5, 0.7, and 0.9) and the appropriate value of μ (non-shifted or shifted with shifts of 1, 1.5, and 2 sigma) was used to generate the training data sets. In the case where no shift occurred between time (t − 1) and time (t), each pair of observations was generated using:

$$X_{t-1} = N(0, 1) \tag{2a}$$

$$X_t = \phi X_{t-1} + \varepsilon_t \tag{2b}$$

where $\varepsilon_t = N(0, \sigma_e)$. The value of $\sigma_e$ is determined based on the relationship with the standard deviation of the data, σ, as shown in Eq. (3):

$$\sigma = \sqrt{\frac{\sigma_e^2}{1 - \phi^2}} \tag{3}$$

Similarly, in the case where a shift had occurred between time (t − 1) and time (t), each individual data pair in the training and testing sets was simulated as:

$$X_{t-1} = N(0, 1) \tag{4a}$$

$$X_t = \mu + \phi (X_{t-1} - \mu) + \varepsilon_t \tag{4b}$$

where $\varepsilon_t$ is determined as above and μ represents the shift size (1, 1.5, or 2 sigma). The data sets consisted of consecutive time series observations with correlation coefficients of 0.5, 0.7, and 0.9, and with mean shifts of 1, 1.5, and 2 sigma. A total of 90 training data sets were generated, specifically, 10 replications of nine combinations of correlation coefficient and mean shift.

The pre-classification algorithm was first applied to each of the original bivariate data sets to generate corresponding new data sets, each consisting of three mutually exclusive subsets of observations. Backpropagation neural networks were developed and trained to accurately classify data samples as belonging to either a shifted, non-shifted, or indeterminate process. Predict automatically selected training, testing, and validation sets from the 2500 data vectors provided in each training data set. Thirty percent of the data set (or 750 data vectors) was reserved and used for testing model performance during training. The weights connecting the nodes in the

network were adjusted during training to minimize the error between the network output and the target output. Additionally, processing nodes were added incrementally so that network error was reduced. This technique is known as cascade correlation [22]. The performance of the network was periodically evaluated during training, using the 750 reserved data vectors, to ensure that the network did not overfit the training data. Table 1 summarizes the performance of the trained neural networks with respect to accurate classification of the 750 test vectors in the training data set. The testing results demonstrate that the new neural network models are capable of accurately classifying the data into the three subclasses, with accuracy rates ranging from 95.72% to 99.80%.

Table 1
Augmented neural network average correct classification rates (%) for three classes (A = shifted, B = non-shifted, C = overlap) with correlation coefficients (φ) of 0.9, 0.7, and 0.5. Entries are classification rate (std dev).

| Correlation coefficient | Shift size | Class A | Class B | Class C |
|---|---|---|---|---|
| 0.9 | 1.0σ | 98.80 (1.03) | 98.80 (1.36) | 97.02 (1.78) |
| 0.9 | 1.5σ | 99.53 (0.394) | 99.33 (0.752) | 98.35 (1.129) |
| 0.9 | 2.0σ | 98.58 (0.883) | 98.42 (0.935) | 99.30 (1.02) |
| 0.7 | 1.0σ | 98.13 (3.13) | 99.33 (1.42) | 96.30 (1.99) |
| 0.7 | 1.5σ | 99.50 (0.650) | 98.02 (2.254) | 97.43 (0.763) |
| 0.7 | 2.0σ | 99.07 (0.540) | 99.22 (0.860) | 98.46 (0.622) |
| 0.5 | 1.0σ | 98.44 (3.44) | 97.02 (3.89) | 95.72 (1.65) |
| 0.5 | 1.5σ | 99.80 (0.632) | 98.60 (2.42) | 97.83 (1.10) |
| 0.5 | 2.0σ | 99.44 (0.786) | 99.55 (0.617) | 97.10 (1.71) |

Once each neural network was trained, examples of shifted and non-shifted data pairs were input into the automated simulation system and the trained neural network model was used to generate a predicted classification (Class A, B, or C) for each data pair. These examples were generated using Eqs. (2)–(4). Each trained network was tested with 10 example data sets, with each set containing 1000 data vectors. If the network-predicted classification of the current data pair was Class A (representing a non-shifted classification) or Class B (representing a shifted classification), classification was complete. If the predicted classification was Class C (representing the overlap area), then the simulated process was put into an intensive sampling, or resampling, mode. Additional observations of the second of the two data points were then generated to simulate immediate resampling of the process, until a conclusive classification of non-shifted (Class A) or shifted (Class B) was finally generated. The predicted classification was compared to the actual state of the process and the results were compiled into summary statistics on the number of correct and incorrect classifications.

Results are given in Table 2 for correlation coefficient values of 0.9, 0.7, and 0.5. In all three cases, the networks were able to do a very good job not only of identifying when a shift of 1 sigma, 1.5 sigma, or 2 sigma occurred, but also of identifying when no shift was present in the data. All of the classification rates for the 1.5 and 2 sigma shifts were above 96%, and overall classification accuracy improved as the size of the shift and the level of correlation increased. The results for the smaller 1 sigma shift were not as good as the others, yet they exceeded 91% accuracy in all but one instance.

Table 2
Augmented neural network average correct classification rates (%) with φ = 0.9, 0.7, and 0.5 for 10 simulated data sets. Entries are average correct classification rate (std dev).

| Data pairs | Shift size | φ = 0.9 | φ = 0.7 | φ = 0.5 |
|---|---|---|---|---|
| (non-shifted, non-shifted) | 1.0σ | 98.61 (0.76) | 94.19 (2.73) | 85.24 (7.74) |
| (non-shifted, non-shifted) | 1.5σ | 99.54 (0.31) | 98.20 (1.01) | 96.39 (1.87) |
| (non-shifted, non-shifted) | 2.0σ | 99.72 (0.21) | 99.24 (0.46) | 98.42 (1.01) |
| (non-shifted, shifted) | 1.0σ | 98.85 (0.64) | 95.80 (2.37) | 91.98 (7.58) |
| (non-shifted, shifted) | 1.5σ | 99.54 (0.30) | 98.75 (0.62) | 98.00 (1.09) |
| (non-shifted, shifted) | 2.0σ | 99.76 (0.20) | 99.42 (0.32) | 99.12 (0.54) |

Classification accuracy generally increased with an increase in the correlation coefficient of the underlying data. This behavior can be attributed to the relative amount of overlap between the two underlying subsets of observations in each test set. As illustrated by the examples given in Fig. 2, a lesser degree of correlation will typically lead to a much higher degree of overlap, because of the underlying shape of the associated "clouds" of data points. Since the overlap area is responsible for a majority of the classification error [34], one would expect an increase in that error when the

relative size of the corresponding overlap becomes greater.

Fig. 2. Illustration of relative overlap associated with different correlation coefficients. (a) 0.9 correlation coefficient (1 sigma shift), 62.9% overlap; (b) 0.5 correlation coefficient (1 sigma shift), 94.3% overlap. Legend: in control, out of control, inconclusive.

The average number of additional samples required to achieve a correct classification for each level of correlation was also calculated for the test problems above. Table 3 gives the number of additional samples for φ = 0.9, and Tables 4 and 5 give the corresponding values for φ = 0.7 and φ = 0.5, respectively. In each case, the number of additional samples required to achieve a definitive classification of either in control or out of control increased as the size of the shift decreased. Similarly, the number of additional samples increased as the correlation coefficient decreased.

Given an initial output of "inconclusive", a correlation coefficient of 0.9, and a shift of 2 sigma, only slightly more than one additional sample was required to achieve a correct classification. In the most extreme case reported, the mean number of additional samples required to achieve a correct classification was less than 34. This compares very favorably to the mean of 123–223 samples reported in Wardell et al. [28] as needed to identify a shift using the X chart.

Table 3
Average number of resamples required to identify process shifts with φ = 0.9. Entries are mean additional samples (std dev).

| Shift size | (non-shifted, non-shifted) | (non-shifted, shifted) |
|---|---|---|
| 2.0σ | 1.12 (0.06) | 1.18 (0.37) |
| 1.5σ | 2.88 (3.65) | 2.31 (2.10) |
| 1.0σ | 5.05 (1.04) | 4.57 (1.86) |

Table 4
Average number of resamples required to identify process shifts with φ = 0.7. Entries are mean additional samples (std dev).

| Shift size | (non-shifted, non-shifted) | (non-shifted, shifted) |
|---|---|---|
| 2.0σ | 2.90 (0.56) | 2.59 (0.82) |
| 1.5σ | 6.46 (1.59) | 5.91 (2.94) |
| 1.0σ | 17.20 (5.18) | 14.86 (7.73) |

Table 5
Average number of resamples required to identify process shifts with φ = 0.5. Entries are mean additional samples (std dev).

| Shift size | (non-shifted, non-shifted) | (non-shifted, shifted) |
|---|---|---|
| 2.0σ | 4.94 (1.28) | 4.17 (1.38) |
| 1.5σ | 11.55 (3.24) | 9.83 (6.18) |
| 1.0σ | 33.39 (23.44) | 29.90 (36.46) |

3.2. Analysis of nitrate concentration data

As reported earlier, time series models have been developed for various hydrologic and environmental applications. The augmented neural network technique for identifying shifts in time series parameters was developed using the widely applied AR(1) time series model for illustrative purposes because it is the most basic and most

often applied model. Bras and Rodriguez-Iturbe [2] reported that the AR(1) model is the most popular model of time-series simulation and forecasting in hydrology and other ﬁelds. The simulations performed above illustrate the ability of the neural network to recognize mean shifts in the AR(1) model. The simulations were required to verify the performance of the augmented neural network methodology, as there is no way in practice to always know whether a shift has actually occurred. Laboratory experiments, in which water quality characteristics are maintained at speciﬁed levels and then altered using designed experimentation to generate shifted data, would be a way to further quantify the ability of the methodology to recognize shifts. The augmented neural network methodology can be applied in practice to monitor and assess shifts in water quality parameters. Nitrate concentrations in surface and ground water are a matter of concern world-wide [3,4] and are a potential application area for the augmented neural network technique. Additional water quality parameters that might be monitored include, for example, phosphorus, bacteria, dissolved oxygen, pH, and toxic chemicals. To illustrate the application of the augmented neural network technique, monthly median stream nitrate concentrations collected over a 15-year period were utilized. Burt et al. [4] described the data set collected in a small catchment in southwest England. The data were standardized and an AR(1) time series model with a correlation coeﬃcient of 0.7826 was identiﬁed as an appropriate time series model. Again, a data set consisting of bivariate data points representing shifted and non-shifted data, was simulated based on the nitrate AR(1) model. The simulation of the data set was required in order to generate a suﬃcient number of training examples. It may be possible in some instances to collect enough data from the measured process to develop suﬃcient training examples. 
In other cases, simulation of data from a statistical model or from a computer model of a particular process may be required to provide sufficient training examples. The data set simulated from the nitrate AR(1) model was used for training an augmented neural network for monitoring nitrate concentration levels. Networks were trained to detect shifts of 1.5, 2, and 2.5 standard deviations in the process mean. Each network achieved a testing accuracy of over 99% during the training process.

The actual data from Burt et al. [4] were used for testing the trained neural network model. Since the nitrate data were not analyzed in real time, it is impossible to know if a process mean shift occurred. However, the use of the actual data illustrates how the process could be used in real time, and the possibility of shifts in the historical data can be analyzed. Table 6 contains the neural network identification of the process condition (in control, out of control, inconclusive). No out of control conditions were identified by the three networks.

In practice, an environmental manager would determine what size shift would be considered significant. In manufacturing processes, managers often wish to identify shifts of 1.5 standard deviations or more [9]. Identification of smaller shifts is not attempted, as a variation of ±1.5 sigma is considered acceptable. It is likely that 1.5 sigma is the smallest variation an environmental process manager might try to identify, and realistically many process managers would start at a much higher shift level. The nitrate data were analyzed
using three different shift levels (1.5, 2.0, and 2.5 sigma) to evaluate the sensitivity of the network. As expected, the 1.5 sigma network was the most sensitive, generating "inconclusive" classifications for 8 of the 15 data pairs. A classification of "inconclusive" signals the possibility of a shift and the need for additional sampling to determine whether a shift occurred. The 2 sigma and 2.5 sigma networks were progressively less sensitive, with the 2.5 sigma network identifying no shifts in any observations except Observation 10 (inconclusive). If the 2.5 sigma network were being applied in practice, additional sampling would be called for at Observation 10. All other observations showed that the process was in control and that no special causes were present.

Table 6
Neural network analysis of nitrate process condition (0 = in control, 1 = out of control, 2 = inconclusive)

Data pair    Shift size
             1.5 sigma    2.0 sigma    2.5 sigma
1            2            2            0
2            2            0            0
3            0            0            0
4            0            0            0
5            0            0            0
6            0            0            0
7            0            0            0
8            2            0            0
9            0            0            0
10           2            2            2
11           2            0            0
12           2            0            0
13           2            2            0
14           2            0            0
15           0            0            0

To further illustrate the ability of the network to recognize shifts, we induced shifts in the actual nitrate data and presented those data to the neural network trained to recognize shifts of 1.5 standard deviations. Shifts of 1, 1.5, 2, and 2.5 standard deviations were added between each pair of nitrate observations. The trained neural network identified all points as either shifted or inconclusive for the 1.5, 2, and 2.5 sigma shifts (Table 7). The network identified the largest shift (2.5 sigma) in all instances and either identified the shift or called for additional sampling with the 1.5 and 2 sigma shifts. The network completely missed only one observation (Observation 4) with the smallest shift of 1 sigma. This analysis further illustrates the ability of the augmented neural network to enable environmental process managers to identify shifts in an environmental data stream.

Table 7
Neural network analysis of nitrate process condition with induced shifts (0 = in control, 1 = out of control, 2 = inconclusive)

Data pair    Shift size
             1 sigma    1.5 sigma    2.0 sigma    2.5 sigma
1            1          1            1            1
2            2          1            1            1
3            2          2            2            1
4            0          2            2            1
5            2          2            1            1
6            2          2            2            1
7            2          2            1            1
8            1          1            1            1
9            2          2            2            1
10           1          1            1            1
11           1          1            1            1
12           2          1            1            1
13           1          1            1            1
14           2          1            1            1
15           2          2            1            1

A trained neural network model would be used in real time to monitor future values of the river nitrate parameter. No action would be taken by the environmental manager when the neural network classifies the process as in control. Additional sampling would be conducted when an inconclusive classification is made by the neural network. When the augmented neural network technique identifies the presence of a shift (or special cause), the process manager would analyze the process so that the source of the special variation could be identified and appropriate action taken. Some potential causes for these out of control conditions include malfunction (mechanical, electrical, biological) of a treatment component, changes in inflow (volume or strength) to the treatment process, or a catastrophic pollution event (e.g., chemical spill) into a stream or river. The development of such a neural network tool provides a realistic technique for monitoring environmental parameters.
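The monitoring rule described above maps each of the three network outputs to a management response. A minimal sketch of that mapping, applied to the Table 6 classifications, might look like the following. The names `ProcessState`, `recommended_action`, and `flagged_pairs` are hypothetical and the action strings paraphrase the text; the classification codes are those reported in Table 6.

```python
from enum import IntEnum


class ProcessState(IntEnum):
    """Classification codes used in Tables 6 and 7."""
    IN_CONTROL = 0
    OUT_OF_CONTROL = 1
    INCONCLUSIVE = 2


# Responses paraphrased from the text; the wording is illustrative.
ACTIONS = {
    ProcessState.IN_CONTROL: "no action",
    ProcessState.OUT_OF_CONTROL: "analyze the process for a special cause",
    ProcessState.INCONCLUSIVE: "collect additional samples",
}


def recommended_action(code):
    """Map a classification code (0, 1, or 2) to the suggested response."""
    return ACTIONS[ProcessState(code)]


def flagged_pairs(codes):
    """Return the 1-based data pair numbers not classified as in control."""
    return [
        i for i, code in enumerate(codes, start=1)
        if ProcessState(code) != ProcessState.IN_CONTROL
    ]


# Classifications reported in Table 6, keyed by the shift size each network targets.
TABLE_6 = {
    "1.5 sigma": [2, 2, 0, 0, 0, 0, 0, 2, 0, 2, 2, 2, 2, 2, 0],
    "2.0 sigma": [2, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 2, 0, 0],
    "2.5 sigma": [0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0],
}

for network, codes in TABLE_6.items():
    for pair in flagged_pairs(codes):
        print(network, "data pair", pair, "->", recommended_action(codes[pair - 1]))
```

Run on the Table 6 values, this flags eight data pairs for the 1.5 sigma network and only data pair 10 for the 2.5 sigma network, all with the "collect additional samples" response, which matches the discussion in the text.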

4. Summary and conclusions

An augmented neural network classification approach was used to identify shifts in autocorrelated data, with particular application to environmental monitoring data. The classification approach includes an initial geometric pre-classification algorithm as part of the training process for neural networks. The pre-classification step improves the capability of the trained network to identify shifts in correlated process data by identifying potential areas of overlap for the classification scheme. Mapping the overlap area allows a signal to be generated that calls for resampling of the process to help determine the process state of control.

The augmented neural network classification system was applied to a set of environmental data, specifically nitrate concentrations [4]. The data were standardized, and an AR(1) time series model with a correlation coefficient of 0.7826 was identified as an appropriate time series model. A data set consisting of bivariate data points representing shifted and non-shifted data was simulated based on the nitrate AR(1) model. This data set was used for training an augmented neural network for monitoring nitrate concentration levels. Networks were trained to detect shifts of 1.5, 2, and 2.5 standard deviations in the process mean. Each network achieved a testing accuracy of over 99% during the training process. The actual data from Burt et al. [4] were used for testing the trained neural network model.

Excellent classification results were obtained on AR(1) time series with high autocorrelation coefficients using the augmented neural network classification system for 2, 1.5, and 1 sigma shifts with minimally increased sampling efforts. As expected, fewer resamples were required at higher correlation levels, and the number of resamples required increased as the shift size decreased. The neural network classification approach was able to identify shifts in a process with accuracies ranging from 94.10% to 99.76% for the higher correlation coefficients of 0.7 and 0.9. Correct classification rates for a correlation coefficient of 0.5 with shifts of 1.5 and 2 sigma ranged from 96.39% to 99.12%. Only the one sigma shift with a 0.5 correlation coefficient produced less satisfactory classification rates of 85.24% and 91.98%.

This capability represents a significant improvement over existing methods and would greatly increase an environmental manager's ability to successfully monitor and control an autocorrelated process. The augmented neural network methodology would be expected to be applicable to data streams modeled by time series models other than the basic AR(1), since there are no underlying assumptions within the neural network methodology that depend on the specific time series model.
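For reference, the basic AR(1) model with a step shift in the mean can be written as below. The notation is an assumption of this sketch and is not taken from the original text: \(\phi\) is the AR(1) coefficient, \(\varepsilon_t\) the white-noise innovation, \(\delta\) the shift size in process standard deviations, and \(\tau\) the time at which the shift occurs.

```latex
% Assumed notation: x_t is the standardized observation at time t.
x_t = \mu_t + \phi\,(x_{t-1} - \mu_{t-1}) + \varepsilon_t,
\qquad \varepsilon_t \sim N(0, \sigma_\varepsilon^2), \quad |\phi| < 1

% Stationary variance and lag-k autocorrelation of the in-control process
\sigma_x^2 = \frac{\sigma_\varepsilon^2}{1 - \phi^2},
\qquad \operatorname{corr}(x_t, x_{t-k}) = \phi^{k}

% Step shift of \delta process standard deviations at time \tau
\mu_t =
\begin{cases}
0, & t < \tau \\
\delta\,\sigma_x, & t \ge \tau
\end{cases}
```

For the standardized nitrate series (\(\sigma_x = 1\), \(\phi = 0.7826\)), these relations give an innovation variance of \(1 - 0.7826^2 \approx 0.3875\).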
Neural network models would be developed, trained, and tested for each additional type of time series model used to describe an environmental process. The augmented neural network methodology offers a viable approach for process monitoring in the presence of autocorrelation. This is particularly relevant in water quality monitoring, as many water quality parameters would be expected to be autocorrelated.

References

[1] L.C. Alwan, H.V. Roberts, Time series modeling for statistical process control, in: J.B. Keats, N.F. Hubele (Eds.), Automated Manufacturing, Marcel Dekker, Inc., New York, 1989.
[2] R.L. Bras, I. Rodriguez-Iturbe, Random Functions and Hydrology, Addison-Wesley Publishing Company, Reading, MA, 1984.
[3] T.P. Burt, A.L. Heathwaite, T. Trudgill, Nitrate: Processes, Patterns, and Management, John Wiley & Sons, Chichester, 1993.
[4] T.P. Burt, B.P. Arkell, S.T. Trudgill, D.E. Walling, Stream nitrate levels in a small catchment in south west England over a period of 15 years, Hydrological Processes 2 (1988) 267–284.
[5] C.C. Chiu, M. Chen, K. Lee, Shifts recognition in correlated process data using a neural network, International Journal of Systems Science 32 (2) (2001) 137–143.
[6] C.J. Corbett, J.N. Pan, Evaluating environmental performance using statistical process control techniques, European Journal of Operational Research 139 (1) (2002) 68–83.
[7] D.F. Cook, C.C. Chiu, Using radial basis function neural networks to recognize shifts in correlated manufacturing process parameters, IIE Transactions 30 (3) (1998) 227–234.
[8] W.E. Deming, Out of the Crisis, Massachusetts Institute of Technology Center for Advanced Engineering Study, Cambridge, MA, 1986.
[9] J.R. Evans, W.M. Lindsay, The Management and Control of Quality, Southwestern Thomson Learning, Cincinnati, OH, 2002.
[10] J.A. Freeman, D.M. Skapura, Neural Networks: Algorithms, Applications, and Programming Techniques, Addison-Wesley, Reading, MA, 1991.
[11] S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice-Hall, Englewood Cliffs, NJ, 1998.
[12] T.J. Harris, W.H. Ross, Statistical process control procedures for correlated observations, The Canadian Journal of Chemical Engineering 69 (1991) 48–57.
[13] K.W. Hipel, A.I. McLeod, Developments in Water Science: Time Series Modelling of Water Resources and Environmental Systems, Elsevier, Amsterdam, 1994.
[14] K.W. Hipel, A.I. McLeod, R.R. Weiler, Resources Bulletin 24 (3) (1988) 533–544.
[15] E.S. Ho, S.I. Chang, Integrated neural network approach for simultaneous monitoring of process mean and variance shifts - a comparative study, International Journal of Production Research 37 (8) (1991) 1881–1901.
[16] H.B. Hwarng, N.F. Hubele, X control chart pattern identification through efficient off-line neural network training, IIE Transactions 25 (3) (1993) 27–39.
[17] H.B. Hwarng, N.F. Hubele, X-bar chart pattern recognition using neural nets, ASQC Quality Congress Transactions, Milwaukee, 1991, pp. 884–889.
[18] J.M. Juran, A.B. Godfrey, Juran's Quality Handbook, Irwin McGraw-Hill, New York, NY, 1999.
[19] C.N. Madu, Managing Green Technologies for Global Competitiveness, Quorum Books, Westport, CT, 1996.
[20] D. Maurer, M. Mengel, G. Robertson, T. Gerlinger, A. Lissner, Statistical process control in sediment pollutant analysis, Environmental Pollution 104 (1) (1999) 21–29.
[21] A.I. McLeod, K.W. Hipel, F. Comancho, Trend assessment of water quality time series, Water Resources Bulletin 19 (4) (1983) 537–547.
[22] NeuralWorks, Predict, Technical Publications, NeuralWare, Inc., 202 Park West Drive, Pittsburgh, PA, 2002.
[23] G.A. Pugh, A comparison of neural networks to SPC charts, Computers and Industrial Engineering 21 (1–4) (1991) 253–255.
[24] C. ReVelle, Research challenges in environmental management, European Journal of Operational Research 121 (1) (2000) 218–231.
[25] M.R. Reynolds, J.C. Arnold, J.W. Baik, Variable sampling interval x charts in the presence of correlation, Journal of Quality Technology 28 (1996) 12–30.
[26] A. Ruis, I. Ruissanchez, M.P. Callao, F.X. Ruis, Reliability of analytical systems: Use of control charts, time series models and recurrent neural networks (RNN), Chemometrics and Intelligent Laboratory Systems 40 (1998) 1–18.
[27] L.N. VanBrackle, M.R. Reynolds, EWMA and CUSUM control charts in the presence of correlation, Communications in Statistics - Simulation and Computation 26 (1997) 979–1008.
[28] D.G. Wardell, H. Moskowitz, R.D. Plante, Run length distributions of special-cause control charts for correlated processes, Technometrics 36 (1994) 3–17.
[29] D.G. Wardell, H. Moskowitz, R.D. Plante, Control charts in the presence of data correlation, Management Science 38 (8) (1992) 1084–1105.
[30] D.A. West, P.M. Mangiameli, S.K. Chen, Control of complex manufacturing processes: A comparison of SPC methods with a radial basis function neural network, Omega, International Journal of Management Science 27 (1991) 349–362.
[31] H. Wittenberg, M. Sivapalan, Watershed groundwater balance estimation using streamflow recession analysis and baseflow separation, Journal of Hydrology 219 (1/2) (1999) 20–33.
[32] F. Worrall, T.P. Burt, A univariate model of river water nitrate time series, Journal of Hydrology 214 (1999) 74–90.
[33] S.W. Zimmerman, M.R. Dardeau, G.F. Crozier, B. Wagstaff, The second battle of Mobile Bay - Using SPC to control the quality of water monitoring, Computers and Industrial Engineering 31 (1/2) (1996) 257–260.
[34] C.W. Zobel, D.F. Cook, Q.J. Nottingham, An augmented neural network classification approach to detecting mean shifts in correlated manufacturing process parameters, International Journal of Production Research 42 (4) (2004) 741–758.
[35] F. Zorriassatine, J.D.T. Tannock, Review of neural networks for statistical process control, Journal of Intelligent Manufacturing 9 (3) (1998) 209–224.
[36] S. Zou, Y.S. Yu, A dynamic factor model for multivariate water quality time series with trends, Journal of Hydrology 178 (1/4) (1996) 381–400.
