European Journal of Operational Research 174 (2006) 1631–1642 www.elsevier.com/locate/ejor
Production, Manufacturing and Logistics
Environmental statistical process control using an augmented neural network classification approach
Deborah F. Cook a,*, Christopher W. Zobel a, Mary Leigh Wolfe b
a Department of Business Information Technology (0235), Virginia Tech., 1007 Pamplin Hall, Blacksburg, VA 24061, United States
b Department of Biological Systems Engineering, Virginia Tech., 305 Seitz Hall, Blacksburg, VA 24061, United States
Received 1 December 2003; accepted 27 April 2005
Available online 19 July 2005
Abstract
Shifts in the values of monitored environmental parameters can help to indicate changes in an underlying system. For example, increased concentrations of copper in water discharged from a manufacturing facility might indicate a problem in the wastewater treatment process. The ability to identify such shifts can lead to early detection of problems and appropriate remedial action, thus reducing the risk of long-term consequences. Statistical process control (SPC) techniques have traditionally been used to identify when process parameters have shifted away from their nominal values. In situations where there are correlations among the observed outputs of the process, however, as in many environmental processes, the underlying assumptions of SPC are violated and alternative approaches such as neural networks become necessary. A neural network approach that incorporates a geometric data preprocessing algorithm and identifies the need for increased sampling of observations was applied to facilitate early detection of shifts in autocorrelated environmental process parameters. Utilization of the preprocessing algorithm and the increased sampling technique enabled the neural network to accurately identify the process state of control. The algorithm was able to identify shifts in the highly correlated process parameters with accuracies ranging from 96.4% to 99.8%.
© 2005 Elsevier B.V. All rights reserved.
Keywords: Environmental quality; Neural networks; Statistical process control; Correlation
* Corresponding author. Tel.: +1 540 231 4847; fax: +1 540 231 3752. E-mail addresses: [email protected] (D.F. Cook), czobel@vt.edu (C.W. Zobel), [email protected] (M.L. Wolfe).
0377-2217/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.ejor.2005.04.035
1. Introduction
Environmental researchers and practitioners are often interested in monitoring the behavior of an environmental parameter over time. The goal of this monitoring is to identify when a change,
or shift, has occurred in an environmental parameter of interest, such as dissolved oxygen, nitrate, or temperature. Knowledge of a change or shift in the value of the parameter allows appropriate action to be taken. For example, increased concentrations of copper in water discharged from a manufacturing facility might indicate a problem in the wastewater treatment process. Increased numbers of fecal bacteria in a stream near a dairy farm might indicate a failure in a manure storage tank. The ability to identify changes or shifts in the environmental parameter allows for early detection of problems. Statistical techniques are typically used to identify changes or shifts in the values of parameters. Statistical process control (SPC) is the most widely used technique, primarily in manufacturing. The SPC process identifies whether an observed output or measurement from a system represents a process that is in control or one that has shifted out of control. While variation is present in virtually all processes, only natural, or random, variation is present in an in control process. In contrast, an out of control condition signals the presence of assignable or special cause variation. This type of variation must be identified and eliminated to return the process to a state of statistical control. There is currently a high level of interest in the use of SPC techniques in environmental data management [6,19,20,24,33]. Corbett and Pan [6] reported great potential for the application of industrial statistics, particularly SPC control charts, in environmental management. They described the development and application of control charts to analyze nitrate blank measurements and nitrate concentration data. These control charts allowed the identification of the presence of a special cause, signaling that the process from which samples were derived should be investigated. Maurer et al. 
[20] used SPC to identify long- and short-term trends and outliers of sediment cadmium concentration data. Zimmerman et al. [33] used SPC to examine water quality data sets collected from the Mobile Bay and illustrated capabilities of SPC in identifying special causes such as measurement errors and spills.
Traditional SPC techniques have been shown to be quite effective in discrete manufacturing operations practice [8,18]. However, these techniques are typically not applicable in situations where autocorrelation is present in a data set. The presence of autocorrelation indicates that the value of a parameter depends upon its previous values, which violates a basic assumption used to develop the discrete SPC techniques: statistical independence. When autocorrelation is present, less information is gained from an additional observation than would be gained if that observation were independent of previous observations.
Autocorrelation is present in many environmental data streams; various researchers have identified environmental parameters as time series [14,21,31,32,36]. Hipel and McLeod [13] provide an extensive description of time series modeling of water resources and other environmental systems. The presence of autocorrelation within the data stream requires that alternative approaches to identifying shifts in the process mean be considered.
Complex statistical techniques exist for SPC in the presence of autocorrelation. These techniques have seen very limited application and no technique has demonstrated high performance in its ability to detect shifts. Techniques that do not require typical underlying statistical assumptions, such as data independence and data normality, might be more widely applied. Recently, Zobel et al. [34] applied neural network theory to develop such a technique.
The overall goal of this study was to evaluate the application of the neural network based technique to environmental processes. The specific objective was to evaluate the capability of the technique to identify shifts in the mean of autocorrelated environmental data streams.
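The dependence of each observation on its predecessor, as described above, can be checked numerically with the sample lag-1 autocorrelation. The following is a minimal sketch for illustration only: the helper names `lag1_autocorrelation` and `simulate_ar1`, the first-order autoregressive form of the test series, the coefficient 0.9, and the series length of 5000 are assumptions chosen for the example, not values taken from a monitored process.

```python
import random

def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation of a sequence of numbers."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    num = sum(d[t] * d[t - 1] for t in range(1, n))
    den = sum(v * v for v in d)
    return num / den

def simulate_ar1(phi, n, rng):
    """Illustrative zero-mean autoregressive series with unit marginal variance."""
    sd = (1.0 - phi ** 2) ** 0.5          # innovation standard deviation
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, sd))
    return x

rng = random.Random(0)                     # fixed seed so the sketch is repeatable
independent = [rng.gauss(0.0, 1.0) for _ in range(5000)]
correlated = simulate_ar1(0.9, 5000, rng)

r_independent = lag1_autocorrelation(independent)   # expected to be near 0
r_correlated = lag1_autocorrelation(correlated)     # expected to be near 0.9
print(round(r_independent, 2), round(r_correlated, 2))
```

A lag-1 value far from zero is a practical warning that control limits derived under the independence assumption may not behave as intended.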
2. Techniques for analyzing correlated process data
The primary techniques available for analyzing correlated process data are statistical approaches. Recently, neural network models have been developed as an alternative approach to statistical techniques. The following sections describe applications of statistical approaches and neural networks to analyzing correlated process data.
2.1. Statistical techniques
Several authors [1,12,29] have recommended the use of time series modeling techniques when monitoring the mean of correlated process data. Wardell et al. [28] evaluated the performance of time series control charting techniques by determining the properties of an X chart of residuals for process data from several common time series models. The results of these studies showed that an X chart of residuals from an AR(1), AR(2), or ARMA(1,1) model performed poorly for most parameter values likely to occur with process control data. Specifically, an X chart required an average of 123–223 samples to identify a standardized shift of one standard deviation, for positive values of the autoregressive parameter of 0.5–0.9. Process measurements are often taken hourly, implying that an average of 123–223 hours could pass before the SPC time series control chart identified a shift in the mean value of a process parameter. Wardell et al. [28] also studied the run-length distribution of the special-cause control chart (X chart of the residuals) proposed by Alwan and Roberts [1] for correlated observations, given that the result of the assignable cause to be detected is a shift in the process mean.
Reynolds et al. [25] used a variable sampling interval (VSI) to monitor the mean in the presence of correlation. VanBrackle and Reynolds [27] evaluated the use of EWMA and CUSUM control charts for the process mean when the observations are correlated. They found that although correlation shortens the time required to detect small to moderate shifts, it also produces a higher false alarm rate (signaling an out of control process when the process is actually in control). False alarms can result in lack of confidence in the information, leading to process operators ignoring control chart signals. In addition, the time needed to detect shifts is actually lengthened for larger shifts under this approach.
No consistently high performing statistical methods have been developed to detect shifts in the process mean under the presence of autocorrelation. Corbett and Pan [6] pointed to the presence of correlation in environmental data streams and identified SPC for correlated data as a largely open research area to be investigated.
2.2. Neural network models
A neural network consists of a number of simple, highly interconnected processing elements or nodes and incorporates the ability to process information by a dynamic response of these nodes and their connections to external inputs. A primary advantage of neural network models for monitoring environmental data is that they do not make the assumptions of data normality and independence that underlie traditional SPC. Detailed descriptions of neural networks are available in Freeman and Skapura [10] and Haykin [11]. Neural networks have been applied to process control as an alternative to strictly statistical techniques. Much of the existing research on this use of neural networks has assumed statistical independence of the process data [15–17,23,35]. A more limited set of existing research, described in the following paragraphs, addresses the presence of correlation. Ruis et al. [26] used a counterpropagation recurrent neural network (RNN) and a backpropagation RNN for monitoring chemical measurement processes. The authors concluded that the counterpropagation RNN is better than the backpropagation RNN when analyzing correlated data. West et al. [30] investigated the ability of radial basis function neural networks to monitor and control complex manufacturing processes that exhibit both auto- and cross-correlation and showed that their method is superior to the classical SPC methods such as the multivariate Shewhart and the multivariate EWMA. Cook and Chiu [7] and Chiu et al. [5] were successful in separating data that exhibited a shift of one, two, and three standard deviations from non-shifted data for simulated data. The neural networks outperformed the traditional SPC control charts as in the previously cited neural network studies. 
Cook and Chiu [7] trained radial basis function (RBF) neural networks to identify mean shifts in highly correlated parameters (φ = 0.9) by using training data sets consisting of pairs of consecutive observations of the parameter values: $(x_{t-1}, x_t)$. Each bivariate data point was drawn from one of two possible subsets, non-shifted = 0 or shifted = 1, according to whether or not a mean shift occurred in the time interval between the initial and following observation. Their simulations assumed that if a shift occurred, it occurred between the first and second points of the bivariate pair. Consequently, the resulting bivariate pairs were either (non-shifted, shifted) or (non-shifted, non-shifted). Their goal was to determine if a neural network could be trained to recognize the occurrence of a process shift. After suitable training, the RBF neural networks of Cook and Chiu [7] were able to identify shifts of 1.5 and 2 standard deviations with a fair degree of consistency.
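The bivariate input format described above, pairs of consecutive observations, is straightforward to build from a monitored series. The sketch below is illustrative only: the function name `consecutive_pairs` and the sample readings are hypothetical and do not come from the cited studies.

```python
def consecutive_pairs(series):
    """Return the list of (previous, current) observation pairs."""
    values = list(series)
    return list(zip(values[:-1], values[1:]))

# Hypothetical standardized readings, used only to show the pairing.
readings = [0.2, 0.5, 0.4, 1.9, 2.1]
pairs = consecutive_pairs(readings)
print(pairs)
```

A series of n observations yields n - 1 overlapping pairs, each of which would be presented to the network as a single two-dimensional input.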
2.3. Combined classification/neural network approach
One of the shortcomings of Cook and Chiu's [7] research was that all chances of detection were lost if the shift was not detected immediately. Zobel et al. [34] developed a neural network based approach that overcomes this shortcoming. Their approach initially pre-classifies neural network training data into mutually exclusive subclasses of observations. When a large shift in the mean occurs, as shown in Fig. 1(a), the two subsets of shifted and non-shifted data will typically occupy fairly distinct regions within the output space; consequently, the neural networks are able to distinguish between the two types of observations with a high degree of certainty. When a smaller shift occurs, however, as displayed in Fig. 1(b), there will be a great deal of overlap between the two subsets, and the networks have a more difficult time accurately classifying individual observations. An observed point falling in the overlap area might signal the need for the collection of an additional sample or samples to accurately classify the process condition. Accurate identification of the overlapping region and resampling increase the chances of detecting a shift, if one occurs, between time (t − 1) and time (t).
Fig. 1. Illustration of shifts in the process mean (in control and out of control observations). (a) Large mean shift (2 sigma); (b) small mean shift (0.5 sigma).
In order to determine the region of overlap between the in control and the out of control observations, Zobel et al. [34] developed an algorithm that identifies, for each class of points, the smallest convex polygon that contains all of the observations within that class. The algorithm creates three distinct classes of observations based upon two initial sets (Set 1 and Set 2) of possibly overlapping bivariate data points. Individual observations are identified as belonging to one of the original two classes of points if they lie within one of these polygons but not within the other (Classes A and B, respectively), and any observations that lie within the intersection of the two polygons are classified as members of the new third class of points (Class C). Once the neural network is trained to classify bivariate data points into one of these three classes, it can be used to monitor an ongoing process to identify the presence or absence of a mean shift.
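The pre-classification step described above can be sketched in a few lines. This is a minimal illustration and not the implementation of Zobel et al. [34]: it assumes Andrew's monotone chain method for building each convex polygon (the hull algorithm actually used is not stated here), it assumes each set has at least three non-collinear points, and the names `convex_hull`, `inside`, and `preclassify` are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Smallest convex polygon containing the points, counter-clockwise."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def inside(hull, p):
    """True if p is inside or on the boundary of a counter-clockwise hull."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))

def preclassify(set1, set2):
    """Label every point A (polygon 1 only), B (polygon 2 only), or C (both)."""
    h1, h2 = convex_hull(set1), convex_hull(set2)
    labels = []
    for p in list(set1) + list(set2):
        in1, in2 = inside(h1, p), inside(h2, p)
        labels.append("C" if in1 and in2 else ("A" if in1 else "B"))
    return labels

# Hypothetical toy data: two squares that overlap in one corner region.
set1 = [(0, 0), (4, 0), (4, 4), (0, 4), (1, 1)]
set2 = [(3, 3), (7, 3), (7, 7), (3, 7), (6, 6)]
labels = preclassify(set1, set2)
print(labels)
```

Because every point lies inside the polygon of its own set, the only question for each point is whether it also falls inside the other polygon, which is what moves it into Class C.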
If the trained neural network model identifies a new observation as falling into Class C, immediate increased sampling may be used to clarify the true state of the process. Such an increased sampling approach could be feasible in environmental monitoring applications such as wastewater discharges from manufacturing, food processing, and other industrial facilities. This neural network classification system showed significant improvement over current statistical and analytical techniques for identifying step shifts in correlated manufacturing processes. Zobel et al. [34] automated the augmented neural network technique in Excel, incorporating both the initial pre-classification algorithm and an appropriate neural network model for identifying the process behavior. The system helps to automate the data analysis and serves to provide a process manager with effective advice as to when additional sampling is required or when a special cause has been introduced into the system.
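The increased sampling rule for Class C observations can be expressed as a short loop. The sketch below is a hedged illustration: `monitor_pair`, the `max_resamples` safety cap, and the threshold-based `toy_classifier` are all hypothetical stand-ins, and in practice the trained neural network would play the role of the classifier.

```python
def monitor_pair(classify, first, second, resample, max_resamples=50):
    """Classify a pair; while the result is Class C, redraw the second point.

    classify(first, second) must return "A", "B", or "C".
    resample() must return a fresh observation of the second point.
    max_resamples is an assumed safety cap, not part of the described method.
    """
    label = classify(first, second)
    extra = 0
    while label == "C" and extra < max_resamples:
        second = resample()
        extra += 1
        label = classify(first, second)
    return label, extra

def toy_classifier(first, second):
    """Hypothetical stand-in for the trained network (simple thresholds)."""
    change = second - first
    if change < 0.5:
        return "A"      # treated as non-shifted
    if change > 1.5:
        return "B"      # treated as shifted
    return "C"          # overlap region: inconclusive

# Hypothetical stream of repeat measurements taken after an inconclusive result.
draws = iter([1.0, 0.9, 2.0])
label, extra = monitor_pair(toy_classifier, 0.0, 1.0, lambda: next(draws))
print(label, extra)
```

The returned count of extra samples corresponds to the quantity a process manager would care about when judging the cost of intensive sampling.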
3. Application to environmental process data
Application of the augmented neural network procedure to environmental data monitoring consisted of two steps. The first step was to apply the network procedure to simulated correlated data sets, representative of environmental parameter values. Simulated data were used so that known shifts could be induced in the simulated data stream, allowing quantification of the ability of the neural network model to recognize shifts. An Excel-based system was developed to simulate the time series data used in the neural network training. The second part of the study included application of the network procedure to measured in-stream nitrate concentrations presented by Burt et al. [4]. The first part of the evaluation process, as shown in the following sections, clearly demonstrated that the augmented neural network procedure can accurately identify mean shifts in correlated data. The second part of the evaluation describes and demonstrates the potential for use in practice.
NeuralWare's Predict [22], an add-in for Microsoft Excel, was chosen for the development of neural networks for environmental SPC. Predict is an automated tool for network development and utilizes a backpropagation neural network that incorporates an adaptive gradient learning rule with a weight decay factor that is tuned automatically by the Predict software. In general, neural networks develop a functional mapping of the relationship between input and output parameters based on data examples provided to the network training algorithm. The particular backpropagation neural networks used in this research consisted of an input layer, a single hidden layer, and an output layer. The hidden layer is essential in the computation to develop a representation of the relationship between the input and output parameters. The number of nodes in the hidden layer was determined by Predict using a constructive method known as cascade learning. In cascade learning, hidden nodes are generally added one at a time; new hidden nodes have connections with input nodes as well as with the previously established hidden nodes; and construction is stopped when performance shows no further improvement [22].
3.1. Testing and results with simulated training data sets
To initially develop and test the neural network models, training data sets, consisting of synthetic bivariate data points representing shifted and non-shifted data, were generated using Microsoft Excel. Each training data set consisted of 2500 data vectors. The data vectors consisted of sequential pairs of (non-shifted, non-shifted) or (non-shifted, shifted) points and were simulated observations from an AR(1) time series with varying correlation coefficient ($\phi$) values. An AR(1) process can be represented by the following equation:
$$X_t = \mu + \phi (X_{t-1} - \mu) + \varepsilon_t, \qquad (1)$$
where $X_t$ is the value of the time series at time $t$, $\mu$ is the mean of the data series, $X_{t-1}$ is the value of the time series at time $(t-1)$, $\varepsilon_t$ is a normal, independently distributed error term, and $\phi$ is the
autoregressive coefficient restricted to lie between $-1$ and $1$. A standardized version of Eq. (1) with a range of correlation coefficients ($\phi$ = 0.5, 0.7, and 0.9) and the appropriate value of $\mu$ (non-shifted or shifted with shifts of 1, 1.5, and 2 sigma) was used to generate the training data sets. In the case where no shift occurred between time $(t-1)$ and time $(t)$, each pair of observations was generated using:
$$X_{t-1} = N(0, 1) \qquad (2a)$$
$$X_t = \phi X_{t-1} + \varepsilon_t \qquad (2b)$$
where $\varepsilon_t = N(0, \sigma_\varepsilon)$. The value of $\sigma_\varepsilon$ is determined based on the relationship with the standard deviation of the data, $\sigma$, as shown in Eq. (3):
$$\sigma = \sqrt{\frac{\sigma_\varepsilon^2}{1 - \phi^2}} \qquad (3)$$
Similarly, in the case where a shift had occurred between time $(t-1)$ and time $(t)$, each individual data pair in the training and testing sets was simulated as:
$$X_{t-1} = N(0, 1) \qquad (4a)$$
$$X_t = \mu + \phi (X_{t-1} - \mu) + \varepsilon_t \qquad (4b)$$
where $\varepsilon_t$ is determined as above and $\mu$ represents the shift size (1, 1.5, or 2 sigma). The data sets consisted of consecutive time series observations with correlation coefficients of 0.5, 0.7, and 0.9, and with mean shifts of 1, 1.5, and 2 sigma. A total of 90 training data sets were generated, specifically, 10 replications of nine combinations of correlation coefficient and mean shift.
The pre-classification algorithm was first applied to each of the original bivariate data sets to generate corresponding new data sets, each consisting of three mutually exclusive subsets of observations. Backpropagation neural networks were developed and trained to accurately classify data samples as belonging to a shifted, non-shifted, or indeterminate process. Predict automatically selected training, testing, and validation sets from the 2500 data vectors provided in each training data set. Thirty percent of the data set (or 750 data vectors) was reserved and used for testing model performance during training. The weights connecting the nodes in the
network were adjusted during training to minimize the error between the network output and the target output. Additionally, processing nodes were added incrementally so that network error was reduced. This technique is known as cascade correlation [22]. The performance of the network was periodically evaluated during training, using the 750 reserved data vectors, to ensure that the network did not overfit the training data. Table 1 summarizes the performance of the trained neural networks with respect to accurate classification of the 750 test vectors in the training data set. The testing results demonstrate that the new neural network models are capable of accurately classifying the data into the three subclasses, with accuracy rates ranging from 95.72% to 99.80%.

Table 1
Augmented neural network average correct classification rates (%) for three classes (A = shifted, B = non-shifted, C = overlap) with correlation coefficients (φ) of 0.9, 0.7, and 0.5

Correlation    Shift    Classification rate (std dev)
coefficient    size     Class A          Class B          Class C
0.9            1.0σ     98.80 (1.03)     98.80 (1.36)     97.02 (1.78)
               1.5σ     99.53 (0.394)    99.33 (0.752)    98.35 (1.129)
               2.0σ     98.58 (0.883)    98.42 (0.935)    99.30 (1.02)
0.7            1.0σ     98.13 (3.13)     99.33 (1.42)     96.30 (1.99)
               1.5σ     99.50 (0.650)    98.02 (2.254)    97.43 (0.763)
               2.0σ     99.07 (0.540)    99.22 (0.860)    98.46 (0.622)
0.5            1.0σ     98.44 (3.44)     97.02 (3.89)     95.72 (1.65)
               1.5σ     99.80 (0.632)    98.60 (2.42)     97.83 (1.10)
               2.0σ     99.44 (0.786)    99.55 (0.617)    97.10 (1.71)

Once each neural network was trained, examples of shifted and non-shifted data pairs were input into the automated simulation system and the trained neural network model was used to generate a predicted classification (Class A, B, or C) for each data pair. These examples were generated using Eqs. (2)–(4). Each trained network was tested with 10 example data sets, with each set containing 1000 data vectors. If the network-predicted classification of the current data pair was Class A (representing a non-shifted classification) or Class B (representing a shifted classification), classification was complete. If the predicted classification was Class C (representing the overlap area), then the simulated
process was put into an intensive sampling, or resampling, mode. Additional observations of the second of the two data points were then generated to simulate immediate resampling of the process, until a conclusive classification of non-shifted (Class A) or shifted (Class B) was finally generated. The predicted classification was compared to the actual state of the process and the results were compiled into summary statistics on the number of correct and incorrect classifications. Results are given in Table 2 for correlation coefficient values of 0.9, 0.7, and 0.5. In all three cases, the networks were able to do a very good job not only of identifying when a shift of 1 sigma, 1.5 sigma, or 2 sigma occurred, but also of identifying when no shift was present in the data. All of the classification rates for the 1.5 and 2 sigma shifts were above 96%, and overall classification accuracy improved as the size of the shift and the level of correlation increased. The results for the smaller 1 sigma shift were not as good as the others, yet they exceeded 91% accuracy in all but one instance.

Table 2
Augmented neural network average correct classification rates (%) with φ = 0.9, 0.7, and 0.5 for 10 simulated data sets

Data pairs                    Shift    Average correct classification rate (std dev)
                              size     φ = 0.9         φ = 0.7         φ = 0.5
(non-shifted, non-shifted)    1.0σ     98.61 (0.76)    94.19 (2.73)    85.24 (7.74)
                              1.5σ     99.54 (0.31)    98.20 (1.01)    96.39 (1.87)
                              2.0σ     99.72 (0.21)    99.24 (0.46)    98.42 (1.01)
(non-shifted, shifted)        1.0σ     98.85 (0.64)    95.80 (2.37)    91.98 (7.58)
                              1.5σ     99.54 (0.30)    98.75 (0.62)    98.00 (1.09)
                              2.0σ     99.76 (0.20)    99.42 (0.32)    99.12 (0.54)

Classification accuracy generally increased with an increase in the correlation coefficient of the underlying data. This behavior can be attributed to the relative amount of overlap between the two underlying subsets of observations in each test set. As illustrated by the examples given in Fig. 2, a lesser degree of correlation will typically lead to a much higher degree of overlap, because of the underlying shape of the associated "clouds" of data points. Since the overlap area is responsible for a majority of the classification error [34], one would expect an increase in that error when the relative size of the corresponding overlap becomes greater.
Fig. 2. Illustration of relative overlap associated with different correlation coefficients (in control, out of control, and inconclusive observations). (a) Correlation coefficient of 0.9 (1 sigma shift), 62.9% overlap; (b) correlation coefficient of 0.5 (1 sigma shift), 94.3% overlap.
The average number of additional samples required to achieve a correct classification for each level of correlation was also calculated for the test problems above. Table 3 gives the number of additional samples for φ = 0.9, and Tables 4 and 5 give the corresponding values for φ = 0.7 and φ = 0.5, respectively. In each case, the number of additional samples required to achieve a definitive classification of either in control or out of control increased as the size of the shift decreased. Similarly, the number of additional samples increased as the correlation coefficient decreased. Given an initial output of "inconclusive", a correlation coefficient of 0.9, and a shift of 2 sigma,
only slightly more than one additional sample was required to achieve a correct classification. In the most extreme case reported, the mean number of additional samples required to achieve a correct classification was less than 34. This compares very favorably to the mean of 123–223 samples reported in Wardell et al. [28] as needed to identify a shift using the X chart.

Table 3
Average number of resamples required to identify process shifts with φ = 0.9

Shift size    Mean additional samples (std dev)
              (non-shifted, non-shifted)    (non-shifted, shifted)
2.0σ          1.12 (0.06)                   1.18 (0.37)
1.5σ          2.88 (3.65)                   2.31 (2.10)
1.0σ          5.05 (1.04)                   4.57 (1.86)

Table 4
Average number of resamples required to identify process shifts with φ = 0.7

Shift size    Mean additional samples (std dev)
              (non-shifted, non-shifted)    (non-shifted, shifted)
2.0σ          2.90 (0.56)                   2.59 (0.82)
1.5σ          6.46 (1.59)                   5.91 (2.94)
1.0σ          17.20 (5.18)                  14.86 (7.73)

Table 5
Average number of resamples required to identify process shifts with φ = 0.5

Shift size    Mean additional samples (std dev)
              (non-shifted, non-shifted)    (non-shifted, shifted)
2.0σ          4.94 (1.28)                   4.17 (1.38)
1.5σ          11.55 (3.24)                  9.83 (6.18)
1.0σ          33.39 (23.44)                 29.90 (36.46)

3.2. Analysis of nitrate concentration data
As reported earlier, time series models have been developed for various hydrologic and environmental applications. The augmented neural network technique for identifying shifts in time series parameters was developed using the widely applied AR(1) time series model for illustrative purposes because it is the most basic and most
often applied model. Bras and Rodriguez-Iturbe [2] reported that the AR(1) model is the most popular model of time-series simulation and forecasting in hydrology and other fields. The simulations performed above illustrate the ability of the neural network to recognize mean shifts in the AR(1) model. The simulations were required to verify the performance of the augmented neural network methodology, as there is no way in practice to always know whether a shift has actually occurred. Laboratory experiments, in which water quality characteristics are maintained at specified levels and then altered using designed experimentation to generate shifted data, would be a way to further quantify the ability of the methodology to recognize shifts. The augmented neural network methodology can be applied in practice to monitor and assess shifts in water quality parameters. Nitrate concentrations in surface and ground water are a matter of concern world-wide [3,4] and are a potential application area for the augmented neural network technique. Additional water quality parameters that might be monitored include, for example, phosphorus, bacteria, dissolved oxygen, pH, and toxic chemicals. To illustrate the application of the augmented neural network technique, monthly median stream nitrate concentrations collected over a 15-year period were utilized. Burt et al. [4] described the data set collected in a small catchment in southwest England. The data were standardized and an AR(1) time series model with a correlation coefficient of 0.7826 was identified as an appropriate time series model. Again, a data set consisting of bivariate data points representing shifted and non-shifted data, was simulated based on the nitrate AR(1) model. The simulation of the data set was required in order to generate a sufficient number of training examples. It may be possible in some instances to collect enough data from the measured process to develop sufficient training examples. 
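The simulation of training pairs from the fitted nitrate model can be sketched by following Eqs. (2)–(4) directly. This is an illustrative sketch rather than the Excel-based system used in the study: the function names `innovation_sd` and `simulate_pairs`, the random seed, the shift size of 1.5 sigma, and the choice of 2500 pairs per class (borrowed from the training set size reported in Section 3.1) are assumptions made for the example.

```python
import math
import random

def innovation_sd(phi, sigma=1.0):
    """Solve Eq. (3) for the error standard deviation: sigma * sqrt(1 - phi^2)."""
    return sigma * math.sqrt(1.0 - phi ** 2)

def simulate_pairs(phi, shift, n, rng):
    """Draw n pairs (X_{t-1}, X_t) for standardized data.

    With shift = 0 the update reduces to Eq. (2b); otherwise it is Eq. (4b)
    with mu equal to the shift size in standard deviations.
    """
    sd = innovation_sd(phi)
    pairs = []
    for _ in range(n):
        x_prev = rng.gauss(0.0, 1.0)                                # Eqs. (2a)/(4a)
        x_next = shift + phi * (x_prev - shift) + rng.gauss(0.0, sd)
        pairs.append((x_prev, x_next))
    return pairs

PHI_NITRATE = 0.7826          # AR(1) coefficient reported for the nitrate data
rng = random.Random(42)       # assumed seed, for repeatability only
non_shifted = simulate_pairs(PHI_NITRATE, 0.0, 2500, rng)
shifted = simulate_pairs(PHI_NITRATE, 1.5, 2500, rng)

mean_non_shifted = sum(p[1] for p in non_shifted) / len(non_shifted)
mean_shifted = sum(p[1] for p in shifted) / len(shifted)
print(round(innovation_sd(PHI_NITRATE), 4))
```

The two lists would then be passed through the pre-classification step before being used as training examples.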
In other cases, simulation of data from a statistical model or from a computer model of a particular process may be required to provide sufficient training examples.
The data set simulated from the nitrate AR(1) model was used for training an augmented neural network for monitoring nitrate concentration levels. Networks were trained to detect shifts of 1.5, 2, and 2.5 standard deviations in the process mean. Each network had a testing accuracy of over 99% in the training process. The actual data from Burt et al. [4] were used for testing the trained neural network model. Since the nitrate data were not analyzed in real time, it is impossible to know if a process mean shift occurred. However, the use of the actual data illustrates how the process could be used in real time, and the possibility of shifts in the historical data can be analyzed. Table 6 contains the neural network identification of the process condition (in control, out of control, inconclusive). No out of control conditions were identified by the three networks.
In practice, an environmental manager would determine what size shift would be considered significant. In manufacturing processes, managers often wish to identify shifts of 1.5 standard deviations or more [9]. Identification of smaller shifts is not attempted as a variation of ±1.5 sigma is considered acceptable. It is likely that 1.5 sigma is the smallest variation an environmental process manager might try to identify, and realistically many process managers would start at a much higher shift level. The nitrate data were analyzed
using three different shift levels (1.5, 2.0, and 2.5) to evaluate the sensitivity of the network. As expected, the 1.5 sigma network was the most sensitive as a number (8 of 15) of ‘‘inconclusive’’ classifications were generated. A classification of ‘‘inconclusive’’ signals the possibility of a shift and signals the need for additional sampling to determine whether a shift occurred. The 2 sigma and 2.5 sigma networks were progressively less sensitive, with the 2.5 sigma network identifying no shifts in any observations except Observation 10 (inconclusive). If the 2.5 sigma network was being applied in practice, additional sampling would be called for at Observation 10. All other observations showed that the process was in control and that no special causes were present. To further illustrate the ability of the network to recognize shifts, we induced shifts in the actual nitrate data and presented that data to the neural network trained to recognize shifts of 1.5 standard deviations. Shifts of 1, 1.5, 2, and 2.5 standard deviations were added between each pair of nitrate observations. The trained neural network identified all points as either shifted or inconclusive for the 1.5, 2, and 2.5 sigma shifts (Table 7). The network identified the largest shift (2.5 sigma) in all instances and either identified the shift or called
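Table 6
Neural network analysis of nitrate process condition (0 = in control, 1 = out of control, 2 = inconclusive)

| Data pair | Shift size 1.5 sigma | Shift size 2.0 sigma | Shift size 2.5 sigma |
|---|---|---|---|
| 1 | 2 | 2 | 0 |
| 2 | 2 | 0 | 0 |
| 3 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 |
| 5 | 0 | 0 | 0 |
| 6 | 0 | 0 | 0 |
| 7 | 0 | 0 | 0 |
| 8 | 2 | 0 | 0 |
| 9 | 0 | 0 | 0 |
| 10 | 2 | 2 | 2 |
| 11 | 2 | 0 | 0 |
| 12 | 2 | 0 | 0 |
| 13 | 2 | 2 | 0 |
| 14 | 2 | 0 | 0 |
| 15 | 0 | 0 | 0 |

Table 7
Neural network analysis of nitrate process condition with induced shifts (0 = in control, 1 = out of control, 2 = inconclusive)

| Data pair | Shift size 1 sigma | Shift size 1.5 sigma | Shift size 2.0 sigma | Shift size 2.5 sigma |
|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 1 |
| 2 | 2 | 1 | 1 | 1 |
| 3 | 2 | 2 | 2 | 1 |
| 4 | 0 | 2 | 2 | 1 |
| 5 | 2 | 2 | 1 | 1 |
| 6 | 2 | 2 | 2 | 1 |
| 7 | 2 | 2 | 1 | 1 |
| 8 | 1 | 1 | 1 | 1 |
| 9 | 2 | 2 | 2 | 1 |
| 10 | 1 | 1 | 1 | 1 |
| 11 | 1 | 1 | 1 | 1 |
| 12 | 2 | 1 | 1 | 1 |
| 13 | 1 | 1 | 1 | 1 |
| 14 | 2 | 1 | 1 | 1 |
| 15 | 2 | 2 | 1 | 1 |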
for additional sampling with the 1.5 and 2 sigma shifts. With the smallest shift of 1 sigma, the network completely missed only one observation (Observation 4). This analysis further illustrates the ability of the augmented neural network to help environmental process managers identify shifts in an environmental data stream.

A trained neural network model would be used in real time to monitor future values of the river nitrate parameter. No action would be taken by the environmental manager when the neural network classifies the process as in control. Additional sampling would be conducted when the neural network makes an inconclusive classification. When the augmented neural network identifies the presence of a shift (or special cause), the process manager would analyze the process so that the source of the special variation could be identified and appropriate action taken. Some potential causes of these out of control conditions include malfunction (mechanical, electrical, biological) of a treatment component, changes in inflow (volume or strength) to the treatment process, or a catastrophic pollution event (e.g., a chemical spill) into a stream or river. The development of such a neural network tool provides a realistic technique for monitoring environmental parameters.
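The three-way decision rule described above can be sketched in a few lines of Python. This is only an illustration: the classification codes follow Tables 6 and 7, but the function names and action labels are hypothetical and are not taken from the paper or from any software the authors used.

```python
from collections import Counter

# Classification codes as used in Tables 6 and 7.
IN_CONTROL, OUT_OF_CONTROL, INCONCLUSIVE = 0, 1, 2

# Hypothetical action labels; the mapping itself follows the text.
ACTIONS = {
    IN_CONTROL: "no action",
    OUT_OF_CONTROL: "investigate special cause",
    INCONCLUSIVE: "take additional sample",
}


def recommend_action(code):
    """Translate one network classification code into a manager action."""
    if code not in ACTIONS:
        raise ValueError(f"unknown classification code: {code}")
    return ACTIONS[code]


def summarize(codes):
    """Count how often each action is recommended for a run of classifications."""
    return dict(Counter(recommend_action(code) for code in codes))


# Classifications reported in Table 7 for the induced 1 sigma shift.
one_sigma = [1, 2, 2, 0, 2, 2, 2, 1, 2, 1, 1, 2, 1, 2, 2]

if __name__ == "__main__":
    for action, count in summarize(one_sigma).items():
        print(f"{action}: {count}")
```

Applied to the 1 sigma row of Table 7, the tally reproduces the single missed observation noted in the text, with the remaining fourteen data pairs either flagged as shifted or referred for additional sampling.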
4. Summary and conclusions

An augmented neural network classification approach was used to identify shifts in autocorrelated data, with particular application to environmental monitoring data. The classification approach includes an initial geometric pre-classification algorithm as part of the training process for neural networks. The pre-classification step improves the capability of the trained network to identify shifts in correlated process data by identifying potential areas of overlap for the classification scheme. The mapping of the overlap area allows for the generation of a signal that calls for a resampling of the process to help make the determination of the process state of control. The augmented neural network classification system was applied to a set of environmental data,
specifically, nitrate concentrations [4]. The data were standardized and an AR(1) time series model with a correlation coefficient of 0.7826 was identified as an appropriate time series model. A data set consisting of bivariate data points representing shifted and non-shifted data was simulated based on the nitrate AR(1) model. This data set was used for training an augmented neural network for monitoring nitrate concentration levels. Networks were trained to detect shifts of 1.5, 2, and 2.5 standard deviations in the process mean. Each network had a testing accuracy of over 99% in the training process. The actual data from Burt et al. [4] were used for testing the trained neural network model.

Excellent classification results were obtained on AR(1) time series with high autocorrelation coefficients using the augmented neural network classification system for 2, 1.5, and 1 sigma shifts with minimally increased sampling efforts. As expected, fewer resamples were required at higher correlation levels, and the number of resamples required increased as the shift size decreased. The neural network classification approach was able to identify shifts in a process with accuracies ranging from 94.10% to 99.76% for the higher correlation coefficients of 0.7 and 0.9. Correct classification rates for a correlation coefficient of 0.5 with shifts of 1.5 and 2 sigma ranged from 96.39% to 99.12%. Only the one sigma shift with a 0.5 correlation coefficient produced less satisfying classification rates of 85.24% and 91.98%. This capability represents a significant improvement over existing methods and would greatly increase an environmental manager's ability to successfully monitor and control an autocorrelated process.

The augmented neural network methodology would be expected to be applicable to data streams modeled by time series models other than the basic AR(1) since there are no underlying assumptions within the neural network methodology that depend on the specific time series model. 
Neural network models would be developed, trained, and tested for these additional types of time series models of environmental processes. The augmented neural network methodology offers a viable approach for process monitoring in the
presence of autocorrelation. This is particularly relevant in water quality monitoring, as many water quality parameters would be expected to be autocorrelated.
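As an illustration of the data simulation step summarized above, the following Python sketch generates a standardized AR(1) series with the reported coefficient of 0.7826 and forms bivariate (previous, current) points for the shifted and non-shifted classes. The paper does not spell out its simulation procedure in this section, so several details here are assumptions: the unit marginal variance, the way the shift is added to the second observation of each pair, the sample size, and all function names.

```python
import math
import random

PHI = 0.7826  # AR(1) coefficient reported for the standardized nitrate data


def simulate_ar1(n, phi=PHI, seed=0):
    """Simulate n values of a zero-mean AR(1) series with unit marginal variance."""
    rng = random.Random(seed)
    # Innovation standard deviation chosen so the stationary variance is 1 (assumption).
    innovation_sd = math.sqrt(1.0 - phi ** 2)
    x = rng.gauss(0.0, 1.0)  # start from the stationary distribution
    series = [x]
    for _ in range(n - 1):
        x = phi * x + rng.gauss(0.0, innovation_sd)
        series.append(x)
    return series


def make_pairs(series, shift=0.0):
    """Form (previous, current) points; add `shift` sigma to the current value.

    Assumption: a mean shift occurs between the two observations of a pair,
    so only the second coordinate moves. shift=0.0 gives the non-shifted class.
    """
    return [(prev, curr + shift) for prev, curr in zip(series, series[1:])]


series = simulate_ar1(1000)
in_control = make_pairs(series)           # label 0
shifted = make_pairs(series, shift=1.5)   # label 1, 1.5 sigma mean shift
training_set = [(p, 0) for p in in_control] + [(p, 1) for p in shifted]
```

Because the simulated series is standardized, a shift expressed in sigma units can be added directly; a separate training set would be generated in the same way for each shift size of interest.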
References

[1] L.C. Alwan, H.V. Roberts, Time series modeling for statistical process control, in: J.B. Keats, N.F. Hubele (Eds.), Automated Manufacturing, Marcel Dekker, Inc., New York, 1989.
[2] R.L. Bras, I. Rodriguez-Iturbe, Random Functions and Hydrology, Addison-Wesley Publishing Company, Reading, MA, 1984.
[3] T.P. Burt, A.L. Heathwaite, T. Trudgill, Nitrate: Processes, Patterns, and Management, John Wiley & Sons, Chichester, 1993.
[4] T.P. Burt, B.P. Arkell, S.T. Trudgill, D.E. Walling, Stream nitrate levels in a small catchment in south west England over a period of 15 years, Hydrological Processes 2 (1988) 267–284.
[5] C.C. Chiu, M. Chen, K. Lee, Shifts recognition in correlated process data using a neural network, International Journal of Systems Science 32 (2) (2001) 137–143.
[6] C.J. Corbett, J.N. Pan, Evaluating environmental performance using statistical process control techniques, European Journal of Operational Research 139 (1) (2002) 68–83.
[7] D.F. Cook, C.C. Chiu, Using radial basis function neural networks to recognize shifts in correlated manufacturing process parameters, IIE Transactions 30 (3) (1998) 227–234.
[8] W.E. Deming, Out of the Crisis, Massachusetts Institute of Technology Center for Advanced Engineering Study, Cambridge, MA, 1986.
[9] J.R. Evans, W.M. Lindsay, The Management and Control of Quality, Southwestern Thomson Learning, Cincinnati, OH, 2002.
[10] J.A. Freeman, D.M. Skapura, Neural Networks: Algorithms, Applications, and Programming Techniques, Addison-Wesley, Reading, MA, 1991.
[11] S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice-Hall, Englewood Cliffs, NJ, 1998.
[12] T.J. Harris, W.H. Ross, Statistical process control procedures for correlated observations, The Canadian Journal of Chemical Engineering 69 (1991) 48–57.
[13] K.W. Hipel, A.I. McLeod, Developments in Water Science: Time Series Modelling of Water Resources and Environmental Systems, Elsevier, Amsterdam, 1994.
[14] K.W. Hipel, A.I. McLeod, R.R. Weiler, Resources Bulletin 24 (3) (1988) 533–544.
[15] E.S. Ho, S.I. Chang, Integrated neural network approach for simultaneous monitoring of process mean and variance shifts: a comparative study, International Journal of Production Research 37 (8) (1991) 1881–1901.
[16] H.B. Hwarng, N.F. Hubele, X control chart pattern identification through efficient off-line neural network training, IIE Transactions 25 (3) (1993) 27–39.
[17] H.B. Hwarng, N.F. Hubele, X-bar chart pattern recognition using neural nets, ASQC Quality Congress Transactions, Milwaukee, 1991, pp. 884–889.
[18] J.M. Juran, A.B. Godfrey, Juran's Quality Handbook, Irwin McGraw-Hill, New York, NY, 1999.
[19] C.N. Madu, Managing Green Technologies for Global Competitiveness, Quorum Books, Westport, CT, 1996.
[20] D. Maurer, M. Mengel, G. Robertson, T. Gerlinger, A. Lissner, Statistical process control in sediment pollutant analysis, Environmental Pollution 104 (1) (1999) 21–29.
[21] A.I. McLeod, K.W. Hipel, F. Comancho, Trend assessment of water quality time series, Water Resources Bulletin 19 (4) (1983) 537–547.
[22] NeuralWorks, Predict, Technical Publications, NeuralWare, Inc., 202 Park West Drive, Pittsburgh, PA, 2002.
[23] G.A. Pugh, A comparison of neural networks to SPC charts, Computers and Industrial Engineering 21 (1–4) (1991) 253–255.
[24] C. ReVelle, Research challenges in environmental management, European Journal of Operational Research 121 (1) (2000) 218–231.
[25] M.R. Reynolds, J.C. Arnold, J.W. Baik, Variable sampling interval x charts in the presence of correlation, Journal of Quality Technology 28 (1996) 12–30.
[26] A. Ruis, I. Ruissanchez, M.P. Callao, F.X. Ruis, Reliability of analytical systems: Use of control charts, time series models and recurrent neural networks (RNN), Chemometrics and Intelligent Laboratory Systems 40 (1998) 1–18.
[27] L.N. VanBrackle, M.R. Reynolds, EWMA and CUSUM control charts in the presence of correlation, Communications in Statistics - Simulation and Computation 26 (1997) 979–1008.
[28] D.G. Wardell, H. Moskowitz, R.D. Plante, Run length distributions of special-cause control charts for correlated processes, Technometrics 36 (1994) 3–17.
[29] D.G. Wardell, H. Moskowitz, R.D. Plante, Control charts in the presence of data correlation, Management Science 38 (8) (1992) 1084–1105.
[30] D.A. West, P.M. Mangiameli, S.K. Chen, Control of complex manufacturing processes: A comparison of SPC methods with a radial basis function neural network, Omega, International Journal of Management Science 27 (1991) 349–362.
[31] H. Wittenberg, M. Sivapalan, Watershed groundwater balance estimation using streamflow recession analysis and baseflow separation, Journal of Hydrology 219 (1/2) (1999) 20–33.
[32] F. Worrall, T.P. Burt, A univariate model of river water nitrate time series, Journal of Hydrology 214 (1999) 74–90.
[33] S.W. Zimmerman, M.R. Dardeau, G.F. Crozier, B. Wagstaff, The second battle of Mobile Bay: Using SPC to control the quality of water monitoring, Computers and Industrial Engineering 31 (1/2) (1996) 257–260.
[34] C.W. Zobel, D.F. Cook, Q.J. Nottingham, An augmented neural network classification approach to detecting mean shifts in correlated manufacturing process parameters, International Journal of Production Research 42 (4) (2004) 741–758.
[35] F. Zorriassatine, J.D.T. Tannock, Review of neural networks for statistical process control, Journal of Intelligent Manufacturing 9 (3) (1998) 209–224.
[36] S. Zou, Y.S. Yu, A dynamic factor model for multivariate water quality time series with trends, Journal of Hydrology 178 (1/4) (1996) 381–400.