On line detection of mean and variance shift using neural networks and support vector machine in multivariate processes

Applied Soft Computing 12 (2012) 2973–2984 Contents lists available at SciVerse ScienceDirect Applied Soft Computing journal homepage: www.elsevier...



Applied Soft Computing journal homepage: www.elsevier.com/locate/asoc

Mojtaba Salehi ∗, Reza Baradaran Kazemzadeh, Ali Salmasnia
Faculty of Engineering, Tarbiat Modares University, 14115 Tehran, Iran

Article info

Article history:
Received 10 August 2010
Received in revised form 15 April 2012
Accepted 23 April 2012
Available online 30 May 2012

Keywords: Statistical process control; Multivariate process; Mean shift; Variance shift; Support vector machine; Neural network

Abstract

The effective recognition of unnatural control chart patterns (CCPs) is one of the most important tools for identifying process problems. In multivariate process control, the main limitation of multivariate quality control charts is that they can detect an out-of-control event but do not directly determine which variable or group of variables caused the out-of-control signal, or how large the deviation is. Recently, machine learning techniques such as artificial neural networks (ANNs) have been widely used in CCP recognition research. This study presents a modular model for on-line analysis of out-of-control signals in multivariate processes. The model consists of two modules. In the first module, a support vector machine (SVM) classifier recognizes mean shifts and variance shifts. In the second module, two specialized neural networks, one for the mean and one for the variance, recognize the magnitude of the shift for each variable simultaneously. Evaluation and comparison show that the proposed modular model performs substantially better than the corresponding traditional control charts. The main contributions of this work are recognizing the type of unnatural pattern and classifying the magnitude of the mean and variance shift in each variable simultaneously.

© 2012 Elsevier B.V. All rights reserved.

∗ Corresponding author. E-mail addresses: m [email protected], [email protected] (M. Salehi).
1568-4946/$ – see front matter © 2012 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.asoc.2012.04.024

1. Introduction

In many industrial processes, statistical process control (SPC) techniques are among the most frequently used tools for improving quality. Control charts are the most widely applied SPC tools for revealing abnormal variation in monitored measurements. In addition, the rapid growth of automatic data acquisition systems for process monitoring has increased interest in the simultaneous scrutiny of several interrelated quality variables. These techniques are often referred to as multivariate SPC (MSPC) procedures. The main problem of multivariate quality control charts is that they can detect an out-of-control event but do not directly determine which variable or group of variables caused the out-of-control signal, or how large the deviation is. Incorporating pattern recognition into the control charting scheme can address this problem. For a given control chart pattern, the diagnostic search can be shortened if one knows the CCP type (e.g., a shift or a trend) and which process factors could cause that CCP. Therefore, timely recognition of CCPs is a crucial task in SPC for determining the potential assignable causes [1].

To clarify the main problem, let \(X_{ij} = (X_{ij1}, X_{ij2}, \ldots, X_{ijp})\) be a p-dimensional vector that represents the p quality characteristics in the jth observation of the ith subgroup (sample), where i = 1, 2, ... and j = 1, 2, ..., n. The lth component of \(X_{ij}\), \(X_{ijl}\), denotes the lth quality characteristic, l = 1, 2, ..., p. A standard multivariate quality control problem is to determine whether an observed vector of measurements \(X_{ij}\) from a particular sample exhibits any evidence of a location shift from a set of satisfactory mean values. It is assumed that the \(X_{ij}\) are independent and identically distributed according to a multivariate normal distribution with known mean vector \(\mu\) and covariance matrix \(\Sigma\) when the process is in control. Let \(\bar{X}_i\) represent the mean vector of the ith subgroup. The statistic plotted on a multivariate \(\chi^2\) control chart for the ith subgroup is given by

\[ \chi_i^2 = n(\bar{X}_i - \mu)^{T}\, \Sigma^{-1} (\bar{X}_i - \mu) \tag{1} \]

When the process is in control, \(\chi_i^2\) follows a central \(\chi^2\) distribution with p degrees of freedom. Therefore, a multivariate \(\chi^2\) control chart can be constructed by plotting \(\chi_i^2\) versus time with an upper control limit (UCL) given by \(\chi^2_{\alpha,p}\), where \(\alpha\) is an appropriate significance level for performing the test.

Specific patterns of mean and variance charts can be associated independently with different problems when relevant process knowledge is accessible. Therefore, the simultaneous recognition and analysis of mean and variance CCPs is very helpful in the diagnostic search for an out-of-control process. Most research in SPC has focused on controlling means; however, monitoring process variability is also desirable. Alt [2] presented a control method based on the sample generalized variance, denoted by |S|, which uses the mean and variance of |S|. For a given sample size n, the upper control limit, centerline and lower control limit of the control chart for |S| are

\[ UCL = |\Sigma_0| \left( b_1 + 3\sqrt{b_2} \right), \qquad CL = b_1 |\Sigma_0|, \qquad LCL = |\Sigma_0| \left( b_1 - 3\sqrt{b_2} \right) \tag{2} \]

where \(|\Sigma_0|\) is the determinant of the in-control covariance matrix. The coefficients \(b_1\) and \(b_2\) are computed as

\[ b_1 = \frac{1}{(n-1)^p} \prod_{i=1}^{p} (n-i), \qquad b_2 = \frac{1}{(n-1)^{2p}} \prod_{i=1}^{p} (n-i) \left( \prod_{j=1}^{p} (n-j+2) - \prod_{j=1}^{p} (n-j) \right) \tag{3} \]
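The two charting statistics introduced above can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes the chi-square statistic of Eq. (1) and the |S| chart limits of Eqs. (2) and (3), with the lower limit clamped at zero. The function names are hypothetical, and the small Gauss-Jordan solver is only a stand-in for a linear algebra library.

```python
import math


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def chi_square_statistic(xbar, mu, sigma, n):
    """Eq. (1): n (xbar - mu)^T Sigma^{-1} (xbar - mu)."""
    d = [x - m for x, m in zip(xbar, mu)]
    z = solve(sigma, d)  # z = Sigma^{-1} d without forming the inverse
    return n * sum(di * zi for di, zi in zip(d, z))


def generalized_variance_limits(det_sigma0, n, p):
    """Eqs. (2)-(3): (LCL, CL, UCL) of the |S| chart for subgroup size n."""
    prod_i = math.prod(n - i for i in range(1, p + 1))
    b1 = prod_i / (n - 1) ** p
    b2 = prod_i * (
        math.prod(n - j + 2 for j in range(1, p + 1))
        - math.prod(n - j for j in range(1, p + 1))
    ) / (n - 1) ** (2 * p)
    lcl = max(0.0, det_sigma0 * (b1 - 3 * math.sqrt(b2)))  # negative LCL becomes 0
    ucl = det_sigma0 * (b1 + 3 * math.sqrt(b2))
    return lcl, b1 * det_sigma0, ucl
```

A subgroup signals on the mean chart when `chi_square_statistic(...)` exceeds the chosen chi-square quantile, and on the variability chart when its |S| falls outside the limits returned by `generalized_variance_limits(...)`.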

If the calculated value of the lower control limit is less than zero, it is replaced with zero. Usually \(\Sigma_0\) is estimated by a sample covariance matrix S based on the analysis of preliminary samples. In this case, \(|\Sigma_0|\) is replaced by \(|S|/b_1\).

Control charts do not provide any pattern-related information when the process is out of control. Many supplementary rules, such as zone tests or run rules, and expert systems have also been applied to control chart pattern recognition (CCPR). According to the reported works, however, the overall percentage of correctly recognized patterns for these approaches is low. Recently, many studies have used artificial neural networks (ANNs) to detect patterns more effectively than the conventional approaches, with the aim of diagnosing patterns automatically. Neural networks (NNs) have excellent noise tolerance in real time and require no assumption about the statistical distribution of the monitored measurements. This important feature makes NNs promising and effective tools for improving data analysis in manufacturing quality control applications. In addition, in recognition problems, NNs can recall learned patterns from noisy representations. This feature makes NNs highly appropriate for CCPR, because unnatural CCPs are generally contaminated by natural variation in the process. Such applications have been reported to outperform the conventional methods in terms of recognition accuracy and speed. The development of NNs in CCPR is briefly reviewed as follows.

Artificial neural networks have been successfully applied to univariate statistical process control. The reader can refer to Zorriassatine and Tannock [3], who review applications of neural networks to univariate process monitoring. Recently, many researchers have investigated the application of artificial neural networks to multivariate statistical process control. In many quality control settings, the process may have two or more correlated variables.
To control such a process, the usual practice has been to maintain a separate (univariate) chart for each characteristic. Unfortunately, this can result in false out-of-control alarms when the characteristics are highly correlated. Zorriassatine et al. [4] applied a neural network classification technique known as novelty detection to monitor the mean and variance of a bivariate process. Chen and Wang [5] developed an artificial neural network-based model for identifying the characteristic or group of characteristics that causes the signal and for classifying the magnitude of the mean shifts. Niaki and Abbasi [6] developed a two-level model that uses a T² control chart for detecting out-of-control signals and a multilayer perceptron neural network for identifying the source(s) of the out-of-control signals. A similar study can be found in Aparisi et al. [7]. They evaluated the percentage of correct classifications and showed that the neural network performs better than the traditional decomposition method. Later, Aparisi et al. [8] designed a neural network to interpret the out-of-control signal of the MEWMA chart. Guh and Shiue [9] proposed a straightforward and effective model to detect mean shifts in multivariate control charts using decision tree learning techniques. Experimental results using simulation showed that the proposed model could not only efficiently detect the mean shifts but also accurately identify the variables that had deviated from their original means. Yu and Xi [10] presented a learning-based model for monitoring and diagnosing out-of-control signals in a bivariate process. In their model, a selective neural network ensemble approach was developed for performing these tasks. El-Midany et al. [11] proposed a framework for multivariate process control chart recognition. Their methodology uses artificial neural networks to recognize a set of subclasses of multivariate abnormal patterns, identify the variable(s) responsible for the abnormal pattern, and classify the abnormal pattern parameters.

In most of the presented approaches, the recognition problem is limited to identifying the characteristic or group of characteristics that causes the unnatural pattern. This study proposes a new approach that can identify unnatural patterns (mean shift and variance shift) for each quality variable simultaneously and identify the magnitude of shift for each deviated quality variable. In addition, most previous works focused on mean shifts, and the application of neural networks to monitoring the variability of multivariate processes is limited. Low et al. [12] presented a neural network procedure for detecting variance shifts in a bivariate process. They indicated that neural networks perform better than the traditional multivariate chart in terms of average run length. In their approach, the performance of the neural network depends on the covariance matrix as well as on the patterns of shifts in the covariance matrix. Zorriassatine et al. [4] used neural networks to detect proportional changes in all elements of the covariance matrix. They evaluated classification accuracy for bivariate processes. Cheng and Cheng [13] considered two classifiers, based on a neural network and a support vector machine, to identify the source of variance shifts in a multivariate process. In their approach, after a variance shift is detected by the generalized variance |S| chart, a classifier determines which variable is responsible for the variance shift.

Most previous works consider variance shifts and mean shifts in a multivariate process separately, or consider these unnatural patterns simultaneously only for univariate processes. In addition, the few works that consider the recognition of multiple unnatural patterns in multivariate processes do not obtain any information about the magnitude of the deviations. This information can help quality practitioners rapidly identify the root causes of unnatural patterns. The proposed model recognizes the type of unnatural pattern and the magnitude of shift, for a variance shift or a mean shift, for each variable simultaneously. This paper proposes a model that consists of two modules. In the first module, a support vector machine classifier recognizes the type of unnatural pattern. Then two specialized neural networks, one for mean shift and one for variance shift, recognize the magnitude of shift for each quality variable simultaneously. The statistical performance of the model is compared with that of competing multivariate control schemes for process mean and variability.
The rest of this paper is organized as follows. Section 2 describes the proposed model for solving the CCPR problem. Section 3 presents a case study in which the overall performance of the model is evaluated and compared with the corresponding multivariate control charts. Conclusions are presented in Section 4.


Fig. 1. Proposed approach for CCPR. (Flowchart: Start; read window data; Module I, the support vector machine-based classifier, labels the window as a natural or an unnatural pattern; for an unnatural pattern, Module II inputs the process data to the corresponding network, Network A (mean shift analyzer) or Network B (variance shift analyzer); display magnitude of shift.)

2. Methodology

To improve the efficiency of the recognition process, two systems (two modules) can be used for the detection and analysis of unnatural patterns. A CCPR system can be developed and trained either as a general-purpose system that can detect several types of CCP, or as a special-purpose system that can analyze only a particular type of CCP [14]. Using two modules improves the efficiency of the computations. Moreover, in some situations detecting unnatural patterns is enough for practitioners, and analyzing the unnatural patterns with the other module can be optional. In this research, a modular framework is introduced to exploit the advantages of general-purpose and special-purpose systems simultaneously. With this approach, the main recognition problem is split into more manageable sub-problems, because solving the whole recognition problem with a single network would make the network complex and its convergence very difficult. Module I implements a support vector machine-based classifier as a general-purpose system for recognizing a mean shift and/or a variance shift, and Module II is a neural network-based classifier that acts as a special-purpose system for estimating the shift magnitude of the mean and/or variance for each quality variable simultaneously. As Fig. 1 shows, the proposed system first reads the process data and then uses the support vector machine to classify the state of the process as natural, mean shift or variance shift. If Module I indicates a specific unnatural pattern, the corresponding network in Module II is used to estimate the shift magnitude of the unnatural CCP; otherwise, data collection continues.
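The control flow of the two modules can be sketched as a small dispatcher. This is only an illustration of the modular idea, under stated assumptions: `classify` stands in for the Module I classifier and returns the set of flagged shift types, `analyzers` maps each shift type to a Module II network, and all names are hypothetical rather than taken from the paper.

```python
def analyze_window(window, classify, analyzers):
    """Run Module I on a data window, then only the Module II networks it selects.

    window    : list of observations, each a list of p quality-variable values
    classify  : callable returning a set of flagged shift types, e.g. set(),
                {"mean"}, {"variance"} or {"mean", "variance"}  (assumed interface)
    analyzers : dict mapping a shift type to a callable that estimates the
                per-variable shift magnitudes                    (assumed interface)

    An empty result means a natural pattern: keep collecting data.
    """
    flagged = classify(window)
    return {kind: analyzers[kind](window) for kind in sorted(flagged)}
```

Because the magnitude networks run only when Module I flags a pattern, a user who needs detection alone can pass an empty `analyzers` mapping together with a classifier and inspect the flags separately.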

2.1. Module I

Machine learning techniques, such as artificial neural networks, have been widely used in CCP recognition research. However, ANN-based approaches can easily overfit the training data, and the resulting models can generalize poorly. Therefore, when the training examples contain a high level of background variation, an ANN tends to misclassify patterns. In addition, ANN models need a lot of time and effort to construct the best architecture [15,16]. The support vector machine embodies the structural risk minimization principle, which has been shown to be superior to the traditional empirical risk minimization principle employed by ANNs. Hence, this research uses an SVM-based classifier as a general-purpose classifier for the on-line, real-time recognition of mean and variance shifts. SVM is a supervised learning method that constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification and regression.

2.1.1. SVM-based classifier

SVM is a reliable classification technique based on statistical learning theory. It was first proposed by Vapnik [17]. As shown in Fig. 2, a linear SVM classifies a data set that contains two separable classes, labeled {+1, −1}. Let the training data consist of n data points \(\{(X_i, y_i)\}_{i=1}^{n}\), with \(X_i \in R^m\) and \(y_i \in \{-1, +1\}\). To separate these classes, the SVM has to find the optimal (maximum-margin) separating hyperplane so that the SVM has good generalization ability. All separating hyperplanes have the form

\[ D(X) = W^T X + w_0 \tag{4} \]

and satisfy the following inequality for both y = +1 and y = −1:

\[ y_i (W^T X_i + w_0) \ge 1, \qquad i = 1, 2, \ldots, n \tag{5} \]

The data points that satisfy (5) with equality are called the support vectors. The classification task in SVMs is carried out by these support vectors. The margins of the hyperplanes obey the following inequality:

\[ \frac{y_k \times D(X_k)}{\|W\|} \ge \Gamma, \qquad k = 1, \ldots, n \tag{6} \]

To maximize this margin (\(\Gamma\)), the norm of W is minimized. To reduce the number of solutions for the norm of W, the following condition is imposed:

\[ \Gamma \times \|W\| = 1 \tag{7} \]

Then formula (8) is minimized subject to constraint (5):

\[ \frac{1}{2} \|W\|^2 \tag{8} \]

Slack variables \(\xi_i\) are then added to formulas (5) and (8), which are replaced by the new formulas (9) and (10):

\[ y_i (W^T X_i + w_0) \ge 1 - \xi_i \tag{9} \]

\[ C \sum_{i=1}^{n} \xi_i + \frac{1}{2} \|W\|^2 \tag{10} \]

where C is the penalty factor. Since SVMs originally classify data in the linear case, they cannot accomplish classification tasks in the nonlinear case. To overcome this limitation, kernel approaches have been developed. A nonlinear input data set is mapped into a high-dimensional linear feature space via kernels. Some popular kernel functions are defined in Table 1. For information about multi-class SVMs, the reader is referred to [18].

Fig. 2. The structure of a simple SVM. (Diagram: Class 1 and Class 2 in the (x1, x2) plane, separated by the hyperplane D(x) = 0 with normal vector W; the margin boundaries are D(x) = +1 and D(x) = −1.)

Table 1 Kernel functions.

| Kernel | Kernel function |
|---|---|
| Polynomial (homogeneous) | \(k(X_i, X_j) = (X_i \cdot X_j)^d\) |
| Polynomial (inhomogeneous) | \(k(X_i, X_j) = (X_i \cdot X_j + 1)^d\) |
| Radial basis function | \(k(X_i, X_j) = \exp(-\gamma \|X_i - X_j\|^2)\) for \(\gamma > 0\) |
| Gaussian radial basis function | \(k(X_i, X_j) = \exp\left(-\|X_i - X_j\|^2 / (2\sigma^2)\right)\) |
| Hyperbolic tangent | \(k(X_i, X_j) = \tanh(\kappa X_i \cdot X_j + c)\) for \(\kappa > 0\) and \(c > 0\) |

2.1.2. Selection, training and testing of model

One of the main problems in SVM design is the selection of an appropriate kernel function and of the specific parameters for that kernel. Kernel functions map the original data into a higher-dimensional space and make the input data set linearly separable in the transformed space. The choice of kernel function is highly problem-dependent and is the most important decision in SVM applications. For the CCPR problem, the kernel functions in Table 1 were compared in terms of classification performance, and the best performance was obtained with the radial basis function (RBF). The use of SVM involves training and testing procedures for setting the parameters of a particular kernel function. For the RBF kernel, the parameters that must be determined are the kernel parameter \(\gamma\) and the penalty factor C. The kernel parameter \(\gamma\) defines the structure of the high-dimensional feature space in which a maximal-margin hyperplane is found. To determine these two parameters, a cross-validation experiment was used to choose the values that yield the best result. Data sets of normal and abnormal examples were generated using Monte-Carlo simulation. To train and test the SVM classifier, a process with three variables is considered, which is simply an example of the general multivariate case with the number of variables p equal to 3. In the simulated example, the following mean vector and covariance matrix are used:

\[ (\mu_1, \mu_2, \mu_3) = (0, 0, 0), \qquad \Sigma = \begin{pmatrix} 1 & 1.5 & 0.5 \\ 1.5 & 3 & 1 \\ 0.5 & 1 & 2 \end{pmatrix} \]

The cross-validation was carried out in two stages. In the first stage, a search was made for the values of the penalty factor C and the kernel parameter \(\gamma\) that gave the best classification accuracy. In the second stage, these values were used to train the SVM model on the entire set of training samples. Subsequently, this set of parameters was applied to the test data set. For the cross-validation, a grid search over \(\gamma\) and C was performed, with \(\gamma\) and C checked in the ranges [0.025, 1] and [1, 40], respectively. Note that in this work the number of classes k is set to 26, as shown in Table 2. In Table 2, a zero value indicates that the corresponding variable is normal. After training the SVM on the simulated examples, \(\gamma = 0.08\) and C = 12 were determined. The trained SVM was then applied to the test data. Table 2 indicates that the proposed approach can recognize the type of unnatural pattern (mean shift and variance shift) with a good classification rate. In this work, the SVM was implemented in STATISTICA software.
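To make the kernels of Table 1 and the parameter search of Section 2.1.2 concrete, the following is a minimal sketch rather than the STATISTICA procedure used in the paper. The function names, the default parameter values and the `cv_score` callback (assumed to return a cross-validated classification accuracy for a given kernel parameter and penalty factor) are all assumptions introduced for illustration.

```python
import math
from itertools import product


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, d=2, inhomogeneous=False):
    """(x . y)^d, or (x . y + 1)^d for the inhomogeneous form."""
    return (dot(x, y) + (1.0 if inhomogeneous else 0.0)) ** d


def rbf_kernel(x, y, gamma=0.08):
    """exp(-gamma * ||x - y||^2) with gamma > 0."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gaussian_rbf_kernel(x, y, sigma=1.0):
    """exp(-||x - y||^2 / (2 sigma^2)), i.e. the RBF kernel with gamma = 1 / (2 sigma^2)."""
    return rbf_kernel(x, y, gamma=1.0 / (2.0 * sigma ** 2))


def tanh_kernel(x, y, kappa=1.0, c=1.0):
    """tanh(kappa * x . y + c)."""
    return math.tanh(kappa * dot(x, y) + c)


def grid_search(gammas, cs, cv_score):
    """Return the (gamma, C) pair with the highest cross-validated score.

    cv_score(gamma, C) is supplied by the caller (hypothetical interface).
    """
    return max(product(gammas, cs), key=lambda pair: cv_score(*pair))
```

The grid itself is left to the caller because the step sizes used inside the ranges [0.025, 1] and [1, 40] are not stated.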

Table 2 The result on the test data by the SVM.

| No. | Number of errant variables | Type of error | Aggregate CR with different magnitudes of error for variables |
|---|---|---|---|
| 1 | One quality characteristic | (mean shift, 0, 0) | 0.932 |
| 2 | One quality characteristic | (0, mean shift, 0) | 0.943 |
| 3 | One quality characteristic | (0, 0, mean shift) | 0.912 |
| 4 | One quality characteristic | (variance shift, 0, 0) | 0.832 |
| 5 | One quality characteristic | (0, variance shift, 0) | 0.843 |
| 6 | One quality characteristic | (0, 0, variance shift) | 0.912 |
| 7 | Two quality characteristics | (mean shift, mean shift, 0) | 0.853 |
| 8 | Two quality characteristics | (mean shift, 0, mean shift) | 0.832 |
| 9 | Two quality characteristics | (0, mean shift, mean shift) | 0.851 |
| 10 | Two quality characteristics | (variance shift, variance shift, 0) | 0.853 |
| 11 | Two quality characteristics | (variance shift, 0, variance shift) | 0.832 |
| 12 | Two quality characteristics | (0, variance shift, variance shift) | 0.853 |
| 13 | Two quality characteristics | (mean shift, variance shift, 0) | 0.732 |
| 14 | Two quality characteristics | (mean shift, 0, variance shift) | 0.751 |
| 15 | Two quality characteristics | (0, mean shift, variance shift) | 0.751 |
| 16 | Two quality characteristics | (variance shift, mean shift, 0) | 0.775 |
| 17 | Two quality characteristics | (variance shift, 0, mean shift) | 0.778 |
| 18 | Two quality characteristics | (0, variance shift, mean shift) | 0.718 |
| 19 | Three quality characteristics | (mean shift, mean shift, variance shift) | 0.775 |
| 20 | Three quality characteristics | (mean shift, variance shift, mean shift) | 0.785 |
| 21 | Three quality characteristics | (variance shift, mean shift, mean shift) | 0.714 |
| 22 | Three quality characteristics | (mean shift, variance shift, variance shift) | 0.783 |
| 23 | Three quality characteristics | (variance shift, mean shift, variance shift) | 0.778 |
| 24 | Three quality characteristics | (variance shift, variance shift, mean shift) | 0.778 |
| 25 | Three quality characteristics | (mean shift, mean shift, mean shift) | 0.786 |
| 26 | Three quality characteristics | (variance shift, variance shift, variance shift) | 0.778 |

2.2. Module II

This module identifies the magnitude of shift for each unnatural pattern. It consists of two separate specialist NNs for identifying the magnitude of shift. If only one network were employed to perform all the required recognition tasks, the network would have to be large and complex, and its training and convergence would be very difficult. Modular design splits the original problem into more manageable sub-problems [19]. The selection of each ANN is based on the output of the first module.

2.2.1. Selecting input feature vector

The selection and representation of the data in the training set has a strong influence on the performance of neural networks. Extracting useful features from the input data is one of the successful methods for classification in pattern recognition. In this research, the mean, standard deviation, skewness, mean-square value, autocorrelation and cumulative sum (CUSUM) statistic are used together with the raw data as the input feature vector. Although the use of individual data usually results in a higher type I error (i.e., a shorter in-control average run length (ARL)), it can reveal an out-of-control situation quickly (i.e., a lower type II error). The in-control ARL is the average number of observations that must be taken before an observation indicates an out-of-control condition when the process is actually in control. Battiti [20] and Smith [21] showed that a back-propagation network (BPN) that integrates raw data and statistical features as the input feature vector has improved performance. Hassan et al. [22] conducted an experimental study and indicated that a BPN using statistical features as input vectors performs better than a BPN using raw data as input vectors. Experiments were conducted to investigate whether including the six statistical features in the input vector improves the performance of the pattern recognizers. In this experiment, three structures for the input vector were tested: raw data-based, statistical feature-based, and an integration of raw data and statistical features. The results showed that the integrated approach gives better performance.

When a special disturbance \(d(t, t_0)\) (zero when no unnatural pattern is present) occurs at time \(t_0\), the observations X of a quality characteristic are expressed as follows:

\[ X(t) = \mu + Y(t) + d(t, t_0), \qquad t \ge t_0 \tag{11} \]

where X(t) is the vector of quality characteristics measured at time t, \(\mu\) is the process mean vector when the process is in control, and \(Y(t) \sim N(0, \Sigma)\). For a mean shift, \(d(t, t_0) = u \times b\), where u is the parameter that determines the position of the shift (u = 0 before shifting, u = 1 after shifting) and \(b = (k_1 \sigma_1, k_2 \sigma_2, \ldots, k_p \sigma_p)\), where \(k_l\) is the magnitude of the shift in units of \(\sigma_l\), the standard deviation of the lth quality characteristic. For a variance shift, \(d(t, t_0) \sim N(0, \Delta\Sigma)\), where \(\Delta\Sigma\) is the amount of shift in the new covariance matrix \(\Sigma_1\). It should be noted that this shift in the covariance matrix is the result of shifts in the variances of the variables. For the simulation study, the variance shifts are measured as D, the ratio of the determinant of the out-of-control covariance matrix (\(\Sigma_1\)) to the determinant of the in-control covariance matrix (\(\Sigma\)). This study considers seven distinct types of shift for the mean and four distinct types of shift for the variance of the lth quality characteristic; hence \(k_l\) has seven possible values for mean shifts, from −3 to +3 in increments of one, and four possible values for variance shifts, from 0 to +1.5 in increments of 0.5, as shown in Table 3.
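The mean-shift case of Eq. (11) can be simulated with a short Monte-Carlo routine. The sketch below is an illustration under stated assumptions, not the authors' simulation code: correlated noise is drawn through a Cholesky factor of the covariance matrix, the shift vector is k_l times the standard deviation of each variable, and the function names and the seeded generator are hypothetical. The covariance matrix is the three-variable example from Section 2.1.2.

```python
import math
import random

# In-control covariance matrix of the three-variable example (Section 2.1.2).
SIGMA = [[1.0, 1.5, 0.5],
         [1.5, 3.0, 1.0],
         [0.5, 1.0, 2.0]]


def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def simulate_mean_shift(mu, sigma, k, t0, length, rng):
    """Eq. (11), mean-shift case: X(t) = mu + Y(t) + u * b, with b_l = k_l * sigma_l."""
    p = len(mu)
    low = cholesky(sigma)
    b = [k[l] * math.sqrt(sigma[l][l]) for l in range(p)]
    series = []
    for t in range(length):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]
        y = [sum(low[i][j] * z[j] for j in range(i + 1)) for i in range(p)]  # Y ~ N(0, Sigma)
        u = 1.0 if t >= t0 else 0.0  # shift is switched on at time t0
        series.append([mu[i] + y[i] + u * b[i] for i in range(p)])
    return series
```

For example, `simulate_mean_shift([0.0, 0.0, 0.0], SIGMA, [3, 0, 0], 1000, 2000, random.Random(0))` produces 2000 observations in which the first variable shifts upward by three standard deviations from observation 1000 onward.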

2.2.2. Designing the network architecture

In this study, a three-layer fully connected feed-forward network with a back-propagation training algorithm was implemented. The details of the network architecture are determined as follows.

Input layer: In the proposed methodology, the size of the input feature vector, or window size, can significantly influence the performance of the proposed model. A small input feature vector will typically detect unnatural patterns more quickly, but may also yield a short in-control ARL (equivalent to a high type I error). A large window can reduce the recognition efficiency by increasing the time required to detect patterns (higher type II error, or longer out-of-control ARL). A suitable feature vector size should therefore balance the type I and type II errors. A threshold (cut-off) value in [0, 1] was applied to the outputs of the neurons in the output layer. Any output value above the threshold was regarded as signaling the presence of an unnatural pattern. In the absence of an unnatural pattern, a type I error occurs when any of the output values equals or exceeds the threshold. According to a previous study [23], a threshold of 0.9 gives an appropriate window size with a satisfactory type I error, since in this situation the model gives an in-control ARL similar to that of a typical Shewhart chart (3σ limits), which has an in-control ARL of about 370.

The proposed model was studied with p = 2, 3, 5, 8 and 10 quality variables and input feature vector sizes w = 5, 8, 12, 20 and 25. To simplify the required simulation and get an approximate result, the correlation between the quality variables was set at 0.5 for all the p values considered. The results are shown in Fig. 3. Each in-control ARL value in Fig. 3 was obtained as the average of 1000 simulation runs. Fig. 3 indicates that, for a specific number of quality variables (p), the in-control ARL decreases when the feature vector size decreases. Additionally, the proposed model needs a larger feature vector size to maintain a fixed in-control ARL when the number of quality variables increases. However, when p varies between 2 and 10, a window size of 12 can be used. Fig. 4 shows the neural network architecture, which includes an input layer with 18p nodes (the raw data for 12 consecutive points in a control chart plus six statistical features for each quality variable), a hidden layer with 10p nodes and an output layer with p nodes. In the input layer, \(X_{ij,l}\) denotes the value of the jth observation (j = 1, 2, ..., 12) of the lth quality characteristic in the ith batch (i = 1, 2, ...). In addition to the raw data, \(X_{if,l}\) denotes the value of the selected feature f (mean, standard deviation, skewness, mean-square value, autocorrelation, CUSUM) of the lth quality characteristic in the ith batch (i = 1, 2, ...).

Fig. 3. In-control ARL of the proposed model with different feature vector size (w) and various number of quality variables (p).

Table 3 Relationship between target value and type of shift for the lth quality characteristic.

| Target value | Magnitude of mean shift for the lth quality characteristic | Magnitude of variance shift for the lth quality characteristic |
|---|---|---|
| +1 | +3 | +1.5 |
| +0.7 | +2 | +1 |
| +0.4 | +1 | +0.5 |
| 0 | 0 | 0 |
| −0.4 | −1 | – |
| −0.7 | −2 | – |
| −1 | −3 | – |
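The 18p-node input layer (12 raw observations plus six statistical features per quality variable) can be assembled as sketched below. The paper does not give the exact feature definitions, so the ones used here are assumptions made for illustration: population standard deviation, skewness as the third standardized moment, mean-square value as the mean of squared observations, autocorrelation at lag 1, and CUSUM as the final cumulative sum of deviations from a target value. The function names are hypothetical.

```python
import math


def window_features(x, target=0.0):
    """Six statistical features of one variable's window (definitions are assumptions)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    sd = math.sqrt(var)
    skew = sum((v - mean) ** 3 for v in x) / (n * sd ** 3) if sd > 0 else 0.0
    mean_square = sum(v * v for v in x) / n
    lag1 = (sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1)) / (n * var)
            if var > 0 else 0.0)
    cusum = sum(v - target for v in x)  # final value of the cumulative sum
    return [mean, sd, skew, mean_square, lag1, cusum]


def input_vector(window):
    """Flatten a window of w observations of p variables into (w + 6) * p inputs.

    For each variable the raw observations come first, followed by its six features.
    """
    p = len(window[0])
    vec = []
    for l in range(p):
        series = [obs[l] for obs in window]
        vec.extend(series + window_features(series))
    return vec
```

With a window of 12 observations and p = 3 variables, `input_vector` returns 54 values, which matches the 18p input nodes of the network.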

Output layer: Each output node represents a unique target value associated with a type of shift for a particular quality characteristic. Table 3 shows the relationship between target values and type of shifts associated with a particular quality characteristic for mean and variance. Transfer function: Back-propagation algorithm works with any differentiable transfer function. Cheng [24] claimed that the hyper tangent transfer function effectively detects process changes in different directions. The hyper tangent (f(x) = (ex − e−x )/(ex + e−x )) function was used as the activation function of the hidden and output layers of the neural network A for mean shift to recognize upward and downward shift. The sigmoid function (f(x) = 1/(1 + e−x )) was used in the output layers of the neural network B for variance shift. Hidden layer: Many theoretical and simulative investigations of engineering applications have demonstrated that the number of hidden layers need not exceed two [24]. Since one hidden layer can approximate any continuous mapping from the input patterns to the output patterns in backpropagation network, one hidden layer

Input layer (18p)

X i1,1

Hidden layer (10p)

X i12,1 Output layer (p)

X imean,1 X isd ,1 X isk ,1 X ims ,1 X iac ,1 X icusum,1 X i1, p

X icusum, p Fig. 4. The neural network architecture.

M. Salehi et al. / Applied Soft Computing 12 (2012) 2973–2984

2979

Table 4. Mean shift test results on the training data for Module II.

| Mean shift | Magnitude of shift | Target value | Average output of major parameter special network: mean | Average output of major parameter special network: aggregate standard deviation | Errors |
|---|---|---|---|---|---|
| 1st quality characteristic | (+1,0,0) | (+0.4,0,0) | (+0.44,+0.03,+0.10) | 0.280 | (0.04,0.03,0.10) |
| 1st quality characteristic | (−1,0,0) | (−0.4,0,0) | (−0.44,+0.08,−0.07) | 0.173 | (0.04,0.08,0.07) |
| 2nd quality characteristic | (0,+1,0) | (0,+0.4,0) | (+0.04,+0.42,−0.06) | 0.1836 | (0.04,0.02,0.06) |
| 2nd quality characteristic | (0,−1,0) | (0,−0.4,0) | (−0.05,−0.47,+0.07) | 0.209 | (0.05,0.07,0.07) |
| 3rd quality characteristic | (0,0,+1) | (0,0,+0.4) | (−0.05,−0.07,+0.38) | 0.246 | (0.05,0.07,0.08) |
| 3rd quality characteristic | (0,0,−1) | (0,0,−0.4) | (−0.12,+0.09,−0.48) | 0.245 | (0.12,0.09,0.08) |
| 1st and 2nd quality characteristic | (+2,−2,0) | (+0.7,−0.7,0) | (+0.64,−0.64,+0.08) | 0.176 | (0.04,0.06,0.08) |
| 1st and 2nd quality characteristic | (−2,+2,0) | (−0.7,+0.7,0) | (−0.76,+0.76,−0.03) | 0.223 | (0.06,0.06,0.03) |
| 1st and 3rd quality characteristic | (+2,0,−2) | (+0.7,0,−0.7) | (+0.67,+0.15,−0.72) | 0.234 | (0.03,0.15,0.02) |
| 1st and 3rd quality characteristic | (−2,0,+2) | (−0.7,0,+0.7) | (−0.73,+0.03,+0.77) | 0.223 | (0.03,0.03,0.07) |
| 2nd and 3rd quality characteristic | (0,+2,−2) | (0,+0.7,−0.7) | (−0.04,+0.76,−0.73) | 0.256 | (0.04,0.06,0.03) |
| 2nd and 3rd quality characteristic | (0,−2,+2) | (0,−0.7,+0.7) | (−0.08,−0.73,+0.73) | 0.246 | (0.08,0.03,0.03) |
| 1st and 2nd and 3rd quality characteristic | (+3,−3,+3) | (+0.7,−0.7,+0.7) | (+0.96,−0.107,+0.97) | 0.234 | (0.04,0.07,0.03) |
| 1st and 2nd and 3rd quality characteristic | (−3,+3,−3) | (−0.7,+0.7,−0.7) | (−0.103,+0.93,−0.104) | 0.356 | (0.03,0.03,0.04) |

Table 5. Variance shift test results on the training data for Module II.

| Variance shift | Magnitude of shift | Target value | Average output of major parameter special network: mean | Average output of major parameter special network: aggregate standard deviation | Errors |
|---|---|---|---|---|---|
| 1st quality characteristic | (+0.5,0,0) | (+0.4,0,0) | (+0.33,+0.07,+0.09) | 0.130 | (0.07,0.07,0.09) |
| 2nd quality characteristic | (0,+0.5,0) | (0,+0.4,0) | (+0.09,+0.29,+0.07) | 0.241 | (0.09,0.11,0.07) |
| 3rd quality characteristic | (0,0,+0.5) | (0,0,+0.4) | (+0.09,+0.13,+0.43) | 0.187 | (0.09,0.13,0.03) |
| 1st and 2nd quality characteristic | (+1,+1,0) | (+0.7,+0.7,0) | (+0.81,+0.80,+0.12) | 0.261 | (0.11,0.10,0.12) |
| 1st and 3rd quality characteristic | (+1,0,+1) | (+0.7,0,+0.7) | (+0.77,+0.11,+0.75) | 0.194 | (0.07,0.11,0.05) |
| 2nd and 3rd quality characteristic | (0,+1,+1) | (0,+0.7,+0.7) | (+0.09,+0.76,+0.78) | 0.214 | (0.09,0.06,0.08) |
| 1st and 2nd and 3rd quality characteristic | (+1.5,+1.5,+1.5) | (+1,+1,+1) | (+0.83,+0.63,+0.72) | 0.183 | (0.05,0.07,0.02) |

was considered. The number of nodes in the hidden layer is highly problem-dependent. In general, a network with too many neurons loses its ability to generalize and tends to memorize the training data. On the other hand, a network with too few neurons cannot learn the mapping between input variables and output patterns. Based on trial-and-error experiments, the number of hidden neurons was set to 10.

2.2.3. Training and parameter setting

Training data are critical in applications of NNs because they determine the quality of the networks' output. In this study, Monte-Carlo simulation was applied to generate the required data sets of normal and abnormal examples for training and testing. A p-dimensional multivariate normal process is simulated by a multivariate normal distribution whose mean is $\mu$ and whose covariance matrix is $\Sigma$ (a p × p matrix). In practice, estimating $\Sigma$ requires collecting data over a substantial period while the process is in control. Variance shifts are measured by D, the ratio of the determinant of the out-of-control covariance matrix ($\Sigma_1$) to the determinant of the in-control covariance matrix ($\Sigma_0$). By manipulating the determinant of $\Sigma$, an in-control or an out-of-control process can then be simulated. The Levenberg–Marquardt quasi-Newton algorithm is used as the training algorithm for networks A and B. The mean square error (MSE) associated with the output layer is propagated backward through the network by modifying the weights. Because the number of examples was very large, Tables 4 and 5 detail only some of the test examples. 100 examples were simulated for each type of mean shift and variance shift. Increasing the number of examples did not significantly improve the learning performance. The initial weights were set randomly in [−0.01, +0.01]. The number of training epochs was set to 350. The learning rate and momentum factor were set

Table 6. Evaluation results of neural networks A and B.

| | Average error | Aggregate standard deviation of average error |
|---|---|---|
| Mean shift | (0.0731, 0.0728, 0.0723) | 0.0216 |
| Variance shift | (0.0792, 0.0737, 0.0783) | 0.0301 |

to 0.13 and 0.25, respectively. These NN training parameters were set mainly by trial-and-error experiments. The BPNs were trained with the back-propagation implementation in the MATLAB toolbox. During training, the convergence condition was reached within 100 training epochs.

2.2.4. Performance evaluation of Module II

To examine the performance of Module II, the trained networks A and B were evaluated on simulated data with various levels of shift. To make the simulation more realistic, the process was assumed to start in control, with the pattern features slowly strengthening as the recognition window moves forward through the process data stream. Table 6 summarizes the average errors and the aggregate standard deviation of the average error for shift magnitude identification. According to these results, the overall performance of shift magnitude identification for mean and variance is reasonably good.

3. Results and discussion

In this section, a first set of experiments is conducted on a case study to illustrate the advantages and contributions of the model, and a second set compares the performance of the proposed method with existing approaches.
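The simulation set-up described in Sections 2.2.3 and 2.2.4 (multivariate normal data that start in control and then shift, read through a moving window) can be sketched as follows. This is a minimal illustration and not the authors' MATLAB code: the bivariate case, the correlation of 0.5, the shift point, the shift size, and all function names are assumptions chosen for the example.

```python
import math
import random

def cholesky_2x2(rho):
    """Lower-triangular factor L of [[1, rho], [rho, 1]], so that L L^T equals it."""
    return [[1.0, 0.0], [rho, math.sqrt(1.0 - rho * rho)]]

def simulate_stream(n, shift_start, shift, rho, rng):
    """Draw n bivariate normal observations with unit variances.

    Observations before index `shift_start` are in control (mean zero).
    From `shift_start` onward the mean of variable j moves by `shift[j]`,
    expressed in standard-deviation units (hypothetical parameterization).
    """
    L = cholesky_2x2(rho)
    stream = []
    for t in range(n):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1 = L[0][0] * e1
        x2 = L[1][0] * e1 + L[1][1] * e2
        if t >= shift_start:
            x1 += shift[0]
            x2 += shift[1]
        stream.append((x1, x2))
    return stream

def window_means(stream, window=12):
    """Mean of each variable over every full sliding window."""
    out = []
    for end in range(window, len(stream) + 1):
        chunk = stream[end - window:end]
        out.append(tuple(sum(col) / window for col in zip(*chunk)))
    return out

rng = random.Random(0)  # fixed seed only to make the sketch repeatable
stream = simulate_stream(n=50, shift_start=24, shift=(2.0, 0.0), rho=0.5, rng=rng)
means = window_means(stream)
print(len(means))  # 50 - 12 + 1 = 39 windows
print(means[0], means[-1])
```

The first window contains only in-control data and the last only shifted data, so the window mean of the first variable drifts from about 0 toward about 2 as the window moves forward, which is the gradual strengthening of the pattern features mentioned above.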


Fig. 5. Planes A, B and C in a typical part.

Because the proposed method first tries to recognize the unnatural pattern (mean shift or variance shift), the correlation between the actual output and the output generated by the trained SVM can be used as a measure of this capability. The correlations presented in Table 2 show that the output of the trained model is very strongly correlated with the corresponding actual output for every quality characteristic. To evaluate the other capabilities of the method, the following experiments were conducted.

3.1. Case study

In the part shown in Fig. 5, planes A, B and C are finished on the same machine with the same tool holder. The part is assembled with other parts, and the heights of the planes are correlated with each other. These three heights are denoted x1, x2 and x3, and the variance-covariance matrix $\Sigma$ is as follows.

$$\Sigma = \begin{pmatrix} 1 & 0.65 & 0.86 \\ 0.65 & 1 & 0.73 \\ 0.86 & 0.73 & 1 \end{pmatrix}$$
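As a quick sanity check on the matrix above, the sketch below computes its leading principal minors. It assumes the nine entries are read as a symmetric matrix with unit diagonal and off-diagonal values 0.65, 0.86 and 0.73; the helper name `det3` is hypothetical. All three minors must be positive for the matrix to be a valid (positive definite) covariance matrix, and the last one is the determinant, the quantity that an |S| chart monitors through its sample estimate.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Matrix of the case study, read as symmetric with unit diagonal (assumption).
sigma = [
    [1.00, 0.65, 0.86],
    [0.65, 1.00, 0.73],
    [0.86, 0.73, 1.00],
]

# Leading principal minors (Sylvester's criterion for positive definiteness).
minor1 = sigma[0][0]
minor2 = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]
minor3 = det3(sigma)

print(minor1, round(minor2, 4), round(minor3, 5))
print(all(m > 0 for m in (minor1, minor2, minor3)))
```

Under this reading the determinant is about 0.121, so the three heights are strongly but not perfectly correlated.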

The in-control mean and standard deviation of the process are 0.3750 and 0.0062 for x1, 0.8130 and 0.0096 for x2, and 0.200 and 0.0035 for x3, respectively. The UCL and LCL are calculated to be 0.3936 and 0.3564 for x1, 0.8418 and 0.7842 for x2, and 0.2105 and 0.1895 for x3, respectively. To control the process, a control chart pattern recognition problem with three variables must be considered. Fig. 6 shows the individual control charts of the measurements of 50 consecutive batches of the part for the three quality characteristics. At sample No. 25, an assignable cause shifts the mean of x1 to 0.3874 (displacement of the mean = +2), shifts the mean of x2 to 0.8034 (displacement of the mean = −1), and shifts the variance of x2 to 0.0196 (displacement of the variance = +1). Figs. 7 and 8 show the $\chi^2$ control chart and the |S| control chart for these observations. The observed data are presented to the proposed model in real time with a window size of 12. The SVM detected a mean shift for x1 at the 33rd observation, a mean shift for x2 at the 36th observation, and a variance shift for x2 at the 40th observation. By considering these unnatural patterns in the quality characteristics, practitioners may be able to recognize the cause of the deviations. For more accurate results, they can use the outputs of the special-purpose classifiers (networks A and B). Network A recognized a mean shift of magnitude +2 in x1 and of magnitude −1 in x2, and network B recognized a variance shift of magnitude +1 in x2. These results agree closely with the real situation. The outputs of the SVM and of networks A and B for the case study are shown in Table 7. Because no points exceed the control limit (0.32116) in Fig. 8, one might conclude that the process is in control. However, the SVM correctly signals that the process is out of control at sample 29. The case study is provided to explain further the method developed above.
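The control limits and shifted means quoted above can be checked with a few lines of arithmetic. The sketch assumes that the limits are placed three standard deviations from the in-control mean and that a displacement of the mean is expressed in standard-deviation units; both assumptions reproduce the quoted figures, and the function names are hypothetical.

```python
# In-control mean and standard deviation of each height, as quoted in the text.
params = {
    "x1": (0.3750, 0.0062),
    "x2": (0.8130, 0.0096),
    "x3": (0.2000, 0.0035),
}

def three_sigma_limits(mean, sd):
    """Return (LCL, UCL) for an individual chart with 3-sigma limits (assumed rule)."""
    return mean - 3 * sd, mean + 3 * sd

def shifted_mean(mean, sd, displacement):
    """Mean after a displacement expressed in standard-deviation units."""
    return mean + displacement * sd

for name, (mean, sd) in params.items():
    lcl, ucl = three_sigma_limits(mean, sd)
    print(f"{name}: LCL={lcl:.4f}, UCL={ucl:.4f}")

print(f"x1 after +2 shift: {shifted_mean(*params['x1'], 2):.4f}")
print(f"x2 after -1 shift: {shifted_mean(*params['x2'], -1):.4f}")
```

Both shifted means (0.3874 and 0.8034) lie inside their individual 3-sigma limits, which is why the shifts are hard to see on the individual charts alone.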
Excluding the natural pattern, 3 × 3 × 3 − 1 = 26 unnatural patterns (the three states of each variable are natural, mean shift, and variance shift) must be considered for the output of the SVM classifier. For each type of unnatural pattern, a Monte-Carlo simulation generates 1000 input vectors when the $\chi^2$ statistic exceeds the UCL. These 1000 input vectors for each unnatural pattern include different magnitudes of deviation from the natural pattern for each variable, according to Table 3. Two-thirds of them were used for training, and the rest were used for testing. Training was terminated when the MSE of the trained network reached 0.075. Then the test data set, composed of 8666 (≈ 1/3 × 26 × 1000) input vectors with a window size of 12, was classified in the on-line test stage to test the classification capability. The trained model checks for an abnormal pattern based on the most recent 12 observations. The result is presented in Table 7. In the list below, a sample number denotes the window in Table 7 that starts at that observation. Checking the data with the proposed modular model gives the following results:

1. The SVM classifier indicates a normal pattern in samples 1–21.
2. The SVM classifier indicates an abnormal pattern (mean shift) for x1 in sample 22; the collected data were therefore presented to network A, and the real magnitude of the shift in x1 was recognized in sample 25 or 26. Thus, the magnitude of a shift detected by the SVM classifier can be determined by neural network A after about four samples.
3. The SVM classifier indicates a mean shift for x2 in sample 25. The real magnitude of the mean shift in x2 was recognized in sample 30, five samples after the SVM classifier detected it.
4. The SVM classifier indicates a variance shift for x2 in sample 29. Sample 36 indicates the real magnitude of the variance shift in x2, about seven samples after the SVM classifier detected it.

The above results for the case study indicate that, once the SVM classifier has detected a shift, the magnitude of a mean shift can be determined earlier than the magnitude of a variance shift.

3.2. Comparative study

This section compares different CCPR approaches for mean shift and variance shift and shows the effectiveness of the proposed method against popular approaches in the literature.

Mean shift: Two measures are considered for evaluating the proposed method: (1) classification rate (CR) and (2) average run length (ARL). According to the classifying rule presented in Table 3, the unnatural patterns can be classified using the output of the corresponding trained network. The proposed model classifies with 82.1% and 76.2% accuracy for mean shift and variance shift, respectively. Several experiments were conducted to check the design of the model. The results are as follows:

(1) The SVM classifier is necessary for recognizing mean or variance shifts. NNs alone cannot recognize both the unnatural patterns and the magnitudes of the shifts, because as the number of quality characteristics increases, NNs cannot generalize over the whole problem.
(2) A small window provides better protection against large shifts than a large window. With a large window, on the other hand, the neural networks can learn small shifts, so a network with a large window recognizes small shifts better than one with a small window. However, a large window increases the complexity of the NNs. The window size must therefore be chosen according to the requirements of the application.
(3) The number of training samples has only a limited effect on the performance of the neural network: increasing the number of training samples improves the performance of the model only up to a certain level of accuracy.
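The pattern count and data-set sizes quoted for the case study (26 unnatural patterns, 1000 input vectors each, one third held out for testing) can be reproduced with the short sketch below. The state labels follow Table 7; using integer division for the split is an assumption that happens to reproduce the 8666 figure.

```python
from itertools import product

STATES = ("N", "MS", "VS")  # natural, mean shift, variance shift

# Every combination of states over three quality characteristics,
# minus the all-natural combination.
patterns = [combo for combo in product(STATES, repeat=3) if combo != ("N", "N", "N")]

vectors_per_pattern = 1000
total = len(patterns) * vectors_per_pattern
n_test = total // 3        # one third held out for testing (integer division assumed)
n_train = total - n_test   # remaining two thirds used for training

print(len(patterns), total, n_train, n_test)  # 26 26000 17334 8666
```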


Fig. 6. Individual control chart for x1 , x2 and x3 for the presented part. (a) Displacement of the mean +2 for x1 . (b) Displacement of the mean −1 and displacement of the variance +1 for x2 . (c) Process is normal for x3 .


Fig. 7. $\chi^2$ control chart for the presented part.

Fig. 8. |S| Control chart for the presented part.

For the mean shift recognition evaluation, the ARL performance of the proposed method with p = 2 and correlation between quality characteristics ρ = 0.5 was compared in Table 8 with several statistics-based approaches: Hotelling's $\chi^2$ chart, the MCUSUM chart [25], two MEWMA charts proposed by Lowry et al. [26], and the MCUSUM chart (MC1) [27]. The average ARL performance of the proposed method was estimated over the seven shift combinations presented in Table 3. The type

Table 7. Outputs of SVM and networks A and B for the case study. N, MS, and VS denote normal, mean shift, and variance shift, respectively.

| No. | Output SVM | Output A | Output B |
|---|---|---|---|
| 1–12 | (N, N, N) | – | – |
| 2–13 | (N, N, N) | – | – |
| 3–14 | (N, N, N) | – | – |
| 4–15 | (N, N, N) | – | – |
| 5–16 | (N, N, N) | – | – |
| 6–17 | (N, N, N) | – | – |
| 7–18 | (N, N, N) | – | – |
| 8–19 | (N, N, N) | – | – |
| 9–20 | (N, N, N) | – | – |
| 10–21 | (N, N, N) | – | – |
| 11–22 | (N, N, N) | – | – |
| 12–23 | (N, N, N) | – | – |
| 13–24 | (N, N, N) | – | – |
| 14–25 | (N, N, N) | – | – |
| 15–26 | (N, N, N) | – | – |
| 16–27 | (N, N, N) | – | – |
| 17–28 | (N, N, N) | – | – |
| 18–29 | (N, N, N) | – | – |
| 19–30 | (N, N, N) | – | – |
| 20–31 | (N, N, N) | – | – |
| 21–32 | (N, N, N) | (0.42, −0.01, 0.01) | (0.03, 0.34, 0.05) |
| 22–33 | (MS, N, N) | (0.50, 0.05, 0.06) | (0.13, 0.37, 0.06) |
| 23–34 | (MS, N, N) | (0.58, 0.07, −0.07) | (0.04, 0.51, 0.01) |
| 24–35 | (MS, N, N) | (0.64, −0.05, 0.08) | (0.06, 0.45, 0.02) |
| 25–36 | (MS, MS, N) | (0.66, −0.02, −0.04) | (0.03, 0.53, 0.08) |
| 26–37 | (MS, MS, N) | (0.73, −0.09, 0.05) | (0.04, 0.52, 0.00) |
| 27–38 | (MS, MS, N) | (0.70, −0.15, −0.02) | (0.03, 0.58, 0.07) |
| 28–39 | (MS, MS, N) | (0.69, −0.21, −0.01) | (0.08, 0.68, 0.07) |
| 29–40 | (MS, MS + VS, N) | (0.71, −0.32, 0.1) | (0.10, 0.72, 0.1) |
| 30–41 | (MS, MS + VS, N) | (0.77, −0.42, −0.03) | (0.10, 0.72, 0.09) |
| 31–42 | (MS, MS + VS, N) | (0.63, −0.43, 0.00) | (0.05, 0.76, 0.05) |
| 32–43 | (MS, MS + VS, N) | (0.67, −0.46, 0.05) | (0.09, 0.74, 0.07) |
| 33–44 | (MS, MS + VS, N) | (0.66, −0.39, 0.07) | (0.04, 0.71, 0.1) |
| 34–45 | (MS, MS + VS, N) | (0.64, −0.44, 0.02) | (0.07, 0.71, 0.17) |
| 35–46 | (MS, MS + VS, N) | (0.71, −0.31, 0.08) | (0.08, 0.68, 0.1) |
| 36–47 | (MS, MS + VS, N) | (0.72, −0.38, 0.07) | (0.08, 0.74, 0.07) |
| 37–48 | (MS, MS + VS, N) | (0.70, −0.43, 0.10) | (0.04, 0.69, 0.13) |
| 38–49 | (MS, MS + VS, N) | (0.71, −0.41, 0.13) | (0.15, 0.69, 0.17) |
| 39–50 | (MS, MS + VS, N) | (0.68, −0.44, 0.10) | (0.11, 0.73, 0.12) |
| 40–51 | (MS, MS + VS, N) | (0.73, −0.43, 0.12) | (0.08, 0.70, 0.14) |


Table 8. Comparison of ARLs between the proposed model and the existing MSPC approaches. All columns other than the proposed approach are statistics-based approaches.

| Mean shift | Proposed approach | $\chi^2$ (UCL = 10.6) | MCUSUM (h = 5.5, k = 0.50) | MC1 (h = 4.75, k = 0.50) | MEWMA-1 (h = 8.79, r = 0.10) | MEWMA-2 (h = 8.66, r = 0.10) |
|---|---|---|---|---|---|---|
| 0 | 200 | 200 | 200 | 203 | 200 | 200 |
| 1 | 5.39 | 42.00 | 9.35 | 9.28 | 7.76 | 10.20 |
| 1.5 | 3.45 | 15.80 | 5.94 | 5.23 | 4.07 | 6.12 |
| 2 | 2.45 | 6.90 | 4.20 | 3.69 | 2.59 | 4.41 |
| 2.5 | 2.04 | 3.50 | 3.26 | 2.91 | 1.89 | 3.51 |
| 3 | 1.81 | 2.20 | 2.78 | 2.40 | 1.50 | 2.92 |

Table 9. ARL comparison of the traditional |S|, adaptive |S|, MDS |S| control charts and neural networks.

| Variance shift (D) | \|S\| | Adaptive sample size | MDS | NN |
|---|---|---|---|---|
| 0 | 200.0 | 200.0 | 200.0 | 200.0 |
| 0.1 | 141.4 | 143.0 | 96.78 | 102.3 |
| 0.2 | 104.6 | 102.7 | 62.63 | 60.14 |
| 0.3 | 80.45 | 73.21 | 45.52 | 42.08 |
| 0.5 | 51.91 | 38.45 | 25.77 | 24.35 |
| 1 | 24.11 | 13.89 | 10.57 | 12.24 |
| 1.5 | 18.34 | 10.61 | 8.85 | 9.74 |
| 2 | 10.16 | 5.549 | 4.726 | 6.036 |

I errors (expressed here as in-control ARL) of all the control methods in Table 8 are kept approximately the same. Each ARL value of a shift combination setting for the proposed model was obtained from 2000 simulation runs. The proposed approach performs better than the statistics-based approaches, especially when the shift magnitude is small. Fig. 9 illustrates the comparison of ARLs between the proposed model and existing MSPC approaches.

Variance shift: The performance of the proposed model for variance shift can be compared with that of other competing schemes in terms of ARL. The ARL values of all selected schemes depend only on D. To facilitate comparison, all schemes selected for comparison had the same type I error probability and the same sample size. Table 9 presents the ARLs for the traditional |S| chart, the adaptive sample size control charts of Aparisi et al. [7], the MDS |S| charts of Grigoryan and He [28], and the neural network for the bivariate simulated example. The ARLs of the adaptive sample size scheme and the MDS |S| charts are minimized for each shift. As Table 9 shows, neural network B outperforms the current control chart approaches for small-to-medium shifts. The best performance of neural network B occurs for shift values between 0.2 and 1. The neural network performs comparably to the MDS |S| charts when D < 1. The neural network performs better than the |S| chart with adaptive sample size when D ≤ 1.5, but marginally worse when D > 2. Note that when D is unknown, the proposed neural network might outperform the adaptive sample size charts and the MDS |S| charts. Fig. 10 compares the ARLs of neural network B and the existing statistics-based approaches.

The paper proposed a modular model for recognizing concurrent mean and variance shifts in a multivariate process with satisfactory accuracy. In addition, this approach can classify the magnitude of the mean shift or variance shift for each variable simultaneously. The proposed model can lead to finding multiple assignable causes simultaneously and thus significantly reduce the diagnostic time for an out-of-control process. The main contributions of this research are recognizing the type of shift (mean or variance) and the shift magnitude for each variable simultaneously, because most previous studies consider either only one type of unnatural pattern for a multivariate process or multiple unnatural patterns for a univariate process. The few works that do consider the recognition of multiple unnatural patterns for a multivariate process, such as El-Midany et al. [11], do not obtain any information about the magnitude of the deviations. Finally, while most research has focused on determining the contributors to process mean shifts, this study is also motivated to consider process variance shifts.
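The ARL figures compared in this section come from simulation. As an illustration of how such a value is estimated, the sketch below simulates the run length of the $\chi^2$ chart for p = 2 with the UCL of 10.6 quoted in Table 8. It is not the authors' procedure: it assumes that the shift magnitude is the statistical distance of the shifted mean from the in-control mean, so the data can be simulated in whitened coordinates, and every function name and default is hypothetical.

```python
import math
import random

UCL = 10.6  # chi-square chart limit for p = 2 quoted in Table 8

def in_control_arl(ucl):
    """Exact in-control ARL for p = 2, since P(chi2 with 2 df > ucl) = exp(-ucl / 2)."""
    return 1.0 / math.exp(-ucl / 2.0)

def run_length(shift, ucl, rng, max_len=100000):
    """Number of samples until the chi-square statistic first exceeds ucl.

    Works in whitened coordinates: the two components are independent unit
    normals and the whole shift of size `shift` is placed on the first one.
    """
    for t in range(1, max_len + 1):
        z1 = rng.gauss(shift, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        if z1 * z1 + z2 * z2 > ucl:
            return t
    return max_len

def simulated_arl(shift, ucl=UCL, runs=2000, seed=0):
    """Average run length over `runs` independent simulated sequences."""
    rng = random.Random(seed)
    return sum(run_length(shift, ucl, rng) for _ in range(runs)) / runs

print(round(in_control_arl(UCL), 1))
for d in (1.0, 2.0, 3.0):
    print(d, round(simulated_arl(d), 2))
```

With 2000 runs per setting, the same number used for the proposed model, the estimates should fall close to the $\chi^2$ column of Table 8 (about 42, 6.9 and 2.2), up to simulation noise.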

Fig. 9. Comparison of ARLs between the proposed model and existing MSPC approaches.

Fig. 10. ARL comparison of neural network B and current statistical based approaches.


4. Conclusion

This paper proposes a modular model based on support vector machines and neural networks for detecting mean and variance shifts in a multivariate process. In addition, the model can classify the magnitude of the shift for each variable simultaneously. These contributions address the main problem of traditional approaches, which cannot directly determine which variable or group of variables has caused the out-of-control signal or how large the deviation is. This information provides clues for finding the cause of an unnatural process. The performance of the proposed approach was evaluated using a case study and several comparative studies. Important design issues of the proposed model were discussed. The performance of the neural networks was evaluated by estimating the ARLs using simulation. Extensive comparisons showed that the proposed approach offers a competitive alternative to the existing control procedures. For future research, an expert system can be integrated with the proposed model for rapidly finding and advising on an unnatural pattern. The present system considers only two main unnatural patterns (mean shift and variance shift). Future research can study other unnatural patterns in order to find NNs that can also recognize them.

References

[1] C.-S. Cheng, H.-P. Cheng, Using neural networks to detect the bivariate process variance shifts pattern, Computers and Industrial Engineering 60 (2) (2011) 269–278.
[2] F.B. Alt, Multivariate quality control, in: S. Kotz, N.L. Johnson, C.R. Read (Eds.), The Encyclopedia of Statistical Sciences, Wiley, New York, 1985, pp. 110–122.
[3] F. Zorriassatine, J.D.T. Tannock, A review of neural networks for statistical process control, Journal of Intelligent Manufacturing 9 (1998) 209–224.
[4] F. Zorriassatine, J.D.T. Tannock, C. O'Brien, Using novelty detection to identify abnormalities caused by mean shifts in bivariate processes, Computers and Industrial Engineering 44 (2003) 385–408.
[5] L.H. Chen, T.Y. Wang, Artificial neural networks to classify mean shifts from multivariate $\chi^2$ chart signals, Computers and Industrial Engineering 47 (2004) 195–205.
[6] S.T.A. Niaki, B. Abbasi, Fault diagnosis in multivariate control charts using artificial neural networks, Quality and Reliability Engineering International 21 (2005) 825–840.
[7] F. Aparisi, G. Avendano, J. Sanz, Techniques to interpret $T^2$ control chart signals, IIE Transactions 38 (2006) 647–657.
[8] F. Aparisi, J. Sanz, G. Avendano, Neural networks to identify the out-of-control variables when a MEWMA chart is employed, in: Proceedings of the Applied Simulation and Modeling, Palma De Mallorca, Spain, 2007, pp. 29–31.
[9] R.S. Guh, Y.R. Shiue, An effective application of decision tree learning for online detection of mean shifts in multivariate control charts, Computers and Industrial Engineering 55 (2008) 475–493.
[10] J.-b. Yu, L.-f. Xi, A neural network ensemble-based model for on-line monitoring and diagnosis of out of control signals in multivariate manufacturing processes, Expert Systems with Applications 36 (2009) 909–921.
[11] T.T. El-Midany, M.A. El-Baz, M.S. Abd-Elwahed, A proposed framework for control chart pattern recognition in multivariate process using artificial neural networks, Expert Systems with Applications 37 (2010) 1035–1042.
[12] C. Low, C.M. Hsu, F.J. Yu, Analysis of variations in a multi-variate process using neural networks, International Journal of Advanced Manufacturing Technology 22 (2003) 911–921.
[13] C.-S. Cheng, H.-P. Cheng, Identifying the source of variance shifts in the multivariate process using neural networks and support vector machines, Expert Systems with Applications 35 (2008) 198–206.
[14] R.S. Guh, A hybrid learning-based model for on-line detection and analysis of control chart patterns, Computers and Industrial Engineering 49 (2005) 35–62.
[15] S. Lawrence, C.L. Giles, A.C. Tsoi, Lessons in neural network training: overfitting may be harder than expected, in: Proceedings of the Fourteenth National Conference on Artificial Intelligence, AAAI-97, AAAI Press, Menlo Park, CA, 1997, pp. 540–545.
[16] W.S. Sarle, Stopped training and other remedies for overfitting, in: Proceedings of the Twenty-seventh Symposium on the Interface of Computing Science and Statistics, 1995, pp. 352–360.
[17] V. Vapnik, The Nature of Statistical Learning Theory, Springer, New York, 1995.
[18] S. Abe, Support Vector Machines for Pattern Classification, Springer-Verlag New York Inc., 2010.
[19] M. Salehi, A. Bahreininejad, I. Nakhai, On-line analysis of out of control signals in multivariate manufacturing processes using a hybrid learning-based model, Neurocomputing 74 (2011) 2083–2095.
[20] R. Battiti, Using mutual information for selecting features in supervised neural net learning, IEEE Transactions on Neural Networks 5 (1994) 537–550.
[21] A.E. Smith, X-bar and R control chart interpretation using neural computing, International Journal of Production Research 32 (1994) 309–320.
[22] A. Hassan, M. Shariff Nabi Baksh, A.M. Shaharoun, H. Jamaluddin, Improved SPC chart pattern recognition using statistical features, International Journal of Production Research 41 (7) (2003) 1587–1603.
[23] R.S. Guh, On-line identification and quantification of mean shifts in bivariate processes using a neural network-based approach, Quality and Reliability Engineering International 23 (3) (2007) 367–385.
[24] C.S. Cheng, A multi-layer neural network model for detecting changes in the process mean, Computers and Industrial Engineering 28 (1995) 51–61.
[25] J.J. Pignatiello, G.C. Runger, Comparisons of multivariate CUSUM charts, Journal of Quality Technology 22 (3) (1990) 173–186.
[26] C.A. Lowry, W.H. Woodall, C.W. Champ, S.E. Rigdon, Multivariate exponentially weighted moving average control chart, Technometrics 34 (1) (1992) 46–53.
[27] R.B. Crosier, Multivariate generalizations of cumulative sum quality control schemes, Technometrics 30 (3) (1988) 291–303.
[28] A. Grigoryan, D. He, Multivariate double sampling |S| charts for controlling process variability, International Journal of Production Research 43 (2005) 715–730.
