Tests for special causes with multivariate autocorrelated data

Computers Ops Res. Vol. 22, No. 4, pp. 443-453, 1995
Pergamon
Copyright © 1995 Elsevier Science Ltd
Printed in Great Britain. All rights reserved
0305-0548/95 $9.50+0.00
0305-0548(94)00052-2

John M. Charnes†
Department of Management Science, University of Miami, Coral Gables, FL 33124, U.S.A.

Abstract--Runs tests for special causes are used routinely with Shewhart quality-control charts that are based on independent and identically distributed univariate processes. Recently, researchers have proposed the use of alternative, time-series-based statistical models for constructing control charts that are valid for autocorrelated processes. Monte Carlo simulation is used here to examine the effects of incorrectly assuming serial independence and using the runs tests for special causes on data generated by autocorrelated multivariate processes. The results have implications for developing and using statistical process-control techniques in practice.

1. INTRODUCTION

Statistical process-control (SPC) control charts have been used in industry since Walter A. Shewhart devised them for Bell Telephone Laboratories in 1924, but global competitive pressure has recently stimulated increased industrial interest in their application to service and production processes. The model that Shewhart (1931) proposed explicitly assumes that the data are generated by a univariate process that yields independent and identically distributed (iid) observations. This model works very well when it is appropriate, but it is not suitable for direct application to all industrial processes. The renewed interest in SPC by industry has led researchers to develop alternate models for process control that allow for both multivariate processes and the relaxation of the iid assumptions underlying the validity of Shewhart's explicit model. The intent of this paper is to use Monte Carlo simulation to examine the Type I error rate of control charts based on several alternate models when the data are generated by multivariate processes that produce vectors of observations which are not independent over time. It is assumed here that the parameters are known for the data generating process. These known parameters are used to construct various control charts for the simulated observations in order to eliminate any uncertainty due to sampling error. While the assumption that the parameters are known might sometimes be unrealistic in practice, there are times when enough data are available to make the assumption of known parameters innocuous. In the next section the model proposed explicitly by Shewhart is presented. Then comes a discussion of some of the many extensions of Shewhart's explicit model that have been proposed recently. Following that are the simulation results. The final section concludes and gives implications for SPC practitioners.

† Professor Charnes is an Associate Professor of Statistics and Quality Management in The School of Business, The University of Kansas, Lawrence, Kansas. He holds the Bachelor of Civil Engineering, Master of Business Administration, and Doctor of Philosophy (Business Administration) degrees from the University of Minnesota. He has published papers in Management Science, Decision Sciences, Journal of Business Logistics and International Journal of Production Research. His current scientific interests are in the statistical aspects of quality management, applied time series analysis and simulation output analysis.

2. SHEWHART CONTROL CHARTS

Shewhart's (1931) method for constructing quality control charts was designed for use with a univariate stationary sequence of independent and identically distributed (iid) measurements on a normally distributed process, $\{X_t: t = 1, 2, \ldots\}$, having mean $\mu = E[X_t]$ and variance $\sigma^2 = E[(X_t - \mu)^2]$. The model given explicitly by Shewhart is

$$X_t - \mu = a_t, \qquad (1)$$

where $\{a_t: t = 1, 2, \ldots\}$ is a Gaussian white noise process with mean zero and variance $\sigma^2$. Here $X_t$ may represent either individual observations generated by the process or arithmetic averages (means) of logical subgroups of observations. This paper is concerned exclusively with the use of the model for individual observations, but of course the implications are the same if the subgroup means are generated by the processes investigated here.

Model (1) is used by first estimating the mean $\mu$ with some estimate $\hat{\mu}$ taken over a long period of time $0 < t \le T$ during which the process is in statistical control, which means that all variation in the process is assumed to be due to chance and not to some assignable cause, in Shewhart's (1931) nomenclature, or special cause, in Deming's (1982). Then an estimate of the process standard deviation, $\hat{\sigma}$, is computed. See Cryer and Ryan (1990) for a comparison of different possibilities for $\hat{\sigma}$. Given the estimates $\hat{\mu}$ and $\hat{\sigma}$, a time-sequence plot is used to monitor the process by drawing a centerline at $\hat{\mu}$ and three-sigma control limits at $\hat{\mu} \pm 3\hat{\sigma}$, and then plotting the observed values of $\{X_t\}$ on the chart as they occur. If an observed value of the process falls outside the three-sigma limits, a search for an assignable or special cause of variation should commence.

In addition, the process can be monitored by applying several other "runs" tests to the series of observations to judge whether the observed variation is due solely to chance. The statistical software package Minitab (1992) gives seven additional tests that are valid in testing for special causes when the process generating the data is well approximated by Shewhart's model (1). The runs tests are designed to detect various patterns that may indicate different reasons for departure from a state of statistical control. In the Minitab (1992) implementation of the runs tests, the control chart is divided into Zones A, B and C on both sides of the centerline. These zones are depicted in Fig. 1. Zone C is the area between $\hat{\mu} - \hat{\sigma}$ and $\hat{\mu} + \hat{\sigma}$; Zone B is the area between $\hat{\mu} - \hat{\sigma}$ and $\hat{\mu} - 2\hat{\sigma}$ and between $\hat{\mu} + \hat{\sigma}$ and $\hat{\mu} + 2\hat{\sigma}$; and Zone A is the area between $\hat{\mu} - 2\hat{\sigma}$ and $\hat{\mu} - 3\hat{\sigma}$ and between $\hat{\mu} + 2\hat{\sigma}$ and $\hat{\mu} + 3\hat{\sigma}$. The tests are applied to the last set of observations whenever an observation is added to the control chart; if any of the test conditions in Fig. 1 are met, a search for a special cause should commence.

Test 1, called "Criterion I" by Shewhart (1931), is whether an observation falls outside the three-sigma control limits. If the model in (1) is valid, the probability that Test 1 will falsely signal the presence of a special cause (Type I error rate) is about 0.0027. The remaining seven runs tests are rule-of-thumb indicators that react to patterns in the chart having a similarly low probability of occurrence when (1) is valid. The use of runs tests has been studied extensively for the univariate case by Champ and Woodall (1987), Davis and Woodall (1988), Wheeler (1983), and Walker et al. (1991).

3. EXTENSIONS OF SHEWHART'S MODEL

Alt (1984), Jackson (1985), Lowry et al. (1992), Montgomery and Mastrangelo (1991), and Pignatiello and Runger (1990) provide reviews of several important extensions of Shewhart's model. This section shows the extensions of model (1) to multivariate, uncorrelated processes and to univariate, autocorrelated processes, and then considers an extension of the model to the case of multivariate, autocorrelated observations.

[Figure 1: eight control-chart panels, each marked with the upper control limit (UCL), the lower control limit (LCL) and Zones A, B and C, one panel per test.]
Test 1: One point beyond Zone A
Test 2: Nine points in a row in Zone C or beyond on one side of centerline
Test 3: Six points in a row, all increasing or all decreasing
Test 4: Fourteen points in a row, alternating up and down
Test 5: Two out of three points in a row in Zone A or beyond
Test 6: Four out of five points in a row in Zone B or beyond on one side of centerline
Test 7: Fifteen points in a row in Zones C above and below centerline
Test 8: Eight points in a row beyond Zones C above and below centerline
Fig. 1. Runs tests.

3.1. Multivariate non-autocorrelated data

Several correlated measures of quality-related characteristics are often observed in industry. Alt and Smith (1988) give an example in which the usefulness of the plastic film being manufactured is dependent upon both the transparency and the thickness of the film, which are inversely related. They present a generalization of (1) that can be used as an aid to monitoring the process. The method is similar to that of Hotelling (1947), who may have been the first to recognize that it is necessary to consider jointly two or more correlated measures of quality on the same product. Alt and Smith's (1988) methodology for constructing a joint control chart on more than one variable uses the model

$$X_t - \mu = a_t$$

for $t = 1, 2, \ldots$, where the $d$ variables being measured constitute the vector of observations $X_t = (X_{1,t}, \ldots, X_{d,t})'$ taken at time $t$. The mean vector of the process is denoted by $\mu = (\mu_1, \mu_2, \ldots, \mu_d)'$ and the process $\{a_t\} = \{(a_{1,t}, a_{2,t}, \ldots, a_{d,t})'\}$ is Gaussian multivariate white noise with mean $0$ and variance-covariance matrix $\Sigma$. Alt and Smith (1988) estimate $\mu$ with $\bar{X} = \sum_{t=1}^{T} X_t / T = (\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_d)'$, then calculate

$$S = \frac{1}{T-1} \sum_{t=1}^{T} (X_t - \bar{X})(X_t - \bar{X})'$$

as an estimate of $\Sigma$. The multivariate control chart is a plot of the quadratic form

$$Q_t = (X_t - \bar{X})' S^{-1} (X_t - \bar{X}) \quad \text{for } t \ge T + 1.$$

The quadratic form $Q_t$ has the Hotelling $T^2$ distribution (see Anderson, 1984), but for large samples $S$ will be a very precise estimate of $\Sigma$, and a single control limit drawn at the 0.9973 quantile of the $\chi^2$-distribution with $d$ degrees of freedom will give this chart approximately the same Type I error rate as Test 1 in the three-sigma Shewhart control chart described above.

The other seven tests discussed here were developed for sequences of normally distributed random variables and thus are not applicable to the multivariate control chart. However, runs rules could be constructed for multivariate charts as well. If the multivariate process is in statistical control, the observed value of $Q_t$ will be approximately distributed as a $\chi^2$ random variable with $d$ degrees of freedom, and this distribution could be used to construct the runs tests.

Alt and Smith (1988) also advocate making individual plots similar to those described in Section 2 for each component $\{X_{i,t}\}$, using the elements $s_{ii}$ on the main diagonal of $S$, to aid in detecting which variable(s) may have been most responsible for leading to an out-of-control point on the multivariate control chart. Because more than one plot is used simultaneously, Alt and Smith recommend using the Bonferroni method to deal with the multiple comparisons problem in constructing the control limits.
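The multivariate chart described above can be illustrated with a minimal Python sketch for the bivariate case ($d = 2$). It assumes known in-control parameters; the values of `mean` and `cov` and the function names are hypothetical and chosen only for illustration. For $d = 2$ the $\chi^2$ distribution function is $1 - e^{-x/2}$, so the 0.9973 quantile can be written in closed form without a statistics library.

```python
import math

def chi2_quantile_2df(p):
    """Quantile of the chi-square distribution with 2 degrees of freedom.

    For d = 2 the CDF is 1 - exp(-x / 2), so the quantile has a closed form.
    """
    return -2.0 * math.log(1.0 - p)

def quadratic_form(x, mean, cov):
    """Q = (x - mean)' cov^{-1} (x - mean) for a bivariate observation."""
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    d1, d2 = x[0] - mean[0], x[1] - mean[1]
    # Inverse of a 2 x 2 matrix written out explicitly.
    return (s22 * d1 * d1 - (s12 + s21) * d1 * d2 + s11 * d2 * d2) / det

# Hypothetical in-control parameters, chosen only for illustration.
mean = (0.0, 0.0)
cov = ((1.0, 0.5), (0.5, 1.0))
limit = chi2_quantile_2df(0.9973)  # single upper control limit

for x in [(0.5, 0.2), (3.0, -3.0)]:
    q = quadratic_form(x, mean, cov)
    print(x, round(q, 3), "signal" if q > limit else "in control")
```

The second observation moves against the positive correlation between the two components, which is why it signals on the joint chart even though each coordinate sits exactly at its own three-sigma limit.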
All of the runs tests described above are applicable to the individual control charts for each component.

3.2. Univariate autocorrelated data

Alwan and Roberts (1988) suggest that it may be necessary to modify the control chart model (1) to account for any autocorrelation that may be present in a univariate process. They present a general technique that accounts for autocorrelation by fitting a time-series model to the data and then applying Shewhart's control-chart methodology to the residuals. Alwan and Bissell (1988) give an example of applying the univariate time-series technique to clinical chemistry quality-control measurements. Alwan and Roberts (1988) propose fitting an ARIMA (autoregressive integrated moving-average) model to the data, which is an ARMA (autoregressive moving-average) model fit to data that have been differenced to remove any trend. An ARMA(p, q) model is written

$$X_t - \mu - \phi_1(X_{t-1} - \mu) - \cdots - \phi_p(X_{t-p} - \mu) - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q} = a_t, \qquad (2)$$

where $\mu = E[X_t]$ is the mean of the process; $\phi_1, \ldots, \phi_p$ are autoregressive coefficients; $\theta_1, \ldots, \theta_q$ are moving-average coefficients; and $\{a_t: t = 1, 2, \ldots\}$ is a Gaussian white-noise process with mean zero and variance $\sigma^2$. This model is used by fitting it to differenced data observed over a long period of time during which the process is in statistical control, then using the estimated parameters of the fitted model to obtain an estimate of the variance of the residuals, $\sigma^2$, which is then used to construct control limits for a time-sequence plot of the residuals. Wardell et al. (1992) and Alwan (1992) studied the capability of univariate control charts to identify special causes in the presence of autocorrelation.

3.3. Multivariate autocorrelated data

The models presented previously lead to another extension of (1) that accounts for both autocorrelation within the process and correlation across the components of a multivariate process. A vector ARMA(p, q) model is written

$$X_t - \mu - \Phi_1(X_{t-1} - \mu) - \cdots - \Phi_p(X_{t-p} - \mu) - \Theta_1 a_{t-1} - \cdots - \Theta_q a_{t-q} = a_t, \qquad (3)$$

where $\mu = E[X_t] = (\mu_1, \mu_2, \ldots, \mu_d)'$ is the vector of means of the component processes; $\Phi_1, \ldots, \Phi_p$ are $(d \times d)$ matrices of autoregressive coefficients; $\Theta_1, \ldots, \Theta_q$ are $(d \times d)$ matrices of moving-average coefficients; and the process $\{a_t: t = 1, 2, \ldots\}$ is multivariate white noise with mean $0$ and variance-covariance matrix $\Sigma$.

A vector ARMA model can be used to monitor simultaneously more than one quality-related measure. These models can be fit with statistical software packages such as SCA System (Scientific Computing Associates). It is beyond the scope of this paper to give a full discussion of the procedure for fitting vector ARMA models to data. See Box and Tiao (1981), or Tiao and Tsay (1989) for details.

The vector ARMA model is used in a manner similar to that described by Alwan and Roberts (1988) for univariate ARMA models. The model is fitted to vectors of observations taken over a long period of time during which the process is in statistical control. Then the estimated parameters of the fitted model are used to obtain vectors of residuals, from which multivariate control charts may be constructed using the Alt and Smith (1988) methodology. Similarly, individual charts may be constructed for each of the component series of the residual vectors. Montgomery and Friedman (1989) give an example of applying a vector ARMA model to a bivariate process.

Lowry et al. (1992) give another multivariate time-series extension of (1), the MEWMA (multivariate exponentially weighted moving-average) model. This is also known as a vector IMA(1) (integrated first-order moving-average) model, and is thus closely related to the class of vector ARMA models. Both the MEWMA and vector ARMA techniques can be used to obtain vectors of residuals with which to construct multivariate control charts à la Hotelling (1947), but as in Alt and Smith (1988), it is also useful to look simultaneously at individual plots of the residuals from the model, to which the runs tests apply if model (3) is valid.
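As a rough illustration of how residual vectors might be recovered once the parameters of a vector ARMA(1, 1) model are available, the Python sketch below inverts model (3) with $p = q = 1$ recursively. The start-up convention ($X_0 = \mu$, $a_0 = 0$), the function names and the parameter values are assumptions made for this example only; fitting software may treat the initial values differently.

```python
def mat_vec(m, v):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def varma11_residuals(xs, mu, phi, theta):
    """Residuals a_t = (X_t - mu) - phi (X_{t-1} - mu) - theta a_{t-1}."""
    d = len(mu)
    prev_dev = [0.0] * d  # assumed start: X_0 equals the mean
    prev_a = [0.0] * d    # assumed start: a_0 = 0
    residuals = []
    for x in xs:
        dev = [x[i] - mu[i] for i in range(d)]
        ar = mat_vec(phi, prev_dev)
        ma = mat_vec(theta, prev_a)
        a = [dev[i] - ar[i] - ma[i] for i in range(d)]
        residuals.append(a)
        prev_dev, prev_a = dev, a
    return residuals

# Illustrative (hypothetical) parameters and shocks.
phi = [[0.5, 0.2], [0.0, 0.5]]
theta = [[0.3, 0.1], [0.0, 0.3]]
mu = [0.0, 0.0]
shocks = [[1.0, -0.5], [0.2, 0.3], [-0.7, 0.1]]

# Build observations from the shocks, then check that the shocks are recovered.
xs, prev_x, prev_a = [], [0.0, 0.0], [0.0, 0.0]
for a in shocks:
    ar, ma = mat_vec(phi, prev_x), mat_vec(theta, prev_a)
    x = [ar[i] + a[i] + ma[i] for i in range(2)]
    xs.append(x)
    prev_x, prev_a = x, a

recovered = varma11_residuals(xs, mu, phi, theta)
print(recovered)
```

When the model and start-up values match the generating process, the recovered residuals equal the original shocks up to rounding, which is the sense in which the residual charts are charts of white noise.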

4. CONTROL-CHART MODEL COMPARISON

This section describes Monte Carlo experiments that were performed to investigate the Type I error rate of the runs tests when the processes were modeled with some of the different models discussed above. The data were generated from vector ARMA processes and the alternative models were used to construct control charts for the observations. The runs tests were then administered to the charts, and the frequency of signals for the presence of a special cause was recorded for each alternative modeling of the data in order to determine the Type I error rate.

4.1. Vector ARMA(1, 1) model

A d-dimensional vector ARMA(1, 1) process is written in matrix notation as

$$X_t - \Phi_1 X_{t-1} - \Theta a_{t-1} = a_t, \qquad (4)$$

where $\Phi_1$ is a $(d \times d)$ matrix of first-order autoregressive coefficients, $\Theta$ is a $(d \times d)$ matrix of first-order moving-average coefficients, and $\{a_t: t = 1, 2, \ldots\}$ is a d-dimensional Gaussian white-noise process. If the roots of the determinantal equation $|I - \Phi_1 z| = 0$ lie outside the unit circle in the complex plane, the process $\{X_1, X_2, \ldots\}$ is stationary. Thus $\mu = 0$ can be specified without loss of generality.

The second moments of a stationary multivariate process measure the strength of the linear relationships among elements of the vector as a function of the difference in the time index. The second moments are the elements, $\gamma_{ij}(k)$ (for $i, j = 1, \ldots, d$), of the autocovariance matrices, which are defined as

$$\Gamma(k) = E[(X_{t+k} - \mu)(X_t - \mu)'] \quad \text{for } k = \ldots, -2, -1, 0, 1, 2, \ldots.$$

Because the multivariate process is Gaussian, the mean vector and the autocovariance matrices completely characterize the distribution of $X_t$. With $\mu = 0$, the relationships between the autoregressive-coefficient matrices and the autocovariance matrices are given by

$$E[X_t X'_{t-k}] = \Phi_1 E[X_{t-1} X'_{t-k}] + \Theta E[a_{t-1} X'_{t-k}] + E[a_t X'_{t-k}],$$

which yield

$$\Gamma(0) = \Phi_1 \Gamma(-1) + \Sigma + \Theta \Sigma (\Phi_1 + \Theta)'$$
$$\Gamma(1) = \Phi_1 \Gamma(0) + \Theta \Sigma \qquad (5)$$
$$\Gamma(k) = \Phi_1 \Gamma(k-1) \quad \text{for } k = 2, 3, \ldots.$$

(Note also that $\Gamma(-k) = \Gamma(k)'$.)

Table 1. Factor levels for vector ARMA(1,1) experiment

Level    Φ1             Θ              Σ
High     [0.9  0.9]     [0.8  0.8]     [1.0  0.9]
         [0.0  0.9]     [0.0  0.8]     [0.9  1.0]
Low      [0.2  0.2]     [0.1  0.1]     [1.0  0.1]
         [0.0  0.2]     [0.0  0.1]     [0.1  1.0]

The auto- and cross-correlation are measured by the elements $\rho_{ij}(k)$ of the autocorrelation matrices $R(k)$, for $k = \ldots, -2, -1, 0, 1, 2, \ldots$, which are derived from the elements of the autocovariance matrices as

$$\rho_{ij}(k) = \frac{\gamma_{ij}(k)}{\sqrt{\gamma_{ii}(0)\,\gamma_{jj}(0)}}.$$

Because these are correlations, it is true for a stationary multivariate autocorrelated process that

$$-1 \le \rho_{ij}(k) \le 1 \quad \text{for all } i, j \text{ and } k = \ldots, -2, -1, 1, 2, \ldots$$
$$\rho_{ij}(0) = 1 \quad \text{for all } i = j$$
$$-1 < \rho_{ij}(0) < 1 \quad \text{for all } i \ne j.$$
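The relations in (5) determine $\Gamma(0)$ only implicitly. One way to obtain it numerically is sketched below in Python, as an illustration and not the procedure used in the paper. Substituting $\Gamma(-1) = \Gamma(1)' = \Gamma(0)\Phi_1' + \Sigma\Theta'$ into the first line of (5) gives $\Gamma(0) = \Phi_1\Gamma(0)\Phi_1' + C$ with $C = \Sigma + \Phi_1\Sigma\Theta' + \Theta\Sigma\Phi_1' + \Theta\Sigma\Theta'$, which can be iterated to a fixed point when the process is stationary. The function names and the example parameter values are hypothetical.

```python
import math

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def mat_add(*ms):
    rows, cols = len(ms[0]), len(ms[0][0])
    return [[sum(m[i][j] for m in ms) for j in range(cols)] for i in range(rows)]

def gamma0_varma11(phi, theta, sigma, tol=1e-12, max_iter=100000):
    """Solve Gamma(0) = phi Gamma(0) phi' + C by fixed-point iteration, where
    C = Sigma + phi Sigma theta' + theta Sigma phi' + theta Sigma theta'."""
    pt, tt = transpose(phi), transpose(theta)
    const = mat_add(sigma,
                    mat_mul(mat_mul(phi, sigma), tt),
                    mat_mul(mat_mul(theta, sigma), pt),
                    mat_mul(mat_mul(theta, sigma), tt))
    g = [row[:] for row in const]
    n = len(g)
    for _ in range(max_iter):
        new = mat_add(mat_mul(mat_mul(phi, g), pt), const)
        diff = max(abs(new[i][j] - g[i][j]) for i in range(n) for j in range(n))
        g = new
        if diff < tol:
            break
    return g

# Illustrative (hypothetical) stationary parameters.
phi = [[0.5, 0.3], [0.0, 0.5]]
theta = [[0.3, 0.1], [0.0, 0.3]]
sigma = [[1.0, 0.5], [0.5, 1.0]]
g0 = gamma0_varma11(phi, theta, sigma)
# Standard deviations that would be specified for charts of the raw series.
print([math.sqrt(g0[i][i]) for i in range(2)])
```

In the scalar case the same fixed point reduces to the familiar ARMA(1, 1) variance $\sigma^2(1 + 2\phi\theta + \theta^2)/(1 - \phi^2)$, which gives a convenient check on the iteration.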

Estimates of the elements of the autocorrelation matrices or related sample statistics, the partial autocorrelation matrices, could be used to test statistically for a white-noise process. See Brockwell and Davis (1987), or Tiao and Tsay (1989) for details.

4.1.1. Two-dimensional vector ARMA(1, 1) process. If d = 2, the vector ARMA(1, 1) process with mean 0 is written in scalar notation as

$$X_{1,t} = \phi_{11} X_{1,t-1} + \phi_{12} X_{2,t-1} + a_{1,t} + \theta_{11} a_{1,t-1} + \theta_{12} a_{2,t-1}$$
$$X_{2,t} = \phi_{21} X_{1,t-1} + \phi_{22} X_{2,t-1} + a_{2,t} + \theta_{21} a_{1,t-1} + \theta_{22} a_{2,t-1}.$$

In the Monte Carlo study, this two-dimensional model was used to generate observations, and control charts were constructed using different specifications of the process standard deviations. The standard deviations specified were the expected values of the "natural" estimators for the process models used, and are obtained from the second moments of the vector ARMA(1, 1) process, as given above. Of course, in practice one would never know these true values and would instead use estimates obtained from the data. However, the true values are used here to focus on the Type I error rate alone without the uncertainty associated with using estimates of these values.

The conditions under which the data were generated were varied by specifying two levels each of the experimental factors $\Phi_1$, $\Theta$ and $\Sigma$. The levels for the $2^3$ full-factorial design are shown in Table 1. At each of the eight design points, 100 autocorrelated vector observations were simulated from the two-dimensional model. The initial observations were drawn randomly from the stationary distribution to eliminate any initial-transient bias. The experiment was replicated independently 100 times at each design point. For each replication, the vector observations $\{X_1, X_2, \ldots, X_{100}\}$ and the model parameters were then used to construct three different individuals control charts for each series. Each chart was then subjected to the runs tests to detect the presence of special causes.

The data were generated with Marse and Roberts' (1983) portable random-number generator in conjunction with the polar method to generate vector observations on the Gaussian process $\{a_t\}$ in (4). The control charts were constructed and the runs tests were administered by MINITAB version 9 (Minitab 1992) running under VMS version 5.4-2 on a DEC VAX 8650. The MINITAB command ICHART with specified means and standard deviations was used to obtain all charts studied here. Each replication of the experiment consisted of generating 100 two-dimensional observations from (4) and using MINITAB to construct individuals control charts on each of the two processes using three different models for the data. The differences among these control-chart models are in the manner in which the second moments of the processes were used to specify the standard deviations to MINITAB.
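The data-generation step can be sketched in Python as follows. This is only an approximation of the procedure described above, under stated assumptions: Python's `random.Random` stands in for the Marse and Roberts generator, and a burn-in period (`burn_in`, a hypothetical parameter) replaces the draw of initial observations from the stationary distribution. The polar method and the two-dimensional form of model (4) are implemented directly.

```python
import math
import random

def polar_pair(rng):
    """One pair of independent N(0, 1) variates by the polar method."""
    while True:
        u = 2.0 * rng.random() - 1.0
        v = 2.0 * rng.random() - 1.0
        s = u * u + v * v
        if 0.0 < s < 1.0:
            f = math.sqrt(-2.0 * math.log(s) / s)
            return u * f, v * f

def noise(rng, sigma):
    """Bivariate N(0, sigma) vector from a hand-written 2 x 2 Cholesky factor."""
    z1, z2 = polar_pair(rng)
    l11 = math.sqrt(sigma[0][0])
    l21 = sigma[1][0] / l11
    l22 = math.sqrt(sigma[1][1] - l21 * l21)
    return [l11 * z1, l21 * z1 + l22 * z2]

def simulate_varma11(n, phi, theta, sigma, seed=1, burn_in=500):
    """Simulate n observations of X_t = phi X_{t-1} + a_t + theta a_{t-1}."""
    rng = random.Random(seed)
    x_prev, a_prev = [0.0, 0.0], [0.0, 0.0]
    out = []
    for t in range(burn_in + n):
        a = noise(rng, sigma)
        x = [sum(phi[i][j] * x_prev[j] + theta[i][j] * a_prev[j] for j in range(2)) + a[i]
             for i in range(2)]
        if t >= burn_in:  # discard the start-up transient
            out.append(x)
        x_prev, a_prev = x, a
    return out

# Illustrative (hypothetical) parameter values.
series = simulate_varma11(100,
                          phi=[[0.5, 0.2], [0.0, 0.5]],
                          theta=[[0.3, 0.1], [0.0, 0.3]],
                          sigma=[[1.0, 0.5], [0.5, 1.0]],
                          seed=7)
print(len(series), series[0])
```

A burn-in is simpler to code than an exact stationary start, at the cost of extra draws; with strongly autocorrelated processes a longer burn-in would be the cautious choice.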

4.2. Alternative control-chart models

4.2.1. Univariate independent data model. For the first set of charts the auto- and cross-correlation present in the data were ignored, and model (1) was used to construct control charts for the sequences

$$\{X_{1,t}\} = \{\phi_{11} X_{1,t-1} + \phi_{12} X_{2,t-1} + a_{1,t} + \theta_{11} a_{1,t-1} + \theta_{12} a_{2,t-1}\} \qquad (6)$$
$$\{X_{2,t}\} = \{\phi_{22} X_{2,t-1} + a_{2,t} + \theta_{22} a_{2,t-1}\}. \qquad (7)$$

(Recall that $\phi_{21} = \theta_{21} = 0.0$ was specified at all design points.) In what follows, sequence (6) is referred to as the process UID1, and sequence (7) as the process UID2. The runs tests were applied to the charts made from these sequences, with the means specified as 0.0, and the values of s specified as $\sqrt{\gamma_{11}(0)}$ and $\sqrt{\gamma_{22}(0)}$ for UID1 and UID2, respectively. Note that the multiple comparisons problem was also ignored for all charts in this study, i.e. the Bonferroni Inequality was not used in developing the control limits for any of the charts.

4.2.2. Univariate time-series model. The second set of charts was constructed using the univariate time-series model (2) for each of the processes. This accounts for the autocorrelation present in each series, but not for the cross-correlation between the two series. Thus, control charts were constructed for the sequences

$$\{X_{1,t} - \phi_{11} X_{1,t-1} - \theta_{11} a_{1,t-1}\} = \{\phi_{12} X_{2,t-1} + a_{1,t} + \theta_{12} a_{2,t-1}\} \qquad (8)$$
$$\{X_{2,t} - \phi_{22} X_{2,t-1} - \theta_{22} a_{2,t-1}\} = \{a_{2,t}\}. \qquad (9)$$

In what follows, the sequence (8) is referred to as the process UTS1 and (9) as the process UTS2. In general, these processes have variances

$$E[(\phi_{12} X_{2,t-1} + a_{1,t} + \theta_{12} a_{2,t-1})^2] = \sigma_{11} + (2\phi_{12}\theta_{12} + \theta_{12}^2)\sigma_{22} + \phi_{12}^2 \gamma_{22}(0)$$
$$E[(\phi_{21} X_{1,t-1} + a_{2,t} + \theta_{21} a_{1,t-1})^2] = \sigma_{22} + (2\phi_{21}\theta_{21} + \theta_{21}^2)\sigma_{11} + \phi_{21}^2 \gamma_{11}(0).$$

Thus, $\sqrt{\sigma_{11} + (2\phi_{12}\theta_{12} + \theta_{12}^2)\sigma_{22} + \phi_{12}^2 \gamma_{22}(0)}$ and $\sqrt{\sigma_{22}}$ were specified as the values of s for UTS1 and UTS2, respectively. Note that, because $\phi_{21} = \theta_{21} = 0.0$ at all design points, UTS2 is a white-noise process.
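The variance quoted for UTS1 can be checked term by term. The lines below are one way to write out that step, assuming only model (4) with zero mean and white-noise shocks; they are offered as a sketch and are not taken from the original derivation.

$$\begin{aligned}
E[(\phi_{12} X_{2,t-1} + a_{1,t} + \theta_{12} a_{2,t-1})^2]
&= \phi_{12}^2 E[X_{2,t-1}^2] + E[a_{1,t}^2] + \theta_{12}^2 E[a_{2,t-1}^2] + 2\phi_{12}\theta_{12} E[X_{2,t-1}\, a_{2,t-1}] \\
&= \phi_{12}^2 \gamma_{22}(0) + \sigma_{11} + \theta_{12}^2 \sigma_{22} + 2\phi_{12}\theta_{12}\sigma_{22}.
\end{aligned}$$

The cross terms involving $a_{1,t}$ vanish because $a_{1,t}$ is uncorrelated with everything dated $t-1$ or earlier. The remaining cross term uses $E[X_{2,t-1}\, a_{2,t-1}] = \sigma_{22}$, which holds because $a_{2,t-1}$ enters $X_{2,t-1}$ with unit coefficient and is uncorrelated with the lagged terms in $X_{2,t-1}$.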

4.2.3. Multivariate time-series model. As a check, a third set of charts was constructed using the model (3), which accounts for both autocorrelation and cross-correlation. The charted values were then just the observations on the Gaussian multivariate white-noise process, so the tests for special causes reacted only as often as would be expected from chance, and hence the results from these tests are not tabulated in this paper.

4.3. Vector ARMA(1, 1) data

Table 2 gives the mean number of failures of each test averaged over the 100 runs. The values shown in the table are those for which the sample mean number of failures was greater than $0.005 + 3\sqrt{0.995(0.005)/100} = 0.026$, which was selected somewhat arbitrarily as a conservative upper bound on the probability of (incorrectly) getting a signal for the presence of a special cause from each test individually when a process is in a state of statistical control. See Adams et al. (1992) for a discussion of incorrectly getting a signal for a special cause.

Table 2. Mean number of failures for special-cause tests: ARMA(1,1) data
(-, low level; +, high level; *, value omitted)

          Parameters     Test
Process   Φ1  Θ   Σ      1      2      3      4      5      6      7      8
UID1      -   -   -      *      *      *      *      *      *      *      *
          -   -   +      *      *      *      *      *      *      *      *
          -   +   -      *      0.028  *      *      *      *      *      *
          -   +   +      *      0.031  *      *      *      *      *      *
          +   -   -      *      0.661  0.324  *      0.031  0.233  0.374  0.163
          +   -   +      *      0.679  0.433  *      0.043  0.250  0.385  0.183
          +   +   -      *      0.679  0.384  *      0.030  0.261  0.333  0.182
          +   +   +      *      0.689  0.416  *      0.041  0.263  0.361  0.190
UID2      -   -   -      *      *      *      *      *      *      *      *
          -   -   +      *      *      *      *      *      *      *      *
          -   +   -      *      0.030  *      *      *      *      *      *
          -   +   +      *      0.029  *      *      *      *      *      *
          +   -   -      *      0.408  0.066  *      0.030  0.162  0.152  0.064
          +   -   +      *      0.411  0.067  *      0.030  0.159  0.142  0.065
          +   +   -      *      0.453  0.119  *      0.027  0.188  0.151  0.084
          +   +   +      *      0.448  0.114  *      0.030  0.180  0.164  0.081
UTS1      -   -   -      *      *      *      *      *      *      *      *
          -   -   +      *      *      *      *      *      *      *      *
          -   +   -      *      *      *      *      *      *      *      *
          -   +   +      *      *      *      *      *      *      *      *
          +   -   -      *      0.273  *      *      *      *      0.060  0.032
          +   -   +      *      0.395  0.049  *      0.029  0.154  0.127  0.060
          +   +   -      *      0.355  0.028  *      *      0.161  0.072  0.044
          +   +   +      *      0.436  0.084  *      0.029  0.177  0.147  0.076

Upon scanning the columns, the reader will note that all values for Tests 1 and 4 are omitted, but for different reasons. The values in the Test 1 column are omitted because the true values $\sqrt{\gamma_{11}(0)}$ and $\sqrt{\gamma_{22}(0)}$ were specified to MINITAB as the standard deviations for the charts. Even though there is both auto- and cross-correlation present in UID1, UID2 and UTS1, the relationship in (5) accounts for both types of correlation, and the use of the true values for s caused Test 1 to signal the presence of a special cause only as often as expected by chance. As will be seen later, Test 4 reacts to negative autocorrelation, of which there is none in this first experiment, and thus all the values in that column are omitted. Because UTS2 is a white-noise process, all values for every test were less than or equal to 0.026, and the results for UTS2 were therefore omitted entirely from the table.

Multivariate analyses of variance (MANOVAs) of the eight-dimensional vectors of test results for each of the three processes listed in Table 2 showed that there were significant two- and three-way interactions among the factors, so that definitive statements cannot be made about the main effects individually; however, it is clear from Table 2 that Tests 2, 3, 5, 6, 7 and 8 signalled the presence of special causes most often when the factor $\Phi_1$ was at the high level. Therefore, an additional experiment was undertaken to study the effects of varying the autocorrelation coefficients only.
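The reporting cutoff of 0.026 used for Table 2 is simple arithmetic, and a short Python sketch can reproduce it. The function names are hypothetical; the formula is the nominal rate of 0.005 plus three standard errors of a proportion estimated from 100 replications, as stated in the text.

```python
import math

def reporting_threshold(p=0.005, runs=100, k=3.0):
    """p plus k standard errors of a sample proportion over `runs` replications."""
    return p + k * math.sqrt(p * (1.0 - p) / runs)

def table_entry(mean_failures, threshold):
    """Format a table cell: show the value only if it exceeds the threshold."""
    return f"{mean_failures:.3f}" if mean_failures > threshold else "*"

cutoff = reporting_threshold()
print(round(cutoff, 3))                                   # 0.026
print(table_entry(0.661, cutoff), table_entry(0.020, cutoff))  # 0.661 *
```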

4.3.1. Comparison using vector AR(1) data. In order to focus on the effects of autocorrelation present in the data, a vector AR(1) (first-order autoregressive) model was used to generate the data for an additional Monte Carlo experiment. The vector AR(1) model used here is written

$$X_t = \Phi X_{t-1} + a_t,$$

where

$$\Phi = \begin{pmatrix} \phi & \phi \\ 0.0 & \phi \end{pmatrix}$$

and $\phi$ was varied between $-0.9$ and $0.9$. The amount of cross-correlation in the data was fixed by setting the level of $\Sigma$ at

$$\Sigma = \begin{pmatrix} 1.0 & 0.5 \\ 0.5 & 1.0 \end{pmatrix}.$$

Proceeding as before, the sequences that were used to construct individuals control charts were

$$\{X_{1,t}\} = \{\phi X_{1,t-1} + \phi X_{2,t-1} + a_{1,t}\} \qquad (10)$$
$$\{X_{2,t}\} = \{\phi X_{2,t-1} + a_{2,t}\} \qquad (11)$$
$$\{X_{1,t} - \phi X_{1,t-1}\} = \{\phi X_{2,t-1} + a_{1,t}\}. \qquad (12)$$

Table 3. Mean number of failures for special-cause tests: AR(1) data
(*, value omitted)

                   Test
Process   φ        1      2      3      4      5      6      7      8
UID1'     -0.9     *      *      *      0.580  0.037  *      0.376  0.174
          -0.8     *      *      *      0.377  0.027  *      0.205  0.100
          -0.7     *      *      *      0.228  *      *      0.100  0.042
          -0.6     *      *      *      0.181  *      *      0.041  *
          -0.5     *      *      *      0.175  *      *      *      *
          -0.4     *      *      *      0.039  *      *      *      *
           0.3     *      0.037  *      *      *      0.036  *      *
           0.4     *      0.066  *      *      *      0.057  *      *
           0.5     *      0.123  0.029  *      *      0.082  *      *
           0.6     *      0.222  0.055  *      0.028  0.121  0.046  *
           0.7     *      0.336  0.104  *      0.033  0.162  0.104  0.048
           0.8     *      0.505  0.189  *      0.037  0.214  0.191  0.104
           0.9     *      0.674  0.365  *      0.047  0.260  0.350  0.182
UID2'     -0.9     *      *      *      0.313  *      *      0.107  0.062
          -0.8     *      *      *      0.176  *      *      0.051  *
          -0.7     *      *      *      0.092  *      *      *      *
          -0.6     *      *      *      0.049  *      *      *      *
          -0.5     *      *      *      0.033  *      *      *      *
          -0.4     *      *      *      *      *      *      *      *
           0.3     *      *      *      *      *      *      *      *
           0.4     *      0.028  *      *      *      0.028  *      *
           0.5     *      0.047  *      *      *      0.040  *      *
           0.6     *      0.083  *      *      *      0.056  *      *
           0.7     *      0.135  *      *      *      0.079  *      *
           0.8     *      0.232  0.030  *      *      0.110  0.037  *
           0.9     *      0.393  0.039  *      0.032  0.171  0.110  0.062
UTS1'     -0.9     *      *      *      0.272  *      *      0.068  0.039
          -0.8     *      *      *      0.131  *      *      *      *
          -0.7     *      *      *      0.066  *      *      *      *
          -0.6     *      *      *      0.037  *      *      *      *
          -0.5     *      *      *      *      *      *      *      *
          -0.4     *      *      *      *      *      *      *      *
           0.3     *      *      *      *      *      *      *      *
           0.4     *      *      *      *      *      *      *      *
           0.5     *      *      *      *      *      *      *      *
           0.6     *      0.040  *      *      *      0.038  *      *
           0.7     *      0.082  *      *      *      0.056  *      *
           0.8     *      0.152  *      *      *      0.088  *      *
           0.9     *      0.318  *      *      0.028  0.147  0.070  0.044

In what follows, the sequences (10), (11) and (12) are referred to as processes UID1', UID2' and UTS1', with values of s specified as $\sqrt{\gamma_{11}(0)}$, $\sqrt{\gamma_{22}(0)}$ and $\sqrt{1 + 2\phi + \phi^2 \gamma_{22}(0)}$, respectively. The results of the second experiment are shown in Table 3. The results for -0.3 ≤ ...

Test 4. Fourteen points in a row alternating up or down. This test shows up clearly as a test for negative autocorrelation, which tends to cause the alternating pattern of observations that this test signals. Note that this pattern occurs even in process UTS1', in which the residual is a function of the lag-1 value of the process $\{X_t\}$, which shows that cross-correlation at different lags can cause the tests to react. Thus, this test might be failed even by univariate time-series procedures, which account for autocorrelation within an individual series but do not account for cross-correlation between series.

Test 5. Two out of three points in a row in Zone A or beyond. This test signals the presence of positive autocorrelation like Test 2, but the frequencies are lower because Test 5 requires observations to be farther from the centerline than Test 2.

Test 6. Four out of five points in a row in Zone B or beyond. This test reacted a significant number of times to the same levels of positive autocorrelation as Test 2, but somewhat less frequently.

Test 7. Fifteen points in a row in Zone C (above and below centerline). This test was effective at detecting relatively high absolute values of both negative and positive autocorrelation.

Test 8. Eight points in a row on both sides of centerline with none in Zone C. This test also reacts to relatively high absolute values of both positive and negative autocorrelation, but not as often as Test 7.

Note that points signalling the presence of a special cause can be counted more than once. For example, if it happened that nine points in a row fell in Zone B or beyond then Test 6 would react 6 times and the ninth point would be counted as causing both Tests 2 and 6 to react.
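The counting convention in the example above can be made concrete with a small Python sketch of Tests 2 and 6. The function names, the handling of the first few points of a series, and the choice to let every qualifying point signal are assumptions for illustration; they are not a transcription of MINITAB's implementation. The series places five points just below the centerline and then nine points in Zone B above it.

```python
def rule2_signals(xs, center=0.0, run=9):
    """Indices where `run` consecutive points lie on one side of the centerline."""
    signals, length, side = [], 0, 0
    for i, x in enumerate(xs):
        s = (x > center) - (x < center)  # +1 above, -1 below, 0 on the line
        if s != 0 and s == side:
            length += 1
        else:
            side, length = s, (1 if s != 0 else 0)
        if length >= run:
            signals.append(i)
    return signals

def rule6_signals(xs, center=0.0, sigma=1.0, need=4, window=5):
    """Indices where at least `need` of the last `window` points lie more than
    one sigma from the centerline on the same side (Zone B or beyond)."""
    signals = []
    for i in range(len(xs)):
        recent = xs[max(0, i - window + 1): i + 1]
        above = sum(1 for x in recent if x > center + sigma)
        below = sum(1 for x in recent if x < center - sigma)
        if above >= need or below >= need:
            signals.append(i)
    return signals

# Five points in Zone C below the centerline, then nine points in Zone B above it.
xs = [-0.5] * 5 + [1.5] * 9
print(len(rule6_signals(xs)), len(rule2_signals(xs)))  # 6 1
print(rule6_signals(xs)[-1] == rule2_signals(xs)[-1])  # True: the ninth point trips both
```

Under these conventions Test 6 first reacts at the fourth point of the run and then at every later point, giving six signals, while Test 2 reacts once at the ninth point, in line with the counts described in the text.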

5. CONCLUSION

With the growing use of control charts for monitoring processes, the tests for special causes will be applied more frequently. This paper shows that accounting for auto- and cross-correlation is very important in setting up control charts, because such correlation can cause each of the tests to signal the presence of a special cause when in fact the process is in statistical control as defined in Shewhart (1939). Shewhart (1931) gives the model (1) as being useful for describing many industrial processes, but this does not limit the definition of a process that is in statistical control to one that produces independent, identically distributed observations.

All of the recently developed extensions of model (1) are in the spirit of Shewhart (1939). His fundamental notion was to monitor variation in a manufacturing process by fitting a statistical model to the process in order to get an "explainable" part and an "unexplainable" part, $\mu_t$ and $a_t$, respectively, in (1). What the extensions add to his model is simply the degree of sophistication of the statistical model that is used to obtain the explainable part of the process. A quality-control technician wishing to implement the extensions to (1) has the additional burden of having to learn something about time-series modeling, but in an increasingly competitive global marketplace, this may well become necessary. Furthermore, with the availability of relatively sophisticated software, it is not as difficult as it once was to use time-series models, although, as with any technique, the user should have a firm grasp of the underlying concepts.

The experimental results presented here show that if auto- and cross-correlation are not taken into consideration, the runs tests can indicate that a stationary multivariate autocorrelated process is out of statistical control. Thus testing for auto- and cross-correlation is most important in setting up control charts and should be done before the runs tests are applied. This is done easily with many statistical software packages. If the process is white noise, the estimates of the elements of the autocorrelation matrices can be used to test for auto- and cross-correlation in the data before calculating control limits for the chart. If the autocorrelation matrices are not significantly different from 0, then Shewhart (or Hotelling) control charts could be used. If the matrices are significantly different from 0, then charts based on one of the alternative time-series models might be applicable.
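The screening step recommended here can be sketched as follows. This is a minimal illustration, not the procedure of any particular package. It assumes the common rule of thumb that, for white noise, each element of a sample cross-correlation matrix has an approximate standard error of 1/sqrt(n), so values outside +/- 2/sqrt(n) are flagged. The function names, the VAR(1) coefficient matrix, the seed, and the sample size are all hypothetical choices made for the example.

```python
import math
import random


def cross_correlation_matrix(x, lag):
    """Sample cross-correlation matrix at the given lag.

    `x` is a list of n observation vectors, each of length p.
    Element (i, j) correlates component i at time t with component j at t - lag.
    """
    n, p = len(x), len(x[0])
    means = [sum(row[i] for row in x) / n for i in range(p)]
    d = [[row[i] - means[i] for i in range(p)] for row in x]
    var = [sum(r[i] ** 2 for r in d) / n for i in range(p)]
    rho = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            c = sum(d[t][i] * d[t - lag][j] for t in range(lag, n)) / n
            rho[i][j] = c / math.sqrt(var[i] * var[j])
    return rho


def significant_lags(x, max_lag=5):
    """Lags at which any element exceeds the approximate 2/sqrt(n) bound."""
    bound = 2.0 / math.sqrt(len(x))
    flagged = []
    for lag in range(1, max_lag + 1):
        rho = cross_correlation_matrix(x, lag)
        if any(abs(v) > bound for row in rho for v in row):
            flagged.append(lag)
    return flagged


def simulate_var1(phi, n, rng, burn_in=200):
    """Simulate x_t = phi x_{t-1} + a_t with independent standard normal a_t."""
    p = len(phi)
    x = [0.0] * p
    out = []
    for t in range(n + burn_in):
        a = [rng.gauss(0.0, 1.0) for _ in range(p)]
        x = [sum(phi[i][j] * x[j] for j in range(p)) + a[i] for i in range(p)]
        if t >= burn_in:
            out.append(x)
    return out


rng = random.Random(1995)
white = simulate_var1([[0.0, 0.0], [0.0, 0.0]], 500, rng)  # white noise
var1 = simulate_var1([[0.7, 0.2], [0.0, 0.5]], 500, rng)   # stationary VAR(1)

print(significant_lags(white))  # lags flagged for the white-noise series
print(significant_lags(var1))   # lags flagged for the autocorrelated series
```

Because many elements are compared with the bound at once, an occasional flag for a white-noise series is to be expected; a consistent pattern of large values at low lags, as in the VAR(1) series, is what suggests that a time-series-based chart may be needed.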


This paper deals with a type of quality-control chart (individuals variables) that is likely to become the most widely used as on-line data collection becomes more widespread. For other types of variables charts, such as those in which means of subgroups of randomly chosen points from autocorrelated processes are plotted, the averaging may help to ameliorate some of the effects of the auto- and cross-correlation, but depending on how the process is modeled, the autocorrelation could still be significant. This is an area for further investigation.

REFERENCES

Adams B. M., Lowry C. A. and Woodall W. H. (1992) The use (and misuse) of false alarm probabilities in control chart design. In Frontiers in Statistical Quality Control 4 (Edited by J. Antoch), pp. 155-168. Physica-Verlag, Heidelberg.
Alt F. B. and Smith N. D. (1988) Multivariate process control. In Handbook of Statistics (Edited by P. R. Krishnaiah and C. R. Rao), Vol. 7. Elsevier Science, Amsterdam.
Alt F. B. (1984) Multivariate quality control. In The Encyclopedia of Statistical Sciences (Edited by S. Kotz, N. L. Johnson and C. R. Read). John Wiley, New York.
Alwan L. C. (1992) Effects of autocorrelation on control chart performance. Commun. Stat. Theory Meth. 21, 1025-1049.
Alwan L. C. and Bissell M. G. (1988) Time-series modeling for quality control in clinical chemistry. Clin. Chem. 34, 1396-1406.
Alwan L. C. and Roberts H. V. (1988) Time-series modeling for statistical process control. J. Business Econ. Stat. 6, 87-95.
Anderson T. W. (1984) An Introduction to Multivariate Statistical Analysis, Second edition. John Wiley, New York.
Box G. E. P. and Tiao G. C. (1981) Modelling multiple time series with applications. J. Am. Statist. Assoc. 76, 802-816.
Brockwell P. J. and Davis R. A. (1987) Time Series: Theory and Methods. Springer-Verlag, New York.
Champ C. W. and Woodall W. H. (1987) Exact results for Shewhart control charts with supplementary runs rules. Technometrics 29, 393-399.
Cryer J. D. and Ryan T. P. (1990) The estimation of sigma for an X chart: MR/d2 or S/c4? J. Quality Technol. 22, 187-192.
Davis R. B. and Woodall W. H. (1988) Performance of the control chart trend rule under linear shift. J. Quality Technol. 20, 260-262.
Deming W. E. (1982) Quality, Productivity and Competitive Position. Center for Advanced Engineering Study, Massachusetts Institute of Technology, Cambridge, MA.
Hotelling H. (1947) Multivariate quality control--illustrated by the air testing of sample bombsights. In Techniques of Statistical Analysis (Edited by C. Eisenhart, M. W. Hastay and W. A. Wallis). McGraw-Hill, New York.
Jackson J. E. (1985) Multivariate quality control. Commun. Stat. Theory Meth. 14, 2657-2688.
Lowry C. A., Woodall W. H., Champ C. W. and Rigdon S. E. (1992) A multivariate exponentially weighted moving average control chart. Technometrics 34, 46-53.
Marse K. and Roberts S. D. (1983) Implementing a portable FORTRAN uniform(0, 1) generator. Simulation 41, 135-139.
Minitab, Inc. (1992) Minitab Reference Manual (Release 9), State College, PA.
Montgomery D. C. and Friedman D. J. (1989) Statistical process control in a computer-integrated manufacturing environment. In Statistical Process Control in Automated Manufacturing (Edited by J. B. Keats and N. F. Hubble), pp. 67-87. M. Dekker, New York.
Montgomery D. C. and Mastrangelo C. M. (1991) Some statistical process control methods for autocorrelated data. J. Quality Technol. 23, 179-193.
Pignatiello J. J. Jr and Runger G. C. (1990) Comparisons of multivariate CUSUM charts. J. Quality Technol. 22, 173-186.
Scientific Computing Associates, SCA System. Statistical Software.
Shewhart W. A. (1931) Economic Control of Quality of Manufactured Product. Van Nostrand, New York.
Shewhart W. A. (1939) Statistical Method From The Viewpoint of Quality Control. The Department of Agriculture, Washington.
Tiao G. C. and Tsay R. S. (1989) Model specification in multivariate time series. J. R. Statist. Soc. B 51, 157-213.
Walker E., Philpot J. W. and Clement J. (1991) False signal rates for the Shewhart control chart with supplementary runs tests. J. Quality Technol. 23, 247-252.
Wardell D. G., Moskowitz H. and Plante R. D. (1992) Control charts in the presence of data correlation. Mgmt Sci. 38, 1084-1105.
Wheeler D. J. (1983) Detecting a shift in process average: tables of the power function for $\bar{X}$ charts. J. Quality Technol. 15, 155-170.
