A computer-automated statistical process control method with timely response


Engineering Costs and Production Economics, 18 (1990) 301-310

Elsevier Science Publishers B.V., Amsterdam - Printed in The Netherlands

Emmanuel P. Papadakis

Center for Nondestructive Evaluation, Iowa State University, 329 Wilhelm Hall, Ames, IA 50011 (U.S.A.)

ABSTRACT

A wide variety of metrology instruments can now interface with computers to acquire and process data on parts as they are produced. In addition to accept/reject actions, the instrument can provide information for statistical analysis. For Statistical Process Control (SPC), a variable in every part can be measured at production line speeds and the data can be used to generate points for control charts. The process control computer can then analyze the control chart (held in its memory and updated in real time) to identify out-of-control conditions.

1. INTRODUCTION

In the Factory of the Future (FOF), production will be unified under a system of Computer Integrated Manufacturing (CIM). Automatic, computer-integrated control of processes, detection of errors, and determination of corrective action will be necessary because of the proposed level of manpower in the FOF [1,2]. Under these circumstances, a process which might go out of control and remain that way would be highly detrimental to productivity because of the waste involved, and because of the time spent producing and correcting nonconforming items. Very rapid statistical process control (SPC) will provide definitive warnings of out-of-control conditions.

SPC involves taking measurements periodically on relatively small sets of specimens, calculating control chart points, and analyzing the control charts by run rules [3] in terms of exceedance of control limits. With automatic electronic measurement systems, and with computers to calculate control chart points and analyze them by Run Rules in the computer memory, the “period” between control chart points can be shrunk to any desired degree consistent with the acquisition of data. While the data for real-time SPC are being taken and processed, the data stream on the entirety of production is simultaneously being processed to sort out the nonconforming (out-of-specification) parts. This done, there is no need to “quarantine” (stop and hold for later inspection) batches of parts made while the process was out of control but not yet detected as such. The nonconforming parts will already be in a separate bucket. Automated nondestructive evaluation (NDE) methodologies and other types of metrology will improve quality while permitting automated SPC to enhance productivity.

The present work treats control charts for the mean of variable data, explores the implications of one model for computerized SPC, and examines the possibilities for improved timeliness of spotting out-of-control conditions. The influence of Type I errors is also calculated. Their frequency is studied by a Monte Carlo computer simulation of SPC run rules during under-control operation. A strategy (algorithm) based on reliability theory applied to the output of the simulation is developed to limit the occurrence of Type I errors in practice. The probability of continued occurrence of Type I errors under the regime of the new algorithm is assessed.

0167-188X/90/$03.50 © 1990 Elsevier Science Publishers B.V.

2. THEORY

2.1 Background and rules

Shewhart control charts are the primary method used to measure the probability that a process is out of control. All charts are characterized by an average value and control limits. In using control charts, there are two steps: (1) data gathering on a periodic basis to add one point per period to the chart, and (2) interpretation of the positions of the recent points relative to the control limits and the mean. The control limits must be known from previous capability studies. The use of several recent points can be summarized by run rules which the operator must observe on the control chart to interpret an out-of-control condition. In this work, charts for the mean value of a variable are considered. The Western Electric Co. [3] has listed four run rules, given below after the definitions of terms.

Definition of terms

$X$: the variable being measured.
$\bar{X}$: the mean value of $X$ for the most recent group of n specimens measured.
$\bar{\bar{X}}$: the grand mean of the process while it is operating under control.
$\hat{\sigma}_{\bar{X}}$: the estimate of the standard deviation of the mean of the process while it is operating under control.

Western Electric run rules

1. $\bar{X}$ exceeds $\bar{\bar{X}} \pm 3\hat{\sigma}_{\bar{X}}$ for one point. (Either positively or negatively.)
2. $\bar{X}$ exceeds $\bar{\bar{X}} \pm 2\hat{\sigma}_{\bar{X}}$ for two out of three successive points. (Both positive or both negative with respect to $\bar{\bar{X}}$.)
3. $\bar{X}$ exceeds $\bar{\bar{X}} \pm 1\hat{\sigma}_{\bar{X}}$ for four out of five successive points. (All four either positive or negative with respect to $\bar{\bar{X}}$.)
4. $\bar{X}$ lies on the same side of $\bar{\bar{X}}$ (either higher or lower) for eight consecutive points.

When all four of these run rules are used together, the probability of making a Type I error (saying that good production is out-of-control) is estimated [4] to be slightly lower than 1% every time a new data point is added to the X-bar chart and the run rules are applied to the most recent eight points. A more recent, exact calculation [5] puts the probability at 1.09%.

2.2 Operation

Consider that a batch of N parts is produced but not measured, and that the next n parts are measured as a sample for a control chart point. During production at the end of one period, the operator measures parts N+1, N+2, and so on through N+n; then he calculates $\bar{X}$ and plots the value on the means control chart. With the chart before him, he notes, using the run rules, whether there is an indication of an out-of-control condition. “Out-of-control” implies some sort of process failure which makes the measured values of the output, $X$, deviate from the intended value $\bar{\bar{X}}$ further than statistical fluctuation would indicate. Thus, the statistical run rules indicate the process failure as the averages, $\bar{X}$, deviate also. If there is an indication of out-of-control, he takes corrective action. If not, then he runs production for another time period and then measures parts 2N+1, 2N+2, and so on through 2N+n, and proceeds as before. He continues through the points for 3N, 4N, and so on, as time and production continue. When he finds an out-of-control condition, he takes corrective action including repairing the process and sorting out nonconforming product. With the Western Electric Co. run rules, the amount of production to be sorted could vary from N parts to 8N parts or more.

2.3 Reduction of N

It will be necessary in the Factory of the Future, and would be advantageous now, to reduce the number of parts, N, produced between samples so that:

1. Out-of-control conditions could be caught promptly, and
2. Quarantining-and-sorting would involve a small number of parts.

One method of reducing N is considered here. Throughout, it is understood that the aim is to make statistical process control as efficient as possible. Other benefits will appear. The proposed method is to utilize automated control charting with computers in order to match the rate of the automated production already in place. A hypothetical system embedded in a Local Area Network in a Factory of the Future is shown in Fig. 1 and described below.

Fig. 1. An automated computer-controlled inspection system for SPC embedded in a Local Area Network with a governing CIM computer.

Consider an electronic measuring system which could make the same type of measurement currently made for X-bar control charting of some process. It could be, for instance, a laser gage for dimensions or an eddy current instrument to correlate with material hardness. This electronic measuring system under computer control and mounted on-line could measure every part passing through it. By a counting procedure the computer could select the data of parts N+1, N+2, and so on through N+n as a set for a control chart data point; compute $\bar{X}$; and analyze the point along with the preceding 7 points, as specified by the run rules. Similarly, by counting further, the computer algorithm could segregate the data on parts 2N+1, 2N+2, ..., 2N+n; repeat the computation and update the set of means by discarding the oldest one; and analyze the new set on the basis of the run rules. Then, the computer could proceed to the production sets corresponding to 3N, 4N, and so on, as these parts are produced.

With the speed of modern electronics, the calculation of the mean of the set indexed by N+1 through N+n could be performed in the time while the measuring device was waiting for the production line, at speeds up to 1000 parts per hour, for instance, to present it with part number N+n+1 to measure. (Such speeds are typical of many machining operations, as an example.) Thus, the batches could be shrunk in size so that the number N goes to zero (N=0); every group of n parts would be tested as an SPC sample with no extra production intervening. This concept of time compression is shown in Fig. 2.

Fig. 2. Time compression of SPC permitted by computer automation. (a) Present manual inspection, computation, and decision. Too slow for FOF. (b) Manual system speeded up to point where it is impractical/uneconomical. (c) Computer automated SPC (CASPC). Decisions made in less than 100 microseconds after the production of each group of n parts. Group size n=5 here.

Since the longest run rule under consideration uses eight (8) consecutive values of $\bar{X}$, it would be necessary for the computer memory to hold the most recent eight values of $\bar{X}$ and to update them by eliminating the oldest $\bar{X}$ every time a new $\bar{X}$ was calculated from the next n parts. After each new $\bar{X}$ was generated, the computer would check all the run rules in order to locate an out-of-control condition. The computer sequence is shown in the flow chart in Fig. 3.

If rules nos. 1, 2 and 3 were not activated but rule no. 4 showed out-of-control, then eight (8) batches of n parts would be involved in the nonconforming behavior. This adds up to 8n parts at most, produced over a period of time 8nr where r is the time to produce one part. This is the longest time before detection. As an example, for 600 parts per hour, r = 6 s. With n = 5, the time to detection is 8nr = 240 s, i.e., 4 min, whereas at one manual point per hour the delay would be one shift. (Of course, during this time, quality might deteriorate more and consequently be caught sooner by a shorter Run Rule.) This short calculation demonstrates the potential timeliness of computerized electronic SPC. As a result of the timeliness, the number of parts to be quarantined could be 40 instead of 4800 in the above manual example of eight points, each obtained an hour apart.

2.4 Repetitive run rule decisions to mitigate the Type I error

As mentioned [5], use of the Western Electric Co. run rules results in a probability of 1.09% of generating a Type I error at every control chart point. This means that an in-control process would be judged to be out-of-control with 1.09% probability every time a new point was calculated for a means control chart and the run rules were subsequently applied. In the example given in Section 2.3, using r = 6 s and n = 5, the Mean Time Between Type I Errors (MTBTIE), also called Average Run Length (ARL), would be approximately 3000 s or 0.83 h. The calculation is (6 s/part) × (5 parts/point) × (100 points/failure) = 3000 s/failure. This short time would be unreasonable and unacceptable, and would lead to downtime expended for false reasons. Consequently, production would be slowed down unnecessarily for improper quarantining and for engineering work to track down nonexistent but statistically indicated “special causes” of fluctuations.

The strategy proposed here to reduce Type I errors is to utilize a reliability approach by repeating the out-of-control determination with the same Run Rules on further specimens. Reliability theory would indicate that, for statistical processes, Type I errors would not frequently repeat at short intervals. It is recognized that other well-known statistical techniques such as moving averages, geometric moving averages, CUSUMs, or simply $\bar{X}$ charts with wider limits and larger values of n are usable in the case presented. A complete treatment of all methods is beyond the scope of this paper.

An a priori calculation of the probabilities of consecutive or closely spaced repetitive Type I errors is tedious and unwieldy. A much more fruitful approach is a computer simulation of the operation of the control charts as they monitor a process which is under control and which shows expected statistical fluctuations. A Monte Carlo run through a multiplicity of possible fluctuations will define the frequency of occurrence of short run lengths between Type I errors. Analysis of the Monte Carlo output should lead to an algorithm for reducing the Type I error accumulation to a manageable frequency.
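As a quick check of the timing figures quoted in Sections 2.3 and 2.4, the arithmetic can be written out as a short Python sketch. The variable and function names are illustrative assumptions and do not come from the paper. The value of 100 points per false alarm is the rounded figure used in the text; the last quantity shows what the exact 1.09% probability from [5] would give under the same assumptions.

```python
# Illustrative arithmetic only; all names here are assumptions, not the paper's.
PARTS_PER_HOUR = 600           # production rate used in the text's example
N_PER_POINT = 5                # n, parts measured per control chart point
LONGEST_RULE_POINTS = 8        # rule 4 looks at eight consecutive points
POINTS_PER_FALSE_ALARM = 100   # rounded from a Type I rate of about 1% per point
EXACT_TYPE_I_RATE = 0.0109     # exact per-point probability quoted from [5]

r = 3600 / PARTS_PER_HOUR      # seconds needed to produce one part


def elapsed_seconds(points, n=N_PER_POINT, seconds_per_part=r):
    """Production time covered by a given number of control chart points."""
    return points * n * seconds_per_part


worst_case_detection_s = elapsed_seconds(LONGEST_RULE_POINTS)    # 8 * n * r
mtbtie_s = elapsed_seconds(POINTS_PER_FALSE_ALARM)               # rounded MTBTIE
mtbtie_exact_s = elapsed_seconds(1 / EXACT_TYPE_I_RATE)          # using 1.09%

print(f"time to produce one part: {r:.0f} s")
print(f"longest time to detection: {worst_case_detection_s:.0f} s")
print(f"MTBTIE (rounded rate): {mtbtie_s:.0f} s = {mtbtie_s / 3600:.2f} h")
print(f"MTBTIE (1.09% rate): {mtbtie_exact_s:.0f} s")
```

Either way the false-alarm interval comes out well under one hour at this line speed, which is the motivation for the confirmation strategy discussed in this section.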

2.5 Monte Carlo study of Type I error repetition

A Monte Carlo computer simulation was performed to test the feasibility of using the method suggested in Section 2.4 to mitigate the effects of Type I errors. A computer program was written in FORTRAN 77 to analyze control charts according to the Western Electric Co. run rules. The flow chart is essentially the same as Fig. 3 except that the normal random number generator (GGNML subroutine [6] in the IMSL scientific subroutine library) is substituted for the initial measuring of 5 parts and the computing of $\bar{X}$ from them. $\bar{X}$ is chosen directly by the normal random number generator, and represents draws from a distribution expressed analytically by a Gaussian bell curve.
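A minimal Python sketch of this kind of simulation is given below. It is not the author's FORTRAN 77 program: the function names, the use of Python's standard `random` module in place of GGNML, the work in standardized units (grand mean 0, standard deviation of the mean 1), and the choice to clear the eight-point memory after every call are all assumptions made for illustration.

```python
import random
from collections import deque


def weco_rule(window):
    """Return the number (1-4) of the first Western Electric rule triggered by
    the standardized means in `window` (oldest first), or 0 if none is."""
    z = list(window)
    if abs(z[-1]) > 3:                                   # rule 1: one point beyond 3 sigma
        return 1
    for sign in (1, -1):                                 # high side, then low side
        if sum(sign * v > 2 for v in z[-3:]) >= 2:       # rule 2: 2 of 3 beyond 2 sigma
            return 2
    for sign in (1, -1):
        if sum(sign * v > 1 for v in z[-5:]) >= 4:       # rule 3: 4 of 5 beyond 1 sigma
            return 3
    for sign in (1, -1):
        if len(z) >= 8 and all(sign * v > 0 for v in z[-8:]):   # rule 4: 8 on one side
            return 4
    return 0


def simulate(n_points, seed=0):
    """Feed in-control Gaussian means to the run rules; every call is a Type I error.
    Returns a list of (chart point index, rule number) pairs."""
    rng = random.Random(seed)
    window = deque(maxlen=8)          # memory of the most recent eight means
    calls = []
    for t in range(1, n_points + 1):
        window.append(rng.gauss(0.0, 1.0))
        rule = weco_rule(window)
        if rule:
            calls.append((t, rule))
            window.clear()            # assumption: the chart restarts after each call
    return calls


calls = simulate(99_999, seed=1)
times = [t for t, _ in calls]
intervals = [b - a for a, b in zip(times, times[1:])]    # chart points between calls
print(f"Type I calls: {len(calls)} ({len(calls) / 99_999:.2%} of points)")
print(f"intervals of 10 points or fewer: {sum(i <= 10 for i in intervals)}")
```

Under these assumptions the fraction of points that raise a call should land near the roughly 1% figure discussed in Section 2.4, and the list `intervals` is the raw material for a histogram of times between calls.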

A consideration of the range R is omitted in this analysis. One enters the flow chart in Fig. 3 below the box “Compute $\bar{X}$ and R”. This program was run with normal random number input simulating a process under control; i.e., the random number mean was set equal to the nominal mean of the process and the random number standard deviation was set equal to the standard deviation of the mean of the process. Whenever the program called an out-of-control condition, the occurrence indicated a Type I error. For 99,999 control chart points (determined by the Fortran I5 integer limitation), there were 1,104 indications of out-of-control for a frequency of about 1.10%. This figure verifies the probability prediction in [5]. The computer output yielded also the time between Type I errors in terms of the times between control chart points. A frequency histogram versus the time intervals from one indication to the next indication is given in Fig. 4. Details concerning the frequency distribution in the first ten intervals are given in Fig. 5 by another histogram. It can be seen from Fig. 5 that few second indications happen very soon after a first indication.

Fig. 3. Flow chart of a computer program to do computer-automated statistical process control (CASPC). Specialised to subsets of five measurements per control chart point and to analysis by Western Electric Run Rules.

Fig. 4. Frequency histogram of the times of occurrences of Type I errors. Output of a Monte Carlo computer simulation of the operation of the Western Electric Company Run Rules for $\bar{X}$ control charts. (Axes: time in chart points versus occurrences.)

Fig. 5. Detail of Fig. 4 for the first ten time intervals. Each time interval is one control chart point.

TABLE 1

Time intervals for triple Type I calls (99,999 control chart points; run rule in parentheses)

To first call    Between first call and second call    Between second call and third call
Over 10          5 (2)                                 3 (2)
Over 10          g (4)                                 g (4)
Over 10          g (1)                                 g (1)

The simulation output was analyzed further in terms of the frequency of three “calls” closely following one another. The data are shown in Table 1. Only three (3) occurrences were noted of three calls with 10 or fewer time intervals between calls 1 and 2 and between calls 2 and 3. If the fourth run rule is neglected, the number is reduced to two (2) occurrences. Thus, the requirement of finding three closely-following out-of-control indications (spaced apart by up to 10 time intervals) improves the reliability to 3 parts in 100,000 or better. The MTBTIE (or ARL) becomes 280-420 h (35 to 52 shifts), quite acceptable in view of the timeliness of the indication which would be 10 min at most (6 s/part × 5 parts/point × 20 points max ÷ 60 s/min). Here, the example of a production rate of 600 parts/h is used. The output made during the out-of-control condition would be at most 140 parts. This figure is smaller than the production between two manual control chart points by a factor of 4, at the same line speed and at one manual control chart point per hour. Thus, the probability of a Type II error is reduced by a factor from 2 to 4 while the probability of a Type I error is lowered drastically.

These simulation results strongly suggested that three successive out-of-control indications with 10 or fewer time intervals between the first and second and between the second and third should be used as the algorithm to mitigate the effects of Type I errors in computer-automated statistical process control. See the flow chart in Fig. 6 for an embodiment of the algorithm.

Fig. 6. Flow chart for computer algorithm to test for three out-of-control calls separated by 10 or fewer X-bar points from 1 to 2 and from 2 to 3.

2.6 Byproduct benefits

2.6.1 Just-in-Time inventory

Rapid automatic SPC would signal out-of-control conditions to the control office before excessive faulty parts had been produced and had undergone further wasted processing. This timeliness would improve the capability of the factory to ship for Just-in-Time (JIT) inventory control.

2.6.2 Statistical data

For analysis of the actual functioning of a process, the automatic SPC equipment would furnish ongoing streams of data upon which to perform time series and other types of analysis.

2.6.3 Potential for sorting in situ

With reliable data on every part, the automated SPC methodology would provide the means to sort production into categories during both under-control and out-of-control sequences of operation. With this capability, there would be no need for quarantining with its concomitant costs.

3. POTENTIAL FOR IMPLEMENTATION

3.1 General

Actual human measurements are being supplanted by electronic measurements at a rapid rate. These include relationships between material properties and such properties as ultrasonic velocity and eddy current response, which are nondestructive [7-11], as well as measurements made by various methods of gaging. All the electronic measurement instruments have the potential for being linked with computers for control and data acquisition; many of the measurement systems are currently configured that way. With this as background, one can see that the decision to go to automatic statistical process control in the present-day factory, with the production batch shortened to the point of being equal to the SPC sample (n produced and tested; N=0 produced between samples), is only a matter of an economic calculation. One should compare the cost of automatic equipment with the cost of manual SPC at an acceptable level. In the Factory of the Future, the automation of the SPC function will be necessary.

3.2 Computer software

The author has written a subroutine in Fortran 77 for the analysis of control chart points according to the Western Electric Run Rules. This subroutine is intended to analyze data acquired automatically by electronic instruments which monitor the output of production processes. The output of the control chart analysis subroutine is a flag indicating out-of-control and a pointer showing the number of the Run Rule which detected the condition. Thus, this control chart analysis subroutine tells the operator when to stop production and how far back in time to quarantine the product, assuming the Run Rules are considered adequate. This subroutine with appropriate modifications could be interfaced with metrology instruments to provide a demonstration of the theory reported in this paper.

Programs with similar capabilities are available from at least two vendors of software [12,13]. These must be interfaced with metrology systems on a custom basis. The software reported in [13] yields a complete Shewhart Control Chart output on a color monitor with color-coded indicators. A typical output sequence is shown in Fig. 7 in a black-and-white half-tone representation. One is presented graphically with indications of out-of-control conditions, and sentences on the screen give the reason for the decision, such as the run rule which detected the condition.

Fig. 7. Monitor display of control chart computed and analyzed automatically by the software of [13] as it acquires data from automated metrology equipment (interfaced as needed). Figure supplied by BBN Software Products; used by permission.

One vendor [14] of machine vision systems builds multi-sensor units which measure automotive body dimensions on production lines. SPC information is measured and reported for every panel, and out-of-control conditions are flagged. One NDE and metrology systems vendor [15] has a gaging system with a computerized statistical package which plots control charts on a video monitor for visual decisions concerning points exceeding $\bar{\bar{X}} \pm 3\hat{\sigma}_{\bar{X}}$. This system is shown in Fig. 8. While its software does not compute the entire flow chart in Fig. 3, its program could be expanded to do so. At that point, this dedicated equipment would account for an entire set of Run Rules, not just the three sigma limits, and would be performing automated SPC. Then, this metrology system would represent an embodiment of the instrumentation suggested in this paper.

Fig. 8. Dedicated metrology equipment of [15] including software and monitor display for simplified control charts ($\pm 3\sigma$ only). This system does automatic multi-channel SPC on parts. Figure supplied by K.J. Law Engineers; used by permission.

4. CONCLUSIONS

Points for control charts can be calculated and the charts can be analyzed for out-of-control conditions as fast as production occurs, provided that the measurements are made by electronic instruments interfaced with computers. It is shown in this paper that an out-of-control condition in production could be detected by computer-automated SPC after the production of only forty (40) parts, using the longest Run Rule in the literature with five (5) parts per data point. Certain statistical techniques can be used to check on the validity of out-of-control indications which might be Type I errors, and to reduce the probability of taking unneeded corrective action in the presence of Type I errors. The analysis in this paper suggests that the use of three (3) successive out-of-control indications with the first and second separated by 10 or fewer $\bar{X}$ points and with the second and third separated similarly would be adequate to discriminate against Type I errors. With the suggested algorithm, the probability would be 3 per 100,000 chart points. The mean time between Type I errors would be of the order of 50 shifts, i.e., several weeks. Only 140 parts would be produced within the algorithm time. An added benefit would be the automatic sorting of faulty parts during the period between the inception of the out-of-control condition and its detection by the Run Rules in the SPC computer software.
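The three-call confirmation rule summarized above can be sketched in a few lines of Python. This is an illustrative reading of the rule, not a transcription of the flow chart in Fig. 6: the function name, the `max_gap` parameter, and the decision to restart counting after a confirmed alarm are assumptions.

```python
def confirmed_alarm_times(call_times, max_gap=10):
    """Given the chart-point indices at which the run rules raised a call
    (in increasing order), return the indices at which a third call completes
    a chain of three calls whose consecutive gaps are each <= max_gap points."""
    confirmed = []
    chain = 0            # length of the current chain of closely spaced calls
    previous = None      # chart-point index of the previous call in the chain
    for t in call_times:
        if previous is not None and t - previous <= max_gap:
            chain += 1   # close enough to the previous call: extend the chain
        else:
            chain = 1    # too far apart (or first call): start a new chain
        previous = t
        if chain == 3:
            confirmed.append(t)
            chain = 0    # assumption: counting restarts after a confirmed alarm
            previous = None
    return confirmed


# Isolated calls are ignored; only three closely spaced calls are confirmed.
print(confirmed_alarm_times([5, 12, 40]))
print(confirmed_alarm_times([100, 105, 108, 300]))
```

In a production setting the same logic would run incrementally, one call at a time, rather than over a stored list; the list form is used here only to keep the sketch short.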

ACKNOWLEDGMENTS

The author is indebted to R. Hough, D. Hobson, H. Arnold, and W.W. Scherkenbach for critical comments and enlightened discussions. R. Hough and W.W. Scherkenbach kindly reviewed the manuscript in preliminary form. This work was initiated while the author was at the Manufacturing Development Center, Ford Motor Company. Continuation of this work was supported by the Center for NDE at Iowa State University and was performed at the Ames Laboratory. Ames Laboratory is operated for the U.S. Department of Energy by Iowa State University under Contract No. W-7405-ENG-82.

REFERENCES

1 Tukloff, J., 1984. MANTECH program aims to make factory of the future a reality in defense. Ind. Eng., 16 (2): 46-52.
2 Merchant, M.E., 1983. Production: A dynamic challenge. IEEE Spectrum, 20 (5): 36-39.
3 Western Electric Company, 1956. Statistical Quality Control Handbook. Western Electric Co., Newark, NJ, pp. 24-28.
4 Western Electric Company, 1956. Statistical Quality Control Handbook. Western Electric Co., Newark, NJ, pp. 180-183.
5 Champ, C.W. and Woodall, W.H., 1987. Exact results for Shewhart control charts with supplementary runs rules. Technometrics, 29 (4): 393-399.
6 Normal or Gaussian Random Deviate Generator “GGNML”, IMSL Library, Edition 9.2, IMSL, Houston, TX, 1986.
7 Boeing Airplane Company, 1979. Temper inspection of aluminium alloys. Boeing Process Specification BAC 5946, Revision G. Boeing Airplane Company, Seattle, WA.
8 Papadakis, E.P., 1976. Ultrasonic velocity and attenuation: Measurement methods with scientific and industrial applications. In: W.P. Mason and R.N. Thurston (Eds.), Physical Acoustics: Principles and Methods, Vol. XII. Academic Press, New York, pp. 339-342.
9 Giza, P. and Papadakis, E.P., 1979. Eddy current tests for hardness certification of gray iron castings. Mater. Eval., 37 (8): 45-50, 55.
10 Papadakis, E.P., Bartosiewicz, L., Altstetter, J.D. and Chapman, II, G.B., 1984. Morphological severity factor for graphite shape in cast iron and its relation to ultrasonic velocity and tensile properties. AFS Trans., 92: 721-728.
11 Libby, H.L., 1971. Introduction to Electromagnetic Nondestructive Test Methods. Wiley, New York, pp. 39-57.
12 Statistics and Control Chart Package, Advanced Systems and Designs, Inc., Dearborn, MI, 1985.
13 RS Series Quality Control Analysis 3.0, BBN Software Products, Northbrook, IL, 1986.
14 Data Cam 2.0, Perceptron, Inc., Farmington Hills, MI, 1988.
15 Model 8200 Verigage Brochure, K.J. Law Engineers, Inc., Farmington Hills, MI, 1987.