The statistical approach to the improvement of reliability


Electronics Reliability & Microminiaturization. Pergamon Press 1962. Vol. 1, pp. 11-20. Printed in Great Britain

THE STATISTICAL APPROACH TO THE IMPROVEMENT OF RELIABILITY

E. D. VAN REST

Department of Engineering, University of Cambridge

Abstract--Reliability can only be studied by observing more than one occasion or equipment. A statistical view of the problem is therefore essential. The frame of the statistical view is the pattern of variability experienced. One source of variability is the manufacturing process, another the whole-life environment of the equipment. The designer cannot design a reliable part for an equipment unless he knows the patterns of both of these. All stages of production, namely, approval, inspection, manufacture and use can contribute this information. Because the reliability of an equipment is the product of the reliabilities of its parts, the standard of reliability required of the parts to produce even an equipment of poor reliability is high, unusually high by the ordinary standards of manufacture. A general principle is stated for seeking and using the information recommended. It is that it is more efficient to seek it in the bulk of the product rather than from the rare failures of parts. The application of this principle in design, approval, inspection, manufacture and use is briefly discussed.

GENERAL

The statistical view must certainly be taken when studying the problem of reliability because, although a given equipment may function at one time, others, or the same one on other occasions, do not always do so. We are accustomed to say that such effects are due to "chance". It is important that we should admit that this means nothing more than that we are ignorant of the cause and perhaps therefore of the incidence of a particular variation from expected. The statistical view is taken whenever a group, instead of a single item, is studied. The items concerned may be things or they may be occasions.

Where effects occur sometimes and not others and when we do not know on which occasions they will occur, we can do nothing by studying a single occasion; it is essential to take the statistical view as we do when we study, for example, the proportion of occasions on which the effect occurs. Although "chance" connotes ignorance, searching for more information is not the only thing we should do; we should also ensure the best use of the information already possessed. Any information gained will still leave the total short of perfection; we cannot hope that the chance element will disappear entirely; we shall still need to take the statistical view.

The number of flying bombs that fell on an area of 300 square miles in the south of England during a certain period was 3000, so that if there were no specific aiming points each sub-area of 1/10th square mile would, on average, expect to receive one. There were however many sub-areas of this size which received more than one, and several which received so many that suspicion was voiced that aiming points existed and that aiming was effective. It is possible to calculate, on the assumption of no effective aim, the expected number of sub-areas (chosen independently of the actual falls) that would receive each possible number of bombs. The results of these calculations did, in fact, agree well with observation, so that it was reasonable to accept the assumption as representing the truth; to reject the supposition that aiming points existed; and so avoid the waste of effort that might have taken place in explaining the enemy's knowledge of them and his power of aim. In other words the statistical picture enabled the drawing of a conclusion in a situation which, because it was labelled "chance", might have led us to abandon hope of knowing more.

There are similar conclusions to be drawn in the cases of complex equipments. Of a number of equipments under observation some fail before their function is discharged. We observe the proportion not so failing and call this the "reliability".
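The flying-bomb calculation described above can be sketched numerically. The snippet below assumes that "no effective aim" is modelled by a Poisson distribution of bombs per sub-area, with a mean of one (3000 bombs over 3000 sub-areas of 1/10 square mile). The Poisson model and the function name `expected_subareas` are assumptions of this sketch; the paper does not say which calculation was used.

```python
import math

def expected_subareas(n_subareas: int, n_bombs: int, max_hits: int) -> list[float]:
    """Expected number of sub-areas receiving 0, 1, ..., max_hits bombs,
    assuming bombs fall independently and without aim (Poisson approximation)."""
    mean = n_bombs / n_subareas  # average number of bombs per sub-area
    return [
        n_subareas * math.exp(-mean) * mean**k / math.factorial(k)
        for k in range(max_hits + 1)
    ]

# Figures from the text: 300 square miles in sub-areas of 1/10 square mile, 3000 bombs.
counts = expected_subareas(n_subareas=3000, n_bombs=3000, max_hits=5)
for hits, expected in enumerate(counts):
    print(f"{hits} bombs: about {expected:.0f} sub-areas")
```

Under these assumptions roughly a quarter of the sub-areas would receive two or more bombs with no aiming at all, which is the kind of clustering that might otherwise be taken as evidence of aim.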


The equipments did not all behave alike; some failed, some did not, but we could not, beforehand, distinguish the two kinds. We attribute each failure to chance, meaning thereby nothing more than this ignorance. We could not have known what to expect by studying one equipment; we are in a statistical situation. We still do not know what will happen to another series, but we could have some idea if there were some link, some similarity, between the two series. If, for example, we could believe that the new series was manufactured in the same way as the first and was to be subject to the same stresses we should have such a link. We do in fact believe that such links exist; we are using such a belief whenever we carry forward experience of the past to the future.

It is convenient to think of the link as being provided by the two series having variability generated by the same process. We do not need to define the physical nature nor the details of the process; we are content to define it as "remaining the same". Yet these words involve a paradox, for of the product of the process some units will fail, others will not. In other words the process cannot be exactly the same from unit to unit. So we arrive at the idea of a "statistical process", one which involves variability of unknown cause, but always belonging to the same pattern. It is to processes such as these that our reasoning refers. Their product is variable, we cannot tell what will come next, but we recognize and find invariable a pattern of this variability, a pattern characteristic of each process.

With the aid of statistical ideas we can marshal what information we have and make inferences which will guide our actions to improve reliability. We shall not, of course, remain content to describe the overall performance of an equipment. Since our object is to improve reliability it will be advantageous to break down the complex into its parts, to study them and their performance separately.
The reliability of an equipment is built up of the reliability of its parts; it is important to know how this build-up occurs, how the chance effects causing failure of a part will show themselves as a pattern of failure when the whole equipment is studied. It would also be useful to recognize from their effects different reasons for failure; for example, the "wear-out" type and the so-called "random failure", independent of time. The next few paragraphs are devoted to a few examples of how patterns of variability build up into other patterns.

Single units subject to chance failure have the simple pattern of variability shown in Fig. 1(a), in which the lengths of the vertical bars represent frequencies. When such units are studied in sets of n units the pattern of variability for the sets is different from that for the units. For one thing, the possible number failing in a set is any number from zero to n. The pattern could be, for example, as in Fig. 1(b). For a given Fig. 1(a) the patterns are easily calculable for any value of n.

[Fig. 1(a). Bars labelled FAIL and NOT FAIL.]
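As a rough illustration of how the pattern for sets of n follows from the pattern for single units, the sketch below computes the relative frequency of each possible number of failures. It assumes that units fail independently with the same probability, so that the pattern is binomial; the failure probability of 0.3 and the function name `failures_in_set` are illustrative choices, not values from the paper.

```python
import math

def failures_in_set(n: int, p_fail: float) -> list[float]:
    """Relative frequency of 0, 1, ..., n failures in a set of n units,
    assuming each unit fails independently with probability p_fail."""
    return [
        math.comb(n, k) * p_fail**k * (1 - p_fail) ** (n - k)
        for k in range(n + 1)
    ]

pattern = failures_in_set(n=10, p_fail=0.3)  # p_fail = 0.3 is purely illustrative
for k, freq in enumerate(pattern):
    # Crude text histogram: one '#' per 2 per cent of relative frequency.
    print(f"{k:2d} failures: {freq:.4f} {'#' * round(freq * 50)}")
```

Changing `n` or `p_fail` gives the corresponding pattern for any other set size or single-unit pattern.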

[Fig. 1(b). No. of failures in sets of n (n = 10).]

Another way in which such a pattern of variability can arise is from a set, not of items, but of deviations from nominal in a particular property of a single item. Suppose there are a number of causes of slight changes in the life of a part and that each is as likely to cause a slight increase as a decrease. Then in any one part, the increases and the decreases will tend to average out though not completely; different parts will have different sums of increases and decreases, the overall effect being like Fig. 1(b) for a few causes of change in life but like Fig. 1(d) when these causes are numerous and the changes each produces relatively small. Yet another pattern could arise when order of occurrence of failure and success is considered.


An event (failure) whose basic frequency is represented by Fig. 1(a) would yield intervals (or numbers of successes between failures) distributed as in Fig. 1(e).

[Fig. 1(c). No. of failures in sets of n (n = 100).]

[Fig. 1(d). Deviations from nominal; nominal value marked.]

[Fig. 1(e). Interval, or no. of successes before failure.]

We can now discuss in terms of these patterns of variability some of their consequences pertinent to the problem of reliability. Consider the effect of complexity on reliability. An equipment with 100 parts is more likely to fail than one with only 50, simply because there are more parts to go wrong (it is supposed that the failure of any one part causes the failure of the whole equipment). But there is not a simple proportionality between the frequencies of failures nor between the reliabilities in the two cases. To make a simple situation, suppose that the 100, or the 50, parts are all alike in their liability to failure; say one in a thousand fail, on average, during the required life so that the reliability of one kind of part is 0.999, this being the proportion of times these will function when called upon. The appropriate variability patterns here are like Fig. 1(c) for n = 50 and n = 100; but we are interested only in the class of zero failures since all the other classes (of one or more parts failed) lead equally to failure of the equipment. The frequency of this class is the product of the frequencies of survival of the separate parts. For equipments made of 50 parts the frequency of survival will be (0.999)^50 = 0.9512; for equipments made up of 100 such parts the survival rate is 0.9048. In the above example the reliabilities, or proportions of equipments not failing, are 95.1 and 90.5 per cent.

Although the fall is not proportional to the number of parts it is nevertheless alarming for the larger numbers of parts. For 200 parts of the same reliability as above, the reliability of the whole equipment falls to 82 per cent; for 400 parts it falls to 67.0 per cent. That is, although the parts fail at a rate of only one in a thousand the equipments made of 400 such parts fail at a rate of one in three. Higher numbers of parts are not uncommon and it only needs the equipment to contain 700 such vital parts for its reliability to fall to 50 per cent. Part numbers and reliabilities are of this order.

These calculations focus attention on the fact that what we have ordinarily called high reliabilities for parts are no longer high enough to secure reasonable reliabilities for equipments made of many such parts. This setting of a new standard is one of the important things that arises from the statistical approach. Many have been willing to believe that the standards of manufacture are high enough and that the failures are due to "chance" and therefore outside any remedial action.

We can calculate what standards are required. If an equipment of 700 parts has a reliability of 50 per cent, what improvement in reliability of the parts would be required to improve the reliability of the equipment to, say, 75, 80 or 90 per cent? The answers are that the reliabilities of the parts would have to be 99.960, 99.968 and 99.985 per cent, or failure rates of parts of 1 in 2,500; 1 in 3,000; 1 in 7,000.

Different parts do not in fact have the same failure rates but the simple rule of multiplying the reliabilities still holds when they are different. From these calculations it is seen that to achieve 90 per cent reliability of an equipment with 700
parts we can allow an average failure rate in the parts (a geometric average) of about 1 in 7,000. Since this rate must include those failing from chance occurrences of unusual stresses it is clear that there is little room for even slight departures from specification.

Another phenomenon which it is useful to look at from the point of view of the patterns of Fig. 1 is the "wear-out" effect and its distinction from the "purely random failure" effect. The wear-out effect is the result, usually, of variability in the effective life of a part. This variability arises from a number of ill-defined causes and therefore, by an easily demonstrated averaging effect, results in the lives clustering about some average value, with the relative frequencies of other values decreasing with their deviation from average. The frequencies are usually represented by a diagram such as Fig. 1(d), in which the areas of the blocks represent the frequencies of occurrence of values in the range represented by the base of the block. If the lives are represented by some other frequency distribution the following arguments will still apply.

There are probably two classes of parts in any complex equipment. The first is a few parts with lives accidentally very short because of some defect in manufacture resulting in a weakness. Most of these will be discovered and removed during the manufacturer's inspection and test, but a few, the longer-lived of this "short-life" class, will only show themselves in use. The other class has the designed, long life with average value well above that required for the normal reliability of the equipment but still with distributed values of life so that some, even if only a few, will fail within the expected equipment life.
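Returning to the product rule for equipment reliability stated earlier in this section, the arithmetic can be reproduced with a short sketch. It assumes, as the text does, that the failure of any one part causes failure of the whole equipment and that parts fail independently; the function names are illustrative.

```python
def equipment_reliability(part_reliability: float, n_parts: int) -> float:
    """Reliability of an equipment that fails if any one of n like parts fails."""
    return part_reliability ** n_parts

def required_part_reliability(target: float, n_parts: int) -> float:
    """Part reliability (a geometric average) needed for a target equipment reliability."""
    return target ** (1.0 / n_parts)

# Equipment reliability for parts that each fail once in a thousand.
for n in (50, 100, 200, 400, 700):
    print(f"{n:3d} parts at 0.999 each: {equipment_reliability(0.999, n):.4f}")

# Part reliability needed to lift a 700-part equipment to a given target.
for target in (0.75, 0.80, 0.90):
    r = required_part_reliability(target, 700)
    print(f"target {target:.2f} with 700 parts: part reliability {r:.5f}, "
          f"about 1 failure in {1 / (1 - r):.0f}")
```

When parts differ in reliability, the product is taken over the individual values instead, and `required_part_reliability` then gives the geometric average that the parts must achieve.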

There may be a third class having an average life intermediate between those of the other two classes. These will be troublesome; their presence will call for a test and replacement service.

When the third class is absent the number failing at a given time changes as in Fig. 2(a), which is a plot of number failing against time of failing. This consists, at low values of time, of the right-hand tail of the distribution of lives of the short-lived class of parts and, at high values of time, of the left-hand tail of the long-lived class. Superposed on these two will be a small, constant-with-time, failure rate to be ascribed to the rare occurrence of unusual stresses. The designer will have taken care that only the extreme tails of both these distributions occur in the expected lives of the equipments so that in ordinary cases Fig. 2(a) will approximate to a straight horizontal line.

There have been suggestions that the wear-out failure rate might be distinguished from the failure rate due to environmental causes by the fact that the first will be expected to fall or rise whereas the other will be constant with time. It is however unlikely that the changing rates will be distinguishable, especially with the numbers of equipments likely to be available for observation. As will be considered later it is in any case inadvisable to separate the two causes of failure, innate and environmental.

When the third class of parts is present, a class having lives within the expected life of the equipment and requiring replacement, the effect of replacement will be that there will at any one time exist in the equipment parts of a variety of ages. If the number of such parts is large enough the number failing at any one time will remain constant since this number will be represented as the sum of a number of distributions as in Fig. 2(b).
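The levelling-out of the failure rate under replacement can be illustrated with a small calculation. The sketch below assumes that part lives are independent and normally distributed and that a failed part is replaced at once, so that the time of the k-th failure in one position is normal with mean k times the mean life. These modelling choices, the numerical values and the function names are assumptions of the sketch, not taken from the paper.

```python
import math

def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def replacement_failure_rate(t: float, mean_life: float, sd_life: float,
                             max_replacements: int = 200) -> float:
    """Expected failures per unit time at time t in one position whose part is
    replaced on failure: the sum of the densities of the 1st, 2nd, 3rd, ...
    failure times (the k-th has mean k*mean_life and sd sqrt(k)*sd_life)."""
    return sum(
        normal_pdf(t, k * mean_life, math.sqrt(k) * sd_life)
        for k in range(1, max_replacements + 1)
    )

MEAN_LIFE, SD_LIFE = 10.0, 3.0  # hypothetical part life, arbitrary time units
for t in (5, 10, 15, 20, 40, 80):
    print(f"t = {t:3d}: rate = {replacement_failure_rate(t, MEAN_LIFE, SD_LIFE):.4f}")
```

Early on the rate follows the single distribution of first failures; once parts of many ages are present it settles close to 1/MEAN_LIFE, which is the constant level the text describes.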

[Fig. 2(a). Number failing against time t; designed life of equipment marked.]

[Fig. 2(b). Number failing over the lifetime of the equipment.]

Any attack on the problem of reliability must obviously be made on the parts of which the complex is built. The foregoing paragraphs have shown how the chance effect may manifest itself when observed as failures of whole equipments; we now need to study statistically what happens to the part. Just as attention will naturally be focused on some parts which fail more often than others, so when studying these parts attention will be directed towards certain properties of those parts which are critical for functioning of the part. With some of these properties the designer is familiar; he knows what values must be achieved for satisfactory functioning; of others he will not have such complete knowledge, and this is often true of those properties that affect length of life. In order to state the statistical view we can consider either kind and see what is the form of the knowledge required.

Each property has the nature of a "response" to a "stress". For example, a resistor passes a certain current for each value of the applied voltage; the support for the interior assembly of an electronic valve deflects in response to a blow; a part required to fit with another has the alternative responses "fit" and "not fit". There is a relation between response and stress for each property, as, for example, is shown in Fig. 3(a). The designer will decide as part of his design what response is required and will, taking account of the varying stress likely to be encountered, put limits on that value in the form of a maximum or a minimum or both. He will, for example, state that the output of a certain circuit should be at least 0.5 A and should not exceed, say, 0.75 A. The stress or voltage which will produce values of current between these limits can be read from the stress-response curve as indicated in Fig. 3(b). If the voltage he has available, by design from another part of the equipment, is not the stress which will produce a response within the stated limits, he can either alter the requirements or alter the stress-response relation, as in Fig. 3(c), which he can do by re-designing the part.

[Fig. 3(a). Response against stress.]

[Fig. 3(b). Response against applied voltage, with maximum and minimum limits marked.]

[Fig. 3(c). Response against applied voltage; old curve marked.]

There are two kinds of happening which may nullify the designer's efforts and cause an equipment not to function even though the prototype functioned. One could be that the stress encountered on the occasion of failure is so great as to cause a response outside the limit set. The other could be that manufacturing variability causes a part to be made having a different stress-response curve so that the response even to a normal stress causes the limiting response to be exceeded.
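The two kinds of happening can be made concrete with the resistor and the current limits quoted earlier (at least 0.5 A, not more than 0.75 A). The sketch below takes the response to be the current given by Ohm's law; the nominal resistance, the voltages and the off-nominal resistance are hypothetical values chosen only to show one case of each kind.

```python
def current(voltage: float, resistance: float) -> float:
    """Response (current, A) of a resistor to a stress (applied voltage, V)."""
    return voltage / resistance

def within_limits(response: float, lower: float = 0.5, upper: float = 0.75) -> bool:
    """Check a response against the designer's limits (0.5 A and 0.75 A in the text)."""
    return lower <= response <= upper

NOMINAL_R = 20.0  # ohms, hypothetical nominal resistance
NOMINAL_V = 12.0  # volts, hypothetical normal stress

cases = {
    "prototype, normal stress": (NOMINAL_V, NOMINAL_R),
    "nominal part, unusual stress": (16.0, NOMINAL_R),     # the environment varies
    "off-nominal part, normal stress": (NOMINAL_V, 25.0),  # manufacture varies
}
for label, (v, r) in cases.items():
    i = current(v, r)
    print(f"{label}: {i:.2f} A, {'functions' if within_limits(i) else 'fails'}")
```

The prototype functions, while each of the other two cases falls outside the limits for a different reason, which is why both sources of variability have to be known.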


The statistical picture then is of a range of possible responses, caused by manufacturing variability, for each possible stress and a range of possible stresses originating from outside the part, caused by the variability of the environment. The stress-response diagram becomes something like Fig. 3(d). The actual shape of the stress-response curve for a single part under discussion is immaterial to our argument since a simple change of scale could make it any shape we wish. If now we wish to represent the frequencies with which each pair of values of stress and response occur we need a third dimension and the diagram becomes as Fig. 3(e), in which volume represents frequency. For most purposes of this paper it will be sufficient to retain the two-dimensional diagram and draw contours of equal frequency as in Fig. 3(f). The peak (and there will usually only be one) will represent the commonly occurring stresses and their responses. The line of minimum response of Fig. 3(b), when drawn on this diagram, shows that the designer's task is to ensure that this line lies in the low frequency parts of the diagram. It is, however, the line that is fixed (by requirements outside the part) and the positions of the contours which can be changed (by design, choice of materials, dimensions, etc.).

[Fig. 3(d). Response against stress.]

[Fig. 3(e). Response against stress, with frequency as a third dimension.]

[Fig. 3(f). Response against stress, with contours of equal frequency and the line ab.]

With this background it is possible to say, first, what information is required, second, what choices must be made, and last what is different about the statistical approach; what we are enabled to do that could not be done by other approaches.

First, the information must be that which enables the contour map to be drawn or at least guessed at; we also need to know the shape of the stress-response curve for the critical properties in order to know how to move the peaks as required by the reasoning given above. We see two ways of "placing" the peaks; one is to explore the contours near the peaks, using the measured slopes, for instance, to estimate how far from the line ab they ought to be to ensure low frequencies beyond ab. The other is to explore the "lowlands" beyond ab and move the peaks so that ab stays in them for the whole of its length.

The second of these methods is the more direct; it requires no knowledge of the shape of the hills and it explores directly, and not by implication, what is required to be known, namely the frequency of points beyond ab. For this reason it is commonly used in manufacturing, where the practice appears as gauging with gauges set to the tolerable limits of the part. A good manufacturing process will not make any items outside those limits, so that in effect the gauging process becomes only a precaution, an act of assurance, and can do little to control the process of manufacture, except in the extreme case of the process going badly out of setting. It is well known, though not perhaps to manufacturers, that more efficient control is achieved by gauging at points which are not necessarily the specified limits.
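The first way of placing the peak, from the shape of the distribution near it, can be sketched in one dimension. The snippet below assumes that responses at a given stress are normally distributed about the peak, which is a modelling assumption of this sketch and not a statement from the paper; the spread and the peak positions are hypothetical, and the limit of 0.5 A is the minimum response quoted earlier.

```python
import math

def fraction_beyond_limit(peak: float, sd: float, limit: float) -> float:
    """Fraction of responses falling below a minimum-response limit (the line ab),
    assuming responses are normally distributed about the peak."""
    z = (limit - peak) / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

LIMIT = 0.5  # minimum response, e.g. the 0.5 A quoted earlier in the text
SD = 0.02    # hypothetical spread of the response
for peak in (0.54, 0.56, 0.58, 0.60):
    frac = fraction_beyond_limit(peak, SD, LIMIT)
    print(f"peak {peak:.2f} ({(peak - LIMIT) / SD:.0f} sd from ab): "
          f"about 1 in {1 / frac:,.0f} beyond ab")
```

A smaller spread (closer contours) allows the peak to sit nearer to ab for the same frequency beyond it, and the estimate uses every measurement in the sample and not only the rare items beyond the limit.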
In the case of complex equipments we are not in the position of having "good" manufacturing processes; the requirement is so much higher than usual that we have still to make them good. In these circumstances the heavy disadvantage of the direct method becomes very conspicuous. The disadvantage is that information is sought from the infrequent events, so infrequent that when they do not occur, even after a large number of opportunities, we still think one might occur and must go on searching.
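The weight of this disadvantage can be put in numbers. The sketch below assumes independent trials with a constant failure rate; the rate of 1 in 7,000 is of the order discussed earlier, while the 90 per cent level of assurance and the function names are illustrative choices.

```python
import math

def prob_no_failures(failure_rate: float, n_tested: int) -> float:
    """Chance that n independent trials show no failure at all."""
    return (1.0 - failure_rate) ** n_tested

def trials_needed(failure_rate: float, confidence: float) -> int:
    """Smallest number of trials in which a true rate equal to failure_rate
    would show at least one failure with the given probability."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))

rate = 1 / 7000  # part failure rate of the order discussed in the text
print(f"P(no failures in 1000 trials) = {prob_no_failures(rate, 1000):.3f}")
print(f"trials needed for 90% assurance: {trials_needed(rate, 0.90)}")
```

With these figures a thousand trials would show no failure most of the time, and the number of failure-free trials needed for reasonable assurance is of the order of sixteen thousand, which is the continued searching that the text describes.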


The alternative is to seek our information where information exists, in the main body of the frequency field. Nor need this necessarily involve measurement. Counting is only another form of measurement. If we are only able to note the occurrence of a failure at a given stress and not measure the response and thus know how far beyond failure (i.e. beyond ab) the event was, then we can still get information, though not so much. This approach, to study what happens rather than what we hope will not happen, is quite a general one, which could have wider recognition and application than it does get even among statisticians.

THE PRINCIPLE IN USE

In this part of the paper we consider how the application of the principle stated in the first part would govern the mode of attack at each stage of the production of a complex equipment. The stages considered are design, approval, production, inspection and use.

Design

The requirements of design lead directly from the preceding section. It is ordinarily regarded as quite a feat that a design capable of functioning should be created. Much ingenuity and contriving go into the original prototype. But we have shown above that if replicas are to be produced (or even if the prototype is to be reliable in the sense that it will continue to function in a variety of conditions) much more is required. The existence of this further work has, in some organizations, led to the establishment of an additional stage of design, the reliability engineering stage. Whether the two stages of design should be separate or not is not the concern of this paper; there is obviously a very close link between them; the reliability development cannot be entirely independent of the original design which may need to be modified by these later requirements. The important thing is the recognition that additional work has to be done after a prototype has been shown to work and before embarking on manufacture in quantity.

This reliability engineering centres on the contour map of Fig. 3(f). It is not pretended that in the case of every critical property it will be possible to provide the designer with such a complete picture as this. In all of the following paragraphs the ideal approach will need to be modified


because of lack of exact information; the principle be the "response" to some other "stress") and remains however as a guide, not only to the course whose distribution can be studied. The slope of of action but to the obtaining of additional infor- the stress-response curve is in this case the mation from research to improve matters in the resistance. The designer, knowing both the value of the applied voltage and the value of the current future. In summary, the information required is of the required, will be able to specify the value of the shape of the bivariate frequency distribution of resistance required. But all these will be average response and stress. This information can, in time, or nominal values. The resistor actually supplied be gained about each important property of each to this nominal value will have a distribution of component part. The importance of a property is values (the manufacturing distribution), and this decided by its contribution to the successful combined with the distribution of values of applied functioning of the equipment. It is not to be voltage gives the appropriate bivariate distribution expected that complete information will always be with contour lines as in Fig. 3(f). The line ab is available; the distribution of the stresses likely to fixed by the designer as the minimum requirebe encountered must in many cases be guessed at; ment for functioning of another part of the equipbut, on the other hand, there is a lot of such basic ment. The reliability engineer must now specify information available which is not used to the full by the manufacturer's description the resistor because only the extremes are thought to be of whose manufacturing variability, in conjunction interest. 
For example, there are records of the with the given distribution of applied voltage, gives vertical accelerations given to aircraft by wind a bivariate distribution with its peak well away gusts, but the emphasis and method of recording from ab; just how far from ab will depend on the are on the maximum rather than on the frequency closeness of the contours between the peak and ab; the closer they are the closer the peak could be, if distribution as a whole. The distributions of the responses arise from necessary, to ab. The distance from the peak to deviations from nominal specification occurring by ab is the distance commonly referred to in the chance in the manufacturing process so that this unlvariate case as the "safety margin"; this distribution can only be obtained from a knowledge description shows that it is more appropriately of the manufacturing process. This is an illustra- thought of as a margin than as a factor to be tion of the theme that will recur often in discussions applied to the peak value. For the second example consider a strength on reliability, that no stage can be considered as independent of another, so that a fundamental property; the stem of a valve that is liable to break requirement of a reliability programme will be when vibrated. The response in this case is the the acquisition of this kind of information and maximum stress per unit area in the material of the stem; the stress is the amplitude of the vibraits communication from one stage to another. Being provided with this bivariate distribution, tion or the magnitude of the acceleration involved however incomplete, the designer's task is to in the vibration. For a given size of stem from a decide where the line ab should be drawn; the given material there will be one stress-response reliability engineer's task is to move the peak of curve; other sizes in the same material will give the distribution so that few if any points lie curves of similar shape but not coincident. 
The beyond ab. The decision where to place ab may line ab will be drawn at the known maximum stress depend on the function of the part in the equip- per unit area for the material. Once again the task ment or it may, in the case of a strength property, of the reliability engineer will be to construct the for example, be dictated simply by the maximum bivariate distribution from the assumed or explored stress per unit area tolerated by the material. distribution of vibration accelerations or ampliConsider two properties as examples. First, a tudes and from the explored variation in size resistor is required to pass a certain specified arising in the manufacturing process. From this minimum current in order to operate another part diagram he will be able to choose the nominal of the equipment. The current (response) is dimension for the stem which gives a peak of the governed by the applied voltage (the stress) whose distribution well away from ab. source is another part of the equipment (and might What is advocated here may seem to require a

S T A T I S T I C A L APPROACH TO T H E I M P R O V E M E N T OF R E L I A B I L I T Y

19

tremendous amount of exploratory work, but it is true to say that rather a lot of work is being done already; what is suggested is a guide for its profitable direction. In any case the reward at stake, reliable equipment, is a tremendous one.

Approval

It is evident, as has been argued elsewhere, that approval of a design must include approval of a process of manufacture. It is not sufficient to have a design; we must have a design that can be manufactured; the fact that the prototype has been made and functions is no guarantee that the replicates to be manufactured will also function. It is the inevitable variability of manufacture which is the origin of the distribution of responses. It is important to say here what will be given more attention in the paragraph on manufacture, that this variability of manufacture is not necessarily restricted to specified properties. To take the example of the previous paragraph the size and material of the stem of a valve may not be accurately specified, the outside diameter of the tubing of which it is made may be given but not the exact composition nor perhaps the inside diameter. The information that is required, then, about manufacture before approval of a design for manufacture in quantity, is not whether or not a small batch exhibits no failures (at the standard of reliability required a small batch such as can be made for approval is unlikely to show any failures even though it is worse than the standard); but about the variability likely to occur in all the properties likely to be the cause of a failure.

Here again the general principle of looking at the bulk rather than at the extreme is seen to be very necessary; for the very large numbers that would need to be examined in order to establish that the failure rate (by number) of a part was small enough to justify its use in an equipment would be very large indeed, so large that sponsors have been willing to forego assurance about reliability rather than indulge in them.

Manufacture

A process of manufacture having been approved by the studies recommended in the section on Approval, the task of the manufacturer is so to keep the process under control that it generates the same pattern or amount of variability and no more. It is because some variability must be accepted as tolerable that the methods of control are statistical. There is a full discussion elsewhere of the methods available so that there is no need to set them out here. For the very high standards required it may however be necessary to go further than is ordinarily required and improve the processes as regards their variability. In the ordinary way this would require isolation of at least some of the causes of variability and their removal from the process. Methods of efficient experiment or even of analysis of records obtained during process control which enable this to be done are also given elsewhere.

But, in the writer's experience, there is room for an intermediate method that does not require preliminary experiment. It may aptly be called "blind" control, because what is recommended is control of features which are not specified properties of the parts but are rather parts of the process, and whose influence on the variability is only guessed at. The procedure recommended is akin to what the chemist does when making duplicate analyses; he tries to do the same thing each time, to use the same apparatus, at the same temperature, on the same day, under as nearly as possible the same conditions irrespective of whether or not he knows the effects of these conditions on the result. Many will say that this is only what a careful worker operating any process will do anyway; but what are usually lacking are the aids to deliberate control, the record and the study of the short-term, inherent variability of the feature under attention.

It is the writer's experience that many processes of manufacture, and this is especially true of the manufacture of electronic components, are designed in the laboratory; the specification of the process (time to process, strength of chemical and so on) is laid down, but no further control exercised over these features except indirectly through the final product of the process.

Such control would be especially valuable in reducing variability in the ancillary unspecified properties which are more likely than the specified ones to be the cause of the random failure which is "unreliability". For example, the insulation on a resistor may be specified and controlled during manufacture by total weight of insulating material added, whereas it is the variability from point to point which regulates whether a breakdown occurs, so that control is needed over the method of application of the insulation.
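The argument made under Approval, that a small batch is unlikely to show any failures even when the part is worse than the standard, can be illustrated with a short calculation. The sketch below is not taken from the paper: it assumes that parts fail independently with a fixed probability, and the failure proportions, batch size and confidence level are hypothetical values chosen only for illustration.

```python
import math


def prob_no_failures(p: float, n: int) -> float:
    """Chance that a batch of n parts shows no failures at all,
    assuming each part fails independently with probability p."""
    return (1.0 - p) ** n


def batch_size_for_assurance(p_max: float, confidence: float) -> int:
    """Smallest batch size n for which a part with failure proportion p_max
    would show at least one failure with probability >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_max))


# Hypothetical figures, for illustration only.
standard = 0.001      # required failure proportion: 1 part in 1000
actual = 0.005        # a part five times worse than the standard
approval_batch = 50   # size of a batch made for approval

p_clean = prob_no_failures(actual, approval_batch)
n_needed = batch_size_for_assurance(standard, 0.90)

print(f"P(no failures in batch of {approval_batch}) = {p_clean:.2f}")
print(f"parts to examine for 90% assurance at the standard: {n_needed}")
```

Under these assumed figures the approval batch passes without a single failure roughly three times in four, while some 2,300 parts would have to be examined to give 90 per cent assurance by counting failures alone. This is the sense in which the numbers required become "very large indeed".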


Inspection

It is perhaps the inspector who gains most from the statistical view that it is better to control the process of manufacture than to seek the very rare transgressor of the specification. He can make use of this principle in two ways. First, he can satisfy himself of the existence and potential effectiveness of the manufacturer's control over the variability of the process. Second, where there must be some examination of the product rather than of the process itself, he should seek information about the bulk of the product rather than look for the rarity, the defective item. The very large samples necessary for this last are not only expensive, time and effort consuming, but, for economic reasons, usually refer to large batches, and these are less likely to be uniform than small batches. The nonuniform batch has the serious disadvantage when sampling that bad is diluted with good and so made less easy to detect.

The recommended alternative of seeking information about the bulk of the product is done by measurement; the observed variability should agree with that used as the response variability when constructing the contour map for the designer. If measurement is not possible then the use of gauges set at other than the specification limits will provide, in a coarse way, similar information. Inspection, whether carried out on behalf of production or as an assurance, is one of the major sources of information about variability, and none of the effort devoted to it should be wasted as it often is when the purely negative result of "no defects found" is the sole record.

One result of taking the statistical view is that inspection is seen as a necessary part of the production process, one that should be catered for by others. If, for example, the inspection of a particular part or property of a part is made difficult then the total inspection effort being limited, less information is obtainable about that property than would otherwise be the case.
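The contrast between recording "no defects found" and measuring the bulk of the product can be sketched as follows. This is an illustration only, not the writer's procedure: the readings, specification limits and design standard deviation are hypothetical, and the tail estimate assumes the property is roughly normally distributed, which is a strong assumption this far from the mean.

```python
from statistics import NormalDist, mean, stdev


def estimated_fraction_outside(measurements, lower, upper):
    """Estimate the fraction of product outside [lower, upper] from the bulk
    of the measurements, ASSUMING an approximately normal distribution."""
    fitted = NormalDist(mean(measurements), stdev(measurements))
    return fitted.cdf(lower) + (1.0 - fitted.cdf(upper))


# Hypothetical specification limits and design assumption (ohms).
LOWER, UPPER = 95.0, 105.0
DESIGN_SD = 1.3  # response variability assumed by the designer

# Hypothetical sample of 20 measured parts.
readings = [100.2, 98.7, 101.5, 99.4, 100.9, 97.8, 102.1, 99.9, 100.4, 98.9,
            101.2, 100.0, 99.1, 101.8, 98.3, 100.6, 99.6, 102.4, 100.3, 99.0]

# Go/no-go inspection records only whether each part is inside the limits.
defects_found = sum(1 for r in readings if not LOWER <= r <= UPPER)

# Measurement yields the observed variability and a rough tail estimate.
observed_sd = stdev(readings)
fraction = estimated_fraction_outside(readings, LOWER, UPPER)

print(f"defects found by go/no-go inspection: {defects_found}")
print(f"observed sd {observed_sd:.2f} against design sd {DESIGN_SD:.2f}")
print(f"estimated fraction outside limits: {fraction:.1e}")
```

With the same twenty parts, the go/no-go record says only that no defects were found, whereas the measurements allow the observed spread to be compared with the figure used in design.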
Where information is lacking, there are the dangerous sources of unreliability.

Use

It is to the user that the remainder of the team must look for much of the information vital to the production of reliable equipments. Inspection, unless it is the very expensive "environmental testing", cannot usually give the complete bivariate distribution and even the environmental testing does so only to the extent that the environment is correctly simulated. The user could provide information on a scale that is not possible for any of the other stages. For in effect he tests each equipment to the end of its life. Also in their simplest form his records will form a guide to the parts whose unreliability can most profitably be attacked. The important thing from the statistical point of view is that the information should not be biased or imperfect as, for example, would happen if failures were reported and not all successes.

Besides providing some at least of the information about the stress distribution necessary for delineating the contour map of each property, the user can do something to ameliorate this distribution since some of the extreme stresses are likely to be the unnecessary ones the equipment may receive in handling and maintenance. It would be important from this point of view that the user should understand the importance of constancy of the stress distribution; once determined and used for design purposes much of the progress towards reliability could be cancelled by an alteration in this distribution. Just as the manufacturer is required to control the variability of his production process so the user might be required to control the variability of his use process.

This brief survey of the several stages of production leads to several conclusions. First, that at all stages a study of the pattern of variability of all that is produced is more profitable than a search among the rare extremes. Second, that a knowledge of how this variability arises, part in manufacture, part in environment, leads to a better understanding of what must be done at each stage to improve reliability. If the occasional failures are to be attributed to "chance" there is a tendency to think that nothing can be done about them. Third, all stages of production form the sources of the information required by design; and fourth, that some organization is needed to channel this information in the form indicated in this paper to the designer.

Acknowledgement--Crown copyright reserved. Reproduced by permission of the Controller, H.M.S.O.