Solid-State Electronics, Vol. 24, No. 11, pp. 1045-1047, 1981. Printed in Great Britain.
0038-1101/81/111045-03$02.00/0 Pergamon Press Ltd.
A NOTE ON IC-YIELD STATISTICS
R. M. WARNER, JR.
Electrical Engineering Department, 123 Church Street S.E., University of Minnesota, Minneapolis, MN 55455, U.S.A.
(Received 22 January 1981; in revised form 30 April 1981)
Abstract-Expressions given by Moore and by Dingwall for relating integrated-circuit yield to area are compared and are given a common interpretation in terms of the composite model. The issue of large-chip yield projection is discussed, a subject that is still unclear because of the concave-up projections of the empirical curves, the linear projection of the composite model, and the concave-down projections of recent analytical models.
INTRODUCTION
A perennial problem besetting students of integrated-circuit (IC) yields has been a shortage of real-life data in a form appropriate to various yield analyses. The data are costly to acquire and are often of a commercially sensitive nature. In view of this dilemma it is worthwhile to extract a maximum of meaning from relationships that have been tested against literally millions of IC slices. One such formula, an empirical expression describing integrated-circuit experience at Intel, was offered by Moore [1]:

$$Y = e^{-(A/A_{0M})^{1/2}}. \qquad (1)$$

Here $Y$ is “probe yield”, $A$ is integrated-circuit area, and $A_{0M}$ is a reference area. We may consider that $A_{0M}$ is defined as the integrated-circuit area that leads to a probe yield of $1/e$, or 37%. As such, it is of course a time-dependent quantity. Technology refinements have steadily increased the IC size that Intel has been able to manufacture at a 37% yield. Although this formula was offered over a decade ago, it has continued to provide a serviceable description of experience through the seventies [2] in spite of significant technology changes, such as the use of projection aligners in place of contact printers.

Another formula has been offered by Dingwall [3], and it too has proved useful through the seventies in experience at TRW [4]:

$$Y = \left(1 + \frac{\bar{D}A}{3A_{0D}}\right)^{-3}. \qquad (2)$$
Here, $Y$ once more is probe yield, $\bar{D}$ is the dimensionless mean defect density (i.e. the average number of point defects per IC), $A$ is IC area as before, and $A_{0D}$ is a different reference area. Dingwall’s formula has been written in this way for parallelism with Moore’s. (Dingwall originally wrote $D_0$ in place of $\bar{D}/A_{0D}$, where $D_0$ is defect density in number per cm$^2$.) We may take as the definition in this case that $A_{0D}$ is an IC size that leads to a probe yield of 42% when the mean defect density $\bar{D}$ is one defect per IC. Dingwall’s formula is not purely empirical, but rather has a basis in statistical theory, a point that has been treated by Sredni [5]. Also, Okabe et al. [6] and Paz and Lawson [7] have offered on theoretical grounds the same expression (though with variable exponent), but without noting Dingwall’s prior use of it.

We shall see below that the overall mean defect density involved in defining both $A_{0D}$ and $A_{0M}$ must of necessity involve averaging over a nonhomogeneous defect population wherein there are areas characterized by $\bar{D}$ values well over unity, and other areas with $\bar{D}$ values well under unity. It is, of course, the latter subareas that contribute most of the good devices. Putting the matter another way, the area employed for averaging (usually a slice or a group of slices) must be nonhomogeneous for the formulas to fit; if it were homogeneous, the resulting semilog plot of yield versus area would be linear.

ANALYSIS
The two formulas are plotted in Fig. 1. While they are
similar in character, they certainly are not congruent. Nonetheless the differences are appreciably less than the statistical scatter in points relating to ten or fifteen years of IC experience. As is evident on the abscissa’s calibration, the ratio $A_{0D}/A_{0M}$ has been taken as 1.5, a choice that achieves a reasonable reconciliation of the two empirical curves. Note that in attempting a reconciliation of the two we make the tacit assumption that the yield experiences of the two IC manufacturers have been similar.

Now let us ask for the simplest possible statistical interpretation of these curves. A worthwhile aim is to minimize the mathematical complexity of the yield model, and at the same time, to maximize its content of physical meaning. As one possibility, invoke the composite model of Warner [8]. The dotted line in Fig. 1 is constructed on the assumption that half the aggregate area of all the slices represented is homogeneous (in the sense that the point defects therein constitute a single binomial-Poisson population) and characterized by a low value of defect density. The numerical values involved can be inferred and verified as follows: The linear portion of the dotted curve extrapolates to 50%. This is the fraction of the aggregate area characterized by a single low value of
defect density $D$. Where $A/A_{0M} = 1$, we observe $Y = 37\%$ on the linear curve. If this homogeneous region held through the entire area, the yield would be 74%, or 0.74. Hence

$$D = \ln\frac{1}{0.74} = 0.3. \qquad (3)$$

The curved portion of the dotted curve rising to 100% represents a contribution to the total yield of the high-defect-density remainder of the aggregate area. At $A/A_{0M} = 1$, it lies above the linear portion by about 9%, so that an overall area with these properties would exhibit a yield of about 18%. From the equation (3) prescription we infer a $D$ value of 1.7. The composite model was designed in this case to give a curve falling midway between the two empirical curves.

With a defect density of 0.3 in half of the area and 1.7 in the other half, it is evident that the overall mean defect density at Moore’s reference area $A_{0M}$ is 1.0 defect per IC. Hence for this condition we have at Dingwall’s reference area $A_{0D}$ an overall mean defect density of 1.5 defects per IC. Continuing to consider Dingwall’s reference area, if we scale down the defect density in the poorer half from 2.55 to 1.55 (keeping it the same at 0.45 in the better half; 0.45 and 2.55 are the values 0.3 and 1.7 rescaled by the area ratio 1.5) so that the overall value is 1.0 defect per IC in the case of Dingwall’s reference area, then we get the plotted points shown in Fig. 1. Thus the composite model fits Dingwall’s formula very well.

Fig. 1. Comparing the empirical yield-vs-area curves of Moore and Dingwall with each other and with a curve based upon a composite model that approximates both. Plotted points are based on a slightly modified composite model. (Curve labels: Dingwall formula, Moore formula, composite approximation.)

As a practical matter, the resolution of total area into subareas of differing (and individually homogeneous) defect densities is not straightforward. The defect pattern may be “patchy”. Islands of high defect density can exist separated by low-density regions, and vice versa, a problem that has been examined by Ham [9].

A primary motivation for developing yield models, whether empirical or analytical, is to permit one to predict accurately the yield results to be expected from new and larger integrated circuits. In Fig. 1, the curves do not agree in this respect. Both empirical curves are concave-up throughout, while the composite-model curve has a linear “tail”, and consequently makes a more pessimistic projection of expected large-chip yield.

Given a map of an IC slice, showing the locations of good and defective chips, one can carry out one kind of extrapolation by the time-honored “window method” [7, 8, 10]. In this process one looks for good (adjacent) pairs, good quads, good octets, etc., and plots the yield of these as the expected yield for an IC whose area is a corresponding multiple of the reference-IC area. This exercise has in the past led to linear projections [11, 12], and can be shown to do so yet another time by considering a low-defect-density portion of Moore’s venerable half slice [1], as given in Fig. 2(a); the window-method projection can be seen to be linear in Fig. 2(b). The statistical meaning of such a linear result is that the fatal defects involved constitute a homogeneous Poisson population. (Note that we consider only the low-defect-density subarea when rendering this judgment.) It is worth noting that

$$Y = e^{-D} \qquad (4)$$

and

$$Y = Y_0^{A/A_0} \qquad (5)$$
Fig. 2. (a) A reasonably homogeneous portion of the defect map given by Moore [1]. ($B = 128$: number of integrated circuits of area $A$. $N = 43$: total number of spot defects.) (b) Log yield versus area, $A/A_0$, for the sample of part (a), inferred by using the “window method”.
constitute equivalent statements of this Poisson relationship. That is, using the notation of Ref. [8],

$$e^{-D} = e^{-N/B(A)} = \left(e^{-N/B(A_0)}\right)^{A/A_0} = Y_0^{A/A_0}, \qquad (6)$$
where $Y_0$ is the yield at the area $A_0$ within the homogeneous region.

It has been pointed out by Stapper [13], however, that this is not the end of the story. Large chips are rarely constituted as exact multiples of smaller chips. If differing segments of a circuit (e.g. memory cells versus decoders) have differing sensitivities (or vulnerabilities) to defects, then the shifting composition that goes with IC upscaling will render the window method invalid. Stapper developed a complicated theory to deal with such factors and applied it specifically to a series of memory chips. His theory predicted that for these circuits the result would be neither concave-up nor linear, but would actually be concave-down. Testing his theory against voluminous data from two IBM facilities, however, he found that in both cases the best least-squares fit was provided by a straight line. Unfortunately the area extremes in the experimental data differed by less than a factor of two, and hence these data do not constitute a very sensitive test of the point at issue. It certainly would be worthwhile to repeat the study with a sequence of chips differing overall in area by a factor of four, six, or eight, if such an opportunity presents itself.

A different aspect of the projection problem was treated somewhat earlier by Hu [14]. Focusing on a single slice, he considered an extreme limiting case wherein hypothetical integrated circuits grew to approximately quarter-slice dimensions. He made the interesting point that in the limit of large IC size, even a nonhomogeneous defect pattern leads to Poisson behavior. (Note that in this case he considers the entire slice area and not just the low-density subarea we discussed above.) It follows that he would predict a more pessimistic yield in this extreme limit than would any of the curves in Fig. 1. It should be emphasized that the area range considered by Stapper was far removed from the range where Hu’s effect would enter.
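For readers who wish to tabulate the three curves of Fig. 1, the following Python sketch evaluates Moore's formula, Dingwall's formula, and the two-region composite model on a common abscissa $x = A/A_{0M}$. It is only an illustration under stated assumptions: the function names are hypothetical, Moore's formula is taken with a square-root exponent (the form in which it is usually quoted), Dingwall's curve uses $\bar{D} = 1$ and $A_{0D}/A_{0M} = 1.5$, and the composite model uses the values quoted in the Analysis section (defect densities 0.3 and 1.7, each over half the area), with each half treated as a homogeneous Poisson population.

```python
import math


def moore_yield(x):
    """Moore's empirical yield, with x = A / A_0M.

    Assumption: the exponent is the square root of the area ratio.
    """
    return math.exp(-math.sqrt(x))


def dingwall_yield(x, area_ratio=1.5, mean_defects=1.0):
    """Dingwall's yield, with x = A / A_0M.

    area_ratio is A_0D / A_0M (1.5 in Fig. 1); mean_defects is the mean
    number of defects per IC at A = A_0D.
    """
    return (1.0 + mean_defects * (x / area_ratio) / 3.0) ** -3


def composite_yield(x, d_low=0.3, d_high=1.7, low_fraction=0.5):
    """Two-region composite model, with x = A / A_0M.

    Each region is treated as a homogeneous Poisson population; d_low and
    d_high are the defects per IC in the two regions at x = 1.
    """
    return (low_fraction * math.exp(-d_low * x)
            + (1.0 - low_fraction) * math.exp(-d_high * x))


if __name__ == "__main__":
    # Tabulate the three curves over a range of relative areas.
    print(" A/A_0M    Moore  Dingwall  composite")
    for x in (0.5, 1, 2, 4, 8, 16):
        print(f"{x:7.1f}  {moore_yield(x):7.4f}  {dingwall_yield(x):8.4f}"
              f"  {composite_yield(x):9.4f}")
```

At $x = 1$ the three functions give roughly 0.37, 0.55 and 0.46, so the composite value sits close to the midpoint of the two empirical curves, while at large $x$ the composite curve falls below both, reflecting its linear tail on a semilog plot. The slightly modified model behind the plotted points of Fig. 1 can be tried by measuring area in units of $A_{0D}$ instead: `composite_yield(1.0, d_low=0.45, d_high=1.55)` gives about 0.425, against about 0.422 from `dingwall_yield(1.0, area_ratio=1.0)`.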
The largest chips considered by Stapper probably came 100 or more to a slice. But to return to Hu’s suggestion, we should note that it has been pointed out earlier [15] that binomial statistics without elaboration, when applied to a homogeneous defect population, will predict a curve of log yield versus area that is concave-down when chip area approaches a physical constraint, such as slice area [15], an effect that exaggerates the tendency described by Hu.

In the same paper [14], Hu also addressed such real-life problems as the effect of defects of finite size, and the tendency of some defects to cluster, or even to be attracted by an IC feature such as an oxide step. The present author freely acknowledges the need for yield analysis capable of dealing with such complications, as well as with the scaling problems addressed by Stapper [13], and equally acknowledges the limitations of the relatively simplistic composite model. Nonetheless, for the time being this model may constitute a useful compromise lying between the early concave-up empirical projections and the more recent concave-down analytical projections of Hu and Stapper, until additional data can be accumulated to clarify this issue.

Figure 3, finally, shows the author’s estimate of the changing value of Moore’s reference area $A_{0M}$ through time. Extrapolating this curve suggests that integrated circuits having an area of one square centimeter will become commonplace in the mid-1980s. The growth in area is being abetted by a growing willingness to employ redundancy in integrated circuits [16, 17], a measure that has the effect of increasing the tolerable value of mean defect density.

Fig. 3. An estimate of how Moore’s reference area $A_{0M}$ has increased with time through technological refinement. (Time axis marked at 1975 and 1980.)

Acknowledgement-I am indebted to a reviewer for correcting a major error in the original version of this manuscript.

REFERENCES
1. G. E. Moore, Electronics 43, 126 (1970).
2. G. E. Moore, private communication.
3. A. G. F. Dingwall, High-yield-processed bipolar LSI arrays. International Electron Devices Meeting, Washington, D.C. (Oct. 1968).
4. J. L. Buie, private communication.
5. J. Sredni, Use of power transformations to model the yield of IC’s as a function of active circuit area. International Electron Devices Meeting, Washington, D.C. (Dec. 1975).
6. T. Okabe, M. Nagata and S. Shimada, Elec. Engng Japan 92, 135 (1972).
7. O. Paz and T. R. Lawson, Jr., IEEE J. Solid-St. Circuits SC-12, 540 (1977).
8. R. M. Warner, Jr., IEEE J. Solid-St. Circuits SC-9, 86 (1974).
9. W. E. Ham, RCA Rev. 39, 231 (1978).
10. R. B. Seeds, Yield and cost analysis of bipolar LSI. Presented at the IEEE Int. Electron Devices Meeting, Washington, D.C. (Oct. 1967).
11. Ref. [8], Fig. 4.
12. Ref. [7], Fig. 3.
13. C. H. Stapper, Solid-St. Electron. 24, 127 (1981).
14. S. M. Hu, Solid-St. Electron. 22, 205 (1979).
15. Warner, op. cit., p. 89, column 1.
16. C. H. Stapper, A. N. McLaren and M. Dreckmann, IBM J. Res. Develop. 24, 398 (1980).
17. A. Tuszynski, Digest Soc. Mfg. Engrs 1 (1980).