Microlithography in Semiconductor Device Processing


VLSI ELECTRONICS: MICROSTRUCTURE SCIENCE, VOL 6

Chapter 5

Microlithography in Semiconductor Device Processing

RONALD C. BRACKEN*
SYED A. RIZVI
Mostek Corporation
Carrollton, Texas

Introduction
Pattern Generation
    A. Reduction Cameras
    B. Optical-Pattern Generators (OPG)
    C. Electron-Beam Pattern Generation
    D. Future Trends in Pattern Generation
Photomask Fabrication
    A. Emulsion Photomasks
    B. Contact-Printed Chrome Photomasks
    C. Directly-Stepped Photomasks
    D. Electron-Beam 1× Masks
    E. Future Trends in 1× Mask Making
Wafer Resist-Patterning
    A. Contact Printing
    B. Proximity Wafer Printing
    C. Wafer Patterning by Projection Printing
    D. Wafer Patterning by Step-and-Repeat Cameras
    E. Electron-Beam Wafer Patterning
    F. X-Ray Lithography (XRL) Wafer Exposure
Summary and Conclusions
References


* Present address: Solid State Technology Center, RCA Corporation, Somerville, New Jersey.

Copyright © 1983 by Academic Press, Inc. All rights of reproduction in any form reserved. ISBN 0-12-234106-6


I. INTRODUCTION

Since about 1970, microlithography has been one of the major "gating" items in the advancement of semiconductor device processing. By microlithography we mean the whole process whereby a device concept is reduced to actual working devices on silicon and/or other semiconductor wafers. The principal steps in this process are (1) pattern generation, whereby the device layout drawing is reduced to a reticle for step-and-repeat patterning on a reduction camera, (2) plate preparation at final size for wafer patterning, (3) wafer patterning in photoresist, and (4) fabrication of the patterns in silicon and interconnection materials. In this discussion we shall cover the first three items as being the realm of photopatterning.

Initially, of course, there was no need for photolithography, in that grown-junction and/or diffused-junction devices were simply material configurations. Bars were sawn from the materials and leads were solder-connected. The patterning was done with a saw. A difficulty encountered in these devices is the leakage in the exposed junctions that occur on sawn bars.

This difficulty was lessened with the invention of the planar device, in which patterns are formed on the wafer surface. Local areas are exposed to the dopants, and diffusions occur locally. This arrangement allows the junction to be safely buried under a protective layer of silicon oxide. The dicing cuts now expose no junctions, and the device stability is improved. However, if this junction protection is to occur, two microlithographic steps must be accomplished. First, an array of circles must be formed on the wafer surface, and the silicon oxide must be patterned. After the p-diffusion, a second array of contact circles concentric to and smaller than the first array must be superimposed onto the diffusion pattern seen on the wafer. These simple devices contain the major elements in all microlithography:

Fig. 1. Grown crystals and diffused wafers as they are prepared for the sawing operation. (Labels: grown p-n junction; diffused wafer p-n junction.)

Fig. 2. A cross-sectional view of a diffused wafer showing the locally diffused p-type windows. Concentric with these windows are the contact holes. (Labels: contact window; p diffusion; n-type substrate; p-n junctions.)

pattern preparation, pattern superpositions, and sequential wafer processing steps. The history of microlithography is then seen as an ever-narrowing series of improvements in which the gating item has been either the mask making or the wafer fabrication processing.

In this chapter we plan to touch briefly on the history of microlithographic process development as a reference point. From that point we shall go to current microlithographic practices and give our ideas on where this process is leading. The concerns of microlithographic processing are encompassed in these items:

1. Resolution

This term refers to the ability to image two points of diminishing spacing. The usual definition of resolution is the Rayleigh criterion, which was developed for telescopes to give a mathematical estimate of the resolving power of optical systems. Under this criterion, two images are said to be just resolved when the light intensity between them decreases to approximately 80% of the image intensity [1]. Mathematically this criterion is expressed as

$$d = \frac{0.61\,\lambda}{\mathrm{N.A.}},$$

where d is the separation distance of two images, λ the illuminating wavelength, N.A. the numerical aperture of the objective lens (N.A. = n sin(θ/2)), n the index of refraction of the medium, and θ/2 the half angle of the cone of light entering the objective. The usual convention for optical systems is that two and one-half times the Rayleigh limit gives an estimate of resolution [2].

2. Pattern Registration

This category refers to the merging of two patterns so that they "fit" to each other. A pertinent example of registration is the color lithography process for magazine printing. Successive layers of a photograph are printed one color at a time. The successful merging of all the patterns results in a pleasing photographic reproduction. The registration demanded of semiconductor photolithography is orders of magnitude tighter, but the same principles apply.


3. Dimensional Control

This control refers to imaging the patterns on the wafer at the sizes contemplated by the device designer. There is a systematic variation of image size from reticle to master to working plate to the final image as it is fabricated onto the wafer. There is also variation on the wafer of any particular geometry due to local differences in wafer characteristics, imaging quality, processing variations, and mask dimension variation.

4. Cosmetic Quality

This category refers to the presence on the wafer or the photomask of unplanned and undesired features. These features can be mask defects such as extra or missing chrome on a chrome mask, mask damage such as scratches, or mask contamination. There may be wafer defects of a similar sort such as unremoved contamination from cleaning, silicon chips from wafer handling, or spurious growths from the chemical vapor deposition (CVD) steps. The most notorious of these CVD steps is that of silicon epitaxial growth and the formation of "epi" spikes.

One of the most intriguing aspects of cosmetic quality is its measurement. Until about 1976, only visual inspection was available to determine the cosmetic quality of masks. Mask inspection machines are now commercially available. Wafer inspection machines are being developed, but the inspection problem is more difficult for wafers due to the lower contrast.

5. Throughput of the Machine

This category bears on the overall usefulness of any technique, but it may or may not be a deciding factor. If device layout rules demand resolution or registration that can only be obtained on a low throughput machine, then the use of that machine is necessary regardless of throughput. In general, machine productivity has a strong influence on the usefulness of the process the machine implements.

In the refining of processes, the development of measuring techniques of increasing precision has had a very beneficial effect. This has been the case for defect inspection and geometry measurement and is currently developing for registration.

The various components of the microlithographic process and the various categories of process quality are seen in relation to each other in Table I. This table gives our best estimate of the most recent practices as they relate to that process. In some cases we cross disciplines where strictly applied common definitions cannot be used. This problem is recognized, and the attempt to make commonly based comparisons was undertaken by analogy. The final two columns in the table, the figures of merit, are an indulgence we could not deny ourselves. In some cases, as in pattern generation, the results so completely tally with our own experience that it gives us some confidence in the technique as a useful method of comparison.

TABLE I

Comparison of the Various Processes of Microlithography Based on Characteristics of the Processes

Process step              Resolution    Registration        Dimensional       Cosmetic          Machine throughput        Figure of merit (sum of normalized numbers)
                          (μm)          (μm)                control (±μm)     defects/cm²       (unit/hr)                 Quality    Overall

Pattern generation
  Reduction cameras       N/Aᵃ          1.5 (50)ᵇ           0.2 (6.7)         N/A               5k exposures (80.0)ᶜ      56.7       136.7
  Pattern generation      N/A           0.10 (3.3)          0.08 (2.7)        N/A               35k exposures (11.1)      6.0        17.1
  Electron beam           N/A           0.03 (1.0)          0.03 (1.0)        N/A               400k exposures (1)        2.0        3.0

Photomask fabrication
  Emulsion print          5.0 (5.0)     1.5 (11.5)          0.50 (3.3)        1.70 (11.3)       30 plates (1.0)           31.1       32.1
  Hard surface print      3.0 (3.0)     1.5 (11.5)          0.25 (1.7)        0.90 (6.0)        25 plates (1.2)           22.2       23.4
  Direct-stepped mask     1.2 (1.2)     0.5 (3.8)           0.15 (1.0)        0.15 (1.0)        15 plates (2.0)           7.0        8.2
  Electron-beam mask      1.0 (1)       0.13 (1)            0.15 (1.0)        0.30 (2.0)        1 plate (30)              5.0        35.0

Wafer resist patterning
  Contact printing        2.0 (4.0)     1.5 + X (11.5)ᵈ     0.25 + X (1.7)    0.8 + X (8.9)     60 wafers (1.0)           26.1       27.1
  Proximity printing      3.5 (7.0)     0.8 + X (6.2)       0.40 + X (2.7)    0.4 + X (4.4)     60 wafers (1.0)           20.4       21.4
  Projection printing     2.0 (4.0)     0.5 + X (3.8)       0.25 + X (1.7)    0.18 + X (2.0)    45 wafers (1.3)           11.5       12.8
  Direct step (10×)       1.2 (2.4)     0.40 (3.1)          0.30 (2.0)        0.09 (1.0)        26 wafers (2.3)           8.5        10.8
  Direct step (1:1)       1.5 (3.0)     0.5 (3.8)           0.30 (2.0)        0.09 (1.0)        35 wafers (1.7)           9.8        11.5
  E-beam writing          0.8 (1.6)     0.13 (1.0)          0.15 (1.0)        0.30 (3.3)        1 wafer (60.0)            6.9        66.9
  X-ray (1:1) full wafer  0.5 (1.0)     0.8 + X (6.2)       0.20 + X (1.3)    0.50 + X (3.3)    20 wafers (3.0)           11.8       14.8
  X-ray (1:1) stepper     0.5 (1.0)     0.40 (4.0)          0.20 (1.3)        0.30 (3.3)        3 wafers (20)             9.6        29.6

ᵃ N/A indicates not available.
ᵇ The numbers in parentheses are normalized to the smallest number in the category.
ᶜ The throughput figures are normalized to the reciprocal of the largest number in the category.
ᵈ The X represents the contribution of the mask to this particular category. The normalized figures are for the nonmask contribution only.

The remainder of the chapter will consist of our description of the techniques and an elucidation of the categorical values associated with them.

II. PATTERN GENERATION

A. Reduction Cameras

The most logical technique for producing a reduction of any image is to take a picture of it. This photographic technique was initially used to perform reductions. The device was drawn at a magnified scale. The photograph was taken at a reduction that would yield the image size desired on the wafer. The final product of the reduction camera was a reticle that was a 10× image of the pattern as it would appear on the wafer. The set-up for performing this operation is simply a copy board for holding the drawing, a film holder for the reticle, and a lens system.

The "drawing" was rendered in a two-layer plastic film, Rubylith™. The patterns were cut in the top layer of the plastic film, and the field areas were peeled from the film. The resultant image was a red-on-transparent film, which in some cases measured up to approximately 6 ft by 6 ft. Since initially the cutting was done by hand, this technique allowed a great deal of flexibility. Unfortunately, it also allowed a great deal of error. As digitizing techniques were developed, they were applied to graphics and then to the cutting process. A knife replaced the pen on the plotter. The movements of either knife or pen were controlled by a magnetic plot tape, which greatly enhanced the speed and accuracy of the ruby cutting.

The problems encountered in this technique revolved around the flexibility of the plastic film. The parallelism between the copy board and the reticle plate could be best controlled by permanently fixing the two in position. This fixing meant that the plastic film had to be lifted onto the vertical copy board and then held in place either electrostatically or by adhesive tape. Lifting the Rubylith™ usually caused it to stretch into a nonrectangular shape, which meant that the die suffered a similar shear distortion. A distortion at the pattern edges of ±1.5 μm was considered very good on the final 1× (wafer) images, implying that the plastic image had distorted by about 3750 μm (0.15 in.). This distortion is about 0.2% overall stretching of the film under its own weight. Such performance assumes freshly prepared Rubylith™ patterns and constant temperature and humidity conditions. Storage and handling of these film sheets imposed additional problems. The cutters had a precision of about ±500 μm (0.020 in.), which leads to a control of the dimensions of the individual geometries of ±0.2 μm on the wafer resulting from the Rubylith™.
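The scaling in the preceding paragraph can be checked with a short calculation. The Python sketch below is illustrative only: the overall artwork-to-wafer reduction is not stated in the text and is inferred here from the quoted pair of 3750 μm on the film and 1.5 μm on the wafer, and all names are hypothetical.

```python
# Illustrative check of the Rubylith scaling figures quoted in the text.
# Assumption: the overall linear reduction from artwork to the 1x wafer image
# is inferred from the quoted pair (3750 um on film -> 1.5 um on wafer).
ARTWORK_DISTORTION_UM = 3750.0
WAFER_DISTORTION_UM = 1.5
REDUCTION = ARTWORK_DISTORTION_UM / WAFER_DISTORTION_UM  # inferred, not stated

UM_PER_INCH = 25400.0
FILM_SIDE_UM = 6 * 12 * UM_PER_INCH  # a 6-ft film edge in micrometers


def artwork_to_wafer_um(artwork_um: float, reduction: float = REDUCTION) -> float:
    """Scale a dimension or error on the artwork down to the 1x wafer image."""
    return artwork_um / reduction


stretch_fraction = ARTWORK_DISTORTION_UM / FILM_SIDE_UM
cutter_error_on_wafer_um = artwork_to_wafer_um(500.0)

print(f"inferred overall reduction: {REDUCTION:.0f}x")
print(f"film stretch: {100 * stretch_fraction:.2f}% of a 6-ft edge")
print(f"cutter precision at the wafer: +/-{cutter_error_on_wafer_um:.2f} um")
```

Under this assumption the ±500-μm cutter precision maps to ±0.2 μm at the wafer and the stretch is about 0.2% of a 6-ft edge, in line with the figures quoted above.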


The dies that were rendered using this technique were in some cases large and, for that time, complex. We estimate from levels that have been digitized from the Mylar Rubylith™ patterns that about 30,000 exposures on a contemporary optical pattern generator would characterize them. The reticle preparation time, from hanging the pattern through plate exposure and processing, is estimated at about 6 hr, or the better part of a day's work. If the ruby cutting were included, this time would be about doubled. An equivalent rate of approximately 5,000 exposures/hr is in the range of what could be expected of this technique.

The most usual problems encountered with this technique were those of pattern orthogonality and missing or extra geometry, since occasionally pattern geometries did not adhere to the clear plastic as the colored field area was peeled from it.

B. Optical-Pattern Generators (OPG)

In about 1970 the tape-driven optical-pattern generators began to be commercially available for producing the 10× reticles. These OPGs are in essence very high precision numerically operated machines, not too far removed in mechanics from the punch-tape-controlled milling machines used in sophisticated machine shops. The optics and the precision, however, make these machines a significant technical achievement.

In order to use the OPGs, the polygon patterns need to be decomposed into primitive rectangles. In software parlance this step is referred to as cracking or fracturing the patterns or polygons. An example of one sort of fracture is given in Fig. 3. This rather simple polygon was fractured into the five rectangles shown in the figure. The problem of imaging a polygon is then reduced to imaging a rectangle having a variable height, width, X and Y center coordinates, and any angle relative to the coordinate axis. These parameters are, respectively, H, W, X, Y, and θ.

Fig. 3. (a) The polygon before fracture into rectangles; (b) the fractured polygon.

The rectangles are described to the pattern generator by these five parameters. The pattern generator itself has a set of variable apertures that work together to define H and W. These apertures are mounted on a rotatable head, which can adjust to define θ. The rectangle thus defined is then illuminated and imaged through a 10:1 reduction lens onto a photosensitive emulsion or photoresist film that is positioned under the optical axis by precision X-Y stages. These stages move to set X and Y, which are the center coordinates of the rectangle defined by the variable apertures. These elements of the optical-pattern generator are shown in Fig. 4.

Fig. 4. The optical pattern generator for reticle production. (Labels: 10/1 reduction lens; X axis.)

The precision of the variable apertures is specified as ±7.5 μm (±0.3 mil), which means the reticle or 10× geometries have a ±0.75-μm size precision after 10× reduction. After the final (10×) reduction to achieve 1× images on wafer or photoplate, the image precision becomes ±0.075 μm. This number reflects the critical dimension error, which can be traced to the pattern generator.

The OPG registration error cited in Table I reflects a combination of the X, Y location precision and the variable aperture precision. Since the X, Y table is under laser control, the actual location can be determined to ±0.025 μm. However, knowing a position and causing stage movement to travel to that position involves the precision of stage control. An error circle for stage placement is used. This circle can be enlarged or shrunk as position precision and machine throughput are interchanged. In its tightest setting, the circle has a diameter of 0.12 μm (or ±0.06 μm). The root mean square (rms) addition of these two sources of errors leads to about ±0.10 μm, which is the registration error cited in Table I.

The machine throughput depends on whether the imaging medium is a silver bromide emulsion or a photoresist-sensitized chrome-plated glass substrate. The 35,000 exposures/hr exposure rate is an average number for emulsion. If the rectangles are presented to the OPG in an efficient manner, up to 50,000 exposures/hr can be achieved. In this context efficient means as little angular and variable aperture adjustment as possible. If the design uses a lot of angles and a wide spectrum of rectangle sizes, these rates may drop as low as 25,000/hr. If the reticle is imaged in photoresist rather than emulsion, the exposure time per rectangle must increase, as this material is less sensitive. The rates achieved on photoresist are about 15-20% of those achieved on emulsion.

A high complexity photomasking level such as a 64,000-bit random access memory (64k RAM) metal removal level may contain 300,000 exposures. Generation times in emulsion of 8-10 hr on complex levels are not considered unusual.

C. Electron-Beam Pattern Generation

The dual development of tight design rules and very large dies has combined to make the use of the OPG an increasingly marginal proposition. The very large flash count encountered in these levels forces the use of emulsion as the imaging medium in order to achieve acceptable throughput on the OPG. The emulsion raw stock, however, contains about 1.0 defect/cm². As the die size increases, the probability of getting an acceptable reticle falls, and multiple reruns usually have to be made. If only exposure rate is considered, chrome and emulsion break even at an emulsion yield of about 20%. The long run times, besides limiting machine capacity, tend to increase the chance of encountering a machine operating error such as the repeated failure to place an exposure within the tolerance circle.

A better medium was needed for the new designs, and a rapid patterning technique was necessary to make this medium practical. The electron-beam exposure system (EBES) is an almost ideal machine for this purpose [3]. The chrome blank is the medium.

As the EBES was originally developed by Bell Laboratories, it was configured to image patterns at final size (or 1×). As such it represents a great shortening of the usual process. By use of the EBES, the step-and-repeat master can be built directly from the magnetic tape, so the OPG and step-and-repeat operations are bypassed. The throughput of the EBES for 1× master plate generation is low compared to the step-and-repeat cameras (about 7%). The EBES has not found its primary mission at 1×. If, however, a reticle could be generated in the same time the master plate is generated, the OPG problem mentioned above could be circumvented.

Redirecting the EBES to this use involved a software problem: the data were until recently presented to the EBES as a computer pattern file composed of address units or dots that had a 0.25-μm diameter. A 5 mm × 4 mm (200 mil × 160 mil) die would contain 3.2 × 10⁸ of these address units, which must either be exposed or left unexposed. These units are addressed at a 20-MHz fill rate, so the address units in the die could be filled in 16 sec plus some allowance for software overhead. A 100-mm-diameter circular array of these circuits contains about 400 dies, so the electron-beam write time is about 107 min to produce a 1× master.

Not previously mentioned was that the die was written from information in a computer pattern memory that originally contained 8,388,608 binary bits. (This is often called a 1 megabyte memory where a byte comprises 8 binary bits.) The original machines wrote in stripes of 512 units height and 16,384 units length, where the units are the address units or AU. These address units are assigned the electron spot-size dimension, which can be varied. The die being considered in the example would be written in 40 steps 0.125 mm high and up to 4.096 mm long. The fill time for the pattern buffer for each stripe may be as much as 10 sec, but this time penalty is incurred only once per stripe. In the 1× writing scheme, this same set of data is written 400 times before the array is completed. Then a new set is written 400 times, and so on, until the pattern is completed. The software overhead time for the complete array was about 6.7 min total, so the total machine time would be 113.7 min.

For 10× writing all of these considerations change. In writing a reticle the stripe data are of use only once, so the computer pattern fill time must be spent on each stripe. If, for example, the 1× array we discussed were considered as a 10× pattern, the machine time per stripe would be 10.419 sec. There were 16,000 stripes in the array, so the machine time on the entire pattern would be 46.3 hr unless something changed.

Later changes in the hardware and the software include a 1-μm beam, which gives a 16× increase in the rate of area coverage. The stripe size was enlarged by doubling the computer pattern memory to 2 megabytes, which doubled the number of address units (AU) in a stripe. This change allowed the overhead penalty to be incurred less frequently. With these enhancements alone, let us consider a 90 mm × 90 mm reticle (Table II). This rather simplistic analysis gets an estimate of the current writing times involved that is within a factor of less than two of the times actually observed.

Other enhancements have attacked the software overhead times. These enhancements include a more rapid core fill routine and a "look ahead, skip on zero" routine that allows the electron beam to skip data rows and stripes that are empty or not written. All of these speed enhancements create demand for more and more computer capacity, and that is where the big additions to the system have occurred. Fortunately, all of this change happened in a time of falling computer prices, so the expense was not prohibitive.
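The write-time arithmetic for the 1× array and its 10× counterpart can be retraced in a few lines. The following Python sketch uses only the numbers quoted in the text; the variable and function names are hypothetical, and the result is an estimate of the same simplistic kind as the analysis itself.

```python
# Illustrative arithmetic for the 1x and 10x EBES write-time examples.
# All numeric inputs are the values quoted in the text; names are hypothetical.
FILL_RATE_HZ = 20e6          # address units written per second
ADDRESS_UNIT_UM = 0.25       # address-unit (spot) size
STRIPE_AU = 512 * 16_384     # address units held in the original pattern memory
STRIPE_FILL_S = 10.0         # worst-case pattern-buffer fill time per stripe
DIES_PER_ARRAY = 400
STRIPES_PER_DIE = 40


def address_units(width_mm: float, height_mm: float, au_um: float) -> int:
    """Number of address units needed to tile a width x height rectangle."""
    return round(width_mm * 1000 / au_um) * round(height_mm * 1000 / au_um)


au_per_die = address_units(5.0, 4.0, ADDRESS_UNIT_UM)
die_write_s = au_per_die / FILL_RATE_HZ

# 1x master: each stripe's data is loaded once and reused for every die.
beam_time_min = die_write_s * DIES_PER_ARRAY / 60
overhead_min = STRIPES_PER_DIE * STRIPE_FILL_S / 60
total_1x_min = beam_time_min + overhead_min

# 10x reticle: every stripe is unique, so the fill time is paid on each stripe.
stripe_write_s = STRIPE_AU / FILL_RATE_HZ
stripes_in_array = STRIPES_PER_DIE * DIES_PER_ARRAY
total_10x_hr = stripes_in_array * (stripe_write_s + STRIPE_FILL_S) / 3600

print(f"address units per die: {au_per_die:.2e}")
print(f"1x master: {beam_time_min:.1f} min beam + {overhead_min:.1f} min overhead")
print(f"10x reticle: {stripe_write_s + STRIPE_FILL_S:.3f} s/stripe, {total_10x_hr:.1f} hr total")
```

The 1× total computed this way agrees with the 113.7 min quoted above to within rounding of the beam time, and the 10× case shows why the per-stripe fill time dominates once the stripe data cannot be reused.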


TABLE II

                        Old scheme (20 MHz)    New scheme (20 MHz)
Number of stripes       15,488                 528
AU/stripe               8.388 × 10⁶            16.777 × 10⁶
Sec/stripe              0.419                  0.838
Overhead sec/stripe     10                     10
Machine time/reticle    45.8 hr                1.59 hr
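The stripe counts in Table II can be approximated by tiling the 90 mm × 90 mm reticle with stripes. The Python sketch below is a rough reconstruction under stated assumptions: whole stripes are counted in each direction (partial stripes at the edges are rounded up), the old scheme uses a 0.25-μm address unit with 512 × 16,384 AU stripes, and the new scheme uses a 1-μm address unit with 512 × 32,768 AU stripes. The function name is hypothetical.

```python
import math

FILL_RATE_HZ = 20e6      # address-unit rate quoted for both schemes
OVERHEAD_S = 10.0        # pattern-memory fill time per stripe


def reticle_write_estimate(side_mm, au_um, stripe_height_au, stripe_length_au):
    """Return (stripes, beam seconds per stripe, total hours).

    Assumption: the square reticle is tiled with whole stripes in each
    direction, so partial stripes at the edges are rounded up.
    """
    stripe_height_mm = stripe_height_au * au_um / 1000
    stripe_length_mm = stripe_length_au * au_um / 1000
    stripes = (math.ceil(side_mm / stripe_height_mm)
               * math.ceil(side_mm / stripe_length_mm))
    beam_s = stripe_height_au * stripe_length_au / FILL_RATE_HZ
    hours = stripes * (beam_s + OVERHEAD_S) / 3600
    return stripes, beam_s, hours


old = reticle_write_estimate(90.0, 0.25, 512, 16_384)
new = reticle_write_estimate(90.0, 1.0, 512, 32_768)

for label, (stripes, beam_s, hours) in (("old", old), ("new", new)):
    print(f"{label}: {stripes} stripes, {beam_s:.3f} s/stripe, {hours:.2f} hr")
```

Under these assumptions the stripe counts come out at 15,488 and 528, and the machine times at roughly 45 hr and 1.59 hr, which is close to the tabulated values.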

The electron beam was designed to be a very precise 1× machine, so the precision for 10× writing is outstanding. The EBES writing scheme needs to be briefly explained in order to discuss the precision specifications of the machine. In Fig. 5 the EBES writing is shown. The sensitized raw stock is moved uniformly in X under the electron beam. The beam itself is driven over a 0.512-mm maximum scan in the Y direction. The width of the scan is always one e-beam spot width. The scans lie contiguously on the plate. The maximum number of scans that can be held in pattern memory at a time is 32,768. The rectangle that is defined by this pattern memory is then 512 AU high and 32,768 AU long. It is called a stripe. The entire pattern is composed of a mosaic of these stripes. The smoothness of the stripe abutment as well as their size uniformity contribute to the precision of the pattern composed by the stripes.

Fig. 5. The work stage of the e-beam. The electron beam scans in the Y axis as the stage traverses under the column in the X axis. (Courtesy of Perkin-Elmer ETEC.) (Labels: table moves back and forth along X axis with constant velocity of 2 cm/sec; last element on e-beam column; beam sweeps back and forth along Y axis; interferometer reflectors for table; work stage.)

Several components of the machine's precision can be seen in this example. The direction of the scan relative to the pattern edge strongly influences the roughness of that edge. A pattern edge that is parallel to the scan direction is written with a sweep of the beam and is very smooth. The edges that are perpendicular to the scan direction will show a granularity that is dependent on the electron spot size being used. The larger the spot being used, the larger will be the edge roughness experienced. The number specified by the manufacturers of EBES-type machines is ±40% of the spot size. For reticle making, the 1-μm spot size results in a roughness of ±0.4 μm on the 10× reticle. After the 10/1 reduction step, this roughness might become 3σ < ±0.04 μm. In practice this worst case of roughness is not experienced, as the reduction lens will not resolve images separated by less than 1.0 μm. (The useful resolution is 1.25 μm for the reduction lens.)

Fig. 6. The roughness resulting from an edge being perpendicular to the scan direction. (Labels: pattern edge; scan direction.)

Another and more serious form of edge roughness occurs when edges are formed at an angle to the scan direction. This form of roughness results from so-called raster effects, which arise from the fact that an angled line bears different spatial relationships to the different spots in the matrix making up the dot pattern. This relationship is illustrated in Fig. 7 for several angled lines with pitch varying from 10/3 to 10/6 (pitch = ΔY/ΔX). We see in this set of examples that the worst case excursion of a point from the angled line always occurs when a point falls on the line and therefore is one-half the spot size. In this case this distance is 0.5 μm. The different angles simply cause the jogs to be more or less widely spaced; the excursion of the jogs is always the same. When the pitch becomes ΔY/ΔX = 1 (or 45°), the angled line again always bears the same relationship to each spot in the dot matrix (i.e., it intersects it). The separation of the 1.0-μm spots is 1.414 μm, so partial resolution of these address units could occur even after the 10/1 reduction step. In practice we have found the 45°-angled lines to be nearly as smooth as the line formed at 90° to the scan direction.

Fig. 7. Raster effects on edge roughness in EBES writing.

Figure 8 shows a plot of a device pattern for lines of varying angle. Here the pattern was simulated at 1× with a 0.5-μm spot size (a) and at 10× with a 1-μm spot (b). The writing algorithm is such that the number of AU in X is maintained constant as the jogs are traversed. This rule results in a variation in the feature width as it is measured in the direction normal to the angled lines. At the extreme case this variation could be one AU or 1.0 μm at 10×. After 10/1 reduction, this variation would be 0.1 μm.

All of the preceding assumes perfectly placed AU and completely consistent processing. If these conditions are obtained, the dimensional variations are summarized in Table III.

If a design calls for "other" angles, the spot size must be reduced to maintain critical dimension (CD) control. The spot size then becomes simply a function of the CD control desired. If a reticle is written at 0.25 μm rather than 1.0 μm, the write time is increased, as was discussed previously. In some cases the "other"-angled features can be segregated, and the reticle can be written as two jobs: one at 1-μm and a second at 0.25-μm spot size. In this way the fine control is only maintained where it is needed, and the writing time is not needlessly increased.

Other errors that can impact CD control involve the failure of the two above-mentioned assumptions, i.e., improperly placed address units and inconsistent processing. The improper placing of address units falls under the specification of stripe abutment and electron-beam drift. The stripe abutment error arises from the inability to exactly align the stage motions to the beam motions. The maximum error to be expected from this cause is 0.125 μm. Figure 9 illustrates this error. On angled features this error can result in a 0.177-μm (10×) variation in the dimension from that which was contained in the data. The 90° feature suffered a 0.250-μm change at 10×.

Fig. 8a. Plots that simulate raster effects for angled lines. This shows a 1× pattern written with a 0.5-μm spot.

Electron-beam drift is typically about 0.0625 μm in the 5-min intervals between reregistrations of the beam. After the 10/1 reduction step to final size, this error is 0.006 μm. The final causes of variations (the resist thickness, processing variations, and exposure effects) are more difficult to define in a quantitative manner.
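Each of the reticle-level errors discussed so far is reduced tenfold when the reticle is imaged at final size. As a simple illustration, the Python sketch below applies that scaling to the 10× figures quoted in the text. It assumes an ideal 10:1 reduction that contributes no error of its own, which is a simplification, and the dictionary keys are descriptive labels chosen here.

```python
# Illustrative only: scale 10x reticle-level errors quoted in the text to the
# 1x image, assuming an ideal 10:1 reduction that adds no error of its own.
REDUCTION = 10.0

reticle_errors_um = {
    "edge roughness, 1.0-um spot": 0.4,
    "raster width variation, one AU": 1.0,
    "stripe abutment, angled feature": 0.177,
    "stripe abutment, 90-degree feature": 0.250,
    "beam drift per 5-min interval": 0.0625,
}

wafer_errors_um = {name: err / REDUCTION for name, err in reticle_errors_um.items()}

for name, err in wafer_errors_um.items():
    print(f"{name}: {reticle_errors_um[name]:.4f} um at 10x -> {err:.4f} um at 1x")
```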

Fig. 8b. Plots that simulate raster effects for angled lines. This shows a 10× reticle pattern written with a 1.0-μm spot.

TABLE III

Angles           10× CD variations      1× CD variations
0°, 90°, 45°     ±0.4 (spot size)       Nondetectable
Other            ±0.5 (spot size)       ±0.05 (spot size)

The electron dose is sized for some nominal resist thickness. As the resist varies from that thickness, the dose becomes inappropriate and dimensional variations can result. For resists of different sensitivity, this effect is more or less serious as the sensitivity is increased or decreased. This subject has been treated in the literature [4]. A summarization of these various contributions to CD variation is contained in Table IV. From this table it is obvious that when they are used, the "other" angled lines are the single most significant problem in CD control. When the design is constrained to 0°, 45°, and 90° features, the CD error is reduced by a factor of four.

If the design avoids the "other" angled features, then the largest contributor to CD variation is the stripe abutment variation. The edge roughness of 40% of spot size is more significant at 1× than it is for 10× reticles. For this reason in the accounting of Table IV, that item was not included as a contributor to CD variation.

A great many of the items that contribute to CD variation also influence registration, but registration variation is also influenced by factors that do not impact CD. For this discussion, consider a rectangular geometry. The rectangle can be defined by the parameters H, W, X, Y, and θ, which were mentioned earlier in connection with the OPG. When considering CD variation, the geometry center placement was not considered, only the feature size (H and W). For registration feature placement, X, Y, and θ are the important parameters. By registration is meant the ability to place a geometry center repeatedly within a specified ΔX and ΔY.

Fig. 9. Geometry and CD distortions arising from stripe abutment error.

TABLE IV

Factors Contributing to Critical Dimension Variation for Plates Exposed on an EBES System

Parameter (P)                                               ΔCD/ΔP                  Typical ΔP                    ΔCD contribution   Comments
(1) Resist thickness                                        1 μm/μm                 0.015 μm                      0.015 μm           Negative resists, e.g., OEBR 100
(2) Beam current                                            (a) 0.027 μm/nA         0.2 nA                        0.005 μm           OEBR 100
                                                            (b) 0.06 μm/nA          0.2 nA                        0.012 μm           COP
(3) Dark reaction                                           0.25 μm/decade (min)    Exposure time = cure time     0.05 μm            All negative resists
(4) Bake temperature                                        0.006/°C                ±5°C                          0.06 μm            Prebake at 80-90°C
(5) Process RMS subtotal, items (1), (2b), (3), (4)                                                               0.08 μm            COP resist used
(6) Edge roughness, 0°, 90°, 45° features                   40% × spot size (SS)    SS = 1.0 μm                   0.40 μm            1.0-μm spot size assumed; not resolved in usual case
(7) Edge roughness, other angled features (raster effects)  (50% SS) × 2            SS = 1.0 μm                   1.0 μm             1-μm spot size used; worst case: 0.5-μm jogs exposed across geometry
(8) Stripe abutment                                         2 × abutment error      Maximum error = 0.125 μm      0.250 μm
(9) E-beam drift                                            0.0625 μm/5 min         Register at 5-min intervals   0.0625 μm
(10) RMS sum of errors                                                                                            1.04 μm            Items (5), (7), (8), (9)
                                                                                                                  0.27 μm            Items (5), (8), (9)

Ronald C. Bracken and Syed A. Rizvi

272
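The bookkeeping behind Table IV can be sketched in a few lines: each contribution is a sensitivity (ΔCD/ΔP) multiplied by a typical excursion (ΔP), and independent contributions are combined as a root-sum-square. The snippet below is only an illustrative check, not the authors' procedure. The function and variable names are hypothetical, and reading the ±5°C bake tolerance as a 10°C total range is an assumption, chosen because it reproduces the tabulated 0.06 μm.

```python
import math


def rss(*terms_um):
    """Root-sum-square combination of independent error terms (all in um)."""
    return math.sqrt(sum(t * t for t in terms_um))


# Contribution = sensitivity (dCD/dP) x typical excursion (dP), values from Table IV.
resist_thickness = 1.0 * 0.015    # 1 um/um x 0.015 um
beam_current_cop = 0.06 * 0.2     # item (2b): 0.06 um/nA x 0.2 nA
dark_reaction = 0.05              # um, taken directly from the table
bake_temperature = 0.006 * 10.0   # 0.006 um/degC x 10 degC (assumed full range of +/-5 degC)

# Item (5): process subtotal over items (1), (2b), (3), (4).
process_subtotal = rss(resist_thickness, beam_current_cop, dark_reaction, bake_temperature)

# Pattern-related contributions for a 1.0-um spot size, from the table.
other_angle_roughness = 1.0   # item (7)
stripe_abutment = 0.250       # item (8)
e_beam = 0.0625               # item (9)

total_with_other_angles = rss(process_subtotal, other_angle_roughness, stripe_abutment, e_beam)
total_orthogonal_only = rss(process_subtotal, stripe_abutment, e_beam)

print(f"process subtotal:       {process_subtotal:.2f} um")
print(f"with 'other' angles:    {total_with_other_angles:.2f} um")
print(f"0/45/90 degree designs: {total_orthogonal_only:.2f} um")
```

Under these assumptions the three results round to 0.08, 1.04, and 0.27 μm, which agrees with items (5) and (10) of the table and makes the dominance of the "other" angled features easy to see.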

Registration precision refers to being able to repeatedly place a geometry at or near a site regardless of whether that was the intended site. Registration accuracy is the ability to place a geometry at the intended site. If only one machine is used and the precision is stable, accuracy is of little interest. When multiple machines are used interchangeably, they must be made to match. Matching involves measuring an artifact produced on a "standard" machine and making adjustments to the second machine to get it to correspond to the standard. If the artifact was produced using an accurate machine, the adjusted machine will become accurate.

The items that impact registration are those that affect beam placement. There is also the temperature change, which might be expected of a plate or the machine from one hour or day to the next, and the effect of that change on the plate material. These contributions are shown in Table V below for a 4-in. span length. This use of abutment error as a contributor to misregistration will come into play only if the scan direction is adjusted between writing the two levels in the registration test. If no scan adjustment occurs, the abutment (or nonabutment) of the stripes will not change, and registration error from this source becomes negligible. In Table I the number used (0.3 μm) assumes abutment error is included and low-expansion glass is used. This material is the commonly used substrate in e-beam mask making.

One potential advantage of the e-beam that in some cases is extremely important is the uniform exposure given the pattern. By way of illustration, consider the geometries shown in Fig. 10 where an OPG decomposition is illustrated. The area in the center of the upper polygon is double exposed. Since feature size varies with exposure on emulsion, dimension D will become smaller in the area where the A, B double exposure occurs.

TABLE V

Parameter                                     Expected change of parameter    Registration change (μm)
Temperature
  Soda lime glass                             ±0.25°C                         0.45
  Low expansion glass                         ±0.25°C                         0.18
  Quartz glass                                ±0.25°C                         0.03
Abutment                                                                      0.25
Beam drift                                                                    0.06
Max registration variation (rms sum of above variations)
  Soda lime glass                                                             0.52
  Low expansion glass                                                         0.31
  Quartz glass                                                                0.26
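The temperature rows of Table V can be approximated from linear thermal expansion over the 4-in. span, and the last three rows from a root-sum-square of the thermal, abutment, and beam-drift terms. The sketch below is a rough check rather than the authors' calculation. It assumes that a ±0.25°C change corresponds to a 0.5°C total excursion, and it assumes expansion coefficients of 90, 37, and 5 × 10^-7 per °C, the values the chapter tabulates later for soda lime, low-expansion, and quartz glass. All names are hypothetical.

```python
import math

SPAN_UM = 4 * 25.4 * 1000.0   # 4-in. span expressed in micrometres
DELTA_T_DEGC = 0.5            # assumed full excursion of a +/-0.25 degC change

# Expansion coefficients per degC (assumed values; see the lead-in).
CTE_PER_DEGC = {
    "soda lime": 90e-7,
    "low expansion": 37e-7,
    "quartz": 5e-7,
}

ABUTMENT_UM = 0.25     # from Table V
BEAM_DRIFT_UM = 0.06   # from Table V


def thermal_shift_um(glass):
    """Length change of the 4-in. span for the assumed temperature excursion."""
    return CTE_PER_DEGC[glass] * DELTA_T_DEGC * SPAN_UM


def max_registration_um(glass):
    """Root-sum-square of thermal, abutment, and beam-drift contributions."""
    return math.sqrt(
        thermal_shift_um(glass) ** 2 + ABUTMENT_UM ** 2 + BEAM_DRIFT_UM ** 2
    )


for glass in CTE_PER_DEGC:
    print(f"{glass:14s} thermal {thermal_shift_um(glass):.3f} um, "
          f"rms total {max_registration_um(glass):.3f} um")
```

With these inputs the thermal terms and the totals land within about 0.01 μm of the tabulated values; the small differences appear to come from rounding in the table. The calculation also shows why the choice of glass matters little once quartz is used: the abutment term then dominates the total.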


Fig. 10. The double exposure regions that result from optical pattern generation (OPG) rectangle decomposition. (Panels: polygon; OPG decomposition.)

The e-beam writing will not double expose because the pattern is composed of address units that are either turned off or on. They cannot be turned on twice, so double exposure will not occur. The double exposure problem is especially significant with the extremely fine geometries that routinely occur on high density devices now being designed.

D. Future Trends in Pattern Generation

The defect density argument for chrome plates, as compared to emulsion as a raw stock for pattern generation, will continue to hold true. The raw material will be chromium-coated plates, and the sensitive medium will be photoresist. The cost per exposure on current machines favors the EBES. This cost is shown below in Table VI.

TABLE VI
Cost per Exposure for Pattern Generation on Chrome Plates and Emulsion Plates (a)

Machine configuration (b)    Machine cost (k$)    Exposures/hr (k)    Cost/exposure ($ × 10^-3)
OPG (CR)                     400                  6.0                 2.02
OPG (Emul)                   400                  40.0                0.303
EBES (SA)                    2000                 400.0               0.252
EBES (+ MINICON)             2500                 400.0               0.189
REBES (Proposed)             1200                 800.0               0.045

(a) Assumption: The OPG will be assumed to run 3 shifts/day, 6 days/week, and 51 weeks/year.
(b) CR indicates chrome; Emul, emulsion.
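Most entries of Table VI can be approximated from the machine cost and throughput together with the schedule in footnote (a) and the depreciation and uptime figures used for the comparison (5-year straight-line depreciation, 90% uptime). The sketch below assumes that depreciation is the only cost counted and that the machine is fully loaded; the function name and the dictionary layout are hypothetical.

```python
HOURS_PER_YEAR = 24 * 6 * 51   # 3 shifts/day, 6 days/week, 51 weeks/year
UPTIME = 0.90                  # 90% uptime and availability
DEPRECIATION_YEARS = 5         # straight-line depreciation


def cost_per_exposure(machine_cost_dollars, exposures_per_hour):
    """Depreciation cost per exposure for a fully loaded machine, in dollars."""
    annual_cost = machine_cost_dollars / DEPRECIATION_YEARS
    productive_hours = HOURS_PER_YEAR * UPTIME
    return annual_cost / (productive_hours * exposures_per_hour)


# (machine cost in dollars, exposures per hour), transcribed from Table VI.
machines = {
    "OPG (CR)": (400e3, 6.0e3),
    "OPG (Emul)": (400e3, 40.0e3),
    "EBES (+ MINICON)": (2500e3, 400.0e3),
    "REBES (Proposed)": (1200e3, 800.0e3),
}

for name, (cost, rate) in machines.items():
    print(f"{name:18s} {cost_per_exposure(cost, rate) * 1e3:.3f} x 10^-3 $/exposure")
```

The stand-alone EBES row is left out because it depends on how much of the schedule is given over to tape conversion. A plain two-thirds exposing fraction gives about 0.23 × 10^-3 dollars per exposure, somewhat below the tabulated 0.252 × 10^-3, so that row is not reproduced exactly by this simple model.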


The EBES will operate in two modes; the first is stand alone (SA), where one shift is dedicated to OPG tape conversion and two shifts to running exposures. In the other mode a second computer (+ MINICON) is available to do the conversion and allow the EBES to run exposures 3 shifts/day. In both OPG and EBES cases, a 5-year straight line depreciation schedule is used. For both machines, 90% uptime and availability are assumed. Such a comparison leaves out two significant factors: (1) For those numbers to apply, the machines must be fully loaded. If the demand is such that only an OPG is needed, the extra capacity of the EBES is wasted, and the cost per exposure on the EBES goes up in proportion to that waste. (2) The OPG (Emul) pattern yields are lower than EBES yields, and the cost per exposure must be adjusted to account for that fact.

Among future EBES developments will be attempts to reduce the asset cost of the EBES machines. One approach is to specialize the machine to make a "reticle-only" EBES. This change will allow large-beam writing to become the sole mode of operation with the consequent increase in pattern throughput and machine economy. It is estimated that the asset cost will be halved and the throughput doubled. This reticle-only machine is included in the table as the REBES.

Another development that must occur is the alleviation of the CD problem associated with the raster scan effects on "other" angled features. Since this CD error is proportional to the beam size used, the error can be decreased by using smaller beams for these angled features. This solution requires identification of the features and some method of handling them separately in a way that does not create more problems.

The use of a variable scan direction would eliminate the problem by allowing the beam to scan these "other" angled features as if they were 0° features. Such a technique would require beam compensation for lens distortion for arbitrary directions of the beam sweep. It is obvious that the fewer directions used, the less will be the software burden on the machine.

Shaped electron beams would also eliminate these raster-induced CD errors. This approach is being developed by some commercial equipment makers. The shaped beam would allow the formation of polygon edges from a straight line rather than a series of round spots. However, the real problem being addressed here is "other" angled lines. The rotation of the apertures in an e-beam system, while maintaining beam focus and uniformity, will not be an easy technical task. If only orthogonal features are allowed, then the shaped beam is addressing a contrived problem since sufficiently smooth edges are maintained on orthogonal edges by conventional scanning machines. All of the enhancements mentioned will require more and more investment in software and computers to make them effective.

The vector scan versions of the e-beam machines have the same problems to address as the raster scan machines. The vector scan technique is essentially a software routine that identifies the areas to be written per se. In the version offered by Philips, a large area (1.6 mm × 1.6 mm) is handled at one time. (Compare the EBES writing window coverage, which is 160 μm × 512 μm in the 1-μm mode.) Since in vector scan such a large area of the lens is being used, the software corrections for lens distortion must be much more elaborate. The beam still scans the geometries in conventional raster fashion in the currently offered vector machines, which means that the raster effects alluded to earlier will be present. Since the large area lens correction must be known and made, it would seem to be a short step to scanning on the edge direction whatever the angle of the edge. The current vector scan machines appear to be closer to solving the raster effect problems than the current raster machines. This route to the solution takes an already software-intensive technique and increases the software demands.

The trends in EBES pattern generation will be toward special purpose reticle machines that will utilize much more powerful computers and more sophisticated software support systems. Cost per exposure, which is already less than that for OPG machines, will be further reduced as the manufacturers drive toward lower asset costs.

III. PHOTOMASK FABRICATION

The photomask is the pattern of the reticle image at final size or the size as it will appear on the wafer. As such, the demands for resolution, registration, and CD control are approximately an order of magnitude greater than they were at the reticle stage. Historically, the microlithography of the photomask began the same way as did that of the reticle, in that a picture of the pattern was taken. The first photomasks were produced on film and the imaging medium was silver halide emulsion just as it is in black-and-white photography. The emulsion photoplate soon followed.

A. Emulsion Photomasks

The presently available emulsion photomask is composed of a layer of gelatin containing a finely dispersed powder of silver bromide. This gelatin is spread onto glass plates that are roughly 15 in. × 20 in. in dimension and range from 0.060 to 0.090 in. thick and up. The exact sheet size varies and is dependent on the size of the final plate to be cut from the larger sheets. The cosmetic quality of the raw plates then is influenced by the vendor's ability to produce clean gelatin, uniform AgBr powder, and an even and clean spread thickness. It is further influenced by the cleanliness of the scribe and breaking operation. Glass chips and fractured emulsion flakes at the plate edge are a chronic problem that the emulsion plate vendors must routinely face. Considering the implacable nature of the problem, some of the vendors have been remarkably successful.

On an average, the defect density to be found in and on emulsion plates is between 0.75 and 1.0 defect/cm². This defect density is further increased by the mask/master clamping during the printing process (0.3 defective dies/cm²). The plate processing tends to free glass chips and emulsion flakes at the plate edge to migrate inward to the patterned area (0.3 defective dies/cm²). These numbers were obtained under what were considered to be good conditions of manufacturing. Overall, somewhere between 1.5 and 2.0 defective dies/cm² are not considered unusual in a finished emulsion mask.

The control of geometry sizing in emulsion is a function not only of the printing and processing variability, but is significantly impacted by such background problems as plate age and the elapsed time between exposure and processing [5]. These are not insuperable problems at all, but they do require constant attention to the sensitivity of the plates being currently produced in order that an appropriate correction to the "expected" behavior can be applied. In Table VII some of the background and processing effects on linewidth control are noted.

A further consideration in emulsion processing is that as line density is increased, a proximity effect begins to be noticed (the Ross effect). As two emulsion lines approach each other, the scattered light in the "unexposed" area increases, and the line separation in these areas decreases. Above about 4-5 μm, this effect is not of consequence.

The final contributor to CD control is the emulsion edge. The emulsion coating thickness is about 4-6 μm. The edge in this material is a rather arbitrary point since the optical density of the geometry is not a step function change at the geometry edge.
Schappel found for several operators and different measuring machines that an average difference between operators was about 0.1 μm, and the standard deviation on their results was about ±0.15 μm. All of these measurements were on one artifact of one mask on each day measurements were made. This result speaks only to the ability to find the edge repeatedly, not to the variation of an actual edge. The variation of CD from plate to plate was found to be about 0.25-0.3 μm for a 2.5-μm line.

TABLE VII
Linewidth Control in Emulsion Processing

Background:
  Plate age, 0-3 months                0.5 μm
  Plate age, 3-6 months                0.2 μm
  Exposure processing                  0.1 μm/24 hr
Processing:
  Develop time                         0.1 μm/min
  Exposure time                        0.05 μm/sec
  Developer temperature                0.1 μm/°C
  Developer age (air oxidation)        0.05 μm/hr (after mixing)

The CD control noted in Table I (±0.5 μm) comprehends all of the above contributions as they might be commonly encountered. The emulsion resolution, 5.0 μm, is larger than the finest feature that can actually be resolved in well-controlled circumstances, but it avoids the proximity problems that exist for the finer line patterns.

Registration on emulsion plates is a combination of the error that exists in the master as well as the contribution of the printing process. The Ultratech printer, which is widely used in the industry, is very effective in controlling misregistration. One of the authors (Bracken) has measured the misregistration error between 4 in. × 4 in. × 0.060 in. emulsion plates and the chrome master used to imprint them. About 95% of the array variation from the master could be contained within ±1.0 μm of the master size. These results were obtained using a Leitz Linear 200 Comparator having a 3σ precision of 0.25 μm. A good set of production master masks can be held to within ±0.5 μm of a specification nominal; so a total variation of 1.5-μm registration error is expected.

The emulsion photoplate is currently the most heavily used mask in microlithography. Several factors make this true: The cost of the plate is typically about 10-15% of a similarly sized chrome plate; secondly, the nature of some (bipolar) circuits is such that mask wear is extremely high and the resolution of chrome is not needed. The wear rate in bipolar production is high due to the epitaxial silicon layer that is used in bipolar integrated circuit processing.

B. Contact-Printed Chrome Photomasks

The limits of emulsion photomasks are principally those of resolution and defect density. As higher resolution and lower defect density masks are required, a different substrate must be used. Several materials (iron oxide, silicon, and molybdenum) have been used in addition to chromium, but chromium has proven over the long run to be the material of choice. Several factors contribute to this popularity: its inertness to most cleaning solutions, a very fine grain size in the as-deposited film, the ready availability of high-purity moderately priced targets, the film adhesion to most glasses, and the adhesion of most photoresists to the chromium itself.

Chromium is coated onto a glass blank that is the final size of the mask that is delivered. The scribing problem that was so serious for emulsion quality can be addressed independent of the photoresist film quality problem. Plate edges are usually ground, and the corners rounded to remove glass flakes associated with the scribe-and-break operation. Chromium films of about 70.0 nm are deposited by evaporation or sputtering to achieve an optical density of about 3-4 (optical density = $\log(I_0/I_{\text{trans}})$). Photoresist films of 300-600 nm are spun on the plate, the resist is cured, and the plate is ready to use. Under conditions of "good" practice, it is not unusual to be able to produce blanks that will average 1-3 defects (1-μm sizing) in the entire wafer region of the plate. These defects include both those caused by the metal film and the photoresist.

The glass substrates that are currently available are soda lime (window glass), "low expansion" glasses, fused silica, and quartz plates. The characteristics of these glasses are summarized in Table VIII.

TABLE VIII
Characteristics of Currently Available Glasses

Glass type       Normalized price    Thermal coefficient of expansion (per °C)    UV transmission edge (nm)
Soda lime        1                   90 × 10^-7                                    300
Low expansion    8                   37 × 10^-7                                    300
Quartz           12                  5 × 10^-7                                     180

The need for these substrates arises from two factors. First, as wafer sizes increase, the device tolerances must be maintained. A temperature change between alignment machines becomes more and more of a problem as the wafer size increases. This trend can be seen in Table IX.

TABLE IX
Misregistration at Wafer Edge Caused by 1°C Temperature Difference in Two Sequential Masking Steps

                                              51 mm    76 mm    102 mm   127 mm
Soda lime
  Misregistration (μm)                        0.47     0.70     0.95     1.10
  Misalignment (μm)                           0.23     0.35     0.48     0.55
  % wafer area with >0.25-μm misalignment     0        51       73       79
Low expansion
  Misregistration (μm)                        0.19     0.28     0.38     0.47
  Misalignment (μm)                           0.10     0.14     0.19     0.24
  % wafer area with >0.125-μm misalignment    0        20       56       72
Quartz
  Misregistration (μm)                        0.03     0.04     0.05     0.06
  Misalignment (μm)                           0.015    0.02     0.025    0.03
  % wafer area with >0.125-μm misalignment    0        0        0        0

Devices that can tolerate 0.25-0.5 μm of misalignment (of which temperature can cause 0.25 μm) can probably utilize the low-expansion glasses. However, in any masking application, temperature differences arise and must be considered. They will not be completely resolved by any answer short of using quartz as the mask substrate.

Whatever substrate is employed, one of the most widely used of hard surface plates is the chrome contact print. Its characteristics will now be considered.

1. Registration of Chrome Contact Prints
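Before the mechanics of contact printing are taken up, the substrate comparison of Table IX can be made concrete with a short calculation. The sketch below rests on three assumptions that are inferred from the tabulated numbers rather than stated in the text: the edge misregistration is the differential expansion across the full wafer diameter, the misalignment is half of that (the error being split about the wafer center), and the misalignment grows linearly with radius, so the area fraction above a threshold follows from a ratio of radii squared. The function names are hypothetical.

```python
def edge_misregistration_um(cte_per_degc, delta_t_degc, wafer_mm):
    """Differential expansion across the wafer diameter, in micrometres."""
    return cte_per_degc * delta_t_degc * wafer_mm * 1000.0


def area_fraction_exceeding(threshold_um, edge_misalignment_um):
    """Fraction of wafer area whose misalignment exceeds the threshold.

    Assumes misalignment grows linearly with radius from zero at the center.
    """
    if edge_misalignment_um <= threshold_um:
        return 0.0
    return 1.0 - (threshold_um / edge_misalignment_um) ** 2


LOW_EXPANSION_CTE = 37e-7  # per degC, from Table VIII

for wafer_mm in (51, 76, 102, 127):
    misregistration = edge_misregistration_um(LOW_EXPANSION_CTE, 1.0, wafer_mm)
    misalignment = misregistration / 2.0   # assumed: error split about the wafer center
    fraction = area_fraction_exceeding(0.125, misalignment)
    print(f"{wafer_mm:3d} mm: {misregistration:.2f} um, "
          f"{misalignment:.2f} um, {100 * fraction:.0f}%")
```

For the low-expansion glass this simple model gives area fractions of roughly 0, 21, 56, and 72% for the four wafer sizes, close to the tabulated 0, 20, 56, and 72%. The soda lime rows are reproduced less exactly, which suggests a slightly different coefficient or rounding was used for that glass.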

Contact printing of chrome plates is a process that is extremely susceptible to oversimplification. In principle nothing could be simpler than to contact the master to the copy plate, expose the photoresist, and then develop and etch the pattern. In bringing the two plates together, care must be taken that there is no flexing of either member. Consider the impact of such flexure as shown in Fig. 11.

When the master plate is flexed, the lower surface is compressed with the result that its patterns are displaced inwardly. At the exposure, the distorted pattern is imprinted onto the flexed copy plate whose upper surface is stretched or in tension. After the contact, the stretched upper surface of the copy plate is relaxed with the result that the imprinted patterns are displaced inwardly beyond the amount of the master mask distortion. A flexure such as that shown in Fig. 11 results in a copy plate having negative array registration relative to the master plate (run-in). An opposite flexure would lead to run-out.

Fig. 11. The creation of array misregistration (run-in) in a plate printed under conditions of flexure. Condition (a) is before contact; condition (b) is during exposure; condition (c) is after printing.

Rizvi has derived a closed expression describing this misregistration, where r is run-out and the other symbols are as described in Fig. 11:

$r = (h/D)(t_c + t_m)$.

We have measured array misregistration under controlled conditions of deflection and found substantial agreement of this expression to experiment. This expression predicts that a 25.4-μm (1-mil) deflection of 5 in. × 5 in. × 0.090 in. master and copy plates will result in 1.14 μm of array misregistration between the master and copy plate over the 102-mm wafer width.

As a result of such consideration, a great deal of effort has gone into optimizing the mechanics of bringing about the master and copy plate contact. The contact printer that probably embodies the most widely used solution to this contacting problem is the Tamarack™ printer. When properly used and maintained, this printer is capable of maintaining master-to-copy registration of ±1.0 μm. This number is the same as that reported previously for the emulsion plate. If the emulsion printer is adopted for chromium contact printing, as was once the practice, a wider registration distribution is obtained. Given then the chrome print misregistration and that for the master mask set (0.5 μm), the total misregistration usually encountered for chromium contact prints is that for emulsion prints: ±1.5 μm.

These statements do not imply that contact printed sets cannot be purchased having smaller registration values. In the case of both emulsion and contact prints, the mode of the distribution of prints about the master was zero misregistration. If print registration is closely checked, a set of prints can be selected having arbitrarily tight registration at some cost. Usually prints are used because cost is a serious consideration, and economically, registration can be checked only to identify printer malfunction. Under these latter circumstances, the naturally occurring amount of misregistration, which is ±1.5 μm, will be observed.

2. Resolution of Chrome Contact Prints
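As a numerical footnote to the registration discussion just concluded, the run-out expression for flexed plates, read here as $r = (h/D)(t_c + t_m)$, can be checked against the quoted example. The symbol meanings used below are assumptions, since the symbols are defined only in Fig. 11: h is taken as the plate deflection, D as the span over which the run-out is measured, and t_c and t_m as the copy and master plate thicknesses. The function name is hypothetical.

```python
MICRONS_PER_INCH = 25400.0


def run_out_um(deflection_um, span_mm, t_copy_in, t_master_in):
    """Array run-out r = (h / D) * (t_c + t_m), returned in micrometres.

    deflection_um : plate deflection h (assumed meaning), in micrometres
    span_mm       : span D over which run-out is measured (assumed meaning), in mm
    t_copy_in     : copy plate thickness t_c, in inches
    t_master_in   : master plate thickness t_m, in inches
    """
    total_thickness_um = (t_copy_in + t_master_in) * MICRONS_PER_INCH
    span_um = span_mm * 1000.0
    return deflection_um / span_um * total_thickness_um


# 1-mil (25.4-um) deflection, 0.090-in. master and copy plates, 102-mm wafer width.
r = run_out_um(25.4, 102.0, 0.090, 0.090)
print(f"run-out: {r:.2f} um")
```

With these readings the result rounds to the 1.14 μm quoted for 0.090-in. plates, which supports both the form of the expression and a deflection of 25.4 μm (1 mil). Because the run-out scales with the summed plate thickness, thinner plates would show proportionally less misregistration for the same deflection under this model.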

The resolution of chrome contact prints is an interesting matter. Inherently, the chrome/photoresist imaging medium is capable of extremely high resolution. The masters, from which the prints are obtained, are limited only by the resolution of the optical equipment used in the imaging process. As such, their limit of resolution is given as the same as the limit for the step-and-repeat camera lens, which is 1.25 μm. The printing procedure is one-to-one contact printing or shadow casting, not optical imaging. For shadow casting under conditions of completely intimate contact, the modulation transfer function is given as unity; that is, anything that can be imaged on a master should be able to be imaged on a contact-printed mask. That such resolution is not routinely observed is due principally to the failure to achieve completely intimate contact throughout the printing step. This failure is due to the nature of positive photoresist, which is the almost universally used photosensitive film. During the exposure step, the photoresist emits a considerable amount of nitrogen, which is a product of the photoreaction shown in Fig. 12 [6]. The volume of N2 given off at standard conditions of temperature and pressure is estimated at 8-12 times the volume of the photoresist. The result of this N2 emission from the photoresist is a loss of contact between the copy and the master plate.

Fig. 12. The decomposition of the photosensitizer in positive resist to emit N2.

As the two plates separate, light is diffracted at the pattern edge and resolution is degraded on the copy plates from that achieved on the master mask. This separation is never uniform. As a consequence, the resolution will vary from point to point on the copy plate. The center of the mask is flexed, becomes a collection area for the N2, and will exhibit the poorest resolution. A concomitant of this loss of resolution is geometry rounding and corner inversion. In practice it has been found that for geometries smaller than about 3 μm the image quality becomes so inconsistent as to be unacceptable. This conclusion does not mean that smaller geometries are not ever resolved (and occasionally resolved well); it simply means that the predictability of the contact printing process for an acceptable product has declined to a point of diminishing returns.

3. Critical Dimension (CD) Control of Chrome Contact Prints

The dimensional variation (CD control) is strongly affected by the same considerations that bear on resolution. As resolution and geometry are found to vary over the area of a contact-printed mask, so is geometry sizing found to vary. Under conditions of good control, the variation on a single plate is usually observed to be about 0.25 μm. From one plate to another, the plate-to-plate average will vary by about 0.25 μm. Hence, for a series of plates, about 80-90% of the distribution of observed measurements will fall within a range of ±0.25 μm. Also influencing this ability to maintain CD control is the processor used. In order to maintain consistency, most contact print processing is done on automatic spray processors that have carefully timed develop, etch, and rinse cycles.

An additional contributor to CD variation is the instrument that is used to measure the dimension. Until about 1976, the filar eyepiece with a vernier drum readout was commonly used. An improvement on this arrangement was the addition of a digital readout that removes operator error in reading the vernier. After 1976, several versions of image shearing microscopes were introduced both with and without the digital readout. In 1978, several versions of automatic edge sensing for CD measurement became available. The automatic edge sensing made a great difference in the precision with which a measurement could be made. A comparison shown in Table X illustrates this instrument precision on repeated measurements of a single artifact.

TABLE X
Instrument Precision

                 Digital filar (Leitz system)    Automatic edge sensing (Nanometrics)
Mean, X̄ (μm)     5.04                            5.04
σ (μm)           0.13                            0.039
Range (μm)       0.36                            0.127
n                9                               9

In this particular case, exactly the same geometry was measured using the two measuring systems and several operators. The agreement of the means (X̄) indicates that the calibration technique is accurate. The difference in standard deviation is almost a factor of four. This difference is due to the precision of the two systems. It is our experience that this sort of difference is observed between any system that relies on the operator visually to estimate an edge position and a system that has some means of automatically sensing an edge.

Recently (1980) Nikon introduced its LAMPAS system, which, in addition to a new automatic edge-sensing technique, has a laser interferometer to measure the distances between the sensed edges. This machine has demonstrated a precision for CD measurement of 3σ = 0.05 μm, which is less than half the deviation seen using previous equipment.

The use of automatic spray processing that led to better CD control has also led to an improved visual quality in the photomasks. Prior to this type of processing, immersion tanks were common. While this method has high capacity, consistency is hard to maintain. A common problem in immersion systems is that air bubbles cling to the plate surface and inhibit either the develop or etch in their locality. Unless careful attention is paid to this problem, immersion processing will tend to have a high incidence of extra chrome defects: chrome spots and opaque bridges or projections.

4. Chrome Print Cosmetic Quality

As was discussed earlier, it is possible to specify and purchase chrome blanks that have an average of 1 defective die/in.² or less (0.15 defect/cm²). Also, it is possible on a routine basis to produce master masks that have no greater than 1 defective die/in.². However, the prints produced from the contacting of these two typically range from 4-6 defective dies/in.² (0.6-0.9 defect/cm²). Simple addition of the raw stock and master mask defects gets us to 2 defective dies/in.². The act of processing the contact-printed raw stock will elicit some further defects, about 0.5 defective die/in.². Also, the average in-use master has accumulated wear and contamination, which will amount to about 0.5 defective die/in.². This accounting still leaves us short of rationalizing the observed 4-6 defective dies/in.² of the contact print. The remaining 2-3 defective dies/in.² have been found by the authors to come from the contacting step itself. In order to achieve good CD control and resolution, the copy plate-to-master mask contact must be as intimate as possible. However, this intimacy causes some photoresist damage, which leads to defects during the processing steps.

In the authors' experience, these defects are mostly missing chrome defects; however, they need not be. Others have reported a similar density of added chrome defects. The difference, we speculate, is in the processing step. If the plates are given a post-develop bake, the resist damage incurred at the contacting can be healed, but usually contamination will be picked up from the baking step. This contamination may lead to extra chrome defects, as shown in Table XI. In this table, the data in the last two columns represent automatic inspection results of masks made from material that has been characterized in the first four columns. The master mask that was used to print these contact prints had about eight defective dies.

TABLE XI
Plate Quality as Estimated by Defective Dies/Plate

                   Unaided visual inspection                                                  Automatic inspection,
                   Received plates      Processing only    Contacting + processing,          finished plate
                                                           no exposure
Method (a)         X−         X+        X−                 X−                                 X−         X+
No. dies/plate     2.0        2.0       2.5                4.0                                60         20
σ                  1.5        1.0       2.0                8.0                                12         5

(a) X−: backlighting for pinholes. X+: oblique lighting for resist defects.

As we saw in Table I, the improvements in masking quality that are offered by the change from emulsion to chromium contact prints are a halving of the specification limits for resolution, CD control, and defect density. This halving represented a significant improvement in mask quality and for many years established the state of the art.

However, as die size continued its inexorable increase and design rules were tightened, even this performance began to act as a limit on device yields. The initial response of the wafer fabrication areas was to request a lowered specification limit on the chromium contact print. The attempts to comply with that request in general resulted in a steadily decreasing ability to meet quality and quantity plate requirements on the part of photomask fabricators. The heart of the problem lay in the fact that the tightened requirements no longer described the product flowing from the contact print process. The attempt to select usable material from this marginally useful product flow was, in many cases, unsuccessful and was at best accomplished at lowered plate yields.

The way out of this problem arrived with the advent of high-speed step-and-repeat cameras. Using these cameras, it was no longer necessary to copy master plates in order to support high volume plate demands; instead support could be maintained using the directly-stepped plates themselves. In this same time frame also, projection printing for wafers was replacing wafer contact printing. This trend resulted in lowering the overall demand for masks. What was needed was very high quality masks in smaller quantities. These two changes, the high-speed step-and-repeat camera and the wafer projection printer, meant that directly-stepped plates could become a viable substitute for the contact-printed plate.

C. Directly-Stepped Photomasks

1. Resolution

Elimination of the contact printing step meant that the resolution of the reduction camera and the imaging medium became the limit on patterning. As was discussed previously in the portion on master masks, the usable resolution of the Carl Zeiss (10-77-82) lens is about 1.25 μm. In practice this is a number that can be reliably achieved and in some circumstances, it can be improved.

2. Registration

The high-speed step-and-repeat cameras are typified by the GCA/MANN 3696 and the TRE/ELECTROMASK (image repeater) machines. The stepping motion of these machines is laser metered using He-Ne laser interferometers. The demonstrated stepping precision of the machines is 3σ < 0.25 μm. The components of mask registration are bound up in the camera used for producing the finished plate. Figure 13 is a schematic of a typical step-and-repeat camera.

Fig. 13. The step-and-repeat reduction camera. [Schematic; labels: reticle, fiducial marks on reticle, laser light, mirror, X and Y stage axes.]

The raw stock is placed on the stage against banking pins to establish location. The stage motions X and Y are the basic directions on the camera. The X and Y directions of the banking pins and the stage must align closely. The reticle, which is the die pattern at 10× magnification, contains fiducial marks that align with the X and Y directions on the die pattern. When the reticle is placed on the camera head, these fiducials are aligned to illuminated targets fixed to the head. The X and Y of these targets must align to the stage motions. The failures of registration are bound up with (1) the stage motion or array registration, (2) the proper setting of the reduction lens, and (3) the aligning of the reticle fiducials to the camera targets (or reticle rotation). Given the current camera and an operator of "standard" competence, the last two errors, which are summed under die registration, amount to <0.30 μm. The first category, errors resulting from stage motion, can manifest itself in several ways, all of which are array registration errors that should be <0.25 μm. The registration error for all causes is the root-sum-square of these errors, √(0.30² + 0.25²), or about 0.4 μm.

A problem with the overlay technique is that it depends on operator judgment to determine edge locations where the error is measured. We have found the precision of optical overlay to be 2σ ≈ 0.2 μm. The use of the Nikon Lampas will represent the same improvement in registration measurement precision that was seen when CD measurements changed from visual edge estimation to automatic edge detection. The Nikon machine has a demonstrated precision of 3σ < 0.15 μm.

3. Cosmetic Quality

One defective die/in.² (0.15/cm²) was used in the section on chrome contact prints to describe the directly-stepped master plate. When this plate is


delivered to the wafer fabrication area, this same defect density can be maintained. The impact of this improved cosmetic quality can be profound on the wafer probe yields. Yield can be estimated as Y = K exp(−nda), where Y is the yield of nondefective dies, n the number of yield-impacting levels, d the defect density of a single mask level, a the die size, and K the noncosmetic limits on device yields. Assume n = 4, K = 0.7, and a = 22,000 mil² = 0.142 cm². For the contact print at 0.9 defective dies/cm², nda = 0.51 and Y(cp) = 42%. For the stepped working plate at 0.15 defective dies/cm², nda = 0.085 and Y(swp) = 64%. Since 70% represents the maximum yield that can be achieved if all the levels are at zero defects, the contact print represents an achievement of 60% of possible yield, and the stepped working plate represents 91% achievement. This improvement in defect density has effectively removed the mask cosmetic quality as the limiting factor in probe yield.

4. Productivity

Until we considered the directly-stepped plate, productivity was not a serious problem. The copy plates could usually be produced as rapidly as needed. Additional print capacity was cheaply available. For a contact print process, productivity generally meant the yield of the prints through the inspection steps or the availability of masters for printing. The printing step itself was not a serious problem.

With directly-stepped plates, both of the latter problems evaporate. There are no masters, and the yield through inspection is high. The high yield is accomplished by maintaining the former contact print criteria for the stepped plates. When these criteria are maintained, a highly desirable situation exists. That is, the inspection criteria now describe basically all of the product flow, not just a part of it. Under these conditions, the inspection plans now can operate as they were designed: they describe the process and can indicate when it is out of control. When these quality criteria were applied to contact prints, they represented more of a goal than a description. The result was that an appreciable portion of the product flow was rejected in an attempt to separate usable material from the rest. This attempt was not particularly effective and in most cases was costly in wasted material and effort. However, with the use of directly-stepped plates, the sample plan became an accurate description of the process, and the yields through the plan were


high. The productivity limit now was at the step-and-repeat camera. A contact printer can produce about 25 copies/hr and costs about $30,000. A step-and-repeat camera costs about $300,000 and can produce anywhere from 0.5 to 15 plates/hr, depending on how complicated the requirements are for multiple patterns within the plate and on the plate size itself. To achieve the higher number, the array should be stepped using multiple-die reticles, the use of test patterns must be avoided, and arrays no larger than necessary must be used. When sufficient capacity is made available for stepped plate production, a considerable capital investment is involved. The overall cost per delivered plate can be lower for the stepped plate since the inspection yields are high, and the master production is not involved. A problem for anyone considering these economics is that until the step-and-repeat cameras are loaded to capacity, their overhead contribution is heavy.

As device complexity increases, the mask demands are frequently outstripping the capacity of optical generation equipment. Several areas of concern for optical equipment are (1) dies larger than can be imaged with currently available lenses (14-mm diagonal); (2) devices having images smaller than 1.25 μm; (3) array configurations involving several reticle patterns, especially automatic alignment patterns that must register closely to the majority pattern; and (4) device mask sets requiring the overlay registration error to be 0.125 μm. When these conditions are required for a mask set, then an electron-beam generated set is required. As was discussed in the section on electron-beam reticles, the e-beam machines were originally designed to build 1× masks. Their tolerances are aimed at this level of precision and performance. In the following section, this mask type will be discussed.

D. Electron-Beam 1× Masks

Whereas the electron beam can easily outperform the 10× pattern generators, the contest is much more closely matched when comparing the e-beam at 1× with the current step-and-repeat cameras.

1. Resolution

The standard e-beam writing requires four scan lines to make up a feature. Hence, using the smallest beam now available (0.25 μm), the smallest line that can be written is 1.0 μm. For lines and spaces of this dimension, little or no proximity correction is required. If the requirement for four scan lines per feature is relaxed, smaller features can and have been resolved. However, features that involve lines and spacings of less than 1.0 μm need to be corrected for proximity effects. These effects arise from electron scattering in the specimen being irradiated. Under vacuum conditions, these scattered electrons constitute a contribution to the electron dose that is dependent on the proximity of nearby irradiated areas. An illustration of this is shown in Fig. 14. Until the software is available to compensate for such effects, geometries of less than 0.75-1.0 μm should not be considered to be within the capability of the machine resolution. A circumstance that tends to mitigate the impact of this limitation is that currently available 1× imaging equipment cannot effectively exploit masks made beyond this resolution.

Fig. 14. An example of pattern broadening caused by the proximity effect in closely-spaced geometries. [Schematic; labels: desired feature; negative resist, cross-hatched areas are irradiated; CD change in closely spaced area.]

2. Registration

Currently available commercial electron-beam equipment is specified as being capable of 0.125-μm registration level to level. In order to achieve such registration, very careful temperature control and low expansion substrates are necessary. There are several components in this registration:

(1) machine variables that were treated in the section on reticles,
(2) long-range cyclic temperature changes of the machine itself, and
(3) temperature variations of the plate and plate holder as they are inserted and removed from the machine.

The use of quartz as a substrate material can effectively eliminate the plate as a cause of trouble. The cassette and the machine constitute a more serious problem. The cassette can be accommodated to the temperature of the room, and that temperature can (with care) be controlled to about ±0.25°C. Because the e-beam machine contains heat sources and sinks within itself, it will not generally be at room temperature. When the cassette enters the machine and is banked onto the stage, it will begin to accommodate to the stage temperature. This accommodation will subject the plate to mechanical forces that can either translate or deform it. Both these changes are to be avoided. This conclusion means that provision must be made for control of the stage temperature, so that it can be adjusted to the room temperature.


TABLE XII
Thermal Coefficients of Expansion (a)

Material           TCE (°C)^-1      Expansion over a 4-in. span (μm)
Soda lime glass     90 × 10^-7      0.23
LE-30 glass         37 × 10^-7      0.09
Quartz glass         5 × 10^-7      0.01
Aluminum           238 × 10^-7      0.60
Magnesium          266 × 10^-7      0.66

(a) ΔT = 0.25°C.
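The expansion column of Table XII can be checked with the usual linear expansion relation, ΔL = TCE × L × ΔT. The following is a minimal Python sketch under that assumption; the function and dictionary names are illustrative and do not come from the chapter, and the calculation assumes a uniform temperature change, which is an idealization of the erratic distortion the text describes.

```python
# Linear thermal expansion over a fixed span: dL = TCE * L * dT.
# TCE values are copied from Table XII; all names here are illustrative.

INCH_TO_UM = 25_400.0  # micrometres per inch

TCE_PER_DEG_C = {
    "soda lime glass": 90e-7,
    "LE-30 glass": 37e-7,
    "quartz glass": 5e-7,
    "aluminum": 238e-7,
    "magnesium": 266e-7,
}


def expansion_um(tce_per_deg_c, span_in=4.0, delta_t_c=0.25):
    """Length change in micrometres for a uniform temperature change."""
    return tce_per_deg_c * span_in * INCH_TO_UM * delta_t_c


def expansion_table(span_in=4.0, delta_t_c=0.25):
    """Expansion in micrometres for every material listed in Table XII."""
    return {
        name: expansion_um(tce, span_in, delta_t_c)
        for name, tce in TCE_PER_DEG_C.items()
    }


if __name__ == "__main__":
    for name, value in expansion_table().items():
        print(f"{name:16s} {value:5.2f} um")
```

With the table's 4-in. span and ΔT = 0.25°C, the sketch reproduces the tabulated expansions to within about 0.02 μm; changing `delta_t_c` shows how quickly a looser temperature tolerance consumes a 0.125-μm registration budget.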

The thermal coefficients of expansion (TCE) listed in Table XII illustrate the problem. This thermal effect will not lead to a linear expansion or contraction but will result in an erratic distortion, which is difficult to predict.

3. Dimensional Control

As was discussed in the section on e-beam reticles, the specification on critical dimension (CD) control is ±0.1 μm. This number excludes effects of edge roughness due to fixed address spacing or raster effects. An important contributor to CD variations is the resist processing, resist adhesion, and the variation of these quantities over the plate. A more serious problem still is fixed address spacing. As we saw in considering reticles, this is at least one-half an address unit in magnitude unless angles other than 0°, 45°, or 90° are avoided. Some mitigation of this problem can be had by one of two routes. Either adjust the post-exposure processing to cause resist flow and edge smoothing or write the pattern with a beam larger than the address structure. Either approach will "smudge" the granularity of the address structure. The bottom line on all of this discussion is that a CD control of 0.15 μm in an absolute sense is about as tight as one can hope to achieve.

4. Defect Density

This characteristic is intimately tied to raw stock quality, mask processing, and environment for processing. In 1980 the figure that was widely used was 4 defective dies/in.². This is to be compared with the then available 1 defective die/in.² on optical plates. Currently e-beam plates are available that have 1 defective die/in.² after repair. Of the types of resist available, COP [poly(glycidyl methacrylate-co-ethyl acrylate)] and PBS [poly(butene-1-sulfone)] are the most commonly used. The former is a negative resist; the latter is positive. The COP cures to a tacky state, which tends to cause contamination adherence. Hence, with COP a higher defect density is usually encountered. PBS processing is particularly critical, especially with regard to CD control. It has been shown that control of the relative humidity (RH) in the processing environment is a key


to successful PBS processing. Sufficient control appears to be ±1% RH. The more desirable resist from a processing point of view is COP, but the defect density dictates that PBS be used. It is common at this writing to see PBS be the resist of choice for a majority of the e-beam masks built. The number in Table I is 0.3 defective die/cm², which is twice the repaired number.

5. Throughput (Productivity)

This figure is, of course, strongly dependent on the pattern being written. The writing time can vary from a few minutes to 2 to 3 hr depending on device complexity. The extremely long running patterns are hard to justify running at 1× on the e-beam since much higher productivity rates can be sustained optically. The arguments for running such patterns are those of e-beam uniqueness: the patterns cannot run optically because they involve dies that are too large or lines and spaces that are too small to run on the optical equipment. A second argument is complexity; several closely registered reticles may be required per masking level. In general an average figure of 1 level/hr can be safely used as a comparative number for productivity.

E. Future Trends in 1× Mask Making

At present the ability to produce 1× masks that will satisfy the needs of VLSI electronics is closely matched between optical and e-beam equipment. Neither is complete in itself; each has advantages not available to the other. The earlier discussions are summarized in Table XIII. The advances in 1× mask making will not come as a direct effort by the hardware manufacturers in that field to improve the equipment. Instead, these advances will be a fallout of 1× wafer writing.

1. Optical 1× Mask Making

There appears to be little likelihood that lens design much beyond the 10/1 Zeiss 10-77-82 lens will become available in the next 2 to 4 years. This limitation means that die size and resolution will remain at their current state for the near future. The advent of the laser-controlled step-and-repeat cameras, however, allows one with careful work to compose patterns of die size larger than the limits that would be imposed by the lens, provided intra-die precision of less than 0.25 μm can be tolerated.

Test die insertion and the use of alignment aids will benefit from the automatic reticle changer and the automatic reticle alignment equipment, which were primarily developed for wafer patterning. We have seen some automatic alignment systems that can insert a minority pattern with such precision that its registration equals that of the majority pattern.

TABLE XIII
Summary of 1× Mask Making

Attribute: Die size
  Optical steppers: Limited by lens working diameter, typically 14-mm diagonal of die.
  E-beam writing: Limited only by stage travel; currently 6 in. × 6 in. available.

Attribute: Geometry size and spacing
  Optical steppers: Limited by the MTF of lens. Present practice is 1.25-2.0 μm in production, 1.0 μm with special effort.
  E-beam writing: Limited by proximity correction software and beam size control. Present practice is at 0.8 μm.

Attribute: CD control
  Optical steppers: Limited by processing and resist variation. Present best practice is 0.2 μm plate to plate and 0.10 μm within a plate at 2σ variation.
  E-beam writing: Limited by the effects of fixed address spacing to ±AU/2. With special processing can get performance equivalent to optical plates.

Attribute: Array registration
  Optical steppers: Limited by temperature control. Present best practice is 0.50 μm plate to plate at 2σ. Can be delivered at 0.25 μm at lowered yields.
  E-beam writing: Limited by temperature control. Present best practice is 0.15 μm. Under ideal conditions can produce 0.125 μm.

Attribute: Die registration
  Optical steppers: Limited by operator precision in alignment; typical performance is 3σ < 0.3 μm.
  E-beam writing: Limited only by machine linearity and stripe abutment; typical performance 3σ < 0.2 μm.

Attribute: Defect density
  Optical steppers: Limited by raw material and processing. At high yield can expect 0.15 defective die/in.². Plates with no defects often encountered.
  E-beam writing: Limited by raw material and processing. Material with 0.3 defective die/cm² in current practice.

The quality of the raw material (chrome blanks) is improving. Material with better resist thickness and sensitivity uniformity is becoming readily available. Vendors both in the United States and abroad are making significant efforts to improve their product. Much of this effort has been undertaken in conjunction with the use of higher-priced substrates such as low-expansion glass and quartz. These substrates are 10 to 20 times as expensive as the soda lime glass that was formerly the standard of the industry. When such materials are used, extra care must be expended on the chrome and photoresist layers. As a result of this effort, it is to be expected that CD control of 3σ < 0.125 μm will be available in the next year. To be "available" in a practical sense means that good yields to this specification must be maintained and instrumentation capable of demonstrating such CD control must become widely used. This last requirement implies an instrument precision of 3σ < 0.01 μm would be desirable if the traditional 10/1 measurement-to-equipment precision were maintained. Currently the best available instrument has 3σ < 0.05 μm.
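The gap between the 10/1 rule and the available instrumentation can be made explicit with a short calculation. This is only an illustrative Python sketch; the function names are hypothetical, and the inputs are the 0.125-μm tolerance and 0.05-μm instrument precision quoted above.

```python
# Measurement-to-tolerance ratio check for CD metrology.
# Function names are illustrative; the numbers come from the paragraph above.

def required_instrument_precision(tolerance_um, ratio=10.0):
    """Instrument precision needed to hold a given tolerance-to-precision ratio."""
    return tolerance_um / ratio


def achieved_ratio(tolerance_um, instrument_precision_um):
    """Tolerance-to-precision ratio actually provided by an instrument."""
    return tolerance_um / instrument_precision_um


if __name__ == "__main__":
    needed = required_instrument_precision(0.125)
    actual = achieved_ratio(0.125, 0.05)
    print(f"needed precision: {needed:.4f} um, achieved ratio: {actual:.1f}:1")
```

Under these numbers the rule calls for roughly 0.0125 μm, which the text rounds to 0.01 μm, while a 0.05-μm instrument supports only about a 2.5/1 ratio against the 0.125-μm tolerance.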


2. Electron-Beam 1× Mask Advancements

Electron-beam wafer writing will be enhanced in several ways, which will benefit mask making at 1×. Several of these enhancements follow.

(1) Shaped beam systems: This technique employs a rectangular electron beam. Such a system is similar to the optical pattern generators in its patterning technique. The Japan Electron Optics Laboratory (JEOL) has developed such a machine. Their model JBX-6A uses a shaped beam with vector scan. This machine can use a "spot" size that can be varied from 1- to 12.5-μm square. Patterns that were shown as having been produced using this machine are free from the fixed address structure effects that afflict the raster machines. A second Japanese machine, the Hitachi EB-55, also uses a shaped beam and vector scan. This machine is said to be capable of 0.5-μm square-shaped beams. For 1× mask writing, such machines will offer significant enhancements in pattern size control and fidelity to design intent as to pattern shape.

(2) The proximity software will allow pattern control to be maintained as geometries of less than 1 μm are written. The currently available program SPECTRE [7] shows some promise in this regard, but until the proper software and electron sources are available, the way to submicron geometries is not really supported with a solid technology base. We recognize that patterns of submicron dimension have been imaged, but not with the degree of control that would qualify these efforts as a production technology.

(3) The use of the brighter electron sources as a standard item will allow an increase in writing speed and provide the currents for fine line definition. The most likely source is, of course, LaB6 single-crystal material. Some use of a field emission point source may become feasible as techniques for maintaining a stable point are developed.

The primary mask making equipment for future advances of 1× patterning will be e-beam. This is not to say that optical approaches cannot succeed; instead, it seems that the e-beam is the easiest route to producing such masks. The software and hardware of present machines will have to be enhanced over current art. But these advancements are not more than two years from practical availability.

IV. WAFER RESIST PATTERNING

A. Contact Printing

Contact printing from a mask to a wafer is the major method of wafer lithography at the current time. There are several reasons for this popularity. First and foremost, until the advent of the Perkin-Elmer Micralign™ 100 in


1973, contact printing was the only effective method for patterning wafers. As a consequence, all semiconductor manufacturers were heavily equipped with contact printers. The balance between contact and projection print has slowly shifted in favor of the latter. Presently the more advanced circuits are being run on projection printers while the older, established product lines have yet to be updated to projection print. As of 1980, about 2000 Micraligns™ were in place in semiconductor wafer fabrication lines. The second reason for contact print popularity is cost. The projection printers are anywhere from 10 to 20 times as expensive as a contact printer, and they have only 60-80% of the throughput of a contact printer. Unless the "benefits" of a projection printer are needed, a corporation will not make the investment. Finally, there is an optical advantage to contact printing in that very nearly 100% contrast can be obtained. This patterning method theoretically can transfer to a wafer any geometry that can be imaged on a photomask, assuming that the contact between the mask and wafer is sufficiently intimate.

In light of these advantages, why do we find the industry changing to projection printing? The answer lies in the deficits of contact printing. One of the principal deficits of contact printing is described in the name: contact. For optimum, or really even acceptable image transfer, intimate mask-to-wafer contact must be maintained. When this intimacy is obtained, mask damage of some sort usually occurs. If the wafers are clean, smooth, and particle-free, this damage can be minimized. However, even in well-run wafer fabrication areas, processing operations will create rough wafer surfaces, and these will cause mask damage. Figure 15 shows the results of a study done by the authors to quantify this point on an NMOS device of 22,000 mil².
The wafers being patterned as level A were smooth, having a thin coating of thermal oxide and deposited silicon nitride. The level C wafers were coated with polycrystalline silicon. The wear rates are roughly 5:1 different between the two levels. Wear rates are best characterized by the units "defective dies per plate per wafer exposed." If defects and not defective dies are used, then defect structure can lead to nonrepresentative counts: 30 defects may be in 2 dies and only three in the remainder on the plate. The result in defects would be 33 defects/plate. The result in defective dies is 6 defects/plate. The latter is more representative of the actual plate quality as only 6 dies are defective.

The impact of wear rates can be seen in a yield calculation (Table XIV). Wear rates affect devices of different die sizes differently. It is assumed in this calculation that chrome masks are used, and the average wear rate will be 2.5 defective dies/plate/wafer. Initial mask quality will be approximately 12.5 defective dies/plate (1 defective die/in.²). Wafer lots of fifty are used. The yield calculations are for microlithographic-limited yield only. In this calculation, X0 is the initial die percent defective, Xf the final die percent defective, and X̄ the arithmetic average of the two. The microlithographic yield equation used is Y = exp(−N X̄/100), where N is the number of critical or yield-affecting levels in the device. The average yield Ȳ is that which would be obtained using the average quality mask. The Y0 term is the yield that would be obtained if the initial mask quality could be maintained throughout the wafer set. This yield should also be that obtained through using projection printing.

Fig. 15. Mask defects are shown as a function of the number of wafers exposed using contact printing. The die size was 22,000 mil². The defects are Cr− (pinholes, scratches, etc.). The rates are as follows: level A = 0.016, level B = 0.048, and level C = 0.074. Units are defective dies per square inch per wafer. [Plot; curves for levels A, B, and C; horizontal axis: wafers exposed, 10 to 40.]

TABLE XIV

Die area (kmil²)         2.5      10       20       40       60
Dies per 4-in. wafer     5000     1250     625      312      208
X0                       0.25%    1.0%     2.0%     4.0%     6.0%
Xf                       2.8%     11.0%    22.0%    44.0%    66.0%
X̄                        1.5%     6.0%     12.0%    24.0%    36.0%
Y0                       99.0%    96.0%    92.3%    85.2%    78.7%
Yf                       89.4%    64.4%    41.4%    17.2%    7.1%
Ȳ                        94.1%    78.7%    61.9%    38.3%    23.7%

A major impact of projection


printing is that contact-induced mask wear and the associated yield loss are avoided. As can be seen, if the die size is small, the benefits are also small. Since the older circuits, which are produced on contact printers, are generally small, there is little incentive to convert these to projection print. A second deficit of contact printing is its consumption of masks. The wear on the masks means that at some point they must be discarded. The larger the die, the lower the wafer-use number should become.

1. Resolution in Contact Printing

A serious problem associated with contact printing is the quality of the patterning achieved. When perfect intimacy of the mask and wafer is obtained, the mask becomes the limiting factor on resolution. It is not often that perfect contact over the entire wafer surface is obtained. This failure can be attributed to (1) wafer nonflatness, (2) particles on either the front or back wafer surface or on the mask, or (3) spike growths on the wafer. Such separation will lead to geometries that are rounded or with inverted corners as well as size variations in regions around the separation point. The size of the separation is important only when it is sufficient to cause a widening of a pattern by greater than 10%. We have calculated the separation that would cause this widening; the basis of this calculation will be discussed in the section on proximity print. The smaller the geometries are on the mask, the smaller is the "allowed" particle size. Assuming that 365-nm light is being used, the design rules in Table XV dictate the maximum allowable particle diameter or mask-wafer separation.

We have found that contact-printed wafers with geometries below about 2.0 μm tend to exhibit unacceptable geometry distortion and CD variation. This problem in almost every case investigated is caused by either separation or particles. Although the actual resolution capability of contact prints is considerably less than 2.0 μm, a practical limit is probably appropriate at that geometry size. This limit is based on the increasing inability to maintain large areas of the wafer in sufficient intimacy to get acceptable geometry control or sizing.

TABLE XV
Maximum Allowable Particle Size for 365-nm Light

Geometry size (μm)    Separation (particle diameter) (μm)
10                    27
5                     7.0
3                     2.5
2                     1.1
1                     0.3
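The entries in Table XV appear consistent with the edge-diffraction estimate given later in the section on proximity printing, W′ = W(1 + F)/F with Fresnel number F = W²/(λS). Under that relation the widening is W′ − W = λS/W, so a 10% limit corresponds to S = 0.1 W²/λ. The Python sketch below works on that assumption; the function names are illustrative and are not taken from the chapter.

```python
# Edge-diffraction estimate of pattern widening across a mask-wafer gap.
# Assumes W' = W (1 + F) / F with F = W**2 / (wavelength * gap).
# Function names are illustrative; lengths are in micrometres.

def printed_width_um(mask_width_um, gap_um, wavelength_um=0.365):
    """Estimated width on the wafer for a given mask width and separation."""
    if gap_um == 0:
        return mask_width_um  # perfect contact: no diffraction widening
    fresnel = mask_width_um ** 2 / (wavelength_um * gap_um)
    return mask_width_um * (1.0 + fresnel) / fresnel


def max_separation_um(mask_width_um, widening_fraction=0.10, wavelength_um=0.365):
    """Largest separation that keeps the widening within the given fraction."""
    return widening_fraction * mask_width_um ** 2 / wavelength_um


if __name__ == "__main__":
    for width in (10, 5, 3, 2, 1):
        print(f"{width:2d} um geometry -> {max_separation_um(width):5.2f} um separation")
```

For the five geometry sizes in the table the sketch gives separations close to the tabulated values, and the quadratic dependence on W explains why the allowed particle size collapses so quickly as design rules shrink.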


2. CD Control in Contact Printing

At this same limit, the CD control is defined: We assumed acceptable control was ±10%, and at 2.0 μm, CD control becomes ±0.20 μm. In practice we have found this to be an achievable tolerance. One limit to going much beyond this control is the precision of the equipment measuring the geometries, as we mentioned earlier in mask making.

3. Registration Control in Contact Printing

As was also mentioned in the mask-making section, the contacting step can introduce considerable misregistration. In some ways, the demands of registration and resolution are countervailing forces. To achieve good resolution, the mask and wafer must be forced into intimate contact and the wafer "flattened" against the mask. However, this forcing will flex the mask and lead to misregistration. The same flexure, which caused run-out in contact-printed masks, will lead to a similar array misregistration when it occurs on wafers. Soft contact is often used in contact print aligners. It is primarily used to limit mask wear, but it has a secondary effect of helping control array misregistration. We have used the same number for registration on wafers as we used for masks: ±1.5 μm in Table I.

4. Defect Density

The contacting step, in addition to causing mask wear, CD variation, and array misregistration, also induces wafer resist damage. In contact printing the chrome mask will be used to pattern 50 wafers, so the average usage is 25 wafers. Table XVI shows what we have measured at the 25-wafer use point. The Cr− is the increase in missing Cr defects caused by wear. The average of high-wear and low-wear masks for total defects is 1.17 defective dies/cm². In this illustration the accumulated contamination was counted on masks as they exited the wafer line. In most cases the contamination was resist adhered to the mask. We have taken this contamination as indicative of the degree of resist damage the wafers suffer. Since the mask is cleaned after 25 wafers, the average wafer experiences a mask having about 0.6 defective dies/cm², which are the wear and contamination-induced effects.

TABLE XVI
(all values in defective dies/cm²)

                       Cr−     Contamination    Total
Level A (low wear)     0.06    0.74             0.81
Level C (high wear)    0.29    1.24             1.53

Since the


initial mask quality is about 0.2 defective die/cm², there is a total density of 0.8 defective die/cm² on the average wafer.

Summary of Contact Printing for Wafers

We have briefly examined the pros and cons of contact printing. Its advantages accrue mostly to the older devices where larger layout rules and smaller die mitigate the contact problems. As device sizes increase and design rules shrink, the contact print situation becomes increasingly difficult and expensive. In most cases mask savings alone will justify the investment in new equipment aimed at eliminating the contact step. In the next section, we shall consider one such approach: proximity printing.

B. Proximity Wafer Printing

The most logical and inexpensive solution to the contact problem is simply to provide for some separation between the wafer and the mask. In wafer patterning, this approach is called proximity printing. The most immediate advantages in this technique are twofold: mask wear and wafer resist damage are practically eliminated.

1. Resolution

Some of the problems associated with proximity printing are the following: First, light diffraction at the pattern edge causes some illumination in the region of the geometric shadow. This diffracted light will cause the pattern size and shape to vary from the intended size and shape. An estimate of the size variation can be obtained from the formulation for Fresnel diffraction at an edge as illustrated in Fig. 16. The relationship between the mask pattern and the wafer pattern can be estimated by W′ = W(1 + F)/F, where F is the Fresnel number (= W²/λS). In Fig. 17 we see the effect of the proximity gap setting for different mask slit sizes at 250- and 365-nm light.

Fig. 16. The diffraction of light at a mask pattern edge. The mask-to-wafer separation is S, the pattern width on the mask is W, and the width on the wafer is W′. [Schematic; labels: light at wavelength λ, mask.]

Fig. 17. The variation of the wafer geometry width (W′) as a function of gap setting and wavelength for different mask geometry widths (W). [Plot; horizontal axis: mask-to-wafer separation S (μm), 5 to 20.]

In many cases line and space sizes are similar for device design rules; i.e., 1-μm line, 1-μm space or 2-μm line, 3-μm space, etc. When this relationship occurs, it allows a limit to useful resolution to be defined. Consider a 2-μm line and space rule. At 5-μm gap, the 2-μm line on the mask has widened to about 3 μm, and the space has become 1 μm. An attempt can be made to "bias" the line to 1 μm in order to compensate for this spreading on the wafer. A 1-μm line at 5-μm gap has also spread to 3-μm width. This result means that a smaller gap must be used for printing these design rules. However, as the gap decreases, the ability to maintain uniform separation falls apart. For masks and wafers, a 2-μm flatness is about as good as can be routinely obtained, so even at 5-μm gap there will be serious fluctuation of the gap width. This fluctuation will impact CD variations so seriously as to make them unusable.

Experimental results were recently reported [8] on proximity printing using the Canon PLA 520F, which incorporates a Xe-Hg deep-UV (250-nm) light source. As can be seen in Fig. 17, some improvement in spreading is predicted on the basis of Fresnel diffraction. A considerable enhancement in resolution beyond what we would expect from calculations was obtained by the use of deep UV and the resist PMMA or PMIPK. Kameko and coworkers report resolution of a 1.5-μm line-space pair at a gap setting of 20 μm. The key to understanding the performance is in the resist sensitivity: these resists are very insensitive. The Fresnel calculation was based on a sensitivity of the resist to low-level light that was diffracted into the area of the geometric shadow. Since these resists are less sensitive, they respond only to the light in the higher intensity areas, and this distribution does not

5. Microlithography in Semiconductor Device Processing

299

change rapidly with gap setting. At this writing, these resists are not widely used in commercial operation, so the previous analysis represents a fair rep­ resentation of the current manufacturing practice. Unless wide gaps—such as 20 μ,πι—are used, gap variation makes CD control difficult to maintain. The wide gap also decreases mask damage due to silicon chips and growth modules on the wafer. These wide gaps tend to limit the proximity printers to patterning the coarser geometries but have several advantages: (1) gap variation due to wafer and mask nonflatness is minimized, (2) CD variation can be better controlled, and (3) mask damage is minimized. From these considerations, we can estimate a working resolu­ tion at a 15-20-μ,πι gap setting at about 3.5 μ,πι. 2. Registration

in Proximity Wafer

Printing

There are several reasons to believe that the registration obtained by proximity printing should be good. (1) The absence of mask-wafer contact eliminates a major contribution to misregistration. (2) The use of well-collimated light will lessen the impact of wafer bowing on registration. (3) The parallelism between the mask and wafer that is necessary for good CD control will limit any registration problem from this source.

A potential source of misregistration that remains is mask heating that can occur during the exposures. This topic is discussed in detail in the section on projection printing. That discussion shows that mask heating during exposure can cause temperature differences between different masking levels. These differences can be up to 2.9°C and can lead to misregistrations of 0.5 μm if low-expansion glass is used and 1.1 μm if soda lime glass is used for the masking set. These studies were done on 100-mm wafers. In Table I we have used 0.8 μm as the characteristic registration error of proximity printing. This error is obtained by assuming the temperature error for low-expansion glass and including other sources of error as being about one-half those of contact print. The rms sum is that listed in Table I.

3. CD Control

Dimensional control will almost always be poorer than that which can be obtained using contact printers. As we have discussed, the gap control and design rules have a strong impact on CD variation. At best the wafer and mask are each no better than a 2-μm flatness, with either positive or negative deflection. At worst the gap variation then will be 8 μm from wafer to wafer. The rms variation would be about 2.8 μm (the root sum of squares of the two contributions), assuming 1σ = 2 μm for both wafer and plate. From Fig. 17 we see that this variation amounts to a 0.25-μm variation on a 4.0-μm nominal geometry. This variation is then added to those expected from resist thickness, light intensity, and processing, which were delineated in the section on contact print. Overall a variation of 0.4 μm might be expected from proximity printing. This variation is another reason that very small geometries will be only poorly rendered using proximity printing.

4. Defect Density

The cosmetic quality improvement will be strongly dependent on the wafer and processing environment cleanliness as well as the gap chosen. If the gap is less than the particle size, damage will occur. At Western Electric, Jones noted a decrease of about 35% in damage rate at a 5-μm proximity gap [9]. For a 20-μm gap all but the worst damage is avoided; so under current manufacturing conditions, we would estimate the average wafer to be about 0.4 defective die/cm². This value represents a 50% decrease in damage from that experienced using contact printing. The productivity of proximity printers is approximately that of contact printers since they are very similar machines.

All in all, the limitations of both contact and proximity printing have kept either technique from becoming the method of choice for VLSI wafer patterning. Contact could resolve the patterns at the cost of heavy wafer damage. Proximity avoids some of this damage but lacks the elements of control necessary for high-yield VLSI. There may be some significant work to be done in deep-UV resist in proximity printing, but most effort is going toward projection printing. In the next section, the benefits of projection printing are explored.

C. Wafer Patterning by Projection Printing

Awareness of the consequences of the contact problem led engineers in the direction of finding techniques for separating the wafer and the mask. As we have seen, the proximity printer achieves this separation to a degree but suffers severe limitations from gap control and diffraction effects. The way out of this dilemma was to stop trying to pattern the wafers by shadow-casting techniques and to begin imaging the mask pattern on the wafer using lens systems. This approach was completely different and carried with it its own peculiar problems.

In discussing projection printing and making comparisons among the 10:1 printers, the 1:1 printers, and contact print, several useful optical terms need to be briefly discussed. The first is contrast. This item is defined as the ratio $C = (I_{\max} - I_{\min})/(I_{\max} + I_{\min})$, where $I_{\max}$ is the intensity within the illuminated area and $I_{\min}$ is the intensity in the area of the geometric shadow. The modulation is $M = C_{\text{image}}/C_{\text{object}}$. The modulation transfer function (MTF) describes how contrast varies as a function of the spatial frequency of the illuminated object, where the line-plus-space dimension equals the period [10]. An MTF = 1 implies that the contrast input to the optical system has not changed during transit (Fig. 18). In contact printing with perfect contact intimacy, the MTF = 1 and is independent of frequency.

Fig. 18. The variation of contrast or modulation as a function of the line-space density per millimeter. The influence of partial coherence of the illumination is seen on the shape of the MTF curve. At σ = 1 the illumination is effectively incoherent; at σ = 0.3 the light is nearly coherent; at σ = 0 there is complete coherence, and a square-wave cutoff at 0.5 normalized frequency occurs. (From Roussel [2].) [Plot omitted: vertical axis, modulation transfer function, 0 to 1.0; horizontal axis, normalized spatial frequency, 0 to 1.0, with a parallel scale in line pairs per millimeter (LP/mm).]

The illuminating source and the optical system have a characteristic called partial coherence. If a point source is imaged onto the entrance pupil of an optical system as a point source, then the source image will constitute coherent illumination. With coherent illumination, the MTF = 1 until a cutoff frequency is reached. At this frequency and above, MTF = 0 and no further useful imaging can occur. If the illuminating source is imaged onto the entrance pupil of the optical system at a size equal to the entrance pupil, then the illumination is effectively incoherent. With incoherent illumination, some contrast is lost at all frequencies. The cutoff frequency is greater for incoherent illumination. If the image of the source partially fills the entrance pupil, the illumination is partially coherent. This adjustment allows a trade-off of contrast and resolution for a given optical system within its performance limits.

The final term is numerical aperture (N.A.) and was discussed in an earlier section. We saw that the limit to resolving two objects of a separation d is related to $d = 0.61\,\lambda/\mathrm{N.A.}$, where 100% coherent light in air is assumed. Typically, a factor of 2.5 to 3.0 times this limit will describe useful resolution for an optical system. When partial coherence is introduced, we find the resolution limit is increased as was previously discussed. In this way we see the extension of the resolution limit to smaller geometries that is gained by introducing incoherence into the light source. In the discussion that follows, these terms will frequently occur and they become quite important in discriminating among the various optical imaging systems.

Around 1970 several projection printing machines were made available on the market. These machines contained refractive optics and met with some degree of commercial success. Kulicke and Soffa offered the so-called Telefunken machine (K&S 689), and Rank offered a projection printer. The original machines were capable of imaging 2-in. wafers. In 1972-1973 3-in. machines became available. These machines were specified as:

(1) Magnification = 1:1.
(2) Throughput = 120 wafers/hr (1st mask), 60 wafers/hr (align).
(3) Image field = 3 in.
(4) Resolution = 1.5 μm.
(5) Distortion = 1 μm.
(6) Alignment accuracy = 1 μm.
(7) Mask size = 4 in.
(8) Tolerance for wafer and mask curvature = ±5 μm.
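The optical terms defined earlier in this section can be made concrete with a short numerical sketch. The Python fragment below evaluates the contrast and modulation definitions, the idealized coherent MTF (unity up to a cutoff at 0.5 normalized frequency, zero at and above it), and the diffraction limit d = 0.61 λ/N.A. The incoherent MTF expression is the standard result for an aberration-free circular pupil; it is an assumption brought in for illustration and is not taken from this chapter. The function names, the 0.436-μm wavelength, and the sample N.A. of 0.28 are likewise illustrative choices.

```python
import math


def contrast(i_max, i_min):
    """Contrast C = (I_max - I_min) / (I_max + I_min)."""
    return (i_max - i_min) / (i_max + i_min)


def modulation(c_image, c_object):
    """Modulation M = C_image / C_object."""
    return c_image / c_object


def mtf_coherent(nu):
    """Idealized coherent MTF.

    nu is spatial frequency normalized to the incoherent cutoff, so the
    coherent cutoff falls at nu = 0.5 (the square-wave case of Fig. 18).
    """
    return 1.0 if nu < 0.5 else 0.0


def mtf_incoherent(nu):
    """Incoherent MTF of an aberration-free circular pupil (assumed model)."""
    if nu >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(nu) - nu * math.sqrt(1.0 - nu * nu))


def diffraction_limit_um(wavelength_um, numerical_aperture):
    """Two-point resolution limit d = 0.61 * lambda / N.A., in micrometers."""
    return 0.61 * wavelength_um / numerical_aperture


if __name__ == "__main__":
    # Perfect object contrast imaged with a shadow intensity of 20%.
    c_obj = contrast(1.0, 0.0)
    c_img = contrast(1.0, 0.2)
    print("modulation:", round(modulation(c_img, c_obj), 3))

    # Compare the two limiting illumination cases across frequency.
    for nu in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(nu, mtf_coherent(nu), round(mtf_incoherent(nu), 3))

    # Assumed wavelength (0.436 um) and N.A. (0.28), for illustration only.
    print("d (um):", round(diffraction_limit_um(0.436, 0.28), 3))
```

Under these assumptions the incoherent curve has already fallen to about 0.39 at the frequency where the coherent curve cuts off, which is the contrast-for-resolution trade-off described in the text.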

The applications of the K&S 689 machines were found in MSI and LSI. The 1-μm distortion and alignment specification meant that the 1.5-μm resolution could not be fully exploited. In the final analysis, the refractive aligners did not offer enough advantage over contact printing to capture the market. A disadvantage of refractive optics can be seen in Fig. 19. As the image field diameter expands, the resolvable image size has to increase [2]. As the industry moved from 3-in. to 4-in. wafers, the resolvable size would then have been forced larger if refractive optics were employed. Unfortunately, design rules were trending to smaller images. This situation spelled the end of 1:1 refractive projection print.

Fig. 19. The relationship between the N.A. and the effective field size of commercially available (1980) microelectronic photolithographic lenses. (From J. M. Roussel [2].) [Plot omitted: vertical axis, numerical aperture; horizontal axis, image field diameter (mm), with ticks from 3 to 200.]

A way to avoid this problem is not to use refractive lenses, but instead to use a reflective system. With this system, many of the classical optics problems can be avoided. A similar approach was taken long ago by Sir Isaac Newton when he suggested reflective optics as a way to avoid chromatic aberration effects in celestial telescopes. Some of the disadvantages of refractive optics are the following.

(1) The refractive lens must be color corrected to minimize chromatic aberration. This correction means that a portion of the Hg arc spectrum cannot be used, so available energy for resist exposure is lost. Second, there will be attenuation of the chosen wavelength by the lens system. And finally, the elements in the lens (there are often as many as 16) will to some degree reflect and re-reflect the transmitted rays and contribute to background glare, which reduces contrast at the image plane. All of these problems are avoided by reflective optics. The full output of the exposure system can be utilized. Much shorter exposure times and hence higher machine throughputs can be maintained.

(2) The focus and to some degree the magnification of the refractive lens is a function of the illuminating wavelength. The more narrowly controlled the wavelength, the better is the focus and magnification control, but the illuminating energy also becomes lower. Control and throughput are again put into an opposing situation.

(3) Finally, the very complexity of the refractive lens makes it subject to change with temperature fluctuation. The materials of reflective optics can be chosen for thermal stability rather than for optical design properties; consequently, the reflective system can be designed to compensate for temperature changes.
All of these items, throughput, resolution, and stability, contributed to the success the reflective optical system met in the market when Perkin-Elmer introduced their Model 100 Micralign™ in 1973. The basic elements of their optical design are shown in Fig. 20 [11]. In this system, light from a point on the mask falls on about one-half of the primary mirror and is reflected to the secondary. It is reflected back from the secondary mirror onto the other half of the primary from which it is focused onto the wafer. The distortions in the system can be corrected for a narrow, annular area concentric with the center of the mirror. If an arc is illuminated on the mask and the light falls on the mirror, it will be imaged on the wafer without distortion. In the Perkin-Elmer system, the mask is scanned through this arc of illumination at the same time that the wafer is scanned through the arc of focus. The narrower the illuminating slit is made, the finer are the geometries that can be defined, but light energy is decreased so scan times must increase. As the slit is widened, light falls on either side of the ring of perfect correction, so distortions increase as throughput increases.

Fig. 20. The basic components of the ring-field optics of the Perkin-Elmer reflective system. (From D. A. Markel [11].) [Diagram omitted: labeled elements are the mask, primary mirror, secondary mirror, wafer, and scan direction.]

There are several advantages in this system besides those inherent in reflective optics. These advantages are as follows. (1) The illumination pattern is scanned, so to some extent scaling up to larger wafers involves only lengthening the arc and scan dimension. (2) The ability to utilize any wavelength means that potentially any resist can be used on the wafer rather than just resists sensitive to the wavelengths for which the system was corrected.

The PE-240 machine is currently the industry workhorse for 4-in. wafer production. The published specifications for this machine are as follows.

(1) Magnification = 1:1.
(2) Throughput = 100 wafers/hr (1st mask), 60 wafers/hr (align).
(3) Image field = 4 in.
(4) Resolution = 1.5 μm.
(5) Depth of focus = ±8 μm (for 2-μm lines and spaces).
(6) Distortion = ±0.25 μm.
(7) Alignment accuracy = ±0.25 μm.

The system is telecentric, which means that magnification or registration is almost unaffected by the flatness of mask and wafer. Resolution is a completely different matter; it is seriously affected by flatness. For example, a 2-μm line can stay within ±10% only if the wafer and mask can stay within a ±5-μm depth of focus on the PE-120. For a 3-μm line this depth of focus becomes ±12.5 μm. The exposure characteristics of resist can enhance resolution by effectively increasing contrast.

It is interesting to note that the quoted resolution of this machine (PE-240) is unchanged from that of the K&S 689 while the distortion and alignment accuracy specifications have been divided by four. The lesson is that resolution is not especially useful unless placement or registration is under control. This same point will emerge in a later section as we get into x-ray lithography.

Some very interesting work has been done on registration by Hershel et al. [12] at Hewlett Packard and Makita et al. [13] at the Japanese Computer Development Laboratories. These workers have analyzed registration problems as they exist on wafers that have been patterned using projection printing. The machines used were PE Model 120. Their results are compared, where possible, in Table XVII. The Hewlett Packard study used a PE-test resolution mask. The Japanese used a five-level set of low-expansion photomasks (LE-30). These practical results on alignment and distortion are in substantial agreement. They do not agree, however, with the specification. A specification represents a 3σ, not a 1σ, value. If the overall registration error reported by Makita is accepted, the result is that a "usable" resolution estimate 3 x 2

TABLE XVII

                      JCDL^a (μm)    HP^b (μm)
Alignment error       0.26           0.25
Distortion            0.18           0.25
Thermal expansion     0.16           -
Photomask error       0.06           -
Residual error        0.08           -
Registration error    0.41

^a Japanese Computer Development Laboratories.
^b Hewlett Packard.

TABLE XVIII
Wafer Sequence Number as It Influences Variation of Mask Array from Its Thermal Equilibrium Dimension^a

                 Wafer sequence
Mask level       0     1      2     3      4     5     6     7     8     9     10
Contact holes    0     0.3    0.6   0.75         1.0   -                       1.0
Polysilicon      0     0.15   0.3   0.35         0.4   -                       0.4
Mask Δ           0     0.15   0.3   0.40         0.6   -                       0.6

^a Measurements are given in micrometers.

mask (the light-to-dark ratio) and number of wafers exposed. In this experiment wafers were sequentially exposed. The dimension across the wafer was measured on each wafer and compared to the initial mask dimension. These measurements were done on two different masking levels. These results are summarized in Table XVIII. The mask-to-mask difference, 0.6 μm, in this expansion is the result of nonequivalent heating of the masks under equal conditions of illumination. Since the span is 60 mm and the glass is LE-30, the temperature difference necessary to cause the misregistration can be calculated as ΔT = 2.9°C. If this same temperature difference is assumed to develop independent of glass type, we can estimate the misregistration that would develop between these masking levels for the various substrate materials (Table XIX).

This result is not peculiar to projection printing but applies to any exposure system. It is only with the use of projection printing that a complete solution to the problem is practical, i.e., the use of quartz as the mask substrate. It seems likely that some of this thermal problem has contributed to the reported registration error being so much larger than the machine specification error. Since Perkin-Elmer uses quartz resolution masks for their machine setup and qualification, this thermal problem, although real to a nonquartz user, is not comprehended in the published specification.

TABLE XIX
Steady-State Misregistration Caused by Thermal Heating and ΔT = 2.9°C

Substrate    60-mm span    90-mm span
Soda lime    1.45 μm       2.18 μm
LE-30        0.60 μm       0.90 μm
Quartz       0.09 μm       0.14 μm

The elements of resolution, focus, throughput, and registration are all interdependent to some degree in projection printing. The conditions that justify projection printing are die-size increase and design-rule reduction; hence, it is fair to characterize projection printing as it would operate to satisfy these conditions:

(1) Slit width = 1 mm.
(2) Throughput = 50 wafers/hr (1st mask), 40 wafers/hr (aligned).
(3) Resolution = 2 μm.
(4) Registration = 3σ = 0.50 μm (quartz mask set).
(5) Depth of field = ±5.5-8.0 μm (PE-120 or PE-240).

The registration number was obtained by adding three times the thermal error for a quartz set with differing levels of light to dark to the specified error (0.25) to get an rms estimate. In contrast to the registration and resolution, we find no mention at all of the control of geometry sizing (CD) in the projection printing specifications. The reason for this lapse is not hard to find. This subject is so dependent on the resist type used, resist processing sequences employed, and the device processing steps involved that little can be promised by the vendor that cannot in practice be quickly found to be untrue.

In mask making an almost ideal optical substrate is used, i.e., very flat uniform plates with extremely uniform coatings of photoresist. In wafer patterning the wafers can be nonflat, and the resist thickness on the wafers can vary greatly over diffusion steps on the wafer. An example of such a variation is shown in Fig. 21. When steps are introduced on the wafers, this control will degrade. The CD variation can be held to a 0.25-μm range if one is careful to optimize illumination conditions. The number 0.25 μm will be used as a characteristic of projection printing.
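The thermal figures in Table XIX and the 3σ registration estimate above can be cross-checked with a few lines of arithmetic. The sketch below assumes simple linear expansion, ΔL = α L ΔT, backs out the expansion coefficient implied by each 60-mm entry, and rescales it to a 90-mm span. For the registration estimate it combines the specified 0.25 μm with three times the quartz thermal error as a root sum of squares. The choice of the 90-mm quartz value (0.14 μm) is an assumption, since the text does not state which span was used, and all names are illustrative.

```python
import math

DELTA_T_C = 2.9          # temperature difference between mask levels, deg C
SPAN_60_MM = 60.0
SPAN_90_MM = 90.0

# 60-mm column of Table XIX, in micrometers.
TABLE_XIX_60MM_UM = {"soda lime": 1.45, "LE-30": 0.60, "quartz": 0.09}


def implied_expansion_coeff(misreg_um, span_mm, delta_t_c):
    """Expansion coefficient (per deg C) implied by dL = alpha * L * dT."""
    return (misreg_um * 1e-3) / (span_mm * delta_t_c)  # um converted to mm


def thermal_misregistration_um(alpha_per_c, span_mm, delta_t_c):
    """Misregistration in micrometers for a given span and temperature step."""
    return alpha_per_c * span_mm * delta_t_c * 1e3     # mm converted to um


def rms(*terms):
    """Root sum of squares of independent error terms."""
    return math.sqrt(sum(t * t for t in terms))


# Coefficients implied by the table (not independent material data).
implied_alpha = {
    glass: implied_expansion_coeff(um, SPAN_60_MM, DELTA_T_C)
    for glass, um in TABLE_XIX_60MM_UM.items()
}

# Rescale each entry to the 90-mm span for comparison with Table XIX.
misreg_90mm_um = {
    glass: thermal_misregistration_um(alpha, SPAN_90_MM, DELTA_T_C)
    for glass, alpha in implied_alpha.items()
}

# 3-sigma registration: specified 0.25 um combined with 3 x quartz thermal error.
SPEC_ERROR_UM = 0.25
QUARTZ_THERMAL_90MM_UM = 0.14   # assumed span; the text does not say which
registration_3sigma_um = rms(SPEC_ERROR_UM, 3.0 * QUARTZ_THERMAL_90MM_UM)

if __name__ == "__main__":
    for glass in TABLE_XIX_60MM_UM:
        print(glass, round(misreg_90mm_um[glass], 3), "um at 90 mm")
    print("3-sigma registration (um):", round(registration_3sigma_um, 2))
```

With the 90-mm quartz value the estimate comes to about 0.49 μm, close to the 0.50 μm quoted; the 60-mm value would give about 0.37 μm.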

Fig. 21. The variation of resist thickness as it passes over a diffusion step on a wafer. (From A. R. Neureuther et al. [14].) [Diagram omitted: labeled elements are the resist, step, substrate, constructive node, and destructive node.]

Fig. 22. Simulated resist profile corresponding to the thickness in (a) to (d) in Fig. 21. (From A. R. Neureuther et al. [14].) [Plot omitted: horizontal axis, distance from line center (μm), from -1.0 to 1.0.]

Cosmetic Quality of Projection Printed Wafers

One of the main reasons for projection printing was to improve the wafer cosmetic quality. The contributors to wafer defect density are (1) chrome defects on the mask, (2) resolvable contamination on the mask, and (3) the defects on the wafer itself. In the section on contact-printed wafers, we saw that mask wear rate and contamination varied with the wafer. The contamination density was found to be 10× or 5× the permanent defect density for silicon nitride and polysilicon, respectively (Table XX). These are defects that are adduced to be on the wafer, judging from the wear and contamination on the mask. If we assume only one-half the wafer defects leave a trace on the mask, then we can estimate the wafers to be from 0.04 to 0.14 defective die/cm², depending on the film on the wafer. The average of this number, 0.09 defective die/cm², is taken as characteristic of the wafers. We shall make the assumption that the mask can be made no cleaner than the wafer, so the estimate for contamination will be the same as the wafer, i.e., 0.09. The sum of defects on the projection printed wafer, which excludes chrome defects on the mask, will be 0.18 defective die/cm².

TABLE XX
Wafer Quality^a

Film type                  Wear rate    Contamination rate    Total
Polycrystalline silicon    0.0115       0.058                 0.070
Silicon nitride            0.002        0.020                 0.022

^a Measurements given in defective dies per square-centimeter wafer.

Future trends in projection printing are the move to ever-larger scanned fields and smaller geometries. As we mentioned earlier, this trend for reflective optics does not involve an ever more elaborate and expensive lens design as it would in refractive optics. There are currently three companies introducing 1:1 projection scanning machines onto the market (Table XXI). All are reflective systems.

TABLE XXI
1:1 Projection Scanning Machines

Perkin-Elmer PE-300 PE-300 PE-500
Registration (μm)
Field (in.)
0.5 0.5 0.25 0.25
4 4 5 5
60 50 50 60
3-5
4 4
100
Canon MPA 500 FA Cobilt CA 3000 CA 3400
- - -
300 nm 436 nm 240 nm
UV^a
1.25 1.125 1.125 0.9
UV
1.5
0.19 0.19
UV UV
2.0 2.0
0.17
Capacity (wafer/hr)
(μm)
Resolution N.A.
Machine
0.5 0.8

^a UV = 365-nm, 405-nm, 436-nm line.

The use of the deep-UV, 240-nm wavelength has been a difficult task for the machine vendors. The light source employs at least one lens element, and the optical coatings on the mirrors must be changed if 240 nm is to be effectively used. Several resists are currently available for deep-UV printing (Table XXII) [8]. As a point of reference, a 1800-nm film of AZ1350J in "normal" UV exposure (365, 405, and 436 nm) has a threshold exposure of 2.5 sec. The problem of deep-UV exposure is to match a rich source output in the 240-nm region with a resist that is sensitive in the same region. One such source is the Xe-Hg arc lamp, which peaks at 240 nm. The result of using deep UV is that resolution is usually gained at the expense of throughput. The solution for this problem is to increase the lamp intensity.

TABLE XXII

Resist type    Exposure threshold (sec) at 240 nm    Thickness (nm)
AZ2400         4                                     1000
PMIPK          8                                     500
OFPR800        10                                    1300
AZ1115         14                                    1000
AZ1350J        15                                    1800
PMMA           30                                    550

Another trend in projection and proximity printing is the use of two-level resist systems. As was mentioned earlier, the presence of steps on the wafer leads to extreme variations in the resist thickness. Two levels of resist can smooth the wafer surface considerably. This technique was discussed by Lin [15]. In that application, a thick film of PMMA (2 μm) is spun on the wafer. On top of this film a thin (0.2-μm) coating of AZ1350 is spun. The masking pattern is imaged in the thin AZ1350 and good resolution can be achieved. The PMMA is exposed in a flood lamp condition using the AZ1350 as a perfectly contacted mask for the deep UV. Since the imaging is done on the more uniform AZ1350 resist, CD uniformity can be improved to essentially that of a flat, featureless wafer. In this manner the benefits of the high contrast of contact printing can be obtained without the problems associated with the contact step.

Both of these techniques, deep UV and the multilevel resist system, represent ways of obtaining better contrast and resolution in projection printing. Another and more direct way is to use an optical system with a higher numerical aperture and hence better resolution. This approach is embodied in the various wafer steppers, which image single die or groups of dies at various magnifications onto the wafer. These machines will be the subject of the next section.
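The defect-density estimate made earlier in this section from Table XX is a short chain of arithmetic, sketched below in Python. The one-half trace fraction and the equal mask and wafer contributions are the assumptions stated in the text; the variable names are illustrative.

```python
# Table XX, in defective die per square centimeter of wafer.
TABLE_XX = {
    "polycrystalline silicon": {"wear": 0.0115, "contamination": 0.058, "total": 0.070},
    "silicon nitride": {"wear": 0.002, "contamination": 0.020, "total": 0.022},
}

# Assumption from the text: only one-half of the wafer defects leave a trace
# on the mask, so the wafer defect density is the mask total divided by 0.5.
TRACE_FRACTION = 0.5

# Ratio of contamination density to permanent (wear) defect density per film.
contamination_ratio = {
    film: row["contamination"] / row["wear"] for film, row in TABLE_XX.items()
}

# Estimated wafer defect density for each film type.
wafer_defects = {
    film: row["total"] / TRACE_FRACTION for film, row in TABLE_XX.items()
}

# Average over the two films, taken as characteristic of the wafers.
average_wafer = sum(wafer_defects.values()) / len(wafer_defects)

# Assumption from the text: the mask is no cleaner than the wafer.
mask_contamination = average_wafer

# Total for a projection-printed wafer, excluding chrome defects on the mask.
projection_total = average_wafer + mask_contamination

if __name__ == "__main__":
    for film, ratio in contamination_ratio.items():
        print(film, "contamination/wear ratio:", round(ratio, 1))
    for film, density in wafer_defects.items():
        print(film, "wafer defects per cm^2:", round(density, 3))
    print("average:", round(average_wafer, 2))
    print("projection total:", round(projection_total, 2))
```

The unrounded average is 0.092, which the text rounds to 0.09 before doubling it to reach 0.18 defective die/cm².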

D. Wafer Patterning by Step-and-Repeat Cameras

For years master masks had been made using the 10:1 step-and-repeat cameras, and the industry was familiar with their performance. When higher resolution and registration were needed on wafers than the then-available projection printers could supply, the performance of these 10:1 cameras was attractive as an alternative. There are several obvious advantages. The 0.28-0.35 N.A. of the 10:1 lenses will provide better resolution than the 0.17 N.A. of the projection printer. The practical numbers for resolvable geometries are 1.25 μm versus 2.0 μm. The die-by-die exposure technique carries with it a capability of die-by-die alignment when autoalign becomes available. In this way extremely tight registration can in principle be maintained on very small geometries. These considerations were very attractive, especially when it is considered that the PE-100 was the then-available projection printer against which the stepper performance could be gauged.

In the present day the performance difference is not really so great. The 1:1 projection scanners have improved considerably while the details of wafer stepping have been resolved. Both techniques are currently represented in the marketplace with several vendors supplying machines of either type.

A detailed description of wafer stepping is not necessary since we covered the step-and-repeat camera in an earlier section. The main difference between mask patterning and wafer patterning is the aligning of the layers on the wafer. This requirement means some provision must be made for viewing the wafer and aligning it to the camera stage motion. Two approaches which have been taken to aligning are off-axis alignment and through-the-lens alignment. In the former scheme, which is used by the GCA 4800, the wafer is aligned to targets on the stepper. The wafer is then moved to a position under the lens and a "dead reckoning" stepping procedure is used to expose the wafer. Obviously the target alignment to the stepper motion must be very accurate as this positioning determines the subsequent alignment accuracy of the other layers. In practice we have found that this procedure is satisfactory for 2-3-μm design rules.

In the through-the-lens alignment, there is an advantage of simplicity in that the optics are directly aligned to the wafer. The disadvantages are as follows: The exposure optics must be color corrected to allow through-the-lens wafer viewing at an optical frequency which is not destructive to the resist, or the wafer can be aligned using low-intensity UV, which is viewed using a video camera. These approaches mean that either beam-splitting mirrors or video cameras must be introduced for viewing. Occasionally, the films on the wafer and the resist are such that destructive interference conditions occur. Wafer contrast then becomes so low as to be unusable when viewed through the exposure optics. In such cases off-axis broadband viewing is necessary even to see the patterns. An alternative is to change the thickness of the resist or wafer films to avoid these interference conditions. If such problems are kept in mind when the processes are specified, it is usually possible to avoid them.

A fundamental question that remains to be resolved is "How much value does automatic alignment net the user?" The usual answer is that the die-by-die alignment allows any sort of wafer distortion to be compensated as an in-line process. We feel that the answer depends on the distortion. There are two sorts of wafer distortion, elastic and inelastic. Elastic distortion principally results from uncompensated front and back surface tension forces.
An example often encountered is caused by unbalanced oxide growth on the two wafer surfaces. Such distortions are reversible and involve few or no dislocations. When the unequal films are removed, the distortion disappears. This distortion results in symmetric run-in or run-out. Process modifications can minimize this problem in many cases. The 1:1 projection scanners can only perform a best fit of a mask to such a distorted wafer. If the distortion is unavoidable but well characterized and stable, compensated 1× masks can be made that will match the average wafer behavior. Wafers deviating from the average will again be best fit, not matched. For this application the two-point alignment of the 10:1 stepper is a solution and has an advantage over the 1:1 projection scanner. Two-point alignment can be performed in either the manual or automatic mode.

Nonelastic wafer distortions are not reversible, tend to be random, and always involve dislocations. The dislocations in most cases degrade device properties, so accurate registration is of little benefit. Such wafer distortions are a symptom of processing problems and not an unavoidable fact of life. As such, they need to be fixed, not matched.

TABLE XXIII

Exposure time (msec)    200    300    400    500    600
Wafers/hr               28     25     23     21     19

So the main advantage of wafer stepper alignment is that it tends to make elastic wafer distortions transparent to both the wafer processing and the mask-making engineer. The disadvantages of automatic or die-by-die alignment are several. The first is throughput. The alignment time is given as being from 0.25 sec (TRE, GCA, Optimetrix) to 0.4 sec (Nikon) per alignment. The exposure time is generally in this range, so the addition of autoalign has the effect of doubling the exposure time. Table XXIII gives the GCA/Mann-reported relationship between exposure time and 100-mm wafer throughput for a GCA 4800 stepper at 10:1 using a 2 × 1 64K DRAM-arrayed reticle. Increasing the exposure time from 300 to 600 msec caused a 24% decrease in wafer throughput. Adding autoalignment should produce the same change.

The second disadvantage is the available contrast. In the occasional situation where autoalignment does not work, levels will have to be aligned on some other machine such as a 1:1 projection scanner, or the process must be modified. A more subtle problem will occur when the contrast is low; the alignment can still be performed, but the accuracy is poor. When this condition occurs, everything will appear to be functioning well, but the alignment error will increase.

These considerations suggest that automatic alignment on a die-by-die basis will be useful or even necessary when 1-μm design rules are needed. Before one makes a commitment to automatic alignment, it would be best to review the processing steps to anticipate interference conditions and make changes to avoid them.

For devices having looser design rules, a two-point alignment should be satisfactory. This alignment will compensate elastic distortion and not impact throughput as severely as die-by-die alignment. The two-point alignment need not really be automatic since two manual alignments are not excessively time consuming.
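The throughput penalty described above for Table XXIII can be checked numerically. The Python sketch below reproduces the 24% figure and then fits a straight line to time per wafer against exposure time. The linear model (a fixed overhead plus a number of exposures per wafer) and the idea that each alignment simply adds to each exposure are assumptions made for illustration, and the function names are hypothetical.

```python
# Table XXIII: exposure time (msec) -> 100-mm wafers per hour, as reported.
TABLE_XXIII = {200: 28, 300: 25, 400: 23, 500: 21, 600: 19}


def percent_decrease(before, after):
    """Percentage drop from 'before' to 'after'."""
    return 100.0 * (before - after) / before


def seconds_per_wafer(wafers_per_hour):
    """Convert a throughput in wafers/hr to seconds spent per wafer."""
    return 3600.0 / wafers_per_hour


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


exposure_s = [ms / 1000.0 for ms in TABLE_XXIII]
time_per_wafer_s = [seconds_per_wafer(w) for w in TABLE_XXIII.values()]

# Slope ~ effective exposures per wafer; intercept ~ fixed overhead in seconds.
exposures_per_wafer, overhead_s = fit_line(exposure_s, time_per_wafer_s)

drop_300_to_600 = percent_decrease(TABLE_XXIII[300], TABLE_XXIII[600])


def throughput_with_autoalign(exposure_ms, align_ms):
    """Wafers/hr if each alignment adds directly to each exposure (assumed)."""
    per_wafer = overhead_s + exposures_per_wafer * (exposure_ms + align_ms) / 1000.0
    return 3600.0 / per_wafer


if __name__ == "__main__":
    print("decrease, 300 -> 600 msec (%):", drop_300_to_600)
    print("fitted exposures per wafer:", round(exposures_per_wafer))
    print("fitted overhead (sec):", round(overhead_s))
    print("300 msec + 300 msec align:", round(throughput_with_autoalign(300, 300), 1))
```

On these assumptions the fit gives roughly 150 exposure-equivalents per wafer and roughly 100 sec of fixed overhead. Both numbers are inferences from the rounded table values, not figures reported by the vendor.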
It is good practice in all processing to take steps to avoid introducing even elastic deformations when it is possible. If such processing practices are observed, then there is no distortion to compensate. In such a case the 1:1 projection scanner is as useful a solution as the 10:1 stepper and has the advantage of higher throughput.

1. Resolution of the Wafer Stepper

As was seen in discussing the 1:1 projection scanner, there is a difference between the optical resolution and the usable resolution. The latter is tied to the registration. Since these machines have a higher numerical aperture, ranging from 0.28 to 0.35, they are capable of resolving smaller images than the 1:1 scanners. These images are 1.2 and 1.0 μm, respectively. The registration of these steppers is a combination of the error in alignment and the intrinsic machine die-fit precision. This number is most widely stated as 3σ = ±0.35 μm. The 2.5× rule for practical resolution then gives a 0.9-μm value, which means that the optical resolution and the useful resolution are very similar. The optical resolution, 1.2 μm, which appears in Table I, is used as characteristic of 10:1 wafer steppers.

2. Registration of the Stepped Wafers

The number specified for array misregistration, ±0.35 μm, applies to two matched cameras. The specification of a single machine is ±0.25-μm precision; so for two machines that have had their averages made identical, the precision is √2 × 0.25 μm = 0.35 μm. Strictly speaking, the precision of N matched machines must be ±√N × 0.25 μm. In most device sets, each level does not have to register with every other level with the same requirements on precision. In most devices a critical relationship exists only between two levels, so ±0.35 μm is an appropriate value for array misregistration. The other elements of misregistration (reduction, rotation, and lens distortion) all are about 0.3 μm, so the total misregistration that will characterize an aligned wafer will be 0.45 μm. In the calculation of the useful resolution, only the die-fit misregistration was considered since array misfit can be compensated when two-point aligning is used.

3. Dimensional Control on Stepped Wafers

The same problem that faced the 1:1 projection scanners is faced by the 10:1 steppers: resist thickness variation over steps on the wafer. The number cited in that section, ±0.25 μm, applied to the 1:1 scanner, where multiple wavelengths in the UV can be used. As was mentioned in that section, the use of multiple wavelengths to smooth the interference patterns is often not available to a refractive lens system. For example, the Zeiss S-Planar 10-77-82, the 10:1 lens used on the GCA and TRE steppers, is g-line (436 nm) corrected, as is the Nikon NA = 0.35 lens for their NSR 10106 wafer stepper. Of the commercially available steppers, Optimetrix and Canon offer a choice of 436-nm and 404 + 436-nm lenses.

The consequence of using monochromatic light is variation of geometry CD as wafer steps are crossed. Neureuther et al. [14] have shown that for a reflective substrate such as aluminum the variation can be large. The adjustment of the partial coherence from S = 0.7 to S = 0.3 (more coherent) can give some relief. For substrates such as silicon, the double-wavelength and single-wavelength approaches appear roughly equivalent. La Rue and Ting [16] have shown the variation given in Table XXIV. We see that the variation can range from ±0.5 to ±0.1 μm depending on the exposure conditions, surface being exposed, etc. We expect, however, that we shall have to accept poorer control of CD with the refractive lens system. We have used ±0.3 μm as a characteristic value for the direct-stepped wafers (DSW).

TABLE XXIV
Resist Image Size Tolerance (a)

Average resist thickness (nm):    800                   1600
Exposing light (nm):              436    405 + 436      436    405 + 436

Surface                           Tolerance (μm)
Al                                1      0.4            1      1
Silicon + 500-nm oxide            1      0.3            0.3    0.2
Silicon + 50-nm oxide             0.2    0.4            1      1

(a) With ±70-nm resist thickness for geometry = 1.5 μm.

4. Cosmetic Quality of Stepped Wafers

This characteristic is very much a function of the care in handling given the wafers and the environment of the wafer fabrication area. The stepper reticle must be free of all killing defects if it is to be usable at all. However, the wafers now become the defect-bearing item. From the section on projection printing, we have the estimate of 0.09 defective die/cm² as a characteristic of the wafers. With the steppers this is the total cosmetic contribution. One of the real advantages of DSW is that the 10× reduction of small contamination and random defects can lead to their elimination.

5. Wafer Stepper Throughput (10:1)

This number will vary depending on the reticle size that is employed. The production figures called out in vendor specifications most usually assume a maximum-size reticle. This sizing means that smaller die are arrayed to build up a reticle that as nearly as possible will fill the stepper lens. It also means that as much as possible the reticle is left on the machine, because changing such a reticle is often a time- (and capacity-) consuming proposition. An alternate strategy of DSW operations is to try to keep the overall reticle dimension smaller. For example, 7.6 mm × 7.6 mm rather than the maximum 10 mm × 10 mm (for a 14-mm lens) could be used. This strategy has a derating effect in that the stepping time can be increased by as much as 73%. The benefit of this strategy is that reticle preparation can be more quickly done. In a wafer fabrication area where a diverse mix of devices is being built, the latter strategy would be appropriate. In a single-product front end, the former is more likely to be effective.
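The 73% figure is consistent with a simple area argument, sketched below in Python. This is an illustration rather than the authors' calculation: it assumes that the number of exposure sites needed to cover a wafer scales inversely with the field area, that the time per step is unchanged, and that partial fields at the wafer edge can be ignored. The function name is hypothetical.

```python
def relative_step_count(max_field_mm, used_field_mm):
    """Ratio of exposure sites per wafer for a smaller square field versus the maximum field.

    Assumes site count is inversely proportional to field area (edge effects ignored).
    """
    return (max_field_mm / used_field_mm) ** 2


# Field sizes quoted in the text: 10 mm x 10 mm maximum versus 7.6 mm x 7.6 mm.
ratio = relative_step_count(10.0, 7.6)
print(f"extra stepping with the smaller field: {ratio - 1:.0%}")
```

Under these assumptions the smaller field needs about 1.73 times as many steps, which matches the "as much as 73%" increase in stepping time stated above.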


The numbers quoted in specifications range from 30 to 50 wafers/hr. In a real-life situation about 30 wafers/hr of the first pattern can be run. When alignment is involved, we can assume the 24% derating, which was discussed in the section on autoalign, to get a run rate of 23 wafers/hr. The value we shall use as characteristic of direct step (10×) throughput will be 26 wafers/hr.

6. Future Trends in Wafer Steppers

The situation in the field of wafer steppers is in a state of flux. Two vendors (Canon and Ultratech) are offering 1:1 wafer steppers. In this case a compromise is being struck between numerical aperture and field size. By making this compromise, an N.A. greater than 0.17 (that of the 1:1 scanners) can be used, and improved resolution becomes possible. The wafer patterns are stepped using large arrayed reticles, so productivity will be greater than that of the 10:1 steppers. These machines incorporate automatic alignment and relatively imprecise stage movements. The purpose of the latter is to keep the machine cost down; the former is to allow good precision in spite of the stage. If, however, wafer condition does not allow the autoalign to work, there is a serious problem. These wafers must then either be manually aligned step by step or patterned on another machine. The imprecision of the stage will not allow a "dead reckoning" patterning.

The values for 1:1 stepping shown in Table I represent our estimate of these machines' performance as judged from vendor specification, not actual practice. The comparison in Table XXV should help illustrate the difference between the low-N.A. 1:1 scanners and higher-N.A. steppers. The DUV is deep UV or the 240-nm Hg arc line. The deep-UV productivity was derated to account for longer exposure times. The specified rates of throughput for these machines are much larger than these numbers. The published numbers may represent peak rates that are occasionally achieved, but it is difficult to believe that the numbers can actually be used to plan wafer capacity. A potential advantage some of the machines may have is autoload and autoalignment of reticles. If these features are really effective, then productivities of the order of those quoted may really be achievable.

TABLE XXV

                               Lens    Resolution    Throughput
                               N.A.    (μm)          1st mask    Align    Telecentric
1:1 Scanners (UV)              0.17    2.0           100         60       Yes
1:1 Scanners (DUV) (a)         0.17    1.2           70          45       Yes
1:1 Stepper                    0.20    1.5           40          35       Yes
10:1 Stepper (GCA/TRE)         0.28    1.3           30          20       Yes
10:1 Stepper (Optimetrix)      0.32    1.0           30          20       Yes
10:1 Stepper (Nikon/Canon)     0.35    0.9           30          20       Yes

(a) DUV = deep UV or the 240-nm Hg arc line.

The comparison between the 1:1 scanners and the wafer steppers appears to be fairly even. The scanner will probably be able to deliver better productivity than the steppers. We feel that the problems with reticle preparation are not being taken seriously by the vendors of the DSWs in quoting throughput. If the deep-UV feature of the 1:1 scanner is an effective feature, it will allow resolution in the 1-μm range, which puts it in the same range as the higher-N.A. steppers. The matter of registration favors the steppers in that they can be used in two-point alignment to compensate for elastic wafer deformation. If the wafer deformation is well characterized, a mask set can be built for the 1:1 scanner that will minimize the impact of this problem.

As device rules are pressed beyond the 1-μm limit into the submicron range, we leave optical techniques behind. The next section will treat what we see as the techniques available for submicron patterning: e-beam and x-ray wafer writing.

E. Electron-Beam Wafer Patterning

In order to penetrate beyond the 1-μm barrier, nonoptical techniques have become necessary. One of the most likely techniques, of course, is electron-beam lithography. Beams of sizes down to 0.1 μm can be formed using commercially available machines, so it appears to be a field of great promise. Some of the potential advantages of electron-beam lithography (EBL) are the following.

(1) The equipment development appears to be well under way. Just as the wafer steppers got their prototype equipment from the photomask-making step-and-repeat cameras, so have the EBL wafer exposure systems gotten equivalent experience from the mask exposure EBES systems.
(2) For reasons similar to those above, the resist systems appear to be in place or in active development.
(3) For over 15 years developmental and quasi-manufacturing wafer-writing efforts have been under way at Western Electric, IBM, Texas Instruments, Hewlett Packard, Hughes Research, Hitachi, and several other industrial corporations.

These are powerful, positive factors that benefit the EBL approach for wafer writing. The disadvantages of EBL are analogous to all those that afflicted the optical methods, with a few extra that are peculiar to EBL. Some of these problems follow:

(1) Wafer throughput: Whereas a photomask that could be exposed in 100 min was considered acceptable, a like exposure time for a wafer would constitute a great drawback in any process engineer's mind. A great deal of effort is being expended on just reducing the write time for wafers.
(2) CD control: (a) As we saw earlier, the proximity effect must be compensated by special writing strategies if the small geometries are to be written with control. (b) The resist thickness variation over wafer steps will lead to CD variation. This variation will occur in spite of the very large depth of field of the electrons. It will occur because of the dose dependence and backscatter dependence of CD on resist thickness.
(3) Wafer alignment marks must be devised that are useful for precise alignment.

We feel that writing speeds will eventually be obtained that are sufficient for economical use of EBL. Hewlett Packard has discussed a 300-MHz beam writer. Most of the in-house e-beam wafer writers are formatted for vector scan, and several involve some version of shaped-beam writing, both of which will contribute to more rapid writing. In addition, the commercial e-beam vendors are developing vector scan wafer-writing machines.

The last problems, CD control and alignment marks, are to some degree interdependent. CD control, especially over wafer steps, runs into the same problems that were encountered in direct 10:1 wafer stepping. That is, large resist thickness changes occur at these steps. The problem can also be relieved by use of a multilevel resist strategy. The wafer features are smoothed by coating the surface with a deep (~2-μm) resist film. A second layer, often metal, serves to isolate the first film from the uppermost or third level, which is thin (0.5-μm) resist. Since the features have been smoothed, the top resist coating will be very uniform as well as thin, so CD uniformity will be much better than with single-level resists. This resist process is illustrated in Fig. 23. The thick lower layer of resist is patterned using either plasma ash or flood UV exposure and develop.
Alignment schemes mentioned in the literature involve a feature that can be found for alignment purposes. Since the aim of multilevel resists is to obscure features, they tend to work against the purpose of alignment. The answer that is emerging is to use exceptionally deep grooves (7-10 μm) for alignment purposes [17]. Such grooves are deeper by a factor of 3 to 5 than most wafer features.

Fig. 23. The multilevel resist procedure for wafer surface smoothing. (a) A first level smooths features. (b) The top level is patterned. (c) The lower level is patterned using the top level as a portable conformable mask (PCM).

A very desirable feature of the multilevel resist systems is that they tend to mask the contribution of the substrate to the proximity effect. This effect is still not completely understood but is held to result from electron scattering as the beam penetrates the resist and substrate. Since backscattering becomes worse as the atomic mass of the scattering center increases, substrates such as gold will tend to have more severe problems with backscattering than will lower atomic mass substrates. If the patterned resist rests on a "substrate" of PMMA, which is primarily carbon, substrate backscattering can be lowered.

Table XXVI gives calculations of Greeneich [18] as they relate to a multilevel system. Two different resist schemes are compared. The first employs 2-μm PMMA as a single-level resist. The second employs 0.5-μm patterning resist resting on 50 nm of Al as an interlayer for pattern transfer. These in turn rest on 2 μm of PMMA, which serves the purpose of smoothing the wafer surface. The electron dose for each resist structure and substrate was always adjusted to be sufficient to clear an isolated 0.5-μm contact hole and achieve the nominal CD. The proximity effect was judged by calculating the width of a nominal 0.5-μm resist gap between two exposed windows. As the scattering increases, the gap shrinks. The effect for various substrates is shown in Table XXVI. The case of "no substrate" implies the 2-μm PMMA was self-supported and only scattering within the resist itself could contribute to the proximity effect. The "PMMA substrate" assumes a backscatter appropriate to a PMMA film of infinite thickness. The trilevel structure gives gap widths that are very similar to the PMMA substrate case. Thus, this technique shows promise of making the control of resist patterns less dependent on the substrate material.

TABLE XXVI

Substrate    Resist structure      Ratio: actual/nominal pattern
None         2-μm PMMA             0.93
PMMA         2-μm PMMA             0.72
Si           2-μm PMMA             0.25
Permalloy    2-μm PMMA             0.09
GaAs         2-μm PMMA             0.00
Si           Trilevel structure    0.68
Permalloy    Trilevel structure    0.77
GaAs         Trilevel structure    0.70
Au           Trilevel structure    0.78

The leveling effect of the first level of resist can be judged by an empirical estimate that the resist surface is essentially flat when the film thickness is 2.3 times the feature depth for groove-shaped features [19]. This rule implies that a 2-μm first resist film will provide for features of up to ~900-nm depth, which would be sufficient for many processes in current use.

The proximity effect then is being attacked from both software and processing directions. Software corrections for proximity are still necessary even with the multilevel resists. The backscatter is decreased but still exists. Proximity corrections most likely will be necessary if good control on submicron CD is to be maintained. The software corrections will usually tend to slow writing time, while the multilevel approach will slow processing time.

In terms of characterizing electron-beam direct wafer writing, the current capability is 1 μm, but submicron (~0.5 μm) as a production capability will become commonplace in 2 years. We have used 0.8 μm as the EBL characteristic in Table I.

The effort that is going into EBL is really immense. The government-sponsored 5-year VHSIC program will infuse about $170 million into EBL development. Commercial effort is probably as costly.

The registration that EBL is capable of demonstrating for masks (0.125 μm) will probably not be improved for wafers. There are problems with signal to noise in alignment-mark location finding that will offset any improvements in stage precision that may be made.
Also, the demand for throughput will begin to conflict with the need for thermal stability of the wafers, and compromises affecting registration will have to be struck.

The CD control on wafers will inevitably be poorer than that which can be obtained on masks but will be about 0.15 μm. Even this control will only be maintained if multilevel processing and proximity software are developed.

Visual quality will probably be no worse than that which can be obtained using EBL on masks since similar resists and processes are involved.

The throughput will probably not be much improved over current estimates of 1 hr/wafer for submicron design rules. The more complicated wafer processing and the software will take their toll in consuming any write-time improvements the hardware people may make. This does not mean that for larger design rules higher throughput will not be available.

The EBL approach appears to be one that will be used when necessity dictates, not as a technique of choice. By necessity we mean when submicron geometries are needed or when the overall cycle time of tape to wafer needs to be kept small. Both IBM and Hughes Research utilize direct writing whenever cycle-time considerations are paramount.

Regardless of the writing strategy of any e-beam machine, it still must serially write the patterns one element at a time. Almost as a given, a whole-wafer exposure technique will have a cycle-time advantage. Such considerations have led to the exploration of x ray as an alternative to EBL for achieving submicron geometries. In the next section, this technique will be discussed.

F. X-Ray Lithography (XRL) for Wafer Exposure

The arguments of resolution led to both EBL and x rays as microlithography techniques. The advantages of x ray are the following.

(1) The wavelengths used range from 3 to 10 Å, so the corresponding resolution will be good. The actual usable resolution, however, is of the order of 0.5 μm due to secondary electrons and shadowing effects.
(2) Since x rays cannot be simply focused, it is necessary to use a shadow-casting (proximity print) technique for imaging. Because of this similarity to proximity printing, a similar experimental setup can be used, which means equipment cost can be kept low and implementation should not be difficult.
(3) In many cases full wafer exposure can be achieved. Throughput rates then are only limited by resist sensitivity. Bell Laboratories has recently announced an x-ray exposure machine that potentially has a capacity of 75 wafers/hr when coupled with a new, sensitive resist.
(4) Contamination (at least organic contamination) on the mask will be largely transparent to the x rays.
(5) Diffraction, which was a severe limitation on optical proximity printing, is a very small factor for x rays.

In light of these outstanding advantages, why has such effort gone into EBL and such comparatively little effort into XRL? The answer boils down to two factors, the masks and the x-ray sources. Of these two items, the weighting factors of importance are probably 10:1, respectively.

A useful x-ray mask, like a good optical mask, must have good transmission in the clear areas and high absorption in the areas intended to be opaque. The problem with x rays is that they tend to be absorbed by most materials, so getting good transmission in the clear areas has been a matter of getting low atomic number materials in thin cross sections. The "thin-cross-sections" requirement is a really serious problem in any mask-making effort. Orthodox methods of achieving good flatness have been to use thick substrates.
A thin substrate is difficult to get flat, is easily stretched or deformed, and is often broken. Some of the masking substrates that have been described in the literature are boron nitride, polyimide, Mylar, silicon, tantalum, and titanium. A typical x-ray mask can be formed from a silicon wafer. Boron nitride is chemically deposited onto the silicon surface; polyimide is spun over the boron nitride.


TABLE XXVII

Element           Pd      Rh      Mo      Si      Al      Cu       C
Wavelength (Å)    4.36    4.60    5.41    7.13    8.34    13.36    44.70

The circular center region of the wafer is etched away, leaving the boron nitride-polyimide film as a window. The masking pattern is formed on the windows in a thin film of gold-tantalum [20]. The silicon film may have to be as thin as 5 μm if it is to be transparent to the x rays. Beryllium, which is commonly used as the windows on x-ray tubes, has also been suggested as a substrate. A 125-μm (0.005-in.) film could be used. As these masks get larger, the problems with stability and flatness become increasingly worse.

In addition to this flatness problem, the source begins to enter into the situation as a problem. In order to achieve short exposure times, a bright source is necessary. Conventionally, x rays are formed by electron bombardment of a material. In Table XXVII are listed some materials that have been experimentally used as x-ray sources [21]. To achieve high source brightness, a high electron-current density must be employed. When the current is high, there is danger of melting the target and destroying the equipment. There have been two approaches to keeping the target cool. The target can be rotated and water cooled, as is done in a number of commercial hard x-ray sources. The problem with this approach is that the source position will not be stable. This instability will cause pattern spreading and loss of CD control. Bell Laboratories has used a stationary palladium anode that is conical and water cooled. The electrons focused on the surface of the cone allow an increase in power over that which would result from focus onto a flat surface without excessive heating.

Other illuminators that have been discussed are plasma and synchrotron x-ray sources. The plasma sources could offer the benefits of a large stable illuminator. A synchrotron source is rich and uniform in x rays but expensive and inconvenient. The interaction of the source and the masks could result in mask heating.
Some x rays are absorbed in the membrane and more in the masking material. As we saw in optical work, this sort of heating can lead to a divergence in registration between different mask levels. This divergence will be especially severe when the light-to-dark ratios differ by a large amount.

The x-ray resists need to have some absorbent atoms in the polymer molecules. Many resist polymers have been explored. Those containing chlorine, bromine, sulfur, phosphorus, or heavy atoms tend to be good absorbers [22]. The characteristics of x-ray lithography can be summarized as follows.

1. Resolution

The "inherent" resolution of the system based on wavelength considerations is for very small geometries. The x rays cause secondary electrons to be emitted by the resist. These electrons have a range of ~100 nm and tend to expose the resist. If a 0.1-μm line were irradiated, we would find the secondaries exposing a band of approximately 0.1 μm on each side of the geometry. So ~0.25 to 0.3 μm is the minimum linewidth that can be exposed. The problem of variation of resist thickness over wafer steps also affects XRL. In the attempt to make the resists more sensitive to x rays, they also became sensitive to CD problems arising from resist-thickness variations. As a consequence, multilevel resists are beginning to be discussed in XRL papers. All of this processing introduces pattern spreading and loss of resolution. We feel at this time that 0.5 μm is a usable resolution that can be delivered by XRL.

2. Registration

There is no reason to expect inherently better registration than can be achieved using optical proximity print. Indeed, the nature of the XRL mask should militate against even comparable registration. The shadow-casting technique as it is used in x-ray lithography has some additional problems as regards registration. In Fig. 24 this sort of geometry is illustrated. As the gap distance changes, the placement of a geometry will change. It is not unusual to find ±2-μm flatness variation on a wafer. If we assume a 2-μm change in the Z location of the wafer, this will constitute a shift in the geometry by 0.25 μm for geometries placed 2 in. from the axis of the machine. This sort of variation is a direct result of the noncollimated light. Optical proximity printing will not be subject to this problem because well-collimated light sources can be used. This calculation takes no account of mask nonflatness; only wafer flatness is considered.

Fig. 24. The proximity geometries for x-ray lithography.
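The shift quoted above follows from point-source similar triangles, which the Python sketch below works through. It is an illustration and not taken from the original: the function names are hypothetical, and the source-to-mask distance is not stated in the text, so the sketch back-calculates the value that the quoted numbers imply and treats it as an assumption.

```python
def lateral_shift_um(gap_change_um, radius_mm, source_distance_mm):
    """Shift of a shadow-cast feature when the mask-to-wafer gap changes.

    For a point source at distance D above the mask, a feature at radius r
    from the axis lands at r * (1 + g / D) on the wafer, so a gap change dg
    moves it by r * dg / D. Radius and distance share a unit (mm) and cancel,
    leaving the result in the unit of the gap change (micrometres).
    """
    return gap_change_um * radius_mm / source_distance_mm


def implied_source_distance_mm(gap_change_um, radius_mm, shift_um):
    """Source-to-mask distance that reproduces a quoted shift (inverse of the relation above)."""
    return gap_change_um * radius_mm / shift_um


INCH_MM = 25.4
radius_mm = 2 * INCH_MM  # feature 2 in. from the machine axis, as in the text

# Assumption: the text gives no source distance; this is the value its numbers imply.
d_mm = implied_source_distance_mm(2.0, radius_mm, 0.25)
print(f"implied source-to-mask distance: {d_mm:.0f} mm")
print(f"shift for a 2-um gap change: {lateral_shift_um(2.0, radius_mm, d_mm):.2f} um")
```

With the quoted 2-μm gap change, 2-in. radius, and 0.25-μm shift, the implied source distance is about 406 mm (16 in.). A shorter source distance would make the shift proportionally larger, which is why the noncollimated source matters for registration.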


This registration problem must be added to any other source of error in getting an overall characteristic. The masks have the potential of being a large contributor to the registration error in XRL. The Bell Laboratories report cites 0.33 μm as the full wafer registration they obtained. Realistically speaking, such registration, although exceptional under the physical conditions that prevail in XRL, means the 0.5-μm resolution can only be marginally exploited. This conclusion is reached based on the 2.5× rule for practical resolution we employed for other microlithographic techniques. We feel that there is no reason that registration better than that obtained in proximity print should be considered as realistic. The registration characteristic we have used in Table I is 0.8 μm. In practice this registration error means that the resolution of x ray cannot be exploited in an integrated circuit context. Such a situation will limit x ray to first-pattern situations where fine line resolution is demanded, or to situations where only one level of patterning is involved, such as surface acoustic wave (SAW) devices.

3. CD Control

As in the other microlithography techniques, a great deal of control rests with the process used and its inherent stability. A factor that is likely to influence CD control is the "penumbral blur," which results from the x-ray source having a finite extension. The blur width is given by d = (g/D)S, where g is the gap, D the source-to-mask distance, S the source width, and d the penumbral blur. The presence of the blur means that an area of partially exposed resist extends beyond the completely shadowed area. The area of complete shadow is equal to the mask CD width. On flat substrates the blur is uniform, but it will vary over wafer steps. The worst problem presented by the blur is that it makes CD variation sensitive to process variations that occur at different points on the wafer. The diffraction that caused such a variation in CD in proximity printing will be of no consequence in XRL. Most sources quote 0.1 μm as the variation on a wafer. It is mandatory that this sort of control be maintained if the benefits of submicron microlithography are to be exploited. For groups of wafers we expect this value will be doubled. We used 0.2 μm in Table I as a characteristic of XRL.

4. Defect Density

Since the resists used in x ray are similar to the EBL resists, we expect that the defect densities will be similar, so we have used 0.3 defective dies/cm² as a characteristic of XRL patterning.


5. XRL Stepper

A potential improvement in XRL patterning could be the x-ray stepper. The real advantage here is that the mask size can be reduced. In this way the distortions that are inherent in ever-larger membrane masks can be avoided. Registration in this case can be made similar to that of a conventional step-and-repeat camera. The problems resulting from wafer nonflatness should be greatly lessened. Because this type of equipment is still in the embryonic stages of development, it will be included in Table I for comparison only as a potential technique.

6. Capacity for XRL Patterning

The exposures that initially were used in XRL were extremely long. Five-to-thirty-minute exposures were not uncommon. Better sources for x rays and more sensitive resists have changed this situation considerably. The Bell Laboratories machine is held to be potentially able to expose 75 wafers/hr. About the best exposure times that are found elsewhere are 1.5-3 min/wafer. In Table I we have used the throughput of 20 wafers/hr. This figure assumes about 1 min for alignment of these extremely close features, 1.5 min for exposure, and 0.5 min for the handling steps. The XRL stepper will tend to have extremely poor capacity due to these long exposure times. Unless much more sensitive resists are developed, this capacity consideration will tend to limit the application of XRL as a step-and-repeat technology.

7. Overview on XRL

We recognize that the values we have given for XRL microlithography characteristics are discouraging. Indeed, the realized characteristics on commercial machines will have to be much better than these values if XRL is to be a functional technique. Until such data are available, about the best we can do is judge from the data on developmental machines. The reader should keep in mind, however, that there might be a difference between these numbers and those realized in a production atmosphere.

V. SUMMARY AND CONCLUSIONS

The work in reticle fabrication and mask making has come a long way. We feel that in the e beam and the laser-metered stepper with automatic reticle changing and alignment we have techniques for reticle and mask making that exceed the ability of wafer patterning to exploit. If adequate attention is paid to environmental control (particles, temperature, and humidity) and materials, a zero-defect mask set can soon be made readily available. The problems of wafer registration, CD control, and pattern resolution will be the areas of really exciting challenge in the next few years.

Advances over the next 2 years in wafer microlithography can be divided into two areas: geometries greater than 1 μm and submicron geometries. Geometries down to 1-2 μm will be able to be patterned with either the 1:1 projection scanner or the 10:1 wafer steppers. As can be seen in Table I, this competition is indeed close. The use of deep UV on the projection scanner will give it a resolution capability equal to that of the 10:1 stepper and better than the 1:1 stepper. Two problems will always be with the projection scanner. First, a mask must be used. This use means that contamination and hard chrome defects will always be a matter of concern. The second problem is that the mask can only balance wafer-to-mask array misregistration; it cannot completely compensate for it. On the other hand, scanner productivity is higher than that of a stepper. If the use of a projection scanner is selected, it would certainly pay dividends to pay close attention to mask quality, mask cleanup, and the wafer distortion involved in the process being used.

The steppers have an inherent advantage in that no full wafer mask is involved. Since a reticle must be free of killing defects, a stepped wafer can be considered as having been patterned with a zero-defect mask. Additionally, the reducing steppers (10:1, 5:1) can "forgive" contamination and hard defects that 1:1 patterning would print. The converse of this advantage is that a completely defect-free reticle is not easy to prepare.

Since the steppers can align to wafers and compensate for wafer warpage, there is an advantage to the steppers on this account. As geometries shrink toward the 1-μm limit, this advantage will become more and more important.
The sort of distortion that can be compensated by two-point alignment is symmetric orthogonal array expansion or contraction. If the array distortion becomes more complicated, for example, if the X and Y expansions become unequal, more points must be aligned to effect the compensation. As more points are aligned, productivity of course declines.

The 10:1 stepper is really the most logical machine for geometries in the 1-2-µm range. The reason for this choice is the ability to fit the stepped pattern to wafer distortion. These fine geometries can be exploited effectively only if successive layers can be registered. Since the current 1:1 projection scanners can only balance the misregistration error, the advantage tends to lie with the stepper.

Choosing between a 1:1 and a 10:1 stepper should really depend on the processing used and the geometries being imaged. The 1:1 stepper is similar to the scanner in that small contamination becomes a problem. In fact, it can be a worse problem in that the higher N.A. of the 1:1 stepper lens will allow finer contamination to be imaged. The 1x contamination usually cannot be detected at the reticle preparation step, so it will tend to go undetected. For this reason the use of pellicles should be strongly considered for 1:1 stepper patterning.

In the near future, submicron patterning can be done effectively only by e-beam lithography (EBL). The problems facing EBL are (1) developing proximity-effect correction software that will not seriously prolong pattern writing times, (2) getting the raster effects on CD variation under control, (3) working out the details of a reasonable multilevel resist patterning scheme, and (4) speeding the writing time by another order of magnitude. Items (2) and (4) are somewhat related in that vector scan and shaped beams are being discussed as methods of enhancing speed and reducing the edge roughness associated with raster granularity. Vector scan is viable and will likely make a significant contribution to solving both problems. The shaped beam appears to solve one problem (raster effects) while creating another (multiple exposure). It is hard to see how a shaped beam can make exposures without involving double exposure. Correct algorithms can help avoid double exposure at an exterior edge, but there will still be double exposure, which can contribute to CD variation. The other approach to removing the raster effects is to scan parallel to the pattern edge. Such scanning will avoid the double exposure problem of shaped beams.

X-ray patterning will most likely find its applications in first mask imaging or one-level mask imaging, where subsequent patterns can be stepped to match the particular array printed by the XRL. We feel the registration problems of XRL will be of such a serious nature as to preclude integrated circuit patterning with this technique.
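The array-matching arguments above (two-point alignment compensating a symmetric expansion, and patterns stepped to match a previously printed array) can be illustrated with a small calculation. The sketch below, in Python, fits translation, rotation, and a single isotropic scale to two alignment marks and then evaluates the leftover error at a third site. The mark positions, the 5-ppm expansion values, and all function names are hypothetical; this is not the alignment algorithm of any particular stepper.

```python
import math


def fit_two_point(nominal, measured):
    """Fit w = a*z + b from two mark pairs, treating (x, y) as complex x + iy.

    |a| is the isotropic scale, arg(a) the rotation, b the translation.
    Two marks give four equations, exactly enough for these four unknowns.
    """
    z0, z1 = (complex(*p) for p in nominal)
    w0, w1 = (complex(*p) for p in measured)
    a = (w1 - w0) / (z1 - z0)
    b = w0 - a * z0
    return a, b


def apply_fit(a, b, point):
    """Map a nominal site through the fitted transform."""
    w = a * complex(*point) + b
    return (w.real, w.imag)


def distort(point, sx, sy):
    """Assumed wafer distortion: independent expansion factors in X and Y."""
    x, y = point
    return (x * sx, y * sy)


# Hypothetical layout in mm: two marks on the X axis, one check site on the Y axis.
MARKS = [(-40.0, 0.0), (40.0, 0.0)]
CHECK_SITE = (0.0, 40.0)


def residual_um(sx, sy):
    """Registration error at CHECK_SITE after two-point compensation, in um."""
    measured = [distort(p, sx, sy) for p in MARKS]
    a, b = fit_two_point(MARKS, measured)
    px, py = apply_fit(a, b, CHECK_SITE)
    tx, ty = distort(CHECK_SITE, sx, sy)
    return math.hypot(px - tx, py - ty) * 1000.0  # mm -> um


print("equal X/Y expansion:  ", residual_um(1 + 5e-6, 1 + 5e-6), "um")
print("X-only expansion:     ", residual_um(1 + 5e-6, 1.0), "um")
```

Under these assumptions an equal X and Y expansion leaves essentially no residual, while an expansion in X alone leaves about 0.2 µm at a site 40 mm off the mark axis, which is why unequal expansion calls for more alignment points.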
What has emerged in the process of gathering the information in this chapter is that submicron patterning is not an activity independent of wafer processing. Formerly, in the days of contact printing, the wafer processing could evolve almost completely independently of the wafer patterning. A factor that helped make that independence a reality was the high contrast and multiband exposure of contact printing. This situation has changed in a thoroughgoing way. Submicron patterning demands that the wafer be viewed almost as an element in an optical system. The wafer characteristics are critically important to the quality of the imaging that is obtained. The effort on multilevel resists can be viewed as an attempt to control the imaging qualities of the wafer, and indeed it really is such an effort. If this point is accepted and embraced by engineers specifying the submicron processing, we feel a great deal of mileage can be gotten out of some of the less exotic (and costly) techniques of wafer printing. In this way the trend toward ever more expensive wafer patterning may be changed and the cost of wafer processing held to a minimum.

ACKNOWLEDGMENTS

The authors gratefully acknowledge the work of Mrs. Carol Cook and Ms. Barbara Sutton in the preparation of the many versions and revisions of this manuscript. One of the authors (R.B.) acknowledges the encouragement and support of his wife, Mary, in the time this work took away from home. We also wish to thank Messrs. Rick Cobo and Rie Corless for their help in technical discussions related to the stepper and e-beam parts of this chapter.

REFERENCES

1. E. B. Brown, "Modern Optics" (Van Nostrand-Reinhold, New York, 1965).
2. J. M. Roussel, Proc. SPIE 275, 9 (1981).
3. D. R. Herriott et al., A practical electron lithographic system, IEEE Trans. Electron Devices ED-22, 385 (1975).
4. K. Soho, Y. Tanaka, and K. Uchiho, Fabrication of high quality E-beam master masks, Kodak Microelectron. Seminar p. 44 (October 1980).
5. R. M. Shappel, Line width control in emulsion masks, Kodak Microelectron. Seminar p. 104 (October 1977).
6. M. Long and C. Walker, Stress fractures in positive photoresist, Kodak Microelectron. Seminar p. 125 (October 1979).
7. M. Parikh and D. F. Keyser, J. Appl. Phys. 50, 1104 (1979).
8. T. Kameko, T. Umegaki, and Y. Kawakami, A practical approach to sub micron photolithography, Kodak Microelectron. Seminar p. 25 (October 1980).
9. W. N. Jones, A 'far proximity' photolithographic process for semiconductor manufactures, Kodak Microelectron. Seminar p. 49 (1975).
10. A. Offner, The influence of partial coherence on the microprojection of images, presented at the Opt. Soc. Meeting Opt. Microelectron. (January 26, 1971).
11. David A. Markle, Solid State Technol. p. 50 (June 1979).
12. R. Hershel, H. Hackelman, C. Lage, S. Shevenoch, and B. Tillman, Registration monitor for 1:1 aligners, Kodak Microelectron. Seminar p. 8 (October 1980).
13. M. Makita, N. Moriuchi, and K. Kodota, Analysis of registration errors in 1:1 projection mask aligners, Kodak Microelectron. Seminar p. 104 (October 1980).
14. A. R. Neureuther, P. K. Jain, and W. G. Oldham, Factors affecting linewidth control including multiple exposure and chromatic aberrations, Proc. SPIE 275, 110 (1981).
15. B. J. Lin, SPIE 174, 114 (1979).
16. J. LaRue and C. Ting, Single and dual wavelength exposure of photoresist, Proc. SPIE 275, 17 (1981).
17. Y. Ching Lin, A. R. Neureuther, and W. G. Oldham, Alignment signals from symmetrical silicon marks for electron beam lithography, Kodak Microelectron. Seminar (October 1981).
18. J. S. Greeneich, Proc. Int. Conf., Electron, Internal Beam Sci. Technol. (R. Bakish, ed.), p. 282. Electrochem Society, 1980.
19. S. Yamamoto, K. Kolbayashi, and Y. Toyama, Fujitsu Sci. Technol. J. 14, 143 (1978).
20. D. Mayden, J. Vac. Sci. Technol. 16, 1959 (1979).
21. G. N. Taylor, Solid State Technol. p. 73 (May 1980).