A fully scaled 0.5μm CMOS process for fast random logic


Microelectronic Engineering 15 (1991) 257-260 Elsevier


M. Lerme, G. Guegan, S. Deleonibus, F. Martin, M. Heitzmann, F. Vinet, C. Jaffard, M. Belleville, M. Guerin, G. Reimbold, C. Leroux

DTA LETI, CENG, BP 85X, 38041 Grenoble Cedex, France

An advanced high-performance 0.5 μm technology for fast CMOS circuits has been developed. The main features of this 0.5 μm technology include: diffused wells, field isolation with a SILO/RTN process, N+ polysilicon gate, TaSi2 gate material, contacts with W plugs, RTA for both BPSG reflow and junction activation, and double aluminum metallization levels using BSG/sacrificial SOG/BSG as intermetal dielectric. These modules allow 0.5 μm design rules. A ring oscillator delay time of 72 ps, a 6 ns access time for a 16k x 1 SRAM and a typical 16x16-bit multiplication time of 7.5 ns were measured at a power supply of 3.3 V.

1. PROCESS ASSEMBLY

The circuits were fabricated on P-/P+ epi substrates using a 0.5 μm CMOS process with twin tub and diffused wells. All lithographic levels were exposed on an I-line ASM PAS 5000/50 stepper. Field isolation was realized using the so-called SILO/RTN process [1] [2] (Sealed Interface Local Oxidation by Rapid Thermal Nitridation) to ensure sufficient isolation for 0.9 μm active area spacing [1]. The transistors were designed with N+ polysilicon as gate material, for a power supply of 3.3 V, a gate length of 0.5 μm and a gate oxide thickness of 12 nm [3]. The p-channel device was designed with BF2 counter doping and a deep As implant. An LDD structure with a 0.15 μm sidewall oxide spacer was incorporated for both N- and P-channel devices. The shallow boron-doped layer and the S/D junction depth were optimized in order to maintain proper threshold control in the short buried-channel PMOS; the junction activation was performed by RTA during BPSG reflow.

Tungsten plugs were used for contact refilling, using a process developed in house on a Precision 5000 cluster equipment. This process consists of an adhesion and diffusion barrier layer made by RTP nitridation of sputtered Ti, a blanket CVD tungsten deposition and an etch back of W that is selective with respect to the barrier. In order to avoid notching problems, the two aluminum levels were patterned using the DESIRE dry-developable resist process [4]. The intermetal dielectric was processed with a BSG, sacrificial SOG, BSG sequence [5]. The main design rules used are summarized in Table 1.
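As a rough illustration of what the 12 nm gate oxide quoted above implies, the sketch below estimates the oxide capacitance per unit area with the parallel-plate formula. It assumes the gate dielectric is SiO2 with a relative permittivity of 3.9, which the paper does not state; the function and variable names are illustrative only.

```python
# Illustrative estimate only. Assumption (not stated in the paper): the
# 12 nm gate dielectric is SiO2 with relative permittivity 3.9.
EPS0_F_PER_M = 8.854e-12   # vacuum permittivity, F/m
EPS_R_SIO2 = 3.9           # assumed relative permittivity of the gate oxide


def oxide_capacitance_per_area(tox_m, eps_r=EPS_R_SIO2):
    """Parallel-plate oxide capacitance per unit area, in F/m^2."""
    return eps_r * EPS0_F_PER_M / tox_m


# Gate oxide thickness and gate length quoted in the process description.
TOX_M = 12e-9
LG_UM = 0.5

cox_f_per_m2 = oxide_capacitance_per_area(TOX_M)
# 1 F/m^2 equals 1e3 fF/um^2 (1e15 fF per F, 1e12 um^2 per m^2).
cox_ff_per_um2 = cox_f_per_m2 * 1e3
# Intrinsic gate capacitance per micron of channel width at Lg = 0.5 um.
cgate_ff_per_um_width = cox_ff_per_um2 * LG_UM

print(f"Cox   = {cox_ff_per_um2:.2f} fF/um^2")
print(f"Cgate = {cgate_ff_per_um_width:.2f} fF per um of width")
```

Under these assumptions the estimate comes to roughly 2.9 fF/μm2, or about 1.4 fF per micron of gate width; overlap and fringing capacitances are ignored.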

2. DEVICE AND CIRCUIT PERFORMANCES

The most important characteristics for both NMOS and PMOS transistors are given in Table 2.

RULE    N+/P+    ACTIVE    POLY    METAL1    METAL2
W       -        0.5       0.5     0.8       1.0
S       3.0      0.9       0.6     0.7       1.0

Table 1. Main design rules in microns (W = width, S = space)

PARAMETERS                        NMOS    PMOS
Lg (μm)                           0.5     0.5
Le (μm)                           0.38    0.45
Vt (V), Vd=0.1 V                  0.72    -0.89
Vt (V), Vd=3.3 V                  0.66    -0.82
IDsat, Vd=Vg=3.3 V (μA/μm)        350     156
Gmsat (mS/mm)                     152     (missing)
VBRK (V), Vg=1.5 V                5.2     >7
BVDSS (V), 10 pA/μm               >7      >7
Subthreshold slope (mV/decade)    88      94

Table 2. Main device parameters

Threshold voltage variation versus channel length (Figures 1-2) for Vd = 0.1 V and 3.3 V shows good short-channel-effect control for both N- and P-channel devices. Accelerated hot-electron lifetime testing was done on these devices. The device lifetime is defined as the time after which the saturation current in the reverse mode has decreased by 10%. Assuming a duty cycle of 1/10 for CMOS technology, a lifetime of at least 10 years was obtained at 3.3 V supply voltage. Latch-up immunity was checked on dedicated structures with various N+/P+ distances. Structures with an N+/P+ distance of 3 μm were assumed to be latch-up free using the following criterion: the sum of the two triggering currents (injection through the well diode and the substrate diode) should be < 100 mA.
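The short-channel control mentioned above can be put into numbers from the Table 2 values. The sketch below computes the threshold shift per volt of drain bias (a DIBL-style figure) and the ratio of the measured subthreshold slope to its ideal value. It assumes that both threshold voltages in Table 2 were extracted with the same criterion and that the measurements were taken near 300 K; neither is stated in the paper, and the function names are illustrative.

```python
import math

K_OVER_Q_V_PER_K = 8.617e-5  # Boltzmann constant over electron charge, V/K


def dibl_mv_per_v(vt_low_vd, vt_high_vd, vd_low, vd_high):
    """Magnitude of the threshold shift per volt of drain bias, in mV/V."""
    return abs(vt_low_vd - vt_high_vd) / abs(vd_high - vd_low) * 1e3


def slope_factor(slope_mv_per_dec, temp_k=300.0):
    """Measured subthreshold slope divided by the ideal (kT/q)*ln(10).

    temp_k = 300 K is an assumption; the paper gives no temperature.
    """
    ideal_mv_per_dec = K_OVER_Q_V_PER_K * temp_k * math.log(10) * 1e3
    return slope_mv_per_dec / ideal_mv_per_dec


# Threshold voltages and subthreshold slopes taken from Table 2.
nmos_dibl = dibl_mv_per_v(0.72, 0.66, 0.1, 3.3)
pmos_dibl = dibl_mv_per_v(-0.89, -0.82, -0.1, -3.3)
nmos_n = slope_factor(88.0)
pmos_n = slope_factor(94.0)

print(f"NMOS: {nmos_dibl:.1f} mV/V, slope factor {nmos_n:.2f}")
print(f"PMOS: {pmos_dibl:.1f} mV/V, slope factor {pmos_n:.2f}")
```

With these inputs the threshold shift is about 19 mV/V for the NMOS and about 22 mV/V for the PMOS, and the slope factors are close to 1.5 and 1.6 under the 300 K assumption.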


Figure 1. Threshold voltage Vt (V) versus effective n-channel length Leff (μm) at Vd = 0.1 V and Vd = 3.3 V

Figure 2. Threshold voltage Vt (V) versus effective p-channel length Leff (μm) at Vd = -0.1 V and Vd = -3.3 V

The dynamic performances were measured on circuits of various complexity, such as ring oscillators, a 16k SRAM and a 16x16-bit multiplier, and are in good agreement with those previously published [6] [7]. Ring oscillators with different fan-in and fan-out were fabricated. Unloaded 101-stage ring oscillators (Wn = 4 μm, Wp = 8 μm) achieved a delay per stage of 72 ps and a power-delay product of 4.2 fJ, while 212 ps was achieved with 31-stage NAND (FI = FO = 3) oscillators (Wn = Wp = 8 μm).

A 16k x 1 SRAM with six transistors per cell, previously designed with 1.2 μm design rules, was shrunk by 50% and processed with this technology. An address access time as fast as 5.5 ns was obtained with a stand-by current of 10 μA. The access time variation versus power supply (Figure 3) shows that the circuit is still operational at 4.5 V.

We have designed and realized a 16x16-bit multiplier with this process. The multiplier performs 16x16-bit multiplication in signed two's complement. The chip consists of input registers, a Booth encoder, a multiplier array, a carry propagation stage and an output register. While the 8340 FETs of the circuit are integrated in 1.04 mm2, the total chip size is determined by the 84 bond pads. At 3.3 V, the typical multiplication time was about 7.5 ns for a power dissipation of 150 mW, while the fastest one achieved 6.8 ns. The dynamic performances of the circuit versus power supply are shown in Figure 4.
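The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes the usual ring oscillator relation f = 1/(2·N·td), with td the average delay per stage, and assumes the power-delay product is defined as power per stage times delay per stage; the paper does not spell out either definition, and all names are illustrative.

```python
def ring_oscillator_frequency_hz(n_stages, delay_per_stage_s):
    """Oscillation frequency, assuming f = 1 / (2 * N * td)."""
    return 1.0 / (2 * n_stages * delay_per_stage_s)


def power_per_stage_w(power_delay_product_j, delay_per_stage_s):
    """Power per stage, assuming PDP = power per stage * delay per stage."""
    return power_delay_product_j / delay_per_stage_s


# Values quoted in the text (3.3 V supply).
f_inverter_hz = ring_oscillator_frequency_hz(101, 72e-12)
f_nand_hz = ring_oscillator_frequency_hz(31, 212e-12)
p_stage_w = power_per_stage_w(4.2e-15, 72e-12)

# Multiplier: energy per typical multiplication and FET density in the core.
energy_per_mult_j = 150e-3 * 7.5e-9
fets_per_mm2 = 8340 / 1.04

print(f"inverter ring: {f_inverter_hz / 1e6:.1f} MHz")
print(f"NAND ring:     {f_nand_hz / 1e6:.1f} MHz")
print(f"power/stage:   {p_stage_w * 1e6:.1f} uW")
print(f"energy/mult:   {energy_per_mult_j * 1e9:.3f} nJ")
print(f"FET density:   {fets_per_mm2:.0f} per mm^2")
```

Under these assumptions the 101-stage inverter ring would oscillate near 69 MHz and the 31-stage NAND ring near 76 MHz, each stage would dissipate roughly 58 μW, and a typical multiplication would cost about 1.1 nJ.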

3. CONCLUSION

A high-performance CMOS process using 0.5 μm design rules has been demonstrated. Good N+ poly gate device behaviour with high drivability and negligible short-channel effects was observed. The process, using SILO-RTN isolation, W contact plugs and two aluminum levels, was well controlled; a 16k x 1 SRAM and a 16x16-bit multiplier with cycle times of 5.5 and 7.5 ns respectively at 3.3 V were fabricated.

Figure 3. 16k x 1 SRAM access time TAA (ns) versus power supply Vs (V)

Figure 4. 16x16-bit multiplication time Tmul (ns) versus power supply Vs (V)

ACKNOWLEDGEMENTS

This work has been supported by a DRET contract. The authors wish to thank the different teams of LETI SMSC and SAME for wafer processing and electrical tests.

REFERENCES

1 G. Guegan et al., to be published, ESSDERC (1991)
2 S. Deleonibus et al., Proceedings ESSDERC, p. 151 (1989)
3 G. Guegan et al., Proceedings ESSDERC, p. 311 (1990)
4 B. Roland et al., Advances in Resist Technology and Processing IV, Proceedings SPIE 771, pp. 102-110 (1987)
5 S. Deleonibus et al., Proceedings ESSDERC, p. 665 (1989)
6 Y. Oowaki et al., IEEE Journal of Solid-State Circuits, vol. SC-22, no. 5, p. 762, October (1987)
7 J.A. Michejda et al., Proceedings Symposium on VLSI Circuits (1988)