FEOL technology trend


Materials Chemistry and Physics 52 (1998) 191-199

Invited Review

Yuan Taur *, Tak H. Ning

IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA

Accepted 29 August 1997

Abstract

Trends in front-end-of-line technology are discussed. At the chip level, many of the important parameters are published in the National Technology Roadmap for Semiconductors in 1994. At the device and circuit level, both bipolar and CMOS are scalable. However, the large standby power of bipolar circuits severely limits the integration level of bipolar chips. The inherently low standby power of CMOS, on the contrary, allows the integration level of CMOS circuits to continue increasing with scaling. In reality, both the electric field and power density of CMOS devices have been gradually rising over the generations owing to non-scaling effects of thermal voltage and silicon bandgap. As power supply voltage reaches 1.5 V and below, circuit performance can only be gained at the expense of higher active or standby power of the chip. Implications of device scaling on contact and silicide technology are addressed. Trends of local and global interconnect scaling are discussed. © 1998 Elsevier Science S.A.

Keywords: Front-end-of-line technology; Semiconductors; Bipolar circuits

* Corresponding author.
0254-0584/98/$19.00 © 1998 Elsevier Science S.A. All rights reserved. PII S0254-0584(97)02029-4

1. Overall device- and chip-level silicon technology roadmap

Although bipolar technology continues to play an important role in the microelectronics applications of today and tomorrow, CMOS technology has emerged as by far the dominant technology in most applications. The trend in application is to implement in CMOS any function that can be implemented in CMOS. Thus, the technology roadmap for CMOS should serve as the technology roadmap for silicon in general.

In the United States, a technology roadmap for semiconductors, which forecasts CMOS technology for the period 1995-2010, was published in November 1994 by the Semiconductor Industry Association [1]. It contains, among other items, the projected wafer-level characteristics for CMOS memory and microprocessors, as well as the device and electrical characteristics of high-speed CMOS devices and microprocessors. These are shown in Tables 1 and 2. The assumptions for these projected characteristics are:

1. Lithography: the minimum lithography feature size will continue to reduce by 0.7× every generation, or every three years. The lithography field size will continue to increase to accommodate the largest of the projected chip sizes.
2. DRAM: the DRAM chip size was assumed to increase by 1.5×, and the DRAM cell size to decrease by 0.4×, every generation. These assumptions were based on past DRAM product trends.
3. Logic: power supply voltage reduction is required to allow continued reduction in device channel length and improvements in device and circuit performance. 2.5 V CMOS products are being ramped up in volume in 1995, and the post-2.5 V voltage standards are being established. These are likely to be 1.8 V, followed by 1.5 V and 1.2 V. The maximum chip size for ASIC was assumed to be the same as the lithography field size. Furthermore, it was assumed that the maximum gates/chip will be about 5 M in the 0.35 µm generation. For microprocessors, the transistors/chip was assumed to increase at a rate of 2.3× every generation. A survey of published and/or announced microprocessor chips showed that the transistors/chip was in the 3-7 M range in the 0.5 µm generation. For projection purposes, an average transistors/chip of 5 M was used.

The projected characteristics shown in Tables 1 and 2 should be considered as what will happen if there are no roadblocks or show stoppers, and if process technologies for fabricating these devices are developed in a timely manner. In fact, there are many technical issues, including the problems associated with mixed-voltage operation, device design and technology for power supply voltages less than 1.2 V, lithography systems and processes, chip power and power density, electrical noise, etc. Some of these appear to be so overwhelming that they could be roadblocks or show stoppers.

As far as on-chip metallization is concerned, it should be noted that the roadmap suggests the number of on-chip wiring levels will increase over time, but at a relatively slow rate. In addition, there will be a wiring-level hierarchy, with dense and thin wires at the lower levels, and thick and less-dense wires at the upper levels [2].

Table 1
Principal wafer-level characteristics

Year of first shipment        1995    1998    2001           2004           2007             2010
Lithography (µm)              0.35    0.25    0.18           0.13           0.10             0.07
DRAM (bits/chip)              64 M    256 M   1 G            4 G            16 G             64 G
DRAM chip size (mm²)          190     280     420            640            960              1400
Lithography field size (mm²)  484     676     780            936            1144             1400
Max. ASIC gates/chip          5 M     14 M    30 M (26 M)^a  70 M (50 M)^a  140 M (210 M)^a  360 M (430 M)^a
µP transistors/chip           12 M    28 M    64 M           150 M          350 M            800 M
µP chip size (mm²)            250     300     360            430            520              620

^a Numbers in Roadmap report.
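The per-generation growth factors listed in the assumptions of this section can be checked with a few lines of arithmetic. The Python sketch below compounds the stated factors (0.7× for feature size, 1.5× for DRAM chip size, 2.3× for microprocessor transistor count) from the 1995 values in Table 1. The function name `project` and the printed layout are illustrative choices, not part of the Roadmap.

```python
"""Compound the per-generation roadmap factors from the 1995 baseline."""

YEARS = [1995, 1998, 2001, 2004, 2007, 2010]  # one generation every three years


def project(start, factor, generations=5):
    """Return start * factor**n for n = 0..generations (geometric growth)."""
    return [start * factor**n for n in range(generations + 1)]


# 1995 starting values and per-generation factors quoted in the text.
feature_um = project(0.35, 0.7)        # minimum lithography feature size (um)
dram_chip_mm2 = project(190.0, 1.5)    # DRAM chip size (mm^2)
up_transistors_m = project(12.0, 2.3)  # microprocessor transistors (millions)

if __name__ == "__main__":
    for row in zip(YEARS, feature_um, dram_chip_mm2, up_transistors_m):
        print("%d  %.3f um  %6.0f mm2  %5.0f M" % row)
```

The compounded values follow the tabulated ones approximately (for example, about 1440 mm² against 1400 mm² for the 2010 DRAM chip); the Roadmap entries appear to be rounded generation labels rather than exact powers of the factors.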

Table 2
Principal device and electrical characteristics

Year of first shipment       1995       1998         2001       2004       2007             2010
Lithography (µm)             0.35       0.25         0.18       0.13       0.10             0.07
Set V_dd standard            for 2001   for 2004     for 2007
Logic V_dd (V)               3.3/2.5    2.5/1.2-1.8  1.2-1.8    1.2-1.5    <1.2 (with SOI)
Nominal L_eff (µm)           0.28-0.35  0.20-0.25    0.14-0.18  0.10-0.13  <0.10
T_ox (nm)                    7-12       4-6          4-5        4-5        <4
Wire levels (logic)          4-5        5            5-6        6          6-7              7-8
Thick-wire levels            Available
On-chip clock (MHz)
  DSP capability             400        600          800        1100       1500             1900
  High-performance µP        300        450          600        800        1000             1100
  Volume µP                  150        200          300        400        500              625
Off-chip clock (MHz)
  Chip-to-chip               150        200          300        350        450              550
  Chip-to-board              100        133          166        200        200              200
On-chip regulator            Available
On-chip decoupling cap       Available

2. FEOL device trends

2.1. CMOS scaling trend

CMOS technology evolution in the past 20 years has followed the path of device scaling for achieving density, speed, and power improvements. MOSFET scaling was propelled by the rapid advancement of lithographic techniques for delineating fine lines of 1 µm width and below. It became clear early on that reducing the source-to-drain spacing, i.e., the channel length of a MOSFET, led to short-channel effects. For digital applications, the most undesirable short-channel effect is a reduction in the gate threshold voltage at which the device turns on, especially at high drain voltages. Full realization of the benefits of the new high-resolution lithographic techniques therefore requires the development of new device designs, technologies, and structures which can be optimized for very small dimensions. Another necessary technological advancement for device scaling is ion implantation, which not only allows the formation of very shallow source and drain regions but is also capable of accurately introducing a low concentration of doping atoms for optimum channel profile design.

Constant-field scaling was first proposed in 1972 [3]. It was shown that one can keep the short-channel effect under control by scaling down the vertical dimensions, for example, gate insulator thickness and junction depth, along with the horizontal dimensions, while also proportionally decreasing the applied voltages and increasing the substrate doping concentration (decreasing the depletion width). This is shown schematically in Fig. 1.

Fig. 1. Schematic cross-section of MOSFET constant-field scaling.

The principle of constant-field scaling lies in scaling the device voltages and the device dimensions (both horizontal and vertical) by the same factor $\alpha$ such that the electric field remains unchanged. This assures that the reliability of the scaled device is not worse than that of the original device. The most important result of constant-field scaling is that once the device dimensions and the power supply voltage are scaled down, the circuit speeds up by the same factor $\alpha$. Meanwhile, the power density, i.e. active power per chip area, remains unchanged in the scaled-down device. This has important technological implications in that, in contrast to bipolar devices (Section 2.2), heat sinking does not become more difficult in the scaled CMOS devices.

Although constant-field scaling provides a basic framework to shrink CMOS devices for higher density and speed without degrading reliability and power, there are several factors that scale neither with the physical dimension nor with the operating voltage. The primary reason for the non-scaling effects is that neither the thermal voltage $kT/q$ nor the silicon bandgap $E_g$ changes with scaling. The former leads to subthreshold non-scaling; i.e. threshold voltage cannot be scaled down like other parameters. The latter leads to non-scalability of built-in potential, depletion layer width, and short-channel effect. Because of these factors and reluctance to depart from the standardized voltage levels of the previous generation, the power supply voltage is seldom scaled in proportion to channel length. Fig. 2 shows the trends of power supply voltage, threshold voltage, and gate oxide thickness for CMOS logic technologies from 1 µm to 0.1 µm channel length [4].
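The constant-field scaling rules can be illustrated with a small first-order model. The Python sketch below scales dimensions and voltage by 1/α and doping by α. It then checks that the oxide field and power density stay constant while the delay drops by α. The `Mosfet` class, the square-law current expression, and the arbitrary units are assumptions made for illustration, and the threshold voltage is taken to scale with the supply, which the text notes is not achievable in practice.

```python
"""First-order constant-field scaling of a MOSFET (illustrative model)."""
from dataclasses import dataclass


@dataclass(frozen=True)
class Mosfet:
    length: float  # channel length (arbitrary units)
    width: float   # channel width
    tox: float     # gate oxide thickness
    vdd: float     # supply voltage
    doping: float  # substrate doping concentration


def scale_constant_field(dev, alpha):
    """Shrink dimensions and voltage by alpha; raise doping by alpha."""
    return Mosfet(dev.length / alpha, dev.width / alpha, dev.tox / alpha,
                  dev.vdd / alpha, dev.doping * alpha)


def metrics(dev):
    """Relative figures of merit from a simple square-law picture.

    Assumption: drive current ~ (W/L) * Vdd**2 / tox, i.e. the threshold
    voltage is treated as a fixed fraction of Vdd.
    """
    cap = dev.length * dev.width / dev.tox                     # gate capacitance
    current = (dev.width / dev.length) * dev.vdd**2 / dev.tox  # drive current
    power = dev.vdd * current                                  # power per circuit
    return {
        "oxide_field": dev.vdd / dev.tox,
        "delay": cap * dev.vdd / current,                      # ~ C*V/I
        "power_density": power / (dev.length * dev.width),
    }


if __name__ == "__main__":
    base = Mosfet(length=1.0, width=10.0, tox=0.025, vdd=5.0, doping=1.0)
    scaled = scale_constant_field(base, alpha=2.0)
    before, after = metrics(base), metrics(scaled)
    for name in before:
        print("%-14s ratio after/before = %.3f" % (name, after[name] / before[name]))
```

Replacing `dev.vdd / alpha` with an unscaled supply in `scale_constant_field` is a quick way to see how the field and power density rise when the voltage does not track the dimensions.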
The trends of both the lateral and the vertical fields for these generations of CMOS technology are shown in Fig. 3 [5]. Both fields have increased by about a factor of 3 from the 1 µm generation to the 0.1 µm generation. The most serious problems associated with the higher field intensity are reliability and power. Power density increases as the square to the cube of the electric field [6], which puts a great deal of burden on VLSI packaging technology to dissipate the extra heat generated on the chip. Reliability problems arise from higher oxide fields, higher channel fields, and higher current densities. Higher current densities aggravate the electromigration of aluminum lines, which is already becoming worse under constant-field scaling [3]. Higher oxide fields drive the gate oxide closer to the breakdown condition, which adds a great deal of difficulty in maintaining its integrity.

Fig. 2. Power supply voltage, threshold voltage, and gate oxide thickness trends vs. channel length for CMOS technologies from 1 to 0.1 µm.

Fig. 3. Lateral electric field ($V_{dd}/L_{eff}$) and vertical field in oxide ($V_{dd}/t_{ox}$) vs. channel length.

The most effective way to curb chip power growth is to reduce the supply voltage $V_{dd}$, since the active power is given by:

$P_{ac} = C V_{dd}^2 f$    (1)

where $C$ is the total capacitance switched in an average clock cycle, and $f$ is the clock frequency. In order to maintain chip performance at the reduced power supply voltage, the threshold voltage $V_t$ should be reduced in step with $V_{dd}$, since the circuit delay,

$\tau \propto \dfrac{1}{0.6 - (V_t/V_{dd})}$    (2)

increases rapidly with $V_t/V_{dd}$ [7]. Reduction of threshold voltage, however, causes the standby or dc power to increase exponentially:

$P_{dc} = V_{dd} I_0 \exp(-qV_t/mkT)$    (3)

In Eq. (3), $m$ is a factor typically in the range of 1.2-1.4, and $I_0$ is a current quantity depending on the technology and chip size. The performance, active power, and standby power trade-off is illustrated qualitatively in a power supply-threshold voltage design plane shown in Fig. 4 [8]. It is clear that chip performance can only be gained at the expense of either higher active power or higher standby power. Future CMOS technologies are likely to have multiple threshold voltage designs in order to optimize speed and power independently for meeting the broadest application requirements.

Fig. 4. Power supply and threshold voltage design space for performance and power trade-off.

2.2. Bipolar devices and trends

There are three main components in the delay of a bipolar circuit, e.g. an emitter-coupled logic (ECL) circuit [9]. These are: (i) the delay due to wire and other parasitic capacitances, (ii) the delay due to base and collector transit times, and (iii) the delay due to diffusion capacitance. The wire and parasitic capacitance component decreases with increase of the device operating current. The transit-time component is independent of current. The diffusion capacitance component increases with current, particularly at high collector current densities where the base-pushout effect is significant. These delay components and the total delays are illustrated schematically in Fig. 5. It shows that at very high collector current densities, the circuit delay is dominated by the diffusion-capacitance component.

Fig. 5. Schematic diagram illustrating the delay of bipolar ECL circuits as a function of collector current density. In general, the circuit runs faster as the collector current, and hence the collector current density, is increased until the base-pushout effect starts to become significant.

The direction in bipolar technology development has been in reducing these delay components [10]. Fig. 6 shows the schematic cross-section of a typical advanced npn bipolar transistor [11]. It shows trench isolation, polysilicon base contact, polysilicon emitter contact, and a so-called pedestal collector which has significantly higher doping concentration underneath the intrinsic-base region. Both the trench isolation and the polysilicon base contact reduce the collector area, and the associated base-collector junction capacitance and collector-substrate junction capacitance. The polysilicon emitter contact allows a very thin base to be made without suffering from the problem of insufficient current gain [12]. The transit-time component is reduced by making thinner intrinsic-base regions. The pedestal collector suppresses base-pushout effects.

Fig. 6. Schematic cross-section of a typical advanced npn bipolar transistor. It shows deep and shallow trench isolation, polysilicon emitter, self-aligned polysilicon base contact, and pedestal collector doping profile (after Ref. [11]).

The ECL circuit is the most commonly used bipolar logic circuit for high-performance applications. Fig. 7 plots the reported unloaded ECL circuit delays over the past 10 years or so, compiled by Warnock [11]. It illustrates a steady progress in the development of bipolar technology for high-performance digital applications.

The limitation of bipolar technology lies not in the speed limit of the bipolar device itself, but in the large power dissipation of the bipolar circuits. An ECL circuit operates with a more-or-less constant current flow. The circuit switches from 'on' to 'off' by steering a current from one transistor to another within the circuit, instead of switching the current on and off. Thus, standby power of bipolar circuits is about the same as their active power. This large standby power severely limits the integration level of bipolar chips. For example, a bipolar chip with $10^4$ ECL circuits each dissipating 3 mW will have a standby power of 30 W!
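The standby-power contrast between the two technologies can be put into numbers. The Python sketch below reproduces the 30 W ECL estimate. For CMOS it evaluates the two threshold-voltage dependences of Section 2.1: a delay taken to vary as 1/(0.6 − V_t/V_dd), as in Eq. (2), and the exponential standby power of Eq. (3). The function names, the 300 K temperature, m = 1.3, and the example voltages are assumptions chosen for illustration; I_0 cancels because only ratios are computed.

```python
"""Standby power of ECL vs. the CMOS threshold-voltage trade-off."""
import math

K_OVER_Q = 8.617333e-5  # Boltzmann constant / electron charge, in V/K


def ecl_standby_power(n_circuits, watts_per_circuit):
    """ECL draws a nearly constant current, so standby ~ active power."""
    return n_circuits * watts_per_circuit


def delay_factor(vt, vdd):
    """Relative CMOS delay, assumed proportional to 1 / (0.6 - Vt/Vdd)."""
    ratio = vt / vdd
    if ratio >= 0.6:
        raise ValueError("model only meaningful for Vt/Vdd < 0.6")
    return 1.0 / (0.6 - ratio)


def standby_power_ratio(vt_old, vt_new, m=1.3, temp_k=300.0):
    """P_dc(vt_new) / P_dc(vt_old) at fixed Vdd, from exp(-q*Vt / (m*k*T))."""
    return math.exp((vt_old - vt_new) / (m * K_OVER_Q * temp_k))


if __name__ == "__main__":
    print("ECL chip standby power: %.0f W" % ecl_standby_power(10**4, 3e-3))
    speedup = delay_factor(0.5, 1.5) / delay_factor(0.4, 1.5)
    leakage = standby_power_ratio(0.5, 0.4)
    print("Vt 0.5 -> 0.4 V at Vdd = 1.5 V: %.2fx faster, %.1fx standby power"
          % (speedup, leakage))
```

Under these assumed values, a 0.1 V threshold reduction buys a modest speed gain at the cost of a much larger rise in standby power, which is the trade-off sketched in Fig. 4.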

Fig. 7. Reported unloaded bipolar ECL circuit delays (after Ref. [11]).

Digital bipolar technology has been used mainly in high-end mainframe computers (see, for example, the articles in Ref. [13]). As mainframe computers make their transition from bipolar technology to CMOS technology, there will be very little further development of bipolar technology for digital applications.

For analog applications, the need for small-signal high-frequency devices remains. For these applications, the incorporation of germanium into the intrinsic-base layer of a bipolar transistor proves advantageous [14]. The incorporation of Ge into the Si base layer introduces a grading of the energy bandgap in the base region. As a result of base-bandgap grading, the base transit time $\tau_B$ is reduced by:

$\tau_B(\Delta E_g)/\tau_B(\Delta E_g = 0) = 2kT/\Delta E_g$    (4)

where $\Delta E_g$ is the total amount of bandgap grading across the base layer [15]. Thus for a bandgap grading of 150 meV, the base transit time is reduced by about 3×. Fig. 8 is a plot of the reported cut-off frequencies compiled by Warnock [11]. It clearly shows that, with the advent of SiGe technology, the cut-off frequency of bipolar devices has improved by about 2×. Thus, it is likely that analog applications will continue to drive the development of bipolar technology, particularly SiGe bipolar technology.

Fig. 8. Reported cut-off frequency of silicon bipolar transistors. With SiGe for the intrinsic base, the cut-off frequency can be increased by about 2× (after Ref. [11]).

3. Device contact technology

3.1. CMOS contact technology

A schematic diagram of the current pattern in the source or drain region of a MOSFET is shown in Fig. 9 [16]. The total source or drain resistance can be divided into several parts: $R_{ac}$ is the accumulation layer resistance in the gate-source (or drain) overlap region right beyond the channel; $R_{sp}$ is the spreading resistance associated with the current flowing from the surface layer into a uniform pattern across the depth of the source-drain region; $R_{sh}$ is the sheet resistance of the source-drain region between the channel and the contact; and $R_{co}$ is the contact resistance (including the spreading resistance in silicon) in the region where the current flows into a metal line. Once the current flows into an aluminum line, there is very little additional resistance, since the resistivity of aluminum is very low, $\rho_{Al} \approx 3 \times 10^{-6}$ Ω-cm. In VLSI interconnects, the metal thickness is of the order of $t_{Al} \approx$ 0.5-1.0 µm, which means that the sheet resistivity, $\rho_{Al}/t_{Al}$, is typically 0.05 Ω/□. This is negligible compared with the channel sheet resistivity $\rho_{ch} \approx$ 2000-7000 Ω/□, except when a long, thin wire is connected to a wide MOSFET. Fig. 9 only shows the series resistance on one side of the device. The total source-drain series resistance per device is, of course, twice that shown in Fig. 9, assuming the source and drain are symmetrical.

Fig. 9. Schematic diagram showing the current pattern in the source-drain region of a MOSFET and the representative resistance components (after Ref. [16]).

We now examine $R_{sh}$ and $R_{co}$ more closely. In Fig. 9, the sheet resistance of the source-drain diffusion region is simply

$R_{sh} = \rho_{sd}\,\dfrac{S}{W}$    (5)

where $W$ is the device width, $S$ is the spacing between the gate edge and the contact edge, and $\rho_{sd}$ is the sheet resistivity of the source-drain diffusion, typically of the order of 50-500 Ω/□. Since $\rho_{sd} <$ …

$R_{co} = \dfrac{\sqrt{\rho_{sd}\rho_c}}{W}\coth\left(l_c\sqrt{\dfrac{\rho_{sd}}{\rho_c}}\right)$    (6)

where $l_c$ is the width of the contact window (Fig. 9), and $\rho_c$ is the interfacial contact resistivity of the ohmic contact between the metal and silicon in units of Ω-cm². $R_{co}$ includes the resistance contribution of the current crowding region in silicon underneath the contact. Eq. (6) can be simplified in two extreme cases: short-contact and long-contact. In the short-contact case, $l_c \ll (\rho_c/\rho_{sd})^{1/2}$, and

$R_{co} = \dfrac{\rho_c}{W l_c}$    (7)

which is dominated by the interfacial contact resistance. The current flows more or less uniformly across the entire contact. In the long-contact case, $l_c \gg (\rho_c/\rho_{sd})^{1/2}$, and

$R_{co} = \dfrac{\sqrt{\rho_{sd}\rho_c}}{W}$    (8)

This is independent of the contact width $l_c$ since most of the current flows into the front edge of the contact. Once in the long-contact regime, there is no advantage in increasing the contact width.

The interfacial contact resistivity $\rho_c$ plays a key role in determining the magnitude of the contact resistance. For ohmic contacts between metal and heavily doped silicon, the current conduction is dominated by tunneling or field emission, and the contact resistivity depends exponentially on the barrier height $\phi_B$ and surface doping concentration $N_d$ [18]:

$\rho_c \propto \exp\left(\dfrac{4\pi\phi_B}{h}\sqrt{\dfrac{m^*\varepsilon_{si}}{N_d}}\right)$    (9)

where $h$ is Planck's constant and $m^*$ is the effective mass of the electron. Depending on the doping concentration and contact metallurgy, $\rho_c$ is typically in the range of $10^{-6}$-$10^{-7}$ Ω-cm².

Both $R_{sh}$ and $R_{co}$ are greatly reduced in advanced CMOS technologies with the use of self-aligned silicide [19]. As shown schematically in Fig. 10, a highly conductive (≈ 2-5 Ω/□) silicide film is formed on all gate and source-drain surfaces using dielectric spacers in a self-aligned process. Since the sheet resistivity of silicide is 1-2 orders of magnitude lower than that of the source-drain, it practically shunts all the diffusion current, and the only significant contribution to $R_{sh}$ is from the non-silicided region under the spacer. This reduces $S$ in Eq. (5) to 0.1-0.2 µm, which means that $R_{sh} \times W$ should be no more than 50 Ω-µm. At the same time, $R_{co}$ between the source-drain and silicide is also reduced, since now the contact area is the entire diffusion. As is shown in Fig. 10, the current flow is almost always in the long-contact limit such that Eq. (8) is applicable. However, the sheet resistivity of the region under the silicide, $\rho_{sd}'$, is higher than the non-silicided source-drain sheet resistivity $\rho_{sd}$, since a surface layer of heavily doped silicon is consumed in the silicidation process [20]. The contact resistivity between silicon and silicide may also rise if the interface doping concentration becomes lower owing to silicon consumption. This is particularly a concern when a thick silicide film is formed over a shallow source-drain junction. In the example shown in Fig. 11, the contact resistance with the thinner titanium film is lower than that with the thicker film, since in the former case less silicon is consumed and the doping concentration at the interface is higher. As a rule of thumb, no more than a third of the source-drain depth should be consumed in the silicide process. In a CMOS process, a silicide material such as TiSi$_2$ with a near-midgap work function is needed to obtain approximately equal barrier heights to n+ and p+ silicon.

Fig. 10. Schematic diagram of an n-channel MOSFET fabricated with self-aligned TiSi$_2$, showing the current distribution between the channel and silicide.

Fig. 11. SIMS arsenic profiles showing the concentration at the projected TiSi$_2$-Si interface for two titanium thicknesses.

Experimentally measured $\rho_c'$ between TiSi$_2$ and n+ or p+ silicon is of the order of $10^{-6}$-$10^{-7}$ Ω-cm² [21]. Based on Eq. (8), therefore, $R_{co}$ for a silicided diffusion is in the range of 75-200 Ω-µm [20]. The minimum contact width $l_c$ required to satisfy the long-contact criterion can be estimated from $(\rho_c'/\rho_{sd}')^{1/2}$ to be about 0.25 µm. The contact resistance between the silicide and metal is usually negligible, since the interfacial contact resistivity is of the order of $10^{-7}$-$10^{-8}$ Ω-cm² in a properly carried-out process.

3.2. Contact technology for bipolar devices

As discussed in Section 2.2, for all advanced bipolar devices, the active regions of the emitter and the base are contacted by polysilicon, and metal contacts are made indirectly through these polysilicon layers. Both the emitter and the base polysilicon layers are usually doped very heavily [22], in excess of $1 \times 10^{20}$ cm$^{-3}$. The thickness of the base polysilicon layer has no effect on the device characteristics. On the contrary, the emitter polysilicon layer thickness should be larger than the minority-carrier (hole) diffusion length. It was determined experimentally that as long as the emitter polysilicon layer is thicker than about 50 nm, the device characteristics are independent of the emitter polysilicon thickness [12]. In practice, the emitter and base polysilicon layers are 100-200 nm thick. Thus, metal contacts to emitter and base are made indirectly through heavily doped and relatively thick polysilicon layers. As can be seen from Fig. 6, metal contact to the collector terminal is made to a heavily doped 'reach-through' region. This region is typically about 2 µm thick. Good ohmic contact should be obtained relatively easily.
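The source-drain resistance expressions of Section 3.1 can be explored numerically. The Python sketch below assumes the transmission-line contact model of Ref. [17], whose short- and long-contact limits are Eqs. (7) and (8), together with the sheet-resistance relation of Eq. (5). The function names, the unit convention (lengths in µm, contact resistivity converted to Ω-µm²), and the sample values are illustrative assumptions rather than data from the text.

```python
"""Source-drain sheet and contact resistance (transmission-line model)."""
import math


def ohm_cm2_to_ohm_um2(rho_c_cm2):
    """Convert a contact resistivity from ohm-cm^2 to ohm-um^2."""
    return rho_c_cm2 * 1e8


def sheet_resistance(rho_sd, spacing_um, width_um):
    """R_sh = rho_sd * S / W, with rho_sd in ohm/square."""
    return rho_sd * spacing_um / width_um


def transfer_length(rho_sd, rho_c):
    """sqrt(rho_c / rho_sd) in um: the scale separating short and long contacts."""
    return math.sqrt(rho_c / rho_sd)


def contact_resistance(rho_sd, rho_c, lc_um, width_um):
    """R_co = sqrt(rho_sd * rho_c) / W * coth(l_c / L_T) (assumed model)."""
    x = lc_um / transfer_length(rho_sd, rho_c)
    return math.sqrt(rho_sd * rho_c) / (width_um * math.tanh(x))


if __name__ == "__main__":
    rho_sd = 200.0                     # ohm/square, assumed example value
    rho_c = ohm_cm2_to_ohm_um2(1e-7)   # assumed example value
    lt = transfer_length(rho_sd, rho_c)
    print("transfer length: %.3f um" % lt)
    for lc in (0.05, 0.25, 1.0):
        r = contact_resistance(rho_sd, rho_c, lc, width_um=1.0)
        print("l_c = %.2f um -> R_co = %.1f ohm per um width" % (lc, r))
```

Widening the contact beyond a few transfer lengths leaves the result essentially unchanged, which is the long-contact behavior described for Eq. (8).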

4. Interconnect scaling and RC delay

Scaling of interconnects is similar to the scaling of MOSFETs, as is shown schematically in Fig. 12 [3]. All linear dimensions (wire length, width, thickness, spacing, and insulator thickness) are scaled down by the same factor, $\alpha$, as the device scaling factor. Wire lengths ($l$) are reduced since the linear dimension of the devices and circuits that they connect is reduced by $\alpha$. Both the wire and insulator thicknesses are scaled down for technological reasons (aspect ratio, etch selectivity, etc.). Furthermore, fringe capacitance and wire-to-wire coupling (crosstalk) would increase disproportionally unless the thicknesses are scaled down along with the lateral dimensions. All material parameters, such as metal resistivity and dielectric constant, are assumed to remain the same.

Fig. 12. Scaling of interconnect lines and dielectric thickness.

Wire capacitance then scales down by $\alpha$, the same way as the device capacitance, while wire capacitance per unit length, $C_w$, remains unchanged. For silicon-dioxide insulation commonly used in VLSI technologies, $C_w$ is approximately 0.2 pF mm$^{-1}$, independent of technology generation. Wire resistance, on the contrary, scales up by $\alpha$, in contrast to the device resistance, which does not change with scaling. Wire resistance per unit length, $R_w$, then scales up by $\alpha^2$, and so does the RC time constant per unit length, $R_w \times C_w$. One of the key conclusions of interconnect scaling is that the wire RC delay,

$\tau_w = \dfrac{1}{2} R_w C_w l^2$    (10)

does not change as the device dimensions and delay are scaled down. Eventually, this will impose a limit on VLSI performance. Fortunately, for aluminum metallurgy, the RC limit,

$\tau_w \approx 3 \times 10^{-18}\,\dfrac{l^2}{A}\ \mathrm{s}$    (11)

is of the order of 1 ps or less, much smaller than the intrinsic device delay of about 20 ps of even the 0.1 µm CMOS technology [4]. This figure assumes $l^2/A \approx 10^4$-$10^5$ for local wiring, where $A$ is the wire cross-sectional area shown in Fig. 12. It is worth noting that the current density in interconnects increases with $\alpha$, which implies that reliability issues such as electromigration may become more serious as the interconnects are scaled down.

Based on the above discussions, RC delay of local wires will not limit VLSI circuit speed even though it cannot be reduced through scaling. The RC delay of global wires, on the contrary, is an entirely different matter. Unlike that of local wires, the length of global wires, which is of the order of the chip dimension, does not scale down, since the chip size either stays the same or actually increases slightly for advanced technologies with better yield/defect density, to accommodate a much higher circuit count. Even if we assume chip size does not change, the RC delay of global wires scales up by $\alpha^2$ from Eq. (11) if the size of the global wires is scaled down the same way as the local wires. It is clear that one quickly runs into trouble. For example, in today's 0.5 µm CMOS technology, $l^2/A \approx 10^8$-$10^9$, and $\tau_w \approx 1$ ns, severely impacting system performance.

A number of solutions have been proposed to solve the problem. The most obvious one is to minimize the number of cross-chip global interconnects in critical paths as much as possible through custom layout/design and use of sophisticated design tools. One can also use repeaters to reduce the dependence of RC delay on wire length from a quadratic one to a linear one [2]. But that will introduce additional device delay and only provide a partial solution to the problem. A more fundamental solution is to increase or not to scale the cross-sectional area of global wires. However, just increasing the width and thickness of global wires is not enough, since wire capacitance will then increase significantly, which degrades both performance and power. Inter-metal dielectric thickness must be increased in proportion as well to keep the wire capacitance per unit length constant. Of course, there is a technology price to pay to build such low-RC global wires. It also means more levels of interconnects, since one still needs several levels of thin, dense local wires to make the chip 'wirable'.

The best strategy for interconnect scaling then is to scale down the size and spacing of lower levels in step with device scaling for local wiring, and to use unscaled or even scaled-up levels at the top for global wiring, as shown schematically in Fig. 13 [2]. Unscaled wires allow the global RC delay to remain essentially unchanged, as seen from Eq. (11). Scaled-up wires (together with the insulator thickness) allow the global RC delay to scale down in proportion to the device delay. This is even more necessary if the chip size increases with technology generation.

Fig. 13. Schematic cross-section of levels of desired chip wiring for high-performance CMOS processors.

Ultimately, the scaled-up global wires would approach the transmission-line limit when the inductive effect becomes more important than the resistive effect. This happens when the signal rise time is shorter than the time of flight over the length of the line. The signal propagation is then limited by the speed of light in the dielectric, which corresponds to approximately 7 ps mm$^{-1}$ for oxide insulators, instead of by RC delay. Fig. 14 shows the interconnect delay versus wire length calculated from Eq. (11) for three different wire cross-sections. For a given chip size or global wire length, a larger wire cross-section is required to reach the limit of the speed of light. The transmission-line situation is more often encountered in packaging wires.

Fig. 14. RC delay for three different wire cross-sections vs. wire length. Also shown is the time-of-flight limit for oxide insulators.
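The wire-delay estimates of this section can be reproduced with a short calculation. The Python sketch below evaluates the RC delay of Eq. (10) using the aluminum resistivity and oxide wire capacitance quoted in the text, and compares it with the 7 ps/mm time of flight. The function names and the example wire geometries are assumptions for illustration; `min_area_for_tof` is a derived helper, not a formula from the text.

```python
"""Wire RC delay vs. time of flight for aluminum/oxide interconnects."""

RHO_AL = 3e-6          # aluminum resistivity, ohm-cm (value quoted in the text)
C_W = 2e-12            # wire capacitance per unit length, F/cm (0.2 pF/mm)
TOF_S_PER_CM = 7e-11   # time of flight in oxide, s/cm (7 ps/mm)


def rc_delay(length_cm, area_cm2, rho=RHO_AL, c_w=C_W):
    """tau_w = 0.5 * R_w * C_w * l**2, with R_w = rho / A per unit length."""
    return 0.5 * (rho / area_cm2) * c_w * length_cm**2


def time_of_flight(length_cm):
    """Speed-of-light-limited delay over the wire length."""
    return TOF_S_PER_CM * length_cm


def min_area_for_tof(length_cm, rho=RHO_AL, c_w=C_W):
    """Cross-section at which the RC delay equals the time of flight."""
    return 0.5 * rho * c_w * length_cm / TOF_S_PER_CM


if __name__ == "__main__":
    # Local wire: 100 um long, 0.1 um^2 cross-section (l^2/A = 1e5).
    print("local : %.2e s" % rc_delay(0.01, 1e-9))
    # Global wire: 1 cm long, 1 um^2 cross-section (l^2/A = 1e8).
    print("global: %.2e s" % rc_delay(1.0, 1e-8))
    print("1 cm time of flight: %.2e s" % time_of_flight(1.0))
    print("area for RC = time of flight at 1 cm: %.2e cm^2" % min_area_for_tof(1.0))
```

Because the RC delay grows as the square of the length while the time of flight grows linearly, the cross-section needed to reach the time-of-flight limit increases in proportion to the wire length.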

References

[1] The National Technology Roadmap for Semiconductors, Semiconductor Industry Association, 4300 Stevens Creek Boulevard, Suite 271, San Jose, CA 95129, USA, 1994.
[2] G.A. Sai-Halasz, Performance trends in high-end processors, Proc. IEEE 83 (1994) 20-36.
[3] R.H. Dennard, F.H. Gaensslen, H.N. Yu, V.L. Rideout, E. Bassous, A.R. LeBlanc, Design of ion-implanted MOSFETs with very small physical dimensions, IEEE J. Solid-State Circuits SC-9 (1974) 256.
[4] Y. Taur, S. Wind, Y. Mii, Y. Lii, D. Moy, K. Jenkins, C.L. Chen, P.J. Coane, D. Klaus, J. Bucchignano, M. Rosenfield, M. Thomson, M. Polcari, High performance 0.1-µm CMOS devices with 1.5-V power supply, 1993 IEDM Technical Digest, 1993, pp. 127-130.
[5] Y. Taur, CMOS technology evolution: from 1 µm to 0.1 µm, Fourth Int. Conf. on Solid-State and Integrated-Circuit Technology, Beijing, China, Oct. 1995.
[6] G. Baccarani, M.R. Wordeman, R.H. Dennard, Generalized scaling theory and its application to a 1/4-micrometer MOSFET design, IEEE Trans. Electron Devices ED-31 (1984) 452.
[7] Y. Mii, S. Wind, Y. Taur, Y. Lii, D. Klaus, J. Bucchignano, An ultra-low power 0.1-µm CMOS, VLSI Technology Symposium Technical Digest, 1994, pp. 9-10.
[8] Y. Taur, Y. Mii, D. Frank, H-S. Wong, D. Buchanan, S. Wind, S. Rishton, G. Sai-Halasz, E. Nowak, CMOS scaling into the 21st century: 0.1 µm and beyond, IBM J. Res. Develop. 39 (1995) 245.
[9] D.D. Tang, P.M. Solomon, Bipolar transistor design for optimized power-delay logic circuits, IEEE J. Solid-State Circuits SC-14 (1979) 679-684.
[10] T.H. Ning, D.D. Tang, Bipolar trends, Proc. IEEE 74 (1986) 1669-1677.
[11] J.D. Warnock, Silicon bipolar device structures for digital applications: technology trends and future directions, IEEE Trans. Electron Devices 42 (1995) 377-389.
[12] T.H. Ning, R.D. Isaac, Effect of emitter contact on current gain of silicon bipolar devices, IEEE Trans. Electron Devices ED-27 (1980) 2051-2055.
[13] ES/9000 semiconductor and packaging technologies, IBM J. Res. Develop. 36 (1992).
[14] G.L. Patton, J.H. Comfort, B.S. Meyerson, E.F. Crabbé, G.J. Scilla, E. de Frésart, J.M.C. Stork, J.Y.-C. Sun, D.L. Harame, J.N. Burghartz, 75-GHz f_T SiGe-base heterojunction bipolar transistors, IEEE Electron Devices Lett. 11 (1990) 171-173.
[15] H. Kroemer, Two integral relations pertaining to the electron transport through a bipolar transistor with a nonuniform energy gap in the base region, Solid-State Electron. 28 (1985) 1101-1103.
[16] K.K. Ng, W.T. Lynch, Analysis of the gate-voltage dependent series resistance of MOSFETs, IEEE Trans. Electron Devices ED-33 (1986) 965.
[17] H.H. Berger, Models for contacts to planar devices, Solid-State Electron. 15 (1972) 145.
[18] A.Y.C. Yu, Electron tunneling and contact resistance of metal-silicon contact barriers, Solid-State Electron. 13 (1970) 239.
[19] C.Y. Ting, S.S. Iyer, C.M. Osburn, G.J. Hu, A.M. Schweighart, The use of TiSi2 in a self-aligned silicide technology, ECS Symp. on VLSI Sci. and Technol., 1982.
[20] Y. Taur, Y.C. Sun, D. Moy, L.K. Wang, B. Davari, S.P. Klepner, C.Y. Ting, Source-drain contact resistance in CMOS with self-aligned TiSi2, IEEE Trans. Electron Devices ED-34 (1987) 575.
[21] J. Hui, S. Wong, J. Moll, Specific contact resistivity of TiSi2 to p+ and n+ junctions, IEEE Electron Devices Lett. EDL-6 (1985) 479.
[22] Many advanced bipolar processes are reported in Special Issue on Bipolar BiCMOS/CMOS Devices and Technologies, IEEE Trans. Electron Devices 42 (1995).