Numerical simulation and verification of gas transport during an atomic layer deposition process

Numerical simulation and verification of gas transport during an atomic layer deposition process

Materials Science in Semiconductor Processing 21 (2014) 82–90 Contents lists available at ScienceDirect Materials Science in Semiconductor Processin...

873KB Sizes 0 Downloads 36 Views

Materials Science in Semiconductor Processing 21 (2014) 82–90

Contents lists available at ScienceDirect

Materials Science in Semiconductor Processing journal homepage: www.elsevier.com/locate/mssp

Numerical simulation and verification of gas transport during an atomic layer deposition process A.-A.D. Jones IIIa, A.D. Jones Jr.b,n a b

Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA Department of Mathematics, Florida A&M University, Tallahassee, FL 32301, USA

a r t i c l e in f o

abstract

Available online 8 February 2014

Atomic Layer Deposition (ALD) is a process used to deposit nanometer scale films for use in semiconductor electronics. The reactor consists of a warm wall horizontal flow tube, a substrate mounted on a disk downstream from the inlet, and cyclic flow between a reactant gas, a purging gas and a gas that preps the surface of the substrate. The objective is to achieve a uniform coating on the substrate layer by layer in minimal time. It is possible to use in situ monitoring of the gas phase and deposition to modify layer formation. Process improvement is currently accomplished experimentally by monitoring the precursor delivery and the growth of the film and adjusting the parameters: flow rates, temperature, pressure, concentrations, etc. Accurate simulation and optimization can decrease processing time and cost and increase control during product development. In addition, increased accuracy of gas transport simulation can be used to analyze reaction and diffusion rates, reaction mechanisms and other physical properties. In this paper we introduce the first comprehensive numerical solution of the Dusty-Gas Model including the complete binary diffusion term. We derive a concentration dependent Damkohler number relevant to the purge step of the process. The simulation matched the experimental data at a specific Damkohler number and further variation of the parameter confirmed existing experimentally observed phenomena. & 2014 Elsevier Ltd. All rights reserved.

Keywords: Gas transport simulation Dusty-gas model Chemical vapor deposition Atomic layer deposition

1. Introduction Reactive gas transport techniques are used for material processing including Chemical Vapor Infiltration (CVI) and Chemical Vapor Deposition (CVD). In this work, we introduce the first comprehensive numerical solution of the Dusty Gas Model to simulate gas transport. While the main purpose of the Dusty-Gas Model is to simulate flow through porous media, the theoretical formulation does not restrict it to this [1]. We use this model to capture binary diffusion and Fickian diffusion assuming a continuum in the gas phase. Continuum modeling is necessary to apply optimization techniques and derive non-dimensional parameters as

n

Corresponding author. E-mail addresses: [email protected] (A.-A. Jones III), [email protected] (A.D. Jones Jr.). 1369-8001/$ - see front matter & 2014 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.mssp.2014.01.020

opposed to Markov or other random walk modeling. We apply this innovative simulation to the Atomic Layer Deposition (ALD) process. ALD, a CVD technique, deposits single layers of atomic thickness resulting in precise thickness and shape. ALD is used to manufacture integrated circuit memory to conserve surface area and cost per bit. ALD can be a useful tool for scientific investigation using the two species, single-layer, self-limiting reactions to find both reaction rates and reaction mechanisms. A detailed model is necessary to optimize run time parameters to increase the quality of material production and decrease production time or run more targeted experiments. In this section, we describe the physical setup of a single wafer atomic layer deposition process and present experimental parameters that define the relevant physical parameters in the model. In Section 2, we construct a model followed by a numerical scheme in Section 3. The purpose of the model is to predict the concentration

A.-A.-D. Jones III, A.D. Jones Jr. / Materials Science in Semiconductor Processing 21 (2014) 82–90

of the chemical species in the reactor as a function of time and space in combination with in situ modeling that will lead to an optimal ALD process. In Section 4, we describe the experimental setup and discuss the relationship between the system and the model. This section also discusses the limitations of the model to batch processing and other industrial procedures. In Section 5, we compare the experimental results to the numerical results. In this section we also describe the purpose of the numerical scheme created and the insights gained from the solution of the general flow problem and the in situ modeling. Future expansions to the project are also discussed. A typical ALD experimental reactor combines a warmwall horizontal-flow tube, a substrate mounted downstream from the inlet, and alternating flow between a reactant, a purging and a second reactant gas. The reactions can be visualized as shown in Fig. 1 (for detailed reaction mechanisms see [2,3]). The illustration shows one reactant, entrained in a carrier gas, with preferential activation energy for bonding to the substrate, cf. Fig. 1 (a). The second step is the introduction of an oxidant that reacts with the previous reactant, cf. Fig. 1(b) forming a gaseous byproduct and depositing a metal oxide layer. This process is repeated where the reactant has a higher preference for reacting with the metal oxide layer over itself resulting in self-limiting deposition. The process creates a uniform coating on the substrate layer-by-layer. This leads to uniform layers being deposited by ALD on non-planar surfaces and high-aspect ratio surfaces [4,5]. The rate of growth of the layers is proportional to the repetition rate. The growth time is then the product of the number of cycles with the repetition rate for a given monolayer deposition and the total thickness is the number of cycles times the monolayer thickness. A self-limiting process is desired and achievable but does not always result. For example, during the deposition of tungsten and molybdenum on silicon uneven coating

83

and islands are created. During metal-oxide ALD on Carbon Nanotubes (CNTs) wetting is also a problem [6]. This problem has been addressed by reducing the residence time of the precursor according to its reaction mechanism. The residence time is constrained by concentration of gas arriving at the substrate as a function of time. One of the problems of thermally activated ALD in particular is that during inflow the reactants may decompose before arriving at the substrate [7]. After reaching the surface, a condition for a successful process is that the binding energy on the surface is much greater than the binding energy of subsequent layers [2]. This implies that the primary control parameters for ALD are the temperature of the substrate, the concentration and type of reactive gas near the surface [2,8]. Optimization using a numerical model should predict the reactive species as well as flow parameters to yield the desired coating in the least amount of time. The reactor design studied here flows gas through a hotwalled CVD tube reactor. We use the recommendation from Suntola (1989) that a pressure of  1 Torr is optimal to balance interdiffusion and entrainment [2]. With respect to Fig. 1(a), interdiffusion is the diffusion of the reactant gas around the carrier gas to the substrate, (cf. Section 2.6 on binary diffusion). Entrainment is the ability of the carrier/ purge gas to affect the transport of the reactant/byproducts. If the pressure is too high, diffusion will be reduced, if the pressure is too low, entrainment will be reduced. The model does not specifically address other methods of optimizing the residence time of the reactant gas including the use of flow control such as Synchronously Modulated Flow and Draw, cross flow, shower-head, and temperature control such as cold and hot-walled reactors [9,6]. To do this, one would need to modify the flow field and/or the boundary conditions for temperature. We also do not focus on optimization using pre-clean schedule and post deposition annealing [10].

Fig. 1. This figure shows a typical metal-oxide ALD process. The first step is the deposition of a metal-precursor, carried by an inert carrier gas, onto a substrate(a). This is followed by the reaction of the metal precursor with an oxidant carried by the same carrier gas (b). The result is a metal oxide in a single layer (c). The inert carrier gas is also used to entrain and remove the excess reactants and reactant byproducts from the reactor.

84

A.-A.-D. Jones III, A.D. Jones Jr. / Materials Science in Semiconductor Processing 21 (2014) 82–90

The objective of this study is to construct an accurate simulation of the ALD process given the substrate and the precursor and compare it to experimental values. We will also vary some of the processing parameters (e.g. temperature, pressure, cycle rate, etc.) to show that optimization can be achieved. This model achieves a higher level of accuracy than that can be achieved through commercial software by considering multi-component binary diffusion in addition to the convective flux. The model along with optimal control allows for ALD to be used as a platform to verify binary diffusion coefficients and reaction rates through fitting and simulation. We derive a Damkohler number from the model that can be used to correlate purge times with reactor parameters and a qualitative behavior of the gases in situ.

2. Theoretical formulation

gas or reacting with the solid present we get ∂C i ¼  ∇  F i  ki C i ; ∂t

ð1Þ

Consider a fixed region in space. These equations state that the concentration of a species Ci changes with time by flowing into and out of a region or by reacting with another gas or a solid. During each cycle, the transport mechanisms driving each species towards the preform are the binary diffusion coefficient DAB, the concentration gradient, the Knudsen diffusion coefficient DK;A , viscosity μ, and the pressure gradient. The binary diffusion coefficient can be viewed as the diffusion of one species A at infinite dilution through B or vice versa. This is important when considering the mixture of the two reactant species in the carrier gas in both purge and infiltration time. The Knudsen diffusion coefficient can be viewed as the mean distance a molecule travels in a pore. This is important when considering most dilute gas flows.

2.1. Process modeling 2.3. Order and dimensional analysis This study focuses on the gas transport conditions needed for optimization. The growth rate is the thickness added per cycle. Each cycle consists of four time scales: the infiltration times TA and TB for each of the reactants and the purge times PA and PB. A mixture of a carrier gas and a metal-precursor is pumped into the chamber initially filled with pure carrier gas during the infiltration time TA (cf. Fig. 1(a)). The carrier gas is used to remove any remaining metal-precursor from the previous cycle during purge time PA (not shown). The oxidant gas, entrained in the carrier gas, is pumped in for infiltration time TB as shown in Fig. 1(b). Finally the reactant byproducts are removed by the carrier gas in purge time PB as shown in Fig. 1(c). The reactant time is determined by reaction parameters as described in Section 2.3, while the purge time, which is the experimental data we have to compare to the model, is determined by the diffusion/solubility of the previous reactant in the carrier gas. For some reactants, for example H2O, it is also important to consider the physisorption onto the reactor walls.

DK;A  Oð1Þ cm2 s  1 :

ð2Þ

For the species under consideration under pressures and temperatures of typical Thermal ALD systems the binary diffusion coefficient is DAB  Oð1Þ cm2 s  1 :

ð3Þ

The ALD sticking model gives the reaction probability γs fs ¼ ð4Þ 1  0:5γ s where a typical γ s ¼ 0:1. The molar flux to the surface can be written as

2.2. An intuitive formulation Our focus here is on the purge times, as this is governed by simpler reaction kinetics than the infiltration times. To build the model that follows we take the phenomenological approach used in The Dusty Gas Model [1] and add reaction kinetics. Molecules can flow due to a concentration gradient ∇C, a pressure gradient ∇P, a temperature gradient ∇T and binary diffusion. Putting this together we get the following equation for flux:  F i ¼ Dc ∇C i þ Dp ∇P þ Dt ∇T þ Db

The cylinder under consideration has a length of L C 0:25 m with a diameter of d C0:10 m. The pressure of the system is P 1 C 1 Torr  133 Pa and the temperature of the system p isffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi T C250 1C. The Knudsen diffusion coefficient is DK ¼ Lc RT=2πMg where R is the ideal gas constant, Mg is the mass of the gas species. For the species under consideration and a reactor with a characteristic length Lc ¼ 0:5d, the Knudsen Diffusion coefficient is

CjF i  CiF j Dij

where Dx are a proportionality terms. This equation is presented with detail in Eq. (24) in Section 2.6. We use this expression to describe the flow of gas through a region. Noting that the concentration of a gas can change by reacting with itself, another molecule in the

Pg n_ s ″ ¼ pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ; 2πM g RT

ð5Þ

where Pg is the partial pressure of the precursor gas [8]. The probability of deposition is f dep ¼ 1 

ns ″ χntotal ″

ð6Þ

where χ represents the number of layers deposited before growth stops. Without gas–gas reactions, the gas behaves as an ideal gas so that the density of the system is ρ ¼ RT=P ¼ 32 kg m  3 . The flow rate of the system is Q ¼ 300 mL min  1 that yields a velocity of v ¼ Q =A ¼ 6:11E 4 m s  1 . Therefore the Reynolds number is Re ¼

ρvd ¼ 69:5 5 2100 μ

ð7Þ

A.-A.-D. Jones III, A.D. Jones Jr. / Materials Science in Semiconductor Processing 21 (2014) 82–90

and it is concluded that for pipe flow, this is well within the laminar flow regime. The entrance length for laminar flow is Le ¼ 0:06d

and

Re ¼ 0:417

ð8Þ

The Mach number is

UL  0:1  1 D

ð10Þ

since the diffusivity being calculated from Chapman– Enskog theory of gases is D  50 200 cm2 s  1 . Thus neither diffusion nor advection dominates the system and both must be included. 2.4. Governing equations

ð1Þ

where for any chemical species 1 ri rν, C i ðx; tÞ½ ¼ mol cm  3 is the concentration and F i ðx; tÞ½ ¼ mol cm  2 s  1 is the flux of each chemical species i, kis is the reaction rate of species i with a surface, either the reactor or the substrate. With this model, the reaction mechanism has been assumed to be first order and the reaction coefficient is constant for fixed temperature. Although, for the ALD process, this is a reasonable assumption, more accuracy can be achieved by considering the sticking model as indicated in Section 2.3 and described further in [2]. The maximum number of species in the reactor is ν ¼ 4: the carrier gas, reactant gas, product gas, and undesired residue from a previous cycle. The total concentration is ν

C ¼ ∑ Ci:

C ν ðx; 0Þ ¼ Cðz; 0Þ:

ð13Þ

while for the other gases

ð11Þ

ð14Þ

For any future cycle, the concentration will be the concentration from the previous cycle, except at the left hand boundary and right hand boundary which are discussed below. Both from the perspective of modeling a real process and the perspective of numerically simulating a process at the inlet the gas concentration is approximated by a Gaussian distribution: C i ðx; tÞ ¼ C i ðzÞðai  bi e  βi z Þ 2

ð15Þ

For example, in the purge cycle, if the concentration of incoming carrier gas is C ν ðx; tÞ ¼ CðzÞ and the concentration in the reactor is that of a residue product and the carrier gas in a uniform 2:3 mixture then it is found that C ν ¼ Ce  βz

One of the objectives of this model is to determine the minimum purge time, PA and PB for the reactor, that is the time for the reactor to be nearly free one of the reacting species. Another objective is to predict the total concentration of the reactant arriving at the wafer as a function of time and estimate the rate of deposition. To build the numerical scheme, we write the continuity equation in the form ∂C i ¼ ∇  F i  kis C i ; ∂t

the concentration of the inert carrier gas C ν is initially uniform, or for all space 0 þ ε oz o L, the concentration is

C i a ν ðx; 0Þ ¼ 0:

U ¼ 7:17E 6 50:3 ð9Þ Ma ¼ a pffiffiffiffiffiffiffiffi 1 where a ¼ γRT ¼ 85 m s  1 with cp ¼ 20:786 J mol K1 so the flow is incompressible in the bulk. The Peclet number is Pe ¼

85

2

and

C 3 ¼ Cð1 e  βz Þ 2

ð16Þ

The concentrations will be nearly discontinuous near the inlet. In the numerical simulation, this is achieved by choosing β ZN 2z where Nz is the number of grid points in the axial or z^ direction as shown in Fig. 2. To maintain stability while calculating the initial flux using finite volume methods we must let β ¼ Nz and Nz be large (cf. Section 3.1). The flux out at the right hand boundary can be given by a Neumann boundary condition where  ∂C i ðx; tÞ  ¼ AC i ðL; t Þ ð17Þ  ∂z  ðL;tÞ

where the constant of proportionality, A, is determined by comparing the numerical results with experimental data and solving the resulting inverse problem. 2.6. The dusty-gas model for multi-component advective– diffusive flux In order to find the flux of a given species, it is not possible to consider the bulk flow. The development of the model here employs the Dusty-Gas Model (DGM) developed in Mason (1983) with an application to single pipe flow [1]. Free molecule or Knudsen flow, FK is used to

i¼1

For a straight tube reactor, the space dependence on the flux and concentration is ð12Þ

n+1

where the axial symmetry of the system allows one to neglect θ.

n+θ

x ¼ 〈r; θ; z〉

k-1

k

k+1

n

2.5. Initial and boundary conditions The initial conditions of a given cycle is the output of a previous cycle except for an vacuum tube ALD system where the partial pressure inside the reactor would be used. Ideally, to begin the process one would assume that

z=0

k-½

k+½

z= L

Fig. 2. Numerical grid showing the Forward Euler step and centered difference in time as described in Section 3.1.

86

A.-A.-D. Jones III, A.D. Jones Jr. / Materials Science in Semiconductor Processing 21 (2014) 82–90

where

model the low density flow through conduits as F K ¼ wnv;

ð18Þ   1 where w ¼ 3 d=L is the multiplicity of a collision in a tube of length L and at a given diameter d, n½ ¼ molecules cm  3 is the number of molecules per unit volume, v i ¼ ð8kB T=πmi Þ1=2 is the root mean square velocity of a species with a given mass mi, at temperature T where kB is Boltzmann's constant. In the continuum limit the free molecule or Knudsen flux becomes F Ki ¼  DKi ∇C i

ð19Þ

where the Knudsen diffusion coefficient is defined as DKi ¼ 43 K o v i and the viscous flow parameter K o ¼ d=4 for a pipe. To account for the flow, and consequently residence time and reactions along the wall of the reactor, viscosity is included. The total viscous flux is calculated as F visc ¼ 

CBo ∇p μ

ð20Þ

where the ‘permeability’ term in the model is a scaled area 2 Bo ¼ d =8½ ¼ cm2 for a pipe, p½ ¼ dPa is the pressure and μ½ ¼ dPa s is the viscosity [1]. Therefore the flux for a specific species is F visc;i ¼

Ci F visc : C

ð21Þ

The flux due to a temperature gradient in the system is FT ¼

1 ∇T ∑ C i C j αij C iaj T

ð22Þ

where T½ ¼ K is the temperature and αij is the thermal diffusivity that is dependent on the concentration, temperature, and species involved in the collision. Finally the diffusive flux of one particle around another is given by F Di ¼  Dij ∇C i ;

ð23Þ

where Dij is the binary diffusion coefficient. Adding all the flux terms together, Eq. (19)–(23), and solving for the individual flux Fi the Dusty-Gas Model states   CjF i  CiF j F 1 Bo P þC i 1 þ ∇ ln P  i ¼ ∑ C jai μDKi DKi Dij þ C∇

Ci 1 þ ∑ C i C j αij ∇ ln T: C iaj C

ð24Þ

The first term on the right is flux due to binary diffusion, the second is the flux due to Knudsen diffusion, the remaining terms on the right are due to pressure gradients, concentration gradients, and temperature gradients respectively. For this first approximation, and our experiments, thermal diffusion is negligible and the last term can be neglected. The effective binary diffusivity for species i and j and the effective Knudsen diffusivity of species i have units Dij ½ ¼ cm2 s  1 and DK i ½ ¼ cm2 s  1 , respectively. The binary diffusion constant Dij can be calculated using the Chapman–Enskog theory of gases: qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi T 3 nMij ð25Þ Dij ¼ 1:8583  10  3 Ps2ij ΩD;ij

  1 1 1 M ij ¼ 2 þ Mi Mj

ð26Þ 1

is twice the reduced molecular weight, M½ ¼ g mol ,   sij ¼ 12 si þsj is the average of the Leonard-Jones parameters. ΩD;ij is the collision integral, here computed as a dimensionless function of temperature and the geometric mean of the Leonard-Jones parameters for each species ɛ, or TðξÞ ΩD;ij ¼ pffiffiffiffiffiffiffi ɛi ɛj

ð27Þ

The Knudsen diffusivity constant DK ðd=2Þ is a function of the pipe radius, temperature and molecular weight. The Knudsen diffusion for the ith species is given by pffiffiffi T DK i ¼ 48:50  d ð28Þ Mi where d½ ¼ cm is the diameter of the pipe. This is a kinetic theory of gases estimate for straight cylinders [11]. In pffiffiffiffiffiffiffiffiffiffi general, the expression is DK ¼ K o T=M [12]. As the composition of the flow varies as a function of space and time, the viscosity of the mixture can be found using the semi-empirical formula of [13] in compact form ν

μmix ¼ ∑ i

xi μi ; ∑νj xj ϕij

ð29Þ

where xi is the mole fraction of a given species i and the dimensionless quantity ϕij represents (cf. [14] p. 24) 2 !1=2   3   M j 1=4 5 1 Mi  1=2 4 μi : ð30Þ 1þ ϕij ¼ pffiffiffi 1 þ M μ Mi 8 j j For this paper, it has been assumed that the flow is only in the axial direction. This assumption fails to capture all the desired information by not being able to compute the concentration in the boundary layer. This can be overcome in part by assuming that the change in the boundary layer is small over the length considered (cf. Section 2.3). Thus, the reaction rate at the walls will be proportional to the concentration at that point. With these assumptions Eq. (24) becomes   CjF i  CiF j 1 F 1 Bo ∂P ∂ Ci ∑ þ C ð31Þ þ i ¼  Ci C jai P μDKi ∂z ∂z C Dij DKi To make Eq. (31) dimensionless let the axial distance z ¼ Lξ where L is the length of the reactor, the concentration C i ¼ Cxi where C is the total concentration, giving a mole fraction, the flux F ¼ F F C , the pressure P ¼ P 0 P where P0 is the inlet pressure and for both diffusivity constants D ¼ DDc , to find ! xj F i  xi F j CF c FcF i ∑ þ CDc j a i D ij D Ki Dc   1 Bo P 0 ∂P C ∂xi  ð32Þ ¼  Cxi þ L ∂ξ P P 0 μD Ki Dc L ∂ξ From the above define the characteristic diffusivity as Dc ¼ Bo P 0 =μ and characteristic flux as F c ¼ Dc C=L. To further simplify equation define the flux due to a pressure

A.-A.-D. Jones III, A.D. Jones Jr. / Materials Science in Semiconductor Processing 21 (2014) 82–90

gradient as   1 1 ∂P : þ FP  P D Ki ∂ξ

ð33Þ

Note that if the flux due to a pressure gradient is zero then F p ¼ 0. Thus, Eq. (31) becomes Fi DKi

n

xjF i  xiF i

iaj

D ij

þ ∑

∂x ¼  F P xi  i ∂ξ

ð34Þ

The last term on the right hand side of Eq. (34) is the diffusive flux and the first term on the left hand side of Eq. (34) is the advective flux [15]. The system of equations for the flux of a given species is ! F1 x2F 1  x1F 2 x3F 1  x1F 3 x ν F 1 x 1 F ν þ þ þ⋯þ DK 1 D 12 D 13 D 1ν ¼  F P x1 

∂x1 ∂ξ

F2 x 1 F 2 x 2 F 1 x 3 F 2 x 2 F 3 xνF 2  x2F ν þ þ þ⋯þ DK 2 D 21 D 23 D 2ν ¼  F P x2 

ð35Þ

x1 x1 ∂x1 F 3 ⋯ F ν ¼  F P x1  ∂ξ D 13 D 1ν "  # x2 1 x1 x3 xν  F1þ þ þ þ⋯þ F2 D 21 D K2 D 21 D 23 D 2ν 

x1

x2 D 23

F2

F 3 ⋯

 þ

D " ν1 1



F1

DKν

 þ

¼  F P xν 

D ν2 x1

D ν1

x2 D 2ν

F ν ¼  F P x2 

∂x2 ∂ξ

F2 þ

xν D ν3

x2 D 2ν

þ

F 3 þ⋯

a12

a13



a22

a23



a32

a33









aν2

aν3



a1ν

32

F1

3

2

b1

3

6 7 6 7 a1ν 7 76 F 2 7 6 b2 7 76 7 6 7 6 7 6 7 a1ν 7 76 F 3 7 ¼ 6 b3 7 6 7 6 7 ⋮ 7 54 ⋮ 5 4 ⋮ 5 aνν bν Fν

x3 D ν3

þ⋯þ0

xi Dij

ð42Þ

ð43Þ

In normalizing Eq. (1) we assumed that the change in the total concentration changes very little with respect to space and time. To relax this assumption, one would have to post-multiply the results from Eq. (43) by the characteristic flux and find the total concentration at each time step (cf. Section 3.1). With the latter assumption and the characteristic flux, Fc, assume that time is t ¼ τt c . Thus, the continuity equation for each species is: ð44Þ

ð45Þ

The factor multiplying the mole fraction

# Fν

∂xν ∂ξ

Da ¼ ð36Þ

ð37Þ

ð38Þ

and for the diagonal elements let ν x 1 j þ ∑ DK i j a i Dij

or 2 a11 6a 6 21 6 6 a31 6 6⋮ 4 aν1

ð41Þ

∂xi ∂F L2 kis ¼ i xi ∂τ ∂ξ Dc

for i ¼ 1; …; ν. To write this in a matrix form for the offdiagonal elements let

aii ¼

aν1 F 1 þaν2 F 2 þaν3 F 3 þ ⋯ þaνν F ν ¼ bν

Let the characteristic time, t c ¼ L2 =Dc so that

which can be written as ! ν x ν F 1 ∂x j j þ ∑ ¼  F P xi  i F  xi ∑ DK i j a i Dij i D ∂ξ ij jai

aij ¼ 

a31 F 1 þ a32 F 2 þ a33 F 3 þ ⋯ þ a3ν F ν ¼ b3 ⋮

∂xi t c Dc ∂F i ¼ 2  kis t c xi ∂τ L ∂ξ

⋮ xν

a21 F 1 þ a22 F 2 þ a23 F 3 þ ⋯ þ a2ν F ν ¼ b2

2.7. Complete system

After a rearrangement of terms "  # 1 x2 x3 xν þ þ þ⋯ þ F1 DK1 D 12 D 13 D 1ν D 12

a11 F 1 þa12 F 2 þ a13 F 3 þ⋯ þa1ν F ν ¼ b1

where the flux vector is F ¼ ½F i . The flux of each species is then determined by F ¼ A  1 B.

!

∂xν ∂ξ

ð40Þ

and define the vector B ¼ ½bi . Thus, we have the system

AF ¼ B

Fν x 1 F ν  x ν F 1 x 2 F ν x ν F 2 þ þ þ⋯þ0 DK ν D ν1 D 2ν



and define the matrix A ¼ ½aij . Now let   ∂x 1 1 ∂P ∂xi  bi ¼ F P xi  i ¼  þ ∂ξ ∂ξ P D Ki ∂ξ

Thus,

∂x2 ∂ξ



¼  F P xν 

!

87

ð39Þ

L2 kis L2 ks μðcÞ ¼ P 0 Bo Dc

ð46Þ

can be identified as a Damkohler number, Da, resulting in the dimensionless equation ∂xi ∂F ¼  i  Daxi : ∂τ ∂ξ

ð47Þ

3. Numerical simulation To numerically simulate the above system is to model a non-linear partial differential equation. It can be shown that an explicit scheme is unstable regardless of choice of time or spatial step. An implicit scheme is impossible due to the nonlinear binary diffusion term. In this paper we introduce an implicit–explicit scheme to solve the

88

A.-A.-D. Jones III, A.D. Jones Jr. / Materials Science in Semiconductor Processing 21 (2014) 82–90

Dusty-Gas Model (DGM). We use an explicit scheme to linearize the equation and follow that by an implicit scheme thus producing a program that solves the full DGM and is unconditionally stable. Being able to predict the outcome of a numerical simulation is necessary for validation in addition to experimental data. To determine the influence of certain parameters we begin with a one dimensional one-species reactive advective–diffusive equation. Solving this equation in the most robust manner possible determines the importance of the terms while minimizing computational error. We then modeled the equation in Section 2.6 using an implicit–explicit model while analyzing its stability and accuracy using linear methods. 3.1. 1D Simulation of the Dusty-Gas Model For the one dimensional simulation of the Dusty-Gas Model description of the Atomic Layer Deposition flow process into the reactor we started by discretizing Eq. (47): ∂xi ∂F ¼  i Daxi ; ∂τ ∂ξ

ð48Þ

using a spatial discretization ξk A fξ0 ¼ 0; ξ1 ; …; ξNξ ¼ 1g and discretization in time t n A ft 0 ¼ 0; t 1 ; …; t N ¼ T f g as shown in Fig. 2. As a first order accurate scheme in time, we use a forward Euler scheme in time and centered difference in space except at the boundary as xni;kþ 1  xni;k Δt

¼

i 1 h n F  F ni;k  1  Daxni;k : 2δξ i;k þ 1

ð49Þ

At the left hand boundary there is no change in concentration: xni;0þ 1 ¼ x0i;0

ð50Þ

At the right hand boundary we use a backwards difference scheme in space: xni;kþ 1  xni;k Δt

¼

i 1h n F i;k  F ni;k  1  Daxni;k : δξ

ð51Þ

To discretize the flux, F ni;k , the only term to be approximated is the concentration gradient in B: x  xi;k  1 ∂xi  i;k þ 1 ∂ξ 2δξ

Source & Interferometer

f AH r e M te H TE Wa

ALD Reactor

ð53Þ

For the right hand boundary, as there is vacuum on the other side, we assume ∂xi  ALxi;N ∂δξ

To verify the model, experiments were done at the National Institute of Standards and Technology by Drs. Maslar and Sterling using in situ gas analysis. As the main focus of this paper is the numerical model, for full details see [16] and an upcoming publication by the same group. The reactor is a horizontally oriented hot-walled impinging jet Synchronously Modulated Flow and Draw CVD reactor cf. Fig. 3. The reactor cylinder under consideration has a length of L C 0:25 m to the wafer with a diameter of D C0:10 m. The pressure of the system is P 1 C1 Torr  133 Pa and the aluminum walls of the system are at 110 1C. The wafer chuck is externally heated to 230 1C. The inflow water and tetrakis(dimethylamino) hafnium (TEMAHf) lines are heated to 110 1C and 90 1C respectively. The carrier gas helium is continuously delivered while two lines alternate between He and the precursors. During purging, 75 sccm of He flow through each of the four lines. TEMAHf is injected into the reactor by diverting He through a heated bubbler at 75 1C for T TEMAHf ¼ 3 s. The vapor pressure of TEMAHf in the bubbler is approximately 50 Pa at this temperature. Afterward, the He flow is returned to its previous path, and the TEMAHf delivery line and the reactor are then purged for P TEMAHf ¼ 5 s. For water injection, valves open the reactor to a room-temperature water vessel for T H2 O ¼ 100 ms while momentarily stopping the flow of He through that line. The vapor pressure of water in the vessel is 2.5 kPa; its flow into the reactor is limited to approximately 75 sccm by a needle valve. Afterward, the water delivery line and the reactor are purged for P H2O ¼ 15 s. Optical access is provided by two 0.05 m diameter windows on both opposite sides of the chuck recessed so that the inner surfaces are inset only 0.0015– 0.005 m with respect to the interior walls of the reactor. Data from the FTIR instrument (RS-10000, Mattson Instruments) were collected in the form of interferograms, which are the measured intensity as a function of the position (retardation) of the moving mirror as determined by laser fringes. A Norton-Beer medium apodization

ð52Þ

a centered difference scheme for the interior points. For the left boundary we use a 3 point forward difference scheme: xi;2  4xi;1 þ 3xi;0 ∂xi  : ∂δξ 2δξ

4. Experimental setup

ð54Þ

where A is a constant to be determined by experiment. In the most general form, the values for Dij, DK i , αij and μ of the mixture should be experimentally determined otherwise estimating the initial concentrations, pressure, temperature, Bo, L, μi, Mi, using the expressions in Section 2.6 yield the results in Section 5.

ZnSe Windows

HgCdTe Detector Fig. 3. The experimental setup is shown in this figure showing the in situ gas analysis reprinted with permission from [16]. We have simplified the gas flow controller and the interferometer as the critical dimensions we have mentioned in the paper are the orientation, direction of flow and location of measurement. The absorbance was used to calculate a concentration. From the concentration data and the ideal gas law we estimated a mole fraction. For more details on the experimental work see [16].

A.-A.-D. Jones III, A.D. Jones Jr. / Materials Science in Semiconductor Processing 21 (2014) 82–90

In the following section, the results of the model presented in Section 3.1 are discussed with its successes and short-comings discussed constructively. The model in Section 3.1 is compared with experimental results from that discussed in Section 4. Our simulation of the process using the Dusty Gas Model includes all the flux terms except the flux due to a temperature gradient. To include the thermal gradient is an extension of numerical model and was dropped due to the experiment setup. It has been proved that the model used can explain multiple gas flow behaviors [1]. The current model is only solved in 1D. To extend to 3D, noting that the axial symmetry reduces the problem to 2D, would involve changing the grid and expressing the pressure in 2D. The boundary conditions would also need to be altered, which would lead to a more accurate model of the surface gas reactions. The coupling that this model has over the preliminary model introduces a nonlinearity. Our model is able to simulate the change in concentration during the purging process. The graph in Fig. 4 shows the simulation of the Thermal ALD process using the Dusty-Gas Model in Section 3.1 at varying times. Each progressing line is the concentration at an increasing time step. The (blue) bottom line is the concentration of the 1 0.9 0.8

Mole Fraction

0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0.1

0.2

0.3

0.8

0.6

0.4

0.2

0 –0.08038

5. Results and discussion

0

1

Mole Fraction

function was applied before Fourier transforming the interferograms. A transmittance spectrum T(ν), where ν is the frequency, was obtained from the ratio of a transformed sample interferogram against a transformed background interferogram. The absorbance spectrum A(ν) was obtained by the relationship A ¼ logðTÞ. From the absorbance, the total concentration of a given species can be correlated. The mole fraction of a given species is then found using the ideal gas law on the reactor at a given pressure to find a total number of moles.

89

0.4

0.5

0.6

0.7

0.8

0.9

1

Dimensionless Distance Fig. 4. At increasing times, the above graph shows the simulation of the Thermal ALD process using the Dusty-Gas Model in Section 3.1. The line on bottom is the concentration of the reactant dropping from xB ¼ 0.4 to xB ¼0 while the top line shows the concentration of the carrier/purge gas rising from xcarrier ¼ 0:6 to xcarrier ¼ 1 over the normalized length of the reactor in a total of dimensionless time τ ¼ 1:5. (For interpretation of the references to color in this figure caption, the reader is referred to the web version of this paper.)

0 0.15 0.3 0.45 0.6 0.75 0.9 1.05 1.2 1.35 1.5 1.65 1.8 1.95 Time[s]

Fig. 5. A comparison of numerical with experimental results ( þ), over a t¼ 2 s purge cycle of water. The total purge time was 15 s, yet concentrations were undetectable past 2 s. The mole fraction was derived from that of water while data from TEMAHf/MEA does not lend itself to as simple a computation. The data was fit to produce a sticking parameter of A  1500 in Eq. (54) and ks ¼ 500 s  1 (medium thickness line). The large value of ks is most likely due to the assumption that the reaction is dependent on the concentration in the bulk, not the concentration in the boundary layer above the wafer necessary for 1D modeling. The figure shows a variation of 12  DaðcÞ; DaðcÞ; 2  DaðcÞ in increasing line thickness. The inset shows the sinusoidal behavior in the experiment matched in the numerical results. This is likely due to changing the equilibrium positions as the entrained water is transported out, the equilibrium position of adsorbed shifts.

reactant dropping from xB ¼0.4 to xB ¼ 0 while the (green) top line is the concentration of the carrier/purge gas rising from xcarrier ¼ 0:6 to xcarrier ¼ 1 over the normalized length of the reactor in a total of dimensionless time τ ¼ 1:5. The overall scheme is unconditionally stable. Yet, the finite difference scheme used in space is not robust for sharp derivatives and leads to oscillations with inlet gradients greater than   xi;0  xi;1    ð55Þ  Δξ  4 1: Using a finite volume scheme would smooth out this derivative. Other methods for smoothing out the derivative, that are used in the current work, include reducing the coefficient β in the initial condition Eq. (15) while decreasing δξ. By decreasing δξ, δτ is decreased by the square that would increase computation time drastically. A comparison with experimental results over t ¼ 2 s during a purge cycle of water led to a best fit parameter of A  1500 in Eq. (54) and k ¼ 500 s  1 . This is shown in Fig. 5. The large value of ks is most likely due to the assumption that the reaction is dependent on the concentration in the bulk, not the concentration in the boundary layer above the wafer necessary for 1D modeling. The figure shows 12  DaðcÞ; DaðcÞ; 2  DaðcÞ in increasing line thickness. This is the same as multiplying reaction rate, ks by the same factor or multiplying the inlet pressure (decreasing the convection rate) by the inverse of the factor, from the definition of Da ¼ L2 ks μðcÞ=P 0 Bo . It is understood that slower kinetics or slower convection results in more stable concentrations of gas in the chamber even if the time to purge is the same. This result is used

90

A.-A.-D. Jones III, A.D. Jones Jr. / Materials Science in Semiconductor Processing 21 (2014) 82–90

experimentally to remove water from the walls by increasing wall temperature. The Damkholer number is recognized as an important parameter in static/quasi-static chemical reactions and can normally be computed from properties of a single chemical species. The form of the Damkholer number derived here combines properties of the bulk flow into a characteristic diffusivity. This form comes directly from the model noting that the binary diffusion and the Knudsen diffusion are dependent on the bulk entrainment. This is useful when considering entrainment of the gas versus its ability to stick to the surface. 6. Conclusion In this paper we constructed a model for a single wafer atomic layer deposition process. The purpose of this model was to predict the concentration of the reactant gases at the wafer. A one dimensional model was constructed using mass conservation and dimensional analysis. This model was used to inform the coupled system used as derived by [1]. An iterative Forward Euler centered difference scheme (implicit–explicit) was used to solve the system numerically. This scheme made the system stable. Experiments were run using in situ gas analysis in an ALD reactor. The numerical results lined up well with the experimental results. The model also confirmed experimental conjecture that increased dissociation rate decreased the total concentration in the chamber regardless of pressure head. We derived a Damkohler number, Da ¼ L2 ks =Dc ¼ L2 ks μðcÞ= P 0 Bo , that can correlate purge times with reactor parameters and a qualitative behavior of the gases in situ. We plan to extend this model to 2 dimensions to simulate flow across and around a sample wafer in the reactor.

Acknowledgments This research was supported by the MSI Division of the Nuclear Regulatory Commission.

The collaboration between NIST and FAMU was made possible by the NIST-ARRA Fellowship program. The experimental setup and data was contributed by Drs. James E. Maslar and Brent Sperling of the CSTL Division of the National Institute of Standards and Technology. References [1] E.A. Mason, A.P. Malinauskas, Gas Transport in Porous Media: The Dusty-Gas Model, Elsevier, Amsterdam, New York, 1983. [2] T. Suntola, Mater. Sci. Rep. 4 (1989) 261–312. [3] M. Leskelä, M. Ritala, Thin Solid Films 409 (2002) 138–146. [4] M.D. Groner, F.H. Fabreguette, J.W. Elam, S.M. George, Chem. Mater. 16 (2004) 639–645. [5] X. Liu, S. Ramanathan, A. Longdergan, A. Srivastava, E. Lee, T.E. Seidel, J.T. Barton, D. Pang, R. Gordon, J. Electrochem. Soc. 3 (2005) G213–G219. [6] S.M. George, Chem. Rev. 110 (2009) 111–131. [7] S. Wada, A. Sakurai, N. Yamada, T. Watanabe, H. Uchiuzo, ECS Trans. 25 (2009) 209–216. [8] A. Lankhorst, B. Paarhuis, H. Terhorst, P. Simons, C. Kleijn, Surf. Coatings Technol. 201 (2007) 8842–8848. (Euro CVD 16, 16th European Conference on Chemical Vapor Deposition). [9] Sneh O. ALD Apparatus and Method, US Patent # 6,911,092, 2005. [10] D. McNeill, S. Bhattacharya, H. Wadsworth, F. Ruddell, S. Mitchell, B. Armstrong, H. Gamble, J. Mater. Sci.: Mater. Electron. 19 (2008) 119–123. [11] E.L. Cussler, Diffusion: Mass Transfer in Fluid Systems, Cambridge University Press, Cambridge, UK, 1997. [12] R. Jackson, Transport in Porous Catalysts, Elsevier Scientific Publishing, Co., Holland, Amsterdam, NY, New York, 1977. [13] C.R. Wilke, J. Chem. Phys. 18 (1950) 517–519. [14] R.B. Bird, W.E. Stewart, E.N. Lightfoot, Transport Phenomena, John Wiley & Sons, New York, NY, USA, 1960. [15] A.-A.D. Jones, Numerical simulation of a single wafer atomic layer deposition process (Bachelors thesis), Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, 2010. [16] J.E. Maslar, W. Hurst, D. Burgess, W. Kimes, N.V. Nguyen, E. Moore, J. Hodges, ECS Trans. 13 (2008) 139–149.