Nuclear Physics B (Proc. Suppl.) 9 (1989) 447-456 North-Holland, Amsterdam


MONTE CARLO ALGORITHMS FOR QCD

Don WEINGARTEN

IBM Research, P.O.B. 218, Yorktown Heights, NY 10598

We present a classification of the Monte Carlo algorithms which have been suggested for QCD including the full effect of fermions, and discuss which of these are likely to be the most efficient.

1. INTRODUCTION

Over the years since the lattice formulation of Quantum Chromodynamics [1] was first proposed, a variety of different methods have been suggested to calculate the theory's consequences. Among the algorithms available at present, some version of Monte Carlo integration appears to be the most efficient way to obtain predictions for the infinite volume, continuum limit of lattice QCD. The first Monte Carlo algorithms for QCD including the full effect of quark vacuum polarization were proposed seven years ago. Since then about a dozen different possibilities have been suggested. We will give a classification scheme for this set of proposals. Then we will go back through the outline, review the entries in greater detail and show that most of the various proposals are rather closely related. From these relations, combined with the results of a few numerical experiments, it will follow that the best way to find predictions for the infinite volume, continuum limit of the theory is probably one of two alternatives.

2. ALGORITHM CLASSIFICATION

There are three families of fermion algorithms. The first group consists of methods arising from the fermion-boson transformation of Petcher and the present author [2]. The members of this group are:

1. The classical dynamics method of Polonyi and Wyld [3].
2. The Langevin algorithms of Batrouni, Katz, Kronfeld, Lepage, Svetitsky and Wilson [4] and of Ukawa and Fukugita [5], and the Monte Carlo method of Creutz and Gavai [6].
3. The classical dynamics-Langevin hybrid methods of Duane [7], of Ding and Martin [8], and of Gottlieb, Liu, Toussaint, Renken and Sugar [9].
4. The Langevin plus global correction step algorithm of Scalettar, Scalapino and Sugar [10].
5. The hybrid plus global correction step method of Duane, Kennedy, Pendleton and Roweth [11].

The second group of algorithms derives from the pseudofermion method of Fucito, Marinari, Parisi and Rebbi [12]. These consist of:

1. The Langevin method of Zwanziger [13].
2. The pseudofermion method of Creutz and Gavai [6].
3. The determinant ratio algorithms of Duncan and Furman [14], Scalapino and Sugar [15], and Barbour [16].

The final set of related methods consists of:

1. The algorithm of Bhanot, Heller and Stamatescu [17].
2. The method of Rossi and Zwanziger [18].
3. Grady's algorithm [19].

3. FERMIONS TO BOSONS

In this section we will briefly summarize the definition of Wilson's path integral for QCD using Grassmann integration, and then show how it can be converted to an ordinary integral and evaluated by Monte Carlo integration [2]. Let Λ be a finite, 4-dimensional hypercubic lattice. For each site x in Λ, the link matrix U_μ(x), with link index μ = 1,...,4, is a member of SU(3), and the site variables ψ^f_{ia}(x) and ψ̄^f_{ia}(x), with spin index i = 1,...,4, color index a = 1,...,3, and flavor index f = 1,...,n_F, are atomic elements of a Grassmann algebra. The vacuum expectation of a polynomial F(U, ψ, ψ̄) in these fields is then

\langle F \rangle = Z^{-1} \int d\mu(U)\, d\mu(\psi, \bar\psi)\; F\, \exp(-S),   (3.1)

Z = \int d\mu(U)\, d\mu(\psi, \bar\psi)\, \exp(-S),   (3.2)

where ∫ dμ(U)... is integration over one copy of SU(3) Haar measure for each link matrix and ∫ dμ(ψ, ψ̄)... is a product of Grassmann integrals for each component ψ^f_{ia}(x) and ψ̄^f_{ia}(x) at each lattice site x. The action S

in Eqs. (3.1) and (3.2) is

S = S(U) + S(U, \psi, \bar\psi),   (3.3)

S(U, \psi, \bar\psi) = \sum_{x,y,f} \bar\psi^f(x)\, M_U(x, y)\, \psi^f(y),   (3.4)

where S(U) is the plaquette action for the link fields, and M_U(x, y) is Wilson's fermion coupling matrix [20]. There is still no practical way to do Monte Carlo integration directly on a Grassmann integral. A transformation was therefore suggested [2] which converts Eqs. (3.1) and (3.2) into ordinary integrals without Grassmann variables. This transformation requires an even value of n_F, which, for simplicity, we will now take to be 2. Let φ_{ia}(x) be a complex lattice field with spin index i = 1,...,4 and color index a = 1,...,3. Then the transformed versions of Eqs. (3.1) and (3.2) are

\langle F \rangle = Z^{-1} \int d\mu(U)\, d\mu(\phi)\; \tilde F(U, M_U^{-1})\, \exp[-S(U) - S(U, \phi)],   (3.5)

Z = \int d\mu(U)\, d\mu(\phi)\, \exp[-S(U) - S(U, \phi)],   (3.6)

where the action S(U, φ) is

S(U, \phi) = \tfrac{1}{2} \left( M_U^{-1} \phi,\, M_U^{-1} \phi \right).   (3.7)

The integral ∫ dμ(φ)... in Eqs. (3.5) and (3.6) and the inner product (·,·) in Eq. (3.7) are defined in the natural way, and F̃(U, M_U^{-1}) in Eq. (3.5) is a function of U and M_U^{-1} determined by F(U, ψ, ψ̄). For the simplest case of a polynomial F(U) which does not involve fermion fields, F̃(U, M_U^{-1}) is independent of M_U^{-1} and given by F(U).

With Eqs. (3.1) and (3.2) converted to (3.5) and (3.6), respectively, Eq. (3.5) can then be evaluated by Monte Carlo integration. A rejection Metropolis method, link by link, was introduced to update the U field, and a heat bath, site by site, was chosen to update φ. For each of these updates, an algorithm is needed to solve

M_U \chi = \eta   (3.8)

for χ for some specified η. Eq. (3.8) is actually just a lattice version of the inhomogeneous Dirac equation. Eq. (3.8) was solved by Gauss-Seidel iteration. The combination of Eq. (3.5), Monte Carlo integration and the Gauss-Seidel algorithm was tested on a 2⁴ lattice and produced correct results in hundreds of hours on a VAX 11/780 [2]. The main problem with this method, however, is that the work it requires grows rapidly with lattice size. To update the data on each link and each site, Eq. (3.8) must be solved a number of times. Each solution of Eq. (3.8) by Gauss-Seidel, on a lattice with V sites, requires O(V) arithmetic operations. For simplicity we ignore the effect of critical slowing here and in similar estimates later in the article. Since there are O(V) sites and links, each lattice update then costs O(V²) arithmetic operations. A pure gauge theory, by comparison, requires only O(V) operations for a full lattice update. So for a 10⁴ lattice, this method is O(10000) times slower than a pure gauge Monte Carlo.
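Since everything in this algorithm hinges on repeatedly solving Eq. (3.8), a minimal sketch of a Gauss-Seidel solver may be useful. It is only a toy: the dense matrix below merely stands in for Wilson's sparse coupling matrix, and the O(V)-per-solve cost quoted above comes from that sparsity, which this illustration does not exploit.

```python
import numpy as np

def gauss_seidel(M, eta, n_sweeps=200, tol=1e-10):
    """Solve M chi = eta, as in Eq. (3.8), by Gauss-Seidel iteration.
    M is a toy dense stand-in for the sparse lattice Dirac operator."""
    chi = np.zeros_like(eta)
    for _ in range(n_sweeps):
        for i in range(len(eta)):
            # Update component i using the latest values of the others.
            chi[i] = (eta[i] - M[i, :i] @ chi[:i]
                      - M[i, i + 1:] @ chi[i + 1:]) / M[i, i]
        if np.linalg.norm(M @ chi - eta) <= tol * np.linalg.norm(eta):
            break
    return chi

# Toy test on a diagonally dominant complex matrix, for which
# Gauss-Seidel is guaranteed to converge.
rng = np.random.default_rng(0)
n = 16
M = np.eye(n) + (0.5 / n) * (rng.normal(size=(n, n))
                             + 1j * rng.normal(size=(n, n)))
eta = rng.normal(size=n) + 1j * rng.normal(size=n)
chi = gauss_seidel(M, eta)
print(np.linalg.norm(M @ chi - eta))   # essentially zero
```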

4. CLASSICAL DYNAMICS

To accelerate the algorithm of Sect. 3, Polonyi and Wyld [3] suggested an adaptation of the microcanonical method, which had been applied to gauge theories without fermions by Callaway and Rahman [21]. The fields φ and U are interpreted as the coordinates of a classical system, constrained by the requirement that each U_μ(x) be in SU(3). A set of real-valued momenta P^j_μ(x), j = 1,...,8, is introduced conjugate to each constrained matrix U_μ(x), and complex momenta P^φ_{ia}(x) are introduced conjugate to φ_{ia}(x). From these coordinates and momenta, a hamiltonian H is defined by adding kinetic energy terms to the action of Eq. (3.5), which can be thought of as a potential energy:

H = S(U) + S(U, \phi) + \tfrac{1}{2} \sum_{x,i,a} |P^\phi_{ia}(x)|^2 + \tfrac{1}{2} \sum_{x,\mu,j} [P^j_\mu(x)]^2.   (4.1)

If the average of a function F̃(U, M_U^{-1}) is taken over a canonical ensemble for H, the momenta P^φ and P^U enter through simple gaussian integrals. These can be carried out exactly and give the expectation defined in Eq. (3.5) as the result. In the classical dynamics algorithm, averages over a canonical ensemble for H are replaced by time averages over a long classical trajectory generated by solving Hamilton's equations of motion arising from H. If the system is ergodic, time averages will approach averages over a microcanonical ensemble which, in turn, will approach canonical ensemble averages if the system's physical volume is made large. The possible advantage of the classical dynamics algorithm over the algorithm of Sect. 3 can be found by looking at Hamilton's equations for a small time step Δτ. A convenient version of Hamilton's equations [22] gives for P^φ and φ

\phi'(x) = \phi(x) + \Delta\tau\, P^\phi(x) + \tfrac{1}{2} \Delta\tau^2\, F^\phi(x),   (4.2)

P^{\phi\prime}(x) = P^\phi(x) + \tfrac{1}{2} \Delta\tau\, [F^\phi(x) + F^{\phi\prime}(x)],   (4.3)

where the force F^φ(x) is defined to be

F^\phi(x) = -\tfrac{1}{2}\, (N_U^{-1} \phi)(x),   (4.4)

and F^{φ′}(x) is obtained similarly from φ′ and U′. For P^U and U we have

U'_\mu(x)\, U^\dagger_\mu(x) = 1 + i\, \Delta\tau \sum_j A_j\, P^j_\mu(x) + \tfrac{1}{2} i\, \Delta\tau^2 \sum_j A_j\, F^j_\mu(x),   (4.5)

P^{j\prime}_\mu(x) = P^j_\mu(x) + \tfrac{1}{2} \Delta\tau\, [F^j_\mu(x) + F^{j\prime}_\mu(x)],   (4.6)

where the force F^j_μ(x) is defined to be

F^j_\mu(x) = -\partial_{j\mu x} S(U) + \tfrac{1}{2} \left( N_U^{-1}\phi,\, [\partial_{j\mu x} N_U]\, N_U^{-1}\phi \right),   (4.7)

and F^{j′}_μ(x) is obtained similarly from U′ and φ′. In Eq. (4.5) the quantities A_1,...,A_8 are an orthonormal set of 3 × 3 traceless, complex hermitian matrices, and N_U is

N_U = M_U M_U^\dagger.   (4.8)

For a function f[U_μ(x)] of the link matrix U_μ(x) the derivative ∂_{jμx} is defined by

\partial_{j\mu x} f[U_\mu(x)] = \frac{\partial}{\partial \alpha} f[(1 + i\alpha A_j)\, U_\mu(x)] \Big|_{\alpha = 0}.   (4.9)

The crucial feature of Eqs. (4.2)-(4.7) is that in order to update all coordinates and momenta on the lattice, we need to solve

N_U \chi = \phi   (4.10)

for χ only once rather than O(V) times, as required by the algorithm of Sect. 3. Solving Eq. (4.10) takes O(V) operations. The remaining calculations in Eqs. (4.2)-(4.7) also take only O(V) operations. Thus the total work required for a full lattice update is O(V) rather than O(V²) as in Sect. 3.

The classical dynamics algorithm has two new problems, however. First of all, the discretized evolution equations, Eqs. (4.2)-(4.7), are correct only in the limit Δτ → 0. As Δτ is made small, the amount of arithmetic required to travel a fixed distance in τ grows as Δτ^{-1}. Thus the present algorithm will be faster than the method of Sect. 3 only if Eqs. (4.2)-(4.7) become trustworthy for a value of Δτ which is sufficiently large. A second problem is that this algorithm depends on the assumption that the motion generated by H is ergodic. However, if the gauge coupling g is made small and the quark mass m is made large, with the lattice Λ fixed, the system approaches a set of decoupled harmonic oscillators. It follows rigorously [23] that for some value of g sufficiently small and m sufficiently large, the ergodicity hypothesis must be violated. It seems reasonable, therefore, to question whether ergodicity holds at physical values of g and m. How this question could be answered is not clear.
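The update of Eqs. (4.2)-(4.7) is a second-order discretization of Hamilton's equations. The following sketch shows the same integration pattern for a single unconstrained real coordinate; the SU(3) constraint, the momenta P^U, and the lattice fields are stripped away, so this illustrates the time step itself rather than the QCD algorithm.

```python
import numpy as np

def leapfrog_step(phi, p, force, dtau):
    """One step in the pattern of Eqs. (4.2)-(4.3): phi is a generic
    coordinate, p its conjugate momentum, and force(phi) = -dS/dphi."""
    f = force(phi)
    phi_new = phi + dtau * p + 0.5 * dtau**2 * f
    p_new = p + 0.5 * dtau * (f + force(phi_new))
    return phi_new, p_new

# Toy check: for S(phi) = phi^2/2 the energy H = S + p^2/2 is nearly
# conserved over a long trajectory -- the property Sect. 6 will exploit.
force = lambda phi: -phi
phi, p = 1.0, 0.0
H0 = 0.5 * phi**2 + 0.5 * p**2
for _ in range(1000):
    phi, p = leapfrog_step(phi, p, force, dtau=0.01)
print(abs(0.5 * phi**2 + 0.5 * p**2 - H0))   # small, O(dtau^2)
```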

5. LANGEVIN AND HYBRID

A way around the ergodicity problem of the classical dynamics algorithm was suggested by Batrouni, Katz, Kronfeld, Lepage, Svetitsky and Wilson [4], and independently by Ukawa and Fukugita [5]. Related ideas were also discussed by Creutz and Gavai [6]. All of these groups proposed, in effect, a Monte Carlo evaluation of Eq. (3.5), but in place of the update of Sect. 3, which gives a large change in a single link matrix at a time, these groups suggested updates which make small changes in all link matrices on the lattice at once. The particular form of this idea considered by BKKLSW and by Ukawa and Fukugita is related to the Langevin algorithm suggested for gauge theories without fermions by Parisi and Wu [24].


In the algorithm of Ukawa and Fukugita, for an appropriately chosen gaussian ensemble of possible steps, a Monte Carlo update has the same form as the classical update of Sect. 4, Eqs. (4.2), (4.4), (4.5) and (4.7). In place of the old values of the momenta, P^φ(x) and P^j_μ(x), however, Ukawa and Fukugita have a set of uncorrelated complex gaussian random variables, η^φ(x), and real gaussian random variables, η^j_μ(x), respectively, with means 0 and dispersions

\langle \eta^\phi_{ia}(x)\, \eta^{\phi *}_{jb}(y) \rangle = 2\, \delta_{ij}\, \delta_{ab}\, \delta_{xy},   (5.1)

\langle \eta^j_\mu(x)\, \eta^k_\nu(y) \rangle = \delta_{jk}\, \delta_{\mu\nu}\, \delta_{xy}.   (5.2)

This update step clearly shares with the classical update the property that only a single solution of Eq. (4.10) is required for each update of the lattice. To leading order in Δτ, this step fulfills the detailed balance condition required to generate a Monte Carlo ensemble to evaluate Eq. (3.5). Since a fresh set of gaussian η^φ(x) and η^j_μ(x) is chosen for each update, however, every point in configuration space has a nonzero probability of eventually being reached from any starting point. Thus the ergodicity problem of the classical dynamics algorithm is solved.

If the classical dynamics algorithm does lead to an ergodic system, then distributions of individual components of P^φ and P^U will approach gaussians with mean 0 and dispersions given by Eqs. (5.1) and (5.2), respectively, as the system's physical volume becomes large. Thus if ergodicity holds, the update steps of the classical dynamics algorithm and of the method of Ukawa and Fukugita become essentially identical. The only difference between the two algorithms would be that the classical dynamics algorithm successively steps along a single trajectory, while the algorithm of Ukawa and Fukugita takes one step along a particular trajectory, then jumps to a new trajectory before taking the next step.

The algorithm of BKKLSW differs from the method of Ukawa and Fukugita by replacing the update step for φ, Eqs. (4.2) and (4.3), with

\phi' = M_U\, \eta^\phi,   (5.3)

for a gaussian η^φ as before with mean 0 and dispersion given by Eq. (5.1). Eq. (5.3) fulfills the detailed balance condition needed for Monte Carlo evaluation of Eq. (3.5) exactly, not just to leading order in Δτ. Eq. (5.3) is a heat bath update on all components of φ

at once in place of the single component heat bath used in the algorithm of Sect. 3. The amount of arithmetic required by Eq. (5.3) is not significantly different from that required by the φ update in the algorithm of Ukawa and Fukugita. However, since Eq. (5.3) makes a large change in all components of φ, in place of the changes of O(Δτ) given by the method of Ukawa and Fukugita, we would expect it to lead to more rapid decorrelation of the system. Replacing Eq. (4.3) by Eq. (5.3) should therefore lead to an overall improvement in the efficiency of the algorithm.

Although the Langevin algorithms of Ukawa and Fukugita and of BKKLSW resolve the ergodicity problem of the classical dynamics algorithm, these methods also have an unattractive feature. In the classical dynamics algorithm, successive updates are in nearly the same direction if Δτ is small. Thus after some small number of updates n, the total change in each link matrix U_μ(x) will be O(nΔτ). In the Langevin algorithms, successive steps are in random directions, so that after a small number of updates, the total change in each U_μ(x) will only be O(√n Δτ). Thus for small values of Δτ and small numbers of updating steps, the classical dynamics algorithm appears to travel faster in configuration space and may therefore require less arithmetic to travel from an initial configuration to a configuration which is statistically decorrelated. It would be useful to combine the ergodicity of the Langevin algorithms with the rapid decorrelation of the classical dynamics method.

A hybrid of Langevin, in the same form considered by Ukawa and Fukugita, with the classical dynamics algorithm was suggested by Duane [7]. At each step a random choice is made between a classical update, with some probability p, and a Langevin update with probability 1 − p. Thus the update is given by Eqs. (4.2)-(4.7), but with P^φ(x) and P^j_μ(x) replaced by

r\, P^\phi(x) + (1 - r)\, \eta^\phi(x),   (5.4)

r\, P^j_\mu(x) + (1 - r)\, \eta^j_\mu(x),   (5.5)

where r is a random variable which is 1 with probability p and 0 with probability 1 − p. If p is close to 1, this algorithm will advance, on the average, almost as rapidly as the classical dynamics algorithm. As long as p < 1, however, some Langevin steps will be taken and ergodicity is assured. A proof that Eqs. (4.2)-(4.7) with the substitutions of Eqs. (5.4) and (5.5) generate the correct probability distribution of configurations is given by Duane and Kogut [25].
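A minimal sketch of this stochastic momentum choice, written per component and with hypothetical names, is the following; a full update would feed the resulting momentum into the step of Eqs. (4.2)-(4.7).

```python
import numpy as np

rng = np.random.default_rng(1)

def hybrid_momentum(p_old, p_classical):
    """Momentum choice of Eqs. (5.4)-(5.5): r is 1 with probability p
    (keep the momentum: a classical step) and 0 with probability 1 - p
    (draw a fresh unit gaussian: a Langevin step)."""
    r = 1 if rng.random() < p_classical else 0
    return r * p_old + (1 - r) * rng.normal()

# The momentum sequence stays gaussian while allowing long stretches of
# deterministic motion; any p < 1 keeps the chain ergodic.
mom, samples = 0.0, []
for _ in range(10000):
    mom = hybrid_momentum(mom, p_classical=0.9)
    samples.append(mom)
print(np.mean(samples), np.var(samples))   # roughly 0 and 1
```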

An analytic calculation [7] for free field theories shows that there is an optimal p to produce fastest decorrelation, and this optimum is in the range 0 < p < 1. Thus for free field theories the optimum includes both classical steps and Langevin steps. For interacting theories a similar result has been found numerically [25].

A variation of the hybrid algorithm with φ updated by Eq. (5.3) was suggested by Gottlieb, Liu, Toussaint, Renken and Sugar [9]. Related ideas were discussed by Ding and Martin [8]. The algorithm of GLTRS updates configurations by successively applying three different types of steps:

1. Choose φ′ according to Eq. (5.3).
2. Choose the momenta P^U from a gaussian ensemble:

P^j_\mu(x) = \eta^j_\mu(x).   (5.6)

3. Evolve U and P^U according to Eqs. (4.5)-(4.7) for a total duration τ.

It follows from the discussion of the algorithm of BKKLSW that step 1 here is likely more efficient than updating φ by iterating Eqs. (4.2), (4.3) and (5.4) as suggested by Duane. Steps 2 and 3 here are almost equivalent to the updating rule for U in Duane's algorithm, except that Duane's algorithm gives an exponential probability distribution for τ,


\mathrm{Prob}(\tau = n\, \Delta\tau) = p^n (1 - p),   (5.7)

in place of the fixed value taken in step 3. As it turns out, however, the choice of a fixed value of τ leads to some risk of losing ergodicity. For example, if a version of steps 2 and 3 is applied to a simple harmonic oscillator with period T, then it is easy to see that ergodicity will be lost if τ is an integral multiple of T/2. Although it may seem unlikely that this problem would occur for QCD, the safest strategy is to apply step 3 with the distribution of Eq. (5.7) for τ in place of a single fixed value.

All of the algorithms in the present section reproduce the vacuum expectations of Eq. (3.5) only in the limit Δτ → 0. A remaining issue concerning these algorithms is how small Δτ must be taken to obtain reliable estimates of the Δτ → 0 limit. Since the work required by any of these methods grows as Δτ^{-1}, for sufficiently small values of Δτ these algorithms will actually be slower than the method of Sect. 3. The extrapolation of predictions to Δτ of 0 for the Langevin algorithms has been examined by BKKLSW [4], by Ukawa and Fukugita [5], and by Fukugita, Oyanagi and Ukawa [26], and for the hybrid algorithms this has been considered by Duane and Kogut [25] and by GLTRS [9,28]. The overall conclusion of this work is that accurate extrapolations can be done from sufficiently large values of Δτ that these algorithms remain a great deal faster than the algorithm of Sect. 3. Hadron mass calculations using a Langevin method [26] on a 9³ × 18 lattice, for example, were done with approximately 1500 hours of time on a computer running at between 100 and 150 Mflops.

Although it is possible to extrapolate Δτ to 0, this extrapolation is still time consuming and a source of uncertainty. Predictions must be calculated at a number of different values of Δτ and compared with an asymptotic dependence on Δτ obtained analytically. If the data do not fit, more Monte Carlo data must be collected at smaller Δτ until data which do fit are obtained. Since the data all come with statistical errors, however, the conclusion that a set of data fits the expected form carries an element of uncertainty. To obtain an acceptable degree of uncertainty in this fit often requires measuring results at three or more different Δτ to rather high precision. In addition, the extrapolation to Δτ of 0 has a tendency to amplify errors in the measured data and therefore also requires small statistical errors in the data measured at nonzero Δτ.
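As an illustration of the extrapolation just described, the sketch below fits measurements at several Δτ by weighted least squares and reads off the Δτ → 0 intercept. The data are made-up placeholders, and the assumed linear dependence on Δτ is for illustration only; the correct asymptotic form must be taken from the analytic error calculation for the algorithm actually used.

```python
import numpy as np

# Hypothetical measurements of some observable at three step sizes,
# with statistical errors; these numbers are invented placeholders.
dtau = np.array([0.02, 0.04, 0.08])
obs = np.array([0.5812, 0.5845, 0.5911])
err = np.array([0.0010, 0.0012, 0.0010])

# Weighted least-squares fit to obs = a + b*dtau; the intercept a is
# the extrapolated prediction, and its uncertainty is amplified
# relative to the errors on the individual points.
w = 1.0 / err
A = np.vstack([np.ones_like(dtau), dtau]).T
coef, *_ = np.linalg.lstsq(A * w[:, None], obs * w, rcond=None)
print("dtau -> 0 extrapolation:", coef[0], " slope:", coef[1])
```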

6. CORRECTED LANGEVIN AND HYBRID

One advantage of the algorithm of Sect. 3 over those of Sects. 4 and 5 is that it requires no extrapolation in Δτ. A way to combine this feature of the algorithm of Sect. 3 with the speed of the Langevin algorithm was suggested by Scalettar, Scalapino and Sugar [10]. This method consists of a rejection Metropolis algorithm to evaluate Eq. (3.5), as suggested in Sect. 3, but in place of trial steps which modify only a single link at a time, trial updates are generated by a Langevin step. In a version of this algorithm due to Creutz [22], an update occurs in four stages:

1. Choose a new φ according to Eq. (5.3).
2. Choose the components of P^U to be independent gaussian random variables according to Eq. (5.6).
3. Update (P^U, U) to a trial configuration (P^U′, U′) according to Eqs. (4.5)-(4.7).
4. Accept the trial (P^U′, U′) in place of (P^U, U) with probability given by the minimum of ρ and 1, where

\rho = \frac{\exp[-H(P^{U\prime}, U')]}{\exp[-H(P^U, U)]},   (6.1)

and H is defined by Eq. (4.1). Creutz shows that an update by Eqs. (4.5)-(4.7) is exactly reversible. Thus if step 3 is applied to (−P^U′, U′) the result is (−P^U, U). In addition, this update preserves phase space volume exactly,

d\mu(P^{U\prime})\, d\mu(U') = d\mu(P^U)\, d\mu(U).   (6.2)

Consider now a modified version of step 3, call it step 3′, which begins by randomly replacing P^U by −P^U with probability 1/2. Using Eq. (6.2) combined with reversibility, it can be shown that step 3′ followed by step 4 fulfills the detailed balance condition:

\frac{\mathrm{Prob}[(P^U, U) \to (P^{U\prime}, U')]}{\mathrm{Prob}[(P^{U\prime}, U') \to (P^U, U)]} = \frac{\exp[-H(P^{U\prime}, U')]}{\exp[-H(P^U, U)]}.   (6.3)

Of course step 2 followed by step 3′ is the same as step 2 followed by step 3. Thus Eq. (6.3) implies that steps 1-4 asymptotically generate a sequence of configurations distributed exactly according to exp[−H(P^U, U)] for any value of Δτ.

The algorithm just described is a corrected version of Langevin. Duane, Kennedy, Pendleton and Roweth [11] suggested converting this algorithm to a corrected version of the hybrid method. In the algorithm of DKPR, step 3 above is iterated some number of times n before applying step 4. The result is again exact for all values of Δτ. The crucial question for the corrected Langevin algorithm and for the corrected hybrid algorithm is whether a reasonable acceptance rate in step 4 can be obtained for values of Δτ not too much smaller than the smallest Δτ required to extrapolate the uncorrected algorithms to Δτ of 0. A further question for the corrected hybrid is whether the fastest average motion through configuration space is obtained with n > 1. As n is made larger the acceptance rate in step 4 may fall. Thus it is conceivable that the optimal n could be 1, so that the optimal corrected hybrid simply reduces to the corrected Langevin.

Numerical experiments have now been done by a number of groups to answer these questions [10,11,22,28,29,30]. The overall conclusion, for lattices up to about 8⁴ and (staggered) quark masses above 0.1 in lattice units, is that the optimal corrected algorithm has n > 1. In addition, the best corrected algorithm moves in phase space at least as fast as the uncorrected hybrid algorithm does, when the uncorrected algorithm is run at the smallest Δτ needed for extrapolation to Δτ of 0. In other words, results which require no Δτ extrapolation can be gotten from the corrected algorithm for the same amount of work the uncorrected algorithm requires to produce results which must then be extrapolated.

It may seem surprising that the corrected algorithms can obtain large acceptance rates in step 4 without being forced to extremely small values of Δτ. An explanation is provided by an argument due to Creutz [22] for the corrected Langevin, and extended to the corrected hybrid method by Gupta, Kilcup and Sharpe [30]. The essential reason that step 4 frequently accepts trial configurations is that step 3 nearly preserves the value of H. Creutz gives a heuristic argument which shows that for a single update in step 3, the change in H on a lattice with V sites is

"H (pU',U') - TL (pU, u) = ~AT~V +/~AT%~:C,

(6.4)

where c~ and I~ are constants and ( i s a gaussian random variable with dispersion 1. From Eq. (6.4) we can find the acceptance probability in step 4 of a trial configuration produced by step 3. The result is proportional to exp(~'/AvgV), where A and 7 are constants. Thus a reasonable acceptance rate can be obtained with AT falling only rather slowly as the lattice size grows AT ~ V 4 .

(6.s)

On the other hand, the change in a component of P^U and U after N such sweeps will be O(Δτ √N). Thus the number of sweeps needed to obtain a configuration with each component of P^U and U altered by some finite, fixed amount is

N = O(\Delta\tau^{-2}) = O(V^{1/3}).   (6.6)

Since the number of operations required for each sweep is O(V), the total cost to produce a configuration with each component of P^U and U altered by some fixed amount is O(V^{4/3}). The extension of this argument to the corrected hybrid algorithm [30] shows that with nΔτ fixed, a reasonable acceptance rate in step 4 requires

\Delta\tau \propto V^{-1/4},   (6.7)

and therefore

n = O(V^{1/4}).   (6.8)

The number of operations to change each component of P^U and U by some fixed amount by this method then grows as O(V^{5/4}). The end result is that the corrected hybrid method requires no extrapolations and replaces work growing as O(V²) in the algorithm of Sect. 3 with work growing as O(V^{5/4}). This is still a bit worse than the O(V) work for a gauge theory without fermions, but not much. The essential trick in going from O(V²) down to O(V^{5/4}) is the replacement of the simple trial updates of Sect. 3 with a more complicated trial update which already has a good deal of the system's dynamics built in.
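The four-stage update of this section is compact enough to write down for a toy model. The sketch below applies steps 2-4 (momentum refresh, n leapfrog steps, Metropolis test on the change in H) to a single real coordinate with a generic action; the φ heat bath of step 1 and all SU(3) structure are omitted, so this is a cartoon of the DKPR scheme rather than the QCD implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

def corrected_hybrid_update(phi, S, force, n_steps, dtau):
    """Steps 2-4 for one real coordinate: refresh the momentum
    (step 2), integrate n leapfrog steps (step 3, iterated as in the
    DKPR method), then accept with probability min(1, exp[-dH])
    (step 4).  A rejection keeps the old configuration, which is what
    removes the need for any dtau extrapolation."""
    p = rng.normal()
    h_old = S(phi) + 0.5 * p * p
    x, q = phi, p
    for _ in range(n_steps):
        f = force(x)
        x = x + dtau * q + 0.5 * dtau**2 * f
        q = q + 0.5 * dtau * (f + force(x))
    h_new = S(x) + 0.5 * q * q
    return x if rng.random() < min(1.0, np.exp(h_old - h_new)) else phi

# Toy run: for S(phi) = phi^2/2 the sampled variance should be ~1 for
# any dtau, since step 4 corrects the discretization error exactly.
S = lambda x: 0.5 * x * x
force = lambda x: -x
phi, chain = 0.0, []
for _ in range(20000):
    phi = corrected_hybrid_update(phi, S, force, n_steps=10, dtau=0.2)
    chain.append(phi)
print(np.var(chain))   # close to 1
```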

7. PSEUDOFERMIONS

We will now consider the pseudofermion algorithm of Fucito, Marinari, Parisi and Rebbi [12], and some related methods. If the integral over Grassmann variables is carried out in Eqs. (3.1) and (3.2) we obtain

\langle F \rangle = Z^{-1} \int d\mu(U)\; \tilde F\, \det(M)^{n_F}\, \exp[-S(U)],   (7.1)

Z = \int d\mu(U)\, \det(M)^{n_F}\, \exp[-S(U)],   (7.2)

where F̃ is the quantity entering Eq. (3.5). For both Wilson fermions and staggered fermions it can be shown that

\det(M) = \det(M^\dagger).   (7.3)

It follows that Eqs. (7.1) and (7.2) are equivalent to

\langle F \rangle = Z^{-1} \int d\mu(U)\; \tilde F\, \exp[-S_{eff}(U)],   (7.4)

Z = \int d\mu(U)\, \exp[-S_{eff}(U)],   (7.5)

where S_eff is defined as

S_{eff} = S(U) - \frac{n_F}{2} \ln \det(M M^\dagger).   (7.6)
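For orientation, here is a toy evaluation of the effective action of Eq. (7.6), using a dense stand-in for M and numpy's slogdet. Production lattice codes never form this determinant explicitly; the algorithms of this and the following section exist precisely to avoid doing so.

```python
import numpy as np

def s_eff(s_gauge, M, n_flavors=2):
    """Eq. (7.6): S_eff = S(U) - (n_F/2) ln det(M M^dagger), evaluated
    directly for a small dense toy matrix M."""
    sign, logdet = np.linalg.slogdet(M @ M.conj().T)   # sign is +1 here
    return s_gauge - 0.5 * n_flavors * logdet

rng = np.random.default_rng(5)
M = np.eye(6) + 0.1 * rng.normal(size=(6, 6))
print(s_eff(1.0, M))
```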

The algorithm of FMPR evaluates Eq. (7.4), in effect, by a version of the Langevin algorithm [24]. An update step becomes

U'_\mu(x)\, U^\dagger_\mu(x) = 1 + i\, \Delta\tau \sum_j A_j\, \eta^j_\mu(x) + \tfrac{1}{2} i\, \Delta\tau^2 \sum_j A_j\, F^j_\mu(x),   (7.7)

where the force F^j_μ(x) is now defined to be

F^j_\mu(x) = -\partial_{j\mu x} S(U) + \frac{n_F}{2} \mathrm{Tr}\!\left[ N_U^{-1}\, \partial_{j\mu x} N_U \right],   (7.8)

where N_U is defined by Eq. (4.8) and η^j_μ(x) is the collection of independent gaussian random variables defined in Sect. 5. To obtain the matrix elements of N_U^{-1} needed for Eq. (7.8), FMPR define a collection of random complex "pseudofermion" variables χ_{ia}(x) with probability density proportional to exp[−(χ, N_U χ)/2]. Then we have

\langle \chi_{ia}(x)\, \chi^*_{jb}(y) \rangle = 2\, [N_U^{-1}]_{ia,jb}(x, y).   (7.9)

Before each update of U by Eqs. (7.7) and (7.8), a Monte Carlo algorithm is used to generate an ensemble of random χ. The matrix elements of N_U^{-1} needed by Eq. (7.8) are then found using Eq. (7.9). If critical slowing of the Monte Carlo for χ is ignored, the total amount of work for a single update of U by this method is O(V). The results obtained in this way must then be extrapolated to Δτ of 0. Thus it would appear that this algorithm might be comparable in efficiency to the Langevin algorithms discussed in Sect. 5. As it turns out, however, the Monte Carlo evaluation of Eq. (7.9) for each step of Eq. (7.7) requires a large amount of arithmetic. Numerical experiments [31] show that for lattices up to about 6⁴ this algorithm is comparable in speed to, or slower than, the method of Sect. 3.

In an alternate version of this method, a Monte Carlo ensemble of χ large enough to evaluate Eq. (7.9) accurately is replaced by an ensemble produced by only a small number of Monte Carlo updates applied to the final χ generated for the preceding update of U. The extreme version of this method uses a single χ generated by a single Monte Carlo update [13]. It can be shown [13,31] that the errors introduced this way in matrix elements of N_U^{-1} average out in the limit of small Δτ, and a correct ensemble of U is still produced by Eq. (7.7). The cost of using a small ensemble of χ is that extremely small values of Δτ are then required in Eq. (7.7) to obtain accurate estimates of the Δτ → 0 limit [31]. The result is that the alternate version of this algorithm remains extremely slow.

Another form of the algorithm of FMPR was suggested by Creutz and Gavai [6]. Instead of generating an ensemble of χ by a Monte Carlo, fields with the required distribution can be obtained by solving


M_U^\dagger\, \chi_k = \eta^\phi_k, \qquad k = 1, \ldots, N_\eta,   (7.10)

where each η^φ_k is an independent copy of the gaussian random field η^φ of Sect. 5. Generating a collection of statistically independent gaussian η^φ_k is not difficult using a random number generator. Each independent χ_k then costs one solution of Eq. (7.10). On a large lattice with small lattice spacing, we would expect that it is significantly easier to be sure that Eq. (7.10) has been solved to some specified degree of accuracy than it is to determine the number of iterations of a Monte Carlo needed between statistically independent χ_k. Thus this method of generating χ_k is likely more efficient than the original suggestion of FMPR. Combining Eq. (7.10) with Eq. (7.8), we obtain

F^j_\mu(x) = -\partial_{j\mu x} S(U) + \frac{n_F}{4 N_\eta} \sum_{k=1}^{N_\eta} \left( \chi_k,\, [\partial_{j\mu x} N_U]\, \chi_k \right).   (7.11)

For the case N_η = 1, Eqs. (7.7) and (7.11) become identical to the updating equations, Eqs. (4.5), (4.7) and (5.3), for the BKKLSW Langevin method of Sect. 5. For other values of N_η the present method is, in effect, a generalization of the BKKLSW algorithm. The value of N_η which optimizes the algorithm's efficiency can be obtained from an analytic calculation of the error arising from applying Eqs. (7.7) and (7.11) at nonzero Δτ. An extension of the error calculations given by BKKLSW shows that the error arising from Eq. (7.11) in any expectation value will have the form

E = \Delta\tau^2 \left( a + \frac{b}{N_\eta} \right),   (7.12)

where a and b are constants. Numerical work [6] shows that a and b are both negative for the expectation value of a plaquette. On the other hand, after N_up updates of U by Eqs. (7.7) and (7.11), the change in the field U will fulfill

\langle (\Delta U)^2 \rangle = O(N_{up}\, \Delta\tau^2),   (7.13)

if Δτ and N_up are not too large. Since the main work in an update is solving Eq. (7.10), the total work for N_up sweeps will be O(N_up N_η V). Combining Eqs. (7.12) and (7.13), the work required to travel a fixed distance in configuration space at fixed error is, up to factors independent of N_η,

W \propto 1 + N_\eta\, \frac{a}{b}.   (7.14)

Since a/b is positive for a plaquette, it follows that the amount of work required to travel a fixed distance in configuration space with a fixed error in a plaquette expectation is minimized at N_η of 1. The overall conclusion is that the Langevin method of BKKLSW is the most efficient form of the pseudofermion method of Creutz and Gavai, which, at least for small lattice spacing and large lattices, is very likely a better algorithm than the original pseudofermion method of FMPR.

A number of Monte Carlo methods for evaluating Eq. (7.4) were suggested by the pseudofermion method but do not involve a small step Δτ and an extrapolation of Δτ to 0. Algorithms in this class were suggested by Duncan and Furman [14], Scalapino and Sugar [15], and Barbour [16]. All of these methods require work growing at least as O(V²). These algorithms will not be considered here in detail.
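A sketch of the Creutz-Gavai generation of pseudofermion fields by Eq. (7.10), with a numerical check of the covariance of Eq. (7.9). A toy dense matrix and a direct solver stand in for the lattice operator and the iterative solver a real code would use.

```python
import numpy as np

rng = np.random.default_rng(3)

def pseudofermion(M):
    """Eq. (7.10): draw a complex gaussian eta with <|eta_i|^2> = 2 and
    solve M^dagger chi = eta.  The resulting chi has density
    proportional to exp[-(chi, N chi)/2] with N = M M^dagger."""
    n = M.shape[0]
    eta = rng.normal(size=n) + 1j * rng.normal(size=n)
    return np.linalg.solve(M.conj().T, eta)

# Check Eq. (7.9): <chi chi^dagger> = 2 N^{-1}, up to sampling noise.
n, n_samples = 4, 100000
M = np.eye(n) + 0.2 * (rng.normal(size=(n, n))
                       + 1j * rng.normal(size=(n, n)))
cov = np.zeros((n, n), dtype=complex)
for _ in range(n_samples):
    chi = pseudofermion(M)
    cov += np.outer(chi, chi.conj())
cov /= n_samples
target = 2 * np.linalg.inv(M @ M.conj().T)
print(np.max(np.abs(cov - target)))   # shrinks like 1/sqrt(n_samples)
```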

8. OTHER ALGORITHMS

The third family of related algorithms consists of the method of Bhanot, Heller and Stamatescu [17], the algorithm of Rossi and Zwanziger [18], and a method proposed by Grady [19]. All of these algorithms may be viewed as evaluating the integral of Eq. (7.4) and require work growing as O(V²) for each update of all components of U. Bhanot, Heller and Stamatescu propose evaluating Eq. (7.4) by a rejection Metropolis method successively applied to each link. The probability of retaining a trial configuration U′ is then given by the minimum of 1 and the quantity

\rho = \left[ \frac{\det(N_{U'})}{\det(N_U)} \right]^{n_F/2} \exp[S(U) - S(U')].   (8.1)

The difficult part of finding ρ is determining the ratio of determinants. For this purpose an ensemble of random χ is generated, just as in the pseudofermion algorithm, with probability density proportional to exp[−(χ, N_U χ)/2]. Originally a Monte Carlo was proposed to produce this ensemble, but following the discussion of Sect. 7, a more efficient method, once again, is probably to generate random gaussian fields η^φ_k and then solve Eq. (7.10) for χ_k. The ratio of determinants becomes


\frac{\det(N_{U'})}{\det(N_U)} = \frac{1}{\left\langle \exp\left[ \tfrac{1}{2}(\chi, N_U \chi) - \tfrac{1}{2}(\chi, N_{U'} \chi) \right] \right\rangle}.   (8.2)

A version of this strategy is adopted by Rossi and Zwanziger [18]. Bhanot, Heller and Stamatescu also propose a second method for evaluating the ratio of determinants in Eq. (8.1). In this alternative an ensemble of random χ is generated with probability density proportional to exp[−(χ, N_{U′} χ)/2], which can be obtained by solving

M_{U'}^\dagger\, \chi_k = \eta^\phi_k.   (8.3)

The ratio of determinants is then given by the expectation

\frac{\det(N_{U'})}{\det(N_U)} = \left\langle \exp\left[ \tfrac{1}{2}(\chi, N_{U'} \chi) - \tfrac{1}{2}(\chi, N_U \chi) \right] \right\rangle.   (8.4)
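The estimator of Eqs. (8.3) and (8.4) is easy to test on small matrices. In the sketch below a dense matrix stands in for the lattice operator; note that for large changes U → U′ the variance of this estimator grows rapidly, which is one reason these methods update a single link at a time.

```python
import numpy as np

rng = np.random.default_rng(4)

def det_ratio(M_old, M_new, n_samples=100000):
    """Estimate det(N')/det(N), N = M M^dagger, per Eqs. (8.3)-(8.4):
    sample chi with density prop. to exp[-(chi, N' chi)/2] by solving
    M'^dagger chi = eta, then average
    exp[((chi, N' chi) - (chi, N chi))/2]."""
    N_old = M_old @ M_old.conj().T
    N_new = M_new @ M_new.conj().T
    n, acc = M_new.shape[0], 0.0
    for _ in range(n_samples):
        eta = rng.normal(size=n) + 1j * rng.normal(size=n)
        chi = np.linalg.solve(M_new.conj().T, eta)
        q_new = np.real(chi.conj() @ (N_new @ chi))   # equals |eta|^2
        q_old = np.real(chi.conj() @ (N_old @ chi))
        acc += np.exp(0.5 * (q_new - q_old))
    return acc / n_samples

n = 3
M_old = np.eye(n) + 0.1 * rng.normal(size=(n, n))
M_new = M_old + 0.05 * rng.normal(size=(n, n))
exact = np.linalg.det(M_new @ M_new.T) / np.linalg.det(M_old @ M_old.T)
print(det_ratio(M_old, M_new), exact)
```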

Grady's algorithm, in a form due to Creutz [32], is similar to the second method of Bhanot, Heller and Stamatescu. Grady shows that in determining ρ in Eq. (8.1) it is sufficient to approximate the required ratio of determinants by a simplified version of Eq. (8.4) using only a single field in the ensemble of χ,

\exp\left[ \tfrac{1}{2}(\chi, N_{U'} \chi) - \tfrac{1}{2}(\chi, N_U \chi) \right].   (8.5)

With this expression for the ratio of determinants, the asymptotic distribution of Monte Carlo U will still be exactly as required. By Eq. (8.3) combined with the definition of N_U, Eq. (4.8), the quantity in Eq. (8.5) can be written

\exp\left[ \tfrac{1}{2}(\eta, \eta) - \tfrac{1}{2}\left( M_U^\dagger (M_{U'}^\dagger)^{-1} \eta,\; M_U^\dagger (M_{U'}^\dagger)^{-1} \eta \right) \right].   (8.6)

It is interesting at this point to return to the algorithm of Sect. 3. Let us update components of U by a rejection Metropolis. After updating each link, update φ by Eq. (5.3). Then when U is updated, the probability of keeping U′ will be the minimum of 1 and

\rho = \exp[S(U) - S(U')] \times \exp\left[ \tfrac{1}{2}(\eta, \eta) - \tfrac{1}{2}\left( M_{U'}^{-1} M_U\, \eta,\; M_{U'}^{-1} M_U\, \eta \right) \right].   (8.7)


This is almost the same as the expression for ρ obtained by Grady's method, except that the pair of operators M_U^† (M_{U′}^†)^{-1} has been commuted with the pair M_{U′}^{-1} M_U.

9. CONCLUSION

Although initially many of the algorithms considered over the past several years seemed to be new departures, actually almost all turn out to be rather closely related. The fact that the same ideas have now turned up a number of times over is evidence, I think, that the space of possible algorithms has finally been explored with some degree of thoroughness. Among the various competing possibilities, the hybrid method and the corrected hybrid method appear to be the most efficient. For lattices up to 8⁴ or so, the corrected hybrid seems to cost no more machine time than does the algorithm without correction, if the uncorrected method is run at small enough Δτ to permit a reliable extrapolation to Δτ of 0. Since the corrected hybrid does not entail the nuisance and uncertainty of extrapolation, however, it is probably the overall best choice. For sufficiently large lattices the hybrid method without correction may become faster than the corrected algorithm, but it is possible that this gain in speed will be absorbed by time lost in extrapolation. Even if the corrected method is somewhat slower on very large lattices, the cost may be justified by the absence of extrapolation uncertainties in numbers produced by this algorithm.

REFERENCES

1. K. Wilson, Phys. Rev. D10 (1974) 2445.
2. D. Weingarten and D. Petcher, Phys. Lett. 99B (1981) 333.
3. J. Polonyi and H. Wyld, Phys. Rev. Lett. 51 (1983) 2257.
4. G. G. Batrouni, G. R. Katz, A. S. Kronfeld, G. P. Lepage, B. Svetitsky and K. G. Wilson, Phys. Rev. D32 (1985) 2736.
5. A. Ukawa and M. Fukugita, Phys. Rev. Lett. 55 (1985) 1854.
6. M. Creutz and R. Gavai, Nucl. Phys. B280 (1987) 181.
7. S. Duane, Nucl. Phys. B257 (1985) 652.
8. H.-Q. Ding and O. Martin, Nucl. Phys. B280 (1987) 497.
9. S. Gottlieb, W. Liu, D. Toussaint, R. L. Renken and R. L. Sugar, Phys. Rev. D35 (1987) 2531.
10. R. T. Scalettar, D. J. Scalapino and R. L. Sugar, Phys. Rev. B34 (1986) 7911.
11. S. Duane, A. D. Kennedy, B. J. Pendleton, and D. Roweth, Phys. Lett. B195 (1987) 216.
12. F. Fucito, E. Marinari, G. Parisi, and C. Rebbi, Nucl. Phys. B180 (1981) 369.


13. D. Zwanziger, Phys. Rev. Lett. 50 (1983) 1886.
14. A. Duncan and M. Furman, Nucl. Phys. B190 (1981) 767.
15. D. J. Scalapino and R. L. Sugar, Phys. Rev. Lett. 46 (1981) 519.
16. I. Barbour, in Lattice Gauge Theory, edited by B. Bunk, K. H. Mütter, and K. Schilling (Plenum Press, New York, 1986).
17. G. Bhanot, U. M. Heller, and I. O. Stamatescu, Phys. Lett. 129B (1983) 440.
18. P. Rossi and D. Zwanziger, Nucl. Phys. B243 (1984) 261.
19. M. Grady, Phys. Rev. D32 (1985) 1496.
20. K. G. Wilson, in New Phenomena in Subnuclear Physics, edited by A. Zichichi (Plenum Press, New York, 1977).
21. D. Callaway and A. Rahman, Phys. Rev. Lett. 49 (1982) 613.
22. M. Creutz, Phys. Rev. D38 (1988) 1228.
23. V. I. Arnold, Usp. Math. Nauk. 18 (1963) 13 [Russian Math. Surveys 18 (1963) 9]; V. I. Arnold and A. Avez, Ergodic Problems of Classical Mechanics (Benjamin, New York, 1968).
24. G. Parisi and Y. S. Wu, Sci. Sin. 24 (1981) 483.
25. S. Duane and J. B. Kogut, Nucl. Phys. B275 (1986) 398.
26. M. Fukugita, Y. Oyanagi and A. Ukawa, Phys. Rev. Lett. 57 (1986) 953; Phys. Rev. D36 (1987) 36.
27. S. Gottlieb, W. Liu, D. Toussaint, R. L. Renken and R. L. Sugar, Phys. Rev. D36 (1987) 3797.
28. S. Gottlieb, W. Liu, D. Toussaint, and R. L. Sugar, Phys. Rev. D35 (1987) 2611.
29. H. Gausterer and S. Sanielevici, Phys. Rev. D38 (1988) 1220.
30. R. Gupta, G. W. Kilcup, and S. R. Sharpe, Phys. Rev. D38 (1988) 1278.
31. D. Weingarten, Nucl. Phys. B257 (1985) 629.
32. M. Creutz, in Lattice Gauge Theory Using Parallel Processors, edited by X. Li, Z. Qiu and H.-C. Ren (Gordon and Breach, New York, 1987).