Journal of Statistical Planning and Inference 85 (2000) 159–169
www.elsevier.com/locate/jspi
Variance reduction by means of deterministic computation: collision estimate

Stefan Heinrich*

Universität Kaiserslautern, Fachbereich Informatik, Postfach 3049, D-67653 Kaiserslautern, Germany

Abstract

We study the collision estimate of Monte Carlo methods for the solution of integral equations. A new variance reduction technique is proposed and analyzed. It consists in the separation of the main part by constructing a neighboring equation based on deterministic numerical methods. © 2000 Elsevier Science B.V. All rights reserved.

MSC: primary 65C05

Keywords: Monte Carlo method; Variance reduction; Collision estimate

1. Introduction

In a recent paper (Heinrich, 1995), a new variance reduction technique was introduced for the Monte Carlo solution of Fredholm integral equations. The idea, based on work in complexity theory (Heinrich and Mathé, 1993; Heinrich, 1994), consists in constructing a new equation sufficiently close to the original one and then applying standard schemes to both equations simultaneously. So the approach is a special case of the separation of the main part (also called control variate) technique. As shown in Heinrich (1995), neighboring equations can be constructed by exploiting the system and the approximate solution of deterministic schemes of solving the equation. The gain in variance reduction can be controlled by the discretization error. Hence, by applying the Monte Carlo method with n samples, the overall error is essentially the deterministic discretization error multiplied by the classical Monte Carlo rate n^{-1/2} (see Heinrich and Mathé, 1993; Heinrich, 1994 for a theoretical foundation of this statement). Considerable improvements are possible this way, as experiments in Heinrich (1995) showed.

* Tel.: +49-631-205-2992; fax: +49-631-205-3270. E-mail address: [email protected] (S. Heinrich).

0378-3758/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved.
PII: S0378-3758(99)00078-6


In Heinrich (1995), exact expressions for the variance of the method, as well as estimates in terms of the proximity of the two equations, were obtained for one of the classical Monte Carlo algorithms – the absorption scheme. As a rule (compare Ermakov, 1971; Mikhailov, 1991a), this is the technically simpler case. The question arises what happens for the other classical method – the collision scheme. This is the theme of the present paper. We present and analyze the new technique for the collision scheme and prove that the variance is dominated by the square of the proximity of the respective kernels and right-hand sides in some function space norms. This means that the results of Heinrich (1995) carry over to the collision estimate. The analysis is different and more complicated, but will be based on the results of Heinrich (1995), and we shall work in a similar framework. Nevertheless, we try to keep the present paper self-contained by recalling all needed notions and results.

Once the variance analysis of Heinrich (1995) is extended to the collision estimate, we can apply the other results of that paper: the Galerkin method can be used to construct a neighboring equation, and the proximity of kernels and right-hand sides can be estimated from certain parameters of that method. We do not repeat this here, but refer to Heinrich (1995) and Heinrich (1996) instead.

General references for Monte Carlo methods are Spanier and Gelbard (1969), Ermakov (1971), Sobol (1973), Ermakov and Mikhailov (1982), Kalos and Whitlock (1986), Ermakov et al. (1989), Mikhailov (1991a), Mikhailov (1991b), Sabelfeld (1991). Predecessors of our approach can be found in Ermakov (1971), Chapter 6.2.5, Spanier (1979), Ermakov and Sipin (1985), Mikhailov (1991b), Section 5.10, Sabelfeld (1991), Chapter 2.2.3, Lafortune and Willems (1994).

2. The algorithm

We consider the Fredholm integral equation of the second kind

    u(x) = ∫_X K(x,y) u(y) dμ(y) + f(x).    (1)

Here X is a non-empty set, endowed with a σ-algebra Σ of subsets and a finite positive, σ-additive measure μ on (X, Σ). If 1 ≤ s < ∞, L_s(X) = L_s(X, Σ, μ) stands for the space of s-integrable functions, and for s = ∞ for the space of essentially bounded functions. We assume that f ∈ L_∞(X) and that K is Σ × Σ measurable and satisfies

    ‖K‖_{L_∞(L_1)} := ess sup_{x∈X} ∫_X |K(x,y)| dμ(y) < ∞.

This is equivalent to saying that the integral operator T_K, defined for g ∈ L_∞(X) by

    (T_K g)(x) = ∫_X K(x,y) g(y) dμ(y),

acts continuously in L_∞(X). Moreover,

    ‖T_K : L_∞(X) → L_∞(X)‖ = ‖K‖_{L_∞(L_1)}.
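As a concrete illustration of this norm condition, the following minimal numerical sketch discretizes ‖K‖_{L_∞(L_1)} on X = [0,1] with Lebesgue measure; the kernel K(x,y) = ½e^{−|x−y|} and all grid parameters are illustrative choices, not taken from this paper.

```python
import numpy as np

# Minimal sketch (illustrative kernel and grid, not from the paper):
# approximate ||K||_{L_inf(L_1)} = ess sup_x int_X |K(x,y)| dmu(y)
# for X = [0,1] with Lebesgue measure, via an n-point midpoint rule.
n = 400
x = (np.arange(n) + 0.5) / n                        # midpoint nodes
w = 1.0 / n                                         # quadrature weight
K = 0.5 * np.exp(-np.abs(x[:, None] - x[None, :]))  # K(x_i, y_j)

# sup over x of the L_1 norm of K(x, .)
norm_K = np.max(np.abs(K).sum(axis=1) * w)

# The discretized operator (T_K g)_i = sum_j K_ij g_j w has
# L_inf -> L_inf norm equal to its maximum absolute row sum,
# which reproduces the identity ||T_K|| = ||K||_{L_inf(L_1)} above.
op_norm = np.max(np.abs(K * w).sum(axis=1))

print(norm_K, op_norm)  # both approx. 0.39 < 1 here
```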


Our aim is to compute the value (u, χ) of a functional χ ∈ L_1(X) at the solution u of (1). Let p_0(x) and p(x,y) be non-negative measurable functions on (X, Σ) and (X × X, Σ × Σ), respectively, satisfying

    ∫_X p_0(x) dμ(x) = 1

and

    ∫_X p(x,y) dμ(y) ≤ 1    (x ∈ X).

We shall assume that

    μ{χ(x) ≠ 0 and p_0(x) = 0} = 0,
    (μ × μ){K(x,y) ≠ 0 and p(x,y) = 0} = 0.

Define φ and k by

    χ(x) = φ(x) p_0(x),    K(x,y) = k(x,y) p(x,y).

We consider an absorbing Markov chain on X with density p_0(x) of the initial distribution and density p(x,y) of the probability of transition from x to y. We assume that the spectral radius of T_p in L_∞(X) is less than 1; hence almost all trajectories of the Markov chain are of finite length. Let ω = (x_0, x_1, ..., x_m) be such a trajectory. The classical collision estimate of the von Neumann–Ulam scheme is defined as

    ξ(k, f, ω) = Σ_{l=0}^{m} φ(x_0) k(x_0,x_1) ··· k(x_{l−1},x_l) f(x_l),    (2)

and it is well known that

    E ξ(k, f, ω) = ((I − T_{kp})^{-1} f, χ) = (u, χ).    (3)
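To make the scheme concrete, here is a minimal simulation sketch of the estimate (2) on a finite state space with counting measure; the dimension, kernel, densities, and sample size are all illustrative choices, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Minimal sketch of the collision estimate (2) on a finite state space
# X = {0, ..., d-1} with counting measure (integrals become sums).
# All concrete numbers below are illustrative, not from the paper.
d = 10
K = rng.random((d, d))
K *= 0.6 / K.sum(axis=1, keepdims=True)   # kernel with ||T_K|| = 0.6 < 1
f = rng.random(d)
chi = rng.random(d)                       # functional chi in (u, chi)

p0 = np.full(d, 1.0 / d)                  # initial density, sums to 1
P = np.full((d, d), 0.9 / d)              # transition density; row sums 0.9,
                                          # i.e. absorption probability 0.1
phi = chi / p0                            # chi = phi * p0
k = K / P                                 # K = k * p

def collision_estimate():
    """One realization of xi(k, f, omega) along an absorbed trajectory."""
    x = rng.choice(d, p=p0)
    weight = phi[x]
    total = weight * f[x]                 # l = 0 term of (2)
    while rng.random() < P[x].sum():      # survive with probability 0.9
        y = rng.choice(d, p=P[x] / P[x].sum())
        weight *= k[x, y]
        total += weight * f[y]
        x = y
    return total

u = np.linalg.solve(np.eye(d) - K, f)     # exact solution of (1)
xi = np.array([collision_estimate() for _ in range(20000)])
print(xi.mean(), chi @ u)                 # empirical mean vs (u, chi), cf. (3)
```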

The approximation to (u, χ) is then obtained by averaging N independent realizations of ξ. Given another pair of functions (h, g) (assumed to be close to (k, f), see below) such that the exact solution v of

    v = T_{hp} v + g

is known, we define a new random variable

    η(k, f, h, g, ω) = (v, χ) + ξ(k, f, ω) − ξ(h, g, ω).

Observe that

    E η(k, f, h, g, ω) = (v, χ) + ((I − T_{kp})^{-1} f, χ) − ((I − T_{hp})^{-1} g, χ) = (u, χ).    (4)


Averaging over N independent trajectories, we approximate (u, χ) by

    (1/N) Σ_{i=1}^{N} η(k, f, h, g, ω_i) = (v, χ) + (1/N) Σ_{i=1}^{N} (ξ(k, f, ω_i) − ξ(h, g, ω_i)).    (5)
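Continuing the sketch above (and reusing its definitions), the following hedged example builds a neighboring pair by crudely rounding K and f; the rounding is merely a stand-in for the deterministic (e.g. Galerkin) construction referred to in the text. Its equation is solved exactly, and the estimator (5) is then averaged.

```python
# Continuing the finite-state sketch above (same rng, d, K, f, chi,
# p0, P, phi, k, xi). The neighboring pair (H, g2) is a crude rounding
# of (K, f); in the paper it would come from a deterministic scheme.
H = np.round(K, 2)                        # kernel close to K
g2 = np.round(f, 2)                       # right-hand side close to f
h = H / P
v = np.linalg.solve(np.eye(d) - H, g2)    # exact solution of v = T_hp v + g

def eta():
    """One realization of the new estimator; both walks share one trajectory."""
    x = rng.choice(d, p=p0)
    wk = wh = phi[x]
    diff = wk * f[x] - wh * g2[x]
    while rng.random() < P[x].sum():
        y = rng.choice(d, p=P[x] / P[x].sum())
        wk *= k[x, y]
        wh *= h[x, y]
        diff += wk * f[y] - wh * g2[y]
        x = y
    return chi @ v + diff

samples = np.array([eta() for _ in range(20000)])
# Same expectation (u, chi) as the plain collision estimate, but the
# variance shrinks with the proximity of (h, g2) to (k, f).
print(samples.mean(), samples.var(), xi.var())
```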

Hence ξ(h, g, ω) serves as the main part (control variate). The quality of the new scheme is determined by the variance of η:

    E((u, χ) − (1/N) Σ_{i=1}^{N} η(k, f, h, g, ω_i))² = Var(η)/N.

In the sequel we shall study this variance. As in Heinrich (1995), we shall assume that φ²p_0 ∈ L_1(X) and that the functions k, h, f, g belong to the following classes: Fix γ > 0, 0 < β < 1, n_0 ∈ N and define K(γ, β, n_0) to be the set of all k ∈ L_∞(X²) with

    ‖k‖_{L_∞(X²)} ≤ γ    and    ‖T^{n_0}_{k²p} : L_∞(X) → L_∞(X)‖ ≤ β.

For δ > 0 we let

    F(δ) = {f ∈ L_∞(X): ‖f‖_{L_∞(X)} ≤ δ}.

For k, h ∈ K(γ, β, n_0) and f, g ∈ F(δ) the collision estimate ξ and the new scheme η are well-defined random variables with finite second moment. (This is well known; compare also Lemma 1 below, which we recall from Heinrich (1995) for the sake of completeness.)

Lemma 1. Let γ > 0, 0 < β < 1, n_0 ∈ N. If k ∈ K(γ, β, n_0), then
(i) I − T_{k²p} is invertible in L_∞(X) and ‖(I − T_{k²p})^{-1} : L_∞(X) → L_∞(X)‖ ≤ σ_0, where σ_0 = (1 − β)^{-1} Σ_{j=0}^{n_0−1} γ^{2j};
(ii) ‖kp‖_{L_∞(L_1)} ≤ ‖k²p‖^{1/2}_{L_∞(L_1)} ≤ γ;
(iii) ‖T^{n_0}_{kp} : L_∞(X) → L_∞(X)‖ ≤ β^{1/2};
(iv) I − T_{kp} is invertible in L_∞(X) and ‖(I − T_{kp})^{-1} : L_∞(X) → L_∞(X)‖ ≤ σ_1, where σ_1 = (1 − β^{1/2})^{-1} Σ_{j=0}^{n_0−1} γ^j.
If k, h ∈ K(γ, β, n_0), then
(v) ‖T^{n_0}_{khp} : L_∞(X) → L_∞(X)‖ ≤ β, and ‖(I − T_{khp})^{-1} : L_∞(X) → L_∞(X)‖ ≤ σ_0.

3. Variance of the new scheme

The first theorem provides a complicated, but exact, expression for the variance of the new estimator η. Later on, we shall derive simpler forms yielding upper bounds for the variance.


Proposition 2. Let γ, δ > 0, 0 < β < 1, n_0 ∈ N. Suppose that k, h ∈ K(γ, β, n_0) and f, g ∈ F(δ). Then the variance of η(k, f, h, g, ω) can be expressed by the following formula:

    Var(η) = ((I − T_{k²p})^{-1}(2f(I − T_{kp})^{-1}f − f²)
             − (I − T_{khp})^{-1}(2f(I − T_{hp})^{-1}g − fg)
             − (I − T_{khp})^{-1}(2g(I − T_{kp})^{-1}f − fg)
             + (I − T_{h²p})^{-1}(2g(I − T_{hp})^{-1}g − g²), φ²p_0)
             − ((I − T_{kp})^{-1}f − (I − T_{hp})^{-1}g, φp_0)².

Proof. We represent the random variable η − (v, χ) by an infinite sum of random variables

    η − (v, χ) = ξ(k, f, ω) − ξ(h, g, ω) = Σ_{n=0}^{∞} θ_n,    (6)

where

    θ_n = φ(x_0)(k(x_0,x_1) ··· k(x_{n−1},x_n) f(x_n) − h(x_0,x_1) ··· h(x_{n−1},x_n) g(x_n))

if n ≤ m(ω) (the length of ω) and θ_n = 0 if n > m(ω). Hence the sum (6) is in fact almost surely finite. For arbitrary m, n ∈ N with n ≥ m we have

    E θ_m θ_n = ∫_{X^{n+1}} φ(x_0)(k(x_0,x_1) ··· k(x_{m−1},x_m) f(x_m) − h(x_0,x_1) ··· h(x_{m−1},x_m) g(x_m))
                × φ(x_0)(k(x_0,x_1) ··· k(x_{n−1},x_n) f(x_n) − h(x_0,x_1) ··· h(x_{n−1},x_n) g(x_n))
                × p_0(x_0) p(x_0,x_1) ··· p(x_{n−1},x_n) dμ(x_0) ··· dμ(x_n)
              = (T^m_{k²p}(f T^{n−m}_{kp} f) − T^m_{khp}(f T^{n−m}_{hp} g) − T^m_{hkp}(g T^{n−m}_{kp} f) + T^m_{h²p}(g T^{n−m}_{hp} g), φ²p_0).    (7)

From (6) we get

    E(Σ_{n=0}^{∞} θ_n)² = Σ_{m,n=0}^{∞} E θ_m θ_n = 2 Σ_{m=0}^{∞} Σ_{n=m}^{∞} E θ_m θ_n − Σ_{m=0}^{∞} E θ_m²,    (8)

where the operations on infinite series are justified, since the series (6) is absolutely convergent in the square mean (i.e. in the norm of L_2 over the probability space of the Markov chain). This follows from (7), the assumptions of the theorem and Lemma 1. Now we combine (7) and (8). We apply the summation of (8) to each summand of (7). For the first one, we obtain

    Σ_{m=0}^{∞} Σ_{n=m}^{∞} T^m_{k²p}(f T^{n−m}_{kp} f) = Σ_{m=0}^{∞} T^m_{k²p}(f Σ_{l=0}^{∞} T^l_{kp} f) = (I − T_{k²p})^{-1}(f(I − T_{kp})^{-1}f)

and

    Σ_{m=0}^{∞} T^m_{k²p}(f²) = (I − T_{k²p})^{-1}(f²).

The remaining terms of (7) can be handled analogously. Hence we get

    E(ξ(k, f, ω) − ξ(h, g, ω))² = ((I − T_{k²p})^{-1}(2f(I − T_{kp})^{-1}f − f²)
                                  − (I − T_{khp})^{-1}(2f(I − T_{hp})^{-1}g − fg)
                                  − (I − T_{khp})^{-1}(2g(I − T_{kp})^{-1}f − fg)
                                  + (I − T_{h²p})^{-1}(2g(I − T_{hp})^{-1}g − g²), φ²p_0),    (9)

which together with (6) and (3) proves the proposition.

Next we recall some notation from Heinrich (1995), which we need in the sequel:

    ρ(k, h) = ‖T_{kp} − T_{hp} : L_∞(X) → L_∞(X)‖ = ess sup_{x∈X} ∫_X |k(x,y) − h(x,y)| p(x,y) dμ(y),

    τ(k, h)² = ‖T_{(k−h)²p} : L_∞(X) → L_∞(X)‖ = ess sup_{x∈X} ∫_X (k(x,y) − h(x,y))² p(x,y) dμ(y).

Hölder's inequality implies ρ(k, h) ≤ τ(k, h).

Theorem 3. Let γ, δ > 0, 0 < β < 1, and n_0 ∈ N. Then there exists a constant c > 0 such that for all k, h ∈ K(γ, β, n_0) and f, g ∈ F(δ)

    Var(η(k, f, h, g, ω)) ≤ c(τ(k, h)² + ‖f − g‖²_{L_∞(X)}).

Proof. We have, according to (4) and (9),

    Var(η) ≤ E(ξ(k, f, ω) − ξ(h, g, ω))²
           ≤ 2E(ξ(k, f, ω) − ξ(h, f, ω))² + 2E(ξ(h, f, ω) − ξ(h, g, ω))²
           = 2(w_1 + w_2, φ²p_0)    (10)

with

    w_1 = (I − T_{k²p})^{-1}(2f(I − T_{kp})^{-1}f − f²)
        − (I − T_{khp})^{-1}(2f(I − T_{hp})^{-1}f − f²)
        − (I − T_{khp})^{-1}(2f(I − T_{kp})^{-1}f − f²)
        + (I − T_{h²p})^{-1}(2f(I − T_{hp})^{-1}f − f²)    (11)


and

    w_2 = (I − T_{h²p})^{-1}(2f(I − T_{hp})^{-1}f − f² − (2f(I − T_{hp})^{-1}g − fg)
          − (2g(I − T_{hp})^{-1}f − fg) + 2g(I − T_{hp})^{-1}g − g²)
        = (I − T_{h²p})^{-1}(2(f − g)(I − T_{hp})^{-1}(f − g) − (f − g)²).    (12)

It is convenient to rewrite w_1 as w_1 = w_{11} + w_{12} with

    w_{11} = ((I − T_{k²p})^{-1} − 2(I − T_{khp})^{-1} + (I − T_{h²p})^{-1})(2f(I − T_{kp})^{-1}f − f²)    (13)

and

    w_{12} = (I − T_{khp})^{-1}(2f(I − T_{kp})^{-1}f − 2f(I − T_{hp})^{-1}f)
           + (I − T_{h²p})^{-1}(2f(I − T_{hp})^{-1}f − 2f(I − T_{kp})^{-1}f)
           = ((I − T_{khp})^{-1} − (I − T_{h²p})^{-1})(2f((I − T_{kp})^{-1} − (I − T_{hp})^{-1})f)
           = (I − T_{khp})^{-1}(T_{khp} − T_{h²p})(I − T_{h²p})^{-1}(2f(I − T_{kp})^{-1}(T_{kp} − T_{hp})(I − T_{hp})^{-1}f)
           = (I − T_{khp})^{-1} T_{(k−h)hp} (I − T_{h²p})^{-1}(2f(I − T_{kp})^{-1} T_{(k−h)p} (I − T_{hp})^{-1}f).    (14)

We need relation (27) of Heinrich (1995):

    ‖T_{(k−h)hp} : L_∞(X) → L_∞(X)‖ = ‖(k − h)hp‖_{L_∞(L_1)} ≤ γτ(k, h).

With the notation of Lemma 1 we get from (12) and (14)

    ‖w_2‖_{L_∞(X)} ≤ σ_0(2σ_1 + 1)‖f − g‖²_{L_∞(X)},

    ‖w_{12}‖_{L_∞(X)} ≤ σ_0 γτ(k, h) σ_0 · 2δσ_1 ρ(k, h) σ_1 δ = 2σ_0²σ_1²δ²γ ρ(k, h)τ(k, h) ≤ 2σ_0²σ_1²δ²γ τ(k, h)².

For the estimate of (13) we recall relations (26) and (28) from Heinrich (1995):

    (I − T_{k²p})^{-1} − 2(I − T_{khp})^{-1} + (I − T_{h²p})^{-1}
      = (I − T_{k²p})^{-1}(T_{(k−h)²p} + T_{(h²−k²)p}(I − T_{h²p})^{-1} T_{(h−k)hp})(I − T_{khp})^{-1}    (15)

and

    ‖T_{(h²−k²)p} : L_∞(X) → L_∞(X)‖ ≤ 2γρ(k, h).


This gives

    ‖w_{11}‖_{L_∞(X)} ≤ σ_0(τ(k, h)² + 2γ²σ_0 ρ(k, h)τ(k, h)) σ_0 ‖2f(I − T_{kp})^{-1}f − f²‖_{L_∞(X)}
                     ≤ σ_0²(τ(k, h)² + 2γ²σ_0 ρ(k, h)τ(k, h))(1 + 2σ_1)δ²
                     ≤ σ_0²δ²(1 + 2γ²σ_0)(1 + 2σ_1)τ(k, h)².

This completes the proof.

Finally, we show that, in analogy to Theorem 5 of Heinrich (1995), L_2-estimates can be obtained for the variance, once slightly stronger assumptions are imposed. We suppose that φ²p_0 ∈ L_∞(X) and

    ess sup_{y∈X} ∫_X p(x, y) dμ(x) ≤ 1,    (16)

which is the case, in particular, if p(x, y) is symmetric (as, e.g., the choice of the transition probability for the radiance equation in Heinrich (1995)). Define for γ > 0, 0 < β, β_1 < 1, n_0, n_1 ∈ N

    K*(γ, β, β_1, n_0, n_1) = {k ∈ K(γ, β, n_0): ‖T^{n_1}_{k²p} : L_1(X) → L_1(X)‖ ≤ β_1}.

Lemma 4. Let γ > 0, 0 < β, β_1 < 1, n_0, n_1 ∈ N. If k, h ∈ K*(γ, β, β_1, n_0, n_1), then
(i) I − T_{k²p} is invertible in L_1(X) and ‖(I − T_{k²p})^{-1} : L_1(X) → L_1(X)‖ ≤ σ_2, with σ_2 = (1 − β_1)^{-1} Σ_{j=0}^{n_1−1} γ^{2j};
(ii) I − T_{kp} is invertible in L_1(X) and ‖(I − T_{kp})^{-1} : L_1(X) → L_1(X)‖ ≤ σ_3, with σ_3 = (1 − β_1^{1/2})^{-1} Σ_{j=0}^{n_1−1} γ^j;
(iii) I − T_{khp} is invertible in L_1(X) and ‖(I − T_{khp})^{-1} : L_1(X) → L_1(X)‖ ≤ σ_2.

Proof. For an integral operator T_K we have

    ‖T_K : L_1(X) → L_1(X)‖ = ‖T_{K*} : L_∞(X) → L_∞(X)‖,

where K*(x, y) = K(y, x). Hence, if k, h ∈ K*(γ, β, β_1, n_0, n_1), then k*, h* ∈ K_{p*}(γ, β_1, n_1), where K_{p*} denotes the class K, based on p* instead of p. Now Lemma 4 is a direct consequence of Lemma 1.

Finally, we denote

    τ_2(k, h)² = ∫_X ∫_X (k(x, y) − h(x, y))² p(x, y) dμ(x) dμ(y).

Theorem 5. Given γ, δ > 0, 0 < β, β_1 < 1, n_0, n_1 ∈ N, there exists a constant c > 0 such that for k, h ∈ K*(γ, β, β_1, n_0, n_1) and f, g ∈ F(δ),

    Var(η(k, f, h, g, ω)) ≤ c(τ_2(k, h)² + ‖f − g‖²_{L_2(X)}).


Proof. Lemmas 1 and 4 yield ‖(I − T_{h²p})^{-1} : L_∞(X) → L_∞(X)‖ ≤ σ_0 and ‖(I − T_{h²p})^{-1} : L_1(X) → L_1(X)‖ ≤ σ_2, and hence, by the Riesz–Thorin interpolation theorem (see Triebel, 1978),

    ‖(I − T_{h²p})^{-1} : L_2(X) → L_2(X)‖ ≤ (σ_0σ_2)^{1/2}.    (17)

Using (10)–(14) of the previous proof, we obtain

    Var(η) ≤ 2(‖w_{11}‖_{L_1(X)} + ‖w_{12}‖_{L_1(X)} + ‖w_2‖_{L_1(X)})‖φ²p_0‖_{L_∞(X)}.

First, we estimate ‖w_{11}‖_{L_1(X)} on the basis of relations (13) and (15). From (17) of the present paper and inequalities (35)–(37) of Heinrich (1995) we get

    ‖T_{(k−h)²p} : L_∞(X) → L_1(X)‖ ≤ τ_2(k, h)²,

    ‖T_{(h²−k²)p}(I − T_{h²p})^{-1} T_{(h−k)hp} : L_∞(X) → L_1(X)‖ ≤ 2γ²(σ_0σ_2)^{1/2} τ_2(k, h)².

This gives

    ‖(I − T_{k²p})^{-1} − 2(I − T_{khp})^{-1} + (I − T_{h²p})^{-1} : L_∞(X) → L_1(X)‖ ≤ σ_0σ_2(2γ²(σ_0σ_2)^{1/2} + 1)τ_2(k, h)².

Moreover, from Lemma 1,

    ‖2f(I − T_{kp})^{-1}f − f²‖_{L_∞(X)} ≤ (2σ_1 + 1)δ².

The last two relations combined with (13) yield

    ‖w_{11}‖_{L_1(X)} ≤ σ_0σ_2δ²(2σ_1 + 1)(2γ²(σ_0σ_2)^{1/2} + 1)τ_2(k, h)².

Next we deal with w_{12}. From (14) we infer

    ‖w_{12}‖_{L_1(X)} ≤ ‖(I − T_{khp})^{-1} T_{(k−h)hp} (I − T_{h²p})^{-1} : L_2(X) → L_1(X)‖
                       × ‖2f(I − T_{kp})^{-1} T_{(k−h)p} (I − T_{hp})^{-1} f‖_{L_2(X)}.    (18)

We shall use (36) and (37) of Heinrich (1995), which give

    ‖T_{(k−h)hp} : L_2(X) → L_1(X)‖ ≤ γτ_2(k, h).

Consequently, the first factor of (18) satisfies

    ‖(I − T_{khp})^{-1} T_{(k−h)hp} (I − T_{h²p})^{-1} : L_2(X) → L_1(X)‖
      ≤ ‖(I − T_{khp})^{-1} : L_1(X) → L_1(X)‖ · ‖T_{(k−h)hp} : L_2(X) → L_1(X)‖ · ‖(I − T_{h²p})^{-1} : L_2(X) → L_2(X)‖
      ≤ γσ_0^{1/2}σ_2^{3/2} τ_2(k, h).


To estimate the second factor of (18), we note that (36) of Heinrich (1995) gives

    ‖T_{(k−h)p} : L_∞(X) → L_2(X)‖ ≤ τ_2(k, h).

Moreover, arguing in the same way as for the derivation of (17), we conclude from Lemmas 1 and 4

    ‖(I − T_{kp})^{-1} : L_2(X) → L_2(X)‖ ≤ (σ_1σ_3)^{1/2}.

Hence,

    ‖2f(I − T_{kp})^{-1} T_{(k−h)p} (I − T_{hp})^{-1} f‖_{L_2(X)}
      ≤ 2δ² ‖(I − T_{kp})^{-1} T_{(k−h)p} (I − T_{hp})^{-1} : L_∞(X) → L_2(X)‖
      ≤ 2δ² ‖(I − T_{kp})^{-1} : L_2(X) → L_2(X)‖ · ‖T_{(k−h)p} : L_∞(X) → L_2(X)‖ · ‖(I − T_{hp})^{-1} : L_∞(X) → L_∞(X)‖
      ≤ 2δ²σ_1^{3/2}σ_3^{1/2} τ_2(k, h).

So we get

    ‖w_{12}‖_{L_1(X)} ≤ 2γδ²σ_0^{1/2}σ_1^{3/2}σ_2^{3/2}σ_3^{1/2} τ_2(k, h)².

Finally, we turn to w_2. According to (12),

    ‖w_2‖_{L_1(X)} ≤ ‖(I − T_{h²p})^{-1} : L_1(X) → L_1(X)‖ · ‖2(f − g)(I − T_{hp})^{-1}(f − g) − (f − g)²‖_{L_1(X)}
                   ≤ σ_2(2‖(I − T_{hp})^{-1} : L_2(X) → L_2(X)‖ ‖f − g‖²_{L_2(X)} + ‖f − g‖²_{L_2(X)})
                   ≤ σ_2(2(σ_1σ_3)^{1/2} + 1)‖f − g‖²_{L_2(X)}.

This completes the proof.

In conclusion, let us mention a few consequences of the results proved above. They are of the same form (up to obvious modifications) as those in Section 3 of Heinrich (1995). It follows that all consequences drawn in Heinrich (1995) hold true also for the collision estimate. In particular, Corollary 4 is valid, which shows that the optimal rate obtained in Heinrich and Mathé (1993) for the absorption estimate is also true for the collision estimate. Moreover, the analysis of Section 4 of Heinrich (1995) carries over: using some approximate deterministic Galerkin solution, one can construct neighboring h and g of increasing precision, resulting in decreasing variance of the new scheme based on the collision estimate. To estimate the variance, one can use Propositions 6 and 7 of Heinrich (1995). Finally, the applications to the radiance equation of computer graphics hold true as well. Details can be found in Heinrich (1995) and Heinrich (1996).
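As a worked summary of the quantitative content (our own reformulation, combining the unbiasedness (4), the averaging in (5), and the variance bound of Theorem 3; c is the constant from that theorem):

```latex
% RMS error of the averaged estimator (5) over N independent trajectories:
\[
  \left( \mathbf{E}\left| (u,\chi) - \frac{1}{N}\sum_{i=1}^{N}
      \eta(k,f,h,g,\omega_i) \right|^2 \right)^{1/2}
  = \left( \frac{\operatorname{Var}(\eta)}{N} \right)^{1/2}
  \le c^{1/2} \left( \tau(k,h)^2 + \|f-g\|_{L_\infty(X)}^2 \right)^{1/2} N^{-1/2}.
\]
% Thus the proximity of the deterministically constructed neighboring
% equation multiplies the classical Monte Carlo rate N^{-1/2}, as announced
% in the introduction.
```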


References

Ermakov, S.M., 1971. The Monte Carlo Method and Related Problems (in Russian). Nauka, Moscow.
Ermakov, S.M., Mikhailov, G.A., 1982. The Theory of Statistical Trials (in Russian). Nauka, Moscow.
Ermakov, S.M., Nekrutkin, W.W., Sipin, A.S., 1989. Random Processes for Classical Equations of Mathematical Physics. Math. Appl. (Soviet Ser.), vol. 34. Kluwer Academic Publishers, Dordrecht.
Ermakov, S.M., Sipin, A.S., 1985. A new scheme of the Monte Carlo method for solving problems of mathematical physics. Sov. Phys. Doklady 30 (11), 935–936.
Heinrich, S., 1994. Random approximation in numerical analysis. In: Bierstedt, K.D., Pietsch, A., Ruess, W.M., Vogt, D. (Eds.), Functional Analysis. Marcel Dekker, New York, pp. 123–171.
Heinrich, S., 1995. Variance reduction for Monte Carlo methods by means of deterministic numerical computation. Monte Carlo Meth. Appl. 1, 251–277.
Heinrich, S., 1996. Using the Galerkin method for variance reduction. In: Second St. Petersburg Workshop on Simulation. Publishing House of St. Petersburg University, pp. 56–59.
Heinrich, S., Mathé, P., 1993. The Monte Carlo complexity of Fredholm integral equations. Math. Comput. 60, 257–278.
Kalos, M.H., Whitlock, P.A., 1986. Monte Carlo Methods, vol. I: Basics. Wiley, New York.
Lafortune, E.P., Willems, Y.D., 1994. The ambient term as a variance reducing technique for Monte Carlo ray tracing. In: Proceedings of the Fifth Eurographics Workshop, Darmstadt.
Mikhailov, G.A., 1991a. Minimization of Computational Costs of Non-Analogue Monte Carlo Methods. World Scientific, Singapore.
Mikhailov, G.A., 1991b. Optimization of Weighted Monte Carlo Methods. Springer, Berlin.
Sabelfeld, K.K., 1991. Monte Carlo Methods in Boundary Value Problems. Springer, Berlin.
Sobol, I.M., 1973. Computational Monte Carlo Methods (in Russian). Nauka, Moscow.
Spanier, J., 1979. A new family of estimators for random walk problems. J. Inst. Math. Appl. 23, 1–31.
Spanier, J., Gelbard, E.M., 1969. Monte Carlo Principles and Neutron Transport Problems. Addison-Wesley, Reading, MA.
Triebel, H., 1978. Interpolation Theory, Function Spaces, Differential Operators. Deutscher Verlag der Wissenschaften, Berlin.