Annals of Nuclear Energy 38 (2011) 410–417
An efficient Monte Carlo–SubSampling approach to first passage problems

M. Marseguerra
Polytechnic of Milan, Via Ponzio 34/3, 20133 Milan, Italy
Article history: Received 25 June 2010; received in revised form 29 September 2010; accepted 30 September 2010; available online 1 November 2010.

Keywords: CTRW; Monte Carlo; Gaussian diffusion; Anomalous diffusion; SubSampling
Abstract

In the realm of the Continuous Time Random Walk (CTRW) and in conjunction with the Monte Carlo (MC) approach, we consider the transport of a chemical or radioactive pollutant in a 3D heterogeneous medium, focusing on the first passage time (FPT), defined as the time required by the walkers representative of the dangerous particles to travel within the medium until crossing a target disk, thus entering another medium which should instead remain clean, such as a water well or an aquifer. The advantage of the MC approach is the possibility of simulating different features of the travel, such as different waiting-time probabilities, space dependent jump lengths, absorption and desorption phenomena, and Galilei invariant and variant velocities driven by external forces. When the computer time required for collecting a suitable number of target crossings is excessive, we propose to hybridize the MC approach with the recent SubSampling computational procedure, usefully applied in the engineering reliability field to compute very small failure probabilities in short computer time. To tackle the FPT problem we iteratively consider groups of a few thousand walkers: in each iteration we select the fraction of them closer to the target, ignore the remaining ones, and then restore the group by creating with the MC technique new walkers even closer to the target. The successive groups have large conditional probabilities which can be estimated one after the other in short time and whose product yields the total probability. In this way the FPT can actually be computed in much shorter times: we report examples of Gaussian and anomalous distributions in which the reduction with respect to a pure MC computation is of orders of magnitude.

© 2010 Elsevier Ltd. All rights reserved.
1. Introduction

The first passage time (FPT), defined as the time when a stochastic process first reaches a given threshold, arises in several research fields, spanning from gambler's ruin problems, to the spread of viruses among computers and people (Lloyd and May, 2001), to detector measurements (Clouvas et al., 2003), to transport in disordered media (Margolin and Berkowitz, 2000). Here we focus on the diffusion of a substance in a 3D heterogeneous absorbing porous medium, an important subject of investigation in environmental research, where the problem of the underground dispersion of radioactive or chemical pollutants has an evident safety significance. It is generally assumed that the substance starts to diffuse out of a source, and the problem is that of determining its distribution in time and space, not only in the vicinity of the source but also far away. In this latter situation the FPT problem concerns the risk that the dangerous material travels far enough to finally reach and contaminate a medium which should instead remain clean, such as a water well or an aquifer. Starting from the 1950s, field experiments of contaminant transport in porous soils have been carried out and analysed mostly with the classical Advection–Dispersion Equation (ADE),
based on Fickian behaviour (e.g. Bear, 1972). Over the last two decades, inadequacies of the ADE approach have been pointed out even at small laboratory scale (e.g. Silliman and Simpson, 1987). Nowadays the problem is best tackled in the realm of the Continuous Time Random Walk (CTRW) (Montroll and Weiss, 1965; Berkowitz and Scher, 1998; Margolin and Berkowitz, 2002; Berkowitz et al., 2002; Dentz and Berkowitz, 2003; Cortis et al., 2004), in which the diffusion is simulated by walkers (the particles) which travel in the medium through a series of jumps of random length separated by random waiting times. After each jump, a walker may die, either by absorption in the environment or by radioactive decay. The CTRW is a powerful technique which can tackle both Fickian and non-Fickian dispersion behaviours, e.g. by considering waiting times obeying Gaussian or power law distributions, respectively. In the realm of the CTRW approach to the underground diffusion of contaminants, the Monte Carlo (MC) simulation methodology (Cashwell and Everett, 1959; Hammersley and Handscomb, 1964; Marseguerra and Zio, 2002) represents a powerful method, widely adopted as a natural tool to obtain reliable and accurate solutions, possibly taking into account specific features actually found in the flow through porous media (Marseguerra and Zoia, 2006), such as different waiting-time probabilities, space dependent jump lengths, absorption and desorption phenomena, and Galilei invariant and variant velocities driven by external forces. However,
the more sophisticated the model, i.e. the more walker fates are possible, the larger the time required to simulate each jump and hence the larger the computing time required to attain statistically good results.

In this paper, we focus on the MC approach to contaminant diffusion in a finite 3D heterogeneous absorbing medium, aiming at estimating the first passage time probability (FPT), i.e. the probability that a contaminant particle will reach a far located target disk. A brute force approach would consist in following a huge number of walkers along their stochastic travels to see how many of them succeed in hitting the target. However, a distant disk can only be reached through a great number of jumps and, since in each jump the walker may be absorbed, the probability that a trajectory succeeds in crossing the disk is extremely low. The problems of interest here are those for which the FPT probability is of the order of 10⁻⁶ or even less. If only one particle in 10⁶ can reach the boundary, the simulation must treat a number of walkers larger by a factor of 10 or more to attain reasonable statistics, and this may require an excessive computer time. To tackle the contaminant diffusion problem with reasonable accuracy and reduced computational burden, e.g. in a series of exploratory researches, we propose to hybridize the MC approach with the recent SubSampling (SS) computational procedure (Au and Beck, 2001), usefully applied in the engineering reliability field to compute very small failure probabilities in short computer time, especially in high dimensional problems. In principle, the SS method consists in introducing some intermediate conditional events having large conditional probabilities, which can be estimated one after the other by MC simulations with small numbers of samples.

The paper is organized as follows: Section 2 summarizes the original SS procedure; Section 3 describes the MC procedure and how it has been joined here to the SS one for dealing with the first passage time problem; Section 4 reports some numerical results for the Gaussian diffusion and also for the quite different anomalous diffusion. Finally, some conclusions are drawn in Section 5.

2. The SubSampling in reliability

In safety applications one is often faced with the problem of assessing the failure probability of highly reliable complex systems whose functioning depends on many uncertain parameters having known distributions. Following Au and Beck (2001), the failure probability is written as
P_F ≡ P(ϑ ∈ F) = ∫_Θ I_F(ϑ) q(ϑ) dϑ
where ϑ = (ϑ1, ϑ2, …, ϑn) ∈ Θ ⊂ Rⁿ is a random vector of system parameters obeying a multidimensional probability density q(ϑ), F is the failure region within the parameter space Θ, and I_F(ϑ) is a characteristic function whose value is 1 if ϑ ∈ F and zero otherwise. A performance model of the system – possibly a computer code – which, given a vector ϑ, states whether the system fails along a mission time, is available. The problem is that of finding the region F which gives rise to an assigned P_F. In the case of a high dimensional space Rⁿ and geometrically complicated failure regions F, the solution by numerical integration represents an awkward problem (Schueller et al., 2004; Zio and Pedroni, 2009). In principle, the MC approach can face the problem, but it is well known that in the case of small probabilities, e.g. P_F < 10⁻⁶, an excessive number of performance computations (at least of the order of 10/P_F) and hence an excessive computer time is required. At this point the MC–SS procedure, i.e. the MC approach joined with the SS idea, comes into play. Here we only sketch this procedure through an example, deferring the full exposition to the seminal paper by Au and Beck (2001).

Consider a power plant whose functioning depends on n stochastic parameters ϑi which, assuming independence, vary
according to known pdf's p(ϑi), so that q(ϑ) = ∏_{i=1}^{n} p(ϑi). The system performance is defined as the value of the maximum of a signal, e.g. temperature (or concentration, stress, …), reached at a test point during a mission time: the plant failure occurs when this temperature exceeds a given threshold Tmax, and the region F is constituted by the corresponding values of the parameters. Since the failure probability is admittedly very small, the region F is also very small. The MC–SS approach introduces a sequence of smaller temperatures T1 < T2 < ⋯ < Tm = Tmax and, correspondingly, a sequence of m nested regions F1 ⊃ F2 ⊃ ⋯ ⊃ Fm ≡ F, the generic Fi containing the ϑ values for which the test-temperature exceeds Ti. Note that a test-temperature larger than Ti may also be larger than Ti+1, Ti+2, …, so that a point in Fi may well belong also to Fi+1, Fi+2, … The key point of the SS relies on the idea of successively computing P(Fi+1) by MC simulations starting from points certainly in Fi (i = 1, 2, …, m − 1), which amounts to dealing with the conditional probabilities P(Fi+1|Fi), viz.
P_F = P(F1) ∏_{i=1}^{m−1} P(Fi+1|Fi)
Now the question: how do we select the sequence Ti and hence the regions Fi? One possibility is to start with a standard MC approach: a reasonably small number N of vectors ϑ are sampled from q(ϑ) and the corresponding N performances – the test-temperatures in the current example – are evaluated. In the first iteration, the vectors giving rise to the highest N1 performances, where N1 is a pre-established fraction p0 of N, are selected: T1 is chosen as the lowest of the N1 temperatures, and therefore all the selected vectors, called ϑ0^(k) (k = 1, …, N1), belong to the F1 region. Then P(F1) ≈ N1/N = p0. Each of the ϑ0^(k) points is the ancestor of a Metropolis–Hastings (MH) chain of N/N1 = 1/p0 new points (ϑ1^(k), ϑ2^(k), …), with the constraint that none of these descendants gives rise to a test-temperature smaller than T1. Specifically, ϑ0^(k) generates ϑ1^(k), which generates ϑ2^(k), which generates ϑ3^(k), and so on. In general N is of the order of 10² or 10³ and p0 of the order of 10⁻¹; clearly, 1/p0 and p0N must be integers. The most important feature of the MH algorithm is that it guarantees that the jth point ϑj^(k) of each chain is (asymptotically) drawn from q(ϑ | ϑ ∈ F1) and not from q(ϑ): the points therefore do not spread over the whole range Θ but instead cluster near the ancestor ϑ0^(k). The overall effect of the MH procedure is to yield a total of N new points (the descendants) in F1 (and possibly also in F2, F3, …). At the second iteration, the test-temperatures corresponding to these N points are computed and the N1 highest ones are again selected (to avoid a proliferation of symbols these points will again be called ϑ0^(k), even if totally different from the previous ones): T2 is chosen as the lowest of the N1 temperatures and the corresponding N1 points constitute the F2 region. The probability of these points being in F2, given that their ancestors were in F1, is the conditional probability P(F2|F1) ≈ N1/N = p0, and their total probability is P(F1)P(F2|F1) ≈ p0². By resorting again to the MH procedure, the best N1 points in F2 generate as many chains, each of 1/p0 descendants, so that a total of N new points, suitable for continuing the iterations, is created in F2. Finally, at the mth iteration, when p0^m ≈ P_F, the required failure probability P_F is reached, and the last {ϑ1^(k), ϑ2^(k), …}, k = 1, …, N1, are the N parameter values which are surely within the region F.
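As an illustration, the following is a minimal Python sketch of the procedure just described, under simplifying assumptions of ours: the parameters are standard normal, the proposal is a plain random-walk Metropolis step (Au and Beck (2001) actually use a component-wise modified Metropolis scheme), and the performance function in the usage line is a toy stand-in, not a plant model from the paper.

```python
import numpy as np

def subset_simulation(perf, n_dim, N=1000, p0=0.1, t_max=1.0, seed=0):
    """Sketch of SS: P_F = P(perf(theta) >= t_max) for theta ~ N(0, I)."""
    rng = np.random.default_rng(seed)
    N1 = int(p0 * N)                          # seeds kept per level
    theta = rng.standard_normal((N, n_dim))   # level 0: plain MC sample
    g = np.apply_along_axis(perf, 1, theta)
    p_f = 1.0
    while True:
        order = np.argsort(g)[::-1]           # sort by decreasing performance
        t_i = g[order[N1 - 1]]                 # level threshold T_i
        if t_i >= t_max:                       # failure region reached:
            return p_f * np.mean(g >= t_max)   # last factor estimated directly
        p_f *= p0                              # P(F_i | F_{i-1}) ~ p0
        seeds = theta[order[:N1]]              # the N1 best points: surely in F_i
        chains, g_chains = [], []
        for s in seeds:                        # each seed grows a chain of 1/p0 points
            x, gx = s, perf(s)
            for _ in range(int(1 / p0)):
                cand = x + 0.5 * rng.standard_normal(n_dim)   # random-walk proposal
                # Metropolis accept/reject w.r.t. the standard-normal density,
                # conditioned on staying above the current threshold T_i
                if rng.random() < min(1.0, np.exp(0.5 * (x @ x - cand @ cand))):
                    g_cand = perf(cand)
                    if g_cand >= t_i:
                        x, gx = cand, g_cand
                chains.append(x)
                g_chains.append(gx)
        theta, g = np.array(chains), np.array(g_chains)

# toy usage: P(max component of a 10-d standard normal vector >= 4), ~3e-4
print(subset_simulation(lambda th: np.max(th), n_dim=10, N=1000, p0=0.1, t_max=4.0))
```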
3. The first passage problem

3.1. The pure Monte Carlo approach

The MC approach to particle transport is based on the simulation of the microscopic behaviour of an ensemble of particles, also called walkers, moving in phase space according to specified
rules, and macroscopically resulting in the behaviour of the ensemble. The adoption of the MC method in transport problems is attributed to Fermi and Ulam during the Manhattan Project, when they were interested in studying the transport of neutrons, and it is still largely employed in nuclear reactor design (e.g. Pozzi and Pazsit, 2006; Kennedy et al., 2007). In the following we adopt the CTRW paradigm to describe the diffusion of contaminant particles in a 3D absorbing porous medium, in the presence of a space dependent advection. The advective field is represented in the form of an external forward velocity μ which at each jump pushes the walkers forward; μ is called the Galilei invariant parameter, since the solution to the diffusion–advection problem is the Galilei-shifted propagator W(x, t) = P(x − μt, t), where P(x, t) is the solution to the diffusion equation. In addition, we also consider a variation of the Galilei invariant model by taking into account the occurrence of a partial sticking of the walkers: a walker may be temporarily trapped in the medium pores and, when it is eventually released, it jumps forward by a quantity ν called the Galilei variant parameter. The trajectory of a particle consists of a succession of jumps of random length performed in random time intervals. In the present study, the first passage time (FPT) analysis (Redner, 2001) consists in estimating: (i) the probability that a walker crosses a far located two-dimensional target where it is recorded, (ii) the distribution of the hit times and (iii) the distribution of the starting points of the last jumps. In addition to the constants ruling the walker dynamics, the input to the MC simulation is the number of walkers to be recorded. The simulation consists in launching a sequence of walkers until the required number of detections is attained. Since most of the walkers are absorbed, each detection is preceded by many unfinished trajectories, and the probability (i) is estimated by the ratio of recorded to launched walkers. For each detection, the target crossing-time and the starting point of the last jump are also recorded. At the end of the simulation, the corresponding histograms approximate the probability density functions (pdf) of the target crossing-times and of the last-jump start locations, respectively. In the case of a dangerous substance which should not reach a specified location, from these two histograms one can estimate the time available for remediation actions and the locations from which the substance starts to reach the target.

3.2. The Monte Carlo-SubSampling approach

We are now in the position of joining the pure MC and the SS procedures with the purpose of obtaining a fast estimate of the quantities related to the first passage problem. The proposed procedure will be described with reference to the Gaussian diffusion of particles in a 3D absorbing medium, in the presence of a space dependent advection which includes both the Galilei invariant and variant schemes (Marseguerra and Zoia, 2008). The source is at the origin of a Cartesian system and the target is a disk of radius D centred at x = Xmax, orthogonal to the x-axis; (ti, xi, yi, zi) are the time and space coordinates at the end of the ith jump of a generic walker. This quadruplet is the analogue of the vector ϑ defined in the reliability SS case, and the performance of the quadruplet is here defined as xi, the walker distance from the origin along the x-axis.
The FPT target is achieved through R ≈ 10² repetitions of a sequence of MC iterations. Consider the generic repetition: in the first iteration N walkers are moved one step out of the origin, and the N1 = p0N farthermost ones along the x-axis, i.e. the N1 most performant, are selected. Let their time and location coordinates be q0^(k) = (t0^(k), v0^(k)) (k = 1, 2, …, N1), where v ≡ (x, y, z). Among them, the minimum and the maximum x-distances from the origin are

X1^min = min_k x0^(k)  and  X1^max = max_k x0^(k)
With respect to the SS reliability case, X1^min is the analogue of T1 and, provided X1^max < Xmax, the interval x > X1^min is taken as the analogue of the failure region F1, here also called F1. The numbers N and p0 are chosen as in the reliability case. To proceed further, we must now increase the number of walkers in F1 from N1 to N. To do this, in the reliability case one resorts to the MH procedure, which guarantees that the added ϑ values are near the N1 selected ones by construction. Here there is no need for such a procedure, since the analogous result is obtained simply by considering the progeny of the selected N1 points, hereafter called ancestors. More specifically, each of the N1 ancestors performs a trajectory of m = 1/p0 jumps and m new walkers, the descendants, are created and labelled with the (t, v) coordinates of the ends of these jumps. Note that the N1m = N walkers so created do not include the N1 ancestors. If a trajectory ends because of an absorption event, the remaining offspring is completed by restarting from the same ancestor. Since the number N of involved walkers must not vary, the absorbed descendant is replaced by one born at the origin, where it will be discarded in the next iteration. Clearly, the probability to be attributed to these N quadruplets in F1 is p0. Now the second iteration: as in the first one, among the N walkers the N1 farthest from the origin are selected. To avoid symbol proliferation the selected quadruplets are again called q0^(k) = (t0^(k), v0^(k)) (k = 1, 2, …, N1), even if they are totally different from the previous ones. Among them, the minimum and maximum x-values (minimum and maximum performances) are

X2^min = min_k x0^(k)  and  X2^max = max_k x0^(k)
where X2^min is the analogue of T2 defined in the reliability case and, provided X2^max < Xmax, the interval x > X2^min is the analogue of the failure region F2. To increase the number of walkers from N1 to N, the selected ancestors generate progenies of new walkers as said above, and the probability to be attributed to the new N walkers in F2 is p0². The procedure continues until the final iteration, when the location Xm^max of the best walker, i.e. the walker with maximum x-value among the N1 selected from the previous iteration, satisfies for the first time Xm^max ≥ Xmax. Now the finite dimension of the target disk must be taken into account. Indeed, in addition to the best walker, other good walkers may also cross the x = Xmax plane: those whose last jump intercepts the target disk are recorded, while the others are ignored. In any case the iteration ends, even in the absence of recorded data. The contribution to the FPT probability of each recorded walker is computed by linearly interpolating between the points (X_{m−1}^max, p0^{m−1}) and (Xm^max, p0^m). The final probability is the average over the R repetitions of all these interpolated probabilities. As a rule of thumb, we found that the average number of recorded walkers over all the R repetitions should be at least 3; if this recipe is not satisfied, it seems practical to vary the input quantities N or p0. Notice that the present application of the SS method (calculating the distribution of the first passage time and of the position of the last jump) is completely different from that pertaining to reliability (finding a failure region F): in other words, the principle described for the reliability analysis is here applied in a very different context. The described method might resemble MC particle splitting, widely used e.g. in neutron transport, in which a particle arriving at a favourable location may generate there another particle with shared weights: the counterpart of the creation of more and more particles with lower and lower weights in better and better locations is here the creation of descendants with lower and lower probabilities in zones closer and closer to the target.
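The following Python sketch condenses one repetition of this scheme. It is illustrative only: the routine jump, its state layout and the final one-point interpolation are simplifying assumptions of ours, not the paper's exact bookkeeping (which interpolates each recorded walker separately).

```python
import numpy as np

def mcss_repetition(jump, N=2000, p0=0.005, x_max=35.0, rng=None):
    """One MC-SS repetition; returns an interpolated FPT-probability estimate.

    jump(state, rng) -> (new_state, absorbed), with state = (t, x, y, z).
    """
    rng = rng or np.random.default_rng()
    N1 = int(p0 * N)                     # ancestors selected per iteration
    m = int(1.0 / p0)                    # descendants per ancestor
    origin = (0.0, 0.0, 0.0, 0.0)
    walkers = []
    for _ in range(N):                   # first iteration: one step out of the origin
        s, absorbed = jump(origin, rng)
        walkers.append(origin if absorbed else s)
    prob, x_prev = 1.0, 0.0
    while True:
        walkers.sort(key=lambda s: s[1], reverse=True)   # rank by x (the performance)
        ancestors = walkers[:N1]
        x_best = ancestors[0][1]
        if x_best >= x_max:              # the best walker crossed the target plane
            # linear interpolation between (x_prev, prob) and (x_best, prob * p0),
            # a simplified one-walker version of the paper's recipe
            w = (x_max - x_prev) / max(x_best - x_prev, 1e-12)
            return prob * (1.0 - w) + prob * p0 * w
        prob *= p0                       # walkers of the next group carry p0^i
        x_prev = x_best
        offspring = []
        for a in ancestors:              # each ancestor generates m descendants
            s = a
            for _ in range(m):
                s, absorbed = jump(s, rng)
                if absorbed:
                    offspring.append(origin)   # reborn at origin, discarded next time
                    s = a                      # remaining offspring restarts from ancestor
                else:
                    offspring.append(s)
        walkers = offspring              # N1 * m = N walkers for the next iteration
```

The final estimate would then be the average over the repetitions, e.g. np.mean([mcss_repetition(jump) for _ in range(250)]).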
4. Numerical results

The adequacy of the proposed MC–SS approach was checked with reference to two largely different diffusion cases, namely the Gaussian and the anomalous one. In both cases the source is placed at the origin of a Cartesian system and the target is a disk of radius D = 10, orthogonal to the x-axis and centred at (Xmax, 0, 0), where Xmax is chosen as large as permitted by simulations lasting several hours of computer time. As soon as a walker exceeds Xmax it is recorded if it has crossed the target disk, and in any case the trajectory ends. In doing so we neglect the probability of the following chain of events: (i) the walker crosses Xmax outside the disk, (ii) it is pushed backward by diffusion, against both the Galilei invariant and variant effects which would instead drive the particle forward, and (iii) in another jump it succeeds in crossing Xmax again, but now within the disk. The jump lengths are ruled by independent normal distributions with zero mean and constant standard deviation σ = (2, 2, 2), and by an advection occurring with Galilei invariant and variant parameters varying exponentially with x from the initial values (μ0x, μ0y, μ0z) and (ν0x, ν0y, ν0z) to half of these values, reached at Xmax. This decrease implies a steady reduction of the jump lengths as the walkers travel towards the target. Moreover, at each jump the walker may be absorbed with probability pabs = 0.25. All the computations reported below were done with home-made Matlab codes run on a Toshiba PC.

4.1. Gaussian diffusion

The target disk is located at Xmax = 35, a value several jump lengths away from the origin, so as to render the FPT probability very low but still computable on the PC. The time intervals ruling the jump lengths are independently sampled from an exponential distribution with parameter τ = 2.5; the initial Galilei parameters were μ0 = ν0 = (0.1, 0.05, 0.05). The expressions ruling the generic ith jump are then:

For the jump time:
ti = ti−1 + Δti,  where Δti = −τ log(1 − RU)    (1a)

For the jump length in the x direction:

xi = xi−1 + Δxi,  where Δxi = σx RG + μx(xi−1) + νx(xi−1)Δti    (1b)

and analogously for the jumps in the y and z directions. RU ∼ U[0, 1) and RG ∼ N(0, 1) are uniform and normal random variates, respectively, and, e.g., μy(x) = μ0y exp(−0.693 x/Xmax).
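For concreteness, Eqs. (1a) and (1b) translate into a few lines of code. Below is a minimal Python sketch of a single x-direction update with the parameter values quoted in this section; the placement of the absorption test within the jump is our assumption.

```python
import numpy as np

rng = np.random.default_rng()
TAU, SIGMA_X, X_MAX, P_ABS = 2.5, 2.0, 35.0, 0.25
MU0_X, NU0_X = 0.1, 0.1      # x components of the initial Galilei parameters

def halved_at_target(v0, x):
    # exponential decay to half of the initial value at x = X_MAX: exp(-0.693) ~ 0.5
    return v0 * np.exp(-0.693 * x / X_MAX)

def one_jump(t, x):
    """One CTRW jump along x in the Gaussian case, per Eqs. (1a)-(1b).

    Returns None when the walker is absorbed (probability P_ABS per jump).
    """
    if rng.random() < P_ABS:
        return None
    dt = -TAU * np.log(1.0 - rng.random())            # Eq. (1a)
    dx = (SIGMA_X * rng.standard_normal()
          + halved_at_target(MU0_X, x)
          + halved_at_target(NU0_X, x) * dt)          # Eq. (1b)
    return t + dt, x + dx
```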
First, we ran the MC code with Ndet = 1000 required detections. The results were as follows: the estimate of the FPT probability was 4.95 × 10⁻⁶, which means that on average only one walker out of about 200,000 was detected, escaping absorption or exit outside the disk; the average number of walker jumps per trajectory was 4.0 over all trajectories¹ (thus the total number of jumps was about 8 × 10⁸), whereas it was 22.4 over the trajectories ending with a detection; the required computer time was 16,863 s ≈ 4.7 h. The normalized histograms representative of the detection times and of the coordinates of the starting point of the last jump are reported as dotted lines in Figs. 1 and 2. Successively, we ran the MC–SS code with R = 250 repetitions.
¹ This value can be interpreted as the mean number of Bernoulli trials up to and including the first success (here the walker absorption) in a negative binomial distribution with success probability pabs: indeed 1/pabs = 4.
[Fig. 1. Gaussian diffusion: Xmax = 35; solid = MC–SS; dotted = MC. Normalized target crossing-time density (abscissa: time).]
Each iteration was done with N = 2000 walkers and N1 = 10 ancestors; the progeny of each ancestor thus consisted of 200 descendants, with p0 = 0.005. It turned out that the average number of iterations per repetition was 3.008 and that the number of detections was 787 (3.15 per repetition), somewhat fewer than for the MC code. The FPT probability was 3.11 × 10⁻⁶ and the computer time was 27.1 s. The probability densities of the target crossing-times and of the starting point of the last jump, as estimated by the normalized histograms, are reported as solid lines in Figs. 1 and 2. More detailed results of the computation are reported in Table 1. In comparison with the results of the pure MC approach – which represent the truth – the following main considerations may be drawn:
– the MC–SS approach estimates the FPT quite correctly: indeed its value is about 37% smaller; here one should consider that, in the case of very small probabilities, what matters is the order of magnitude;
– there is a good agreement between the normalized histograms given in Figs. 1 and 2:
  – the average crossing-time is 55.9, to be compared with 57.74 ± 0.59;
  – the average value of the x coordinate of the last jump starting point is 33.31, to be compared with 33.07 ± 0.04; analogous results were obtained for the y and z coordinates;
– most important, the computer time is about 620 times smaller.

We also ran three other pairs of MC and MC–SS simulations, for Xmax = 20, 25, 30. The FPT probabilities computed by the MC code decreased exponentially and, including the results for Xmax = 35, their ratios to those computed by the MC–SS code were (2.45, 4.05, 4.56, 1.59). Concerning the computer times, they increased exponentially in the MC case, while they were always of a few tens of seconds in the MC–SS case. The exponential behaviours of the probabilities and computer times of the MC approach allowed an analytical extrapolation to Xmax = 40, which gave an FPT probability of 9.00 × 10⁻⁷ and a computer time of 9.24 × 10⁴ s (≈26 h). The MC–SS computation was instead actually feasible and yielded a probability 8% higher, in 35.33 s. The fact that in the tested cases the orders of magnitude of the FPT probabilities can be correctly estimated in very short computer times by the MC–SS approach supports its adoption in place of the exact MC one when very small FPT probabilities must be estimated.
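The exponential extrapolation to Xmax = 40 can be reproduced with a least-squares fit of log P versus Xmax on the four computed MC values of Table 1; a minimal sketch using numpy:

```python
import numpy as np

x = np.array([20.0, 25.0, 30.0, 35.0])                 # target distances Xmax
p = np.array([8.90e-4, 15.84e-5, 29.49e-6, 4.95e-6])   # MC FPT probabilities (Table 1)

slope, intercept = np.polyfit(x, np.log(p), 1)         # log-linear (exponential) model
print(np.exp(intercept + slope * 40.0))                # ~9e-7, the value quoted above
```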
[Fig. 2. Gaussian diffusion: Xmax = 35; solid = MC–SS; dotted = MC. Normalized densities of the x, y and z coordinates of the starting point of the last jump.]
Table 1
Gaussian diffusion.

| Xmax | FPT prob. (MC) | FPT prob. (MC–SS) | Computer time, s (MC) | Computer time, s (MC–SS) | Time/walker, μs (MC) | Time/walker, μs (MC–SS) | Iter/repet |
|------|----------------|-------------------|-----------------------|--------------------------|----------------------|-------------------------|------------|
| 20 | 8.90 × 10⁻⁴ | 3.63 × 10⁻⁴ | 93.3 | 14.1 | 83.02 | 14.10 | 2.008 |
| 25 | 15.84 × 10⁻⁵ | 3.91 × 10⁻⁵ | 524.8 | 19.16 | 83.07 | 15.96 | 2.400 |
| 30 | 29.49 × 10⁻⁶ | 6.46 × 10⁻⁶ | 2800.8 | 25.08 | 82.59 | 17.56 | 2.856 |
| 35 | 4.95 × 10⁻⁶ | 3.11 × 10⁻⁶ | 16863.7 | 27.05 | 83.4 | 17.98 | 3.008 |
| 40 | 9.0 × 10⁻⁷ ᵃ | 9.73 × 10⁻⁷ | 92400 ᵃ | 35.33 | – | 18.65 | 3.105 |

ᵃ Exponential extrapolation from preceding values.
[Fig. 3. Anomalous diffusion: cumulative distribution function (top) and probability density function (bottom) of the jump times, on logarithmic axes. The parameters are α = 0.5 and τ = 10⁻³.]
4.2. Anomalous diffusion

The walker trajectories are ruled as in the Gaussian diffusion, except for the jump times, which are now sampled from a power law distribution with cumulative distribution function

W(t; τ, α) = 1 − (τ/t)^α,  defined for t ≥ τ

and corresponding pdf

w(t; τ, α) = α τ^α / t^(1+α),  defined for t ≥ τ

This distribution is characterized by an infinite mean, so that both the jump times and the jump lengths have infinite means, and it is impossible to set a value of Xmax far from the origin in terms of jump lengths as was done in the Gaussian case: ancestors and descendants easily spread over all of time and space, and the events of interest are highly dispersed. The parameters of the power law distribution are here set to α = 0.5 and τ = 10⁻³; a plot of the resulting distributions is shown in Fig. 3. The generic jump is still ruled by Eq. (1), in which the random jump times are

Δt = τ(1 − RU)^(−1/α),  RU ∼ U[0, 1)

The initial Galilei parameters were μ0 = (0.1, 0.1, 0.1) and ν0 = (1, 0.5, 0.5) × 10⁻⁵, the latter chosen very small since it multiplies the sampled times. The assumed power law pdf, after a large initial maximum, decreases very rapidly with a fat tail. While most of the sampled Δt are small (on average, half of the sampled Δt values are smaller than or equal to 4 × 10⁻³), a few of them are very large and make the corresponding Δx large. Thus, while the majority of the walkers remain close to the origin, some of them succeed in crossing, even in a single jump, a target almost unreachable in the Gaussian case.
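The Δt formula is the standard inverse-CDF sampler for this Pareto-type law, obtained by setting W(Δt) = RU and solving for Δt. A minimal numerical check of the quoted median (a sketch; seed and sample size are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, tau = 0.5, 1.0e-3

# inverse-CDF sampling: W(dt) = R_U  =>  dt = tau * (1 - R_U)**(-1/alpha)
dt = tau * (1.0 - rng.random(1_000_000)) ** (-1.0 / alpha)

print(np.median(dt))   # ~ tau * 2**(1/alpha) = 4e-3, the value quoted in the text
print(np.mean(dt))     # unstable from run to run: the distribution has infinite mean
```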
First, we ran the MC code with Ndet = 1000 and Xmax = 35. The results were as follows: the estimate of the FPT probability was 2.36 × 10⁻⁶, which means that on average only one walker out of about 420,000 was detected, escaping absorption or exit outside the disk; the average number of walker jumps per trajectory was again 4.0 over all trajectories (thus the total number of jumps was about 1.7 × 10⁹), whereas it was 23.0 over the trajectories ending with a detection; the required computer time was 36,019 s ≈ 10 h. The normalized histograms representative of the detection times and of the coordinates of the starting point of the last jump are reported as dotted lines in Figs. 4 and 5.

Successively, we ran the MC–SS code with the same parameters (R = 250, N = 2000 and N1 = 10) as in the Gaussian case. It turned out that the average number of iterations per repetition was 3.040 and that the number of detections was 1016, almost equal to that of the MC code. With respect to the MC approach, the FPT probability was 2.61 × 10⁻⁶, that is 10% higher, and the computer time was 27.25 s, about three orders of magnitude lower.

[Fig. 4. Anomalous diffusion: Xmax = 35; solid = MC–SS; dotted = MC. Normalized density of the crossing-time. The plotted data refer to 841 points out of 1016.]
[Fig. 5. Anomalous diffusion: Xmax = 35; solid = MC–SS; dotted = MC. Normalized densities of the x, y and z coordinates of the starting point of the last jump.]
Table 2
Anomalous diffusion.

| Xmax | FPT prob. (MC) | FPT prob. (MC–SS) | Computer time, s (MC) | Computer time, s (MC–SS) | Time/walker, μs (MC) | Time/walker, μs (MC–SS) | Iter/repet |
|------|----------------|-------------------|-----------------------|--------------------------|----------------------|-------------------------|------------|
| 20 | 6.00 × 10⁻⁴ | 3.46 × 10⁻⁴ | 143.06 | 14.02 | 85.78 | 13.90 | 2.016 |
| 25 | 9.55 × 10⁻⁵ | 7.13 × 10⁻⁵ | 899.7 | 20.92 | 85.94 | 16.37 | 2.556 |
| 30 | 1.56 × 10⁻⁵ | 1.91 × 10⁻⁵ | 5475.3 | 24.31 | 85.18 | 17.29 | 2.812 |
| 35 | 2.36 × 10⁻⁶ | 2.61 × 10⁻⁶ | 36019.3 | 27.25 | 85.00 | 17.93 | 3.040 |
| 40 | 3.8 × 10⁻⁷ ᵃ | 5.41 × 10⁻⁷ | 2.2 × 10⁵ ᵃ | 33.58 | – | 19.22 | 3.492 |

ᵃ Exponential extrapolation from preceding values.
The normalized histograms of the target crossing-times and of the coordinates of the starting point of the last jump are reported as solid lines in Figs. 4 and 5. Note that, as a consequence of the assumed power law, the crossing-times peak close to the origin, while in the Gaussian case they exhibit a bell-shaped distribution centred at much larger times (compare Fig. 4 with Fig. 1). We also ran additional pairs of MC and MC–SS simulations for various Xmax, and the conclusions were quite similar to those of the Gaussian case. More detailed results of the computation are reported in Table 2.

We would like to end this section on the numerical results with two remarks on the computer times required by the pure MC and by the MC–SS approaches. First, what really matters is their ratio: powerful computers would certainly drastically reduce these times, but since the two MC and MC–SS codes share the particle tracking section, we expect that the time ratios would not substantially vary. Second, we considered the possibility that an alternative to the proposed MC–SS method could consist in resorting to parallel computing: this point is ruled by Amdahl's law (Amdahl, 1967; Barney, 2010), which states that the potential speedup depends on the fraction P of code that can be parallelized and on the number N of processors. For example, in the case of P = 90% and N = 10² the speedup is ≈9.2.
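For reference, the quoted figure follows from Amdahl's standard formula for the speedup achievable with N processors:

speedup = 1 / ((1 − P) + P/N) = 1 / (0.10 + 0.90/100) = 1/0.109 ≈ 9.2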
5. Conclusions

The FPT problem of interest here concerns the flow of contaminants in an absorbing medium. Actually, the problem is more general, since it is also of importance in quite different fields, spanning from the gambler's ruin, to the spread of viruses among computers and people, to the spread of diseases. In the realm of the CTRW model, in which the flow is represented by walker jumps, and adopting the Monte Carlo approach, we considered the probability that the contaminant could finally intersect a far located target disk, entering a region which should instead remain clean. The advantage of this approach is that it allows simulating quite easily many different features of the jump dynamics. However, the more sophisticated the model, the larger the time required to simulate each single jump; and the farther the target from the source, the larger the number of jumps which must be simulated. In many instances the computing time required for tracking enough walkers to attain good statistics may thus be excessive for most common computers. To overcome this computer time problem, in this paper we propose to investigate FPT problems characterized by very low probabilities by hybridizing the Monte Carlo method with the recent SubSampling approach, advantageously utilized in engineering reliability problems.

Some examples of Gaussian diffusion in an absorbing 3D medium with walker jumps ruled by space dependent parameters indicate that the MC–SS approach can yield, in very short computer times, good estimates not only of the FPT probability but also of the exit-time and last-jump starting point distributions. Indeed, the computer time increases exponentially with the source–target distance in the pure MC approach, while it remains steadily low in the MC–SS approach. In the most time demanding case examined here, the MC approach yielded an FPT probability of the order of 10⁻⁶ in several hours, while the MC–SS code yielded a value only about 37% lower in a few tens of seconds. Moreover, in both computations the exit-time and last-jump starting point distributions had quite similar shapes.

In order to check how the proposed approach could face other situations, we also performed some calculations assuming an anomalous diffusion ruled by a power law distribution of the jump times. Now the dynamics is completely different, since most of the walkers essentially remain near the origin while some of them succeed in reaching the target even in a single jump. This feature is testified by the crossing-time distributions, peaked near zero in the anomalous case and at relatively large delays for the Gaussian diffusion. When the target was farthest from the source, the two FPT probabilities, of the order of 10⁻⁶, agreed within 10%, while the MC–SS computer time was smaller by more than three orders of magnitude. Moreover, in the two approaches the exit-time and last-jump starting point densities had similar shapes.

We believe that in dealing with very low FPT probabilities, particularly in repeated exploratory researches which require considerable computer time, the proposed MC–SS approach can be beneficial.

Acknowledgements
The author gratefully thanks the anonymous Referee, whose comments and suggestions improved the manuscript.

References

Amdahl, G., 1967. Validity of the single processor approach to achieving large-scale computing capabilities. In: AFIPS Conf. Proc., vol. 30, pp. 483–485.
Au, S.K., Beck, J.L., 2001. Estimation of small failure probabilities in high dimensions by subset simulation. Probab. Eng. Mech. 16 (4), 263–277.
Barney, B., 2010. Introduction to Parallel Computing. Lawrence Livermore National Laboratory, UCRL-MI-133316 (last modified 05/26/2010).
Bear, J., 1972. Dynamics of Fluids in Porous Media. American Elsevier, New York.
Berkowitz, B., Scher, H., 1998. Theory of anomalous chemical transport in random fractured networks. Phys. Rev. E 57, 5858–5869.
Berkowitz, B., Klafter, J., Metzler, R., Scher, H., 2002. Physical pictures of transport in heterogeneous media: advection–dispersion, random walk and fractional derivative formulations. Water Resour. Res. 38 (10), 1191.
Cashwell, E.D., Everett, C.J., 1959. Monte Carlo Method for Random Walk Problems. Pergamon Press.
Clouvas, A., Xanthos, S., Antonopoulos-Domis, M., Silva, J., 2003. Measurements with a Ge detector and Monte Carlo computations of dose rate yields due to cosmic muons. Health Phys. 84 (2), 212–221.
Cortis, A., Gallo, C., Scher, H., Berkowitz, B., 2004. Numerical simulation of non-Fickian transport in geological formations with multiple-scale heterogeneities. Water Resour. Res. 40, W04209.
Dentz, M., Berkowitz, B., 2003. Transport behaviour of a passive solute in continuous time random walks and multirate mass transfer. Water Resour. Res. 39 (5), 1111. doi:10.1029/2001WR001163.
Hammersley, J.M., Handscomb, D.C., 1964. Monte Carlo Methods. Methuen and Co., London.
Kennedy, R., Metzroth, K., Blue, T., Aldemir, T., 2007. Preliminary neutronics design study of the molten salt breeder reactor concept with MCNP5. Trans. Am. Nucl. Soc. 95, 649–650.
Lloyd, A.L., May, R.M., 2001. Epidemiology: how viruses spread among computers and people. Science 292, 1316–1317.
Margolin, G., Berkowitz, B., 2000. Application of continuous time random walks to transport in porous media. J. Phys. Chem. B 104, 3942–3947 (errata: 104, 8762).
Margolin, G., Berkowitz, B., 2002. Spatial behaviour of anomalous transport. Phys. Rev. E 65, 1–11.
Marseguerra, M., Zio, E., 2002. Basics of the Monte Carlo Method with Application to System Reliability. LiLoLe-Verlag GmbH, Hagen, Germany.
Marseguerra, M., Zoia, A., 2006. The Monte Carlo and fractional kinetics approaches to the underground anomalous subdiffusion of contaminants. Ann. Nucl. Energy 33 (3), 223–235.
Marseguerra, M., Zoia, A., 2008. Monte Carlo evaluation of FADE approach to anomalous diffusion. Math. Comput. Simul. 77 (4), 345–357.
Montroll, E.W., Weiss, G.H., 1965. Random walks on lattices II. J. Math. Phys. 6 (2), 167–181.
Pozzi, S.A., Pazsit, I., 2006. Neutron slowing down in a detector with absorption. Nucl. Sci. Eng. 154, 367–373.
Redner, S., 2001. A Guide to First-Passage Processes. Cambridge University Press.
Schueller, G.I., Pradlwarter, H.J., Koutsourelakis, P.S., 2004. A critical appraisal of reliability estimation procedures for high dimensions. Probab. Eng. Mech. 19, 463–474.
Silliman, S.E., Simpson, E.S., 1987. Laboratory evidence of the scale effect in dispersion of solutes in porous media. Water Resour. Res. 23, 1667–1673.
Zio, E., Pedroni, N., 2009. Estimation of the functional failure probability of a thermal-hydraulic passive system by subset simulation. Nucl. Eng. Des. 239 (3), 580–599.