CHAPTER 6
Modeling Prospects

APPEAL OF THE SIMULATION APPROACH

Simulation is a popular method of performing risk analysis in petroleum exploration, and is widely used by the major oil companies. Indeed, some multinational companies have become so enamored of this particular risk-assessment technique that a standardized simulation program must be run on each and every prospect that company geologists generate throughout the world. The results of the standardized assessments are used to rank the prospects in order of their desirability.

It's easy to understand why simulation is so popular within the oil industry. Simulation has been widely and successfully used by engineers to model reservoir performance, to design offshore platforms, and to evaluate alternatives for production facilities. Corporate economists have used simulation to study the possible effects of changes in the future price of oil, in discount rates, or in investment policies. However, perhaps the most important reason why simulation is so appealing is that it allows geologists and engineers to address a strongly probabilistic problem in a manner that is almost deterministic. We can imagine how we might compute the amount of oil in a prospect if we had definite information about certain specific reservoir properties. If, for example, we knew the thickness, areal extent, and porosity of the reservoir unit, we could calculate the volume of the reservoir rock that is filled with fluids. If in addition we knew the oil saturation, we could
Computing Risk for Oil Prospects — Chapter 6
calculate how much of the pore volume contains oil. Indeed, this is exactly the volumetric calculation method that petroleum engineers use to estimate the initial oil in place in reservoirs that are being developed. However, we do not know any of these constituents of volume prior to the drilling of a prospect and the discovery of oil. We may have some indirect evidence, perhaps about the thickness of a potential reservoir interval, or the height of closure and area of closure on a reflecting horizon from seismic surveys, but most of the critical variables cannot be known in advance of drilling. We may have good ideas about some potential ranges, such as the possible porosities that sandstone reservoir rock might possess, but no definite information about the specific prospect being modeled. In such a circumstance, we cannot calculate a single, specific estimate of volume, but instead we might imagine calculating a very large number of combinations of possible values of the various constituents of volume. In effect, we would be posing a number of "what if..." scenarios, each representing a possible state of the reservoir. The collection of possible outcomes, if sufficiently large, forms a distribution of volumes of oil that the prospect might contain. If the inputs are given in the form of probability distributions which describe the likelihood that the input properties have certain values, the output distribution is also a probability distribution of the likely volume of oil.

The numerical technique named for the famous Monte Carlo casino in Monaco is the basis for most petroleum risk-assessment simulation procedures, and indeed, for simulation methods in many fields of application. Monte Carlo techniques operate on pure chance, drawing samples at random from distributions and combining these to obtain a possible output or realization.
By repeating the process of random selection and combination, a distribution of outcomes is eventually created. Monte Carlo simulation procedures were developed during World War II in connection with research on the atomic bomb (Metropolis and Ulam, 1949). They are widely used by mathematicians and statisticians to solve numerical problems that otherwise would be intractable. In the 1960s, simulation techniques were applied to the analysis of business decisions (Hertz, 1964) and soon found their way into the petroleum industry (Stoian, 1965). Simulation is commonly used by reservoir engineers and facilities designers (Smith, 1968; Walstrom, Mueller and McFarlane, 1967) and plays an important role in resource assessments made by government agencies (Crovelli, 1984; Dolton and others, 1981; Lee and Wang, 1983). Bernard and Akers (1977) describe the computer program used for the United States government's Monte Carlo assessment of offshore prospects. Newendorp's (1975) textbook contains a lengthy and especially readable discussion of Monte Carlo procedures applied to financial evaluation of petroleum prospects.
Textbooks devoted to Monte Carlo procedures include those by Hammersley and Handscomb (1964), Law and Kelton (1982), and Rubinstein (1981). Initially, programs for Monte Carlo analysis had to be custom created for each application, and the iterative procedure strained the capabilities of the available computers. As a consequence, the method was adopted primarily by the major oil companies (McCray, 1969), but the development of very powerful and inexpensive personal computers and general-purpose Monte Carlo software has placed the technique within the reach of any interested explorationist (Davis, 1992).
STEPS IN MONTE CARLO SIMULATION

The process that leads to the formation of an oil reservoir can be regarded as a succession of steps, each of which has an associated uncertainty. For some of these steps, the uncertainty relates to whether a critical event has occurred or not, and hence to the likelihood that a prospect might contain oil. That is, these steps relate to the dry hole risk. For others, the uncertainty relates to the magnitudes of the variables involved, and hence to the amount of oil that may be contained in a prospect. We can consider only a few components in the process of formation, or we can attempt to model nature in great detail.

Many assessment schemes attempt to simulate the possible status of petroleum formation and accumulation at five critical steps. These steps are designated generation (or source), migration, timing, trap, and seal (Fig. 6.1). Generation relates to the actual presence of possible source rocks and their characteristics: volume (usually broken down into areas and thickness of source rocks), organic content, and degree of maturation. Simulation of this phase yields a distribution of the possible volumes of oil that might have been generated from rocks comprising the source beds. Migration is a nebulous concept that deals with movement of oil from the source(s) to the trap. It considers the probability that the source beds and the reservoir unit are connected by carrier beds, and whether intervening faults or other geometric barriers may be present. Simulation usually results in a distribution of the likelihood that migration has successfully occurred. Generation and migration are collectively related to charge, the quantity of oil that might be created and expelled from source beds. Some companies consider it important to model both the volume of charge and the capacity, or volume of a prospect.
A comparison of the two indicates whether a prospect may be only partially filled, filled to capacity, or filled to overflowing with the potential for excess oil to have migrated farther updip into additional traps or to be lost.

[Figure 6.1 diagram: inputs and outputs at each step. Source: TOC, source area, bed thickness, yielding generated volume. Timing and migration: efficiency, timing, yielding % uncertainty that migration occurred. Trap: area, net pay, porosity, So, Sg, free gas, yielding reservoir volume. Seal: seal integrity, yielding % volume loss. Various conversion factors: oil volume, yielding recoverable oil.]

Figure 6.1. Schematic diagram of Monte Carlo simulation of an oil prospect. Source represents generation of a charge of hydrocarbons; migration and timing act to decrease the volume of the charge; trap determines the potential capacity of a prospect; seal reduces the volume retained in the trap. TOC = Total Organic Carbon.

Timing assesses the probability that oil was generated and migrated to the location of the prospect at a time after creation of the trapping
mechanism. This step attempts to take into account the possibility that a prospect is barren because all of the oil potentially in place moved updip prior to formation of a trap.

Trap is the most complicated and definitive of the steps in a simulation, because it relates to the geometry of a specific prospect, and usually the most detailed information has been gathered about its location and configuration. The trap ordinarily is broken down into various constituents of oil-bearing pore volume, such as reservoir thickness, area, percent fill, porosity, and oil saturation. The product of these variables is the volume of oil contained within the reservoir. In some simulation procedures, percent fill is not a specified variable, but is determined by the interplay of source, migration, and timing. Specific schemes may include somewhat different components, such as net-to-gross ratio, but basically these are the same variables that an engineer would use to estimate the amount of oil in a reservoir by the volumetric calculation method.

Seal describes the integrity of a trap, or the extent to which it can contain oil over time. The seal may be perfect, enabling the trap to retain all of the oil that migrates into it until the prospect fills to the spillpoint. Conversely, the seal may be leaky, perhaps fractured or semipermeable, causing oil to be lost from the trap at an uncertain rate.

In some of the more elaborate Monte Carlo simulation schemes, variables that relate to the possible volumes of gas and condensate are incorporated. These include additional properties such as the gas-oil ratio, thickness of the gas cap, and the proportion of condensate. Gas, condensate, and oil require quite different volumetric calculations and have very different economic implications, but all constituents must fit inside the same gross volume of reservoir.
This means that multiphase Monte Carlo modeling schemes must (or should) include negative constraints between the constituents of the reservoir.

In a Monte Carlo simulation, each of the constituents is represented by a probability distribution that describes how likely any range of values is believed to be. The probability distributions may be of standard form, such as normal or lognormal distributions, or they may have empirical forms such as rectangular, triangular, or more complicated shapes. The parameters of the distributions may be chosen based on studies of the geological properties in the area or in analogous areas, or on personal or collective experience. No matter what their source, the distributions represent a codification of the expectations of the appraiser; the probabilities distinguish between values that are considered likely, and those that are considered unlikely or impossible.

The result of a Monte Carlo run is a joint probability: This amount of oil is what the prospect will contain if this is the amount generated and this is the amount that migrates and the trap is so large, and so forth. The joint
probability of several independent events is the product of their individual probabilities, so the Monte Carlo technique involves multiplying all of the probabilities attached to the individual outcomes at each step. A value is drawn randomly from the probability distribution that describes one property and is multiplied by a value randomly drawn from the probability distribution that describes the second property, and so on. Values are drawn more frequently from portions of the distributions that have high associated probabilities. (In fact, in the limit, the proportion of observations drawn from any specific part of the distribution is exactly equal to the probability.) Successive iterations or repetitions are performed, and each time new values are drawn randomly and multiplied together. After several hundred or several thousand iterations, the collection of outcomes (each of which is a possible volume of oil) is analyzed to determine the proportions of outcomes that fall within specific ranges. When assembled into a distribution, either in the form of a histogram or a cumulative curve, the result is a probability distribution of the volume of oil.
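The loop of drawing, multiplying, and tallying described above can be sketched in a few lines of Python. This is only a minimal illustration and is not taken from any particular program: the three input distributions, their parameters, and the function names are hypothetical, and the inputs are assumed to be independent.

```python
import random

def simulate_volumes(n_iter, seed=1):
    """Draw one value per input on each iteration and multiply them."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_iter):
        area = rng.uniform(100.0, 200.0)            # ha (hypothetical range)
        thickness = rng.triangular(2.0, 10.0, 5.0)  # m: low, high, mode (hypothetical)
        oil_fraction = rng.uniform(0.05, 0.15)      # oil-filled fraction of rock (hypothetical)
        outcomes.append(area * thickness * oil_fraction)  # hectare-meters of oil
    return outcomes

def histogram(outcomes, bin_width):
    """Proportion of outcomes in each bin, keyed by the bin's lower edge."""
    counts = {}
    for value in outcomes:
        edge = bin_width * int(value // bin_width)
        counts[edge] = counts.get(edge, 0) + 1
    return {edge: counts[edge] / len(outcomes) for edge in sorted(counts)}

volumes = simulate_volumes(2000)
proportions = histogram(volumes, 25.0)
for edge, proportion in proportions.items():
    print(f"{edge:6.0f} to {edge + 25.0:6.0f} ha-m  {proportion:6.1%}")
```

The printed proportions play the role of the output histogram; with more iterations they settle toward the probabilities implied by the chosen inputs.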
Risked or Unrisked Distributions?

In prospect evaluation, the input probability distributions can be specified in either of two ways, as risked or unrisked distributions. The difference is illustrated in Figure 6.2, which shows the possible distribution of a reservoir property such as thickness of net pay. The risked distribution has two parts, a spike at zero thickness and a skewed part that begins at 1 m, rises to a maximum at 5 m, and then tails off at greater values. The spike represents the probability that the prospect is barren and hence has zero thickness of oil-saturated interval; this is equivalent to the dry hole risk. The remainder of the distribution describes the possible thickness of pay in a producing prospect. Since this is a probability distribution, the total area under the curve must be 1.00 or 100%. The area under the spike (or dry hole risk) is 50%, so the area under the remaining part of the distribution also must be 50%.

In the second form of the distribution, the dry hole risk is not included. Instead, the probability distribution specifies the likely nature of a successful prospect. This is a conditional probability distribution, because it describes the probabilities attached to different thicknesses of net pay, given that the prospect contains oil. The dry hole risk and the uncertainty in the amount of oil contained in the prospect are treated separately. Since an unrisked distribution is also a probability distribution, the area under the curve must also be 1.00 or 100%, but because the spike of the dry hole risk is absent, the distribution is much simpler in form and can be modeled using an equation such as the normal or lognormal.
[Figure 6.2: two histogram panels, (a) and (b); horizontal axis Net Pay, Meters (0 to 34); panel (a) includes a bar labeled Dry Hole Risk.]

Figure 6.2. (a) Risked probability distribution with a spike which corresponds to the probability of a dry hole. (b) Unrisked probability distribution that is conditional upon a discovery having been made.
Monte Carlo simulation can involve either form of distribution, but it is simpler to specify the form of an unrisked curve. For this reason, most risk-simulation procedures treat the dry hole risk separately; they model the conditional probability related to the volume of oil in a prospect, given that the prospect will be a discovery. The final result is then converted into an unconditional probability by multiplying the probabilities of the output distribution by the complement of the dry hole probability. This yields a result identical to that obtained when a simulation is based on a risked distribution.
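The equivalence of the two approaches can be checked numerically. The sketch below is an illustration under stated assumptions: the 50% dry hole risk is the value used in the example of Figure 6.2, but the lognormal form and parameters of the unrisked net pay distribution, and all function names, are hypothetical.

```python
import math
import random

DRY_HOLE_RISK = 0.50          # value used in the example of Figure 6.2
rng = random.Random(7)

def unrisked_net_pay():
    """Net pay (m) given a discovery; hypothetical lognormal parameters."""
    return rng.lognormvariate(math.log(6.0), 0.5)

def risked_net_pay():
    """Net pay (m) drawn from a distribution with a spike at zero."""
    if rng.random() < DRY_HOLE_RISK:
        return 0.0            # dry hole
    return unrisked_net_pay()

def prob_at_least(samples, threshold):
    """Proportion of samples equal to or greater than the threshold."""
    return sum(1 for s in samples if s >= threshold) / len(samples)

n = 50_000
unrisked = [unrisked_net_pay() for _ in range(n)]
risked = [risked_net_pay() for _ in range(n)]

for t in (2.0, 5.0, 10.0):
    converted = (1.0 - DRY_HOLE_RISK) * prob_at_least(unrisked, t)
    direct = prob_at_least(risked, t)
    print(f"P(net pay >= {t:4.1f} m): converted {converted:.3f}, direct {direct:.3f}")
```

The two columns agree to within sampling error, which is the sense in which the converted unrisked result matches a simulation based on a risked distribution.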
Simulating Field Size Distributions in a District of Magyarstan

We can demonstrate a simple application of the Monte Carlo simulation technique using a prospect in southwestern Magyarstan. Exploration began in earnest in this area in the early 1930s, and numerous fields have been discovered in Jurassic rocks since that time. The XV and XVa intervals of the J3 and the XVb interval of the J2 consist of alternating limestones and shales deposited in a shallow epeiric sea. Seven or more major limestones are included, each a component within a transgressive-regressive cycle. The uppermost, regressive limestone of each cycle commonly has a porous, grain-rich reservoir interval near its top. In this area, 141 fields have been discovered, each producing from one or more of these intervals. Producing zones may be 2 to 10 m thick, and several producing zones usually occur in a field.

The reservoirs seem to be combination structural and stratigraphic traps, localized by a complex interplay of limited structural closure (typically 10 m or less) and the local development of porous facies. Structural and stratigraphic components may be interrelated, with thicker and more porous lithologies having originated as exposed and leached marine banks whose topographic expression was further emphasized by compaction of enclosing shales. As a consequence of their mode of formation, the fields have limited areal extent; the largest field in the area covers less than 5 km².

Information on the areal size, location, discovery date, producing interval, and production history of fields in the area can be extracted from the records of the Magyarstan Scientific Research Ministry of Economics of Mineral Resources and Hydrocarbons. From these data, estimates can be made of the ultimate production that will be obtained from fields still in production.
These data provide information on the characteristics of fields that have been discovered in the region, and by extension, characteristics of pools that are undiscovered. Unfortunately, nothing has been systematically collected on average net pay or producing interval thicknesses, porosities, or
oil saturations. Distributions of these properties must be based on general characteristics of carbonate reservoirs (compiled for North America and elsewhere in the world), detailed descriptions of typical fields in the region, and the personal knowledge of local experts. These sources of information can be used to calibrate the simulation of a "typical" carbonate reservoir in this area of Magyarstan. The parameters of the input distributions can be adjusted until the output distribution closely matches the known field size distribution for the region. This provides reassurance that the simulation is reasonable, but is subject to two caveats: First, simulations are not unique; if their differences are mutually compensating, many different combinations of inputs may yield similar outputs. Second, the input distributions required to simulate the population distribution of field sizes in a region are different from the input distributions needed to simulate a probability distribution expressing the uncertainty in size of an individual prospect.

Since this area is a mature petroleum province, we can presume that adequate hydrocarbons were generated in source beds, and that migration occurred with appropriate timing. The only uncertainty is associated with the characteristics of the prospects themselves, and the proportion of oil in the prospects that will be recovered (the latter uncertainty need not be considered if we content ourselves with simulating oil-in-place; however, for comparison with historical records, we must simulate the amount of oil that might be produced). A five-component Monte Carlo simulation as shown in Figure 6.3 is adequate to model a "typical" field. During each iteration of the Monte Carlo process, a value is drawn at random from each of the distributions shown in Figure 6.3.
These randomly selected values are combined to yield a value of ultimate oil production in barrels according to the formula:

Ultimate Production (bbls) = Area (ha) × Thickness (m) × Porosity (%) × Oil Saturation (%) × Recovery Factor (%) × 62,900

The numerical factor of 62,900 converts hectare-meters (1 ha-m = 10,000 m³) to barrels, assuming that porosity, oil saturation, and recovery factor are entered as decimal fractions (that is, the percentages divided by 100) and that stock-tank and reservoir barrels are equivalent.

All input distributions were modeled as truncated continuous probability functions whose parameters are given in Table 6.1. The areas of oil fields in the region are known to follow a highly skewed distribution, and the distribution of thicknesses is thought to be highly skewed as well; both area and thickness were modeled as truncated lognormal distributions. Porosities, oil saturations, and recovery factors may be presumed to vary more or less symmetrically about central values, so these were modeled as truncated normal distributions.
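As a worked instance of the formula, the sketch below evaluates it once at the mean values listed in Table 6.1, with the percentages converted to decimal fractions. The function and constant names are illustrative only; the factor of 6.2898 barrels per cubic meter is the standard conversion from which the rounded 62,900 follows.

```python
M3_PER_HECTARE_METER = 10_000.0
BBL_PER_M3 = 6.2898                 # standard conversion, barrels per cubic meter
BBL_PER_HECTARE_METER = 62_900.0    # rounded factor used in the text

def ultimate_production_bbl(area_ha, thickness_m, porosity,
                            oil_saturation, recovery_factor):
    """Volumetric estimate; the last three arguments are decimal fractions."""
    for name, value in (("porosity", porosity),
                        ("oil_saturation", oil_saturation),
                        ("recovery_factor", recovery_factor)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a fraction between 0 and 1")
    return (area_ha * thickness_m * porosity * oil_saturation
            * recovery_factor * BBL_PER_HECTARE_METER)

# Mean values from Table 6.1: 80 ha, 12 m, 10%, 70%, 30%
example = ultimate_production_bbl(80.0, 12.0, 0.10, 0.70, 0.30)
print(f"Unrounded conversion factor: {M3_PER_HECTARE_METER * BBL_PER_M3:,.0f} bbl per ha-m")
print(f"Production at mean inputs:   {example:,.0f} bbl")
```

A single evaluation at the means gives roughly 1.27 million bbl. This is not the mean of the simulated distribution, since the inputs are skewed and truncated, but it is a quick check on units.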
Table 6.1. Monte Carlo input parameters derived from characteristics of fields in a district in Magyarstan.

Property               Mean   Standard deviation   Lower limit   Upper limit
Area (ha)                80                   96            20          1200
Thickness (m)            12                    4             4            36
Porosity (%)             10                    5             2            30
Oil Saturation (%)       70                   10             0           100
Recovery Factor (%)      30                  7.5            10            50

[Figure 6.3: five histogram panels, one for each input. Horizontal axes: Area, Hectares; Thickness, Meters; Porosity, Percent; Oil Saturation, Percent; Recovery Factor, Percent.]

Figure 6.3. Input probability distributions used in Monte Carlo simulation of ultimate production from Jurassic fields in southern Magyarstan. Parameters of distributions are given in Table 6.1.

The output distribution is shown in Figure 6.4a and may be compared to Figure 6.4b, which shows the observed distribution of estimated ultimate production from the 141 Jurassic fields in the area. The simulation is very similar to the actual distribution, indicating that reasonable choices of distributions and parameters have been made.
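One way to draw from a truncated lognormal distribution such as that used for field area is sketched below. Two assumptions are made that the text does not confirm: the mean and standard deviation in Table 6.1 are taken to be the arithmetic moments of the lognormal before truncation, and truncation is carried out by simple rejection (redrawing any value outside the limits). The function names are hypothetical.

```python
import math
import random
import statistics

def lognormal_params(mean, sd):
    """Log-space (mu, sigma) of a lognormal with given arithmetic mean and sd."""
    sigma_squared = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - 0.5 * sigma_squared, math.sqrt(sigma_squared)

def truncated_draw(draw, lower, upper):
    """Rejection sampling: redraw until the value lies within the limits."""
    while True:
        value = draw()
        if lower <= value <= upper:
            return value

rng = random.Random(42)
mu, sigma = lognormal_params(80.0, 96.0)     # area row of Table 6.1 (assumed moments)
areas = [truncated_draw(lambda: rng.lognormvariate(mu, sigma), 20.0, 1200.0)
         for _ in range(10_000)]

print(f"mu = {mu:.3f}, sigma = {sigma:.3f}")
print(f"median area = {statistics.median(areas):.0f} ha")
print(f"mean area   = {statistics.fmean(areas):.0f} ha")
```

The mean of the draws exceeds the median, reflecting the strong positive skew. Because truncation discards part of the distribution, the moments of the truncated draws differ from the nominal values, which is one reason parameters may need adjusting during history matching.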
Simulating a Specific Prospect in Southern Magyarstan

This exercise in "history matching" reassures us that the simulation model produces acceptable results. The next step is to substitute into the model distributions that describe the characteristics of a particular prospect that we wish to evaluate. We will apply the model to a prospect, shown in Figure 6.5, which was proposed by the international partner in a joint exploration venture. Inputs to the Monte Carlo risk-assessment model are taken from the company's prospect folio.

When applying Monte Carlo methods to the analysis of an individual prospect, it is very important to keep in mind that the input probability distributions describe the range of likely values that properties might assume in that specific prospect. These distributions have nothing directly to do with the distributions we have used to describe the variation in average properties of the collection of fields from the southwestern Magyarstan area, except that we hope the properties of the specific prospect will eventually contribute one new set of values to these distributions. For example, perhaps the prospect can have at most a trap area of 160 ha, and cannot conceivably have an area of less than 120 ha. These values should define the upper and lower extremes of the distribution of area, even though the resulting distribution is quite different from the distribution of field areas in the region. Obviously, if the regional distributions pertained to the individual prospects, then all prospects would be expected to have exactly the same probability distribution!

Another caveat is that the distributions of thickness, porosity, and oil saturation refer to field-wide average values and the uncertainties about the exact magnitudes of these field-wide averages. The uncertainty about
average porosity or oil saturation is not the same as the variation in porosity or oil saturation that may occur between one depth and another in a well, or even the variation in average values that occurs between wells. Similarly, the uncertainty about the possible average thickness of net pay is not necessarily the same as the possible variation in net pay thickness that occurs across the field.

[Figure 6.4: two histogram panels, (a) and (b); horizontal axis Ultimate Production, Millions of bbls.]

Figure 6.4. (a) Simulated ultimate production in barrels for 500 iterations of a Monte Carlo simulation of Jurassic fields in southern Magyarstan. (b) Actual distribution of estimated ultimate productions of 141 fields in southern Magyarstan.

Figure 6.5. Prospect map of an area in southern Magyarstan, as developed during a joint venture. Inputs to Monte Carlo simulation are based on characteristics of this prospect. Grid has 1-km spacing. Contours in meters below sea level.

Table 6.2. Parameters of distributions used to model a specific prospect in southern Magyarstan.

Property               Mean   Standard Deviation   Minimum   Maximum
Area (ha)               164                   40        30       208
Thickness (m)             6                    3         3         9
Porosity (%)             10                    5         2        30
Oil Saturation (%)       70                   10         0       100
Recovery Factor (%)      30                  7.5        10        50

The model parameters selected, after some experimentation, are given in Table 6.2. Area and thickness were modeled using truncated normal distributions, rather than skewed lognormal distributions. As can be seen from
the prospect map in Figure 6.5, a field could not be larger than about 200 ha, or it would already have been encountered by one of the nearby drill holes. Explorationists who developed the prospect saw no reason to presume that the distribution should be asymmetrical, so a normal distribution of area was used.

Nearby productive wells produce from one, two, or at most three zones in the XV and XVa limestones of J3, as several of the transgressive-regressive cycles are missing and no carbonates are present in the J2. Production comes from leached porous zones that typically are about 3 m thick at the tops of the limestones. Therefore, if one productive limestone is encountered, the net pay will be about 3 m; if two are encountered, the net pay will be about 6 m; and if three are encountered, the net pay will be about 9 m. The thickness of the net pay was represented by a normal distribution with a relatively large standard deviation and close cutoffs. Use of a uniform or rectangular distribution would yield similar results.

Since no specific information was available on porosity, oil saturation, or recovery factor, these were modeled using the same parameters as used for the regional simulation. A Monte Carlo simulation of 500 iterations yields the distribution of the ultimate oil production (in bbls) from the prospect, if a discovery is made (Fig. 6.6). The expected (mean) amount is 1,300,000 bbls.
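A prospect simulation of this kind might be sketched as follows, using the parameters of Table 6.2. This is an illustration and not the program used to produce Figure 6.6: the inputs are assumed to be independent, truncation is done by rejection sampling, and the names and random seed are arbitrary, so individual runs will differ in detail from the published histogram.

```python
import random
import statistics

BBL_PER_HECTARE_METER = 62_900.0

# (mean, standard deviation, minimum, maximum) from Table 6.2
PROSPECT = {
    "area_ha": (164.0, 40.0, 30.0, 208.0),
    "thickness_m": (6.0, 3.0, 3.0, 9.0),
    "porosity_pct": (10.0, 5.0, 2.0, 30.0),
    "oil_saturation_pct": (70.0, 10.0, 0.0, 100.0),
    "recovery_pct": (30.0, 7.5, 10.0, 50.0),
}

def truncated_normal(rng, mean, sd, lower, upper):
    """Redraw from a normal distribution until the value is within limits."""
    while True:
        value = rng.gauss(mean, sd)
        if lower <= value <= upper:
            return value

def simulate_prospect(n_iter, seed=0):
    """Unrisked ultimate production (bbl) for each iteration."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_iter):
        d = {name: truncated_normal(rng, *params) for name, params in PROSPECT.items()}
        results.append(d["area_ha"] * d["thickness_m"]
                       * d["porosity_pct"] / 100.0
                       * d["oil_saturation_pct"] / 100.0
                       * d["recovery_pct"] / 100.0
                       * BBL_PER_HECTARE_METER)
    return results

production = simulate_prospect(500)
deciles = statistics.quantiles(production, n=10)
print(f"mean            = {statistics.fmean(production):,.0f} bbl")
print(f"10th percentile = {deciles[0]:,.0f} bbl")
print(f"90th percentile = {deciles[-1]:,.0f} bbl")
```

Under these assumptions the mean should fall in the neighborhood of the expected value quoted in the text, with run-to-run scatter of a few percent at 500 iterations.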
Incorporating Risk in the Simulation

A critical missing component in the simulation to this point stems from the fact that we have modeled a probability distribution that is conditional upon oil being discovered. That is, the distribution we have produced is unrisked. To produce the more useful risked, or unconditional form of the distribution, we must include the dry hole probability in our considerations. In this area, the dry hole probability can be estimated at about 47%, based
on the proportion of wildcats drilled that have been abandoned as dry. The probability of a discovery of some magnitude is then 100% - 47% = 53%. The probabilities of the unrisked distribution are simply multiplied by the probability that a discovery will be made. The area under the probability distribution will not sum to 100%, but rather will sum to 53%; the remaining 47% of the distribution is the probability that the prospect will be dry.

[Figure 6.6: histogram; horizontal axis Ultimate Production, Millions of bbls (0 to 5).]

Figure 6.6. Simulated probability distribution of ultimate production in barrels for a prospect in southern Magyarstan based on 500 iterations of a Monte Carlo simulation. Expected value is 1,300,000 bbls.

Perhaps the most convenient way to show the probability distributions is not as histograms in which the bars represent the probabilities associated with equal intervals of oil volume or monetary worth, but as cumulative probability curves. These are made by graphing the successive percentiles of the probability distribution against barrels of oil produced (Fig. 6.7), producing a plot of the probability that a specified volume or less of oil will be discovered. Sometimes it is more convenient to use the complement of this probability, so the graph expresses the probability of the discovery of a specified volume or more (Fig. 6.8). Because the highest probabilities may be associated with the smallest amounts of oil, it may be useful to plot volume on a logarithmic scale (Fig. 6.9).

Plots of the risked probability of discovery have the same form as unrisked plots, but the curves do not begin at 100%, but rather begin at the complement of the dry hole risk. Figure 6.10 shows a risked distribution on
a logarithmic scale; note that the distribution begins at 53%, the probability of any discovery, regardless of magnitude.

[Figure 6.7: cumulative probability curve; horizontal axis Ultimate Production, Millions of bbls (0 to 6).]

Figure 6.7. Probability that a discovery will contain a specified volume of oil or less, given that a discovery is made.

RISKSTAT contains a simple Monte Carlo simulator that allows you to experiment with the procedure for assessing an individual prospect. The Monte Carlo routine in RISKSTAT uses four input variables: area, thickness, porosity, and oil saturation. (The program assumes that area is in acres and thickness is in feet, so the results of your simulation will be off by a constant if you use hectares and meters. Output can be scaled to metric units by multiplying the number of barrels of oil by 8.1, or roughly increasing them by an order of magnitude.) For each input variable, you will be asked to choose a form for the distribution: normal, lognormal, exponential, or uniform. Depending upon your choices, you will then be asked for the appropriate parameters. RISKSTAT provides the option of simulating "oil-in-place" or "recoverable oil." If you choose to model recoverable oil, you must also specify the form of distribution and parameters for the recovery factor. RISKSTAT will also request that you specify the number of iterations in the
simulation. Depending upon the speed of your computer, time required for Monte Carlo simulations can be lengthy. It may be prudent to try an initial run using only 50 iterations or so to confirm that the proper input parameters have been chosen. When you are satisfied that appropriate forms of distributions and their parameters have been specified, the actual simulations should be run for at least 500 iterations. It is not unusual to specify several thousand iterations when using fast mainframe computers or workstations. The results of simulations can be displayed as histograms using the graphics options of RISKSTAT.

[Figure 6.8: cumulative probability curve; horizontal axis Ultimate Production, Millions of bbls (0 to 6).]

Figure 6.8. Probability that the specified volume of oil, or more, will be discovered in the prospect, given that a discovery is made.
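The effect of the number of iterations on the stability of the result can be examined with a short experiment. The sketch below is illustrative only: a hypothetical lognormal outcome stands in for one iteration of a prospect simulation, and the same run is repeated many times to see how much the simulated mean varies from run to run.

```python
import random
import statistics

def one_outcome(rng):
    """Hypothetical skewed outcome standing in for one iteration's volume."""
    return rng.lognormvariate(0.0, 0.6)

def spread_of_means(n_iter, n_repeats=100, seed=3):
    """Standard deviation of the simulated mean across repeated runs."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_repeats):
        total = sum(one_outcome(rng) for _ in range(n_iter))
        means.append(total / n_iter)
    return statistics.stdev(means)

for n_iter in (50, 500, 5000):
    print(f"{n_iter:5d} iterations: run-to-run spread of mean = {spread_of_means(n_iter):.4f}")
```

The spread shrinks roughly in proportion to the square root of the number of iterations, so a 50-iteration trial is adequate for checking inputs but not for reporting final results.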
[Figure 6.9: cumulative probability curve; horizontal axis Ultimate Production, Millions of bbls, logarithmic scale (0.1 to 10).]

Figure 6.9. Probability that the specified volume of oil, or more, will be discovered in the prospect, given that a discovery is made, plotted on logarithmic scale of volume.
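The cumulative curves of Figures 6.7 through 6.9, and their risked counterpart, can be tabulated directly from a set of simulated outcomes. In the sketch below the 53% chance of discovery comes from the text; the lognormal outcomes are a hypothetical stand-in for the output of a prospect simulation, and the function names are illustrative.

```python
import random

P_DISCOVERY = 0.53   # 100% - 47% dry hole probability, from the text

def prob_or_less(samples, volume):
    """Proportion of outcomes at or below the volume (as in Fig. 6.7)."""
    return sum(1 for s in samples if s <= volume) / len(samples)

def prob_or_more(samples, volume, chance_of_success=1.0):
    """Proportion at or above the volume, optionally scaled by the chance of success."""
    return chance_of_success * sum(1 for s in samples if s >= volume) / len(samples)

rng = random.Random(11)
# Hypothetical unrisked outcomes, in millions of bbls
outcomes = [rng.lognormvariate(0.0, 0.6) for _ in range(5000)]

print("volume   or less   or more   risked, or more")
for volume in (0.1, 0.5, 1.0, 2.0, 5.0):
    print(f"{volume:6.1f}   {prob_or_less(outcomes, volume):7.1%}"
          f"   {prob_or_more(outcomes, volume):7.1%}"
          f"   {prob_or_more(outcomes, volume, P_DISCOVERY):7.1%}")
```

The risked column never exceeds 53%, which corresponds to the starting point of the risked curve.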
SELECTING DISTRIBUTIONS AND SETTING PARAMETERS

There are many numerical distributions, both discrete and continuous, that could be used as models in Monte Carlo simulation. Usually, workers choose a normal distribution to model properties that they believe are more or less symmetrical, and lognormal distributions to model properties that may be skewed with a tail extending to large values. However, no one knows what forms really describe the populations of geological variables, and the normal and lognormal models are chosen primarily for convenience and conformity with accepted usage. For example, some researchers have argued that field sizes should follow a Pareto distribution (Schuenemeyer and Drew, 1983), while others have advocated a log-gamma model (Davis and Chang, 1989). @RISK, a popular Monte Carlo simulation program for personal computers, contains a library of almost 30 different distributions that could be used in modeling a prospect.
[Figure 6.10 plot. Horizontal axis: Ultimate Production, Millions of bbls.]
Figure 6.10. Probability that the specified volume of oil, or more, will be discovered in the prospect, adjusted for the dry hole risk. Volume plotted on a logarithmic scale.

Faced with such a plethora of choices and very little guidance, many users settle for the simplest approach, which is to specify distributions as triangular in form, defined by a "lowest possible" limit, a "most likely" peak value, and a "highest possible" upper limit. Figure 6.11 shows a comparison between a triangular distribution and a normal distribution that have the same means and standard deviations (X̄ = 20, s = 4.08). At first glance, there is little difference between the two, but a triangular distribution may be seriously misleading when used in the evaluation of petroleum prospects.

[Figure 6.11 plots. Panels (a) and (b); horizontal axes: Thickness, Meters (10 to 30).]
Figure 6.11. Comparison between triangular distribution (a) and normal distribution (b) used to describe the thickness of a reservoir interval. Both have means of 20 m and standard deviations of 4.08 m.

The triangular distribution in Figure 6.11a has a lower limit of 10 and an upper limit of 30; no values can be drawn from the distribution that are smaller or larger than these limits. In contrast, the normal distribution shown in Figure 6.11b is theoretically limitless. In 100 random draws from the distribution, the smallest value was −1.82 and the largest was 41.83; even more extreme values would be drawn if the simulation were run for several hundred, a thousand, or more iterations. Of course, these extreme values are very rare; only 5% of the draws exceeded 26.71 in the simulation having 100 iterations. But these very large outcomes that are associated with small probabilities of occurrence play a critical role in risk assessment. It is the chance of discovering a bonanza that often drives exploration. Even if the probabilities of finding giant fields are very small, the potential rewards may be so vast that they strongly influence the worth of prospects.

If triangular or other bounded distributions (such as the uniform distribution) are used, we run the risk of unknowingly truncating the size of field that a prospect might contain. We will not be able to estimate the probability associated with a giant field because our simulation will be incapable of creating such a field, regardless of the number of iterations that are run. In effect, we have decreed that the occurrence of a very large field is not unlikely but, rather, impossible because of our choice of input distributions. If this is truly our intent, well and good. The danger lies in inadvertently creating a simulation which cannot represent the range of possibilities, and failing to realize that the potential economic range of the model is thereby constrained.

In the absence of specific knowledge (or strong belief) about the form of various geological distributions, it seems reasonable to model most of them using normal and lognormal distributions. Some properties must be bounded; for example, it is not possible to have negative thicknesses, or percentage variables such as porosities or oil saturations outside the range of 0% to 100%. Experience in a specific area or play may lead us to believe that the bounds are more restricted. In the simulation of a Magyarstan prospect, information from other fields leads us to believe that recovery factors can be no less than 10% and no greater than 50%.

Having settled on the appropriate form of distribution for a geological property, we are then faced with the problem of specifying its parameters. Most distributions require a measure of the center (the mean, median, or mode) and one or more measures of spread (the standard deviation, range, or upper and lower limits). Geologists seem to have reasonably consistent and reliable ideas about the average or "most typical" values for many geological variables, but a poor grasp of the possible extremes. Psychological experiments have shown that people (including geologists) consistently underestimate the magnitudes associated with rare occurrences (i.e., those in the extreme tails of distributions). As a consequence, there is a tendency to be conservative in estimating the spread of distributions in Monte Carlo simulation.
The result is an unwarranted reduction in probabilities associated with the most extreme outcomes, including the probability of making a very large discovery.

The problem of specifying appropriate parameters is complicated if the normal or lognormal distributions are used, because the standard deviation is one of the required parameters for these distributions. Most geologists have only a vague "feel" for the meaning of the standard deviation, and little or no experience that might guide them in selecting appropriate values. The most appropriate estimates of these parameters are statistics calculated from data collected in the same province or play as the prospect being modeled. This was done, for example, in the simulation of a prospect in Magyarstan. Sample means and standard deviations can be calculated on the properties measured in known fields and used to guide the specification of parameters in a simulation.
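As a minimal sketch of such a calculation, the Python fragment below computes the sample mean and standard deviation of a property measured in known fields. The thickness values and the function names are hypothetical and serve only to show the arithmetic. The sketch also reports the median and half the difference between the 15th and 85th percentiles, a pair of estimates that reacts far less to a single unusual observation.

```python
import statistics


def classical_estimates(values):
    """Sample mean and sample standard deviation."""
    return statistics.mean(values), statistics.stdev(values)


def robust_estimates(values):
    """Median, and half the spread between the 15th and 85th percentiles."""
    # quantiles(n=100) returns the 99 percentile cut points;
    # index 14 is the 15th percentile and index 84 is the 85th.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    p15, p85 = cuts[14], cuts[84]
    return statistics.median(values), (p85 - p15) / 2


# Hypothetical reservoir thicknesses (m) from ten known fields in a play.
thickness = [14, 17, 18, 19, 20, 20, 21, 22, 23, 26]

# The same data with one unusually thick field added.
with_outlier = thickness + [60]

for label, data in (("ten fields", thickness), ("with outlier", with_outlier)):
    mean, sd = classical_estimates(data)
    median, spread = robust_estimates(data)
    print(f"{label:>12}: mean={mean:.2f} sd={sd:.2f} "
          f"median={median:.2f} percentile spread={spread:.2f}")
```

Running the sketch on both data sets shows how strongly one extreme field changes the sample standard deviation compared with the percentile-based spread.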
Sample means and standard deviations may not be entirely reliable because both measures are sensitive to the occurrence of unusual values, especially if calculated from small data sets. In such circumstances, it may be better to estimate the center of a distribution by the median of the regional data, and to approximate the standard deviation by ranking the known observations, determining their 15th and 85th percentiles, and dividing the difference between them by two. This approximation is based on the fact that the interval within one standard deviation of either side of the mean of a normal distribution contains almost 70% (about 68%) of the area under the curve, which is close to the 70% of observations lying between the 15th and 85th percentiles.

Unfortunately, deriving modeling parameters from regional data is only feasible in relatively mature areas where abundant observations are available. In virgin areas, or when modeling a prospect based on a totally new geological concept, these data do not yet exist. And, of course, for many geological variables that are included in some of the more complicated Monte Carlo procedures, direct knowledge is almost never available. Who can really say what a distribution might be like that describes a property such as "adequacy of seal"?

For some geological properties, there are national or worldwide compilations that can provide guidance for specifying realistic parameters. The American Petroleum Institute, for example, issues statistical summaries of the characteristics of oil and gas fields in the United States (American Petroleum Institute, 1967, 1984), and numerous authors have published studies of specific properties (e.g., Maxwell, 1964; Nehring and Van Driest, 1981; Schmoker, Krystinik, and Halley, 1985; Sluijk and Nederlof, 1984).

In Monte Carlo schemes that encompass the complete sequence of oil generation, migration, entrapment, and recovery shown in Figure 6.1, some variables are not actual geological properties. An example is "trap timing," which is supposed to express the chance that a potential trap was formed prior to the migration of hydrocarbons through the location of the trap. The likelihood of a fortuitous coincidence of events is a probability, and is given in percent; its sole effect is to reduce the quantity of oil or gas that is available to be included in the prospect. Usually, "timing" is not given as a single value but as a distribution having a lower limit, a most likely value, and an upper limit. In other words, it is a probability distribution of probabilities! It is extremely difficult to imagine how the parameters of this distribution might rationally be specified, even though they may have a significant effect on the final volume of oil contained in the prospect. Similar comments apply to variables such as "migration efficiency" and "seal quality," which also are expressed as percentages (i.e., probabilities of occurrence or failure).

The foregoing comments might be taken as lightly veiled criticisms of some widely used Monte Carlo simulation procedures. Our remarks are based on the belief that specifying in great detail an extremely uncertain series of events does not make the outcome any less uncertain. If geologists have difficulty assessing the volume of oil that might be contained in a prospect, their task is not made any easier (or the results more precise) if the prospect is broken down into a large number of components whose characteristics are even less well understood. In many areas, the best-known property associated with oil fields is how much oil they contain. Geologists might well do a better job of estimating the distribution of field sizes directly, rather than estimating a large number of secondary attributes that are poorly known, and then multiplying these together to produce a distribution of field sizes.
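The comparison between triangular and normal inputs made earlier in this section (Fig. 6.11) can be checked numerically. The Python sketch below uses the limits of 10, 20, and 30 m quoted for the triangular distribution; the helper name, the random seed, and the number of draws are assumptions of the example rather than settings from the original simulation.

```python
import math
import random
from statistics import NormalDist

LOW, MODE, HIGH = 10.0, 20.0, 30.0  # triangular limits from Figure 6.11a


def triangular_mean_sd(low, mode, high):
    """Mean and standard deviation of a triangular distribution."""
    mean = (low + mode + high) / 3
    variance = (low**2 + mode**2 + high**2
                - low * mode - low * high - mode * high) / 18
    return mean, math.sqrt(variance)


mean, sd = triangular_mean_sd(LOW, MODE, HIGH)
print(f"triangular mean = {mean:.2f}, standard deviation = {sd:.2f}")

# Theoretical chance that the matching normal distribution exceeds
# the upper limit of the triangular distribution.
p_normal_beyond = 1.0 - NormalDist(mean, sd).cdf(HIGH)
print(f"normal: P(thickness > {HIGH:.0f}) = {p_normal_beyond:.4f}")

rng = random.Random(1)
n = 10_000  # illustrative number of iterations
tri = [rng.triangular(LOW, HIGH, MODE) for _ in range(n)]
nor = [rng.gauss(mean, sd) for _ in range(n)]

# Count the draws that fall above the triangular upper limit.
tri_beyond = sum(x > HIGH for x in tri)
nor_beyond = sum(x > HIGH for x in nor)
print(f"draws above {HIGH:.0f}: triangular {tri_beyond}, normal {nor_beyond}")
```

The two inputs agree in mean and standard deviation, yet no number of iterations can produce a triangular draw above the upper limit, whereas the normal input assigns that region a small but nonzero probability.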
ARE GEOLOGIC PROPERTIES INDEPENDENT?

In Monte Carlo simulation of a petroleum prospect, values of geological variables are selected at random from the specified distributions and multiplied together to obtain the distribution of their products. Some of the variables are percentages and some are areas or thicknesses; the end result is a distribution of volumes. When we draw a value of one variable, the number we obtain has no effect on the value we will draw for another variable. That is, the variables are completely independent of one another. Is this a reasonable assumption, and what are the consequences if it is not?

In Chapter 4, we discussed the relationship between field area and field volume. There are similar positive relationships between field areas and reservoir thicknesses (bigger fields tend to have thicker oil columns as well as greater areal extent) and sometimes between other geological variables (in some sandstone reservoirs, thicker intervals tend to be cleaner, and hence have higher porosities; in turn, oil saturations tend to be higher in reservoirs with higher porosities). Productivity may be correlated with geological characteristics (the recovery may be low for tight formations). Typically, these correlations are not especially pronounced, but if they are not considered in simulation, the results may be biased.

Figure 6.12a shows a lognormal distribution that we will use to model field area; the distribution has a mean of 120 ha and a standard deviation of 20 ha. (Note that the distribution is skewed to the right when plotted on an arithmetic scale as shown here; if the distribution were plotted on a log scale it would be symmetrical.) Figure 6.12b is a plot of a normal distribution representing reservoir thickness; it has a mean of 10 m and a standard deviation of 2 m. We can sample randomly from each of these distributions and obtain their products, which will express the gross rock volume of the prospect.
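The sampling just described might be sketched in Python as follows. The sketch assumes that the quoted mean of 120 ha and standard deviation of 20 ha are arithmetic moments of the lognormal distribution, and converts them to the log-scale parameters that the random number generator requires. The function names and the seed are hypothetical, and the random draws will differ from those of any other program.

```python
import math
import random
import statistics


def lognormal_params(mean, sd):
    """Convert an arithmetic mean and standard deviation of a lognormal
    variable into the mean and standard deviation of its logarithm."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - sigma2 / 2.0, math.sqrt(sigma2)


def simulate_volume(n, seed=0):
    """Draw area and thickness independently and return their products."""
    rng = random.Random(seed)
    mu, sigma = lognormal_params(120.0, 20.0)      # area, hectares
    volumes = []
    for _ in range(n):
        area = rng.lognormvariate(mu, sigma)
        thickness = rng.gauss(10.0, 2.0)           # thickness, meters
        volumes.append(area * thickness)           # hectare-meters
    return volumes


volumes = simulate_volume(1000)

# quantiles(n=20) returns 19 cut points; the last is the 95th percentile.
p95 = statistics.quantiles(volumes, n=20)[18]

print(f"mean            = {statistics.mean(volumes):.0f} hectare-meters")
print(f"95th percentile = {p95:.0f} hectare-meters")
print(f"maximum         = {max(volumes):.0f} hectare-meters")
```

Because the two inputs are drawn independently, the mean of the products should lie near 120 x 10 = 1200 hectare-meters, apart from sampling error.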
[Figure 6.12 plots. Panel (a): horizontal axis Hectares (50 to 200). Panel (b): horizontal axis Meters (5 to 15).]
Figure 6.12. Input distributions used in a simulation of reservoir volume. (a) Area in hectares (lognormal with mean = 120 ha, standard deviation = 20 ha). (b) Thickness in meters (normal with mean = 10 m, standard deviation = 2 m).

The distribution shown in Figure 6.13a is the result of 1000 iterations in an ordinary Monte Carlo procedure. The output distribution has a mean of 1198 hectare-meters (or 11.98 million m³). The upper 95th percentile is 1733 hectare-meters and the maximum value calculated in 1000 iterations is 2571 hectare-meters. A simulation based on the same input parameters is shown in Figure 6.13b, but the thickness and area are specified as having a positive correlation of r = 0.80. (This is much higher than the correlations usually seen between reservoir properties.) The mean of the output distribution is somewhat higher, being equal to 1234 hectare-meters. However, the upper 95th percentile is 1985 hectare-meters, or 252 hectare-meters greater than for the ordinary simulation. The maximum value calculated in 1000 iterations was 5215 hectare-meters, more than twice the maximum calculated when the properties were assumed to be independent. (Correlations between variables in Monte Carlo simulation can be induced by a two-stage sampling procedure that orders the observations by their rank. Technical details are given by Iman and Conover (1980). Newendorp (1975) also discusses the problems that may arise in simulation with dependent variables and gives several ad hoc procedures for introducing dependence into a simulation.)

The differences between results from the simulation that assumes the input variables are independent (Fig. 6.13a) and one that does not (Fig. 6.13b) might seem to be minor, but note where the bigger discrepancies appear. They occur in the tails, and particularly the upper tail if one or more of the input distributions are positively skewed (i.e., lognormal).
As we have noted, this low-probability but high-payoff part of the field size distribution plays a critical role in the worth of a prospect. Our simple experiment has shown that ignoring correlations between variables will not dramatically affect the resulting simulation, but it does have the potential to cause critical parts of the final distribution to be underestimated. Unfortunately, almost nothing has been published on possible interdependencies between geological variables (except for the interrelationship between porosity and water saturation), so it is difficult to determine if this might be a significant problem in Monte Carlo simulation of prospects.

[Figure 6.13 plots. Panels (a) and (b); horizontal axes: Volume, Hectare-Meters (1000 to 4000); vertical axes scaled from 0 to 25.]
Figure 6.13. Simulated output distributions of the product area × thickness after 1000 iterations. (a) Thickness and area are considered to be independent. (b) Thickness and area are correlated (r = 0.80).

As another cautionary note, remember that our experiment involved multiplying together only two variables, and in a full-scale simulation we might multiply a dozen or more distributions representing different geological properties. If several of these variables are dependent and are mistakenly treated as if they were independent, the combined effect may be much more severe than what is seen here.

A prudent course of action for anyone who relies on Monte Carlo simulation of prospects would be to experiment and determine the effects, if any, of assuming independence versus nonindependence between the input variables they use. If possible, data should be collected from fields within the same play as the prospects being appraised, and statistical correlations calculated. Considering the enormous investment that many oil companies have made in Monte Carlo prospect evaluation software, it's surprising that so little has been done to verify the assumptions built into the technique.
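An experiment of the kind recommended here can be sketched in a few lines of Python. This is not the rank-ordering procedure of Iman and Conover (1980). Instead, the sketch induces dependence by drawing correlated standard normal scores and transforming them into a lognormal area and a normal thickness, which is a simpler substitute chosen for brevity. The function names, seed, and number of iterations are assumptions, and the parameter rho applies to the underlying normal scores, so the correlation between area and thickness themselves is only approximately 0.80.

```python
import math
import random
import statistics


def simulate(n, rho, seed=0):
    """Return n products of area and thickness whose underlying
    normal scores have correlation rho (rho = 0 gives independence)."""
    rng = random.Random(seed)
    # Log-scale parameters for an area with arithmetic mean 120 ha, sd 20 ha.
    sigma2 = math.log(1.0 + (20.0 / 120.0) ** 2)
    mu, sigma = math.log(120.0) - sigma2 / 2.0, math.sqrt(sigma2)
    volumes = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Mix z1 with a fresh draw so that corr(z1, z2) equals rho.
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        area = math.exp(mu + sigma * z1)        # hectares
        thickness = 10.0 + 2.0 * z2             # meters
        volumes.append(area * thickness)        # hectare-meters
    return volumes


def summarize(volumes):
    """Mean, 95th percentile, and maximum of the simulated volumes."""
    p95 = statistics.quantiles(volumes, n=20)[18]
    return statistics.mean(volumes), p95, max(volumes)


independent = summarize(simulate(20_000, rho=0.0))
correlated = summarize(simulate(20_000, rho=0.8))

for label, (mean, p95, biggest) in (("independent", independent),
                                    ("correlated", correlated)):
    print(f"{label:>11}: mean={mean:.0f}  95th percentile={p95:.0f}  "
          f"maximum={biggest:.0f}")
```

Using the same seed for both runs means that both use the same underlying random draws, so differences in the summaries reflect the induced dependence rather than sampling noise. The same pattern could be repeated with correlations estimated from fields in the play being appraised.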