Copyright 1994 Society of Petroleum Engineers
SPE Computer Applications, April 1994

Summary
This paper demonstrates how to incorporate historical data into Monte Carlo simulations, describes how the parameters are distributed, and quantifies dependencies among them. It also shows the effect of ignoring dependency and presents summary statistics on field data.

Introduction
As Cronquist¹ points out, the oil and gas industry has been unwilling to adopt stochastic definitions of reserves. Nevertheless, Monte Carlo simulation methods seem to be gaining acceptance by engineers, geoscientists, and other professionals who want to evaluate prospects or to otherwise analyze problems that involve uncertainty. Among the common applications of Monte Carlo simulation are estimation of recoverable hydrocarbons from a reservoir, forecasting production and revenue streams for a well or a field, evaluation of a waterflood prospect, and comparison of net present values of alternative investments. In each case, the user must prescribe statistical distributions for the input parameters. Selecting these distributions is often the most challenging aspect of the simulation. While experience and fundamental principles should guide us, field data may exist that could suggest both the type of distribution and the parameters necessary to describe it. Finding appropriate data and incorporating them into the model is one focus of this paper.
A second focus is the possibility that two or more of the parameters may depend on one another. For example, in some environments, area and net pay, porosity and permeability, or decline rate and pay thickness may exhibit bivariate dependency. Instead of ignoring the dependency or guessing how to quantify it, we may be able to examine historical data for clues.

What Is Monte Carlo Simulation?
A Monte Carlo simulation begins with a model. To illustrate, we select one form of a volumetric model for oil in place (OIP), N, in terms of area, pay, porosity, water saturation, and FVF:

$$N = 7{,}758\,A h \phi (1 - S_w)/B_o \tag{1}$$

Think of $A$, $h$, $\phi$, $S_w$, and $B_o$ as input parameters and $N$ as the output. Once we specify values for each input parameter, we can calculate a value for the output. Each parameter is viewed as a random variable. A trial consists of selecting one value for each input and calculating the output. A simulation is a succession of hundreds or thousands of repeated trials, during which the output values are stored. Afterward, the output values are grouped into a histogram or a cumulative distribution function.
Monte Carlo simulation is an alternative to both single-point (deterministic) estimation and the scenario approach that presents three cases: worst, most likely, and best.

How Do We Select a Value for Each Input? A Monte Carlo simulation customarily is run with special software, either spreadsheet add-ins or compiled programs. The key step is to choose a value for each input parameter according to a specified distribution. Among distributions commonly used are the familiar (i.e., normal, triangular, lognormal, and uniform) and the less-familiar types (i.e., beta, gamma, Weibull, and Pareto). As we see when we use field data, however, the distribution may take on a customized shape instead of being an exact theoretical distribution. These theoretical distributions and the grouped field data are represented as histograms, probability density functions (e.g., the bell-shaped curve), or cumulative distribution functions. The three types of graphs in Fig. 1 are examples of three common distributions.

Fig. 1: Three common distributions, shown as cumulative distribution functions (first row), probability density functions (second row), and histograms (third row).

Regardless of how we represent input distributions, it is helpful to represent an output distribution with a cumulative distribution function (CDF), which gives the user a means to compare alternatives. That is, two outputs may represent alternative prospects. Their CDFs can be overlaid for easy comparison.
The CDF is also useful to illustrate how Monte Carlo sampling is accomplished (Fig. 2). First, a uniformly distributed random number between 0 and 1 is selected and used to enter the vertical axis of a CDF, which represents cumulative probability. Proceeding to the curve and then down to the horizontal axis, a unique value of the corresponding random variable is determined. Thus, the sampling process requires only the existence of a CDF for the parameter being sampled, which is the key to using any set of field data as a model for an input distribution. We simply construct the CDF for those data, first by grouping them into classes and then by calculating the cumulative relative frequency.

Examples of Simulation Models
Although we concentrate on one model for our example, several applications can be described by the following two general types of simulation models: product and forecast models.
The volumetric model in Eq. 1 represents a class of models in which the output is a product of several input parameters. A simplified form for reserves often used in exploration prospects is given by

$$N = A h R, \tag{2}$$

where $R$ is a recovery factor that includes efficiency, porosity, saturation, and FVFs.
Caldwell and Heather² featured two product models, one for coalbed methane reservoirs,

$$R_c = A h C_g \rho R, \tag{3}$$

and another for naturally fractured reservoirs,

$$R_H = R_v L_H (1 - f_f)(1 - f_{fw})/L_f. \tag{4}$$

The exponential decline curve,

$$q = q_i \exp(-a t), \tag{5}$$

can be used in a Monte Carlo simulation by treating both the initial productivity, $q_i$, and the decline rate, $a$, as random variables. The production forecast appears no longer as a single curve, but as a band of uncertainty (Fig. 3).
After a production forecast is available, an economic forecast can be generated by assigning prices and operating costs. These parameters can be treated as random variables also. Typical output distributions include present value and discounted cash flow.

Monte Carlo Simulation Advantages and Disadvantages. Some of the reasons to use Monte Carlo simulation follow.
1. The results contain maximum information about possible outcomes compared with either the scenario or deterministic approach.
2. The simulation emphasizes the underlying model with its assumptions and helps the user quantify and incorporate historical data, including dependence.
3. The results enable us to answer such questions as (1) "How likely is the most likely outcome?"; (2) "Which alternative is more risky?"; (3) "Does one alternative dominate the other?"; (4) "How many wildcats will have to be drilled to have 90% confidence of at least two successes?"; and (5) "What is the probability of going broke before the second success?"
4. Sensitivity analyses reveal the key parameters and help quantify the value of additional information.
Some of the disadvantages of Monte Carlo simulation are that (1) users need to buy and learn how to use the software, (2) the language of probability and statistics can be a barrier to understanding and explaining results, and (3) the results are only as good as the model and the input assumptions.
The first two disadvantages are simple to dismiss. Inexpensive software is available. The time needed to learn to use the software
and to improve on the fundamentals is a small price to pay for the power of the simulation tool, and some of the learning experience should broaden the user's general analytical skills.
The importance of the selection of the model and its inputs cannot be overemphasized. You should not use an exponential decline curve to model production that has a definite hyperbolic shape. Nor would you assume that reservoir acreage was uniformly distributed.

Sources for Input Distributions
Where do we look for guidance when we select models and make assumptions about input parameters? Three general sources are available: fundamental principles, expert opinion, and historical data. While this paper emphasizes historical data, the usefulness of guiding principles and experts should not be underrated. Indeed, the wise practitioner of risk analysis embraces all three sources.

Fundamental Principles. There are reasons you might expect certain parameters to be lognormally distributed. One key example is field size, which is the product of acreage, net pay, and recovery. A consequence of the Central Limit Theorem in statistics is that products of variables tend to be lognormally distributed. Similarly, sums of variables tend to be normally distributed. Cost engineers who estimate numerous subtotals should observe that their grand totals resemble normal distributions.
It is not mere happenstance that lab reports for core samples plot the log of permeability against porosity. The underlying facts are that permeabilities tend to be lognormally distributed and that there tends to be a positive correlation between permeability and porosity. Much to the chagrin of the well-test devotee, there is some truth in the assumption that porosity can be a predictor of permeability.
The way that water saturation and porosity are calculated imposes a negative correlation between those two parameters. Another argument for this inverse relationship, at least in water-wet rock, is the following. Water saturation is the ratio of water volume to total PV. In an idealized pore space, the water saturation is proportional to the surface area of a sphere (the pore space). As the radius of a sphere increases, its area/volume ratio shrinks.
Incidentally, the lognormal distribution has been studied extensively outside the oil and gas industry. While not for the timid, the treatise by Aitchison and Brown³ offers examples and explanations for the ubiquity of these types of random variables.

Expert Opinion. Monte Carlo simulation at its worst is just another black box. There is no substitute for experience. When used properly, risk analysis is done in groups where engineers and geoscientists collaborate on the description of the prospect being analyzed. They must strive to account for the particular depositional environments, driving mechanisms, heterogeneities, and other factors.
Just as the quantitative method suffers without the voice of experience, expert opinion rings hollow when its consequences are not scrutinized. One of the powerful aspects of risk analysis is the ability to compare alternatives and to examine the consequences of our assumptions. If there is disagreement about the type of distribution or the range of some input parameters, at least the simulation can be run under various assumptions to quantify the differences. Sensitivity analysis is an important partner to Monte Carlo simulation in the overall decision process.

Historical Data. The first place to look for data might be in your own company. Corporate databases have drawn more attention in recent years. There are obvious limitations of quantity and scope. In addition, there may be artificial barriers between groups, making access difficult. It is not uncommon for two or more databases for the same field to exist. For example, there may be a petrophysical database and a production database. It is even possible that these two sources will have conflicting data, resulting from revisions based on additional information or advances in interpretative technology. Nevertheless, this can be a valuable source of data, particularly if the personnel in charge of the databases were to participate in the modeling and analysis.
Commercial database vendors and software vendors who accumulate databases are natural sources of field data. State and federal agencies are other popular sources. On the list are the DOE, the U.S. Minerals Management Service, the Bureau of Economic Geology, the Gas Research Inst., and various state oil and gas commissions.
Technical papers and journal articles often summarize their data or present it only in graphical form, removing the level of detail necessary for use in Monte Carlo simulation. More-detailed topical reports and field studies, however, often preserve adequate detail.

Case Study: Using Historical Data for a Volumetric Model
For our purposes, let us suppose that we assembled the perfect team for the analysis, selected an appropriate model to use, and examined the fundamental principles. We are at a point where we need to select appropriate distributions for each input parameter. To be specific, suppose we are using the volumetric model described above to evaluate a drilling prospect in a play where extensive production data is available.
We have a database⁴ that contains parameters for 26 existing oil reservoirs in the Repetto turbidite sandstone, Geologic Play Code 415, which matches our prospect. We use these data along with data from two similar plays (Geologic Play Codes 414 and 416, Puente turbidite and Repetto/Puente turbidite sandstones, respectively) to generate our distributions and to look for dependence relationships. Together, all three plays comprise 83 reservoirs.
Table 1 contains the data for the Repetto sandstone. We selected 7 of the 61 available database fields (i.e., columns in the database): area, pay, porosity, initial water saturation, permeability, initial FVF, and initial GOR. Our volumetric model uses five of these parameters: all except permeability and initial GOR, which are included because they are commonly thought to be correlated with others in the group. Later, we calculate the correlation matrix for all seven parameters.
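Before any field data are fitted, the trial-and-store procedure described under "What Is Monte Carlo Simulation?" can be sketched for the volumetric model of Eq. 1. The Python sketch below is illustrative only: the triangular input ranges, the names `oil_in_place`, `run_simulation`, and `percentile`, the trial count, and the seed are all assumptions introduced here, not distributions or software used in the paper. Porosity and water saturation are entered as fractions.

```python
import random


def oil_in_place(area_acres, pay_ft, porosity, water_sat, fvf):
    """Eq. 1: OIP in bbl, with porosity and water saturation as fractions."""
    return 7758.0 * area_acres * pay_ft * porosity * (1.0 - water_sat) / fvf


# Hypothetical (low, mode, high) triangular inputs, for illustration only.
INPUTS = {
    "area_acres": (150.0, 400.0, 3000.0),
    "pay_ft": (50.0, 150.0, 500.0),
    "porosity": (0.20, 0.30, 0.38),
    "water_sat": (0.18, 0.30, 0.50),
    "fvf": (1.00, 1.05, 1.43),
}


def run_simulation(n_trials=5000, seed=1994):
    """Repeat independent trials and return the stored outputs, sorted."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_trials):
        # One trial: select one value for each input, then compute the output.
        draw = {
            name: rng.triangular(low, high, mode)
            for name, (low, mode, high) in INPUTS.items()
        }
        outputs.append(oil_in_place(**draw))
    outputs.sort()
    return outputs


def percentile(sorted_values, p):
    """Read the output CDF: value at cumulative fraction p of sorted outputs."""
    index = min(len(sorted_values) - 1, int(p * len(sorted_values)))
    return sorted_values[index]


if __name__ == "__main__":
    oip = run_simulation()
    for p in (0.10, 0.50, 0.90):
        print(f"P{int(p * 100)}: {percentile(oip, p) / 1e6:,.1f} MMbbl")
```

Every input here is sampled independently, which is exactly the assumption the case study goes on to question. Sorting the stored outputs gives the output CDF directly, so two prospects could be compared by overlaying their sorted results.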
Two issues arise. First, how do we describe each parameter in terms of a probability distribution? Second, to what extent do the parameters depend on one another?

Distribution Results: Which Type Is Best? We used the data from 83 reservoirs (Plays 414 through 416) to construct histograms with 10 classes or spreadsheet "bins." Fig. 4 shows the field data (in symbols) and a matching theoretical distribution (in lines). Each parameter is displayed as a "density function," essentially obtained by connecting the top center points of the histograms. The area under the density function between A and B represents the probability that a data point falls between A and B. Thus, the total area under the curve is 1.00.
We used history-matching software (BestFit⁵) to match our data with each of 12 common distributions and to indicate the chi-square "goodness-of-fit" measure, which allows us to select one distribution that fits better than others. The distributions in Fig. 4 are either the best fit or among the top two or three best fits. Table 2 shows how the different distribution types fit one parameter, porosity. The arguments appearing with each distribution type specify the particular function in the class. The normal curve, for example, has a mean of 29.84 and a standard deviation of 5.36, whereas the gamma distribution that fits our data best has a shape parameter (alpha) of 30.03 and a scale parameter (beta) of 1.01.
The lognormal distribution fits both area and pay quite well. Porosity data is slightly skewed left and matched by the normal curve. Water saturation is matched best by a beta distribution but also could be matched reasonably well either by a lognormal or by a normal distribution.
Little has been published about fitting distributions to field data. Triangular and lognormal distributions are widely used in exploration-prospect simulation. Yet, some evidence indicates that

TABLE 1: DATA FROM 26 OIL RESERVOIRS USED TO GENERATE SIMULATION INPUT DISTRIBUTIONS

                                                Initial
A        h      φ     Swi    k       Boi        GOR
(acres)  (ft)   (%)   (%)    (md)    (RB/O)     (scf/O)
  200    172    27    28       790   1.24       420
  250     72    38    30     2,091   1.05       215
  355    388    21    40       300   1.17       800
1,268    125    32    35     1,000   1.04        97
  388    224    20    37       133   1.30       400
  265    250    20    37        70   1.30       550
  445    332    26    25       700   1.16       300
  525    338    29    27       600   1.16       300
  144     95    36    40       137   1.08       160
  365    133    32    25       680   1.04        98
1,200    511    24    31       337   1.05       697
  320     85    28    25       260   1.05       125
3,000    250    36    36     1,100   1.05       200
  445    150    38    35     2,300   1.05       100
1,133    300    23    40        60   1.15       150
1,133    400    32    23       200   1.10       185
1,133    325    26    30       250   1.15       200
  374     91    20    40        58   1.43       860
  355    300    30    50       250   1.24       350
  373    130    28    35       290   1.08       110
1,000     80    33    19       500   1.05       124
  859    123    33    19       500   1.05        15
  270     80    34    18     1,600   1.05       109
  400     50    35    18     1,600   1.05       113
  200     75    30    26     1,000   1.05       200
  180    325    25    37       600   1.00        40
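The CDF construction described earlier (group the data into classes, accumulate the relative frequencies, then enter the curve with a uniform random number) can be sketched with the area column of Table 1. The Python below is a minimal illustration, not the authors' software: the function names `build_cdf` and `sample`, the equal-width classes, and the linear interpolation inside each class are assumptions made here. It uses only the 26 Repetto reservoirs, not the 83 reservoirs behind Fig. 4.

```python
import bisect
import random

# Area (acres) for the 26 reservoirs, transcribed from Table 1.
AREA = [200, 250, 355, 1268, 388, 265, 445, 525, 144, 365, 1200, 320, 3000,
        445, 1133, 1133, 1133, 374, 355, 373, 1000, 859, 270, 400, 200, 180]


def build_cdf(data, n_classes=10):
    """Group data into equal-width classes and return (edges, cum), where
    cum[i] is the cumulative relative frequency at class edge edges[i]."""
    low, high = min(data), max(data)
    width = (high - low) / n_classes
    counts = [0] * n_classes
    for x in data:
        k = min(int((x - low) / width), n_classes - 1)  # maximum goes in last class
        counts[k] += 1
    edges = [low + i * width for i in range(n_classes + 1)]
    cum = [0.0]
    running = 0
    for c in counts:
        running += c
        cum.append(running / len(data))
    return edges, cum


def sample(edges, cum, u):
    """Enter the CDF at cumulative probability u (0 <= u <= 1) and read the
    value off the horizontal axis, interpolating within the class."""
    k = bisect.bisect_left(cum, u)  # first edge whose cumulative frequency >= u
    if k == 0:
        return edges[0]
    frac = (u - cum[k - 1]) / (cum[k] - cum[k - 1])
    return edges[k - 1] + frac * (edges[k] - edges[k - 1])


if __name__ == "__main__":
    edges, cum = build_cdf(AREA)
    rng = random.Random(0)
    draws = [sample(edges, cum, rng.random()) for _ in range(5)]
    print([round(d) for d in draws])
```

Because `bisect_left` always lands on an edge where the cumulative frequency has just increased, empty classes (flat stretches of the CDF) never cause a division by zero. With only 26 points and a long right tail, most draws fall in the first few classes, which is one reason a fitted lognormal may be preferred to the raw histogram.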
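As a rough first look at dependence, the Pearson correlation coefficient, r (see Nomenclature), can be computed for porosity, water saturation, and permeability from Table 1. The sketch below is an illustration under stated assumptions: the names `pearson_r`, `correlation_matrix`, and `COLUMNS` are introduced here, only three of the seven parameters are included, and only the 26 Repetto reservoirs are used, so the values need not match a correlation matrix built from all 83 reservoirs. The log of permeability is included because lab reports customarily plot it against porosity.

```python
import math

# Porosity (%), initial water saturation (%), and permeability (md),
# transcribed row by row from Table 1.
POROSITY = [27, 38, 21, 32, 20, 20, 26, 29, 36, 32, 24, 28, 36,
            38, 23, 32, 26, 20, 30, 28, 33, 33, 34, 35, 30, 25]
WATER_SAT = [28, 30, 40, 35, 37, 37, 25, 27, 40, 25, 31, 25, 36,
             35, 40, 23, 30, 40, 50, 35, 19, 19, 18, 18, 26, 37]
PERM = [790, 2091, 300, 1000, 133, 70, 700, 600, 137, 680, 337, 260, 1100,
        2300, 60, 200, 250, 58, 250, 290, 500, 500, 1600, 1600, 1000, 600]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Nested dict of r for every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names}
            for a in names}


COLUMNS = {
    "porosity": POROSITY,
    "water_sat": WATER_SAT,
    "perm": PERM,
    "log10_perm": [math.log10(k) for k in PERM],
}

if __name__ == "__main__":
    matrix = correlation_matrix(COLUMNS)
    names = list(COLUMNS)
    print(" " * 12 + "".join(f"{n:>12}" for n in names))
    for a in names:
        print(f"{a:>12}" + "".join(f"{matrix[a][b]:12.2f}" for b in names))
```

A pair with a large absolute r is a candidate for a dependence relationship in the simulation. A crossplot is still worth making first, because r measures only linear association.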
Fig. 5: Crossplot of permeability vs. porosity using a Cartesian scale.

Fig. 6: Crossplot of permeability vs. porosity using a semilog scale.

Fig. 7: Comparison of output distribution from simulation, dependent vs. independent inputs.

pairs of parameters. Crossplots are a good first step to identify relationships. When two or more parameters in the underlying model appear to depend on one another, the degree of dependence can be measured by regression and correlation tools. Any dependency of this sort can be included in the Monte Carlo simulation.
In a field example, several reservoir parameters were analyzed both for their types of distributions and for bivariate correlation. A simulation incorporating these distributions revealed that the range of output parameter, OIP, was affected by the decision to include dependence relationships.
Experts and fundamental principles supplement data analysis. Types of distributions and relationships among the parameters are
dependent on the environment being modeled. As more users examine field data, general guidelines may emerge. For now, users must look carefully at available data. It is always wise to run multiple simulations to compare the effects of different assumptions.

Nomenclature
$a$ = exponential production decline rate
$A$ = area, acres
$B_o$ = oil FVF, bbl/B
$C_g$ = gas content, scf/ton
$f_f$ = fraction of fractures depleted
$f_{fw}$ = fraction of fractures water filled
$h$ = net pay thickness, ft
$k$ = permeability, md
$L$ = length, ft
$L_f$ = fracture systems spacing, ft
$n$ = number of data points in z-score calculation
$N$ = reserves, bbl
$q$ = production rate, bbl/day
$r$ = Pearson correlation coefficient
$R$ = well recovery factor, fraction
$R_c$ = coalbed methane recovery, scf
$R_v$ = vertical well recovery, STB
$S_w$ = water saturation, percent
$t$ = time, month or year
$z$ = standardized normal value
$\rho$ = density, ton/acre-ft
$\phi$ = porosity, percent

Subscripts
$H$ = horizontal
$i$ = initial

References
1. Cronquist, C.: "Reserves and Probabilities: Synergism or Anachronism?," JPT (Oct. 1991) 1258.
2. Caldwell, R.H. and Heather, D.I.: "How To Evaluate Hard-To-Evaluate Reserves," JPT (Aug. 1991) 998.
3. Aitchison, J. and Brown, J.A.C.: The Lognormal Distribution, Cambridge U. Press, Cambridge (1957).
4. "Enhanced Oil Recovery," Natl. Pet. Council (1984).
5. "BestFit: Distribution Fitting Software for Windows," Beta Release 1.0, Palisade Corp., Newfield, NY (1993).
6. Holtz, M.H.: "Estimating Oil Reserve Variability by Combining Geologic and Engineering Parameters," paper SPE 25827 presented at the 1993 SPE Hydrocarbon Economics and Evaluation Symposium, Dallas, March 29-30.
7. Peterson, S.K., Murtha, J.A., and Schneider, F.F.: "Risk Analysis and Monte Carlo Simulation Applied to the Generation of Drilling AFE Estimates," paper SPE 26339 presented at the 1993 Annual Technical Conference and Exhibition, Houston, Oct. 3-6.
8. "@RISK: Risk Analysis and Simulation Add-in for Microsoft Excel," Release 1.1 User's Guide, Palisade Corp., Newfield, NY (1992).

SI Metric Conversion Factors
acre × 4.046 873 E+03 = m²
bbl × 1.589 873 E-01 = m³
ft × 3.048* E-01 = m
*Conversion factor is exact.

Original SPE manuscript received for review July 11, 1993. Revised manuscript received Feb. 17, 1994. Paper accepted for publication Dec. 12, 1993. Paper (SPE 26245) first presented at the 1993 SPE Petroleum Computer Conference in New Orleans, July 11-14.