Forecasting ultimate resource recovery


International Journal of Forecasting 11 (1995) 543-555

ELSEVIER

M. Hashem Pesaran (a), Hossein Samiei (b,*)
(a) Trinity College, Cambridge, UK
(b) International Monetary Fund, 700 19th St. N.W., Washington D.C. 20431, USA

Abstract

This paper considers Hubbert's model for forecasting ultimate resource recovery and its extensions by Kaufmann (1991, Resources and Energy 13, 111-127) and Cleveland and Kaufmann (1991, Energy Journal 12, 17-46). The emphasis of the paper is on econometric and forecasting issues, and it discusses alternative methods of estimating Hubbert's model. Using data on oil production in the U.S. lower 48 states, the paper reports the results of estimating the various specifications of the model and its extensions by the maximum-likelihood method, and provides the implied estimates for ultimate resource recovery and their associated standard errors. When economic factors are taken into account the estimates of ultimate resource recovery become state-dependent, and we find that in this case the estimates are higher than those obtained from the various specifications of Hubbert's original model. Although the accuracy of the estimates of ultimate recoverable reserves cannot be evaluated before oil reserves are actually exhausted, we examine how the various models estimated over the periods 1926-85 and 1948-85 perform in predicting oil production over the 1986-90 period.

Keywords: Hubbert's model; U.S. oil production; Ultimate recoverable reserves

JEL classification: C13; L71

1. Introduction

There exist a number of different approaches to modelling and forecasting ultimate recoverable oil reserves. These approaches differ in the degree of structural detail and the prominence that they give to geological and economic factors in the modelling of oil supplies, and range from disaggregated engineering process techniques to econometric and, finally, to curve-fitting procedures. For a review of these approaches see Adelman and Jacoby (1979), and Kaufmann (1988). Among these approaches the curve-fitting methods, due to their simplicity, have proved particularly popular in forecasting ultimate recoverable oil reserves in the U.S. and elsewhere. One prominent example of the curve-fitting method is that developed in a series of papers by Hubbert (1956, 1962, 1967), which has been used extensively in the literature.

Hubbert's approach, however, is subject to a number of well-known shortcomings, which need to be weighed against the value of its simplicity. Firstly, it is not likely to be a very effective forecasting tool in regions where a large proportion of the ultimately recoverable reserves remain unexplored. Hubbert, of course, recognizes this problem and suggests that his model be used only when cumulative production has passed one third of the reserves ultimately recoverable. But note that in practice it would not be possible to know, without further assumptions, whether recovery has reached that stage. Furthermore, by ignoring the effects of price and cost changes on exploration and development of new oil reserves, Hubbert's model, and other pure curve-fitting techniques, may also exaggerate the rate of reserve depletion, and hence underestimate the energy resources that are ultimately recoverable.

A possible alternative to Hubbert's approach is presented by the disaggregated-process approach. This lies at the other extreme to Hubbert's procedure and provides a detailed engineering framework which simulates the different stages of the oil and gas supply process, but is highly data intensive and is more suited to the analysis of individual "plays" rather than large geographical regions. The disaggregated approach also ignores important economic, technological and institutional considerations. A third alternative is the econometric approach, which falls somewhere in between and, given its flexibility, is likely to be more promising than either the curve-fitting or the disaggregated-process approaches.¹

The purpose of the present paper is to re-examine, from an econometric point of view, Hubbert's model and its extension by Kaufmann (1991) and Cleveland and Kaufmann (1991), in an attempt to include economic factors in the analysis. The intention is not so much to discuss in any detail the shortcomings of Hubbert's model as a model of oil supply, but rather to concentrate on some of the methodological issues involved in the estimation of the model and its use in forecasting. We consider the problem of parameter identification, and estimation of Hubbert's model in the level of cumulative output, as formulated originally by Hubbert, as well as in the rate of recovery, and derive appropriate forecasting formulae for the estimation of ultimate recoverable reserves. We argue that it is more appropriate to base estimation of the ultimate recoverable reserves on an econometric model of the rate of recovery, rather than on the cumulative production function advanced in Hubbert's methodology.

The econometric methods are then applied to oil production in the U.S. lower 48 states over the period 1926-90. The results clearly illustrate the sensitivity of the estimate of ultimate recoverable oil reserves to model specification, both in relation to the choice of the functional form and the assumed structure of the error term. We then consider an extended version of the Hubbert model with the view of allowing for the influence of economic factors on the oil supply process, and present new estimates of ultimate recoverable reserves. Although the accuracy of the estimates of ultimate recoverable reserves cannot be evaluated before oil reserves are actually exhausted, we examine how the various models estimated over the periods 1926-85 and 1948-85 perform in predicting oil production over the 1986-90 period.

The plan of the paper is as follows. Sections 2 and 3 set out the basic model and consider appropriate methods for estimating it. Section 4 examines the choice of the start date in Hubbert's method of estimating ultimate recoverable reserves. Section 5 addresses the problem of residual serial correlation and the effect that this has on modelling the oil supply process based on the rate of extraction rather than the cumulative production which is the focus of Hubbert's curve-fitting approach. Section 6 considers the extensions of the model to take account of economic factors. Section 7 presents evidence on the predictive performance of the model in forecasting oil production over a 5-year horizon and, finally, Section 8 presents some concluding remarks.

¹ There is also the intertemporal optimization approach developed in Pesaran (1990), and Favero and Pesaran (1994) that considers the simultaneous modelling of the discovery, development and the extraction stages of the oil supply process at the level of an oil field. This approach is potentially as complex as the disaggregated-process approach, but has the virtue of being able to accommodate the economic factors influencing the oil supply process into the analysis. See also Pindyck (1979) for a discussion of alternative approaches to modelling oil supplies.

* Corresponding author.

0169-2070/95/$09.50 © 1995 Elsevier Science B.V. All rights reserved. SSDI 0169-2070(95)00620-6

2. Hubbert's model and its estimation

The basic idea behind Hubbert's model of production (and discovery) is very simple and is derived from the observation that, under the influence of geological factors, there are three distinct phases to the production of an exhaustible resource. In the first phase, when knowledge about the oil fields is limited, production grows at a slow pace. In the second phase, it starts to accelerate as knowledge about the fields accumulates and larger fields are discovered. Finally, in the third phase, the "discovery decline phenomenon" sets in and production begins to decline. Hubbert characterizes this process by means of the following logistic formulation for the cumulative output as a function of time:

$$Q(t, t_0) = \frac{Q_\infty}{1 + \alpha \exp[-\beta(t - t_0) + u_t]}, \tag{1}$$

where $t$ is time, $t_0$ is the start date for observations, $u_t$ is a disturbance term, and $Q(t, t_0)$ is cumulative discoveries or production over the period $t_0$ to $t$ ($t > t_0$), defined as:

$$Q(t, t_0) = \sum_{\tau = t_0}^{t} q_\tau, \tag{2}$$

where $q_\tau$ is the rate of production at time $\tau$. The parameter $Q_\infty$, which is the asymptote of the function, is then interpreted as the size of the "ultimate recoverable" reserves.² Note that the disturbance term $u_t$ is not explicitly introduced in Hubbert's analysis, and it is not clear how it should enter. The rationale behind the particular specification of $u_t$ chosen in (1) is to make Hubbert's estimation procedure consistent with his specification of the cumulative production function. A similar issue also arises in relation to the more recent work of Cleveland and Kaufmann (1991) and Kaufmann (1991). Other possible ways of specifying $u_t$ in the cumulative output function would be to introduce it in additive or multiplicative forms. We shall discuss the consequences of these alternative specifications of the error term later.

² Other functional forms for $Q(t, t_0)$ are also considered in the literature. For a useful survey see Kaufmann (1988).

In order to estimate his model, Hubbert first writes (1) as

$$\log\left(\frac{Q_\infty}{Q(t, t_0)} - 1\right) = \log(\alpha) - \beta(t - t_0) + u_t \tag{3}$$

and then estimates the unknown parameters $t_0$, $\alpha$, $\beta$ and $Q_\infty$ by an iterative grid-search procedure. Thus for a given $t_0$, the series $Q(t, t_0)$ is calculated and then, for a given value of $Q_\infty$, the other parameters, namely $\alpha$ and $\beta$, are estimated by the OLS method. The "best-fit" estimates are obtained by comparing the $R^2$s of the OLS regressions for different values of $t_0$ and $Q_\infty$.

Hubbert's method of estimation, however, is problematic for a number of reasons. First, the estimation of $t_0$ by maximizing the $R^2$ (or the adjusted $R^2$) of (3) is inappropriate because it involves comparing the $R^2$ of regressions that are not comparable, since the dependent variable of the regressions, $\log[Q_\infty/Q(t, t_0) - 1]$, itself varies with $t_0$. Cleveland and Kaufmann (1991) also follow a similar procedure, but their intention seems to be to illustrate that the best-fit results are unstable. They find that, "The adjusted $R^2$ of the logistic curve increases steadily as the start date for the analysis moves forward from 1900 to 1970." (Cleveland and Kaufmann, 1991, p. 24).

Second, it is not clear why $t_0$ should be treated as a free parameter to be estimated. The appropriate procedure would be to use all the data available, thus fixing $t_0$ accordingly, but allowing for the fact that production from the field may have started prior to date $t_0$. We shall discuss this procedure later on.

Third, even for a fixed start date, the estimation of $Q_\infty$ by maximizing the $R^2$ of the regression in (3) is not appropriate and can lead to inconsistent estimates and misleading inference. This is because the "best-fitting" criterion should be applied to $Q(t, t_0)$ and not to $\log(Q_\infty/Q(t, t_0) - 1)$, which itself depends on the unknown parameter $Q_\infty$. To obtain consistent and efficient estimates one would need to apply the maximum-likelihood (ML) method to (3). This method by construction takes account of the transformation involved in going from the dependent variable in (3) to the variable of interest, namely $Q(t, t_0)$.
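The grid-search step described above can be sketched in a few lines. The following is only an illustration under stated assumptions: the data are hypothetical and noise-free, the start date is held fixed, and the function names (`hubbert_grid_search`, `ols_line`) are invented for this sketch rather than taken from Hubbert or from the paper. It reproduces the mechanics of the procedure, including the $R^2$ comparison that the text argues is not a valid estimation criterion.

```python
import math


def logistic_cumulative(t, q_inf, alpha, beta, t0):
    """Noise-free version of Eq. (1): cumulative output at time t."""
    return q_inf / (1.0 + alpha * math.exp(-beta * (t - t0)))


def ols_line(x, y):
    """OLS of y on a constant and x; returns (intercept, slope, r2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return intercept, slope, 1.0 - ss_res / ss_tot


def hubbert_grid_search(times, cum_output, t0, q_inf_grid):
    """For each candidate Q_inf, run the OLS regression in Eq. (3) and keep
    the candidate with the highest R^2. Returns (r2, q_inf, alpha, beta)."""
    best = None
    x = [t - t0 for t in times]
    for q_inf in q_inf_grid:
        if q_inf <= max(cum_output):
            continue  # the log transform in Eq. (3) needs Q_inf > Q_t
        y = [math.log(q_inf / q - 1.0) for q in cum_output]
        intercept, slope, r2 = ols_line(x, y)
        candidate = (r2, q_inf, math.exp(intercept), -slope)
        if best is None or candidate[0] > best[0]:
            best = candidate
    return best


# Hypothetical, noise-free data generated from Eq. (1) with u_t = 0.
T0 = 0
TIMES = list(range(1, 66))
CUM = [logistic_cumulative(t, 150.0, 40.0, 0.09, T0) for t in TIMES]
GRID = [120.0 + 5.0 * k for k in range(13)]  # candidates 120, 125, ..., 180

R2, Q_HAT, ALPHA_HAT, BETA_HAT = hubbert_grid_search(TIMES, CUM, T0, GRID)
```

With noise-free data the regression is exactly linear at the true $Q_\infty$, so the search recovers it. With real data the dependent variable changes with every candidate value, which is the comparability problem raised in the text.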


Finally, note that the error term in (1) was introduced in a manner that was consistent with Hubbert's analysis so that the estimation of his model could be discussed. However, it is equally reasonable to suppose that the error term in Hubbert's logistic specification enters in an additive or multiplicative manner. Thus, it is reasonable also to consider the following two alternative specifications:

$$Q_t = \frac{Q_\infty}{1 + \alpha \exp[-\beta(t - t_0)]} + u_{1t}, \tag{4}$$

$$\log(Q_t) = \log(Q_\infty) - \log\{1 + \alpha \exp[-\beta(t - t_0)]\} + u_{2t}. \tag{5}$$

The errors are additive in (4) and multiplicative in (5). Both of these models may be estimated by non-linear least squares (which here will be identical to the ML procedure).

3. Maximum-likelihood estimation of Hubbert's model and some preliminary results

In order to apply the ML method to (3) we initially assume that the disturbances, $u_t$, are serially uncorrelated and distributed as normal variates with zero means and a constant variance, $\sigma_u^2$. The implications of relaxing this assumption will be examined later. To simplify the notation we let $Q_t = Q(t, t_0)$, but bear in mind its implicit dependence on $t_0$. The relevant likelihood function for (1) is given by the joint probability distribution of $(Q_1, Q_2, \ldots, Q_n)$ as a function of the vector of unknown parameters $\theta = (\alpha, \beta, Q_\infty)'$, where $n$ is the number of available observations. Set

$$Z_t(Q_\infty) = \log\left(\frac{Q_\infty}{Q_t} - 1\right), \qquad Q_\infty > Q_t, \tag{6}$$

and denote the log-likelihood function of (1) by $l_Q(\theta)$, then we have:

$$l_Q(\theta) = l_u(\theta) + \sum_{t=1}^{n} \log(J_t), \tag{7}$$

where $l_u(\theta)$ is given by

$$l_u(\theta) = -\frac{n}{2}\log(2\pi\sigma_u^2) - \frac{1}{2\sigma_u^2}\sum_{t=1}^{n}\left[Z_t(Q_\infty) - \log(\alpha) + \beta(t - t_0)\right]^2, \tag{8}$$

and $J_t$ is the Jacobian of the transformation from $u_t$ to $Q_t$, given by

$$J_t = \left|\frac{\partial Z_t(Q_\infty)}{\partial Q_t}\right| = \frac{1}{Q_\infty - Q_t} + \frac{1}{Q_t}. \tag{9}$$

The ML estimates are obtained by maximizing $l_Q(\theta)$ with respect to $\alpha$, $\beta$ and $Q_\infty$, using data on the lower 48 states of the U.S. over the period 1926-90. The estimation results are presented in Table 1, column M1. Note that Hubbert's method of estimation, which in general will yield an inconsistent estimate of $Q_\infty$, is equivalent to maximizing $l_u(\theta)$ rather than $l_Q(\theta)$. Note also that since Hubbert's estimates are obtained conditional on a given value of $Q_\infty$, the estimated standard errors generated by Hubbert's estimation method are incorrect and do not allow for the sampling uncertainty associated with the choice of $Q_\infty$. The estimates obtained by maximizing $l_u(\theta)$ are presented in Table 1, column M2. Comparing M1 and M2, it turns out that there is not a great deal of difference between the two sets of estimates in this particular case. Both methods give an estimate of just over 154 billion barrels for the ultimate recoverable reserves.

The results of estimating Eqs. (4) and (5) with additive and multiplicative errors are also presented in Table 1, columns M3 and M4. It can be seen that while in the case of additive errors the estimate of $Q_\infty$ is not too far from that obtained from Hubbert's original specification (164.68 as opposed to 154.44), the estimate is quite different when the error enters multiplicatively (126.56). Note, in any case, that both specifications suffer from the presence of substantial residual serial correlation. We shall return to the issue of serial correlation later.

Table 1
ML estimates of Hubbert's model for cumulative production under alternative specifications of the error term (estimation period 1926-90)

                      M1          M2          M3          M4
Q_inf             154.44      154.73      164.68      126.56
                  (2.59)      (2.66)      (1.78)      (5.67)
alpha              41.02       40.98       29.81       41.66
                  (2.06)      (2.05)      (0.75)      (2.42)
beta               0.090       0.090       0.078       0.107
                 (0.002)     (0.002)     (0.001)     (0.003)
sigma-hat          0.191       0.191       1.439       0.150
LL                -46.02       15.43     -114.37       32.39
R-bar^2            0.999       0.999       0.999       0.982
DW                     -           -       0.027       0.154
chi^2_SC(1)            -           -       54.91       32.97

Notes: Specifications M1, M2, M3, and M4 are defined in Section 3 of the paper. Standard errors are in brackets. sigma-hat is the estimated standard error of the residuals, LL is the maximized value of the log-likelihood function, R-bar^2 is the squared adjusted multiple correlation coefficient, DW is the Durbin-Watson statistic, and chi^2_SC(1) is the Lagrange multiplier test for residual serial correlation of order 1. Note that the equation standard errors, the maximized values of the likelihood function, and multiple correlation coefficients are not comparable across different specifications. All the computations are carried out using Microfit 3.0 (see Pesaran and Pesaran, 1991), and GAUSS 2.1.
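The relationship between the two objective functions in Eqs. (7)-(9) can be made concrete with a short sketch. It is not the authors' Microfit or GAUSS code; the data, parameter values and function names below are hypothetical, chosen only so that the pieces can be evaluated. The sketch evaluates $l_u(\theta)$ and $l_Q(\theta)$ at given parameter values and leaves the maximization step out.

```python
import math


def z_transform(q_t, q_inf):
    """Eq. (6): Z_t = log(Q_inf / Q_t - 1), defined only for Q_inf > Q_t."""
    return math.log(q_inf / q_t - 1.0)


def log_lik_u(cum, times, t0, q_inf, alpha, beta, sigma2):
    """Eq. (8): Gaussian log-likelihood of the disturbances u_t."""
    n = len(cum)
    ssr = sum(
        (z_transform(q, q_inf) - math.log(alpha) + beta * (t - t0)) ** 2
        for q, t in zip(cum, times)
    )
    return -0.5 * n * math.log(2.0 * math.pi * sigma2) - ssr / (2.0 * sigma2)


def log_jacobian(q_t, q_inf):
    """Log of Eq. (9): |dZ_t/dQ_t| = 1/(Q_inf - Q_t) + 1/Q_t."""
    return math.log(1.0 / (q_inf - q_t) + 1.0 / q_t)


def log_lik_q(cum, times, t0, q_inf, alpha, beta, sigma2):
    """Eq. (7): log-likelihood of the observed Q_t (adds the Jacobian term)."""
    return log_lik_u(cum, times, t0, q_inf, alpha, beta, sigma2) + sum(
        log_jacobian(q, q_inf) for q in cum
    )


# Hypothetical data: a logistic path with small deterministic wiggles as u_t.
T0 = 0
TIMES = list(range(1, 41))
U = [0.05 * math.sin(t) for t in TIMES]
CUM = [
    150.0 / (1.0 + 40.0 * math.exp(-0.09 * (t - T0) + u))
    for t, u in zip(TIMES, U)
]

LU = log_lik_u(CUM, TIMES, T0, 150.0, 40.0, 0.09, 0.05 ** 2)
LQ = log_lik_q(CUM, TIMES, T0, 150.0, 40.0, 0.09, 0.05 ** 2)
```

Because the Jacobian term depends on $Q_\infty$, a search over $Q_\infty$ that uses only `log_lik_u` (or, equivalently, the regression fit of Eq. (3)) is optimizing a different function from `log_lik_q`.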

4. Choice of the start date for the cumulative output variable

As we discussed above, taking $t_0$ as a parameter to be estimated is not plausible. It is more reasonable to use all the observations, thus fixing $t_0$, and then explicitly allowing for the fact that there may have been production prior to date $t_0$, not known to the forecaster. Assuming that the same logistic specification applies to unobserved production prior to $t_0$, we have:

$$Q_t^* = \frac{Q_\infty}{1 + \alpha \exp(-\beta t)}, \tag{10}$$

where $Q_t^*$ is the unobserved cumulative production from $-\infty$ to $t$, and is defined by

$$Q_t^* = \sum_{\tau = -\infty}^{t} q_\tau. \tag{11}$$

This specification has the plausible property that cumulative production tends to zero as $t$ tends to $-\infty$, and tends to $Q_\infty$ as $t$ tends to $+\infty$. Eq. (10), however, cannot be estimated as it stands because the necessary production statistics for computing $Q_t^*$ may not be available. To write it in terms of the observables, as before let $t_0$ be the earliest date for observed production, and note that:

$$Q(t, t_0) = Q_t = Q_t^* - Q_{t_0 - 1}^*, \tag{12}$$

where $Q_t$, as defined above in (2), is cumulative output between $t_0$ and $t$, which is observable, and $Q_{t_0 - 1}^*$ is cumulative output between $-\infty$ and $t_0 - 1$, which is unobserved. Substituting for $Q_t^*$ and $Q_{t_0 - 1}^*$ from (10) we have:

$$Q_t = \frac{Q_\infty}{1 + \alpha \exp(-\beta t)} - \frac{Q_\infty}{1 + \alpha \exp[-\beta(t_0 - 1)]}. \tag{13}$$

This specification, which is now in terms of observables, may be estimated by ML or non-linear least squares, depending on how the error term is introduced in (13). If the error term is introduced analogous to that in (1), then the ML method should be applied, but proper account should be taken of the parameter restrictions implied by (13). To estimate the model note that instead of (3) we now have:

$$\log\left(\frac{Q_\infty}{Q_t + A(\theta)} - 1\right) = \log(\alpha) - \beta t + u_t, \tag{14}$$

where $A(\theta) = Q_\infty / \{1 + \alpha \exp[-\beta(t_0 - 1)]\}$, which is a fixed constant and is fully determined in terms of the unknown parameters $\alpha$, $\beta$ and $Q_\infty$, and the start date of the available observations on production. Assuming, as before, that the $u_t$'s are serially uncorrelated and normally distributed with zero means and a constant variance $\sigma_u^2$, we may proceed as before. The likelihood function in this case is given by

$$l_Q(\theta) = \sum_{t=1}^{n} \log(J_t) - \frac{n}{2}\log(2\pi\sigma_u^2) - \frac{1}{2\sigma_u^2}\sum_{t=1}^{n}\left[Z_t(Q_\infty) - \log(\alpha) + \beta t\right]^2, \tag{15}$$

with the Jacobian equal to

$$J_t = \left|\frac{\partial Z_t(Q_\infty)}{\partial Q_t}\right| = \frac{1}{Q_\infty - Q_t - A(\theta)} + \frac{1}{Q_t + A(\theta)}, \tag{16}$$

where

$$Z_t(Q_\infty) = \log\left(\frac{Q_\infty}{Q_t + A(\theta)} - 1\right). \tag{17}$$

The estimation results of this equation are presented in Table 2, specification N1. Note that we have replaced $t$ by $t - t_0$, in order to make the size of the estimate of $\beta$ comparable with those reported in the previous section. The alternative specifications based on additive and multiplicative errors are also estimated and the results are reported in Table 2, columns N2 and N3. It can be seen that in all three specifications there is a drop in the estimated value of $Q_\infty$. However, as with the specifications reported in Table 1, these equations also suffer from a substantial degree of residual serial correlation; witness the high values of the LM statistics, $\chi^2_{SC}$. This is true irrespective of how the error term is introduced in the model.

Table 2
ML estimates of Hubbert's model for cumulative production allowing for unobserved data on initial production (estimation period 1926-90)

                      N1          N2          N3
Q_inf             144.37      153.05      116.49
                  (1.62)      (2.09)      (5.86)
alpha             108.58       41.50       62.76
                 (14.49)      (1.54)      (4.50)
beta               0.118       0.088       0.125
                 (0.004)     (0.001)     (0.004)
sigma-hat          0.501       2.354       0.194
LL               -107.52     -148.63       16.02
R-bar^2            0.999       0.997       0.972
DW                     -       0.021       0.124
chi^2_SC(1)            -       57.49       36.56

Notes: See the notes to Table 1. The estimates reported in this table refer to Eq. (14) in the text.

5. Modelling the rate of production

So far we have considered the estimation of the Hubbert model and its various representations assuming that the error term $u_t$, in whatever form it appears, satisfies the standard classical assumptions. In particular, it is assumed that the errors are serially uncorrelated. However, our tests of this hypothesis, for the various specifications of Hubbert's model, show that this assumption is strongly rejected when confronted with the data. Indeed, the very low value of the DW statistic indicates the possibility of the presence of a unit root in the estimated residuals. This implies that it may be more appropriate to concentrate on explaining the first difference of $Q_t$, namely the rate of production, $q_t$. This may also be justified on a priori economic grounds: decisions about $q_t$ determine cumulative output rather than vice versa. In what follows, we focus on the case with an additive error and consider the model examined in the previous section with $-\infty$ as the start date for production. The functional form for the cumulative output in Eq. (13) then implies the following specification for $q_t$:

$$q_t = \frac{Q_\infty}{1 + \alpha \exp(-\beta t)} - \frac{Q_\infty}{1 + \alpha \exp[-\beta(t - 1)]} + \varepsilon_t. \tag{18}$$
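The step from the cumulative function in Eq. (13) to the rate equation (18) can be checked numerically. The sketch below uses hypothetical parameter values and invented function names; it only illustrates that first-differencing Eq. (13) gives the same trend whatever start date is used for the observed cumulative series.

```python
import math


def logistic(t, q_inf, alpha, beta):
    """Eq. (10): cumulative production from minus infinity to t."""
    return q_inf / (1.0 + alpha * math.exp(-beta * t))


def observed_cumulative(t, t0, q_inf, alpha, beta):
    """Eq. (13): cumulative production from t0 to t."""
    return logistic(t, q_inf, alpha, beta) - logistic(t0 - 1, q_inf, alpha, beta)


def trend_rate(t, q_inf, alpha, beta):
    """Systematic part of Eq. (18), i.e. Eq. (18) without the error term."""
    return logistic(t, q_inf, alpha, beta) - logistic(t - 1, q_inf, alpha, beta)


PARAMS = (190.0, 17.0, 0.065)  # hypothetical (Q_inf, alpha, beta)


def first_difference(t, t0):
    """Q_t - Q_{t-1} computed from Eq. (13) for a given start date t0."""
    return observed_cumulative(t, t0, *PARAMS) - observed_cumulative(t - 1, t0, *PARAMS)
```

Evaluating `first_difference(t, t0)` for two different values of `t0` returns the same number as `trend_rate(t, *PARAMS)`, because the constant subtracted in Eq. (13) cancels in the difference.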

Note that the term involving $t_0$ disappears with first differencing. This equation is estimated by non-linear least squares and the results are reported in Table 3, column P1. This specification yields fitted values that are much closer to the actual rate of production as compared to the output estimates implied by Hubbert's cumulative output equation. This can be seen clearly from Fig. 1(a), where the fitted values from the output equation, P1, are compared to the actual rate of production, and the estimates of output implied by Hubbert's estimated cumulative production equation M3, reported in Table 1. These estimates suggest a value of 191.02 for the ultimate amount of recoverable reserves. However, note that the error in this equation still suffers from a substantial degree of serial correlation. When we include lagged values of $q_t$ in the equation, to estimate an equation of the form:

$$q_t = \lambda q_{t-1} + \frac{C}{1 + \alpha \exp(-\beta t)} - \frac{C}{1 + \alpha \exp[-\beta(t - 1)]} + \varepsilon_t, \tag{19}$$

serial correlation in the residuals seems to disappear, as indicated by the LM test (Table 3, column P2). But note that the introduction of dynamics in the model means that $Q_\infty$ does not enter (19) explicitly. It is easy to see that $Q_\infty = C/(1 - \lambda)$, and using the estimates in column P2 of Table 3 now yields $Q_\infty = 186.86$ (8.40). The figure in brackets is the asymptotic standard error of the estimate and suggests quite a high degree of precision for this estimate of $Q_\infty$.

The main advantage of basing the derivation of output equations (18) and (19) on Hubbert's cumulative output function lies in the fact that it leads to a bell-shaped function for the trend component of output. However, there are many other bell-shaped functions that one could use to characterize the non-linear trend in output. An obvious alternative to (18) is the functional form of the normal distribution

$$q_t = A \exp\{\alpha t - \beta t^2\} + \varepsilon_t, \tag{18'}$$

where $A$, $\alpha$ and $\beta$ are positive constants. Imposing the restriction that the effect of the disturbances, $\varepsilon_t$, over the life-time of the field averages out to zero, the value of the ultimate resource recovery can be approximated by³

$$Q_\infty = \int_{-\infty}^{\infty} A \exp\{\alpha t - \beta t^2\}\, dt = A (\pi/\beta)^{1/2} \exp(\alpha^2 / 4\beta).$$

Table 3
ML estimates of Hubbert's model in terms of the rate of production (estimation period 1926-90)

                  P1            P2            P3               P4
               (Eq. (18))    (Eq. (19))    (Eq. (18'))      (Eq. (19'))
Q_inf            191.02        186.86        181.07           174.92
                 (3.23)        (8.40)        (3.91)           (9.68)
alpha             17.62         15.44        0.0789           0.0811
                (0.913)        (2.37)       (0.0030)         (0.0095)
beta              0.064         0.067     0.876 x 10^-3       0.0010
                (0.001)       (0.004)    (0.372 x 10^-4)    (0.00015)
lambda                -         0.802             -            0.839
                              (0.078)                         (0.069)
sigma-hat         0.145         0.087         0.163            0.087
LL                34.68         67.55         27.30            68.48
R-bar^2           0.965         0.987         0.956            0.988
DW                0.386             -         0.312                -
chi^2_SC(1)       41.66         2.534         44.73             2.02

Notes: See the notes to Table 1. For the specification of the equations underlying these estimates, see Section 5.

³ Strictly speaking we are interested in the integral of $q_t$ over the range $(-\infty, \infty)$, which is equal to $Q_\infty$ plus the integral of $\varepsilon_t$ over the same range. Here we are assuming that the latter is equal to zero, namely production short-falls relative to trend will be exactly compensated at a later date. Therefore, it cannot be assumed that the $\varepsilon_t$ are independently and identically distributed over the entire range of $t$ from $-\infty$ to $+\infty$. Otherwise, the variance of $\sum_t \varepsilon_t$ would not be finite. In practice, heteroscedasticity and dependence in $\varepsilon_t$ arise particularly at the initial and final stages of the recovery process, and will be less of a problem in other data ranges.
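The two closed-form expressions for $Q_\infty$ in this section, $C/(1-\lambda)$ for the dynamic logistic equation (19) and $A(\pi/\beta)^{1/2}\exp(\alpha^2/4\beta)$ for the normal-density trend (18'), can be checked against brute-force accumulation. The sketch below sets all disturbances to zero and uses hypothetical parameter values; the function names are invented for illustration and the summation ranges are assumptions chosen to be wide enough for these values.

```python
import math


def q_inf_dynamic(c, lam):
    """Long-run recovery implied by Eq. (19): Q_inf = C / (1 - lambda)."""
    return c / (1.0 - lam)


def q_inf_gaussian(a_scale, alpha, beta):
    """Integral of the trend in Eq. (18') over the whole real line."""
    return a_scale * math.sqrt(math.pi / beta) * math.exp(alpha ** 2 / (4.0 * beta))


def simulate_dynamic_total(c, lam, alpha, beta, t_lo=-400, t_hi=800):
    """Accumulate q_t from Eq. (19) with zero disturbances and q = 0 before t_lo."""
    total, q_prev = 0.0, 0.0
    for t in range(t_lo, t_hi + 1):
        trend = (c / (1.0 + alpha * math.exp(-beta * t))
                 - c / (1.0 + alpha * math.exp(-beta * (t - 1))))
        q_t = lam * q_prev + trend
        total += q_t
        q_prev = q_t
    return total


def gaussian_trend_sum(a_scale, alpha, beta, t_lo=-1000, t_hi=1000):
    """Unit-step sum of the trend in Eq. (18') as a stand-in for the integral."""
    return sum(a_scale * math.exp(alpha * t - beta * t * t)
               for t in range(t_lo, t_hi + 1))


# Hypothetical parameter values, not the estimates in Table 3.
DYN_CLOSED = q_inf_dynamic(37.0, 0.8)
DYN_SIM = simulate_dynamic_total(37.0, 0.8, 15.0, 0.067)
GAUSS_CLOSED = q_inf_gaussian(0.5, 0.08, 0.001)
GAUSS_SUM = gaussian_trend_sum(0.5, 0.08, 0.001)
```

In both cases the accumulated output agrees with the closed form to many digits, which is the sense in which $Q_\infty$ remains recoverable from the estimated parameters once dynamics are added.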

550

Fig. 1. Actual rate of recovery and its fitted values (in billions of barrels; horizontal axis: years, 1948-88): (a) based on models M3 and P1; (b) based on model K3.

Similarly, short-run dynamic effects can be introduced in (18'). For example

$$q_t = \lambda q_{t-1} + A \exp\{\alpha t - \beta t^2\} + \varepsilon_t, \tag{19'}$$

which yields $Q_\infty = [A/(1 - \lambda)](\pi/\beta)^{1/2} \exp(\alpha^2/4\beta)$. The non-linear least squares estimates of the parameters of (18') and (19') are given in Table 3, columns P3 and P4, respectively. The results are very similar to those given in columns P1 and P2, and suggest that there is little to choose between Hubbert's trend specification and that based on the density of the normal distribution, particularly once proper adjustments are made for the dynamics. The same is also true of the different estimates obtained for $Q_\infty$.

6. Inclusion of economic and other factors in Hubbert's model

Having discussed Hubbert's model and the possible econometric forms that it could take, it is important to note also some other well-known shortcomings of the model. Firstly, the logistic specification implies a perfectly symmetric bell-shaped function for the recovery rate, $q_t = Q_t - Q_{t-1}$. As recognized by Hubbert himself, this is not consistent either with the geological and engineering knowledge of oil reservoirs or the historical patterns of oil production. The justification given for the use of this specification is based on mathematical parsimony.

Secondly, Hubbert's model assumes that $Q_\infty$, the ultimate recoverable reserves, is fixed. This is not a realistic assumption. Although it may be possible to entertain a concept of ultimate recoverable reserves that is purely geologically determined, this may not be very useful because the model refers to actual production, and therefore $Q_\infty$ will, in practice, be likely to be influenced by other factors. One such factor is the rate of recovery. Given the pressure dynamics of the oil reservoirs, and the extent of secondary and tertiary recovery efforts required to maintain production flows, the higher the rate of recovery, the lower will be the amount of oil potentially recoverable from a given reservoir. This means that the cost of secondary and tertiary recovery will be a factor in the determination of the economically feasible amount of recoverable reserves. Furthermore, as emphasized by Kaufmann (1988, 1991) and Cleveland and Kaufmann (1991), the potential reserve recovery also depends on economic and political factors. More favourable price and cost conditions are likely to lead to more advanced technologies for use in oil production, and thus a higher value of $Q_\infty$.

In forecasting the size of ultimate recoverable reserves, it is therefore important that economic and technological factors are also taken into account. To carry out such an exercise properly would involve a detailed analysis of the various stages of the oil supply process, namely exploration, development and extraction, which is beyond the scope of the present paper. (But see, for example, Pesaran (1990) and Favero and Pesaran (1994), where an intertemporal model of the oil supply process is developed and estimated using data from the United Kingdom continental shelf.) Here we shall concentrate on the recent studies by Kaufmann (1988, 1991) and Cleveland and Kaufmann (1991), which aim at improving on Hubbert's approach by incorporating economic and political factors in the analysis, without embarking upon a full-fledged structural modelling of the oil supply process.

The modifications proposed by Kaufmann (1988, 1991) begin with Hubbert's logistic specification and postulate a relationship between the residuals from Hubbert's cumulative production model and a linear function of running averages of current and past real oil prices, RP12 and RP35, the relative price of oil to natural gas, OG, and the fraction of crude oil production capacity in Texas that is allowed to operate by the Texas Railroad Commission, TRC, capturing the extent of administrative controls on production flows. The first difference of the production curve (Eq. (4) in Kaufmann, 1991) after its peak has been reached, denoted by PC, is also included to capture possible asymmetries in the production profile.⁴ In what follows we gather these variables into a vector of dimension 5 which we denote by $x_t$.

The estimation strategy followed by Kaufmann is a two-stage iterative procedure. In the first stage Eq. (3) is estimated using the grid-search method described in Section 2, the shortcomings of which have already been discussed. In the second stage the residuals from stage one are modelled as a function of $x_t$. This is done in order to introduce economic factors in the curve-fitting approach of Hubbert. This is an interesting contribution to the literature on ultimate resource recovery and improves on Hubbert's somewhat mechanical analysis. We shall not discuss this model from an economic point of view but rather examine the estimation procedures followed by Kaufmann and discuss possible ways of improving upon them.

The two-stage estimation procedure used in Kaufmann's study is likely to generate bias in the estimates (over and above that discussed above in relation to Hubbert's model), because the influence of the two sets of variables (the long-run trend captured by Hubbert's model and the short- to medium-run effects identified by the elements of $x_t$) is estimated separately. The first stage estimation thus faces omitted-variable bias. In particular, note that the estimate of $Q_\infty$ obtained in the first stage does not take account of the effect of economic factors, which enter the analysis in the second stage. As we shall discuss later, the appropriate method would be to carry out the estimation of the long-term trend and the short-term effects simultaneously. This would also give the correct standard errors of the estimates.

Kaufmann's estimation procedure, furthermore, conceals an identification problem that surrounds the model and only comes to light once the estimation is carried out in one stage. To see this note that Kaufmann's (1991) equations (4)-(6) can be combined to yield

$$
q_t = f(t)\,[1 + a_0 + b'x_t] + v_t, \tag{20}
$$

where $v_t$ is an error term and

$$
f(t) = \frac{Q_\infty}{1 + \alpha \exp(-\beta(t - t_0))} - \frac{Q_\infty}{1 + \alpha \exp(-\beta(t - t_0 - 1))}. \tag{21}
$$

Substituting (21) in (20) gives:

$$
q_t = \frac{\phi}{1 + \alpha \exp(-\beta(t - t_0))} - \frac{\phi}{1 + \alpha \exp(-\beta(t - t_0 - 1))} + \frac{\theta'x_t}{1 + \alpha \exp(-\beta(t - t_0))} - \frac{\theta'x_t}{1 + \alpha \exp(-\beta(t - t_0 - 1))} + v_t, \tag{22}
$$

where $\phi = Q_\infty(1 + a_0)$ and $\theta = Q_\infty b$. In this specification the dependence of $q_t$ on $Q_\infty$ is through the parameters $\phi$ and $\theta$. Thus, given historical observations on $q_t$ and $x_t$, one may obtain estimates for $\phi$, $\theta$, $\alpha$ and $\beta$, but it will not be possible to obtain unique estimates of $a_0$, $Q_\infty$ and $b$. The two-stage estimation, therefore, erroneously appears to identify all the parameters. Only when the simultaneous estimation of all the parameters is considered does the identification problem manifest itself. This implies that in Kaufmann's extension of Hubbert's model, the size of the ultimate recoverable reserves, $Q_\infty$, can no longer be uniquely identified from the analysis of production and price data.

4 See Kaufmann (1991) for a more detailed description of the variables and data sources. All the estimations are carried out over the period 1948-90.

One way to overcome this problem is to note that lack of identification arises from the specific functional form that Kaufmann employs, namely that the ratio of the residuals from the first stage to actual output is modelled as a function of $x_t$.5 This also means that $Q_\infty$, estimated in the first stage, no longer corresponds to the concept of ultimate recoverable resources, and is adjusted in the second stage so that it becomes dependent on the values of the economic and political factors. It seems, therefore, more reasonable to start with Hubbert's model and to explicitly allow for the possible dependence of ultimate recoverable reserves on prices, costs, and other variables in $x_t$. More explicitly, $Q_{\infty t} = Q_\infty + a'x_t$, where $Q_{\infty t}$ is the size of the ultimate resource discovery, given the economic, technological, and other relevant factors that prevail at time $t$. In this formulation, ultimate resource discovery varies with $x_t$, and is fixed only if economic and political factors have no impact on the recoverable resources, i.e. if all the elements in $a$ are zero. Once the parameters of the model are estimated one may obtain a measure of $Q_{\infty t}$ as a function of $x_t$.
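The identification argument for Eq. (22) can be checked numerically. The following is a minimal sketch, not the authors' code: the sample of regressors, the values of $\alpha$, $\beta$ and $t_0$, and both structural parameter sets are hypothetical and chosen only so that the two sets share the same $\phi = Q_\infty(1 + a_0)$ and $\theta = Q_\infty b$.

```python
import numpy as np


def logistic_increment(t, alpha, beta, t0):
    """First difference of the unit logistic curve, i.e. f(t) / Q_inf in Eq. (21)."""
    current = 1.0 / (1.0 + alpha * np.exp(-beta * (t - t0)))
    previous = 1.0 / (1.0 + alpha * np.exp(-beta * (t - t0 - 1)))
    return current - previous


def mean_output(t, x, q_inf, a0, b, alpha, beta, t0):
    """Systematic part of Eq. (20): f(t) * (1 + a0 + b'x_t)."""
    f = q_inf * logistic_increment(t, alpha, beta, t0)
    return f * (1.0 + a0 + x @ b)


# Hypothetical sample: 43 annual observations and two made-up regressors.
rng = np.random.default_rng(0)
t = np.arange(1948, 1991, dtype=float)
x = rng.normal(size=(t.size, 2))
alpha, beta, t0 = 10.0, 0.06, 1948.0  # assumed values, for illustration only

# Two different (hypothetical) structural parameter sets that share
# phi = Q_inf * (1 + a0) = 220 and theta = Q_inf * b = (4, -2).
set_one = dict(q_inf=200.0, a0=0.10, b=np.array([0.020, -0.010]))
set_two = dict(q_inf=250.0, a0=-0.12, b=np.array([0.016, -0.008]))

fit_one = mean_output(t, x, alpha=alpha, beta=beta, t0=t0, **set_one)
fit_two = mean_output(t, x, alpha=alpha, beta=beta, t0=t0, **set_two)

# The fitted paths coincide, so the data cannot tell Q_inf = 200 from Q_inf = 250.
identical = bool(np.allclose(fit_one, fit_two))
print(identical)
```

Because the two parameter sets imply the same path for the systematic part of $q_t$, no estimator based on $q_t$ and $x_t$ alone can separate them, which is the sense in which $Q_\infty$ is not identified in Eq. (22).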
5 See Kaufmann (1988) for an explanation of why this specification is used.

This state-dependent formulation has two advantages: first, the parameters and the ultimately recoverable resource are now all identifiable from the data, and second, the model explicitly states how economic and political factors influence the estimates of the reserves that are ultimately recoverable. In this case Eq. (19) generalizes to:

$$
q_t = \lambda q_{t-1} + \frac{b_0 + b_1'x_t}{1 + \alpha \exp(-\beta t)} - \frac{b_0 + b_1'x_{t-1}}{1 + \alpha \exp[-\beta(t - 1)]} + v_t, \tag{23}
$$

which corresponds to Eq. (19), the latter assuming a constant level of recoverable reserves. Note that the presence of the lagged dependent variable means that $Q_{\infty t}$ is equal to:

$$
Q_{\infty t} = \frac{b_0}{1 - \lambda} + \frac{b_1'x_t}{1 - \lambda}, \tag{24}
$$

where $Q_\infty = b_0/(1 - \lambda)$ and $a = b_1/(1 - \lambda)$. The ML estimates for this model computed over the period 1948-90 are presented in column K1 of Table 4. The coefficients on RP12, OG and PC are not significantly different from zero (even at the 10% level). The other estimated coefficients have the right signs and are statistically significant at the 5% level. But, as can be seen from the LM statistic, there is still some evidence of residual serial correlation. Using (24), the point estimates of the recoverable reserves, $Q_{\infty t}$, range between 236.49 and 324.42, and show a sharp drop in 1970. This rather abrupt decline in $Q_{\infty t}$ turns out to be primarily due to the effect of the PC variable, which as was noted above is not statistically significant. The estimated 95% confidence intervals for the recoverable reserves in the case of individual years also show considerable dispersion. For example, the confidence interval for the recoverable reserves in 1990 is estimated to lie in the range 114.95 to 364.19. This wide dispersion of the estimates of the recoverable reserves may also be due to the inclusion of statistically insignificant variables in the model. There is clearly a need for further exploration, to see the degree to which the estimates of the recoverable reserves are sensitive to the choice of economic and political factors included in $x_t$.

Table 4
ML estimates of the extended Hubbert's model in terms of the rate of production and including economic factors (estimation period 1948-90)

| | K1 | K2 | K3 |
|---|---|---|---|
| $\lambda_1$ | 0.698 (0.074) | 0.756 (0.067) | 0.988 (0.084) |
| $\lambda_2$ | - | - | -0.298 (0.094) |
| $b_0$ | 95.46 (36.52) | 50.82 (12.25) | 66.09 (11.66) |
| RP12 | -0.319 (0.757) | - | - |
| RP35 | 3.036 (1.628) | 1.398 (0.642) | 0.952 (0.625) |
| OG | -0.449 (0.992) | - | - |
| TRC | 2.586 (1.443) | 2.532 (0.310) | 2.235 (0.348) |
| PC | 216.53 (167.56) | - | - |
| $\alpha$ | 7.305 (2.497) | 8.463 (2.697) | 10.954 (2.108) |
| $\beta$ | 0.039 (0.013) | 0.055 (0.005) | 0.057 (0.003) |
| $\hat{\sigma}$ | 0.051 | 0.055 | 0.057 |
| LL | 67.33 | 63.84 | 69.10 |
| $\bar{R}^2$ | 0.983 | 0.981 | 0.982 |
| DW | 1.33 | 1.23 | 1.76 |
| $\chi^2_{SC}(1)$ | 5.06 | 6.70 | 0.654 |

Note: See the notes to Table 1, and the discussion of Eq. (29) in the text.

Dropping the statistically insignificant variables RP12, OG and PC, and re-estimating the equation, we obtain the results reported in column K2 of Table 4.6 The point estimates of the recoverable reserves, together with their 95% confidence intervals, for this equation are displayed in Fig. 2(a). These estimates are now much more precisely determined and fall in a narrower range. This equation, however, still suffers from the residual serial correlation problem.

6 These variables turned out to be statistically insignificant separately as well as jointly.

To overcome this problem we added a second-order lag of $q_t$ to the model. Note that in this case $Q_\infty = b_0/(1 - \lambda_1 - \lambda_2)$ and $a = b_1/(1 - \lambda_1 - \lambda_2)$, where $\lambda_1$ and $\lambda_2$ are the coefficients of $q_{t-1}$ and $q_{t-2}$, respectively. The results are presented in column K3 of Table 4. The estimated model now explains over 98% of the variations in actual output over the period 1948-90, and as can be seen from Fig. 1(b) the fitted values track actual output very closely. The estimates of $Q_{\infty t}$ together with the corresponding 95% confidence intervals are presented in Fig. 2(b). All the variables continue to remain statistically significant at the 5% level, except for the price variable, RP35, which is now significant at the 10% level. The point estimates of $Q_{\infty t}$ over the 1948-90 period lie in the narrow range of 203.33 to 209.52. Also, at the 1990 values of the exogenous variables, ultimate resource recovery is estimated to be 209.00, with the true value lying between 188.42 and 229.60 with 0.95 probability. Note further that this estimate is significantly higher than the estimates obtained using models based on cumulative production (specifications M1 to M4 and N1 to N3). It is also higher than the estimates obtained from the models based on the rate of production (specifications P1 and P2), although the difference is much less. The value of 189.7 obtained by Kaufmann (1991) from his two-stage estimation of the extended model also lies just inside the confidence interval obtained under specification K3.

Fig. 2. Point estimates of ultimate recoverable oil reserves in the U.S. based on (a) model K2, (b) model K3, with their 95% confidence interval (in billions of barrels). Specification and parameter estimates for models K2 and K3 are given in Table 4.
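The long-run formula in Eq. (24), and its two-lag counterpart used for K3, can be sketched as a small function. This is an illustration under stated assumptions rather than the authors' procedure: the function name is hypothetical, the intercepts and lag coefficients are the Table 4 figures read as $b_0$, $\lambda_1$ and $\lambda_2$, and the single regressor used in the last call is made up. Only the intercept component $Q_\infty$ is computed from reported numbers; the $x_t$ data behind the reported $Q_{\infty t}$ values are not reproduced here.

```python
import numpy as np


def ultimate_recovery(b0, b1, x_t, lag_coefs):
    """Return (b0 + b1'x_t) / (1 - sum of lag coefficients), as in Eq. (24)."""
    long_run_divisor = 1.0 - float(np.sum(lag_coefs))
    if abs(long_run_divisor) < 1e-12:
        raise ValueError("lag coefficients sum to one; long-run level undefined")
    return (b0 + float(np.dot(b1, x_t))) / long_run_divisor


# Intercept component Q_inf = b0 / (1 - sum of lags), with x_t switched off.
no_x = np.zeros(1)
base_k1 = ultimate_recovery(95.46, no_x, no_x, [0.698])
base_k2 = ultimate_recovery(50.82, no_x, no_x, [0.756])
base_k3 = ultimate_recovery(66.09, no_x, no_x, [0.988, -0.298])

# Hypothetical regressor: coefficient 1.0 and value -2.0 are made up,
# purely to show how x_t shifts the state-dependent estimate.
shifted_k3 = ultimate_recovery(66.09, np.array([1.0]), np.array([-2.0]),
                               [0.988, -0.298])

print(round(base_k1, 2), round(base_k2, 2), round(base_k3, 2), round(shifted_k3, 2))
```

The divisor $1 - \lambda_1 - \lambda_2$ is what converts the short-run coefficients into long-run levels, so estimates of $Q_{\infty t}$ become very sensitive to the lag coefficients when their sum approaches one.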

7. Predictive performance of the output equations

While it is not possible to evaluate the various estimates of the ultimate recoverable reserves, it would be of some interest to examine the relative performance of the various underlying models in predicting the recovery rate.7 With this in mind we re-estimated Hubbert's cumulative production model (model M3 in Table 1), and Eq. (18), our formulation of Hubbert's model in terms of the rate of recovery (model P1 in Table 3), over the period 1926-85. Based on these estimates, we generated forecasts of the recovery rate for the period 1986-90. The results are displayed in Fig. 3(a), and clearly show the superiority of the forecasts based on Eq. (18) over the forecasts based on Hubbert's own formulation in terms of cumulative production.

Fig. 3. Actual rate of recovery and forecasts (a) based on models M3 and P1, (b) based on model K3 (in billions of barrels).

In order to examine the effect of the inclusion of economic variables on the accuracy of forecasts, we also re-estimated model K3 (the preferred model presented in Table 4) over the period 1948-85, and obtained dynamic forecasts of output over the 1986-90 period, using actual values of oil prices realized over the 1988-90 period.8 The forecasts follow the downward trend of production quite closely, but show a faster rate of decline than the actual values. The effect of including the price variable in the forecasting equation is rather mixed. As far as the bias in the forecast errors is concerned, Eq. (18) (model P1), which excludes the price variable, performs better.9 But if one uses the root mean squared criterion, model K3, which includes the price variable, performs marginally better than P1. The root mean squared forecast errors for these two models were 0.055 and 0.068, respectively.10

7 This was suggested to us by one of the referees.

8 Note that the oil price variables enter the specification K3 with a lag of two years. The TRC variable was set equal to unity over the forecast period.

9 The means of the forecast errors for models P1 and K3 were equal to -0.022 and 0.050, respectively.
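The two accuracy criteria used in this comparison, the mean forecast error (a measure of bias) and the root mean squared forecast error, can be computed as in the following minimal sketch. The function name and the five-year series are hypothetical and are not the paper's data; the sign convention (actual minus forecast) is also an assumption, since the text does not state which one is used.

```python
import math


def forecast_accuracy(actual, forecast):
    """Return (mean forecast error, root mean squared forecast error).

    Errors are defined as actual minus forecast (an assumed convention).
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("series must be non-empty and of equal length")
    errors = [a - f for a, f in zip(actual, forecast)]
    mean_error = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mean_error, rmse


# Hypothetical five-year output path and forecasts (billions of barrels).
actual = [3.0, 2.9, 2.8, 2.7, 2.6]
forecast = [2.9, 2.9, 2.9, 2.6, 2.6]

mean_error, rmse = forecast_accuracy(actual, forecast)
print(round(mean_error, 3), round(rmse, 3))
```

Because positive and negative errors cancel in the mean but not in the squared criterion, a model can rank better on bias and worse on root mean squared error, which is the pattern reported for P1 and K3.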

8. Concluding remarks

In this paper we have examined various methodological issues concerning the specification and estimation of Hubbert's model of ultimate resource recovery and its extensions by Kaufmann (1991) and Cleveland and Kaufmann (1991). Our emphasis has been on discussing appropriate econometric procedures rather than extending the models from the economic or geological points of view. In particular, further research is required in examining, from a theoretical point of view, how (and what) economic factors should enter the determination of ultimate recoverable reserves. The estimates that we obtain using data on oil production in the U.S. illustrate the sensitivity of the estimates of ultimate reserves to model specification. In particular we find that once economic factors (as specified by Kaufmann, 1991 and Cleveland and Kaufmann, 1991) are taken into account, we obtain time-varying estimates of the ultimate recoverable reserves that are generally higher than the estimates obtained on the assumption that economic factors do not matter. The use of the various models for forecasting the rate of recovery over the 1986-90 period also suggests that our formulation of the Hubbert model in terms of oil production is preferable to his own formulation, which is cast in terms of cumulative output levels.

10 We also carried out a forecasting exercise where oil price forecasts obtained from a geometric random walk model were used instead of actual prices in forecasting the rate of recovery. These forecasts, with mean forecast error of around 0.010, and root mean squared forecast error of 0.029, turned out to be more accurate than the output forecasts using actual prices.

Acknowledgements

The authors are grateful to Robert Kaufmann for helpful discussions, and for providing us with the data used in this paper; to two anonymous referees for useful comments on an earlier draft of this paper; and to the Isaac Newton Trust of Trinity College, Cambridge, for partial financial support. This paper has been presented at the National Bureau of Economic Research Workshop on Economic Forecasting held in Boston on 23 April 1995. The work on the paper was completed when the first author was visiting the Research Department of the IMF. He thanks the IMF for its hospitality. The views expressed in the paper are those of the authors and do not necessarily reflect the views of the International Monetary Fund or any other institution.

References

Adelman, M.A. and H.D. Jacoby, 1979, Alternative methods of oil supply forecasting, in: R.S. Pindyck, ed., The production and pricing of energy resources (JAI Press, Greenwich, CT).

Cleveland, C.J. and R.K. Kaufmann, 1991, Forecasting ultimate oil recovery and its rate of production: Incorporating economic forces into the models of M. King Hubbert, Energy Journal 12, 17-46.

Favero, C. and M.H. Pesaran, 1994, Oil investment in the North Sea, Economic Modelling 11, 308-329.

Hubbert, M.K., 1956, Nuclear energy and the fossil fuel, Drilling and Production Practice, American Petroleum Institute, Washington D.C.

Hubbert, M.K., 1962, Energy resources, National Academy of Sciences Publication 1000-D, National Research Council, Washington D.C.

Hubbert, M.K., 1967, Degree of advancement of petroleum exploration in United States, American Association of Petroleum Geologists Bulletin 51, 2207-2227.

Kaufmann, R.K., 1988, Higher oil prices: Can OPEC raise prices by cutting production?, PhD dissertation, University of Pennsylvania, Ann Arbor, MI.

Kaufmann, R.K., 1991, Oil production in the lower 48 states: Reconciling curve fitting and econometric models, Resources and Energy 13, 111-127.

Pesaran, M.H., 1990, An econometric analysis of the exploration and extraction of oil in the UK continental shelf, Economic Journal 100, 367-391.

Pesaran, M.H. and B. Pesaran, 1991, Microfit 3.0: An interactive econometric software package (Oxford University Press, Oxford).

Pindyck, R.S., ed., 1979, The production and pricing of energy resources (JAI Press, Greenwich, CT).