Energy Economics ELSEVIER
Energy Economics 18 (1996) 295-314
A regional linear logit fuel demand model for electric utilities
Carlisle E. Moody*
Department of Economics, College of William and Mary, Williamsburg, VA 23187-8795, USA
Abstract
We investigate a short-term forecasting and simulation model of utility fuel demand based on the linear logit model. The forecasting properties of the model are surprisingly good for such a simple econometric model of factor demand. The model ignores the distinctive characteristics of electric utilities (load curves, wheeling, etc.) yet produces remarkably good forecasts at both the national and regional levels. Because the model is based on neoclassical theory it can be used for simulations, making it more useful, and less ad hoc, than a pure forecasting model.
JEL classification: C53; Q41; L94
Keywords: Linear logit; Forecasting; Simulation; Utility fuel demand
1. Introduction
The Energy Information Administration (EIA) is the independent energy forecasting and analysis agency for the federal government. EIA forecasts, among other things, the short-run demand and supply of energy; these forecasts are published in the Short Term Energy Outlook. The forecasts are based on the Short Term Integrated Forecasting System (STIFS), a large energy model. An important component of energy demand is fuel demand by electric utilities. Consequently, an important component of STIFS is the utility fuel demand model. EIA is often asked to analyse the effects of proposed energy policies. This mission requires that STIFS be a simulation model as well as a forecasting model. Simulation and forecasting must be done by the same model to preserve consistency among published results. Therefore, the model must satisfy two possibly conflicting goals: accurate forecasts and good simulation properties. The most accurate short-run forecasts are likely to be achieved with a pure time series model such as ARIMA or vector autoregression (VAR). However, such models cannot be guaranteed to obey the neoclassical properties of symmetry, monotonicity, etc., which are necessary for a well behaved simulation model. Optimizing models such as the translog, which are derived from hypothesized cost functions and cost minimization, can be constrained to satisfy at least some of the neoclassical requirements. They are typically used to examine substitution possibilities among inputs, but are seldom used for forecasting those inputs.
In this paper we investigate a linear logit forecasting and simulation model of utility fuel demand based on standard neoclassical factor demand theory. This model has been proposed as the utility fuel demand component of STIFS. The model is potentially very simple because we can assume that fuel demands are weakly separable from capital, labour and other inputs in the short run. Although the linear logit model is based on theory, it is not based on a presumed cost function like the translog model. Therefore, we compare the logit and translog models below to determine which is likely to be better behaved in simulation exercises. Because neither the translog nor the logit model is widely used in forecasting, we compare out-of-sample forecasts of the translog and logit models with each other and with the forecasts from a vector autoregression. The VAR is included to estimate the cost, in terms of forecasting accuracy, of the requirement that the model satisfy the neoclassical demand conditions.
0140-9883/96/$15.00 © 1996 Elsevier Science B.V. All rights reserved. PII S0140-9883(96)00017-5
2. Theory
Historically, input demand equations have been derived in two ways. The first approach consisted of assuming a production function, maximizing profits, and using the first order conditions as the factor demand equations. The primary difficulties with this approach are the restrictive assumption of the particular form of the production function necessary to solve analytically for the marginal productivity conditions and the assumption of profit maximization under pure competition. The second, more recent, approach assumes a general form of the cost function and applies Shephard's lemma to determine the optimal factor inputs. A neoclassical cost function must satisfy the requirements that it be continuous, non-decreasing, concave and homogeneous of degree one in input prices. For cost minimizing producers, the conditional demand equations must be non-negative and homogeneous of degree zero in input prices. Finally, the Hessian matrix derived from the cost function must be symmetric and negative semidefinite. Lau (1986) has shown that no parsimonious functional form of the cost function satisfies all of these conditions simultaneously. Nevertheless, several parsimonious cost function models have been suggested, of which the translog (Christiansen, 1973) is the most widely used.
An alternative approach is to move directly to the factor demand equations themselves. One version of this approach is to define conditional demand functions, $x(w, y)$, where $w$ is a vector of factor prices and $y$ is the level of output. The demand functions are conditional on the level of output and cost minimization. From Shephard's lemma,
$$x_i = \frac{\partial C}{\partial w_i}, \qquad s_i = \frac{w_i x_i}{C} = \frac{w_i x_i}{\sum_{j=1}^{n} w_j x_j} \tag{1}$$
where $C$ is cost and $s_i$ is the input cost share for input $i$. It remains, however, to specify the input demand functions more completely. A convenient specification is the following logistic function (Considine and Mount, 1984):
$$s_i = \frac{w_i x_i}{C} = \frac{e^{f_i}}{\sum_{j=1}^{n} e^{f_j}} \tag{2}$$
with the function $f_i$ specified as
$$f_i = a_i + \sum_{j=1}^{n} c_{ij} \ln w_j + g_i \ln y \tag{3}$$
Taking the logarithm of (2) yields the linear logit model of cost shares. Factor demand functions must exhibit the following properties:
(i) Input levels must be non-negative.
(ii) Demand functions must be homogeneous of degree zero in input prices.
(iii) The matrix $[\partial x_i / \partial w_j]$ must be symmetric and negative semidefinite, implying that own price effects are negative and cross price effects are symmetric.
The linear logit model has several advantages for estimating systems of input demands. The first is that, unlike the translog model, it cannot yield negative predicted fuel shares (although, for practical purposes, negative shares predicted by the translog can simply be set to zero). The multiplicative error structure of the linear logit model is consistent with the normality assumption typically required for statistical inference, and the logit specification does not place any restrictions on the autoregressive properties of the structural error terms (Chavas and Segerson, 1986). Also, Considine (1989) has shown that if the model satisfies the concavity conditions at the point where symmetry is imposed, global concavity is assured. Applied translog models frequently violate concavity restrictions (Diewart and Wales, 1987).
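The non-negativity property can be illustrated with a minimal sketch of Eqs. (2)-(3). The function name `logit_shares` and all coefficient values below are hypothetical and chosen only for illustration; they are not estimates from this paper.

```python
import math

def logit_shares(a, c, g, prices, output):
    """Cost shares from Eqs. (2)-(3): s_i = exp(f_i) / sum_j exp(f_j)."""
    n = len(prices)
    log_w = [math.log(w) for w in prices]
    f = [a[i] + sum(c[i][j] * log_w[j] for j in range(n)) + g[i] * math.log(output)
         for i in range(n)]
    f_max = max(f)  # subtracting the maximum leaves the shares unchanged but avoids overflow
    exp_f = [math.exp(v - f_max) for v in f]
    total = sum(exp_f)
    return [e / total for e in exp_f]

# Hypothetical coefficients for a three-input illustration.
a = [0.5, -0.2, 0.0]
c = [[-0.3, 0.2, 0.1],
     [0.2, -0.4, 0.2],
     [0.1, 0.2, -0.3]]
g = [0.1, -0.1, 0.0]
shares = logit_shares(a, c, g, prices=[2.7, 3.1, 1.5], output=100.0)
```

Because each share is a ratio of exponentials, every element of `shares` is strictly positive and the shares sum to one whatever coefficient values are supplied.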
The primary drawback of the linear logit specification is that symmetry can only be enforced at one set of cost shares, usually the mean. However, symmetry can be imposed for more than one set of cost shares in the linear logit model by redefining the point-symmetric version of the model to hold for each set of cost shares in the sample. The model is then iterated to convergence (Considine, 1990). In this way the linear logit model is guaranteed to be symmetric for all predicted cost shares in the sample. However, symmetry cannot be guaranteed in out of sample forecasts. Finally, the predicted shares from the demand equations can be interpreted as numerical approximations to the price terms in a log quadratic cost function. Thus the cost function can be estimated recursively. This implies that the input demands will not correspond to any particular cost function. The linear logit model is not an optimizing model and is ad hoc in this sense.
The logit model is developed as follows (Considine, 1990). Consider a set of $n$ non-homothetic cost share equations for an electric utility approximated by a logistic model:
$$S_{it} = \frac{\exp(f_{it})}{\sum_{j=1}^{n} \exp(f_{jt})} \tag{4}$$
where
$$f_{it} = a_i + \sum_{j=1}^{n} c_{ijt} \ln w_{jt} + g_i \ln Y_t + h_i \mathrm{HDD}_t + e_{it} \tag{5}$$
for all $i$. The $w_{it}$ are the input prices, $Y_t$ is the level of output, and $\mathrm{HDD}_t$ are heating degree days. The price coefficients vary across the sample points so that (4) forms a family of cost share equations. Homogeneity of degree zero can be imposed if
$$\sum_{j=1}^{n} c_{ijt} = d \quad \text{for all } i, \tag{6}$$
where $d$ is an arbitrary constant. Symmetry can be imposed similarly if
$$S_{it} c_{ijt} = S_{jt} c_{jit} \quad \text{for all } i \neq j, \tag{7}$$
which can be imposed at the predicted cost shares if
$$c^*_{ij} = c^*_{ji} \quad \text{for all } i \neq j, \tag{8}$$
where
$$c^*_{ij} = c_{ijt} / S^p_{jt} \quad \text{for all } i \neq j, \tag{9}$$
and $S^p_{jt}$ is the predicted share of the $j$th fuel at time $t$.
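How restrictions (6)-(9) fit together can be sketched as follows, assuming $d = 0$. Given a symmetric matrix of starred coefficients and a set of shares, the off-diagonal price coefficients follow from rearranging Eq. (9) and the diagonal follows from Eq. (6). The function name and the numbers are hypothetical.

```python
def price_coefficients(c_star, shares):
    """Build c_ij from symmetric c*_ij at the given shares, with d = 0 in Eq. (6)."""
    n = len(shares)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                c[i][j] = shares[j] * c_star[i][j]  # Eq. (9) rearranged: c_ij = S_j * c*_ij
        # Eq. (6) with d = 0: each row sums to zero, which pins down the diagonal.
        c[i][i] = -sum(c[i][j] for j in range(n) if j != i)
    return c

# Hypothetical symmetric c* matrix (the diagonal is unused) and shares.
c_star = [[0.0, 0.4, 0.1],
          [0.4, 0.0, 0.3],
          [0.1, 0.3, 0.0]]
shares = [0.5, 0.3, 0.2]
c = price_coefficients(c_star, shares)

# Largest violations of symmetry, Eq. (7), and of homogeneity, Eq. (6).
symmetry_gap = max(abs(shares[i] * c[i][j] - shares[j] * c[j][i])
                   for i in range(3) for j in range(3) if i != j)
row_sum_gap = max(abs(sum(row)) for row in c)
```

Both gaps are zero up to rounding, but only at the shares used to build `c`, which is why the model is re-estimated at the predicted shares for every observation.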
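We can restate the share equations as
$$\begin{aligned}
\ln\left(\frac{S_{it}}{S_{nt}}\right) ={}& (a_i - a_n) - \left[\sum_{k=1}^{i-1} S^p_{kt} c^*_{ki} + \sum_{k=i+1}^{n} S^p_{kt} c^*_{ik} + S^p_{it} c^*_{in}\right] \ln\left(\frac{w_{it}}{w_{nt}}\right) \\
&+ \sum_{k=1}^{i-1} (c^*_{ki} - c^*_{kn}) S^p_{kt} \ln\left(\frac{w_{kt}}{w_{nt}}\right) + \sum_{k=i+1}^{n-1} (c^*_{ik} - c^*_{kn}) S^p_{kt} \ln\left(\frac{w_{kt}}{w_{nt}}\right) \\
&+ (g_i - g_n) \ln Y_t + (h_i - h_n) \mathrm{HDD}_t + (e_{it} - e_{nt})
\end{aligned} \tag{10}$$
where
$$S^p_{it} = \frac{\exp\left((f_{it} - f_{nt})^p\right)}{\sum_{j=1}^{n-1} \exp\left((f_{jt} - f_{nt})^p\right) + 1} \tag{11}$$
and $(f_{it} - f_{nt})^p$ is the predicted logarithmic share ratio from Eq. (10). The identifying restrictions required for estimation are $a_n = g_n = h_n = d = 0$. Given continuity, a cost function corresponding to the equation system (10) exists:
$$\sum_{i=1}^{n} \int \left(\frac{\partial \ln C_t}{\partial \ln w_{it}}\right) d\ln w_{it} + g_t(Y_t, \mathrm{HDD}_t) = \ln\left[C_t(w_{1t}, \ldots, w_{nt}, Y_t, \mathrm{HDD}_t)\right] \tag{12}$$
where $g_t(Y_t, \mathrm{HDD}_t)$ represents terms that are independent of the input prices. A numerical approximation of this integral is possible if the $S^p_{it}$ are assumed to represent point estimates of the integral of the derivative in Eq. (12) for each observation:
$$\sum_{i=1}^{n} \int \left(\frac{\partial \ln C_t}{\partial \ln w_{it}}\right) d\ln w_{it} = \sum_{i=1}^{n} S^p_{it} \ln w_{it} = \sum_{i=1}^{n} (S_{it} + \epsilon_{it}) \ln w_{it} \tag{13}$$
Substituting (13) into (12) and assuming a second order approximation for $g_t(Y_t, \mathrm{HDD}_t)$ allows us to derive the following cost function:
$$\ln C_t = b_0 + \sum_{i=1}^{n} S^p_{it} \ln w_{it} + b_1 \ln Y_t + b_2 \mathrm{HDD}_t + b_3 (\ln Y_t)^2 + b_4 \mathrm{HDD}_t^2 + b_5 \ln Y_t \, \mathrm{HDD}_t + e_{ct} \tag{14}$$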
Applying Shephard's lemma to (14) shows that the share errors are independent of the error term in the cost function. Thus we can estimate the share equations first and then, using the estimated shares, estimate the cost function recursively. Once the share equations (10) and the cost function (14) have been estimated, the demands for the various fuels can be calculated from the predicted shares, the fuel prices and the predicted total cost.
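The last step, recovering fuel demands from predicted shares, prices and predicted cost, follows directly from the share definition $S_i = w_i x_i / C$. The sketch below uses hypothetical shares and a hypothetical predicted cost; the prices are rounded values of the order reported for the sample means.

```python
import math

def fuel_demands(shares, prices, log_cost):
    """Return x_i = S_i * C / w_i for each fuel, with C = exp(ln C)."""
    cost = math.exp(log_cost)
    return [s * cost / w for s, w in zip(shares, prices)]

# Hypothetical inputs for four fuels (natural gas, residual, distillate, coal).
shares = [0.20, 0.15, 0.01, 0.64]   # predicted cost shares, summing to one
prices = [2.68, 3.06, 4.51, 1.52]   # dollars per million BTU
log_cost = math.log(3.0e8)          # hypothetical predicted log total cost
demands = fuel_demands(shares, prices, log_cost)

# Spending implied by the demands adds back up to total cost.
implied_cost = sum(w * x for w, x in zip(prices, demands))
```

Since the shares sum to one, `implied_cost` reproduces the predicted total cost, so the demands are internally consistent by construction.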
3. Data
The data are provided by EIA by census division (nine divisions) and by month for the period 1985:1 to 1990:11.¹ The census divisions are defined in Table 1. Data on fuel consumed by electric utilities in each census division come from EIA Form 759, in physical units, for four fuels: natural gas (NG), residual fuel (RF), distillate fuel (DK) and coal (CL). Conversion to millions of BTUs is achieved as follows. Receipts of fuel in physical units and in millions of BTUs are available from EIA Form 423. Dividing receipts in millions of BTUs by receipts in physical units yields a conversion factor in millions of BTUs per physical unit. Multiplying the consumption data from Form 759 by this conversion factor yields consumption in millions of BTUs. Expenditures for these fuels are also available from Form 423, so that, by dividing expenditures by receipts in millions of BTUs, we can derive fuel prices in dollars per million BTUs. Total cost is the sum of total expenditures across these four fuels. The data set also contains electrical generation in kilowatt hours by fuel. Total output is the sum of electrical generation across these four fuel types. With respect to weather and other potential exogenous variables, heating degree days (HDD) are available for the same period. Cooling degree days and generating capacity in megawatts by fuel type are also available, but did not prove useful in the estimation. Seasonal dummies also failed to be significant.
The time period is determined by data availability. However, because of the fluctuations of energy markets in the 1970s and early 1980s, we decided to use data after 1984. For testing purposes, we restrict the model estimation to data before 1990 so that we have 11 months of data to serve as an out of sample test for forecasting accuracy. We therefore estimate each regional model on 60 data points: 1985:1 to 1989:12. Variable names, definitions, number of observations, and sample means are presented in Table 2.

Table 1
Census divisions
1  New England          CT, ME, MA, NH, RI, VT
2  Middle Atlantic      NY, NJ, PA
3  East North Central   IL, IN, MI, OH, WI
4  West North Central   IA, KS, MN, MO, NE, ND, SD
5  South Atlantic       DE, MD, DC, VA, NC, SC, GA, FL
6  East South Central   AL, KY, MS, TN
7  West South Central   AR, LA, OK, TX
8  Mountain             AZ, CO, ID, MT, NM, UT, WY, NV
9  Pacific              AK, CA, HI, OR, WA

¹ The data used in this paper are available from the author on request.

Table 2
Variables and sample means
Var    Label                                          N     Mean
X1     Natural gas consumption (million BTUs)         693   409336.992
X2     Residual fuel consumption (million BTUs)       702   11517395.574
X3     Distillate fuel consumption (million BTUs)     693   919959.416
X4     Coal consumption (million BTUs)                693   142056595.791
W1     Price of natural gas ($/million BTUs)          693   2.678
W2     Price of residual fuel ($/million BTUs)        693   3.055
W3     Price of distillate fuel ($/million BTUs)      693   4.506
W4     Price of coal ($/million BTUs)                 693   1.521
S1     Cost share of natural gas                      540   0.200
S2     Cost share of residual fuel                    540   0.147
S3     Cost share of distillate fuel                  540   0.014
S4     Cost share of coal                             540   0.640
Cost   Total cost                                     540   322615518.032
Y      Total output (kWh)                             693   17351978.035
HDD    Heating degree days                            639   383.937
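The unit conversion described in the data section can be written out as a short sketch. The function names and the numbers are hypothetical; only the arithmetic (receipts-based conversion factor, then consumption in BTUs, then price per million BTU) follows the text.

```python
def conversion_factor(receipts_mmbtu, receipts_physical):
    """Millions of BTUs per physical unit, from Form 423 receipts."""
    return receipts_mmbtu / receipts_physical

def consumption_in_mmbtu(consumption_physical, factor):
    """Form 759 consumption in physical units converted to millions of BTUs."""
    return consumption_physical * factor

def fuel_price(expenditure_dollars, receipts_mmbtu):
    """Fuel price in dollars per million BTUs."""
    return expenditure_dollars / receipts_mmbtu

# Hypothetical monthly figures for one fuel in one census division.
receipts_physical = 1000.0      # physical units received
receipts_mmbtu = 6000.0         # the same receipts in millions of BTUs
consumption_physical = 800.0    # physical units consumed
expenditure = 15000.0           # dollars spent on the receipts

factor = conversion_factor(receipts_mmbtu, receipts_physical)
consumption = consumption_in_mmbtu(consumption_physical, factor)
price = fuel_price(expenditure, receipts_mmbtu)
```

With these inputs the factor is 6.0 million BTUs per unit, consumption is 4800 million BTUs and the price is 2.50 dollars per million BTUs. Total cost for a division would then be the sum of expenditures over the four fuels.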
4. Estimation strategy
The model is estimated in three steps. First, we apply non-linear iterated seemingly unrelated regression (ITSUR), also known as iterated Zellner efficient least squares (ITZELS), to the share equation system (10). This technique is equivalent to maximum likelihood, which ensures that the estimates are independent of the choice of fuel to be omitted. For our purposes we omit the coal share equation. In the first stage we use the actual shares in place of the endogenous variables $S^p_{it}$ that appear on the right-hand side of Eq. (10). This stage yields the initial estimates for the coefficients in the cost share equations. The second stage of the estimation process consists of taking the initial predicted cost shares from (11), substituting them into (10), and re-estimating using ITSUR. The predicted cost shares from this iteration are used on the right-hand side of the next ITSUR estimation of (10). This process is repeated to convergence. This concludes the estimation of the cost share equations. The third step consists of estimating the cost equation (14) using the predicted cost shares from the final iteration of the previous stage. The cost function is estimated by ordinary least squares. All estimation is done using PROC MODEL in the SAS, Version 6 statistical system (SAS 1988).
The complete model consists of the relative share equations (10), the absolute share equations (11), the cost function (14), and equations that combine cost and shares to yield input demands. For the four-fuel model considered here the complete simulation model is shown as equation set (15) below. This non-linear equation system is solved simultaneously for each month for each of the nine census divisions.
$$\begin{aligned}
f_1 ={}& a_1 - \left(S_2 c_{12} + S_3 c_{13} + (S_1 + S_4) c_{14}\right) \ln(w_1/w_4) \\
&+ (c_{12} - c_{24}) S_2 \ln(w_2/w_4) + (c_{13} - c_{34}) S_3 \ln(w_3/w_4) + g_1 \ln Y + h_1 \mathrm{HDD} \\
f_2 ={}& a_2 + (c_{12} - c_{14}) S_1 \ln(w_1/w_4) - \left(S_1 c_{12} + S_3 c_{23} + (S_2 + S_4) c_{24}\right) \ln(w_2/w_4) \\
&+ (c_{23} - c_{34}) S_3 \ln(w_3/w_4) + g_2 \ln Y + h_2 \mathrm{HDD} \\
f_3 ={}& a_3 + (c_{13} - c_{14}) S_1 \ln(w_1/w_4) + (c_{23} - c_{24}) S_2 \ln(w_2/w_4) \\
&- \left(S_1 c_{13} + S_2 c_{23} + (S_3 + S_4) c_{34}\right) \ln(w_3/w_4) + g_3 \ln Y + h_3 \mathrm{HDD} \\
\ln C ={}& a_0 + S_1 \ln w_1 + S_2 \ln w_2 + S_3 \ln w_3 + S_4 \ln w_4 \\
&+ a_4 \ln Y + a_5 \mathrm{HDD} + a_6 (\ln Y)^2 + a_7 \mathrm{HDD}^2 + a_8 (\ln Y)(\mathrm{HDD}) \\
C ={}& \exp(\ln C) \\
S_1 ={}& \exp(f_1) / \left(\exp(f_1) + \exp(f_2) + \exp(f_3) + 1\right) \\
S_2 ={}& \exp(f_2) / \left(\exp(f_1) + \exp(f_2) + \exp(f_3) + 1\right) \\
S_3 ={}& \exp(f_3) / \left(\exp(f_1) + \exp(f_2) + \exp(f_3) + 1\right) \\
S_4 ={}& 1 / \left(\exp(f_1) + \exp(f_2) + \exp(f_3) + 1\right) \\
x_1 ={}& (S_1 C)/w_1 \\
x_2 ={}& (S_2 C)/w_2 \\
x_3 ={}& (S_3 C)/w_3 \\
x_4 ={}& (S_4 C)/w_4
\end{aligned} \tag{15}$$
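Because the shares appear on both sides of equation set (15), the system has to be solved jointly. A minimal sketch, assuming a simple fixed-point iteration on the shares (the paper does not state which solver is used), is given below. All coefficient values are hypothetical and deliberately small so that the iteration settles quickly; the last fuel in each list plays the role of the omitted base fuel.

```python
import math

def share_step(S, a, g, h, c_star, w, Y, HDD):
    """One pass through the f_i and S_i equations of (15); the last fuel is the base."""
    n = len(w)
    base = n - 1
    L = [math.log(w[j] / w[base]) for j in range(n)]  # log price ratios ln(w_j / w_n)
    f = []
    for i in range(base):
        others = [k for k in range(base) if k != i]
        own = -(sum(S[k] * c_star[i][k] for k in others)
                + (S[i] + S[base]) * c_star[i][base]) * L[i]
        cross = sum((c_star[i][k] - c_star[k][base]) * S[k] * L[k] for k in others)
        f.append(a[i] + own + cross + g[i] * math.log(Y) + h[i] * HDD)
    denom = sum(math.exp(v) for v in f) + 1.0
    return [math.exp(v) / denom for v in f] + [1.0 / denom]

def solve_model(a, g, h, c_star, cost_coef, w, Y, HDD, tol=1e-12, max_iter=500):
    """Iterate the shares to a fixed point, then compute cost and fuel demands."""
    S = [1.0 / len(w)] * len(w)  # start from equal shares
    for _ in range(max_iter):
        S_new = share_step(S, a, g, h, c_star, w, Y, HDD)
        converged = max(abs(p - q) for p, q in zip(S_new, S)) < tol
        S = S_new
        if converged:
            break
    a0, a4, a5, a6, a7, a8 = cost_coef
    lnY = math.log(Y)
    lnC = (a0 + sum(s * math.log(p) for s, p in zip(S, w))
           + a4 * lnY + a5 * HDD + a6 * lnY ** 2 + a7 * HDD ** 2 + a8 * lnY * HDD)
    C = math.exp(lnC)
    x = [s * C / p for s, p in zip(S, w)]
    return S, C, x

# Hypothetical coefficients; c_star is symmetric and its diagonal is unused.
a = [-1.2, -1.5, -3.8]
g = [0.02, 0.01, 0.0]
h = [0.0005, 0.0002, 0.0]
c_star = [[0.0, 0.10, 0.05, 0.08],
          [0.10, 0.0, 0.06, 0.04],
          [0.05, 0.06, 0.0, 0.03],
          [0.08, 0.04, 0.03, 0.0]]
cost_coef = (1.0, 1.0, 0.0002, 0.0, 0.0, 0.0)
w = [2.678, 3.055, 4.506, 1.521]
S, C, x = solve_model(a, g, h, c_star, cost_coef, w, Y=1.7e7, HDD=384.0)
residual = max(abs(p - q) for p, q in
               zip(S, share_step(S, a, g, h, c_star, w, 1.7e7, 384.0)))
```

Plain fixed-point iteration is only one option and is not guaranteed to converge for larger coefficients; a Newton-type solver would be a natural alternative for a full implementation.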
The corresponding translog model, equation (16) below, is similarly estimated using iterated seemingly unrelated regressions (ITSUR) and solved for each of the nine census divisions.
$$\begin{aligned}
S_1 ={}& a_1 + b_1 \mathrm{HDD} + d_{11} \ln(w_1/w_4) + d_{12} \ln(w_2/w_4) + d_{13} \ln(w_3/w_4) \\
S_2 ={}& a_2 + b_2 \mathrm{HDD} + d_{12} \ln(w_1/w_4) + d_{22} \ln(w_2/w_4) + d_{23} \ln(w_3/w_4) \\
S_3 ={}& a_3 + b_3 \mathrm{HDD} + d_{13} \ln(w_1/w_4) + d_{23} \ln(w_2/w_4) + d_{33} \ln(w_3/w_4) \\
\ln C ={}& a_0 + \ln Y + a_1 \ln w_1 + a_2 \ln w_2 + a_3 \ln w_3 + (1 - (a_1 + a_2 + a_3)) \ln w_4 \\
&+ b_0 \mathrm{HDD} + b_{00} \mathrm{HDD}^2 + b_1 \ln(w_1) \mathrm{HDD} + b_2 \ln(w_2) \mathrm{HDD} + b_3 \ln(w_3) \mathrm{HDD} \\
&+ (1 - (b_1 + b_2 + b_3)) \ln(w_4) \mathrm{HDD} \\
&+ \tfrac{1}{2} \Big( d_{11} (\ln w_1)^2 + d_{12} \ln w_1 \ln w_2 + d_{13} \ln w_1 \ln w_3 + (-(d_{11} + d_{12} + d_{13})) \ln w_1 \ln w_4 \\
&\quad + d_{12} \ln w_1 \ln w_2 + d_{22} (\ln w_2)^2 + d_{23} \ln w_2 \ln w_3 + (-(d_{12} + d_{22} + d_{23})) \ln w_2 \ln w_4 \\
&\quad + d_{13} \ln w_1 \ln w_3 + d_{23} \ln w_2 \ln w_3 + d_{33} (\ln w_3)^2 + (-(d_{13} + d_{23} + d_{33})) \ln w_3 \ln w_4 \\
&\quad + (-(d_{11} + d_{12} + d_{13})) \ln w_1 \ln w_4 + (-(d_{12} + d_{22} + d_{23})) \ln w_2 \ln w_4 + (-(d_{13} + d_{23} + d_{33})) \ln w_3 \ln w_4 \\
&\quad - \left(-(d_{11} + d_{12} + d_{13}) - (d_{12} + d_{22} + d_{23}) - (d_{13} + d_{23} + d_{33})\right) (\ln w_4)^2 \Big) \\
C ={}& \exp(\ln C) \\
S_4 ={}& 1 - (S_1 + S_2 + S_3) \\
x_1 ={}& (S_1 C)/w_1 \\
x_2 ={}& (S_2 C)/w_2 \\
x_3 ={}& (S_3 C)/w_3 \\
x_4 ={}& (S_4 C)/w_4
\end{aligned} \tag{16}$$
Finally, we estimate and solve the following VAR model for each of the census divisions. For each fuel $i = 1, \ldots, 4$,
$$\begin{aligned}
\ln x_{it} ={}& a_i + \sum_{j=1}^{4} b_{ij} \ln x_{j,t-1} + \sum_{j=1}^{4} c_{ij} \ln x_{j,t-2} + \sum_{j=1}^{4} d_{ij} \ln w_{jt} \\
&+ d_{i5} \ln \mathrm{HDD}_t + d_{i6} \ln Y_t + d_{i7} t \\
x_{it} ={}& \exp(\ln x_{it})
\end{aligned} \tag{17}$$
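A multi-step forecast from a system like (17) can be sketched as below. The sketch assumes a dynamic forecast, in which each month's predicted log demands become the lags for the following month; the paper does not say whether forecasts were generated this way, so this is an assumption, as are the function name and the toy coefficients.

```python
import math

def var_forecast(a, B1, B2, D, x_lag1, x_lag2, exog_path):
    """Dynamic forecast of fuel demands from a two-lag VAR in logs.

    a: intercepts; B1, B2: first- and second-lag coefficient matrices;
    D: coefficients on the exogenous terms (log prices, log HDD, log Y, trend);
    exog_path: one list of exogenous values per forecast month, already in the
    form in which they enter the equation.
    """
    n = len(a)
    lag1 = [math.log(v) for v in x_lag1]
    lag2 = [math.log(v) for v in x_lag2]
    path = []
    for z in exog_path:
        current = [a[i]
                   + sum(B1[i][j] * lag1[j] for j in range(n))
                   + sum(B2[i][j] * lag2[j] for j in range(n))
                   + sum(D[i][k] * z[k] for k in range(len(z)))
                   for i in range(n)]
        path.append([math.exp(v) for v in current])
        lag2, lag1 = lag1, current  # predicted values feed the next month's lags
    return path

# Toy one-fuel case: a pure random walk in logs keeps the last observed level.
flat = var_forecast(a=[0.0], B1=[[1.0]], B2=[[0.0]], D=[[0.0]],
                    x_lag1=[5.0], x_lag2=[3.0], exog_path=[[1.0], [1.0]])

# Toy two-fuel case with no dynamics: the forecast is exp(intercept) each month.
const = var_forecast(a=[0.0, math.log(2.0)],
                     B1=[[0.0, 0.0], [0.0, 0.0]], B2=[[0.0, 0.0], [0.0, 0.0]],
                     D=[[0.0], [0.0]], x_lag1=[1.0, 1.0], x_lag2=[1.0, 1.0],
                     exog_path=[[0.0]])
```

Because the equations are in logs, exponentiating the forecasts always gives positive demands, which is consistent with the absence of negative VAR demands reported in the diagnostics.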
5. Results
The results of the estimation for all three models for the nine census divisions are available from the author on request. Summary statistics are presented in Tables 3-5. Generally speaking, there is not much to choose between the linear logit and translog specifications in terms of their goodness of fit. The fractional degrees of freedom arise from the fact that coefficients are shared across equations. Both models explain cost very well and have remarkable success for fuel share equations based on monthly data for several regions, as well as notable failures, especially among the little used fuels in certain regions. The Durbin-Watson statistics are somewhat better for the linear logit model, although the overall level is somewhat low for both. There is quite likely some residual autocorrelation in both models, although it appears to be smaller in the linear logit case. The translog model cannot be constrained to yield positive shares and neither it nor the linear logit model can be constrained to yield concavity. If concavity is not guaranteed, the fuel demand equations are likely to be not well behaved and therefore not useful for simulation purposes. Similarly, the VAR model cannot be constrained to yield non-negative fuel demands or negative own price elasticities. The models are tested for negative demands, positive eigenvalues (indicating non-concavity), and positive own price elasticities at each observation. These model diagnostics are presented in Table 6. The VAR model shows no negative fuel demands, while the translog model yields only nine negative shares. The logit model, of course, never yields negative demands. With respect to non-concavity, both the logit and translog model yield positive eigenvalues. However, the translog model yields 50% more instances than the logit model. This result seems to
Table 3
Summary statistics, linear logit model
Equation  DF Model  DF Error  SSE        MSE        Root MSE  R-Square  Adj R-Sq  Durbin-Watson
Region 1
F1        5         55        22.0666    0.40121    0.63341   0.8229    0.8100    1.706
F2        5         55        3.0363     0.05520    0.23496   0.3323    0.2837    0.898
F3        5         55        7.6170     0.13849    0.37214   0.6539    0.6287    2.164
LNC       6         54        0.0433     0.0008018  0.02832   0.9878    0.9867    1.318
Region 2
F1        5         55        4.5315     0.08239    0.28704   0.8222    0.8093    0.817
F2        5         55        0.9969     0.01813    0.13463   0.7707    0.7540    1.268
F3        5         55        8.2056     0.14919    0.38626   0.5839    0.5537    1.792
LNC       6         54        0.0652     0.001207   0.03473   0.9677    0.9647    0.650
Region 3
F1        5         55        4.1884     0.07615    0.27596   0.2184    0.1615    1.019
F2        5         55        14.0525    0.25550    0.50547   0.1090    0.0442    1.016
F3        5         55        2.2673     0.04122    0.20304   0.5499    0.5171    1.406
LNC       6         54        0.006195   0.0001147  0.01071   0.9895    0.9885    1.662
Region 4
F1        5         55        3.8235     0.06952    0.26366   0.5410    0.5076    1.222
F2        5         55        66.5494    1.20999    1.09999   0.0508    -0.0182   1.562
F3        5         55        5.7183     0.10397    0.32244   0.3283    0.2794    1.405
LNC       6         54        0.007795   0.0001444  0.01201   0.9881    0.9870    1.909
Region 5
F1        5         55        2.3172     0.04213    0.20526   0.6806    0.6574    0.756
F2        5         55        3.0021     0.05458    0.23363   0.5648    0.5332    1.305
F3        5         55        4.3136     0.07843    0.28005   0.5815    0.5511    1.225
LNC       6         54        0.0185     0.0003434  0.01853   0.9838    0.9823    1.214
Region 6
F1        5         55        12.2117    0.22203    0.47120   0.4488    0.4087    0.596
F2        5         55        960.4836   17.46334   4.17892   0.1096    0.0449    1.007
F3        5         55        5.0205     0.09128    0.30213   0.5482    0.5154    1.201
LNC       6         54        0.008824   0.0001634  0.01278   0.9925    0.9918    1.719
Region 7
F1        5         55        1.0680     0.01942    0.13935   0.7113    0.6903    0.951
F2        5         55        150.3778   2.73414    1.65352   0.4003    0.3567    1.282
F3        5         55        18.4104    0.33474    0.57856   0.5036    0.4675    1.199
LNC       6         54        0.0211     0.0003915  0.01979   0.9899    0.9889    1.113
Region 8
F1        5         55        5.7331     0.10424    0.32286   0.4793    0.4415    0.988
F2        5         55        54.7257    0.99501    0.99750   0.2038    0.1459    1.839
F3        5         55        3.2323     0.05877    0.24242   0.4285    0.3870    1.821
LNC       6         54        0.0382     0.0007071  0.02659   0.9501    0.9455    1.176
Region 9
F1        5         55        21.9560    0.39920    0.63182   0.5377    0.5041    1.037
F2        5         55        24.8712    0.45220    0.67246   0.5187    0.4837    1.194
F3        5         55        18.8010    0.34184    0.58467   0.6325    0.6058    1.072
LNC       6         54        0.0424     0.0007858  0.02803   0.9920    0.9912    0.982
Table 4
Summary statistics, translog model
Equation  DF Model  DF Error  SSE        MSE         Root MSE  R-Square  Adj R-Sq  Durbin-Watson
Region 1
S1        2.167     57.83     0.0994     0.001718    0.04145   0.6627    0.6559    1.397
S2        2.167     57.83     0.1960     0.003389    0.05822   0.4218    0.4101    1.016
S3        2.167     57.83     0.008341   0.0001442   0.01201   0.4297    0.4182    1.471
LC        8.5       51.5      0.0482     0.0009367   0.03061   0.9864    0.9844    1.360
Region 2
S1        2.167     57.83     0.0312     0.0005399   0.02324   0.8820    0.8797    1.011
S2        2.167     57.83     0.0741     0.001282    0.03580   0.7346    0.7293    1.065
S3        2.167     57.83     0.0168     0.0002907   0.01705   -0.0838   -0.1056   1.196
LC        8.5       51.5      0.0578     0.001123    0.03351   0.9713    0.9671    1.340
Region 3
S1        2.167     57.83     0.000810   0.000014    0.003743  0.1441    0.1268    0.966
S2        2.167     57.83     0.003847   0.0000665   0.008156  -0.0043   -0.0246   1.318
S3        2.167     57.83     0.000224   3.8815E-6   0.001970  0.4503    0.4392    1.417
LC        8.5       51.5      0.006336   0.000123    0.01109   0.9892    0.9877    1.770
Region 4
S1        2.167     57.83     0.005204   0.00009     0.009486  0.3905    0.3782    1.127
S2        2.167     57.83     0.000114   1.9661E-6   0.001402  -0.0047   -0.0250   1.701
S3        2.167     57.83     0.000748   0.0000129   0.003597  0.2320    0.2165    1.391
LC        8.5       51.5      0.009344   0.0001814   0.01347   0.9857    0.9837    1.714
Region 5
S1        2.167     57.83     0.006300   0.0001089   0.01044   0.7175    0.7118    0.838
S2        2.167     57.83     0.0596     0.001031    0.03211   0.1108    0.0929    0.974
S3        2.167     57.83     0.002931   0.0000507   0.007119  0.1339    0.1164    1.033
LC        8.5       51.5      0.0233     0.0004523   0.02127   0.9796    0.9767    1.110
Region 6
S1        2.167     57.83     0.0127     0.00022     0.01483   0.3870    0.3746    0.846
S2        2.167     57.83     0.002999   0.0000519   0.007201  0.0455    0.0262    1.285
S3        2.167     57.83     0.000733   0.0000127   0.003560  0.2581    0.2431    1.051
LC        8.5       51.5      0.0161     0.0003117   0.01765   0.9863    0.9843    1.230
Region 7
S1        2.167     57.83     0.0860     0.001488    0.03857   0.6105    0.6027    0.611
S2        2.167     57.83     0.002120   0.0000367   0.006055  0.2823    0.2678    1.341
S3        2.167     57.83     0.002921   0.0000505   0.007107  0.0658    0.0470    0.830
LC        8.5       51.5      0.0226     0.0004388   0.02095   0.9892    0.9876    0.823
Region 8
S1        2.167     57.83     0.0529     0.0009146   0.03024   0.4034    0.3914    0.857
S2        2.167     57.83     0.000610   0.0000105   0.003248  0.1487    0.1316    1.513
S3        2.167     57.83     0.000294   5.0759E-6   0.002253  0.1806    0.1641    1.404
LC        8.5       51.5      0.0397     0.00077     0.02775   0.9482    0.9406    1.110
Region 9
S1        2.167     57.83     0.2956     0.005112    0.07149   0.4173    0.4056    1.165
S2        2.167     57.83     0.2724     0.004710    0.06863   0.3589    0.3460    1.214
S3        2.167     57.83     0.001126   0.0000195   0.004413  0.4390    0.4277    0.955
LC        8.5       51.5      0.1476     0.002866    0.05353   0.9721    0.9681    0.521
Table 5
Summary statistics, VAR model
Equation  DF Model  DF Error  SSE        MSE        Root MSE  R-Square  Adj R-Sq  Durbin-Watson
Region 1
X1        16        42        17.1101    0.40738    0.63827   0.8803    0.8375    1.919
X2        16        42        0.3619     0.008616   0.09282   0.9008    0.8654    2.074
X3        16        42        3.0005     0.07144    0.26728   0.8778    0.8342    2.293
X4        16        42        0.6138     0.01461    0.12089   0.6945    0.5854    2.016
Region 2
X1        16        42        2.1016     0.05004    0.22369   0.9174    0.8879    1.691
X2        16        42        0.4389     0.01045    0.10223   0.9164    0.8865    1.603
X3        16        42        4.1079     0.09781    0.31274   0.8252    0.7627    2.069
X4        16        42        0.0400     0.0009531  0.03087   0.9208    0.8925    1.752
Region 3
X1        16        42        2.1538     0.05128    0.22645   0.6871    0.5753    1.875
X2        16        42        7.0991     0.16903    0.41113   0.6296    0.4973    2.032
X3        16        42        1.3685     0.03258    0.18051   0.5397    0.3753    1.993
X4        16        42        0.005486   0.0001306  0.01143   0.9873    0.9828    1.612
Region 4
X1        16        42        1.4742     0.03510    0.18735   0.8945    0.8569    2.361
X2        16        42        57.8322    1.37696    1.17344   0.2087    -0.0739   2.089
X3        16        42        3.4738     0.08271    0.28759   0.5685    0.4144    2.140
X4        16        42        0.0108     0.0002565  0.01602   0.9876    0.9832    1.725
Region 5
X1        16        42        0.8977     0.02137    0.14620   0.8485    0.7944    1.792
X2        16        42        1.7273     0.04113    0.20280   0.8638    0.8152    1.775
X3        16        42        1.9841     0.04724    0.21735   0.8063    0.7371    2.363
X4        16        42        0.0151     0.0003607  0.01899   0.9774    0.9693    2.080
Region 6
X1        16        42        4.2758     0.10181    0.31907   0.8177    0.7526    2.106
X2        16        42        489.3137   11.65033   3.41326   0.5474    0.3858    1.811
X3        16        42        2.5108     0.05978    0.24450   0.6707    0.5531    1.827
X4        16        42        0.0125     0.0002981  0.01726   0.9815    0.9750    2.201
Region 7
X1        16        42        0.0949     0.002259   0.04753   0.9672    0.9556    1.971
X2        16        42        96.9584    2.30853    1.51939   0.5812    0.4316    2.057
X3        16        42        11.5379    0.27471    0.52413   0.6076    0.4675    1.845
X4        16        42        0.0853     0.002031   0.04507   0.9337    0.9101    1.901
Region 8
X1        16        42        2.2338     0.05319    0.23062   0.7626    0.6778    1.968
X2        16        42        38.7176    0.92185    0.96013   0.4982    0.3190    2.009
X3        16        42        2.2461     0.05348    0.23125   0.2953    0.0436    2.113
X4        16        42        0.0149     0.0003537  0.01881   0.9868    0.9821    1.925
Region 9
X1        16        42        0.3951     0.009407   0.09699   0.9014    0.8662    1.825
X2        16        42        3.7188     0.08854    0.29756   0.7013    0.5946    1.912
X3        16        42        0.7011     0.01669    0.12920   0.6528    0.5289    2.130
X4        16        42        10.2474    0.24399    0.49395   0.8036    0.7335    2.355
Table 6
Model diagnostics: negative shares, positive roots, positive own price elasticities
Region   Negative demands    Positive roots    Positive elasticities
         VAR     TL          LL      TL        LL     TL     VAR
1        0       7           0       119       0      43     60
2        0       0           60      120       60     60     120
3        0       0           120     122       85     50     120
4        0       0           60      92        33     86     60
5        0       0           70      152       74     90     60
6        0       1           114     118       177    177    180
7        0       1           116     180       187    145    120
8        0       0           97      66        168    41     60
9        0       0           87      120       100    52     120
Total    0       9           724     1089      884    744    900
indicate that the logit model is likely to be considerably better behaved in practice than the translog model. However, the most obvious case of badly behaved models is incorrect own-price elasticities. We calculated the own-price elasticities for each
Fig. 1. Out of sample forecast, USA, natural gas (horizontal axis: forecast month, 1-11).
fuel at each observation for all regions. The results are shown in columns 6-8 in Table 6. Since the VAR has constant elasticities, we count each positive elasticity as 60 observations since it would hold for every observation. Surprisingly, the logit model yields 20% more positive own price elasticities than the translog model. Not surprisingly, the VAR model seems to be the worst behaved model. Thus, as a simulation model, the logit model appears to be only slightly better than the translog model, but considerably better than the VAR model. The best test for a forecasting model is its performance in out of sample forecasts. We kept 11 months of data aside for this purpose. The following exogenous variables are taken as given: heating degree days, the prices of the four

Table 7
Mean out of sample forecasts and forecast errors (a)
Region  Series   NG       RF        DK      CL        WPCIE
1       Actual   6.07     24.88     0.32    13.56
        LOGIT    4.80     25.93     0.51    13.30
        %|E|     80.2     12.8      69.8    17.6      15.6
        TRNSLG   5.33     25.73     0.52    12.94
        %|E|     250.6    18.7      73.6    20.4      19.8
        VAR      5.65     26.32     0.52    13.85
        %|E|     55.4     15.1      74.0    15.6      19.0
2       Actual   24.07    32.89     1.41    170.22
        LOGIT    27.42    28.90     2.19    115.11
        %|E|     38.3     18.3      60.3    4.1       9.2
        TRNSLG   32.25    28.96     2.28    111.07
        %|E|     54.0     16.4      84.9    2.7       10.5
        VAR      40.58    30.33     3.81    111.20
        %|E|     58.8     9.3       160.0   2.7       13.5
3       Actual   3.59     1.53      0.85    313.24
        LOGIT    2.30     2.67      1.06    312.52
        %|E|     34.7     92.4      26.8    1.2       2.1
        TRNSLG   2.35     2.58      0.99    309.82
        %|E|     33.3     92.0      19.6    1.2       2.0
        VAR      3.03     1.75      0.96    310.79
        %|E|     16.0     50.0      14.6    1.0       1.4
4       Actual   3.72     0.01      0.33    148.84
        LOGIT    3.31     0.05      0.34    146.16
        %|E|     28.5     104.3     37.5    2.12      2.8
        TRNSLG   3.38     0.03      0.34    146.87
        %|E|     26.4     19070.7   43.9    1.7       2.3
        VAR      3.19     0.06      0.32    145.23
        %|E|     17.8     12056.7   42.2    2.5       3.1
5       Actual   20.21    27.30     2.13    264.36
        LOGIT    16.75    28.77     2.18    272.20
        %|E|     16.1     9.0       25.0    3.0       4.5
        TRNSLG   18.15    25.97     2.25    269.60
        %|E|     12.0     21.8      37.9    2.4       4.7
        VAR      18.26    29.60     3.05    165.17
        %|E|     11.3     15.6      51.8    1.3       3.3
6       Actual   6.23     0.72      0.30    153.35
        LOGIT    4.45     0.06      0.35    156.67
        %|E|     25.8     127.2     26.6    2.0       3.3
        TRNSLG   4.57     0.36      0.41    155.46
        %|E|     26.1     3802.1    41.1    1.6       3.0
        VAR      4.47     6.79      0.47    156.21
        %|E|     27.7     10816.6   69.3    1.7       6.6
7       Actual   129.54   0.16      0.41    157.49
        LOGIT    139.72   0.11      0.37    149.06
        %|E|     12.6     124.0     19.1    6.6       8.9
        TRNSLG   141.20   0.30      0.65    147.68
        %|E|     12.9     498.7     79.0    7.6       9.9
        VAR      127.47   0.07      0.80    162.50
        %|E|     6.0      83.6      93.2    4.3       5.0
8       Actual   7.31     0.18      0.29    162.75
        LOGIT    7.19     0.16      0.32    163.57
        %|E|     31.1     177.76    27.4    1.4       2.5
        TRNSLG   8.12     0.32      0.36    164.54
        %|E|     28.2     453.3     37.9    1.4       2.4
        VAR      10.54    0.26      0.29    158.37
        %|E|     60.1     462.7     19.5    3.0       5.0
9       Actual   43.92    9.53      1.15    7.75
        LOGIT    51.57    7.15      0.85    4.28
        %|E|     20.5     32.2      25.0    46.1      25.7
        TRNSLG   50.46    6.90      0.82    6.24
        %|E|     16.8     44.4      31.5    36.9      22.2
        VAR      41.90    10.02     1.14    4.99
        %|E|     7.1      23.3      8.5     40.0      14.1
US      Actual   244.68   97.20     7.18    1020.28
        LOGIT    257.52   93.81     8.18    1020.59
        %|E|     9.9      6.5       17.2    1.5       3.0
        TRNSLG   265.8    91.16     8.62    1014.74
        %|E|     11.7     8.6       24.4    2.0       3.9
        VAR      255.09   105.21    11.36   1017.83
        %|E|     8.8      11.3      57.8    1.2       3.7
(a) Actual is the mean fuel use in trillions of BTUs over the 11 months of the forecast period. %|E| is the corresponding mean absolute percentage forecast error. WPCIE is the weighted absolute percentage error.
fuels, and the total demand for electricity for each of the nine census divisions. These are variables that will have to be forecast by other models in STIFS. The out of sample forecasts are summarized in Table 7. The table presents average actual and predicted values over the 11 forecast months for each model for each region and each fuel as well as the corresponding mean absolute percentage errors. Since each region usually has one or two dominant fuels with the remaining fuels relegated to very small roles, we calculate the weighted absolute percentage error as an overall measure of forecast accuracy. There are some large percentage errors in Table 7. However, these errors occur in regions where the fuel involved is a
small proportion of total utility fuel use, so that the errors, as the weighted percentage error shows, are not as important as they appear. No single model predicts uniformly better than the others. All the models predict the dominant fuel in each region better than the minor fuels. Also, the forecasts are remarkably consistent across models. In terms of weighted error, the VAR model predicts best in four regions, while the logit and translog models each predict best in two regions and tie in region 8. However, the logit model predicts total fuel use somewhat better than either the translog or VAR models. Because this model will be used primarily to forecast utility fuel demand at the national level, we present the aggregate national forecasts graphically in Figs. 1-4. Examination of the four figures reveals that all the models predict the monthly pattern of utility fuel use quite accurately, confirming the impressions gleaned from the percentage errors in Table 6. None of the models has any trouble forecasting coal demand. Surprisingly, the VAR model appears to yield the least accurate forecasts of the other fuels, showing large errors in the peak months of June, July and August. Between the translog and the logit, there is not much to choose, although the logit model is slightly more accurate. In summary, the forecasting properties of all the models are quite good. This is remarkable given that the translog and logit models are simple econometric models
Fig. 2. Out-of-sample forecast, USA, residual fuel.
Fig. 3. Out-of-sample forecast, USA, distillate fuel oil.
based on a very general theory of factor demand. None of the models takes into account the distinctive characteristics of electric utilities (load curves, wheeling, etc.), yet they produce forecasts that are more than adequate at the national level and remarkably good at the regional level. The logit and translog forecasting capabilities are also remarkably good given that they are primarily simulation models that can also be used for forecasting. Nevertheless, because they can be used for simulation, they are more useful, and less ad hoc, than a pure forecasting model. Between the translog and the linear logit model, the logit model comes out slightly ahead on several counts. It does not produce any negative demands, it has many fewer positive eigenvalues than the translog, and it was slightly more accurate in out-of-sample forecasts. On the other hand, the translog model had fewer positive own-price elasticities and generated only nine negative fuel demands out of a possible 540. Thus, the linear logit model is preferred, on balance, for short-term simulation and forecasting of utility fuel demands.
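The two accuracy measures used in the comparison, the mean absolute percentage error and the weighted absolute percentage error, can be illustrated with a minimal Python sketch. The paper does not spell out how the weights are formed, so the sketch assumes that each fuel is weighted by its share of mean actual fuel use; the function and variable names are illustrative and not part of the original study. The inputs are the US rows for the logit model in Table 7.

```python
def mean_abs_pct_error(actual, forecast):
    """Mean absolute percentage error over paired monthly values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("need two non-empty series of equal length")
    total = sum(abs(f - a) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def weighted_abs_pct_error(mean_actual, pct_error):
    """Weight each fuel's %|E| by its share of mean actual fuel use.

    ASSUMPTION: share-of-use weights; the paper does not state its weights.
    """
    total_use = sum(mean_actual.values())
    return sum(mean_actual[fuel] / total_use * pct_error[fuel]
               for fuel in mean_actual)


# US rows of Table 7: mean actual use (trillion BTU) and LOGIT %|E| by fuel.
us_actual = {"NG": 244.68, "RF": 97.20, "DK": 7.18, "CL": 1020.28}
us_logit_pct_error = {"NG": 9.9, "RF": 6.5, "DK": 17.2, "CL": 1.5}

us_logit_weighted = weighted_abs_pct_error(us_actual, us_logit_pct_error)
print(f"LOGIT weighted %|E| under assumed weights: {us_logit_weighted:.1f}")
```

Because the weights here are built from 11-month means, the result need not equal the corresponding WP|E| entry in Table 7, which may have been computed with different weights. The sketch does show why a very large percentage error on a minor fuel has little effect on the weighted measure.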
Fig. 4. Out-of-sample forecast, USA, coal.
Acknowledgements
This paper has been significantly improved by the suggestions of an anonymous referee, who bears no responsibility for remaining deficiencies. I would also like to thank David Costello of the Energy Information Administration for providing the data as well as the inspiration for this study.