Journal of Econometrics 7 (1978) 23-55. © North-Holland Publishing Company

A MONTE CARLO STUDY OF AUTOREGRESSIVE INTEGRATED MOVING AVERAGE PROCESSES

Warren DENT
The University of Iowa, Iowa City, IA 52242, USA

An-Sik MIN
Edinboro State College, Edinboro, PA 16412, USA

Received September 1976, final version received June 1977

Six of the simpler ARMA type models are examined with respect to properties of a variety of proposed estimators of unknown parameters. The findings suggest that if only one estimation method were available to a researcher the choice should probably be maximum likelihood. Stationarity- and invertibility-restricted estimation would appear appropriate when parameters are thought to be within 5 percent of constraint boundaries.
1. Introduction
Publication of Box and Jenkins' book (1970) on forecasting and control techniques with Autoregressive Integrated Moving Average (ARIMA) processes has had a profound impact on statistical time series analysis. Application of their procedures appears in models of telephone demand [Thompson-Tiao (1971)], agricultural commodity pricing [Leuthold et al. (1970)], GNP component analysis [Nelson (1972)], rail freight cartage [Dent-Swanson (1978)], and consumer credit [Sullivan-Marcis (1975)], to name only a few diverse areas. Success in modelling seasonal time series is common, and in situations involving univariate processes a strong standard for forecast performance of alternative models is usually established. The basic form of an ARIMA process is

$$\tilde w_t = \phi_1 \tilde w_{t-1} + \cdots + \phi_p \tilde w_{t-p} + a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q}, \qquad (1)$$

where $\tilde w_t = w_t - \mu$ is a mean-corrected stationary random variable and $a_t$ is a random shock with mean zero, unknown variance $\sigma^2$, uncorrelated with $a_s$, $s \neq t$, and the parameters $\{\phi_i, \theta_i\}$ satisfy certain stationarity and invertibility requirements. In most applications $w_t$ is a transformation of a basic variable $z_t$. Common transformations include first and second differences, which eliminate linear and quadratic data trends, and logarithms, which reduce heterogeneity [Granger-Newbold (1976)].
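To make eq. (1) concrete, the following minimal Python sketch (ours, not part of the original study; numpy is assumed to be available) generates a sample path of a stationary, invertible ARMA process by direct recursion on the defining equation.

```python
import numpy as np

def simulate_arma(phi, theta, n, sigma2=1.0, burn=200, seed=0):
    """Generate n observations from eq. (1) with mu = 0.

    phi and theta are the AR and MA coefficient vectors; `burn` extra
    start-up values are discarded (a convention of this sketch, not the
    authors' procedure, to wash out the zero initial conditions)."""
    rng = np.random.default_rng(seed)
    p, q = len(phi), len(theta)
    a = rng.normal(0.0, np.sqrt(sigma2), n + burn)   # white-noise shocks
    w = np.zeros(n + burn)
    for t in range(max(p, q), n + burn):
        w[t] = (sum(phi[i] * w[t - 1 - i] for i in range(p))
                + a[t]
                - sum(theta[j] * a[t - 1 - j] for j in range(q)))
    return w[burn:]

# e.g. a stationary, invertible (1, 1) model
w = simulate_arma([0.5], [0.3], 100)
```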
Assuming normality for the shocks, minimum mean square error forecasts of future values of $w_t$ are the conditional means of marginal distributions. Given parameter values, point estimates of these means and estimated error variances may be calculated readily [see Box-Jenkins (1970, ch. 5)]. A serious problem however is the choice of appropriate parameter estimates.

2. Estimation procedures
Box and Jenkins (1970, ch. 7) concentrate on two least squares estimation procedures. Conditional least squares estimates of the $(\phi, \theta)$ parameters are determined by setting unknown pre-sample $a_t$ values and $a_1, \ldots, a_p$ to zero and minimizing $\sum_{t=p+1}^{n} [a_t]^2$, where $n$ is the sample size and $[a_t]$ denotes the expectation of $a_t$ conditional on $(\phi, \theta)$, the sample values $w_1, \ldots, w_n$, and $\mu$ [Box-Jenkins (1970, p. 213)]. The parameter $\mu$ is often assumed to be zero, or may be replaced by $\bar w = n^{-1}\sum_{t=1}^{n} w_t$ [Box-Jenkins (1970, p. 210)]. If desired, it may be included as an additional parameter to be estimated. Unconditional least squares estimates of the coefficient parameters $(\phi, \theta)$ are found using a 'backcasting' technique to estimate pre-sample $a_t$ and $w_t$ terms, and then minimizing $\sum_{t=-N}^{n} [a_t]^2$ for some large $N$. Aigner (1971, p. 357) reports that Box and Jenkins indicate that the conditional least squares procedure works well for many non-seasonal models, but that one would expect it not to in complicated models or where sample size is small. For seasonal processes Box and Jenkins claim (1970, p. 220) that 'the conditional approximation becomes less satisfactory and the unconditional sum of squares should ordinarily be computed'. Nelson's (1974) Monte Carlo findings and those described below suggest that one should not be too hasty in discarding the conditional least squares estimator. With the additional assumption of normality on the white noise terms, it is possible to consider maximum likelihood estimators of the parameter coefficients. Whittle (1962) shows for large samples that the conditional least squares procedure then 'effectively yields' maximum likelihood estimates. Box and Jenkins (1970, app. A7.4) demonstrate that the unconditional least squares procedure yields 'true' maximum likelihood estimates which would be obtained were needed $w_{t-j}$ values treated explicitly as parameters [Aigner (1971, p. 357)]. The maximum likelihood estimators are known to be consistent, asymptotically normal [Whittle (1951, ch. 7)], and asymptotically efficient [Aigner (1971, p. 361, note 8)]. In the absence of normality the unconditional least squares estimators have been proven to have the same asymptotic distribution as the maximum likelihood estimates [Walker (1964) and Whittle (1962)]. Despite the strong evidence that unconditional least squares estimators are asymptotically as good as maximum likelihood estimators, the issue of nonlinearity in estimation continues to attract researchers. Pagan (1970) and Phillips (1966) develop iterative 'maximum likelihood' estimation methods, but
the resultant estimates are still conditional on pre-sample $w_t$ values. Pagan's (1970) small Monte Carlo studies for sample sizes $n = 40$ and $n = 70$ suggest little gain for his method over conditional least squares on a minimum mean square error criterion. The results of Hendry and Trivedi (1970) in a more extensive study suggest however that the extra computation involved in this maximum likelihood estimation procedure is worth the effort. Nelson (1974) in his study of the so-called (0, 1, 1) model finds that the conditional least squares estimators exhibit surprisingly good properties relative to unconditional and 'true' maximum likelihood estimators. Our findings below support the contention that conditional least squares is very effective for pure moving average models [$p = 0$ in eq. (1)]. Others who have studied estimation of ARIMA models include Thornber (1967), who examines maximum likelihood, Bayesian, and other procedures, and Durbin (1959), who argues that an autoregressive model ($q = 0$) can be estimated consistently by least squares and that one can get arbitrarily close to maximum likelihood estimates of moving average parameters by choosing the equivalent autoregressive model of sufficiently high order. Thornber found finite sample properties of maximum likelihood estimators to be not necessarily good. Anderson (1975) suggests a scoring procedure which yields approximations to maximum likelihood estimates, and Dent (1977) develops a numerically accurate and efficient computational procedure for calculating the exact likelihood associated with any given ARIMA process. This procedure has been used with non-calculus oriented search algorithms to determine maximum likelihood estimates.
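As an illustration of the conditional least squares criterion described above, the sketch below (a hedged example, not the authors' FORTRAN code) computes $\sum [a_t]^2$ for a given $(\phi, \theta)$ by the recursion implied by eq. (1), with pre-sample shocks set to zero. The estimates are then the minimizers of this criterion, which in practice would be handed to a numerical optimizer.

```python
import numpy as np

def conditional_sum_of_squares(params, w, p, q):
    """Conditional least squares criterion for eq. (1) with mu = 0:
    pre-sample shocks are set to zero, the residuals [a_t] are built
    recursively, and the criterion sums [a_t]^2 over t = p+1, ..., n."""
    phi, theta = params[:p], params[p:p + q]
    n = len(w)
    a = np.zeros(n)
    for t in range(p, n):
        a[t] = (w[t]
                - sum(phi[i] * w[t - 1 - i] for i in range(p))
                + sum(theta[j] * (a[t - 1 - j] if t - 1 - j >= 0 else 0.0)
                      for j in range(q)))
    return np.sum(a[p:] ** 2)
```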
3. Aim of report

This study reports on an extensive Monte Carlo analysis of a variety of estimators for parameters of ARIMA processes. Aigner (1971, p. 369) in his survey of ARIMA models calls for a comparison of properties of maximum-likelihood type estimators. The basic thrust of our analysis is in response to this, although a variety of other estimators is also considered. The major bulk of the work, and some not herein reported, forms Min's (1975) Ph.D. dissertation.
4. Design of study

Monte Carlo studies are usually limited by institutional needs for research economy. In the present case this amounted primarily to a limit on available funds for computation. The design of a Monte Carlo study must usually be optimized over a variety of investigable facets and criteria. In analysis of ARIMA processes there appear to be four main areas of concern, viz. (i) sample size, (ii) type of model, (iii) regions of parameter spaces, and (iv) type of estimator. Beyond these considerations is an overriding concern with the number of
simulation trials which will be adequate for inferential purposes. Halton (1970) provides an extensive bibliography on the Monte Carlo method, but there exists no uniformly accepted procedure for optimizing Monte Carlo design. While study design is critical, the requirement of succinct but informative result reporting creates a challenge which often further limits study effectiveness. We attempt here to follow the general thesis and most of the explicit suggestions of Hoaglin and Andrews (1975) in reporting statistical computation based results.

With respect to sample size we select $n = 100$ throughout the investigations. We note that Durbin considers 100 small [see Aigner (1971, p. 359, note 6)], while Hendry and Trivedi (1970) and Pagan (1970) think values in the range 25-75 are small. Our motivation for application of ARIMA processes stems from econometrics, and the availability these days of approximately 25 years of quarterly observations on U.S. national accounts leads us to consider a sample size of 100 as appropriate for many applications. We note also that Nelson (1974) considered sample sizes of 30 and 50 in his studies. We record parenthetically that within Min's (1975) study selected models and parameter sets were indeed examined at sample sizes of $n = 200$ and 400, and that Monte Carlo sample biases were unchanged, sample variances and mean square errors were reduced by expected amounts, and rankings of estimators never varied. Detailed results of these particular investigations will not be given below however.

We characterize the model (1) as a $(p, q)$ ARIMA process. This Monte Carlo study evaluates the models described as (1, 0), (2, 0), (3, 0), (0, 1), (0, 2), and (1, 1). A seasonal model is examined by Min (1975) but results are not reproduced here. Thus, three pure autoregressive models, two pure moving average models, and the simplest mixed model are examined. For each model a number of sets of parameter values are analyzed. For the (1, 0) model twelve values of the parameter $\phi_1$ are used; for the (2, 0) model there are twenty-one sets of $(\phi_1, \phi_2)$ values; and for the (3, 0) model thirteen sets of $(\phi_1, \phi_2, \phi_3)$ values are used. Computation is more difficult and expensive for models with moving average parameters, so that thirteen values are examined for the (0, 1) model, nine pairs of $(\theta_1, \theta_2)$ values for the (0, 2) model, and eleven pairs of $(\phi_1, \theta_1)$ values for the (1, 1) model. Parameter choices are made in an attempt to provide representation throughout stationarity and invertibility regions but with some emphasis on near-boundary positions. Details will be given later for particular cases. For all models and for all coefficient parameter sets the variance of the noise terms is fixed at unity ($\sigma^2 = 1$).

5. Estimators in autoregressive models

For autoregressive models, a number of estimators is examined. These
include (i) the (unconditional) least squares estimator determined using the formulas given in Box-Jenkins (1970, pp. 277-278), (ii) an approximate maximum likelihood estimator (p. 278), (iii) the Yule-Walker estimator (pp. 278-279), and (iv) a regression estimator wherein $\tilde w_t$ is regressed on $\tilde w_{t-1}, \ldots, \tilde w_{t-p}$, $t = p+1, \ldots, n$ (this being essentially the conditional least squares estimator).

Kendall's (1949) estimator extends the estimating equations of the Yule-Walker estimator to higher-order sample correlations. Let $r_j$ be the $j$th sample autocorrelation of $w_1, \ldots, w_n$, and let $m$ be an integer greater than the order of autoregression $p$. Set $R$ as the $m \times m$ matrix with element $(i, j)$ given by $\sum_{k=1}^{m} r_{k-i+1}\, r_{k-j+1}$. Then Kendall's estimate (v) of $\phi_j$ is given by

$$\hat\phi_j = -R_{1,j+1}/R_{1,1}, \qquad j = 1, \ldots, p,$$

where $R_{ij}$ is the cofactor of element $(i, j)$ in $R$. Note that Kendall estimates the autocovariance at lag $k$ by

$$c_k = \frac{1}{n-k}\sum_{t=1}^{n-k}(w_t - \bar w)(w_{t+k} - \bar w), \qquad k = 0, \ldots, m,$$
whereas Box and Jenkins (1970, p. 32) use $n$ in the divisor. The sample autocorrelations in either case are determined as $r_j = c_j/c_0$, $j = 0, \ldots, m$, with $r_{-j}$ defined as $r_j$. For the models studied here $m$ is set at $m = p+1$.

Quenouille (1957) also suggests for $(p, 0)$ processes estimators which make use of higher-order sample autocorrelations. His estimation scheme is based on maximizing an asymptotic likelihood function and the derived normal equations are highly nonlinear. Chanda (1961) illustrates the difficulties and suggests simpler approximating versions. Appropriate formulas (correcting those of Chanda for the (2, 0) model) for the Quenouille estimates (vi) are

(1, 0) model: $\hat\phi_1 = r_1 + 2r_1(r_2 - r_1^2)^2(1 - r_1^2)^{-2}$,
(2, 0) model:

$$\hat\phi_1 = \phi_1^* - n^{-1}\bigl(2\tau^2 r_1^2 - 2 - 6r_1 - \phi_1^*\bigr), \qquad \hat\phi_2 = \phi_2^* + n^{-1}\bigl(2\tau^2\phi_2^* - 4\tau\phi_1^*\bigr),$$

where $\phi_1^*$, $\phi_2^*$ are the unconditional least squares estimates, and $\tau$ is given as

$$\tau = \bigl(c_3 + \phi_1^* c_2 + \phi_2^* c_1\bigr)\big/\bigl[(1 - r_1^2)(1 - \phi_2^*)\bigr].$$

The autocovariances $c_j$ are computed with divisor $n - j$.
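By contrast with the Quenouille corrections, the Yule-Walker estimator (iii) has a simple closed form; the following sketch (illustrative only; the divisor-$n$ autocovariances of Box-Jenkins are used, and $n-k$ would give Kendall's variant) solves the usual Toeplitz system in the sample autocorrelations.

```python
import numpy as np

def yule_walker(w, p):
    """Yule-Walker estimates: solve the p x p Toeplitz system built
    from the sample autocorrelations r_1, ..., r_p."""
    n = len(w)
    wc = w - w.mean()
    # autocovariances c_0, ..., c_p with divisor n (Box-Jenkins convention)
    c = np.array([np.sum(wc[:n - k] * wc[k:]) / n for k in range(p + 1)])
    r = c / c[0]
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, r[1:])   # (phi_1, ..., phi_p)
```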
None of these six estimators is guaranteed to satisfy stationarity restrictions. The final estimator examined is the exact maximum likelihood estimator (vii). A later section is devoted entirely to discussion of the definition of this estimator and the techniques used in its calculation. Not all estimators are considered for all autoregressive models. The seven estimators are applied to (1, 0) and (2, 0) models, but the Quenouille estimator is not examined for the (3, 0) model.

For estimation methods other than the maximum likelihood principle, estimates of error variance are computed as

$$\hat\sigma^2 = \frac{1}{n}\sum_{t=p+1}^{n}\bigl(\tilde w_t - \hat\phi_1 \tilde w_{t-1} - \cdots - \hat\phi_p \tilde w_{t-p}\bigr)^2,$$
following Box and Jenkins' (1970, p. 280) suggestion. The maximum likelihood case is described later. We note that, without correction for degrees of freedom lost in coefficient parameter estimation, the divisor $n$ in $\hat\sigma^2$ may bias this estimator of $\sigma^2$ downwards.

6. Estimators in moving average and mixed models

Only five estimators are considered for the two moving average models and the one mixed model. These include: (i) the unconditional least squares estimator, and (ii) the conditional least squares estimator.

Walker (1961, especially pp. 347-349) attempts to approximate the maximum likelihood estimator for a $(0, q)$ process using information contained in higher-order sample autocorrelations and sample autocovariances of the process in question. His estimation scheme is based on the likelihood function of the asymptotic distribution of sample autocorrelations $r_1, \ldots, r_m$, for a fixed positive integer $m > q$. Divisors $n - j$ for $c_j$ are used in forming $r_j = c_j/c_0$. He shows that as $n \to \infty$, the joint distribution of $n^{1/2}(r_k - \rho_k)$ tends to $N(0, W)$, where $W = (w_{ij})$ is a covariance matrix with

$$w_{ij} = \sum_{u}\bigl[\rho_u\rho_{u+i-j} + \rho_u\rho_{u+i+j} + 2\rho_i\rho_j\rho_u^2 - 2\rho_i\rho_u\rho_{u+j} - 2\rho_j\rho_u\rho_{u+i}\bigr],$$

and where $\rho_j$ is the theoretical autocorrelation at lag $j$. Using this, the log-likelihood function of the sample autocorrelations is approximated by

$$L = -\tfrac{1}{2}\ln|W| - \tfrac{n}{2}\sum_{i,j=1}^{m}(r_i - \rho_i)\,w^{ij}\,(r_j - \rho_j),$$

where $W^{-1} = (w^{ij})$ is the inverse of $W$. By maximizing the log-likelihood
function, estimates of $\rho$ are found as

$$\hat\rho_k = r_k + \sum_{s=q+1}^{m} \alpha_{ks}\, r_s, \qquad k = 1, \ldots, q,$$

where the weights $\alpha_{ks}$ are functions of the autocorrelations. Estimation of $\rho$ is achieved through an iterative process starting with $\rho_i = r_i$, and one iteration ordinarily is enough. Given the estimates for $\rho_k$, $k = 1, \ldots, q$, estimates of $\theta$ are determined from formulas relating $\rho_j$ to $\theta_j$, $j = 1, \ldots, q$ [see Box-Jenkins (1970, p. 68)]. Walker's estimate as described here (iii) is calculated only for the (0, 1) model, with $m = 4$. The parameter $\sigma^2$ is estimated as $c_0/(1 + \hat\theta_1^2)$ using formulas in Box-Jenkins (1970, p. 69).

For the (1, 1) model a related derivation is used by Walker (1962, particularly §6). His procedures suggest calculating provisional estimates $\rho_1^*$ and $\phi_1^*$ by
$$\rho_1^* = r_1 - \frac{\gamma_1^*}{1 + 2\gamma_1^{*2}}\, S_{2,1}^*, \qquad \phi_1^*\rho_1^* = r_1 - \frac{2\gamma_1^*(1 + \phi_1^* r_1)}{1 + 2\gamma_1^{*2}}\, S_{3,1}^*,$$

where

$$\gamma_1^* = (\rho_1 - \phi_1^*)\big/(1 + \phi_1^{*2} - 2\phi_1^*\rho_1), \qquad S_{3,1}^* = r_3 - 2\phi_1^* r_2 + \phi_1^{*2} r_1.$$
Next, coefficients $c_{ij}$, $i = 1, 2$, $j = 3, 4, 5$, are determined from sets of linear equations in the quantity

$$\gamma^* = (\rho_1^* - \phi_1^*)\big/(1 + \phi_1^{*2} - 2\phi_1^*\rho_1^*).$$

Then the estimate of $\phi_1$ is computed from these coefficients and the quantities

$$S_j^* = r_j - 2\phi_1^* r_{j-1} + \phi_1^{*2} r_{j-2}, \qquad j = 3, 4, 5,$$

and the estimate of $\theta_1$ is computed from the relation between $\rho_1$ and $(\phi_1, \theta_1)$. The parameter $\sigma^2$ is estimated [see Box-Jenkins (1970, p. 76)] as

$$\hat\sigma^2 = c_0(1 - \hat\phi_1^2)\big/(1 + \hat\theta_1^2 - 2\hat\phi_1\hat\theta_1).$$
Neither Walker estimation procedure is guaranteed to meet stationarity/invertibility restrictions on the parameters.

A method of moments estimator (iv) is also computed for the (0, 1) and (1, 1) models. Formulas for both cases are given in Box-Jenkins (1970, pp. 203, 204, 499 - where on the second last line $f_j$ should be $h_j$). Corresponding estimates of $\sigma^2$ also obtain. The final estimator considered is the maximum likelihood estimator (v), and once again details are left for a later section.

7. Monte Carlo methodology

All computations were performed on an IBM 360/65 computer at The University of Iowa. Double precision arithmetic within FORTRAN programs was used throughout the study. The heart of any such simulation lies in the random number generator used to construct artificial data. Our choice for generating standard normal (white noise) variates was the McGill Super-Duper package developed by Professor George Marsaglia. To the best of our knowledge this package was first publicly introduced to statisticians and numerical analysts at the INTERFACE Symposium: Computer Science and Statistics, Berkeley, 1972. The methodology used is not described in the proceedings of that conference but is outlined in Marsaglia et al. (1976). The package is in use at some eighty installations nationally and internationally.
In the first step of the simulation $2n$ standard normal variates were generated. The last $n + p$ of these were used as $a_{1-p}, a_{2-p}, \ldots, a_0, a_1, \ldots, a_n$ in (1). The redundant $n - p$ values were discarded, having been used as an extra precautionary procedure to help avoid possible 'start-up' difficulties. For a given set of $(\phi, \theta)$ parameter values and with the mean parameter $\mu$ set to zero, $n$ sample $w_t$ values were generated using eq. (1). In all cases examined the realized sample mean $\bar w$ was zero to at least five decimal places. Only stationary and invertible models were examined. To reiterate, sample size $n$ was set at 100 and $\sigma^2$ at unity.

The second step of the simulation involved computation and storage of the appropriate estimates for the model under examination. A second set of sample values was then generated as in the first step and the estimation and storage process repeated. These two steps were performed for a specified number of trials $T$. In all cases reported here, $T$ was set to 100, although Min (1975) reports that trials involving $T = 200$ produced exactly the same qualitative results, with no changes in ranking or relative performance of the estimators on the criteria examined.

The third step of the simulation concerned analysis of the estimates obtained. Min (1975) reports his findings in ninety tables and twenty figures. These tables depict Monte Carlo sample biases, variances and mean square errors for each estimator for each parameter estimated. Asymptotic variances, test statistics for bias, and quartiles in the dispersion of the sample mean square error across estimators are presented where appropriate. While minimum mean square error is our preferred criterion, Min also examined mean absolute error with the conclusion that, again, no new information was elicited and that performance rankings of the estimators were unchanged. Histogram plots of the empirical distributions of estimates for selected maximum likelihood type estimators exhibited symmetric bell shapes, and simple chi-square goodness-of-fit tests of normality with asymptotic variances and true parameter values for means were always passed. Skewed distributions were noted for most estimators not of the maximum likelihood type.

The three steps outlined above were repeated for each set of parameter values used in examining a given model. For the six models analyzed here over 131,000 individual parameter estimates were obtained from the generation of approximately 800,000 sample $w_t$ observations.
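A condensed sketch of the trial scheme just outlined, for the (1, 0) model, follows. It is illustrative only: numpy's default generator stands in for the McGill Super-Duper package, and the regression (conditional least squares) estimator stands in for the full set of estimators examined.

```python
import numpy as np

def monte_carlo_ar1(phi1, n=100, trials=100, seed=12345):
    """Per trial: draw 2n normals, keep the last n + p as shocks,
    generate w_t from eq. (1), estimate, and accumulate Monte Carlo
    bias, variance and mean square error over the trials."""
    rng = np.random.default_rng(seed)
    estimates = np.empty(trials)
    for i in range(trials):
        a = rng.standard_normal(2 * n)[-(n + 1):]   # last n + p, with p = 1
        w = np.empty(n + 1)
        w[0] = a[0]                                 # start-up uses the pre-sample shock
        for t in range(1, n + 1):
            w[t] = phi1 * w[t - 1] + a[t]
        ws = w[1:]                                  # the n retained observations
        # regression (conditional least squares) estimator of phi_1
        estimates[i] = np.sum(ws[1:] * ws[:-1]) / np.sum(ws[:-1] ** 2)
    bias = estimates.mean() - phi1
    var = estimates.var()
    return bias, var, var + bias ** 2               # bias, variance, MSE

print(monte_carlo_ar1(0.5))
```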
8. Maximum likelihood estimation
In general it is extremely difficult to find closed analytical expressions for maximum likelihood estimators. We demonstrate below that for a (1, 0) model the maximum likelihood estimator of $\phi_1$ is the solution of a cubic equation. For a (2, 0) model simultaneous cubic and quartic equations must be solved. For models with moving average terms there exist no closed solution forms of any usefulness.
One of the more important issues to be faced in estimation of ARIMA model parameters concerns the assumptions of stationarity and invertibility pertinent to the model. Even for the simple (1, 0) model there is no guarantee that the first-order-condition cubic equation will have a root which lies between plus and minus unity. Thornber (1967) has considered this possibility, and Jenkins and Watts (1968, ch. 5) have suggested a mean likelihood estimate which responds to truncation of the likelihood function at stationarity boundaries. [This latter estimator was examined by Min (1975, p. 60), but because of the sample size used ($n = 100$), little difference from the unconstrained maximum likelihood estimator was noted.] Zellner and Geisel (1970) have discussed the issue with respect to Bayesian a priori restrictions in related models, and as Aigner (1971, p. 367) notes, the stationarity restrictions 'are enforceable rather more easily within the Bayesian framework than within the ML technique'. It is our contention that every effort should be made to satisfy such restrictions. In many economic applications such constraints are appropriate on a priori grounds [Zellner-Geisel (1970)]. Since intuition in analytic interpretation and inference on forecast errors is difficult at best with non-stationary or non-invertible behavior, we prefer to avoid such situations. Further, in the diverse array of applications of ARIMA models we are aware of only one published non-stationary model [Thompson-Tiao (1971)] other than random walk models applied to stock market behavior and other phenomena.

We therefore choose to examine only restricted maximum likelihood estimators, where such restrictions are explicitly defined to mean invertibility and/or stationarity requirements. This makes estimation much more difficult, necessitating a constrained optimization procedure. The estimation techniques suggested by Box and Jenkins (1970) employing Marquardt's (1963) algorithm do not recognize such constraints. Dent's (1977) likelihood computational algorithm checks for satisfaction of stationarity and invertibility requirements and may be used to enlighten associated unconstrained search procedures. For the Monte Carlo study reported here we have used a nonlinear constrained search program, designated the 'Box Complex Algorithm'. FORTRAN coding for this algorithm is given in Kuester-Mize (1973, ch. 10), representing the method developed by Box (1965). The procedure utilizes a sequential search technique to optimize a nonlinear objective function subject to inequality constraints, but of course does not guarantee recognition of a global optimum. For other than the (1, 0) model, the Box algorithm was used for all maximum likelihood estimation and, for the $(0, q)$ and (1, 1) models, unconditional and conditional least squares estimation also. We note that other estimators such as Yule-Walker, Quenouille, Walker, etc. were not examined for satisfaction or violation of the implicit stationarity/invertibility constraints.

Since a large number of cases examined in this study involves parameters located near constraint boundaries, the problem of identification is relevant.
While in actual application this would be a serious issue, it was not dealt with in this study. Thus, for each trial, the known generating model was assumed to be the true model, and suggestions of other potential generating processes were ignored.

8.1. Autoregressive models
The log-likelihood function for pure autoregressive processes may be written [see Box-Jenkins (1970, app. A7.5)] as

$$l(\phi, \sigma^2) = -\frac{n}{2}\ln\sigma^2 + \frac{1}{2}\ln|M_p^{(p)}| - \frac{S(\phi)}{2\sigma^2},$$

with $M_p^{(p)} = \{m_{ij}^{(p)}\} = \{\gamma_{|i-j|}\}^{-1}\sigma^2$, and where $\gamma_0, \gamma_1, \ldots, \gamma_{p-1}$ are the theoretical autocovariances of the process. Differentiating with respect to $\sigma^2$ and each of the $\phi$'s in the log-likelihood function yields

$$\partial l/\partial\sigma^2 = -(n/2\sigma^2) + (S(\phi)/2\sigma^4),$$

$$\partial l/\partial\phi_j = M_j + \sigma^{-2}\bigl(D_{1,j+1} - \phi_1 D_{2,j+1} - \cdots - \phi_p D_{p+1,j+1}\bigr), \qquad j = 1, \ldots, p,$$

where

$$M_j = \partial\bigl(\tfrac{1}{2}\ln|M_p^{(p)}|\bigr)\big/\partial\phi_j,$$

and where the $D_{i,j}$ are sums of lagged cross-products of the observations [Box-Jenkins (1970, app. A7.5)]. The exact maximum likelihood estimators may be obtained by equating these derivatives to zero and solving. Obviously, $\hat\sigma^2 = S(\hat\phi)/n$.
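For the (1, 0) case the pieces above specialize to the familiar exact AR(1) log-likelihood, and the maximization can be checked numerically. The sketch below is our illustration (not the paper's routine), taking $S(\phi) = (1-\phi^2)w_1^2 + \sum_{t=2}^{n}(w_t - \phi w_{t-1})^2$, concentrating $\sigma^2$ out, and searching the stationarity region directly with scipy.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def ar1_concentrated_loglik(phi, w):
    """Exact AR(1) log-likelihood with sigma^2 concentrated out:
    sigma^2_hat = S(phi)/n gives, up to constants,
    l*(phi) = -(n/2) ln(S(phi)/n) + (1/2) ln(1 - phi^2)."""
    n = len(w)
    S = (1 - phi**2) * w[0]**2 + np.sum((w[1:] - phi * w[:-1])**2)
    return -0.5 * n * np.log(S / n) + 0.5 * np.log(1 - phi**2)

def ar1_mle(w):
    # maximize over the (open) stationarity region |phi| < 1
    res = minimize_scalar(lambda f: -ar1_concentrated_loglik(f, w),
                          bounds=(-0.999, 0.999), method='bounded')
    return res.x
```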
For the (1, 0) process, after appropriate substitution, the first-order condition for $\phi_1$ becomes a cubic equation, say $\phi_1^3 + p\phi_1^2 + q\phi_1 + r = 0$, whose coefficients $p$, $q$, and $r$ are functions of $n$, the lagged cross-products $D_{11}$, $D_{12}$, $D_{22}$, and $w_1^2 + w_n^2$; stationarity requires $|\phi_1| < 1$. In all cases examined below, a real root to the cubic between $-1$ and $+1$ was found [Selby-Girling (1965)] as

$$\hat\phi_1 = 2\left(-\frac{a}{3}\right)^{1/2}\cos\left(\frac{\psi + 4\pi}{3}\right) - \frac{p}{3}, \qquad \cos\psi = \frac{3b}{2a}\left(-\frac{3}{a}\right)^{1/2},$$

where

$$a = \tfrac{1}{3}(3q - p^2), \qquad b = \tfrac{1}{27}(2p^3 - 9pq + 27r).$$
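The trigonometric root extraction can be checked numerically. The sketch below (our illustration with arbitrary coefficients, not the study's $p$, $q$, $r$, and valid when the cubic has three real roots, i.e. $a < 0$) implements the quoted formulas and compares the result with a general root finder.

```python
import numpy as np

def cubic_root_trig(p, q, r):
    """Real root of x^3 + p x^2 + q x + r = 0 by the trigonometric
    method: a = (3q - p^2)/3, b = (2p^3 - 9pq + 27r)/27, and a root is
    2 sqrt(-a/3) cos((psi + 4 pi)/3) - p/3 with
    cos(psi) = 3b/(2a) sqrt(-3/a)."""
    a = (3 * q - p**2) / 3.0
    b = (2 * p**3 - 9 * p * q + 27 * r) / 27.0
    psi = np.arccos(3 * b / (2 * a) * np.sqrt(-3.0 / a))
    return 2 * np.sqrt(-a / 3.0) * np.cos((psi + 4 * np.pi) / 3.0) - p / 3.0

# compare with numpy on a cubic having three real roots
print(cubic_root_trig(-0.5, -0.4, 0.1), np.roots([1, -0.5, -0.4, 0.1]))
```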
Thus for this model (only) the Box Complex Algorithm was not applied, and restricted and unrestricted maximum likelihood estimators were identical. Note that in general this would not be the case for true $\phi_1$ values closer to the stationarity boundaries than those examined here.

In the log-likelihood for the (2, 0) process,

$$M_2^{(2)} = \begin{pmatrix} 1 - \phi_2^2 & -\phi_1(1 + \phi_2) \\ -\phi_1(1 + \phi_2) & 1 - \phi_2^2 \end{pmatrix}.$$
Stationarity conditions are

$$\phi_1 + \phi_2 < 1, \qquad \phi_2 - \phi_1 < 1, \qquad -1 < \phi_2 < 1.$$

In the log-likelihood for the (3, 0) process, $M_3^{(3)}$ is the analogous $3 \times 3$ symmetric matrix in $\phi_1$, $\phi_2$, $\phi_3$, and the stationarity conditions are those requiring the roots of $1 - \phi_1 B - \phi_2 B^2 - \phi_3 B^3 = 0$ to lie outside the unit circle.

8.2. Moving average models
From (1) a moving average model of order $q$, with mean $\mu = 0$ for convenience, can be expressed as

$$w_t = a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}.$$

Applying this relation recursively yields a model

$$a = Lw + Xa_*,$$

where $a = (a_{1-q}, a_{2-q}, \ldots, a_n)'$ is an $(n+q)$-dimensional vector described in terms of the $n$-dimensional vector $w = (w_1, w_2, \ldots, w_n)'$ and the $q$-dimensional vector $a_* = (a_{1-q}, a_{2-q}, \ldots, a_0)'$. $L$ is an $(n+q) \times n$ matrix and $X$ is an $(n+q) \times q$ matrix whose elements are functions of $\theta$.
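For $q = 1$ the matrices $L$ and $X$ can be written down explicitly from the recursion $a_t = w_t + \theta_1 a_{t-1}$, which gives $a_t = \sum_{s=1}^{t}\theta_1^{t-s}w_s + \theta_1^t a_0$. The sketch below (our illustration, not taken from the paper) constructs them and verifies $a = Lw + Xa_*$ on simulated data.

```python
import numpy as np

def ma1_L_X(theta, n):
    """MA(1) case: a = (a_0, a_1, ..., a_n)', a_* = (a_0), with
    L[t, s-1] = theta^(t-s) for 1 <= s <= t and X[t] = theta^t."""
    L = np.zeros((n + 1, n))
    X = np.zeros((n + 1, 1))
    X[0, 0] = 1.0                        # first row just says a_0 = a_*
    for t in range(1, n + 1):
        X[t, 0] = theta ** t
        L[t, :t] = theta ** np.arange(t - 1, -1, -1)
    return L, X

# numerical check of a = L w + X a_* for a short MA(1) series
theta, n = 0.4, 6
rng = np.random.default_rng(1)
a = rng.standard_normal(n + 1)           # a_0, ..., a_n
w = a[1:] - theta * a[:-1]               # w_t = a_t - theta a_{t-1}
L, X = ma1_L_X(theta, n)
print(np.allclose(L @ w + X @ a[:1], a)) # True
```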
Box and Jenkins show (1970, app. A7.4) that the exact likelihood function of a $(0, q)$ process is

$$L(\theta, \sigma^2) = (2\pi\sigma^2)^{-n/2}|X'X|^{-1/2}\exp\{-S(\theta)/2\sigma^2\},$$

where

$$S(\theta) = \sum_{t=1-q}^{n} [a_t]^2,$$

and $[a_t]$ is the conditional expectation of $a_t$ given $w$ and $\theta$. Invertibility conditions constraining moving average parameters are identical to the stationarity requirements on autoregressive models with $\phi$ parameters replaced by $\theta$ parameters. For the moving average models we find the maximum likelihood, unconditional and conditional least squares estimates using the Box Complex Algorithm. It is convenient to note the following procedure. Consider

$$Lw = -Xa_* + a$$

as a standard linear model with unknown $a_*$, $E(a) = 0$, $\mathrm{var}(a) = \sigma^2 I$. The least squares estimate of $a_*$ is

$$\hat a_* = -(X'X)^{-1}X'Lw,$$

whence the residual vector $[a]$, which can be shown to be the conditional expectation of $a$, is given as

$$[a] = Lw - X(X'X)^{-1}X'Lw = MLw.$$

The residual sum of squares is

$$S(\theta) = \sum_{t=1-q}^{n} [a_t]^2,$$

and minimization of this expression yields unconditional least squares estimators. To preserve numerical accuracy we perform our regression using Householder transformations [Businger-Golub (1965)] and find an orthogonal matrix $P$ such that

$$P\,[\,X \quad Lw\,] = \begin{pmatrix} R & \hat c \\ 0 & z \end{pmatrix},$$
where $R$ is a non-singular upper triangular matrix such that $\hat a_* = -R^{-1}\hat c$. The residual sum of squares is given [Golub-Styan (1973)] as

$$S(\theta) = z'z.$$

The conditional least squares estimators involve setting $a_* = 0$, so that $S^*(\theta) = (Lw)'(Lw)$, and the Box Complex Algorithm minimizes this expression. To compute maximum likelihood estimators one requires the unconditional sum of squares for the exponent of the likelihood and calculation of the term $|X'X|^{-1/2}$. From the Householder decomposition the latter is easily found as

$$|X'X|^{-1/2} = |R'R|^{-1/2} = |R|^{-1} = \prod_{i=1}^{q}(r_{ii})^{-1},$$

where $r_{ii}$ is the $i$th diagonal element of $R$, $i = 1, \ldots, q$. The concentrated likelihood, ignoring constant factors, is given as

$$L^+(\theta) = [n^{-1}S(\theta)]^{-n/2}|X'X|^{-1/2},$$

which is optimized. As a final important note we stress that within each simulation trial, knowledge of the true parameter values was nowhere used in determining any estimates, including those requiring the search algorithm for extraction. The latter algorithm uses random starting points. Results of course may therefore be dependent on the particular search algorithm used, but we felt this was a better choice than starting with the true values, which would never be feasible in actual application.
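The following sketch illustrates this computation under the $q = 1$ structure above, using numpy's QR factorization (Householder-based in LAPACK) in place of an explicit Householder routine, and a simple grid search in place of the Box Complex Algorithm; it evaluates the log of the concentrated likelihood $L^+(\theta)$ and locates its maximum inside the invertibility region.

```python
import numpy as np

def ma1_concentrated_loglik(theta, w):
    """log L+(theta) = -(n/2) ln(S(theta)/n) - ln |r_11| for an MA(1)
    model, with [a] obtained as the residual of regressing Lw on -X."""
    n = len(w)
    L = np.zeros((n + 1, n))
    X = np.zeros((n + 1, 1))
    X[0, 0] = 1.0
    for t in range(1, n + 1):
        X[t, 0] = theta ** t
        L[t, :t] = theta ** np.arange(t - 1, -1, -1)
    Lw = L @ w
    Q, R = np.linalg.qr(X)              # Householder QR via LAPACK
    a_hat = Lw - Q @ (Q.T @ Lw)         # residual [a] = M Lw
    S = a_hat @ a_hat                   # unconditional sum of squares
    return -0.5 * n * np.log(S / n) - np.log(abs(R[0, 0]))

# grid search over the invertibility region, standing in for the
# constrained search algorithm used in the study
w = np.random.default_rng(2).standard_normal(100)
grid = np.linspace(-0.95, 0.95, 191)
print(grid[np.argmax([ma1_concentrated_loglik(t, w) for t in grid])])
```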
8.3. Mixed model

The log-likelihood function for the (1, 1) model is denoted

$$l(\phi_1, \theta_1, \sigma^2) = -\frac{n}{2}\ln\sigma^2 + \frac{1}{2}\ln|M^{(1,1)}| - \frac{S(\phi_1, \theta_1)}{2\sigma^2},$$

and for moderate and large $n$ an adequate approximation to $|M^{(1,1)}|$ [Box-Jenkins (1970, app. A7.5)] is

$$|M^{(1,1)}| = (1 - \phi_1\theta_1)^{-2}(1 - \phi_1^2)(1 - \theta_1^2).$$
Stationarity and invertibility require $|\phi_1| < 1$, $|\theta_1| < 1$. The unconditional sum of squares function $S(\phi_1, \theta_1)$ is computed using Box and Jenkins' backcasting procedure as

$$S(\phi_1, \theta_1) = \sum_{t=-j}^{n} [a_t]^2,$$

where $j$ is a positive integer such that the computed value of $w_{-j}$ is sufficiently close to zero [Box-Jenkins (1970, p. 217)]. We choose $|w_{-j}| < 10^{-5}$ as the cut-off criterion for this purpose. The conditional sum of squares $S^*$ is determined analogously, with the pre-sample terms set to zero.
The Box Complex Algorithm is used to search $S$, $S^*$ and the likelihood function, using the given approximation to $|M^{(1,1)}|$, for computation of conditional and unconditional least squares and maximum likelihood estimators.

9. Results
The numerical results of the simulation study are reported in tables 1-6 and figs. 1-3. The tables correspond respectively to the models (1, 0), (2, 0), (3, 0), (0, 1), (0, 2) and (1, 1). Details include the parameter values, simulation sample bias, and variance and mean square error for each estimator considered. Other statistics were collected as described above, but for brevity are not tabulated here.

The broadest inference from our reading of the result tables is that calculation of the (restricted) maximum likelihood estimates appears well worth the effort. This is based on a criterion of minimum mean square error, although as demonstrated below, these estimates appear to be relatively efficient. Two qualifications, which may be regarded as slightly less macro-oriented findings, are (i) for the first-order pure autoregressive model little differentiation can be made between estimators, and (ii) for pure moving average models the conditional least squares estimator may be marginally superior to the maximum likelihood estimator. This supports findings of Nelson (1974). Further, there is slight evidence that some of the estimators become less reliable as the number of parameters to be estimated is increased. Conversely, the performance of the maximum likelihood estimators tends to improve relatively. These interpretations are reinforced by the more extensive findings of Min (1975), who also examines mean absolute error of estimates, empirical distributions, quartile dispersion of estimates, increased sample size and increased number of simulation trials. In no instance are results reported here contradicted
or weakened; rather they are confirmed or strengthened. Given space limitations and the representative nature of results in tables 1-6, there seems little point in detailing redundant information. From these general findings and impressions, it is instructive to turn to more detailed analysis of results for individual model types.

9.1. First-order autoregressive process
Relative to results for other processes, performances of the seven estimators of $\phi_1$ vary only marginally on the twelve parameter sets examined. Bias tends to be negative for all coefficient estimators, but the Kendall and Quenouille estimators exhibit minimum absolute bias in ten instances, with Quenouille tending to dominate at positive $\phi_1$. This apparent minimum bias property is at the expense of variance, and mean square error generally is at a minimum for the exact maximum likelihood estimator or the unconditional least squares estimator. Neither dominates and, as found by Thornber, the sample loss functions of the two estimators cross, as shown in fig. 1. Relative efficiency appears to be demonstrated by the maximum likelihood, approximate maximum likelihood and regression estimators, while the second of these appears marginally least biased of the three. For negative and low positive $\phi_1$, sample variances of the maximum likelihood estimator appear to underestimate asymptotic variances, while the reverse is true for large $\phi_1$ values. Mean square error performance of the least squares estimator improves as $\phi_1$ increases, while performances of the Yule-Walker and regression estimators worsen for large $\phi_1$ values, relative to the maximum likelihood estimator. Performances of the Kendall and Quenouille estimators appear to be marginally dominated in general by those of other estimators. With respect to estimation of the error variance there appears to be no one estimator which dominates.

Fig. 1. Sample mean square error (loss) function for estimators of coefficient in first-order autoregressive process. [Curves: maximum likelihood; unconditional least squares.]
Table 1
The first-order autoregressive process: Monte Carlo results for various estimators based on sample size 100 using 100 simulation trials. (Columns: bias, variance, and mean square error of $\hat\phi_1$ and $\hat\sigma^2$ for the maximum likelihood, least squares, approximate maximum likelihood, Yule-Walker, regression, Kendall, and Quenouille estimators; rows: twelve true parameter values of $\phi_1$ from $-0.9$ to $1.0$ with $\sigma^2 = 1.0$.)ᵃ

ᵃAsymptotic variances of the maximum likelihood estimators are, respectively, 0.0019, 0.0051, 0.0064, 0.0075, 0.0091, 0.0099, 0.0099, 0.0091, 0.0075, 0.0064, 0.0051, 0.0019.
9.2. Second-order autoregressive process
Consider the two-dimensional graph of the stationarity region for coefficients $\phi_1$ and $\phi_2$, depicted in fig. 2, which shows the location of the twenty-one parameter sets examined for this process. Results on bias again show a persistent tendency of the estimators to underestimate both parameters, especially when $\phi_2$ is positive, but in this process there appears to be no pattern of domination by a particular estimator with respect to minimum absolute bias. Absolute bias for $\phi_2$ is much larger than for $\phi_1$ when $\phi_2$ is positive, but bias levels are very similar when $\phi_2$ is negative.
Fig. 2. Stationarity region for coefficients and location of parameter sets in second-order autoregressive process.
Findings on sample variance and mean square error are similar and appear to vary regionally. In the first group [$\phi_2 \geq -0.05$, but excluding the pairs (0.3, $-0.4$), (0.0, $-0.9$), (1.0, $-0.05$)], (unconditional) least squares and restricted maximum likelihood perform similarly, with maximum likelihood marginally superior on the second coefficient. In the second group [$\phi_2 < 0$ and $|\phi_1| \geq 1.0$, but excluding the pairs (0.5, $-0.5$), ($-0.3$, $-0.4$)], again least squares and restricted maximum likelihood perform similarly for $\phi_1$, but the
latter estimator is distinctly better with respect to estimation of $\phi_2$. In the remaining group [$\phi_1 < -0.05$, but excluding the pairs ($-1.0$, $-0.95$), ($-0.5$, $-0.5$)], sample variance and mean square error are markedly smallest for the maximum likelihood estimator for both coefficients.

For the other estimators, ranking of performance on the basis of mean square error varies by quadrant of fig. 2. With respect to estimation of $\phi_1$, for the first five parameter sets, which are in the upper right quadrant, on the basis of minimum mean square error the maximum likelihood, least squares, approximate maximum likelihood and Yule-Walker estimators perform similarly, closely followed by the regression estimator, then Kendall's and finally Quenouille's estimator. For the next five cases, which are in the upper left quadrant of stationarity, least squares, approximate maximum likelihood, Yule-Walker and regression perform similarly, especially with respect to sample variance. In the following six parameter sets, corresponding to the lower left quadrant, least squares performs well when $|\phi_2|$ is near unity. An overall ranking of mean square error yields maximum likelihood, least squares, approximate maximum likelihood, regression, Yule-Walker, Quenouille, Kendall. In the final five cases a similar conclusion applies, with least squares performing well for $\phi_2$ near zero.

In estimation of the second coefficient $\phi_2$, in the first quadrant least squares and maximum likelihood are paired as best, followed by approximate maximum likelihood, Yule-Walker and regression. In the other three quadrants the performance of other estimators cannot compare with those of maximum likelihood and least squares. In the majority of cases outside the upper right quadrant, sample variances of maximum likelihood estimators are less than asymptotic variances, and estimation of the second coefficient is relatively more efficient than that of the first coefficient. At this point it is interesting to note that there does not appear to be any one estimator which performs uniformly better than others near boundaries of stationarity regions.

With respect to error variance estimation, all estimators (including those of Kendall and Quenouille) in all quadrants perform similarly, suggesting the robustness of this statistic, which may be interpreted as the mean square one-period-ahead forecast error.

The superiority of the maximum likelihood estimator raises the question of whether the stationarity restrictions used had any noticeable effect on estimator performance. From fig. 2 one can distinguish thirteen boundary type parameter sets and eight interior sets. Results for the boundary sets were examined in detail to determine how close to stationarity restrictions sample estimates fell. In no case could parameter estimates be construed to have attained an actual boundary of a restriction. The most severe findings were for the two points (0.9, 0.05) and ($-0.05$, 0.9), where the parameter sets are located near the boundaries of two restrictions. Even for these points the restrictions were not enforced
Table 2a
The second-order autoregressive process: Monte Carlo results for various estimators based on sample size 100 using 100 simulation trials. (Columns: bias, variance, and mean square error of $\hat\phi_1$, $\hat\phi_2$ and $\hat\sigma^2$ for the maximum likelihood, least squares, approximate maximum likelihood, Yule-Walker, regression, Kendall, and Quenouille estimators; rows: the eleven parameter sets with $\phi_2 > 0$.)ᵃ

ᵃAsymptotic variances of the maximum likelihood estimators of ($\phi_1$, $\phi_2$) are footnoted for each parameter set.

Table 2b
The second-order autoregressive process: Monte Carlo results for various estimators based on sample size 100 using 100 simulation trials. (Columns as in table 2a; rows: the ten parameter sets with $\phi_2 < 0$, including ($-1.5$, $-0.6$), ($-1.0$, $-0.95$), ($-0.5$, $-0.5$), ($-0.3$, $-0.4$), (0.0, $-0.9$), (1.5, $-0.6$), (1.0, $-0.05$), (0.5, $-0.5$), and (0.3, $-0.4$).)ᵃ

ᵃAsymptotic variances of the maximum likelihood estimators of ($\phi_1$, $\phi_2$) are footnoted for each parameter set.
in simulation computations, and as an example, results for the point (-0.05, 0.9) are displayed in table 2c.
Table 2c
Second-order autoregressive process: (restricted) maximum likelihood estimation for the parameter set ($-0.05$, 0.9); percent of trials yielding parameter values in the indicated ranges.

          < 0.9505   0.95051-0.96050   0.96051-0.97050   0.97051-0.98050   0.98051-0.99050   > 0.99050
          99.0       1.0               -                 -                 -                 -
          68.0       6.0               7.0               13.0              6.0               -
          99.0       1.0               -                 -                 -                 -

9.3. Third-order autoregressive process
It is clearly difficult to comprehensively cover the three-dimensional stationarity region with a small number of sets of parameter values. Expense has limited selection to only thirteen cases and there is no claim that these are sufficient to permit generalization to the complete space of stationarity. Within these cases however performance of the maximum likelihood estimator is by far the strongest. In estimating $\phi_1$ for four cases (first, second, tenth and eleventh) other estimators exhibit minimum mean square error, but even in these instances actual levels of mean square error for the maximum likelihood estimator are close to the optimum levels. Least squares and the approximate maximum likelihood estimators perform similarly, but results for Yule-Walker, regression, and Kendall estimators are mixed. Again relative efficiency seems to provide the basis for small mean square error. For the second coefficient, $\phi_2$, superiority of the maximum likelihood estimator is not as strong. In four instances (second, fifth, seventh and tenth) another estimator performs best, but even then the mean square error for the maximum likelihood estimator is still good. In estimating $\phi_3$ clear domination by this estimator is evidenced. Other estimators are relatively indistinguishable with respect to realized mean square error. Again small sample variances tend to outweigh small absolute biases, and biases still tend to be negative. Five parameter sets are in some sense close to stationarity boundaries, but parameter estimates for all trials were found to be nowhere near these boundaries, and the restricted maximum likelihood estimates may be considered to be (unrestricted) maximum likelihood estimates. No distinction in estimator performance is seen for interior and boundary parameter sets.
Table 3
The third-order autoregressive process: Monte Carlo results for various estimators based on sample size 100 using 100 simulation trials. (Columns: bias, variance, and mean square error of $\hat\phi_1$, $\hat\phi_2$, $\hat\phi_3$ and $\hat\sigma^2$ for each estimator; rows: the thirteen ($\phi_1$, $\phi_2$, $\phi_3$) parameter sets.)ᵃ

ᵃAsymptotic variances of the maximum likelihood estimators of ($\phi_1$, $\phi_2$, $\phi_3$) are, respectively, (0.0096, 0.0089, 0.0096), (0.0051, 0.0051, 0.0051), (0.0091, 0.0172, 0.0091), (0.0099, 0.0246, 0.0099), (0.0091, 0.0091, 0.0091), (0.0091, 0.0064, 0.0091), (0.0096, 0.0089, 0.0096), (0.0096, 0.0089, 0.0096), (0.0091, 0.0091, 0.0091), (0.0096, 0.0131, 0.0096), (0.0096, 0.0089, 0.0096), (0.0091, 0.0139, 0.0091), (0.0019, 0.0019, 0.0019).
In general, revealed sample variances of maximum likelihood estimates are less than asymptotic variances, and the third parameter appears to be estimated with more precision than the first and second coefficients. In estimation of σ², the maximum likelihood estimator leads, with no general distinctions possible between the other estimators.
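The asymptotic variances quoted in the table footnotes can be tied to the standard large-sample result for maximum likelihood estimation of a stationary AR(p); the display below is a textbook statement of that result, not a derivation specific to this study:

\[
\sqrt{n}\,\bigl(\hat{\boldsymbol{\phi}} - \boldsymbol{\phi}\bigr)
\;\xrightarrow{d}\;
N\!\bigl(\mathbf{0},\, \sigma^{2}\Gamma_{p}^{-1}\bigr),
\qquad
\Gamma_{p} = \bigl[\gamma(i-j)\bigr]_{i,j=1}^{p},
\]

so the tabulated values are the diagonal elements of \(\sigma^{2}\Gamma_{p}^{-1}/n\) with \(n = 100\); for \(p = 1\) this reduces to \(\operatorname{var}(\hat{\phi}_{1}) \approx (1 - \phi_{1}^{2})/n\).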
9.4. First-order moving average process
From table 4, minimum absolute bias for estimation of θ₁ is evidenced by the conditional least squares estimator in four of thirteen cases, and by the unconditional least squares, maximum likelihood and Walker estimators in three cases each.
Fig. 3. Sample mean square error (loss) function for estimators of coefficient in first-order moving average process (curves shown for the maximum likelihood and conditional least squares estimators; horizontal axis: θ₁ from -1.0 to 1.0).
The conditional least squares estimator attains minimum variance for seven parameter sets, the maximum likelihood estimator for four sets. The race for minimum mean square error appears to be between the conditional least squares and maximum likelihood estimators. A graph of the respective loss functions is displayed in fig. 3. Immediately evident from this figure is the effect of the restrictions on estimation. In all moving average process simulations, the maximum likelihood, conditional and unconditional least squares estimators were restricted to satisfy invertibility requirements. For the three points θ₁ = -0.95, 0.90 and 0.95 some of the maximum likelihood and unconditional least squares estimates were found to attain invertibility boundary values.
Table 4
The first-order moving average process: Monte Carlo results for various estimators based on sample size 100 using 100 simulation trials.ᵃ (Bias, variance and mean square error of the maximum likelihood, unconditional least squares, conditional least squares, Walker and method of moments estimators of θ₁ and σ², for true values θ₁ = -0.95, -0.7, -0.6, -0.5, -0.3, -0.1, 0.1, 0.3, 0.5, 0.6, 0.7, 0.9 and 0.95, each with σ² = 1.0.)

ᵃAsymptotic variances of the maximum likelihood estimators of θ₁ are, respectively, 0.0010, 0.0051, 0.0064, 0.0075, 0.0091, 0.0099, 0.0099, 0.0091, 0.0075, 0.0064, 0.0051, 0.0019, 0.0010.
For the cases when |θ₁| = 0.95, approximately twenty-five percent of unconditional least squares sample estimates had absolute value 0.999 or greater (no estimate exceeded 0.9999, which was the program-introduced version of the invertibility constraint θ₁ = 1.0). Another fifteen percent of estimates were found to be between 0.9905 and 0.999, so that forty percent of sample estimates were within one percent of the invertibility constraint value. For the maximum likelihood estimator the corresponding proportions were two percent (boundary) and nineteen percent (within one percent but not on the boundary), with twenty-one percent in total as marginal cases. No conditional least squares estimates were greater than 0.95 in absolute value, either for these two parameter sets or for the third value of θ₁ = 0.9. In this latter case eighteen percent of unconditional least squares sample estimates attained a computer invertibility boundary value and eight percent were between 0.9905 and 0.999. No maximum likelihood estimates reached the boundary, but five percent were within one percent of the boundary value. It appears from these results that the conditional least squares estimator (in unrestricted form) is relatively efficient, followed closely by the corresponding (unrestricted) maximum likelihood estimator. Sample variances of this latter (restricted) estimator were in general larger than asymptotic variances. In error variance estimation the unconditional least squares estimator appears best, especially for negative θ₁ values. The conditional least squares and maximum likelihood estimators perform similarly, while the Walker and method of moments estimators do not appear to be reliable.
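A minimal sketch of restricted estimation of the kind described above (our illustration: the function name is hypothetical, scipy's bounded scalar minimizer stands in for whatever optimizer was actually used, and the 0.9999 bound mirrors the program-introduced constraint): conditional least squares for an MA(1) process recovers the shocks recursively from a₀ = 0 and minimizes their sum of squares over the invertibility interval.

import numpy as np
from scipy.optimize import minimize_scalar

def css_ma1(w):
    # Conditional least squares for w_t = a_t - theta*a_{t-1}:
    # condition on a_0 = 0, recover the shocks recursively, and
    # minimize the conditional sum of squares over the invertibility region.
    w = np.asarray(w, dtype=float)
    w = w - w.mean()

    def sum_sq(theta):
        a_prev, s = 0.0, 0.0
        for wt in w:
            a = wt + theta * a_prev          # a_t = w_t + theta*a_{t-1}
            s += a * a
            a_prev = a
        return s

    # |theta| <= 0.9999 plays the role of the invertibility constraint |theta| < 1.
    res = minimize_scalar(sum_sq, bounds=(-0.9999, 0.9999), method='bounded')
    return res.x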
9.5. Second-order moving average process
Findings in table 5 are readily summarized. For both parameters it is difficult to force a preference between the maximum likelihood and conditional least squares estimators, but the unconditional least squares estimator is dominated by one of the other two in all but two instances, both involving the second coefficient θ₂. The conditional least squares estimator leads the maximum likelihood estimator in terms of smallest absolute bias. The results for the one boundary case, θ₁ = 1.9, θ₂ = -0.95, provide a dire warning on the difficulties of estimation in higher-order moving average processes. While no conditional least squares estimates of θ₂ were less than -0.95, forty-six percent of the unconditional least squares and two percent of the maximum likelihood estimates reached the boundary value for θ₂. Further, thirty percent of the former estimates had values between -0.9905 and -0.999, while forty percent of the latter estimates were in this range. This suggests that for near-boundary cases in this process, (invertibility) restricted estimation would be well worthwhile. Sample variances of estimates were generally greater than asymptotic variances, and in estimation of σ² the conditional least squares estimator clearly exhibits least absolute bias (except in case 3), but on the basis of mean square error no estimator dominates.
Table 5
The second-order moving average process: Monte Carlo results for various estimators based on sample size 100 using 100 simulation trials.ᵃ

                            Max. likelihood                Uncond. least sq.              Cond. least sq.
Parameter   True value      Bias      Var.      MSE        Bias      Var.      MSE        Bias      Var.      MSE

θ₁           0.3            0.0201    0.0114    0.0118     0.0245    0.0112    0.0118     0.0159    0.0107    0.0109
θ₂           0.3            0.0153    0.0101    0.0103     0.0232    0.0099    0.0104     0.0161    0.0092    0.0095
σ²           1.0           -0.0491    0.0193    0.0217    -0.0465    0.0200    0.0221    -0.0422    0.0195    0.0213

θ₁           0.5            0.0169    0.0116    0.0119     0.0201    0.0121    0.0125     0.0061    0.0117    0.0117
θ₂           0.2            0.0257    0.0114    0.0120     0.0350    0.0128    0.0140     0.0226    0.0110    0.0115
σ²           1.0           -0.0378    0.0190    0.0204    -0.0385    0.0181    0.0196    -0.0255    0.0187    0.0194

θ₁           1.9           -0.0109    0.0012    0.0013     0.0037    0.0015    0.0015    -0.0342    0.0017    0.0029
θ₂          -0.95          -0.0177    0.0011    0.0014    -0.0169    0.0017    0.0020    -0.0186    0.0015    0.0018
σ²           1.0            0.0179    0.0189    0.0192     0.0188    0.0199    0.0202     0.0534    0.0200    0.0229

θ₁          -0.3            0.0174    0.0063    0.0066     0.0075    0.0067    0.0068     0.0186    0.0069    0.0072
θ₂           0.3            0.0204    0.0049    0.0053     0.0188    0.0035    0.0039     0.0177    0.0042    0.0045
σ²           1.0           -0.0313    0.0210    0.0220    -0.0344    0.0203    0.0215    -0.0309    0.0201    0.0211

θ₁            ·             0.0061    0.0036    0.0036    -0.0056    0.0038    0.0038     0.0042    0.0047    0.0047
θ₂            ·             0.0288    0.0053    0.0061     0.0363    0.0061    0.0074     0.0211    0.0058    0.0062
σ²           1.0           -0.0475    0.0203    0.0225    -0.0479    0.0211    0.0234    -0.0370    0.0218    0.0231

θ₁          -0.3           -0.0042    0.0072    0.0073    -0.0058    0.0073    0.0074    -0.0038    0.0072    0.0072
θ₂          -0.4            0.0024    0.0023    0.0023    -0.0041    0.0026    0.0026     0.0021    0.0025    0.0025
σ²           1.0           -0.0369    0.0200    0.0214    -0.0347    0.0201    0.0213    -0.0333    0.0204    0.0215

θ₁          -0.5           -0.0105    0.0095    0.0096    -0.0149    0.0090    0.0092    -0.0090    0.0082    0.0083
θ₂          -0.5            0.0008    0.0028    0.0028    -0.0060    0.0025    0.0025     0.0038    0.0025    0.0026
σ²           1.0           -0.0324    0.0182    0.0193    -0.0320    0.0185    0.0195    -0.0206    0.0192    0.0196

θ₁           0.3            0.0103    0.0084    0.0085     0.0146    0.0088    0.0090     0.0106    0.0085    0.0086
θ₂          -0.4           -0.0068    0.0043    0.0043    -0.0066    0.0040    0.0041    -0.0051    0.0044    0.0045
σ²           1.0           -0.0366    0.0211    0.0224    -0.0414    0.0191    0.0208    -0.0331    0.0208    0.0219

θ₁           0.5           -0.0105    0.0076    0.0077    -0.0071    0.0082    0.0083    -0.0095    0.0075    0.0076
θ₂          -0.5           -0.0008    0.0035    0.0035    -0.0096    0.0043    0.0044     0.0030    0.0033    0.0033
σ²           1.0           -0.0267    0.0187    0.0194    -0.0306    0.0175    0.0184    -0.0191    0.0184    0.0188

ᵃAsymptotic variances of the maximum likelihood estimators of (θ₁, θ₂) are, respectively, (0.0091, 0.0091), (0.0096, 0.0096), (0.0010, 0.0010), (0.0091, 0.0091), (0.0084, 0.0084), (0.0075, 0.0075), (0.0084, 0.0084), (0.0075, 0.0075), (0.0010, 0.0010).
9.6. First-order mixed process
The most obvious finding from table 6 is that the Walker and method of moments estimates are in a separate class from the other three estimators and should not be considered reliable. There is really no clear picture among the remaining findings. With respect to estimation of φ₁, results for bias and variance are indecisive, although the conditional least squares estimator appears to have a slight edge in achieving smallest mean square error. In estimating θ₁, the maximum likelihood and conditional least squares estimators perform similarly and best with respect to variance and mean square error, with the former estimator achieving a marginal advantage. Absolute bias for the maximum likelihood estimator is worse than that for the two least squares estimators. In the two near-boundary cases of θ₁ = 0.9, results similar to those for the first-order moving average process were evidenced with respect to the attainment of exact invertibility restriction boundaries. No difficulties were realized with maximum likelihood or conditional least squares estimates, although some unconditional least squares estimates reached the invertibility boundary and a larger number were within one percent of that value.
Table 6
The first-order mixed process: Monte Carlo results for various estimators based on sample size 100 using 100 simulation trials. (Bias, variance and mean square error of the maximum likelihood, unconditional least squares, conditional least squares, Walker and method of moments estimators of φ₁, θ₁ and σ², for parameter sets (φ₁, θ₁) including (0.3, 0.2), (-0.8, -0.6), (-0.1, -0.3), (-0.9, 0.4), (-0.6, 0.9) and (0.4, 0.9), each with σ² = 1.0.)
The obverse stationarity restrictions on φ₁ appear not to pose a problem for any (restricted) estimator. Extremely mixed results are found in comparison of sample and theoretical asymptotic variances of maximum likelihood estimators. In the second case sample variances are much greater than asymptotic values, while in the fourth case the reverse applies. Other results are not out of line, although some asymmetric performance with respect to the autoregressive and moving average coefficients is evidenced. No reasonable explanation of the revealed behavior is readily available. In estimating σ² the unconditional least squares estimator has the best over-all performance with respect to mean square error.
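For completeness, a sketch of the data generation underlying a mixed-process experiment of this kind (the warm-up length and the random number generator are our assumptions, not details taken from the study):

import numpy as np

def simulate_arma11(phi, theta, sigma2, n, warmup=100, seed=None):
    # Generate w_t = phi*w_{t-1} + a_t - theta*a_{t-1} with N(0, sigma2) shocks,
    # discarding an initial stretch so the retained series is near-stationary.
    rng = np.random.default_rng(seed)
    a = rng.normal(0.0, np.sqrt(sigma2), size=n + warmup)
    w = np.zeros(n + warmup)
    for t in range(1, n + warmup):
        w[t] = phi * w[t - 1] + a[t] - theta * a[t - 1]
    return w[warmup:]

# e.g., one replication at the first design point of table 6:
series = simulate_arma11(phi=0.3, theta=0.2, sigma2=1.0, n=100, seed=1)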
10. Conclusion
Our findings suggest that if only one estimation method were available to a researcher he should probably choose maximum likelihood. The restricted version of estimation appears appropriate for parameters near (within five percent of) stationarity or invertibility restriction boundaries. Otherwise, the relatively easy-to-compute conditional least squares estimators seem appropriate for the moving average and mixed models examined here. For autoregressive models the easy-to-compute approximate maximum likelihood estimator is a second-best alternative, although the loss from not using maximum likelihood may be large, especially when the number of unknown parameters is three or more. The inferences drawn here are based on a carefully designed Monte Carlo study, but should be treated with caution. In particular, further work should be directed to the examination of higher-order moving average and mixed models in greater detail. Variation in findings for smaller sample sizes should be examined, and attention paid to the detection of global versus local optima in multi-coefficient models. The expense of simulation studies for these cases, however, will not be trivial.
References

Aigner, Dennis J., 1971, A compendium on estimation of the autoregressive-moving average model from time series data, International Economic Review 12, no. 3, Oct., 348-371.
Anderson, T.W., 1975, Maximum likelihood estimation of parameters of autoregressive processes with moving average residuals and other covariance matrices with linear structure, The Annals of Statistics 3, no. 6.
Box, George E.P. and Gwilym M. Jenkins, 1970, Time series analysis: Forecasting and control (Holden-Day, San Francisco, CA).
Box, M.J., 1965, A new method of constrained optimization and comparison to other methods, Computer Journal 8, April, 42-52.
Businger, P. and G.H. Golub, 1965, Linear least squares solution by Householder transformation, Numerische Mathematik 7, May, 269-276.
Chanda, K.C., 1961, Comparative efficiencies of methods of estimating parameters in linear autoregressive schemes, Biometrika 48, Dec., 427-432.
Dent, Warren T., 1977, Computation of the exact likelihood function of an ARIMA process, Journal of Statistical Computation and Simulation 5, 193-206.
Dent, Warren T. and J.A. Swanson, 1978, Forecasting with limited information: ARIMA models of the trailer on flatcar transportation market, to appear in the Journal of the American Statistical Association.
Durbin, J., 1959, Efficient estimation of parameters in moving-average models, Biometrika 46, pts. 3 and 4, Dec., 306-316.
Golub, G.H. and G.P.H. Styan, 1973, Numerical computation for univariate linear models, Journal of Statistical Computation and Simulation 2, July, 253-274.
Granger, C.W. and P. Newbold, 1976, Forecasting transformed time series, Journal of the Royal Statistical Society (B) 38, no. 2.
Halton, J.H., 1970, A retrospective and prospective survey of the Monte Carlo method, SIAM Review 12, no. 1, Jan., 1-63.
Hendry, D.F. and P.K. Trivedi, 1970, Maximum likelihood estimation of difference equations with moving average errors: A simulation study, Discussion Paper 7 (London School of Economics, London).
Hoaglin, David C. and David F. Andrews, 1975, The reporting of computation-based results in statistics, The American Statistician 29, no. 3, Aug., 122-126.
Jenkins, Gwilym M. and Donald G. Watts, 1968, Spectral analysis and its applications (Holden-Day, San Francisco, CA).
Kendall, Maurice G., 1949, The estimation of parameters in linear autoregressive time series, Econometrica 17, July, 44-57.
Kuester, James L. and Joe H. Mize, 1973, Optimization techniques (McGraw-Hill, New York).
Leuthold, R.M., A.J.A. MacCormick, A. Schmitz and D.G. Watts, 1970, Forecasting daily hog prices and quantities: A study of alternative forecasting techniques, Journal of the American Statistical Association 65, no. 329, March.
Marquardt, D.W., 1963, An algorithm for least squares estimation of non-linear parameters, Journal of the Society of Industrial and Applied Mathematics 11.
Marsaglia, G., K. Ananthanarayanan and N. Paul, 1976, Improvements on fast methods for generating normal random variables, Information Processing Letters.
Min, An-Sik, 1975, A study of likelihood functions of ARIMA processes, unpublished dissertation (The University of Iowa, Iowa City, IA).
Nelson, Charles R., 1974, The first order moving average process: Identification, estimation, and prediction, Journal of Econometrics 2, no. 1.
Nelson, Charles R., 1972, The prediction performance of the FRB-MIT-PENN model of the U.S. economy, American Economic Review 62, no. 5.
Pagan, A., 1970, The estimation of models with moving average disturbance terms, mimeo. (Australian National University, Canberra).
Phillips, A.W., 1966, Estimation of systems of difference equations with moving average disturbances, mimeo. (Australian National University, Canberra).
Quenouille, M.H., 1957, The analysis of multiple time series (Charles Griffin, London).
Selby, S.M. and S. Girling, 1965, Standard mathematical tables, 14th ed. (Chemical Rubber Co. Press, Cleveland, OH).
Sullivan, J.A. and R.G. Marcis, 1975, Forecasting consumer installment credit: An application of parametric time series modeling, Journal of Business, Jan., 98-107.
Thompson, Howard E. and G.C. Tiao, 1971, Analysis of telephone data: A case study of forecasting seasonal time series, Bell Journal of Economics and Management Science 2, no. 2, Autumn.
Thornber, Hodson, 1967, Finite sample Monte Carlo results: An autoregressive illustration, Journal of the American Statistical Association 62, Sept., 801-818.
Walker, A.M., 1961, Large sample estimation of parameters for moving-average models, Biometrika 48, Dec., 343-357.
Walker, A.M., 1962, Large sample estimation of parameters for autoregressive processes with moving-average residuals, Biometrika 49, June, 117-131.
Walker, A.M., 1964, Asymptotic properties of least squares estimates of parameters of the spectrum of a stationary non-deterministic time series, Journal of the Australian Mathematical Society 4, 363-384.
Whittle, P., 1951, Hypothesis testing in time series analysis (Almqvist and Wiksells, Uppsala).
Whittle, P., 1962, Gaussian estimation in stationary time series, Bulletin of the International Statistical Institute 39, 105-129.
Zellner, Arnold and Martin S. Geisel, 1970, Analysis of distributed lag models with applications to consumption function estimation, Econometrica 38, no. 6, 865-888.