Simultaneous prediction intervals for multiple forecasts based on Bonferroni and product-type inequalities

Simultaneous prediction intervals for multiple forecasts based on Bonferroni and product-type inequalities

Statistics & Probability North-Holland Letters July 1991 12 (1991) 57-63 Simultaneous prediction intervals for multiple forecasts based on Bonferr...

555KB Sizes 0 Downloads 36 Views

Statistics & Probability North-Holland

Letters

July 1991

12 (1991) 57-63

Simultaneous prediction intervals for multiple forecasts based on Bonferroni and product-type inequalities Joseph Department

Glaz * of Stutlstics,

University of Connecticut,

Nalini Ravishanker

Storrs, CT 06269, (/SA

**

IBM Research Diuisron, T.J. Watson Research Center,

Yorktown

Heights,

NY 10598, USA

Received June 1989 Revised July 1990

Abstract: New simultaneous prediction intervals for multiple forecasts from ARIMA models based on the Bonferroni-type product-type inequalities are introduced. These prediction intervals are compared with the marginal prediction intervals forecasting. Keywords: confidence

Autoregressive Integrated interval estimation

Moving

Average

models,

1. Introduction Assume that a Gaussian process { X,} has been observed for a given period of time (t = 1,. . . , T). Accurate forecasting of future observations X,,, (i= 1 >...> I,), along with simultaneous prediction intervals about these forecasts are of utmost importance for planning and control. The autoregressive integrated moving average (ARIMA) models have been extensively used in forecasting time series data (Box and Jenkins, 1976). To assess the accuracy of multiple forecasts from these models, one has to construct simultaneous prediction intervals for XT+, (i = 1,. . . , L) with the overall

* Research partially supported by the University of Connecticut Research Foundation. * * Current address: Department of Statistics, University of Connecticut, Storrs, CT 06269, USA. 0167-7152/91/$03.50

0 1991 - Elsevier Science Publishers

Bonferroni-type

inequalities,

product-type

inequalities,

and the used in

simultaneous

confidence being equal to lOO(1 - a)‘%, for a specified value of 0 -C (Y-C 1. The marginal prediction intervals for X,,, (i = I,...,

L),

XT(i)

f

~42

/Par(eT(i>>>

7

(1)

where z,/~ is the iath quantile from a standard normal distribution, X,(i) is the minimum mean squared error forecast and er(i) = XT+, - X,(i) is the forecast error, have been discussed in the statistical literature (Box and Jenkins, 1976, Section 5.2.4). These marginal prediction intervals can be quite misleading, as the overall confidence level for the simultaneous predictions is lower than the specified marginal confidence levels for individual forecasts. Simultaneous prediction intervals for X,, , (i = l,..., L) are given by XT(i)

B.V. (North-Holland)

* c/(var(e,(i)))

,

(2) 57

STATISTICS & PROBABILITY

Volume 12, Number 1

where the constant c is determined probability statement P[IU,(i)l
U,(i)

i=l,..., = (er(i)

L] =1-a,

- Ee.(i))//m

from the joint

(3) (i =

LETTERS

July 1991

Costigan and Sampson, 1988; Glaz, 1987; Hoover, 1989). In Section 4, two numerical examples are presented to illustrate and compare the simultaneous prediction intervals using both types of probability inequalities. A simulation is carried out to evaluate the accuracy of these intervals.

1,. . . , L)

are the standardized forecast errors. These simultaneous prediction intervals guarantee that all the L prediction intervals cover X,,, confidence (i= 1 ,..., L) with a predetermined level of lOO(1 - a)%;. The exact evaluation of the constant c is not possible for L a 8 and even for 4 6 L G 7, it is extremely complex (Schervish, 1984). The source of the difficulty in determining the constant c is in computing the L-variate normal probabilities given by the left hand side of equation (3). Bhansali (1974, Section 4) used a conservative first order Bonferroni-type inequality to construct simultaneous prediction intervals for an AR(p) process. Ravishanker, Hochberg and Melnick (1987) proposed conservative simultaneous prediction intervals for the general ARIMA( p, d, q) processes that are based on an improved second order Bonferroni-type inequality derived in Hunter (1976). In this paper, we propose tighter simultaneous prediction intervals that are based on higher order Bonferroni-type inequalities (Hoover, 1990) and the higher order product-type inequalities (Glaz and Johnson, 1984; Glaz, 1987). In Section 2 we present the simultaneous prediction intervals that are based on the higher order Bonferroni-type inequalities. We also discuss the assumptions for the distribution of the standardized forecast errors, U, = (U,(l), . . , U,(L))‘, that will guarantee the validity of the higher order product-type inequalities for the left hand side of equation (3). The assumptions for the validity of the producttype inequalities are analytically verified in Section 3 for some simple ARIMA models. For more complicated ARIMA models, a procedure for the numerical verification of these assumptions is outlined. The reason for investigating the performance of the simultaneous prediction intervals based on the higher order product-type inequalities is that they outperform (whenever the assumptions for their validity are in force) the simultaneous prediction intervals that are based on equivalent order Bonferroni-type inequalities (Block, 58

2. Bonferroni-type and product-type for P[ IUT(i)l
inequalities

Assuming that { X, } is a Gaussian process, U, = (U,(l), . . . , U,(L))’ has a multivariate normal distribution with mean 0 and correlation matrix Z. Let E,=(JU,(i)J GC) (i=l,..., L). We are interested in deriving tight inequalities for P(flf= , E,). In turn, this will determine a value of c such that for a given 0 < (Y< 1, the inquality P(fl,‘=,E,) a 1 - (Y will yield accurate simultaneous prediction intervals for X,,, (i = 1,. . . , L) that are given by (2). Recently, Hoover (1990, Theorem 1) has derived a sequence of Bonferroni-type inequalities of order k, 1 < k < L - 1, for the probability of a finite union of a sequence of events A,, . . . , A,. He proved that

where BC denotes the complementary event of B and S, is a subset of (1,. . . , i - l} of size k - 1. For k = 1 and k = 2 the right hand side of equation (4) reduces to the well known Bonferroni and Hunter upper bounds respectively (Hoover, 1990). If the events A,, . . . , A, are naturally ordered and the term P(nf=,A, ) is maximized for i,+, - i, = 1, ordering with j=l ,..., L - 1, then the natural i - k} is recommended. In this S,= {i-l,...,

Volume

12, Number

1

case the inequality

STATISTICS

(4) reduces

& PROBABILITY

to

ments kept fixed, is TP, (Barlow and Proschan, 1975). A nonnegative real valued function f(y,, . . , y,) is multivariate totally positive of order 2, MTP,, if for all x=(x1 ,..., x,,), y= (Yl,...?

G

CP(A,)L-1

J=2

/=I

where

-c cp

x Ay = (min(x,,

c

a,,,+kpl

-

P

h$

where for any 1 < m 6 n < a ,?I

,I

=P i fjE,, i \,=m I

c

a,+l.,+k&l

r=l

I=1 E

XVY

L-k

L+l-k

.v,),...,min(x,, ~4))

and

For the problem at hand, the Bonferroni-type inequality of order k for P(fl,'~, E,)is given by

2

Y,,)?

f(X A_v)f(xVY > af(x)f( Y),

I=1 k-l

July 1991

LETTERS

(6)

L,

(7)

(6) follows di(Y,+~., = 1. As the inequality rectly from the inequality (5) we omit the details of the proof. In Section 4, we will examine the amount of improvement achieved by using higher order Bonferroni-type inequalities to derive simultaneous predictions intervals. At this point, we would like to point out that the Bonferroni-type inequalities are valid without any further assumptions on the distribution of U,. We now proceed to discuss product-type inequalities (Glaz and Johnson, 1984) that will yield simultaneous prediction intervals for X,,, (i = 1,. . . , L).First, we have to introduce the positive dependence structures for the distribution of the random variables U,(l),. . . , U,(L) that will ensure the validity of these product-type inequalities. A nonnegative real valued function of two variables, f(y,, yz), is totally positive of order two, TP, if for all y, < y, and y,* < yz*,

and

(Karlin, 1968). A nonnegative real valued function of n variables, f( y,, . . , y,), is totally positive of order two in pairs, TP, in pairs, if for any pair of arguments y, and y,, the funtion f viewed as a function of y, and y, only, with the other argu-

=

(max(x,, .v,),...,max(x,,, y,,))

(Karlin and Rinott, 1980). If f is MTP, then it is also TP, in pairs. But, if the support of f is a product space, which is the case for the multivariate normal distribution, then MTP, and TP, in pairs are equivalent (Block and Ting, 1981). A random vector is said to be MTP, (respectively, absolute value MTP,, AMTP,) if its joint density is MTP, (respectively, the density of the absolute value of its components is MTP,). It follows from Karlin and Rinott (1981) that lJ, is AMTP, if and only if there exists a diagonal matrix D with diagonal elements being equal to -t 1, such that the off-diagonal elements of -DX’D are nonnegative. Also, Ur is MTP, if and only if the off-diagonal elements of -8-l are nonnegative (Karlin and Rinott, 1980). Now, if U, is AMTP, then it follows from Glaz and Johnson (1984) Theorem 2.3) that for 1 < k G L - 1,

= 7x9

(8)

where, (Y, ,, are defined in equation (7). Moreover, the sequence yk is increasing in k. We refer to (8) as the k th order product-type inequality. The product-type inequality (8) generalizes the well known Sidak (1967 and 1968) inequality:

which is valid without the AMTP, assumption about the distribution of UT. It is well known that the Sidak inequality (9) is tighter than the first order Bonferroni inequality. It follows from Glaz (1987, Theorem 3.1, and Corollary 3.2) that if UT 59

STATISTICS & PROBABILITY

Volume 12. Number 1 is

AMTP,, then for any 1 < k < L - 1, the kth order product-type inequality (8) is tighter than the Bonferroni-type inequality (6):

(Box

and

(X,,

XT-,,...

In Section 4, we will examine the amount of improvement gained by using higher order producttype inequalities.

3. The MTP, and AMTP, dependence structure for multiple forecasts In this section, we verify the dependence structure needed for constructing simultaneous prediction intervals for multiple forecasts that are based on the product-type inequality (8). Analytic results are derived for certain simple ARlMA( p, d, q) models. A computational approach for verifying this dependence structure is outlined, which makes it suitable for application in any ARIMA model. This approach can be easily extended to seasonal ARIMA models. The ARlMA( p, d, q) model for {x,1

is

(1 - @“+#)(X,

(11)

- p) = d,(B)%

where p is the process mean, B is the backshift operator, {E*} is Gaussian white noise. The stationary autoregressive operator &,(B)

= 1 - $QB -

and the invertible d,(B)

= 1 - 8,B -

. . . - $,B moving

average

operator

. . . - O,B

have all the roots outside the unit circle. The minimum mean square error forecasts of XT+, (i=l ,..., L) are given by XT(i)

=

E(XT+, IX> = ? ~,ET+,-,~ J=’

of the infinite where X, are the coefficients moving average representation of the ARIMA model and are functions of the model parameters 60

July 1991

Jenkins, 1976, pp. 95-96) and ). The forecast errors are

X=

r-l e,(i)

(10)

LETTERS

=

C X,&7-+,-,3 j=O

(12)

where, E{ e,(i) 1X} = 0. It follows from equation (12) that the elements of the covariance matrix of e,(L))‘, denoted by r, are eT=(e,(l),...,

(13) In practice, for a finite set of data (XT,. . . , X,), equation (13) is a good approximation if E,, t < 0, are set to their unconditional expectations of 0. From our discussion in Section 2, to asses the validity of the product-type inequality used in constructing the simultaneous prediction intervals, we must examine the sign of the elements of -r-‘(L) = {%,/%%XL. For the simple ARIMA models examined below, the elements of r, viz. y,,,, are computed analytically from equation (13). Using a symbolic manipulation program called MACSYMA, we derive explicit analytic formulas for the elements of -r-‘(L) and examine the conditions under which e, and equivalently U, is MTP, or AMTP,. We assume that X,‘s (or equivalently the process parameters) are known. If the process parameters are unknown, they can be replaced by their maximum likelihood estimators in (12) and (13). For large samples, the resulting simultaneous prediction intervals will converge to the intervals derived under the assumption of known process parameters, because of the invariance and asymptotic properties of the maximum likelihood estimators. For simplicity and without loss of generality, we set p = 0. Table 1 presents the elements of {q,:, } and the conditions for MTP, and AMTP, for five simple and commonly used ARlMA( p, d, q) models. It is seen that only under certain restrictions on the parameters, the MTP, or the AMTP, property holds for the distribution of eT or equivalently UT. For more complicated ARlMA( p, d, q) models, it is impractical to determine analytically the regions in the parameter space for which the product-type inequalities are

Volume Table

12, Number

1

STATISTICS

LETTERS

July 1991

1

Positive dependence

conditions

Model

for ARIMA

models Condition

VI.,

I$,

0.

(1~ B)(l-+43)X,

=

E,.

I@
j-1=1,

(1 - B)X,

= (l-

+,(l-&),

j-i=l,

l
@I,

i=L-1,

@I.

j-i=l. J-i=2,

0,

j-123.

$a)X,

AMTP,

O
-l<@
0<9,, O<@,il.

O
-l<@lQO.

-l
i>l,

(1 + +Y.

j-i=l.

l,CigL-2,

(1 + 9),

I-i=l,

r=L-1.

J-i=2,121, J-1>3.

(l_o[l+,--‘]

BBk,,

for

MTP,

otherwise.

- 9. 0.

(l-

& PROBABILITY

I-i=l,



(1+8)

l
= (1 - BE)&,,

101
valid. In this case we suggest determining numeritally the elements of I’, T and examine the aforementioned conditions on the sign of the ele-

ments of -T ‘. Recently, attention has been given to including correction terms in the elements of r to account for the effect of parameter estimation

Tahie 2 AR(l)

model -

k

Confidence

level

0.99

0.95

0.90

0.85

0.80

0.70

0.60

Product-type

1

3.29

2.80

2.56

2.41

2.29

2.11

1.96

interval

2

3.21

2.68

2.41

2.24

2.11

1.90

1.73

3

3.20

2.66

2.39

2.21

2.08

1.87

1.70

4

3.20

2.65

2.38

2.20

2.07

1 .X6

1.68

5

3.20

2.65

2.38

2.20

2.07

1.85

1.68

Bonferroni-type

1

3.29

2.81

2.58

2.43

2.33

2.17

2.05

interval

2

3.21

2.69

2.43

2.26

2.14

1.95

I.81

3

3.20

2.67

2.40

2.23

2.10

1.91

1.75

4

3.20

2.66

2.39

2.22

2.09

1.89

1.73

5

3.20

2.65

2.38

2.21

2.08

1.87

1.71

3.19

2.64

2.37

2.19

2.06

1.85

1.68

2.58

1.96

1.64

1.44

1.28

1.04

0.84

Simulation interval Marginal interval

Volume

12, Number

1

STATISTICS

& PROBABILITY

8 = 0.06 and SE2= 0.096. The results for this example are presented in Table 3. The values of c in the simultaneous prediction intervals were obtained with the aid of the algorithm for evaluating multivariate normal probabilities developed in Schervish (1984). For k = 6 and k = 7, the CPU time needed to perform the computations became prohibitively large.

(Stine, 1987). We wish to point out that our computational procedure is applicable to this situation as well.

4. Examples In this section we illustrate the Bonferroni-type and product-type simultaneous prediction intervals using Series D of Box and Jenkins (1976, p. 529). The data consists of 310 hourly uncontrolled viscosity readings of a chemical process. Box and Jenkins modeled this data both as an AR(l) process and an ARIMA(0, 1, 1) process.

For the simulated values of c for both examples, 50000 replicates of ten dimensional pseudorandom vectors that are normally distributed with a given mean and variance-covariance matrix were generated using the IMSL subroutine RNMVN. The values of c were incremented until the desired levels of confidence were achieved. From Tables 2-3 it is evident that the marginal prediction intervals are too short for the prescribed overall confidence level. Moreover, the discrepancy from the actual overall confidence level varies from one example to another. It is also clear from the numerical results that there is merit in using the higher order (>, 3) inequalities for constructing the simultaneous prediction intervals, especially for the lower confidence levels that are quite popular in many applications of forecasting. The improvement achieved in using the higher order product-type inequalities is most evident for lower confidence levels. The simulation study con-

Example 1. We model Series D as an, AR(l) process; the estimated parameters are 9 = 0.87 and *’ = 0.09. For L = 10, the c values in the simulta0, neous prediction intervals given in equation (3) based on the product-type inequality and the Bonferroni-type inequality for k = 1, 2, 3, 4 and 5 are presented in Table 2 for confidence levels of 0.99, 0.95, 0.90, 0.85, 0.80, 0.70 and 0.60. Also shown in Table 2 are the marginal c values and the ‘exact’ values that are computed by simulation. Example 2. We fit an ARIMA(0, Series D; the estimates of the

Table

July 1991

LETTERS

1, 1) model to parameters are

3

ARIMA(0,

1, 1) model k

Confidence 0.99

level 0.95

0.90

0.85

0.80

0.70

0.60

Product-type

1

3.29

2.80

2.56

2.41

2.29

2.11

1.96

interval

2

3.16

2.62

2.35

2.17

2.03

1.82

1.64

3.14

2.59

2.31

2.13

1.99

1 .I7

1.59

4

3.13

2.58

2.29

2.11

1.97

1.75

1.57

3.12

2.58

2.29

2.10

1.96

1.74

1.56

Bonferroni-type

1

3.29

2.81

2.58

2.43

2.33

2.17

2.05

interval

2

3.16

2.63

2.36

2.19

2.06

1.86

1.71

3

3.14

2.59

2.32

2.14

2.01

1.80

1.64

4

3.13

2.58

2.30

2.12

1.98

1.77

1.61

3.13

2.58

2.29

2.11

1.97

1.76

1.59

3.11

2.55

2.28

2.09

1.95

1.73

1.56

2.58

1.96

1.64

1.44

1.28

1.04

0.84

Simulation interval Marginal interval

62

Volume

12. Number

1

STATISTICS

& PROBABILITY

firms that the simultaneous prediction intervals based on the fifth order product-type inequalities are very accurate for the examples considered in this article. One can also conclude that in cases where the product-type inequalities are not valid, the high order (> 3) Bonferroni-type inequalities yield more accurate prediction intervals than the existing intervals.

Acknowledgements The authors are grateful to Professor Mark Schervish for providing the program for evaluating the multivariate normal probabilities.

References Barlow, R.E. and F. Proschan (1975) Statistical Theory of Reliability and Life Testing Probability Models (Holt, Rinehart and Winston, New York). Bhansali, R.J. (1974). Asymptotic mean-square error of predicting more than one-step ahead using the regression method, Appl. Statist. 23 (1) 35-42. Block, H.W., T. Cost&an and A.R. Sampson (1988) Optimal product-type probability bounds, Tech. Rept. No. 88-07, Dept. of Math. and Statist. Univ. of Pittsburgh (Pittsburgh, PA). Block, H.W. and M.L. Ting (1981) Some concepts of multivariate dependence, Comm. Statist. Ser. A 10, 749-762. Box, G.E.P. and G.M. Jenkins (1976), Time Series Analysis: Forecasting and Control (Holden Day, San Fransisco, CA). Glaz, J. and B.Mck. Johnson (1984). Probability inequalities

LETTERS

July 1991

for multivariate distributions with dependence structures, J. Amer. Statist. Assoc. 79, 436-440. Glaz, J. (1987). A comparison of Bonferroni-type and product-type inequalities in presence of dependence, Symp. on Dependence in Statist. and Probab. (Hidden Valley Conference Center, Pennsylvania). Hoover, D.R. (1990) Subset complement addition upper bounds - An improved inclusion-exclusion method, J. Statist. Plann. Inference 24, 195-202. Hoover, D.R. (1989) Comparison of improved Bonferroni and Sidak/Slepian bounds with applications to normal Markov processes, Tech. Rept. No. 412, Dept. of Statist., Stanford Univ. (Stanford, CA). Hunter, D. (1976) An upper bound for the probability of a union, J. Appl. Probab. 13, 597-603. Karlin, S. (1968) Total Positiuity (Stanford Univ. Press, Stanford, CA). Karlin, S. and Y. Rinott (1980) Classes of orderings of measures and related correlation inequalities - I: Multivariate totally positive distributions, J. Multiuariate Anal. 10, 467498. Karlin, S. and Y. Rinott (1981) Total positivity properties of absolute value multinormal variables with applications to confidence interval estimates and related probabilistic inequalities, Ann. Statist. 9, 1035-1049. Ravishanker, N., Y. Hochberg and E.L. Melnick (1987) Approximate simultaneous prediction intervals for multiple forecasts, Technometrics 29, 371-376. Schervish, M.J. (1984) Algorithm AS 195, multivariate normal probabilities with error bound, Appl. Statist. 33, 81-94. Sidak, Z. (1967) Rectangular confidence regions for means of multivariate normal distributions, J. Amer. Statist. Assoc. 62, 626-633. Sidak, Z. (1968) On multivariate normal probabilities of rectangles, Ann. Math. Statist. 42, 169-175. Stine, R.A. (1987) Estimating properties of autoregressive forecasts, J. Amer. Statist. Assoc. 82, 1072-1078.

63