Generalization of the Kolmogorov-Smirnov test


Computational Statistics & Data Analysis 24 (1997) 433-441

Erhard Reschenhofer

Universität Wien, A-1010 Vienna, Austria

Received April 1996; revised October 1996

Abstract

Because the Kolmogorov-Smirnov goodness-of-fit test has low power against multimodal alternatives, this paper investigates a family of tests indexed by an integer-valued parameter $K \ge 2$. The sensitivity of these tests to multimodal alternatives increases as K increases. The Kolmogorov-Smirnov test is obtained for K = 2. The other members of this family measure local deviations from uniformity in the same way as the Kolmogorov-Smirnov test measures the global deviation. Hence they may be regarded as generalized Kolmogorov-Smirnov tests. Tables of critical values are given and applications to the problem of testing for white noise are discussed. In addition, a further family of related tests is proposed. A Monte Carlo power study compares all the tests proposed in this note with Neyman's smooth test and different versions of the length test introduced by Reschenhofer and Bomze (Biometrika, 78 (1991) 207-216; Biometrika, 79 (1992) 859). A secondary result of this simulation study is that what is possibly the most widely used version of the length test can break down when a peak is located near the boundary.

Keywords: Goodness-of-fit tests; Length tests; Tests for white noise

0167-9473/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved
PII S0167-9473(96)00077-1

1. Introduction

A standard test for white noise is due to Bartlett (1954, 1955). It is based on the fact that, under the null hypothesis of Gaussian white noise, the normalized integrated periodogram

$$J_k = \frac{\sum_{j=1}^{k} I_j}{\sum_{j=1}^{m} I_j}, \qquad k = 1, \ldots, m-1,$$

where

$$I_j = (2\pi n)^{-1} \left| \sum_{t=1}^{n} x_t \exp(i\omega_j t) \right|^2, \qquad \omega_j = 2\pi j/n \quad \text{and} \quad m = [(n-1)/2],$$

of a sequence $x_1, \ldots, x_n$ has the same distribution as an ordered random sample from a uniform distribution on (0, 1). Here [·] denotes the integral part. This allows the problem of testing for white noise to be reduced to that of testing the hypothesis that a random sample comes from a uniform distribution. A natural test for the latter hypothesis is the Kolmogorov-Smirnov test, which rejects the null hypothesis whenever the maximum deviation between the empirical distribution function and the theoretical distribution function is too large. But in spite of the well-known inefficiency of the Kolmogorov-Smirnov test in the case of multimodal alternatives and the availability of a large number of alternative tests, many time series analysts still rely almost exclusively on the Kolmogorov-Smirnov test. Therefore Schlittgen (1989) and Reschenhofer and Bomze (1991) advocated the application of alternative tests to the normalized integrated periodogram.

The latter authors considered tests which are based on estimates of the length of the graph of the normalized spectral distribution function. Under the null hypothesis, the graph of the normalized spectral distribution function is linear on the interval (0, 1) and its length is $\sqrt{2}$ (when the frequency is given as a fraction of $\pi$). Any deviation from the null hypothesis implies an increase in the length, hence the null hypothesis may be rejected when the estimated length is significantly greater than $\sqrt{2}$. A simple estimator is the length of the normalized linearly interpolated integrated periodogram. Of course, this estimator cannot be efficient as it takes into account only the size of the periodogram ordinates but not their arrangement. Another disadvantage is that it asymptotically overestimates the true length (see Reschenhofer and Bomze, 1991). An obvious further development is therefore to replace the periodogram ordinates in this test statistic by smoother quantities obtained by averaging neighboring ordinates.
The choice of this crude smoothing procedure does allow Hall's (1986) theory of sum-functions of spacings between adjacent order statistics to be applied to derive the asymptotic distribution of the associated test statistic, but at the same time it causes unpleasant boundary effects (see Section 3). In addition, simulations show that this normal approximation cannot be used to determine critical values even for relatively large sample sizes (see Dittrich et al., 1993; note that approximations by linear combinations of chi-squares might be better than normal approximations for this sort of test statistic, see Guttorp and Lockhart, 1989). Of course, nonparametric tests for white noise need not necessarily be based on the normalized integrated periodogram. For example, the adaptive tests proposed by Reschenhofer (1989) and Reschenhofer and Bomze (1992) use information contained in the real part of the finite Fourier transform of the data to construct a suitable test statistic for the imaginary part and vice versa. However, the results of Monte Carlo power studies show that these adaptive tests pay dearly for their superior behavior under multimodal alternatives: their power is relatively low when there is only one peak in the spectrum. In addition, their competitive position gets worse as the sample size increases.


The purpose of the present paper is to introduce both a family of length tests which allow a different degree of smoothing for different parts of the periodogram and a closely related family of tests which may be interpreted as generalized Kolmogorov-Smirnov tests. Both families of tests are described in Section 2. In the same section, tables of critical values are given. Finally, Section 3 presents the results of a Monte Carlo power study.
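As an illustration of the normalized integrated periodogram on which Bartlett's test and the tests of this paper are based, the following Python sketch computes $J_1, \ldots, J_{m-1}$ directly from the definition given at the start of this section. It is only a sketch under stated assumptions: the function names are hypothetical, the Fourier frequencies are taken to be $\omega_j = 2\pi j/n$, and the finite Fourier transform is evaluated naively rather than with a fast Fourier transform.

```python
import cmath
import math
import random


def periodogram(x):
    """Periodogram ordinates I_1..I_m at the frequencies w_j = 2*pi*j/n (assumed)."""
    n = len(x)
    m = (n - 1) // 2  # integral part of (n - 1)/2
    ordinates = []
    for j in range(1, m + 1):
        w = 2.0 * math.pi * j / n
        s = sum(x[t - 1] * cmath.exp(1j * w * t) for t in range(1, n + 1))
        ordinates.append(abs(s) ** 2 / (2.0 * math.pi * n))
    return ordinates


def normalized_integrated_periodogram(x):
    """Return J_1..J_{m-1}: partial sums of the periodogram divided by its total."""
    ordinates = periodogram(x)
    total = sum(ordinates)
    result, running = [], 0.0
    for value in ordinates[:-1]:  # J_m = 1 is omitted
        running += value
        result.append(running / total)
    return result


rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(41)]  # n = 41, hence m = 20
J = normalized_integrated_periodogram(x)
print(len(J))  # 19 values J_1..J_19
```

The factor $(2\pi n)^{-1}$ cancels in the ratio, so it has no influence on the values $J_k$.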

2. Test statistics and critical values

The empirical distribution function, $\hat F$, of an ordered random sample $0 < J_1 \le \cdots \le J_{m-1} \le 1$ may be characterized by the 2m edge points $P_1 = (0, 0)$, $P_2 = (J_1, 0)$, $P_3 = (J_1, (m-1)^{-1})$, ..., $P_{2m-1} = (J_{m-1}, 1)$, $P_{2m} = (1, 1)$. An approximation of $\hat F$ can be obtained by connecting some selected edge points with straight lines. In the simplest nontrivial case, the subset of selected points contains only three points: $P_1$, $P_{2m}$, and some further point $P_k$. The length of the graph of the associated approximation, $\hat F_{(k)}$, of $\hat F$ is given by

$$L(1, k, 2m) = (X_k^2 + Y_k^2)^{1/2} + ((1 - X_k)^2 + (1 - Y_k)^2)^{1/2},$$

where $X_k$ and $Y_k$ denote the Cartesian coordinates of $P_k$. A test of the null hypothesis, $H_0$, that the sequence $J_1, \ldots, J_{m-1}$ has the same distribution as an ordered random sample from a uniform distribution on (0, 1), may be based on the maximum length

$$\max_k L(1, k, 2m)$$

of such broken lines. This test is closely related to the Kolmogorov-Smirnov test as both tests take into account only a single point of the empirical distribution function and are therefore rather insensitive to multimodal alternatives (see also Section 3). However, the former test can very easily be adapted to the case of multimodal alternatives by just permitting the approximation of $\hat F$ by broken lines with more than one break. This leads to test statistics of the form

$$L_K = \max_{1 = k_1 < \cdots < k_K = 2m} L(k_1, \ldots, k_K), \qquad K \ge 3,$$

where

$$L(k_1, \ldots, k_K) = \sum_{j=2}^{K} ((X_{k_j} - X_{k_{j-1}})^2 + (Y_{k_j} - Y_{k_{j-1}})^2)^{1/2}.$$
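The maximization over all index sequences $1 = k_1 < \cdots < k_K = 2m$ need not be carried out by enumeration. The following Python sketch builds the 2m edge points and computes the maximum broken-line length by dynamic programming over the end point of the line. The dynamic programming scheme and the function names are assumptions made for this illustration; the paper does not state how the maximum was computed.

```python
import math


def edge_points(J):
    """Return the 2m edge points P_1..P_2m of an ordered sample J_1 <= ... <= J_{m-1}."""
    n = len(J)  # n = m - 1 sample points
    points = [(0.0, 0.0)]
    for i, x in enumerate(J, start=1):
        points.append((x, (i - 1) / n))  # lower corner of the step at J_i
        points.append((x, i / n))        # upper corner of the step at J_i
    points.append((1.0, 1.0))
    return points


def max_broken_line_length(points, K):
    """Maximum length of a broken line through K points, both end points fixed."""
    N = len(points)
    if not 2 <= K <= N:
        raise ValueError("K must satisfy 2 <= K <= number of points")
    minus_inf = float("-inf")
    # best[i]: maximum length of a broken line from points[0] to points[i]
    # that passes through the current number of selected points.
    best = [minus_inf] * N
    best[0] = 0.0  # one selected point, nothing drawn yet
    for _ in range(K - 1):  # each pass adds one selected point
        extended = [minus_inf] * N
        for i in range(1, N):
            for h in range(i):
                if best[h] > minus_inf:
                    candidate = best[h] + math.dist(points[h], points[i])
                    if candidate > extended[i]:
                        extended[i] = candidate
        best = extended
    return best[N - 1]


points = edge_points([0.5])              # m - 1 = 1, hence 2m = 4 edge points
L3 = max_broken_line_length(points, 3)   # broken line with a single break
```

With N points the sketch needs of the order of $K N^2$ distance evaluations, which is unproblematic for the sample sizes considered in this paper.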

An analogous adaptation of the Kolmogorov-Smirnov statistic amounts to measuring the deviation from uniformity locally in disjoint segments and using the maximum sum of the local deviations as test statistic. The details are as follows. Let $P_{k_1}, \ldots, P_{k_K}$ be a sequence of K edge points, where $1 = k_1 < \cdots < k_K = 2m$. For each pair $(k_{j-1}, k_j)$, the local deviation from uniformity is given by the maximum deviation of the points $P_{k_{j-1}}, P_{k_{j-1}+1}, \ldots, P_{k_j}$ from the straight line joining $P_{k_{j-1}}$ and $P_{k_j}$. The global test statistic is then obtained by maximizing the sum of local deviations over all sequences of K edge points:

$$KS_K = \max_{1 = k_1 < \cdots < k_K = 2m} \sum_{j=2}^{K} d(k_{j-1}, k_j), \qquad K \ge 2,$$

where

$$d(k_{j-1}, k_j) = \max_{k_{j-1} \le \ell \le k_j} \left| Y_\ell - \left( Y_{k_{j-1}} + \frac{X_\ell - X_{k_{j-1}}}{X_{k_j} - X_{k_{j-1}}} (Y_{k_j} - Y_{k_{j-1}}) \right) \right|.$$

Obviously, a choice of K = 2 yields the ordinary Kolmogorov-Smirnov test. To facilitate the computation of critical values, we shall consider the slightly modified test statistics $\underline{L}_K$ and $\underline{KS}_K$ which use only the m + 1 edge points $\underline{P}_1 = (0, 0)$, $\underline{P}_2 = (J_1, (m-1)^{-1})$, $\underline{P}_3 = (J_2, 2(m-1)^{-1})$, ..., $\underline{P}_{m+1} = (1, 1)$. As we shall see in Section 3, the behavior of $\underline{L}_K$ and $\underline{KS}_K$ is quite similar to that of $L_K$ and $KS_K$, respectively.

Critical values for the test statistics $\underline{L}_K$, K = 3, 4, 5, and $\underline{KS}_K$, K = 2, 3, 4, are given in Table 1. The critical values of $\underline{L}_K$ were calculated by generating 500 000 pseudo-random samples for m - 1 = 5, 6, ..., 16, 300 000 for m - 1 = 17, 18, ..., 29, 200 000 for m - 1 = 30, 31, ..., 41, and 100 000 for m - 1 = 42, 43, ..., 50. The critical values of $\underline{KS}_K$ were calculated by generating 500 000 pseudo-random samples for m - 1 = 5, 6, ..., 33 and 300 000 for m - 1 = 34, 35, ..., 50.
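The modified generalized Kolmogorov-Smirnov statistic and the simulation of its critical values can be sketched in Python as follows. This is an illustration under assumptions, not the procedure actually used for Table 1: the function names are hypothetical, the maximum is found by dynamic programming, a vertical chord is assumed to contribute no deviation, and far fewer replications are used than in the paper.

```python
import random


def modified_points(J):
    """The m + 1 points (0,0), (J_k, k/(m-1)) for k = 1..m-1, and (1,1)."""
    n = len(J)  # n = m - 1
    return [(0.0, 0.0)] + [(x, k / n) for k, x in enumerate(J, start=1)] + [(1.0, 1.0)]


def local_deviation(points, a, b):
    """Maximum vertical deviation of points[a..b] from the chord joining the two ends."""
    (xa, ya), (xb, yb) = points[a], points[b]
    if xb == xa:
        return 0.0  # assumption: a vertical chord contributes no deviation
    slope = (yb - ya) / (xb - xa)
    return max(abs(y - (ya + slope * (x - xa))) for x, y in points[a:b + 1])


def generalized_ks(points, K):
    """Maximum sum of local deviations over K selected points, end points fixed."""
    N = len(points)
    if not 2 <= K <= N:
        raise ValueError("K must satisfy 2 <= K <= number of points")
    minus_inf = float("-inf")
    best = [minus_inf] * N
    best[0] = 0.0
    for _ in range(K - 1):  # each pass adds one selected point
        extended = [minus_inf] * N
        for i in range(1, N):
            for h in range(i):
                if best[h] > minus_inf:
                    candidate = best[h] + local_deviation(points, h, i)
                    if candidate > extended[i]:
                        extended[i] = candidate
        best = extended
    return best[N - 1]


def critical_value(n, K, reps=2000, level=0.05, seed=0):
    """Empirical upper quantile of the modified statistic for n = m - 1 uniform points."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        sample = sorted(rng.random() for _ in range(n))
        values.append(generalized_ks(modified_points(sample), K))
    values.sort()
    return values[min(reps - 1, int((1.0 - level) * reps))]


points = modified_points([0.1, 0.4, 0.9])
ks2 = generalized_ks(points, 2)  # ordinary maximum deviation on the modified points
ks3 = generalized_ks(points, 3)  # two segments
```

For K = 2 the statistic reduces to the largest value of $|k/(m-1) - J_k|$. With a sufficient number of replications, `critical_value(5, 2)` should come out close to the corresponding entry of Table 1, although the sketch makes no attempt to reproduce the table exactly.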

Table 1
Critical values (each table entry is the value of $\underline{KS}_K$ and $\underline{L}_K$, respectively, such that its right-tail probability is 5%)

m-1   \underline{KS}_2  \underline{KS}_3  \underline{KS}_4  \underline{L}_3  \underline{L}_4  \underline{L}_5
5     0.513             0.559             0.503             1.610            1.645            1.667
6     0.473             0.539             0.517             1.584            1.620            1.645
7     0.442             0.517             0.515             1.565            1.601            1.628
8     0.417             0.499             0.509             1.550            1.586            1.613
9     0.396             0.481             0.501             1.538            1.573            1.600
10    0.377             0.465             0.492             1.528            1.562            1.589
11    0.362             0.452             0.482             1.520            1.553            1.580
12    0.347             0.439             0.473             1.512            1.545            1.571
13    0.335             0.427             0.464             1.506            1.538            1.564
14    0.324             0.416             0.456             1.501            1.532            1.557
15    0.314             0.406             0.447             1.496            1.526            1.551
16    0.305             0.397             0.439             1.491            1.521            1.545
17    0.297             0.388             0.432             1.488            1.516            1.540
18    0.289             0.380             0.424             1.484            1.512            1.535
19    0.282             0.372             0.417             1.481            1.509            1.531
20    0.276             0.365             0.411             1.478            1.505            1.527
21    0.269             0.358             0.404             1.476            1.502            1.523
22    0.264             0.352             0.398             1.473            1.499            1.520
23    0.258             0.346             0.393             1.471            1.496            1.517
24    0.253             0.340             0.387             1.469            1.494            1.514
25    0.249             0.335             0.381             1.467            1.491            1.511
26    0.244             0.330             0.377             1.466            1.489            1.509
27    0.240             0.325             0.371             1.464            1.487            1.506
28    0.236             0.320             0.367             1.462            1.485            1.504
29    0.232             0.316             0.362             1.461            1.483            1.502
30    0.228             0.311             0.358             1.459            1.481            1.499
31    0.225             0.307             0.354             1.458            1.479            1.497
32    0.222             0.303             0.350             1.458            1.479            1.497
33    0.219             0.299             0.346             1.457            1.478            1.496
34    0.216             0.296             0.342             1.456            1.476            1.494
35    0.213             0.292             0.338             1.454            1.474            1.490
36    0.210             0.289             0.335             1.453            1.473            1.489
37    0.207             0.286             0.332             1.452            1.471            1.487
38    0.205             0.283             0.328             1.451            1.470            1.486
39    0.202             0.280             0.325             1.450            1.469            1.485
40    0.199             0.276             0.322             1.449            1.468            1.483
41    0.198             0.274             0.319             1.449            1.467            1.482
42    0.195             0.271             0.316             1.448            1.466            1.481
43    0.193             0.268             0.313             1.447            1.465            1.479
44    0.191             0.266             0.310             1.447            1.464            1.479
45    0.189             0.263             0.308             1.446            1.463            1.477
46    0.187             0.261             0.305             1.445            1.462            1.476
47    0.185             0.258             0.302             1.445            1.461            1.475
48    0.184             0.256             0.300             1.444            1.461            1.474
49    0.182             0.254             0.298             1.444            1.460            1.473
50    0.180             0.252             0.295             1.443            1.459            1.472

3. Monte Carlo power study

Table 2 presents the results of a simulation study. For six alternatives (a)-(f) and two sample sizes (m = 20 and m = 50), the power of sixteen tests ($L_3$, $L_4$, $L_5$, $KS_2$, $KS_3$, $KS_4$, $\underline{L}_3$, $\underline{L}_4$, $\underline{L}_5$, $\underline{KS}_2$, $\underline{KS}_3$, $\underline{KS}_4$, L, Sm[L], Sm(L), and $\Psi_4^2$) for white noise was estimated. The first 12 of these tests were introduced in the previous section. Note that $KS_2$ coincides with the ordinary Kolmogorov-Smirnov test.

Table 2
Results of a Monte Carlo power study

m = 20

Proc.  L_3    L_4    L_5    KS_2   KS_3   KS_4
(a)    7989   6718   6664   8348   6513   4721
(b)    1643   5689   4369   1379   4460   4044
(c)    3082   5120   5743   2844   5400   4963
(d)    1447   3282   3347   2141   3436   3712
(e)    2183   3352   4077   1877   3867   4410
(f)    1332   2360   2537   1611   2712   3046

Proc.  \underline{L}_3  \underline{L}_4  \underline{L}_5  \underline{KS}_2  \underline{KS}_3  \underline{KS}_4
(a)    8563             7887             7557             8715              7142              5252
(b)    1648             5364             4824             1195              4938              4499
(c)    3221             5590             6621             2564              6114              6124
(d)    1361             2674             3239             1963              3164              3768
(e)    2204             3390             4350             1773              3701              4944
(f)    1220             1888             2266             1441              2362              2760

Proc.  L      Sm[L]  Sm(L)  \Psi_4^2
(a)    5040   7856   603    7300
(b)    4830   7144   8647   6575
(c)    4464   5860   2296   5589
(d)    4220   3997   5594   3907
(e)    3932   2728   1263   2153
(f)    3683   1842   835    1732

m = 50

Proc.  L_3    L_4    L_5    KS_2   KS_3   KS_4
(a)    999    984    978    998    981    922
(b)    820    961    935    767    948    931
(c)    639    936    957    527    961    953
(d)    285    815    849    365    871    906
(e)    434    786    866    362    837    931
(f)    217    644    709    279    697    867

Proc.  \underline{L}_3  \underline{L}_4  \underline{L}_5  \underline{KS}_2  \underline{KS}_3  \underline{KS}_4
(a)    999              993              986              999               984               934
(b)    808              968              955              756               969               947
(c)    686              954              972              525               968               970
(d)    293              802              880              349               861               916
(e)    441              799              895              351               847               957
(f)    202              601              720              273               666               859

Proc.  L      Sm[L]  Sm(L)  \Psi_4^2
(a)    801    993    253    993
(b)    806    979    995    984
(c)    778    975    430    966
(d)    743    964    984    919
(e)    774    942    454    407
(f)    748    910    947    390

Note: Each table entry is the number of rejections of the hypothesis of Gaussian white noise; the total number of applications is 10 000 for m = 20 and 1000 for m = 50; m is the number of periodogram ordinates. $L_{k+2}$: test based on a broken line with k breaks, $KS_{k+1}$: generalized Kolmogorov-Smirnov test based on k segments, L: length test, Sm[L]: smoothed length test (with boundary), Sm(L): smoothed length test (without boundary), $\Psi_4^2$: Neyman's smooth test.

The next three tests are different versions of the length test introduced by Reschenhofer and Bomze (1991). L denotes the test based on the statistic

$$T_L = \sum_{k=1}^{m} (D_k^2 + m^{-2})^{1/2},$$

where $D_k = I_k/(I_1 + \cdots + I_m)$. Under the null hypothesis, we have $\sqrt{m}(T_L - \mu_L) \to N(0, \sigma_L^2)$ in distribution as $m \to \infty$, where $\mu_L \approx 1.53886$ and $\sigma_L^2 \approx 0.0169$ (see Reschenhofer and Bomze, 1991). The smoothed length test Sm[L] is based on the statistic

$$T_{Sm[L]} = \sum_{k=1}^{m} (\tilde D_k^2 + m^{-2})^{1/2},$$

where $\tilde D_k = \bar D_k/(\bar D_1 + \bar D_2 + \cdots + \bar D_m)$, $\bar D_k = (k_{II} - k_I + 1)^{-1}(D_{k_I} + \cdots + D_{k_{II}})$, $k_I = \max(k - r, 1)$, $k_{II} = \min(k + r, m)$, and r is some positive integer. Sm(L) is a simplified version of Sm[L]. It is based on the statistic

$$T_{Sm(L)} = \sum_{k=r+1}^{m-r} (\bar D_k^2 + m^{-2})^{1/2}.$$

The main reason for considering this statistic is that its asymptotic behavior is convenient to study. Under the null hypothesis, we have $m^{3/4}(T_{Sm(L)} - \sqrt{2}(1 + 7(8r)^{-1})) \to N(0, 3/8)$ in distribution as $m \to \infty$ (see Reschenhofer and Bomze, 1991). Tables of critical values are available for $r = [\frac{1}{2}\sqrt{m}]$ (see Dittrich et al., 1993). Critical values of the statistic $T_{Sm[L]}$ (again for $r = [\frac{1}{2}\sqrt{m}]$) can be found in Reschenhofer and Bomze (1991). Critical values of the statistic $T_L$ can be found both in Reschenhofer and Bomze (1991) and in Dittrich et al. (1993).

The last test, $\Psi_4^2$, is a particular version of Neyman's smooth test (see Neyman, 1937; Rayner and Best, 1989) which is powerful for testing for uniformity against alternatives with densities of the form

$$h(x, k, \theta_1, \ldots, \theta_k) = \exp\Bigl\{ \sum_{j=0}^{k} \theta_j \Pi_j(x) - K(k, \theta_1, \ldots, \theta_k) \Bigr\},$$

where the $\Pi_j$ are orthogonal polynomials related to the Legendre polynomials and $K(k, \theta_1, \ldots, \theta_k)$ is a normalizing constant. The first five polynomials are:

$$\Pi_0(x) = 1,$$
$$\Pi_1(x) = \sqrt{3}(2x - 1),$$
$$\Pi_2(x) = \sqrt{5}(6x^2 - 6x + 1),$$
$$\Pi_3(x) = \sqrt{7}(20x^3 - 30x^2 + 12x - 1),$$
$$\Pi_4(x) = 3(70x^4 - 140x^3 + 90x^2 - 20x + 1).$$

Neyman's test statistic is given, in its standard form for the sample $J_1, \ldots, J_{m-1}$ (see Rayner and Best, 1989), by

$$\Psi_k^2 = \sum_{j=1}^{k} \Bigl( (m-1)^{-1/2} \sum_{i=1}^{m-1} \Pi_j(J_i) \Bigr)^2.$$
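Neyman's statistic with k = 4 can be evaluated directly from the four polynomials listed above. The following Python sketch is an illustration only: the function names are hypothetical, and the normalization by the sample size is the standard form of the statistic, which is assumed here to be the one used in the simulation study.

```python
import math


def legendre_basis(x):
    """Values of Pi_1(x), ..., Pi_4(x), orthonormal on (0, 1)."""
    return [
        math.sqrt(3.0) * (2 * x - 1),
        math.sqrt(5.0) * (6 * x ** 2 - 6 * x + 1),
        math.sqrt(7.0) * (20 * x ** 3 - 30 * x ** 2 + 12 * x - 1),
        3.0 * (70 * x ** 4 - 140 * x ** 3 + 90 * x ** 2 - 20 * x + 1),
    ]


def neyman_statistic(sample, k=4):
    """Sum over j = 1..k of the squared normalized sums of Pi_j over the sample."""
    if not 1 <= k <= 4:
        raise ValueError("this sketch only provides Pi_1..Pi_4")
    n = len(sample)
    sums = [0.0] * k
    for x in sample:
        for j, value in enumerate(legendre_basis(x)[:k]):
            sums[j] += value
    # (n ** -0.5 * sum) ** 2 equals sum ** 2 / n
    return sum(s * s for s in sums) / n


psi = neyman_statistic([0.1, 0.4, 0.9], k=4)
```

Under uniformity the statistic is asymptotically chi-squared with k degrees of freedom, so large values speak against the null hypothesis. For the small samples considered here, simulated critical values may be preferable to the asymptotic ones.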

Of course, the properties of Neyman's test depend on the choice of k. Neyman felt that taking k = 4 or 5 would detect a sufficiently broad class of alternatives.

Each entry of Table 2 is the number of rejections of the hypothesis of white noise. For each alternative (a)-(f), 10 000 samples of size m = 20 and 1000 samples of size m = 50 were generated. The alternatives are given by the following autoregressive processes:

(a) $y_t + \alpha_1 y_{t-1} = \varepsilon_t$, $\alpha_1 = 0.5$,
(b) $y_t + \alpha_2 y_{t-2} = \varepsilon_t$, $\alpha_2 = 0.5$,
(c) $y_t + \alpha_3 y_{t-3} = \varepsilon_t$, $\alpha_3 = 0.5$,
(d) $y_t + \alpha_4 y_{t-4} = \varepsilon_t$, $\alpha_4 = 0.5$,
(e) $y_t + \alpha_5 y_{t-5} = \varepsilon_t$, $\alpha_5 = 0.5$,
(f) $y_t + \alpha_6 y_{t-6} = \varepsilon_t$, $\alpha_6 = 0.5$.

Fig. 1 shows the shapes of the spectral densities of these processes. The spectral densities of the processes (a) and (b) have a single peak, the spectral densities of the processes (c) and (d) have two peaks, and the spectral densities of the processes (e) and (f) have three peaks.

Fig. 1. Spectral densities of the six alternatives (a)-(f) used in the Monte Carlo power study.

Consider Table 2 again. As was to be expected, the ordinary Kolmogorov-Smirnov test ($KS_2$) and the closely related test $L_3$ are very powerful in the case of alternative (a), where the only peak is located at frequency $\pi$. However, the power of these tests decreases rapidly as the peak moves to the middle of the interval $(0, \pi)$. The generalized Kolmogorov-Smirnov tests $KS_3$ and $KS_4$ and the related tests $L_4$ and $L_5$ are more sensitive to multimodal alternatives, but at the same time they are less competitive in the case of alternatives with a single peak. In general, the results obtained with the computationally simpler tests $\underline{KS}_K$ and $\underline{L}_K$ are similar to those obtained with the corresponding tests $KS_K$ and $L_K$.

The simplest version of the length test, i.e. the test L, is not competitive since its power increases only slowly as the sample size increases. The smoothed length test Sm[L] is quite powerful in all cases. Unfortunately, this is not true for the second smoothed length test, Sm(L), which can break down when a peak is located near 0 or near $\pi$. Obviously, this is a consequence of the employed smoothing method. Neyman's smooth test comes out a bit worse than the smoothed length test Sm[L]. In particular, for the alternatives (e) and (f) polynomials of higher order would be required.

The asymptotics for the test statistics proposed in this paper seem to be quite delicate. Moreover, it is expected that future characterizations of the asymptotic distributions will be of little value for the computation of critical values.
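The three length statistics compared in this section, and the boundary effect of the simplified smoothing, can be illustrated with the following Python sketch. It rests on assumptions: the function names are hypothetical, and Sm(L) is taken to sum the unnormalized window means $\bar D_k$ over $k = r+1, \ldots, m-r$. The artificial periodograms at the end are chosen only to show the mechanism and are not taken from the simulation study.

```python
import math


def normalize(values):
    total = sum(values)
    return [v / total for v in values]


def window_means(D, r):
    """Mean of D over the window max(k - r, 1) .. min(k + r, m) for k = 1..m."""
    m = len(D)
    means = []
    for k in range(1, m + 1):
        lo, hi = max(k - r, 1), min(k + r, m)
        means.append(sum(D[lo - 1:hi]) / (hi - lo + 1))
    return means


def length_statistic(ordinates):
    """T_L: length of the normalized linearly interpolated integrated periodogram."""
    m = len(ordinates)
    return sum(math.sqrt(d * d + 1.0 / (m * m)) for d in normalize(ordinates))


def smoothed_length_with_boundary(ordinates, r):
    """T_Sm[L]: window means truncated at the boundary and renormalized."""
    m = len(ordinates)
    smooth = normalize(window_means(normalize(ordinates), r))
    return sum(math.sqrt(d * d + 1.0 / (m * m)) for d in smooth)


def smoothed_length_without_boundary(ordinates, r):
    """T_Sm(L): only the windows k = r+1 .. m-r are used (assumed form)."""
    m = len(ordinates)
    means = window_means(normalize(ordinates), r)
    return sum(math.sqrt(means[k - 1] ** 2 + 1.0 / (m * m))
               for k in range(r + 1, m - r + 1))


m, r = 20, 2
flat = [1.0] * m
peak_mid = flat[:]
peak_mid[9] = 20.0    # artificial peak in the middle (k = 10)
peak_edge = flat[:]
peak_edge[19] = 20.0  # artificial peak at the boundary (k = m)

mid_value = smoothed_length_without_boundary(peak_mid, r)
edge_value = smoothed_length_without_boundary(peak_edge, r)
```

In this example the peak in the middle enters 2r + 1 of the windows used by Sm(L), whereas the peak at the boundary enters only one, so `edge_value` is smaller than `mid_value`. The statistic $T_L$, in contrast, is the same for both periodograms because it depends only on the sizes of the ordinates and not on their arrangement.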

Acknowledgements

I would like to thank I.M. Bomze and the referees for helpful comments.


References

Bartlett, M.S., Problèmes de l'analyse spectrale des séries temporelles stationnaires, Publ. Inst. Stat., Univ. Paris, III-3 (1954) 119-134.
Bartlett, M.S., An introduction to stochastic processes with special reference to methods and applications (Cambridge University Press, London, 1955).
Dittrich, R., E. Reschenhofer and I.M. Bomze, Behavior of the length test for medium sample sizes, Comm. Statist. Theory Methods, 22 (1993) 2517-2525.
Guttorp, P. and R.A. Lockhart, On the asymptotic distributions of high-order spacings statistics, Canad. J. Statist., 17 (1989) 419-426.
Hall, P., On powerful distributional tests based on sample spacings, J. Multivariate Anal., 19 (1986) 201-224.
Rayner, J.C.W. and D.J. Best, Smooth tests of goodness of fit (Oxford University Press, New York, 1989).
Reschenhofer, E., Adaptive test for white noise, Biometrika, 76 (1989) 629-632.
Reschenhofer, E. and I.M. Bomze, Length tests for goodness of fit, Biometrika, 78 (1991) 207-216 (Amendment: Biometrika, 79 (1992) 859).
Reschenhofer, E. and I.M. Bomze, Testing for white noise against multimodal spectral alternatives, J. Time Ser. Anal., 13 (1992) 435-439.
Schlittgen, R., Tests for white noise in the frequency domain, Comput. Statist. Quart., 4 (1989) 281-288.