Testing for no effect in nonparametric regression


Journal of Statistical Planning and Inference 36 (1993) 1-14

North-Holland

R.L. Eubank and V.N. LaRiccia

Received 6 March 1990; revised manuscript received 13 May 1991

Recommended by M.E. Bock

Abstract: The large sample properties of three tests for no effect in nonparametric regression are investigated. The tests can all be represented as weighted sums of squared sample Fourier coefficients. The type of weighting employed by a test is shown to be crucial for its asymptotic and finite sample power properties.

AMS Subject Classifications: Primary 62G10; secondary 63102.

Key words and phrases: Nonparametric regression; tests for no effect; large sample behavior.

Introduction

A number of tests of parametric hypotheses have been developed recently that employ nonparametric smoothing techniques: e.g., Cox and Koh (1989), Cox et al. (1988), Härdle and Mammen (1988), Müller (1992), King (1988), Staniswalis and Severini (1991), Raz (1990) and Eubank and Spiegelman (1990). Such tests have the advantage of being valid under weak assumptions concerning the form of the alternative model while, at the same time, maintaining good power properties. In this paper we propose and study three tests for the hypothesis of no effect for a regressor variable that derive from nonparametric regression methodology. Assume that responses $y_1, \ldots, y_n$ are obtained from the model

$$y_r = \mu + f(t_r) + \varepsilon_r, \qquad r = 1, \ldots, n, \tag{1}$$

where the $\varepsilon_r$ are i.i.d. normal errors with zero mean and variance $\sigma^2$, the $t_r$ are known values of a predictor variable $t$, $\mu$ is an unknown parameter and $f$ is some


unknown, smoothly periodic, function. We wish to test the hypothesis that $t$ has no influence on the response, i.e., we want to test $H_0$: $f \equiv 0$. In the next section we propose three tests for $H_0$. One is a parallel of a test by Cox and Koh (1989) for the setting of model (1). The other two are obtained by fitting series and smoothing spline estimators to residuals from a fit of the null model with $f = 0$. The large sample behavior of the three tests is studied in Section 2 under both the null hypothesis and sequences of local alternatives. The Cox and Koh type test is found to be capable of detecting alternatives converging to the null at a parametric rate while the series and smoothing spline based tests can only detect alternatives converging at slower nonparametric rates. However, the asymptotic power of the Cox and Koh test is not a strictly increasing function of the distance of the alternative from the null model, which gives it a disadvantage when compared to the other two tests. This is shown analytically in Section 2 and demonstrated empirically in a small simulation study in Section 3. Proofs of all results are collected in an Appendix. Similar testing problems to the one studied here have been considered by, e.g., Eubank and Spiegelman (1990) where the response is modeled as a polynomial of some fixed order plus an unknown, nonperiodic function. The goal in that case is to test the goodness-of-fit of a polynomial regression function. Unfortunately, the distribution theory for the Cox and Koh test under local alternatives is quite involved for general polynomial models. By restricting attention to the no effect hypothesis, periodic $f$ and a uniform design we are able to rigorously obtain the asymptotic distribution theory we need for comparison of the Cox and Koh test with other statistics.
Despite our more limited framework of analysis, we believe the results derived here are indicative of what is true for testing goodness-of-fit for more general polynomial models.

The proposed tests

For asymptotic analysis we reformulate model (1) in the following fashion. It is now assumed that, for each $n$, responses $y_{rn}$ are obtained from the model

$$y_{rn} = \mu + h(n)\,g((r-1)/n) + \varepsilon_{rn}, \qquad r = 1, \ldots, n, \tag{2}$$

where the $\varepsilon_{rn}$ are i.i.d., zero mean, normal random variables with variance $\sigma^2$, $\mu$ is an unknown parameter, $g$ is an unknown smoothly periodic function and $h(n)$ is some function of the sample size that satisfies $h(n) \to 0$ as $n \to \infty$. Some specific choices for $h(n)$ will be discussed subsequently. In saying that $g$ is smoothly periodic we mean that $g \in W^2_{2,\mathrm{per}} = \{p : p, p'$ are absolutely continuous, $p(0) = p(1)$, $p'(0) = p'(1)$ and $\int_0^1 p''(t)^2\,dt < \infty\}$. Without loss of generality $g$ can be assumed to satisfy $\int_0^1 g(t)\,dt = 0$. Model (2) corresponds formally to model (1) with $f = h(n)g$.
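As a rough illustration of model (2), the following sketch draws one sample on the uniform design $t_r = (r-1)/n$. The function names, the particular $g$, and the values of $\mu$, $\sigma$ and $h(n)$ are illustrative assumptions, not choices made in the paper.

```python
import math
import random

def simulate_model2(n, g, h_n, mu=0.0, sigma=1.0, seed=0):
    """Draw y_rn = mu + h(n) * g((r-1)/n) + eps_rn for r = 1..n,
    with i.i.d. N(0, sigma^2) errors (hypothetical helper)."""
    rng = random.Random(seed)
    t = [(r - 1) / n for r in range(1, n + 1)]  # uniform design t_r = (r-1)/n
    y = [mu + h_n * g(tr) + rng.gauss(0.0, sigma) for tr in t]
    return t, y

def g_example(t):
    # Assumed example of a smoothly periodic g that integrates to zero on [0, 1].
    return math.cos(2 * math.pi * t)

n = 101
# h(n) = 1/sqrt(n) is the parametric local-alternative rate discussed in the text.
t, y = simulate_model2(n, g_example, h_n=1 / math.sqrt(n), mu=2.0)
```

Setting `h_n=0.0` in this sketch gives data from the null model, which is convenient for checking the level of a test.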


The regression function is now allowed to change with $n$ to facilitate the study of power properties for tests of $H_0$. When $g \ne 0$, model (2) provides a sequence of alternative models converging to the null or constant model. Analysis of the power properties of tests under such local alternative models provides a more stringent comparison than for fixed alternatives. Note that the null hypothesis is now equivalent to $g = 0$ and we have made the simplifying assumption of a uniform design $t_r = (r-1)/n$ for subsequent mathematical convenience. One possible test for $H_0$ can be derived from a Bayesian approach similar to that of Cox and Koh (1989). The idea here is to find a locally most powerful test for $H_0$ versus an alternative where $f$ in (1) is modeled as a constant multiple of a Gaussian process that is uncorrelated with the $\varepsilon$'s and has covariance kernel $R(s,t) = {\sum}'_{j=-\infty}^{\infty} \cos(2\pi j(s-t))/(2\pi j)^4$, where ${\sum}'$ indicates summation excluding the zero index. A parallel of the Cox and Koh (1989) test derived from this point of view is

$$C_n = {\sum_{|j| \le (n-1)/2}}' \frac{|\tilde a_{jn}|^2}{(2\pi j)^4 \sigma^2}. \tag{3}$$

The $\tilde a_{jn}$ are the sample Fourier coefficients

$$\tilde a_{jn} = n^{-1} \sum_{r=1}^{n} y_{rn}\, e^{-2\pi i j (r-1)/n}, \tag{4}$$

with $e^{ix} = \cos x + i \sin x$ and $i^2 = -1$. When $n$ is even (3) is modified by adding an additional Fourier coefficient corresponding to, e.g., the $\tfrac{1}{2}n$ frequency. The test statistic $C_n$ can be motivated directly by noting that the $n|\tilde a_{jn}|^2/\sigma^2$, $j = \pm 1, \ldots, \pm\tfrac{1}{2}(n-1)$, are all independent, central chi-squared, random variables with one degree-of-freedom under the null hypothesis. They will have nonzero noncentralities only if $g \ne 0$ in (2). Thus, large values of $C_n$ will provide an indication that $H_0$ is false. An alternative class of tests can be obtained by smoothing the residuals $y_{rn} - n^{-1}\sum_{s=1}^{n} y_{sn}$ from fitting the null model (2) with $g = 0$. If the null hypothesis is true these residuals should have no pattern as a function of $t$. Thus, given a nonparametric regression fit $\hat f$ to the residuals, we could base a test on

$$T = \sum_{r=1}^{n} \hat f(t_r)^2, \tag{5}$$

for example. Under our assumptions, $f$ admits a Fourier series expansion ${\sum}'_{j=-\infty}^{\infty} a_j e^{2\pi i j t}$ with

$$a_j = \int_0^1 f(t)\, e^{-2\pi i j t}\, dt$$

the $j$th Fourier coefficient for $f$. Thus, one possible way to estimate $f$ is to truncate the series after some appropriate number of terms and then estimate the remaining unknown Fourier coefficients by their sample analogs (4). The resulting estimator


of $f$ has the form $\hat f_m(t) = {\sum}'_{|j| \le m} \tilde a_{jn} e^{2\pi i j t}$ for some $1 \le m \le \tfrac{1}{2}(n-1)$. Using this in (5) and recentering and rescaling by subtracting off the mean and dividing by the standard deviation of $T$ results in the test statistic

$$T_{nm} = \frac{n {\sum}'_{|j| \le m} |\tilde a_{jn}|^2 - 2m\sigma^2}{2\sigma^2 \sqrt{m}}. \tag{6}$$

Another possible estimator of $f$ is a periodic, cubic smoothing spline obtained as the minimizer of the penalized least-squares criterion

$$n^{-1} \sum_{r=1}^{n} (y_{rn} - p(t_r))^2 + \lambda \int_0^1 (p''(t))^2\, dt, \qquad \lambda > 0,$$

over all $p \in W^2_{2,\mathrm{per}}$. This estimator is approximately (cf. Wahba, 1975 or Eubank, 1988) $\hat f_\lambda(t) = {\sum}'_{|j| \le (n-1)/2} \tilde a_{jn} e^{2\pi i j t}/(1 + \lambda(2\pi j)^4)$ for $n$ odd. Using $\hat f_\lambda$ in (5) and again recentering and rescaling appropriately gives yet another test statistic

$$T_{n\lambda} = \frac{n {\sum}'_{|j| \le (n-1)/2} |\tilde a_{jn}|^2 (1 + \lambda(2\pi j)^4)^{-2} - \sigma^2 {\sum}'_{|j| \le (n-1)/2} (1 + \lambda(2\pi j)^4)^{-2}}{\sigma^2 \big(2 {\sum}'_{|j| \le (n-1)/2} (1 + \lambda(2\pi j)^4)^{-4}\big)^{1/2}}. \tag{7}$$

Notice that all three test statistics $C_n$, $T_{nm}$ and $T_{n\lambda}$ rely on the sample Fourier coefficients in some fashion. $C_n$ uses all the $\tilde a_{jn}$ (except $\tilde a_{0n}$) but heavily downweights those Fourier coefficients corresponding to higher frequencies. Thus, one might expect its power to be concentrated at low frequency departures from $H_0$. We will [...]

$$C = {\sum_{j=-\infty}^{\infty}}' \chi_j^2(|b_j|^2/\sigma^2)/(2\pi j)^4, \tag{8}$$

[...] random variables with one degree of freedom.
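A minimal sketch of how the three statistics might be computed from data follows. It assumes $n$ odd, a known $\sigma^2$, centering of $T_{nm}$ by $2m\sigma^2$ with scale $2\sigma^2\sqrt{m}$ (as in the proof of Theorem 2), and centering and scaling of $T_{n\lambda}$ by the mean and standard deviation that Lemma 2 gives for a quadratic form. The function names are hypothetical.

```python
import cmath
import math

def sample_fourier(y, k):
    """Sample Fourier coefficient (4) at frequency k; y[r] is the response at t = r/n."""
    n = len(y)
    return sum(y[r] * cmath.exp(-2j * math.pi * k * r / n) for r in range(n)) / n

def three_statistics(y, sigma2, m, lam):
    """Return (C_n, T_nm, T_nlambda) for real data y of odd length n."""
    n = len(y)
    J = (n - 1) // 2
    # power[k-1] = |a_kn|^2; for real data |a_{-k,n}| = |a_{k,n}|, so sums over
    # |j| <= J, j != 0, are twice the sums over k = 1..J.
    power = [abs(sample_fourier(y, k)) ** 2 for k in range(1, J + 1)]
    w4 = [(2 * math.pi * k) ** 4 for k in range(1, J + 1)]
    c_n = 2 * sum(p / w for p, w in zip(power, w4)) / sigma2
    t_nm = (2 * n * sum(power[:m]) - 2 * m * sigma2) / (2 * sigma2 * math.sqrt(m))
    d2 = [(1 + lam * w) ** -2 for w in w4]           # spline weights, squared
    num = 2 * n * sum(p * d for p, d in zip(power, d2)) - sigma2 * 2 * sum(d2)
    den = sigma2 * math.sqrt(2 * 2 * sum(d * d for d in d2))
    return c_n, t_nm, num / den
```

Because the zero frequency is excluded, adding a constant to `y` leaves all three values unchanged up to rounding, which mirrors the role of $\mu$ as a nuisance parameter.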


Theorem 1 has a number of consequences. First, $nC_n$ is seen to be (asymptotically) a weighted sum of central chi-square random variables under the null hypothesis which can be used to obtain approximate critical values for the Cox and Koh test. Both $T_{n\lambda}$ and $T_{nm}$ are asymptotically standard normal under $H_0$ so that an approximate $\alpha$-level test can be obtained by rejecting $H_0$ if $T_{n\lambda}$ or $T_{nm}$ exceed the $100(1-\alpha)$ percentage point, $Z_\alpha$, of the standard normal distribution. The development of critical values for use with the three tests in practice will be discussed further in the next section. We also see from Theorem 1 that $C_n$ has nontrivial power against alternatives converging to the null at the parametric rate $1/\sqrt{n}$. In contrast, $T_{n\lambda}$ and $T_{nm}$ can only recognize those alternatives converging at the rates $1/(\sqrt{n}\lambda^{1/16})$ and $m^{1/4}/\sqrt{n}$, respectively. Thus, it appears that a price has been paid for the more uniform weightings of sample Fourier coefficients used in (6) and (7) through some loss in our ability to detect alternatives that are very close to the null. This is not entirely true however as we will now demonstrate. The ability of $C_n$ to detect alternatives of the form $g/\sqrt{n}$ is not without its drawbacks. For example, we have the following corollary to Theorem 1.

Corollary 1. Assume the conditions of Theorem 1 and for any event $A$ let $P(A \mid g)$ denote the probability of $A$ under the alternative $g$. Then, for any fixed $0 < \gamma < \infty$, if $c_\alpha$ is the $100(1-\alpha)$ percentage point of $C$ in (8),

$$\inf_{\|g\| = \gamma} \lim_{n \to \infty} P(nC_n > c_\alpha \mid g/\sqrt{n}) = \alpha, \tag{9}$$

$$\inf_{\|g\| = \gamma} \lim_{n \to \infty} P(T_{n\lambda} \ge Z_\alpha \mid g/(\sqrt{n}\lambda^{1/16})) = 1 - \Phi(Z_\alpha - \gamma^2/(2\sigma^2 I_2^{1/2})) \tag{10}$$

and

$$\inf_{\|g\| = \gamma} \lim_{n \to \infty} P(T_{nm} \ge Z_\alpha \mid m^{1/4} g/\sqrt{n}) = 1 - \Phi(Z_\alpha - \gamma^2/(2\sigma^2)), \tag{11}$$

where $\Phi$ is the standard normal distribution function.

One implication of the Corollary (and its proof in Section 4) is that we can always find an alternative of any given size (i.e., norm) $\gamma$ for which $C_n$ will have power arbitrarily close to its level in large samples. This is made possible by the downweighting $C_n$ uses for the Fourier coefficients related to higher frequencies. The more uniform weighting of sample Fourier coefficients used in $T_{n\lambda}$ and $T_{nm}$ has the


consequence that their (asymptotic) power is constant against all alternatives of a given size. The difficulty in drawing conclusions about the relative performance of the various tests from the Corollary is that the local alternatives used in (9) converge at a different rate than those in (10) and (11). Thus, it would be more meaningful if similar results could be established for alternatives that are more directly comparable. This is the subject of our next theorem. For any integer $m$ and constants $0 < \gamma_1 < \gamma_2 < \infty$ define [...]

Theorem 2. Assume that $n, m \to \infty$ and let $\phi_j(t) = e^{2\pi i j t}$. Then, given any $\beta \in (\alpha, 1)$, there exist $\gamma_1$ and $\gamma_2$ such that [...]

The theorem can be viewed as saying that $nC_n$ and $T_{nm}$ have roughly comparable power against alternatives of the form ${\sum}'_{|j| \le m} j^2 \zeta_j \phi_j/\sqrt{n}$ and $m^{1/4} {\sum}'_{|j| \le m} \zeta_j \phi_j/\sqrt{n}$, respectively, at least for $\beta \in (\alpha, 1)$. These two alternatives are comparable since the $\phi_j$ and $\zeta_j$ are the same for both tests. However, the higher frequency components in the alternative for $C_n$ corresponding to $j^2 > m^{1/4}$ will actually be farther away from the null than in the alternative for $T_{nm}$. Thus, in this sense $T_{nm}$ can detect alternatives closer to the null than those for the Cox and Koh test. We note in passing that there is nothing special about $T_{nm}$ in Theorem 2 and an analogous result can be established for $T_{n\lambda}$. Theorem 1 provides us with a means to compare $T_{n\lambda}$ and $T_{nm}$. Note that $\lambda^{1/4}$ is essentially a bandwidth for the periodic spline estimator (see, e.g., Eubank, 1988, Chapter 6) while $1/m$ plays the role of a bandwidth for series estimators (cf. Eubank, 1988, Chapter 3). Thus, the alternatives $m^{1/4} g/\sqrt{n}$ and $g/(\sqrt{n}\lambda^{1/16})$ are of a parallel nature and can be aligned in a fashion that makes it possible to compute Pitman type asymptotic relative efficiencies. We now state one result along these lines.

Proposition 1. [...] compares to $T_{nm}$ [...] choices for $m$ and $\lambda$ [...] that can be viewed as 'optimal' (cf. Eubank, 1988, Chapter 3 and Wahba, 1975). The basic

conclusion to be drawn from the proposition is that $T_{n\lambda}$ will be about twice as efficient as $T_{nm}$ for testing $H_0$ in large samples with these choices for the smoothing parameters of the tests. More general comparisons can be conducted using techniques such as those of Jayasuriya (1990). We note here that this proposition corrects an error in Eubank and Spiegelman (1990) who mistakenly derive a relative efficiency of 4.7 for these two tests. To use the statistics (3), (6) and (7) in practice it will generally be necessary to estimate $\sigma^2$. One natural estimator for this purpose under the null hypothesis is the sample variance

$$S^2 = \sum_{r=1}^{n} (y_{rn} - \bar y_n)^2/(n-1),$$

with $\bar y_n$ the sample mean of the responses. However, as noted by a referee, this estimator will have a large bias under any fixed alternative, so it is advisable to use estimators such as those of Rice (1984) and Gasser et al. (1986) that are consistent for $\sigma^2$ even when $H_0$ is false. A simple example of this type of estimator is

$$\hat\sigma^2 = \sum_{r=1}^{(n-1)/2} (y_{2r} - y_{2r-1})^2/(n-1).$$
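The contrast between the two variance estimators can be illustrated with a short sketch. It assumes $n$ odd and 1-based indexing of the responses in the formulas; the function names and the noise-free example signal are illustrative, not from the paper.

```python
import math

def sample_variance(y):
    """S^2: usual sample variance, unbiased for sigma^2 only when f = 0."""
    n = len(y)
    ybar = sum(y) / n
    return sum((v - ybar) ** 2 for v in y) / (n - 1)

def difference_variance(y):
    """Difference-based estimator: sum over r = 1..(n-1)/2 of
    (y_{2r} - y_{2r-1})^2, divided by n - 1 (n assumed odd)."""
    n = len(y)
    pairs = (n - 1) // 2
    # 1-based y_{2r-1}, y_{2r} correspond to 0-based y[2r-2], y[2r-1].
    return sum((y[2 * r - 1] - y[2 * r - 2]) ** 2 for r in range(1, pairs + 1)) / (n - 1)

# Noise-free fixed alternative f(t) = 1.5 cos(2 pi t) on the uniform design:
n = 101
signal = [1.5 * math.cos(2 * math.pi * r / n) for r in range(n)]
print(sample_variance(signal), difference_variance(signal))
```

With no noise at all, the sample variance still reports roughly $\gamma^2/2$ because it absorbs the signal, whereas the difference-based estimator stays close to zero since neighboring responses nearly share the same mean.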

Concerning $\hat\sigma^2$ we have the following result.

Proposition 2. Assume the conditions of Theorem 1 and that $m/n \to 0$ or $1/(n\lambda^{1/4}) \to 0$. Then, the conclusions of parts (ii) and (iii) of Theorem 1 remain valid if $\sigma^2$ is replaced by $\hat\sigma^2$ in $T_{nm}$ or $T_{n\lambda}$.

An analog of Proposition 2 holds for $S^2$ under the more restrictive conditions that $m^{3/2}/n \to 0$ or that $1/(n\lambda^{3/8}) \to 0$. The stronger conditions are needed because the bias of $S^2$ is of order $h(n)^2$ compared to the bias of $\hat\sigma^2$ which is only of order $h(n)^2/n^2$. We should also point out that, by Slutsky's Theorem, any weakly consistent estimator can be used for $\sigma^2$ in (3) without altering the asymptotic distribution of $C_n$. Both $S^2$ and $\hat\sigma^2$ can be shown to have this property under model (2). In conclusion, we note that parts (ii) and (iii) of Theorem 1 and Proposition 1 can be shown assuming only that the $\varepsilon$'s in (2) are i.i.d. and have finite fourth moments. The proof relies on results in de Jong (1987). See Jayasuriya (1990) for further details.

To ascertain the extent to which the asymptotic results of the previous section reflect the finite sample properties of our tests for fixed alternatives we conducted a small scale simulation. Samples of size 101 were generated from model (1) using


normal errors. The error variance was taken to be equal to 1 while the $t_r$ were chosen to be equally spaced [...]. For the function $f$ in (1) we used

$$f(t) = \gamma \cos 2\pi j t.$$

The choice of $\gamma$ governs the signal [...] lower frequency alternatives [...] 6. The case of $\gamma = 0$ allows [...]

To detect the sensitivity of $T_{n\lambda}$ and $T_{nm}$ to the choice of $\lambda$ and $m$ we computed them for two choices of $\lambda$ and $m$ for each sample. The specific choices made were [...]. [...] does not always work well for $T_{n\lambda}$ and $T_{nm}$ so critical values were found by simulation from the null [...]. In doing this we used 8000 replicate samples of size [...]. The same approach was used to approximate the 5% critical value for $C_n$. These simulations made use of the fact that the $n|\tilde a_{jn}|^2/\sigma^2$ are all i.i.d. central chi-squares with one degree-of-freedom under $H_0$. Thus, we were able to obtain repeated realizations of our statistics under the null by simulating and then squaring values from a standard normal. This method can be used in practice to get finite sample approximations to the $\alpha$-level critical values of our tests even when $\sigma$ is unknown since $n|\tilde a_{jn}|^2/\hat\sigma^2$ has an approximate central chi-square distribution. Alternative chi-square approximations to the distributions of $T_{n\lambda}$ [...] can be deduced from Buckley and Eagleson (1988) that have been found somewhat effective in a related context (see Jayasuriya, 1990). Asymptotic critical values for $C_n$ derived from the distribution of $C$ in (8) could be obtained using methods such as those discussed in Shorack and Wellner (1986, Chapter 4) that may [...] value. We have not tried the latter two approaches in the present setting. Once appropriate critical values had been determined the basic experiment was replicated 1000 times. A different random seed was used for each $j$ and $\gamma$ combination. [...] Table 1 as the proportion of rejections [...]

[...] reveals that all three tests maintain their levels reasonably [...] the low frequency [...] superior in this case. [...] alternatives, $j = 3$ or 6, $C_n$ has power that does not differ appreciably from 0.05. In contrast, $T_{nm}$ and $T_{n\lambda}$ can perform quite well in [...] against the alternatives if $m$ and $\lambda$ are [...]

Table 1
Proportion of rejections in 1000 samples of $H_0$: $f \equiv 0$ for various choices of $\gamma$ and $j$.

Test                                    $\gamma = 0.0$   $\gamma = 0.5$   $\gamma = 1.0$   $\gamma = 1.5$

$j = 1$
  $T_{nm}$, $m = 3$                     0.057            0.792            1.00             1.00
  $T_{nm}$, $m = 9$                     0.074            0.837            1.00             1.00
  $T_{n\lambda}$, $\lambda = 10^{-5}$   0.055            0.898            1.00             1.00
  $T_{n\lambda}$, $\lambda = 10^{-6}$   0.062            0.805            1.00             1.00
  $C_n$                                 0.047            0.923            1.00             1.00

$j = 3$
  $T_{nm}$, $m = 3$                     0.054            0.775            1.00             1.00
  $T_{nm}$, $m = 9$                     0.069            0.718            1.00             1.00
  $T_{n\lambda}$, $\lambda = 10^{-5}$   0.056            0.208            0.955            1.00
  $T_{n\lambda}$, $\lambda = 10^{-6}$   0.05?            0.693            1.00             1.00
  $C_n$                                 0.049            0.050            0.075            0.1??

$j = 6$
  $T_{nm}$, $m = 3$                     0.057            0.056            0.056            0.057
  $T_{nm}$, $m = 9$                     0.075            0.134            0.428            0.979
  $T_{n\lambda}$, $\lambda = 10^{-5}$   0.051            0.060            0.065            0.074
  $T_{n\lambda}$, $\lambda = 10^{-6}$   0.056            0.107            0.368            0.961
  $C_n$                                 0.040            0.058            0.059            0.068

The choices of $\lambda = 10^{-5}$ and $10^{-6}$ correspond to roughly uniform weightings of the first 3 and 5 sample Fourier coefficients, respectively, in $T_{n\lambda}$. Thus, it is no surprise to see that $T_{n\lambda}$ with $\lambda = 10^{-6}$ performs well over all the alternatives while the choice of $\lambda = 10^{-5}$ results in poor power against the high frequency alternative $j = 6$. In summary, the simulation results support our asymptotic analysis. They indicate that either $T_{n\lambda}$ or $T_{nm}$ are to be preferred over $C_n$ in situations where anything but low frequency alternatives to $H_0$ are considered likely. The power of $T_{n\lambda}$ or $T_{nm}$ will be quite dependent on the choice of $\lambda$ and $m$, a problem we have not addressed in this paper. We believe that data driven methods can be used to guide the choices of these parameters. One approach is to use a generalized cross-validation or 'unbiased' risk criterion for selection of the smoothing parameters. However, it can be shown (Eubank and Hart, 1992) that the limiting distribution of $T_{nm}$ is not that of Theorem 1 for such data driven choices of $m$. Undoubtedly the same is true for $T_{n\lambda}$. Future research will focus more on this aspect of the problem.
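The simulation of null critical values described in this section can be sketched roughly as follows. The sketch assumes known $\sigma^2$, $n$ odd, and the forms of $T_{nm}$ and $T_{n\lambda}$ implied by the centering and scaling used in the Appendix. It draws two squared standard normals per positive frequency to stand for the pair of indices $\pm j$. All names and the quantile rule are hypothetical choices.

```python
import math
import random

def simulate_null(n, m, lam, reps=8000, seed=0):
    """Null draws of n*C_n, T_nm and T_nlambda built from squared N(0,1) values."""
    rng = random.Random(seed)
    J = (n - 1) // 2
    w4 = [(2 * math.pi * k) ** 4 for k in range(1, J + 1)]
    d2 = [(1 + lam * w) ** -2 for w in w4]
    tr1 = 2 * sum(d2)                    # sum of weights over |j| <= J, j != 0
    tr2 = 2 * sum(d * d for d in d2)     # sum of squared weights
    draws = {"nC_n": [], "T_nm": [], "T_nlam": []}
    for _ in range(reps):
        # x[k-1] plays the role of n(|a_kn|^2 + |a_{-k,n}|^2)/sigma^2 under H0.
        x = [rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2 for _k in range(J)]
        draws["nC_n"].append(sum(xk / w for xk, w in zip(x, w4)))
        draws["T_nm"].append((sum(x[:m]) - 2 * m) / (2 * math.sqrt(m)))
        draws["T_nlam"].append(
            (sum(xk * d for xk, d in zip(x, d2)) - tr1) / math.sqrt(2 * tr2))
    return draws

def upper_quantile(values, alpha):
    """Empirical 100(1 - alpha) percentage point (simple order-statistic rule)."""
    s = sorted(values)
    return s[min(len(s) - 1, math.ceil((1 - alpha) * len(s)) - 1)]

draws = simulate_null(n=101, m=3, lam=1e-5, reps=4000)
critical = {name: upper_quantile(vals, 0.05) for name, vals in draws.items()}
```

For small $m$ the simulated 5% point of $T_{nm}$ tends to sit above the normal value 1.645, since a centered and scaled chi-square with $2m$ degrees of freedom is right-skewed.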


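We begin by proving Theorem 1. For this purpose we require two lemmas.

Lemma 1. Let $g \in W^2_{2,\mathrm{per}}$ and define $b_{jn} = n^{-1} \sum_{r=1}^{n} g(t_r) e^{-2\pi i j t_r}$ and $b_j = \int_0^1 g(t) e^{-2\pi i j t}\, dt$. Then $|b_{jn} - b_j| = O(n^{-2})$, uniformly in $|j| \le \tfrac{1}{2}(n-1)$.

Proof. Following Eubank (1988, Chapter 3) we have $b_{jn} - b_j = \sum_{s \ne 0} b_{j+ns}$. But, for any $k \ne 0$,

$$b_k = -(2\pi k)^{-2} \int_0^1 g''(t) e^{-2\pi i k t}\, dt = -(2\pi k)^{-2} b_k'',$$

since $g \in W^2_{2,\mathrm{per}}$. Thus,

$$|b_{jn} - b_j| \le \Big( \sum_{s \ne 0} (2\pi (j + ns))^{-4} \Big)^{1/2} \Big( \sum_{s \ne 0} |b_{j+ns}''|^2 \Big)^{1/2} = O(n^{-2}),$$

for all $|j| \le \tfrac{1}{2}(n-1)$. □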
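Lemma 1 can be checked numerically on a concrete function. The sketch below uses $g(t) = t^2(1-t)^2$, an illustrative choice not taken from the paper. It satisfies the periodicity conditions, and integrating by parts gives the exact coefficients $b_j = -24/(2\pi j)^4$ for $j \ne 0$.

```python
import cmath
import math

def g(t):
    # Assumed test function: g(0) = g(1) and g'(0) = g'(1), with g'' square integrable.
    return t * t * (1 - t) * (1 - t)

def b_exact(k):
    """Exact Fourier coefficient of g for k != 0 (integration by parts)."""
    return -24.0 / (2 * math.pi * k) ** 4

def b_sample(k, n):
    """Discrete analog b_kn on the uniform design t_r = (r-1)/n."""
    return sum(g(r / n) * cmath.exp(-2j * math.pi * k * r / n) for r in range(n)) / n

def max_error(n):
    """Largest |b_kn - b_k| over 1 <= k <= (n-1)/2."""
    return max(abs(b_sample(k, n) - b_exact(k)) for k in range(1, (n - 1) // 2 + 1))

for n in (21, 41, 81):
    err = max_error(n)
    print(n, err, n ** 2 * err)  # n^2 * err should stay bounded as n grows
```

For this particular $g$ the aliasing error decays faster than the $O(n^{-2})$ bound, because the coefficients themselves decay like $j^{-4}$. The lemma's rate is a worst case over the whole function class.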

The next lemma is a consequence of the Lindeberg-Feller Theorem.

Lemma 2. Let $M_n$ be a sequence of $n \times n$ symmetric, positive semi-definite matrices with eigenvalues $\tau_{1n} \le \tau_{2n} \le \cdots \le \tau_{nn}$. Assume that $y_n$ has an $n$-variate normal distribution with mean $f_n$ and variance-covariance matrix $\sigma^2 I_n$, for $I_n$ the $n$-dimensional identity. Then, $(y_n' M_n y_n - \sigma^2\,\mathrm{trace}\, M_n - f_n' M_n f_n)/(\sigma^2 (2\,\mathrm{trace}\, M_n^2)^{1/2})$ converges in distribution to a standard normal random variable if (i) $\max_j \tau_{jn}^2 / \sum_{j=1}^{n} \tau_{jn}^2 \to 0$ and (ii) $f_n' M_n^2 f_n/\mathrm{trace}\, M_n^2 \to 0$.

Proof of Theorem 1. To establish that $nC_n$ converges in distribution to $C$ in (8) we work with the moment generating function of $nC_n$. Let

$$C_n^* = {\sum_{|j| \le (n-1)/2}}' \chi_j^2(|b_j|^2/\sigma^2)/(2\pi j)^4,$$

where the $\chi_j^2(\cdot)$ are the same as in (8). Clearly $C_n^*$ converges in distribution to $C$. Thus, it suffices to show that the ratio of the moment generating functions for $nC_n$ and $C_n^*$ converges to one. This ratio is seen to be

$${\prod_{|j| \le (n-1)/2}}' \exp\{ t(|b_{jn}|^2 - |b_j|^2)/(\sigma^2((2\pi j)^4 - 2t)) \},$$

where ${\prod}'$ indicates the zero index is excluded from the product. [...] equivalent to show that [...]

Next we apply Lemma 2 [...]. In that case [...] $(K_{n\lambda} - n^{-1} \mathbf{1}\mathbf{1}')$, where $\mathbf{1}$ is an $n$-vector of all 1's and [...]. The matrix [...] has $(j,k)$th entry [...]

From this one obtains $\mathrm{trace}\, M_n = {\sum}'_{|j| \le (n-1)/2} (1 + \lambda(2\pi j)^4)^{-2}$ and $\mathrm{trace}\, M_n^2 = {\sum}'_{|j| \le (n-1)/2} (1 + \lambda(2\pi j)^4)^{-4}$. Arguing as in Wahba (1975), it can then be shown that $\mathrm{trace}\, M_n = \lambda^{-1/4}\, 2I_1 (1 + o(1))$ and $\mathrm{trace}\, M_n^2 = \lambda^{-1/4}\, 2I_2 (1 + o(1))$ as $n \to \infty$, $\lambda \to 0$ and $n\lambda^{1/4} \to \infty$. Thus, the first condition of Lemma 2 is satisfied. For condition (ii) of Lemma 2 we note that

$$f_n' M_n^2 f_n = n h(n)^2 {\sum_{|j| \le (n-1)/2}}' |b_{jn}|^2/(1 + \lambda(2\pi j)^4)^4 \le n h(n)^2 {\sum_{|j| \le (n-1)/2}}' |b_{jn}|^2/(1 + \lambda(2\pi j)^4)^2.$$

Now

$${\sum_{|j| \le (n-1)/2}}' |b_{jn}|^2/(1 + \lambda(2\pi j)^4)^2 \to \|g\|^2, \tag{A.1}$$

because the difference between this sum and ${\sum}'_{|j| \le (n-1)/2} |b_j|^2$ is of order

$$O(n^{-2}\lambda^{-1/4}) + O(\sqrt{\lambda}) + O(\lambda),$$

where we have used $b_j = -b_j''/(2\pi j)^2$ and the facts that $\big| |b_{jn}| - |b_j| \big| \le |b_{jn} - b_j|$ and $|b_j''| \le \|g''\|$. Thus, $f_n' M_n^2 f_n/\mathrm{trace}\, M_n^2 = O(n h(n)^2)/O(\lambda^{-1/4}) = O(\lambda^{1/8}) \to 0$. Finally, it remains only to show that $f_n' M_n f_n/(\mathrm{trace}\, M_n^2)^{1/2} \to \|g\|^2/(2I_2)^{1/2}$. This is a consequence of (A.1) and the fact that $\mathrm{trace}\, M_n^2 = \lambda^{-1/4}\, 2I_2 (1 + o(1))$. To finish the proof, observe that the matrix $M_n$ for $T_{nm}$ is $(K_{nm} - n^{-1} \mathbf{1}\mathbf{1}')$ with $K_{nm} = \{ n^{-1} \sum_{|s| \le m} e^{2\pi i s (j-k)/n} \}$. The $\tau_{jn}$ are all 1 or zero so Condition (i) of Lemma 2 obviously holds. For Condition (ii) note that


[...] Finally $f_n' M_n f_n/(\mathrm{trace}\, M_n^2)^{1/2} \to \|g\|^2/\sqrt{2}$ and Theorem 1 has been proved. □

Proof of the Corollary. Without loss of generality we can choose $\gamma = 1$. Thus, consider the sequence of alternative functions $g_k(t) = e^{2\pi i k t}$, all of which have norm one. If $C_k$ denotes the random variable in (8) when $g = g_k$, then $C_k$ has moment generating function

$$G_k(t) = \exp\{ 2t/((2\pi k)^4 - 2t) \} \prod_{j=1}^{\infty} (1 - 2t/(2\pi j)^4)^{-1} = \exp\{ 2t/((2\pi k)^4 - 2t) \}\, G_0(t).$$

Now $G_k(t) \to G_0(t)$ as $k \to \infty$, so $C_k$ converges in distribution to the null distribution of $C$. Thus, if $c_\alpha$ is the $100(1-\alpha)$ percentage point of $C$, we have $\lim_{k \to \infty} P(C > c_\alpha \mid g = g_k) = \alpha$. The remaining two parts of the Corollary are straightforward. □

Proof of Theorem 2. First note that

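[...] (A.2)

Now $A_n$ is normal with mean zero, where $A_n = \sqrt{n}\, {\sum}'_{|j| \le m} (\tilde a_{jn} - \zeta_j j^2/\sqrt{n}) \bar\zeta_j/j^2$, and with variance [...]

$$P\Big( T_{nm} \ge Z_\alpha \,\Big|\, m^{1/4} n^{-1/2} {\sum_{|j| \le m}}' \zeta_j \phi_j \Big) \ge P\Big( \frac{n {\sum}'_{|j| \le m} |\tilde a_{jn} - \zeta_j m^{1/4}/\sqrt{n}|^2 - 2m\sigma^2}{2\sigma^2\sqrt{m}} + A_n \ge Z_\alpha - \gamma_1/2 \,\Big|\, m^{1/4} n^{-1/2} {\sum_{|j| \le m}}' \zeta_j \phi_j \Big), \tag{A.3}$$

where $A_n = \sqrt{n}\, {\sum}'_{|j| \le m} (\tilde a_{jn} - \zeta_j m^{1/4}/\sqrt{n}) \bar\zeta_j/(2\sigma^2 m^{1/4})$. Now $E A_n = 0$ and the variance of $A_n$ is at most $\gamma_2/(4\sigma^4\sqrt{m})$ for any alternative corresponding to [...]. Thus, for any $\varepsilon > 0$ there is an $n_0$ such that for all $n > n_0$ (A.3) is bounded below by

$$P\big( (\chi^2_{2m} - 2m)/(2\sqrt{m}) \ge Z_\alpha - \gamma_1/2 + \varepsilon \big) + o(1),$$

where $\chi^2_{2m}$ is a central chi-squared random variable with $2m$ degrees-of-freedom and the $o(1)$ term is uniform over the alternatives. Now let $n \to \infty$ and $\varepsilon \to 0$ to finish the proof. □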


Proof of Proposition 1. The argument here follows Serfling (1980, Chapter 10). Let us begin by defining $B_1 = [2\|g''\|^2/((2\pi)^4\sigma^2)]^{1/5}$ and $B_2 = [I_1\sigma^2(2\pi)^4/(2\|g''\|^2)]^{1/5}$. Then, we have $m \sim B_1 n^{1/5}$ and $\lambda^{1/4} \sim B_2 n^{-1/5}$. Now suppose that the series estimator test is to be based on $n_1$ observations and the smoothing spline test uses $n_2$ observations with corresponding smoothing parameters $m_1$ and $\lambda_2$. Let the local alternative for $T_{n_1 m_1}$ be of the form $\mu + (m_1^{1/4}/\sqrt{n_1}) g_1$ and let the one for $T_{n_2 \lambda_2}$ be $\mu + (1/(\sqrt{n_2}\lambda_2^{1/16})) g_2$. In order to compare the tests the alternatives must coincide asymptotically which entails that

$$[B_1 B_2]^{1/4} \lim \Big( \frac{n_1}{n_2} \Big)^{-9/20} g_1(t) = g_2(t) \tag{A.4}$$

for all $t$. Since $m_1 = B_1 n_1^{1/5}$, the limiting power of $T_{n_1 m_1}$ is $1 - \Phi(Z_\alpha - \|g_1\|^2/(2\sigma^2))$, by Theorem 1. Similarly, for the smoothing spline based test using $n_2$ observations the limiting power is $1 - \Phi(Z_\alpha - \|g_2\|^2/(2\sigma^2 I_2^{1/2}))$. If we equate these two limiting powers and use (A.4), this gives $\lim (n_1/n_2) = [\ldots]$. The integrals involved in this expression can be computed exactly using tables of Gamma integrals to finish the proof. □

Proof of Proposition 2. In order for any estimator $\tilde\sigma^2$ of $\sigma^2$ to leave the limiting distribution of, e.g., $T_{nm}$ unchanged we need $\sqrt{m}(\tilde\sigma^2 - \sigma^2) \to 0$ in probability. Thus, it is only necessary to show that this condition is satisfied by $\hat\sigma^2$. Direct calculations give that

$$E\hat\sigma^2 = \sigma^2 + h(n)^2 \sum_{r=1}^{(n-1)/2} \big( g(t_{2r}) - g(t_{2r-1}) \big)^2/(n-1) = \sigma^2 + O(h(n)^2/n^2)$$

and

$$\mathrm{Var}\, \hat\sigma^2 = \frac{4\sigma^4}{n-1} + 8\sigma^2 h(n)^2 \sum_{r=1}^{(n-1)/2} \big( g(t_{2r}) - g(t_{2r-1}) \big)^2/(n-1)^2.$$

The proposition now follows from Markov's inequality. □
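The trace approximations used in the proof of Theorem 1, and the remark that the integrals reduce to Gamma integrals, can be illustrated numerically. The sketch assumes the limiting constants are $2\int_0^\infty (1 + (2\pi x)^4)^{-p}\, dx$ for $p = 2$ (trace of $M_n$) and $p = 4$ (trace of $M_n^2$). This is what a Riemann-sum argument gives, but the paper's own normalization of these constants is not fully legible here, so treat the match as an assumption. The function names are hypothetical.

```python
import math

def weight_sum(n, lam, p):
    """Sum over |j| <= (n-1)/2, j != 0, of (1 + lam (2 pi j)^4)^(-p)."""
    J = (n - 1) // 2
    return 2 * sum((1 + lam * (2 * math.pi * k) ** 4) ** (-p) for k in range(1, J + 1))

def integral_constant(p):
    """2 * int_0^inf (1 + (2 pi x)^4)^(-p) dx via the Beta integral:
    Gamma(1/4) Gamma(p - 1/4) / (4 pi Gamma(p))."""
    return math.gamma(0.25) * math.gamma(p - 0.25) / (4 * math.pi * math.gamma(p))

def trace_ratio(n, lam, p):
    """Exact sum divided by the lambda^(-1/4) * constant approximation."""
    return weight_sum(n, lam, p) / (lam ** -0.25 * integral_constant(p))

# The ratio approaches 1 as lam -> 0 with n * lam^(1/4) -> infinity.
for n, lam in ((101, 1e-6), (4001, 1e-10)):
    print(n, lam, trace_ratio(n, lam, 2), trace_ratio(n, lam, 4))
```

At the simulation settings of Section 3 ($n = 101$, $\lambda$ around $10^{-5}$ to $10^{-6}$) the ratio is still noticeably below one, which is one way to see why normal critical values may be inaccurate there.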

Acknowledgement

The authors are indebted to two referees for many helpful suggestions and probing questions that have improved the quality of this paper. We also wish to thank Jeff Hart for pointing out an error in an earlier version of the paper. The first author's research was supported, in part, under National Science Foundation Grant DMS-8902576.


References

Buckley, M.J. and G.K. Eagleson (1988). An approximation to the distribution of quadratic forms in normal random variables, Austral. J. Statist., 149-163.

Cox, D.D. and E. Koh (1989). A smoothing spline based test of model adequacy in polynomial regression, Ann. Inst. Statist. Math., 41, 383-400.

Cox, D.D., E. Koh, G. Wahba and B. Yandell (1988). Testing the (parametric) null model hypothesis in (semiparametric) partial and generalized spline models, Ann. Statist.

De Jong, P. (1987). A central limit theorem for generalized quadratic forms, Prob. Theor. and Rel. Fields, 75, 261-277.

Eubank, R.L. (1988). Spline Smoothing and Nonparametric Regression, Marcel Dekker, New York.

Eubank, R.L. and J.D. Hart (1992). Testing goodness-of-fit in regression via order selection criteria, Ann. Statist., 20, 1412-1425.

Eubank, R.L. and C. Spiegelman (1990). Testing the goodness-of-fit of a linear model via nonparametric regression techniques, J. Amer. Statist. Assoc., 85, 387-392.

Gasser, T., L. Sroka and C. Jennen-Steinmetz (1986). Residual variance and residual pattern in nonlinear regression, Biometrika, 73, 625-633.

Härdle, W. and E. Mammen (1988). Comparing nonparametric versus parametric regression fits, manuscript.

Jayasuriya, B. (1990). Testing the goodness-of-fit of a polynomial model via nonparametric regression techniques, Unpublished Ph.D. Dissertation, Dept. of Statist., Texas A&M University.

King, E. (1988). A test for the equality of two regression curves based on kernel smoothers, Unpublished Ph.D. Dissertation, Dept. of Statist., Texas A&M University.

Müller, H.G. (1992). Bandwidth choice for nonparametric regression via parametric regression models, with an application to goodness-of-fit testing, Scand. J. Statist., 19, 157-172.

Raz, J. (1990). Testing for no effect when estimating a smooth function by nonparametric regression: a randomization approach, J. Amer. Statist. Assoc., 85, 132-138.

Rice, J. (1984). Bandwidth choice for nonparametric regression, Ann. Statist., 12, 1215-1230.

Serfling, R. (1980). Approximation Theorems of Mathematical Statistics, John Wiley, New York.

Shorack, G. and J. Wellner (1986). Empirical Processes with Applications to Statistics, John Wiley, New York.

Staniswalis, J. and T. Severini (1991). Diagnostics for assessing regression models, J. Amer. Statist. Assoc., 86, 684-692.

Wahba, G. (1975). Smoothing noisy data by spline functions, Numer. Math., 24, 383-393.