Testing the equality of two regression curves using linear smoothers

Statistics & Probability Letters 12 (1991) 239-247, North-Holland, September 1991

Eileen King
The Procter and Gamble Company, Cincinnati, OH, USA

Jeffrey D. Hart and Thomas E. Wehrly
Department of Statistics, Texas A&M University, College Station, TX 77843, USA

Received July 1990
Revised December 1990

Abstract: Suppose that data (y, z) are observed from two regression models, y = f(x) + ε and z = g(x) + η. Of interest is testing the hypothesis H₀: f = g without assuming that f or g is in a parametric family. A test based on the difference between linear, but nonparametric, estimates of f and g is proposed. The exact distribution of the test statistic is obtained on the assumption that the errors in the two regression models are normally distributed. Asymptotic distribution theory is outlined under more general conditions on the errors. It is shown by simulation that the test based on the assumption of normal errors is reasonably robust to departures from normality. A data analysis illustrates that, in addition to being attractive descriptive devices, nonparametric smoothers can be valuable inference tools.

1. Introduction

A common problem in experimental work is the comparison of two regression curves. Plant physiology and the analysis of animal growth curves are examples of areas in which this problem frequently arises. Typically, the two curves correspond to treatment and control groups, and the predictor variable in the regression is time or a covariate. For any given value of the predictor variable, suppose that one measurement is recorded for each regression model and that the measurements across settings of the predictor variable are independent of one another. We shall propose a nonparametric method of testing the hypothesis that the two curves are identical. This is a useful methodology in cases where (a) the experimenter has little (if any) indication of an appropriate parametric regression model, or (b) the experimenter desires a quick and easy significance test without any serious modelling of the two regressions.

Research supported in part by ONR Contract N00014-85-K-0723.

0167-7152/91/$03.50 © 1991 - Elsevier Science Publishers B.V. (North-Holland)

Our test is a function of the difference between two nonparametric curve estimates, where each estimate is a linear smoother. We feel that comparing plots of, e.g., kernel smoothers is a particularly good way to describe differences between two regression curves. Our test will provide a formal significance test that can be used to supplement such a graphical comparison.

The setting we shall consider can be described more precisely as follows. The observed data are (x_i, y_i, z_i), i = 1,…,n, where

y_i = f(x_i) + ε_i  and  z_i = g(x_i) + η_i,  i = 1,…,n.   (1.1)

The design points x_i are fixed with 0 ≤ x_1 < x_2 < … < x_n ≤ 1, and the regression functions f and g are smooth (at least uniformly continuous on [0,1]). Each ε_i is assumed to be normally distributed with mean 0 and variance σ², while each η_i is normal with mean 0 and variance τ². Furthermore, the random variables ε_1,…,ε_n, η_1,…,η_n are assumed to be mutually independent.

The functions f and g may be estimated using linear smoothers. For example, an estimate of f(x) (x ∈ [0,1]) is f̂_h(x) = Σ_{i=1}^n w_i(x; h) y_i, where w_1(x; h),…,w_n(x; h) are weights that depend on a smoothing parameter h. The weights could be those of a kernel estimator or a smoothing spline; see Eubank (1988) for a discussion of each of these linear estimators. We are interested in testing the hypothesis

H₀: f(x) = g(x)  for all x ∈ [0,1]   (1.2)

against the alternative

H₁: f(x) ≠ g(x)  for some x ∈ [0,1].

We propose that these hypotheses be tested using the statistic

T_n = n^{-1} Σ_{j=1}^n (f̂_h(x_j) - ĝ_h(x_j))² / ŝ²,   (1.3)

where ŝ² is some estimate of scale. The null hypothesis H₀ is rejected in favor of H₁ (at level α) if the observed value of T_n is greater than the 1 - α quantile of T_n's distribution under H₀. A motivation for the statistic (1.3) is that its numerator is a consistent estimator of

D = ∫₀¹ (f(x) - g(x))² r(x) dx,

where r is the density of the design points x_i. Assuming r(x) > 0 for all x, and since f and g are continuous, D is zero if and only if H₀ is true. Therefore, 'large' values of T_n favor H₁ and 'small' values favor H₀. It is important to realize that our assumption of normal errors is needed only for formal validity of the proposed test. It turns out (see Section 4) that the normal-theory test is reasonably robust to departures from normality.

The area of hypothesis testing in nonparametric regression is relatively new in the literature. Cox, Koh, Wahba and Yandell (1988) examine the one-sample test of the hypothesis that f(x), x ∈ [0,1], is a polynomial of degree m - 1 or less versus the alternative that f is 'smooth'. Estimation of f is by the technique of smoothing splines. Eubank and Spiegelman (1990) propose alternative nonparametric smoothing techniques to test the goodness of fit of a linear model. Raz (1990) develops a randomization test to determine whether a real-valued response variable and a vector-valued explanatory variable are related. Härdle and Marron (1990) compare two regression curves that have been estimated using nonparametric techniques. They assume the curves are the same up to a parametric transformation of the predictor and response variables. Hence, in their model the hypothesis (1.2) is equivalent to a hypothesis that depends on only a few parameters. Knafl, Sacks and Ylvisaker (1985) propose methods of constructing confidence bands for regression functions, including the nonparametric case. Hall and Hart (1990) propose a bootstrap test for detecting a difference between regression functions. Other recent work on testing a parametric null hypothesis using nonparametric smoothers includes Dabrowska (1987), Horváth, Yandell and Sen (1990) and Kozek (1990, 1991).

The rest of the paper is arranged as follows. In Section 2 we define our test statistic using matrix notation and discuss a number of its distributional properties. Implementation of the test in practice and an example involving a dietary experiment with cows are the subject of Section 3. Finally, in Section 4 results of a simulation study are presented. The study suggests that the level of our test is relatively insensitive to departures from normally distributed errors, and that its power compares favorably with that of traditional competitors like the paired t-test.
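To make the idea of a linear smoother concrete, the sketch below builds the weights w_i(x; h) with a Nadaraya-Watson-style kernel weighting, which is one convenient choice of linear smoother (the function names are illustrative, not from the paper):

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75(1 - u^2) on |u| <= 1 (used later in the paper)."""
    return np.where(np.abs(u) <= 1, 0.75 * (1.0 - u**2), 0.0)

def smoother_weights(x_grid, x, h):
    """Kernel weights w_i(x; h) of a linear smoother; each row sums to 1."""
    u = (x_grid[:, None] - x[None, :]) / h      # shape (len(x_grid), n)
    w = epanechnikov(u)
    return w / w.sum(axis=1, keepdims=True)

def smooth(x_grid, x, y, h):
    """Linear smoother f_hat(x) = sum_i w_i(x; h) * y_i."""
    return smoother_weights(x_grid, x, h) @ y
```

Since each row of weights sums to 1, this smoother reproduces a constant function exactly, which is a quick sanity check on any weight construction.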

2. Distributional properties of the test statistic

We first introduce some notation that makes it easier to describe the distribution of our test statistic. Let y and z denote, respectively, the vectors of observations (y_1,…,y_n)' and (z_1,…,z_n)', and define d = y - z. Let W_h be the n × n matrix with ijth element w_j(x_i; h). Defining d_i = y_i - z_i, i = 1,…,n, our estimator of scale is of the type suggested by Rice (1984), namely

σ̂_R² = [2(n - 1)]^{-1} Σ_{i=2}^n (d_i - d_{i-1})².

This estimator may be written in the form σ̂_R² = d'G'Gd/n, where G is the (n - 1) × n matrix defined by

G = [n/(2(n - 1))]^{1/2} ×
    | -1  1  0 ...  0  0 |
    |  0 -1  1 ...  0  0 |
    |  .            .    |
    |  0  0  0 ... -1  1 |

Finally, taking ŝ² = σ̂_R², the test statistic (1.3) may be written as a ratio of quadratic forms:

T_n = d'W_h'W_h d / d'G'Gd.

2.1. Null distribution of the test statistic

It is clear that, under the null hypothesis (1.2), T_n has the same distribution as a'W_h'W_h a / a'G'Ga, where the n components of a are i.i.d. random variables having the standard normal distribution. Therefore, for any t,

P(T_n > t) = P[a'(W_h'W_h - tG'G)a > 0].   (2.1)

The probability (2.1) may be calculated for any n and any (fixed) value of the smoothing parameter h by using Theorem 2.1 in Box (1954). Box's theorem implies that

P[a'(W_h'W_h - tG'G)a > 0] = P(Σ_{j=1}^r λ_j χ_j² > 0),   (2.2)

where r = rank(W_h'W_h - tG'G), λ_1,…,λ_r are the real nonzero eigenvalues of W_h'W_h - tG'G, and χ_1²,…,χ_r² are independent chi-squared random variables, each with 1 degree of freedom.

To perform the desired hypothesis test there are at least two possible approaches. One could approximate percentiles of T_n's null distribution by using (2.1), (2.2) and one of the suggested procedures for approximating the distribution of linear combinations of independent χ² random variables (see, e.g., Box, 1954; Imhof, 1961; and Farebrother, 1990). A second approach is to simply use Monte Carlo methods to approximate the null distribution of T_n. The latter approach is the one we have taken in data analyses and simulation studies. For a given set of data, one may perform the hypothesis test by rejecting H₀ only when P < α, where α is the desired significance level,

P = P[a'W_h'W_h a / a'G'Ga > T_n^obs],

and T_n^obs is the observed value of T_n. The quantity P may be estimated to any desired degree of precision by generating m independent sets of n i.i.d. standard normal variates and taking m sufficiently large.

2.2. Scale estimation

In principle, any scale estimator of the form d'Bd could be used in the statistic (1.3). For example, one could use s_d² = Σ_{i=1}^n (d_i - d̄)²/n. Both s_d² and σ̂_R² are consistent (as n → ∞) for σ_d² = Var(d_i) under the null hypothesis (1.2). However, we prefer σ̂_R² since it is also consistent for σ_d² under the alternative hypothesis, so long as lim_{n→∞} max_{i=2,…,n} (x_i - x_{i-1}) = 0. By contrast, s_d² is consistent for

σ²(f - g) = σ_d² + ∫₀¹ [(f(x) - g(x)) - (f̄ - ḡ)]² r(x) dx,

where f̄ - ḡ = ∫₀¹ (f(x) - g(x)) r(x) dx. It follows that s_d² is consistent for σ_d² if and only if f(x) - g(x) = c for all x. Since σ²(f - g) > σ_d² for nonconstant alternatives, a test using σ̂_R² will tend to have larger power than a test using s_d²; hence our preference for σ̂_R².

2.3. Nonnormal errors

When the errors in model (1.1) are not normally distributed, (2.1) and (2.2) at best only approximately describe the null distribution of the test statistic. However, our experience indicates that this approximation often works quite well even when the sample size is as small as n = 20. Evidence of this will be seen in the simulation study of Section 4. King (1988) shows that, when f̂_h and ĝ_h are Gasser-Müller type kernel estimators of f and g (see Gasser and Müller, 1979), a properly standardized version of T_n is asymptotically normal as n → ∞, h → 0 and nh → ∞. One can use this fact to perform an asymptotically valid test of H₀ vs. H₁ even when the normality assumption is inappropriate. The simulation study of King (1988), though, suggests that, when n is small, using the distribution in (2.2) as an approximation tends to yield smaller level error than does the test based on the asymptotic normality result.

2.4. Power of the test

Define t_α to be the 1 - α quantile of T_n's distribution under H₀. When the alternative hypothesis is true, the power of our test is

P(T_n > t_α) = P[d'(W_h'W_h - t_α G'G)d > 0] = P(Σ_{j=1}^r λ_j χ_j²(1, δ_j²) > 0),

where the λ_j's are as defined in Section 2.1, δ_j is a linear combination of (f(x_1) - g(x_1))/σ_d, …, (f(x_n) - g(x_n))/σ_d, and the χ_j²(1, δ_j²) are independent, noncentral chi-squared random variables, each with 1 degree of freedom and noncentrality parameter δ_j².

King (1988) studied asymptotic power properties of our proposed test when the linear smoother is a Gasser-Müller kernel estimator with smooth, symmetric kernel K having support (-1, 1). Consider a sequence of size-α tests for which h → 0 and nh → ∞. For a fixed alternative f - g, King (1988) showed that the power of such tests tends to 1 as n → ∞.

King (1988) also studied asymptotic power for a sequence of local alternatives. For a fixed function u (not identically 0), define the local alternatives u_n(x) = f(x) - g(x) = u(x)·(n√h)^{-1/2}, and again let h → 0 and nh → ∞. Then King showed that, as n → ∞, the power of our test tends to

P(Z > z_α - ∫₀¹ u²(x) r(x) dx / (σ_d² √B)),

where Z has the standard normal distribution with 1 - α quantile z_α, and

B = 2 ∫ [∫ K(z)K(z + y) dz]² dy.

2.5. Choice of smoothing parameter

To each choice of smoothing parameter h there corresponds a different test of the null hypothesis (1.2). Ideally one would choose h to maximize power. In general, though, the most powerful smoothing parameter will depend on the unknown function (f - g)/σ_d. One way of proceeding is to fix h by using a data-based method such as cross-validation (see, e.g., Härdle, Hall and Marron, 1988). There are two potential deficiencies of this approach. First of all, this procedure adds randomness to the test, and hence the distribution theory of Section 2.1 is no longer valid. On the other hand, if one uses simulation to approximate the null distribution, the extra randomness could be accounted for in the simulation. Another objection to using cross-validation is that it is designed for a different purpose than maximizing the power of a test. Cross-validation on the d_i's tends to produce a good point estimate of f - g, i.e. one which comes close to minimizing Σ[f̂_h(x_i) - ĝ_h(x_i) - (f(x_i) - g(x_i))]². Our experience is that, when using a kernel estimate in (1.3), the bandwidth h that maximizes power tends to be quite a bit larger than ones which produce visually good estimates of f - g.

In practice the experimenter often has a reasonable idea, at least qualitatively, of the type of alternative to expect. Therefore, another approach would be to choose a bandwidth that is optimal over a class of alternatives of the expected type. Monte Carlo can be used to approximate the optimal bandwidth for any particular alternative (f - g)/σ_d. If the optimal bandwidth varies significantly over the alternatives of interest, then one could use the test statistic T_n* = Σ_i π_i T_{n,i}, where T_{n,i} is the test statistic of form (1.3) which maximizes power for a particular alternative, and π_i is the prior probability attached to that particular alternative. The statistic T_n* is still a ratio of quadratic forms, and hence its distribution is no harder to determine than that of (1.3).

3. An illustration

Matis, Wehrly and Ellis (1989) developed nonlinear models for the digesta flow through the gastrointestinal tracts of ruminants. These models were fit to the observed concentration of marker in the feces of four cows given each of two types of diet. The data were originally presented by Ferreiro, Boodoo, Sutton and Bishop (1980). The experiment was designed to compare the effects of two diets, one a chopped straw and the other a ground and pelleted straw, denoted as treatments C and V, respectively. Four cows were used in the experiment and each received the two diets. The straw was stained with magenta, and each cow was given a meal with the stained straw. The feces were collected at regular time intervals, and the observed concentration of stained particles was recorded. It was noted that, for each cow, the best fitting models were qualitatively different for the two diets. However, for a given diet the same model fit all the cows quite well. Since the emphasis in Matis, Wehrly and Ellis (1989) was on developing and fitting models, no tests of significance were performed to determine whether the response curves differed either for the two diets given to a single cow, or for two cows given the same diet. We will use our methodology to test whether Cow 7 had different responses to diets C and V, and whether Cows 6 and 8 had different responses to diet C.

Defining K(u) = 0.75(1 - u²)I(|u| ≤ 1), kernel estimates of the concentration-time curves for Cow 7 were computed as follows. We transformed the time scale to produce equally spaced points on the unit interval, computed the Gasser-Müller (1979) smoother

f̂(x) = (1/h) Σ_{i=1}^n y_i ∫_{(i-1)/n}^{i/n} K((x - s)/h) ds,

where x ∈ [0,1], and then transformed back to the original time scale using the technique of Carroll and Härdle (1989). Effectively, this procedure produces a kernel estimate whose bandwidth varies with design density. Figure 1 shows the regression estimates and the observed data.

Fig. 1. Kernel regression estimates of the concentration-time curves for Cow 7 under Diets C and V. The observed responses are given by × for Diet C and by Δ for Diet V. The solid and dashed lines denote, respectively, the Diet C and V estimated response curves.

The test statistic T_n was computed for each of the bandwidths h = 0.1, 0.2, 0.3, 0.4 and 0.5, where h is measured on the transformed time scale. The P-value was estimated according to the method described in Section 2.1 by using m = 8000 simulations, so that the error of estimation would be less than 0.01 with 95% confidence. The values of the test statistics and P-values are presented in Table 1.

Table 1
Values of the test statistic T_n for comparing Diets C and V for Cow 7, and the corresponding estimated P-values based on 8000 simulations

h      0.1      0.2      0.3        0.4        0.5
T_n    5.10     4.20     3.22       2.51       2.28
P      0.0000   0.0000   0.000125   0.000125   0.000375

A similar analysis was performed to compare the responses of Cows 6 and 8 to Diet C. Figure 2 shows the regression estimates, and Table 2 presents the values of the test statistics and the P-values.

Fig. 2. Kernel regression estimates of the concentration-time curves for Cows 6 and 8 under Diet C. The observed responses are given by × for Cow 6 and by Δ for Cow 8. The solid and dashed lines denote, respectively, the Cow 6 and Cow 8 estimated response curves.

Table 2
Values of the test statistic T_n for comparing Cows 6 and 8 under Diet C, and the corresponding estimated P-values based on 8000 simulations

h      0.1     0.2     0.3     0.4     0.5
T_n    0.69    0.44    0.40    0.39    0.43
P      0.212   0.243   0.196   0.177   0.124

The P-values in Table 1 indicate strong evidence that the curves for the two diets differ for Cow 7. Even smaller P-values were observed for the other three cows in the study. The P-values in Table 2 indicate that the Diet C response curves of Cows 6 and 8 did not differ significantly. Similar pairwise comparisons for the other cows within diets could also be performed for a more complete analysis.

Though one would naturally expect some difference in before and after readings on a given animal, our test provides a means of formally establishing that the observed difference is larger than would be expected from experimental error alone. If the tests applied to each of a few animals are all significant, and the difference in before and after readings is consistently in one direction, then one feels confident that a treatment effect has been detected. When a fairly large number of experimental units is available, it would be more appropriate to perform a repeated measures analysis. In a repeated measures design, covariance between measurements on the same subject would typically be modelled through the error terms. In our analysis this covariance was modelled through the mean functions f and g.

4. A simulation study
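The Gasser-Müller weights above can be computed in closed form, since the Epanechnikov kernel has a polynomial antiderivative. A minimal sketch under the equally spaced design x_i = i/n (boundary kernels omitted, so the estimate is only reliable away from 0 and 1; function names are ours):

```python
import numpy as np

def K_antideriv(u):
    """Integral of the Epanechnikov kernel 0.75(1 - t^2) from -1 to u."""
    u = np.clip(u, -1.0, 1.0)
    return 0.75 * (u - u**3 / 3.0) + 0.5

def gasser_muller(x_grid, y, h):
    """Gasser-Mueller smoother for equally spaced design points x_i = i/n:
    f_hat(x) = sum_i y_i * (1/h) * integral_{(i-1)/n}^{i/n} K((x - s)/h) ds."""
    n = len(y)
    edges = np.arange(n + 1) / n                 # interval endpoints 0, 1/n, ..., 1
    lo = (x_grid[:, None] - edges[None, :-1]) / h
    hi = (x_grid[:, None] - edges[None, 1:]) / h
    W = K_antideriv(lo) - K_antideriv(hi)        # weights w_i(x; h)
    return W @ y
```

Because the interval weights telescope, they sum to exactly 1 whenever [x - h, x + h] ⊂ [0, 1], so interior estimates reproduce constants; near the boundary the weights sum to less than 1, which is the boundary bias that boundary kernels correct.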

A simulation study was conducted to investigate the level error and power of our proposed test. Throughout the study the design points were evenly spaced, and the test statistic T_n was based on a kernel smoother of the form given in Section 3, with kernel

K(u) = 0.75(1 - u²)I(|u| ≤ 1).

To avoid large boundary bias, boundary kernels were used at design points within a bandwidth of either 0 or 1 (see Gasser and Müller, 1979). Tests of nominal level 0.05 and 0.10 were considered, and the following three factors were of interest: (i) the distribution of the errors, (ii) the sample size n and (iii) the bandwidth h.

Three models for the errors were used. In two of these, ε_1,…,ε_n, η_1,…,η_n were taken to be independent and identically distributed, with √2 ε_i having a standard normal distribution in one case and 2ε_i distributed as t with 4 degrees of freedom in the other. To investigate how skewness of ε_i - η_i affects the level of our test, we studied a third model in which ε_i and η_i were distributed, respectively, as 3(χ₄² - 4)/√80 and (χ₄² - 4)/√80, where χ₄² has the chi-squared distribution with 4 degrees of freedom. In each of the three models ε_i - η_i has variance 1, and hence power comparisons across error models are not confounded by scale differences. Finally, two choices for the sample size were considered, n = 20 and 50, and four bandwidths were used at each n (see Tables 3 and 4).
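The three error models can be sketched as follows. Each scaling constant is chosen to satisfy the stated Var(ε_i - η_i) = 1 constraint; in particular, the √80 divisor in the skewed model is inferred from that constraint rather than stated explicitly in the text:

```python
import numpy as np

def draw_errors(model, n, rng):
    """Draw (eps, eta) under the three error models of the level study.
    All models are scaled so that Var(eps_i - eta_i) = 1."""
    if model == "normal":     # sqrt(2)*eps ~ N(0,1)  =>  Var(eps) = 1/2
        eps = rng.standard_normal(n) / np.sqrt(2)
        eta = rng.standard_normal(n) / np.sqrt(2)
    elif model == "t":        # 2*eps ~ t_4, Var(t_4) = 2  =>  Var(eps) = 1/2
        eps = rng.standard_t(4, n) / 2
        eta = rng.standard_t(4, n) / 2
    elif model == "chisq":    # skewed difference of chi-squared variates
        eps = 3 * (rng.chisquare(4, n) - 4) / np.sqrt(80)
        eta = (rng.chisquare(4, n) - 4) / np.sqrt(80)
    else:
        raise ValueError(model)
    return eps, eta
```

A quick check: Var(χ₄²) = 8, so Var(ε - η) in the skewed model is (9 + 1)·8/80 = 1, matching the normal and t models.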

Table 3
Rejection rates of various tests under the null hypothesis (n = 20)

α      Error          D_n     T_n        T_n        T_n        T_n        Paired t
       distribution           (h=0.10)   (h=0.15)   (h=0.20)   (h=0.25)
0.05   Normal         0.040   0.044      0.040      0.040      0.038      0.050
       χ²             0.060   0.056      0.070      0.078      0.082      0.074
       t              0.028   0.042      0.048      0.048      0.052      0.048
0.10   Normal         0.098   0.078      0.090      0.094      0.090      0.106
       χ²             0.114   0.120      0.118      0.122      0.124      0.108
       t              0.084   0.096      0.098      0.098      0.104      0.086

Entries in the same row are based on the same set of 500 replications. The error models are explained in the text. χ² refers to the linear combination of independent χ₄² random variables, while t refers to the difference of independent t₄ variates.

All tests based on T_n were performed as described in Section 2 with m = 500. For the sake of comparison, two tests in addition to the one based on T_n were considered. These were the usual paired t-test (where a pair of observations is (y_i, z_i)) and a test based on the statistic

D_n = Σ_{i=2}^n d_i d_{i-1} / (s_d² √(n - 1)),

where s_d² is the sample variance of the d_i's. Using a central limit theorem for 1-dependent random variables, it is easy to show that D_n has an asymptotically standard normal distribution under H₀. Therefore, in the simulation study we used the test: "reject H₀ (at level α) if D_n exceeds the 1 - α quantile of the standard normal distribution." Essentially, the paired t-test and the test based on D_n are extremes of the T_n-based tests, with the t-test corresponding to a very large bandwidth and the test based on D_n corresponding to no smoothing of the data. The statistic D_n is very similar to the Durbin-Watson statistic, which is most often used to detect serial correlation among residuals. See Munson and Jernigan (1989) for another statistic similar to D_n.

The results of the level study are given in Tables 3 and 4. At each n and error model 500 replications were performed, and all six of the tests were performed at both α levels in a given replication. Replications thereby play the role of blocks for purposes of comparing tests. Tables 3 and 4 show that the tests based on T_n held their level reasonably well even when the distribution of ε_i - η_i was skewed or long-tailed. With skewed d_i's, there is evidence, in particular at n = 20, that the T_n-based tests are liberal, but the excess of the rejection rate over the nominal level in most cases is not large. When the errors are normally distributed, of course, the t-test is exact and our tests based on T_n are exact up to approximating the critical values by Monte Carlo methods. It is thus not surprising that the T_n-based tests performed well in the normal errors case.

To investigate power, five choices for u = f - g (other than u = 0) were made. Each choice was of the form u(x) = a + bx + c sin(dx), x ∈ [0,1], where a, b, c and d are constants. The constants used in the simulation study are given in Table 5.
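The "no smoothing" competitor D_n described above is a one-liner; a sketch (the function name is ours):

```python
import numpy as np

def Dn(d):
    """First-lag statistic D_n = sum_{i=2}^n d_i d_{i-1} / (s_d^2 sqrt(n-1)),
    asymptotically N(0,1) under H0; the 'no smoothing' competitor to T_n."""
    n = len(d)
    s2 = np.var(d)                        # sample variance of the d_i
    return np.sum(d[1:] * d[:-1]) / (s2 * np.sqrt(n - 1))
```

Like the Durbin-Watson statistic, D_n responds to lag-one agreement in the differences: a smooth nonzero f - g makes neighboring d_i's move together, pushing D_n positive.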

Fig. 3. Estimated power for the shift alternative u(x) = 0.47 (based on 500 replications; n = 50, α = 0.05).

Table 4
Rejection rates of various tests under the null hypothesis (n = 50)

α      Error          D_n     T_n        T_n        T_n        T_n        Paired t
       distribution           (h=0.04)   (h=0.10)   (h=0.20)   (h=0.25)
0.05   Normal         0.050   0.050      0.052      0.060      0.062      0.052
       χ²             0.066   0.058      0.054      0.038      0.056      0.062
       t              0.072   0.064      0.064      0.048      0.062      0.072
0.10   Normal         0.108   0.098      0.092      0.110      0.102      0.104
       χ²             0.122   0.106      0.098      0.088      0.106      0.116
       t              0.136   0.112      0.110      0.088      0.118      0.128

See notes of Table 3.

Two of the alternatives correspond to shifts, one is linear and integrates to 0, and the other two oscillate about 0. In conducting the power study, the same sets of simulated errors were used as in the level study. Hence, at a given n and error model, 5 × 500 data sets of the form d_ij = u_j(i/n) + ε_i - η_i, j = 1,…,5, i = 1,…,n, were obtained, where u_j is a given alternative to H₀ and ε_1 - η_1,…,ε_n - η_n was a set of differences from the level study. Here we only display power results for the case n = 50 and α = 0.05 (see Figures 3-5). As would be expected, the t-test was more powerful than all others against shift alternatives. However, when the alternative was such that u alternated between positive and negative values, the t-test had very low power compared to the other tests. The results for n = 50, α = 0.10 were very similar to those at n = 50, α = 0.05. At n = 20, there was no clear pattern to indicate that smoothing usually leads to a more powerful test. For n = 20, except in the case of shift alternatives, the test based on D_n (i.e., no smoothing) had power comparable to or higher than the 'smooth' tests. However, at n = 50, the superiority of tests based on smoothing becomes evident.

In summary, all tests considered did a reasonable job of holding their levels, and so none should be rejected on the grounds of being patently invalid. In terms of power, there was some sensitivity of the results to choice of bandwidth. (One may as well regard the t-test and D_n-based tests as special cases of T_n-based tests with, respectively, very large and small bandwidths.) The discussion

Fig. 4. Estimated power for the straight line alternative u(x) = 0.814 - 1.628x (based on 500 replications; n = 50, α = 0.05).

Fig. 5. Estimated power for the sine curve alternative u(x) = 0.671 sin(9.42x) (based on 500 replications; n = 50, α = 0.05).

Table 5
Models used in the simulation study: u(x) = a + bx + c sin(dx)

a        b        c       d      ∫₀¹ u²(x) dx   ∫₀¹ u(x) dx
0        0        0       1      0              0
0.1565   0        0       1      0.0245         0.1565
0.4700   0        0       1      0.2209         0.4700
0.8140   -1.628   0       1      0.2209         0
0        0        0.671   9.42   0.2213         0.0712
0        0        1.233   9.42   0.7471         0.1309
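The entries of Table 5 are easy to reproduce numerically. The quick check below recovers the shift and straight-line entries of the ∫u² column (the function names are ours):

```python
import numpy as np

def u(x, a, b, c, d):
    """Alternative mean-difference curves of Table 5: u(x) = a + bx + c sin(dx)."""
    return a + b * x + c * np.sin(d * x)

def int_u2(a, b, c, d, m=100000):
    """Midpoint-rule approximation of  integral_0^1 u^2(x) dx."""
    x = (np.arange(m) + 0.5) / m
    return np.mean(u(x, a, b, c, d) ** 2)
```

For the shift u(x) = 0.47 the integral is 0.47² = 0.2209 exactly, and the zero-mean line 0.814 - 1.628x gives 0.814²/3, which also rounds to 0.2209, matching the table.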

in Section 2.5 provides some ideas on how to choose the bandwidth in practice.

References

Box, G.E.P. (1954), Some theorems on quadratic forms applied in the study of analysis of variance problems, I. Effect of inequality of variance in the one-way classification, Ann. Math. Statist. 25, 290-302.
Carroll, R.J. and W. Härdle (1989), Symmetrized nearest neighbor regression estimates, Statist. Probab. Lett. 7, 315-318.
Cox, D., E. Koh, G. Wahba and B.S. Yandell (1988), Testing the (parametric) null model hypothesis in (semiparametric) partial and generalized spline models, Ann. Statist. 16, 113-119.
Dabrowska, D.M. (1987), Non-parametric regression with censored survival time data, Scand. J. Statist. 14, 181-197.
Eubank, R.L. (1988), Spline Smoothing and Nonparametric Regression (Dekker, New York).
Eubank, R.L. and C.H. Spiegelman (1990), Testing the goodness of fit of a linear model via nonparametric regression techniques, J. Amer. Statist. Assoc. 85, 387-392.
Farebrother, R.W. (1990), The distribution of a quadratic form in normal variables, Appl. Statist. 39, 294-309.
Ferreiro, G.J., A.A. Boodoo, J.D. Sutton and C. Bishop (1980), National Institute for Research in Dairying, 1980 Report (Shinfield, UK).
Gasser, T. and H.G. Müller (1979), Kernel estimation of regression functions, in: Gasser and Rosenblatt, eds., Smoothing Techniques for Curve Estimation, Lecture Notes in Mathematics No. 757 (Springer, Heidelberg) pp. 23-68.

Hall, P. and J.D. Hart (1990), Bootstrap test for difference between means in nonparametric regression, J. Amer. Statist. Assoc. 85, 1039-1049.
Härdle, W., P. Hall and J.S. Marron (1988), How far are automatically chosen regression smoothing parameters from the optimum? (with discussion), J. Amer. Statist. Assoc. 83, 86-101.
Härdle, W. and J.S. Marron (1990), Semiparametric comparison of regression curves, Ann. Statist. 18, 63-89.
Horváth, L., B.S. Yandell and A. Sen (1990), Convergence of kernel regression estimators, Tech. Rept. No. 869, Dept. of Statist., Univ. of Wisconsin (Madison, WI).
Imhof, J.P. (1961), Computing the distribution of quadratic forms in normal variables, Biometrika 48, 419-426.
King, E.C. (1988), A test of the equality of two regression curves based on kernel smoothers, Ph.D. dissertation, Dept. of Statist., Texas A&M Univ. (College Station, TX).
Knafl, G., J. Sacks and D. Ylvisaker (1985), Confidence bands for regression functions, J. Amer. Statist. Assoc. 80, 683-691.
Kozek, A.S. (1990), A nonparametric test of fit of a linear model, Comm. Statist. - Theory Methods 19(1), 169-179.
Kozek, A.S. (1991), A nonparametric test of fit of a parametric model, J. Multivariate Anal. 37, 66-75.
Matis, J.H., T.E. Wehrly and W.C. Ellis (1989), Some generalized stochastic compartment models for digesta flow, Biometrics 45, 703-720.
Munson, P.J. and R.W. Jernigan (1989), A cubic spline extension of the Durbin-Watson test, Biometrika 76, 39-47.
Raz, J. (1990), Testing for no effect when estimating a smooth function by nonparametric regression: a randomization approach, J. Amer. Statist. Assoc. 85, 132-138.
Rice, J. (1984), Bandwidth choice for nonparametric regression, Ann. Statist. 12, 1215-1230.
