Statistics & Probability Letters 8 (1989) 167-170
North-Holland

BIAS IN NONLINEAR REGRESSION MODEL WITH HETEROSCEDASTIC OR AR(1) ERROR STRUCTURE

Chih-Ling TSAI
Division of Statistics, University of California at Davis, Davis, CA 95616, USA

Received March 1988
Revised July 1988
Abstract: We investigate the biases of the maximum likelihood estimators from normal nonlinear regression models. Emphasis is placed on the heteroscedastic and first-order autoregressive error structures. Bias reduction after parameter transformation is also discussed.

Keywords: autocorrelation, bias, heteroscedasticity, transformation.
1. Introduction
The standard nonlinear regression model can be represented as

$$y_i = \eta(x_i, \beta) + e_i \qquad (i = 1, \ldots, n), \qquad (1)$$

where $y_i$ is the $i$th response, $x_i$ is a vector of known variables, $\beta$ is a $p \times 1$ vector of unknown parameters, the response function $\eta$ is assumed to be known and twice continuously differentiable in $\beta$, and the errors are assumed to be independent, identically distributed normal random variables with mean 0 and variance $\sigma^2$. For model (1) with homoscedastic error structure, the maximum likelihood estimator $\hat\beta$ of $\beta$ can be found by minimizing the objective function

$$J(\beta) = \sum_{i=1}^{n} \{y_i - \eta(x_i, \beta)\}^2.$$
Numerous approximations for the bias vector $E(\hat\beta - \beta)$ to order $1/n$ are available in the literature, such as Cox and Snell (1968), Box (1971), Bates and Watts (1980), Clarke (1980), Hougaard (1982), Amari (1982) and Cook, Tsai and Wei (1986). It can be shown that, apart from differences in notation, all these bias approximations are identical for model (1) and are special cases of the results derived in this paper. In general, for any given loglikelihood function $l$ and its related $t \times 1$ vector of unknown parameters $\theta$, the bias $b_\theta$ of the maximum likelihood estimator $\hat\theta$ of $\theta$ to order $1/n$ can be obtained from Bartlett's result (1953, eq. (28)). Using this result, we derive

$$b_\theta = E(\hat\theta - \theta) = F^{-1}\xi, \qquad \xi_i = \mathrm{tr}\bigl\{\tfrac{1}{2}F^{-1}(G)_i + F^{-1}(H)_i\bigr\}, \qquad (2)$$

where $F = -E(\partial^2 l/\partial\theta\,\partial\theta')$, $(G)_i = E(\partial^3 l/\partial\theta_i\,\partial\theta\,\partial\theta')$, $(H)_i = E(\partial l/\partial\theta_i \cdot \partial^2 l/\partial\theta\,\partial\theta')$ and $i = 1, \ldots, t$. The purpose of this paper is to apply equation (2) to obtain the bias of the maximum likelihood estimator for both the regression parameter $\beta$ and the disturbance parameter $\lambda$ (or $\rho$), where $\lambda$ appears in the heteroscedastic error structure and $\rho$ is the autocorrelation coefficient in the AR(1) error structure. In addition, bias reduction through parameter transformation is discussed.
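As a computational illustration of equation (2), the following sketch (Python with NumPy; the function name and array layout are our own, not from the paper) evaluates $b_\theta$ from user-supplied expectation arrays $F$, $(G)_i$ and $(H)_i$:

```python
import numpy as np

def mle_bias(F, G, H):
    """O(1/n) bias of the MLE via Bartlett's result, eq. (2).

    F : (t, t) array, expected information -E(d^2 l / d theta d theta')
    G : (t, t, t) array, G[i] = E(d^3 l / d theta_i d theta d theta')
    H : (t, t, t) array, H[i] = E(d l / d theta_i * d^2 l / d theta d theta')
    """
    Finv = np.linalg.inv(F)
    t = F.shape[0]
    # i-th component: xi_i = tr{ (1/2) F^{-1} (G)_i + F^{-1} (H)_i }
    xi = np.array([np.trace(Finv @ (0.5 * G[i] + H[i])) for i in range(t)])
    # b_theta = F^{-1} xi
    return Finv @ xi
```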
2. Bias of $\hat\beta$ and $\hat\lambda$

In this section, we assume that $e_i$ in equation (1) follows a normal distribution with mean 0 and variance $k_i\sigma^2 = k(z_i, \lambda)\sigma^2$, where the weight function $k_i$ depends on a $q \times 1$ vector of unknown parameters $\lambda$ and an $r \times 1$ vector $z_i$, $i = 1, \ldots, n$. Using the result given in equation (2), the bias of the maximum likelihood estimator of $\theta = (\beta, \lambda)$ is
$$b_\theta = \begin{pmatrix} b_\beta \\ b_\lambda \end{pmatrix} = \begin{pmatrix} (V'V)^{-1}V'g \\ (D'D)^{-1}D'h \end{pmatrix}, \qquad (3)$$

where $V = \Sigma^{-1/2}\{\partial\eta(x, \beta)/\partial\beta\}$, $D = \Sigma^{-1}\{\partial k(z, \lambda)/\partial\lambda\}$, $A = \Sigma^{-1/2}\{\partial^2\eta(x, \beta)/\partial\beta\,\partial\beta'\}$, $B = \Sigma^{-1}\{\partial^2 k(z, \lambda)/\partial\lambda\,\partial\lambda'\}$, $\Sigma = \mathrm{diag}\{k(z_i, \lambda)\}$,

$$g_i = -\tfrac{1}{2}\mathrm{tr}\{(V'V)^{-1}A_i\}\sigma^2, \qquad h_i = -\tfrac{1}{2}\mathrm{tr}\{(D'D)^{-1}B_i\} - \tfrac{1}{2}d_i,$$
and the $i$th components of $A$, $B$ and $d$ are, respectively, $A_i$, $B_i$ and $d_i = D_i'(D'D)^{-1}D_i$, $i = 1, \ldots, n$. Thus, the biases $b_\beta$ and $b_\lambda$ are simply the sets of coefficients from the ordinary least-squares regressions of $g$ and $h$ on the columns of $V$ and $D$, respectively. The bias $b_\beta$ depends on the parameter-effects curvature array (Bates and Watts, 1980) of the regression function $\eta$. In other words, the bias $b_\beta$ is strictly a property of the parametrization and can be reduced or eliminated by an appropriate transformation, which will be discussed in Section 4. In addition, the magnitude of the bias $b_\beta$ depends on the weight function $k(z_i, \lambda)$: as $k(z_i, \lambda)$ increases, the norm of the bias $b_\beta$ increases as well. The bias $b_\lambda$ depends on both the parameter-effects curvature array of the weight function $k$ and the leverage components $d_i$. Thus, even if the parameter-effects nonlinearity can be reduced by an appropriate power transformation, remote leverage points $d_i$ still have a substantial influence on the bias $b_\lambda$. For the nonlinear regression model with homoscedastic error structure or known weight function, $b_\lambda = 0$ and $b_\beta$ is the same as the result given by Cook, Tsai and Wei (1986, eq. (3)). A scaled measure of the bias $b_\beta$ can be obtained by adapting either Ratkowsky's (1983, p. 21) "percentage bias" or Box's (1971, eq. (3.1)) "average ratio of the squared bias" criterion.
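Equation (3) translates directly into a few lines of linear algebra. The sketch below is a hypothetical helper, not code from the paper; the derivative arrays $V$, $A_i$, $D$ and $B_i$ are assumed to be supplied by the user's model. It computes $b_\beta$ and $b_\lambda$ as the OLS coefficients just described:

```python
import numpy as np

def bias_beta_lambda(V, A, D, B, sigma2):
    """Biases b_beta and b_lambda from eq. (3).

    V : (n, p)    scaled first-derivative matrix Sigma^{-1/2} d(eta)/d(beta)
    A : (n, p, p) scaled second-derivative array, A[i] = A_i
    D : (n, q)    scaled weight-derivative matrix Sigma^{-1} d(k)/d(lambda)
    B : (n, q, q) scaled second-derivative array, B[i] = B_i
    """
    VtV_inv = np.linalg.inv(V.T @ V)
    DtD_inv = np.linalg.inv(D.T @ D)
    n = V.shape[0]
    # g_i = -(1/2) tr{(V'V)^{-1} A_i} sigma^2
    g = np.array([-0.5 * sigma2 * np.trace(VtV_inv @ A[i]) for i in range(n)])
    # leverage d_i = D_i'(D'D)^{-1} D_i, then h_i = -(1/2) tr{(D'D)^{-1} B_i} - (1/2) d_i
    d = np.einsum('ij,jk,ik->i', D, DtD_inv, D)
    h = np.array([-0.5 * np.trace(DtD_inv @ B[i]) - 0.5 * d[i] for i in range(n)])
    # biases = OLS coefficients of g on the columns of V and of h on the columns of D
    b_beta = VtV_inv @ (V.T @ g)
    b_lambda = DtD_inv @ (D.T @ h)
    return b_beta, b_lambda
```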
3. Bias of $\hat\beta$ and $\hat\rho$

In this section, the error structure is assumed to follow the first-order autoregressive process, i.e.,

$$e_i = \rho e_{i-1} + \epsilon_i \qquad (i = 2, \ldots, n),$$

where $e_1 = \epsilon_1$, $\rho$ denotes the unknown autocorrelation coefficient, and the $\epsilon_i$'s are normally and independently distributed with mean 0 and constant variance $\sigma^2$.
Using the result given in equation (2), the bias of the maximum likelihood estimator of the unknown parameter $\theta = (\beta, \rho)$ is obtained as follows:

$$b_\theta = E\begin{pmatrix} \hat\beta - \beta \\ \hat\rho - \rho \end{pmatrix} = \begin{pmatrix} b_\beta \\ b_\rho \end{pmatrix} = \begin{pmatrix} (\tilde V'\tilde V)^{-1}\tilde V' g \\[6pt] -\dfrac{2\rho}{n} - \dfrac{(1 - \rho^2)^2\,\mathrm{tr}\{(\tilde V'\tilde V)^{-1}\tilde V' C \tilde V\}}{2 + (n - 3)(1 - \rho^2)} \end{pmatrix},$$

where $g_i = -\tfrac{1}{2}\mathrm{tr}\{(\tilde V'\tilde V)^{-1}\tilde W_i\}\sigma^2$, $\tilde V = Q\{\partial\eta(x, \beta)/\partial\beta\}$, $\tilde W = Q\{\partial^2\eta(x, \beta)/\partial\beta\,\partial\beta'\}$, $\tilde W_i$ is the $i$th component of $\tilde W$, $C$ is the $n \times n$ matrix with elements $c_{ij} = \rho^{j-i-1}$ for $j > i$ and $c_{ij} = 0$ otherwise, and

$$Q = \begin{pmatrix} \sqrt{1-\rho^2} & 0 & \cdots & 0 \\ -\rho & 1 & \ddots & \vdots \\ & \ddots & \ddots & 0 \\ 0 & & -\rho & 1 \end{pmatrix}.$$
The bias $b_\beta$, obtained from the least-squares regression of $g$ on the columns of $\tilde V$, depends on the nonlinearity of the regression function $\eta$ and the autocorrelation coefficient $\rho$. For the AR(1) error structure with zero mean, $b_\rho = -2\rho/n$, as given by Phillips (1977, p. 475) and Tanaka (1984, p. 66). If the zero mean is replaced by a constant mean, then $b_\rho = -(3\rho + 1)/n$, as given by Tanaka (1983, p. 1225).
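The zero-mean result $b_\rho = -2\rho/n$ is easy to confirm numerically. Below is a minimal Monte Carlo sketch (our own, for illustration), using the least-squares estimator of $\rho$ as a convenient stand-in for the maximum likelihood estimator:

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho, reps = 50, 0.5, 20000
biases = []
for _ in range(reps):
    eps = rng.standard_normal(n)
    e = np.empty(n)
    e[0] = eps[0]                    # start-up e_1 = epsilon_1, as in the model above
    for i in range(1, n):
        e[i] = rho * e[i - 1] + eps[i]
    # least-squares estimate of rho from the regression of e_i on e_{i-1}
    rho_hat = (e[1:] @ e[:-1]) / (e[:-1] @ e[:-1])
    biases.append(rho_hat - rho)
print(np.mean(biases), -2 * rho / n)  # both approximately -0.02
```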
4. Parameter transformations

From the discussion in the previous two sections, we notice that the biases of the maximum likelihood estimators $\hat\beta$, $\hat\lambda$ and $\hat\rho$ of the unknown parameters $\beta$, $\lambda$ and $\rho$ are related to the model's parametrization. Thus, an appropriate parameter transformation may reduce the bias to a negligible level. Let the transformed parameters be $\beta^* = a_1(\beta)$, $\lambda^* = a_2(\lambda)$ and $\rho^* = a_3(\rho)$. Then the relationship between the bias of the transformed parameter and that of the untransformed parameter is given as follows:

$$b_{\beta^*} = M_\beta b_\beta + \tfrac{1}{2}\mathrm{tr}\{(V'V)^{-1}N_\beta\},$$
$$b_{\lambda^*} = M_\lambda b_\lambda + \tfrac{1}{2}\mathrm{tr}\{(D'D)^{-1}N_\lambda\},$$

and

$$b_{\rho^*} = M_\rho b_\rho + \tfrac{1}{2}(1 - \rho^2)^2 N_\rho / \{2 + (n - 3)(1 - \rho^2)\}, \qquad (4)$$

where $M_\beta = \partial a_1/\partial\beta$, $M_\lambda = \partial a_2/\partial\lambda$, $M_\rho = \partial a_3/\partial\rho$, $N_\beta = \partial^2 a_1/\partial\beta\,\partial\beta'$, $N_\lambda = \partial^2 a_2/\partial\lambda\,\partial\lambda'$ and $N_\rho = \partial^2 a_3/\partial\rho^2$.
For the standard nonlinear regression model (1) with a single unknown parameter $\beta$, Hougaard (1982, eq. (2.1)) proposed a parameter transformation procedure to obtain an asymptotically unbiased estimator. We can adopt his approach to reduce the bias of the maximum likelihood estimator of the autocorrelation coefficient $\rho$. If $n$ is large enough, the term $(1 - \rho^2)^2/\{2 + (n - 3)(1 - \rho^2)\}$ appearing in equation (4) can be replaced by $(1 - \rho^2)/n$, and the asymptotically unbiased estimator of the AR(1) model with zero mean can be obtained through the following parameter transformation:

$$a_3(\rho) = \rho/(1 - \rho^2) + \tfrac{1}{2}\ln\{(1 + \rho)/(1 - \rho)\}. \qquad (5)$$
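The transformation (5) can be checked by filling in the step that Hougaard's procedure prescribes: set the transformed bias in equation (4) to zero under the large-$n$ approximations $b_\rho = -2\rho/n$ and $(1 - \rho^2)^2/\{2 + (n - 3)(1 - \rho^2)\} \approx (1 - \rho^2)/n$, which gives a differential equation for $a_3$:

$$a_3'(\rho)\left(-\frac{2\rho}{n}\right) + \frac{1}{2}\,\frac{1 - \rho^2}{n}\,a_3''(\rho) = 0 \;\Longrightarrow\; \frac{a_3''(\rho)}{a_3'(\rho)} = \frac{4\rho}{1 - \rho^2} \;\Longrightarrow\; a_3'(\rho) \propto (1 - \rho^2)^{-2},$$

and integrating $(1 - \rho^2)^{-2}$ recovers, up to a constant multiple that does not affect unbiasedness, the right-hand side of equation (5).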
The second term of equation (5) is Fisher's $z$-transformation applied to the autocorrelation coefficient. For the multiparameter case, Hougaard (1984) indicated that it may not always be possible to obtain an asymptotically unbiased estimator by reparametrization. Here, we propose the following exponential parameter transformation to reduce the bias of the maximum likelihood estimator:

$$\theta_i^* = \begin{cases} (\delta_i^{\theta_i} - 1)/\ln(\delta_i), & \text{if } \delta_i \neq 1, \\ \theta_i, & \text{if } \delta_i = 1, \end{cases} \qquad \delta_i > 0 \quad (i = 1, \ldots, t).$$

Note that the parameter $\theta_i$ is not restricted to be positive. The purpose of this transformation is to find $\delta_i^*$ such that $|b_{\theta_i^*}|$ is minimized. The corresponding $\delta_i^*$ is obtained by drawing a graph of $|b_{\theta_i^*}|$ versus $\delta_i$. The advantage of this parameter transformation is that a $\delta_i^*$ can always be found to reduce the bias of the maximum likelihood estimator. However, this does not imply that the bias-corrected estimator is efficient under the transformation (see Amari, 1985, Chapter 5).
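The graphical search for $\delta_i^*$ is easy to script. A sketch for the scalar case follows, with illustrative numbers only: $b$ and $v$ stand for the order-$1/n$ bias and variance of $\hat\theta_i$, and the transformed bias follows the pattern of equation (4):

```python
import numpy as np

def transformed_bias(theta, b, v, delta):
    """|b_{theta*}| under theta* = (delta**theta - 1)/ln(delta), using the
    scalar pattern of eq. (4): b* = M*b + 0.5*v*N, where
    M = d theta*/d theta = delta**theta and
    N = d^2 theta*/d theta^2 = delta**theta * ln(delta)."""
    M = delta ** theta
    N = M * np.log(delta)
    return np.abs(M * b + 0.5 * v * N)

# illustrative numbers only: order-1/n bias b and variance v of the MLE
theta, b, v = 0.8, -0.01, 0.02
grid = np.linspace(0.1, 5.0, 500)
best = grid[np.argmin(transformed_bias(theta, b, v, grid))]
print(best)  # about e = 2.718..., where b + 0.5*v*ln(delta) vanishes
```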
References

Amari, S.I. (1982), Differential geometry of curved exponential families: curvatures and information loss, Ann. Statist. 10, 357-385.
Amari, S.I. (1985), Differential-Geometrical Methods in Statistics, Lecture Notes in Statistics 28 (Springer, New York).
Bartlett, M.S. (1953), Approximate confidence intervals II. More than one unknown parameter, Biometrika 40, 306-317.
Bates, D.M. and D.G. Watts (1980), Relative curvature measures of nonlinearity (with discussion), J. Roy. Statist. Soc. Ser. B 42, 1-25.
Box, M.J. (1971), Bias in nonlinear estimation (with discussion), J. Roy. Statist. Soc. Ser. B 33, 171-201.
Clarke, G.P.Y. (1980), Moments of the least squares estimators in a nonlinear regression model, J. Roy. Statist. Soc. Ser. B 42, 227-237.
Cook, R.D., C.L. Tsai and B.C. Wei (1986), Bias in nonlinear regression, Biometrika 73, 615-623.
Cox, D.R. and E.J. Snell (1968), A general definition of residuals (with discussion), J. Roy. Statist. Soc. Ser. B 30, 248-275.
Hougaard, P. (1982), Parametrizations for nonlinear models, J. Roy. Statist. Soc. Ser. B 44, 244-252.
Hougaard, P. (1984), Parameter transformations in multiparameter nonlinear regression models, Technical Report No. 2, University of Copenhagen.
Phillips, P.C.B. (1977), Approximations to some finite sample distributions associated with a first-order stochastic difference equation, Econometrica 45, 463-485.
Ratkowsky, D.A. (1983), Nonlinear Regression Modeling (Marcel Dekker, New York and Basel).
Tanaka, K. (1983), Asymptotic expansions associated with the AR(1) model with unknown mean, Econometrica 51, 1221-1231.
Tanaka, K. (1984), An asymptotic expansion associated with the maximum likelihood estimators in ARMA models, J. Roy. Statist. Soc. Ser. B 46, 58-67.