International Journal of Bio-Medical Computing 42 (1996) 181-190
Algorithms for robust nonlinear regression with heteroscedastic errors

László Tóthfalusi, László Endrényi*

Department of Pharmacology, Department of Preventive Medicine and Biostatistics, University of Toronto, Toronto, Ontario M5S 1A8, Canada

* Corresponding author.
Received 13 October 1995; accepted 3 April 1996
Abstract
Nonlinear regression algorithms were compared by Monte-Carlo simulations when the measurement error was dependent on the measured values (heteroscedasticity) and possibly contaminated with outliers. The tested least-squares (LSQ) algorithms either required user-supplied weights to accommodate heteroscedasticity or estimated the weights within the procedures. Robust versions of the LSQ algorithms, namely robust iteratively reweighted (IRR) and least absolute value (LAV) regressions, were also considered. The comparisons were based on the efficiency of the estimated parameters and on their resistance to outliers. By these criteria, extended least squares (ELSQ) was the most reliable of the tested LSQ algorithms. The IRR versions of these algorithms were slightly more efficient than the LAV versions when there were no outliers, but they provided weaker protection against outliers than the LAV variants.

Keywords: Robust regression; Parameter estimation; Curve fitting; Nonlinear regression
1. Introduction
The purpose of this simulation study is to compare different nonlinear regression algorithms. The simulated data sets represent typical, frequently encountered observations. We consider a data set typical when the number of observations is not much higher than the number of estimated parameters, the model to be fitted is nonlinear, and the measurement error is heteroscedastic and may contain outliers. For example, concerning only the last point, Cressie and Keightley [1], analyzing ligand binding data, reported that 2 out of 10 observations were 'bad'. Similarly, Tiede [2], even though using a robust method, could not fit a regression curve to 6% of his data. Nonlinear regression algorithms with nonhomogeneous variance but normally distributed measurement errors were compared in [3-9].
Consensus was not reached concerning the efficiencies and robustness of the procedures. Although there is a huge literature on the application of robust methods to linear models, only a few studies refer to nonlinear models. Cornish-Bowden and Endrenyi [10], supposing heteroscedasticity, applied an iteratively reweighted algorithm to linearly transformed enzyme kinetic data. Lawrence and Arthur [11] compared different robust algorithms on nonlinear growth models and found them disappointing, contrary to Tiede [2] who applied them to analyze radioimmunoassay data. Tiede considered nonlinear models with a homoscedastic error distribution but applied the procedures also to radioimmunoassay data, which are typically heteroscedastic [12]. Normolle [13] recently proposed a new robust algorithm to analyze radioimmunoassay data supposing heteroscedasticity, but its statistical performance was not evaluated. Lopez-Cabrera et al. [14] presented an approach based on the sequential application of an outlier test which required replicate measurements.
2. Methods

2.1. Pharmacostatistical model

The algorithms were compared on the example of the well-known Michaelis-Menten equation which describes simple enzyme kinetic, uptake and receptor binding relationships:

$Y_i = \frac{Q X_i}{K + X_i}$    (1)

The vector of parameters is η' = (K, Q) with default values, without any loss of generality, of (0.75, 1). The default data set consisted of n = 7 simulated data at x' = (0.05, 0.19, 0.51, 1.25, 2.95, 6.87, 15.8). The x values were generated with the recursive relationship x₁ = 0.05, x₂ = 0.05 + d, xᵢ = xᵢ₋₁ + r(xᵢ₋₁ − xᵢ₋₂) for i = 3,4,..,7, where d = 0.14 and r = 2.3. The corresponding relative responses were y/Q = (0.062, 0.202, 0.405, 0.625, 0.797, 0.901, 0.954). To study the effect of the number of observations, we generated additional curves consisting of n = 16 points, using the same recursive relationship with the parameters r = 1.35 and d = 0.05.

The heteroscedastic error was modelled as a linear function of the dependent variable, with the standard deviation σ = g(α, y) in a simple form:

$\sigma_i = \alpha_0 + \alpha_1 y_i$    (2)

If not stated otherwise, α' = (0.025, 0.05). To simulate the non-normal distribution of the measurement errors (ε), the so-called scale-shift mixture model was applied [15]:

$\varepsilon_i \sim (1 - \lambda)\,N(0, \sigma_i^2) + \lambda\,N(0, (\mathrm{DIF} \cdot \sigma_i)^2)$    (3)

where λ refers to the fraction of outliers and DIF to a 'deviation inflation factor' which inflates the standard deviation of the contaminating component. With the special condition of DIF = 1, by definition, normal error was simulated. A heteroscedasticity factor was defined:

$\mathrm{HF} = \frac{100\,\alpha_1 \bar{y}}{\alpha_0 + \alpha_1 \bar{y}}$  (with $\bar{y}$ = 0.5)    (4)
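For concreteness, the design and error model above can be reproduced with a short simulation. The following Python sketch is our own illustration (the study itself used Borland Pascal, see Section 2.4); all function names are ours. It generates one curve under Eqs. (1)-(3) and evaluates HF from Eq. (4).

```python
import numpy as np

rng = np.random.default_rng(0)

def make_design(n=7, x1=0.05, d=0.14, r=2.3):
    # x1 = 0.05, x2 = x1 + d; subsequent increments grow geometrically with ratio r
    x = [x1, x1 + d]
    while len(x) < n:
        x.append(x[-1] + r * (x[-1] - x[-2]))
    return np.array(x)

def michaelis_menten(x, K=0.75, Q=1.0):
    return Q * x / (K + x)                        # Eq. (1)

def simulate_curve(x, K=0.75, Q=1.0, alpha=(0.025, 0.05), lam=0.0, dif=1.0):
    """One simulated curve with heteroscedastic, possibly contaminated errors."""
    y = michaelis_menten(x, K, Q)
    sigma = alpha[0] + alpha[1] * y               # Eq. (2): SD linear in the response
    wide = rng.random(x.size) < lam               # a fraction lam of the points comes
    scale = np.where(wide, dif * sigma, sigma)    # from the contaminant, SD inflated by DIF
    return y + rng.normal(0.0, scale)             # Eq. (3), scale-shift mixture

def heteroscedasticity_factor(alpha, ybar=0.5):
    return 100.0 * alpha[1] * ybar / (alpha[0] + alpha[1] * ybar)  # Eq. (4)

x = make_design()                                 # (0.05, 0.19, 0.51, ..., 15.8)
y_obs = simulate_curve(x, lam=0.15, dif=5.0)      # on average one outlier per curve
print(heteroscedasticity_factor((0.025, 0.05)))   # default settings give HF = 50
```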
HF is a convenient index because it measures heteroscedasticity in terms of the total measurement error. When HF = 0 the measurement error contains only a homoscedastic component, and when HF = 100 the measurement error is purely heteroscedastic. The default values of the α-s determine the default value of HF; if not stated otherwise, HF = 50.

2.2. Algorithms

There are several algorithms to solve the heteroscedastic nonlinear regression problem. The comparison of these algorithms by Carroll and Ruppert [7] serves as a useful starting point. The following notations are applied: y = f(x, η) is the model equation, where η is the vector of the parameters of interest; yᵢ = f(xᵢ, η) + εᵢ is the i-th response at xᵢ, i = 1,2,..,n; ŷᵢ = f(xᵢ, η̂) is the i-th response predicted from η̂, the vector of estimated parameters; σ̂ᵢ² = g²(ŷᵢ, α̂) is the variance model, where the α are nuisance parameters, with g(yᵢ, α) = α₀ + α₁yᵢ as the default form of g; rᵢ = |yᵢ − ŷᵢ|/σ̂ᵢ is the i-th scaled residual; and wᵢ is the weight of the i-th response.
2.2.1. Least squares (LSQ) algorithms

Weighted least squares (WLSQ) is the simplest algorithm. The weights are supplied by the user and WLSQ minimizes Σwᵢ(yᵢ − ŷᵢ)². The particular case when all wᵢ equal 1 corresponds to ordinary least squares (OLSQ). For reference purposes, the WLSQ algorithm was also computed with the true weights, calculated from the true α and η parameters used for generating the data sets (WLSQ_opt). A very widespread weighting practice is to use the reciprocal squares of the measured data, i.e., wᵢ = yᵢ⁻² (WLSQ_y).

Extended least squares (ELSQ) estimates both η and the nuisance α parameters by minimizing the error function Σ((yᵢ − ŷᵢ)²/σ̂ᵢ² + ln σ̂ᵢ²).

Generalized least squares (GLSQ) uses the predicted residuals for weighting after a fitting step. It estimates the parameters of the g variance function by fitting the unscaled absolute residuals on the predicted values (ŷ). The algorithm is: (0) initialize wᵢ = 1; (1) WLSQ step; (2) estimate by OLSQ the α parameters of the g variance function; (3) wᵢ ← 1/g²(ŷᵢ, α̂); (4) iterate steps 1-3 until convergence. GLSQ is a very popular algorithm; its several variants differ mostly in step 2. Unweighted least squares was applied for this step in the present study; weighted regressions were also evaluated for this step but were found to be unsatisfactory. A sketch of this loop is given below.

Iterative least squares (ILSQ) is a simplified version of GLSQ. It does not estimate the parameters of the variance function; the weights in step 3 are updated simply as wᵢ ← ŷᵢ⁻².

We refer to the group of algorithms using a non-fixed set of weights (ELSQ, GLSQ and ILSQ) as flexible algorithms, and to the procedures in which the weights are supplied by the user (OLSQ, WLSQ_y, WLSQ_opt) as rigid algorithms.
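As a concrete illustration of the GLSQ loop just described, here is a minimal Python sketch under the assumptions of this section (ours, not the authors' implementation). The helper `wlsq_fit` minimizes the weighted sum of squares for the model of Eq. (1), and step 2 is an unweighted straight-line fit of the absolute residuals on the predictions.

```python
import numpy as np
from scipy.optimize import minimize

def wlsq_fit(x, y, w, eta0=(1.0, 1.0)):
    """Step 1: minimize sum of w_i * (y_i - f(x_i, eta))^2 for the model of Eq. (1)."""
    def sse(eta):
        K, Q = eta
        return np.sum(w * (y - Q * x / (K + x)) ** 2)
    return minimize(sse, eta0, method="Nelder-Mead").x

def glsq_fit(x, y, n_iter=10):
    w = np.ones_like(y)                            # step 0: start unweighted
    for _ in range(n_iter):                        # fixed iteration count stands in
        K, Q = wlsq_fit(x, y, w)                   # for the convergence test (step 4)
        yhat = Q * x / (K + x)
        a1, a0 = np.polyfit(yhat, np.abs(y - yhat), 1)  # step 2: OLSQ for the alphas
        g = np.clip(a0 + a1 * yhat, 1e-8, None)         # keep the SD model positive
        w = 1.0 / g ** 2                           # step 3: update the weights
    return K, Q
```

ILSQ corresponds to skipping step 2 and setting `w = 1.0 / yhat ** 2` directly.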
2.2.2. Least absolute value (LAV) algorithms

The basic step in all of the LSQ algorithms listed above is the minimization of a squared error function. Minimizing the absolute value of the errors instead of their squares corresponds to changing the norm in which the goodness of fit is measured: the fit is measured as the sum of the weighted absolute deviations between the observed and predicted values. LAV algorithms are claimed to be robust. We denote the absolute-value variant of an LSQ algorithm by changing the letters SQ in its abbreviation to AV; for example, WLSQ_opt becomes WLAV_opt, and so on.

2.2.3. Iteratively reweighted robust (IRR) algorithms

IRR algorithms assign two weights to each observation. The first of these, referred to as the internal weight, accounts for the heteroscedasticity. The second, or external, weight measures the lack of compatibility of a data point with the remaining values, i.e., its being an 'outlier'. The final weight is the product of the internal and external weights. In the case of ELSQ we minimize a modified error function, defined as Σ(wᵢ(yᵢ − ŷᵢ)²/σ̂ᵢ² + ln σ̂ᵢ²). A general scheme for an IRR algorithm is: (0) initialize wᵢ = 1; (1) fit a regression with one of the flexible LSQ procedures; (2) rᵢ = |yᵢ − ŷᵢ|/σ̂ᵢ; (3) define the external weights as w_ei = rw(rᵢ/NP), where rw( ) is one of the robust weight functions listed in Table 1 and NP is a normalization parameter; (4) apply the internal weights, defined as w_ii = σ̂ᵢ⁻² (GLSQ), w_ii = ŷᵢ⁻² (ILSQ) or w_ii = 1 (ELSQ); (5) calculate wᵢ = w_ei × w_ii; (6) iterate steps 1-5 until convergence. A sketch of this scheme is given below.

IRR algorithms differ from each other in the LSQ algorithm applied in step 1, and this serves as the basis for their abbreviations: an IRR algorithm is referred to by the name of its step-1 LSQ algorithm followed by '-R', for example GLSQ-R. Following Cornish-Bowden and Endrenyi [10], we consider an observation to be an outlier if its scaled deviation (rᵢ) from the predicted value is much higher than the scaled deviations of the remaining observations. Consequently, we set NP = 1.48 × MED(r), where r' = (r₁, r₂,.., rₙ). The constant 1.48 makes NP an approximately unbiased estimator of the scale if n is large and the error distribution is normal.
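A sketch of the general IRR scheme in Python follows (again ours rather than the original Pascal code). The `flexible_fit` callback is an assumed interface returning the estimates, predictions and estimated SDs from step 1; Andrews' function from Table 1 supplies the external weights.

```python
import numpy as np

def andrews_weight(u, A=4.33):
    """Andrews' weight (Table 1); written as sin(u/A)/(u/A) so that w(0) = 1.
    Relative weights are unchanged by this constant rescaling of sin(u/A)/u."""
    u = np.abs(u)
    return np.where(u < np.pi * A, np.sinc(u / (np.pi * A)), 0.0)

def irr_fit(x, y, flexible_fit, rw=andrews_weight, n_iter=10):
    w = np.ones_like(y)                               # step 0
    for _ in range(n_iter):                           # step 6: iterate to convergence
        eta, yhat, sigma_hat = flexible_fit(x, y, w)  # step 1: one flexible LSQ fit
        r = np.abs(y - yhat) / sigma_hat              # step 2: scaled residuals
        NP = 1.48 * np.median(r)                      # robust scale estimate
        w_ext = rw(r / NP)                            # step 3: external (outlier) weights
        w_int = 1.0 / sigma_hat ** 2                  # step 4: internal weights (GLSQ form)
        w = w_ext * w_int                             # step 5: final weights
    return eta
```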
Table 1
Definitions of robust weight functions

Huber(r)      = 1               if r < H;    = H/r                      otherwise
Biweight(r)   = (1 − (r/B)²)²   if r < B;    = 0                        otherwise
Andrews(r)    = sin(r/A)/r      if r < Aπ;   = 0                        otherwise
Anscombe(r)   = 1               if r < A₁;   = A₁/r  if A₁ ≤ r ≤ A₂;    = 0 if r > A₂
Hampel(r)     = 1               if r < M₁;   = M₁/r  if M₁ ≤ r < M₂;    = M₁(M₃ − r)/(r(M₃ − M₂)) if M₂ ≤ r < M₃;  = 0 otherwise
Student T(r)  = 1/(1 + (r/C)²)
Ramsey(r)     = exp(−R·r)

H = 1.5, B = 4.68, A = 4.33, A₁ = 1.5, A₂ = 3, M₁ = 1.7, M₂ = 3.4, M₃ = 8.5, C = 6, R = 0.3. The formulas of the weight functions and the constants are from [11] and [16].
To achieve robustness in the case of GLSQ, we use OLAV in step 1 to estimate the α-s. Unless stated otherwise, the default robust weight function is that of Andrews (see Table 1). A summary of the applied algorithms is given in Table 2.
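Two further weight functions of Table 1, transcribed into the same Python setting (a sketch; constants as listed in the table):

```python
import numpy as np

def huber_weight(r, H=1.5):
    """Huber weight (Table 1): full weight for small residuals, H/r beyond the corner H."""
    r = np.abs(r)
    return np.where(r < H, 1.0, H / np.maximum(r, 1e-12))

def biweight(r, B=4.68):
    """Tukey's biweight (Table 1): smooth descent to zero weight at r = B."""
    r = np.abs(r)
    return np.where(r < B, (1.0 - (r / B) ** 2) ** 2, 0.0)
```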
2.3. Performance metrics

The square root of the average squared distance between the estimated and true parameter values, the root mean square error (RMSE), is a performance metric which is frequently applied for the comparison of algorithms. This measure turned out to be highly nonrobust, so we devised other, more robust metrics for the evaluation of performance. First, we defined an estimated parameter as acceptable if it fell within a previously set acceptance range; in the case of K in Eq. (1) this range was set to 1/3-3. The nonplausibility ratio (NPR) is the percentage of runs yielding unacceptable estimates, and the trimmed root mean squared error (TRMSE) is the RMSE of the accepted parameters. An alternative way to obtain a resistant performance metric is to calculate the median of the absolute deviations of the estimates from their true value (MAD). The reported MAD values are multiplied by 1.48 in order to obtain equivalence with the RMSE in the noncontaminated case. When similar conclusions could be reached about both model parameters, only the results for K in Eq. (1) are reported.
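The three metrics translate directly into code; a sketch (names ours), assuming `est` holds the estimates of K from the simulated runs:

```python
import numpy as np

def performance_metrics(est, true_k, lo=1/3, hi=3.0):
    """NPR, TRMSE and scaled MAD for one batch of simulated estimates."""
    est = np.asarray(est, dtype=float)
    ok = (est >= lo) & (est <= hi)                      # acceptance range for K
    npr = 100.0 * np.mean(~ok)                          # % of unacceptable runs
    trmse = np.sqrt(np.mean((est[ok] - true_k) ** 2))   # RMSE of accepted estimates only
    mad = 1.48 * np.median(np.abs(est - true_k))        # 1.48 matches RMSE under normality
    return npr, trmse, mad
```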
2.4. Implementation

To evaluate the various estimating procedures, an object-based program was written in Borland PASCAL. All computations were carried out on an IBM-compatible 486-DX2 personal computer. For unconstrained minimization, the simplex method of Nelder and Mead was used [17]. Five hundred experiments were simulated in order to obtain the investigated performance metrics. The same simulated data sets were used for all comparisons; consequently, differences in the results should be attributed to differences between the algorithms. The starting values for the optimization were set at η' = (1,1) and α' = (0.1, 0.1).
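The original program is not reproduced here, but the core computation is small: the ELSQ objective, for example, can be minimized directly with a Nelder-Mead simplex. A sketch via SciPy, using the starting values quoted above (the data generation is purely illustrative):

```python
import numpy as np
from scipy.optimize import minimize

def elsq_objective(theta, x, y):
    """ELSQ error function: sum((y - yhat)^2 / s^2 + ln(s^2)) with s = a0 + a1*yhat."""
    K, Q, a0, a1 = theta
    yhat = Q * x / (K + x)
    s = np.clip(a0 + a1 * yhat, 1e-8, None)   # variance model of Eq. (2), kept positive
    return np.sum((y - yhat) ** 2 / s ** 2 + np.log(s ** 2))

rng = np.random.default_rng(1)
x = np.array([0.05, 0.19, 0.51, 1.25, 2.95, 6.87, 15.8])
y_true = 1.0 * x / (0.75 + x)
y_obs = y_true + rng.normal(0.0, 0.025 + 0.05 * y_true)

theta0 = np.array([1.0, 1.0, 0.1, 0.1])       # eta' = (1, 1), alpha' = (0.1, 0.1)
res = minimize(elsq_objective, theta0, args=(x, y_obs), method="Nelder-Mead")
K_hat, Q_hat = res.x[:2]
```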
3. Results
3.1. Comparisons of LSQ algorithms in the absence of outliers

Fig. 1 compares the performances of the LSQ algorithms OLSQ, WLSQ_opt, WLSQ_y, GLSQ, ELSQ and ILSQ when there are no outliers. The heteroscedasticity factor was changed from 0 (homoscedastic case) to 100 (full heteroscedasticity). The RMSE-s of the algorithms were expressed relative to those of WLSQ_opt.
Table 2
Classification and the corresponding abbreviations of the implemented algorithms

Weighting type   Full name                             Quadratic (LSQ)   Robust (IRR)   Absolute (LAV)
Rigid            Ordinary least squares                OLSQ              -              OLAV
Rigid            Optimal weighted least squares        WLSQ_opt          -              WLAV_opt
Rigid            Weighted least squares by response    WLSQ_y            -              WLAV_y
Flexible         Iterative least squares               ILSQ              ILSQ-R         ILAV
Flexible         Generalized least squares             GLSQ              GLSQ-R         GLAV
Flexible         Extended least squares                ELSQ              ELSQ-R         ELAV
By definition, the RMSE for WLSQ_opt was thus 100. As Fig. 1A and C show, OLSQ is an optimal algorithm when HF = 0, but its performance becomes worse as the data become more heteroscedastic. The other rigid algorithm, WLSQ_y, shows just the opposite behaviour: it becomes more efficient with increasing heteroscedasticity. Thus, the performances of the rigid algorithms are highly dependent on the extent of heteroscedasticity, which is usually unknown. In contrast, as Fig. 1B and D show, the performance of ELSQ is never much worse than that of WLSQ_opt. Furthermore, ELSQ becomes much better with increasing heteroscedasticity. GLSQ shows a similar pattern but is less efficient than ELSQ. ILSQ is a strong competitor to ELSQ when the data are nearly homoscedastic, especially when there are few observations (n = 7, Fig. 1B).
Fig. 1. Efficiency of least-squares algorithms. Eq. (1) was fitted to data sets consisting of either 7 (A and B) or 16 (C and D) points. HF, heteroscedasticity factor, for its definition see Eq. (4); RMSE%, root mean square error of the algorithms as a percentage of the RMSE for WLSQ_opt. (A and C) rigid algorithms; (B and D) flexible algorithms. Symbols: open square, OLSQ; closed hexagon, WLSQ_y; closed square, ELSQ; open triangle, ILSQ; closed triangle, GLSQ.
3.2. Comparisons of LSQ, LAV and IRR estimators in the presence of outliers

To study the effect of outliers on the parameter estimates, DIF was first varied at a fixed λ; definitions of these constants are given in Eq. (3). Essentially, the relative standard deviation of the contaminating distribution was changed from 1 (no outliers) to 10. Considering that each data set had 7 observations and the outlier percentage was set to 15, every curve contained on average one 'outlier'. The performances of the LSQ estimators under the influence of contamination are compared in Fig. 2A-C. ELSQ is generally the most resistant to outliers; this is expressed particularly by the NPR metric. The MAD and TRMSE metrics indicate similar resistances to outliers for the LAV and IRR algorithms (Fig. 2D,E and G,H). However, there is an important difference: at high DIF, the LAV algorithms decrease the number of unacceptable experiments (i.e. the NPR) much more effectively than the IRR procedures (Fig. 2F and I). The conclusions about the outlier sensitivities of the algorithms are the same when the percentage of outliers is 30% instead of 15% (Fig. 3). As a rough average, 2 of 7 points can then be considered as outliers.
Fig. 2. Effectiveness of computational procedures for the robust, heteroscedastic estimation of the parameter K in Eq. (1). (A-C) LSQ algorithms; (D-F) the corresponding LAV algorithms; (G-I) the corresponding IRR algorithms. λ = 0.15. For the definitions of λ (fraction of outliers) and DIF see Eq. (3). The symbols for the LSQ algorithms are: open circle, WLSQ_opt; open square, OLSQ; closed hexagon, WLSQ_y; closed square, ELSQ; open triangle, ILSQ; closed triangle, GLSQ. The symbols for the LAV and IRR algorithms are identical to those of the equivalent LSQ methods.
ELSQ is again the most robust LSQ procedure (Fig. 3C). The difference between the LAV and IRR algorithms, demonstrated by the NPR, is even more pronounced (Fig. 3F and I). Among the flexible LSQ algorithms, the order of stability against outliers based on the NPR is ELSQ > GLSQ > ILSQ. Note that WLSQ_y, probably the most widely used algorithm, as well as WLSQ_opt, are sensitive to outliers (Fig. 2 and especially Fig. 3). The corresponding statistics in Fig. 3 are always larger than in Fig. 2. There is a kind of equivalence, on average, between the extent and the magnitude of contamination (λ and DIF, respectively): a high DIF (= 10) in Fig. 2 has about the same effect as a moderate DIF (= 5) in Fig. 3.

Fig. 3. Robust, heteroscedastic estimation of K when λ = 0.3. The structure, abbreviations and notations of the diagram are identical to those presented for Fig. 2.
3.3. Tuning the IRR algorithms

In contrast to the LAV procedures, the IRR algorithms are tunable: by decreasing the tuning constant (Table 1), stronger protection can be achieved.
However, a price is paid for it: the efficiency of estimation is reduced when there are no outliers. Therefore, the effect of the tuning constant on this trade-off in efficiencies was explored for the IRR and LAV algorithms. Fig. 4 presents, at various levels of the tuning constant, the performance metrics MAD, TRMSE and NPR for the IRR algorithms relative to the values for the corresponding LSQ estimators. The results of the corresponding LAV algorithms, ILAV, GLAV and ELAV, are included for reference. Two cases were investigated, the noncontaminated (DIF = 1) and the heavily contaminated (DIF = 10).
As expected, decreasing the tuning constant reduces the efficiency measured by TRMSE in the outlier-free case (except for GLSQ-R) but increases the protection in the contaminated case. However, there is an optimum for this protective effect, at least for the ILSQ-R and GLSQ-R algorithms: reducing the tuning constant beyond an optimum point does not lead to increased protection. The IRR algorithms do not generally provide the protection offered by their LAV counterparts. In fact, taking into account the evidence of all three metrics, the IRR estimators do not reach, at any value of the tuning constant, the protection level of the respective LAV algorithms.
Fig. 4. Effect of tuning the IRR algorithms on robust parameter estimation. The tuning constant was multiplied by k; for the definitions of the tuning constants see Table 1. The columns of the figure correspond to iterative, generalized and extended regression. Only the results for Andrews' weighting function are shown; similar results were obtained with the biweight function of Tukey and with Huber's function. Results for DIF = 1 and DIF = 10 are illustrated by closed and open symbols, respectively. For reference, the corresponding statistics of the LAV variants are shown as a continuous line (DIF = 1) and a dashed line (DIF = 10). The results are expressed either as a percentage (MAD% and TRMSE%) or as a difference (NPR difference) from the appropriate LSQ statistics.
In addition to the default Andrews' function, the protective effects of the Huber and biweight functions were also computed with different tuning constants. The results were very similar to those obtained with Andrews' function: all had optimum points in their protective effect around the constants listed in Table 1, and none could decrease the NPR value further than the corresponding LAV estimator (data not shown). Another conclusion to be drawn from Fig. 4 is that the protective action of robust reweighting is strongest in the case of ILSQ and weakest with ELSQ. Since ILSQ is the most sensitive algorithm, it was selected to demonstrate the differences between the robust weighting functions. As Table 3 shows, the robust weighting functions of Andrews, Tukey and Huber are the most effective procedures.
Table 3
Comparison of robust weighting functions using ILSQ as a test algorithm

                      DIF = 1     DIF = 10
Robust weighting
function              RMSE%       MAD%    TRMSE%   NPR diff.
Huber                 101.9       53.9    58.6     -8.2
Biweight (Tukey)      106.0       54.3    63.6     -9.2
Anscombe              103.8       55.3    59.7     -8.4
Student               100.8       65.5    77.4     -5.6
Andrews               103.8       53.5    62.6     -9.2
Ramsey                119.5       73.9    78.3     -6.2
Hampel                100.8       69.2    75.9     -3.8

For the definitions of the weighting functions see Table 1. Data are expressed as a percentage of the corresponding ILSQ algorithm (RMSE%, MAD% or TRMSE%) or as the difference between the non-reweighted and reweighted forms (NPR diff.).
4. Discussion
The application of different nonlinear regression algorithms in the heteroscedastic case is controversial, particularly when the number of observations is small. Sheiner and Beal [3,8] initially claimed that the ELSQ procedure is preferable to a rigid algorithm such as OLSQ or WLSQ_y; later they preferred different GLSQ variants [6]. Note that Sheiner and Beal's GLSQ algorithm is very different from our implementation, which is based on the work of Carroll and Ruppert [7]. Metzler [5] repeated the simulations of Sheiner and Beal [3] with a smaller number of observations per curve and found that ILSQ was the most useful. Van Houwelingen [9] preferred GLSQ over ELSQ: he proved that ELSQ could, under some conditions, lead to inconsistent estimates, and claimed that GLSQ was more robust than ELSQ. Fig. 1 of the present work illustrates that the flexible algorithms usually outperform the rigid approaches even when the supplied weights are optimally correct. As heteroscedasticity increases, GLSQ and particularly ELSQ become more and more efficient, not only in comparison with OLSQ but even with WLSQ_opt. Theoretical calculations support these apparently surprising results [9]: the asymptotic variance of GLSQ is equal to that of WLSQ_opt, whereas the asymptotic variance of ELSQ is smaller than that of WLSQ_opt. The flexible algorithms effectively exploit the information hidden in the variance function.
An improper specification of the weights in the rigid algorithms can yield a loss of 20% in the precision of the estimates (Fig. 1A and C). Conversely, in our simulations, the performances of the flexible algorithms were never much worse than that of WLSQ_opt. Furthermore, the rigid algorithms, particularly WLSQ_y, are more sensitive to outliers (Figs. 2 and 3) than the flexible algorithms. Among the flexible algorithms, ILSQ compares favourably with the others when the number of observations is low (n = 7); it has the additional advantage of being fast and easily implementable. However, our method of choice is ELSQ. It is not only the most efficient algorithm in most cases (Fig. 1), but also the most resistant to outliers, particularly as measured by the NPR metric (Figs. 2C and 3C).

Robust iterative reweighting improves the resistance of the flexible algorithms to outliers. In the presence of outliers, the relative benefit of iterative reweighting is highest in the case of the least resistant ILSQ and smallest with the most resistant ELSQ (Fig. 4). The features of GLSQ are particularly noteworthy: GLSQ-R is favoured over GLSQ even when the measurement error is not contaminated by outliers (Fig. 4). This could indicate a numerical instability in the GLSQ algorithm. The numerical instability of GLSQ is even more apparent when the variance parameters are estimated by WLSQ_y instead of OLSQ.
In comparisons of the IRR and LAV algorithms, the IRR procedures have a higher efficiency in the outlier-free case. This should be balanced against the stronger protection offered by the LAV variants (Figs. 2 and 3). A useful approach could be to use an IRR algorithm in general and, if the computation fails, to try the LAV variant. Failure of an LAV algorithm is a strong indication that the problem is caused not by one or more outliers but by other factors, for example by a poor experimental design.

In conclusion, the results of this study demonstrate the following: (1) Flexible LSQ methods of parameter estimation are more efficient than rigid algorithms in the presence of possible heteroscedasticity. (2) In our simulations, ELSQ showed better numerical stability than the other LSQ algorithms. (3) Iterative reweighting by a robust weighting function is a very safe procedure: very little efficiency is lost when there are no outliers, and moderately good protection is gained against outliers. Robust reweighting is particularly advised with ILSQ and GLSQ because they are more easily influenced by outliers than ELSQ. (4) The LAV variants of the iterative algorithms are more robust than the corresponding IRR algorithms; however, they have somewhat lower estimating efficiencies.

In summary, we have demonstrated that the estimation of nonlinear regression parameters can be considerably improved by using more sophisticated algorithms. The improvement is substantial when the measurement error is heteroscedastic and may contain outliers.
References

[1] Cressie NAC and Keightley DD: Analyzing data from hormone-receptor assays, Biometrics, 37 (1981) 235-249.
[2] Tiede JJ: The application of robust calibration to radioimmunoassay, Biometrics, 35 (1979) 567-574.
[3] Sheiner LB and Beal SL: Pharmacokinetic parameter estimates from several least squares procedures: superiority of extended least squares, J Pharmacokin Biopharm, 13 (1985) 185-201.
[4] Thomson AH, Kelman AW and Whiting B: Evaluation of nonlinear regression with extended least squares: simulation study, J Pharm Sci, 72 (1985) 1327-1330.
[5] Metzler CM: Extended least squares (ELS) for pharmacokinetic models, J Pharm Sci, 76 (1987) 565-571.
[6] Beal SL and Sheiner LB: Heteroscedastic nonlinear regression, Technometrics, 30 (1988) 327-337.
[7] Carroll RJ and Ruppert D: Transformation and Weighting in Regression, Chapman and Hall, New York, 1988.
[8] Sheiner LB and Beal SL: Commentary on "Extended least squares (ELS) for pharmacokinetic models", J Pharm Sci, 77 (1988) 731-732.
[9] van Houwelingen JC: Use and abuse of variance models in regression, Biometrics, 44 (1988) 1073-1081.
[10] Cornish-Bowden A and Endrenyi L: Robust regression of enzyme kinetic data, Biochem J, 234 (1986) 21-29.
[11] Lawrence KD and Arthur JL: Robust nonlinear regression. In: Robust Regression (Eds: KD Lawrence and JL Arthur), Marcel Dekker, New York, 1991, pp. 59-86.
[12] Rodbard D, Lenox RH, Wray HE and Ramseth D: Statistical characterization of the random errors in the radioimmunoassay dose-response variable, Clin Chem, 22 (1976) 350-358.
[13] Normolle DP: An algorithm for robust non-linear analysis of radioimmunoassays and other bioassays, Stat Med, 12 (1993) 2025-2042.
[14] Lopez-Cabrera A, Cabre F, Franco R and Canela EI: Identification and rejection of outliers in enzyme kinetics, Int J Biomed Comput, 23 (1988) 9-20.
[15] Iglewicz B and Hoaglin DC: How to Detect and Handle Outliers, American Society for Quality Control, Milwaukee, 1993.
[16] Holland PW and Welsch RE: Robust regression using iteratively reweighted least squares, Commun Stat-Theor Methods, A6 (1977) 813-827.
[17] Press WH, Teukolsky SA, Vetterling WT and Flannery BP: Numerical Recipes, Cambridge University Press, New York, 1992.