Statistics & Probability Letters 55 (2001) 397–401

Integrated squared error estimation of Cauchy parameters

Panagiotis Besbeas*, Byron J.T. Morgan

Institute of Mathematics and Statistics, University of Kent at Canterbury, Canterbury, Kent CT2 7NF, UK

Received 31 January 2001; received in revised form 31 May 2001
Abstract

We show that integrated squared error estimation of the parameters of a Cauchy distribution, based on the empirical characteristic function, is simple, robust and efficient. The k-L estimator of Koutrouvelis (Biometrika 69 (1982) 205) is more difficult to use, less robust and at best only marginally more efficient. © 2001 Elsevier Science B.V. All rights reserved.

Keywords: Cauchy distribution; Efficiency; Influence function; Integrated squared error; k-L method; Maximum likelihood; Robustness
1. Introduction

Many authors have studied alternative ways of estimating the parameters $\theta = (c, \mu)^T$ of the Cauchy probability density function
$$f(x; \theta) = \frac{c}{\pi\{c^2 + (x - \mu)^2\}}, \quad x \in \mathbb{R}.$$
Because of the simple form of the characteristic function
$$\phi(t; \theta) = \exp(i\mu t - c|t|),$$
Feuerverger and McDunnough (1981a) and Koutrouvelis (1982) have used the empirical characteristic function
$$\phi_n(t) = n^{-1} \sum_{j=1}^{n} e^{itX_j},$$
formed from the random sample $X_1, X_2, \ldots, X_n$ from $f(x; \theta)$. In this paper we show how the integrated squared error (ISE) approach of Heathcote (1977) provides a far simpler estimation procedure based on $\phi_n(t)$, which is robust and efficient.
* Corresponding author.
E-mail addresses: [email protected] (P. Besbeas), [email protected] (B.J.T. Morgan).
2. Estimation using the integrated squared error method

ISE estimates $\hat\theta$ are the result of minimising with respect to $\theta$ the criterion
$$I(\theta) = \int_{-\infty}^{\infty} |\phi_n(t) - \phi(t;\theta)|^2 \, dW(t),$$
for some weight function $W(t)$. The weight function $W(t;\lambda) = \int_{-\infty}^{t} e^{-\lambda|y|}\,dy$, for $\lambda > 0$, results in a particularly simple form for $I(\theta;\lambda)$, and therefore we adopt this weight function in the paper. We obtain, up to an additive constant not involving $\theta$,
$$I(\theta;\lambda) = \frac{2}{\lambda+2c} - \frac{1}{n}\sum_{j=1}^{n} \frac{4(\lambda+c)}{(\lambda+c)^2 + (X_j-\mu)^2}.$$
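This closed form makes the criterion cheap to evaluate and to minimise directly. The following Python sketch is ours, not the authors': the function names are invented, and the two-stage update of $\lambda$ from the fitted scale is a simple device of our own, anticipating the choice $\lambda = 1.03c$ motivated below. It minimises $I(\theta;\lambda)$ with a Nelder–Mead simplex search of the kind reported in the text.

```python
import numpy as np
from scipy.optimize import minimize

def ise_criterion(theta, x, lam):
    """Closed-form ISE criterion I(theta; lambda) for the Cauchy model,
    up to an additive constant not involving theta = (c, mu)."""
    c, mu = theta
    if c <= 0:
        return np.inf  # the scale parameter must be positive
    a = lam + c
    return 2.0 / (lam + 2.0 * c) - np.mean(4.0 * a / (a**2 + (x - mu)**2))

def ise_fit(x, lam_factor=1.03):
    """Minimise I(theta; lambda) by a simplex search, re-setting
    lambda = 1.03 * c from the first-stage estimate of the scale."""
    mu0 = np.median(x)                 # robust starting values:
    c0 = 0.5 * (np.percentile(x, 75) - np.percentile(x, 25))
    theta = np.array([c0, mu0])        # half the IQR estimates c
    for _ in range(2):                 # second pass updates lambda
        lam = lam_factor * theta[0]
        theta = minimize(ise_criterion, theta, args=(x, lam),
                         method='Nelder-Mead').x
    return theta

rng = np.random.default_rng(1)
x = 2.0 * rng.standard_cauchy(500) + 3.0   # sample with c = 2, mu = 3
print(ise_fit(x))                          # estimate of (c, mu)
```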
Setting $\partial I/\partial c = \partial I/\partial\mu = 0$ results in the estimating equations:
$$\sum_{j=1}^{n} \frac{(\lambda+c)^2 - (X_j-\mu)^2}{\{(\lambda+c)^2 + (X_j-\mu)^2\}^2} = \frac{n}{(\lambda+2c)^2}, \qquad \sum_{j=1}^{n} \frac{X_j-\mu}{\{(\lambda+c)^2 + (X_j-\mu)^2\}^2} = 0, \tag{1}$$
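For a fixed $\lambda$, the minimiser of $I(\theta;\lambda)$ satisfies (1) exactly; a quick numerical check (ours, reusing `ise_criterion` and the sample `x` from the sketch above):

```python
lam = 1.0                            # any fixed lambda > 0 will do here
res = minimize(ise_criterion, [1.0, 0.0], args=(x, lam),
               method='Nelder-Mead', options={'xatol': 1e-12, 'fatol': 1e-12})
c_hat, mu_hat = res.x
a = lam + c_hat
d = a**2 + (x - mu_hat)**2
# first equation in (1): the two sides should agree closely
print(np.sum((a**2 - (x - mu_hat)**2) / d**2), len(x) / (lam + 2*c_hat)**2)
# second equation in (1): the sum should be near zero
print(np.sum((x - mu_hat) / d**2))
```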
These equations can be easily solved iteratively to provide the ISE estimates of $\mu$ and $c$ as functions of $\lambda$. Alternatively, it is simple to minimise $I(\theta;\lambda)$ directly with respect to $\theta$ using a numerical optimisation routine. We found that a standard simplex search worked easily, and apparently produced a unique solution. For example, contour plots revealed a single optimum (Besbeas, 1999). By Heathcote (1977), $n^{1/2}(\hat\theta - \theta)$ is asymptotically normally distributed with mean vector zero and covariance matrix
$$\Sigma(\lambda) = \frac{c(\lambda+2c)^2(5\lambda^2 + 14\lambda c + 10c^2)}{16(\lambda+c)^3}\, I_2,$$
where $I_2$ is the $2 \times 2$ identity matrix. The estimators $\hat\mu$ and $\hat c$ are thus asymptotically independent, have the same asymptotic variance and the same asymptotic relative efficiency
$$\frac{32c(\lambda+c)^3}{(\lambda+2c)^2(5\lambda^2 + 14\lambda c + 10c^2)}$$
compared with maximum likelihood. If we define $g(x) = \{(\lambda+c)^2 + (x-\mu)^2\}^{-2}$, for $x \in \mathbb{R}$, we find from Campbell (1993) that the influence function of $\hat\theta$ is given by
$$\mathrm{IF}(x; \hat\theta) = \begin{pmatrix} \tfrac{1}{2}(\lambda+2c) - \tfrac{1}{2}(\lambda+2c)^3\{(\lambda+c)^2 - (x-\mu)^2\}\, g(x) \\[2pt] (\lambda+c)(\lambda+2c)^3 (x-\mu)\, g(x) \end{pmatrix}. \tag{2}$$
It is clear from (2) that the individual influence functions of $\hat c$ and $\hat\mu$ are bounded in $x$ and descend to the asymptotes $(\lambda+2c)/2$ and 0, respectively, as $|x| \to \infty$. The ISE estimator $\hat\theta$ is thus robust against outliers. We set $\lambda = 1.03c$ in (1) and (2), as this is the value of $\lambda$ which minimises $|\Sigma(\lambda)|$, irrespective of $\theta$, resulting in an asymptotic joint relative efficiency of 96.22%.

3. The k-L method

Feuerverger and McDunnough (1981a, b) estimate $\theta$ by minimising
$$S(\theta; t) = \{z_n(t) - z(t;\theta)\}^T\, \Omega^{-1}(\theta)\, \{z_n(t) - z(t;\theta)\} \tag{3}$$
for a suitable value of $t = (t_1, \ldots, t_k)$, for some integer $k$.
Table 1
Maximum joint asymptotic relative efficiency of the k-L estimator $\tilde\theta$ for selected values of $k$

k           1        2        3        4        5        10       15
Efficiency  0.4194   0.6728   0.7940   0.8592   0.8979   0.9666   0.9837
In this expression, $z(t;\theta) = (U(t_1;\theta), \ldots, U(t_k;\theta), V(t_1;\theta), \ldots, V(t_k;\theta))^T$, where $U(t;\theta)$ and $V(t;\theta)$ are respectively the real and imaginary parts of $\phi(t;\theta)$; $z_n(t)$ is the empirical counterpart of $z(t;\theta)$, given by $z_n(t) = (U_n(t_1), \ldots, U_n(t_k), V_n(t_1), \ldots, V_n(t_k))^T$; and $n^{-1}\Omega(\theta)$ is the variance–covariance matrix of $z_n(t)$. The $(i,j)$ element of $\Omega(\theta)$ is given below (specific reference to $\theta$ is omitted for convenience of printing):
$$\omega_{ij} = \begin{cases} \tfrac{1}{2}\{U(t_i + t_j) + U(t_i - t_j)\} - U(t_i)U(t_j), & 1 \leq i, j \leq k, \\ \tfrac{1}{2}\{V(t_i + t_{j-k}) - V(t_i - t_{j-k})\} - U(t_i)V(t_{j-k}), & 1 \leq i \leq k,\ k+1 \leq j \leq 2k, \\ \tfrac{1}{2}\{U(t_{i-k} - t_{j-k}) - U(t_{i-k} + t_{j-k})\} - V(t_{i-k})V(t_{j-k}), & k+1 \leq i, j \leq 2k. \end{cases}$$
Note that there are errors in the corresponding expressions of Feuerverger and McDunnough (1981a, b). Feuerverger and McDunnough (1981a) called the resulting estimator, $\tilde\theta$, the k-L estimator, and established its consistency and asymptotic normality under general conditions. They also established its arbitrarily high asymptotic efficiency as $k \to \infty$, and noted some general robustness properties. For general $k$, the k-L estimator proceeds by minimising (3). The estimator for $\theta$ is not unique. Furthermore, the k-L estimator becomes increasingly complicated as $k$ increases, mainly as a consequence of the matrix inversion required for its computation. Koutrouvelis (1982) showed that $n^{1/2}(\tilde\theta - \theta)$ is asymptotically normal with mean vector zero and covariance matrix
$$\Sigma_1(\theta) = \frac{1}{2}\left\{\sum_{j=1}^{k} \frac{(t_j - t_{j-1})^2}{\exp(2ct_j) - \exp(2ct_{j-1})}\right\}^{-1} I_2,$$
where $I_2$ is the $2 \times 2$ identity matrix and $t_0 \equiv 0$. The k-L estimators for $c$ and $\mu$ are thus asymptotically independent. They have the same asymptotic variance and, compared with maximum likelihood, share the same asymptotic relative efficiency
$$\sum_{j=1}^{k} \frac{(2ct_j - 2ct_{j-1})^2}{\exp(2ct_j) - \exp(2ct_{j-1})}. \tag{4}$$
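The efficiencies in Table 1 can be reproduced numerically by maximising (4) over the spacings. Since (4) depends on the $t_j$ only through $ct_j$, one may take $c = 1$ without loss of generality, and the joint efficiency is the square of the common marginal efficiency (4), the two parameters having equal and independent marginal efficiencies. The following sketch is ours, with invented names; it reproduces 0.4194 for $k = 1$, while for larger $k$ a derivative-free search may need generous iteration limits (Ogawa's (1960) optimal spacings give the exact answer).

```python
import numpy as np
from scipy.optimize import minimize

def kl_marginal_efficiency(u):
    """Asymptotic relative efficiency (4), with u_j = 2*c*t_j and u_0 = 0."""
    u = np.concatenate(([0.0], u))
    du = np.diff(u)
    return np.sum(du**2 / (np.exp(u[1:]) - np.exp(u[:-1])))

def max_joint_efficiency(k):
    """Maximise (4) over 0 < u_1 < ... < u_k and return the joint value."""
    # parametrise by log-increments so the ordering constraint holds freely
    neg = lambda logd: -kl_marginal_efficiency(np.cumsum(np.exp(logd)))
    res = minimize(neg, np.zeros(k), method='Nelder-Mead',
                   options={'maxiter': 50000, 'maxfev': 50000,
                            'xatol': 1e-10, 'fatol': 1e-12})
    return (-res.fun) ** 2

for k in (1, 2, 3, 4, 5, 10, 15):
    print(k, round(max_joint_efficiency(k), 4))   # compare with Table 1
```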
Practical implementation of the k-L estimator requires choice of $k$ and the points $t_j$ ($j = 1, 2, \ldots, k$). In the selection of $k$ there is a trade-off between computational complexity and asymptotic efficiency. For fixed $k$, Koutrouvelis (1982) selected the $\{t_j\}$ by maximising (4); the determination of their optimum values reduces to finding the asymptotically optimum quantiles for the estimation of an exponential distribution parameter by linear functions of order statistics (Ogawa, 1960). Table 1 displays the maximum joint asymptotic relative efficiency thus obtained for selected values of $k$. Under the same regularity conditions as Feuerverger and McDunnough (1981b), the consistent k-L estimator of $\theta$ has influence function
$$\mathrm{IF}(x; \tilde\theta) = \left\{\frac{\partial z(t;\theta)^T}{\partial\theta}\, \Omega^{-1}(\theta)\, \frac{\partial z(t;\theta)}{\partial\theta^T}\right\}^{-1} \frac{\partial z(t;\theta)^T}{\partial\theta}\, \Omega^{-1}(\theta)\, \mathrm{IF}(x; z_n(t)),$$
where $\mathrm{IF}(x; z_n(t))$ is the influence function of the empirical vector $z_n(t)$ (Besbeas, 1999).
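Although a symbolic expression for general $k$ is unwieldy (as noted below), the k-L influence function is easy to evaluate numerically for the Cauchy model: $U(t;\theta) = e^{-c|t|}\cos\mu t$ and $V(t;\theta) = e^{-c|t|}\sin\mu t$ have simple derivatives, and $\mathrm{IF}(x; z_n(t))$ has the elementary form given after Fig. 1. The sketch below is ours; the equally spaced $t_j$ in the usage line are arbitrary illustrative values, not the efficiency-maximising spacings.

```python
import numpy as np

def UV(t, c, mu):
    """Real and imaginary parts of the Cauchy characteristic function."""
    e = np.exp(-c * np.abs(t))
    return e * np.cos(mu * t), e * np.sin(mu * t)

def kl_influence(x, ts, c=1.0, mu=0.0):
    """IF(x) = (D'WD)^{-1} D'W IF(x; z_n(t)), with W = Omega^{-1}."""
    k = len(ts)
    U = lambda t: UV(t, c, mu)[0]
    V = lambda t: UV(t, c, mu)[1]
    # build Omega element by element from the formulas above
    O = np.empty((2 * k, 2 * k))
    for i in range(k):
        for j in range(k):
            ti, tj = ts[i], ts[j]
            O[i, j] = 0.5 * (U(ti + tj) + U(ti - tj)) - U(ti) * U(tj)
            O[i, k + j] = 0.5 * (V(ti + tj) - V(ti - tj)) - U(ti) * V(tj)
            O[k + i, j] = 0.5 * (V(tj + ti) - V(tj - ti)) - U(tj) * V(ti)
            O[k + i, k + j] = 0.5 * (U(ti - tj) - U(ti + tj)) - V(ti) * V(tj)
    # D = dz/dtheta': dU/dc = -|t|U, dU/dmu = -tV, dV/dc = -|t|V, dV/dmu = tU
    D = np.empty((2 * k, 2))
    for i, t in enumerate(ts):
        u, v = UV(t, c, mu)
        D[i] = [-abs(t) * u, -t * v]
        D[k + i] = [-abs(t) * v, t * u]
    W = np.linalg.inv(O)
    A = np.linalg.solve(D.T @ W @ D, D.T @ W)
    if_z = np.concatenate([np.cos(ts * x) - U(ts), np.sin(ts * x) - V(ts)])
    return A @ if_z          # influence values for (c, mu)

print(kl_influence(5.0, np.linspace(0.1, 3.0, 10)))
```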
Fig. 1. Influence functions for (i) $c$, (ii) $\mu$, for the integrated squared error estimator (solid line), the maximum likelihood estimator (dotted line) and the k-L estimator with $k = 10$ (dashed line). The influence functions are evaluated at the Cauchy distribution with $\mu = 0$ and $c = 1$, and the parameters $\lambda$ and $\{t_j,\ j = 1, 2, \ldots, 10\}$ are selected to maximise efficiency.
Note that $\mathrm{IF}(x; z_n(t))$ is an elementary calculation, since
$$\mathrm{IF}(x; U_n(t)) = \cos(tx) - U(t;\theta), \qquad \mathrm{IF}(x; V_n(t)) = \sin(tx) - V(t;\theta).$$
A symbolic expression for the influence function of the k-L estimator for general $k$ is difficult to produce. The $i$th ($i = 1, 2$) individual influence function of the k-L estimator has the general form
$$\mathrm{IF}(x; \tilde\theta_i) = \sum_{j=1}^{k} \left[ a_{ij}(\theta)\{\cos(t_j x) - U(t_j;\theta)\} + b_{ij}(\theta)\{\sin(t_j x) - V(t_j;\theta)\} \right],$$
where the elements $a_{ij}(\theta)$ and $b_{ij}(\theta)$ do not depend on $x$ (Besbeas, 1999). It is clear from this expression that the individual influence functions for $\tilde\theta$ are bounded in $x$. However, they do not decay as $|x| \to \infty$, and they oscillate infinitely often.

4. Comparison of performance

For $k = 10$ we provide a comparison of the influence functions for the ISE, k-L and maximum-likelihood estimators in Fig. 1. The maximum-likelihood influence function is derived from first principles, as in Campbell (1992). We can describe the fluctuation of an influence function $\mathrm{IF}(x; \hat\theta)$ over any range $x \in (\alpha, \beta)$ by
$$s_{\hat\theta}(\alpha, \beta) = \sup_{x \in (\alpha, \beta)} |\mathrm{IF}(x; \hat\theta)|.$$
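The fluctuation $s_{\hat\theta}(\alpha, \beta)$ is easily approximated on a grid. The sketch below (ours) evaluates it for the ISE influence function (2), and also for a maximum-likelihood influence function obtained in the standard M-estimation way as the inverse Fisher information, here $2c^2 I_2$, times the score. For the ISE estimator the grid values should essentially match the gross-error sensitivities given at the end of the section, since the extrema of (2) occur at finite $x$.

```python
import numpy as np

def ise_influence(x, c=1.0, mu=0.0):
    """ISE influence function (2) at the Cauchy(mu, c) model, lambda = 1.03c."""
    lam = 1.03 * c
    a, b = lam + c, lam + 2.0 * c
    g = (a**2 + (x - mu)**2) ** (-2)
    return np.array([0.5 * b - 0.5 * b**3 * (a**2 - (x - mu)**2) * g,
                     a * b**3 * (x - mu) * g])

def ml_influence(x, c=1.0, mu=0.0):
    """Maximum-likelihood influence function: 2c^2 times the score vector."""
    d = c**2 + (x - mu)**2
    return np.array([2.0 * c * ((x - mu)**2 - c**2) / d,
                     4.0 * c**2 * (x - mu) / d])

def fluctuation(inf_fun, alpha, beta, n=200001):
    """Grid approximation to s(alpha, beta), one supremum per parameter."""
    xs = np.linspace(alpha, beta, n)
    return np.abs(np.array([inf_fun(x) for x in xs])).max(axis=0)

print(fluctuation(ise_influence, -10, 10))   # approx. (1.94, 2.19)
print(fluctuation(ml_influence, -10, 10))    # both components at most 2
```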
For the k-L estimator we find, for example, $s_{\tilde c}(-10, 10) = 2.400$ and $s_{\tilde\mu}(-10, 10) = 2.087$, whereas $s_{\tilde c}(-50, 50) = 3.957$ and $s_{\tilde\mu}(-50, 50) = 2.316$. Similar results are obtained for other values of $k$.
By contrast, for the ISE estimator the gross-error sensitivities, defined as $s_{\hat\theta_i}(-\infty, \infty)$, $i = 1, 2$, have the explicit forms
$$s_{\hat c}(-\infty, \infty) = \frac{1}{2}(\lambda + 2c) + \frac{1}{16}\,\frac{(\lambda+2c)^3}{(\lambda+c)^2}, \qquad s_{\hat\mu}(-\infty, \infty) = \frac{3\sqrt{3}}{16}\,\frac{(\lambda+2c)^3}{(\lambda+c)^2},$$
which, for the parameter choices of Fig. 1, equal 1.938 and 2.192, respectively. We conclude that the ISE method is the best in terms of overall robustness. Although the ISE estimators have slightly smaller asymptotic efficiency than the maximum-likelihood estimators, the small-sample behaviour of the two sets of estimators is virtually identical when examined by simulation. We can see no argument for using the k-L estimators, and on grounds of simplicity, robustness and efficiency we recommend the ISE estimators. Thornton and Paulson (1977) showed that ISE estimators are simple and well behaved in the normal case. We anticipate similar good behaviour for ISE estimators of the parameters of stable laws in general, but the simplicity of the Cauchy and normal cases appears to be unique.

References

Besbeas, P., 1999. Parameter estimation based on empirical transforms. Ph.D. Thesis, University of Kent, UK.
Campbell, E.P., 1992. Robustness of estimation based on empirical transforms. Ph.D. Thesis, University of Kent, UK.
Campbell, E.P., 1993. Influence for empirical transforms. Comm. Statist. Theory Methods 22, 2491–2502.
Feuerverger, A., McDunnough, P., 1981a. On the efficiency of empirical characteristic function procedures. J. Roy. Statist. Soc. Ser. B 43, 20–27.
Feuerverger, A., McDunnough, P., 1981b. On some Fourier methods for inference. J. Amer. Statist. Assoc. 76, 379–387.
Heathcote, C.R., 1977. The integrated squared error estimation of parameters. Biometrika 64, 255–264.
Koutrouvelis, I.A., 1982. Estimation of location and scale in Cauchy distributions using the empirical characteristic function. Biometrika 69, 205–213.
Ogawa, J., 1960. Determination of optimum spacings for the estimation of the scale parameter of an exponential distribution based on sample quantiles. Ann. Inst. Statist. Math. 12, 135–141.
Thornton, J.C., Paulson, A.S., 1977. Asymptotic distribution of characteristic function-based estimators for the stable laws. Sankhyā Ser. A 39, 341–354.