Economics Letters 8 (1981) 153-157
North-Holland Publishing Company
THE MAXIMUM ENTROPY MEDIAN IN THE PRESENCE OF AN OUTLIER

Henri THEIL, Sartaj A. KIDWAI and Murat A. YALNIZOGLU
University of Florida, Gainesville, FL 32611, USA

Received 17 September
When a random sample of size n from a standard normal population is contaminated by a normal outlier with either a different mean or a different variance, the maximum entropy median outperforms the sample median in terms of squared-error loss. The only exception is when n = 3 and the outlier mean or variance is sufficiently different.
1. Introduction

Theil and O'Brien (1980) tabulated the sampling variances of the sample and maximum entropy (ME) medians for random samples from a standard normal population. Let $x^1 < x^2 < \dots < x^n$ be the sample in the form of order statistics; then, for odd n,¹ the sample median is $x^m$ and the ME median is $\tfrac14 x^{m-1} + \tfrac12 x^m + \tfrac14 x^{m+1}$, where $m = \tfrac12(n + 1)$. Using the tables of David et al. (1977), we tabulate the bias and the expected squared sampling error of these two medians when one of the n sample elements is an outlier, for two kinds of outliers.
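As a concrete reading of the two definitions above, the following Python sketch computes both medians for an odd-sized sample. The function names are illustrative and do not come from the paper; the weights 1/4, 1/2, 1/4 are those of the ME median formula in the text.

```python
def sample_median(sample):
    """Middle order statistic x^m of an odd-sized sample, m = (n + 1) / 2."""
    xs = sorted(sample)
    n = len(xs)
    if n % 2 == 0:
        raise ValueError("this sketch covers odd n only")
    return xs[(n - 1) // 2]  # zero-based index of x^m


def me_median(sample):
    """ME median (1/4) x^(m-1) + (1/2) x^m + (1/4) x^(m+1) for odd n >= 3."""
    xs = sorted(sample)
    n = len(xs)
    if n % 2 == 0 or n < 3:
        raise ValueError("this sketch covers odd n >= 3 only")
    k = (n - 1) // 2  # zero-based index of the middle order statistic
    return 0.25 * xs[k - 1] + 0.5 * xs[k] + 0.25 * xs[k + 1]
```

For the sample (4, 0, 1) the order statistics are 0, 1, 4, so the sample median is 1 while the ME median is pulled toward the large observation: 0.25(0) + 0.5(1) + 0.25(4) = 1.5.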
¹ For even n, the ME median is identical to the conventional choice of the sample median (i.e., the arithmetic average of the two middle order statistics).

0165-1765/81/0000-0000/$02.75 © 1981 North-Holland

2. An outlier with a different mean

Let the outlier be normally distributed with mean μ and unit variance; the n - 1 other sample elements are standard normal, and all n are mutually independent. For μ ≠ 0, the sample and ME medians are both biased. The upper part of table 1 contains their expectations for odd n < 20 and μ = 0 (½) 3, 4, ∞.² In all pairwise comparisons, the bias of the
Table 1
Bias and expected squared error of sample and ME medians in the presence of an outlier with a different expectation
(Sa = sample median, ME = maximum entropy median)

        μ = 0           μ = ½           μ = 1           μ = 1½          μ = 2
 n    Sa      ME      Sa      ME      Sa      ME      Sa      ME      Sa      ME

Bias
 3   0       0       0.1629  0.1657  0.3051  0.3263  0.4134  0.4784  0.4855  0.6214
 5   0       0       0.0971  0.0976  0.1786  0.1823  0.2359  0.2460  0.2698  0.2877
 7   0       0       0.0691  0.0693  0.1261  0.1274  0.1647  0.1680  0.1863  0.1919
 9   0       0       0.0537  0.0538  0.0974  0.0980  0.1265  0.1280  0.1422  0.1446
11   0       0       0.0439  0.0439  0.0794  0.0797  0.1026  0.1034  0.1149  0.1162
13   0       0       0.0371  0.0371  0.0670  0.0672  0.0863  0.0868  0.0964  0.0972
15   0       0       0.0321  0.0321  0.0579  0.0580  0.0745  0.0748  0.0830  0.0835
17   0       0       0.0283  0.0283  0.0510  0.0511  0.0655  0.0657  0.0729  0.0732
19   0       0       0.0253  0.0253  0.0456  0.0457  0.0585  0.0586  0.0650  0.0652

Expected squared error
 3   0.4487  0.3406  0.4848  0.3686  0.5796  0.4494  0.7002  0.5744  0.8130  0.7347
 5   0.2868  0.2336  0.3005  0.2449  0.3351  0.2740  0.3757  0.3104  0.4094  0.3433
 7   0.2104  0.1791  0.2176  0.1852  0.2353  0.2005  0.2553  0.2183  0.2710  0.2329
 9   0.1661  0.1454  0.1705  0.1493  0.1812  0.1587  0.1931  0.1694  0.2020  0.1777
11   0.1372  0.1225  0.1401  0.1251  0.1473  0.1316  0.1571  0.1392  0.1609  0.1441
13   0.1168  0.1059  0.1189  0.1078  0.1241  0.1125  0.1296  0.1176  0.1337  0.1214
15   0.1017  0.0932  0.1033  0.0947  0.1072  0.0982  0.1113  0.1021  0.1143  0.1049
17   0.0900  0.0833  0.0913  0.0845  0.0943  0.0873  0.0975  0.0903  0.0998  0.0924
19   0.0808  0.0753  0.0818  0.0762  0.0842  0.0785  0.0867  0.0809  0.0885  0.0826
Table 1 (continued)

        μ = 2½          μ = 3           μ = 4           μ = ∞
 n    Sa      ME      Sa      ME      Sa      ME      Sa      ME

Bias
 3   0.5275  0.7569  0.5490  0.8873  0.5623  1.1406  0.5642  ∞
 5   0.2866  0.3115  0.2936  0.3234  0.2968  0.3306  0.2970  0.3316
 7   0.1962  0.2037  0.2000  0.2085  0.2015  0.2107  0.2015  0.2108
 9   0.1491  0.1523  0.1516  0.1552  0.1525  0.1563  0.1525  0.1563
11   0.1202  0.1218  0.1220  0.1238  0.1226  0.1246  0.1227  0.1246
13   0.1007  0.1017  0.1021  0.1032  0.1026  0.1037  0.1026  0.1037
15   0.0866  0.0872  0.0878  0.0885  0.0881  0.0888  0.0882  0.0889
17   0.0760  0.0764  0.0770  0.0774  0.0773  0.0778  0.0773  0.0778
19   0.0677  0.0680  0.0685  0.0688  0.0688  0.0691  0.0688  0.0691

Expected squared error
 3   0.8979  0.9240  0.9511  1.1403  0.9924  1.6557  1.0000  ∞
 5   0.4309  0.3671  0.4419  0.3813  0.4481  0.3917  0.4487  0.3935
 7   0.2803  0.2421  0.2846  0.2467  0.2867  0.2492  0.2868  0.2494
 9   0.2071  0.1826  0.2094  0.1849  0.2104  0.1859  0.2105  0.1860
11   0.1641  0.1472  0.1655  0.1485  0.1660  0.1491  0.1662  0.1491
13   0.1358  0.1234  0.1368  0.1243  0.1371  0.1247  0.1371  0.1247
15   0.1159  0.1064  0.1165  0.1070  0.1168  0.1073  0.1168  0.1073
17   0.1010  0.0935  0.1015  0.0940  0.1017  0.0942  0.1017  0.0942
19   0.0895  0.0835  0.0899  [illegible]  0.0900  0.0840  0.0900  0.0840
Table 2
Expected squared error of sample and ME medians in the presence of an outlier with a different variance
(Sa = sample median, ME = maximum entropy median)

              n = 3   5       7       9       11      13      15      17      19

σ = ½   Sa    0.2832  0.2120  0.1674  0.1381  0.1174  0.1021  0.0904  0.0810  0.0734
        ME    0.2424  0.1793  0.1449  0.1220  0.1055  0.0929  0.0831  0.0751  0.0685
σ = 1   Sa    0.4487  0.2868  0.2104  0.1661  0.1372  0.1168  0.1017  0.0900  0.0808
        ME    0.3406  0.2336  0.1791  0.1454  0.1225  0.1059  0.0932  0.0833  0.0753
σ = 2   Sa    0.6596  0.3571  0.2452  0.1867  0.1508  0.1265  0.1089  0.0957  0.0853
        ME    0.6181  0.2969  0.2100  0.1640  0.1350  0.1148  0.0998  0.0883  0.0792
σ = 3   Sa    0.7610  0.3860  0.2586  0.1944  0.1558  0.1300  0.1115  0.0976  0.0868
        ME    1.0036  0.3261  0.2224  0.1710  0.1395  0.1180  0.1024  0.0904  0.0810
σ = 4   Sa    0.8173  0.4012  0.2655  0.1984  0.1583  0.1318  0.1128  0.0987  0.0876
        ME    1.4889  0.3421  0.2290  0.1747  0.1419  0.1197  0.1036  0.0914  0.0817
σ = ∞   Sa    1.0000  0.4487  0.2868  0.2105  0.1662  0.1371  0.1168  0.1017  0.0900
        ME    ∞       0.3935  0.2494  0.1860  0.1491  0.1247  0.1073  0.0942  0.0840
ME median exceeds that of the sample median, although the relative excess declines as n increases. The expected squared sampling error of the sample and ME medians is shown in the lower part of table 1. In all pairwise comparisons except n = 3, μ > 2, this expectation is larger for the sample median than for the ME median. Thus, the ME median is preferable under squared-error loss except when n = 3 and the outlier mean is sufficiently different from that of the rest of the sample.
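The tabulated bias and expected squared error can be checked approximately by simulation. The following is a minimal Monte Carlo sketch, not the method of the paper (which relies on the order-statistic tables of David et al.). The function names, the replication count and the seed are assumptions chosen for illustration, and the `sigma` argument is an added convenience so that the outlier variance can also be changed.

```python
import random


def medians(sample):
    """Return (sample median, ME median) for an odd-sized sample, n >= 3."""
    xs = sorted(sample)
    k = (len(xs) - 1) // 2  # zero-based index of the middle order statistic
    return xs[k], 0.25 * xs[k - 1] + 0.5 * xs[k] + 0.25 * xs[k + 1]


def simulate(n, mu=0.0, sigma=1.0, reps=20000, seed=0):
    """Monte Carlo bias and expected squared error of both medians.

    One element is drawn from N(mu, sigma^2); the other n - 1 are standard
    normal. Errors are measured against 0, the median of the uncontaminated
    population.
    """
    rng = random.Random(seed)
    sums = [0.0, 0.0]     # running sums of the two estimators
    squares = [0.0, 0.0]  # running sums of their squares
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n - 1)]
        sample.append(rng.gauss(mu, sigma))  # the single outlier
        for j, est in enumerate(medians(sample)):
            sums[j] += est
            squares[j] += est * est
    names = ("sample", "ME")
    return {name: {"bias": sums[j] / reps, "mse": squares[j] / reps}
            for j, name in enumerate(names)}
```

Calling `simulate(3, mu=0.0)` should give expected squared errors close to the table 1 entries for n = 3, μ = 0 (0.4487 and 0.3406), up to simulation noise, while `simulate(3, mu=4.0)` should show the ME median with the larger expected squared error, in line with the n = 3 exception.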
3. An outlier with a different variance

Let the outlier be normally distributed with zero mean and variance σ²; again, the n - 1 other sample elements are standard normal and all n are independent. This outlier does not affect the expectation of the sample and ME medians, so both are unbiased. The variances of these medians are shown in table 2 for odd n < 20 and σ = ½, 1, 2, 3, 4, ∞. They indicate that the ME median is preferable under squared-error loss except for n = 3, σ > 2.
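The unbiasedness claim can be justified by a symmetry argument, sketched here for finite σ. This step is not spelled out in the text; the symbols $y$ (the negated sample) and $T$ (the ME median) are notation introduced only for this sketch.

```latex
% Notation introduced for this sketch: y_i = -x_i, T = ME median, m = (n+1)/2.
\begin{align*}
&(x_1,\dots,x_n) \overset{d}{=} (-x_1,\dots,-x_n)
  && \text{(each element is } N(0,1) \text{ or } N(0,\sigma^2)\text{, all independent)} \\
&y^{i} = -x^{\,n+1-i}
  && \text{(order statistics of the negated sample)} \\
&y^{m} = -x^{m}
  && \Rightarrow\ E[x^{m}] = E[y^{m}] = -E[x^{m}] = 0 \\
&T(y) = \tfrac14 y^{m-1} + \tfrac12 y^{m} + \tfrac14 y^{m+1}
      = -\bigl(\tfrac14 x^{m+1} + \tfrac12 x^{m} + \tfrac14 x^{m-1}\bigr) = -T(x)
  && \Rightarrow\ E[T] = 0
\end{align*}
```

Because both medians have zero expectation, their expected squared error coincides with their variance, which is why the variances reported for this case can be compared directly under squared-error loss.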
References

David, H.A., W.J. Kennedy and R.D. Knight, 1977, Means, variances, and covariances of normal order statistics in the presence of an outlier, in: D.B. Owen and R.E. Odeh, co-eds., Selected tables in mathematical statistics, Vol. V (American Mathematical Society, Providence, RI) 75-204.
Theil, H. and P.C. O'Brien, 1980, The median of the maximum entropy distribution, Economics Letters 5, 345-347.
² Write $M_{i:n}(\mu)$ for the expectation of the $i$th order statistic $x^i$ of a sample of size n when the outlier expectation is μ. Then $M_{n:n}(\infty) = \infty$, which explains the upper-right entry in table 1, and $M_{i:n}(\infty) = M_{i:n-1}(0)$ for i