The maximum entropy median in the presence of an outlier


Economics Letters 8 (1981) 153-157
North-Holland Publishing Company


THE MAXIMUM ENTROPY MEDIAN IN THE PRESENCE OF AN OUTLIER

Henri THEIL, Sartaj A. KIDWAI and Murat A. YALNIZOGLU

University of Florida, Gainesville, FL 32611, USA

Received 17 September

When a random sample of size n from a standard normal population is contaminated by a normal outlier with either a different mean or a different variance, the maximum entropy median outperforms the sample median in terms of squared-error loss. The only exception is when n = 3 and the outlier mean or variance is sufficiently different.

1. Introduction

Theil and O'Brien (1980) tabulated the sampling variances of the sample and maximum entropy (ME) medians for random samples from a standard normal population. Let $x^1 < x^2 < \dots < x^n$ be the sample in the form of order statistics; then, for odd n,¹ the sample median is $x^m$ and the ME median is $\tfrac14 x^{m-1} + \tfrac12 x^m + \tfrac14 x^{m+1}$, where $m = \tfrac12(n+1)$. Using the tables of David et al. (1977), we tabulate the bias and the expected squared sampling error of these two medians when one of the n sample elements is an outlier, for two kinds of outliers.
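The two estimators defined above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function names `sample_median` and `me_median` are hypothetical, and falling back to the ordinary median for n < 3 is an assumption, since the text defines the ME median only for odd n with a neighbour on each side of the middle order statistic.

```python
from statistics import median


def sample_median(xs):
    """Conventional sample median: the middle order statistic for odd n,
    the average of the two middle order statistics for even n."""
    return median(xs)


def me_median(xs):
    """ME median as defined in the text.

    Odd n >= 3: (1/4) x^(m-1) + (1/2) x^m + (1/4) x^(m+1), m = (n + 1) / 2.
    Even n: identical to the conventional sample median.
    n < 3: falls back to the sample median (an assumption of this sketch).
    """
    s = sorted(xs)                # order statistics x^1 <= ... <= x^n
    n = len(s)
    if n < 3 or n % 2 == 0:
        return median(s)
    m = (n + 1) // 2              # 1-based index of the middle order statistic
    return 0.25 * s[m - 2] + 0.5 * s[m - 1] + 0.25 * s[m]
```

For the sample 0.3, -1.2, 5.0, 0.1, 0.8 the three middle order statistics are 0.1, 0.3 and 0.8, so the sample median is 0.3 and the ME median is 0.25(0.1) + 0.5(0.3) + 0.25(0.8) = 0.375.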

¹ For even n, the ME median is identical to the conventional choice of the sample median (i.e., the arithmetic average of the two middle order statistics).

0165-1765/81/0000-0000/$02.75 © 1981 North-Holland

2. An outlier with a different mean

Let the outlier be normally distributed with mean μ and unit variance; the n - 1 other sample elements are standard normal, and all n are mutually independent. For μ ≠ 0, the sample and ME medians are both biased. The upper part of table 1 contains their expectations for odd n < 20 and μ = 0 (½) 3, 4, ∞ (that is, μ from 0 to 3 in steps of ½, followed by 4 and ∞).² In all pairwise comparisons, the bias of the

Table 1
Bias and expected squared error of sample (Sa) and ME medians in the presence of an outlier with a different expectation.

      μ = 0           μ = ½           μ = 1           μ = 1½          μ = 2
 n    Sa      ME      Sa      ME      Sa      ME      Sa      ME      Sa      ME

Bias
 3    0       0       0.1629  0.1657  0.3051  0.3263  0.4134  0.4784  0.4855  0.6214
 5    0       0       0.0971  0.0976  0.1786  0.1823  0.2359  0.2460  0.2698  0.2877
 7    0       0       0.0691  0.0693  0.1261  0.1274  0.1647  0.1680  0.1863  0.1919
 9    0       0       0.0537  0.0538  0.0974  0.0980  0.1265  0.1280  0.1422  0.1446
11    0       0       0.0439  0.0439  0.0794  0.0797  0.1026  0.1034  0.1149  0.1162
13    0       0       0.0371  0.0371  0.0670  0.0672  0.0863  0.0868  0.0964  0.0972
15    0       0       0.0321  0.0321  0.0579  0.0580  0.0745  0.0748  0.0830  0.0835
17    0       0       0.0283  0.0283  0.0510  0.0511  0.0655  0.0657  0.0729  0.0732
19    0       0       0.0253  0.0253  0.0456  0.0457  0.0585  0.0586  0.0650  0.0652

Expected squared error
 3    0.4487  0.3406  0.4848  0.3686  0.5796  0.4494  0.7002  0.5744  0.8130  0.7347
 5    0.2868  0.2336  0.3005  0.2449  0.3351  0.2740  0.3757  0.3104  0.4094  0.3433
 7    0.2104  0.1791  0.2176  0.1852  0.2353  0.2005  0.2553  0.2183  0.2710  0.2329
 9    0.1661  0.1454  0.1705  0.1493  0.1812  0.1587  0.1931  0.1694  0.2020  0.1777
11    0.1372  0.1225  0.1401  0.1251  0.1473  0.1316  0.1571  0.1392  0.1609  0.1441
13    0.1168  0.1059  0.1189  0.1078  0.1241  0.1125  0.1296  0.1176  0.1337  0.1214
15    0.1017  0.0932  0.1033  0.0947  0.1072  0.0982  0.1113  0.1021  0.1143  0.1049
17    0.0900  0.0833  0.0913  0.0845  0.0943  0.0873  0.0975  0.0903  0.0998  0.0924
19    0.0808  0.0753  0.0818  0.0762  0.0842  0.0785  0.0867  0.0809  0.0885  0.0826

Table 1 (continued)

      μ = 2½          μ = 3           μ = 4           μ = ∞
 n    Sa      ME      Sa      ME      Sa      ME      Sa      ME

Bias
 3    0.5275  0.7569  0.5490  0.8873  0.5623  1.1406  0.5642  ∞
 5    0.2866  0.3115  0.2936  0.3234  0.2968  0.3306  0.2970  0.3316
 7    0.1962  0.2037  0.2000  0.2085  0.2015  0.2107  0.2015  0.2108
 9    0.1491  0.1523  0.1516  0.1552  0.1525  0.1563  0.1525  0.1563
11    0.1202  0.1218  0.1220  0.1238  0.1226  0.1246  0.1227  0.1246
13    0.1007  0.1017  0.1021  0.1032  0.1026  0.1037  0.1026  0.1037
15    0.0866  0.0872  0.0878  0.0885  0.0881  0.0888  0.0882  0.0889
17    0.0760  0.0764  0.0770  0.0774  0.0773  0.0778  0.0773  0.0778
19    0.0677  0.0680  0.0685  0.0688  0.0688  0.0691  0.0688  0.0691

Expected squared error
 3    0.8979  0.9240  0.9511  1.1403  0.9924  1.6557  1.0000  ∞
 5    0.4309  0.3671  0.4419  0.3813  0.4481  0.3917  0.4487  0.3935
 7    0.2803  0.2421  0.2846  0.2467  0.2867  0.2492  0.2868  0.2494
 9    0.2071  0.1826  0.2094  0.1849  0.2104  0.1859  0.2105  0.1860
11    0.1641  0.1472  0.1655  0.1485  0.1660  0.1491  0.1662  0.1491
13    0.1358  0.1234  0.1368  0.1243  0.1371  0.1247  0.1371  0.1247
15    0.1159  0.1064  0.1165  0.1070  0.1168  0.1073  0.1168  0.1073
17    0.1010  0.0935  0.1015  0.0940  0.1017  0.0942  0.1017  0.0942
19    0.0895  0.0835  0.0899  …       0.0900  0.0840  0.0900  0.0840

Table 2
Expected squared error of sample (Sa) and ME medians in the presence of an outlier with a different variance.

      σ = ½           σ = 1           σ = 2           σ = 3           σ = 4           σ = ∞
 n    Sa      ME      Sa      ME      Sa      ME      Sa      ME      Sa      ME      Sa      ME

 3    0.2832  0.2424  0.4487  0.3406  0.6596  0.6181  0.7610  1.0036  0.8173  1.4889  1.0000  ∞
 5    0.2120  0.1793  0.2868  0.2336  0.3571  0.2969  0.3860  0.3261  0.4012  0.3421  0.4487  0.3935
 7    0.1674  0.1449  0.2104  0.1791  0.2452  0.2100  0.2586  0.2224  0.2655  0.2290  0.2868  0.2494
 9    0.1381  0.1220  0.1661  0.1454  0.1867  0.1640  0.1944  0.1710  0.1984  0.1747  0.2105  0.1860
11    0.1174  0.1055  0.1372  0.1225  0.1508  0.1350  0.1558  0.1395  0.1583  0.1419  0.1662  0.1491
13    0.1021  0.0929  0.1168  0.1059  0.1265  0.1148  0.1300  0.1180  0.1318  0.1197  0.1371  0.1247
15    0.0904  0.0831  0.1017  0.0932  0.1089  0.0998  0.1115  0.1024  0.1128  0.1036  0.1168  0.1073
17    0.0810  0.0751  0.0900  0.0833  0.0957  0.0883  0.0976  0.0904  0.0987  0.0914  0.1017  0.0942
19    0.0734  0.0685  0.0808  0.0753  0.0853  0.0792  0.0868  0.0810  0.0876  0.0817  0.0900  0.0840


ME median exceeds that of the sample median, although the relative excess declines as n increases. The expected squared sampling error of the sample and ME medians is shown in the lower part of table 1. In all pairwise comparisons except n = 3, μ > 2, this expectation is larger for the sample median than for the ME median. Thus, the ME median is preferable under squared-error loss except when n = 3 and the outlier mean is sufficiently different from that of the rest of the sample.
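The tabulated values rest on the exact order-statistic moments of David et al. (1977); a Monte Carlo experiment is a different, approximate route to the same quantities and can serve as a rough cross-check. The sketch below is an illustration under stated assumptions: the function name `simulate_mean_outlier`, the replication count and the seed are hypothetical choices, and the estimates it returns fluctuate around the tabulated entries rather than reproducing them.

```python
import random


def simulate_mean_outlier(n, mu, reps=20000, seed=0):
    """Monte Carlo estimate of (bias, expected squared error) for the sample
    median ("Sa") and the ME median ("ME"), for odd n >= 3, when one sample
    element is N(mu, 1) and the other n - 1 elements are N(0, 1)."""
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be odd and at least 3")
    rng = random.Random(seed)
    m = (n + 1) // 2                       # 1-based index of the middle order statistic
    totals = {"Sa": [0.0, 0.0], "ME": [0.0, 0.0]}   # [sum, sum of squares]
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n - 1)]
        xs.append(rng.gauss(mu, 1.0))      # the single outlier
        xs.sort()
        sa = xs[m - 1]
        me = 0.25 * xs[m - 2] + 0.5 * xs[m - 1] + 0.25 * xs[m]
        # The uncontaminated population is centred at zero, so each
        # estimate is also its own sampling error.
        for key, est in (("Sa", sa), ("ME", me)):
            totals[key][0] += est
            totals[key][1] += est * est
    return {k: (s / reps, q / reps) for k, (s, q) in totals.items()}
```

Calling `simulate_mean_outlier(5, 1.0)` returns a (bias, expected squared error) pair for each median. With enough replications the ME median should show the smaller squared error, as the text states for every case other than n = 3, μ > 2.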

3. An outlier with a different variance

Let the outlier be normally distributed with zero mean and variance σ²; again, the n - 1 other sample elements are standard normal and all n are independent. The present outlier does not affect the expectation of the sample and ME medians, so that both are unbiased. The variances of these medians are shown in table 2 for odd n < 20 and σ = ½, 1, 2, 3, 4, ∞. They indicate that the ME median is preferable under squared-error loss except for n = 3, σ > 2.
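The unbiasedness claim rests on a symmetry argument that the text leaves implicit. A sketch follows; the symbols $T_{\mathrm{Sa}}$ and $T_{\mathrm{ME}}$, denoting the two medians viewed as functions of the sample, are introduced here for illustration and do not appear in the original, and σ is taken to be finite so that the expectations exist.

```latex
% Every sample element, including the N(0, \sigma^2) outlier, is symmetric
% about zero, so the sample x and its negation -x have the same distribution.
% Negating the sample reverses the order statistics: x^{i} \mapsto -x^{n+1-i},
% and n + 1 - m = m for m = (n+1)/2.
\begin{align*}
T_{\mathrm{Sa}}(-x) &= -x^{m} = -T_{\mathrm{Sa}}(x), \\
T_{\mathrm{ME}}(-x) &= -\tfrac14 x^{m+1} - \tfrac12 x^{m} - \tfrac14 x^{m-1} = -T_{\mathrm{ME}}(x), \\
\mathrm{E}[T(x)] &= \mathrm{E}[T(-x)] = -\mathrm{E}[T(x)]
  \;\Longrightarrow\; \mathrm{E}[T(x)] = 0
  \quad\text{for } T \in \{T_{\mathrm{Sa}}, T_{\mathrm{ME}}\}.
\end{align*}
```

Because both expectations are zero, the expected squared error of each median coincides with its variance, which is why the two terms can be used interchangeably in this section.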

References

David, H.A., W.J. Kennedy and R.D. Knight, 1977, Means, variances, and covariances of normal order statistics in the presence of an outlier, in: D.B. Owen and R.E. Odeh, co-eds., Selected tables in mathematical statistics, Vol. V (American Mathematical Society, Providence, RI) 75-204.

Theil, H. and P.C. O'Brien, 1980, The median of the maximum entropy distribution, Economics Letters 5, 345-347.

² Write $M_{i:n}(\mu)$ for the expectation of the ith order statistic $x^i$ of a sample of size n when the outlier expectation is μ. Then $M_{n:n}(\infty) = \infty$, which explains the upper-right entry in table 1, and $M_{i:n}(\infty) = M_{i:n-1}(0)$ for i < n.
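For n = 3 the expectations in this footnote give a one-line account of the ME median's bias. The following is a worked sketch, writing $M_{i:n}(\mu)$ for the expected ith order statistic as in the footnote; the numerical check uses the sample-median bias of 0.5623 that table 1 reports for n = 3, μ = 4.

```latex
% The three order statistics sum to the sample total, whose expectation is 0 + 0 + \mu.
\begin{align*}
M_{1:3}(\mu) + M_{2:3}(\mu) + M_{3:3}(\mu) &= \mu, \\
\mathrm{E}[\text{ME median}]
  &= \tfrac14 M_{1:3}(\mu) + \tfrac12 M_{2:3}(\mu) + \tfrac14 M_{3:3}(\mu)
   = \tfrac14\bigl(\mu + M_{2:3}(\mu)\bigr), \\
\mu = 4:\qquad \tfrac14\,(4 + 0.5623) &\approx 1.1406 .
\end{align*}
```

The result agrees with the ME bias tabulated for n = 3, μ = 4. Since $M_{2:3}(\mu)$ stays bounded as μ grows (the tabulated sample-median bias levels off at 0.5642) while the term μ/4 does not, the ME bias diverges, in line with the infinite upper-right entry.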