Enhanced error spectrum for estimation performance evaluation


Accepted Manuscript

Title: Enhanced Error Spectrum for Estimation Performance Evaluation
Authors: Weishi Peng, Yangwang Fang, Zhansheng Duan, Binghe Wang
PII: S0030-4026(16)30138-3
DOI: http://dx.doi.org/10.1016/j.ijleo.2016.02.063
Reference: IJLEO 57377
To appear in: Optik
Received date: 19-1-2016
Accepted date: 28-2-2016

Please cite this article as: Weishi Peng, Yangwang Fang, Zhansheng Duan, Binghe Wang, Enhanced Error Spectrum for Estimation Performance Evaluation, Optik (2016), http://dx.doi.org/10.1016/j.ijleo.2016.02.063. This is a PDF file of an unedited manuscript that has been accepted for publication; it will undergo copyediting, typesetting, and review of the resulting proof before publication in its final form.

Enhanced Error Spectrum for Estimation Performance Evaluation

Weishi Peng (a,c), Yangwang Fang (a), Zhansheng Duan (b,1), Binghe Wang (c)

(a) School of Aeronautics and Astronautics Engineering, Air Force Engineering University, Xi'an, Shaanxi 710038, China
(b) Center for Information Engineering Science Research, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
(c) School of Equipment Engineering, Armed Police Force Engineering University, Xi'an, Shaanxi 710086, China

Abstract

The error spectrum is a comprehensive metric for estimation performance evaluation in that it aggregates many incomprehensive measures. However, the error spectrum is a two-dimensional curve for any estimand (i.e., the quantity to be estimated) of interest. Therefore, unless one error spectrum dominates the others, it is in general not straightforward to say which one is better. Although the dynamic error spectrum (i.e., the average height of the error spectrum) was proposed to tackle this problem, it suffers from information loss because the whole error spectrum at a time instant is mapped into a single point. In particular, if the average heights of two error spectra are the same, they remain indistinguishable. To alleviate this, two new metrics, called the range error spectrum induced area and the dynamic error spectrum induced area, are proposed in this paper. How to combine these two new metrics into a single one, called the enhanced error spectrum, is then studied further. An additive and a multiplicative form of the enhanced error spectrum are presented for different scenarios. A numerical example is provided to illustrate the effectiveness of the metrics. It is shown that, by taking more information into account, the new metrics have greater applicability than the dynamic error spectrum.

1 Corresponding author. Email address: [email protected] (Zhansheng Duan)

Preprint submitted to Optik, January 19, 2016

Keywords: estimation, estimation performance evaluation, error spectrum, dynamic error spectrum, multi-objective optimization

1. Introduction

In recent years, estimation performance evaluation (EPE) has received considerable attention owing to its increasing application in estimation/filtering (see, e.g., [1]-[3], [5]-[6], [9]-[12]), track fusion/tracking [13], performance analysis [14], etc. To the best of our knowledge, EPE includes mainly two components: estimator ranking and estimator evaluation. For estimator ranking, Pitman proposed a criterion known as the Pitman closeness measure (PCM) [17]. Since then, most existing research has focused on remedying the non-transitivity problem of the PCM (see, e.g., [18]-[22]), which is a major obstacle for EPE. Inspired by the PCM, the authors of [21] proposed an estimator ranking vector which includes several performance metrics. Thus, a key aspect of EPE is the selection and proper interpretation of the metrics used for estimator ranking and estimator evaluation. The root mean square error (RMSE) is widely used in EPE, since it is the most natural finite-sample approximation of its theoretical counterpart. As pointed out in [1] and [2], however, the RMSE is easily dominated by large error terms and has no clear physical interpretation. It was therefore replaced with the average Euclidean error (AEE) in several applications [1]. Although the AEE has several advantages, it is still affected by extreme values. Consequently, several incomprehensive performance measures were proposed in [2], such as the harmonic average error (HAE), the geometric average error (GAE), the median error, and the error mode. Furthermore, since the above-listed metrics are not robust, the iterative mid-range error (IMRE) was presented in [6].

Unfortunately, each of the above-listed metrics reflects only one aspect of estimator performance. Thus, three comprehensive performance measures, namely the error spectrum (ES), the desirability level, and the relative concentration and deviation measures, were proposed in [3]-[5]. Among these, the ES can reveal more information about the estimation because it is an aggregation of several incomprehensive metrics.

However, the ES has some limitations and drawbacks. On the one hand, its calculation without the error distribution is not easy, although in [7] (a further development of [4]) the authors provided analytical formulae for computing the ES when the error distribution is given. To overcome this problem, we proposed two approximation algorithms in [15]-[16] based on the Gaussian mixture and the power-means error. On the other hand, it is difficult to say which estimator performs better when their ES curves intersect. To tackle this problem, a dynamic error spectrum (DES) reflecting the estimation accuracy of an estimator, in fact the average height of the ES, was presented in [8]-[9].

Although the DES does provide a solution to this problem, it still has some limitations. First, it is difficult to decide which estimator performs better when the average heights of their ES curves are equal. Second, the DES provides a ruler only for how large the estimation error is. Recall that least-squares (LS) estimation and minimum mean square error (MMSE) estimation differ from maximum likelihood (ML) estimation and maximum a posteriori (MAP) estimation in their underlying ideas. The former seek an estimator with the smallest error, while the latter use the most frequently occurring value of the estimatee as the estimate [2]. Although the ML and MAP estimators may have a larger average error, they may have a higher probability of being close to the estimatee; that is, their estimation errors are concentrated around the estimatee (called the concentration of an estimator in this paper), which makes their ES curves flatter. This has important implications when choosing an estimation method for a particular application. Thus, a worthwhile problem is how to also take the flatness of an ES curve into account when evaluating the performance of an estimator.

The main contribution of this work is twofold. First, two new estimation evaluation metrics, the range error spectrum (RES) induced area (RESA) and the DES induced area (DESA), are proposed, where the RESA is designed to quantify the flatness of an error spectrum curve and the DESA is designed to measure the estimation accuracy of an estimator. Second, how to combine these two new metrics, which is called the enhanced error spectrum (EES), is considered. Two forms of combination are presented for different scenarios. The first form is additive and depends on a prior preference between concentration and estimation accuracy. The second form is multiplicative; it does not depend on a prior preference and is also suggested when the dynamic error spectrum induced area is dominating.

This paper is organised as follows. The ES and DES are summarised in Section 2. In Section 3, two areas of the ES, called the RESA and the DESA, are designed to evaluate an estimator. How to combine these two new metrics, i.e., the EES, is considered in Section 4. A numerical example is provided in Section 5 to illustrate the superiority of the proposed metrics. The paper is concluded in Section 6.

2. Summary of ES and DES

2.1. Error spectrum

According to [4] and [23], let the estimation error of a (point) estimator $\hat{\theta}$ be $\tilde{\theta} = \theta - \hat{\theta}$, where $\theta$ is the (possibly vector-valued) estimand (i.e., the quantity to be estimated). We denote $e = \|\tilde{\theta}\|$ or $e = \|\tilde{\theta}\|/\|\hat{\theta}\|$ as the absolute or relative estimation error norm, where $\|\cdot\|$ can be the 1-norm or the 2-norm. Then, for $r \in (-\infty, +\infty)$, the ES is defined as

$$S(r) = (E[e^r])^{1/r} = \left\{ \int e^r \, dF(e) \right\}^{1/r} = \begin{cases} \left\{ \int e^r f(e) \, de \right\}^{1/r} & \text{if } e \text{ is continuous} \\ \left( \sum_i p_i e_i^r \right)^{1/r} & \text{if } e \text{ is discrete} \end{cases} \tag{1}$$

where $F(e)$, $f(e)$, and $p_i$ are the cumulative distribution function (CDF), probability density function (PDF), and probability mass function (PMF), respectively.

For a discrete sample $\{e_i\}_{i=1}^n$, the ES can be approximately calculated by [15]-[16]

$$S(r) \approx \begin{cases} \left[ \frac{1}{n} \sum_{i=1}^{n} e_i^r \right]^{1/r} & r \neq 0 \\ \left[ \prod_{i=1}^{n} e_i \right]^{1/n} & r = 0 \end{cases}$$

From (1), it is clear that the ES includes several incomprehensive metrics as special cases when $r$ is set to specific values:

(1) $S(2) = (E[e^2])^{1/2}$. Thus, for a discrete $e_i$, $S(2) = \left(\frac{1}{n}\sum_{i=1}^n e_i^2\right)^{1/2} = \mathrm{RMSE}$.

(2) $S(1) = E[e]$. Thus, for a discrete $e_i$, $S(1) = \frac{1}{n}\sum_{i=1}^n e_i = \mathrm{AEE}$.

(3) $S(0) = \lim_{r \to 0} S(r) = \exp(E[\ln(e)])$. Thus, for a discrete $e_i$, $S(0) = \left(\prod_{i=1}^n e_i\right)^{1/n} = \mathrm{GAE}$.

(4) $S(-1) = 1/E[1/e]$. Thus, for a discrete $e_i$, $S(-1) = \left(\frac{1}{n}\sum_{i=1}^n e_i^{-1}\right)^{-1} = \mathrm{HAE}$.

In view of this, the parameter $r$ used in this paper is a real number satisfying $r \in [-1, 2]$.
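As a sanity check on the special cases above, the discrete approximation can be coded directly. The sketch below is ours, not the paper's (the function name `error_spectrum` is our choice), and assumes strictly positive error norms:

```python
import numpy as np

def error_spectrum(e, r):
    """Discrete approximation of the error spectrum S(r) for a sample of
    positive error norms e: S(r) = (mean(e^r))^(1/r), with the geometric
    mean as the r -> 0 limit."""
    e = np.asarray(e, dtype=float)
    if r == 0:
        return float(np.exp(np.mean(np.log(e))))  # geometric mean (GAE)
    return float(np.mean(e ** r) ** (1.0 / r))

# Special cases on a toy sample: S(2)=RMSE, S(1)=AEE, S(0)=GAE, S(-1)=HAE.
e = np.array([0.5, 1.0, 2.0, 4.0])
assert np.isclose(error_spectrum(e, 2), np.sqrt(np.mean(e ** 2)))   # RMSE
assert np.isclose(error_spectrum(e, 1), np.mean(e))                 # AEE
assert np.isclose(error_spectrum(e, 0), np.prod(e) ** (1 / len(e))) # GAE
assert np.isclose(error_spectrum(e, -1), 1.0 / np.mean(1.0 / e))    # HAE
```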

Certainly, the ES is a curve for a state estimator of a dynamic system at any time instant. Therefore, over the entire time span it becomes a three-dimensional plot, which causes difficulty in the EPE of dynamic systems. Fortunately, the DES has been proposed to tackle this problem.

2.2. Dynamic error spectrum

According to [8]-[9], if some prior knowledge is available about the weights $\{\omega_i\}_{i=1}^n$ corresponding to each different $r_i$, where $\sum_{i=1}^n \omega_i = 1$, $r_i \in \{r_j\}_{j=1}^n$, $-1 \le r_i \le 2$, and $n$ is the number of indices over $\{r_j\}_{j=1}^n$, the weighted form of the DES is given simply as

$$\mathrm{DES}(\omega) = \sum_{i=1}^{n} S(r_i)\,\omega_i \tag{2}$$

Since it is difficult to obtain the weights, another form of the DES is given by the average height under the ES curve:

$$\mathrm{DES} = \frac{1}{r_n - r_1} \int_{r_1}^{r_n} S(r)\, dr \approx \frac{1}{n} \sum_{i=1}^{n} S(r_i) \tag{3}$$
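Eq. (3) can be approximated on a uniform grid of $r$ values over $[-1, 2]$. The following sketch is ours (the grid size of 31 points is an arbitrary choice), not code from the paper:

```python
import numpy as np

def error_spectrum(e, r):
    # Discrete ES approximation from Section 2.1 (positive errors assumed).
    e = np.asarray(e, dtype=float)
    if r == 0:
        return float(np.exp(np.mean(np.log(e))))
    return float(np.mean(e ** r) ** (1.0 / r))

def des(e, r_grid=np.linspace(-1.0, 2.0, 31)):
    """Average height of the ES curve over r in [-1, 2], per Eq. (3)."""
    return float(np.mean([error_spectrum(e, r) for r in r_grid]))

# For a constant error, every S(r) equals that error, so the DES does too.
assert abs(des([2.0, 2.0, 2.0]) - 2.0) < 1e-9
```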

It can be clearly seen that the DES combines several incomprehensive metrics into a single metric. Like each incomprehensive metric, however, the DES reflects only the estimation accuracy of an estimator. In other words, the DES suffers from information loss during this many-to-one mapping, as shown in Example 1.

Example 1. As pointed out in [2], a more complete description of estimation performance is the PDF of the estimation error. In the following example, we show directly why the DES cannot distinguish among the following error PDFs. Fig. 1(a) illustrates the following five error PDFs and a desired error PDF:

$$\begin{aligned}
p_1(\tilde{\theta}) &= \mathcal{N}(\tilde{\theta}; 0, 2.5) \\
p_2(\tilde{\theta}) &= \mathcal{U}(\tilde{\theta}; -\sqrt{15.5}, \sqrt{15.5}) \\
p_3(\tilde{\theta}) &= \alpha\mathcal{N}(\tilde{\theta}; -0.5, 0.2) + \beta\mathcal{N}(\tilde{\theta}; 0, 0.2) + \alpha\mathcal{N}(\tilde{\theta}; 0.5, 0.2) \\
p_4(\tilde{\theta}) &= 0.5\,\mathcal{N}(\tilde{\theta}; -0.8, 0.19) + 0.5\,\mathcal{N}(\tilde{\theta}; 0.8, 0.19) \\
p_5(\tilde{\theta}) &= 0.5\,\mathcal{N}(\tilde{\theta}; 0, 1.8) + 0.5\,\mathcal{N}(\tilde{\theta}; 0, 0.8) \\
p(\tilde{\theta}) &= \mathcal{N}(\tilde{\theta}; 0, 0.5)
\end{aligned} \tag{4}$$

where $\alpha = 0.0192$, $\beta = 1 - 2\alpha$ [5], $\mathcal{N}(\tilde{\theta}; \mu, \sigma)$ is a Gaussian distribution with mean $\mu$ and variance $\sigma$, and $\mathcal{U}(\tilde{\theta}; a, b)$ is a uniform distribution with location parameter $a$ and scale parameter $b$.

Over 100,000 Monte Carlo runs, the ES curve of each estimator is obtained as shown in Fig. 1(b). As discussed above, the DES was proposed to decide which estimator performs better. Substituting the ESs into (3) yields

$$\mathrm{DES}_{p_1} = \mathrm{DES}_{p_2} = 1.60, \quad \mathrm{DES}_{p} = \mathrm{DES}_{p_3} = 0.44, \quad \mathrm{DES}_{p_5} = \mathrm{DES}_{p_4} = 0.70 \tag{5}$$

Clearly, according to the DES there is no performance difference between the left-hand side and the right-hand side of each of the above equalities.

[Figure 1: Error PDFs and ES curves of six estimators in Example 1. (a) Error PDFs; (b) ES curves.]

However, as shown in Fig. 1(a), $p_3(\tilde{\theta})$ has the highest concentration level, being more concentrated than the desired PDF $p(\tilde{\theta})$; next come $p_4(\tilde{\theta})$ and $p_5(\tilde{\theta})$. $p_2(\tilde{\theta})$ is the least concentrated, followed by $p_1(\tilde{\theta})$, and both are less concentrated than the desired $p(\tilde{\theta})$. This corresponds to Fig. 1(b), where the ES curves of estimators 2, 3, and 4 are flatter than those of estimator 1, the desired estimator, and estimator 5, respectively.

Nevertheless, this property is ignored by the DES. Thus, two areas of the ES are proposed next to alleviate this.

3. Two areas of error spectrum

3.1. Range error spectrum

As shown in Fig. 1(b), the left and right endpoints of an ES curve represent the HAE and the RMSE, respectively. The RMSE can easily be dominated by large errors, whereas the HAE can easily be dominated by small errors. From Fig. 1(b), it can be seen that the HAE of estimator 1 is larger than that of estimator 2 while the RMSE of estimator 1 is smaller than that of estimator 2; that is, the small errors of estimator 2 are smaller than those of estimator 1 and its large errors are larger. Thus, the ES curve of estimator 1 is flatter than that of estimator 2. In other words, the evaluation results of estimator 1 are less sensitive to $r$ than those of estimator 2. The same phenomenon exists between estimators 4 and 5, as well as between estimator 3 and the desired estimator. Clearly, this property of the ES is strongly desirable for EPE. To quantify it, we define the RES as follows:

$$\mathrm{RES} = S(r_{\max}) - S(r_{\min}) \tag{6}$$

where, without loss of generality, $S(r_{\max})$ denotes the right endpoint of an ES curve and $S(r_{\min})$ the left endpoint. In this study, for $r \in [-1, 2]$, we have

$$\mathrm{RES} = S(2) - S(-1) = \mathrm{RMSE} - \mathrm{HAE} \tag{7}$$

estimator with a smaller RES will be better. Furthermore, when S(rmax ) − 8

Page 8 of 25

S(rmin ) is approaching zero, we have RES → 0; that is, the ES curve becomes a

the RES reflects how sensitive the ES is to the parameter r.

ip t

horizontal line and is independent of the parameter r. From another perspective,

As discussed above, we need not only consider the DES but also consider the

RES. However, according to (3) and (6), the dimension of the DES is inconsistent

cr

155

with the dimension of the RES. Thus, the RESA and DESA are proposed to

us

overcome this problem.

te

d

M

an

3.2. Range and dynamic error spectrum induced areas

Ac ce p

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

Figure 2: Two areas of ES curve

As shown in Fig. 2, two areas are associated with an ES curve. The first

160

one is the area of region R1 , which defined as follows. For a fixed e, let f −1 (·) is the inverse function of S(r) → r. The inverse

function always exists because S(r) is strictly increasing with respect to r [3]. Thus, we have

r = f −1 (S(r))

(8)

Then, denote ARES as the RESA, which is defined as Z

S(rn )

ARES =

f −1 (S(r))dS(r)

(9)

S(r1 )

9

Page 9 of 25

165

where S(r1 ) and S(rn ) are the left endpoint and the right endpoint of an ES

Using the random simulation, the RESA can be calculated by n

ARES ≈

S(rn ) − S(r1 ) X −1 f (S(ri )) n i=1

ip t

curve, respectively.

(10)

(10) can be approximated as n

(11)

Similarly, the second one is the area of region R2 . Let ADES is the DESA,

an

170

1X |ri | n i=1

us

ARES ≈ RES ×

cr

Since f −1 (S(ri )) is difficult to obtain analytically and ri ∈ [−1, 0], ri ≤ 0,

where {ri }ni=1 is the set of r. Actually, c1 = rn −r1 and c2 =

(12)

M

which is defined as Z rn n 1X ADES = S(r)dr ≈ (rn − r1 ) S(ri ) ≈ (rn − r1 ) × DES n i=1 r1 1 n

n P

|ri | are two constants. Correspondingly,

i=1

d

ADES = c1 × DES

te

ARES = c2 × RES

In other words, ADES and ARES are respectively different from the DES and 175

RES by only a constant.

Ac ce p

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

Certainly, the ADES of the estimators shown in Fig. 1(b) have the following

relationship:

1 2 ApDES = ApDES =1.60 × c1 3 ApDES = ApDES =0.44 × c1

(13)

5 4 ApDES = ApDES = 0.70 × c1

That is, they are still indistinguishable for using the ADES . However, the

RES and ARES of the estimators shown in Fig. 1(b) have the following relation-

180

ship, which are summarised in Table 1. From Table 1, it can be easily seen that RESp1 > RESp2 RESp > RESp3 RESp5 > RESp4 10

Page 10 of 25

p2

p3

p4

p5

p

RES

2.26

1.92

0.58

0.71

1.01

0.64

ARES

3.38

2.88

0.88

1.06

1.52

0.96

cr

p1

and

5 4 ApRES > ApRES

us

1 2 ApRES > ApRES 3 ApRES > ApRES

ip t

Table 1: Comprehensive EPE Metrics for Example 1

metrics

an

That is, in terms of the RES or ARES , estimators in the right hand side of the above equality are better. Certainly, it is interesting to determine which of 185

the above two sides estimators are better when both the ADES and ARES are

M

considered.

As discussed before, the ADES is still reflect the estimation accuracy of an estimator since ADES = c1 × DES. Meanwhile, the ARES describes the flatness

of an estimator. Certainly, the smaller the ADES and ARES , the better will be an estimator.

te

190

d

of the ES curve of an estimator, i.e., the concentration of the estimation errors

4. Enhanced error spectrum

Ac ce p

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

4.1. Multi-objective optimization Essentially, the above problem is a multi-objective optimisation problem,

195

which is formulated as follows [24]: min f (θ) = [f1 (θ), f2 (θ), · · · , fK (θ)]T θ   q (θ) ≤ 0, j = 1, 2, · · · , M j subject to  h (θ) = 0, l = 1, 2, · · · , N

(14)

l

where K,M and N are the number of objective functions, inequality constraints, and equality constraints, respectively; θ is a decision variable (possibly vectorvalued); qj (θ) is the inequality constraints; hl (θ) is the equality constraints;

11

Page 11 of 25

f (θ) ∈ RK is a K-dimensional objective function; and fi (θ) is the i-th objec200

tive function (also known as criteria, cost functions, payoff functions, or value

ip t

functions).

In multi-objective optimization, we first need to determine the utility func-

nential sum: g(f1 (θ), · · · fK (θ)) =

K X

ωi [fi (θ)]p

205

or g(f1 (θ), · · · , fK (θ)) =

K X

us

i=1

cr

tions. One of the most commonly used utility functions is the weighted expo-

[ωi fi (θ)]p

(15)

(16)

an

i=1

where ωi are weights representing the relative significance of objective function K P i, which satisfies ωi = 1 and ωi ≥ 0, ∀i; g(·, · · · , ·) is the utility function; i=1

M

and p is a real number, which is used to increase or decrease the value of the objective.

In addition, [25]-[27] suggested the following extensions to (15) and (16): K X

d

210

g(f1 (θ), · · · , fK (θ)) = {

p 1/p ωi [fi (θ) − min{fi (θ)}K i=1 ] }

(17)

or

te

i=1

K X p 1/p g(f1 (θ), · · · , fK (θ)) = { [ωi (fi (θ) − min{fi (θ)}K i=1 )] }

Ac ce p

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

(18)

i=1

In particular, for the case p = 1, we will have the regular weighted sum form: g(f1 (θ), · · · , fK (θ)) =

K X

ωi fi (θ)

(19)

i=1

One may also consider the weighted product form because the weighted sum

form is easily affected by extreme values [28]-[29]: g(f1 (θ), · · · , fK (θ)) =

K Y

fi (θ)ωi

(20)

i=1 215

For simplicity, let ωi = 1, ∀i, the weighted product form can be rewritten as: g(f1 (θ), · · · , fK (θ)) =

K Y

fi (θ)

(21)

i=1

12

Page 12 of 25

Several other forms are available that can be applied in multi-objective optimization, e.g. the exponential weighted criterion, goal programming, and the

ip t

bounded objective function. Owing to space constraint, in this paper, we focus

only on the weighted sum form and the weighted product form to define our metrics.

cr

220

In the above, we have obtained two induced areas for EPE, i.e., RESA

us

and DESA. Clearly, we prefer both of them to be as small as possible. In addition, θ is similar to r. So if we treat RESA as f1 (r) and DESA as f2 (r), the EPE problem using both RESA and DESA can be naturally changed into a bi-objective optimization problem. However, how to combine the two objective

an

225

functions, i.e., RESA and DESA, into a single objective function is critical. This is called EES in this paper and defined as

M

EES = g(ARES , ADES )

(22)

where g(·, ·) is the utility function to be determined.

plicative form, will be introduced for different scenarios.

te

230

d

Next, two forms of the utility function, i.e., an additive form and a multi-

4.2. Additive form of EES

If prior preference about the weights is available, denote EESA as the AEES,

Ac ce p

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

we can use the following additive form of the EES, i.e., AEES, as EESA =

235

2 X

ωi fi (r) = ω1 ARES + ω2 ADES

(23)

i=1

where ω1 and ω2 are the weights associated with the ARES and ADES , respec2 P tively, typically determined by the user, which satisfy ωi = 1. Clearly, an i=1

estimator with a smaller EESA is better. Remark 2: It can be easily seen from (23) that

(a) If the weights satisfy ω1 > ω2 , then the EESA focuses more on the concentration of the estimation errors of an estimator. 240

(b) If the weights satisfy ω1 < ω2 , then the EESA focuses on the estimation accuracy of an estimator. 13

Page 13 of 25

For an estimation problem, estimation accuracy is certainly more important

this, we should choose the weights such that ω1 > ω2 .

Let ω2 = 0.6, ω1 = 0.4 . Calculating the EESA of each estimator for Example 1, we have EESpA1 = 4.20 > EESpA2 = 4.03 EESpA = 1.19 > EESpA3 = 1.16

(24)

us

EESpA5 = 1.87 > EESpA4 = 1.67

cr

245

ip t

than the concentration of the estimation errors of the estimator. As a result of

Clearly, (24) means that estimators in the right hand of the above equality

an

are better when considered both the estimation accuracy and the concentration of estimation errors. 4.3. Multiplicative form of EES

M

250

Clearly, the EESA does provide a solution to the bi-objective minimization problem. Nevertheless, the EESA can easily be dominated by extreme values.

as shown in Example 2.

Example 2. Similar to Example 1, assuming that θ˜1 and θ˜2 follow the dis-

te

255

d

For example, if ADES  ARES , ARES will have a very small effect on the EESA ,

tributions θ˜1 ∼ N (100, 1) and θ˜2 ∼ N (100, 1.5), respectively. Their error distributions are as shown in Fig. 3(a) and their ES curves are as shown in Fig.

Ac ce p

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

3(b).

As can be clearly seen from Fig. 3(b), the ADES of two the estimators are

260

very close. However, as seen in both Figs. 3(a) and 3(b), the estimation errors of estimator 1 are more concentrated than those of estimator 2. Then, in terms of the overall performance, estimator 1 is certainly better. However, it can be further seen from Fig. 3(a) that for both estimators, the ADES is far more dominating over the ARES . As a result of this, the EESA cannot tell us that

265

estimator 1 is better, as could have been expected. The EESA and the other metrics for Example 2 are summarized in Table 2. From Table 2, it can be easily seen that EES1A = EES2A

(25)

14

Page 14 of 25

us

0.25 0.2 0.15

an

Gaussian PDF

0.3

ip t

estimator 1 estimator 2

0.35

cr

0.4

0.1 0.05 0 50

M

100 estimator error

150

100.02

estimator 1 estimator 2

te

100.015

d

(a) Error PDFs

100.01 100.005

ES

Ac ce p

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

100

99.995

99.99

99.985 -1

-0.5

0

0.5 r

1

1.5

2

(b) ES curves

Figure 3: Error PDFs and ES curves of two estimators for Example 2

15

Page 15 of 25

ADES

RES

ARES

EESA

Estimator 1

100.00

300.00

0.01

0.01

180.01

Estimator 2

100.00

300.00

0.02

0.02

180.01

cr

DES

Additionally, A1DES  A1RES

(26)

us

A2DES  A2RES

ip t

Table 2: Comprehensive EPE Metrics for Example 2

Clearly, the ARES has a very small impact on the EESA for the two estimators in Example 2, and the above results immediately illustrate the drawbacks of the

an

270

EESA . Next, we discuss how to resolve this type of problem by using a new form of the EES: the multiplicative form, i.e., the MEES.

EESM =

M

According to (21), let EESM is the MEES, which is defined as 2 Y

fi (r) = ARES × ADES

(27)

i=1

indicate that ADES and ARES are equally important in EPE.

te

275

d

Obviously, the estimator with a smaller EESM is better. Furthermore, (27)

Using the EESM to evaluate the estimators in Example 1, we have EESpM1 = 16.03 > EESpM2 = 13.83

Ac ce p

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

EESpM = 1.28 > EESpM3 = 1.17

EESpM5 = 3.21 > EESpM4 = 2.19

Clearly, the above results show that the evaluation results obtained using

the EESM are the same as those obtained using the EESA . For Example 2, we have

EES1M = 3.00 < EES2M = 6.00

280

That is, estimator 1 is better than estimator 2, exactly as expected. We can conclude that compared to the use of the DES alone, the use of the EESA and EESM can reveal more information about an estimator. Thus, the evaluation results obtained using these two forms are more reasonable. As a further supplement, we provide some suggestions on how to choose between 16

Page 16 of 25

285

them for a specific application. It is suggested that the EESA should be used if the prior preference between the concentration and estimation accuracy is

ip t

available. However, if no prior preference is available or if ADES  ARES , then

cr

the use of the EESM is suggested.

5. Numerical Example

This section presents an example for demonstrating the superiority of the

us

290

EESA and EESM . Using the example given in [2] and [4], we consider a single noisy measurement:

an

z =x+v

(28)

where v follows a standard normal distribution, i.e., v ∼ N (0, 1), and x is a

As pointed out in [2], the MAP estimator is given by

te

295

(29)

d

M

random variable with an exponential prior:   λ exp(−λx) x > 0 f (x) =  0 x≤0

x ˆMAP (λ) = max(z − λ, 0)

(30)

and the MMSE estimator is given by

Ac ce p

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

x ˆMMSE (λ) = √

1 −(z − λ)2 exp( + z − λ) 2 2π(1 − Φ(λ − z))

(31)

where Φ(·) is the CDF of the standard normal distribution. The estimation error of the MAP estimator is x ˆMAP (λ) = x − x ˆMAP = x − max(x + v − λ, 0)

(32)

and that of the MMSE estimator is [30] x ˜MMSE (λ) = x − x ˆMMSE (λ)   q 2 ≈ λ − v − 0.661|λ − z| + 0.3999 (λ − z) + 5.51

300

(33)

Next, two cases will be considered. 17

Page 17 of 25

Case 1. Assume that the true value x was generated from (29), where the parameter λ is always equal to 1. Then, let the estimation errors of the MAP

ip t

and MMSE estimators be x ˜MAP (λ = 1.8) and x ˜MMSE (λ = 1.8), respectively. Over 100,000 Monte Carlo runs, the empirical error PDFs of the MAP and MMSE estimators are as shown in Fig. 4(a) and their ES curves are as shown

cr

305

in Fig. 4(b).

us

Fig. 4(b) shows that the ES curve of the MMSE estimator is flatter than of the MAP estimator. And also, the ES curve of the MMSE estimator is lower than that of the MAP estimator for the region of interest of r. Hence, the MMSE estimator is better than the MAP estimator. To verify this, the above

an

310

EPE metrics are presented shown in Table 3.

M

Table 3: Comprehensive EPE Metrics for Case 1

ADES

RES

ARES

EESA

EESM

MMSE

0.42

1.27

0.68

1.02

1.17

1.29

MAP

0.59

1.78

0.93

1.39

1.62

2.47

d

DES

te

From Table 3, it can be easily seen that

    DES_MMSE < DES_MAP,      A_DES^MMSE < A_DES^MAP
    RES_MMSE < RES_MAP,      A_RES^MMSE < A_RES^MAP
    EES_A^MMSE < EES_A^MAP,  EES_M^MMSE < EES_M^MAP

That is, all EPE metrics indicate that the MMSE estimator performs better than the MAP estimator.

Case 2. Similar to Case 1, assume that λ is different for the MAP and MMSE estimators, i.e., x̂_MAP(0.56) and x̂_MMSE(2.5). Over 100,000 Monte Carlo runs, their empirical error PDFs and ES curves are shown in Figs. 5(a) and 5(b), respectively. As can be seen from Fig. 5(b), the ES curves of the two estimators are

[Figure 4(a): empirical error PDFs of the MAP (λ = 1.8) and MMSE (λ = 1.8) estimators; x-axis: estimator error, y-axis: PDF. Figure 4(b): the corresponding ES curves; x-axis: r, y-axis: ES.]

Figure 4: Empirical error PDFs and ES curves of MAP and MMSE estimators in Case 1


[Figure 5(a): empirical error PDFs of the MAP (λ = 0.56) and MMSE (λ = 2.5) estimators; x-axis: estimator error, y-axis: PDF. Figure 5(b): the corresponding ES curves; x-axis: r, y-axis: ES.]

Figure 5: Empirical error PDFs and ES curves of MAP and MMSE estimators in Case 2


very close. Therefore, it is difficult to say which estimator performs better. However, this problem can be solved by the EES_A and EES_M. The above EPE metrics are presented in Table 4.

Table 4: Comprehensive EPE Metrics for Case 2

          DES     A_DES    RES     A_RES    EES_A    EES_M
MMSE      0.45    1.36     0.75    1.13     1.27     1.53
MAP       0.45    1.36     0.70    1.06     1.24     1.43

From Table 4, it can be easily seen that

    DES_MMSE = DES_MAP        (34)

    A_DES^MMSE = A_DES^MAP        (35)

and

    RES_MMSE > RES_MAP,      A_RES^MMSE > A_RES^MAP        (36)

and

    EES_A^MMSE > EES_A^MAP,  EES_M^MMSE > EES_M^MAP        (37)

From the viewpoint of estimation accuracy, (35) means that the MMSE estimator has the same performance as the MAP estimator, which clearly demonstrates that use of the DES alone for a comparison between the MMSE and MAP estimators could be unfair. However, (36) means that the MAP estimator is better than the MMSE estimator in terms of the concentration of the estimation errors. Overall, (37) means that the MAP estimator is better than the MMSE estimator when both the estimation accuracy and the concentration of estimation errors are considered. This example further illustrates the effectiveness of the EES_A and EES_M.
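How the two combined metrics behave can be sketched as below (assumed Python). The exact definitions appear earlier in the paper; here a convex combination with weight w is assumed for the additive form and a plain product for the multiplicative form — choices that happen to reproduce the tabulated EES_A and EES_M values in Tables 3 and 4 when w = 0.6:

```python
def ees_additive(a_des, a_res, w=0.6):
    """Additive EES (sketch): weighted combination of A_DES (accuracy)
    and A_RES (concentration); w encodes the user's prior preference."""
    return w * a_des + (1.0 - w) * a_res

def ees_multiplicative(a_des, a_res):
    """Multiplicative EES (sketch): preference-free product form."""
    return a_des * a_res

# Case 1 (Table 3), MAP estimator: A_DES = 1.78, A_RES = 1.39
print(round(ees_additive(1.78, 1.39), 2))        # 1.62, matching EES_A in Table 3
print(round(ees_multiplicative(1.78, 1.39), 2))  # 2.47, matching EES_M in Table 3
```

The multiplicative form needs no weight, which is why it is preferable when no prior preference between accuracy and concentration is available.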


6. Conclusions

The main contribution of this work is twofold. First, two new estimation performance evaluation metrics, A_RES and A_DES, have been proposed, where A_RES is designed to quantify the flatness of an error spectrum curve and A_DES is designed to measure the estimation error of an estimator. Second, how to combine these two new metrics into a single one, called the EES, is considered. Two forms of combination are presented for different scenarios. The first form is additive and depends on a prior preference between the error concentration and the estimation accuracy. The second form is multiplicative; it does not depend on any prior preference and is also suggested when A_DES is dominating. Simulations show that both forms of the EES have greater applicability than the DES. As for future work, a possible direction is to study other combination forms of the EES.

Acknowledgments

This work was supported in part by the National 973 Project of China through Grant No. 2013CB329405, the National Natural Science Foundation of China through Grant No. 61174138, and the Fundamental Research Funds for the Central Universities of China.

References

[1] X.R. Li, Z.L. Zhao, Measures of performance for evaluation of estimators and filters, in: Proc. 2001 SPIE Conf. on Signal and Data Processing of Small Targets, San Diego, CA, USA, November 2001, pp. 530-541.

[2] X.R. Li, Z.L. Zhao, Evaluation of estimation algorithms - part I: incomprehensive measures of performance, IEEE Trans. Aerosp. Electron. Syst. 42 (4) (2006) 1340-1358.

[3] Z.L. Zhao, X.R. Li, Interaction between estimators and estimation criteria, in: Proc. Int. Conf. Information Fusion, Philadelphia, PA, USA, July 2005, pp. 311-316.

[4] X.R. Li, Z.L. Zhao, Z.S. Duan, Error spectrum and desirability level for estimation performance evaluation, in: Proc. Workshop on Estimation, Tracking and Fusion: A Tribute to Fred Daum, Monterey, CA, USA, May 2007, pp. 1-7.

[5] Z.L. Zhao, X.R. Li, Two classes of relative measures of estimation performance, in: Proc. Int. Conf. Information Fusion, Quebec City, Canada, July 2007, pp. 1432-1440.

[6] H.L. Yin, J. Lan, X.R. Li, Iterative mid-range with application to estimation performance evaluation, IEEE Signal Process. Lett. 22 (11) (2015) 2044-2048.

[7] Y. Liu, X.R. Li, Computation of error spectrum for estimation performance evaluation, in: Proc. Int. Conf. Information Fusion, Istanbul, Turkey, July 2013, pp. 477-483.

[8] Y.H. Mao, Z.S. Duan, C.Z. Han, Dynamic error spectrum for IMM performance evaluation, in: Proc. Int. Conf. Information Fusion, Istanbul, Turkey, July 2013, pp. 461-468.

[9] Y.H. Mao, C.Z. Han, Z.S. Duan, Dynamic error spectrum for estimation performance evaluation: a case study on interacting multiple model algorithm, IET Signal Process. 8 (2) (2014) 202-210.

[10] S.V. Halunga, N. Vizireanu, Performance evaluation for conventional and MMSE multiuser detection algorithms in imperfect reception conditions, Digital Signal Processing 20 (1) (2010) 166-178.

[11] R. Chou, Y. Boers, M. Podt, et al., Performance evaluation for particle filters, in: Proc. Int. Conf. Information Fusion, Chicago, IL, USA, July 2011, pp. 1-7.

[12] C.K. Chang, T. Zhi, R.K. Saha, Performance evaluation of track fusion with information matrix filter, IEEE Trans. Aerosp. Electron. Syst. 38 (2) (2002) 455-466.

[13] A. Gorji, R. Tharmarasa, T. Kirubarajan, Performance measures for multiple target tracking problems, in: Proc. Int. Conf. Information Fusion, Chicago, IL, USA, July 2011, pp. 1-8.

[14] S. Luo, G.A. Bi, X.L. Lv, et al., Performance analysis on Lv distribution and its application, Digital Signal Processing 23 (3) (2013) 797-807.

[15] W.S. Peng, Y.W. Fang, S.H. Chen, An approximate calculation for error spectrum, in: Proc. 2015 Int. Conf. on Estimation, Detection and Information Fusion, Harbin, China, January 2015, pp. 278-281.

[16] W.S. Peng, Y.W. Fang, R.J. Zhan, Y.L. Wu, Two approximation algorithms of error spectrum for estimation performance evaluation, Optik 127 (5) (2016) 2811-2821.

[17] E.J.G. Pitman, The "closest" estimates of statistical parameters, Math. Proc. Cambridge Philos. Soc. 33 (2) (1937) 212-222.

[18] W. Volterman, K.F. Davies, N. Balakrishnan, Pitman closeness as a criterion for the determination of the optimal progressive censoring scheme, Statistical Methodology 9 (6) (2012) 563-571.

[19] A. Hamaz, M. Ibazizen, Comparison of two estimation methods of missing values using Pitman-closeness criterion, Communications in Statistics - Theory and Methods 38 (13) (2009) 2010-2213.

[20] M. Ghosh, P.K. Sen, Bayesian Pitman closeness, Comm. Statist. Theory Methods 20 (11) (1991) 3659-3678.

[21] H.L. Yin, J. Lan, X.R. Li, Measures for ranking estimation performance based on single or multiple performance metrics, in: Proc. Int. Conf. Information Fusion, Istanbul, Turkey, July 2013, pp. 2020-2027.

[22] H.L. Yin, J. Lan, X.R. Li, Ranking estimation performance by estimator randomization and attribute support, in: Proc. Int. Conf. Information Fusion, Salamanca, Spain, July 2014, pp. 7-10.

[23] P.S. Bullen, Handbook of Means and Their Inequalities, Kluwer Academic, Dordrecht, The Netherlands, 2003.

[24] A.M. Sharaf, A.A.A. El-Gammal, Multi-objective PSO/GA optimization control strategies for energy efficient PMDC motor drives, Euro. Trans. Electr. Power 21 (8) (2011) 2080-2097.

[25] P.L. Yu, Compromise solutions, domination structures, and Salukvadze's solution, J. Optim. Theory Appl. 13 (3) (1974) 362-378.

[26] P. Lu, D. Tolliver, Multiobjective pavement-preservation decision making with simulated constraint boundary programming, Journal of Transportation Engineering 139 (9) (2013) 880-888.

[27] P. Wang, H.S. Zhu, M.W. Korsak, et al., Determination of weights for multiobjective decision making or machine learning, IEEE Systems Journal 8 (1) (2014) 63-72.

[28] Y.P. Zhao, S.W. Mao, J.H. Reed, Y.S. Huang, Utility function selection for streaming videos with a cognitive engine testbed, Mobile Netw. Appl. 14 (3) (2010) 446-460.

[29] E.N. Gerasimov, V.N. Repko, Multicriterial optimization, Int. Appl. Mech. 14 (11) (1978) 1179-1184.

[30] X.R. Li, Probability, Random Signals, and Statistics, CRC Press, Boca Raton, FL, 1999.