Information Sciences 201 (2012) 53–60
Cross-entropy measure of uncertain variables

Xiaowei Chen a, Samarjit Kar b,*, Dan A. Ralescu c
a Department of Risk Management and Insurance, Nankai University, Tianjin 30071, China
b Department of Mathematics, National Institute of Technology, Durgapur 713209, India
c Department of Mathematical Sciences, University of Cincinnati, Cincinnati, OH 45221-0025, USA
* Corresponding author. Tel.: +91 9434453186.
Article info

Article history: Received 7 July 2010; Received in revised form 22 February 2012; Accepted 26 February 2012; Available online 17 March 2012

Keywords: Uncertain variable; Cross-entropy; Minimum cross-entropy principle
Abstract

Cross-entropy is a measure of the difference between two distribution functions. In order to deal with the divergence of uncertain variables via uncertainty distributions, this paper introduces the concept of cross-entropy for uncertain variables based on uncertainty theory and investigates some mathematical properties of this concept. Several practical examples are provided to calculate uncertain cross-entropy. Furthermore, the minimum cross-entropy principle is proposed in this paper. Finally, a study of generalized cross-entropy for uncertain variables is carried out.

© 2012 Elsevier Inc. All rights reserved.
1. Introduction

Probability theory, fuzzy set theory, rough set theory, and credibility theory were all introduced to describe non-deterministic phenomena. However, some of the non-deterministic phenomena expressed in natural language, e.g. "about 100 km", "approximately 39 °C", "big size", are neither random nor fuzzy. Liu [16] founded uncertainty theory as a branch of mathematics based on the normality, self-duality, countable subadditivity, and product measure axioms. An uncertain measure is used to indicate the degree of belief that an uncertain event may occur. An uncertain variable is a measurable function from an uncertainty space to the set of real numbers, and this concept is used to represent uncertain quantities. The uncertainty distribution is a description of an uncertain variable. Uncertainty theory has wide applications in programming, logic, risk management, and reliability theory. In many cases, the uncertainty is not static, but changes over time. In order to describe dynamic uncertain systems, uncertain processes were first introduced by Liu [17]. Uncertain statistics is a methodology for collecting and interpreting experimental data (provided by experts) in the framework of uncertainty theory.

Suppose that we know the states of a system take values in a specific set, although we do not know the exact form of the distribution function over this set. However, we may know constraints on this distribution: expectations, variance, or bounds on these values. Suppose that we need to choose a distribution that is, in some sense, the best estimate given what we know. Usually there are infinitely many distributions satisfying the constraints. Which one should we choose? Before answering this question, we first discuss the concepts of entropy and cross-entropy.

In 1949, Shannon [25] introduced entropy to measure the degree of uncertainty of random variables. Inspired by the Shannon entropy, fuzzy entropy was proposed by Zadeh [34] to quantify the amount of fuzziness, and the entropy of a fuzzy event is defined as a weighted Shannon entropy. Fuzzy entropy has been studied by many researchers, e.g. [7,12,14,15,20–22,33,36]. The principle of maximum entropy was proposed by Jaynes [11]: of all the distributions that satisfy the constraints, choose the one with the largest entropy. Besides this method, cross-entropy was introduced by Good [8].
It is a non-symmetric measure of the difference between two probability distributions. Other names for this concept include expected weight of evidence, directed divergence, and relative entropy. Based on De Luca and Termini's [7] fuzzy entropy, Bhandari and Pal [1] defined the cross-entropy for a fuzzy set via its membership function. The theory of fuzzy cross-entropy has been studied in [2,24]. The principle of minimum cross-entropy was proposed by Kullback [13]: from the distributions that satisfy the constraints, choose the one with the least cross-entropy. The principle of maximum entropy can be used to select a number of representative samples from a large database [28]. The principle of maximum entropy and the principle of minimum cross-entropy have been applied to machine learning and to decision trees; see [10,26,27,29–31,35] for details. Other applications include portfolio selection [23] and optimization models [9,32].

Uncertainty theory is used to model human uncertainty, and the uncertainty distribution plays a fundamental role in it. Unlike a probability distribution, which is estimated from a sample, an uncertainty distribution is usually obtained by asking domain experts to evaluate their degree of belief that each event will occur, so empirical prior information becomes more important. In many real problems, the distribution function is unavailable except for partial information, for example, a prior distribution function, which may be based on intuition or experience with the problem. In order to better estimate the uncertainty distribution, Liu [19] introduced uncertain entropy to characterize uncertainty resulting from information deficiency. Chen and Dai [4] investigated the maximum entropy principle of uncertainty distribution for uncertain variables. To compute the entropy more conveniently, Dai and Chen [5] provided some formulas for the entropy of functions of uncertain variables with regular uncertainty distributions.

In order to deal with the divergence of two given uncertainty distributions, this paper introduces the concept of cross-entropy for uncertain variables. Several practical examples are provided to calculate uncertain cross-entropy. In practice, we often need to estimate the uncertainty distribution of an uncertain variable from known (partial) information, for example, a prior uncertainty distribution, which may be based on intuition or experience with the particular problem. In this context, the principle of minimum cross-entropy in uncertainty theory will be studied.

The rest of the paper is organized as follows: some preliminary concepts of uncertainty theory are briefly recalled in Section 2. The concept and basic properties of entropy of uncertain variables are introduced in Section 3. The concept of cross-entropy for uncertain variables is introduced in Section 4, where some mathematical properties are also studied. The minimum cross-entropy principle theorem for uncertain variables is proved in Section 5. A study of generalized cross-entropy for uncertain variables is carried out in Section 6. Finally, a brief summary is given in Section 7.

2. Preliminaries

Let $\Gamma$ be a nonempty set, and $\mathcal{L}$ a $\sigma$-algebra over $\Gamma$. An uncertain measure $\mathcal{M}$ [16] is a set function defined on $\mathcal{L}$ satisfying the following four axioms:

Axiom 1. (Normality Axiom) $\mathcal{M}\{\Gamma\} = 1$;

Axiom 2. (Duality Axiom) $\mathcal{M}\{\Lambda\} + \mathcal{M}\{\Lambda^c\} = 1$ for any event $\Lambda \in \mathcal{L}$;

Axiom 3. (Subadditivity Axiom) For every countable sequence of events $\{\Lambda_i\}$, we have
$$\mathcal{M}\left\{\bigcup_{i=1}^{\infty} \Lambda_i\right\} \le \sum_{i=1}^{\infty} \mathcal{M}\{\Lambda_i\}.$$
Axiom 4. (Product Measure Axiom) Let $\Gamma_k$ be nonempty sets on which $\mathcal{M}_k$ are uncertain measures, $k = 1, 2, \ldots, n$, respectively. Then the product uncertain measure $\mathcal{M}$ is an uncertain measure on the product $\sigma$-algebra $\mathcal{L} = \mathcal{L}_1 \times \mathcal{L}_2 \times \cdots \times \mathcal{L}_n$ satisfying

$$\mathcal{M}\left\{\prod_{k=1}^{n} \Lambda_k\right\} = \min_{1 \le k \le n} \mathcal{M}_k\{\Lambda_k\}.$$
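To make the axioms concrete, the following sketch (illustrative Python; the three-point universe and the particular measure values are hypothetical choices of ours, not taken from the paper) checks normality, duality, and pairwise subadditivity for a toy set function:

```python
from itertools import combinations

# A toy uncertain measure on a three-point universe (illustrative values only).
universe = frozenset({"g1", "g2", "g3"})
M = {
    frozenset(): 0.0, universe: 1.0,
    frozenset({"g1"}): 0.6, frozenset({"g2"}): 0.3, frozenset({"g3"}): 0.2,
    frozenset({"g1", "g2"}): 0.8, frozenset({"g1", "g3"}): 0.7, frozenset({"g2", "g3"}): 0.4,
}
events = list(M)

# Axiom 1 (normality): M{Gamma} = 1.
assert M[universe] == 1.0

# Axiom 2 (duality): M{A} + M{A^c} = 1 for every event A.
for A in events:
    assert abs(M[A] + M[universe - A] - 1.0) < 1e-12

# Axiom 3 (subadditivity), checked here for all pairs of events: M{A ∪ B} <= M{A} + M{B}.
for A, B in combinations(events, 2):
    assert M[A | B] <= M[A] + M[B] + 1e-12

print("The toy set function satisfies normality, duality and pairwise subadditivity.")
```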
An uncertain variable is a measurable function from an uncertainty space $(\Gamma, \mathcal{L}, \mathcal{M})$ to the set of real numbers. The uncertainty distribution $\Phi : \mathbb{R} \to [0, 1]$ of an uncertain variable $\xi$ is defined as $\Phi(x) = \mathcal{M}\{\xi \le x\}$. The expected value operator of an uncertain variable was defined by Liu [16] as
$$E[\xi] = \int_0^{+\infty} \mathcal{M}\{\xi \ge r\}\,dr - \int_{-\infty}^{0} \mathcal{M}\{\xi \le r\}\,dr,$$
provided that at least one of the two integrals is finite. Furthermore, the variance is defined as $E[(\xi - e)^2]$, where $e$ is the finite expected value of $\xi$.

3. Entropy of uncertain variables

Definition 1 (Liu [19]). Let $\xi$ be an uncertain variable with uncertainty distribution $\Phi(x)$. Then its entropy is defined by
$$H[\xi] = \int_{-\infty}^{+\infty} S(\Phi(x))\,dx,$$
where $S(t) = -t \ln t - (1-t)\ln(1-t)$. Note that $S(t)$ is strictly concave on $[0, 1]$ and symmetrical about $t = 0.5$, so $H[\xi] \ge 0$ for all uncertain variables $\xi$. Liu [18] proved that $0 \le H[\xi] \le (b-a)\ln 2$ if $\xi$ takes values in the interval $[a, b]$, and that $H[\xi] = (b-a)\ln 2$ if and only if $\xi$ is an uncertain variable with the following uncertainty distribution:
$$\Phi(x) = \begin{cases} 0, & \text{if } x < a \\ 0.5, & \text{if } a \le x \le b \\ 1, & \text{if } x > b. \end{cases}$$
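Indeed, for this distribution $S(\Phi(x)) = S(0.5) = \ln 2$ for $a \le x \le b$ and $S(\Phi(x)) = S(0) = S(1) = 0$ elsewhere, so

$$H[\xi] = \int_a^b \ln 2\,dx = (b-a)\ln 2.$$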
Next, we will calculate the entropy of some uncertain variables.

Example 1. A linear uncertain variable $\xi$ has an uncertainty distribution
$$\Phi(x) = \begin{cases} 0, & \text{if } x < a \\ (x-a)/(b-a), & \text{if } a \le x \le b \\ 1, & \text{if } x > b, \end{cases}$$
where $a$ and $b$ are real numbers with $a < b$. The entropy of the linear uncertain variable $\mathcal{L}(a, b)$ is $H[\xi] = (b-a)/2$.

Example 2. A zigzag uncertain variable $\xi$ has an uncertainty distribution
$$\Phi(x) = \begin{cases} 0, & \text{if } x < a \\ (x-a)/2(b-a), & \text{if } a \le x < b \\ (x+c-2b)/2(c-b), & \text{if } b \le x \le c \\ 1, & \text{if } x > c, \end{cases}$$
where $a$, $b$ and $c$ are real numbers with $a < b < c$. The entropy of the zigzag uncertain variable $\mathcal{Z}(a, b, c)$ is $H[\xi] = (c-a)/2$.

Example 3. A normal uncertain variable $\xi$ is an uncertain variable with normal uncertainty distribution
$$\Phi(x) = \left(1 + \exp\left(\frac{\pi(e - x)}{\sqrt{3}\,\sigma}\right)\right)^{-1}, \quad -\infty < x < +\infty, \; \sigma > 0.$$
Then the entropy of a normal uncertain variable is $H[\xi] = \pi\sigma/\sqrt{3}$.

Theorem 1 (Dai and Chen [5]). Assume $\xi$ is an uncertain variable with regular uncertainty distribution $\Phi$. If the entropy $H[\xi]$ exists, then
$$H[\xi] = \int_0^1 \Phi^{-1}(\alpha) \ln \frac{\alpha}{1-\alpha}\,d\alpha.$$
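The following sketch (illustrative Python using numerical quadrature from scipy; the function names and the particular values of $a$ and $b$ are our own choices) checks Theorem 1 against Example 1 by evaluating the inverse-distribution formula for a linear uncertain variable $\mathcal{L}(a, b)$, whose entropy should equal $(b-a)/2$:

```python
import numpy as np
from scipy.integrate import quad

def linear_inverse(alpha, a, b):
    # Inverse uncertainty distribution of the linear uncertain variable L(a, b).
    return a + alpha * (b - a)

def entropy_via_theorem1(inv_dist, a, b):
    # H[xi] = integral_0^1 Phi^{-1}(alpha) * ln(alpha / (1 - alpha)) d alpha
    integrand = lambda alpha: inv_dist(alpha, a, b) * np.log(alpha / (1.0 - alpha))
    value, _ = quad(integrand, 0.0, 1.0)
    return value

a, b = 1.0, 4.0
print(entropy_via_theorem1(linear_inverse, a, b))  # ~1.5, i.e. (b - a)/2
print((b - a) / 2.0)
```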
Theorem 2 (Dai and Chen [5]). Let $\xi$ and $\eta$ be independent uncertain variables. Then for any real numbers $a$ and $b$, we have
$$H[a\xi + b\eta] = |a| H[\xi] + |b| H[\eta].$$

Furthermore, Chen and Dai [4] proved the following maximum entropy theorem for uncertain variables. Let $\xi$ be an uncertain variable with finite expected value $e$ and variance $\sigma^2$. Then $H[\xi] \le \pi\sigma/\sqrt{3}$, and the equality holds if and only if $\xi$ is a normal uncertain variable with expected value $e$ and variance $\sigma^2$, i.e., $\mathcal{N}(e, \sigma)$.

Definition 2 (Dai [6]). Suppose that $\xi$ is an uncertain variable with uncertainty distribution $\Phi$. Then its quadratic entropy is defined by
$$Q[\xi] = \int_{-\infty}^{+\infty} \Phi(x)(1 - \Phi(x))\,dx.$$
Some mathematical properties of quadratic entropy were studied by Dai [6], in particular the principle of maximum quadratic entropy; several maximum quadratic entropy theorems with moment constraints are also discussed there. The quadratic entropy has also been applied to estimate uncertainty distributions in uncertain statistics by Dai [6].
4. Cross-entropy for uncertain variables

In this section, we introduce the concept of cross-entropy for uncertain variables by using uncertain measures. First, we recall the information-theoretic distance known as cross-entropy, introduced by Kullback [13]. Let $P = \{p_1, p_2, \ldots, p_n\}$
and $Q = \{q_1, q_2, \ldots, q_n\}$ be two probability distributions, where $\sum_{i=1}^n p_i = \sum_{i=1}^n q_i = 1$. The cross-entropy $D(P; Q)$ is defined as

$$D(P; Q) = \sum_{i=1}^{n} p_i \ln \frac{p_i}{q_i}. \tag{1}$$
Since Eq. (1) is asymmetric, Pal and Pal [21] used the symmetric version
$$D(P; Q) = \sum_{i=1}^{n} \left( p_i \ln \frac{p_i}{q_i} + q_i \ln \frac{q_i}{p_i} \right).$$
Inspired by this, we introduce the following function to define cross-entropy for uncertain variables:
$$T(s, t) = s \ln \frac{s}{t} + (1-s) \ln \frac{1-s}{1-t}, \quad 0 \le s \le 1, \; 0 \le t \le 1,$$
with the convention $0 \ln 0 = 0$. It is obvious that $T(s, t) = T(1-s, 1-t)$ for any $0 \le s \le 1$ and $0 \le t \le 1$. Note that
$$\frac{\partial T}{\partial s} = \ln \frac{s}{t} - \ln \frac{1-s}{1-t}, \qquad \frac{\partial T}{\partial t} = \frac{t-s}{t(1-t)},$$
$$\frac{\partial^2 T}{\partial s^2} = \frac{1}{s(1-s)}, \qquad \frac{\partial^2 T}{\partial t \partial s} = -\frac{1}{t(1-t)}, \qquad \frac{\partial^2 T}{\partial t^2} = \frac{s}{t^2} + \frac{1-s}{(1-t)^2}.$$
Then $T(s, t)$ is a strictly convex function with respect to $(s, t)$ and reaches its minimum value 0 when $s = t$. In uncertainty theory, the best description of an uncertain variable is its uncertainty distribution. The inverse uncertainty distribution has many good properties, and the inverse uncertainty distribution for operations of uncertain variables can be obtained easily. So we will define cross-entropy by using uncertainty distributions.

Definition 3 (Chen [3]). Let $\xi$ and $\eta$ be two uncertain variables. Then the cross-entropy of $\xi$ from $\eta$ is defined as
$$D[\xi; \eta] = \int_{-\infty}^{+\infty} T(\mathcal{M}\{\xi \le x\}, \mathcal{M}\{\eta \le x\})\,dx,$$
where $T(s, t) = s \ln \frac{s}{t} + (1-s) \ln \frac{1-s}{1-t}$.

It is obvious that $D[\xi; \eta]$ is symmetric, i.e., the value does not change if the outcomes are labeled differently. Let $\Phi_\xi$ and $\Phi_\eta$ be the uncertainty distributions of the uncertain variables $\xi$ and $\eta$, respectively. The cross-entropy of $\xi$ from $\eta$ can be written as
$$D[\xi; \eta] = \int_{-\infty}^{+\infty} \left( \Phi_\xi(x) \ln \frac{\Phi_\xi(x)}{\Phi_\eta(x)} + (1 - \Phi_\xi(x)) \ln \frac{1 - \Phi_\xi(x)}{1 - \Phi_\eta(x)} \right) dx.$$
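The formula above can be evaluated numerically once the two uncertainty distributions are known. The sketch below (illustrative Python with scipy quadrature; the function names are ours) computes $D[\xi; \eta]$ for two linear uncertain variables, anticipating the particular case of Example 5 where $\xi \sim \mathcal{L}(0, 1)$, $\eta \sim \mathcal{L}(0, 2)$ and $D[\xi; \eta] = 0.5$:

```python
import numpy as np
from scipy.integrate import quad

def linear_cdf(x, a, b):
    # Uncertainty distribution of the linear uncertain variable L(a, b).
    return np.clip((x - a) / (b - a), 0.0, 1.0)

def cross_entropy(phi_xi, phi_eta, lo, hi):
    # D[xi; eta] = integral of T(Phi_xi(x), Phi_eta(x)) dx, truncated to [lo, hi]
    # where the integrand is non-zero.
    def integrand(x):
        s, t = phi_xi(x), phi_eta(x)
        total = 0.0
        if s > 0.0:
            total += s * np.log(s / t)
        if s < 1.0:
            total += (1.0 - s) * np.log((1.0 - s) / (1.0 - t))
        return total
    value, _ = quad(integrand, lo, hi, points=[1.0], limit=200)
    return value

phi_xi = lambda x: linear_cdf(x, 0.0, 1.0)   # L(0, 1)
phi_eta = lambda x: linear_cdf(x, 0.0, 2.0)  # L(0, 2)
print(cross_entropy(phi_xi, phi_eta, 0.0, 2.0))  # ~0.5
```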
The cross-entropy depends only on the number of values and their uncertainties, and does not depend on the actual values that the uncertain variables $\xi$ and $\eta$ take.

Theorem 3. For any uncertain variables $\xi$ and $\eta$, we have $D[\xi; \eta] \ge 0$, and the equality holds if and only if $\xi$ and $\eta$ have the same uncertainty distribution.

Proof. Let $\Phi_\xi(x)$ and $\Phi_\eta(x)$ be the uncertainty distributions of $\xi$ and $\eta$, respectively. Since $T(s, t)$ is strictly convex on $[0,1] \times [0,1]$ and reaches its minimum value when $s = t$, we have
$$T(\Phi_\xi(x), \Phi_\eta(x)) \ge 0$$

for almost all points $x \in \mathbb{R}$. Then
$$D[\xi; \eta] = \int_{-\infty}^{+\infty} T(\Phi_\xi(x), \Phi_\eta(x))\,dx \ge 0.$$
For each $s \in [0, 1]$, there is a unique point $t = s$ with $T(s, t) = 0$. Thus, $D[\xi; \eta] = 0$ if and only if $T(\Phi_\xi(x), \Phi_\eta(x)) = 0$ for almost all points $x \in \mathbb{R}$, that is, $\mathcal{M}\{\xi \le x\} = \mathcal{M}\{\eta \le x\}$. □

Example 4. Suppose that $\xi$ and $\eta$ are uncertain variables with uncertainty distributions $\Phi_1$ and $\Phi_2$, respectively. Assume that the uncertainty distributions $\Phi_1$ and $\Phi_2$ have the form
$$\Phi_1(x) = \begin{cases} 0, & \text{if } x < a_1 \\ \alpha_i, & \text{if } a_i \le x < a_{i+1},\; i = 1, 2, \ldots, n-1 \\ 1, & \text{if } x \ge a_n \end{cases}$$
and
$$\Phi_2(x) = \begin{cases} 0, & \text{if } x < a_1 \\ \beta_i, & \text{if } a_i \le x < a_{i+1},\; i = 1, 2, \ldots, n-1 \\ 1, & \text{if } x \ge a_n, \end{cases}$$
respectively. Then the cross-entropy of $\xi$ from $\eta$ is
$$D[\xi; \eta] = \sum_{i=1}^{n-1} \left( \alpha_i \ln \frac{\alpha_i}{\beta_i} + (1 - \alpha_i) \ln \frac{1 - \alpha_i}{1 - \beta_i} \right)(a_{i+1} - a_i).$$
Example 5. Suppose that $\xi$ and $\eta$ are linear uncertain variables with uncertainty distributions $\mathcal{L}(a, b)$ and $\mathcal{L}(c, d)$ ($c \le a < b \le d$), respectively. Then the cross-entropy of $\xi$ from $\eta$ is
$$D[\xi; \eta] = \int_{-\infty}^{+\infty} \left( \Phi_\xi(x) \ln \frac{\Phi_\xi(x)}{\Phi_\eta(x)} + (1 - \Phi_\xi(x)) \ln \frac{1 - \Phi_\xi(x)}{1 - \Phi_\eta(x)} \right) dx \tag{2}$$
$$= \int_a^b \left( \frac{x-a}{b-a} \ln \frac{(x-a)(d-c)}{(b-a)(x-c)} + \frac{b-x}{b-a} \ln \frac{(b-x)(d-c)}{(b-a)(d-x)} \right) dx \tag{3}$$
$$\quad + \int_c^a \ln \frac{d-c}{d-x}\,dx + \int_b^d \ln \frac{d-c}{x-c}\,dx. \tag{4}$$
In particular, if the uncertainty distributions of $\xi$ and $\eta$ are $\mathcal{L}(0, 1)$ and $\mathcal{L}(0, 2)$, respectively, using the formula above we get $D[\xi; \eta] = 0.5$.

Example 6. Suppose that $\xi$ and $\eta$ are zigzag uncertain variables with uncertainty distributions $\mathcal{Z}(a, b, c)$ and $\mathcal{Z}(d, b, e)$ ($d \le a < b < c \le e$), respectively. Then the cross-entropy of $\xi$ from $\eta$ is
$$D[\xi; \eta] = \int_{-\infty}^{+\infty} \left( \Phi_\xi(x) \ln \frac{\Phi_\xi(x)}{\Phi_\eta(x)} + (1 - \Phi_\xi(x)) \ln \frac{1 - \Phi_\xi(x)}{1 - \Phi_\eta(x)} \right) dx \tag{5}$$
$$= \int_a^b \left( \frac{x-a}{2b-2a} \ln \frac{(x-a)(2b-2d)}{(x-d)(2b-2a)} + \frac{2b-x-a}{2b-2a} \ln \frac{(2b-x-a)(2b-2d)}{(2b-x-d)(2b-2a)} \right) dx \tag{6}$$
$$\quad + \int_b^c \left( \frac{x+c-2b}{2c-2b} \ln \frac{(x+c-2b)(2e-2b)}{(x+e-2b)(2c-2b)} + \frac{c-x}{2c-2b} \ln \frac{(c-x)(2e-2b)}{(e-x)(2c-2b)} \right) dx \tag{7}$$
$$\quad + \int_d^a \ln \frac{2b-2d}{2b-d-x}\,dx + \int_c^e \ln \frac{2e-2b}{x+e-2b}\,dx. \tag{8}$$
In particular, if the uncertainty distributions of $\xi$ and $\eta$ are $\mathcal{Z}(0, 1, 2)$ and $\mathcal{Z}(0, 1, 3)$, respectively, using the formula above we get $D[\xi; \eta] = 0.2$.

Example 7. Suppose that $\xi$ and $\eta$ are normal uncertain variables with uncertainty distributions $\mathcal{N}(e_1, \sigma_1)$ and $\mathcal{N}(e_2, \sigma_2)$, respectively. Then the cross-entropy of $\xi$ from $\eta$ is
$$D[\xi; \eta] = \int_{-\infty}^{+\infty} \left( \Phi_\xi(x) \ln \frac{\Phi_\xi(x)}{\Phi_\eta(x)} + (1 - \Phi_\xi(x)) \ln \frac{1 - \Phi_\xi(x)}{1 - \Phi_\eta(x)} \right) dx \tag{9}$$
$$= \int_{-\infty}^{+\infty} \left(1 + \exp\left(\frac{\pi(e_1 - x)}{\sqrt{3}\,\sigma_1}\right)\right)^{-1} \ln \frac{1 + \exp\left(\frac{\pi(e_2 - x)}{\sqrt{3}\,\sigma_2}\right)}{1 + \exp\left(\frac{\pi(e_1 - x)}{\sqrt{3}\,\sigma_1}\right)}\,dx \tag{10}$$
$$\quad + \int_{-\infty}^{+\infty} \left(1 + \exp\left(\frac{\pi(x - e_1)}{\sqrt{3}\,\sigma_1}\right)\right)^{-1} \ln \frac{1 + \exp\left(\frac{\pi(x - e_2)}{\sqrt{3}\,\sigma_2}\right)}{1 + \exp\left(\frac{\pi(x - e_1)}{\sqrt{3}\,\sigma_1}\right)}\,dx. \tag{11}$$
In particular, if the uncertainty distributions of $\xi$ and $\eta$ are $\mathcal{N}(0, 2)$ and $\mathcal{N}(0, 1)$, respectively, using the formula above we get $D[\xi; \eta] = 0.9$.

5. Minimum cross-entropy principle

In real problems, the distribution function of an uncertain variable is unavailable except for partial information, for example, some prior distribution function, which may be based on intuition or experience with the problem. If the moment constraints and the prior distribution function are given, then, since the distribution function must be consistent with the given information and our experience, we use the minimum cross-entropy principle: out of all the distributions satisfying the given moment constraints, choose the one that is closest to the given prior distribution function.
Theorem 4. Let $\xi$ be a continuous uncertain variable with finite second moment $m^2$. If the prior distribution function has the form
$$\Psi(x) = (1 + \exp(ax))^{-1}, \quad a < 0,$$
then the minimum cross-entropy distribution function is the normal uncertainty distribution with second moment $m^2$.

Proof. Let $\Phi(x)$ be the distribution function of $\xi$ and write $\Phi^*(x) = \Phi(-x)$ for $x \ge 0$. The second moment is
$$E[\xi^2] = \int_0^{+\infty} \mathcal{M}\{\xi^2 \ge r\}\,dr = \int_0^{+\infty} \mathcal{M}\{(\xi \ge \sqrt{r}) \cup (\xi \le -\sqrt{r})\}\,dr = \int_0^{+\infty} \left(1 - \Phi(\sqrt{r}) + \Phi(-\sqrt{r})\right) dr$$
$$= \int_0^{+\infty} 2r\left(1 - \Phi(r) + \Phi(-r)\right) dr = \int_0^{+\infty} 2r\left(1 - \Phi(r) + \Phi^*(r)\right) dr = m^2.$$
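The passage from $\int_0^{+\infty}\left(1 - \Phi(\sqrt{r}) + \Phi(-\sqrt{r})\right) dr$ to the integral containing the factor $2r$ is the substitution $r = u^2$:

$$\int_0^{+\infty} \left(1 - \Phi(\sqrt{r}) + \Phi(-\sqrt{r})\right) dr = \int_0^{+\infty} \left(1 - \Phi(u) + \Phi(-u)\right) 2u\,du.$$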
Thus there exists a real number $\kappa$ such that
$$\int_0^{+\infty} 2r(1 - \Phi(r))\,dr = \kappa m^2, \qquad \int_0^{+\infty} 2r\,\Phi^*(r)\,dr = (1 - \kappa) m^2.$$
The minimum cross-entropy distribution function $\Phi(r)$ should minimize the cross-entropy
$$\int_{-\infty}^{+\infty} \left( \Phi(r) \ln \frac{\Phi(r)}{\Psi(r)} + (1 - \Phi(r)) \ln \frac{1 - \Phi(r)}{1 - \Psi(r)} \right) dr$$
$$= \int_{-\infty}^{0} \left( \Phi(r) \ln \frac{\Phi(r)}{\Psi(r)} + (1 - \Phi(r)) \ln \frac{1 - \Phi(r)}{1 - \Psi(r)} \right) dr + \int_0^{+\infty} \left( \Phi(r) \ln \frac{\Phi(r)}{\Psi(r)} + (1 - \Phi(r)) \ln \frac{1 - \Phi(r)}{1 - \Psi(r)} \right) dr$$
$$= \int_0^{+\infty} \left( \Phi(r) \ln \frac{\Phi(r)}{\Psi(r)} + (1 - \Phi(r)) \ln \frac{1 - \Phi(r)}{1 - \Psi(r)} + \Phi^*(r) \ln \frac{\Phi^*(r)}{1 - \Psi(r)} + (1 - \Phi^*(r)) \ln \frac{1 - \Phi^*(r)}{\Psi(r)} \right) dr.$$
Here the last equality uses $\Phi(-r) = \Phi^*(r)$ and $\Psi(-r) = 1 - \Psi(r)$. The minimization is subject to the moment constraints
$$\int_0^{+\infty} 2r(1 - \Phi(r))\,dr = \kappa m^2, \qquad \int_0^{+\infty} 2r\,\Phi^*(r)\,dr = (1 - \kappa) m^2.$$
The Lagrangian is
$$L = \int_0^{+\infty} \left( \Phi(r) \ln \frac{\Phi(r)}{\Psi(r)} + (1 - \Phi(r)) \ln \frac{1 - \Phi(r)}{1 - \Psi(r)} + \Phi^*(r) \ln \frac{\Phi^*(r)}{1 - \Psi(r)} + (1 - \Phi^*(r)) \ln \frac{1 - \Phi^*(r)}{\Psi(r)} \right) dr$$
$$\quad - \lambda_1 \left( \int_0^{+\infty} 2r(1 - \Phi(r))\,dr - \kappa m^2 \right) - \lambda_2 \left( \int_0^{+\infty} 2r\,\Phi^*(r)\,dr - (1 - \kappa) m^2 \right).$$
The Euler–Lagrange equations tell us that the minimum cross-entropy distribution function satisfies
$$\ln \frac{\Phi(r)}{\Psi(r)} - \ln \frac{1 - \Phi(r)}{1 - \Psi(r)} = -2r\lambda_1, \qquad \ln \frac{\Phi^*(r)}{1 - \Psi(r)} - \ln \frac{1 - \Phi^*(r)}{\Psi(r)} = 2r\lambda_2.$$

Thus $\Phi$ and $\Phi^*$ have the form
$$\Phi(r) = \left(1 + \exp(ar + 2\lambda_1 r)\right)^{-1}, \qquad \Phi^*(r) = \left(1 + \exp(-ar - 2\lambda_2 r)\right)^{-1}.$$

Substituting these into the moment constraints, we get
$$\Phi(r) = \left(1 + \exp\left(-\frac{\pi r}{\sqrt{6\kappa m^2}}\right)\right)^{-1}, \qquad \Phi^*(r) = \left(1 + \exp\left(\frac{\pi r}{\sqrt{6(1-\kappa) m^2}}\right)\right)^{-1}.$$
When $\kappa = 1/2$, the cross-entropy achieves its minimum. Thus the distribution function is just the normal uncertainty distribution $\mathcal{N}(0, m)$. □
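As an informal numerical illustration of Theorem 4 (a Python sketch using scipy; the prior parameter, the competing linear distribution, and all function names are our own choices), one can compare the cross-entropy from the prior $\Psi(x) = (1 + \exp(ax))^{-1}$, $a < 0$, of the normal distribution $\mathcal{N}(0, m)$ with that of another distribution having the same second moment $m^2$, e.g. the linear distribution $\mathcal{L}(-\sqrt{3}\,m, \sqrt{3}\,m)$; by the theorem, the normal one should come out smaller.

```python
import numpy as np
from scipy.integrate import quad

m = 1.0
a = -2.0                                                # prior parameter, a < 0
prior = lambda x: 1.0 / (1.0 + np.exp(a * x))           # Psi(x)
normal = lambda x: 1.0 / (1.0 + np.exp(np.pi * (0.0 - x) / (np.sqrt(3.0) * m)))  # N(0, m)
c = np.sqrt(3.0) * m
linear = lambda x: np.clip((x + c) / (2.0 * c), 0.0, 1.0)                         # L(-c, c), second moment m^2

def cross_entropy(phi, psi, lo=-60.0, hi=60.0, points=None):
    # D[xi; eta] with distributions phi (for xi) and psi (for eta), numerically truncated.
    def integrand(x):
        s, t = phi(x), psi(x)
        total = 0.0
        if s > 0.0:
            total += s * np.log(s / t)
        if s < 1.0:
            total += (1.0 - s) * np.log((1.0 - s) / (1.0 - t))
        return total
    value, _ = quad(integrand, lo, hi, points=points, limit=400)
    return value

print(cross_entropy(normal, prior))                  # smaller value
print(cross_entropy(linear, prior, points=[-c, c]))  # larger value, same second moment
```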
6. Generalized cross-entropy

In this section, we define a generalized cross-entropy for uncertain variables via a strictly convex function $P(x)$ satisfying $P(1) = 0$. There are many functions satisfying this condition, such as $P(x) = x^2 - x$, $P(x) = (x^\alpha - x)/(\alpha - 1)$ ($\alpha > 0$, $\alpha \ne 1$), and $P(x) = |x - 1/2| - 1/2$. For convenience, we define
$$T(s, t) = t\,P\left(\frac{s}{t}\right) + (1 - t)\,P\left(\frac{1-s}{1-t}\right), \quad (s, t) \in [0, 1] \times [0, 1].$$
It is easy to prove that $T(s, t)$ is a function from $[0, 1] \times [0, 1]$ to $[0, +\infty)$ with the conventions $T(s, 0) = \lim_{t \to 0^+} T(s, t)$ and $T(s, 1) = \lim_{t \to 1^-} T(s, t)$.

Definition 4. Let $\xi$ and $\eta$ be two uncertain variables. Then the generalized cross-entropy of $\xi$ from $\eta$ is defined as
$$GD[\xi; \eta] = \int_{-\infty}^{+\infty} T(\mathcal{M}\{\xi \le x\}, \mathcal{M}\{\eta \le x\})\,dx,$$
where $T(s, t) = t\,P\left(\frac{s}{t}\right) + (1 - t)\,P\left(\frac{1-s}{1-t}\right)$.
It is clear that $GD[\xi; \eta]$ is not symmetric. Note that, by changing the form of $P(x)$, we obtain different generalized cross-entropies.

(i) Let $P(x) = (x^\alpha - x)/(\alpha - 1)$, $\alpha > 0$, $\alpha \ne 1$. Then
$$GD[\xi; \eta] = \frac{1}{\alpha - 1} \int_{-\infty}^{+\infty} \left[ \mathcal{M}\{\xi \le x\}^\alpha\, \mathcal{M}\{\eta \le x\}^{1-\alpha} + (1 - \mathcal{M}\{\xi \le x\})^\alpha (1 - \mathcal{M}\{\eta \le x\})^{1-\alpha} - 1 \right] dx.$$
In particular, when $\alpha = 2$,
$$GD[\xi; \eta] = \int_{-\infty}^{+\infty} \frac{\left(\mathcal{M}\{\xi \le x\} - \mathcal{M}\{\eta \le x\}\right)^2}{\mathcal{M}\{\eta \le x\}\left(1 - \mathcal{M}\{\eta \le x\}\right)}\,dx.$$
(ii) Let $P(x) = |x - 1/2| - 1/2$. Then
$$GD[\xi; \eta] = \int_{-\infty}^{+\infty} \left( \left|\mathcal{M}\{\xi \le x\} - \tfrac{1}{2}\mathcal{M}\{\eta \le x\}\right| + \left|\mathcal{M}\{\xi \le x\} - \tfrac{1}{2}\mathcal{M}\{\eta \le x\} - \tfrac{1}{2}\right| - \tfrac{1}{2} \right) dx.$$
Let $\Phi_\xi(x)$ and $\Phi_\eta(x)$ be the uncertainty distributions of $\xi$ and $\eta$, respectively. Similarly, the generalized cross-entropy can be rewritten as
$$GD[\xi; \eta] = \int_{-\infty}^{+\infty} \left( \Phi_\eta(x)\,P\left(\frac{\Phi_\xi(x)}{\Phi_\eta(x)}\right) + (1 - \Phi_\eta(x))\,P\left(\frac{1 - \Phi_\xi(x)}{1 - \Phi_\eta(x)}\right) \right) dx.$$
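The sketch below (illustrative Python with scipy; the choice of $P$, the distributions, and the function names are ours) evaluates this formula for $P(x) = x^2 - x$, i.e. the $\alpha = 2$ case above, using the two linear distributions of Example 5:

```python
import numpy as np
from scipy.integrate import quad

def P(x):
    # A strictly convex function with P(1) = 0; here P(x) = x^2 - x.
    return x * x - x

def generalized_cross_entropy(phi_xi, phi_eta, lo, hi):
    # GD[xi; eta] = integral of  Phi_eta * P(Phi_xi / Phi_eta)
    #                          + (1 - Phi_eta) * P((1 - Phi_xi) / (1 - Phi_eta))  dx
    def integrand(x):
        s, t = phi_xi(x), phi_eta(x)
        total = 0.0
        if t > 0.0:
            total += t * P(s / t)
        if t < 1.0:
            total += (1.0 - t) * P((1.0 - s) / (1.0 - t))
        return total
    value, _ = quad(integrand, lo, hi, points=[1.0], limit=200)
    return value

phi_xi = lambda x: np.clip(x / 1.0, 0.0, 1.0)   # L(0, 1)
phi_eta = lambda x: np.clip(x / 2.0, 0.0, 1.0)  # L(0, 2)
print(generalized_cross_entropy(phi_xi, phi_eta, 0.0, 2.0))
```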
Now, suppose that P(x) is a twice-differentiable strictly convex function. Then T(s, t) is twice-differentiable with respect to both s and t, and
$$\frac{\partial T}{\partial s} = P'\left(\frac{s}{t}\right) - P'\left(\frac{1-s}{1-t}\right), \qquad \frac{\partial T}{\partial t} = P\left(\frac{s}{t}\right) - \frac{s}{t}\,P'\left(\frac{s}{t}\right) - P\left(\frac{1-s}{1-t}\right) + \frac{1-s}{1-t}\,P'\left(\frac{1-s}{1-t}\right), \tag{12}$$
$$\frac{\partial^2 T}{\partial s \partial t} = \frac{\partial^2 T}{\partial t \partial s} = -\left( \frac{s}{t^2}\,P''\left(\frac{s}{t}\right) + \frac{1-s}{(1-t)^2}\,P''\left(\frac{1-s}{1-t}\right) \right), \tag{13}$$
$$\frac{\partial^2 T}{\partial s^2} = \frac{1}{t}\,P''\left(\frac{s}{t}\right) + \frac{1}{1-t}\,P''\left(\frac{1-s}{1-t}\right), \qquad \frac{\partial^2 T}{\partial t^2} = \frac{s^2}{t^3}\,P''\left(\frac{s}{t}\right) + \frac{(1-s)^2}{(1-t)^3}\,P''\left(\frac{1-s}{1-t}\right). \tag{14}$$
The following properties of $T(s, t)$ can be easily proved by Eqs. (12)–(14): (a) $T(s, t)$ is a strictly convex function with respect to $(s, t)$ and attains its minimum value zero on the line $s = t$; and (b) for any $0 \le s \le 1$, $0 \le t \le 1$, we have $T(s, t) = T(1-s, 1-t)$.

Theorem 5. For any uncertain variables $\xi$ and $\eta$, the generalized cross-entropy satisfies
$$GD[\xi; \eta] \ge 0,$$

and the equality holds if and only if $\xi$ and $\eta$ have the same distribution function.

Proof. Since $T(s, t)$ is strictly convex with respect to $(s, t)$ and attains its minimum value zero on the line $s = t$, this theorem can be proved similarly to Theorem 3. □
7. Conclusion

In this paper, we first recalled the concept of entropy for uncertain variables and its mathematical properties. Then we introduced the concept of cross-entropy for uncertain variables to deal with the divergence of two uncertain variables. In addition, we investigated some mathematical properties of this type of cross-entropy and proposed the minimum cross-entropy principle. Furthermore, some examples were provided to calculate uncertain cross-entropy. Finally, we carried out a study on generalized cross-entropy for uncertain variables. In the future, we plan to continue our research to obtain more properties of the proposed cross-entropy measure, especially when the prior distribution function of an uncertain variable has other forms. In addition, we plan to apply our results to the fields of portfolio selection, uncertain optimization, and machine learning.

Acknowledgments

This work was supported by National Natural Science Foundation of China Grant Nos. 71073084, 71171119 and 91024032. Dan A. Ralescu's work was partly supported by a Taft Travel for Research Grant.

References

[1] D. Bhandari, N. Pal, Some new information measures for fuzzy sets, Information Sciences 67 (1993) 209–228.
[2] P. Boer, D. Kroese, S. Mannor, R. Rubinstein, A tutorial on the cross-entropy method, Annals of Operations Research 134 (1) (2005) 19–67.
[3] X. Chen, Cross-entropy of uncertain variables, in: Proceedings of the 9th International Conference on Electronic Business, Macau, November 30–December 4, 2009, pp. 1093–1095.
[4] X. Chen, W. Dai, Maximum entropy principle for uncertain variables, International Journal of Fuzzy Systems 13 (3) (2011) 232–236.
[5] W. Dai, X. Chen, Entropy of function of uncertain variables, Mathematical and Computer Modelling, http://dx.doi.org/10.1016/j.mcm.2011.08.052.
[6] W. Dai, Quadratic entropy of uncertain variables, Information – An International Interdisciplinary Journal, in press.
[7] A. De Luca, S. Termini, A definition of nonprobabilistic entropy in the setting of fuzzy sets theory, Information and Control 20 (1972) 301–312.
[8] I. Good, Maximum entropy for hypothesis formulation, especially for multidimensional contingency tables, The Annals of Mathematical Statistics 34 (3) (1963) 911–934.
[9] R. Haber, R. Toro, O. Gajate, Optimal fuzzy control system using the cross-entropy method. A case study of a drilling process, Information Sciences 180 (14–15) (2010) 2777–2792.
[10] Q.H. Hu, W. Pan, S. An, P.J. Ma, J.M. Wei, An efficient gene selection technique for cancer recognition based on neighborhood mutual information, International Journal of Machine Learning and Cybernetics 1 (1–4) (2010) 63–74.
[11] E. Jaynes, Information theory and statistical mechanics, Physical Review 106 (4) (1957) 620–630.
[12] A. Kaufmann, Introduction to the Theory of Fuzzy Subsets, vol. 1, Academic Press, New York, 1975.
[13] S. Kullback, Information Theory and Statistics, Wiley, New York, 1959.
[14] B. Kosko, Fuzzy entropy and conditioning, Information Sciences 40 (1986) 165–174.
[15] P. Li, B. Liu, Entropy of credibility distributions for fuzzy variables, IEEE Transactions on Fuzzy Systems 16 (1) (2008) 123–129.
[16] B. Liu, Uncertainty Theory, second ed., Springer-Verlag, Berlin, 2007.
[17] B. Liu, Fuzzy process, hybrid process and uncertain process, Journal of Uncertain Systems 2 (1) (2008) 3–16.
[18] B. Liu, Uncertainty Theory: A Branch of Mathematics for Modeling Human Uncertainty, Springer-Verlag, Berlin, 2010.
[19] B. Liu, Some research problems in uncertainty theory, Journal of Uncertain Systems 3 (1) (2009) 3–10.
[20] D. Malyszko, J. Stepaniuk, Adaptive multilevel rough entropy evolutionary thresholding, Information Sciences 180 (7) (2010) 1138–1158.
[21] N. Pal, K. Pal, Object background segmentation using a new definition of entropy, IEE Proceedings – Computers and Digital Techniques 136 (1989) 284–295.
[22] N. Pal, J. Bezdek, Measuring fuzzy uncertainty, IEEE Transactions on Fuzzy Systems 2 (1994) 107–118.
[23] Z. Qin, X. Li, X. Ji, Portfolio selection based on fuzzy cross-entropy, Journal of Computational and Applied Mathematics 228 (1) (2009) 139–149.
[24] R. Rubinstein, D. Kroese, The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation and Machine Learning, Springer, Berlin, 2004.
[25] C. Shannon, The Mathematical Theory of Communication, The University of Illinois Press, Urbana, 1949.
[26] V. Vagin, M. Fomina, Problem of knowledge discovery in noisy databases, International Journal of Machine Learning and Cybernetics 2 (3) (2011) 135–145.
[27] L.J. Wang, An improved multiple fuzzy NNC system based on mutual information and fuzzy integral, International Journal of Machine Learning and Cybernetics 2 (1) (2011) 25–36.
[28] X.Z. Wang, L.C. Dong, J.H. Yan, Maximum ambiguity based sample selection in fuzzy decision tree induction, IEEE Transactions on Knowledge and Data Engineering (2011), http://dx.doi.org/10.1109/TKDE.2011.67.
[29] X.Z. Wang, C.R. Dong, Improving generalization of fuzzy if–then rules by maximizing fuzzy entropy, IEEE Transactions on Fuzzy Systems 17 (3) (2009) 556–567.
[30] X.Z. Wang, J.H. Zhai, S.X. Lu, Induction of multiple fuzzy decision trees based on rough set technique, Information Sciences 178 (16) (2008) 3188–3202.
[31] W.G. Yi, M.G. Lu, Z. Liu, Multi-valued attribute and multi-labeled data decision tree algorithm, International Journal of Machine Learning and Cybernetics 2 (2) (2011) 67–74.
[32] H. Xie, R. Zheng, J. Guo, X. Chen, Cross-fuzzy entropy: a new method to test pattern synchrony of bivariate time series, Information Sciences 180 (9) (2010) 1715–1724.
[33] R. Yager, On measures of fuzziness and negation – Part I: Membership in the unit interval, International Journal of General Systems 5 (1979) 221–229.
[34] L. Zadeh, Probability measures of fuzzy events, Journal of Mathematical Analysis and Applications 23 (1968) 421–427.
[35] J.H. Zhai, Fuzzy decision tree based on fuzzy-rough technique, Soft Computing 15 (6) (2011) 1087–1096.
[36] Q. Zhang, S. Jiang, A note on information entropy measures for vague sets and its applications, Information Sciences 178 (21) (2008) 4184–4191.