Fuzzy Bayes risk based on Mahalanobis distance and Gaussian kernel for weight assignment in labeled multiple attribute decision making

Mingliang Suo, Baolong Zhu, Yanquan Zhang, Ruoming An, Shunli Li∗

School of Astronautics, Harbin Institute of Technology, Harbin 150001, PR China
Article info

Article history: Received 15 November 2017; Revised 30 March 2018; Accepted 1 April 2018; Available online xxx.

Keywords: Weight assignment; Fuzzy Bayes risk; Mahalanobis distance; Gaussian kernel; Labeled MADM; Effectiveness evaluation.
Abstract

Attribute weight assignment plays a key role in multiple attribute decision making (MADM). For the issue of labeled multiple attribute decision making (LMADM), the existing methods of attribute weight determination, which have been well developed for MADM, usually ignore or do not take full advantage of the supervisory function of labels. As a result, the weights produced by these methods may not be ideal in practice. To make up for this deficiency, this paper develops an objective weighting method based on Bayes risk. Specifically, the LMADM problem is first formulated; then a Gaussian kernel based loss function is proposed to cope with the drawback that the loss function in Bayes risk is usually determined by experts. Meanwhile, Mahalanobis distance and a fuzzy neighborhood relationship are employed to measure the fuzziness of the data set. Finally, a number of experiments, including comparison experiments on UCI data and the effectiveness evaluation of a fighter, are carried out to illustrate the superiority and applicability of the proposed method.
1. Introduction

Data systems in practical applications can be broadly divided into two categories: decision systems (DSs) and information systems (ISs). A DS is a data set consisting of conditional attributes and decision attributes, whereas an IS does not include decision attributes, i.e., labels. In practical applications, multiple attribute decision making (MADM) is one of the most important and fundamental issues in the field of DSs, where it can be called labeled multiple attribute decision making (LMADM), because of its significant applications in various fields such as effectiveness evaluation [1], classification [2] and fault diagnosis [3].

Attribute weight assignment is one of the most significant parts of MADM and has been studied in depth from a variety of aspects [4–11]. Generally, the existing approaches can be classified into three categories, i.e., subjective methods [12,13], objective methods [14–16] and hybrid methods [17–20], according to the extent of their dependence on the preferences or subjective judgements of decision makers (DMs) [4]. In practical applications, it is usually quite hard to obtain ideal weights with subjective or hybrid methods when related field experts are lacking or the DMs cannot reach a unanimous conclusion [21,22].
∗ Corresponding author. E-mail addresses: [email protected] (M. Suo), [email protected] (S. Li).
Fortunately, objective weighting methods can solve the above problem effectively, since they generate attribute weights from the data alone without requiring any preference information from the DMs. According to the data systems to which they are applied, the objective methods can be divided into two groups: ISOMs (objective methods for information systems) and DSOMs (objective methods for decision systems). However, to the best of our knowledge, most objective weight assignment methods are aimed at ISs, such as the Entropy method [4,23,24], the Principal Components Analysis (PCA) method [25], the criteria importance through inter-criteria correlation (CRITIC) method [16], the modified TOPSIS method [14], the standard deviation (SD) and mean deviation (MD) methods [26], the correlation coefficient and standard deviation (CCSD) method [15] and some other weight assignment methods for different issues (see, e.g., [27–29] and the references therein). Nevertheless, there are few relevant studies on DSs. Methods such as the grey relation analysis (GRA) [30] approach, the conditional entropy (CE) [31] approach, the rough set (RS) [32,33] approach, the F-score approach [34] and the mutual information approach [35] could be alternatives for the assignment of conditional attribute weights in DSs, because they consider the coupling relationships between the conditional attributes and the decision attribute. All of the above methods for ISs, however, do not consider the contribution of the decision attribute to the determination of conditional attribute weights when they are applied to a DS. The conditional attributes are descriptions of the whole system in some concerned aspects.
Usually, there is only one decision attribute in a DS,¹ which is a generalization of the overall system and an abstraction of all the conditional attributes. Each conditional attribute provides a particular contribution to its system and an individual degree of support to the abstraction given by the decision attribute, which can be depicted as the weight of the conditional attribute. Therefore, for DSs, the determination of conditional attribute weights cannot ignore the role of the decision attribute.

In fact, with regard to MADM, the final decision produced by any decision making unit will be accompanied by some risks. These risks stem from the data distributions of the conditional attributes; consequently, each conditional attribute generates a unique risk for the final decision, which can be measured and used as the attribute weight. However, the existing methods have not taken the decision risk as a main factor in determining the weights of attributes. On the other hand, the current methods may not take into account the fuzziness of the data system, which includes two aspects: the fuzziness among the samples, and the fuzziness between the samples and the decision classes. Therefore, there are two kinds of coupling relations between the samples induced by the conditional attributes and the decision attribute, i.e., the decision risk and the fuzzy membership. Furthermore, with respect to the weight assignment of a multi-layer attribute set, the existing methods usually need the help of experts/DMs, or resort to some complex combination methods [17], which greatly limits the application of weight determination methods in multi-layer index systems. These inadequacies of the present research motivate this work.

To handle the aforementioned issues and overcome the deficiencies of the existing methods, we propose a simple and effective objective attribute weight assignment method (MGFBRW) using a Mahalanobis distance and Gaussian kernel based fuzzy Bayes risk (MGFBR) model, which is applicable not only to ISs and DSs, but also to single-layer and multi-layer index systems. In order to mine the fuzziness of the data system, Mahalanobis distance and a fuzzy neighborhood relationship are employed to generate the fuzzy similarities among samples and the fuzzy memberships between the samples induced by the conditional attributes and the decision classes. Therefore, the Bayes risk model characterized by the aforementioned fuzziness can be called a fuzzy Bayes risk model. The loss function in Bayes risk, however, is usually determined by experts or through a large number of statistical tests, which greatly limits the practical application and extension of Bayes risk theory [36]. In order to cope with this drawback, a novel loss function model based on the Gaussian kernel is proposed. Furthermore, an improved loss function model combining the Gaussian kernel and Mahalanobis distance is designed to determine weights for multi-layer data systems. Subsequently, a number of experiments, including parameter selection tests and comparison experiments, are carried out to illustrate the superiority of the proposed method. Finally, we demonstrate and verify the applicability of the proposed method through the effectiveness evaluation of a fighter. Therefore, the main highlights of this work are:

1) This paper is the first attempt to deal with the problem of labeled multiple attribute decision making.
2) A simple and effective objective attribute weight assignment method named MGFBRW is proposed.
3) A Gaussian kernel loss function model is proposed, which can promote the application and extension of Bayes risk theory.
4) The detailed demonstrations and analyses of fighter effectiveness evaluation have important guiding significance for other similar engineering applications.
¹ In fact, systems with multiple decision attributes can also be transformed into ones with a single decision attribute.
The remainder of this paper is organized as follows. Section 2 introduces the LMADM problem addressed in this work. The basic theories and analyses of the proposed method are presented in Section 3. The results and analyses of the numerical experiments are given in Section 4, and the effectiveness evaluation of a fighter is demonstrated in Section 5. Some discussions follow in Section 6. Finally, conclusions and future work are described in Section 7.

2. LMADM problem

In this section, the LMADM problem related to our work is first formulated, and then the goal of LMADM is analyzed, which paves the way for the development of the following sections.

2.1. Labeled multiple attribute decision making problem

Definition 1 (Decision system) [37]. A decision system is a 4-tuple DS = (U, {A | A = C ∪ D}, {V_a | a ∈ A}, {I_a | a ∈ A}), where U is a finite set of objects called the universe, U = {x_1, x_2, ..., x_m}, A is the attribute set, C is the set of conditional attributes, D is the decision attribute, C ∩ D = ∅, D ≠ ∅, V_a is the set of values of each a ∈ A, and I_a is an information function for each a ∈ A. A decision system is often denoted as DS = (U, A, V, I) or DS = (U, C, D) for short. Specifically, a decision system is called an information system IS = (U, C) if its decision attribute set is empty [38].

The LMADM problem posed on a DS is a special case of MADM. The decision attribute (label) provides an initial rough classification of the whole data system. In MADM, however, we expect to acquire the ordering relationship of the alternatives. Therefore, the specific implication and resolution process of LMADM can be described as follows.

Definition 2 (LMADM). Given a decision system DS = (U, C, D), U = {x_1, x_2, ..., x_m} is the set of alternatives, C = {c_1, c_2, ..., c_n} is the set of conditional attributes, D = {d_1, d_2, ..., d_K} (K ≤ m) is the set of labels associated with the alternatives, W = (w_1, w_2, ..., w_n) generated by some means is the weight vector of C, such that \sum_{j=1}^{n} w_j = 1 and w_j ≥ 0, and V = [v_{ij}]_{m×n} is a decision matrix given by the decision maker, where v_{ij} denotes the preference value of x_i induced by c_j.

It is worth noting that DMs often tend to be dishonest in the process of decision making because of their personal preferences [9], which has become a complex and difficult problem in the study of MADM. In this regard, a number of research results have been reported in the literature (see [6–11]). In this paper, in order to simplify the study, we assume that the DMs are honest and employ objective weight assignment methods to maximize the possibility of avoiding the participation of DMs. The dishonesty topic is not the focus of this paper; the reader interested in this issue may refer to [9,39–42].

Generally, there are four steps in LMADM when objective weight assignment methods are used, which are listed as follows (a short sketch of the whole pipeline follows step 4).

1) Normalization. In order to avoid dimensional effects interfering with decision making, the raw data system should be normalized, where the cost normalization model (Eq. (1)) and the income normalization model (Eq. (2)) are employed if c_j ∈ C is a cost or a benefit attribute [43], respectively.
v_{ij} = \frac{\max_j(v_{ij}) - v_{ij}}{\max_j(v_{ij}) - \min_j(v_{ij})},    (1)
v_{ij} = \frac{v_{ij} - \min_j(v_{ij})}{\max_j(v_{ij}) - \min_j(v_{ij})}.    (2)
2) Weight assignment. This step is the central part of LMADM, since different weights produce different decision orders of the alternatives [9]. In this paper, we focus on objective approaches to weight assignment and generate the conditional attribute weights (w_C) with the aid of the supervisory function of the decision attribute (label).

3) Aggregation. In our work, the frequently used weighted averaging (WA) operator [44] is utilized in this step, which is described as follows
d(x_i) = \mathrm{WA}_w(v_{i1}, v_{i2}, \cdots, v_{in}) = \sum_{j=1}^{n} w_j v_{ij},    (3)

or, in matrix form,

D' = V w_C^T,    (4)
where the superscript T stands for matrix transposition and w_C ∈ R^{1×n}.

4) Ranking of the alternatives. The final step is to sort the alternatives in terms of the decision vector D'(x) = (d(x_1), d(x_2), \cdots, d(x_m)); the sorted result is the rank-order of the alternatives.
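To make these four steps concrete, here is a minimal sketch of the pipeline, assuming NumPy; the function names and the small decision matrix are ours for illustration only, and the weight vector would in practice come from the objective method developed in Section 3.

```python
import numpy as np


def normalize(V, is_cost):
    """Eqs. (1)-(2): column-wise min-max normalization (cost vs. benefit attributes)."""
    V = np.asarray(V, dtype=float)
    vmin, vmax = V.min(axis=0), V.max(axis=0)
    span = np.where(vmax > vmin, vmax - vmin, 1.0)    # guard constant columns
    benefit = (V - vmin) / span                       # Eq. (2), income/benefit model
    cost = (vmax - V) / span                          # Eq. (1), cost model
    return np.where(np.asarray(is_cost, dtype=bool), cost, benefit)


def rank_alternatives(V, weights):
    """Eqs. (3)-(4) and step 4: aggregate with the WA operator, then sort."""
    d = np.asarray(V, dtype=float) @ np.asarray(weights, dtype=float)   # D' = V w_C^T
    return d, np.argsort(-d)                          # scores and descending rank-order


# Illustrative use: 3 alternatives, 2 attributes (the second is a cost attribute).
V_raw = np.array([[2.0, 30.0], [4.0, 10.0], [3.0, 20.0]])
V = normalize(V_raw, is_cost=[False, True])
scores, order = rank_alternatives(V, weights=[0.6, 0.4])
```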
2.2. Goal of LMADM

Definition 3 (Distribution error). Given a decision system DS = (U, C, D) and a certain map function F(·, ·), the distribution error of C is defined as

e_C = \bigcup_{j=1}^{n} F(c_j(v), D) = F(V, D),    (5)

where e_C = (e_1, e_2, ..., e_n), \sum_{j=1}^{n} e_j = 1 and e_j ≥ 0. The distribution error can be regarded as a difference index; on the other hand, it can be regarded as a consistency index if C is beneficial for D. In LMADM, the main goal is to minimize the difference index (or maximize the consistency index) between the original decision attribute and the decision vector produced in step 3) by setting an appropriate weight vector, which can be described as

J = \min F_{dif}(D', D) \quad \text{or} \quad J = \max F_{con}(D', D),    (6)

where F_{dif}(·, ·) denotes the difference map function between D' and D, and F_{con}(·, ·) is the corresponding consistency map function. We take the difference index as an example. Given a weight vector that includes two parts, the weights of the conditional attributes and the weight of the decision attribute, W = (w_C, w_D) = (w_1, w_2, ..., w_n, 1),² where w_C = (w_1, w_2, ..., w_n) and w_D = 1, the objective function can be rewritten according to Eq. (4) as

J = \min F_{dif}(V w_C^T, D) = \min F_{dif}(V, D) W^T = \min e_C w_C^T.    (7)

From Eq. (7) we can see that there are two factors affecting the objective function J, i.e., the weights of the conditional attributes and the distribution error. Without loss of generality, a function G is given such that

w_C = G(e_C).    (8)

With respect to the difference index, the weight vector is usually set as

w_C = 1 - e_C \quad \text{or} \quad w_C = \frac{1}{e_C + \varepsilon},    (9)

where 1 = (1, 1, ..., 1) ∈ R^{1×n} and 0 < ε ≪ e_C. Eq. (9) expresses that the smaller the distribution error between c_j and D is, the greater the weight w_{c_j} will be, and vice versa, which is in line with our natural cognition. The objective function can then be presented as

J = \min e_C G^T(e_C),    (10)

s.t. \sum_{j=1}^{n} e_j = 1, \; e_j \geq 0.    (11)

Therefore, if we can acquire an appropriate distribution error vector, the objective function of LMADM will be optimal. The following section discusses how to extract this distribution error in detail.

² For multiple decision attributes, w_D = (w_1, w_2, ..., w_k), where k is the number of decision attributes, \sum_{i=1}^{k} w_i = 1 and w_i ≥ 0. With regard to the DSs in this paper, there is only one decision attribute, so the weight w_D is equal to 1.

Fig. 1. The distribution of X induced by c with respect to D.

3. Weight assignment based on MGFBRW
In this section, a single-layer data set is taken as an example to analyze how the above-mentioned distribution error presents itself in a DS. Accordingly, the MGFBRW method is designed. Finally, some discussions about weight assignment for multi-layer data sets and information systems are carried out.

3.1. MGFBRW for single-layer DS

Generally, a data-driven attribute weight is determined on a single-layer data set [14,15]. Considering a single-dimensional conditional attribute c ∈ C and D = {d_1, d_2, ..., d_K} in a DS, the distribution of the sample set X (X = {x_1, x_2, ..., x_m}) induced by c with respect to D can be depicted as in Fig. 1, where d_i, d_j, d_k ∈ D, i, j, k ∈ {1, 2, ..., K}, K ≤ m. From Fig. 1 we can see that decision making in region r_1 is bound to generate some risks/errors because of the overlap between different decision classes, i.e., d_i, d_j and d_k. Region r_2, in particular, will produce an even riskier decision due to the inclusion of more than two classes in it.

Neighborhood granulation and fuzzy relations are two effective tools for processing continuous data. Therefore, in order to deal with hybrid data and take into account the fuzziness of samples, we propose a fuzzy neighborhood relationship based on Mahalanobis distance as follows.

Definition 4 (Fuzzy neighborhood). Given an arbitrary x_i ∈ U and c ∈ C, a fuzzy neighborhood N_c(x_i) of x_i in subspace c is defined as
N_c(x_i) = \{x_j \mid x_j \in U, f_c(x_i, x_j) \geq \delta\},    (12)

where δ is a threshold and f(·, ·) is the fuzzy similarity depicted as

f(x_i, x_j) = \exp\left(-\frac{MD(x_i, x_j)}{n}\right),    (13)

where n is the number of conditional attributes that induce sample x_i (n = 1 for the single-layer case), and MD(·, ·) is the Mahalanobis distance [45] described as

MD(x_i, x_j) = \sqrt{(x_i - x_j) C^{-1} (x_i - x_j)^T},    (14)

where C is the covariance matrix and the superscript "−1" denotes the inverse operation.

Remark 1. The main drawback of the Mahalanobis distance is that the inverse of C may not exist under two conditions: 1) the number of rows of the data is less than the number of columns; 2) the standard deviation of the data is 0.

Remark 2. Obviously, the following properties hold for the fuzzy similarity: 1) f(x_i, x_i) = 1; 2) f(x_i, x_j) = f(x_j, x_i); 3) 0 < f(x_i, x_j) ≤ 1.

With the help of the fuzzy neighborhood relationship, we transform the distribution of d_k in Fig. 1 into the neighborhood space, where the basic cells are neighborhood sets instead of individual samples. The distribution of the neighborhood sets N in c with respect to d_k is depicted in Fig. 2, where N_p (N_p ∈ N) is the neighborhood of sample x_p, the red regions denote that the samples in neighborhood N_p are classified into d_k according to the current decision system, and the gray regions are those classified into classes other than d_k, i.e., d_i or d_j in Fig. 1. Moreover, the neighborhood N_0 is not taken into account in the calculation for d_k.

Fig. 2. The distribution of N in c with respect to d_k.

According to the above analyses, the distribution error e_c produced by a conditional attribute c can be taken as the error that measures the difference between the distribution of the current conditional attribute and that of the decision attribute. Bayes risk utilizes a loss function and a probability to estimate the risk of an event; it takes full account of the causal and dependency relations between the conditional attributes and the decision attribute in a DS and has been applied in various applications (see, e.g., [46–48] and the references therein). Consequently, we can derive the principle of attribute weight assignment in a DS: an attribute whose Bayes risk with respect to the decision attribute is lower should be assigned a greater weight. Therefore, the definition of Bayes risk in a DS can be drawn as follows.

Definition 5 (Bayes risk in DS). Given a decision system DS = {U, C ∪ D, V, I}, U = {x_1, x_2, ..., x_m}, D = {d_1, d_2, ..., d_K}, an arbitrary x_i ∈ U induced by an attribute c ∈ C may be divided into any decision class of D by some metric, but it belongs to a certain class d_k ∈ D according to the information function I. Therefore, the Bayes risk of x_i belonging to d_k with respect to c is defined as

R_c(d_k \mid x_i) = \sum_{j=1}^{K} \lambda_j^k(c, x_i) P(d_j \mid x_i),    (15)

where \lambda_j^k(c, x_i) is a loss function that measures the loss relative to its own class d_k when x_i is classified into the possible class d_j, and P(d_j | x_i) is the probability of x_i belonging to d_j.

From Eq. (15) we can see that two factors affect the Bayes risk, i.e., the loss function and the probability. Therein, the loss function takes account of the sample distribution over the conditional attributes, so as to estimate the degree of the sample loss, while the probability is derived from the relationship between the conditional attributes and the decision attribute. Consequently, the above Bayes risk function fully considers the distribution of the data and the coupling relationship between the conditional attributes and the decision attribute, and it is a fusion of ISOMs and DSOMs. Therefore, the Bayes risk function is well suited to assigning attribute weights in LMADM.

The commonly used loss function is the 0–1 function [49–51], but it cannot effectively assess the real loss of decision making. In order to guarantee the effectiveness of the loss function, it is usually determined by experts or through a large number of statistical tests [36], which has been a stumbling block to the application and extension of this theory. From the aspect of the data distribution in Fig. 1, we depict the relationships within the data belonging to one decision class in Fig. 3. From Fig. 3 we can see that the samples x_1 and x_2 are subject to the same distribution (e.g., a Gaussian distribution) with expectation μ. Obviously, the loss of misclassifying x_2 is less than that of misclassifying x_1. Thus, we can employ the relationship between a sample and the expectation to measure the loss of misclassification. Therefore, we propose a loss function based on the Gaussian kernel, in which the loss values of samples are depicted by the distribution characteristics of the data.

Fig. 3. The distribution relationships.

Definition 6 (Gaussian kernel loss function). Given a decision system DS = (U, C ∪ D), c ∈ C, D = {d_1, d_2, ..., d_K}, for an arbitrary sample x_i in U, its designated decision class is d_k and its possible class is d_j produced by some metric, d_j, d_k ∈ D. Then, the Gaussian kernel loss function of x_i relative to c is defined as

\lambda_j^k(c, x_i) = \begin{cases} \exp\left(-\frac{(x_i - \mu_k)^2}{2\sigma_k^2}\right), & k \neq j, \\ 0, & k = j, \end{cases}    (16)

where μ_k is the expectation of the sample set belonging to class d_k with respect to c, σ_k is the corresponding standard deviation, and k, j ∈ {1, 2, ..., K}.
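As a small illustration of Definition 6, the sketch below evaluates the Gaussian kernel loss of Eq. (16) for a one-dimensional attribute. The function name is ours, the sample standard deviation is assumed for σ_k, and the degenerate case of Remark 3(c) is handled explicitly.

```python
import numpy as np


def gaussian_kernel_loss(values, labels, i, possible_class):
    """Eq. (16): loss of assigning sample i to `possible_class` under one attribute."""
    values, labels = np.asarray(values, dtype=float), np.asarray(labels)
    own_class = labels[i]
    if possible_class == own_class:        # k = j: no loss
        return 0.0
    cls = values[labels == own_class]      # samples of the designated class d_k
    mu_k = cls.mean()
    sigma_k = cls.std(ddof=1) if cls.size > 1 else 0.0
    if sigma_k == 0.0:                     # Remark 3(c): all class values equal
        return 1.0
    return float(np.exp(-(values[i] - mu_k) ** 2 / (2.0 * sigma_k ** 2)))
```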
Remark 3. There are three aspects of the Gaussian kernel loss function: (a) if the sample is divided into its own decision class, i.e., k = j, its loss is 0; (b) the nearer the sample is to the expectation, the greater its loss; (c) the loss is 1 if the standard deviation is 0, which means that all the data in this class are equal to each other.

From the above definition we can see that, for a multi-layer data set, the loss function (Eq. (16)) cannot take into account the mutual effect between the attributes; in other words, it does not combine all the attributes in the same layer as a whole, so the above Gaussian kernel loss function is not good at dealing with multi-layer data sets. Fortunately, the Mahalanobis distance considers all the concerned attributes at the same time with the help of the covariance matrix. What's more, both the Gaussian kernel and the Mahalanobis distance take the distribution characteristics as a factor in generating the relationship between samples. Therefore, we can rewrite the above loss function (Eq. (16)) as follows.
Definition 7 (MD based Gaussian kernel loss function). Given a decision system DS = (U, C ∪ D), c ∈ C, D = {d_1, d_2, ..., d_K}, for an arbitrary sample x_i in U, its designated decision class is d_k and its possible class is d_j produced by some metric, d_j, d_k ∈ D. Then, the Mahalanobis distance based Gaussian kernel loss function of x_i relative to c is defined as

\lambda_j^k(c, x_i) = \begin{cases} \exp\left(-\frac{MD(x_i, \mu_k)}{n}\right), & k \neq j, \\ 0, & k = j, \end{cases}    (17)

where MD(·, ·) is the Mahalanobis distance, μ_k is the expectation of the sample set belonging to class d_k with respect to c, k, j ∈ {1, 2, ..., K}, and n is the number of conditional attributes that induce sample x_i (n = 1 for the single-layer case). Usually, we abbreviate the loss function as λ_j^k.

Remark 4. The above definition shows that λ_j^k(c, x_i) = 0 iff k = j. With regard to the two conditions of the MD's shortcoming presented in Remark 1, i.e., that the number of rows of the data is less than the number of columns, and that the standard deviation of the data is 0: the first condition is rare in weight assignment for a single-layer data set; for the second one, the distance should be taken as 0 and the loss as 1, which is the same as condition (c) in Remark 3. What's more, the Mahalanobis distance should be replaced by the Euclidean distance if the inverse of the covariance matrix in MD does not exist.

Definition 8 (Classification probability). Given a decision system DS = (U, C ∪ D), for an arbitrary sample x_i ∈ U induced by c ∈ C, its fuzzy neighborhood is N(x_i) = {x_1, x_2, ..., x_m}, and the corresponding decision set is N(d) = {d_1, d_2, ..., d_p}, N(d) ⊆ D. Then, the probability of x_i being classified to d_j ∈ N(d) is denoted as

P_c(d_j \mid x_i) = \frac{\sum \{f(x_i, x_k) \mid x_k \in d_j\}}{\sum \{f(x_i, x_k) \mid x_k \in N(d)\}},    (18)

where x_k ∈ d_j is determined by the information function in this DS and f(·, ·) is the fuzzy similarity relation. This definition can also be called the fuzzy membership due to the fuzzy similarity relation.

Remark 5. Obviously, the property 0 ≤ P_c(d_j | x_i) ≤ 1 holds. Therein, P_c(d_j | x_i) = 0 iff the following three conditions hold simultaneously: 1) the threshold δ = 0; 2) f(x_i, x_k) = 0; 3) only x_k belongs to d_j in N(x_i). On the other hand, P_c(d_j | x_i) ≠ 0 if the decision class of x_i is d_j according to the corresponding information function.

So far, the proposed method takes full account of the two aspects of fuzziness: one is the fuzzy similarity relation among samples (Definition 4), and the other is the fuzzy membership relation between the sample and the decision class. Therefore, the Bayes risk presented in Definition 5 can be named the Mahalanobis distance and Gaussian kernel based fuzzy Bayes risk (MGFBR), or fuzzy Bayes risk (FBR) for short.

Theorem 1. The fuzzy Bayes risk (Definition 5) defined as R_c(d_k | x_i) = \sum_{j=1}^{K} \lambda_j^k P(d_j | x_i) is equivalent to R_c(d_k | x_i) = \lambda_{\sim k}^k (1 - P(d_k | x_i)), where d_{\sim k} denotes the decision classes other than d_k, and the decision class of x_i is d_k according to the information function.

Proof. It follows from Eq. (17) that the loss function can be rewritten as \lambda_{\sim k}^k = \exp(-MD(x_i, \mu_k)/n) if k ≠ j, and \lambda_k^k = 0 if k = j. Therefore, the risk function can be written as

R_c(d_k \mid x_i) = \sum_{j=1}^{K} \lambda_j^k P(d_j \mid x_i)
= \lambda_1^k P(d_1 \mid x_i) + \cdots + \lambda_k^k P(d_k \mid x_i) + \cdots + \lambda_K^k P(d_K \mid x_i)
= \lambda_{\sim k}^k P(d_1 \mid x_i) + \cdots + \lambda_k^k P(d_k \mid x_i) + \cdots + \lambda_{\sim k}^k P(d_K \mid x_i)
= \lambda_{\sim k}^k \big(P(d_1 \mid x_i) + \cdots + P(d_{k-1} \mid x_i) + P(d_{k+1} \mid x_i) + \cdots + P(d_K \mid x_i)\big) + \lambda_k^k P(d_k \mid x_i)
= \lambda_{\sim k}^k (1 - P(d_k \mid x_i)) + \lambda_k^k P(d_k \mid x_i)
= \lambda_{\sim k}^k (1 - P(d_k \mid x_i)). □

Theorem 1 greatly reduces the computational complexity of the Bayes risk, which is helpful for promoting the proposed method.

Remark 6. Through the above theorem and its corresponding proof, the loss function can be rewritten as \lambda_{\sim k}^k = \exp(-MD(x_i, \mu_k)/n), and \lambda_{\sim k}^k ≠ 0 holds.

Theorem 2. The fuzzy Bayes risk satisfies 0 ≤ R_c(d_k | x_i) < 1.

Proof. According to Theorem 1, the Bayes risk can be written as R_c(d_k | x_i) = \lambda_{\sim k}^k (1 - P(d_k | x_i)), where \lambda_{\sim k}^k = \exp(-MD(x_i, \mu_k)/n), and 0 < \lambda_{\sim k}^k ≤ 1 holds according to Remark 6. On the other hand, the classification probability satisfies 0 < P(d_k | x_i) ≤ 1 according to Definition 8 and Remark 5. Therefore, the theorem holds. □

According to the aforementioned definitions and Eq. (9), we can derive the definition of the attribute weight based on MGFBR as follows.

Definition 9 (MGFBR weight). Given a decision system DS = (U, C ∪ D), U = {x_1, x_2, ..., x_m}, for an arbitrary c ∈ C, its weight based on MGFBR is denoted as

w_c = 1 - R_c, \quad \text{where } R_c = \frac{1}{m} \sum_{i=1}^{m} R_c(d_k \mid x_i), \; x_i \in U, \; d_k \in D.    (19)

It is easy to see that 0 ≤ w_c < 1 according to Theorem 2. Thus, the weight vector of the DS is W = (w_1, w_2, ..., w_n), where each weight is normalized as w_c = w_c / \sum_{c=1}^{n} w_c, and 0 ≤ w_c < 1 still holds.
3.2. Extension of MGFBRW

In practical applications, multi-layer index systems are commonly used in MADM. Generally, the weight determination of a multi-layer DS depends on subjective approaches or on some complex combination approaches [17], while the method proposed in this paper can easily obtain the weights of the conditional attributes in each layer with the help of the Mahalanobis distance and the fuzzy neighborhood relationship. The main difference is that the conditional attribute c ∈ C is replaced by an attribute subset B ⊆ C. However, in this case, the first condition of MD's drawback may occur, namely that the number of rows is less than the number of columns. Then the MD based Gaussian kernel loss function degenerates to the form of Eq. (16), and the total loss is \lambda_{\sim k}^k / L, where L is the column number of B.

On the other hand, the IS is another commonly used kind of data system. Compared with a DS, an IS does not contain a decision attribute. In order to deal with weight assignment in an IS, a combination of MGFBR and some clustering method is an alternative, where the clustering method provides labels that act as the decision attribute. This part is not the focus of this paper.

Therefore, MGFBR is also applicable to weight assignment for multi-layer data sets and information systems. What's more, all the theorems still hold in the cases of multi-layer data sets and ISs.

3.3. MGFBRW algorithm

Based on the preceding theories and definitions, the algorithm of MGFBRW is designed as follows (shown in Algorithm 1).

Algorithm 1 MGFBRW.
Input: the m × n raw data and the m × 1 labels in DS = (U, C ∪ D); threshold δ.
Output: weight vector W.
1:  for each c (or B) in C do    // c ∈ R^{m×1}, B ∈ R^{m×k} (k ≤ n)
2:    R_c = 0 (or R_B = 0)
3:    for each pair x_i, x_j in U induced by c (or B) do
4:      calculate the Mahalanobis distance MD(x_i, x_j) according to Eq. (14)
5:    end for
6:    obtain the fuzzy similarity matrix F (f(x_i, x_j) ∈ F) according to Eq. (13)
7:    for each x_i in U induced by c (or B) do
8:      NL(x_i) = ∅    // the neighborhood labels of x_i
9:      NF(x_i) = ∅    // the neighborhood fuzzy similarities of x_i
10:     for each x_j in U induced by c (or B) do
11:       if f(x_i, x_j) ≥ δ then
12:         add the label d(x_j) to NL(x_i)
13:         add f(x_i, x_j) to NF(x_i)
14:       end if
15:     end for
16:     if there exist heterogeneous labels in NL(x_i) then
17:       compute P(d(x_i) | x_i) from NF(x_i) according to Eq. (18)
18:       compute λ(x_i) according to Eq. (17) and Remark 6
19:       R(x_i) = λ(x_i) · (1 − P(d(x_i) | x_i))
20:       R_c = R_c + R(x_i) (or R_B = R_B + R(x_i))
21:     end if
22:   end for
23:   w_c = 1 − R_c/m (or w_B = 1 − R_B/m)
24: end for
25: normalize: w_c = w_c / \sum_{c=1}^{n} w_c (and similarly for w_B)
26: return W assembled from the w_c (or w_B)
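To make the pseudocode concrete, the following is a minimal NumPy sketch of Algorithm 1 for the single-layer case; an attribute subset B ⊆ C (Section 3.2) can be passed as a multi-column slice in the same way. The function and variable names are ours, not the paper's; a singular covariance is handled here by the Euclidean fallback of Remark 4 rather than the Eq. (16) degeneration, and every decision class is assumed to contain at least two samples.

```python
import numpy as np


def inverse_covariance(X):
    """Inverse covariance of the rows of X; identity (Euclidean) if singular."""
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    try:
        return np.linalg.inv(cov)
    except np.linalg.LinAlgError:
        return np.eye(cov.shape[0])


def mahalanobis(u, v, inv_cov):
    """MD(u, v) for a given inverse covariance matrix, Eq. (14)."""
    d = np.atleast_1d(u - v)
    return float(np.sqrt(max(d @ inv_cov @ d, 0.0)))


def subset_risk(X, labels, delta):
    """Mean fuzzy Bayes risk of one attribute (or attribute subset B)."""
    labels = np.asarray(labels)
    X = np.asarray(X, dtype=float).reshape(len(labels), -1)
    m, n_attr = X.shape
    inv_cov = inverse_covariance(X)
    # Fuzzy similarity matrix, Eq. (13).
    F = np.array([[np.exp(-mahalanobis(X[i], X[j], inv_cov) / n_attr)
                   for j in range(m)] for i in range(m)])
    total = 0.0
    for i in range(m):
        neigh = F[i] >= delta                       # fuzzy neighborhood, Eq. (12)
        if np.unique(labels[neigh]).size < 2:       # homogeneous labels: no risk
            continue
        own = labels[i]
        p_own = F[i, neigh & (labels == own)].sum() / F[i, neigh].sum()   # Eq. (18)
        cls = X[labels == own]                      # class of x_i (assumed >= 2 samples)
        loss = np.exp(-mahalanobis(X[i], cls.mean(axis=0),
                                   inverse_covariance(cls)) / n_attr)     # Eq. (17)
        total += loss * (1.0 - p_own)               # Theorem 1
    return total / m


def mgfbrw_weights(data, labels, delta=0.8):
    """One weight per column of `data` (Definition 9), normalized to sum to 1."""
    data = np.asarray(data, dtype=float)
    raw = np.array([1.0 - subset_risk(data[:, j], labels, delta)
                    for j in range(data.shape[1])])
    return raw / raw.sum()
```

Calling mgfbrw_weights on a normalized m × n matrix returns the single-layer weight vector; calling subset_risk on the columns of each higher-layer group and applying Definition 9 gives the layer weights, as in Section 5.2.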
Table 1
The details of UCI data.

ID  Data      Samples  Feature  Discrete  Continuous  Class
1   Banknote  1372     4        0         4           2
2   Blood     748      4        0         4           2
3   Ecoli     336      7        2         5           8
4   Iris      150      4        0         4           3
5   Seeds     210      7        0         7           3
6   Vote      435      16       16        0           2
7   Wdbc      569      30       0         30          2
8   Wine      178      13       0         13          3
4. Numerical experiments

In this section, we carry out two groups of experiments to reveal the superiority of MGFBRW. The first group contains the selection tests for the parameter δ (in Eq. (12)), and the other group includes some comparison experiments. It is worth noting that, unfortunately, there is no universally acknowledged approach to evaluating the performance of a weight assignment method. We note that the classification accuracy of a DS can effectively measure the importance of conditional attributes and has been widely used in feature selection and attribute reduction [32], and that the Pearson Correlation Coefficient (PCC) [52] is an effective means to measure the similarity between spatial vectors. Therefore, we employ some classification accuracy evaluation methods as references and PCC as the tool to measure the pros and cons of attribute weight assignment methods. With the help of Weka,³ we employ as many as ten classifiers and 10-fold cross-validation in order to guarantee that the results are highly credible. The ten classification algorithms taken from Weka are C4.5 (J48), REPTree, NaiveBayes, SVM (SMO), IBk, Bagging, LogitBoost, FilteredClassifier, JRip and PART, and the default Weka values are chosen for their parameters. The weights produced by classification accuracies are the average values over the ten methods. All of the following experiments are run on the same platform and implemented in Matlab.

4.1. Parameter selection

There is only one parameter in the MGFBRW method, i.e., the fuzzy neighborhood threshold δ. In this subsection, we design some tests on UCI⁴ data to demonstrate a good range for the parameter δ; the details of the UCI data are shown in Table 1, where "Feature" is the number of conditional attributes in the DS and "Class" is the number of decision categories. The theoretical range of the parameter δ is [0, 1]; we set the step length to 0.05 and take the PCC between the weights assigned by MGFBRW and the classification accuracies as the evaluation metric. The test results for the single-layer attribute set are shown in Fig. 4. From this figure we can see that, for the selected data, satisfactory correlation coefficients are obtained over the wide range 0.25–0.95, i.e., the weights determined by δ in [0.25, 0.95] maintain good consistency with the classification accuracies. With regard to the multi-layer data set, we take each of the selected UCI data sets as a higher-layer index, and the features in each data set as the basic-layer indexes. Then, we compute the risk of each data set and obtain the weights of the higher-layer indexes. The same range of δ and the PCC are employed in this test. The test results for the multi-layer attribute sets are shown in Fig. 5. The results in this figure show that the optimum range of δ is smaller than that for the single-layer case, namely from 0.85 to 0.95.

³ http://weka.wikispaces.com, v3.6.13.
⁴ http://archive.ics.uci.edu/ml/datasets.html.
Fig. 4. The δ test results for the single-layer attribute set.

Fig. 5. The δ test results for the multi-layer attribute set.

4.2. Comparison experiments

In this subsection, we select some commonly used objective weight determination methods, together with some mature and recent methods that can produce weights for a DS, as comparisons to illustrate the advantage of the proposed method. Therein, the objective methods are the Entropy method [4], CRITIC [16], SD [26] and CCSD [15], and the other methods are GRA [30], CE [31], NRS [32], FDAF [34] and DRJMIM [35]. In the GRA model, we take the decision attribute as the optimal sequence to measure the importance degrees of the conditional attributes, which can be considered as the weights of the attributes. In the CE model, we take the conditional entropy of the decision attribute with regard to each conditional attribute as the metric to produce the weights of the conditional attributes: the smaller the conditional entropy is, the greater the weight that should be assigned. In addition, because the discretization method plays an important role in CE and DRJMIM, we select the SMDNS method as the discretization tool, which has the best performance compared with other models in the literature [53]. In the NRS model, the dependencies are employed to generate the weights of the attributes, and the neighborhood threshold is 0.1. The FDAF and DRJMIM methods are the latest ones, producing weights by the F-score and mutual information, respectively. The parameter δ is 0.8 in MGFBRW. In addition, the classification accuracy and the PCC are also employed to evaluate the performance of each method, and the UCI data in Table 1 are used in these experiments.

The comparison results are shown in Table 2, where the bold values indicate the maximum ones. From the results in Table 2 we can see that most of the results obtained by the MGFBRW method are the best ones; in other words, the correlation coefficients on most data sets reach the maximum values. Apart from the NRS and DRJMIM methods, the results produced by the other compared methods all contain negative correlation values, which indicates that there are greater deviations between the weights produced by those methods and those of the classification accuracies. It is noteworthy that there is no result for Vote with regard to the NRS method, because there are always some overlap regions between the classes of each discrete conditional attribute and those of the decision attribute, which produces an empty positive region and a zero-weight result in the NRS model; this problem is the main drawback of the NRS method. What's more, there is also no result for Ecoli regarding the FDAF method, because the mean values of some attributes induced by the decision classes are equal to each other, which is an inadequacy of FDAF.

5. Effectiveness evaluation of fighter

In this section, an application of LMADM related to the effectiveness evaluation of a fighter is carried out to demonstrate the validity of the proposed method. The effectiveness evaluation of a fighter is one of the main approaches to measuring the capability of a fighter to accomplish given tasks, and it can be applied to fighter design, combat simulation, military might comparison, etc. The methods of fighter effectiveness evaluation include the analytic hierarchy process (AHP) [54], availability-dependability-capability (ADC) [55], the synthesized index method [56], fuzzy evaluation (FE) [57] and multiple attribute decision analysis (MADA) [58], etc. Therein, MADA, a simple and practical method, relies only on the characteristics of the data in the index system to obtain the evaluation results, and it can also be considered as MADM.

5.1. Index system of fighter

The index system of a fighter consists of two parts, i.e., the basic performance (BP) index set and the maneuver performance (MP) index set, and there are some sub-attributes in the two sets. The index system can be considered as a multi-layer system, shown in Table 3. We sort out the index data of some typical fighters [57], shown in Tables 9–10 in the Appendix, which are classified into four categories according to the generational criteria [59].

5.2. Weight assignment based on MGFBRW

We demonstrate the calculation process of MGFBRW on the above index system. For single-layer attribute weight determination, we take the basic performance index set as an example. Before the weight assignment, the original data should be normalized, where the cost normalization model (Eq. (1)) is used for the second attribute, i.e., take-off wing load, because the smaller the wing load is, the better the maneuverability in air combat, while the income normalization model (Eq. (2)) is employed for the other attributes.
Table 2
The PCCs between each method and the weights produced by classification accuracy.

ID  Data      Entropy  CRITIC   SD       CCSD     GRA      CE       NRS     FDAF     DRJMIM  MGFBRW
1   Banknote  −0.0309  0.1142   0.5692   −0.2257  −0.7469  0.2135   0.9211  0.9222   0.9478  0.9723
2   Blood     −0.8336  −0.3548  0.0278   0.5557   −0.5322  −0.3677  0.7279  −0.2211  0.5662  0.9941
3   Ecoli     −0.7523  0.9030   0.8135   0.4204   −0.5871  0.4058   0.5838  –        0.0626  0.8685
4   Iris      0.9817   0.9280   0.9855   −0.2401  0.9834   0.6054   0.9617  0.9692   0.8534  0.9951
5   Seeds     0.9041   0.8916   0.9709   −0.9399  −0.6036  0.4256   0.9053  0.8812   0.6188  0.9937
6   Vote      0.0631   −0.6436  −0.2194  0.1375   −0.2304  −0.2443  –       0.7718   0.0066  0.9183
7   Wdbc      0.3340   0.7840   0.5685   −0.4151  −0.7614  0.6076   0.7914  0.8658   0.9266  0.9923
8   Wine      0.6350   0.5658   0.4810   −0.0110  −0.6843  0.7167   0.4570  0.8181   0.8628  0.9745
Table 3
The index system of fighter.

ID   BP (C1)                               Unit
C11  Thrust                                10 N
C12  Take-off wing load                    kg/m2
C13  Basic range                           km
C14  Maximum range                         km
C15  Ceiling                               m
C16  Maximum speed                         km/h
C17  Maximum permissible indicated speed   km/h

ID   MP (C2)                               Unit
C21  Thrust-to-weight ratio                –
C22  Maximum using overload                g
C23  Maximum hover overload                g
C24  Specific excess power                 m/s
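For reference, the two-layer index system of Table 3 can be written down as a plain data structure; the attribute names and units below are taken from Table 3 (the thrust-to-weight ratio is dimensionless), while the nesting and the constant names are ours.

```python
# Higher layer -> {index ID: (name, unit)}; the column lists give the subsets B of Section 3.2.
FIGHTER_INDEX_SYSTEM = {
    "BP (C1)": {
        "C11": ("Thrust", "10 N"),
        "C12": ("Take-off wing load", "kg/m2"),
        "C13": ("Basic range", "km"),
        "C14": ("Maximum range", "km"),
        "C15": ("Ceiling", "m"),
        "C16": ("Maximum speed", "km/h"),
        "C17": ("Maximum permissible indicated speed", "km/h"),
    },
    "MP (C2)": {
        "C21": ("Thrust-to-weight ratio", "-"),
        "C22": ("Maximum using overload", "g"),
        "C23": ("Maximum hover overload", "g"),
        "C24": ("Specific excess power", "m/s"),
    },
}

BP_COLUMNS = list(FIGHTER_INDEX_SYSTEM["BP (C1)"])   # ["C11", ..., "C17"]
MP_COLUMNS = list(FIGHTER_INDEX_SYSTEM["MP (C2)"])   # ["C21", ..., "C24"]
```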
Table 4
The weights of the basic performance index set.

Method   C11     C12     C13     C14     C15     C16     C17
Entropy  0.2814  0.0497  0.1128  0.1174  0.1235  0.1142  0.2011
CRITIC   0.1211  0.2608  0.1083  0.1208  0.1326  0.1271  0.1294
SD       0.1532  0.1110  0.1110  0.1254  0.1612  0.1636  0.1748
CCSD     0.1314  0.2409  0.1355  0.1477  0.1260  0.1238  0.0946
GRA      0.1572  0.1101  0.1382  0.1436  0.1377  0.1543  0.1589
CE       0.1411  0.1599  0.1801  0.1263  0.1398  0.1263  0.1263
NRS      0.3333  0.0370  0.0370  0.1852  0.0370  0.0741  0.2963
FDAF     0.1503  0.0665  0.0665  0.0880  0.1105  0.1969  0.3212
DRJMIM   0.1307  0.2039  0.2698  0.1447  0.0654  0.0928  0.0928
MGFBRW   0.1546  0.1331  0.1331  0.1397  0.1374  0.1495  0.1507
CA       0.2037  0.1061  0.1099  0.0954  0.1224  0.1917  0.1704
Table 5
The weights of the maneuver performance index set.

Method   C21     C22     C23     C24
Entropy  0.2431  0.1459  0.3297  0.2813
CRITIC   0.1795  0.3306  0.1886  0.3013
SD       0.2283  0.2360  0.2624  0.2733
CCSD     0.1532  0.3365  0.1633  0.3470
GRA      0.2507  0.2202  0.2508  0.2784
CE       0.2227  0.2227  0.3009  0.2536
NRS      0.2917  0.1250  0.2083  0.3750
FDAF     0.2468  0.0365  0.2198  0.4968
DRJMIM   0.1965  0.1965  0.3273  0.2796
MGFBRW   0.2590  0.2307  0.2409  0.2693
CA       0.2780  0.1290  0.2393  0.3538
We take the index value of the thrust of MiG-21 (ID: 7, Generation: 2) as an example to demonstrate the calculation process. Firstly, we calculate the Mahalanobis distance between each pair of samples induced by the index C11. Subsequently, the fuzzy similarity degrees related to MiG-21 can be obtained, and then the fuzzy neighborhood set and the corresponding fuzzy similarity set are N_C11(x_7) = {x_7, x_9, x_10, x_13} and NF_C11(x_7) = {1, 0.8759, 0.9187, 0.9423}, respectively, if δ is 0.8. Thereafter, the classification probability (fuzzy membership degree) of x_7 can be calculated according to Eq. (18), which is

P_{C_{11}}(d_2 \mid x_7) = \frac{1 + 0.8759 + 0.9187}{1 + 0.8759 + 0.9187 + 0.9423} = 0.7478.    (20)

Then, the loss of x_7 is obtained in terms of Definition 7 and Remark 6, which is

\lambda_{\sim d_2}^{d_2}(C_{11}, x_7) = \exp(-MD(0.1700, 0.2102)/1) = 0.3832,    (21)

where 0.2102 is the average value of the samples induced by C11 with respect to the 2nd generation. After that, the risk of x_7 induced by C11 can be produced as follows according to Definition 5 and Theorem 1:

R_{C_{11}}(d_2 \mid x_7) = \lambda_{\sim d_2}^{d_2}(C_{11}, x_7) \cdot (1 - P_{C_{11}}(d_2 \mid x_7)) = 0.3832 \times 0.2522 = 0.0966.    (22)

Finally, we can obtain all the normalized risks of the conditional attributes, i.e., R_C1 = (0.1277, 0.2490, 0.2490, 0.2115, 0.2133, 0.1563, 0.1498), and the normalized weights are W_C1 = (0.1546, 0.1331, 0.1331, 0.1397, 0.1374, 0.1495, 0.1507) according to Definition 9. What's more, the normalized weights of the maneuver performance indexes are W_C2 = (0.2590, 0.2307, 0.2409, 0.2693).
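As a quick arithmetic check, the lines below reproduce Eqs. (20) and (22) from the similarity values quoted above, taking the loss value of Eq. (21) as given.

```python
sims_d2 = [1.0, 0.8759, 0.9187]             # neighbours of x_7 labelled d_2
sims_all = [1.0, 0.8759, 0.9187, 0.9423]    # all neighbours of x_7
p = sum(sims_d2) / sum(sims_all)            # Eq. (20): ~0.7478
risk = 0.3832 * (1.0 - p)                   # Eq. (22): ~0.0966
print(round(p, 4), round(risk, 4))
```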
Table 6
The comparison results of weight assignment.

Index  Entropy  CRITIC   SD      CCSD     GRA     CE       NRS     FDAF    DRJMIM   MGFBRW
BP     0.7278   −0.3144  0.7316  −0.5026  0.7294  −0.4364  0.5803  0.6759  −0.4944  0.9243
MP     0.6762   −0.2648  0.5632  −0.0737  0.9828  0.2276   0.9829  0.9664  0.3803   0.9624
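As an additional check, the BP entry of the MGFBRW column in Table 6 can be approximately reproduced from the MGFBRW and CA rows of Table 4 (the small gap comes from the rounding of the tabulated weights).

```python
import numpy as np

mgfbrw_bp = [0.1546, 0.1331, 0.1331, 0.1397, 0.1374, 0.1495, 0.1507]   # Table 4, MGFBRW row
ca_bp     = [0.2037, 0.1061, 0.1099, 0.0954, 0.1224, 0.1917, 0.1704]   # Table 4, CA row
pcc = np.corrcoef(mgfbrw_bp, ca_bp)[0, 1]   # ~0.92, cf. 0.9243 reported in Table 6
```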
Table 7
The rank-order of the basic performance index weights.

Method   Rank-order
Entropy  C11 > C17 > C15 > C14 > C16 > C13 > C12
CRITIC   C12 > C15 > C17 > C16 > C11 > C14 > C13
SD       C17 > C16 > C15 > C11 > C14 > C12 = C13
CCSD     C12 > C14 > C13 > C11 > C15 > C16 > C17
GRA      C17 > C11 > C16 > C14 > C13 > C15 > C12
CE       C13 > C12 > C11 > C15 > C14 > C16 = C17
NRS      C11 > C17 > C14 > C16 > C12 = C13 = C15
FDAF     C17 > C11 > C16 > C15 > C14 > C12 = C13
DRJMIM   C13 > C12 > C14 > C11 > C16 = C17 > C15
MGFBRW   C11 > C17 > C16 > C14 > C15 > C13 = C12
CA       C11 > C16 > C17 > C15 > C13 > C12 > C14
Table 8
The rank-order of the maneuver performance index weights.

Method   Rank-order
Entropy  C23 > C24 > C21 > C22
CRITIC   C22 > C24 > C23 > C21
SD       C24 > C23 > C22 > C21
CCSD     C24 > C22 > C23 > C21
GRA      C24 > C23 > C21 > C22
CE       C23 > C24 > C21 > C22
NRS      C24 > C21 > C23 > C22
FDAF     C24 > C21 > C23 > C22
DRJMIM   C23 > C24 > C21 = C22
MGFBRW   C24 > C21 > C23 > C22
CA       C24 > C21 > C23 > C22
Fig. 6. The statistical distribution of C11 data.

Fig. 7. The statistical distribution of C12 data.

Similarly, for the multi-layer index system, the fuzzy Bayes risks of the two index sets are obtained as 6.2070 and 0.7363, respectively, with the parameter δ set to 0.9. The weights of the two index sets are then generated as 0.4173 and 0.5827. From these results we can see that the weights determined by MGFBRW indicate that the maneuver performance indexes (C2) are more important than the basic performance indexes (C1), which conforms to the standard of fighter evaluation in reality.

5.3. Comparison experiments

In this subsection, we also take the aforementioned weight assignment methods of Section 4.2 to compare and analyze the practicability of the MGFBRW method. The weight determination results of the basic performance index set and the maneuver performance index set are shown in Tables 4 and 5. Therein, CA is the weight determination method based on the classification accuracies produced by the ten classification methods in Section 4.2. In addition, the PCCs between the weights assigned by the ten methods and those of the classification accuracies are shown in Table 6, where the values in bold are the maximum ones. Table 6 shows that the weights obtained by the MGFBRW method are strongly correlated with the reference weights determined by classification accuracies, since the corresponding PCCs exceed 0.9. For further comparative analysis, we rank the weights determined by the eleven methods in descending order; the results are shown in Tables 7 and 8.

In order to fully compare and explain the results, we analyze the weight assignment from two aspects, i.e., the distribution of the data and the practical meanings of the attributes. For simplicity, we take the basic performance index set as an example for the first aspect and the maneuver performance index set for the second one. From the results in Table 7 we can see that there are some differences between the rank-orders of the weights obtained by the compared methods. In order to analyze the reasons for these differences more accurately and clearly, we visualize the basic performance index data through the statistical distributions produced by the generations (shown in Figs. 6–12). Therein, the boxes indicate the statistical characteristics: the black lines mark the maximum and minimum values, the blue boxes cover the range between the 25th and 75th percentiles, the red lines are the median values, and the red cross symbols are the abnormal data. From the statistical distributions in Figs. 6–12 we can see that, compared with the others, there are obvious discriminations in the distributions of C11, C16 and C17, and it is just the opposite for C12 and C13. In other words, we can easily distinguish the fighters according to the distributions of C11, C16 and C17 rather than those of C12 and C13. Therefore, the attributes C11, C16 and C17 should be assigned greater weights, and the weights of C12 and C13 should be less than the others'. According to the above analyses, the results (see Table 7) produced by CRITIC, CCSD, CE and DRJMIM have lower credibility, which can also be verified by the results of Table 6, i.e., the PCCs of CRITIC, CCSD, CE and DRJMIM are negative.
Fig. 8. The statistical distribution of C13 data.

Fig. 9. The statistical distribution of C14 data.

Fig. 10. The statistical distribution of C15 data.

Fig. 11. The statistical distribution of C16 data.

Fig. 12. The statistical distribution of C17 data.
On the other hand, for the maneuver performance indexes, the attribute C24, specific excess power, has been recognized as the most important parameter for measuring the operational effectiveness of a fighter, because some other indexes, e.g., longitudinal acceleration, climb rate, stable hover performance and ceiling, are closely related to it [57]. Therefore, C24 should be assigned the greatest weight. The attribute C21, i.e., the thrust-to-weight ratio, is considered a relatively important index for the combat effectiveness evaluation of a fighter, since it directly affects the maneuverability of the fighter [57], and it should also be assigned a greater weight. What's more, the significance of C23 is greater than that of C22 in the combat effectiveness evaluation of a fighter; thus, the weight of C23 should be greater than C22's. Therefore, the rank-order of the MP indexes should be C24 > C21 > C23 > C22. In summary, the weights assigned by MGFBRW are more reasonable and explanatory with respect to the above comparison experiments and the corresponding analyses.
Fig. 13. The basic effectiveness of the fighters.

Fig. 14. The maneuver effectiveness of the fighters.

Fig. 15. The total effectiveness of the fighters.
5.4. Effectiveness evaluation

In this subsection, we take the fighters MiG-29, MiG-31, F-15A, F-16A, F/A-18A, Mirage 2000-5 and Tornado, all belonging to the 4th generation, to demonstrate the combat effectiveness evaluation of fighters. The effectiveness results are depicted in Figs. 13–15. From the results in Fig. 13 we can see that the basic effectiveness of MiG-31 reaches 0.8173, which is the maximum compared with the other fighters. However, the opposite holds in Fig. 14, where the maneuver effectiveness of MiG-31 is the minimum. The reason is that most of the basic performances of MiG-31 are better than those of the other fighters, whereas its maneuver performances are so poor that they result in the lowest maneuver effectiveness. On the other hand, with the help of the reasonable weights of the two index sets, the total effectiveness is more in line with the actual situation than the two sub-effectiveness values (see Figs. 13–15). From the results in Fig. 15 we can see that the fighter F-15A has the maximum effectiveness value, i.e., 0.8106, and Tornado has the minimum one. Therefore, we can conclude that the above fighters fall into two levels: one level includes four fighters, i.e., F-15A, MiG-29, F-16A and Mirage 2000-5, while MiG-31, F/A-18A and Tornado belong to the second level, which is consistent with the actual situation and the results in the literature [56–58]. Therein, F-15A is recognized as an outstanding fighter, and F-16A was designed to be the partner of F-15A, whose performance is lower than that of F-15A. The three fighters MiG-29, F-16A and Mirage 2000-5 are considered to be at the same level. With regard to these three fighters, the thrust and maximum speed of MiG-29 are larger than those of F-16A, but its take-off wing load is also larger than F-16A's. The power system is the weak point of the Mirage 2000-5; however, its take-off wing load and maximum speed make its combat performance comparable to that of F-16A. The design objectives of MiG-31 were high speed and strong firepower, which reduce the air combat capability of MiG-31. F/A-18A is a carrier-based aircraft, whose combat performance is certainly not as good as that of the other fighters of the same grade. Compared with the other fighters, many performances of Tornado are worse than those of the others, as can be seen in Tables 9–10.

6. Discussion

Based on the above experiments, we provide some further discussions as follows. From the results of the comparison experiments we can see that, by considering the relationships between the conditional attributes and the decision attribute in a DS, i.e., the decision risk and the fuzzy membership, the weights produced by MGFBRW are better than the others'. Some shortcomings of the compared methods can be listed as follows. (a) The methods Entropy, CRITIC, SD and CCSD do not take account of the above relationships. (b) Data-driven weight assignment is usually based on the assumption of attribute independence, so that the weights can be extracted from a statistical point of view; GRA ignores this assumption by considering the maximum and minimum differences between all the conditional attributes and the decision attribute, and it results in some unsatisfactory weights. (c) CE and DRJMIM are greatly affected by the discretization method:
different discretization methods generate different partitions of the conditional attributes, which will result in various weights. (d) The dependencies of the conditional attributes with respect to the decision attribute are employed in NRS, which may result in bad weights (see Table 2). (e) The main drawback of FDAF lies in its negligence of the case where the class mean values are equal. (f) Every sword has two edges: there is an unavoidable drawback in our method, namely that the inverse of the covariance matrix may not exist under some special conditions; however, we have provided some remedial measures and corresponding explanations in this paper.

As mentioned in Eq. (10), the distribution error/consistency index is the only variable that affects the objective function in LMADM. An appropriate objective weight assignment method should extract a reasonable distribution error/consistency index from the concerned data system. In the proposed MGFBRW method, the coupling and distribution information of the data (see Figs. 2 and 3), namely the above-mentioned decision risk and fuzzy membership, are fully and adequately taken into account to extract a more appropriate distribution error e_C. Therefore, our MGFBRW method is a more effective weight assignment approach for LMADM, which has been verified in the comparison experiments and in the effectiveness evaluation of a fighter. Furthermore, the proposed method can also be applied to other decision making issues, such as hesitant fuzzy preference relations [60], incomplete hesitant fuzzy preference relations [61], linguistic preference relations [62] and other relations in group decision making [29,63,64].

7. Conclusion and future work

Labeled multiple attribute decision making (LMADM) is considered a simple and effective approach for decision applications (e.g., the effectiveness evaluation of a fighter), in which weight assignment plays a significant role. In order to obtain more reliable weights, this paper proposes an objective weight assignment method based on fuzzy Bayes risk, named MGFBRW. Meanwhile, two kinds of relations in LMADM are discussed: the fuzzy membership relation/distribution relation and the decision risk relation/coupling relation between the conditional attributes and the decision attribute, both of which can be regarded as the distribution error.
7. Conclusion and future work

Labeled multiple attribute decision making (LMADM) is a simple and effective approach for decision applications (e.g., the effectiveness evaluation of fighters), in which weight assignment plays a significant role. In order to obtain more reliable weights, this paper proposes an objective weight assignment method based on fuzzy Bayes risk, named MGFBRW. Two kinds of relations in LMADM are discussed: the fuzzy membership relation/distribution relation and the risk decision relation/coupling relation between the conditional attributes and the decision attribute, both of which can be regarded as distribution error.

Firstly, the LMADM problem and its objective function are formulated, and the MGFBRW method follows. Mahalanobis distance and the fuzzy neighborhood relationship are employed to measure the two aspects of fuzziness in the data set, i.e., the fuzzy similarity and the fuzzy membership. A Mahalanobis distance based Gaussian kernel loss function is proposed to make up for the deficiency of Bayes risk (a minimal sketch of such a kernel is given after this section). Subsequently, the problems of weight determination for a multi-layer attribute set and for an IS are discussed. Then, the MGFBRW algorithm is designed. Finally, a large number of experiments are carried out, including the parameter selection tests, the comparison experiments on UCI data sets, and the effectiveness evaluation of fighters. The experimental results and discussions show that the proposed MGFBRW method is not only good at dealing with single-layer or multi-layer DSs, but can also be extended to cope with the weight assignment issue in ISs. Compared with other weight assignment methods, the weights produced by MGFBRW are more reasonable and have greater correlation coefficients with those determined by classification accuracies.

In our work, the hypothesis of honest decision makers is adopted. However, decision makers are often dishonest because of their preferences, so we will focus on this issue in our future work. In addition, we put forward the LMADM formulation and design the corresponding solution, i.e., MGFBRW; the comparison experiments and analyses demonstrate the effectiveness of our approach. Nevertheless, a complete theoretical proof of the above process remains a key and challenging problem, which we will also address in future work.
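As a supplement to the summary above, the following minimal sketch (our own illustration, not the exact loss function defined in this paper) shows how a Gaussian kernel can be induced by a Mahalanobis distance. The bandwidth parameter sigma, the use of a pseudo-inverse covariance matrix, and the toy data are assumptions made only for this example.

```python
import numpy as np

def gaussian_mahalanobis_kernel(x, y, inv_cov, sigma=1.0):
    """Gaussian kernel exp(-d_M(x, y)^2 / (2 * sigma^2)), where d_M is the
    Mahalanobis distance induced by inv_cov (an inverse or pseudo-inverse
    covariance matrix)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    sq_dist = float(diff @ inv_cov @ diff)   # squared Mahalanobis distance
    return float(np.exp(-sq_dist / (2.0 * sigma ** 2)))

# Toy usage with made-up data: 4 samples described by 3 attributes.
data = np.array([[1.0, 2.0, 0.5],
                 [1.2, 1.8, 0.7],
                 [3.0, 0.5, 2.0],
                 [2.8, 0.7, 1.9]])
inv_cov = np.linalg.pinv(np.cov(data, rowvar=False))
print(gaussian_mahalanobis_kernel(data[0], data[1], inv_cov, sigma=1.0))
```

Using the pseudo-inverse keeps the example well defined even when the sample covariance matrix is singular, consistent with the remedy discussed earlier.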
Appendix

The index sets for the effectiveness evaluation of fighters are shown below, including the basic performance index set (see Table 9) and the maneuver performance index set (see Table 10).

Table 9
The basic performance index set of fighters.

ID  Fighter        C11    C12  C13   C14   C15    C16   C17   Generation
1   MiG-9           1568  277   800  1100  12800   950   910  1
2   MiG-15          2200  234  1420  1860  15000  1080  1050  1
3   MiG-15bis       2650  241  1330  2520  15500  1080  1076  1
4   MiG-17          3380  236  1240  2020  16600  1150  1145  1
5   F-80C           1820  248  1600  2220  13700  1000   932  1
6   F-86F           2700  275  1470  2200  14300  1130  1053  1
7   MiG-21          6468  364  1020  2100  19000  1300  2571  2
8   Su-7            9310  320  1150  1450  19500  1200  2448  2
9   F-100D          7565  372  1600  2100  12300  1125  1591  2
10  F-104G          7170  520  1290  2220  16760  1390  2448  2
11  MiG-25         22000  569  1500  1730  20700  1200  3427  3
12  F-4E           16000  403  2000  2600  17700  1390  2448  3
13  Mirage F-1      6960  460  1700  2800  18500  1470  2448  3
14  MiG-29         16260  389  1480  2000  18000  1400  2877  4
15  MiG-31         30400  666  3000  3300  20600  1500  3464  4
16  F-15A          25800  316  1980  4450  19800  1380  2815  4
17  F-16A          10800  375  1825  3800  18000  1380  2387  4
18  F/A-18A        14250  443  1800  2600  15240  1345  2203  4
19  Mirage 2000-5   9500  268  1650  3200  18300  1480  2693  4
20  Tornado        14220  680  1850  2500  15000  1390  2570  4
Table 10
The maneuver performance index set of fighters.

ID  Fighter        C21    C22  C23  C24  Generation
1   MiG-9          0.310  4.0  2.5   16  1
2   MiG-15         0.460  8.0  4.0   42  1
3   MiG-15bis      0.530  8.0  4.4   50  1
4   MiG-17         0.630  8.0  4.8   75  1
5   F-80C          0.340  6.5  3.0   35  1
6   F-86F          0.375  7.0  3.5   47  1
7   MiG-21         0.770  8.0  5.9  145  2
8   Su-7           0.850  8.0  5.0  150  2
9   F-100D         0.557  7.3  5.5  130  2
10  F-104G         0.760  6.0  3.0  254  2
11  MiG-25         0.630  4.5  3.0  200  3
12  F-4E           0.810  8.0  6.0  152  3
13  Mirage F-1     0.610  8.0  5.0  165  3
14  MiG-29         1.100  9.0  9.0  310  4
15  MiG-31         0.670  5.0  3.2  235  4
16  F-15A          1.190  7.3  7.3  300  4
17  F-16A          1.030  9.0  9.0  305  4
18  F/A-18A        0.870  7.0  6.0  245  4
19  Mirage 2000-5  0.860  9.0  9.0  255  4
20  Tornado        0.700  7.5  5.5  180  4
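To make the structure of the appendix data explicit, the sketch below arranges the maneuver performance indices of Table 10 as a labeled decision system, with C21–C24 as conditional attributes and the fighter generation as the decision attribute. The array literals transcribe the table directly; the min–max normalization is only an illustrative preprocessing choice and not necessarily the one used in this paper.

```python
import numpy as np

# Conditional attributes C21-C24 for the 20 fighters in Table 10.
X = np.array([
    [0.310, 4.0, 2.5,  16], [0.460, 8.0, 4.0,  42], [0.530, 8.0, 4.4,  50],
    [0.630, 8.0, 4.8,  75], [0.340, 6.5, 3.0,  35], [0.375, 7.0, 3.5,  47],
    [0.770, 8.0, 5.9, 145], [0.850, 8.0, 5.0, 150], [0.557, 7.3, 5.5, 130],
    [0.760, 6.0, 3.0, 254], [0.630, 4.5, 3.0, 200], [0.810, 8.0, 6.0, 152],
    [0.610, 8.0, 5.0, 165], [1.100, 9.0, 9.0, 310], [0.670, 5.0, 3.2, 235],
    [1.190, 7.3, 7.3, 300], [1.030, 9.0, 9.0, 305], [0.870, 7.0, 6.0, 245],
    [0.860, 9.0, 9.0, 255], [0.700, 7.5, 5.5, 180],
])
# Decision attribute (label): fighter generation.
y = np.array([1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4])

# Illustrative min-max normalization of the conditional attributes.
X_norm = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
print(X_norm.shape, y.shape)  # (20, 4) (20,)
```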
Supplementary material

Supplementary material associated with this article can be found, in the online version, at 10.1016/j.knosys.2018.04.002.

References

[1] M. Sahoo, S. Sahoo, A. Dhar, B. Pradhan, Effectiveness evaluation of objective and subjective weighting methods for aquifer vulnerability assessment in urban context, J. Hydrol. 541 (2016) 1303–1315.
[2] H. Ishibuchi, T. Yamamoto, Rule weight specification in fuzzy rule-based classification systems, IEEE Trans. Fuzzy Syst. 13 (4) (2005) 428–435.
[3] M. Suo, B. Zhu, D. Zhou, R. An, S. Li, Neighborhood grid clustering and its application in fault diagnosis of satellite power system, Proc. Inst. Mech. Eng. Part G J. Aerosp. Eng. (2018) (in press, doi:10.1177/0954410017751991).
[4] G.L. Yang, J.B. Yang, D.L. Xu, M. Khoveyni, A three-stage hybrid approach for weight assignment in MADM, Omega 71 (2017) 93–105.
[5] I.J. Perez, F.J. Cabrerizo, S. Alonso, E. Herrera-Viedma, A new consensus model for group decision making problems with non-homogeneous experts, IEEE Trans. Syst. Man Cybern. Syst. 44 (4) (2014) 494–498.
[6] W. Liu, Y. Dong, F. Chiclana, F.J. Cabrerizo, E. Herrera-Viedma, Group decision-making based on heterogeneous preference relations with self-confidence, Fuzzy Optim. Decis. Making 16 (4) (2017) 429–447.
[7] H. Zhang, Y. Dong, E. Herrera-Viedma, Consensus building for the heterogeneous large-scale GDM with the individual concerns and satisfactions, IEEE Trans. Fuzzy Syst. (2017) (in press, doi:10.1109/TFUZZ.2017.2697403).
[8] Y. Dong, Z. Ding, L. Martínez, F. Herrera, Managing consensus based on leadership in opinion dynamics, Inf. Sci. 397 (2017) 187–205.
[9] Y. Dong, Y. Liu, H. Liang, F. Chiclana, E. Herrera-Viedma, Strategic weight manipulation in multiple attribute decision making, Omega 75 (2018) 154–164.
[10] Y. Dong, H. Zhang, E. Herrera-Viedma, Integrating experts’ weights generated dynamically into the consensus reaching process and its applications in managing non-cooperative behaviors, Decis. Support Syst. 84 (2016) 1–15.
[11] J. Wu, L. Dai, F. Chiclana, H. Fujita, E. Herrera-Viedma, A minimum adjustment cost feedback mechanism based consensus model for group decision making under social network with distributed linguistic trust, Inf. Fusion 41 (2018) 232–242.
[12] E.H. Forman, S.I. Gass, The analytic hierarchy process: an exposition, Oper. Res. 49 (4) (2001) 469–486.
[13] C.L. Hwang, K. Yoon, Multiple Attribute Decision Making: Methods and Applications, Berlin: Springer-Verlag, 1981.
[14] H. Deng, C.H. Yeh, R.J. Willis, Inter-company comparison using modified TOPSIS with objective weights, Comput. Oper. Res. 27 (10) (2000) 963–973.
[15] Y.M. Wang, Y. Luo, Integration of correlations with standard deviations for determining attribute weights in multiple attribute decision making, Math. Comput. Model. 51 (1–2) (2010) 1–12.
[16] D. Diakoulaki, G. Mavrotas, L. Papayannakis, Determining objective weights in multiple criteria problems: the CRITIC method, Comput. Oper. Res. 22 (7) (1995) 763–770.
[17] M.I.C.T. Che, B. Yusoff, M.L. Abdullah, A.F. Wahab, Conflicting bifuzzy multi-attribute group decision making model with application to flood control project, Group Decis. Negot. 25 (1) (2016) 157–180.
[18] S. Liu, F.T.S. Chan, W. Ran, Decision making for the selection of cloud vendor: an improved approach under group decision-making with integrated weights and objective/subjective attributes, Expert Syst. Appl. 55 (2016) 37–47.
[19] C. Fu, S. Yang, An attribute weight based feedback model for multiple attributive group decision analysis problems with group consensus requirements in evidential reasoning context, Eur. J. Oper. Res. 212 (1) (2011) 179–189.
[20] Y. Dong, H. Zhang, E. Herrera-Viedma, Consensus reaching model in the complex and dynamic MAGDM problem, Knowl. Based Syst. 106 (2016) 206–219.
[21] K.S. Chin, C. Fu, Y. Wang, A method of determining attribute weights in evidential reasoning approach based on incompatibility among attributes, Comput. Ind. Eng. 87 (C) (2015) 150–162.
[22] C. Fu, D.L. Xu, Determining attribute weights to improve solution reliability and its application to selecting leading industries, Ann. Oper. Res. 245 (1–2) (2014) 401–426.
[23] G.V. Valkenhoef, T. Tervonen, Entropy-optimal weight constraint elicitation with additive multi-attribute utility models, Omega 64 (2016) 1–12.
[24] Y. He, H. Guo, M. Jin, P. Ren, A linguistic entropy weight method and its application in linguistic multi-attribute group decision making, Nonlinear Dyn. 84 (1) (2016) 399–404.
[25] I.T. Jolliffe, Principal Component Analysis, New York: Springer-Verlag, 1986.
[26] Y.J. Xu, Q.L. Da, Standard and mean deviation methods for linguistic group decision making and their applications, Expert Syst. Appl. 37 (8) (2010) 5905–5912.
[27] W. Zhang, Y. Xu, H. Wang, A consensus reaching model for 2-tuple linguistic multiple attribute group decision making with incomplete weight information, Int. J. Syst. Sci. 47 (2) (2016) 389–405.
[28] J.M. Merigó, M. Casanovas, Y. Xu, Fuzzy group decision-making with generalized probabilistic OWA operators, J. Intell. Fuzzy Syst. 27 (2) (2014) 783–792.
[29] Y. Xu, H. Wang, H. Sun, D. Yu, A distance-based aggregation approach for group decision making with interval preference orderings, Comput. Ind. Eng. 72 (1) (2014) 178–186.
[30] J.L. Deng, Introduction to Grey System Theory, Sci-Tech Information Services, 1989.
[31] J. Liang, K. Chin, C. Dang, R.C. Yam, A new method for measuring uncertainty and fuzziness in rough set theory, Int. J. Gen. Syst. 31 (4) (2002) 331–342.
[32] Q. Hu, D. Yu, J. Liu, C. Wu, Neighborhood rough set based heterogeneous feature subset selection, Inf. Sci. 178 (18) (2008) 3577–3594.
[33] M. Suo, R. An, D. Zhou, S. Li, Grid-clustered rough set model for self-learning and fast reduction, Pattern Recognit. Lett. 106 (2018) 61–68. doi:10.1016/j.patrec.2018.02.018.
[34] Q.J. Song, H.Y. Jiang, J. Liu, Feature selection based on FDA and F-score for multi-class classification, Expert Syst. Appl. 81 (2017) 22–27.
[35] L. Hu, W. Gao, K. Zhao, P. Zhang, F. Wang, Feature selection considering two types of feature relevancy and feature interdependency, Expert Syst. Appl. 93 (2018) 423–434.
[36] L. Wang, J. Shen, X. Mei, Cost sensitive multi-class fuzzy decision-theoretic rough set based fault diagnosis, in: Chinese Control Conference, 2017, pp. 6957–6961.
[37] Y. Yao, A partition model of granular computing, in: LNCS Transactions on Rough Sets, 2004, pp. 232–253.
[38] X.Z. Zhu, W. Zhu, X.N. Fan, Rough set methods in feature selection via submodular function, Soft Comput. (2016) 1–13.
[39] B. Zhang, Y. Dong, Y. Xu, Multiple attribute consensus rules with minimum adjustments to support consensus reaching, Knowl. Based Syst. 67 (3) (2014) 35–48.
[40] C.C. Li, Y. Dong, F. Herrera, E. Herrera-Viedma, Personalized individual semantics in computing with words for supporting linguistic group decision making: an application on consensus reaching, Inf. Fus. 33 (C) (2017) 29–40.
[41] X. Chen, H. Zhang, Y. Dong, The fusion process with heterogeneous preference structures in group decision making: a survey, Inf. Fus. 24 (2015) 72–83.
[42] Y. Dong, C.C. Li, F. Chiclana, E. Herrera-Viedma, Average-case consistency measurement and analysis of interval-valued reciprocal preference relations, Knowl. Based Syst. 114 (2016) 108–117.
[43] Y. Xu, W. Zhang, H. Wang, A conflict-eliminating approach for emergency group decision of unconventional incidents, Knowl. Based Syst. 83 (1) (2015) 92–104.
[44] R.R. Yager, On ordered weighted averaging aggregation operators in multi-criteria decision making, Read. Fuzzy Sets Intell. Syst. 18 (1) (1993) 80–87.
[45] P.C. Mahalanobis, On the generalized distance in statistics, Proc. Natl. Inst. Sci. 2 (1936) 49–55.
[46] M. Naderpour, J. Lu, G. Zhang, A human-system interface risk assessment method based on mental models, Saf. Sci. 79 (2015) 286–297.
[47] M. Naderpour, J. Lu, G. Zhang, An intelligent situation awareness support system for safety-critical environments, Decis. Support Syst. 59 (1) (2014) 325–340.
[48] M. Naderpour, J. Lu, G. Zhang, An abnormal situation modeling method to assist operators in safety-critical systems, Reliab. Eng. Syst. Saf. 133 (2015) 33–47.
[49] S. Kumar, W. Byrne, Minimum Bayes-Risk word alignments of bilingual texts, in: ACL-02 Conference on Empirical Methods in Natural Language Processing, 2002, pp. 140–147.
[50] J. González-Rubio, F. Casacuberta, Minimum Bayes risk subsequence combination for machine translation, Pattern Anal. Appl. 18 (3) (2015) 523–533.
[51] V. Goel, S. Kumar, W. Byrne, Segmental minimum Bayes-Risk ASR voting strategies, 2002, pp. 334–344.
[52] P. Tan, M. Steinbach, V. Kumar, Introduction to Data Mining, Posts & Telecom Press, 2011.
[53] F. Jiang, Y. Sui, A novel approach for discretization of continuous attributes in rough set theory, Knowl. Based Syst. 73 (1) (2015) 324–334.
[54] F. Ma, J. He, J. Ma, S. Xia, Evaluation of urban green transportation planning based on central point triangle whiten weight function and entropy-AHP, Transp. Res. Procedia 25 (2017) 3638–3648.
[55] S. Liu, H. Li, A Human Factors Integrated Methods for Weapon System Effectiveness Evaluation, Springer Singapore, 2016.
[56] D.Y. Fei, X. An, Synthesized index model for fighter plane air combat effectiveness assessment, Acta Aeronautica Et Astronautica Sinica 27 (6) (2006) 1084–1087.
[57] B. Zhu, R. Zhu, X. Xiong, Fighter Plane Effectiveness Assessment, Beijing: Aviation Industry Press, 1993.
[58] L. Wang, H. Zhang, H. Xu, Multi-index synthesize evaluation model based on rough set theory for air combat efficiency, Acta Aeronautica Et Astronautica Sinica 29 (4) (2008) 880–885.
[59] A. Bongers, J.L. Torres, Technological change in U.S. jet fighter aircraft, Res. Policy 43 (9) (2014) 1570–1581.
[60] Y. Xu, F.J. Cabrerizo, E. Herrera-Viedma, A consensus model for hesitant fuzzy preference relations and its application in water allocation management, Appl. Soft Comput. 58 (2017) 265–284.
[61] Y. Xu, L. Chen, F. Herrera, H. Wang, Deriving the priority weights from incomplete hesitant fuzzy preference relations in group decision making, Knowl. Based Syst. 99 (2016) 71–78.
[62] Y. Xu, H. Sun, H. Wang, Optimal consensus models for group decision making under linguistic preference relations, Int. Trans. Oper. Res. 23 (6) (2015) 1201–1228.
[63] D. Yu, J.M. Merigó, Y. Xu, Group decision making in information systems security assessment using dual hesitant fuzzy set, Int. J. Intell. Syst. 31 (8) (2016) 786–812.
[64] Y. Xu, L. Chen, K.W. Li, H. Wang, A chi-square method for priority derivation in group decision making with incomplete reciprocal preference relations, Inf. Sci. 306 (2015) 166–179.