Information Fusion 34 (2017) 80–86
Contents lists available at ScienceDirect
Information Fusion journal homepage: www.elsevier.com/locate/inffus
The analysis of expert opinions’ consensus quality
Adrianna Kozierkiewicz-Hetmańska, Wroclaw University of Technology, Wybrzeze Wyspianskiego 27, 50-370 Wroclaw, Poland
Article info
Article history: Received 6 November 2015 Revised 30 May 2016 Accepted 25 June 2016 Available online 29 June 2016 2010 MSC: 00-01 99-00
Abstract
In many situations we need to obtain a single, common decision (which can be understood as a consistent state of knowledge) from opinions collected from many experts or other external sources. This raises the problem of the reliability of such a decision: we would like to know that decisions based on experts’ opinions are trustworthy. Unfortunately, in many cases determining such a decision is difficult and expensive, especially when large sets of input data are involved. This paper presents a framework for assessing the quality of the final decision. Its output is based solely on the analysis of its input (e.g. an assumed representation of experts’ opinions). Moreover, the paper contains an overview of several possible approaches to the considered topic.
Keywords: Consensus; Quality; Experts’ opinions
http://dx.doi.org/10.1016/j.inffus.2016.06.005 1566-2535/© 2016 Elsevier B.V. All rights reserved.
1. Introduction
In recent years we have observed a rapid growth of information, its sources and its methods of representation. This has made it necessary to develop methods for storing and processing it. Modern companies are characterised by the increasing complexity of the systems they use and by growing amounts of data, which are stored not in a single central database but frequently in several distributed locations. Furthermore, the same data is sometimes replicated among multiple databases to ensure its safety. To properly manage a company that has to deal with such diversity, effective methods of integrating knowledge from all possible sources are required. Moreover, companies are constantly forced to question the reliability of the results of an integration process because of its high complexity and the significant size of its input.
In general, any source of knowledge can be treated as an expert’s opinion. In the simplest case, opinions can be collected even from ordinary relational databases. For example, the branches of a company possess different data referring to their local management. This diversity may lead to different business decisions at the regional level and completely different ones at the global level. Obviously, mistakes made during the development of such a strategy can cause the loss of a lot of money or even bankruptcy. Therefore, before trusting such a strategy (a result of integrating experts’ opinions coming from a tremendous amount of data concerning the local branches of the company), its quality must be assessed. Unfortunately, such verification is not easy due to the
computational complexity of calculating the assumed quality metrics. Moreover, processing a large set of experts’ opinions is very expensive in terms of time when distributed knowledge sources are involved.
In our work we assume that, to make a decision regarding some problem, we ask many experts for their opinions or we process data from various sources. Thus, we have to deal with the knowledge of a collective and make the final decision based on it. It has been shown in [1] that Consensus Theory can be useful in determining the consistent knowledge of a collective. However, in a real situation the final solution is not simple to achieve. Before drawing any conclusions from experts’ opinions we would like to know whether it is possible to obtain a reliable, high-quality consensus. The general idea of the considered problem is presented in Fig. 1.
In this paper we assume that experts’ opinions share a unified representation. For simplicity, we use real numbers and binary vectors as this representation. Furthermore, we propose conditions which a set of collected opinions should fulfil in order to achieve the highest quality of the final consensus. We claim that analysing the dependencies between the number of experts (knowledge bases) and the elements of the sets representing their opinions, without determining the consensus, is enough to assess the quality of the consensus. The main contribution of this paper is a set of new theorems which give conditions for the maximal quality of the consensus for the assumed knowledge structures. Subsequently, we also investigate particular issues concerning the consistency of profiles (knowledge bases) and their susceptibility to a consensus. All proposed
Fig. 1. General idea of the presented problem.
theorems are discussed and some ideas for improving the quality of a consensus are presented. The obtained results are novel, since the quality of consensus has not been widely investigated in the literature.
The remaining part of this paper is organised as follows. The next section gives a brief summary of related work. Section 3 contains an introduction to Consensus Theory. In Section 4 different representations of experts’ opinions are presented along with an analysis of the quality of consensus. Section 5 concludes the paper.
2. Related works
In many practical tasks we encounter a decision problem that needs to be solved based on collective knowledge. However, this problem is difficult and has not been deeply investigated by researchers. Nguyen [1] proposed a formal mathematical model of collective knowledge and applied Consensus Theory methods to generate it. The consensus methodology has proved useful in solving conflicts and should also be effective for resolving knowledge inconsistency and for knowledge integration.
The problem of determining collective knowledge is related to the knowledge integration problem. These issues were solved and thoroughly investigated for logical and relational structures as well as for ontologies; see [1,2], where the author proposed a formal model for knowledge integration together with its algorithms. Additionally, a set of postulates for a knowledge function was proposed and analysed depending on the selected representation of knowledge states.
In [3] the authors introduced an algorithm aggregating the preference relations provided by experts in multi-expert decision-making problems. To aggregate the individual preferences for each of the elements, the most suitable aggregation function was selected from a set of aggregation functions by means of a consensus done through penalty.
The authors assumed that the information provided by the experts was homogeneous and represented by fuzzy preference relations, which were fused into a single relation called the collective preference relation. This fusion used the aggregation function selected by the consensus. Rosello and others [4] proposed a mathematical framework and a methodology for group decision-making using distances and consensus within linguistic information. Distances were defined from the geodesic distance in graph theory and the Minkowski distance. The degree of consensus is based on the concept of the entropy of generalised qualitative assessments. In [5] a decision support system was proposed. Experts provided their testimonies as fuzzy preference relations, and the consensus process was supervised by a moderator called a “super-expert”. Szmidt and Kacprzyk [6] presented a fuzzy analysis of consensus based on the idea of a distance from the final consensus.
Mata [7] proposed a model of an adaptive consensus support system for decision-making problems with multi-granular linguistic information. The consensus process was improved by adapting the search for preferences in disagreement to the current level of consensus at each round. Additionally, the authors defined three different methods of identifying the preferences that each expert should modify in order to increase the agreement in the next consensus round. An interesting problem was considered in [8], where the authors assumed that decision makers provide their opinions using a linguistic expression instead of a single linguistic term. The paper considered a consensus reaching process for hesitant linguistic group decision making, and a novel distance-based consensus measure was proposed for this problem. In [9] an overview and a categorisation of some existing models for decision-making problems was given. These models were applied in a prototype of a simulation-based analysis framework called AFRYCA for the resolution of decision-making problems under consensus. In [10] the authors proposed a consensus model suitable for managing a large number of decision makers, an issue which was also raised in [11–15].
The decision problem is closely related to the issue of assessing the quality of the determined decisions. A formal definition of the quality of knowledge was proposed in [1]. The author defined measures which make it possible to evaluate consensuses with reference to their profiles; they include measures of quality and of consistency. For selected cases, the author pointed out that the larger the consistency value of a profile, the higher the quality of its consensus. The problem of assessing collective knowledge was also raised in [16], where the quality of collective knowledge states was evaluated by comparing them with the real knowledge states.
The author analytically proved that the collective knowledge state is always better than the worst element of the collective (the collective member). Dong and others [17,18] used social choice theory and prospect theory for decision-making problems and evaluated the consensus process. The authors considered different representations, such as preference orderings, utility functions, and multiplicative and fuzzy preference relations, and based on them created individual preference vectors of alternatives. The standardised individual preference vectors are aggregated into a collective preference vector. The authors calculated the consensus degree as a distance between the individual preference values and the collective preference values. The consensus degree was used to evaluate the consensus process and to adjust the opinions of the decision makers. The proposed framework avoided internal inconsistency and satisfied the Pareto principle.
In [19] the author evaluated one-level and two-level consensuses with reference to the optimal solution. The proposed quality measure demonstrated that both the two-level algorithm and the one-level method are good approximations of the optimal solution, giving results that are worse by less than 5%. In [20] the authors showed interesting experimental results. They engaged four volunteer experts, gave them definitions of seventy-six variables and asked them to write, in a limited amount of time, rules describing the printer domain to the best of their ability. These rules were assessed, and the analysis demonstrated that the collective knowledge achieves higher accuracy than a simple combination of the individual volunteers. In [21] the quality was measured by the difference between the collective knowledge and the real-world knowledge. The authors proposed a method for improving the quality of the collective knowledge. They conducted experiments with different numbers of collective members, using a multi-dimensional vector structure, to determine how the number of collective members influences the quality of the collective knowledge. Similar research was presented in [22], where the authors investigated how the density and coherence factors affect the considered quality using binary vectors.
3. Preliminaries: introduction to consensus theory
Let U be a finite, nonempty set, a universe of objects which can be considered as potential elements of knowledge referring to a certain world. The symbol $2^U$ denotes the powerset of U, that is, the set of all subsets of U. By $\Pi_b(U)$ we denote the set of all b-element subsets (with repetitions) of the set U for $b \in \mathbb{N}$. Therefore $\Pi(U) = \bigcup_{b \in \mathbb{N}} \Pi_b(U)$ is the set of all nonempty subsets with repetitions of the universe U. Each X which belongs to $\Pi(U)$ is called a knowledge profile and can be interpreted as the knowledge of a collective, with each x ∈ X being the knowledge of a collective member. The macrostructure of the set U is defined in the following way [23]:
Definition 1. The macrostructure of the set U is a distance function $\delta : U \times U \to [0, 1]$ which satisfies the following conditions:
1. $\forall v, u \in U:\ \delta(v, u) = 0 \Leftrightarrow v = u$
2. $\forall v, u \in U:\ \delta(v, u) = \delta(u, v)$
In this paper we consider only metrics satisfying the transitive condition. For generality, Definition 1 does not require the triangle inequality, because this condition is too strong for many practical situations [24]. Therefore, the pair (U, δ) is called a distance space, since it does not need to be a metric space. For the assumed distance space, the consensus choice problem requires establishing a consensus choice function.
Definition 2. By a consensus choice function in the space (U, δ) we mean a function:
$$C : \Pi(U) \to 2^{U} \tag{1}$$
By C(X) we denote the representation of $X \in \Pi(U)$, and each c ∈ C(X) is called a consensus of the profile X. Therefore, each element of C(X) represents a consistent knowledge state of the assumed collective. We can interpret it as the final decision determined on the basis of experts’ opinions. In [1,2] the authors present 10 postulates for consensus choice functions: reliability, unanimity, simplification, quasi-unanimity, consistency, Condorcet consistency, general consistency, proportion, 1-optimality and 2-optimality. The last two postulates, 1-optimality and 2-optimality, play an important role in solving the consensus choice problem. The 1-optimality postulate requires the consensus to be as near as possible to the elements of the profile, so that it can be regarded as the best representative of the profile. The 2-optimality postulate makes it possible to determine the most ‘fair’ consensus. Let us assume that the following symbols are defined:
• $\delta^{1}(x, X) = \sum_{y \in X} \delta(x, y)$
• $\delta^{2}(x, X) = \sum_{y \in X} (\delta(x, y))^{2}$
The 1-optimality and 2-optimality postulates are formally defined as follows ([23], [1]):
Definition 3. For a profile $X \in \Pi(U)$ a consensus choice function C satisfies the postulate of:
1. 1-optimality iff $x \in C(X) \Rightarrow \delta^{1}(x, X) = \min_{y \in U} \delta^{1}(y, X)$,
2. 2-optimality iff $x \in C(X) \Rightarrow \delta^{2}(x, X) = \min_{y \in U} \delta^{2}(y, X)$.
For the same distance space it is possible to determine many consensuses, so it is necessary to define a measure which makes it possible to evaluate these consensuses with reference to the profile [1]. For this task the quality of a consensus is defined:
Definition 4. Let $X \in \Pi(U)$ and x ∈ C(X). By the quality of a consensus x in a profile X we call the following value:
$$Q^{i}(x, X) = 1 - \frac{\delta^{i}(x, X)}{\mathrm{card}(X)} \tag{2}$$
where i ∈ {1, 2}. If the consensus x in the profile X satisfies the criterion of 1-optimality then we calculate $Q^{1}(x, X)$, otherwise $Q^{2}(x, X)$. Naturally, we would like to find a consensus which maximises the quality measure. Intuitively speaking, it is the most reliable representation of experts’ opinions. Additionally, all consensuses of X which satisfy the 1-optimality postulate have the same, maximal quality. In the next section, the quality of a consensus is analysed for some assumed knowledge representations and illustrated with detailed examples.
4. The analysis of the consensus’ quality
The determination of a consensus is often a very difficult task. In this paper we investigate whether it is possible to decide, only by analysing the input of the procedure, that a consensus with the highest possible quality (or at least an acceptable level of quality) can be determined. Suppose that we have asked two groups of experts to estimate the capacity of a box presented to them. The two experts from the first group said 1 m³ and 4 m³. The two experts from the second group said 1 m³ and 2 m³. The consensus determined from the opinions of the second group has a higher quality because these opinions are closer to each other. As mentioned in the previous section, a consensus which satisfies the 1-optimality criterion always has the highest quality. However, when determining a 2-optimality consensus we would also like to maximise the quality.
In the first case, we assume that experts’ opinions are represented as real numbers. This situation occurs often in the real world when we ask experts for numerical assessments of some situation. For example, the experts may be a jury in a competition awarding numerical scores, or economists estimating the value of shares, currencies, etc.
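The box example can be sketched numerically. The following Python fragment is only an illustration under stated assumptions: the universe is taken to be the grid 0, 0.5, ..., 5, the distance is normalised by a hypothetical upper bound `F = 5`, and the helper names (`delta_sum`, `consensus`, `quality`) are introduced here and do not come from the paper. The consensus is found by brute force over the universe, following Definitions 3 and 4.

```python
def delta_sum(x, profile, dist, i=1):
    """delta^i(x, X): sum over y in X of dist(x, y) ** i."""
    return sum(dist(x, y) ** i for y in profile)


def consensus(universe, profile, dist, i=1, tol=1e-12):
    """Brute-force i-optimal consensuses: universe elements minimising delta^i."""
    scores = [(x, delta_sum(x, profile, dist, i)) for x in universe]
    best = min(score for _, score in scores)
    return [x for x, score in scores if score - best <= tol]


def quality(x, profile, dist, i=1):
    """Q^i(x, X) = 1 - delta^i(x, X) / card(X)."""
    return 1.0 - delta_sum(x, profile, dist, i) / len(profile)


F = 5.0                                   # assumed upper bound of the universe
universe = [k * 0.5 for k in range(11)]   # assumed grid 0, 0.5, ..., 5


def dist(u, v):
    """Normalised absolute difference, so that distances stay in [0, 1]."""
    return abs(u - v) / F


group_1 = [1.0, 4.0]   # opinions of the first group (in cubic metres)
group_2 = [1.0, 2.0]   # opinions of the second group

for name, profile in (("group 1", group_1), ("group 2", group_2)):
    c2 = consensus(universe, profile, dist, i=2)
    q2 = quality(c2[0], profile, dist, i=2)
    print(name, "2-optimal consensus:", c2, "Q2 =", round(q2, 4))
```

Under these assumptions the second group yields the higher quality, which agrees with the intuition that closer opinions give a more reliable consensus.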
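The decision determined on the basis of experts’ opinions is very important and can have a strong influence on the development of a company. The following theorem indicates in which situation the determined consensus (decision) is trustworthy in the assumed distance space (we consider a particular case of the set U and the function δ):
Theorem 1. For an assumed distance space $U = \{0, 0.01, 0.02, \ldots, 5.99, 6.00, \ldots, f\}$, $\delta(v, u) = \frac{1}{f}|u - v|$, $u, v \in U$, and the profile $X = \{a_1, \ldots, a_n\} \in \Pi(U)$, the consensus satisfying the 2-optimality criterion approaches the maximal quality if
$$\lim_{n \to \infty} \left| n - \frac{\left(\sum_{i=1}^{n} a_i\right)^2}{\sum_{i=1}^{n} a_i^2} \right| = 0$$
Proof. We assume that n > 1, because for n = 1 the consensus always has maximal quality and $\left| n - \frac{(\sum_{i=1}^{n} a_i)^2}{\sum_{i=1}^{n} a_i^2} \right| = 0$. We would like to find the maximal value of $Q^{2}(x, X) = 1 - \frac{\delta^{2}(x, X)}{\mathrm{card}(X)}$, where:
$$\delta^{2}(x, X) = \min_{x_m \in U} \sum_{y \in X} \delta^{2}(x_m, y).$$
Hence
$$\begin{aligned}
Q^{2}(x, X) &= 1 - \frac{\frac{1}{f^2} \sum_{i=1}^{n} (x_m - a_i)^2}{n} \\
&= 1 - \frac{\frac{1}{f^2} \sum_{i=1}^{n} \left(x_m^2 - 2 a_i x_m + a_i^2\right)}{n} \\
&= 1 - \frac{\frac{1}{f^2} \left( \sum_{i=1}^{n} x_m^2 - 2 x_m \sum_{i=1}^{n} a_i + \sum_{i=1}^{n} a_i^2 \right)}{n} \\
&= 1 - \frac{n x_m^2 - 2 x_m \sum_{i=1}^{n} a_i + \sum_{i=1}^{n} a_i^2}{f^2 n}.
\end{aligned}$$
For the defined distance space the consensus of a profile X is equal to $x_m = \frac{1}{n} \sum_{i=1}^{n} a_i$ [23], therefore:
$$\begin{aligned}
Q^{2}(x, X) &= 1 - \frac{\frac{1}{n} \left(\sum_{i=1}^{n} a_i\right)^2 - \frac{2}{n} \left(\sum_{i=1}^{n} a_i\right) \sum_{i=1}^{n} a_i + \sum_{i=1}^{n} a_i^2}{f^2 n} \\
&= 1 - \frac{\left(\sum_{i=1}^{n} a_i\right)^2 - 2 \left(\sum_{i=1}^{n} a_i\right)^2 + n \sum_{i=1}^{n} a_i^2}{f^2 n^2} \\
&= 1 - \frac{n \sum_{i=1}^{n} a_i^2 - \left(\sum_{i=1}^{n} a_i\right)^2}{f^2 n^2}.
\end{aligned}$$
Since we would like to maximise the quality presented above, it is enough to minimise the expression $\frac{n \sum_{i=1}^{n} a_i^2 - (\sum_{i=1}^{n} a_i)^2}{f^2 n^2}$. For $\sum_{i=1}^{n} a_i^2 > 0$ this expression equals $\frac{\sum_{i=1}^{n} a_i^2}{f^2 n^2} \left( n - \frac{(\sum_{i=1}^{n} a_i)^2}{\sum_{i=1}^{n} a_i^2} \right)$, and $\sum_{i=1}^{n} a_i^2 \le n f^2$ because every $a_i \le f$. Therefore, for $\lim_{n \to \infty} \left| n - \frac{(\sum_{i=1}^{n} a_i)^2}{\sum_{i=1}^{n} a_i^2} \right| = 0$ the quality of a consensus approaches the maximal value.
Fig. 2. The example of a regular profile.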
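The closed form obtained in the proof of Theorem 1 can be checked against the definition of $Q^{2}$ with a few lines of Python. This is a minimal sketch under assumptions that are not part of the paper: opinions are integers (so that exact rational arithmetic can be used), the consensus is taken to be the arithmetic mean as in the proof (the 0.01 grid of U is ignored), and the function names are hypothetical.

```python
from fractions import Fraction


def theorem1_indicator(profile):
    """|n - (sum a_i)^2 / (sum a_i^2)| from Theorem 1 (needs some a_i != 0)."""
    n = len(profile)
    s1 = sum(profile)
    s2 = sum(a * a for a in profile)
    return abs(n - Fraction(s1 * s1, s2))


def q2_closed_form(profile, f):
    """Q^2 = 1 - (n * sum a_i^2 - (sum a_i)^2) / (f^2 n^2)."""
    n = len(profile)
    s1 = sum(profile)
    s2 = sum(a * a for a in profile)
    return 1 - Fraction(n * s2 - s1 * s1, f * f * n * n)


def q2_direct(profile, f):
    """Q^2 from Definition 4, with the arithmetic mean used as the consensus."""
    n = len(profile)
    mean = Fraction(sum(profile), n)
    delta2 = sum(((mean - a) / f) ** 2 for a in profile)
    return 1 - delta2 / n


for profile in ([1, 2, 3], [1, 1, 2], [2, 2, 2]):
    print(profile,
          "indicator =", theorem1_indicator(profile),
          "Q2 =", q2_closed_form(profile, f=3))
```

The smaller the indicator, the closer the quality is to 1; for a homogeneous profile such as `[2, 2, 2]` the indicator is 0 and the quality is exactly 1.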
The condition presented above makes it easy to check whether a consensus with the maximal quality can be determined from the collected data. This condition is especially important for large data sets, for which determining the final decision may be very time-consuming.
Example 1. For the distance space $U = \{0, 0.01, 0.02, \ldots, 3\}$, $\delta(v, u) = \frac{1}{3}|u - v|$, $u, v \in U$, we consider two profiles: the profile X = {1, 2, 3} and the profile Y = {1, 1, 2}. According to Theorem 1, for the profile X: $\left| n - \frac{(\sum_{i=1}^{n} a_i)^2}{\sum_{i=1}^{n} a_i^2} \right| = \left| 3 - \frac{36}{14} \right| = \frac{6}{14} = \frac{3}{7}$, and for the profile Y: $\left| n - \frac{(\sum_{i=1}^{n} a_i)^2}{\sum_{i=1}^{n} a_i^2} \right| = \left| 3 - \frac{16}{6} \right| = \frac{2}{6} = \frac{1}{3}$. For both profiles the difference between the cardinality and the ratio computed from the elements differs significantly from 0. However, the expression $\left| n - \frac{(\sum_{i=1}^{n} a_i)^2}{\sum_{i=1}^{n} a_i^2} \right|$ is further from 0 for the profile X than for the profile Y, which implies that the profile Y has a better quality than the profile X.
Let us consider a situation where we ask two experts for their opinions about studying at our university. The first expert says it is a good university, the second one claims that it is a bad university. Based on the collected information we cannot decide whether our university should be recommended to other students. In such a situation, we say that the profile is regular. A formal definition of a regular profile is given below:
Definition 5. Let $X \in \Pi(U)$ be a profile. We say that X is regular if and only if for each pair of objects x, y ∈ U the following equality holds: $\delta^{i}(x, X) = \delta^{i}(y, X)$.
Theorem 2. If a profile $X \in \Pi(U)$ is regular then each determined consensus has the same, highest quality.
Proof. From Definition 4 the quality of the consensus is equal to $Q^{i}(x, X) = 1 - \frac{\delta^{i}(x, X)}{\mathrm{card}(X)}$, where $\delta^{i}(x, X) = \min_{x_m \in U} \sum_{y \in X} \delta^{i}(x_m, y)$. From Definition 5 we know that $\delta^{i}(x, X)$ is the same for all x ∈ U, therefore $Q^{i}(x, X)$ is the same for all x ∈ X.
Although the consensus obtained for a regular profile always has the maximal quality, the previous analysis shows that it is not worth determining the consensus for a regular profile, because such a profile is not susceptible to consensus (it is not possible to determine a “good” consensus) [1].
Example 2. Let us consider the distance space (U, δ) where U = {a, b, c} and δ(x, y) = k for x ≠ y, k ∈ [0, 1], and δ(x, y) = 0 for x = y. If we assume that the profile X = U = {a, b, c}, then according to Definition 5 the profile is regular and each element of U can be treated as a consensus. Then, if x = a or x = b or x = c, the quality of the consensus is the same and equal to $Q^{1}(x, X) = 1 - \frac{2k}{3} = \frac{3 - 2k}{3}$. This situation is presented graphically in Fig. 2.
In many practical problems we would like to know the degree to which the opinions given by experts are consistent with each other. In other words, we would like to measure the degree of coherence of experts’ opinions.
Definition 6. By the consistency function for a profile $X \in \Pi(U)$ we call a function $cf : \Pi(U) \to [0, 1]$.
Intuitively, the consistency function measures how similar the elements of the profile are to each other. The highest possible consistency is desirable. In [1] the authors defined a set of postulates that should be fulfilled by consistency functions and proposed five different functions. In this paper we consider only the function which takes into account the total average distance between elements of the profile $X \in \Pi(U)$. This function reflects the consistency of the profile in the best way and satisfies the most postulates defined in [1]. It can be defined in the following way:
$$cf(X) = 1 - \frac{\sum_{x, y \in X} \delta^{1}(x, y)}{n(n + 1)},$$
where n is the cardinality of X. If the profile X has the maximal consistency value cf(X) = 1 then the quality of the consensus is maximal: $Q^{i}(x, X) = 1$ for i ∈ {1, 2}. This situation occurs only for a homogeneous profile (in which all elements are identical), and then $\delta^{i}(x, X) = 0$ for i ∈ {1, 2}. In this part of the paper we present the dependency between the minimal consistency and the quality of the consensus:
Theorem 3. If the profile $X \in \Pi(U)$ has the minimal consistency then the quality of the consensus is also minimal: $Q^{i}(x, X) = 0$, i ∈ {1, 2}.
Proof. We assume that the cardinality of the profile is n > 1, because if n = 1 then X is a homogeneous profile and $\delta^{i}(x, X) = 0$, which implies $Q^{i}(x, X) = 1 - 0 = 1$. If $cf(X) = 1 - \frac{\sum_{x, y \in X} \delta^{1}(x, y)}{n(n + 1)}$ is minimal, then $\frac{\sum_{x, y \in X} \delta^{1}(x, y)}{n(n + 1)}$ has to be maximal. It follows that $\delta^{1}(x, y) = 1$ for all x, y ∈ U with x ≠ y. Then $\delta^{i}(x, X) = \min_{x_m \in U} \sum_{y \in X} \delta^{i}(x_m, y) = n \cdot 1^{i}$ and $Q^{i}(x, X) = 1 - \frac{\delta^{i}(x, X)}{\mathrm{card}(X)} = 1 - \frac{n}{n} = 0$.
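The consistency function used above can be written down directly. The sketch below assumes that the sum runs over all ordered pairs of profile elements (including each element with itself, which contributes 0) and uses two illustrative distance functions, `discrete` and `scaled_abs`, whose names and the bound `f = 5` are assumptions made only for this example.

```python
def consistency(profile, dist):
    """cf(X) = 1 - (sum over x, y in X of dist(x, y)) / (n * (n + 1))."""
    n = len(profile)
    total = sum(dist(x, y) for x in profile for y in profile)
    return 1.0 - total / (n * (n + 1))


def discrete(x, y):
    """Distance 0 for identical elements and 1 otherwise."""
    return 0.0 if x == y else 1.0


def scaled_abs(u, v, f=5.0):
    """Absolute difference normalised by an assumed upper bound f."""
    return abs(u - v) / f


print(consistency(["a", "a", "a"], discrete))   # homogeneous profile
print(consistency(["a", "b", "c"], discrete))   # all opinions differ
print(consistency([1.0, 2.0], scaled_abs))      # close opinions
print(consistency([1.0, 4.0], scaled_abs))      # distant opinions
```

A homogeneous profile reaches the maximal value 1, and the more spread out the opinions are, the lower the value becomes.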
A profile with the minimal consistency makes the determination of a reliable consensus difficult or even impossible. However, Theorem 3 gives us a clue on how to improve a
Fig. 4. The example of a profile after removing the worst element.
Fig. 3. The example of two different profiles with different qualities and different consistency degrees.
profile with the minimal consistency (for which the determination of a reliable consensus is impossible): improving the consistency of the profile improves the quality of the consensus.
Let us consider a situation where we try to assess two decisions based on two different sets of experts’ opinions, and let us assume that the two profiles have different consistencies. The consistency value of a profile carries information about its density or sparsity: a high consistency value reflects its coherence. On the other hand, the consistency refers to the analysis of experts’ knowledge: a high consistency of experts’ opinions increases the credibility of the final decision. The consistency degree of a profile is closely connected with the quality of the consensus determined from a set of experts’ opinions. The better the quality of the consensus, the higher the consistency of the profile:
Theorem 4. Let $X, Y \in \Pi(U)$ be profiles, let x be a consensus of the profile X and y be a consensus of the profile Y, and let both consensuses x and y satisfy the 1-optimality postulate. If $Q^{1}(x, X) \ge Q^{1}(y, Y)$ then the following inequality is true: $cf(X) \ge cf(Y)$.
Proof. From Definition 4 the quality of the consensus is equal to $Q^{1}(x, X) = 1 - \frac{\delta^{1}(x, X)}{\mathrm{card}(X)}$ and $Q^{1}(y, Y) = 1 - \frac{\delta^{1}(y, Y)}{\mathrm{card}(Y)}$. If $1 - \frac{\delta^{1}(x, X)}{\mathrm{card}(X)} \ge 1 - \frac{\delta^{1}(y, Y)}{\mathrm{card}(Y)}$ then $\frac{\delta^{1}(x, X)}{\mathrm{card}(X)} \le \frac{\delta^{1}(y, Y)}{\mathrm{card}(Y)}$. According to [1] the consistency function satisfies the postulate for greater consistency, therefore the inequality $\frac{\delta^{1}(x, X)}{\mathrm{card}(X)} \le \frac{\delta^{1}(y, Y)}{\mathrm{card}(Y)}$ implies $cf(X) \ge cf(Y)$.
The profile X with better quality is denser than the profile Y with the worse quality. In other words, the elements of the profile X are more concentrated than the elements of the profile Y.
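The relation stated in Theorem 4 can be observed on a small numerical case. The following sketch compares a dense and a sparse profile of real-valued opinions; the grid universe, the normalising bound `F = 5`, the two sample profiles and the helper names are all assumptions chosen for illustration, and the example shows one instance of the relation rather than a proof of it.

```python
F = 5.0                                   # assumed upper bound of the universe
universe = [k * 0.5 for k in range(11)]   # assumed grid 0, 0.5, ..., 5


def dist(u, v):
    """Normalised absolute difference."""
    return abs(u - v) / F


def quality_1(profile):
    """Q^1 of a 1-optimal consensus, found by brute force over the universe."""
    best = min(sum(dist(x, y) for y in profile) for x in universe)
    return 1.0 - best / len(profile)


def consistency(profile):
    """cf(X) = 1 - (sum over ordered pairs of distances) / (n * (n + 1))."""
    n = len(profile)
    total = sum(dist(x, y) for x in profile for y in profile)
    return 1.0 - total / (n * (n + 1))


dense = [2.0, 2.5, 3.0]    # opinions concentrated around 2.5
sparse = [0.5, 2.5, 4.5]   # opinions spread over the whole range

for name, profile in (("dense", dense), ("sparse", sparse)):
    print(name,
          "Q1 =", round(quality_1(profile), 4),
          "cf =", round(consistency(profile), 4))
```

In this instance the denser profile has both the higher quality and the higher consistency.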
Example 3. Let us consider the graphical representation of the mentioned problem presented in Fig. 3. We can notice that the consensus of the profile X has a better quality than the consensus of the profile Y, because the distance between the consensus and the elements of the profile X is smaller than for the profile Y. Fig. 3 also illustrates the fact that the profile X is denser than the profile Y. Therefore, the consistency of the profile X is better than the consistency of the profile Y.
In [1] the author proves that adding an extra opinion improves the quality of a consensus and also improves the consistency degree. The quality of the consensus can also be improved by removing the worst element of the profile. In this case, we assume that one expert has an opinion extremely different from the others. If we do not consider this opinion when determining the final decision, the quality of the consensus will be improved and, by Theorem 4, the consistency of the profile will be improved as well.
Theorem 5. Let x be a consensus of a profile $X \in \Pi(U)$ which satisfies the 1-optimality postulate. Let y be an element of the profile X such that $\delta^{1}(x, y) = \max_{z \in X} \delta^{1}(x, z)$, let $Y = X - \{y\} \in \Pi(U)$, and let $x'$ be the consensus of the profile Y satisfying the 1-optimality postulate. The following dependency is true: $Q^{1}(x, X) \le Q^{1}(x', Y)$.
Proof. By $x'$ we denote the consensus of the profile Y, therefore $\delta^{1}(x', Y) \le \delta^{1}(x, Y)$. From $\delta^{1}(x, y) = \max_{z \in X} \delta^{1}(x, z)$ it follows that $(n - 1) \cdot \delta^{1}(x, y) \ge \delta^{1}(x, Y)$, where n is the cardinality of X and (n − 1) is the cardinality of Y. Thus
$$\frac{\delta^{1}(x, X)}{n} = \frac{\delta^{1}(x, Y) + \delta^{1}(x, y)}{n} \ge \frac{\delta^{1}(x, Y)}{n - 1} \ge \frac{\delta^{1}(x', Y)}{n - 1}.$$
From the inequality above: $Q^{1}(x, X) = 1 - \frac{\delta^{1}(x, X)}{n} \le 1 - \frac{\delta^{1}(x', Y)}{n - 1} = Q^{1}(x', Y)$. From Theorem 4 also $cf(X) \le cf(Y)$.
Example 4. Let us consider the graphical representation of the mentioned problem presented in Fig. 4. We can notice that the consensus of the profile Y has a better quality than the consensus of the profile X, because the average distance between the consensus and the elements of the profile Y is smaller than the average distance between the consensus and the elements of the profile X.
Theorems 4 and 5 show that in certain situations, to improve the quality of a consensus (the quality of a knowledge state), it is enough to modify the profile by adding an extra expert’s opinion or removing the most extreme opinion. From the theorems presented above we can notice that the proposed consistency function and the quality measure depend on each other.
Theorems 2 and 3 consider situations where the determination of a consensus (a consistent knowledge state) is difficult or even impossible. This part of the paper is devoted to the case where the profile is susceptible to a consensus, in other words, where it is possible to determine a “good”, reliable state of knowledge which is acceptable as a common opinion of a collective. The susceptibility of profiles to a consensus is defined in the following way [1]:
Definition 7. A profile $X \in \Pi(U)$ is susceptible to a consensus if and only if the following inequality holds:
$$\hat{\delta}^{i}(X) \ge \hat{\delta}^{i}_{\min}(X) \tag{3}$$
where: $\hat{\delta}^{i}(X) = \frac{\sum_{x, y \in X} (\delta(x, y))^{i}}{n(n + 1)}$, $\hat{\delta}^{i}_{x}(X) = \frac{\sum_{y \in X} (\delta(x, y))^{i}}{n}$, $\hat{\delta}^{i}_{\min}(X) = \min_{x \in U} \hat{\delta}^{i}_{x}(X)$, $n = \mathrm{card}(X)$, i ∈ {1, 2}.
Theorem 6. If a profile $X \in \Pi(U)$ is susceptible to a consensus satisfying the 1-optimality postulate then the following inequality is true (n is the cardinality of X): $Q^{1}(x, X) \ge \frac{2}{n + 1}$.
Proof. If a profile $X \in \Pi(U)$ is susceptible to a consensus satisfying the 1-optimality postulate then $\hat{\delta}^{1}(X) \ge \hat{\delta}^{1}_{\min}(X)$ and $1 - \hat{\delta}^{1}(X) \le 1 - \hat{\delta}^{1}_{\min}(X) = Q^{1}(x, X)$. It was proved in [1] that $\hat{\delta}^{1}(X) \le \frac{n - 1}{n + 1}$. Therefore, $Q^{1}(x, X) \ge 1 - \hat{\delta}^{1}(X) \ge 1 - \frac{n - 1}{n + 1} = \frac{n + 1 - n + 1}{n + 1} = \frac{2}{n + 1}$.
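The susceptibility condition of Definition 7, the bound of Theorem 6 and the effect of removing the most distant opinion (Theorem 5) can be explored with a short Python sketch for i = 1. Everything outside the formulas themselves is an assumption made for this illustration: the grid universe, the bound `F = 5`, the sample profiles and the helper names.

```python
def avg_pairwise(profile, dist):
    """hat-delta^1(X): sum of distances over ordered pairs divided by n(n+1)."""
    n = len(profile)
    return sum(dist(x, y) for x in profile for y in profile) / (n * (n + 1))


def avg_to(x, profile, dist):
    """hat-delta^1_x(X): average distance from x to the profile elements."""
    return sum(dist(x, y) for y in profile) / len(profile)


def one_optimal(universe, profile, dist):
    """One 1-optimal consensus, found by brute force over the universe."""
    return min(universe, key=lambda x: avg_to(x, profile, dist))


def quality_1(universe, profile, dist):
    """Q^1 of a 1-optimal consensus, i.e. 1 - hat-delta^1_min(X)."""
    return 1.0 - avg_to(one_optimal(universe, profile, dist), profile, dist)


def is_susceptible(universe, profile, dist):
    """Inequality (3) for i = 1."""
    d_min = min(avg_to(x, profile, dist) for x in universe)
    return avg_pairwise(profile, dist) >= d_min


def drop_farthest(universe, profile, dist):
    """Remove the profile element that is farthest from a 1-optimal consensus."""
    x = one_optimal(universe, profile, dist)
    farthest = max(profile, key=lambda y: dist(x, y))
    reduced = list(profile)
    reduced.remove(farthest)
    return reduced


F = 5.0                                   # assumed upper bound of the universe
grid = [k * 0.5 for k in range(11)]       # assumed universe 0, 0.5, ..., 5


def scaled_abs(u, v):
    return abs(u - v) / F


def discrete(x, y):
    return 0.0 if x == y else 1.0


opinions = [1.0, 1.0, 1.5, 4.5]           # hypothetical expert opinions
letters = ["a", "b", "c"]                 # regular profile with X = U

print(is_susceptible(grid, opinions, scaled_abs))
print(is_susceptible(letters, letters, discrete))
print(quality_1(grid, opinions, scaled_abs), 2 / (len(opinions) + 1))
reduced = drop_farthest(grid, opinions, scaled_abs)
print(reduced, quality_1(grid, reduced, scaled_abs))
```

With these sample data the real-valued profile is susceptible to a consensus and its quality exceeds 2/(n + 1), the regular profile is not susceptible, and dropping the opinion 4.5 raises the quality.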
Theorem 6 gives a lower bound on the quality of the consensus when the profile is susceptible to it. The analysis of Theorem 6 shows that for n = 1 the profile has the
maximal quality, and for n > 1 the quality of the consensus is always greater than 0.
In this part of the paper we consider a more complicated knowledge structure: we assume that the profile X contains the experts’ opinions (knowledge of a collective) stored as n binary vectors of length m. Let us consider a real situation in which we ask n experts for their opinions on a more complex problem. The experts are asked to answer one question and can choose one of two possible answers, e.g. “yes” or “no”, “a” or “b”, “occurs” or “does not occur”. It is easy to show that the quality of the consensus for this structure depends on the content and the cardinality of the profile X.
Example 5. The situation mentioned above can be formally presented as follows. Let us consider a distance space (U, δ) where U = {a, b}, δ(x, y) = 0 for x = y and δ(x, y) = 1 for x ≠ y. If we assume that the profile X = {z ∗ a, (n − z) ∗ b}, where z is the number of occurrences of the element a in the profile X, (n − z) is the number of occurrences of the element b in the profile X, and z > (n − z), then the consensus for this profile satisfying the 1-optimality postulate is the element a, and $Q^{1}(x, X) = 1 - \frac{(n - z) \cdot 1}{n} = \frac{z}{n}$.
The example presented above can be generalised to a situation where n experts are asked for their opinions on m different questions. It can be defined in a formal way:
Theorem 7. For an assumed distance space $U = \{u_1, u_2, \ldots\}$, where the elements of the universe are binary vectors, $\delta(w, v) = \sum_{j=1}^{m} |w_j - v_j|$ for $w, v \in U$ such that $w = (w_1, w_2, \ldots, w_m)$, $v = (v_1, v_2, \ldots, v_m)$, $v_q, w_q \in \{0, 1\}$, $q \in \{1, \ldots, m\}$, and the profile $X = \{a_1, a_2, \ldots, a_n\} \in \Pi(U)$, where $a_i = (a_{i1}, a_{i2}, \ldots, a_{im})$, $i \in \{1, \ldots, n\}$, the consensus satisfying the 2-optimality postulate has a quality equal to or higher than:
$$Q^{2}(x, X) \ge 1 - \frac{\left( \sum_{j \in D} \sum_{i=1}^{n} a_{ij} + \sum_{j \in C} \left( n - \sum_{i=1}^{n} a_{ij} \right) \right)^2}{m^2 n}$$
where $C = \{ j : \sum_{i=1}^{n} a_{ij} \ge \frac{n}{2} \}$, $D = \{ j : \sum_{i=1}^{n} a_{ij} < \frac{n}{2} \}$ and n is the cardinality of X.
Proof. The proof of Theorem 7 follows from an algorithm given in [23]. Determining the consensus satisfying the 2-optimality criterion is an NP-complete problem, therefore a heuristic algorithm was proposed. The consensus satisfying the 1-optimality postulate is determined using the fact that, for a fixed $j = 1, \ldots, m$, if $\sum_{i=1}^{n} a_{ij} \le \frac{n}{2}$ then $u^*_j = 0$, and $u^*_j = 1$ otherwise, where $u^*$ is the desired consensus. Next, the vector $u^*$ is modified in order to find a better consensus satisfying the 2-optimality postulate. As a result, the quality of the consensus satisfying the 2-optimality criterion is equal to or better than the quality of the consensus satisfying the 1-optimality criterion. Additionally, the inequality

$$Q_2(x, X) \ge 1 - \frac{\left( \sum_{j \in D} \sum_{i=1}^{n} a_{ij} + \sum_{j \in C} \left( n - \sum_{i=1}^{n} a_{ij} \right) \right)^2}{m^2 n}$$

is true due to the property of the sum: $\left( \sum_i s_i \right)^2 \ge \sum_i s_i^2$, where $s_i$ is a natural number. Therefore,

$$\left( \sum_{j \in D} \sum_{i=1}^{n} a_{ij} + \sum_{j \in C} \left( n - \sum_{i=1}^{n} a_{ij} \right) \right)^2 \ge \sum_{i}^{n} \sum_{j}^{m} (a_{ij} - x_j)^2,$$

where $x_j$ is the $j$-th component of the consensus vector, and

$$1 - \frac{\left( \sum_{j \in D} \sum_{i=1}^{n} a_{ij} + \sum_{j \in C} \left( n - \sum_{i=1}^{n} a_{ij} \right) \right)^2}{m^2 n} \le 1 - \frac{\sum_{i}^{n} \sum_{j}^{m} (a_{ij} - x_j)^2}{m^2 n} = Q_2(x, X).$$
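The column-wise rule used in the proof for the 1-optimality postulate can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the four-expert profile are hypothetical, the brute-force search is added purely as a sanity check, and the heuristic of [23] that refines the vector towards 2-optimality is not reproduced here.

```python
from itertools import product


def hamming(w, v):
    """Distance delta(w, v): number of positions where two binary vectors differ."""
    return sum(abs(wj - vj) for wj, vj in zip(w, v))


def majority_consensus(profile):
    """Rule from the proof: u_j = 0 if sum_i a_ij <= n/2, and u_j = 1 otherwise."""
    n = len(profile)
    column_sums = [sum(column) for column in zip(*profile)]
    return tuple(0 if 2 * s <= n else 1 for s in column_sums)


def total_distance(x, profile):
    """Sum of distances from a candidate x to every opinion in the profile."""
    return sum(hamming(x, a) for a in profile)


def brute_force_best(profile):
    """Smallest total distance over all 2^m binary vectors (sanity check only)."""
    m = len(profile[0])
    return min(total_distance(x, profile) for x in product((0, 1), repeat=m))


# Hypothetical profile: four experts answering three yes/no questions.
profile = [(1, 1, 0), (1, 0, 0), (0, 1, 0), (1, 1, 1)]
u = majority_consensus(profile)
print(u)                          # (1, 1, 0)
print(total_distance(u, profile)) # 3
print(brute_force_best(profile))  # 3
```

For this small profile the column-wise vector attains the same total distance as the exhaustive search, which is what the 1-optimality postulate requires.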
From Theorem 7 it is possible to decide how to change the profile in order to maximise the quality of the consensus:

Example 6. Let us assume that we asked three experts about the observed symptoms of flu. Their answers were collected and are presented in Fig. 5, where “1” means that a symptom occurs and “0” otherwise. For the profile X = {(1, 0, 1), (0, 1, 1), (0, 1, 1)} the consensus satisfying the 1-optimality postulate is equal to (0, 1, 1).
Fig. 5. The first group of experts’ answers.
Fig. 6. The second group of experts’ answers.
According to the algorithm presented in [23], the consensus satisfying the 2-optimality postulate is the same or better in terms of its quality. Therefore,

$$Q_2(x, X) \ge 1 - \frac{\left( \sum_{j \in D} \sum_{i=1}^{n} a_{ij} + \sum_{j \in C} \left( n - \sum_{i=1}^{n} a_{ij} \right) \right)^2}{m^2 n} = 1 - \frac{(1 + (3 - 2) + (3 - 3))^2}{3^2 \cdot 3} = 1 - \frac{4}{27} = \frac{23}{27}.$$

For the profile Y = {(1, 0, 1), (0, 1, 1), (0, 0, 0)} obtained from the data presented in Fig. 6, the consensus satisfying the 1-optimality postulate is equal to (0, 0, 1). Therefore,

$$Q_2(y, Y) \ge 1 - \frac{(1 + 1 + (3 - 2))^2}{3^2 \cdot 3} = 1 - \frac{9}{27} = \frac{18}{27}.$$

This example illustrates that for the profile X it is possible to determine a consensus of better quality than for the profile Y.
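The two bounds in Example 6 can be recomputed directly from the profiles. The sketch below is a minimal illustration of the Theorem 7 expression; the function name `theorem7_lower_bound` is an assumption made for this example, and exact fractions are used so that the results can be compared with 23/27 and 18/27.

```python
from fractions import Fraction


def theorem7_lower_bound(profile):
    """Lower bound on Q_2 from Theorem 7 for a profile of binary vectors."""
    n = len(profile)      # cardinality of the profile
    m = len(profile[0])   # length of each binary vector
    column_sums = [sum(column) for column in zip(*profile)]
    # Columns in D (sum < n/2) contribute their number of ones;
    # columns in C (sum >= n/2) contribute their number of zeros.
    total = sum(s if 2 * s < n else n - s for s in column_sums)
    return 1 - Fraction(total ** 2, m ** 2 * n)


X = [(1, 0, 1), (0, 1, 1), (0, 1, 1)]
Y = [(1, 0, 1), (0, 1, 1), (0, 0, 0)]
print(theorem7_lower_bound(X))  # 23/27
print(theorem7_lower_bound(Y))  # 2/3, i.e. 18/27 in lowest terms
```

Only column sums of the profile are needed, so the bound is obtained without determining the consensus itself.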
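Theorem 8. For an assumed distance space: $U = \{u_1, u_2, \ldots\}$, where elements of the universe are binary vectors, $\delta(w, v) = \sum_{j=1}^{m} |w_j - v_j|$ for such $w, v \in U$ that $w = (w_1, w_2, \ldots, w_m)$, $v = (v_1, v_2, \ldots, v_m)$, $v_q, w_q \in \{0, 1\}$, $q \in \{1, \ldots, m\}$, and the profile $X = \{a_1, a_2, \ldots, a_n\} \in \Pi(U)$, where $a_i = (a_{i1}, a_{i2}, \ldots, a_{im})$, $i \in \{1, \ldots, n\}$, the consensus satisfying the 2-optimality postulate approaches the maximal quality if:

$$\lim_{n \to \infty} \left| n - \frac{\sum_{j \in C} \sum_{i=1}^{n} a_{ij} - \sum_{j \in D} \sum_{i=1}^{n} a_{ij}}{k} \right| = 0,$$

where $C = \{ j : \sum_{i=1}^{n} a_{ij} \ge \frac{n}{2} \}$, $D = \{ j : \sum_{i=1}^{n} a_{ij} < \frac{n}{2} \}$, $n$ is the cardinality of X and $k = card(C)$.

Proof. For an assumed distance space, from Theorem 7 we know that the quality satisfies

$$Q \ge 1 - \frac{\left( \sum_{j \in D} \sum_{i=1}^{n} a_{ij} + \sum_{j \in C} \left( n - \sum_{i=1}^{n} a_{ij} \right) \right)^2}{m^2 n}.$$

The maximal quality appears when $\frac{\left( \sum_{j \in D} \sum_{i=1}^{n} a_{ij} + \sum_{j \in C} \left( n - \sum_{i=1}^{n} a_{ij} \right) \right)^2}{m^2 n}$ is minimal, therefore:

$$\lim_{n \to \infty} \left( \sum_{j \in D} \sum_{i=1}^{n} a_{ij} + \sum_{j \in C} \left( n - \sum_{i=1}^{n} a_{ij} \right) \right) = 0$$

$$\Rightarrow \lim_{n \to \infty} \left( \sum_{j \in D} \sum_{i=1}^{n} a_{ij} + k \cdot n - \sum_{j \in C} \sum_{i=1}^{n} a_{ij} \right) = 0$$

$$\Rightarrow \lim_{n \to \infty} \left| n - \frac{\sum_{j \in C} \sum_{i=1}^{n} a_{ij} - \sum_{j \in D} \sum_{i=1}^{n} a_{ij}}{k} \right| = 0.$$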
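To get a feeling for the expression in Theorem 8, it can be evaluated for a concrete, finite profile. The sketch below is an illustration only: the theorem concerns the limit as n grows, whereas the code computes the value for one fixed n; the function name `theorem8_gap` and the sample profiles are assumptions, and the case of an empty set C (where k = 0) is simply rejected.

```python
from fractions import Fraction


def theorem8_gap(profile):
    """Value of |n - (sum over C - sum over D) / k| for a finite profile."""
    n = len(profile)
    column_sums = [sum(column) for column in zip(*profile)]
    sums_in_c = [s for s in column_sums if 2 * s >= n]  # columns with sum >= n/2
    sums_in_d = [s for s in column_sums if 2 * s < n]   # columns with sum < n/2
    k = len(sums_in_c)
    if k == 0:
        # Assumption of this sketch: the expression is undefined when C is empty.
        raise ValueError("C is empty, so k = 0 and the expression is undefined")
    return abs(n - Fraction(sum(sums_in_c) - sum(sums_in_d), k))


unanimous = [(1, 1, 1)] * 5                  # every expert gives the same answers
split = [(1, 0, 1), (0, 1, 1), (0, 0, 0)]    # profile Y from Example 6
print(theorem8_gap(unanimous))  # 0
print(theorem8_gap(split))      # 3
```

A value of 0 corresponds to a profile for which the Theorem 7 bound already equals 1; larger values indicate more disagreement among the experts.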
Theorem 7 allows us to estimate a lower bound on the quality of the consensus for binary vectors. It allows us to identify when the quality of the decision determined on the basis of experts’ opinions achieves the assumed level. From Theorem 8 we can infer how to prepare a profile (how many experts are needed) to obtain the maximal quality. The analysis of the profile (the knowledge of the collective) is easier and cheaper than the determination of the consensus and its quality. This saving can be significant, especially when processing a big set of data.

5. Conclusions

Nowadays, it frequently happens that in order to make a decision we rely on knowledge originating from different,
autonomous sources, i.e. from experts or from other large sets of data from the Internet, sensors or databases. The large amount of collected knowledge is difficult to process, and processing it is frequently time-consuming and costly. Therefore, we want to know whether it is possible to determine a common and reliable opinion of a collective only by analysing the given knowledge of that collective. For this task the consensus theory was applied. The quality of the determined consensus was investigated for particular cases and for some types of representations of experts’ opinions. From all of the presented theorems, we can conclude that changing the profile can also change the quality of the consensus. This is important, especially when we have the possibility of adding a new expert’s opinion or other new data. Decisions based on experts’ opinions are important for companies’ management divisions, because bad decisions can cause a loss of income. Some experts are more reliable than others; therefore, we need to know whether we can trust those experts and their decisions. For this task we used the measure called the quality of consensus, which estimates the reliability of the determined decisions [1]. However, the determination of a consensus, especially for big sets of data, is very expensive. That is why we proposed an easy method for assessing the quality of the consensus. The analysis of profiles and their cardinalities allows us to decide whether decisions based on experts’ opinions are trustworthy. These conditions were developed for knowledge structures built from real numbers and binary vectors. Additionally, we analysed the quality of the consensus in relation to the consistency of the profile and its susceptibility. If a profile is susceptible to a consensus, then we can easily estimate a lower bound on the quality of the consensus. This allows us to determine whether the consensus (the final decision) is profitable and can achieve the assumed level of quality.
In the future, we want to focus on more complicated methods of knowledge representation, such as hierarchical structures, ordered partitions or ordered coverings. Additionally, we would like to analyse the situation where it is impossible to solve conflicts using only one-level algorithms due to the high complexity of the operations or the geographical distance between the locations of knowledge bases. In such a case, the consensus needs to be determined in two or more steps. Our upcoming work will be devoted to developing methods of assessing the quality of the consensus achieved in this way.

References

[1] N.T. Nguyen, Advanced Methods for Inconsistent Knowledge Management, Springer Science & Business Media, 2007.
[2] N.T. Nguyen, Processing inconsistency of knowledge in determining knowledge of a collective, Cybernet. Syst. 40 (8) (2009) 670–688.
[3] H. Bustince, E. Barrenechea, T. Calvo, S. James, G. Beliakov, Consensus in multi-expert decision making problems using penalty functions defined over a cartesian product of lattices, Inf. Fus. 17 (2014) 56–64.
[4] L. Rosello, M. Sanchez, N. Agell, F. Prats, F.A. Mazaira, Using consensus and distances between generalized multi-attribute linguistic assessments for group decision-making, Inf. Fus. 17 (2014) 83–92.
[5] M. Fedrizzi, J. Kacprzyk, J. Owsinski, S. Zadrozny, Consensus reaching via a gdss with fuzzy majority and clustering of preference profiles, Ann. Oper. Res. 51 (1994) 127–139.
[6] E. Szmidt, J. Kacprzyk, A consensus-reaching process under intuitionistic fuzzy preference relations, Int. J. Intell. Syst. 18 (2003) 837–852.
[7] F. Mata, L. Martinez, E. Herrera-Viedma, An adaptive consensus support model for group decision-making problems in a multigranular fuzzy linguistic context, Ann. Oper. Res. 17 (2) (2009) 279–290.
[8] Y. Dong, X. Chen, F. Herrera, Minimizing adjusted simple terms in the consensus reaching process with hesitant linguistic assessments in group decision making, Inf. Sci. 297 (2015) 95–117.
[9] I. Palomares, F.J. Estrella, L. Martínez, F. Herrera, Consensus under a fuzzy context: taxonomy, analysis framework afryca and experimental case of study, Inf. Fus. 20 (2014) 252–271.
[10] I. Palomares, L. Martinez, F. Herrera, A consensus model to detect and manage noncooperative behaviors in large-scale group decision making, Fuzzy Syst. IEEE Trans. 22 (3) (2014) 516–530.
[11] A. Choudhury, R. Shankar, M. Tiwari, Consensus-based intelligent group decision-making model for the selection of advanced technology, Dec. Support Syst. 42 (2006) 1776–1799.
[12] Y. Dong, Y. Xu, H. Li, B. Feng, The owa-based consensus operator under linguistic representation models using position indexes, Eur. J. Oper. Res. 203 (2010) 455–463.
[13] Z. Xu, An automatic approach to reaching consensus in multiple attribute group decision making, Comput. Ind. Eng. 56 (4) (2009) 1369–1374.
[14] D. Ben-Arieh, T. Easton, Multi-criteria group consensus under linear cost opinion elasticity, Dec. Support Syst. 43 (2007) 713–721.
[15] Y. Dong, G. Zhang, W.-C. Hong, Y. Xu, Consensus models for ahp group decision making under row geometric mean prioritization method, Dec. Support Syst. 49 (2010) 281–289.
[16] N.T. Nguyen, Inconsistency of knowledge and collective intelligence, Cybernet. Syst. 39 (6) (2009) 542–562.
[17] Y. Dong, N. Luo, H. Liang, Consensus building in multiperson decision making with heterogeneous preference representation structures: A perspective based on prospect theory, Appl. Soft Comput. 35 (2015) 898–910.
[18] Y. Dong, H. Zhang, Multiperson decision making with different preference representation structures: a direct consensus framework and its properties, Knowl.-Based Syst. 58 (2014) 45–57.
[19] N.T. Nguyen, et al. (Eds.), Computational collective intelligence: technologies and applications, Springer Berlin Heidelberg, 2012, pp. 1–10, doi:10.1007/978-3-642-34630-9_1.
[20] M. Richardson, P. Domingos, Building large knowledge bases by mass collaboration, in: Proceedings of KCAP 03, October 23–25, 2003, Sanibel Island, Florida, USA, 2003.
[21] N.T. Nguyen, et al. (Eds.), Intelligent information and database systems, Springer International Publishing, 2012, pp. 75–84, doi:10.1007/978-3-319-15702-3_8.
[22] M. Gebala, N.T. Nguyen, et al., An analysis of influence of consistency degree on quality of collective knowledge using binary vector structure, Stud. Comput. Intell. 572 (2015) 3–13.
[23] N.T. Nguyen, Consensus Choice Methods and their Application to Solving Conflicts in Distributed Systems, Wroclaw University of Technology Press, 2002.
[24] K.P. Bogart, Preference structures i: Distances between transitive preference relations, J. Math. Sociol. 3 (1) (1973) 49–67.