Consensus via penalty functions for decision making in ensembles in fuzzy rule-based classification systems




Accepted Manuscript

Authors: Mikel Elkano, Mikel Galar, José Antonio Sanz, Paula Fernanda Schiavo, Sidnei Pereira Jr., Graçaliz Pereira Dimuro, Eduardo N. Borges, Humberto Bustince

PII: S1568-4946(17)30315-0
DOI: http://dx.doi.org/10.1016/j.asoc.2017.05.050
Reference: ASOC 4254

To appear in: Applied Soft Computing

Received date: 7-2-2017
Revised date: 3-5-2017
Accepted date: 24-5-2017

Please cite this article as: Mikel Elkano, Mikel Galar, José Antonio Sanz, Paula Fernanda Schiavo, Sidnei Pereira Jr., Graçaliz Pereira Dimuro, Eduardo N. Borges, Humberto Bustince, Consensus via Penalty Functions for Decision Making in Ensembles in Fuzzy Rule-based Classification Systems, (2017), http://dx.doi.org/10.1016/j.asoc.2017.05.050. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


Consensus via Penalty Functions for Decision Making in Ensembles in Fuzzy Rule-based Classification Systems


Mikel Elkano(a,b), Mikel Galar(a,b), José Antonio Sanz(a,b), Paula Fernanda Schiavo(c), Sidnei Pereira Jr.(c), Graçaliz Pereira Dimuro(b,c), Eduardo N. Borges(c), Humberto Bustince(a,b)

(a) Departamento de Automática y Computación, Universidad Publica de Navarra, Campus Arrosadía, Navarra, 31006, Spain
(b) Institute of Smart Cities, Universidad Publica de Navarra, Campus Arrosadía, Navarra, 31006, Spain
(c) Centro de Ciências Computacionais, Universidade Federal do Rio Grande, Av. Itália km 08, Campus Carreiros, Rio Grande, 96201-900, Brazil

Abstract


The aim of this paper is to propose a consensus method via penalty functions for decision making in ensembles of fuzzy rule-based classification systems (FRBCSs). For that, we first introduce a method based on overlap indices for building confidence and support measures, which are usually used to evaluate the degree of certainty or interest of a certain association rule. These overlap indices (generalizations of Zadeh's consistency index between two fuzzy sets) are built using overlap functions, which are a special kind of not necessarily associative aggregation functions proposed for applications related to the overlap problem and/or when the associativity property is not demanded. Then, we introduce a new fuzzy reasoning method (FRM) for the FRBCS, considering different overlap indices, which generalizes the classical methods. By considering several overlap indices and aggregation functions, we generate fuzzy rule-based ensembles, providing different results. For the decision making related to the selection of the best class, we introduce a consensus method for classification, based on penalty functions. We also present theoretical results related to the developed methods. A detailed example of a generation of fuzzy rule-based ensembles based on the proposed approach, and the decision making by consensus via penalty functions, is presented.

Keywords: fuzzy rule-based classification system, aggregation function, penalty function, overlap function, overlap index, confidence and support measures

Email addresses: [email protected] (Mikel Elkano), [email protected] (Mikel Galar), [email protected] (José Antonio Sanz), [email protected] (Paula Fernanda Schiavo), [email protected] (Sidnei Pereira Jr.), [email protected]; [email protected] (Graçaliz Pereira Dimuro), [email protected] (Eduardo N. Borges), [email protected] (Humberto Bustince)

Preprint submitted to Applied Soft Computing

June 1, 2017


1. Introduction


Classification problems are present in many different real-world applications. In particular, Fuzzy Rule-Based Classification Systems (FRBCSs) [1] are widely used to deal with classification problems [2, 3, 4] in real-world applications for several reasons: they provide an interpretable model, usually present a good performance [5, 6] and can combine information coming from different sources, such as expert knowledge, mathematical models, data bases, or empirical measures [6]. See, for example, their use in industry [7], fingerprint detection [6, 8, 9], health [6, 10], earthquake prediction [11] and economy [5]. Aggregation functions [12, 13] play an important role in FRBCSs, since they are used for obtaining a single output value from several input values. Aggregation functions are also used in many other applications, such as pattern recognition, image processing [14] and decision making [15, 16]. Examples of aggregation functions are t-norms and t-conorms, uninorms, overlap and grouping functions, weighted quasi-arithmetic means, ordered weighted average (OWA) functions, and Choquet and Sugeno integrals (see, e.g., [17, 18, 19, 20, 21, 22, 23, 24]). In particular, averaging aggregation functions [12, 25] provide output values that are bounded by the minimum and maximum of the inputs, representing a consensus value of the inputs. They include a large class of functions (e.g., quasi-arithmetic means, medians, OWA functions and fuzzy integrals), which are often used in preference aggregation, aggregation of expert opinions, and judgements in sports competitions. See [25] and the references therein. Overlap functions [18, 20, 21, 22, 26, 27, 28] are a special kind of not necessarily associative aggregation functions proposed for applications related to the overlap problem and/or when the associativity property is not demanded, as in image processing [29], decision making based on fuzzy preference relations [19] and classification problems [3, 4, 30, 31], respectively.
In [32, 33], overlap functions were used to build overlap indices [34, 35, 36], which measure the degree of overlapping between two fuzzy sets, being generalizations of Zadeh's consistency index [37]. In particular, Garcia-Jimenez et al. [32] generalize the inference algorithm for interpolative fuzzy systems using overlap indices constructed from overlap functions, proposing an algorithm to select, from a set of different overlap indices, the best one for the considered application, in the sense of Baldwin's axioms [38] (that is, the overlap index that provides the biggest conclusion). One can point out that there are other measures/indices that involve the idea of overlapping. For example, in data mining, especially in FRBCSs, two such measures are usually used to evaluate the degree of certainty or interest of a certain association rule, namely, the degree of confidence and the support [4, 39]. The confidence of an association is classically measured by the co-occurrence of attributes in tuples in the database. Then, in this paper, we introduce a method to build confidence and support measures using different overlap indices, which gives rise to a new fuzzy reasoning method (FRM) and the respective classification algorithm of the FRBCS, which generalizes the classical methods [40] found in the literature.



By using several overlap indices and different aggregation functions in the FRM, we produce an ensemble of fuzzy rule-based classifiers, where each base classifier provides an output when classifying new examples. So, in the presence of several different outputs, we face the problem of decision making with respect to the consensual result, which should be the one with the least deviation in relation to all other inputs. For that, we propose to use penalty functions, in the sense of Bustince et al. [41]. Penalty functions are able to provide a measure of the deviation of the output values obtained by different aggregation functions, in order to indicate a consensus value of the inputs, or a penalty for not having a consensus (see, e.g., [16, 24, 25, 42, 43, 44, 45, 46, 47, 48, 49]). Examples of functions that minimize some penalty functions, called penalty-based functions, are the weighted arithmetic and geometric means, the median and the mode. Observe that there are different approaches for computing consensus measures. One of them is based on the distance between decision makers [50], as in the works by Wu and Xu [51, 52]. Another approach considers the distance to the collective preference, where the consensus value may be a new value that represents the consensus. Penalty functions never produce a new consensus value, but select one among the results already obtained, namely, the one that presents the least deviation in relation to the other alternatives.[1] The objectives of this paper are:


1. To introduce a method for building confidence and support measures, based on overlap indices (Algorithm 1);
2. To introduce a new FRM for the FRBCS, considering different overlap indices, which generalizes the classical methods (Algorithm 2);
3. To generate fuzzy rule-based ensembles, by considering several overlap indices and aggregation functions, providing different results;
4. To develop a consensus method for the classification, based on penalty functions (Algorithm 3);
5. To present theoretical results related to the developed methods;
6. To develop an example involving the steps 1–4 of our proposal.

The paper is organized as follows. Section 2 presents the basic concepts necessary to develop this work. Section 3 is the core of the paper, presenting Algorithms 1, 2 and 3, and related theoretical results. Section 4 shows a detailed example of a generation of fuzzy rule-based ensembles, and the decision making by consensus via penalty functions. Section 5 is the Conclusion.

2. Preliminaries

In this section, we recall some basic concepts that are important for the development of the paper. In the following, given a finite universe set U, with card(U) = n, denote by FS(U) the space of all fuzzy sets defined over U.

[1] See the work by Herrera-Viedma et al. [53] for an extensive discussion on soft consensus models.


A fuzzy set X is called normal if there exists u ∈ U such that X(u) = 1. Two fuzzy sets X, Y ∈ FS(U) are said to be completely disjoint if X(u)Y(u) = 0, for every u ∈ U.

Definition 1. A function f : [0, 1] → ℝ is convex if for every x, y ∈ [0, 1] and for every λ ∈ [0, 1] the inequality

f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y)

holds.

Definition 2. A function f : [0, 1] → ℝ is quasi-convex if for every x, y ∈ [0, 1] and for every λ ∈ [0, 1] the inequality

f(λx + (1 − λ)y) ≤ max{f(x), f(y)}

holds.

Proposition 1. [41, Proposition 2.2] Consider the function g : [0, 1] → ℝ. Then it holds that:

(i) If g is monotonic then g is quasi-convex;
(ii) If g is convex then g is quasi-convex;
(iii) If g is convex then g has a minimizer in [0, 1].

Definition 3. [54] A function f : [0, 1] → ℝ is lower semicontinuous at x0 ∈ [0, 1] if

lim inf_{x→x0} f(x) ≥ f(x0).

Corollary 1. [41, Corollary 2.1] Let f : [0, 1] → ℝ be a quasi-convex and lower semicontinuous function. Then, the set of minimizers of f is a connected non-empty set.

Definition 4. A function F : [0, 1]^n → ℝ is idempotent if, for every x ∈ [0, 1] it holds that F(x, . . . , x) = x.

2.1. Aggregation operators: t-norms and overlap functions

The key concepts in FRBCSs are the aggregation functions:

Definition 5. [12, 55] An n-ary aggregation function is a mapping A : [0, 1]^n → [0, 1] satisfying the following properties: (A1) A is increasing[2] in each argument: for each i ∈ {1, . . . , n}, if xi ≤ y, then A(x1, . . . , xn) ≤ A(x1, . . . , xi−1, y, xi+1, . . . , xn);

[2] We consider that an increasing function may not be strictly increasing (and analogously for decreasing functions).


(A2) The boundary conditions: A(0, . . . , 0) = 0 and A(1, . . . , 1) = 1.


Definition 6. An n-ary aggregation function A : [0, 1]n → [0, 1] is said to be 0-positive if it also satisfies the following property: (A3) The boundary conditions for 0: A(x1 , . . . , xn ) = 0 if and only if xi = 0, for all i ∈ {1, . . . , n}.


An n-ary aggregation function A : [0, 1]n → [0, 1] is said to be 1-positive if it also satisfies the following property:


(A4) The boundary conditions for 1: A(x1 , . . . , xn ) = 1 if and only if xi = 1, for all i ∈ {1, . . . , n}.


Definition 7. An aggregation function f : [0, 1]n → [0, 1] is said to be averaging if it is bounded by the minimum and maximum of its arguments, that is, for all (x1 , . . . , xn ) ∈ [0, 1]n , it holds that: min{x1 , . . . , xn } ≤ f (x1 , . . . , xn ) ≤ max{x1 , . . . , xn }.


Due to the monotonicity of an aggregation function f, its averaging behavior is equivalent to the idempotency property.


Definition 8. [17] A t-norm is a bivariate aggregation function T : [0, 1]2 → [0, 1] satisfying the following properties, for all x, y, z ∈ [0, 1]: (T1) Commutativity: T (x, y) = T (y, x);


(T2) Associativity: T (x, T (y, z)) = T (T (x, y), z); (T3) Boundary condition: T (x, 1) = x.


An element x ∈ ]0, 1] is a non-trivial zero divisor of T if there exists y ∈ ]0, 1] such that T(x, y) = 0. A t-norm is positive if and only if it has no non-trivial zero divisors, i.e., if T(x, y) = 0 then either x = 0 or y = 0. Two examples of continuous and positive t-norms are the minimum and the product t-norms, respectively:

TM(x, y) = min{x, y},    TP(x, y) = xy.

Due to the associativity property, one can trivially define n-ary t-norms.

Definition 9. [18, 19, 29, 20, 22, 21, 27, 26] An overlap function is a bivariate function O : [0, 1]^2 → [0, 1] satisfying the following properties, for all x, y ∈ [0, 1]:

(O1) O is commutative: O(x, y) = O(y, x);
(O2) O(x, y) = 0 if and only if x = 0 or y = 0;
(O3) O(x, y) = 1 if and only if x = y = 1;


Table 1: Examples of overlap functions (cf. [18, 19, 29, 20, 22, 21, 27, 23, 56, 28])

TM(x, y) = min{x, y}
TP(x, y) = xy
OmM(x, y) = min{x, y} · max{x², y²}
Op(x, y) = x^p y^p, p > 0 (in particular, O√(x, y) = √(xy))
ODB(x, y) = 2xy/(x + y) if x + y ≠ 0; 0 if x + y = 0
O2V(x, y) = (1 + (2x − 1)²(2y − 1)²)/2 if x, y ∈ ]0.5, 1]; min{x, y} otherwise
Om½(x, y) = min{√x, √y}
Ok(x, y) = min{x^k y, x y^k}
Orat(x, y) = √(xy) / (√(xy) + 1 − xy)

(O4) O is increasing;
(O5) O is continuous.


An overlap function O is associative if and only if O is a continuous and positive t-norm (see [18]). Examples of overlap functions are presented in Table 1. Observe that TM and TP are also t-norms.
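As an illustration (our own sketch, not code from the paper), some of the overlap functions in Table 1 can be written directly in Python and spot-checked against properties (O1)–(O4) of Definition 9 on a finite grid; continuity (O5) is not checked numerically here:

```python
import math

# Illustrative implementations of a few overlap functions from Table 1.
def O_prod(x, y):   # T_P: product t-norm, also an overlap function
    return x * y

def O_min(x, y):    # T_M: minimum t-norm, also an overlap function
    return min(x, y)

def O_mM(x, y):     # O_mM(x, y) = min{x, y} * max{x^2, y^2}
    return min(x, y) * max(x * x, y * y)

def O_sqrt(x, y):   # O_p with p = 1/2: sqrt(x * y)
    return math.sqrt(x * y)

def is_overlap_like(O, grid=21):
    """Spot-check (O1) commutativity, (O2)/(O3) boundary conditions and
    (O4) monotonicity (in the 2nd argument, which suffices by (O1))."""
    pts = [i / (grid - 1) for i in range(grid)]
    for x in pts:
        for y in pts:
            if abs(O(x, y) - O(y, x)) > 1e-12:          # (O1)
                return False
            if (O(x, y) == 0) != (x == 0 or y == 0):    # (O2)
                return False
            if (O(x, y) == 1) != (x == 1 and y == 1):   # (O3)
                return False
        for i in range(len(pts) - 1):                   # (O4)
            if O(x, pts[i]) > O(x, pts[i + 1]) + 1e-12:
                return False
    return True

print(all(is_overlap_like(O) for O in (O_prod, O_min, O_mM, O_sqrt)))  # True
```

Note that O_sqrt and O_mM pass all checks despite not being associative, which is precisely the extra freedom overlap functions allow over t-norms.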


Proposition 2. [32, Proposition 5] Let O : [0, 1]2 → [0, 1] be an overlap function and T : [0, 1]n → [0, 1] an n-ary t-norm. Then it holds that O(x, T (y1 , . . . , yn )) = T (O(x, y1 ), . . . , O(x, yn ))

if and only if T = min.

2.2. Overlap indices
Overlap indices are used to measure the degree of overlapping between two fuzzy sets, and constitute a generalization of Zadeh's consistency index between two fuzzy sets over the same referential universe. Several definitions of overlap indices may be found in the literature (see, e.g., [32, 33, 34, 37]). In this paper, we adopt the approach proposed by Bustince et al. [33], which was also formalized by Garcia-Jimenez et al. [32], together with a method for constructing overlap indices by means of overlap functions. Our aim is to define confidence and support measures of association rules using overlap indices.

Definition 10. [32] An overlap index is a function O : FS(U) × FS(U) → [0, 1] such that, for all A, B, C ∈ FS(U), the following conditions hold:


Table 2: Examples of overlap indices


OZ(A, B) = max_{u∈U} min{A(u), B(u)}
Ox(A, B) = 0 if A(u)B(u) = 0 for all u ∈ U; x otherwise (for a fixed x ∈ ]0, 1])
Oπ(A, B) = (1/n) Σ_{u∈U} A(u) · B(u), where n = card(U)

Table 3: Examples of overlap indices constructed using Theorem 1

Overlap function O | Aggregation function M | Overlap index O
O√   | arithmetic mean | O√(A, U) = (1/n) Σ_{u∈U} √(A(u) · 1)
OM   | maximum         | OZ(A, U) = max_{u∈U} min{A(u), 1} [37]
Orat | arithmetic mean | Orat(A, U) = (1/n) Σ_{u∈U} √(A(u) · 1) / (√(A(u) · 1) + 1 − A(u))

(O1) O(A, B) = 0 if and only if A and B have disjoint supports, that is, for all u ∈ U, it holds that A(u)B(u) = 0;


(O3) If B ≤ C, then O(A, B) ≤ O(A, C).


An overlap index O is said to be normal whenever the following condition holds: (O4) If there exists u ∈ U such that A(u) = B(u) = 1, then O(A, B) = 1.


Some examples of overlap indices are presented in Table 2. Observe that OZ is Zadeh's consistency index [37]. OZ and Oπ are normal, while Ox is not normal whenever x ≠ 1 [32].

Theorem 1. [32] Let M : [0, 1]^n → [0, 1] be a 0-positive aggregation function and O : [0, 1]^2 → [0, 1] be an overlap function. Then, the mapping O : FS(U) × FS(U) → [0, 1] defined, for all X, Y ∈ FS(U) and ui ∈ U, with i = 1, . . . , n, by

O(X, Y) = M(O(X(u1), Y(u1)), . . . , O(X(un), Y(un)))   (1)

is an overlap index. Reciprocally, if O is an overlap function and M : [0, 1]^n → [0, 1] is an aggregation function such that O, defined by Equation (1), is an overlap index, then M is 0-positive.

Table 3 shows some examples of overlap indices constructed using Theorem 1, using some overlap functions shown in Table 1.
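The Theorem 1 construction is mechanical enough to sketch in a few lines of Python (our own naming; fuzzy sets are represented as membership vectors over a finite universe U):

```python
# Sketch of the Theorem 1 construction: an overlap index O(X, Y) obtained by
# aggregating, with M, the pointwise values of an overlap function O.
def overlap_index(O, M):
    """Return the map (X, Y) -> M(O(X(u1), Y(u1)), ..., O(X(un), Y(un)))."""
    def index(X, Y):
        assert len(X) == len(Y)
        return M([O(x, y) for x, y in zip(X, Y)])
    return index

mean = lambda values: sum(values) / len(values)   # arithmetic mean: 0-positive

# With O = product and M = arithmetic mean we obtain O_pi from Table 2;
# with O = min and M = maximum we obtain Zadeh's consistency index O_Z.
O_pi = overlap_index(lambda x, y: x * y, mean)
O_Z = overlap_index(lambda x, y: min(x, y), max)

X = [0.2, 0.8, 1.0, 0.0]
Y = [0.5, 0.4, 0.7, 0.9]
print(O_pi(X, Y))   # (0.1 + 0.32 + 0.7 + 0.0) / 4, i.e. approximately 0.28
print(O_Z(X, Y))    # max of the pointwise minima, i.e. 0.7
```

The same factory covers every row of Table 3 by swapping in the corresponding overlap function and 0-positive aggregation.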



2.3. Penalty Functions One can find different definitions of penalty functions in the literature (see, e.g., [42, 43, 44, 24, 16, 57, 45, 46, 47]). In this paper, considering the discussion by Bustince et al. [41], we have decided to adopt the following definition, since it overcomes many drawbacks of the previous definitions:

Definition 11. [41, Definition 4.1] For any closed interval I ⊆ ℝ, the function P : [0, 1]^{n+1} → ℝ⁺ is a penalty function if and only if there exists c ∈ ℝ⁺ such that:

(P1) P(x, y) ≥ c, for all x ∈ [0, 1]^n, y ∈ [0, 1];
(P2) P(x, y) = c if and only if xi = y, for all i = 1, . . . , n;
(P3) P is quasi-convex and lower semi-continuous in y for each x ∈ [0, 1]^n.

Definition 12. [41, Definition 4.2] Let P be a penalty function in the sense of Definition 11. The function fP : [0, 1]^n → [0, 1] is said to be a P-function if, for each x ∈ [0, 1]^n, one has that

fP(x) = (a + b)/2,   (2)

where [a, b] = cl(Minz(P(x, ·))), Minz(P(x, ·)) is the set of minimizers of P(x, ·), that is, Minz(P(x, ·)) = {y ∈ [0, 1] | P(x, y) ≤ P(x, z), for each z ∈ [0, 1]}, and cl(S) is the closure of S ⊆ [0, 1].


Theorem 2. [41, Theorem 4.1] A function f : [0, 1]n → [0, 1] is a P -function if and only if f is idempotent.


Then, any averaging aggregation function can be represented by a P -function.


Example 1. Considering an idempotent function f : [0, 1]^n → [0, 1], ε > 0 and c ≥ 0, the function Pf : [0, 1]^{n+1} → ℝ⁺, defined, for all x ∈ [0, 1]^n and y ∈ [0, 1], by

Pf(x, y) = c, if xi = y for each i; |f(x) − y| + c + ε, otherwise   (3)

is a non-continuous penalty function [41, Proof of Theorem 4.1]. Now, define the function Pf : [0, 1]^{n+1} → ℝ⁺, for all x ∈ [0, 1]^n and y ∈ [0, 1], by:

Pf(x, y) = |f(x) − y| + V(x) + c,   (4)

where V is a strict continuous spread measure [58]. It follows that Pf is a continuous penalty function [41, Example 4.1].
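As a minimal numerical sketch of Definitions 11 and 12 (assuming, as an illustration, the L1 penalty P(x, y) = Σi |xi − y| with c = 0, which is convex and continuous in y), the induced P-function is the midpoint of the median interval. A grid approximation makes this concrete:

```python
# Illustrative L1 penalty: dissimilarity between the inputs xs and a value y.
def penalty(xs, y):
    return sum(abs(x - y) for x in xs)

def p_function(xs, step=0.001):
    """Approximate f_P(x) = (a + b) / 2, where [a, b] is the closure of the
    set of minimizers of P(x, .), by scanning a grid of candidate outputs y."""
    grid = [i * step for i in range(int(1 / step) + 1)]
    best = min(penalty(xs, y) for y in grid)
    minimizers = [y for y in grid if penalty(xs, y) <= best + 1e-9]
    return (minimizers[0] + minimizers[-1]) / 2   # midpoint of [a, b]

print(p_function([0.2, 0.9, 0.4]))   # approximately 0.4 (odd n: the median)
print(p_function([0.2, 0.8]))        # approximately 0.5 (midpoint of [0.2, 0.8])
```

The second call illustrates why Definition 12 takes the midpoint (a + b)/2: for an even number of inputs the whole interval [0.2, 0.8] minimizes the L1 penalty, and Corollary 1 guarantees this minimizer set is connected.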

The penalty function P describes the dissimilarity or disagreement between the inputs in x and the value y. Then, the P-function f is a function that minimizes the chosen dissimilarity. In this work, we use a penalty function defined over a Cartesian product of lattices, introduced by Bustince et al. [16]. However, for the general concept of penalty function, we consider Definition 12 instead of the one provided in [16], since Definition 12 satisfies all the requirements to be used in the construction methods introduced in [16], as we show in Section 3.3.


3. Fuzzy rule-based classification systems using several overlap indices and aggregation functions


A classification problem, from the point of view of supervised learning, consists in finding a decision rule that allows us to determine the class of a new object (also called an example χ ∈ E) among the already existing and known classes, C ∈ {C1, · · · , CM}. An example is described through a set of observations χ = (χ1, · · · , χn). Each of these observations is called a variable, attribute or characteristic. The design of a classifier or a classifying system can be seen as the search for a mapping: D : E → {C1, · · · , CM}


optimal in the sense of some criterion that determines the goodness of the classifier. A fuzzy classifier is a fuzzy rule-based system that utilizes fuzziness only in the reasoning mechanism [59] and that consists of rules in the following format: Rule Rq: If χ1 is Aq1 and · · · and χn is Aqn then Class Cq with CFq


where Rq is the rule label, χ = (χ1, · · · , χn) is an n-dimensional example vector, Aqi denotes the linguistic label of the i-th feature associated with the rule q, Cq is a consequent class and CFq ∈ [0, 1] is the certainty grade of rule q (i.e., the rule weight).


3.1. Construction of certainty degrees from overlap indices As discussed in the Introduction, in order to evaluate the degree of certainty of the rule Rq , that is, to evaluate CFq , one may use the degree of confidence and the support. The confidence of an association is classically measured by the co-occurrence of attributes in tuples in the database. Consider a set of p rules {R1 , · · · , Rp } as follows:


Rule R1 : If χ1 is A11 and · · · and χn is A1n then Class C1 with CF1 =? ··· Rule Rp : If χ1 is Ap1 and · · · and χn is Apn then Class Cp with CFp =?

and take a set of m patterns (examples) χl with l = 1, · · · , m. In the following, we introduce Algorithm 1, which was developed in order to build confidence and support measures using overlap indices. This algorithm is a generalization of the one commonly used in [59].


Algorithm 1

Input: A set of rules Rj, with j ∈ {1, · · · , p}, and a set of examples χl, with l ∈ {1, · · · , m}.
Output: The Confidence (Cnf(Rj)) and the Support (Supp(Rj)) measures, for each rule Rj, with j ∈ {1, · · · , p}.

1: Select a t-norm T and an overlap index O;
2: Order the examples χl = (χl1, · · · , χln), with l = 1, · · · , m, taking into account the class they classify;
3: for q = 1 to p do
4:   Select the s ≤ m examples that tell us that the considered object belongs to the class Cq associated to the rule Rq;
5:   for j = 1 to s do
6:     Calculate the matching degree
         cj(χj) = T(Aq1(χj1), · · · , Aqn(χjn));   (5)
7:   end for
8:   Construct the fuzzy set on U
         Cqs = {(u1, c1(χ1)), · · · , (us, cs(χs)), (us+1, 0), · · · , (um, 0)};   (6)
9:   for l = 1 to m do
10:    Calculate
         cl(χl) = T(Aq1(χl1), · · · , Aqn(χln));
11:  end for
12:  Construct the fuzzy set on U
         Cqm = {(u1, c1(χ1)), · · · , (um, cm(χm))};   (7)
13:  Calculate
         Cnf(Rq) = O(Cqs, U) / O(Cqm, U);   (8)
14:  Calculate
         Supp(Rq) = O(Cqs, U);   (9)
15: end for

Remark 1. Observe that:

1. If, in Equations (8) and (9), we take the expression for the overlap index Oπ, defined in Table 2, we recover the confidence degree and the support introduced by Ishibuchi et al. [40], which are given, respectively, by:

Cnf(Rq) = Oπ(Cqs, U) / Oπ(Cqm, U) = ((1/m) Σ_{i=1}^{s} ci(χi)) / ((1/m) Σ_{i=1}^{m} ci(χi)) = Σ_{i=1}^{s} ci(χi) / Σ_{i=1}^{m} ci(χi)   (10)

and

Supp(Rq) = Oπ(Cqs, U) = (1/m) Σ_{i=1}^{s} ci(χi).   (11)


2. In the construction of the confidence degree Cnf of a rule, using Equation (8), we only take into account the sets Cqs and Cqm. There exist situations where it is necessary to consider the overlap index of the set composed of the rules that do not classify the class considered at that moment [60]. In Algorithm 1, this means that we should build the set

Cq(m−s) = {(u1, 0), · · · , (us, 0), (us+1, cs+1(χs+1)), · · · , (um, cm(χm))}

and build the confidence degree as follows:

Cnf(Rq) = max{0, O(Cqs, U) − O(Cq(m−s), U)} / O(Cqm, U).   (12)
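As a sanity check of the special case in Remark 1 (our own sketch, with matching degrees cl assumed precomputed and the rule under evaluation predicting a hypothetical class 'A'), Algorithm 1 with O = Oπ reduces to a few lines and reproduces Ishibuchi's confidence and support:

```python
# Sketch of Algorithm 1 with the overlap index O_pi (Remark 1, item 1).
def confidence_support(matching, labels, rule_class):
    m = len(matching)
    # C_qs keeps only the matching degrees of examples of the rule's class,
    # C_qm keeps all of them (Equations (6) and (7)).
    c_qs = [c if lab == rule_class else 0.0 for c, lab in zip(matching, labels)]
    c_qm = list(matching)
    o_pi = lambda fs: sum(fs) / m        # O_pi(X, U), U being the whole universe
    supp = o_pi(c_qs)                    # Equation (9)
    cnf = o_pi(c_qs) / o_pi(c_qm)        # Equation (8)
    return cnf, supp

matching = [0.9, 0.6, 0.2, 0.8]          # matching degrees c_l of 4 examples
labels = ['A', 'A', 'B', 'B']
cnf, supp = confidence_support(matching, labels, 'A')
# Cnf  = (0.9 + 0.6) / (0.9 + 0.6 + 0.2 + 0.8) = 0.6   (Equation (10))
# Supp = (0.9 + 0.6) / 4 = 0.375                        (Equation (11))
print(cnf, supp)
```

Swapping `o_pi` for any other overlap index built via Theorem 1 yields the generalized measures of Algorithm 1.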


Theorem 3. Let M : [0, 1]2 → [0, 1] be an aggregation function such that, for all x, y ∈ [0, 1], it holds that M (x, y) = f −1 (ωf (x) + (1 − ω)f (y)),


where ω ∈ ]0, 1[ and f : [0, 1] → [a, b] is a continuous and strictly increasing function. Then one has that:


(i) M(x, min{y1, . . . , yn}) = min{M(x, y1), . . . , M(x, yn)};
(ii) M is a 0-positive aggregation function.


Proof. It follows that:
(i) Consider L = min{y1, . . . , yn}. By the monotonicity of M, one has that M(x, L) ≤ M(x, yi), for all i = 1, . . . , n. Since L = yj for some j, M(x, L) is itself one of the aggregated values, and thus min{M(x, y1), . . . , M(x, yn)} = M(x, L) = M(x, min{y1, . . . , yn}).

(ii) One has that M(x, y) = 0 if and only if f⁻¹(ωf(x) + (1 − ω)f(y)) = 0, if and only if ωf(x) + (1 − ω)f(y) = f(0). If x or y were strictly positive then, from the strict monotonicity of f, f(x) or f(y) would be strictly greater than f(0), and then ωf(x) + (1 − ω)f(y) > f(0). It follows that x = y = 0. □

Corollary 2. In the setting of Theorem 3, the following inequality holds:

M(min{x1, · · · , xn}, min{y1, · · · , yn}) ≤ min{M(x1, y1), · · · , M(xn, yn)}.   (13)

Proof. It follows from the monotonicity of M. □


Using the notation of Step 6 of Algorithm 1, we build the following sets over the referential U:

⋃_{l=1}^{s} Aqγ = {(ul, Aqγ(χlγ)) | ul ∈ U},   (14)

where γ = 1, · · · , n, q ∈ {1, . . . , p} is the label of the rule Rq, s ≤ m is the number of patterns informing that the considered object belongs to the class Cq associated to the rule Rq, m is the total number of patterns and χl = (χl1, · · · , χln), with l = 1, · · · , m, is the ordered set of patterns, taking into account the class they classify. Then, we introduce the following result:

Theorem 4. Let O : [0, 1]^2 → [0, 1] be an overlap function and M : [0, 1]^n → [0, 1] a 0-positive aggregation function satisfying the conditions of Theorem 3. Let O : FS(U) × FS(U) → [0, 1] be an overlap index built, from O and M, according to Theorem 1. Then, in the setting of Algorithm 1, whenever one considers T = min, it holds that:

O(Cqs, U) ≤ min{ O(⋃_{l=1}^{s} Aq1, U), . . . , O(⋃_{l=1}^{s} Aqn, U) },

where Cqs is the fuzzy set on U defined by Equation (6), q ∈ {1, . . . , p} is the label of the rule Rq, s ≤ m is the number of patterns informing that the considered object belongs to the class Cq associated to the rule Rq, and m is the total number of patterns.

Proof. It follows that:

O(Cqs, U)
= M(O(min{Aq1(χ11), . . . , Aqn(χ1n)}, 1), . . . , O(min{Aq1(χs1), . . . , Aqn(χsn)}, 1))   [by Equations (1) and (5)]
= M(min{O(Aq1(χ11), 1), . . . , O(Aqn(χ1n), 1)}, . . . , min{O(Aq1(χs1), 1), . . . , O(Aqn(χsn), 1)})   [by Proposition 2]
≤ min{ M(O(Aq1(χ11), 1), . . . , O(Aq1(χs1), 1)), . . . , M(O(Aqn(χ1n), 1), . . . , O(Aqn(χsn), 1)) }   [by Corollary 2]
= min{ O(⋃_{l=1}^{s} Aq1, U), . . . , O(⋃_{l=1}^{s} Aqn, U) }   [by Equations (1) and (14)]. □


3.2. Reasoning mechanism of a fuzzy rule-based classification system The reasoning mechanism of a FRBCS is usually implemented by the single winner approach that selects the class label associated with the rule that provides the highest rule activation degree. Consider a set of M classes {C1 , . . . , CM } and a set of p rules {R1 , . . . , Rp }, with M ≤ p, given as:


Rule Rq : If χ1 is Aq1 and · · · and χn is Aqn then Class Cq with CFq ,


where q ∈ {1, . . . , p}, Cq ∈ {C1, . . . , CM} represents the class of rule Rq, and CFq is the degree of certainty of the rule Rq, which is evaluated by applying Algorithm 1. Then, in the following, given the schema:


Rule R1 : If χ1 is A11 and . . . and χn is A1n then Class C1 with CF1 .. . Rule Rp : If χ1 is Ap1 and . . . and χn is Apn then Class Cp with CFp Fact: χ1 is A1 and . . . and χn is An


we introduce Algorithm 2 in order to calculate the class C ∈ {C1, . . . , CM} to which the fact belongs, taking into account different overlaps among the facts and the antecedents of the rules. We represent such overlaps by means of different overlap indices. This reasoning method (using different overlap indices) is what distinguishes the proposed Algorithm 2 from the classical ones (see [40]), which we show to be particular instances of Algorithm 2.


Algorithm 2

Input: A set of rules Rj, with j ∈ {1, · · · , p}, a set of classes {C1, · · · , CM}, with M ≤ p, and a Fact.
Output: A class C ∈ {C1, · · · , CM}.

1: Select a t-norm T and an aggregation function M;
2: Select a set of S overlap indices {O1, . . . , OS};
3: Execute Algorithm 1 r times. Each rule Rj has assigned an r-tuple of confidence degrees (and supports): (Cfn_{j1}, . . . , Cfn_{jr});
4: for C = C1 to CM do
5:   Select the set of rules Rj that have assigned the class C;
6:   α = number of rules that have assigned the class C;
7:   for t = 1 to r do
8:     for L = 1 to S do
9:       for j = 1 to α do
10:        Calculate T(OL(A1, Aj1), . . . , OL(An, Ajn)) · Cfn_{jt} = k^{C}_{j,OL,Cnf};
11:      end for
12:      Calculate max_{j=1,...,α} k^{C}_{j,OL,Cnf} = K^{C}_{Cnf,OL};
13:    end for
14:    Calculate K_C = M_{L=1}^{S×r}(K^{C}_{Cnf,OL});
15:  end for
16: end for
17: Take C = arg max_C K_C.

It is immediate that:


Proposition 3. If in Algorithm 2 one considers the values of CFq given by Equation (10) or Equation (11) and the overlap index O : FS(U) × FS(U) → [0, 1] defined, for all A, B ∈ FS(U), by

O(A, B) = (1/n) Σ_{i=1}^{n} A(ui)B(ui),

then we recover Ishibuchi's algorithm [40].
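For intuition, the classical single-winner case covered by Proposition 3 can be sketched as follows; the triangular membership functions, rules, class names and certainty degrees below are illustrative assumptions of ours, not data from the paper:

```python
# Single-winner reasoning (Section 3.2) in its classical form: the association
# degree of a rule is T(A_q1(x1), ..., A_qn(xn)) * CF_q and the predicted class
# is the one whose best rule attains the largest degree.
def tri(a, b, c):
    """Triangular membership function with support ]a, c[ and peak at b."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu

LOW = tri(0.0, 0.0, 0.6)    # left-shoulder-like label (degenerate peak at 0)
HIGH = tri(0.4, 1.0, 1.0)   # right-shoulder-like label (degenerate peak at 1)

# Rule base: (antecedent memberships, class, certainty degree CF_q).
rules = [
    ((LOW, LOW), 'C1', 0.9),
    ((HIGH, LOW), 'C2', 0.8),
    ((HIGH, HIGH), 'C2', 0.7),
]

def classify(x, t_norm=min):
    scores = {}
    for antecedents, cls, cf in rules:
        degree = t_norm(mu(xi) for mu, xi in zip(antecedents, x)) * cf
        scores[cls] = max(scores.get(cls, 0.0), degree)  # best rule per class
    return max(scores, key=scores.get)

print(classify((0.1, 0.2)))   # 'C1'
print(classify((0.9, 0.8)))   # 'C2'
```

Algorithm 2 generalizes this sketch by replacing the pointwise memberships with S overlap indices evaluated between the fact and each antecedent, and by carrying r confidence degrees per rule.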

3.3. Using penalty functions to choose the best class in fuzzy-rule ensembles
A key factor in Algorithm 2 is the aggregation function M selected in Step 1. If, for a given classification system, we run this algorithm several times, each time with a different aggregation function, we may obtain different results, generating fuzzy-rule ensembles. This fact forces us to provide a consensus system to select the final resulting class. For such a system, in the following, we adapt the decision making method presented in [16], which uses for the exploitation phase penalty functions P∇ defined over a Cartesian product of lattices. Denote by C* a chain whose elements belong to [0, 1] and consider the Cartesian product L*_m = C* × · · · × C* (m times). Denote by B_{yq} the fuzzy set over U such that all the membership values are equal to yq ∈ [0, 1], that is, B_{yq}(u) = yq, for all u ∈ U. Consider Ỹ = (y1, . . . , ym), B_Ỹ = (B_{y1}, . . . , B_{ym}) ∈ FS(U)^m.


Theorem 5. Let Ki : ℝ⁺ → ℝ⁺, with i = 1, . . . , m, be quasi-convex lower semicontinuous functions with a unique minimum at Ki(0) = 0, and D : FS(U) × FS(U) → ℝ⁺ be the distance between fuzzy sets, defined, for all A, B ∈ FS(U), by

D(A, B) = Σ_{i=1}^{n} |A(ui) − B(ui)|,   (15)


where n = card(U). Then the mapping P∇ : FS(U)^m × L*_m → ℝ⁺, given, for all Ã ∈ FS(U)^m, Ỹ ∈ L*_m, by:

P∇(Ã, Ỹ) = Σ_{q=1}^{m} Kq(D(Aq, B_{yq})) = Σ_{q=1}^{m} Kq( Σ_{p=1}^{n} |Aq(up) − yq| )   (16)

is a penalty function defined over a Cartesian product of lattices L_m^{*(n+1)}.


Proof. Observe that (P1) and (P2) follow from [16, Theorem 6]. Property (P3) holds because P∇ is a sum of the quasi-convex and lower semi-continuous functions Ki. □


In Algorithm 2 we let the user choose S confidence degrees and r overlap indices. Then, for each class Ci ∈ C = {C1, ..., CM} we have a set of S × r numerical values. We consider a Cartesian product of as many lattices as classes in our problem, where each lattice has S × r elements:

C1 × ... × CM = (K^{C1}_{Cfn1O1}, ..., K^{C1}_{CfnrOS}) × ... × (K^{CM}_{Cfn1O1}, ..., K^{CM}_{CfnrOS}).

Consider a set of M classes and another set of M aggregation functions. In [16] an optimization method is proposed such that, among all the M-tuples of aggregation functions obtained by calculating the permutations with repetition of the M aggregation functions, we select the tuple such that, when its components are applied (in order) to the S × r values of the corresponding class, we obtain the M-tuple of numbers that minimizes the dissimilarity with respect to the M input classes. We then adapt [16, Algorithm 1] to our problem (Algorithm 3), applying the adopted penalty function P∇ (Equation (16)) to calculate the deviation between the outputs provided by the different tuples of aggregation functions, and selecting the tuple that minimizes such dissimilarity.


Algorithm 3

Input: Using Algorithm 2, we calculate:
(K^{C1}_{Cfn1O1}, ..., K^{C1}_{CfnrOS})
...
(K^{CM}_{Cfn1O1}, ..., K^{CM}_{CfnrOS})
Output: A class C ∈ {C1, ..., CM}.

1: Select a penalty function P∇ defined over the Cartesian product of M lattices (Theorem 5);
2: Select an M-tuple (M1, ..., MM) of idempotent aggregation functions;
3: Calculate all the permutations with repetition (Mσ(1), ..., Mσ(M)) of the tuple (M1, ..., MM);
4: for each permutation of Step 3 do
5:    Mσ(1)(K^{C1}_{Cfn1O1}, ..., K^{C1}_{CfnrOS}) = m^{C1}_{σ(1)}
      ...
      Mσ(M)(K^{CM}_{Cfn1O1}, ..., K^{CM}_{CfnrOS}) = m^{CM}_{σ(M)}
6:    Calculate the penalty function by
      P∇(C1, ..., CM, (m^{C1}_{σ(1)}, ..., m^{CM}_{σ(M)}));    (17)
7: end for
8: Take the tuple of aggregations (Mσ(1), ..., Mσ(M)) that, with the values obtained in Step 5, minimizes Eq. (17);
9: Calculate
      Mσ(1)(K^{C1}_{Cfn1O1}, ..., K^{C1}_{CfnrOS}) = K^{C1}
      ...
      Mσ(M)(K^{CM}_{Cfn1O1}, ..., K^{CM}_{CfnrOS}) = K^{CM}
10: Take C = arg max_{C ∈ {C1,...,CM}} K^C
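The search in Steps 3–8 can be sketched as a brute-force loop over the M^M permutations with repetition; the helper below is illustrative (names are ours) and uses a squared-sum penalty of the form covered by Theorem 5:

```python
from itertools import product

def consensus_class(scores, aggregations, classes):
    """Algorithm 3 sketch. scores[c] is the list of S*r values for class c;
    aggregations is the tuple (M_1, ..., M_M) of idempotent aggregation
    functions. Returns the class maximizing the representative value K^C of
    the tuple minimizing the penalty of Eq. (17), here with K(x) = x**2."""
    best_tuple, best_penalty = None, float("inf")
    # Steps 3-7: try every assignment of one aggregation function per class.
    for combo in product(aggregations, repeat=len(classes)):
        m = {c: agg(scores[c]) for c, agg in zip(classes, combo)}
        p = sum(sum(abs(v - m[c]) for v in scores[c]) ** 2 for c in classes)
        if p < best_penalty:
            best_penalty, best_tuple = p, combo
    # Steps 9-10: apply the minimizing tuple and take the arg max.
    K = {c: agg(scores[c]) for c, agg in zip(classes, best_tuple)}
    return max(classes, key=lambda c: K[c])

# Toy usage with max and min as the candidate idempotent aggregations:
scores = {"C1": [0.0, 0.0], "C2": [0.2, 0.4], "C3": [0.3, 0.5]}
print(consensus_class(scores, (max, min), ["C1", "C2", "C3"]))  # C3
```

`itertools.product(aggregations, repeat=M)` enumerates exactly the permutations with repetition of Step 3.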

It is immediate that:

Proposition 4. Let M = (M1, ..., MM) be a tuple of idempotent aggregation functions. If we run Algorithm 2 several times with different idempotent aggregation functions M ∈ M and then run the consensus algorithm (Algorithm 3), it holds that:

(i) If M1 = ... = MM, then the outcome is the result given by Algorithm 2.

(ii) If, for all i = 1, ..., M, Mi is the arithmetic mean, then Algorithm 3 is the weighted voting algorithm in decision making ([19]).

Remark 2. Note that an analogous reasoning regarding the running of Algorithm 2 with different aggregation functions M and the running of Algorithm 3 can also be made using different t-norms in Step 1 of Algorithm 2. We do not develop this case since it is similar to the one for different aggregation functions M.


4. An example of the generation of fuzzy rule-based ensembles and decision making by consensus via penalty functions


The objective of this section is to illustrate the use of the previously developed algorithms by means of a very simple fuzzy rule-based classification system. The example consists of:


1. Three classes, namely, C1, C2 and C3.

2. Two linguistic variables (attributes), namely, χ1 (body mass index, BMI) and χ2 (AGE). The values of χ1 and χ2 are qualified by the linguistic terms Low, Medium and High, as represented in Figure 1. In the graphic on the top of the figure, for the linguistic variable χ1, the actual crisp values of the body mass indices are marked on the X-axis, varying from 14 to 42. Analogously, in the graphic on the bottom of the figure, for the linguistic variable χ2, the actual crisp values of the ages are marked on the X-axis, varying from 30 to 70. The membership degrees are marked on the Y-axis, varying from 0 to 1, and reflect the values of the linguistic terms of such variables on the membership functions. For example, a body mass index of 35 is fuzzified as: "χ1 is Low with degree 0, Medium with degree 0.5 and High with degree 0.5".

3. Twenty-five examples or patterns (i.e., m = 25), shown in Table 4. Figure 2 shows both the examples used to learn the fuzzy rules and the new example to be classified once the rules have been learned. For the learning process, observe that we have four examples of class C1 (addition symbols), ten examples of class C2 (circles) and eleven examples of class C3 (triangles).

4. Five learned rules (R1–R5), shown in Figure 2, obtained from the data in Table 4 using some of the methods in [61, 62, 63]:

Rule R1: If χ1 is Middle and χ2 is Low then Class C1 with CF1 = ?
Rule R2: If χ1 is Middle and χ2 is Middle then Class C2 with CF2 = ?
Rule R3: If χ1 is Middle and χ2 is High then Class C2 with CF3 = ?
Rule R4: If χ1 is High and χ2 is Middle then Class C3 with CF4 = ?
Rule R5: If χ1 is High and χ2 is High then Class C3 with CF5 = ?

From the analysis of Table 4, we deduce that the 25 patterns (examples) are of the form χl = (χl1, χl2), with l = 1, ..., 25, where χl1 represents the value of the variable χ1 (BMI) and χl2 the value of the variable χ2 (AGE) in the example l.

4.1. Evaluating the certainty degrees for each rule

Now, we build the certainty degrees CFq for each rule (q = 1, ..., 5). Consider the confidence degree CFq: the membership grades of the values in Table 4 to the sets represented in Figure 1 are given in Table 5. Taking into account Table 4, the distribution of the rules among the three classes and Table 5, we calculate the confidence degrees of the rules with Algorithm 1 as follows:


Figure 1: Linguistic terms for BMI and AGE (triangular membership functions Low, Medium and High; BMI ranges from 14 to 42, AGE from 30 to 70)
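Reading the linguistic terms of Figure 1 as triangular membership functions (an assumed shape, but one consistent with the grades reported later in Table 5), the fuzzification of BMI = 35 quoted in item 2 above can be reproduced as follows; the helper name and parameterization are ours:

```python
def tri_terms(x, lo, mid, hi):
    """Membership grades (Low, Medium, High) for triangular linguistic terms
    with Low peaking at lo, Medium at mid and High at hi, for x in [lo, hi]
    (as in Figure 1)."""
    low = max(0.0, (mid - x) / (mid - lo)) if x <= mid else 0.0
    high = max(0.0, (x - mid) / (hi - mid)) if x >= mid else 0.0
    if x <= mid:
        medium = max(0.0, (x - lo) / (mid - lo))
    else:
        medium = max(0.0, (hi - x) / (hi - mid))
    return low, medium, high

# BMI uses the breakpoints (14, 28, 42); AGE uses (30, 50, 70).
print(tri_terms(35, 14, 28, 42))  # (0.0, 0.5, 0.5), as stated in the text
print(tri_terms(61, 30, 50, 70))  # (0.0, 0.45, 0.55), used in Section 4.2
```

The same two calls also recover Middle(33) = 0.6429 and High(33) = 0.3571 when evaluated at BMI = 33.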

Step 1: First consider the Product t-norm T = TP and the overlap index Oπ (given in Table 2). Since we have m = 25 patterns, for A ∈ FS(U) one has that:

Oπ(A, U) = (1/25) Σ_{u∈U} A(u) · 1.

Step 2: Classify the patterns in the three classes. For example, the first four patterns are in C1 (see Table 4).

Step 3: For q = 1 to p = 5 (that is, for all rules), beginning with q = 1:

Step 4: Take s = 4, related to the first four examples belonging to class C1 (see Table 4);

Step 5: For j = 1 to s = 4:

Step 6: Calculate the matching degrees, taking into account the values in Table 5, starting with j = 1:

c1(χ1) = TP(A11(χ11), A12(χ12)) = A11(28) · A12(32) = 1 · 0.900 = 0.9000
c2(χ2) = TP(A11(χ21), A12(χ22)) = A11(34) · A12(32) = 0.571 · 0.900 = 0.5143
c3(χ3) = TP(A11(χ31), A12(χ32)) = A11(33) · A12(35) = 0.643 · 0.750 = 0.4821
c4(χ4) = TP(A11(χ41), A12(χ42)) = A11(29) · A12(37) = 0.929 · 0.650 = 0.6036


Table 4: Examples used for training

Id Example | χ1 = BMI | χ2 = AGE | Class
 1 | 28 | 32 | 1
 2 | 34 | 32 | 1
 3 | 33 | 35 | 1
 4 | 29 | 37 | 1
 5 | 34 | 38 | 2
 6 | 30 | 47 | 2
 7 | 33 | 48 | 2
 8 | 31 | 52 | 2
 9 | 29 | 55 | 3
10 | 34 | 55 | 3
11 | 30 | 63 | 2
12 | 33 | 62 | 3
13 | 29 | 64 | 2
14 | 32 | 62 | 3
15 | 30 | 68 | 2
16 | 36 | 48 | 2
17 | 39 | 49 | 3
18 | 36 | 50 | 2
19 | 37 | 55 | 3
20 | 41 | 57 | 3
21 | 38 | 63 | 2
22 | 37 | 65 | 3
23 | 41 | 64 | 3
24 | 39 | 68 | 3
25 | 36 | 69 | 3

where A11 and A12 correspond to the fuzzy sets defining the linguistic terms Medium and Low, respectively, in the rule R1 .

Step 7: Build the fuzzy set on U:

C1^4 = {(u1, c1(χ1)), (u2, c2(χ2)), (u3, c3(χ3)), (u4, c4(χ4))}
     = {(u1, 0.9000), (u2, 0.5143), (u3, 0.4821), (u4, 0.6036), (u5, 0), ..., (u25, 0)}
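The four matching degrees of Step 6 can be recomputed from the raw data of Table 4, assuming the triangular shapes of Figure 1 and the product t-norm (helper names are ours):

```python
# Matching degrees c_j = T_P(A11(BMI_j), A12(AGE_j)) of the first four
# training examples of Table 4 against rule R1 ("BMI is Middle and AGE is Low").
# Assumed triangular shapes from Figure 1: Middle(BMI) peaks at 28 on [14, 42],
# Low(AGE) decreases from 1 at age 30 to 0 at age 50.
examples = [(28, 32), (34, 32), (33, 35), (29, 37)]   # (BMI, AGE) pairs

def middle_bmi(x):
    return max(0.0, min((x - 14) / 14, (42 - x) / 14))

def low_age(y):
    return max(0.0, (50 - y) / 20)

c = [middle_bmi(x) * low_age(y) for x, y in examples]
print([round(v, 4) for v in c])  # [0.9, 0.5143, 0.4821, 0.6036]
```

These are exactly the membership values of C1^4 built in Step 7 (the text's 0.571 and 0.929 are the rounded values of 8/14 and 13/14).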

Step 8: For l = 1 to m = 25, beginning with l = 1:


Figure 2: Rules (the regions covered by rules R1–R5 over X1: Body mass index and X2: Age, together with the training examples and the new example to be classified)

Step 9: Calculate:

c5(χ5) = TP(A11(χ51), A12(χ52)) = A11(34) · A12(38) = 0.571 · 0.600 = 0.3429
c6(χ6) = TP(A11(χ61), A12(χ62)) = A11(30) · A12(47) = 0.857 · 0.150 = 0.1286
c7(χ7) = TP(A11(χ71), A12(χ72)) = A11(33) · A12(48) = 0.643 · 0.100 = 0.0643
c8(χ8) = TP(A11(χ81), A12(χ82)) = A11(31) · A12(52) = 0.786 · 0.000 = 0.0000
c9(χ9) = TP(A11(χ91), A12(χ92)) = A11(29) · A12(55) = 0.929 · 0.000 = 0.0000
c10(χ10) = TP(A11(χ101), A12(χ102)) = A11(34) · A12(55) = 0.571 · 0.000 = 0.0000
c11(χ11) = TP(A11(χ111), A12(χ112)) = A11(30) · A12(63) = 0.857 · 0.000 = 0.0000
c12(χ12) = TP(A11(χ121), A12(χ122)) = A11(33) · A12(62) = 0.643 · 0.000 = 0.0000
c13(χ13) = TP(A11(χ131), A12(χ132)) = A11(29) · A12(64) = 0.929 · 0.000 = 0.0000
c14(χ14) = TP(A11(χ141), A12(χ142)) = A11(32) · A12(62) = 0.714 · 0.000 = 0.0000
c15(χ15) = TP(A11(χ151), A12(χ152)) = A11(30) · A12(68) = 0.857 · 0.000 = 0.0000
c16(χ16) = TP(A11(χ161), A12(χ162)) = A11(36) · A12(48) = 0.429 · 0.100 = 0.0429
c17(χ17) = TP(A11(χ171), A12(χ172)) = A11(39) · A12(49) = 0.214 · 0.050 = 0.0107
c18(χ18) = TP(A11(χ181), A12(χ182)) = A11(36) · A12(50) = 0.429 · 0.000 = 0.0000
c19(χ19) = TP(A11(χ191), A12(χ192)) = A11(37) · A12(55) = 0.357 · 0.000 = 0.0000
c20(χ20) = TP(A11(χ201), A12(χ202)) = A11(41) · A12(57) = 0.071 · 0.000 = 0.0000
c21(χ21) = TP(A11(χ211), A12(χ212)) = A11(38) · A12(63) = 0.286 · 0.000 = 0.0000
c22(χ22) = TP(A11(χ221), A12(χ222)) = A11(37) · A12(65) = 0.357 · 0.000 = 0.0000
c23(χ23) = TP(A11(χ231), A12(χ232)) = A11(41) · A12(64) = 0.071 · 0.000 = 0.0000
c24(χ24) = TP(A11(χ241), A12(χ242)) = A11(39) · A12(68) = 0.214 · 0.000 = 0.0000
c25(χ25) = TP(A11(χ251), A12(χ252)) = A11(36) · A12(69) = 0.429 · 0.000 = 0.0000


Table 5: Fuzzy grades

Id Example | χ1 = BMI (Low | Middle | High) | χ2 = AGE (Low | Middle | High) | Class
 1 | 0.000 | 1.000 | 0.000 | 0.900 | 0.100 | 0.000 | 1
 2 | 0.000 | 0.571 | 0.429 | 0.900 | 0.100 | 0.000 | 1
 3 | 0.000 | 0.643 | 0.357 | 0.750 | 0.250 | 0.000 | 1
 4 | 0.000 | 0.929 | 0.071 | 0.650 | 0.350 | 0.000 | 1
 5 | 0.000 | 0.571 | 0.429 | 0.600 | 0.400 | 0.000 | 2
 6 | 0.000 | 0.857 | 0.143 | 0.150 | 0.850 | 0.000 | 2
 7 | 0.000 | 0.643 | 0.357 | 0.100 | 0.900 | 0.000 | 2
 8 | 0.000 | 0.786 | 0.214 | 0.000 | 0.900 | 0.100 | 2
 9 | 0.000 | 0.929 | 0.071 | 0.000 | 0.750 | 0.250 | 3
10 | 0.000 | 0.571 | 0.429 | 0.000 | 0.750 | 0.250 | 3
11 | 0.000 | 0.857 | 0.143 | 0.000 | 0.350 | 0.650 | 2
12 | 0.000 | 0.643 | 0.357 | 0.000 | 0.400 | 0.600 | 3
13 | 0.000 | 0.929 | 0.071 | 0.000 | 0.300 | 0.700 | 2
14 | 0.000 | 0.714 | 0.286 | 0.000 | 0.400 | 0.600 | 3
15 | 0.000 | 0.857 | 0.143 | 0.000 | 0.100 | 0.900 | 2
16 | 0.000 | 0.429 | 0.571 | 0.100 | 0.900 | 0.000 | 2
17 | 0.000 | 0.214 | 0.786 | 0.050 | 0.950 | 0.000 | 3
18 | 0.000 | 0.429 | 0.571 | 0.000 | 1.000 | 0.000 | 2
19 | 0.000 | 0.357 | 0.643 | 0.000 | 0.750 | 0.250 | 3
20 | 0.000 | 0.071 | 0.929 | 0.000 | 0.650 | 0.350 | 3
21 | 0.000 | 0.286 | 0.714 | 0.000 | 0.350 | 0.650 | 2
22 | 0.000 | 0.357 | 0.643 | 0.000 | 0.250 | 0.750 | 3
23 | 0.000 | 0.071 | 0.929 | 0.000 | 0.300 | 0.700 | 3
24 | 0.000 | 0.214 | 0.786 | 0.000 | 0.100 | 0.900 | 3
25 | 0.000 | 0.429 | 0.571 | 0.000 | 0.050 | 0.950 | 3

Step 10: Build the fuzzy set on U:

C1^25 = {(u1, c1(χ1)), ..., (u25, c25(χ25))}
      = {(u1, 0.9000), (u2, 0.5143), (u3, 0.4821), (u4, 0.6036), (u5, 0.3429), (u6, 0.1286), (u7, 0.0643), (u8, 0.0000), ..., (u15, 0.0000), (u16, 0.0429), (u17, 0.0107), (u18, 0.0000), ..., (u25, 0.0000)}.

Step 11: Using Eq. (8), calculate:


Table 6: Cnf built from different overlap indices

Rule | Oπ | OZ | O√ | Orat
1 | 0.8092 | 1 | 0.6753 | 0.6867
2 | 0.5617 | 1 | 0.4964 | 0.4993
3 | 0.5024 | 1 | 0.4160 | 0.4290
4 | 0.5579 | 1 | 0.5164 | 0.5188
5 | 0.8169 | 1 | 0.7586 | 0.7598

Cnf(R1) = O(C1^4, U) / O(C1^25, U)
        = [(1/25) Σ_{i=1}^{4} ci(χi)] / [(1/25) Σ_{i=1}^{25} ci(χi)]
        = (1/25)(0.9000 + 0.5143 + 0.4821 + 0.6036) / (1/25)(0.9000 + 0.5143 + 0.4821 + 0.6036 + 0.3429 + 0.1286 + 0.0643 + 0.0429 + 0.0107)
        = 2.5 / 3.0893 = 0.8092

Following Steps: Repeating the previous steps for q = 2, 3, 4, 5, we have the following results for the other rules: Cnf (R2 ) = 0.5671, Cnf (R3 ) = 0.5024, Cnf (R4 ) = 0.5579, Cnf (R5 ) = 0.8169 (see the second column of Table 6).
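The value Cnf(R1) = 0.8092 can be checked numerically: since the 1/25 factors cancel, the confidence degree is the sum of the matching degrees over the class-C1 examples divided by the sum over all 25 examples. The sketch below assumes the triangular shapes of Figure 1 (helper names are ours):

```python
# Cnf(R1) = O_pi(C1^4, U) / O_pi(C1^25, U) for rule R1 ("BMI is Middle and
# AGE is Low"), from the raw (BMI, AGE) data of Table 4.
data = [(28, 32), (34, 32), (33, 35), (29, 37), (34, 38), (30, 47), (33, 48),
        (31, 52), (29, 55), (34, 55), (30, 63), (33, 62), (29, 64), (32, 62),
        (30, 68), (36, 48), (39, 49), (36, 50), (37, 55), (41, 57), (38, 63),
        (37, 65), (41, 64), (39, 68), (36, 69)]

def middle_bmi(x):
    return max(0.0, min((x - 14) / 14, (42 - x) / 14))

def low_age(y):
    return max(0.0, (50 - y) / 20)

c = [middle_bmi(x) * low_age(y) for x, y in data]
cnf_r1 = sum(c[:4]) / sum(c)     # the first four examples belong to class C1
print(round(cnf_r1, 4))          # 0.8092, as in Table 6
```

The numerator 2.5 and denominator 3.0893 of the worked calculation above fall out of the same sums.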


If in Step 1 of Algorithm 1 we take the overlap indices of Table 3, then we obtain the confidence degrees Cnf presented in columns 3, 4 and 5 of Table 6, respectively.

4.2. Inference

Given the set of rules:

Rule R1: If χ1 is Middle and χ2 is Low then Class C1 with CF1
Rule R2: If χ1 is Middle and χ2 is Middle then Class C2 with CF2
Rule R3: If χ1 is Middle and χ2 is High then Class C2 with CF3
Rule R4: If χ1 is High and χ2 is Middle then Class C3 with CF4
Rule R5: If χ1 is High and χ2 is High then Class C3 with CF5

and the following fact: Body mass index = 33 and Age = 61, we intend to determine, using Algorithm 2, to which class this fact belongs. In Figure 2 the new example to be classified (BMI = 33 and AGE = 61) is represented with an asterisk; according to its position, one could infer that it belongs to class C3, since it is surrounded by examples of that class. Now, from Figure 1:

1. Build the sets A1 and A2 as follows:


A1 = {(14, 0), (15, 0), ..., (32, 0), (33, 1), (34, 0), ..., (42, 0)}
A2 = {(30, 0), (31, 0), ..., (60, 0), (61, 1), (62, 0), ..., (70, 0)}

For the sake of simplicity we present this construction; in fact, different constructions can be chosen depending on the application at hand.

2. Calculate the membership grades of the value of the linguistic variable BMI (33) to the fuzzy sets Low, Middle and High, which represent the linguistic terms qualifying the linguistic variable BMI. The results are: Low(33) = 0.000, Middle(33) = 0.6429 and High(33) = 0.3571.

3. Analogously, calculate the membership grades of the value of the linguistic variable AGE (61) to the fuzzy sets Low, Middle and High, obtaining: Low(61) = 0.000, Middle(61) = 0.450 and High(61) = 0.550.
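The first column of Table 7 is consistent with taking, for each rule, the product of the matching degree of the fact (via the product t-norm) and the rule's confidence, and then aggregating per class with M = max. The following sketch, under that assumption (which is ours, not stated explicitly in the text), reproduces the reported values:

```python
# First column of Table 7: confidence degrees built with O_pi (Table 6),
# per-rule association assumed to be matching degree times confidence.
mid_bmi, hi_bmi = 9 / 14, 5 / 14     # Middle(33), High(33)
mid_age, hi_age = 0.45, 0.55         # Middle(61), High(61)

cf = {"R2": 0.5617, "R3": 0.5024, "R4": 0.5579, "R5": 0.8169}  # Cnf via O_pi
b = {"R2": mid_bmi * mid_age, "R3": mid_bmi * hi_age,          # matching degrees
     "R4": hi_bmi * mid_age, "R5": hi_bmi * hi_age}

K_C2 = max(b["R2"] * cf["R2"], b["R3"] * cf["R3"])   # rules of class C2
K_C3 = max(b["R4"] * cf["R4"], b["R5"] * cf["R5"])   # rules of class C3
print(round(K_C2, 4), round(K_C3, 4))  # 0.1776 0.1605 -> C2 wins in this column
```

Rule R1 contributes nothing to any column, since Low(61) = 0 makes its matching degree vanish; this is why K^{C1} = 0 throughout Table 7.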


To run Algorithm 2, in Step 1 consider the Product t-norm T = TP and the aggregation function M = max. In Step 2, consider the set of S = 3 overlap indices {Oπ, O√, Orat} (Table 3). In Step 3, the confidence degrees CFnj (j = 1, ..., 5) are calculated with Algorithm 1 and are shown in columns 2, 4 and 5 of Table 6.

Table 7 shows the results obtained with Algorithm 2 for this experiment. If Table 7 is analyzed by columns, each central column corresponds to the data obtained with the classical classification algorithm, fixing a confidence degree for the rules and choosing the set of overlap indices in Step 2 as a set consisting of one single overlap index. If we analyze Table 7 by rows, the result is the one obtained by means of Algorithm 2 with two confidence degrees and three overlap indices. Table 7 and Algorithm 2 thus allow us to study the classification of a given fact in two different ways: by means of the classical algorithm (by columns), or by means of a family of different measures representing the overlapping between the antecedents of the rules and the considered fact (by rows). From our point of view, we must use Algorithm 2 and read Table 7 by rows, since in classification there always exist different kinds of interactions between facts and antecedents, and we should take all of them into account. This does not happen in classical algorithms, which measure overlapping in only one way.

From the analysis of the first column in Table 7 we deduce that the fact considered in this example belongs to C2. However, from the analysis of the other columns (except the fourth), we deduce that the fact belongs to class C3.

4.3. Consensus

Once Algorithm 2 has been run with the aggregation function M1 = max = m1, we run this algorithm twice more: once with the aggregation M2 given by the OWA operator corresponding to the quantifier "at least one half" (that is, with weight vector ω = (0.33, 0.33, 0.33, 0, 0, 0)), and once with M3 given by the OWA operator corresponding to the quantifier "as many as possible" (that is, with weight vector ω = (0, 0, 0, 0.33, 0.33, 0.33)).
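The two OWA operators can be sketched directly; an OWA operator applies its weight vector to the input values sorted in decreasing order (the sample scores below are illustrative):

```python
def owa(weights, values):
    """Ordered weighted averaging: the weights are applied to the values
    sorted in decreasing order."""
    return sum(w * v for w, v in zip(weights, sorted(values, reverse=True)))

vals = [0.1776, 0.3021, 0.2151, 0.1517, 0.2685, 0.2420]   # illustrative scores
M2 = lambda v: owa([0.33, 0.33, 0.33, 0, 0, 0], v)   # "at least one half"
M3 = lambda v: owa([0, 0, 0, 0.33, 0.33, 0.33], v)   # "as many as possible"
print(round(M2(vals), 4), round(M3(vals), 4))  # roughly 0.2682 and 0.1797
```

M2 averages (up to the 0.33 scaling) the three largest inputs, while M3 averages the three smallest, which is why the two operators can rank the classes differently.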


Table 7: Results from Algorithm 5

CF = Cnf constructed by Algorithm 4 using Oπ:
Class | Oπ | O√ | Orat
C1 | 0 | 0 | 0
C2 | ∨(0.1625, 0.1776) | ∨(0.3021, 0.2987) | ∨(0.2420, 0.2407)
C3 | ∨(0.0897, 0.1605) | ∨(0.2237, 0.3621) | ∨(0.1803, 0.2904)

CF = Cnf constructed by Algorithm 4 using Orat:
Class | Oπ | O√ | Orat
C1 | 0 | 0 | 0
C2 | ∨(0.1444, 0.1517) | ∨(0.2685, 0.2551) | ∨(0.2151, 0.2056)
C3 | ∨(0.0834, 0.1492) | ∨(0.2080, 0.3367) | ∨(0.1677, 0.2701)

With M = max over all values: K^{C1} = 0, K^{C2} = 0.3021, K^{C3} = 0.3621, so C = arg max_C K^C = C3.

Table 8: Results from Algorithm 6. Consensus

Combination | P∇
 1 | 0.5717     2 | 0.3461     3 | 0.4242
 4 | 0.5290     5 | 0.4394     6 | 0.4356
 7 | 0.3346     8 | 0.4783     9 | 0.4669
10 | 0.4783    11 | 0.4356    12 | 0.4394
13 | 0.4669    14 | 0.3461    15 | 0.5717
16 | 0.3346    17 | 0.5290    18 | 0.4242
19 | 0.3346    20 | 0.5290    21 | 0.4356
22 | 0.4669    23 | 0.4242    24 | 0.5717
25 | 0.4783    26 | 0.4394    27 | 0.3461

Next, we run the consensus algorithm, Algorithm 3, with the penalty function P∇ of Theorem 5, defined, in this context, by

P∇(C1, ..., CM, (m^{C1}_{σ(1)}, ..., m^{CM}_{σ(M)})) = Σ_{i=1}^{M} ( Σ_{j=1,...,r; L=1,...,S} |K^{Ci}_{CfnjOL} − m^{Ci}_{σ(i)}| )²,    (18)

and the tuples built by calculating the permutations with repetition of (M1, M2, M3). We thus have 27 possible combinations of the 3 aggregation functions. In Table 8 we present the numerical values obtained with the penalty function for each of the 27 considered 3-tuples. Note that the minimum value is 0.3346, corresponding to the combination of aggregations (M1, M3, M3). Observe that the same value is also obtained with the combinations (M2, M3, M3) and (M3, M3, M3), since the rule of class C1 was not activated. Therefore, we have K^{C1} = 0, K^{C2} = 0.1028 and K^{C3} = 0.2318. By the arg max step of Algorithm 3, the resulting class is C3.


5. Conclusion

This paper introduced a consensus method, based on penalty functions, for decision making in ensembles of fuzzy rule-based classification systems. To this end, the paper contributed three novel algorithms related to FRBCSs:


1. Algorithm 1, which implements a method for building confidence and support measures based on overlap indices, which, in turn, can be constructed from overlap functions;

2. Algorithm 2, which implements a new FRM for FRBCSs using different overlap indices and aggregation functions, and which we proved to be a generalization of some classical methods;

3. Algorithm 3, which implements the consensus method for classification, based on penalty functions.

The paper also presented several theoretical results, which provide the basis of our approach. Using the proposed algorithms, we developed a detailed example, generating fuzzy rule-based ensembles considering several overlap indices and aggregation functions, which produced different results from which we obtained a consensual final solution.

Future work concerns running experiments with different aggregation functions and penalty functions, in order to enlarge the scope of the experiments. Observe that whenever one uses different penalty functions, another kind of ensemble may be generated, and it would be interesting to investigate the decision making process in this case. Finally, due to the use of interval mathematics in several applications (see, e.g., [64, 65]), we also intend to develop this study in the interval-valued setting, following the approach adopted in [66, 68, 69, 70].


Acknowledgments


This work is supported by the Brazilian National Council for Scientific and Technological Development CNPq (Proc. 307781/2016-0), by the Spanish Ministry of Science and Technology (under project TIN 2016-77356-P (AEI/FEDER, UE)), and by Caixa and Fundación Caja Navarra of Spain.

References


[1] H. Ishibuchi, Y. Nojima, Pattern classification with linguistic rules, in: H. Bustince, F. Herrera, J. Montero (Eds.), Fuzzy Sets and Their Extensions: Representation, Aggregation and Models, Vol. 220 of Studies in Fuzziness and Soft Computing, Springer, Berlin, 2008, pp. 377–395.


[2] G. Lucca, J. Sanz, G. Pereira Dimuro, B. Bedregal, R. Mesiar, A. Koles´arov´a, H. Bustince Sola, Pre-aggregation functions: construction and an application, IEEE Transactions on Fuzzy Systems 24 (2) (2016) 260–272. doi:10.1109/TFUZZ.2015.2453020.


[3] G. Lucca, G. P. Dimuro, V. Mattos, B. Bedregal, H. Bustince, J. A. Sanz, A family of Choquet-based non-associative aggregation functions for application in fuzzy rule-based classification systems, in: 2015 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), IEEE, Los Alamitos, 2015, pp. 1–8. doi:10.1109/FUZZ-IEEE.2015.7337911.


[4] M. Elkano, M. Galar, J. Sanz, A. Fern´andez, E. Barrenechea, F. Herrera, H. Bustince, Enhancing multi-class classification in FARC-HD fuzzy classifier: On the synergy between n-dimensional overlap functions and decomposition strategies, IEEE Transactions on Fuzzy Systems 23 (5) (2015) 1562–1580. [5] J. Sanz, D. Bernardo, F. Herrera, H. Bustince, H. Hagras, A compact evolutionary interval-valued fuzzy rule-based classification system for the modeling and prediction of real-world financial applications with imbalanced data, Fuzzy Systems, IEEE Transactions on 23 (4) (2015) 973–990. doi:10.1109/TFUZZ.2014.2336263. [6] J. A. Sanz, M. Galar, A. Jurio, A. Brugos, M. Pagola, H. Bustince, Medical diagnosis of cardiovascular diseases using an interval-valued fuzzy rule-based classification system, Applied Soft Computing 20 (2014) 103 – 111. [7] S. R. Samantaray, K. El-Arroudi, G. Joos, I. Kamwa, A fuzzy rule-based approach for islanding detection in distributed generation, Power Delivery, IEEE Transactions on 25 (3) (2010) 1427–1433. [8] M. Galar, J. Derrac, D. Peralta, I. Triguero, D. Paternain, C. Lopez-Molina, S. Garc´ıa, J. M. Ben´ıtez, M. Pagola, E. Barrenechea, H. Bustince, F. Herrera, A survey of fingerprint classification part I: Taxonomies on feature extraction methods and learning models, Knowledge-Based Systems 81 (2015) 76 – 97. 26


[9] M. Galar, J. Derrac, D. Peralta, I. Triguero, D. Paternain, C. Lopez-Molina, S. Garc´ıa, J. M. Ben´ıtez, M. Pagola, E. Barrenechea, H. Bustince, F. Herrera, A survey of fingerprint classification part II: Experimental analysis and ensemble proposal, Knowledge-Based Systems 81 (2015) 98 – 116.


[10] S. Patidar, R. B. Pachori, U. R. Acharya, Automated diagnosis of coronary artery disease using tunable-q wavelet transform applied on heart rate signals, Knowledge-Based Systems 82 (2015) 1 – 10.


´ [11] G. Asencio-Cort´es, F. Mart´ınez-Alvarez, A. Morales-Esteban, J. Reyes, A sensitivity study of seismicity indicators in supervised learning to improve earthquake prediction, Knowledge-Based Systems 101 (2016) 15 – 30.


[12] G. Beliakov, A. Pradera, T. Calvo, Aggregation Functions: A Guide for Practitioners, Springer, Berlin, 2007. [13] M. Grabisch, J. Marichal, R. Mesiar, E. Pap, Aggregation Functions, Cambridge University Press, Cambridge, 2009.


[14] D. Paternain, J. Fernandez, H. Bustince, R. Mesiar, G. Beliakov, Construction of image reduction operators using averaging aggregation functions, Fuzzy Sets and Systems 261 (2015) 87 – 111.


[15] E. Barrenechea, J. Fernandez, M. Pagola, F. Chiclana, H. Bustince, Construction of interval-valued fuzzy preference relations from ignorance functions and fuzzy preference relations. Application to decision making, Knowledge-Based Systems 58 (2014) 33 – 44.


[16] H. Bustince, E. Barrenechea, T. Calvo, S. James, G. Beliakov, Consensus in multiexpert decision making problems using penalty functions defined over a cartesian product of lattices, Information Fusion 17 (2014) 56–64. [17] E. P. Klement, R. Mesiar, E. Pap, Triangular Norms, Kluwer Academic Publisher, Dordrecht, 2000. [18] H. Bustince, J. Fernandez, R. Mesiar, J. Montero, R. Orduna, Overlap functions, Nonlinear Analysis: Theory, Methods & Applications 72 (3-4) (2010) 1488– 1499. [19] H. Bustince, M. Pagola, R. Mesiar, E. H¨ullermeier, F. Herrera, Grouping, overlaps, and generalized bientropic functions for fuzzy modeling of pairwise comparisons, IEEE Transactions on Fuzzy Systems 20 (3) (2012) 405–415. [20] B. C. Bedregal, G. P. Dimuro, H. Bustince, E. Barrenechea, New results on overlap and grouping functions, Information Sciences 249 (2013) 148–170. [21] G. P. Dimuro, B. Bedregal, H. Bustince, M. J. Asi´ain, R. Mesiar, On additive generators of overlap functions, Fuzzy Sets and Systems 287 (2016) 76 – 96, theme: Aggregation Operations.


[22] G. P. Dimuro, B. Bedregal, Archimedean overlap functions: The ordinal sum and the cancellation, idempotency and limiting properties, Fuzzy Sets and Systems 252 (2014) 39 – 54. [23] G. P. Dimuro, B. Bedregal, R. H. N. Santiago, On (G, N )-implications derived from grouping functions, Information Sciences 279 (2014) 1 – 17.


[24] T. Calvo, G. Beliakov, Aggregation functions based on penalties, Fuzzy Sets and Systems 161 (10) (2010) 1420 – 1436.


[25] G. Beliakov, H. Bustince, T. Calvo, A Practical Guide to Averaging Functions, Springer, Berlin, New York, 2016.


[26] G. P. Dimuro, B. Bedregal, H. Bustince, A. Jurio, M. Baczy´nski, K. Mi´s, QL-operations and QL-implication functions constructed from tuples (O, G, N ) and the generation of fuzzy subsethood and entropy measures, International Journal of Approximate Reasoning 82 (2017) 170–192. doi:http://dx.doi.org/10.1016/j.ijar.2016.12.013.


[27] G. P. Dimuro, B. Bedregal, On residual implications derived from overlap functions, Information Sciences 312 (2015) 78 – 88.


[28] G. P. Dimuro, B. Bedregal, On the laws of contraposition for residual implications derived from overlap functions, in: Fuzzy Systems (FUZZ-IEEE), 2015 IEEE International Conference on, IEEE, Los Alamitos, 2015, pp. 1–7. doi:10.1109/FUZZ-IEEE.2015.7337867.


[29] A. Jurio, H. Bustince, M. Pagola, A. Pradera, R. Yager, Some properties of overlap and grouping functions and their application to image thresholding, Fuzzy Sets and Systems 229 (2013) 69 – 90. [30] G. Lucca, J. A. Sanz, G. P. Dimuro, B. Bedregal, M. J. Asiain, M. Elkano, H. Bustince, CC-integrals: Choquet-like copula-based aggregation functions and its application in fuzzy rule-based classification systems, Knowledge-Based Systems 119 (2017) 32 – 43. doi:http://dx.doi.org/10.1016/j.knosys.2016.12.004. [31] M. Elkano, M. Galar, J. Sanz, H. Bustince, Fuzzy rule-based classification systems for multi-class problems using binary decomposition strategies: On the influence of n-dimensional overlap functions in the fuzzy reasoning method, Information Sciences 332 (2016) 94–114. [32] S. Garcia-Jimenez, H. Bustince, E. H¨ullermeier, R. Mesiar, N. R. Pal, A. Pradera, Overlap indices: Construction of and application to interpolative fuzzy systems, IEEE Transactions on Fuzzy Systems 23 (4) (2015) 1259–1273.

[33] H. Bustince, J. Fern´andez, R. Mesiar, J. Montero, R. Orduna, Overlap index, overlap functions and migrativity, in: Proceedings of IFSA/EUSFLAT Conference, 2009, pp. 300–305.


[34] D. Dubois, W. Ostasiewicz, H. Prade, Fuzzy sets: History and basic notions, in: D. Dubois, H. Prade (Eds.), Fundamentals of Fuzzy Sets, Kluwer, Boston, 2000, pp. 21–124. [35] N. R. Pal, K. Pal, Handling of inconsistent rules with an extended model of fuzzy reasoning, Journal of Intelligent and Fuzzy Systems 7 (1) (1999) 55–73.


[36] W. Yu, Z. Bien, Design of fuzzy logic systems with inconsistent rule base, Journal of Intelligent and Fuzzy Systems 2 (2) (1994) 147–159.


[37] L. Zadeh, Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets and Systems 1 (1) (1978) 3 – 28. [38] J. Baldwin, B. Pilsworth, Axiomatic approach to implication for approximate reasoning with fuzzy logic, Fuzzy Sets and Systems 3 (2) (1980) 193 – 219.


[39] J. Alcala-Fdez, R. Alcala, F. Herrera, A fuzzy association rule-based classification model for high-dimensional problems with genetic rule selection and lateral tuning, IEEE Transactions on Fuzzy Systems 19 (5) (2011) 857–872.


[40] H. Ishibuchi, T. Yamamoto, T. Nakashima, Hybridization of fuzzy gbml approaches for pattern classification problems, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 35 (2) (2005) 359–365. doi:10.1109/TSMCB.2004.842257.


[41] H. Bustince, G. Beliakov, G. P. Dimuro, B. Bedregal, R. Mesiar, On the definition of penalty functions in data aggregation, Fuzzy Sets and Systems(in press, Corrected Proof). doi:http://dx.doi.org/10.1016/j.fss.2016.09.011.


[42] R. R. Yager, Toward a general theory of information aggregation, Information Sciences 68 (3) (1993) 191 – 206. [43] R. R. Y. ans A. Rybalov, Understanding the median as a fusion operator, International Journal of General Systems 26 (3) (1997) 239–263. [44] T. Calvo, R. Mesiar, R. R. Yager, Quantitative weights and aggregation, Fuzzy Systems, IEEE Transactions on 12 (1) (2004) 62–69. [45] G. Beliakov, T. Calvo, S. James, Consensus measures constructed from aggregation functions and fuzzy implications, Knowledge-Based Systems 55 (2014) 1 – 8. [46] G. Beliakov, S. James, A penalty-based aggregation operator for non-convex intervals, Knowledge-Based Systems 70 (2014) 335 – 344. [47] T. Wilkin, G. Beliakov, Weakly monotonic averaging functions, International Journal of Intelligent Systems 30 (2) (2015) 144–169. [48] S. Ilanko, A. Tucker, The use of negative penalty functions in solving partial differential equations, Communications in Numerical Methods in Engineering 21 (3) (2005) 99–106. 29


[49] H. Askes, S. Piercy, S. Ilanko, Tyings in linear systems of equations modelled with positive and negative penalty functions, Communications in Numerical Methods in Engineering 24 (11) (2008) 1163–1169.


[50] I. Palomares, F. J. Estrella, L. Mart´ınez, F. Herrera, Consensus under a fuzzy context: Taxonomy, analysis framework {AFRYCA} and experimental case of study, Information Fusion 20 (2014) 252 – 271. [51] Z. Wu, J. Xu, Managing consistency and consensus in group decision making with hesitant fuzzy linguistic preference relations, Omega 65 (2016) 28 – 40.


[52] Z. Wu, J. Xu, Possibility distribution-based approach for magdm with hesitant fuzzy linguistic information, IEEE Transactions on Cybernetics 46 (3) (2016) 694–705.


[53] E. Herrera-Viedma, F. J. Cabrerizo, J. Kacprzyk, W. Pedrycz, A review of soft consensus models in a fuzzy environment, Information Fusion 17 (2014) 4 – 13, special Issue: Information fusion in consensus and decision making.


[54] A. J. Kurdila, M. Zabarankin, Convex Functional Analysis, Springer, Berlin, 2005.


[55] G. Mayor, E. Trillas, On the representation of some aggregation functions, in: Proceedings of IEEE International Symposium on Multiple-Valued Logic, IEEE, Los Alamitos, 1986, pp. 111–114.


[56] G. P. Dimuro, B. Bedregal, H. Bustince, R. Mesiar, M. J. Asiain, On additive generators of grouping functions, in: A. Laurent, O. Strauss, B. Bouchon-Meunier, R. R. Yager (Eds.), Information Processing and Management of Uncertainty in Knowledge-Based Systems, Vol. 444 of Communications in Computer and Information Science, Springer International Publishing, 2014, pp. 252–261.

[57] H. Bustince, J. Fernandez, R. Mesiar, A. Pradera, G. Beliakov, Restricted dissimilarity functions and penalty functions, in: Proceedings of 2011 EUSFLAT-IFSA Joint Conference, no. 1 in Advances in Intelligent Systems Research, Atlantis Press, Amsterdam, 2011, pp. 79–85.

[58] M. Gagolewski, Spread measures and their relation to aggregation functions, European Journal of Operational Research 241 (2) (2015) 469–477.

[59] A. Riid, E. Rüstern, Adaptability, interpretability and rule weights in fuzzy rule-based systems, Information Sciences 257 (2014) 301–312. doi:10.1016/j.ins.2012.12.048.

[60] J. A. Sanz, A. Fernández, H. Bustince, F. Herrera, Improving the performance of fuzzy rule-based classification systems with interval-valued fuzzy sets and genetic amplitude tuning, Information Sciences 180 (19) (2010) 3674–3685.

[61] Z. Chi, H. Yan, T. Pham, Fuzzy Algorithms: With Applications to Image Processing and Pattern Recognition, World Scientific, Singapore, 1996.


[62] J. Sanz, A. Fernandez, H. Bustince, F. Herrera, A genetic tuning to improve the performance of fuzzy rule-based classification systems with interval-valued fuzzy sets: Degree of ignorance and lateral position, International Journal of Approximate Reasoning 52 (6) (2011) 751–766.


[63] J. Sanz, A. Fernandez, H. Bustince, F. Herrera, IIVFDT: Ignorance functions based interval-valued fuzzy decision tree with genetic tuning, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 20 (2012) 1–30.


[64] M. S. Aguiar, G. P. Dimuro, A. C. R. Costa, ICTM: An interval tessellation-based model for reliable topographic segmentation, Numerical Algorithms 37 (1–4) (2004) 3–11.


[65] L. V. Barboza, G. P. Dimuro, R. H. S. Reiser, Towards interval analysis of the load uncertainty in power electric systems, in: Proceedings of the 8th International Conference on Probability Methods Applied to Power Systems, Ames, IEEE Computer Society Press, Los Alamitos, 2004, pp. 538–541.


[66] G. P. Dimuro, On interval fuzzy numbers, in: 2011 Workshop-School on Theoretical Computer Science, WEIT 2011, IEEE, Los Alamitos, 2011, pp. 3–8. doi:10.1109/WEIT.2011.19.


[67] B. C. Bedregal, G. P. Dimuro, R. H. S. Reiser, An approach to interval-valued R-implications and automorphisms, in: J. P. Carvalho, D. Dubois, U. Kaymak, J. M. da Costa Sousa (Eds.), Proceedings of the Joint 2009 International Fuzzy Systems Association World Congress and 2009 European Society of Fuzzy Logic and Technology Conference, IFSA/EUSFLAT, 2009, pp. 1–6.


[68] B. C. Bedregal, G. P. Dimuro, R. H. N. Santiago, R. H. S. Reiser, On interval fuzzy S-implications, Information Sciences 180 (8) (2010) 1373–1389.

[69] G. P. Dimuro, B. R. C. Bedregal, R. H. S. Reiser, R. H. N. Santiago, Interval additive generators of interval t-norms, in: W. Hodges, R. de Queiroz (Eds.), Proceedings of the 15th International Workshop on Logic, Language, Information and Computation, WoLLIC 2008, Edinburgh, no. 5110 in Lecture Notes in Artificial Intelligence, Springer, Berlin, 2008, pp. 123–135.

[70] T. C. Asmus, G. P. Dimuro, B. Bedregal, On two-player interval-valued fuzzy Bayesian games, International Journal of Intelligent Systems 32 (6) (2017) 557–596. doi:10.1002/int.21857.

Highlights

- A consensus method via penalty functions for decision making in ensembles of fuzzy rule-based classification systems is introduced.

- Overlap indices are built using overlap functions.

- A method for constructing confidence and support measures from overlap indices is presented.

- Theoretical results related to the developed methods are discussed.

- A new fuzzy rule mechanism is proposed, considering different overlap indices, which generalizes the classical methods.

- An example of the generation of fuzzy rule-based ensembles and decision making by consensus via penalty functions is presented.
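As a generic illustration of the first highlight (this is a textbook-style sketch, not the exact construction proposed in the paper), consensus via a penalty function selects, among candidate outputs, the value minimizing the aggregate penalty against all ensemble members. The grid search and the two sample penalty kernels below (squared difference, whose minimizer is the arithmetic mean, and absolute difference, whose minimizer is the median) are assumptions chosen for the example.

```python
# Illustrative sketch only: consensus of ensemble soft votes on [0, 1]
# by minimizing an aggregate penalty P(x, votes) = sum_i K(x, y_i).

def consensus(votes, penalty="squared"):
    """Return the grid value x in [0, 1] minimizing sum_i K(x, y_i)."""
    grid = [i / 1000 for i in range(1001)]  # candidate consensus values
    if penalty == "squared":
        # K(x, y) = (x - y)^2: the aggregate minimizer is the mean
        cost = lambda x: sum((x - y) ** 2 for y in votes)
    else:
        # K(x, y) = |x - y|: the aggregate minimizer is the median
        cost = lambda x: sum(abs(x - y) for y in votes)
    return min(grid, key=cost)

votes = [0.2, 0.4, 0.9]
consensus(votes)          # -> 0.5 (mean of the votes)
consensus(votes, "abs")   # -> 0.4 (median of the votes)
```

Different penalty kernels thus recover different classical aggregation functions, which is what makes the penalty-based view a unifying framework for decision making in ensembles.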
