Towards sophisticated decision models: Nonadditive robust ordinal regression for preference modeling


Knowledge-Based Systems xxx (xxxx) xxx

Contents lists available at ScienceDirect

Knowledge-Based Systems journal homepage: www.elsevier.com/locate/knosys

Towards sophisticated decision models: Nonadditive robust ordinal regression for preference modeling✩ ∗

Gleb Beliakov a,∗, Jian-Zhang Wu b, Dmitriy Divakov c

a School of Information Technology, Deakin University, Geelong, 3220, Australia
b School of Business, Ningbo University, Ningbo 315211, China
c Peoples’ Friendship University of Russia (RUDN University), 6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation

Article info

Article history: Received 13 March 2019; received in revised form 5 December 2019; accepted 6 December 2019; available online xxxx

Keywords: Fuzzy sets; Multicriteria decision making; Fuzzy measure; Ordinal regression; Aggregation functions

Abstract

This paper addresses a methodology for decision support under multiple and correlated decision criteria. Nonadditive robust ordinal regression (NAROR) aims to build capacities that fit the decision makers’ explicit preferences and pairwise rankings of some alternatives. The capacities provide great flexibility to model decision problems, accounting for interactions among the decision criteria. The feasible set of capacities helps to identify all the necessary and possible dominance relations among the decision alternatives. In this paper we enhance the NAROR method by identifying optimal capacities through entropy maximisation. We formulate suitable optimisation problems and provide avenues for capacity simplification based on k-interactivity. We also consider the situation of a large number of sparse constraints, for which we formulate a linear program based on Renyi entropy. We deal with preference inconsistency by using a multiple goal linear programming technique. The results show that k-interactivity is an efficient way to reduce the complexity of capacities while preserving their expressiveness and representation ability, and that optimal capacities can be found by standard mathematical programming techniques.

© 2019 Elsevier B.V. All rights reserved.

1. Introduction

In multicriteria decision analysis the criteria are often correlated, redundant or complementary. Most aggregation functions (functions that combine the values of the criteria or their utilities into an overall representative value, e.g., weighted means) treat the criteria independently, which causes inconsistency with the decision makers’ preferences. Capacities [1], also called fuzzy measures [2] or non-additive measures, are normalised monotone set functions on the set of decision criteria [3]. The additivity of classical probability measures is replaced by the weaker monotonicity condition (with respect to set inclusion). This enables capacities to model the dependency, or interaction, between multiple decision criteria more flexibly and adequately [4–7]. Numerous studies illustrate the benefits of replacing the traditional weighted averaging with the capacity-based Choquet, Sugeno or another fuzzy integral when aggregating dependent criteria in decision problems [4,5,8–11].

✩ No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.knosys.2019.105351.
∗ Corresponding author. E-mail address: [email protected] (G. Beliakov).

Capacity identification is one of the focus areas in the field of capacity-based multicriteria decision making. The main goal of capacity identification models is to translate explicit or implicit preferences of decision makers into a suitable capacity, see, e.g. [4,8,9,12–14]. Usually, the decision maker can provide some explicit preferences about the importance and interaction of decision criteria on an ordinal scale, like criterion i is more important than criterion j, the interaction between criteria i and j is greater than that between criteria k and l, and so on. Typically this information is rather sparse and cannot identify a unique suitable capacity or even a narrow range of capacities. In addition, the decision makers can provide some information about the ranking order of some alternatives, e.g., the alternative i is better than alternative j. This preference information on the decision examples actually reflects the decision makers’ preferences on the decision criteria too, hence can be considered as implicit preferences. With the explicit and implicit preference information, and the boundary and monotonicity constraints on a capacity, we usually only get a feasible region of capacities, which cannot produce a complete ranking of the decision alternatives. In order to determine this ranking order, one way is to select the most representative capacity according to some given principles, like the maximum entropy principle [15], the compromise principle [13], the interaction index oriented principle [14], the MCCPI (Multiple

https://doi.org/10.1016/j.knosys.2019.105351 0950-7051/© 2019 Elsevier B.V. All rights reserved.

Please cite this article as: G. Beliakov, J.-Z. Wu and D. Divakov, Towards sophisticated decision models: Nonadditive robust ordinal regression for preference modeling, Knowledge-Based Systems (2019) 105351, https://doi.org/10.1016/j.knosys.2019.105351.


G. Beliakov, J.-Z. Wu and D. Divakov / Knowledge-Based Systems xxx (xxxx) xxx

Criteria Correlation Preference Information) based least square and absolute deviation principles [9], and set function regression principles [4,12]. In these methods it is often required that the overall evaluations of the alternatives be given on the absolute scale, e.g., as a number in the unit interval. Another approach is called NAROR (Nonadditive robust ordinal regression) [16–21], which is based on matching all possible dominance relationships on the decision alternatives with all the feasible capacities. This method uses mathematical programming techniques to identify the desired capacities, but owing to the computational complexity of this task it is limited to 2-additive capacities.

In this paper we consider the problem of capacity identification from the NAROR perspective, i.e., through learning the optimal capacity from pairwise preferences in the set of alternatives and various explicit decision makers’ preferences. There are two issues we address in this paper. The first issue is the sparsity of the decision makers’ preferences available for identification of a unique suitable capacity. Even for modest numbers of decision criteria n of the order of 10, there are $2^n - 2$ capacity values to be identified (1022 for n = 10), and the decision makers can only provide a small number of pairwise preferences. This means there will be multiple equivalent solutions which satisfy all the required constraints (if the preferences are mutually consistent), or no suitable capacity in case of inconsistency. We will use entropy maximisation as an underlying principle in order to select the optimal capacity under given constraints. The second issue is the computational complexity. We tackle it from two directions: simplifying the capacity, yet not to the same degree as 2-additive capacities, which limit inputs’ interactions to pairs only, and converting the learning problem into a linear programming problem, which helps to deal efficiently with large numbers of sparse inequality constraints.

The contributions of this paper are two-fold. Firstly, it is the incorporation of entropy maximisation as the guiding principle to determine the optimal capacities in the NAROR framework, as maximising the entropy facilitates using more criteria in the decision process rather than relying on only a few. Secondly, it is a significant reduction of the computational complexity of the capacity learning problem, which is also based on maximising the entropy. This way the NAROR method can be used as a robust learning tool in multicriteria decision making and decision support systems with an even larger number of decision criteria.

This paper is organised as follows. After the introduction, we present basic facts about capacities and their representations in Section 2. In Section 3 we give a summary of the NAROR method advocated in [17–21]. Section 4 presents capacity entropies and also outlines capacity simplification strategies. The NAROR method with entropy maximisation is presented in Section 5. Here we provide different optimisation problem formulations, incorporate capacity simplification strategies and also deal with sparse matrices of constraints and constraint inconsistencies. We conclude the paper in Section 6.

2. Capacity and its representations

Let $N = \{1, 2, \ldots, n\}$, $n \geqslant 2$, be the decision criteria set, $\mathcal{P}(N)$ be the power set of N, and $|S|$ be the cardinality of subset $S \subseteq N$.

Definition 1 ([1,2,4]). A capacity on N is a set function $\mu : \mathcal{P}(N) \to [0, 1]$ such that

1. $\mu(\emptyset) = 0$, $\mu(N) = 1$ (boundary condition);
2. $\forall A, B \subseteq N$, $A \subseteq B$ implies $\mu(A) \leqslant \mu(B)$ (monotonicity condition).

The capacity is a monotonic set function in which the additivity of a probabilistic measure is replaced by the weaker monotonicity condition, which states an essential property of the decision criteria in the decision problem: the value of a subset of criteria cannot decrease when new criteria are added [4,5].

Definition 2 ([2,4,22]). A capacity $\mu$ on N is ⋆-additive if

$$\mu(A \cup B) \overset{\star}{=} \mu(A) + \mu(B), \quad \forall A, B \subseteq N,\ A, B \neq \emptyset,\ A \cap B = \emptyset.$$

Further, $\mu$ is ⋆-additive within $S \subseteq N$ if

$$\mu(A \cup B) \overset{\star}{=} \mu(A) + \mu(B), \quad \forall A, B \subseteq S,\ A, B \neq \emptyset,\ A \cap B = \emptyset,$$

where ‘‘$\overset{\star}{=}$’’ stands for ‘‘= (resp. ⩾, ⩽, >, and <)’’ and ‘‘⋆-additive’’ stands for ‘‘additive (resp. superadditive, subadditive, strict superadditive, and strict subadditive)’’.

The nonadditivity of a capacity, such as superadditivity and subadditivity, enables capacities to flexibly represent various kinds of interactions among the decision criteria, ranging from substitutivity (negative interaction) to complementarity (positive interaction) [4,23]. An additive capacity implies that the decision criteria are all independent of each other [24]. A strictly superadditive (resp. strictly subadditive, superadditive, and subadditive) capacity implies that to some degree all the decision criteria can be considered as mutually complementary (substitutive, nonsubstitutive, and noncomplementary).

Definition 3 ([23,25]). A set function $v : \mathcal{P}(N) \to \mathbb{R}$ is a representation of $\mu$ if there exists an invertible transform T such that $v = T(\mu)$ and $\mu = T^{-1}(v)$.

The most notable are the Möbius representation [26] and the Shapley interaction index representation [27]; both are routinely used in the field of capacity-based decision making.

Definition 4 ([4,25,27,28]). Let $\mu$ be a capacity on N. Then the Möbius representation of a capacity $\mu$ at a subset $A \subseteq N$ is defined as

$$m_\mu(A) = \sum_{C \subseteq A} (-1)^{|A \setminus C|} \mu(C),$$

and the Shapley interaction index of subset $A \subseteq N$ w.r.t. $\mu$ is defined as

$$I^{\mu}_{Sh}(A) = \sum_{B \subseteq N \setminus A} \frac{1}{|N| - |A| + 1} \binom{|N| - |A|}{|B|}^{-1} \sum_{C \subseteq A} (-1)^{|A \setminus C|} \mu(C \cup B).$$

The Shapley interaction index satisfies some axiomatic properties (see, e.g., [27–29]) and is widely adopted for describing the interaction phenomena among the decision criteria. It is designed to describe the simultaneous marginal interaction among the decision criteria. In order to represent the kind and degree of interaction associated with the nonadditivity, we proposed recently the notion of nonadditivity index, which is also a representation of a capacity [22].

Definition 5 ([22]). The nonadditivity index of a subset $A \subseteq N$ w.r.t. $\mu$ is defined as

$$n_\mu(A) = \mu(A) - \frac{1}{2^{|A|-1} - 1} \sum_{C \subset A} \mu(C). \tag{1}$$
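The three representations in Definitions 4 and 5 can be checked numerically on a small case. The following is a minimal sketch, assuming a capacity stored as a Python dictionary keyed by `frozenset`; the helper names (`subsets`, `mobius`, `shapley_interaction`, `nonadditivity`) and the three-criteria capacity are illustrative assumptions, not part of the original paper.

```python
from itertools import combinations
from math import comb

def subsets(items):
    """Yield every subset of `items` as a frozenset, smallest first."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)

def mobius(mu, A):
    """Moebius representation: sum over C subset of A of (-1)^|A minus C| * mu(C)."""
    return sum((-1) ** (len(A) - len(C)) * mu[C] for C in subsets(A))

def shapley_interaction(mu, N, A):
    """Shapley interaction index of A; for a singleton this is the Shapley value."""
    free = len(N) - len(A)
    total = 0.0
    for B in subsets(N - A):
        weight = 1.0 / ((free + 1) * comb(free, len(B)))
        total += weight * sum(
            (-1) ** (len(A) - len(C)) * mu[C | B] for C in subsets(A)
        )
    return total

def nonadditivity(mu, A):
    """Nonadditivity index of Eq. (1); used here for |A| >= 2 only."""
    proper = sum(mu[C] for C in subsets(A) if C != A)
    return mu[A] - proper / (2 ** (len(A) - 1) - 1)

# Hypothetical capacity on three criteria; criteria 1 and 2 are complementary.
N = frozenset({1, 2, 3})
mu = {
    frozenset(): 0.0,
    frozenset({1}): 0.2, frozenset({2}): 0.2, frozenset({3}): 0.4,
    frozenset({1, 2}): 0.6, frozenset({1, 3}): 0.6, frozenset({2, 3}): 0.6,
    N: 1.0,
}

pair = frozenset({1, 2})
print(mobius(mu, pair), shapley_interaction(mu, N, pair), nonadditivity(mu, pair))
```

For this particular capacity all three indices of the pair {1, 2} are close to 0.2 (positive interaction), while the nonadditivity index of {1, 3} vanishes because the capacity is additive within that pair.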

Theorem 1 ([22]). If a capacity $\mu$ on N is ⋆-additive, then $n_\mu(A) \overset{\star}{=} 0$, $\forall A \subseteq N$, $|A| \geqslant 2$. If a capacity $\mu$ on N is ⋆-additive within $S \subseteq N$, then $n_\mu(A) \overset{\star}{=} 0$, $\forall A \subseteq S$, $|A| \geqslant 2$.

Further, the nonadditivity index is normalised differently to the probabilistic interaction indices and takes values in [−1, 1] regardless of the cardinality of the subset, which allows nonadditivity indices of different subsets to be directly compared.

When using capacities to describe the importance and interaction of the decision criteria, the Choquet integral is a widely applied aggregation function to calculate the comprehensive evaluations of the decision alternatives.

Definition 6. For a given $x \in [0, 1]^n$, the discrete Choquet integral $C_\mu$ of x with respect to capacity $\mu$ on N is defined as:

$$C_\mu(x) = \sum_{i=1}^{n} (x_{(i)} - x_{(i-1)})\, \mu(\{(i), \ldots, (n)\}),$$

where $x_{(\cdot)}$ is a non-decreasing permutation induced by $x_i$, $i = 1, \ldots, n$, i.e., $x_{(1)} \leqslant \cdots \leqslant x_{(n)}$, and $x_{(0)} = 0 \leqslant x_{(1)}$ by convention.

The Choquet integral can also be expressed in terms of the Möbius representation as [4]:

$$C_m(x) = \sum_{A \subseteq N} m(A)\, x_A,$$

where $x_A = \min_{i \in A}(x_i)$. The Choquet integral can be represented in terms of a capacity without previously ordering the partial values of x as [5]:

$$C_\mu(x) = \sum_{A \subseteq N} \mu(A)\, \hat{x}_A, \tag{2}$$

where the basis functions are $\hat{x}_A = \max\{0, \min_{i \in A} x_i - \max_{i \in N \setminus A} x_i\}$, $\forall A \subseteq N$.

Another representation of a capacity is based on the marginal contributions [30,31]. The monotonicity condition can be equally rewritten as

$$\Delta_i \mu(B) = \mu(B \cup \{i\}) - \mu(B) \geqslant 0, \quad \forall i \in N,\ B \subseteq N \setminus \{i\}. \tag{3}$$

That is, the marginal contribution of any criterion i to any subset B, denoted as $\Delta_i \mu(B)$, is always nonnegative as a consequence of the monotonicity condition.

Remark 1. In the context of pseudo-boolean functions [25,32], $\Delta_i \mu(B)$ is called the ith derivative of $\mu$ at B.

Theorem 2. Let $N = \{1, 2, \ldots, n\}$, and let $\pi$ denote a permutation of $(1, 2, \ldots, n)$. Then $\mu$ is a capacity on N if and only if $\Delta_i \mu(B) \geqslant 0$ for all $i \in N$ and $B \subseteq N \setminus \{i\}$ and

$$\sum_{i=1}^{n} \Delta_{\pi(i)} \mu(N_{\pi(i-1)}) = 1 \quad \text{for all } \pi, \tag{4}$$

where $N_{\pi(i)} = \{\pi(1), \ldots, \pi(i)\}$, $N_{\pi(0)} = \emptyset$.

Remark 2. If we adopt the marginal contributions as the variables to represent the capacity, the number of (non-negative) variables is $n \times 2^{n-1}$, and the number of (possibly redundant) constraints (4) is $n!$. This contrasts with the much smaller numbers of variables and constraints in the standard or Möbius representation. The rationale for using the marginal contributions representation is the efficient treatment of the entropy. In addition, it was shown in [30] how to reduce the number of variables in that representation to $2^n - 2$.

3. Nonadditive robust ordinal regression

Based on the Shapley importance and interaction index and the Choquet integral, the nonadditive robust ordinal regression (NAROR) has been constructed and studied [16–21,33,34]. In summary, the nonadditive robust ordinal regression model has the following three steps:

Step 1: elicit preference information characterising all compatible capacities. The preferences are usually expressed by two types of linear constraints. The first type reflects preferences on some, not necessarily all, decision alternatives, like:

• the alternative a is at least as good as b: $a \succsim b \Leftrightarrow C_\mu(a) \geqslant C_\mu(b)$;

• a is preferred to b at least as much as c is preferred to d: $(a, b) \succsim (c, d) \Leftrightarrow C_\mu(a) - C_\mu(b) \geqslant C_\mu(c) - C_\mu(d)$.
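Because Eq. (2) is linear in the capacity values, a comparison such as $C_\mu(a) \geqslant C_\mu(b)$ becomes one linear constraint on $\mu$. The sketch below, under the assumption of a dictionary-based capacity with criteria indexed from 0, computes the Choquet integral both by the sorted form of Definition 6 and by the basis functions of Eq. (2), and extracts the coefficient row of such a constraint. The function names and the numbers are hypothetical illustrations.

```python
from itertools import combinations

def choquet_sorted(x, mu):
    """Definition 6: sum of (x_(i) - x_(i-1)) * mu({(i), ..., (n)})."""
    order = sorted(range(len(x)), key=lambda i: x[i])   # indices by increasing x
    total, previous = 0.0, 0.0
    for pos, i in enumerate(order):
        total += (x[i] - previous) * mu[frozenset(order[pos:])]
        previous = x[i]
    return total

def basis(x, A):
    """Basis function of Eq. (2); the max over an empty complement is taken as 0."""
    outside = [x[i] for i in range(len(x)) if i not in A]
    return max(0.0, min(x[i] for i in A) - max(outside, default=0.0))

def choquet_basis(x, mu):
    """Eq. (2): a linear combination of the capacity values mu(A)."""
    return sum(mu[A] * basis(x, A) for A in mu if A)

def preference_row(a, b, mu):
    """Coefficients c_A such that C_mu(a) - C_mu(b) = sum_A c_A * mu(A)."""
    return {A: basis(a, A) - basis(b, A) for A in mu if A}

# Hypothetical capacity on criteria {0, 1, 2} and two hypothetical alternatives.
values = {1: 0.2, 2: 0.6, 3: 1.0}                 # default value by subset size
mu = {frozenset(): 0.0}
for r in (1, 2, 3):
    for c in combinations(range(3), r):
        mu[frozenset(c)] = values[r]
mu[frozenset({2})] = 0.4                          # criterion 2 weighs more alone

a = (0.8, 0.3, 0.5)
b = (0.4, 0.6, 0.2)
row = preference_row(a, b, mu)
gap = sum(row[A] * mu[A] for A in row)            # equals C_mu(a) - C_mu(b)
print(choquet_sorted(a, mu), choquet_sorted(b, mu), gap)
```

The row returned by `preference_row` is what would enter the constraint matrix of a linear program in the standard representation; only the entries for subsets with a nonzero basis value are nonzero, which is the source of the sparsity discussed later in the paper.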

The second type reflects the preferences in the decision criteria space, mainly the pairwise comparison of the importance of some decision criteria as well as the interactions between two criteria, like:

• criterion i is at least as important as criterion j: $i \succsim j \Leftrightarrow I^{\mu}_{Sh}(\{i\}) \geqslant I^{\mu}_{Sh}(\{j\})$;

• the difference of importance between criteria i and j is at least as big as the difference of importance between criteria k and l: $(i, j) \succsim (k, l) \Leftrightarrow I^{\mu}_{Sh}(\{i\}) - I^{\mu}_{Sh}(\{j\}) \geqslant I^{\mu}_{Sh}(\{k\}) - I^{\mu}_{Sh}(\{l\})$;

• the sign of interaction of pairs of criteria is positive or negative: $[i, j] \geqslant (\leqslant)\, 0 \Leftrightarrow I^{\mu}_{Sh}(\{i, j\}) \geqslant (\leqslant)\, 0$;

• interaction intensity between criteria i and j is at least as strong as interaction intensity between criteria k and l: $|[i, j]| \succsim |[k, l]| \Leftrightarrow |I^{\mu}_{Sh}(\{i, j\})| \geqslant |I^{\mu}_{Sh}(\{k, l\})|$;

• the difference of interaction intensity between criteria i and j and interaction intensity between criteria k and l is at least as strong as the difference of interaction intensity between criteria r and s and interaction intensity between criteria t and w: $(|[i, j]|, |[k, l]|) \succsim (|[r, s]|, |[t, w]|) \Leftrightarrow |I^{\mu}_{Sh}(\{i, j\})| - |I^{\mu}_{Sh}(\{k, l\})| \geqslant |I^{\mu}_{Sh}(\{r, s\})| - |I^{\mu}_{Sh}(\{t, w\})|$.

It should be mentioned that the constraints involving the absolute value can be translated into normal linear constraints if the signs of interactions are given. Further, in practice we may encounter the indifference ‘‘∼’’ and preference ‘‘≻’’ relationships, which will correspond to ‘‘=’’ and ‘‘>’’ in the above constraints. These preference constraints, combined with the boundary and monotonicity constraints on a capacity, constitute the feasible domain of all compatible capacities. In addition to that, we also advocate the use of the nonadditivity indices in the same spirit as the Shapley values above.

Step 2: check the preference inconsistency and adjust if needed. The constraints in Step 1 can be collected into the constraints set $E^{AC}$. As mentioned above, there are three types of constraints: the equality, the weak inequality and the strict inequality. By introducing an auxiliary variable $\varepsilon$, the strict inequalities can be changed into weak inequalities. For example, we can write the constraints as:

• $C_\mu(a) = C_\mu(b)$ if $a \sim b$,
• $C_\mu(a) \geqslant C_\mu(b)$ if $a \succsim b$,
• $C_\mu(a) \geqslant C_\mu(b) + \varepsilon$ if $a \succ b$.
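As a small worked illustration of how such statements turn into linear constraints, consider a hypothetical two-criteria problem; the alternatives a, b and their scores are assumptions made for this example only. With n = 2 the only free capacity values are $\mu_1 = \mu(\{1\})$ and $\mu_2 = \mu(\{2\})$, and the definitions above give:

```latex
\begin{align*}
a = (0.8,\,0.3):\quad & C_\mu(a) = 0.3 + (0.8 - 0.3)\,\mu_1 = 0.3 + 0.5\,\mu_1,\\
b = (0.4,\,0.6):\quad & C_\mu(b) = 0.4 + (0.6 - 0.4)\,\mu_2 = 0.4 + 0.2\,\mu_2,\\
a \succ b \iff\ & 0.5\,\mu_1 - 0.2\,\mu_2 \geqslant 0.1 + \varepsilon,\\[4pt]
I^{\mu}_{Sh}(\{1\}) &= \tfrac12\,(1 + \mu_1 - \mu_2),\qquad
I^{\mu}_{Sh}(\{2\}) = \tfrac12\,(1 + \mu_2 - \mu_1),\\
1 \succsim 2 \iff\ & \mu_1 \geqslant \mu_2,\\[4pt]
I^{\mu}_{Sh}(\{1,2\}) &= 1 - \mu_1 - \mu_2,\qquad
[1,2] \geqslant 0 \iff \mu_1 + \mu_2 \leqslant 1.
\end{align*}
```

Each statement is linear in $(\mu_1, \mu_2, \varepsilon)$, so together with $0 \leqslant \mu_1, \mu_2 \leqslant 1$ the compatible capacities form a polygon in the unit square.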


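For convenience, in the following, we only write the weak inequalities to represent $E^{AC}$ as follows:

$$E^{AC} \Leftarrow \begin{cases}
C_\mu(a) \geqslant C_\mu(b) + \varepsilon, \\
C_\mu(a) - C_\mu(b) \geqslant C_\mu(c) - C_\mu(d) + \varepsilon, \\
I^{\mu}_{Sh}(\{i\}) \geqslant I^{\mu}_{Sh}(\{j\}) + \varepsilon, \\
I^{\mu}_{Sh}(\{i\}) - I^{\mu}_{Sh}(\{j\}) \geqslant I^{\mu}_{Sh}(\{k\}) - I^{\mu}_{Sh}(\{l\}) + \varepsilon, \\
I^{\mu}_{Sh}(\{i, j\}) \geqslant \varepsilon \ \ (I^{\mu}_{Sh}(\{i, j\}) \leqslant -\varepsilon), \\
|I^{\mu}_{Sh}(\{i, j\})| \geqslant |I^{\mu}_{Sh}(\{k, l\})| + \varepsilon, \\
|I^{\mu}_{Sh}(\{i, j\})| - |I^{\mu}_{Sh}(\{k, l\})| \geqslant |I^{\mu}_{Sh}(\{r, s\})| - |I^{\mu}_{Sh}(\{t, w\})| + \varepsilon, \\
\mu(\emptyset) = 0,\ \mu(N) = 1, \\
\mu(A) \leqslant \mu(B),\ \forall A, B \subseteq N,\ A \subseteq B,
\end{cases}$$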

where $\varepsilon$ is an auxiliary variable. If $E^{AC}$ is feasible and $\varepsilon^* = \max \varepsilon > 0$ subject to $E^{AC}$, then there exists at least one compatible capacity. Otherwise, one should check inconsistency and adjust the constraints by using some techniques, among which the most common one is the 0-1 linear programming based method [19,33].

Step 3: exploit the preferences on all the decision alternatives, which include the necessary preferences and possible preferences [16,18] obtained from the following two types of linear programs. The necessary preference of an alternative pair x, y that is not given in Step 1 can be confirmed by the following linear programming problem if it has a nonpositive optimal value:

$$\max\ \varepsilon \quad \text{s.t.} \quad C_\mu(y) \geqslant C_\mu(x) + \varepsilon,\ \ E^{AC}. \tag{5}$$

If the programming problem (5) has a nonpositive optimal value $\varepsilon \leqslant 0$, then $C_\mu(x) \geqslant C_\mu(y)$ for all compatible capacities. In contrast, a possible preference of an alternative pair x, y that is not given in Step 1 can be confirmed by the following linear program if it has a positive optimal value:

$$\max\ \varepsilon \quad \text{s.t.} \quad C_\mu(x) \geqslant C_\mu(y),\ \ E^{AC}. \tag{6}$$
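Programs (5) and (6) can be solved by hand in a two-criteria toy case. The following worked example is only an illustration under assumed data: $E^{AC}$ is taken to consist of a single strict preference, written as $0.5\,\mu_1 - 0.2\,\mu_2 \geqslant 0.1 + \varepsilon$, together with $0 \leqslant \mu_1, \mu_2 \leqslant 1$, and the pair to be tested is $x = (0.9, 0.2)$, $y = (0.3, 0.4)$.

```latex
\begin{align*}
C_\mu(x) &= 0.2 + 0.7\,\mu_1, \qquad C_\mu(y) = 0.3 + 0.1\,\mu_2.\\[4pt]
\text{Program (5):}\quad
 & \varepsilon \leqslant 0.1 + 0.1\,\mu_2 - 0.7\,\mu_1
   \quad\text{and}\quad
   \varepsilon \leqslant 0.5\,\mu_1 - 0.2\,\mu_2 - 0.1.\\
 & \text{Adding 5 times the first bound to 7 times the second: }
   12\,\varepsilon \leqslant -0.2 - 0.9\,\mu_2 \leqslant -0.2,\\
 & \text{so } \varepsilon^{*} = -\tfrac{1}{60}
   \text{ (attained at } \mu_1 = \tfrac16,\ \mu_2 = 0\text{)}.\\[4pt]
\text{Program (6):}\quad
 & 0.2 + 0.7\,\mu_1 \geqslant 0.3 + 0.1\,\mu_2
   \quad\text{and}\quad
   \varepsilon \leqslant 0.5\,\mu_1 - 0.2\,\mu_2 - 0.1 \leqslant 0.4,\\
 & \text{so } \varepsilon^{*} = 0.4
   \text{ (attained at } \mu_1 = 1,\ \mu_2 = 0\text{)}.
\end{align*}
```

In this toy case the optimal value of (5) is nonpositive and that of (6) is positive, so x is both necessarily and possibly preferred to y under the assumed $E^{AC}$.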

If the program (6) has a positive optimal value $\varepsilon > 0$, then $C_\mu(x) \geqslant C_\mu(y)$ for at least one compatible capacity.

We highlight that, differently to the usual regression problems, in NAROR the actual values of the Choquet integral to be fitted are not provided. This implies that the NAROR method does not identify one particular capacity that best fits the decision maker’s preferences, but rather a subset of feasible capacities. In [19] a procedure useful to build a single most representative capacity summarising the results of NAROR has been introduced. Our goal here is to explore a different direction and formulate an additional optimisation criterion to determine in some sense the ‘‘best’’ feasible capacity.

4. Entropy and k-interactivity

4.1. Entropy

The (Shannon) entropy of a capacity $\mu$ on N is defined by [35]

$$E(\mu) = \sum_{i=1}^{n} \sum_{A \subseteq N \setminus \{i\}} \frac{(n - |A| - 1)!\,|A|!}{n!}\, h\big(\mu(A \cup \{i\}) - \mu(A)\big), \tag{7}$$

where $h(x) = -x \ln x$ if $x > 0$ and 0 if $x = 0$. The symmetric additive capacity has the largest entropy value $\ln(n)$. The Choquet integral with respect to such a capacity coincides with the arithmetic mean function.

In addition to Shannon’s entropy in (7), Renyi’s entropy is often used. Here we have for $\alpha > 0$, $\alpha \neq 1$

$$H_\alpha(\mu) = \frac{1}{1 - \alpha} \ln \left( \sum_{i=1}^{n} \sum_{A \subseteq N \setminus \{i\}} \frac{(n - |A| - 1)!\,|A|!}{n!} \big(\mu(A \cup \{i\}) - \mu(A)\big)^{\alpha} \right),$$

which gives Shannon’s entropy as the limiting case $\alpha \to 1$.

If we adopt the Choquet integral as the overall aggregation function, the entropy of the capacity $\mu$ measures the average contribution of the arguments $x_1, x_2, \ldots, x_n$ to the overall evaluation $C_\mu(x)$. So, the aim of the maximum entropy principle is to seek the highest average contribution of the arguments in the aggregation phase [35]. That is, the maximum entropy principle tends to give each criterion an equal chance to affect the overall aggregation result [13].

4.2. k-order interactions

Despite its flexibility and ability to model inputs’ interactions, the use of the Choquet integral as an aggregation function has been limited to small dimensions because of the inherent complexity of capacities in terms of the number of parameters to be specified. To simplify the construction of fuzzy measures while keeping some information about interactions, Grabisch [27] introduced the k-order additivity.

Definition 7. A capacity $\mu$ is k-additive if its Möbius transform satisfies $m(A) = 0$ for any A such that $|A| > k$ and there exists at least one subset $A \subseteq N$ of exactly k elements such that $m(A) \neq 0$.

The special case of 2-additive capacities has been widely studied. A 2-additive capacity is entirely determined by the coefficients of singletons and pairs. The Choquet integral with respect to a 2-additive capacity can be computed from the Shapley indices of singletons $\phi(i) = I^{\mu}_{Sh}(\{i\})$ and the interaction indices [36]:

$$C_\mu(x) = \sum_{i=1}^{n} \phi(i)\, x_i - \frac{1}{2} \sum_{\{i,j\} \subseteq N} I^{\mu}_{Sh}(\{i, j\})\, |x_i - x_j|.$$

The concept of k-tolerant (and k-intolerant) capacities was proposed in [32,37].

Definition 8. Let $k \in \{1, \ldots, n\}$. A capacity $\mu$ on N is k-tolerant if $\mu(A) = 1$ for all $A \subseteq N$ such that $|A| \geqslant k$ and there exists a subset $B \subseteq N$, with $|B| = k - 1$, such that $\mu(B) \neq 1$. A capacity $\mu$ on N is k-intolerant if $\mu(A) = 0$ for all $A \subseteq N$ such that $|A| \leqslant n - k$ and there exists a subset $B \subseteq N$, with $|B| = n - k + 1$, such that $\mu(B) \neq 0$.

k-intolerant capacities can be obtained from k-tolerant capacities by using duality. The Choquet integral with respect to a k-tolerant capacity is independent of the first n − k smallest inputs.

From the point of view of capacity specification, k-order capacities significantly reduce the total number of parameters while preserving many input interactions, but only marginally reduce the monotonicity constraints, which are fundamental in their definitions. Exceptions are the 2-additive and k-tolerant (intolerant) capacities, which also reduce the number of constraints but arguably at the cost of oversimplifying the interactions.

Another approach to reduce the complexity of capacity identification was proposed in [38]. It works by fixing the values of the capacity for all subsets of cardinality greater than k in some appropriate way. The approach which maximises the partial entropy of the capacity (calculated over subsets of cardinality greater than k) results in the following definition.


Definition 9. A capacity $\mu$ on N is called k-interactive if for some chosen $K \in [0, 1]$

$$\mu(A) = K + \frac{|A| - k - 1}{n - k - 1}\,(1 - K), \quad \text{for all } A,\ |A| > k.$$

In particular, the values $\mu(B)$ for all B, $|B| = k + 1$, are fixed at K.

These conditions simplify various expressions, in particular the Choquet integral, and significantly reduce the number of variables and constraints when constructing capacities from the data. The k-tolerant capacities arise as the special cases. Note that the interactions in the subsets larger than k take place but are predefined by interactions in smaller subsets and the values of k, K. Also note that the particular formula in Definition 9 is obtained by applying the maximum entropy principle [38], which maximises the average contribution of the n − k − 1 smallest inputs.

The Choquet integral with respect to a k-interactive capacity can be written as

$$C_\mu(x) = \frac{1 - K}{n - k - 1} \sum_{i=1}^{n-k-1} x_{(i)} + K x_{(n-k)} + \sum_{A \subseteq N,\ |A| \leqslant k} \mu(A)\, \hat{x}_A, \tag{8}$$

which is derived from (2).

We see that the contribution of the n − k − 1 smallest inputs is averaged with the arithmetic mean, while the interactions are explicitly accounted for in the remaining inputs. Importantly, k-interactive capacities significantly reduce both the number of parameters and the monotonicity constraints, and make it feasible to fit capacities to data for larger n by solving an optimisation problem detailed in the next section.

4.3. Entropy maximisation with constraints

Maximisation of entropy under a set of linear equality and inequality constraints is a well known problem [39,40], for which several approaches to solution exist. The problem is formulated as

$$\max\ E(x) \quad \text{s.t.} \quad Ax \preceq b,\ \ Cx = d, \tag{9}$$

where $x \in \mathbb{R}^n$, $b \in \mathbb{R}^m$, $d \in \mathbb{R}^k$, $A \in \mathbb{R}^{m \times n}$ and $C \in \mathbb{R}^{k \times n}$. This is a convex nonlinear optimisation problem whose unique solution can be obtained by using Lagrange multipliers [41]. In particular, if we have one equality constraint and multiple inequality constraints,

$$\max\ -\sum_{i=1}^{n} x_i \log x_i \quad \text{s.t.} \quad Ax \preceq b,\ \ \mathbf{1}^T x = 1, \tag{10}$$

a change to dual variables is warranted, so the problem becomes

$$\max\ -b^T \lambda - v - e^{-v-1} \sum_{i=1}^{n} e^{-a_i^T \lambda} \quad \text{s.t.} \quad \lambda \geqslant 0, \tag{11}$$

where the dual variables are $\lambda \in \mathbb{R}^m$, $v \in \mathbb{R}$ and $a_i$ is the ith column of A. Maximising over the variable v analytically gives

$$\max\ -b^T \lambda - \log \left( \sum_{i=1}^{n} e^{-a_i^T \lambda} \right) \quad \text{s.t.} \quad \lambda \geqslant 0, \tag{12}$$

which is a geometric programming problem with non-negativity constraints. There are very efficient specialised methods for solving such problems with a large number of inequality constraints [41].

5. NAROR with entropy maximisation

5.1. Problem formulation

The decision maker’s preferences in NAROR are expressed in the form of the constraints $E^{AC}$, which can be converted into a set of linear constraints by representing the modulus function with the help of two auxiliary non-negative variables $d_r^+$, $d_r^-$, so that the rth variable $a_r = d_r^+ - d_r^-$ and $|a_r| = d_r^+ + d_r^-$ (and one of $d_r^+$, $d_r^-$ is zero). Hence we assume that all the constraints in $E^{AC}$ have been converted into an equivalent set of linear constraints, which we also denote by $E^{AC}$.

The set of constraints $E^{AC}$ provides a feasible set D of all capacities consistent with the decision maker’s preferences, which can be empty in case these preferences are inconsistent. We discuss the latter case in a subsequent section, and at the moment focus on nonempty D. Different capacities from D can produce different orderings of the available alternatives, and it is a significant challenge to determine some particular capacity from D which is optimal in some sense. Moreover, typically the set of constraints $E^{AC}$ is rather small compared to the total number of parameters which identify a capacity even for moderate n, because the decision makers provide only a few preferences or alternative rankings.

In this paper we advocate the use of the maximal entropy principle as an additional criterion to select the optimal capacity consistent with $E^{AC}$. The rationale is that maximum entropy capacities tend to take into account as many criteria as possible and weight them rather uniformly. In the absence of the decision maker’s preferences the principle of maximum entropy results in the additive symmetric capacity, for which the Choquet integral is the arithmetic mean function. When the preferences are specified, we expect the optimal compatible capacity to be the one from D closest to the additive symmetric capacity.

Thus the maximum entropy capacity identification problem is formulated as follows:

$$\max\ E(\mu) \quad \text{s.t.} \quad A\mu \preceq b,\ \ C\mu = d, \tag{13}$$

where the set of constraints $E^{AC}$ is written as systems of inequality and equality constraints. Now we notice that the expression for $E(\mu)$ in (7) does not involve the functions $h(\mu(A))$ but rather h of the differences between the values of $\mu$. Therefore, to convert problem (13) into the standard entropy maximisation problem (9), we use the marginal contribution representation of capacities (3), in which the decision variables are $\Delta_i(A)$:

$$\max\ E(\Delta) \quad \text{s.t.} \quad A\Delta \preceq b,\ \ C\Delta = d, \tag{14}$$

where the constraints are rewritten in terms of the variables $\Delta$ (we shall use the same letters to denote the matrices and the right hand sides).
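The entropy objective depends on the capacity only through the marginal contributions $\mu(A \cup \{i\}) - \mu(A)$, which is easy to see in code. The sketch below is a plain-Python illustration, assuming a dictionary-based capacity; the function names and both example capacities are hypothetical. It evaluates the Shannon entropy (7) and the Renyi entropy, and can be used to confirm on a small case that the additive symmetric capacity attains $\ln n$.

```python
from itertools import combinations
from math import factorial, log

def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)

def marginals(mu, N):
    """Yield (weight, delta): weight = (n-|A|-1)!|A|!/n!, delta = mu(A+i) - mu(A)."""
    n = len(N)
    for i in N:
        for A in subsets(N - {i}):
            weight = factorial(n - len(A) - 1) * factorial(len(A)) / factorial(n)
            yield weight, mu[A | {i}] - mu[A]

def shannon_entropy(mu, N):
    """Eq. (7) with h(x) = -x ln x and h(0) = 0."""
    return sum(w * (-d * log(d)) for w, d in marginals(mu, N) if d > 0)

def renyi_entropy(mu, N, alpha):
    """Renyi entropy for alpha > 0, alpha != 1."""
    return log(sum(w * d ** alpha for w, d in marginals(mu, N))) / (1 - alpha)

N = frozenset({1, 2, 3})
# Additive symmetric capacity: mu(A) = |A| / n.
uniform = {A: len(A) / len(N) for A in subsets(N)}
# Hypothetical non-symmetric capacity on the same criteria.
skewed = {
    frozenset(): 0.0,
    frozenset({1}): 0.2, frozenset({2}): 0.2, frozenset({3}): 0.4,
    frozenset({1, 2}): 0.6, frozenset({1, 3}): 0.6, frozenset({2, 3}): 0.6,
    N: 1.0,
}
print(shannon_entropy(uniform, N), log(3), shannon_entropy(skewed, N))
```

In an actual identification run these marginal contributions would be the decision variables of problem (14); here they are simply read off a fixed capacity to evaluate the objective.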


The set of equality constraints stems from Eq. (4), whereas the inequality constraints involve simple non-negativity of the variables (monotonicity of the capacity) and the constraints in $E^{AC}$ excluding monotonicity. The Choquet integral which appears in $E^{AC}$ can be written in terms of the marginal contributions as

$$C_\mu(x) = \sum_{i=1}^{n} x_{\pi(i)}\, \Delta_{\pi(i)} \mu(X_{\pi(i+1)}), \tag{15}$$

where $x_{\pi(\cdot)}$ is a non-decreasing permutation induced by $x_i$, $i = 1, \ldots, n$, i.e., $x_{\pi(1)} \leqslant \cdots \leqslant x_{\pi(n)}$, and $X_{\pi(i)} = \{\pi(i), \ldots, \pi(n)\}$, $X_{\pi(n+1)} = \emptyset$, by convention.

Once the objective and the constraints have been set, the numerical solution to problem (14) can be found by methods of convex optimisation, found in the package CVX [42], which is available as a Python, Matlab or R library.

5.2. Using k-interactive capacities

Even a moderate n of about 10 leads to a very high number of decision variables and constraints. In order to reduce the complexity of capacities we employ the k-interactive capacities from Definition 9, where the values of $\mu(A)$ are fixed for subsets of cardinality greater than k, $k < n$, and $\mu(A) = K$, $|A| = k + 1$, where k, K are user-selected parameters. The rationale is twofold: (a) a significant reduction in the number of parameters and constraints, and (b) the defining formula in Definition 9 is based on the maximum entropy principle itself, where the partial entropy (over the subsets of cardinality larger than k) is maximised unconditionally (i.e., irrespective of the decision maker’s preferences in $E^{AC}$).

Hence the simplification strategy based on k-interactive capacities is consistent with the maximum entropy principle employed in this work. The entropy over the subsets of large cardinality is maximised unconditionally and the entropy over smaller subsets is optimised conditionally on the decision maker’s explicit and implicit preferences expressed in $E^{AC}$. The decision variables are a subset of marginal contributions $\Delta_i(A)$ for $|A| \leqslant k$, and the remaining variables are fixed at $\Delta_i(A) = \frac{1 - K}{n - k - 1}$. The expressions for the Choquet integral and various indices also simplify, whereas the equality constraints reduce in number.
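The effect of Definition 9 on the Choquet integral and on the number of free parameters can be checked on a small case. The following sketch assumes criteria indexed from 0, requires k < n − 1, and fills the subsets of size at most k with a hypothetical symmetric rule; the function names and all numbers are illustrative assumptions rather than anything prescribed in the paper.

```python
from itertools import combinations
from math import comb

def k_interactive(n, k, K, small):
    """Capacity with free values small(A) for |A| <= k and Definition 9 above k."""
    mu = {}
    for r in range(n + 1):
        for c in combinations(range(n), r):
            A = frozenset(c)
            mu[A] = small(A) if r <= k else K + (r - k - 1) / (n - k - 1) * (1 - K)
    return mu

def basis(x, A):
    """Basis function of Eq. (2); the max over an empty complement is taken as 0."""
    outside = [x[i] for i in range(len(x)) if i not in A]
    return max(0.0, min(x[i] for i in A) - max(outside, default=0.0))

def choquet_full(x, mu):
    """Eq. (2): sum over all nonempty subsets."""
    return sum(mu[A] * basis(x, A) for A in mu if A)

def choquet_k_interactive(x, mu, k, K):
    """Eq. (8): averaged smallest inputs plus explicit terms for |A| <= k."""
    n = len(x)
    xs = sorted(x)
    head = (1 - K) / (n - k - 1) * sum(xs[: n - k - 1]) + K * xs[n - k - 1]
    tail = sum(mu[A] * basis(x, A) for A in mu if A and len(A) <= k)
    return head + tail

def free_parameters(n, k):
    """Number of capacity values left free when subsets above size k are fixed."""
    return sum(comb(n, r) for r in range(1, k + 1))

n, k, K = 5, 2, 0.5
mu = k_interactive(n, k, K, small=lambda A: K * len(A) / (k + 1))
x = (0.7, 0.1, 0.4, 0.9, 0.3)
print(choquet_full(x, mu), choquet_k_interactive(x, mu, k, K))
print(free_parameters(10, 2), 2 ** 10 - 2)
```

The two Choquet evaluations agree up to rounding, while the second one touches only the subsets of size at most k. For n = 10 and k = 2 the count of free values drops from 1022 to 55.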

5.3. Sparse matrices and large n: maximising min-entropy

While maximisation of the Shannon entropy (14) is achievable through convex optimisation, for sufficiently large n (of the order of 20) it becomes numerically expensive because of the sheer number of non-negativity and equality constraints. In this section we consider a different type of entropy, the Renyi entropy $H_\infty$, also called the min-entropy. This measure has been used by Yager in [43] in the context of OWA weights determination. Therefore the objective is to maximise the Renyi $H_\infty$ entropy

$$\max H_\infty(\Delta) = -\log(\max(\Delta)),$$

which translates (using monotonicity of the log function) into minimising $\max(\Delta)$, subject to the same set of linear constraints as before. This optimisation problem can be solved by methods of linear programming, namely by solving

$$\begin{aligned} \min\ & t \\ \text{s.t. } & t \ge \Delta_i, \ \text{for all } i, \\ & A\Delta \preceq b, \\ & C\Delta = d. \end{aligned} \tag{16}$$

The numerical advantages include significant numerical efficiency of linear programming and the widespread ability of LP software to handle sparse matrices of constraints. The automatic incorporation of the non-negativity of the variables (which reflects monotonicity of the capacity) is also a significant bonus. Indeed, the matrices of constraints are very sparse: for example, there are at most n nonzero entries per row in the matrix C even without k-interactivity, which also helps to simplify the problem. One can also switch back to the standard capacity representation in this case and write the problem as

$$\begin{aligned} \min\ & t \\ \text{s.t. } & t \ge \mu(A \cup \{i\}) - \mu(A), \ \text{for all } A \subset N \text{ and } i \notin A, \\ & \mu(A \cup \{i\}) - \mu(A) \ge 0, \\ & A\mu \preceq b, \\ & C\mu = d. \end{aligned} \tag{17}$$

This reduces significantly the number of variables and constraints and allows one to take full advantage of the sparsity of the matrices of constraints. Therefore the combination of k-interactivity with optimising the Renyi $H_\infty$ entropy by linear programming is our method of choice for n larger than 10.

5.4. Dealing with the preference inconsistency by multiple goal linear programming

If $E^{AC}$ is infeasible or $\varepsilon^* \le 0$, where $\varepsilon^* = \max \varepsilon$ subject to $E^{AC}$, then we need to check and adjust the preference inconsistency. In contrast to the 0-1 linear programming based method mentioned in Section 3, we give a multiple goal linear programming based consistency check method, which can help decision makers to remove or adjust the inconsistent constraints according to their deviation degree, reduce the redundant constraints to a certain extent, and even provide some suggestions for adjusting the inconsistent and redundant constraints.

First, by introducing the positive and negative deviation variables, $d_r^+$ and $d_r^-$, where $r = 1, \ldots, p$, p is the total number of preference constraints in $E^{AC}$, $\varepsilon_r = d_r^+ - d_r^-$, and associating them with each preference constraint, the constraints like $C_\mu(a) \ge C_\mu(b) + \varepsilon$ change into

$$C_\mu(a) - C_\mu(b) - d_r^+ + d_r^- = 0.$$

We denote the goal linear preference constraints and the boundary and monotonicity constraints of capacity by $E_G^{AC}$. Then, by constructing and solving the following multiple goal linear programming problem:

$$\begin{aligned} \min\ & \sum_{r=1}^{p} \left( d_r^+ + d_r^- \right) \\ \text{s.t. } & E_G^{AC}, \end{aligned} \tag{18}$$

where $d_r^+, d_r^- \ge 0$, we can find one subset of consistent constraints with both optimal deviation variables equal to zero, $d_r^{+*} = d_r^{-*} = 0$, and other inconsistent (not necessarily contradictory, can also be redundant) constraints with optimal positive or negative deviation not zero, $d_r^{+*} = 0$ or $d_r^{-*} = 0$ and $d_r^{+*} \neq d_r^{-*}$. Here, one can see that $\max\{d_r^{+*}, d_r^{-*}\}$ serves as an inconsistency degree of each preference constraint. More specifically, the value $d_r^{-*}$ can be considered as the contradiction degree and $d_r^{+*}$ can be regarded as the redundancy degree of the preference constraint when the original preference constraints in $E^{AC}$ are "equal or larger than", ⩾, type inequalities. To the contrary, if the original inequality is of ⩽ type, then $d_r^{-*}$ and $d_r^{+*}$ are regarded as the redundancy degree and contradiction degree, respectively. We call the preference constraints with nonzero


redundancy degree (resp. contradiction degree) the redundant (resp. contradictory) constraints. The infeasibility of $E^{AC}$ is caused by the contradictory constraints. There are two strategies to deal with these contradictory constraints. One strategy is to iteratively remove from $E^{AC}$ the contradictory constraint whose deviation in the corresponding goal constraint is the largest, until there are no contradictory constraints. Another strategy is to adjust all the contradictory constraints in $E^{AC}$ by minus $d_r^{-*}$ (resp. plus $d_r^{+*}$) on the right hand side if they are ⩾ (resp. ⩽) type inequalities.

The following example is borrowed from [44,45]. Suppose the decision maker provides the inconsistent constraint $I_{Sh}^{\mu}(A) \ge I_{Sh}^{\mu}(A) + 0.1$. Then, we transform the above constraint into a goal constraint, $I_{Sh}^{\mu}(A) - I_{Sh}^{\mu}(A) - d^+ + d^- = 0.1$, and get the following Multiple Goal Linear Programming (MGLP) problem:

$$\begin{aligned} \min\ & d^+ + d^- \\ \text{s.t. } & I_{Sh}^{\mu}(A) - I_{Sh}^{\mu}(A) - d^+ + d^- = 0.1, \\ & \text{the boundary and monotonicity of the capacity on } N, \end{aligned} \tag{19}$$

where $d^+, d^- \ge 0$. Solving the model, the optimal values of the deviation variables are $d^{+*} = 0$ and $d^{-*} = 0.1$. That is, the preference constraint is not feasible, and we need to change it into the following consistent case: $I_{Sh}^{\mu}(A) \ge I_{Sh}^{\mu}(A) + 0.1 - d^{-*}$, i.e., $I_{Sh}^{\mu}(A) \ge I_{Sh}^{\mu}(A)$. The more detailed procedure and strategies of inconsistency recognition and adjustment can be found in [44,45]. As for the redundant constraints, it is better to keep rather than remove them because these constraints are helpful for keeping the diversity of feasible capacities. Of course, the above adjustment strategies should be carried out in close cooperation with the decision makers and analysts.

5.5. Didactic example

We consider the following problem borrowed from [21]. We have a multicriteria decision problem with four criteria $(c_1, c_2, c_3, c_4)$; to be specific, the setting is evaluating cars in the city-car market segment, and the criteria are the price, acceleration, maximum speed and fuel consumption. While the raw criteria evaluations are expressed on different scales, we translate them to a common scale of utility values by using a suitable monotone transformation. Although more sophisticated methods are available, as in [21] for example, this is not the focus of the present work. The alternatives denoted by $(a_1, a_2, \ldots)$, for example Peugeot 208, Citroen C3, Fiat 500, etc., are evaluated against each criterion, and the evaluation matrix is available.

Further, the decision maker (DM) expressed the following preferences: $c_1$ is more important than $c_2$, and $c_3$ is more important than $c_4$. There is a positive interaction between $c_1$ and $c_2$ and between $c_2$ and $c_3$, and a negative interaction between $c_2$ and $c_4$. The DM supplies preferences between the alternatives $a_5 \succ a_1$, $a_7 \succ a_6$ and $a_2 \succ a_3$. Let the vectors $a_j \in [0, 1]^4$ denote the (scaled) evaluations of the alternative $a_j$ against the four decision criteria.

From the above preferences we construct the following set of constraints $E^{AC}$:

$$E^{AC}: \begin{cases} C_\mu(a_5) \ge C_\mu(a_1) + \varepsilon_1, \\ C_\mu(a_7) \ge C_\mu(a_6) + \varepsilon_1, \\ C_\mu(a_2) \ge C_\mu(a_3) + \varepsilon_1, \\ I_{Sh}^{\mu}(\{1\}) \ge I_{Sh}^{\mu}(\{2\}) + \varepsilon_2, \\ I_{Sh}^{\mu}(\{3\}) \ge I_{Sh}^{\mu}(\{4\}) + \varepsilon_2, \\ I_{Sh}^{\mu}(\{1, 2\}) \ge \varepsilon_3, \quad I_{Sh}^{\mu}(\{2, 3\}) \ge \varepsilon_3, \quad I_{Sh}^{\mu}(\{2, 4\}) \le -\varepsilon_3, \\ \mu(\emptyset) = 0, \quad \mu(N) = 1, \\ \mu(A) \le \mu(B), \ \forall A, B \subseteq N, \ A \subseteq B. \end{cases}$$

The values $\varepsilon_1, \varepsilon_2, \varepsilon_3$ are the chosen thresholds, which are largely problem-specific. Now we assume 2-interactivity and fix the constant K = 0.6, which implies that µ(A) = K when |A| = 3. We then set up the following linear programming problem instantiating (17):

$$\begin{aligned} \min\ & t \\ \text{s.t. } & t \ge \mu(A \cup \{i\}) - \mu(A), \ \text{for all } A \subset N \text{ and } i \notin A, \ |A| = 1, 2, \\ & \text{constraints from } E^{AC}. \end{aligned}$$
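The quantities needed to express such preferences, namely Choquet integrals of the alternatives and importance and interaction indices of the criteria, can be evaluated for any candidate capacity. The sketch below is a minimal Python illustration, assuming that the capacity is stored as a dictionary keyed by frozensets and that the indices are the Shapley value and the Shapley interaction index for pairs; the function names, the weights and the evaluation vector are hypothetical and are not taken from [21].

```python
from itertools import combinations
from math import factorial


def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def choquet(x, mu):
    """Choquet integral of x (dict: criterion -> value), in the form of Eq. (15)."""
    order = sorted(x, key=x.get)  # criteria sorted by non-decreasing value
    total = 0.0
    for pos, crit in enumerate(order):
        upper = frozenset(order[pos:])     # X_pi(i)
        rest = frozenset(order[pos + 1:])  # X_pi(i+1)
        total += x[crit] * (mu[upper] - mu[rest])
    return total


def shapley(i, mu, criteria):
    """Shapley value (importance index) of criterion i."""
    n = len(criteria)
    others = [c for c in criteria if c != i]
    value = 0.0
    for A in subsets(others):
        w = factorial(n - len(A) - 1) * factorial(len(A)) / factorial(n)
        value += w * (mu[A | {i}] - mu[A])
    return value


def interaction(i, j, mu, criteria):
    """Shapley interaction index of the pair {i, j}."""
    n = len(criteria)
    others = [c for c in criteria if c not in (i, j)]
    value = 0.0
    for A in subsets(others):
        w = factorial(n - len(A) - 2) * factorial(len(A)) / factorial(n - 1)
        value += w * (mu[A | {i, j}] - mu[A | {i}] - mu[A | {j}] + mu[A])
    return value


# Hypothetical additive capacity on three criteria (weights are made up).
weights = {1: 0.5, 2: 0.3, 3: 0.2}
criteria = list(weights)
mu = {A: sum(weights[c] for c in A) for A in subsets(criteria)}
x = {1: 0.2, 2: 0.8, 3: 0.5}  # made-up evaluation vector

if __name__ == "__main__":
    print(choquet(x, mu), shapley(1, mu, criteria), interaction(1, 2, mu, criteria))
```

For an additive capacity the Choquet integral reduces to the weighted arithmetic mean, the Shapley values coincide with the weights and all pairwise interaction indices vanish, which gives a simple sanity check before such functions are used to test constraints on importance and interaction.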

There are 4 + 6 + 1 = 11 non-negative decision variables with 48 + 3 + 5 = 56 linear constraints. The problem is solved by the standard simplex method, which confirms the consistency of the constraints and provides an optimal capacity such that all the DM constraints are satisfied, in particular the specified pairwise preferences on the alternatives. The respective solution is, as in [21], $a_7 \succ a_4 \succ a_5 \succ a_8 \succ a_2 \succ a_{10} \succ a_3 \succ a_1 \succ a_9 \succ a_6$.

5.6. Numerical performance

We now report on the numerical performance when solving the maximum entropy problems (14) and (17), in particular when k-interactivity is used as a simplifying assumption. We used R software with the packages CVXR for convex optimisation and Rfmtool together with lpSolve for linear programming. These packages are available through the CRAN repository [46]. In our experiments we varied the number of decision criteria as per Table 1 and also varied the index k, resulting in a changing number of variables and linear constraints. Our main goal here is to illustrate how the nonlinearity of (14) was dealt with by replacing Shannon's entropy with Renyi's entropy, as well as the benefits of a sparse linear solver.

Note that the use of k-interactivity has a significant impact on the number of variables and constraints (which are $2^n$ and $n2^{n-1}$ respectively when k-interactivity is not used), and thus allows for an efficient numerical solution. Also note that using Renyi's entropy and the LP formulation is a very efficient computational strategy in higher dimensions, which results in much lower CPU times. In fact, problem (14) was not solved in over 24 h in two cases, as indicated in the table.

6. Conclusions

In this paper we advocated the entropy maximisation principle to learn optimal capacities from decision makers' pairwise preferences in the space of alternatives and in the space of decision criteria. We formulated several suitable optimisation problems to identify an optimal capacity, as well as to deal with inconsistency of explicit or implicit preferences. We discussed the use of k-interactive capacities and a simplification strategy in order to reduce a very large number of monotonicity constraints and parameters, as well as the use of Renyi's $H_\infty$-entropy in order to


Table 1. The CPU time (s) needed to solve problems (14) and (17) as a function of the number of variables and constraints. Computations were performed on an Intel i7-6700 workstation at 4 GHz with 32 GB RAM under a Linux environment. A dash (–) marks the cases in which problem (14) was not solved in over 24 h.

| n  | k | Variables | Constraints | Problem (14) | Problem (17) |
| 5  | 3 | 25        | 60          | <1           | <1           |
| 5  | 4 | 30        | 75          | 2            | <1           |
| 10 | 3 | 175       | 570         | 3            | <1           |
| 10 | 5 | 637       | 802         | 4            | <1           |
| 15 | 3 | 575       | 2 030       | 4            | 1            |
| 15 | 5 | 4 943     | 25 053      | –            | 19           |
| 20 | 3 | 1 350     | 4 940       | 712          | 3            |
| 20 | 5 | 21 700    | 116 204     | –            | 234          |

convert the problem into a linear programming problem, for which efficient methods that accommodate very large numbers of sparse constraints are common. The outcomes of this research are efficient methods for robust capacity identification and learning from pairwise preferences, which will contribute towards building more efficient, robust and sophisticated decision support systems in the presence of multiple correlated criteria.

Acknowledgments

The work was supported by the National Natural Science Foundation of China (No. 71671096) and the K.C. Wong Magna Fund in Ningbo University, China, and also received the support of the "RUDN University Program 5-100".

References

[1] G. Choquet, Theory of capacities, Ann. Inst. Fourier 5 (1954) 131–295.
[2] M. Sugeno, Theory of fuzzy integrals and its applications (Ph.D. thesis), Tokyo Institute of Technology, 1974.
[3] E. Pap, Null-additive Set Functions, Kluwer Academic Pub, Dordrecht, 1995.
[4] M. Grabisch, I. Kojadinovic, P. Meyer, A review of methods for capacity identification in Choquet integral based multi-attribute utility theory: Applications of the Kappalab R package, European J. Oper. Res. 186 (2) (2008) 766–785.
[5] G. Beliakov, H. Bustince, T. Calvo, A Practical Guide to Averaging Functions, Springer, New York, 2016.
[6] J.-Z. Wu, L.-P. Yu, G. Li, J. Jin, B. Du, Using the monotone measure sum to enrich the measurement of the interaction of multiple decision criteria, J. Intell. Fuzzy Systems 30 (5) (2016) 2529–2539.
[7] J.-Z. Wu, L.-P. Yu, G. Li, J. Jin, B. Du, The sum interaction indices of some particular families of monotone measures, J. Intell. Fuzzy Systems 31 (3) (2016) 1447–1457.
[8] G. Beliakov, S. James, G. Li, Learning Choquet-integral-based metrics for semisupervised clustering, IEEE Trans. Fuzzy Syst. 19 (3) (2011) 562–574.
[9] J.-Z. Wu, S. Yang, Q. Zhang, S. Ding, 2-additive capacity identification methods from multicriteria correlation preference information, IEEE Trans. Fuzzy Syst. 23 (6) (2015) 2094–2106.
[10] Y. Zulueta-Veliz, L. García-Cabrera, A Choquet integral-based approach to multiattribute decision-making with correlated periods, Granular Comput. (2018) 1–12.
[11] R.R. Yager, Using fuzzy measures to construct multi-criteria decision functions, in: Soft Computing Based Optimization and Decision Models, Springer, 2018, pp. 231–239.
[12] G. Beliakov, Construction of aggregation functions from data using linear programming, Fuzzy Sets and Systems 160 (1) (2009) 65–75.
[13] J.-Z. Wu, Q. Zhang, Q. Du, Z. Dong, Compromise principle based methods of identifying capacities in the framework of multicriteria decision analysis, Fuzzy Sets and Systems 246 (2014) 91–106.
[14] J.-Z. Wu, E. Pap, A. Szakal, Two kinds of explicit preference information oriented capacity identification methods in the context of multicriteria decision analysis, Int. Trans. Oper. Res. 25 (2018) 807–830.
[15] J.-L. Marichal, M. Roubens, Determination of weights of interacting criteria from a reference set, European J. Oper. Res. 124 (3) (2000) 641–650.
[16] S. Greco, V. Mousseau, R. Slowinski, Ordinal regression revisited: Multiple criteria ranking using a set of additive value functions, European J. Oper. Res. 191 (2) (2008) 416–436.
[17] S. Angilella, S. Greco, B. Matarazzo, Non-additive robust ordinal regression: A multiple criteria decision model based on the Choquet integral, European J. Oper. Res. 201 (1) (2010) 277–288.
[18] S. Corrente, S. Greco, M. Kadziński, R. Słowiński, Robust ordinal regression in preference learning and ranking, Mach. Learn. 93 (2–3) (2013) 381–422.
[19] S. Angilella, M. Bottero, S. Corrente, V. Ferretti, S. Greco, I.M. Lami, Non additive robust ordinal regression for urban and territorial planning: an application for siting an urban waste landfill, Ann. Oper. Res. 245 (1–2) (2016) 427–456.
[20] S. Corrente, S. Greco, A. Ishizaka, Combining analytical hierarchy process and Choquet integral within non-additive robust ordinal regression, Omega 61 (2016) 2–18.
[21] S. Angilella, S. Corrente, S. Greco, Stochastic multiobjective acceptability analysis for the Choquet integral preference model and the scale construction problem, European J. Oper. Res. 240 (1) (2015) 172–182.
[22] J.-Z. Wu, G. Beliakov, Nonadditivity index and capacity identification method in the context of multicriteria decision making, Inform. Sci. 467 (2018) 398–406.
[23] M. Grabisch, The representation of importance and interaction of features by fuzzy measures, Pattern Recognit. Lett. 17 (6) (1996) 567–575.
[24] P. Wakker, Additive Representations of Preferences: A New Foundation of Decision Analysis, Springer, Berlin, New York, 1989.
[25] M. Grabisch, J.-L. Marichal, M. Roubens, Equivalent representations of set functions, Math. Oper. Res. 25 (2) (2000) 157–178.
[26] A. Chateauneuf, J.-Y. Jaffray, Some characterizations of lower probabilities and other monotone capacities through the use of Möbius inversion, Math. Soc. Sci. 17 (3) (1989) 263–283.
[27] M. Grabisch, K-order additive discrete fuzzy measures and their representation, Fuzzy Sets and Systems 92 (2) (1997) 167–189.
[28] K. Fujimoto, I. Kojadinovic, J.-L. Marichal, Axiomatic characterizations of probabilistic and cardinal-probabilistic interaction indices, Games Econom. Behav. 55 (1) (2006) 72–99.
[29] M. Grabisch, M. Roubens, Probabilistic interactions among players of a cooperative game, in: Beliefs, Interactions and Preferences in Decision Making, Springer, 1999, pp. 205–216.
[30] G. Beliakov, D. Divakov, On representation of fuzzy measures for learning Choquet and Sugeno integrals, Knowl.-Based Syst. (2019) http://dx.doi.org/10.1016/j.knosys.2019.105134, in press.
[31] G. Beliakov, J.-Z. Wu, Marginal contribution representation of capacity-based multicriteria decision making, Int. J. Intell. Syst. (2019) http://dx.doi.org/10.1002/int.22209, in press.
[32] J.-L. Marichal, K-intolerant capacities and Choquet integrals, European J. Oper. Res. 177 (3) (2007) 1453–1468.
[33] V. Mousseau, J. Figueira, L. Dias, C. Gomes da Silva, J. Climaco, Resolving inconsistencies among constraints on the parameters of an MCDA model, European J. Oper. Res. 147 (2003) 72–93.
[34] V. Mousseau, J. Figueira, L.s. Dias, C. Silva, J.a. Climaco, Resolving inconsistencies among constraints on the parameters of an MCDA model, European J. Oper. Res. 147 (2003) 72–93, http://dx.doi.org/10.1016/S0377-2217(02)00233-3.
[35] J.-L. Marichal, Entropy of discrete Choquet capacities, European J. Oper. Res. 137 (3) (2002) 612–624.
[36] J.-L. Marichal, Aggregation of interacting criteria by means of the discrete Choquet integral, in: T. Calvo, G. Mayor, R. Mesiar (Eds.), Aggregation Operators: New Trends and Applications, Springer, 2002, pp. 224–244.
[37] J.-L. Marichal, Tolerant or intolerant character of interacting criteria in aggregation by the Choquet integral, European J. Oper. Res. 155 (3) (2004) 771–791.
[38] G. Beliakov, J.-Z. Wu, Learning fuzzy measures from data: simplifications and optimisation strategies, Inform. Sci. 494 (2019) 100–113.
[39] J.N. Kapur, H.K. Kesavan, Entropy Optimization Principles with Applications, Academic Press, Boston, 1992.
[40] S.-C. Fang, J. Rajasekera, H.-S.J. Tsao, Entropy Optimization and Mathematical Programming, Kluwer, Boston, London, Dordrecht, 1997.
[41] S. Boyd, L. Vandenberghe, Convex Optimization, Cambridge University Press, Cambridge, 2004.
[42] M. Grant, S. Boyd, CVX: Matlab software for disciplined convex programming, 2017, http://cvxr.com/cvx/.
[43] R. Yager, Families of OWA operators, Fuzzy Sets and Systems 59 (1993) 125–148.
[44] J.-Z. Wu, G. Beliakov, Nonadditive robust ordinal regression with nonadditivity index and multiple goal linear programming, Int. J. Intell. Syst. 34 (7) (2019) 1732–1752.
[45] J.-Z. Wu, L. Huang, R.-J. Xi, Y.-P. Zhou, Multiple goal linear programming-based decision preference inconsistency recognition and adjustment strategies, Information 10 (7) (2019) 223.
[46] Comprehensive R Archive Network, 2019, https://CRAN.R-project.org/.

Please cite this article as: G. Beliakov, J.-Z. Wu and D. Divakov, Towards sophisticated decision models: Nonadditive robust ordinal regression for preference modeling, Knowledge-Based Systems (2019) 105351, https://doi.org/10.1016/j.knosys.2019.105351.