JID:TCS AID:10629 /FLA
Doctopic: Algorithms, automata, complexity and games
[m3G; v1.172; Prn:10/02/2016; 10:24] P.1 (1-16)
Theoretical Computer Science ••• (••••) •••–•••
Contents lists available at ScienceDirect
Theoretical Computer Science www.elsevier.com/locate/tcs
Parameterized ceteris paribus preferences over atomic conjunctions under conservative semantics

Sergei Obiedkov

Faculty of Computer Science, National Research University Higher School of Economics, Kochnovskii proezd 3, Moscow, Russia
Article history: Received 31 March 2015; Received in revised form 10 January 2016; Accepted 18 January 2016.

Keywords: Ceteris paribus preferences; Implications; Horn formulas; Formal concept analysis
Abstract. We consider a propositional language for describing parameterized ceteris paribus preferences over atomic conjunctions. Such preferences are only required to hold when the alternatives being compared agree on a specified subset of propositional variables. Regarding the expressivity of the language in question, we show that a parameterized preference statement is equivalent to a conjunction of an exponential number of classical, non-parameterized, ceteris paribus statements. Next, we present an inference system for parameterized statements and prove that the problem of checking the semantic consequence relation for such statements is coNP-complete. We propose an approach based on formal concept analysis to learning preferences from data by showing that ceteris paribus preferences valid in a particular model correspond to implications of a special formal context derived from this model. The computation of a complete preference set is then reducible to the computation of minimal hypergraph transversals. Finally, we adapt a polynomial-time algorithm for abduction using Horn clauses represented by their characteristic models to the problem of determining preferences over new alternatives from preferences over given alternatives (with ceteris paribus preferences as the underlying model). © 2016 Elsevier B.V. All rights reserved.
1. Introduction

Preferences have been studied in fields as diverse as philosophy, psychology, decision theory, and economics, to give a few examples [1,2]. They are of fundamental interest for many applications of artificial intelligence, and several preference representation languages have been proposed by AI researchers. On the other hand, a number of approaches to modeling various notions of preference have been developed within preference logics, notably modal preference logics. The key principle here is to extend an observed preference relation on individual outcomes to sets of outcomes in one of several reasonable ways (e.g., so that a set A is preferred to a set B whenever every outcome from A is preferred to every outcome from B or, alternatively, whenever, for every outcome from B, there is a better outcome in A) and, based on that, to derive preferences between propositions about these outcomes: a formula φ is preferred to a formula ψ if the set of outcomes satisfying φ is preferred to the set of outcomes satisfying ψ. Bienvenu et al. discuss the relation between preference logics and AI preference languages in [3].

In this paper, we focus on logical preference theories restricted to preferences between conjunctions of atomic propositions. We are interested in ceteris paribus preferences, i.e., preferences that hold between propositions other things being equal. In the classical preference logic of von Wright [4], ceteris paribus preferences between propositions are required to
E-mail address:
[email protected]. http://dx.doi.org/10.1016/j.tcs.2016.01.035 0304-3975/© 2016 Elsevier B.V. All rights reserved.
hold only when all other things (those not mentioned in the propositions being compared) are equal. Similarly, in CP-nets, one of the best-known AI formalisms for preference modeling [5], a preference of one attribute value over another is conditioned on the values of all other variables being identical. In [6], a more general logic is described in which it is possible to explicitly specify which other things must be equal for a preference to hold. Similar extensions have been developed for CP-nets [7], and a unifying framework was proposed by Bienvenu et al. [3] in the form of what they call a "prototypical" preference logic, expressive enough to encode CP-nets and many of their extensions. The language we use here can be regarded as a syntactic fragment of this prototypical logic, but our semantics follows that of the logic from [6] and is thus rather different from the approach taken in [3]. In particular, we interpret preferences on arbitrary preorders, whereas Bienvenu et al. require the preference relation on outcomes to be a total preorder. Another difference is that the possible-world semantics of modal logics (which is the one we use here) allows outcomes satisfying the same propositional variables to be treated as different entities, thus taking into account the possibility that the propositional language used for their description might not cover all aspects relevant to determining preferences. This is not allowed in [3], nor in many other AI preference languages, which makes them incapable of handling cases where two identically described outcomes are preferred to different sets of outcomes, a situation that may very well arise when building a preference model from real-life data.

In the next section, we define the syntax and semantics of our language.
We then discuss the relation of this language to some formalisms for preference modeling developed in the artificial intelligence community in Section 3 and the relation between parameterized statements of our language and classical, non-parameterized, ceteris paribus statements in Section 4. We proceed by developing an inference system for parameterized statements and proving its soundness and completeness. We then study the complexity of inference and show that the problem of checking semantic consequence in our language is coNP-complete. The last two sections deal with issues arising in the analysis of data about preferences. In Section 8, we show how a semantically complete set of preferences can be extracted from a dataset using formal concept analysis via translating preferences into implications. We conclude by discussing how preferences over new objects can be predicted from observed preferences over given objects.

2. Language and semantics

Given a set Φ of propositional symbols (variables), a non-strict preference statement over Φ is defined in [3] as an expression (in a slightly different syntax)

α ⪯_F β,

where α and β are propositional formulas over Φ and F is a set of such formulas. In this paper, we allow only preference statements where α and β are atomic conjunctions and F is a set of atomic formulas. This restriction makes it possible to work with α, β, and F as with sets of variables, and this is the approach we take.

Definition 1. Let Φ be a set of propositional variables. A preference statement over Φ is an expression of the form

A ⪯_C B,

where A, B, C ⊆ Φ.

When this does not result in ambiguity, we may refer to preference statements simply as preferences. We sometimes refer to A as the left-hand side, to B as the right-hand side, and to C as the ceteris paribus part of the preference statement A ⪯_C B. One can regard A, B, and C as sets of attributes describing certain outcomes or situations. The preference A ⪯_C B is then interpreted as follows: between two outcomes that agree on all attributes from C, one that has all attributes from B is always at least as good as one that has all attributes from A. Note that the two outcomes under comparison are assumed to agree on every variable from C, not just on the value of the conjunction of variables from C. Thus, C corresponds to a set of atomic formulas, whereas A and B each correspond to a single atomic conjunction.

Following [6], we use possible-world semantics for interpreting preference statements and assume that the preference relation over possible worlds, which we refer to as outcomes, is a preorder.

Definition 2. A preference model over a set Φ of propositional symbols is a triple M = (W, ≤, V), where W is a set of outcomes (or possible worlds), ≤ is a reflexive and transitive binary preference relation (a preorder) on W, and V is a function assigning to each propositional variable p ∈ Φ the subset of outcomes V(p) ⊆ W that satisfy it.

We will slightly abuse notation
Fig. 1. A small dataset; left: restaurants and their properties; right: preferences over the restaurants.
by writing V⁻¹(w) to denote the set of propositional variables p such that w ∈ V(p) and will refer to this set as the description of w. We also extend V to arbitrary subsets of variables so that

V(A) := ⋂_{p∈A} V(p)

denotes the set of outcomes that satisfy all variables from A ⊆ Φ. We use < to denote the strict order induced by ≤ in the usual way. If w ≤ v and v ≤ w, we say that there is indifference between w and v. V⁻¹(w) is meant to be a complete description of the outcome w in terms of Φ, that is, w satisfies precisely the propositional variables from V⁻¹(w). We say that two outcomes, w and v, agree on C ⊆ Φ if V⁻¹(w) and V⁻¹(v) contain precisely the same variables from C. This leads to the following interpretation of preference statements:

Definition 3. A preference statement A ⪯_C B over Φ is satisfied in a preference model M = (W, ≤, V) if w ≤ v for all w ∈ V(A) and v ∈ V(B) such that C ∩ V⁻¹(w) = C ∩ V⁻¹(v). In this case, we also say that the preference statement A ⪯_C B is valid in M (or holds in M, or that M is a model of A ⪯_C B) and denote this by M ⊨ A ⪯_C B. If Π is a set of preference statements each of which holds in M, we write M ⊨ Π.

Definition 4. A preference statement A ⪯_C B follows from (or is a semantic consequence of) a set of preferences Π (notation: Π ⊨ A ⪯_C B) if, whenever all preferences from Π are valid in some preference model M (i.e., M ⊨ Π), the preference statement A ⪯_C B is also valid in M.

We say that a preference D ⪯_F E is weaker (in a non-strict sense) than A ⪯_C B if it determines a preferential order only for (some) outcome pairs for which the order is determined by the preference A ⪯_C B or, more precisely, if

{A ⪯_C B} ⊨ D ⪯_F E.    (1)
For example, the preference statement {p} ⪯_{r,s} {q} is weaker than the statement {p} ⪯_∅ {q}. Observe that (1) holds whenever A ⊆ D, B ⊆ E, and C ⊆ F. Thus, adding arbitrary variables to any part of a valid preference statement results in a valid preference. More formally:

Proposition 1. If A ⪯_C B is satisfied in a preference model M = (W, ≤, V), then A ∪ D ⪯_{C∪F} B ∪ E is satisfied in M for all D, E, F ⊆ Φ.

Proof. Adding variables to either side of a preference statement can only make it weaker. Indeed, whenever we have w ∈ V(A ∪ D) and v ∈ V(B ∪ E) for some outcomes w, v ∈ W that agree on all variables from C ∪ F, we also have w ∈ V(A), v ∈ V(B), and w and v agree on all variables from C. Thus, if M satisfies A ⪯_C B, then w ≤ v, and, since w and v were chosen arbitrarily, M also satisfies A ∪ D ⪯_{C∪F} B ∪ E. □

Example 1. Consider the small dataset in Fig. 1. The table on the left-hand side shows properties of six restaurants: one of them serves Chinese cuisine, some others serve Georgian cuisine, a few are cheap, one is expensive, and two offer a lunch menu at a special price. The diagram on the right-hand side represents an agent's preferences over these restaurants: r6 is the best, r4 is worse than r2 and r5, which are incomparable, etc. This dataset can be interpreted as a preference model in the sense of Definition 2. The outcomes are the six restaurants, the propositional variables are their properties, and V assigns to each variable the set of restaurants with the corresponding property; for example, V(Georgian) = {r2, r5, r6}. In this example, the preference relation ≤ represented by the diagram in Fig. 1 is a partial order, since there are no outcomes with indifference between them (although there are incomparable outcomes).
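Definition 3 lends itself to a direct brute-force check. As an illustration (a minimal sketch; the restaurant descriptions below are hypothetical, only loosely inspired by Fig. 1), a model can be represented by outcome descriptions together with a preorder given as a set of pairs:

```python
from itertools import product

# A preference model (Definition 2): outcome descriptions V^{-1}(w)
# and a preorder <= given as a set of pairs (w, v) meaning w <= v.
desc = {
    "r1": {"Georgian", "cheap"},      # hypothetical descriptions,
    "r2": {"Georgian", "expensive"},  # not the actual table of Fig. 1
    "r3": {"Chinese", "cheap"},
}
order = {(w, w) for w in desc} | {("r3", "r1"), ("r3", "r2"), ("r2", "r1")}
# (already transitively closed: r3 <= r2 <= r1 and r3 <= r1)

def satisfies(desc, order, A, C, B):
    """Definition 3: M satisfies A <=_C B iff w <= v for all w satisfying A
    and v satisfying B that agree on every variable in C."""
    for w, v in product(desc, desc):
        if A <= desc[w] and B <= desc[v] and desc[w] & C == desc[v] & C:
            if (w, v) not in order:
                return False
    return True

# Every cheap Georgian restaurant is at least as good as anything:
print(satisfies(desc, order, set(), set(), {"Georgian", "cheap"}))  # True
# Proposition 1: adding variables to any part preserves validity.
print(satisfies(desc, order, {"Chinese"}, {"expensive"}, {"Georgian", "cheap"}))  # True
# A stronger statement may still fail:
print(satisfies(desc, order, set(), set(), {"Georgian"}))  # False
```

The last call fails because r1 is Georgian but r1 ≤ r2 does not hold in this toy order, mirroring the failure of ∅ ⪯_∅ {Georgian} in Example 1.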
Here the preference

∅ ⪯_∅ {Georgian, cheap}

holds, indicating that the agent prefers cheap restaurants serving Georgian cuisine to any other type of restaurant. We also have

∅ ⪯_{cheap, expensive} {Georgian},

suggesting that every Georgian restaurant is the best within its price range. However, the stronger preference

∅ ⪯_∅ {Georgian}

does not hold, because, for example, the Georgian restaurant r2 is incomparable with r3.

In this example, the dataset contains explicit information about preferences over individual outcomes, and we may use this information to infer preferences over attribute subsets. If a dataset is interpreted as a preference model, we may be interested in all preference statements valid in this model, similarly to how, say, in market basket analysis one is interested in all association rules valid in a given dataset [8]. More to the point, we would like to be able to compute a set of preference statements that has as its semantic consequences precisely the preference statements valid in the model. Such a set is called complete for the given model:

Definition 5. A set Π of preference statements is said to be (semantically) complete for a model M over Φ if, for all preference statements A ⪯_C B over Φ, we have M ⊨ A ⪯_C B if and only if Π ⊨ A ⪯_C B.

In Section 8, we discuss how a set of preference statements complete for a given model can be computed.

3. Relation to other formalisms

While restricting the constituent parts of a preference statement to atomic conjunctions is a considerable simplification compared with the logic from [3], and even more so compared with the logic from [6], our language is, in principle, not less general syntactically than the languages used to express many popular preference-based formalisms, such as CP-nets and some of their extensions. Indeed, CP-nets deal only with preferences that hold all other things being equal (cf. Section 4), while cp-theories from [7] may include preferences with parameterized ceteris paribus conditions; but, roughly speaking, the latter language is capable of specifying only (strict) preferences of the form X ∪ {x_a} ≺_C X ∪ {x_b}, where x_a and x_b are values of the same multi-valued variable, as contrasted with preferences of the form A ⪯_C B (with arbitrary A and B) studied in this paper.
Conditional importance networks from [9] do make it possible to express statements over attribute subsets, but these preferences are assumed to be monotonic (it is always better to have all attributes from A than only those from a proper subset of A), which is not the case here.

One difference between these works and our formalism is that we use only non-strict preferences ("being at least as good as"), whereas preferences in CP-nets and cp-theories are strict ("being better than"). We ignore this difference for now, but we plan to investigate what it takes to transfer our results to strict preferences. Another important difference is that cp-theories can handle multi-valued variables, whereas our language does not allow even negated Boolean variables. This issue can be addressed by introducing a distinct propositional variable for each value of a multi-valued variable. To be able to express, for example, that every outcome characterized by p is at least as good as every outcome characterized by the absence of p, it is necessary to explicitly introduce the negation of p as a new variable, say p̄, in Φ. The preference for outcomes with p over those without it can then be expressed by the statement {p̄} ⪯_∅ {p}. Of course, the problem with this approach is that it does not take into account that p and p̄ are not independent from each other. The mutual exclusivity of p and p̄ can, to a certain extent, be addressed by the preference statements

{p, p̄} ⪯_∅ ∅  and  ∅ ⪯_∅ {p, p̄},

which together prohibit outcomes characterized by both p and p̄ (except in models with indifference between every two outcomes); but even this can hardly be done for the constraint that each outcome must have p or p̄. In [3], a similar problem occurring when embedding a CP-net with multi-valued variables into propositional logic is fixed by introducing an integrity constraint in the form of a propositional formula; only outcomes satisfying the constraint are taken into consideration in all reasoning tasks. We will show that allowing arbitrary propositional constraints does not change the complexity of inference in our logic (see Section 7). This is, however, because of an important difference between how we define the semantics of our language and how it is defined for preference handling formalisms used in artificial intelligence.

Semantically, our logic is a fragment of the modal logic from [6], since we follow the tradition of modal logics in using Kripke semantics. This means that no bijection is assumed between possible outcomes and variable assignments. For example, we allow models containing outcomes w and v such
that V⁻¹(w) = V⁻¹(v), but, say, w < v. This makes it possible to model situations where not all information relevant to determining the reason for observed preferences over outcomes is known; it might be that there is an important (hidden) attribute outside Φ that v has but w does not, and that is the reason for preferring outcome v over outcome w. On the other hand, Kripke semantics makes no assumptions about dependencies between the variables in Φ; in particular, we do not assume that the variables are independent from each other. Therefore, not all variable assignments are required to be realized in every model: there may be assignments not corresponding to any outcome in a given model. This absence of a bijection between possible worlds (outcomes) and variable assignments is common in modal logics, but it is in sharp contrast with the semantics of CP-nets, the logic of [3], and other AI preference handling formalisms. In these formalisms, a model over n Boolean variables must contain exactly 2^n outcomes, unless some of them are ruled out by an integrity constraint, in which case the number of outcomes is smaller, but still fixed over all feasible models. The variable assignments corresponding to outcomes are also fixed across models, and no two outcomes correspond to the same assignment. What varies across models is only the preference relation over the outcomes. In our case (as is usual in modal logics), the set of outcomes is also allowed to vary from model to model. Since we do not make strong assumptions on which outcomes are possible, we work with a larger set of models, and, therefore, our reasoning system is necessarily more conservative than those developed within AI formalisms.

It may be argued, though, that this conservative approach is, in fact, more suitable for some applications in data analysis, where we are likely to observe only some of the variables relevant for determining preferences and have very little prior knowledge of dependencies between the observed variables.

Example 2. Consider again the dataset shown in Fig. 1. Restaurants r1 and r3 have exactly the same properties, and still r3 is preferred to r1. This may be due to the quality of food, or due to location, or due to some other properties not recorded in the dataset. A situation like this is perfectly possible in data analysis, since it is common for a dataset to represent only a limited view of the corresponding fragment of the real world; some properties relevant for determining preferences may be missing from the dataset because, for instance, they are not easily observable. Yet, from the viewpoint of CP-nets or similar formalisms, r1 and r3 correspond to the same outcome (unless we allow hidden variables in CP-nets), and hence neither can be preferred to the other. On the contrary, possible-world semantics does not force us to treat outcomes with identical descriptions as being identical.

Note also that not all combinations of the Boolean properties shown in Fig. 1 are possible: e.g., no restaurant can be cheap and expensive at the same time (although some restaurants can be neither). In this particular example, the dependencies among the variables can be explicitly specified in the form of an integrity constraint, but this is not always easy to do for complicated domains with many variables. What can be learnt about preferences if we have to be blind to the structure of the domain and cannot make assumptions about dependencies among the variables? The conservative semantics considered in this paper is an attempt to answer this question.
From the computational point of view, the prize we get for being conservative is that our logic is "only" coNP-complete, whereas most AI formalisms allowing for ceteris paribus preferences are at least PSPACE-hard. Also, our approach admits a natural deduction system (presented in Section 6), as opposed to proof theories based on swapping/flipping sequences developed for CP-nets and their extensions.

4. Non-parameterized preferences

In our definition of preference statements, we follow [6] in that we explicitly specify which conditions must be kept equal when comparing two outcomes in order to check whether B is preferred ceteris paribus to A. However, ceteris paribus preferences are traditionally understood as preferences that hold all other things being equal [4]. For a fixed Φ, this can be easily expressed using parameterized preference statements.

Definition 6. For A, B ⊆ Φ, the expression

A ⪯_CP B

is used as a shorthand for the preference statement

A ⪯_{Φ\(A∪B)} B.

Thus, ceteris paribus preferences in their traditional understanding are a special case of parameterized preferences. It may seem that the language of parameterized preferences is strictly more expressive than the language of non-parameterized preferences. Somewhat surprisingly, this is not the case: any parameterized preference statement can be represented by an exponentially large set of non-parameterized ones.

Theorem 1. A preference statement A ⪯_C B over Φ holds in a preference model M = (W, ≤, V) if and only if M ⊨ A ∪ D ⪯_CP B ∪ E for all D ⊆ Φ and E ⊆ Φ such that D ∩ C = B ∩ C, E ∩ C = A ∩ C, and D ∩ E ⊆ C.
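Since the equivalence in Theorem 1 quantifies over finitely many D and E, it can be verified exhaustively on small models. The following sketch (an illustration, not part of the paper; it fixes a two-variable universe and enumerates all two-outcome models) checks both directions:

```python
from itertools import product, combinations

PHI = {"a", "b"}  # a small universe keeps the enumeration cheap

def subsets(s):
    s = sorted(s)
    return [set(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def satisfies(desc, order, A, C, B):
    # Definition 3: A <=_C B holds in the model (desc, order)
    return all((w, v) in order
               for w, v in product(desc, desc)
               if A <= desc[w] and B <= desc[v] and desc[w] & C == desc[v] & C)

def satisfies_cp(desc, order, A, B):
    # Definition 6: A <=_CP B abbreviates A <=_{PHI \ (A u B)} B
    return satisfies(desc, order, A, PHI - (A | B), B)

def rhs(desc, order, A, C, B):
    # the right-hand side of Theorem 1
    return all(satisfies_cp(desc, order, A | D, B | E)
               for D in subsets(PHI) for E in subsets(PHI)
               if D & C == B & C and E & C == A & C and D & E <= C)

# all two-outcome models: choose descriptions and a preorder on {w, v}
for dw, dv in product(subsets(PHI), repeat=2):
    desc = {"w": dw, "v": dv}
    for extra in (set(), {("w", "v")}, {("v", "w")}, {("w", "v"), ("v", "w")}):
        order = {("w", "w"), ("v", "v")} | extra
        for A, C, B in product(subsets(PHI), repeat=3):
            assert satisfies(desc, order, A, C, B) == rhs(desc, order, A, C, B)
print("Theorem 1 holds on all two-outcome models over", sorted(PHI))
```

On two outcomes, every reflexive relation is automatically transitive, so the four relations enumerated above are exactly the preorders on {w, v}.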
Proof. Suppose that M ⊨ A ⪯_C B. To show that all preferences of the form A ∪ D ⪯_CP B ∪ E with D and E as described above hold in M, consider one such preference, denote X = A ∪ D and Y = B ∪ E, and take any two outcomes w and v from W such that w ∈ V(X), v ∈ V(Y), and V⁻¹(w) ∩ (Φ \ (X ∪ Y)) = V⁻¹(v) ∩ (Φ \ (X ∪ Y)). The outcomes w and v agree on all variables c ∈ C. Indeed, if c ∈ Φ \ (X ∪ Y), this follows from how w and v are selected. Otherwise, c ∈ X ∪ Y. We show that if c ∈ X and thus w ∈ V(c), then c ∈ Y and thus v ∈ V(c). For c ∈ X, two cases are possible: c ∈ A or c ∈ D. If c ∈ A, then c ∈ A ∩ C = E ∩ C and therefore c ∈ E ⊆ Y. If c ∈ D, then c ∈ D ∩ C = B ∩ C and c ∈ B ⊆ Y. Similarly, c ∈ Y implies c ∈ X. Thus, w ∈ V(A), v ∈ V(B), and V⁻¹(w) ∩ C = V⁻¹(v) ∩ C. Since M ⊨ A ⪯_C B, we have w ≤ v. As w and v were chosen arbitrarily, we have M ⊨ A ∪ D ⪯_CP B ∪ E.

For the other direction, assume that M is a model of all preference statements of the form A ∪ D ⪯_CP B ∪ E such that D ∩ C = B ∩ C, E ∩ C = A ∩ C, and D ∩ E ⊆ C, and show that A ⪯_C B holds in M. Consider arbitrary outcomes w and v from M such that w ∈ V(A), v ∈ V(B), and V⁻¹(w) ∩ C = V⁻¹(v) ∩ C. We must show that w ≤ v. Denote

D = (V⁻¹(w) \ (A ∪ C ∪ V⁻¹(v))) ∪ (B ∩ C);
E = (V⁻¹(v) \ (B ∪ C ∪ V⁻¹(w))) ∪ (A ∩ C).

Obviously, D ∩ C = B ∩ C and E ∩ C = A ∩ C. Since

D ⊆ (V⁻¹(w) \ V⁻¹(v)) ∪ (B ∩ C)  and  E ⊆ (V⁻¹(v) \ V⁻¹(w)) ∪ (A ∩ C),

all variables shared by D and E must come from B ∩ C (for D) and A ∩ C (for E). Therefore, D ∩ E ⊆ C and, by assumption, M ⊨ A ∪ D ⪯_CP B ∪ E. Since B ∩ C ⊆ V⁻¹(v) ∩ C = V⁻¹(w) ∩ C ⊆ V⁻¹(w), we have w ∈ V(D) and w ∈ V(A ∪ D). Similarly, v ∈ V(B ∪ E). We can use the preference A ∪ D ⪯_CP B ∪ E to show that w ≤ v if we prove that w and v agree on all variables p ∈ Φ \ (A ∪ B ∪ D ∪ E). Suppose that p ∈ V⁻¹(w) for some such p. From p ∉ D, we obtain p ∈ A ∪ C ∪ V⁻¹(v). Combining this with p ∉ A, we have p ∈ C ∪ V⁻¹(v), and the outcomes w and v agree on p, for they agree on all variables from C by assumption. The case when p ∈ V⁻¹(v) is similar. Obviously, if neither p ∈ V⁻¹(w) nor p ∈ V⁻¹(v) holds, w and v agree on p, too. □

According to Theorem 1, the preference A ⪯_C B can be represented by a set of non-parameterized preferences between certain supersets of A and B, denoted by A ∪ D and B ∪ E respectively. Each such preference is weaker than A ⪯_C B. Because A ⪯_C B is concerned only with outcome pairs that agree on all variables from C, we need to ensure the same when switching to non-parameterized preferences. For this reason, if A contains variables from C, we add these variables to the right-hand side of all non-parameterized preferences; similarly, if B contains variables from C, we add them to the left-hand side. The requirement that the values of the other variables from C be kept equal is enforced by including these variables in neither the left-hand side nor the right-hand side of the non-parameterized preferences. This explains the conditions D ∩ C = B ∩ C and E ∩ C = A ∩ C. If the value of a variable p is not governed by the ceteris paribus condition, it is sufficient to add p to only one of the left-hand side and the right-hand side of a non-parameterized preference; this explains D ∩ E ⊆ C in the statement of the theorem.

5. Checking semantic consequence

In this section, we give an algorithm that decides whether a preference A ⪯_C B follows from a set of preferences Π, i.e., whether every model that satisfies Π also satisfies A ⪯_C B. Unless otherwise stated, we assume that all preferences are defined over the same set Φ of propositional symbols.

Proposition 1 describes an inference rule that makes it possible to obtain valid preferences by adding variables to other valid preferences. As should be expected, not all preferences following from Π can be obtained using only this rule. However, it turns out that Π ⊨ A ⪯_C B if and only if this rule suffices to derive a certain set of preferences related to A ⪯_C B in a special way. Thus, to prove that Π ⊭ A ⪯_C B, it is enough to find a single preference from this set that is not derivable by adding variables to preferences from Π. We will now describe this set of preferences and present an algorithm for checking Π ⊨ A ⪯_C B by exploring this set.

The result of applying the inference rule from Proposition 1 to A ⪯_C B is either a weaker or an equivalent preference statement. The latter happens if a preference statement contains the same variable in all three of its parts; it can then be removed from any one of them, resulting in an equivalent statement.

Proposition 2. If A ∪ D ⪯_{C∪D} B ∪ D is satisfied in a preference model M, then each of the following three preferences is also satisfied in M:
1. A ∪ D ⪯_C B ∪ D;
2. A ∪ D ⪯_{C∪D} B;
3. A ⪯_{C∪D} B ∪ D.

Proof. Suppose that A ∪ D ⪯_{C∪D} B ∪ D is satisfied in M = (W, ≤, V). First, consider two outcomes w, v ∈ W such that w ∈ V(A ∪ D), v ∈ V(B ∪ D), and w and v agree on C. Obviously, w and v agree on D and, consequently, on C ∪ D. Due to M ⊨ A ∪ D ⪯_{C∪D} B ∪ D, we have w ≤ v. This shows that the first preference statement is a semantic consequence of the statement A ∪ D ⪯_{C∪D} B ∪ D. Similarly, if w ∈ V(A ∪ D), v ∈ V(B), and w and v agree on C ∪ D, then v ∈ V(B ∪ D), and w ≤ v. This proves the proposition with respect to the second preference statement; the proof for the third statement is fully analogous. □

Note that all four preference statements in Proposition 2 are equivalent to each other: Proposition 2 states that the three numbered preferences follow from A ∪ D ⪯_{C∪D} B ∪ D, while the latter follows from each of the former by Proposition 1. Thus, the language of parameterized preferences is redundant: there is more than one way to specify the same preference statement. We will refer to the longest among equivalent statements as canonical.

Definition 7. The canonical form of a preference statement A ⪯_C B, denoted by can(A ⪯_C B), is the statement

A ∪ (B ∩ C) ⪯_{C∪(A∩B)} B ∪ (A ∩ C).

If can(A ⪯_C B) = A ⪯_C B, we say that A ⪯_C B is in canonical form.

Proposition 3. A preference statement A ⪯_C B is in canonical form if and only if A ∩ C = B ∩ C = A ∩ B.

Proof. Suppose that can(A ⪯_C B) = A ⪯_C B. Then B ∩ C ⊆ A and A ∩ C ⊆ B (as well as A ∩ B ⊆ C). From this, we obtain, for example, B ∩ C ⊆ A ∩ C and A ∩ C ⊆ B ∩ C, i.e., A ∩ C = B ∩ C. The equality A ∩ C = A ∩ B can be proved similarly. The other direction of the proposition is trivial. □

Preference statements in canonical form will play a special role in the algorithm we are about to describe (as well as in the completeness proof for the inference system in Section 6). Proposition 1 allows us to obtain valid preferences by replacing A, B, and C in a valid preference A ⪯_C B with arbitrary supersets. The following definition captures the preferences that can be obtained from other preferences in this way:

Definition 8. Let Π be a set of preference statements. Then
Π• = { D ⪯_F E | ∃ A ⪯_C B ∈ Π (A ⊆ D, B ⊆ E, C ⊆ F) }.

Note that Π ⊨ Π•. However, as can be seen from Proposition 2, not all preferences that follow from Π are in Π•.

Example 3. Let Φ = {a, b, c, d} and Π = {{a, d} ⪯_{c,d} {b, d}}. In this case, {a, d} ⪯_{c} {b, d} ∉ Π•, although Π ⊨ {a, d} ⪯_{c} {b, d} due to Proposition 2. A different kind of inference not captured by Π• is observed when

Π = {a ⪯_d b,  a ⪯_c d,  d ⪯_c b}

(we omit curly brackets around single-element sets here). Then a ⪯_c b ∉ Π•, but Π ⊨ a ⪯_c b. Indeed, consider a model M satisfying Π and two arbitrary outcomes w and v from this model such that w ∈ V(a), v ∈ V(b), and w and v agree on c. Either w and v agree on d, in which case w ≤ v due to a ⪯_d b; or exactly one of them satisfies d, in which case w ≤ v due to a ⪯_c d or d ⪯_c b. Therefore, M ⊨ a ⪯_c b.

As an aside, it is interesting to note that

{a ⪯_c d,  d ⪯_c b} ⊭ a ⪯_c b,

although the transitivity of the preference relation may suggest the opposite. The reason for this is that, in our semantics, we make no assumptions on the dependencies between variables (see the discussion in Section 3). It may be that d implies c, i.e., an outcome with d but without c is not possible. In a model where this is the case, the statements a ⪯_c d and d ⪯_c b only force outcomes with {b, c} to be at least as good as outcomes with {c, d}, which must be at least as good as those with {a, c}. The transitivity of the preference relation then guarantees that outcomes with {b, c} are at least as good as those with {a, c}. However, transitivity does not help here if we need to compare outcomes without c, and therefore, a ⪯_c b does not have to hold.

Proposition 4 gives a better characterization of semantic consequence for preference statements.
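Both claims of Example 3 can be checked mechanically. In the sketch below (illustrative only; preference statements are encoded as triples (A, C, B)), membership in Π• reduces to subset tests, and a countermodel witnessing the aside is found by enumerating two-outcome models:

```python
from itertools import product, combinations

def in_closure(Pi, stmt):
    # Definition 8: (D, F, E) is in Pi* iff some (A, C, B) in Pi
    # satisfies A <= D, B <= E, C <= F (weakening, Proposition 1)
    D, F, E = stmt
    return any(A <= D and B <= E and C <= F for A, C, B in Pi)

# Example 3, first claim: {a,d} <=_{c} {b,d} is not in Pi*
Pi = [({"a", "d"}, {"c", "d"}, {"b", "d"})]
print(in_closure(Pi, ({"a", "d"}, {"c"}, {"b", "d"})))  # False

def satisfies(desc, order, stmt):
    # Definition 3
    A, C, B = stmt
    return all((w, v) in order for w, v in product(desc, desc)
               if A <= desc[w] and B <= desc[v] and desc[w] & C == desc[v] & C)

def subsets(s):
    s = sorted(s)
    return [set(c) for r in range(len(s) + 1) for c in combinations(s, r)]

# the aside: {a <=_c d, d <=_c b} does not entail a <=_c b
Pi2 = [({"a"}, {"c"}, {"d"}), ({"d"}, {"c"}, {"b"})]
goal = ({"a"}, {"c"}, {"b"})
countermodel = None
for dw, dv in product(subsets({"a", "b", "c", "d"}), repeat=2):
    desc = {"w": dw, "v": dv}
    order = {("w", "w"), ("v", "v"), ("w", "v")}  # w < v
    if all(satisfies(desc, order, s) for s in Pi2) and not satisfies(desc, order, goal):
        countermodel = desc
        break
print(countermodel)  # a model satisfying Pi2 but not the goal
```

The search succeeds, e.g., with descriptions {b} and {a} for the two outcomes: neither outcome satisfies d, so both premises hold vacuously, while the goal fails.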
Algorithm 1 Ceteris Paribus Consequence(A ⪯_C B, Σ).
Input: A preference statement A ⪯_C B and a set Σ of preference statements (over a universal set Φ).
Output: true, if Σ ⊨ A ⪯_C B; false, otherwise.

S := [can(A ⪯_C B)]   {stack}
repeat
    D ⪯_F E := pop(S)
    if D ⪯_F E ∉ Σ^• then
        X := Φ \ (D ∪ E ∪ F)
        if X = ∅ then
            return false
        choose p ∈ X
        push(D ∪ {p} ⪯_F E, S)
        push(D ⪯_F E ∪ {p}, S)
        push(D ⪯_{F ∪ {p}} E, S)
until empty(S)
return true
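Algorithm 1 can be transcribed into Python along the following lines (a sketch; the representation of statements as triples of frozensets, and the names `entails`, `in_closure`, and `canonical`, are mine). The Σ^•-membership test is the direct scan through Σ suggested by Definition 8:

```python
def entails(sigma, stmt, universe):
    """Algorithm 1: decide whether sigma semantically entails stmt = (A, C, B),
    representing A ⪯_C B over the variable set `universe`."""
    def in_closure(d, f, e):
        # (D, F, E) is in Sigma^• iff it weakens some member of sigma
        return any(a <= d and c <= f and b <= e for (a, c, b) in sigma)

    def canonical(a, c, b):
        # can(A ⪯_C B): add B∩C to the left, A∩B to the parameter, A∩C to the right
        return (a | (b & c), c | (a & b), b | (a & c))

    stack = [canonical(*stmt)]
    while stack:
        d, f, e = stack.pop()
        if not in_closure(d, f, e):
            rest = universe - (d | e | f)
            if not rest:
                return False
            p = frozenset([next(iter(rest))])   # choose any remaining variable
            stack.append((d | p, f, e))
            stack.append((d, f, e | p))
            stack.append((d, f | p, e))
    return True

fs = frozenset
universe = fs('abcd')
sigma = {(fs('a'), fs('d'), fs('b')),
         (fs('a'), fs('c'), fs('d')),
         (fs('d'), fs('c'), fs('b'))}
# Example 4: sigma entails a ⪯_c b ...
assert entails(sigma, (fs('a'), fs('c'), fs('b')), universe)
# ... but without a ⪯_d b it does not (the example preceding Proposition 4):
assert not entails({(fs('a'), fs('c'), fs('d')), (fs('d'), fs('c'), fs('b'))},
                   (fs('a'), fs('c'), fs('b')), universe)
```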
Proposition 4. Let Σ be a set of preference statements over Φ. For any preference A ⪯_C B in canonical form, we have Σ ⊨ A ⪯_C B if and only if Σ^• contains all canonical-form preferences D ⪯_F E such that A ⊆ D, B ⊆ E, C ⊆ F, and Φ = D ∪ E ∪ F.
Proof. Let D ⪯_F E ∉ Σ^• be a preference satisfying the conditions above. Consider a preference model M with only two outcomes, w₁ < w₂, such that V⁻¹(w₁) = E and V⁻¹(w₂) = D. Since D ⪯_F E is in canonical form, D ∩ F = E ∩ F by Proposition 3 and, consequently, f ∈ D = V⁻¹(w₂) if and only if f ∈ E = V⁻¹(w₁) for all f ∈ F. Therefore, the outcomes w₁ and w₂ agree on all variables in F. The values of all variables in Φ \ F = (D ∪ E) \ F are different for w₁ and w₂. This is again because D ⪯_F E is in canonical form and hence D ∩ E ⊆ F. Since A ⊆ V⁻¹(w₂), B ⊆ V⁻¹(w₁), C ⊆ F, and w₂ ≰ w₁, we conclude that M ⊭ A ⪯_C B. Consider an arbitrary P ⪯_R Q ∈ Σ. As D ⪯_F E ∉ Σ^•, either P ⊈ D or Q ⊈ E or R ⊈ F. In all these cases, M ⊨ P ⪯_R Q. Thus, M ⊨ Σ, but M ⊭ A ⪯_C B. It follows that Σ ⊭ A ⪯_C B.

For the other direction, suppose that Σ ⊭ A ⪯_C B. Then, there is a model M such that M ⊨ Σ, but M ⊭ A ⪯_C B. This model must contain two outcomes, w₁ and w₂, for which A ⪯_C B fails, i.e., B ⊆ V⁻¹(w₁), A ⊆ V⁻¹(w₂), V⁻¹(w₁) ∩ C = V⁻¹(w₂) ∩ C, but w₂ ≰ w₁. Denote D = V⁻¹(w₂), E = V⁻¹(w₁), and F = (Φ \ (D ∪ E)) ∪ (D ∩ E). Obviously, D ⪯_F E is a canonical-form preference satisfying the conditions listed in the proposition, but M ⊭ D ⪯_F E and, therefore, Σ^• cannot contain D ⪯_F E, which concludes the proof. □

Proposition 4 paves the way for Algorithm 1, which checks whether a preference A ⪯_C B is a consequence of the set Σ of preference statements. The algorithm starts by computing the canonical form of A ⪯_C B and putting the result, A₁ ⪯_{C₁} B₁, on a stack: A ⪯_C B follows from Σ if and only if A₁ ⪯_{C₁} B₁ does. It then tries to find a canonical-form preference D ⪯_F E ∉ Σ^• such that A₁ ⊆ D, B₁ ⊆ E, C₁ ⊆ F, and Φ = D ∪ E ∪ F. We know from Proposition 4 that Σ ⊭ A₁ ⪯_{C₁} B₁ and, consequently, Σ ⊭ A ⪯_C B, if and only if such a preference can be found.
The algorithm searches for it in a depth-first manner, by replacing the first preference D ⪯_F E ∉ Σ^• on the stack with three extensions adding an arbitrary variable from Φ \ (D ∪ E ∪ F) to one of D, E, and F. If D ⪯_F E ∉ Σ^• cannot be extended, because D ∪ E ∪ F = Φ, the algorithm terminates with a negative answer. If, at some point, the algorithm comes across a preference D ⪯_F E ∈ Σ^•, it simply removes it from the stack, because all its extensions must also be in Σ^•.

Suppose that there is a preference statement D ⪯_F E ∉ Σ^• of the sort required by Proposition 4. Since D ⪯_F E is in canonical form, every variable from Φ \ (A₁ ∪ B₁ ∪ C₁) belongs to exactly one of D, E, and F, or to all three. It is easy to see that, if such a variable p belongs to all of D, E, and F, then the (stronger) preference statement D \ {p} ⪯_F E \ {p} also satisfies the requirements of Proposition 4. Therefore, if Σ ⊭ A ⪯_C B, we may assume that there is a canonical-form statement D ⪯_F E ∉ Σ^• such that A₁ ⊆ D, B₁ ⊆ E, C₁ ⊆ F, and every variable from Φ \ (A₁ ∪ B₁ ∪ C₁) belongs to exactly one of D, E, and F. Algorithm 1 will find this statement (unless it returns a negative answer before that) through a sequence of iterations, each adding one variable from Φ \ (A₁ ∪ B₁ ∪ C₁) to the appropriate part of the preference statement. Thus, if the stack becomes empty, we know that all canonical-form preferences of the sort required by Proposition 4 are in Σ^• and conclude that Σ ⊨ A ⪯_C B. If we find a preference that cannot be extended with additional variables and is not in Σ^•, we conclude that Σ ⊭ A ⪯_C B.

Of course, there is no need to explicitly compute Σ^•: whether or not D ⪯_F E belongs to Σ^• can be checked by a scan through Σ using Definition 8. Nevertheless, Algorithm 1 is exponential in |Φ| in the worst case, but there is little hope to do better, as will be shown in Section 7.
However, the algorithm is linear in |Σ|, which makes it efficient in applications where the language for describing preferences (and, thus, the number of variables) is fixed and small compared to the number of preferences that need to be taken into account.

Example 4. Let us show how Algorithm 1 works on input a ⪯_c b and Σ = {a ⪯_d b, a ⪯_c d, d ⪯_c b}, assuming Φ = {a, b, c, d}. It starts by putting a ⪯_c b, which is in canonical form, on the stack and then removes it from the stack at the first iteration of the loop. Since a ⪯_c b ∉ Σ^•, we choose the only remaining variable d and put three preferences on the stack: ad ⪯_c b, a ⪯_c bd, and a ⪯_{cd} b (we omit curly brackets here). Removing a ⪯_{cd} b from the stack at the next iteration, we
see that a ⪯_{cd} b ∈ Σ^•, since a ⪯_d b ∈ Σ. We then proceed to the next iteration and remove a ⪯_c bd, which is also in Σ^•. We do the same for the remaining preference statement and, observing the empty stack, terminate with the positive answer.

6. An inference system

In this section, we develop a sound and complete inference system for parameterized preferences. Propositions 1 and 2 are the first steps in this direction. Here is the next step.

Proposition 5. If the following three preferences
1. A ⪯_{C ∪ {d}} B;
2. A ⪯_C B ∪ {d};
3. A ∪ {d} ⪯_C B
are satisfied in a preference model M, then A ⪯_C B is also satisfied in M.

Proof. Let w and v be two outcomes from a model M satisfying the three preference statements above with A ⊆ V⁻¹(w), B ⊆ V⁻¹(v), and C ∩ V⁻¹(w) = C ∩ V⁻¹(v). We must show that w ≤ v. If w ∈ V(d), then w ≤ v because of A ∪ {d} ⪯_C B. Similarly, w ≤ v due to A ⪯_C B ∪ {d} if v ∈ V(d). If neither w ∈ V(d) nor v ∈ V(d), then
(C ∪ {d}) ∩ V⁻¹(w) = C ∩ V⁻¹(w) = C ∩ V⁻¹(v) = (C ∪ {d}) ∩ V⁻¹(v),

and w ≤ v due to A ⪯_{C ∪ {d}} B. □
Propositions 1, 2, and 5 suggest the following inference system for parameterized preferences (for all A, B, C, D, E, F ⊆ Φ and d ∈ Φ):

(R1) Given A ⪯_C B, derive A ∪ D ⪯_{C ∪ F} B ∪ E.
(R2) Given A ∪ D ⪯_{C ∪ D} B ∪ D, derive A ∪ D ⪯_C B ∪ D.
(R3) Given A ∪ D ⪯_{C ∪ D} B ∪ D, derive A ∪ D ⪯_{C ∪ D} B.
(R4) Given A ∪ D ⪯_{C ∪ D} B ∪ D, derive A ⪯_{C ∪ D} B ∪ D.
(R5) Given A ⪯_{C ∪ {d}} B, A ⪯_C B ∪ {d}, and A ∪ {d} ⪯_C B, derive A ⪯_C B.

Definition 9. We say that a preference statement A ⪯_C B is derivable (or provable) from a set Σ of preferences if A ⪯_C B occurs as the last item in some finite sequence of preference statements, each of which belongs to Σ or is obtained from one or more earlier statements in the sequence by applying one of the rules (R1)–(R5). In this case, we write Σ ⊢ A ⪯_C B.

For example, we can derive any preference statement from its canonical form using rules (R2)–(R4).

Proposition 6. For any preference statement A ⪯_C B, we have
{can(A ⪯_C B)} ⊢ A ⪯_C B.

Proof. Since A ∩ B is a subset of both the left-hand side and the right-hand side of can(A ⪯_C B), which is (by Definition 7)

A ∪ (B ∩ C) ⪯_{C ∪ (A ∩ B)} B ∪ (A ∩ C),

we obtain from it

A ∪ (B ∩ C) ⪯_C B ∪ (A ∩ C)

using rule (R2). From this, we obtain

A ∪ (B ∩ C) ⪯_C B
using rule (R3) and then

A ⪯_C B

using rule (R4). □
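The derivation in the proof of Proposition 6 can be replayed mechanically. The following Python sketch (names and representation are mine) implements rules (R2)–(R4) as set operations, asserting at each step that the removed set D occurs in all the required parts of the statement; the example uses a statement with A ∩ B = ∅:

```python
fs = frozenset

def canonical(a, c, b):
    # can(A ⪯_C B) per Definition 7
    return (a | (b & c), c | (a & b), b | (a & c))

def r2(stmt, d):
    a, c, b = stmt                      # (R2): drop D from the parameter part
    assert d <= a and d <= c and d <= b
    return (a, c - d, b)

def r3(stmt, d):
    a, c, b = stmt                      # (R3): drop D from the right-hand side
    assert d <= a and d <= c and d <= b
    return (a, c, b - d)

def r4(stmt, d):
    a, c, b = stmt                      # (R4): drop D from the left-hand side
    assert d <= a and d <= c and d <= b
    return (a - d, c, b)

a, c, b = fs('ax'), fs('xy'), fs('by')  # here A ∩ C = {x}, B ∩ C = {y}, A ∩ B = ∅
s = canonical(a, c, b)                  # A ∪ (B ∩ C) ⪯_{C ∪ (A ∩ B)} B ∪ (A ∩ C)
s = r2(s, a & b)                        # remove A ∩ B from the parameter part
s = r3(s, a & c)                        # remove A ∩ C from the right-hand side
s = r4(s, b & c)                        # remove B ∩ C from the left-hand side
assert s == (a, c, b)                   # we are back at A ⪯_C B
```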
Theorem 2. The system of the five inference rules (R1)–(R5) is sound and complete with respect to parameterized preferences, i.e., for any preference set Σ and any preference statement A ⪯_C B over a set Φ of propositional symbols, Σ ⊢ A ⪯_C B if and only if Σ ⊨ A ⪯_C B.

Proof. Soundness is a direct consequence of Propositions 1, 2, and 5. To prove completeness, we first prove that rules (R1)–(R4) form a complete inference system with respect to a smaller class of preferences, namely, with respect to the class of ceteris paribus preferences corresponding to the original "all other things being equal" interpretation. Such preferences can be expressed in our language as A ⪯_C B, where A ∪ B ∪ C = Φ. This differs slightly from preference statements of the form A ⪯^CP B introduced in Definition 6 in that here we do not require C to be disjoint from A and B.

Lemma 1. If A ∪ B ∪ C = Φ, then A ⪯_C B is derivable from a preference set Σ (using only the first four inference rules) if and only if Σ ⊨ A ⪯_C B.
Proof. Again, one direction of the lemma, namely, from Σ ⊢ A ⪯_C B to Σ ⊨ A ⪯_C B, is a consequence of Propositions 1 and 2. To prove the other direction, first assume that A ⪯_C B is in canonical form. Consider a model M = (W, ≤, V), where W = {v, w}, v < w, V⁻¹(v) = B, and V⁻¹(w) = A. The preference statement A ⪯_C B does not hold in M, since its two outcomes agree on all variables from C (by Proposition 3); one, w, satisfies all variables from A; the other, v, satisfies all variables from B; but w ≰ v. If Σ ⊨ A ⪯_C B, there must be a preference statement D ⪯_F E ∈ Σ that also fails to hold in M. The only way it can fail is that D ⊆ V⁻¹(w) = A, E ⊆ V⁻¹(v) = B, and F ∩ V⁻¹(w) = F ∩ V⁻¹(v). In this case, F ⊆ C ∪ (A ∩ B) and, since A ⪯_C B is in canonical form, it follows from Proposition 3 that F ⊆ C. Therefore, A ⪯_C B can be derived from Σ by rule (R1) using D ⪯_F E ∈ Σ. If A ⪯_C B is not in canonical form, we can derive can(A ⪯_C B) as described above and then derive A ⪯_C B from can(A ⪯_C B) due to Proposition 6. □

Proceeding with the proof of Theorem 2, let Σ be a set of preferences and A ⪯_C B be a preference over Φ such that Σ ⊨ A ⪯_C B. We show that, in this case, Σ ⊢ A ⪯_C B. We let Δ = Φ \ (A ∪ B ∪ C) and prove the claim by induction on the size of Δ. If |Δ| = 0, then A ∪ B ∪ C = Φ and Σ ⊢ A ⪯_C B by Lemma 1. Assume that we have proved the claim for preference statements A ⪯_C B with |Φ \ (A ∪ B ∪ C)| = k. We now prove it for preferences A ⪯_C B with |Δ| = |Φ \ (A ∪ B ∪ C)| = k + 1. Take any d ∈ Δ and consider the following three preferences:

A ⪯_{C ∪ {d}} B, A ⪯_C B ∪ {d}, and A ∪ {d} ⪯_C B.

Since A ⪯_C B is a semantic consequence of Σ, it follows from Proposition 1 that so is each of these weaker preferences. By assumption, |Φ \ (A ∪ B ∪ C ∪ {d})| = k; therefore, by the induction hypothesis, these three preferences are derivable from Σ. We can derive A ⪯_C B from them using rule (R5). □

7. Complexity of inference

Having described an inference system for our language, we now look at inference from another angle and analyze the complexity of deciding if a preference follows from a set of preferences:

Problem 1. Given a set Σ of preference statements and a preference A ⪯_C B over Φ, decide whether Σ ⊨ A ⪯_C B.

This problem is solved by Algorithm 1, but in exponential time in the worst case. We now show that, unless P = NP, this problem does not admit a polynomial-time solution, and this is despite the fact that, as we show in Section 8, we can represent preferences as implications (also known as Horn formulas), for which the corresponding problem is solvable in linear time.

One of the fundamental reasoning tasks for working with preferences is consistency checking, i.e., determining if a set of preferences has a model. For our language, consistency is trivial: since we allow only non-strict preference statements, every set of preferences is satisfied by any model where the preference relation consists of all pairs of outcomes, i.e., in
models where there is indifference between every two outcomes. A more meaningful question is whether this is the only kind of model for Σ, i.e., whether Σ is incompatible with any strict preferences.

Definition 10. We say that a set Σ of preferences over Φ induces total indifference if Σ is satisfied only by models of the form (W, W × W, V).

We can now formulate the above question as follows:

Problem 2. Given a set Σ of preference statements over Φ, decide if it induces total indifference.

Obviously, Σ induces total indifference if and only if Σ ⊨ ∅ ⪯_∅ ∅. Thus, Problem 2 is a special case of Problem 1, and, by proving that Problem 2 is coNP-hard, we provide a lower bound on the complexity of Problem 1.

Lemma 2. Problem 2 is coNP-hard.

Proof. We reduce the following coNP-complete problem to Problem 2:

Problem 3. Given a propositional formula φ in conjunctive normal form, decide if it is unsatisfiable, i.e., if it is false under all assignments to its variables.

We let Φ be the set of all variables of φ and build a set Σ_φ of preferences over Φ by including in Σ_φ the following preference for each p ∈ Φ:

∅ ⪯_{p} ∅.

In addition, we include one preference for each clause of φ:

P ⪯_∅ N,

where P is the set of all variables that occur positively and N is the set of all variables that occur negatively in this clause. We claim that φ is unsatisfiable if and only if the set Σ_φ induces total indifference.

To prove this, suppose that Σ_φ does not induce total indifference and show that φ is satisfiable. Σ_φ must then be satisfied by some preference model M = (W, ≤, V), where w ≰ v for some w, v ∈ W. Since M ⊨ ∅ ⪯_{p} ∅ for all p ∈ Φ, the sets V⁻¹(w) and V⁻¹(v) form a partition of Φ, i.e., V⁻¹(w) ∪ V⁻¹(v) = Φ and V⁻¹(w) ∩ V⁻¹(v) = ∅. By assigning the variables in V⁻¹(v) true and those in V⁻¹(w) false, we obtain a satisfying assignment for φ. Indeed, consider any clause of φ and denote by P the set of all variables that occur positively and by N the set of all variables that occur negatively in this clause. We know that M ⊨ P ⪯_∅ N, because P ⪯_∅ N ∈ Σ_φ. As w ≰ v, we have P ⊈ V⁻¹(w) or N ⊈ V⁻¹(v). Therefore, either some variable from P is assigned true or some variable from N is assigned false. In either case, the clause is true.

For the other direction, assume that φ is satisfiable. We claim that, in this case, Σ_φ does not induce total indifference, i.e., there is a preference model M = (W, ≤, V) such that M ⊨ Σ_φ and w ≰ v for some w, v ∈ W. Consider a satisfying assignment for φ and denote by T the set of variables assigned true and by F the set of variables assigned false. We build a preference model M consisting of two outcomes v < w with V⁻¹(v) = T and V⁻¹(w) = F. Since V⁻¹(v) and V⁻¹(w) form a partition of Φ, the preferences ∅ ⪯_{p} ∅ are valid in M for all p ∈ Φ. Each preference P ⪯_∅ N ∈ Σ_φ corresponds to a clause of φ, where P is the set of variables that occur positively and N is the set of variables that occur negatively in this clause. Every clause is true under our satisfying assignment; therefore, there is either p ∈ P assigned true or p ∈ N assigned false.
In the first case, p ∉ F and P ⊈ V⁻¹(w); in the second case, p ∉ T and N ⊈ V⁻¹(v). Therefore, the statement P ⪯_∅ N does not force w ≤ v. Since (w, v) is the only pair of outcomes that is not part of the preference relation of M, we have M ⊨ P ⪯_∅ N. This concludes the proof. □

Theorem 3. Checking the semantic consequence relation for parameterized preferences (Problem 1) is coNP-complete.

Proof. The problem is in coNP: to prove that Σ ⊭ A ⪯_C B, one may present a preference model M such that M ⊨ Σ, but M ⊭ A ⪯_C B. If such a model exists, its submodel with only two outcomes for which the preference A ⪯_C B fails will also do as a certificate for Σ ⊭ A ⪯_C B. Thus, the size of the certificate is polynomial in |Φ|, and checking M ⊨ Σ and M ⊭ A ⪯_C B can be done in polynomial time. The coNP-hardness of Problem 1 follows from Lemma 2, since, as mentioned above, Problem 2 is a special case of Problem 1. □

It is worth noting that the full modal preference logic of van Benthem et al. [6] is NEXPTIME-hard [3], while most formalisms used for preference modeling in artificial intelligence are at least PSPACE-complete; these include CP-nets [5],
cp-theories [7], and the "prototypical" logic from [3]. The reason for their hardness is that they consider only models with an exponential number of outcomes (see the discussion in Section 3); therefore, they cannot use the trick of removing all but the two violating outcomes from an arbitrary violating model that worked for us in the proof of Theorem 3. A relatively rare case of a language where consistency checking is polynomial is one that allows only so-called free preferences—those with an empty ceteris paribus part, i.e., non-strict preferences of the form A ⪯_∅ B and their strict counterparts [3]. When an arbitrary propositional formula is added as an integrity constraint, consistency checking becomes coNP-complete for free preferences.

It is easy to see that adding integrity constraints will not change the complexity of Problem 1, provided that we keep our semantics. If the role of the integrity constraint is to allow only models in which every outcome satisfies the constraint (but not to require that models contain all such outcomes), then the complexity result of Theorem 3 remains intact. Indeed, being a generalization of Problem 1, its constrained version remains coNP-hard. On the other hand, it can be checked in polynomial time whether the two outcomes of the model from the proof of Theorem 3 satisfy the given constraint; thus, the problem remains in coNP.

8. Preferences as implications

In [10], we proposed semantics for preferences of two types based on formal concept analysis [11]. In terms of Section 2, the two preference types from [10] can be described as follows: an attribute set B is existentially preferred to an attribute set A in a model M if, for every outcome satisfying all the variables from A, the model M contains a (not necessarily strictly) better outcome satisfying B; a set B is universally preferred to A if every outcome with B is preferred to every outcome with A.
Universal preferences are a special case of the preference statements of this paper: B is universally preferred to A if and only if the preference statement A ⪯_∅ B holds. In [12,13], we used formal concept analysis to define the semantics for parameterized ceteris paribus preferences (some of the results presented here are from these papers). This semantics is equivalent to the one defined in Section 2.

The basic data structure in formal concept analysis is a formal context K = (G, M, I), where G is a set of objects, M is a set of attributes, and I ⊆ G × M is a binary relation between objects from G and their attributes from M. The derivation operators (·)^I are defined for A ⊆ G and B ⊆ M as follows:
A^I = {m ∈ M | ∀g ∈ A (gIm)}
B^I = {g ∈ G | ∀m ∈ B (gIm)}

A^I is the set of attributes shared by the objects of A, and B^I is the set of objects having all attributes of B. Often, (·)′ is used instead of (·)^I. The double application of (·)^I results in two closure operators (one on objects and the other on attributes): (·)^{II} is extensive, idempotent, and increasing. Sets A^{II} and B^{II} are said to be closed.

An important notion in formal concept analysis is that of an implication, which is, formally, an expression A → B, where A, B ⊆ M are attribute subsets. It holds or is valid in the context K (notation: K ⊨ A → B) if A^I ⊆ B^I, i.e., every object of the context with all attributes from A also has all attributes from B. If A^I = ∅, then (G, M, I) ⊨ A → M. Special notation is sometimes used for such zero-support implications: A → ⊥. Note that A → ⊥ is a headless Horn clause, whereas an implication A → B is a conjunction of definite Horn clauses with the same body. The implications valid in a context are summarized by the Duquenne–Guigues basis [14], which has the minimal number of implications among equivalent implication sets. Nevertheless, it may be exponential in the size of the context, and determining its size is a #P-hard problem [15]. Other valid implications can be obtained from this basis using the Armstrong rules [16], which constitute a sound and complete inference system for implications.

The preference context P = (G, M, I, ≤) is defined in [10] as a context (G, M, I) supplied with a reflexive and transitive preference relation ≤ on G. It is essentially another way to express a preference model over M in the sense of Definition 2: the corresponding model is M_P = (G, ≤, V), where V(m) = {m}^I for m ∈ M. Note that, from the data analysis perspective, M_P records preferences observed over objects from G. It may be interesting to find a set of preference statements over M explaining these observed preferences, which could then be used to predict preferences over new objects.

So, given a preference model, how can we enumerate a semantically complete set of preference statements valid in this model? It turns out that preferences—existential, universal, or parameterized—between attribute subsets that hold in a preference model can be represented as implications of a certain kind in a special formal context built from this model.
We will show how this is done for parameterized preference statements.

Definition 11. The ceteris paribus translation of the preference model M = (W, ≤, V) over variables Φ is a formal context

K_M = (W × W, (Φ × {1, 2, 3}) ∪ {p_≤}, I), where
(w, v) I (p, 1) ⟺ w ∈ V(p),
(w, v) I (p, 2) ⟺ v ∈ V(p),
(w, v) I (p, 3) ⟺ |V(p) ∩ {w, v}| ≠ 1,
(w, v) I p_≤ ⟺ w ≤ v.
T(A ⪯_C B), the translation of a preference statement A ⪯_C B, is the implication

(A × {1}) ∪ (B × {2}) ∪ (C × {3}) → {p_≤}

of the formal context K_M.

The context resulting from the translation has as its objects pairs of outcomes of the preference model and contains three copies of each propositional variable as attributes; (w, v) is associated with the first copy of p if w satisfies p and with the second copy if v satisfies p. We associate (w, v) with the third copy of p if either both w and v satisfy p or neither of them does.

Theorem 4. A ⪯_C B is valid in a preference model M = (W, ≤, V) over Φ if and only if its translation is valid in K_M:
M ⊨ A ⪯_C B ⟺ K_M ⊨ T(A ⪯_C B).
Proof. Suppose that M ⊨ A ⪯_C B and (A × {1}) ∪ (B × {2}) ∪ (C × {3}) ⊆ {(w, v)}^I for some w ∈ W and v ∈ W. Then, A ⊆ V⁻¹(w), B ⊆ V⁻¹(v), and w ∈ V(c) if and only if v ∈ V(c) for all c ∈ C. The latter means that C ∩ V⁻¹(w) = C ∩ V⁻¹(v). Since A ⪯_C B holds in M, we have w ≤ v and (w, v) I p_≤, as required.

Conversely, assume K_M ⊨ (A × {1}) ∪ (B × {2}) ∪ (C × {3}) → {p_≤}. We need to show that w ≤ v whenever A ⊆ V⁻¹(w), B ⊆ V⁻¹(v), and V⁻¹(w) ∩ C = V⁻¹(v) ∩ C. Indeed, in this case, we have (A × {1}) ∪ (B × {2}) ∪ (C × {3}) ⊆ {(w, v)}^I and, consequently, (w, v) I p_≤, i.e., w ≤ v. □

At the end of Section 2, we argued that computing a set of preference statements complete for a model M is useful when the model represents a dataset. From Theorem 4, it immediately follows that, to find such a set, it is sufficient to find all valid implications of K_M of the form X → p_≤ with inclusion-minimal premises. It is easy to see that this is the same as enumerating the minimal transversals of the hypergraph whose edges are the complements of all {(w, v)}^I in K_M such that w ≰ v (see, e.g., [17,18] for a discussion of hypergraph transversals and related problems). Note, though, that the preference set obtained in this way may be redundant.

9. Preference prediction

We now study the following problem: given a preference model M and two additional outcomes w and v with descriptions A and B, predict which of the two is better. This is essentially a problem in preference learning: the model M records observed preferences over some possible outcomes, and we want to use it for training a classifier that would correctly predict preferences over new outcomes. The purpose of this section is to propose a possible solution to this problem based on ceteris paribus preferences.

One possible approach is to find a preference valid in M that forces a particular order for A and B and, consequently, for w and v.
If M ⊨ D ⪯_F E with D ⊆ A, E ⊆ B, and F containing variables neither from A \ B nor from B \ A (i.e., F ∩ A = F ∩ B), then we predict w ≤ v, for otherwise the preference D ⪯_F E would not hold in the preference model M extended with w and v. Similarly, if a preference E ⪯_F D with D, E, and F as above is valid in M, we conclude v ≤ w. It is, of course, possible to obtain both w ≤ v and v ≤ w, in which case we have to postulate indifference between v and w.

However, there is a problem with this approach: whenever we need to compare two outcomes with descriptions A and B such that at least one of A and B is not satisfied by any outcome of the preference model M, we will always have both M ⊨ A ⪯_∅ B and M ⊨ B ⪯_∅ A. This motivates the following definition:

Definition 12. A preference A ⪯_C B is supported by M = (W, ≤, V) if M ⊨ A ⪯_C B and

∃w ∈ V(A) ∃v ∈ V(B) (w ≠ v and V⁻¹(w) ∩ C = V⁻¹(v) ∩ C).

It seems more appropriate to use only preferences supported by the preference model to predict preferences over outcomes outside this model: a propositional preference statement should be used for predicting a preference between two outcomes only if it explains an observed preference between some outcomes of the preference model. To sum up, we will use the following rule for predicting preferences over new outcomes:

Definition 13. Let M = (W, ≤, V) be a preference model and w, v ∉ W be two outcomes with descriptions A ⊆ Φ and B ⊆ Φ, respectively. We say that v is hypothetically preferred to w with respect to M if M supports some D ⪯_F E such that D ⊆ A, E ⊆ B, and F ∩ A = F ∩ B.
Algorithm 2 Predict Preference(A, B, M).
Input: Outcome descriptions A, B ⊆ Φ and a preference model M = (W, ≤, V).
Output: true, if M supports D ⪯_F E for some D ⊆ A, E ⊆ B, and F ⊆ Φ such that F ∩ A = F ∩ B; false, otherwise.

for all w ∈ W do
    D := A ∩ V⁻¹(w)
    for all v ∈ W \ {w} such that w ≤ v do
        E := B ∩ V⁻¹(v)
        F := (Φ \ (A △ B)) ∩ (Φ \ (V⁻¹(w) △ V⁻¹(v)))
        if M ⊨ D ⪯_F E then
            return true
return false
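Algorithm 2 might be transcribed into Python along the following lines (a sketch under my own representation: the model is a dict mapping outcomes to their variable sets plus a set of ≤-pairs, and the naive validity check `val` plays the role of Algorithm 3; all names are mine):

```python
def predict_preference(A, B, W, leq, universe):
    """Algorithm 2: does the model support some D ⪯_F E with D ⊆ A, E ⊆ B,
    and F ∩ A = F ∩ B, predicting that B's outcome is preferred to A's?
    W: dict outcome -> frozenset of satisfied variables; leq: set of pairs (w, v)."""
    for w in W:
        D = A & W[w]
        for v in W:
            if v != w and (w, v) in leq:
                E = B & W[v]
                # F: variables on which neither A, B nor w, v disagree
                F = (universe - (A ^ B)) & (universe - (W[w] ^ W[v]))
                if val(D, F, E, W, leq):
                    return True
    return False

def val(D, F, E, W, leq):
    """Check M ⊨ D ⪯_F E by scanning all pairs of outcomes."""
    for w in W:
        for v in W:
            if D <= W[w] and E <= W[v] and W[w] & F == W[v] & F \
                    and (w, v) not in leq:
                return False
    return True

fs = frozenset
universe = fs('abc')
W = {1: fs('a'), 2: fs('bc')}           # outcome 1 satisfies a; outcome 2: b and c
leq = {(1, 1), (2, 2), (1, 2)}          # reflexive, and outcome 2 is better
# New outcomes described by {a} and {b} differ exactly where 1 and 2 differ:
assert predict_preference(fs('a'), fs('b'), W, leq, universe)
assert not predict_preference(fs('b'), fs('a'), W, leq, universe)
```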
Algorithm 3 Check Preference(D ⪯_F E, M).
Input: A preference statement D ⪯_F E over Φ and a preference model M = (W, ≤, V).
Output: true, if M ⊨ D ⪯_F E; false, otherwise.

X := ⋂_{p ∈ D} V(p)
Y := ⋂_{p ∈ E} V(p)
for all w ∈ X do
    for all v ∈ Y do
        if w ≰ v and V⁻¹(w) ∩ F = V⁻¹(v) ∩ F then
            return false
return true
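Algorithm 3, with the extents V(p) precomputed for each variable, can be sketched as follows (again, the representation and names are mine):

```python
def check_preference(D, F, E, W, leq, V):
    """Algorithm 3: M ⊨ D ⪯_F E?  V maps each variable to the set of
    outcomes satisfying it; W maps outcomes to their variable sets."""
    outcomes = set(W)
    X = outcomes.copy()
    for p in D:                 # X := intersection of V(p) over p ∈ D
        X &= V[p]
    Y = outcomes.copy()
    for p in E:                 # Y := intersection of V(p) over p ∈ E
        Y &= V[p]
    for w in X:
        for v in Y:
            if (w, v) not in leq and W[w] & F == W[v] & F:
                return False    # w and v agree on F, yet w is not ≤ v
    return True

fs = frozenset
W = {1: fs('a'), 2: fs('bc')}
V = {'a': {1}, 'b': {2}, 'c': {2}}
leq = {(1, 1), (2, 2), (1, 2)}
assert check_preference(fs('a'), fs(), fs('b'), W, leq, V)      # 1 ≤ 2 holds
assert not check_preference(fs('b'), fs(), fs('a'), W, leq, V)  # 2 ≰ 1
```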
To determine whether v is hypothetically preferred to w, we need to find a preference D ⪯_F E described in Definition 13 or make sure that no such preference is supported by M. One way to achieve this is to generate a semantically complete set of preferences valid in M, ignore those that are not supported by M, and see if A ⪯_{Φ \ (A △ B)} B follows from the rest (where A △ B is the symmetric difference between A and B). The first step of this approach can be carried out by representing preferences as implications (see Section 8), but, as already mentioned, this essentially means enumerating the minimal transversals of a hypergraph, for which no output-polynomial algorithm is known. Furthermore, this preference set can itself be exponential in the size of the preference context. A more efficient method is thus desirable.

In [19], a polynomial-time algorithm is proposed for abduction using Horn theories represented by their characteristic models. A model of a Horn theory is characteristic if it cannot be obtained as the intersection of other models. The representation of a Horn theory by its set of characteristic models is essentially the same as the representation of an implication set by an object-reduced context for which this implication set is sound and complete. The abduction algorithm also works fine on a set of models that contains some (or all) non-characteristic models. A strategy similar to this algorithm can be used for determining if v is hypothetically preferred to w as described in Definition 13: assume that v is indeed preferred to w and find an explanation for this. This is the logic behind Algorithm 2, which is an adaptation of the abduction algorithm from [19] to our problem. In addition to M, the algorithm receives sets A and B of propositional symbols, which should be thought of as descriptions of two outcomes that must be preferentially ordered with respect to each other.

The algorithm returns true if the outcome described by B is hypothetically preferred to the outcome described by A and false otherwise. To find a required preference D ⪯_F E supported by M, Algorithm 2 iterates through all pairs of distinct outcomes (w, v) of M such that w ≤ v. For each such pair, it forms the weakest preference that would explain both w ≤ v and the assumed preference of the outcome described by B over the outcome described by A. The left-hand side of this preference must "match" both V⁻¹(w) and A; therefore, the algorithm sets D := A ∩ V⁻¹(w). Similarly, the right-hand side is E := B ∩ V⁻¹(v). The variables in the ceteris paribus condition must "match" the similarity between w and v, as well as the similarity between A and B. Thus, F is set to contain all variables on which there is no disagreement between w and v or between A and B: each variable from F is satisfied by both w and v or by none of them, and belongs to both A and B or to none of them (recall that A △ B is the symmetric difference between A and B). It is easy to see that, if the resulting preference D ⪯_F E is valid in M, it satisfies the description in Definition 13, making it possible to conclude that the outcome described by B is hypothetically preferred to the outcome described by A. On the other hand, every preference supported by M must "match" a pair of outcomes w ≤ v in the sense of Definition 12. If, after iterating through all pairs w ≤ v, we have not built a preference D ⪯_F E explaining why B should be preferred to A, then no such preference is supported by M, and the algorithm answers negatively.
It is easy to see that Algorithm 2 is polynomial in the size of its input. To be more precise, let n = |W|, m = |Φ|, and k = |≤|. Then there are k − n iterations of the inner for loop in total in the worst case, since the two nested loops iterate through all pairs w ≤ v of M with w ≠ v. At each iteration, we compute E and F by applying set-theoretic operations to subsets of Φ (this subsumes the time needed to compute D). Then we test whether the resulting preference D ⪯_F E holds in M. For this, we have to search through all pairs w ≰ v of M trying to find one with D ⊆ V⁻¹(w), E ⊆ V⁻¹(v), and F ∩ V⁻¹(w) = F ∩ V⁻¹(v). Assuming that set-theoretic operations over subsets of Φ take time O(m), this can be done in time O((n² − k)m), resulting in the overall time of O(m(k − n)(n² − k)).
Another way to check if a preference holds in a preference model is given by Algorithm 3. This algorithm will generally be more efficient if, for each variable p ∈ , we precompute V ( p ). To verify that M | D F E, we then need to compute X = p ∈ D V ( p ) and Y = p ∈ E V ( p ) using | D | − 1 and | E | − 1 intersections, respectively. After this, we check if there are outcomes w ∈ X and v ∈ Y that agree on all variables from F , but for which w v. If we can find such outcomes, the preference does not hold; otherwise, it does. This optimization does not lead to a better worst-case theoretical complexity, but it may result in a considerable speed-up in practice: instead of spending O (m) time on each of O (n2 − k) pairs, we would only consider pairs from X × Y . Further optimizations may include precomputing, for each p ∈ , all pairs of outcomes that agree on p:
{(w, v) ∈ W² | w ∈ V(p) ⟺ v ∈ V(p)}.

10. Conclusion

In this paper, we have focused on a language for describing a particular type of ceteris paribus preferences. Traditionally, ceteris paribus preferences are understood as preferences that hold “all other things being equal” [4]; this is the case, for example, for CP-nets. We consider a broader class of preference statements that includes parameterized statements, making it possible to specify explicitly which conditions must be kept equal when comparing two alternatives. We show that this language has the same expressivity as the language of non-parameterized statements, but is far more concise. Our language can be described as a simple fragment of the modal preference logic from [6] in which only non-strict preference statements between atomic conjunctions can be expressed; the ceteris paribus part of a preference statement, which specifies conditions that must be identical for the two alternatives being compared, is given by a set of atomic formulas in our case, as compared to a set of arbitrary formulas in [6]. This fragment comes close to some of the AI formalisms for representing preferences, e.g., [3] and especially [20]. However, unlike in these works and as in [6], we use possible-world semantics and base it on arbitrary preorders, whereas the semantics in other AI formalisms is more commonly based on total preorders. We have described a sound and complete inference system for the type of preferences studied in the paper. We have also shown that the inference problem or, more precisely, the problem of determining whether a given preference statement is a semantic consequence of a given collection of preference statements, is coNP-complete for our language. Multi-valued variables can be handled in our language in a standard way by introducing a separate propositional variable for each value of a multi-valued variable.
This, however, requires that the notion of semantic consequence (as presented in Definition 4) be modified: it should not take into account models that, for example, allow outcomes with several values for the same variable or otherwise violate constraints related to variable values. Of course, a similar approach can be used for modeling constraints over values of different variables. Such constraints can be represented by propositional formulas, but, unlike integrity constraints in the preference logic from [3], these formulas are not part of our language, which is designed only for expressing preference statements. In our case, constraints could be specified on the semantic side, by appropriately modifying the notion of semantic consequence. We are going to explore various types of constraints and to see if and how they affect our results on the complexity of inference.

In our future work, we also plan to extend our language to include strict preference statements. This can be done in at least two ways. A preference statement A ≺_C B can be defined to hold whenever A ≼_C B holds, but B ≼_C A does not. A more common way to define strict preference statements is via a strict preference relation (a strict partial order) on outcomes.

Acknowledgements

The article was prepared within the framework of the Academic Fund Program at the National Research University Higher School of Economics (HSE) in 2015 (grant 15-01-0172) and supported within the framework of a subsidy granted to the HSE by the Government of the Russian Federation for the implementation of the Global Competitiveness Program. The author would like to thank the anonymous reviewers for their helpful comments and suggestions.

References

[1] R.I. Brafman, C. Domshlak, Preference handling—an introductory tutorial, AI Mag. 30 (2009) 58–86.
[2] C. Domshlak, E. Hüllermeier, S. Kaci, H. Prade, Preferences in AI: an overview, Artificial Intelligence 175 (2011) 1037–1052.
[3] M. Bienvenu, J. Lang, N.
Wilson, From preference logics to preference languages, and back, in: Proceedings of the Twelfth International Conference on Principles of Knowledge Representation and Reasoning, AAAI Press, Menlo Park, CA, 2010, pp. 9–13.
[4] G.H. von Wright, The Logic of Preference, Edinburgh University Press, 1963.
[5] C. Boutilier, R.I. Brafman, C. Domshlak, H.H. Hoos, D. Poole, CP-nets: a tool for representing and reasoning with conditional ceteris paribus preference statements, J. Artificial Intelligence Res. 21 (2004) 135–191.
[6] J. van Benthem, P. Girard, O. Roy, Everything else being equal: a modal logic for ceteris paribus preferences, J. Philos. Logic 38 (2009) 83–125.
[7] N. Wilson, Computational techniques for a simple theory of conditional preferences, Artificial Intelligence 175 (2011) 1053–1091.
[8] R. Agrawal, T. Imieliński, A. Swami, Mining association rules between sets of items in large databases, SIGMOD Rec. 22 (1993) 207–216.
[9] S. Bouveret, U. Endriss, J. Lang, Conditional importance networks: a graphical language for representing ordinal, monotonic preferences over sets of goods, in: [21], 2009, pp. 67–72, http://ijcai.org/papers09/Papers/IJCAI09-022.pdf.
[10] S. Obiedkov, Modeling preferences over attribute sets in formal concept analysis, in: Formal Concept Analysis, in: Lecture Notes in Artificial Intelligence, vol. 7278, Springer, Berlin/Heidelberg, 2012, pp. 227–243.
[11] B. Ganter, R. Wille, Formal Concept Analysis: Mathematical Foundations, Springer, Berlin/Heidelberg, 1999.
[12] S. Obiedkov, Ceteris paribus preferences: prediction via abduction, in: S. Gruner, B. Watson (Eds.), Formal Aspects of Computing: Essays Dedicated to Derrick Kourie on the Occasion of His 65th Birthday, Shaker, 2013, pp. 45–56.
[13] S. Obiedkov, Modeling ceteris paribus preferences in formal concept analysis, in: P. Cellier, F. Distel, B. Ganter (Eds.), Formal Concept Analysis, in: Lecture Notes in Computer Science, vol. 7880, Springer, Berlin/Heidelberg, 2013, pp. 188–202.
[14] J.-L. Guigues, V. Duquenne, Famille minimale d’implications informatives résultant d’un tableau de données binaires, Math. Sci. Hum. 24 (1986) 5–18.
[15] S. Kuznetsov, S. Obiedkov, Some decision and counting problems of the Duquenne–Guigues basis of implications, Discrete Appl. Math. 156 (2008) 1994–2003.
[16] W. Armstrong, Dependency structure of data base relationships, in: Proc. IFIP Congress, 1974, pp. 580–583.
[17] T. Eiter, G. Gottlob, Identifying the minimal transversals of a hypergraph and related problems, SIAM J. Comput. 24 (1995) 1278–1304.
[18] R. Khardon, Translating between Horn representations and their characteristic models, J. Artificial Intelligence Res. 3 (1995) 349–372.
[19] H. Kautz, M. Kearns, B. Selman, Horn approximations of empirical data, Artificial Intelligence 74 (1995) 129–145.
[20] N. Wilson, Efficient inference for expressive comparative preference languages, in: [21], 2009, pp. 961–966.
[21] C. Boutilier (Ed.), IJCAI 2009, Proceedings of the 21st International Joint Conference on Artificial Intelligence, Pasadena, California, USA, July 11–17, 2009, 2009.