Top-K interesting preference rules mining based on MaxClique


Expert Systems With Applications 143 (2020) 113043


Zheng Tan, Hang Yu, Wei Wei, Jinglei Liu*
School of Computer and Control Engineering, Yantai University, Yantai 264005, PR China

Article history: Received 4 September 2018; Revised 2 October 2019; Accepted 17 October 2019; Available online 8 November 2019

Keywords: Maxclique; Association rules; Conditional preference rules; Belief system; Interestingness measure

Abstract

In order to fully consider context constraints and eliminate the redundancy of preferences in personalized database queries, a Top-K conditional preference mining framework based on maximal cliques is proposed. Firstly, a conditional constraint model i+ ≻ i− | X is proposed, which means that users prefer i+ to i− under the constraint of X. Based on this model, we design the MPM algorithm to efficiently obtain users' contextual preferences by using a graphic model and new pruning strategies. Then, we propose an updated belief system and the concept of common belief. The belief system is constructed from users' common preferences and combined with a Bayesian approach to acquire users' Top-K preference rules. Finally, the experimental results indicate that the MPM algorithm is far more efficient than three classic or leading-edge mining algorithms at mining all rules and has performance comparable to two state-of-the-art Top-K methods. Moreover, the experimental results on Recall, Precision and F1-Measure show that the MPM algorithm comprehensively outperforms the six other algorithms.

1. Introduction

Massive information retrieval and utilization has become an important research and application field. At the same time, the popularization of E-commerce makes Internet commodity information grow exponentially (Amo et al., 2015), which expands consumers' choice space but also raises the cost of choosing. User preferences are extremely important in designing the system model, whether in an E-commerce system or in a personalized recommendation system (Cao & Li, 2007). Some existing E-commerce recommendation systems still have shortcomings in real-world applications: for example, they cannot satisfy users' individual needs, and they require excessive user intervention when mining users' potential needs. A consumer's preference, however, is a kind of consumer psychology that reflects the consumer's interest in a certain product or service and is influenced by subjective and objective factors such as personality characteristics and the external environment (Zhang, Kou, Chen, & Li, 2010). Users often cannot express their preferences accurately and concisely (Li, Han, & Pei, 2001); thus, preference extraction is a key technology for mining implicit user preference rules. In the preference mining process, mining preferences based on association rules is universal. However, in practical applications,




there are some problems. For instance, the generation of association rules is related only to the results of user consumption, and it ignores the context preference constraint and the users' background knowledge. In this paper, we use a preference mining algorithm to obtain users' potential preferences from preference databases and select interesting user preferences through objective measures. User preferences are elaborated by a series of conditional preference rules which satisfy certain standards of interest and are consistent with most users' common sense. The conditional preference rule model proposed in this paper is i+ ≻ i− | X, which indicates that users are more interested in i+ than in i− under the constraint of context X. The conditional constraint here is also referred to as a context constraint. For example, in Table 1(a), Tid represents the movie identification, and Attribute represents the properties of the movies; Y indicates that the transaction contains the attribute, and the rating indicates the user's rating of the transaction. The movie preference database can be generated through the D-value (the difference) between ratings, as shown in Table 1(b). In Table 1, the tuple information expresses user preferences generated from the users' ratings of the movies. When a user scores a movie above a certain threshold and higher than the rating of another movie, the user is considered to have a preference between these two movies. For example, if the rating of a movie is above 8 points and the D-value between it and another movie is greater than 1, the user has a preference between these movies. From Table 1, we can infer that the rating of t1 is higher than that of t3, and the rating of t1 is also above 8 points, so the user's preference for movie t1 is

Table 1
Tuples in database.

(a) Movie Database D

TID | A | B | C | D | E | Scores
t1  | Y | Y |   | Y |   | 9.2
t2  | Y |   |   |   | Y | 7.4
t3  | Y | Y | Y |   | Y | 6.5
t4  |   | Y | Y |   |   | 8.6
t5  |   | Y |   | Y | Y | 7.7

(b) Preference Database ξ

Pid | User preferences
p1  | < t1, t2 >
p2  | < t1, t3 >
p3  | < t1, t5 >
p4  | < t4, t2 >
p5  | < t4, t3 >

higher than t3, which is represented by < t1, t3 >. In this way, the preference database ξ shown in Table 1(b) can be formed. Analyzing the preference < t1, t3 >, we can find that both tuples contain attribute A and attribute B, while t1 contains attribute D and t3 contains attribute E. So the conditional preference rule is D ≻ E | AB, which means that, of two movies that both contain attributes A and B, people prefer the one with attribute D to the one with attribute E. Attributes A and B here constitute the context of this rule. The user preferences presented in this paper are composed of a series of conditional preference rules. The main criteria for measuring these rules are Conciseness, Generality, Novelty and Surprisingness (Geng & Hamilton, 2006). In this paper, a preference rule mining algorithm, the Matrix-based Preference Miner (MPM), is designed to mine users' contextual preferences, and common user preferences are used as a belief system to perform a Top-K interestingness ranking. The method is divided into two phases. In the belief system generation phase, the common user preferences are mined by MPM, and the Ruleset Aggregation algorithm (RSA), which is based on the average internal distance, is used to construct a belief system in which the representative rules are selected as background knowledge. In the users' personalized preference processing phase, an evaluation function based on the Bayesian formula is used to select the Top-K personalized preference rules under the constraints of the belief system constructed in the first phase. In this paper, based on lattice theory and a Bayesian approach, we investigate users' personalized conditional preference mining; our characteristics and contributions lie in:

1. We devise the MPM algorithm for mining the contextual preference rules of an individual user. Compared with traditional methods such as Apriori and CONTENUM, our MPM algorithm requires fewer database scans, so it is rather efficient. Moreover, we use vectors to represent the transactions, cliques, classes, and contextual preference rules involved in the proposed algorithm.

2. Based on the average internal distance of a preference ruleset, we construct a new belief system with the well-designed ruleset aggregation algorithm RSA. Because the proposed RSA algorithm aggregates users' common preferences mined in the first, individual preference mining phase (MPM), the constructed belief system not only reflects all users' consensus preferences without additional information, but is also rather concise owing to the small number of fused rules.

3. Based on Conciseness, Generality, Novelty and Surprisingness, we propose a novel preference interestingness measure standard to evaluate the quality of preference rules against the constructed belief system. This standard holds that if a rule is interesting, it should either basically conform to a certain rule in the belief system (Conciseness and Generality) or deviate from all the rules in the belief system (Novelty and Surprisingness). More importantly, Top-K preference rules are mined by our Preference rules Interestingness Measurer algorithm (PIM), which is based on the proposed measure standard.

This paper is organized as follows. In Section 2, we discuss related work. In Section 3, we define contextual preference. The two following sections describe the two phases of finding interesting rules: Section 4 presents the first phase, namely the preference rule mining algorithm, whereas Section 5 defines the interestingness of preference rules and uses a Bayesian approach to measure interestingness through the belief system filtered by the RSA algorithm. In Section 6, we describe and analyze experimental results on real datasets to compare the efficiency of our algorithm with the Apriori and CONTENUM algorithms. Section 7 concludes.

2. Related work

Preference extraction tasks can be viewed as two independent problems: the label ranking learning problem and the object ranking learning problem. De Sá, Soares, Jorge, Azevedo, and Costa (2011) and Hüllermeier, Fürnkranz, Cheng, and Brinker (2008) conduct deeper research on these two issues respectively. The representative research work on extracting user preferences can be divided into two categories. One is to input user preferences through a query interface, but in this case the expression of user preferences is greatly limited; as noted in Amo et al. (2015), users often cannot accurately express their preferences, so it is difficult to obtain user preferences in this way. The other category is to use mining algorithms to obtain users' potential preferences, since preferences are affected by various factors and users cannot accurately express them.

2.1. Mining frequent itemsets and preference rules

With the maturing of association rule data mining technology, using association rules to extract implicit user preferences from users' historical records and extracting a series of preference rules with Conciseness and Generality have become a focus of research.

2.1.1. Mining methods

After the first frequent itemset mining algorithm (Agrawal, Srikant et al., 1994), many updated algorithms have been proposed. The sampling-based association rule extraction algorithm of Toivonen et al. (1996) uses only partially representative data in the mining process; the partition-based algorithm of Savasere, Omiecinski, and Navathe (1995) divides the data into small blocks that can be processed in memory, where the first database scan generates local frequent sets and the second forms the global frequent set. Both of these methods attempt to reduce the number of database scans in the process of mining association rules, but the sampling-based method is less stable when there are few samples, and the partition-based algorithm easily generates false frequent itemsets. Based on these techniques, many researchers have done further investigations.
The effectiveness of sampling-based association rule


algorithms has been analyzed in (Zaki, Parthasarathy, Li, & Ogihara, 1997). The class association rules (CARs) algorithm of (Hu, Lu, Zhou, & Shi, 1999) is a classification association rule algorithm which focuses on mining specific subsets; the goal is to find a set of rules in the database for precise classification. To evaluate the differences among mining algorithms, (Győrödi & Holban, 2004) compares some classic association rule algorithms. More recent research has turned to new fields. Applying the graphic model to the process of mining association rules, a parallel and scalable algorithm is proposed in (Fan, Wang, Wu, & Xu, 2015), with the guarantee that as the number of processors increases, the processing speed is accelerated in the order of polynomials. In (Zeng, 2016), a mining model of neighborhood rough sets based on two systems is proposed; it addresses the cold start problem with new users or new projects that some classic mining algorithms cannot solve. Nonetheless, there is still room for an efficient and stable mining algorithm. The lattice theory presented in (Davey & Priestley, 2002) shows that the maximal frequent itemsets uniquely determine all frequent itemsets; this indicates that all frequent itemsets can be obtained by finding the maximal frequent itemsets. In order to decompose the problem into smaller subset spaces that can be solved in memory easily, (Zaki, 2000) proposes an adaptive algorithm for association rules. This research opens a gate for designing a matrix-based mining algorithm.

2.1.2. Preference rules and applications

The research on preference rules is varied and sophisticated. (Agrawal, Rantzau, & Terzi, 2006) proposes a contextual preference mining model; since the preference rules from different data sources may be repeated and contradictory, an off-line prior group is established to store representative preference rules. In (Holland, Ester, & Kießling, 2003), a novel data preference mining algorithm for user-individual applications is proposed. The advantage of this method is that the result of mining is the strict partial-order semantics of "A is better than B", but this semantic form still has deficiencies in describing complex preferences. For the prominent user preference model CP-nets, Liu and Liao (2015) study its expressive power and efficiency, and Sun, Liu, and Wang (2017) study the corresponding preference composition operators. In (Costa & de Amo, 2016), the authors consider whether users prefer an item i2 to an item i1 while taking the users' preferences for both items into account, which increases the stability of the results but does not consider the effects of multiple items. In (Zhu et al., 2012), context awareness is used in mobile users' preference mining: when user log information is insufficient, a user's preferences are treated as the distribution of the common context-aware preferences of all users, and two preference mining methods, under context-independent and non-independent conditions, are discussed for a variety of application scenarios. Mining user preferences with association rules is also widely used in many recommendation systems.
In (Jung & Lee, 2004), an optimal recommendation system based on hybrid collaborative filtering is proposed, but the traditional Apriori algorithm is used in the association mining process, which may affect the efficiency of the system. Nguyen et al. (2012) point out that current works assume user preferences are completely specified, whereas in real-world applications users rarely can, and often are unable to, provide full knowledge of their preferences; they propose several heuristic algorithms that use different elements to discover a set of preferences to provide to the user.


Since the use of preference logic under multiple contexts sometimes leads to undesirable results, a logical sort concept is proposed in (Le, Son, & Pontelli, 2018); it is applicable to multi-source preferences and can be applied to a multi-context system with preferences.

2.2. Interestingness measure for preference rules

Preference learning and interestingness measures have laid a solid foundation; some classic and up-to-date works are as follows.

2.2.1. Measure metrics and methods

There are similarities between different users' preferences. (Ha & Haddawy, 2003) analyzes these similarities from the perspective of decision-making and proposes a probability-based similarity measure which uses Kendall's tau function. The authors demonstrate the stability of the method and its scalability under uncertain conditions, which is very enlightening for the theory of preference similarity and uniqueness proposed in this paper. In (Sá de, Azevedo, Soares, Jorge, & Knobbe, 2018), the authors discover pair-wise association rules based on similarity measures by studying a multi-target mining pattern, and investigate which datasets benefit more from these measures and which parameters have the greatest impact on the accuracy of the model. From the point of view of machine learning, (Dembczyński, Kotłowski, Słowiński, & Szeląg, 2010) considers the problem of multiple attribute ranking and proposes two methods for statistical learning from decision rulesets. Shambour and Lu (2011) show the necessity of measuring the interestingness of association rules and propose a new direction for selecting interesting rules based on the interestingness description: using multi-criteria decision-aiding processes. Because classic interestingness measures have many flaws, (Freitas, 1999) focuses on several factors ignored by other literature and proposes new ones; the authors use these factors to improve the interestingness measure algorithm, but its lack of scalability means it cannot be applied to the results of different mining algorithms or of different datasets. (Ju, Bao, Xu, & Fu, 2015) proposes three improved indexes and a utility function based on interestingness gain, which take into account subjective preferences and the characteristics of specific applications; experiments in that work show that the model can discover interesting rules that are difficult to discover using association rules alone. (Geng & Hamilton, 2006) describes a variety of interestingness measure methods, classifies them, compares their performance, determines their role in the data mining process, and proposes a strategy for selecting appropriate measures. In (Ma, Wang, & Liu, 2011), the authors use a fuzzy interestingness measure to mine more potential user preferences; nevertheless, this approach may produce more interesting rules while affecting the stability of user preference rules. Glass (2013) further studies the results of interestingness measures and proposes two stable and efficient measures to confirm the interestingness of association rules.

2.2.2. Top-K methods

In (Pereira & de Amo, 2010), the Top-K rules of rulesets are evaluated.
The results show that the Top-K rules splendidly represent users' preferences, which indicates that information filtering can obviously improve the efficiency and accuracy of information search. Several mining methods have been proposed. The first is kHMC (Duong, Liao, Fournier-Viger, & Dam, 2016), which uses the RIU, CUD


Fig. 1. Preference database and transaction database connection diagram.

and COV strategies to raise its internal minimum utility threshold effectively. An improved top-k high utility mining algorithm, THUI (Krishnamoorthy, 2019), with a new utility lower-bound estimation method, has been proposed for mining HUIs, but it still has limitations when facing non-contiguous patterns. A similar method has been proposed by Zhang et al. (2019). In (Lin, Ren, Fournier-Viger, Pan, & Hong, 2018), an updated HAUIM method is proposed, designed to provide a more reasonable utility measure by taking the size of the itemset into account. To reduce the number of patterns, TFRC-Mine (Amphawan & Lenca, 2015), an approach that does not require setting a support threshold and focuses on closed patterns, is designed. In (Lucas, Silva, Vimieiro, & Ludermir, 2017), an evolutionary algorithm is designed for efficiently mining patterns in high-dimensional data. The iNTK algorithm (Le, Vo, & Baik, 2018) employs an N-list structure to represent patterns for mining top-rank-k frequent patterns, and the subsume concept is used to speed up the mining process. Jabbour, Sais, and Salhi (2017) propose a general algorithm for Top-K SAT and apply it in data mining. In the application field, (Chen, Xie, Shang, & Peng, 2017) reinforces the conventional association rule mining process by mapping the whole process into a visualization loop; as a result, the user workload for tuning parameters and mining rules is reduced, and the mining efficiency is considerably improved. (Kamsu-Foguem, Rigal, & Mauget, 2013) emphasises the improvement of operations processes; the application example details an industrial experiment in which association rule mining is used to analyze the manufacturing process of a fully integrated provider of drilling products.

2.3. Differences and compensation

There exist four differences between our approach and the related work:

1. In the frequent itemset mining process, aiming at reducing memory consumption and the number of database scans, we use a graph model algorithm and vectorize it. This compensates for the disadvantages of traditional mining algorithms and significantly improves the efficiency of mining frequent itemsets.

2. Combining with contextual preference, the MPM algorithm is proposed for mining contextual preference rules, which improves the accuracy of the rules. A new approach is also proposed to accelerate the mining of preference rules, which considerably narrows the search space.

3. Generally, objective measures lack supervision information and subjective measures have various limitations. To provide

accurate and convincing supervision information for evaluating preference rules, a representative belief system is established using representative users' common preferences. Moreover, in this process, a heuristic algorithm named RSA is designed to eliminate the redundancy in common preferences.

4. When mining a user's personalized preferences, most research concentrates on the consistency of preferences. On the contrary, we use an updated Bayesian approach to measure the interestingness of contextual preference rules, and choose the personalized preferences which mostly satisfy all users' preferences or are clearly different from them. This method not only considers the users' consensus on preferences but also retains each user's personalized information, which makes the algorithm more adaptable and improves the accuracy of the results.

3. Context preference

3.1. Contextual preference rules

Let I be a set of items, let X be a subset of I (X ⊆ I), and let the transaction database D be a multi-set of itemsets over I. Each itemset is a tuple of the database, such as one row in Table 1(a). A preference database ξ ⊆ D × D is a set of pairs of transactions; each record represents a user preference, such as < t, u > ∈ ξ, which means that the user prefers t over u. The connection between the preference database and the transaction database is shown in Fig. 1.

Definition 1 (Contextual preference rule). A contextual preference rule is an expression of the form i+ ≻ i− | X, where X ⊆ I and i+ and i− are items of I − X. For example, D ≻ E | AB means that when context AB occurs, the user prefers D over E. A preference database tuple t ≻ u is associated with the preference rule R = i+ ≻ i− | X if it satisfies (X ∪ {i+} ⊆ t) ∧ (X ∪ {i−} ⊆ u) ∧ (i+ ∉ u) ∧ (i− ∉ t). For example, the tuple ACD ≻ ABCE is associated with the preference rule D ≻ E | A, which indicates that when A appears, the user's preference for D is stronger than for E.

Definition 2 (Support). The support of a contextual preference rule R in the preference database ξ is defined as:

supp(R, ξ) = |agree(R, ξ)| / |ξ|    (1)

where agree(R, ξ) denotes the set of tuples in ξ that are associated with R.

For example, the contextual preference rule D ≻ E | A matches p1 and p2 in Table 1(b), and D ≻ E | B matches p2, so we obtain supp(D ≻ E | A, ξ) = |{p1, p2}| / |ξ| = 2/5 = 0.4. Similarly, supp(D ≻ E | B, ξ) = |{p2}| / |ξ| = 1/5 = 0.2. The support reflects the matching degree of a contextual preference rule R to the tuples in the preference database and also reflects the interestingness of a contextual preference rule. Obviously, the rule D ≻ E | A is more interesting than the rule D ≻ E | B.

Definition 3 (Confidence). The confidence of a contextual preference rule R with respect to the preference database ξ is defined as follows:

conf(R, ξ) = agree(R, ξ) / ( agree(R, ξ) + against(R, ξ) )    (2)

where agree(R, ξ) and against(R, ξ) count the tuples in ξ that are associated with R and with the reversed rule i− ≻ i+ | X, respectively.

For example, conf(D ≻ E | A, ξ) = 2/2 = 1. Similarly, conf(D ≻ E | φ, ξ) = 2/3. Confidence evaluates how well a contextual preference rule matches user preferences; that is to say, D ≻ E | A is more valuable than D ≻ E | φ. A contextual preference rule R is called a frequent rule if its support exceeds a specified threshold, minsup. If the confidence of a frequent rule R exceeds a specified value minconf, R is called a strong association rule.
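To make Definitions 1-3 concrete, the following minimal Python sketch (ours, not the paper's code) evaluates them on the Table 1 toy data; the attribute sets follow the reconstruction of Table 1(a) above, and against is read here as support for the reversed rule i− ≻ i+ | X.

t = {"t1": set("ABD"), "t2": set("AE"), "t3": set("ABCE"),
     "t4": set("BC"),  "t5": set("BDE")}
xi = [(t["t1"], t["t2"]), (t["t1"], t["t3"]), (t["t1"], t["t5"]),
      (t["t4"], t["t2"]), (t["t4"], t["t3"])]   # p1..p5 of Table 1(b)

def associated(pair, X, ip, im):
    """Definition 1: is the pair <t, u> associated with rule ip > im | X?"""
    tt, u = pair
    return (X | {ip} <= tt) and (X | {im} <= u) and (ip not in u) and (im not in tt)

def supp(X, ip, im):
    """Definition 2 / Eq. (1)."""
    return sum(associated(p, X, ip, im) for p in xi) / len(xi)

def conf(X, ip, im):
    """Definition 3 / Eq. (2); against = support for the reversed rule."""
    agree = sum(associated(p, X, ip, im) for p in xi)
    against = sum(associated((u, tt), X, ip, im) for (tt, u) in xi)
    return agree / (agree + against)

print(supp({"A"}, "D", "E"), conf({"A"}, "D", "E"))   # 0.4 1.0

Running the script reproduces the values of the running example: supp(D ≻ E | A, ξ) = 0.4 and conf(D ≻ E | A, ξ) = 1.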


The support threshold and the confidence threshold respectively discard the infrequent and the low-confidence contextual preference rules. In Fig. 1, D ≻ E | B and D ≻ E | AB have the same support and confidence; in order to enhance the generality of the rules, it is desirable to keep the shorter contextual rules, namely the smallest contextual rules.

3.2. Lattice theory

Some terminology of lattice theory is described as follows.

Definition 4. L is a lattice of sets if it is closed under finite intersections and unions; (L; ⊆) is a partial order set defined by the subset relationship ⊆, with X ∨ Y = X ∪ Y and X ∧ Y = X ∩ Y. A(L) denotes the atom set of L.

Lemma 1. The subsets of frequent itemsets are frequent; the supersets of infrequent itemsets are infrequent.

Lemma 2. The maximal frequent itemsets uniquely identify all frequent itemsets.

Lemma 3. For a finite Boolean lattice L and x ∈ L, x = ∨{y ∈ A(L) | y ≤ x}; that is to say, every element is given as a union of a subset of the atom set.

Lemma 4. Let ℘(L) be the power set of L. For any X ∈ ℘(L), let J = {Y ∈ A(℘(L)) | Y ≤ X}. Then X = ∪_{Y∈J} Y and supp(X) = |∩_{Y∈J} Y|. This lemma says that the support of an itemset is the size of the intersection of the tid-lists of its atoms, so we can determine any k-itemset by simply intersecting the tid-lists of any two of its (k−1)-length subsets.

3.3. Prefix-based classes

Definition 5 (Equivalence relation). Let X, Y, Z ∈ ℘(L). An equivalence relation ≡ on ℘(L) is a binary relation satisfying:

1. Reflexive: X ≡ X
2. Symmetric: If X ≡ Y then Y ≡ X
3. Transitive: If X ≡ Y and Y ≡ Z then X ≡ Z

An equivalence relation divides the set into disjoint subsets called equivalence classes. The equivalence class of an element X is [X] = {Y | X ≡ Y}. The equivalence relation θk on ℘(L) is defined as follows: ∀X, Y ∈ ℘(L), X ≡_{θk} Y ⇔ p(X, k) = p(Y, k), where p(X, k) denotes the length-k prefix of X. If two itemsets have a common prefix of length k, they are in the same class, and θk is called the prefix-based equivalence relation.

Lemma 5. Every equivalence class [X]_{θk} induced by θk is a subset of ℘(L).

3.4. Maximal clique

Definition 6 (Pseudo-equivalence relation). Let S be a set. A binary relation ≡ on S is called a pseudo-equivalence relation if, for all X, Y ∈ S, it satisfies:

1. Reflexive: X ≡ X
2. Symmetric: If X ≡ Y then Y ≡ X

A pseudo-equivalence relation divides S into potentially overlapping subsets called pseudo-equivalence classes.

Definition 7 (Maximal clique). If there is an edge between each pair of vertices in a graph, the graph is called complete. A complete subgraph of a graph is called a clique, and a Maximal Clique is a clique that is contained in no larger clique.


Let Fk denote the set of frequent k-itemsets and define a k-association graph Gk = (V, E) with vertex set V = {X | X ∈ F1} and edge set E = {(X, Y) | X, Y ∈ V ∧ ∃Z ∈ F_{k+1}, X, Y ∈ Z}; Mk denotes the set of maximal cliques of Gk. Following Definition 6, two items X, Y of the power set are related, X ≡_{φk} Y, iff ∃C ∈ Mk such that X, Y belong to the same maximal clique C and satisfy p(X, k) = p(Y, k), i.e., they share a common prefix of length k. In other words, φk is the maxclique-based pseudo-equivalence relation on the power set.

Fig. 2 shows the association graph, the maximal cliques, and the prefix-based classes for the frequent 2-itemsets {01, 02, 03, 04, 05, 06, 07, 12, 13, 14, 18, 25, 26, 34, 36, 38, 48, 57, 58, 68, 78}. Fig. 2a contains the maximal cliques shown in Fig. 2b, and Fig. 2c shows the maximal-clique-based classes over the frequent 1-itemsets and the prefix-based classes when k = 1. For example, [0] = {0, 1, 2, 3, 4, 5, 6, 7}, because the graph contains the 2-itemsets {01, 02, 03, 04, 05, 06, 07}. If a prefix-based class is contained in another prefix-based class, no new maximal clique is generated from it; for example, the prefix-based class [2] is included in the prefix-based class [0], so there is no class [2] based on maximal cliques. This feature can be used for pruning when generating maximal cliques from prefix-based classes.

Lemma 6. The pseudo-equivalence class [X]_{φk} generated by the pseudo-equivalence relation φk is a subset of ℘(L).

Lemma 7. Let Nk denote the set of pseudo-equivalence classes generated by the maximal clique relation φk. Each pseudo-equivalence class [Y]_{φk} generated by φk is a subset of some pseudo-equivalence class [Y]_{θk} generated by the prefix relation θk. Conversely, [X]_{θk} is the union of a set of pseudo-equivalence classes ψ, denoted [X]_{θk} = ∪{[Z]_{φk} | Z ∈ ψ ⊆ Nk}.

Substituting φk for θk generates smaller lattices, and these smaller lattices require less memory. From Fig. 2 it can be seen that the prefix-based class [0] = {01234567} contains all the items, whereas the maximal-clique-based classes of 0, [0] = {012, 0134, 025, 026, 036, 057}, are smaller than the prefix-based class.

4. Generating contextual preference rules

Firstly, a preference database is generated from the transaction database; then an association matrix is generated from the frequent 1-itemsets and frequent 2-itemsets. The MPM algorithm generates the maximal cliques, and all cliques, according to the association matrix, and the cliques that satisfy the support requirement are retained. According to the vector representations of the cliques, it is determined whether two cliques can generate a contextual preference rule, and the rules that satisfy the support and confidence thresholds are recorded.

Definition 8 (Vector representation of a transaction). Sort the m attributes of the transaction database in a fixed order and number them 0 to m−1. For each transaction, the vector representation has m bits; the ith bit is 0 if the attribute is not contained in the transaction, and 1 if it is. The vector representations of cliques and prefix-based classes are defined analogously. Take node 1 in Fig. 2 as an example: the vector of its maximal clique is {100011010}, and the vector of its prefix-based class is {100011110}.
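A small sketch of Definition 8 (our code, assuming attribute m−1 occupies the leftmost bit, which matches the quoted vectors): with m = 9 and the Fig. 2 graph, the maximal clique {1, 3, 4, 8} of node 1 and its prefix-based class {1, 2, 3, 4, 8} reproduce the two vectors above.

M = 9  # number of attributes in the Fig. 2 example

def to_vector(items):
    """Encode a set of items as an M-bit mask (bit i <-> attribute i)."""
    mask = 0
    for i in items:
        mask |= 1 << i
    return mask

def to_string(mask):
    """Render the mask with attribute M-1 leftmost, as in the text."""
    return format(mask, "0{}b".format(M))

print(to_string(to_vector({1, 3, 4, 8})))     # 100011010
print(to_string(to_vector({1, 2, 3, 4, 8})))  # 100011110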
4.1. Association matrix

The frequent 1-itemsets I are filtered from all itemsets, and the prefix-based class of each itemset in I is taken as a row to form a matrix as follows:

assmat_{ij} = 1 if item i and item j form a frequent 2-itemset, and 0 otherwise.    (3)

The association matrix built from Fig. 2a is shown in Fig. 3.
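A sketch of Eq. (3) and of Algorithm 2 below (our code; freq2 is an assumed set of frozensets holding the frequent 2-itemsets): each row of the association matrix, and likewise each prefix-based class, is stored as one bit mask.

def association_matrix(items, freq2):
    """Eq. (3): bit j of row i is 1 iff {i, j} is a frequent 2-itemset."""
    assmat = {i: 0 for i in items}
    for i in items:
        for j in items:
            if i != j and frozenset({i, j}) in freq2:
                assmat[i] |= 1 << j
    return assmat

def prefix_class(i, items, freq2):
    """Cal(item_i): OR together every j >= i directly connected to i."""
    mask = 1 << i                      # a node belongs to its own class
    for j in items:
        if j > i and frozenset({i, j}) in freq2:
            mask |= 1 << j
    return mask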


4.2. The algorithm for generating cliques

Before generating contextual preference rules, frequent itemsets must be generated. However, traditional methods require many database scans. We propose an algorithm based on the association matrix to solve the frequent itemset generation problem; it has excellent pruning ability on sparse matrices and on graphs of a certain scale. The notation is as follows:

1. clique is a vector representation of the vertices in the current clique, initialized to clique = 0.
2. leftitem is a vector representation of the set of nodes in the graph that are directly connected to all the vertices in clique, initialized to leftitem = 0.
3. MaxCliques is a set of vectors, each representing a maximal clique in the graph, initialized to MaxCliques = { }.
4. prefix[itemi] (see Algorithm 2) represents itemi's prefix-based class [itemi].

Algorithm 1: Clique generation algorithm CliqueGen.
Input: Preference database, vector clique, vector leftitem
Output: MaxCliques (or all cliques)

1  function CliqueGen(vector clique, vector leftitem) // all cliques in the graph can be found if each clique is recorded here
2    if clique = 0 then
3      foreach itemi belongs to I do
4        CliqueGen(clique | 1<<i, prefix[itemi])
5      end
6    else if leftitem ≠ 0 then
7      // try to extend the current clique
8      foreach itemi belongs to leftitem do
9        // OR adds node i to the clique; AND keeps only common neighbours
10       CliqueGen(clique | 1<<i, leftitem & prefix[itemi])
11     end
12   else
13     // leftitem is empty: the clique cannot grow any further
14     if clique is not contained in any clique of MaxCliques then
15       MaxCliques := MaxCliques ∪ {clique}
16     end
17 end CliqueGen
Fig. 2. Association graph.

Algorithm 2: Prefix-based class solving algorithm Cal.
Input: Preference database, MinSupport, Item itemi
Output: Prefix-based class prefix[itemi] of itemi

1  function Cal(FrequentItem itemi)
2    prefix[itemi] := 0
3    foreach itemj belongs to I and order(j) ≥ order(i) do
4      if support(itemi, itemj) > MinSupport then
5        prefix[itemi] := prefix[itemi] | 1<<j
6      end
7    end
8    return prefix[itemi]
9  end Cal

Algorithm 3: Matrix-based Preference rules Miner MPM.
Input: Preference database, MinConfidence, MinSupport
Output: Contextual preference ruleset R

1  function MPM()
2    get the set of all cliques CQ by CliqueGen(0, 0)
3    foreach CQ1 belongs to CQ do
4      foreach CQ2 belongs to CQ do
5        if CQ1 | CQ2 − CQ1 = 2^K1 and CQ1 | CQ2 − CQ2 = 2^K2 and K1 + K2 > 0 then
6          RULE.X := CQ1 & CQ2
7          RULE.i+ := CQ1 | CQ2 − CQ2
8          RULE.i− := CQ1 | CQ2 − CQ1
9          RULE.supp := Support(RULE)
10         RULE.conf := Confidence(RULE)
11         if RULE.supp > MinSupport and RULE.conf > MinConfidence then
12           R := R ∪ RULE
13         end
14       end
15     end
16   end
17   return R
18 end MPM
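The clique-pairing step of Algorithm 3 reduces to two bit tricks (see Propositions 2 and 3 in Section 4.3): each side of the symmetric difference of the two clique vectors must be a single set bit, i.e., a power of two. A Python sketch of this step (ours; supp and conf are assumed callables evaluating a candidate rule against the preference database):

def one_bit(x):
    """True iff x has exactly one set bit, i.e., x is a power of two."""
    return x != 0 and x & (x - 1) == 0

def rules_from_cliques(cliques, supp, conf, min_supp, min_conf):
    rules = []
    for cq1 in cliques:
        for cq2 in cliques:
            d1 = (cq1 | cq2) - cq1     # the bit present only in cq2
            d2 = (cq1 | cq2) - cq2     # the bit present only in cq1
            if one_bit(d1) and one_bit(d2):      # exactly one unequal pair
                X, ip, im = cq1 & cq2, d2, d1    # context, i+, i- (Prop. 3)
                if supp(X, ip, im) > min_supp and conf(X, ip, im) > min_conf:
                    rules.append((X, ip, im))
    return rules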


Algorithm 5: Preference Interestingness Measure algorithm PIM.
Input: Belief system RA, User preference ruleset RU
Output: Interestingness degree of RU

1  procedure PIM()
2    foreach RUj belongs to RU do
3      belief := 0, deviation := 0
4      foreach RAi belongs to RA do
5        belief := MAX(belief, P(RAi | RUj, ξ))
6        deviation := deviation + 1 − P(RAi | RUj, ξ)
7      end
8      I(RUj) := MAX(belief, deviation / n)
9    end
10 end PIM

Algorithm 4: Ruleset Aggregation algorithm RSA.
Input: Preference ruleset R, Minimum average internal distance threshold MinDis
Output: Preference ruleset RA used as background knowledge

1  function RSA()
2    RA := {}, MaxDis := MinDis
3    foreach Ri belongs to R do
4      foreach Rj belongs to R do
5        if avgdis({Ri, Rj}) > MaxDis then
6          RA := {Ri, Rj}, MaxDis := avgdis({Ri, Rj})
7        end
8      end
9    end
10   while there is a new RULE inserted into RA do // RA cannot be null
11     MaxDis := MinDis
12     foreach Ri belongs to R and Ri not belongs to RA do
13       if avgdis(RA ∪ Ri) > MaxDis then
14         RULE := Ri, MaxDis := avgdis(RA ∪ Ri)
15       end
16     end
17     RA := RA ∪ RULE
18   end
19   return RA
20 end RSA
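A compact Python rendering (ours) of the average internal distance (Eq. (4) in Section 5.2.2) and of the greedy scheme of Algorithm 4; P and Pjoint are assumed probability estimators over the preference database, e.g., fractions of supporting pairs.

from itertools import combinations

def avgdis(rules, P, Pjoint):
    """Average internal distance of a ruleset, Eq. (4)."""
    n = len(rules)
    if n < 2:
        return 0.0
    total = sum(P(ri) - Pjoint(ri, rj)
                for i, ri in enumerate(rules)
                for j, rj in enumerate(rules) if i != j)
    return total / (n * (n - 1))

def rsa(R, min_dis, P, Pjoint):
    """Greedy aggregation: seed with the farthest pair, then grow RA."""
    RA, max_dis = [], min_dis
    for ri, rj in combinations(R, 2):          # lines 3-9 of Algorithm 4
        d = avgdis([ri, rj], P, Pjoint)
        if d > max_dis:
            RA, max_dis = [ri, rj], d
    inserted = bool(RA)
    while inserted:                            # lines 10-18 of Algorithm 4
        inserted, max_dis, rule = False, min_dis, None
        for r in R:
            if r not in RA:
                d = avgdis(RA + [r], P, Pjoint)
                if d > max_dis:
                    rule, max_dis = r, d
        if rule is not None:
            RA.append(rule)
            inserted = True
    return RA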

The validity of Algorithm 1 depends on the following facts:

1. When the clique is empty, line 3 of Algorithm 1 guarantees that every frequent 1-itemset is initialized as a vertex.
2. When the clique is not empty, line 8 of Algorithm 1 guarantees that every node connected to all vertices already in the clique can be added to the clique.
3. Lines 3 and 4 of Algorithm 2 select all the nodes that are directly connected to itemi by testing the frequent 2-

Fig. 3. Association matrix.

Fig. 4. The generation process of maximal cliques.

itemsets, and line 5 joins these nodes into the vector prefix[itemi] by the OR operation; the vector then represents the set of nodes directly connected to itemi, that is, the item's prefix-based class.
4. In line 4 of Algorithm 1, we use prefix[itemi] to initialize all the nodes to which a node may expand, and add the node to the clique by the OR operation. In line 10, we use the AND operation with prefix[itemi] to ensure that, when a new node i is added, leftitem remains the set of nodes directly connected to all the vertices in the clique.
5. The termination condition of the algorithm is that the clique is non-empty and leftitem is empty, which indicates that no more nodes in the graph are directly connected to all the vertices in the clique. In other words, if there is no larger clique on the current branch, the algorithm stops the recursion of that branch.
6. Algorithm 1 uses a depth-first search, generating a depth-first search tree from the current node each time. This guarantees that the obtained cliques do not repeat, but a clique obtained at termination is not necessarily maximal (it may be a subclique of a maximal clique found earlier), so line 14 of Algorithm 1 ensures that only a clique not included in a previously generated maximal clique is added to the set as a new maximal clique.

Taking [0] in Fig. 3 as an example, the process of generating a maximal clique is shown in Fig. 4.
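A runnable Python sketch (ours) of Algorithm 1 over bit masks, following facts 1-6 above; prefix[i] is assumed to be precomputed by Algorithm 2. Candidate bits already in the clique are masked out so that the recursion terminates; recording clique on every call instead would enumerate all cliques, as MPM's first step requires.

def clique_gen(clique, leftitem, items, prefix, max_cliques):
    """items: iterable of item indices; prefix: dict of prefix-class masks."""
    if clique == 0:
        for i in items:                      # fact 1: each 1-item seeds a vertex
            clique_gen(1 << i, prefix[i] & ~(1 << i),
                       items, prefix, max_cliques)
    elif leftitem != 0:
        for i in items:                      # fact 2: extendable nodes only
            bit = 1 << i
            if leftitem & bit:
                # OR adds node i; AND keeps only common neighbours (fact 4)
                clique_gen(clique | bit,
                           leftitem & prefix[i] & ~(clique | bit),
                           items, prefix, max_cliques)
    elif all(clique & c != clique for c in max_cliques):
        # fact 6 / line 14: keep only cliques not covered by an earlier one
        max_cliques.append(clique)
    return max_cliques

For the Fig. 2 graph, clique_gen(0, 0, range(9), prefix, []) returns the maximal cliques of the association graph as bit masks.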


4.3. Generating contextual preference rules

If a contextual preference rule is frequent, the attributes that make up the rule must also be frequent; the converse is not true. Based on this property, the MPM algorithm uses frequent itemsets to find potential rules, which reduces the solution space and the I/O cost.

Proposition 1. For a contextual preference rule i+ ≻ i− | X that satisfies the support requirement, X ∪ {i+} and X ∪ {i−} are frequent itemsets.

Proof. If a rule is frequent, then supp(i+ ≻ i− | X) ≥ MinSupport. The tuples in the database that support i+ ≻ i− | X satisfy (X ∪ {i+} ⊆ t) ∧ (X ∪ {i−} ⊆ u) ∧ (i− ∉ t) ∧ (i+ ∉ u), so it is obvious that supp(i+ ≻ i− | X) ≤ supp(X ∪ {i+}) and supp(i+ ≻ i− | X) ≤ supp(X ∪ {i−}), which means that X ∪ {i+} and X ∪ {i−} are frequent. □

Proposition 2. A rule can only be generated by two cliques CQ1, CQ2 that are equal in length and have exactly one pair of unequal items.

Proof. Suppose CQ1 and CQ2 do not have the same length with exactly one pair of unequal items. Let E be the equal part and NE1, NE2 the unequal parts; then NE1 and NE2 are not both single items, and the candidate rule would have the form NE1 ≻ NE2 | E or NE2 ≻ NE1 | E with a multi-item side, which does not satisfy the definition of a contextual preference rule. Hence Proposition 2 holds. □

Proposition 3. If a rule i+ ≻ i− | X is generated by two cliques CQ1 and CQ2, its vector representation satisfies X = CQ1 & CQ2; i+ is one of {CQ1 | CQ2 − CQ1 = 2^K1, CQ1 | CQ2 − CQ2 = 2^K2} and i− is the other (K1 and K2 are integers from 0 to m−1, with K1 ≠ K2).

Proof. Since there is only one pair of unequal items between CQ1 and CQ2, namely the K1 bit and the K2 bit of the vector representations, the equal part is clearly X = CQ1 & CQ2, and the remaining parts are CQ1 | CQ2 − CQ1 and CQ1 | CQ2 − CQ2, each of which has exactly one bit set. Naturally, CQ1 | CQ2 − CQ2 = 2^K2 and CQ1 | CQ2 − CQ1 = 2^K1 are i+ and i−, respectively. □

Given the above, the rule generation algorithm (Algorithm 3) works as follows. In lines 3 to 5, we compare two cliques; if they have exactly one pair of unequal items, these two cliques generate a contextual rule. Lines 6 to 8 indicate that the context of the rule is X = CQ1 & CQ2, i+ = CQ1 | CQ2 − CQ2, and i− = CQ1 | CQ2 − CQ1. Lines 9 and 10 obtain the support and confidence of the rule; if the rule satisfies MinSupport and MinConfidence, it is added to the ruleset R.

5. Interestingness measure for preference rules

A core problem in knowledge discovery is to find a good interestingness measure for a model and to use it to select interesting rules from the mined patterns, thereby reducing the number of rules.

5.1. Interestingness description of association rules

Interestingness measures are generally divided into objective measures and subjective measures. The former are related to the structure of the pattern itself and the dataset used in the mining process; the latter are also related to the background knowledge of the pattern's users. There is no exact criterion for an interestingness measure; the interestingness of association rules is perhaps best treated as a broad concept that emphasizes Conciseness, Generality, Novelty, and Surprisingness.

Surprisingness. A pattern is unexpected when it contradicts a person's existing knowledge or expectations.

Novelty. A pattern that has never been discovered before and is not included in any other pattern is novel.

Conciseness. A pattern that contains relatively few key-value pairs, or a group that contains relatively few patterns, is concise. A concise pattern or pattern set is relatively easy to understand and remember.

Generality. A pattern is general when it applies to a relatively large portion of the database.

Traditional subjective measure methods require a large amount of user intervention, while objective measure methods lack consideration of background knowledge. In this paper, an interestingness measure based on a Bayesian approach is proposed, combined with a belief system built from the common preferences of all users. The model takes both subjective and objective measurement needs into account.

5.2. Distance-based interestingness measure

5.2.1. Belief system

In general, belief systems are divided into two categories:

Hard belief. A hard belief cannot be changed by new evidence; if new evidence contradicts a hard belief, it is assumed that mistakes were made in obtaining the evidence. Hard beliefs are subjective and differ for every user.

Soft belief. Users allow new evidence to change these beliefs. Each soft belief is assigned a rating value which specifies the belief degree. Common methods include the Bayesian approach, the Dempster-Shafer method, Cyc's method and so on.

These two kinds of belief systems suit traditional subjective and objective measures, but they have high updating costs and are difficult to update in time. The belief system used in this paper improves on both:

Common belief. The evidence in the belief system is given by all users, and the belief system is not changed by individual new evidence. New evidence is added to the belief system only if it reaches a certain threshold; for the same reason, existing evidence is pruned if it no longer reaches the threshold. If existing user behavior conflicts with the belief system, the behavior does not change the belief system. The belief system allows each user to retain soft beliefs, which may contradict the belief system.

The common belief is constructed by using users' common preferences as background knowledge. To improve accuracy, in the process of constructing the belief system, we propose a ruleset aggregation algorithm; after aggregation, representative common preference rules can be obtained without adding additional information.

5.2.2. Ruleset aggregation algorithm

Before measuring the interestingness of contextual preference rules with background knowledge, an important problem must be considered. Using too many rules as background knowledge makes the measurement result dominated by a certain class of similar rules and also hurts computational efficiency; on the other hand, measuring with too few rules may not fully capture the interestingness of rules. To solve this problem, we propose an algorithm that uses the average distance between rules to aggregate rules.

Definition 9 (Average internal distance of a ruleset). In the preference database ξ, the average internal distance of the ruleset RS is


described as follows:

avgdis(RS, ξ) = [ Σ_{i=1}^{n} Σ_{j=1}^{n} ( P(RS_i, ξ) − P(RS_i RS_j, ξ) ) ] / [ n × (n − 1) ]    (4)

where n is the number of rules in the ruleset, and RS_i and RS_j are rules in the ruleset. In general, the distance between rules can be defined using the probability that the rules occur simultaneously: the more often the rules of a ruleset co-occur, the smaller the average internal distance should be. Based on this, the ruleset aggregation algorithm using the average internal distance is given as Algorithm 4. The algorithm is greedy. In lines 3 to 9, the pair of rules with the farthest distance initializes the set. In lines 10 to 18, RULE, a rule from the remaining rules of the ruleset R, is added to RA such that the average internal distance of RA satisfies the threshold and avgdis(RA ∪ RULE) ≥ avgdis(RA ∪ Ri) for all Ri ∈ R, Ri ∉ RA. The condition in line 10 ensures that the loop is not executed again when no rule can be added or the set was not initialized successfully.

5.2.3. Bayesian approach

Section 5.2.2 solved the selection of background knowledge, but one problem remains before measuring the rules: the degree of correlation between rules is often difficult to describe and quantify. Therefore, we propose the concept of belief degree to describe the relationship between rules.

Definition 10 (The belief degree between rules). In a preference database ξ, the Bayesian belief degree of a rule R1 with respect to another rule R2 is described as follows:

belief(R1 → R2, ξ) = P(R1 | R2, ξ) P(R2 | ξ) / [ P(R1 | R2, ξ) P(R2 | ξ) + P(R1 | ¬R2, ξ) P(¬R2 | ξ) ]    (5)

The belief degree between rules is described by probabilistic models, usually conditional probabilities and co-occurrence probabilities. For example, let preference rule R1 state that when the context is B, the user prefers D to C, and let rule R2 state that when the context is A, the user prefers D to E. In the database shown in Table 1(b), two transactions support rule R2, so P(R2 | ξ) = 0.4. One transaction supports rule R1, so P(R1 | R2, ξ) = 0.5. In addition, we obtain P(¬R2 | ξ) = 0.6 and P(R1 | ¬R2, ξ) = 0, from which belief(R1 → R2, ξ) = 1 can be calculated.

5.3. Measuring user personalized preference rules

According to the definition of interestingness, a user rule is considered interesting if it basically conforms to a rule in the belief system or deviates from all rules in the belief system. Therefore, a formula for the interestingness of a rule R with respect to the beliefs is given based on the belief degree.

Definition 11 (The interestingness of rules). In a preference database ξ, under the belief system RA, the interestingness of a rule R is described as follows:





I(R, RA) = MAX( MAX_{i=1}^{n} P(RA_i | R, ξ), ( Σ_{i=1}^{n} ( 1 − P(RA_i | R, ξ) ) ) / n )    (6)

where n represents the number of rules in RA and RA_i is a rule in RA.
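A sketch (ours) of Eqs. (5) and (6), mirroring Algorithm 5; P is an assumed estimator of the conditional probability P(RA_i | RU_j, ξ) over the preference database. The final line checks Eq. (5) on the Section 5.2.3 example.

def belief_degree(p_r1_given_r2, p_r2, p_r1_given_not_r2):
    """Eq. (5): belief(R1 -> R2) by Bayes' rule."""
    num = p_r1_given_r2 * p_r2
    den = num + p_r1_given_not_r2 * (1.0 - p_r2)
    return num / den if den else 0.0

def interestingness(ru_j, RA, P):
    """Eq. (6) / Algorithm 5: max of best conformity and mean deviation."""
    belief, deviation = 0.0, 0.0
    for ra_i in RA:
        p = P(ra_i, ru_j)          # P(RA_i | RU_j, xi)
        belief = max(belief, p)
        deviation += 1.0 - p
    return max(belief, deviation / len(RA))

print(belief_degree(0.5, 0.4, 0.0))   # 1.0, as computed in Section 5.2.3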


The users' personalized rule interestingness measure, Algorithm 5, works as follows. Lines 2 and 4 ensure that each rule in RU is measured against the rules in RA. Line 3 initializes the values of belief and deviation, and lines 5 and 6 update the two values, ensuring that belief represents the maximum belief degree of RUj with respect to RA, while deviation/n represents the average deviation degree of RUj from RA. Line 8 selects the larger of the two as the interestingness degree of RUj.

6. Experiment result

In this section, we experimentally evaluate the performance of the MPM algorithm in the contextual preference rule extraction process and the interestingness measure process by comparing it with other cutting-edge algorithms. After briefly introducing the real dataset MovieLens in Section 6.1, we describe the different types of experiments. Firstly, in Section 6.2, we exhibit how the characteristics of the mined user preference rules change under different conditions. Then, to precisely reflect the performance of MPM, the experiments are divided into two subsections, which respectively show the efficiency of MPM in mining contextual preference rules and the accuracy of the Top-K rules mined by MPM. In Section 6.3, we compare the efficiency of mining user preferences under different conditions using the MPM, CONTENUM, Apriori, THUI, and kHMC algorithms. In Section 6.4, two belief models, TBM and EvAC, are compared, to validate the importance of the deviation degree and the advantages of our belief system. In Section 6.5, several measure methods, including the MPM, HAUIM, BPR, TFRC-Mine, THUI, TSP, TKS and Skopus algorithms, are compared, demonstrating the measurement results in terms of Recall, Precision and F1-Measure. Finally, in Section 6.6, we show the content of the belief system, the readable version of the user rulesets after filtering, and the result of the interestingness measure.

6.1. Real datasets

The real dataset used in the experiments comes from http://www.movielens.org/. The dataset consists of 10,000,054 ratings (ranging from 1 to 5) of 65,133 movies by 71,567 users. Each user has rated at least 20 movies, each of which has a release year and type. Table 2 shows the main characteristics of the dataset: the ID of each user, the number of movies rated by each user (movie dataset D), the number of distinct attributes of the rated movies (attribute set L), and the preference dataset ξ made up of the preferences generated from the ratings. In order to further compare the efficiency of the MPM algorithm with other mining algorithms, and to present both the mining results of the MPM algorithm and the interestingness measure results, we generate user profiles for all users in the dataset.

6.2. Analysis of preference rules

Our first experiment illustrates the impact of different conditions on mining contextual preference rules. Varying the minimum support threshold, the minimum confidence threshold and the database size, we observe the number of preference rules and the influence of the context length on the average support and confidence. The purpose is to investigate the characteristics of


Table 2
Real world database.

UserID      |D|    |L|    |ξ|
user16277   525    1243   20237
user3200    537    1262   42175
user351     539    1289   43579
user58050   584    1308   29635
user12280   609    1538   43992
user6338    634    1448   37026
user34      638    1580   56843
user2517    661    1855   50271
user27006   700    1878   75079
user1075    729    1757   26715
user886     734    1837   92198
user7127    738    1932   21612
user24240   758    2126   50339
user40651   790    1928   70386
user585     812    1934   101007
user65990   830    1448   15363
user7315    865    2295   59116
user40535   899    2321   46814
user17670   930    2250   132083
user3793    1007   2687   136396

Rule 1 indicates that the user prefers Action to Comedy unconditionally; its support is 0.079545 and its confidence is 1.00000. Rule 3 shows that the user unconditionally prefers Romance over Drama, with a support of 0.369318 and a confidence of 0.73864. Rule 9, which adds context to Rule 3, indicates that the user prefers Romance to Drama among movies of the Comedy class; although the support of Rule 9 decreases to 0.102273, its confidence increases to 0.81818.

Table 3
10 preference rules for user 430.

No.  i+         i−       Context          Support   Confidence
1    Action     Comedy   NULL             0.079545  1.00000
2    Action     War      Romance          0.011364  1.00000
3    Romance    Drama    NULL             0.369318  0.73864
4    Thriller   Romance  Comedy           0.022727  1.00000
5    Comedy     Drama    Romance          0.181818  0.71111
6    Animation  Romance  Comedy           0.011364  1.00000
7    Animation  Drama    Comedy Romance   0.005682  1.00000
8    Comedy     War      Romance          0.068182  0.85714
9    Romance    Drama    Comedy           0.102273  0.81818
10   Thriller   Romance  NULL             0.0625    1.00000

contextual preference rules and to provide reference points for the subsequent experiments. In Fig. 5a, with a fixed minimum support threshold and a fixed minimum confidence threshold, the number of contextual preference rules increases almost linearly with the size of the database. Figs. 5b and 5c use a fixed minimum support threshold and a fixed minimum confidence threshold, respectively, to investigate their influence on the number of rules; the number of rules decreases as either threshold increases. Interestingly, the different minimum support thresholds in Fig. 5b barely change the slopes of the three curves, and a similar conclusion can be drawn from Fig. 5c, which shows that support and confidence are two relatively independent quantities. On the other hand, many rules remain even when the minimum confidence threshold is 0.99, which shows that these contextual preference rules are extremely reliable for a user. In Figs. 5d and 5e, we use extremely low support and confidence thresholds to show how the average support and confidence of rules change with context length. It is worth noting that, as the context length of rules increases, the support of rules generally decreases but the confidence increases. This is quite consistent with general perception: a specific rule covers a smaller subset of a dataset but gives more accurate results, while a generalized rule covers a larger range and gives more ambiguous results. The 10 preference rules of user No. 430 obtained by the MPM algorithm (out of a total of 74 rules in the user's preference ruleset) are shown in Table 3.

6.3.1. Comparison with general mining algorithms

The first part evaluates the improvement of MPM in mining all preference rules under various conditions. Comparison with the classic Apriori algorithm (Agrawal et al., 1994) and the partition algorithms (Savasere et al., 1995) shows the improvement in the process of mining frequent itemsets. Comparison with the latest bottom-up CONTENUM algorithm (Amo et al., 2015) shows the utility of Proposition 2, including its pruning ability. We constantly increase the size of the preference database in order to investigate the algorithms' dependence on database I/O with fixed minimum support and minimum confidence. Changing the minimum confidence threshold reflects the pruning effect of the minimum support threshold while the minimum support threshold and the preference database size are fixed. The impact of an increased number of frequent itemsets can be examined by changing the minimum support threshold while the minimum confidence threshold and the preference database size are fixed; at the same time, the number of long frequent itemsets also increases under a smaller support, and this phenomenon is also worthy of attention.

Fig. 6a shows the runtime of the five algorithms as the database size varies (under different fixed values of minimum support and confidence). The time overhead of the algorithms grows mostly linearly, but the slopes of the curves differ considerably. The Apriori-based algorithm and the partition-based algorithms (partition-10 and partition-20) take about 50% more time than the MPM algorithm, because MPM requires fewer database scans in the process of generating cliques. CONTENUM first generates context-free rules that satisfy the minimum support and minimum confidence thresholds and then adds context to these rules, which causes many unnecessary database scans, whereas MPM and the other algorithms use frequent itemsets to discover potential rules based on Proposition 1, which greatly reduces the solution space. As a result, the performance of CONTENUM is the worst, and its disadvantages become more conspicuous as the size of the preference database or the number of tags increases.

Fig. 6b indicates how the execution times vary as the minimum support threshold ranges from 0.005 to 0.05 with a minimum confidence threshold of 0.7. The execution time rapidly decreases as the minimum support threshold grows, because MPM, the Apriori algorithm and the partition algorithms all generate rules after generating frequent itemsets, and the minimum support threshold greatly influences the number of frequent itemsets. As the number of frequent itemsets increases, the number of generated rules also increases, which increases the number of database scans. It is worth noting that the execution time curves of Apriori, partition-10, partition-20 and MPM gradually approach each other as the minimum support threshold increases, due to the gradually shrinking proportion of time spent generating frequent itemsets. The CONTENUM algorithm adds context to the rules


Fig. 5. Features of preference rules under different conditions.

that satisfy the minimum support threshold to generate new rules, and a change of the minimum support threshold greatly enlarges the set of candidate rules. The experiments in Section 6.2 show that a decrease of the minimum support threshold causes rules with longer contexts to appear, which undoubtedly significantly increases the number of database scans of the CONTENUM algorithm. Fig. 6c shows the execution time of the algorithms when the minimum confidence threshold varies between 0.5 and 1 and the minimum support is 0.03. As the minimum confidence threshold


changes, it can be seen that the execution time of all algorithms does not change appreciably. This is because, on one hand, in MPM, partition-10, partition-20, and Apriori, the minimum support threshold drives the execution time by influencing the process of generating cliques and the number of generated cliques; once clique generation is completed, the workload for generating rules from the cliques is determined, and the minimum confidence threshold only affects the number of rules retained. On the other hand, the experiments in Section 6.2 show that the confidence of a rule tends to increase with its context length. Hence, in CONTENUM, as long as a rule satisfies the support threshold, the algorithm will try to add a new tag in turn to create rules with longer context in order to find potential rules. Thus, the minimum confidence threshold can only change the number of rules retained and cannot narrow the search space.

From this part, we can see that MPM performs better in most cases, especially in two aspects: it considerably improves the efficiency of mining frequent itemsets, and, based on Propositions 1 and 2, preference rules can be generated quickly once the frequent itemsets are mined.
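The following is a minimal sketch of the frequent-itemset-to-rule pipeline discussed above: every pair of items inside a frequent itemset yields candidate rules whose remaining tags form the context, and the confidence threshold only filters finished candidates rather than shrinking the search space. The record layout and the exact derivation of candidates from itemsets are illustrative assumptions; the actual MPM clique machinery is not shown.

```python
from itertools import combinations

def confidence(db, i_plus, i_minus, context):
    """Fraction of context-matching comparisons of i+/i- won by i+."""
    wins = total = 0
    for tags, winner, loser in db:
        if context <= tags and {winner, loser} == {i_plus, i_minus}:
            total += 1
            wins += winner == i_plus
    return wins / total if total else 0.0

def rules_from_frequent(frequent, db, min_conf):
    """frequent: iterable of sets of tags that already met min support."""
    kept = []
    for itemset in frequent:
        for a, b in combinations(itemset, 2):
            context = frozenset(itemset - {a, b})
            for i_plus, i_minus in ((a, b), (b, a)):
                c = confidence(db, i_plus, i_minus, context)
                if c >= min_conf:        # confidence filters, never prunes
                    kept.append((i_plus, i_minus, context, c))
    return kept
```

Note that `min_conf` appears only after a candidate is fully formed, which is why varying the confidence threshold barely changes the runtime curves in Fig. 6c.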

Fig. 6. Execution time of algorithms.


6.3.2. Compared with Top-K mining algorithms

In some strategies, Top-K preference rules can be discovered directly without mining all preference rules: Top-K preference rules are generated from Top-M (M possibly not equal to K) utility itemsets. Using several K values as exemplars, under experimental conditions similar to part one, the second part compares the performance of MPM with two state-of-the-art algorithms, THUI (Krishnamoorthy, 2019) and kHMC (Duong et al., 2016), under various conditions, in order to investigate the differences among Top-K methods. To focus on the effects of varying K, similar phenomena already discussed in part one are omitted.

In Fig. 7, THUI and kHMC show almost no runtime difference in any case. The reason is that generating Top-K preference rules from Top-M utility itemsets costs exactly the same time in both; the only difference lies in the time spent mining the Top-M utility itemsets, and the improvement in that mining step is a minor part of the whole process. More precisely, in Fig. 7a and d, growing the database size increases the runtime of THUI and kHMC in a similar way. In Fig. 7b and e, raising the support threshold does not reduce the runtime of THUI and kHMC as significantly as it does for MPM, because a high support threshold makes it harder to generate preference rules from high utility itemsets even though the search space shrinks. In Fig. 7c and f, the runtime of THUI and kHMC declines as the confidence threshold increases when mining all rules. In the other cases (K = 10, 20, 50), noticeably, the runtime increases instead, because the confidence threshold forces the search to extend in order to find enough preference rules; when discovering all preference rules, the search space is fixed and the confidence threshold can slightly reduce the runtime.

Interestingly, the runtime of THUI and kHMC does not grow linearly with K but grows rapidly as K becomes larger, which means that mining Top-K preference rules directly does significantly reduce the runtime when K is small. When K <= 50 (the crossover lies at about K = 80), THUI and kHMC finish the mining tasks in a shorter time than MPM. However, when K becomes larger and eventually reaches the maximum, MPM outperforms THUI and kHMC. This demonstrates that, compared with finding Top-M utility itemsets and using them to generate preference rules, generating rules from frequent itemsets and then evaluating them is more efficient in some cases, especially when K is large enough.
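As a generic illustration (not the paper's algorithm), the following sketch shows the standard way to keep only the K best rules while candidates stream in, using a bounded min-heap so low-scoring rules can be discarded early; this is the kind of selection step that direct Top-K miners exploit when K is small.

```python
import heapq

def top_k(candidates, k):
    """candidates: iterable of (score, rule) pairs.
    Returns the k highest-scoring rules, best first."""
    heap = []                          # min-heap of at most k entries
    for i, (score, rule) in enumerate(candidates):
        if len(heap) < k:
            heapq.heappush(heap, (score, i, rule))
        elif score > heap[0][0]:       # beats the current worst kept rule
            heapq.heapreplace(heap, (score, i, rule))
    return [rule for score, _, rule in sorted(heap, reverse=True)]
```

The counter `i` only breaks ties so that unorderable rule objects are never compared. When K approaches the total number of rules, the heap no longer discards anything, which mirrors why MPM overtakes THUI and kHMC for large K.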

6.4. Comparison of belief model

In the following sections, to exhibit the advantages of our belief model, several evaluation metrics are used, including:


Fig. 7. Comparison with THUI and kHMC.

1. Recall. Under the preference database ξ, the Recall of a preference ruleset S is calculated as:

\[ \mathit{Recall}(S, \xi) = \frac{|\{\langle t, u\rangle \in \xi \mid t \succ u\}|}{|\xi|} \]

2. Precision. Under the preference database ξ, the Precision of a preference ruleset S is calculated as:

\[ \mathit{Prec}(S, \xi) = \frac{|\{\langle t, u\rangle \in \xi \mid t \succ u\}|}{|\{\langle t, u\rangle \in \xi \mid t \succ u \vee u \succ t\}|} \]

3. F1-Measure. Under the preference database ξ, the F1-Measure of a preference ruleset S is calculated as:

\[ \mathit{F1\text{-}Measure}(S, \xi) = \frac{2 \times \mathit{Recall}(S, \xi) \times \mathit{Prec}(S, \xi)}{\mathit{Recall}(S, \xi) + \mathit{Prec}(S, \xi)} \]

where t ≻ u denotes that the ruleset S orders t before u, in agreement with the pair ⟨t, u⟩ in ξ.
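The following is a minimal sketch of these three metrics, assuming the test set is a list of (t, u) pairs whose true order is t over u, and that a (hypothetical) `predict(ruleset, t, u)` callable returns 't', 'u', or None when the ruleset abstains; the denominator of Precision counts only the decided pairs, matching the formula above.

```python
def evaluate(ruleset, test_pairs, predict):
    """Returns (recall, precision, f1) of ruleset over test_pairs."""
    correct = sum(1 for t, u in test_pairs
                  if predict(ruleset, t, u) == 't')        # agrees with truth
    decided = sum(1 for t, u in test_pairs
                  if predict(ruleset, t, u) is not None)    # any prediction
    recall = correct / len(test_pairs) if test_pairs else 0.0
    precision = correct / decided if decided else 0.0
    f1 = (2 * recall * precision / (recall + precision)
          if recall + precision else 0.0)
    return recall, precision, f1
```

A ruleset that abstains often can thus score high precision but low recall, which is exactly the trade-off discussed for novel rules below.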

In the following, we compare our belief system with two leading-edge models, TBM (Guil, 2019) and EvAC (Samet, Lefévre, & Yahia, 2016). Different belief functions are applied in these models, which is the main reason for the differences among the results.

In Table 4a, overall, the three models have comparable performance. In part, especially when K < 10, MPM gains the best performance. As the K value increases, TBM and EvAC outperform MPM. This is because the belief functions used in TBM and EvAC focus only on consistency, whereas MPM can select rules that are novel or general by using the belief degree and the deviation degree; as we know, a novel rule generally covers a smaller range but has a higher precision rate.

In Table 4b, as can be seen, MPM performs best among the three algorithms, because MPM, unlike TBM and EvAC, intentionally chooses deviational rules by using the deviation degree. Since these deviational rules are novel or personalized and cover a relatively small range while achieving extremely high precision (some even close to 1), they significantly influence the average precision of the Top-K. Interestingly, although EvAC and TBM

are both based on the transferable belief model, the result of EvAC is better than that of TBM, because the precise support and the weight of rules are applied in EvAC. The F1-Measure results in Table 4c show that the belief system used in MPM is, overall, better than TBM and EvAC. MPM is not obviously better than EvAC and TBM on recall, and is even worse as K increases; its advantage comes from precision, which is why its result on F1-Measure is superior. This confirms the importance of novel patterns.

6.5. Comparison of measure algorithms

To provide an extensive experiment, the performance of MPM is compared with several algorithms, including the BPR algorithm (Rendle, Freudenthaler, Gantner, & Schmidt-Thieme, 2009), the TFRC-Mine algorithm (Amphawan & Lenca, 2015), the HAUIM algorithm (Lin et al., 2018), and the THUI algorithm (Krishnamoorthy, 2019).

Fig. 8a compares the recall of all algorithms as K varies from 5 to 50. It can be observed that MPM achieves the highest recall, which reflects that rules conforming to the belief system can cover a larger range; BPR also performs well. However, THUI, HAUIM, and TFRC-Mine perform poorly, especially when K is small, which suggests that the link between a rule generated from high utility itemsets and its coverage is not strong. Furthermore, when K is large, the differences among the algorithms become smaller, but the Top-K rules selected by MPM still have the highest recall.

Fig. 8b compares the precision of all algorithms as K varies from 5 to 50. It can be observed that the precision of the Top-K preference rules selected by MPM is conspicuously higher than the others, especially at Top-10. And in most cases, the precision


Table 4
Model evaluation.

(a) Recall
        K=5     K=10    K=20    K=30    K=40    K=50
TBM     0.2492  0.2743  0.3659  0.4265  0.4705  0.4990
EvAC    0.2546  0.2761  0.3680  0.4297  0.4740  0.5030
MPM     0.2606  0.2878  0.3493  0.3936  0.4292  0.4607

(b) Precision
        K=5     K=10    K=20    K=30    K=40    K=50
TBM     0.7395  0.6942  0.6512  0.6404  0.6425  0.6471
EvAC    0.7521  0.7006  0.6574  0.6479  0.6512  0.6567
MPM     0.8085  0.7903  0.7453  0.7213  0.7094  0.6983

(c) F1-Measure
        K=5     K=10    K=20    K=30    K=40    K=50
TBM     0.3728  0.3932  0.4685  0.5120  0.5432  0.5635
EvAC    0.3804  0.3961  0.4719  0.5167  0.5487  0.5697
MPM     0.4860  0.5167  0.5733  0.6076  0.6335  0.6536

Table 5
Comparison results.

(a) Recall
        K=5     K=10    K=20    K=30    K=40    K=50
TKS     0.2041  0.2571  0.3414  0.3971  0.4365  0.4615
TSP     0.2030  0.2554  0.3394  0.3942  0.4332  0.4578
Skopus  0.1800  0.2266  0.3141  0.3781  0.4227  0.4491
MPM     0.2606  0.2878  0.3493  0.3936  0.4292  0.4607

(b) Precision
        K=5     K=10    K=20    K=30    K=40    K=50
TKS     0.7855  0.7410  0.6895  0.6738  0.6717  0.6718
TSP     0.7888  0.7437  0.6931  0.6773  0.6753  0.6758
Skopus  0.8269  0.7623  0.6906  0.6791  0.6764  0.6746
MPM     0.8085  0.7903  0.7453  0.7213  0.7094  0.6983

(c) F1-Measure
        K=5     K=10    K=20    K=30    K=40    K=50
TKS     0.3240  0.3817  0.4567  0.4997  0.5291  0.5471
TSP     0.3229  0.3802  0.4557  0.4984  0.5278  0.5458
Skopus  0.2956  0.3494  0.4318  0.4858  0.5203  0.5392
MPM     0.4860  0.5167  0.5733  0.6076  0.6335  0.6536

of all algorithms decreases as K increases and eventually becomes similar across algorithms. Moreover, the decline of MPM and BPR is more pronounced than the others. Interestingly, HAUIM achieves relatively good performance when K approaches half of the number of all rules, because HAUIM focuses on average utility.

Fig. 8c compares the F1-Measure of all algorithms as K varies from 5 to 50. F1-Measure is a comprehensive factor that reflects both recall and precision, and the ranking of the algorithms is MPM, BPR, TFRC-Mine, HAUIM, THUI.

For further investigation, we introduce the TSP algorithm (Tzvetkov, Yan, & Han, 2005), the TKS algorithm (Fournier-Viger, Gomariz, Gueniche, Mwamikazi, & Thomas, 2013), and the Skopus algorithm (Petitjean, Li, Tatti, & Webb, 2016). All of these algorithms can mine Top-K sequential patterns; the differences are that TSP mines Top-K closed sequential patterns and Skopus mines Top-K sequential patterns based on leverage. The performance of the four algorithms is compared by varying the K value, and Table 5 shows the results.

In Table 5a, MPM has higher recall than all other algorithms, especially when K is small. This is because MPM selects more rules that are consistent with the belief system, which are concise. As K increases, the gap becomes insignificant, because more and more novel rules are added (the result on Precision confirms this conclusion). TKS and TSP have very similar performance, because the only difference between them is whether the mined patterns are closed. Skopus, however, performs worse here, because the patterns mined by TSP and TKS are sorted by support while Skopus ranks patterns by leverage. This shows that rules with higher support can possibly gain higher recall.

In Table 5b, MPM performs best overall, and the second is Skopus; TKS and TSP again obtain similar results. Interestingly, when K = 5 the precision of Skopus is higher than that of MPM, but it becomes lower when K >= 10. This supports the conclusion proposed in the last paragraph and indicates that the Top-5 rules mined by Skopus cover the smallest range because Skopus chooses the most precise rules first.

Table 5c shows the results on F1-Measure. As mentioned before, F1-Measure comprehensively describes the results of Precision and Recall. MPM is far better than TKS, TSP, and Skopus in the interestingness measure field, but when K is small, Skopus is also worth considering.

In conclusion, this section confirms that mining all preference rules does cost more time but, in most cases, gains a better result: preference rules and the itemsets that constitute them do have a potential relation, but it is not strong enough, so generating all rules and then finding the Top-K pays off. Moreover, with the supervision information, MPM outperforms the cutting-edge algorithm BPR, since the belief system enables MPM to recognize and filter rules that strongly conform to users' personal preferences or cover a tremendous range.

6.6. Interestingness measure results

The purpose of this experiment is to illustrate the results and effects of the interestingness measure on contextual preference rules and the deficiencies of the support-confidence framework. At the same time, we show the comparison of preference rulesets before and after filtering, and the filtered user preference ruleset.

Table 6a shows the 23 contextual preference rules for all users with support = 0.05 and confidence = 0.7. It can be seen that all the rules contain no context, because using a high minimum support threshold on a large preference database drastically reduces the length of the context. The average support of the 23 preference rules is used as the threshold of the minimum average internal distance. The filtered results are shown in Table 6b. Only 16 preference rules remain after filtering, sorted by the order in which they were added to the ruleset. It can be found that rules of high support are easily added to the ruleset, but this is not entirely the case, because the average internal distance mainly reflects the correlation between the rules. When the frequency is considered as a probability and the rules are not independently and identically distributed, P(AB) ≠ P(A)P(B), so the support is only one of the conditions that affect the degree of correlation.
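The following is a speculative sketch of the aggregation step just described: rules are visited in descending support order, and a rule joins the common ruleset only if its average distance to the rules already kept stays above the threshold. The Jaccard-style distance over (i+, i−) pairs is purely an illustrative assumption; the paper's average internal distance is defined earlier in the document and may differ.

```python
def distance(r1, r2):
    """Crude rule distance: 0 for identical item pairs, 1 for disjoint."""
    a, b = {r1[0], r1[1]}, {r2[0], r2[1]}
    return 1.0 - len(a & b) / len(a | b)

def aggregate(rules, threshold):
    """rules: list of (i_plus, i_minus, support) tuples.
    Greedily keeps rules that are sufficiently uncorrelated."""
    kept = []
    for rule in sorted(rules, key=lambda r: -r[2]):   # high support first
        if not kept:
            kept.append(rule)
            continue
        avg = sum(distance(rule, k) for k in kept) / len(kept)
        if avg >= threshold:       # far enough from rules already kept
            kept.append(rule)
    return kept
```

Under this reading, a high-support rule can still be rejected when it is too correlated with rules already in the set, which is consistent with the observation that support alone does not decide membership.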


Table 6
Common preference ruleset.

(a) Original common preference ruleset
No.  i+           i−          Context  Support   Confidence
1    Action       Comedy      NULL     0.028726  61.36311
2    Action       Horror      NULL     0.021224  93.49478
3    Drama        Adventure   NULL     0.035279  62.79739
4    Adventure    Horror      NULL     0.012471  88.25324
5    Animation    Comedy      NULL     0.010046  84.72727
6    Animation    Drama       NULL     0.012568  77.21854
7    Drama        Children's  NULL     0.022312  62.78435
8    Crime        Comedy      NULL     0.017968  68.17996
9    Documentary  Comedy      NULL     0.013926  64.31060
10   Comedy       Horror      NULL     0.050488  87.19285
11   Mystery      Comedy      NULL     0.010930  69.16780
12   Romance      Comedy      NULL     0.033328  82.38742
13   War          Comedy      NULL     0.020555  77.52033
14   Crime        Horror      NULL     0.013786  93.90602
15   Documentary  Horror      NULL     0.010596  92.64844
16   Drama        Horror      NULL     0.098239  90.76785
17   Romance      Drama       NULL     0.044711  75.94288
18   War          Drama       NULL     0.024921  68.42261
19   Romance      Horror      NULL     0.023746  97.56422
20   Sci-Fi       Horror      NULL     0.010477  90.16698
21   Thriller     Horror      NULL     0.026063  91.97413
22   War          Horror      NULL     0.016352  96.74745
23   Romance      Thriller    NULL     0.011081  74.65505

(b) Common preference ruleset after aggregation
No.  i+           i−          Context  Support   Confidence
1    Drama        Horror      NULL     0.098239  90.767852
2    Romance      Drama       NULL     0.044711  75.942878
3    Comedy       Horror      NULL     0.050488  87.192852
4    Romance      Comedy      NULL     0.033328  82.387423
5    War          Drama       NULL     0.024921  68.422610
6    Thriller     Horror      NULL     0.026063  91.974135
7    Drama        Children's  NULL     0.022312  62.784349
8    War          Comedy      NULL     0.020555  77.520325
9    Action       Horror      NULL     0.021224  93.494777
10   Romance      Horror      NULL     0.023746  97.564216
11   Crime        Comedy      NULL     0.017968  68.179959
12   Documentary  Comedy      NULL     0.013926  64.310602
13   War          Horror      NULL     0.016352  96.747449
14   Animation    Drama       NULL     0.012568  77.218543
15   Crime        Horror      NULL     0.013786  93.906021
16   Mystery      Comedy      NULL     0.010930  69.167804

Fig. 8. Evaluation of methods.

We use the filtered common ruleset to measure the interestingness of each user's ruleset. The filtered ruleset of user No. 430 is shown in Table 7, where K = 10. It can be found that 6 out of 10 rules have a score of 1.0, indicating that the common preference ruleset covers a large number of user preferences well and that these 6 rules are stable and credible. The remaining 4 rules score 0.9, indicating that they deviate highly from the common preference rules, i.e., they are novel or unexpected rules. These rules are also interesting yet hard to describe within the support-confidence framework.

Fig. 9a shows the three-dimensional distribution of 235,687 contextual preference rules under the supp-conf-interestingness framework. The distribution of the rules retained by Top-K ranking (K = 10) based on the interestingness for each user is shown in Fig. 9b. The rules with a high interestingness degree are not concentrated in areas of high support and high confidence, further illustrating that the degree of interestingness cannot depend solely on the support-confidence measure.

Fig. 10a shows the effect of different minimum support and confidence thresholds on the rules. Comparing line 1 and line 3, it can be seen that an increase of the minimum support threshold greatly increases the proportion of rules with shorter context. Comparing line 1 and line 2 reveals an interesting phenomenon: it has been concluded in Section 6.2 that the average


Table 7
Top-10 rules of user 430.
No.  i+        i−       Context          Support   Confidence  Interestingness degree
1    Crime     Drama    NULL             0.011364  1.00000     1.00
2    Sci-Fi    Comedy   NULL             0.017045  1.00000     1.00
3    Mystery   Comedy   NULL             0.017045  1.00000     1.00
4    Crime     Comedy   NULL             0.022727  1.00000     1.00
5    Crime     Romance  Drama            0.011364  1.00000     1.00
6    Action    Romance  Drama            0.011364  1.00000     1.00
7    Comedy    Drama    Romance          0.181818  0.71111     0.90
8    Thriller  Drama    Comedy, Romance  0.011364  1.00000     0.90
9    Romance   War      NULL             0.102273  0.81818     0.90
10   Thriller  War      NULL             0.022727  1.00000     0.90

Fig. 9. Distribution of rules in supp-conf-interestingness.

confidence of rules with longer context is higher; however, a higher minimum confidence threshold does not greatly change the distribution of rules over the context length. Fig. 10b compares the context-length distributions of the preference rules before and after Top-K ranking when the minimum support threshold = 0.01 and the minimum confidence threshold = 0.7. It can be seen that the distribution of the rules has changed significantly: the peak of the curve has moved toward the middle. The proportion of rules with empty context, originally the largest, has dropped; rules with context length 1 or 2 are better preserved; only a few rules with context length 3 are kept; and the very few rules with context length longer than 4 do not enter the Top-K at all. This shows that the Top-K interesting rules do not simply tend to be short-context rules that are concise and generalizable.

Fig. 10. Percentage of rules with different length under different conditions.

Under the supervision of the generalized background knowledge, the algorithm anticipates well and makes more accurate choices based on the Novelty and Surprisingness criteria, which are difficult to capture in the support-confidence framework. Interestingly, most rules with long contexts do not receive a high score during the measuring process. This is because specific rules do not conform to Generality and Conciseness, and it is difficult for a rule to be novel and surprising when it carries a lot of contextual information.


Although interestingness cannot be measured simply with the support-confidence framework, it still retains some similar characteristics. For example, high support reflects Conciseness and Generality. On the other hand, the support-confidence framework has difficulty identifying Novelty and Surprisingness, because rules with Novelty and Surprisingness rarely have high support and high confidence. Therefore, whether a rule is interesting cannot be judged unilaterally from any single measure; as many factors as possible should be considered.

7. Conclusion and future work

In this paper, we propose MPM, a mining algorithm for preference rules, which is used to mine user preferences from a preference database. We also propose RSA, a ruleset aggregation algorithm based on the average internal distance of the ruleset, which generates background knowledge to construct a belief system; the belief system is then used by the interestingness measure algorithm PIM to evaluate the preference rules of a single user and obtain the Top-K rules. This reduces the size of the rulesets and improves the accuracy of the rules. By conducting a series of experiments on real user preference sets, we show that the proposed algorithm is efficient and that the selected rules are accurate. Moreover, the conditional preference rules based on context are presented in a readable form.

Preference mining technology will become more and more important in the field of personalized queries, and it is also indispensable in the development of the recommender systems commonly used in E-commerce. The development of preference technology makes automatic reasoning over user preferences in a concise form more meaningful, and the scalability problem is expected to be solved. The future work includes the following aspects:

1. Establish a complete interestingness measure system: add more interestingness criteria to the measuring process, apply more interestingness indicators to describe rules, and select a few of the most accurate and effective criteria to measure the interestingness of rules in a variety of situations.

2. Introduce a clustering or classification algorithm into the interestingness measure system to classify tags (or items), enabling the system to automatically select criteria and the corresponding evaluation functions.

3. Under the traditional support-confidence framework, consider positive and negative associations at the same time, and combine more tags to verify and improve the performance of the MPM algorithm.

Declaration of Competing Interest

We declare that we have no conflicts of interest to this work.

Credit authorship contribution statement

Zheng Tan: Investigation, Writing - original draft, Writing - review & editing. Hang Yu: Data curation, Resources, Software, Supervision. Wei Wei: Validation, Visualization. Jinglei Liu: Conceptualization, Project administration, Methodology, Formal analysis.

Acknowledgments

We gratefully acknowledge the detailed and helpful comments of the anonymous reviewers, who have enabled us to considerably improve this paper. This work was supported by the Natural Science Foundation of China (61572419, 61773331, 61572418, 61703360).


References

Agrawal, R., Rantzau, R., & Terzi, E. (2006). Context-sensitive ranking. In Proceedings of the 2006 ACM SIGMOD international conference on management of data (pp. 383–394). ACM.
Agrawal, R., Srikant, R., et al. (1994). Fast algorithms for mining association rules. In Proc. 20th int. conf. very large data bases, VLDB: 1215 (pp. 487–499).
Amo, S. D., Diallo, M. S., Diop, C. T., Giacometti, A., Li, D., & Soulet, A. (2015). Contextual preference mining for user profile construction. Information Systems, 49, 182–199.
Amphawan, K., & Lenca, P. (2015). Mining top-k frequent-regular closed patterns. Expert Systems with Applications, 42(21), 7882–7894.
Cao, Y., & Li, Y. (2007). An intelligent fuzzy-based recommendation system for consumer electronic products. Expert Systems with Applications, 33(1), 230–240.
Chen, W., Xie, C., Shang, P., & Peng, Q. (2017). Visual analysis of user-driven association rule mining. Journal of Visual Languages & Computing, 42, 76–85.
Costa, J. A. R., & de Amo, S. (2016). Improving pairwise preference mining algorithms using preference degrees. JIDM, 7(2), 86–98.
Davey, B. A., & Priestley, H. A. (2002). Introduction to lattices and order. Cambridge University Press.
De Sá, C. R., Soares, C., Jorge, A. M., Azevedo, P., & Costa, J. (2011). Mining association rules for label ranking. In Pacific-Asia conference on knowledge discovery and data mining (pp. 432–443). Springer.
Dembczyński, K., Kotłowski, W., Słowiński, R., & Szeląg, M. (2010). Learning of rule ensembles for multiple attribute ranking problems. In Preference learning (pp. 217–247). Springer.
Duong, Q.-H., Liao, B., Fournier-Viger, P., & Dam, T.-L. (2016). An efficient algorithm for mining the top-k high utility itemsets, using novel threshold raising and pruning strategies. Knowledge-Based Systems, 104, 106–122.
Fan, W., Wang, X., Wu, Y., & Xu, J. (2015). Association rules with graph patterns. Proceedings of the VLDB Endowment, 8(12), 1502–1513.
Fournier-Viger, P., Gomariz, A., Gueniche, T., Mwamikazi, E., & Thomas, R. (2013). TKS: Efficient mining of top-k sequential patterns. In International conference on advanced data mining and applications (pp. 109–120). Springer.
Freitas, A. A. (1999). On rule interestingness measures. In Research and development in expert systems XV (pp. 147–158). Springer.
Geng, L., & Hamilton, H. J. (2006). Interestingness measures for data mining: A survey. ACM Computing Surveys (CSUR), 38(3), 9.
Glass, D. H. (2013). Confirmation measures of association rule interestingness. Knowledge-Based Systems, 44, 65–77.
Guil, F. (2019). Associative classification based on the transferable belief model. Knowledge-Based Systems.
Győrödi, C., Győrödi, R., & Holban, S. (2004). A comparative study of association rules mining algorithms. Hungarian joint symposium on applied computational intelligence, Oradea.
Ha, V., & Haddawy, P. (2003). Similarity of personal preferences: Theoretical foundations and empirical analysis. Artificial Intelligence, 146(2), 149–173.
Holland, S., Ester, M., & Kießling, W. (2003). Preference mining: A novel approach on mining user preferences for personalized applications. In European conference on principles of data mining and knowledge discovery (pp. 204–216). Springer.
Hu, K., Lu, Y., Zhou, L., & Shi, C. (1999). Integrating classification and association rule mining: A concept lattice framework. In International workshop on rough sets, fuzzy sets, data mining, and granular-soft computing (pp. 443–447). Springer.
Hüllermeier, E., Fürnkranz, J., Cheng, W., & Brinker, K. (2008). Label ranking by learning pairwise preferences. Artificial Intelligence, 172(16–17), 1897–1916.
Jabbour, S., Sais, L., & Salhi, Y. (2017). Mining top-k motifs with a SAT-based framework. Artificial Intelligence, 244, 30–47.
Ju, C., Bao, F., Xu, C., & Fu, X. (2015). A novel method of interestingness measures for association rules mining based on profit. Discrete Dynamics in Nature and Society, 2015.
Jung, K.-Y., & Lee, J.-H. (2004). User preference mining through hybrid collaborative filtering and content-based filtering in recommendation system. IEICE Transactions on Information and Systems, 87(12), 2781–2790.
Kamsu-Foguem, B., Rigal, F., & Mauget, F. (2013). Mining association rules for the quality improvement of the production process. Expert Systems with Applications, 40(4), 1034–1045.
Krishnamoorthy, S. (2019). Mining top-k high utility itemsets with effective threshold raising strategies. Expert Systems with Applications, 117, 148–165.
Le, T., Son, T. C., & Pontelli, E. (2018). Multi-context systems with preferences. Fundamenta Informaticae, 158(1–3), 171–216.
Le, T., Vo, B., & Baik, S. W. (2018). Efficient algorithms for mining top-rank-k erasable patterns using pruning strategies and the subsume concept. Engineering Applications of Artificial Intelligence, 68, 1–9.
Li, W., Han, J., & Pei, J. (2001). CMAR: Accurate and efficient classification based on multiple class-association rules. In Proceedings 2001 IEEE international conference on data mining (pp. 369–376). IEEE.
Lin, J. C.-W., Ren, S., Fournier-Viger, P., Pan, J.-S., & Hong, T.-P. (2018). Efficiently updating the discovered high average-utility itemsets with transaction insertion. Engineering Applications of Artificial Intelligence, 72, 136–149.
Liu, J.-L., & Liao, S.-Z. (2015). Expressive efficiency of two kinds of specific CP-nets. Information Sciences, 295(2), 379–394.
Lucas, T., Silva, T. C., Vimieiro, R., & Ludermir, T. B. (2017). A new evolutionary algorithm for mining top-k discriminative patterns in high dimensional data. Applied Soft Computing, 59, 487–499.
Ma, W.-M., Wang, K., & Liu, Z.-P. (2011). Mining potentially more interesting association rules with fuzzy interest measure. Soft Computing, 15(6), 1173–1182.
Nguyen, T. A., Do, M., Gerevini, A. E., Serina, I., Srivastava, B., & Kambhampati, S. (2012). Generating diverse plans to handle unknown and partially known user preferences. Artificial Intelligence, 190, 1–31.
Pereira, F. S., & de Amo, S. (2010). Evaluation of conditional preference queries. Journal of Information and Data Management, 1(3), 503.
Petitjean, F., Li, T., Tatti, N., & Webb, G. I. (2016). Skopus: Mining top-k sequential patterns under leverage. Data Mining and Knowledge Discovery, 30(5), 1086–1111.
Rendle, S., Freudenthaler, C., Gantner, Z., & Schmidt-Thieme, L. (2009). BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence (pp. 452–461). AUAI Press.
Sá de, C. R., Azevedo, P., Soares, C., Jorge, A. M., & Knobbe, A. (2018). Preference rules for label ranking: Mining patterns in multi-target relations. Information Fusion, 40, 112–125.
Samet, A., Lefévre, E., & Yahia, S. B. (2016). Evidential data mining: Precise support and confidence. Journal of Intelligent Information Systems, 47(1), 135–163.
Savasere, A., Omiecinski, E. R., & Navathe, S. B. (1995). An efficient algorithm for mining association rules in large databases. Technical Report. Georgia Institute of Technology.
Shambour, Q., & Lu, J. (2011). Integrating multi-criteria collaborative filtering and trust filtering for personalized recommender systems. In Computational intelligence in multicriteria decision-making (MCDM), 2011 IEEE symposium on (pp. 44–51). IEEE.
Sun, X., Liu, J., & Wang, K. (2017). Operators of preference composition for CP-nets. Expert Systems with Applications, 86, 32–41.
Toivonen, H., et al. (1996). Sampling large databases for association rules. In VLDB: 96 (pp. 134–145).
Tzvetkov, P., Yan, X., & Han, J. (2005). TSP: Mining top-k closed sequential patterns. Knowledge and Information Systems, 7(4), 438–457.
Zaki, M. J. (2000). Scalable algorithms for association mining. IEEE Transactions on Knowledge and Data Engineering, 12(3), 372–390.
Zaki, M. J., Parthasarathy, S., Li, W., & Ogihara, M. (1997). Evaluation of sampling for data mining of association rules. In Research issues in data engineering, 1997. Proceedings. Seventh international workshop on (pp. 42–50). IEEE.
Zeng, K. (2016). Preference mining using neighborhood rough set model on two universes. Computational Intelligence and Neuroscience, 2016.
Zhang, L., Yang, S., Wu, X., Cheng, F., Xie, Y., & Lin, Z. (2019). An indexed set representation based multi-objective evolutionary approach for mining diversified top-k high utility patterns. Engineering Applications of Artificial Intelligence, 77, 9–20.
Zhang, Z.-H., Kou, J.-S., Chen, F.-Z., & Li, M.-Q. (2010). Feature extraction of customer purchase behavior based on genetic algorithm. Pattern Recognition and Artificial Intelligence, 2.
Zhu, H., Chen, E., Yu, K., Cao, H., Xiong, H., & Tian, J. (2012). Mining personal context-aware preferences for mobile users. In Data mining (ICDM), 2012 IEEE 12th international conference on (pp. 1212–1217). IEEE.