Optimal granule level selection: A granule description accuracy viewpoint

International Journal of Approximate Reasoning 116 (2020) 85–105

Qing Wan a,b, Jinhai Li c,d,∗, Ling Wei b,e, Ting Qian b,f

a School of Science, Xi’an Polytechnic University, Shaanxi 710048, PR China
b Institute of Concepts, Cognition and Intelligence, Northwest University, Shaanxi 710069, PR China
c Data Science Research Center, Kunming University of Science and Technology, Yunnan 650500, PR China
d Faculty of Science, Kunming University of Science and Technology, Yunnan 650500, PR China
e School of Mathematics, Northwest University, Shaanxi 710069, PR China
f School of Science, Xi’an Shiyou University, Shaanxi 710065, PR China

Article history: Received 21 July 2019; Received in revised form 2 November 2019; Accepted 2 November 2019; Available online 7 November 2019.

Keywords: Granular computing; Rough set theory; Multi-scale information table; Granule description; Optimal granule level

Abstract. Granule description has become one of the hot research topics in granular computing (GrC). Rough set theory (RST), an important technique for granular computing, can describe any target granule (i.e., any subset of a universe of discourse) by its lower and upper approximations. However, RST provides no measure for evaluating the quality of a granule description. Moreover, different granule descriptions can be acquired by decomposing a multi-scale information table into different single-scale information tables, so it is important to find the single-scale information table that best meets a specific granule description accuracy requirement. Motivated by this problem, this paper discusses optimal granule level selection based on granule description accuracy. First, a new granule description accuracy is defined by combining GrC and RST. Then, optimal granule level selection in a multi-scale information table is investigated subject to preserving the granule description accuracies of a single target granule and of a group of target granules, respectively. In particular, for a group of target granules, we put forward optimal granule level selection methods based on three different criteria commonly used in daily life. In addition, considering that real-life data are updated as time goes by, we discuss the impact on the optimal granule level when new objects are added gradually. Finally, the time complexity of the proposed algorithms is analyzed, the reasonability of the parameter settings is explained, numerical experiments are conducted to show the effectiveness of our methods, and our algorithms are compared with existing ones.

1. Introduction

The concept of information granularity [1] was first presented by the famous American cybernetician Zadeh in 1979. The term granular computing (GrC) was subsequently introduced by Zadeh [2] and Lin [3] in 1997, which is therefore considered the birth year of granular computing [4]. At present, as an effective approach to knowledge representation and data mining, granular computing has become a hot research field. Yao [5] and Lin [6] discussed the basic problems

∗ Corresponding author at: Data Science Research Center, Kunming University of Science and Technology, Yunnan 650500, PR China. E-mail addresses: [email protected] (Q. Wan), [email protected] (J. Li), [email protected] (L. Wei), [email protected] (T. Qian).

https://doi.org/10.1016/j.ijar.2019.11.001


and methods of GrC in detail, respectively. In 2007, to help readers better understand the study of GrC, Yao [4] reviewed the progress made in this field, which benefits further research on granular computing. Besides, Yao et al. [7] provided a comprehensive discussion on the past, present and future of GrC in 2013, attracting much attention from the communities of granular computing, rough sets, fuzzy sets and formal concept analysis. Pedrycz [8] and Pedrycz and Chen [9–11] proposed some excellent granular computing techniques and gave some principles of GrC. Based on results from cognitive science, Yao [12] put forward a three-way granular computing model by combining three-way decision and granular computing; based on Yao’s work, Li [13] presented the double-granular structure of three-way fuzzy matroids. Liang et al. [14] pointed out that the main idea of GrC is to decompose a complex and large problem into several simple and small subproblems according to the background of the problem under consideration, and that a satisfactory solution to the original problem can be found by combining the results obtained from the subproblems. In this process of decomposition, each simple and small subproblem is called a problem subspace, and the method of decomposing the problem space is called granule formulation. Granule formulation is one of the main research issues of GrC, and three-way clustering has become a useful way of granulating missing data [15]. After the basic granules have been obtained, how to interpret and how to represent different granules are the other two main research issues of GrC. To the best of our knowledge, granule description belongs to the interpretation of granules, and it is the starting point of this paper.

Formal concept analysis (FCA) [16], one of the most important research techniques for GrC, was proposed by the German mathematician Wille in 1982. In FCA, formal concepts are obtained through the so-called Galois connection on a formal context (G, M, I). A formal concept is a pair (A, B) satisfying A∗ = B and B∗ = A, where A ⊆ G is the extent and B ⊆ M is the intent; the extent and the intent uniquely determine each other. The semantic interpretation of (A, B) is that all objects in A share all attributes in B, and all attributes in B are possessed only by the objects in A. Based on this interpretation, all extents can be distinguished from each other by taking their intents as the descriptions of the extents. The concept lattice, the key conceptual structure of FCA, is constructed by defining a proper partial order on all the formal concepts of a formal context. Therefore, by building the concept lattice, we can obtain the descriptions of all the extents. From the perspective of FCA, Zhi and Li [17] classified the power set of objects into three categories and proposed several methods to describe them. Unfortunately, the obtained granule descriptions cover only part of the granules, and the cost of constructing the concept lattice is high. To describe more granules, Zhi and Li [18,19] further studied the granule description problem based on pictorial diagrams and three-way concept lattices, and also pointed out the problem of approximate granule description. Regrettably, some granules still cannot be described.

Rough set theory (RST) [20] was introduced by the Polish mathematician Pawlak in 1982.
RST and FCA are two complementary theories in knowledge discovery and data mining [21–23]. Up to now, there have been many research results on the combination of these two theories [24–29]. For the issue of granule description, RST and FCA are still complementary. In the case of RST, according to an equivalence relation R on a universe of discourse U, one can decompose U into a family of pairwise disjoint subsets, called the equivalence classes. All the equivalence classes of U form a partition of U, denoted by U/R. From the viewpoint of GrC, an equivalence class can be viewed as a basic granule. Based on these basic granules, any target granule X ⊆ U can be approximated by a pair of sets obtained as unions of basic granules, known as the lower and upper approximations. Furthermore, in an information table (U, AT), the basic data structure of RST, an equivalence class and its attribute values determine each other. Hence, all equivalence classes can be distinguished from one another by taking their respective attribute values as the corresponding descriptions. Although any target granule can be described by means of the lower and upper approximations, the existing granule description methods regrettably lack a quantitative parameter measuring the conciseness of the description of a granule. Therefore, in order to measure this conciseness in a reasonable quantitative way, we give a novel quantitative parameter called the granule description accuracy, which is the key concept of this paper.

To extend Pawlak’s rough set model, Qian et al. [30,31] proposed the multigranulation rough set (MGRS) based on multiple equivalence relations on a universe of discourse U; attribute reduction was also discussed in multigranulation rough sets [32,33]. In fact, the equivalence relations in MGRS are induced by multiple attribute subsets of an information table. However, in some real-life applications, an object may take “different values” under the same attribute due to variable scales [34]. If each attribute in an information table (U, AT) can be valued at s levels of scale, then AT can induce s equivalence relations on U; in this case, the information table is called a multi-scale information table. Wu and Leung [35] gave two real-life examples of multi-scale information tables (i.e., different scales of a map of China and different forms of students’ mathematics examination scores). Furthermore, if a multi-scale information table is equipped with a decision attribute, it is called a multi-scale decision table.

Recently, a hot research direction in the study of multi-scale decision tables is to select an appropriate level of scale for different problems subject to satisfying certain requirements, which is called the problem of optimal scale selection. For example, Wu and Leung [36] and Wu et al. [37] discussed optimal scale selection based on various consistencies of multi-scale decision tables. Gu and Wu [38] and She et al. [39] put forward a local viewpoint for rule acquisition in a multi-scale decision table. Xie et al. [40] viewed multi-scale decision tables as multi-scale formal decision contexts to study optimal scale selection for the purpose of mining concise association rules. Moreover, Li and Hu [41] generalized the definition of a multi-scale information table and reconsidered optimal scale selection on the basis of the Wu–Leung model [35].
Also, according to multi-scale attribute significance, Li and Hu [42] introduced a novel stepwise optimal scale selection approach to search for a better scale combination in less time than [41]. Subsequently, Wu and Leung


[43] clarified the relationships among different concepts of optimal scale combinations in inconsistent generalized multi-scale decision tables. Different from the generalization given by Li and Hu [41], Huang et al. [44] introduced another generalized multi-scale decision table by allowing objects to take “different values” under decision attributes. Besides, from the perspective of data updating, Hao et al. [45] studied optimal scale selection based on sequential three-way decisions. In summary, for multi-scale decision tables, there are two kinds of optimal scale selection methods in the existing literature: one is based on the consistencies among single-scale decision tables, and the other is based on three-way decisions.

In fact, a multi-scale information table is composed of multiple information tables over the same set of attributes whose values may be assigned differently, and these information tables need to meet specific requirements between attribute values. For each information table, one can obtain the granule description and the corresponding accuracy once a target granule is given. Thus, a natural problem is how to select from the multi-scale information table an appropriate information table whose granule description is sufficiently detailed and whose granule description accuracy reaches the required standard. To address this problem, in this paper we investigate optimal scale selection in multi-scale information tables from the aspect of granule description accuracy.

Note that there is another situation in real life. When people are required to meet different goals in practical problems, they may use multiple criteria. For instance, as a whole, they may first be required to meet a global standard, i.e., relative quantization. Furthermore, each individual may be required to meet a local standard, i.e., absolute quantization. Sometimes both criteria should be satisfied simultaneously, i.e., double quantization. For example, in team selection activities, the selection criteria may take the above three forms. So, in this paper, for a group of target granules, we discuss optimal scale selection based on these three different criteria in multi-scale information tables.

In terms of knowledge discovery from dynamic data, many results have been obtained in RST [46–51]. The common characteristic of these studies is that new knowledge is obtained by updating previous knowledge, and the validity of the method is verified by experiments. To the best of our knowledge, most of the existing incremental methods are efficient. So, we use incremental ideas to analyze the impact on the optimal scale level when data are updated in multi-scale information tables. In addition, we design several algorithms to compute the optimal scale levels when objects are added to the original data set.

According to the above analysis, the current work mainly focuses on optimal granule level selection based on granule description accuracy. The rest of this paper is organized as follows. In Section 2, we review some basic notions, i.e., the Pawlak approximation space, information tables and multi-scale information tables. We then propose the definition of granule description accuracy for a given target granule in an information table and in a multi-scale information table, respectively, and discuss some useful properties.
In Section 3, we present the optimal granule level based on granule description accuracies for a target granule and for a group of target granules, respectively, and design the corresponding algorithms. In Section 4, we analyze the impact on the optimal granule level when objects are updated. In Section 5, we discuss the reasonability of the parameter settings and conduct numerical experiments to show the effectiveness of our method. Finally, Section 6 concludes this paper.

2. Preliminaries

In this section, we recall some basic notions of rough sets and multi-scale information tables. Meanwhile, we propose the definition of granule description accuracy on information tables and on multi-scale information tables, respectively.

2.1. Rough set approximations

Definition 1 ([20]). Let U be a finite and nonempty set called the universe of discourse. If R ⊆ U × U is an equivalence relation on U, then the pair (U, R) is called a Pawlak approximation space.

A partition of U can be derived from the equivalence relation R on U, denoted by U/R = {[x]_R | x ∈ U}, where [x]_R = {y ∈ U | (x, y) ∈ R} is the equivalence class containing x. In particular, from the viewpoint of granular computing, each equivalence class [x]_R can be called a basic granule of the partition. For any X ⊆ U, we denote

R(X) = ∪{[x]_R ∈ U/R | [x]_R ⊆ X},
R̄(X) = ∪{[x]_R ∈ U/R | [x]_R ∩ X ≠ ∅},

and call them the lower and upper approximations of X, respectively. Further, if R(X) = R̄(X), then X is called definable; otherwise X is called undefinable (in this case, X is a rough set). Obviously, R(X) and R̄(X) are themselves definable. Some properties of the lower and upper approximations are listed below.

Proposition 1 ([20]). Let (U, R) be a Pawlak approximation space and X, Y ⊆ U. Then the following properties hold:
(1) R(U) = R̄(U) = U, R(∅) = R̄(∅) = ∅;
(2) R(X) ⊆ X ⊆ R̄(X);
(3) R(X ∩ Y) = R(X) ∩ R(Y), R̄(X ∪ Y) = R̄(X) ∪ R̄(Y);
(4) R(X^c) = (R̄(X))^c, R̄(X^c) = (R(X))^c, where X^c is the complement of X.
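For illustration, the two approximation operators can be computed directly from a partition; the following minimal sketch (ours, not the paper's code) models a partition as a list of disjoint sets:

```python
# A minimal sketch of the Pawlak approximation operators; a partition U/R is
# modelled as a list of disjoint sets (the basic granules).

def lower_approximation(partition, X):
    """R(X): union of the basic granules entirely contained in X."""
    result = set()
    for granule in partition:
        if granule <= X:            # [x]_R subset of X
            result |= granule
    return result

def upper_approximation(partition, X):
    """The upper approximation: union of the basic granules intersecting X."""
    result = set()
    for granule in partition:
        if granule & X:             # [x]_R meets X
            result |= granule
    return result

def is_definable(partition, X):
    """X is definable iff its lower and upper approximations coincide."""
    return lower_approximation(partition, X) == upper_approximation(partition, X)
```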


Table 1. An information table (U, AT).

U    a1  a2  a3  a4
x1    1   2   3   1
x2    1   2   2   3
x3    2   1   2   3
x4    1   2   3   1

When X is rough, one can use the pair of lower and upper approximations to characterize it; that is, a rough set X is approximated by the pair (R(X), R̄(X)). To measure the accuracy of the approximation, we denote

α_R(X) = |R(X)| / |R̄(X)|

and call it the accuracy of the rough set approximation, where |X| denotes the cardinality of a set X. In particular, if X is empty, we set α_R(∅) = 0, and if X is definable, then α_R(X) = 1. As a result, 0 ≤ α_R(X) ≤ 1. Observe that this definition of approximation accuracy treats an element of U as the basic unit.

2.2. Information tables

Information tables are the basic data of RST, from which useful patterns can be extracted. The definition of an information table is given below.

Definition 2 ([20]). An information table S = (U, AT) consists of two sets U and AT, where U = {x1, x2, ···, xn} is a non-empty and finite set of objects called the universe of discourse, and AT = {a1, a2, ···, am} is a non-empty and finite set of attributes. For any a_l ∈ AT, there is a map f_l : U → V_l, i.e., f_l(x) ∈ V_l for all x ∈ U, where V_l = {f_l(x) | x ∈ U} is called the domain of a_l.

The attribute set AT induces a partition U/R_AT whose basic granules are denoted by [x]_{R_AT}. As [x]_{R_AT} and its attribute-value tuple (f_1(x), f_2(x), ···, f_m(x)) are in one-to-one correspondence, we call (a_1, f_1(x)) ∧ (a_2, f_2(x)) ∧ ··· ∧ (a_m, f_m(x)) the distinguishing description of [x]_{R_AT}, denoted by d([x]_{R_AT}), i.e.,

d([x]_{R_AT}) = (a_1, f_1(x)) ∧ (a_2, f_2(x)) ∧ ··· ∧ (a_m, f_m(x)) = ⋀_{a_l ∈ AT} (a_l, f_l(x)).

For any nonempty subset X of U, if X is definable, then the distinguishing description of X can be expressed as the disjunction of the distinguishing descriptions of some basic granules, i.e.,

d(X) = ⋁_{[x]_{R_AT} ⊆ X} d([x]_{R_AT}) = ⋁_{[x]_{R_AT} ⊆ X} ( ⋀_{a_l ∈ AT} (a_l, f_l(x)) ).

If X is not definable, then we can represent the distinguishing description of X by an interval set constructed from the distinguishing descriptions of R_AT(X) and R̄_AT(X), i.e.,

d(X) = [d(R_AT(X)), d(R̄_AT(X))]. In this case, if R_AT(X) = ∅, we denote d(X) = [−, d(R̄_AT(X))]. In the following, we give an example to explain the above discussion.

Example 1. Assume that the object set is U = {x1, x2, x3, x4} and the attribute set is AT = {a1, a2, a3, a4}; the corresponding information table is Table 1. The attribute set AT derives a partition of U, i.e., U/R_AT = {{x1, x4}, {x2}, {x3}}, where [x1]_{R_AT} = [x4]_{R_AT} = {x1, x4}, [x2]_{R_AT} = {x2}, and [x3]_{R_AT} = {x3}. The distinguishing description of each basic granule is as follows:

d([x1]_{R_AT}) = d([x4]_{R_AT}) = (a1, 1) ∧ (a2, 2) ∧ (a3, 3) ∧ (a4, 1),
d([x2]_{R_AT}) = (a1, 1) ∧ (a2, 2) ∧ (a3, 2) ∧ (a4, 3),
d([x3]_{R_AT}) = (a1, 2) ∧ (a2, 1) ∧ (a3, 2) ∧ (a4, 3).


Let X1 = {x2, x3}. Because R_AT(X1) = R̄_AT(X1) = {x2, x3} = [x2]_{R_AT} ∪ [x3]_{R_AT}, X1 is definable, and its distinguishing description is d(X1) = d([x2]_{R_AT}) ∨ d([x3]_{R_AT}).

Let X2 = {x1, x2, x3}. Then R_AT(X2) = {x2, x3} = [x2]_{R_AT} ∪ [x3]_{R_AT} and R̄_AT(X2) = {x1, x2, x3, x4} = [x1]_{R_AT} ∪ [x2]_{R_AT} ∪ [x3]_{R_AT}. So, X2 is not definable and α_{R_AT}(X2) = 1/2. The distinguishing descriptions d(R_AT(X2)) and d(R̄_AT(X2)) are as follows:

d(R_AT(X2)) = d([x2]_{R_AT}) ∨ d([x3]_{R_AT}),
d(R̄_AT(X2)) = d([x1]_{R_AT}) ∨ d([x2]_{R_AT}) ∨ d([x3]_{R_AT}).

Then the distinguishing description of X2 is

d(X2) = [d([x2]_{R_AT}) ∨ d([x3]_{R_AT}), d([x1]_{R_AT}) ∨ d([x2]_{R_AT}) ∨ d([x3]_{R_AT})].
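The computations of Example 1 can be reproduced mechanically; the sketch below (ours, with Table 1 hard-coded) groups objects by their attribute-value tuples and prints the distinguishing description of each basic granule:

```python
# Reproducing Example 1 from Table 1; objects with identical value tuples
# form one basic granule, whose description is the conjunction of its values.
table = {            # object -> (f1(x), f2(x), f3(x), f4(x))
    'x1': (1, 2, 3, 1),
    'x2': (1, 2, 2, 3),
    'x3': (2, 1, 2, 3),
    'x4': (1, 2, 3, 1),
}
attributes = ('a1', 'a2', 'a3', 'a4')

granules = {}        # value tuple -> basic granule [x]_{R_AT}
for obj, values in table.items():
    granules.setdefault(values, set()).add(obj)

for values, granule in granules.items():
    description = ' ∧ '.join(f'({a}, {v})' for a, v in zip(attributes, values))
    print(sorted(granule), '->', description)
# ['x1', 'x4'] -> (a1, 1) ∧ (a2, 2) ∧ (a3, 3) ∧ (a4, 1)
# ['x2']       -> (a1, 1) ∧ (a2, 2) ∧ (a3, 2) ∧ (a4, 3)
# ['x3']       -> (a1, 2) ∧ (a2, 1) ∧ (a3, 2) ∧ (a4, 3)
```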

Remark 1. In this paper, we view any subset X ⊆ U as a granule, so we call d(X) the granule description hereinafter.

Based on the above results, it is easy to observe that the description of a target granule X is mainly determined by the basic granules in R_AT(X) and R̄_AT(X). Because any basic granule and its granule description are in one-to-one correspondence, every target granule and its granule description are also in one-to-one correspondence. Accordingly, in terms of granule description, and unlike the classical rough set approximation, we view an equivalence class of U as a basic unit, and quantify the granule description of a target granule by the ratio of the numbers of basic granules in the lower and upper approximations. For our purpose, we denote by Ecn(R_AT(X)) and Ecn(R̄_AT(X)) the numbers of basic granules in the lower and upper approximations, respectively.

Definition 3. Let S = (U, AT) be an information table. For any target granule X ⊆ U, the granule description accuracy of X is defined by

α_d(X) = Ecn(R_AT(X)) / Ecn(R̄_AT(X)).

In fact, Definition 3 provides an approach to measuring the granule description accuracy based on the degree of closeness between the lower and upper approximations. Obviously, if X = ∅, then R_AT(X) = R̄_AT(X) = ∅, leading to Ecn(R_AT(X)) = Ecn(R̄_AT(X)) = 0; in this case, we set α_d(X) = 0. Next, we discuss some properties of α_d(X) for any target granule X.

Theorem 1. Let S = (U, AT) be an information table. For any target granule X ⊆ U, the following statements hold:
(1) 0 ≤ α_d(X) ≤ 1;
(2) α_d(X) = (|U/R_AT| − Ecn(R̄_AT(X^c))) / (|U/R_AT| − Ecn(R_AT(X^c))).

Proof. (1) If X = ∅, then α_d(X) = 0. If X ≠ ∅, we have R_AT(X) ⊆ R̄_AT(X), and hence Ecn(R_AT(X)) ≤ Ecn(R̄_AT(X)), i.e., 0 ≤ α_d(X) ≤ 1. In particular, if R_AT(X) = R̄_AT(X), then α_d(X) = 1.

(2) From R_AT(X) ∪ (R_AT(X))^c = U and R_AT(X) ∩ (R_AT(X))^c = ∅, it follows that Ecn(R_AT(X)) + Ecn((R_AT(X))^c) = |U/R_AT|. Based on R̄_AT(X^c) = (R_AT(X))^c, we have Ecn(R_AT(X)) + Ecn(R̄_AT(X^c)) = |U/R_AT|. Similarly, Ecn(R̄_AT(X)) + Ecn(R_AT(X^c)) = |U/R_AT|. Therefore,

α_d(X) = Ecn(R_AT(X)) / Ecn(R̄_AT(X)) = (|U/R_AT| − Ecn(R̄_AT(X^c))) / (|U/R_AT| − Ecn(R_AT(X^c))). □

Theorem 1 shows that we can compute α_d(X) from the approximations of the complement X^c, which simplifies the computation when |X| > |X^c|.

Example 2 (continued from Example 1). According to the results in Example 1 and Definition 3, the granule description accuracy of the definable granule X1 = {x2, x3} is α_d(X1) = 1, and that of the undefinable granule X2 = {x1, x2, x3} is α_d(X2) = 2/3.

Let X = {x4}. It follows that Ecn(R_AT(X)) = Ecn(∅) = 0 and Ecn(R̄_AT(X)) = Ecn({x1, x4}) = 1. Because X2^c = X and |U/R_AT| = 3, Theorem 1(2) gives

α_d(X2) = (|U/R_AT| − Ecn(R̄_AT(X))) / (|U/R_AT| − Ecn(R_AT(X))) = (3 − 1) / (3 − 0) = 2/3.
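Definition 3 and the complement identity of Theorem 1(2) are easy to check numerically; the sketch below (ours) reuses lower_approximation and upper_approximation from the earlier sketch:

```python
# Ecn counts basic granules inside a definable set; alpha_d is Definition 3.

def ecn(definable_set, partition):
    """Ecn: the number of basic granules contained in a definable set."""
    return sum(1 for granule in partition if granule <= definable_set)

def alpha_d(partition, X):
    upper = upper_approximation(partition, X)
    if not upper:                                   # X is empty
        return 0.0
    return ecn(lower_approximation(partition, X), partition) / ecn(upper, partition)

universe = {'x1', 'x2', 'x3', 'x4'}
partition = [{'x1', 'x4'}, {'x2'}, {'x3'}]
X2 = {'x1', 'x2', 'x3'}
print(alpha_d(partition, X2))                       # 0.666..., i.e. 2/3

# Theorem 1(2): the same value obtained from the complement X2^c = {x4}.
Xc = universe - X2
n = len(partition)                                  # |U/R_AT| = 3
numerator = n - ecn(upper_approximation(partition, Xc), partition)    # 3 - 1
denominator = n - ecn(lower_approximation(partition, Xc), partition)  # 3 - 0
print(numerator / denominator)                      # also 2/3
```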


2.3. Multi-scale information tables

Denote AT = {AT^1, AT^2, ···, AT^s}, where AT^k = {a_j^k | j = 1, 2, ···, m} (k = 1, 2, ···, s). Each AT^k induces a partition of U, denoted by U/R_{AT^k} = {[x]_{R_{AT^k}} | x ∈ U}, where [x]_{R_{AT^k}} is a basic granule induced by AT^k. If [x]_{R_{AT^1}} ⊇ [x]_{R_{AT^2}} for every object x ∈ U, we say that U/R_{AT^1} is coarser than U/R_{AT^2} (or U/R_{AT^2} is finer than U/R_{AT^1}), written U/R_{AT^1} ⪰ U/R_{AT^2}. If the partitions U/R_{AT^1}, U/R_{AT^2}, ···, U/R_{AT^s} are totally ordered with respect to the coarser (finer) relationship, i.e., U/R_{AT^1} ⪰ U/R_{AT^2} ⪰ ··· ⪰ U/R_{AT^s} or U/R_{AT^1} ⪯ U/R_{AT^2} ⪯ ··· ⪯ U/R_{AT^s}, then S = (U, AT) is called a multi-scale information table. The original definition of a multi-scale information table is as follows.

Definition 4 ([35]). A multi-scale information table is a tuple S = (U, AT), where U = {x1, x2, ···, xn} is a non-empty and finite set of objects called the universe of discourse, AT = {a1, a2, ···, am} is a non-empty and finite set of attributes, and each a_j ∈ AT is a multi-scale attribute, i.e., for the same object in U, a_j can take “different values” under different scales.

Note that the “different values” in Definition 4 mean that an original value varies in form or semantics under different real-world scales. In general, a multi-scale information table can be represented as (U, {a_j^k | k = 1, 2, ···, s; j = 1, 2, ···, m}). For k ∈ {1, 2, ···, s}, AT^k is called the kth level of scale of the attribute set AT, S^k = (U, AT^k) is called a single-scale information table of S, and S = (U, AT) can thus be decomposed into s single-scale information tables. For our purpose, we write a multi-scale information table as S = (U, {AT^k | k = 1, 2, ···, s}) and suppose U/AT^1 ⪰ U/AT^2 ⪰ ··· ⪰ U/AT^s. In the single-scale information table S^k = (U, AT^k), we denote the basic granule, the partition, and the lower and upper approximations by [x]_{AT^k}, U/AT^k, R(AT^k, X) and R̄(AT^k, X), respectively. Of course, the results about the lower and upper approximations in Proposition 1 also hold here.

Since each AT^k in a multi-scale information table S = (U, {AT^k | k = 1, 2, ···, s}) induces an equivalence relation R_{AT^k}, from the perspective of the multigranulation rough set model we call k a granule level in this paper. For a target granule X, we can obtain multiple granule descriptions under different granule levels. For a definable granule X, the granule description is denoted by

d(AT^k, X) = ⋁_{[x]_{AT^k} ⊆ X} d([x]_{AT^k}) = ⋁_{[x]_{AT^k} ⊆ X} ( ⋀_{a_l^k ∈ AT^k} (a_l^k, f_l^k(x)) ).

Here, f_l^k : U → V_l^k, i.e., f_l^k(x) ∈ V_l^k = {f_l^k(x) | x ∈ U} for all x ∈ U. For an undefinable granule X, the granule description is denoted by

d(AT^k, X) = [d(R(AT^k, X)), d(R̄(AT^k, X))].

In a multi-scale information table S = (U, {AT^k | k = 1, 2, ···, s}), we can thus obtain s granule descriptions for a target granule. For the description of any granule X, we can employ Definition 3 to calculate its accuracy, denoted by α_d^k(X), i.e.,

α_d^k(X) = Ecn(R(AT^k, X)) / Ecn(R̄(AT^k, X)).

In addition, as the partitions of S = (U, {AT^k | k = 1, 2, ···, s}) satisfy U/AT^1 ⪰ U/AT^2 ⪰ ··· ⪰ U/AT^s, the granule descriptions become more detailed as the granule level becomes greater. The details can be found below.

Lemma 1. Let S = (U, {AT^k | k = 1, 2, ···, s}) be a multi-scale information table and X ⊆ U be a target granule. For 1 < k ≤ s, we have

Ecn(R(AT^{k−1}, X)) ≤ Ecn(R(AT^k, X)) and Ecn(R̄(AT^{k−1}, X)) ≤ Ecn(R̄(AT^k, X)).

Lemma 1 shows that the number of basic granules in the lower (upper) approximation increases with the value k in S = (U, {AT^k | k = 1, 2, ···, s}); indeed, every basic granule at level k − 1 that is contained in (resp. intersects) X contains at least one basic granule at level k that is contained in (resp. intersects) X. Note that d(AT^s, X) = [d(R(AT^s, X)), d(R̄(AT^s, X))] is the most detailed granule description of X. To facilitate understanding, we give an example.

Example 3. Table 2 is a multi-scale information table describing the housing survey information of 15 households in a city. Here U = {x1, x2, ···, x15} is the object set constituted by the 15 households, and AT = {a1, a2, a3} is the attribute set, where a1 is the distance from the house to the city center, a2 is the size of the house, and a3 is the price of the house. Table 2 was obtained by discretizing the continuous attribute values and evenly dividing the range of each attribute into several subintervals. Moreover, each attribute is given under four levels of scale, i.e., AT^1 = {a_1^1, a_2^1, a_3^1}, AT^2 = {a_1^2, a_2^2, a_3^2}, AT^3 = {a_1^3, a_2^3, a_3^3}, and AT^4 = {a_1^4, a_2^4, a_3^4}.


Table 2. A multi-scale information table S = (U, {AT^k | k = 1, 2, 3, 4}).

U     a_1^1  a_1^2  a_1^3  a_1^4   a_2^1  a_2^2  a_2^3  a_2^4   a_3^1  a_3^2  a_3^3  a_3^4
x1      1      1      1      1       1      1      1      1       1      1      1      1
x2      1      1      1      1       1      2      2      2       1      2      2      2
x3      1      1      1      2       1      2      2      3       1      2      2      3
x4      1      1      2      3       1      2      3      4       1      2      3      4
x5      1      2      3      4       1      2      3      4       1      2      3      4
x6      1      2      3      4       2      3      4      5       1      2      4      5
x7      1      2      4      5       2      3      4      5       1      2      4      5
x8      1      2      4      5       2      3      4      5       1      2      4      6
x9      2      3      5      6       2      3      4      6       2      3      5      7
x10     2      3      5      6       2      3      4      6       2      3      5      7
x11     2      3      5      6       2      3      5      7       2      3      5      8
x12     2      3      5      7       2      4      6      8       2      3      5      8
x13     2      3      5      8       2      4      6      9       2      4      6      9
x14     2      3      6      9       2      4      7     10       2      4      7     10
x15     2      4      7     10       2      4      7     11       2      4      7     11

Table 3. The single-scale information table S^1 = (U, {a_1^1, a_2^1, a_3^1}).

U     a_1^1  a_2^1  a_3^1
x1      1      1      1
x2      1      1      1
x3      1      1      1
x4      1      1      1
x5      1      1      1
x6      1      2      1
x7      1      2      1
x8      1      2      1
x9      2      2      2
x10     2      2      2
x11     2      2      2
x12     2      2      2
x13     2      2      2
x14     2      2      2
x15     2      2      2

Next, we compute the four partitions of U according to Table 2:

U/AT^1 = {{x1, x2, x3, x4, x5}, {x6, x7, x8}, {x9, x10, x11, x12, x13, x14, x15}} = {E11, E12, E13},
U/AT^2 = {{x1}, {x2, x3, x4}, {x5}, {x6, x7, x8}, {x9, x10, x11}, {x12}, {x13, x14}, {x15}} = {E21, E22, E23, E24, E25, E26, E27, E28},
U/AT^3 = {{x1}, {x2, x3}, {x4}, {x5}, {x6}, {x7, x8}, {x9, x10}, {x11}, {x12}, {x13}, {x14}, {x15}} = {E31, E32, E33, E34, E35, E36, E37, E38, E39, E3,10, E3,11, E3,12},
U/AT^4 = {{x1}, {x2}, {x3}, {x4}, {x5}, {x6}, {x7}, {x8}, {x9, x10}, {x11}, {x12}, {x13}, {x14}, {x15}} = {E41, E42, E43, E44, E45, E46, E47, E48, E49, E4,10, E4,11, E4,12, E4,13, E4,14}.

Obviously, U/AT^1 ⪰ U/AT^2 ⪰ U/AT^3 ⪰ U/AT^4. Therefore, based on Definition 4, Table 2 is a multi-scale information table. Let k = 1. Then the single-scale information table S^1 = (U, AT^1) generated by the first level of scale is shown in Table 3. In a similar way, we can obtain S^2, S^3 and S^4 from Table 2. According to Table 2, it is easy to compute the granule description of each basic granule in S^k (k = 1, 2, 3, 4). The detailed results are given below.

In the first granule level:

d(E11) = (a_1^1, 1) ∧ (a_2^1, 1) ∧ (a_3^1, 1),
d(E12) = (a_1^1, 1) ∧ (a_2^1, 2) ∧ (a_3^1, 1),
d(E13) = (a_1^1, 2) ∧ (a_2^1, 2) ∧ (a_3^1, 2).


In the second granule level:

d(E21) = (a_1^2, 1) ∧ (a_2^2, 1) ∧ (a_3^2, 1),
d(E22) = (a_1^2, 1) ∧ (a_2^2, 2) ∧ (a_3^2, 2),
d(E23) = (a_1^2, 2) ∧ (a_2^2, 2) ∧ (a_3^2, 2),
d(E24) = (a_1^2, 2) ∧ (a_2^2, 3) ∧ (a_3^2, 2),
d(E25) = (a_1^2, 3) ∧ (a_2^2, 3) ∧ (a_3^2, 3),
d(E26) = (a_1^2, 3) ∧ (a_2^2, 4) ∧ (a_3^2, 3),
d(E27) = (a_1^2, 3) ∧ (a_2^2, 4) ∧ (a_3^2, 4),
d(E28) = (a_1^2, 4) ∧ (a_2^2, 4) ∧ (a_3^2, 4).

In the third granule level:

d(E31) = (a_1^3, 1) ∧ (a_2^3, 1) ∧ (a_3^3, 1),
d(E32) = (a_1^3, 1) ∧ (a_2^3, 2) ∧ (a_3^3, 2),
d(E33) = (a_1^3, 2) ∧ (a_2^3, 3) ∧ (a_3^3, 3),
d(E34) = (a_1^3, 3) ∧ (a_2^3, 3) ∧ (a_3^3, 3),
d(E35) = (a_1^3, 3) ∧ (a_2^3, 4) ∧ (a_3^3, 4),
d(E36) = (a_1^3, 4) ∧ (a_2^3, 4) ∧ (a_3^3, 4),
d(E37) = (a_1^3, 5) ∧ (a_2^3, 4) ∧ (a_3^3, 5),
d(E38) = (a_1^3, 5) ∧ (a_2^3, 5) ∧ (a_3^3, 5),
d(E39) = (a_1^3, 5) ∧ (a_2^3, 6) ∧ (a_3^3, 5),
d(E3,10) = (a_1^3, 5) ∧ (a_2^3, 6) ∧ (a_3^3, 6),
d(E3,11) = (a_1^3, 6) ∧ (a_2^3, 7) ∧ (a_3^3, 7),
d(E3,12) = (a_1^3, 7) ∧ (a_2^3, 7) ∧ (a_3^3, 7).

In the fourth granule level:

d(E41) = (a_1^4, 1) ∧ (a_2^4, 1) ∧ (a_3^4, 1),
d(E42) = (a_1^4, 1) ∧ (a_2^4, 2) ∧ (a_3^4, 2),
d(E43) = (a_1^4, 2) ∧ (a_2^4, 3) ∧ (a_3^4, 3),
d(E44) = (a_1^4, 3) ∧ (a_2^4, 4) ∧ (a_3^4, 4),
d(E45) = (a_1^4, 4) ∧ (a_2^4, 4) ∧ (a_3^4, 4),
d(E46) = (a_1^4, 4) ∧ (a_2^4, 5) ∧ (a_3^4, 5),
d(E47) = (a_1^4, 5) ∧ (a_2^4, 5) ∧ (a_3^4, 5),
d(E48) = (a_1^4, 5) ∧ (a_2^4, 5) ∧ (a_3^4, 6),
d(E49) = (a_1^4, 6) ∧ (a_2^4, 6) ∧ (a_3^4, 7),
d(E4,10) = (a_1^4, 6) ∧ (a_2^4, 7) ∧ (a_3^4, 8),
d(E4,11) = (a_1^4, 7) ∧ (a_2^4, 8) ∧ (a_3^4, 8),
d(E4,12) = (a_1^4, 8) ∧ (a_2^4, 9) ∧ (a_3^4, 9),
d(E4,13) = (a_1^4, 9) ∧ (a_2^4, 10) ∧ (a_3^4, 10),
d(E4,14) = (a_1^4, 10) ∧ (a_2^4, 11) ∧ (a_3^4, 11).

Let X1 = {x1, x5, x8, x11} and X2 = {x6, x7, x8, x9, x13} be two target granules. Their lower and upper approximations at the above four granule levels are shown in Table 4, and their granule descriptions at each granule level are given in Table 5.


Table 4. The lower and upper approximations of X1 and X2.

k = 1: R(AT^1, X1) = ∅;  R̄(AT^1, X1) = E11 ∪ E12 ∪ E13;  R(AT^1, X2) = E12;  R̄(AT^1, X2) = E12 ∪ E13
k = 2: R(AT^2, X1) = E21 ∪ E23;  R̄(AT^2, X1) = E21 ∪ E23 ∪ E24 ∪ E25;  R(AT^2, X2) = E24;  R̄(AT^2, X2) = E24 ∪ E25 ∪ E27
k = 3: R(AT^3, X1) = E31 ∪ E34 ∪ E38;  R̄(AT^3, X1) = E31 ∪ E34 ∪ E36 ∪ E38;  R(AT^3, X2) = E35 ∪ E36 ∪ E3,10;  R̄(AT^3, X2) = E35 ∪ E36 ∪ E37 ∪ E3,10
k = 4: R(AT^4, X1) = R̄(AT^4, X1) = E41 ∪ E45 ∪ E48 ∪ E4,10;  R(AT^4, X2) = E46 ∪ E47 ∪ E48 ∪ E4,12;  R̄(AT^4, X2) = E46 ∪ E47 ∪ E48 ∪ E49 ∪ E4,12

Table 5. The descriptions of the granules X1 and X2.

d(AT^1, X1) = [−, d(E11) ∨ d(E12) ∨ d(E13)]
d(AT^2, X1) = [d(E21) ∨ d(E23), d(E21) ∨ d(E23) ∨ d(E24) ∨ d(E25)]
d(AT^3, X1) = [d(E31) ∨ d(E34) ∨ d(E38), d(E31) ∨ d(E34) ∨ d(E36) ∨ d(E38)]
d(AT^4, X1) = d(E41) ∨ d(E45) ∨ d(E48) ∨ d(E4,10)

d(AT^1, X2) = [d(E12), d(E12) ∨ d(E13)]
d(AT^2, X2) = [d(E24), d(E24) ∨ d(E25) ∨ d(E27)]
d(AT^3, X2) = [d(E35) ∨ d(E36) ∨ d(E3,10), d(E35) ∨ d(E36) ∨ d(E37) ∨ d(E3,10)]
d(AT^4, X2) = [d(E46) ∨ d(E47) ∨ d(E48) ∨ d(E4,12), d(E46) ∨ d(E47) ∨ d(E48) ∨ d(E49) ∨ d(E4,12)]

Table 6. The description accuracies of X1 and X2.

k            1     2     3     4
α_d^k(X1)    0    1/2   3/4    1
α_d^k(X2)   1/2   1/3   3/4   4/5

By Definition 3, we compute the granule description accuracies of X1 and X2 at each granule level; see Table 6 for details. From Example 3, we obtain the following observations:

1. For the target granule X1, the granule description becomes more detailed and α_d^k(X1) increases monotonically as k goes from 1 to 4;
2. For the target granule X2, the granule description also becomes more detailed as k goes from 1 to 4, but α_d^k(X2) changes without regularity.

That is, Definition 3 is unsuitable as the granule description accuracy in a multi-scale information table, since it does not reflect the fact that the description of a granule becomes more and more detailed. To find a proper measure, we first divide R̄(AT^k, X) into two parts: R(AT^k, X) and R̄(AT^k, X) − R(AT^k, X). We denote BN(AT^k, X) = R̄(AT^k, X) − R(AT^k, X), and call BN(AT^k, X) the boundary region of X with respect to AT^k.

We now analyze the granule description accuracy of X2 as k goes from 1 to 4. On the one hand, from the 1st to the 2nd granule level, Ecn(R(AT^k, X2)) stays at 1 while Ecn(BN(AT^k, X2)) grows from 1 to 2; as Ecn(BN(AT^k, X2)) becomes greater, the granule description becomes more detailed, but α_d^k(X2) becomes smaller. On the other hand, from the 3rd to the 4th granule level, Ecn(R(AT^k, X2)) grows from 3 to 4 while Ecn(BN(AT^k, X2)) stays at 1; as Ecn(R(AT^k, X2)) becomes greater, the granule description becomes more detailed and α_d^k(X2) becomes greater.

Thus, whenever Ecn(R(AT^k, X)) or Ecn(BN(AT^k, X)) increases with k, the granule description becomes more detailed. Meanwhile, α_d^k(X) becomes greater when Ecn(R(AT^k, X)) grows, and smaller when Ecn(BN(AT^k, X)) grows. In summary, using only the number of basic granules in the lower approximation as the numerator of the granule description accuracy is not enough, since the granules in the boundary region also reflect the change of granule description. However, the basic granules in the lower approximation and in the boundary region clearly have different importance for granule description. So, we introduce two weights δ_L and δ_B and take δ_L Ecn(R(AT^k, X)) + δ_B Ecn(BN(AT^k, X)) as the numerator of the granule description accuracy. In addition, the denominator takes the same form at the finest granule level, because the problem of granule description is now considered in multi-scale information tables.


Table 7. The lower approximations and boundary regions of X1 and X2.

k = 1: R(AT^1, X1) = ∅;  BN(AT^1, X1) = E11 ∪ E12 ∪ E13;  R(AT^1, X2) = E12;  BN(AT^1, X2) = E13
k = 2: R(AT^2, X1) = E21 ∪ E23;  BN(AT^2, X1) = E24 ∪ E25;  R(AT^2, X2) = E24;  BN(AT^2, X2) = E25 ∪ E27
k = 3: R(AT^3, X1) = E31 ∪ E34 ∪ E38;  BN(AT^3, X1) = E36;  R(AT^3, X2) = E35 ∪ E36 ∪ E3,10;  BN(AT^3, X2) = E37
k = 4: R(AT^4, X1) = E41 ∪ E45 ∪ E48 ∪ E4,10;  BN(AT^4, X1) = ∅;  R(AT^4, X2) = E46 ∪ E47 ∪ E48 ∪ E4,12;  BN(AT^4, X2) = E49

Table 8. The description accuracies of X1 and X2 (with δ_L = 1 and δ_B = 1/11, as in Example 4).

k                1       2       3      4
α_d(AT^k, X1)   3/44    24/44   34/44   1
α_d(AT^k, X2)   12/45   13/45   34/45   1

Definition 5. Let S = (U, {AT^k | k = 1, 2, ···, s}) be a multi-scale information table. For any target granule X ⊆ U, the granule description accuracy of X in the kth granule level is defined by

α_d(AT^k, X) = (δ_L Ecn(R(AT^k, X)) + δ_B Ecn(BN(AT^k, X))) / (δ_L Ecn(R(AT^s, X)) + δ_B Ecn(BN(AT^s, X))),

where δ_L and δ_B are weight parameters and k ∈ {1, 2, ···, s}.

Definition 5 quantifies the description of a target granule by the weighted sum of the numbers of basic granules in the lower approximation and in the boundary region. Different from Definition 3, Definition 5 is designed for a multi-scale information table, so that the change of granule description can be distinguished from one granule level to another. Based on the above discussion, for the description of a target granule X, the effect of the basic granules in R(AT^k, X) is far bigger than that of those in BN(AT^k, X), so δ_L and δ_B are supposed to satisfy δ_L ≫ δ_B. In addition, if X = ∅, we set α_d(AT^k, X) = 0. Otherwise, for any nonempty subset X of U, the granule description accuracy increases as the value k becomes greater, and α_d(AT^s, X) = 1. It is also worth noting that α_d(AT^k, X) = Ecn(R(AT^k, X)) / Ecn(R(AT^s, X)) when X is definable in the kth granule level. As a result, 0 ≤ α_d(AT^k, X) ≤ 1.

In the following, Algorithm 1 gives a detailed procedure to calculate the description accuracy of a target granule at a given granule level in a multi-scale information table.

Algorithm 1 Compute the description accuracy of a target granule.
Input: A multi-scale information table S = (U, AT), a target granule X ⊆ U, the weight parameters δ_L and δ_B, the number of granule levels s, and a granule level k
Output: The granule description accuracy α_d(AT^k, X)
1: Construct the partition U/AT^s = {[x]_{AT^s} | x ∈ U}.
2: Calculate the lower approximation R(AT^s, X) and the boundary region BN(AT^s, X).
3: Compute δ_L Ecn(R(AT^s, X)) + δ_B Ecn(BN(AT^s, X)).
4: Repeat Steps 1–3 at level k to compute δ_L Ecn(R(AT^k, X)) + δ_B Ecn(BN(AT^k, X)).
5: Use Definition 5 to compute α_d(AT^k, X), and return α_d(AT^k, X).
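As a compact illustration of Algorithm 1 (our own sketch; weighted_count, alpha_d_k and levels are hypothetical names, not from the paper), the weighted counts of Definition 5 can be computed directly from the list of partitions, ordered from the coarsest level to the finest:

```python
# A sketch of Algorithm 1 / Definition 5 at one granule level k (1-based).

def weighted_count(partition, X, delta_L, delta_B):
    """delta_L * Ecn(lower) + delta_B * Ecn(boundary) for one level."""
    ecn_lower = sum(1 for g in partition if g <= X)
    ecn_upper = sum(1 for g in partition if g & X)
    ecn_boundary = ecn_upper - ecn_lower   # Ecn(BN) = Ecn(upper) - Ecn(lower)
    return delta_L * ecn_lower + delta_B * ecn_boundary

def alpha_d_k(partitions, X, k, delta_L=1.0, delta_B=1/11):
    """Granule description accuracy alpha_d(AT^k, X) of Definition 5."""
    if not X:
        return 0.0
    return (weighted_count(partitions[k - 1], X, delta_L, delta_B)
            / weighted_count(partitions[-1], X, delta_L, delta_B))

# Check against Table 8, using the partitions of Example 3.
levels = [
    [{'x1','x2','x3','x4','x5'}, {'x6','x7','x8'},
     {'x9','x10','x11','x12','x13','x14','x15'}],
    [{'x1'}, {'x2','x3','x4'}, {'x5'}, {'x6','x7','x8'},
     {'x9','x10','x11'}, {'x12'}, {'x13','x14'}, {'x15'}],
    [{'x1'}, {'x2','x3'}, {'x4'}, {'x5'}, {'x6'}, {'x7','x8'},
     {'x9','x10'}, {'x11'}, {'x12'}, {'x13'}, {'x14'}, {'x15'}],
    [{'x1'}, {'x2'}, {'x3'}, {'x4'}, {'x5'}, {'x6'}, {'x7'}, {'x8'},
     {'x9','x10'}, {'x11'}, {'x12'}, {'x13'}, {'x14'}, {'x15'}],
]
X1 = {'x1', 'x5', 'x8', 'x11'}
print([round(alpha_d_k(levels, X1, k), 3) for k in (1, 2, 3, 4)])
# [0.068, 0.545, 0.773, 1.0], i.e. 3/44, 24/44, 34/44, 1 as in Table 8
```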

Next, we use an example to explain Definition 5.

Example 4 (continued from Example 3). Suppose δ_L = 1 and δ_B = 1/11 (the reason for this choice is given in Subsection 5.2). For X1 = {x1, x5, x8, x11} and X2 = {x6, x7, x8, x9, x13}, the lower approximations and boundary regions are shown in Table 7. By Definition 5, we can calculate the description accuracies of these two target granules at the different granule levels; see Table 8 for details.

3. Optimal granule level based on granule description accuracy

As is well known, when people are faced with multiple choices in daily life, they would like to make a choice at minimal cost. Accordingly, among the many granule levels of a multi-scale information table, it is necessary to select an appropriate one under the condition that the selected granule level meets a given criterion. In this section, we solve this problem by defining the optimal granule level based on the granule description accuracies of a target granule and of a group of target granules, respectively. Furthermore, we develop the corresponding algorithms to select optimal granule levels.


Table 9. The description accuracies of X1 and X2 (rounded to one decimal place).

k                1     2     3     4
α_d(AT^k, X1)   0.1   0.5   0.8    1
α_d(AT^k, X2)   0.3   0.3   0.8    1

3.1. Optimal granule level for a target granule

Definition 6. Let S = (U, {AT^k | k = 1, 2, ···, s}) be a multi-scale information table, X ⊆ U be a target granule, and 0 < β ≤ 1 be a given parameter. If α_d(AT^k, X) ≥ β and α_d(AT^{k−1}, X) < β (if k − 1 exists), then k is called an optimal granule level with respect to X based on the granule description accuracy β.

Definition 6 provides a method for selecting the minimum granule level whose granule description accuracy satisfies a predetermined parameter β, so as to acquire a sufficiently detailed granule description. Here, the semantics of β is the requirement on the granule description accuracy of a target granule. It is worth noting that the optimal granule level k becomes greater as the parameter β becomes greater. In Subsection 5.3, we further explain this observation by numerical experiments.

Remark 2. It is known that 0 ≤ α_d(AT^k, X) ≤ 1 for any k, so we set 0 < β ≤ 1 in this paper. Note that, since α_d(AT^s, X) = 1 for any nonempty subset X of U, the optimal granule level of X always exists.

According to Definition 6, we can use Algorithm 2 to select an optimal granule level with respect to X based on the granule description accuracy β in a multi-scale information table.

Algorithm 2 Select an optimal granule level for a target granule.
Input: A multi-scale information table S = (U, {AT^k | k = 1, 2, ···, s}), a target granule X ⊆ U, and a given parameter β
Output: An optimal granule level of S
1: Initialize k = 1.
2: Use Algorithm 1 to compute α_d(AT^k, X).
3: If α_d(AT^k, X) < β, then set k = k + 1 and return to Step 2; otherwise, return k.
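Algorithm 2 is then a linear scan over the granule levels; the sketch below (ours) reuses alpha_d_k and the levels list from the previous sketch:

```python
# A sketch of Algorithm 2: the smallest k with alpha_d(AT^k, X) >= beta.

def optimal_level(partitions, X, beta):
    for k in range(1, len(partitions) + 1):
        if alpha_d_k(partitions, X, k) >= beta:
            return k
    return len(partitions)   # unreachable for 0 < beta <= 1, since
                             # alpha_d(AT^s, X) = 1 for nonempty X

print(optimal_level(levels, X1, beta=0.5))   # 2, as in Example 5
print(optimal_level(levels, X1, beta=0.7))   # 3
```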

Example 5 (continued from Example 4). The description accuracies of the target granules X1 and X2, rounded to one decimal place, are shown in Table 9. Suppose β = 0.5. Then the optimal granule level of X1 is 2 and that of X2 is 3. If β = 0.7, the optimal granule levels of X1 and X2 are both 3.

3.2. Optimal granule level for a group of target granules

For a group of target granules, the problem of optimal granule level selection becomes more complicated because of the many different requirements on granule description. Here, we only discuss optimal granule level selection based on three criteria commonly used by people in daily life, as detailed below.

Case 1. Optimal granule level selection based on the relative criterion. In this case, one regards a group of target granules as a whole and only sets a global parameter for them.

Definition 7. Let S = (U, {AT^k | k = 1, 2, ···, s}) be a multi-scale information table, {X_i}_{i=1}^r (X_i ⊆ U) be a group of target granules, and β_0 be a given parameter. If ave[α_d(AT^k, X_i)]_1^r ≥ β_0 and ave[α_d(AT^{k−1}, X_i)]_1^r < β_0 (if k − 1 exists), then k is called an optimal granule level with respect to the group of target granules {X_i}_{i=1}^r based on the relative accuracy β_0. Here,

ave[α_d(AT^k, X_i)]_1^r = (1/r) Σ_{i=1}^r α_d(AT^k, X_i).

According to Definition 7, we give Algorithm 3 to compute the optimal granule level based on the relative criterion.

Algorithm 3 Select an optimal granule level based on the relative criterion.
Input: A multi-scale information table S = (U, {AT^k | k = 1, 2, ···, s}), a group of target granules {X_i}_{i=1}^r (X_i ⊆ U), and a given parameter β_0
Output: An optimal granule level of S
1: Initialize k = 1.
2: For each X_i (i = 1, 2, ···, r), call Algorithm 1 to calculate α_d(AT^k, X_i).
3: If ave[α_d(AT^k, X_i)]_1^r < β_0, then set k = k + 1 and return to Step 2; otherwise, return k.


Table 10. The average accuracies of the descriptions of the group of target granules (rounded to one decimal place).

k                          1     2     3     4
ave[α_d(AT^k, X_i)]_1^2   0.2   0.4   0.8    1

Case 2. Optimal granule level selection based on the absolute criterion. This case sets a local parameter for each element of a group of target granules.

Definition 8. Let S = (U, {AT^k | k = 1, 2, ···, s}) be a multi-scale information table, {X_i}_{i=1}^r (X_i ⊆ U) be a group of target granules, and {β_i}_{i=1}^r be a group of given parameters. If α_d(AT^k, X_i) ≥ β_i for every target granule X_i, and there exists X_j such that α_d(AT^{k−1}, X_j) < β_j (if k − 1 exists), then k is called an optimal granule level with respect to the group of target granules {X_i}_{i=1}^r based on the absolute accuracies {β_i}_{i=1}^r.

By Definition 8, we design Algorithm 4 to compute the optimal granule level based on the absolute criterion.

Algorithm 4 Select an optimal granule level based on the absolute criterion.
Input: A multi-scale information table S = (U, {AT^k | k = 1, 2, ···, s}), a group of target granules {X_i}_{i=1}^r (X_i ⊆ U), and a group of given parameters {β_i}_{i=1}^r
Output: An optimal granule level of S
1: Initialize k = 1.
2: For each X_i (i = 1, 2, ···, r), call Algorithm 1 to calculate α_d(AT^k, X_i).
3: If there exists j ∈ {1, 2, ···, r} such that α_d(AT^k, X_j) < β_j, then set k = k + 1 and return to Step 2; otherwise, return k.

Case 3. Optimal granule level selection based on the double quantization criterion. In the third case, one not only sets a global parameter for the whole group of target granules, but also sets a local parameter for each element of the group.

Definition 9. Let S = (U, {AT^k | k = 1, 2, ···, s}) be a multi-scale information table, {X_i}_{i=1}^r (X_i ⊆ U) be a group of target granules, and β_0 and {β_i}_{i=1}^r be given parameters. If the following statements hold:
(1) α_d(AT^k, X_i) ≥ β_i for every target granule X_i, and ave[α_d(AT^k, X_i)]_1^r ≥ β_0;
(2) there exists a target granule X_j such that α_d(AT^{k−1}, X_j) < β_j, or ave[α_d(AT^{k−1}, X_i)]_1^r < β_0 (if k − 1 exists),
then k is called an optimal granule level with respect to the group of target granules {X_i}_{i=1}^r based on the double quantization accuracies β_0 and {β_i}_{i=1}^r.

Similarly, we put forward Algorithm 5 to select an optimal granule level based on the double quantization criterion. According to Definitions 7–9, it is easy to see that if q is an optimal granule level based on both the relative criterion and the absolute criterion, then q must be an optimal granule level based on the double quantization criterion.

Algorithm 5 Select an optimal granule level based on the double quantization criterion.
Input: A multi-scale information table S = (U, {AT^k | k = 1, 2, ···, s}), a group of target granules {X_i}_{i=1}^r (X_i ⊆ U), and the given parameters {β_i}_{i=1}^r and β_0
Output: An optimal granule level of S
1: Initialize k = 1.
2: For each X_i (i = 1, 2, ···, r), call Algorithm 1 to calculate α_d(AT^k, X_i).
3: If there exists j ∈ {1, 2, ···, r} such that α_d(AT^k, X_j) < β_j, or ave[α_d(AT^k, X_i)]_1^r < β_0, then set k = k + 1 and return to Step 2; otherwise, return k.
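Algorithms 3–5 share the same loop and differ only in the acceptance test of Step 3; the following sketch (ours, again reusing alpha_d_k from the Algorithm 1 sketch) makes this explicit:

```python
# Sketches of Algorithms 3-5: the loop is identical, only the test changes.

def average_accuracy(partitions, granules, k):
    accs = [alpha_d_k(partitions, X, k) for X in granules]
    return sum(accs) / len(accs)

def optimal_level_relative(partitions, granules, beta0):       # Algorithm 3
    for k in range(1, len(partitions) + 1):
        if average_accuracy(partitions, granules, k) >= beta0:
            return k

def optimal_level_absolute(partitions, granules, betas):       # Algorithm 4
    for k in range(1, len(partitions) + 1):
        if all(alpha_d_k(partitions, X, k) >= b
               for X, b in zip(granules, betas)):
            return k

def optimal_level_double(partitions, granules, betas, beta0):  # Algorithm 5
    for k in range(1, len(partitions) + 1):
        if (average_accuracy(partitions, granules, k) >= beta0
                and all(alpha_d_k(partitions, X, k) >= b
                        for X, b in zip(granules, betas))):
            return k

# e.g. optimal_level_absolute(levels, [X1, X2], [0.7, 0.6]) returns 3,
# matching Case 2 of Example 6.
```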

In what follows, we use an example to further explain Algorithms 3–5.

Example 6 (continued from Example 5). Suppose X1 = {x1, x5, x8, x11} and X2 = {x6, x7, x8, x9, x13} are the given target granules, considered as a group. According to Example 5, we can compute the average accuracy of the granule descriptions, which is shown in Table 10 (rounded to one decimal place).

For Case 1, suppose β_0 = 0.8. Then the optimal granule level of S based on the relative criterion is 3. For Case 2, suppose β_1 = 0.7 and β_2 = 0.6. Then, according to Example 4, the optimal granule level of S based on the absolute criterion is 3.


For Case 3, suppose β_0 = 0.8, β_1 = 0.7 and β_2 = 0.6. Then the optimal granule level of S based on the double quantization criterion is 3.

Note that the double quantization criterion is much stricter than both the relative and the absolute criteria. As a result, for the same group of target granules, the optimal granule level it yields is greater than or equal to those yielded by the other two criteria.

4. Impact on the optimal granule level with objects updating

In the previous section, we presented methods to select optimal granule levels from a multi-scale information table based on a target granule and on a group of target granules, respectively. Since real-life data often live in a dynamic environment, we need to analyze the impact on the obtained optimal granule levels. In this section, we mainly discuss the case of object updating, and we suppose that the table remains a multi-scale information table after new objects are added.

4.1. Impact on granule description accuracy with objects updating

Let us begin with the case of one new object added to the original multi-scale information table. To address this problem, we introduce some necessary notation:

1. Suppose the original multi-scale information table is S^(t) = (U^(t), {AT^k | k = 1, 2, ···, s}). For each k ∈ {1, 2, ···, s}, we denote the partition of U^(t) by U^(t)/AT^k = {[x]^(t)_{AT^k} | x ∈ U^(t)}, the lower and upper approximations and the boundary region of a target granule X ⊆ U^(t) by R^(t)(AT^k, X), R̄^(t)(AT^k, X) and BN^(t)(AT^k, X), and the granule description accuracy by α_d^(t)(AT^k, X).

2. For a new object x′, the updated multi-scale information table is denoted by S^(t+1) = (U^(t+1), {AT^k | k = 1, 2, ···, s}), where U^(t+1) = U^(t) ∪ {x′}. For each k ∈ {1, 2, ···, s}, we denote the partition of U^(t+1) by U^(t+1)/AT^k = {[x]^(t+1)_{AT^k} | x ∈ U^(t+1)}, the lower and upper approximations and the boundary region of a target granule X ⊆ U^(t) by R^(t+1)(AT^k, X), R̄^(t+1)(AT^k, X) and BN^(t+1)(AT^k, X), and the granule description accuracy by α_d^(t+1)(AT^k, X).

By the definition of an optimal granule level, it is easy to see that optimal granule level selection is closely related to the granule description accuracy of the target granule X, which in turn is determined by the lower approximation and the boundary region of X. Therefore, before evaluating the change of the granule description accuracy, we need to discuss the impact on the lower approximation and the boundary region of the target granule X.

Obviously, in the updated multi-scale information table S^(t+1) = (U^(t) ∪ {x′}, {AT^k | k = 1, 2, ···, s}), for any x ∈ U^(t), either [x]^(t+1)_{AT^k} = [x]^(t)_{AT^k} or [x]^(t+1)_{AT^k} = [x]^(t)_{AT^k} ∪ {x′}. For our purpose, we first discuss how to update the lower and upper approximations of a target granule X ⊆ U^(t) when a new object x′ is added.

Theorem 2. Let S^(t) = (U^(t), {AT^k | k = 1, 2, ···, s}) be an original multi-scale information table, X ⊆ U^(t) be a target granule, and S^(t+1) = (U^(t) ∪ {x′}, {AT^k | k = 1, 2, ···, s}) be the updated multi-scale information table. If [x′]^(t+1)_{AT^k} ≠ [x]^(t+1)_{AT^k} for any x ∈ U^(t) (i.e., x′ forms a singleton equivalence class), then

R^(t+1)(AT^k, X) = R^(t)(AT^k, X),  R̄^(t+1)(AT^k, X) = R̄^(t)(AT^k, X).

Proof. If [x′]^(t+1)_{AT^k} ≠ [x]^(t+1)_{AT^k} for any x ∈ U^(t), then [x′]^(t+1)_{AT^k} = {x′} and [x]^(t+1)_{AT^k} = [x]^(t)_{AT^k} for all x ∈ U^(t). Since x′ ∉ X, the singleton class {x′} is neither contained in X nor intersects X. Hence,

R^(t+1)(AT^k, X) = ∪{[x]^(t+1)_{AT^k} ∈ U^(t+1)/AT^k | [x]^(t+1)_{AT^k} ⊆ X} = ∪{[x]^(t)_{AT^k} ∈ U^(t)/AT^k | [x]^(t)_{AT^k} ⊆ X} = R^(t)(AT^k, X).

Similarly, we can prove R̄^(t+1)(AT^k, X) = R̄^(t)(AT^k, X). □

Theorem 2 shows that if the new object x′ forms an equivalence class containing only x′ in S^(t+1), then the lower and upper approximations of the target granule X do not change in S^(t+1). Otherwise, we have the following conclusion.

Lemma 2. Let S^(t) = (U^(t), {AT^k | k = 1, 2, ···, s}) be an original multi-scale information table, X ⊆ U^(t) be a target granule, and S^(t+1) = (U^(t) ∪ {x′}, {AT^k | k = 1, 2, ···, s}) be the updated multi-scale information table. If there exists x ∈ U^(t) such that [x′]^(t+1)_{AT^k} = [x]^(t+1)_{AT^k}, then

R̄^(t+1)(AT^k, X) ⊇ R̄^(t)(AT^k, X) and R^(t+1)(AT^k, X) ⊆ R^(t)(AT^k, X).


Proof. If there exists x ∈ U^(t) such that [x′]^(t+1)_{AT^k} = [x]^(t+1)_{AT^k}, then [x′]^(t+1)_{AT^k} = [x]^(t+1)_{AT^k} = [x]^(t)_{AT^k} ∪ {x′}. In S^(t+1), for X ⊆ U^(t) we have x′ ∉ X, and by the definitions of the lower and upper approximations,

R̄^(t+1)(AT^k, X) = ∪{[x]^(t+1)_{AT^k} ∈ U^(t+1)/AT^k | [x]^(t+1)_{AT^k} ∩ X ≠ ∅},
R^(t+1)(AT^k, X) = ∪{[x]^(t+1)_{AT^k} ∈ U^(t+1)/AT^k | [x]^(t+1)_{AT^k} ⊆ X}.

(1) For any x ∈ R̄^(t)(AT^k, X), we have [x]^(t)_{AT^k} ∩ X ≠ ∅. Since [x]^(t+1)_{AT^k} ⊇ [x]^(t)_{AT^k}, it follows that [x]^(t+1)_{AT^k} ∩ X ≠ ∅. Thus x ∈ R̄^(t+1)(AT^k, X), i.e., R̄^(t)(AT^k, X) ⊆ R̄^(t+1)(AT^k, X).

(2) For any x ∉ R^(t)(AT^k, X), it follows that [x]^(t)_{AT^k} ⊄ X. Since x′ ∉ X and [x]^(t+1)_{AT^k} equals either [x]^(t)_{AT^k} or [x]^(t)_{AT^k} ∪ {x′}, we have [x]^(t+1)_{AT^k} ⊄ X. Thus x ∉ R^(t+1)(AT^k, X), i.e., R^(t+1)(AT^k, X) ⊆ R^(t)(AT^k, X). □

Based on Lemma 2, for any target granule X, the lower and upper approximations change regularly when a new object x′ is added. The details are summarized as follows.

Theorem 3. Let S^(t) = (U^(t), {AT^k | k = 1, 2, ···, s}) be an original multi-scale information table, X ⊆ U^(t) be a target granule, and S^(t+1) = (U^(t) ∪ {x′}, {AT^k | k = 1, 2, ···, s}) be the updated multi-scale information table. If there exists x ∈ U^(t) such that [x′]^(t+1)_{AT^k} = [x]^(t+1)_{AT^k}, then the following statements hold:

(1) if x ∈ U^(t) − R̄^(t)(AT^k, X), then
R^(t+1)(AT^k, X) = R^(t)(AT^k, X),  R̄^(t+1)(AT^k, X) = R̄^(t)(AT^k, X);

(2) if x ∈ BN^(t)(AT^k, X), then
R^(t+1)(AT^k, X) = R^(t)(AT^k, X),  Ecn(R̄^(t+1)(AT^k, X)) = Ecn(R̄^(t)(AT^k, X));

(3) if x ∈ R^(t)(AT^k, X), then
Ecn(R^(t+1)(AT^k, X)) = Ecn(R^(t)(AT^k, X)) − 1,  Ecn(R̄^(t+1)(AT^k, X)) = Ecn(R̄^(t)(AT^k, X)).

Proof. (1) For $x \in U^{(t)} - \overline{R}^{(t)}(AT^k, X)$, it follows $x \notin \overline{R}^{(t)}(AT^k, X)$. In other words, $[x]^{(t)}_{AT^k} \cap X = \emptyset$. According to $x' \notin X$ and $[x]^{(t+1)}_{AT^k} = [x]^{(t)}_{AT^k} \cup \{x'\}$, it can be known that $[x]^{(t+1)}_{AT^k} \cap X = \emptyset$. And then, we have $x \notin \overline{R}^{(t+1)}(AT^k, X)$. Thus, $\overline{R}^{(t+1)}(AT^k, X) \subseteq \overline{R}^{(t)}(AT^k, X)$. By Lemma 2, we conclude $\overline{R}^{(t+1)}(AT^k, X) = \overline{R}^{(t)}(AT^k, X)$.

Based on the properties of the lower and upper approximations in Proposition 1, $\underline{R}^{(t+1)}(AT^k, X) \subseteq \overline{R}^{(t+1)}(AT^k, X)$, and we obtain $x \notin \underline{R}^{(t+1)}(AT^k, X)$ by $x \notin \overline{R}^{(t+1)}(AT^k, X)$, which leads to $[x]^{(t+1)}_{AT^k} \nsubseteq X$. Thus, we have that $[x]^{(t)}_{AT^k} \nsubseteq X$ by $[x]^{(t+1)}_{AT^k} = [x]^{(t)}_{AT^k} \cup \{x'\}$ and $x' \notin X$. So, $x \notin \underline{R}^{(t)}(AT^k, X)$, i.e., $\underline{R}^{(t)}(AT^k, X) \subseteq \underline{R}^{(t+1)}(AT^k, X)$. Consequently, it follows $\underline{R}^{(t+1)}(AT^k, X) = \underline{R}^{(t)}(AT^k, X)$.

(2) For $x \in BN^{(t)}(AT^k, X)$, it is known that $x \in \overline{R}^{(t)}(AT^k, X)$ and $x \notin \underline{R}^{(t)}(AT^k, X)$, i.e., $[x]^{(t)}_{AT^k} \cap X \neq \emptyset$ and $[x]^{(t)}_{AT^k} \nsubseteq X$. By $[x']^{(t+1)}_{AT^k} = [x]^{(t+1)}_{AT^k} = [x]^{(t)}_{AT^k} \cup \{x'\}$, we obtain that $[x']^{(t+1)}_{AT^k} \cap X \neq \emptyset$ and $[x']^{(t+1)}_{AT^k} \nsubseteq X$, i.e., $x' \in BN^{(t+1)}(AT^k, X)$. In addition, by $x' \notin BN^{(t)}(AT^k, X)$, we get $BN^{(t+1)}(AT^k, X) = BN^{(t)}(AT^k, X) \cup \{x'\}$, yielding $\underline{R}^{(t+1)}(AT^k, X) = \underline{R}^{(t)}(AT^k, X)$. By $BN(AT^k, X) = \overline{R}(AT^k, X) - \underline{R}(AT^k, X)$, we have $\overline{R}^{(t+1)}(AT^k, X) = \overline{R}^{(t)}(AT^k, X) \cup \{x'\}$. As a result, we conclude $Ecn(\overline{R}^{(t+1)}(AT^k, X)) = Ecn(\overline{R}^{(t)}(AT^k, X))$ by $[x']^{(t+1)}_{AT^k} = [x]^{(t+1)}_{AT^k}$.

(3) From $x \in \underline{R}^{(t)}(AT^k, X)$, it follows $x \in \overline{R}^{(t)}(AT^k, X)$ by $\underline{R}^{(t)}(AT^k, X) \subseteq \overline{R}^{(t)}(AT^k, X)$. Combining it with the result of (2), we can obtain $Ecn(\overline{R}^{(t+1)}(AT^k, X)) = Ecn(\overline{R}^{(t)}(AT^k, X))$.

According to $x \in \underline{R}^{(t)}(AT^k, X)$, we get $[x]^{(t)}_{AT^k} \subseteq X$. On account of $[x]^{(t+1)}_{AT^k} = [x]^{(t)}_{AT^k} \cup \{x'\}$ and $x' \notin X$, it can be verified that $[x]^{(t+1)}_{AT^k} \nsubseteq X$. Thus, we have $x \notin \underline{R}^{(t+1)}(AT^k, X)$. Moreover, it can be concluded that $x \in \underline{R}^{(t)}(AT^k, X) - \underline{R}^{(t+1)}(AT^k, X)$. Consequently, it follows $\underline{R}^{(t+1)}(AT^k, X) = \underline{R}^{(t)}(AT^k, X) - [x]^{(t)}_{AT^k}$. That is, $Ecn(\underline{R}^{(t+1)}(AT^k, X)) = Ecn(\underline{R}^{(t)}(AT^k, X)) - 1$ is at hand. □
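To make the quantities tracked in Theorems 2 and 3 concrete, the following is a minimal Python sketch (our illustration, not the authors' implementation) of the partition $U/AT^k$, the lower and upper approximations of a target granule X, the boundary region, and $Ecn(\cdot)$. A single-scale table at level k is assumed to be given as a dictionary mapping each object to its tuple of attribute values at that level.

    from collections import defaultdict

    def partition(table):
        """Group objects by their attribute-value tuples: the partition U/AT^k."""
        classes = defaultdict(set)
        for obj, values in table.items():
            classes[values].add(obj)
        return list(classes.values())

    def approximations(table, X):
        """Return (lower, upper, boundary) as lists of equivalence classes."""
        lower, upper = [], []
        for cls in partition(table):
            if cls <= X:      # the class is contained in X
                lower.append(cls)
            if cls & X:       # the class meets X
                upper.append(cls)
        boundary = [cls for cls in upper if cls not in lower]
        return lower, upper, boundary

    # Ecn(.) of a region is simply the number of classes it contains, e.g.
    # Ecn(R(AT^k, X)) = len(lower) and Ecn(BN(AT^k, X)) = len(boundary).

With these helpers, the three cases of Theorem 3 can be verified empirically by comparing len(lower) and len(boundary) before and after a new object is inserted.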

On the basis of Theorems 2 and 3, we have clarified how the number of basic granules in the lower and upper approximations changes after one new object is added. Similarly, we can obtain the change regularity of the number of basic granules in the boundary region as follows.


Lemma 3. Let $S^{(t)} = (U^{(t)}, \{AT^k \mid k = 1, 2, \cdots, s\})$ be an original multi-scale information table, $X \subseteq U^{(t)}$ be a target granule and $S^{(t+1)} = (U^{(t)} \cup \{x'\}, \{AT^k \mid k = 1, 2, \cdots, s\})$ be the updated multi-scale information table. If there exists $x \in \underline{R}^{(t)}(AT^k, X)$ such that $[x']^{(t+1)}_{AT^k} = [x]^{(t+1)}_{AT^k}$, then $Ecn(BN^{(t+1)}(AT^k, X)) = Ecn(BN^{(t)}(AT^k, X)) + 1$. Otherwise, $Ecn(BN^{(t+1)}(AT^k, X)) = Ecn(BN^{(t)}(AT^k, X))$.

Clearly, for a target granule $X \subseteq U^{(t)}$, if $Ecn(\underline{R}^{(t+1)}(AT^k, X)) = Ecn(\underline{R}^{(t)}(AT^k, X))$ and $Ecn(BN^{(t+1)}(AT^k, X)) = Ecn(BN^{(t)}(AT^k, X))$, then $\alpha_d^{(t+1)}(AT^k, X) = \alpha_d^{(t)}(AT^k, X)$. According to Theorem 3 and Lemma 3, we can obtain the following conclusion about the granule description accuracy with one object updating.

Theorem 4. Let $S^{(t)} = (U^{(t)}, \{AT^k \mid k = 1, 2, \cdots, s\})$ be an original multi-scale information table, $X \subseteq U^{(t)}$ be a target granule and $S^{(t+1)} = (U^{(t)} \cup \{x'\}, \{AT^k \mid k = 1, 2, \cdots, s\})$ be the updated multi-scale information table. If there exists $x \in \underline{R}^{(t)}(AT^k, X)$ such that $[x']^{(t+1)}_{AT^k} = [x]^{(t+1)}_{AT^k}$, then $\alpha_d^{(t+1)}(AT^k, X) \leq \alpha_d^{(t)}(AT^k, X)$. Otherwise, $\alpha_d^{(t+1)}(AT^k, X) = \alpha_d^{(t)}(AT^k, X)$.

Proof. By Definition 5 and Theorem 3, if there exists $x \in \underline{R}^{(t)}(AT^k, X)$ such that $[x']^{(t+1)}_{AT^k} = [x]^{(t+1)}_{AT^k}$, then

$$\alpha_d^{(t+1)}(AT^k, X) = \frac{\delta_L\,Ecn(\underline{R}^{(t)}(AT^k, X)) + \delta_B\,Ecn(BN^{(t)}(AT^k, X)) - \delta_L + \delta_B}{\delta_L\,Ecn(\underline{R}^{(t)}(AT^s, X)) + \delta_B\,Ecn(BN^{(t)}(AT^s, X)) - \delta_L + \delta_B}.$$

Thus, we can obtain $\alpha_d^{(t+1)}(AT^k, X) \leq \alpha_d^{(t)}(AT^k, X)$. As a result, $\alpha_d^{(t+1)}(AT^k, X) = \alpha_d^{(t)}(AT^k, X)$ if and only if $U^{(t)}/AT^k = U^{(t)}/AT^s$, where $k = 1, 2, \cdots, s-1$.

According to Theorem 3, if $[x']^{(t+1)}_{AT^k} \neq [x]^{(t+1)}_{AT^k}$ for any $x \in \underline{R}^{(t)}(AT^k, X)$, we have that $Ecn(\underline{R}^{(t+1)}(AT^k, X)) = Ecn(\underline{R}^{(t)}(AT^k, X))$ and $Ecn(BN^{(t+1)}(AT^k, X)) = Ecn(BN^{(t)}(AT^k, X))$. Hence, $\alpha_d^{(t+1)}(AT^k, X) = \alpha_d^{(t)}(AT^k, X)$. □

Theorem 4 shows the detailed impact on the granule description accuracy when one new object x′ is added. Based on this theorem, we continue to clarify the impact on the optimal granule level for a target granule with object updating.

4.2. The updating of optimal granule levels

Theorem 5. Let $S^{(t)} = (U^{(t)}, \{AT^k \mid k = 1, 2, \cdots, s\})$ be an original multi-scale information table, $X \subseteq U^{(t)}$ be a target granule, β be a given parameter, q be the optimal granule level of $S^{(t)}$ and $S^{(t+1)} = (U^{(t)} \cup \{x'\}, \{AT^k \mid k = 1, 2, \cdots, s\})$ be the updated multi-scale information table. If there exists $x \in \underline{R}^{(t)}(AT^q, X)$ such that $[x']^{(t+1)}_{AT^q} = [x]^{(t+1)}_{AT^q}$ and $\alpha_d^{(t+1)}(AT^q, X) < \beta$, then the optimal granule level of $S^{(t+1)}$ is greater than q. Otherwise, the optimal granule level of $S^{(t+1)}$ is equal to q.

Proof. It can be proved easily by Definition 6 and Theorem 4. □

Theorem 5 offers an approach to update the optimal granule level based on a target granule in the case of one new object being added. The detailed process is shown in Algorithm 6.

Algorithm 6 Select an optimal granule level based on a target granule for one object updating.
Input: The optimal granule level q of $S^{(t)}$, a target granule $X \subseteq U^{(t)}$, a given parameter β and a new object x′
Output: An optimal granule level of $S^{(t+1)}$
1: Initialize k = q.
2: Construct the lower approximation $\underline{R}^{(t)}(AT^k, X)$.
3: If there exists $x \in \underline{R}^{(t)}(AT^k, X)$ such that $[x']^{(t+1)}_{AT^k} = [x]^{(t+1)}_{AT^k}$ and $\alpha_d^{(t+1)}(AT^k, X) < \beta$, then k = k + 1, and return to Step 2; otherwise, return k.
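The loop of Algorithm 6 can be sketched as follows; this is only an illustration under our assumptions, where table_at(k) is a hypothetical helper returning the level-k single-scale table with the new object x′ already inserted, alpha_d(k, X) is a hypothetical helper implementing the granule description accuracy of Definition 5 on the updated table, and approximations is taken from the sketch above.

    def update_optimal_level(q, s, X, beta, x_new, table_at, alpha_d):
        """Algorithm 6: update the optimal granule level for one target granule."""
        k = q
        while k < s:  # level s (the finest) always attains accuracy 1
            table = table_at(k)                       # level k, with x' added
            old = {o: v for o, v in table.items() if o != x_new}
            lower, _, _ = approximations(old, X)      # R^(t)(AT^k, X)
            # Step 3: does x' share its level-k description with some object
            # of the old lower approximation, and does the accuracy drop below beta?
            joins_lower = any(table[x_new] == table[x]
                              for cls in lower for x in cls)
            if joins_lower and alpha_d(k, X) < beta:
                k += 1                                # move to a finer level
            else:
                break
        return k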

In Section 3, for a group of target granules, we have proposed three types of definitions for optimal granule levels based on three different criteria. Accordingly, we need to study the impact on these three types of optimal granule levels when one new object is added.

Theorem 6. Let $S^{(t)} = (U^{(t)}, \{AT^k \mid k = 1, 2, \cdots, s\})$ be an original multi-scale information table, $\{X_i\}_{i=1}^r$ ($X_i \subseteq U^{(t)}$) be a group of target granules, $\beta_0$ and $\{\beta_i\}_{i=1}^r$ be given parameters, q be the optimal granule level of $S^{(t)}$ and $S^{(t+1)} = (U^{(t)} \cup \{x'\}, \{AT^k \mid k = 1, 2, \cdots, s\})$ be the updated multi-scale information table. For some target granule $X_h$ ($h \in \{1, 2, \cdots, r\}$), if there exists $x \in \underline{R}^{(t)}(AT^q, X_h)$ such that $[x']^{(t+1)}_{AT^q} = [x]^{(t+1)}_{AT^q}$, then the following statements hold:

(1) if $ave[\alpha_d^{(t+1)}(AT^q, X_i)]_1^r < \beta_0$, then the optimal granule level based on the relative criterion of $S^{(t+1)}$ is greater than q. Otherwise, the optimal granule level is equal to q.

(2) if there exists $X_h$ such that $\alpha_d^{(t+1)}(AT^q, X_h) < \beta_h$, then the optimal granule level based on the absolute criterion of $S^{(t+1)}$ is greater than q. Otherwise, the optimal granule level is equal to q.

(3) if there exists $X_h$ such that $\alpha_d^{(t+1)}(AT^q, X_h) < \beta_h$, or $ave[\alpha_d^{(t+1)}(AT^q, X_i)]_1^r < \beta_0$, then the optimal granule level based on the double quantization criterion of $S^{(t+1)}$ is greater than q. Otherwise, the optimal granule level is equal to q.


Table 11
The updated multi-scale information table $S^{(t+1)} = (U^{(t)} \cup \{x'\}, \{AT^k \mid k = 1, 2, \cdots, s\})$.

U^(t) ∪ {x'}   a_1^1 a_1^2 a_1^3 a_1^4   a_2^1 a_2^2 a_2^3 a_2^4   a_3^1 a_3^2 a_3^3 a_3^4
x1             1     1     1     1       1     1     1     1       1     1     1     1
x2             1     1     1     1       1     2     2     2       1     2     2     2
x3             1     1     1     2       1     2     2     3       1     2     2     3
x4             1     1     2     3       1     2     3     4       1     2     3     4
x5             1     2     3     4       1     2     3     4       1     2     3     4
x6             1     2     3     4       2     3     4     5       1     2     4     5
x7             1     2     4     5       2     3     4     5       1     2     4     5
x8             1     2     4     5       2     3     4     5       1     2     4     6
x9             2     3     5     6       2     3     4     6       2     3     5     7
x10            2     3     5     6       2     3     4     6       2     3     5     7
x11            2     3     5     6       2     3     5     7       2     3     5     8
x12            2     3     5     7       2     4     6     8       2     3     5     8
x13            2     3     5     8       2     4     6     9       2     4     6     9
x14            2     3     6     9       2     4     7     10      2     4     7     10
x15            2     4     7     10      2     4     7     11      2     4     7     11
x'             1     2     3     11      2     3     4     12      1     2     4     5


In fact, Theorem 6 can easily be obtained by Definitions 7–9 and Theorem 4. Hence, we omit the proof of Theorem 6. Note that, for any $X_j \in \{X_i\}_{i=1}^r$, if there does not exist $x \in \underline{R}^{(t)}(AT^q, X_j)$ such that $[x']^{(t+1)}_{AT^q} = [x]^{(t+1)}_{AT^q}$, then the optimal granule levels based on the three different criteria of $S^{(t+1)}$ do not change.

It can be known from Section 3 that the optimal granule level selection problems based on the relative criterion and the absolute criterion are special cases of the one based on the double quantization criterion. Therefore, we only provide Algorithm 7 to select an optimal granule level based on the double quantization criterion for a multi-scale information table with one object updated.

Algorithm 7 Select an optimal granule level based on the double quantization criterion for one object updating.
Input: The optimal granule level q of $S^{(t)}$, a group of target granules $\{X_i\}_{i=1}^r$ ($X_i \subseteq U^{(t)}$), the given parameters $\{\beta_i\}_{i=1}^r$ and $\beta_0$, and a new object x′
Output: An optimal granule level of $S^{(t+1)}$
1: Initialize k = q.
2: For any $X_i$ ($i = 1, 2, \cdots, r$), compute the lower approximation $\underline{R}^{(t)}(AT^k, X_i)$.
3: If there exists $X_j$ such that $\alpha_d^{(t+1)}(AT^k, X_j) < \beta_j$, or $ave[\alpha_d^{(t+1)}(AT^k, X_i)]_1^r < \beta_0$, then k = k + 1, and return to Step 2; otherwise, return k.
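Analogously, a sketch of Algorithm 7 under the same assumptions (alpha_d is the same hypothetical accuracy helper, now evaluated on the updated table for every target granule) might read as follows; it encodes the double quantization criterion, i.e., every granule $X_i$ must keep its accuracy at least $\beta_i$ and the average accuracy must stay at least $\beta_0$.

    def update_optimal_level_group(q, s, granules, betas, beta0, alpha_d):
        """Algorithm 7: optimal level under the double quantization criterion."""
        k = q
        while k < s:  # level s always satisfies both requirements
            accs = [alpha_d(k, X) for X in granules]  # alpha_d^(t+1)(AT^k, X_i)
            absolute_ok = all(a >= b for a, b in zip(accs, betas))
            relative_ok = sum(accs) / len(accs) >= beta0
            if absolute_ok and relative_ok:
                break
            k += 1                                    # otherwise refine the level
        return k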

Remark 3. In Algorithms 6 and 7, based on Theorem 4, $\alpha_d^{(t+1)}(AT^k, X)$ (resp. $\alpha_d^{(t+1)}(AT^k, X_i)$) can be computed by the following formula:

$$\alpha_d^{(t+1)}(AT^k, X) = \frac{\delta_L\,Ecn(\underline{R}^{(t)}(AT^k, X)) + \delta_B\,Ecn(BN^{(t)}(AT^k, X)) - \delta_L + \delta_B}{\delta_L\,Ecn(\underline{R}^{(t)}(AT^s, X)) + \delta_B\,Ecn(BN^{(t)}(AT^s, X)) - \delta_L + \delta_B}.$$
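The formula of Remark 3 allows the accuracy to be refreshed in constant time from counts already stored for $S^{(t)}$. A small sketch (with $\delta_L = 1$ and $\delta_B = 1/11$ as default weights, our assumption here following the choice explained in Section 5) might read:

    def alpha_d_updated(ecn_low_k, ecn_bn_k, ecn_low_s, ecn_bn_s,
                        delta_L=1.0, delta_B=1.0 / 11):
        """Accuracy after one new object joins a lower-approximation class."""
        num = delta_L * ecn_low_k + delta_B * ecn_bn_k - delta_L + delta_B
        den = delta_L * ecn_low_s + delta_B * ecn_bn_s - delta_L + delta_B
        return num / den

Here ecn_low_k and ecn_bn_k stand for $Ecn(\underline{R}^{(t)}(AT^k, X))$ and $Ecn(BN^{(t)}(AT^k, X))$ (similarly for level s), and the $-\delta_L + \delta_B$ corrections follow the displayed formula.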

For the case where a set U′ of more than one new object is added, we first calculate the partition of U′ into equivalence classes. Then, we select one object from each equivalence class to construct a new set Q of objects. At last, we add Q into $S^{(t)}$. The reason is that the optimal granule level selection in this paper is closely related to the granule description accuracy of target granules, and the granule description accuracy can be calculated by the number of equivalence classes according to Definition 5. Therefore, by employing the method developed for the case of one new object, we can analyze the impact on the optimal granule level when more than one object is added to the multi-scale information table. Since the proofs are similar, we do not list them as propositions or theorems any more. However, in Section 5, by choosing some data sets from UCI, we will demonstrate with numerical experiments the impact on the optimal granule level when a group of objects is added.

At the end of this section, we further explain the above results by an example.

Example 7. The multi-scale information table $S^{(t+1)} = (U^{(t)} \cup \{x'\}, \{AT^k \mid k = 1, 2, \cdots, s\})$ in Table 11 was obtained by adding the new object x′ into Table 2. Let $X_1 = \{x_1, x_5, x_8, x_{11}\}$ and $X_2 = \{x_6, x_7, x_8, x_9, x_{13}\}$ be two target granules. Suppose $\beta_0 = 0.8$, $\beta_1 = 0.7$ and $\beta_2 = 0.6$. It can be known from Example 6 that the optimal granule levels based on the three different criteria of $S^{(t)}$ are all equal to 3.


Table 12
The description accuracies of target granules.

k                                         1     2     3     4
$\alpha_d^{(t+1)}(AT^k, X_1)$             0.1   0.5   0.8   1
$\alpha_d^{(t+1)}(AT^k, X_2)$             0.3   0.3   0.7   1
$ave[\alpha_d^{(t+1)}(AT^k, X_i)]_1^2$    0.2   0.4   0.8   1

Moreover,

$$\underline{R}^{(t)}(AT^3, X_1) = E_{31} \cup E_{34} \cup E_{38} = \{x_1, x_5, x_{11}\}, \quad BN^{(t)}(AT^3, X_1) = E_{36} = \{x_7, x_8\},$$
$$\underline{R}^{(t)}(AT^3, X_2) = E_{35} \cup E_{36} \cup E_{3,10} = \{x_6, x_7, x_8, x_{13}\}, \quad BN^{(t)}(AT^3, X_2) = E_{37} = \{x_9, x_{10}\},$$
$$\alpha_d^{(t)}(AT^3, X_1) = 0.8, \quad \alpha_d^{(t)}(AT^3, X_2) = 0.8.$$

Since $[x']^{(t+1)}_{AT^3} = [x_6]^{(t+1)}_{AT^3}$ and $x_6 \in \underline{R}^{(t)}(AT^3, X_2)$, we can compute $\alpha_d^{(t+1)}(AT^3, X_2) = 0.7 < \alpha_d^{(t)}(AT^3, X_2) = 0.8$, but $\alpha_d^{(t+1)}(AT^3, X_2) > \beta_2$ and $ave[\alpha_d^{(t+1)}(AT^3, X_i)]_1^2 = 0.8 = \beta_0$. The detailed results are shown in Table 12. By Theorem 6, it can be checked that the optimal granule levels based on the three different criteria of $S^{(t+1)}$ are still all equal to 3.
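As a cross-check of the sets listed above (our illustration, not code from the paper), one can feed the level-3 values of the original fifteen objects in Table 11 into the approximation sketch of Section 4:

    level3 = {
        'x1': (1, 1, 1),  'x2': (1, 2, 2),  'x3': (1, 2, 2),  'x4': (2, 3, 3),
        'x5': (3, 3, 3),  'x6': (3, 4, 4),  'x7': (4, 4, 4),  'x8': (4, 4, 4),
        'x9': (5, 4, 5),  'x10': (5, 4, 5), 'x11': (5, 5, 5), 'x12': (5, 6, 5),
        'x13': (5, 6, 6), 'x14': (6, 7, 7), 'x15': (7, 7, 7),
    }
    X1 = {'x1', 'x5', 'x8', 'x11'}
    X2 = {'x6', 'x7', 'x8', 'x9', 'x13'}
    low1, _, bn1 = approximations(level3, X1)
    low2, _, bn2 = approximations(level3, X2)
    assert set().union(*low1) == {'x1', 'x5', 'x11'}        # R(AT^3, X1)
    assert set().union(*bn1) == {'x7', 'x8'}                # BN(AT^3, X1)
    assert set().union(*low2) == {'x6', 'x7', 'x8', 'x13'}  # R(AT^3, X2)
    assert set().union(*bn2) == {'x9', 'x10'}               # BN(AT^3, X2)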

5. Experimental analysis

In this section, we analyze the time complexity of our algorithms, discuss the reasonableness of setting the weight parameters $\delta_L$ and $\delta_B$, and show the effectiveness of the proposed methods by some experiments. Besides, we compare the optimal granule levels between our method based on the granule description accuracy and the existing ones based on the consistencies of single-scale decision tables.

5.1. Time complexity analysis

In the previous sections, we have introduced seven algorithms. Specifically, Algorithm 1 is to calculate the granule description accuracy with respect to a target granule, and the rest are to select an optimal granule level. Note that Algorithm 1 is the basis of the other algorithms. Therefore, we first analyze the time complexity of Algorithm 1. For Step 1 of Algorithm 1, the time complexity of computing $|AT|$ partitions is $O(|U|^2 \cdot |AT|)$. In Step 2, the time complexity of computing the lower and upper approximations of $X \subseteq U$ is $O(|U|^2)$. Since Step 4 is a simple repeat of Steps 1–3, the time complexity of Algorithm 1 is $O(|U|^2 \cdot |AT|)$.

Algorithm 2 is to select an optimal granule level for a target granule by calling Algorithm 1 s times, where s is the number of granule levels. Then, its running time is s times greater than that of Algorithm 1. So, the time complexity of Algorithm 2 is $O(|U|^2 \cdot |AT| \cdot s)$.

Algorithms 3–5 are to select optimal granule levels for a group of target granules, and Algorithms 3 and 4 can be viewed as special cases of Algorithm 5. All of them are based on Algorithm 2, and their running time is r times that of Algorithm 2, where r is the number of target granules in the group. Hence, the time complexities of Algorithms 3–5 are all $O(|U|^2 \cdot |AT| \cdot s \cdot r)$.

Algorithms 6 and 7 are both designed for dynamic data. They are to select optimal granule levels based on a target granule and a group of target granules under object updating, respectively. So, Algorithm 6 can be viewed as a special case of Algorithm 7. Based on Algorithm 1, the time complexity of computing the lower approximation of $X \subseteq U$ is $O(|U|^2 \cdot |AT|)$. Accordingly, the time complexity of Algorithm 6 is $O((|U|^2 \cdot |AT| + |\underline{R}^{(t)}(AT^k, X_i)| \cdot |AT|) \cdot (s - q)) = O(|U|^2 \cdot |AT| \cdot (s - q))$, where q is the optimal granule level. And then, it is obvious that the time complexity of Algorithm 7 is $O(|U|^2 \cdot |AT| \cdot (s - q))$ as well. Specifically, when more than one object is added, the time complexity of Algorithm 7 is still $O(|U|^2 \cdot |AT| \cdot (s - q))$.

5.2. Analysis of setting weight parameters

In the following, we discuss how to select the values of the weight parameters $\delta_L$ and $\delta_B$. In Section 2.3, we have supposed that the weight of the lower approximation is far bigger than that of the boundary region, i.e., $\delta_L \gg \delta_B$. For this reason, we let $\delta_L = 1$ and $\delta_B = \frac{1}{n}$ (n is a positive integer) to simplify calculations. Then, the granule description accuracy $\alpha_d(AT^k, X)$ of X in the kth granule level is sensitive to n when X and k are known. Accordingly, the optimal granule level is also sensitive to n when X and k have been given. In this case, it becomes difficult to select the optimal granule level. In order to solve this problem, it is very important to select an appropriate value of n that makes $\alpha_d(AT^k, X)$ relatively stable, by using the idea of the limit of a sequence. Therefore, in searching for an appropriate value of n, we write $\alpha_d(AT^k, X) = \alpha_d(n)$ for conciseness. On the other hand, the optimal granule level is related to the parameter β. In this paper, the value of β is rounded to the nearest tenth. So, as long as there exists N such that $|\alpha_d(n) - \alpha_d(n+1)| < 0.01$ for all n > N, we call $\alpha_d(AT^k, X)$ a relatively stable value. Next, we give the range of n satisfying the above requirement by the following theorem.


Table 13
The detailed information about three open data sets.

Data set    Instances   Attributes
Iris        150         4 (Real)
Seeds       210         7 (Real)
Auto MPG    390         8 (5 real, and 3 discrete)

Theorem 7. For any $X \subseteq U$ and $k \in \{1, 2, \cdots, s\}$, let

$$\alpha_d(n) = \frac{Ecn(\underline{R}(AT^k, X)) + \frac{1}{n}\,Ecn(BN(AT^k, X))}{Ecn(\underline{R}(AT^s, X)) + \frac{1}{n}\,Ecn(BN(AT^s, X))}$$

be the granule description accuracy of the target granule X in the kth granule level. Then, if n > 10, we have $|\alpha_d(n) - \alpha_d(n+1)| < 0.01$.

Proof. If X and k are known, $Ecn(\underline{R}(AT^k, X))$, $Ecn(BN(AT^k, X))$, $Ecn(\underline{R}(AT^s, X))$ and $Ecn(BN(AT^s, X))$ can also be determined. Without loss of generality, let $Ecn(\underline{R}(AT^k, X)) = W_1$, $Ecn(BN(AT^k, X)) = W_2$, $Ecn(\underline{R}(AT^s, X)) = W_3$, and $Ecn(BN(AT^s, X)) = W_4$. Furthermore, we denote $\alpha_d(n) = \frac{W_1 + \frac{1}{n} W_2}{W_3 + \frac{1}{n} W_4}$. Note that $W_1$, $W_2$, $W_3$ and $W_4$ are all non-negative integers and we set $W_3 > W_2$. Then, we obtain

$$|\alpha_d(n) - \alpha_d(n+1)| = \left| \frac{W_1 + \frac{1}{n} W_2}{W_3 + \frac{1}{n} W_4} - \frac{W_1 + \frac{1}{n+1} W_2}{W_3 + \frac{1}{n+1} W_4} \right| = \left| \frac{W_2 W_3 - W_1 W_4}{(n W_3 + W_4)(n W_3 + W_4 + W_3)} \right| \leq \frac{W_2 W_3}{(n W_3 + W_4)(n W_3 + W_4 + W_3)} < \frac{1}{n^2}.$$

From $\frac{1}{n^2} < 0.01$, it follows n > 10. Since $\frac{1}{n^2} < 0.01$ implies $|\alpha_d(n) - \alpha_d(n+1)| < 0.01$, the proof is completed. □
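A quick numerical check of Theorem 7 (ours, with arbitrarily chosen non-negative counts satisfying $W_3 > W_2$) confirms that the gap stays below 0.01 whenever n > 10:

    def alpha_d_n(n, W1, W2, W3, W4):
        return (W1 + W2 / n) / (W3 + W4 / n)

    for W1, W2, W3, W4 in [(3, 1, 4, 0), (2, 3, 5, 2), (0, 4, 6, 1)]:
        for n in range(11, 1000):
            gap = abs(alpha_d_n(n, W1, W2, W3, W4)
                      - alpha_d_n(n + 1, W1, W2, W3, W4))
            assert gap < 0.01, (W1, W2, W3, W4, n)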

5.3. Numerical experiments

To show the effectiveness of our methods, we selected three open data sets from the UCI Machine Learning Repository [52], i.e., Iris, Seeds, and Auto MPG. The detailed information on the chosen data sets is shown in Table 13. Considering that the three chosen data sets are all single-scaled, we need to expand them into multi-scale information tables. In the experiments, we adopted the method in [45] to extend them from single-scale to 4-scale. The detailed process is as follows (a small code sketch of this expansion is given after the parameter groups below).

1. Delete the objects with missing values.
2. For any attribute a, calculate the maximum value $M_a$ and the minimum value $m_a$ of a as well as $M_a - m_a$.
3. Generate the first level of scale $a^1$ of a by splitting the range of values of a into two equal parts, i.e., $a^1 = \{[m_a, m_a + \frac{M_a - m_a}{2}], [m_a + \frac{M_a - m_a}{2}, M_a]\}$. And then, generate the second level of scale $a^2$ of a by splitting the range of values of $a^1$ into two equal parts, i.e., dividing the range of values of a into four equal parts. Similarly, $a^3$ and $a^4$ can be obtained by splitting the range of values of a into eight equal parts and sixteen equal parts, respectively.

After the above pre-processing of the three chosen data sets, we obtained three multi-scale information tables, and we labeled them No. 1, No. 2 and No. 3, respectively. In the following, based on the relationship among the algorithms in Subsection 5.1, we mainly evaluate the feasibility of Algorithm 5 for static data and of Algorithm 7 for dynamic data on the above three multi-scale information tables. In addition, in order to demonstrate the feasibility of Algorithm 7, we selected nine-tenths of the instances as the static data and the rest as the dynamic data in each multi-scale information table.

For static data, we first selected two target granules at random for each multi-scale information table. And then, we computed the granule description accuracies of these target granules at different granule levels by Algorithm 1. In the experiments, the importance of the granules in the lower approximation is set to be 11 times greater than that of the granules in the boundary region. It can be observed from Table 14 that the granule description accuracy of a target granule increases as the value k becomes greater. Moreover, because different values of β represent different requirements for accuracy, we select three different groups of parameters as follows:

$P_1$: $\beta_1 = 0.1$, $\beta_2 = 0.2$, $\beta_0 = 0.3$;
$P_2$: $\beta_1 = 0.4$, $\beta_2 = 0.5$, $\beta_0 = 0.6$;
$P_3$: $\beta_1 = 0.7$, $\beta_2 = 0.8$, $\beta_0 = 0.9$.
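The three-step expansion described above can be sketched as follows. The use of numpy and of equal-width binning into 2, 4, 8 and 16 intervals is our assumption about the concrete implementation, which the paper does not spell out beyond the description borrowed from [45].

    import numpy as np

    def expand_to_four_scales(column):
        """Label a real-valued attribute at granule levels 1 (coarsest) to 4 (finest)."""
        m, M = float(np.min(column)), float(np.max(column))
        levels = {}
        for level, bins in enumerate((2, 4, 8, 16), start=1):
            edges = np.linspace(m, M, bins + 1)
            # interval indices in {1, ..., bins}; right=True puts M into the last bin
            levels[level] = np.digitize(column, edges[1:-1], right=True) + 1
        return levels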


Table 14
The granule description accuracies of target granules.

Data set   A group of target granules             Granule level
                                                  1     2     3     4
No.1       $\alpha_d(AT^k, X_{11})$               0.0   0.3   0.8   1
           $\alpha_d(AT^k, X_{12})$               0.0   0.3   0.6   1
           $ave[\alpha_d(AT^k, X_{1i})]_1^2$      0.0   0.3   0.7   1
No.2       $\alpha_d(AT^k, X_{21})$               0.2   0.5   0.8   1
           $\alpha_d(AT^k, X_{22})$               0.1   0.4   0.9   1
           $ave[\alpha_d(AT^k, X_{2i})]_1^2$      0.2   0.5   0.9   1
No.3       $\alpha_d(AT^k, X_{31})$               0.0   0.3   0.7   1
           $\alpha_d(AT^k, X_{32})$               0.1   0.3   0.8   1
           $ave[\alpha_d(AT^k, X_{3i})]_1^2$      0.1   0.3   0.8   1

Table 15
The experimental results of Algorithm 5 for static data.

Data set   Instances   OGL
                       P1   P2   P3
No.1       135         2    3    4
No.2       189         2    3    3
No.3       351         2    3    4

Table 16
The experimental results of Algorithms 5 and 7 for dynamic data.

Data set   Instances          OGL            Running time (s)
           Original/updated   P1   P2   P3   Algorithm 5   Algorithm 7
No.1       135/15             3    3    4    0.25106       0.07705
No.2       189/21             2    3    3    0.50054       0.16698
No.3       351/39             2    3    4    1.77716       0.50178

The experimental results of selecting optimal granule levels are shown in Table 15, where "OGL" is the abbreviation of "optimal granule level". Based on Table 15, it is confirmed that the optimal granule level becomes greater as the parameter β becomes greater.

For dynamic data, we evaluate the optimal granule level selection from two aspects. On the one hand, we view the updated data set as a whole, and use Algorithm 5 to find the optimal granule level. On the other hand, we directly use Algorithm 7 to update the optimal granule levels according to the results shown in Table 15. The details can be found in Table 16. It can be observed from Tables 15 and 16 that, for the same parameters, the optimal granule levels of the data sets become greater when new objects are added into the original data sets. From Tables 15 and 16, it is also easy to observe that the running time of Algorithm 7 is less than that of Algorithm 5. Note that the computational cost in Table 16 is the average time of the repeated experiments under the three groups of parameters.

5.4. A comparison of our method with the existing ones

As is well known, optimal granule level selection is a basic problem in the study of multi-scale information tables. For instance, the work in [36–40] investigated optimal granule level selection based on the consistencies of single-scale decision tables. Hao et al. [45] studied optimal granule level selection based on three-way decisions. Besides, for generalized multi-scale decision tables, the work in [41–44] also researched optimal granule level selection based on the consistencies of single-scale decision tables. The above literature searched for a maximum granule level reaching a predetermined requirement from the finer relation to the coarser one. However, the main objective of the current paper is to develop optimal granule level selection methods based on the granule description accuracy in multi-scale information tables, and we select a minimum granule level satisfying a predetermined requirement from the coarser relation to the finer one.

Because the multi-scale information tables on which our method is based are the same as those in the Wu-Leung method [36], i.e., each attribute has the same number of granule levels, we only compared our method with the Wu-Leung method [36] in the experiments. Here, for each obtained multi-scale information table, we reselected two target granules which form a partition of the object set U. For ease of notation, we denote this partition by $U/D_h$ ($h \in \{1, 2, 3\}$). Moreover, $U/D_h$ is coarser than $U/AT^4$. The experimental results are reported in Table 17.


Table 17
The experimental results of Algorithm 5 and the Wu-Leung method.

Data set   Instances   OGL             Wu-Leung method
                       P1   P2   P3
No.1       150         2    3    4     2
No.2       210         2    3    3     3
No.3       390         2    3    3     4

By Table 17, we can make the following observations:

For Iris, the 2nd granule level is optimal when the granule description accuracy requirement $P_1$ and the consistency are satisfied simultaneously.

For Seeds, the 3rd granule level is optimal when the granule description accuracy requirements $P_2$, $P_3$ and the consistency are satisfied simultaneously.

For Auto MPG, the 4th granule level is optimal based on the consistency, which is greater than the optimal granule levels based on the granule description accuracy requirements $P_1$, $P_2$ and $P_3$.

In summary, the experiments show that our method is different from the Wu-Leung method [36]. Specifically, the optimal granule level output by our method tends to be less than or equal to the one output by the Wu-Leung method.

6. Conclusion

Data with a hierarchical structure are a common type in daily life. In general, people often select an appropriate scale level to meet specific requirements. In this paper, the main contribution is to select the most appropriate granule level based on the granule description accuracy in multi-scale information tables. We have found two facts: (1) the final results of the optimal granule levels are sensitive to the parameters given by users; (2) the basic granules in the lower approximation and the boundary region have different importance in terms of characterizing the accuracy of the description of a target granule, and in order to keep the accuracy of the description of a target granule in a relatively stable range, the importance of the granules in the lower approximation should be at least 11 times greater than that of the granules in the boundary region. Specifically, we have studied the optimal granule level selection for a target granule and for a group of target granules based on the proposed granule description accuracy in multi-scale information tables. And then, for dynamic data, we have analyzed the change of the optimal granule level. In addition, we have discussed the time complexity of the designed algorithms and the reasonability of setting the weight parameters. Finally, we have verified the effectiveness of our method by conducting some numerical experiments, and summarized the similarities and differences between our work and the existing literature.

It should be pointed out that this paper has only introduced the granule description accuracy into classical information tables. With respect to generalized information tables, the granule description accuracy should also be studied. As different attributes may have different numbers of levels of granulations in real-life applications, Li and Hu [41] made a generalization of the multi-scale information tables proposed by Wu and Leung [35]. Besides, Huang et al. [44] also extended the classical multi-scale information tables by allowing objects to take "different values" under decision attributes due to multiple scales. In other words, the optimal granule level selection based on granule description accuracies in these generalized models also deserves to be investigated in future.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The authors gratefully acknowledge the support of the National Natural Science Foundation of China (Nos. 11971211, 61562050, 61573173, 61772021, 61976130 and 11801440) and the Scientific Research Program Funded by Shaanxi Provincial Education Department (No. 19JK0380).

References

[1] L. Zadeh, Fuzzy sets and information granularity, in: M. Gupta, R. Ragade, R. Yager (Eds.), Advances in Fuzzy Set Theory and Applications, North-Holland, Amsterdam, 1979, pp. 3–18.
[2] L. Zadeh, Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic, Fuzzy Sets Syst. 90 (2) (1997) 111–127.
[3] T.Y. Lin, Granular computing, in: Announcement of the BISC Special Interest Group on Granular Computing, 1997.
[4] J.T. Yao, A ten-year review of granular computing, in: Proceedings of IEEE International Conference on Granular Computing, Fremont, CA, USA, 2007, pp. 734–739.
[5] Y.Y. Yao, Granular computing: basic issues and possible solutions, in: Proceedings of the 5th Joint Conference on Information Sciences, New Jersey, USA, 2000, pp. 186–189.
[6] T.Y. Lin, Granular computing: structures, representations, and applications, Lect. Notes Artif. Intell. 2639 (2003) 16–24.
[7] J.T. Yao, A.V. Vasilakos, W. Pedrycz, Granular computing: perspectives and challenges, IEEE Trans. Cybern. 43 (6) (2013) 1977–1989.


[8] W. Pedrycz, Granular Computing: Analysis and Design of Intelligent Systems, Taylor and Francis/CRC Press, Boca Raton, 2013.
[9] W. Pedrycz, S.M. Chen, Granular Computing and Intelligent Systems: Design with Information Granules of Higher Order and Higher Type, Springer, Heidelberg, 2011.
[10] W. Pedrycz, S.M. Chen, Granular Computing and Decision-Making: Interactive and Iterative Approaches, Springer, Heidelberg, 2015.
[11] W. Pedrycz, S.M. Chen, Information Granularity, Big Data, and Computational Intelligence, Springer, Heidelberg, Germany, 2015.
[12] Y.Y. Yao, Three-way decision and granular computing, Int. J. Approx. Reason. 103 (2018) 107–123.
[13] X.N. Li, Three-way fuzzy matroids and granular computing, Int. J. Approx. Reason. 114 (2019) 44–50.
[14] J.Y. Liang, Y.H. Qian, D.Y. Li, Q.H. Hu, Theory and method of granular computing for big data mining, Sci. Sinica Inform. 45 (11) (2015) 1355–1369.
[15] M.K. Afridi, N. Azam, J.T. Yao, E. Alznazi, A three-way clustering approach for handling missing data using GTRS, Int. J. Approx. Reason. 98 (2018) 11–24.
[16] R. Wille, Restructuring lattice theory: an approach based on hierarchies of concepts, in: I. Rival (Ed.), Ordered Sets, Reidel Publishing Company, Dordrecht, 1982, pp. 445–470.
[17] H.L. Zhi, J.H. Li, Granule description based on formal concept analysis, Knowl.-Based Syst. 104 (2016) 62–73.
[18] H.L. Zhi, J.H. Li, Granule description based on positive and negative attributes, Granular Comput. 4 (3) (2019) 337–350.
[19] H.L. Zhi, J.H. Li, Granule description based on necessary attribute analysis, Chinese J. Comput. 41 (12) (2018) 2702–2719.
[20] Z. Pawlak, Rough sets, Int. J. Comput. Inf. Sci. 11 (5) (1982) 341–356.
[21] R.E. Kent, Rough concept analysis: a synthesis of rough sets and formal concept analysis, Fundam. Inform. 27 (2–3) (1996) 169–181.
[22] H.L. Lai, D.X. Zhang, Concept lattices of fuzzy contexts: formal concept analysis vs. rough set theory, Int. J. Approx. Reason. 50 (5) (2009) 695–707.
[23] K.E. Wolff, A conceptual view of knowledge bases in rough set theory, Lect. Notes Comput. Sci. 2005 (2001) 220–228.
[24] G. Gediga, I. Düntsch, Modal-style operators in qualitative data analysis, in: Proceedings of IEEE International Conference on Data Mining, Maebashi, Japan, 2002, pp. 155–162.
[25] J.J. Qi, L. Wei, Z.Z. Li, A partitional view of concept lattice, Lect. Notes Comput. Sci. 3641 (2005) 74–83.
[26] Q. Wan, L. Wei, Approximate concepts acquisition based on formal contexts, Knowl.-Based Syst. 75 (2015) 78–86.
[27] Y.Y. Yao, A comparative study of formal concept analysis and rough set theory in data analysis, in: Proceedings of the 4th International Conference on Rough Sets and Current Trends in Computing, Uppsala, Sweden, 2004, pp. 59–68.
[28] Y.Y. Yao, Rough-set concept analysis: interpreting RS-definable concepts based on ideas from formal concept analysis, Inf. Sci. 346–347 (2016) 442–462.
[29] J.K. Chen, J.S. Mi, B. Xie, Y.J. Lin, A fast attribute reduction method for large formal decision contexts, Int. J. Approx. Reason. 106 (2019) 1–17.
[30] Y.H. Qian, J.Y. Liang, Y.Y. Yao, C.Y. Dang, MGRS: a multi-granulation rough set, Inf. Sci. 180 (6) (2010) 949–970.
[31] Y.H. Qian, S.Y. Li, J.Y. Liang, Z.Z. Shi, F. Wang, Pessimistic rough set based decisions: a multigranulation fusion strategy, Inf. Sci. 264 (2014) 196–210.
[32] Z.H. Jiang, X.B. Yang, H.L. Yu, D. Liu, P.X. Wang, Y.H. Qian, Accelerator for multi-granularity attribute reduction, Knowl.-Based Syst. 177 (2019) 145–158.
[33] K.Y. Liu, X.B. Yang, H. Fujita, D. Liu, X. Yang, Y.H. Qian, An efficient selector for multi-granularity attribute reduction, Inf. Sci. 505 (2019) 457–472.
[34] Y. Leung, J.S. Zhang, Z.B. Xu, Clustering by scale-space filtering, IEEE Trans. Pattern Anal. Mach. Intell. 22 (12) (2000) 1396–1410.
[35] W.Z. Wu, Y. Leung, Theory and applications of granular labelled partitions in multi-scale decision tables, Inf. Sci. 181 (18) (2011) 3878–3897.
[36] W.Z. Wu, Y. Leung, Optimal scale selection for multi-scale decision tables, Int. J. Approx. Reason. 54 (8) (2013) 1107–1129.
[37] W.Z. Wu, Y.H. Qian, T.J. Li, S.M. Gu, On rule acquisition in incomplete multi-scale decision tables, Inf. Sci. 378 (2017) 282–302.
[38] S.M. Gu, W.Z. Wu, On knowledge acquisition in multi-scale decision systems, Int. J. Mach. Learn. Cybern. 4 (5) (2013) 477–486.
[39] Y.H. She, J.H. Li, H.L. Yang, A local approach to rule acquisition in multi-scale decision tables, Knowl.-Based Syst. 89 (2015) 398–410.
[40] J.P. Xie, M.H. Yang, J.H. Li, Z. Zheng, Rule acquisition and optimal scale selection in multi-scale formal decision contexts and their applications to smart city, Future Gener. Comput. Syst. 83 (2018) 564–581.
[41] F. Li, B.Q. Hu, A new approach of optimal scale selection to multi-scale decision tables, Inf. Sci. 381 (2017) 193–208.
[42] F. Li, B.Q. Hu, J. Wang, Stepwise optimal scale selection for multi-scale decision tables via attribute significance, Knowl.-Based Syst. 129 (2017) 4–16.
[43] W.Z. Wu, Y. Leung, A comparison study of optimal scale combination selection in generalized multi-scale decision tables, Int. J. Mach. Learn. Cybern. (2019), https://doi.org/10.1007/s13042-019-00954-1.
[44] Z.H. Huang, J.J. Li, W.Z. Dai, R.D. Lin, Generalized multi-scale decision tables with multi-scale decision attributes, Int. J. Approx. Reason. 115 (2019) 194–208.
[45] C. Hao, J.H. Li, M. Fan, W.Q. Liu, E.C.C. Tsang, Optimal scale selection in dynamic multi-scale decision tables based on sequential three-way decisions, Inf. Sci. 415 (2017) 213–232.
[46] H.M. Chen, T.R. Li, C. Luo, S.J. Horng, G.Y. Wang, A rough set-based method for updating decision rules on attribute values' coarsening and refining, IEEE Trans. Knowl. Data Eng. 26 (12) (2014) 2886–2899.
[47] C.X. Hu, S.X. Liu, X.L. Huang, Dynamic updating approximations in multigranulation rough sets while refining or coarsening attribute values, Knowl.-Based Syst. 130 (2017) 62–73.
[48] D. Liu, T.R. Li, J.B. Zhang, Incremental updating approximations in probabilistic rough sets under the variation of attributes, Knowl.-Based Syst. 73 (2015) 81–96.
[49] C. Luo, T.R. Li, H.M. Chen, Dynamic maintenance of approximations in set-valued ordered decision systems under the attribute generalization, Inf. Sci. 257 (2014) 210–228.
[50] C. Luo, T.R. Li, Y.Y. Huang, H. Fujita, Updating three-way decisions in incomplete multi-scale information systems, Inf. Sci. 476 (2019) 274–289.
[51] X.B. Yang, Y. Qi, H.L. Yu, X.N. Song, J.Y. Yang, Updating multigranulation rough approximations with increasing of granular structures, Knowl.-Based Syst. 64 (2014) 59–69.
[52] A. Frank, A. Asuncion, UCI Repository of Machine Learning Databases, Tech. Rep., Univ. California, Sch. Inform. Comp. Sci., Irvine, CA, 2010, available from http://archive.ics.uci.edu/ml.