Applied Mathematical Modelling 37 (2013) 4139–4146
An object-parameter approach to predicting unknown data in incomplete fuzzy soft sets
Tingquan Deng, Xiaofei Wang
College of Science, Harbin Engineering University, Harbin 150001, PR China
Article info
Article history: Received 16 September 2011; Received in revised form 11 August 2012; Accepted 10 September 2012; Available online 20 September 2012
Keywords: Soft set; Incomplete fuzzy soft set; Complete distance; Relative dominance degree
Abstract
Incomplete data in soft sets lead to uncertainty and inaccuracy in representing and handling information. This paper introduces the notions of complete distance between two objects and relative dominance degree between two parameters. Based on both notions, an object-parameter method is proposed to predict unknown data in incomplete fuzzy soft sets. The proposal makes full use of the known data, including the information from the relationship between the known values of all objects on a certain parameter and the information from the relationship between the known values of an object on all parameters. The effectiveness of the proposal is verified on several examples by comparison with classical prediction methods. © 2012 Elsevier Inc. All rights reserved.
1. Introduction

The concept of a soft set started with the work of Molodtsov [1]. A soft set is a parameterized family of subsets of a universe of discourse and is a powerful mathematical model for handling data sets with uncertainty. Generally, a soft set can be represented intuitively by an information system or an information table. In the last decade, the theory of soft sets has been substantially enriched [2–5], including the operations on soft sets [6,7], the algebraic structures of soft sets [8–10], and so on. Unlike other mathematical tools for dealing with uncertain data, such as probability theory, fuzzy set theory and rough set theory, a soft set model requires no prior knowledge of the data set. Generalized models of soft sets have emerged rapidly to meet various practical demands by combining soft sets with fuzzy sets [11], with rough sets [12], with vague sets [13], with interval-valued fuzzy sets [14], with interval-valued intuitionistic fuzzy sets [15], and with other theories. Soft sets have been extensively and successfully applied to rule mining, attribute clustering, and decision support in many fields such as economics and engineering [16–21].

Decision support analysis is one of the most important applications of soft sets. An optimal choice among objects can be made by ranking the values of the objects on all parameters or attributes. Once some entries of an information table are absent, it is impossible to reach solutions to decision problems, and almost all applications of soft sets cannot be realized. In practical circumstances, it is quite likely that some unknown entries appear in soft sets. Such soft sets are referred to as incomplete soft sets. Analyzing such soft sets and predicting their unknown entries are prerequisites for the investigation and application of soft sets. Deleting all objects related to missing entries is the simplest way to transform an incomplete data set into a complete one.
This work was supported in part by the National Natural Science Foundation of China (10771043).
Corresponding author: T. Deng. E-mail address: [email protected]
0307-904X/$ - see front matter © 2012 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.apm.2012.09.010

However, deleting objects usually causes information loss. In contrast, filling in the unknown entries is a more effective approach, for example, by artificially or randomly filling the unknown entries with 1, 0 or some other values. Other methods rely on expert experience, statistics, Bayesian models or evidence theory [22–24].

Chen et al. [25] have pointed out that an information table for a soft set has a different interpretation from that for a rough set. The methods for predicting unknown data in rough set theory cannot be applied to incomplete soft sets, except for some simple methods like empirical or random filling, which usually introduce subjectivity and inaccuracy in describing uncertain problems. Zou and Xiao [26] presented a weighted-average method for soft sets and an average-probability method for fuzzy soft sets to predict unknown data in the related information tables. In the weighted-average method, the missing values are predicted by the weighted average of all possible choice values of the object, and the weight of each possible choice value is decided by the distribution of the other objects. The weighted-average method can only predict the sum of the values of every object on all parameters (the parameters sum of an object, for short); each unknown entry in the information table cannot be individually quantified. Although the average-probability method, called a fuzzy method, can predict individual unknown entries of fuzzy soft sets, the predicted values of all unknown entries in one parameter column are equal. Both methods take into account only the information from the values of objects on a certain parameter, and ignore the information from the values of an object on all parameters.

In this paper, a novel method, called the object-parameter method, is proposed to predict individual unknown entries in soft sets and in fuzzy soft sets. This method takes the full information between objects and between parameters into account. The concepts of complete distance between two objects and relative dominance degree between two parameters are introduced to reveal hidden information in fuzzy soft sets.
Based on both notions, the new method is put forward by weighting the evaluation values of unknown data.

The rest of this paper is organized as follows. Section 2 recalls some fundamental concepts from soft set theory. Some classical methods of predicting unknown data for soft sets and fuzzy soft sets are reviewed in Section 3. After investigating the relationships between parameters and between objects, an object-parameter method is proposed in Section 4 to predict unknown data in fuzzy soft sets, and a brief algorithm is presented. Some applications and analysis are given in Section 5. Experiments are reported in Section 6. Conclusions follow in Section 7.

2. Basic concepts from soft sets theory

This section recalls some basic notions on soft sets. Let $U = \{h_1, h_2, \ldots, h_m\}$ be a nonempty finite set, called the universe of discourse, where every element is referred to as an object. Let $E = \{e_1, e_2, \ldots, e_n\}$ be a nonempty finite set, called the set of parameters or attributes. A pair $(F, E)$ is called a soft set over $U$ if and only if $F$ is a mapping from $E$ to the power set of $U$. That is, for any $e \in E$, there exists a set $U_e \subseteq U$ such that $F(e) = U_e$.

A soft set can be considered as an information system or an information table, where each entry is 1 or 0 depending on whether or not an object belongs to the range of a parameter. Each entry thus indicates the value of an object on the corresponding parameter.

A fuzzy soft set is described by a pair $(F, E)$ over $U$, where $F$ is a mapping from $E$ to the set of all fuzzy subsets of $U$. That is, each entry in the information table of a fuzzy soft set is a quantity in the unit interval $[0, 1]$, representing the membership degree (or membership probability) of the object possessing the related parameter.
Therefore, a fuzzy soft set is a generalization of a soft set, and a soft set is a special case of a fuzzy soft set.

In the following, the domain of every soft set or fuzzy soft set is assumed to be $U = \{h_1, h_2, \ldots, h_m\}$ and the parameter set is $E = \{e_1, e_2, \ldots, e_n\}$. For an object $h_i \in U$ and a parameter $e_j \in E$, the value of $h_i$ on $e_j$ is denoted by $h_{ij}$, and the parameters sum of $h_i$ is denoted by $f_E(h_i) = \sum_{j=1}^{n} h_{ij}$. The objects can be ranked by their parameters sums, and an optimal decision can be made in a soft set by choosing the object $h_i \in U$ with the maximum parameters sum.

In a soft set, if there exist incomplete data, that is, the values of some entries are unknown, then the soft set is called incomplete. All the unknown values of an incomplete soft set are denoted by the sign $*$. For example, in the soft set $(F, E)$ shown in Table 1, all values of objects on parameters are known except those of $h_2$, $h_3$ and $h_4$ on $e_1$. The unknown data are denoted by $*$ in the information table, that is, $h_{21} = *$, $h_{31} = *$ and $h_{41} = *$.

In an incomplete soft set, the objects cannot be ranked in the traditional way and the quality of the information system cannot be evaluated straightforwardly. To perform a decision or an evaluation, an incomplete soft set has to be transformed into a traditional soft set. Therefore, predicting the unknown data and obtaining precise values of objects on the related parameters is an important issue to be addressed for an incomplete soft set.
Table 1. An incomplete soft set $(F, E)$.

U     e1    e2    e3    e4
h1    1     0     0     1
h2    *     1     1     1
h3    *     0     1     0
h4    *     0     0     0
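As a small illustration of why Table 1 cannot be ranked directly, the sketch below stores the table as a list of rows, with Python's `None` standing in for the sign $*$, and computes the parameters sum $f_E(h_i)$ only where every entry is known. The names `TABLE_1`, `parameters_sum` and `unknown_entries` are hypothetical helpers introduced here, not part of the original text.

```python
from typing import List, Optional

# Table 1: rows are objects h1..h4, columns are parameters e1..e4.
# None plays the role of the unknown sign "*" (an encoding chosen for this sketch).
TABLE_1: List[List[Optional[int]]] = [
    [1, 0, 0, 1],
    [None, 1, 1, 1],
    [None, 0, 1, 0],
    [None, 0, 0, 0],
]


def parameters_sum(row: List[Optional[int]]) -> Optional[int]:
    """Return f_E(h_i), or None when the row contains an unknown entry."""
    if any(value is None for value in row):
        return None
    return sum(row)


def unknown_entries(table: List[List[Optional[int]]]) -> List[tuple]:
    """Return the 1-based (object, parameter) index pairs of unknown entries."""
    return [
        (i + 1, j + 1)
        for i, row in enumerate(table)
        for j, value in enumerate(row)
        if value is None
    ]


sums = [parameters_sum(row) for row in TABLE_1]
print(sums)                      # [2, None, None, None]
print(unknown_entries(TABLE_1))  # [(2, 1), (3, 1), (4, 1)]
```

Only $h_1$ has a well-defined parameters sum, so no ranking of the four objects is possible until $h_{21}$, $h_{31}$ and $h_{41}$ are predicted.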
3. Classical methods of predicting unknown data in an incomplete fuzzy soft set

In this section, some classical methods of predicting unknown data in an incomplete fuzzy soft set are reviewed.

Since the range of a soft set includes only 0 and 1, the traditional way to predict the unknown values of objects on parameters is to fill the entries with 0 or 1, either randomly or artificially. This is inaccurate when a large number of unknown values appear in the soft set. Another disadvantage is that the information carried by the known data as a whole is not taken into account. If, instead of assigning either 0 or 1, a quantity in the unit interval $[0, 1]$ is assigned to each unknown value, the induced soft set will be more consistent with the original one. Motivated by this idea, a weighted-average method [26] was proposed to deal with incomplete data in an incomplete soft set.

Definition 3.1. Assume that $(F, E)$ is an incomplete soft set over $U$ and $h_{jl}$ is an unknown entry of $h_j \in U$ on $e_l \in E$. Then the probability that $h_{jl}$ is predicted to be 1 is defined by $p_{jl} = \frac{n_1}{n_0 + n_1}$ and the probability that $h_{jl}$ is predicted to be 0 is defined by $q_{jl} = 1 - p_{jl}$, where $n_0 = |\{h_i \in U \mid h_{il} = 0\}|$, $n_1 = |\{h_i \in U \mid h_{il} = 1\}|$, and $|X|$ denotes the cardinality of a set $X$.

Assume that $h_{jl}$ is unknown in an incomplete soft set $(F, E)$ and that $a_j$ denotes the number of parameters in $E_j = \{e_k \in E \mid h_{jk} = *\}$ on which the predicted values of $h_j$ are 1. Let $b_j = |E_j|$ and $\mathcal{A}_j = \{B \subseteq E_j \mid |B| = a_j\}$. Then $\mathcal{A}_j$ includes all the cases in which the sum of the predicted values of the unknown entries of $h_j$ is $a_j$, and all of the predicted values of $h_j$ on the parameters in $E_j - B$ are 0 for any $B \in \mathcal{A}_j$. It is evident that if $a_j = 0$ then $\mathcal{A}_j = \{\emptyset\}$, and if $a_j = b_j$ then $\mathcal{A}_j = \{E_j\}$. If there are $K$ possible choices for the parameters sum $f_E(h_j)$ of $h_j$, with possible choice values $c_l$, $l = 1, 2, \ldots, K$, the weight of the possible choice value $c_l$ can be defined by

$$
w_l = \begin{cases}
\prod_{e_t \in E_j} q_{jt}, & a_j = 0, \\
\sum_{A \in \mathcal{A}_j} \prod_{e_s \in A} p_{js} \prod_{e_t \in E_j - A} q_{jt}, & 0 < a_j < b_j, \\
\prod_{e_t \in E_j} p_{jt}, & a_j = b_j.
\end{cases}
$$

Thus the decision value of $f_E(h_j)$ is $W_j = \sum_{l=1}^{K} w_l c_l$.

The weighted-average method takes the existing known values of objects on parameters into consideration. Given that the range of a soft set is restricted to 0 and 1, this method is more reasonable than random filling, because the predicted value reflects the actual state of the incomplete data. However, the method has serious drawbacks. It only takes into consideration the information concerning the values of different objects on a parameter, and fails to take into account the whole information concerning the relationships between objects and between parameters. When there is a large number of unknown values in a soft set, the accuracy of the weighted-average method is very low. Another disadvantage is that only the parameters sum of every object can be predicted; the individual unknown entries cannot be accurately obtained. For example, for the incomplete soft set shown in Table 1, the weighted-average method gives $h_{21} = 1$, $h_{31} = 1$ and $h_{41} = 1$ due to the fact that $h_{11} = 1$.

Similar to the weighted-average method, an average-probability method [26] was also proposed to predict the incomplete data of an incomplete fuzzy soft set.

Proposition 3.2. Suppose $(F, E)$ is an incomplete fuzzy soft set over $U$ and the value $h_{jl}$ of object $h_j$ on parameter $e_l$ is unknown. Let $U_l = \{i \mid h_{il} \neq *, 1 \le i \le m\}$. Then the average-probability of $h_j$ belonging to $e_l$ is

$$
p_{jl} = \frac{\sum_{i \in U_l} h_{il}}{|U_l|}.
$$
The average-probability $p_{jl}$ is called the membership degree of $h_j$ on $e_l$. Clearly, the predicted membership degree is the average of all known membership degrees of the other objects on the same parameter. With this method, the extreme situations where the unknown values take only 0 or 1 are avoided.

Consider the example shown in Table 2. In this incomplete fuzzy soft set, as in the example above, all entries are known except those of $h_2$, $h_3$ and $h_4$ on parameter $e_1$. By the average-probability method, we have $h_{21} = h_{31} = h_{41} = 0.7$.

It is evident that the average-probability method mainly takes into account the known values of different objects on the same parameter. It simply calculates the average of all known values of objects on a certain parameter. This naturally leads to the situation that different objects receive the same predicted value on the same parameter, regardless of how much their values on the other parameters differ.
Table 2. An incomplete fuzzy soft set $(F, E)$.

U     e1     e2     e3     e4
h1    0.7    0.2    0.1    0.4
h2    *      0.4    0.7    0.5
h3    *      0.6    0.5    0.3
h4    *      0.8    0.2    0.9
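The two classical methods reviewed in this section can be sketched in a few lines. This is an illustrative reading of Definition 3.1 and Proposition 3.2, not code from [26]: the function names are hypothetical, `None` stands in for the sign $*$, and treating $p_{jl}$ as 0 when a whole column is unknown is an assumption made only for this sketch.

```python
from itertools import combinations
from math import prod
from typing import List, Optional

Table = List[List[Optional[float]]]  # rows: objects, columns: parameters

TABLE_1: Table = [[1, 0, 0, 1], [None, 1, 1, 1], [None, 0, 1, 0], [None, 0, 0, 0]]
TABLE_2: Table = [
    [0.7, 0.2, 0.1, 0.4],
    [None, 0.4, 0.7, 0.5],
    [None, 0.6, 0.5, 0.3],
    [None, 0.8, 0.2, 0.9],
]


def average_probability(table: Table, l: int) -> float:
    """Proposition 3.2: mean of the known values in parameter column l."""
    column = [row[l] for row in table if row[l] is not None]
    return sum(column) / len(column)


def weighted_average_sum(table: Table, j: int) -> float:
    """Definition 3.1: decision value W_j of the parameters sum of object j (0/1 table)."""
    unknown = [t for t, value in enumerate(table[j]) if value is None]
    known_sum = sum(value for value in table[j] if value is not None)
    # p[t] = n1 / (n0 + n1) over the known entries of column t
    # (0 if the whole column is unknown: an assumption of this sketch).
    p = {}
    for t in unknown:
        column = [row[t] for row in table if row[t] is not None]
        p[t] = sum(column) / len(column) if column else 0.0
    decision = 0.0
    for a in range(len(unknown) + 1):  # a unknown entries are predicted to be 1
        weight = sum(
            prod(p[s] for s in A) * prod(1 - p[t] for t in unknown if t not in A)
            for A in combinations(unknown, a)
        )
        decision += weight * (known_sum + a)  # possible choice value c_l
    return decision


print(average_probability(TABLE_2, 0))                     # 0.7
print([weighted_average_sum(TABLE_1, j) for j in range(4)])
```

For Table 1 the only known value in column $e_1$ is $h_{11} = 1$, so $p_{j1} = 1$ for every unknown entry and each unknown is in effect predicted as 1, in line with the example in the text. For Table 2 every unknown entry in column $e_1$ receives the same value 0.7.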
The average-probability method only considers the information from the relationships between different objects on the same parameter, and ignores the information from the relationships between the values of one object on different parameters. As a result, the predicted values are rough and of low accuracy, especially when the membership degrees of different objects on a certain parameter differ widely and when there is only a small number of known values in a fuzzy soft set.

Although fuzzy soft sets are a generalization of soft sets, it is verified that the weighted-average method for soft sets and the average-probability method for fuzzy soft sets are not consistent. In this paper we propose a new method, called the object-parameter method, to predict unknown data in incomplete fuzzy soft sets. The proposed method can also predict unknown data in incomplete soft sets. Before presenting this method, we discuss how the relationships between parameters and between objects in an incomplete fuzzy soft set bear on its unknown values.

4. The relationships between parameters and objects in an incomplete fuzzy soft set
In an incomplete fuzzy soft set $(F, E)$ over $U$, let $h_{jl}$ be the unknown value that we are going to predict. By convention we set $\frac{0}{0} = 0$.
Definition 4.1. Let $(F, E)$ be a fuzzy soft set over $U$. For $h_i, h_j \in U$ and $e_k \in E$, if $h_{ik}$ and $h_{jk}$ are both known, the relative distance from $h_i$ to $h_j$ with respect to $e_k$ is defined by

$$
d_{ij,k} = \frac{h_{ik} - h_{jk}}{\sum_{l \in U_k} |h_{lk} - h_{jk}|}. \tag{1}
$$
The quantity $d_{ij,k}$ can be used to evaluate the difference between the values of objects $h_i$ and $h_j$ on parameter $e_k$. It is not affected too much by extreme situations in which the difference between the membership degrees of $h_i$ and $h_j$ on $e_k$ is very large.

Proposition 4.2. Consider a fuzzy soft set $(F, E)$. For any object $h_j \in U$ and any parameter $e_k \in E$, $\sum_{i \in U_k} |d_{ij,k}| = 1$.
Proof. It is clear from Definition 4.1. □

Let

$$
d_{ij} = \frac{\sum_{k=1}^{n} d_{ij,k}}{|\{k \mid (i \in U_k) \wedge (j \in U_k)\}|}, \tag{2}
$$
then $d_{ij}$ is the complete distance between the values of objects $h_i$ and $h_j$ on all the parameters. The complete distance $d_{ij}$ can be positive or negative, which indicates which of the two objects is relatively larger when all the parameters are taken into account.

Proposition 4.3. Let $(F, E)$ be a fuzzy soft set on $U$. If $h_{ik} > h_{jk}$, then $d_{il,k} > d_{jl,k}$ for any $h_l \in U$.
Proof. It can be obtained directly from Definition 4.1. □

According to the complete distance between the values of objects, an unknown value $h_{jl}$ can be predicted in the following manner.

Proposition 4.4. Suppose $(F, E)$ is a fuzzy soft set over $U$. The unknown entry $h_{jl}$ is evaluated, according to the information from the relationship between the values of objects on a certain parameter, by

$$
h_{jl}^{\mathrm{object}} = \frac{\sum_{i \in U_l} (h_{il} - d_{ij})}{|U_l|}. \tag{3}
$$
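As an illustrative worked example (added here, not part of the original derivation), Eqs. (1)–(3) can be applied to the unknown entry $h_{21}$ of Table 2. Only $h_1$ is known on $e_1$, so $U_1 = \{1\}$, and $h_1$ and $h_2$ are both known on $e_2$, $e_3$ and $e_4$. Taking the difference in Eq. (1) as $h_{ik} - h_{jk}$ and the term in Eq. (3) as $h_{il} - d_{ij}$:

$$
d_{12,2} = \frac{0.2 - 0.4}{0.2 + 0 + 0.2 + 0.4} = -0.25, \qquad
d_{12,3} = \frac{0.1 - 0.7}{0.6 + 0 + 0.2 + 0.5} \approx -0.4615, \qquad
d_{12,4} = \frac{0.4 - 0.5}{0.1 + 0 + 0.2 + 0.4} \approx -0.1429,
$$

$$
d_{12} \approx \frac{-0.25 - 0.4615 - 0.1429}{3} \approx -0.2848, \qquad
h_{21}^{\mathrm{object}} = \frac{h_{11} - d_{12}}{|U_1|} \approx 0.7 + 0.2848 = 0.9848 .
$$

Because $h_2$ exceeds $h_1$ on every shared parameter, the complete distance $d_{12}$ is negative and the estimate for $h_{21}$ is pushed above the single known value $h_{11} = 0.7$.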
The relationship between the values of objects on a certain parameter provides important information about an incomplete fuzzy soft set. In addition, the relationship between the values of an object on all parameters is also very important in predicting unknown data in an incomplete fuzzy soft set.

Definition 4.5. Consider an incomplete fuzzy soft set $(F, E)$ on $U$. Let $h_i \in U$ and $e_k, e_l \in E$, and suppose $h_{ik}$ and $h_{il}$ are both known. The degree of $e_k$ being relatively dominant to $e_l$ regarding $h_i$ is defined by

$$
r_{i,kl} = \frac{h_{ik} - h_{il}}{h_{ik} + h_{il}}. \tag{4}
$$
If $r_{i,kl} > 0$, $e_k$ is said to be of degree $r_{i,kl}$ relatively dominant to $e_l$ regarding object $h_i$. If $r_{i,kl} < 0$, $e_k$ is said to be of degree $r_{i,kl}$ relatively dominant to $e_l$ regarding $h_i$. If $r_{i,kl} = 0$, $e_k$ is said to be relatively equal to $e_l$ regarding $h_i$.
Definition 4.6. Suppose $(F, E)$ is a fuzzy soft set over $U$. For $e_k, e_l \in E$, the degree of $e_k$ being definitely dominant to $e_l$ is defined by

$$
c_{kl} = \frac{\sum_{i \in U_k \cap U_l} r_{i,kl}}{|U_k \cap U_l|}. \tag{5}
$$

Moreover, the degree of average dominance of $e_k$ to $e_l$ is characterized by

$$
v_{kl} = \frac{c_{kl}}{\sum_{\{q \mid U_q \cap U_l \neq \emptyset\}} |c_{ql}|}. \tag{6}
$$
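As an illustrative worked example (added here, not part of the original text), consider the dominance of each parameter over $e_1$ in Table 2. Only $h_1$ is known on $e_1$, so $U_k \cap U_1 = \{1\}$ for every $k$ and $c_{k1} = r_{1,k1}$:

$$
c_{21} = \frac{0.2 - 0.7}{0.2 + 0.7} \approx -0.5556, \qquad
c_{31} = \frac{0.1 - 0.7}{0.1 + 0.7} = -0.75, \qquad
c_{41} = \frac{0.4 - 0.7}{0.4 + 0.7} \approx -0.2727, \qquad
c_{11} = 0 .
$$

The normalizing sum in Eq. (6) is $\sum_q |c_{q1}| \approx 0.5556 + 0.75 + 0.2727 = 1.5783$, which gives

$$
v_{21} \approx -0.3520, \qquad v_{31} \approx -0.4752, \qquad v_{41} \approx -0.1728 .
$$

All three degrees are negative because $h_1$ takes a larger value on $e_1$ than on any other parameter, and their absolute values sum to 1 by construction.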
Different parameters may have distinct impacts in predicting the unknown data in an incomplete fuzzy soft set. Based on the average dominance degree, all the unknown entries can be evaluated by taking the impacts of the parameters into account.

Proposition 4.7. Suppose $(F, E)$ is a fuzzy soft set over $U$. The unknown entry $h_{jl}$ is evaluated, according to the information from the relationship between the values of parameters regarding object $h_j$, by

$$
h_{jl}^{\mathrm{parameter}} = \frac{\sum_{k \in G_j} (h_{jk} - v_{kl})}{|G_j|}, \tag{7}
$$

where $G_j = \{k \mid (h_{jk} \neq *) \wedge (U_k \cap U_l \neq \emptyset), 1 \le k \le n\}$.

Combining Proposition 4.4 with Proposition 4.7, the unknown entry $h_{jl}$ can be predicted by linearly weighting $h_{jl}^{\mathrm{object}}$ and $h_{jl}^{\mathrm{parameter}}$ as follows:

$$
h_{jl} = w_1 h_{jl}^{\mathrm{object}} + w_2 h_{jl}^{\mathrm{parameter}}, \tag{8}
$$
where $w_1$ and $w_2$ stand for the weights of the impacts of objects and parameters on the unknown data, respectively. The weights can be preassigned based on the particular problem or on specific demands. If the objects and parameters are treated equally, the weights can be set as $w_1 = w_2 = \frac{1}{2}$.

The method given by Eq. (8), called the object-parameter method, takes into consideration both the information from the relationship between the values of objects on a certain parameter and the information from the relationship between the values of parameters regarding a certain object. It is more precise and reasonable, especially for a fuzzy soft set with few known values.

In conclusion, the proposed object-parameter method can be implemented through the following procedure to predict unknown data in an incomplete fuzzy soft set.

Algorithm 4.8. Given an incomplete fuzzy soft set $(F, E)$ over $U$, suppose $h_{jl}$ is to be predicted by the proposed object-parameter method.
(a) For any object $h_i \in U$ and any parameter $e_k \in E$, obtain $d_{ij,k}$ and $d_{ij}$ according to Eqs. (1) and (2). From these, compute the evaluation value of $h_{jl}$ regarding the relationship between objects through Eq. (3).
(b) According to Eqs. (4)–(6), obtain the values of $r_{i,kl}$, $c_{kl}$ and $v_{kl}$, respectively.
(c) By Eq. (7), compute the evaluation value of $h_{jl}$ regarding the relationship between parameters.
(d) Given a pair of weights $w_1$ and $w_2$, predict the unknown entry $h_{jl}$ by Eq. (8).

5. Applications and analysis

In this section some applications of the proposed object-parameter method are presented to show its effectiveness and efficiency in predicting unknown values in an incomplete fuzzy soft set. Consider the incomplete fuzzy soft set shown in Table 3. In this table, there are 6 objects and 7 parameters, and 5 unknown entries are to be predicted.
Table 3. An incomplete fuzzy soft set $(F, E)$.

U     e1     e2     e3     e4     e5     e6     e7
h1    1      0.3    0.4    0.8    0.2    0.2    0
h2    0.1    0.9    0      *      0.3    0.8    0.7
h3    *      0      0.6    *      0.5    0.2    1
h4    0.7    0.4    0.5    1      *      0.3    0
h5    0.8    0      1      0.6    0.2    0.9    *
h6    0.1    0.7    0      0.8    0.3    0.4    0.4
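The steps of Algorithm 4.8 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the authors' implementation: `None` encodes the sign $*$, all function names are hypothetical, and each of the two component estimates is clipped to $[0, 1]$ before weighting. The clipping is not stated in the text; it is assumed here because it appears to be needed to reproduce the values the paper reports for Tables 1 and 2.

```python
from typing import List, Optional

Table = List[List[Optional[float]]]  # rows: objects, columns: parameters, None: unknown


def ratio(num: float, den: float) -> float:
    """Divide with the convention 0/0 = 0 stated at the start of Section 4."""
    return num / den if den != 0 else 0.0


def known(table: Table, k: int) -> List[int]:
    """U_k: indices of the objects whose value on parameter k is known."""
    return [i for i, row in enumerate(table) if row[k] is not None]


def complete_distance(table: Table, i: int, j: int) -> float:
    """d_ij of Eq. (2): mean of the relative distances d_ij,k of Eq. (1)."""
    shared = [k for k in range(len(table[0]))
              if table[i][k] is not None and table[j][k] is not None]
    total = 0.0
    for k in shared:
        den = sum(abs(table[q][k] - table[j][k]) for q in known(table, k))
        total += ratio(table[i][k] - table[j][k], den)
    return ratio(total, len(shared))


def object_estimate(table: Table, j: int, l: int) -> float:
    """Eq. (3): estimate of h_jl from the other objects' values on e_l."""
    U_l = known(table, l)
    return ratio(sum(table[i][l] - complete_distance(table, i, j) for i in U_l), len(U_l))


def dominance(table: Table, k: int, l: int) -> float:
    """c_kl of Eq. (5): mean of r_i,kl (Eq. (4)) over objects known on both e_k and e_l."""
    both = [i for i in known(table, k) if table[i][l] is not None]
    total = sum(ratio(table[i][k] - table[i][l], table[i][k] + table[i][l]) for i in both)
    return ratio(total, len(both))


def parameter_estimate(table: Table, j: int, l: int) -> float:
    """Eq. (7), with v_kl = c_kl / sum_q |c_ql| as in Eq. (6)."""
    overlapping = [q for q in range(len(table[0]))
                   if any(table[i][l] is not None for i in known(table, q))]
    norm = sum(abs(dominance(table, q, l)) for q in overlapping)
    G_j = [k for k in overlapping if table[j][k] is not None]
    return ratio(sum(table[j][k] - ratio(dominance(table, k, l), norm) for k in G_j), len(G_j))


def clip01(x: float) -> float:
    """Clip to [0, 1] (an assumption of this sketch, see the lead-in)."""
    return min(1.0, max(0.0, x))


def predict(table: Table, j: int, l: int, w1: float = 0.5, w2: float = 0.5) -> float:
    """Eq. (8): weighted combination of the object and parameter estimates."""
    return w1 * clip01(object_estimate(table, j, l)) + w2 * clip01(parameter_estimate(table, j, l))


TABLE_3: Table = [
    [1, 0.3, 0.4, 0.8, 0.2, 0.2, 0],
    [0.1, 0.9, 0, None, 0.3, 0.8, 0.7],
    [None, 0, 0.6, None, 0.5, 0.2, 1],
    [0.7, 0.4, 0.5, 1, None, 0.3, 0],
    [0.8, 0, 1, 0.6, 0.2, 0.9, None],
    [0.1, 0.7, 0, 0.8, 0.3, 0.4, 0.4],
]

for j, row in enumerate(TABLE_3):
    for l, value in enumerate(row):
        if value is None:
            print(f"h{j + 1}{l + 1} = {predict(TABLE_3, j, l):.4f}")
```

All estimates are computed from the original table only, so the order in which the unknown entries are visited does not matter. The weights default to $w_1 = w_2 = \frac{1}{2}$ and can be overridden per call.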
Let the weights of objects and parameters be equal, i.e., $w_1 = w_2 = \frac{1}{2}$. By using Algorithm 4.8 we obtain the unknown values $h_{24} = 0.7363$, $h_{31} = 0.5792$, $h_{34} = 0.7403$, $h_{45} = 0.3650$, and $h_{57} = 0.4176$.

If we use the average-probability method to predict the unknown data, we obtain $h_{24} = 0.8$, $h_{31} = 0.54$, $h_{34} = 0.8$, $h_{45} = 0.3$, and $h_{57} = 0.42$. Clearly, these results differ from those obtained by the new proposal. It is impossible to verify from this example alone which result is optimal, and the computational complexity of the average-probability method is lower than that of the proposal. Nevertheless, we argue that the proposed object-parameter method is superior to and more robust than either the weighted-average method or the average-probability method. The main reason is that the proposal makes full use of the information in the incomplete fuzzy soft set, including the relationships between parameters and the differences between objects. Furthermore, the proposal can predict unknown data in incomplete soft sets as well as in incomplete fuzzy soft sets. A third advantage is that the predicted values of different objects on a certain parameter differ according to the objects' known entries, unlike the results obtained with the weighted-average method or the average-probability method.

Let us come back to the examples shown in Table 1 and Table 2. By using the proposed method we obtain $h_{21} = 1.0$, $h_{31} = 0.8333$ and $h_{41} = 0.5833$ for Table 1, and $h_{21} = 0.9257$, $h_{31} = 0.8889$ and $h_{41} = 0.9833$ for Table 2. In comparison with the predicted values $h_{21} = h_{31} = h_{41} = 1.0$ for Table 1 and $h_{21} = h_{31} = h_{41} = 0.7$ for Table 2, the object-parameter method is more reasonable and more reliable.

Let us consider a practical application of the proposed object-parameter method to predicting unknown data in an incomplete fuzzy soft set.
In an interview for a translator position in a multinational company, there are four candidates $h_1$, $h_2$, $h_3$, and $h_4$. Every candidate is evaluated according to the candidate's abilities in speaking, reading, writing, and listening in English. These abilities are denoted by a parameter set $E = \{e_1, e_2, e_3, e_4\}$. The result of the interview is incomplete because some data were lost, as shown in Table 4. The company wants to decide which candidate qualifies for the position according to the candidates' abilities in English. Here, the criterion is that the candidate with the highest overall ability in English will be employed.

First, by using the proposed object-parameter method to predict the unknown data in Table 4, we have $h_{24} = 0.6205$ and $h_{41} = 0.4366$. Then, according to the parameters sum of the objects, $f_E(h_i) = \sum_{j=1}^{4} h_{ij}$, we have $f_E(h_1) = 2.1$, $f_E(h_2) = 2.0205$, $f_E(h_3) = 2.0$, and $f_E(h_4) = 1.8366$. The result shows that the first candidate $h_1$ is the one qualified for the position.

6. Experiments

This section presents an experiment to compare the performance of the proposed method with that of the existing one. The experimental data set, forestfires, is taken from the UCI repository [27]. It is a complete data set with 517 instances (objects) and 13 attributes, indicating where and when a fire is more likely to occur in the forests. The four weather conditions, temp, RH, wind, and rain, are important features causing forest fires and form a reduct of the information table. In our experiment these four attributes are considered.

This data set is an information table or a knowledge representation system, rather than a soft set, since the two systems have different interpretations [25,28]. The proposed method for soft sets cannot be directly applied to this data set. We transform this information table into a fuzzy soft set by discretizing and fuzzifying the values of each attribute.
In the discretization procedure, the range of values of each attribute is divided into $c$ equal-span intervals (classes), and each interval is assigned a different parameter. The membership degrees of a given value belonging to all parameters are specified by a trapezoid function. In detail, let $a_{\min}$ and $a_{\max}$ denote the minimum and maximum values of attribute $a$. The interval $[a_{\min}, a_{\max}]$ is divided evenly into $c$ subintervals and $[a_i, a_{i+1}]$ denotes the $i$-th subinterval, where $a_1 = a_{\min}$ and $a_{c+1} = a_{\max}$. Let $v \in [a_{\min}, a_{\max}]$; then the fuzzy set (membership degrees) of $v$ belonging to all subintervals is defined by

$$
\begin{cases}
\dfrac{1}{[a_1, a_2]} + \dfrac{\frac{v - a_1}{a_2 - a_1}}{[a_2, a_3]}, & \text{if } v \in [a_1, a_2], \\[2ex]
\dfrac{\frac{a_{i+1} - v}{a_{i+1} - a_i}}{[a_{i-1}, a_i]} + \dfrac{1}{[a_i, a_{i+1}]} + \dfrac{\frac{v - a_i}{a_{i+1} - a_i}}{[a_{i+1}, a_{i+2}]}, & \text{if } v \in [a_i, a_{i+1}], \\[2ex]
\dfrac{\frac{a_{c+1} - v}{a_{c+1} - a_c}}{[a_{c-1}, a_c]} + \dfrac{1}{[a_c, a_{c+1}]}, & \text{if } v \in [a_c, a_{c+1}],
\end{cases}
$$

where each term is written as membership degree over subinterval.
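A minimal sketch of this fuzzification step is given below, assuming the reading of the trapezoid function in which the subinterval containing $v$ receives degree 1 and its left and right neighbours receive degrees that fall off linearly. The function name `fuzzify` is hypothetical, and assigning a boundary value $v = a_i$ to the subinterval that starts at $a_i$ is a choice made for this sketch (the text does not settle it).

```python
from typing import List


def fuzzify(v: float, a_min: float, a_max: float, c: int) -> List[float]:
    """Membership degrees of v in the c equal-span subintervals of [a_min, a_max]."""
    if c < 1 or a_max <= a_min:
        raise ValueError("need c >= 1 and a_max > a_min")
    if not a_min <= v <= a_max:
        raise ValueError("v must lie in [a_min, a_max]")
    width = (a_max - a_min) / c
    # 0-based index of the subinterval containing v; v == a_max goes to the last one.
    i = min(int((v - a_min) / width), c - 1)
    # Relative position of v inside its subinterval, kept within [0, 1].
    t = min(1.0, max(0.0, (v - (a_min + i * width)) / width))
    degrees = [0.0] * c
    degrees[i] = 1.0              # the subinterval that contains v
    if i > 0:
        degrees[i - 1] = 1.0 - t  # left neighbour: (a_{i+1} - v) / (a_{i+1} - a_i)
    if i < c - 1:
        degrees[i + 1] = t        # right neighbour: (v - a_i) / (a_{i+1} - a_i)
    return degrees


print(fuzzify(1.25, 0.0, 3.0, 3))  # [0.75, 1.0, 0.25]
print(fuzzify(0.5, 0.0, 2.0, 2))   # [1.0, 0.5]
```

Applying `fuzzify` to every value of an attribute yields $c$ columns of membership degrees, one per subinterval, which become parameters of the induced fuzzy soft set.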
After discretization and fuzzification of the information table, a fuzzy soft set is formed, where the alternatives (objects) are the same as the instances in the information table and the number of parameters in the induced fuzzy soft set is $c$ times the number of attributes. Note that it is not always necessary to discretize the value ranges of all attributes into the same number of subintervals. For example, if the collection of values of an attribute is composed of $d$ symbols, it can be discretized into $d$ classes, rather than $c$ intervals, and the entries of the corresponding $d$ columns in the induced fuzzy soft set are only 1 or 0.

Table 4. An incomplete data set.

U     e1     e2     e3     e4
h1    0.4    0.6    0.3    0.8
h2    0.3    0.9    0.2    *
h3    0.6    0.2    0.5    0.7
h4    *      0.7    0.5    0.2
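Returning to the translator example of Section 5, the decision step on Table 4 can be sketched as follows. The two predicted values are taken directly from the text ($h_{24} = 0.6205$, $h_{41} = 0.4366$) rather than recomputed; `None` encodes the sign $*$, and the names `fill` and `rank_by_parameters_sum` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

TABLE_4: List[List[Optional[float]]] = [
    [0.4, 0.6, 0.3, 0.8],
    [0.3, 0.9, 0.2, None],
    [0.6, 0.2, 0.5, 0.7],
    [None, 0.7, 0.5, 0.2],
]

# Predicted values reported in Section 5, keyed by 0-based (object, parameter).
PREDICTED: Dict[Tuple[int, int], float] = {(1, 3): 0.6205, (3, 0): 0.4366}


def fill(table: List[List[Optional[float]]],
         predicted: Dict[Tuple[int, int], float]) -> List[List[float]]:
    """Replace every unknown entry by its predicted value."""
    return [
        [predicted[(i, j)] if value is None else value for j, value in enumerate(row)]
        for i, row in enumerate(table)
    ]


def rank_by_parameters_sum(table: List[List[float]]) -> List[Tuple[str, float]]:
    """Return (object name, f_E) pairs sorted by decreasing parameters sum."""
    sums = [(f"h{i + 1}", round(sum(row), 4)) for i, row in enumerate(table)]
    return sorted(sums, key=lambda pair: pair[1], reverse=True)


ranking = rank_by_parameters_sum(fill(TABLE_4, PREDICTED))
print(ranking)  # [('h1', 2.1), ('h2', 2.0205), ('h3', 2.0), ('h4', 1.8366)]
```

The first entry of `ranking` is the candidate selected by the maximum parameters sum, here $h_1$.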
Table 5. Comparative results of performance of the proposal and existing method.

           c = 2     c = 3     c = 4
V-fuzzy    0.3830    0.2263    0.0638
V-new      0.5300    0.3520    0.1431
Decision-making analysis is an important practical application of fuzzy soft sets. The decision on the objects (alternatives) is realized by ranking the sums of their values on the parameters. To apply the proposal, entries of the induced fuzzy soft set are randomly marked as missing with a probability $p$. As mentioned above, a soft set is different from an information table, and classical prediction methods from rough set theory for information tables cannot be adapted to soft sets. The weighted-average method in [26] can only predict the sum of the values of objects on parameters for standard incomplete soft sets. Therefore, the proposed method is compared only with the fuzzy method (average-probability method) in predicting missing entries. The consistency of the decision ranking is used to characterize the performance of the prediction methods.

In the experiment the number of classes $c$ for all attributes is set to 2, 3 and 4. Thus, the induced fuzzy soft sets have 517 objects and 8, 12 and 16 parameters, respectively. In each fuzzy soft set the probability $p$ of missing entries is set to 1%, so that there are about 40, 60 and 80 missing entries, respectively. The experimental results are summarized in Table 5. In Table 5, V-fuzzy stands for the proportion of rankings derived from the average-probability method that coincide with those derived from the original induced fuzzy soft sets, whereas V-new denotes the corresponding proportion for the proposed method. The accuracy decreases as the number of missing data increases. The experimental results indicate that the proposed object-parameter method outperforms the average-probability method.

7. Conclusions

This paper analyzes the properties of unknown data in incomplete soft sets and proposes a new method, called the object-parameter method, to predict unknown entries in incomplete information tables.
The proposed object-parameter method makes full use of the information from known data in soft sets and fuzzy soft sets. The information from the relationship between the values of objects on a certain parameter, as well as that from the relationship between the values of every object on all parameters, is integrated into the predicted values of unknown data. In comparison with classical methods, the proposal exhibits several advantages in effectiveness and robustness in predicting unknown data, including:
– The proposal makes full use of information from the incomplete fuzzy soft set, including the relationships between parameters and the differences between objects.
– The predicted values of different objects on a certain parameter differ according to the objects' known entries, unlike the results predicted by the weighted-average method or the average-probability method.
– The proposal can predict unknown data in incomplete soft sets as well as in incomplete fuzzy soft sets.

More applications of the proposed method to predicting missing data in incomplete fuzzy soft sets are under consideration.

References
[1] D. Molodtsov, Soft set theory – first results, Comput. Math. Appl. 37 (1999) 19–31.
[2] H. Yang, Z. Guo, Kernels and closures of soft set relations, and soft set relation mappings, Comput. Math. Appl. 61 (2011) 651–662.
[3] Y. Jiang, Y. Tang, Q. Chen, Z. Cao, Semantic operations of multiple soft sets under conflict, Comput. Math. Appl. 62 (2011) 1923–1939.
[4] Y. Jiang, Y. Tang, Q. Chen, J. Wang, S. Tang, Extending soft sets with description logics, Comput. Math. Appl. 59 (2010) 2087–2096.
[5] Y. Jiang, Y. Tang, Q. Chen, H. Liu, J. Tang, Extending fuzzy soft sets with fuzzy description logics, Knowl.-Based Syst. 24 (2011) 1096–1107.
[6] M.I. Ali, F. Feng, X.Y. Liu, W.K. Min, M. Shabir, On some new operations in soft set theory, Comput. Math. Appl. 57 (2009) 1547–1553.
[7] A. Sezgin, A.O. Atagün, On operations of soft sets, Comput. Math. Appl. 61 (2011) 1457–1467.
[8] B. Tanay, M. Burç Kandemir, Topological structure of fuzzy soft sets, Comput. Math. Appl. 61 (2011) 2952–2957.
[9] Y.B. Jun, K.J. Lee, C.H. Park, Fuzzy soft set theory applied to BCK/BCI-algebras, Comput. Math. Appl. 59 (2010) 3180–3192.
[10] J. Zhan, Y.B. Jun, Soft BL-algebras based on fuzzy sets, Comput. Math. Appl. 59 (2010) 2037–2046.
[11] P.K. Maji, R. Biswas, A.R. Roy, Fuzzy soft sets, J. Fuzzy Math. 9 (3) (2001) 589–602.
[12] F. Feng, Xiaoyan Liu, Violeta Leoreanu-Fotea, Y.B. Jun, Soft sets and soft rough sets, Inform. Sci. 181 (2011) 1125–1137.
[13] Wei Xu, Jian Ma, Shouyang Wang, Gang Hao, Vague soft sets and their properties, Comput. Math. Appl. 59 (2010) 787–794.
[14] X.B. Yang, T.Y. Lin, J.Y. Yang, Y. Li, D.J. Yu, Combination of interval-valued fuzzy set and soft set, Comput. Math. Appl. 58 (2009) 521–527.
[15] Y. Jiang, Y. Tang, Q. Chen, H. Liu, J. Tang, Interval-valued intuitionistic fuzzy soft sets and their properties, Comput. Math. Appl. 60 (2010) 906–918.
[16] T. Herawan, M.M. Deris, A soft set approach for association rules mining, Knowl.-Based Syst. 24 (2011) 186–195.
[17] H. Qin, X. Ma, J.M. Zain, T. Herawan, A novel soft set approach in selecting clustering attribute, Knowl.-Based Syst. (2012), http://dx.doi.org/10.1016/j.knosys.2012.06.001.
[18] Y.B. Jun, K.J. Lee, C.H. Park, Soft set theory applied to ideals in d-algebras, Comput. Math. Appl. 57 (2009) 367–378.
[19] Z. Xiao, K. Gong, Y. Zou, A combined forecasting approach based on fuzzy soft sets, J. Comput. Appl. Math. 228 (2009) 326–333.
[20] N. Çağman, S. Enginoğlu, Soft matrix theory and its decision making, Comput. Math. Appl. 59 (2010) 3308–3314.
[21] Y. Jiang, H. Liu, Y. Tang, Q. Chen, Semantic decision making using ontology-based soft sets, Math. Comput. Model. 53 (2011) 1140–1149.
[22] J.R. Quinlan, Unknown attribute values in induction, in: Proceedings of the Sixth International Machine Learning Workshop, San Mateo, Canada, 1989, pp. 164–168.
[23] B. Thiesson, Accelerated quantification of Bayesian networks with incomplete data, in: Proceedings of the First International Conference on Knowledge Discovery and Data Mining, Montreal, Canada, 1995, pp. 306–311.
[24] D.X. Zhang, X.Y. Li, An absolute information quantity-based data making-up algorithms of incomplete information system, Comput. Eng. Appl. 22 (2006) 155–197.
[25] D. Chen, E.C.C. Tsang, Daniel S. Yeung, X. Wang, The parameterization reduction of soft sets and its applications, Comput. Math. Appl. 49 (2005) 757–763.
[26] Y. Zou, Z. Xiao, Data analysis approaches of soft sets under incomplete information, Knowl.-Based Syst. 21 (2008) 941–945.
[27] A. Asuncion, D. Newman, UCI machine learning repository, University of California, School of Information and Computer Science, Irvine, CA, 2007.
[28] T.Q. Deng, X.F. Wang, Parameter significance and reductions of soft sets, Int. J. Comput. Math. 89 (2012) 1979–1995.