Expert Systems with Applications 35 (2008) 611–621

Generating fuzzy rules from training instances for fuzzy classification systems

Shyi-Ming Chen a,c,*, Fu-Ming Tsai b

a Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan, ROC
b Department of Electronic Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan, ROC
c Department of Computer Science and Information Engineering, Jinwen University of Science and Technology, Taipei County, Taiwan, ROC
* Corresponding author. Tel.: +886 2 27376417; fax: +886 2 27301081. E-mail address: [email protected] (S.-M. Chen).
Abstract

In recent years, many methods have been proposed to generate fuzzy rules from training instances for handling the Iris data classification problem. In this paper, we present a new method to generate fuzzy rules from training instances for dealing with the Iris data classification problem based on the attribute threshold value a, the classification threshold value b and the level threshold value c, where a ∈ [0, 1], b ∈ [0, 1] and c ∈ [0, 1]. The proposed method gets a higher average classification accuracy rate than the existing methods.
© 2007 Elsevier Ltd. All rights reserved.

Keywords: Fuzzy rules; Fuzzy sets; Fuzzy classification systems; Iris data; Membership functions
1. Introduction

In recent years, some methods have been presented to generate fuzzy rules from training instances for handling the Iris data (Fisher, 1936) classification problem (Castro, Castro-Schez, & Zurita, 1999; Chang & Chen, 2001; Chen & Chang, 2005; Chen & Chen, 2002; Chen & Fang, 2005a, 2005b; Chen & Lin, 2000, 2005a, 2005b; Chen & Tsai, 2005; Chen, Wang, & Chen, 2006; Hong & Chen, 1999; Hong & Lee, 1996, 1999; Ishibuchi & Nakashima, 2001; Tsai & Chen, 2002; Wu & Chen, 1999). Castro et al. (1999) presented a method to generate fuzzy rules from training data to deal with the Iris data classification problem. Chang and Chen (2001) presented a method to generate weighted fuzzy rules to deal with the Iris data classification problem. Chen and Chang (2005) presented a method to construct membership functions and generate weighted fuzzy rules from training instances. Chen and Tsai (2005) presented a method to generate fuzzy rules from training instances to deal with the Iris data classification problem. Chen et al. (2006) presented
a method for generating weighted fuzzy rules from training data for dealing with the Iris data classification problem. Chen and Chen (2002) presented a method based on genetic algorithms to construct membership functions and fuzzy rules to deal with the Iris data classification problem. Chen and Fang (2005a) presented a method for handling the Iris data classification problem. Chen and Fang (2005b) presented a method to deal with fuzzy classification problems by tuning membership functions for fuzzy classification systems. Hong and Lee (1996) presented a method for inducing fuzzy rules and membership functions from training instances to deal with the Iris data classification problem. Wu and Chen (1999) presented a method for constructing membership functions and fuzzy rules from training instances to deal with the Iris data classification problem. Hong and Lee (1999) discussed the effect of merging order on the performance of fuzzy rule induction. Hong and Chen (1999) presented a method to construct membership functions and generate fuzzy rules from training instances by finding relevant attributes and membership functions to deal with the Iris data classification problem. Chen and Lin (2005a) presented a method to generate weighted fuzzy rules from training instances to deal with the Iris data classification problem.
Chen and Lin (2005b) presented a method to generate weighted fuzzy rules from numerical data based on genetic algorithms to deal with the Iris data classification problem.

In this paper, we present a new method to generate fuzzy rules from training instances for dealing with the Iris data classification problem. The proposed method constructs the membership functions and generates fuzzy rules from training instances based on the attribute threshold value a, the classification threshold value b, and the level threshold value c to deal with the Iris data classification problem, where a ∈ [0, 1], b ∈ [0, 1], and c ∈ [0, 1]. The experimental results show that the proposed method gets a higher average classification accuracy rate than the existing methods.

The rest of this paper is organized as follows. In Section 2, we briefly review basic concepts of fuzzy sets (Zadeh, 1965). In Section 3, we present a new method to generate fuzzy rules from training instances for handling the Iris data classification problem. In Section 4, we use an example to illustrate the fuzzy rule generation process of the proposed method. In Section 5, we compare the average classification accuracy rate of the proposed method with that of the existing methods. The conclusions are discussed in Section 6.

2. Basic concepts of fuzzy sets

In this section, we briefly review basic concepts of fuzzy sets from (Zadeh, 1965).

Definition 1. Let U be the universe of discourse, U = {x_1, x_2, ..., x_n}. A fuzzy set A of the universe of discourse U can be represented as follows:

A = μ_A(x_1)/x_1 + μ_A(x_2)/x_2 + ··· + μ_A(x_n)/x_n,    (1)
where μ_A is the membership function of the fuzzy set A, μ_A: U → [0, 1], μ_A(x_i) denotes the grade of membership of x_i belonging to the fuzzy set A, and μ_A(x_i) ∈ [0, 1].

Definition 2. Let A and B be two fuzzy sets defined in the universe of discourse U. The union of the fuzzy sets A and B, denoted by A ∪ B, is defined as follows:

μ_{A∪B}(x) = max(μ_A(x), μ_B(x)), ∀x ∈ U,    (2)
where μ_A and μ_B are the membership functions of the fuzzy sets A and B, respectively, μ_A: U → [0, 1], and μ_B: U → [0, 1].

Definition 3. Let A and B be two fuzzy sets defined in the universe of discourse U. The intersection of the fuzzy sets A and B, denoted by A ∩ B, is defined as follows:

μ_{A∩B}(x) = min(μ_A(x), μ_B(x)), ∀x ∈ U,    (3)
where μ_A and μ_B are the membership functions of the fuzzy sets A and B, respectively, μ_A: U → [0, 1], and μ_B: U → [0, 1].

Definition 4. Let A be a fuzzy set defined in the universe of discourse U. The complement of the fuzzy set A, denoted by A', is defined as follows:

μ_{A'}(x) = 1 − μ_A(x), ∀x ∈ U,    (4)
where μ_A and μ_{A'} are the membership functions of the fuzzy sets A and A', respectively, μ_A: U → [0, 1], and μ_{A'}: U → [0, 1].

3. A new method to generate fuzzy rules from training instances for handling fuzzy classification problems

In this section, we present a new method to generate fuzzy rules from training instances to deal with the Iris data (Fisher, 1936) classification problem. The Iris data contains 150 instances having four input attributes, i.e., Sepal Length (SL), Sepal Width (SW), Petal Length (PL) and Petal Width (PW), as shown in Table 1. There are three species of flowers, i.e., Iris-Setosa, Iris-Versicolor and Iris-Virginica, and each species has 50 instances. Assume that the ith training instance a_i has four input attribute values x_{i,Sepal Length}, x_{i,Sepal Width}, x_{i,Petal Length}, x_{i,Petal Width} and one output attribute value y_i, shown as follows:

a_i = ((x_{i,Sepal Length}, x_{i,Sepal Width}, x_{i,Petal Length}, x_{i,Petal Width}), y_i),
where x_{i,Sepal Length} denotes the attribute value of the attribute "Sepal Length" of the ith training instance, x_{i,Sepal Width} denotes the attribute value of the attribute "Sepal Width" of the ith training instance, x_{i,Petal Length} denotes the attribute value of the attribute "Petal Length" of the ith training instance, x_{i,Petal Width} denotes the attribute value of the attribute "Petal Width" of the ith training instance, y_i denotes the species of flower of the ith training instance, y_i ∈ {Iris-Setosa, Iris-Versicolor, Iris-Virginica}, and 1 ≤ i ≤ 150. The types of membership functions used in this paper are the triangular membership function and the trapezoidal membership function, as shown in Fig. 1.

Fig. 1. Membership functions of the corresponding linguistic terms (HN, MN, SN, Z, SP, MP and HP).

In Chen et al. (2006) and Chen and Chen (2002), Chen et al. presented the definition of the "degree of entropy" v_i of an input attribute X_i as follows:

v_i = |PD| / |WD|,    (5)
where v_i denotes the degree of entropy of the input attribute X_i and 1 ≤ i ≤ 4. First, Chen et al. found the whole domain WD of the input attribute X_i. Then, they found the individual domain of the input attribute X_i for each species of flower. PD contains a set of intervals I_1, I_2, ..., I_p in which the individual domains of the input attribute X_i of the different species of flowers do not overlap one another, and |PD| = |I_1| + |I_2| + ··· + |I_p|, where |I_j| denotes the length of the interval I_j, 1 ≤ j ≤ p, and |WD| denotes the length of the whole domain WD. Assume that the attribute threshold value given by the user is a, the classification threshold value given by the user is b, and the level threshold value given by the user is c, where a ∈ [0, 1], b ∈ [0, 1] and c ∈ [0, 1]. The attribute threshold value a is used to select which attributes can be used to deal with the classification; the classification threshold value b is used to test whether the classification accuracy rate of the generated fuzzy rules is good enough; the level threshold value c is used to determine which fuzzy rules should be modified.
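The degree of entropy in Eq. (5) can be computed directly from the per-species minimum and maximum attribute values found in Step 1 of the method below. The following Python sketch is illustrative only (the authors' implementation was written in Visual Basic); it assumes one possible reading of PD, namely the union of the sub-intervals of WD that are covered by exactly one species' individual domain, and the function name entropy_degree is ours, not the paper's:

```python
def entropy_degree(domains):
    """domains: one (min, max) pair per species for a single input attribute.
    Returns |PD| / |WD| as in Eq. (5), under the single-species-region reading of PD."""
    lo = min(d[0] for d in domains)
    hi = max(d[1] for d in domains)
    wd = hi - lo                                    # length of the whole domain WD
    points = sorted({p for d in domains for p in d})
    pd = 0.0
    for a, b in zip(points, points[1:]):            # elementary sub-intervals of WD
        mid = (a + b) / 2.0
        covered_by = sum(1 for s_lo, s_hi in domains if s_lo <= mid <= s_hi)
        if covered_by == 1:                         # region belonging to exactly one species
            pd += b - a
    return pd / wd

# Per-species (min, max) of the attribute PL, taken from Table 4:
print(entropy_degree([(1.0, 1.9), (3.0, 5.1), (4.8, 6.9)]))
```

Depending on how the boundary regions of WD that belong to no species are counted, such a sketch may give values that differ slightly from the figures reported in Section 4.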
Table 1
Iris data (Fisher, 1936). Attribute values are in cm. The twelve data rows below list, in order, the SL, SW, PL and PW values of the 50 Iris-Setosa instances, then of the 50 Iris-Versicolor instances, then of the 50 Iris-Virginica instances; the kth value in each of a species' four rows belongs to the kth instance of that species.
5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4 5.1 5.7 5.1 5.4 5.1 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4 5.2 5.5 4.9 5.0 5.5 4.9 4.4 5.1 5.0 4.5 4.4 5.0 5.1 4.8 5.1 4.6 5.3 5.0
3.5 3.0 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 3.7 3.4 3.0 3.0 4.0 4.4 3.9 3.5 3.8 3.8 3.4 3.7 3.6 3.3 3.4 3.0 3.4 3.5 3.4 3.2 3.1 3.4 4.1 4.2 3.1 3.2 3.5 3.1 3.0 3.4 3.5 2.3 3.2 3.5 3.8 3.0 3.8 3.2 3.7 3.3
1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 1.5 1.6 1.4 1.1 1.2 1.5 1.3 1.4 1.7 1.5 1.7 1.5 1.0 1.7 1.9 1.6 1.6 1.5 1.4 1.6 1.6 1.5 1.5 1.4 1.5 1.2 1.3 1.5 1.3 1.5 1.3 1.3 1.3 1.6 1.9 1.4 1.6 1.4 1.5 1.4
0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 0.2 0.2 0.1 0.1 0.2 0.4 0.4 0.3 0.3 0.3 0.2 0.4 0.2 0.5 0.2 0.2 0.4 0.2 0.2 0.2 0.2 0.4 0.1 0.2 0.1 0.2 0.2 0.1 0.2 0.2 0.3 0.3 0.2 0.6 0.4 0.3 0.2 0.2 0.2 0.2
7.0 6.4 6.9 5.5 6.5 5.7 6.3 4.9 6.6 5.2 5.0 5.9 6.0 6.1 5.6 6.7 5.6 5.8 6.2 5.6 5.9 6.1 6.3 6.1 6.4 6.6 6.8 6.7 6.0 5.7 5.5 5.5 5.8 6.0 5.4 6.0 6.7 6.3 5.6 5.5 5.5 6.1 5.8 5.0 5.6 5.7 5.7 6.2 5.1 5.7
3.2 3.2 3.1 2.3 2.8 2.8 3.3 2.4 2.9 2.7 2.0 3.0 2.2 2.9 2.9 3.1 3.0 2.7 2.2 2.5 3.2 2.8 2.5 2.8 2.9 3.0 2.8 3.0 2.9 2.6 2.4 2.4 2.7 2.7 3.0 3.4 3.1 2.3 3.0 2.5 2.6 3.0 2.6 2.3 2.7 3.0 2.9 2.9 2.5 2.8
4.7 4.5 4.9 4.0 4.6 4.5 4.7 3.3 4.6 3.9 3.5 4.2 4.0 4.7 3.6 4.4 4.5 4.1 4.5 3.9 4.8 4.0 4.9 4.7 4.3 4.4 4.8 5.0 4.5 3.5 3.8 3.7 3.9 5.1 4.5 4.5 4.7 4.4 4.1 4.0 4.4 4.6 4.0 3.3 4.2 4.2 4.2 4.3 3.0 4.1
1.4 1.5 1.5 1.3 1.5 1.3 1.6 1.0 1.3 1.4 1.0 1.5 1.0 1.4 1.3 1.4 1.5 1.0 1.5 1.1 1.8 1.3 1.5 1.2 1.3 1.4 1.4 1.7 1.5 1.0 1.1 1.0 1.2 1.6 1.5 1.6 1.5 1.3 1.3 1.3 1.2 1.4 1.2 1.0 1.3 1.2 1.3 1.3 1.1 1.3
6.3 5.8 7.1 6.3 6.5 7.6 4.9 7.3 6.7 7.2 6.5 6.4 6.8 5.7 5.8 6.4 6.5 7.7 7.7 6.0 6.9 5.6 7.7 6.3 6.7 7.2 6.2 6.1 6.4 7.2 7.4 7.9 6.4 6.3 6.1 7.7 6.3 6.4 6.0 6.9 6.7 6.9 5.8 6.8 6.7 6.7 6.3 6.5 6.2 5.9
3.3 2.7 3.0 2.9 3.0 3.0 2.5 2.9 2.5 3.6 3.2 2.7 3.0 2.5 2.8 3.2 3.0 3.8 2.6 2.2 3.2 2.8 2.8 2.7 3.3 3.2 2.8 3.0 2.8 3.0 2.8 3.8 2.8 2.8 2.6 3.0 3.4 3.1 3.0 3.1 3.1 3.1 2.7 3.2 3.3 3.0 2.5 3.0 3.4 3.0
6.0 5.1 5.9 5.6 5.8 6.6 4.5 6.3 5.8 6.1 5.1 5.3 5.5 5.0 5.1 5.3 5.5 6.7 6.9 5.0 5.7 4.9 6.7 4.9 5.7 6.0 4.8 4.9 5.6 5.8 6.1 6.4 5.6 5.1 5.6 6.1 5.6 5.5 4.8 5.4 5.6 5.1 5.1 5.9 5.7 5.2 5.0 5.2 5.4 5.1
2.5 1.9 2.1 1.8 2.2 2.1 1.7 1.8 1.8 2.5 2.0 1.9 2.1 2.0 2.4 2.3 1.8 2.2 2.3 1.5 2.3 2.0 2.0 1.8 2.1 1.8 1.8 1.8 2.1 1.6 1.9 2.0 2.2 1.5 1.4 2.3 2.4 1.8 1.8 2.1 2.4 2.3 1.9 2.3 2.5 2.3 1.9 2.0 2.3 1.8
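Throughout the method, each instance is treated as the pair ((x_SL, x_SW, x_PL, x_PW), y) defined in Section 3. A minimal Python sketch of this representation (the Instance type and the attribute_domain helper are illustrative names, not part of the paper):

```python
from typing import List, NamedTuple, Tuple

class Instance(NamedTuple):
    sl: float      # Sepal Length (cm)
    sw: float      # Sepal Width (cm)
    pl: float      # Petal Length (cm)
    pw: float      # Petal Width (cm)
    species: str   # "Iris-Setosa", "Iris-Versicolor" or "Iris-Virginica"

# The first instance of each species in Table 1:
data: List[Instance] = [
    Instance(5.1, 3.5, 1.4, 0.2, "Iris-Setosa"),
    Instance(7.0, 3.2, 4.7, 1.4, "Iris-Versicolor"),
    Instance(6.3, 3.3, 6.0, 2.5, "Iris-Virginica"),
]

def attribute_domain(instances: List[Instance], attr: str) -> Tuple[float, float]:
    """Minimum and maximum value of one attribute, as required by Step 1 of the method."""
    values = [getattr(x, attr) for x in instances]
    return min(values), max(values)

print(attribute_domain(data, "pl"))
```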
The proposed method is now presented as follows:

Step 1: Find the maximum attribute value and the minimum attribute value of each attribute of each species of the training instances.
Step 2: Based on Eq. (5), calculate the degree of entropy of each attribute.
Step 3: Find the attributes whose degrees of entropy are larger than the attribute threshold value a, where a ∈ [0, 1], and sort these attributes in a descending sequence of their degrees of entropy.
Step 4: Choose the attribute X found in Step 3 that has the largest degree of entropy, and find the maximum attribute values and the minimum attribute values
of the attribute X of each species based on the results obtained in Step 1. Sort these attribute values in an ascending sequence. Subtract 0.05 cm from the attribute values that are at odd positions, and add 0.05 cm to the attribute values that are at even positions. Then, sort these values in an ascending sequence again. Let these values be the crossover points of the adjacent membership functions of the linguistic terms of the attribute X, as shown in Fig. 2, where the corresponding intervals of the membership functions are also shown in Fig. 2.
Step 5: Find the statistical distribution of the attribute values of the attribute X of the training instances falling in each interval shown in Fig. 2. Generate fuzzy rules from the training instances by the following process. If most of the training instances falling in an interval belong to the species Z1, and the corresponding linguistic term of this interval is Y1, then generate the following fuzzy rule:
IF X is Y1 THEN the flower is Z1.
Step 6: Calculate the classification accuracy rate of the generated fuzzy rules, described as follows. If the classification accuracy rate of the generated rules is equal to or larger than the classification threshold value b, where b ∈ [0, 1], then Stop; otherwise, go to Step 7. The process of calculating the classification accuracy rate of the generated fuzzy rules is described as follows. First, bind the intervals corresponding to the linguistic terms of the attribute X of the generated fuzzy rules with the inference species of the generated fuzzy rules. Assume that the attribute value of the attribute X of a training instance falls in an interval bound with a species. If the training instance belongs to the same species bound with the interval, then this training instance is classified correctly; otherwise, this training instance is classified incorrectly. The classification accuracy rate of the generated fuzzy rules is defined as follows:

Classification Accuracy Rate = (Number of Training Instances Correctly Classified) / (Number of Training Instances).    (6)

Step 7: Choose the attribute W found in Step 3 that has the second largest degree of entropy, and find the maximum attribute value and the minimum attribute value of the attribute W of each species based on the results obtained in Step 1. Sort these attribute values in an ascending sequence. Subtract 0.05 cm from the attribute values that are at odd positions, and add 0.05 cm to the attribute values that are at even positions. Then, sort these values in an ascending sequence again. Let these attribute values be the crossover points of the adjacent membership functions corresponding to the linguistic terms of the attribute W, as shown in Fig. 3, where the corresponding intervals of the membership functions are also shown in Fig. 3.
Step 8: Find the fuzzy rules whose classification error rates are larger than the level threshold value c, where c ∈ [0, 1]. The classification error rate of a fuzzy rule is defined as follows:

Classification Error Rate = (Number of Training Instances Incorrectly Classified) / (Number of Training Instances).    (7)
Fig. 2. Corresponding intervals of the membership functions of the attribute X (linguistic terms HN, MN, SN, Z, SP, MP and HP).

Fig. 3. Corresponding intervals of the membership functions of the attribute W (linguistic terms HN, MN, SN, Z, SP, MP and HP).
Step 9: Find the incorrectly classified training instances and find the statistical distribution of the attribute values of the attribute W of these incorrectly classified training instances falling in each interval shown in Fig. 3. If most of the attribute values of these incorrectly classified training instances fall in an interval belonging to the species Z2 and the corresponding linguistic term of this interval is Y2, then generate the following fuzzy rule:
IF X is Y1 and W is Y2 THEN the flower is Z2,
and modify the original generated fuzzy rule into the following form:
IF X is Y1 and W is Y2' THEN the flower is Z1,
where Y2' denotes the complement of the fuzzy set Y2.
Step 10: If there are fuzzy rules whose classification error rates are larger than the level threshold value c, where c ∈ [0, 1], then go to Step 9; else go to Step 11.
Step 11: If the classification accuracy rate of the training instances is equal to or larger than the classification threshold value b, where b ∈ [0, 1], then Stop; else go to Step 7.

In the following, we present how to classify a testing instance:

Step 1: For each generated fuzzy rule, calculate the membership grades of the testing instance belonging to the membership functions of the linguistic terms appearing in the antecedent portion of the fuzzy rule, and find the minimum value M of these membership grades. If the inference result of the fuzzy rule is the flower of the species Z, then the degree of possibility that the testing instance belongs to the species Z is M, where M ∈ [0, 1].
Step 2: Choose the rule having the largest inference degree of possibility. If the inference result of the chosen rule is the flower of the species Z, then the testing instance is classified into the species Z.
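A compact Python sketch of this classification procedure is given below. It is only illustrative: the paper's system was implemented in Visual Basic, the membership functions here are built by hand, and the triangular shape chosen for MN is one reading of Fig. 4 that reproduces the membership grade 0.85 obtained for PL = 1.3 cm in Section 4.

```python
def trapezoid(a, b, c, d):
    """Trapezoidal membership function with support [a, d] and core [b, c];
    with b == c it degenerates to a triangular membership function."""
    def mu(x):
        if x <= a or x >= d:
            return 0.0
        if b <= x <= c:
            return 1.0
        return (x - a) / (b - a) if x < b else (d - x) / (d - c)
    return mu

def complement(mu):
    """Complement of a fuzzy set, as in Definition 4."""
    return lambda x: 1.0 - mu(x)

def classify(instance, rules):
    """instance: dict of attribute values; rules: list of (antecedent, species),
    where an antecedent maps an attribute name to the membership function of its term.
    Returns the species of the rule with the largest degree of possibility (Steps 1-2)."""
    best_species, best_degree = None, -1.0
    for antecedent, species in rules:
        # degree of possibility = minimum membership grade over the antecedent terms
        degree = min(mu(instance[attr]) for attr, mu in antecedent.items())
        if degree > best_degree:
            best_species, best_degree = species, degree
    return best_species, best_degree

# Rule 1 of the example in Section 4 ("IF PL is MN THEN the flower is Iris-Setosa"),
# with MN read as a triangle peaking at 1.45 cm and crossing 0.5 at 0.95 cm and 1.95 cm:
MN = trapezoid(0.45, 1.45, 1.45, 2.45)
rules = [({"PL": MN}, "Iris-Setosa")]
print(classify({"PL": 1.3, "PW": 0.2}, rules))   # degree of possibility 0.85
```

The complement helper is included because the modified rules produced in Step 9 (e.g., Rule 4* in Section 4) use the complement of a linguistic term in their antecedents.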
4. An example

In this section, we use an example to illustrate the proposed method. Assume that the attribute threshold value a, the classification threshold value b, and the level threshold value c given by the user are 0.9, 1 and 0, respectively (i.e., a = 0.9, b = 1 and c = 0). Assume that the training data set is as shown in Table 2 and the testing data set is as shown in Table 3.
[Step 1] Based on Table 2, we can find the maximum attribute value and the minimum attribute value of each attribute of each species of the training instances, as shown in Table 4.
[Step 2] Based on Eq. (5), we can calculate the degree of entropy of each attribute as follows:
(i) The degree of entropy of the attribute SL is calculated as follows:
((4.9 − 4.4) + (7.7 − 6.8)) / (7.7 − 4.4) = 0.412.
(ii) The degree of entropy of the attribute SW is calculated as follows:
((4.2 − 3.3) + (2.2 − 2)) / (4.2 − 2) = 0.478.
(iii) The degree of entropy of the attribute PL is calculated as follows:
((1.9 − 1) + (4.8 − 3.3) + (6.7 − 4.9)) / (6.7 − 1) = 0.956.
(iv) The degree of entropy of the attribute PW is calculated as follows:
((0.6 − 0.1) + (1.5 − 1) + (2.5 − 1.5)) / (2.5 − 0.1) = 0.955.
[Step 3] Because the degrees of entropy of the attribute PL and the attribute PW are larger than the attribute threshold value a, where a = 0.9, after sorting these two attributes according to their degrees of entropy in a descending sequence, we can get the following result: PL > PW.
[Step 4] Because the attribute PL has the largest degree of entropy, from Table 4, we can get the maximum attribute value and the minimum attribute value of the attribute PL of each species of the training instances, and after sorting these attribute values in an ascending sequence, we can get the following result: 1 cm < 1.9 cm < 3 cm < 4.8 cm < 5.1 cm < 6.9 cm. After subtracting 0.05 cm from the attribute values which are at odd positions and adding 0.05 cm to the attribute values which are at even positions, and after sorting these values in an ascending sequence, we can get the following result:
0.95 cm < 1.95 cm < 2.95 cm < 4.85 cm < 5.05 cm < 6.95 cm.
Table 2
Training data set (attribute values in cm). The twelve data rows below list, in order, the SL, SW, PL and PW values of the 25 Iris-Setosa training instances, then of the 25 Iris-Versicolor training instances, then of the 25 Iris-Virginica training instances; the kth value in each of a species' four rows belongs to the kth instance of that species.
5.1 4.9 5 4.6 5 4.9 5.4 4.8 4.3 5.1 5.4 4.6 5 5 5.2 4.7 5.5 4.9 4.9 4.4 4.5 4.4 5.1 5.3 5
3.5 3 3.6 3.4 3.4 3.1 3.7 3 3 3.5 3.4 3.6 3 3.4 3.4 3.2 4.2 3.1 3.6 3 2.3 3.2 3.8 3.7 3.3
1.4 1.4 1.4 1.4 1.5 1.5 1.5 1.4 1.1 1.4 1.7 1 1.6 1.6 1.4 1.6 1.4 1.5 1.4 1.3 1.3 1.3 1.9 1.5 1.4
0.2 0.2 0.2 0.3 0.2 0.1 0.2 0.1 0.1 0.3 0.2 0.2 0.2 0.4 0.2 0.2 0.2 0.2 0.1 0.2 0.3 0.2 0.4 0.2 0.2
7 5.5 5.7 4.9 5.9 6 6.1 5.6 5.6 6.3 6.8 6 5.5 5.8 6 6 6.3 5.6 5.5 5.8 5 5.6 6.2 5.1 5.7
3.2 2.3 2.8 2.1 3 2.2 2.9 2.9 3 2.5 2.8 2.9 2.4 2.7 2.7 3.4 2.3 3 2.6 2.6 2.3 2.7 2.9 2.5 2.8
4.7 4 4.5 3.3 4.2 4 4.7 3.6 4.5 4.9 4.8 4.5 3.8 3.9 5.1 4.5 4.4 4.1 4.4 4 3.3 4.2 4.3 3 4.1
1.4 1.3 1.3 1 1.5 1 1.4 1.3 1.5 1.5 1.4 1.5 1.1 1.2 1.6 1.6 1.3 1.3 1.2 1.2 1 1.3 1.3 1.1 1.3
7.1 7.6 7.3 7.2 6.5 6.4 6.8 5.7 5.8 6.5 7.7 7.7 6.7 6.2 6.4 7.2 7.4 7.9 6.3 6.4 5.8 6.8 6.3 6.2 5.9
3 3 2.9 3.6 3.2 2.7 3 2.5 2.8 3 2.6 2.8 3.3 2.8 2.8 3 2.8 3.8 3.4 3.1 2.7 3.2 2.5 3.4 3
5.9 6.6 6.3 6.1 5.1 5.3 5.5 5 5.1 5.5 6.9 6.7 5.7 4.8 5.6 5.8 6.1 6.4 5.6 5.5 5.1 5.9 5 5.4 5.1
2.1 2.1 1.8 2.5 2 1.9 2.1 2 2.4 1.8 2.3 2 2.1 1.8 2.1 1.6 1.9 2 2.4 1.8 1.9 2.3 1.9 2.3 1.8
Table 3
Testing data set (attribute values in cm). The twelve data rows below list, in order, the SL, SW, PL and PW values of the 25 Iris-Setosa testing instances, then of the 25 Iris-Versicolor testing instances, then of the 25 Iris-Virginica testing instances; the kth value in each of a species' four rows belongs to the kth instance of that species.
4.7 4.6 5.4 4.4 4.8 5.8 5.7 5.4 5.7 5.1 5.1 5.1 4.8 5.2 4.8 5.4 5.2 5 5.5 5.1 5 5 4.8 5.1 4.6
3.2 3.1 3.9 2.9 3.4 4 4.4 3.9 3.8 3.8 3.7 3.3 3.4 3.5 3.1 3.4 4.1 3.2 3.5 3.4 3.5 3.5 3 3.8 3.2
1.3 1.5 1.7 1.4 1.6 1.2 1.5 1.3 1.7 1.5 1.5 1.7 1.9 1.5 1.6 1.5 1.5 1.2 1.3 1.5 1.3 1.6 1.4 1.6 1.4
0.2 0.2 0.4 0.2 0.2 0.2 0.4 0.4 0.3 0.3 0.4 0.5 0.2 0.2 0.2 0.4 0.1 0.2 0.2 0.2 0.3 0.6 0.3 0.2 0.2
6.4 6.9 6.5 6.3 6.6 5.2 5 6.7 5.8 6.2 5.6 5.9 6.1 6.1 6.4 6.6 6.7 5.7 5.5 5.4 6.7 5.5 6.1 5.7 5.7
3.2 3.1 2.8 3.3 2.9 2.7 2 3.1 2.7 2.2 2.5 3.2 2.8 2.8 2.9 3 3 2.6 2.4 3 3.1 2.5 3 3 2.9
4.5 4.9 4.6 4.7 4.6 3.9 3.5 4.4 4.1 4.5 3.9 4.8 4 4.7 4.3 4.4 5 3.5 3.7 4.5 4.7 4 4.6 4.2 4.2
1.5 1.5 1.5 1.6 1.3 1.4 1 1.4 1 1.5 1.1 1.8 1.3 1.2 1.3 1.4 1.7 1 1 1.5 1.5 1.3 1.4 1.2 1.3
6.3 5.8 6.3 6.5 4.9 6.7 6.4 7.7 6 6.9 5.6 6.3 7.2 6.1 6.4 6.3 6.1 7.7 6 6.9 6.7 6.9 6.7 6.7 6.5
3.3 2.7 2.9 3 2.5 2.5 3.2 3.8 2.2 3.2 2.8 2.7 3.2 3 2.8 2.8 2.6 3 3 3.1 3.1 3.1 3.3 3 3
6 5.1 5.6 5.8 4.5 5.8 5.3 6.7 5 5.7 4.9 4.9 6 4.9 5.6 5.1 5.6 6.1 4.8 5.4 5.6 5.1 5.7 5.2 5.2
2.5 1.9 1.8 2.2 1.7 1.8 2.3 2.2 1.5 2.3 2 1.8 1.8 1.8 2.2 1.5 1.4 2.3 1.8 2.1 2.4 2.3 2.5 2.3 2
Let these values be the crossover points of the adjacent membership functions of the linguistic terms of the attribute PL as shown in Fig. 4. The corresponding intervals
of the membership functions of the linguistic terms are also shown in Fig. 4.
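The crossover points used in this step follow mechanically from sorting the six per-species endpoints and alternately shifting them by 0.05 cm. A small Python sketch of the construction (the function name is ours, not the paper's):

```python
def crossover_points(per_species_domains, shift=0.05):
    """per_species_domains: one (min, max) pair per species for one attribute.
    Returns the crossover points of the adjacent membership functions (Steps 4 and 7)."""
    values = sorted(v for domain in per_species_domains for v in domain)
    # subtract the shift at odd positions (1st, 3rd, ...) and add it at even positions
    shifted = [v - shift if i % 2 == 0 else v + shift for i, v in enumerate(values)]
    return sorted(round(v, 2) for v in shifted)   # rounding only hides float noise

# Attribute PL of the training data (Table 4): Iris-Setosa [1, 1.9],
# Iris-Versicolor [3, 5.1], Iris-Virginica [4.8, 6.9]; this reproduces
# 0.95 < 1.95 < 2.95 < 4.85 < 5.05 < 6.95 obtained in Step 4 above.
print(crossover_points([(1.0, 1.9), (3.0, 5.1), (4.8, 6.9)]))
```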
Table 4
Maximum attribute value and minimum attribute value of each attribute of each species (all values in cm)

Species            SL Min  SL Max  SW Min  SW Max  PL Min  PL Max  PW Min  PW Max
Iris-Setosa          4.3     5.5     2.3     4.2     1.0     1.9     0.1     0.4
Iris-Versicolor      4.9     7.0     2.1     3.4     3.0     5.1     1.0     1.6
Iris-Virginica       5.7     7.9     2.5     3.8     4.8     6.9     1.6     2.5

Fig. 4. Corresponding intervals of the membership functions of the attribute PL (linguistic terms HN, MN, SN, Z, SP, MP and HP; crossover points at 0.95, 1.95, 2.95, 4.85, 5.05 and 6.95 cm).
[Step 5] From Table 2, we can see that most training instances whose attribute values of the attribute PL fall between 0.95 cm and 1.95 cm belong to the species Iris-Setosa, and from Fig. 4, we can see that the corresponding linguistic term of the interval [0.95 cm, 1.95 cm] is MN; there are no training instances whose attribute values of the attribute PL fall between 1.95 cm and 2.95 cm; most training instances whose attribute values of the attribute PL fall between 2.95 cm and 4.85 cm belong to the species Iris-Versicolor, and the corresponding linguistic term of the interval [2.95 cm, 4.85 cm] is Z; most training instances whose attribute values of the attribute PL fall between 4.85 cm and 5.05 cm belong to the species Iris-Versicolor, and the corresponding linguistic term of the interval [4.85 cm, 5.05 cm] is SP; most training instances whose attribute values of the attribute PL fall between 5.05 cm and 6.95 cm belong to the species Iris-Virginica, and the corresponding linguistic term of the interval [5.05 cm, 6.95 cm] is MP; most training instances whose attribute values of the attribute PL are larger than 6.95 cm belong to the species Iris-Virginica, and the corresponding linguistic term of the interval [6.95 cm, 10 cm] is HP. Thus, we can get the following five fuzzy rules:
Rule 1: IF PL is MN THEN the flower is Iris-Setosa,
Rule 2: IF PL is Z THEN the flower is Iris-Versicolor,
Rule 3: IF PL is SP THEN the flower is Iris-Versicolor,
Rule 4: IF PL is MP THEN the flower is Iris-Virginica,
Rule 5: IF PL is HP THEN the flower is Iris-Virginica.
[Step 6] Because the classification accuracy rate of the five generated fuzzy rules obtained in Step 5 is less than the classification threshold value b given by the user, where b = 1, we go to Step 7.
[Step 7] Because the attribute PW has the second largest degree of entropy, from Table 4, we can get the maximum attribute value and the minimum attribute value of the attribute PW of each species of the training instances. After sorting these values in an ascending sequence, we can get the following result: 0.1 cm < 0.4 cm < 1 cm < 1.6 cm < 1.6 cm < 2.5 cm. After subtracting 0.05 cm from the values which are at odd positions and adding 0.05 cm to the values which are at even positions, and after sorting these values in an ascending sequence, we can get the following result: 0.05 cm < 0.45 cm < 0.95 cm < 1.55 cm < 1.65 cm < 2.55 cm. Let these values be the crossover points of the adjacent membership functions of the linguistic terms of the attribute PW. The corresponding intervals of the membership functions of the linguistic terms of the attribute PW are shown in Fig. 5.
[Step 8] Because the classification error rates of Rule 4 and Rule 5 are larger than the level threshold value c given by the user, where c = 0, the system has to deal with Rule 4 and Rule 5.
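Steps 5-8 of the method amount to a majority vote among the training instances falling in each interval, followed by checking each rule's classification error rate (Eq. (7)) against the level threshold value c. The following Python sketch of this bookkeeping is illustrative only; the interval encoding and the function names are ours, not the paper's:

```python
from collections import Counter

def majority_rules(instances, attr, cuts, terms):
    """instances: list of (attribute-value dict, species); cuts: crossover points;
    terms: linguistic term of each interval, e.g. ["HN", "MN", "SN", "Z", "SP", "MP", "HP"].
    Generates one rule per non-empty interval: IF attr is term THEN majority species."""
    edges = [float("-inf")] + list(cuts) + [float("inf")]
    rules = []
    for k, term in enumerate(terms):
        lo, hi = edges[k], edges[k + 1]
        species_inside = [y for x, y in instances if lo <= x[attr] < hi]
        if species_inside:
            majority = Counter(species_inside).most_common(1)[0][0]
            rules.append(((attr, term, lo, hi), majority))
    return rules

def error_rate(rule, instances):
    """Eq. (7): fraction of all training instances that the rule classifies incorrectly."""
    (attr, _term, lo, hi), species = rule
    wrong = sum(1 for x, y in instances if lo <= x[attr] < hi and y != species)
    return wrong / len(instances)

# Tiny hypothetical training set and the PL crossover points from Step 4:
train = [({"PL": 1.4}, "Iris-Setosa"), ({"PL": 4.7}, "Iris-Versicolor"),
         ({"PL": 6.0}, "Iris-Virginica")]
cuts = [0.95, 1.95, 2.95, 4.85, 5.05, 6.95]
for rule in majority_rules(train, "PL", cuts, ["HN", "MN", "SN", "Z", "SP", "MP", "HP"]):
    print(rule, error_rate(rule, train))
```

A rule whose error rate exceeds c is then refined in Step 9 by adding a condition on the attribute with the second largest degree of entropy.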
Fig. 5. Corresponding intervals of the membership functions of the linguistic terms of the attribute PW (HN, MN, SN, Z, SP, MP and HP; crossover points at 0.05, 0.45, 0.95, 1.55, 1.65 and 2.55 cm).
[Step 9] Because the system finds that most training instances are wrongly classified when the attribute values of the attribute PW fall between 1.55 cm and 1.65 cm, belonging to the species Iris-Versicolor, and the corresponding linguistic term of the interval [1.55 cm, 1.65 cm] is SP, the system generates the following fuzzy rule from Rule 4:
Rule 6: IF PL is MP and PW is SP THEN the flower is Iris-Versicolor,
and the system modifies Rule 4 into "Rule 4*", shown as follows:
Rule 4*: IF PL is MP and PW is SP' THEN the flower is Iris-Virginica,
where SP' denotes the complement of the fuzzy set SP.
[Step 10] Because the classification error rate of Rule 5 is larger than the level threshold value c, where c = 0, we go to Step 9.
[Step 9] Because the system finds that most training instances are wrongly classified when the attribute values of the attribute PW fall between 1.65 cm and 2.55 cm, belonging to the species Iris-Virginica, and the corresponding linguistic term of the interval [1.65 cm, 2.55 cm] is MP, the system generates the following fuzzy rule from Rule 5:
Rule 7: IF PL is HP and PW is MP THEN the flower is Iris-Virginica,
and the system modifies Rule 5 into "Rule 5*", shown as follows:
Rule 5*: IF PL is HP and PW is MP' THEN the flower is Iris-Versicolor,
where MP' denotes the complement of the fuzzy set MP.
[Step 10] Because there are no fuzzy rules whose classification error rate is larger than the level threshold value c given by the user, where c = 0, we go to Step 11.
[Step 11] Because the classification accuracy rate of the generated fuzzy rules is equal to the classification threshold value b given by the user, where b = 1, the system stops. Therefore, we can get the following seven fuzzy rules:
Rule 1: IF PL is MN THEN the flower is Iris-Setosa,
Rule 2: IF PL is Z THEN the flower is Iris-Versicolor,
Rule 3: IF PL is SP THEN the flower is Iris-Versicolor,
Rule 4*: IF PL is MP and PW is SP' THEN the flower is Iris-Virginica,
Rule 5*: IF PL is HP and PW is MP' THEN the flower is Iris-Versicolor,
Rule 6: IF PL is MP and PW is SP THEN the flower is Iris-Versicolor,
Rule 7: IF PL is HP and PW is MP THEN the flower is Iris-Virginica.
In the following, we illustrate how to classify a testing instance. We use the first testing instance (i.e., SL = 4.7 cm, SW = 3.2 cm, PL = 1.3 cm and PW = 0.2 cm) in Table 3 to illustrate the classification process. Because the attribute value of PL of this testing instance is 1.3 cm and the attribute value of PW of this testing instance is 0.2 cm, we have:
(i) Let us consider Rule 1: IF PL is MN THEN the flower is Iris-Setosa. From Fig. 4, we can get the membership grade of the testing instance belonging to the membership function of the linguistic term MN appearing in the antecedent portion of Rule 1, which is 0.85, as shown in Fig. 6. Thus, it indicates that the degree of possibility that the testing instance belongs to the species Iris-Setosa is 0.85.
(ii) Let us consider Rule 2: IF PL is Z THEN the flower is Iris-Versicolor. From Fig. 4, we can get the membership grade of the testing instance belonging to the membership function of the linguistic term Z appearing in the antecedent portion of Rule 2, which is 0, as shown in Fig. 7. Thus, it indicates that the degree of possibility that the testing instance belongs to the species Iris-Versicolor is 0.
(iii) Let us consider Rule 3: IF PL is SP THEN the flower is Iris-Versicolor. From Fig. 4, we can get the membership grade of the testing instance belonging to the membership function of the linguistic term SP appearing in the antecedent portion of Rule 3, which is 0, as shown in Fig. 8. Thus, it indicates that the degree of possibility that the testing instance belongs to the species Iris-Versicolor is 0.
Fig. 6. The inference process of Rule 1.

Fig. 7. The inference process of Rule 2.

Fig. 8. The inference process of Rule 3.
(iv) Let us consider Rule 4*: IF PL is MP and PW is SP' THEN the flower is Iris-Virginica. From Figs. 4 and 5, we can get the membership grades of the testing instance belonging to the membership functions of the linguistic terms MP and SP' appearing in the antecedent portion of Rule 4*, which are 0 and 1, respectively, as shown in Fig. 9. By taking the minimum value among them, the calculation result is 0. It indicates that the degree of possibility that the testing instance belongs to the species Iris-Virginica is 0.
(v) Let us consider Rule 5*: IF PL is HP and PW is MP' THEN the flower is Iris-Versicolor. From Figs. 4 and 5, we can get the membership grades of the testing instance belonging to the membership functions of the linguistic terms HP and MP' appearing in the antecedent portion of Rule 5*, which are 0 and 1, respectively, as shown in Fig. 10. By taking the minimum value among them, the calculation result is 0. It indicates that the degree of possibility that the testing instance belongs to the species Iris-Versicolor is 0.
Fig. 9. The inference process of Rule 4*.

Fig. 10. The inference process of Rule 5*.

Fig. 11. The inference process of Rule 6.

Fig. 12. The inference process of Rule 7.
(vi) Let us consider Rule 6: IF PL is MP and PW is SP THEN the flower is Iris-Versicolor. From Figs. 4 and 5, we can get the membership grades of the testing instance belonging to the membership functions of the linguistic terms MP and SP appearing in the antecedent portion of Rule 6, which are 0 and 0, respectively, as shown in Fig. 11. By taking the minimum value among them, the calculation result is 0. It indicates that the degree of possibility that the testing instance belongs to the species Iris-Versicolor is 0.
(vii) Let us consider Rule 7: IF PL is HP and PW is MP THEN the flower is Iris-Virginica. From Figs. 4 and 5, we can get the membership grades of the testing instance belonging to the membership functions of the linguistic terms HP and MP appearing in the antecedent portion of Rule 7, which are 0 and 0, respectively, as shown in Fig. 12. By taking the minimum value among them, the calculation result is 0. It indicates that the degree of possibility that the testing instance belongs to the species Iris-Virginica is 0.
Therefore, we can see that the testing instance (SL = 4.7 cm, SW = 3.2 cm, PL = 1.3 cm and PW = 0.2 cm) gets the maximum inference degree of possibility (i.e., 0.85) based on Rule 1. The inference result of Rule 1 is the species Iris-Setosa. Thus, the testing instance is classified into the species Iris-Setosa. From Table 3, we can see that the testing instance is classified correctly.

5. Experimental results

We have implemented the proposed method using Visual Basic version 6.0 on a Pentium 4 PC, where the attribute threshold value a, the classification threshold value b and the level threshold value c given by the user are 0.9, 1 and 0, respectively.
Case 1: The system randomly chooses 75 instances from the Iris data as the training data set and lets the remaining 75 instances be the testing data set. After executing the program 200 times, the average classification accuracy rate is 95.8333% and the average number of generated fuzzy rules is 6.63.
Case 2: The system randomly chooses 120 instances from the Iris data as the training data set and lets the remaining 30 instances be the testing data set. After executing the program 200 times, the average classification accuracy rate is 97.166% and the average number of generated fuzzy rules is 7.535.
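The averaging protocol of Cases 1 and 2 can be expressed as a short loop: repeatedly draw a random training/testing split, build a classifier from the training part, and average the testing accuracy over the runs. The Python sketch below is illustrative only; the stand-in classifier is a hypothetical placeholder, not the rule-generation method of this paper:

```python
import random

def average_accuracy(dataset, train_size, runs, build_classifier):
    """dataset: list of (attribute dict, species); build_classifier: function that takes
    a training set and returns a function mapping an instance to a predicted species.
    Repeats a random split `runs` times and averages the testing accuracy."""
    total = 0.0
    for _ in range(runs):
        shuffled = random.sample(dataset, len(dataset))
        train, test = shuffled[:train_size], shuffled[train_size:]
        predict = build_classifier(train)
        total += sum(1 for x, y in test if predict(x) == y) / len(test)
    return total / runs

# Stand-in classifier used only to make the sketch runnable: it predicts Iris-Setosa
# whenever PL < 2.45 cm and Iris-Versicolor otherwise (not the paper's method).
def toy_classifier(train):
    return lambda x: "Iris-Setosa" if x["PL"] < 2.45 else "Iris-Versicolor"

toy_data = [({"PL": 1.4}, "Iris-Setosa"), ({"PL": 4.7}, "Iris-Versicolor"),
            ({"PL": 1.3}, "Iris-Setosa"), ({"PL": 6.0}, "Iris-Virginica")]
print(average_accuracy(toy_data, train_size=2, runs=200, build_classifier=toy_classifier))
```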
Table 5 compares the average classification accuracy rates of different methods. From Table 5, we can see that the proposed method gets a higher average classification accuracy rate than the existing methods.

Table 5
A comparison of the average classification accuracy rates for different methods

Method                                                                                    Average classification accuracy rate (%)
Hong and Lee's method (1996) (training data set: 75 instances; testing data set: 75 instances; executing 200 times)    95.570
Hong and Chen's method (1999) (training data set: 75 instances; testing data set: 75 instances; executing 200 times)    95.570
The proposed method (training data set: 75 instances; testing data set: 75 instances; executing 200 times; attribute threshold value a = 0.9; classification threshold value b = 1; level threshold value c = 0)    95.833
Castro et al.'s method (1999) (training data set: 120 instances; testing data set: 30 instances; executing 10 times)    96.600
Chen and Fang's method (2005a) (training data set: 120 instances; testing data set: 30 instances; executing 200 times)    96.72
Chen and Tsai's method (2005) (training data set: 120 instances; testing data set: 30 instances; executing 200 times; correlation coefficient threshold value f = 0.86; boundary shift value e = 1/3; center shift value d = 0.025)    96.82
Chen and Chang's method (2005) (training data set: 120 instances; testing data set: 30 instances; executing 2000 times)    96.88
Chen and Fang's method (2005b) (training data set: 120 instances; testing data set: 30 instances; executing 200 times)    96.96
The proposed method (training data set: 120 instances; testing data set: 30 instances; executing 200 times; attribute threshold value a = 0.9; classification threshold value b = 1; level threshold value c = 0)    97.166

6. Conclusions

In this paper, we have presented a new method to construct membership functions and generate fuzzy rules from training instances for handling the Iris data classification problem based on the attribute threshold value a, the classification threshold value b and the level threshold value c given by the user, where a ∈ [0, 1], b ∈ [0, 1] and c ∈ [0, 1]. The proposed method gets a higher average classification accuracy rate than the existing methods. In the future, we will develop a method based on genetic algorithms to determine the optimal values of the attribute threshold value a, the classification threshold value b and the level threshold value c for handling fuzzy classification problems.
Acknowledgement

This work was supported in part by the National Science Council, Republic of China, under Grant NSC 95-2221-E-011-116-MY2.

References

Castro, J. L., Castro-Schez, J. J., & Zurita, J. M. (1999). Learning maximal structure rules in fuzzy logic for knowledge acquisition in expert systems. Fuzzy Sets and Systems, 101(3), 331–342.
Chang, C. H., & Chen, S. M. (2001). Constructing membership functions and generating weighted fuzzy rules from training data. In Proceedings of the 2001 ninth national conference on fuzzy theory and its applications, Chungli, Taoyuan, Taiwan, Republic of China (pp. 708–713).
Chen, S. M., & Chang, C. H. (2005). A new method to construct membership functions and generate weighted fuzzy rules from training instances. Cybernetics and Systems, 36(4), 397–414.
Chen, S. M., & Chen, Y. C. (2002). Automatically constructing membership functions and generating fuzzy rules using genetic algorithms. Cybernetics and Systems, 33(8), 841–862.
Chen, S. M., & Fang, Y. D. (2005a). A new approach for handling the Iris data classification problem. International Journal of Applied Science and Engineering, 3(1), 37–49.
Chen, S. M., & Fang, Y. D. (2005b). A new method to deal with fuzzy classification problems by tuning membership functions for fuzzy classification systems. Journal of the Chinese Institute of Engineers, 28(1), 169–173.
Chen, S. M., & Lin, H. L. (2005a). Generating weighted fuzzy rules for handling classification problems. International Journal of Electronic Business Management, 3(2), 116–128.
Chen, S. M., & Lin, H. L. (2005b). Generating weighted fuzzy rules from training instances using genetic algorithms to handle the Iris data classification problem. Journal of Information Science and Engineering, 22(1), 175–188.
Chen, S. M., & Lin, S. Y. (2000). A new method for constructing fuzzy decision trees and generating fuzzy classification rules from training examples. Cybernetics and Systems, 31(7), 763–785.
Chen, S. M., & Tsai, F. M. (2005). A new method to construct membership functions and generate fuzzy rules from training instances. International Journal of Information and Management Sciences, 16(2), 47–72.
Chen, Y. C., Wang, L. H., & Chen, S. M. (2006). Generating weighted fuzzy rules from training data for dealing with the Iris data classification problem. International Journal of Applied Science and Engineering, 4(1), 41–52.
Fisher, R. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, 179–188.
Hong, T. P., & Chen, J. B. (1999). Finding relevant attributes and membership functions. Fuzzy Sets and Systems, 103(3), 389–404.
Hong, T. P., & Lee, C. Y. (1996). Induction of fuzzy rules and membership functions from training examples. Fuzzy Sets and Systems, 84(1), 33–47.
Hong, T. P., & Lee, C. Y. (1999). Effect of merging order on performance of fuzzy induction. Intelligent Data Analysis, 3(2), 139–151.
Ishibuchi, H., & Nakashima, T. (2001). Effect of rule weights in fuzzy rule-based classification systems. IEEE Transactions on Fuzzy Systems, 9(4), 506–515.
Tsai, F. M., & Chen, S. M. (2002). A new method for constructing membership functions and generating fuzzy rules for fuzzy classification systems. In Proceedings of the 2002 tenth national conference on fuzzy theory and its applications, Hsinchu, Taiwan, Republic of China.
Wu, T. P., & Chen, S. M. (1999). A new method for constructing membership functions and fuzzy rules from training examples. IEEE Transactions on Systems, Man, and Cybernetics—Part B: Cybernetics, 29(1), 25–40.
Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8, 338–353.