Generating fuzzy rules from training instances for fuzzy classification systems


Expert Systems with Applications 35 (2008) 611–621, www.elsevier.com/locate/eswa

Shyi-Ming Chen a,c,*, Fu-Ming Tsai b

a Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan, ROC
b Department of Electronic Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan, ROC
c Department of Computer Science and Information Engineering, Jinwen University of Science and Technology, Taipei County, Taiwan, ROC

Abstract

In recent years, many methods have been proposed to generate fuzzy rules from training instances for handling the Iris data classification problem. In this paper, we present a new method to generate fuzzy rules from training instances for dealing with the Iris data classification problem based on the attribute threshold value α, the classification threshold value β and the level threshold value γ, where α ∈ [0, 1], β ∈ [0, 1] and γ ∈ [0, 1]. The proposed method gets a higher average classification accuracy rate than the existing methods.
© 2007 Elsevier Ltd. All rights reserved.

Keywords: Fuzzy rules; Fuzzy sets; Fuzzy classification systems; Iris data; Membership functions

* Corresponding author. Tel.: +886 2 27376417; fax: +886 2 27301081. E-mail address: [email protected] (S.-M. Chen).

0957-4174/$ - see front matter © 2007 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2007.07.013

1. Introduction

In recent years, some methods have been presented to generate fuzzy rules from training instances for handling the Iris data (Fisher, 1936) classification problem (Castro, Castro-Schez, & Zurita, 1999; Chang & Chen, 2001; Chen & Chang, 2005; Chen & Chen, 2002; Chen & Fang, 2005a, 2005b; Chen & Lin, 2000, 2005a, 2005b; Chen & Tsai, 2005; Chen, Wang, & Chen, 2006; Hong & Chen, 1999; Hong & Lee, 1996, 1999; Ishibuchi & Nakashima, 2001; Tsai & Chen, 2002; Wu & Chen, 1999). Castro et al. (1999) presented a method to generate fuzzy rules from training data to deal with the Iris data classification problem. Chang and Chen (2001) presented a method to generate weighted fuzzy rules to deal with the Iris data classification problem. Chen and Chang (2005) presented a method to construct membership functions and generate weighted fuzzy rules from training instances. Chen and Tsai (2005) presented a method to generate fuzzy rules from training instances to deal with the Iris data classification problem. Chen et al. (2006) presented a method for generating weighted fuzzy rules from training data for dealing with the Iris data classification problem. Chen and Chen (2002) presented a method based on genetic algorithms to construct membership functions and fuzzy rules to deal with the Iris data classification problem. Chen and Fang (2005a) presented a method for handling the Iris data classification problem. Chen and Fang (2005b) presented a method to deal with fuzzy classification problems by tuning membership functions for fuzzy classification systems. Hong and Lee (1996) presented a method for inducing fuzzy rules and membership functions from training instances to deal with the Iris data classification problem. Wu and Chen (1999) presented a method for constructing membership functions and fuzzy rules from training instances to deal with the Iris data classification problem. Hong and Lee (1999) discussed the effect of merging order on the performance of fuzzy rule induction. Hong and Chen (1999) presented a method to construct membership functions and generate fuzzy rules from training instances by finding relevant attributes and membership functions to deal with the Iris data classification problem. Chen and Lin (2005a) presented a method to generate weighted fuzzy rules from training instances to deal with the Iris data classification problem.


Chen and Lin (2005b) presented a method to generate weighted fuzzy rules from numerical data based on genetic algorithms to deal with the Iris data classification problem.

In this paper, we present a new method to generate fuzzy rules from training instances for dealing with the Iris data classification problem. The proposed method constructs the membership functions and generates fuzzy rules from training instances based on the attribute threshold value α, the classification threshold value β, and the level threshold value γ to deal with the Iris data classification problem, where α ∈ [0, 1], β ∈ [0, 1], and γ ∈ [0, 1]. The experimental results show that the proposed method gets a higher average classification accuracy rate than the existing methods.

The rest of this paper is organized as follows. In Section 2, we briefly review basic concepts of fuzzy sets (Zadeh, 1965). In Section 3, we present a new method to generate fuzzy rules from training instances for handling the Iris data classification problem. In Section 4, we use an example to illustrate the fuzzy rule generation process of the proposed method. In Section 5, we conduct an experiment to compare the average classification accuracy rate of the proposed method with that of the existing methods. The conclusions are discussed in Section 6.

2. Basic concepts of fuzzy sets

In this section, we briefly review basic concepts of fuzzy sets from Zadeh (1965).

Definition 1. Let U be the universe of discourse, U = {x1, x2, ..., xn}. A fuzzy set A of the universe of discourse U can be represented as follows:

A = μA(x1)/x1 + μA(x2)/x2 + ... + μA(xn)/xn,  (1)

where μA is the membership function of the fuzzy set A, μA: U → [0, 1], μA(xi) denotes the grade of membership of xi belonging to the fuzzy set A, and μA(xi) ∈ [0, 1].

Definition 2. Let A and B be two fuzzy sets defined in the universe of discourse U. The union of the fuzzy sets A and B, denoted as A ∪ B, is defined as follows:

μA∪B(x) = max(μA(x), μB(x)),  ∀x ∈ U,  (2)

where μA and μB are the membership functions of the fuzzy sets A and B, respectively, μA: U → [0, 1] and μB: U → [0, 1].

Definition 3. Let A and B be two fuzzy sets defined in the universe of discourse U. The intersection of the fuzzy sets A and B, denoted as A ∩ B, is defined as follows:

μA∩B(x) = min(μA(x), μB(x)),  ∀x ∈ U,  (3)

where μA and μB are the membership functions of the fuzzy sets A and B, respectively, μA: U → [0, 1] and μB: U → [0, 1].

Definition 4. Let A be a fuzzy set defined in the universe of discourse U. The complement of the fuzzy set A, denoted as Ā, is defined as follows:

μĀ(x) = 1 − μA(x),  ∀x ∈ U,  (4)

where μA and μĀ are the membership functions of the fuzzy sets A and Ā, respectively, μA: U → [0, 1] and μĀ: U → [0, 1].

3. A new method to generate fuzzy rules from training instances for handling fuzzy classification problems

In this section, we present a new method to generate fuzzy rules from training instances to deal with the Iris data (Fisher, 1936) classification problem. The Iris data contains 150 instances having four input attributes, i.e., Sepal Length (SL), Sepal Width (SW), Petal Length (PL) and Petal Width (PW), as shown in Table 1. There are three species of flowers, i.e., Iris-Setosa, Iris-Versicolor and Iris-Virginica, and each species has 50 instances. Assume that the ith training instance a_i has four input attribute values x_{i,Sepal Length}, x_{i,Sepal Width}, x_{i,Petal Length}, x_{i,Petal Width} and one output attribute value y_i, shown as follows:

a_i = ((x_{i,Sepal Length}, x_{i,Sepal Width}, x_{i,Petal Length}, x_{i,Petal Width}), y_i),

where x_{i,Sepal Length} denotes the value of the attribute "Sepal Length" of the ith training instance, x_{i,Sepal Width} denotes the value of the attribute "Sepal Width" of the ith training instance, x_{i,Petal Length} denotes the value of the attribute "Petal Length" of the ith training instance, x_{i,Petal Width} denotes the value of the attribute "Petal Width" of the ith training instance, y_i denotes the species of flower of the ith training instance, y_i ∈ {Iris-Setosa, Iris-Versicolor, Iris-Virginica}, and 1 ≤ i ≤ 150. The membership functions used in this paper are triangular and trapezoidal membership functions, as shown in Fig. 1. In Chen et al. (2006) and Chen and Chen (2002), Chen et al. presented the definition of "the degree of entropy" v_i of an input attribute X_i as follows:

v_i = |PD| / |WD|,  (5)

where v_i denotes the degree of entropy of the input attribute X_i and 1 ≤ i ≤ 4. First, Chen et al. found the whole domain WD of the input attribute X_i. Then, they found the individual domain of the input attribute X_i for each species of flower. PD contains a set of intervals I_1, I_2, ..., I_p which are not overlapped with the individual domain of the input attribute X_i for each species of flowers, and |PD| = |I_1| + |I_2| + ... + |I_p|, where |I_j| denotes the length of the interval I_j and 1 ≤ j ≤ p; |WD| denotes the length of the whole domain WD.

Assume that the attribute threshold value given by the user is α, the classification threshold value given by the user is β, and the level threshold value given by the user is γ, where α ∈ [0, 1], β ∈ [0, 1] and γ ∈ [0, 1]. The attribute threshold value α is used to test which attributes can be used to deal with the classification; the classification threshold value β is used to test whether the classification accuracy rate of the generated fuzzy rules is good enough; the level threshold value γ is used to determine which fuzzy rules should be modified.

Table 1
Iris data (Fisher, 1936); each species has 50 instances, attribute values in cm (the ith instance reads down the same position of the four rows of its species)

Iris-Setosa SL: 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4 5.1 5.7 5.1 5.4 5.1 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4 5.2 5.5 4.9 5.0 5.5 4.9 4.4 5.1 5.0 4.5 4.4 5.0 5.1 4.8 5.1 4.6 5.3 5.0
Iris-Setosa SW: 3.5 3.0 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 3.7 3.4 3.0 3.0 4.0 4.4 3.9 3.5 3.8 3.8 3.4 3.7 3.6 3.3 3.4 3.0 3.4 3.5 3.4 3.2 3.1 3.4 4.1 4.2 3.1 3.2 3.5 3.1 3.0 3.4 3.5 2.3 3.2 3.5 3.8 3.0 3.8 3.2 3.7 3.3
Iris-Setosa PL: 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 1.5 1.6 1.4 1.1 1.2 1.5 1.3 1.4 1.7 1.5 1.7 1.5 1.0 1.7 1.9 1.6 1.6 1.5 1.4 1.6 1.6 1.5 1.5 1.4 1.5 1.2 1.3 1.5 1.3 1.5 1.3 1.3 1.3 1.6 1.9 1.4 1.6 1.4 1.5 1.4
Iris-Setosa PW: 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 0.2 0.2 0.1 0.1 0.2 0.4 0.4 0.3 0.3 0.3 0.2 0.4 0.2 0.5 0.2 0.2 0.4 0.2 0.2 0.2 0.2 0.4 0.1 0.2 0.1 0.2 0.2 0.1 0.2 0.2 0.3 0.3 0.2 0.6 0.4 0.3 0.2 0.2 0.2 0.2
Iris-Versicolor SL: 7.0 6.4 6.9 5.5 6.5 5.7 6.3 4.9 6.6 5.2 5.0 5.9 6.0 6.1 5.6 6.7 5.6 5.8 6.2 5.6 5.9 6.1 6.3 6.1 6.4 6.6 6.8 6.7 6.0 5.7 5.5 5.5 5.8 6.0 5.4 6.0 6.7 6.3 5.6 5.5 5.5 6.1 5.8 5.0 5.6 5.7 5.7 6.2 5.1 5.7
Iris-Versicolor SW: 3.2 3.2 3.1 2.3 2.8 2.8 3.3 2.4 2.9 2.7 2.0 3.0 2.2 2.9 2.9 3.1 3.0 2.7 2.2 2.5 3.2 2.8 2.5 2.8 2.9 3.0 2.8 3.0 2.9 2.6 2.4 2.4 2.7 2.7 3.0 3.4 3.1 2.3 3.0 2.5 2.6 3.0 2.6 2.3 2.7 3.0 2.9 2.9 2.5 2.8
Iris-Versicolor PL: 4.7 4.5 4.9 4.0 4.6 4.5 4.7 3.3 4.6 3.9 3.5 4.2 4.0 4.7 3.6 4.4 4.5 4.1 4.5 3.9 4.8 4.0 4.9 4.7 4.3 4.4 4.8 5.0 4.5 3.5 3.8 3.7 3.9 5.1 4.5 4.5 4.7 4.4 4.1 4.0 4.4 4.6 4.0 3.3 4.2 4.2 4.2 4.3 3.0 4.1
Iris-Versicolor PW: 1.4 1.5 1.5 1.3 1.5 1.3 1.6 1.0 1.3 1.4 1.0 1.5 1.0 1.4 1.3 1.4 1.5 1.0 1.5 1.1 1.8 1.3 1.5 1.2 1.3 1.4 1.4 1.7 1.5 1.0 1.1 1.0 1.2 1.6 1.5 1.6 1.5 1.3 1.3 1.3 1.2 1.4 1.2 1.0 1.3 1.2 1.3 1.3 1.1 1.3
Iris-Virginica SL: 6.3 5.8 7.1 6.3 6.5 7.6 4.9 7.3 6.7 7.2 6.5 6.4 6.8 5.7 5.8 6.4 6.5 7.7 7.7 6.0 6.9 5.6 7.7 6.3 6.7 7.2 6.2 6.1 6.4 7.2 7.4 7.9 6.4 6.3 6.1 7.7 6.3 6.4 6.0 6.9 6.7 6.9 5.8 6.8 6.7 6.7 6.3 6.5 6.2 5.9
Iris-Virginica SW: 3.3 2.7 3.0 2.9 3.0 3.0 2.5 2.9 2.5 3.6 3.2 2.7 3.0 2.5 2.8 3.2 3.0 3.8 2.6 2.2 3.2 2.8 2.8 2.7 3.3 3.2 2.8 3.0 2.8 3.0 2.8 3.8 2.8 2.8 2.6 3.0 3.4 3.1 3.0 3.1 3.1 3.1 2.7 3.2 3.3 3.0 2.5 3.0 3.4 3.0
Iris-Virginica PL: 6.0 5.1 5.9 5.6 5.8 6.6 4.5 6.3 5.8 6.1 5.1 5.3 5.5 5.0 5.1 5.3 5.5 6.7 6.9 5.0 5.7 4.9 6.7 4.9 5.7 6.0 4.8 4.9 5.6 5.8 6.1 6.4 5.6 5.1 5.6 6.1 5.6 5.5 4.8 5.4 5.6 5.1 5.1 5.9 5.7 5.2 5.0 5.2 5.4 5.1
Iris-Virginica PW: 2.5 1.9 2.1 1.8 2.2 2.1 1.7 1.8 1.8 2.5 2.0 1.9 2.1 2.0 2.4 2.3 1.8 2.2 2.3 1.5 2.3 2.0 2.0 1.8 2.1 1.8 1.8 1.8 2.1 1.6 1.9 2.0 2.2 1.5 1.4 2.3 2.4 1.8 1.8 2.1 2.4 2.3 1.9 2.3 2.5 2.3 1.9 2.0 2.3 1.8

The proposed method is now presented as follows:

Step 1: Find the maximum attribute value and the minimum attribute value of each attribute of each species of the training instances.
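Step 1 amounts to a per-species minimum/maximum scan over the four attributes. A minimal sketch, assuming the tuple layout a_i = ((SL, SW, PL, PW), y_i); the function name and dictionary layout are illustrative, not from the paper:

```python
# Sketch of Step 1: per-species min/max of each attribute.
ATTRS = ("SL", "SW", "PL", "PW")

def attribute_ranges(instances):
    """instances: list of ((SL, SW, PL, PW), species) tuples.
    Returns {species: {attr: (min, max)}}."""
    ranges = {}
    for values, species in instances:
        per_attr = ranges.setdefault(species, {})
        for attr, v in zip(ATTRS, values):
            lo, hi = per_attr.get(attr, (v, v))
            per_attr[attr] = (min(lo, v), max(hi, v))
    return ranges

# Tiny demo with the first two Iris-Setosa rows of Table 2:
demo = [((5.1, 3.5, 1.4, 0.2), "Iris-Setosa"),
        ((4.9, 3.0, 1.4, 0.2), "Iris-Setosa")]
print(attribute_ranges(demo)["Iris-Setosa"]["SL"])  # (4.9, 5.1)
```

Run over the full training set, this reproduces the bounds collected in Table 4.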

Fig. 1. Membership functions of the corresponding linguistic terms (HN, MN, SN, Z, SP, MP, HP).

Step 2: Based on Eq. (5), calculate the degree of entropy of each attribute.

Step 3: Find the attributes whose degree of entropy is larger than the attribute threshold value α, where α ∈ [0, 1], and sort these attributes according to their degrees of entropy in a descending sequence.

Step 4: Choose the attribute X found in Step 3 that has the largest degree of entropy, and find the maximum attribute values and the minimum attribute values of the attribute X of each species based on the results obtained in Step 1. Sort these attribute values in an ascending sequence. Subtract 0.05 cm from the attribute values that are at odd positions, and add 0.05 cm to the attribute values that are at even positions. Then, sort these values in an ascending sequence again. Let these values be the crossover points of adjacent membership functions of the linguistic terms of the attribute X, as shown in Fig. 2, where the corresponding intervals of the membership functions are also shown in Fig. 2.

Step 5: Find the statistical distribution of the attribute values of the attribute X of the training instances falling in each interval shown in Fig. 2, and generate fuzzy rules from the training instances as follows. If most of the training instances falling in an interval belong to the species Z1, and the corresponding linguistic term of this interval is Y1, then generate the following fuzzy rule:

IF X is Y1 THEN the flower is Z1.

Step 6: Calculate the classification accuracy rate of the generated fuzzy rules. If the classification accuracy rate of the generated rules is equal to or larger than the classification threshold value β, where β ∈ [0, 1], then stop; otherwise, go to Step 7. The classification accuracy rate is calculated as follows. First, bind the intervals corresponding to the linguistic terms of the attribute X of the generated fuzzy rules with the inference species of the generated fuzzy rules. Assume that the attribute value of the attribute X of a training instance falls in an interval bound with a species. If the training instance belongs to the same species bound with the interval, then this training instance is classified correctly; otherwise, it is classified incorrectly. The classification accuracy rate of the generated fuzzy rules is defined as follows:

Classification Accuracy Rate = Number of Training Instances Correctly Classified / Number of Training Instances.  (6)

Step 7: Choose the attribute W found in Step 3 that has the second largest degree of entropy, and find the maximum attribute value and the minimum attribute value of the attribute W based on the result obtained in Step 1. Sort these attribute values in an ascending sequence. Subtract 0.05 cm from the attribute values that are at odd positions, and add 0.05 cm to the attribute values that are at even positions. Then, sort these values in an ascending sequence again. Let these attribute values be the crossover points of adjacent membership functions corresponding to the linguistic terms of the attribute W, as shown in Fig. 3, where the corresponding intervals of the membership functions are also shown in Fig. 3.

Step 8: Find the fuzzy rules whose classification error rate is larger than the level threshold value γ, where γ ∈ [0, 1]. The classification error rate of a fuzzy rule is defined as follows:

Classification Error Rate = Number of Training Instances Incorrectly Classified / Number of Training Instances.  (7)
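Eqs. (6) and (7) are plain counting ratios; a hedged sketch, where the names `predictions` and `labels` are illustrative rather than the paper's notation:

```python
# Eq. (6): fraction of training instances whose bound species matches
# their true species; Eq. (7): the complementary fraction.
def classification_accuracy_rate(predictions, labels):
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)  # Eq. (6)

def classification_error_rate(predictions, labels):
    return 1.0 - classification_accuracy_rate(predictions, labels)  # Eq. (7)

print(classification_accuracy_rate(["A", "B", "B"], ["A", "B", "A"]))
```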

Fig. 2. Corresponding intervals of the membership functions of the attribute X.

Fig. 3. Corresponding intervals of the membership functions of the attribute W.

Step 9: Find the incorrectly classified training instances and find the statistical distribution of the attribute values of the attribute W of these incorrectly classified training instances falling in each interval shown in Fig. 3. If most of the attribute values of these incorrectly classified training instances fall in an interval belonging to the species Z2 and the corresponding linguistic term of this interval is Y2, then generate the following fuzzy rule:

IF X is Y1 and W is Y2 THEN the flower is Z2,

and modify the originally generated fuzzy rule into the following form:

IF X is Y1 and W is NOT Y2 THEN the flower is Z1,

where NOT Y2 denotes the complement of the fuzzy set Y2.

Step 10: If there are fuzzy rules whose classification error rate is larger than the level threshold value γ, where γ ∈ [0, 1], then go to Step 9; else go to Step 11.

Step 11: If the classification accuracy rate of the training instances is larger than the classification threshold value β, where β ∈ [0, 1], then stop; else go to Step 7.

In the following, we present how to classify a testing instance:

Step 1: For each generated fuzzy rule, calculate the membership grades of the testing instance belonging to the membership functions of the linguistic terms appearing in the antecedent portion of the fuzzy rule, and find the minimum value M of these membership grades. If the inference result of the fuzzy rule is the flower of the species Z, then the degree of possibility that the testing instance belongs to the species Z is M, where M ∈ [0, 1].

Step 2: Choose the rule having the largest inferred degree of possibility. If the inference result of the chosen rule is the flower of the species Z, then the testing instance is classified into the species Z.

4. An example

In this section, we use an example to illustrate the proposed method. Assume that the attribute threshold value α, the classification threshold value β, and the level threshold value γ given by the user are 0.9, 1 and 0, respectively (i.e., α = 0.9, β = 1 and γ = 0). Assume that the training data set is as shown in Table 2 and the testing data set is as shown in Table 3.

[Step 1] Based on Table 2, we can find the maximum attribute value and the minimum attribute value of each attribute of each species of the training instances, as shown in Table 4.

[Step 2] Based on Eq. (5), we can calculate the degree of entropy of each attribute as follows:

(i) The degree of entropy of the attribute SL: ((4.9 − 4.4) + (7.7 − 6.8)) / (7.7 − 4.4) = 0.412.
(ii) The degree of entropy of the attribute SW: ((4.2 − 3.3) + (2.2 − 2)) / (4.2 − 2) = 0.478.
(iii) The degree of entropy of the attribute PL: ((1.9 − 1) + (4.8 − 3.3) + (6.7 − 4.9)) / (6.7 − 1) = 0.956.
(iv) The degree of entropy of the attribute PW: ((0.6 − 0.1) + (1.5 − 1) + (2.5 − 1.5)) / (2.5 − 0.1) = 0.955.

[Step 3] Because the degrees of entropy of the attribute PL and the attribute PW are larger than the attribute threshold value α, where α = 0.9, after sorting these two attributes according to their degrees of entropy in a descending sequence, we can get the following result: PL > PW.

[Step 4] Because the attribute PL has the largest degree of entropy, from Table 4 we can get the maximum attribute value and the minimum attribute value of the attribute PL of each species of the training instances, and after sorting these attribute values in an ascending sequence, we can get the following result: 1 cm < 1.9 cm < 3 cm < 4.8 cm < 5.1 cm < 6.9 cm. After subtracting 0.05 cm from the attribute values which are at odd positions and adding 0.05 cm to the attribute values which are at even positions, and after sorting these values in an ascending sequence again, we can get the following result:

0.95 cm < 1.95 cm < 2.95 cm < 4.85 cm < 5.05 cm < 6.95 cm.
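The crossover-point construction of Steps 4 and 7 can be sketched directly; the function name is illustrative, and the PL extrema below are the Table 4 values just used:

```python
# Sort the per-species extrema, shift odd positions (1st, 3rd, ...) down
# by 0.05 cm and even positions up by 0.05 cm, then re-sort.
def crossover_points(extrema, delta=0.05):
    vals = sorted(extrema)
    shifted = [v - delta if i % 2 == 0 else v + delta  # index 0, 2, 4, ...
               for i, v in enumerate(vals)]            # are the odd positions
    return sorted(shifted)

pl_extrema = [1.0, 1.9, 3.0, 4.8, 5.1, 6.9]  # PL min/max per species (Table 4)
print([round(v, 2) for v in crossover_points(pl_extrema)])
# [0.95, 1.95, 2.95, 4.85, 5.05, 6.95]
```

This reproduces the six PL crossover points obtained above.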


Table 2
Training data set; each species has 25 instances, attribute values in cm (the ith instance reads down the same position of the four rows of its species)

Iris-Setosa SL: 5.1 4.9 5 4.6 5 4.9 5.4 4.8 4.3 5.1 5.4 4.6 5 5 5.2 4.7 5.5 4.9 4.9 4.4 4.5 4.4 5.1 5.3 5
Iris-Setosa SW: 3.5 3 3.6 3.4 3.4 3.1 3.7 3 3 3.5 3.4 3.6 3 3.4 3.4 3.2 4.2 3.1 3.6 3 2.3 3.2 3.8 3.7 3.3
Iris-Setosa PL: 1.4 1.4 1.4 1.4 1.5 1.5 1.5 1.4 1.1 1.4 1.7 1 1.6 1.6 1.4 1.6 1.4 1.5 1.4 1.3 1.3 1.3 1.9 1.5 1.4
Iris-Setosa PW: 0.2 0.2 0.2 0.3 0.2 0.1 0.2 0.1 0.1 0.3 0.2 0.2 0.2 0.4 0.2 0.2 0.2 0.2 0.1 0.2 0.3 0.2 0.4 0.2 0.2
Iris-Versicolor SL: 7 5.5 5.7 4.9 5.9 6 6.1 5.6 5.6 6.3 6.8 6 5.5 5.8 6 6 6.3 5.6 5.5 5.8 5 5.6 6.2 5.1 5.7
Iris-Versicolor SW: 3.2 2.3 2.8 2.1 3 2.2 2.9 2.9 3 2.5 2.8 2.9 2.4 2.7 2.7 3.4 2.3 3 2.6 2.6 2.3 2.7 2.9 2.5 2.8
Iris-Versicolor PL: 4.7 4 4.5 3.3 4.2 4 4.7 3.6 4.5 4.9 4.8 4.5 3.8 3.9 5.1 4.5 4.4 4.1 4.4 4 3.3 4.2 4.3 3 4.1
Iris-Versicolor PW: 1.4 1.3 1.3 1 1.5 1 1.4 1.3 1.5 1.5 1.4 1.5 1.1 1.2 1.6 1.6 1.3 1.3 1.2 1.2 1 1.3 1.3 1.1 1.3
Iris-Virginica SL: 7.1 7.6 7.3 7.2 6.5 6.4 6.8 5.7 5.8 6.5 7.7 7.7 6.7 6.2 6.4 7.2 7.4 7.9 6.3 6.4 5.8 6.8 6.3 6.2 5.9
Iris-Virginica SW: 3 3 2.9 3.6 3.2 2.7 3 2.5 2.8 3 2.6 2.8 3.3 2.8 2.8 3 2.8 3.8 3.4 3.1 2.7 3.2 2.5 3.4 3
Iris-Virginica PL: 5.9 6.6 6.3 6.1 5.1 5.3 5.5 5 5.1 5.5 6.9 6.7 5.7 4.8 5.6 5.8 6.1 6.4 5.6 5.5 5.1 5.9 5 5.4 5.1
Iris-Virginica PW: 2.1 2.1 1.8 2.5 2 1.9 2.1 2 2.4 1.8 2.3 2 2.1 1.8 2.1 1.6 1.9 2 2.4 1.8 1.9 2.3 1.9 2.3 1.8
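The instances a_i = ((SL, SW, PL, PW), y_i) map naturally onto tuples; the three rows below are the first instance of each species in Table 2, and the variable name is illustrative:

```python
# One training instance per species, taken from Table 2.
training_instances = [
    ((5.1, 3.5, 1.4, 0.2), "Iris-Setosa"),
    ((7.0, 3.2, 4.7, 1.4), "Iris-Versicolor"),
    ((7.1, 3.0, 5.9, 2.1), "Iris-Virginica"),
]
for (sl, sw, pl, pw), y in training_instances:
    print(f"PL = {pl} cm -> {y}")
```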

Table 3
Testing data set; each species has 25 instances, attribute values in cm (the ith instance reads down the same position of the four rows of its species)

Iris-Setosa SL: 4.7 4.6 5.4 4.4 4.8 5.8 5.7 5.4 5.7 5.1 5.1 5.1 4.8 5.2 4.8 5.4 5.2 5 5.5 5.1 5 5 4.8 5.1 4.6
Iris-Setosa SW: 3.2 3.1 3.9 2.9 3.4 4 4.4 3.9 3.8 3.8 3.7 3.3 3.4 3.5 3.1 3.4 4.1 3.2 3.5 3.4 3.5 3.5 3 3.8 3.2
Iris-Setosa PL: 1.3 1.5 1.7 1.4 1.6 1.2 1.5 1.3 1.7 1.5 1.5 1.7 1.9 1.5 1.6 1.5 1.5 1.2 1.3 1.5 1.3 1.6 1.4 1.6 1.4
Iris-Setosa PW: 0.2 0.2 0.4 0.2 0.2 0.2 0.4 0.4 0.3 0.3 0.4 0.5 0.2 0.2 0.2 0.4 0.1 0.2 0.2 0.2 0.3 0.6 0.3 0.2 0.2
Iris-Versicolor SL: 6.4 6.9 6.5 6.3 6.6 5.2 5 6.7 5.8 6.2 5.6 5.9 6.1 6.1 6.4 6.6 6.7 5.7 5.5 5.4 6.7 5.5 6.1 5.7 5.7
Iris-Versicolor SW: 3.2 3.1 2.8 3.3 2.9 2.7 2 3.1 2.7 2.2 2.5 3.2 2.8 2.8 2.9 3 3 2.6 2.4 3 3.1 2.5 3 3 2.9
Iris-Versicolor PL: 4.5 4.9 4.6 4.7 4.6 3.9 3.5 4.4 4.1 4.5 3.9 4.8 4 4.7 4.3 4.4 5 3.5 3.7 4.5 4.7 4 4.6 4.2 4.2
Iris-Versicolor PW: 1.5 1.5 1.5 1.6 1.3 1.4 1 1.4 1 1.5 1.1 1.8 1.3 1.2 1.3 1.4 1.7 1 1 1.5 1.5 1.3 1.4 1.2 1.3
Iris-Virginica SL: 6.3 5.8 6.3 6.5 4.9 6.7 6.4 7.7 6 6.9 5.6 6.3 7.2 6.1 6.4 6.3 6.1 7.7 6 6.9 6.7 6.9 6.7 6.7 6.5
Iris-Virginica SW: 3.3 2.7 2.9 3 2.5 2.5 3.2 3.8 2.2 3.2 2.8 2.7 3.2 3 2.8 2.8 2.6 3 3 3.1 3.1 3.1 3.3 3 3
Iris-Virginica PL: 6 5.1 5.6 5.8 4.5 5.8 5.3 6.7 5 5.7 4.9 4.9 6 4.9 5.6 5.1 5.6 6.1 4.8 5.4 5.6 5.1 5.7 5.2 5.2
Iris-Virginica PW: 2.5 1.9 1.8 2.2 1.7 1.8 2.3 2.2 1.5 2.3 2 1.8 1.8 1.8 2.2 1.5 1.4 2.3 1.8 2.1 2.4 2.3 2.5 2.3 2

Let these values be the crossover points of the adjacent membership functions of the linguistic terms of the attribute PL, as shown in Fig. 4. The corresponding intervals of the membership functions of the linguistic terms are also shown in Fig. 4.


Table 4
Maximum attribute value and minimum attribute value of each attribute of each species (cm)

Species          SL Min  SL Max  SW Min  SW Max  PL Min  PL Max  PW Min  PW Max
Iris-Setosa      4.3     5.5     2.3     4.2     1.0     1.9     0.1     0.4
Iris-Versicolor  4.9     7.0     2.1     3.4     3.0     5.1     1.0     1.6
Iris-Virginica   5.7     7.9     2.5     3.8     4.8     6.9     1.6     2.5

Fig. 4. Corresponding intervals of the membership functions of the attribute PL (crossover points at 0.95, 1.95, 2.95, 4.85, 5.05 and 6.95 cm).
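Eq. (5) can be checked numerically against per-species domains such as those in Table 4. A sketch under an assumption we make explicit: we read PD as the portion of the whole domain WD covered by exactly one species' individual domain, so the toy numbers below are illustrative rather than the paper's Iris values:

```python
# Numerical sketch of v_i = |PD| / |WD| (Eq. (5)), assuming PD is the
# part of WD owned by exactly one species (our reading of the definition).
def degree_of_entropy(domains, step=1e-3):
    """domains: list of (lo, hi) individual domains, one per species."""
    lo = min(d[0] for d in domains)
    hi = max(d[1] for d in domains)
    n = int(round((hi - lo) / step))
    single = sum(
        1 for k in range(n)
        if sum(a <= lo + (k + 0.5) * step <= b for a, b in domains) == 1
    )
    return single / n  # fraction of WD covered by exactly one species

# Two species with domains [0, 2] and [1, 3] over WD = [0, 3]: the
# single-species regions [0, 1) and (2, 3] make up 2/3 of WD.
print(round(degree_of_entropy([(0.0, 2.0), (1.0, 3.0)]), 3))  # 0.667
```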

[Step 5] From Table 2, we can see that most training instances whose attribute values of the attribute PL fall between 0.95 cm and 1.95 cm belong to the species Iris-Setosa, and from Fig. 4, the corresponding linguistic term of the interval [0.95 cm, 1.95 cm] is MN; there are no training instances whose attribute values of the attribute PL fall between 1.95 cm and 2.95 cm; most training instances whose attribute values of the attribute PL fall between 2.95 cm and 4.85 cm belong to the species Iris-Versicolor, and the corresponding linguistic term of the interval [2.95 cm, 4.85 cm] is Z; most training instances whose attribute values of the attribute PL fall between 4.85 cm and 5.05 cm belong to the species Iris-Versicolor, and the corresponding linguistic term of the interval [4.85 cm, 5.05 cm] is SP; most training instances whose attribute values of the attribute PL fall between 5.05 cm and 6.95 cm belong to the species Iris-Virginica, and the corresponding linguistic term of the interval [5.05 cm, 6.95 cm] is MP; most training instances whose attribute values of the attribute PL are larger than 6.95 cm belong to the species Iris-Virginica, and the corresponding linguistic term of the interval [6.95 cm, 10 cm] is HP. Thus, we can get the following five fuzzy rules:

Rule 1: IF PL is MN THEN the flower is Iris-Setosa,
Rule 2: IF PL is Z THEN the flower is Iris-Versicolor,
Rule 3: IF PL is SP THEN the flower is Iris-Versicolor,
Rule 4: IF PL is MP THEN the flower is Iris-Virginica,
Rule 5: IF PL is HP THEN the flower is Iris-Virginica.

[Step 6] Because the classification accuracy rate of the five generated fuzzy rules obtained in Step 5 is less than the classification threshold value β given by the user, where β = 1, we go to Step 7.

[Step 7] Because the attribute PW has the second largest degree of entropy, from Table 4, we can get the maximum attribute value and the minimum attribute value of the attribute PW of each species of the training instances. After sorting these values in an ascending sequence, we can get the following result: 0.1 cm < 0.4 cm < 1 cm < 1.6 cm < 1.6 cm < 2.5 cm. After subtracting 0.05 cm from the values which are at odd positions and adding 0.05 cm to the values which are at even positions, and after sorting these values in an ascending sequence again, we can get the following result: 0.05 cm < 0.45 cm < 0.95 cm < 1.55 cm < 1.65 cm < 2.55 cm. Let these values be the crossover points of the adjacent membership functions of the linguistic terms of the attribute PW. The corresponding intervals of the membership functions of the linguistic terms of the attribute PW are shown in Fig. 5.

[Step 8] Because the classification error rates of Rule 4 and Rule 5 are larger than the level threshold value γ given by the user, where γ = 0, the system has to deal with Rule 4 and Rule 5.
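The interval-majority vote of Step 5 can be sketched as follows; the `Counter`-based tallying and the function name are our illustration, with intervals and terms following Fig. 4:

```python
# Step 5 sketch: count species per interval of the chosen attribute and
# emit "IF X is <term> THEN <majority species>" rules; empty intervals
# (such as PL in [1.95, 2.95)) yield no rule.
from collections import Counter

def generate_rules(values_with_species, crossovers, terms):
    """crossovers: ascending cut points; terms: one linguistic term per
    interval (len(crossovers) - 1 of them)."""
    votes = [Counter() for _ in terms]
    for v, species in values_with_species:
        for i in range(len(terms)):
            if crossovers[i] <= v < crossovers[i + 1]:
                votes[i][species] += 1
    return [(terms[i], votes[i].most_common(1)[0][0])
            for i in range(len(terms)) if votes[i]]

data = [(1.4, "Iris-Setosa"), (1.5, "Iris-Setosa"), (4.0, "Iris-Versicolor")]
rules = generate_rules(data, [0.95, 1.95, 2.95, 4.85], ["MN", "SN", "Z"])
print(rules)  # [('MN', 'Iris-Setosa'), ('Z', 'Iris-Versicolor')]
```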

Fig. 5. Corresponding intervals of the membership functions of the linguistic terms of the attribute PW (crossover points at 0.05, 0.45, 0.95, 1.55, 1.65 and 2.55 cm).


[Step 9] Because the system finds that most training instances are wrongly classified when the attribute values of the attribute PW fall between 1.55 cm and 1.65 cm, belonging to the species Iris-Versicolor, and the corresponding linguistic term of the interval [1.55 cm, 1.65 cm] is SP, the system generates the following fuzzy rule from Rule 4:

Rule 6: IF PL is MP and PW is SP THEN the flower is Iris-Versicolor,

and the system modifies Rule 4 into "Rule 4*", shown as follows:

Rule 4*: IF PL is MP and PW is NOT SP THEN the flower is Iris-Virginica,

where NOT SP denotes the complement of the fuzzy set SP.

[Step 10] Because the classification error rate of Rule 5 is larger than the level threshold value γ, where γ = 0, we go to Step 9.

[Step 9] Because the system finds that most training instances are wrongly classified when the attribute values of the attribute PW fall between 1.65 cm and 2.55 cm, belonging to the species Iris-Virginica, and the corresponding linguistic term of the interval [1.65 cm, 2.55 cm] is MP, the system generates the following fuzzy rule from Rule 5:

Rule 7: IF PL is HP and PW is MP THEN the flower is Iris-Virginica,

and the system modifies Rule 5 into "Rule 5*", shown as follows:

Rule 5*: IF PL is HP and PW is NOT MP THEN the flower is Iris-Versicolor,

where NOT MP denotes the complement of the fuzzy set MP.

[Step 10] Because there are no fuzzy rules whose classification error rate is larger than the level threshold value γ given by the user, where γ = 0, we go to Step 11.

[Step 11] Because the classification accuracy rate of the generated fuzzy rules is equal to the classification threshold value β given by the user, where β = 1, the system stops.

Therefore, we can get the following seven fuzzy rules:

Rule 1: IF PL is MN THEN the flower is Iris-Setosa,
Rule 2: IF PL is Z THEN the flower is Iris-Versicolor,
Rule 3: IF PL is SP THEN the flower is Iris-Versicolor,
Rule 4*: IF PL is MP and PW is NOT SP THEN the flower is Iris-Virginica,
Rule 5*: IF PL is HP and PW is NOT MP THEN the flower is Iris-Versicolor,
Rule 6: IF PL is MP and PW is SP THEN the flower is Iris-Versicolor,
Rule 7: IF PL is HP and PW is MP THEN the flower is Iris-Virginica.

In the following, we illustrate how to classify a testing instance. We use the first testing instance (i.e., SL = 4.7 cm, SW = 3.2 cm, PL = 1.3 cm and PW = 0.2 cm) in Table 3 to illustrate the classification process. Because the attribute value of PL of this testing instance is 1.3 cm and the attribute value of PW of this testing instance is 0.2 cm:

(i) Let us consider Rule 1: IF PL is MN THEN the flower is Iris-Setosa. From Fig. 4, the membership grade of the testing instance belonging to the membership function of the linguistic term MN appearing in the antecedent portion of Rule 1 is 0.85, as shown in Fig. 6. Thus, the degree of possibility that the testing instance belongs to the species Iris-Setosa is 0.85.

(ii) Let us consider Rule 2: IF PL is Z THEN the flower is Iris-Versicolor. From Fig. 4, the membership grade of the testing instance belonging to the membership function of the linguistic term Z appearing in the antecedent portion of Rule 2 is 0, as shown in Fig. 7. Thus, the degree of possibility that the testing instance belongs to the species Iris-Versicolor is 0.

(iii) Let us consider Rule 3: IF PL is SP THEN the flower is Iris-Versicolor. From Fig. 4, the membership grade of the testing instance belonging to the membership function of the linguistic term SP appearing in the antecedent portion of Rule 3 is 0, as shown in Fig. 8. Thus, the degree of possibility that the testing instance belongs to the species Iris-Versicolor is 0.
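The inference procedure can be sketched in a few lines. We assume triangular membership functions that peak at each interval's midpoint and cross 0.5 at its endpoints; this shape is our assumption (Fig. 1 also shows trapezoids), but under it PL = 1.3 cm gives grade 0.85 for MN, matching the value read off Fig. 6:

```python
# Evaluate antecedent grades, take the minimum (Step 1 of classification),
# and use 1 - grade for complemented terms such as NOT SP.
def triangular_grade(x, left, right):
    """Triangle peaking at the interval midpoint, 0.5 at left/right.
    Assumed shape, consistent with the 0.85 grade in Fig. 6."""
    center, width = (left + right) / 2.0, (right - left)
    return max(0.0, 1.0 - abs(x - center) / width)

def complement(grade):
    return 1.0 - grade  # Eq. (4)

# Rule 1 for PL = 1.3 cm: MN spans [0.95, 1.95] (Fig. 4).
print(round(triangular_grade(1.3, 0.95, 1.95), 2))  # 0.85

# Rule 4* for PL = 1.3 cm, PW = 0.2 cm: min(MP(PL), NOT SP(PW)) = min(0, 1).
grade_rule4 = min(triangular_grade(1.3, 5.05, 6.95),
                  complement(triangular_grade(0.2, 1.55, 1.65)))
print(grade_rule4)  # 0.0
```

Step 2 of the classification then picks the species of the rule with the largest degree of possibility, here Rule 1 (Iris-Setosa, 0.85).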

MN

SN

Z

SP

MP

HP

1 0.85 0.5

PL (cm)

0

0.95

1.95 2.95 4.85

5.05

6.95

10

1.3

Fig. 6. The inference process of Rule 1.

HN

MN

SN

Z

SP

MP

HP

1

0.5

PL (cm) 0

0.95

1.95 2.95 4.85

5.05

6.95

1.3

Fig. 7. The inference process of Rule 2.

10

S.-M. Chen, F.-M. Tsai / Expert Systems with Applications 35 (2008) 611–621

HN

MN

SN

Z

SP

MP

HP

1

0.5

PL (cm)

0

0.95

1.95 2.95 4.85

5.05

10

6.95

1.3

Fig. 8. The inference process of Rule 3.


(iv) Let us consider Rule 4*: IF PL is MP and PW is SP THEN the flower is Iris-Virginica. From Figs. 4 and 5, we can get the membership grades of the testing instance with respect to the membership functions of the linguistic terms MP and SP appearing in the antecedent portion of Rule 4*, which are 0 and 1, respectively, as shown in Fig. 9. By taking the minimum of these values, the calculation result is 0. It indicates that the degree of possibility that the testing instance belongs to the species Iris-Virginica is 0.

(v) Let us consider Rule 5*: IF PL is HP and PW is MP THEN the flower is Iris-Versicolor. From Figs. 4 and 5, we can get the membership grades of the testing instance with respect to the membership functions of the linguistic terms HP and MP appearing in the antecedent portion of Rule 5*, which are 0 and 1, respectively, as shown in Fig. 10. By taking the minimum of these values, the calculation result is 0. It indicates that the degree of possibility that the testing instance belongs to the species Iris-Versicolor is 0.

(vi) Let us consider Rule 6: IF PL is MP and PW is SP THEN the flower is Iris-Versicolor. From Figs. 4 and 5, we can get the membership grades of the testing instance with respect to the membership functions of the linguistic terms MP and SP appearing in the antecedent portion of Rule 6, which are 0 and 0, respectively, as shown in Fig. 11. By taking the minimum of these values, the calculation result is 0. It indicates that the degree of possibility that the testing instance belongs to the species Iris-Versicolor is 0.

(vii) Let us consider Rule 7: IF PL is HP and PW is MP THEN the flower is Iris-Virginica. From Figs. 4 and 5, we can get the membership grades of the testing instance with respect to the membership functions of the linguistic terms HP and MP appearing in the antecedent portion of Rule 7, which are 0 and 0, respectively, as shown in Fig. 12. By taking the minimum of these values, the calculation result is 0. It indicates that the degree of possibility that the testing instance belongs to the species Iris-Virginica is 0.

Therefore, the testing instance (SL = 4.7 cm, SW = 3.2 cm, PL = 1.3 cm and PW = 0.2 cm) gets the maximum inference degree of possibility (i.e., 0.85) from Rule 1. The inference result of Rule 1 is the species Iris-Setosa. Thus, the testing instance is classified into the species Iris-Setosa. From Table 3, we can see that the testing instance is classified correctly.

Fig. 9. The inference process of Rule 4*.
Fig. 10. The inference process of Rule 5*.
Fig. 11. The inference process of Rule 6.
Fig. 12. The inference process of Rule 7.

5. Experimental results

We have implemented the proposed method using Visual Basic version 6.0 on a Pentium 4 PC, where the attribute threshold value a, the classification threshold value b and the level threshold value c given by the user are 0.9, 1 and 0, respectively.

Case 1: The system randomly chooses 75 instances from the Iris data as the training data set and lets the remaining 75 instances be the testing data set. After executing the program 200 times, the average classification accuracy rate is 95.8333% and the average number of generated fuzzy rules is 6.63.

Case 2: The system randomly chooses 120 instances from the Iris data as the training data set and lets the remaining 30 instances be the testing data set. After executing the program 200 times, the average classification accuracy rate is 97.166% and the average number of generated fuzzy rules is 7.535.

Table 5 compares the average classification accuracy rates of different methods. From Table 5, we can see that the proposed method has a higher average classification accuracy rate than the existing methods.

Table 5
A comparison of the average classification accuracy rates for different methods

Method: average classification accuracy rate (%)
Hong and Lee's method (1996) (training data set: 75 instances; testing data set: 75 instances; executing 200 times): 95.570
Hong and Chen's method (1999) (training data set: 75 instances; testing data set: 75 instances; executing 200 times): 95.570
The proposed method (training data set: 75 instances; testing data set: 75 instances; executing 200 times; attribute threshold value a = 0.9; classification threshold value b = 1; level threshold value c = 0): 95.833
Castro's method (1999) (training data set: 120 instances; testing data set: 30 instances; executing 10 times): 96.600
Chen and Fang's method (2005a) (training data set: 120 instances; testing data set: 30 instances; executing 200 times): 96.72
Chen and Tsai's method (2005) (training data set: 120 instances; testing data set: 30 instances; executing 200 times; correlation coefficient threshold value f = 0.86; boundary shift value e = 1/3; center shift value d = 0.025): 96.82
Chen and Chang's method (2005) (training data set: 120 instances; testing data set: 30 instances; executing 2000 times): 96.88
Chen and Fang's method (2005b) (training data set: 120 instances; testing data set: 30 instances; executing 200 times): 96.96
The proposed method (training data set: 120 instances; testing data set: 30 instances; executing 200 times; attribute threshold value a = 0.9; classification threshold value b = 1; level threshold value c = 0): 97.166

6. Conclusions

In this paper, we have presented a new method to construct membership functions and generate fuzzy rules from training instances for handling the Iris data classification problem based on the attribute threshold value a, the classification threshold value b and the level threshold value c given by the user, where a ∈ [0, 1], b ∈ [0, 1] and c ∈ [0, 1]. The proposed method gets a higher average classification accuracy rate than the existing methods. In the future, we will develop a method based on genetic algorithms to find the optimal values of the attribute threshold value a, the classification threshold value b and the level threshold value c for handling fuzzy classification problems.
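The split-and-average evaluation protocol described in Section 5 (repeated random train/test splits of the 150 Iris instances, with accuracy averaged over the runs) can be sketched as follows. This is a minimal illustration, not the authors' code; `train_rules` and `classify` are hypothetical stand-ins for the paper's rule-generation and fuzzy-inference steps:

```python
import random

def train_rules(train):
    # Stand-in "learner": remember the majority class of the training set.
    labels = [label for _, label in train]
    return max(set(labels), key=labels.count)

def classify(rules, features):
    # Stand-in inference: always predict the remembered class.
    return rules

def average_accuracy(instances, n_train=120, runs=200, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        data = instances[:]
        rng.shuffle(data)                       # random train/test split
        train, test = data[:n_train], data[n_train:]
        rules = train_rules(train)
        correct = sum(classify(rules, x) == y for x, y in test)
        total += correct / len(test)
    return 100.0 * total / runs                 # average accuracy rate (%)

# Toy data set of 150 single-class instances: accuracy is trivially 100%.
toy = [((float(i),), "Iris-Setosa") for i in range(150)]
print(average_accuracy(toy))  # 100.0
```

Setting `n_train=75` reproduces the Case 1 protocol and `n_train=120` the Case 2 protocol, once the stand-ins are replaced by an actual rule generator and classifier.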


Acknowledgement

This work was supported in part by the National Science Council, Republic of China, under Grant NSC 95-2221-E-011-116-MY2.

References

Castro, J. L., Castro-Schez, J. J., & Zurita, J. M. (1999). Learning maximal structure rules in fuzzy logic for knowledge acquisition in expert systems. Fuzzy Sets and Systems, 101(3), 331-342.
Chang, C. H., & Chen, S. M. (2001). Constructing membership functions and generating weighted fuzzy rules from training data. In Proceedings of the 2001 ninth national conference on fuzzy theory and its applications, Chungli, Taoyuan, Taiwan, Republic of China (pp. 708-713).
Chen, S. M., & Chang, C. H. (2005). A new method to construct membership functions and generate weighted fuzzy rules from training instances. Cybernetics and Systems, 36(4), 397-414.
Chen, S. M., & Chen, Y. C. (2002). Automatically constructing membership functions and generating fuzzy rules using genetic algorithms. Cybernetics and Systems, 33(8), 841-862.
Chen, S. M., & Fang, Y. D. (2005a). A new approach for handling the Iris data classification problem. International Journal of Applied Science and Engineering, 3(1), 37-49.
Chen, S. M., & Fang, Y. D. (2005b). A new method to deal with fuzzy classification problems by tuning membership functions for fuzzy classification systems. Journal of Chinese Institute of Engineers, 28(1), 169-173.
Chen, S. M., & Lin, H. L. (2005a). Generating weighted fuzzy rules for handling classification problems. International Journal of Electronic Business Management, 3(2), 116-128.
Chen, S. M., & Lin, H. L. (2005b). Generating weighted fuzzy rules from training instances using genetic algorithms to handle the Iris data classification problem. Journal of Information Science and Engineering, 22(1), 175-188.
Chen, S. M., & Lin, S. Y. (2000). A new method for constructing fuzzy decision trees and generating fuzzy classification rules from training examples. Cybernetics and Systems, 31(7), 763-785.
Chen, S. M., & Tsai, F. M. (2005). A new method to construct membership functions and generate fuzzy rules from training instances. International Journal of Information and Management Sciences, 16(2), 47-72.
Chen, Y. C., Wang, L. H., & Chen, S. M. (2006). Generating weighted fuzzy rules from training data for dealing with the Iris data classification problem. International Journal of Applied Science and Engineering, 4(1), 41-52.
Fisher, R. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, 179-188.
Hong, T. P., & Chen, J. B. (1999). Finding relevant attributes and membership functions. Fuzzy Sets and Systems, 103(3), 389-404.
Hong, T. P., & Lee, C. Y. (1996). Induction of fuzzy rules and membership functions from training examples. Fuzzy Sets and Systems, 84(1), 33-47.
Hong, T. P., & Lee, C. Y. (1999). Effect of merging order on performance of fuzzy induction. Intelligent Data Analysis, 3(2), 39-151.
Ishibuchi, H., & Nakashima, T. (2001). Effect of rule weights in fuzzy rule-based classification systems. IEEE Transactions on Fuzzy Systems, 9(4), 506-515.
Tsai, F. M., & Chen, S. M. (2002). A new method for constructing membership functions and generating fuzzy rules for fuzzy classification systems. In Proceedings of 2002 tenth national conference on fuzzy theory and its application, Hsinchu, Taiwan, Republic of China.
Wu, T. P., & Chen, S. M. (1999). A new method for constructing membership functions and fuzzy rules from training examples. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 29(1), 25-40.
Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8, 338-353.