Fuzzy Sets and Systems 59 (1993) 295-304 North-Holland
Efficient fuzzy partition of pattern space for classification problems
Hisao Ishibuchi, Ken Nozaki and Hideo Tanaka
Department of Industrial Engineering, University of Osaka Prefecture, Gakuencho 1-1, Sakai, Osaka 593, Japan
Received September 1992
Revised January 1993
Abstract: This paper proposes an efficient fuzzy partition method of a pattern space for classification problems. The proposed method is based on the sequential subdivision of fuzzy subspaces, and the generated fuzzy subspaces have different sizes. In the proposed method, an n-dimensional pattern space is first divided into $2^n$ fuzzy subspaces of the same size. Next, one of the fuzzy subspaces is selected and subdivided into $2^n$ fuzzy subspaces. This procedure is iterated until a stopping condition is satisfied. Some criteria for selecting a fuzzy subspace to be subdivided are proposed and compared with each other by computer simulations. The proposed method is also compared with other fuzzy classification methods.
Keywords: Classification problems; rule generation; fuzzy if-then rules; fuzzy partition.
Correspondence to: Dr. H. Ishibuchi, Dept. of Industrial Engineering, College of Engineering, University of Osaka Prefecture, Gakuen-Cho 1-1, Sakai, Osaka 593, Japan.
1. Introduction
Fuzzy logic has mainly been applied to control problems with fuzzy if-then rules [9, 11]. In most fuzzy control systems, the fuzzy if-then rules were derived from human experts. Recently, several approaches have been proposed for automatically generating fuzzy if-then rules from numerical data (for example, see [1, 5, 8, 12, 13]). For classification problems, an automated generation method of fuzzy if-then rules has been proposed [6, 7]. Generation of fuzzy if-then rules from numerical data for pattern classification problems consists of two phases: fuzzy partition of a pattern space into fuzzy subspaces and determination of a fuzzy if-then rule for each fuzzy subspace. In Ishibuchi et al. [6, 7], the fuzzy partition by a simple fuzzy grid was employed. An example of such a fuzzy partition is shown in Figure 1. In the fuzzy partition by a simple fuzzy grid, the number of fuzzy subspaces is $K^n$ when each axis of an n-dimensional pattern space is divided into K fuzzy subsets. In Figure 1, a two-dimensional pattern space is divided into $5^2 = 25$ fuzzy subspaces. One problem of such a fuzzy partition is that the number of fuzzy subspaces increases exponentially as the dimension of the pattern space increases. For example, an eight-dimensional pattern space is divided into $5^8 = 390\,625$ fuzzy subspaces when each axis has 5 fuzzy subsets. Therefore the fuzzy partition by a simple fuzzy grid as shown in Figure 1 is impractical for classification problems in a high-dimensional pattern space.
This paper proposes a simple method for obtaining an efficient fuzzy partition. In the proposed method, a pattern space is divided into fuzzy subspaces of different sizes as shown in Figure 2. Some areas are divided into coarse fuzzy subspaces and other areas into fine fuzzy subspaces. The size of each fuzzy subspace is determined depending on the configuration of the training patterns.
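The growth of the simple fuzzy grid can be checked with a few lines of arithmetic. The following is only an illustrative sketch; the function name `grid_subspace_count` is a hypothetical helper, not something defined in the paper.

```python
def grid_subspace_count(k: int, n: int) -> int:
    """Number of fuzzy subspaces when each of n axes carries k fuzzy subsets."""
    return k ** n


# The two cases quoted in the Introduction (K = 5).
two_dimensional = grid_subspace_count(5, 2)    # 5^2
eight_dimensional = grid_subspace_count(5, 8)  # 5^8
```

Under this count, every additional attribute multiplies the number of subspaces by K, which is the exponential growth the proposed method tries to avoid.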
2. Fuzzy classification method with fuzzy if-then rules
For classification problems, several methods based on fuzzy set theory have been proposed (for example, see Grabisch et al. [3, 4] and Pedrycz [10]). Grabisch and Dispot [3] classified fuzzy classification methods into the following four categories: (i) Methods based on fuzzy relations. (ii) Methods based on fuzzy pattern matching procedures.
0165-0114/93/$06.00 © 1993 Elsevier Science Publishers B.V. All rights reserved
H. Ishibuchi et al. / Fuzzy partition of pattern space
Fig. 3. A classification problem in [0, 1] × [0, 1].
Fig. 1. An example of the fuzzy partition by a simple fuzzy grid.
Fig. 2. An example of the fuzzy partition by the proposed method.
(iii) Methods based on fuzzy clustering procedures. (iv) Other methods. Fuzzy classification methods based on fuzzy if-then rules were classified into the first category. In this section, we briefly describe the fuzzy classification method based on fuzzy if-then rules proposed in Ishibuchi et al. [6, 7].
Let us assume that the pattern space is the unit square [0, 1] × [0, 1] for simplicity of notation. An example of a classification problem in the unit square is shown in Figure 3, where closed circles and open circles denote the patterns of Class 1 and Class 2, respectively. Suppose that m patterns $x_p = (x_{p1}, x_{p2})$, p = 1, 2, ..., m, are given as training patterns from M classes: Class 1 ($C_1$), Class 2 ($C_2$), ..., Class M ($C_M$). That is, the classification of each $x_p$, p = 1, 2, ..., m, is known as one of the M classes. Our problem is to generate fuzzy if-then rules which divide the pattern space into M disjoint decision areas. For this problem, fuzzy if-then rules of the following type are employed:
Rule $R_{ij}^K$: If $x_1$ is $A_i^K$ and $x_2$ is $A_j^K$ then $(x_1, x_2)$ belongs to $C_{ij}^K$ with $CF = CF_{ij}^K$,  (1)
where $R_{ij}^K$ is the label of the fuzzy if-then rule, $A_i^K$ and $A_j^K$ are fuzzy subsets on the unit interval [0, 1], $C_{ij}^K$ is the consequent (i.e., one of the M classes) and $CF_{ij}^K$ is the grade of certainty of the fuzzy if-then rule. In Figure 1, four fuzzy if-then rules (for example, $R_{23}^5$) are indicated in the corresponding fuzzy subspaces. We can use any type of membership function (e.g., triangular, trapezoidal, exponential) for $A_i^K$ and $A_j^K$ in the antecedent of (1). In this paper, symmetric triangular membership functions are employed. Let us assume that each axis of the pattern space is partitioned into K fuzzy subsets $\{A_1^K, A_2^K, \ldots, A_K^K\}$ where $A_i^K$ is defined by the symmetric triangular membership function:
$\mu_i^K(x) = \max\{1 - |x - a_i^K| / b^K,\ 0\}$, i = 1, 2, ..., K,  (2)
where
$a_i^K = (i - 1)/(K - 1)$, i = 1, 2, ..., K,  (3)
$b^K = 1/(K - 1)$.  (4)
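The membership function in (2)-(4) can be sketched directly in code. This is a minimal illustration under the stated definitions; the function name `triangular_membership` and the sample point 0.375 are assumptions chosen for the example, not part of the paper.

```python
def triangular_membership(x: float, i: int, k: int) -> float:
    """Grade of x in the fuzzy subset A_i^K, following (2)-(4)."""
    if k < 2 or not 1 <= i <= k:
        raise ValueError("need K >= 2 and 1 <= i <= K")
    center = (i - 1) / (k - 1)   # a_i^K in (3)
    spread = 1 / (k - 1)         # b^K in (4)
    return max(1.0 - abs(x - center) / spread, 0.0)


# K = 5 as in Figure 1: the centers are 0, 0.25, 0.5, 0.75 and 1.
grades = [triangular_membership(0.375, i, 5) for i in range(1, 6)]
```

The point 0.375 lies halfway between the centers of the second and third fuzzy subsets, so it receives grade 0.5 in both and grade 0 in the others. With this choice of centers and spread, the grades of any point in [0, 1] sum to 1.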
The fuzzy subset $A_i^K$ is defined by (2)-(4), where the center value is $a_i^K$ and the support is the open interval $(a_i^K - b^K,\ a_i^K + b^K)$. Figure 1 is the fuzzy partition corresponding to K = 5. The consequent $C_{ij}^K$ and the grade of certainty $CF_{ij}^K$ of the fuzzy if-then rule (1) are determined by the following procedure.
Procedure 1. Generation of fuzzy if-then rules.
(i) Calculate $\beta_{C_t}$ for t = 1, 2, ..., M as
$\beta_{C_t} = \sum_{x_p \in C_t} \mu_i^K(x_{p1}) \cdot \mu_j^K(x_{p2})$.  (5)
(ii) Find Class X (CX) such that
$\beta_{CX} = \max\{\beta_{C_1}, \beta_{C_2}, \ldots, \beta_{C_M}\}$.  (6)
If two or more classes take the maximum value in (6), then the fuzzy if-then rule corresponding to the fuzzy subspace $A_i^K \times A_j^K$ is not generated; otherwise $C_{ij}^K$ is determined as CX in (6).
(iii) If a single class takes the maximum value in (6), $CF_{ij}^K$ is determined as
$CF_{ij}^K = (\beta_{CX} - \beta) \Big/ \sum_{t=1}^{M} \beta_{C_t}$,  (7)
where
$\beta = \sum_{C_t \neq CX} \beta_{C_t} / (M - 1)$.  (8)
In this procedure, the consequent $C_{ij}^K$ is determined as the class which has the largest sum of $\mu_i^K(x_{p1}) \cdot \mu_j^K(x_{p2})$. The grade of certainty $CF_{ij}^K$ takes a value in the unit interval [0, 1]. If all the patterns in $A_i^K \times A_j^K$ are from Class X (CX), then $CF_{ij}^K = 1$. On the contrary, if there are patterns from other classes, $CF_{ij}^K < 1$. By Procedure 1, at most $K^2$ fuzzy if-then rules are generated from the training patterns in the pattern space [0, 1] × [0, 1]. Let us denote the set of the generated fuzzy if-then rules by $S_R$, that is,
$S_R = \{R_{ij}^K \mid i = 1, 2, \ldots, K;\ j = 1, 2, \ldots, K\}$.  (9)
In the classification phase, a new pattern is classified by the following procedure.
Procedure 2. Classification of a new pattern $x_p$.
(i) Calculate $\alpha_{C_t}$ for t = 1, 2, ..., M as
$\alpha_{C_t} = \max\{\mu_i^K(x_{p1}) \cdot \mu_j^K(x_{p2}) \cdot CF_{ij}^K \mid C_{ij}^K = C_t;\ R_{ij}^K \in S_R\}$.  (10)
(ii) Find Class X (CX) such that
$\alpha_{CX} = \max\{\alpha_{C_1}, \alpha_{C_2}, \ldots, \alpha_{C_M}\}$.  (11)
If two or more classes take the maximum value in (11), then $x_p$ cannot be classified; otherwise assign $x_p$ to Class X (CX) determined by (11).
In this procedure, the result of the fuzzy inference is the consequent of the fuzzy if-then rule which has the maximum product of $\mu_i^K(x_{p1}) \cdot \mu_j^K(x_{p2})$ and $CF_{ij}^K$. If there are no fuzzy if-then rules such that $\mu_i^K(x_{p1}) \cdot \mu_j^K(x_{p2}) > 0$ and $CF_{ij}^K > 0$ at $x_p$, the new pattern $x_p$ cannot be classified.
We applied Procedure 1 and Procedure 2 to the classification problem in Figure 3 using various values of K. The generated fuzzy if-then rules and the classification boundaries between the two classes are shown in Figures 4 and 5. In the left figures, the hatched, dotted and painted areas show the following:
Hatched area: The consequent of the generated fuzzy if-then rules in this area is Class 1.
Dotted area: The consequent of the generated fuzzy if-then rules in this area is Class 2.
Painted area: No fuzzy if-then rule is generated in this area.
In the right figures in Figure 5, painted areas show that new patterns cannot be classified because there are no fuzzy if-then rules in those areas.
Fig. 4. Simulation results for K = 2, 3, 4. (a) K = 2, (b) K = 3, (c) K = 4.
Fig. 5. Simulation results for K = 5, 6, 7. (a) K = 5, (b) K = 6, (c) K = 7.
From the simulation results in Figure 4, we can see that some patterns are misclassified when the fuzzy partition is coarse (i.e., K is small). On the other hand, from Figure 5, we can see that all the patterns are correctly classified when the fuzzy partition is fine (i.e., K is large). The value of K can be determined by the following simple procedure.
Procedure 3. Determination of K.
(i) Let K := 2. Set the values of $K_{MAX}$ and e, where $K_{MAX}$ and e are the upper bound of K and the desirable rate of correctly classified patterns, respectively.
(ii) Generate fuzzy if-then rules corresponding to the current value of K by Procedure 1.
(iii) Update $S_R$ by (9) and classify all the training patterns by Procedure 2. If the rate of correctly classified patterns is equal to or greater than the desirable rate e, then stop the procedure.
(iv) If K = $K_{MAX}$ then stop the procedure, else let K := K + 1 and go to (ii).
In the computer simulations of this paper, we specify $K_{MAX} = \infty$ and e = 100%. For the classification problem in Figure 3, this procedure terminated at K = 6 (see Figure 5). As is shown in Figure 5, one problem of a fine fuzzy partition is that many fuzzy if-then rules cannot be generated. As a result, some areas cannot be classified as any class. To cope with this difficulty, the concept of distributed fuzzy if-then rules [6, 7] was proposed, where all the fuzzy if-then rules corresponding to several fuzzy partitions are simultaneously employed in the fuzzy inference. In distributed fuzzy if-then rules, the set of fuzzy if-then rules $S_R$ employed in Procedure 2 is defined as
$S_R = \{R_{ij}^K \mid i = 1, 2, \ldots, K;\ j = 1, 2, \ldots, K;\ K = 2, 3, \ldots, L\}$,  (12)
where L is the number of fuzzy subsets on each axis in the finest fuzzy partition. (12) means that all the fuzzy if-then rules corresponding to K = 2, 3, ..., L are simultaneously employed in Procedure 2. The value of L can be determined in the same manner as in Procedure 3. In Figure 6, we show the simulation result with distributed fuzzy if-then rules, where all the fuzzy if-then rules corresponding to K = 2-7 in Figures 4 and 5 are simultaneously employed. From Figure 6, which has no painted area, we can see that any pattern in the pattern space can be classified as Class 1 or Class 2. One problem of the distributed fuzzy if-then rules is that the number of fuzzy if-then rules is enormous because several fuzzy partitions are simultaneously employed.
Fig. 6. Simulation result with distributed fuzzy if-then rules (the fuzzy if-then rules corresponding to K = 2-7 are simultaneously employed).
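Procedures 1 to 3 of this section can be sketched as follows for the two-dimensional case. This is a hedged illustration, not the authors' implementation: the function names (`mu`, `generate_rules`, `classify`, `determine_k`), the zero-based class indices, the finite default `k_max=10` (the paper uses an unbounded $K_{MAX}$) and the four toy patterns are all assumptions made for the example.

```python
from itertools import product


def mu(x, i, k):
    """Membership of x in A_i^K, the triangular function of (2)-(4)."""
    return max(1.0 - abs(x - (i - 1) / (k - 1)) * (k - 1), 0.0)


def generate_rules(patterns, labels, k, n_classes):
    """Procedure 1 on a K x K grid: {(i, j, k): (class index, CF)}."""
    rules = {}
    for i, j in product(range(1, k + 1), repeat=2):
        beta = [0.0] * n_classes
        for (x1, x2), c in zip(patterns, labels):
            beta[c] += mu(x1, i, k) * mu(x2, j, k)           # (5)
        best = max(beta)                                      # (6)
        if best <= 0.0 or beta.count(best) > 1:
            continue                                          # tie or empty: no rule
        beta_bar = (sum(beta) - best) / (n_classes - 1)       # (8)
        cf = (best - beta_bar) / sum(beta)                    # (7)
        rules[(i, j, k)] = (beta.index(best), cf)
    return rules


def classify(x, rules, n_classes):
    """Procedure 2: class index of the winning rule, or None."""
    alpha = [0.0] * n_classes
    for (i, j, k), (c, cf) in rules.items():
        alpha[c] = max(alpha[c], mu(x[0], i, k) * mu(x[1], j, k) * cf)  # (10)
    best = max(alpha)                                         # (11)
    if best <= 0.0 or alpha.count(best) > 1:
        return None                                           # cannot be classified
    return alpha.index(best)


def determine_k(patterns, labels, n_classes, k_max=10, target=1.0):
    """Procedure 3: increase K from 2 until the target rate or k_max."""
    for k in range(2, k_max + 1):
        rules = generate_rules(patterns, labels, k, n_classes)
        hits = sum(classify(x, rules, n_classes) == c
                   for x, c in zip(patterns, labels))
        if hits / len(patterns) >= target:
            break
    return k, rules


# Toy data (hypothetical, not from the paper): two well-separated classes.
patterns = [(0.1, 0.1), (0.2, 0.3), (0.9, 0.8), (0.8, 0.9)]
labels = [0, 0, 1, 1]
k, rules = determine_k(patterns, labels, n_classes=2)
```

Because each rule is keyed by (i, j, K), a distributed rule set in the sense of (12) could be obtained in this sketch by merging the dictionaries returned by `generate_rules` for K = 2, 3, ..., L and passing the merged dictionary to `classify`.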
3. Efficient fuzzy partition
3.1. Basic idea
In Procedure 1 in Section 2, the choice of an appropriate fuzzy partition (i.e., an appropriate value of K) is important and difficult. If the fuzzy partition is too coarse (i.e., K is too small), the classification power of the generated fuzzy if-then rules may be low (see Figure 4). On the other hand, if the fuzzy partition is too fine (i.e., K is too large), some areas cannot be classified as any class since many fuzzy if-then rules cannot be generated (see Figure 5). While distributed fuzzy if-then rules remedy this problem, another problem remains: the number of fuzzy if-then rules is enormous. Let us reconsider the two-class classification problem in Figure 3. From this figure, we can see that the left half of the pattern space needs a fine partition, but a fine partition is not appropriate for the right half. Therefore the size of each fuzzy subspace should be adjusted depending on the configuration of the training patterns. In order to partition the pattern space efficiently in this way, we propose the following procedure for the two-dimensional pattern space.
Basic procedure. Efficient fuzzy partition.
Step 1: Divide the pattern space into $2^2$ fuzzy subspaces.
Step 2: Choose one fuzzy subspace and subdivide it into $2^2$ fuzzy subspaces.
Step 3: Go to Step 2 if a stopping condition is not satisfied.
In the case of n-dimensional classification problems, each fuzzy subspace is subdivided into $2^n$ fuzzy subspaces. One important issue in this procedure is how to choose a fuzzy subspace to be subdivided in Step 2. We use the grade of certainty CF of each fuzzy if-then rule to choose an appropriate fuzzy subspace. As Procedure 1 in Section 2 shows, CF = 1 means that the corresponding fuzzy subspace includes only the training patterns from a single class. Therefore those fuzzy subspaces which correspond to the fuzzy if-then rules with CF = 1 need no further subdivision. On the other hand, CF < 1 means that the corresponding fuzzy subspace includes the training patterns from more than one class. Therefore those fuzzy subspaces with CF < 1 may need further subdivision. That is, CF can be viewed as the grade of uniformity of patterns in each fuzzy subspace. Therefore we choose the fuzzy subspace with the smallest value of CF.
3.2. Algorithm of fuzzy partition
The proposed method for obtaining an efficient fuzzy partition can be written as the following procedure for the pattern space [0, 1] × [0, 1].
Procedure 4. Efficient fuzzy partition.
Step 1: (i) Let J := 1. Set the values of $J_{MAX}$ and e, where $J_{MAX}$ and e are the maximum iteration number and the desirable rate of correctly classified patterns, respectively. (ii) Divide the pattern space into the four fuzzy subspaces: $A_1^2 \times A_1^2$, $A_1^2 \times A_2^2$, $A_2^2 \times A_1^2$ and $A_2^2 \times A_2^2$. (iii) Generate fuzzy if-then rules corresponding to the four fuzzy subspaces in (ii). Let $S_R := \{R_{11}^2, R_{12}^2, R_{21}^2, R_{22}^2\}$.
Step 2: (i) Let J := J + 1. (ii) Let the fuzzy if-then rule with the smallest grade of certainty CF in $S_R$ be $R_{ij}^K$ and the corresponding fuzzy subspace be $A_i^K \times A_j^K$. Remove $R_{ij}^K$ from $S_R$. (iii) Subdivide the fuzzy subspace $A_i^K \times A_j^K$ into the four fuzzy subspaces: $A_{2i-1}^{2K} \times A_{2j-1}^{2K}$, $A_{2i-1}^{2K} \times A_{2j}^{2K}$, $A_{2i}^{2K} \times A_{2j-1}^{2K}$ and $A_{2i}^{2K} \times A_{2j}^{2K}$. (iv) Generate fuzzy if-then rules corresponding to the four fuzzy subspaces in (iii). Add the generated fuzzy if-then rules to $S_R$.
Step 3: (i) Classify the training patterns by Procedure 2 in Section 2 with $S_R$. If the rate of correctly classified patterns is equal to or greater than the desirable rate e, then stop the procedure. (ii) If J = $J_{MAX}$ then stop the procedure, else go to Step 2.
In this procedure, $A_i^K$ is defined by the symmetric triangular membership function in (2)-(4). From (iii) in Step 2 and (2)-(4), we can see that the selected fuzzy subspace $A_i^K \times A_j^K$ is subdivided into four fuzzy subspaces defined by the fuzzy subsets $A_{2i-1}^{2K}$, $A_{2i}^{2K}$, $A_{2j-1}^{2K}$ and $A_{2j}^{2K}$ with the spread $b^{2K} = 1/(2K - 1)$. The total number of fuzzy if-then rules in the rule set $S_R$ is $4J - (J - 1) = 3J + 1$ after J iterations. In the case of classification problems in an n-dimensional pattern space, the number of fuzzy if-then rules in $S_R$ after J iterations is $J \cdot 2^n - (J - 1) = J(2^n - 1) + 1$, since the selected fuzzy subspace is subdivided into $2^n$ fuzzy subspaces.
We applied Procedure 4 with e = 100% and $J_{MAX} = \infty$ to the classification problem in Figure 3. In Figure 7, we show an intermediate simulation result after three iterations (i.e., J = 3). First, four fuzzy if-then rules were generated in Step 1 (see Figures 7(a) and 4(a)). Next, one fuzzy if-then rule was selected and replaced by four fuzzy if-then rules in Step 2 (see Figures 7(b) and 4(c)). This procedure was iterated (see Figure 7(c)). After three iterations, the 10 fuzzy if-then rules shown in Figure 7(a)-(c) were in the rule set $S_R$ and the pattern space was classified as shown in Figure 7(d). In Figure 8, we show the final result obtained by the proposed procedure. The procedure terminated at J = 11 and all the given patterns were correctly classified.
Fig. 7. Intermediate results after three iterations. (a) J = 1, (b) J = 2, (c) J = 3, (d) Boundary.
Fig. 8. Final result after eleven iterations.
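The index bookkeeping of Procedure 4 can be sketched as below. This is a simplified illustration under stated assumptions: it tracks only which subspaces are in the partition, it replaces the "smallest CF" selection of Step 2 (ii) by a placeholder (always the first stored subspace), and it assumes every subspace yields a rule. The names `initial_subspaces`, `subdivide`, `rule_count` and `simulate` are hypothetical.

```python
from itertools import product


def initial_subspaces(n=2):
    """Step 1: the 2^n subspaces of the coarsest partition (K = 2)."""
    return [(index, 2) for index in product((1, 2), repeat=n)]


def subdivide(index, k):
    """Step 2 (iii): on every axis, A_i^K becomes A_{2i-1}^{2K} and A_{2i}^{2K}."""
    halves = [(2 * i - 1, 2 * i) for i in index]
    return [(child, 2 * k) for child in product(*halves)]


def rule_count(j, n=2):
    """J * 2^n - (J - 1) = J(2^n - 1) + 1, assuming every subspace yields a rule."""
    return j * (2 ** n - 1) + 1


def simulate(j_max, n=2):
    """Run the bookkeeping for J = 1..j_max with a placeholder selection rule."""
    subspaces = initial_subspaces(n)
    for _ in range(2, j_max + 1):
        index, k = subspaces.pop(0)   # placeholder for "smallest CF"
        subspaces.extend(subdivide(index, k))
    return subspaces


children = subdivide((1, 2), 2)       # children of A_1^2 x A_2^2
after_three = len(simulate(3))        # size of the rule set at J = 3
```

Each iteration removes one subspace and adds $2^n$, which is where the count $J(2^n - 1) + 1$ comes from. For the four-attribute iris data, `rule_count(48, 4)` and `rule_count(15, 4)` reproduce the counts 721 and 226 reported in Section 4.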
3.3. Two variations
In Procedure 4, the grade of certainty CF was employed as a heuristic criterion for choosing a fuzzy subspace to be subdivided. By using other criteria, many variations can be considered. In this subsection, we show two intuitively acceptable criteria.
From Procedure 2 in Section 2, we can see that each pattern is classified by a single fuzzy if-then rule. Therefore, when a pattern is misclassified, we can detect the fuzzy if-then rule which is responsible for the misclassification. Let $NC_{ij}^K$ and $NM_{ij}^K$ be the numbers of patterns correctly classified and misclassified by the fuzzy if-then rule $R_{ij}^K$, respectively. If all the given m patterns $x_p$, p = 1, 2, ..., m, are classified by the rule set $S_R$, the sum of $(NC_{ij}^K + NM_{ij}^K)$ over $S_R$ is m. That is,
$\sum_{R_{ij}^K \in S_R} (NC_{ij}^K + NM_{ij}^K) = m$.  (13)
From these two numbers, $NC_{ij}^K$ and $NM_{ij}^K$, two intuitively acceptable criteria can be derived. If our aim is to correctly classify all the given patterns with the minimum number of fuzzy if-then rules, the number of patterns misclassified by each fuzzy if-then rule is the most appropriate information for choosing a fuzzy subspace to be subdivided. Therefore we can obtain the following NM criterion (NM is the abbreviation of the number of misclassified patterns).
NM criterion. Choose the fuzzy if-then rule $R_{ij}^K$ such that $NM_{ij}^K$ is the largest in the rule set $S_R$.
The rate of misclassified patterns is also appropriate information for choosing a fuzzy subspace. Therefore we can also use the following RM criterion (RM is the abbreviation of the rate of misclassified patterns).
RM criterion. Choose the fuzzy if-then rule $R_{ij}^K$ such that the following index is the largest in the rule set $S_R$:
$RM_{ij}^K = NM_{ij}^K / (NC_{ij}^K + NM_{ij}^K)$.  (14)
The criterion employed in Procedure 4 is based on the grade of certainty CF of each fuzzy if-then rule. This criterion can be written as follows.
CF criterion. Choose the fuzzy if-then rule $R_{ij}^K$ such that $CF_{ij}^K$ is the smallest in the rule set $S_R$.
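The three selection criteria can be compared on a small example. The sketch below is illustrative only: the `RuleStats` record, the `select_rule` function and the three sample rules are hypothetical, and treating RM as 0 for a rule that classifies no pattern is an assumption, since (14) is undefined in that case.

```python
from dataclasses import dataclass


@dataclass
class RuleStats:
    label: str   # rule label, e.g. "R_11^2"
    cf: float    # grade of certainty CF
    nc: int      # number of correctly classified patterns (NC)
    nm: int      # number of misclassified patterns (NM)


def rm_index(rule: RuleStats) -> float:
    """RM = NM / (NC + NM) as in (14); taken as 0 when the rule fires on no pattern."""
    total = rule.nc + rule.nm
    return rule.nm / total if total else 0.0


def select_rule(rules, criterion="CF"):
    """Pick the rule whose subspace would be subdivided next."""
    if criterion == "CF":
        return min(rules, key=lambda r: r.cf)    # smallest CF
    if criterion == "NM":
        return max(rules, key=lambda r: r.nm)    # largest NM
    if criterion == "RM":
        return max(rules, key=rm_index)          # largest RM
    raise ValueError(f"unknown criterion: {criterion}")


# Hypothetical statistics for three rules.
rules = [
    RuleStats("a", cf=0.9, nc=20, nm=4),
    RuleStats("b", cf=0.4, nc=6, nm=1),
    RuleStats("c", cf=0.7, nc=2, nm=2),
]
picks = {c: select_rule(rules, c).label for c in ("CF", "NM", "RM")}
```

In this example the three criteria select three different rules, which shows that they are not interchangeable. Ties are resolved in favor of the first rule encountered, a choice the paper does not specify.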
4. Simulation results for iris data
We applied the three variations of the proposed method to the iris data in Fisher [2]. The two fuzzy classification methods mentioned in Section 2 were also applied to the same data: one is based on the fuzzy if-then rules in (9) generated by a simple fuzzy grid, and the other uses the distributed fuzzy if-then rules in (12). The iris data consist of 150 samples from three classes with four attributes. In Table 1, we show the number of fuzzy if-then rules generated for correctly classifying all the given data (150 samples). Table 1 was obtained by applying each method to the whole iris data with the same stopping condition: e = 100%, $K_{MAX} = \infty$ and $J_{MAX} = \infty$ (i.e., all the data should be correctly classified). From Table 1, we can see that the proposed method (right three columns) generates far fewer fuzzy if-then rules than the two fuzzy classification methods mentioned in Section 2 (left two columns). We can also see that the proposed method with the NM or RM criterion generates fewer fuzzy if-then rules than that with the CF criterion. This is because the NM and RM criteria directly use the information about the number of misclassified patterns.
Table 1. The number of fuzzy if-then rules generated for correctly classifying all the 150 samples
                   Simple        Distributed fuzzy    Proposed method
                   fuzzy grid    if-then rules        CF          NM          RM
                   (K = 9)       (K = 2-13)           (J = 48)    (J = 15)    (J = 15)
Number of rules    6561          89270                721         226         226
In order to evaluate the performance of each method for test data, we iterated the following random subsampling procedure 20 times for each method.
(i) Randomly select N/3 samples from each of the three classes. Use the selected N samples from the three classes and the other 150 - N samples as the training data and the test data, respectively.
(ii) Generate fuzzy if-then rules from the training data. The stopping condition of each method is e = 100%, $K_{MAX} = \infty$ and $J_{MAX} = \infty$ (i.e., all the training data are correctly classified).
(iii) Classify the 150 - N samples in the test data using the fuzzy if-then rules generated in (ii).
In this simulation, we use six different values of N: N = 9, 15, 21, 30, 60, 90. The simulation results are summarized in Tables 2 and 3. Table 2 shows the average rate of correctly classified samples in the test data over 20 iterations. Table 3 shows the average number of fuzzy if-then rules generated by each method. From these two tables, we can see that the proposed method substantially reduced the number of fuzzy if-then rules at the cost of a slight deterioration of the performance.
Table 2. Average rate (%) of correctly classified samples by each method with the same stopping condition
Number of           Simple        Distributed fuzzy    Proposed method
training samples    fuzzy grid    if-then rules        CF      NM      RM
 9                  88.6          90.7                 87.3    86.5    86.6
15                  91.0          92.3                 88.3    88.3    88.6
21                  91.3          92.4                 91.1    89.8    89.6
30                  92.7          93.8                 91.8    93.0    93.3
60                  93.5          95.4                 93.8    93.9    94.1
90                  94.5          95.8                 95.1    94.8    94.6
Table 3. Average number of fuzzy if-then rules
Number of           Simple        Distributed fuzzy    Proposed method
training samples    fuzzy grid    if-then rules        CF      NM      RM
 9                  181           563                  166     48      48
15                  338           14405                222     58      60
21                  455           8328                 253     71      72
30                  1727          20512                307     83      87
60                  2452          63069                449     105     107
90                  3440          140498               528     150     150
While the proposed method with the NM or RM criterion cannot be continued after all the training data are correctly classified (i.e., $NM_{ij}^K = RM_{ij}^K = 0$ for all i, j, K), the method with the CF criterion can be continued. Therefore we applied the proposed method with the CF criterion to the iris data and continued it until J = 100 (i.e., 100 iterations). The simulation results are summarized in Table 4. Table 4 shows the average rate of correctly classified samples in the test data together with the number of iterations. Since the iris data have four attributes, the number of fuzzy if-then rules after J iterations is $J(2^4 - 1) + 1$: 301 for J = 20, 601 for J = 40, 901 for J = 60, 1201 for J = 80 and 1501 for J = 100. From Table 4, we can see that the simulation results for the proposed method shown in Table 2 were improved by increasing the number of iterations. From the comparison of Table 4 with Table 2, we can see that the proposed method after 80 or 100 iterations in Table 4 outperformed the fuzzy classification method based on a simple fuzzy grid in Table 2. We can also see that the proposed method after 80 or 100 iterations has almost the same performance as the distributed fuzzy if-then rules in Table 2. Furthermore, it should be noted that the number of fuzzy if-then rules of the proposed method (1201 for J = 80 and 1501 for J = 100) is much smaller than that of the distributed fuzzy if-then rules shown in Table 3. From these results, we can conclude that the proposed method is superior to the fuzzy classification methods in [6, 7] for the iris data.
Table 4. Average rate (%) of correctly classified samples by the proposed method with the CF criterion
Number of           The number of iterations: J
training samples    J = 20    J = 40    J = 60    J = 80    J = 100
 9                  89.8      90.9      90.4      90.2      89.1
15                  90.0      90.9      90.9      91.3      91.2
21                  91.0      92.8      92.7      92.8      92.8
30                  91.0      94.3      94.0      94.3      94.4
60                  91.9      94.9      95.6      95.3      95.3
90                  92.3      95.3      96.0      96.5      96.4
In order to compare the proposed method with other classification methods, we estimated the error rate of the proposed method for the iris data by the 2-fold cross-validation method and the leaving-one-out method. In the 2-fold cross-validation, the 150 samples were randomly divided into two subsets of the same size (75 samples each). One subset was used as the training data and the other subset as the test data (the other combination, with the roles of the two subsets exchanged, was also examined). We applied the 2-fold cross-validation method to the iris data 20 times by randomly dividing the 150 samples. On the other hand, in the leaving-one-out method, a single sample is the test data and the other 149 samples are the training data. All the selections of the test data were examined, i.e., 150 selections in this case. The simulation results are summarized in Table 5. The best results of the estimated error rates by the 2-fold cross-validation ($ERR_{2CV}$) and by the leaving-one-out method ($ERR_{LV1}$) were as follows:
$ERR_{2CV} = 4.10$, $ERR_{LV1} = 3.33$.
Table 5. Estimated error rate (%) of the proposed method with the CF criterion by cross-validation techniques. $ERR_{2CV}$: estimated error rate by the 2-fold cross-validation; $ERR_{LV1}$: estimated error rate by the leaving-one-out method
J       $ERR_{2CV}$    $ERR_{LV1}$
 10     13.70          14.00
 20      8.93           8.00
 30      6.83           8.00
 40      5.30           8.00
 50      4.87           4.00
 60      4.67           4.00
 70      4.67           4.00
 80      4.40           4.00
 90      4.10           3.33
100      4.10           3.33
In Grabisch and Dispot [3], $ERR_{2CV}$ and $ERR_{LV1}$ were reported for nine classification methods based on fuzzy set theory: five methods based on fuzzy pattern matching procedures, three methods based on fuzzy clustering procedures and the fuzzy k-nearest neighbor algorithm (for the details, see [3, 4]). The values of $ERR_{2CV}$ and $ERR_{LV1}$ reported in [3] are as follows:
$ERR_{2CV}$: 6.7-11.3 (five methods based on fuzzy pattern matching), 8.0-10.0 (three methods based on fuzzy clustering), 4.0 (fuzzy k-nearest neighbor algorithm),
$ERR_{LV1}$: 3.3-8.0 (five methods based on fuzzy pattern matching), 4.7-6.7 (three methods based on fuzzy clustering), 3.3 (fuzzy k-nearest neighbor algorithm).
From the comparison of these results with our result ($ERR_{2CV} = 4.10$ and $ERR_{LV1} = 3.33$), we can conclude that the proposed method has better performance than most of the other fuzzy classification methods.
5. Conclusion
This paper proposed an efficient fuzzy partition of a pattern space for the fuzzy classification method based on fuzzy if-then rules. The basic idea of the proposed method was the sequential subdivision of fuzzy subspaces. Therefore a pattern space was divided into fuzzy subspaces of different sizes. The size of each fuzzy subspace was automatically determined from numerical data. The proposed method was applied to the iris data in Fisher [2], and the following can be stated from the simulation results.
(i) The proposed fuzzy partition has at least the same performance as, and requires far fewer fuzzy if-then rules than, the fuzzy partitions by simple fuzzy grids employed in [6, 7].
(ii) The fuzzy classification method based on fuzzy if-then rules generated by the proposed fuzzy partition has better performance than most of the other fuzzy classification methods based on fuzzy pattern matching, fuzzy clustering and fuzzy k-nearest neighbor examined in [3].
The second result motivates further study of the fuzzy classification method based on fuzzy if-then rules. The proposed method can be improved in many respects. In particular, the adjustment of the membership functions and of the grade of certainty of each fuzzy if-then rule is expected to have a great effect on the performance. Such a study is left for future research.
References
[1] D.G. Burkhardt and P.P. Bonissone, Automated fuzzy knowledge base generation and tuning, Proc. of FUZZ-IEEE'92 (San Diego, CA, March 8-12, 1992) 179-188.
[2] R.A. Fisher, The use of multiple measurements in taxonomic problems, Annals of Eugenics 7 (1936) 179-188.
[3] M. Grabisch and F. Dispot, A comparison of some methods of fuzzy classification on real data, Proc. of IIZUKA'92 (Iizuka, Japan, July 17-22, 1992) 659-662.
[4] M. Grabisch and M. Sugeno, Multi-attribute classification using fuzzy integral, Proc. of FUZZ-IEEE'92 (San Diego, CA, March 8-12, 1992) 47-54.
[5] I. Hayashi, H. Nomura, H. Yamasaki and N. Wakami, Construction of fuzzy inference rules by NDF and NDFL, Int. J. of Appr. Reasoning 6 (1992) 241-266.
[6] H. Ishibuchi, K. Nozaki and H. Tanaka, Pattern classification by distributed representation of fuzzy rules, Proc. of FUZZ-IEEE'92 (San Diego, CA, March 8-12, 1992) 643-650.
[7] H. Ishibuchi, K. Nozaki and H. Tanaka, Distributed representation of fuzzy rules and its application to pattern classification, Fuzzy Sets and Systems 52 (1992) 21-32.
[8] J.S.R. Jang, Fuzzy controller design without domain experts, Proc. of FUZZ-IEEE'92 (San Diego, CA, March 8-12, 1992) 289-297.
[9] C.C. Lee, Fuzzy logic in control systems: Fuzzy logic controller, Part I and Part II, IEEE Trans. on Systems, Man and Cybernetics SMC-20 (1990) 404-435.
[10] W. Pedrycz, Fuzzy sets in pattern recognition: Methodology and methods, Pattern Recognition 23 (1990) 121-146.
[11] M. Sugeno, An introductory survey of fuzzy control, Inf. Sci. 36 (1985) 59-83.
[12] T. Takagi and M. Sugeno, Fuzzy identification of systems and its applications to modelling and control, IEEE Trans. on Systems, Man and Cybernetics SMC-15 (1985) 116-132.
[13] L.X. Wang and J.M. Mendel, Generating fuzzy rules from numerical data, with applications, USC-SIPI Report No. 169, University of Southern California (1991).