Pattern Recognition Letters 14 (1993) 545-552 North-Holland
July 1993
PATREC 1075
The Fuzzy C Quadric Shell clustering algorithm and the detection of second-degree curves R a g h u K r i s h n a p u r a m , H i c h e m Fr i gu i a n d Olfa N a s r a o u i I)~7~artmctll o/E/cclrica/and Computer En~,iJteering, ~#tiversitv 01' Missouri. Uohmlhia, MO 652ll, (,%1
l,?,ecei\ed 9 March 1992 Revised g October 1992
Krishnapurum, R.. H. Frigui and O. Nasraoui, The Fuzzy C Quadric Shell clustering algorithm and the detection of second-degree curves, Pattern Recognition Letters 14 (1993)545 552. lhis paper introduces a new fuzzy clustering algorithm called the Fuzzy C Quadric Shells algorithm which is expressly designed to seek clusters that can be described by segments of second-degree curves, or more generally, by segments of shells ol" hyperq uadrics. k~'v;~or~L~ Fuzzy clustering, detection of second degree curves, boundary detection, shell clustering, curxe and surface fitting.
1. Introduction
Until recently, it has been difficult to detect clusters that can be described by shell-like subspaces, i.e, clusters that are not 'filled' but are hollow. Dave's (1990, 1991) Fuzzy C Shells (FCS) algorithm and the Fuzzy Adaptice C-Shells (FACS) algorithm have proven successful in detecting spherical and ellipsoidal shell clusters. However, the FCS and the FACS algorithms are implementationally complex since they involve the use of Newton's method to solve coupled nonlinear equations for the shell (prototype) parameters. Moreover, the performance of the FACS algorithm is not always good for partial curves. Krishnapuram et al. (1992) Uorre,xpoml~,;u'e :o Raghu Krishnapuram. Department of Electrical and Computer Engineering. University of Missouri, Columbia, MO 65211, L!SA. Email: raghu(a, sunl.ecc. missouri.edu 0167-8655/93/$06.00
proposed the Fuzzy C Spherical Shells (FCSS) algorithm for clustering hyperspherical clusters which avoids these problems. In this paper, we propose a new fuzzy clustering algorithm that generalizes the FCS, FCSS, and the FACS algorithms. We call this algorithm the Fuzzy C Quadric Shells (FCQS) algorithm. It uses an objective function based on a new distance measure and seeks clusters which can be described by segments of second-degree curves, or more generally by segments of shells of hyperquadrics. We also propose a method to determine the optimum number of clusters C, when this is not known. This method involves minimizing a validity (performance) measure called the shell thickness. One major advantage of our FCQS algorithm is that it is able to partition a composite mixture of different types of hyperquadric shells, whereas previous cluster-based and non-clusterbased algorithms apply to a specific type of hyperquadric.
,c, 1993 - - Elsevier Science Publishers B.V. All rights reserved
545
Volume 14, Number 7
PATTERN RECOGNITION LETTERS
2. The Fuzzy C Quadric Shells (FCQS) algorithm
Equation (3) can be rewritten as 2
We first present the two-dimensional case. We then generalize the algorithm to the n-dimensional case. Let xj = [xjl, xj2] be a point in the 2-D feature space. In the two-dimensional case, we assume that each cluster resembles a second-degree curve. Therefore, the prototypes fli consist of six parameters [ail ,ai2 . . . . . ai6 ] which define the equation of the curve. We define the distance from x, to a prototype fli a s : d~ij .= d~(xj, t~i) = (ailx21+ai2x)2 + ~/2 ai3xjlxj2 + ai4 xjl + ai5 xj2 + ai6) 2.
c
JQ(B,U)= E
N
E (PO) mdQij' 2
(2)
i=l j=l
where B = (ill ..... tic), C is the number of clusters, N is the total number of feature vectors and U = [f16] is the C × N fuzzy C-partition matrix satisfying the following conditions.
,Uij • [0,1]
for all i and j,
c
,uij=l i=l
for all j ,
and
N
O< ~ /uij < N
for alli.
j=l
In order to minimize the objective function in (2), we rewrite the distance in (1) as
d~ij=d~(xj,fli)=(xTAixj+xf oi+bi)2
(3)
where
[ " A i ~- ai3/V~ and 546
b i = ai6.
ai2
J'
o[o,41 ai5 J
T
dQij = Pi Mj Pi,
(5a)
where the Pi represent the parameters of the prototypes of the clusters, and are given by
P~=[P~ [P~2], P~=[ai,,ai2,ai3], T
and
Pi2 = [ai4, ais, ai6]. The Mj in (5a) are given by
(5b)
Q, "J], ~=[Rj sjJ
(5c)
where
qj=qjq~,
(1)
The subscript Q in the above equation stands for 'quadric'. The right-hand side of (1), when set to zero, also represents the equation of the seconddegree curve which the prototype represents. The coefficient of the xjlx;2 term in (1) is assumed to be 1/2 ai3 without loss o f generality. This results in a simpler notation for a constraint that we will use later. We may now minimize the following objective function which is based on this distance measure.
July 1993
Sj=sjs T,
and
Rj=sjq:, (6a)
with q f = [xjl,Xjz, 2 2 lf2XjlXj2],
and
= [Xjl,Xj2, 1]. (6b)
Using (5) and (6), (2) can be written as c
JQ(B) = E
N
E (Vij)"p~MjPi •
(7)
i=1 j=l
JQ(B) is homogeneous with respect to Pi. Therefore, we need to constrain the problem in order to avoid the trivial solution. Some o f the possibilities are: (i) Ilpi[[ 2 = 1,
(ii) b i = 1,
and
(iii) Tr(aiA~) = I]Pilll2 = 1. Constraint (ii) assumes that the curve does not pass through the origin. The last constraint has the advantage that the resulting distance measure is invariant to rigid transformations of the prototype (Faugeras and Hebert (1986)). However, if one is simply interested in clustering the data and not interested in obtaining invariant parameters of the clusters, one may use constraint (i). We found that constraint (iii) works best in practice. While minimizing (7), we may assume that the vectors Pi are independent of each other. Hence, the objective is to minimize N
JQ(fli) = E (ltij)mpTiMjpi j=l (4)
subject to IlPill]2 = 1.
(8)
Volume 14, Number 7
PATTERN RECOGNITION LETTERS
If we define
given by (3), where Ai is an n × n symmetric matrix given by
N j=l
July 1993
j=l
I
Qt'f at(n+.11/I/2
(9a)
H,. = S (.o)"s:., ,/= l
a~(,+i)../)~ ai2
G(2n I~/1/'2
Ai = a,:/l/:2 |
and
ai(2:~ t~/l,"2
IV
w,=E (¢qj)mMj= j=l
I GiF'
Hi J'
(14a) and oi is an n × 1 vector given by
as (10)
Setting the gradient o f JQ(fli, 2) with respect to Pi equal to zero yields
14~Pi = ~,Pil,
and
o,. =
[ ai(r.+ 1) 1 :
Lai(r~n) A
,
and
bi
= ai(r+n+
1),
where : = n ( n + l ) / 2 .
(14b)
This distance measure can again be written as in (5a), if the Pi are given by
d:
(If)
Pi2 J
tlin
(9b)
then using a Lagrange multiplier we can recast (8)
Jo(fli, A)=P~WiPi-2(llpilll2-1).
a,,:V'~
j Pi2], T
0 j"
P~ : [ail,ai: ..... ai:], pV
and
i2 = [at(r+l), "", ai(r+n4 1)]"
Equation (1 I) can be solved for Pit and Pi2. The solution is given by
Similarly the qj and s; vectors in (6b) need to be modified as:
T
Pa = eigenvector of (Fi - G f H i -I Gi) associated with the smallest eigenvalue,
2 2
2
q; = IX;l, x)2 . . . . . x;., 1 ~ x;~ x;2 . . . . ,
~x;~xj:
(12a)
.....
~2 xjc._ ~lx;A,
(15a)
and
Pi2 = - n i
tGiPil •
(12b)
It is to be noted that H~ will always have an inverse as long as the number of feature vectors, N, is greater than the dimensionality of the feature vector, n, provided the feature vectors are all not coplanar. Minimization with respect to the Pij can be done as in the case of the fuzzy C-means (Bezdek (1981)). It can be shown that the memberships will be updated according to
t P,k =
("
dQik 2~(m-l)
O, i ~ l k S Pit = 1
if lt:~0 '
(13)
if Ik#=O
where I k = l i t l <<.i<~C, d~ik=O}. The Fuzzy C Quadric Shells algorithm can be extended to the n-dimensional case very easily. In this case, the distance measure is still in the form
ST ., =
[Xjl,X: .... ,Xj., 11.
(lSb)
3. The modified Fuzzy C Quadric Shells algorithm The distance d~(xj, Bi) defined by (1) is highly nonlinear in nature. It is easy to show through simple examples that this distance is sensitive to the placement of xj with respect to the prototype fli. This does not cause problems if the clusters are reasonably well separated quadric shells. However, if the clusters criss-cross each other too much or if there is a lot of noise, the resulting estimates o f the parameters of fli and the memberships/6j can be significantly influenced by outliers. To alleviate this problem, one may use the shortest (perpendicular) distance (denoted by @v) between the point xj and the shell fli given by
d2ij = rain ]lx:-zll 2 such that ( z T A i z + z T o i + b i ) = 0,
(16) 547
Volume 14, Number 7
P A T T E R N R E C O G N I T I O N LETTERS
where z is a point lying on the quadric curve describing cluster fli. Using a Lagrange multiplier 2, (16) reduces to minimizing (I[Xj - ZI[ 2 _ X (zT A i Z + Z T Oi + hi))
with respect to z and 2. This yields 2(x] - z) + 2(2Ai z + oi) = O,
(17)
z~rAi z + zTv~ + bi = 0.
(18)
and Equation (17) can be solved for z as Z = ~(I-- ~Ai)-l(xoi
+ 2xj).
(19)
Substituting (19) in (18) yields an equation in 2 which is a quartic (fourth-degree) equation in the 2-D case, and has at most four real roots 2k, 1 ~< k ~<4. (Solving the quartic equation is considerably simpler if the cluster prototype 13i and the point x] are rotated so that the matrix A i becomes diagonal. This does not affect the distance computations.) Solving for the four roots in the 2-D case is straightforward if one uses the standard solution found in mathematical tables. The resulting expressions are rather long and cumbersome, involving nested square roots, and hence they are not presented here. For each real root 2k so computed, we calculate the corresponding zk using (19). Then, we compute d2~j using
d2pij = rnikn I[x; - z~ll 2
(20)
One can formulate the FCQS algorithm using
dZij as the underlying distance measure. In this case, the objective function to be minimized becomes c
Jp(L, U) = ~ i:l
N
g (l~,2)md2i2 •
(21)
j=l
Minimizing this function with respect to U yields an equation for the/lik which is identical to (13) except that dQ is changed to dp. However, minimizing (21) with respect to the parameters Pi results in coupled nonlinear equations with no closed-form solution. To overcome this problem, we may assume that we can obtain approximately the same solution by using (12), which will be true if all the feature points lie reasonably close to the hyperquadric shells. This assumption leads to the modified FCQS algorithm. It is to be noted that (20) has 548
July 1993
a closed-form solution only in the 2-D case. In higher dimensions, solving for d2ij is not trivial. In practice, we found that in the 2-D case the modified FCQS algorithm converges much faster than the original version. This may be attributed to the fact that the membership assignment based on the perpendicular distance is more reasonable.
The Modified Fuzzy C Quadric Shells (MFCQS) algorithm Fix the number of clusters C; fix m, i < r n < ~ ; Set iteration counter l = 1; Initialize the fuzzy C-partition U (°) using the FCSS algorithm; Repeat Calculate b](/), G~n and H y ) for each cluster fli using (6a), (9) and (15); Compute p~n for each cluster fli using (12); Update U (t) using dp instead of dQ in (13); Increment l; Until ([IU(/ 1)_ U(/)II
4. Determination of the optimum number of clusters The algorithms discussed in Sections 2 and 3 assume that the number of clusters C is known. This is indeed the case in many pattern recognition applications and some computer vision applications. When the number of clusters in unknown, one method to determine the optimal number of clusters is to perform clustering for a range of C values, and pick the C values for which a suitable performance measure is minimized (or maximized). For the FCQS algorithm, we define a performance (or cluster validity) measure called the total fuzzy shell thickness as follows. c
TQ(C)= E
N
E (JAij) ,ndpij, 2
i I j=l
(22)
which is also the objective function in (21). This measure is similar to the one proposed by Dave and Bhaswan (1991). To find the optimum number of clusters when the FCQS algorithm is used, one can start with C = 1, and keep incrementing C while calculating TQ(C) after each run of the FCQS algorithm, and stop as soon as a knee point or a local
Volume 14, Number 7
PATTERN RECOGNITION [.ETTERS
July 1993
oO o oOOoo
.oooOOoaoa._~i~•mmincus_
0 O0 0 0
00 0 •
I
I
m
0
g '0
•
oo
/-1
%
•
o
o 0
04JO•
••'~m
0
•0
-~
• mid•
"o_
O"L. O ~ •
%~
%
oIL,
,I?
*,b
NIOO0
0
"~
oO
O000OO " O .
",,
• ~ ' 0 - ~
0 0
ooe,• •~
, •
• I
gO °°°°°°
(b)
:1
- I% • •
,
•
•
q q
q
,
•
(a)
00
•
~ o •
v o g~ ooo ~ , o . .
%,,..•..'¢'
gOO •
g
;l •.
• m I
~
0
$
t
/
_,," /i
; I
."
O00 • •00 O
00001:~'0~00 0
.o0o.O%.."
".
•
;
" ••
" o...~®~.. ,.
I•
_o
." •:
• ••'11
••• (c)
•
V
o
1
ooO,, /
| |
1 (d)
Figure I. Examples of clustering using the modified fuzzy C Quadric Shells algori[hm: (a) three overlapping circles, (b) an ellipse enclosed b3 two overlapping circles. (c) three parabolas in different orientations, and {d) a mixture of" three types of quadrics: a circle, an ellipse and a parabola.
1()s-
[ ~ [. . . . .
'~. ~ -'~
*,~
i0~_
ExampleI Example2
1""-'°.... t-x:~nvlc;
~
~
l'xamplc4
m i n i m u m in the curve o f T~)(C) is f o u n d (or C reaches C , , ~ ) . This u n s u p e r v i s e d a l g o r i t h m is summ a r i z e d below.
The Unsupervised Fuzzy C Quadric Shells a l g o r i t h m
[() I - - 7 - - T - - T
~
I
2
3
4-
'1"
r
¥
5
6
7
n[ll~qbOl Ol
dusters
}:igure 2. PIo! of file total fuzzy thickness vs, the number o[" c]uste~s.
Set C : 1; fix m, 1 < m < oo; local_rain or k n e e p o i n t = false; W h i l e C~< C,m,x and local min o1" k n e e p o i n t = false do P e r f o r m the F C Q S a l g o r i t h m with the n u m b e r o f clusters = C; C a l c u l a t e T o ( C ) as given by (22); i f T o ( C - 1) is a local m i n i m u m or a knee point Then local min or knee__point : true; C_optima/= C - 1; 549
Volume 14, Number 7
PATTERN RECOGNITION LETTERS
Else C=C+I; End If End While
5. Experimental results In this section, we present only results of twodimensional data sets, even though the algorithms presented are applicable to feature spaces of any dimension. For all the examples shown in this paper, the Fuzzy C-Means algorithm was first applied with the fuzzifier m = 2 for five iterations. This was followed by the application of the FCSS algorithm with m = 2. After the FCSS algorithm converged, the MFCQS algorithm was applied with m = 2. Figure 1 shows some examples (Examples 1 through 4) that are typical of boundary detection problems. The data sets in images of size 200 x 200 were artificially generated, and had between 50 and 200 points. Uniformly distributed noise with an interval of 3 to 5 was added to the feature point locations so that they do not always lie on ideal quadric curves. In all cases the clusters criss-cross one another, and conventional methods cannot separate them. The plot of the total fuzzy shell thickness vs. the number of clusters for these four examples is shown in Figure 2. In every case, the knee point or the local minimum is clearly defined, and picking the o p t i m u m C is quite simple. The MFCQS algorithm clusters all these data sets successfully, and the results are good. Figure 3 shows several situations (Examples 5 through 8) that are more typical of pattern recognition problems. These data sets in images o f size 200 x 200 were also artificially generated, and each example has between 200 and 350 points. Uniformly distributed noise with an interval of 10 to 15 was added to the feature point locations so that they do not lie on ideal quadric curves. These examples show that the MFCQS algorithm is effective even when the clusters are very different in size and when only partial curves are present. The plot of the total fuzzy shell thickness against the number o f clusters is shown in Figure 4. Here again, it is very easy to pick the knee point and the o p t i m u m number of clusters. 550
July 1993
The M F C Q S algorithm typically converges in less than 20 iterations. The C P U time required on a Sun 4 workstation to run the MFCQS algorithm was typically on the order of 15 s. This is very reasonable considering the complexity of the problems. An interesting phenomenon m a y be observed in Figures 3(b) and 3(d). In these figures, some points which should belong to the left cluster are classified as belonging to the right cluster. This happens because the prototypes are assumed to be complete curves. Therefore, when some of the clusters represent only partial curves (or surfaces), there may be points in the other clusters which may be closer to the completed prototypes of the partial clusters. In most cases, this problem can be overcome using the heuristically modified distance measure dZij given below.
d2ij
= (1 -
2 ,~)dpi2 j + g.dEij,
where d~gj is the Euclidean distance between point xj and the center ci of cluster Bi, and e is a small number chosen such that the second term is not significant. The cluster center may be updated along with the parameters of the cluster using
ci= ~ (¢tij) m j=l
j l
(13ij)m
for
l~i<~C.
The prototype parameters are updated as before. If the modified distance dZMo is used after the MFCQS algorithm converges, the updates typically converge in a couple of iterations. Using this procedure, we were able to eliminate the misclassifications shown in Figures 3 (b) and 3 (d). The value of e chosen was 10 4.
6. Conclusions In this paper, we introduced a new fuzzy clustering algorithm called the Fuzzy C Quadric Shells (FCQS) algorithm. This algorithm is specifically designed to seek clusters that can be described by segments of second-degree curves, or more generally by segments of shells of hyperquadrics. This algorithm can potentially lead to a more general class of algorithms that deal with shells of more complex types. The FCQS algorithm can be used to fit mixtures of all types of quadric surfaces such
Volume
14, N u m b e r
7
PATTERN
RECOGNITION
LETTERS
.t÷
,'1",,.~
+
July
el
1993
• loq~ II
.~a~
(b)
(a)
eeOO• o •
'~
4-
.! %
(c) Figure
3.
(d)
Examples of unsupervised modified fuzzy quadric shell clustering: (a) two partial circular clusters, (b) two partial elliptical clusters, ( c ) a parabolic cluster and an elliptic cluster, and (d) two crossing elliptical c l u s t e r s ,
5(XX)~)
._~
Example 5
x5 .%
41){X)~)
#
3(XX)O
N
- - e- ...... •t . . . . -'-"~--"
\
2000~)
Example 6 Example 7 Example 8
-
I000()
0 0
1
2
3
4
5
6
7
number of clusters Figure
4.
Plot of the total fuzzy shell thickness against the number of clusters.
as spheres, ellipsoids, paraboloids, cones, and cylinders to range data. The examples shown in Section 5 of clustering in the two-dimensional case illustrate the superior performance of the proposed algorithm. The hard version of the same algorithm does not perform as well when the clusters are highly entangled, which shows the benefits of the fuzzy approach. The proposed algorithm also has several advantages over the generalized Hough transform (GHT) methods that have been traditionally used to detect shapes of known descriptions. One disadvantage o f the G H T approach is that one needs to use a different G H T for each type of curve. For example, one needs a GHT for circles, another for ellipses, and yet another for parabolas. Although one could 551
Volume 14, Number 7
PATTERN RECOGNITION LETTERS
devise a GHT that can cover all types of seconddegree curves (or hyperquadrics), the dimensionality of the resulting parameter space (six in the case of second-degree curves in 2-D) will be very large, and the resulting GHT would be computationally very expensive. The memory requirements can also be prohibitive (Milenkovic (1986)). The speed of the GHT can be improved only if we make certain assumptions about the curve (for example, if the curve is an ellipse, etc.) and if the gradient information is available. Also, our algorithm works well even when the edge points are somewhat scattered around the ideal curve (or hypersurface), which causes bin splitting in the GHT. Our algorithm can locate small shell segments much better. Small peaks in the GHT are lost in the bias, and selecting a suitable threshold is difficult. The GHT also suffers from a high probability of spurious clusters (Grimson (1990)).
Acknowledgment The authors wish to thank the anonymous reviewers for their helpful comments and suggestions.
552
July 1993
References Bezdek, J.C. (1981). Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York. Dave, R.N. (1990). Fuzzy-shell clustering and applications to circle detection in digital images, lnternat. J. General Systems 16, 343-355. Dave, R.N., and K. Bhaswan (1991). New measures for evaluating fuzzy partitions induced through C-shells clustering. Proc. SPIE Conf. on Intelligent Robots and Computer Vision X: Algorithms and Techniques. Boston, MA, 406-414. Dave, R.N. (1992). Generalized fuzzy c-shells clustering and detection of circular and elliptical boundaries. Pattern Recognition 25 (7), 713-722. Faugeras, O.D. and M. Hebert (1986). In: A. Rosenfeld, Ed., Techniques for 3-D Machine Perception. North-Holland, Amsterdam, 13-51. Grimson, W.E.L. and D.P. Huttenlocher (1990). On the sensitivity of the Hough transform for object recognition. IEEE Trans. Pattern Anal. Machine lntell. 12 (3), 255-274. Krishnapuram, R., O. Nasraoui and H. Frigui (1992). The Fuzzy C Spherical Shalls algorithm: a new approach. IEEE Trans. Neural Networks 3 (5), 663-671. Milenkovic, V. (1986). Multiple resolution search techniques for the Hough transform in high dimensional parameter spaces. In: A. Rosenfeld, Ed., Techniques for 2-D Machine Perception. North-Holland, Amsterdam, 231-255.