Computational Statistics & Data Analysis 32 (2000) 379–394 www.elsevier.com/locate/csda
Asymmetric aggregation operator and its application to fuzzy clustering model
Mika Sato-Ilic (a,∗), Yoshiharu Sato (b)
(a) Institute of Policy and Planning Sciences, University of Tsukuba, Tenodai 1-1-1, Tsukuba 305-8573, Japan
(b) Department of Engineering, Hokkaido University, Sapporo 060, Japan
Abstract

In this paper, asymmetric aggregation operators are proposed to classify objects based on asymmetric similarities. Using the asymmetric aggregation operators, clusters which represent the asymmetric structure between objects are obtained. Moreover, the validity of this model is shown both by investigating the features of the asymmetric aggregation operators and through numerical applications. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Additive clustering; Similarity data; t-norm
∗ Corresponding author. E-mail addresses: [email protected] (M. Sato-Ilic), [email protected] (Y. Sato).
© 2000 Elsevier Science B.V. All rights reserved. 0167-9473/00/$ - see front matter. PII: S0167-9473(99)00091-2

1. Introduction

The main purpose of unsupervised classification (clustering) of a set of objects is to detect natural subgroups (clusters) based on the similarity (or dissimilarity) between pairs of objects. Depending on the definition or interpretation of natural subgroups, many different algorithms have been proposed. In usual clustering, namely a partition of a set of $n$ objects into $K$ mutually disjoint clusters, the state of clustering is expressed by an $n \times K$ matrix $U = (u_{ik})$, where $u_{ik} = 1$ if object $i$ belongs to cluster $k$, and $u_{ik} = 0$ otherwise. To ensure that the clusters are disjoint and non-empty, $u_{ik}$ must satisfy the following conditions:

$$\sum_{k=1}^{K} u_{ik} = 1, \quad i = 1, \ldots, n, \tag{1.1}$$

$$u_{ik} \in \{0, 1\}, \quad i = 1, \ldots, n, \; k = 1, \ldots, K. \tag{1.2}$$
However, in practical situations, there are many cases in which exclusive or disjoint clusters are not suitable as natural subgroups. Therefore, the concepts of overlapping clusters and fuzzy clusters have been proposed. The essential idea of fuzzy clustering is to replace condition (1.2) with

$$u_{ik} \in [0, 1], \quad i = 1, \ldots, n, \; k = 1, \ldots, K. \tag{1.3}$$
This means that the cluster (natural subgroup) is considered to be a fuzzy subset (Zadeh, 1965) on a set of objects. That is, $u_{ik}$ represents "the degree of belongingness" of the $i$th object to the $k$th cluster. A pioneering work in applying the concept of fuzzy sets to cluster analysis was made by Ruspini (1969). Since the fuzzy k-means clustering algorithm was proposed by Dunn (1973) and Bezdek (1987), several methods of fuzzy clustering have been developed rapidly and many applications have been suggested (Dave and Bhaswan, 1992; Hall and Bensaid, 1992).

For the concept of overlapping clusters, Shepard and Arabie (1979) proposed the additive clustering model in hard cluster analysis, which is intended to discover and represent the structure of the similarity between pairs of objects. However, a large number of clusters is required by the constraint of the model. Usually, the model is denoted by the following:

$$\hat{s}_{ij} = \sum_{k=1}^{K} w_k p_{ik} p_{jk}, \tag{1.4}$$
where $\hat{s}_{ij}$ $(i, j = 1, 2, \ldots, n)$ is the theoretically reconstructed similarity between objects $i$ and $j$, $K$ is the number of clusters, and $w_k$ is a weight representing the salience of the property corresponding to cluster $k$. If object $i$ has the property of cluster $k$, then $p_{ik} = 1$; otherwise it is 0. Notice that the product $p_{ik} p_{jk}$ is unity if and only if both objects $i$ and $j$ belong to cluster $k$, and that the similarity between the pair of objects is defined to be the common property of the objects. Moreover, if the pair of objects shares several common properties, the contributions of these properties to the similarity are assumed to be mutually independent.

In the case of fuzzy clustering, if we assume that a cluster is a group whose elements share common properties, then the fuzzy grade shows the degree to which the object has the common properties of each cluster. By introducing the concept of the fuzzy cluster into the additive clustering model, we define the following natural clustering model (Sato and Sato, 1994), in which it is possible to represent the structure of similarity by using fewer clusters:

$$s_{ij} = \sum_{k=1}^{K} u_{ik} u_{jk} + \varepsilon_{ij}, \tag{1.5}$$
where $s_{ij}$ is the observed similarity between objects $i$ and $j$, and $0 \le s_{ij} \le 1$. In this case, the product $u_{ik} u_{jk}$ is the degree of simultaneous belongingness of objects $i$ and $j$ to cluster $k$. However, if the observed similarity data are asymmetric, model (1.5) cannot be applied. We therefore extended the model as follows (Sato and Sato, 1995a, b):

$$s_{ij} = \frac{1}{K} \sum_{k=1}^{K} \sum_{\ell=1}^{K} w_{k\ell} u_{ik} u_{j\ell} + \varepsilon_{ij}, \tag{1.6}$$
where the weight $w_{k\ell}$ is considered to express an asymmetric similarity between the pair of clusters. That is, we assume that the asymmetry of the similarity between the objects is caused by the asymmetry of the similarity between the clusters. In model (1.6), the constant term $1/K$ is required if we assume

$$0 \le w_{k\ell} \le 1, \quad w_{k\ell} \ne w_{\ell k}, \quad w_{kk} = 1,$$

because

$$\sum_{k=1}^{K} \sum_{\ell=1}^{K} w_{k\ell} u_{ik} u_{j\ell} \le \sum_{k=1}^{K} \sum_{\ell=1}^{K} w_{k\ell} u_{ik} = \sum_{\ell=1}^{K} \left\{ \sum_{k=1}^{K} w_{k\ell} u_{ik} \right\} \le \sum_{\ell=1}^{K} \left\{ \sum_{k=1}^{K} u_{ik} \right\} = K.$$
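The bound above can be checked numerically. The following is a minimal sketch, not the authors' implementation: the function name `asymmetric_similarity`, the membership matrix `U` and the weight matrix `W` are made-up illustrative assumptions, and the error term of model (1.6) is omitted.

```python
def asymmetric_similarity(U, W):
    """Model (1.6) without the error term:
    s_ij = (1/K) * sum_k sum_l w_kl * u_ik * u_jl."""
    n, K = len(U), len(W)
    return [[sum(W[k][l] * U[i][k] * U[j][l]
                 for k in range(K) for l in range(K)) / K
             for j in range(n)]
            for i in range(n)]

# Hypothetical fuzzy memberships: each row is non-negative and sums to 1.
U = [[0.7, 0.3],
     [0.2, 0.8],
     [0.5, 0.5]]

# Hypothetical cluster weights: w_kk = 1 and w_kl != w_lk.
W = [[1.0, 0.6],
     [0.1, 1.0]]

S = asymmetric_similarity(U, W)
print(S[0][1], S[1][0])  # the two directions differ because W is asymmetric
```

Because the asymmetry enters only through `W`, replacing `W` by a symmetric matrix in this sketch yields a symmetric `S`.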
Among conventional clustering methods for asymmetric similarity, Hubert (1973) proposed a method that selects the maximum or the minimum of each pair of corresponding elements, that is,

$$\tilde{s}_{ij} = \max(s_{ij}, s_{ji}), \qquad \tilde{s}_{ij} = \min(s_{ij}, s_{ji}).$$
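As a small illustration of these two rules, the sketch below symmetrizes a made-up asymmetric similarity matrix. The helper name `symmetrize` and the data are assumptions for illustration only.

```python
def symmetrize(S, combine):
    """Replace each pair (s_ij, s_ji) by combine(s_ij, s_ji)."""
    n = len(S)
    return [[combine(S[i][j], S[j][i]) for j in range(n)] for i in range(n)]

# Hypothetical asymmetric similarity matrix.
S = [[1.0, 0.8, 0.3],
     [0.4, 1.0, 0.6],
     [0.5, 0.2, 1.0]]

S_max = symmetrize(S, max)  # keeps the larger of s_ij and s_ji
S_min = symmetrize(S, min)  # keeps the smaller of s_ij and s_ji
```

Either choice discards the direction of the relation, which is the information the models discussed in this paper try to retain.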
Gordon (1987) proposed a method using only the symmetric part of the data. This method is based on the idea that the asymmetry of the given similarity data can be regarded as errors of the symmetric similarity data, that is,

$$\tilde{S} = \tfrac{1}{2}(S + S'),$$

where $S$ is a similarity matrix and $S'$ is the transpose of $S$. Tarjan (1983) discussed an algorithm that hierarchically decomposes a directed graph with weighted edges, which is used for clustering asymmetric similarity data.

On the other hand, models for data representing an asymmetric relation have been proposed by several researchers. Goodman (1979) proposed the model called the "Uniform Association Model",

$$O_{ij} = \alpha_i \beta_j \theta^{ij} \quad (i, j = 1, \ldots, n), \tag{1.7}$$

where $o_{ij}$ $(i, j = 1, 2, \ldots, n)$ is the observed frequency of moves from origin $i$ to destination $j$, $O_{ij}$ is the expectation of $o_{ij}$, and $\alpha_i$, $\beta_j$, $\theta$ are unknown parameters. This model expresses not only that $O_{ij}$ is the product of $\alpha_i$ and $\beta_j$ but also that it has a weight $\theta^{ij}$ which depends on the position $(i, j)$. Goodman (1979) also proposed the model called the "Quasi-Uniform Association Model",

$$O_{ij} = \alpha_i \beta_j \theta^{ij} \delta_{ij} \quad (i, j = 1, \ldots, n). \tag{1.8}$$

In this model, $\delta_{ij} = 1$ whenever $i \ne j$; namely, only the expression of the diagonal elements $O_{ii}$ differs from model (1.7). Assuming $\theta^{ij} \equiv 1$, model (1.8) is called the "Quasi-Independent Model".
Clogg (1981) proposed a model based on the latent class model, in which he treated the mobility data as transition probability data. That is, when $\pi_{ij}$ denotes the transition probability from origin (state or object) $i$ to destination (state or object) $j$,

$$\pi_{ij} = \sum_{\alpha=1}^{R} q_{\alpha} a_{i|\alpha} b_{j|\alpha} + \delta_{ij} h_i, \tag{1.9}$$

where $a_{i|\alpha}$ is the conditional probability of state (object) $i$ given latent class $\alpha$, $b_{j|\alpha}$ is the conditional probability of state (object) $j$ given class $\alpha$, and $\delta_{ij}$ is the Kronecker delta. $h_i$ is a stationary probability of state $i$. From the probabilistic structure of model (1.9), it is natural to assume that

$$\sum_{i,j=1}^{n} \pi_{ij} = 1, \quad \sum_{i=1}^{n} a_{i|\alpha} = 1, \quad \sum_{j=1}^{n} b_{j|\alpha} = 1, \quad \sum_{\alpha=1}^{R} q_{\alpha} + \sum_{i=1}^{n} h_i = 1.$$
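In model (1.9), $\pi_{ij}$ is expressed by the conditional independence model, except for the diagonal elements. Hence, model (1.9) is called the "Conditional Quasi-Independent Model". The asymmetry of $\pi_{ij}$ is represented by the product $a_{i|\alpha} b_{j|\alpha}$ $(\ne a_{j|\alpha} b_{i|\alpha})$. On the other hand, Hagenaars (1990) extended model (1.9) as follows:

$$\pi_{ij} = \sum_{\alpha=1}^{R} \sum_{\beta=1}^{R} q_{\alpha\beta} a_{i|\alpha\beta} b_{j|\alpha\beta} + \delta_{ij} h_i. \tag{1.10}$$

The special feature of this model is that it assumes the joint latent class $(\alpha, \beta)$; otherwise it is the same as model (1.9). Assuming the following conditions in model (1.10),

$$a_{i|\alpha\beta} = a_{i|\alpha}, \qquad b_{j|\alpha\beta} = b_{j|\beta},$$

Takane and Kiers (1995) discussed the model

$$\pi_{ij} = \sum_{\alpha=1}^{R} \sum_{\beta=1}^{R} q_{\alpha\beta} a_{i|\alpha} b_{j|\beta} + \delta_{ij} h_i. \tag{1.11}$$

In model (1.11), in order to express the asymmetric structure of $\pi_{ij}$, it is not necessary to assume $q_{\alpha\beta} \ne q_{\beta\alpha}$, because $a_{i|\alpha} b_{j|\beta}$ is not always equal to $a_{j|\alpha} b_{i|\beta}$. However, supposing $q_{\alpha\beta} \ne q_{\beta\alpha}$ and $a_{i|\alpha} = b_{i|\alpha}$, Takane and Kiers (1995) defined a model which has fewer unknown parameters:

$$\pi_{ij} = \sum_{\alpha=1}^{R} \sum_{\beta=1}^{R} q_{\alpha\beta} a_{i|\alpha} a_{j|\beta} + \delta_{ij} h_i. \tag{1.12}$$

Model (1.12) is formally similar to DEDICOM (Harshman et al., 1982),

$$x_{ij} = \sum_{\alpha=1}^{R} \sum_{\beta=1}^{R} g_{\alpha\beta} f_{i\alpha} f_{j\beta} + \varepsilon_{ij}, \tag{1.13}$$

where $x_{ij}$ is the observed asymmetric similarity between a pair of objects $i$ and $j$, and $g_{\alpha\beta}$ and $f_{i\alpha}$ are unknown parameters. In model (1.13), the asymmetry is expressed by $g_{\alpha\beta} \ne g_{\beta\alpha}$.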
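To make the role of the asymmetric core matrix in model (1.13) concrete, here is a minimal sketch of the structural part of DEDICOM (the error term is omitted). The function name `dedicom`, the loading matrix `F` and the core matrix `G` are illustrative assumptions, not values from the paper.

```python
def dedicom(F, G):
    """Structural part of model (1.13): x_ij = sum_a sum_b g_ab * f_ia * f_jb."""
    n, R = len(F), len(G)
    return [[sum(G[a][b] * F[i][a] * F[j][b]
                 for a in range(R) for b in range(R))
             for j in range(n)]
            for i in range(n)]

# Hypothetical loadings of three objects on R = 2 latent aspects.
F = [[1.0, 0.0],
     [0.0, 1.0],
     [0.5, 0.5]]

# Hypothetical asymmetric core: g_12 != g_21.
G = [[1.0, 0.8],
     [0.2, 1.0]]

X = dedicom(F, G)
print(X[0][1], X[1][0])  # 0.8 and 0.2: the asymmetry comes from G alone
```

With a symmetric `G`, the same loadings give a symmetric `X`, which is why the asymmetry of the data is carried entirely by the core matrix in this model.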
The relation between the models based on latent classes and our clustering model is discussed in the following section. In this paper, we propose a new concept of asymmetric aggregation operators in order to represent the asymmetric relationship between a pair of objects. By introducing these asymmetric aggregation operators into the fuzzy clustering model, a new model is proposed in order to obtain clusters in which objects are not only similar to each other but also asymmetrically related. The validity of this model is shown by a numerical example and by some features of the aggregation operators.

2. General fuzzy clustering model for symmetric similarity

Suppose that $K$ fuzzy clusters exist on a set of $n$ objects, that is, the partition matrix $U = (u_{ik})$ is given. Let $\rho(u_{ik}, u_{jk})$ be a function which denotes the degree of simultaneous belongingness of a pair of objects $i$ and $j$ to cluster $k$, namely, the degree of sharing common properties. Then a general model is defined as follows:

$$s_{ij} = \sum_{k=1}^{K} \rho(u_{ik}, u_{jk}) + \varepsilon_{ij}, \tag{2.1}$$
where $\varepsilon_{ij}$ is an error, and $s_{ij} = s_{ji}$. We assume the condition $0 \le s_{ij} \le 1$, because

$$0 \le \sum_{k=1}^{K} \rho(u_{ik}, u_{jk}) \le \sum_{k=1}^{K} \min(u_{ik}, u_{jk}) \le \sum_{k=1}^{K} u_{ik} = 1. \tag{2.2}$$
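The degree $\rho$ is assumed to satisfy the following conditions:

1. Boundary conditions:
$$\rho(u_{ik}, 1) = u_{ik}, \qquad \rho(u_{ik}, 0) = 0. \tag{2.3}$$

2. Monotonicity:
$$\rho(u_{ik}, u_{jk}) \le \rho(u_{\ell k}, u_{mk}) \quad \text{whenever } u_{ik} \le u_{\ell k},\; u_{jk} \le u_{mk}. \tag{2.4}$$

3. Symmetry:
$$\rho(u_{ik}, u_{jk}) = \rho(u_{jk}, u_{ik}). \tag{2.5}$$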
The first condition means that if one object of the pair belongs completely to cluster $k$, then the degree of simultaneous belongingness equals the grade $u_{ik}$ of the other object $i$, and if one object does not belong to cluster $k$ at all, then $\rho$ is 0. The second condition shows that the greater the degrees of belongingness of objects $\ell$ and $m$ to cluster $k$, the greater the degree of simultaneous belongingness of the pair of objects $\ell$ and $m$. The third condition is that the degree of simultaneous belongingness of objects $i$ and $j$ is the same as that of objects $j$ and $i$. In our clustering model, $\rho(u_{ik}, u_{jk})$ plays the role of an aggregation operator (Klir and Folger, 1988) between the two grades $u_{ik}$ and $u_{jk}$; that is, $\rho(u_{ik}, u_{jk})$ is interpreted as the grade to which the two objects $i$ and $j$ belong to cluster $k$ simultaneously. Therefore, we consider that the similarity $s_{ij}$ between two objects $i$ and $j$ is an additive function of $\rho(u_{ik}, u_{jk})$.
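The additive structure of model (2.1) can be sketched as follows, with the aggregation function passed in as a parameter (called `rho` in the code). This is an illustrative sketch only: the function name `reconstruct_similarity` and the membership matrix `U` are assumptions, and the error term is omitted.

```python
def reconstruct_similarity(U, rho):
    """Model (2.1) without the error term: s_ij = sum_k rho(u_ik, u_jk)."""
    n, K = len(U), len(U[0])
    return [[sum(rho(U[i][k], U[j][k]) for k in range(K)) for j in range(n)]
            for i in range(n)]

# Hypothetical fuzzy memberships of three objects in K = 3 clusters.
U = [[0.7, 0.2, 0.1],
     [0.1, 0.6, 0.3],
     [0.3, 0.3, 0.4]]

S_product = reconstruct_similarity(U, lambda x, y: x * y)  # algebraic product
S_minimum = reconstruct_similarity(U, min)                 # minimum
```

Both choices satisfy the boundary, monotonicity and symmetry conditions, so both reconstructed matrices are symmetric with entries in [0, 1]; the product never exceeds the minimum.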
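Since it is unknown which function $\rho(u_{ik}, u_{jk})$ is the best for the observed similarity, we select the function $\rho$ by the use of the least squares criterion,

$$\eta^2 = \frac{\sum_{i \ne j} \varepsilon_{ij}^2}{\sum_{i \ne j} (s_{ij} - \bar{s})^2}, \qquad \bar{s} = \frac{1}{n(n-1)} \sum_{i \ne j} s_{ij}. \tag{2.6}$$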
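Criterion (2.6) is a ratio of the residual sum of squares to the total variation of the off-diagonal similarities. A minimal sketch is given below, assuming the residual is the difference between the observed matrix `S` and a reconstructed matrix `S_hat`; the function name `fitness` and the data are illustrative assumptions.

```python
def fitness(S, S_hat):
    """Criterion (2.6): residual sum of squares over total sum of squares,
    both taken over the off-diagonal pairs i != j."""
    n = len(S)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    mean = sum(S[i][j] for i, j in pairs) / len(pairs)  # n(n-1) pairs
    residual = sum((S[i][j] - S_hat[i][j]) ** 2 for i, j in pairs)
    total = sum((S[i][j] - mean) ** 2 for i, j in pairs)
    return residual / total

# Hypothetical observed similarities.
S = [[1.0, 0.8, 0.3],
     [0.4, 1.0, 0.6],
     [0.5, 0.2, 1.0]]

off_diagonal_mean = (0.8 + 0.3 + 0.4 + 0.6 + 0.5 + 0.2) / 6
S_const = [[off_diagonal_mean] * 3 for _ in range(3)]

print(fitness(S, S))        # 0.0 for a perfect reconstruction
print(fitness(S, S_const))  # close to 1 for the constant (mean) model
```

Smaller values therefore indicate a better fit, and a value near 1 means the model explains no more than the mean similarity does.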
In our clustering model (2.1), if we put $\rho(x, y) = xy$ (algebraic product), the additive fuzzy clustering model becomes (1.5):

$$s_{ij} = \sum_{k=1}^{K} u_{ik} u_{jk} + \varepsilon_{ij},$$

provided that $0 \le s_{ij} \le 1$ and $s_{ij} = s_{ji}$, because $0 \le \sum_{k} \rho(u_{ik}, u_{jk}) \le 1$ in general. For asymmetric cases, we have discussed the model (Sato and Sato, 1995b)

$$s_{ij} = \frac{1}{K} \sum_{k=1}^{K} \sum_{\ell=1}^{K} w_{k\ell} u_{ik} u_{j\ell} + \varepsilon_{ij}, \tag{2.7}$$
where $0 \le w_{k\ell} \le 1$, $w_{kk} = 1$, $w_{k\ell} \ne w_{\ell k}$. This model is closely connected with model (1.12). To compare model (1.12) with model (2.7), we rewrite it as

$$\pi_{ij} = \sum_{k=1}^{K} \sum_{\ell=1}^{K} q_{k\ell} a_{i|k} a_{j|\ell} + \delta_{ij} h_i.$$

In this model, it is assumed that

$$\sum_{i,j=1}^{n} \pi_{ij} = 1, \quad \sum_{k,\ell=1}^{K} q_{k\ell} + \sum_{i=1}^{n} h_i = 1, \quad \sum_{i=1}^{n} a_{i|k} = 1. \tag{2.8}$$

However, in model (2.7), the main assumption is

$$\sum_{k=1}^{K} u_{ik} = 1. \tag{2.9}$$

From this, we may consider that model (2.7) corresponds to

$$\pi_{ij} = \sum_{k=1}^{K} \sum_{\ell=1}^{K} \tilde{q}_{k\ell} a_{k|i} a_{\ell|j} + \delta_{ij} \tilde{h}_i, \tag{2.10}$$

where $a_{k|i}$ is the conditional probability of class $k$ for fixed $i$, and

$$\sum_{k=1}^{K} a_{k|i} = 1.$$
However, this model seems to be slightly different from the latent class model, and in our additive fuzzy clustering model, $\rho(u_{ik}, u_{j\ell})$ is not always the algebraic product.

As a concrete function $\rho(x, y)$, we can consider the t-norms, which satisfy the above conditions (2.3)–(2.5). Originally, the t-norm was defined by Menger, 1942 (see Schweizer and Sklar, 1983) as a function appearing in an analogue of the triangle inequality in a statistical metric space. Supposing $F(x)$ to be a probability distribution function, Menger proposed that the distance between two points $p$ and $q$ in a set $S$ be defined as a probability distribution $F_{pq}(x)$; that is, for any real number $x$, $F_{pq}(x)$ is interpreted as the probability that the distance between $p$ and $q$ is less than $x$. To introduce the metric property, he defined the inequality

$$F_{pr}(x + y) \ge t(F_{pq}(x), F_{qr}(y))$$

for all $p, q, r \in S$ and all real numbers $x, y$. In this inequality, the function $t(\cdot, \cdot) : [0, 1] \times [0, 1] \to [0, 1]$ is referred to as a t-norm, satisfying the following conditions:

$$t(x, 1) = x, \quad t(x, 0) = 0 \quad \text{(Boundary conditions)} \tag{2.11}$$

$$t(x_1, y_1) \le t(x_2, y_2) \text{ whenever } x_1 \le x_2,\; y_1 \le y_2 \quad \text{(Monotonicity)} \tag{2.12}$$

$$t(x, y) = t(y, x) \quad \text{(Symmetry)} \tag{2.13}$$

$$t(t(x, y), z) = t(x, t(y, z)) \quad \text{(Associativity)} \tag{2.14}$$
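Conditions (2.11)–(2.14) can be probed numerically on a finite grid. The sketch below is only a necessary check on sampled points, not a proof; the helper name `is_t_norm`, the grid and the tolerance are assumptions made for illustration.

```python
import itertools

def is_t_norm(t, grid, tol=1e-12):
    """Check conditions (2.11)-(2.14) on a finite grid of points in [0, 1]."""
    for x in grid:
        # (2.11) boundary conditions
        if abs(t(x, 1.0) - x) > tol or abs(t(x, 0.0)) > tol:
            return False
    for x, y, z in itertools.product(grid, repeat=3):
        # (2.13) symmetry
        if abs(t(x, y) - t(y, x)) > tol:
            return False
        # (2.14) associativity
        if abs(t(t(x, y), z) - t(x, t(y, z))) > tol:
            return False
        # (2.12) monotonicity in the first argument; together with
        # symmetry this covers monotonicity in both arguments
        if x <= y and t(x, z) > t(y, z) + tol:
            return False
    return True

grid = [i / 10 for i in range(11)]

print(is_t_norm(min, grid))                       # minimum
print(is_t_norm(lambda x, y: x * y, grid))        # algebraic product
print(is_t_norm(lambda x, y: (x + y) / 2, grid))  # arithmetic mean fails (2.11)
```

The arithmetic mean is included as a counterexample: it is symmetric and monotone but violates the boundary conditions, so it is not a t-norm.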
Most t-norms are defined as follows. Suppose $f(x)$ is a generator function of t-norms, which is a continuous monotone decreasing function satisfying

$$f : [0, 1] \to [0, \infty], \qquad f(1) = 0.$$

Then we define $t(x, y)$ as

$$t(x, y) = f^{[-1]}(f(x) + f(y)),$$

where

$$f^{[-1]}(z) = \begin{cases} f^{-1}(z), & z \in [0, f(0)), \\ 0, & z \in [f(0), \infty]. \end{cases}$$

$f^{[-1]}$ is the pseudo-inverse of $f$, and $f^{-1}$ is the usual inverse of $f$. We will use some typical t-norms below as aggregation operators.

• Minimum: $\rho(x, y) = \min(x, y)$,
• Algebraic product: $\rho(x, y) = xy$,
• Hamacher product (Fullér, 1991): $\rho(x, y) = xy / [p + (1 - p)(x + y - xy)]$,
• Einstein product (Smith, 1992): $\rho(x, y) = xy / [2 - (x + y - xy)]$.

3. General fuzzy clustering model for asymmetric similarity

If the observed similarity data are asymmetric, then the additive fuzzy clustering models proposed in the previous section cannot be applied. Therefore, fuzzy clustering model (2.1) should be extended to a model for asymmetric similarity. For this model, we introduce the concept of similarity among clusters. The crucial assumption of the model is that the asymmetry of the similarity between the pair of objects is
caused by the asymmetric similarity among clusters. Therefore, we extend model (2.1) as follows:

$$s_{ij} = \frac{1}{K} \sum_{k=1}^{K} \sum_{\ell=1}^{K} w_{k\ell}\, \rho(u_{ik}, u_{j\ell}) + \varepsilon_{ij}, \tag{3.1}$$

where $s_{ij} \ne s_{ji}$ and $0 \le s_{ij} \le 1$. $u_{ik}$ is a fuzzy grade (the degree of belongingness) with which object $i$ belongs to cluster $k$. Conditions (1.1) and (1.3) are also assumed:

$$u_{ik} \ge 0, \qquad \sum_{k=1}^{K} u_{ik} = 1.$$
In this model, the weight $w_{k\ell}$ is considered to be a quantity which shows the asymmetric similarity between the pair of clusters. If model (3.1) satisfies the conditions

$$w_{k\ell} = 0 \; (k \ne \ell), \qquad w_{kk} = 1 \text{ for all } k,$$

then model (3.1) reduces to model (2.1), up to the constant factor $1/K$. That is, model (2.1) is a special case of model (3.1).

4. Asymmetric aggregation operators

If the obtained similarity is symmetric in model (3.1), then the model is represented by a symmetric function $\rho$ and by a symmetric similarity $w_{k\ell}$ between clusters. On the other hand, if the similarity is asymmetric, two approaches exist. The first represents the asymmetry between objects by the asymmetry between the clusters, as in the foregoing section. The second is a new approach that uses asymmetric aggregation operators. In this case, we have to create new aggregation operators which satisfy the following conditions: Boundary conditions, Monotonicity, and Asymmetry.

Suppose $f(x)$ is a generator function of t-norms, and $\phi(x)$ is a continuous monotone decreasing function satisfying

$$\phi : [0, 1] \to [1, \infty], \qquad \phi(1) = 1.$$

Then we define the following $\gamma(x, y)$ as asymmetric aggregation operators:

$$\gamma(x, y) = f^{[-1]}(f(x) + \phi(x) f(y)). \tag{4.1}$$
For instance, using the generator function of the Hamacher product, i.e. $f(x) = (1 - x)/x$, and the monotone decreasing function $\phi(x) = 1/x^m$ $(m > 0)$, the asymmetric aggregation operator is defined as

$$\gamma(x, y) = \frac{x^m y}{1 - y + x^{m-1} y}, \tag{4.2}$$

which is shown in Fig. 1 ($m = 2$). In Fig. 2, the dotted curve shows the intersection of the surface shown in Fig. 1 with the plane $x = y$, and the solid curve is the intersection with $x + y = 1$. From the solid curve, we find the asymmetry of the proposed aggregation operator.
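The construction can be verified numerically: composing the Hamacher generator $f(x) = (1 - x)/x$, whose inverse is $f^{-1}(z) = 1/(1 + z)$, with the weight $1/x^m$ reproduces the closed form (4.2). The sketch below is illustrative only; the helper names `generated_operator` and `closed_form` are assumptions, and returning 0 when either argument is 0 is a convention adopted here for generators with $f(0) = \infty$.

```python
def generated_operator(f, f_inv, phi):
    """Build (x, y) -> f_inv(f(x) + phi(x) * f(y)) for a generator with f(0) = +inf."""
    def op(x, y):
        if x == 0.0 or y == 0.0:
            return 0.0  # convention: the pseudo-inverse maps +inf to 0
        return f_inv(f(x) + phi(x) * f(y))
    return op

m = 2
hamacher_f = lambda x: (1.0 - x) / x
hamacher_f_inv = lambda z: 1.0 / (1.0 + z)

# Asymmetric operator: weight 1 / x**m on the second generator term.
gamma = generated_operator(hamacher_f, hamacher_f_inv, lambda x: 1.0 / x ** m)
# Symmetric t-norm from the same generator: constant weight 1.
rho = generated_operator(hamacher_f, hamacher_f_inv, lambda x: 1.0)

def closed_form(x, y, m=2):
    """Closed form (4.2)."""
    return x ** m * y / (1.0 - y + x ** (m - 1) * y)

print(gamma(0.3, 0.7), gamma(0.7, 0.3))  # the two orderings give different values
print(gamma(0.3, 0.7), rho(0.3, 0.7))    # the asymmetric value does not exceed the t-norm
```

Setting the weight to the constant 1 recovers the ordinary symmetric t-norm generated by the same $f$, which makes the comparison between the two operators direct.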
Fig. 1. Asymmetric aggregation operator (m = 2).
Fig. 2. Intersecting curves.
Generally, $\gamma(x, y)$ satisfies the inequality

$$\gamma(x, y) \le \rho(x, y) \tag{4.3}$$

by $f(x) + f(y) \le f(x) + \phi(x) f(y)$, because $\phi(x) \ge 1$ and $f^{[-1]}$ is monotone decreasing. Since the inequality

$$\sum_{k=1}^{K} \gamma(u_{ik}, u_{jk}) \le \sum_{k=1}^{K} \rho(u_{ik}, u_{jk}) \le 1$$

is satisfied by (2.2) and (4.3), we assume the condition $0 \le s_{ij} \le 1$. Using the asymmetric aggregation operators $\gamma(x, y)$, we define the model for asymmetric similarity data as follows:

$$s_{ij} = \sum_{k=1}^{K} \gamma(u_{ik}, u_{jk}) + \varepsilon_{ij}, \tag{4.4}$$

where $\gamma(u_{ik}, u_{jk}) \ne \gamma(u_{jk}, u_{ik})$.

5. Features of asymmetric aggregation operators

We shall investigate the variation of $\gamma$ with the value of $m$. Suppose the similarity between two vectors $x = (x_1, x_2)$ and $y = (y_1, y_2)$ is as follows:

$$r(x, y) = \sum_{k=1}^{2} \gamma(x_k, y_k) = \gamma(x_1, y_1) + \gamma(x_2, y_2). \tag{5.1}$$
Then we fix $x = (x_1, x_2)$ and plot the value of $r(x, y)$ with respect to the variation of $y = (y_1, y_2)$. In Fig. 3, (A) shows the value of $\gamma$ with respect to the two variables $y_1, y_2$ when $x$ is fixed. On the other hand, (B) shows the value of $\gamma$ with respect to the two variables $x_1, x_2$ when $y$ is fixed. In Fig. 3, we use $\gamma(x, y) = x y^{(2 - x)^3}$, which is produced by the generator function of the algebraic product, i.e. $f(x) = -\log x$, and $\phi(x) = (2 - x)^m$ $(m = 3)$. Fig. 4 shows the contours of the surfaces shown in Fig. 3. Point A shows $x = (x_1, x_2) = (0.6, 0.4)$ and point B shows $y = (y_1, y_2) = (0.2, 0.8)$. Fig. 5 shows the two panels of Fig. 4 drawn together. From Fig. 5, we find that the similarity from point A to point B lies on the 7th contour, because it is measured by the contours of (A) shown in Fig. 4. On the other hand, the similarity from point B to point A lies not on the 7th contour but on the 8th, because it is measured by the contours of (B) shown in Fig. 4. This means that $\gamma(x, y) \ne \gamma(y, x)$. In order to adapt the above features to fuzzy grades $u_{ik}$, we must consider the following constraint:

$$u_{ik} > 0, \qquad \sum_{k=1}^{K} u_{ik} = 1.$$
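The asymmetry between points A and B described above can be reproduced directly from (5.1) with the operator $x\,y^{(2-x)^m}$, $m = 3$, used for Fig. 3. This is a minimal sketch; the function names `gamma` and `r` are chosen here for illustration.

```python
def gamma(x, y, m=3):
    """Operator generated by f(x) = -log x with weight (2 - x)**m."""
    return x * y ** ((2.0 - x) ** m)

def r(x, y, m=3):
    """Similarity (5.1): sum of gamma over the coordinates."""
    return sum(gamma(xk, yk, m) for xk, yk in zip(x, y))

A = (0.6, 0.4)
B = (0.2, 0.8)

r_ab = r(A, B)  # similarity from A to B
r_ba = r(B, A)  # similarity from B to A
print(r_ab, r_ba)  # the two directions give different values
```

The boundary behaviour is preserved: `gamma(x, 1.0)` returns `x` and `gamma(1.0, y)` returns `y`, while the two directions of `r` differ for these points.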
The similarity between objects 1 and 2 is given by

$$\sum_{k=1}^{3} \gamma(u_{1k}, u_{2k})$$

when the number of clusters is 3. From the conditions $u_{11} + u_{12} + u_{13} = 1$ and $u_{21} + u_{22} + u_{23} = 1$, the domain of the feasible solutions is the interior of the triangle shown in Fig. 6. In this case, the value of $\gamma$ with respect to the change of object 2 when the coordinates of object 1 are fixed is represented by one surface, shown by the dotted line in Fig. 6. We shall show the contours of this surface. Fig. 7 shows equi-similarity curves from the two points (0.5, 0.2, 0.3) and (0.1, 0.4, 0.5) when $\gamma$ is (4.2). The left figure in Fig. 7 shows the case of $m = 2$, and the right one is $m = 3$. In fact, the solution is considered only in the interior of the triangle in Fig. 7. From Fig. 7, we find the asymmetric feature of this aggregation operator.

Fig. 3. Surfaces. (A) Surface for fixed $(x_1, x_2)$; (B) surface for fixed $(y_1, y_2)$.
Fig. 4. Contours. Contours of (A); contours of (B).
Fig. 5. Asymmetric feature of $\gamma(x, y) = x y^{(2 - x)^3}$.
Fig. 6. Asymmetric aggregation operators for fuzzy grade.
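For the two points used in Fig. 7, the directional similarities under operator (4.2) can be computed as in the sketch below. The function names `gamma_hamacher` and `r` are hypothetical, and only the two reference points are evaluated rather than the full contour plots.

```python
def gamma_hamacher(x, y, m):
    """Closed form (4.2): x**m * y / (1 - y + x**(m - 1) * y)."""
    return x ** m * y / (1.0 - y + x ** (m - 1) * y)

def r(u, v, m):
    """Sum of gamma_hamacher(u_k, v_k) over the three clusters."""
    return sum(gamma_hamacher(uk, vk, m) for uk, vk in zip(u, v))

u1 = (0.5, 0.2, 0.3)  # fuzzy grades of object 1 (sum to 1)
u2 = (0.1, 0.4, 0.5)  # fuzzy grades of object 2 (sum to 1)

for m in (2, 3):
    # similarity from object 1 to object 2, and in the reverse direction
    print(m, r(u1, u2, m), r(u2, u1, m))
```

For both values of `m` the two directions differ, and a larger `m` gives smaller similarities, since the weight $1/x^m$ grows with `m` for grades below 1.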
6. Numerical example

To demonstrate the application of model (4.4), we use data which show telephone traffic from one prefecture to another. In the optimization algorithm used in this example, 20 sets of initial values are given by using uniform pseudorandom numbers in the interval $[0, \pi/2]$, and in the end we select the best result. The number of clusters is determined based on the value of fitness. By increasing the number of clusters, the value of fitness decreases, but when the number of clusters is greater than 4, there is no substantial further decrease of the fitness. From the principle of parsimony, the number of clusters is therefore determined to be 4. The results of the analysis using the asymmetric aggregation operator shown in Fig. 1 are given in Fig. 8. In Fig. 8, the monotone gradation shows the degree of belongingness of a prefecture to a cluster. The darker the shade, the larger the degree of belongingness. From these figures, we find that geographical distance is closely connected with the number of telephone calls. Cluster $C_1$ is the northern part of Japan, $C_2$ is the southern part, $C_3$ is the western part and $C_4$ is the eastern part. Moreover, clusters $C_1$, $C_2$ and $C_3$ have remarkable asymmetry between neighboring prefectures and remote prefectures. On the other hand, cluster $C_4$ is the group whose center is Tokyo, and its asymmetry is not so noticeable.

Fig. 7. Features of fuzzy grade for $r(u_1, u_2) = \sum_{k=1}^{2} \gamma(u_{1k}, u_{2k})$; $\gamma(u_{1k}, u_{2k}) = u_{1k}^{m} u_{2k} / (1 - u_{2k} + u_{1k}^{m-1} u_{2k})$.
Fig. 8. Fuzzy clustering.
7. Conclusion

In order to analyze asymmetric similarity data, we proposed an asymmetric aggregation operator and a generalized fuzzy clustering model using that operator. The ability of the model to capture the latent structure of the given data is shown not only by the fact that the objects in a cluster are similar to each other, but also by the fact that they have the same asymmetric properties. In the clustering model, how we define the degree of simultaneous belongingness of a pair of objects to a cluster (in other words, the degree of sharing of common properties) is an important consideration. In order to represent the degree of belongingness, we use the asymmetric aggregation operators and then define some required conditions for the operators. By simulating the asymmetric aggregation operators, we show that the asymmetric aggregation functions are adaptable as a general class of operators for capturing the asymmetry of the given data. Moreover, the validity of this model is confirmed by a numerical application.

8. For Further Reading

The following reference is also of interest to the reader: Torgerson, 1952.

Acknowledgements

This research was supported in part by a grant for Scientific Research from the Ministry of Education, Science and Culture of Japan.

References

Bezdek, J.C., 1987. Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York.
Clogg, C.C., 1981. Latent structure models of mobility. Amer. J. Sociol. 86, 836–868.
Dave, R.N., Bhaswan, K., 1992. Adaptive fuzzy c-shells clustering and detection of ellipses. IEEE Trans. Neural Networks 3, 643–662.
Dunn, J.C., 1973. A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. J. Cybernet. 3, 32–57.
Fullér, R., 1991. On Hamacher-sum of triangular fuzzy numbers. Fuzzy Sets and Systems 42, 205–212.
Goodman, L.A., 1979. Simple models for the analysis of association in cross-classifications having ordered categories. J. Amer. Statist. Assoc. 74, 537–552.
Gordon, A.D., 1987. A review of hierarchical classification. J. Roy. Statist. Soc. Ser. A 150, 119–137.
Hagenaars, J.A., 1990. Categorical Longitudinal Data. Sage Publications, Newbury Park, CA.
Hall, L.O., Bensaid, A.M., et al., 1992. A comparison of neural network and fuzzy clustering techniques in segmenting magnetic resonance images of the brain. IEEE Trans. Neural Networks 3, 672–682.
Harshman, R.H., Green, P.E., Wind, Y., Lundy, M.E., 1982. A model for the analysis of asymmetric data in marketing research. Marketing Sci. 1, 205–242.
Hubert, L., 1973. Min and max hierarchical clustering using asymmetric similarity measures. Psychometrika 38, 63–72.
Klir, G.J., Folger, T.A., 1988. Fuzzy Sets, Uncertainty, and Information. Prentice-Hall, Englewood Cliffs, NJ.
Menger, K., 1942. Statistical metrics. Mathematics 28, 535–537.
Ruspini, E.H., 1969. A new approach to clustering. Inform. and Control 15, 22–32.
Sato, M., Sato, Y., 1994. An additive fuzzy clustering model. Jpn. J. Fuzzy Theory Systems 6, 185–204.
Sato, M., Sato, Y., 1995a. Extended fuzzy clustering models for asymmetric similarity. Fuzzy Logic and Soft Computing. World Scientific, Singapore, pp. 228–237.
Sato, M., Sato, Y., 1995b. On a general fuzzy additive clustering model. Int. J. Intell. Automat. Soft Comput. 1 (4), 439–448.
Schweizer, B., Sklar, A., 1983. Probabilistic Metric Space. North-Holland, New York.
Shepard, R.N., Arabie, P., 1979. Additive clustering: representation of similarities as combinations of discrete overlapping properties. Psychol. Rev. 86 (2), 87–123.
Smith, M.H., 1992. Evaluation of performance and robustness of a parallel dynamic switching fuzzy system. Second International Workshop on Industrial Fuzzy Control and Intelligent Systems, pp. 163–172.
Takane, Y., Kiers, H.A.L., 1995. Latent class DEDICOM. Proceedings of 12th Symposium on Japan Classification Society, pp. 25–32.
Tarjan, R.E., 1983. An improved algorithm for hierarchical clustering using strong components. Inform. Process. Lett. 17, 37–41.
Torgerson, W.S., 1952. Multidimensional scaling: I. Theory and method. Psychometrika 17, 401–419.
Zadeh, L.A., 1965. Fuzzy sets. Inform. and Control 8, 338–353.